Soo, I started dabbling with the thing I talked about before last summer, letting String parameters in NB calls have an encoding: option. (There’s already a slice in inbox to allow optional values other than true/false)
Thought I’d start with decoding; here’s a small preview of the part which does
the actual decoding, after the needed String class has been determined and
instantiated.
While it’s a fallback path for when the platform doesn’t support SSE or other
batch operations, it’s still using some neat tricks (imho) I thought others
might enjoy on a Friday afternoon :)
emitStandardDecodeUTF8CharactersFrom: aSource to: aDestination withCharSize: aCharSize scratchReg: scratchReg using: aGenerator
    "Emit decoding using only standard x86 ops"
    "We have already found what String class is needed for decoding aSource, and created an instance of the proper size"
    "This implementation focuses on minimizing jumps and register usage, at the cost of loading from source one byte at a time."
    "Input:
        aSource - memory pointer to C-string with UTF8 bytes
        aDestination - memory pointer to first var field of String instance
        scratchReg - a register which will be modified while decoding
        aCharSize - The size in bytes of each character in our destination string, known at emission time
    Clobbers: scratchReg
    aSource and aDestination will end up pointing to end of strings"
    | asm scratch32 sLowByte sHighByte loop done oneByte twoBytes threeBytes |
    asm := aGenerator asm.
    loop := asm uniqueLabelName: 'utf8DecodeLoop'.
    done := asm uniqueLabelName: 'utf8DecodingDone'.
    scratch32 := scratchReg as32.
    sLowByte := scratch32 as8.
    sHighByte := sLowByte asHighByte.
    asm label: loop.
    "Unroll the inner loop as many times as we want, or, well, at least as many times as the backwards jump will allow us to"
    8 timesRepeat: [
        oneByte := asm uniqueLabelName: 'utf8OneByteDecode'.
        twoBytes := asm uniqueLabelName: 'utf8TwoByteDecode'.
        threeBytes := asm uniqueLabelName: 'utf8ThreeByteDecode'.
        asm xor: scratch32 with: scratch32.
        asm or: sLowByte with: aSource ptr8.
        asm cmp: sLowByte with: 0.
        asm je: done.
        asm add: aSource with: 1.
        asm test: sLowByte with: 2r10000000 asUImm8.
        asm jz: oneByte.
        "We have a header, place its data bits as initial high byte value"
        asm shl: scratch32 with: 8.
        asm xor: sHighByte with: 2r11000000 asUImm8. "Strip 2 byte header"
        asm test: sHighByte with: 2r00100000.
        asm jz: twoBytes.
        aCharSize > 1 ifTrue: [
            asm xor: sHighByte with: 2r00100000. "Strip 3 byte header"
            asm test: sHighByte with: 2r00010000.
            asm jz: threeBytes.
            "This is a 4-byte character"
            asm xor: sHighByte with: 2r00010000. "Strip 4 byte header"
            "Read one trailing byte, remove the header, and shift the data out of low byte"
            asm or: sLowByte with: aSource ptr8.
            asm shl: sLowByte with: 2.
            asm shl: scratch32 with: 6.
            asm add: aSource with: 1.
            asm label: threeBytes.
            "Read one trailing byte, remove the header, and shift the data out of low byte"
            asm or: sLowByte with: aSource ptr8.
            asm shl: sLowByte with: 2.
            asm shl: scratch32 with: 6.
            asm add: aSource with: 1 ].
        asm label: twoBytes.
        "Read last trailing byte, remove header, and shift the data bits into proper place"
        asm or: sLowByte with: aSource ptr8.
        asm shl: sLowByte with: 2.
        asm shr: scratch32 with: 2.
        asm add: aSource with: 1.
        asm label: oneByte.
        asm mov: (aDestination ptr size: aCharSize) with: (scratch32 as: aCharSize).
        asm add: aDestination with: aCharSize ].
    asm jmp: loop.
    asm label: done.
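For anyone who wants to follow the bit juggling without an assembler at hand, here is a rough workspace sketch (not part of the actual code) that simulates what happens to scratch32 for the three-byte sequence 226 130 172, the € used in the test string. The shlLowBy2 block is a hypothetical helper that mimics shifting only the low 8-bit register, where bits shifted past bit 7 are simply lost; the jump decisions are noted in comments rather than simulated.

```smalltalk
| bytes scratch shlLowBy2 |
bytes := #[226 130 172].
"Hypothetical helper: shl on the low byte only, bits leaving the byte are dropped"
shlLowBy2 := [ :reg |
    (reg bitAnd: 16rFFFFFF00)
        bitOr: (((reg bitAnd: 16rFF) bitShift: 2) bitAnd: 16rFF) ].
"xor scratch32, then or the lead byte into the low byte: 16rE2"
scratch := 0 bitOr: (bytes at: 1).
"shl scratch32 by 8: lead byte now sits in the high byte: 16rE200"
scratch := scratch bitShift: 8.
"Strip 2 byte header (xor high byte with 2r11000000): 16r2200"
scratch := scratch bitXor: 16rC000.
"Bit 2r00100000 of the high byte is set, so this is not a two-byte character"
"Strip 3 byte header (xor high byte with 2r00100000): 16r0200"
scratch := scratch bitXor: 16r2000.
"Bit 2r00010000 of the high byte is clear, so continue at threeBytes"
"threeBytes: or in the trailing byte, shift its 10 header out of the low byte: 16r0208"
scratch := shlLowBy2 value: (scratch bitOr: (bytes at: 2)).
"shl scratch32 by 6, which frees up the low byte again: 16r8200"
scratch := scratch bitShift: 6.
"twoBytes: or in the last trailing byte and drop its header: 16r82B0"
scratch := shlLowBy2 value: (scratch bitOr: (bytes at: 3)).
"shr scratch32 by 2 moves all data bits into their final place"
scratch := scratch bitShift: -2.
scratch = 16r20AC "true, U+20AC is the Euro sign"
```

The point of the trick is that the continuation-byte header (10xxxxxx) never needs an explicit mask: shifting the 8-bit register left by 2 pushes it out, and the 32-bit shifts then close the gap.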
And the relevant test code for that:
testStandardDecodeWide
    | bytes string |
    "bytes := (ZnUTF8Encoder new encodeString: 'Cash, like €, is king'), #[0]."
    bytes := #[67 97 115 104 44 32 108 105 107 101 32 226 130 172 44 32 105 115 32 107 105 110 103 0].
    string := WideString new: bytes size - 1.
    self testStandardDecode: bytes toWideString: string.
    ^ string

testStandardDecode: utf8Bytes toWideString: aString
    <primitive: #primitiveNativeCall module: #NativeBoostPlugin>
    ^ self nbCallout
        function: #(void #(char* utf8Bytes, char* aString ))
        emit: [ :gen :proxy :asm |
            asm pop: asm EBX;
                pop: asm ECX.
            self emitStandardDecodeUTF8CharactersFrom: asm EBX to: asm ECX withCharSize: 4 scratchReg: asm EAX using: gen.
            asm mov: asm EAX with: gen proxy nilObject ]
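As a sanity check on the input data, a workspace sketch along these lines (an illustration, assuming Zinc is loaded) decodes the same bytes in image, which gives the string the native version is expected to produce. Note that the 23 payload bytes decode to 21 characters, since € takes three bytes, so the WideString allocated in the test above has two unused slots at the end.

```smalltalk
| bytes expected |
bytes := #[67 97 115 104 44 32 108 105 107 101 32 226 130 172 44 32 105 115 32 107 105 110 103 0].
"Drop the terminating 0 before handing the bytes to the in-image decoder"
expected := ZnUTF8Encoder new decodeBytes: bytes allButLast.
expected = 'Cash, like €, is king'. "true"
expected size. "21 characters from 23 bytes"
"Assumption, not run here: the native result could then be compared with
(NBExternalString new testStandardDecodeWide first: expected size) = expected"
```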
Which, though it’s currently cheating by knowing the string class/size in advance, isn’t
a lot of overhead:
ext := NBExternalString new.
[ext testStandardDecodeWide] bench
'5,030,000 per second.' '5,080,000 per second.' '5,190,000 per second.'
Compared to an equivalent to testStandardDecodeWide, with emitStandard… removed
from the primitive:
[ext testEmptyDecode] bench
'5,850,000 per second.' '5,800,000 per second.' '5,640,000 per second.'
… or compared to doing the decoding in image after the call:
int := ZnUTF8Encoder new.
[int decodeBytes: #[67 97 115 104 44 32 108 105 107 101 32 226 130 172 44 32 105 115 32 107 105 110 103 0]] bench
'130,000 per second.' '131,000 per second.' '132,000 per second.'
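To put those rates on a per-call footing, here is a rough back-of-the-envelope sketch using the middle of the three runs in each case. It is only an estimate: bench results are noisy, and the in-image figure does not include any call overhead.

```smalltalk
| nsPerCall native empty inImage overhead |
"Convert a 'per second' rate into nanoseconds per call"
nsPerCall := [ :perSecond | 1.0e9 / perSecond ].
native := nsPerCall value: 5080000.   "call plus native decoding, roughly 197 ns"
empty := nsPerCall value: 5800000.    "call without decoding, roughly 172 ns"
inImage := nsPerCall value: 131000.   "ZnUTF8Encoder decoding, roughly 7634 ns"
overhead := native - empty.           "decoding alone, roughly 24 ns"
inImage / overhead                    "roughly 300 times the native decoding cost"
```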
Cheers,
Henry
