Re: [Oorexx-devel] An alternative algorithm for x2c()
Hi, looking at ooRexx's implemetation of x2c I find the original code much easier to understand and to reason about. Instead of hand-tuning an implementation I'd like to know why the native implementation is slower than your approach. I'd suspect it's because it's using strchr to validate each character against a set of valid characters. Might want to improve that instead... It might be good to use VTune to do the actual performance analysis because that will give a cycle-accurate break-down of instructions. Cheers, Moritz On Fri, Apr 19, 2019 at 4:18 PM Rony G. Flatscher wrote: > Double-checked the test hex strings and their positions of illegal > whitespace chars, here the native routine with the corrected position: > > int64_t hex2char(RexxCallContext *rcc, CSTRING hexData, size_t len, char > *data) > { > // check for trailing whitespace > char tmp=(hexData[len-1]); // get last character (must not be > whitespace) > if (tmp==' ' || tmp=='\t') > rcc->ThrowException1(93931, rcc->UnsignedInt64ToObject(len)); > > size_t dIdx=0; > > for (size_t i=0; i { > > if (hexData[i]>='0' && hexData[i]<='9') tmp=hexData[i]-'0'; > else if (hexData[i]>='A' && hexData[i]<='F') tmp=hexData[i]-'A'+10; > else if (hexData[i]>='a' && hexData[i]<='f') tmp=hexData[i]-'a'+10; > else > { > // check for leading whitespace > if (dIdx==0 && (hexData[i]==' ' || hexData[i]=='\t') ) > rcc->ThrowException1(93931, rcc->Int32ToObject(1)); > > if (hexData[i]==' ' || hexData[i]=='\t') continue; // skip on > whitespace > > // illegal hex digit > char buf[64]; > snprintf(buf, 64, "%c\0", hexData[i]); > rcc->ThrowException1(93933, rcc->NewStringFromAsciiz(buf)); > } > tmp<<=4;// shift to high order nibble > i++;// position on next hex digit > > if (i==len) // we are beyond the hex string and missing the second > hex digit > { > char buf[256]; > snprintf(buf, 256, "Hexadecimal string incomplete: hexadecimal > pair starting with the character \"%c\" in position %zd is missing the second > hexadecimal digit\0", hexData[i-1], len); > rcc->ThrowException1(93900,rcc->NewStringFromAsciiz(buf)); > } > > > if (hexData[i]>='0' && hexData[i]<='9') tmp|=hexData[i]-'0'; > else if (hexData[i]>='A' && hexData[i]<='F') tmp|=hexData[i]-'A'+10; > else if (hexData[i]>='a' && hexData[i]<='f') tmp|=hexData[i]-'a'+10; > else if (hexData[i]==' ' || hexData[i]=='\t') // also whitespace not > allowed in between of a hex pair > rcc->ThrowException1(93931, > rcc->UnsignedInt64ToObject(i*+1*)); > else // illegal hex digit > { > char buf[64]; > snprintf(buf, 64, "%c\0", hexData[i]); > rcc->ThrowException1(93933, rcc->NewStringFromAsciiz(buf)); > } > data[dIdx++]=tmp; > } > data[dIdx]='\0'; > return (int64_t)dIdx;// return number or characters in data > } > > > ---rony > ___ > Oorexx-devel mailing list > Oorexx-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/oorexx-devel > -- Moritz Hoffmann; http://antiguru.de/ ___ Oorexx-devel mailing list Oorexx-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/oorexx-devel
Re: [Oorexx-devel] An alternative algorithm for x2c()
Double-checked the test hex strings and their positions of illegal whitespace chars, here the native routine with the corrected position: int64_t hex2char(RexxCallContext *rcc, CSTRING hexData, size_t len, char *data) { // check for trailing whitespace char tmp=(hexData[len-1]); // get last character (must not be whitespace) if (tmp==' ' || tmp=='\t') rcc->ThrowException1(93931, rcc->UnsignedInt64ToObject(len)); size_t dIdx=0; for (size_t i=0; i='0' && hexData[i]<='9') tmp=hexData[i]-'0'; else if (hexData[i]>='A' && hexData[i]<='F') tmp=hexData[i]-'A'+10; else if (hexData[i]>='a' && hexData[i]<='f') tmp=hexData[i]-'a'+10; else { // check for leading whitespace if (dIdx==0 && (hexData[i]==' ' || hexData[i]=='\t') ) rcc->ThrowException1(93931, rcc->Int32ToObject(1)); if (hexData[i]==' ' || hexData[i]=='\t') continue; // skip on whitespace // illegal hex digit char buf[64]; snprintf(buf, 64, "%c\0", hexData[i]); rcc->ThrowException1(93933, rcc->NewStringFromAsciiz(buf)); } tmp<<=4;// shift to high order nibble i++;// position on next hex digit if (i==len) // we are beyond the hex string and missing the second hex digit { char buf[256]; snprintf(buf, 256, "Hexadecimal string incomplete: hexadecimal pair starting with the character \"%c\" in position %zd is missing the second hexadecimal digit\0", hexData[i-1], len); rcc->ThrowException1(93900,rcc->NewStringFromAsciiz(buf)); } if (hexData[i]>='0' && hexData[i]<='9') tmp|=hexData[i]-'0'; else if (hexData[i]>='A' && hexData[i]<='F') tmp|=hexData[i]-'A'+10; else if (hexData[i]>='a' && hexData[i]<='f') tmp|=hexData[i]-'a'+10; else if (hexData[i]==' ' || hexData[i]=='\t') // also whitespace not allowed in between of a hex pair rcc->ThrowException1(93931, rcc->UnsignedInt64ToObject(i*+1*)); else // illegal hex digit { char buf[64]; snprintf(buf, 64, "%c\0", hexData[i]); rcc->ThrowException1(93933, rcc->NewStringFromAsciiz(buf)); } data[dIdx++]=tmp; } data[dIdx]='\0'; return (int64_t)dIdx;// return number or characters in data } ---rony ___ Oorexx-devel mailing list Oorexx-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/oorexx-devel
Re: [Oorexx-devel] An alternative algorithm for x2c()
On 19.04.2019 15:55, Rony G. Flatscher wrote: > On 19.04.2019 13:09, Rick McGuire wrote: ... cut ... Sorry, the following error message is correct, the position 6 is correct (had changed the hex string): > > * the hexadecimal string "Ab 0 0 ff 00 ff 00 ff 12" has an illegal > whitespace in position > 5, x2c() reports wrongly position 6 instead, > ---rony ___ Oorexx-devel mailing list Oorexx-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/oorexx-devel
Re: [Oorexx-devel] An alternative algorithm for x2c()
On 19.04.2019 13:09, Rick McGuire wrote: > Well, you're missing all of the logic that allows for blanks at the > boundaries and you're also > assuming that you have an even number of digits to convert. If you remove all > of the things that > x2c() has to account for, then yes, you can do a faster conversion. OK, this version caters for illegal leading and trailing whitespace making sure an even number of digits gets processed; it is still appr. 4,8 faster than the current x2c: /* A Rexx hex string may include any number of whitespace after even hex characters, e.g. "00 01 02"x but no leading or trailing blanks like " 00 01 02 "x */ int64_t hex2char(RexxCallContext *rcc, CSTRING hexData, size_t len, char *data) { // check for trailing whitespace char tmp=(hexData[len-1]); // get last character (must not be whitespace) if (tmp==' ' || tmp=='\t') rcc->ThrowException1(93931, rcc->UnsignedInt64ToObject(len)); size_t dIdx=0; for (size_t i=0; i='0' && hexData[i]<='9') tmp=hexData[i]-'0'; else if (hexData[i]>='A' && hexData[i]<='F') tmp=hexData[i]-'A'+10; else if (hexData[i]>='a' && hexData[i]<='f') tmp=hexData[i]-'a'+10; else { // check for leading whitespace if (dIdx==0 && (hexData[i]==' ' || hexData[i]=='\t') ) rcc->ThrowException1(93931, rcc->Int32ToObject(1)); if (hexData[i]==' ' || hexData[i]=='\t') continue; // skip on whitespace // illegal hex digit char buf[64]; snprintf(buf, 64, "%c\0", hexData[i]); rcc->ThrowException1(93933, rcc->NewStringFromAsciiz(buf)); } tmp<<=4;// shift to high order nibble i++;// position on next hex digit if (i==len) // we are beyond the hex string and missing the second hex digit { char buf[256]; snprintf(buf, 256, "Hexadecimal string incomplete: hexadecimal pair starting with the character \"%c\" in position %zd is missing the second hexadecimal digit\0", hexData[i-1], len); rcc->ThrowException1(93900,rcc->NewStringFromAsciiz(buf)); } if (hexData[i]>='0' && hexData[i]<='9') tmp|=hexData[i]-'0'; else if (hexData[i]>='A' && hexData[i]<='F') tmp|=hexData[i]-'A'+10; else if (hexData[i]>='a' && hexData[i]<='f') tmp|=hexData[i]-'a'+10; else if (hexData[i]==' ' || hexData[i]=='\t') // also whitespace not allowed in between of a hex pair rcc->ThrowException1(93931, rcc->UnsignedInt64ToObject(i)); else // illegal hex digit { char buf[64]; snprintf(buf, 64, "%c\0", hexData[i]); rcc->ThrowException1(93933, rcc->NewStringFromAsciiz(buf)); } data[dIdx++]=tmp; } data[dIdx]='\0'; return (int64_t)dIdx;// return number or characters in data } RexxRoutine1( RexxStringObject, cppX2C, CSTRING, hexData ) { size_t len=strlen(hexData); char *data=(char *) malloc (len+1); int64_t res=(int64_t) hex2char(context, hexData,len,data); // true, if translated; false, else RexxStringObject rso=context->String(data, (size_t) res); free(data); return rso; } Here the hexadecimal test strings checked against the x2c()-BIF and the above hex2char(): .context~package~local~lineal="1234+6789|"~copies(7) hexStrings=("Ab00ff00ff00ff12" , - "Ab 00 ff 00 ff 00 ff 12" , - "AB 00 ff 00 ff 00 ff 12", - "AB 00 ff 00 ff 00 ff 1", --- not ok: hex-digit missing "AB 00 ff 00 f 00 ff 12", - -- not ok: whitespace in wrong position (actually hex-digit missing) "Ab 0 0 ff 00 ff 00 ff 12" , - -- not ok: pairs may not contain whitespace "Ab != 00 ff 00 ff 00 ff 12" , - -- not ok: illegal characters "!=" "aB 0X 00 ff 00 ff 00 ff 12" , - -- not ok: illegal characters "!=" " aB 00 ff 00 ff 00 ff 12" , - -- not ok: leading whitespace "aB 00 ff 00 ff 00 ff 12 " ) -- not ok: trailing whitespace do h over hexStrings call testHexStringWithX2ch say call testHexStringWithCppX2C h say say "="~copies(79) end The x2c()-BIF has two wrong error messages: * the hexadecimal string "Ab 0 0 ff 00 ff 00 ff 12" has an illegal whitespace in position 5, x2c() reports wrongly position 6 instead, * the hexadecimal string "AB 00 ff 00 ff 00 ff 1" is an incomplete hexadecimal string (last hex digit missing), x2c() r
Re: [Oorexx-devel] An alternative algorithm for x2c()
Well, you're missing all of the logic that allows for blanks at the boundaries and you're also assuming that you have an even number of digits to convert. If you remove all of the things that x2c() has to account for, then yes, you can do a faster conversion. Rick On Fri, Apr 19, 2019 at 6:51 AM Rony G. Flatscher wrote: > Thank you all for your feedback to my quite "dirty" ;) posting! > > As Mike pointed out 'a'-'z' were not honored, but also blanks in between > pairs of hex digits, which Rexx allows for. > > Rewriting the algorithm a little bit it runs even faster (up to 4.5 times > compared to x2c()): > > /* A Rexx hex string may include any number of whitespace after even hex > characters, e.g. > "00 01 02"x > */ > int64_t hex2char(CSTRING hexData, size_t len, char *data) > { > size_t dIdx=0; > > for (size_t i=0; i { > char tmp='\0'; > > if (hexData[i]>='0' && hexData[i]<='9') tmp=hexData[i]-'0'; > else if (hexData[i]>='A' && hexData[i]<='F') tmp=hexData[i]-'A'+10; > else if (hexData[i]>='a' && hexData[i]<='f') tmp=hexData[i]-'a'+10; > else if (hexData[i]==' ' || hexData[i]=='\t') continue; // skip on > whitespace > else > { > data[0]='\0'; > return -i; // error: return position as negative number > } > tmp<<=4;// shift to high order nibble > > i++;// position on next hex digit > if (hexData[i]>='0' && hexData[i]<='9') tmp|=hexData[i]-'0'; > else if (hexData[i]>='A' && hexData[i]<='F') tmp|=hexData[i]-'A'+10; > else if (hexData[i]>='a' && hexData[i]<='f') tmp|=hexData[i]-'a'+10; > else// also whitespace is not allowed in between of a hex pair > { > data[0]='\0'; > return -i; // error: return position as negative number > } > data[dIdx++]=tmp; > } > data[dIdx]='\0'; > return dIdx;// return number or characters in data > } > > RexxRoutine1( RexxStringObject, cppX2C, CSTRING, hexData ) > { > size_t len=strlen(hexData); > > char *data=(char *) malloc (len+1); > int64_t res=hex2char(hexData,len,data); // true, if translated; false, > else > > if (res==-1) > { > free(data); > return NULLOBJECT; // indicate error; TODO: raise exception? > } > > RexxStringObject rso=context->String(data, res); > free(data); > > return rso; > } > > Maybe missing something compared to the x2c()-BIF (x2c-method in the > String class). > > ---rony > > ___ > Oorexx-devel mailing list > Oorexx-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/oorexx-devel > ___ Oorexx-devel mailing list Oorexx-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/oorexx-devel
Re: [Oorexx-devel] An alternative algorithm for x2c()
Thank you all for your feedback to my quite "dirty" ;) posting! As Mike pointed out 'a'-'z' were not honored, but also blanks in between pairs of hex digits, which Rexx allows for. Rewriting the algorithm a little bit it runs even faster (up to 4.5 times compared to x2c()): /* A Rexx hex string may include any number of whitespace after even hex characters, e.g. "00 01 02"x */ int64_t hex2char(CSTRING hexData, size_t len, char *data) { size_t dIdx=0; for (size_t i=0; i='0' && hexData[i]<='9') tmp=hexData[i]-'0'; else if (hexData[i]>='A' && hexData[i]<='F') tmp=hexData[i]-'A'+10; else if (hexData[i]>='a' && hexData[i]<='f') tmp=hexData[i]-'a'+10; else if (hexData[i]==' ' || hexData[i]=='\t') continue; // skip on whitespace else { data[0]='\0'; return -i; // error: return position as negative number } tmp<<=4;// shift to high order nibble i++;// position on next hex digit if (hexData[i]>='0' && hexData[i]<='9') tmp|=hexData[i]-'0'; else if (hexData[i]>='A' && hexData[i]<='F') tmp|=hexData[i]-'A'+10; else if (hexData[i]>='a' && hexData[i]<='f') tmp|=hexData[i]-'a'+10; else// also whitespace is not allowed in between of a hex pair { data[0]='\0'; return -i; // error: return position as negative number } data[dIdx++]=tmp; } data[dIdx]='\0'; return dIdx;// return number or characters in data } RexxRoutine1( RexxStringObject, cppX2C, CSTRING, hexData ) { size_t len=strlen(hexData); char *data=(char *) malloc (len+1); int64_t res=hex2char(hexData,len,data); // true, if translated; false, else if (res==-1) { free(data); return NULLOBJECT; // indicate error; TODO: raise exception? } RexxStringObject rso=context->String(data, res); free(data); return rso; } Maybe missing something compared to the x2c()-BIF (x2c-method in the String class). ---rony ___ Oorexx-devel mailing list Oorexx-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/oorexx-devel