Re: [Oorexx-devel] An alternative algorithm for x2c()
    case 'A': tmp=10; break;
    case 'B': tmp=11; break;
    case 'C': tmp=12; break;
    case 'D': tmp=13; break;
    case 'E': tmp=14; break;
    case 'F': tmp=15; break;

Also need to add cases 'a' through 'f'?

___
Oorexx-devel mailing list
Oorexx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oorexx-devel
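One way the lower-case question above could be handled is to pair each upper-case label with its lower-case twin, so both fall through to the same assignment. This is only a sketch: the helper name hex_nibble and its bool/out-parameter shape are hypothetical and do not come from the posted code.

```c
#include <stdbool.h>

/* Hypothetical helper (not from the thread): stores the value 0..15 of one
   hex digit in *value and returns true; returns false for any other character. */
static bool hex_nibble(char ch, unsigned char *value)
{
    switch (ch)
    {
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            /* C guarantees that the digits '0'..'9' are contiguous. */
            *value = (unsigned char)(ch - '0');
            return true;

        /* Upper- and lower-case labels share one branch each. */
        case 'A': case 'a': *value = 10; return true;
        case 'B': case 'b': *value = 11; return true;
        case 'C': case 'c': *value = 12; return true;
        case 'D': case 'd': *value = 13; return true;
        case 'E': case 'e': *value = 14; return true;
        case 'F': case 'f': *value = 15; return true;

        default:
            return false;   /* not a hex digit */
    }
}
```

Because the letters are listed explicitly rather than computed from a range, this form makes no assumption about how A-F are laid out in the character set.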
Re: [Oorexx-devel] An alternative algorithm for x2c()
Two additional quick notes:

1. The static const arrays can be const only when dealing with a fixed character set. For some products I’ve worked on, the arrays are initialized during startup, after a character set option has been selected.

2. (len % 2 == 1), while syntactically correct, is not optimized for performance by several compilers. The fastest all around is ((len & 1u)).

Mark

From: Mark L. Gaubatz via Oorexx-devel
Sent: Thursday, April 18, 2019 11:05 AM
To: 'Open Object Rexx Developer Mailing List'
Cc: Mark L. Gaubatz
Subject: Re: [Oorexx-devel] An alternative algorithm for x2c()

Rony:

Too many comparisons are being done from a performance perspective; a minimal comparison for validation and no comparison for translation is the fastest. Actual translation should be done with a pair of 256-byte arrays for speed AND portability across character sets. Speed varies according to the caching levels on each chipset, as compared to ANSI-specific translation, which can be accomplished with just register operations. I will note that in some of my speed tests on a commercial product using newer Intel chipsets, only the cache lines actually used from the trio of 256-byte tables had to be cached, and the table lookups then ran faster than the register operations. In addition, both upper- and lower-case A-F are handled with no performance impact.

    static const char trinvalid[256] = {… invalid character table … };  // 0x00 for valid characters, non-zero for invalid characters
    static const char trhigh [256] = {… high order nibble translations … };
    static const char trlow[256] = {… low order nibble translations … };

    {
        unsigned char c1 = (unsigned char) hexdata[i];      // unsigned, so the value is a valid table index
        unsigned char c2 = (unsigned char) hexdata[i+1];
        if ((trinvalid[c1] | trinvalid[c2]))    // Yes, the | symbol is correct to drop to a single comparison operation
        {
            data[0] = 0x00;
            return false;
        }
        data[dIdx++] = trhigh[c1] | trlow[c2];
        i += 2;
    }

Mark

From: Rony G. Flatscher
Sent: Thursday, April 18, 2019 9:04 AM
To: Open Object Rexx Developer Mailing List
Subject: [Oorexx-devel] An alternative algorithm for x2c()

While experimenting a little bit with C code to decode hex strings I came up with the following code:

    boolean hex2char(CSTRING hexData, size_t len, char *data)
    {
        if (len % 2 == 1)    // not an even number of hex characters
        {
            data[0]='\0';
            return false;
        }
        size_t dIdx=0;
        for (size_t i=0; i[...]String(data, len/2);
        free(data);
        return rso;
    }

Comparing the duration of x2c() with that of the above cppX2C(), each run 1,000 times on a 512KB string ( xrange("00"x,"FF"x)~copies(1000)~c2x ), the implementation of hex2char() seems to be about 3.8 times faster than x2c(). (This was tested on Windows, 32-bit ooRexx.)

I left the RexxRoutine1 cppX2C() in the code pasted above so that you can double-check it by simply copying and pasting the code and testing it yourself.

---rony
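Mark's two notes at the top of this message (tables filled in at startup rather than declared const, and the bit-mask length test) could be sketched roughly as follows. The function names init_hex_tables and has_odd_length are hypothetical, the table names are borrowed from the quoted message, and the digit strings here come from the compiler's execution character set, which is an assumption: a product with a selectable character set would take its digits from that option instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Non-const on purpose: the contents are only known after startup. */
static unsigned char trinvalid[256];   /* 0 = valid hex digit, non-zero = invalid */
static unsigned char trhigh[256];      /* digit value shifted into the high nibble */
static unsigned char trlow[256];       /* digit value in the low nibble */

/* Hypothetical startup routine: fills the three tables once. */
static void init_hex_tables(void)
{
    static const char upper[] = "0123456789ABCDEF";
    static const char lower[] = "0123456789abcdef";

    memset(trinvalid, 1, sizeof trinvalid);   /* everything invalid by default */
    memset(trhigh, 0, sizeof trhigh);
    memset(trlow, 0, sizeof trlow);

    for (unsigned v = 0; v < 16; v++)
    {
        unsigned char u = (unsigned char)upper[v];
        unsigned char l = (unsigned char)lower[v];

        trinvalid[u] = trinvalid[l] = 0;                 /* mark both cases valid */
        trhigh[u] = trhigh[l] = (unsigned char)(v << 4);
        trlow[u] = trlow[l] = (unsigned char)v;
    }
}

/* Odd-length test written with a bit mask instead of the % operator. */
static bool has_odd_length(size_t len)
{
    return (len & 1u) != 0;
}
```

Whether the mask form is actually faster than (len % 2 == 1) depends on the compiler and its optimization settings, so it is worth measuring rather than assuming.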
Re: [Oorexx-devel] An alternative algorithm for x2c()
Not sure if it would be any faster, but it may be neater to simplify some of the below along the lines of:

    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':
        tmp=hexData[i-1]-'0';
        break;

(I'd also make the increments of i separate rather than a side-effect within the switch, but that's just a style issue.)

Mike
[Oorexx-devel] An alternative algorithm for x2c()
While experimenting a little bit with C code to decode hex strings I came up with the following code:

    boolean hex2char(CSTRING hexData, size_t len, char *data)
    {
        if (len % 2 == 1)    // not an even number of hex characters
        {
            data[0]='\0';
            return false;
        }
        size_t dIdx=0;
        for (size_t i=0; i[...]String(data, len/2);
        free(data);
        return rso;
    }

Comparing the duration of x2c() with that of the above cppX2C(), each run 1,000 times on a 512KB string ( xrange("00"x,"FF"x)~copies(1000)~c2x ), the implementation of hex2char() seems to be about 3.8 times faster than x2c(). (This was tested on Windows, 32-bit ooRexx.)

I left the RexxRoutine1 cppX2C() in the code pasted above so that you can double-check it by simply copying and pasting the code and testing it yourself.

---rony
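The loop body of hex2char() did not survive in the archived post, so the following is only a minimal sketch of a decoder with the same error convention (odd length or a bad digit sets data[0] to '\0' and returns false). It is not Rony's code. The names hex2char_sketch and digit_value are hypothetical, plain const char * stands in for CSTRING, and the letter range checks assume a character set such as ASCII in which A-F and a-f are each contiguous.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns 0..15 for a hex digit, -1 otherwise.
   Assumes 'A'..'F' and 'a'..'f' are contiguous (true for ASCII). */
static int digit_value(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return -1;
}

/* Decodes len hex characters from hexData into len/2 bytes in data.
   The caller must provide room for at least len/2 bytes and at least one byte. */
bool hex2char_sketch(const char *hexData, size_t len, char *data)
{
    if (len % 2 == 1)            /* not an even number of hex characters */
    {
        data[0] = '\0';
        return false;
    }

    size_t dIdx = 0;
    for (size_t i = 0; i < len; i += 2)
    {
        int hi = digit_value(hexData[i]);
        int lo = digit_value(hexData[i + 1]);
        if (hi < 0 || lo < 0)    /* invalid hex digit in this pair */
        {
            data[0] = '\0';
            return false;
        }
        data[dIdx++] = (char)((hi << 4) | lo);
    }
    return true;
}
```

A caller would allocate len/2 bytes, call the function, and only use the buffer when the return value is true.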
Re: [Oorexx-devel] An alternative algorithm for x2c()
Rony:

Too many comparisons are being done from a performance perspective; a minimal comparison for validation and no comparison for translation is the fastest. Actual translation should be done with a pair of 256-byte arrays for speed AND portability across character sets. Speed varies according to the caching levels on each chipset, as compared to ANSI-specific translation, which can be accomplished with just register operations. I will note that in some of my speed tests on a commercial product using newer Intel chipsets, only the cache lines actually used from the trio of 256-byte tables had to be cached, and the table lookups then ran faster than the register operations. In addition, both upper- and lower-case A-F are handled with no performance impact.

    static const char trinvalid[256] = {… invalid character table … };  // 0x00 for valid characters, non-zero for invalid characters
    static const char trhigh [256] = {… high order nibble translations … };
    static const char trlow[256] = {… low order nibble translations … };

    {
        unsigned char c1 = (unsigned char) hexdata[i];      // unsigned, so the value is a valid table index
        unsigned char c2 = (unsigned char) hexdata[i+1];
        if ((trinvalid[c1] | trinvalid[c2]))    // Yes, the | symbol is correct to drop to a single comparison operation
        {
            data[0] = 0x00;
            return false;
        }
        data[dIdx++] = trhigh[c1] | trlow[c2];
        i += 2;
    }

Mark
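A minimal sketch of the table-driven approach described above, with the elided tables filled in purely for illustration. Several things here are assumptions rather than Mark's code: the validity table is laid out for ASCII, the element type is unsigned char, the function name hex2char_tables and the BAD, BAD16 and NIBBLES macros are hypothetical, and the designated initializers require C99.

```c
#include <stdbool.h>
#include <stddef.h>

#define BAD   1
#define BAD16 BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD

/* 0 for valid hex digits, non-zero for everything else (ASCII layout assumed). */
static const unsigned char trinvalid[256] = {
    BAD16, BAD16, BAD16,                                      /* 0x00-0x2F */
    0,0,0,0,0,0,0,0,0,0, BAD,BAD,BAD,BAD,BAD,BAD,             /* '0'-'9'   */
    BAD, 0,0,0,0,0,0, BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD,    /* 'A'-'F'   */
    BAD16,                                                    /* 0x50-0x5F */
    BAD, 0,0,0,0,0,0, BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD,BAD,    /* 'a'-'f'   */
    BAD16,                                                    /* 0x70-0x7F */
    BAD16, BAD16, BAD16, BAD16, BAD16, BAD16, BAD16, BAD16    /* 0x80-0xFF */
};

/* Digit values shifted left by sh bits; all other entries default to 0. */
#define NIBBLES(sh) \
    ['0'] = 0x0 << (sh), ['1'] = 0x1 << (sh), ['2'] = 0x2 << (sh), ['3'] = 0x3 << (sh), \
    ['4'] = 0x4 << (sh), ['5'] = 0x5 << (sh), ['6'] = 0x6 << (sh), ['7'] = 0x7 << (sh), \
    ['8'] = 0x8 << (sh), ['9'] = 0x9 << (sh), ['A'] = 0xA << (sh), ['B'] = 0xB << (sh), \
    ['C'] = 0xC << (sh), ['D'] = 0xD << (sh), ['E'] = 0xE << (sh), ['F'] = 0xF << (sh), \
    ['a'] = 0xA << (sh), ['b'] = 0xB << (sh), ['c'] = 0xC << (sh), ['d'] = 0xD << (sh), \
    ['e'] = 0xE << (sh), ['f'] = 0xF << (sh)

static const unsigned char trhigh[256] = { NIBBLES(4) };   /* high order nibble */
static const unsigned char trlow[256]  = { NIBBLES(0) };   /* low order nibble  */

/* Decodes len hex characters into len/2 bytes; data needs at least one byte. */
bool hex2char_tables(const char *hexdata, size_t len, char *data)
{
    if (len & 1u)                           /* odd number of hex characters */
    {
        data[0] = 0x00;
        return false;
    }

    size_t dIdx = 0;
    for (size_t i = 0; i < len; i += 2)
    {
        unsigned char c1 = (unsigned char)hexdata[i];
        unsigned char c2 = (unsigned char)hexdata[i + 1];

        /* Bitwise | merges both validity flags into one test and one branch. */
        if (trinvalid[c1] | trinvalid[c2])
        {
            data[0] = 0x00;
            return false;
        }
        data[dIdx++] = (char)(trhigh[c1] | trlow[c2]);   /* no comparisons here */
    }
    return true;
}
```

The translation step is two lookups and one OR per output byte, and upper- and lower-case digits cost the same because both simply occupy entries in the tables.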