Re: [Oorexx-devel] An alternative algorithm for x2c()

2019-04-19 Thread Moritz Hoffmann
Hi,
looking at ooRexx's implemetation of x2c I find the original code much
easier to understand and to reason about. Instead of hand-tuning an
implementation I'd like to know why the native implementation is slower
than your approach. I'd suspect it's because it's using strchr to validate
each character against a set of valid characters. Might want to improve
that instead...

It might be good to use VTune to do the actual performance analysis because
that will give a cycle-accurate break-down of instructions.

Cheers,
Moritz

On Fri, Apr 19, 2019 at 4:18 PM Rony G. Flatscher 
wrote:

> Double-checked the test hex strings and their positions of illegal
> whitespace chars, here the native routine with the corrected position:
>
> int64_t hex2char(RexxCallContext *rcc, CSTRING hexData, size_t len, char 
> *data)
> {
> // check for trailing whitespace
> char tmp=(hexData[len-1]);  // get last character (must not be 
> whitespace)
> if (tmp==' ' || tmp=='\t')
> rcc->ThrowException1(93931, rcc->UnsignedInt64ToObject(len));
>
> size_t dIdx=0;
>
> for (size_t i=0; i {
>
> if  (hexData[i]>='0' && hexData[i]<='9') tmp=hexData[i]-'0';
> else if (hexData[i]>='A' && hexData[i]<='F') tmp=hexData[i]-'A'+10;
> else if (hexData[i]>='a' && hexData[i]<='f') tmp=hexData[i]-'a'+10;
> else
> {
> // check for leading whitespace
> if (dIdx==0 && (hexData[i]==' ' || hexData[i]=='\t') )
> rcc->ThrowException1(93931, rcc->Int32ToObject(1));
>
> if (hexData[i]==' ' || hexData[i]=='\t') continue; // skip on 
> whitespace
>
> // illegal hex digit
> char buf[64];
> snprintf(buf, 64, "%c\0", hexData[i]);
> rcc->ThrowException1(93933, rcc->NewStringFromAsciiz(buf));
> }
> tmp<<=4;// shift to high order nibble
> i++;// position on next hex digit
>
> if (i==len) // we are beyond the hex string and missing the second 
> hex digit
> {
> char buf[256];
> snprintf(buf, 256, "Hexadecimal string incomplete: hexadecimal 
> pair starting with the character \"%c\" in position %zd is missing the second 
> hexadecimal digit\0", hexData[i-1], len);
> rcc->ThrowException1(93900,rcc->NewStringFromAsciiz(buf));
> }
>
>
> if  (hexData[i]>='0' && hexData[i]<='9') tmp|=hexData[i]-'0';
> else if (hexData[i]>='A' && hexData[i]<='F') tmp|=hexData[i]-'A'+10;
> else if (hexData[i]>='a' && hexData[i]<='f') tmp|=hexData[i]-'a'+10;
> else if (hexData[i]==' ' || hexData[i]=='\t')  // also whitespace not 
> allowed in between of a hex pair
> rcc->ThrowException1(93931, 
> rcc->UnsignedInt64ToObject(i*+1*));
> else // illegal hex digit
> {
> char buf[64];
> snprintf(buf, 64, "%c\0", hexData[i]);
> rcc->ThrowException1(93933, rcc->NewStringFromAsciiz(buf));
> }
> data[dIdx++]=tmp;
> }
> data[dIdx]='\0';
> return (int64_t)dIdx;// return number or characters in data
> }
>
>
> ---rony
> ___
> Oorexx-devel mailing list
> Oorexx-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/oorexx-devel
>


-- 
Moritz Hoffmann;
http://antiguru.de/
___
Oorexx-devel mailing list
Oorexx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oorexx-devel


Re: [Oorexx-devel] An alternative algorithm for x2c()

2019-04-19 Thread Rony G. Flatscher
Double-checked the test hex strings and their positions of illegal whitespace 
chars, here the native
routine with the corrected position:

int64_t hex2char(RexxCallContext *rcc, CSTRING hexData, size_t len, char 
*data)
{
// check for trailing whitespace
char tmp=(hexData[len-1]);  // get last character (must not be 
whitespace)
if (tmp==' ' || tmp=='\t')
rcc->ThrowException1(93931, rcc->UnsignedInt64ToObject(len));

size_t dIdx=0;

for (size_t i=0; i='0' && hexData[i]<='9') tmp=hexData[i]-'0';
else if (hexData[i]>='A' && hexData[i]<='F') tmp=hexData[i]-'A'+10;
else if (hexData[i]>='a' && hexData[i]<='f') tmp=hexData[i]-'a'+10;
else
{
// check for leading whitespace
if (dIdx==0 && (hexData[i]==' ' || hexData[i]=='\t') )
rcc->ThrowException1(93931, rcc->Int32ToObject(1));

if (hexData[i]==' ' || hexData[i]=='\t') continue; // skip on 
whitespace

// illegal hex digit
char buf[64];
snprintf(buf, 64, "%c\0", hexData[i]);
rcc->ThrowException1(93933, rcc->NewStringFromAsciiz(buf));
}
tmp<<=4;// shift to high order nibble
i++;// position on next hex digit

if (i==len) // we are beyond the hex string and missing the second 
hex digit
{
char buf[256];
snprintf(buf, 256, "Hexadecimal string incomplete: hexadecimal 
pair starting with the character \"%c\" in position %zd is missing the second 
hexadecimal digit\0", hexData[i-1], len);
rcc->ThrowException1(93900,rcc->NewStringFromAsciiz(buf));
}


if  (hexData[i]>='0' && hexData[i]<='9') tmp|=hexData[i]-'0';
else if (hexData[i]>='A' && hexData[i]<='F') tmp|=hexData[i]-'A'+10;
else if (hexData[i]>='a' && hexData[i]<='f') tmp|=hexData[i]-'a'+10;
else if (hexData[i]==' ' || hexData[i]=='\t')  // also whitespace 
not allowed in between of a hex pair
rcc->ThrowException1(93931, 
rcc->UnsignedInt64ToObject(i*+1*));
else // illegal hex digit
{
char buf[64];
snprintf(buf, 64, "%c\0", hexData[i]);
rcc->ThrowException1(93933, rcc->NewStringFromAsciiz(buf));
}
data[dIdx++]=tmp;
}
data[dIdx]='\0';
return (int64_t)dIdx;// return number or characters in data
}

---rony

___
Oorexx-devel mailing list
Oorexx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oorexx-devel


Re: [Oorexx-devel] An alternative algorithm for x2c()

2019-04-19 Thread Rony G. Flatscher

On 19.04.2019 15:55, Rony G. Flatscher wrote:
> On 19.04.2019 13:09, Rick McGuire wrote:

... cut ...

Sorry, the following error message is  correct, the position 6 is correct (had 
changed the hex string):
>
>   * the hexadecimal string "Ab  0 0  ff  00  ff  00  ff  12" has an illegal 
> whitespace in position
> 5, x2c() reports wrongly position 6 instead,
>
---rony
___
Oorexx-devel mailing list
Oorexx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oorexx-devel


Re: [Oorexx-devel] An alternative algorithm for x2c()

2019-04-19 Thread Rony G. Flatscher
On 19.04.2019 13:09, Rick McGuire wrote:
> Well, you're missing all of the logic that allows for blanks at the 
> boundaries and you're also
> assuming that you have an even number of digits to convert. If you remove all 
> of the things that
> x2c() has to account for, then yes, you can do a faster conversion.

OK, this version caters for illegal leading and trailing whitespace making sure 
an even number of
digits gets processed; it is still appr. 4,8 faster than the current x2c:

/* A Rexx hex string may include any number of whitespace after even hex 
characters, e.g.
 "00 01   02"x
 but no leading or trailing blanks like " 00 01   02 "x
*/
int64_t hex2char(RexxCallContext *rcc, CSTRING hexData, size_t len, char 
*data)
{
// check for trailing whitespace
char tmp=(hexData[len-1]);  // get last character (must not be 
whitespace)
if (tmp==' ' || tmp=='\t')
rcc->ThrowException1(93931, rcc->UnsignedInt64ToObject(len));

size_t dIdx=0;

for (size_t i=0; i='0' && hexData[i]<='9') tmp=hexData[i]-'0';
else if (hexData[i]>='A' && hexData[i]<='F') tmp=hexData[i]-'A'+10;
else if (hexData[i]>='a' && hexData[i]<='f') tmp=hexData[i]-'a'+10;
else
{
// check for leading whitespace
if (dIdx==0 && (hexData[i]==' ' || hexData[i]=='\t') )
rcc->ThrowException1(93931, rcc->Int32ToObject(1));

if (hexData[i]==' ' || hexData[i]=='\t') continue; // skip on 
whitespace

// illegal hex digit
char buf[64];
snprintf(buf, 64, "%c\0", hexData[i]);
rcc->ThrowException1(93933, rcc->NewStringFromAsciiz(buf));
}
tmp<<=4;// shift to high order nibble
i++;// position on next hex digit

if (i==len) // we are beyond the hex string and missing the second 
hex digit
{
char buf[256];
snprintf(buf, 256, "Hexadecimal string incomplete: hexadecimal 
pair starting with the character \"%c\" in position %zd is missing the second 
hexadecimal digit\0", hexData[i-1], len);
rcc->ThrowException1(93900,rcc->NewStringFromAsciiz(buf));
}


if  (hexData[i]>='0' && hexData[i]<='9') tmp|=hexData[i]-'0';
else if (hexData[i]>='A' && hexData[i]<='F') tmp|=hexData[i]-'A'+10;
else if (hexData[i]>='a' && hexData[i]<='f') tmp|=hexData[i]-'a'+10;
else if (hexData[i]==' ' || hexData[i]=='\t')  // also whitespace 
not allowed in between of a hex pair
rcc->ThrowException1(93931, rcc->UnsignedInt64ToObject(i));
else // illegal hex digit
{
char buf[64];
snprintf(buf, 64, "%c\0", hexData[i]);
rcc->ThrowException1(93933, rcc->NewStringFromAsciiz(buf));
}
data[dIdx++]=tmp;
}
data[dIdx]='\0';
return (int64_t)dIdx;// return number or characters in data
}


RexxRoutine1( RexxStringObject, cppX2C, CSTRING, hexData  )
{
size_t len=strlen(hexData);
char *data=(char *) malloc (len+1);
int64_t res=(int64_t) hex2char(context, hexData,len,data);  // true, if 
translated; false, else
RexxStringObject rso=context->String(data, (size_t) res);
free(data);
return rso;
}

Here the hexadecimal test strings checked against the x2c()-BIF and the above 
hex2char():

.context~package~local~lineal="1234+6789|"~copies(7)

hexStrings=("Ab00ff00ff00ff12"  , -
"Ab 00 ff 00 ff 00 ff 12"   , -
"AB  00  ff  00  ff  00  ff  12", -
"AB  00  ff  00  ff  00  ff  1", --- not ok: hex-digit 
missing
"AB  00  ff  00  f   00  ff  12", -   -- not ok: whitespace 
in wrong position (actually hex-digit missing)
"Ab  0 0  ff  00  ff  00  ff  12"   , -   -- not ok: pairs may 
not contain whitespace
"Ab != 00  ff  00  ff  00  ff  12"  , -   -- not ok: illegal 
characters "!="
"aB 0X 00  ff  00  ff  00  ff  12"  , -   -- not ok: illegal 
characters "!="
"  aB  00  ff  00  ff  00  ff  12"  , -   -- not ok: leading 
whitespace
"aB  00  ff  00  ff  00  ff  12  "  ) -- not ok: trailing 
whitespace
do h over hexStrings
   call testHexStringWithX2ch
   say
   call testHexStringWithCppX2C h
   say
   say "="~copies(79)
end

The x2c()-BIF has two wrong error messages:

  * the hexadecimal string "Ab  0 0  ff  00  ff  00  ff  12" has an illegal 
whitespace in position
5, x2c() reports wrongly position 6 instead,

  * the hexadecimal string "AB 00 ff 00 ff 00 ff 1" is an incomplete 
hexadecimal string (last hex
digit missing), x2c() r

Re: [Oorexx-devel] An alternative algorithm for x2c()

2019-04-19 Thread Rick McGuire
Well, you're missing all of the logic that allows for blanks at the
boundaries and you're also assuming that you have an even number of digits
to convert. If you remove all of the things that x2c() has to account for,
then yes, you can do a faster conversion.

Rick

On Fri, Apr 19, 2019 at 6:51 AM Rony G. Flatscher 
wrote:

> Thank you all for your feedback to my quite "dirty" ;) posting!
>
> As Mike pointed out 'a'-'z' were not honored, but also blanks in between
> pairs of hex digits, which Rexx allows for.
>
> Rewriting the algorithm a little bit it runs even faster (up to 4.5 times
> compared to x2c()):
>
> /* A Rexx hex string may include any number of whitespace after even hex 
> characters, e.g.
>  "00 01   02"x
> */
> int64_t hex2char(CSTRING hexData, size_t len, char *data)
> {
> size_t dIdx=0;
>
> for (size_t i=0; i {
> char tmp='\0';
>
> if  (hexData[i]>='0' && hexData[i]<='9') tmp=hexData[i]-'0';
> else if (hexData[i]>='A' && hexData[i]<='F') tmp=hexData[i]-'A'+10;
> else if (hexData[i]>='a' && hexData[i]<='f') tmp=hexData[i]-'a'+10;
> else if (hexData[i]==' ' || hexData[i]=='\t') continue; // skip on 
> whitespace
> else
> {
> data[0]='\0';
> return -i;  // error: return position as negative number
> }
> tmp<<=4;// shift to high order nibble
>
> i++;// position on next hex digit
> if  (hexData[i]>='0' && hexData[i]<='9') tmp|=hexData[i]-'0';
> else if (hexData[i]>='A' && hexData[i]<='F') tmp|=hexData[i]-'A'+10;
> else if (hexData[i]>='a' && hexData[i]<='f') tmp|=hexData[i]-'a'+10;
> else// also whitespace is not allowed in between of a hex pair
> {
> data[0]='\0';
> return -i;  // error: return position as negative number
> }
> data[dIdx++]=tmp;
> }
> data[dIdx]='\0';
> return dIdx;// return number or characters in data
> }
>
> RexxRoutine1( RexxStringObject, cppX2C, CSTRING, hexData  )
> {
> size_t len=strlen(hexData);
>
> char *data=(char *) malloc (len+1);
> int64_t res=hex2char(hexData,len,data);  // true, if translated; false, 
> else
>
> if (res==-1)
> {
> free(data);
> return NULLOBJECT;  // indicate error; TODO: raise exception?
> }
>
> RexxStringObject rso=context->String(data, res);
> free(data);
>
> return rso;
> }
>
> Maybe missing something compared to the x2c()-BIF (x2c-method in the
> String class).
>
> ---rony
>
> ___
> Oorexx-devel mailing list
> Oorexx-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/oorexx-devel
>
___
Oorexx-devel mailing list
Oorexx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oorexx-devel


Re: [Oorexx-devel] An alternative algorithm for x2c()

2019-04-19 Thread Rony G. Flatscher
Thank you all for your feedback to my quite "dirty" ;) posting!

As Mike pointed out 'a'-'z' were not honored, but also blanks in between pairs 
of hex digits, which
Rexx allows for.

Rewriting the algorithm a little bit it runs even faster (up to 4.5 times 
compared to x2c()):

/* A Rexx hex string may include any number of whitespace after even hex 
characters, e.g.
 "00 01   02"x
*/
int64_t hex2char(CSTRING hexData, size_t len, char *data)
{
size_t dIdx=0;

for (size_t i=0; i='0' && hexData[i]<='9') tmp=hexData[i]-'0';
else if (hexData[i]>='A' && hexData[i]<='F') tmp=hexData[i]-'A'+10;
else if (hexData[i]>='a' && hexData[i]<='f') tmp=hexData[i]-'a'+10;
else if (hexData[i]==' ' || hexData[i]=='\t') continue; // skip on 
whitespace
else
{
data[0]='\0';
return -i;  // error: return position as negative number
}
tmp<<=4;// shift to high order nibble

i++;// position on next hex digit
if  (hexData[i]>='0' && hexData[i]<='9') tmp|=hexData[i]-'0';
else if (hexData[i]>='A' && hexData[i]<='F') tmp|=hexData[i]-'A'+10;
else if (hexData[i]>='a' && hexData[i]<='f') tmp|=hexData[i]-'a'+10;
else// also whitespace is not allowed in between of a hex 
pair
{
data[0]='\0';
return -i;  // error: return position as negative number
}
data[dIdx++]=tmp;
}
data[dIdx]='\0';
return dIdx;// return number or characters in data
}

RexxRoutine1( RexxStringObject, cppX2C, CSTRING, hexData  )
{
size_t len=strlen(hexData);

char *data=(char *) malloc (len+1);
int64_t res=hex2char(hexData,len,data);  // true, if translated; false, 
else

if (res==-1)
{
free(data);
return NULLOBJECT;  // indicate error; TODO: raise exception?
}

RexxStringObject rso=context->String(data, res);
free(data);

return rso;
}

Maybe missing something compared to the x2c()-BIF (x2c-method in the String 
class).

---rony


___
Oorexx-devel mailing list
Oorexx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oorexx-devel