Thanks. I pretty much get all of your first paragraph. I just would have expected on MVS that the order of the letters in the "C" (default) locale would be pretty much the same as the order of *EBCDIC* characters when looked at as plain 8-bit unsigned integers.
It's one of those things: std::sort and std::lower_bound (both with the same strcasecmp()-based less-than function) are working just fine. They are consistent. My application works flawlessly. I never suspected any issue. So I was just stunned to find that under the covers it is "working" more or less in ASCII rather than EBCDIC.

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On Behalf Of John McKown
Sent: Thursday, June 1, 2017 1:08 PM
To: [email protected]
Subject: Re: strcasecmp() comparing punctuation in ASCII?

On Thu, Jun 1, 2017 at 2:43 PM, Charles Mills <[email protected]> wrote:

> It's clearly doing everything in ASCII:
>
> strcasecmp("Z", "0") 122
>
> It's interesting. I use the same compare function for both a sort and
> for a binary search, so it all works correctly -- it's just not
> working the way I think it is.
>
> Charles

I'm not any kind of an expert on this. So take everything I say with about a kilo of NaCl. As the page you referenced states, the strcasecmp() function is locale sensitive. The locale ordering is NOT based on the code point (hex value) at all (well, at least conceptually). It is based on the "rune", where "rune" is basically the concept of what character this is, such as used in UNICODE (e.g. LATIN-SMALL-LETTER-A is 'a' regardless of the hex value(s) used to store that in memory). For something "simple" you can sort of think of the hex value as being an index into an array of values, where the value at that index is the relative collating position of the "rune" involved in the comparison. This is how strcasecmp("A","a") is "equal": the relative collating positions of "A" and "a" are the same, so the comparison is "equal". Of course, it looks like an ASCII compare because the relative positioning of the letters in the "C" (default) locale is pretty much the same as the order of "ASCII" characters when looked at as plain 8-bit unsigned integers.
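To make the "index into an array" idea concrete, here is a minimal sketch of a table-driven, case-insensitive compare used for both a sort and a binary search. It is an illustration only, not how the z/OS runtime actually implements strcasecmp(). The names make_weights, table_compare, table_less and contains are made up for this example, and the weights are an assumption: each upper-case letter simply takes the weight of its lower-case partner, and every other byte keeps its own code point as its weight.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical collation table: w[b] is the relative collating position of
// the character stored as byte value b. Assumption for this sketch: every
// byte starts with its own code point as its weight, then each upper-case
// letter is given the weight of its lower-case partner.
std::array<int, 256> make_weights() {
    std::array<int, 256> w{};
    for (int b = 0; b < 256; ++b) {
        w[b] = b;
    }
    const std::string lower = "abcdefghijklmnopqrstuvwxyz";
    const std::string upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    for (std::size_t i = 0; i < lower.size(); ++i) {
        w[static_cast<unsigned char>(upper[i])] =
            w[static_cast<unsigned char>(lower[i])];
    }
    return w;
}

// Compare by collating weight rather than by raw byte value.
// Returns <0, 0 or >0, in the style of strcasecmp().
int table_compare(const std::string& a, const std::string& b) {
    static const std::array<int, 256> w = make_weights();
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        const int d = w[static_cast<unsigned char>(a[i])] -
                      w[static_cast<unsigned char>(b[i])];
        if (d != 0) {
            return d;
        }
    }
    if (a.size() == b.size()) {
        return 0;
    }
    return a.size() < b.size() ? -1 : 1;
}

// Less-than predicate shared by the sort and the binary search.
bool table_less(const std::string& a, const std::string& b) {
    return table_compare(a, b) < 0;
}

// Binary search over a vector already sorted with table_less.
bool contains(const std::vector<std::string>& sorted, const std::string& key) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), key, table_less);
    return it != sorted.end() && table_compare(*it, key) == 0;
}
```

As long as std::sort and std::lower_bound are handed the same predicate, the search stays consistent with the sort whatever order the table encodes, which is why the application behaves correctly even when that order is not the one you had in mind.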
