RE: Hexadecimal digits?

Jim Allan Mon, 10 Nov 2003 12:49:11 -0800

Jim Ramonsky posted:

I am not the one who has not thought it through. There _is_ no difference between decimal 7 and hex 7. They are the same digit. File777 sorts before File999 in _ALL_ radices.

Exactly.

So strings that mix hex and decimal numbers will not sort or compare properly under a natural sort *string* comparison, even if clones of the alpha characters with numeric values are created.

Why then use a natural sort at all?

If you want a natural sort using a mixed alpha and numeric string which may use multiple bases, a reasonable procedure might be to use the Unicode subscript numbers as base markers.

Upon reaching one of these, the parser evaluates the subscript digits as a decimal number naming the base, and then scans backward until it comes to the first character that is not a digit in that base. Then it can simply zero-extend the number for sort or comparison. Or a binary value can be used for sort or comparison if required.

This works for all bases up to base 36 (the ten digits plus the twenty-six letters). Such a system would be understood on sight by humans.
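As a rough illustration of the procedure just described, here is a minimal Python sketch. The function name base_marked_key, the hyphen separator in the file names, and the restriction to uppercase letters as digits are all my own assumptions for the example. Without a separator or a case rule, a letter at the end of "File" could itself be read as a digit in a high enough base.

```python
import re

# Unicode subscript digits U+2080..U+2089, used here as base markers.
SUBSCRIPT_DIGITS = "₀₁₂₃₄₅₆₇₈₉"
TO_ASCII = str.maketrans(SUBSCRIPT_DIGITS, "0123456789")
# Assumption: only uppercase letters count as digits above 9.
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
MARKER = re.compile("[" + SUBSCRIPT_DIGITS + "]+")

def base_marked_key(name):
    """Build a sort key of alternating (0, text) and (1, number) parts."""
    key = []
    pos = 0  # start of the text not yet placed in the key
    for m in MARKER.finditer(name):
        base = int(m.group().translate(TO_ASCII))
        if not 2 <= base <= 36:
            continue  # not a usable base marker; leave it as text
        valid = DIGITS[:base]
        start = m.start()
        # Scan backward to the first character that is not a digit in this base.
        while start > pos and name[start - 1] in valid:
            start -= 1
        if start == m.start():
            continue  # marker with no digits in front of it; leave as text
        key.append((0, name[pos:start]))
        key.append((1, int(name[start:m.start()], base)))
        pos = m.end()
    key.append((0, name[pos:]))
    return key

names = ["File-1F₁₆", "File-25₁₀", "File-101₂"]
print(sorted(names, key=base_marked_key))
# ['File-101₂', 'File-25₁₀', 'File-1F₁₆']  (values 5, 25, 31)
```

The key compares the numbers as integers rather than zero-extending the digit strings, which is the "binary value" variant mentioned above.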

Or again, if hex numbers are the only issue, use some normal hex-indication flag in the string so that both humans and the customized natural sort will know that the number is hex and where the number begins and ends, e.g. File-0x15A-19, File-0xB23A5-25, File-0x123ABCD-Extra, in which the center portion, between the two hyphens, would be recognized as hex by the "0x" prefix.
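A customized natural sort of that kind might look something like the following Python sketch. The name hex_aware_key is hypothetical, and the rule that any run of digits without a "0x" prefix is decimal is my assumption.

```python
import re

# A "0x" run is hex; any other run of ASCII digits is taken as decimal.
TOKEN = re.compile(r"0x[0-9A-Fa-f]+|[0-9]+")

def hex_aware_key(name):
    """Build a sort key of alternating (0, text) and (1, number) parts."""
    key = []
    pos = 0
    for m in TOKEN.finditer(name):
        key.append((0, name[pos:m.start()]))
        token = m.group()
        # int() accepts the "0x" prefix when the base is given as 16.
        value = int(token, 16) if token.startswith("0x") else int(token)
        key.append((1, value))
        pos = m.end()
    key.append((0, name[pos:]))
    return key

names = ["File-0x10-1", "File-0x9-1", "File-0x15A-19"]
print(sorted(names))                     # plain string sort puts 0x10 before 0x9
print(sorted(names, key=hex_aware_key))  # 0x9 (9), 0x10 (16), 0x15A (346)
```

Because every key starts with a text part and then alternates, a number is never compared against a string at the same position.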

Using symbols that the computer automatically distinguishes while human beings do not is a *dangerous* solution to any problem. Enough typos are made even when symbols are different. It is common in producing random uppercase alpha / numeric codes to avoid 0, O, Q, 1, I, 5, S, 8, B, U, V for that reason alone.
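For what it is worth, that practice is easy to sketch in Python. The names SAFE_ALPHABET and make_code are made up for the example, and the excluded set is just the list given above, not any standard.

```python
import secrets
import string

# Characters commonly avoided because they are easily confused with one another.
CONFUSABLE = set("0OQ1I5S8BUV")

# Uppercase letters and digits minus the confusable ones: 25 characters remain.
SAFE_ALPHABET = "".join(
    ch for ch in string.ascii_uppercase + string.digits if ch not in CONFUSABLE
)

def make_code(length=8):
    """Return a random code drawn only from the safe alphabet."""
    return "".join(secrets.choice(SAFE_ALPHABET) for _ in range(length))

print(make_code())
```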

Now a completely new set of hex digits, as has been suggested, might make sense. But that is not for Unicode to prescribe, but for mathematical associations or perhaps some other computer standards organization. If such a set of digits were proposed by international organizations with very strong backing (comparable to introduction of the Euro symbol) then they would certainly have a place in Unicode.

Or if a particular computer language were to introduce them in the PUA for that language and that usage became popular, then again they would be encoded by Unicode.

But one wants to avoid as much as possible symbols that look identical to human beings but have radically different meanings. Unicode has enough of those by necessity and for backward compatibility.

Jim Allan
