Rob,

As you know, compared strings must themselves be of equal length if
anomalies like 'aa' > 'x' are to be avoided.

Case also poses problems.  Are 'john', 'jOHn', 'John' and the like to
match?  Sometimes this is appropriate, even necessary, sometimes not.

There is probably no alternative to writing a macro that 'normalizes'
a set of strings parametrically.  I recommend the use of a scheme that

1) first strips any insignificant leftmost and rightmost blanks,
2) conditionally forces case coherence using the LOWER and/or UPPER bifs,
3) pads on the right with nuls, x'00', not blanks, EBCDIC x'40' or
ASCII x'20', to obtain strings that are of equal length, and
4) conditionally adds non-duplicates to an array or list specified
using the identifier of a global created character set-symbol [array]
to permit multiple sets to be distinguished and managed concurrently
at assembly time.
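
The four steps above can be modelled outside the assembler.  What
follows is only a rough sketch in Python, not macro code: the names
normalize, add_unique and tables, the width of 8 and the fold argument
are all invented for illustration, and a Python dict keyed by a set
name stands in for the created global character set-symbol arrays.

```python
def normalize(s, width, fold=None):
    """Steps 1-3: strip outer blanks, optionally fold case, pad with x'00'."""
    s = s.strip(" ")                      # 1) drop leftmost/rightmost blanks
    if fold == "upper":                   # 2) conditional case coherence
        s = s.upper()
    elif fold == "lower":
        s = s.lower()
    if len(s) > width:
        raise ValueError("string longer than the chosen width")
    return s.ljust(width, "\x00")         # 3) pad on the right with nuls


def add_unique(tables, set_name, s, width, fold=None):
    """Step 4: add the normalized string to the named set unless present."""
    table = tables.setdefault(set_name, [])   # one list per named set
    key = normalize(s, width, fold)
    if key in table:
        return False                      # duplicate after normalization
    table.append(key)
    return True


tables = {}                               # hypothetical: set name -> list
for raw in ("  john ", "jOHn", "John", "x", "aa"):
    add_unique(tables, "NAMES", raw, 8, fold="upper")
```

With every entry padded to the same length, 'aa' and 'x' compare
character by character and 'aa' sorts low, as one would expect.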

If you need to retain information about after-normalization duplicates
you can do so using an array of created global arithmetic set symbols,
adding a sequence-number element to it as you encounter such
duplicates.
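
A sketch of that bookkeeping, again in Python rather than macro
language and with invented names (collect, dup_seq), might look like
this.  It assumes upper-casing as the case rule and counts sequence
numbers from 1; a second list plays the part of the arithmetic
set-symbol array.

```python
def collect(raw_strings, width):
    """Return (unique normalized strings, sequence numbers of duplicates)."""
    table = []      # stands in for the character set-symbol array
    dup_seq = []    # stands in for the arithmetic set-symbol array
    for seq, raw in enumerate(raw_strings, start=1):
        # Same normalization as described above: strip, fold, pad with nuls.
        key = raw.strip(" ").upper().ljust(width, "\x00")
        if key in table:
            dup_seq.append(seq)   # remember which input was a duplicate
        else:
            table.append(key)
    return table, dup_seq


table, dup_seq = collect(["john", " John", "mary", "JOHN "], 8)
```

Here the second and fourth inputs collapse onto the first after
normalization, so their sequence numbers are the ones recorded.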

More fanciness is possible.  Created set symbols can be used to
construct lists and binary-search trees at assembly time.  This is
very fast, but it is appropriate only if you are going to be doing a
lot of table-generation work.
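
To give the flavor of the tree idea, here is a small Python model, not
macro code.  It keeps keys and left/right links in three parallel
lists indexed by node number, which is roughly how parallel arrays of
created set symbols would have to be used; the class and method names
are invented, and -1 is an arbitrary choice for "no child".

```python
class ArrayTree:
    """Binary-search tree held in parallel lists, indexed by node number."""

    def __init__(self):
        self.keys, self.left, self.right = [], [], []

    def _new_node(self, key):
        self.keys.append(key)
        self.left.append(-1)              # -1 means no child
        self.right.append(-1)
        return len(self.keys) - 1

    def insert(self, key):
        """Insert key; return False if it is already present."""
        if not self.keys:
            self._new_node(key)           # first key becomes the root
            return True
        node = 0
        while True:
            if key == self.keys[node]:
                return False
            links = self.left if key < self.keys[node] else self.right
            if links[node] == -1:
                links[node] = self._new_node(key)
                return True
            node = links[node]

    def in_order(self, node=0):
        """Return the keys in ascending order."""
        if node == -1 or not self.keys:
            return []
        return (self.in_order(self.left[node]) + [self.keys[node]]
                + self.in_order(self.right[node]))


tree = ArrayTree()
for k in ("M", "C", "X", "C", "A"):
    tree.insert(k)
```

Duplicate detection falls out of the search itself, and an in-order
walk delivers the table already sorted.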

John Gilmore, Ashland, MA 01721 - USA
