Daniel Staal wrote:

Quick question: is this data more representative than the data in the first email? In particular, does set 4 from the first email actually exist, exactly as listed, anywhere?


If this latter data is more representative I'd bet on ASCIIbetical ordering: compare the strings one character at a time based on each character's ASCII code, and stop at the first character that differs. Don't treat digits or punctuation specially; handle them like any other character.

Daniel T. Staal

The data from the first email actually exists in the file. Set 4 was:


000
0000

and yes, that appears in the file in that order. The sample data I gave in the last email also exists exactly as listed.

So in set 4, the first 3 characters match and then the first string runs out. If we treat the missing character as a null, which has ASCII value 0, then 000 comes before 0000 and we have the right ordering.
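
A minimal sketch of that rule in Perl, assuming plain character-code comparison with no locale in effect. The name ascii_cmp is made up for illustration; it is not anything from the file or from a module.

```perl
use strict;
use warnings;

# Compare two strings one character at a time by character code.
# A string that runs out early is treated as if padded with NUL
# (code 0), so a string that is a prefix of another sorts first.
sub ascii_cmp {
    my ($x, $y) = @_;
    my $len = length($x) > length($y) ? length($x) : length($y);
    for my $i (0 .. $len - 1) {
        my $cx = $i < length($x) ? ord(substr($x, $i, 1)) : 0;
        my $cy = $i < length($y) ? ord(substr($y, $i, 1)) : 0;
        return $cx <=> $cy if $cx != $cy;
    }
    return 0;    # identical strings
}

# Set 4 from the first email, deliberately given in the wrong order.
print "$_\n" for sort { ascii_cmp($a, $b) } ('0000', '000');
```

For data like this, Perl's built-in cmp operator (without "use locale") should give the same ordering, so the explicit loop is only there to make the rule visible.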

However, the last two lines in the sample file I gave are:

ABC-MARKET.ABC-MARKET
ABC-MARKET

Here, the first 10 characters match but then the second string runs out. Using our rule, we'd order ABC-MARKET before ABC-MARKET.ABC-MARKET, which is wrong. I guess I could keep that rule with one exception: when the shorter string runs out, if the character at that position in the longer string is a "-" or a ".", then the longer string is "less". Just a thought. I'll have to check the data more and try it out.
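
A rough sketch of that exception in Perl, to have something to try against the data. The name tentative_cmp is hypothetical, and the rule itself is only the guess described above; it has been checked against nothing beyond the two pairs quoted in this thread.

```perl
use strict;
use warnings;

# Tentative rule: compare character by character; if one string is a
# prefix of the other, the shorter one sorts first, UNLESS the next
# character of the longer string is "-" or ".", in which case the
# longer one sorts first.
sub tentative_cmp {
    my ($x, $y) = @_;
    my $min = length($x) < length($y) ? length($x) : length($y);
    for my $i (0 .. $min - 1) {
        my $c = substr($x, $i, 1) cmp substr($y, $i, 1);
        return $c if $c;    # first differing character decides
    }
    return 0 if length($x) == length($y);

    # One string ran out; look at the next character of the longer one.
    my $x_is_longer  = length($x) > length($y);
    my $next         = substr($x_is_longer ? $x : $y, $min, 1);
    my $longer_first = ($next eq '-' || $next eq '.');

    if ($x_is_longer) {
        return $longer_first ? -1 : 1;
    }
    return $longer_first ? 1 : -1;
}

my @sample = ('ABC-MARKET', 'ABC-MARKET.ABC-MARKET', '0000', '000');
print "$_\n" for sort { tentative_cmp($a, $b) } @sample;
```

Whether this comparison stays consistent across the whole file (for example, with three strings that are prefixes of one another) is an open question that only the real data can answer.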

Thanks,

Dan
