Jim Allan wrote:

> Jonathan Kaye wrote:
>> I should say that the sort does seem to work on the ascii codes so that
>> higher codes come later in the collating sequence.
> 
> The OpenOffice.org sort order for the Latin 1 characters (including
> ASCII characters) is:
> 
> ` ´ ^ ¯ ¨ ¸ _ - , ; : ! ¡ ? ¿ . · ' ' ‚ " « » ( ) [ ] { } § ¶ © ® @ * /
> \ & # % ° + ± ÷ × < = > ¬ | ¦ ~ ¤ ¢ £ ¥ 0 1 ¹ ½ ¼ 2 ² 3 ³ ¾ 4 5 6 7 8 9
> a A ª á Á à À â Â â å Å ä Ä ã Ã æ Æ b B c C ç Ç d D ð Ð e E é É è È ê Ê
> ë Ë f F g G h H i I í Í ì Ì î Î ï Ï j J k K l L m M n N ñ Ñ o O º ó Ó ò
> Ó ô Ô ö Ö õ Õ ø Ø p P q Q r R s S ß t T u U ú Ú ù Ù û Û ü Ü v V w W x X
> y Y ý Ý ÿ z Z þ Þ µ
> 
> This isn't the same as the native ASCII order or native Latin 1 order
> but is much more reasonable.
> 
> See http://unicode.org/reports/tr10/ for information on the Unicode
> Collation Algorithm which is what OpenOffice.org is supposed to be using
> and which is here seen to be using, at least in this simple sort of
> characters standing alone, despite the bug you have found involving
> numeric digits
> 
> Jim Allan
Thanks again Jim, but this is not really the nature of the problem. You can
try the experiment yourself. It's quite surprising. From your table you see
that the character Ô appears before the character Õ. In an empty table put
aÕ in cell A1 and aÔ in A2. Do a sort based on column A and the contents of
cells A1 and A2 change places. This is exactly what you expect, since Õ is
ordered AFTER Ô in the collating sequence. OK, now put aÔ3 in cell A3 and
aÕ2 in A4 and do the sort again. You would expect aÔ3 to stay above aÕ2 in
the sorted version, right? But it doesn't! Now enter aO in A5 and sort
again. aO appears at the top of the list. Edit cell A1 (now containing aO),
changing it to aO9, and do another sort. Boom! Now aO9 is at the bottom of
the list.
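
To make the expectation concrete, here is a minimal Python sketch of a plain
character-by-character sort that uses only the relative order from your
table. TABLE_ORDER and table_key are names made up for this illustration,
and the excerpt covers just the characters used in the experiment; this is
not how OpenOffice.org implements its sort.

```python
import unicodedata

# Excerpt of the quoted sort order, limited to the characters in the experiment.
TABLE_ORDER = unicodedata.normalize("NFC", "0123456789aAOÔÕ")

def table_key(text):
    # Rank each character by its position in the excerpt, left to right.
    # Characters outside the excerpt raise ValueError.
    return [TABLE_ORDER.index(ch) for ch in unicodedata.normalize("NFC", text)]

cells = ["aÕ", "aÔ", "aÔ3", "aÕ2", "aO9"]
print(sorted(cells, key=table_key))
# prints ['aO9', 'aÔ', 'aÔ3', 'aÕ', 'aÕ2']
```

With this kind of comparison aÔ3 stays above aÕ2 and aO9 stays at the top,
which is what I expected the spreadsheet to do.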

The problem seems clear: the addition of numbers changes the sensitivity of
the sort. If there are no numbers, then the characters O, Ô and Õ are
distinct and sorted in the order you gave. If you add a following number,
they all merge and the sort is then based on the following number.
To summarise: aO5 aÕ2 aÔ4 will give a bad sort. If the numbers are the same,
as in aO5 aÕ5 aÔ5, then the sort is good, BUT if anything follows the "5" in
the previous example then the sort is bad: aO5z aÕ5a aÔ5g is sorted based
on the FINAL character. The sort comes out as aÕ5a, aÔ5g, aO5z.
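
The orderings I am seeing can be reproduced by a two-level comparison: first
compare the whole string with the accents stripped, and look at the accents
only if that gives a tie. The Python sketch below is just a model of the
observed behaviour, not OpenOffice.org's actual code. two_level_key is a
made-up name, and ranking the accents by code point is a simplification that
happens to fit O, Ô and Õ.

```python
import unicodedata

def two_level_key(text):
    bases = []   # characters with accents stripped (first level)
    marks = []   # combining marks attached to each base character (second level)
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and bases:
            marks[-1] += ch
        else:
            bases.append(ch)
            marks.append("")
    # Tuples compare the stripped string first; accents only break ties.
    return ("".join(bases), tuple(marks))

print(sorted(["aO5", "aÕ2", "aÔ4"], key=two_level_key))
# prints ['aÕ2', 'aÔ4', 'aO5']
print(sorted(["aÕ5", "aO5", "aÔ5"], key=two_level_key))
# prints ['aO5', 'aÔ5', 'aÕ5']
print(sorted(["aO5z", "aÕ5a", "aÔ5g"], key=two_level_key))
# prints ['aÕ5a', 'aÔ5g', 'aO5z']
```

The last line matches the order I get in the spreadsheet, which is why it
looks as if O, Ô and Õ merge as soon as something differs later in the cell.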

I hope this is clearer now and that you can see the nature of the problem.
BTW all sorts are done with the "Case Sensitive" box checked and the
language set to "None".
Cheers,
Jonathan
-- 
Registered Linux user #445917 at http://counter.li.org/
