Re: [Python-Dev] order of Misc/ACKS

2011-11-13 Thread Stephen J. Turnbull
Xavier Morel writes:
 > On 2011-11-12, at 10:24 , Georg Brandl wrote:
 > > On 12.11.2011 08:03, Stephen J. Turnbull wrote:
 > >
 > > > The sensible thing is to just sort in Unicode code point order, I
 > > > think.
 > >
 > > The sensible thing is to accept that there is no solution, and to stop
 > > worrying.
 >
 > The file could use the default collation order, that way it'd be
 > incorrectly sorted for everybody.

What I tell you three times is true.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] order of Misc/ACKS

2011-11-12 Thread Larry Hastings

On 11/11/2011 11:03 PM, Stephen J. Turnbull wrote:

> The sensible thing is to just sort in Unicode code point order, I
> think.


I was going to suggest the official Unicode Collation Algorithm:

   http://unicode.org/reports/tr10/

But I peeked in the can, saw it was chock-a-block with worms, and 
declined to open it.



/larry/


Re: [Python-Dev] order of Misc/ACKS

2011-11-12 Thread Georg Brandl
On 12.11.2011 08:03, Stephen J. Turnbull wrote:
> Eli Bendersky writes:
>
>  > special locale. It makes me wonder whether it's possible to have a
>  > contradiction in the ordering, i.e. have a set of names that just
>  > can't be sorted in any order acceptable by everyone.
>
> Yes, it is.  The examples were already given in this thread.  The
> Han-using languages also have this problem, and Japanese is
> nondeterministic all by itself (there are kanji names which for
> historical reasons are pronounced in several different ways, and
> therefore cannot be placed in phonetic order without additional
> information).
>
> The sensible thing is to just sort in Unicode code point order, I
> think.

The sensible thing is to accept that there is no solution, and to stop
worrying.

Georg



Re: [Python-Dev] order of Misc/ACKS

2011-11-12 Thread Barry Warsaw
On Nov 12, 2011, at 04:03 PM, Stephen J. Turnbull wrote:

> The sensible thing is to just sort in Unicode code point order, I
> think.

M-x sort-lines-by-unicode-point-order RET

<wink>

-Barry


Re: [Python-Dev] order of Misc/ACKS

2011-11-12 Thread Xavier Morel
On 2011-11-12, at 10:24 , Georg Brandl wrote:
> On 12.11.2011 08:03, Stephen J. Turnbull wrote:
>> Eli Bendersky writes:
>>
>>> special locale. It makes me wonder whether it's possible to have a
>>> contradiction in the ordering, i.e. have a set of names that just
>>> can't be sorted in any order acceptable by everyone.
>>
>> Yes, it is.  The examples were already given in this thread.  The
>> Han-using languages also have this problem, and Japanese is
>> nondeterministic all by itself (there are kanji names which for
>> historical reasons are pronounced in several different ways, and
>> therefore cannot be placed in phonetic order without additional
>> information).
>>
>> The sensible thing is to just sort in Unicode code point order, I
>> think.
>
> The sensible thing is to accept that there is no solution, and to stop
> worrying.

The file could use the default collation order, that way it'd be incorrectly
sorted for everybody.


Re: [Python-Dev] order of Misc/ACKS

2011-11-11 Thread Ezio Melotti

Hi,

On 11/11/2011 10.39, Eli Bendersky wrote:

> The PS: at the top of Misc/ACKS says:
>
> PS: In the standard Python distribution, this file is encoded in UTF-8
> and the list is in rough alphabetical order by last names.
>
> However, the last 3 names in the list don't appear to be part of that
> alphabetical order. Is this somehow intentional, or just a mistake?


Only the last two are out of place, and should be fixed.  The 'Å' in 
Peter Åstrand sorts after 'Z'.
See http://mail.python.org/pipermail/python-dev/2010-August/102961.html 
for a discussion about the order of Misc/ACKS.


Best Regards,
Ezio Melotti


> Eli





Re: [Python-Dev] order of Misc/ACKS

2011-11-11 Thread Martin v. Löwis
On 11.11.2011 10:56, Ezio Melotti wrote:
> Hi,
>
> On 11/11/2011 10.39, Eli Bendersky wrote:
>> The PS: at the top of Misc/ACKS says:
>>
>> PS: In the standard Python distribution, this file is encoded in UTF-8
>> and the list is in rough alphabetical order by last names.
>>
>> However, the last 3 names in the list don't appear to be part of that
>> alphabetical order. Is this somehow intentional, or just a mistake?
>
> Only the last two are out of place, and should be fixed.  The 'Å' in
> Peter Åstrand sorts after 'Z'.
> See http://mail.python.org/pipermail/python-dev/2010-August/102961.html
> for a discussion about the order of Misc/ACKS.

The key point here is that it is *rough* alphabetic order. IMO, sorting
accented characters along with their unaccented versions would be fine
as well, and be more practical. In general, it's not possible to provide
a correct alphabetic order. For example, in German, 'ö' sorts after
'o', whereas in Swedish, it sorts after 'z'. In fact, in German, we have
two different ways of sorting the ö: one is to treat it as a letter
after o, and the other is to treat it as equivalent to oe.
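
A minimal sketch of what "sorting accented characters along with their
unaccented versions" could look like in Python. It assumes the accents can
be removed by Unicode decomposition, which is true for 'Å' and 'ö' but not
for letters such as 'ø' or 'ł' that have no decomposition. The helper names
and the sample names (other than Peter Åstrand) are made up for
illustration and are not taken from Misc/ACKS tooling.

```python
import unicodedata


def strip_accents(text):
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def rough_key(full_name):
    """Sort key: accent-free, case-folded last word, then the last word itself.

    The second element only breaks ties between e.g. 'Oster' and 'Öster'.
    """
    last = unicodedata.normalize("NFC", full_name.split()[-1])
    return (strip_accents(last).casefold(), last)


# Hypothetical sample data; only "Peter Åstrand" appears in the thread.
names = ["Alice Zimmer", "Carol Öster", "Peter Åstrand", "Bob Oster"]
rough_order = sorted(names, key=rough_key)
print(rough_order)
```

With this key 'Åstrand' lands among the A's and 'Öster' next to 'Oster',
which is one way to read "rough" alphabetical order.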

Regards,
Martin


Re: [Python-Dev] order of Misc/ACKS

2011-11-11 Thread Eli Bendersky
> The key point here is that it is *rough* alphabetic order. IMO, sorting
> accented characters along with their unaccented versions would be fine
> as well, and be more practical. In general, it's not possible to provide
> a correct alphabetic order. For example, in German, 'ö' sorts after
> 'o', whereas in Swedish, it sorts after 'z'. In fact, in German, we have
> two different ways of sorting the ö: one is to treat it as a letter
> after o, and the other is to treat it as equivalent to oe.

This is really interesting. I guess lexical ordering of alphabet
letters is a locale thing, but Misc/ACKS isn't supposed to be any
special locale. It makes me wonder whether it's possible to have a
contradiction in the ordering, i.e. have a set of names that just
can't be sorted in any order acceptable by everyone. We can then call
it the Misc/ACKS incompleteness theorem ;-)
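
To make the disagreement concrete, here is a rough sketch with two
hand-written sort keys. They are simplified stand-ins for German dictionary
order ('ö' grouped with 'o') and Swedish order ('å', 'ä', 'ö' after 'z'),
not real locale collation; the key functions and the three sample surnames
are assumptions made up for this example.

```python
import unicodedata

# Simplified Swedish alphabet: å, ä, ö are separate letters after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"

# Simplified German dictionary rule: umlauts sort like their base vowels.
GERMAN_FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u"})


def swedish_key(word):
    """Map each letter to its position in the Swedish alphabet.

    Raises ValueError for characters outside SWEDISH_ALPHABET.
    """
    folded = unicodedata.normalize("NFC", word).casefold()
    return [SWEDISH_ALPHABET.index(ch) for ch in folded]


def german_key(word):
    """Fold umlauts onto their base vowels before comparing."""
    folded = unicodedata.normalize("NFC", word).casefold()
    return folded.translate(GERMAN_FOLD)


# Hypothetical surnames chosen only to show the conflict.
surnames = ["Zorn", "Öberg", "Ober"]
german_order = sorted(surnames, key=german_key)
swedish_order = sorted(surnames, key=swedish_key)
print(german_order)
print(swedish_order)
```

The two keys put 'Öberg' and 'Zorn' in opposite relative order, so no
single list can follow both conventions at once.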

Eli


Re: [Python-Dev] order of Misc/ACKS

2011-11-11 Thread Stephen J. Turnbull
Eli Bendersky writes:

 > special locale. It makes me wonder whether it's possible to have a
 > contradiction in the ordering, i.e. have a set of names that just
 > can't be sorted in any order acceptable by everyone.

Yes, it is.  The examples were already given in this thread.  The
Han-using languages also have this problem, and Japanese is
nondeterministic all by itself (there are kanji names which for
historical reasons are pronounced in several different ways, and
therefore cannot be placed in phonetic order without additional
information).

The sensible thing is to just sort in Unicode code point order, I
think.
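
For what it's worth, code point order is what Python 3 gives you by
default when comparing str objects, so a sketch of this proposal is short.
It assumes names are sorted by their last whitespace-separated word and
normalized to NFC first; the `last_name` helper and the sample names (other
than Peter Åstrand) are hypothetical.

```python
import unicodedata


def last_name(full_name):
    """Return the final whitespace-separated word, NFC-normalized."""
    return unicodedata.normalize("NFC", full_name.split()[-1])


# Hypothetical sample data; only "Peter Åstrand" appears in the thread.
names = ["Peter Åstrand", "Alice Zimmer", "Carol Öster", "Bob Oster"]

# str comparison in Python 3 is by code point, so no custom collation is needed.
by_code_point = sorted(names, key=last_name)
print(by_code_point)
```

Here 'Å' (U+00C5) and 'Ö' (U+00D6) both sort after 'Z' (U+005A), which
matches the observation earlier in the thread that Åstrand belongs after
the Z's.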