Re: 'Straße' ('Strasse') and Python 2

2014-01-16 Thread Travis Griggs
On Jan 16, 2014, at 2:51 AM, Robin Becker wrote: > I assure you that I fully understand my ignorance of ... Robin, don’t take this personally, I totally got what you meant. At the same time, I got a real chuckle out of this line. That beats “army intelligence” any day.

Re: 'Straße' ('Strasse') and Python 2

2014-01-16 Thread Tim Chase
On 2014-01-16 14:07, Steven D'Aprano wrote: > The unicode type in Python 2.x is less-good because: > > - it is missing some functionality, e.g. casefold; Just for the record, str.casefold() wasn't added until 3.3, so earlier 3.x versions (such as the 3.2.3 that is the default python3 on Debian St
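For anyone following along, here is a minimal sketch of what casefold buys you for the word in the subject line. It assumes Python 3.3 or later (str.casefold() is not available before that), and same_caseless is a hypothetical helper name used only for illustration.

```python
# Assumes Python 3.3+; str.casefold() does not exist on earlier versions.

def same_caseless(a, b):
    """Hypothetical helper: compare two strings using full case folding."""
    return a.casefold() == b.casefold()

word = "Straße"
lowered = word.lower()     # ß is already lowercase, so lower() leaves it alone
folded = word.casefold()   # case folding expands ß to "ss"

print(lowered, folded)
print(same_caseless("Straße", "STRASSE"))
```

On versions without casefold, comparing with lower() alone would treat 'Straße' and 'STRASSE' as different strings.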

Re: 'Straße' ('Strasse') and Python 2

2014-01-16 Thread Steven D'Aprano
On Thu, 16 Jan 2014 10:51:42 +, Robin Becker wrote: > On 16/01/2014 00:32, Steven D'Aprano wrote: >>> >Or are you saying that www.unicode.org is wrong about the definitions >>> >of Unicode terms? >> No, I think he is saying that he doesn't know Unicode anywhere near as >> well as he thinks he

Re: 'Straße' ('Strasse') and Python 2

2014-01-16 Thread Chris Angelico
On Thu, Jan 16, 2014 at 9:51 PM, Robin Becker wrote: > On 16/01/2014 00:32, Steven D'Aprano wrote: >>> >>> >Or are you saying that www.unicode.org is wrong about the definitions of >>> >Unicode terms? >> >> No, I think he is saying that he doesn't know Unicode anywhere near as >> well as he thinks

Re: 'Straße' ('Strasse') and Python 2

2014-01-16 Thread Robin Becker
On 16/01/2014 00:32, Steven D'Aprano wrote: >Or are you saying that www.unicode.org is wrong about the definitions of >Unicode terms? No, I think he is saying that he doesn't know Unicode anywhere near as well as he thinks he does. The question is, will he cherish his ignorance, or learn from th

Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Steven D'Aprano
On Wed, 15 Jan 2014 12:00:51 +, Robin Becker wrote: > so two 'characters' are 3 (or 2 or more) codepoints. Yes. > If I want to isolate so called graphemes I need an algorithm even > for python's unicode Correct. Graphemes are language dependent, e.g. in Dutch "ij" is usually a single gra
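To make the "two characters, three codepoints" point concrete, here is a rough sketch in Python 3. The function rough_clusters is a hypothetical name, and the grouping rule (attach each combining mark to the preceding codepoint) is only an approximation chosen for illustration. It is not the Unicode text segmentation algorithm and it knows nothing about language-dependent cases such as Dutch "ij".

```python
import unicodedata

def rough_clusters(text):
    """Hypothetical helper: group each codepoint with the combining marks
    that follow it. A crude approximation, not full grapheme segmentation."""
    clusters = []
    for ch in text:
        # unicodedata.combining() returns a nonzero class for combining marks
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

s = "e\u0301a"   # 'e' + COMBINING ACUTE ACCENT, then 'a'
print(len(s))                    # number of codepoints
print(len(rough_clusters(s)))    # number of approximate graphemes

# NFC normalization composes e + accent into one codepoint where one exists
composed = unicodedata.normalize("NFC", s)
print(len(composed))
```

Normalization only helps when a precomposed codepoint exists, which is why some algorithm is still needed on top of the unicode type.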

Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Steven D'Aprano
On Thu, 16 Jan 2014 02:14:38 +1100, Chris Angelico wrote: > On Thu, Jan 16, 2014 at 1:55 AM, wrote: >> On Wednesday, 15 January 2014 13:13:36 UTC+1, Ned Batchelder wrote: >> >> >>> ... more than one codepoint makes up a grapheme ... >> >> No > > Yes. > http://www.unicode.org/faq/char_combmark

Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Terry Reedy
On 1/15/2014 11:55 AM, Robin Becker wrote: The fact that unicoders want to take over the meaning of encoding is not relevant. I agree with you that 'encoding' should not be limited to 'byte encoding of a (subset of) unicode characters'. For instance, .jpg and .png are byte encodings of images

Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Ian Kelly
On Wed, Jan 15, 2014 at 9:55 AM, Robin Becker wrote: > The fact that unicoders want to take over the meaning of encoding is not > relevant. A virus is a small infectious agent that replicates only inside the living cells of other organisms. In the context of computing however, that definition is

Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Robin Becker
On 15/01/2014 17:14, Chris Angelico wrote: On Thu, Jan 16, 2014 at 3:55 AM, Robin Becker wrote: I think about these as encodings, because that's what they are mathematically, logically & practically. I can encode the target grapheme sequence as a sequence of bytes using a particular 'unicode en

Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 3:55 AM, Robin Becker wrote: > I think about these as encodings, because that's what they are > mathematically, logically & practically. I can encode the target grapheme > sequence as a sequence of bytes using a particular 'unicode encoding' eg > utf8 or a sequence of code

Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Robin Becker
On 15/01/2014 16:28, Travis Griggs wrote: of a sequence of graphemes I can use either a sequence of bytes or a sequence of codepoints. They are both encodings of the graphemes; what unicode says is an encoding doesn't define what encodings are ie mappings from some source alphabet to a

Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 1:55 AM, wrote: > On Wednesday, 15 January 2014 13:13:36 UTC+1, Ned Batchelder wrote: > >> >> ... more than one codepoint makes up a grapheme ... > > No Yes. http://www.unicode.org/faq/char_combmark.html >> In Unicode terms, an encoding is a mapping between codepoints
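A small illustration of the codepoints-versus-bytes distinction being argued here, as a sketch assuming Python 3 (where str is a sequence of codepoints and bytes is a separate type). The variable names are made up for the example.

```python
word = "Straße"

# The string itself is a sequence of codepoints
codepoints = [ord(ch) for ch in word]

# An encoding maps those codepoints to bytes; different encodings,
# different byte sequences for the same text
utf8 = word.encode("utf-8")       # ß (U+00DF) becomes two bytes
latin1 = word.encode("latin-1")   # ß becomes a single byte

print(len(word), len(utf8), len(latin1))

# Decoding with the matching encoding recovers the same codepoints
assert utf8.decode("utf-8") == word
assert latin1.decode("latin-1") == word
```

The codepoint count stays the same whichever encoding is used; only the byte representation changes.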

Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread wxjmfauth
On Wednesday, 15 January 2014 13:13:36 UTC+1, Ned Batchelder wrote: > > ... more than one codepoint makes up a grapheme ... No > In Unicode terms, an encoding is a mapping between codepoints and bytes. No jmf -- https://mail.python.org/mailman/listinfo/python-list