Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-14 Thread Steven D'Aprano
On Tue, 13 Aug 2013 15:34:45 +, Prasad, Ramit wrote: Michael Torrie wrote: [...] However I know of no phone or network that won't let you use longer messages; multiple SMS packets are used and most phones paste them back together. So no there's nothing that anyone needs to change to use

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-13 Thread Chris Angelico
On Tue, Aug 13, 2013 at 4:32 AM, MRAB pyt...@mrabarnett.plus.com wrote: On 13/08/2013 04:20, Jason Friedman wrote: I've always wondered if the 160 character limit or whatever it is is a hard limit in their system, or if it's just a variable they could tweak if they felt like it. I thought

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-13 Thread Joshua Landau
On 11 August 2013 12:14, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: On Sun, 11 Aug 2013 10:44:40 +0100, Joshua Landau wrote: café will be in your Copy-Paste buffer, and you can paste it in to the tweet-box. It takes 5 characters. So much for testing ;). How do you know that

RE: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-13 Thread Prasad, Ramit
Michael Torrie wrote: On 08/11/2013 11:54 PM, Gregory Ewing wrote: Michael Torrie wrote: I've always wondered if the 160 character limit or whatever it is is a hard limit in their system, or if it's just a variable they could tweak if they felt like it. Isn't it for compatibility with

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-12 Thread Gregory Ewing
Michael Torrie wrote: I've always wondered if the 160 character limit or whatever it is is a hard limit in their system, or if it's just a variable they could tweak if they felt like it. Isn't it for compatibility with SMS? Twitter could probably change it, but persuading all the cell phone

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-12 Thread Michael Torrie
On 08/11/2013 11:54 PM, Gregory Ewing wrote: Michael Torrie wrote: I've always wondered if the 160 character limit or whatever it is is a hard limit in their system, or if it's just a variable they could tweak if they felt like it. Isn't it for compatibility with SMS? Twitter could

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-12 Thread Chris Angelico
On Tue, Aug 13, 2013 at 2:48 AM, Michael Torrie torr...@gmail.com wrote: On 08/11/2013 11:54 PM, Gregory Ewing wrote: Michael Torrie wrote: I've always wondered if the 160 character limit or whatever it is is a hard limit in their system, or if it's just a variable they could tweak if they

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-12 Thread Jason Friedman
I've always wondered if the 160 character limit or whatever it is is a hard limit in their system, or if it's just a variable they could tweak if they felt like it. I thought it was 140 characters? https://twitter.com/about

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-12 Thread MRAB
On 13/08/2013 04:20, Jason Friedman wrote: I've always wondered if the 160 character limit or whatever it is is a hard limit in their system, or if it's just a variable they could tweak if they felt like it. I thought it was 140 characters? https://twitter.com/about He did say "or whatever".

Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-11 Thread Joshua Landau
Basically, I think Twitter's broken. For my full discussion on the matter, see: http://www.reddit.com/r/learnpython/comments/1k2yrn/help_with_len_and_input_function_33/cbku5e8 Here's the first post of mine, ineffectually edited for this list: <strikethrough>The obvious solution [to getting the

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-11 Thread Chris Angelico
On Sun, Aug 11, 2013 at 7:17 AM, Joshua Landau jos...@landau.ws wrote: Given tweet = b"caf\x65\xCC\x81".decode(): tweet 'café' But: len(tweet) 5 You're now looking at the difference between glyphs and combining characters. Twitter counts combining characters, so when you
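
The length quoted in this snippet can be reproduced in Python 3. This is a minimal sketch: the names `decomposed` and `composed` are illustrative, and the NFC normalization step is an addition for comparison, not something the snippet itself shows.

```python
import unicodedata

# Bytes from the post: "caf", then "e" (0x65), then the UTF-8 encoding
# of U+0301 COMBINING ACUTE ACCENT (0xCC 0x81).
decomposed = b"caf\x65\xCC\x81".decode()  # bytes.decode() defaults to UTF-8

# len() on a str counts code points, so the combining accent counts separately.
print(len(decomposed))  # 5

# NFC normalization composes "e" + U+0301 into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)
print(len(composed))  # 4
print(composed == "caf\u00e9")  # True
```

Both strings render as the same four glyphs; only the code point count differs.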

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-11 Thread Steven D'Aprano
On Sun, 11 Aug 2013 07:17:42 +0100, Joshua Landau wrote: Basically, I think Twitter's broken. Oh, in about a million ways, but apparently people like it :-( For my full discussion on the matter, see: http://www.reddit.com/r/learnpython/comments/1k2yrn/

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-11 Thread Joshua Landau
On 11 August 2013 10:09, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: The reason some accented letters have single code point forms is to support legacy charsets; the reason some only exist as combining characters is due to the combinational explosion. Some languages allow you
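
The distinction between letters that have a single code point form and those that exist only as combining sequences can be observed with `unicodedata.normalize`. This is a rough sketch: the two sample strings are chosen here for illustration and do not come from the thread.

```python
import unicodedata

# "e" + COMBINING ACUTE ACCENT has a precomposed form (U+00E9), so NFC shortens it.
e_acute = "e\u0301"
print(len(unicodedata.normalize("NFC", e_acute)))  # 1

# "q" + COMBINING ACUTE ACCENT has no precomposed form, so NFC leaves two code points.
q_acute = "q\u0301"
print(len(unicodedata.normalize("NFC", q_acute)))  # 2
```

Normalizing to NFC therefore does not guarantee one code point per visible character.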

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-11 Thread Joshua Landau
On 11 August 2013 07:24, Chris Angelico ros...@gmail.com wrote: On Sun, Aug 11, 2013 at 7:17 AM, Joshua Landau jos...@landau.ws wrote: Given tweet = b"caf\x65\xCC\x81".decode(): tweet 'café' But: len(tweet) 5 You're now looking at the difference between glyphs and

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-11 Thread Steven D'Aprano
On Sun, 11 Aug 2013 10:44:40 +0100, Joshua Landau wrote: On 11 August 2013 10:09, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: The reason some accented letters have single code point forms is to support legacy charsets; the reason some only exist as combining characters is due

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-11 Thread Chris Angelico
On Sun, Aug 11, 2013 at 12:14 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: Consider a single character. It can have 0 to 5 accents, in any combination. Order doesn't matter, and there are no duplicates, so there are: 0 accent: take 0 from 5 = 1 combination; 1 accent: take
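
The counting argument in this snippet (0 to 5 accents, order irrelevant, no duplicates) can be worked through with `math.comb`. A small sketch, assuming Python 3.8 or later; the constant `ACCENTS` is a name introduced here, and the totals are computed from the stated premises rather than taken from the truncated text.

```python
from math import comb  # available from Python 3.8

ACCENTS = 5  # number of distinct accents in the quoted example

# Ways to choose k accents out of 5, for k = 0..5.
counts = [comb(ACCENTS, k) for k in range(ACCENTS + 1)]
total = sum(counts)

print(counts)  # [1, 5, 10, 10, 5, 1]
print(total)   # 32, which equals 2 ** 5
```

Under these premises, each base letter would need 32 precomposed variants, which illustrates the combinatorial growth the snippet refers to.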

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-11 Thread Joshua Landau
On 11 August 2013 12:14, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: On Sun, 11 Aug 2013 10:44:40 +0100, Joshua Landau wrote: On 11 August 2013 10:09, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: The reason some accented letters have single code point forms is

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-11 Thread wxjmfauth
Le dimanche 11 août 2013 11:09:44 UTC+2, Steven D'Aprano a écrit : On Sun, 11 Aug 2013 07:17:42 +0100, Joshua Landau wrote: The reason some accented letters have single code point forms is to support legacy charsets; ... No. jmf PS Unicode normalization is failing expectedly very

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-11 Thread Joshua Landau
On 11 August 2013 13:51, wxjmfa...@gmail.com wrote: Le dimanche 11 août 2013 11:09:44 UTC+2, Steven D'Aprano a écrit : On Sun, 11 Aug 2013 07:17:42 +0100, Joshua Landau wrote: The reason some accented letters have single code point forms is to support legacy charsets; ... No. jmf PS

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-11 Thread MRAB
On 11/08/2013 10:54, Joshua Landau wrote: On 11 August 2013 07:24, Chris Angelico ros...@gmail.com wrote: On Sun, Aug 11, 2013 at 7:17 AM, Joshua Landau jos...@landau.ws wrote: Given tweet = b"caf\x65\xCC\x81".decode(): tweet 'café' But: len(tweet) 5 You're now looking at

Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

2013-08-11 Thread Michael Torrie
On 08/11/2013 09:34 AM, MRAB wrote: If twitter counts characters, not codepoints, you could then ask whether it passes the codepoints through as given. If it does, then you experiment to see how much data you could send encoded as a sequence of combining codepoints. (You might want to check
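
The difference between counting code points and counting base characters, which the experiment described here depends on, can be sketched as follows. The helper `count_base_characters` is hypothetical and only approximates "characters" by skipping code points with a non-zero combining class; it is not a claim about how Twitter counts.

```python
import unicodedata

def count_base_characters(text: str) -> int:
    """Count code points that are not combining marks (an approximation)."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)

# One base letter followed by ten combining acute accents.
stacked = "e" + "\u0301" * 10

print(len(stacked))                    # 11 code points
print(count_base_characters(stacked))  # 1
```

A service that counted this way while passing code points through unchanged would accept far more data than its nominal limit suggests, which is the scenario the message raises.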