Re: String is ASCII or UTF-8?

2010-03-09 Thread Emile van Sebille
On 3/9/2010 1:36 PM Stef Mientki said... On 09-03-2010 18:36, Robert Kern wrote: No, you can't. ASCII strings only have characters in the range 0..127. You could create Latin-1 (or any number of the 8-bit encodings out there) strings with characters 0..255, yes, but not ASCII. Probably, an

Re: String is ASCII or UTF-8?

2010-03-09 Thread Stef Mientki
On 09-03-2010 18:36, Robert Kern wrote: On 2010-03-09 11:12 AM, Stef Mientki wrote: On 09-03-2010 18:02, Alf P. Steinbach wrote: * C. Benson Manica: Hours of Googling has not helped me resolve a seemingly simple question - Given a string s, how can I tell whether it's ascii (and thus 1 byte pe

Re: String is ASCII or UTF-8?

2010-03-09 Thread Martin v. Loewis
> I can create ASCII strings containing byte values between 127 and 255. No, you can't - or what you create wouldn't be an ASCII string, by definition of ASCII. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
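A minimal sketch of the definition being argued here, assuming Python 3 `bytes` (the thread itself is about Python 2.4.3, where `str` plays that role). The helper name `is_ascii` is made up for illustration: a byte sequence is ASCII only if every byte value lies in 0..127.

```python
def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the ASCII range 0..127."""
    # Iterating over a bytes object in Python 3 yields integers.
    return all(b < 128 for b in data)


if __name__ == "__main__":
    print(is_ascii(b"hello"))                       # pure ASCII bytes
    print(is_ascii("caf\u00e9".encode("latin-1")))  # contains byte 0xE9
```

Anything that fails this check may still be a perfectly good Latin-1 or UTF-8 string; it just is not ASCII.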

Re: String is ASCII or UTF-8?

2010-03-09 Thread Roel Schroeven
Op 2010-03-09 18:31, C. Benson Manica schreef: > On Mar 9, 12:24 pm, "Richard Brodie" wrote: >> "C. Benson Manica" wrote in >> messagenews:98375575-1071-46af-8ebc-f3c817b47...@q23g2000yqd.googlegroups.com... >> >>> The strings come from the same place, i.e. they're exclusively >>> normal ASCII c

Re: String is ASCII or UTF-8?

2010-03-09 Thread Terry Reedy
On 3/9/2010 11:54 AM, C. Benson Manica wrote: Hours of Googling has not helped me resolve a seemingly simple question - Given a string s, how can I tell whether it's ascii (and thus 1 byte per character) or UTF-8 (and two bytes per character)? Utf-8 is an encoding that uses 1 to 4 bytes per cha
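To illustrate the "1 to 4 bytes per character" point, here is a small sketch, again assuming Python 3 where `str` holds Unicode text. The helper name `utf8_width` and the sample characters are chosen purely for illustration.

```python
def utf8_width(ch: str) -> int:
    """Return how many bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    # U+0041, U+00E9, U+20AC, U+1F600: one sample per encoded length.
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        print("U+%04X" % ord(ch), utf8_width(ch))
```

So UTF-8 is not "two bytes per character": plain ASCII characters still take one byte, and others take two, three, or four.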

Re: String is ASCII or UTF-8?

2010-03-09 Thread Robert Kern
On 2010-03-09 11:12 AM, Stef Mientki wrote: On 09-03-2010 18:02, Alf P. Steinbach wrote: * C. Benson Manica: Hours of Googling has not helped me resolve a seemingly simple question - Given a string s, how can I tell whether it's ascii (and thus 1 byte per character) or UTF-8 (and two bytes per

Re: String is ASCII or UTF-8?

2010-03-09 Thread C. Benson Manica
On Mar 9, 12:24 pm, "Richard Brodie" wrote: > "C. Benson Manica" wrote in > messagenews:98375575-1071-46af-8ebc-f3c817b47...@q23g2000yqd.googlegroups.com... > > >The strings come from the same place, i.e. they're exclusively > > normal ASCII characters. > > In this case then converting them to/f

Re: String is ASCII or UTF-8?

2010-03-09 Thread Richard Brodie
"C. Benson Manica" wrote in message news:98375575-1071-46af-8ebc-f3c817b47...@q23g2000yqd.googlegroups.com... >The strings come from the same place, i.e. they're exclusively > normal ASCII characters. In this case then converting them to/from UTF-8 is a no-op, so it makes no difference at all.

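The "no-op" remark can be checked directly. This is a sketch assuming Python 3 and text that really is pure ASCII; the function name `same_bytes` is hypothetical.

```python
def same_bytes(text: str) -> bool:
    """Return True if the ASCII and UTF-8 encodings of text are identical.

    Assumes text contains only ASCII characters; otherwise
    text.encode("ascii") raises UnicodeEncodeError.
    """
    return text.encode("ascii") == text.encode("utf-8")


if __name__ == "__main__":
    text = "exclusively normal ASCII characters"
    print(same_bytes(text))
    # Decoding the UTF-8 bytes as ASCII gives the original text back.
    print(text.encode("utf-8").decode("ascii") == text)
```

For such strings the two encodings produce byte-for-byte identical output, so there is nothing to detect and nothing to convert.
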
Re: String is ASCII or UTF-8?

2010-03-09 Thread C. Benson Manica
On Mar 9, 12:07 pm, Tim Golden wrote: > You can't. You can apply one or more heuristics, depending on exactly > what your requirement is. But any valid ASCII text is also valid > UTF8-encoded text since UTF-8 isn't "two bytes per char" but a variable > number of bytes per char. Hm, well that's v
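One possible heuristic of the kind mentioned above, sketched under the assumption of Python 3 `bytes` input. The function name `classify` and its return labels are invented for this example. It tries the strictest interpretation first and can only report what the bytes are consistent with, not what the sender intended.

```python
def classify(data: bytes) -> str:
    """Guess an encoding label for data by trial decoding.

    Returns "ascii" if all bytes are in 0..127, "utf-8" if the bytes
    form valid UTF-8 with at least one multi-byte sequence, and
    "unknown 8-bit" otherwise (for example Latin-1 text).
    """
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown 8-bit"


if __name__ == "__main__":
    print(classify(b"hello"))
    print(classify("caf\u00e9".encode("utf-8")))
    print(classify("caf\u00e9".encode("latin-1")))
```

Because every valid ASCII string is also valid UTF-8, the "ascii" answer really means "ASCII, and therefore also UTF-8". A "utf-8" answer is only strong evidence: some byte sequences from other 8-bit encodings happen to decode as UTF-8 too.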

Re: String is ASCII or UTF-8?

2010-03-09 Thread Stef Mientki
On 09-03-2010 18:02, Alf P. Steinbach wrote: * C. Benson Manica: Hours of Googling has not helped me resolve a seemingly simple question - Given a string s, how can I tell whether it's ascii (and thus 1 byte per character) or UTF-8 (and two bytes per character)? This is python 2.4.3, so I don't

Re: String is ASCII or UTF-8?

2010-03-09 Thread Tim Golden
On 09/03/2010 16:54, C. Benson Manica wrote: Hours of Googling has not helped me resolve a seemingly simple question - Given a string s, how can I tell whether it's ascii (and thus 1 byte per character) or UTF-8 (and two bytes per character)? This is python 2.4.3, so I don't have getsizeof availa

Re: String is ASCII or UTF-8?

2010-03-09 Thread Alf P. Steinbach
* C. Benson Manica: Hours of Googling has not helped me resolve a seemingly simple question - Given a string s, how can I tell whether it's ascii (and thus 1 byte per character) or UTF-8 (and two bytes per character)? This is python 2.4.3, so I don't have getsizeof available to me. Generally, i

String is ASCII or UTF-8?

2010-03-09 Thread C. Benson Manica
Hours of Googling has not helped me resolve a seemingly simple question - Given a string s, how can I tell whether it's ascii (and thus 1 byte per character) or UTF-8 (and two bytes per character)? This is python 2.4.3, so I don't have getsizeof available to me.