subject:"Unicode Question"

Re: unicode question

2015-01-28 Thread Michael Torrie

On 01/28/2015 03:17 PM, Albert-Jan Roskam wrote: >> I do not know how complete the support is, but this is copied from 3.4.2, >> which uses tcl/tk 8.6. t = "الحركات" for c in t: print(c) # Prints rightmost char above first >> ا >> ل >> ح >> ر >> ك >> ا >> ت > > Wow, I never knew this w

Re: unicode question

2015-01-28 Thread Albert-Jan Roskam

On Wed, Jan 28, 2015 8:21 AM CET Terry Reedy wrote: >On 1/27/2015 12:17 AM, Rehab Habeeb wrote: >> Hi there python staff >> does python support arabic language for texts ? and what to do if it >> support it? >> i wrote hello in Arabic using codeskulptor and the powers

Re: unicode question

2015-01-27 Thread Terry Reedy

On 1/27/2015 12:17 AM, Rehab Habeeb wrote: Hi there python staff does python support arabic language for texts ? and what to do if it support it? i wrote hello in Arabic using codeskulptor and the powershell just for testing and the same error appeared( a sytanx error in unicode)!! I do not kno

Re: unicode question

2015-01-27 Thread random832

On Tue, Jan 27, 2015, at 12:25, Mark Lawrence wrote: > People might find this http://bugs.python.org/issue1602 and hence this > https://github.com/Drekin/win-unicode-console useful. The latter is > available on pypi. However, Arabic is one of those scripts that runs up against the real limitati

Re: unicode question

2015-01-27 Thread Mark Lawrence

On 27/01/2015 16:13, random...@fastmail.us wrote: On Tue, Jan 27, 2015, at 00:17, Rehab Habeeb wrote: Hi there python staff does python support arabic language for texts ? and what to do if it support it? i wrote hello in Arabic using codeskulptor and the powershell just for testing and the same

Re: unicode question

2015-01-27 Thread random832

On Tue, Jan 27, 2015, at 00:17, Rehab Habeeb wrote: > Hi there python staff > does python support arabic language for texts ? and what to do if it > support it? > i wrote hello in Arabic using codeskulptor and the powershell just for > testing and the same error appeared( a sytanx error in unicode)

Re: unicode question

2015-01-26 Thread Chris Angelico

On Tue, Jan 27, 2015 at 4:17 PM, Rehab Habeeb wrote: > Hi there python staff > does python support arabic language for texts ? and what to do if it support > it? > i wrote hello in Arabic using codeskulptor and the powershell just for > testing and the same error appeared( a sytanx error in unicod

unicode question

2015-01-26 Thread Rehab Habeeb

Hi there Python staff, does Python support the Arabic language for text? And what should I do if it does? I wrote hello in Arabic using CodeSkulptor and PowerShell just for testing, and the same error appeared (a syntax error in unicode)!! -- https://mail.python.org/mailman/listinfo/python-list

Re: Beginner python 3 unicode question [SOLVED]

2013-11-16 Thread Chris Angelico

On Sun, Nov 17, 2013 at 8:44 AM, Laszlo Nagy wrote: > >> >> So is the default utf-8 or not? Should the documentation be updated? Or do >> we have a bug in the interactive shell? >> > It was my fault, sorry. The other program used os.system at some places, and > it accidentally used python2 instead

Re: Beginner python 3 unicode question

2013-11-16 Thread Chris Angelico

On Sun, Nov 17, 2013 at 8:19 AM, Laszlo Nagy wrote: > print("digest",digest,type(digest)) > > This function was called inside a script, and gave me this: > > ('digest', '\xa0\x98\x8b\xff\x04\xf9V;\xbd\x1eIHzh\x10-\xc5!\x14\x1b', <type 'str'>) > This looks very much like you're running under Py

Re: Beginner python 3 unicode question [SOLVED]

2013-11-16 Thread Laszlo Nagy

So is the default utf-8 or not? Should the documentation be updated? Or do we have a bug in the interactive shell? It was my fault, sorry. The other program used os.system at some places, and it accidentally used python2 instead of python 3. :-( -- This message has been scanned for viruse

Re: Beginner python 3 unicode question

2013-11-16 Thread Luuk

On 16-11-2013 21:57, Laszlo Nagy wrote: the error is in one of the lines you did not copy here because this works without problems: <> #!/usr/bin/python Most probably, your /usr/bin/python program is python version 2, and not python version 3 Try the same program with /usr/bin/python3.
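The mix-up described here (a script silently run by Python 2 instead of Python 3) can be caught early with a version guard. The following is only a minimal sketch; the helper name require_python3 is made up for illustration and does not come from the thread.

```python
import sys


def require_python3(version_info=sys.version_info):
    """Exit with a clear message when the interpreter is older than Python 3."""
    if version_info[0] < 3:
        raise SystemExit(
            "This script needs Python 3, but is running under %d.%d"
            % (version_info[0], version_info[1])
        )
    return True


# Call the guard first thing in the script, then show which interpreter runs it.
require_python3()
print(sys.executable, sys.version_info[:3])
```

Printing sys.executable is also a quick way to see which binary a shebang line or an os.system call actually resolved to.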

Re: Beginner python 3 unicode question

2013-11-16 Thread Laszlo Nagy

Why is it behaving differently on the command line? What should I do to fix this? I was experimenting with this a bit more and found some more confusing things. Can somebody please enlighten me? Here is a test function: def password_hash(self,password): public = bytearray([rando

Re: Beginner python 3 unicode question

2013-11-16 Thread Laszlo Nagy

the error is in one of the lines you did not copy here because this works without problems: <> #!/usr/bin/python Most probably, your /usr/bin/python program is python version 2, and not python version 3 Try the same program with /usr/bin/python3. And also try the interactive mode with

Re: Beginner python 3 unicode question

2013-11-16 Thread Luuk

On 16-11-2013 20:12, Laszlo Nagy wrote: Example interactive: $ python3 Python 3.3.1 (default, Sep 25 2013, 19:29:01) [GCC 4.7.3] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import uuid >>> import base64 >>> base64.b32encode(uuid.uuid1().bytes)[:-6].lowe

Beginner python 3 unicode question

2013-11-16 Thread Laszlo Nagy

Example interactive: $ python3 Python 3.3.1 (default, Sep 25 2013, 19:29:01) [GCC 4.7.3] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import uuid >>> import base64 >>> base64.b32encode(uuid.uuid1().bytes)[:-6].lower() b'zsz653co6ii6hgjejqhw42ncgy' >>> But w

Re: tkinter unicode question

2010-07-27 Thread Ned Deily

In article <20100727204532.r7gmz.27213.r...@cdptpa-web20-z02>, wrote: > Just curious if anyone could shed some light on this? I'm using > tkinter, but I can't seem to get certain unicode characters to > show in the label for Python 3. > > In my test, the label and button will contain the sa

tkinter unicode question

2010-07-27 Thread jyoung79

Just curious if anyone could shed some light on this? I'm using tkinter, but I can't seem to get certain unicode characters to show in the label for Python 3. In my test, the label and button will contain the same 3 characters - a Greek Alpha, a Greek Omega with a circumflex and soft breath

Re: Another (simple) unicode question

2009-10-29 Thread Scott David Daniels

John Machin wrote: On Oct 29, 10:02 pm, Rustom Mody wrote:... I thought of trying to port it to python3 but it barfs on some unicode related stuff (after running 2to3) which I am unable to wrap my head around. Can anyone direct me to what I should read to try to understand this? to which Jon

Re: Another (simple) unicode question

2009-10-29 Thread Carl Banks

On Oct 29, 4:02 am, Rustom Mody wrote: > Construct http://construct.wikispaces.com/ is a kick-ass binary file > structurer (written by a 21 year old!) > I thought of trying to port it to python3 but it barfs on some unicode > related stuff (after running 2to3) which I am unable to wrap my head > aro

Re: Another (simple) unicode question

2009-10-29 Thread John Machin

On Oct 29, 10:02 pm, Rustom Mody wrote: > Construct http://construct.wikispaces.com/ is a kick-ass binary file > structurer (written by a 21 year old!) > I thought of trying to port it to python3 but it barfs on some unicode > related stuff (after running 2to3) which I am unable to wrap my head > ar

Another (simple) unicode question

2009-10-29 Thread Rustom Mody

Construct http://construct.wikispaces.com/ is a kick-ass binary file structurer (written by a 21 year old!) I thought of trying to port it to python3 but it barfs on some unicode related stuff (after running 2to3) which I am unable to wrap my head around. Can anyone direct me to what I should read

Re: a simple unicode question

2009-10-28 Thread Tim Arnold

"Chris Jones" wrote in message news:mailman.2149.1256707687.2807.python-l...@python.org... > On Tue, Oct 27, 2009 at 06:21:11AM EDT, Lie Ryan wrote: >> Chris Jones wrote: > > [..] > >>> Best part of Unicode is that there are multiple encodings, right? ;-) >> >> No, the best part about Unicode is

Re: a simple unicode question

2009-10-28 Thread Gabriel Genellina

En Wed, 28 Oct 2009 02:28:01 -0300, Chris Jones escribió: On Tue, Oct 27, 2009 at 06:21:11AM EDT, Lie Ryan wrote: Chris Jones wrote: Best part of Unicode is that there are multiple encodings, right? ;-) No, the best part about Unicode is there is no encoding! Unicode does not define any enco

Re: a simple unicode question

2009-10-27 Thread Chris Jones

On Tue, Oct 27, 2009 at 06:21:11AM EDT, Lie Ryan wrote: > Chris Jones wrote: [..] >> Best part of Unicode is that there are multiple encodings, right? ;-) > > No, the best part about Unicode is there is no encoding! > Unicode does not define any encoding; RFC 3629: "ISO/IEC 10646 and Unicode

Re: a simple unicode question

2009-10-27 Thread Lie Ryan

Chris Jones wrote: On Wed, Oct 21, 2009 at 12:35:11PM EDT, Nobody wrote: [..] Characters outside the 16-bit range aren't supported on all builds. They won't be supported on most Windows builds, as Windows uses 16-bit Unicode extensively: I knew nothing about UTF-16 & friends before this thre

Re: a simple unicode question

2009-10-22 Thread Gabriel Genellina

En Thu, 22 Oct 2009 17:08:21 -0300, escribió: On 10/22/2009 03:23 AM, Gabriel Genellina wrote: En Wed, 21 Oct 2009 15:14:32 -0300, escribió: On Oct 21, 4:59 am, Bruno Desthuilliers wrote: beSTEfar a écrit : (snip) > When parsing strings, use Regular Expressions. And now you have _two_ p

Re: a simple unicode question

2009-10-22 Thread rurpy

On 10/22/2009 03:23 AM, Gabriel Genellina wrote: > En Wed, 21 Oct 2009 15:14:32 -0300, escribió: > >> On Oct 21, 4:59 am, Bruno Desthuilliers > 42.desthuilli...@websiteburo.invalid> wrote: >>> beSTEfar a écrit : >>> (snip) >>> > When parsing strings, use Regular Expressions. >>> >>> And now you h

Re: a simple unicode question

2009-10-22 Thread Chris Jones

On Wed, Oct 21, 2009 at 12:35:11PM EDT, Nobody wrote: [..] > Characters outside the 16-bit range aren't supported on all builds. > They won't be supported on most Windows builds, as Windows uses 16-bit > Unicode extensively: I knew nothing about UTF-16 & friends before this thread. Best part of

Re: a simple unicode question

2009-10-22 Thread Gabriel Genellina

En Wed, 21 Oct 2009 15:14:32 -0300, escribió: On Oct 21, 4:59 am, Bruno Desthuilliers wrote: beSTEfar a écrit : (snip) > When parsing strings, use Regular Expressions. And now you have _two_ problems For some simple parsing problems, Python's string methods are powerful enough to make REs

Re: a simple unicode question

2009-10-21 Thread Terry Reedy

Nobody wrote: Just curious, why did you choose to set the upper boundary at 0x? Characters outside the 16-bit range aren't supported on all builds. They won't be supported on most Windows builds, as Windows uses 16-bit Unicode extensively: Python 2.5.1 (r251:54863, Apr 18 2007, 08
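The "16-bit range" limit discussed here applies to narrow builds of older Pythons. As a hedged illustration, assuming Python 3.3 or later (where the narrow/wide distinction no longer exists), a character beyond U+FFFF is a single code point in a str and only becomes a surrogate pair when encoded as UTF-16. The sample character is chosen for illustration.

```python
import sys

# U+1D11E MUSICAL SYMBOL G CLEF lies outside the 16-bit range.
ch = "\U0001D11E"

# On Python 3.3+ every build handles the full range as single code points.
print(sys.maxunicode == 0x10FFFF, len(ch))

# UTF-16 needs two 16-bit units (a surrogate pair) for this character.
utf16 = ch.encode("utf-16-le")
print(len(utf16), utf16.hex())
```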

Re: a simple unicode question

2009-10-21 Thread rurpy

On Oct 21, 4:59 am, Bruno Desthuilliers wrote: > beSTEfar a écrit : > (snip) > > When parsing strings, use Regular Expressions. > > And now you have _two_ problems > > For some simple parsing problems, Python's string methods are powerful > enough to make REs overkill. And for any complex enough

Re: a simple unicode question

2009-10-21 Thread Nobody

On Wed, 21 Oct 2009 05:16:56 -0400, Chris Jones wrote: >> > Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? >> >> You can get them from the unicodedata module, e.g.: >> >> import unicodedata >> for i in xrange(0x1): >>n = unicodedata.name(unichr(i),None) >>

Re: a simple unicode question

2009-10-21 Thread Bruno Desthuilliers

beSTEfar a écrit : (snip) > When parsing strings, use Regular Expressions. And now you have _two_ problems For some simple parsing problems, Python's string methods are powerful enough to make REs overkill. And for any complex enough parsing (any recursive construct for example - think XML, H

Re: a simple unicode question

2009-10-21 Thread Chris Jones

On Wed, Oct 21, 2009 at 12:20:35AM EDT, Nobody wrote: > On Tue, 20 Oct 2009 17:56:21 +, George Trojan wrote: [..] > > Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? > > You can get them from the unicodedata module, e.g.: > > import unicodedata > for i in xrange(0x100

Re: a simple unicode question

2009-10-21 Thread Scott David Daniels

George Trojan wrote: Scott David Daniels wrote: ... And if you are unsure of the name to use: >>> import unicodedata >>> unicodedata.name(u'\xb0') 'DEGREE SIGN' > Thanks for all suggestions. It took me a while to find out how to > configure my keyboard to be able to type the degree sign. I

Re: a simple unicode question

2009-10-20 Thread Mark Tolonen

"George Trojan" wrote in message news:hbktk6$8b...@news.nems.noaa.gov... Thanks for all suggestions. It took me a while to find out how to configure my keyboard to be able to type the degree sign. I prefer to stick with pure ASCII if possible. Where are the literals (i.e. u'\N{DEGREE SIGN}') d

Re: a simple unicode question

2009-10-20 Thread Martin v. Löwis

> Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? I found > http://www.unicode.org/Public/5.1.0/ucd/UnicodeData.txt > Is that the place to look? Correct - you are supposed to fill in a Unicode character name into the \N escape. The specific list of names depends on the version of the UCD
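The relationship between \N escapes and the character database can be checked from Python itself. This is a small sketch in Python 3 syntax (the thread uses Python 2's u'' literals); it relies only on the standard unicodedata module.

```python
import unicodedata

# The \N{...} escape takes a character name from the Unicode Character Database.
degree = "\N{DEGREE SIGN}"
print(degree == "\xb0")

# Going the other way: from a character to its name, and from a name to a character.
name = unicodedata.name(degree)
looked_up = unicodedata.lookup("DEGREE SIGN")
print(name, looked_up == degree)

# The set of valid names depends on the UCD version bundled with this interpreter.
print(unicodedata.unidata_version)
```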

Re: a simple unicode question

2009-10-20 Thread Nobody

On Tue, 20 Oct 2009 17:56:21 +, George Trojan wrote: > Thanks for all suggestions. It took me a while to find out how to > configure my keyboard to be able to type the degree sign. I prefer to > stick with pure ASCII if possible. > Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? I

Re: a simple unicode question

2009-10-20 Thread George Trojan

Thanks for all suggestions. It took me a while to find out how to configure my keyboard to be able to type the degree sign. I prefer to stick with pure ASCII if possible. Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? I found http://www.unicode.org/Public/5.1.0/ucd/UnicodeData.txt Is

Re: a simple unicode question

2009-10-20 Thread Scott David Daniels

Mark Tolonen wrote: Is there a better way of getting the degrees? It seems your string is UTF-8. \xc2\xb0 is UTF-8 for DEGREE SIGN. If you type non-ASCII characters in source code, make sure to declare the encoding the file is *actually* saved in: # coding: utf-8 s = '''48° 13' 16.80" N'
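The symptom in this thread (u'48\xc2\xb0' after decoding) is what happens when UTF-8 bytes are decoded as ISO-8859-1. A rough Python 3 reconstruction, under the assumption that the input bytes really were UTF-8 as suggested above, looks like this; the variable names are illustrative only.

```python
# The coordinate string, written with an ASCII-only escape for the degree sign.
s = "48\N{DEGREE SIGN} 13' 16.80\" N"
utf8_bytes = s.encode("utf-8")

# Decoding with the wrong charset turns the two-byte degree sign into two characters.
wrong = utf8_bytes.decode("iso-8859-1")
print(repr(wrong.split()[0]))

# Decoding with the encoding the data is actually in gives one degree sign,
# which can then be stripped to get the number of degrees.
right = utf8_bytes.decode("utf-8")
degrees = int(right.split()[0].rstrip("\N{DEGREE SIGN}"))
print(degrees)
```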

Re: a simple unicode question

2009-10-19 Thread Mark Tolonen

"George Trojan" wrote in message news:hbidd7$i9...@news.nems.noaa.gov... A trivial one, this is the first time I have to deal with Unicode. I am trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is "iso-8859-1". To get the degrees I did >>> encoding='iso-8859-1' >>> q=s

Re: a simple unicode question

2009-10-19 Thread beSTEfar

On 19 Okt, 21:07, George Trojan wrote: > A trivial one, this is the first time I have to deal with Unicode. I am > trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is > "iso-8859-1". To get the degrees I did > >>> encoding='iso-8859-1' > >>> q=s.decode(encoding) > >>> q.spl

Re: a simple unicode question

2009-10-19 Thread Diez B. Roggisch

George Trojan schrieb: A trivial one, this is the first time I have to deal with Unicode. I am trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is "iso-8859-1". To get the degrees I did >>> encoding='iso-8859-1' >>> q=s.decode(encoding) >>> q.split() [u'48\xc2\xb0', u"13

a simple unicode question

2009-10-19 Thread George Trojan

A trivial one, this is the first time I have to deal with Unicode. I am trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is "iso-8859-1". To get the degrees I did >>> encoding='iso-8859-1' >>> q=s.decode(encoding) >>> q.split() [u'48\xc2\xb0', u"13'", u'16.80"', u'N'] >>> r=

Re: python 3.1 unicode question

2009-09-16 Thread Duncan Booth

jeffunit wrote: >>That looks like a "surrogate escape" (See PEP 383) >>http://www.python.org/dev/peps/pep-0383/. It indicates the wrong >>encoding was used to decode the filename. > > That seems likely. How do I set the encoding to something correct to > decode the filename? > > Clearly win
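To make the PEP 383 behaviour concrete, here is a sketch of how an undecodable filename byte becomes a lone surrogate and how it can still be printed safely. The filename is hypothetical, and the explicit decode stands in for what the interpreter does with filenames on POSIX systems.

```python
# Hypothetical filename whose bytes are Latin-1, not valid UTF-8.
raw_name = b"caf\xe9.txt"

# With the surrogateescape handler the bad byte 0xE9 is smuggled in as U+DCE9.
name = raw_name.decode("utf-8", errors="surrogateescape")
print(ascii(name))

# The original bytes can be recovered exactly with the same handler.
round_trip = name.encode("utf-8", errors="surrogateescape")

# For display, escape the lone surrogate instead of letting print() raise.
printable = name.encode("utf-8", errors="backslashreplace").decode("utf-8")
print(printable)
```

On POSIX, os.fsencode and os.fsdecode apply the same error handler using the filesystem encoding, so they are usually the more direct tools for filenames.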

Re: python 3.1 unicode question

2009-09-15 Thread Chris Rebert

On Tue, Sep 15, 2009 at 9:48 PM, jeffunit wrote: > At 09:25 PM 9/15/2009, Mark Tolonen wrote: >> >> "jeffunit" wrote in message >> news:20090915144123964.ljka6...@cdptpa-omta01.mail.rr.com... >>> >>> I wrote a program that diffs files and prints out matching file names. >>> I will be executing th

Re: python 3.1 unicode question

2009-09-15 Thread jeffunit

At 09:25 PM 9/15/2009, Mark Tolonen wrote: "jeffunit" wrote in message news:20090915144123964.ljka6...@cdptpa-omta01.mail.rr.com... I wrote a program that diffs files and prints out matching file names. I will be executing the output with sh, to delete select files. Most of the files names are

Re: python 3.1 unicode question

2009-09-15 Thread Mark Tolonen

"jeffunit" wrote in message news:20090915144123964.ljka6...@cdptpa-omta01.mail.rr.com... I wrote a program that diffs files and prints out matching file names. I will be executing the output with sh, to delete select files. Most of the files names are plain ascii, but about 10% of them have un

python 3.1 unicode question

2009-09-15 Thread jeffunit

I wrote a program that diffs files and prints out matching file names. I will be executing the output with sh, to delete select files. Most of the files names are plain ascii, but about 10% of them have unicode characters in them. When I try to print the string containing the name, I get an excep

Re: (Simple?) Unicode Question

2009-08-30 Thread Nobody

On Sun, 30 Aug 2009 02:36:49 +, Steven D'Aprano wrote: >>> So long as your terminal has a sensible encoding, and you have a good >>> quality font, you should be able to print any string you can create. >> >> UTF-8 isn't a particularly sensible encoding for terminals. > > Did I mention UTF-8?

Re: (Simple?) Unicode Question

2009-08-29 Thread Steven D'Aprano

On Sat, 29 Aug 2009 20:09:12 +0100, Nobody wrote: > On Sat, 29 Aug 2009 08:26:54 +, Steven D'Aprano wrote: > >> Python only needs to know when you convert the text to or from bytes. I >> can do this: >> > s = "hello" > t = "world" > print(' '.join([s, t])) >> hello world >> >> a

Re: (Simple?) Unicode Question

2009-08-29 Thread Nobody

On Sat, 29 Aug 2009 08:26:54 +, Steven D'Aprano wrote: > Python only needs to know when you convert the text to or from bytes. I > can do this: > s = "hello" t = "world" print(' '.join([s, t])) > hello world > > and not need to care anything about encodings. > > So long as y

Re: (Simple?) Unicode Question

2009-08-29 Thread Steven D'Aprano

On Sat, 29 Aug 2009 09:34:43 +0200, Thorsten Kampe wrote: > * Rami Chowdhury (Thu, 27 Aug 2009 09:44:41 -0700) >> > Further, does anything, except a printing device need to know the >> > encoding of a piece of "text"? > > Python needs to know if you are processing the text. Python only needs to
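The point that an encoding only matters at the boundary between text and bytes can be shown in a few lines. This is an illustrative sketch in Python 3, with a sample word chosen here rather than taken from the thread.

```python
# Pure text operations never involve an encoding.
s = "hello"
t = "world"
joined = " ".join([s, t])
print(joined)

# An encoding is needed only when text becomes bytes...
data = "na\u00efve".encode("utf-8")
print(len(data))

# ...or when bytes become text, where the same bytes can mean different things.
as_utf8 = data.decode("utf-8")
as_latin1 = data.decode("latin-1")
print(ascii(as_utf8), ascii(as_latin1))
```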

Re: (Simple?) Unicode Question

2009-08-29 Thread Thorsten Kampe

* Rami Chowdhury (Thu, 27 Aug 2009 09:44:41 -0700) > > Further, does anything, except a printing device need to know the > > encoding of a piece of "text"? Python needs to know if you are processing the text. > I may be wrong, but I believe that's part of the idea between separation > of strin

Re: (Simple?) Unicode Question

2009-08-27 Thread Albert Hopkins

On Thu, 2009-08-27 at 22:09 +0530, Shashank Singh wrote: > Hi All! > > I have a very simple (and probably stupid) question eluding me. > When exactly is the char-set information needed? > > To make my question clear consider reading a file. > While reading a file, all I get is basically an array

Re: (Simple?) Unicode Question

2009-08-27 Thread Rami Chowdhury

Further, does anything, except a printing device need to know the encoding of a piece of "text"? I may be wrong, but I believe that's part of the idea between separation of string and bytes types in Python 3.x. I believe, if you are using Python 3.x, you don't need the character encoding mum

(Simple?) Unicode Question

2009-08-27 Thread Shashank Singh

Hi All! I have a very simple (and probably stupid) question eluding me. When exactly is the char-set information needed? To make my question clear consider reading a file. While reading a file, all I get is basically an array of bytes. Now suppose a file has 10 bytes in it (all is data, no metad

Re: Unicode question

2006-07-28 Thread Martin v. Löwis

Ben Edwards (lists) wrote: > Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do > sys.setdefaultencoding = 'iso-8859-1' That "works", but has no effect. You bind the variable sys.setdefaultencoding to some value, but that value is never used for anything (do sys.getdefaultenc

Re: Unicode question

2006-07-28 Thread Steve M

Ben Edwards (lists) wrote: > I am using python 2.4 on Ubuntu dapper, I am working through Dive into > Python. > > There are a couple of inconsistencies. > > Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do > sys.setdefaultencoding = 'iso-8859-1' When you run a Python script

Re: Unicode question

2006-07-28 Thread Max Erickson

"Ben Edwards (lists)" <[EMAIL PROTECTED]> wrote: > I am using python 2.4 on Ubuntu dapper, I am working through Dive > into Python. ... > Any insight? > Ben Did you follow all the instructions, or did you try to call sys.setdefaultencoding interactively? See: http://diveintopython.org/xml_pro

Unicode question

2006-07-28 Thread Ben Edwards (lists)

I am using python 2.4 on Ubuntu dapper, I am working through Dive into Python. There are a couple of inconsistencies. Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do sys.setdefaultencoding = 'iso-8859-1' secondly the following does not give a 'UnicodeError: ASCII encodin

[OT] Re: a unicode question?

2006-04-11 Thread Peter Otten

John Machin wrote: > ... and yes Peter, info travels faster also from China that it does > from Armenia :-()) Q: Can info travel faster from Armenia than from China? Radio Yerevan: In principle, yes. Just make sure that it doesn't go the other way round the globe or meets some friends on the way.

Re: a unicode question?

2006-04-10 Thread John Machin

E, it get's worse: not only is the title written in Chinese, it is encoded as gb2312 -- here is the repr() of the first few chunks: "\n\n\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) : \xc4\xd a\xb2\xbf\xc8\xcb\xd4\xb1\xb3\xd6\xb9\xc9 - \xcb\xd1\xba\xfc\xb9\xc9\xc6\xb1\n\n" and here is wha

Re: a unicode question?

2006-04-10 Thread Serge Orlov

[EMAIL PROTECTED] wrote: > Mr. John Machin > > This question come form the flow codes. I use the PyXml to build a DOM > tree. > > from xml.dom.ext.reader import HtmlLib > doc = > HtmlLib.FromHtmlUrl('http://stock.business.sohu.com/q/nbcg.php?code=600028') > title_elem = doc.documentElement.getElem

Re: a unicode question?

2006-04-09 Thread zdwang

Mr. John Machin This question come form the flow codes. I use the PyXml to build a DOM tree. from xml.dom.ext.reader import HtmlLib doc = HtmlLib.FromHtmlUrl('http://stock.business.sohu.com/q/nbcg.php?code=600028') title_elem = doc.documentElement.getElementsByTagName("TITLE")[0] title_string = t

Re: a unicode question?

2006-04-09 Thread zdwang

Mr. John Machin, Thank you very much!

Re: a unicode question?

2006-04-09 Thread John Machin

What do you mean by "ansi string"? Here is a superficially not-unreasonable answer to your more specific question: # >>> s1 = u'\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) ' # >>> s2 = '\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) ' # >>> s3 = s1.encode('latin1') # >>> s2 == s3 # True But what are you
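In Python 3 terms, the Latin-1 trick above turns a str whose code points are really byte values back into bytes. Since a follow-up in this thread reports that the page is encoded as gb2312, a hedged sketch of the whole repair might look like the following; treating the data as gb2312 is taken from that follow-up, not verified here.

```python
# A str whose code points are actually raw byte values (the thread's s1).
s1 = "\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) "

# Latin-1 maps code points 0-255 one-to-one onto bytes, recovering the thread's s2.
s2 = s1.encode("latin-1")
print(s2)

# Assumption: the bytes are gb2312, as reported later in the thread.
title = s2.decode("gb2312")
print(len(title), title)
```

The eight high bytes collapse into four Chinese characters, which is why the decoded title is shorter than the byte string.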

a unicode question?

2006-04-09 Thread zdwang

Hello, There is a unicode string, I want to change it to ansi string. but it raise an exception. Could you help me? ## I want to change s1 to s2. s1 = u'\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) ' s2 = '\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) ' -- http://mail.python.org/mailman

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread Ben Finney

"Ian Sparks" <[EMAIL PROTECTED]> writes: > This is probably stupid and/or misguided but supposing I'm passed a > byte-string value that I want to be unicode, this is what I do. I'm > sure I'm missing something very important. Perhaps you need to read one of the good Python Unicode tutorials, such

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread Kent Johnson

ianaré wrote: > maybe a bit off topic, but how does one find the console's encoding > from within python? > In [1]: import sys In [3]: sys.stdout.encoding Out[3]: 'cp437' In [4]: sys.stdin.encoding Out[4]: 'cp437' Kent
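The same check can be wrapped in a small script. This is a sketch only: the helper name stream_encodings is invented here, and the values printed depend entirely on the terminal and platform.

```python
import locale
import sys


def stream_encodings():
    """Return the encoding reported by each standard stream (None if it has none)."""
    streams = {"stdin": sys.stdin, "stdout": sys.stdout, "stderr": sys.stderr}
    return {name: getattr(stream, "encoding", None) for name, stream in streams.items()}


encodings = stream_encodings()
for name, encoding in encodings.items():
    print(name, encoding)

# The locale's preferred encoding is what open() falls back to for text files.
preferred = locale.getpreferredencoding(False)
print("preferred", preferred)
```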

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread John Machin

The most important thing that you are missing is that you need to know the encoding used for the 8-bit-character string. Let's guess that it's Latin1. Then all you have to do is use the unicode() builtin function, or the string decode method. # >>> s = 'Jos\xe9' # >>> s # 'Jos\xe9' # >>> u = unico

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread ianaré

maybe a bit off topic, but how does one find the console's encoding from within python?

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread aurora

First of all, if you run this on the console, find out your console's encoding. In my case it is English Windows XP. It uses 'cp437'. C:\>chcp Active code page: 437 Then >>> s = "José" >>> u = u"Jos\u00e9" # same thing in unicode escape >>> s.decode('cp437') == u # use encoding that

Unicode question : turn "José" into u"José"

2006-04-05 Thread Ian Sparks

This is probably stupid and/or misguided but supposing I'm passed a byte-string value that I want to be unicode, this is what I do. I'm sure I'm missing something very important. Short version : >>> s = "José" #Start with non-unicode string >>> unicoded = eval("u'%s'" % "José") Long version :

Re: unicode question

2006-03-01 Thread Walter Dörwald

Edward Loper wrote: > Walter Dörwald wrote: >> Edward Loper wrote: >> >>> [...] >>> Surely there's a better way than converting back and forth 3 times? Is >>> there a reason that the 'backslashreplace' error mode can't be used >>> with codecs.decode? >>> >>> >>> 'abc \xff\xe8 def'.decode('ascii

Re: unicode question

2006-02-27 Thread Edward Loper

Walter Dörwald wrote: > Edward Loper wrote: > >> [...] >> Surely there's a better way than converting back and forth 3 times? Is >> there a reason that the 'backslashreplace' error mode can't be used >> with codecs.decode? >> >> >>> 'abc \xff\xe8 def'.decode('ascii', 'backslashreplace') >> Trac

Re: unicode question

2006-02-27 Thread Walter Dörwald

Edward Loper wrote: > [...] > Surely there's a better way than converting back and forth 3 times? Is > there a reason that the 'backslashreplace' error mode can't be used with > codecs.decode? > > >>> 'abc \xff\xe8 def'.decode('ascii', 'backslashreplace') > Traceback (most recent call last): >

Re: unicode question

2006-02-25 Thread Kent Johnson

Edward Loper wrote: > I would like to convert an 8-bit string (i.e., a str) into unicode, > treating chars \x00-\x7f as ascii, and converting any chars \x80-xff > into a backslashed escape sequences. I.e., I want something like this: > > >>> decode_with_backslashreplace('abc \xff\xe8 def') > u'a

Re: unicode question

2006-02-25 Thread Tim Roberts

Edward Loper <[EMAIL PROTECTED]> wrote: >I would like to convert an 8-bit string (i.e., a str) into unicode, >treating chars \x00-\x7f as ascii, and converting any chars \x80-xff >into a backslashed escape sequences. I.e., I want something like this: > > >>> decode_with_backslashreplace('abc \xff

unicode question

2006-02-24 Thread Edward Loper

I would like to convert an 8-bit string (i.e., a str) into unicode, treating chars \x00-\x7f as ascii, and converting any chars \x80-xff into a backslashed escape sequences. I.e., I want something like this: >>> decode_with_backslashreplace('abc \xff\xe8 def') u'abc \\xff\\xe8 def' The best I c
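For what it is worth, later Python versions make this a one-liner: since Python 3.5 the 'backslashreplace' error handler also works when decoding. The sketch below reuses the function name from the question; it is a Python 3 illustration, not the Python 2 solution discussed in the thread.

```python
def decode_with_backslashreplace(data: bytes) -> str:
    """Decode as ASCII, turning each non-ASCII byte into a \\xNN escape sequence."""
    # Requires Python 3.5+, where backslashreplace is accepted for decoding.
    return data.decode("ascii", errors="backslashreplace")


result = decode_with_backslashreplace(b"abc \xff\xe8 def")
print(result)
```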

Re: Unicode Question

2006-01-09 Thread David Pratt

Hi Max. Many thanks for helping to realize where I was missing the point and making this clearer. Regards, David Max Erickson wrote: > The encoding argument to unicode() is used to specify the encoding of the > string that you want to translate into unicode. The interpreter stores > unicode as

Re: Unicode Question

2006-01-09 Thread David Pratt

Hi Erik. Thank you for your reply. The advice has helped clarify this for me. Regards, David Erik Max Francis wrote: > David Pratt wrote: > > >>This is not working for me. Can someone explain why. Many thanks. > > > Because '\xbe' isn't UTF-8 for the character you want, '\xc2\xbe' is, as

Re: Unicode Question

2006-01-09 Thread David Pratt

Hi Martin. Many thanks for your reply. What I am really after, the following accomplishes. > > If you are looking for "at the same time", perhaps this is also > interesting: > > py> unicode('\xbe', 'windows-1252').encode('utf-8') > '\xc2\xbe' > Your answer really helped quite a bit to clarify t

Re: Unicode Question

2006-01-09 Thread Max Erickson

The encoding argument to unicode() is used to specify the encoding of the string that you want to translate into unicode. The interpreter stores unicode as unicode, it isn't encoded... >>> unicode('\xbe','cp1252') u'\xbe' >>> unicode('\xbe','cp1252').encode('utf-8') '\xc2\xbe' >>> max -- ht
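For readers on Python 3, the same two steps (bytes in a known source encoding to text, then text to UTF-8 bytes) can be sketched as follows. This is an illustrative translation, not code from the thread, and the variable names are made up.

```python
# One byte as it would arrive from a cp1252 (Windows) source: the 3/4 symbol.
raw = b"\xbe"

# Step 1: decode with the encoding the bytes are actually in.
text = raw.decode("cp1252")
print(ascii(text))

# Step 2: encode the text as UTF-8 for storage.
utf8 = text.encode("utf-8")
print(utf8)
```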

Unicode Question

2006-01-09 Thread David Pratt

Hi. I am working through some tutorials on unicode and am hoping that someone can help explain this for me. I am on mac platform using python 2.4.1 at the moment. I am experimenting with unicode with the 3/4 symbol. I want to prepare strings for db storage that come from normal Windows machin

Re: Unicode Question

2006-01-09 Thread Martin v. Löwis

David Pratt wrote: > I want to prepare strings for db storage that come from normal Windows > machine (cp1252) so my understanding is to unicode and encode to utf-8 > and to store properly. That also depends on the database. The database must accept UTF-8-encoded strings, and must not modify them

Re: Unicode Question

2006-01-09 Thread Erik Max Francis

David Pratt wrote: > This is not working for me. Can someone explain why. Many thanks. Because '\xbe' isn't UTF-8 for the character you want, '\xc2\xbe' is, as you just showed yourself in the code snippet. -- Erik Max Francis && [EMAIL PROTECTED] && http://www.alcyone.com/max/ San Jose, CA, US

Re: Once again a unicode question

2005-03-26 Thread Nicolas Evrard

* Serge Orlov [23:45 26/03/05 CET]: Nicolas Evrard wrote: Hello, I'm puzzled by this test I made while trying to transform a page in html to plain text. Because I cannot send unicode to feed, nor str so how can I do this ? Seems like the parser is in the broken state after the first exception. Fe

Re: Once again a unicode question

2005-03-26 Thread Serge Orlov

Nicolas Evrard wrote: > Hello, > > I'm puzzled by this test I made while trying to transform a page in > html to plain text. Because I cannot send unicode to feed, nor str so > how can I do this ? Seems like the parser is in the broken state after the first exception. Feed only binary strings to i

Once again a unicode question

2005-03-26 Thread Nicolas Evrard

Hello, I'm puzzled by this test I made while trying to transform a page in html to plain text. Because I cannot send unicode to feed, nor str so how can I do this ? [EMAIL PROTECTED]:~$ python2.4 .Python 2.4.1c2 (#2, Mar 19 2005, 01:04:19) .[GCC 3.3.5 (Debian 1:3.3.5-12)] on linux2 .Type "help", "

Re: unicode question

2004-11-29 Thread Bengt Richter

On Tue, 23 Nov 2004 20:37:04 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= <[EMAIL PROTECTED]> wrote: >Steve Holden wrote: >> Am I the only person who found it scary that Bengt could apparently >> casually drop on a polynomial the would decode to " Löwis"? Well, don't give me too much credit

93 matches
