[issue465502] urllib2: urlopen unicode problem

2022-04-10 Thread admin
Change by admin : -- github: None -> 35241 ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue418173] Unicode problem in Tkinter under Windows

2022-04-10 Thread admin
Change by admin: github: None -> 34398

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread Marko Rauhamaa
BartC : > Usually anything that is defined can be changed at run-time so that the > compiler can never assume anything. The compiler can't assume anything permanent, but it could heuristically make excellent guesses at runtime. It needs to verify its guesses at the boundaries of

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread BartC
On 04/07/2016 15:46, Ned Batchelder wrote: On Monday, July 4, 2016 at 10:36:54 AM UTC-4, BartC wrote: On 04/07/2016 13:47, Ned Batchelder wrote: This is a huge change. I've used a kind of 'weak' import scheme elsewhere, corresponding to C's '#include'. I think that could work in Python

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread Ned Batchelder
On Monday, July 4, 2016 at 10:36:54 AM UTC-4, BartC wrote: > On 04/07/2016 13:47, Ned Batchelder wrote: > > On Monday, July 4, 2016 at 6:05:20 AM UTC-4, BartC wrote: > >> On 04/07/2016 03:30, Steven D'Aprano wrote: > > >>> You're still having problems with the whole Python-as-a-dynamic-language >

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread BartC
On 04/07/2016 13:47, Ned Batchelder wrote: On Monday, July 4, 2016 at 6:05:20 AM UTC-4, BartC wrote: On 04/07/2016 03:30, Steven D'Aprano wrote: You're still having problems with the whole Python-as-a-dynamic-language thing, aren't you? :-) Most Pythons seem to pre-compile code before

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread Ned Batchelder
On Monday, July 4, 2016 at 6:05:20 AM UTC-4, BartC wrote: > On 04/07/2016 03:30, Steven D'Aprano wrote: > > On Mon, 4 Jul 2016 10:17 am, BartC wrote: > > > >> On 04/07/2016 01:00, Lawrence D’Oliveiro wrote: > >>> On Monday, July 4, 2016 at 11:47:26 AM UTC+12, eryk sun wrote: > Python lacks a

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread Rustom Mody
On Monday, July 4, 2016 at 3:56:43 PM UTC+5:30, BartC wrote: > On 04/07/2016 02:15, Lawrence D’Oliveiro wrote: > > On Monday, July 4, 2016 at 12:40:14 PM UTC+12, BartC wrote: > >> The structure of such a parser doesn't need to exactly match the grammar > >> with a dedicated block of code for each

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread Jussi Piitulainen
BartC writes: > A simpler approach is to treat user-defined operators as aliases for > functions: > > def myadd(a,b): > return a+b > > operator ∇: >(myadd,2,+3) # map to myadd, 2 operands, prio 3, LTR > > x = y ∇ z > > is then equivalent to: > > x = myadd(y,z) > > However you will
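The idea quoted above (an operator symbol is just an alias for a function, with its priority stored as data) can be sketched with a small table-driven precedence-climbing evaluator. This is only an illustration under stated assumptions: `OPERATORS`, `define_operator`, `tokenize` and `parse` are hypothetical names, operands are limited to non-negative integers, every operator is binary and left-to-right, and none of this is a Python feature.

```python
import operator
import re

# Hypothetical operator table: symbol -> (function, priority).
# Every operator here is binary and left-associative.
OPERATORS = {
    "+": (operator.add, 1),
    "*": (operator.mul, 2),
}


def myadd(a, b):
    return a + b


def define_operator(symbol, func, priority):
    """Register a user-defined binary operator as an alias for func."""
    OPERATORS[symbol] = (func, priority)


def tokenize(text):
    # Integers, or any single non-space character (treated as an operator).
    return re.findall(r"\d+|\S", text)


def parse(tokens, min_priority=1):
    """Evaluate tokens by precedence climbing, consuming them from the front."""
    left = int(tokens.pop(0))
    while tokens and tokens[0] in OPERATORS:
        func, priority = OPERATORS[tokens[0]]
        if priority < min_priority:
            break
        tokens.pop(0)
        # Left-to-right: the right operand may only absorb tighter operators.
        right = parse(tokens, priority + 1)
        left = func(left, right)
    return left


define_operator("∇", myadd, 3)             # map ∇ to myadd, priority 3
print(parse(tokenize("2 * 3 ∇ 4")))        # ∇ binds tighter than *: prints 14
```

Because the priority is an attribute in the table, adding an operator does not require a new block of parser code, which is the point made elsewhere in the thread about table-driven parsers.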

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread BartC
On 04/07/2016 02:15, Lawrence D’Oliveiro wrote: On Monday, July 4, 2016 at 12:40:14 PM UTC+12, BartC wrote: The structure of such a parser doesn't need to exactly match the grammar with a dedicated block of code for each operator precedence. It can be table-driven so that an operator precedence

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread Jussi Piitulainen
Lawrence D’Oliveiro writes: > On Monday, July 4, 2016 at 6:08:51 PM UTC+12, Jussi Piitulainen wrote: >> Something could be done, but if the intention is to allow >> mathematical notation, it needs to be done with care. > > Mathematics uses single-character variable names so that > multiplication

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread BartC
On 04/07/2016 03:30, Steven D'Aprano wrote: On Mon, 4 Jul 2016 10:17 am, BartC wrote: On 04/07/2016 01:00, Lawrence D’Oliveiro wrote: On Monday, July 4, 2016 at 11:47:26 AM UTC+12, eryk sun wrote: Python lacks a mechanism to add user-defined operators. (R has this capability.) Maybe this

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread Marko Rauhamaa
Lawrence D’Oliveiro : > Mathematics uses single-character variable names so that > multiplication can be implicit. I don't think anybody developed mathematical notation systematically. Rather, over the centuries, various masters came up with personal abbreviations and

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread Lawrence D’Oliveiro
On Monday, July 4, 2016 at 6:08:51 PM UTC+12, Jussi Piitulainen wrote: > Something could be done, but if the intention is to allow > mathematical notation, it needs to be done with care. Mathematics uses single-character variable names so that multiplication can be implicit. An old, stillborn

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-04 Thread Jussi Piitulainen
Rustom Mody writes: > Subscripts OTOH as part of identifier-lexemes doesn't seem to have any > issues They have the general issue that one might *want* them interpreted as indexes, so that a₁ would mean the same as a[1]. Mathematical symbols face similar issues. One would not *want* them all be

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Rustom Mody
On Monday, July 4, 2016 at 8:03:47 AM UTC+5:30, Steven D'Aprano wrote: > On Mon, 4 Jul 2016 07:28 am, Lawrence D’Oliveiro wrote: > > > On Monday, July 4, 2016 at 6:39:45 AM UTC+12, John Ladasky wrote: > >> Here's another worm for the can. Would you rather read this... > >> > >> d = sqrt(x**2 +

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Steven D'Aprano
On Mon, 4 Jul 2016 07:28 am, Lawrence D’Oliveiro wrote: > On Monday, July 4, 2016 at 6:39:45 AM UTC+12, John Ladasky wrote: >> Here's another worm for the can. Would you rather read this... >> >> d = sqrt(x**2 + y**2) >> >> ...or this? >> >> d = √(x² + y²) > > Neither. I would rather see >

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Steven D'Aprano
On Mon, 4 Jul 2016 10:17 am, BartC wrote: > On 04/07/2016 01:00, Lawrence D’Oliveiro wrote: >> On Monday, July 4, 2016 at 11:47:26 AM UTC+12, eryk sun wrote: >>> Python lacks a mechanism to add user-defined operators. (R has this >>> capability.) Maybe this feature could be added. >> >> That

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Random832
On Sun, Jul 3, 2016, at 21:15, Lawrence D’Oliveiro wrote: > On Monday, July 4, 2016 at 12:40:14 PM UTC+12, BartC wrote: > > The structure of such a parser doesn't need to exactly match the grammar > > with a dedicated block of code for each operator precedence. It can be > > table-driven so that

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Random832
On Sun, Jul 3, 2016, at 20:00, Lawrence D’Oliveiro wrote: > That would be neat. But remember, you would have to define the operator > precedence as well. So you could no longer use a recursive-descent > parser. You could use a recursive-descent parser if you monkey-patch the parser when adding a

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Lawrence D’Oliveiro
On Monday, July 4, 2016 at 12:40:14 PM UTC+12, BartC wrote: > The structure of such a parser doesn't need to exactly match the grammar > with a dedicated block of code for each operator precedence. It can be > table-driven so that an operator precedence value is just an attribute. Of course.

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread BartC
On 04/07/2016 01:24, Lawrence D’Oliveiro wrote: On Monday, July 4, 2016 at 12:17:47 PM UTC+12, BartC wrote: On 04/07/2016 01:00, Lawrence D’Oliveiro wrote: On Monday, July 4, 2016 at 11:47:26 AM UTC+12, eryk sun wrote: Python lacks a mechanism to add user-defined operators. (R has this

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Lawrence D’Oliveiro
On Monday, July 4, 2016 at 12:17:47 PM UTC+12, BartC wrote: > > On 04/07/2016 01:00, Lawrence D’Oliveiro wrote: >> >> On Monday, July 4, 2016 at 11:47:26 AM UTC+12, eryk sun wrote: >>> >>> Python lacks a mechanism to add user-defined operators. (R has this >>> capability.) Maybe this feature could

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread BartC
On 04/07/2016 01:00, Lawrence D’Oliveiro wrote: On Monday, July 4, 2016 at 11:47:26 AM UTC+12, eryk sun wrote: Python lacks a mechanism to add user-defined operators. (R has this capability.) Maybe this feature could be added. That would be neat. But remember, you would have to define the

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Lawrence D’Oliveiro
On Monday, July 4, 2016 at 11:47:26 AM UTC+12, eryk sun wrote: > Python lacks a mechanism to add user-defined operators. (R has this > capability.) Maybe this feature could be added. That would be neat. But remember, you would have to define the operator precedence as well. So you could no

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread eryk sun
On Sun, Jul 3, 2016 at 6:58 AM, John Ladasky wrote: > The nabla symbol (∇) is used in the naming of gradients. Python isn't having > it. > The interpreter throws a "SyntaxError: invalid character in identifier" when > it > encounters the ∇. Del is a mathematical

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Lawrence D’Oliveiro
On Monday, July 4, 2016 at 6:39:45 AM UTC+12, John Ladasky wrote: > Here's another worm for the can. Would you rather read this... > > d = sqrt(x**2 + y**2) > > ...or this? > > d = √(x² + y²) Neither. I would rather see d = math.hypot(x, y) Much simpler, don’t you think? --
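For comparison, the two spellings that run in today's Python can be put side by side. The values of `x` and `y` below are arbitrary illustrative inputs, not taken from the thread.

```python
import math

x, y = 3.0, 4.0  # arbitrary example values

d_explicit = math.sqrt(x**2 + y**2)  # the spelling from the original question
d_hypot = math.hypot(x, y)           # the spelling suggested in the reply

print(d_explicit, d_hypot)  # prints: 5.0 5.0
```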

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Marko Rauhamaa
Random832 : > Being able to put any character in a symbol doesn't make those strings > identifiers, any more than passing them to getattr/setattr (or > __import__, something's __name__, etc) does in Python. From R7RS, the newest Scheme standard (p. 61-62): 7.1.1.

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Lawrence D’Oliveiro
On Sunday, July 3, 2016 at 11:50:52 PM UTC+12, BartC wrote: > Otherwise you can be looking at: > >a b c d e f g h > > (not Scheme) and wondering which are names and which are operators. I did a language design for my MSc thesis where all “functions” were operators. So a construct like

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Lawrence D’Oliveiro
On Sunday, July 3, 2016 at 9:02:05 PM UTC+12, Marko Rauhamaa wrote: > Lawrence D’Oliveiro: > >> On Sunday, July 3, 2016 at 7:27:04 PM UTC+12, Marko Rauhamaa wrote: >> >>> Personally, I don't think even π should be used in identifiers. >> > > Why not? > > 1. It can't be typed easily. I have a

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Random832
On Sun, Jul 3, 2016, at 07:22, Marko Rauhamaa wrote: > Christian Gollwitzer : > > Am 03.07.16 um 13:01 schrieb Marko Rauhamaa: > >> Scheme allows *any* characters whatsoever in identifiers. > > > > Parentheses? > > Yes. > > Hint: Python allows *any* characters whatsoever in

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread MRAB
On 2016-07-03 19:39, John Ladasky wrote: On Sunday, July 3, 2016 at 12:42:14 AM UTC-7, Chris Angelico wrote: On Sun, Jul 3, 2016 at 4:58 PM, John Ladasky wrote: Very good question! The detaily answer is here: https://docs.python.org/3/reference/lexical_analysis.html#identifiers > A

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread John Ladasky
On Sunday, July 3, 2016 at 12:42:14 AM UTC-7, Chris Angelico wrote: > On Sun, Jul 3, 2016 at 4:58 PM, John Ladasky wrote: > Very good question! The detaily answer is here: > > https://docs.python.org/3/reference/lexical_analysis.html#identifiers > > > A philosophical question. Why should any

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread John Ladasky
Lawrence, I trust you understand that I didn't post a complete working program, just a few lines showing the intended usage? -- https://mail.python.org/mailman/listinfo/python-list

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Chris Angelico
On Sun, Jul 3, 2016 at 7:01 PM, Marko Rauhamaa wrote: > Lawrence D’Oliveiro : > >> On Sunday, July 3, 2016 at 7:27:04 PM UTC+12, Marko Rauhamaa wrote: >> >>> Personally, I don't think even π should be used in identifiers. >> >> Why not? > > 1. It can't be

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Marko Rauhamaa
Christian Gollwitzer : > Am 03.07.16 um 13:22 schrieb Marko Rauhamaa: >> Christian Gollwitzer : >>> Am 03.07.16 um 13:01 schrieb Marko Rauhamaa: Scheme allows *any* characters whatsoever in identifiers. >>> Parentheses? >> Yes. > > My knowledge of Scheme is

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Christian Gollwitzer
Am 03.07.16 um 13:22 schrieb Marko Rauhamaa: Christian Gollwitzer : Am 03.07.16 um 13:01 schrieb Marko Rauhamaa: Alain Ketterlin : It would be very confusing to have a variable named ∇f, as confusing as naming a variable a+b or √x.

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread BartC
On 03/07/2016 12:01, Marko Rauhamaa wrote: Alain Ketterlin : It would be very confusing to have a variable named ∇f, as confusing as naming a variable a+b or √x. Scheme allows *any* characters whatsoever in identifiers. I think it's one of those

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Marko Rauhamaa
Christian Gollwitzer : > Am 03.07.16 um 13:01 schrieb Marko Rauhamaa: >> Alain Ketterlin : >> >>> It would be very confusing to have a variable named ∇f, as confusing >>> as naming a variable a+b or √x. >> >> Scheme allows *any*

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Christian Gollwitzer
Am 03.07.16 um 13:01 schrieb Marko Rauhamaa: Alain Ketterlin : It would be very confusing to have a variable named ∇f, as confusing as naming a variable a+b or √x. Scheme allows *any* characters whatsoever in identifiers. Parentheses?

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Marko Rauhamaa
Alain Ketterlin : > It would be very confusing to have a variable named ∇f, as confusing > as naming a variable a+b or √x. Scheme allows *any* characters whatsoever in identifiers. Marko -- https://mail.python.org/mailman/listinfo/python-list

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Alain Ketterlin
John Ladasky writes: > from math import pi as π > [...] > c = 2 * π * r > Up until today, every character I've tried has been accepted by the > Python interpreter as a legitimate character for inclusion in a > variable name. Now I'm copying a formula which defines a

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Robert Kern
On 2016-07-03 08:29, Jussi Piitulainen wrote: (Hm. Python seems to understand that the character occurs in what is intended to be an identifier. Perhaps that's a default error message.) I suspect that "identifier" is the final catch-all token in the lexer. Comments and strings are clearly

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Marko Rauhamaa
Lawrence D’Oliveiro : > On Sunday, July 3, 2016 at 7:27:04 PM UTC+12, Marko Rauhamaa wrote: > >> Personally, I don't think even π should be used in identifiers. > > Why not? 1. It can't be typed easily. 2. It can look like an n. 3. Single-character identifiers should

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Lawrence D’Oliveiro
On Sunday, July 3, 2016 at 7:27:04 PM UTC+12, Marko Rauhamaa wrote: > Personally, I don't think even π should be used in identifiers. Why not? Python already has all the other single-character constants in what is probably the most fundamental identity in all of mathematics: $$e^{i \pi} + 1 =

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Chris Angelico
On Sun, Jul 3, 2016 at 4:58 PM, John Ladasky wrote: > Up until today, every character I've tried has been accepted by the Python > interpreter as a legitimate character for inclusion in a variable name. Now > I'm copying a formula which defines a gradient. The

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Rustom Mody
On Sunday, July 3, 2016 at 12:29:14 PM UTC+5:30, John Ladasky wrote: > A while back, I shared my love for using Greek letters as variable names in > my Python (3.4) code -- when, and only when, they are warranted for improved > readability. For example, I like to see the following: > > > from

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Marko Rauhamaa
Lawrence D’Oliveiro : > It wasn’t the “π” it was complaining about... The question is why π is accepted but ∇ is not. The immediate reason is that π is a letter while ∇ is not. But the question, then, is why bother excluding nonletters from identifiers. Personally, I
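The letter/non-letter distinction can be checked from Python itself. The sketch below uses only `str.isidentifier()` and `unicodedata.category()` from the standard library; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(ch):
    """Return (Unicode general category, whether ch alone is a valid identifier)."""
    return unicodedata.category(ch), ch.isidentifier()


print(describe("π"))  # ('Ll', True): a lowercase letter
print(describe("∇"))  # ('Sm', False): a math symbol
```

`Ll` is one of the letter categories allowed at the start of an identifier, while `Sm` (Symbol, math) is not allowed anywhere in one, so a name such as `∇f` is rejected.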

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Jussi Piitulainen
John Ladasky writes: [- -] > The nabla symbol (∇) is used in the naming of gradients. Python isn't > having it. The interpreter throws a "SyntaxError: invalid character > in identifier" when it encounters the ∇. > > I am now wondering what constitutes a valid character for an > identifier, and

Re: Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread Lawrence D’Oliveiro
On Sunday, July 3, 2016 at 6:59:14 PM UTC+12, John Ladasky wrote: > from math import pi as π > > c = 2 * π * r ldo@theon:~> python3 Python 3.5.1+ (default, Jun 10 2016, 09:03:40) [GCC 5.4.0 20160603] on linux Type "help", "copyright", "credits" or "license" for more information.

Well, I finally ran into a Python Unicode problem, sort of

2016-07-03 Thread John Ladasky
A while back, I shared my love for using Greek letters as variable names in my Python (3.4) code -- when, and only when, they are warranted for improved readability. For example, I like to see the following: from math import pi as π c = 2 * π * r When I am copying mathematical formulas

How to work around a unicode problem?

2012-01-24 Thread tinnews
I have a small python program that uses the pyexiv2 package to view exif data in image files. I've hit a problem because I have a filename with accented characters in its path and the pyexiv2 code traps as follows:- Traceback (most recent call last): File "/home/chris/bin/eview.py", line

Re: How to work around a unicode problem?

2012-01-24 Thread Chris Rebert
On Tue, Jan 24, 2012 at 3:57 AM, tinn...@isbd.co.uk wrote: I have a small python program that uses the pyexiv2 package to view exif data in image files. I've hit a problem because I have a filename with accented characters in its path and the pyexiv2 code traps as follows:-    Traceback

Re: How to work around a unicode problem?

2012-01-24 Thread Peter Otten
tinn...@isbd.co.uk wrote: I have a small python program that uses the pyexiv2 package to view exif data in image files. I've hit a problem because I have a filename with accented characters in its path and the pyexiv2 code traps as follows:- Traceback (most recent call last):

Re: How to work around a unicode problem?

2012-01-24 Thread tinnews
Peter Otten __pete...@web.de wrote: tinn...@isbd.co.uk wrote: I have a small python program that uses the pyexiv2 package to view exif data in image files. I've hit a problem because I have a filename with accented characters in its path and the pyexiv2 code traps as follows:-

Re: How to work around a unicode problem?

2012-01-24 Thread tinnews
Chris Rebert c...@rebertia.com wrote: On Tue, Jan 24, 2012 at 3:57 AM, tinn...@isbd.co.uk wrote: I have a small python program that uses the pyexiv2 package to view exif data in image files. I've hit a problem because I have a filename with accented characters in its path and the

unicode problem?

2010-10-09 Thread Brian Blais
This may be stemming from my complete ignorance of unicode, but when I do this (Python 2.6): s='\xc2\xa9 2008 \r\n' and I want the ascii version of it, ignoring any non-ascii chars, I thought I could do: s.encode('ascii','ignore') but it gives the error: In [20]:s.encode('ascii','ignore')
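The usual way out is to decode the bytes to text first and only then encode to ASCII. A minimal sketch in Python 3 syntax, assuming the bytes are UTF-8 (in which `\xc2\xa9` is the copyright sign); in Python 2.6 the same two steps apply to a `str` holding those bytes.

```python
raw = b"\xc2\xa9 2008 \r\n"          # bytes; assumed to be UTF-8

text = raw.decode("utf-8")           # bytes -> text: "© 2008 \r\n"
ascii_only = text.encode("ascii", "ignore")  # text -> ASCII bytes, dropping ©

print(ascii_only)  # prints: b' 2008 \r\n'
```

In Python 2, calling `.encode()` directly on a byte string makes the interpreter first decode it implicitly with the ASCII codec, which is where the error comes from.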

Re: unicode problem?

2010-10-09 Thread Benjamin Kaplan
On Sat, Oct 9, 2010 at 7:59 PM, Brian Blais bbl...@bryant.edu wrote: This may be stemming from my complete ignorance of unicode, but when I do this (Python 2.6): s='\xc2\xa9 2008 \r\n' and I want the ascii version of it, ignoring any non-ascii chars, I thought I could do:

Re: unicode problem?

2010-10-09 Thread Chris Rebert
On Sat, Oct 9, 2010 at 4:59 PM, Brian Blais bbl...@bryant.edu wrote: This may be stemming from my complete ignorance of unicode, but when I do this (Python 2.6): s='\xc2\xa9 2008 \r\n' and I want the ascii version of it, ignoring any non-ascii chars, I thought I could do:

Re: Re: unicode problem?

2010-10-09 Thread hidura
I had a similar problem: I can't encode the bytes of an uploaded file without damaging the data. If I use utf-8 to encode the file, its size doubles; I tried changing the codec to raw_unicode_escape, which nearly gives me the correct size but still damages the file. I used

Re: Unicode problem in ucs4

2009-03-25 Thread abhi
On Mar 24, 4:55 am, Martin v. Löwis mar...@v.loewis.de wrote: So, both Py_UNICODE and wchar_t are 4 bytes and since it contains 3 \0s after a char, printf or wprintf is only printing one letter. No. printf indeed will see a terminating character. However, wprintf should correctly know that

Re: Unicode problem in ucs4

2009-03-23 Thread abhi
On Mar 20, 5:47 pm, M.-A. Lemburg m...@egenix.com wrote: On 2009-03-20 12:13, abhi wrote: On Mar 20, 11:03 am, Martin v. Löwis mar...@v.loewis.de wrote: Any idea on why this is happening? Can you provide a complete example? Your code looks correct, and should just work. How do you

Re: Unicode problem in ucs4

2009-03-23 Thread John Machin
On Mar 23, 6:18 pm, abhi abhigyan_agra...@in.ibm.com wrote: [snip] Hi Mark, Thanks for the help. I tried PyUnicode_AsWideChar() but I am getting the same result i.e. only the first letter. sample code: #include <Python.h> static PyObject *unicode_helper(PyObject *self,PyObject *args){

Re: Unicode problem in ucs4

2009-03-23 Thread John Machin
On Mar 23, 6:41 pm, John Machin sjmac...@lexicon.net had a severe attack of backslashitis: [presuming littleendian] The ucs4 string will look like \t\0\0\0e \0\0\0s\0\0\0t\0\0\0 in memory. I suspect that your wprintf is grokking only 16-bit doodads -- \t\0 is printed and then \0\0 is
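The memory layout described here can be reproduced from Python by encoding the string as little-endian UTF-32. This is a sketch of the byte layout only; it does not call the C API discussed in the thread, and the variable names are illustrative.

```python
text = "test"

# Four bytes per character, little-endian, no byte-order mark.
ucs4 = text.encode("utf-32-le")
print(ucs4)  # prints: b't\x00\x00\x00e\x00\x00\x00s\x00\x00\x00t\x00\x00\x00'

# A byte-oriented routine that stops at the first NUL sees only one letter.
first_nul = ucs4.index(b"\x00")
print(ucs4[:first_nul])  # prints: b't'
```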

Re: Unicode problem in ucs4

2009-03-23 Thread M.-A. Lemburg
On 2009-03-23 08:18, abhi wrote: On Mar 20, 5:47 pm, M.-A. Lemburg m...@egenix.com wrote: unicodeTest.c #include <Python.h> static PyObject *unicode_helper(PyObject *self,PyObject *args){ PyObject *sampleObj = NULL; Py_UNICODE *sample = NULL; if (!PyArg_ParseTuple(args, "O",

Re: Unicode problem in ucs4

2009-03-23 Thread abhi
On Mar 23, 3:04 pm, M.-A. Lemburg m...@egenix.com wrote: On 2009-03-23 08:18, abhi wrote: On Mar 20, 5:47 pm, M.-A. Lemburg m...@egenix.com wrote: unicodeTest.c #include <Python.h> static PyObject *unicode_helper(PyObject *self,PyObject *args){ PyObject *sampleObj = NULL;

Re: Unicode problem in ucs4

2009-03-23 Thread M.-A. Lemburg
On 2009-03-23 11:50, abhi wrote: On Mar 23, 3:04 pm, M.-A. Lemburg m...@egenix.com wrote: Thanks Marc, John, With your help, I am at least somewhere. I re-wrote the code to compare Py_Unicode and wchar_t outputs and they both look exactly the same. #include <Python.h> static

Re: Unicode problem in ucs4

2009-03-23 Thread abhi
On Mar 23, 4:37 pm, M.-A. Lemburg m...@egenix.com wrote: On 2009-03-23 11:50, abhi wrote: On Mar 23, 3:04 pm, M.-A. Lemburg m...@egenix.com wrote: Thanks Marc, John,          With your help, I am at least somewhere. I re-wrote the code to compare Py_Unicode and wchar_t outputs and they

Re: Unicode problem in ucs4

2009-03-23 Thread abhi
On Mar 23, 4:57 pm, abhi abhigyan_agra...@in.ibm.com wrote: On Mar 23, 4:37 pm, M.-A. Lemburg m...@egenix.com wrote: On 2009-03-23 11:50, abhi wrote: On Mar 23, 3:04 pm, M.-A. Lemburg m...@egenix.com wrote: Thanks Marc, John,          With your help, I am at least somewhere. I

Re: Unicode problem in ucs4

2009-03-23 Thread M.-A. Lemburg
On 2009-03-23 14:05, abhi wrote: Hi Marc, Is there any way to ensure that wchar_t size would always be 2 instead of 4 in ucs4 configured python? Googling gave me the impression that there is some logic written in PyUnicode_AsWideChar() which can take care of ucs4 to ucs2 conversion if

Re: Unicode problem in ucs4

2009-03-23 Thread M.-A. Lemburg
On 2009-03-23 12:57, abhi wrote: Is there any way by which I can force wchar_t to be 2 bytes, or can I convert this UCS4 data to UCS2 explicitly? Sure: just use the appropriate UTF-16 codec for this. /* Generic codec based encoding API. object is passed through the encoder function
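At the Python level, the effect of the UTF-16 codec suggested here can be sketched as follows. This only illustrates the resulting bytes; the C-level codec call from the reply is truncated above and is not reconstructed, and the variable names are illustrative.

```python
text = "test"

# Two bytes per character for anything in the Basic Multilingual Plane.
utf16 = text.encode("utf-16-le")
print(utf16)  # prints: b't\x00e\x00s\x00t\x00'

# A character outside the BMP becomes a surrogate pair (two 16-bit units),
# so UTF-16 is not strictly "2 bytes per character".
astral = "\U0001F600"
print(len(astral.encode("utf-16-le")))  # prints: 4
```

Using `"utf-16-le"` rather than `"utf-16"` avoids the byte-order mark that the generic codec prepends.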

Re: Unicode problem in ucs4

2009-03-23 Thread Martin v. Löwis
So, both Py_UNICODE and wchar_t are 4 bytes and since it contains 3 \0s after a char, printf or wprintf is only printing one letter. No. printf indeed will see a terminating character. However, wprintf should correctly know that a wchar_t has four bytes per character, and print it correctly.

Re: Unicode problem in ucs4

2009-03-20 Thread Martin v. Löwis
Any idea on why this is happening? Can you provide a complete example? Your code looks correct, and should just work. How do you know the result contains only 't' (i.e. how do you know it does not contain 'e', 's', 't')? Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list

Re: Unicode problem in ucs4

2009-03-20 Thread abhi
On Mar 20, 11:03 am, Martin v. Löwis mar...@v.loewis.de wrote: Any idea on why this is happening? Can you provide a complete example? Your code looks correct, and should just work. How do you know the result contains only 't' (i.e. how do you know it does not contain 'e', 's', 't')?

Re: Unicode problem in ucs4

2009-03-20 Thread M.-A. Lemburg
On 2009-03-20 12:13, abhi wrote: On Mar 20, 11:03 am, Martin v. Löwis mar...@v.loewis.de wrote: Any idea on why this is happening? Can you provide a complete example? Your code looks correct, and should just work. How do you know the result contains only 't' (i.e. how do you know it does

Unicode problem in ucs4

2009-03-19 Thread abhi
Hi, I have a C extension, which takes a unicode or string value from python and convert it to unicode before doing more operations on it. The skeleton looks like: static PyObject *unicode_helper( PyObject *self, PyObject *args){ PyObject *sampleObj = NULL; Py_UNICODE *sample =

Unicode Problem

2008-10-30 Thread Seid Mohammed
I am new to python. I want to print Amharic character using the Python IDLE. here goes sample code == abebe = 'አበበ በሶ በላ' abebe '\xe1\x8a\xa0\xe1\x89\xa0\xe1\x89\xa0 \xe1\x89\xa0\xe1\x88\xb6 \xe1\x89\xa0\xe1\x88\x8b' print abebe አበበ በሶ
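The escaped output in the session above is the UTF-8 byte sequence of the same text: in Python 2, echoing a `str` shows its `repr`, while `print` writes the bytes for the terminal to render. A sketch of that relationship in Python 3 syntax, where text and bytes are separate types:

```python
abebe = "አበበ በሶ በላ"

encoded = abebe.encode("utf-8")   # the bytes that Python 2 was holding
print(len(abebe), len(encoded))   # prints: 9 23 (each Ethiopic letter is 3 bytes)
print(encoded)                    # escaped bytes, as in the echoed value above
print(encoded.decode("utf-8"))    # the readable text again
```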

Re: Unicode Problem

2008-10-30 Thread Marc 'BlackJack' Rintsch
On Thu, 30 Oct 2008 10:28:39 +0300, Seid Mohammed wrote: I am new to python. I want to print Amharic character using the Python IDLE. here goes sample code == abebe = 'አበበ በሶ በላ' abebe '\xe1\x8a\xa0\xe1\x89\xa0\xe1\x89\xa0

Re: Unicode Problem

2008-10-30 Thread Ulrich Eckhardt
Seid Mohammed wrote: I am new to python. Welcome! :) abebe = 'አበበ በሶ በላ' abebe '\xe1\x8a\xa0\xe1\x89\xa0\xe1\x89\xa0 \xe1\x89\xa0\xe1\x88\xb6 \xe1\x89\xa0\xe1\x88\x8b' print abebe አበበ በሶ በላ abeba = ['አበበ','በሶ','በላ'] abeba ['\xe1\x8a\xa0\xe1\x89\xa0\xe1\x89\xa0',

Re: Unicode Problem

2008-10-30 Thread Bard Aase
On Thu, Oct 30, 2008 at 8:28 AM, Seid Mohammed [EMAIL PROTECTED] wrote: I am new to python. I want to print Amharic character using the Python IDLE. here goes sample code == abebe = 'አበበ በሶ በላ' abebe

Re: Logging library unicode problem

2008-08-20 Thread Vinay Sajip
On 13 Aug, 11:08, Victor Lin [EMAIL PROTECTED] wrote: Hi, I'm writing an application using python standard logging system. I encounter some problem with unicode message passed to logging library. I found that unicode message will be messed up by logging handler. piece of StreamHandler:

Logging library unicode problem

2008-08-13 Thread Victor Lin
Hi, I'm writing an application using python standard logging system. I encounter some problem with unicode message passed to logging library. I found that unicode message will be messed up by logging handler. piece of StreamHandler: try: self.stream.write(fs %
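For reference, a sketch of logging a non-ASCII message with the current standard library. It does not reproduce the Python 2 `StreamHandler` code quoted above; the logger name and the message are made up, and an in-memory text stream stands in for the real destination.

```python
import io
import logging

stream = io.StringIO()                       # text stream: accepts any str
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("unicode_demo")   # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False                     # keep the example self-contained

logger.info("測試 message")
logged = stream.getvalue()
print(logged, end="")                        # prints: INFO 測試 message
```

When logging to a file instead, `logging.FileHandler` accepts an `encoding` argument (for example `encoding="utf-8"`), so the encoding does not depend on the platform default.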

Re: Logging library unicode problem

2008-08-13 Thread Patrol Sun
What's your system? Simple Chinese Windows??? 2008/8/13 Victor Lin [EMAIL PROTECTED] Hi, I'm writing an application using python standard logging system. I encounter some problem with unicode message passed to logging library. I found that unicode message will be messed up by logging

Unicode Problem

2008-01-28 Thread Victor Subervi
Hi; New to unicode. Got this error: Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 29, in tagWords File "/usr/local/lib/python2.5/codecs.py", line 303, in write data, consumed = self.encode(object, self.errors) UnicodeDecodeError: 'ascii' codec can't

[issue1040] Unicode problem with TZ

2007-08-30 Thread Martin v. Löwis
Martin v. Löwis added the comment: This is now fixed in r57720. Using wide APIs would be possible through GetTimeZoneInformation, however, then TZ won't be supported anymore (unless the CRT code to parse TZ is duplicated). -- nosy: +loewis resolution: -> fixed status: open -> closed

[issue1040] Unicode problem with TZ

2007-08-29 Thread Martin v. Löwis
Changes by Martin v. Löwis: -- assignee: - loewis __ Tracker [EMAIL PROTECTED] http://bugs.python.org/issue1040 __ ___ Python-bugs-list mailing list Unsubscribe:

[issue1040] Unicode problem with TZ

2007-08-29 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: I have a patch for this, which uses MBCS conversion instead of relying on the default utf-8 (here and several other places). Tested on a French version of winXP. Which leads me to the question: should Windows use MBCS encoding by default when converting

[issue1040] Unicode problem with TZ

2007-08-29 Thread Thomas Heller
Thomas Heller added the comment: IMO the very best would be to avoid as many conversions as possible by using the wide apis on Windows. Not for _tzname maybe, but for env vars, sys.argv, sys.path, and so on. Not that I would have time to work on that...

[issue1040] Unicode problem with TZ

2007-08-28 Thread Thomas Heller
Thomas Heller added the comment: BTW, setting the environment variable TZ to, say, 'GMT' makes the problem go away.

[issue1040] Unicode problem with TZ

2007-08-28 Thread Thomas Heller
messages: 55351 nosy: theller severity: normal status: open title: Unicode problem with TZ versions: Python 3.0

Re: Parsing XML with ElementTree (unicode problem?)

2007-07-26 Thread Stefan Behnel
[EMAIL PROTECTED] wrote: On Jul 26, 3:13 pm, John Machin [EMAIL PROTECTED] wrote: On Jul 26, 9:24 pm, [EMAIL PROTECTED] wrote: OK, I solved the problem but I still don't get what went wrong. Solution - use tree builder in order to create the new xml file (previously I was manually creating

Re: Parsing XML with ElementTree (unicode problem?)

2007-07-26 Thread John Machin
On Jul 26, 9:24 pm, [EMAIL PROTECTED] wrote: OK, I solved the problem but I still don't get what went wrong. Solution - use tree builder in order to create the new xml file (previously I was manually creating it). I'm still curious so I'm adding a link to a short and very simple script that

Re: Parsing XML with ElementTree (unicode problem?)

2007-07-26 Thread oren . tsur
On Jul 26, 3:13 pm, John Machin [EMAIL PROTECTED] wrote: On Jul 26, 9:24 pm, [EMAIL PROTECTED] wrote: OK, I solved the problem but I still don't get what went wrong. Solution - use tree builder in order to create the new xml file (previously I was manually creating it). I'm still

Re: Parsing XML with ElementTree (unicode problem?)

2007-07-26 Thread oren . tsur
OK, I solved the problem but I still don't get what went wrong. Solution - use tree builder in order to create the new xml file (previously I was manually creating it). I'm still curious so I'm adding a link to a short and very simple script that gets an xml (containing non ascii chars) from the
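The fix described here (let the library build and serialize the XML instead of assembling it by hand) can be sketched with the standard-library `xml.etree.ElementTree`. The element names and text are invented for illustration, and an in-memory buffer stands in for the saved file.

```python
import io
import xml.etree.ElementTree as ET

# Build the document through the API so escaping and encoding are handled
# by the serializer rather than by string concatenation.
root = ET.Element("items")
item = ET.SubElement(root, "item")
item.text = "André, Löwis, 日本語"

buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)
xml_bytes = buffer.getvalue()

# Parsing the serialized bytes gives back the same text.
parsed = ET.fromstring(xml_bytes)
print(parsed.find("item").text)  # prints: André, Löwis, 日本語
```

Writing with an explicit `encoding` and an XML declaration keeps the bytes on disk and the declared encoding in agreement, which is one common cause of a file that parses from the network but not from a local copy.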

Re: Parsing XML with ElementTree (unicode problem?)

2007-07-26 Thread oren . tsur
On Jul 26, 4:34 pm, Stefan Behnel [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: On Jul 26, 3:13 pm, John Machin [EMAIL PROTECTED] wrote: On Jul 26, 9:24 pm, [EMAIL PROTECTED] wrote: OK, I solved the problem but I still don't get what went wrong. Solution - use tree builder in order

Re: Parsing XML with ElementTree (unicode problem?)

2007-07-24 Thread oren . tsur
On Jul 23, 4:46 pm, Richard Brodie [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] so what's the difference? how comes parsing is fine in the first case but erroneous in the second case? You may have guessed the encoding wrong. It probably wasn't utf-8

Re: Parsing XML with ElementTree (unicode problem?)

2007-07-24 Thread Stefan Behnel
[EMAIL PROTECTED] wrote: On Jul 23, 4:46 pm, Richard Brodie [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] so what's the difference? how comes parsing is fine in the first case but erroneous in the second case? You may have guessed the encoding wrong. It

Re: Parsing XML with ElementTree (unicode problem?)

2007-07-24 Thread Marc 'BlackJack' Rintsch
On Tue, 24 Jul 2007 05:57:26 +, oren.tsur wrote: but the thing is that the parser parses it all right from the web (the amazon response) but fails to parse the locally saved file. I've just used wget to fetch that URL and `ElementTree` parses that local file without problems. Maybe you

Re: Parsing XML with ElementTree (unicode problem?)

2007-07-24 Thread Steve Holden
Marc 'BlackJack' Rintsch wrote: On Tue, 24 Jul 2007 05:57:26 +, oren.tsur wrote: but the thing is that the parser parses it all right from the web (the amazon response) but fails to parse the locally saved file. I've just used wget to fetch that URL and `ElementTree` parses that local

Re: Parsing XML with ElementTree (unicode problem?)

2007-07-24 Thread André
On Jul 23, 11:29 am, [EMAIL PROTECTED] wrote: (this question was also posted in the devshed python forum:http://forums.devshed.com/python-programming-11/parsing-xml-with-elem... ). - (it's a bit longish but I hope I give all the information) 1. here is my
