Re: [Python-Dev] Unicode exception indexing

2011-11-04 Thread Terry Reedy
On 11/4/2011 3:39 AM, "Martin v. Löwis" wrote: Is it worth the hassle? We can just port our existing error handlers, and I guess the few third-party error handlers written in C (if any) can bear the transition. That was my question exactly. As the author of PEP 393, I was leaning towards full b

Re: [Python-Dev] Unicode exception indexing

2011-11-04 Thread Stefan Behnel
"Martin v. Löwis", 04.11.2011 08:39: Is it worth the hassle? We can just port our existing error handlers, and I guess the few third-party error handlers written in C (if any) can bear the transition. That was my question exactly. As the author of PEP 393, I was leaning towards full backwards c

Re: [Python-Dev] Unicode exception indexing

2011-11-04 Thread Martin v. Löwis
> Is it worth the hassle? We can just port our existing error handlers,
> and I guess the few third-party error handlers written in C (if any)
> can bear the transition.

That was my question exactly. As the author of PEP 393, I was leaning towards full backwards compatibility, but you, Victor, and

Re: [Python-Dev] Unicode exception indexing

2011-11-03 Thread Antoine Pitrou
On Thu, 03 Nov 2011 22:47:00 +0100 "Martin v. Löwis" wrote:
> >> On the one hand, these indices are used in formatting error messages
> >> such as "codec can't encode character \u%04x in position %d",
> >> suggesting they are regular indices into the string (counting code
> >> points).
>

Re: [Python-Dev] Unicode exception indexing

2011-11-03 Thread Nick Coghlan
Your approach (doing the right thing for both Python and C, new API to avoid the C performance problem) sounds good to me.

--
Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)

On Nov 4, 2011 7:58 AM, Martin v. Löwis wrote:
> > I started such hack for the UTF-8 codec... I

Re: [Python-Dev] Unicode exception indexing

2011-11-03 Thread Terry Reedy
On 11/3/2011 5:43 PM, "Martin v. Löwis" wrote: I had the impression that we were abolishing the wide versus narrow build difference and that this issue would disappear. I must have missed something. Most certainly. The Py_UNICODE type continues to exist for backwards compatibility. It is now

Re: [Python-Dev] Unicode exception indexing

2011-11-03 Thread Martin v. Löwis
> I started such hack for the UTF-8 codec... It is really tricky, we should not
> do that!

With the proper encapsulation, it's not that tricky. I have written functions PyUnicode_IndexToWCharIndex and PyUnicode_WCharIndexToIndex, and PyUnicodeEncodeError_GetStart and friends would use that functi
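[Editor's sketch] The snippet above names two conversion helpers but is cut off before describing them. The following is only an illustration in Python of what such a conversion could look like, not the C code from the thread. It assumes 16-bit units (as wchar_t is on Windows), where every code point above U+FFFF takes two units; the function names index_to_utf16_index and utf16_index_to_index are hypothetical.

```python
def _units(ch):
    """Number of 16-bit units needed for one code point (assumed UTF-16)."""
    return 2 if ord(ch) > 0xFFFF else 1


def index_to_utf16_index(s, index):
    """Map a code point index into s to a 16-bit unit index (hypothetical helper)."""
    return sum(_units(ch) for ch in s[:index])


def utf16_index_to_index(s, unit_index):
    """Map a 16-bit unit index back to a code point index (hypothetical helper).

    A unit index that falls inside a surrogate pair is rounded up to the
    next code point; that choice is an assumption of this sketch.
    """
    units = 0
    for i, ch in enumerate(s):
        if units >= unit_index:
            return i
        units += _units(ch)
    return len(s)


sample = "a\U0001F600b"          # 3 code points, 4 UTF-16 units
cp_to_unit = index_to_utf16_index(sample, 2)
unit_to_cp = utf16_index_to_index(sample, 3)
```

Both directions walk the string from the start, so each conversion is linear in the index. That cost on every exception attribute access is presumably the "C performance problem" mentioned elsewhere in the thread, though the truncated snippets do not confirm this.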

Re: [Python-Dev] Unicode exception indexing

2011-11-03 Thread Martin v. Löwis
>> On the one hand, these indices are used in formatting error messages such as
>> "codec can't encode character \u%04x in position %d", suggesting they
>> are regular indices into the string (counting code points).
>>
>> On the other hand, they are used by error handlers to lookup the charact
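[Editor's sketch] To make the two uses of the indices concrete, here is a small Python example, assuming Python 3.3 or later, where str is indexed by code point. It reads start and end from a UnicodeEncodeError and then uses them inside an error handler to look up the offending characters. The handler name "hex_replace_demo" is made up for this illustration.

```python
import codecs

text = "ab\U0001F600cd"

# Use 1: the indices as reported on the exception.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    err = exc                     # keep a reference past the except block

start, end = err.start, err.end
bad = err.object[start:end]       # slice the original string by code point


# Use 2: an error handler looks up the characters via the same indices.
def hex_replace(exc):
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    chunk = exc.object[exc.start:exc.end]
    replacement = "".join("<U+%04X>" % ord(ch) for ch in chunk)
    return replacement, exc.end   # resume encoding after the bad span


codecs.register_error("hex_replace_demo", hex_replace)
encoded = text.encode("ascii", errors="hex_replace_demo")
```

With code point indexing, the non-BMP character occupies exactly one position (start 2, end 3). Under a 16-bit Py_UNICODE representation the same character would span two units, which is the ambiguity the thread is about.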

Re: [Python-Dev] Unicode exception indexing

2011-11-03 Thread Martin v. Löwis
On 03.11.2011 22:19, Terry Reedy wrote:
> On 11/3/2011 3:16 PM, Victor Stinner wrote:
>> On Thursday 3 November 2011 18:14:42, mar...@v.loewis.de wrote:
>>> There is a backwards compatibility issue with PEP 393 and Unicode
>>> exceptions: the start and end indices: are they Py_UNICODE indices, or

Re: [Python-Dev] Unicode exception indexing

2011-11-03 Thread Terry Reedy
On 11/3/2011 3:16 PM, Victor Stinner wrote: On Thursday 3 November 2011 18:14:42, mar...@v.loewis.de wrote: There is a backwards compatibility issue with PEP 393 and Unicode exceptions: the start and end indices: are they Py_UNICODE indices, or code point indices? I had the impression that we

Re: [Python-Dev] Unicode exception indexing

2011-11-03 Thread Guido van Rossum
On Thu, Nov 3, 2011 at 12:29 PM, Antoine Pitrou wrote:
> On Thu, 03 Nov 2011 18:14:42 +0100
> mar...@v.loewis.de wrote:
>> There is a backwards compatibility issue with PEP 393 and Unicode exceptions:
>> the start and end indices: are they Py_UNICODE indices, or code point
>> indices?
>>
>> On th

Re: [Python-Dev] Unicode exception indexing

2011-11-03 Thread Antoine Pitrou
On Thu, 03 Nov 2011 18:14:42 +0100 mar...@v.loewis.de wrote:
> There is a backwards compatibility issue with PEP 393 and Unicode exceptions:
> the start and end indices: are they Py_UNICODE indices, or code point indices?
>
> On the one hand, these indices are used in formatting error messages suc

Re: [Python-Dev] Unicode exception indexing

2011-11-03 Thread Victor Stinner
On Thursday 3 November 2011 18:14:42, mar...@v.loewis.de wrote:
> There is a backwards compatibility issue with PEP 393 and Unicode
> exceptions: the start and end indices: are they Py_UNICODE indices, or
> code point indices?

Oh oh. That's exactly why I didn't want to start to work on this issue.