Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Stephen J. Turnbull
Matt Giuca writes: > OK, for all the people who say URI encoding does not encode characters: yes > it does. This is not an encoding for binary data, it's an encoding for > character data, but it's unspecified how the strings map to octets before > being percent-encoded. In other words, it's a

[Python-Dev] Memory Error while reading large file

2008-07-30 Thread Sumant Gupta
Hi, I have a problem reading a very large text file. When I call the len function to get the total number of lines in the file, I get a memory error. I am reading a list of files in a loop; two files are read properly, but when the third file is read, it gives a memory error. Sumant Gupta Software Engine
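The truncated message does not show the code involved, so the cause is unknown. One common cause, assumed here, is building a full list of lines (for example with readlines()) just to call len on it. A minimal sketch of counting lines by streaming instead; count_lines and the temporary demo file are hypothetical names used only for illustration:

```python
import os
import tempfile


def count_lines(path):
    """Count lines by streaming the file rather than loading it whole."""
    count = 0
    # Iterating over a file object yields one line at a time, so memory
    # use stays bounded by the longest line, not by the file size.
    with open(path, "rb") as handle:
        for _ in handle:
            count += 1
    return count


# Small demonstration on a temporary three-line file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    demo_path = tmp.name

demo_count = count_lines(demo_path)
os.remove(demo_path)
```

If memory still grows across files in a loop, it may be worth checking whether results from earlier files are being kept alive between iterations.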

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Guido van Rossum
On Wed, Jul 30, 2008 at 8:49 PM, Matt Giuca <[EMAIL PROTECTED]> wrote: > >> Con: URI encoding does not encode characters. > > OK, for all the people who say URI encoding does not encode characters: yes > it does. This is not an encoding for binary data, it's an encoding for > character data, but it

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Matt Giuca
> Con: URI encoding does not encode characters. OK, for all the people who say URI encoding does not encode characters: yes it does. This is not an encoding for binary data, it's an encoding for character data, but it's unspecified how the strings map to octets before being percent-encoded. From R
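The point that the string-to-octet mapping is unspecified can be seen with the urllib.parse API of current Python 3, which postdates this thread and is used here only as an illustration. The same character percent-encodes differently depending on the character encoding chosen first:

```python
from urllib.parse import quote

text = "é"  # U+00E9, LATIN SMALL LETTER E WITH ACUTE

# The character is first mapped to octets, then each octet is
# percent-encoded. Different encodings give different results.
utf8_form = quote(text, encoding="utf-8")      # octets C3 A9
latin1_form = quote(text, encoding="latin-1")  # octet E9
```

Without knowing which encoding the sender used, a receiver cannot tell whether the octets E9 or C3 A9 were meant to be this character.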

Re: [Python-Dev] critical issues for 2.6 and 3.0

2008-07-30 Thread Brett Cannon
On Wed, Jul 30, 2008 at 7:31 PM, Benjamin Peterson <[EMAIL PROTECTED]> wrote: > I just went through the disturbingly long list of 67 open issues with > a "critical" priority pinging and trying to get things moving. There > are ~55 now; I was able to close some, but others I promoted to > release bl

[Python-Dev] critical issues for 2.6 and 3.0

2008-07-30 Thread Benjamin Peterson
I just went through the disturbingly long list of 67 open issues with a "critical" priority pinging and trying to get things moving. There are ~55 now; I was able to close some, but others I promoted to release blocker for beta 3. Shouldn't all criticals be resolved by the final? I've never been t

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Bill Janssen
> I think this is as close to consensus as we can get on this issue. Can > whoever wrote the patch adjust the patch to this outcome? (I think the > only change is to remove the encoding arguments and make separate > functions for bytes.) This is 2.7/3.1 only, right? I'm looking at the bales of co

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Guido van Rossum
On Wed, Jul 30, 2008 at 12:49 PM, Bill Janssen <[EMAIL PROTECTED]> wrote: >> > unquote() -- takes string, produces bytes or string >> > >> > If optional "encoding" parameter is specified, decodes bytes with >> > that encoding and returns string. Otherwise, returns bytes. >> >> The default

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Bill Janssen
> > unquote() -- takes string, produces bytes or string > > > > If optional "encoding" parameter is specified, decodes bytes with > > that encoding and returns string. Otherwise, returns bytes. > > The default of returning bytes will break almost all uses. Most code > will use the unquo
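The quoted proposal can be sketched as follows. The name unquote_proposed is hypothetical and this is not how the standard library ended up behaving; it is built on urllib.parse.unquote_to_bytes from current Python 3 purely to show the shape of the idea, including the parameter-dependent return type that drew objections:

```python
from urllib.parse import unquote_to_bytes


def unquote_proposed(s, encoding=None):
    """Hypothetical sketch of the quoted proposal, not a stdlib API.

    Returns bytes when no encoding is given, otherwise a decoded string.
    """
    raw = unquote_to_bytes(s)
    if encoding is None:
        return raw               # bytes: caller decides how to decode
    return raw.decode(encoding)  # str: decoded with the given encoding


as_bytes = unquote_proposed("%C3%A9")
as_text = unquote_proposed("%C3%A9", encoding="utf-8")
```

Callers that expect a string and omit the encoding would receive bytes, which is the breakage the reply is concerned about.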

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Jeff Hall
> > > (Aside: I dislike functions that have a different return type based on > the value of a parameter.) > > I wanted to stay out of the whole discussion as it's largely over my head... But I did want to express support for this idea which I think almost rises to the level of a standard... I see m

Re: [Python-Dev] Fuzzing bugs: most bugs are closed

2008-07-30 Thread Guido van Rossum
On Mon, Jul 21, 2008 at 10:41 AM, A.M. Kuchling <[EMAIL PROTECTED]> wrote: > On Mon, Jul 21, 2008 at 03:53:18PM +, Antoine Pitrou wrote: >> The underscore at the beginning of _sre clearly indicates that the module is >> not recommended for direct consumption, IMO. Even the functions that don't

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Guido van Rossum
On Wed, Jul 30, 2008 at 10:33 AM, Bill Janssen <[EMAIL PROTECTED]> wrote: >> It looks like all other APIs in the Py3k version of >> urllib treat URLs as text. > > The URL is text, a string of ASCII characters. We're just talking > about urllib.quote() and urllib.unquote(), which are there to suppo

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Bill Janssen
> It looks like all other APIs in the Py3k version of > urllib treat URLs as text. The URL is text, a string of ASCII characters. We're just talking about urllib.quote() and urllib.unquote(), which are there to support the text-ization of binary values, and the de-text-ization. > I think that wo

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Bill Janssen
> Actually (as I pointed out before) the existing functions are not > string-in/string-out. They are something-in and bytes-out. Sorry, this is wrong. "quote" is clearly bytes-in and string-out. "unquote" is clearly string-in and bytes-out. The whole point of "quote" is to take an arbitrary seq
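The bytes-in/string-out view described here corresponds to a pair of functions that exist in current Python 3, urllib.parse.quote_from_bytes and urllib.parse.unquote_to_bytes. They postdate this message and are shown only as an illustration of the round trip over arbitrary octets:

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

# Arbitrary octets, including ones that are not valid text in any
# particular encoding.
payload = bytes([0x00, 0xFF, 0x2F, 0x41])

# safe="" asks for "/" (0x2F) to be percent-encoded as well.
encoded = quote_from_bytes(payload, safe="")
decoded = unquote_to_bytes(encoded)
```

No character encoding is involved at any step, so the round trip is lossless for any byte sequence.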

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Guido van Rossum
On Wed, Jul 30, 2008 at 9:52 AM, Bill Janssen <[EMAIL PROTECTED]> wrote: >> On Wed, Jul 30, 2008 at 8:09 AM, André Malo <[EMAIL PROTECTED]> wrote: >> > I'm actually in favour of encoding bytes only back and forth. A useful >> > extension would be *another* function which wraps quote/unquote and enc

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Bill Janssen
> On Wed, Jul 30, 2008 at 8:09 AM, André Malo <[EMAIL PROTECTED]> wrote: > > I'm actually in favour of encoding bytes only back and forth. A useful > > extension would be *another* function which wraps quote/unquote and encodes > > and decodes characters. > > I'd reverse this. By all means, ad

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Bill Janssen
> For unquote, I think it will break a lot and surprise everyone. I > think that while this may be "purely" the best option, it's pretty > silly. I don't mind being silly to do the right thing. Happens to me a lot :-). Bill ___ Python-Dev mailing list

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Guido van Rossum
On Wed, Jul 30, 2008 at 8:09 AM, André Malo <[EMAIL PROTECTED]> wrote: > I'm actually in favour of encoding bytes only back and forth. A useful > extension would be *another* function which wraps quote/unquote and encodes > and decodes characters. I'd reverse this. By all means, add a new pair of

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread André Malo
[I was pretty busy these days, so sorry for jumping in late again] * Matt Giuca wrote: > 1. Leave it as it is. quote is Latin-1 if range(0,256), fallback to > UTF-8. unquote is Latin-1. > In favour: Anybody who doesn't reply to this thread > Pros: Already implemented; some existing code depends
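Option 1 as summarised in the quoted text (quote uses Latin-1 when every code point is below 256 and otherwise falls back to UTF-8, while unquote always uses Latin-1) can be sketched as below. The functions quote_option1 and unquote_option1 are hypothetical reconstructions from that one-line summary, not the code that was in the tree at the time:

```python
from urllib.parse import quote_from_bytes, unquote


def quote_option1(s):
    """Hypothetical: Latin-1 if all code points fit, else UTF-8."""
    try:
        octets = s.encode("latin-1")
    except UnicodeEncodeError:
        octets = s.encode("utf-8")
    return quote_from_bytes(octets)


def unquote_option1(s):
    """Hypothetical: always decode the octets as Latin-1."""
    return unquote(s, encoding="latin-1")


latin_char = "é"     # fits in Latin-1
cyrillic_char = "Г"  # U+0413, does not fit in Latin-1

latin_roundtrip = unquote_option1(quote_option1(latin_char))
cyrillic_roundtrip = unquote_option1(quote_option1(cyrillic_char))
```

Under these assumptions the Latin-1 character survives the round trip, but the Cyrillic one comes back as two unrelated Latin-1 characters, because quoting and unquoting disagree about the encoding.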

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Antoine Pitrou
Facundo Batista writes: > > 2008/7/30 Matt Giuca: > > > 2. Default to UTF-8. > > In favour: Matt Giuca, Brett Cannon, Jeroen Ruigrok van der Werven > > Pros: Fully working and tested solution is implemented; recommended by > > RFC 3986 for all future schemes; recommended

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Facundo Batista
2008/7/30 Matt Giuca <[EMAIL PROTECTED]>: > 2. Default to UTF-8. > In favour: Matt Giuca, Brett Cannon, Jeroen Ruigrok van der Werven > Pros: Fully working and tested solution is implemented; recommended by > RFC 3986 for all future schemes; recommended by W3C for use with HTML; > UTF-8 used by al

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Oleg Broytmann
On Thu, Jul 31, 2008 at 12:11:40AM +1000, Matt Giuca wrote: > 2. Default to UTF-8. > In favour: Matt Giuca, Brett Cannon, Jeroen Ruigrok van der Werven Count me too: +1. Most sites I use these days use UTF-8 for URL encoding. Examples: Wikipedia: http://ru.wikipedia.org/wiki/%D0%93%D0%B2%D0%B
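The Wikipedia URL in this snippet is cut off, so only its first two complete escaped characters are used here. A short sketch with urllib.parse from current Python 3, where UTF-8 is the default (an API that postdates this thread), suggests that those octets do decode as UTF-8 text:

```python
from urllib.parse import quote, unquote

# Only the intact leading part of the truncated path is used.
fragment = "%D0%93%D0%B2"

decoded = unquote(fragment)  # UTF-8 is the default encoding
reencoded = quote(decoded)   # quoting again reproduces the fragment
```

The two-octet groups D0 93 and D0 B2 are the UTF-8 forms of the Cyrillic letters U+0413 and U+0432.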

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Matt Giuca
Arg! Damnit, why do my replies get split off from the main thread? Sorry about any confusion this may be causing.

Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread Matt Giuca
Hi folks, This issue got some attention a few weeks back but it seems to have fallen quiet, and I haven't had a good chance to sit down and reply again till now. As I've said before this is a serious issue which will affect a great deal of code. However it's obviously not as clear-cut as I origin

Re: [Python-Dev] Matrix product

2008-07-30 Thread Sebastien Loisel
Dear Raymond, Thank you for your email. > I think much of this thread is a repeat of conversations > that were held for PEP 225: > http://www.python.org/dev/peps/pep-0225/ > > That PEP is marked as deferred. Maybe it's time to > bring it back to life. This is a much better PEP than the one I ha

Re: [Python-Dev] Matrix product

2008-07-30 Thread Raymond Hettinger
Further, while A**B is not so common, A**n is quite common (for integral n, in the sense of repeated matrix multiplication). So a matrix multiplication operator really should come with a power operator cousin. Which obviously should be @@ :-) I think much of this thread is a repeat of conversa
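A**n for integral n, read as repeated matrix multiplication, can be sketched in plain Python on nested lists. The helper names matmul and matpow are hypothetical, and exponentiation by squaring is swapped in for the naive n-1 multiplications; it gives the same result with fewer products:

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def matpow(a, n):
    """Raise a square matrix to a non-negative integer power."""
    if n < 0:
        raise ValueError("negative powers would need a matrix inverse")
    size = len(a)
    # Start from the identity matrix, the neutral element of matmul.
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    base = a
    while n:
        if n & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        n >>= 1
    return result


fib_matrix = [[1, 1], [1, 0]]
fifth_power = matpow(fib_matrix, 5)
```

The example matrix is the usual Fibonacci matrix, so its powers contain consecutive Fibonacci numbers and are easy to check by hand.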