Re: [Python-3000] Reversing through text files with the new IO library

Guido van Rossum Mon, 12 Mar 2007 12:18:14 -0800

On 3/12/07, Mark Russell <[EMAIL PROTECTED]> wrote:
> On 12 Mar 2007, at 17:56, Guido van Rossum wrote:
> > Thanks! This is a very interesting idea, I'd like to keep this
> > around somehow.
>
> Thanks for the positive feedback - much appreciated.
>
> > I also see that you noticed a problem with text I/O in the current
> > design; there's no easy way to implement readline() efficiently. I
> > want readline() to be as efficient as possible -- "for line in <file>"
> > should *scream*, like it does in 2.x.
>
> Yes, I suspect that BufferedReader needs some kind of readuntil()
> method, so that (at least for sane encodings like utf-8) each line is
> read via a single readuntil() followed by a decode() call for the
> entire line.
>
> Maybe something like this (although the only way to be sure is to
> experiment):
>
>      line, endindex = buffer.readuntil(line_endings)
>
>      Read until we see one of the byte strings in line_endings, which
>      is a sequence of one or more byte strings.  If there are multiple
>      line endings with a common prefix, use the longest.  Return the
>      line complete with the ending, with endindex being the index
>      within line of the line ending (or None if EOF was encountered).
>
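
A minimal sketch of the readuntil() idea quoted above, to make the
"longest ending wins" rule concrete. LineBuffer, chunk_size and _fill are
hypothetical names invented for this illustration; they are not part of
io.py or the PEP, and a real BufferedReader method would likely look
different.

```python
import io


class LineBuffer:
    """Hypothetical wrapper that adds readuntil() to a binary stream."""

    def __init__(self, raw, chunk_size=8192):
        self._raw = raw                # any object with read(n) -> bytes
        self._chunk_size = chunk_size  # assumed tuning knob, not from the PEP
        self._pending = b""            # bytes read but not yet returned
        self._eof = False

    def _fill(self):
        chunk = self._raw.read(self._chunk_size)
        if chunk:
            self._pending += chunk
        else:
            self._eof = True

    def readuntil(self, line_endings):
        """Return (line, endindex); endindex is None if EOF came first."""
        # Longest first, so b"\r\n" is preferred over b"\r" at the same spot.
        endings = sorted(line_endings, key=len, reverse=True)
        longest = len(endings[0])
        while True:
            positions = [self._pending.find(e) for e in endings]
            positions = [p for p in positions if p >= 0]
            if positions:
                start = min(positions)
                # Only decide once enough bytes follow `start` to rule out a
                # longer ending that is split across two chunks.
                if self._eof or len(self._pending) - start >= longest:
                    ending = next(e for e in endings
                                  if self._pending.startswith(e, start))
                    stop = start + len(ending)
                    line = self._pending[:stop]
                    self._pending = self._pending[stop:]
                    return line, start
            elif self._eof:
                line, self._pending = self._pending, b""
                return line, None
            self._fill()


# A tiny chunk size forces b"\r\n" to straddle a chunk boundary.
buf = LineBuffer(io.BytesIO(b"a\r\nbc\rd\nlast"), chunk_size=2)
endings = [b"\r", b"\n", b"\r\n"]
results = [buf.readuntil(endings) for _ in range(4)]
decoded = [line.decode("utf-8") for line, _ in results]
```

Each line is then decoded in one call, as suggested above. The sketch
rescans the pending bytes on every pass, so it illustrates the semantics
only, not the speed that "for line in <file>" would need.
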
> Is anyone working on io.py btw?  If not I'd be willing to put some
> time into it.  I guess the todo list is something like this:


I am, when I have time (which seems to be rarely), and so are Mike
Verdone and Daniel Stutzbach (though I may have unintentionally
discouraged them by not providing feedback soon enough).

>      - Finish off the python prototypes in io.py (using and maybe
>        tweaking the API spec)

Yes. I am positive that attempting to implement the entire PEP (and
trying to do it relatively efficiently) will require us to go back to
the API design several times.

Note that some of the binary prototypes don't work right yet; the
unittests don't cover everything that's been implemented yet.

I would love for you to start working on this. Let me know off-line if
you need more guidance (but CC Daniel and Mike so they know what's
going on).

>      - Get unit tests working with __builtin__.open = io.open

I'm not even sure about this one; we may have to do that
simultaneously with the str/unicode conversion. If we attempt to do it
before then, I expect that we'll get lots of failures because the new
I/O text layer always returns unicode and the new binary layer returns
bytes objects. We may have to do it more piecemeal. Perhaps a good
start would be to convert selected modules that use binary I/O to
switch to the new io module explicitly by importing it and recoding
them to deal with bytes.
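
To illustrate the bytes/unicode split described above, here is a small
sketch of a module that imports io explicitly. It assumes the io module as
it eventually shipped in Python 3 (where io.open is the same function as
the builtin open), not the 2007 prototype; read_both_ways and sample.txt
are made-up names for the example.

```python
import io
import os
import tempfile


def read_both_ways(path):
    """Read the same file through the binary layer and the text layer."""
    with io.open(path, "rb") as f:
        raw = f.read()    # binary layer: a bytes object
    # newline="" turns off newline translation so the comparison is exact.
    with io.open(path, "r", encoding="utf-8", newline="") as f:
        text = f.read()   # text layer: a unicode string (str in 3.x)
    return raw, text


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with io.open(path, "wb") as f:
        f.write("caf\u00e9\n".encode("utf-8"))
    raw, text = read_both_ways(path)
```

Code that used to treat the result of read() as a byte string has to
decide which of the two layers it actually wants, which is why a
module-by-module conversion seems more tractable than flipping
__builtin__.open in one step.
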

>      - Profile and optimize (e.g. by selective conversion to C)

I'd be okay with doing that after the 3.0 alpha 1 release (planned for June).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)