On Thu, 25 May 2006 15:01:36 +0000, Runar Petursson <[EMAIL PROTECTED]> wrote:
>We've been talking this week about ideas for speeding up the parsing of
>Longs coming out of files or network.  The use case is having a large string
>with embedded longs and parsing them into real longs.  One approach would be
>to use a simple slice:
>long(mystring[x:y])
>
>which is an expensive operation in a tight loop.  The proposed solution is to add
>further keyword arguments to long() (such as):
>
>long(mystring, base=10, start=x, end=y)
>
>The start/end would allow for negative indexes, as slices do, but otherwise
>simply limit the scope of the parsing.  There are other solutions, using
>buffer-like objects and such, but this seems like a simple win for anyone
>parsing a lot of text.  I implemented it in a branch, runar-longslice-branch,
>but it would need to be updated with Tim's latest improvements to long.
>Then you may ask, why not do it for everything else parsing from string--to
>which I say it should.  Thoughts?

This really seems like a poor option.  Why fix the problem with a hundred 
special cases instead of a single general solution?

Hmm, one reason could be that the general solution doesn't work:

  [EMAIL PROTECTED]:~$ python
  Python 2.4.3 (#2, Apr 27 2006, 14:43:58) 
  [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> long(buffer('1234', 0, 3))
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  ValueError: null byte in argument for long()
  >>> long(buffer('123a', 0, 3))
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  ValueError: invalid literal for long(): 123a
  >>> 

Still, fixing that seems like a better idea. ;)
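For concreteness, here is a pure-Python model of the slice-style start/end
semantics Runar describes (the name parse_long is hypothetical, and a
pure-Python loop is of course not where the speedup would come from; the
point of the proposal is to do this in C inside long() without copying):

```python
# Hypothetical model of long(s, base=10, start=x, end=y).  Illustrative
# only: the real win would be a C implementation that avoids the slice copy.
def parse_long(s, base=10, start=0, end=None):
    """Parse s[start:end] as an integer without actually slicing s.

    start/end accept negative indexes, as slices do, but otherwise
    simply limit the scope of the parsing.
    """
    n = len(s)
    if end is None:
        end = n
    # Normalize negative indexes the way slices do.
    if start < 0:
        start = max(start + n, 0)
    if end < 0:
        end = max(end + n, 0)
    end = min(end, n)

    i = start
    sign = 1
    if i < end and s[i] in '+-':
        if s[i] == '-':
            sign = -1
        i += 1
    if i >= end:
        raise ValueError('invalid literal for long(): %s' % s[start:end])

    result = 0
    for i in range(i, end):
        # int() on a single character raises ValueError for a bad digit,
        # giving behavior consistent with long('123a') today.
        digit = int(s[i], base)
        result = result * base + digit
    return sign * result
```

So parse_long('abc1234def', start=3, end=7) gives 1234, and
parse_long('1234', end=-1) gives 123, matching what long() on the
corresponding slices would return.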

Jean-Paul
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev