Re: [Numpy-discussion] numpy.loadtxt requires seek()?

2008-11-29 Thread Stéfan van der Walt
2008/11/20 Ryan May [EMAIL PROTECTED]:
> I've attached a simple patch that changes the check for seek() to a
> check for readline().  I'll punt on my idea of just using iterators,
> since that seems like slightly greater complexity for no gain. (I'm not
> sure how many people end up with data in a list of strings and wish they
> could pass that to loadtxt).
>
> While you're at it, would you commit my patch to add support for bzipped
> files as well (attached)?

Thanks, applied.

Stéfan
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numpy.loadtxt requires seek()?

2008-11-20 Thread Ryan May
Hi,

Does anyone know why numpy.loadtxt(), in checking the validity of a
filehandle, checks for the seek() method, which appears to have no
bearing on whether an object will work?

I'm trying to use loadtxt() directly with the file-like object returned
by urllib2.urlopen().  If I change the check for 'seek' to one for
'readline', using the urlopen object works without a hitch.

As far as I can tell, the only requirements the filehandle object needs to meet are:

1) Have a readline() method so that loadtxt can skip the first N lines
and read the first line of data

2) Be compatible with itertools.chain() (should be any iterable)

At a minimum, I'd ask to change the check for 'seek' to one for 'readline'.
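
The two requirements above can be sketched roughly as follows. This is a minimal illustration in Python 3, not loadtxt's actual code: LineSource and read_rows are hypothetical names, with LineSource standing in for a urlopen()-style object that has readline() and iteration but no seek().

```python
import itertools


class LineSource:
    """Hypothetical stand-in for a urlopen()-style object: no seek()."""

    def __init__(self, lines):
        self._lines = iter(lines)

    def readline(self):
        # Return the next line, or '' at end of input, like a real file.
        return next(self._lines, '')

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._lines)


def read_rows(fh, skiprows=0):
    """Hypothetical reader that relies only on readline() and iteration."""
    if not hasattr(fh, 'readline'):
        raise ValueError('fname must be a string or file handle')
    # Requirement 1: readline() skips the first N lines and reads the
    # first line of data.
    for _ in range(skiprows):
        fh.readline()
    first = fh.readline()
    # Requirement 2: the handle is chained back together with that line.
    rows = []
    for line in itertools.chain([first], fh):
        if line.strip():
            rows.append([float(tok) for tok in line.split()])
    return rows


src = LineSource(['# header\n', '1 2\n', '3 4\n'])
rows = read_rows(src, skiprows=1)
```

A check for 'seek' would reject src here, even though nothing in read_rows ever seeks.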

On a bit deeper thought, it would seem that loadtxt would work with any
iterable that returns individual lines.  I'd like then to change the
calls to readline() to just getting the next object from the iterable
(iter.next() ?) and change the check for a file-like object to just a
check for an iterable.  In fact, we could use the iter() builtin to
convert whatever got passed.  That would give automatically a next()
method and would raise a TypeError if it's incompatible.
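
A rough sketch of the iterator-based idea, assuming Python 3 (where the call is next(it) rather than it.next()). read_table is a hypothetical name used only for illustration; it is not part of numpy.

```python
import io


def read_table(source, skiprows=0):
    """Hypothetical loadtxt-like reader that accepts any iterable of lines."""
    lines = iter(source)  # raises TypeError if source is not iterable
    for _ in range(skiprows):
        next(lines, None)  # skip header lines; tolerate short input
    return [[float(tok) for tok in line.split()]
            for line in lines if line.strip()]


# A list of strings and a file-like object go through the same code path.
from_list = read_table(['x y\n', '1 2\n', '3 4\n'], skiprows=1)
from_file = read_table(io.StringIO('x y\n1 2\n3 4\n'), skiprows=1)

# A non-iterable is rejected by iter() itself.
try:
    read_table(42)
    rejected = False
except TypeError:
    rejected = True
```

Note that a plain string is itself iterable (character by character), so filenames would still have to be handled before falling back to iter().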

Thoughts?  I'm willing to write up the patch for either.

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma


Re: [Numpy-discussion] numpy.loadtxt requires seek()?

2008-11-20 Thread Ryan May
Stéfan van der Walt wrote:
> 2008/11/20 Ryan May [EMAIL PROTECTED]:
>> Does anyone know why numpy.loadtxt(), in checking the validity of a
>> filehandle, checks for the seek() method, which appears to have no
>> bearing on whether an object will work?
>
> I think this is simply a naive mistake on my part.  I was looking for
> a way to identify files; your patch would be welcome.

I've attached a simple patch that changes the check for seek() to a
check for readline().  I'll punt on my idea of just using iterators,
since that seems like slightly greater complexity for no gain. (I'm not
sure how many people end up with data in a list of strings and wish they
could pass that to loadtxt).

While you're at it, would you commit my patch to add support for bzipped
files as well (attached)?

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
Index: numpy/lib/io.py
===================================================================
--- numpy/lib/io.py (revision 5953)
+++ numpy/lib/io.py (working copy)
@@ -253,8 +253,8 @@
     Parameters
     ----------
     fname : file or string
-        File or filename to read.  If the filename extension is ``.gz``,
-        the file is first decompressed.
+        File or filename to read.  If the filename extension is ``.gz`` or
+        ``.bz2``, the file is first decompressed.
     dtype : data-type
         Data type of the resulting array.  If this is a record data-type,
         the resulting array will be 1-dimensional, and each row will be
@@ -320,6 +320,9 @@
         if fname.endswith('.gz'):
             import gzip
             fh = gzip.open(fname)
+        elif fname.endswith('.bz2'):
+            import bz2
+            fh = bz2.BZ2File(fname)
         else:
             fh = file(fname)
     elif hasattr(fname, 'seek'):
Index: numpy/lib/io.py
===================================================================
--- numpy/lib/io.py (revision 6085)
+++ numpy/lib/io.py (working copy)
@@ -333,7 +333,7 @@
             fh = gzip.open(fname)
         else:
             fh = file(fname)
-    elif hasattr(fname, 'seek'):
+    elif hasattr(fname, 'readline'):
         fh = fname
     else:
         raise ValueError('fname must be a string or file handle')
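
For readers on Python 3, the combined effect of the two patches can be sketched roughly as below. This is an illustration under stated assumptions, not the code in numpy/lib/io.py: open_by_extension is a hypothetical name, open() replaces the Python 2 file() builtin, and bz2.open() in text mode is used in place of bz2.BZ2File.

```python
import bz2
import gzip
import os
import tempfile


def open_by_extension(fname):
    """Hypothetical helper mirroring the patched branch logic."""
    if isinstance(fname, str):
        if fname.endswith('.gz'):
            return gzip.open(fname, 'rt')
        elif fname.endswith('.bz2'):
            return bz2.open(fname, 'rt')
        else:
            return open(fname)
    elif hasattr(fname, 'readline'):
        # Any object with readline() is accepted as-is; seek() is not needed.
        return fname
    else:
        raise ValueError('fname must be a string or file handle')


# Round trip through a temporary .bz2 file.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'data.txt.bz2')
    with bz2.open(path, 'wt') as out:
        out.write('1 2\n3 4\n')
    with open_by_extension(path) as fh:
        lines = fh.readlines()
```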