[Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Derek Homeier
Hi all, I was just having a new look into the mess that is, imo, the support for automatic line ending recognition in genfromtxt, and more generally, the Python file openers. I am glad at least reading gzip files is no longer entirely broken in Python3, but actually detecting in particular
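
As a point of reference for the line-ending question raised above, here is a minimal sketch of universal-newline translation using only the standard library. The sample bytes are made up for illustration, and this is not the code path genfromtxt itself uses; it only shows what `newline=None` does when a binary stream is wrapped as text.

```python
import io

# Hypothetical sample mixing old-Mac (\r), Windows (\r\n) and Unix (\n) endings.
raw = b"1 2\r3 4\r\n5 6\n"

# newline=None turns on universal-newline translation: every recognised
# line ending is handed back to the reader as a plain "\n".
with io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None) as fh:
    lines = fh.readlines()

print(lines)
```

Since genfromtxt is documented to accept a list of strings as input, a list like `lines` could then be passed to it directly.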

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Julian Taylor
genfromtxt and loadtxt need an almost full rewrite to fix the botched python3 conversion of these functions. There are a couple threads about this on this list already. There are numerous PRs fixing stuff in these functions which I currently all -1'd because we need to fix the underlying unicode

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Nathaniel Smith
On Mon, Jun 30, 2014 at 12:33 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote: genfromtxt and loadtxt need an almost full rewrite to fix the botched python3 conversion of these functions. There are a couple threads about this on this list already. There are numerous PRs fixing stuff in

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Derek Homeier
On 30 Jun 2014, at 04:39 pm, Nathaniel Smith n...@pobox.com wrote: On Mon, Jun 30, 2014 at 12:33 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote: genfromtxt and loadtxt need an almost full rewrite to fix the botched python3 conversion of these functions. There are a couple threads

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Nathaniel Smith
On Mon, Jun 30, 2014 at 3:47 PM, Derek Homeier de...@astro.physik.uni-goettingen.de wrote: Does it make sense to keep maintaining both functions at all? IIRC the idea that loadtxt would be the faster version of the two has been discarded long ago, thus it seems there is very little, if anything,

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Chris Barker
It's also an interesting question whether they've fixed the unicode/binary issues, Which brings up the "how do we handle text/strings in numpy?" issue. We had a good thread going here about what the 'S' data type should be, what with py3 and all, but I don't think we ever really resolved that.

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Nathaniel Smith
On 30 Jun 2014 17:05, Chris Barker chris.bar...@noaa.gov wrote: It's also an interesting question whether they've fixed the unicode/binary issues, Which brings up the "how do we handle text/strings in numpy?" issue. We had a good thread going here about what the 'S' data type should be, what

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Chris Barker
On Mon, Jun 30, 2014 at 9:31 AM, Nathaniel Smith n...@pobox.com wrote: On 30 Jun 2014 17:05, Chris Barker chris.bar...@noaa.gov wrote: Anyway, this all ties in with the text file parsing issues... Only tangentially though :-) well, a fast text parser (and text mode) input file will

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Derek Homeier
On 30 Jun 2014, at 04:56 pm, Nathaniel Smith n...@pobox.com wrote: A real need, which had also been discussed at length, is a truly performant text IO function (i.e. one using a compiled ASCII number parser, and optimally also a more memory-efficient one), but unfortunately all people

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Jeff Reback
In pandas 0.14.0, generic whitespace IS parsed via the c-parser, e.g. specifying '\s+' as a separator. Not sure when you were playing last with pandas, but the c-parser has been in place since late 2012. (version 0.8.0)

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Derek Homeier
On 30.06.2014, at 23:10, Jeff Reback jeffreb...@gmail.com wrote: In pandas 0.14.0, generic whitespace IS parsed via the c-parser, e.g. specifying '\s+' as a separator. Not sure when you were playing last with pandas, but the c-parser has been in place since late 2012. (version 0.8.0)

Re: [Numpy-discussion] genfromtxt and gzip

2013-06-11 Thread Derek Homeier
On 05.06.2013, at 9:52AM, Ted To rainexpec...@theo.to wrote: From the list archives (2011), I noticed that there is a bug in the python gzip module that causes genfromtxt to fail with python 2 but this bug is not a problem for python 3. When I tried to use genfromtxt and python 3 with a

[Numpy-discussion] genfromtxt and gzip

2013-06-05 Thread Ted To
Hi all, From the list archives (2011), I noticed that there is a bug in the python gzip module that causes genfromtxt to fail with python 2 but this bug is not a problem for python 3. When I tried to use genfromtxt and python 3 with a gzip'ed csv file, I instead got: IOError: Mode rbU not
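
One way to sidestep the file-mode problem described in this thread is to open the gzip file yourself in text mode and hand the file object to genfromtxt. The sketch below assumes Python 3 and a reasonably recent NumPy; the file name and contents are invented for the example, and it is not necessarily the fix that was discussed on the list.

```python
import gzip
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical gzip'ed CSV file created just for this example.
    path = os.path.join(tmpdir, "sample.csv.gz")
    with gzip.open(path, "wt", encoding="ascii") as fh:
        fh.write("1,2,3\n4,5,6\n")

    # Opening in text mode ("rt") yields str lines, so genfromtxt never
    # has to choose a file mode for the compressed stream itself.
    with gzip.open(path, "rt", encoding="ascii") as fh:
        data = np.genfromtxt(fh, delimiter=",")

print(data)
```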

[Numpy-discussion] genfromtxt() skips comments

2013-05-31 Thread Albert Kottke
I noticed that genfromtxt() did not skip comments if the keyword names is not True. If names is True, then genfromtxt() would take the first line as the names. I am proposing a fix to genfromtxt that skips all of the comments in a file, and potentially using the last comment line for names. This

Re: [Numpy-discussion] genfromtxt() skips comments

2013-05-31 Thread Benjamin Root
On Fri, May 31, 2013 at 5:08 PM, Albert Kottke albert.kot...@gmail.com wrote: I noticed that genfromtxt() did not skip comments if the keyword names is not True. If names is True, then genfromtxt() would take the first line as the names. I am proposing a fix to genfromtxt that skips all of the

Re: [Numpy-discussion] genfromtxt() skips comments

2013-05-31 Thread Albert Kottke
I agree that last comment line before the first line of data is more descriptive. Regarding the location of the names. I thought taking it from the last comment line before the first line of data made sense because it would permit reading of just the data with np.loadtxt(), but also permit

Re: [Numpy-discussion] genfromtxt() skips comments

2013-05-31 Thread Albert Kottke
Now try the same thing with np.recfromcsv(). I get the following (Python 3.3): import io b = io.BytesIO(b"!blah\n!blah\n!blah\n!A:B:C\n1:2:3\n4:5:6\n") np.recfromcsv(b, delimiter=':', comments='!') ... ValueError: Some errors were detected ! Line #5 (got 3 columns instead of 1) Line #6
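
For the sample data in this message, a workaround that avoids the error is to skip the leading comment lines explicitly, so that the commented header becomes the first line genfromtxt sees. This is only a sketch using existing genfromtxt keywords; it assumes the number of leading comment lines (three here) is known in advance, which is exactly what the proposed fix would make unnecessary.

```python
import io

import numpy as np

text = "!blah\n!blah\n!blah\n!A:B:C\n1:2:3\n4:5:6\n"

# skip_header=3 drops the three plain comment lines; with names=True the
# next line ("!A:B:C") is used for the field names, comment marker removed.
data = np.genfromtxt(
    io.StringIO(text),
    delimiter=":",
    comments="!",
    skip_header=3,
    names=True,
)

print(data.dtype.names)
print(data["B"])
```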

[Numpy-discussion] genfromtxt

2011-10-11 Thread Nils Wagner
Hi all, How do I use genfromtxt to read a file with the following lines 11 2.2592365264892578D+01 22 2.2592365264892578D+01 13 2.669845581055D+00 33 2.2592365264892578D+01

Re: [Numpy-discussion] genfromtxt

2011-10-11 Thread Derek Homeier
Hi Nils, On 11 Oct 2011, at 16:34, Nils Wagner wrote: How do I use genfromtxt to read a file with the following lines 11 2.2592365264892578D+01 22 2.2592365264892578D+01 13 2.669845581055D+00
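
The reply above is cut off in this listing, so the following is not necessarily the approach Derek suggested. One simple possibility is to rewrite the Fortran-style `D` exponent marker as `E` before the text reaches genfromtxt. The sketch assumes the letter `D` appears nowhere else in the data.

```python
import io

import numpy as np

text = (
    "11 2.2592365264892578D+01\n"
    "22 2.2592365264892578D+01\n"
    "13 2.669845581055D+00\n"
    "33 2.2592365264892578D+01\n"
)

# Replace the Fortran double-precision exponent marker with the "E"
# that Python's float() understands, then pass the lines on as a list.
lines = [line.replace("D", "E") for line in io.StringIO(text)]
data = np.genfromtxt(lines)

print(data)
```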

Re: [Numpy-discussion] genfromtxt converter question

2011-06-18 Thread Derek Homeier
On 18 Jun 2011, at 04:48, gary ruben wrote: Thanks guys - I'm happy with the solution for now. FYI, Derek's suggestion doesn't work in numpy 1.5.1 either. For any developers following this thread, I think this might be a nice use case for genfromtxt to handle in future. Numpy 1.6.0 and above

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Olivier Delalleau
If I understand correctly, your error is that you convert only the second column, because your converters dictionary contains a single key (1). If you have it contain keys from 0 to 3 associated to the same function, it should work. -=- Olivier 2011/6/17 gary ruben gru...@bigpond.net.au I'm

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread gary ruben
Thanks Olivier, Your suggestion gets me a little closer to what I want, but doesn't quite work. Replacing the conversion with c = lambda x:np.cast[np.complex64](complex(*eval(x))) b = np.genfromtxt(a,converters={0:c, 1:c, 2:c, 3:c},dtype=None,delimiter=18,usecols=range(4)) produces

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Bruce Southey
On 06/17/2011 08:22 AM, gary ruben wrote: Thanks Olivier, Your suggestion gets me a little closer to what I want, but doesn't quite work. Replacing the conversion with c = lambda x:np.cast[np.complex64](complex(*eval(x))) b = np.genfromtxt(a,converters={0:c, 1:c, 2:c,

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Olivier Delalleau
2011/6/17 Bruce Southey bsout...@gmail.com On 06/17/2011 08:22 AM, gary ruben wrote: Thanks Olivier, Your suggestion gets me a little closer to what I want, but doesn't quite work. Replacing the conversion with c = lambda x:np.cast[np.complex64](complex(*eval(x))) b =

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Bruce Southey
On 06/17/2011 08:51 AM, Olivier Delalleau wrote: 2011/6/17 Bruce Southey bsout...@gmail.com On 06/17/2011 08:22 AM, gary ruben wrote: Thanks Olivier, Your suggestion gets me a little closer to what I want, but doesn't quite work. Replacing the

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread gary ruben
Thanks for the hints Olivier and Bruce. Based on them, the following is a working solution, although I still have that itchy sense that genfromtxt should be able to do it directly. import numpy as np from StringIO import StringIO a = StringIO('''\ (-3.9700,-5.0400) (-1.1318,-2.5693)

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Derek Homeier
Hi Gary, On 17.06.2011, at 5:39PM, gary ruben wrote: Thanks for the hints Olivier and Bruce. Based on them, the following is a working solution, although I still have that itchy sense that genfromtxt should be able to do it directly. import numpy as np from StringIO import StringIO a =

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Olivier Delalleau
2011/6/17 Derek Homeier de...@astro.physik.uni-goettingen.de Hi Gary, On 17.06.2011, at 5:39PM, gary ruben wrote: Thanks for the hints Olivier and Bruce. Based on them, the following is a working solution, although I still have that itchy sense that genfromtxt should be able to do it

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Derek Homeier
On 17.06.2011, at 11:01PM, Olivier Delalleau wrote: You were just overdoing it by already creating an array with the converter, this apparently caused genfromtxt to create a structured array from the input (which could be converted back to an ndarray, but that can prove tricky as well) -

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread gary ruben
Thanks guys - I'm happy with the solution for now. FYI, Derek's suggestion doesn't work in numpy 1.5.1 either. For any developers following this thread, I think this might be a nice use case for genfromtxt to handle in future. As a corollary of this problem, I wonder whether there's a

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Olivier Delalleau
For the hardcoded part, you can easily read the first line of your file and split it with the same delimiter to know the number of columns. It would certainly be best to be able to skip this part, but at least you don't need to hardcode this number into your code. Something like: n_cols =
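
The suggestion above is truncated in this listing; a sketch of the idea might look as follows. The helper names `count_columns`, `build_converters` and `parse_complex` are invented for this example, the sample line is taken from Gary's data, and splitting on whitespace is an assumption (the original code used a fixed-width delimiter of 18).

```python
import ast
import io


def count_columns(fileobj, delimiter=None):
    """Peek at the first line and return how many fields it holds."""
    start = fileobj.tell()
    first_line = fileobj.readline()
    fileobj.seek(start)  # rewind so a later reader sees the whole file
    return len(first_line.strip().split(delimiter))


def build_converters(n_cols, func):
    """Map every column index 0..n_cols-1 to the same converter."""
    return {col: func for col in range(n_cols)}


def parse_complex(field):
    """Turn a field such as '(-3.9700,-5.0400)' into a complex number."""
    real, imag = ast.literal_eval(field)
    return complex(real, imag)


sample = io.StringIO("(-3.9700,-5.0400) (-1.1318,-2.5693)\n")
n_cols = count_columns(sample)
converters = build_converters(n_cols, parse_complex)

print(n_cols, sorted(converters))
```

The resulting dictionary has the shape expected by the `converters` argument of genfromtxt; whether genfromtxt then returns a plain complex array depends on the NumPy version, as the rest of this thread shows.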

[Numpy-discussion] genfromtxt converter question

2011-06-16 Thread gary ruben
I'm trying to read a file containing data formatted as in the following example using genfromtxt and I'm doing something wrong. It almost works. Can someone point out my error, or suggest a simpler solution to the ugly converter function? I thought I'd leave in the commented-out line for future

[Numpy-discussion] genfromtxt behaviour

2010-10-29 Thread Matt Studley
Hi all, first, please forgive me for my ignorance - I am taking my first stumbling steps with numpy and scipy. I am having some difficulty with the behaviour of genfromtxt. s = SIO.StringIO("1, 2, 3\n4, 5, 6\n7, 8, 9") g = genfromtxt(s, delimiter=', ', dtype=None) print g[:,0] This produces the

Re: [Numpy-discussion] genfromtxt behaviour

2010-10-29 Thread Pierre GM
On Oct 29, 2010, at 2:35 PM, Matt Studley wrote: Hi all first, please forgive me for my ignorance - I am taking my first stumbling steps with numpy and scipy. No problem, it's educational I am having some difficulty with the behaviour of genfromtxt. s = SIO.StringIO(1, 2, 3 4, 5, 6

[Numpy-discussion] genfromtxt behaviour

2010-10-29 Thread Matt Studley
snip How can I do my nice 2d slicing on the latter? array([('a', 2, 3), ('b', 5, 6), ('c', 8, 9)], dtype=[('f0', '|S1'), ('f1', 'i4'), ('f2', 'i4')]) Select a column by its name: yourarray['f0'] Super! So I would need to get the dtype object... myData[ myData.dtype.names[0] ] in

Re: [Numpy-discussion] genfromtxt behaviour

2010-10-29 Thread Pierre GM
On Oct 29, 2010, at 2:59 PM, Matt Studley wrote: snip How can I do my nice 2d slicing on the latter? array([('a', 2, 3), ('b', 5, 6), ('c', 8, 9)], dtype=[('f0', '|S1'), ('f1', 'i4'), ('f2', 'i4')]) Select a column by its name: yourarray['f0'] Super! So I would need to get
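
To make the exchange above concrete, here is a small sketch of selecting columns from the structured array that genfromtxt returns for mixed-type input. It assumes a NumPy version that accepts the `encoding` keyword; the variable names are illustrative, and stacking the numeric fields is just one way to get back to ordinary 2-D slicing.

```python
import io

import numpy as np

text = "a,2,3\nb,5,6\nc,8,9\n"

# dtype=None lets genfromtxt pick a type per column, so mixed input
# produces a 1-D structured array with default field names f0, f1, f2.
records = np.genfromtxt(
    io.StringIO(text), delimiter=",", dtype=None, encoding="utf-8"
)

# Select a column by name, without hardcoding the name.
first_name = records.dtype.names[0]
first_column = records[first_name]

# Stack the numeric fields into a plain 2-D array for slicing.
numeric = np.column_stack([records[name] for name in records.dtype.names[1:]])

print(first_column)
print(numeric[:, 0])
```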

[Numpy-discussion] genfromtxt usage

2010-08-25 Thread Antoine Dechaume
Hello, I am trying to read a file with a variable number of values on each line, using genfromtxt and the missing_values or filling_values arguments. The usage of those arguments is not clear in the documentation. If what I am trying to do is possible, how could I do it? Thanks, Antoine.

Re: [Numpy-discussion] genfromtxt usage

2010-08-25 Thread Stéfan van der Walt
Hi Antoine On 25 August 2010 10:44, Antoine Dechaume boole...@gmail.com wrote: Hello, I am trying to read a file with a variable number of values on each line, using genfromtxt and missing_values or filling_values arguments. The usage of those arguments is not clear in the documentation, if
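
The answer is cut off in this listing, so the following only illustrates how `filling_values` behaves, not what was actually recommended. It assumes every row still carries the same number of delimiters, with missing values left as empty fields; rows with genuinely fewer delimiters are a different case and are rejected by default.

```python
import io

import numpy as np

# Made-up sample: the second and third rows each have one empty field.
text = "1,2,3\n4,,6\n7,8,\n"

# Empty fields are treated as missing and replaced by filling_values.
data = np.genfromtxt(io.StringIO(text), delimiter=",", filling_values=-1)

print(data)
```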

Re: [Numpy-discussion] genfromtxt documentation : review needed

2009-10-16 Thread Skipper Seabold
On Thu, Oct 15, 2009 at 7:08 PM, Pierre GM pgmdevl...@gmail.com wrote: All, Here's a first draft for the documentation of np.genfromtxt. It took me longer than I thought, but that way I uncovered and fixed some bugs. Please send me your comments/reviews/etc. I count especially on our

Re: [Numpy-discussion] genfromtxt documentation : review needed

2009-10-16 Thread Pierre GM
On Oct 16, 2009, at 8:29 AM, Skipper Seabold wrote: Great work! I am especially glad to see the better documentation on missing values, as I didn't fully understand how to do this. A few small comments and a small attached diff with a few nitpicking grammatical changes and some of what's

Re: [Numpy-discussion] genfromtxt - the return

2009-10-07 Thread Christopher Barker
Pierre GM wrote: On Oct 6, 2009, at 10:08 PM, Bruce Southey wrote: option to merge delimiters - actually in SAS it is default Wow! that sure strikes me as a bad choice. Ahah! I get it. Well, I remember that we discussed something like that a few months ago when I started working on

Re: [Numpy-discussion] genfromtxt - the return

2009-10-07 Thread Bruce Southey
On 10/07/2009 02:14 PM, Christopher Barker wrote: Pierre GM wrote: On Oct 6, 2009, at 10:08 PM, Bruce Southey wrote: option to merge delimiters - actually in SAS it is default Wow! that sure strikes me as a bad choice. Ahah! I get it. Well, I remember that we

Re: [Numpy-discussion] genfromtxt - the return

2009-10-07 Thread Pierre GM
On Oct 7, 2009, at 3:54 PM, Bruce Southey wrote: Anyhow, I do like what genfromtxt is doing so merging multiple delimiters of the same type is not really needed. Thinking about it, merging multiple delimiters of the same type can be tricky: how do you distinguish between, say, AAA\t\tCCC
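
The ambiguity Pierre points at can be seen with plain string splitting, independent of genfromtxt. In this sketch the example string comes from the message; the two splitting strategies are only illustrations of "keep empty fields" versus "merge repeated delimiters".

```python
import re

row = "AAA\t\tCCC"

# Treat every tab as a separator: the empty middle field is preserved.
strict = row.split("\t")

# Merge runs of tabs into one separator: the empty field disappears
# and the row looks like it only ever had two columns.
merged = re.split(r"\t+", row)

print(strict)
print(merged)
```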

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Bruce Southey
On 10/05/2009 02:13 PM, Pierre GM wrote: All, Could you try r7449 ? I introduced some mechanisms to keep track of invalid lines (where the number of columns doesn't match what's expected). By default, a warning is emitted and these lines are skipped, but an optional argument gives the

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Pierre GM
On Oct 6, 2009, at 2:42 PM, Bruce Southey wrote: Hi, Excellent as the changes appear to address incorrect number of delimiters. They should also give some extra info if there's a problem w/ the converters. I think that the default invalid_raise should be True. Mmh, OK, that's a +1/)

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Christopher Barker
Pierre GM wrote: I think that the default invalid_raise should be True. Mmh, OK, that's a +1/) for invalid_raise=true. Anybody else ? yup -- make it +2 -- ignoring errors and losing data by default is a bad idea! One 'feature' is that there is no way to indicate multiple delimiters when
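
For readers following the invalid_raise discussion, here is a short sketch of the two behaviours as they exist in current NumPy releases, where `invalid_raise` defaults to True. The three-line sample is invented for the example.

```python
import io
import warnings

import numpy as np

# Made-up sample: the second line has two columns instead of three.
text = "1 2 3\n4 5\n6 7 8\n"

# Default behaviour: an inconsistent column count raises ValueError.
try:
    np.genfromtxt(io.StringIO(text))
    raised = False
except ValueError:
    raised = True

# invalid_raise=False: a warning is emitted and the bad line is skipped.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    kept = np.genfromtxt(io.StringIO(text), invalid_raise=False)

print(raised)
print(kept)
```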

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Pierre GM
On Oct 6, 2009, at 10:08 PM, Bruce Southey wrote: No, just seeing what sort of problems I can create. This case is partly based on if someone is using tab-delimited then they need to set the delimiter='\t' otherwise it gives an error. Also I often parse text files so, yes, you have to be

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Skipper Seabold
On Tue, Oct 6, 2009 at 10:08 PM, Bruce Southey bsout...@gmail.com wrote: On Tue, Oct 6, 2009 at 4:04 PM, Pierre GM pgmdevl...@gmail.com wrote: On Oct 6, 2009, at 4:43 PM, Christopher Barker wrote: Pierre GM wrote: I think that the default invalid_raise should be True. Mmh, OK, that's a

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Skipper Seabold
On Tue, Oct 6, 2009 at 10:27 PM, Pierre GM pgmdevl...@gmail.com wrote: snip Anyhow, I am really impressed on how this function works. Thx. I hope things haven't been slowed down too much. In keeping with the "making some work for you" theme, I filed an enhancement ticket for one change that we

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Pierre GM
On Oct 6, 2009, at 11:01 PM, Skipper Seabold wrote: In keeping with the "making some work for you" theme, I filed an enhancement ticket for one change that we discussed and another IMO useful addition. http://projects.scipy.org/numpy/ticket/1238 I think it would be nice if we could do data

[Numpy-discussion] genfromtxt to structured array

2009-09-25 Thread Timmie
Hello, this may be an easier question. I want to load data into a structured array, getting the names from the column header (names=True). The data looks like: ;month;day;hour;value 1995;1;1;01;0 but loading works only if changed to: year;month;day;hour;value

Re: [Numpy-discussion] genfromtxt to structured array

2009-09-25 Thread Ryan May
On Fri, Sep 25, 2009 at 4:30 PM, Timmie timmichel...@gmx-topmail.de wrote: Hello, this may be an easier question. I want to load data into a structured array, getting the names from the column header (names=True). The data looks like: ;month;day;hour;value 1995;1;1;01;0 but
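
Ryan's answer is cut off in this listing. One way to cope with a header whose first name is empty, without editing the file, is to skip the header line and supply the names explicitly. This is only a sketch: the second data row is made up so that the result has more than one record.

```python
import io

import numpy as np

# First two lines are from the message; the last row is invented.
text = ";month;day;hour;value\n1995;1;1;01;0\n1995;1;1;02;5\n"

# Skip the incomplete header line and give all five names by hand.
data = np.genfromtxt(
    io.StringIO(text),
    delimiter=";",
    skip_header=1,
    names=["year", "month", "day", "hour", "value"],
)

print(data.dtype.names)
print(data["year"])
```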

[Numpy-discussion] genfromtxt view with object dtype

2009-02-04 Thread Brent Pedersen
hi, i am using genfromtxt, with a dtype like this: [('seqid', '|S24'), ('source', '|S16'), ('type', '|S16'), ('start', 'i4'), ('end', 'i4'), ('score', 'f8'), ('strand', '|S1'), ('phase', 'i4'), ('attrs', '|O4')] where i'm having problems with the attrs column which i'd like to be a dict. i can

Re: [Numpy-discussion] genfromtxt view with object dtype

2009-02-04 Thread Pierre GM
OK, Brent, try r6341. I fixed genfromtxt for cases like yours (explicit dtype involving a np.object). Note that the fix won't work if the dtype is nested and involves np.objects (as we would hit the pb of renaming fields we observed...). Let me know how it goes. P. On Feb 4, 2009, at 4:03 PM,

Re: [Numpy-discussion] genfromtxt view with object dtype

2009-02-04 Thread Brent Pedersen
On Wed, Feb 4, 2009 at 8:51 PM, Pierre GM pgmdevl...@gmail.com wrote: OK, Brent, try r6341. I fixed genfromtxt for cases like yours (explicit dtype involving a np.object). Note that the fix won't work if the dtype is nested and involves np.objects (as we would hit the pb of renaming fields we

[Numpy-discussion] genfromtxt

2009-01-21 Thread Brent Pedersen
hi, i'm using the new genfromtxt stuff in numpy svn, looks great, pierre and any who contributed. is there a way to have the header commented and still be able to have it recognized as the header? e.g. #gender age weight M 21 72.10 F 35 58.33 M 33 21.99 if i use np.loadtxt or

Re: [Numpy-discussion] genfromtxt

2009-01-21 Thread Pierre GM
Brent, Currently, no, you won't be able to retrieve the header if it's commented. I'll see what I can do. P. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] genfromtxt

2009-01-21 Thread Pierre GM
Brent, Mind trying r6330 and let me know if it works for you ? Make sure that you use names=True to detect a header. P.

Re: [Numpy-discussion] genfromtxt

2009-01-21 Thread Brent Pedersen
On Wed, Jan 21, 2009 at 9:39 PM, Pierre GM pgmdevl...@gmail.com wrote: Brent, Mind trying r6330 and let me know if it works for you ? Make sure that you use names=True to detect a header. P. yes, works perfectly. thanks! -brent
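
For reference, the behaviour confirmed in this thread can be reproduced with Brent's sample data in a current NumPy release. The sketch assumes a version that accepts the `encoding` keyword; `dtype=None` is used so that the text column is kept as strings.

```python
import io

import numpy as np

text = "#gender age weight\nM 21 72.10\nF 35 58.33\nM 33 21.99\n"

# names=True reads the field names from the first line, even when that
# line starts with the comment character.
people = np.genfromtxt(
    io.StringIO(text), names=True, dtype=None, encoding="utf-8"
)

print(people.dtype.names)
print(people["age"])
```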