On Mon, Dec 12, 2011 at 10:22 AM, Chris.Barker <chris.bar...@noaa.gov> wrote:
> On 12/11/11 8:40 AM, Ralf Gommers wrote:
> > On Wed, Dec 7, 2011 at 7:50 PM, Chris.Barker <chris.bar...@noaa.gov
> > * If we have a good, fast ascii (or unicode?) to array reader, hopefully
> > it could be leveraged for use in the more complex cases. So that rather
> > than genfromtxt() being written from scratch, it would be a wrapper
> > around the lower-level reader.
> >
> > You seem to be contradicting yourself here. The more complex cases are
> > Wes' 10% and why genfromtxt is so hairy internally. There's always a
> > trade-off between speed and handling complex corner cases. You want both.
>
> I don't think the version in my mind is contradictory (not quite).
>
> What I'm imagining is that a good, fast ascii to numpy array reader
> could read a whole table in at once (the common, easy, fast case), but
> it could also be used to read snippets of a file in at a time, which
> could be leveraged to handle many of the more complex cases.
>
> I suppose there will always be cases where the user needs to write their
> own converter from string to dtype, and there is simply no way to
> leverage what I'm imagining to support that.
>
> Hmm, maybe there is -- for instance, if a "record" consisted of mostly
> standard, easy-to-parse numbers, but one field was some weird text that
> needed custom parsing, we could read it as a dtype, with a string for
> that one weird field, and that could be converted in a post-processing
> step.
>
> Maybe that wouldn't be any faster or easier, but it could be done...
>
> Anyway, whether you can leverage it for the full-featured version or
> not, I do think there is call for a good, fast, 90% case text file parser.
>
> Would anyone like to join/form a small working group to work on this?
>
> Wes, I'd like to see your Cython version -- maybe a starting point?
>
> -Chris
>

I'm also working on a faster text file reader, so count me in. I've been
experimenting in both C and Cython. I'll put it on github as soon as I can.

Warren

> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> chris.bar...@noaa.gov
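
For reference, the post-processing idea in Chris's quoted message (read the
easy numeric fields with a generic reader, keep the one odd field as a string,
convert it afterwards) might look roughly like the sketch below. This is only
an illustration using the existing np.loadtxt, not the fast reader under
discussion. The sample data, the field names, and the parse_depth() and
read_records() helpers are all made up for the example.

```python
import io
import numpy as np

# Hypothetical sample data: two plain numeric columns plus one "weird"
# field (a depth with a unit suffix) that needs custom parsing.
SAMPLE = """\
1.0 2.5 10m
2.0 3.5 250cm
3.0 4.5 7m
"""


def parse_depth(text):
    """Hypothetical custom converter: turn '10m' or '250cm' into metres."""
    if text.endswith("cm"):
        return float(text[:-2]) / 100.0
    if text.endswith("m"):
        return float(text[:-1])
    raise ValueError(f"unrecognised depth: {text!r}")


def read_records(fileobj):
    # Pass 1: let the generic reader parse the easy columns and keep the
    # odd field as a fixed-width string.
    raw_dtype = np.dtype([("x", "f8"), ("y", "f8"), ("depth", "U16")])
    raw = np.loadtxt(fileobj, dtype=raw_dtype, ndmin=1)

    # Pass 2: post-process the string field into a numeric one.
    out_dtype = np.dtype([("x", "f8"), ("y", "f8"), ("depth", "f8")])
    out = np.empty(raw.shape, dtype=out_dtype)
    out["x"] = raw["x"]
    out["y"] = raw["y"]
    out["depth"] = [parse_depth(s) for s in raw["depth"]]
    return out


records = read_records(io.StringIO(SAMPLE))
```

Whether the second pass costs more than it saves depends on how many fields
need custom handling, which is the trade-off raised in the thread.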
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion