Hi Pushkar,

On 17/07/2013 19:28, Pushkar Raj Pande wrote:
> Hi all,
> 
> I am trying to figure out the best way to bulk load data into PyTables.
> This question may already have been answered, but I couldn't find what I was
> looking for.
> 
> The source data is in CSV form and may require parsing, type checking,
> and setting default values where a field doesn't conform to the type of its
> column. There are over 100 columns in a record. Doing this in a Python loop
> for each row is very slow compared to just fetching the rows from one
> PyTables file and writing them to another. The difference is roughly a
> factor of 50.
> 
> I believe that if I load the data using a C procedure that does the parsing
> and builds the records to write to PyTables, I can get close to the speed of
> just copying the rows from one PyTables file to another. But maybe
> there is something simple and better that already exists. Can someone
> please advise? And if a C procedure is what I should write, can someone
> point me to some examples or snippets I can refer to in putting it together?
> 
> Thanks,
> Pushkar
> 

NumPy has some tools for loading data from CSV files, such as loadtxt [1],
genfromtxt [2] and other variants.

Would none of them work for you?

[1]
http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html#numpy.loadtxt
[2]
http://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html#numpy.genfromtxt
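
For what it's worth, here is a minimal sketch of how genfromtxt could feed a
PyTables table in bulk. The three-column layout, the sample data, the default
value, the "/records" node name and the helper names parse_csv and
append_to_table are all made up for illustration; your real dtype would list
all 100+ columns. I have not benchmarked it against your data, so treat it as
a starting point rather than a definitive answer.

```python
import io

import numpy as np

# Hypothetical record layout; a real one would describe every column.
RECORD_DTYPE = np.dtype([("id", "i8"), ("price", "f8"), ("qty", "i4")])

# Hypothetical sample input; the second data row has an empty "price" field.
CSV_TEXT = "id,price,qty\n1,2.5,10\n2,,20\n3,4.0,30\n"


def parse_csv(source, dtype=RECORD_DTYPE):
    """Parse CSV text into a structured array, filling in missing fields."""
    rows = np.genfromtxt(
        source,
        delimiter=",",
        skip_header=1,                # skip the header line
        dtype=dtype,                  # type conversion is done per column
        filling_values={1: -1.0},     # assumed default for column 1 (price)
    )
    # A single-row file yields a 0-d array, so normalize to 1-D.
    return np.atleast_1d(rows)


def append_to_table(h5_path, rows, node_name="records"):
    """Append a whole structured array to a PyTables table in one call."""
    import tables  # PyTables, imported here so parsing alone needs only NumPy

    with tables.open_file(h5_path, mode="a") as h5:
        if "/" + node_name in h5:
            table = h5.get_node("/", node_name)
        else:
            # The table description is taken from the array's dtype.
            table = h5.create_table("/", node_name, description=rows.dtype)
        table.append(rows)
        table.flush()


rows = parse_csv(io.StringIO(CSV_TEXT))
```

The point of the design is that both the parsing and the write operate on
whole arrays, so there is no per-row Python loop. For a file that does not fit
in memory, you could pass genfromtxt a limited batch of lines at a time and
call append_to_table("data.h5", rows) once per batch.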


cheers

-- 
Antonio Valentino
