Hi Pushkar,

On 17/07/2013 19:28, Pushkar Raj Pande wrote:
> Hi all,
>
> I am trying to figure out the best way to bulk load data into pytables.
> This question may have been already answered but I couldn't find what I was
> looking for.
>
> The source data is in the form of CSV, which may require parsing, type checking
> and setting default values if it doesn't conform to the type of the column.
> There are over 100 columns in a record. Doing this in a loop in Python for
> each row of the record is very slow compared to just fetching the rows from
> one pytable file and writing them to another. The difference is almost a factor
> of ~50.
>
> I believe if I load the data using a C procedure that does the parsing and
> builds the records to write in pytables I can get close to the speed of
> just copying and writing the rows from one pytable to another. But maybe
> there is something simple and better that already exists. Can someone
> please advise? But if it is a C procedure that I should write, can someone
> point me to some examples or snippets that I can refer to in order to put
> this together?
>
> Thanks,
> Pushkar
>
NumPy has some tools for loading data from CSV files, such as loadtxt [1],
genfromtxt [2] and other variants. Is none of them suitable for you?

[1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html#numpy.loadtxt
[2] http://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html#numpy.genfromtxt

cheers

-- 
Antonio Valentino

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
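
To illustrate the genfromtxt suggestion, here is a minimal sketch of parsing a CSV with typed columns and per-column defaults for empty fields. The column names, dtypes, default values and the helper name load_csv_records are all made up for illustration; a real file with 100+ columns would need its own dtype description.

```python
import io

import numpy as np

# Hypothetical three-column layout; a real table would list all of its columns.
RECORD_DTYPE = [("id", "i8"), ("price", "f8"), ("name", "U16")]

# Hypothetical defaults, keyed by column index, used when a field is empty.
DEFAULTS = {0: -1, 1: 0.0}


def load_csv_records(source):
    """Parse CSV text (file path or file-like object) into a structured array."""
    return np.genfromtxt(
        source,
        delimiter=",",
        names=True,            # take the field names from the header line
        dtype=RECORD_DTYPE,    # enforce one type per column
        filling_values=DEFAULTS,
        encoding="utf-8",
    )


# Small in-memory sample with two empty fields.
sample = io.StringIO(
    "id,price,name\n"
    "1,2.5,foo\n"
    "2,,bar\n"
    ",3.0,baz\n"
)
records = load_csv_records(sample)
print(records)
```

The result is a NumPy structured array, which PyTables can take in one call via Table.append(records), provided the array's fields match the table description. Whether this is fast enough for a 100-column file would have to be measured; it mainly moves the per-row loop out of user code.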