I add an array of dates to the PyTables file. That array just keeps track of which dates are stored. When I append data for a new date, I first check whether that date is already in the file. For me that is simpler than involving SQL.
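Roughly, the date check could look like the sketch below. This is only an illustration, not my actual code: the node names `dates` and `prices`, the YYYYMMDD integer encoding, the two-column row layout, and the helpers `create_store` and `append_if_new` are all assumptions made up for the example. It uses the current snake_case PyTables names (`open_file`, `create_earray`); older releases spell these `openFile` and `createEArray`.

```python
import os
import tempfile

import numpy as np
import tables


def create_store(path):
    """Create a new HDF5 file with an empty date index and an empty data array."""
    h5file = tables.open_file(path, mode="w")
    # Hypothetical layout: dates as YYYYMMDD integers, data as two-column rows.
    h5file.create_earray(h5file.root, "dates",
                         atom=tables.Int32Atom(), shape=(0,))
    h5file.create_earray(h5file.root, "prices",
                         atom=tables.Float64Atom(), shape=(0, 2))
    return h5file


def append_if_new(h5file, date_int, rows):
    """Append rows for date_int unless that date is already recorded.

    Returns True if the rows were written, False if the date was a duplicate.
    """
    dates = h5file.root.dates
    # The date index is small (one entry per day), so reading it whole is cheap.
    if bool(np.any(dates.read() == date_int)):
        return False
    h5file.root.prices.append(rows)
    dates.append(np.array([date_int], dtype=np.int32))
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        h5 = create_store(os.path.join(tmp, "example.h5"))
        day = np.array([[10.0, 10.5], [20.0, 19.8]])
        print(append_if_new(h5, 20110422, day))  # first push is written
        print(append_if_new(h5, 20110422, day))  # second push is skipped
        h5.close()
```

The check lives entirely in the application, so nothing stops another program from appending to the file without going through `append_if_new`; it is a convention, not an enforced key.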
BTW, what I found handy was to put the downloaded data into a particular data format, and then the interface for my PyTables file only has to know about that data format.

-r

On Apr 22, 2011, at 10:20 AM, Abhijit Gadgil wrote:

> Dear All,
>
> I agree, similar questions would be discussed here many times here on the
> list. I've some peculiar requirements, which have got mostly to do with the
> way I am importing data to pytables.
>
> I've a csv file of about 2-4k entries generated everyday, which has data for
> different scrips and such files for about 8-10 years. Initially, when I was
> not knowing much about pytables, I started with good old sql way and the DB
> size there is currently about 4G. But with pytables, I expect it to be around
> 1-2G. So the size of the data is not such a concern as of now, but it'd grow
> as one goes along. My questions are as follows
>
> The way I am structuring the data is -
>
> I create a group and in the each group, there's a separate table for each of
> the instruments in the group and entries for all dates for a given
> instruments are rows for the table. This looked to me a better choice than
> having one big table and having all the instruments data as rows. Is this the
> right choice?
>
> The other question is a little important to me - The way I populate the table
> is I download data everyday and push it to the table. So far so good - my
> real concern is - it is very likely that I download and push the data for a
> given date twice in the table. In the sql counterpart, primary keys come to
> the rescue. But there's no such a choice available with pytables (to the best
> of my knowledge). Would that have to be handled in the application? One
> alternative that I am thinking is - actually populating an SQL table along
> with pytables when inserting the data and SQL table will take care of the
> integrity part (not clean, but best I can think of as of now). Are there any
> other alternatives?
>
> Thanks
>
>
> --
> अभिजीत
>
> _______________________________________________
> Pytables-users mailing list
> Pytables-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/pytables-users