Thanks for the quick reply. Please see below.
On Fri, Apr 22, 2011 at 10:32 PM, Robert Ferrell <ferr...@diablotech.com> wrote:
> I add an array of dates to the PyTables file. That array just keeps track
> of which dates are stored. When I append a new date of data I just check
> whether that date is already in the file or not. For me that is simpler
> than involving SQL.
>
This is something very obvious that I overlooked. Still, there's a minor
glitch in using this: the data is not very consistent.
Sometimes, for a date, a given entry is repeated twice. However, such cases
are too few, maybe 5-10 in about 1000-2000 dates. I'll follow this path.
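
For illustration, here is a minimal sketch of the date-array check, assuming
the PyTables 3.x function names (open_file, create_earray). The node name
"stored_dates", the integer YYYYMMDD date encoding, the assumption that the
first field of each row is the symbol, and all helper names are made up for
this example and are not taken from Robert's setup. It also drops entries
repeated within one day's download before anything is written:

```python
import numpy as np
import tables


def open_store(path):
    """Open (or create) the HDF5 file and make sure the date index exists."""
    h5 = tables.open_file(path, mode="a")
    if "/stored_dates" not in h5:
        # Extendable 1-D array of dates encoded as integers, e.g. 20110422.
        h5.create_earray("/", "stored_dates",
                         atom=tables.Int32Atom(), shape=(0,))
    return h5


def is_date_stored(h5, yyyymmdd):
    """True if this date was already pushed to the file."""
    return bool(np.any(h5.root.stored_dates.read() == yyyymmdd))


def drop_repeats(rows):
    """Keep the first occurrence of each symbol within one day's rows.

    Assumes (hypothetically) that row[0] is the instrument symbol.
    """
    seen, unique = set(), []
    for row in rows:
        symbol = row[0]
        if symbol not in seen:
            seen.add(symbol)
            unique.append(row)
    return unique


def push_day(h5, yyyymmdd, rows, write_rows):
    """Write one day's rows unless the date is already present.

    write_rows is a caller-supplied function that appends rows to the
    instrument tables. Returns the number of rows written (0 if skipped).
    """
    if is_date_stored(h5, yyyymmdd):
        return 0
    unique = drop_repeats(rows)
    write_rows(unique)
    # Record the date only after its rows have been handed to the writer.
    h5.root.stored_dates.append(np.array([yyyymmdd], dtype=np.int32))
    h5.flush()
    return len(unique)
```

Calling push_day a second time with the same date is then a no-op, which is
the behaviour a primary key on the date would give in SQL.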
> BTW, what I found handy was to put the downloaded data into a particular
> data format, and then the interface for my PyTables file only has to know
> about that data format.
>
Pardon my ignorance, but I've only spent a couple of days with pytables, so
I am not aware of many of its features, let alone their optimum
use. Can you please elaborate?
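
For illustration, one possible reading of the suggestion quoted above is an
in-memory record type that sits between the downloaded CSV and the storage
code. This is only a sketch: the DailyQuote fields and the CSV column names
(symbol, close, volume) are assumptions, since the thread does not describe
the actual file layout.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class DailyQuote:
    """Hypothetical in-memory format for one instrument on one date."""
    date: int      # encoded as YYYYMMDD
    symbol: str
    close: float
    volume: int


def parse_daily_csv(text, date):
    """Convert one downloaded CSV (assumed columns: symbol, close, volume)."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        DailyQuote(date, r["symbol"].strip(), float(r["close"]), int(r["volume"]))
        for r in reader
    ]


def to_row(quote):
    """Flatten a record into the tuple a storage layer would append."""
    return (quote.date, quote.symbol, quote.close, quote.volume)
```

With something like this, only parse_daily_csv has to change if the download
format changes, and the code that writes to the PyTables file only ever sees
DailyQuote records.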
Thanks
-abhijit
> -r
>
> On Apr 22, 2011, at 10:20 AM, Abhijit Gadgil wrote:
>
> > Dear All,
> >
> > I agree, similar questions have probably been discussed many times on the
> > list. I have some peculiar requirements, which mostly have to do with the
> > way I am importing data to pytables.
> >
> > I have a CSV file of about 2-4k entries generated every day, which has data
> > for different scrips, and such files going back about 8-10 years. Initially,
> > when I did not know much about pytables, I started with the good old SQL way,
> > and the DB size there is currently about 4G. With pytables, I expect it to be
> > around 1-2G. So the size of the data is not much of a concern as of now, but
> > it will grow as time goes on. My questions are as follows.
> >
> > The way I am structuring the data is -
> >
> > I create a group, and in each group there's a separate table for each
> > of the instruments in the group; the entries for all dates for a given
> > instrument are the rows of its table. This looked to me a better choice than
> > having one big table with all the instruments' data as rows. Is this
> > the right choice?
> >
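
As a point of reference, the layout described above could be sketched as
follows, assuming the PyTables 3.x function names. The DailyBar columns, the
group name, and the helper names are assumptions made for this example; the
actual columns are not given in the thread.

```python
import tables


class DailyBar(tables.IsDescription):
    """Assumed row layout: one row per date for a single instrument."""
    date = tables.Int32Col(pos=0)     # encoded as YYYYMMDD
    close = tables.Float64Col(pos=1)
    volume = tables.Int64Col(pos=2)


def get_instrument_table(h5, group_name, symbol):
    """Return the table for symbol, creating group and table on first use."""
    group_path = "/" + group_name
    if group_path not in h5:
        h5.create_group("/", group_name)
    table_path = group_path + "/" + symbol
    if table_path not in h5:
        return h5.create_table(group_path, symbol, DailyBar)
    return h5.get_node(table_path)


def append_bar(table, date, close, volume):
    """Append one date's row to an instrument table."""
    table.append([(date, close, volume)])
    table.flush()
```

With this layout, the history of one instrument is a single table read,
while a query for one date across all instruments has to visit every table
in the group. Which of the two access patterns dominates is probably what
decides between this and one big table.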
> > The other question is a little more important to me. The way I populate the
> > table is that I download data every day and push it to the table. So far so
> > good. My real concern is that it is very likely that I download and push the
> > data for a given date twice. In the SQL counterpart, primary keys come
> > to the rescue, but there's no such choice available with pytables (to the
> > best of my knowledge). Would that have to be handled in the application? One
> > alternative that I am thinking of is populating an SQL table along
> > with pytables when inserting the data, so that the SQL table takes care of the
> > integrity part (not clean, but the best I can think of as of now). Are there
> > any other alternatives?
> >
> > Thanks
>
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users