Hi folks,

Quick question:

I have data (lots of it) that looks like this:

mydict['key_string'] = [(int1, 'str1'), (int2, 'str2'), ... (intn, 'strn')]

The ints are 7 digits max (unsigned 24 bits max); the strings are a
3-character code (it could be replaced with a 4-bit number -- possibly
an Enum?).
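
To make the size estimate concrete, here is a minimal sketch of how one
(int, code) pair could be squeezed into a single 32-bit value: 24 bits
for the int plus 4 bits for the code.  The Code members below are
hypothetical placeholders (my real code set is not listed here), and
pack/unpack are just illustrative helper names.

```python
from enum import IntEnum


class Code(IntEnum):
    # Hypothetical 3-character codes; stand-ins for the real code set.
    ABC = 0
    DEF = 1
    XYZ = 2


def pack(value, code):
    """Pack a 24-bit unsigned int and a 4-bit code into one integer."""
    if not 0 <= value < (1 << 24):
        raise ValueError("value does not fit in 24 bits")
    if not 0 <= int(code) < (1 << 4):
        raise ValueError("code does not fit in 4 bits")
    # The code occupies the bits above the 24-bit value.
    return (int(code) << 24) | value


def unpack(packed):
    """Recover (value, code name) from a packed integer."""
    return packed & 0xFFFFFF, Code(packed >> 24).name


packed = pack(9999999, Code.XYZ)
print(unpack(packed))
```

Whether this kind of packing is worth the trouble compared with simply
storing two columns is something I have not measured.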

It would also be possible to structure the data like this, if it would
help matters:

mydict['key_string'] = [(int1, int2, int3, ..., intn),
                        ('str1', 'str2', 'str3', ..., 'strn')]

(or both inner and outer being lists, or both tuples, etc., whichever
is a better way to think about the PyTables data structuring)
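
The two layouts carry the same information, so converting between them
is cheap.  A small sketch with made-up sample values (the helper names
are mine, and it assumes every key has at least one entry):

```python
def to_columns(pairs):
    """[(int, str), ...] -> [(int, ...), (str, ...)]"""
    ints, codes = zip(*pairs)  # transpose; needs a non-empty list
    return [ints, codes]


def to_pairs(columns):
    """[(int, ...), (str, ...)] -> [(int, str), ...]"""
    ints, codes = columns
    return list(zip(ints, codes))


# Made-up sample entry, only for illustration.
pairs = [(1234567, 'abc'), (7654321, 'xyz'), (42, 'def')]
columns = to_columns(pairs)
print(columns)
print(to_pairs(columns) == pairs)
```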

I should note that while there are _many_ keys, there are relatively
few entries per key (say a maximum of 10?  maybe 20 in a very rare
instance).  The overall database is about 600MB, which I have currently
written out to disk as a text Python dictionary (by hand; it crashed
cPickle) ... the data I scraped out amounted to about 300MB.  Even
reading that in with execfile was a bad idea: I had to resort to
reading subsets and appending them to the in-memory dictionary.
Needless to say, these options aren't going to work.  I don't mind 20
minutes to build the data structure, but another 20 to load it isn't
going to work very well.  And I typically only need some entries, not
all of them.

Assuming that I want to be able to quickly look up a 'key_string' and
return the list of tuples (or equivalent structure), how should I
structure a PyTables table to hold this data?  In particular, I'm
puzzling out what my "row" class should look like.  Of course, I'd like
to avoid extraneous rows if possible.  But maybe I'm not thinking about
"rows" in the right way.  Since each entry (a row?) has a list of
things associated with it, and because those things are of uniform
types, I was thinking of using an array within a row, but I don't think
that is possible.
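
To show the kind of thing I have in mind, here is a rough sketch of the
simplest layout I can picture: one flat row per (key, int, code) triple
instead of an array inside a row, with an index on the key column for
lookups.  It is only a sketch under assumptions: it uses Python 3 and
the PyTables 3.x names (open_file, create_table, read_where), the
32-byte key length is a guess, and the table name "entries" and the
build/lookup helpers are made up for illustration.

```python
import tables


class Entry(tables.IsDescription):
    key = tables.StringCol(32)   # assumed maximum key length
    value = tables.UInt32Col()   # comfortably holds a 24-bit int
    code = tables.StringCol(3)   # the 3-character code


def build(path, data):
    """Write {key: [(int, code), ...]} as one row per (key, int, code)."""
    with tables.open_file(path, mode="w") as h5:
        table = h5.create_table("/", "entries", Entry)
        row = table.row
        for key, pairs in data.items():
            for value, code in pairs:
                row["key"] = key.encode("ascii")
                row["value"] = value
                row["code"] = code.encode("ascii")
                row.append()
        table.flush()
        # Index the key column so lookups need not scan every row.
        table.cols.key.create_index()


def lookup(path, key):
    """Return the [(int, code), ...] entries stored under one key."""
    with tables.open_file(path, mode="r") as h5:
        table = h5.root.entries
        hits = table.read_where(
            "key == k", condvars={"k": key.encode("ascii")}
        )
        return [(int(r["value"]), r["code"].decode("ascii")) for r in hits]
```

With this layout, something like lookup("mydata.h5", "key_string")
would open the file and read only the matching rows rather than loading
everything.  The cost is that the key string is repeated on every row,
which is exactly the kind of extraneous data I was hoping to avoid.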

Thoughts?  I do apologize, I have very little experience with database
design (of any sort, let alone PyTables).

Thanks,
Mark
