[Pytables-users] Data organization

Nicholas Potter Thu, 09 Dec 2010 01:20:06 -0800

Hello everyone,

I am working with economic data for 3140 counties and the 50 states, across
500 industries, and am trying to figure out the best way to store and access
it.  The two options seem to be one table of ~32 million rows, like this:


Region | Industry | Variable | Value
(data rows)

or instead, decouple by region, having 3140 tables, one for each county,
with industries as the columns (so 500 columns) and variables as the rows.
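
For concreteness, the two layouts can be sketched in plain Python. This is
only an illustration of the shapes and of the "fetch one county" access
pattern: it deliberately uses no PyTables calls, and every name and value in
it (county_A, ind_1, employment, fake_value, and so on) is a made-up
placeholder rather than anything from the real data set.

```python
# Toy stand-ins for the real dimensions (3140 counties, 500 industries).
# All names and values here are hypothetical placeholders.
regions = ["county_A", "county_B"]
industries = ["ind_1", "ind_2", "ind_3"]
variables = ["employment", "output"]


def fake_value(region, industry, variable):
    """Deterministic placeholder so both layouts hold identical data."""
    return float(
        regions.index(region) * 100
        + industries.index(industry) * 10
        + variables.index(variable)
    )


# Option 1: one long table, one row per (region, industry, variable).
long_table = [
    (r, i, v, fake_value(r, i, v))
    for r in regions
    for i in industries
    for v in variables
]

# Option 2: one table per region; rows are variables, columns are industries.
per_region = {
    r: {v: [fake_value(r, i, v) for i in industries] for v in variables}
    for r in regions
}


def county_from_long(table, region):
    """Pull one region out of the long table by filtering every row."""
    return [row for row in table if row[0] == region]


def county_from_split(tables_by_region, region):
    """Pull one region out of the split layout with a single lookup."""
    return tables_by_region[region]
```

In the long layout, fetching one county means filtering on the Region column
(a full scan here; an on-disk store might use an index instead), while in the
split layout it is a direct lookup of that county's table. Going by the
numbers above, (3140 + 50) regions x 500 industries is 1,595,000 pairs, so
~32 million rows would correspond to roughly 20 variables per pair; that
figure is inferred, not stated.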

I guess this is essentially a question of row versus column orientation, but
also of whether it is better to split the tables or keep them together.
Are there advantages to either approach?

I will always be accessing data for a specific county, so it seems separate
tables might be better, but is there any reason to go with one giant table
instead?

Thanks for your help.

Nick

