Hi, I'm new to PyTables and I'm having trouble working out how to model data relationships. I realise that PyTables is not a relational database, but there are a few ways to go and I'd like some advice on which is best. I've checked the FAQ and the archives, but if this has been answered before, apologies, and please point me there.
Imagine I am migrating from a SQL database with two tables, Customer and Order (this is not my actual problem, but it is an easy way to think about it). Each customer has many orders. I can see three ways to model this:

1. Have a single denormalised PyTables table in which the customer data is repeated (effectively the join of the two SQL tables).
2. Have many smaller tables in a customer/order hierarchy.
3. Create two tables linked by a customer id (just port the SQL tables to PyTables).

As a novice, the hierarchical focus of PyTables makes me think (2) is the way to go, but I think this has a high metadata cost that bloats the file size. Also, can I get a unified view of the individual customer tables when I want to query orders, i.e. a view of */order?

Those problems suggest that (1) might be better. However, I will then have a lot of repeated data. Querying is easiest, but changing a customer becomes the hard operation: many rows need updating.

(3) seems the worst option. If I want the orders for a set of related customers (say, all customers in a country), I have to run a query that looks for a customer id in a list of customer ids.

Thanks,
--
James
http://casbon.me/
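
To make option (3) concrete, here is a minimal sketch of the two-step lookup described above: first select the customer ids for a country, then keep the orders whose customer id is in that set. It assumes a PyTables 3.x install (the `open_file`/`create_table` spellings) and NumPy. The table names, column names, sample rows and the `orders_for_country` helper are all made up for illustration, and reading the whole orders table into memory is a simplification that would not suit a large file.

```python
import os
import tempfile

import numpy as np
import tables

# Hypothetical sample data; the column names are assumptions for this sketch.
CUSTOMERS = np.array(
    [(1, b"UK"), (2, b"FR"), (3, b"UK")],
    dtype=[("customer_id", "i4"), ("country", "S2")],
)
ORDERS = np.array(
    [(10, 1, 9.5), (11, 2, 20.0), (12, 3, 5.0), (13, 1, 7.25)],
    dtype=[("order_id", "i4"), ("customer_id", "i4"), ("amount", "f8")],
)


def orders_for_country(h5file, country):
    """Return the order rows belonging to customers in `country` (bytes)."""
    # Step 1: in-kernel query on the customers table to collect the ids.
    matching = h5file.root.customers.read_where(
        "country == wanted", condvars={"wanted": country}
    )
    ids = matching["customer_id"]
    # Step 2: the "customer id in a list of ids" test, done in NumPy.
    # This loads every order row, which is only reasonable for small tables.
    all_orders = h5file.root.orders.read()
    return all_orders[np.isin(all_orders["customer_id"], ids)]


def demo():
    """Build a throwaway file with both tables and query it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shop.h5")
        with tables.open_file(path, mode="w") as h5:
            h5.create_table("/", "customers", obj=CUSTOMERS)
            h5.create_table("/", "orders", obj=ORDERS)
            uk_orders = orders_for_country(h5, b"UK")
            return uk_orders["order_id"].tolist()


if __name__ == "__main__":
    print(demo())
```

The membership test in step 2 is the part that worries me about option (3): as far as I can tell it has to happen outside the query condition, so every such lookup becomes two passes.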