On 03/29/12 9:43 AM, Jonathan Bartlett wrote:
1) A large (~150GB) dataset. This data set is mainly static. It is updated, but not by the users (it is updated by our company, which provides the data to users). There are some deletions, but it is safe to consider this an "add-only" database, where only new records are created.

2) A small (~10MB but growing) dataset. This is the user's data. It includes many bookmarks (i.e. foreign keys) into data set #1. However, I am not explicitly using any referential integrity system.

by 'dataset' do you mean a table, aka a relation?

by 'not using any referential integrity', do you mean you're NOT using foreign keys ('REFERENCES table(field)' in your table declarations)?
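
for example, something like this -- table and column names below are made up, since you didn't show your schema:

    CREATE TABLE reference_items (        -- the big ~150GB dataset (#1)
        item_id  bigint PRIMARY KEY,
        payload  text
    );

    CREATE TABLE user_bookmarks (         -- the small per-user dataset (#2)
        bookmark_id bigserial PRIMARY KEY,
        user_name   text NOT NULL,
        item_id     bigint NOT NULL REFERENCES reference_items(item_id)
    );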


Also, many queries cross the datasets together.


by 'cross', do you mean JOIN?
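
e.g., reusing the made-up tables above, something along these lines:

    SELECT b.user_name, r.payload
    FROM   user_bookmarks b
    JOIN   reference_items r ON r.item_id = b.item_id
    WHERE  b.user_name = 'jonathan';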

Now, my issue is that right now when we do updates to the dataset, we have to make them to the live database. I would prefer to manage data releases the way we manage software releases - have a staging area, test the data, and then deploy it to the users. However, I am not sure of the best approach for this. If there weren't lots of crossover queries, I could just shove them into separate databases, and then swap out dataset #1 when we have a new release.


you can't JOIN data across relations (tables) in different databases.
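
if you keep both datasets in the same database, one approach (just a sketch, all the schema names below are invented) is to load each new release of dataset #1 into a staging schema, test it there, and then swap it in with a couple of renames inside one transaction:

    -- load and verify the new release off to the side
    CREATE SCHEMA refdata_staging;
    -- ... bulk-load the new tables into refdata_staging and test them ...

    -- swap the new release in; both renames commit together
    BEGIN;
    ALTER SCHEMA refdata RENAME TO refdata_old;
    ALTER SCHEMA refdata_staging RENAME TO refdata;
    COMMIT;

    -- once you're happy with the new release, drop the old one
    DROP SCHEMA refdata_old CASCADE;

note that sessions holding cached plans against the old tables may need to re-plan or reconnect after the swap.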


--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast

