On Sat, Dec 11, 2010 at 12:18 PM, John Q. Public <sqlite3.u...@gmail.com> wrote: > > Hello, > > Before I can ask my question, I have to present the data I work with. > My goal is to create a (very) simple version control for a unique type of > (text) file. > The file has three parts: > 1. atoms - set of attributes (e.g. name, id, etc). Each version of the file > contains up to thousands of atoms > 2. molecules - has some attributes (e.g. name, id, etc) and a group of atoms > (list of atoms ids). Each molecule can contain zero to hundreds of atoms. > 3. matter - a group of molecules (a matter to molecule is what molecule is > to atom - and no, it's not homework of some kind....) > > ---------------------- EXAMPLE FILE: -------------------- > atoms: > # atr1 atr2 ... atrN > 1 A B C > 2 D B E > 3 D F G > ... > molucules: > # atr1 atr2 ... atrN atom_list > 1 A B C 1 5 6 16 53 102 > 2 D E F 2 7 13 53 104 205 206 207 > 3 G H I 4 > ... > matter: > # atr1 atr2 ... atrN molecules_list > 1 A B C 1 5 6 16 53 102 > 2 D E F 2 7 13 53 104 205 206 207 > 3 G H I 4 > ... > ------------------- END OF EXAMPLE --------------------- > > Each file version is only slightly different from the previous version > (changes in only few atoms, molecules and\or matter). > > I thought on the following scheme (first 6 tables are trivial): > - version table - each raw represent a file that was added > - atom table - each raw represent one atom from a file > - molecules table - each raw represent one molecule from a file (later > I'll connect atoms to molecules) > - matter table - each raw represent one matter from a file (later I'll > connect molecules to matter) > - atom to molecule table - foreign keys from both tables represent the > relation: atom BELONG to molecule > - molecule to matter table - foreign keys from both tables represent the > relation: molecule BELONG to matter > - data to version table - In order not to make 3 different tables > connecting atoms, molecules and matter to version, I will create one table > that contains 2 foreign keys (data_id and version_id) together with another > attribute stating the source table for the data. Now, because versions are > only slightly different, does it make sense to implement the relation: data > DOES NOT BELONG to version ?
I think I would combine atom, molecule. matter into one table and add a 3 record type table. Now there is only 1 table to worry about versioning. I think it is better for some reason, but it will make the code harder to read. I'll confess I don't really know why I would want to do this, and in my next section I am ignoring this idea. I would add a column "like" effective_date datetime - when the version became real. The reason I say "like" is because effect dates are a previously solved problem so is versioning, but I have a feeling you will find more googling for that term. Here is what I found: select tax_rate from t where effective_date = (select max(effective_date) from t where effective_date <= '2002-04 http://forums.devshed.com/postgresql-help-21/how-do-i-return-a-value-from-one-column-based-144349.html atoms: pk id edt atr1 atr2 ... atrN 1 1 1/1/10 A B C 2 2 1/1/10 D B E 3 3 1/1/10 D F G ... molucules: # atr1 atr2 ... atrN atom_list 1 A B C 1 5 6 16 53 102 So molecule #1 contains atom 1 (and more...) Atom 1 changes from ABC to ABD - add this record: pk id edt atr1 atr2 ... atrN 4 1 1/2/10 A B D Notice it gets a new primary key, but keeps the ID. The 1 in atom_list refers to the atom ID. So to find the atom as it was on a certain date, use the above query. follow? > > My second question is about inserting new versions to the db. After parsing > the text file I have an object with all the information from the file. Is > the next step to go attribute by attribute and update the db scheme or > should I create some sort of temporary scheme (containing only data from the > object) and do the inserting by merging tables? further more, is there a > technique to keep db integrity while updating (especially in case of error > in the insert process)? > > As always - thanks for your time and patience > transactions - you don't have to understand the impemtation, but it is something like: buffers all the changes until you commit, then locks, appends, moves some pointers, deletes, unlocks. It has been a while, so this may not be right. but most pleople don't worry about how it works anyway, just that it works. -- Carl K _______________________________________________ DB-SIG maillist - DB-SIG@python.org http://mail.python.org/mailman/listinfo/db-sig