Jonathan Feally wrote:
> I think that we should also add a column that specifies what hashing
> method(s) the hash value is on the row of the part. This would allow 2
> things.

That is a bad idea.

>
> 1) Re-Calculation of the hash value for a change in hashing methods
> could be interrupted and resumed later,

If that is *really* necessary, a simple flag column would be better:

  UPDATE=0
  OKHASH=1
  ACTIVE=2

  # if the previous run wasn't interrupted, queue all blobs for rehash
  if (select count(*) from dbmail_mimeparts where hash_flag=UPDATE) == 0:
      update dbmail_mimeparts set hash_flag=UPDATE where hash_flag=ACTIVE;

  foreach mimepart in select id,data from dbmail_mimeparts where hash_flag=UPDATE
  do
      update_hash(mimepart.id, mimepart.data)
      update dbmail_mimeparts set hash_flag=OKHASH where id=mimepart.id;
  done

  # all hash values are now up-to-date
  update dbmail_mimeparts set hash_flag=ACTIVE where hash_flag=OKHASH;

>
> and
> 2) Message insertion can still occur where as a simple check to see what
> hashing methods are use in the table, a new part to be inserted (or not)
> can be checked against all hashing methods used (old and new) to ensure
> that a duplicate doesn't take place during this migration period.

KISS: shut down incoming mail and dbmail daemons during such a
transition. You won't be doing it that often (if ever). Not worth the
effort and added complexity, imo.

And even better - there is no real downside to allowing access during
such a transition. The worst-case scenario is that identical message parts
get inserted twice using different hash algorithms under different ids.
After completing the transition you would end up with identical blobs
stored with identical hash values under different ids. Big deal.

-- 
Paul Stevens                                      paul at nfg.nl
NET FACILITIES GROUP                     GPG/PGP: 1024D/11F8CD31
The Netherlands                                http://www.nfg.nl
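
For anyone who wants to try the flag-column idea, here is a rough, runnable sketch of the resumable rehash pseudocode above. It is only an illustration under assumptions: it uses an in-memory SQLite table as a stand-in for the real database, the column names `hash` and `hash_flag` and the helpers `new_hash`, `rehash_all` and `demo_db` are made up for the example, and SHA-1/SHA-256 are arbitrary choices for the "old" and "new" hash methods. The actual dbmail schema may well differ.

```python
import hashlib
import sqlite3

UPDATE, OKHASH, ACTIVE = 0, 1, 2  # flag values from the pseudocode


def new_hash(data: bytes) -> str:
    # Hypothetical "new" hashing method; SHA-256 is only an illustration.
    return hashlib.sha256(data).hexdigest()


def rehash_all(conn: sqlite3.Connection, limit=None) -> int:
    """Rehash all parts; can be called again after an interrupted run.

    `limit` caps the rows handled in this call, to simulate an interruption.
    Returns the number of rows rehashed in this call.
    """
    cur = conn.cursor()
    pending = cur.execute(
        "SELECT COUNT(*) FROM dbmail_mimeparts WHERE hash_flag = ?", (UPDATE,)
    ).fetchone()[0]
    if pending == 0:
        # No leftovers from an interrupted run: queue every active blob.
        cur.execute(
            "UPDATE dbmail_mimeparts SET hash_flag = ? WHERE hash_flag = ?",
            (UPDATE, ACTIVE),
        )
        conn.commit()

    rows = cur.execute(
        "SELECT id, data FROM dbmail_mimeparts WHERE hash_flag = ? ORDER BY id",
        (UPDATE,),
    ).fetchall()
    done = 0
    for part_id, data in rows:
        if limit is not None and done >= limit:
            return done  # "interrupted": remaining rows stay flagged UPDATE
        cur.execute(
            "UPDATE dbmail_mimeparts SET hash = ?, hash_flag = ? WHERE id = ?",
            (new_hash(data), OKHASH, part_id),
        )
        conn.commit()  # commit per row so progress survives an interruption
        done += 1

    # All hash values are now up to date.
    cur.execute(
        "UPDATE dbmail_mimeparts SET hash_flag = ? WHERE hash_flag = ?",
        (ACTIVE, OKHASH),
    )
    conn.commit()
    return done


def demo_db() -> sqlite3.Connection:
    # Throwaway table with three parts hashed by the "old" method (SHA-1 here).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dbmail_mimeparts "
        "(id INTEGER PRIMARY KEY, data BLOB, hash TEXT, hash_flag INTEGER)"
    )
    parts = [b"part one", b"part two", b"part three"]
    conn.executemany(
        "INSERT INTO dbmail_mimeparts (data, hash, hash_flag) VALUES (?, ?, ?)",
        [(p, hashlib.sha1(p).hexdigest(), ACTIVE) for p in parts],
    )
    conn.commit()
    return conn
```

Calling `rehash_all(conn, limit=1)` on the demo table and then `rehash_all(conn)` again mimics an interrupted run followed by a resume: the second call sees rows still flagged UPDATE, skips the re-queue step, and finishes the rest before flipping everything back to ACTIVE.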
