On Mon, Jan 9, 2012 at 11:23, Vincent Massol <[email protected]> wrote:
> On Jan 9, 2012, at 11:09 AM, Denis Gervalle wrote:
>
> > On Mon, Jan 9, 2012 at 10:07, Vincent Massol <[email protected]> wrote:
> >
> >> +1 with the following caveats:
> >>
> >> * We need to guarantee that a migration cannot corrupt the DB.
> >
> > The evolution of the migration mechanism was the first step in that direction, since accessing a DB with an inappropriate XWiki core could have corrupted it.
> >
> >> For example imagine that we change a document id but this id is also used in some other tables and the migration stops before it's changed in the other tables. The change needs to be done in transactions for each doc being changed across all tables.
> >
> > That would be nice, but MySQL does not support transactions on MyISAM tables. I use a single transaction for the whole migration process,
>
> I think we should have one transaction per document update instead. We've had this problem in the past when upgrading very large systems. The migration was never going through in one go, for some reason which I have forgotten, so we had to use several transactions so that the migration could be restarted when it failed and could complete.

This could be done easily if you want it. Just note that all the other migrations are single-transaction based, AFAICS.

> > so on systems that support it (Oracle?), the migration will either be applied completely or not at all. But I could not secure MySQL any better than is possible.
>
> It should work fine on MySQL with InnoDB, which we recommend (see http://platform.xwiki.org/xwiki/bin/view/AdminGuide/InstallationMySQL).

I have been on MyISAM myself for a long time, since InnoDB has other drawbacks, and I have not experienced many corruption issues so far. So you can expect others to have a similar setup.

> Thanks
> -Vincent
>
> >> Said differently, the migrator should be allowed to be ctrl-c-ed at any time, and you can safely restart XWiki and the migrator will just carry on from where it was.
> >
> > The migrator will restart where it left off, but the granularity is the document. I process the updates document by document, updating all tables for each one. If there is an issue during the migration, let's say on MySQL, and it is restarted, it will start again, skipping documents that have been converted previously. So any corruption would be limited to a single document.
> >
> >> * OR we need to have a configuration parameter for deciding to run this migration or not so that users run it only when they decide, thus ensuring that they've done the proper backups and saving of DBs.
> >
> > This is true using the new migration procedure, but not as flexible as you seem to expect. Supporting two hashing algorithms is not a feature to me, but an increased risk of causing corruption. Now, if you use a recent core that uses the new ids while, on the other side, you have not activated migrations and access an old DB, you will simply be unable to access the database. You will receive a "db requires migration" exception.
> >
> > Anyway, migrations are disabled by default, and should be enabled by an administrator in xwiki.cfg. The release notes will mention the need to proceed with the migration and, of course, to make a backup before. And you are always supposed to have a backup when you upgrade, or you are not a system admin ;)
> >
> >> I prefer the first option but we need to guarantee it.
> >
> > We will never be able to guarantee it, but I have done my best to make it as secure as possible.
> >
> >> Thanks
> >> -Vincent
> >>
> >> On Jan 7, 2012, at 10:39 PM, Denis Gervalle wrote:
> >>
> >>> Now that the database migration mechanism has been improved, I would like to go ahead with my patch to improve document ids.
> >>>
> >>> Currently, ids are simple string hash codes of a locally serialized document reference, including the language for translated documents.
> >>> The likelihood of having duplicates with the string hashing algorithm of Java is really high.
> >>>
> >>> What I propose is:
> >>>
> >>> 1) use an MD5 hash, which is particularly good at distributing.
> >>> 2) truncate the hash to its first 64 bits, since the XWD_ID column is a 64-bit long.
> >>> 3) use a better string representation as the source of the hash.
> >>>
> >>> Based on previous discussion, points 1) and 2) have already been agreed on, and this vote is in particular about the string used for 3).
> >>> I propose it in two steps:
> >>>
> >>> 1) before locales are fully supported in document references, use this format:
> >>>
> >>> <lengthOfLastSpaceName>:<lastSpaceName><lengthOfDocumentName>:<documentName><lengthOfLanguage>:<language>
> >>>
> >>> where the language would be an empty string for the default document, so it would look like 7:mySpace5:myDoc0: and its French translation would be 7:mySpace5:myDoc2:fr
> >>> 2) when locales are included in references, we will replace the implementation with a reference serializer that produces the same kind of representation, but includes all spaces (not only the last one), to be prepared for the future.
> >>>
> >>> While doing so, I also propose to fix the cache key issue by using the same reference, prefixed by <lengthOfWikiName>:<wikiName>, so the previous examples will have the following keys in the document cache:
> >>> 5:xwiki7:mySpace5:myDoc0: and 5:xwiki7:mySpace5:myDoc2:fr
> >>>
> >>> Using such a key (compared to the usual serialization) has the following advantages:
> >>> - it ensures uniqueness of the reference without requiring a complex escaping algorithm, which is unneeded here
> >>> - it is potentially reversible
> >>> - it is faster than the usual serialization
> >>> - it supports the language
> >>> - it is independent of the current serialization, which may evolve separately; it will therefore be stable over time, which is really important since it is the base of the hashing algorithm used for the document ids stored in the database.
> >>>
> >>> I would like to introduce this as early as possible, which means as soon as we are confident with the recently introduced migration mechanism.
> >>> Since the migration of ids will convert 32-bit hashes into 64-bit ones, the risk of collision is really low, and to be careful, I have written a migration algorithm that supports such collisions (unless they cause a circular reference collision, but this is really unexpected). However, changing ids again later, if we change our mind, will be much riskier and the migration difficult to implement, so it is really important that we agree on the way we compute these ids, once and for all.
> >>>
> >>> Here is my +1,
> >>>
> >>> --
> >>> Denis Gervalle
> >
> > --
> > Denis Gervalle
> > SOFTEC sa - CEO
> > eGuilde sarl - CTO

--
Denis Gervalle
SOFTEC sa - CEO
eGuilde sarl - CTO

_______________________________________________
devs mailing list
[email protected]
http://lists.xwiki.org/mailman/listinfo/devs
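To make the proposed id computation concrete, here is a rough sketch of the length-prefixed key building and the 64-bit truncated MD5 hash discussed in the thread. This is illustrative only: the class and method names are invented, and the actual XWiki implementation may truncate the digest with a different byte order.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DocumentIdSketch {

    // Length-prefixed key used as the hash source, e.g. "7:mySpace5:myDoc0:"
    // (empty language for the default document, "fr" for a French translation).
    static String idKey(String space, String doc, String language) {
        return space.length() + ":" + space
                + doc.length() + ":" + doc
                + language.length() + ":" + language;
    }

    // Cache key: the same representation prefixed with <lengthOfWikiName>:<wikiName>,
    // e.g. "5:xwiki7:mySpace5:myDoc0:".
    static String cacheKey(String wiki, String space, String doc, String language) {
        return wiki.length() + ":" + wiki + idKey(space, doc, language);
    }

    // MD5-hash the key and keep only the first 64 bits (here: first 8 bytes,
    // big-endian -- an assumption), since the XWD_ID column is a 64-bit long.
    static long documentId(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            long id = 0;
            for (int i = 0; i < 8; i++) {
                id = (id << 8) | (digest[i] & 0xFFL);
            }
            return id;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is a mandatory JCA algorithm", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(idKey("mySpace", "myDoc", ""));            // 7:mySpace5:myDoc0:
        System.out.println(idKey("mySpace", "myDoc", "fr"));          // 7:mySpace5:myDoc2:fr
        System.out.println(cacheKey("xwiki", "mySpace", "myDoc", "")); // 5:xwiki7:mySpace5:myDoc0:
        System.out.println(documentId(idKey("mySpace", "myDoc", "")));
    }
}
```

Note how the length prefixes make escaping unnecessary: "ab" + "c" and "a" + "bc" produce different keys (2:ab1:c vs 1:a2:bc), which is the uniqueness property claimed above.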
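The per-document granularity and restart behaviour debated above can also be simulated with a toy sketch, where in-memory maps stand in for the database tables. Everything here is invented for illustration; the real migrator works on the actual schema, committing each document's id change across all tables as one unit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MigrationSketch {

    /**
     * Toy migrator: converts each document's old id to a new one, treating
     * each document as one unit of work (in the real system, one transaction
     * per document across all tables). On a restart after an interruption,
     * documents already converted are simply skipped, so any inconsistency
     * stays confined to a single document.
     */
    static void migrate(Map<String, Long> oldIds, Map<String, Long> newIds, int failAfter) {
        int done = 0;
        for (Map.Entry<String, Long> e : oldIds.entrySet()) {
            if (newIds.containsKey(e.getKey())) {
                continue; // already converted on a previous run: skip
            }
            if (done == failAfter) {
                throw new RuntimeException("simulated crash mid-migration");
            }
            // "commit" this document's new id as one unit
            newIds.put(e.getKey(), e.getValue() + 1_000_000L);
            done++;
        }
    }

    public static void main(String[] args) {
        Map<String, Long> oldIds = new LinkedHashMap<>();
        oldIds.put("xwiki:Main.WebHome", 1L);
        oldIds.put("xwiki:Main.WebHome;fr", 2L);
        oldIds.put("xwiki:Sandbox.Test", 3L);

        Map<String, Long> newIds = new LinkedHashMap<>();
        try {
            migrate(oldIds, newIds, 2); // crash after two documents
        } catch (RuntimeException crash) {
            // restart: the two converted documents are skipped, the third completes
            migrate(oldIds, newIds, Integer.MAX_VALUE);
        }
        System.out.println(newIds.size()); // 3: all documents converted
    }
}
```

This is the trade-off discussed in the thread: one big transaction gives all-or-nothing semantics where the backend supports it, while per-document commits give restartability on very large wikis at the cost of a briefly mixed id state.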

