On Mon, Jan 9, 2012 at 11:44, Vincent Massol <[email protected]> wrote:
>
> On Jan 9, 2012, at 11:36 AM, Denis Gervalle wrote:
>
> > On Mon, Jan 9, 2012 at 11:23, Vincent Massol <[email protected]> wrote:
> >
> >>
> >> On Jan 9, 2012, at 11:09 AM, Denis Gervalle wrote:
> >>
> >>> On Mon, Jan 9, 2012 at 10:07, Vincent Massol <[email protected]> wrote:
> >>>
> >>>> +1 with the following caveats:
> >>>>
> >>>> * We need to guarantee that a migration cannot corrupt the DB.
> >>>
> >>> The evolution of the migration was the first step in that procedure,
> >>> since accessing a DB with an inappropriate XWiki core could have
> >>> corrupted it.
> >>>
> >>>> For example, imagine that we change a document id but this id is also
> >>>> used in some other tables and the migration stops before it's changed
> >>>> in the other tables. The change needs to be done in transactions for
> >>>> each doc being changed across all tables.
> >>>
> >>> That would be nice, but MySQL does not support transactions on ISAM
> >>> tables. I use a single transaction for the whole migration process,
> >>
> >> I think we should have one transaction per document update instead.
> >> We've had this problem in the past when upgrading very large systems.
> >> The migration was never going through in one go for some reason which
> >> I have forgotten, so we needed to use several tx so that the
> >> migration could be restarted when it failed and so that it could
> >> complete.
> >>
> >
> > This could be done easily if you want it so. Just note that all other
> > migrations are single transaction based AFAICS.
>
> I'm pretty sure this isn't the case.
> See R4359XWIKI1459DataMigration and R6079XWIKI1878DataMigration for
> example.
>

From what I see in those, we use a single separate session and transaction
for executing some stuff; I really do not see a partial commit there
between "rows".

>
> Thanks
> -Vincent
>
> >>> so on systems
> >>> that support it (Oracle ?), there will be migration or not.
> >>> But I could not secure MySQL better than is possible.
> >>
> >> It should work fine on MySQL with InnoDB, which we recommend (see
> >> http://platform.xwiki.org/xwiki/bin/view/AdminGuide/InstallationMySQL).
> >>
> >
> > I have been on MyISAM myself for a long time, since there are other
> > drawbacks to using InnoDB.
> > I have not experienced much trouble with corruption up to now. So you
> > could expect others to have a similar setup.
> >
> >
> >>
> >> Thanks
> >> -Vincent
> >>
> >>>> Said differently the migrator should be allowed to be ctrl-c-ed at any
> >>>> time and you safely restart xwiki and the migrator will just carry on
> >>>> from where it was.
> >>>>
> >>>
> >>> The migrator will restart where it left off, but the granularity is the
> >>> document. I process the updates document by document, updating all
> >>> tables for each one. If there is some issue during the migration, let's
> >>> say on MySQL, and it is restarted, it will start again, skipping
> >>> documents that have been converted previously. So the corruption could
> >>> be limited to a single document.
> >>>
> >>>
> >>>> * OR we need to have a configuration parameter for deciding to run
> >>>> this migration or not so that users run it only when they decide, thus
> >>>> ensuring that they've done the proper backups and saving of DBs.
> >>>>
> >>>
> >>> This is true using the new migration procedure, but not as flexible as
> >>> you seem to expect. Supporting two hashing algorithms is not a feature,
> >>> but an augmented risk of causing corruption for me.
> >>> Now, if you use a recent core that uses the new ids, and on the other
> >>> side you have not activated migrations and access an old db, you will
> >>> simply be unable to access the database. You will receive a "db require
> >>> migration" exception.
> >>>
> >>> Anyway, migrations are disabled by default, and should be enabled by an
> >>> administrator in xwiki.cfg.
> >>> The release notes will mention the need to
> >>> proceed with it, and of course, to make a backup before. And you are
> >>> always supposed to have a backup when you upgrade, or you are not a
> >>> system admin ;)
> >>>
> >>>
> >>>> I prefer the first option but we need to guarantee it.
> >>>>
> >>>
> >>> We will never be able to guarantee it, but I have done my best to make
> >>> it as secure as possible.
> >>>
> >>>
> >>>>
> >>>> Thanks
> >>>> -Vincent
> >>>>
> >>>> On Jan 7, 2012, at 10:39 PM, Denis Gervalle wrote:
> >>>>
> >>>>> Now that the database migration mechanism has been improved, I would
> >>>>> like to go ahead with my patch to improve document ids.
> >>>>>
> >>>>> Currently, ids are the simple string hashcode of a locally serialized
> >>>>> document reference, including the language for translated documents.
> >>>>> The likelihood of having duplicates with the string hashing algorithm
> >>>>> of Java is really high.
> >>>>>
> >>>>> What I propose is:
> >>>>>
> >>>>> 1) use an MD5 hashing which is particularly good at distributing.
> >>>>> 2) truncate the hash to the first 64 bits, since the XWD_ID column is
> >>>>> a 64-bit long.
> >>>>> 3) use a better string representation as the source of hashing
> >>>>>
> >>>>> Based on previous discussion, points 1) and 2) have already been
> >>>>> agreed, and this vote is in particular about the string used for 3).
> >>>>> I propose it in 2 steps:
> >>>>>
> >>>>> 1) before locales are fully supported in document references, use this
> >>>>> format:
> >>>>>
> >>>>> <lengthOfLastSpaceName>:<lastSpaceName><lengthOfDocumentName>:<documentName><lengthOfLanguage>:<language>
> >>>>>
> >>>>> where language would be an empty string for the default document, so
> >>>>> it would look like 7:mySpace5:myDoc0: and its French translation could
> >>>>> be 7:mySpace5:myDoc2:fr
> >>>>> 2) when locales are included in references, we will replace the
> >>>>> implementation by a reference serializer that would produce the same
> >>>>> kind of representation, but that will include all spaces (not only the
> >>>>> last one), to be prepared for the future.
> >>>>>
> >>>>> While doing so, I also propose to fix the cache key issue by using the
> >>>>> same reference, but prefixed by <lengthOfWikiName>:<wikiName>, so the
> >>>>> previous examples will have the following keys in the document cache:
> >>>>> 5:xwiki7:mySpace5:myDoc0: and 5:xwiki7:mySpace5:myDoc2:fr
> >>>>>
> >>>>> Using such a key (compared to the usual serialization) has the
> >>>>> following advantages:
> >>>>> - ensures uniqueness of the reference without requiring a complex
> >>>>> escaping algorithm, which is unneeded here.
> >>>>> - potentially reversible
> >>>>> - faster than the usual serialization
> >>>>> - supports language
> >>>>> - independent of the current serialization, which may evolve
> >>>>> independently, so it will be stable over time, which is really
> >>>>> important when it is used as a base for the hashing algorithm used for
> >>>>> document ids stored in the database.
> >>>>>
> >>>>> I would like to introduce this as early as possible, which means as
> >>>>> soon as we are confident with the migration mechanism recently
> >>>>> introduced.
> >>>>> Since the migration of ids will convert 32-bit hashes into 64-bit
> >>>>> ones, the risk of collision is really low, and to be careful, I have
> >>>>> written a migration algorithm that would support such collisions
> >>>>> (unless they cause a circular reference collision, but this is really
> >>>>> unexpected). However, changing ids again later, if we change our mind,
> >>>>> will be much more risky and the migration difficult to implement, so
> >>>>> it is really important that we agree on the way we compute these ids,
> >>>>> once and for all.
> >>>>>
> >>>>> Here is my +1,
> >>>>>
> >>>>> --
> >>>>> Denis Gervalle
> _______________________________________________
> devs mailing list
> [email protected]
> http://lists.xwiki.org/mailman/listinfo/devs
>

--
Denis Gervalle
SOFTEC sa - CEO
eGuilde sarl - CTO
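
For readers of the archive, here is a minimal sketch of the id computation proposed in the quoted Jan 7 message: build the length-prefixed key, hash it with MD5, and keep the first 64 bits. It is an illustration only and not the code of the actual patch. The class and method names are hypothetical, and three details are assumptions the thread does not specify: the key is encoded as UTF-8 before hashing, the lengths count Java chars, and the first 8 digest bytes are read in big-endian order.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of XWiki: illustrates the proposed key and id scheme.
final class DocumentIdSketch {

    // Appends one length-prefixed component, e.g. "mySpace" becomes "7:mySpace".
    // Assumption: the length is counted in Java chars (UTF-16 code units).
    static void appendPart(StringBuilder sb, String part) {
        sb.append(part.length()).append(':').append(part);
    }

    // Key used as the hash source: last space, document name, language.
    // The language is an empty string for the default document.
    static String localKey(String space, String name, String language) {
        StringBuilder sb = new StringBuilder();
        appendPart(sb, space);
        appendPart(sb, name);
        appendPart(sb, language);
        return sb.toString();
    }

    // Cache key: the same key, prefixed by the length-prefixed wiki name.
    static String cacheKey(String wiki, String space, String name, String language) {
        StringBuilder sb = new StringBuilder();
        appendPart(sb, wiki);
        return sb.append(localKey(space, name, language)).toString();
    }

    // 64-bit id: the first 8 bytes of the MD5 digest of the key.
    // Assumptions: UTF-8 encoding of the key, big-endian byte order.
    static long documentId(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                .digest(key.getBytes(StandardCharsets.UTF_8));
            long id = 0;
            for (int i = 0; i < 8; i++) {
                id = (id << 8) | (digest[i] & 0xFF);
            }
            return id;
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        String key = localKey("mySpace", "myDoc", "fr");
        System.out.println(key);                                      // 7:mySpace5:myDoc2:fr
        System.out.println(cacheKey("xwiki", "mySpace", "myDoc", "")); // 5:xwiki7:mySpace5:myDoc0:
        System.out.println(documentId(key));
    }
}
```

Because each component carries its own length, two different (space, name, language) triples cannot serialize to the same key, which is why no escaping of ':' or other separators is needed.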

