On Mon, Jan 9, 2012 at 11:44, Vincent Massol <[email protected]> wrote:

>
> On Jan 9, 2012, at 11:36 AM, Denis Gervalle wrote:
>
> > On Mon, Jan 9, 2012 at 11:23, Vincent Massol <[email protected]> wrote:
> >
> >>
> >> On Jan 9, 2012, at 11:09 AM, Denis Gervalle wrote:
> >>
> >>> On Mon, Jan 9, 2012 at 10:07, Vincent Massol <[email protected]>
> wrote:
> >>>
> >>>> +1 with the following caveats:
> >>>>
> >>>> * We need to guarantee that a migration cannot corrupt the DB.
> >>>
> >>>
> >>> The evolution of the migration mechanism was the first step in that
> >>> direction, since accessing a DB with an inappropriate XWiki core could
> >>> have corrupted it.
> >>>
> >>>
> >>>> For example, imagine that we change a document id, but this id is also
> >>>> used in some other tables and the migration stops before it is changed
> >>>> in the other tables. The change needs to be done in a transaction for
> >>>> each doc being changed, across all tables.
> >>>
> >>>
> >>> That would be nice, but MySQL does not support transactions on MyISAM
> >>> tables.
> >>> I use a single transaction for the whole migration process,
> >>
> >> I think we should have one transaction per document update instead.
> >> We've had this problem in the past when upgrading very large systems.
> >> The migration was never going through in one go, for some reason I have
> >> forgotten, so we needed to use several transactions so that the
> >> migration could be restarted when it failed and eventually complete.
> >>
> >
> > This could easily be done if you want. Just note that all other
> > migrations are single-transaction based, AFAICS.
>
> I'm pretty sure this isn't the case.
> See R4359XWIKI1459DataMigration and R6079XWIKI1878DataMigration for
> example.
>

From what I see in those, we use a single separate session and transaction
for executing some operations; I really do not see a partial commit there
between "rows".
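
For what it's worth, the "one transaction per document" idea could be sketched with this toy in-memory model (class and table names are made up for illustration; a real implementation would open one Hibernate/JDBC transaction per document instead of snapshotting maps):

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory sketch of per-document transactions for an id migration:
// all tables referencing a document's id are updated together, and a failure
// rolls back only that document. Illustrative names, not actual XWiki APIs.
class PerDocumentMigration {
    // Simulated tables: table name -> (document id -> row payload)
    final Map<String, Map<Long, String>> tables = new HashMap<>();

    void migrateDocument(long oldId, long newId) {
        // Snapshot the affected rows so this document alone can be rolled back.
        Map<String, String> snapshot = new HashMap<>();
        for (Map.Entry<String, Map<Long, String>> t : tables.entrySet()) {
            if (t.getValue().containsKey(oldId)) {
                snapshot.put(t.getKey(), t.getValue().get(oldId));
            }
        }
        try {
            for (String table : snapshot.keySet()) {
                Map<Long, String> rows = tables.get(table);
                rows.put(newId, rows.remove(oldId)); // re-key the row to the new id
            }
            // "commit": every table now references the new id for this document
        } catch (RuntimeException e) {
            // "rollback": restore this document's rows; other documents are untouched
            for (Map.Entry<String, String> s : snapshot.entrySet()) {
                tables.get(s.getKey()).remove(newId);
                tables.get(s.getKey()).put(oldId, s.getValue());
            }
            throw e;
        }
    }
}
```

With this granularity, a crash between documents leaves the DB consistent: every document is either fully on the old id or fully on the new one.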


>
> Thanks
> -Vincent
>
> >>> so on systems
> >>> that support it (Oracle?), either the whole migration happens or none
> >>> of it does. But I could not secure MySQL better than it is possible
> >>> to.
> >>
> >> It should work fine on MySQL with InnoDB, which we recommend (see
> >> http://platform.xwiki.org/xwiki/bin/view/AdminGuide/InstallationMySQL).
> >>
> >
> > I have been on MyISAM for a long time, since there are other drawbacks to
> > using InnoDB.
> > I have not experienced many corruption issues so far. So you can expect
> > others to have a similar setup.
> >
> >
> >>
> >> Thanks
> >> -Vincent
> >>
> >>>> Said differently, the migrator should be allowed to be ctrl-c-ed at
> >>>> any time; you can safely restart XWiki and the migrator will just
> >>>> carry on from where it was.
> >>>>
> >>>
> >>> The migrator will restart where it left off, but the granularity is the
> >>> document. I process the updates document by document, updating all
> >>> tables for each one. If there is some issue during the migration, say
> >>> on MySQL, and it is restarted, it will start again, skipping documents
> >>> that were converted previously. So any corruption would be limited to a
> >>> single document.
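
The restart behavior described above could be sketched like this (a toy model with made-up names, not the actual XWiki migrator; ids are represented as a plain set standing in for the XWD_ID column):

```java
import java.util.Map;
import java.util.Set;

// Toy sketch of a restartable id migration: documents already carrying their
// new 64-bit id are skipped, so after a crash the migration can simply be
// rerun and it only converts what is still on an old id.
// (Illustrative names; not the actual XWiki migrator code.)
class RestartableIdMigration {
    /**
     * @param oldToNewId mapping from old 32-bit-derived ids to new 64-bit ids
     * @param storedIds  the ids currently stored in the DB (mutated in place)
     * @return the number of documents converted during this run
     */
    static int migrate(Map<Long, Long> oldToNewId, Set<Long> storedIds) {
        int converted = 0;
        for (Map.Entry<Long, Long> e : oldToNewId.entrySet()) {
            if (!storedIds.contains(e.getKey())) {
                continue; // already converted on a previous run: skip
            }
            storedIds.remove(e.getKey());
            storedIds.add(e.getValue()); // "commit" point for this document
            converted++;
        }
        return converted;
    }
}
```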
> >>>
> >>>
> >>>> * OR we need to have a configuration parameter for deciding whether to
> >>>> run this migration, so that users run it only when they decide to,
> >>>> thus ensuring that they have made the proper backups of their DBs.
> >>>>
> >>>
> >>> This is true using the new migration procedure, but it is not as
> >>> flexible as you seem to expect. Supporting two hashing algorithms is
> >>> not a feature, but an increased risk of corruption, in my opinion.
> >>> Now, if you use a recent core that uses the new ids, and on the other
> >>> side you have not activated migrations and access an old DB, you will
> >>> simply be unable to access the database. You will receive a "db require
> >>> migration" exception.
> >>>
> >>> Anyway, migrations are disabled by default and should be enabled by an
> >>> administrator in xwiki.cfg. The release notes will mention the need to
> >>> proceed with it and, of course, to make a backup beforehand. And you
> >>> are always supposed to have a backup when you upgrade, or you are not a
> >>> system admin ;)
> >>>
> >>>
> >>>> I prefer the first option but we need to guarantee it.
> >>>>
> >>>
> >>> We will never be able to guarantee it, but I have done my best to make
> >>> it as secure as possible.
> >>>
> >>>
> >>>>
> >>>> Thanks
> >>>> -Vincent
> >>>>
> >>>> On Jan 7, 2012, at 10:39 PM, Denis Gervalle wrote:
> >>>>
> >>>>> Now that the database migration mechanism has been improved, I would
> >> like
> >>>>> to go ahead with my patch to improve document ids.
> >>>>>
> >>>>> Currently, ids are the simple string hashcode of a locally serialized
> >>>>> document reference, including the language for translated documents.
> >>>>> The likelihood of having duplicates with Java's string hashing
> >>>>> algorithm is really high.
> >>>>>
> >>>>> What I propose is:
> >>>>>
> >>>>> 1) use MD5 hashing, which is particularly good at distributing.
> >>>>> 2) truncate the hash to its first 64 bits, since the XWD_ID column is
> >>>>> a 64-bit long.
> >>>>> 3) use a better string representation as the source of the hash.
> >>>>>
> >>>>> Based on previous discussions, points 1) and 2) have already been
> >>>>> agreed on, and this vote is in particular about the string used for
> >>>>> 3).
> >>>>> I propose it in 2 steps:
> >>>>>
> >>>>> 1) before locales are fully supported in document references, use
> >>>>> this format:
> >>>>>
> >>>>> <lengthOfLastSpaceName>:<lastSpaceName><lengthOfDocumentName>:<documentName><lengthOfLanguage>:<language>
> >>>>>
> >>>>> where language would be an empty string for the default document, so
> >>>>> it would look like 7:mySpace5:myDoc0: and its French translation
> >>>>> would be 7:mySpace5:myDoc2:fr
> >>>>> 2) when locales are included in references, we will replace the
> >>>>> implementation with a reference serializer that produces the same
> >>>>> kind of representation, but that includes all spaces (not only the
> >>>>> last one), to be prepared for the future.
> >>>>>
> >>>>> While doing so, I also propose to fix the cache key issue by using
> >>>>> the same reference, but prefixed with <lengthOfWikiName>:<wikiName>,
> >>>>> so the previous examples would have the following keys in the
> >>>>> document cache:
> >>>>> 5:xwiki7:mySpace5:myDoc0: and 5:xwiki7:mySpace5:myDoc2:fr
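
The proposed length-prefixed keys could be built roughly like this (a hypothetical helper, not the actual serializer; the length prefix is what makes escaping unnecessary, since no component can be confused with its neighbor):

```java
// Sketch of the proposed length-prefixed key format: every component is
// written as <length>:<value>, e.g. "7:mySpace5:myDoc0:".
// (Illustrative class/method names, not actual XWiki code.)
class LocalKeyBuilder {
    /** Key used as the source of the document id hash. */
    static String localKey(String space, String doc, String language) {
        return space.length() + ":" + space
             + doc.length() + ":" + doc
             + language.length() + ":" + language;
    }

    /** Cache key: the local key prefixed with <lengthOfWikiName>:<wikiName>. */
    static String cacheKey(String wiki, String space, String doc, String language) {
        return wiki.length() + ":" + wiki + localKey(space, doc, language);
    }
}
```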
> >>>>>
> >>>>> Using such a key (compared to the usual serialization) has the
> >>>>> following advantages:
> >>>>> - ensures uniqueness of the reference without requiring a complex
> >>>>> escaping algorithm, which is unneeded here
> >>>>> - potentially reversible
> >>>>> - faster than the usual serialization
> >>>>> - supports language
> >>>>> - independent of the current serialization, which may evolve
> >>>>> independently, so it will be stable over time, which is really
> >>>>> important since it is used as the base for the hashing algorithm that
> >>>>> produces the document ids stored in the database
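
Applied to such a key, points 1) and 2) could look roughly like this (a sketch using the JDK's MessageDigest, not the actual XWiki implementation):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of the proposed document id: MD5 of the length-prefixed key,
// truncated to its first 64 bits so it fits the XWD_ID long column.
// (Illustrative name; not the actual XWiki code.)
class DocumentIdHasher {
    static long hash(String localKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(localKey.getBytes(StandardCharsets.UTF_8)); // 16 bytes
            return ByteBuffer.wrap(digest).getLong(); // first 8 bytes = first 64 bits
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is mandated by the JDK spec", e);
        }
    }
}
```

Because the key itself is unambiguous, two distinct references can only collide if their truncated MD5 hashes collide, which is what makes the 64-bit space so much safer than Java's 32-bit String.hashCode().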
> >>>>>
> >>>>> I would like to introduce this as early as possible, which means as
> >>>>> soon as we are confident in the migration mechanism recently
> >>>>> introduced.
> >>>>> Since the migration of ids will convert 32-bit hashes into 64-bit
> >>>>> ones, the risk of collision is really low, and to be careful, I have
> >>>>> written a migration algorithm that supports such collisions (unless
> >>>>> they cause a circular reference collision, but this is really
> >>>>> unexpected). However, changing ids again later, if we change our
> >>>>> mind, will be much more risky and the migration difficult to
> >>>>> implement, so it is really important that we agree on the way we
> >>>>> compute these ids, once and for all.
> >>>>>
> >>>>> Here is my +1,
> >>>>>
> >>>>> --
> >>>>> Denis Gervalle
> _______________________________________________
> devs mailing list
> [email protected]
> http://lists.xwiki.org/mailman/listinfo/devs
>



-- 
Denis Gervalle
SOFTEC sa - CEO
eGuilde sarl - CTO