Hi Tibor,
Tibor Simko wrote:
> On Thu, 10 Dec 2009, Benoit Thiell wrote:
> > Additions to the latest Makefile update script (currently v0.99.1) are
> > made regularly, but differences between the code and the MySQL table
> > updating scripts remain.
> I'm not sure it is worth keeping these targets maintained for every
> moment in time during the unstable period.
I think it is. It is a small task for the developer, and it would prevent
forgotten updates such as the BibIndex problem that I pointed out.
> What is often done, though, is that the configure.ac development version
> number (YYYYMMDD) is bumped in case of significant DB table changes or
> updates to the file organization. So one can in principle judge by
> checking whether the dev version number in configure.ac has changed, and
> then diff tabcreate.sql between the two given bleeding-edge dates.
> That was a rough description of the current practice. As for how to
> improve it:
> > * Reject all commits that touch the database structure without
> >   updating the Makefile.
> This may not be practical. What if some changes are partially taken
> back? Some conditionals would have to be maintained for people who have
> done git/master updates prior to YYYYMMDD1 but not later than
> YYYYMMDD2, etc.
You're right. But some (minor) changes in the code were not mirrored in
the Makefile. I still think that the Makefile should be updated; however,
the update script would become valid only when a release happens. People
running a git test server (probably not many out there) would then need
to be warned to verify their database structure when applying commits.
> > * Using the DROP TABLE statement in the update scripts should be
> >   allowed only if the table to be removed will actually disappear from
> >   the system (cf. update-v0.92.1-tables). Table definitions should be
> >   updated, not recreated; people might have useful information in
> >   there.
> That DROP TABLE statement concerned only an internal indexed ranking
> data structure. It was necessary to drop it because of a change in the
> citation handling. The RELEASE-NOTES instructions advise people to
> rerun indexing in such cases. So no valuable data is lost here, just an
> internal structure that is regenerated as needed as part of the upgrade.
OK, I wasn't aware of that change.
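On the update-rather-than-recreate point, here is a minimal illustration of why ALTER is safer than DROP + CREATE in an upgrade script (sqlite3 stands in for MySQL; the table and column names are invented for the example, they are not Invenio's):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table holding user data that an upgrade must not lose.
cur.execute("CREATE TABLE user_basket (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("INSERT INTO user_basket (name) VALUES ('thesis refs')")

# A destructive upgrade would discard the row:
#   DROP TABLE user_basket;
#   CREATE TABLE user_basket (id INTEGER PRIMARY KEY, name TEXT, public INTEGER);

# The non-destructive upgrade keeps existing rows intact:
cur.execute("ALTER TABLE user_basket ADD COLUMN public INTEGER DEFAULT 0")

print(cur.execute("SELECT name, public FROM user_basket").fetchall())
# -> [('thesis refs', 0)]
```

The exception Tibor describes (purely derived data that a reindexing run regenerates) is exactly the case where DROP remains acceptable.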
> > * Create an automated database updating system that would rely on an
> >   internal database version number.
> Yes, that would be nice indeed. But it may not be practical to maintain
> this fully for all the in-between-stable-releases periods anyway (see
> above).
I got your previous point and agree with it. But I think that it would
be nice to have it for the stable releases. It would make things much
more user-friendly.
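For stable releases, such a scheme could be quite small. A minimal sketch of what I have in mind, with an invented `schema_version` table and sqlite3 standing in for MySQL (none of the table names or statements below are Invenio's actual ones):

```python
import sqlite3

# Ordered upgrade steps: each entry maps a target schema version to the
# SQL statement that brings the database up to that version.
MIGRATIONS = [
    (1, "CREATE TABLE demo_tbl (id INTEGER PRIMARY KEY)"),
    (2, "ALTER TABLE demo_tbl ADD COLUMN status TEXT DEFAULT ''"),
]

def current_version(cur):
    """Return the highest applied schema version, 0 if none."""
    cur.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = cur.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0

def upgrade(conn):
    """Apply every migration newer than the stored version."""
    cur = conn.cursor()
    start = current_version(cur)
    for version, sql in MIGRATIONS:
        if version > start:
            cur.execute(sql)
            cur.execute("INSERT INTO schema_version VALUES (?)", (version,))
    conn.commit()
    return current_version(cur)

conn = sqlite3.connect(":memory:")
print(upgrade(conn))  # applies both steps
print(upgrade(conn))  # a second run finds nothing left to apply
```

The stored version number is what makes the upgrade idempotent: running the script twice is harmless, which is exactly what a user-friendly release upgrade needs.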
> Which brings me to an analogous alternative:
> * One of the roots of these problems is that we allow a long time in
>   between releases, accumulating non-trivial DB structure updates. We
>   should rather ``release early, release often''.
That sounds really nice. I can't wait to get to version 1.0 to start
applying that scheme.
> P.S. I hope you have not lost stuff. Are there still some problems on
> your site?
Just the citation dictionaries, which is not a big deal but requires
three days of heavy computing to recreate. Fortunately, we made a backup
of the database before I started working on it, so no harm was done.
> P.S. BTW, speaking of table updates, I think 7M records necessitated
> changing the columns in the bibrec/bibrec_bibxxx/bibxxx tables from
> MEDIUMINT to INT (or maybe BIGINT, but probably not).
Yes, we discussed that already. I included the ADS makefile in our trunk
but it is not yet callable with inveniocfg. I will do that.
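For reference, the back-of-the-envelope arithmetic behind the MEDIUMINT point (my own check of the MySQL type ranges, not a statement about the actual column definitions):

```python
# MySQL integer column ranges.
mediumint_signed_max = 2**23 - 1    # 8,388,607
mediumint_unsigned_max = 2**24 - 1  # 16,777,215
int_unsigned_max = 2**32 - 1        # 4,294,967,295

records = 7_000_000

# 7M record ids already use most of a signed MEDIUMINT's range:
print(round(100 * records / mediumint_signed_max))  # -> 83

# and the bibrec_bibxxx link tables grow much faster than the record
# count, so INT leaves ample headroom while BIGINT is likely overkill:
print(records <= int_unsigned_max)  # -> True
```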
Best regards,
Benoit.