Hello Jukka,

Sorry for the late answer, at the end of a long day of investigation ;-)
First, thank you for the quick reply, your explanations put us on the
right track.

The short version is: it seems the culprit was in fact the MySQL database,
although we still don't understand how that's even possible.
The long version is in the text below.


On 31 Mar 2009, at 01:34, Jukka Zitting wrote:
> Seems weird. A save() call can only succeed if the underlying
> persistence database has successfully committed the changes. Does your
> log show any exceptions or problems related to save() calls?

We've analyzed the logs further but never found any exception or problem
during a save operation, which is reassuring. I guess the "broken pipe"
exceptions occur after each period of database inactivity (when closing
statements?), but whenever new data needs to be saved, the persistence
manager (PM) manages to establish a new connection.


> Jackrabbit does cache content, but the content from one session only
> becomes available to another session after it has been successfully
> persisted.

Indeed, after further tests that seems to be the case. Great to know.


> So unless you only ever used a single session, there should
> be no way for your data to be kept *only* in a cache unless the
> underlying database was seriously broken (i.e. it returned success for
> uncompleted commits).

Well, it looks like the database *was* seriously broken. The db files on
the disk had not been touched in months (making our backups useless),
however the binary logs did contain all the jackrabbit transactions!
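
In hindsight, a simple freshness check on the data directory would have
caught this months earlier. Here is a minimal sketch of what we have in
mind, assuming that "files not modified for N days" is a useful warning
sign for a busy database. The function name, the threshold and the
directory layout are our own invention, and modification times are only a
heuristic, not proof that the server is writing where it should.

```python
import os
import time


def stale_files(data_dir, max_age_days=1.0, now=None):
    """Return sorted (path, age_in_days) pairs for files under data_dir
    whose modification time is older than max_age_days."""
    now = time.time() if now is None else now
    stale = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            path = os.path.join(root, name)
            # Age of the last modification, converted from seconds to days.
            age_days = (now - os.path.getmtime(path)) / 86400.0
            if age_days > max_age_days:
                stale.append((path, age_days))
    return sorted(stale)
```

Run from cron against the directory of a database that is written to
daily, a non-empty result would be a reason to look closer before trusting
file-level backups.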

So it looks like MySQL had been writing to *invisible* database files
somewhere (perhaps files that had been unlinked but were still open), or
not writing at all. Then it suddenly recovered yesterday, only to
synchronize with the *old* db files on the disk. We don't know what
triggered that, as the only noticeable thing that happened to the MySQL
server yesterday is that the disk where the binary logs are written
became full for a little while.
The MySQL server's uptime is 300+ days (and counting), and we have
never altered its config. Moreover, our other MySQL databases on the
same server did not experience the same issue.
So that's still a mystery... (MySQL 5.0.44 on Gentoo 2.6.24.2)

Fortunately we also had backups of the binary logs, so we restored older
database dumps and then replayed the entire range of binary logs since
June 2008 on top of them.
And yes, after restarting jackrabbit-server everything is back!
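
For anyone who ends up in the same situation, the one part of the replay
worth scripting is getting the logs in the right order. The sketch below
is not our exact procedure, just an illustration: it sorts binary log
files by their numeric suffix and builds the argument list for a single
mysqlbinlog run, whose output would then be piped into the mysql client
after the dump has been restored. The "mysql-bin" base name is an
assumption (it depends on the server's log-bin setting), and both function
names are hypothetical.

```python
import re


def ordered_binlogs(filenames):
    """Sort binary log names by numeric suffix, skipping anything else
    (for example the .index file that sits next to the logs)."""
    pattern = re.compile(r"^(.+)\.(\d+)$")
    numbered = []
    for name in filenames:
        match = pattern.match(name)
        if match:
            numbered.append((match.group(1), int(match.group(2)), name))
    return [name for _base, _seq, name in sorted(numbered)]


def replay_command(filenames):
    """Build the argument list for one mysqlbinlog invocation covering
    all logs, so they are decoded in sequence by a single process."""
    return ["mysqlbinlog"] + ordered_binlogs(filenames)
```

For example, replay_command(os.listdir(binlog_dir)) gives a list that can
be handed to subprocess, with its standard output connected to a mysql
client process.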


Thanks a lot for your help!

Cheers,

--
Olivier Dony
