Lucene is not fsync'ing files on commit
---------------------------------------
Key: LUCENE-3418
URL: https://issues.apache.org/jira/browse/LUCENE-3418
Project: Lucene - Java
Issue Type: Bug
Components: core/store
Affects Versions: 3.3, 3.2, 3.1, 3.4, 4.0
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Blocker
Fix For: 3.4, 4.0
Thanks to hurricane Irene, when Mark's electricity became unreliable, he
discovered that on power loss Lucene could easily corrupt the index, which of
course should never happen...
I was able to easily repro, by pulling the plug on an Ubuntu box during
indexing. On digging, I discovered, to my horror, that Lucene is failing to
fsync any files, ever!
This bug was unfortunately created when we committed LUCENE-2328... that issue
added tracking, in FSDir, of which files have been closed but not sync'd, so
that when sync is called during IW.commit we only sync those files that haven't
already been sync'd.
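To make the tracking concrete, here is a minimal sketch of the idea. It is not
the actual FSDir code: the class StaleFileSyncSketch and its methods markStale,
sync and demo are hypothetical names for illustration. It assumes the fsync is
done via RandomAccessFile.getFD().sync(). A file is only fsync'd if it was
previously recorded as closed-but-unsynced:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the directory's "closed but not yet sync'd" tracking.
class StaleFileSyncSketch {
    private final File dir;
    private final Set<String> staleFiles = new HashSet<String>();

    StaleFileSyncSketch(File dir) {
        this.dir = dir;
    }

    // Would be called from the output's close(); records that the file needs an fsync.
    synchronized void markStale(String name) {
        staleFiles.add(name);
    }

    // fsyncs only the requested files that are still marked stale; returns how many were sync'd.
    synchronized int sync(Collection<String> names) throws IOException {
        int synced = 0;
        for (String name : names) {
            if (!staleFiles.contains(name)) {
                continue; // never marked stale (or already sync'd): skipped
            }
            RandomAccessFile raf = new RandomAccessFile(new File(dir, name), "rw");
            try {
                raf.getFD().sync(); // ask the OS to flush this file to stable storage
            } finally {
                raf.close();
            }
            staleFiles.remove(name);
            synced++;
        }
        return synced;
    }

    // Returns {first sync of a tracked file, repeat sync, sync of an untracked file}.
    static int[] demo() {
        try {
            File tmp = Files.createTempDirectory("stale-sync-sketch").toFile();
            Files.write(new File(tmp, "tracked.bin").toPath(), new byte[] {1, 2, 3});
            Files.write(new File(tmp, "untracked.bin").toPath(), new byte[] {4, 5, 6});
            StaleFileSyncSketch d = new StaleFileSyncSketch(tmp);
            d.markStale("tracked.bin");
            int first = d.sync(Arrays.asList("tracked.bin"));
            int second = d.sync(Arrays.asList("tracked.bin"));
            int untracked = d.sync(Arrays.asList("untracked.bin"));
            return new int[] {first, second, untracked};
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The untracked case is the important one here: if close() never reports a file,
sync silently skips it.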
That tracking is done via the FSDir.onIndexOutputClosed callback, called when
an FSIndexOutput is closed. The bug is that we only call it on exception
during close:
{noformat}
@Override
public void close() throws IOException {
  // only close the file if it has not been closed yet
  if (isOpen) {
    boolean success = false;
    try {
      super.close();
      success = true;
    } finally {
      isOpen = false;
      if (!success) {
        try {
          file.close();
          parent.onIndexOutputClosed(this);
        } catch (Throwable t) {
          // Suppress so we don't mask original exception
        }
      } else
        file.close();
    }
  }
}
{noformat}
And so FSDir thinks no files need syncing when its sync method is called....
I think instead we should call it up-front; better to over-sync than under-sync.
The fix is trivial (move the callback up-front), but I'd love to somehow have a
test that can catch such a bad regression in the future.... still I think we
can do that test separately and commit this fix first.
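A rough sketch of what "call it up-front" means, using simplified stand-in
classes rather than the real FSDir/FSIndexOutput (TrackingDir, TrackingOutput,
failOnFlush and staleAfterClose are hypothetical names, and this is not the
committed patch). The callback runs before anything that can throw, so the
file is tracked whether or not close succeeds:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

class CloseCallbackSketch {

    // Hypothetical stand-in for the directory that tracks files needing sync.
    static class TrackingDir {
        final Set<String> staleFiles = new HashSet<String>();

        void onIndexOutputClosed(TrackingOutput out) {
            staleFiles.add(out.name);
        }
    }

    // Hypothetical stand-in for the index output; failOnFlush simulates an I/O error.
    static class TrackingOutput implements Closeable {
        final TrackingDir parent;
        final String name;
        final boolean failOnFlush;
        boolean isOpen = true;

        TrackingOutput(TrackingDir parent, String name, boolean failOnFlush) {
            this.parent = parent;
            this.name = name;
            this.failOnFlush = failOnFlush;
        }

        void flush() throws IOException {
            if (failOnFlush) {
                throw new IOException("simulated flush failure");
            }
        }

        @Override
        public void close() throws IOException {
            // only close the file if it has not been closed yet
            if (isOpen) {
                // Up-front: record the file before any step that might throw.
                parent.onIndexOutputClosed(this);
                try {
                    flush();
                } finally {
                    isOpen = false;
                }
            }
        }
    }

    // Closes one output and returns the files the directory now considers stale.
    static Set<String> staleAfterClose(boolean failOnFlush) {
        TrackingDir dir = new TrackingDir();
        TrackingOutput out = new TrackingOutput(dir, "_0.fdt", failOnFlush);
        try {
            out.close();
        } catch (IOException e) {
            // expected when failOnFlush is true; the file should be tracked anyway
        }
        return dir.staleFiles;
    }
}
```

With this ordering a file might occasionally be sync'd when it did not strictly
need it, which is the over-sync side of the trade-off.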
Note that even though LUCENE-2328 was backported to 2.9.x and 3.0.x, this bug
wasn't, i.e. the backport was a much simpler fix (to just address the original
memory leak); it's 3.1, 3.2, 3.3 and trunk where this bug is present.