[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1044:
---------------------------------------

    Attachment: LUCENE-1044.take4.patch

OK, I made a simplistic patch (attached) whereby FSDirectory has a
background thread that re-opens, syncs, and closes the files that
Lucene has written (I'm using a modified version of the class from
Doron's test).
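
To make the approach concrete, here is a minimal sketch of the idea
(not the attached patch; the class and method names are made up): the
Directory hands each completed file name to a daemon thread, which
re-opens the file, fsyncs it via FileDescriptor.sync(), and closes it
again:

{code:java}
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.LinkedList;

class BackgroundSyncer extends Thread {
  private final File dir;                               // directory holding the index files
  private final LinkedList pending = new LinkedList();  // file names still to be sync'd
  private volatile boolean closed;

  BackgroundSyncer(File dir) {
    this.dir = dir;
    setDaemon(true);
  }

  // Called by the Directory after it finishes writing a file.
  synchronized void fileWritten(String name) {
    pending.add(name);
    notify();
  }

  synchronized void close() {
    closed = true;
    notify();
  }

  public void run() {
    while (true) {
      String name;
      synchronized (this) {
        while (pending.isEmpty() && !closed) {
          try { wait(); } catch (InterruptedException ie) { return; }
        }
        if (pending.isEmpty()) return;              // closed and fully drained
        name = (String) pending.removeFirst();
      }
      try {
        // Re-open, sync to stable storage, close.
        RandomAccessFile raf = new RandomAccessFile(new File(dir, name), "rw");
        try {
          raf.getFD().sync();
        } finally {
          raf.close();
        }
      } catch (IOException ioe) {
        // A real patch would have to surface this failure to the writer.
      }
    }
  }
}
{code}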

This patch is nowhere near ready to commit; I just coded up enough so
we could get a rough measure of the performance cost of syncing.  For
example, we still need to:

* prevent deletion of a commit point until a future commit point is
  fully sync'd to stable storage
* take care not to sync a file that was deleted before we got to it
* not sync until the end when running with autoCommit=false
* [maybe] have merges run by ConcurrentMergeScheduler sync in the
  foreground
* maybe forcefully throttle back updates if syncing falls too far
  behind
* etc.
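
As a rough illustration of the first two points (nothing here is from
the patch; the names are hypothetical), the bookkeeping could look
something like this: skip files that were already deleted, and only
let the deletion policy drop an older commit point once every file of
the newer commit has been sync'd:

{code:java}
import java.io.File;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class SyncTracker {
  private final File dir;
  private final Set synced = new HashSet();   // names of files known to be on stable storage

  SyncTracker(File dir) {
    this.dir = dir;
  }

  // Called by the background syncer after FileDescriptor.sync() succeeds.
  synchronized void markSynced(String name) {
    synced.add(name);
  }

  // Don't bother syncing a file that a merge/commit already deleted.
  boolean stillExists(String name) {
    return new File(dir, name).exists();
  }

  // The deletion policy may only drop an older commit point once every
  // file referenced by the newer commit point has been sync'd.
  synchronized boolean safeToDeleteOldCommit(Collection filesOfNewerCommit) {
    for (Iterator it = filesOfNewerCommit.iterator(); it.hasNext();) {
      if (!synced.contains(it.next())) {
        return false;
      }
    }
    return true;
  }
}
{code}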

I ran the same alg as the tests above (indexing the first 150K docs of
Wikipedia).  For each IO system I ran CFS and non-CFS, each with sync
and nosync (4 tests).  Time is the fastest of 2 runs:

|| IO System || CFS sync || CFS nosync || CFS % slower || non-CFS sync || non-CFS nosync || non-CFS % slower ||
| ReiserFS 6-drive RAID5 array Linux (2.6.22.1) | 188 | 157 | 19.7% | 143 | 147 | -2.7% |
| EXT3 single internal drive Linux (2.6.22.1) | 173 | 157 | 10.2% | 136 | 132 | 3.0% |
| 4 drive RAID0 array Mac Pro (10.4 Tiger) | 153 | 152 | 0.7% | 150 | 149 | 0.7% |
| Win XP Pro laptop, single drive | 463 | 352 | 31.5% | 343 | 335 | 2.4% |
| Mac Pro single external drive | 463 | 352 | 31.5% | 343 | 335 | 2.4% |

The good news is that the non-CFS case shows very little cost when we
do BG sync'ing!

The bad news is that the CFS case still shows a high cost.  However,
by not sync'ing the files that go into the CFS (and also not
committing a new segments_N file until after the CFS is written), I
expect that cost to go way down.
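
Just to spell out what that would mean (this helper is hypothetical,
not part of the patch): with compound files, the intermediate
per-extension files are folded into the .cfs and then deleted, so only
the .cfs itself and the segments_N that references it would need to
reach stable storage:

{code:java}
class SyncPolicy {
  // Decide whether a freshly written file must be fsync'd before the
  // next commit.  The .cfs / segments_N naming is standard Lucene, but
  // this helper itself is only illustrative.
  static boolean needsSync(String fileName, boolean useCompoundFile) {
    if (!useCompoundFile) {
      return true;                // every per-segment file must reach disk
    }
    // The intermediate per-extension files are folded into the .cfs and
    // then deleted, so fsyncing them buys nothing; only the .cfs and the
    // segments_N that references it need to be durable.
    return fileName.endsWith(".cfs") || fileName.startsWith("segments");
  }
}
{code}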

One caveat: I'm using an 8 MB RAM buffer for all of these tests.  As
Yonik pointed out, if you have a smaller buffer, or you add just a few
docs and then close your writer, the sync cost as a percentage of net
indexing time will be quite a bit higher.
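
For reference, here is roughly how the buffer could be configured with
the trunk IndexWriter.setRAMBufferSizeMB API (the index path is just a
placeholder):

{code:java}
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class RamBufferExample {
  public static void main(String[] args) throws Exception {
    IndexWriter writer = new IndexWriter(
        FSDirectory.getDirectory("/path/to/index"),   // placeholder index path
        new StandardAnalyzer(), true);
    writer.setRAMBufferSizeMB(8.0);   // same 8 MB buffer used for the tests above
    // ... add documents ...
    writer.close();
  }
}
{code}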


> Behavior on hard power shutdown
> -------------------------------
>
>                 Key: LUCENE-1044
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1044
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>         Environment: Windows Server 2003, Standard Edition, Sun Hotspot Java 
> 1.5
>            Reporter: venkat rangan
>            Assignee: Michael McCandless
>             Fix For: 2.3
>
>         Attachments: FSyncPerfTest.java, LUCENE-1044.patch, 
> LUCENE-1044.take2.patch, LUCENE-1044.take3.patch, LUCENE-1044.take4.patch
>
>
> When indexing a large number of documents, upon a hard power failure (e.g. 
> pulling the power cord), the index seems to get corrupted. We start a Java 
> application as a Windows Service, and feed it documents. In some cases 
> (after an index size of 1.7GB, with 30-40 index segment .cfs files), the 
> following is observed.
> The 'segments' file contains only zeros. Its size is 265 bytes - all bytes 
> are zeros.
> The 'deleted' file also contains only zeros. Its size is 85 bytes - all bytes 
> are zeros.
> Before corruption, the segments file and deleted file appear to be correct. 
> After this corruption, the index is corrupted and lost.
> This is a problem observed in Lucene 1.4.3. We are not able to upgrade our 
> customer deployments to 1.9 or a later version, but would be happy to 
> back-port a patch if it is small enough and if this problem is already solved.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

