[ 
https://issues.apache.org/jira/browse/LUCENE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931024#action_12931024
 ] 

Nico Krijnen edited comment on LUCENE-2729 at 11/11/10 9:45 AM:
----------------------------------------------------------------

We finished another run today and got some interesting results with from the 
additional logging. The log times are searchable in the log file...

- 09:40:21,996: we are creating a snapshot for our backup, segments filename 
used for snapshot: 'segments_2j'
On the next line we can see that at that moment the segments_2j is 2353 bytes. 
We log the filenames in the index + the (filesizes).

- 11:03:27,653: segments_2j is written to disk
_How can this be while this is the name of the segments filename used to create 
our Snapshot?_
On the next line we see that it has indeed been written to disk, but has a 0 
byte file size.

- 11:08:02,731: First "read past EOF" exception
_It looks like segments filenames are reused and at a point in time well after 
our Snapshot is created, the name 'segments_2j' is reused?
This somehow causes the segments file to get a size of 0 bytes. Perhaps it is 
being deleted and written at the same time, or because of the name clash 
there's no data to be written to the segments file?_

- 11:16:03,300: our backup is finished.

Here is an excerpt of some of the more interesting log entries (complete log is 
attached as backup_force_failure2.log.zip):

{quote}
09:39:30,719 [http-8080-2] DEBUG com.ds.acm.logic.tools.BackupManager - Start 
asset index backup

09:40:21,996 [http-8080-2] DEBUG 
proj.zoie.impl.indexing.internal.ZoieIndexDeletionPolicy - Ctor Snapshot. 
segments filename: *segments_2j*, commit filenames: [_5f.tvd, _4c.tvf, _5f.tvf, 
_4d.fdx, _4p.tis, _4c.tvd, _4p.frq, _5e.tvx, _39.tis, _3u.frq, _4d.fdt, 
_4p.tii, _5f.fnm, _4c.tvx, _39.tii, _55.tis, _2q.fnm, _4p.fdx, _2q.tvf, 
_5e.tvf, _5e.tvd, _2q.tvd, _5f.tvx, _4p.fdt, _55.tii, _55.nrm, _55.frq, 
_2q.tvx, _3u.tvx, _5f.fdx, _4c.frq, _5f.fdt, _4c.fdx, _2q.nrm, _4c.fdt, 
_4p.tvf, _4c.nrm, _3v.tvf, _4c.tii, _3u.prx, _4p.tvd, _4c.tis, _3u.fdt, 
_5e.fnm, _3v.tvd, _3u.fdx, _55.tvx, _39.fdt, _39.fnm, _5e.prx, _3u.tis, 
*segments_2j*, _3v.prx, _2q.fdx, _4c.prx, _4d.prx, _4p.tvx, _39.fdx, _3u.tii, 
_2q.fdt, _4d.frq, _55.fdt, _2q.prx, _5e.tii, _3u.fnm, _55.fdx, _5e.nrm, 
_5f.tii, _5e.tis, _4d.nrm, _4d.tii, _5e.fdt, _4d.tis, _5f.frq, _5e.fdx, 
_39.tvd, _39.tvf, _55.fnm, _3v.frq, _2q.frq, _2q.tii, _3v.tii, _5f.prx, 
_4p.nrm, _5f.nrm, _2q.tis, _3v.tvx, _4c.fnm, _3v.nrm, _55.tvf, _55.tvd, 
_3u.nrm, _5f.tis, _3v.tis, _3v.fdt, _4d.fnm, _4d.tvf, _3v.fdx, _4d.tvd, 
_4d.tvx, _4p.prx, _3v.fnm, _5e.frq, _39.prx, _39.tvx, _39.frq, _4p.fnm, 
_39.nrm, _3u.tvf, _55.prx, _3u.tvd]

11:03:27,653 [http-8080-2] TRACE 
com.ds.acm.engine.search.zoieimpl.core.DirectoryManager - Writing 
*segments_2j*, directory contents: _2q.fdt(969134416), _2q.fdx(36652), 
_2q.fnm(276), _2q.frq(4685726), _2q.nrm(9166), _2q.prx(393230403), 
_2q.tii(7447), _2q.tis(746299), _2q.tvd(8394), _2q.tvf(599185081), 
_2q.tvx(73300), _39.fdt(2061261675), _39.fdx(1012), _39.fnm(276), 
_39.frq(17754579), _39.nrm(256), _39.prx(121067407), _39.tii(137511), 
_39.tis(11726653), _39.tvd(185), _39.tvf(233037042), _39.tvx(2020), 
_3u.fdt(1595882722), _3u.fdx(63692), _3u.fnm(330), _3u.frq(8001869), 
_3u.nrm(15926), _3u.prx(647374863), _3u.tii(11319), _3u.tis(1168399), 
_3u.tvd(14209), _3u.tvf(986370136), _3u.tvx(127380), _3v.fdt(2057147455), 
_3v.fdx(476), _3v.fnm(384), _3v.frq(15201111), _3v.nrm(122), 
_3v.prx(109724024), _3v.tii(132491), _3v.tis(11457688), _3v.tvd(114), 
_3v.tvf(211902147), _3v.tvx(948), _4c.fdt(2691565961), _4c.fdx(39572), 
_4c.fnm(276), _4c.frq(18724620), _4c.nrm(9896), _4c.prx(590255960), 
_4c.tii(141243), _4c.tis(12185869), _4c.tvd(9894), _4c.tvf(932649779), 
_4c.tvx(79140), _4d.fdt(2398908136), _4d.fdx(548), _4d.fnm(354), 
_4d.frq(24581614), _4d.nrm(140), _4d.prx(158243133), _4d.tii(141948), 
_4d.tis(12259425), _4d.tvd(140), _4d.tvf(303769970), _4d.tvx(1092), 
_4p.fdt(1081212027), _4p.fdx(212), _4p.fnm(354), _4p.frq(8294102), _4p.nrm(56), 
_4p.prx(60513257), _4p.tii(134898), _4p.tis(11376730), _4p.tvd(56), 
_4p.tvf(116715012), _4p.tvx(420), _55.fdt(787581180), _55.fdx(156), 
_55.fnm(354), _55.frq(5376741), _55.nrm(42), _55.prx(39539985), 
_55.tii(133483), _55.tis(11351989), _55.tvd(42), _55.tvf(75958725), 
_55.tvx(308), _5e.prx(0), _5z.fdt(853614415), _5z.fdx(23700), _5z.fnm(1696), 
_5z.frq(9041984), _5z.nrm(5928), _5z.prx(32617823), _5z.tii(146115), 
_5z.tis(12387143), _5z.tvd(4142), _5z.tvf(74409815), _5z.tvx(47396), 
_63.fdt(212690811), _63.fdx(8188), _63.fnm(1652), _63.frq(2492889), 
_63.nrm(2050), _63.prx(5672093), _63.tii(8470), _63.tis(689993), _63.tvd(1442), 
_63.tvf(15371474), _63.tvx(16372), _64.fdt(4118409126), _64.fdx(1484), 
_64.fnm(384), _64.frq(35294399), _64.nrm(374), _64.prx(230791431), 
_64.tii(143860), _64.tis(12491845), _64.tvd(295), _64.tvf(444939185), 
_64.tvx(2964), _65.fdt(1369489408), _65.fdx(0), _65.fnm(1749), 
index.directory(5), segments.gen(20), segments_2x(2354)
11:03:27,706 [http-8080-2] TRACE 
com.ds.acm.engine.search.zoieimpl.core.DirectoryManager - Writing _3v.prx, 
directory contents: _2q.fdt(969134416), _2q.fdx(36652), _2q.fnm(276), 
_2q.frq(4685726), _2q.nrm(9166), _2q.prx(393230403), _2q.tii(7447), 
_2q.tis(746299), _2q.tvd(8394), _2q.tvf(599185081), _2q.tvx(73300), 
_39.fdt(2061261675), _39.fdx(1012), _39.fnm(276), _39.frq(17754579), 
_39.nrm(256), _39.prx(121067407), _39.tii(137511), _39.tis(11726653), 
_39.tvd(185), _39.tvf(233037042), _39.tvx(2020), _3u.fdt(1595882722), 
_3u.fdx(63692), _3u.fnm(330), _3u.frq(8001869), _3u.nrm(15926), 
_3u.prx(647374863), _3u.tii(11319), _3u.tis(1168399), _3u.tvd(14209), 
_3u.tvf(986370136), _3u.tvx(127380), _3v.fdt(2057147455), _3v.fdx(476), 
_3v.fnm(384), _3v.frq(15201111), _3v.nrm(122), _3v.prx(109724024), 
_3v.tii(132491), _3v.tis(11457688), _3v.tvd(114), _3v.tvf(211902147), 
_3v.tvx(948), _4c.fdt(2691565961), _4c.fdx(39572), _4c.fnm(276), 
_4c.frq(18724620), _4c.nrm(9896), _4c.prx(590255960), _4c.tii(141243), 
_4c.tis(12185869), _4c.tvd(9894), _4c.tvf(932649779), _4c.tvx(79140), 
_4d.fdt(2398908136), _4d.fdx(548), _4d.fnm(354), _4d.frq(24581614), 
_4d.nrm(140), _4d.prx(158243133), _4d.tii(141948), _4d.tis(12259425), 
_4d.tvd(140), _4d.tvf(303769970), _4d.tvx(1092), _4p.fdt(1081212027), 
_4p.fdx(212), _4p.fnm(354), _4p.frq(8294102), _4p.nrm(56), _4p.prx(60513257), 
_4p.tii(134898), _4p.tis(11376730), _4p.tvd(56), _4p.tvf(116715012), 
_4p.tvx(420), _55.fdt(787581180), _55.fdx(156), _55.fnm(354), _55.frq(5376741), 
_55.nrm(42), _55.prx(39539985), _55.tii(133483), _55.tis(11351989), 
_55.tvd(42), _55.tvf(75958725), _55.tvx(308), _5e.prx(0), _5z.fdt(853614415), 
_5z.fdx(23700), _5z.fnm(1696), _5z.frq(9041984), _5z.nrm(5928), 
_5z.prx(32617823), _5z.tii(146115), _5z.tis(12387143), _5z.tvd(4142), 
_5z.tvf(74409815), _5z.tvx(47396), _63.fdt(212690811), _63.fdx(8188), 
_63.fnm(1652), _63.frq(2492889), _63.nrm(2050), _63.prx(5672093), 
_63.tii(8470), _63.tis(689993), _63.tvd(1442), _63.tvf(15371474), 
_63.tvx(16372), _64.fdt(4118409126), _64.fdx(1484), _64.fnm(384), 
_64.frq(35294399), _64.nrm(374), _64.prx(230791431), _64.tii(143860), 
_64.tis(12491845), _64.tvd(295), _64.tvf(444939185), _64.tvx(2964), 
_65.fdt(1369554944), _65.fdx(0), _65.fnm(1749), index.directory(5), 
segments.gen(20), *segments_2j(0)*, segments_2x(2354)

IFD [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@7636d59a]: init: 
load commit "*segments_2j*"
IW 1110 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@7636d59a]: 
init: hit exception on init; releasing write lock
11:08:02,731 
[proj.zoie.impl.indexing.internal.realtimeindexdataloa...@7636d59a] ERROR 
proj.zoie.impl.indexing.internal.LuceneIndexDataLoader - Problem copying 
segments: *read past EOF*, start=0, end=0, bufferStart=0, bufferPosition=0, 
bufferSize=1024

11:16:03,300 [http-8080-2] DEBUG com.ds.acm.logic.tools.BackupManager - Asset 
index backup complete
{quote}

Looking in the logs there are *31 occurences of 'delete "segments_2j"' and 32 
for 'wrote segments file "segments_2j"'*. Very weird, since the 
ZoieIndexDeletionPolicy should prevent the segments_2j from being deleted...

So just guessing here, but the problem seems to be that segement filenames are 
being 'rotated' and getting reused?
The ZoieIndexDeletionPolicy definitely does not seem to take this into account, 
it uses the segments filename as a unique id to track the references to a 
specific commit. But looking at it, segment filenames are not unique over a 
certain period of time, and that time period gets shorter when lots of 
modifications are being made to the index.
Reading the code it should still prevent any segments_2j commit from being 
deleted... but we do see deletes for that file... hmmm

And then something else still seems te be going wrong, somehow the segments 
file does become 0 bytes. It is not because it was written with no data, we 
added a check for that (look for 'closing 0kb IndexOutput') but that never 
happens to a segments_* file.

Maybe some logic needs to be added to prevent certain segment_... numbers from 
being re-used if the commit is still 'referenced' by an IndexDeletionPolicy...?

      was (Author: nkrijnen):
    We finished another run today and got some interesting results with from 
the additional logging. The log times are searchable in the log file...

- 09:40:21,996: we are creating a snapshot for our backup, segments filename 
used for snapshot: 'segments_2j'
On the next line we can see that at that moment the segments_2j is 2353 bytes. 
We log the filenames in the index + the (filesizes).

- 11:03:27,653: segments_2j is written to disk
_How can this be while this is the name of the segments filename used to create 
our Snapshot?_
On the next line we see that it has indeed been written to disk, but has a 0 
byte file size.

- 11:08:02,731: First "read past EOF" exception
_It looks like segments filenames are reused and at a point in time well after 
our Snapshot is created, the name 'segments_2j' is reused?
This somehow causes the segments file to get a size of 0 bytes. Perhaps it is 
being deleted and written at the same time, or because of the name clash 
there's no data to be written to the segments file?_

- 11:16:03,300: our backup is finished.

Here is an excerpt of some of the more interesting log entries:

{quote}
09:39:30,719 [http-8080-2] DEBUG com.ds.acm.logic.tools.BackupManager - Start 
asset index backup

09:40:21,996 [http-8080-2] DEBUG 
proj.zoie.impl.indexing.internal.ZoieIndexDeletionPolicy - Ctor Snapshot. 
segments filename: *segments_2j*, commit filenames: [_5f.tvd, _4c.tvf, _5f.tvf, 
_4d.fdx, _4p.tis, _4c.tvd, _4p.frq, _5e.tvx, _39.tis, _3u.frq, _4d.fdt, 
_4p.tii, _5f.fnm, _4c.tvx, _39.tii, _55.tis, _2q.fnm, _4p.fdx, _2q.tvf, 
_5e.tvf, _5e.tvd, _2q.tvd, _5f.tvx, _4p.fdt, _55.tii, _55.nrm, _55.frq, 
_2q.tvx, _3u.tvx, _5f.fdx, _4c.frq, _5f.fdt, _4c.fdx, _2q.nrm, _4c.fdt, 
_4p.tvf, _4c.nrm, _3v.tvf, _4c.tii, _3u.prx, _4p.tvd, _4c.tis, _3u.fdt, 
_5e.fnm, _3v.tvd, _3u.fdx, _55.tvx, _39.fdt, _39.fnm, _5e.prx, _3u.tis, 
*segments_2j*, _3v.prx, _2q.fdx, _4c.prx, _4d.prx, _4p.tvx, _39.fdx, _3u.tii, 
_2q.fdt, _4d.frq, _55.fdt, _2q.prx, _5e.tii, _3u.fnm, _55.fdx, _5e.nrm, 
_5f.tii, _5e.tis, _4d.nrm, _4d.tii, _5e.fdt, _4d.tis, _5f.frq, _5e.fdx, 
_39.tvd, _39.tvf, _55.fnm, _3v.frq, _2q.frq, _2q.tii, _3v.tii, _5f.prx, 
_4p.nrm, _5f.nrm, _2q.tis, _3v.tvx, _4c.fnm, _3v.nrm, _55.tvf, _55.tvd, 
_3u.nrm, _5f.tis, _3v.tis, _3v.fdt, _4d.fnm, _4d.tvf, _3v.fdx, _4d.tvd, 
_4d.tvx, _4p.prx, _3v.fnm, _5e.frq, _39.prx, _39.tvx, _39.frq, _4p.fnm, 
_39.nrm, _3u.tvf, _55.prx, _3u.tvd]

11:03:27,653 [http-8080-2] TRACE 
com.ds.acm.engine.search.zoieimpl.core.DirectoryManager - Writing 
*segments_2j*, directory contents: _2q.fdt(969134416), _2q.fdx(36652), 
_2q.fnm(276), _2q.frq(4685726), _2q.nrm(9166), _2q.prx(393230403), 
_2q.tii(7447), _2q.tis(746299), _2q.tvd(8394), _2q.tvf(599185081), 
_2q.tvx(73300), _39.fdt(2061261675), _39.fdx(1012), _39.fnm(276), 
_39.frq(17754579), _39.nrm(256), _39.prx(121067407), _39.tii(137511), 
_39.tis(11726653), _39.tvd(185), _39.tvf(233037042), _39.tvx(2020), 
_3u.fdt(1595882722), _3u.fdx(63692), _3u.fnm(330), _3u.frq(8001869), 
_3u.nrm(15926), _3u.prx(647374863), _3u.tii(11319), _3u.tis(1168399), 
_3u.tvd(14209), _3u.tvf(986370136), _3u.tvx(127380), _3v.fdt(2057147455), 
_3v.fdx(476), _3v.fnm(384), _3v.frq(15201111), _3v.nrm(122), 
_3v.prx(109724024), _3v.tii(132491), _3v.tis(11457688), _3v.tvd(114), 
_3v.tvf(211902147), _3v.tvx(948), _4c.fdt(2691565961), _4c.fdx(39572), 
_4c.fnm(276), _4c.frq(18724620), _4c.nrm(9896), _4c.prx(590255960), 
_4c.tii(141243), _4c.tis(12185869), _4c.tvd(9894), _4c.tvf(932649779), 
_4c.tvx(79140), _4d.fdt(2398908136), _4d.fdx(548), _4d.fnm(354), 
_4d.frq(24581614), _4d.nrm(140), _4d.prx(158243133), _4d.tii(141948), 
_4d.tis(12259425), _4d.tvd(140), _4d.tvf(303769970), _4d.tvx(1092), 
_4p.fdt(1081212027), _4p.fdx(212), _4p.fnm(354), _4p.frq(8294102), _4p.nrm(56), 
_4p.prx(60513257), _4p.tii(134898), _4p.tis(11376730), _4p.tvd(56), 
_4p.tvf(116715012), _4p.tvx(420), _55.fdt(787581180), _55.fdx(156), 
_55.fnm(354), _55.frq(5376741), _55.nrm(42), _55.prx(39539985), 
_55.tii(133483), _55.tis(11351989), _55.tvd(42), _55.tvf(75958725), 
_55.tvx(308), _5e.prx(0), _5z.fdt(853614415), _5z.fdx(23700), _5z.fnm(1696), 
_5z.frq(9041984), _5z.nrm(5928), _5z.prx(32617823), _5z.tii(146115), 
_5z.tis(12387143), _5z.tvd(4142), _5z.tvf(74409815), _5z.tvx(47396), 
_63.fdt(212690811), _63.fdx(8188), _63.fnm(1652), _63.frq(2492889), 
_63.nrm(2050), _63.prx(5672093), _63.tii(8470), _63.tis(689993), _63.tvd(1442), 
_63.tvf(15371474), _63.tvx(16372), _64.fdt(4118409126), _64.fdx(1484), 
_64.fnm(384), _64.frq(35294399), _64.nrm(374), _64.prx(230791431), 
_64.tii(143860), _64.tis(12491845), _64.tvd(295), _64.tvf(444939185), 
_64.tvx(2964), _65.fdt(1369489408), _65.fdx(0), _65.fnm(1749), 
index.directory(5), segments.gen(20), segments_2x(2354)
11:03:27,706 [http-8080-2] TRACE 
com.ds.acm.engine.search.zoieimpl.core.DirectoryManager - Writing _3v.prx, 
directory contents: _2q.fdt(969134416), _2q.fdx(36652), _2q.fnm(276), 
_2q.frq(4685726), _2q.nrm(9166), _2q.prx(393230403), _2q.tii(7447), 
_2q.tis(746299), _2q.tvd(8394), _2q.tvf(599185081), _2q.tvx(73300), 
_39.fdt(2061261675), _39.fdx(1012), _39.fnm(276), _39.frq(17754579), 
_39.nrm(256), _39.prx(121067407), _39.tii(137511), _39.tis(11726653), 
_39.tvd(185), _39.tvf(233037042), _39.tvx(2020), _3u.fdt(1595882722), 
_3u.fdx(63692), _3u.fnm(330), _3u.frq(8001869), _3u.nrm(15926), 
_3u.prx(647374863), _3u.tii(11319), _3u.tis(1168399), _3u.tvd(14209), 
_3u.tvf(986370136), _3u.tvx(127380), _3v.fdt(2057147455), _3v.fdx(476), 
_3v.fnm(384), _3v.frq(15201111), _3v.nrm(122), _3v.prx(109724024), 
_3v.tii(132491), _3v.tis(11457688), _3v.tvd(114), _3v.tvf(211902147), 
_3v.tvx(948), _4c.fdt(2691565961), _4c.fdx(39572), _4c.fnm(276), 
_4c.frq(18724620), _4c.nrm(9896), _4c.prx(590255960), _4c.tii(141243), 
_4c.tis(12185869), _4c.tvd(9894), _4c.tvf(932649779), _4c.tvx(79140), 
_4d.fdt(2398908136), _4d.fdx(548), _4d.fnm(354), _4d.frq(24581614), 
_4d.nrm(140), _4d.prx(158243133), _4d.tii(141948), _4d.tis(12259425), 
_4d.tvd(140), _4d.tvf(303769970), _4d.tvx(1092), _4p.fdt(1081212027), 
_4p.fdx(212), _4p.fnm(354), _4p.frq(8294102), _4p.nrm(56), _4p.prx(60513257), 
_4p.tii(134898), _4p.tis(11376730), _4p.tvd(56), _4p.tvf(116715012), 
_4p.tvx(420), _55.fdt(787581180), _55.fdx(156), _55.fnm(354), _55.frq(5376741), 
_55.nrm(42), _55.prx(39539985), _55.tii(133483), _55.tis(11351989), 
_55.tvd(42), _55.tvf(75958725), _55.tvx(308), _5e.prx(0), _5z.fdt(853614415), 
_5z.fdx(23700), _5z.fnm(1696), _5z.frq(9041984), _5z.nrm(5928), 
_5z.prx(32617823), _5z.tii(146115), _5z.tis(12387143), _5z.tvd(4142), 
_5z.tvf(74409815), _5z.tvx(47396), _63.fdt(212690811), _63.fdx(8188), 
_63.fnm(1652), _63.frq(2492889), _63.nrm(2050), _63.prx(5672093), 
_63.tii(8470), _63.tis(689993), _63.tvd(1442), _63.tvf(15371474), 
_63.tvx(16372), _64.fdt(4118409126), _64.fdx(1484), _64.fnm(384), 
_64.frq(35294399), _64.nrm(374), _64.prx(230791431), _64.tii(143860), 
_64.tis(12491845), _64.tvd(295), _64.tvf(444939185), _64.tvx(2964), 
_65.fdt(1369554944), _65.fdx(0), _65.fnm(1749), index.directory(5), 
segments.gen(20), *segments_2j(0)*, segments_2x(2354)

IFD [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@7636d59a]: init: 
load commit "*segments_2j*"
IW 1110 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@7636d59a]: 
init: hit exception on init; releasing write lock
11:08:02,731 
[proj.zoie.impl.indexing.internal.realtimeindexdataloa...@7636d59a] ERROR 
proj.zoie.impl.indexing.internal.LuceneIndexDataLoader - Problem copying 
segments: *read past EOF*, start=0, end=0, bufferStart=0, bufferPosition=0, 
bufferSize=1024

11:16:03,300 [http-8080-2] DEBUG com.ds.acm.logic.tools.BackupManager - Asset 
index backup complete
{quote}

Looking in the logs there are *31 occurences of 'delete "segments_2j"' and 32 
for 'wrote segments file "segments_2j"'*. Very weird, since the 
ZoieIndexDeletionPolicy should prevent the segments_2j from being deleted...

So just guessing here, but the problem seems to be that segement filenames are 
being 'rotated' and getting reused?
The ZoieIndexDeletionPolicy definitely does not seem to take this into account, 
it uses the segments filename as a unique id to track the references to a 
specific commit. But looking at it, segment filenames are not unique over a 
certain period of time, and that time period gets shorter when lots of 
modifications are being made to the index.
Reading the code it should still prevent any segments_2j commit from being 
deleted... but we do see deletes for that file... hmmm

And then something else still seems te be going wrong, somehow the segments 
file does become 0 bytes. It is not because it was written with no data, we 
added a check for that (look for 'closing 0kb IndexOutput') but that never 
happens to a segments_* file.

Maybe some logic needs to be added to prevent certain segment_... numbers from 
being re-used if the commit is still 'referenced' by an IndexDeletionPolicy...?
  
> Index corruption after 'read past EOF' under heavy update load and snapshot 
> export
> ----------------------------------------------------------------------------------
>
>                 Key: LUCENE-2729
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2729
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 3.0.1, 3.0.2
>         Environment: Happens on both OS X 10.6 and Windows 2008 Server. 
> Integrated with zoie (using a zoie snapshot from 2010-08-06: 
> zoie-2.0.0-snapshot-20100806.jar).
>            Reporter: Nico Krijnen
>         Attachments: 2010-11-02 IndexWriter infoStream log.zip, 
> backup_force_failure2.log.zip, LUCENE-2729-test1.patch
>
>
> We have a system running lucene and zoie. We use lucene as a content store 
> for a CMS/DAM system. We use the hot-backup feature of zoie to make scheduled 
> backups of the index. This works fine for small indexes and when there are 
> not a lot of changes to the index when the backup is made.
> On large indexes (about 5 GB to 19 GB), when a backup is made while the index 
> is being changed a lot (lots of document additions and/or deletions), we 
> almost always get a 'read past EOF' at some point, followed by lots of 'Lock 
> obtain timed out'.
> At that point we get lots of 0 kb files in the index, data gets lots, and the 
> index is unusable.
> When we stop our server, remove the 0kb files and restart our server, the 
> index is operational again, but data has been lost.
> I'm not sure if this is a zoie or a lucene issue, so i'm posting it to both. 
> Hopefully someone has some ideas where to look to fix this.
> Some more details...
> Stack trace of the read past EOF and following Lock obtain timed out:
> {code}
> 78307 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@31ca5085] 
>     ERROR proj.zoie.impl.indexing.internal.BaseSearchIndex - read past EOF
> java.io.IOException: read past EOF
>     at 
> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:154)
>     at 
> org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
>     at 
> org.apache.lucene.store.ChecksumIndexInput.readByte(ChecksumIndexInput.java:37)
>     at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:69)
>     at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:245)
>     at 
> org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:166)
>     at 
> org.apache.lucene.index.DirectoryReader.doCommit(DirectoryReader.java:725)
>     at org.apache.lucene.index.IndexReader.commit(IndexReader.java:987)
>     at org.apache.lucene.index.IndexReader.commit(IndexReader.java:973)
>     at org.apache.lucene.index.IndexReader.decRef(IndexReader.java:162)
>     at org.apache.lucene.index.IndexReader.close(IndexReader.java:1003)
>     at 
> proj.zoie.impl.indexing.internal.BaseSearchIndex.deleteDocs(BaseSearchIndex.java:203)
>     at 
> proj.zoie.impl.indexing.internal.BaseSearchIndex.loadFromIndex(BaseSearchIndex.java:223)
>     at 
> proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.loadFromIndex(LuceneIndexDataLoader.java:153)
>     at 
> proj.zoie.impl.indexing.internal.DiskLuceneIndexDataLoader.loadFromIndex(DiskLuceneIndexDataLoader.java:134)
>     at 
> proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.processBatch(RealtimeIndexDataLoader.java:171)
>     at 
> proj.zoie.impl.indexing.internal.BatchedIndexDataLoader$LoaderThread.run(BatchedIndexDataLoader.java:373)
> 579336 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@31ca5085] 
>     ERROR proj.zoie.impl.indexing.internal.LuceneIndexDataLoader - 
>     Problem copying segments: Lock obtain timed out: 
>     org.apache.lucene.store.singleinstancel...@5ad0b895: write.lock
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: 
>     org.apache.lucene.store.singleinstancel...@5ad0b895: write.lock
>     at org.apache.lucene.store.Lock.obtain(Lock.java:84)
>     at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1060)
>     at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:957)
>     at 
> proj.zoie.impl.indexing.internal.DiskSearchIndex.openIndexWriter(DiskSearchIndex.java:176)
>     at 
> proj.zoie.impl.indexing.internal.BaseSearchIndex.loadFromIndex(BaseSearchIndex.java:228)
>     at 
> proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.loadFromIndex(LuceneIndexDataLoader.java:153)
>     at 
> proj.zoie.impl.indexing.internal.DiskLuceneIndexDataLoader.loadFromIndex(DiskLuceneIndexDataLoader.java:134)
>     at 
> proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.processBatch(RealtimeIndexDataLoader.java:171)
>     at 
> proj.zoie.impl.indexing.internal.BatchedIndexDataLoader$LoaderThread.run(BatchedIndexDataLoader.java:373)
> {code}
> We get exactly the same behavour on both OS X and on Windows. On both zoie is 
> using a SimpleFSDirectory.
> We also use a SingleInstanceLockFactory (since our process is the only one 
> working with the index), but we get the same behaviour with a NativeFSLock.
> The snapshot backup is being made by calling:
> *proj.zoie.impl.indexing.ZoieSystem.exportSnapshot(WritableByteChannel)*
> Same issue in zoie JIRA:
> http://snaprojects.jira.com/browse/ZOIE-51

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to