[
https://issues.apache.org/jira/browse/LUCENE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931024#action_12931024
]
Nico Krijnen edited comment on LUCENE-2729 at 11/11/10 9:45 AM:
----------------------------------------------------------------
We finished another run today and got some interesting results with from the
additional logging. The log times are searchable in the log file...
- 09:40:21,996: we are creating a snapshot for our backup, segments filename
used for snapshot: 'segments_2j'
On the next line we can see that at that moment the segments_2j is 2353 bytes.
We log the filenames in the index + the (filesizes).
- 11:03:27,653: segments_2j is written to disk
_How can this be while this is the name of the segments filename used to create
our Snapshot?_
On the next line we see that it has indeed been written to disk, but has a 0
byte file size.
- 11:08:02,731: First "read past EOF" exception
_It looks like segments filenames are reused and at a point in time well after
our Snapshot is created, the name 'segments_2j' is reused?
This somehow causes the segments file to get a size of 0 bytes. Perhaps it is
being deleted and written at the same time, or because of the name clash
there's no data to be written to the segments file?_
- 11:16:03,300: our backup is finished.
Here is an excerpt of some of the more interesting log entries (complete log is
attached as backup_force_failure2.log.zip):
{quote}
09:39:30,719 [http-8080-2] DEBUG com.ds.acm.logic.tools.BackupManager - Start
asset index backup
09:40:21,996 [http-8080-2] DEBUG
proj.zoie.impl.indexing.internal.ZoieIndexDeletionPolicy - Ctor Snapshot.
segments filename: *segments_2j*, commit filenames: [_5f.tvd, _4c.tvf, _5f.tvf,
_4d.fdx, _4p.tis, _4c.tvd, _4p.frq, _5e.tvx, _39.tis, _3u.frq, _4d.fdt,
_4p.tii, _5f.fnm, _4c.tvx, _39.tii, _55.tis, _2q.fnm, _4p.fdx, _2q.tvf,
_5e.tvf, _5e.tvd, _2q.tvd, _5f.tvx, _4p.fdt, _55.tii, _55.nrm, _55.frq,
_2q.tvx, _3u.tvx, _5f.fdx, _4c.frq, _5f.fdt, _4c.fdx, _2q.nrm, _4c.fdt,
_4p.tvf, _4c.nrm, _3v.tvf, _4c.tii, _3u.prx, _4p.tvd, _4c.tis, _3u.fdt,
_5e.fnm, _3v.tvd, _3u.fdx, _55.tvx, _39.fdt, _39.fnm, _5e.prx, _3u.tis,
*segments_2j*, _3v.prx, _2q.fdx, _4c.prx, _4d.prx, _4p.tvx, _39.fdx, _3u.tii,
_2q.fdt, _4d.frq, _55.fdt, _2q.prx, _5e.tii, _3u.fnm, _55.fdx, _5e.nrm,
_5f.tii, _5e.tis, _4d.nrm, _4d.tii, _5e.fdt, _4d.tis, _5f.frq, _5e.fdx,
_39.tvd, _39.tvf, _55.fnm, _3v.frq, _2q.frq, _2q.tii, _3v.tii, _5f.prx,
_4p.nrm, _5f.nrm, _2q.tis, _3v.tvx, _4c.fnm, _3v.nrm, _55.tvf, _55.tvd,
_3u.nrm, _5f.tis, _3v.tis, _3v.fdt, _4d.fnm, _4d.tvf, _3v.fdx, _4d.tvd,
_4d.tvx, _4p.prx, _3v.fnm, _5e.frq, _39.prx, _39.tvx, _39.frq, _4p.fnm,
_39.nrm, _3u.tvf, _55.prx, _3u.tvd]
11:03:27,653 [http-8080-2] TRACE
com.ds.acm.engine.search.zoieimpl.core.DirectoryManager - Writing
*segments_2j*, directory contents: _2q.fdt(969134416), _2q.fdx(36652),
_2q.fnm(276), _2q.frq(4685726), _2q.nrm(9166), _2q.prx(393230403),
_2q.tii(7447), _2q.tis(746299), _2q.tvd(8394), _2q.tvf(599185081),
_2q.tvx(73300), _39.fdt(2061261675), _39.fdx(1012), _39.fnm(276),
_39.frq(17754579), _39.nrm(256), _39.prx(121067407), _39.tii(137511),
_39.tis(11726653), _39.tvd(185), _39.tvf(233037042), _39.tvx(2020),
_3u.fdt(1595882722), _3u.fdx(63692), _3u.fnm(330), _3u.frq(8001869),
_3u.nrm(15926), _3u.prx(647374863), _3u.tii(11319), _3u.tis(1168399),
_3u.tvd(14209), _3u.tvf(986370136), _3u.tvx(127380), _3v.fdt(2057147455),
_3v.fdx(476), _3v.fnm(384), _3v.frq(15201111), _3v.nrm(122),
_3v.prx(109724024), _3v.tii(132491), _3v.tis(11457688), _3v.tvd(114),
_3v.tvf(211902147), _3v.tvx(948), _4c.fdt(2691565961), _4c.fdx(39572),
_4c.fnm(276), _4c.frq(18724620), _4c.nrm(9896), _4c.prx(590255960),
_4c.tii(141243), _4c.tis(12185869), _4c.tvd(9894), _4c.tvf(932649779),
_4c.tvx(79140), _4d.fdt(2398908136), _4d.fdx(548), _4d.fnm(354),
_4d.frq(24581614), _4d.nrm(140), _4d.prx(158243133), _4d.tii(141948),
_4d.tis(12259425), _4d.tvd(140), _4d.tvf(303769970), _4d.tvx(1092),
_4p.fdt(1081212027), _4p.fdx(212), _4p.fnm(354), _4p.frq(8294102), _4p.nrm(56),
_4p.prx(60513257), _4p.tii(134898), _4p.tis(11376730), _4p.tvd(56),
_4p.tvf(116715012), _4p.tvx(420), _55.fdt(787581180), _55.fdx(156),
_55.fnm(354), _55.frq(5376741), _55.nrm(42), _55.prx(39539985),
_55.tii(133483), _55.tis(11351989), _55.tvd(42), _55.tvf(75958725),
_55.tvx(308), _5e.prx(0), _5z.fdt(853614415), _5z.fdx(23700), _5z.fnm(1696),
_5z.frq(9041984), _5z.nrm(5928), _5z.prx(32617823), _5z.tii(146115),
_5z.tis(12387143), _5z.tvd(4142), _5z.tvf(74409815), _5z.tvx(47396),
_63.fdt(212690811), _63.fdx(8188), _63.fnm(1652), _63.frq(2492889),
_63.nrm(2050), _63.prx(5672093), _63.tii(8470), _63.tis(689993), _63.tvd(1442),
_63.tvf(15371474), _63.tvx(16372), _64.fdt(4118409126), _64.fdx(1484),
_64.fnm(384), _64.frq(35294399), _64.nrm(374), _64.prx(230791431),
_64.tii(143860), _64.tis(12491845), _64.tvd(295), _64.tvf(444939185),
_64.tvx(2964), _65.fdt(1369489408), _65.fdx(0), _65.fnm(1749),
index.directory(5), segments.gen(20), segments_2x(2354)
11:03:27,706 [http-8080-2] TRACE
com.ds.acm.engine.search.zoieimpl.core.DirectoryManager - Writing _3v.prx,
directory contents: _2q.fdt(969134416), _2q.fdx(36652), _2q.fnm(276),
_2q.frq(4685726), _2q.nrm(9166), _2q.prx(393230403), _2q.tii(7447),
_2q.tis(746299), _2q.tvd(8394), _2q.tvf(599185081), _2q.tvx(73300),
_39.fdt(2061261675), _39.fdx(1012), _39.fnm(276), _39.frq(17754579),
_39.nrm(256), _39.prx(121067407), _39.tii(137511), _39.tis(11726653),
_39.tvd(185), _39.tvf(233037042), _39.tvx(2020), _3u.fdt(1595882722),
_3u.fdx(63692), _3u.fnm(330), _3u.frq(8001869), _3u.nrm(15926),
_3u.prx(647374863), _3u.tii(11319), _3u.tis(1168399), _3u.tvd(14209),
_3u.tvf(986370136), _3u.tvx(127380), _3v.fdt(2057147455), _3v.fdx(476),
_3v.fnm(384), _3v.frq(15201111), _3v.nrm(122), _3v.prx(109724024),
_3v.tii(132491), _3v.tis(11457688), _3v.tvd(114), _3v.tvf(211902147),
_3v.tvx(948), _4c.fdt(2691565961), _4c.fdx(39572), _4c.fnm(276),
_4c.frq(18724620), _4c.nrm(9896), _4c.prx(590255960), _4c.tii(141243),
_4c.tis(12185869), _4c.tvd(9894), _4c.tvf(932649779), _4c.tvx(79140),
_4d.fdt(2398908136), _4d.fdx(548), _4d.fnm(354), _4d.frq(24581614),
_4d.nrm(140), _4d.prx(158243133), _4d.tii(141948), _4d.tis(12259425),
_4d.tvd(140), _4d.tvf(303769970), _4d.tvx(1092), _4p.fdt(1081212027),
_4p.fdx(212), _4p.fnm(354), _4p.frq(8294102), _4p.nrm(56), _4p.prx(60513257),
_4p.tii(134898), _4p.tis(11376730), _4p.tvd(56), _4p.tvf(116715012),
_4p.tvx(420), _55.fdt(787581180), _55.fdx(156), _55.fnm(354), _55.frq(5376741),
_55.nrm(42), _55.prx(39539985), _55.tii(133483), _55.tis(11351989),
_55.tvd(42), _55.tvf(75958725), _55.tvx(308), _5e.prx(0), _5z.fdt(853614415),
_5z.fdx(23700), _5z.fnm(1696), _5z.frq(9041984), _5z.nrm(5928),
_5z.prx(32617823), _5z.tii(146115), _5z.tis(12387143), _5z.tvd(4142),
_5z.tvf(74409815), _5z.tvx(47396), _63.fdt(212690811), _63.fdx(8188),
_63.fnm(1652), _63.frq(2492889), _63.nrm(2050), _63.prx(5672093),
_63.tii(8470), _63.tis(689993), _63.tvd(1442), _63.tvf(15371474),
_63.tvx(16372), _64.fdt(4118409126), _64.fdx(1484), _64.fnm(384),
_64.frq(35294399), _64.nrm(374), _64.prx(230791431), _64.tii(143860),
_64.tis(12491845), _64.tvd(295), _64.tvf(444939185), _64.tvx(2964),
_65.fdt(1369554944), _65.fdx(0), _65.fnm(1749), index.directory(5),
segments.gen(20), *segments_2j(0)*, segments_2x(2354)
IFD [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@7636d59a]: init:
load commit "*segments_2j*"
IW 1110 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@7636d59a]:
init: hit exception on init; releasing write lock
11:08:02,731
[proj.zoie.impl.indexing.internal.realtimeindexdataloa...@7636d59a] ERROR
proj.zoie.impl.indexing.internal.LuceneIndexDataLoader - Problem copying
segments: *read past EOF*, start=0, end=0, bufferStart=0, bufferPosition=0,
bufferSize=1024
11:16:03,300 [http-8080-2] DEBUG com.ds.acm.logic.tools.BackupManager - Asset
index backup complete
{quote}
Looking in the logs there are *31 occurences of 'delete "segments_2j"' and 32
for 'wrote segments file "segments_2j"'*. Very weird, since the
ZoieIndexDeletionPolicy should prevent the segments_2j from being deleted...
So just guessing here, but the problem seems to be that segement filenames are
being 'rotated' and getting reused?
The ZoieIndexDeletionPolicy definitely does not seem to take this into account,
it uses the segments filename as a unique id to track the references to a
specific commit. But looking at it, segment filenames are not unique over a
certain period of time, and that time period gets shorter when lots of
modifications are being made to the index.
Reading the code it should still prevent any segments_2j commit from being
deleted... but we do see deletes for that file... hmmm
And then something else still seems te be going wrong, somehow the segments
file does become 0 bytes. It is not because it was written with no data, we
added a check for that (look for 'closing 0kb IndexOutput') but that never
happens to a segments_* file.
Maybe some logic needs to be added to prevent certain segment_... numbers from
being re-used if the commit is still 'referenced' by an IndexDeletionPolicy...?
was (Author: nkrijnen):
We finished another run today and got some interesting results with from
the additional logging. The log times are searchable in the log file...
- 09:40:21,996: we are creating a snapshot for our backup, segments filename
used for snapshot: 'segments_2j'
On the next line we can see that at that moment the segments_2j is 2353 bytes.
We log the filenames in the index + the (filesizes).
- 11:03:27,653: segments_2j is written to disk
_How can this be while this is the name of the segments filename used to create
our Snapshot?_
On the next line we see that it has indeed been written to disk, but has a 0
byte file size.
- 11:08:02,731: First "read past EOF" exception
_It looks like segments filenames are reused and at a point in time well after
our Snapshot is created, the name 'segments_2j' is reused?
This somehow causes the segments file to get a size of 0 bytes. Perhaps it is
being deleted and written at the same time, or because of the name clash
there's no data to be written to the segments file?_
- 11:16:03,300: our backup is finished.
Here is an excerpt of some of the more interesting log entries:
{quote}
09:39:30,719 [http-8080-2] DEBUG com.ds.acm.logic.tools.BackupManager - Start
asset index backup
09:40:21,996 [http-8080-2] DEBUG
proj.zoie.impl.indexing.internal.ZoieIndexDeletionPolicy - Ctor Snapshot.
segments filename: *segments_2j*, commit filenames: [_5f.tvd, _4c.tvf, _5f.tvf,
_4d.fdx, _4p.tis, _4c.tvd, _4p.frq, _5e.tvx, _39.tis, _3u.frq, _4d.fdt,
_4p.tii, _5f.fnm, _4c.tvx, _39.tii, _55.tis, _2q.fnm, _4p.fdx, _2q.tvf,
_5e.tvf, _5e.tvd, _2q.tvd, _5f.tvx, _4p.fdt, _55.tii, _55.nrm, _55.frq,
_2q.tvx, _3u.tvx, _5f.fdx, _4c.frq, _5f.fdt, _4c.fdx, _2q.nrm, _4c.fdt,
_4p.tvf, _4c.nrm, _3v.tvf, _4c.tii, _3u.prx, _4p.tvd, _4c.tis, _3u.fdt,
_5e.fnm, _3v.tvd, _3u.fdx, _55.tvx, _39.fdt, _39.fnm, _5e.prx, _3u.tis,
*segments_2j*, _3v.prx, _2q.fdx, _4c.prx, _4d.prx, _4p.tvx, _39.fdx, _3u.tii,
_2q.fdt, _4d.frq, _55.fdt, _2q.prx, _5e.tii, _3u.fnm, _55.fdx, _5e.nrm,
_5f.tii, _5e.tis, _4d.nrm, _4d.tii, _5e.fdt, _4d.tis, _5f.frq, _5e.fdx,
_39.tvd, _39.tvf, _55.fnm, _3v.frq, _2q.frq, _2q.tii, _3v.tii, _5f.prx,
_4p.nrm, _5f.nrm, _2q.tis, _3v.tvx, _4c.fnm, _3v.nrm, _55.tvf, _55.tvd,
_3u.nrm, _5f.tis, _3v.tis, _3v.fdt, _4d.fnm, _4d.tvf, _3v.fdx, _4d.tvd,
_4d.tvx, _4p.prx, _3v.fnm, _5e.frq, _39.prx, _39.tvx, _39.frq, _4p.fnm,
_39.nrm, _3u.tvf, _55.prx, _3u.tvd]
11:03:27,653 [http-8080-2] TRACE
com.ds.acm.engine.search.zoieimpl.core.DirectoryManager - Writing
*segments_2j*, directory contents: _2q.fdt(969134416), _2q.fdx(36652),
_2q.fnm(276), _2q.frq(4685726), _2q.nrm(9166), _2q.prx(393230403),
_2q.tii(7447), _2q.tis(746299), _2q.tvd(8394), _2q.tvf(599185081),
_2q.tvx(73300), _39.fdt(2061261675), _39.fdx(1012), _39.fnm(276),
_39.frq(17754579), _39.nrm(256), _39.prx(121067407), _39.tii(137511),
_39.tis(11726653), _39.tvd(185), _39.tvf(233037042), _39.tvx(2020),
_3u.fdt(1595882722), _3u.fdx(63692), _3u.fnm(330), _3u.frq(8001869),
_3u.nrm(15926), _3u.prx(647374863), _3u.tii(11319), _3u.tis(1168399),
_3u.tvd(14209), _3u.tvf(986370136), _3u.tvx(127380), _3v.fdt(2057147455),
_3v.fdx(476), _3v.fnm(384), _3v.frq(15201111), _3v.nrm(122),
_3v.prx(109724024), _3v.tii(132491), _3v.tis(11457688), _3v.tvd(114),
_3v.tvf(211902147), _3v.tvx(948), _4c.fdt(2691565961), _4c.fdx(39572),
_4c.fnm(276), _4c.frq(18724620), _4c.nrm(9896), _4c.prx(590255960),
_4c.tii(141243), _4c.tis(12185869), _4c.tvd(9894), _4c.tvf(932649779),
_4c.tvx(79140), _4d.fdt(2398908136), _4d.fdx(548), _4d.fnm(354),
_4d.frq(24581614), _4d.nrm(140), _4d.prx(158243133), _4d.tii(141948),
_4d.tis(12259425), _4d.tvd(140), _4d.tvf(303769970), _4d.tvx(1092),
_4p.fdt(1081212027), _4p.fdx(212), _4p.fnm(354), _4p.frq(8294102), _4p.nrm(56),
_4p.prx(60513257), _4p.tii(134898), _4p.tis(11376730), _4p.tvd(56),
_4p.tvf(116715012), _4p.tvx(420), _55.fdt(787581180), _55.fdx(156),
_55.fnm(354), _55.frq(5376741), _55.nrm(42), _55.prx(39539985),
_55.tii(133483), _55.tis(11351989), _55.tvd(42), _55.tvf(75958725),
_55.tvx(308), _5e.prx(0), _5z.fdt(853614415), _5z.fdx(23700), _5z.fnm(1696),
_5z.frq(9041984), _5z.nrm(5928), _5z.prx(32617823), _5z.tii(146115),
_5z.tis(12387143), _5z.tvd(4142), _5z.tvf(74409815), _5z.tvx(47396),
_63.fdt(212690811), _63.fdx(8188), _63.fnm(1652), _63.frq(2492889),
_63.nrm(2050), _63.prx(5672093), _63.tii(8470), _63.tis(689993), _63.tvd(1442),
_63.tvf(15371474), _63.tvx(16372), _64.fdt(4118409126), _64.fdx(1484),
_64.fnm(384), _64.frq(35294399), _64.nrm(374), _64.prx(230791431),
_64.tii(143860), _64.tis(12491845), _64.tvd(295), _64.tvf(444939185),
_64.tvx(2964), _65.fdt(1369489408), _65.fdx(0), _65.fnm(1749),
index.directory(5), segments.gen(20), segments_2x(2354)
11:03:27,706 [http-8080-2] TRACE
com.ds.acm.engine.search.zoieimpl.core.DirectoryManager - Writing _3v.prx,
directory contents: _2q.fdt(969134416), _2q.fdx(36652), _2q.fnm(276),
_2q.frq(4685726), _2q.nrm(9166), _2q.prx(393230403), _2q.tii(7447),
_2q.tis(746299), _2q.tvd(8394), _2q.tvf(599185081), _2q.tvx(73300),
_39.fdt(2061261675), _39.fdx(1012), _39.fnm(276), _39.frq(17754579),
_39.nrm(256), _39.prx(121067407), _39.tii(137511), _39.tis(11726653),
_39.tvd(185), _39.tvf(233037042), _39.tvx(2020), _3u.fdt(1595882722),
_3u.fdx(63692), _3u.fnm(330), _3u.frq(8001869), _3u.nrm(15926),
_3u.prx(647374863), _3u.tii(11319), _3u.tis(1168399), _3u.tvd(14209),
_3u.tvf(986370136), _3u.tvx(127380), _3v.fdt(2057147455), _3v.fdx(476),
_3v.fnm(384), _3v.frq(15201111), _3v.nrm(122), _3v.prx(109724024),
_3v.tii(132491), _3v.tis(11457688), _3v.tvd(114), _3v.tvf(211902147),
_3v.tvx(948), _4c.fdt(2691565961), _4c.fdx(39572), _4c.fnm(276),
_4c.frq(18724620), _4c.nrm(9896), _4c.prx(590255960), _4c.tii(141243),
_4c.tis(12185869), _4c.tvd(9894), _4c.tvf(932649779), _4c.tvx(79140),
_4d.fdt(2398908136), _4d.fdx(548), _4d.fnm(354), _4d.frq(24581614),
_4d.nrm(140), _4d.prx(158243133), _4d.tii(141948), _4d.tis(12259425),
_4d.tvd(140), _4d.tvf(303769970), _4d.tvx(1092), _4p.fdt(1081212027),
_4p.fdx(212), _4p.fnm(354), _4p.frq(8294102), _4p.nrm(56), _4p.prx(60513257),
_4p.tii(134898), _4p.tis(11376730), _4p.tvd(56), _4p.tvf(116715012),
_4p.tvx(420), _55.fdt(787581180), _55.fdx(156), _55.fnm(354), _55.frq(5376741),
_55.nrm(42), _55.prx(39539985), _55.tii(133483), _55.tis(11351989),
_55.tvd(42), _55.tvf(75958725), _55.tvx(308), _5e.prx(0), _5z.fdt(853614415),
_5z.fdx(23700), _5z.fnm(1696), _5z.frq(9041984), _5z.nrm(5928),
_5z.prx(32617823), _5z.tii(146115), _5z.tis(12387143), _5z.tvd(4142),
_5z.tvf(74409815), _5z.tvx(47396), _63.fdt(212690811), _63.fdx(8188),
_63.fnm(1652), _63.frq(2492889), _63.nrm(2050), _63.prx(5672093),
_63.tii(8470), _63.tis(689993), _63.tvd(1442), _63.tvf(15371474),
_63.tvx(16372), _64.fdt(4118409126), _64.fdx(1484), _64.fnm(384),
_64.frq(35294399), _64.nrm(374), _64.prx(230791431), _64.tii(143860),
_64.tis(12491845), _64.tvd(295), _64.tvf(444939185), _64.tvx(2964),
_65.fdt(1369554944), _65.fdx(0), _65.fnm(1749), index.directory(5),
segments.gen(20), *segments_2j(0)*, segments_2x(2354)
IFD [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@7636d59a]: init:
load commit "*segments_2j*"
IW 1110 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@7636d59a]:
init: hit exception on init; releasing write lock
11:08:02,731
[proj.zoie.impl.indexing.internal.realtimeindexdataloa...@7636d59a] ERROR
proj.zoie.impl.indexing.internal.LuceneIndexDataLoader - Problem copying
segments: *read past EOF*, start=0, end=0, bufferStart=0, bufferPosition=0,
bufferSize=1024
11:16:03,300 [http-8080-2] DEBUG com.ds.acm.logic.tools.BackupManager - Asset
index backup complete
{quote}
Looking in the logs there are *31 occurences of 'delete "segments_2j"' and 32
for 'wrote segments file "segments_2j"'*. Very weird, since the
ZoieIndexDeletionPolicy should prevent the segments_2j from being deleted...
So just guessing here, but the problem seems to be that segement filenames are
being 'rotated' and getting reused?
The ZoieIndexDeletionPolicy definitely does not seem to take this into account,
it uses the segments filename as a unique id to track the references to a
specific commit. But looking at it, segment filenames are not unique over a
certain period of time, and that time period gets shorter when lots of
modifications are being made to the index.
Reading the code it should still prevent any segments_2j commit from being
deleted... but we do see deletes for that file... hmmm
And then something else still seems te be going wrong, somehow the segments
file does become 0 bytes. It is not because it was written with no data, we
added a check for that (look for 'closing 0kb IndexOutput') but that never
happens to a segments_* file.
Maybe some logic needs to be added to prevent certain segment_... numbers from
being re-used if the commit is still 'referenced' by an IndexDeletionPolicy...?
> Index corruption after 'read past EOF' under heavy update load and snapshot
> export
> ----------------------------------------------------------------------------------
>
> Key: LUCENE-2729
> URL: https://issues.apache.org/jira/browse/LUCENE-2729
> Project: Lucene - Java
> Issue Type: Bug
> Components: Index
> Affects Versions: 3.0.1, 3.0.2
> Environment: Happens on both OS X 10.6 and Windows 2008 Server.
> Integrated with zoie (using a zoie snapshot from 2010-08-06:
> zoie-2.0.0-snapshot-20100806.jar).
> Reporter: Nico Krijnen
> Attachments: 2010-11-02 IndexWriter infoStream log.zip,
> backup_force_failure2.log.zip, LUCENE-2729-test1.patch
>
>
> We have a system running lucene and zoie. We use lucene as a content store
> for a CMS/DAM system. We use the hot-backup feature of zoie to make scheduled
> backups of the index. This works fine for small indexes and when there are
> not a lot of changes to the index when the backup is made.
> On large indexes (about 5 GB to 19 GB), when a backup is made while the index
> is being changed a lot (lots of document additions and/or deletions), we
> almost always get a 'read past EOF' at some point, followed by lots of 'Lock
> obtain timed out'.
> At that point we get lots of 0 kb files in the index, data gets lots, and the
> index is unusable.
> When we stop our server, remove the 0kb files and restart our server, the
> index is operational again, but data has been lost.
> I'm not sure if this is a zoie or a lucene issue, so i'm posting it to both.
> Hopefully someone has some ideas where to look to fix this.
> Some more details...
> Stack trace of the read past EOF and following Lock obtain timed out:
> {code}
> 78307 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@31ca5085]
> ERROR proj.zoie.impl.indexing.internal.BaseSearchIndex - read past EOF
> java.io.IOException: read past EOF
> at
> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:154)
> at
> org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
> at
> org.apache.lucene.store.ChecksumIndexInput.readByte(ChecksumIndexInput.java:37)
> at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:69)
> at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:245)
> at
> org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:166)
> at
> org.apache.lucene.index.DirectoryReader.doCommit(DirectoryReader.java:725)
> at org.apache.lucene.index.IndexReader.commit(IndexReader.java:987)
> at org.apache.lucene.index.IndexReader.commit(IndexReader.java:973)
> at org.apache.lucene.index.IndexReader.decRef(IndexReader.java:162)
> at org.apache.lucene.index.IndexReader.close(IndexReader.java:1003)
> at
> proj.zoie.impl.indexing.internal.BaseSearchIndex.deleteDocs(BaseSearchIndex.java:203)
> at
> proj.zoie.impl.indexing.internal.BaseSearchIndex.loadFromIndex(BaseSearchIndex.java:223)
> at
> proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.loadFromIndex(LuceneIndexDataLoader.java:153)
> at
> proj.zoie.impl.indexing.internal.DiskLuceneIndexDataLoader.loadFromIndex(DiskLuceneIndexDataLoader.java:134)
> at
> proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.processBatch(RealtimeIndexDataLoader.java:171)
> at
> proj.zoie.impl.indexing.internal.BatchedIndexDataLoader$LoaderThread.run(BatchedIndexDataLoader.java:373)
> 579336 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@31ca5085]
> ERROR proj.zoie.impl.indexing.internal.LuceneIndexDataLoader -
> Problem copying segments: Lock obtain timed out:
> org.apache.lucene.store.singleinstancel...@5ad0b895: write.lock
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> org.apache.lucene.store.singleinstancel...@5ad0b895: write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:84)
> at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1060)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:957)
> at
> proj.zoie.impl.indexing.internal.DiskSearchIndex.openIndexWriter(DiskSearchIndex.java:176)
> at
> proj.zoie.impl.indexing.internal.BaseSearchIndex.loadFromIndex(BaseSearchIndex.java:228)
> at
> proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.loadFromIndex(LuceneIndexDataLoader.java:153)
> at
> proj.zoie.impl.indexing.internal.DiskLuceneIndexDataLoader.loadFromIndex(DiskLuceneIndexDataLoader.java:134)
> at
> proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.processBatch(RealtimeIndexDataLoader.java:171)
> at
> proj.zoie.impl.indexing.internal.BatchedIndexDataLoader$LoaderThread.run(BatchedIndexDataLoader.java:373)
> {code}
> We get exactly the same behavour on both OS X and on Windows. On both zoie is
> using a SimpleFSDirectory.
> We also use a SingleInstanceLockFactory (since our process is the only one
> working with the index), but we get the same behaviour with a NativeFSLock.
> The snapshot backup is being made by calling:
> *proj.zoie.impl.indexing.ZoieSystem.exportSnapshot(WritableByteChannel)*
> Same issue in zoie JIRA:
> http://snaprojects.jira.com/browse/ZOIE-51
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]