[
https://issues.apache.org/jira/browse/OAK-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14989329#comment-14989329
]
Ian Boston commented on OAK-3547:
---------------------------------
Attempts to break the index operation have produced recovery behaviour that
appears stable.
{code}
04.11.2015 10:02:20.391 *INFO* [aysnc-index-update-async]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Saving Listing state to l_1446631340351 as
_3.si,229,1a1c2c541a3cb5087d2a3c60ae77b6b05461410d;segments_3,117,a94bece6be54b2690a3b11925ed9bbc80e914d5d;segments_4,117,a74de5ae54af2abbb2423573cd0ad93b950ec18a;_2.cfe,224,dd2d758773e57e172933ff3d3fc3a4908af59dc4;_0_1.del,36,906f2506ff277e28716cc19eb8b55f289e34c53c;_0.si,229,57e0616d14993ad2a9680f55d4151d440cad8255;_2.si,229,ceddd18aa9343666e78f6330d0e261e96474717b;segments.gen,20,395c2b9ba7f05f4debb52b0a7cea8ac56ad671a2;_3.cfe,224,2596860e7bcdd550e221488708afda8729689107;_0.cfs,1431868,a7368bb6e2398a5952eddbc062a498f100a29865;_0.cfe,266,62127a60fe2224e32e3720eb15b2bd9f34d4670a;_3.cfs,1210,1c7764b0713716c53d1b0ed21c0063c5606aad49;_2.cfs,1188,1cd8b738619350e6998b77c2142d5a3748e861f1;
04.11.2015 10:02:20.391 *INFO* [aysnc-index-update-async]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Saving due to close.
04.11.2015 10:02:20.427 *INFO* [aysnc-index-update-async]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Saving Listing state to l_1446631340391 as
_3.si,229,1a1c2c541a3cb5087d2a3c60ae77b6b05461410d;segments_4,117,a74de5ae54af2abbb2423573cd0ad93b950ec18a;_0_1.del,36,906f2506ff277e28716cc19eb8b55f289e34c53c;_0.si,229,57e0616d14993ad2a9680f55d4151d440cad8255;segments.gen,20,395c2b9ba7f05f4debb52b0a7cea8ac56ad671a2;_3.cfe,224,2596860e7bcdd550e221488708afda8729689107;_0.cfs,1431868,a7368bb6e2398a5952eddbc062a498f100a29865;_0.cfe,266,62127a60fe2224e32e3720eb15b2bd9f34d4670a;_3.cfs,1210,1c7764b0713716c53d1b0ed21c0063c5606aad49;
04.11.2015 10:02:23.375 *WARN* [aysnc-index-update-fulltext-async]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
IO Exception reading index file
at
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing.getIndexFileMetaData(OakDirectory.java:965)
at
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing.validateListing(OakDirectory.java:844)
at
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing.load(OakDirectory.java:878)
at
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing.<init>(OakDirectory.java:750)
at
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing.<init>(OakDirectory.java:728)
04.11.2015 10:02:23.377 *WARN* [aysnc-index-update-fulltext-async]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Index File and Oak Version and not the same Name: _xi.fdt, length:14816174,
checksum:ff821a1bde2330f1389c782a777747677206c685 CheckSum:
ff821a1bde2330f1389c782a777747677206c685 != Unable to generate checksum
java.lang.RuntimeException: failed to read block from backend, id
b535214bddc090c74a426acaeeb5654140c1be52d4af824f2b759113c8a7bdc6@0,
04.11.2015 10:02:23.377 *WARN* [aysnc-index-update-fulltext-async]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Rejected directory listing l_1446631317105
04.11.2015 10:02:24.104 *INFO* [aysnc-index-update-fulltext-async]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446631315966, using
04.11.2015 10:02:24.129 *INFO* [oak-lucene-0]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446631317105, using
04.11.2015 10:02:25.034 *INFO* [aysnc-index-update-fulltext-async]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Saving Listing state to l_1446631344105 as
_xi.tim,7775213,f8e401ff1a95bba1c387a7d00239a9b10c4323ba;_xi.si,326,fcb94171f92573a5fd365b178dda96728c16da50;_xi_3.del,38,703e0294067b659a0f15f8114659b20de3d51385;_1oa.si,235,ddb8cb303fdf4372cc333a980a1755d25d64d6cb;_xi.pos,2748906,154e1ba4078865bc81f35646e36dea906e54b539;_xi.nvd,159060,abacf7b5c17963a24c9f54715ac40c1e2dd85f0b;_xi.fdt,14816174,ff821a1bde2330f1389c782a777747677206c685;_xj.cfe,224,1403e26ddc7e3d005758f3dae6c8bbf50e9a4313;_1o9.cfe,224,f7926636965fdd76c2622d57f3e24a217f230a44;_xi.nvm,46,d09f4ec10424aac4b5a2fe1da422f266aace8bca;_xj.si,232,f119a8cab06c10a0e34d87b92e16bf1b28a688f3;_xi.fdx,1272306,0c62a780b2f3c62af3014bc530d79d8144a8f014;segments.gen,20,7b46fe18999a02b4247cbcd8222034d4a2c9291c;segments_1o6,157,21acb7510ff37ef18452d622d8fcf2cdf90b9f58;_1o9.cfs,1595,d13a5088a06954752ea9ef2cc16510875fecb576;_xi.fnm,67704,fcf53c6f0a6169fe01eb7544d1a0c8118a1ef9a2;_1oa.cfs,1595,2cdbe7f26378deac710369f9c58b612c3c875260;segments_1o7,157,9e4d82f8520c49961c0e306c379bb01b2cbbce98;_xi.doc,2525612,8e52ab108ede6d344dbbf9322dbfee7396d26ab7;_1o9.si,235,58e8707ed138eb234f669aafd5416c7b78385932;_xi.tip,187482,0ead85853f2b7b1eceb4eef8290526a469bafa9c;_xj.cfs,4374,a2cda886ae2be975e61c26b830d85633e222c1dc;_1oa.cfe,224,ceadd06ff7a7ddafffde5e87acd83068924c3170;
04.11.2015 10:03:28.515 *INFO* [FelixStartLevel]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446606423597, using
04.11.2015 10:03:28.819 *INFO* [FelixStartLevel]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446606422099, using
04.11.2015 10:03:29.436 *INFO* [FelixStartLevel]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446631317105, using
04.11.2015 10:03:29.744 *INFO* [FelixStartLevel]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446631340391, using
04.11.2015 10:03:29.846 *INFO* [FelixStartLevel]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446606423630, using
04.11.2015 10:03:29.867 *INFO* [FelixStartLevel]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446606423536, using
04.11.2015 10:03:29.908 *INFO* [FelixStartLevel]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446606423673, using
04.11.2015 10:03:37.928 *INFO* [aysnc-index-update-async]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446631340391, using
04.11.2015 10:03:42.077 *INFO* [aysnc-index-update-async]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Saving due to sync [_4.cfe, _0.cfs, _0.cfe, _0_1.del, _4.si, _0.si, _4.cfs] .
04.11.2015 10:03:42.108 *INFO* [aysnc-index-update-async]
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Saving Listing state to l_1446631422077 as
_3.si,229,1a1c2c541a3cb5087d2a3c60ae77b6b05461410d;segments_4,117,a74de5ae54af2abbb2423573cd0ad93b950ec18a;_0_1.del,36,906f2506ff277e28716cc19eb8b55f289e34c53c;_4.cfe,224,bdce2bd8251981b9cb72acdc8205466e6c6262d2;_0.si,229,57e0616d14993ad2a9680f55d4151d440cad8255;_4.cfs,1188,a5a0324f5a05dfef2f57f5962240c82e9d62d88a;segments.gen,20,395c2b9ba7f05f4debb52b0a7cea8ac56ad671a2;_3.cfe,224,2596860e7bcdd550e221488708afda8729689107;_0.cfs,1431868,a7368bb6e2398a5952eddbc062a498f100a29865;segments_5,0,da39a3ee5e6b4b0d3255bfef95601890afd80709;_0.cfe,266,62127a60fe2224e32e3720eb15b2bd9f34d4670a;_3.cfs,1210,1c7764b0713716c53d1b0ed21c0063c5606aad49;_4.si,229,e50c79679300f6da51ca69bda351f838ee910015;
{code}
In this instance there appears to be an error reading data from the MongoDB
blob store (04.11.2015 10:02:23.375). That causes the SHA1 not to match and
first directory listing is not opened (04.11.2015 10:02:23.377). The second
directory listing is used (04.11.2015 10:02:24.104) and both the reader and the
async indexer continue to operate normally without errors. The error was
probably generated by using default write concern and killing the JVM and
mongoDB instances. There will have been dataloss in the index, however, the JVM
continued as did both searching and indexing.
The overnight test and restarts only caused this behaviour once.
More tests are required.
One observation. Indexing does not start until the Topology heatbeat timeout
has passed, which appears to be about 120s, meaning the index will be 120s
behind on starting an instance, until it catches up.
> Improve ability of the OakDirectory to recover from unexpected file errors
> --------------------------------------------------------------------------
>
> Key: OAK-3547
> URL: https://issues.apache.org/jira/browse/OAK-3547
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: lucene
> Affects Versions: 1.4
> Reporter: Ian Boston
>
> Currently if the OakDirectory finds that a file is missing or in some way
> damaged, and exception is thrown which impacts all queries using that index,
> at times making the index unavailable. This improvement aims to make the
> OakDirectory recover to a previously ok state by storing which files were
> involved in previous states, and giving the code some way of checking if they
> are valid.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)