Gregory Chanan created SOLR-7378:
------------------------------------
Summary: Be more conservative about loading a core when hdfs
transaction log could not be recovered
Key: SOLR-7378
URL: https://issues.apache.org/jira/browse/SOLR-7378
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 5.0
Reporter: Gregory Chanan
Today, if an HdfsTransactionLog cannot recover its lease, you get the following
warning in the log:
{code}
log.warn("Cannot recoverLease after trying for " +
    conf.getInt("solr.hdfs.lease.recovery.timeout", 900000) +
    "ms (solr.hdfs.lease.recovery.timeout); continuing, but may be DATALOSS!!!; " +
    getLogMessageDetail(nbAttempt, p, startWaiting));
{code}
from:
https://github.com/apache/lucene-solr/blob/a8c24b7f02d4e4c172926d04654bcc007f6c29d2/solr/core/src/java/org/apache/solr/util/FSHDFSUtils.java#L145-L148
But some deployments may not want to continue when data loss is possible;
they may prefer to investigate the underlying HDFS issue first. And today there
is no way to find out what happened other than reading the logs.
There's a range of possibilities here, but here are a couple of ideas:
1) a config parameter that controls whether to continue when data loss is
possible
2) load the core, but require a special flag to read potentially incorrect data
(similar to shards.tolerant; data.tolerant or something?)
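
To make idea 1 concrete, here is a minimal sketch of what a "fail instead of warn" switch could look like. Everything in it is an assumption for illustration: the property name solr.hdfs.lease.recovery.failOnPotentialDataLoss, the class LeaseRecoveryPolicy and the exception type do not exist in Solr, and java.util.Properties stands in for the Hadoop Configuration object so the example runs on its own.

```java
import java.util.Properties;

// Illustrative sketch only; not part of Solr or FSHDFSUtils.
class LeaseRecoveryPolicy {
    // Hypothetical property name, invented for this sketch.
    static final String FAIL_ON_DATALOSS =
        "solr.hdfs.lease.recovery.failOnPotentialDataLoss";

    // Hypothetical exception a caller could use to abort loading the core.
    static class LeaseNotRecoveredException extends RuntimeException {
        LeaseNotRecoveredException(String message) {
            super(message);
        }
    }

    // Defaults to false, which preserves today's "warn and continue" behavior.
    static boolean failOnPotentialDataLoss(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(FAIL_ON_DATALOSS, "false"));
    }

    // Called at the point where the current code only logs the DATALOSS warning.
    static void onLeaseNotRecovered(Properties conf, String tlogPath) {
        String detail = "Cannot recoverLease for " + tlogPath;
        if (failOnPotentialDataLoss(conf)) {
            // Strict mode: surface the problem to the caller instead of continuing.
            throw new LeaseNotRecoveredException(
                detail + "; refusing to continue (" + FAIL_ON_DATALOSS + "=true)");
        }
        // Lenient mode: same outcome as today, a warning and nothing else.
        System.err.println("WARN " + detail + "; continuing, but may be DATALOSS!!!");
    }
}
```

With the property unset, onLeaseNotRecovered only prints the warning; with it set to true, the exception gives the core-loading code a chance to stop and report the failure somewhere more visible than the log.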