[
https://issues.apache.org/jira/browse/LUCENE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Simon Willnauer resolved LUCENE-4989.
-------------------------------------
Resolution: Fixed
fixed via LUCENE-5002
> Hanging on DocumentsWriterStallControl.waitIfStalled forever
> ------------------------------------------------------------
>
> Key: LUCENE-4989
> URL: https://issues.apache.org/jira/browse/LUCENE-4989
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/index
> Affects Versions: 4.1
> Environment: Linux 2.6.32
> Reporter: Jessica Cheng
> Assignee: Simon Willnauer
> Labels: hang
> Fix For: 5.0, 4.3.1
>
>
> In an environment where our underlying storage was timing out on various
> operations, we find all of our indexing threads eventually stuck in the
> following state (so far for 4 days):
> "Thread-0" daemon prio=5 Thread id=556 WAITING
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:503)
> at
> org.apache.lucene.index.DocumentsWriterStallControl.waitIfStalled(DocumentsWriterStallControl.java:74)
> at
> org.apache.lucene.index.DocumentsWriterFlushControl.waitIfStalled(DocumentsWriterFlushControl.java:676)
> at
> org.apache.lucene.index.DocumentsWriter.preUpdate(DocumentsWriter.java:301)
> at
> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:361)
> at
> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1484)
> at ...
> I have not yet enabled detail logging and tried to reproduce yet, but looking
> at the code, I see that DWFC.abortPendingFlushes does
> try {
> dwpt.abort();
> doAfterFlush(dwpt);
> } catch (Throwable ex) {
> // ignore - keep on aborting the flush queue
> }
> (and the same for the blocked ones). Since the throwable is ignored, I can't
> say for sure, but I've seen DWPT.abort thrown in other cases, so if it does
> throw, we'd fail to call doAfterFlush and properly decrement flushBytes. This
> can be a problem, right? Is it possible to do this instead:
> try {
> dwpt.abort();
> } catch (Throwable ex) {
> // ignore - keep on aborting the flush queue
> } finally {
> try {
> doAfterFlush(dwpt);
> } catch (Throwable ex2) {
> // ignore - keep on aborting the flush queue
> }
> }
> It's ugly but safer. Otherwise, maybe at least add logging for the throwable
> just to make sure this is/isn't happening.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]