[ https://issues.apache.org/jira/browse/LUCENE-6381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-6381:
---------------------------------------
    Attachment: LUCENE-6381.patch

Simple patch, one line change.

I'd like to backport to 5.1... outright hangs are bad.

This is just a defensive step ... separately, we have some concurrency bug where a .notify/All() was not sent.

> DocumentsWriterStallControl's .wait() should have a time limit
> --------------------------------------------------------------
>
>                 Key: LUCENE-6381
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6381
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: Trunk, 5.1
>
>         Attachments: LUCENE-6381.patch
>
>
> This build was hung:
> http://build-us-00.elastic.co/job/es_core_15_centos/230/testReport/junit/org.elasticsearch.index.engine/InternalEngineTests/testDeletesAloneCanTriggerRefresh/
>
> Only one thread was stalled in DocumentsWriterStallControl, which means we
> have a bug somewhere, because that thread should have un-stalled once the
> other (too many) threads finished flushing their segments.
>
> I think we should make a simple defensive change here: instead of wait(),
> which waits forever for a .notify/All() to wake it up, we should wait for up
> to a time limit. This way when any concurrency bug like this strikes, we
> won't hang forever.
>
> I cannot reproduce that particular hang... what's unique about that test is
> it uses a positively minuscule (1 KB) IW buffer.
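
For illustration only, here is a minimal Java sketch of the kind of bounded wait the issue describes. It is not the attached patch and not Lucene's actual DocumentsWriterStallControl code: the class name TimedStallSketch, the methods setStalled and waitIfStalled, and the 200 ms limit are all hypothetical. The only real API it relies on is Object.wait(long), which returns after the given number of milliseconds even if no notify()/notifyAll() ever arrives.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not Lucene code: a stall flag whose waiters give up
// after a time limit instead of blocking forever on a lost notification.
public class TimedStallSketch {
    private boolean stalled; // guarded by the monitor of "this"

    // Sets the flag and wakes all waiters when the stall is lifted.
    public synchronized void setStalled(boolean stalled) {
        this.stalled = stalled;
        if (!stalled) {
            notifyAll();
        }
    }

    // Blocks while stalled, but for at most maxWaitMillis in total.
    // Returns true if the flag is still set when the method returns,
    // so the caller can decide whether to stall again.
    public synchronized boolean waitIfStalled(long maxWaitMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        while (stalled) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                break; // time limit reached: stop waiting
            }
            // wait(0) would mean "wait forever", so never pass less than 1 ms.
            long millis = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            wait(millis);
        }
        return stalled;
    }

    public static void main(String[] args) throws InterruptedException {
        TimedStallSketch control = new TimedStallSketch();
        control.setStalled(true);
        // Nobody ever calls setStalled(false): this simulates the missing notify.
        boolean stillStalled = control.waitIfStalled(200);
        System.out.println("returned, still stalled: " + stillStalled);
    }
}
```

With a plain wait() the main method above would never return. With the bounded wait it comes back after roughly 200 ms and reports that the stall flag is still set, which leaves it to the caller to re-check and stall again if needed.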