[ https://issues.apache.org/jira/browse/SOLR-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441898#comment-16441898 ]
Robert Muir commented on SOLR-12232:
------------------------------------

{quote}
Understood. But we don't have to use NIO.
{quote}

Yes, use another lock factory or some alternative if you want. But this is the NIO lock factory, and it uses NIO. And its behavior is correct: it's wrong to interrupt the NIO stuff.

It is definitely OK to dictate that it's wrong to interrupt NIO stuff; we document it that way for a reason, because it's dangerous. Lock validation and other checks here are important because they prevent crazy corruption-looking cases from showing up.

Please don't shoot the messenger but fix the actual bugs instead (the perp calling interrupt on Lucene threads).

> NativeFSLockFactory loses the channel when a thread is interrupted and the SolrCore becomes unusable after
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-12232
>                 URL: https://issues.apache.org/jira/browse/SOLR-12232
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>    Affects Versions: 7.1.1
>            Reporter: Jeff Miller
>            Assignee: Erick Erickson
>            Priority: Minor
>              Labels: NativeFSLockFactory, locking
>   Original Estimate: 24h
>          Time Spent: 10m
>  Remaining Estimate: 23h 50m
>
> The condition is rare for us and seems basically a race. If a thread that is
> running just happens to have the FileChannel open for NativeFSLockFactory and
> is interrupted, the channel is closed, since it extends
> [AbstractInterruptibleChannel|https://docs.oracle.com/javase/7/docs/api/java/nio/channels/spi/AbstractInterruptibleChannel.html].
> Unfortunately this means the Solr core has to be unloaded and reopened to
> make the core usable again, as the ensureValid check throws an exception
> forever after:
>
> org.apache.lucene.store.AlreadyClosedException: FileLock invalidated by an external force: NativeFSLock(path=data/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive invalid],creationTime=2018-04-06T21:45:11Z)
>     at org.apache.lucene.store.NativeFSLockFactory$NativeFSLock.ensureValid(NativeFSLockFactory.java:178)
>     at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:43)
>     at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:113)
>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:128)
>     at org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.fieldsWriter(Lucene50StoredFieldsFormat.java:183)
>
> Proposed solution is using AsynchronousFileChannel instead, since this is
> only operating on a lock and .size method
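
For readers who want to see the mechanism from the issue description in isolation, here is a minimal sketch using only the JDK, not Lucene's actual code. It assumes a temporary file standing in for write.lock; the class name InterruptClosesChannelDemo and the method demo() are made up for illustration. The thread's interrupt status is set before an ordinary FileChannel read, which per the documented InterruptibleChannel contract should close the channel and, as a consequence, invalidate the FileLock acquired through it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; it only illustrates JDK behavior and is not part of Lucene or Solr.
final class InterruptClosesChannelDemo {

    /** Returns {channel.isOpen(), lock.isValid()} as observed after an interrupted I/O call. */
    static boolean[] demo() throws IOException {
        // Assumption: a temp file stands in for the index's write.lock.
        Path path = Files.createTempFile("write", ".lock");
        try (FileChannel channel = FileChannel.open(
                path, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock(); // exclusive lock on the whole file

            // Simulate "someone interrupted this thread" before it touches the channel.
            Thread.currentThread().interrupt();
            try {
                channel.read(ByteBuffer.allocate(1));
            } catch (ClosedByInterruptException expected) {
                // The channel was closed on our behalf; the interrupt status is still set.
            } finally {
                Thread.interrupted(); // clear the flag so later code is unaffected
            }

            // Closing the channel invalidates every lock that was acquired through it.
            return new boolean[] { channel.isOpen(), lock != null && lock.isValid() };
        } finally {
            Files.deleteIfExists(path);
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] state = demo();
        System.out.println("channel open: " + state[0] + ", lock valid: " + state[1]);
    }
}
```

On a standard JDK both values should come back false, which matches the "exclusive invalid" lock shown in the stack trace: once the channel is gone, nothing short of reopening it (in Solr's case, reloading the core) makes the lock valid again. That is why the comment above argues for fixing the code that calls interrupt rather than the lock factory.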