[ 
https://issues.apache.org/jira/browse/SOLR-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441898#comment-16441898
 ] 

Robert Muir commented on SOLR-12232:
------------------------------------

{quote}
Understood. But we don't have to use NIO.
{quote}

Yes, use another lock factory or some alternative if you want. But this is the
NIO lock factory, and, well, it uses NIO. And its behavior is correct: it's
wrong to interrupt the NIO stuff. It is definitely OK to dictate that it's
wrong to interrupt NIO stuff; we document it that way for a reason, because
it's dangerous.

Lock validation and other checks here are important because they prevent
crazy corruption-looking cases from showing up. Please don't shoot the
messenger but fix the actual bugs instead (the perp calling interrupt on Lucene
threads).
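
For anyone who wants to see the failure mode outside of Solr, here is a
minimal sketch in plain JDK code. It is not Lucene's implementation: the class
name, the temp file, and the demo() helper are made up for illustration. It
assumes only the documented behavior of FileChannel, namely that an
interruptible channel operation on a thread whose interrupt flag is set closes
the channel, and that closing the channel invalidates its locks.

```java
import java.io.IOException;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of Lucene or Solr.
final class InterruptClosesLockChannel {

    /** Returns {lock valid before interrupt, channel open after, lock valid after}. */
    static boolean[] demo() throws IOException {
        Path lockFile = Files.createTempFile("write", ".lock");
        FileChannel channel = FileChannel.open(
                lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        try {
            FileLock lock = channel.tryLock();
            boolean validBefore = lock != null && lock.isValid();

            // Simulate some other code interrupting the thread that holds the lock.
            Thread.currentThread().interrupt();
            try {
                // Any interruptible channel operation will do; size() is used
                // here because the issue description mentions it.
                channel.size();
            } catch (ClosedByInterruptException expected) {
                // The channel has been closed on our behalf.
            } finally {
                Thread.interrupted(); // clear the flag so cleanup runs normally
            }

            boolean openAfter = channel.isOpen();
            boolean validAfter = lock != null && lock.isValid();
            return new boolean[] { validBefore, openAfter, validAfter };
        } finally {
            channel.close(); // no-op if the channel is already closed
            Files.deleteIfExists(lockFile);
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] r = demo();
        System.out.println("lock valid before interrupt: " + r[0]);
        System.out.println("channel open after interrupt: " + r[1]);
        System.out.println("lock valid after interrupt:   " + r[2]);
    }
}
```

Once the channel is gone the lock cannot be revived, which is why the fix
belongs in whatever code issues the interrupt and not in the lock check.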


> NativeFSLockFactory loses the channel when a thread is interrupted and the 
> SolrCore becomes unusable after
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-12232
>                 URL: https://issues.apache.org/jira/browse/SOLR-12232
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 7.1.1
>            Reporter: Jeff Miller
>            Assignee: Erick Erickson
>            Priority: Minor
>              Labels: NativeFSLockFactory, locking
>   Original Estimate: 24h
>          Time Spent: 10m
>  Remaining Estimate: 23h 50m
>
> The condition is rare for us and seems basically a race.  If a thread that is 
> running just happens to have the FileChannel open for NativeFSLockFactory and 
> is interrupted, the channel is closed since it extends 
> [AbstractInterruptibleChannel|https://docs.oracle.com/javase/7/docs/api/java/nio/channels/spi/AbstractInterruptibleChannel.html].
> Unfortunately, this means the Solr core has to be unloaded and reopened to 
> make the core usable again, as the ensureValid check throws an exception 
> forever after.
> org.apache.lucene.store.AlreadyClosedException: FileLock invalidated by an external force: NativeFSLock(path=data/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive invalid],creationTime=2018-04-06T21:45:11Z)
>     at org.apache.lucene.store.NativeFSLockFactory$NativeFSLock.ensureValid(NativeFSLockFactory.java:178)
>     at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:43)
>     at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:113)
>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:128)
>     at org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.fieldsWriter(Lucene50StoredFieldsFormat.java:183)
>  
> The proposed solution is to use AsynchronousFileChannel instead, since this 
> code only takes a lock and calls the size() method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
