Great. Will merge that patch into trunk as soon as possible.

-Johan

On Thu, Jun 16, 2011 at 10:21 PM, Jennifer Hickey <[email protected]> wrote:
> Hi Johan,
> Sorry for the delay.  I was finally able to try out that patch (against 1.3) 
> on our test environment, and things are running smoothly.  I have not seen 
> the ClosedChannelException (or any others) once in 24 hours.  Previously on 
> the same system I saw it frequently, as early as 15 minutes into the uptime.  
> Thanks!
>
> Jennifer
> ________________________________________
> From: [email protected] [[email protected]] On Behalf 
> Of Johan Svensson [[email protected]]
> Sent: Thursday, May 26, 2011 3:09 AM
> To: Neo4j user discussions
> Subject: Re: [Neo4j] ClosedChannelExceptions in highly concurrent environment
>
> Hi Jennifer,
>
> Could you apply this patch to the kernel and then see if the problem
> still exists? If you want I can send you a jar but then I need to know
> what version of Neo4j you are using.
>
> Regards,
> Johan
>
>
> On Mon, May 23, 2011 at 6:50 PM, Jennifer Hickey <[email protected]> wrote:
>> Hi Tobias,
>>
>> Looks like the environment is still set up, so I should be able to attempt a
>> repro with a patched version.  Let me know what you would like me to use.
>>
>> Thanks,
>> Jennifer
>> ________________________________________
>> From: [email protected] [[email protected]] On Behalf 
>> Of Tobias Ivarsson [[email protected]]
>> Sent: Monday, May 16, 2011 11:01 PM
>> To: Neo4j user discussions
>> Subject: Re: [Neo4j] ClosedChannelExceptions in highly concurrent environment
>>
>> Hi Jennifer,
>>
>> Could you reproduce it on your side by doing the same kind of systems tests
>> again? If you could then I'd be very happy if you could try a patched
>> version that we have been working on and see if that fixes the issue.
>>
>> Cheers,
>> Tobias
>>
>> On Tue, May 17, 2011 at 2:49 AM, Jennifer Hickey <[email protected]> wrote:
>>
>>> Hi Tobias,
>>> Unfortunately I don't have an isolated test case, as I was doing a fairly
>>> involved system test at the time.  I may be able to have a colleague work on
>>> reproducing it at a later date (I've been diverted to something else for the
>>> moment).
>>>
>>> I was remote debugging with Eclipse, so I toggled a method breakpoint on
>>> Thread.interrupt() and then inspected the stack once the breakpoint was hit.
>>>
>>> Sorry I don't have more information at the moment.  I agree that
>>> eliminating the interrupts sounds like the best approach, if possible.
>>>
>>> Thanks,
>>> Jennifer
>>> ________________________________________
>>> From: [email protected] [[email protected]] On
>>> Behalf Of Tobias Ivarsson [[email protected]]
>>> Sent: Thursday, April 28, 2011 6:23 AM
>>> To: Neo4j user discussions
>>> Subject: Re: [Neo4j] ClosedChannelExceptions in highly concurrent environment
>>>
>>> Hi Jennifer,
>>>
>>> I'd first like to thank you for the testing and analysis you've done. Very
>>> useful stuff. Do you think you could send some test code our way that
>>> reproduces this issue?
>>>
>>> This is actually the first time this issue has been reported, so I wouldn't
>>> say it is a common issue. My guess is that your thread volume triggered a
>>> rare condition that wouldn't be encountered otherwise.
>>>
>>> I'm also curious to know how you found the source of the interruptions.
>>> When I debug thread interruptions I've never been able to find out where
>>> the thread got interrupted from without doing tedious procedures of
>>> breakpoint + logging + trying to match thread ids. If you have a better
>>> method for doing that I'd very much like to know.
>>>
>>> I think we should focus the effort on fixing the interruption issue if we
>>> can. And I believe we would be able to do that if the interruptions do in
>>> fact originate from where you say they do. The suggestion of being able
>>> to switch the lucene directory implementation is still interesting, but as
>>> you point out, since it has issues on some platforms it would be better if
>>> we could be rid of the interruption issue.
>>>
>>> Cheers,
>>> Tobias
>>>
>>> On Thu, Apr 28, 2011 at 12:41 AM, Jennifer Hickey <[email protected]> wrote:
>>>
>>> > Hello,
>>> > I've been running some tests w/approx 400 threads reading various indexed
>>> > property values.  I'm running on 64 bit Linux.  I was frequently seeing
>>> > the ClosedChannelException below.  The javadoc on Lucene's NIOFSDirectory
>>> > states that "Accessing this class either directly or indirectly from a
>>> > thread while it's interrupted can close the underlying file descriptor
>>> > immediately if at the same time the thread is blocked on IO. The file
>>> > descriptor will remain closed and subsequent access to {@link
>>> > NIOFSDirectory} will throw a {@link ClosedChannelException}.  If your
>>> > application uses either {@link Thread#interrupt()} or {@link
>>> > Future#cancel(boolean)} you should use {@link SimpleFSDirectory} in favor
>>> > of {@link NIOFSDirectory}."
>>> >
>>> > A bit of debugging revealed that the Thread.interrupts were coming from
>>> > Neo4j, specifically in RWLock and MappedPersistenceWindow.  So it seems
>>> > like this would be a common problem, though perhaps I am missing something?
>>> >
>>> > SimpleFSDirectory seems a bit of a performance bottleneck, so I switched
>>> > to MMapDirectory and the problem did go away.  I didn't see a way to switch
>>> > implementations w/out modifying neo4j code, so I changed LuceneDataSource
>>> > as follows:
>>> >
>>> >  static Directory getDirectory( String storeDir,
>>> >            IndexIdentifier identifier ) throws IOException
>>> >  {
>>> >        MMapDirectory dir = new MMapDirectory(
>>> >                getFileDirectory( storeDir, identifier ), null );
>>> >        if ( MMapDirectory.UNMAP_SUPPORTED ) {
>>> >            dir.setUseUnmap( true );
>>> >        }
>>> >        return dir;
>>> >  }
>>> >
>>> > So I'm wondering if others have seen this problem and/or if there is a
>>> > recommended solution?  Our product runs on quite a few different
>>> > operating systems, so I have some reservations about using MMapDirectory
>>> > as well (javadoc speaks of a few caveats on Windows, 64 vs 32, etc). Also,
>>> > I'd rather not maintain a patched version of the neo4j code if avoidable.
>>> >
>>> > Thanks!
>>> > Jennifer
>>> >
>>> > Exception:
>>> > Caused by: java.nio.channels.ClosedChannelException
>>> > at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:88)
>>> > at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:613)
>>> > at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:161)
>>> > at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:139)
>>> > at org.apache.lucene.index.CompoundFileReader$CSIndexInput.readInternal(CompoundFileReader.java:285)
>>> > at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:160)
>>> > at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
>>> > at org.apache.lucene.store.DataInput.readVInt(DataInput.java:86)
>>> > at org.apache.lucene.index.codecs.DeltaBytesReader.read(DeltaBytesReader.java:40)
>>> > at org.apache.lucene.index.codecs.PrefixCodedTermsReader$FieldReader$SegmentTermsEnum.next(PrefixCodedTermsReader.java:469)
>>> > at org.apache.lucene.index.codecs.PrefixCodedTermsReader$FieldReader$SegmentTermsEnum.seek(PrefixCodedTermsReader.java:385)
>>> > at org.apache.lucene.index.TermsEnum.seek(TermsEnum.java:68)
>>> > at org.apache.lucene.index.Terms.docFreq(Terms.java:53)
>>> > at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:898)
>>> > at org.apache.lucene.index.IndexReader.docFreq(IndexReader.java:882)
>>> > at org.apache.lucene.index.DirectoryReader.docFreq(DirectoryReader.java:687)
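
The behaviour described in the NIOFSDirectory javadoc quoted in this thread
comes from java.nio itself and can be seen without Lucene or Neo4j. What
follows is only a minimal sketch, assuming a plain JDK; the class name
InterruptClosesChannel and the method readWhileInterrupted are hypothetical
and exist only for this illustration. A thread whose interrupt status is set
reads from a FileChannel; the first read closes the channel, and a later read
on the same channel fails in the way the top frame of the trace
(FileChannelImpl.ensureOpen) suggests.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class InterruptClosesChannel {

    /**
     * Reads twice from one FileChannel while the calling thread's interrupt
     * status is set, and returns the simple class name of the exception seen
     * on each read ("none" if a read succeeded).
     */
    static String[] readWhileInterrupted() throws IOException {
        Path file = Files.createTempFile("interrupt-demo", ".bin");
        Files.write(file, new byte[] {1, 2, 3, 4});
        String[] seen = new String[2];
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // Simulate an interrupt arriving from elsewhere (in the thread
            // above it was reported to come from RWLock / MappedPersistenceWindow).
            Thread.currentThread().interrupt();
            for (int i = 0; i < seen.length; i++) {
                try {
                    channel.read(ByteBuffer.allocate(4));
                    seen[i] = "none";
                } catch (ClosedChannelException e) {
                    // ClosedByInterruptException is a subclass, so it lands here too.
                    seen[i] = e.getClass().getSimpleName();
                }
            }
        } finally {
            Thread.interrupted();        // clear the interrupt status again
            Files.deleteIfExists(file);  // remove the temporary file
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        String[] seen = readWhileInterrupted();
        System.out.println("first read:  " + seen[0]);
        System.out.println("second read: " + seen[1]);
    }
}
```

The channel is shared state, so once one interrupted thread has closed it,
every other thread using the same channel sees the failure as well. That would
fit the report of the exception showing up frequently with around 400 reader
threads.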
_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user
