[ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150775#comment-13150775
 ] 

Michael McCandless commented on LUCENE-3235:
--------------------------------------------


bq. In my opinion we should not change our code to work around that issue.

In general, I think we should change our code to work around awful JVM
bugs, as long as 1) it's not so much effort for us to do so (and as
always a volunteer steps up to the task), and 2) the change has
negligible cost to "lucky" users (who use a JVM / the right flags that
would not have hit the JVM bug).

I think the last patch fits these criteria, since it's a tiny change
and it scopes the workaround?

We've done this many times in the past; if the cost to "lucky" users
is negligible and the benefit to "unlucky" users (unknowingly using
the affected JVMs) is immense (not hitting a horrific bug), I think the
tradeoff is worthwhile?  Otherwise users will conclude Lucene (or
whatever software is embedding it) is buggy.

bq. This testcase fails, but we are using concurrent also in 
ParallelMultiSearcher (die, die, die) and other places (even the indexer was 
partly upgraded to use ConcurrentLock).

Right, we use concurrent* elsewhere, but terms dict is the big
user... very few apps actually use PMS.

bq. It brings a false security and slows down VMs that work correctly.

Well, we already have "false security" that Lucene won't hang on any
JVM... we don't claim this patch will fully work around the bug, but
at least it should reduce it.

How are we slowing down other VMs...?  We scope the workaround?
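
To make "scoping" concrete, here is a minimal sketch of the idea, not the
attached patch: only JVMs assumed to be affected get a plain synchronized
map, and every other JVM keeps ConcurrentHashMap. The class name, the method
names and the "1.5." version-prefix check are all hypothetical and chosen
only for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not Lucene code: picks a map implementation per JVM.
final class ScopedWorkaround {

    // Assumption for illustration: every 1.5.x runtime is treated as affected.
    static boolean isAffectedJvm(String javaVersion) {
        return javaVersion != null && javaVersion.startsWith("1.5.");
    }

    // Affected JVMs get a synchronized HashMap; all others keep
    // ConcurrentHashMap, so "lucky" users see no behavior change.
    static <K, V> Map<K, V> newCacheMap(String javaVersion) {
        if (isAffectedJvm(javaVersion)) {
            return Collections.synchronizedMap(new HashMap<K, V>());
        }
        return new ConcurrentHashMap<K, V>();
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = newCacheMap(System.getProperty("java.version"));
        cache.put("term", 1);
        System.out.println(cache.getClass().getSimpleName());
    }
}
```

With a check like this the cost on unaffected JVMs is one string comparison
at construction time; the hot path is untouched.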

I'm not saying we should go crazy here, making a big patch to avoid
concurrent* everywhere, but the current patch is minimal, addresses
the big usage of concurrent* in 3.x, and is scoped down well.

It will avoid hangs for some number of unlucky users out there... so why
not commit it?

                
> TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
> ----------------------------------------------------------------------------
>
>                 Key: LUCENE-3235
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3235
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 3.0, 3.1, 3.2, 3.3, 3.4
>            Reporter: Michael McCandless
>             Fix For: 3.5
>
>         Attachments: LUCENE-3235.patch, LUCENE-3235.patch, LUCENE-3235.patch
>
>
> Not sure what's going on yet... but under Java 1.6 it seems not to hang, while 
> under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
> I suspect this is relevant: 
> http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
>  which refers to this JVM bug 
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
> to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
> It looks like that last bug was fixed in Java 1.6 but not 1.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
