[
https://issues.apache.org/jira/browse/LUCENE-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974087#comment-13974087
]
Michael McCandless commented on LUCENE-5612:
--------------------------------------------
Hmm, when I run "ant test-lock-factory" on Linux it sometimes passes
and sometimes fails. I even upped the count to 20000 and ... it still
sometimes passes. Not sure why. I do see the two clients printing
out X% done, interleaved ... oh I see, we set delay to 1 msec; if I
change that to 0 it always fails!
Can we remove this delay option (just don't sleep)? If you want to
have some confidence locking is working, you should fully stress it
out.
Should we use lockedID=-1 for the "no lock held" case? What if a
client id is 0? Won't this confuse LockVerifyServer?
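
To make the concern concrete, here is a minimal sketch of owner tracking with a -1 sentinel. It is not the actual LockVerifyServer code; the class name LockOwnerTracker and its methods are hypothetical, and only the lockedID name and the "lock was double acquired" message come from this issue.

```java
// Hypothetical sketch, not the real LockVerifyServer: tracks which client
// currently holds the lock, using -1 to mean "no lock held".
final class LockOwnerTracker {
    // Assumption: client ids are non-negative, so -1 can never collide with one.
    static final int NO_LOCK = -1;

    private int lockedID = NO_LOCK;

    // Records that clientId obtained the lock; fails if someone already holds it.
    synchronized void acquire(int clientId) {
        if (clientId < 0) {
            throw new IllegalArgumentException("client id must be >= 0: " + clientId);
        }
        if (lockedID != NO_LOCK) {
            throw new IllegalStateException(
                "lock was double acquired: held by " + lockedID + ", requested by " + clientId);
        }
        lockedID = clientId;
    }

    // Records that clientId released the lock; fails if it was not the holder.
    synchronized void release(int clientId) {
        if (lockedID != clientId) {
            throw new IllegalStateException(
                "lock released by " + clientId + " but held by " + lockedID);
        }
        lockedID = NO_LOCK;
    }

    synchronized int owner() {
        return lockedID;
    }

    public static void main(String[] args) {
        LockOwnerTracker tracker = new LockOwnerTracker();
        tracker.acquire(0); // client id 0 is a real owner, not "unlocked"
        System.out.println("owner = " + tracker.owner());
        tracker.release(0);
        System.out.println("owner = " + tracker.owner());
    }
}
```

With 0 as the sentinel instead, a lock held by client 0 would look the same as no lock held, and a second acquire would go undetected.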
{quote}
bq. As we bind and listen on 127.0.0.1 there is no need to pass the host to the
lock verifier.
Now how will Mike be able to test NFS :)
{quote}
Can we put back host/IP interface?
I think it's useful to validate your locking is working OK if you
store the Lucene index on a remote filesystem and you "rely" on this
locking to pick a machine to write to the index. Admittedly this is
not a recommended way to use Lucene... but at least this tool (today)
can be used to make sure locking is working if you do so... we can
still set it to 127.0.0.1 in build.xml to keep Uwe's firewall happy.
Would it be better if we didn't hard-code the port the server binds to?
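
A rough sketch of what I mean, using only java.net: bind to port 0 so the OS picks a free port, and take the host as a parameter so it can be something other than 127.0.0.1. The class name EphemeralPortServer and the bind helper are made up for illustration; this is not the patch's code.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical sketch: a server socket bound to a caller-supplied host
// and an OS-assigned (ephemeral) port.
final class EphemeralPortServer {

    // Port 0 asks the OS for any free port on the given interface.
    static ServerSocket bind(String host) throws IOException {
        ServerSocket socket = new ServerSocket();
        socket.setReuseAddress(true);
        socket.bind(new InetSocketAddress(InetAddress.getByName(host), 0));
        return socket;
    }

    public static void main(String[] args) throws IOException {
        // Default to loopback; pass another host/IP to test across machines.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        try (ServerSocket socket = bind(host)) {
            // The clients would need to be told this port once it is known.
            System.out.println("listening on " + host + ":" + socket.getLocalPort());
        }
    }
}
```

The server would then have to report the chosen port (e.g. print it) so whatever spawns the clients can pass it along.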
I think it's sort of weird to add the complexity of the timeouts to
the server, the waiting to make sure the server is started, etc.: you can
just start the server first, see it's started, then spawn children;
i.e., it seems like we are pushing complexity down into the tester
tools just to work around limitations of ant's sub-process
handling... vs just letting Python handle this.
But I like the other cleanups like try-with-resources.
> LockStressTest fails always with NativeFSLockFactory
> ----------------------------------------------------
>
> Key: LUCENE-5612
> URL: https://issues.apache.org/jira/browse/LUCENE-5612
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Priority: Blocker
> Fix For: 4.8
>
> Attachments: LUCENE-5612-instant-crush.patch,
> LUCENE-5612-instant-crush.patch,
> LUCENE-5612-more-sophisticated-crusher.patch,
> LUCENE-5612-more-sophisticated-crusher.patch,
> LUCENE-5612-more-sophisticated-crusher.patch, LUCENE-5612-tester.patch,
> LUCENE-5612-tester.patch, LUCENE-5612.patch
>
>
> I was looking at this because I wanted to remove the static map inside
> NativeFSLockFactory (no particular reason: it just smells bad, we require
> Java 7, and you get OverlappingFileLockException as of Java 6, so it's unnecessary).
> Before changing any code, I wanted to run LockStressTest first, just to
> ensure it works, but it always fails. SimpleFSLockFactory always works fine.
> Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: lock was double acquired
>   at org.apache.lucene.store.VerifyingLockFactory$CheckedLock.verify(VerifyingLockFactory.java:67)