Very interesting, I didn't know the negative value was an option. Don't replace all occurrences though, as in some cases System.currentTimeMillis() is more appropriate; you can find some interesting discussion here: http://lists.jboss.org/pipermail/infinispan-dev/2011-October/009277.html
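
As a minimal sketch of that distinction (the class and method names below are made up for illustration, not taken from the linked discussion): wall-clock timestamps want currentTimeMillis(), elapsed-time measurements within one JVM want nanoTime().

    import java.util.concurrent.TimeUnit;

    public class ClockChoice {

        // Wall-clock timestamps (log entries, entry creation times, anything
        // compared across JVMs) need System.currentTimeMillis(): it maps to a
        // real calendar date, but it can jump when NTP adjusts the clock.
        static long entryCreationTimestamp() {
            return System.currentTimeMillis();
        }

        // Elapsed time inside a single JVM (timeouts, deadlines, profiling)
        // should use System.nanoTime(): it is monotonic, but its absolute value
        // is meaningless and may be negative, so only differences are usable.
        static long elapsedMillis(long startNanos) {
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
        }
    }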
I'm having a test shutdown "hung" right now; all other nodes stopped minutes ago, but the following thread is still stuck. Could it be the same problem? I'm extremely surprised I could hit it.

java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x000000071b442558> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163)
    at org.jgroups.blocks.Request.responsesComplete(Request.java:197)
    at org.jgroups.blocks.Request.execute(Request.java:89)
    at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:406)
    at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:370)
    at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:167)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:536)
    at org.infinispan.topology.LocalTopologyManagerImpl.executeOnCoordinator(LocalTopologyManagerImpl.java:324)
    at org.infinispan.topology.LocalTopologyManagerImpl.leave(LocalTopologyManagerImpl.java:128)
    at org.infinispan.statetransfer.StateTransferManagerImpl.stop(StateTransferManagerImpl.java:236)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
    at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
    at org.infinispan.factories.AbstractComponentRegistry.internalStop(AbstractComponentRegistry.java:674)
    at org.infinispan.factories.AbstractComponentRegistry.stop(AbstractComponentRegistry.java:552)
    - locked <0x00000007195f4dc8> (a org.infinispan.factories.ComponentRegistry)
    at org.infinispan.factories.ComponentRegistry.stop(ComponentRegistry.java:241)
    at org.infinispan.cache.impl.CacheImpl.stop(CacheImpl.java:782)
    at org.infinispan.cache.impl.CacheImpl.stop(CacheImpl.java:777)
    at org.infinispan.query.helper.TestableCluster$Node.kill(TestableCluster.java:270)
    at org.infinispan.query.helper.TestableCluster$Node.access$2(TestableCluster.java:265)
    at org.infinispan.query.helper.TestableCluster.killAll(TestableCluster.java:94)

-- Sanne

On 10 September 2014 14:08, Bela Ban <b...@redhat.com> wrote:
>
>
> On 10/09/14 13:58, Alan Field wrote:
>> Hey Bela,
>>
>>> Just a quick heads up. I'm currently working on
>>> https://issues.jboss.org/browse/JGRP-1877, which is critical as it may:
>>> - cause RPCs to return prematurely (possibly with a TimeoutException), or
>>> - cause RPCs to block for a long time
>>> (pick which one is worse :-))
>>
>> How frequently can these errors occur? Is this something that is not
>> very likely to happen, or something that requires an external action
>> to trigger it (e.g. changing the time via NTP)? Just trying to
>> determine the priority of this issue.
>
>
> Changing the system time will definitely screw up code that relies on
> System.currentTimeMillis(). Once I replace this with nanoTime(), this
> problem should be eliminated.
>
> The nanoTime() problem is that an 'origin' chosen by the JVM can be in
> the future, so all calls to nanoTime() return negative values. Or, if
> positive, the long can wrap around due to numeric overflow and become
> negative.
>
> Once this happens, all RPCs (for example) will return immediately,
> without any response, or throw TimeoutExceptions. This will last for
> 292 years... :-)
>
>
>> Thanks,
>> Alan
>>
>> ----- Original Message -----
>>> From: "Bela Ban" <b...@redhat.com>
>>> To: infinispan-dev@lists.jboss.org
>>> Sent: Wednesday, September 10, 2014 12:05:11 PM
>>> Subject: [infinispan-dev] JGRP-1877
>>>
>>> Just a quick heads up. I'm currently working on
>>> https://issues.jboss.org/browse/JGRP-1877, which is critical as it may:
>>> - cause RPCs to return prematurely (possibly with a TimeoutException), or
>>> - cause RPCs to block for a long time
>>> (pick which one is worse :-))
>>>
>>> This is due to my misunderstanding of the semantics of
>>> System.nanoTime(). I frequently have code like this, which computes
>>> a future deadline for a timeout:
>>>
>>> long wait_time=TimeUnit.NANOSECONDS.convert(timeout, TimeUnit.MILLISECONDS);
>>> final long target_time=System.nanoTime() + wait_time;
>>> while(wait_time > 0 && !hasResult) {  /* Wait for responses: */
>>>     wait_time=target_time - System.nanoTime();
>>>     if(wait_time > 0) {
>>>         try {cond.await(wait_time, TimeUnit.NANOSECONDS);} catch(Exception e) {}
>>>     }
>>> }
>>> if(!hasResult && wait_time <= 0)
>>>     throw new TimeoutException();
>>>
>>> Variable target_time can possibly become *negative* if nanoTime()
>>> returns a negative value. If so, hasResult is false and wait_time is
>>> negative, and therefore a TimeoutException would be thrown!
>>>
>>> While I'm at it, I'll also fix my uses of System.currentTimeMillis()
>>> and replace them with nanoTime(). Our good friend Erik has run into
>>> issues with RPCs (using currentTimeMillis()) hanging forever when
>>> their NTP-based servers adjusted the time... backwards!
>>>
>>> Please be aware of nanoTime() in your own code, e.g.
>>>
>>> long t0=nanoTime();
>>> ...
>>> long t1=nanoTime();
>>>
>>> It is *not* guaranteed that t1 > t0, because of numeric overflow
>>> (t0 might be Long.MAX_VALUE-1 and t1 Long.MAX_VALUE+2!). The only
>>> correct way to compare them is t1 - t0 > 0 (t1 is more recent) or
>>> t1 - t0 < 0 (t0 is more recent).
>>>
>>> Just thought I wanted to pass this on, in case somebody made the
>>> same stupid mistake...
>>>
>>> Thanks to David Lloyd for pointing this out!
>>>
>>> --
>>> Bela Ban, JGroups lead (http://www.jgroups.org)
>>
>
> --
> Bela Ban, JGroups lead (http://www.jgroups.org)

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev
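
For reference, a minimal sketch of the overflow-safe deadline pattern Bela describes above, written against his snippet's placeholders (hasResult, cond); the class, field and method names are made up for illustration and are not the actual JGroups code:

    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;
    import java.util.concurrent.locks.Condition;
    import java.util.concurrent.locks.ReentrantLock;

    public class DeadlineExample {
        private final ReentrantLock lock = new ReentrantLock();
        private final Condition cond = lock.newCondition();
        private boolean hasResult;

        /** Waits up to timeoutMs for a result, using only nanoTime() differences. */
        void awaitResult(long timeoutMs) throws InterruptedException, TimeoutException {
            // The deadline may wrap past Long.MAX_VALUE or start out negative;
            // that is fine as long as we only ever look at the sign of
            // (deadline - now), never at the raw values themselves.
            final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            lock.lock();
            try {
                while (!hasResult) {
                    long remaining = deadline - System.nanoTime(); // wrap-around safe
                    if (remaining <= 0)
                        throw new TimeoutException();
                    cond.awaitNanos(remaining);
                }
            } finally {
                lock.unlock();
            }
        }

        /** Called by whoever delivers the result. */
        void signalResult() {
            lock.lock();
            try {
                hasResult = true;
                cond.signalAll();
            } finally {
                lock.unlock();
            }
        }
    }

The point is that the loop never compares two absolute nanoTime() values with < or >; it only checks the sign of their difference, which stays correct across a negative origin or a numeric wrap-around.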