On 10 Jun 2011, at 12:23, Dan Berindei wrote:

> We may have a deadlock because we hold the processing lock (for
> reading) while invoking commands remotely (in JGroupsTransport). The
> remote commands might block waiting for the remote cache to start, and
> the remote cache won't start because it is waiting for this cache to
> acquire the processing lock (for writing) and send the state.
> Do we really need to hold the processing lock while invoking remote commands?

Yes, since this processing lock is what stops further commands from being 
handled while a rehash is in progress or state is being generated.  Otherwise a 
node's in-memory state becomes a moving target, and generating state to send to 
a neighbour node becomes an issue.

Some of this is duplicated by the TransactionLogger we use in DIST, so maybe 
for DIST this is no longer necessary, but it does need to be considered 
carefully before removing.
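
As a rough illustration of why the state sender times out (compare the trace 
quoted further down: one read lock held, write lock unlocked), here is a 
minimal, self-contained sketch using a plain ReentrantReadWriteLock. The class 
and method names are made up for the example. This is not the JGroupsDistSync 
code, only the same locking pattern, under the assumption that a command thread 
keeps the read lock while it is blocked on a remote call.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not part of Infinispan.
final class ProcessingLockDemo {

    /**
     * Returns true if a second thread manages to take the write lock within
     * timeoutMillis. When readLockHeld is true, the calling thread keeps the
     * read lock for the whole attempt, standing in for a command that is
     * blocked in a remote invocation.
     */
    static boolean writerAcquires(boolean readLockHeld, long timeoutMillis) {
        ReentrantReadWriteLock processingLock = new ReentrantReadWriteLock();
        boolean[] acquired = {false};

        if (readLockHeld) {
            processingLock.readLock().lock();
        }
        try {
            // Stands in for the state transfer thread that needs exclusive access.
            Thread stateSender = new Thread(() -> {
                try {
                    if (processingLock.writeLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                        acquired[0] = true;
                        processingLock.writeLock().unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            stateSender.start();
            // The reader does not release the lock until the writer has given up or succeeded.
            stateSender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            if (readLockHeld) {
                processingLock.readLock().unlock();
            }
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println("read lock held: " + writerAcquires(true, 200));
        System.out.println("read lock released: " + writerAcquires(false, 200));
    }
}
```

With the read lock held the writer gives up after the timeout. Without it the 
writer gets in immediately. In the real case the reader is itself waiting on 
the remote node, which is what would turn the timeout into a deadlock.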


> Dan
> On Fri, Jun 10, 2011 at 12:42 PM, Sanne Grinovero <sa...@infinispan.org> 
> wrote:
>> 2011/6/10 Galder Zamarreño <gal...@redhat.com>:
>>> On Jun 9, 2011, at 5:47 PM, Manik Surtani wrote:
>>>> +1 to writing the error marker to the stream. At least prevent false 
>>>> alarms.
>>>> Re: unit testing our externalizers, Galder, any thoughts there?
>>> The debugging done by Sanne/Dan seems to be correct.
>>> EOFException is simply saying: "hey, i'm expecting all these bytes but the 
>>> stream finished before I could read them all"
>>> This generally means that the side generating the stream encountered an 
>>> issue, and that's precisely what happens on the generation side.
>>> The receiver side cannot do much here other than say: "hey, i don't have 
>>> all the bytes" - and that's precisely what the EOFException is doing.
>>> I think the error marker could be complicated to implement (i.e. imagine 
>>> expecting to read a byte and instead getting an ERROR marker). What would 
>>> be much easier to do is for VersionAwareMarshaller or 
>>> GenericJBossMarshaller to provide more hints about what's wrong. So, they 
>>> could hide the inner details of the EOFException and launder it into 
>>> something that's clearer to the user:
>>> "The stream ended unexpectedly, please check for any errors where the 
>>> stream was generated"
>>> The right exception here is still an EOFException.
>>> That's all the receiver side can do.
>> Agreed that's reasonable on the receiver side, and of course we can't
>> control the network so while we can't always prevent it, can we still
>> try to not send incomplete streams from the sending side?
>> Sanne
>>>> Sent from my mobile phone
>>>> On 9 Jun 2011, at 16:24, Dan Berindei <dan.berin...@gmail.com> wrote:
>>>>> I don't think it's an externalizer issue, as I also see some
>>>>> exceptions on the node that generates state:
>>>>> 2011-06-09 18:16:18,250 ERROR
>>>>> [org.infinispan.remoting.transport.jgroups.JGroupsTransport]
>>>>> (STREAMING_STATE_TRANSFER-sender-1,Infinispan-Cluster,NodeA-57902)
>>>>> ISPN00095: Caught while responding to state transfer request
>>>>> org.infinispan.statetransfer.StateTransferException:
>>>>> java.util.concurrent.TimeoutException:
>>>>> STREAMING_STATE_TRANSFER-sender-1,Infinispan-Cluster,NodeA-57902 could
>>>>> not obtain exclusive processing lock after 10 seconds.  Locks in
>>>>> question are 
>>>>> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock@a35c90[Read
>>>>> locks = 1] and 
>>>>> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock@111fb7f[Unlocked]
>>>>>       at 
>>>>> org.infinispan.statetransfer.StateTransferManagerImpl.generateState(StateTransferManagerImpl.java:177)
>>>>>       at 
>>>>> org.infinispan.remoting.InboundInvocationHandlerImpl.generateState(InboundInvocationHandlerImpl.java:248)
>>>>>       at 
>>>>> org.infinispan.remoting.transport.jgroups.JGroupsTransport.getState(JGroupsTransport.java:585)
>>>>>       at 
>>>>> org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUpEvent(MessageDispatcher.java:690)
>>>>>       at 
>>>>> org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771)
>>>>>       at org.jgroups.JChannel.up(JChannel.java:1484)
>>>>>       at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1074)
>>>>>       at 
>>>>> org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER$StateProviderHandler.process(STREAMING_STATE_TRANSFER.java:651)
>>>>>       at 
>>>>> org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER$StateProviderThreadSpawner$1.run(STREAMING_STATE_TRANSFER.java:580)
>>>>>       at 
>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>>>       at 
>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>>>       at java.lang.Thread.run(Thread.java:636)
>>>>> Caused by: java.util.concurrent.TimeoutException:
>>>>> STREAMING_STATE_TRANSFER-sender-1,Infinispan-Cluster,NodeA-57902 could
>>>>> not obtain exclusive processing lock after 10 seconds.  Locks in
>>>>> question are 
>>>>> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock@a35c90[Read
>>>>> locks = 1] and 
>>>>> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock@111fb7f[Unlocked]
>>>>>       at 
>>>>> org.infinispan.remoting.transport.jgroups.JGroupsDistSync.acquireProcessingLock(JGroupsDistSync.java:100)
>>>>>       at 
>>>>> org.infinispan.statetransfer.StateTransferManagerImpl.generateTransactionLog(StateTransferManagerImpl.java:204)
>>>>>       at 
>>>>> org.infinispan.statetransfer.StateTransferManagerImpl.generateState(StateTransferManagerImpl.java:167)
>>>>>       ... 11 more
>>>>> I guess we could write an error marker in the stream to prevent the
>>>>> EOFException on the receiving side, but the end result would be the
>>>>> same.
>>>>> Dan
>>>>> On Thu, Jun 9, 2011 at 5:58 PM, Sanne Grinovero <sa...@infinispan.org> 
>>>>> wrote:
>>>>>> Hello all,
>>>>>> if I happen to look at the console while the tests are running, I see
>>>>>> this exception pop up very often:
>>>>>> 2011-06-09 15:32:18,092 ERROR [JGroupsTransport]
>>>>>> (Incoming-1,Infinispan-Cluster,NodeB-32230) ISPN00096: Caught while
>>>>>> requesting or applying state
>>>>>> org.infinispan.statetransfer.StateTransferException:
>>>>>> java.io.EOFException: Read past end of file
>>>>>>       at 
>>>>>> org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:333)
>>>>>>       at 
>>>>>> org.infinispan.remoting.InboundInvocationHandlerImpl.applyState(InboundInvocationHandlerImpl.java:230)
>>>>>>       at 
>>>>>> org.infinispan.remoting.transport.jgroups.JGroupsTransport.setState(JGroupsTransport.java:602)
>>>>>>       at 
>>>>>> org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUpEvent(MessageDispatcher.java:711)
>>>>>>       at 
>>>>>> org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771)
>>>>>>       at org.jgroups.JChannel.up(JChannel.java:1441)
>>>>>>       at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1074)
>>>>>>       at 
>>>>>> org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.connectToStateProvider(STREAMING_STATE_TRANSFER.java:523)
>>>>>>       at 
>>>>>> org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.handleStateRsp(STREAMING_STATE_TRANSFER.java:462)
>>>>>>       at 
>>>>>> org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:223)
>>>>>>       at org.jgroups.protocols.FRAG2.up(FRAG2.java:189)
>>>>>>       at org.jgroups.protocols.FC.up(FC.java:479)
>>>>>>       at org.jgroups.protocols.pbcast.GMS.up(GMS.java:891)
>>>>>>       at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:246)
>>>>>>       at 
>>>>>> org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:613)
>>>>>>       at org.jgroups.protocols.UNICAST.up(UNICAST.java:294)
>>>>>>       at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:703)
>>>>>>       at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:133)
>>>>>>       at org.jgroups.protocols.FD.up(FD.java:275)
>>>>>>       at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:275)
>>>>>>       at org.jgroups.protocols.MERGE2.up(MERGE2.java:209)
>>>>>>       at org.jgroups.protocols.Discovery.up(Discovery.java:291)
>>>>>>       at org.jgroups.protocols.TP.passMessageUp(TP.java:1102)
>>>>>>       at 
>>>>>> org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1658)
>>>>>>       at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1640)
>>>>>>       at 
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>>       at 
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>>> Caused by: java.io.EOFException: Read past end of file
>>>>>>       at 
>>>>>> org.jboss.marshalling.SimpleDataInput.eofOnRead(SimpleDataInput.java:126)
>>>>>>       at 
>>>>>> org.jboss.marshalling.SimpleDataInput.readUnsignedByteDirect(SimpleDataInput.java:263)
>>>>>>       at 
>>>>>> org.jboss.marshalling.SimpleDataInput.readUnsignedByte(SimpleDataInput.java:224)
>>>>>>       at 
>>>>>> org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209)
>>>>>>       at 
>>>>>> org.jboss.marshalling.AbstractObjectInput.readObject(AbstractObjectInput.java:37)
>>>>>>       at 
>>>>>> org.infinispan.marshall.jboss.GenericJBossMarshaller.objectFromObjectStream(GenericJBossMarshaller.java:192)
>>>>>>       at 
>>>>>> org.infinispan.marshall.VersionAwareMarshaller.objectFromObjectStream(VersionAwareMarshaller.java:190)
>>>>>>       at 
>>>>>> org.infinispan.statetransfer.StateTransferManagerImpl.processCommitLog(StateTransferManagerImpl.java:230)
>>>>>>       at 
>>>>>> org.infinispan.statetransfer.StateTransferManagerImpl.applyTransactionLog(StateTransferManagerImpl.java:252)
>>>>>>       at 
>>>>>> org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:322)
>>>>>>       ... 27 more
>>>>>> But I'm not sure if it's an issue, as it seems tests are not failing.
>>>>>> I consider a "Read past end of file" quite suspicious-looking; would
>>>>>> it be possible that some internal Externalizer is writing
>>>>>> fewer bytes than what it's attempting to read?
>>>>>> Is there something clever I could do to understand which object the
>>>>>> marshaller is trying to read when something like this is happening?
>>>>>> I've found debugging this quite hard.
>>>>>> Also, it doesn't look like our externalizers have good test
>>>>>> coverage; they are likely implicitly tested, as I assume that nothing
>>>>>> would work if they weren't, but still it looks like we have no explicit
>>>>>> tests for them?
>>>>>> Cheers,
>>>>>> Sanne
>>>>>> _______________________________________________
>>>>>> infinispan-dev mailing list
>>>>>> infinispan-dev@lists.jboss.org
>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> --
>>> Galder Zamarreño
>>> Sr. Software Engineer
>>> Infinispan, JBoss Cache
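
On Galder's suggestion above of having the marshaller launder the EOFException 
into something clearer, a minimal sketch of the idea follows. It uses a plain 
DataInputStream rather than VersionAwareMarshaller or GenericJBossMarshaller, 
and the class and method names are hypothetical. The exception type stays an 
EOFException, only the message changes and the original is kept as the cause.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical helper, not an Infinispan class.
final class TruncatedStreamHint {

    static final String HINT =
        "The stream ended unexpectedly, please check for any errors where the stream was generated";

    /** Reads an int and rethrows a premature end of stream with a clearer message. */
    static int readIntWithHint(DataInputStream in) throws IOException {
        try {
            return in.readInt();
        } catch (EOFException e) {
            // Still an EOFException, but pointing the user at the sending side.
            EOFException clearer = new EOFException(HINT);
            clearer.initCause(e);
            throw clearer;
        }
    }

    /** Returns the failure message for the given input, or null if the read succeeds. */
    static String messageFor(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            readIntWithHint(in);
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Two bytes where four are expected: simulates a stream cut short by the sender.
        System.out.println(messageFor(new byte[] {0, 0}));
    }
}
```

This only improves the message on the receiving side. It does not address 
Sanne's point about not sending incomplete streams in the first place.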

Manik Surtani

Lead, Infinispan
