[ https://issues.apache.org/jira/browse/CASSANDRA-10005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Moos updated CASSANDRA-10005:
-----------------------------------
    Description: 
I'm adding a new node to the cluster and I'm seeing many of the errors below; 
the node never joins. It looks like a deadlock.

From reading the code, it looks like IncomingFileMessage tells the session to 
retry on any Exception other than IOException, but the CompressedInputStream 
thread is still running when the retry happens, and the deadlock ensues. It 
might be best to close the StreamReader (and stop the thread) when an Exception 
occurs, before retrying.

I'm not sure why I am getting this error in the first place, though. Might it 
have something to do with not being able to upgrade my SSTables after going 
from 2.1.2 to 2.2.0?
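
A minimal sketch of the "stop the thread before retrying" idea, not the actual Cassandra code. BackgroundReader, RetrySketch, readWithRetry, and the sender callback are hypothetical names, and a BlockingQueue stands in for the socket channel that the reader thread and the retrying thread would otherwise share. A negative chunk models the corrupt payload that triggers a non-IOException failure.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.IntConsumer;

/** Hypothetical stand-in for a reader that pulls chunks from a shared source on its own thread. */
final class BackgroundReader implements AutoCloseable {
    private final BlockingQueue<Integer> buffered = new LinkedBlockingQueue<>();
    private final Thread worker;

    BackgroundReader(BlockingQueue<Integer> source, int expectedChunks) {
        worker = new Thread(() -> {
            try {
                for (int i = 0; i < expectedChunks; i++) {
                    buffered.put(source.take()); // blocks until a chunk arrives or we are interrupted
                }
            } catch (InterruptedException e) {
                // close() asked us to stop: leave the shared source alone from here on
            }
        }, "background-reader");
        worker.setDaemon(true);
        worker.start();
    }

    int next() throws InterruptedException {
        return buffered.take();
    }

    boolean isRunning() {
        return worker.isAlive();
    }

    @Override
    public void close() throws InterruptedException {
        worker.interrupt(); // wake the worker if it is blocked on the source
        worker.join();      // do not return until it has really stopped
    }
}

final class RetrySketch {
    static final int CHUNKS_PER_FILE = 2;

    /** Reads one "file"; a negative chunk models a corrupt payload. */
    static int readFile(BlockingQueue<Integer> source) throws InterruptedException {
        BackgroundReader reader = new BackgroundReader(source, CHUNKS_PER_FILE);
        try {
            int sum = 0;
            for (int i = 0; i < CHUNKS_PER_FILE; i++) {
                int chunk = reader.next();
                if (chunk < 0) {
                    throw new IllegalArgumentException("Not enough bytes");
                }
                sum += chunk;
            }
            return sum;
        } finally {
            reader.close(); // the worker is stopped before any retry touches the source again
        }
    }

    /** Retries on unchecked failures; sender is a hypothetical hook that (re)sends the file. */
    static int readWithRetry(BlockingQueue<Integer> source, IntConsumer sender, int maxAttempts)
            throws InterruptedException {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            sender.accept(attempt);
            try {
                return readFile(source);
            } catch (RuntimeException e) {
                last = e; // by now readFile's finally block has already stopped the worker
            }
        }
        throw new IllegalStateException("all attempts failed", last);
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> source = new LinkedBlockingQueue<>();
        int result = readWithRetry(source, attempt -> {
            if (attempt == 1) {
                source.add(-1); // corrupt first chunk; the worker then waits for a second one
            } else {
                source.add(7);
                source.add(8);
            }
        }, 3);
        System.out.println(result);
    }
}
```

The point of the finally block is ordering: the worker is interrupted and joined before control returns to the retry loop, so at most one thread reads from the shared source at any time.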

{code}
error: null
-- StackTrace --
java.lang.AssertionError
        at org.apache.cassandra.db.lifecycle.LifecycleTransaction.checkUnused(LifecycleTransaction.java:428)
        at org.apache.cassandra.db.lifecycle.LifecycleTransaction.split(LifecycleTransaction.java:408)
        at org.apache.cassandra.db.compaction.CompactionManager.parallelAllSSTableOperation(CompactionManager.java:268)
        at org.apache.cassandra.db.compaction.CompactionManager.performSSTableRewrite(CompactionManager.java:373)
        at org.apache.cassandra.db.ColumnFamilyStore.sstablesRewrite(ColumnFamilyStore.java:1524)
        at org.apache.cassandra.service.StorageService.upgradeSSTables(StorageService.java:2521)
{code}

  was:
I'm adding a new node to the cluster and I'm seeing a bunch of the errors 
below. The node never joins the cluster. This causes a deadlock (see the 
deadlock dump below):

{code}
WARN  [STREAM-IN-/10.220.0.160] 2015-08-06 16:16:42,640 StreamSession.java:638 - [Stream #4be6d7c0-3c53-11e5-b5bc-dbbae7f19873] Retrying for following error
java.lang.IllegalArgumentException: Not enough bytes
        at org.apache.cassandra.db.composites.AbstractCType.checkRemaining(AbstractCType.java:362) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.db.composites.AbstractCompoundCellNameType.fromByteBuffer(AbstractCompoundCellNameType.java:98) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:381) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:365) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:75) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:52) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:46) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) ~[guava-16.0.jar:na]
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) ~[guava-16.0.jar:na]
        at org.apache.cassandra.io.sstable.format.big.BigTableWriter.appendFromStream(BigTableWriter.java:243) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.streaming.StreamReader.writeRow(StreamReader.java:162) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:95) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:49) [apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:38) [apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:56) [apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261) [apache-cassandra-2.2.0.jar:2.2.0]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
{code}

{code}
ERROR 06:28:26 [Stream #059b7cc0-3c04-11e5-8c56-dbbae7f19873] Streaming error occurred
java.lang.IllegalArgumentException: Unknown type 0
        at org.apache.cassandra.streaming.messages.StreamMessage$Type.get(StreamMessage.java:90) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261) ~[apache-cassandra-2.2.0.jar:2.2.0]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
{code}

Found one Java-level deadlock:
=============================
"Thread-869":
  waiting to lock monitor 0x00007f2ef8003f08 (object 0x000000062b8a9a28, a java.lang.Object),
  which is held by "STREAM-IN-/10.220.0.147"
"STREAM-IN-/10.220.0.147":
  waiting to lock monitor 0x00007f2ed00436a8 (object 0x000000062bc96d68, a java.lang.Object),
  which is held by "Thread-869"

Java stack information for the threads listed above:
===================================================
"Thread-869":
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:295)
        - waiting to lock <0x000000062b8a9a28> (a java.lang.Object)
        at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:59)
        - locked <0x000000062bc96d68> (a java.lang.Object)
        at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
        at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
        - locked <0x00000006340e3f30> (a sun.nio.ch.ChannelInputStream)
        at org.apache.cassandra.streaming.compress.CompressedInputStream$Reader.runMayThrow(CompressedInputStream.java:161)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at java.lang.Thread.run(Thread.java:745)
"STREAM-IN-/10.220.0.147":
        at java.nio.channels.spi.AbstractSelectableChannel.isBlocking(AbstractSelectableChannel.java:261)
        - waiting to lock <0x000000062bc96d68> (a java.lang.Object)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:299)
        - locked <0x000000062b8a9a28> (a java.lang.Object)
        at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:52)
        at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261)
        at java.lang.Thread.run(Thread.java:745)



> Streaming not enough bytes error
> --------------------------------
>
>                 Key: CASSANDRA-10005
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10005
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Chris Moos
>            Priority: Minor
>         Attachments: deadlock.txt, errors.txt
>
>


