[ 
https://issues.apache.org/jira/browse/CASSANDRA-14554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16679214#comment-16679214
 ] 

Stefania commented on CASSANDRA-14554:
--------------------------------------

You're welcome [~benedict] !

bq.  I wonder if you had considered (and potentially discarded) what might be a 
slightly simpler approach of allocating a separate LifecycleTransaction for 
each operation, and atomically transferring their contents as they "complete" 
to the shared LifecycleTransaction?

No, I hadn't considered it. It sounds elegant in principle, but in order to 
atomically transfer child transactions to their parent, we'd have to add some 
complexity to transactions that I'm not sure we need. Obviously, the state of 
the parent transaction could change at any time (due to an abort), including 
whilst a child transaction is trying to transfer its state. So this would 
require some form of synchronization or CAS. The same is true for two child 
transactions transferring their state simultaneously. The state on disk should 
be fine as long as child transactions are never committed but only transferred. 
Child transactions should be allowed to abort independently though. So different 
rules for child and parent transactions would apply. 
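
As a rough sketch of what such a CAS-based transfer could look like (every name 
here, {{ParentTxn}}, {{ChildTxn}}, {{transferFrom}}, is hypothetical and not part 
of the Cassandra code base, and file names stand in for real sstables):

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical parent transaction; its whole state lives in one immutable snapshot.
final class ParentTxn
{
    enum Status { OPEN, ABORTED }

    static final class State
    {
        final Status status;
        final Set<String> tracked;

        State(Status status, Set<String> tracked)
        {
            this.status = status;
            this.tracked = Collections.unmodifiableSet(tracked);
        }
    }

    private final AtomicReference<State> state =
        new AtomicReference<>(new State(Status.OPEN, new HashSet<>()));

    // Atomically merges the child's files into the parent.
    // Returns false if the parent was aborted, so the caller can abort the child.
    boolean transferFrom(ChildTxn child)
    {
        while (true)
        {
            State current = state.get();
            if (current.status == Status.ABORTED)
                return false;

            Set<String> merged = new HashSet<>(current.tracked);
            merged.addAll(child.files());
            if (state.compareAndSet(current, new State(Status.OPEN, merged)))
                return true;
            // A concurrent abort or another child's transfer won the race; retry.
        }
    }

    void abort()
    {
        while (true)
        {
            State current = state.get();
            if (state.compareAndSet(current, new State(Status.ABORTED, current.tracked)))
                return;
        }
    }

    Set<String> tracked()
    {
        return state.get().tracked;
    }
}

// Hypothetical child transaction, only ever used by a single thread.
final class ChildTxn
{
    private final Set<String> files = new HashSet<>();

    void trackNew(String file)
    {
        files.add(file);
    }

    Set<String> files()
    {
        return files;
    }
}
```

Even in this toy form, the parent needs a retry loop and a separate rule for 
the aborted case, and it does not touch the on-disk log at all.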

I'm not sure we need this additional complexity because the txn state only 
changes rarely. {{LifecycleTransaction}} exposes a large API, but many methods 
are probably only used during compaction. Extracting a more comprehensive 
interface that can be implemented with a synchronized wrapper may be an easier 
approach.
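
To illustrate the wrapper idea, here is a minimal sketch. {{SSTableTracker}}, 
{{UnsafeTracker}} and {{SynchronizedTracker}} are hypothetical names, the 
interface is deliberately much narrower than the real {{LifecycleTransaction}} 
API, and strings stand in for sstables:

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical narrow interface extracted from the transaction.
interface SSTableTracker
{
    void trackNew(String sstable);
    void untrackNew(String sstable);
    int trackedCount();
}

// Stand-in for the existing, non-thread-safe implementation.
final class UnsafeTracker implements SSTableTracker
{
    private final Set<String> tracked = new LinkedHashSet<>();

    public void trackNew(String sstable)
    {
        tracked.add(sstable);
    }

    public void untrackNew(String sstable)
    {
        tracked.remove(sstable);
    }

    public int trackedCount()
    {
        return tracked.size();
    }
}

// Wrapper that serializes every call on its own monitor before delegating.
final class SynchronizedTracker implements SSTableTracker
{
    private final SSTableTracker delegate;

    SynchronizedTracker(SSTableTracker delegate)
    {
        this.delegate = delegate;
    }

    public synchronized void trackNew(String sstable)
    {
        delegate.trackNew(sstable);
    }

    public synchronized void untrackNew(String sstable)
    {
        delegate.untrackNew(sstable);
    }

    public synchronized int trackedCount()
    {
        return delegate.trackedCount();
    }
}
```

Threads that share the transaction would only be handed the wrapper, while 
single-threaded callers such as compaction could keep using the unwrapped 
object directly.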

I submitted a conservative patch that fixes a known problem with streaming and that is 
safe for branches that will not undergo a major release testing cycle. 
Unfortunately, I do not have the time to work on a more comprehensive solution, 
at least not right now. I could however review whichever approach we choose.

> LifecycleTransaction encounters ConcurrentModificationException when used in 
> multi-threaded context
> ---------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14554
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14554
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Dinesh Joshi
>            Assignee: Dinesh Joshi
>            Priority: Major
>
> When LifecycleTransaction is used in a multi-threaded context, we encounter 
> this exception -
> {quote}java.util.ConcurrentModificationException: null
>  at 
> java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
>  at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:742)
>  at java.lang.Iterable.forEach(Iterable.java:74)
>  at 
> org.apache.cassandra.db.lifecycle.LogReplicaSet.maybeCreateReplica(LogReplicaSet.java:78)
>  at org.apache.cassandra.db.lifecycle.LogFile.makeRecord(LogFile.java:320)
>  at org.apache.cassandra.db.lifecycle.LogFile.add(LogFile.java:285)
>  at 
> org.apache.cassandra.db.lifecycle.LogTransaction.trackNew(LogTransaction.java:136)
>  at 
> org.apache.cassandra.db.lifecycle.LifecycleTransaction.trackNew(LifecycleTransaction.java:529)
> {quote}
> During streaming we create a reference to a {{LifecycleTransaction}} and 
> share it between threads -
> [https://github.com/apache/cassandra/blob/5cc68a87359dd02412bdb70a52dfcd718d44a5ba/src/java/org/apache/cassandra/db/streaming/CassandraStreamReader.java#L156]
> This is used in a multi-threaded context inside {{CassandraIncomingFile}} 
> which is an {{IncomingStreamMessage}}. This is being deserialized in parallel.
> {{LifecycleTransaction}} is not meant to be used in a multi-threaded context 
> and this leads to streaming failures due to object sharing. On trunk, this 
> object is shared across all threads that transfer sstables in parallel for 
> the given {{TableId}} in a {{StreamSession}}. There are two options to solve 
> this - make {{LifecycleTransaction}} and the associated objects thread safe, 
> or scope the transaction to a single {{CassandraIncomingFile}}. The consequence 
> of the latter option is that if we experience a streaming failure we may have 
> redundant SSTables on disk. This is ok as compaction should clean this up. A 
> third option is to synchronize access in the streaming infrastructure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
