[
https://issues.apache.org/jira/browse/CASSANDRA-14554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16683835#comment-16683835
]
Benedict commented on CASSANDRA-14554:
--------------------------------------
bq. If we are synchronizing the LifecycleTransaction methods anyway, I'm not
sure I understand why we need child transactions.
The only reason would be to simplify analysis of the code's behaviour. For
instance, it's not clear to me how we would (or should) behave when stream
writers are still actively working (and creating sstable files) but their
transaction has already been cancelled. Does such a scenario even arise? Is
it possible it would leave partially written sstables?
A separate transaction is very easy to reason about, so we have only to
consider what happens when we transfer ownership.
I agree that there is no sensible reason to worry about blocking behaviour
specifically, and perhaps synchronising the transaction object is a simple
first step that we can follow up on later (we could even do it with a
delegating SynchronizedLifecycleTransaction, which would seem to be equivalent
to your patch, but with the changes isolated to a couple of classes, I think?).
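To make the delegating idea concrete, here is a minimal sketch of the shape I have in mind. It does not use the real Cassandra classes: the Transaction interface, SimpleTransaction, SynchronizedTransaction and runConcurrently are all hypothetical stand-ins, and the real LifecycleTransaction has a much larger surface that would need the same treatment.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SynchronizedTransactionDemo {
    // Hypothetical stand-in for the small part of a transaction that stream writers call.
    interface Transaction {
        void trackNew(String sstable);
        int trackedCount();
    }

    // Not thread safe: state lives in a plain LinkedHashSet (assumption for illustration).
    static final class SimpleTransaction implements Transaction {
        private final Set<String> tracked = new LinkedHashSet<>();
        public void trackNew(String sstable) { tracked.add(sstable); }
        public int trackedCount() { return tracked.size(); }
    }

    // Delegating wrapper: every call holds the wrapper's monitor while touching the delegate.
    static final class SynchronizedTransaction implements Transaction {
        private final Transaction delegate;
        SynchronizedTransaction(Transaction delegate) { this.delegate = delegate; }
        public synchronized void trackNew(String sstable) { delegate.trackNew(sstable); }
        public synchronized int trackedCount() { return delegate.trackedCount(); }
    }

    // Shares one wrapped transaction across several writer threads and returns the tracked count.
    static int runConcurrently(int threads, int perThread) {
        Transaction txn = new SynchronizedTransaction(new SimpleTransaction());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    txn.trackNew("sstable-" + id + "-" + i);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return txn.trackedCount();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(4, 1000)); // prints 4000
    }
}
```

The only point of the sketch is that the locking lives in one wrapper class and callers are handed the wrapper rather than the raw object. It says nothing about what should happen to writers after an abort, which is the question a separate child transaction would answer.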
> LifecycleTransaction encounters ConcurrentModificationException when used in
> multi-threaded context
> ---------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-14554
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14554
> Project: Cassandra
> Issue Type: Bug
> Reporter: Dinesh Joshi
> Assignee: Dinesh Joshi
> Priority: Major
>
> When LifecycleTransaction is used in a multi-threaded context, we encounter
> this exception -
> {quote}java.util.ConcurrentModificationException: null
> at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
> at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:742)
> at java.lang.Iterable.forEach(Iterable.java:74)
> at org.apache.cassandra.db.lifecycle.LogReplicaSet.maybeCreateReplica(LogReplicaSet.java:78)
> at org.apache.cassandra.db.lifecycle.LogFile.makeRecord(LogFile.java:320)
> at org.apache.cassandra.db.lifecycle.LogFile.add(LogFile.java:285)
> at org.apache.cassandra.db.lifecycle.LogTransaction.trackNew(LogTransaction.java:136)
> at org.apache.cassandra.db.lifecycle.LifecycleTransaction.trackNew(LifecycleTransaction.java:529)
> {quote}
> During streaming we create a reference to a {{LifecycleTransaction}} and
> share it between threads -
> [https://github.com/apache/cassandra/blob/5cc68a87359dd02412bdb70a52dfcd718d44a5ba/src/java/org/apache/cassandra/db/streaming/CassandraStreamReader.java#L156]
> This is used in a multi-threaded context inside {{CassandraIncomingFile}}
> which is an {{IncomingStreamMessage}}. This is being deserialized in parallel.
> {{LifecycleTransaction}} is not meant to be used in a multi-threaded context
> and this leads to streaming failures due to object sharing. On trunk, this
> object is shared across all threads that transfer sstables in parallel for
> the given {{TableId}} in a {{StreamSession}}. There are two options to solve
> this: make {{LifecycleTransaction}} and the associated objects thread safe,
> or scope the transaction to a single {{CassandraIncomingFile}}. The consequence
> of the latter option is that if we experience a streaming failure we may have
> redundant SSTables on disk. This is ok as compaction should clean this up. A
> third option is to synchronize access in the streaming infrastructure.