[ https://issues.apache.org/jira/browse/CASSANDRA-18781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17763718#comment-17763718 ]
Stefan Miklosovic edited comment on CASSANDRA-18781 at 9/11/23 1:30 PM:
------------------------------------------------------------------------

I think we have a problem. When a stream fails, sstableloader logs this to the console:

{code}
ERROR [Stream-Deserializer-/127.0.0.1:7000-4476be2a] 2023-09-11 15:19:41,234 StreamSession.java:1128 - [Stream #dd5cdaa0-50a5-11ee-be1f-097a59d811d7] Remote peer /127.0.0.1:7000 failed stream session.
INFO [NonPeriodicTasks:1] 2023-09-11 15:19:41,236 StreamResultFuture.java:201 - [Stream #dd5cdaa0-50a5-11ee-be1f-097a59d811d7] Session with /127.0.0.1:7000 is failed
progress: total: 100% 0.000B/s (avg: 0.000B/s)
WARN [NonPeriodicTasks:1] 2023-09-11 15:19:41,241 StreamResultFuture.java:250 - [Stream #dd5cdaa0-50a5-11ee-be1f-097a59d811d7] Stream failed: Session peer /127.0.0.1:7000 Remote peer /127.0.0.1:7000 failed stream session
ERROR [NonPeriodicTasks:1] 2023-09-11 15:21:19,086 JVMStabilityInspector.java:70 - Exception in thread Thread[NonPeriodicTasks:1,5,NonPeriodicTasks]
java.lang.AssertionError: for sstable = BigTableReader:big(path='/tmp/load/ks/tb/nc-5-big-Data.db'), ref count = 1
    at org.apache.cassandra.io.sstable.SSTableLoader.releaseReferences(SSTableLoader.java:249)
    at org.apache.cassandra.io.sstable.SSTableLoader.onFailure(SSTableLoader.java:236)
    at org.apache.cassandra.utils.concurrent.ListenerList$CallbackListener.run(ListenerList.java:213)
    at org.apache.cassandra.concurrent.ImmediateExecutor.execute(ImmediateExecutor.java:140)
    at org.apache.cassandra.utils.concurrent.ListenerList.safeExecute(ListenerList.java:166)
    at org.apache.cassandra.utils.concurrent.ListenerList.notifyListener(ListenerList.java:157)
    at org.apache.cassandra.utils.concurrent.ListenerList$CallbackListener.notifySelf(ListenerList.java:219)
    at org.apache.cassandra.utils.concurrent.ListenerList.lambda$notifyExclusive$0(ListenerList.java:124)
    at org.apache.cassandra.utils.concurrent.IntrusiveStack.forEach(IntrusiveStack.java:195)
    at org.apache.cassandra.utils.concurrent.ListenerList.notifyExclusive(ListenerList.java:124)
    at org.apache.cassandra.utils.concurrent.ListenerList.notify(ListenerList.java:96)
    at org.apache.cassandra.utils.concurrent.AsyncFuture.trySet(AsyncFuture.java:104)
    at org.apache.cassandra.utils.concurrent.AbstractFuture.tryFailure(AbstractFuture.java:148)
    at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:251)
    at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:205)
    at org.apache.cassandra.streaming.StreamSession.lambda$closeSession$2(StreamSession.java:545)
    at org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
    at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
    at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:829)
{code}

Notice

{code}
java.lang.AssertionError: for sstable = BigTableReader:big(path='/tmp/load/ks/tb/nc-5-big-Data.db'), ref count = 1
{code}

This is the code it executes on failure:

https://github.com/instaclustr/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/SSTableLoader.java#L232-L248

So the problem here is that, for some reason, one more object is still referencing that SSTable, and it is not cleaned up as we throw that exception.
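
To make the symptom concrete, here is a minimal, self-contained sketch of how a reference count can end at 1 instead of 0 after the owner releases its reference. It is not Cassandra's code: the {{Counted}} class, the {{remainingAfterOwnerRelease}} method, and the "transfer task" that holds the extra reference are all hypothetical, and the skipped release on failure is an assumption made purely for illustration, not a diagnosis of the real cause.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefLeakSketch {
    /** Hypothetical stand-in for a reference-counted resource such as an SSTable reader. */
    static final class Counted {
        // Starts at 1: the owner's own reference.
        private final AtomicInteger refs = new AtomicInteger(1);

        void acquire() {
            refs.incrementAndGet();
        }

        void release() {
            if (refs.decrementAndGet() < 0) {
                throw new IllegalStateException("released more often than acquired");
            }
        }

        int count() {
            return refs.get();
        }
    }

    /**
     * Simulates one load. A hypothetical transfer task takes an extra reference.
     * On success it gives the reference back; on failure it exits without
     * releasing (an assumed bug, for illustration only). The owner then releases
     * its own reference and the remaining count is returned.
     */
    public static int remainingAfterOwnerRelease(boolean transferFails) {
        Counted sstable = new Counted();
        sstable.acquire();          // extra reference held by the hypothetical transfer task
        if (!transferFails) {
            sstable.release();      // normal completion returns the extra reference
        }
        sstable.release();          // owner cleanup, run on both success and failure
        return sstable.count();     // a cleanup check would expect 0 here
    }

    public static void main(String[] args) {
        System.out.println("success path, ref count = " + remainingAfterOwnerRelease(false));
        System.out.println("failure path, ref count = " + remainingAfterOwnerRelease(true));
    }
}
```

In this sketch the success path ends with a count of 0 and the failure path with a count of 1, which is the shape of the assertion message above. Whoever holds the extra reference in the real code path is exactly what still needs to be found.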

I am not sure why this is happening. On the happy path, sstable.selfRef().globalCount() returns 0 (the same method is called on success too, so the references are cleaned up there as well), but not on failure.

While this is "harmless" when sstableloader is used as a tool, I do not think it is safe if it is called programmatically; we would basically leak the references.

> Add the ability to disable bulk loading of SSTables on a node
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-18781
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18781
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tool/bulk load
>            Reporter: Runtian Liu
>            Assignee: Runtian Liu
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently, Cassandra database users can use sstableloader to bulk load data
> into Cassandra. However, for a Cassandra operator, there is no way to
> forcibly block this behavior. Additionally, there is no metric indicating
> whether the bulk load is being used on the server side. If a client is using
> sstableloader, they will also need to upgrade the sstableloader code to the
> new major version. This lack of control and visibility can become a blocker
> during a major version upgrade.
>
> 1. Can we add a config to disable the bulk load feature? Or does it fall under
> https://issues.apache.org/jira/browse/CASSANDRA-8303
> 2. Can we add metrics for bulk load usage on the server end?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org