[ 
https://issues.apache.org/jira/browse/CASSANDRA-18781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17763718#comment-17763718
 ] 

Stefan Miklosovic edited comment on CASSANDRA-18781 at 9/11/23 1:30 PM:
------------------------------------------------------------------------

I think we have a problem. When a stream fails, sstableloader logs this to the 
console:

{code}
ERROR [Stream-Deserializer-/127.0.0.1:7000-4476be2a] 2023-09-11 15:19:41,234 StreamSession.java:1128 - [Stream #dd5cdaa0-50a5-11ee-be1f-097a59d811d7] Remote peer /127.0.0.1:7000 failed stream session.
INFO  [NonPeriodicTasks:1] 2023-09-11 15:19:41,236 StreamResultFuture.java:201 - [Stream #dd5cdaa0-50a5-11ee-be1f-097a59d811d7] Session with /127.0.0.1:7000 is failed
progress: total: 100% 0.000B/s (avg: 0.000B/s)
WARN  [NonPeriodicTasks:1] 2023-09-11 15:19:41,241 StreamResultFuture.java:250 - [Stream #dd5cdaa0-50a5-11ee-be1f-097a59d811d7] Stream failed: Session peer /127.0.0.1:7000 Remote peer /127.0.0.1:7000 failed stream session
ERROR [NonPeriodicTasks:1] 2023-09-11 15:21:19,086 JVMStabilityInspector.java:70 - Exception in thread Thread[NonPeriodicTasks:1,5,NonPeriodicTasks]
java.lang.AssertionError: for sstable = BigTableReader:big(path='/tmp/load/ks/tb/nc-5-big-Data.db'), ref count = 1
        at org.apache.cassandra.io.sstable.SSTableLoader.releaseReferences(SSTableLoader.java:249)
        at org.apache.cassandra.io.sstable.SSTableLoader.onFailure(SSTableLoader.java:236)
        at org.apache.cassandra.utils.concurrent.ListenerList$CallbackListener.run(ListenerList.java:213)
        at org.apache.cassandra.concurrent.ImmediateExecutor.execute(ImmediateExecutor.java:140)
        at org.apache.cassandra.utils.concurrent.ListenerList.safeExecute(ListenerList.java:166)
        at org.apache.cassandra.utils.concurrent.ListenerList.notifyListener(ListenerList.java:157)
        at org.apache.cassandra.utils.concurrent.ListenerList$CallbackListener.notifySelf(ListenerList.java:219)
        at org.apache.cassandra.utils.concurrent.ListenerList.lambda$notifyExclusive$0(ListenerList.java:124)
        at org.apache.cassandra.utils.concurrent.IntrusiveStack.forEach(IntrusiveStack.java:195)
        at org.apache.cassandra.utils.concurrent.ListenerList.notifyExclusive(ListenerList.java:124)
        at org.apache.cassandra.utils.concurrent.ListenerList.notify(ListenerList.java:96)
        at org.apache.cassandra.utils.concurrent.AsyncFuture.trySet(AsyncFuture.java:104)
        at org.apache.cassandra.utils.concurrent.AbstractFuture.tryFailure(AbstractFuture.java:148)
        at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:251)
        at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:205)
        at org.apache.cassandra.streaming.StreamSession.lambda$closeSession$2(StreamSession.java:545)
        at org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
        at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
        at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)
{code}

Notice:
{code}
java.lang.AssertionError: for sstable = BigTableReader:big(path='/tmp/load/ks/tb/nc-5-big-Data.db'), ref count = 1
{code}

It is executing this on a failure:

https://github.com/instaclustr/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/SSTableLoader.java#L232-L248

So the problem is that, for some reason, one more object still references that 
SSTable, and it is not cleaned up by the time we throw that exception. I am not 
sure why this is happening. On the happy path (the same method is called on 
success too, if you notice), sstable.selfRef().globalCount() returns 0, but not 
on failure.
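
To make the symptom concrete, here is a minimal, self-contained sketch of how a 
count can end at 1 instead of 0. It does not use Cassandra's classes: 
CountedResource, remainingAfterLoad and the idea that a streaming task holds 
its own reference are all hypothetical stand-ins, and this is not a diagnosis 
of the actual cause, which is still unknown.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy illustration only; these are NOT Cassandra's Ref or SSTableReader classes. */
public class RefCountSketch {

    /** Hypothetical reference-counted resource standing in for an opened SSTable. */
    static final class CountedResource {
        // Starts at 1: the owner's own ("self") reference.
        private final AtomicInteger count = new AtomicInteger(1);

        void acquire() { count.incrementAndGet(); }
        void release() { count.decrementAndGet(); }
        int globalCount() { return count.get(); }
    }

    /**
     * Simulates one load and returns the count left after the owner's cleanup.
     * Assumption for illustration: some task takes an extra reference and only
     * gives it back when the stream succeeds.
     */
    public static int remainingAfterLoad(boolean streamFails) {
        CountedResource sstable = new CountedResource();

        sstable.acquire();          // hypothetical extra holder (e.g. a streaming task)
        if (!streamFails) {
            sstable.release();      // success: the extra holder releases its reference
        }
        // On failure the extra holder never releases (the assumed bug).

        sstable.release();          // owner cleanup, identical on both paths
        return sstable.globalCount();
    }

    public static void main(String[] args) {
        System.out.println("success path count: " + remainingAfterLoad(false));
        System.out.println("failure path count: " + remainingAfterLoad(true));
    }
}
```

In this sketch the cleanup step is the same on both paths, so a check that the 
count is 0 afterwards only fails when some other holder skipped its release.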

While this is "harmless" when sstableloader is used as a tool, I do not think 
it is safe if it is called programmatically; we would basically leak the 
references.


> Add the ability to disable bulk loading of SSTables on a node
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-18781
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18781
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tool/bulk load
>            Reporter: Runtian Liu
>            Assignee: Runtian Liu
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently, Cassandra database users can use sstableloader to bulk load data 
> into Cassandra. However, for a Cassandra operator, there is no way to 
> forcibly block this behavior. Additionally, there is no metric indicating 
> whether the bulk load is being used on the server side. If a client is using 
> sstableloader, they will also need to upgrade the sstableloader code to the 
> new major version. This lack of control and visibility can become a blocker 
> during a major version upgrade.
>  
> 1. Can we add a config to disable the bulk load feature? Or does it fall under 
> https://issues.apache.org/jira/browse/CASSANDRA-8303
> 2. Can we add metrics for bulk load usage on the server side?


