[ 
https://issues.apache.org/jira/browse/CASSANDRA-18781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766258#comment-17766258
 ] 

Stefan Miklosovic commented on CASSANDRA-18781:
-----------------------------------------------

We should add this change (1).

Currently, the Cassandra node logs the following:

{code}
ERROR [Stream-Deserializer-/127.0.1.1:7000-f6f5b035] 2023-09-18 08:55:35,683 
StreamSession.java:733 - [Stream #5e1599d0-55f0-11ee-ad9c-ed91de66bda5] 
Streaming error occurred on session with peer 127.0.1.1:7000 through 
127.0.0.1:43704
org.apache.cassandra.db.guardrails.GuardrailViolatedException: Guardrail 
bulk_load_enabled violated: Bulk load of SSTables is not allowed. Bulk loading 
of SSTables might potentially destabilize the node.
        at org.apache.cassandra.db.guardrails.Guardrail.fail(Guardrail.java:143)
        at org.apache.cassandra.db.guardrails.Guardrail.fail(Guardrail.java:124)
        at 
org.apache.cassandra.db.guardrails.EnableFlag.ensureEnabled(EnableFlag.java:126)
        at 
org.apache.cassandra.streaming.StreamDeserializingTask.run(StreamDeserializingTask.java:75)
        at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)
INFO  [NonPeriodicTasks:1] 2023-09-18 08:55:35,685 StreamResultFuture.java:201 
- [Stream #5e1599d0-55f0-11ee-ad9c-ed91de66bda5] Session with /127.0.1.1:7000 
is failed
WARN  [NonPeriodicTasks:1] 2023-09-18 08:55:35,690 StreamResultFuture.java:250 
- [Stream #5e1599d0-55f0-11ee-ad9c-ed91de66bda5] Stream failed: 
Session peer /127.0.1.1:7000 Failed because of an unknown exception
org.apache.cassandra.db.guardrails.GuardrailViolatedException: Guardrail 
bulk_load_enabled violated: Bulk load of SSTables is not allowed. Bulk loading 
of SSTables might potentially destabilize the node.
        org.apache.cassandra.db.guardrails.Guardrail.fail(Guardrail.java:143)
        org.apache.cassandra.db.guardrails.Guardrail.fail(Guardrail.java:124)
{code}

That is not very user-friendly. We should not propagate exceptions there. That 
is just wrong IMHO.

With my suggestion it looks like this:

{code}
INFO  [Messaging-EventLoop-3-3] 2023-09-18 09:24:19,552 
InboundConnectionInitiator.java:447 - 
/127.0.1.1:7000(/127.0.0.1:57364)->localhost/127.0.0.1:7000-STREAMING-58959818 
streaming connection established, version = 12, framing = UNPROTECTED, 
encryption = unencrypted
INFO  [Stream-Deserializer-/127.0.1.1:7000-58959818] 2023-09-18 09:24:19,571 
StreamResultFuture.java:123 - [Stream #619d6430-55f4-11ee-b546-b35448f59a3b 
ID#0] Creating new streaming plan for Bulk Load from /127.0.1.1:7000 
channel.remote /127.0.0.1:57364 channel.local /127.0.0.1:7000 channel.id 
58959818
INFO  [Stream-Deserializer-/127.0.1.1:7000-58959818] 2023-09-18 09:24:19,571 
StreamResultFuture.java:131 - [Stream #619d6430-55f4-11ee-b546-b35448f59a3b, 
ID#0] Received streaming plan for Bulk Load from /127.0.1.1:7000 channel.remote 
/127.0.0.1:57364 channel.local /127.0.0.1:7000 channel.id 58959818
ERROR [Stream-Deserializer-/127.0.1.1:7000-58959818] 2023-09-18 09:24:19,574 
NoSpamLogger.java:111 - Guardrail bulk_load_enabled violated: Bulk load of 
SSTables is not allowed. Bulk loading of SSTables might potentially destabilize 
the node.
INFO  [Stream-Deserializer-/127.0.1.1:7000-58959818] 2023-09-18 09:24:19,576 
StreamDeserializingTask.java:88 - [Stream #619d6430-55f4-11ee-b546-b35448f59a3b 
channel: 58959818] Aborting StreamInitMessage: from = /127.0.1.1:7000, planId = 
619d6430-55f4-11ee-b546-b35448f59a3b, session index = 0. Bulk loading of 
SSTables is disabled.
INFO  [Stream-Deserializer-/127.0.1.1:7000-58959818] 2023-09-18 09:24:19,576 
StreamSession.java:1323 - [Stream #619d6430-55f4-11ee-b546-b35448f59a3b] 
Aborting stream session with peer /127.0.1.1:7000...
INFO  [NonPeriodicTasks:1] 2023-09-18 09:24:19,577 StreamResultFuture.java:201 
- [Stream #619d6430-55f4-11ee-b546-b35448f59a3b] Session with /127.0.1.1:7000 
is aborted
INFO  [Stream-Deserializer-/127.0.1.1:7000-58959818] 2023-09-18 09:24:19,577 
StreamDeserializingTask.java:88 - [Stream #619d6430-55f4-11ee-b546-b35448f59a3b 
channel: 58959818] Aborting Prepare SYN (0 requests,  5 files}. Bulk loading of 
SSTables is disabled.
INFO  [NonPeriodicTasks:1] 2023-09-18 09:24:19,577 StreamResultFuture.java:255 
- [Stream #619d6430-55f4-11ee-b546-b35448f59a3b] Stream aborted
{code}
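
The idea behind the second log can be sketched roughly as below. This is a 
minimal, self-contained illustration and not the code from the commit in (1): 
GuardrailViolation, Session, ensureBulkLoadEnabled and handle are hypothetical 
stand-ins, not Cassandra's real classes or methods. It only shows the control 
flow I have in mind: a guardrail violation is treated as an expected rejection 
that aborts the session with a one-line message, while any other exception 
still fails the session.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are Cassandra's real classes.
class GuardrailAbortSketch {

    /** Stand-in for a guardrail violation (not GuardrailViolatedException). */
    static class GuardrailViolation extends RuntimeException {
        GuardrailViolation(String message) { super(message); }
    }

    enum State { INITIALIZED, ABORTED, FAILED }

    /** Minimal stand-in for a stream session; records state and log lines. */
    static class Session {
        State state = State.INITIALIZED;
        final List<String> log = new ArrayList<>();

        void abort(String reason) {
            state = State.ABORTED;
            log.add("INFO Aborting stream session. " + reason);
        }

        void fail(Throwable t) {
            state = State.FAILED;
            log.add("ERROR Streaming error occurred: " + t);
        }
    }

    /** Throws when bulk loading is disabled, mimicking a guardrail check. */
    static void ensureBulkLoadEnabled(boolean enabled) {
        if (!enabled)
            throw new GuardrailViolation("Bulk load of SSTables is not allowed.");
    }

    /** Handles one incoming message; a violation aborts instead of failing. */
    static Session handle(boolean bulkLoadEnabled) {
        Session session = new Session();
        try {
            ensureBulkLoadEnabled(bulkLoadEnabled);
            // ... normal message processing would go here ...
        } catch (GuardrailViolation e) {
            // Expected, operator-configured rejection: one line, no stack trace.
            session.abort("Bulk loading of SSTables is disabled.");
        } catch (RuntimeException e) {
            // Anything else is still a genuine streaming failure.
            session.fail(e);
        }
        return session;
    }

    public static void main(String[] args) {
        Session s = handle(false);
        System.out.println(s.state);
        s.log.forEach(System.out::println);
    }
}
```

Catching the violation separately from other runtime exceptions keeps real 
streaming errors loud while the guardrail case stays a single readable line.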

The client side suffers from the same problem, but I do not think we have an 
easy solution at hand at the moment, because any specific exception message 
put into SessionFailedMessage would not be deserializable (2). It is also 
questionable how that would look in a cluster with mixed versions.
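
To make the mixed-version concern concrete, here is a hedged sketch of what a 
version-gated failure reason could look like. Everything in it is an 
assumption for illustration: the class, the wire layout and the 
VERSION_WITH_REASON constant are invented, and it assumes the message 
currently carries no body. It is not a proposal for the actual 
SessionFailedMessage serializer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; the version constant and wire layout are invented.
class VersionGatedFailureSketch {

    // Assumed first messaging version that would carry a failure reason.
    static final int VERSION_WITH_REASON = 13;

    /** Writes the reason only when the receiving peer's version can read it. */
    static byte[] serialize(String reason, int peerVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (peerVersion >= VERSION_WITH_REASON)
                out.writeUTF(reason);
            // Older peers get an empty body, so the reason is dropped for them.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the reason if the sender's version included one, else null. */
    static String deserialize(byte[] body, int peerVersion) {
        if (peerVersion < VERSION_WITH_REASON)
            return null;
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(body))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String reason = "Bulk loading of SSTables is disabled.";
        System.out.println(deserialize(serialize(reason, 13), 13));
        System.out.println(deserialize(serialize(reason, 12), 12));
    }
}
```

Even in this toy form, a peer on the older version never sees the reason, so a 
client talking to a mixed cluster would get the friendly message from some 
nodes and nothing from others.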

(1) 
https://github.com/instaclustr/cassandra/commit/744937b8804af798ea3da7619259cfca394f3ed0
(2) 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/streaming/messages/SessionFailedMessage.java#L30

> Add the ability to disable bulk loading of SSTables on a node
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-18781
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18781
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tool/bulk load
>            Reporter: Runtian Liu
>            Assignee: Runtian Liu
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently, Cassandra database users can use sstableloader to bulk load data 
> into Cassandra. However, for a Cassandra operator, there is no way to 
> forcibly block this behavior. Additionally, there is no metric indicating 
> whether the bulk load is being used on the server side. If a client is using 
> sstableloader, they will also need to upgrade the sstableloader code to the 
> new major version. This lack of control and visibility can become a blocker 
> during a major version upgrade.
>  
> 1. Can we add a config to disable bulk load feature? Or it falls into 
> https://issues.apache.org/jira/browse/CASSANDRA-8303
> 2. Can we add metrics for bulk load used on server end?



