[ https://issues.apache.org/jira/browse/CASSANDRA-20052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sam Tunnicliffe updated CASSANDRA-20052:
----------------------------------------
    Status: Changes Suggested  (was: Review In Progress)

> Size of CQL messages is not limited in V5 protocol logic
> --------------------------------------------------------
>
>                 Key: CASSANDRA-20052
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20052
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Messaging/Client
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>         Attachments: cassandra_rate_limit.svg
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Size of CQL messages is not limited in V5 protocol logic
> - Since the introduction of v5 frames, there is no longer any limit on CQL message size. native_transport_max_frame_size_in_mb, which provided such a limit before v5, now applies only to pre-v5 protocol sessions; otherwise it is applied only while handling the initial STARTUP/OPTIONS messages and is not checked anywhere in the v5 logic. So currently a v5 CQL message of any size can be sent to a Cassandra server.
> - The overload logic lets huge messages through for free to avoid starvation, so it provides no protection against the requests that are most dangerous from a memory-pressure point of view.
> - The situation is even more dangerous: the v5 framing logic is enabled just after the AUTH response, so message size is not limited even for AUTH_RESPONSE messages from a client. This can be used for a DoS attack: a non-authenticated client can send a huge username/password to a Cassandra server to cause GC trouble or even kill it.
> An easy example:
> {code:java}
> import com.datastax.oss.driver.api.core.CqlSession;
> import com.datastax.oss.driver.internal.core.metadata.DefaultEndPoint;
> import java.net.InetSocketAddress;
> import java.util.Arrays;
>
> public class TestBigAuthRequest {
>     public static void main(String[] args) {
>         // A 500,000,000-character password sent by a client that is not yet authenticated
>         String password = getString(500_000_000, '-');
>         try (CqlSession session = CqlSession.builder()
>                 .addContactEndPoint(new DefaultEndPoint(new InetSocketAddress("localhost", 9042)))
>                 .withAuthCredentials("cassandra", password)
>                 .withLocalDatacenter("datacenter1")
>                 .build()) {
>             session.execute("select * from system.local");
>         }
>     }
>
>     // Builds a string of the given length filled with a single character
>     private static String getString(int length, char charToFill) {
>         if (length > 0) {
>             char[] array = new char[length];
>             Arrays.fill(array, charToFill);
>             return new String(array);
>         }
>         return "";
>     }
> }
> {code}
> A thread stack of such an invocation (captured to show the execution flow):
> {code:java}
> "nioEventLoopGroup-5-21@9164" prio=10 tid=0x86 nid=NA runnable
>   java.lang.Thread.State: RUNNABLE
>     at org.apache.cassandra.transport.messages.AuthResponse$1.decode(AuthResponse.java:45)
>     at org.apache.cassandra.transport.messages.AuthResponse$1.decode(AuthResponse.java:39)
>     at org.apache.cassandra.transport.Message$Decoder.decodeMessage(Message.java:432)
>     at org.apache.cassandra.transport.Message$Decoder$RequestDecoder.decode(Message.java:467)
>     at org.apache.cassandra.transport.Message$Decoder$RequestDecoder.decode(Message.java:459)
>     at org.apache.cassandra.transport.CQLMessageHandler.processRequest(CQLMessageHandler.java:377)
>     at org.apache.cassandra.transport.CQLMessageHandler$LargeMessage.onComplete(CQLMessageHandler.java:755)
>     at org.apache.cassandra.net.AbstractMessageHandler$LargeMessage.supply(AbstractMessageHandler.java:561)
>     at org.apache.cassandra.net.AbstractMessageHandler.processSubsequentFrameOfLargeMessage(AbstractMessageHandler.java:257)
>     at org.apache.cassandra.net.AbstractMessageHandler.processIntactFrame(AbstractMessageHandler.java:229)
>     at org.apache.cassandra.net.AbstractMessageHandler.process(AbstractMessageHandler.java:216)
>     at org.apache.cassandra.transport.CQLMessageHandler.process(CQLMessageHandler.java:147)
>     at org.apache.cassandra.net.FrameDecoder.deliver(FrameDecoder.java:330)
>     at org.apache.cassandra.net.FrameDecoder.channelRead(FrameDecoder.java:294)
>     at org.apache.cassandra.net.FrameDecoder.channelRead(FrameDecoder.java:277)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
>     at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
>     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
>     at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
>     at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>     at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.lang.Thread.run(Thread.java:829)
> {code}
> The provided MR (https://github.com/apache/cassandra/pull/3655) contains a fix for the issue which introduces 2 new parameters:
> native_transport_max_message_size - to limit the size of any CQL message
> native_transport_max_auth_message_size (default = 128KiB) - to limit the auth response message size more strictly and add extra protection against a possible DoS attack.
> Design questions:
> * The current implementation closes the CQL connection if a message is bigger than the limits. Logic to skip the message body could be implemented so the connection can keep being used, but it is more complicated and error-prone.
> * The tricky question is the default value for native_transport_max_message_size:
> on one side, we want it to be no more than min(native_transport_max_request_data_in_flight_per_ip, native_transport_max_request_data_in_flight) to reduce the chance of hitting the branch of logic where error handling does not work;
> on the other side, min(native_transport_max_request_data_in_flight_per_ip, native_transport_max_request_data_in_flight) can be too small, and there is a chance of breaking backward compatibility for existing deployments where people use large messages and small heaps (although that is not a good idea).
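To make the two limits and the default-value question concrete, here is a minimal sketch of the size check described above. It is not the code from the pull request: the class, method, and enum names are hypothetical, and the in-flight byte values in `main` are illustrative only. The only values taken from the description are the 128KiB auth limit, the 500,000,000-byte example, and the min(...) candidate for the default.

```java
// Hypothetical sketch, not the implementation from the pull request.
final class MessageSizeLimitSketch {

    // The description says an oversized message causes the connection to be closed.
    enum Decision { ACCEPT, CLOSE_CONNECTION }

    // Hypothetical holder for the two proposed parameters, in bytes.
    static final class Limits {
        final long maxMessageBytes;      // native_transport_max_message_size
        final long maxAuthMessageBytes;  // native_transport_max_auth_message_size

        Limits(long maxMessageBytes, long maxAuthMessageBytes) {
            this.maxMessageBytes = maxMessageBytes;
            this.maxAuthMessageBytes = maxAuthMessageBytes;
        }
    }

    // Candidate default discussed in the design questions:
    // min(per-IP in-flight bytes, global in-flight bytes).
    static long candidateDefault(long inFlightPerIpBytes, long inFlightGlobalBytes) {
        return Math.min(inFlightPerIpBytes, inFlightGlobalBytes);
    }

    // Compare the body length declared in the message header with the applicable
    // limit before the body is buffered. AUTH_RESPONSE gets the stricter limit.
    static Decision check(boolean isAuthResponse, long declaredBodyBytes, Limits limits) {
        long limit = isAuthResponse
                ? Math.min(limits.maxAuthMessageBytes, limits.maxMessageBytes)
                : limits.maxMessageBytes;
        return declaredBodyBytes > limit ? Decision.CLOSE_CONNECTION : Decision.ACCEPT;
    }

    public static void main(String[] args) {
        // Illustrative in-flight limits (assumptions, not Cassandra defaults).
        long maxMessage = candidateDefault(64L << 20, 256L << 20);
        Limits limits = new Limits(maxMessage, 128L << 10); // 128KiB auth limit from the description

        // The 500,000,000-byte password from the example is rejected up front.
        System.out.println(check(true, 500_000_000L, limits));
        // An ordinary 1 MiB request stays within the general limit.
        System.out.println(check(false, 1L << 20, limits));
    }
}
```

Checking the declared length rather than the accumulated bytes is the point of the sketch: the decision can be made when the first frame of a large message arrives, before memory is committed to the rest of it.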
> Related observations:
> 1) https://issues.apache.org/jira/browse/CASSANDRA-16886 - Reduce native_transport_max_frame_size_in_mb (from 256M to 16M)
> 2) The corresponding logic for the Cassandra server internode protocol does have a message size limit, and the maximum single-message size is validated to be no larger than the rate-limiting parameters:
> internode_max_message_size = min(internode_application_receive_queue_reserve_endpoint_capacity, internode_application_send_queue_reserve_endpoint_capacity)
> internode_application_receive_queue_reserve_endpoint_capacity = 128MiB
> internode_application_send_queue_reserve_endpoint_capacity = 128MiB
> internode_max_message_size <= internode_application_receive_queue_reserve_endpoint_capacity
> internode_max_message_size <= internode_application_receive_queue_reserve_global_capacity
> internode_max_message_size <= internode_application_send_queue_reserve_endpoint_capacity
> internode_max_message_size <= internode_application_send_queue_reserve_global_capacity
> 3) Request types according to the CQL specification:
> 4.1.1. STARTUP, in normal cases should be small
> 4.1.2. AUTH_RESPONSE, in normal cases should be small
> 4.1.3. OPTIONS, in normal cases should be small
> 4.1.4. QUERY, in normal cases should be small
> 4.1.5. PREPARE, in normal cases should be small
> 4.1.6. EXECUTE <-- potentially large in case of inserts, max_mutation_size = commitlog_segment_size / 2; where commitlog_segment_size_in_mb = 32MiB
> 4.1.7. BATCH <-- potentially large, max_mutation_size = commitlog_segment_size / 2; where commitlog_segment_size_in_mb = 32MiB
> 4.1.8. REGISTER, in normal cases should be small
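The arithmetic in observation 3 and the internode-style validation in observation 2 can be sketched as follows. The class and method names are hypothetical, and the two global capacity values are illustrative assumptions because the description does not state them. The 32MiB segment size and the 128MiB endpoint capacities come from the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the arithmetic and validation rules listed in the observations.
final class LimitValidationSketch {
    static final long MIB = 1024L * 1024L;

    // max_mutation_size = commitlog_segment_size / 2
    static long maxMutationSize(long commitlogSegmentBytes) {
        return commitlogSegmentBytes / 2;
    }

    // Reject a configuration in which a single message could exceed any capacity,
    // mirroring the "internode_max_message_size <= ..." rules.
    static void validate(long maxMessageBytes, Map<String, Long> capacities) {
        for (Map.Entry<String, Long> entry : capacities.entrySet()) {
            if (maxMessageBytes > entry.getValue()) {
                throw new IllegalArgumentException(
                        "max message size " + maxMessageBytes + " exceeds " + entry.getKey());
            }
        }
    }

    public static void main(String[] args) {
        long maxMutation = maxMutationSize(32 * MIB);
        System.out.println("max_mutation_size = " + (maxMutation / MIB) + " MiB"); // prints 16 MiB

        Map<String, Long> capacities = new LinkedHashMap<>();
        capacities.put("receive_queue_reserve_endpoint_capacity", 128 * MIB); // from the description
        capacities.put("send_queue_reserve_endpoint_capacity", 128 * MIB);    // from the description
        capacities.put("receive_queue_reserve_global_capacity", 512 * MIB);   // illustrative assumption
        capacities.put("send_queue_reserve_global_capacity", 512 * MIB);      // illustrative assumption

        // internode_max_message_size = min(receive endpoint capacity, send endpoint capacity)
        long maxMessage = Math.min(
                capacities.get("receive_queue_reserve_endpoint_capacity"),
                capacities.get("send_queue_reserve_endpoint_capacity"));
        validate(maxMessage, capacities); // passes: 128 MiB fits within every capacity
    }
}
```

With a 32MiB commit log segment, the largest mutation an EXECUTE or BATCH can carry is 16 MiB, which gives a rough lower bound to keep in mind when choosing a client message limit that should not break existing large writes.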