[GitHub] [cassandra] maedhroz commented on a change in pull request #1180: CASSANDRA-16896: Add soft/hard limits to local reads to protect against reading too much data in a single query

GitBox Fri, 03 Sep 2021 10:47:12 -0700


maedhroz commented on a change in pull request #1180:
URL: https://github.com/apache/cassandra/pull/1180#discussion_r702070392




##########
File path: src/java/org/apache/cassandra/db/RowIndexEntry.java
##########
@@ -343,6 +349,47 @@ public static void skipForCache(DataInputPlus in) throws IOException
             }
         }
 
+        private void checkSize(int entries, int bytes)
+        {
+            ReadCommand command = ReadCommand.getCommand();
+            if (command == null || SchemaConstants.isSystemKeyspace(command.metadata().keyspace) || !DatabaseDescriptor.getClientTrackWarningsEnabled())
+                return;
+
+            int warnThreshold = DatabaseDescriptor.getRowIndexSizeWarningThresholdKb() * 1024;
+            int abortThreshold = DatabaseDescriptor.getRowIndexSizeAbortThresholdKb() * 1024;
+
+            long estimatedMemory = estimateMaterializedIndexSize(entries, bytes);
+            ColumnFamilyStore cfs = Schema.instance.getColumnFamilyStoreInstance(command.metadata().id);
+            if (cfs != null)
+                cfs.metric.rowIndexSize.update(estimatedMemory);
+
+            if (abortThreshold != 0 && estimatedMemory > abortThreshold)
+            {
+                String msg = String.format("Query %s attempted to access a large RowIndexEntry estimated to be %d bytes " +
+                                           "in-memory (total entries: %d, total bytes: %d) but the max allowed is %d;" +
+                                           " query aborted  (see row_index_size_abort_threshold_kb)",
+                                           command.toCQLString(), estimatedMemory, entries, bytes, abortThreshold);
+                MessageParams.remove(ParamType.ROW_INDEX_ENTRY_TOO_LARGE_WARNING);
+                MessageParams.add(ParamType.ROW_INDEX_ENTRY_TOO_LARGE_ABORT, estimatedMemory);
+
+                throw new RowIndexEntryTooLargeException(msg);
+            }
+            else if (warnThreshold != 0 && estimatedMemory > warnThreshold)
+            {
+                // use addIfLarger rather than add as a previous partition may be larger than this one
+                MessageParams.addIfLarger(ParamType.ROW_INDEX_ENTRY_TOO_LARGE_WARNING, estimatedMemory);

Review comment:
       This is along the same lines as
https://github.com/apache/cassandra/pull/1180/files#r702060997, but I want to
make sure there aren't any edge cases (e.g., a local read and a short-read
protection read running on different threads) where operations against the
local node/replica accumulate information in different thread-local maps in
`MessageParams`. (I think requests to remote replicas should be safe, as
they'll accumulate these params and serialize a response on the same thread.)
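
To make the concern concrete, here is a minimal sketch of how a value recorded on one thread is invisible to another when the store is thread-local. `ThreadLocalParamsSketch`, its methods, and the `ROW_INDEX_SIZE` key are hypothetical stand-ins written for illustration; this is not the actual `MessageParams` implementation, and it assumes nothing about how Cassandra schedules these reads.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a thread-local parameter store; NOT Cassandra's MessageParams.
final class ThreadLocalParamsSketch
{
    // Each thread lazily gets its own private map.
    private static final ThreadLocal<Map<String, Long>> PARAMS = ThreadLocal.withInitial(HashMap::new);

    // Keep the largest value seen for this key on the calling thread.
    static void addIfLarger(String key, long value)
    {
        PARAMS.get().merge(key, value, Math::max);
    }

    // Returns the value recorded on the calling thread, or null if there is none.
    static Long get(String key)
    {
        return PARAMS.get().get(key);
    }

    // Drops the calling thread's map.
    static void reset()
    {
        PARAMS.remove();
    }

    // Records a value from a separate thread and waits for that thread to finish.
    static void recordOnOtherThread(String key, long value)
    {
        Thread t = new Thread(() -> addIfLarger(key, value));
        t.start();
        try
        {
            t.join();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        reset();
        addIfLarger("ROW_INDEX_SIZE", 100L);          // stands in for the initial local read
        recordOnOtherThread("ROW_INDEX_SIZE", 500L);  // stands in for a follow-up read on another thread
        // The calling thread only sees its own map, so the larger value is not reflected here.
        System.out.println(get("ROW_INDEX_SIZE"));
    }
}
```

In this sketch the calling thread reads back 100 even though 500 was recorded elsewhere, which is the kind of split accumulation the comment asks about.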



