[GitHub] [cassandra] dcapwell commented on a change in pull request #1180: CASSANDRA-16896: Add soft/hard limits to local reads to protect against reading too much data in a single query

GitBox Fri, 03 Sep 2021 13:19:08 -0700


dcapwell commented on a change in pull request #1180:
URL: https://github.com/apache/cassandra/pull/1180#discussion_r702141795




##########
File path: src/java/org/apache/cassandra/db/RowIndexEntry.java
##########
@@ -343,6 +349,47 @@ public static void skipForCache(DataInputPlus in) throws IOException
             }
         }
 
+        private void checkSize(int entries, int bytes)
+        {
+            ReadCommand command = ReadCommand.getCommand();
+            if (command == null || SchemaConstants.isSystemKeyspace(command.metadata().keyspace) || !DatabaseDescriptor.getClientTrackWarningsEnabled())
+                return;
+
+            int warnThreshold = DatabaseDescriptor.getRowIndexSizeWarningThresholdKb() * 1024;
+            int abortThreshold = DatabaseDescriptor.getRowIndexSizeAbortThresholdKb() * 1024;
+
+            long estimatedMemory = estimateMaterializedIndexSize(entries, bytes);
+            ColumnFamilyStore cfs = Schema.instance.getColumnFamilyStoreInstance(command.metadata().id);
+            if (cfs != null)
+                cfs.metric.rowIndexSize.update(estimatedMemory);
+
+            if (abortThreshold != 0 && estimatedMemory > abortThreshold)
+            {
+                String msg = String.format("Query %s attempted to access a large RowIndexEntry estimated to be %d bytes " +
+                                           "in-memory (total entries: %d, total bytes: %d) but the max allowed is %d;" +
+                                           " query aborted  (see row_index_size_abort_threshold_kb)",
+                                           command.toCQLString(), estimatedMemory, entries, bytes, abortThreshold);
+                MessageParams.remove(ParamType.ROW_INDEX_ENTRY_TOO_LARGE_WARNING);
+                MessageParams.add(ParamType.ROW_INDEX_ENTRY_TOO_LARGE_ABORT, estimatedMemory);
+
+                throw new RowIndexEntryTooLargeException(msg);
+            }
+            else if (warnThreshold != 0 && estimatedMemory > warnThreshold)
+            {
+                // use addIfLarger rather than add as a previous partition may be larger than this one
+                MessageParams.addIfLarger(ParamType.ROW_INDEX_ENTRY_TOO_LARGE_WARNING, estimatedMemory);

Review comment:
       Your link just loads all files, so I'm not sure whether you were pointing to a specific line.
   
   > where operations against the local node/replica will accumulate information on different thread-local map in MessageParams
   
   🤷 This feature is gated on the current thread having a ReadCommand, so the thread must be doing a read. If we reach a point where the read is not single-threaded, I don't think this will work correctly: the only data collected will be whatever is recorded on the thread running the ReadCommand.
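
A minimal sketch of the concern, assuming the params live in a per-thread map. `ThreadLocalParamsSketch` and its methods are hypothetical stand-ins written for illustration, not the actual `MessageParams` implementation. It shows that a value recorded on a helper thread is invisible to the thread that started the read:

```java
import java.util.HashMap;
import java.util.Map;

public class ThreadLocalParamsSketch
{
    // Hypothetical stand-in for MessageParams: every thread gets its own map.
    private static final ThreadLocal<Map<String, Long>> PARAMS =
        ThreadLocal.withInitial(HashMap::new);

    // Keep the largest value seen so far on the calling thread.
    public static void addIfLarger(String key, long value)
    {
        PARAMS.get().merge(key, value, Long::max);
    }

    // Read the value recorded on the calling thread, or null if none.
    public static Long get(String key)
    {
        return PARAMS.get().get(key);
    }

    // Drop the calling thread's map.
    public static void reset()
    {
        PARAMS.remove();
    }

    // Record a value on a helper thread, then return what the caller sees.
    public static Long recordOnOtherThread(String key, long value)
    {
        Thread helper = new Thread(() -> addIfLarger(key, value));
        helper.start();
        try
        {
            helper.join();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return get(key);
    }

    public static void main(String[] args)
    {
        String key = "ROW_INDEX_ENTRY_TOO_LARGE_WARNING";
        addIfLarger(key, 100L);
        addIfLarger(key, 50L);                               // smaller, so ignored
        System.out.println(get(key));                        // prints 100
        System.out.println(recordOnOtherThread(key, 500L));  // still prints 100
    }
}
```

The second print stays at 100 because the helper thread updated its own map. Under the assumption above, any work moved off the thread running the ReadCommand would have its measurements dropped unless they were merged back explicitly.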




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


