[GitHub] [cassandra] mike-tr-adamson commented on a diff in pull request #2540: Reduce size of per-SSTable index components for SAI

via GitHub Fri, 04 Aug 2023 04:23:11 -0700


mike-tr-adamson commented on code in PR #2540:
URL: https://github.com/apache/cassandra/pull/2540#discussion_r1284301369



##########
src/java/org/apache/cassandra/index/sai/disk/v1/sortedterms/SortedTermsWriter.java:
##########
@@ -49,9 +43,9 @@
  * <p>
  * For documentation of the underlying on-disk data structures, see the package documentation.
  * <p>
- * The TERMS_DICT_ constants allow for quickly determining the id of the current block based on a point id
- * or to check if we are exactly at the beginning of the block.
- * Terms data are organized in blocks of (2 ^ {@link #TERMS_DICT_BLOCK_SHIFT}) terms.
+ * The {@code cassandra.sai.sorted_terms_block_shift} property is used to quickly determine the id of the current block
+ * based on a point id or to check if we are exactly at the beginning of the block.
+ * Terms data are organized in blocks of (2 ^ {@link #blockShift}) terms.
  * The blocks should not be too small because they allow prefix compression of
  * the terms except the first term in a block.
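
The block arithmetic described in the Javadoc above can be sketched as follows. This is a minimal illustration, not the code from the PR: the class and method names are hypothetical, and it assumes only that a block holds 2^blockShift terms and that point ids start at 0.

```java
// Hypothetical helper; names are illustrative and not taken from the Cassandra source.
final class SortedTermsBlockMath
{
    private final int blockShift;
    private final long blockMask;

    SortedTermsBlockMath(int blockShift)
    {
        if (blockShift < 0 || blockShift > 62)
            throw new IllegalArgumentException("blockShift out of range: " + blockShift);
        this.blockShift = blockShift;
        // Low blockShift bits set, e.g. blockShift = 4 gives a mask of 0b1111.
        this.blockMask = (1L << blockShift) - 1;
    }

    // Number of terms per block: 2 ^ blockShift.
    long termsPerBlock()
    {
        return 1L << blockShift;
    }

    // Id of the block containing the given point id (equivalent to pointId / termsPerBlock).
    long blockId(long pointId)
    {
        return pointId >>> blockShift;
    }

    // True when the point id is the first term of its block (pointId % termsPerBlock == 0).
    boolean isBlockStart(long pointId)
    {
        return (pointId & blockMask) == 0;
    }
}
```

With a shift of 4, for example, each block holds 16 terms, point id 16 is the first term of block 1, and point id 17 is stored prefix-compressed within that same block.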

Review Comment:
   I have had exactly the same thoughts. Since we now know exactly how long 
each partition is, there is no real need for these to be combined. We also know 
the partition ID for each row ID, so we wouldn't need to store the partition key 
for every row; we'd only need to store it once per partition. 
   
   Let me try this and see how it looks.
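
As a rough illustration of the idea in this comment, the sketch below stores one key per partition and resolves a row ID to its partition through the partition lengths. It is an assumption-laden sketch rather than the PR's implementation: the class name, the use of String keys, and the in-memory arrays are all hypothetical, and it assumes row IDs are assigned contiguously from 0 in partition order with no empty partitions.

```java
import java.util.Arrays;

// Hypothetical sketch; not the on-disk format used by SAI.
final class PartitionKeyLookup
{
    // firstRowIds[p] is the row id of the first row in partition p.
    private final long[] firstRowIds;
    // One key per partition instead of one key per row.
    private final String[] partitionKeys;
    private final long totalRows;

    PartitionKeyLookup(String[] partitionKeys, int[] partitionSizes)
    {
        if (partitionKeys.length != partitionSizes.length)
            throw new IllegalArgumentException("keys and sizes must have the same length");
        this.partitionKeys = partitionKeys.clone();
        this.firstRowIds = new long[partitionSizes.length];
        long next = 0;
        for (int p = 0; p < partitionSizes.length; p++)
        {
            if (partitionSizes[p] <= 0)
                throw new IllegalArgumentException("partition sizes must be positive");
            firstRowIds[p] = next;
            next += partitionSizes[p];
        }
        this.totalRows = next;
    }

    // Finds the partition whose row range contains rowId via binary search.
    int partitionIdForRow(long rowId)
    {
        if (rowId < 0 || rowId >= totalRows)
            throw new IndexOutOfBoundsException("rowId out of range: " + rowId);
        int pos = Arrays.binarySearch(firstRowIds, rowId);
        // Not found: binarySearch returns -(insertionPoint) - 1, and the owning
        // partition is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    String partitionKeyForRow(long rowId)
    {
        return partitionKeys[partitionIdForRow(rowId)];
    }
}
```

For three partitions of 3, 1 and 2 rows, rows 0 to 2 resolve to the first key, row 3 to the second, and rows 4 and 5 to the third, so only three keys are stored for six rows.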
   
   



