yifan-c commented on code in PR #2824:
URL: https://github.com/apache/cassandra/pull/2824#discussion_r1366116203


##########
src/java/org/apache/cassandra/io/sstable/SSTableSimpleWriter.java:
##########
@@ -36,64 +42,132 @@
  * means that rows should be added by increasing md5 of the row key. This is
  * rarely possible and SSTableSimpleUnsortedWriter should most of the time be
  * prefered.
+ * <p>
+ * Optionally, the writer can be configured with a max SSTable size for 
SSTables.
+ * The output will be a series of SSTables that do not exceed a specified size.
+ * By default, all sorted data are written into a single SSTable.
  */
 class SSTableSimpleWriter extends AbstractSSTableSimpleWriter
 {
+    private final long maxSSTableSize;
+
+    // Used to compute the row serialized size
+    private final SerializationHeader header;
+    private final SerializationHelper helper;
+
     protected DecoratedKey currentKey;
     protected PartitionUpdate.Builder update;
 
+    private long currentSize;
     private SSTableTxnWriter writer;
 
-    protected SSTableSimpleWriter(File directory, TableMetadataRef metadata, 
RegularAndStaticColumns columns)
+    /**
+     * Create a SSTable writer for sorted input data.
+     * When a positive {@param maxSSTableSizeInMiB} is defined, the writer 
outputs a sequence of SSTables,
+     * whose sizes do not exceed the specified value.
+     *
+     * @param directory directory to store the sstable files
+     * @param metadata table metadata
+     * @param columns columns to update
+     * @param maxSSTableSizeInMiB defines the max SSTable size if the value is 
positive.
+     *                            Any non-positive value indicates the sstable 
size is unlimited.
+     */
+    protected SSTableSimpleWriter(File directory, TableMetadataRef metadata, 
RegularAndStaticColumns columns, long maxSSTableSizeInMiB)
     {
         super(directory, metadata, columns);
+        this.maxSSTableSize = maxSSTableSizeInMiB * 1024L * 1024L;
+        this.header = new SerializationHeader(true, metadata.get(), columns, 
EncodingStats.NO_STATS);
+        this.helper = new SerializationHelper(this.header);
     }
 
-    private SSTableTxnWriter getOrCreateWriter() throws IOException
-    {
-        if (writer == null)
-            writer = createWriter(null);
-
-        return writer;
-    }
-
+    @Override
     PartitionUpdate.Builder getUpdateFor(DecoratedKey key) throws IOException
     {
-        assert key != null;
+        Preconditions.checkArgument(key != null, "Partition update cannot have 
null key");

Review Comment:
   Let's qualify the number of instructions.
   
   Using `assert`
   ```
     org.apache.cassandra.db.partitions.PartitionUpdate$Builder 
getUpdateFor(org.apache.cassandra.db.DecoratedKey) throws java.io.IOException;
       descriptor: 
(Lorg/apache/cassandra/db/DecoratedKey;)Lorg/apache/cassandra/db/partitions/PartitionUpdate$Builder;
       flags: (0x0000)
       Code:
         stack=7, locals=2, args_size=2
            0: getstatic     #4                  // Field $assertionsDisabled:Z
            3: ifne          18
            6: aload_1
            7: ifnonnull     18
           10: new           #5                  // class 
java/lang/AssertionError
           13: dup
           14: invokespecial #6                  // Method 
java/lang/AssertionError."<init>":()V
           17: athrow
           18: aload_1
           19: aload_0
   ```
   
   Using nullness check with `Precondition`
   ```
     org.apache.cassandra.db.partitions.PartitionUpdate$Builder 
getUpdateFor(org.apache.cassandra.db.DecoratedKey) throws java.io.IOException;
       descriptor: 
(Lorg/apache/cassandra/db/DecoratedKey;)Lorg/apache/cassandra/db/partitions/PartitionUpdate$Builder;
       flags: (0x0000)
       Code:
         stack=7, locals=2, args_size=2
            0: aload_1
            1: ifnull        8
            4: iconst_1
            5: goto          9
            8: iconst_0
            9: ldc           #5                  // String Partition update 
cannot have null key
           11: invokestatic  #6                  // Method 
com/google/common/base/Preconditions.checkArgument:(ZLjava/lang/Object;)V
           14: aload_0
   ```
   
   Assume the input is not null.
   
   In the case of `assert`, when with `-ea`, the method is going to check 
whether assertion is disabled and check whether the input is null. There are 4 
instructions. When w/o `-ea`, there are 2 instructions to check whether 
assertion is disabled. 
   
   In the case of `Preconditions`, there are 6 instructions, i.e. `0, 1, 4, 5, 
9, 11` compare null, store comparison result, and invoke checkArgument method. 
It has 2 or 4 extra instructions comparing with `assert`, depending on whether 
it is enabled or not.
   
   The method `getUpdateFor` is called per each row. Adding a row easily runs 
thousands of instructions. A couple more instruction is a negligible tradeoff, 
but provides a good improvement with the nicer descriptive error message and 
guarantees the check is always performed for correctness. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to