frankgh commented on code in PR #36:
URL: 
https://github.com/apache/cassandra-analytics/pull/36#discussion_r1468745164


##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/BulkSparkConf.java:
##########
@@ -147,9 +143,7 @@ public BulkSparkConf(SparkConf conf, Map<String, String> 
options)
         this.consistencyLevel = 
ConsistencyLevel.CL.valueOf(MapUtils.getOrDefault(options, 
WriterOptions.BULK_WRITER_CL.name(), "EACH_QUORUM"));
         this.localDC = MapUtils.getOrDefault(options, 
WriterOptions.LOCAL_DC.name(), null);
         this.numberSplits = MapUtils.getInt(options, 
WriterOptions.NUMBER_SPLITS.name(), DEFAULT_NUM_SPLITS, "number of splits");
-        this.rowBufferMode = MapUtils.getEnumOption(options, 
WriterOptions.ROW_BUFFER_MODE.name(), DEFAULT_ROW_BUFFER_MODE, "row buffering 
mode");
-        this.sstableDataSizeInMB = MapUtils.getInt(options, 
WriterOptions.SSTABLE_DATA_SIZE_IN_MB.name(), 160, "sstable data size in MB");
-        this.sstableBatchSize = MapUtils.getInt(options, 
WriterOptions.BATCH_SIZE.name(), 1_000_000, "sstable batch size");
+        this.sstableDataSizeInMB = MapUtils.getInt(options, 
WriterOptions.SSTABLE_DATA_SIZE_IN_MB.name(), DEFAULT_SSTABLE_DATA_SIZE_IN_MB, 
"sstable data size in mebibytes");

Review Comment:
   if data size is mebibytes, we should probably rename the variables to MIB.
   
   ```
   WriterOptions.SSTABLE_DATA_SIZE_IN_MB -> 
WriterOptions.SSTABLE_DATA_SIZE_IN_MIB
   DEFAULT_SSTABLE_DATA_SIZE_IN_MB -> MIB
   ```



##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/RecordWriter.java:
##########
@@ -368,16 +344,23 @@ private static Map<String, Object> 
getBindValuesForColumns(Map<String, Object> m
         return map;
     }
 
+    /**
+     * Close the {@link RecordWriter#sstableWriter} is present. Schedule a 
stream session with the produced sstables.

Review Comment:
   ```suggestion
        * Close the {@link RecordWriter#sstableWriter} if present. Schedule a 
stream session with the produced sstables.
   ```



##########
cassandra-four-zero/src/main/java/org/apache/cassandra/bridge/CassandraBridgeImplementation.java:
##########
@@ -164,13 +166,31 @@ public static synchronized void setup()
                 throw new RuntimeException(exception);
             }
             config.data_file_directories = new 
String[]{tempDirectory.toString()};
+            setupCommitLogConfigs(tempDirectory);
             DatabaseDescriptor.setEndpointSnitch(new SimpleSnitch());
             Keyspace.setInitialized();
 
             setup = true;
         }
     }
 
+    public static void setupCommitLogConfigs(Path path)
+    {
+        String commitLogPath = path + "/commitlog";

Review Comment:
   can commit logs be disabled for offline tools instead?



##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/JobInfo.java:
##########
@@ -33,13 +32,11 @@ public interface JobInfo extends Serializable
 
     String getLocalDC();
 
-    @NotNull
-    RowBufferMode getRowBufferMode();
-
+    /**
+     * @return the max sstable data file size in mebibytes
+     */
     int getSstableDataSizeInMB();

Review Comment:
   ```suggestion
       int getSstableDataSizeInMiB();
   ```



##########
scripts/build-dtest-jars.sh:
##########
@@ -38,8 +38,8 @@ else
   #   "cassandra-4.0:cassandra-4.0"
   # Due to MacOS being stuck on Bash < 4, we don't use associative arrays here.
   CANDIDATE_BRANCHES=(
-    "cassandra-4.0:1f79c8492528f01bcc5f88951a1cc9e0d7265c54"
-    "cassandra-4.1:725655dda2776fef35567496a6e331102eb7610d"
+    "cassandra-4.0:af752fcd535ccdac69b9fed88047b2dd7625801e"

Review Comment:
   it feels like we should use the tag here instead: `cassandra-4.0.12`
   ```suggestion
       "cassandra-4.0:cassandra-4.0.12"
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to