Re: [PR] CASSANDRA-20066: Expose detailed bulk write failure message for better insight [cassandra-analytics]

via GitHub Fri, 15 Nov 2024 08:21:54 -0800


JeetKunDoug commented on code in PR #92:
URL: 
https://github.com/apache/cassandra-analytics/pull/92#discussion_r1844118718



##########
scripts/build-dtest-jars.sh:
##########
@@ -52,7 +52,7 @@ else
   if [[ "$CLEAN" == "true" ]]; then
     echo "Clean up $DTEST_JAR_DIR"
     rm -rf "$DTEST_JAR_DIR/cassandra-build"
-    rm "$DTEST_JAR_DIR/dtest*.jar"
+    rm -rf "$DTEST_JAR_DIR/dtest*.jar"

Review Comment:
   Why? you should only be deleting the jar _files_ so what does `-r` do here?



##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/BulkWriteValidator.java:
##########
@@ -110,19 +112,24 @@ public static void updateFailureHandler(CommitResult 
commitResult,
         });
     }
 
-    private static void logFailedRanges(Logger logger, String phase,
-                                        
List<ReplicaAwareFailureHandler<RingInstance>.ConsistencyFailurePerRange> 
failedRanges)
+    // aggregate the stream errors in order to provide a better insight on 
failure
+    private static String 
aggregateErrors(List<ReplicaAwareFailureHandler<RingInstance>.ConsistencyFailurePerRange>
 failedRanges)
     {
+        StringBuilder sb = new StringBuilder();
         for 
(ReplicaAwareFailureHandler<RingInstance>.ConsistencyFailurePerRange 
failedRange : failedRanges)
         {
-            failedRange.failuresPerInstance.forEachInstance((instance, errors) 
-> {
-                logger.error("Failed in phase {} for {} on {}. Failure: {}",

Review Comment:
   Not thrilled with the fact that we lost the logging completely - I'd build 
the error message/cap its length, but continue logging everything, rather than 
just building the error and returning it.



##########
cassandra-analytics-integration-tests/src/test/java/org/apache/cassandra/analytics/correctness/BulkWriteCorruptionTest.java:
##########
@@ -53,22 +56,55 @@
 import static org.apache.cassandra.testing.TestUtils.ROW_COUNT;
 import static org.apache.cassandra.testing.TestUtils.TEST_KEYSPACE;
 import static org.assertj.core.api.Assertions.assertThat;
-import static org.assertj.core.api.Assertions.fail;
 
-public class BulkWriteDiskCorruptionTest extends 
SharedClusterSparkIntegrationTestBase
+public class BulkWriteCorruptionTest extends 
SharedClusterSparkIntegrationTestBase
 {
-    private static final QualifiedName QUALIFIED_NAME = new 
QualifiedName(TEST_KEYSPACE, "test_write_disk_corruption");
+    enum CorruptionMode
+    {
+        DISK,
+        WIRE
+    }
+
+    private static final QualifiedName QUALIFIED_NAME = new 
QualifiedName(TEST_KEYSPACE, "test_write_corruption");
 
     static
     {
-        // Intercepts SortedSSTableWriter#validateSSTables to corrupt file on 
purpose.
         // Install the class rebase the earliest, before JVM loads the class
-        BBHelperInterceptSortedSSTableWriterValidateSSTables.install();
+        BBHelperFileCorrupter.install();

Review Comment:
   NIT: Why not just have 2 different byte buddy helper classes, rather than 
introducing the mode enum/if statements in two different intercepted methods?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] CASSANDRA-20066: Expose detailed bulk write failure message for better insight [cassandra-analytics]

Reply via email to