Re: [PR] CASSANDRA-19513: Refactor Cassandra bridge [cassandra-analytics]

2024-04-02 Thread via GitHub


frankgh merged PR #48:
URL: https://github.com/apache/cassandra-analytics/pull/48


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19513: Refactor Cassandra bridge [cassandra-analytics]

2024-04-02 Thread via GitHub


frankgh commented on code in PR #48:
URL: 
https://github.com/apache/cassandra-analytics/pull/48#discussion_r1548723495


##
cassandra-four-zero-bridge/src/main/java/org/apache/cassandra/bridge/CassandraBridgeImplementation.java:
##
@@ -244,7 +244,7 @@ public static List> 
partitionKeyColumnTypes(@NotNull CqlTable ta
 }
 
 @Override
-public StreamScanner getCompactionScanner(@NotNull CqlTable table,
+public StreamScanner getCompactionScanner(@NotNull CqlTable table,

Review Comment:
   yes! good catch 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19513: Refactor Cassandra bridge [cassandra-analytics]

2024-04-02 Thread via GitHub


yifan-c commented on code in PR #48:
URL: 
https://github.com/apache/cassandra-analytics/pull/48#discussion_r1548719722


##
cassandra-four-zero-bridge/src/main/java/org/apache/cassandra/bridge/CassandraBridgeImplementation.java:
##
@@ -244,7 +244,7 @@ public static List> 
partitionKeyColumnTypes(@NotNull CqlTable ta
 }
 
 @Override
-public StreamScanner getCompactionScanner(@NotNull CqlTable table,
+public StreamScanner getCompactionScanner(@NotNull CqlTable table,

Review Comment:
   accidental indentation?



##
cassandra-analytics-core/src/test/java/org/apache/cassandra/spark/SSTableReaderTests.java:
##
@@ -0,0 +1,147 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.cassandra.spark;
+
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.stream.Collectors;
+
+import com.google.common.util.concurrent.Uninterruptibles;
+import org.apache.commons.lang.StringUtils;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.MethodSource;
+
+import org.apache.cassandra.bridge.CassandraBridge;
+import org.apache.cassandra.spark.data.BasicSupplier;
+import org.apache.cassandra.spark.data.CqlTable;
+import org.apache.cassandra.spark.data.FileType;
+import org.apache.cassandra.spark.reader.Rid;
+import org.apache.cassandra.spark.reader.StreamScanner;
+import org.apache.cassandra.spark.stats.Stats;
+import org.apache.cassandra.spark.utils.ByteBufferUtils;
+import org.apache.cassandra.spark.utils.TimeProvider;
+import org.apache.cassandra.spark.utils.test.TestSchema;
+import org.apache.spark.sql.catalyst.util.GenericArrayData;
+
+import static org.apache.cassandra.spark.TestUtils.countSSTables;
+import static org.apache.cassandra.spark.TestUtils.getFileType;
+import static org.apache.cassandra.spark.TestUtils.runTest;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.quicktheories.QuickTheory.qt;
+
+public class SSTableReaderTests
+{
+
+@ParameterizedTest
+@MethodSource("org.apache.cassandra.bridge.VersionRunner#bridges")
+public void 
testCollectionWithTtlUsingConstantReferenceTime(CassandraBridge bridge)
+{
+// offset is 0, column values in all rows should be unexpired; thus, 
reading 10 values
+testTtlUsingConstantReferenceTimeHelper(bridge, 50, 0, 10, 10);
+// ensure all rows expires by advancing enough time in the future; 
thus, reading 0 values
+testTtlUsingConstantReferenceTimeHelper(bridge, 50, 100, 10, 0);
+}
+
+// helper that write rows with ttl, and assert on the compaction result by 
changing the reference time
+private void testTtlUsingConstantReferenceTimeHelper(CassandraBridge 
bridgeForTest,
+ int ttlSecs,
+ int timeOffsetSecs,
+ int rows,
+ int expectedValues)
+{
+AtomicInteger referenceEpoch = new AtomicInteger(0);
+TimeProvider navigatableTimeProvider = referenceEpoch::get;
+
+Set expectedColValue = new HashSet<>(Arrays.asList(1, 2, 3));
+TestRunnable test = (partitioner, dir, bridge) -> {
+TestSchema schema = TestSchema.builder(bridge)
+  .withPartitionKey("a", bridge.aInt())
+  .withColumn("b", 
bridge.set(bridge.aInt()))
+  .withTTL(ttlSecs)
+  .build();
+schema.writeSSTable(dir, bridge, partitioner, (writer) -> {
+for (int i = 0; i < rows; i++)
+{
+writer.write(i, expectedColValue);
+}
+Uninterruptibles.sleepUninterruptibly(1, TimeUnit.SECONDS);
+});
+int t1 = navigatableTimeProvider.nowInSeconds();

Re: [PR] CASSANDRA-19513: Refactor Cassandra bridge [cassandra-analytics]

2024-04-02 Thread via GitHub


frankgh commented on code in PR #48:
URL: 
https://github.com/apache/cassandra-analytics/pull/48#discussion_r1548710244


##
scripts/relocate-dtest-dependencies.pom:
##
@@ -114,12 +114,12 @@
 
 
 io.netty
-
shaded.io.netty
+
relocated.shaded.io.netty

Review Comment:
   see 
https://github.com/apache/cassandra/blob/trunk/relocate-dependencies.pom#L84



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19513: Refactor Cassandra bridge [cassandra-analytics]

2024-04-02 Thread via GitHub


frankgh commented on code in PR #48:
URL: 
https://github.com/apache/cassandra-analytics/pull/48#discussion_r1548644061


##
cassandra-analytics-core/src/test/java/org/apache/cassandra/spark/SSTableReaderTests.java:
##
@@ -0,0 +1,147 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.cassandra.spark;
+
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.stream.Collectors;
+
+import com.google.common.util.concurrent.Uninterruptibles;
+import org.apache.commons.lang.StringUtils;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.MethodSource;
+
+import org.apache.cassandra.bridge.CassandraBridge;
+import org.apache.cassandra.spark.data.BasicSupplier;
+import org.apache.cassandra.spark.data.CqlTable;
+import org.apache.cassandra.spark.data.FileType;
+import org.apache.cassandra.spark.reader.Rid;
+import org.apache.cassandra.spark.reader.StreamScanner;
+import org.apache.cassandra.spark.stats.Stats;
+import org.apache.cassandra.spark.utils.ByteBufferUtils;
+import org.apache.cassandra.spark.utils.TimeProvider;
+import org.apache.cassandra.spark.utils.test.TestSchema;
+import org.apache.spark.sql.catalyst.util.GenericArrayData;
+
+import static org.apache.cassandra.spark.TestUtils.countSSTables;
+import static org.apache.cassandra.spark.TestUtils.getFileType;
+import static org.apache.cassandra.spark.TestUtils.runTest;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.quicktheories.QuickTheory.qt;
+
+public class SSTableReaderTests
+{
+
+@ParameterizedTest
+@MethodSource("org.apache.cassandra.bridge.VersionRunner#bridges")
+public void 
testCollectionWithTtlUsingConstantReferenceTime(CassandraBridge bridge)
+{
+// offset is 0, column values in all rows should be unexpired; thus, 
reading 10 values
+testTtlUsingConstantReferenceTimeHelper(bridge, 50, 0, 10, 10);

Review Comment:
   I split the `SSTableReaderTests`. These tests are not specific to the 40 
bridge, so they are moved here instead



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19513: Refactor Cassandra bridge [cassandra-analytics]

2024-04-02 Thread via GitHub


frankgh commented on code in PR #48:
URL: 
https://github.com/apache/cassandra-analytics/pull/48#discussion_r1548642277


##
cassandra-analytics-core/build.gradle:
##
@@ -19,171 +19,152 @@ import java.nio.file.Paths
  * under the License.
  */
 
-project(':cassandra-analytics-core') {
-apply(plugin: 'java-library')
-apply(plugin: 'jacoco')
-apply(plugin: 'maven-publish')
-
-java {
-withJavadocJar()
-withSourcesJar()
-}
+plugins {
+id('java-library')
+id('jacoco')
+id('maven-publish')
+}
 
-publishing {
-publications {
-maven(MavenPublication) {
-from components.java
-groupId project.group
-artifactId "${archivesBaseName}"
-version System.getenv("CODE_VERSION") ?: "${version}"
-}
-}
+java {
+withJavadocJar()
+withSourcesJar()
+
+// Allows publishing the test jar
+registerFeature("test") {

Review Comment:
   produce a tests jar instead of packaging the test sources in the release jar



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19513: Refactor Cassandra bridge [cassandra-analytics]

2024-04-02 Thread via GitHub


frankgh commented on code in PR #48:
URL: 
https://github.com/apache/cassandra-analytics/pull/48#discussion_r1548642100


##
cassandra-analytics-core/build.gradle:
##
@@ -19,171 +19,152 @@ import java.nio.file.Paths
  * under the License.
  */
 
-project(':cassandra-analytics-core') {

Review Comment:
   remove unnecessary project wrapping



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19513: Refactor Cassandra bridge [cassandra-analytics]

2024-04-02 Thread via GitHub


frankgh commented on code in PR #48:
URL: 
https://github.com/apache/cassandra-analytics/pull/48#discussion_r1548620059


##
scripts/relocate-dtest-dependencies.pom:
##
@@ -114,12 +114,12 @@
 
 
 io.netty
-
shaded.io.netty
+
relocated.shaded.io.netty

Review Comment:
   better aligns with Cassandra OSS relocation shading pattern.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org