This is an automated email from the ASF dual-hosted git repository.

echauchot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/flink-connector-cassandra.git


The following commit(s) were added to refs/heads/main by this push:
     new 6088c45  [FLINK-32014][hotfix][javadoc] Fix incorrect javadoc regarding maxSplitMemorySize and add numSplits computation information (#14)
6088c45 is described below

commit 6088c456492976a08441821793edb47cdf61ce18
Author: Etienne Chauchot <[email protected]>
AuthorDate: Wed May 10 17:33:04 2023 +0200

    [FLINK-32014][hotfix][javadoc] Fix incorrect javadoc regarding maxSplitMemorySize and add numSplits computation information (#14)
---
 .../apache/flink/connector/cassandra/source/CassandraSource.java    | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/CassandraSource.java b/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/CassandraSource.java
index dd45913..6ac90d0 100644
--- a/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/CassandraSource.java
+++ b/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/CassandraSource.java
@@ -70,7 +70,7 @@ import static org.apache.flink.util.Preconditions.checkState;
  *                   .build();
  *   }
  * };
- * long maxSplitMemorySize = ... //optional max split size in bytes. If not set, maxSplitMemorySize = tableSize / parallelism
+ * long maxSplitMemorySize = ... //optional max split size in bytes minimum is 10MB. If not set, maxSplitMemorySize = 64 MB
  * Source cassandraSource = new CassandraSource(clusterBuilder,
  *                                              maxSplitMemorySize,
  *                                              Pojo.class,
@@ -80,6 +80,10 @@ import static org.apache.flink.util.Preconditions.checkState;
  * DataStream<Pojo> stream = env.fromSource(cassandraSource, WatermarkStrategy.noWatermarks(),
  * "CassandraSource");
  * }</pre>
+ *
+ * <p>Regarding performances, the source splits table data like this: numSplits =
+ * tableSize/maxSplitMemorySize. If tableSize cannot be determined or previous numSplits computation
+ * makes too few splits, falling back to numSplits=parallelism
  */
 @PublicEvolving
 public class CassandraSource<OUT>
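
As a rough illustration of the numSplits rule described in the new javadoc above, the following Java sketch computes a split count from a table size, a max split size, and a parallelism. It is not the connector's actual code: the class and method names are hypothetical, a non-positive tableSize is assumed to mean "cannot be determined", "too few splits" is assumed to mean fewer splits than the parallelism, and 1 MB is assumed to be 1024 * 1024 bytes.

```java
// Hypothetical sketch of the rule stated in the javadoc; not the connector's implementation.
final class SplitCountSketch {

    // Values taken from the javadoc (10 MB minimum, 64 MB default).
    // Assumption: 1 MB = 1024 * 1024 bytes.
    static final long MIN_SPLIT_MEMORY_SIZE = 10L * 1024 * 1024;
    static final long DEFAULT_SPLIT_MEMORY_SIZE = 64L * 1024 * 1024;

    /**
     * Returns tableSize / maxSplitMemorySize, falling back to parallelism when the
     * table size is unknown or the division yields fewer splits than the parallelism.
     */
    static long numSplits(long tableSize, long maxSplitMemorySize, int parallelism) {
        if (maxSplitMemorySize < MIN_SPLIT_MEMORY_SIZE) {
            throw new IllegalArgumentException("maxSplitMemorySize must be at least 10 MB");
        }
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be at least 1");
        }
        // Assumption: a non-positive size stands for "tableSize cannot be determined".
        if (tableSize <= 0) {
            return parallelism;
        }
        long bySize = tableSize / maxSplitMemorySize;
        // Assumption: "too few splits" means fewer splits than parallel readers.
        return bySize < parallelism ? parallelism : bySize;
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;
        // 1 GiB table, default 64 MB splits, parallelism 4
        System.out.println(numSplits(oneGiB, DEFAULT_SPLIT_MEMORY_SIZE, 4)); // prints 16
    }
}
```

With these assumptions, a 100 MB table read with the default split size and a parallelism of 4 would yield 4 splits, because 100 MB / 64 MB gives only 1 split and the fallback applies.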
