Abacn commented on code in PR #31733:
URL: https://github.com/apache/beam/pull/31733#discussion_r1662541194
##########
sdks/java/io/jdbc/src/test/java/org/apache/beam/sdk/io/jdbc/JdbcIOTest.java:
##########
@@ -1407,7 +1411,62 @@ public void testPartitioningLongs() {
PCollection<KV<Long, Long>> ranges =
pipeline
.apply(Create.of(KV.of(10L, KV.of(0L, 12346789L))))
- .apply(ParDo.of(new PartitioningFn<>(TypeDescriptors.longs())));
+ .apply(
+ ParDo.of(
+ new
PartitioningFn<>(JdbcUtil.getPartitionsHelper(TypeDescriptors.longs()))));
+
+ PAssert.that(ranges.apply(Count.globally())).containsInAnyOrder(10L);
+ pipeline.run().waitUntilFinish();
+ }
+
+ @Test
+ public void testPartitioningStringsWithCustomPartitionsHelper() {
+ JdbcReadWithPartitionsHelper<String> partitionsHelper =
+ new JdbcReadWithPartitionsHelper<String>() {
+ @Override
+ public Iterable<KV<String, String>> calculateRanges(
+ String lowerBoundStr, String upperBoundStr, Long partitions) {
+ List<KV<String, String>> ranges = new ArrayList<>();
+ long lowerBound = Long.parseLong(lowerBoundStr);
Review Comment:
this is a bad partition logic. I was doing this also in #31702 at the
beginning. However it turns out that this does not read all rows. suppose we
have "0" to "999", and partition into 2 strips, it will become "0" - "500";
"501" to "1000", however "501" > "1000", and "9" > "500" in string comparison.
If go with this PR, could you please integrate the test added in #31702. It
can also be an example of custom string type partition helper.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]