[ https://issues.apache.org/jira/browse/BEAM-3485?focusedWorklogId=91744&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91744 ]
ASF GitHub Bot logged work on BEAM-3485:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Apr/18 13:18
            Start Date: 17/Apr/18 13:18
    Worklog Time Spent: 10m
      Work Description: aromanenko-dev commented on a change in pull request #5124: [BEAM-3485] Fix split generation for Cassandra clusters
URL: https://github.com/apache/beam/pull/5124#discussion_r182013212

 ##########
 File path: sdks/java/io/cassandra/src/main/java/org/apache/beam/sdk/io/cassandra/CassandraIO.java
 ##########
 @@ -196,6 +197,11 @@ private CassandraIO() {}
        return builder().setConsistencyLevel(consistencyLevel).build();
      }

 +    public Read<T> withMinNumberOfSplits(Integer minNumberOfSplits) {
 +      checkArgument(minNumberOfSplits != null, "minNumberOfSplits can not be null");

 Review comment:
   We also need to check that this number is greater than 0.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 91744)
    Time Spent: 2h  (was: 1h 50m)

> CassandraIO.read() splitting produces invalid queries
> -----------------------------------------------------
>
>                 Key: BEAM-3485
>                 URL: https://issues.apache.org/jira/browse/BEAM-3485
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-cassandra
>            Reporter: Eugene Kirpichov
>            Assignee: Alexander Dejanovski
>            Priority: Major
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> See [https://stackoverflow.com/questions/48090668/how-to-increase-dataflow-read-parallelism-from-cassandra/48131264?noredirect=1#comment83548442_48131264]
> As the question author points out, the error is likely that token($pk) should be token(pk). This was likely masked by BEAM-3424 and BEAM-3425, and the splitting code path effectively was never invoked, and was broken from the first PR - so there are likely other bugs.
> When testing this issue, we must ensure good code coverage in an IT against a real Cassandra instance.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
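
The review comment above asks for a "greater than 0" check alongside the existing null check on minNumberOfSplits. A minimal sketch of what that combined validation might look like follows. It is not the code from the pull request: the class MinSplitsCheck and the helper validateMinNumberOfSplits are hypothetical names, the second error message is invented for illustration, and plain IllegalArgumentException is used in place of the checkArgument call shown in the diff so that the example has no external dependencies.

```java
// Hypothetical stand-alone illustration; not part of CassandraIO.
class MinSplitsCheck {

    // Rejects null first (same message as in the diff), then rejects
    // zero and negative values, as the review comment suggests.
    static int validateMinNumberOfSplits(Integer minNumberOfSplits) {
        if (minNumberOfSplits == null) {
            throw new IllegalArgumentException("minNumberOfSplits can not be null");
        }
        if (minNumberOfSplits <= 0) {
            // Assumed wording; the actual message in the PR may differ.
            throw new IllegalArgumentException(
                "minNumberOfSplits must be greater than 0, but was: " + minNumberOfSplits);
        }
        return minNumberOfSplits;
    }

    public static void main(String[] args) {
        // A positive value passes through unchanged.
        System.out.println(validateMinNumberOfSplits(4));

        // Zero is rejected with an IllegalArgumentException.
        try {
            validateMinNumberOfSplits(0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Checking for null before comparing avoids a NullPointerException from auto-unboxing the Integer in the `<= 0` comparison.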