[ 
https://issues.apache.org/jira/browse/BEAM-3485?focusedWorklogId=91744&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91744
 ]

ASF GitHub Bot logged work on BEAM-3485:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Apr/18 13:18
            Start Date: 17/Apr/18 13:18
    Worklog Time Spent: 10m 
      Work Description: aromanenko-dev commented on a change in pull request #5124: [BEAM-3485] Fix split generation for Cassandra clusters
URL: https://github.com/apache/beam/pull/5124#discussion_r182013212
 
 

 ##########
 File path: sdks/java/io/cassandra/src/main/java/org/apache/beam/sdk/io/cassandra/CassandraIO.java
 ##########
 @@ -196,6 +197,11 @@ private CassandraIO() {}
       return builder().setConsistencyLevel(consistencyLevel).build();
     }
 
+    public Read<T> withMinNumberOfSplits(Integer minNumberOfSplits) {
+      checkArgument(minNumberOfSplits != null, "minNumberOfSplits can not be null");
 
 Review comment:
   We also need to check that this number is greater than 0.
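
   A minimal sketch of what the combined check could look like, not the code that was merged. The class name MinSplitsCheck, the method validateMinNumberOfSplits, and the local checkArgument helper are hypothetical; the helper stands in for the precondition utility that CassandraIO already calls, so that the snippet compiles on its own.

```java
public final class MinSplitsCheck {
  private MinSplitsCheck() {}

  // Local stand-in (assumption) for the checkArgument used in CassandraIO:
  // throws IllegalArgumentException when the condition is false.
  static void checkArgument(boolean condition, String message) {
    if (!condition) {
      throw new IllegalArgumentException(message);
    }
  }

  // Hypothetical validation for the value passed to withMinNumberOfSplits:
  // reject null first, then reject zero and negative values.
  static int validateMinNumberOfSplits(Integer minNumberOfSplits) {
    checkArgument(minNumberOfSplits != null, "minNumberOfSplits can not be null");
    checkArgument(minNumberOfSplits > 0, "minNumberOfSplits must be greater than 0");
    return minNumberOfSplits;
  }
}
```

   The null check runs first so that unboxing in the second check cannot throw a NullPointerException.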

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 91744)
    Time Spent: 2h  (was: 1h 50m)

> CassandraIO.read() splitting produces invalid queries
> -----------------------------------------------------
>
>                 Key: BEAM-3485
>                 URL: https://issues.apache.org/jira/browse/BEAM-3485
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-cassandra
>            Reporter: Eugene Kirpichov
>            Assignee: Alexander Dejanovski
>            Priority: Major
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> See 
> [https://stackoverflow.com/questions/48090668/how-to-increase-dataflow-read-parallelism-from-cassandra/48131264?noredirect=1#comment83548442_48131264]
> As the question author points out, the error is likely that token($pk) should 
> be token(pk). This was likely masked by BEAM-3424 and BEAM-3425, and the 
> splitting code path effectively was never invoked, and was broken from the 
> first PR - so there are likely other bugs.
> When testing this issue, we must ensure good code coverage in an IT against a 
> real Cassandra instance.
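
To illustrate the token($pk) versus token(pk) point in the description above, here is a hedged sketch of building a token-range query for one split. The class TokenRangeQuery, its build method, and the keyspace, table, and column names are hypothetical and are not taken from CassandraIO; the only claim is that the partition key column name goes into token(...) without a leading "$".

```java
public final class TokenRangeQuery {
  private TokenRangeQuery() {}

  // Builds a CQL query that selects the rows whose partition-key token falls
  // in the half-open range (start, end]. The column name is inserted as-is.
  static String build(
      String keyspace, String table, String partitionKey, long start, long end) {
    String token = "token(" + partitionKey + ")";
    return "SELECT * FROM " + keyspace + "." + table
        + " WHERE " + token + " > " + start
        + " AND " + token + " <= " + end;
  }
}
```

For example, build("ks", "users", "id", -100L, 100L) yields a WHERE clause on token(id), whereas a query containing token($id) would be rejected by Cassandra.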



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
