[ https://issues.apache.org/jira/browse/BEAM-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328252#comment-16328252 ]
Aleksandr Sosenko commented on BEAM-3485:
-----------------------------------------

Should it really be token(pk)? I got it working by hardcoding the real primary key column name in place of $pk (the Stack Overflow question is mine). Is there some magic that substitutes the real primary key column name for pk?

I can't create a pull request because I don't know how to elegantly retrieve the primary key column name in the context of the split function. I could make an additional request to Cassandra for the table description, but that seems overcomplicated to me. Is there another way to get the primary key column name in the context of that function?

> CassandraIO.read() splitting produces invalid queries
> -----------------------------------------------------
>
>                 Key: BEAM-3485
>                 URL: https://issues.apache.org/jira/browse/BEAM-3485
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>            Reporter: Eugene Kirpichov
>            Assignee: Jean-Baptiste Onofré
>            Priority: Major
>
> See
> [https://stackoverflow.com/questions/48090668/how-to-increase-dataflow-read-parallelism-from-cassandra/48131264?noredirect=1#comment83548442_48131264]
> As the question author points out, the error is likely that token($pk) should
> be token(pk). This was likely masked by BEAM-3424 and BEAM-3425, and the
> splitting code path effectively was never invoked, and was broken from the
> first PR - so there are likely other bugs.
> When testing this issue, we must ensure good code coverage in an IT against a
> real Cassandra instance.
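
To make the question above concrete, here is a minimal sketch of how a per-split query could be built once the partition key column names are known. It is not the CassandraIO implementation: the class name TokenRangeQuery, the method build, the keyspace/table/column names, and the (start, end] bound convention are all assumptions made for illustration. In CQL, token() takes the partition key columns, so a composite partition key needs all of its columns listed in order. How those names are obtained is left open here; the driver's table metadata may be one source, though that is not verified against the Beam code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Beam: builds the CQL for one token range.
final class TokenRangeQuery {

    // partitionKeyColumns must hold the real partition key column names,
    // in declaration order. Bounds are treated as (start, end], which is
    // an assumption about how the splits are defined.
    static String build(String keyspace, String table,
                        List<String> partitionKeyColumns,
                        long start, long end) {
        if (partitionKeyColumns.isEmpty()) {
            throw new IllegalArgumentException("at least one partition key column is required");
        }
        // token(col1, col2, ...) instead of a literal placeholder such as $pk
        String tokenExpr = "token(" + String.join(", ", partitionKeyColumns) + ")";
        return "SELECT * FROM " + keyspace + "." + table
                + " WHERE " + tokenExpr + " > " + start
                + " AND " + tokenExpr + " <= " + end;
    }

    public static void main(String[] args) {
        // Example names are made up for the demonstration.
        System.out.println(build("beam_ks", "scientist",
                Arrays.asList("person_id"), Long.MIN_VALUE, 0L));
    }
}
```

Passing the column names in as a parameter keeps the query builder independent of how they are looked up, so the lookup (table metadata, configuration, or something else) can be decided separately.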