[ https://issues.apache.org/jira/browse/BEAM-4049?focusedWorklogId=152063&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-152063 ]
ASF GitHub Bot logged work on BEAM-4049:
----------------------------------------
Author: ASF GitHub Bot
Created on: 07/Oct/18 12:13
Start Date: 07/Oct/18 12:13
Worklog Time Spent: 10m
Work Description: adejanovski commented on issue #5112: [BEAM-4049]
Improve CassandraIO write throughput by performing async queries
URL: https://github.com/apache/beam/pull/5112#issuecomment-427648627
How were the 2.5.0 packages built? With Maven or Gradle?
In this PR, I used a method from the DataStax Java driver that relies on
Guava's ListenableFuture class. Since Beam's pom relocates all of Guava into a
custom package, this produced the error you mentioned.
As a workaround, I added an exception in the pom file so that
ListenableFuture is not relocated.
As I understand it, the project was moving to Gradle, and since I didn't touch
the Gradle build, ListenableFuture might still be relocated there.
Issue Time Tracking
-------------------
Worklog Id: (was: 152063)
Time Spent: 8h 10m (was: 8h)
> Improve write throughput of CassandraIO
> ---------------------------------------
>
> Key: BEAM-4049
> URL: https://issues.apache.org/jira/browse/BEAM-4049
> Project: Beam
> Issue Type: Improvement
> Components: io-java-cassandra
> Affects Versions: 2.4.0
> Reporter: Alexander Dejanovski
> Assignee: Alexander Dejanovski
> Priority: Major
> Labels: performance
> Fix For: 2.5.0
>
> Time Spent: 8h 10m
> Remaining Estimate: 0h
>
> CassandraIO currently uses the mapper to perform writes synchronously.
> This means that writes are serialized, which is a very suboptimal way of
> writing to Cassandra.
> The IO should use the saveAsync() method instead of save() and should wait
> for completion each time 100 queries are in flight, in order to avoid
> overwhelming clusters.
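
The approach described in the issue (issue writes asynchronously, then wait
for completion each time 100 queries are in flight) could look roughly like
the sketch below. This is not the code from the PR. To keep it runnable with
the JDK alone, a CompletableFuture stands in for the Guava ListenableFuture
returned by the mapper's saveAsync(), and the names BoundedAsyncWriter,
maxInFlight and pending() are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

/**
 * Illustrative sketch only: issues writes asynchronously and blocks once a
 * fixed number of them are in flight, so the cluster is not overwhelmed.
 * The saveAsync function is a stand-in for a call such as mapper.saveAsync().
 */
final class BoundedAsyncWriter<T> {
    private final Function<T, CompletableFuture<Void>> saveAsync;
    private final int maxInFlight;
    private final List<CompletableFuture<Void>> inFlight = new ArrayList<>();

    BoundedAsyncWriter(Function<T, CompletableFuture<Void>> saveAsync, int maxInFlight) {
        if (maxInFlight < 1) {
            throw new IllegalArgumentException("maxInFlight must be at least 1");
        }
        this.saveAsync = saveAsync;
        this.maxInFlight = maxInFlight;
    }

    /** Starts one asynchronous write; waits for the batch when the limit is reached. */
    void write(T entity) {
        inFlight.add(saveAsync.apply(entity));
        if (inFlight.size() >= maxInFlight) {
            flush();
        }
    }

    /** Blocks until every pending write has completed; rethrows the first failure. */
    void flush() {
        try {
            CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
        } finally {
            // Always forget the batch, even on failure, so it is not awaited twice.
            inFlight.clear();
        }
    }

    /** Number of writes started but not yet awaited. */
    int pending() {
        return inFlight.size();
    }
}
```

In a Beam DoFn, write() would typically be called once per element and
flush() once at the end of each bundle, so that no write is left pending
when the bundle is committed. With a limit of 100, at most 100 queries are
outstanding at any time.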