[
https://issues.apache.org/jira/browse/CASSANDRA-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726901#comment-14726901
]
David Kua commented on CASSANDRA-9304:
--------------------------------------
[~Stefania]
Updates to my 9304 branch now include a parameter for the COPY command that
allows the number of jobs to be configured. RateMeter was also changed and
fixed, as an issue was found during testing. Testing also found issues with
ByteOrderedPartitioner (BOP) and OrderPreservingPartitioner (OPP): BOP's tokens
don't work with the SELECT statements I'm using, and OPP has no token ring, so
it can't be parallelized. COPY TO therefore now runs as if it were single
process when it encounters either of those two partitioners.
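
To illustrate the fallback described above, here is a minimal sketch and not
the code from the 9304 branch. It assumes the parallel export divides the
Murmur3Partitioner token ring into one contiguous range per job and reads each
range with a token() range SELECT. The function names, the query shape, and
the even split are all hypothetical; only the single-process fallback for the
two partitioners comes from the comment.

```python
# Hypothetical sketch; names and query shape are assumptions, not the branch's code.
MURMUR3_MIN = -(2 ** 63)      # smallest Murmur3Partitioner token
MURMUR3_MAX = 2 ** 63 - 1     # largest Murmur3Partitioner token

# Partitioners for which COPY TO is said to run as a single process.
SINGLE_PROCESS_PARTITIONERS = (
    "ByteOrderedPartitioner",
    "OrderPreservingPartitioner",
)


def effective_jobs(partitioner, requested_jobs):
    """Return how many worker processes to use for the given partitioner."""
    # Accept either a short name or a fully qualified class name.
    name = partitioner.rsplit(".", 1)[-1]
    if name in SINGLE_PROCESS_PARTITIONERS:
        return 1
    return max(1, requested_jobs)


def split_token_range(jobs, lo=MURMUR3_MIN, hi=MURMUR3_MAX):
    """Split [lo, hi] into `jobs` contiguous, inclusive (start, end) ranges."""
    step, extra = divmod(hi - lo + 1, jobs)
    ranges = []
    start = lo
    for i in range(jobs):
        # Spread any remainder over the first `extra` ranges.
        end = start + step + (1 if i < extra else 0) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_query(table, pk, start, end):
    """Build a token-range SELECT for one inclusive (start, end) range."""
    return "SELECT * FROM {} WHERE token({}) >= {} AND token({}) <= {}".format(
        table, pk, start, pk, end
    )
```

With this sketch, effective_jobs("ByteOrderedPartitioner", 8) yields 1, so the
export would issue a single unrestricted read instead of eight range queries.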
Tests were updated and can be found here:
https://github.com/dkua/cassandra-dtest/tree/bulk_export
The cqlsh COPY tests now run against a cluster of 3 nodes, and the number of
rows tested has increased from 1k to 10k. One of the read/write tests now also
runs against different partitioners, which should cover that case.
> COPY TO improvements
> --------------------
>
> Key: CASSANDRA-9304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9304
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: David Kua
> Priority: Minor
> Labels: cqlsh
> Fix For: 2.1.x
>
>
> COPY FROM has gotten a lot of love. COPY TO not so much. One obvious
> improvement could be to parallelize reading and writing (write one page of
> data while fetching the next).