[ 
https://issues.apache.org/jira/browse/CASSANDRA-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726901#comment-14726901
 ] 

David Kua commented on CASSANDRA-9304:
--------------------------------------

[~Stefania]

Updates to my 9304 branch now include a parameter for the COPY command that 
allows the number of jobs to be configured. RateMeter was also changed to fix 
an issue found during testing. Testing also turned up problems with 
ByteOrderedPartitioner and OrderPreservingPartitioner: BOP's tokens don't work 
with the SELECT statements I'm using, and OPP has no token ring, so it can't 
be parallelized. COPY TO therefore now runs as a single process when it 
encounters either of those two partitioners.
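
As a rough illustration of the fallback described above (a sketch only, not 
the code in the branch), the token range can be split into one sub-range per 
job, with a single unrestricted query used whenever the partitioner is not 
supported. The function names, the use of Murmur3Partitioner as the only 
parallelizable case, and the "SELECT *" query shape are all assumptions made 
for this example.

```python
# Hypothetical sketch: split the Murmur3 token range across jobs and fall
# back to one whole-table query for any other partitioner (BOP, OPP, ...).
MURMUR3_MIN = -(2 ** 63)
MURMUR3_MAX = 2 ** 63 - 1


def split_token_ranges(partitioner, num_jobs):
    """Return (start, end) token ranges, or [(None, None)] for the fallback."""
    name = partitioner.rsplit(".", 1)[-1]
    if name != "Murmur3Partitioner" or num_jobs <= 1:
        # Assumption: only Murmur3 is parallelized in this sketch.
        return [(None, None)]
    span = MURMUR3_MAX - MURMUR3_MIN
    bounds = [MURMUR3_MIN + (span * i) // num_jobs for i in range(num_jobs)]
    bounds.append(MURMUR3_MAX)
    return list(zip(bounds[:-1], bounds[1:]))


def build_queries(keyspace, table, partition_key, ranges):
    """Build one SELECT per range; (None, None) means no token restriction."""
    base = "SELECT * FROM {ks}.{tbl}".format(ks=keyspace, tbl=table)
    queries = []
    for i, (start, end) in enumerate(ranges):
        if start is None:
            queries.append(base)
            continue
        # The first range includes its lower bound so the minimum token is
        # covered; later ranges exclude it to avoid reading rows twice.
        op = ">=" if i == 0 else ">"
        queries.append(
            "{base} WHERE token({pk}) {op} {s} AND token({pk}) <= {e}".format(
                base=base, pk=partition_key, op=op, s=start, e=end))
    return queries
```

Each resulting query could then be handed to a separate worker process; with 
the fallback there is only one query, so the export behaves as a single 
process.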

Tests were updated and can be found here: 
https://github.com/dkua/cassandra-dtest/tree/bulk_export
The cqlsh COPY tests now run against a 3-node cluster, and the row count has 
increased from 1k to 10k. One of the read/write tests now also runs against 
different partitioners, which should cover that case.

> COPY TO improvements
> --------------------
>
>                 Key: CASSANDRA-9304
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9304
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: David Kua
>            Priority: Minor
>              Labels: cqlsh
>             Fix For: 2.1.x
>
>
> COPY FROM has gotten a lot of love.  COPY TO not so much.  One obvious 
> improvement could be to parallelize reading and writing (write one page of 
> data while fetching the next).



