[
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056376#comment-15056376
]
Adam Holmberg commented on CASSANDRA-9302:
------------------------------------------
bq. DCAware policy fixed it, now we have only one session.
Now that we're no longer choosing a session based on the replica host, we could
further simplify {{split_batches}} to group by partition key only (i.e., no need
for {{get_replica}}). Alternatively, if you want to send to a specific host
rather than the one load balancing would choose, we would need to borrow a
connection and send directly on it (I don't think that's worth doing).
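A rough sketch of what grouping by partition key alone might look like. This is not the actual patch code: the signature, the {{partition_key_indexes}} argument and the {{max_batch_size}} default are assumptions made for illustration only.

```python
from collections import defaultdict


def split_batches(rows, partition_key_indexes, max_batch_size=20):
    """Yield (partition_key, batch) pairs, grouping rows by partition key only.

    Hypothetical signature: `rows` is an iterable of tuples and
    `partition_key_indexes` lists the column positions that form the
    partition key. No replica lookup is involved.
    """
    rows_by_pk = defaultdict(list)
    for row in rows:
        pk = tuple(row[i] for i in partition_key_indexes)
        rows_by_pk[pk].append(row)

    # Emit each partition's rows in chunks of at most max_batch_size.
    for pk, pk_rows in rows_by_pk.items():
        for start in range(0, len(pk_rows), max_batch_size):
            yield pk, pk_rows[start:start + max_batch_size]
```

Each batch then touches a single partition, and the session's load balancing policy is left to pick the host.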
I find it a little awkward that numeric option values require quoting:
{code}
cassandra@cqlsh> COPY test.t FROM 'f.csv' WITH HEADER = false AND REPORTFREQUENCY = 100;
Improper COPY command.
cassandra@cqlsh> COPY test.t FROM 'f.csv' WITH HEADER = false AND REPORTFREQUENCY = '100';
Starting copy of test.t...
{code}
Is that a hard thing to change?
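To illustrate the behaviour I have in mind, here is a standalone sketch that accepts option values with or without quotes. It does not use cqlsh's real grammar; {{parse_copy_options}} and the regex are hypothetical names invented for this example.

```python
import re

# Hypothetical pattern: NAME = 'quoted value' or NAME = bare_value
_OPTION_RE = re.compile(r"(\w+)\s*=\s*(?:'([^']*)'|([\w.+-]+))")


def parse_copy_options(text):
    """Return a dict of option name -> raw string value.

    Quoted and unquoted values are treated the same, so the caller can
    convert numeric options (e.g. with int()) either way.
    """
    options = {}
    for name, quoted, bare in _OPTION_RE.findall(text):
        options[name.upper()] = quoted if quoted else bare
    return options
```

With something like this, {{REPORTFREQUENCY = 100}} and {{REPORTFREQUENCY = '100'}} would both yield the string {{100}}, which the option handler could then convert.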
> Optimize cqlsh COPY FROM, part 3
> --------------------------------
>
> Key: CASSANDRA-9302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Jonathan Ellis
> Assignee: Stefania
> Priority: Critical
> Fix For: 2.1.x
>
>
> We've had some discussion about moving to Spark CSV import for bulk load in 3.x,
> but people need a good bulk load tool now. One option is to add a separate
> Java bulk load tool (CASSANDRA-9048), but if we can match that performance
> from cqlsh I would prefer to leave COPY FROM as the preferred option to which
> we point people, rather than adding more tools that need to be supported
> indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and
> CASSANDRA-8225.