[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-06 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15085537#comment-15085537
 ] 

Stefania commented on CASSANDRA-9303:
-

CI still OK, ready to commit.

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: dtest.out
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-06 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15085424#comment-15085424
 ] 

Stefania commented on CASSANDRA-9303:
-

dtests are fine now, after rebasing the dtest branch as well.

However 2.2+ branches changed overnight, so I've rebased again and restarted CI 
for 2.2+.

In the end, to make life easier, I converted the merge commits into simple 
commits so that the rebase can be done without having to re-resolve the 
original merge conflicts.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-05 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083546#comment-15083546
 ] 

Stefania commented on CASSANDRA-9303:
-

Unit tests are fine, but about 30 dtests fail on all branches with "No such 
file or directory". They seem to pass locally, so I can't yet tell whether the 
failures are related to the patch. I will resume tomorrow.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-05 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083401#comment-15083401
 ] 

Sylvain Lebresne commented on CASSANDRA-9303:
-

bq. I thought a committer could do a squashed merge without necessarily having 
to rebase as long as the patch applies or is a rebase always necessary?

We don't commit by merging, we pull a squashed version of the patch on top of 
the current code base, so rebasing is always the preferred way. In any case, we 
ideally want the tests to run on a sufficiently rebased version (saying a test 
failure is fine because the patch is on an old version is potentially 
dangerous), and we want to avoid having the committer deal with merge conflicts 
since he's not necessarily familiar with the patch, so always rebasing is a 
good strategy.

Anyway, thanks for doing it and let's wait on CI.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-05 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083372#comment-15083372
 ] 

Stefania commented on CASSANDRA-9303:
-

bq. Can't you squash it? I mean, that's what the committer would have to do 
anyway so on top of giving accurate test results, it'll also make the committer 
job easier.

I thought a committer could do a squashed merge without necessarily having to 
rebase as long as the patch applies or is a rebase always necessary?

In any case, here are the branches squashed and rebased. I've also updated 
_CHANGES.txt_ and _NEWS.txt_. 

I've restarted CI again to rule out any mistakes up-merging. 

||2.1||2.2||3.0||3.2||trunk||
|[patch|https://github.com/stef1927/cassandra/commits/9303-2.1]|[patch|https://github.com/stef1927/cassandra/commits/9303-2.2]|[patch|https://github.com/stef1927/cassandra/commits/9303-3.0]|[patch|https://github.com/stef1927/cassandra/commits/9303-3.2]|[patch|https://github.com/stef1927/cassandra/commits/9303]|
|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-2.1-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-2.2-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-3.0-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-3.2-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-testall/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-2.1-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-2.2-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-3.0-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-3.2-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-dtest/]|

Old branches still exist with an {{-old}} suffix.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-05 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083319#comment-15083319
 ] 

Sylvain Lebresne commented on CASSANDRA-9303:
-

bq.  A few failures, especially on trunk, but this is due to the lack of a 
recent rebase (which would be painful without having recorded the merge 
conflicts with git rerere)

Can't you squash it? I mean, that's what the committer would have to do anyway 
so on top of giving accurate test results, it'll also make the committer job 
easier.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-05 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083217#comment-15083217
 ] 

Stefania commented on CASSANDRA-9303:
-

CI is fine. A few failures, especially on trunk, but this is due to the lack of 
a recent rebase (which would be painful without having recorded the merge 
conflicts with {{git rerere}}).

Note for committing: 

* to avoid dtest failures [this pull 
request|https://github.com/riptano/cassandra-dtest/pull/724] should be merged 
just before committing.
* repeating branches here: 
||2.1||2.2||3.0||trunk||
|[patch|https://github.com/stef1927/cassandra/commits/9303-2.1]|[patch|https://github.com/stef1927/cassandra/commits/9303-2.2]|[patch|https://github.com/stef1927/cassandra/commits/9303-3.0]|[patch|https://github.com/stef1927/cassandra/commits/9303]|



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-05 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082929#comment-15082929
 ] 

Paulo Motta commented on CASSANDRA-9303:


bq. I've already modified..

That should be enough, thanks!



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-05 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082912#comment-15082912
 ] 

Stefania commented on CASSANDRA-9303:
-

Thanks, I will monitor the cassci jobs and update the ticket once completed.

bq. Also, if you could add a simple dtest to check that the unlogged batch 
warning is only logged if there are non local mutations that would be nice. 

I've already modified {{test_client_warnings}}, see [this commit | 
https://github.com/stef1927/cassandra-dtest/commit/0f8b8850cbf3410cc58bfc9c822502706bc6bf07].
 Checking client warnings should be equivalent to checking log messages but I 
can add that too if needed.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-05 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082850#comment-15082850
 ] 

Paulo Motta commented on CASSANDRA-9303:


Thanks! New code looks good. Please mark as ready to commit when tests look 
good as I will be away for the rest of the day.

Also, if you could add a simple dtest to check that the unlogged batch warning 
is only logged if there are non-local mutations, that would be nice. But this 
can go in independently of the commit, as it's just a dtest PR.

Good work!



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-05 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082837#comment-15082837
 ] 

Stefania commented on CASSANDRA-9303:
-

Applicable to 2.2+ only: I fixed a problem with {{ClientWarningsTest}} and 
restarted CI.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-05 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082700#comment-15082700
 ] 

Stefania commented on CASSANDRA-9303:
-

bq. Could you improve the local mutation check on {{BatchStatement}}

Done, sorry I totally missed the existence of those helper methods.

bq. Although the fix for CASSANDRA-10938 looks harmless...

I reverted it. I agree with your concerns but I am also equally worried about 
people importing data with CASSANDRA-10938 still not fixed.

bq. Did you validate the performance of the new batch-by-replica approach?

Yes. Although the hybrid approach may cost us 2-3 seconds with a 1M 
cassandra-stress benchmark with 3 nodes (~25 vs ~22 seconds), we do not impact 
batching by partition key because that has priority. So, unlike the discussion 
on CASSANDRA-9302, batching by partition key is still there and batching by 
replica is just a backup approach. 

It seems conceptually wrong to me to send UNLOGGED batches with non-local 
partitions: 

* we'll trigger the WARN that we worked towards removing for local partitions
* we also increase the risk of timeouts if one node gets overloaded 

CI pending including unit tests, here are all the links:

||2.1||2.2||3.0||trunk||
|[patch|https://github.com/stef1927/cassandra/commits/9303-2.1]|[patch|https://github.com/stef1927/cassandra/commits/9303-2.2]|[patch|https://github.com/stef1927/cassandra/commits/9303-3.0]|[patch|https://github.com/stef1927/cassandra/commits/9303]|
|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-2.1-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-2.2-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-3.0-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-testall/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-2.1-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-2.2-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-3.0-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-dtest/]|




[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-04 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082211#comment-15082211
 ] 

Paulo Motta commented on CASSANDRA-9303:


Nice job, we're nearly there! :) Tests are now passing locally on Windows and 
the code looks good. Some minor nits:
* Could you improve the local mutation check on {{BatchStatement}} by using 
{{StorageService.getLocalRanges}} and {{Range.isInRanges}}, and also skip the 
{{isMutationLocal()}} evaluation if the {{localMutationsOnly}} variable is 
{{false}}? Also, you can remove the cqlsh reference in the comment, since even 
in a non-cqlsh context the warning is not necessary if there are only local 
mutations in an unlogged batch.
* Although the fix for CASSANDRA-10938 looks harmless, I'm not sure if it could 
have some unintended consequences, so I'd prefer to commit it separately after 
discussion on CASSANDRA-10938.

Did you validate the performance of the new batch-by-replica approach? In the 
end it seems CASSANDRA-10938 was not caused by batching by partition key and 
there was a lot of back-and-forth between batch-by-replica vs 
batch-by-partition, so it's not very clear which approach is the best. We could 
probably do a more thorough evaluation/validation later, but it would be nice 
to make sure our batching strategy performs well.

Since there are also Java code changes, can you also submit unit tests in 
addition to dtests on cassci? Thanks!



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-01-04 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081092#comment-15081092
 ] 

Stefania commented on CASSANDRA-9303:
-

I've also performed a little bit more work:

* Removed the WARN for UNLOGGED batches with multiple partitions introduced by 
CASSANDRA-9399 _if the partitions are only local_.

* Optimized {{split_batches}} to first batch by partition key, if at least two 
rows have the same partition key, and batch by replica only those rows without 
common partition keys. This ensures we optimize single insertions server side 
per partition key and it saves us the cost of accessing the token map to work 
out the replica if we have common partition keys.

* Ensured that {{DCAwareRoundRobinPolicy}} gets the data center name to avoid a 
WARN.

* Applied a workaround for CASSANDRA-10938.
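
The batching order described in the {{split_batches}} bullet above can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: the function signature, the {{partition_key}} and {{replica_for}} callables and the {{max_batch_size}} default are all assumptions made for the example.

```python
from collections import defaultdict


def split_batches(rows, partition_key, replica_for, max_batch_size=20):
    """Yield (kind, key, rows) batches.

    Rows that share a partition key with at least one other row are batched
    by partition key first. Only the remaining rows are grouped by replica,
    so the replica lookup is skipped for rows with a common partition key.
    All parameter names here are hypothetical.
    """
    by_pk = defaultdict(list)
    for row in rows:
        by_pk[partition_key(row)].append(row)

    leftovers = []
    for pk, group in by_pk.items():
        if len(group) >= 2:
            # At least two rows share this partition key: batch them together.
            for i in range(0, len(group), max_batch_size):
                yield ('partition', pk, group[i:i + max_batch_size])
        else:
            leftovers.extend(group)

    # Fallback: group rows without a common partition key by replica.
    by_replica = defaultdict(list)
    for row in leftovers:
        by_replica[replica_for(partition_key(row))].append(row)
    for replica, group in by_replica.items():
        for i in range(0, len(group), max_batch_size):
            yield ('replica', replica, group[i:i + max_batch_size])
```

With rows keyed 1, 1, 2, 3 and 4, the two rows for key 1 form one partition batch and the other three rows are grouped by whatever {{replica_for}} returns for their keys.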



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074033#comment-15074033
 ] 

Stefania commented on CASSANDRA-9303:
-

The hanging tests were caused by the way in which we run cqlsh in ccm, this 
[pull request|https://github.com/pcmanus/ccm/pull/432] fixed it.

The remaining failures were caused by the following two things:

* Handling of temporary files is quite different on Windows, the changes to 
address this are only in the dtest code, see the second commit of the [pull 
request|https://github.com/riptano/cassandra-dtest/pull/724]. 

* The path names should have been normalized, see [this 
commit|https://github.com/stef1927/cassandra/commit/295219dfbcf24ece9729030cce6e9638899b2842].
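
As an illustration of why unnormalized path names cause trouble on Windows, here is a small sketch using the standard library. It is not taken from the commit linked above; the {{normalize}} helper and the example paths are hypothetical, and {{ntpath}} is used so that the Windows rules apply on any platform.

```python
import ntpath  # Windows flavour of os.path, importable on every platform


def normalize(path):
    """Collapse redundant separators and '..' segments, Windows style."""
    return ntpath.normpath(path)


# Two spellings of the same (hypothetical) file only compare equal once
# both have been normalized.
mixed = 'C:/data//import/../rows.csv'
plain = 'C:\\data\\rows.csv'
same_file = normalize(mixed) == normalize(plain)
```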

I've also changed a few more things, mostly discovered whilst trying to 
reproduce CASSANDRA-10938 on Windows:

* Reverted to batching by replica to avoid Cassandra processes using too much 
CPU. Batching by replica was changed to batching by partition key during the 
code review of CASSANDRA-9302 because there is a cost in determining the 
replicas of each record. However, sending batches with records on different 
replicas is probably worse than spending a few cycles in Python determining the 
correct replicas. It also allows us to use LOGGED batching, see next point.

* Changed batch type from UNLOGGED to LOGGED to avoid a WARN in the Cassandra 
log files and for more consistent failed batch status reporting. INSERT should 
be idempotent, so this can be changed back to UNLOGGED if performance is 
impacted too much, but it shouldn't be since all partitions should be local.

* Fixed a problem with cassandra-stress that only manifested on Windows and on 
trunk when using a custom profile. However the Windows stress launch scripts 
were incorrect from 2.1 onwards.
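
The "few cycles in Python" needed to find the replica for a record amount to a lookup in the sorted token ring. The sketch below is only an approximation under stated assumptions: the {{TokenRing}} class is hypothetical, tokens are assumed to be already computed by the partitioner, and only the primary replica is returned, with no replication strategy applied.

```python
import bisect


class TokenRing(object):
    """Hypothetical, simplified token ring: one token per host."""

    def __init__(self, token_to_host):
        self._tokens = sorted(token_to_host)
        self._hosts = [token_to_host[t] for t in self._tokens]

    def replica_for(self, token):
        # A host owns the range (previous token, its own token], so pick the
        # first ring token >= the row token, wrapping around at the end.
        i = bisect.bisect_left(self._tokens, token)
        return self._hosts[i % len(self._tokens)]
```

The lookup is a binary search, so its cost per row grows with the logarithm of the number of ring tokens.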

I worked on the 2.2 patch and merged upwards. I also cherry-picked back to 2.1 
with manual conflict resolution in bin/cqlsh. Even though we don't support 
Windows for 2.1 I figured it was best to fix these problems anyway.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-27 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072503#comment-15072503
 ] 

Stefania commented on CASSANDRA-9303:
-

Thanks for running the new tests on Windows [~pauloricardomg], I will set up a 
Windows environment and take a look at the failures.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-25 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071623#comment-15071623
 ] 

Paulo Motta commented on CASSANDRA-9303:


There are still quite a few failures on Windows, so I think we'll need to set 
up a cassci Windows run to monitor them. The following tests are hanging, so I 
created a [dtest 
branch|https://github.com/pauloricardomg/cassandra-dtest/tree/9303-skipping] 
skipping them:

I believe those might be somehow related to CASSANDRA-10858:
* test_copy_to_with_fewer_failures_than_max_attempts
* test_copy_to_with_more_failures_than_max_attempts

Those might be related to CASSANDRA-10938:
* test_bulk_round_trip_default
* test_bulk_round_trip_blogposts
* test_bulk_round_trip_with_timeouts
* test_bulk_round_trip_with_low_ingestrate

I attached a [dtest run 
output|https://issues.apache.org/jira/secure/attachment/12779517/dtest.out] 
with more details about other failures. I will be off until January 6th, so 
feel free to find another reviewer until then.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-24 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071009#comment-15071009
 ] 

Stefania commented on CASSANDRA-9303:
-

CI on trunk restarted.

DTEST PR: https://github.com/riptano/cassandra-dtest/pull/724

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-24 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070957#comment-15070957
 ] 

Stefania commented on CASSANDRA-9303:
-

CI for 2.2 and 3.0 is fine.

The problems on trunk seem to originate from commit 
3c8d87f4324e5ff8bf6b1c3652e9c5eacf03bc20, CASSANDRA-10580.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-24 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070880#comment-15070880
 ] 

Stefania commented on CASSANDRA-9303:
-

2.1 CI is OK.

Found a small merge error in 2.2, fixed it and restarted CI for 2.2 and 3.0.

On trunk we will get lots of timeouts; there seems to be a problem on the 
unpatched branch.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-24 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070852#comment-15070852
 ] 

Stefania commented on CASSANDRA-9303:
-

Thank you for your input, we will commit to 2.1+ then.

--

bq. Only minor nit is to use {{os.linesep}} instead of {{'\n'}} on 
{{_printmsg(msg, eol='\n')}}.

Nope, it's intentional and in fact I changed the {{os.linesep}} occurrences 
that I found in other parts of the file as well. See the doc here: 
https://docs.python.org/2/library/os.html?highlight=linesep#os.linesep - on 
Windows {{os.linesep}} is '\r\n', which then becomes '\r\r\n' because '\n' is 
automatically converted to '\r\n' when writing to text files. I assume this 
includes stdout.
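
The doubling of '\r' can be reproduced on any platform by emulating a Windows text-mode stream. This is a sketch for Python 3 using only the standard library; the {{written_bytes}} helper is hypothetical and the {{newline='\r\n'}} argument stands in for the translation that Windows applies to text files.

```python
import io


def written_bytes(text, newline):
    """Return the raw bytes produced by writing text to a text-mode stream."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding='ascii', newline=newline)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


# Emulated Windows text mode: every '\n' written is translated to '\r\n'.
with_linesep = written_bytes('msg\r\n', newline='\r\n')  # eol = Windows os.linesep
with_plain_lf = written_bytes('msg\n', newline='\r\n')   # eol = '\n'
```

Using a plain '\n' as the end-of-line therefore gives the expected '\r\n' in the output, while passing the Windows value of {{os.linesep}} gives '\r\r\n'.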

bq. can you just check the failing dtest 
{{cqlsh_copy_tests.py:CqlshCopyTest.test_read_missing_partition_key}} from 
CASSANDRA-10854

They pass only on the [dtest 9303 
branch|https://github.com/stef1927/cassandra-dtest/commits/9303] since the 
exception name has changed - I had to fix one more small thing in the code as 
well.

I've also fixed a bug with COPY TO that I discovered when testing with VNODES 
disabled.

bq. Feel free to squash and up-merge.

Squashed (except for the latest changes) and merged:

||2.1||2.2||3.0||trunk||
|[patch|https://github.com/stef1927/cassandra/commits/9303-2.1]|[patch|https://github.com/stef1927/cassandra/commits/9303-2.2]|[patch|https://github.com/stef1927/cassandra/commits/9303-3.0]|[patch|https://github.com/stef1927/cassandra/commits/9303]|
|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-2.1-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-2.2-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-3.0-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-dtest/]|

We have a problem running the Windows tests, aside from the long time they 
take: because we cannot parametrize the CASSCI job, we cannot use the 9303 
dtest branch, and therefore most of the tests will fail because 
{{format_value()}} in _formatting.py_ expects more parameters. The master 
branch tests won't exercise most of the options either. I do not have a working 
Windows environment available right now; would you be able to run 
_cqlsh_copy_tests.py_ in your environment and then send me any errors? 
Alternatively, I can create the dtest pull request and run the tests on CASSCI 
once both the PR and this ticket have been committed, keeping this ticket open 
until we've verified that the Windows tests are also OK.

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-23 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069832#comment-15069832
 ] 

Jonathan Ellis commented on CASSANDRA-9303:
---

I'm reluctant to pull 9302 out, so I'd prefer adding this to 2.1 as well.

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-23 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069743#comment-15069743
 ] 

Stefania commented on CASSANDRA-9303:
-

bq. I'd agree with Aleksey Yeschenko that this should go only on 3.0+, however, 
since this is a follow-up/complement to CASSANDRA-9302, which is a new feature 
and went into an unreleased 2.1 version, I'd advocate for this to go into 2.1 
as well, unless CASSANDRA-9302 is removed from 2.1, otherwise the new copy 
from/to feature would ship half-complete on 2.1, which wouldn't make much sense 
IMO.

I tend to agree that CASSANDRA-9302 is somewhat incomplete without these 
options. So either we roll it back from 2.1 and 2.2 or we commit this as well. 
Further, it would be a pain to fix 9302 bugs without this patch since the code 
changed significantly enough to cause merge conflicts.

[~iamaleksey] WDYT?

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-23 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069739#comment-15069739
 ] 

Aleksey Yeschenko commented on CASSANDRA-9303:
--

This situation is unfortunate, but you are right. Technically we would revert 
CASSANDRA-9302 and CASSANDRA-9304 from 2.1, but it does seem easier to just go 
ahead and commit this to 2.1.

Actually, I'm fine with either option. Revert the previous commits or commit 
this patch to 2.1 as well. [~jbellis], as the reporter of this JIRA, what'd be 
your preference?

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-23 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069736#comment-15069736
 ] 

Paulo Motta commented on CASSANDRA-9303:


Looks good now. Tested locally and all options look good. Dtests are also 
passing. Only minor nit is to use {{os.linesep}} instead of {{'\n'}} on 
{{_printmsg(msg, eol='\n')}}.

bq. It doesn't work because stdin is actually set to the file specified with 
the -f option. Since this is not an issue with COPY but with the way -f is 
implemented, I would prefer deferring to another ticket if this functionality 
is required.

+1

bq. I've also rebased on the 2.1 branch (since CASSANDRA-9494 will only be on 
trunk) and applied the fix for CASSANDRA-10854 since it requires extra work on 
this branch.

+1, can you just check the failing dtest 
{{cqlsh_copy_tests.py:CqlshCopyTest.test_read_missing_partition_key}} from 
CASSANDRA-10854?

bq. I would like to squash the dtest commits as well, let me know if you still 
need to review some individual commits first.

Feel free to squash and up-merge.

bq. I'm still waiting to hear about which branches we need to apply this patch 
to; plus I would like to squash the commits before up-merging. 

I'd agree with [~iamaleksey] that this should go only on 3.0+. However, since 
this is a follow-up/complement to CASSANDRA-9302, which is a new feature and 
went into an unreleased 2.1 version, I'd advocate for this to go into 2.1 as 
well, unless CASSANDRA-9302 is removed from 2.1. Otherwise the *new copy 
from/to* feature would ship half-complete on 2.1, which wouldn't make much 
sense.

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-23 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069701#comment-15069701
 ] 

Aleksey Yeschenko commented on CASSANDRA-9303:
--

This is tricky. This *should* only go to 3.x. 2.1 is close to EOL and at this 
stage should only include critical bug fixes.

That said, for pragmatic reasons, committing to 3.0.x as well should outweigh 
breaking the rules, so I'm fine if we do it (this one time).

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-22 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068064#comment-15068064
 ] 

Stefania commented on CASSANDRA-9303:
-

bq. So just reverting to the previous approach should be fine,  Regarding 
the printing of the read options, I was thinking of something more concise 
instead of a one-config-per-line which can get too verbose, ...

Done, check reverted and we no longer print each option on a separate line but 
only once per section.

bq. Regarding COPY TO STDOUT should we skip printing info messages since a user 
may want to redirect the output to another script or file?

It's done, but I had to move the static methods into {{CopyTask}}, so the diff 
is a bit hard to read, sorry about that.

bq. If I have an {{import.cql}} file containing {{COPY keyspace1.standard1 from 
stdin;}} is the following supposed to work: {{cat input.csv | bin/cqlsh -f 
import.cql}}?

It doesn't work because {{stdin}} is actually set to the file specified with 
the {{-f}} option. Since this is not an issue with COPY but with the way {{-f}} 
is implemented, I would prefer deferring to another ticket if this 
functionality is required.

I've also rebased on the 2.1 branch (since CASSANDRA-9494 will only be on 
trunk) and applied the fix for CASSANDRA-10854 since it requires extra work on 
this branch.

I'm still waiting to hear about which branches we need to apply this patch to; 
plus I would like to squash the commits before up-merging. I would like to 
squash the dtest commits as well, let me know if you still need to review some 
individual commits first.

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-21 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066557#comment-15066557
 ] 

Paulo Motta commented on CASSANDRA-9303:


bq. That's correct, copy-to* sections are not read in from executions and 
vice-versa. I've added a check to explicitly skip invalid or wrong direction 
options from config files along with more log messages so that it should be 
easier to see that an option is not read or ignored.

Ok, my bad then. I tested with the previous version, which did not have 
exclusive sections. I don't think it's necessary to skip invalid options 
(within the exclusive sections), as they are harmless and handling them makes 
the code a bit more complex. So just reverting to the previous approach should 
be fine, but feel free to keep it the way it is if you think it's OK. Sorry 
about the confusion!

Regarding the printing of the read options, I was thinking of something more 
concise instead of a one-config-per-line which can get too verbose, something 
along the lines of:

{noformat}
Reading options from /home/paulo/.cassandra/cqlshrc:[copy-from]: {chunksize=100, ingestrate=100, wtf=102, numprocesses=5}
Reading options from /home/paulo/.cassandra/cqlshrc:[copy-from:keyspace1.standard1] : {ingestrate=200, invalid="true"}
Using 5 child processes
{noformat}

Two more things:

* Regarding {{COPY TO STDOUT}} should we skip printing info messages since a 
user may want to redirect the output to another script or file? Like {{echo 
"copy keyspace1.standard1 TO STDOUT  with SKIPCOLS = 'C2';" | bin/cqlsh | 
process.sh}}
* If I have an {{import.cql}} file containing {{COPY keyspace1.standard1 from 
stdin;}} is the following supposed to work: {{cat input.csv | bin/cqlsh -f 
import.cql}}? Because I'm getting the following:
{noformat}
➜  cassandra git:(9303-2.1) ✗ cat input.csv | bin/cqlsh -f import.cql
Using 3 child processes

Starting copy of keyspace1.standard1 with columns ['key', 'C0', 'C1', 'C2', 
'C3', 'C4'].
[Use \. on a line by itself to end input]
Processed: 0 rows; Rate:   0 rows/s; Avg. rate:   0 rows/s
0 rows imported from 0 files in 0.007 seconds (0 skipped).
{noformat}

Thanks, we are really close now! :-)

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-21 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066163#comment-15066163
 ] 

Stefania commented on CASSANDRA-9303:
-

bq. I tested the new config options and the ingest rate is now working like a 
charm. 

Thanks, I made a slight modification to the ingest rate algorithm to give a 
better chance to the receive meter to show the statistics. The ingest rate 
should still be pretty accurate.

bq. I was initially thinking that while \[copy\] is a global section, 
\[copy-from*\] and \[copy-to*\] are exclusive sections for these commands, so 
for example if you define INGESTRATE by mistake in the \[copy-to\] section it's 
not picked up by a copy-from execution.

That's correct, {{copy-to*}} sections are not read in COPY FROM executions and 
vice-versa. I've added a check to explicitly skip invalid or wrong-direction 
options from config files, along with more log messages, so that it should be 
easier to see when an option is not read or is ignored.

bq. Can you also add some examples to conf/cqlshrc.sample ? And maybe also 
update the cql protocol version there which is quite old.

Done.

bq. Also in the Reading options from /home/paulo/.cqlsh/cqlshrc message, maybe 
print which options are being read to improve clarity (don't worry if not 
straightforward)

Done.
   
bq. Cool! Since it's an edge-case I guess we can omit in the help and print a 
message instead in case it happens.

Done.

bq. Sounds good, it just seems the skipped columns is still being printed on 
the message Starting copy of keyspace1.standard1 with columns \['key', 'C0', 
'C1', 'C2', 'C3', 'C4'\]. (you fixed before, but it came back somehow).

It came back because of the changes to SKIPCOLS, it should be OK now.

bq. Move csv_dialect_defaults from cqlsh.py to copyutil.py

Done, I got rid of it.

bq. Move exclusive skip_columns field from CopyTask to ImportTask

Done, I've also moved it from ChildProcess to ImportProcess.

bq. csv_options are a bit misleading since they are not exclusive csv-related 
options, can we maybe rename the tuple CopyOptions(csv, dialect, unrecognized) 
to Options(copy, dialect, unrecognized)?

Done

--

I need to clarify with [~iamaleksey] for which branch we need to commit this 
since CASSANDRA-9494 was only committed to trunk. I will up-merge later on 
today once I know for sure.

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-18 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064115#comment-15064115
 ] 

Paulo Motta commented on CASSANDRA-9303:


Looking very good, I tested the new config options and the ingest rate is now 
working like a charm. Some follow-up comments below:

bq. Done, I cleaned up the options a bit as well and removed the helper methods 
in the main cqlsh files.

* I was initially thinking that while \[copy\] is a global section, 
\[copy-from*\] and \[copy-to*\] are exclusive sections for these commands, so 
for example if you define INGESTRATE by mistake in the \[copy-to\] section it's 
not picked up by a copy-from execution.
* Can you also add some examples to {{conf/cqlshrc.sample}} ? And maybe also 
update the cql protocol version there which is quite old.
* Also in the {{Reading options from /home/paulo/.cqlsh/cqlshrc}} message, 
maybe print which options are being read to improve clarity (don't worry if not 
straightforward)

bq. If a file from a previous execution exists it will be renamed to 
.MMDD_HHMMSS.

Cool! Since it's an edge case, I guess we can omit it from the help and print a 
message instead in case it happens.

bq. So, I converted SKIPCOLS to a COPY FROM option and changed its semantic to 
just skip columns that exist in the file.

Sounds good, it just seems the skipped columns are still being printed in the 
message {{Starting copy of keyspace1.standard1 with columns \['key', 'C0', 
'C1', 'C2', 'C3', 'C4'\].}} (you fixed it before, but it came back somehow).

Minor code style nits:

* Move csv_dialect_defaults from cqlsh.py to copyutil.py
* Move exclusive skip_columns field from CopyTask to ImportTask
* csv_options are a bit misleading since they are not exclusive csv-related 
options, can we maybe rename the tuple CopyOptions(csv, dialect, unrecognized) 
to Options(copy, dialect, unrecognized)?

We're getting there, I guess we'll be done by next round. :)

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-17 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062302#comment-15062302
 ] 

Stefania commented on CASSANDRA-9303:
-



bq. I'd suggest the following \[copy(:ks.table)\] (global and per-table copy 
(to and from) options), \[copy-from(:ks.table)\] (global and per-table 
copy-from options), \[copy-to(:ks.table)\] (global and per-table copy-to 
options) where (:ks.table) is optional. so you can have \[copy\], \[copy-to\], 
\[copy-from\], \[copy-to:ks.table\], \[copy-from:ks.table\].

Done, I cleaned up the options a bit as well and removed the helper methods in 
the main cqlsh files.

bq. maybe we could just add an unique suffix to avoid appending to an existing 
file from a previous execution?

If a file from a previous execution exists it will be renamed to 
.MMDD_HHMMSS.

bq. We can address if it won't take too much time, otherwise we can address it 
separately. Can we maybe improve it by making batchsize adaptive = 
min(batchsize, ingest_rate - current_record) or something more complicated will 
be needed?

Done, adaptive chunk size and retries needed changing.
 
bq. Move SKIPCOLS to COPY_COMMON_OPTIONS since it can be used in both copy-to 
and copy-from.

Actually it should be a COPY FROM only option, see more below.

bq. Regarding the behavior of SKIPCOLS with COPY FROM, right now it only 
supports having fewer columns in the CSV. Should we also support actually 
skipping columns in the CSV even if they are present?

I think the semantics I chose, using SKIPCOLS to subtract from the set of 
columns specified in the command line, are not as advantageous as the ability 
to skip columns in the file. Providing both features with the same option would 
be confusing. So, I converted SKIPCOLS to a COPY FROM option and changed its 
semantics to just skip columns that exist in the file. If in future the need 
arises to specify "all columns except" in the command line, we can introduce a 
regex-like expression (^col_name) in the columns part of the COPY cmd.
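
To illustrate the "skip columns that exist in the file" semantics, here is a minimal sketch. It assumes Python 3 and the standard {{csv}} module; the function {{rows_without_skipped}} and its parameters are hypothetical and do not reflect the actual code in _copyutil.py_.

```python
import csv
import io


def rows_without_skipped(csv_text, columns, skip_cols):
    # `columns` lists every column present in the file, in file order.
    # Columns named in `skip_cols` are read from the file but dropped
    # before the rows would be imported (hypothetical helper).
    keep = [i for i, name in enumerate(columns) if name not in skip_cols]
    kept_names = [columns[i] for i in keep]
    reader = csv.reader(io.StringIO(csv_text))
    rows = [[row[i] for i in keep] for row in reader]
    return kept_names, rows


names, rows = rows_without_skipped(
    "k1,a,b\nk2,c,d\n", ["key", "C0", "C1"], {"C1"})
print(names)
print(rows)
```

The file still contains all three columns; only the imported column list and row values shrink.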

bq. Another related feature to have in the future would be to pick only 
specific columns from the csv and allowing custom orderings of columns, but we 
can leave that for later if there's a need.

I think reordering columns is not as useful as skipping them so I tend to agree 
to leave this as a future development if the need arises.

bq. After those are addressed you can probably start making 2.2+ patches.

I changed a lot of code today and I've run out of time anyway, so I'll wait for 
one more round of review before up-merging.


> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060520#comment-15060520
 ] 

Paulo Motta commented on CASSANDRA-9303:


Looking good, thanks! Some follow-ups below:

bq. CONFIGSECTIONS: this is removed and instead we search the following static 
sections: \[copy\], \[copy-ks-table\], \[copy-ks-table-from\] or 
\[copy-ks-table-to\], in this order.

sounds good! I'd suggest the following \[copy(:ks.table)\] (global and 
per-table copy (to and from) options), \[copy-from(:ks.table)\] (global and 
per-table copy-from options), \[copy-to(:ks.table)\] (global and per-table 
copy-to options) where (:ks.table) is optional. so you can have \[copy\], 
\[copy-to\], \[copy-from\], \[copy-to:ks.table\], \[copy-from:ks.table\].
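
As a rough illustration of the section scheme suggested above, the following sketch merges options from the matching sections. It assumes Python 3's {{configparser}}; the function {{read_copy_options}}, the sample option values, and the lookup order (general sections first, more specific sections overriding them) are all assumptions made for the example, not the behavior of the patch.

```python
import configparser

SAMPLE = """
[copy]
chunksize = 100
ingestrate = 100

[copy-from]
numprocesses = 5

[copy-to]
pagesize = 500

[copy-from:keyspace1.standard1]
ingestrate = 200
"""


def read_copy_options(config_text, direction, ks, table):
    # direction is "from" or "to"; sections for the other direction are
    # never consulted (hypothetical helper, assumed precedence order).
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    sections = [
        "copy",
        "copy:%s.%s" % (ks, table),
        "copy-%s" % direction,
        "copy-%s:%s.%s" % (direction, ks, table),
    ]
    options = {}
    for section in sections:
        if parser.has_section(section):
            options.update(parser.items(section))  # later sections win
    return options


print(read_copy_options(SAMPLE, "from", "keyspace1", "standard1"))
```

With this sample, a COPY FROM on {{keyspace1.standard1}} picks up the per-table {{ingestrate}} and ignores everything under {{\[copy-to\]}}.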

bq. if no error file is specified I've introduced a default error file called 
import_ks_table.err

Nice! Maybe we could just add a unique suffix to avoid appending to an 
existing file from a previous execution?

bq. Another thing that follows from the CASSANDRA-9302 review is that the 
INGESTRATE only works if it is much bigger than the CHUNKSIZE. We could address 
it here if you think this is important. 

We can address if it won't take too much time, otherwise we can address it 
separately. Can we maybe improve it by making batchsize adaptive = 
{{min(batchsize, ingest_rate - current_record)}} or something more complicated 
will be needed?
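
The adaptive batch size idea can be sketched as follows. This is only an illustration of the {{min(batchsize, ingest_rate - current_record)}} formula for a single one-second window, assuming Python 3; the function name {{chunk_sizes_for_window}} is hypothetical and the real feeder logic may need to be more involved.

```python
def chunk_sizes_for_window(chunk_size, ingest_rate):
    # Split one rate window (ingest_rate rows) into chunks that never
    # exceed chunk_size and never overshoot the rate (hypothetical helper).
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    sent = 0
    sizes = []
    while sent < ingest_rate:
        size = min(chunk_size, ingest_rate - sent)
        sizes.append(size)
        sent += size
    return sizes


# A rate that is not a multiple of the chunk size ends with a smaller chunk.
print(chunk_sizes_for_window(1000, 2500))
# A chunk size larger than the rate is capped at the rate.
print(chunk_sizes_for_window(5000, 100))
```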

Some minor things I missed before:

* Move {{SKIPCOLS}} to {{COPY_COMMON_OPTIONS}} since it can be used in both 
copy-to and copy-from.
* Regarding the behavior of {{SKIPCOLS}} with COPY FROM, right now it only 
supports having fewer columns in the CSV. Should we also support actually 
skipping columns in the CSV even if they are present?
** Another related feature to have in the future would be to pick only specific 
columns from the csv and allowing custom orderings of columns, but we can 
leave that for later if there's a need.

After those are addressed you can probably start making 2.2+ patches.

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-16 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060229#comment-15060229
 ] 

Stefania commented on CASSANDRA-9303:
-

{quote}
I didn't really get the purpose of the CONFIGSECTIONS option and I think it 
complicates more than bring us benefits. Is there any particular case you want 
to achieve with this option? Can't we just have a general \[copy\] section, in 
addition specific \[ks.table\] sections for custom options per-table?
{quote}

The purpose was to let people choose multiple sections, for example depending 
on the direction: they may want to override some options that are common to 
both directions but require different values depending on the direction. 
Another purpose was a common copy section, as you pointed out. Unfortunately we 
cannot have hierarchical sections, it doesn't seem to be supported. 

bq.  We could also support those sections on cqlshrc as well, and maybe add 
example to conf/cqlshrc.sample. But if it's too much additional work just leave 
it as is.

It's not much work since we have access to the {{CONFIG_FILE}} variable. 

It's just a matter of designing this feature in a sensible way. Here's a 
proposal, I'll wait for your comments before starting work:

* CONFIGFILE: a file from which to read config sections; if not specified, we 
search _.cqlshrc_
* CONFIGSECTIONS: this is removed and instead we search the following static 
sections: \[copy\], \[copy-ks-table\], \[copy-ks-table-from\] or 
\[copy-ks-table-to\], in this order.

{quote}
The ERRFILE option is not present in COPY_FROM_OPTIONS so it does not show up 
in the auto completer.

Also, it seems the default ERRFILE is not being written if one is not 
explicitly specified.

We could extend this error message on ImportTask.process_records to print the 
errfile name so the user will know where to look if he didn't specify one: 
"Failed to process 10 rows (failed rows written to bla.err)"
{quote}

Done. If no error file is specified, I've introduced a default error file 
called _import_ks_table.err_, since we may have multiple input files now and it 
was not clear which input file name to pick as a default. This also has the 
advantage of working for STDIN as well. I've left the default file in the 
current folder; I didn't try anything too fancy, let me know if you want to 
enhance this. 

{quote}
In copyutil.py:maybe_read_config_file can you replace

{code}
ret.update(dict([(k, v,) for k, v in opts.iteritems() if k not in ['configfile', 'configsections']]))
{code}

with

{{ret.update(opts)}}

since you already popped 'configfile' and 'configsections' from opts before? 
(or maybe there's something I'm missing).
{quote}

You didn't miss anything, it's fixed now thanks.

bq. The name {{ExportTask.check_processes}} is a bit misleading, since it sends 
work and monitors progress, maybe rename to schedule_and_monitor, or coordinate 
or start_work or even monitor_processes ? I can't find a good name as well as 
you can see

Renamed it to {{export_records}} and renamed the equivalent method in 
{{ImportTask}} to {{import_records}}.

bq. Minor typo in {{ExportTask.get_ranges}} method description: rage -> range

Fixed.

{quote}
On this snippet in ExportTask.get_ranges:

{code}
# For the last ring interval we query the same replicas that hold the last token in the ring
if previous_range and (not end_token or previous < end_token):
    ranges[(previous, end_token)] = ranges[previous_range].copy()
{code}

for the last ring interval aren't we supposed to query the replicas that hold 
the first token in the ring instead (wrap-around)?
{quote}

Yes technically this would be the correct thing to do. I guess so far we did 
not really care about edge cases, even if we query the wrong replicas for one 
range it doesn't really matter for performance. I changed it to query the first 
token replicas now.
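
A minimal sketch of the wrap-around handling, under stated assumptions: Python 3, a plain dict mapping each ring token to the replicas that own the range ending at that token, and {{None}} standing for an unbounded range end. The function {{ranges_with_replicas}} is hypothetical and much simpler than {{ExportTask.get_ranges}}.

```python
def ranges_with_replicas(token_replicas):
    # token_replicas maps each ring token to the replicas owning the range
    # that ends at that token (hypothetical input shape).
    tokens = sorted(token_replicas)
    if not tokens:
        return {}
    ranges = {}
    previous = None  # None marks an unbounded start or end
    for token in tokens:
        ranges[(previous, token)] = list(token_replicas[token])
        previous = token
    # The interval after the last token wraps around the ring, so it is
    # served by the replicas of the first token, not the last one.
    ranges[(previous, None)] = list(token_replicas[tokens[0]])
    return ranges


print(ranges_with_replicas({10: ["a"], 20: ["b"], 30: ["c"]}))
```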

{quote}
On ImportProcess.run_normal, did you find out the reason why the commented 
snippet below slows down the query? Did you try it again after the review 
changes of CASSANDRA-9302?

{code}
# not sure if this is required but it does slow things down three fold
# query_statement.consistency_level = self.consistency_level
{code}

If it still holds that's quite bizarre, as the same consistency is used later 
in the batch statement. I wonder how the prepared statement CL interacts with 
the batch CL, if it does at all.
{quote}

It must have been another problem fixed by 9302 as it makes no difference to 
performance now, I've re-introduced it.

{quote}
On ImportReader.get_source you forgot a debug print: print "Returning source 
{}".format(ret). You should probably remove it or print only on debug mode.
{quote}

Removed.

{quote}
On formatting.py you can probably replace the format_integer_with_thousands_sep 
with a simpler imple

[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-16 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060214#comment-15060214
 ] 

Stefania commented on CASSANDRA-9303:
-

It's fixed now thanks.

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-15 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059027#comment-15059027
 ] 

Paulo Motta commented on CASSANDRA-9303:


Also, there seems to be a new failure with 
[cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_round_trip_with_rate_file|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-2.1-dtest/lastCompletedBuild/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_round_trip_with_rate_file/]
 probably due to the review changes of CASSANDRA-9302.

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-15 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059004#comment-15059004
 ] 

Paulo Motta commented on CASSANDRA-9303:


Impressive work [~Stefania]. I tested it locally and most options work as 
expected. Overall I'm very satisfied with the code and tests, except for some 
minor details listed below:
* I didn't really get the purpose of the {{CONFIGSECTIONS}} option and I think 
it adds more complexity than benefit. Is there any particular case you want to 
cover with this option? Can't we just have a general {{\[copy\]}} section, in 
addition to specific {{\[ks.table\]}} sections for custom per-table options?
** We could also support those sections in {{cqlshrc}}, and maybe add an 
example to {{conf/cqlshrc.sample}}. But if it's too much additional work just 
leave it as is.
* The {{ERRFILE}} option is not present in {{COPY_FROM_OPTIONS}} so it does not 
show up in the auto completer.
** Also, it seems the default {{ERRFILE}} is not being written if one is not 
explicitly specified.
** We could extend the error message in {{ImportTask.process_records}} to 
print the errfile name, so users know where to look if they didn't specify 
one:
*** {{"Failed to process 10 rows (failed rows written to bla.err)"}}
* In {{copyutil.py:maybe_read_config_file}} can you replace 
{noformat}ret.update(dict([(k, v,) for k, v in opts.iteritems() if k not in 
['configfile', 'configsections']])){noformat} with 
{noformat}ret.update(opts){noformat} since you already popped {{'configfile'}} 
and {{'configsections'}} from {{opts}} before? (or maybe there's something I'm 
missing).
* The name {{ExportTask.check_processes}} is a bit misleading, since it sends 
work and monitors progress. Maybe rename it to {{schedule_and_monitor}}, 
{{coordinate}}, {{start_work}} or even {{monitor_processes}}? I can't find a 
good name either, as you can see :P
* Minor typo in {{ExportTask.get_ranges}} method description: {{rage}} -> 
{{range}}
* On this snippet in {{ExportTask.get_ranges}}:
{code}
# For the last ring interval we query the same replicas that hold the last token in the ring
if previous_range and (not end_token or previous < end_token):
    ranges[(previous, end_token)] = ranges[previous_range].copy()
{code}
for the last ring interval aren't we supposed to query the replicas that hold 
the first token in the ring instead (wrap-around)?
* On {{ImportProcess.run_normal}}, did you find out the reason why the 
commented snippet below slows down the query? Did you try it again after the 
review changes of CASSANDRA-9302?
{noformat}
# not sure if this is required but it does slow things down three fold
# query_statement.consistency_level = self.consistency_level
{noformat} If it still holds that's quite bizarre, as the same consistency is 
used later in the batch statement. I wonder how the prepared statement CL 
interacts with the batch CL, if it does at all.
* On {{ImportReader.get_source}} you forgot a debug print: {{print "Returning 
source {}".format(ret)}}. You should probably remove it or print only on debug 
mode.
* In {{formatting.py}} you can probably replace 
{{format_integer_with_thousands_sep}} with a simpler implementation that takes 
advantage of Python's built-in thousands-separator formatting (only available 
with "," though, hence the replace afterwards):
{code}
def format_integer_with_thousands_sep(val, thousands_sep=','):
    return "{:,}".format(val).replace(',', thousands_sep)
{code}
* Suggestion: modify the following messages to include the number of files 
written/read:
{noformat}
1000 rows exported to N files in 1.257 seconds.
130 rows imported from N files in 0.154 seconds.
{noformat}
* I found two situations where one corrupted row may fail importing of all the 
other rows, so you should probably cover these in your dtests:
** when there is a parse error in the primary key (stress-generated blob in 
this case):
{noformat}
Failed to import 1000 rows: ParseError - non-hexadecimal number found in 
fromhex() arg at position 0 -  given up without retries
Exceeded maximum number of parse errors 10
Failed to process 1000 rows
{noformat}
** when there is a row with fewer columns in the CSV:
{noformat}
Failed to import 20 rows: InvalidRequest - code=2200 [Invalid query] 
message="There were 6 markers(?) in CQL but 5 bound variables" -  will retry 
later, attempt 1 of 5
Failed to import 20 rows: InvalidRequest - code=2200 [Invalid query] 
message="There were 6 markers(?) in CQL but 5 bound variables" -  will retry 
later, attempt 2 of 5
Failed to import 20 rows: InvalidRequest - code=2200 [Invalid query] 
message="There were 6 markers(?) in CQL but 5 bound variables" -  will retry 
later, attempt 3 of 5
Failed to import 20 rows: InvalidRequest - code=2200 [Invalid query] 
message="There were 6 markers(?) in CQL b

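For reference, the thousands-separator suggestion from the review above can be 
tried in isolation. This is only a minimal sketch: it assumes integer input 
and a Python version that supports the {{,}} format option, and it is not 
taken from the patch itself.

```python
def format_integer_with_thousands_sep(val, thousands_sep=','):
    # "{:,}" always groups digits with ','; swap in the requested
    # separator afterwards, as suggested in the review.
    return "{:,}".format(val).replace(',', thousands_sep)


print(format_integer_with_thousands_sep(1234567))       # 1,234,567
print(format_integer_with_thousands_sep(1234567, '.'))  # 1.234.567
```

Passing an empty separator simply removes the grouping, which would match a 
THOUSANDSSEP default of ''.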
[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-07 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15044613#comment-15044613
 ] 

Stefania commented on CASSANDRA-9303:
-

[~aholmber] : this is the final part of the COPY enhancements and it is also 
ready for review. The patch is based on CASSANDRA-9494 and CASSANDRA-9302. I'll 
up-merge to 2.2+ once these tickets have been reviewed.

Here are the 2.1 links:

|[patch|https://github.com/stef1927/cassandra/commits/9303-2.1]|
|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9303-2.1-dtest/]|

Also, note the location of the tests that were written for the new options: 
https://github.com/stef1927/cassandra-dtest/commits/9303.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-02 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036020#comment-15036020
 ] 

Stefania commented on CASSANDRA-9303:
-

bq. 1. If I set SKIPROWS to 10 say, and also set HEADER to true, will I skip 10 
or 11 rows?

10 data rows and the header will be skipped.

bq. 2. Why do you disable the ERRFILE for stdin?

No reason other than the lack of a sensible default name.

bq. 3. If the MAXERRORS/MAXINSERTERRORS is >1, where do you keep the error 
around? Is it captured anywhere so someone can look back on what type of error 
occurred? NoHostAvailableException, WriteTimeoutException, bad date format, etc.

Errors get printed to stdout whilst the failed rows are saved to ERRFILE and 
not printed to stdout.
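
To illustrate the answer to question 1, here is a minimal sketch of how HEADER 
and SKIPROWS combine. The helper name {{rows_to_import}} is hypothetical and 
the code is not taken from {{copyutil.py}}; it only assumes the semantics 
described above (the header is dropped first and is not counted in SKIPROWS).

```python
import csv
import io
from itertools import islice


def rows_to_import(source, header=False, skip_rows=0):
    """Return an iterator over the CSV rows left after HEADER and SKIPROWS.

    Hypothetical helper for illustration only.
    """
    reader = csv.reader(source)
    if header:
        # The header line is dropped first and does not count towards skip_rows.
        next(reader, None)
    # Skip the first skip_rows data rows, then pass through everything else.
    return islice(reader, skip_rows, None)


# One header line followed by 15 data rows.
data = io.StringIO("id,name\n" + "".join("%d,row%d\n" % (i, i) for i in range(15)))
remaining = list(rows_to_import(data, header=True, skip_rows=10))
print(len(remaining))  # 5
print(remaining[0])    # ['10', 'row10']
```

With HEADER=true and SKIPROWS=10, eleven lines of the file are consumed before 
the first imported row.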





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-02 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035771#comment-15035771
 ] 

 Brian Hess commented on CASSANDRA-9303:


2 questions:
1. If I set SKIPROWS to 10 say, and also set HEADER to true, will I skip 10 or 
11 rows?
2. Why do you disable the ERRFILE for stdin?



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-01 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1503#comment-1503
 ] 

Stefania commented on CASSANDRA-9303:
-

Since the descriptions in the table above are for the loader, here is the 
corresponding documentation for COPY. There are minor differences, but the 
options should be largely equivalent:

{code}
Available common COPY options and defaults:

  DELIMITER=','           - character that appears between records
  QUOTE='"'               - quoting character to be used to quote fields
  ESCAPE='\'              - character to appear before the QUOTE char when quoted
  HEADER=false            - whether to ignore the first line
  NULL=''                 - string that represents a null value
  DATETIMEFORMAT=         - timestamp strftime format
    '%Y-%m-%d %H:%M:%S%z'   defaults to time_format value in cqlshrc
  JOBS='6'                - the number of jobs each process can work on at a time
  MAXATTEMPTS='5'         - the maximum number of attempts per batch or range
  REPORTFREQUENCY='1'     - the frequency with which we display status updates
  DECIMALSEP='.'          - the separator for decimal values
  THOUSANDSSEP=''         - the separator for thousands digit groups
  BOOLSTYLE='True,False'  - the representation for booleans, case insensitive, specify true followed by false,
                            for example yes,no or 1,0
  NUMPROCESSES='n'        - the number of worker processes, by default the number of cores minus one
                            capped at 16
  CONFIGFILE=''           - a configuration file where you can specify WITH options, which may be overwritten
                            by those specified on the command line. The format of the config file is the same
                            as cqlshrc (see the Python ConfigParser documentation), you can put your options
                            under a section named 'ks.table' where ks and table are the names of the keyspace
                            and table of the COPY command. You can also specify alternative sections with
                            CONFIGSECTIONS. You cannot recursively link multiple configuration files by
                            specifying CONFIGFILE or CONFIGSECTIONS in a configuration file.
  CONFIGSECTIONS=''       - a comma separated list of sections to be read from a config file specified via
                            CONFIGFILE. The order is important since later sections will override values
                            from previous sections if the same key is specified in multiple sections.
  RATEFILE=''             - an optional file where to print the output statistics

Available COPY FROM options and defaults:

  CHUNKSIZE='1000'        - the size of chunks passed to worker processes
  INGESTRATE='5'          - the maximum rate to insert data in rows per second
  MINBATCHSIZE='2'        - the minimum size of an import batch
  MAXBATCHSIZE='20'       - the maximum size of an import batch
  TTL='-1'                - the time to live in seconds, by default data will not expire (neg. ttl)
  MAXROWS='-1'            - the maximum number of rows, -1 means no maximum
  SKIPROWS='0'            - the number of rows to skip
  SKIPCOLS=''             - a comma separated list of column names to skip
  MAXPARSEERRORS='-1'     - the maximum global number of parsing errors, -1 means no maximum
  MAXINSERTERRORS='-1'    - the maximum global number of insert errors, -1 means no maximum
  ERRFILE=''              - a file where to store all rows that could not be imported, by default this is
                             concatenated with ".err", disabled if importing from STDIN

Available COPY TO options and defaults:

  ENCODING='utf8'         - encoding for CSV output
  PAGESIZE='1000'         - the page size for fetching results
  PAGETIMEOUT=10          - the page timeout in seconds for fetching results
  BEGINTOKEN=''           - the minimum token string to consider when exporting data
  ENDTOKEN=''             - the maximum token string to consider when exporting data
{code}
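
The precedence rules described for CONFIGFILE and CONFIGSECTIONS can be 
sketched roughly as follows. This is an illustration under stated assumptions, 
not the patch's implementation: the function name {{read_copy_options}} and 
the sample section names are hypothetical, and the sketch uses Python 3's 
{{configparser.RawConfigParser}} so that {{%}} characters in values such as a 
DATETIMEFORMAT are not treated as interpolation.

```python
import configparser


def read_copy_options(config_text, sections, cmdline_opts):
    """Merge COPY options from config sections and the command line.

    Hypothetical helper: later sections override earlier ones, and
    options given on the command line override the config file.
    """
    parser = configparser.RawConfigParser()
    parser.read_string(config_text)

    opts = {}
    for section in sections:
        if parser.has_section(section):
            # Option names are lower-cased by the parser by default.
            opts.update(parser.items(section))
    opts.update(cmdline_opts)
    return opts


sample = """
[copy]
chunksize = 500
maxattempts = 3
datetimeformat = %Y-%m-%d

[ks.table]
chunksize = 2000
"""

print(read_copy_options(sample, ['copy', 'ks.table'], {'maxattempts': '10'}))
```

In this sample the {{ks.table}} section overrides {{chunksize}} from 
{{copy}}, and the command-line value for {{maxattempts}} wins over both.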


[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-01 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033320#comment-15033320
 ] 

Stefania commented on CASSANDRA-9303:
-

All options have been completed, refer to the table above. I still have to 
implement multi-file import however.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-11-30 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033243#comment-15033243
 ] 

Stefania commented on CASSANDRA-9303:
-

I was thinking of a list of Python globs so we can do things like: {{file1, 
file2, ... fileN}} but also {{*.csv, *.txt, folder/*}} and so forth. Is this 
what you have in mind?
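
As a rough sketch of that idea (not the eventual implementation), a 
comma-separated list of globs could be expanded with the standard {{glob}} 
module. The function name {{resolve_sources}} is hypothetical, and dropping 
directories and duplicate matches is an assumption made for the example.

```python
import glob
import os


def resolve_sources(spec):
    """Expand a comma-separated list of file names or globs into file paths.

    Hypothetical helper: patterns are expanded in the order given, matches
    of each pattern are sorted, and directories and duplicates are dropped.
    """
    files = []
    for pattern in spec.split(','):
        pattern = pattern.strip()
        if not pattern:
            continue
        for match in sorted(glob.glob(pattern)):
            if os.path.isfile(match) and match not in files:
                files.append(match)
    return files


if __name__ == "__main__":
    # Result depends on the files present in the working directory.
    print(resolve_sources("*.csv, *.txt, folder/*"))
```

A plain file name is also a valid pattern, so {{file1, file2}} and 
{{*.csv}} go through the same code path; names that match nothing are 
silently skipped in this sketch.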



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-11-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032077#comment-15032077
 ] 

Jonathan Ellis commented on CASSANDRA-9303:
---

Why a directory of files vs a list of any files?  (Globbing can turn a 
directory into a list easily enough.)



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-11-19 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15015299#comment-15015299
 ] 

Stefania commented on CASSANDRA-9303:
-

Thanks for your feedback, here are the updated versions of the progress 
tables. I will try to see if we can add support for importing multiple files 
as well.

h3. Importing

||cassandra-loader||COPY FROM||description||status||
|-configFile filename| |File with configuration options|TODO (extend cqlsh config file)|
|-delim delimiter|delimiter|Delimiter to use|already available|
|-dateFormat dateFormatString|dtformats|Date format|TODO|
|-nullString nullString|nullval|String that signifies NULL|already available|
|-skipRows skipRows| |Number of rows to skip|TODO|
|-skipCols columnsToSkip|column|Comma-separated list of columns to skip|already available, columns can be specified in the command syntax|
|-maxRows maxRows| |Maximum number of rows to read (-1 means all)|TODO|
|-maxErrors maxErrors| |Maximum parse errors to endure|TODO|
|-badDir badDirectory| |Directory for where to place badly parsed rows.|TODO|
|-port portNumber| |CQL Port Number|already available via cqlsh|
|-user username| |Cassandra username|already available via cqlsh|
|-pw password| |Password for user|already available via cqlsh|
|-ssl-truststore-path path| |Path to SSL truststore|already available via cqlsh|
|-ssl-truststore-pw pwd| |Password for SSL truststore|already available via cqlsh|
|-ssl-keystore-path path| |Path to SSL keystore|already available via cqlsh|
|-ssl-keystore-pw pwd| |Password for SSL keystore|already available via cqlsh|
|-consistencyLevel CL| |Consistency level|already available via cqlsh|
|-numFutures numFutures|jobs|Number of CQL futures to keep in flight|already available|
|-batchSize batchSize|minbatchsize, maxbatchsize|Number of INSERTs to batch together|already available|
|-decimalDelim decimalDelim|decimalsep|Decimal delimiter|done|
| |thousandssep|Thousands delimiter|done|
|-boolStyle boolStyleString|boolstyle|Style for booleans|done|
|-numThreads numThreads|numProcesses|Number of concurrent threads (files) to load|done|
|-queryTimeout # seconds|pageTimeout|Query timeout (in seconds)|already available|
|-numRetries numRetries|maxattempts|Number of times to retry the INSERT|already available|
|-maxInsertErrors # errors| |Maximum INSERT errors to endure|TODO|
|-rate rows-per-second| |Maximum insert rate|TODO (unsure how)|
|-progressRate num txns|reportfrequency|How often to report the insert rate|already available|
|-rateFile filename| |Where to print the rate statistics|TODO|
|-successDir dir| |Directory where to move successfully loaded files|will implement only if adding support for multi-file import|
|-failureDir dir| |Directory where to move files that did not successfully load|will implement only if adding support for multi-file import|


h3. Exporting

||cassandra-unloader||COPY TO||description||status||
|configFile filename| |File with configuration options|TODO (extend cqlsh config file)|
|-delim delimiter|delimiter|Delimiter to use|already available|
|-dateFormat dateFormatString|dtformats|Date format|already available|
|-nullString nullString|nullval|String that signifies NULL|already available|
|-port portNumber| |CQL Port Number|already available via cqlsh|
|-user username| |Cassandra username|already available via cqlsh|
|-pw password| |Password for user|already available via cqlsh|
|-ssl-truststore-path path| |Path to SSL truststore|already available via cqlsh|
|-ssl-truststore-pw pwd| |Password for SSL truststore|already available via cqlsh|
|-ssl-keystore-path path| |Path to SSL keystore|already available via cqlsh|
|-ssl-keystore-pw pwd| |Password for SSL keystore|already available via cqlsh|
|consistencyLevel CL| |Consistency level|already available via cqlsh|
|decimalDelim decimalDelim|decimalsep|Decimal delimiter|done|
| |thousandssep|Thousands delimiter|done|
|boolStyle boolStyleString|boolstyle|Style for booleans|done|
|numThreads numThreads|numprocesses|Number of concurrent threads to unload|done|
|beginToken tokenString|begintoken|Begin token|done|
|endToken tokenString|endtoken|End token|done|

Where it says _done_, I'm actually still working on automated tests.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-11-19 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014773#comment-15014773
 ] 

Stefania commented on CASSANDRA-9303:
-

They are new in the CASSANDRA-9302 patch, not yet committed.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-11-19 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014280#comment-15014280
 ] 

 Brian Hess commented on CASSANDRA-9303:


I'm curious - how do you set the following in CQLSH COPY FROM:
- numFutures (the number of concurrent asynchronous requests "in flight" at a 
time)
- batchSize (the number of INSERTs to batch and send as one request)
- queryTimeout (the amount of time to wait on queries)
- numRetries (the number of times to retry failed/timed-out queries)
- progressRate (the rate at which progress is reported)

All of these are marked as "already available", but it isn't clear how to set 
them, and the documentation doesn't say.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-11-19 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014261#comment-15014261
 ] 

 Brian Hess commented on CASSANDRA-9303:


Are there no plans to support loading a directory of files?  I would say that 
that is one of the bigger options leveraged by users of cassandra-loader.

I'm +1 on not doing the things that CQLSH already handles (username, password, 
etc).



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-11-19 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013881#comment-15013881
 ] 

Tyler Hobbs commented on CASSANDRA-9303:


I don't think we need to repeat the options that are already passed to cqlsh 
(user, port, ssl stuff, consistency level).

Since we only support loading a single file right now, I don't think 
{{successDir}} and {{failureDir}} are important.  However, a single-file 
version of {{badDirectory}} for storing rows that errored in some way would be 
good.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-11-19 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013193#comment-15013193
 ] 

Stefania commented on CASSANDRA-9303:
-

[~jbellis], [~thobbs] : do we want all options or are there some we don't care 
about?

For example those related to moving files to specific folders (successDir, 
failureDir) or those for specifying options that are already passed to cqlsh 
(user, port, etc).

I think the cassandra-unloader also splits output files.



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-11-19 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013182#comment-15013182
 ] 

Stefania commented on CASSANDRA-9303:
-

Here is the current status. I will regularly edit the following tables to 
reflect the progress:

h3. Importing

||cassandra-loader||COPY FROM||description||status||
|-configFile filename| |File with configuration options|TODO|
|-delim delimiter|delimiter|Delimiter to use|already available|
|-dateFormat dateFormatString|dtformats|Date format|TODO but we can parse all valid CQL time formats|
|-nullString nullString|nullval|String that signifies NULL|already available|
|-skipRows skipRows| |Number of rows to skip|TODO|
|-skipCols columnsToSkip|column|Comma-separated list of columns to skip|already available, columns can be specified in the command syntax|
|-maxRows maxRows| |Maximum number of rows to read (-1 means all)|TODO|
|-maxErrors maxErrors| |Maximum parse errors to endure|TODO|
|-badDir badDirectory| |Directory for where to place badly parsed rows.|TODO|
|-port portNumber| |CQL Port Number|TODO|
|-user username| |Cassandra username|TODO|
|-pw password| |Password for user|TODO|
|-ssl-truststore-path path| |Path to SSL truststore|TODO|
|-ssl-truststore-pw pwd| |Password for SSL truststore|TODO|
|-ssl-keystore-path path| |Path to SSL keystore|TODO|
|-ssl-keystore-pw pwd| |Password for SSL keystore|TODO|
|-consistencyLevel CL| |Consistency level|TODO|
|-numFutures numFutures|jobs|Number of CQL futures to keep in flight|already available|
|-batchSize batchSize|minbatchsize, maxbatchsize|Number of INSERTs to batch together|already available|
|-decimalDelim decimalDelim| |Decimal delimiter|TODO|
|-boolStyle boolStyleString| |Style for booleans|TODO|
|-numThreads numThreads| |Number of concurrent threads (files) to load|TODO (numProcesses)|
|-queryTimeout # seconds|pageTimeout|Query timeout (in seconds)|already available|
|-numRetries numRetries|maxattempts|Number of times to retry the INSERT|already available|
|-maxInsertErrors # errors| |Maximum INSERT errors to endure|TODO|
|-rate rows-per-second| |Maximum insert rate|TODO (unsure how)|
|-progressRate num txns|reportfrequency|How often to report the insert rate|already available|
|-rateFile filename| |Where to print the rate statistics|TODO|
|-successDir dir| |Directory where to move successfully loaded files|TODO|
|-failureDir dir| |Directory where to move files that did not successfully load|TODO|


h3. Exporting

||cassandra-unloader||COPY TO||description||status||
|configFile filename| |File with configuration options|TODO|
|-delim delimiter|delimiter|Delimiter to use|TODO|
|-dateFormat dateFormatString|dtformats|Date format|already available|
|-nullString nullString|nullval|String that signifies NULL|already available|
|-port portNumber| |CQL Port Number|TODO|
|-user username| |Cassandra username|TODO|
|-pw password| |Password for user|TODO|
|-ssl-truststore-path path| |Path to SSL truststore|TODO|
|-ssl-truststore-pw pwd| |Password for SSL truststore|TODO|
|-ssl-keystore-path path| |Path to SSL keystore|TODO|
|-ssl-keystore-pw pwd| |Password for SSL keystore|TODO|
|consistencyLevel CL| |Consistency level|TODO|
|decimalDelim decimalDelim| |Decimal delimiter|TODO|
|boolStyle boolStyleString| |Style for booleans|TODO|
|numThreads numThreads| |Number of concurrent threads to unload|TODO (numProcesses)|
|beginToken tokenString| |Begin token|TODO|
|endToken tokenString| |End token|TODO|
