Re: code snippet for cqlsh COPY from

Andy Tolbert Wed, 25 Oct 2017 13:15:49 -0700

Hi Suresh,

cqlsh COPY batches intelligently: it only groups inserts that target the
same partition into a batch.
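
To illustrate the idea, here is a minimal Python sketch of grouping rows by
partition key and splitting each group into chunks of at most a given size.
This is not cqlsh's actual code; the function name, the pk_index parameter
and the sample rows are hypothetical, and max_batch_size only loosely
mirrors the MAXBATCHSIZE option.

```python
from collections import defaultdict


def group_by_partition(rows, pk_index=0, max_batch_size=100):
    """Yield (partition_key, batch) pairs where every batch holds rows
    from a single partition and has at most max_batch_size rows."""
    by_partition = defaultdict(list)
    for row in rows:
        # Assumption: the partition key is a single column at pk_index.
        by_partition[row[pk_index]].append(row)
    for key, part_rows in by_partition.items():
        # Split each partition's rows into fixed-size chunks.
        for start in range(0, len(part_rows), max_batch_size):
            yield key, part_rows[start:start + max_batch_size]


# Hypothetical sample data: (pkey, skey) tuples.
rows = [("a", 1), ("b", 1), ("a", 2), ("a", 3)]
batches = list(group_by_partition(rows, max_batch_size=2))
```

With the sample above, partition "a" ends up in two batches (two rows, then
one row) and partition "b" in one, so no batch ever mixes partitions.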

As of version 3.6, C* will not emit the "batch size exceeded" errors if all
statements in a batch belong to the same partition (CASSANDRA-10876
<https://issues.apache.org/jira/browse/CASSANDRA-10876>).

The docs (https://cassandra.apache.org/doc/latest/tools/cqlsh.html#copy-from)
are a good reference for how to use copy from.

https://www.datastax.com/dev/blog/new-features-in-cqlsh-copy is also a good
reference.

Here's an example from something I was working from locally:

cqlsh -e "COPY andy.table100b (pkey,skey,text1,text2,text3,text4,text5)
from 'csv/ordered/100b/*.csv' WITH header = true AND INGESTRATE=1000000 AND
NUMPROCESSES=32 AND MAXBATCHSIZE=100;" myhostname

Note you should probably still keep your batches relatively small, even with
single-partition batches, depending on your dataset.  In my particular case
I was working with relatively small data (100-byte rows).  There are
diminishing returns in terms of throughput as you increase your batch
size, but that will vary based on your data and environment.
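
As a rough back-of-the-envelope check, a sketch like the following shows how
the batch payload scales with row size. It assumes payload is simply row
size times rows per batch and ignores protocol and encoding overhead; the
function name and the 5000-byte comparison row are hypothetical.

```python
def approx_batch_bytes(row_bytes, rows_per_batch):
    """Rough payload estimate: row size times rows per batch.
    Ignores protocol and serialization overhead (an assumption)."""
    return row_bytes * rows_per_batch


# 100-byte rows with MAXBATCHSIZE=100, as in the example command.
small_rows = approx_batch_bytes(100, 100)
# Hypothetical larger rows with the same batch size.
large_rows = approx_batch_bytes(5000, 100)
```

The same MAXBATCHSIZE that gives roughly 10 KB per batch with 100-byte rows
gives roughly 500 KB with 5000-byte rows, which is why the right value
depends on your data.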

Thanks,
Andy

On Wed, Oct 25, 2017 at 11:51 AM Suresh Babu Mallampati <
smallampat...@gmail.com> wrote:

> Hi All,
>
> Can someone provide me the code snippet for the cqlsh COPY from csv file.
>
> I just want to know how that COPY mechanism works compared to normal
> insert/commit to avoid exceeding the batch size limit.
>
> Thanks,
> Suresh.
>
