[ https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033333#comment-15033333 ]
Stefania commented on CASSANDRA-9303:
-------------------------------------
The descriptions in the table above are for the loader. Here is the
corresponding documentation for COPY; there are minor differences, but the
two should be largely equivalent:
{code}
Available common COPY options and defaults:
  DELIMITER=','           - character that appears between records
  QUOTE='"'               - quoting character to be used to quote fields
  ESCAPE='\'              - character to appear before the QUOTE char
                            when quoted
  HEADER=false            - whether to ignore the first line
  NULL=''                 - string that represents a null value
  DATETIMEFORMAT=         - timestamp strftime format
    '%Y-%m-%d %H:%M:%S%z'   defaults to time_format value in cqlshrc
  JOBS='6'                - the number of jobs each process can work on
                            at a time
  MAXATTEMPTS='5'         - the maximum number of attempts per batch or
                            range
  REPORTFREQUENCY='10000' - the frequency with which we display status
                            updates
  DECIMALSEP='.'          - the separator for decimal values
  THOUSANDSSEP=''         - the separator for thousands digit groups
  BOOLSTYLE='True,False'  - the representation for booleans, case
                            insensitive, specify true followed by false,
                            for example yes,no or 1,0
  NUMPROCESSES='n'        - the number of worker processes, by default
                            the number of cores minus one
                            capped at 16
  CONFIGFILE=''           - a configuration file where you can specify
                            WITH options, which may be overridden
                            by those specified on the command line. The
                            format of the config file is the same
                            as cqlshrc (see the Python ConfigParser
                            documentation). You can put your options
                            under a section named 'ks.table' where ks
                            and table are the names of the keyspace
                            and table of the COPY command. You can also
                            specify alternative sections with
                            CONFIGSECTIONS. You cannot recursively link
                            multiple configuration files by
                            specifying CONFIGFILE or CONFIGSECTIONS in
                            a configuration file.
  CONFIGSECTIONS=''       - a comma separated list of sections to be
                            read from a config file specified via
                            CONFIGFILE. The order is important since
                            later sections will override values
                            from previous sections if the same key is
                            specified in multiple sections.
  RATEFILE=''             - an optional file to which the output
                            statistics are printed

Available COPY FROM options and defaults:
  CHUNKSIZE='1000'        - the size of chunks passed to worker
                            processes
  INGESTRATE='50000'      - the maximum rate to insert data in rows per
                            second
  MINBATCHSIZE='2'        - the minimum size of an import batch
  MAXBATCHSIZE='20'       - the maximum size of an import batch
  TTL='-1'                - the time to live in seconds, by default
                            data will not expire (neg. ttl)
  MAXROWS='-1'            - the maximum number of rows, -1 means no
                            maximum
  SKIPROWS='0'            - the number of rows to skip
  SKIPCOLS=''             - a comma separated list of column names to
                            skip
  MAXPARSEERRORS='-1'     - the maximum global number of parsing
                            errors, -1 means no maximum
  MAXINSERTERRORS='-1'    - the maximum global number of insert errors,
                            -1 means no maximum
  ERRFILE=''              - a file where to store all rows that could
                            not be imported, by default this is
                            <filename> concatenated with ".err",
                            disabled if importing from STDIN

Available COPY TO options and defaults:
  ENCODING='utf8'         - encoding for CSV output
  PAGESIZE='1000'         - the page size for fetching results
  PAGETIMEOUT=10          - the page timeout in seconds for fetching
                            results
  BEGINTOKEN=''           - the minimum token string to consider when
                            exporting data
  ENDTOKEN=''             - the maximum token string to consider when
                            exporting data
{code}
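
To illustrate how CONFIGFILE, CONFIGSECTIONS and the command-line options are meant to combine, here is a minimal Python sketch. It is not the cqlsh implementation: the names resolve_copy_options and default_num_processes, the sample sections and values, and the floor of one worker process are all assumptions made for illustration. It only follows the rules stated above: options are read from the 'ks.table' section unless CONFIGSECTIONS names alternatives, later sections override earlier ones, the command line wins over the file, and CONFIGFILE or CONFIGSECTIONS inside a file are ignored.

```python
import configparser
import os

# Hypothetical config file contents, in the same INI format as cqlshrc.
SAMPLE_CONFIG = """
[ks1.users]
chunksize = 500
ttl = 3600

[fast]
chunksize = 5000
configfile = other.ini

[slow]
chunksize = 100
maxbatchsize = 10
datetimeformat = %Y-%m-%d
"""


def resolve_copy_options(config_text, ks, table, cli_opts=None, config_sections=""):
    """Merge COPY options: config-file sections first, command line last."""
    # RawConfigParser avoids '%' interpolation, which would otherwise
    # choke on strftime formats such as '%Y-%m-%d'.
    parser = configparser.RawConfigParser()
    parser.read_string(config_text)

    # Assumption: without CONFIGSECTIONS only the 'ks.table' section is read.
    if config_sections:
        sections = [s.strip() for s in config_sections.split(",") if s.strip()]
    else:
        sections = ["%s.%s" % (ks, table)]

    merged = {}
    for section in sections:
        if not parser.has_section(section):
            continue
        for key, value in parser.items(section):
            # Later sections override earlier ones for the same key.
            merged[key.upper()] = value

    # Config files cannot link to further config files or sections.
    merged.pop("CONFIGFILE", None)
    merged.pop("CONFIGSECTIONS", None)

    # Options given on the command line override the config file.
    for key, value in (cli_opts or {}).items():
        merged[key.upper()] = value
    return merged


def default_num_processes(cores=None):
    """Number of cores minus one, capped at 16 (floor of 1 is an assumption)."""
    if cores is None:
        cores = os.cpu_count() or 1
    return max(1, min(16, cores - 1))


if __name__ == "__main__":
    print(resolve_copy_options(SAMPLE_CONFIG, "ks1", "users"))
    print(resolve_copy_options(SAMPLE_CONFIG, "ks1", "users",
                               cli_opts={"MAXBATCHSIZE": "20"},
                               config_sections="fast,slow"))
    print(default_num_processes())
```

In the second call the sections are read in the order fast, slow, so CHUNKSIZE ends up with the value from slow, while MAXBATCHSIZE takes the command-line value.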
> Match cassandra-loader options in COPY FROM
> -------------------------------------------
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
> Issue Type: New Feature
> Components: Tools
> Reporter: Jonathan Ellis
> Assignee: Stefania
> Priority: Critical
> Fix For: 2.1.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to
> handle real world requirements, we should match those.