[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

Stefania (JIRA) Tue, 01 Dec 2015 00:28:23 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033333#comment-15033333
 ]


Stefania commented on CASSANDRA-9303:
-------------------------------------

The descriptions in the table above are for the loader, so here is the 
corresponding documentation for COPY. There are minor differences, but the 
two should be largely equivalent:

{code}
Available common COPY options and defaults:

          DELIMITER=','           - character that appears between records
          QUOTE='"'               - quoting character to be used to quote fields
          ESCAPE='\'              - character to appear before the QUOTE char when quoted
          HEADER=false            - whether to ignore the first line
          NULL=''                 - string that represents a null value
          DATETIMEFORMAT=         - timestamp strftime format
            '%Y-%m-%d %H:%M:%S%z'   defaults to time_format value in cqlshrc
          JOBS='6'                - the number of jobs each process can work on at a time
          MAXATTEMPTS='5'         - the maximum number of attempts per batch or range
          REPORTFREQUENCY='10000' - the frequency with which we display status updates
          DECIMALSEP='.'          - the separator for decimal values
          THOUSANDSSEP=''         - the separator for thousands digit groups
          BOOLSTYLE='True,False'  - the representation for booleans, case insensitive, specify true followed by false,
                                    for example yes,no or 1,0
          NUMPROCESSES='n'        - the number of worker processes, by default the number of cores minus one
                                    capped at 16
          CONFIGFILE=''           - a configuration file where you can specify WITH options, which may be overwritten
                                    by those specified on the command line. The format of the config file is the same
                                    as cqlshrc (see the Python ConfigParser documentation). You can put your options
                                    under a section named 'ks.table' where ks and table are the names of the keyspace
                                    and table of the COPY command. You can also specify alternative sections with
                                    CONFIGSECTIONS. You cannot recursively link multiple configuration files by
                                    specifying CONFIGFILE or CONFIGSECTIONS in a configuration file.
          CONFIGSECTIONS=''       - a comma separated list of sections to be read from a config file specified via
                                    CONFIGFILE. The order is important since later sections will override values
                                    from previous sections if the same key is specified in multiple sections.
          RATEFILE=''             - an optional file where to print the output statistics

Available COPY FROM options and defaults:

          CHUNKSIZE='1000'        - the size of chunks passed to worker processes
          INGESTRATE='50000'      - the maximum rate to insert data in rows per second
          MINBATCHSIZE='2'        - the minimum size of an import batch
          MAXBATCHSIZE='20'       - the maximum size of an import batch
          TTL='-1'                - the time to live in seconds, by default data will not expire (neg. ttl)
          MAXROWS='-1'            - the maximum number of rows, -1 means no maximum
          SKIPROWS='0'            - the number of rows to skip
          SKIPCOLS=''             - a comma separated list of column names to skip
          MAXPARSEERRORS='-1'     - the maximum global number of parsing errors, -1 means no maximum
          MAXINSERTERRORS='-1'    - the maximum global number of insert errors, -1 means no maximum
          ERRFILE=''              - a file where to store all rows that could not be imported, by default this is
                                    <filename> concatenated with ".err", disabled if importing from STDIN

Available COPY TO options and defaults:

          ENCODING='utf8'          - encoding for CSV output
          PAGESIZE='1000'          - the page size for fetching results
          PAGETIMEOUT=10           - the page timeout in seconds for fetching results
          BEGINTOKEN=''            - the minimum token string to consider when exporting data
          ENDTOKEN=''              - the maximum token string to consider when exporting data
{code}
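
To make the CONFIGFILE / CONFIGSECTIONS precedence concrete, here is a 
minimal Python sketch of how the options could be merged. It is only an 
illustration of the rules described above, not the actual cqlsh code: the 
function name resolve_copy_options, its parameters and the sample section 
named "fast" are all made up for this example.

```python
import configparser


def resolve_copy_options(config_text, ks, table, cmdline_opts, sections=None):
    """Merge COPY options from a config file and the command line.

    Hypothetical helper, not part of cqlsh. It follows the rules in the
    help text: sections are read in order (later ones win), and options
    given on the command line override anything found in the file.
    """
    # interpolation=None so that values such as '%Y-%m-%d' are kept verbatim
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)

    # Default section is 'ks.table' unless CONFIGSECTIONS names others
    if sections is None:
        sections = ["%s.%s" % (ks, table)]

    merged = {}
    for name in sections:
        if not parser.has_section(name):
            continue
        for key, value in parser.items(name):
            key = key.upper()
            # No recursive linking: ignore these two keys inside the file
            if key in ("CONFIGFILE", "CONFIGSECTIONS"):
                continue
            merged[key] = value

    # Command line options have the last word
    merged.update({k.upper(): v for k, v in cmdline_opts.items()})
    return merged


SAMPLE = """
[ks1.table1]
chunksize = 500
maxbatchsize = 10

[fast]
chunksize = 5000
configfile = other.ini
"""

if __name__ == "__main__":
    print(resolve_copy_options(SAMPLE, "ks1", "table1",
                               {"MAXBATCHSIZE": "50"},
                               sections=["ks1.table1", "fast"]))
```

With the sample above, CHUNKSIZE comes from the later "fast" section, 
MAXBATCHSIZE comes from the command line, and the configfile entry inside 
the file is ignored.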

> Match cassandra-loader options in COPY FROM
> -------------------------------------------
>
>                 Key: CASSANDRA-9303
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Tools
>            Reporter: Jonathan Ellis
>            Assignee: Stefania
>            Priority: Critical
>             Fix For: 2.1.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
