[ https://issues.apache.org/jira/browse/CASSANDRA-17831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brad Schoening updated CASSANDRA-17831:
---------------------------------------
    Description:
cqlsh's COPY command supports only CSV as a format for import and export. A binary big data format such as Avro and/or Parquet would be more compact and highly portable to other platforms.

Parquet files embed their schema, so no separate schema definition is required; it therefore appears to be the easier format to support.

The existing syntax supports adding key-value pair options, such as FORMAT = PARQUET:

{{ COPY table_name ... FROM 'file_name'[, 'file2_name', ...] }}
{{[WITH option = 'value' [AND ...]]}}

Side-by-side comparisons of CSV and Parquet show an 80%-plus saving in disk space.

[https://towardsdatascience.com/csv-files-for-storage-no-thanks-theres-a-better-option-72c78a414d1d]

> Add support in CQLSH for COPY FROM / TO in compact Parquet format
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-17831
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17831
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Brad Schoening
>            Assignee: Brad Schoening
>            Priority: Normal
>
> cqlsh's COPY command supports only CSV as a format for import and export. A binary big data
> format such as Avro and/or Parquet would be more compact and highly portable
> to other platforms.
> Parquet files embed their schema, so no separate schema definition is required; it therefore
> appears to be the easier format to support.
> The existing syntax supports adding key-value pair options, such as FORMAT = PARQUET:
> {{ COPY table_name ... FROM 'file_name'[, 'file2_name', ...] }}
> {{[WITH option = 'value' [AND ...]]}}
> Side-by-side comparisons of CSV and Parquet show an 80%-plus saving in disk space.
> [https://towardsdatascience.com/csv-files-for-storage-no-thanks-theres-a-better-option-72c78a414d1d]
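
To make the proposed FORMAT option concrete, here is a minimal sketch of how a key-value WITH clause of the shape shown above could be read and a format selected from it. It is not cqlsh's actual option parser: the function names `parse_with_options` and `copy_format`, the `SUPPORTED_FORMATS` set, and the choice of CSV as the default are all assumptions made for illustration.

```python
import re

# Hypothetical set of formats; the ticket states only CSV is supported today.
SUPPORTED_FORMATS = {"CSV", "PARQUET"}

# One option: name = 'quoted value' or name = bare_word
_OPTION = re.compile(r"\s*(\w+)\s*=\s*(?:'([^']*)'|(\w+))\s*", re.ASCII)


def parse_with_options(clause):
    """Parse "WITH a = 'x' AND b = y" into a dict with upper-cased keys.

    Limitation of this sketch: a quoted value containing " AND " would be
    split incorrectly, because options are separated with a plain regex.
    """
    text = clause.strip()
    if not text:
        return {}
    if not text.upper().startswith("WITH "):
        raise ValueError("expected the clause to start with WITH")
    options = {}
    for part in re.split(r"\s+AND\s+", text[5:], flags=re.IGNORECASE):
        match = _OPTION.fullmatch(part)
        if match is None:
            raise ValueError(f"malformed option: {part!r}")
        name, quoted, bare = match.groups()
        options[name.upper()] = quoted if quoted is not None else bare
    return options


def copy_format(options):
    """Return the requested format, defaulting to CSV (assumed default)."""
    fmt = options.get("FORMAT", "CSV").upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported FORMAT: {fmt}")
    return fmt


if __name__ == "__main__":
    opts = parse_with_options("WITH FORMAT = PARQUET AND HEADER = 'true'")
    print(copy_format(opts), opts)
```

Defaulting to CSV when FORMAT is absent would keep existing COPY invocations working unchanged, which is the main reason for sketching the option this way.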