[GitHub] [arrow-cookbook] thisisnic edited a comment on issue #92: [R] Add content on Tables vs. Datasets

GitBox Wed, 02 Mar 2022 14:50:48 -0800


thisisnic edited a comment on issue #92:
URL: https://github.com/apache/arrow-cookbook/issues/92#issuecomment-1055869736



   Topics:
   
   - different formats (csv/feather/parquet)
   - partitioning (via group_by or just supplying column names)
   - customising filenames via basename_template param
   - hive_style vs bare values
   - overwrite existing data via existing_data_behavior param
   - fine control over file structure via max_partitions/max_open_files/max_rows_per_file/min_rows_per_group/max_rows_per_group [note: this functionality is not in version 7.0.0 of the R package]
   - CSV datasets - how to read in files with or without headers
   - CSV datasets - similarities and differences compared to read_csv_arrow
   - CSV datasets - working with schemas
   - converting datasets from one format to another without loading them all into memory
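
A minimal sketch of how a few of these topics could fit together in a recipe, assuming the arrow and dplyr packages are installed. The scratch directories, the `mtcars` example data, and the `chunk-{i}.parquet` template are illustrative choices, not taken from the issue. The `max_*`/`min_*` options are left out because they are not in version 7.0.0 of the R package.

```r
library(arrow)
library(dplyr)

# Hypothetical scratch locations, used only for this sketch
csv_dir <- file.path(tempdir(), "mtcars_csv")
parquet_dir <- file.path(tempdir(), "mtcars_parquet")

# Write a CSV dataset, partitioned via group_by(); directories are
# hive-style by default (cyl=4, cyl=6, cyl=8)
mtcars %>%
  group_by(cyl) %>%
  write_dataset(csv_dir, format = "csv", existing_data_behavior = "overwrite")

# Convert CSV to Parquet: open_dataset() only scans file metadata, and
# write_dataset() streams the data through, so the full dataset is not
# collected into an R data frame at any point
open_dataset(csv_dir, format = "csv") %>%
  write_dataset(
    parquet_dir,
    format = "parquet",
    partitioning = "cyl",                    # supply column names directly
    hive_style = FALSE,                      # bare values: 4/, 6/, 8/
    basename_template = "chunk-{i}.parquet", # custom file names
    existing_data_behavior = "overwrite"
  )

parquet_files <- list.files(parquet_dir, recursive = TRUE)

# With bare-value directories the partition column has to be named on read
converted <- open_dataset(parquet_dir, partitioning = "cyl") %>% collect()
```

Because `hive_style = FALSE` drops the `cyl=` prefix from the directory names, the column name is no longer recoverable from the paths, which is why `partitioning = "cyl"` is passed again when reopening the Parquet dataset.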


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
