Martin du Toit created ARROW-15856:
--------------------------------------
Summary: [R] S3FileSystem - open_dataset
Key: ARROW-15856
URL: https://issues.apache.org/jira/browse/ARROW-15856
Project: Apache Arrow
Issue Type: New Feature
Components: R
Affects Versions: 7.0.0
Reporter: Martin du Toit
Hi
I can successfully create an S3FileSystem that connects via MinIO.
I can create a SubTreeFileSystem:
s3://investmentaccountingdata/rawdata/transactions/transactions-xxx/v1.1/
I can list the files in the SubTreeFileSystem, and I can open a dataset from
the list of files:
{code:java}
# sfs, schema_file and csv_format are created earlier (not shown)
list_files <- sfs$ls(recursive = TRUE)
ds <- arrow::open_dataset(
  sources = list_files, schema = schema_file,
  format = csv_format, filesystem = sfs
)
{code}
This all works fine if I provide the list of files, but I want to specify a
path higher up so that the subfolders are picked up as partitions. The code I
use works perfectly if I run it on a local disk.
How can I call open_dataset with a folder as the source?
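For illustration, a minimal sketch of what I am trying to achieve, assuming
open_dataset accepts a SubTreeFileSystem object directly as its source (the
documentation lists a FileSystem that references a directory as a valid
source). The temp directory, the year=... folder names, the column names and
the LocalFileSystem stand-in for the MinIO-backed S3FileSystem are all
hypothetical:
{code}
library(arrow)

# Hypothetical local stand-in for the bucket layout: <root>/year=<value>/part.csv
root <- file.path(tempdir(), "transactions")
dir.create(file.path(root, "year=2021"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "year=2022"), recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, amount = c(10.5, 20)),
          file.path(root, "year=2021", "part.csv"), row.names = FALSE)
write.csv(data.frame(id = 3L, amount = 30),
          file.path(root, "year=2022", "part.csv"), row.names = FALSE)

# Assumption: with MinIO the base filesystem would be the S3FileSystem instead
sfs <- SubTreeFileSystem$create(root, LocalFileSystem$create())

# Pass the SubTreeFileSystem itself as the source rather than a list of files,
# and describe the subfolders as hive-style partitions
ds <- open_dataset(sfs, format = "csv",
                   partitioning = hive_partition(year = int32()))
names(ds)
{code}
If this is the intended usage, the partition column (year in this sketch)
should show up in the dataset next to the columns read from the CSV files.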