Martin du Toit created ARROW-15856:
--------------------------------------

             Summary: [R] S3FileSystem - open_dataset
                 Key: ARROW-15856
                 URL: https://issues.apache.org/jira/browse/ARROW-15856
             Project: Apache Arrow
          Issue Type: New Feature
          Components: R
    Affects Versions: 7.0.0
            Reporter: Martin du Toit


Hi

I can successfully create an S3FileSystem that connects via MinIO.

I can create a SubTreeFileSystem: 
s3://investmentaccountingdata/rawdata/transactions/transactions-xxx/v1.1/

I can list the files in the SubTreeFileSystem, and I can open a dataset from 
the list of files:
{code:java}
# List every file under the SubTreeFileSystem, then open them as one dataset.
list_files <- sfs$ls(recursive = TRUE)
ds <- arrow::open_dataset(
  sources = list_files,
  schema = schema_file,
  format = csv_format,
  filesystem = sfs
)
{code}
This all works fine if I provide the list of files, but I want to specify a 
path higher up so that the subfolders are picked up as partitions. The code I 
use works perfectly if I run it against a local disk.

How can I call open_dataset with a folder, rather than a list of files, as the source?
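
One approach that may be worth trying, sketched here and not verified against the MinIO setup above, is to pass the SubTreeFileSystem object itself as `sources` instead of a list of files, so that the directory is discovered recursively and the subfolders can be mapped to partition columns. To keep the sketch runnable without a bucket, it wraps a local temporary directory in a SubTreeFileSystem as a stand-in. The folder names (`year=2020`, `year=2021`), the column names and the `hive_partition()` layout are all assumptions made for illustration, not the real layout of the transactions data.

```r
library(arrow)

# Stand-in for the bucket path: a local temp directory with hive-style
# partition folders. Folder and column names are hypothetical.
root <- file.path(tempfile("transactions-"), "v1.1")
for (yr in c(2020, 2021)) {
  part_dir <- file.path(root, paste0("year=", yr))
  dir.create(part_dir, recursive = TRUE)
  write.csv(
    data.frame(id = 1:3, amount = c(1.5, 2.5, 3.5)),
    file.path(part_dir, "part-0.csv"),
    row.names = FALSE
  )
}

# Wrap the directory in a SubTreeFileSystem. With MinIO, the S3-backed `sfs`
# created earlier would take the place of this local one.
sfs <- SubTreeFileSystem$create(root, LocalFileSystem$create())

# Pass the file system object itself as `sources`, not a list of files.
ds <- open_dataset(
  sfs,
  format = "csv",
  partitioning = hive_partition(year = int32())
)

print(names(ds))
print(ds$files)
```

If the subfolders are not in `key=value` form, `partitioning` can instead be given as a character vector of field names or a schema, as described in the `open_dataset` documentation.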

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
