[jira] [Closed] (ARROW-15856) [R] S3FileSystem - open_dataset

Neal Richardson (Jira) Fri, 13 May 2022 11:38:05 -0700


     [ 
https://issues.apache.org/jira/browse/ARROW-15856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Neal Richardson closed ARROW-15856.
-----------------------------------
    Resolution: Cannot Reproduce

If the data is on Azure, there's an open PR to add a filesystem backend for 
Azure (ARROW-2034). Perhaps using that will be the right solution. 

> [R] S3FileSystem - open_dataset
> -------------------------------
>
>                 Key: ARROW-15856
>                 URL: https://issues.apache.org/jira/browse/ARROW-15856
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: R
>    Affects Versions: 7.0.0
>            Reporter: Martin du Toit
>            Priority: Major
>
> Hi
> I can successfully create an S3FileSystem that connects via MinIO. 
> I can create a SubTreeFileSystem: 
> s3://investmentaccountingdata/rawdata/transactions/transactions-xxx/v1.1/
> I can list the files in the SubTreeFileSystem, and I can open a dataset 
> from the list of files:
> {code:java}
> # List every file under the SubTreeFileSystem, then open them as one dataset
> list_files <- sfs$ls(recursive = TRUE)
> ds <- arrow::open_dataset(sources = list_files, schema = schema_file,
>                           format = csv_format, filesystem = sfs)
> {code}
> This all works fine if I provide the list of files, but I want to specify a 
> path higher up so that the sub folders are included as partitions. The same 
> code works perfectly when I run it on a local disk.
> How can I call open_dataset with a folder as the source?
>  
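
For reference, a minimal sketch of opening a dataset from a directory rather
than from a file list. It assumes that open_dataset() accepts a FileSystem
object (such as a SubTreeFileSystem) as sources, as the arrow R documentation
describes. A local temporary directory stands in for the MinIO bucket, and the
file names, columns, and the partition field name "version" are hypothetical.
It may behave differently against the reporter's MinIO setup.

```r
library(arrow)

# Stand-in for the bucket: a local temp directory with two sub folders.
# (Hypothetical layout; the real data lives on S3/MinIO.)
root <- file.path(tempdir(), "transactions")
for (v in c("v1.0", "v1.1")) {
  dir.create(file.path(root, v), recursive = TRUE, showWarnings = FALSE)
  write.csv(
    data.frame(id = 1:2, amount = c(10.5, 20)),
    file.path(root, v, "part-0.csv"),
    row.names = FALSE
  )
}

# Root a SubTreeFileSystem at the parent directory. With MinIO, the base
# filesystem would be the S3FileSystem instead of a LocalFileSystem.
sfs <- SubTreeFileSystem$create(root, LocalFileSystem$create())

# Pass the filesystem object itself as the source; the sub folder names
# become values of the (hypothetical) "version" partition column.
ds <- open_dataset(sfs, format = "csv", partitioning = "version")
print(names(ds))
```

With the setup from the issue, the existing SubTreeFileSystem (sfs), rooted
one level higher, would be passed as sources in the same way, together with
the reporter's own format and schema arguments.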



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
