[
https://issues.apache.org/jira/browse/ARROW-15856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Neal Richardson closed ARROW-15856.
-----------------------------------
Resolution: Cannot Reproduce
If the data is on Azure, there's an open PR to add a filesystem backend for
Azure (ARROW-2034). Perhaps using that will be the right solution.
> [R] S3FileSystem - open_dataset
> -------------------------------
>
> Key: ARROW-15856
> URL: https://issues.apache.org/jira/browse/ARROW-15856
> Project: Apache Arrow
> Issue Type: New Feature
> Components: R
> Affects Versions: 7.0.0
> Reporter: Martin du Toit
> Priority: Major
>
> Hi
> I can successfully create an S3FileSystem that connects via minio.
> I can create a SubTreeFileSystem:
> s3://investmentaccountingdata/rawdata/transactions/transactions-xxx/v1.1/
> I can list the files in the SubTreeFileSystem, and I can open a dataset
> from the list of files:
> {code:java}
> # List every file under the SubTreeFileSystem, then open them as one dataset.
> list_files <- sfs$ls(recursive = TRUE)
> ds <- arrow::open_dataset(sources = list_files, schema = schema_file,
>                           format = csv_format, filesystem = sfs)
> {code}
> This all works fine if I provide the list of files, but I want to specify a
> path higher up so that the subfolders are included as partitions. The same
> code works perfectly when I run it against a local disk.
> How can I call open_dataset with a folder as the source?
>