[
https://issues.apache.org/jira/browse/ARROW-16619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Neal Richardson resolved ARROW-16619.
-------------------------------------
Fix Version/s: 9.0.0
Assignee: Neal Richardson
Resolution: Fixed
> [R] Support compression + R connection (URL with .gz file)
> ----------------------------------------------------------
>
> Key: ARROW-16619
> URL: https://issues.apache.org/jira/browse/ARROW-16619
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Reporter: Carl Boettiger
> Assignee: Neal Richardson
> Priority: Major
> Fix For: 9.0.0
>
>
> Currently, remote access to data (particularly lazy reads, an immensely
> powerful Arrow capability) only works for data in an S3-compatible object
> store (though I know Azure support is in the works). It would be really
> fantastic if we could have remote access over HTTPS (I think this already
> works on the Python side thanks to fsspec).
> For example, this fails in arrow but works in readr:
> arrow::read_csv_arrow("https://data.ecoforecast.org/targets/aquatics/aquatics-targets.csv.gz")
>
> readr::read_csv("https://data.ecoforecast.org/targets/aquatics/aquatics-targets.csv.gz")
> I think this ability would be even more compelling in `open_dataset()`, since
> it would open up all the power of lazy read access. Most servers support
> HTTP range requests, so it seems this should be possible. (We can already do
> something similar from duckdb+R, but only after manually opting in to the
> http extension, and only for Parquet.)
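
For readers on a release where the call quoted above fails, one possible workaround is to download the compressed file first and read it from disk. The following is a minimal sketch, not the fix that resolved this issue: `read_remote_csv_gz()` is a hypothetical helper name introduced here for illustration, and the sketch assumes the arrow package is installed and that the whole file fits on local disk, so it gives up the lazy, range-request access the report asks for.

```r
library(arrow)

# Hypothetical helper (not part of arrow): fetch a remote .csv.gz into a
# temporary file, then let arrow read it from the local path.
read_remote_csv_gz <- function(url) {
  # Keep the .csv.gz suffix so arrow can detect gzip compression
  # from the file extension.
  tmp <- tempfile(fileext = ".csv.gz")
  # Remove the temporary file once the data has been read into memory.
  on.exit(unlink(tmp), add = TRUE)
  # mode = "wb" keeps the binary gzip stream intact on Windows.
  download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  read_csv_arrow(tmp)
}
```

Calling `read_remote_csv_gz()` with the URL from the report should then return a data frame, assuming the server is reachable. Because the file is fully downloaded before parsing, this is only a stopgap for `read_csv_arrow()` and does not address the `open_dataset()` use case.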