[ 
https://issues.apache.org/jira/browse/ARROW-16619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson resolved ARROW-16619.
-------------------------------------
    Fix Version/s: 9.0.0
         Assignee: Neal Richardson
       Resolution: Fixed

> [R] Support compression + R connection (URL with .gz file)
> ----------------------------------------------------------
>
>                 Key: ARROW-16619
>                 URL: https://issues.apache.org/jira/browse/ARROW-16619
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>            Reporter: Carl Boettiger
>            Assignee: Neal Richardson
>            Priority: Major
>             Fix For: 9.0.0
>
>
> Currently, remote access to data (particularly lazy read, an immensely 
> powerful arrow ability) only works for data in an S3-compliant object store 
> (though I know Azure support is in the works).  It would be really fantastic 
> if we could have remote access over HTTPS (I think this already works on the 
> Python side thanks to fsspec).
> For example, this fails in arrow but works in readr:
> arrow::read_csv_arrow("https://data.ecoforecast.org/targets/aquatics/aquatics-targets.csv.gz")
>  
> readr::read_csv("https://data.ecoforecast.org/targets/aquatics/aquatics-targets.csv.gz")
> I think this ability would be even more compelling in `open_dataset()`, since 
> it opens up for us all the power of lazy read access.  Most servers support 
> curl range requests so it seems this should be possible.  (We can already do 
> something similar from duckdb+R, but only after manually opting in to the http 
> extension, and only for Parquet).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
