alamb commented on issue #9280:
URL: https://github.com/apache/datafusion/issues/9280#issuecomment-2120680185

   @aditanase how are you running the external statement?
   
   It seems to work well from `datafusion-cli`
   
   
   ```shell
   andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ datafusion-cli
   DataFusion CLI v38.0.0
   > create external table nyc_trip_data
   stored as parquet
   location 'https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2024-01.parquet';
   0 row(s) fetched.
   Elapsed 0.819 seconds.
   
   > describe nyc_trip_data;
   +-----------------------+------------------------------+-------------+
   | column_name           | data_type                    | is_nullable |
   +-----------------------+------------------------------+-------------+
   | VendorID              | Int32                        | YES         |
   | tpep_pickup_datetime  | Timestamp(Microsecond, None) | YES         |
   | tpep_dropoff_datetime | Timestamp(Microsecond, None) | YES         |
   | passenger_count       | Int64                        | YES         |
   | trip_distance         | Float64                      | YES         |
   | RatecodeID            | Int64                        | YES         |
   | store_and_fwd_flag    | LargeUtf8                    | YES         |
   | PULocationID          | Int32                        | YES         |
   | DOLocationID          | Int32                        | YES         |
   | payment_type          | Int64                        | YES         |
   | fare_amount           | Float64                      | YES         |
   | extra                 | Float64                      | YES         |
   | mta_tax               | Float64                      | YES         |
   | tip_amount            | Float64                      | YES         |
   | tolls_amount          | Float64                      | YES         |
   | improvement_surcharge | Float64                      | YES         |
   | total_amount          | Float64                      | YES         |
   | congestion_surcharge  | Float64                      | YES         |
   | Airport_fee           | Float64                      | YES         |
   +-----------------------+------------------------------+-------------+
   19 row(s) fetched.
   Elapsed 0.002 seconds.
   ```
   
   If you are just using a dataframe you need to register the HTTP object store:
   
   Here is an example of how to register a file URL:
   https://github.com/apache/datafusion/pull/10549/files#diff-bc239df2db94469eee52dbc0dba96dd7e5b8dde36dd1d63292fde71577efc3f5R96-R105
   
   ```rust
       // register file:// object store provider
       //  Get this error if not there:
       // Error: Internal("No suitable object store found for file://")
       // TODO: make the error more helpful and add an example of how to
       // register a local file object store
       let url = Url::try_from("file://")
           .map_err(|e| internal_datafusion_err!("can't parse file url: {e}"))?;
       let object_store = object_store::local::LocalFileSystem::new();
       ctx.runtime_env()
           .register_object_store(&url, Arc::new(object_store));
   ```
   
   I think you may need to register an object store built with
   https://docs.rs/object_store/latest/object_store/http/struct.HttpBuilder.html
   for the `https://` scheme
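   
   A rough, untested sketch of what that registration might look like, mirroring
   the `file://` snippet above. It assumes the `datafusion`, `object_store` (with
   its `http` feature enabled), `url`, and `tokio` crates are dependencies, and it
   reuses the host and file from the `datafusion-cli` session above. The exact
   calls may differ between DataFusion versions, so treat it as a starting point
   rather than the documented way to do this:
   
   ```rust
   use std::sync::Arc;

   use datafusion::error::{DataFusionError, Result};
   use datafusion::prelude::*;
   use object_store::http::HttpBuilder;
   use url::Url;

   #[tokio::main]
   async fn main() -> Result<()> {
       let ctx = SessionContext::new();

       // Assumption: same host as in the datafusion-cli example above.
       let base = "https://d37ci6vzurychx.cloudfront.net";
       let base_url = Url::parse(base)
           .map_err(|e| DataFusionError::External(Box::new(e)))?;

       // Build an HTTP-backed object store rooted at that host and
       // register it for the https:// scheme + host of `base_url`.
       let http_store = HttpBuilder::new().with_url(base).build()?;
       ctx.runtime_env()
           .register_object_store(&base_url, Arc::new(http_store));

       // With the store registered, the dataframe API should be able to
       // resolve the URL instead of failing to find an object store.
       let df = ctx
           .read_parquet(
               "https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2024-01.parquet",
               ParquetReadOptions::default(),
           )
           .await?;
       df.limit(0, Some(5))?.show().await?;
       Ok(())
   }
   ```
   
   The store is registered per scheme and host, so a file served from a different
   host would presumably need its own registration.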
   
   I will try and make some more documentation about this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

