alamb opened a new issue, #9133:
URL: https://github.com/apache/arrow-datafusion/issues/9133

   ### Is your feature request related to a problem or challenge?
   
   After https://github.com/apache/arrow-datafusion/pull/8753 it is now 
possible to read data from `http` via a create external table command:
   
   ```sql
   ❯ create external table hits stored as parquet location 
'https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_1.parquet';
   0 rows in set. Query took 0.178 seconds.
   
   ❯ describe hits;
   +-----------------------+-----------+-------------+
   | column_name           | data_type | is_nullable |
   +-----------------------+-----------+-------------+
   | WatchID               | Int64     | YES         |
   | JavaEnable            | Int16     | YES         |
   | Title                 | Binary    | YES         |
   ...
   | RefererHash           | Int64     | YES         |
   | URLHash               | Int64     | YES         |
   | CLID                  | Int32     | YES         |
   +-----------------------+-----------+-------------+
   105 rows in set. Query took 0.003 seconds.
   
   ```
   
   After https://github.com/apache/arrow-datafusion/pull/9064 from 
@manoj-inukolunu it is also possible to `COPY` **to** a remote URL, which is 
great. 
   
   However, it is not yet possible to select directly from a remote store like 
   
   ```sql
   select * from 
'https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_1.parquet';
   ```
   
   
   ### Describe the solution you'd like
   
   I would like to be able to select directly from a remote http source like
   
   ```sql
   select * from 
'https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_1.parquet'
 limit 1;
   
   Error during planning: table 
'datafusion.public.https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_1.parquet'
 not found
   ```
   
   This works great for local files:
   ```sql
   ❯ select * from '/Users/andrewlamb/Downloads/hits.parquet' limit 1;
   
+---------------------+------------+---------------+-----------+------------+-----------+-----------+------------+----------+----------------------+--------------+----+-----------+---------------------------------------+---------+-----------+-------------------+-----------------+---------------+-------------+-----------------+------------------+-----------------+------------+------------+-------------+----------+----------+----------------+----------------+--------------+------------------+----------+-------------+------------------+--------+-------------+----------------+----------------+--------------+-------------+-------------+-------------------+--------------------+----------------+-----------------+---------------------+---------------------+---------------------+---------------------+-------------+-------------+--------+------------+-------------+---------+------------------------------------------------------------+-----------+--------------+---------+-------------+------
 
---------+----------+----------+----------------+-----+-----+--------+-----------+-----------+------------+------------+------------+---------------+-----------------+----------------+---------------+--------------+-----------+------------+-----------+---------------+---------------------+-------------------+-------------+-----------------------+------------------+------------+--------------+---------------+-----------------+---------------------+--------------------+--------------+------------------+-----------+-----------+-------------+------------+---------+---------+----------+---------------------+---------------------+------+
   | WatchID             | JavaEnable | Title         | GoodEvent | EventTime  
| EventDate | CounterID | ClientIP   | RegionID | UserID               | 
CounterClass | OS | UserAgent | URL                                   | Referer 
| IsRefresh | RefererCategoryID | RefererRegionID | URLCategoryID | URLRegionID 
| ResolutionWidth | ResolutionHeight | ResolutionDepth | FlashMajor | 
FlashMinor | FlashMinor2 | NetMajor | NetMinor | UserAgentMajor | 
UserAgentMinor | CookieEnable | JavascriptEnable | IsMobile | MobilePhone | 
MobilePhoneModel | Params | IPNetworkID | TraficSourceID | SearchEngineID | 
SearchPhrase | AdvEngineID | IsArtifical | WindowClientWidth | 
WindowClientHeight | ClientTimeZone | ClientEventTime | SilverlightVersion1 | 
SilverlightVersion2 | SilverlightVersion3 | SilverlightVersion4 | PageCharset | 
CodeVersion | IsLink | IsDownload | IsNotBounce | FUniqID | OriginalURL         
                                       | HID       | IsOldCounter | IsEvent | 
IsParameter | DontC
 ountHits | WithHash | HitColor | LocalEventTime | Age | Sex | Income | 
Interests | Robotness | RemoteIP   | WindowName | OpenerName | HistoryLength | 
BrowserLanguage | BrowserCountry | SocialNetwork | SocialAction | HTTPError | 
SendTiming | DNSTiming | ConnectTiming | ResponseStartTiming | 
ResponseEndTiming | FetchTiming | SocialSourceNetworkID | SocialSourcePage | 
ParamPrice | ParamOrderID | ParamCurrency | ParamCurrencyID | 
OpenstatServiceName | OpenstatCampaignID | OpenstatAdID | OpenstatSourceID | 
UTMSource | UTMMedium | UTMCampaign | UTMContent | UTMTerm | FromTag | HasGCLID 
| RefererHash         | URLHash             | CLID |
   
+---------------------+------------+---------------+-----------+------------+-----------+-----------+------------+----------+----------------------+--------------+----+-----------+---------------------------------------+---------+-----------+-------------------+-----------------+---------------+-------------+-----------------+------------------+-----------------+------------+------------+-------------+----------+----------+----------------+----------------+--------------+------------------+----------+-------------+------------------+--------+-------------+----------------+----------------+--------------+-------------+-------------+-------------------+--------------------+----------------+-----------------+---------------------+---------------------+---------------------+---------------------+-------------+-------------+--------+------------+-------------+---------+------------------------------------------------------------+-----------+--------------+---------+-------------+------
 
---------+----------+----------+----------------+-----+-----+--------+-----------+-----------+------------+------------+------------+---------------+-----------------+----------------+---------------+--------------+-----------+------------+-----------+---------------+---------------------+-------------------+-------------+-----------------------+------------------+------------+--------------+---------------+-----------------+---------------------+--------------------+--------------+------------------+-----------+-----------+-------------+------------+---------+---------+----------+---------------------+---------------------+------+
   | 9153127107923182022 | 1          | Участи NEWSru | 1         | 1373034098 
| 15891     | 225510    | 1703485140 | 2        | -6224091410790412093 | 0      
      | 2  | 3         | http://liver.ru/belgorod/page=1024&wi |         | 0    
     | 0                 | 0               | 14328         | 22          | 2038 
           | 730              | 23              | 15         | 2          | 502 
        | 0        | 0        | 5              | D�             | 1            
| 1                | 0        | 0           |                  |        | 
4168741     | 0              | 0              |              | 0           | 0  
         | 1058              | 549                | 135            | 2035708370 
     | 0                   | 0                   | 0                   | 0      
             | windows     | 1601        | 0      | 0          | 0           | 
0       | http://video.yandex.ru/uglichnevyj-97442434830%20%D1%8C%20 | 
298722980 | 0            | 0       | 0          
  | 0             | 0        | 5        | 1373021451     | 0   | 0   | 0      | 
0         | 0         | 1961866254 | -1         | -1         | -1            | 
S0              | h1             |               |              | 0         | 0 
         | 0         | 0             | 0                   | 0                 
| 0           | 0                     |                  | 0          |         
     | NH            | 0               |                     |                  
  |              |                  |           |           |             |     
       |         |         | 0        | -296158784638538920 | 
7011450103338277684 | 0    |
   
+---------------------+------------+---------------+-----------+------------+-----------+-----------+------------+----------+----------------------+--------------+----+-----------+---------------------------------------+---------+-----------+-------------------+-----------------+---------------+-------------+-----------------+------------------+-----------------+------------+------------+-------------+----------+----------+----------------+----------------+--------------+------------------+----------+-------------+------------------+--------+-------------+----------------+----------------+--------------+-------------+-------------+-------------------+--------------------+----------------+-----------------+---------------------+---------------------+---------------------+---------------------+-------------+-------------+--------+------------+-------------+---------+------------------------------------------------------------+-----------+--------------+---------+-------------+------
 
---------+----------+----------+----------------+-----+-----+--------+-----------+-----------+------------+------------+------------+---------------+-----------------+----------------+---------------+--------------+-----------+------------+-----------+---------------+---------------------+-------------------+-------------+-----------------------+------------------+------------+--------------+---------------+-----------------+---------------------+--------------------+--------------+------------------+-----------+-----------+-------------+------------+---------+---------+----------+---------------------+---------------------+------+
   1 row in set. Query took 0.121 seconds.
   ```
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   I think the trick is intercepting requested URLs / references in 
[`DynamicFileCatalog`](https://github.com/apache/arrow-datafusion/blob/dfb6435e16cf4cfd5245c84dd6e18fcf96ac72f2/datafusion-cli/src/catalog.rs#L33)
 and calling the appropriate object store registration function (e.g. what is 
in  https://github.com/apache/arrow-datafusion/pull/9064 )
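
   To make the idea concrete, here is a rough, standalone sketch of the interception step. It does not use the real DataFusion or `object_store` APIs: `TableSource`, `classify`, and `StoreRegistry` are hypothetical names invented for illustration, and the "registration" is just a set of `scheme://authority` strings standing in for whatever the catalog would actually call. It only shows the decision the catalog would have to make: detect a URL-style table reference and register a store for it once, before the lookup falls through to the existing local-file handling.

```rust
use std::collections::HashSet;

/// Hypothetical classification of a quoted table reference.
#[derive(Debug, PartialEq)]
enum TableSource<'a> {
    /// A URL that needs an object store registered before it can be read.
    Remote { scheme: &'a str, authority: &'a str },
    /// Anything else is treated as a local path.
    Local(&'a str),
}

/// Assumption for this sketch: any non-`file` scheme counts as remote.
fn is_remote_scheme(scheme: &str) -> bool {
    !scheme.is_empty()
        && scheme != "file"
        && scheme
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '+' || c == '-')
}

/// Split a reference such as `https://host/path` into scheme and authority.
fn classify(table_ref: &str) -> TableSource<'_> {
    match table_ref.split_once("://") {
        Some((scheme, rest)) if is_remote_scheme(scheme) => {
            // `split` always yields at least one item, so this never panics.
            let authority = rest.split('/').next().unwrap_or("");
            TableSource::Remote { scheme, authority }
        }
        _ => TableSource::Local(table_ref),
    }
}

/// Stand-in for the real object store registry (hypothetical).
struct StoreRegistry {
    registered: HashSet<String>,
}

impl StoreRegistry {
    fn new() -> Self {
        Self { registered: HashSet::new() }
    }

    /// Returns `true` only when this call registered a new store.
    fn ensure_registered(&mut self, table_ref: &str) -> bool {
        match classify(table_ref) {
            TableSource::Remote { scheme, authority } => {
                self.registered.insert(format!("{scheme}://{authority}"))
            }
            TableSource::Local(_) => false,
        }
    }
}
```

   In a real change, the catalog's table lookup would presumably run a check like `ensure_registered` on the requested name first, and the body of that function would call the same registration helper used by the `COPY` path.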

