cgivre opened a new pull request #2496:
URL: https://github.com/apache/drill/pull/2496


   # [DRILL-8169](https://issues.apache.org/jira/browse/DRILL-8169): Add UDFs 
to HTTP Plugin to Facilitate Joins
   
   ## Description
   
   (Please describe the change. If more than one ticket is fixed, include a 
reference to those tickets.)
   
   ## Documentation
   There are some situations where a user might want to join data with an API 
result, but the pushdowns prevent that from happening.  The main case is when 
an API has parameters that are part of the URL and those parameters are 
populated dynamically via a join. 
   
   In this case, there are two functions, `http_get_url` and `http_get`, which 
you can use to facilitate these joins. 
   
   * `http_get('<storage_plugin_name>', <params>)`:  This function accepts a 
storage plugin endpoint (for example `github.repos`) as input and an optional 
list of parameters to include in the URL.
   * `http_get_url(<url>, <params>)`:  This function works in the same way, 
except that it takes the URL directly and does not pull any configuration 
information from existing storage plugins.
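
As a rough illustration of the parameter substitution described above, the following Python sketch fills `{name}` placeholders in a URL template with positional values. It is not the plugin's code: the helper `fill_url_template` is hypothetical, and the left-to-right positional matching and percent-encoding of values are assumptions about how `<params>` map onto placeholders.

```python
import re
from urllib.parse import quote


def fill_url_template(template, *params):
    """Replace each {placeholder} in template with the next positional value.

    Hypothetical helper for illustration only. Left-to-right positional
    matching is an assumption, not confirmed behavior of the Drill UDFs.
    """
    values = iter(params)

    def substitute(match):
        try:
            # Percent-encode the value so it is safe inside a URL path segment.
            return quote(next(values), safe="")
        except StopIteration:
            # No value left: leave the placeholder untouched.
            return match.group(0)

    return re.sub(r"\{[^{}]+\}", substitute, template)


url = fill_url_template("https://github.com/orgs/{org}/repos", "apache")
print(url)
```

In this sketch a placeholder with no matching value is left in place rather than raising an error; how the actual functions handle that case is not stated in this description.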
   
   ### Example Queries
   Let's say that you have a storage plugin called `github` with an endpoint 
called `repos` which points to the URL https://github.com/orgs/{org}/repos.  
It is easy enough to write a query like this:
   
   ```sql
   SELECT * 
   FROM github.repos
   WHERE org='apache'
   ```
   However, if you had a file with organizations and wanted to join this with 
the API, the query would fail.  Using the functions listed above you could get 
this data as follows:
   
   ```sql
   SELECT http_get('github.repos', `org`)
   FROM dfs.`some_data.csvh`
   ```
   or
   ```sql
   SELECT http_get_url('https://github.com/orgs/{org}/repos', `org`)
   FROM dfs.`some_data.csvh`
   ```
   
   **WARNING:  This functionality will execute an HTTP Request FOR EVERY ROW IN 
YOUR DATA.  Use with caution.**
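
To make that cost concrete, here is a small Python sketch that counts how many calls a per-row function would make for a given input, compared with one call per distinct value. The sample rows and the `requests_needed` helper are invented for illustration, and no network traffic is generated. Whether de-duplicating the input before calling the function is practical in a given Drill query is an assumption, not something this PR states.

```python
def requests_needed(rows, dedupe=False):
    """Count HTTP calls: one per row, or one per distinct org when dedupe is set.

    Hypothetical helper; it only counts and never issues a request.
    """
    orgs = [row["org"] for row in rows]
    return len(set(orgs)) if dedupe else len(orgs)


# Made-up sample input standing in for the rows of a file such as some_data.csvh.
rows = [
    {"org": "apache"},
    {"org": "apache"},
    {"org": "github"},
    {"org": "apache"},
]

print(requests_needed(rows))               # one call per row
print(requests_needed(rows, dedupe=True))  # one call per distinct org
```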
   
   ## Testing
   Added unit tests.

