Paul Rogers created DRILL-5182:
----------------------------------
Summary: Storage plugins have no concept of end-of-query to
release resources
Key: DRILL-5182
URL: https://issues.apache.org/jira/browse/DRILL-5182
Project: Apache Drill
Issue Type: Improvement
Affects Versions: 1.8.0
Reporter: Paul Rogers
At planning time, storage plugins have two concepts of scope: global, or local
to a specific operator instance. Other JIRAs mention the problems this causes
for security. Here, let us focus on caching.
Suppose we create a storage plugin to access a remote system. To plan the
query, we have to ask whether a given "table" exists, whether the selected
columns are valid, whether the user can access the table, how many minor
fragments can be created, and so on.
Suppose the target service is a REST service such as Oracle Sales Cloud or any
other business service. We certainly do not want to keep sending REST metadata
requests to answer each of the above questions. And, for something like Oracle,
the answer to the questions depends on the user. (Oracle provides both table
and column-level security.)
Instead, we want to hit the service with one metadata request. ("Get me the
information for the 'customer' table.") Then, cache that info. This, then, is
the issue: how does a storage plugin know who the user is and when to release
the cache?
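A minimal sketch of the "one metadata request, then cache" idea, assuming a cache keyed on both user and table (since the answer depends on the user). TableMetadataCache, remoteFetch and release() are hypothetical names for illustration only; nothing here is part of Drill's storage plugin API.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

// Hypothetical helper, not a Drill class. M is whatever the plugin uses
// to represent the metadata returned by the remote service.
final class TableMetadataCache<M> {
  // Key is (user, table): two users may see different columns for the same table.
  private final Map<SimpleImmutableEntry<String, String>, M> entries =
      new ConcurrentHashMap<>();

  // Assumed callback that performs the single remote (e.g. REST) metadata request.
  private final BiFunction<String, String, M> remoteFetch;

  TableMetadataCache(BiFunction<String, String, M> remoteFetch) {
    this.remoteFetch = remoteFetch;
  }

  // Returns cached metadata, calling the remote service only on the first
  // request for a given (user, table) pair.
  M get(String user, String table) {
    return entries.computeIfAbsent(
        new SimpleImmutableEntry<>(user, table),
        key -> remoteFetch.apply(key.getKey(), key.getValue()));
  }

  // Drops all cached entries. The open question in this ticket is who calls
  // this, and when.
  void release() {
    entries.clear();
  }
}
```

With this shape, every planning question about the 'customer' table for a given user is answered from a single fetch. What is missing today is a signal telling the plugin when release() is safe to call.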
The storage plugin API should provide a query context along with events that
say, "new use of a query context for this plugin" and "query planning done for
this context". This allows the plugin to create an object that represents the
query planning session, hold cached information for that scope, and release
resources at the completion of the scope.
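One possible shape for such events, as a rough sketch only. PlanningSession, PlanningLifecycleListener, onPlanningStarted and onPlanningDone are invented names for this illustration and do not exist in Drill; the real API would need to fit the existing plugin and planner classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical: plugin state scoped to the planning of one query for one user.
final class PlanningSession {
  final String queryId;
  final String userName;
  private final Map<String, Object> cache = new HashMap<>();
  private boolean closed;

  PlanningSession(String queryId, String userName) {
    this.queryId = queryId;
    this.userName = userName;
  }

  void put(String key, Object value) {
    if (closed) {
      throw new IllegalStateException("planning session already closed");
    }
    cache.put(key, value);
  }

  Object get(String key) {
    return cache.get(key);
  }

  // Releases everything cached for this scope.
  void close() {
    cache.clear();
    closed = true;
  }

  boolean isClosed() {
    return closed;
  }
}

// Hypothetical callbacks the planner would invoke on a storage plugin.
interface PlanningLifecycleListener {
  // "New use of a query context for this plugin."
  PlanningSession onPlanningStarted(String queryId, String userName);

  // "Query planning done for this context."
  void onPlanningDone(PlanningSession session);
}

// Example plugin-side implementation that tracks its open sessions.
final class CachingPlugin implements PlanningLifecycleListener {
  private final Map<String, PlanningSession> active = new ConcurrentHashMap<>();

  @Override
  public PlanningSession onPlanningStarted(String queryId, String userName) {
    PlanningSession session = new PlanningSession(queryId, userName);
    active.put(queryId, session);
    return session;
  }

  @Override
  public void onPlanningDone(PlanningSession session) {
    active.remove(session.queryId);
    session.close();
  }

  int activeSessions() {
    return active.size();
  }
}
```

Because the session carries the user name, the plugin knows whose permissions apply; because the planner signals completion, cached data lives no longer than the planning of that one query.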