[
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17387627#comment-17387627
]
ASF GitHub Bot commented on DRILL-7871:
---------------------------------------
paul-rogers commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-887041732
@vdiravka, I had an opportunity to get a bit more background info for this
project. I am going out of my way to try to facilitate this PR; normally we'd
require that the PR author provide this information so that we reviewers can
simply review the code, and not have to reverse engineer requirements and
design.
Sounds like the requirements are for a very specific light-weight
multi-tenant model: one that allows tenants to set options, create storage
plugin configs, and run queries, but not access any other part of Drill.
Tenants are to be trusted to not make mistakes. Specifically:
* A *tenant* has a set of "system" options (maybe call them *tenant
options*) available when that tenant creates a Drill session.
* A tenant can define a set of storage plugin configs which are visible to
*only* that tenant. Perhaps call these *tenant plugin configs*.
* A tenant can run queries that use the tenant options and tenant plugin
configs.
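To make the three requirements above concrete, here is a minimal sketch, assuming a simple in-memory store keyed by tenant name. Every name in it (`TenantScopedStore`, `setOption`, `putPlugin`, and so on) is hypothetical and is not part of Drill or of this PR; persistence and synchronization across Drillbits are deliberately left out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these class or method names exist in Apache Drill.
final class TenantScopedStore {
  // System-wide option defaults, read by all tenants but never modified by them.
  private final Map<String, String> systemDefaults;
  // tenant -> (option name -> value): the "tenant options".
  private final Map<String, Map<String, String>> tenantOptions = new ConcurrentHashMap<>();
  // tenant -> (plugin name -> config text): the "tenant plugin configs".
  private final Map<String, Map<String, String>> tenantPlugins = new ConcurrentHashMap<>();

  TenantScopedStore(Map<String, String> systemDefaults) {
    this.systemDefaults = new HashMap<>(systemDefaults);
  }

  void setOption(String tenant, String name, String value) {
    tenantOptions.computeIfAbsent(tenant, t -> new ConcurrentHashMap<>()).put(name, value);
  }

  void putPlugin(String tenant, String pluginName, String config) {
    tenantPlugins.computeIfAbsent(tenant, t -> new ConcurrentHashMap<>()).put(pluginName, config);
  }

  // Effective value when the tenant creates a session: tenant override, else system default.
  Optional<String> option(String tenant, String name) {
    String override = tenantOptions.getOrDefault(tenant, Collections.emptyMap()).get(name);
    return Optional.ofNullable(override != null ? override : systemDefaults.get(name));
  }

  // A plugin config is visible only to the tenant that created it.
  Optional<String> plugin(String tenant, String pluginName) {
    return Optional.ofNullable(
        tenantPlugins.getOrDefault(tenant, Collections.emptyMap()).get(pluginName));
  }
}
```

Lookups always take the tenant as an argument, so one tenant cannot see another tenant's options or plugin configs by accident.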
This use case is limited compared to the normal multi-tenant requirements.
The following appear to be restrictions for this project:
* A tenant does not have access to the Drill Web Console or the Drill REST
API and thus does not have access to query profiles.
* A tenant does not have access to Zookeeper or the Drill native API.
Queries sent by the tenant must go through an intermediate software layer
provided by the service provider.
* A tenant does not have access to Drill logs to diagnose failed queries.
The above restrictions say that the feature is not useful for open source
Drill users who use the Drill-provided UI and APIs. This makes the feature of
very limited appeal to the Drill community. So, one of our challenges is to
design the feature in a way that users of the "out-of-the-box" Drill can
benefit.
Additional restrictions for this one use case:
* A tenant cannot start, stop or restart Drillbits, nor can they change
startup properties.
* A tenant cannot upload a UDF nor can a tenant provide custom *connectors*
(storage plugin classes). (Note that
[DRILL-7916](https://github.com/apache/drill/pull/2215) is working at
cross-purposes to this PR.)
* Tenants are trusted to not change system-wide performance-related options
(queueing, resource allocation, etc.). If those options are changed, the
resulting behavior is undefined and must be dealt with by the service
provider.
* No provision for the Drill admin to view or modify tenant options or
plugin configs. If such behavior is desired, a service provider must write
tools that work with Drill's persistent storage.
* Tenants are trusted to not consume excess resources, so no resource
isolation between tenants. Tenant A might try to sort a trillion rows, which
might deny resources to other tenants.
* Tenants cannot (?) create views or a metadata store.
* Parquet metadata caching is either unsupported (?) or must be written to
the tenant's S3 bucket; Drill provides no storage for the metadata.
The above restrictions limit the solution, but leave the door open to
eventually providing more general multi-tenant support.
A final question is the relation between *tenant* and *user*. This PR
assumes that they are identical: that "fred" is either a normal Drill user in
"normal mode", or a tenant in "tenant mode." That is, each tenant has a single
Drill user (which works in this use case because of the intermediate software
layer). This explains why this PR is labeled as "instances for different
*users*", though the discussion has revealed the goal to be "instances for
different *tenants*."
Since the "tenant = user" model applies to only this one use case, it again
is not a general enough feature to add to the Apache Drill code. Instead,
Apache Drill must provide a generally useful solution.
Prior notes have explained how a "per-user" model should work (it requires
sharing between users). Specifically, a "config per user" solution must
recognize that users work on a team, allow sharing, and permit admin abilities.
Similarly, an "options per user" solution must persist only a subset of options
(those which are neither system nor per-query options), and must solve the
synchronization issue.
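As an illustration of the "persist only a subset" point, here is a minimal sketch, assuming each option is tagged with a kind that says who may set it. The `Kind` enum and the `UserOptionFilter` class are hypothetical and do not correspond to Drill's actual option classes; the synchronization issue is not addressed here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: these names are illustrative and are not Drill APIs.
final class UserOptionFilter {
  // Assumed classification of options by who is allowed to set them.
  enum Kind { SYSTEM_ONLY, USER_SETTABLE, QUERY_ONLY }

  // Returns only the requested options that are safe to persist per user:
  // those that are neither system-only nor per-query. Unknown options are dropped.
  static Map<String, String> persistable(Map<String, String> requested,
                                         Map<String, Kind> kinds) {
    Map<String, String> result = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : requested.entrySet()) {
      if (kinds.get(e.getKey()) == Kind.USER_SETTABLE) {
        result.put(e.getKey(), e.getValue());
      }
    }
    return result;
  }
}
```

Dropping unknown options, rather than persisting them, is a conservative choice made for this sketch only.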
Notes have also explained that the customary definition of "tenant" is an
organization with multiple users. A true multi-tenant solution must allow
multiple users per tenant, and provide a path toward full tenant isolation
later.
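The difference from the "tenant = user" model can be shown with a minimal sketch, assuming a directory that maps each user to exactly one tenant. `TenantDirectory` and its methods are hypothetical names, not part of Drill; the sketch only shows that visibility would be resolved through the tenant rather than through the user name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: a tenant is an organization with multiple users.
final class TenantDirectory {
  // user name -> tenant name; each user belongs to at most one tenant.
  private final Map<String, String> tenantByUser = new HashMap<>();

  void addUser(String tenant, String user) {
    tenantByUser.put(user, tenant);
  }

  Optional<String> tenantOf(String user) {
    return Optional.ofNullable(tenantByUser.get(user));
  }

  // Two users may share options and plugin configs only if both
  // resolve to the same known tenant.
  boolean canShare(String userA, String userB) {
    Optional<String> a = tenantOf(userA);
    return a.isPresent() && a.equals(tenantOf(userB));
  }
}
```

With "fred" and "wilma" both added to tenant "acme", `canShare("fred", "wilma")` is true, while a user of any other tenant, or an unknown user, is excluded.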
The challenge here is to find a design that balances the very specific,
ad-hoc, unusual needs of this one use case, with something that can evolve to
become of general use to the Drill community.
--
> StoragePluginStore instances for different users
> ------------------------------------------------
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
> Issue Type: New Feature
> Components: Security
> Affects Versions: 1.18.0
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Priority: Major
>
> Different users should have their own storage plugin configs so that they can
> access only their own storage. The feature can be based on the Drill User
> Impersonation model.