[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17720965#comment-17720965
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

cgivre closed pull request #2251: DRILL-7871 StoragePluginStore instances for 
different users
URL: https://github.com/apache/drill/pull/2251




> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2022-02-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17496722#comment-17496722
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

jnturton commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-1048739204


   Converting to draft while there is still discussion about how we approach 
the design.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387634#comment-17387634
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-887063203


   @vdiravka, here I will suggest the broad outline of a design which meets the 
very limited goals of this PR while also providing a path forward for a 
generally useful set of Apache Drill features.
   
   We introduce one new concept, the *tenant*. Key concepts:
   
   * A tenant is a *name space* for options and plugin configs. The tenant 
level is optional; by default, no tenants exist.
   * Each user has an optional associated *tenant ID*. A user with no tenant ID 
works at the system level. (Example: the Drill admin, or a user who sets up 
demo configs shared by all users.) A user with an associated tenant ID gives 
the tenant name space to use. There may be 0, 1 or many users associated with a 
tenant.
   * A persistent *tenant options* name space sits between the system options 
and the session options in the option hierarchy. Users "inherit" options from 
the system layer, unless overridden at the tenant layer, unless overridden at 
the session layer.
   * Queries and plugin config-related API calls resolve plugin configs within 
the *tenant plugin config name space*. On lookups, if the config is not found 
in the tenant name space, repeat the search in the global name space. (Allows 
system-wide configs for such things as demo data.)
   
   Now we can decide how much of the above to implement in this PR, leaving the 
rest to future PRs. Given your goals, it would seem the minimum is:
   
   * Provide an API or other means to define tenants and to associate tenants 
with users.
   * Provide documentation for how to inspect or remove per-tenant name spaces 
(e.g. if the tenant cancels service or is moved to another cluster.)
   * Tenant membership is enforced for the native and REST query APIs.
   * Tenant users can see the Drill Console, but features that are not 
"tenant-ready" should be disabled. That is, a tenant user might only be able to 
run a query, and see their options, but not see query profiles, etc.
   
   SQL semantics:
   
   * The `ALTER SYSTEM` command affects the tenant option store given by the 
user's tenant ID, if any. *System* (e.g. server-wide) options are set using the 
same command issued by a user with no tenant ID.
   * System tables are not available to users associated with a tenant ID. 
(Later, this can be modified to filter system table contents to include only 
information for that tenant.)
   
   An possible code design might be:
   
   * Add a tenant registry which can be as simple as the known, valid tenant 
IDs. This "registry" can simply be an API, with a default implementation that 
does nothing.
   * Add a field to the user to hold the tenant ID. When resolving a user name, 
obtain the tenant ID along with the admin/normal-user access level.
   * Unfortunately, Drill does not have a `User` object, we assume the only 
information for a user is the user name. Define a new `User` object which holds 
the user ID and tenant ID. (Ideally, it would also hold the security level, 
etc.)
   * We know that the system must determine if the user is an admin. Extend 
that mechanism to also provide the tenant ID. Modify that code to provide a new 
method which returns the `User` object. A storage plugin manager client can 
either use an existing `User` object, or obtain a new one by calling to this 
new `User` factory method.
   * Introduce the tenant option manager which acts like the system option 
manager, and persists to a location that includes the tenant ID in the name. 
Persistence is to a location which includes the tenant ID in the name.
   * When a session starts, if the user has a tenant ID, add the tenant option 
manager to the session option manager stack. This one change should ensure that 
the option solution "just works" everywhere else.
   * Modify the storage plugin manager to require a `User` object for every 
config CRUD method. The `User` says which tenant store to read from or write to.
   * The storage plugin registry currently uses keys to locate configs. This 
key is extended with the tenant ID. Modify config resolution to use the tenant 
ID. If the config is not found in the tenant store, or the tenant ID is not set 
for the user, repeat the search with the default "system" config store. When 
modifying configs, write to the persistent store given by the tenant ID.
   * Modify all clients of the storage plugin manager to pass along the `User` 
object, creating one if needed.
   * System tables do not have an associated config. For system tables (only), 
enforce the rule above: if a plugin lookup resolves to the system plugin, and 
the user has an associated tenant ID, act as if the plugin is not 

[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387627#comment-17387627
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-887041732


   @vdiravka, I had an opportunity to get a bit more background info for this 
project. I am going out of my way to try to facilitate this PR; normally we'd 
require that the PR author provide this information so that us reviews can 
simply review the code, and not have to reverse engineer requirements and 
design.
   
   Sounds like the requirements are for a very specific light-weight 
multi-tenant model: one that allows tenants to set options, create storage 
plugin configs, and run queries, but not access any other part of Drill. 
Tenants are to be trusted to not make mistakes. Specifically:
   
   * A *tenant* has a set of "system" options (maybe call them *tenant 
options*) available when that tenant creates a Drill session.
   * A tenant can define a set of storage plugin configs which are visible to 
*only* that tenant. Perhaps call these *tenant plugin configs*.
   * A tenant can run queries that use the tenant options and tenant plugin 
configs.
   
   This use case is limited compared to the normal multi-tenant requirements. 
The following appear to be restrictions for this project:
   
   * A tenant does not have access to the Drill Web Console or the Drill REST 
API and thus does not have access to query profiles.
   * A tenant does not have access to Zookeeper or the Drill native API. 
Queries sent by the tenant must go through an intermediate software layer 
provided by the service provider.
   * A tenant does not have access to Drill logs to diagnose failed queries.
   
   The above restrictions say that the feature is not useful for open source 
Drill users who use the Drill-provided UI and APIs. This makes the feature of 
very limited appeal to the Drill community. So, one of our challenges is to 
design the feature in a way that users of the "out-of-the-box" Drill can 
benefit.
   
   Additional restrictions for this one use case:
   
   * A tenant cannot start, stop or restart Drillbits, nor can they change 
startup properties.
   * A tenant cannot upload a UDF nor can a tenant provide custom *connectors* 
(storage plugin classes). (Note that 
[DRILL-7916](https://github.com/apache/drill/pull/2215) is working at 
cross-purposes to this PR.)
   * Tenants are trusted to not change system-wide performance-related options 
(queueing, resource allocation, etc.) The resulting behavior, if those options 
are changed, is undefined and must be dealt with by the service provider if 
they occur.
   * No provision for the Drill admin to view or modify tenant options or 
plugin configs. If such behavior is desired, a service provider must write 
tools that work with Drill's persistent storage.
   * Tenants are trusted to not consume excess resources, so no resource 
isolation between tenants. Tenant A might try to sort a trillion rows, which 
might deny resources to other tenants.
   * Tenants cannot (?) create views or a metadata store.
   * Parquet metadata caching is either unsupported (?) or must be written to 
the tenant's S3 bucket; Drill provides no storage for the metadata.
   
   The above limit the solution, but leave the door open to eventually 
providing more general multi-tenant support.
   
   A final question is the relation between *tenant* and *user*. This PR 
assumes that they are identical: that "fred" is either a normal Drill user in 
"normal mode", or a tenant in "tenant mode." That is, each tenant has a single 
Drill user (which works in this use case because of the intermediate software 
layer.) This explains why this PR is labeled as "instances for different 
*users*", the the discussion has revealed the goal to be "instances for 
different *tenants*."
   
   Since the "tenant = user" model applies to only this one use case, it again 
is not a general enough feature to add to the Apache Drill code. Instead, 
Apache Drill must provide a generally useful solution. 
   
   Prior notes have explained how a "per-user" model should work (it requires 
sharing between users). Specifically a, "config per user" solution must 
recognize that users work on a team, allow sharing, and permit admin abilities. 
Similarly, an "options per user" solution must persist only a subset of options 
(those which are neither system nor per-query options), and must solve the 
synchronization issue.
   
   Notes have also explained that the customary definition of "tenant" is an 
organization with multiple users. A true multi-tenant solution must allow 
multiple users per tenant, and provide a path toward full tenant isolation 
later.
   
   The challenge here is to find a design that balances the very specific, 
ad-hoc, unusual needs of this one use 

[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386562#comment-17386562
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers edited a comment on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-88591


   @vdiravka, thanks for the explanation. Let's be careful not to confuse 
existing implementation with new requirements. Yes, there is a comment that 
says that the system options are persistent and session options are not. Be 
careful to read this correctly. *At the time the comment was written*, system 
options were the only persistent options. System options, by definition, have 
to be persistent. The comment did not *forbid* session options from being 
persistent; it simply observes that, at that time, for the reasons we 
discussed, session options were not persistent. Nor does the comment say that, 
if you add persistent user-level options, that they have to be system options 
because only system options are persistent: you are free to add persistence to 
session options.
   
   Thanks splitting this PR into two: one for options, another for plugins. 
Else, discussion becomes overly complex.
   
   *Before* we worry about code details, we have to get the semantics right. 
After writing the comment on plugins, below, I realize that we are probably 
using the same words for entirely different ideas. Your goal is to add 
multi-tenant support to Drill. You uses the term "user options". I now realize 
you probably mean "tenant options."
   
   If we are doing per-user options (which is how I first read your 
description), then the requirements are one thing. If we want per-tenant 
options, the requirements are different. Let's list both scenarios: per-user 
options and per-tenant options. Please let us know which you are trying to 
build (or if you are tying to build something else entirely.)
   
   ## Per-User Persistent Options
   
   Suppose the goal is to let users (individual named humans) persist options. 
We assume that system-wide settings are set as system options. We are simply 
avoiding the need for the user to configure their own personal options on each 
login.
   
   Users will want to persist only *some* of their session options. We 
discussed that this may not work as one might think: we have to think through 
the issues. The point was that concurrency of user options will be a mess if 
they work like system options: if changes to user/session options are written 
to persistent storage immediately. This will screw up queries as I persist 
"all-text-mode" in one session, which breaks a query in another. Point is, user 
options *must* have semantics different from system options.
   
   * We need a concurrency guarantee. One choice is "options are propagated to 
all active sessions with unknown latency, just like system options." Another, 
more dependable, is "session options are initialized on session startup, after 
which they do not reflect changes made in other sessions." Whatever it is, the 
definition has to be stable and explainable.
   * Some options apply to queries: we've been using "all-text-mode" as an 
example. Such options will be set frequently. Since these options are 
per-query, we cannot set the option across all user sessions immediately, as 
that will break queries. Instead, the user has to *tell us* when they want to 
persist that setting. Otherwise the setting is per-session (that is, per-query.)
   
   Is it a bad thing that the behavior of queries depends on session options? 
Certainly. That was a poor design choice. But, it is how Drill works, and we 
have to design new features to reflect this choice.
   
   ## Per-Tenant Options
   
   Now, if the goal is to have "per-tenant" options, we have a much different 
problem. Each *tenant* is an organization which believes it has its own Drill 
cluster. That the cluster is shared is invisible to the tenant. The tenant 
reads in the docs that they can change system options, so they decide to do so. 
Those changes must persist only for that one *tenant* and all that tenant's 
users.
   
   I now see that this is probably what you are trying to implement. As a 
result, the semantics are closer to system options. For now, let's assume that 
you do not, in fact, need per-user (per-named human) persistent options.
   
   Let's list some possible requirements:
   
   * A bootstrap option enables multi-tenant support. The option is off by 
default, meaning that existing Drill behavior is unchanged by default.
   * A tenant is an organization with (a single user? multiple users?)
   * Each tenant has a *tenant admin* user who may set its own (subset of) 
system options. This subset is called *tenant preferences*. Such options 
persist, like session options, but apply only to sessions started for that 
tenant. Values set by one tenant cannot be seen by any other tenant.
   

[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386560#comment-17386560
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers edited a comment on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-885898568


   @vdiravka, on the plugins side, I think the confusion over "user" and 
"tenant" has us talking about different designs. Again, let's discuss per-user 
and per-tenant designs. Tell me which you are trying to build.
   
   Clarifying the desired tenant model may help with the options discussion. 
I've been equating "user" with "person". If your tenant model equates "user" 
with "tenant", you are actually trying to define per-tenant defaults. Perhaps 
update the docs to clearly say so. Let's use the term "tenant" for a business 
entity and "user" for a human, perhaps within that tenant.
   
   ## Per-User Plugin Configs
   
   The goal here is to allow users (humans) to create their own private plugin 
configs. Since the users work in a common organization, users may wish to share 
configs without copy/paste reuse.
   
   Requirements for this case:
   
   * Provide a bootstrap option to enable the new semantics. The default is 
off, meaning that Drill behavior will not change by default.
   * Retain global plugin configurations. (They allow multiple users to share 
the same configuration without copy/paste.)
   * Add per-user plugin configurations.
   * Users can add and change only their own per-user configurations. (No user 
can "share" a config with another user: either the config must be copied to a 
global config by the admin, or the second user must make their own copy.)
   * If a user configuration happens to have the same name as a global 
configuration, the per-user configuration takes precedence. (If the admin make 
a global version of a now-shared config, the user must delete his own copy in 
order to use the now-shared version.)
   * No non-admin user can see or change the configuration for another user. 
(That is, if two people work in the same group, and want to share a config, 
they must do so via copy/paste, or by asking the admin to make the plugin 
global to all users.)
   * No non-admin user can change a global configuration.
   * The admin user can see, add, modify or delete *any* plugin configuration, 
both global and per-user. (Without this, the system is not maintainable as the 
admin won't be able to manage per-user configurations.)
   
   ## Per-Tenant Plugin Configs
   
   The other way to read your comments is that you want plugin configs per 
*tenant* (business entity). In this case, as we said earlier, Tenant A should 
not know that there is a Tenant B on the same cluster. There is likely no 
shared storage or configs between tenants. The only exception would be 
system-level configs offered by the organization that owns the Drill cluster, 
such as the system tables. Maybe some demo data.
   
   Requirements in this case:
   
   * Provide a bootstrap option to enable multi-tenant support. The default is 
off, meaning that Drill behavior will not change by default.
   * A set of global plugin configuration is provided by, and can only be 
changed by, the server admin (as defined in the earlier note.) If no global 
plugins are to be provided, then delete all global plugin configs.
   * The system table is not controlled by a plugin config. If tenants are to 
be prohibited from using system tables, provide an option to disable system 
tables. (Or, a per a later note, ensure that system tables do not leak tenant 
information.)
   * Tenants see only their own configs. To the tenant, the per-tenant configs 
*are* the plugin configs.
   * UI and API operations that operate on configs, when run in multi-tenant 
mode, implicitly apply to the set only for that one tenant. That is, there is 
no new UI or API; the existing operations simply change meaning to apply to 
only the active tenant.
   * Every user (except the server admin) is associated with a tenant. If the 
user has permission to modify plugins, then all changes apply to only that one 
tenant. (Possible refinement: only the tenant admin, as defined earlier, can 
change tenant plugin configs.)
   * A tenant may (may not?) have multiple users. All tenant users see the same 
set of tenant plugins. There are no "per-user" plugins, only per-tenant plugins.
   * Tenants can add and change only their own per-tenant configurations. No 
tenant can "share" a config with another tenant.
   * If a tenant configuration happens to have the same name as a global 
configuration, the per-tenant configuration takes precedence.
   * No tenant can change a global configuration.
   * The server admin user can see, add, modify or delete *any* plugin 
configuration, both global and per-tenant. (Without this, the system is not 
maintainable as the admin won't be able to manage per-tenant 

[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386561#comment-17386561
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-885928024


   @vdiravka, let's think about your larger goal.  If the goal is a true 
multi-tenant model, in which tenants are distinct business entities (rather 
than different departments within a single entity), then we must also ensure 
each tenant has access to only their own query profiles. Will there be another 
PR for this? Is there a roll-up Jira ticket for all the issues involved in true 
multi-tenant operation?
   
   Also, the Drill UI assumes a single organization. If the UI is to be exposed 
to multi-tenant users (so they can monitor queries, see query profiles, etc.), 
then the UI must change. Tenants should not be able to see details, or change 
the state of, Drillbits. Session options should reflect tenant values. Probably 
other changes. Will there be a design or PR for this?
   
   Further, each tenant must have guarantees on resources. That is, Tenant A 
should not be able to run a huge query that denies resources sold to Tenant B. 
This is a **very hard** problem. If you don't solve the hard problem, the 
options and plugins are somewhat moot. The option and plugin features are 
handy, but do not, by themselves, give you multi-tenant support in Drill.
   
   System tables could "leak" information between tenants. Should system tables 
be disabled? What changes would be needed to ensure that no per-tenant 
information "leaks" to another tenant?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386541#comment-17386541
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers edited a comment on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-88591


   @vdiravka, thanks for the explanation. Let's be careful not to confuse 
existing implementation with new requirements. Yes, there is a comment that 
says that the system options are persistent and session options are not. Be 
careful to read this correctly. *At the time the comment was written*, system 
options were the only persistent options. System options, by definition, have 
to be persistent. The comment did not *forbid* session options from being 
persistent; it simply observes that, at that time, for the reasons we 
discussed, session options were not persistent. Nor does the comment say that, 
if you add persistent user-level options, that they have to be system options 
because only system options are persistent: you are free to add persistence to 
session options.
   
   I strongly recommend splitting this PR into two: one for options, another 
for plugins. Else, discussion becomes overly complex.
   
   *Before* we worry about code details, we have to get the semantics right. 
After writing the comment on plugins, below, I realize that we are probably 
using the same words for entirely different ideas. Your goal is to add 
multi-tenant support to Drill. You uses the term "user options". I now realize 
you probably mean "tenant options."
   
   If we are doing per-user options (which is how I first read your 
description), then the requirements are one thing. If we want per-tenant 
options, the requirements are different. Let's list both scenarios: per-user 
options and per-tenant options. Please let us know which you are trying to 
build (or if you are tying to build something else entirely.)
   
   ## Per-User Persistent Options
   
   Suppose the goal is to let users (individual named humans) persist options. 
We assume that system-wide settings are set as system options. We are simply 
avoiding the need for the user to configure their own personal options on each 
login.
   
   Users will want to persist only *some* of their session options. We 
discussed that this may not work as one might think: we have to think through 
the issues. The point was that concurrency of user options will be a mess if 
they work like system options: if changes to user/session options are written 
to persistent storage immediately. This will screw up queries as I persist 
"all-text-mode" in one session, which breaks a query in another. Point is, user 
options *must* have semantics different from system options.
   
   * We need a concurrency guarantee. One choice is "options are propagated to 
all active sessions with unknown latency, just like system options." Another, 
more dependable, is "session options are initialized on session startup, after 
which they do not reflect changes made in other sessions." Whatever it is, the 
definition has to be stable and explainable.
   * Some options apply to queries: we've been using "all-text-mode" as an 
example. Such options will be set frequently. Since these options are 
per-query, we cannot set the option across all user sessions immediately, as 
that will break queries. Instead, the user has to *tell us* when they want to 
persist that setting. Otherwise the setting is per-session (that is, per-query.)
   
   Is it a bad thing that the behavior of queries depends on session options? 
Certainly. That was a poor design choice. But, it is how Drill works, and we 
have to design new features to reflect this choice.
   
   ## Per-Tenant Options
   
   Now, if the goal is to have "per-tenant" options, we have a much different 
problem. Each *tenant* is an organization which believes it has its own Drill 
cluster. That the cluster is shared is invisible to the tenant. The tenant 
reads in the docs that they can change system options, so they decide to do so. 
Those changes must persist only for that one *tenant* and all that tenant's 
users.
   
   I now see that this is probably what you are trying to implement. As a 
result, the semantics are closer to system options. For now, let's assume that 
you do not, in fact, need per-user (per-named human) persistent options.
   
   Let's list some possible requirements:
   
   * A bootstrap option enables multi-tenant support. The option is off by 
default, meaning that existing Drill behavior is unchanged by default.
   * A tenant is an organization with (a single user? multiple users?)
   * Each tenant has a *tenant admin* user who may set its own (subset of) 
system options. This subset is called *tenant preferences*. Such options 
persist, like session options, but apply only to sessions started for that 
tenant. Values set by one tenant cannot be seen by any 

[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386527#comment-17386527
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-885898568


   @vdiravka, on the plugins side, let's please do list our requirements and 
design goals. Else, I worry that the comments I make get lost, and I will this 
have to repeat them over and over.
   
   Requirements as I extract them from your description:
   
   * Provide a bootstrap option to enable the new semantics. The default is 
off, meaning that Drill behavior will not change by default.
   * Retain global plugin configurations. (They allow multiple users to share 
the same configuration without copy/paste.)
   * Add per-user plugin configurations.
   * Users can add and change only their own per-user configurations. (No user 
can "share" a config with another user: either the config must be copied to a 
global config by the admin, or the second user must make their own copy.)
   * If a user configuration happens to have the same name as a global 
configuration, the per-user configuration takes precedence. (If the admin make 
a global version of a now-shared config, the user must delete his own copy in 
order to use the now-shared version.)
   * No non-admin user can see or change the configuration for another user. 
(That is, if two people work in the same group, and want to share a config, 
they must do so via copy/paste, or by asking the admin to make the plugin 
global to all users.)
   * No non-admin user can change a global configuration.
   * The admin user can see, add, modify or delete *any* plugin configuration, 
both global and per-user. (Without this, the system is not maintainable as the 
admin won't be able to manage per-user configurations.)
   
   Now, one can disagree with these requirements. For example, in a 
multi-tenant environment, we do not one *tenant* to see the plugins for another 
*tenant*. But, we might want all users in "Tenant A" to see a shared set of 
configs. Your description appears to assume that "Tenant" = "User": one user 
per tenant. However, a typical model is for each human in a tenant to have 
their own user account. Please spell out the desired semantics.
   
   *Aside*: Clarifying the desired tenant model may help with the options 
discussion. I've been equating "user" with "person". If your tenant model 
equates "user" with "tenant", so you are tying to define per-tenant defaults, 
then please clearly say so in the design. If you want per-tenant default, I 
would suggest a somewhat different design for options.
   
   Finally, if the goal is a true multi-tenant model, in which tenants are 
distinct business entities (rather than different departments within a single 
entity), then we must also ensure each tenant has access to only their own 
query profiles. Will there be another PR for this? Is there a roll-up Jira 
ticket for all the issues involved in true multi-tenant operation?
   
   Also, the Drill UI assumes a single organization. If the UI is to be exposed 
to multi-tenant users (so they can monitor queries, see query profiles, etc.), 
then the UI must change. Tenants should not be able to see details, or change 
the state of, Drillbits. Session options should reflect tenant values. Probably 
other changes. Will there be a design or PR for this?
   
   Further, each tenant must have guarantees on resources. That is, Tenant A 
should not be able to run a huge query that denies resources sold to Tenant B. 
This is a **very hard** problem. If you don't solve the hard problem, the 
options and plugins are somewhat moot. The option and plugin features are 
handy, but do not, by themselves, give you multi-tenant support in Drill.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386516#comment-17386516
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-88591


   @vdiravka, thanks for the explanation. Let's be careful not to confuse 
existing implementation with new requirements. Yes, there is a comment that 
says that the system options are persistent and session options are not. Be 
careful to read this correctly. *At the time the comment was written*, system 
options were the only persistent options. System options, by definition, have 
to be persistent. The comment did not *forbid* session options from being 
persistent; it simply observes that, at that time, for the reasons we 
discussed, session options were not persistent. Nor does the comment say that, 
if you add persistent user-level options, that they have to be system options 
because only system options are persistent: you are free to add persistence to 
session options.
   
   The gist of your requirement is to let users persist *some* of their session 
options. We discussed that this may not work as one might think: we have to 
think through the issues. The point was that concurrency of user options will 
be a mess if they work like system options: if changes to user/session options 
are written to persistent storage immediately. This will screw up queries as I 
persist "all-text-mode" in one session, which breaks a query in another. Point 
is, user options *must* have semantics different from system options.
   
   *Before* we worry about code details, we have to get the semantics right. To 
repeat:
   
   * We need a concurrency guarantee. One choice is "options are propagated to 
all active sessions with unknown latency, just like system options." Another, 
more dependable, is "session options are initialized on session startup, after 
which they do not reflect changes made in other sessions." Whatever it is, the 
definition has to be stable and explainable.
   * Some options apply to queries: we've been using "all-text-mode" as an 
example. Such options will be set frequently. Since these options are 
per-query, we cannot set the option across all user sessions immediately, as 
that will break queries. Instead, the user has to *tell us* when they want to 
persist that setting. Otherwise the setting is per-session (that is, per-query.)
   
   Is it a bad thing that the behavior of queries depends on session options? 
Certainly. That was a poor design choice. But, it is how Drill works, and we 
have to design new features to reflect this choice.
   
   The proposal I outlined is the simplest way to achieve with concurrency 
semantics we could actually support. Feel free to suggest a better design; but 
let's discuss that design *before* we reduce it to code. Either way, the design 
has to reflect our requirements and constraints, not just follow existing code.
   
   I strongly recommend splitting this PR into two: one for options, another 
for plugins. Else, discussion becomes overly complex.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384113#comment-17384113
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

vdiravka commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-882856859


   * **Motivation:**
   
   The goal is to allow multi-tenant operations. So it doesn't matter what API 
user uses: REST, JDBC ot ODBC. This is true that Drill initially wasn't 
designed for that, but we ca add it.
   
   * **Options:** 
   > The user has to be able to set the user options. We don't need a new 
option level. Simply add a new statement: ALTER USER > SET =. This 
would do two things:
   > Write the given key/value pair into the persistent store for the user.
   > Change the session option in the current session to that new value.
   
   This is also questionable, because it breaks `SessionOptionManager extends 
InMemoryOptionManager`:
   ```
   This is an OptionManager that holds options in memory rather than in a 
persistent store. Options stored in 
   SessionOptionManager, QueryOptionManager, and FragmentOptionManager are held 
in memory 
   (see options) whereas the SystemOptionManager stores options in a persistent 
store.
   ```
   Most likely all easy ways here are some sort of hacks. So let's do that in 
correct manner: adding the new `UserOptionManager` as a layer between 
`SystemOptionManager` and `SessionOptionManager`.
   
   > The next step is store the per-user plugins in separate persistent stores. 
This is the "cleaner" design, but now you have to 
   > create the means for an admin to see, change, and remove the user-level 
plugins. You also have to extend the plugin registry, > which is, as we noted, 
a complex beast. Any change requires extensive unit tests.
   
   Different PS can be chosen per Drill doc. This is independent from Plugins. 
We have `HBasePersistentStore`, `InMemoryStore`, `LocalPersistentStore`, 
`MongoPersistentStore`, `NoWriteLocalStore`, `ZookeeperPersistentStore` 
implementations for PS.
   If you mean different instances of the same`PS`, that is not needed. 
PSRegistry is just a code, which allows to access actual PS. We can manage 
access via the common PSRegistry. Or we can reconsider it in future.
   
   *  **Plugins:**
   
   About plugins it makes sense to add Global plugins and to use it in case 
User plugin is absent. It's like a plugins fallback.
   And system plugins can be used only in scope of Global plugins. I don't 
think user name in schema is a nice and elegant solution. I think plugins 
fallback is simpler and more convenient approach.
   
   * **Conclusion:**
   
   So I am going to add this functionality to current implementation and 
separate out `user options` part of this PR into the new one.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384021#comment-17384021
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

vdiravka commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-882856859


   * **Motivation:**
   
   The goal is to allow multi-tenant operations. So it doesn't matter what API 
user uses: REST, JDBC ot ODBC. This is true that Drill initially wasn't 
designed for that, but we ca add it.
   
   * **Options:** 
   > The user has to be able to set the user options. We don't need a new 
option level. Simply add a new statement: ALTER USER > SET =. This 
would do two things:
   > Write the given key/value pair into the persistent store for the user.
   > Change the session option in the current session to that new value.
   
   This is also questionable, because it breaks `SessionOptionManager extends 
InMemoryOptionManager`:
   ```
   This is an OptionManager that holds options in memory rather than in a 
persistent store. Options stored in 
   SessionOptionManager, QueryOptionManager, and FragmentOptionManager are held 
in memory 
   (see options) whereas the SystemOptionManager stores options in a persistent 
store.
   ```
   Most likely all easy ways here are some sort of hacks. So let's do that in 
correct manner: adding the new `UserOptionManager` as a layer between 
`SystemOptionManager` and `SessionOptionManager`.
   
   > The next step is store the per-user plugins in separate persistent stores. 
This is the "cleaner" design, but now you have to 
   > create the means for an admin to see, change, and remove the user-level 
plugins. You also have to extend the plugin registry, > which is, as we noted, 
a complex beast. Any change requires extensive unit tests.
   
   Different PS can be chosen per Drill doc. This is independent from Plugins. 
We have `HBasePersistentStore`, `InMemoryStore`, `LocalPersistentStore`, 
`MongoPersistentStore`, `NoWriteLocalStore`, `ZookeeperPersistentStore` 
implementations for PS.
   If you mean different instances of the same`PS`, that is not needed. 
PSRegistry is just a code, which allows to access actual PS. We can manage 
access via the common PSRegistry. Or we can reconsider it in future.
   
   *  **Plugins:**
   
   About plugins it makes sense to add Global plugins and to use it in case 
User plugin is absent. It's like a plugins fallback.
   And system plugins can be used only in scope of Global plugins. I don't 
think user name in schema is a nice and elegant solution. I think plugins 
fallback is simpler and more convenient approach.
   
   * **Conclusion:**
   
   So I am going to add this functionality to current implementation and 
separate out `user options` part of this PR into the new one.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383581#comment-17383581
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

vdiravka commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-882856859


   * **Motivation:**
   
   The goal is to allow multi-tenant operations. So it doesn't matter what API 
user uses: REST, JDBC ot ODBC. This is true that Drill initially wasn't 
designed for that, but we ca add it.
   
   * **Options:** 
   > The user has to be able to set the user options. We don't need a new 
option level. Simply add a new statement: ALTER USER > SET =. This 
would do two things:
   > Write the given key/value pair into the persistent store for the user.
   > Change the session option in the current session to that new value.
   
   This is also questionable, because it breaks `SessionOptionManager extends 
InMemoryOptionManager`:
   ```
   This is an OptionManager that holds options in memory rather than in a 
persistent store. Options stored in 
   SessionOptionManager, QueryOptionManager, and FragmentOptionManager are held 
in memory 
   (see options) whereas the SystemOptionManager stores options in a persistent 
store.
   ```
   Most likely all easy ways here are some sort of hacks. So let's do that in 
correct manner: adding the new `UserOptionManager` as a layer between 
`SystemOptionManager` and `SessionOptionManager`.
   
   > The next step is store the per-user plugins in separate persistent stores. 
This is the "cleaner" design, but now you have to 
   > create the means for an admin to see, change, and remove the user-level 
plugins. You also have to extend the plugin registry, > which is, as we noted, 
a complex beast. Any change requires extensive unit tests.
   
   Different PS can be chosen per Drill doc. This is independent from Plugins. 
We have `HBasePersistentStore`, `InMemoryStore`, `LocalPersistentStore`, 
`MongoPersistentStore`, `NoWriteLocalStore`, `ZookeeperPersistentStore` 
implementations for PS.
   If you mean different instances of the same`PS`, that is not needed. 
PSRegistry is just a code, which allows to access actual PS. We can manage 
access via the common PSRegistry. Or we can reconsider it in future.
   
   *  **Plugins:**
   
   About plugins it makes sense to add Global plugins and to use it in case 
User plugin is absent. It's like a plugins fallback.
   And system plugins can be used only in scope of Global plugins. I don't 
think user name in schema is a nice and elegant solution. I think plugins 
fallback is simpler and more convenient approach.
   
   * **Conclusion:**
   
   So I am going to add this functionality to current implementation and 
separate out `user options` part of this PR into the new one.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381767#comment-17381767
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-881179890


   One more thought: it is not clear from the PR or design document *why* you 
need these changes, so I'm having to guess. I may be guessing wrong. Can you 
spell out the use case?
   
   For example, I could imagine that you need the persistent user options 
because you want to use the REST API, and that API does not have a persistent 
session, so options don't persist. As it turns out, someone kindly recently 
added the ability to pass along options on each query request. This is, in 
fact, how Impala works to avoid the need for session state. So, you can achieve 
the same result by keeping options on the client.
   
   Or, you can achieve the same result by adding a session concept to the REST 
API. Log in and get a session token back. Send that token on subsequent 
requests to identify the user and to key to a cached session object. The 
session ends after a timeout or on an explicit logout call.
   
   If the per-user plugins are to allow multi-tenant operation, then the 
problem is considerably more complex as much of Drill is not designed for that. 
(You'd want to secure query profiles, police resource usage, etc.) If it is to 
allow different users in the same organization to define their own plugins, 
then you have the sharing issue we discussed.
   
   In short, if you can explain your goals, we can perhaps help you find a 
workable solution that achieves the goals. Without knowing the goals, then I'm 
free to make up my own understanding of what you want...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381715#comment-17381715
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-881155695


   Next, let's discuss plugins. Let's start with three goals:
   
   1. System plugins work for everyone (we can add security later).
   2. By default, plugins are available to everyone. That is, the Drill 
out-of-the-box plugins are available so that, say, `cp` works and people can 
try examples from the documentation.
   3. There is an optional way to create per-user plugin (configs).
   
   What is the simplest way to do this? Maybe by changing the plugin resolution 
rules. When the planner looks for a plugin, it tries two names: 
`` and ``. We use the resulting name in the query 
plan. So, if I have (unwisely) created my own `cp` plugin, when I ask for `cp`, 
that's what I get:  `$$paul$$cp`. If I do not have my own `dfs`, then when I 
ask for it, I get the (global) `dfs`. (The `` is something I just 
made up; it can be anything as long as the marker is not likely to ever appear 
in a real plugin name. Maybe `` or `\\` would be better since 
they are not legal SQL.)
   
   When a user stores a plugin, and the per-user config feature is enabled, 
then the UI appends the user prefix. The UI filters out all plugins owned by 
other users. When the admin works with plugins, there is no prefix. This allows 
the admin to see, and change, both `cp` and `$$paul$$cp`. If I leave the 
organization, the admin can remove all the `$$paul$$...` plugins.
   
   I suspect you'll find that, with this solution, *no* change is needed in the 
plugin registry. Plans will resolve to the correct reader configs. The reader 
configs in the query plan will just work (as they would in your design.)
   
   How does this help sharing? Simply ask the admin to change the name of the 
plugin from `$$paul$$whizbang` to just `whizbang`. Now, everyone can see it and 
my queries work without change. Super simple. We have only two security levels 
(everyone or single user), but that is **far** better than reuse by copy/paste. 
The design could be extended to the role-based one without breaking anything.
   
   The next step is store the per-user plugins in separate persistent stores. 
This is the "cleaner" design, but now you have to create the means for an admin 
to see, change, and remove the user-level plugins. You also have to extend the 
plugin registry, which is, as we noted, a complex beast. Any change requires 
*extensive* unit tests.
   
   By the way, when I revised the plugin registry a while back, I fixed dozens 
of bugs. Most were simply because I added tests and found all kinds of broken 
stuff. Concurrency was broken, upgrades were broken, expiring of old plugins 
was broken. and so on. I still have scars. So, I'm trying to save you from 
similar pain.
   
   Will something like this work and avoid the hard work we discussed earlier?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381707#comment-17381707
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-881151197


   Hi @vdiravka,
   
   Thanks for the note, you make good points. Let's see where we can 
compromise. Let's start with options.
   
   It makes no sense for users to have their own system options (my favorite 
example is the length of the global, shared query queue.) So, `SYSTEM` options 
can't be per-user. This means that the `SYSTEM` level has to remain as it is. 
We can cut corners: still allow users to change system options, we just ignore 
those values. (For example, IIRC, the query queue, the value is taken from the 
`SYSTEM` options; we ignore `SESSION` values.) That is, we just trust that user 
ignorance will prevent them from breaking things; they won't know about the 
system options.
   
   If we have any form of user-property persistence, then we have to worry 
about race conditions. I don't see how we can ignore that issue unless we are 
strict and we say that user options are read once: when the user starts a 
session. If I change `all-text-mode` in session 1, it will not affect session 
2. This seems reasonable for the "toy" use cases where Drill is now used.
   
   That is, when a session starts, we set the session options to the saved 
per-user values from the persistent store. A nice result is that, at runtime, 
there is no difference between user and session options. This fact should save 
some work.
   
   The user has to be able to set the user options. We don't need a new option 
level. Simply add a new statement: `ALTER USER SET =`. This would 
do two things:
   
   1. Write the given key/value pair into the persistent store for the user.
   2. Change the session option in the current session to that new value.
   
   Again, we let the user change anything they want; we rely on the fact that 
users will only care about one or two options, and only when someone suggests a 
change.
   
   The above is simple, but limited. The limits are no worse than today. It 
seems we could extend the feature in the future without breaking anything. (For 
example, broadcast user option changes to all sessions. Apply security to 
system options. Etc.)
   
   Then, we simply ignore the issue that the options users care about are 
associated with queries (all the Parquet, JSON, and other options that affect 
reader behavior.) We simply decide to be surprised when the user encounters 
those issues. User: "Gee, I set the `all-text-mode` option and my queries run 
file. But, when Bob runs them, they fail...". Our answer: "have Bob change the 
same option" This may break Bob's own queries, so we have Bob change the option 
back. This is a horrible user experience, but is the same as what Drill offers 
today, so we're not breaking anything.
   
   Will this work?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380223#comment-17380223
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

vdiravka commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-879480858


   Hi @paul-rogers,
   
   I started design for system options from adding the new additional layer. 
But this approach is much more complex and we should cover all the cases you 
described.
   The design for the feature proposed by me is other: we just don't use global 
system options at all (in case `separate_workspace` is enabled). Every user has 
it's own system options. It gives a great functionality, which drill required 
for a long time, I think. We used session options for this purpose, but it 
can't be serialized. I also abandon the idea to Store Session options for the 
same reason, it has other concept and couples with a Query.
   
   About plugins the same thing, I tried to keep things as simple as it is 
possible (it corresponds to your last sentence).
   * I agree we can factor out system plugins and use that shared plugins for 
all users. It is minor improvement, it allows to keep less things in memory. 
This is not a blocker for this PR. I will create a ticket and proceed working 
on it after finishing this PR.
   * Also I agree with it is nice to have Group/Organization level of plugins 
and `System Options`. I noted about it in Design document. But it is much more 
complex improvement, requires introducing a lot of new rules and grants. I 
thought to consider working on it after finishing on this PR. Because now 
separate plugins and options can provide all things needed for users, but 
setting up is not so comfortable as it can be with Group/Organization level of 
these instances. And there are more similar instances are placed in memory now 
(separate instances for Groups/Organizations will improve this too)
   * About concurrency and distributed systems issues, they are all the same as 
it is now. Since event now one user can run query and other can edit the plugin 
config. Now the picture is even better, since two user can't sue common plugins 
registry or system options. Whey their own ones. They don't impact each other.
   
   About dividing the PR I also thought about it initially, because leverage of 
System Options is other than plugins, so a lot of things for options is other. 
But options and plugins are combined with a common idea - to have the separate 
workspace for users: plugin configs, system options and to see their own query 
profiles only. And the approach for implementation is very similar - to replace 
the global instances with their own user-session ones.
   So I can here to divide the PR in two parts or just in two commits for 
easier review. But I think we can also go with one PR and commit.
   
   So let me know do you like the above approach for this feature with a two 
(or more) further improvements or possibly you want to bring mail discussion on 
this topic with other devs?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17378816#comment-17378816
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-877884101


   Let's now move onto the primary topic of this PR: plugins. Again, this is a 
complex topic, especially if we think about how the solution would work at 
scale. I'm afraid that, since the design is so sparse and leaves out many 
details, that we're perhaps going down the wrong path. Let's take each topic in 
turn so we can see what's what.
   
   First, what is the structure we want? To answer this we need to separate the 
*code* (Java libraries, which I'll call a "connector", following Presto) and 
*configs* (the JSON things that users see, what we normally mean when we say 
"plugin".)
   
   Drill will only ever work with a fixed set of connectors: those available on 
the class path at runtime. AFAIK, there has never been interest in adding 
connectors at runtime (no "dynamic connectors") and, for security reasons, I'm 
not sure that doing so would even be a good idea.
   
   Connectors are presently used by any number of configs. And, every query 
creates an instance of the objects in the connector (the readers, the planner 
objects, etc.) So, the structure we have is:
   
   * Config (a JSON object in the persistent store)
   * Connector (a library)
 * Plugin (an instance of the plugin class from a connector, along with its 
config.)
   * Readers, planner objects, created per-query from a plugin
   
   The plugin registry binds connectors to configs to produce plugins (where by 
"plugin" I mean an instance of the plugin class initialized with a config.) 
This is some mighty complex code! It has to deal with a bunch of distributed 
system and concurrency-related issues. I'll omit the details for now.
   
   So, what does this mean? For one, it means we only need *one* copy of the 
connector. The "new" plugin registry is designed to handle this idea. My "class 
path plugin" and your "class path plugin" use the same plugin library. I'll 
assume that the answer above, that each user has their own copy of the code, 
represents a misunderstanding because of Drill's horribly ambiguous names in 
this area.
   
   Next, the plugin registry holds two kinds of "plugins". First are those 
configured in the UI and that can be resolved in queries. Familiar things like 
`cp`, `dfs` and so on. Second are ad-hoc plugins: those created on-the-fly 
based on table properties in a query, or plugin properties passed along in a 
query definition. This is done so that queries work even if, right after 
planning completes, someone deletes or changes the stored config: the next 
query sees the new config, the executing query uses the config stored in the 
query. (Else, disaster would result when some Drillbits see one config, others 
another.)
   
   There is a system plugin that offers the system tables. This one is meant to 
be shared by all users. It has no config options, but if it did, they would be 
set at the system level by the admin. It makes no sense for each user to have 
their own copy. (It might make sense to disallow certain system tables for some 
users, but that is a different question.)
   
   Now we have the "regular" configured plugins: a JSON config and an 
associated plugin instance. The goal here is to define a new level so that 
individual users have these items.
   
   First, I'll question if the requirements are correct. Do we really want 
plugins associated with each user? If you and I both use tale "foo", do I have 
to ask you to send me your JSON so I can create a new copy? What if you change 
something? What if we have 100s of users all with copies? Making copies is a 
"demo only" feature, it does not scale in production. Remember: reuse by copy 
and paste is for amateurs who will throw away their solution, not for 
professionals who want to minimize total costs.
   
   Let's think of a use case. A firm has five departments with 10 people per 
department. Some people in one department can see some data in another. For 
example, the VP of marketing is allowed to see Sales and Dev data. The CFO can 
see everything. Oh, and the employees change regularly.
   
   This is not a new idea. How do other tools handle this? Oracle? SQLServer? 
They operate at the level of schemas (databases) and tables. So, think of a 
plugin (config in storage, config + connector at runtime) as a shared, named 
object.
   
   So, what we want are groups of plugins. "Marketing", "Sales", "Ops". Within 
each are the configs for that set of tables. Users can be given permissions on 
the whole group, or on individual plugins. Administrators or DBAs (those who 
understand production systems) set up the configs. Users use those to which 
they are given permission. Or, 

[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17378798#comment-17378798
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-877870693


   This one is difficult. We are reviewing code under time pressure when the 
design itself raises many questions. Let's start with the design. First, it is 
too short, it leaves many issues unspecified. So, I have to infer them from 
reviewing the code. As noted before, the storage plugin registry is complex; so 
having to reverse-engineer a design at the code level makes the process slower 
than it might otherwise be.
   
   This PR adds two separate features: a new options level and changes to the 
plugin registry. Since these are independent concepts, they should appear as 
separate PRs: one for the options, another for plugins. That will reduce the 
cognitive load on us poor reviewers.
   
   So, the design of options:
   
   > But it appears different users have the common Storage Plugins and System 
Options.
   > * There is no possibility for Drillbit configuring individually for every 
user and persistent between sessions (Note: SessionOptions are removed after 
closing session)
   
   The above is by design. **System** options are just that: options for the 
system. They are meant to be set by the administrator to control the Drill 
cluster as a whole. Drill also has *Session* options: these are per user 
session. If we want per-user option settings, then we need a new level: *User* 
options.
   
   Options form a hierarchy. At present that hierarchy, from most to least 
specific, is:
   
   * Per-query options *or* session options
   * System options
   * Options configured in the Drill config system.
   
   It sounds like you want to add a new layer: user options:
   
   * Per-query options **or**
 * Session option
 * Per-user option
   * System options...
   
   Suppose we have per-user system options: I set my queue length to 10 items, 
you set your queue length to 100 items. But, the queue is a global resource: 
which is used? This is just one example of how intractable "per-user system 
options" is as a concept. So, we need actual user options.
   
   One obvious choice is simply to persist the session options. I login and 
change `foo` to 10. Drill simply persists that change so that next time I log 
in, `foo` is 10. Easy, but is that the right answer? Drill has a serious flaw 
that the behavior of any query depends on options: JSON modes, all-text mode 
and so on. That was done as a quick and dirty short-term solution to ship the 
product, but is still with us and still causing issues. If I set an option to 
"all text mode", and run my query, I get a result. If I share that query with 
you, but you don't set the all-text option, the query will produce different 
results (or, probably, fail.)
   
   So, to "fix" options, we need to push those options into the query (with 
hints, say), or we need options associated with a query, as having per-user 
options won't solve the "options associated with a query" problem.
   
   So, we probably do not want to just persist session options as they are used 
to control specific queries. Instead, we'd want to add a new system: `SET USER 
OPTION ...` so that the hierarchy is as above.
   
   The design is silent on many important questions:
   
   * What is the new hierarchy?
   * How does *the* (not an) administrator set truly-global options if all 
users can set their own?
   * How does the user set, review and clear the user options?
   * Suppose a user is logged in twice. In session 1 they change `foo` to 10. 
Do we expect session 2 (on a different Drillbit) to see that change? This is a 
**very** difficult concurrency question. Of course they do. But, when? In the 
middle of query planning? At the start of the next SQL statement? After a time 
period? Only at the next login? If concurrently, how will one Drillbit track 
changes made to the user options by another Drillbit? How can the use a "force 
resync" to keep options synchronized?
   * Can the admin see the per-user options? If so, how? `SHOW OPTIONS FOR USER 
'bob'`? Remember, the admin is tasked with policing the system, they **must** 
be able to see all options.
   * How do we solve the current ambiguity of true system options (the query 
queuing options mentioned above) vs. user/session options? At present, the 
solution is ad-hoc and we just trust that most options are not changed. If a 
naive user can change any option, we've got issues. Do we need a permissions 
table to say which users can change which options? (And, categorize true system 
options as read-only at the user and session levels.)
   * You will need `USER` version of all of the option commands.
   * You will need complete tests that demonstrate that the hierarchy 

[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17375992#comment-17375992
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

vdiravka commented on a change in pull request #2251:
URL: https://github.com/apache/drill/pull/2251#discussion_r664845699



##
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePluginRegistry.java
##
@@ -21,19 +21,18 @@
 import java.util.Map;
 import java.util.Set;
 
+import org.apache.calcite.schema.SchemaPlus;

Review comment:
   > Are we controlling access to the "connectors" (code) or to the 
"instances" (configs?)
   * connectors and configs
   > If we control the connectors, do we load new copies of the connector 
metadata per user?
   * yes, we do
   > Can a user see the common set of system-wide configs? Or, only their own 
private configs?
   * only their private
   > If system wide, how do we handle name conflicts? I add a "foo" plugin, 
later the admin adds a system-wide "foo" or visa-versa.
   * answered above
   > Can a private plugin be promoted to system level? Is that a copy/paste 
operation? With the user logged in as different windows? Or, is there a button 
to do the operation?
   * there is no such functionality yet
   > When is the per-user information created? As you know, on the web, the 
user session comes into existence for every single REST call. Plugin setup is 
pretty heavy-weight. Is the user config cached between requests? If so, when do 
we expire the user information if the user then becomes idle?
   * the users plugins is created and placed within the user session. The 
caching mechanism is absolutely the same as for for the system plugins registry 
(`ephemeralPlugins` is kept within every session `UserStoragePluginRegistry` 
inheritor of the `StoragePluginRegistryImpl`). If the UserSession is closed, 
the `ephemeralPlugins` is persisted to ZK similar to the global plugins 
approach.
   > The registry handles updates. The current code is rather awful since there 
is no coordination among Drillbits: multiple drillbits could come up at the 
same time and they can all decide to do the upgrade. Adding users complicates 
the effort. Will a user session trigger a plugin upgrade? How do we handle 
races?
   * Any plugins update or any query run is performed via the client session. A 
new User Session can handle only own `UserStoragePluginRegistry` instance. 
Handling races mechanism is the same as for `StoragePluginRegistryImpl`, but it 
is event simpler - the user within one session can't update plugins and run 
query simultaneously.
   > This registry handles a subtle, obscure case. Suppose (before this 
change), I run a query that users the "foo" plugin. Before the query starts 
running (but after planning), I rush in and delete the "foo" config. What 
happens? As it turns out, Drill saves the old config so that queries continue 
to work. This works because there is only one registry. If we add per-user 
registries, we'd need to pass the user info with the exec plan so that the 
readers grab the user's version of the config, including any cached, 
recently-deleted configs. (Distributed systems are complex!)
   * for this case the query can fail indeed. But do we expect it to be 
finished?! I don't think so. as the config was deleted and it is fine that the 
query based on this config is failed.
   
   The global idea is to have separate `StoragePluginRegistryImpl` per 
UserSession and to keep all other functionality without changes. It allows to 
avoid any unexpected behavior. Absolutely the same story with 
`SystemOptionManager`
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17375607#comment-17375607
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

vdiravka commented on a change in pull request #2251:
URL: https://github.com/apache/drill/pull/2251#discussion_r663159582



##
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/StatusResources.java
##
@@ -112,7 +115,9 @@ public String getMetrics(@PathParam("hostname") String 
hostname) throws Exceptio
   private List getSystemOptionsJSONHelper(boolean internal)
   {
 List options = new LinkedList<>();
-OptionManager optionManager = work.getContext().getOptionManager();
+OptionManager optionManager = authEnabled.separateWorkspace()

Review comment:
   Done.

##
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/StatusResources.java
##
@@ -183,9 +188,9 @@ public Viewable getSystemInternalOptions(@Context UriInfo 
uriInfo) {
   public Viewable updateSystemOption(@FormParam("name") String name,
@FormParam("value") String value,
@FormParam("kind") String kind) {
-SystemOptionManager optionManager = work.getContext()
-  .getOptionManager();
-
+SystemOptionManager optionManager = authEnabled.separateWorkspace()

Review comment:
   > This seems to say that updating a system option works differently 
depending on whether the user has their own workspace.
   
   - Yes
   
   > Shouldn't system options always update the (global) system options? 
Otherwise, do we need a "real system option" attribute to really update the 
global options?
   
   - No, we don't care about real system options in case separate_workspace is 
enabled. The user and all his activity relies on his own SystemOptions.

##
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
##
@@ -71,6 +73,87 @@
  */
 public class SystemOptionManager extends BaseOptionManager implements 
AutoCloseable {
   private static final Logger logger = 
LoggerFactory.getLogger(SystemOptionManager.class);
+  public static final String OPTIONS_PSTORE_NAME = "options";

Review comment:
   _Note:_ changes in `SystemOptionManager` is just refactoring (placing 
code blocks in right places)

##
File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/TestUserPluginRegistry.java
##
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store;
+
+import org.apache.drill.common.config.DrillProperties;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.rpc.user.UserServer;
+import org.apache.drill.exec.rpc.user.UserSession;
+import 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl;
+import org.apache.drill.exec.store.dfs.FileSystemConfig;
+import org.apache.drill.exec.store.dfs.FileSystemPlugin;
+import org.apache.drill.test.ClientFixture;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.junit.Test;
+
+import java.util.HashMap;
+import java.util.Map;
+
+import static 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.TEST_USER_1;
+import static 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.TEST_USER_1_PASSWORD;
+import static 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.TEST_USER_2;
+import static 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.TEST_USER_2_PASSWORD;
+import static org.apache.drill.exec.util.StoragePluginTestUtils.CP_PLUGIN_NAME;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertNotSame;
+import static org.junit.Assert.assertNull;
+import static org.junit.Assert.assertSame;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;

[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17375007#comment-17375007
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

vdiravka commented on a change in pull request #2251:
URL: https://github.com/apache/drill/pull/2251#discussion_r663159582



##
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/StatusResources.java
##
@@ -112,7 +115,9 @@ public String getMetrics(@PathParam("hostname") String 
hostname) throws Exceptio
   private List getSystemOptionsJSONHelper(boolean internal)
   {
 List options = new LinkedList<>();
-OptionManager optionManager = work.getContext().getOptionManager();
+OptionManager optionManager = authEnabled.separateWorkspace()

Review comment:
   Done.

##
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/StatusResources.java
##
@@ -183,9 +188,9 @@ public Viewable getSystemInternalOptions(@Context UriInfo 
uriInfo) {
   public Viewable updateSystemOption(@FormParam("name") String name,
@FormParam("value") String value,
@FormParam("kind") String kind) {
-SystemOptionManager optionManager = work.getContext()
-  .getOptionManager();
-
+SystemOptionManager optionManager = authEnabled.separateWorkspace()

Review comment:
   > This seems to say that updating a system option works differently 
depending on whether the user has their own workspace.
   
   - Yes
   
   > Shouldn't system options always update the (global) system options? 
Otherwise, do we need a "real system option" attribute to really update the 
global options?
   
   - No, we don't care about real system options in case separate_workspace is 
enabled. The user and all his activity relies on his own SystemOptions.

##
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
##
@@ -71,6 +73,87 @@
  */
 public class SystemOptionManager extends BaseOptionManager implements 
AutoCloseable {
   private static final Logger logger = 
LoggerFactory.getLogger(SystemOptionManager.class);
+  public static final String OPTIONS_PSTORE_NAME = "options";

Review comment:
   _Note:_ changes in `SystemOptionManager` is just refactoring (placing 
code blocks in right places)

##
File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/TestUserPluginRegistry.java
##
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store;
+
+import org.apache.drill.common.config.DrillProperties;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.rpc.user.UserServer;
+import org.apache.drill.exec.rpc.user.UserSession;
+import 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl;
+import org.apache.drill.exec.store.dfs.FileSystemConfig;
+import org.apache.drill.exec.store.dfs.FileSystemPlugin;
+import org.apache.drill.test.ClientFixture;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.junit.Test;
+
+import java.util.HashMap;
+import java.util.Map;
+
+import static 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.TEST_USER_1;
+import static 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.TEST_USER_1_PASSWORD;
+import static 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.TEST_USER_2;
+import static 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.TEST_USER_2_PASSWORD;
+import static org.apache.drill.exec.util.StoragePluginTestUtils.CP_PLUGIN_NAME;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertNotSame;
+import static org.junit.Assert.assertNull;
+import static org.junit.Assert.assertSame;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;

[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-07-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17373119#comment-17373119
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

paul-rogers commented on a change in pull request #2251:
URL: https://github.com/apache/drill/pull/2251#discussion_r662667413



##
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePluginRegistry.java
##
@@ -21,19 +21,18 @@
 import java.util.Map;
 import java.util.Set;
 
+import org.apache.calcite.schema.SchemaPlus;

Review comment:
   The plugin registry is a complex beast. Before this change, it supported 
built-in plugins and those provided in the plugins folder. The registry handles 
both the "connector" (code) and "instances" (code + a config.)
   
   This change adds per-user plugins. The overall bug description does not seem 
to provide answers to several questions. For example:
   
   * Are we controlling access to the "connectors" (code) or to the "instances" 
(configs?)
   * If we control the connectors, do we load new copies of the connector 
metadata per user?
   * Can a user see the common set of system-wide configs? Or, only their own 
private configs?
   * If system wide, how do we handle name conflicts? I add a "foo" plugin, 
later the admin adds a system-wide "foo" or visa-versa.
   * Can a private plugin be promoted to system level? Is that a copy/paste 
operation? With the user logged in as different windows? Or, is there a button 
to do the operation?
   * When is the per-user information created? As you know, on the web, the 
user session comes into existence for every single REST call. Plugin setup is 
pretty heavy-weight. Is the user config cached between requests? If so, when do 
we expire the user information if the user then becomes idle?
   * The registry handles updates. The current code is rather awful since there 
is no coordination among Drillbits: multiple drillbits could come up at the 
same time and they can all decide to do the upgrade. Adding users complicates 
the effort. Will a user session trigger a plugin upgrade? How do we handle 
races?
   * This registry handles a subtle, obscure case. Suppose (before this 
change), I run a query that users the "foo" plugin. Before the query starts 
running (but after planning), I rush in and delete the "foo" config. What 
happens? As it turns out, Drill saves the old config so that queries continue 
to work. This works because there is only one registry. If we add per-user 
registries, we'd need to pass the user info with the exec plan so that the 
readers grab the user's version of the config, including any cached, 
recently-deleted configs. (Distributed systems are complex!)
   * There are probably a few more issues that also deserve consideration. The 
comments in this part of the code try to explain the various issues.

##
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/prepare/PreparedStatementProvider.java
##
@@ -138,7 +139,9 @@ public void run() {
 final QueryId limit0QueryId = userWorker.submitWork(wrapper, 
limit0Query);
 
 final long timeoutMillis =
-
userWorker.getSystemOptions().getOption(CREATE_PREPARE_STATEMENT_TIMEOUT_MILLIS).num_val;
+
userWorker.getDrillbitContext().getConfig().getBoolean(ExecConstants.SEPARATE_WORKSPACE)

Review comment:
   As noted above, this kind of code will lead to countless subtle errors. 
Suggestion: introduce yet another option manager whose job is to decide which 
of the two "system" option managers is needed. Option managers are designed to 
nest, so this should be not too hard.
   
   And, there is still the question: why do we need two *system* option 
managers? Do users get their own system options now? See questions above...

##
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/StatusResources.java
##
@@ -183,9 +188,9 @@ public Viewable getSystemInternalOptions(@Context UriInfo 
uriInfo) {
   public Viewable updateSystemOption(@FormParam("name") String name,
@FormParam("value") String value,
@FormParam("kind") String kind) {
-SystemOptionManager optionManager = work.getContext()
-  .getOptionManager();
-
+SystemOptionManager optionManager = authEnabled.separateWorkspace()

Review comment:
   This seems to say that updating a *system* option works differently 
depending on whether the user has their own workspace. Shouldn't *system* 
options always update the (global) system options? Otherwise, do we need a 
"real system option" attribute to really update the global options?

##
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/StatusResources.java
##
@@ -112,7 +115,9 @@ public String getMetrics(@PathParam("hostname") String 
hostname) throws Exceptio
   

[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-06-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368106#comment-17368106
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

vvysotskyi commented on pull request #2251:
URL: https://github.com/apache/drill/pull/2251#issuecomment-866817374


   @vdiravka, looks like the test you have added is failing in CI, could you 
please take a look?
   
   Could you please share some functional-level doc with the description use 
cases and behavior, so it will be easier to do the code review?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7871) StoragePluginStore instances for different users

2021-06-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17357404#comment-17357404
 ] 

ASF GitHub Bot commented on DRILL-7871:
---

vdiravka opened a new pull request #2251:
URL: https://github.com/apache/drill/pull/2251


   # [DRILL-7871](https://issues.apache.org/jira/browse/DRILL-7871): 
StoragePluginStore instances for different users
   
   ## Description
   
   This feature allows to have the separate plugin set per user. Profiles are 
restricted per user too.
   
   ## Documentation
   To enable the feature:
   `drill.exec.security.user.auth.separate_workspace`
   
   TBI
   
   ## Testing
   Unit tests. Manual testing of Drill WebUI and JDBC connection via SqlLine
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StoragePluginStore instances for different users
> 
>
> Key: DRILL-7871
> URL: https://issues.apache.org/jira/browse/DRILL-7871
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Security
>Affects Versions: 1.18.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 1.19.0
>
>
> Different users should have their own storage plugin configs to have access 
> to own storages only. The feature can be based on Drill User Impersonation 
> model



--
This message was sent by Atlassian Jira
(v8.3.4#803005)