> On Jan. 16, 2016, 8:15 a.m., Lenni Kuff wrote:
> > sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java, line 605
> > <https://reviews.apache.org/r/42344/diff/2/?file=1198197#file1198197line605>
> >
> >     Why not just always get a cache binding? How much does this improve things versus the previous approach?
>
> Colin Ma wrote:
>     Agree with Lenni, and we should have a performance test to check the performance impact of always getting a cache binding.
>
> Dapeng Sun wrote:
>     I don't think **always getting a cache binding** is the best solution, since **getting a cache binding** obtains all privileges of the current user per session. That triggers hierarchical queries against the database: group->role->privilege. If the user has thousands of privileges in the database, even a command like switching databases (**use database1**) will pull those thousands of privileges to the local side.
>     If we make it configurable, users can balance the two solutions for their cluster.
>
> Lenni Kuff wrote:
>     What privileges are loaded without the cached binding? Is only a subset of privileges loaded because we stop at the first positive, or because we load only privileges for a specific object? The problem with having a separate configuration is that users are not going to have any idea what value to set max.query.num to, and it makes configuration more complex.
>
> Dapeng Sun wrote:
>     Hi Lenni and Colin,
>
>     **What privileges are loaded without the cached binding?** It loads only the privileges related to the authorizable hierarchy for the user, not all of the user's privileges.
>
>     I ran a performance test that executes **use database** 10 times; the result is below.
>     The database is Derby. Running without the cached binding gives about a 10%~40% improvement, but because the latency is not big (less than 1s), I think it is also okay to remove the configuration to make it simple to use and always use the cached binding. Do you have any thoughts?
>
>
>     Here is the test result.
>     number of privileges in database -> total cost time with cached -> total cost time without cached
>     10   -> 1021ms -> 919ms
>     100  -> 1803ms -> 1285ms
>     1000 -> 5590ms -> 3732ms

Can we build the cache to only contain the objects in the authorizable hierarchy rather than building a cache for all privileges?


- Lenni


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42344/#review114850
-----------------------------------------------------------


On Jan. 15, 2016, 11:47 a.m., Dapeng Sun wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42344/
> -----------------------------------------------------------
>
> (Updated Jan. 15, 2016, 11:47 a.m.)
>
>
> Review request for sentry.
>
>
> Bugs: SENTRY-1007
>     https://issues.apache.org/jira/browse/SENTRY-1007
>
>
> Repository: sentry
>
>
> Description
> -------
>
> Since the current architecture performs one authorization per entity, a SQL statement like **select col1,col2,col3,.....,colN from test_tb1** authorizes each queried column individually.
>
> This patch reuses the CachedHiveBinding from SENTRY-565.
> If the number of entities exceeds maxQueryNumber, it queries all of the user's privileges to the local side and uses the local privileges for authorization.
>
>
> Diffs
> -----
>
>   sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java 57e4689
>   sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/conf/HiveAuthzConf.java e76fad1
>
> Diff: https://reviews.apache.org/r/42344/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Dapeng Sun
>
>
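
For readers following the thread, the trade-off in the patch description (per-entity lookups below a threshold, one bulk load of the user's privileges above it) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: `ThresholdAuthzSketch`, `PrivilegeStore`, `InMemoryStore`, and `authorize` are hypothetical names, entities are modeled as plain strings, and the store is an in-memory stub that counts round trips. Only the name `maxQueryNumber` is taken from the description.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; these types are NOT part of the Sentry code base. */
final class ThresholdAuthzSketch {

    /** Assumed stand-in for the privilege store behind the binding. */
    interface PrivilegeStore {
        boolean hasPrivilege(String user, String entity); // one round trip per call
        Set<String> loadAllPrivileges(String user);        // one round trip, full load
    }

    /** In-memory stub for a single user that counts round trips. */
    static final class InMemoryStore implements PrivilegeStore {
        private final Set<String> granted;
        int roundTrips = 0;

        InMemoryStore(Set<String> granted) {
            this.granted = new HashSet<>(granted);
        }

        @Override
        public boolean hasPrivilege(String user, String entity) {
            roundTrips++;
            return granted.contains(entity);
        }

        @Override
        public Set<String> loadAllPrivileges(String user) {
            roundTrips++;
            return new HashSet<>(granted);
        }
    }

    /**
     * Authorizes every entity of a query. Above the threshold, all of the
     * user's privileges are loaded once and checked locally; otherwise each
     * entity is checked with its own lookup.
     */
    static boolean authorize(PrivilegeStore store, String user,
                             List<String> entities, int maxQueryNumber) {
        if (entities.size() > maxQueryNumber) {
            Set<String> local = store.loadAllPrivileges(user);
            return local.containsAll(entities);
        }
        for (String entity : entities) {
            if (!store.hasPrivilege(user, entity)) {
                return false;
            }
        }
        return true;
    }

    private ThresholdAuthzSketch() {}
}
```

In this stub, a query touching three granted columns costs one round trip with a threshold of 2 and three round trips with a threshold of 5. The cost of the bulk load itself, which is the concern raised in the thread for users with thousands of privileges, is not modeled here.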
