Hi Colin,

The code snippet was just an experiment to show that converting text rules
to Privilege objects consumes a tremendous amount of CPU time. (I didn't
try to profile it, though.)
Sentry loads rules from the database or a file and updates them as needed.
I didn't read all the relevant code, but I'm sure that at some point the
text-based rules are loaded and put in their proper place.
At that moment, instead of storing the text-based rule, store the Privilege
object for future use.
Then there should be no need to convert text to Privilege objects at
PrivilegeRequest evaluation time.
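
A minimal sketch of what I mean, not Sentry's actual code. The rule format
"table=<name>->action=<name>" is a simplification, and ParsedPrivilege and
GroupPrivilegeStore are hypothetical names I made up for illustration. Rules
are parsed once when they are loaded or refreshed, and evaluation only
touches the pre-built objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a parsed privilege; not Sentry's Privilege class.
final class ParsedPrivilege {
    final String table;
    final String action;

    ParsedPrivilege(String table, String action) {
        this.table = table;
        this.action = action;
    }

    // Parses a simplified rule such as "table=t1->action=select".
    static ParsedPrivilege parse(String rule) {
        String[] parts = rule.split("->");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Unexpected rule: " + rule);
        }
        String table = parts[0].substring(parts[0].indexOf('=') + 1).trim();
        String action = parts[1].substring(parts[1].indexOf('=') + 1).trim();
        return new ParsedPrivilege(table.toLowerCase(Locale.ROOT),
                                   action.toLowerCase(Locale.ROOT));
    }

    // True if this privilege covers the requested table and action.
    boolean implies(String requestedTable, String requestedAction) {
        boolean tableOk = table.equals("*") || table.equalsIgnoreCase(requestedTable);
        boolean actionOk = action.equals("all") || action.equalsIgnoreCase(requestedAction);
        return tableOk && actionOk;
    }
}

// Keeps pre-built privileges per group. Not thread-safe; a real version
// would need synchronization and a refresh policy.
final class GroupPrivilegeStore {
    private final Map<String, List<ParsedPrivilege>> byGroup = new HashMap<>();

    // Called at load or refresh time: the text is parsed exactly once here.
    void load(String group, List<String> textRules) {
        List<ParsedPrivilege> parsed = new ArrayList<>(textRules.size());
        for (String rule : textRules) {
            parsed.add(ParsedPrivilege.parse(rule));
        }
        byGroup.put(group, Collections.unmodifiableList(parsed));
    }

    // Called at evaluation time: no text parsing happens on this path.
    boolean hasAccess(List<String> groups, String table, String action) {
        for (String group : groups) {
            List<ParsedPrivilege> privileges =
                byGroup.getOrDefault(group, Collections.emptyList());
            for (ParsedPrivilege p : privileges) {
                if (p.implies(table, action)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Calling load() again for a group replaces its list, so a refresh from the
db or file only has to re-parse the groups that changed.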

Regards,
     Steven

On Thursday, March 24, 2016, Ma, Junjie <[email protected]> wrote:

> Hi Donghoon,
>
> For the first experiment, you cached all privileges in local memory and
> did the authorization against the cache. But the cache won't refresh, so I
> think it's a temporary cache that works around the performance problem. For
> Sentry, we should have an overall solution for the local cache. For
> example, what's the policy for caching the privileges: get additional
> privileges every time, or get all privileges? How should the local cache be
> refreshed: pull or push? And so on.
> Feel free to discuss.
>
> Best regards,
>
> Colin Ma(Ma Jun Jie)
>
> -----Original Message-----
> From: nazgul33 [mailto:[email protected]]
> Sent: Thursday, March 24, 2016 1:13 PM
> To: [email protected]
> Subject: [discuss] ResourceAuthorizationProvider.hasAccess() performance
>
> -- resending mail to [email protected]. I accidentally
> sent mail to [email protected] --
>
> Hi all,
>
> I'm Donghoon Han, a software developer.
> I subscribed to this mailing list to discuss a performance problem in Sentry.
>
> My current job is to develop a type 3 JDBC driver for universal access to
> Hive, Impala and Phoenix, which has its own Sentry enforcement logic.
> We have 1000 users and 10k tables in a single Hive metastore. (Huge!)
>
> One problematic requirement is implementing the JDBC API call
> metadata.getTables() so that it lists only the tables the connected user
> has access to.
>
> To evaluate all tables, with 1000 public tables out of 10k tables and
> 1000 public users, the current logic requires about 200 seconds.
>
> I found that a single hasAccess() call
> - lists the rules of the groups the user belongs to: getPrivileges()
> - iterates through the rules and tests the requested privilege.
> - while iterating, converts each text-based rule to a Privilege object and
> then discards it.
>
> I suspected that the performance problem lies in converting text rules to
> Privilege objects every time.
> So I wrote some code to experiment.
>
> = first experiment =
> 1. get groups
> 2. list rules in groups and convert them to Privilege objects
> 3. put them in a HashMap<String, List<Privilege>>, where the key is the
> name of the group.
> 4. the next evaluation uses the Privilege objects cached in the HashMap.
>
> This decreased full evaluation time from 200 seconds to 10 seconds.
>
> = second experiment =
> - most users are public users, with SELECT only.
> - the current EnumSet.allOf(DBModelAction.class) iterates in the order
> INSERT, SELECT, ALL, so I changed the evaluation order to SELECT, INSERT,
> ALL.
>
> 10 seconds -> 5 seconds.
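
On the second experiment: an EnumSet always iterates in the enum's
declaration order, regardless of the order in which elements are added, so
a different evaluation order needs an explicit list. A small sketch,
assuming a hypothetical Action enum that stands in for DBModelAction and is
declared in the order reported above (INSERT, SELECT, ALL):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;

// Hypothetical stand-in for DBModelAction, declared in the reported order.
enum Action { INSERT, SELECT, ALL }

final class ActionOrder {
    // Explicit evaluation order that tries the most common action first.
    static final List<Action> SELECT_FIRST = Collections.unmodifiableList(
        Arrays.asList(Action.SELECT, Action.INSERT, Action.ALL));

    // EnumSet iteration follows declaration order: INSERT, SELECT, ALL.
    static List<Action> declarationOrder() {
        return new ArrayList<>(EnumSet.allOf(Action.class));
    }

    // Counts how many candidates are tried before the granted action matches.
    static int checksUntilMatch(List<Action> order, Action granted) {
        int checks = 0;
        for (Action candidate : order) {
            checks++;
            if (candidate == granted) {
                return checks;
            }
        }
        return checks;
    }
}
```

For a SELECT-only user, the explicit list matches on the first candidate
while the EnumSet order needs two.
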
>
> I believe pre-building Privilege objects at load time will greatly
> improve Sentry's performance.
> A code snippet is available here; it is proof-of-concept code.
>
> http://pastebin.com/S7uQnTN7
>
> If everyone agrees that this could be a good idea, I think I can
> contribute some code.
>
> Thanks
>      Donghoon Han
>


-- 
Sincerely, Donghoon Han
