Hi Lina,

> Glad I can help. Do you know what configuration caused the columns not
> parsed by Hive? If it is due to SessionState.get().isAuthorizationModeV2()
> == false?

Yes exactly - I'm using the V1 binding.

Colm.

> Thanks,
>
> Lina
>
> On Fri, Jan 5, 2018 at 6:12 AM, Colm O hEigeartaigh <[email protected]>
> wrote:
>
>> Hi Lina,
>>
>> Thanks a lot for your help on this! I was able to get the test to work by
>> adding the following config option:
>>
>> conf.set(HiveConf.ConfVars.HIVE_STATS_COLLECT_SCANCOLS.varname, "true");
>>
>> Colm.
>>
>> On Thu, Jan 4, 2018 at 10:06 PM, Na Li <[email protected]> wrote:
>>
>> > Colm,
>> >
>> > The following code shows where Hive sets the column info. You can debug
>> > into hive code and see why AccessedColumns is not set.
>> >
>> > The related code is in org.apache.hadoop.hive.ql.parse.SemanticAnalyzer
>> >
>> > boolean isColumnInfoNeedForAuth = SessionState.get().isAuthorizationModeV2()
>> >     && HiveConf.getBoolVar(this.conf, ConfVars.HIVE_AUTHORIZATION_ENABLED);
>> > if (isColumnInfoNeedForAuth
>> >     || HiveConf.getBoolVar(this.conf, ConfVars.HIVE_STATS_COLLECT_SCANCOLS)) {
>> >   ColumnAccessAnalyzer columnAccessAnalyzer = new ColumnAccessAnalyzer(pCtx);
>> >   this.setColumnAccessInfo(
>> >       columnAccessAnalyzer.analyzeColumnAccess(this.getColumnAccessInfo()));
>> > }
>> >
>> > this.LOG.info("Completed plan generation");
>> > if (HiveConf.getBoolVar(this.conf, ConfVars.HIVE_STATS_COLLECT_SCANCOLS)) {
>> >   this.putAccessedColumnsToReadEntity(this.inputs, this.columnAccessInfo);
>> > }
>> >
>> >
>> > On Wed, Jan 3, 2018 at 11:28 PM, Na Li <[email protected]> wrote:
>> >
>> >> Colm,
>> >>
>> >> I tried to reproduce your issue using sentry 2.0 (master branch) with
>> >> Hive 2.3.2.
>> >>
>> >> The test code is
>> >>
>> >> @Test
>> >> public void testPositiveOnAll() throws Exception {
>> >>   Connection connection = context.createConnection(ADMIN1);
>> >>   Statement statement = context.createStatement(connection);
>> >>   statement.execute("CREATE database " + DB1);
>> >>   statement.execute("use " + DB1);
>> >>   statement.execute("CREATE TABLE t1 (c1 string, c2 string)");
>> >>   statement.execute("CREATE ROLE user_role1");
>> >>   statement.execute("GRANT SELECT ON TABLE t1 TO ROLE user_role1");
>> >>   statement.execute("GRANT ROLE user_role1 TO GROUP " + USERGROUP1);
>> >>   statement.close();
>> >>   connection.close();
>> >>
>> >>   connection = context.createConnection(USER1_1);
>> >>   statement = context.createStatement(connection);
>> >>   statement.execute("use " + DB1);
>> >>   statement.execute("SELECT * FROM t1");
>> >>
>> >>   statement.close();
>> >>   connection.close();
>> >> }
>> >>
>> >>
>> >> required privileges:
>> >>
>> >> - Server=server1->Db=db_1->Table=t1->Column=c1->action=select
>> >> - Server=server1->Db=db_1->Table=t1->Column=c2->action=select
>> >>
>> >>
>> >> cached privilege:
>> >>
>> >> - server=server1->db=db_1->table=t1->action=select
>> >>
>> >> So the authorization works.
>> >>
>> >> Note
>> >>
>> >> - For me, the "SELECT * FROM t1" causes the required privileges to
>> >>   contain each column explicitly. However, for you, The "privilege" to
>> >>   check looks like:
>> >>   Server=server1->Db=authz->Table=words->action=select; The columns are
>> >>   not explicitly listed. Hive controls if the column is included in
>> >>   required privilege. At
>> >>   org.apache.sentry.binding.hive.authz.HiveAuthzBindingHookBase.authorizeWithHiveBindings ->
>> >>   getInputHierarchyFromInputs -> addColumnHierarchy, Sentry uses
>> >>   accessedColumns from Hive input to add colHierarchy for each column.
>> >>   You can check if accessedColumns is empty or null for the hive
>> >>   version you are using.
>> >> - For me, the cached privilege does not include column part. For you,
>> >>   the cached privilege is
>> >>   "Server=server1->Db=authz->Table=words->Column=*->action=select".
>> >>   *Can you share your test code*, so I can see how you grant the
>> >>   privilege and therefore the cached privilege contains column?
>> >> - I tried to use "GRANT SELECT(*) ON TABLE t1 TO ROLE user_role1", and
>> >>   got following error
>> >>
>> >>   2018-01-03 23:23:50,459 (HiveServer2-Handler-Pool: Thread-212)
>> >>   [WARN - org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:539)]
>> >>   Error executing statement:
>> >>   org.apache.hive.service.cli.HiveSQLException: Error while compiling
>> >>   statement: FAILED: ParseException line 1:6 cannot recognize input
>> >>   near 'GRANT' 'SELECT' '(' in ddl statement
>> >>     at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
>> >>     at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206)
>> >>     at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:290)
>> >>     at org.apache.hive.service.cli.operation.Operation.run(Operation.java:320)
>> >>     at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
>> >>
>> >> Thanks,
>> >>
>> >> Lina
>> >>
>> >> On Mon, Dec 18, 2017 at 10:14 AM, Colm O hEigeartaigh
>> >> <[email protected]> wrote:
>> >>
>> >>> Thanks Kalyan! I was thinking that if the cached privilege part does not
>> >>> appear in the requested "part", and if is "all", then we should skip that
>> >>> part and continue on to the next one. But maybe there is a better
>> >>> solution.
>> >>>
>> >>> Colm.
>> >>>
>> >>> On Mon, Dec 18, 2017 at 4:06 PM, Kalyan Kumar Kalvagadda
>> >>> <[email protected]> wrote:
>> >>>
>> >>> > Colm,
>> >>> >
>> >>> > I will look closer into this today and see If i can help you out.
>> >>> >
>> >>> > -Kalyan
>> >>> >
>> >>> > On Mon, Dec 18, 2017 at 4:52 AM, Colm O hEigeartaigh
>> >>> > <[email protected]> wrote:
>> >>> >
>> >>> >> Hi,
>> >>> >>
>> >>> >> I've done some further analysis of the problem, and I think it is not
>> >>> >> directly related to SENTRY-1291. The problem manifests in
>> >>> >> CommonPrivilege.implies(privilege, model). My (cached) privilege looks
>> >>> >> like:
>> >>> >>
>> >>> >> Server=server1->Db=authz->Table=words->Column=*->action=select
>> >>> >>
>> >>> >> The "privilege" I want to check looks like:
>> >>> >>
>> >>> >> Server=server1->Db=authz->Table=words->action=select;
>> >>> >>
>> >>> >> The problem is in the "for" loop in CommonPrivilege.implies. It loops on
>> >>> >> the parts of the second privilege, and matches up to "action=select".
>> >>> >> Here it tries to compare to "Column=*" of the cached privilege and fails
>> >>> >> on this line:
>> >>> >>
>> >>> >> https://github.com/apache/sentry/blob/a4924edc79b26f937e3e5ea3584f0b4307dd4135/sentry-policy/sentry-policy-common/src/main/java/org/apache/sentry/policy/common/CommonPrivilege.java#L86
>> >>> >>
>> >>> >> It's clear there's a bug here somewhere, but I'm not sure where - can
>> >>> >> someone please advise?
>> >>> >>
>> >>> >> Thanks,
>> >>> >>
>> >>> >> Colm.
>> >>> >>
>> >>> >> On Wed, Dec 13, 2017 at 8:28 PM, Na Li <[email protected]> wrote:
>> >>> >>
>> >>> >> > Sasha,
>> >>> >> >
>> >>> >> > sentry-1291 is helpful for the problem that sentry privilege checks
>> >>> >> > takes too long with many explicit grants, which is useful for big
>> >>> >> > customers.
>> >>> >> > Another approach that can improve the performance is to organize the
>> >>> >> > privileges according to the authorization hierarchy in a tree
>> >>> >> > structure, so finding match in
>> >>> >> > ResourceAuthorizationProvider.doHasAccess() is in the order of
>> >>> >> > log(N), not linear of N, where N is the number of privileges.
>> >>> >> >
>> >>> >> > We can wait for Colm to confirm his issue is caused by sentry-1291. If
>> >>> >> > so, it may be fixed by selecting privileges by finding if the
>> >>> >> > requesting authorization object is prefix of cached privileges
>> >>> >> > instead of exact match.
>> >>> >> >
>> >>> >> > in SimplePrivilegeCache
>> >>> >> >
>> >>> >> > public Set<String> listPrivileges(Set<String> groups, Set<String> users,
>> >>> >> >     ActiveRoleSet roleSet, Authorizable... authorizationHierarchy) {
>> >>> >> >   Set<String> privileges = new HashSet<>();
>> >>> >> >   Set<StringBuilder> authzKeys = getAuthzKeys(authorizationHierarchy);
>> >>> >> >   for (StringBuilder authzKey : authzKeys) {
>> >>> >> >     if (cachedAuthzPrivileges.get(authzKey.toString()) != null) {
>> >>> >> >       // <- instead of exact matching, add extension function to check if
>> >>> >> >       // authzKey.toString is the prefix of the key of the entries
>> >>> >> >       // in cachedAuthzPrivileges.
>> >>> >> >       privileges.addAll(cachedAuthzPrivileges.get(authzKey.toString()));
>> >>> >> >     }
>> >>> >> >   }
>> >>> >> >
>> >>> >> >   return privileges;
>> >>> >> > }
>> >>> >> >
>> >>> >> > Thanks,
>> >>> >> >
>> >>> >> > Lina
>> >>> >> >
>> >>> >> > On Wed, Dec 13, 2017 at 1:08 PM, Alexander Kolbasov <[email protected]>
>> >>> >> > wrote:
>> >>> >> >
>> >>> >> > > I think that SENTRY-1291 should be just reverted - there are multiple
>> >>> >> > > issues with it and no one is actually using the fix. Anyone wants to do
>> >>> >> > > it?
>> >>> >> > >
>> >>> >> > > - Alex
>> >>> >> > >
>> >>> >> > > On Wed, Dec 13, 2017 at 4:44 AM, Na Li <[email protected]> wrote:
>> >>> >> > >
>> >>> >> > > > Colm,
>> >>> >> > > >
>> >>> >> > > > Glad you find the cause!
>> >>> >> > > >
>> >>> >> > > > You can revert Sentry-1291, and see if it works. If so, it is issue at
>> >>> >> > > > finding cached privileges.
>> >>> >> > > >
>> >>> >> > > > Cheers,
>> >>> >> > > >
>> >>> >> > > > Lina
>> >>> >> > > >
>> >>> >> > > > Sent from my iPhone
>> >>> >> > > >
>> >>> >> > > > > On Dec 13, 2017, at 4:58 AM, Colm O hEigeartaigh <[email protected]>
>> >>> >> > > > > wrote:
>> >>> >> > > > >
>> >>> >> > > > > Hi,
>> >>> >> > > > >
>> >>> >> > > > > I can see what the problem is (that the authorization hierarchy does
>> >>> >> > > > > not contain the column, and hence doesn't match against the cached
>> >>> >> > > > > privilege), but I'm not sure about the best way to solve it. Either the
>> >>> >> > > > > way we are creating the authorization hierarchy is incorrect (e.g. in
>> >>> >> > > > > HiveAuthzBindingHookBase) or else the way we are parsing the cached
>> >>> >> > > > > privilege is incorrect (e.g. in SimplePrivilegeCache/CommonPrivilege).
>> >>> >> > > > >
>> >>> >> > > > > Colm.
>> >>> >> > > > >
>> >>> >> > > > >> On Wed, Dec 13, 2017 at 5:57 AM, Na Li <[email protected]> wrote:
>> >>> >> > > > >>
>> >>> >> > > > >> Colm,
>> >>> >> > > > >>
>> >>> >> > > > >> I did not get chance to look into this issue today. Sorry about that.
>> >>> >> > > > >>
>> >>> >> > > > >> You can add a e2e test case and set break point at where the
>> >>> >> > > > >> authorization object hierarchy to a list of authorization objects,
>> >>> >> > > > >> which is used to do exact match with cache
>> >>> >> > > > >>
>> >>> >> > > > >> Sent from my iPhone
>> >>> >> > > > >>
>> >>> >> > > > >>> On Dec 12, 2017, at 11:27 AM, Colm O hEigeartaigh <[email protected]>
>> >>> >> > > > >>> wrote:
>> >>> >> > > > >>>
>> >>> >> > > > >>> That would be great, thanks!
>> >>> >> > > > >>>
>> >>> >> > > > >>> Colm.
>> >>> >> > > > >>>
>> >>> >> > > > >>>> On Tue, Dec 12, 2017 at 4:36 PM, Na Li <[email protected]> wrote:
>> >>> >> > > > >>>>
>> >>> >> > > > >>>> Colm,
>> >>> >> > > > >>>>
>> >>> >> > > > >>>> I suspect it is a bug in SENTRY-1291. I can take a look later today.
>> >>> >> > > > >>>>
>> >>> >> > > > >>>> Thanks,
>> >>> >> > > > >>>>
>> >>> >> > > > >>>> Lina
>> >>> >> > > > >>>>
>> >>> >> > > > >>>> On Tue, Dec 12, 2017 at 4:32 AM, Colm O hEigeartaigh <[email protected]>
>> >>> >> > > > >>>> wrote:
>> >>> >> > > > >>>>
>> >>> >> > > > >>>>> Hi all,
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>> I've updated some local testcases to work with Sentry 2.0.0 and the
>> >>> >> > > > >>>>> "v1" Hive binding (previously working fine using 1.8.0 and the "v2"
>> >>> >> > > > >>>>> binding).
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>> I have a simple table called "words" (word STRING, count INT). I am
>> >>> >> > > > >>>>> making an SQL call as the user "bob", e.g. "SELECT * FROM words where
>> >>> >> > > > >>>>> count == '100'".
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>> "bob" is in the "manager" group", which has the following role:
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>> select_all_role =
>> >>> >> > > > >>>>> Server=server1->Db=authz->Table=words->Column=*->action=select
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>> Essentially, authorization is denied even though the policy is correct.
>> >>> >> > > > >>>>> If I look at the SimplePrivilegeCache, the cached privilege is:
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>> server=server1->db=authz->table=words->column=*=[Server=server1->Db=authz->Table=words->Column=*->action=select]
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>> However, when "listPrivileges" is called, the authorizable hierarchy
>> >>> >> > > > >>>>> looks like:
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>> Server [name=server1]
>> >>> >> > > > >>>>> Database [name=authz]
>> >>> >> > > > >>>>> Table [name=words]
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>> There is no "column" here, and a match is not made against the cached
>> >>> >> > > > >>>>> privilege as a result. Is this a bug or am I missing some configuration
>> >>> >> > > > >>>>> switch?
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>> Colm.
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>> --
>> >>> >> > > > >>>>> Colm O hEigeartaigh
>> >>> >> > > > >>>>>
>> >>> >> > > > >>>>> Talend Community Coder
>> >>> >> > > > >>>>> http://coders.talend.com

--
Colm O hEigeartaigh

Talend Community Coder
http://coders.talend.com
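
P.S. For anyone finding this thread later, here is a rough, self-contained sketch of why the V1 binding needed the extra option. It only models the two conditions quoted from SemanticAnalyzer in this thread; it does not call Hive or Sentry. The class ScanColsSketch and its methods are made-up names, and requiredPrivileges is a hypothetical stand-in for how the binding turns accessed columns into required privileges.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative model only; none of these names exist in Hive or Sentry.
class ScanColsSketch {

    // Mirrors the first quoted condition: when ColumnAccessAnalyzer is run.
    static boolean analyzesColumnAccess(boolean authV2, boolean authEnabled, boolean scanCols) {
        return (authV2 && authEnabled) || scanCols;
    }

    // Mirrors the second quoted condition: accessed columns are only copied
    // onto the read entities when the scan-columns option is true.
    static boolean putsColumnsOnReadEntity(boolean scanCols) {
        return scanCols;
    }

    // Hypothetical stand-in for the binding: one required privilege per
    // accessed column, or a single table-level one when no columns are known.
    static List<String> requiredPrivileges(String table, List<String> accessedColumns) {
        List<String> required = new ArrayList<>();
        if (accessedColumns.isEmpty()) {
            required.add(table + "->action=select");
        }
        for (String column : accessedColumns) {
            required.add(table + "->Column=" + column + "->action=select");
        }
        return required;
    }

    public static void main(String[] args) {
        String table = "Server=server1->Db=authz->Table=words";
        List<String> columns = Arrays.asList("word", "count");
        for (boolean scanCols : new boolean[] {false, true}) {
            // V1 binding: isAuthorizationModeV2() is false, so only scanCols matters.
            List<String> seen = putsColumnsOnReadEntity(scanCols)
                    ? columns
                    : Collections.<String>emptyList();
            System.out.println("scanCols=" + scanCols + " -> " + requiredPrivileges(table, seen));
        }
    }
}
```

With the option off, the model yields only the table-level privilege, which is the shape that failed to match the Column=* policy; with it on, each column is listed explicitly.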

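
The idea floated earlier in this thread (skip a cached part that is missing from the request when its value is "all") could look roughly like the sketch below. This is a simplified model written for illustration, not Sentry's CommonPrivilege code: ImpliesSketch, parse, isWildcard and the skipWildcardParts flag are invented names, treating both "*" and "all" as wildcards is an assumption, and action inheritance is ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of a part-by-part privilege comparison.
class ImpliesSketch {

    // Splits "Server=server1->Db=authz" into {key, value} pairs with lower-cased keys.
    static List<String[]> parse(String privilege) {
        List<String[]> parts = new ArrayList<>();
        for (String part : privilege.split("->")) {
            String[] kv = part.split("=", 2);
            parts.add(new String[] {kv[0].trim().toLowerCase(), kv[1].trim()});
        }
        return parts;
    }

    // Assumption: "*" and "all" both mean "everything at this level".
    static boolean isWildcard(String value) {
        return value.equals("*") || value.equalsIgnoreCase("all");
    }

    // Returns true if the cached privilege covers the requested one.
    // With skipWildcardParts=false the keys must line up position by position.
    static boolean implies(String cached, String requested, boolean skipWildcardParts) {
        List<String[]> have = parse(cached);
        List<String[]> want = parse(requested);
        int i = 0;
        for (String[] w : want) {
            // Proposed tweak: step over cached wildcard parts the request does not mention.
            while (skipWildcardParts && i < have.size()
                    && !have.get(i)[0].equals(w[0]) && isWildcard(have.get(i)[1])) {
                i++;
            }
            if (i >= have.size()) {
                return true; // cached privilege is shorter, so it covers the rest
            }
            String[] h = have.get(i++);
            if (!h[0].equals(w[0])) {
                return false; // key mismatch, e.g. "column" vs "action"
            }
            if (!isWildcard(h[1]) && !h[1].equalsIgnoreCase(w[1])) {
                return false; // same key, different value
            }
        }
        // Any cached parts left over must be wildcards.
        for (; i < have.size(); i++) {
            if (!isWildcard(have.get(i)[1])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String cached = "Server=server1->Db=authz->Table=words->Column=*->action=select";
        String requested = "Server=server1->Db=authz->Table=words->action=select";
        System.out.println("strict: " + implies(cached, requested, false));
        System.out.println("skip wildcard parts: " + implies(cached, requested, true));
    }
}
```

In this model the strict comparison rejects the table-level request against the Column=* policy, while the skipping variant accepts it. Whether that is the right behaviour for Sentry itself is a separate question; the configuration change was enough for my test.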