[ 
https://issues.apache.org/jira/browse/SENTRY-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261307#comment-16261307
 ] 

Sergio Peña commented on SENTRY-2024:
-------------------------------------

[~LinaAtAustin] I understand that indexes are useful when we DO too many 
queries but they are HARMFUL when the table has many inserts and updates. In 
this case, Sentry is querying the table before doing an insert or update, so 
doesn't this index on the authz_obj cause more overhead than performance 
benefits? 

Also, authz obj names will be unique and there won't be any conflict to keep a 
unique index in the table because Sentry already solves duplicates in a 
different way, do we still need a unique index here?

Is there any other reason to keep the index?

> Specify Char Set for AUTHZ_OBJ_NAME
> -----------------------------------
>
>                 Key: SENTRY-2024
>                 URL: https://issues.apache.org/jira/browse/SENTRY-2024
>             Project: Sentry
>          Issue Type: Bug
>          Components: Sentry
>    Affects Versions: 2.0.0
>            Reporter: Na Li
>            Assignee: Na Li
>            Priority: Critical
>         Attachments: SENTRY-2024.001.patch, SENTRY-2024.001.patch, 
> SENTRY-2024.001.patch, SENTRY-2024.002.patch, SENTRY-2024.002.patch
>
>
> AUTHZ_OBJ_NAME has (384) chars. It is constructed by sentry from notification 
> event as {color:red}DB_NAME + "." + TBL_NAME{color}. To be consistent with 
> hive, sentry should use the same char set as what's used by DB_NAME  and 
> TBL_NAME
> Besides, if its table char set is utf8, the constrain AUTHZOBJNAME will have 
> error  "Specified key was too long; max key length is 767 bytes" for mysql.
> The solution is to specify the char set for this field, so it works for mysql 
> regardless the char set of its table or DB.
> Reference:
> 1) CREATE TABLE AUTHZ_PATHS_MAPPING
>  (
>      AUTHZ_OBJ_ID BIGINT NOT NULL generated always as identity (start with 1),
>      AUTHZ_OBJ_NAME VARCHAR({color:red}384{color}),
>      CREATE_TIME_MS BIGINT NOT NULL
> );
> CREATE UNIQUE INDEX AUTHZOBJNAME ON AUTHZ_PATHS_MAPPING 
> ({color:red}AUTHZ_OBJ_NAME{color});
> 2) AUTHZ_OBJ_NAME  is constructed by sentry from notification event as 
> DB_NAME + "." + TBL_NAME. To be consistent with hive, sentry should use the 
> same char set as what's used by DB_NAME  and TBL_NAME in 
> NotificationProcessor.processAlterTable().
> 3) hive uses latin1 as default charset for notification event.
> CREATE TABLE IF NOT EXISTS `NOTIFICATION_LOG`
> (
>     `NL_ID` BIGINT(20) NOT NULL,
>     `EVENT_ID` BIGINT(20) NOT NULL,
>     `EVENT_TIME` INT(11) NOT NULL,
>     `EVENT_TYPE` varchar(32) NOT NULL,
>     `DB_NAME` varchar(128),
>     `TBL_NAME` varchar(128),
>     `MESSAGE` mediumtext,
>     PRIMARY KEY (`NL_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
> 4) actual installation shows both fields are in latin1 char set.
> SHOW FULL COLUMNS FROM NOTIFICATION_LOG;
> +------------+--------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
> | Field      | Type         | Collation         | Null | Key | Default | 
> Extra | Privileges                      | Comment |
> +------------+--------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
> | NL_ID      | bigint(20)   | NULL              | NO   | PRI | NULL    |      
>  | select,insert,update,references |         |
> | EVENT_ID   | bigint(20)   | NULL              | NO   |     | NULL    |      
>  | select,insert,update,references |         |
> | EVENT_TIME | int(11)      | NULL              | NO   |     | NULL    |      
>  | select,insert,update,references |         |
> | EVENT_TYPE | varchar(32)  | latin1_swedish_ci | NO   |     | NULL    |      
>  | select,insert,update,references |         |
> | DB_NAME    | varchar(128) | latin1_swedish_ci | YES  |     | NULL    |      
>  | select,insert,update,references |         |
> | TBL_NAME   | varchar(128) | latin1_swedish_ci | YES  |     | NULL    |      
>  | select,insert,update,references |         |
> | MESSAGE    | mediumtext   | latin1_swedish_ci | YES  |     | NULL    |      
>  | select,insert,update,references |         |
> +------------+--------------+-------------------+------+-----+---------+-------+---------------------------------+---------+



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to