VenuReddy2103 commented on code in PR #4569:
URL: https://github.com/apache/hive/pull/4569#discussion_r1353584451


##########
standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0-beta-2.mssql.sql:
##########
@@ -661,6 +661,8 @@ ALTER TABLE NOTIFICATION_LOG ADD CONSTRAINT 
NOTIFICATION_LOG_PK PRIMARY KEY (NL_
 
 CREATE UNIQUE INDEX NOTIFICATION_LOG_EVENT_ID ON NOTIFICATION_LOG (EVENT_ID);
 
+CREATE UNIQUE INDEX NOTIFICATION_LOG_UNIQUE_DB ON NOTIFICATION_LOG (DB_NAME, 
EVENT_ID);

Review Comment:
   I agree with @dengzhhu653 
   Optimizer would check the potential indexes available and decides to choose 
one(could be single index or composite index) among them or not to choose any 
index based on various factors. Since we have index with single 
column(`EVENT_ID`), in query having where clause with eventId + may be dbName + 
may be table name etc, optimizer detects the eventId index as potential 
candidate. Have captured explain queries on mysql for index used and not used 
cases.
   Index used:
   ```
   mysql> explain select * from NOTIFICATION_LOG where db_name='mydb' and 
event_id>8;
   
+----+-------------+------------------+------------+-------+---------------------------+---------------------------+---------+------+------+----------+------------------------------------+
   | id | select_type | table            | partitions | type  | possible_keys   
          | key                       | key_len | ref  | rows | filtered | 
Extra                              |
   
+----+-------------+------------------+------------+-------+---------------------------+---------------------------+---------+------+------+----------+------------------------------------+
   |  1 | SIMPLE      | NOTIFICATION_LOG | NULL       | range | 
NOTIFICATION_LOG_EVENT_ID | NOTIFICATION_LOG_EVENT_ID | 8       | NULL |    3 | 
   10.00 | Using index condition; Using where |
   
+----+-------------+------------------+------------+-------+---------------------------+---------------------------+---------+------+------+----------+------------------------------------+
   1 row in set, 1 warning (0.00 sec)
   
   mysql> explain select * from NOTIFICATION_LOG where db_name='mydb' and 
tbl_name='stage' and event_id>8;
   
+----+-------------+------------------+------------+-------+---------------------------+---------------------------+---------+------+------+----------+------------------------------------+
   | id | select_type | table            | partitions | type  | possible_keys   
          | key                       | key_len | ref  | rows | filtered | 
Extra                              |
   
+----+-------------+------------------+------------+-------+---------------------------+---------------------------+---------+------+------+----------+------------------------------------+
   |  1 | SIMPLE      | NOTIFICATION_LOG | NULL       | range | 
NOTIFICATION_LOG_EVENT_ID | NOTIFICATION_LOG_EVENT_ID | 8       | NULL |    3 | 
    9.09 | Using index condition; Using where |
   
+----+-------------+------------------+------------+-------+---------------------------+---------------------------+---------+------+------+----------+------------------------------------+
   1 row in set, 1 warning (0.00 sec)
   
   ```
   Index doesn't seem to be used here:
   ```
   mysql> explain select * from NOTIFICATION_LOG where event_id>12 and 
db_name='mydb';
   
+----+-------------+------------------+------------+------+---------------------------+------+---------+------+------+----------+-------------+
   | id | select_type | table            | partitions | type | possible_keys    
         | key  | key_len | ref  | rows | filtered | Extra       |
   
+----+-------------+------------------+------------+------+---------------------------+------+---------+------+------+----------+-------------+
   |  1 | SIMPLE      | NOTIFICATION_LOG | NULL       | ALL  | 
NOTIFICATION_LOG_EVENT_ID | NULL | NULL    | NULL |   29 |     5.86 | Using 
where |
   
+----+-------------+------------------+------------+------+---------------------------+------+---------+------+------+----------+-------------+
   1 row in set, 1 warning (0.00 sec)
   ```
   Have found some reference from mysql but this seem to be not very specific 
to mysql.
   
https://dev.mysql.com/doc/refman/8.0/en/multiple-column-indexes.html#:~:text=If%20the%20table%20has%20a%20multiple,of%20(col1%2C%20col2%2C%20col3).
   
https://dev.mysql.com/doc/refman/8.0/en/index-condition-pushdown-optimization.html#:~:text=Get%20the%20next%20row%27s,on%20the%20test%20result.
   
   Some dbs seem to provide a way to specify index hints in query explicilty to 
always use an index. 
   
   If we add the dbname + eventId composite index, index updation/maintence 
overhead may be high because index data need to be sorted by dbname first and 
then by eventIds within the db on the huge notification log? It may improve the 
query performance though.
   
   May be good to compare the tradeoffs between using the eventId based index 
vs dbname + eventId composite index ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to