Karthik Manamcheri created HIVE-20977:
-----------------------------------------

             Summary: Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance
                 Key: HIVE-20977
                 URL: https://issues.apache.org/jira/browse/HIVE-20977
             Project: Hive
          Issue Type: Improvement
            Reporter: Karthik Manamcheri
            Assignee: Karthik Manamcheri


The PreReadTableEvent is generated even for non-table operations (such as 
get_partitions), but only if an event listener is attached. However, fetching 
the table for the event is still unnecessary when the attached listener is not 
interested in the read table event.

For example, the TransactionalValidationListener's onEvent looks like this:
{code:java}
@Override
public void onEvent(PreEventContext context)
    throws MetaException, NoSuchObjectException, InvalidOperationException {
  switch (context.getEventType()) {
    case CREATE_TABLE:
      handle((PreCreateTableEvent) context);
      break;
    case ALTER_TABLE:
      handle((PreAlterTableEvent) context);
      break;
    default:
      //no validation required..
  }
}{code}
 

Note that for read table events it is a no-op. The problem is that get_table 
is still evaluated when the PreReadTableEvent is created, only for the result 
to be ignored!

In the code below, {{getMS().getTable(..)}} is evaluated regardless of whether 
the listener uses the result.
{code:java}
private void fireReadTablePreEvent(String catName, String dbName, String tblName)
    throws MetaException, NoSuchObjectException {
  if(preListeners.size() > 0) {
    // do this only if there is a pre event listener registered (avoid unnecessary
    // metastore api call)
    Table t = getMS().getTable(catName, dbName, tblName);
    if (t == null) {
      throw new NoSuchObjectException(TableName.getQualified(catName, dbName, tblName)
          + " table not found");
    }
    firePreEvent(new PreReadTableEvent(t, this));
  }
}
{code}
This can be improved by using a {{Supplier}} and lazily evaluating the table 
only when it is needed (computed on the first call and memoized after that).
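A minimal sketch of the memoizing {{Supplier}} idea follows. It is an illustration under assumptions, not the actual patch: {{LazyTableDemo}}, {{MemoizingSupplier}}, {{memoize}} and {{LazyReadTableEvent}} are hypothetical names, and a {{String}} stands in for the metastore {{Table}} object so the example is self-contained.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class LazyTableDemo {

  /** Hypothetical helper: runs the delegate at most once and caches the result. */
  static final class MemoizingSupplier<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private boolean computed;
    private T value;

    MemoizingSupplier(Supplier<T> delegate) {
      this.delegate = delegate;
    }

    @Override
    public synchronized T get() {
      if (!computed) {
        // First call: perform the (potentially expensive) lookup.
        value = delegate.get();
        computed = true;
      }
      return value;
    }
  }

  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new MemoizingSupplier<>(delegate);
  }

  /** Hypothetical stand-in for an event that holds the table lazily. */
  static final class LazyReadTableEvent {
    private final Supplier<String> table;

    LazyReadTableEvent(Supplier<String> tableLookup) {
      this.table = memoize(tableLookup);
    }

    String getTable() {
      return table.get();
    }
  }

  public static void main(String[] args) {
    AtomicInteger lookups = new AtomicInteger();

    // The lambda stands in for a call such as getMS().getTable(...).
    LazyReadTableEvent event = new LazyReadTableEvent(() -> {
      lookups.incrementAndGet();
      return "default.sales";
    });

    // A listener that ignores the event never triggers the lookup.
    System.out.println(lookups.get()); // prints 0

    // A listener that reads the table triggers exactly one lookup.
    event.getTable();
    event.getTable();
    System.out.println(lookups.get()); // prints 1
  }
}
```

One design point to keep in mind: {{java.util.function.Supplier#get}} cannot throw checked exceptions, so a real change would need some way to surface {{MetaException}} and {{NoSuchObjectException}} from the deferred lookup, for example by wrapping them or by using a custom functional interface.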

*Motivation*

Whenever a partition call occurs (get_partition, etc.), we fire the 
PreReadTableEvent. This hurts performance because the table is fetched even 
when no listener uses it. Evaluating the table lazily avoids that fetch and 
should improve the performance of get_partition calls.

 



