Karthik Manamcheri created HIVE-20977:
-----------------------------------------

             Summary: Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance
                 Key: HIVE-20977
                 URL: https://issues.apache.org/jira/browse/HIVE-20977
             Project: Hive
          Issue Type: Improvement
            Reporter: Karthik Manamcheri
            Assignee: Karthik Manamcheri


The PreReadTableEvent is generated for non-table operations (such as get_partitions), but only if there is an event listener attached. However, generating it is also unnecessary if the event listener is not interested in the read table event. For example, the TransactionalValidationListener's onEvent looks like this:

{code:java}
@Override
public void onEvent(PreEventContext context) throws MetaException, NoSuchObjectException,
    InvalidOperationException {
  switch (context.getEventType()) {
    case CREATE_TABLE:
      handle((PreCreateTableEvent) context);
      break;
    case ALTER_TABLE:
      handle((PreAlterTableEvent) context);
      break;
    default:
      //no validation required..
  }
}
{code}

Note that for read table events it is a no-op. The problem is that get_table is evaluated when creating the PreReadTableEvent, only for the result to be ignored. In the code below, {{getMS().getTable(..)}} is evaluated regardless of whether the listener uses it.

{code:java}
private void fireReadTablePreEvent(String catName, String dbName, String tblName)
    throws MetaException, NoSuchObjectException {
  if(preListeners.size() > 0) {
    // do this only if there is a pre event listener registered (avoid unnecessary
    // metastore api call)
    Table t = getMS().getTable(catName, dbName, tblName);
    if (t == null) {
      throw new NoSuchObjectException(TableName.getQualified(catName, dbName, tblName)
          + " table not found");
    }
    firePreEvent(new PreReadTableEvent(t, this));
  }
}
{code}

This can be improved by using a {{Supplier}} and lazily evaluating the table when needed (evaluated once, the first time it is called, and memoized after that).

*Motivation*

Whenever a partition call occurs (get_partition, etc.), we fire the PreReadTableEvent.
This affects performance since it fetches the table even if it is not being used. This change will improve performance on the get_partition calls.
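
To illustrate the {{Supplier}} idea described above, here is a minimal, self-contained sketch of a memoizing supplier. It is not the actual Hive patch: {{MemoizingSupplier}}, {{LazyReadTableEvent}}, and the {{String}} used in place of a {{Table}} are hypothetical names chosen for the example, and the lambda only stands in for {{getMS().getTable(..)}}.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class LazyTableDemo {

  /** Hypothetical helper: wraps a Supplier and caches its first result. */
  static final class MemoizingSupplier<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private boolean evaluated; // guarded by "this"
    private T value;

    MemoizingSupplier(Supplier<T> delegate) {
      this.delegate = delegate;
    }

    @Override
    public synchronized T get() {
      if (!evaluated) {
        // The expensive call happens here, and only on the first successful get().
        value = delegate.get();
        evaluated = true;
      }
      return value;
    }
  }

  /** Hypothetical stand-in for an event that carries a lazily loaded table. */
  static final class LazyReadTableEvent {
    private final Supplier<String> table;

    LazyReadTableEvent(Supplier<String> tableLoader) {
      this.table = new MemoizingSupplier<>(tableLoader);
    }

    String getTable() {
      return table.get();
    }
  }

  public static void main(String[] args) {
    AtomicInteger fetches = new AtomicInteger();
    LazyReadTableEvent event = new LazyReadTableEvent(() -> {
      fetches.incrementAndGet(); // stands in for getMS().getTable(..)
      return "default.sales";
    });

    // A listener that ignores the event never triggers the fetch.
    System.out.println("fetches before use: " + fetches.get()); // prints 0

    // A listener that reads the table twice triggers exactly one fetch.
    event.getTable();
    event.getTable();
    System.out.println("fetches after two reads: " + fetches.get()); // prints 1
  }
}
```

One caveat for a real change: {{java.util.function.Supplier.get()}} cannot throw checked exceptions, while {{getTable}} is declared to throw {{MetaException}} and {{NoSuchObjectException}}, so an actual implementation would need either a wrapper exception or a custom functional interface.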