[jira] [Created] (IMPALA-11028) Table loading could fail if metastore cleans up old events

2021-11-18 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created IMPALA-11028:


             Summary: Table loading could fail if metastore cleans up old events
                 Key: IMPALA-11028
                 URL: https://issues.apache.org/jira/browse/IMPALA-11028
             Project: IMPALA
          Issue Type: Bug
            Reporter: Vihang Karajgaonkar
            Assignee: Vihang Karajgaonkar


After IMPALA-10502, catalogd tracks each table's create event id. When the
table is loaded for the first time, catalogd updates the table's create event
id. But if that first load happens after a long delay (more than 24 hrs by
default), the metastore may already have cleaned up old notification log
entries that catalogd needs during the table load.

See this snippet from TableLoader.java:
{noformat}
  if (eventId != -1 && catalog_.isEventProcessingActive()) {
    // If the eventId is not -1 it means this table was likely created by Impala.
    // However, since the load operation of the table can happen much later, it is
    // possible that the table was recreated outside Impala and hence the eventId
    // which is stored in the loaded table needs to be updated to the latest.
    // we are only interested in fetching the events if we have a valid eventId
    // for a table. For tables where eventId is unknown are not created by
    // this catalogd and hence the self-event detection logic does not apply.
    events = MetastoreEventsProcessor.getNextMetastoreEvents(catalog_, eventId,
        notificationEvent -> CreateTableEvent.CREATE_TABLE_EVENT_TYPE
            .equals(notificationEvent.getEventType())
            && notificationEvent.getDbName().equalsIgnoreCase(db.getName())
            && notificationEvent.getTableName().equalsIgnoreCase(tblName));
  }
{noformat}

The {{getNextMetastoreEvents}} method can throw an exception if the metastore
has already cleaned up the older entries (by default after 24 hrs). The
retention period is controlled by the configuration
{{hive.metastore.event.db.listener.timetolive}} on the metastore side.
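
To make the failure mode concrete, here is a minimal, self-contained sketch of a notification log with time-based cleanup. It is a toy model, not Hive or Impala code: the class {{NotificationLogModel}}, its methods, the exception type, and the timings are all hypothetical, and it assumes that a fetch fails whenever the event directly after the requested id has already been cleaned up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/**
 * Toy model of a metastore notification log with time-based cleanup.
 * All names here are hypothetical; this is not Hive or Impala code.
 */
class NotificationLogModel {
  /** Seconds an event is retained, analogous to the timetolive setting. */
  private final long ttlSeconds;
  /** Event id -> creation time in seconds. */
  private final TreeMap<Long, Long> events = new TreeMap<>();
  private long nextEventId = 1;

  NotificationLogModel(long ttlSeconds) {
    this.ttlSeconds = ttlSeconds;
  }

  /** Appends an event created at nowSeconds and returns its id. */
  long addEvent(long nowSeconds) {
    long id = nextEventId++;
    events.put(id, nowSeconds);
    return id;
  }

  /** Drops every event older than the time-to-live, like a periodic cleaner. */
  void cleanUp(long nowSeconds) {
    events.values().removeIf(createdAt -> nowSeconds - createdAt > ttlSeconds);
  }

  /**
   * Returns the ids of all retained events after lastSeenId. Throws if the
   * event directly following lastSeenId is gone (the assumed failure mode).
   */
  List<Long> getNextEvents(long lastSeenId) {
    List<Long> result = new ArrayList<>(events.tailMap(lastSeenId, false).keySet());
    boolean hasNewerEvents = lastSeenId + 1 < nextEventId;
    if (hasNewerEvents && (result.isEmpty() || result.get(0) != lastSeenId + 1)) {
      throw new IllegalStateException(
          "events after id " + lastSeenId + " were cleaned up");
    }
    return result;
  }

  public static void main(String[] args) {
    NotificationLogModel log = new NotificationLogModel(120);
    long t1CreateId = log.addEvent(0);  // create table t1
    log.addEvent(1);                    // create table t2
    log.cleanUp(200);                   // cleaner runs after the TTL has passed
    log.addEvent(200);                  // create table t3
    try {
      log.getNextEvents(t1CreateId);    // first load of t1 asks for newer events
      System.out.println("load succeeded");
    } catch (IllegalStateException e) {
      System.out.println("load failed: " + e.getMessage());
    }
  }
}
```

In this model the first load of the oldest table fails because the event that followed its create event no longer exists, even though a newer event is still retained. A load issued before the cleaner runs succeeds.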

I could reproduce the problem by setting the following metastore configs:

{noformat}
hive.metastore.event.db.listener.clean.interval=10s
hive.metastore.event.db.listener.timetolive=120s
{noformat}

Now run the following Impala script:
{noformat}
create table t1 (c1 int);
create table t2 (c1 int);
select sleep(24);
create table t3 (c1 int);
select * from t1;
{noformat}





--
This message was sent by Atlassian Jira
(v8.20.1#820001)



