[ 
https://issues.apache.org/jira/browse/IMPALA-10987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434499#comment-17434499
 ] 

Vihang Karajgaonkar commented on IMPALA-10987:
----------------------------------------------

Possible solutions to improve this:
1. In case a table level sync is re-enabled:
  a. If the table exists in Impala, we can just invalidate the table so that it 
is reloaded the first time a query accesses it. This would take care of any 
missing ADD/DROP partition events on the table during the time the events sync 
was disabled on the table.
  b. If the table doesn't exist in Impala, create an incomplete table, if there 
is no entry in the event delete log for this table.
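
A rough sketch of the table level handling described in 1a and 1b, written against a toy in-memory model rather than the real catalog. Every name here (TableSyncReenableSketch, onTableSyncReenabled, the delete log as a plain set of table names) is hypothetical and only meant to show the branching, not how the catalog actually stores tables or delete-log entries.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the catalog. None of these names are
// Impala's real classes or methods.
final class TableSyncReenableSketch {
  enum TableState { LOADED, INCOMPLETE }
  enum Action { INVALIDATED, CREATED_INCOMPLETE, SKIPPED_IN_DELETE_LOG }

  private final Map<String, TableState> tables = new HashMap<>();
  // Assumption: the event delete log is reduced to a set of table names.
  private final Set<String> deleteLog = new HashSet<>();

  void addLoadedTable(String name) { tables.put(name, TableState.LOADED); }

  void dropTable(String name) {
    tables.remove(name);
    deleteLog.add(name);
  }

  TableState stateOf(String name) { return tables.get(name); }

  // Called when an ALTER_TABLE event shows that sync was turned back on.
  Action onTableSyncReenabled(String name) {
    if (tables.containsKey(name)) {
      // 1a: reset to incomplete so the next query access reloads the table.
      tables.put(name, TableState.INCOMPLETE);
      return Action.INVALIDATED;
    }
    if (deleteLog.contains(name)) {
      // 1b: a delete-log entry exists, so do not re-create the table.
      return Action.SKIPPED_IN_DELETE_LOG;
    }
    // 1b: unknown table with no delete-log entry, add an incomplete placeholder.
    tables.put(name, TableState.INCOMPLETE);
    return Action.CREATED_INCOMPLETE;
  }
}
```

In none of the three branches does the event processor need to move to a needs-invalidate state, which is the point of the proposal.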

I am not sure how to handle a database level sync re-enable efficiently. I wish 
we had a {{refresh database}} which would have been useful here. The other 
approach is to invalidate any tables in the database for which sync now 
evaluates to enabled but previously evaluated to disabled. We will still need 
to handle the create/drop table events that were missed during the time window 
when the events sync was disabled.
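
The "invalidate only the tables whose sync flag flipped to on" idea could look roughly like the sketch below. It assumes that a table level impala.disableHmsSync value, when set, overrides the database level value; the class and method names are hypothetical, and the missed create/drop table events are not handled here at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Impala.
final class DbSyncReenableSketch {
  // Assumption: a table level flag, when present (non-null), overrides the
  // database level flag. Both flags mean "sync disabled" when true.
  static boolean syncEnabled(Boolean tableDisabled, boolean dbDisabled) {
    boolean disabled = (tableDisabled != null) ? tableDisabled : dbDisabled;
    return !disabled;
  }

  // Returns, in sorted order, the tables whose effective sync state went from
  // off to on because the database level flag changed.
  // A null map value means the table does not set the property itself.
  static List<String> tablesToInvalidate(
      Map<String, Boolean> tableDisabledFlags,
      boolean oldDbDisabled,
      boolean newDbDisabled) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, Boolean> e : new TreeMap<>(tableDisabledFlags).entrySet()) {
      boolean before = syncEnabled(e.getValue(), oldDbDisabled);
      boolean after = syncEnabled(e.getValue(), newDbDisabled);
      if (!before && after) {
        result.add(e.getKey());
      }
    }
    return result;
  }
}
```

Tables that set the property explicitly are unaffected by the database level change, so only tables that inherit the database value end up in the list.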


> Changing impala.disableHmsSync in Hive can break event processing
> -----------------------------------------------------------------
>
>                 Key: IMPALA-10987
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10987
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Csaba Ringhofer
>            Priority: Major
>
> To reproduce, start Impala with event polling:
> {code}
> bin/start-impala-cluster.py --catalogd_args="--hms_event_polling_interval_s=2 --catalog_topic_mode=minimal" --impalad_args="--use_local_catalog=1"
> {code}
> From Hive:
> {code}
> CREATE DATABASE temp;
> CREATE EXTERNAL TABLE temp.t (i int) PARTITIONED BY (p int) 
> TBLPROPERTIES('impala.disableHmsSync'='true');
> ALTER TABLE temp.t SET TBLPROPERTIES ('impala.disableHmsSync'='false');
> {code}
> From this point event sync will be broken in Impala. It can be fixed only 
> with a global INVALIDATE METADATA (or by restarting catalogd).
> The catalogd log will include an exception like this:
> {code}
> E1026 10:30:16.151208 22514 MetastoreEventsProcessor.java:653] Event processing needs a invalidate command to resolve the state
> Java exception follows:
> org.apache.impala.catalog.events.MetastoreNotificationNeedsInvalidateException: EventId: 15956 EventType: ALTER_TABLE Detected that event sync was turned on for the table temp.t and the table does not exist. Event processing cannot be continued further. Issue a invalidate metadata command to reset the event processing state
>         at org.apache.impala.catalog.events.MetastoreEvents$AlterTableEvent.process(MetastoreEvents.java:992)
>         at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:345)
>         at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:747)
>         at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:645)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {code}
> and future events will lead to a log message like this:
> {code}
> W1026 10:30:18.151962 22514 MetastoreEventsProcessor.java:638] Event processing is skipped since status is NEEDS_INVALIDATE. Last synced event id is 15955
> {code}


