[jira] [Commented] (IMPALA-11352) CI: Transient failure in test_event_based_replication

Quanlong Huang (Jira) Wed, 15 Mar 2023 16:48:07 -0700


    [ 
https://issues.apache.org/jira/browse/IMPALA-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17700901#comment-17700901
 ]


Quanlong Huang commented on IMPALA-11352:
-----------------------------------------

I took a look at the logs above. The failure occurs because the
events-processor enters the ERROR state due to IMPALA-12002:
{noformat}
I0310 18:42:48.446516    92 MetastoreEventsProcessor.java:805] Received 18 events. Start event id : 24601
I0310 18:42:48.447026    92 MetastoreEventsProcessor.java:1026] Time elapsed in processing event batch: 395.969us
E0310 18:42:48.447289    92 MetastoreEventsProcessor.java:865] Unexpected exception received while processing event
Java exception follows:
org.apache.impala.catalog.events.MetastoreNotificationException: EventId: 24606 EventType: COMMIT_COMPACTION_EVENT Unable to parse commit compaction message
        at org.apache.impala.catalog.events.MetastoreEvents$CommitCompactionEvent.<init>(MetastoreEvents.java:2684)
        at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEventFactory.get(MetastoreEvents.java:223)
        at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEventFactory.getFilteredEvents(MetastoreEvents.java:255)
        at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:999)
        at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:851)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.impala.catalog.DatabaseNotFoundException: Database 'test_acid_partitioned_ed33f785' not found
        at org.apache.impala.catalog.Catalog.getTable(Catalog.java:196)
        at org.apache.impala.catalog.events.MetastoreEvents$CommitCompactionEvent.<init>(MetastoreEvents.java:2679)
        ... 11 more
E0310 18:42:48.447371    92 MetastoreEventsProcessor.java:1040] Notification event is null
W0310 18:42:50.448392    92 MetastoreEventsProcessor.java:844] Event processing is skipped since status is ERROR. Last synced event id is 24601
{noformat}
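
For anyone triaging similar failures, the log excerpt above can be summarized mechanically. The following is only a rough Python 3 sketch, not part of Impala or its test suite: the function name `summarize_event_processor_failure` and the regular expressions are assumptions derived from the message formats visible in this one excerpt, and other Impala versions may log differently.

```python
import re

# Abbreviated copy of the excerpt above, hard-wrapped the way mail clients do.
SAMPLE = """\
E0310 18:42:48.447289    92 MetastoreEventsProcessor.java:865] Unexpected
exception received while processing event
Java exception follows:
org.apache.impala.catalog.events.MetastoreNotificationException: EventId: 24606
EventType: COMMIT_COMPACTION_EVENT Unable to parse commit compaction message
        at
org.apache.impala.catalog.events.MetastoreEvents$CommitCompactionEvent.<init>(MetastoreEvents.java:2684)
Caused by: org.apache.impala.catalog.DatabaseNotFoundException: Database
'test_acid_partitioned_ed33f785' not found
        at org.apache.impala.catalog.Catalog.getTable(Catalog.java:196)
W0310 18:42:50.448392    92 MetastoreEventsProcessor.java:844] Event processing
is skipped since status is ERROR. Last synced event id is 24601
"""


def summarize_event_processor_failure(log_text):
    """Extract the failing event, its root cause, and the last synced event id.

    Hypothetical helper: the patterns only cover the messages quoted above.
    """
    # Collapse all whitespace so hard-wrapped log lines match as one string.
    flat = " ".join(log_text.split())
    failed = re.search(r"EventId: (\d+) EventType: (\w+)", flat)
    cause = re.search(r"Caused by: ([\w.$]+): (.*?) at ", flat)
    skipped = re.search(
        r"skipped since status is (\w+)\. Last synced event id is (\d+)", flat)
    return {
        "failed_event_id": int(failed.group(1)) if failed else None,
        "failed_event_type": failed.group(2) if failed else None,
        "cause_class": cause.group(1) if cause else None,
        "cause_message": cause.group(2) if cause else None,
        "status": skipped.group(1) if skipped else None,
        "last_synced_event_id": int(skipped.group(2)) if skipped else None,
    }


summary = summarize_event_processor_failure(SAMPLE)
```

On this excerpt the summary shows that the failing event id (24606) is ahead of the last synced event id (24601).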

> CI: Transient failure in test_event_based_replication
> -----------------------------------------------------
>
>                 Key: IMPALA-11352
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11352
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Distributed Exec
>    Affects Versions: Impala 4.1.0
>            Reporter: Michael Smith
>            Assignee: Michael Smith
>            Priority: Major
>         Attachments: selected-logs.tgz
>
>
> Transient failure in 
> metadata.test_event_processing.TestEventProcessing.test_event_based_replication
>  during https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/5815/.
> Test output summary shows
> {code}
> metadata/test_event_processing.py:183: in test_event_based_replication
>     self.__run_event_based_replication_tests()
> metadata/test_event_processing.py:302: in __run_event_based_replication_tests
>     EventProcessorUtils.wait_for_event_processing(self)
> util/event_processor_utils.py:70: in wait_for_event_processing
>     allow_greater=True)
> common/impala_service.py:143: in wait_for_metric_value
>     self.__metric_timeout_assert(metric_name, expected_value, timeout)
> common/impala_service.py:210: in __metric_timeout_assert
>     assert 0, assert_string
> E   AssertionError: Metric catalog.curr-version did not reach value 9943 in 
> 10s.
> {code}
> Logs attached. My initial guess is that something in parallel execution is
> delaying catalog updates. I see messages around 00:38:17 for catalog version
> 9938 (which is what the captured metrics show) and for catalog version 9961
> at 00:38:19. Test bumps are labeled with 00:38:21.
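
The traceback quoted above ends in a metric wait that times out after 10s. As a rough illustration of that pattern, here is a minimal Python 3 polling sketch that also gives up early when the event processor reports ERROR, so a stuck processor surfaces as its own failure rather than as a metric timeout. This is not the code in `common/impala_service.py`: `wait_for_metric`, `get_metric`, and `get_status` are hypothetical names, and the two callables are assumed to be supplied by the caller.

```python
import time


def wait_for_metric(get_metric, get_status, expected, timeout_s=10.0,
                    interval_s=0.5, allow_greater=True):
    """Poll get_metric() until it reaches `expected` or the timeout expires.

    Hypothetical helper. get_status() is assumed to return the event
    processor status as a string; only "ERROR" is treated specially.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # Fail fast: once in ERROR, event processing is skipped (see the log).
        if get_status() == "ERROR":
            raise RuntimeError("event processor is in ERROR state")
        value = get_metric()
        if value == expected or (allow_greater and value > expected):
            return value
        if time.monotonic() >= deadline:
            raise AssertionError(
                "Metric did not reach value {0} in {1}s (last value {2})".format(
                    expected, timeout_s, value))
        time.sleep(interval_s)
```

A caller would pass two small functions that fetch the metric and the status from the catalog service's debug endpoints; how those are fetched is left out here because it depends on the test harness.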



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

