[jira] [Commented] (IMPALA-13040) SIGSEGV in QueryState::UpdateFilterFromRemote
[ https://issues.apache.org/jira/browse/IMPALA-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848461#comment-17848461 ]

ASF subversion and git services commented on IMPALA-13040:
--

Commit aa01079478773aed28c9a4d8b07c062202de698d in impala's branch refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=aa0107947 ]

IMPALA-13040: (addendum) Inject larger delay for sanitized build

TestLateQueryStateInit has been flaky in sanitized builds because the largest delay injection time is fixed at 3 seconds. This patch fixes the issue by setting the largest delay injection time equal to RUNTIME_FILTER_WAIT_TIME_MS, which is 3 seconds for regular builds and 10 seconds for sanitized builds.

Testing:
- Loop and pass test_runtime_filter_aggregation.py 10 times in ASAN build and 50 times in UBSAN build.

Change-Id: I09e5ae4646f53632e9a9f519d370a33a5534df19
Reviewed-on: http://gerrit.cloudera.org:8080/21439
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> SIGSEGV in QueryState::UpdateFilterFromRemote
> --
>
> Key: IMPALA-13040
> URL: https://issues.apache.org/jira/browse/IMPALA-13040
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Csaba Ringhofer
> Assignee: Riza Suminto
> Priority: Critical
> Fix For: Impala 4.5.0
>
> {code}
> Crash reason: SIGSEGV /SEGV_MAPERR
> Crash address: 0x48
> Process uptime: not available
> Thread 114 (crashed)
> 0 libpthread.so.0 + 0x9d00
> rax = 0x00019e57ad00 rdx = 0x2a656720
> rcx = 0x059a9860 rbx = 0x
> rsi = 0x00019e57ad00 rdi = 0x0038
> rbp = 0x7f6233d544e0 rsp = 0x7f6233d544a8
> r8 = 0x06a53540 r9 = 0x0039
> r10 = 0x r11 = 0x000a
> r12 = 0x00019e57ad00 r13 = 0x7f62a2f997d0
> r14 = 0x7f6233d544f8 r15 = 0x1632c0f0
> rip = 0x7f62a2f96d00
> Found by: given as instruction pointer in context
> 1 impalad!impala::QueryState::UpdateFilterFromRemote(impala::UpdateFilterParamsPB const&, kudu::rpc::RpcContext*) [query-state.cc : 1033 + 0x5]
> rbp = 0x7f6233d54520 rsp = 0x7f6233d544f0
> rip = 0x015c0837
> Found by: previous frame's frame pointer
> 2 impalad!impala::DataStreamService::UpdateFilterFromRemote(impala::UpdateFilterParamsPB const*, impala::UpdateFilterResultPB*, kudu::rpc::RpcContext*) [data-stream-service.cc : 134 + 0xb]
> rbp = 0x7f6233d54640 rsp = 0x7f6233d54530
> rip = 0x017c05de
> Found by: previous frame's frame pointer
> {code}
> The line that crashes is
> https://github.com/apache/impala/blob/b39cd79ae84c415e0aebec2c2b4d7690d2a0cc7a/be/src/runtime/query-state.cc#L1033
> My guess is that the actual segfault is within WaitForPrepare() but it
> was inlined. Not sure if a remote filter can arrive even before
> QueryState::Init is finished - that would explain the issue, as
> instances_prepared_barrier_ is not yet created at that point.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-13040) SIGSEGV in QueryState::UpdateFilterFromRemote
[ https://issues.apache.org/jira/browse/IMPALA-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845175#comment-17845175 ]

ASF subversion and git services commented on IMPALA-13040:
--

Commit 09d2f10f4ddf3499b6255a6d14653e7738c2928b in impala's branch refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=09d2f10f4 ]

IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote

It is possible for an UpdateFilterFromRemote RPC to arrive at an impalad executor before the QueryState of the destination query is created or has completed initialization. This patch adds a wait mechanism to the UpdateFilterFromRemote RPC endpoint that waits for a few milliseconds until the QueryState exists and has completed initialization. The wait time is fixed at 500ms, with an exponential sleep period in between. If the wait time has passed and the QueryState is still not found or initialized, the UpdateFilterFromRemote RPC is deemed failed and query execution moves on without the complete filter.

Testing:
- Add BE tests in network-util-test.cc
- Add test_runtime_filter_aggregation.py::TestLateQueryStateInit
- Pass exhaustive runs of test_runtime_filter_aggregation.py, test_query_live.py, and test_query_log.py

Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9
Reviewed-on: http://gerrit.cloudera.org:8080/21383
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
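
The wait described in this commit message (a fixed 500ms budget with exponentially growing sleeps between checks) could look roughly like the sketch below. This is not the code from the patch: `WaitUntilReady`, its parameters, and the 1ms starting sleep are hypothetical names and values chosen for illustration, and `is_ready` stands in for "the QueryState exists and has completed initialization".

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper, not Impala's actual code: polls `is_ready` until it
// returns true or `total_wait` has elapsed, doubling the sleep between polls.
// Returns true if the condition became true within the budget.
bool WaitUntilReady(
    const std::function<bool()>& is_ready,
    std::chrono::milliseconds total_wait = std::chrono::milliseconds(500),
    std::chrono::milliseconds first_sleep = std::chrono::milliseconds(1)) {
  assert(first_sleep.count() > 0);  // A zero sleep would never grow.
  using Clock = std::chrono::steady_clock;
  const Clock::time_point deadline = Clock::now() + total_wait;
  std::chrono::milliseconds sleep = first_sleep;
  while (true) {
    if (is_ready()) return true;
    const Clock::time_point now = Clock::now();
    if (now >= deadline) return false;  // Budget exhausted: caller gives up.
    // Cap the sleep so it does not run far past the deadline, but sleep at
    // least 1ms so the loop does not busy-spin near the end of the budget.
    const std::chrono::milliseconds remaining =
        std::chrono::duration_cast<std::chrono::milliseconds>(deadline - now);
    std::this_thread::sleep_for(
        std::min(sleep, std::max(remaining, std::chrono::milliseconds(1))));
    sleep *= 2;  // Exponential growth of the sleep period.
  }
}
```

A caller would treat a `false` result the way the commit message describes a timeout: the RPC is considered failed and the query proceeds without the complete filter.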
[jira] [Commented] (IMPALA-13040) SIGSEGV in QueryState::UpdateFilterFromRemote
[ https://issues.apache.org/jira/browse/IMPALA-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841259#comment-17841259 ]

Riza Suminto commented on IMPALA-13040:
---

I think it is possible, because there is no readiness coordination in distributed runtime filter aggregation.
What do you think about adding a boolean flag in the QueryState::QueryState constructor and spin-loop waiting for it to be set?
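
As a rough illustration of the flag-plus-spin-loop idea raised in this comment, the sketch below models only the readiness flag. `FakeQueryState`, `MarkInitialized`, and `SpinUntilInitialized` are hypothetical names, not Impala's API, and the timeout is an added assumption so that the spin cannot run forever.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for QueryState; only the readiness flag is modeled.
class FakeQueryState {
 public:
  // The flag starts out false when the object is constructed.
  FakeQueryState() : initialized_(false) {}

  // Would be called once, at the end of initialization.
  void MarkInitialized() {
    initialized_.store(true, std::memory_order_release);
  }

  // Spins until the flag is set or `timeout` elapses.
  // Returns true if the flag was observed set, false on timeout.
  bool SpinUntilInitialized(std::chrono::milliseconds timeout) const {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!initialized_.load(std::memory_order_acquire)) {
      if (std::chrono::steady_clock::now() >= deadline) return false;
      std::this_thread::yield();  // Let other threads run while waiting.
    }
    return true;
  }

 private:
  std::atomic<bool> initialized_;
};
```

Note that a flag of this kind only helps once the object exists; it does not by itself cover an RPC that arrives before the QueryState has been created at all.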