[ https://issues.apache.org/jira/browse/IMPALA-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong updated IMPALA-8687:
----------------------------------
    Description:
Following on from IMPALA-8659, we may have some cases where impalads do self-RPCs via the thrift internal service IMPALA-7984. This JIRA is to investigate if this is a problem, and to fix it (either by intercepting self-RPCs in Thrift or by making code changes to avoid it).

Basic join where global runtime filters should apply:
{code}
select straight_join count(*)
from alltypes t1 join /*+ shuffle */ alltypes t2 on t1.id = t2.id
where t2.string_col = '1';
{code}

Interesting cases
* Dedicated coordinator with distributed plan ==> expect that all joins and scans run on executors and all filter aggregation happens on coordinator.
* Single node plan (num_nodes=1) ==> expect that all filters are local ==> no RPCs required
* Combined coordinator/executor with distributed plan ==> may do self-RPC

So I think in the dedicated coordinator/executor case we're ok. Note that IMPALA-3825 may violate the above assumptions.

I can pretty easily reproduce the issue on combined coordinators/executors with verbosity level 2.
This is a log excerpt from the Impalad tarmstrong-box:22000
{noformat}
I0619 17:28:00.913919 25525 client-cache.cc:47] GetClient(tarmstrong-box:22000)
I0619 17:28:00.913924 25525 client-cache.cc:57] GetClient(): returning cached client for tarmstrong-box:22000
I0619 17:28:00.914047 25425 rpc-trace.cc:202] RPC call: ImpalaInternalService.PublishFilter(from ::ffff:127.0.0.1:41902)
I0619 17:28:00.914587 25425 query-exec-mgr.cc:98] QueryState: query_id=624be7fc0bc0e122:0fbdc17200000000 refcnt=6
I0619 17:28:00.914597 25425 fragment-instance-state.cc:511] PublishFilter(): instance_id=624be7fc0bc0e122:0fbdc17200000002 filter_id=0
I0619 17:28:00.915010 25425 query-exec-mgr.cc:162] ReleaseQueryState(): query_id=624be7fc0bc0e122:0fbdc17200000000 refcnt=6
I0619 17:28:00.915038 25425 rpc-trace.cc:212] RPC call: backend:ImpalaInternalService.PublishFilter from ::ffff:127.0.0.1:41902 took 1.000ms
I0619 17:28:00.915043 25525 client-cache.cc:152] Releasing client for tarmstrong-box:22000 back to cache
I0619 17:28:00.915175 25525 rpc-trace.cc:212] RPC call: backend:ImpalaInternalService.UpdateFilter from ::ffff:127.0.0.1:41930 took 5.000ms
I0619 17:28:00.922312 25437 scan-node.cc:192] 624be7fc0bc0e122:0fbdc17200000002] Filters arrived. Waited 351ms
{noformat}

  was:
Following on from IMPALA-8659, we may have some cases where impalads do self-RPCs via the thrift internal service IMPALA-7984. This JIRA is to investigate if this is a problem, and to fix it (either by intercepting self-RPCs in Thrift or by making code changes to avoid it).

Basic join where global runtime filters should apply:
{code}
select straight_join count(*)
from alltypes t1 join /*+ shuffle */ alltypes t2 on t1.id = t2.id
where t2.string_col = '1';
{code}

Interesting cases
* Dedicated coordinator with distributed plan ==> expect that all joins run on executors and all filter aggregation happens on coordinator
* Single node plan (num_nodes=1) ==> expect that all filters are local ==> no RPCs required
* Combined coordinator/executor with distributed plan ==> may do self-RPC

> --rpc_use_loopback may not work for runtime filter RPCs
> -------------------------------------------------------
>
> Key: IMPALA-8687
> URL: https://issues.apache.org/jira/browse/IMPALA-8687
> Project: IMPALA
> Issue Type: Sub-task
> Components: Distributed Exec
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
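
For anyone who wants to look for the same pattern in their own logs, here is a rough sketch of a scanner. It is not Impala tooling. The names scan_log, FilterRpc and SAMPLE_LOG are made up for illustration, and the regular expressions assume the message formats are exactly as they appear in the excerpt above (verbosity level 2). It counts GetClient() lookups whose target is the impalad's own address and collects the completed PublishFilter/UpdateFilter RPC trace lines.

```python
import re
from typing import NamedTuple

# glog-style prefix as seen in the excerpt: severity+MMDD, time, thread id,
# "file:line]", then the message.
GLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[\w.-]+:\d+)\] (?P<msg>.*)$"
)
# Matches "GetClient(host:port)" but not "GetClient(): returning cached ...".
GET_CLIENT_RE = re.compile(r"^GetClient\((?P<addr>[^()]+)\)$")
# Matches the "took N ms" completion line written by rpc-trace.cc.
RPC_DONE_RE = re.compile(
    r"^RPC call: backend:ImpalaInternalService\.(?P<method>\w+) "
    r"from (?P<peer>\S+) took (?P<ms>[\d.]+)ms$"
)
FILTER_METHODS = ("PublishFilter", "UpdateFilter")


class FilterRpc(NamedTuple):
    method: str
    peer: str
    millis: float


def scan_log(lines, local_address):
    """Return (self_lookups, filter_rpcs) for one impalad's log lines.

    self_lookups counts GetClient() calls that target local_address, i.e.
    the impalad asking for a client connection to itself.
    """
    self_lookups = 0
    filter_rpcs = []
    for line in lines:
        m = GLOG_RE.match(line.strip())
        if m is None:
            continue  # not a glog line (blank, wrapped, etc.)
        msg = m.group("msg")
        lookup = GET_CLIENT_RE.match(msg)
        if lookup and lookup.group("addr") == local_address:
            self_lookups += 1
            continue
        done = RPC_DONE_RE.match(msg)
        if done and done.group("method") in FILTER_METHODS:
            filter_rpcs.append(
                FilterRpc(done.group("method"), done.group("peer"),
                          float(done.group("ms")))
            )
    return self_lookups, filter_rpcs


# A few lines copied from the excerpt above.
SAMPLE_LOG = """
I0619 17:28:00.913919 25525 client-cache.cc:47] GetClient(tarmstrong-box:22000)
I0619 17:28:00.913924 25525 client-cache.cc:57] GetClient(): returning cached client for tarmstrong-box:22000
I0619 17:28:00.914047 25425 rpc-trace.cc:202] RPC call: ImpalaInternalService.PublishFilter(from ::ffff:127.0.0.1:41902)
I0619 17:28:00.915038 25425 rpc-trace.cc:212] RPC call: backend:ImpalaInternalService.PublishFilter from ::ffff:127.0.0.1:41902 took 1.000ms
I0619 17:28:00.915175 25525 rpc-trace.cc:212] RPC call: backend:ImpalaInternalService.UpdateFilter from ::ffff:127.0.0.1:41930 took 5.000ms
"""

self_lookups, filter_rpcs = scan_log(SAMPLE_LOG.splitlines(),
                                     "tarmstrong-box:22000")
print("self GetClient lookups:", self_lookups)
for rpc in filter_rpcs:
    print(rpc.method, "from", rpc.peer, "took", rpc.millis, "ms")
```

A non-zero self-lookup count alongside filter RPC trace lines in the same impalad's log is only a hint that a self-RPC happened. The sketch does not correlate the lookup with a specific RPC.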