[jira] [Comment Edited] (IMPALA-8712) Convert ExecQueryFInstance() RPC to become asynchronous
[ https://issues.apache.org/jira/browse/IMPALA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875278#comment-16875278 ]

Michael Ho edited comment on IMPALA-8712 at 6/28/19 11:00 PM:
--------------------------------------------------------------

We may be able to work around some of the serialization overhead by serializing some of the immutable Thrift-based RPC parameters once and sending them as a sidecar. This should reduce the need to serialize them once per backend.

was (Author: kwho):
We may be able to work around some of the serialization overhead by sending some of the currently Thrift-based RPC parameters as a sidecar or something.

> Convert ExecQueryFInstance() RPC to become asynchronous
> -------------------------------------------------------
>
>                 Key: IMPALA-8712
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8712
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Distributed Exec
>    Affects Versions: Impala 3.3.0
>            Reporter: Michael Ho
>            Assignee: Thomas Tauber-Marshall
>            Priority: Major
>
> Now that IMPALA-7467 is fixed, ExecQueryFInstance() can utilize the async RPC
> capabilities of KRPC instead of relying on the half-baked way of using
> {{ExecEnv::exec_rpc_thread_pool_}} to start query fragment instances. We
> already have a reactor thread pool in KRPC to handle sending client RPCs
> asynchronously. Various tasks under IMPALA-5486 can also benefit from
> making ExecQueryFInstance() asynchronous so the RPCs can be cancelled.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
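A rough illustration of the serialize-once idea from the comment above. This is a Python sketch, not Impala code: `pickle` stands in for Thrift serialization, and every name here (`serialize`, `send_per_backend`, `send_with_shared_sidecar`, the backend labels) is hypothetical. It only counts how many times the shared, immutable parameters get serialized under each approach.

```python
import pickle


def serialize(params, counter):
    """Stand-in for Thrift serialization; counts how often it runs."""
    counter["calls"] += 1
    return pickle.dumps(params)


def send_per_backend(backends, shared_params, counter):
    # Per-backend pattern: the same immutable parameters are
    # re-serialized for every backend that receives them.
    return {b: serialize(shared_params, counter) for b in backends}


def send_with_shared_sidecar(backends, shared_params, counter):
    # Serialize-once pattern: produce the bytes a single time and
    # attach the same blob (the "sidecar") to every request.
    blob = serialize(shared_params, counter)
    return {b: blob for b in backends}


# Hypothetical inputs, chosen only for the demonstration.
backends = ["backend-%d" % i for i in range(4)]
shared_params = {"query_id": "q1", "scan_ranges": list(range(1000))}

per_backend_calls = {"calls": 0}
per_backend = send_per_backend(backends, shared_params, per_backend_calls)

shared_calls = {"calls": 0}
shared = send_with_shared_sidecar(backends, shared_params, shared_calls)

print("per-backend serializations:", per_backend_calls["calls"])
print("shared-sidecar serializations:", shared_calls["calls"])
```

Both approaches hand identical bytes to each backend; the difference is only in how often the serialization work is repeated, which is the overhead the comment is concerned with. Parameters that differ per backend would still need their own serialization.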
[jira] [Comment Edited] (IMPALA-8712) Convert ExecQueryFInstance() RPC to become asynchronous
[ https://issues.apache.org/jira/browse/IMPALA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873765#comment-16873765 ]

Michael Ho edited comment on IMPALA-8712 at 6/28/19 10:57 PM:
--------------------------------------------------------------

On the other hand, {{exec_rpc_thread_pool_}} allows serialization of the RPC parameters to happen in parallel so it may not strictly be a simple conversion to asynchronous RPC without regression. So careful evaluation with huge RPC parameters (e.g. a large number of scan ranges) may be needed to see if there may be regression as a result.

Some of the serialization overhead with ExecQueryFInstance() RPC even after IMPALA-7467 is still Thrift related as we just serialize a bunch of Thrift structures into a binary blob and send them via KRPC sidecar. The serialization is done in parallel by threads in {{exec_rpc_thread_pool_}}.

was (Author: kwho):
On the other hand, {{exec_rpc_thread_pool_}} allows serialization of the RPC parameters to happen in parallel so it may not strictly be a simple conversion to asynchronous RPC without regression. So careful evaluation with huge RPC parameters (e.g. a large number of scan ranges) may be needed to see if there may be regression as a result.

Some of the serialization overhead with ExecQueryFInstance() RPC even after IMPALA-7467 is still Thrift related as we just serialize a bunch of Thrift structures into a binary blob and send them via KRPC sidecar. The serialization is done in parallel by threads in {{exec_rpc_thread_pool_}}.

-If we convert those Thrift structures into Protobuf, then the serialization can be done in parallel by reactor threads in the KRPC stack.-
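To make the regression concern above concrete, here is a small Python sketch (not Impala code) contrasting where serialization runs under the two patterns. `pickle` stands in for Thrift serialization, the `exec-rpc` thread-name prefix is only a label chosen to echo `exec_rpc_thread_pool_`, and all function names are hypothetical. Python threads do not give real CPU parallelism, so this shows which thread does the work rather than any speedup.

```python
import pickle
import threading
from concurrent.futures import ThreadPoolExecutor


def serialize(params):
    # Stand-in for Thrift serialization; also reports which thread did the work.
    return pickle.dumps(params), threading.current_thread().name


def start_with_thread_pool(per_backend_params, num_threads=4):
    # Thread-pool pattern: each worker serializes (and would then send)
    # the request for one backend, so the work is spread across workers.
    with ThreadPoolExecutor(max_workers=num_threads,
                            thread_name_prefix="exec-rpc") as pool:
        return list(pool.map(serialize, per_backend_params))


def start_from_single_thread(per_backend_params):
    # Naive asynchronous conversion: the calling thread serializes every
    # request itself before handing the bytes to an asynchronous send.
    return [serialize(p) for p in per_backend_params]


# Hypothetical per-backend parameters.
params = [{"backend": i, "scan_ranges": list(range(100))} for i in range(8)]

pooled = start_with_thread_pool(params)
serial = start_from_single_thread(params)

print("pool threads used:", sorted({name for _, name in pooled}))
print("single-thread case ran on:", sorted({name for _, name in serial}))
```

The serialized bytes are the same either way. What changes is that the second pattern puts all of the serialization on one thread, which is why large parameters (many scan ranges) are the case worth measuring.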
[jira] [Comment Edited] (IMPALA-8712) Convert ExecQueryFInstance() RPC to become asynchronous
[ https://issues.apache.org/jira/browse/IMPALA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873765#comment-16873765 ]

Michael Ho edited comment on IMPALA-8712 at 6/28/19 7:22 AM:
-------------------------------------------------------------

On the other hand, {{exec_rpc_thread_pool_}} allows serialization of the RPC parameters to happen in parallel so it may not strictly be a simple conversion to asynchronous RPC without regression. So careful evaluation with huge RPC parameters (e.g. a large number of scan ranges) may be needed to see if there may be regression as a result.

Some of the serialization overhead with ExecQueryFInstance() RPC even after IMPALA-7467 is still Thrift related as we just serialize a bunch of Thrift structures into a binary blob and send them via KRPC sidecar. The serialization is done in parallel by threads in {{exec_rpc_thread_pool_}}.

-If we convert those Thrift structures into Protobuf, then the serialization can be done in parallel by reactor threads in the KRPC stack.-

was (Author: kwho):
On the other hand, {{exec_rpc_thread_pool_}} allows serialization of the RPC parameters to happen in parallel so it may not strictly be a simple conversion to asynchronous RPC without regression. So careful evaluation with huge RPC parameters (e.g. a large number of scan ranges) may be needed to see if there may be regression as a result.

Some of the serialization overhead with ExecQueryFInstance() RPC even after IMPALA-7467 is still Thrift related as we just serialize a bunch of Thrift structures into a binary blob and send them via KRPC sidecar. The serialization is done in parallel by threads in {{exec_rpc_thread_pool_}}. If we convert those Thrift structures into Protobuf, then the serialization can be done in parallel by reactor threads in the KRPC stack.
[jira] [Comment Edited] (IMPALA-8712) Convert ExecQueryFInstance() RPC to become asynchronous
[ https://issues.apache.org/jira/browse/IMPALA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873765#comment-16873765 ]

Michael Ho edited comment on IMPALA-8712 at 6/27/19 6:59 PM:
-------------------------------------------------------------

On the other hand, {{exec_rpc_thread_pool_}} allows serialization of the RPC parameters to happen in parallel so it may not strictly be a simple conversion to asynchronous RPC without regression. So careful evaluation with huge RPC parameters (e.g. a large number of scan ranges) may be needed to see if there may be regression as a result.

Some of the serialization overhead with ExecQueryFInstance() RPC even after IMPALA-7467 is still Thrift related as we just serialize a bunch of Thrift structures into a binary blob and send them via KRPC sidecar. The serialization is done in parallel by threads in {{exec_rpc_thread_pool_}}. If we convert those Thrift structures into Protobuf, then the serialization can be done in parallel by reactor threads in the KRPC stack.

was (Author: kwho):
On the other hand, {{exec_rpc_thread_pool_}} allows serialization of the RPC parameters to happen in parallel so it may not strictly be a simple conversion to asynchronous RPC without regression. So careful evaluation with huge RPC parameters (e.g. a large number of scan ranges) may be needed to see if there may be regression as a result.
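The issue description for IMPALA-8712 notes that making the RPC asynchronous lets in-flight RPCs be cancelled. A minimal Python sketch of that property, using `asyncio` rather than KRPC: `fake_exec_rpc`, `start_all`, the backend labels, and the delays are all hypothetical, and the sleep merely models a slow backend.

```python
import asyncio


async def fake_exec_rpc(backend, delay):
    # Stand-in for one asynchronous RPC; the sleep models network latency.
    await asyncio.sleep(delay)
    return backend + ": started"


async def start_all(delays, cancel_after):
    # Issue every RPC without blocking a thread per backend.
    tasks = {b: asyncio.ensure_future(fake_exec_rpc(b, d))
             for b, d in delays.items()}
    # Wait for a while, then cancel whatever is still in flight.
    _, pending = await asyncio.wait(list(tasks.values()), timeout=cancel_after)
    for task in pending:
        task.cancel()
    # Collect outcomes; cancelled tasks surface as CancelledError instances.
    results = await asyncio.gather(*tasks.values(), return_exceptions=True)
    return {b: ("cancelled" if isinstance(r, asyncio.CancelledError) else r)
            for b, r in zip(tasks, results)}


# One fast backend and one that is slower than the cancellation deadline.
outcome = asyncio.run(
    start_all({"backend-0": 0.01, "backend-1": 5.0}, cancel_after=0.2))
print(outcome)
```

With a blocking call made from a pool thread there is no comparable handle to cancel: the thread simply waits until the call returns or times out.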