[jira] [Commented] (IMPALA-4065) Inline comparator calls into TopN::InsertBatch()
[ https://issues.apache.org/jira/browse/IMPALA-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17194613#comment-17194613 ] ASF subversion and git services commented on IMPALA-4065: - Commit 4bb7190549254c98eb675f3aa044085ca90084d0 in impala's branch refs/heads/master from Qifan Chen [ https://gitbox.apache.org/repos/asf?p=impala.git;h=4bb7190 ] IMPALA-4065 Inline comparator calls into TopN::InsertBatch() This work addresses the current limitation in TopN node by replacing std::priority_queue with an in-house priority queue implementation. In this way, call-site replacement of calls to a more efficient version of tuple comparator becomes possible. The tuple comparator is optimized by removing the condition test on whether the code-gen version is ready or not. Testing: 1. Added a new test TestBasic in a new test harness priority-queue-test.cc to verify that the priority queue works properly; 2. Ran ad-hoc performance test against tpcds.store_sales (2880404 rows) and saw 16% and 24% top-100 performance improvement on integer and decimal columns respectively. The test queries are as follows. -- SQL integer column version select ss_sold_time_sk, ss_item_sk, ss_customer_sk from tpcds.store_sales order by ss_sold_time_sk, ss_item_sk limit 100; -- SQL decimal column version select ss_ext_wholesale_cost, ss_ext_list_price, ss_ext_tax from tpcds.store_sales order by ss_ext_wholesale_cost, ss_ext_list_price limit 100; 3. Ran Core tests successfully. Change-Id: I676b4c05cf10a6946c05e317b0002c1e29e78aa8 Reviewed-on: http://gerrit.cloudera.org:8080/16373 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Inline comparator calls into TopN::InsertBatch() > > > Key: IMPALA-4065 > URL: https://issues.apache.org/jira/browse/IMPALA-4065 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.7.0 >Reporter: Tim Armstrong >Assignee: Qifan Chen >Priority: Minor > Labels: codegen, ramp-up > Attachments: > 0001-WIP-IMPALA-3816-IMPALA-4065-full-TupleRowComparator-.patch > > > This is the more interesting follow-on from IMPALA-3815. We should inline the > Compare() calls in the codegen'd TopN code to avoid the indirect function > pointer call. > The tricky aspect is that the Compare() calls are called from > std::priority_queue, and we don't have a way to force-inline those functions > at the moment. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-4065) Inline comparator calls into TopN::InsertBatch()
[ https://issues.apache.org/jira/browse/IMPALA-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163708#comment-17163708 ] ASF subversion and git services commented on IMPALA-4065: - Commit d3115a561d257ffba84e61664312b7150b8c888c in impala's branch refs/heads/master from Tim Armstrong [ https://gitbox.apache.org/repos/asf?p=impala.git;h=d3115a5 ] IMPALA-9979: part 1: factor out Top-N heap. This extracts the implementation of the actual priority queue from the rest of TopNNode's state, so that we can, in the next patch, have multiple heaps per node. The codegen'd InsertBatch() function is unfortunately a little sensitive to minor changes in code, because of the weird way that it does an indirect call via TupleRowComparator - see IMPALA-4065. I had to tweak the code a little to find a variant that performed similarly to the previous version - other variants had small regressions. Perf: Single node TPC-H showed no perf change. The time for the TOP-N node in this targeted query was within the margin of error: use tpch30_parquet; set mt_dop=1; select l_extendedprice from lineitem order by 1 limit 100 Change-Id: I1f585216b547af7a470e02f75458b1901dc44a31 Reviewed-on: http://gerrit.cloudera.org:8080/16223 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Inline comparator calls into TopN::InsertBatch() > > > Key: IMPALA-4065 > URL: https://issues.apache.org/jira/browse/IMPALA-4065 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.7.0 >Reporter: Tim Armstrong >Assignee: Qifan Chen >Priority: Minor > Labels: codegen, ramp-up > Attachments: > 0001-WIP-IMPALA-3816-IMPALA-4065-full-TupleRowComparator-.patch > > > This is the more interesting follow-on from IMPALA-3815. We should inline the > Compare() calls in the codegen'd TopN code to avoid the indirect function > pointer call. > The tricky aspect is that the Compare() calls are called from > std::priority_queue, and we don't have a way to force-inline those functions > at the moment. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-4065) Inline comparator calls into TopN::InsertBatch()
[ https://issues.apache.org/jira/browse/IMPALA-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16826246#comment-16826246 ] Tim Armstrong commented on IMPALA-4065: --- Attaching a patch that cleaned up the core algorithm a bit. Here's Tianyi's patch: https://gerrit.cloudera.org/#/c/10680/ > Inline comparator calls into TopN::InsertBatch() > > > Key: IMPALA-4065 > URL: https://issues.apache.org/jira/browse/IMPALA-4065 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.7.0 >Reporter: Tim Armstrong >Assignee: Tianyi Wang >Priority: Minor > Labels: codegen, ramp-up > Attachments: > 0001-WIP-IMPALA-3816-IMPALA-4065-full-TupleRowComparator-.patch > > > This is the more interesting follow-on from IMPALA-3815. We should inline the > Compare() calls in the codegen'd TopN code to avoid the indirect function > pointer call. > The tricky aspect is that the Compare() calls are called from > std::priority_queue, and we don't have a way to force-inline those functions > at the moment. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-4065) Inline comparator calls into TopN::InsertBatch()
[ https://issues.apache.org/jira/browse/IMPALA-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506691#comment-16506691 ] Tianyi Wang commented on IMPALA-4065: - I observed more than 2X speed up in the TopNNode with an "order by int_col" extreme case, but not so much for the sort case in IMPALA-3816. It seems LLVM doesn't optimize the template code well if there are indirect calls among them. > Inline comparator calls into TopN::InsertBatch() > > > Key: IMPALA-4065 > URL: https://issues.apache.org/jira/browse/IMPALA-4065 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.7.0 >Reporter: Tim Armstrong >Assignee: Tianyi Wang >Priority: Minor > Labels: codegen, ramp-up > > This is the more interesting follow-on from IMPALA-3815. We should inline the > Compare() calls in the codegen'd TopN code to avoid the indirect function > pointer call. > The tricky aspect is that the Compare() calls are called from > std::priority_queue, and we don't have a way to force-inline those functions > at the moment. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-4065) Inline comparator calls into TopN::InsertBatch()
[ https://issues.apache.org/jira/browse/IMPALA-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505508#comment-16505508 ] Zoram Thanga commented on IMPALA-4065: -- Assigned you the jira, [~tianyiwang]. > Inline comparator calls into TopN::InsertBatch() > > > Key: IMPALA-4065 > URL: https://issues.apache.org/jira/browse/IMPALA-4065 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.7.0 >Reporter: Tim Armstrong >Assignee: Tianyi Wang >Priority: Minor > Labels: codegen, ramp-up > > This is the more interesting follow-on from IMPALA-3815. We should inline the > Compare() calls in the codegen'd TopN code to avoid the indirect function > pointer call. > The tricky aspect is that the Compare() calls are called from > std::priority_queue, and we don't have a way to force-inline those functions > at the moment. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-4065) Inline comparator calls into TopN::InsertBatch()
[ https://issues.apache.org/jira/browse/IMPALA-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505448#comment-16505448 ] Tianyi Wang commented on IMPALA-4065: - [~zoram] If you are not working on this I can take it over. I'm working on IMPALA-3816 and will refactor some code around this. > Inline comparator calls into TopN::InsertBatch() > > > Key: IMPALA-4065 > URL: https://issues.apache.org/jira/browse/IMPALA-4065 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.7.0 >Reporter: Tim Armstrong >Assignee: Zoram Thanga >Priority: Minor > Labels: codegen, ramp-up > > This is the more interesting follow-on from IMPALA-3815. We should inline the > Compare() calls in the codegen'd TopN code to avoid the indirect function > pointer call. > The tricky aspect is that the Compare() calls are called from > std::priority_queue, and we don't have a way to force-inline those functions > at the moment. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-4065) Inline comparator calls into TopN::InsertBatch()
[ https://issues.apache.org/jira/browse/IMPALA-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498775#comment-16498775 ] Tianyi Wang commented on IMPALA-4065: - Got it. Thanks. > Inline comparator calls into TopN::InsertBatch() > > > Key: IMPALA-4065 > URL: https://issues.apache.org/jira/browse/IMPALA-4065 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.7.0 >Reporter: Tim Armstrong >Assignee: Zoram Thanga >Priority: Minor > Labels: codegen, ramp-up > > This is the more interesting follow-on from IMPALA-3815. We should inline the > Compare() calls in the codegen'd TopN code to avoid the indirect function > pointer call. > The tricky aspect is that the Compare() calls are called from > std::priority_queue, and we don't have a way to force-inline those functions > at the moment. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-4065) Inline comparator calls into TopN::InsertBatch()
[ https://issues.apache.org/jira/browse/IMPALA-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498766#comment-16498766 ] Tim Armstrong commented on IMPALA-4065: --- We can't replace all of the call-sites though, we need to create a copy of each function we're modifying and then do the replacement. We can't modify the original function since we may have multiple specialized copies of the same funciton. > Inline comparator calls into TopN::InsertBatch() > > > Key: IMPALA-4065 > URL: https://issues.apache.org/jira/browse/IMPALA-4065 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.7.0 >Reporter: Tim Armstrong >Assignee: Zoram Thanga >Priority: Minor > Labels: codegen, ramp-up > > This is the more interesting follow-on from IMPALA-3815. We should inline the > Compare() calls in the codegen'd TopN code to avoid the indirect function > pointer call. > The tricky aspect is that the Compare() calls are called from > std::priority_queue, and we don't have a way to force-inline those functions > at the moment. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-4065) Inline comparator calls into TopN::InsertBatch()
[ https://issues.apache.org/jira/browse/IMPALA-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498761#comment-16498761 ] Tianyi Wang commented on IMPALA-4065: - [~tarmstrong] I still don't get what replacing has to do with inlining. If nothing is inlined, aren't the call sites still there in the IR module? > Inline comparator calls into TopN::InsertBatch() > > > Key: IMPALA-4065 > URL: https://issues.apache.org/jira/browse/IMPALA-4065 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.7.0 >Reporter: Tim Armstrong >Assignee: Zoram Thanga >Priority: Minor > Labels: codegen, ramp-up > > This is the more interesting follow-on from IMPALA-3815. We should inline the > Compare() calls in the codegen'd TopN code to avoid the indirect function > pointer call. > The tricky aspect is that the Compare() calls are called from > std::priority_queue, and we don't have a way to force-inline those functions > at the moment. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-4065) Inline comparator calls into TopN::InsertBatch()
[ https://issues.apache.org/jira/browse/IMPALA-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498756#comment-16498756 ] Tim Armstrong commented on IMPALA-4065: --- [~tianyiwang] yeah we could solve it in a different way, we'd just need to do the force inlining to apply our usual approach of searching and replacing within the function body. > Inline comparator calls into TopN::InsertBatch() > > > Key: IMPALA-4065 > URL: https://issues.apache.org/jira/browse/IMPALA-4065 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.7.0 >Reporter: Tim Armstrong >Assignee: Zoram Thanga >Priority: Minor > Labels: codegen, ramp-up > > This is the more interesting follow-on from IMPALA-3815. We should inline the > Compare() calls in the codegen'd TopN code to avoid the indirect function > pointer call. > The tricky aspect is that the Compare() calls are called from > std::priority_queue, and we don't have a way to force-inline those functions > at the moment. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-4065) Inline comparator calls into TopN::InsertBatch()
[ https://issues.apache.org/jira/browse/IMPALA-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498711#comment-16498711 ] Tianyi Wang commented on IMPALA-4065: - Why do we need to force inline std::priority_queue? Isn't removing the function pointer good enough? > Inline comparator calls into TopN::InsertBatch() > > > Key: IMPALA-4065 > URL: https://issues.apache.org/jira/browse/IMPALA-4065 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.7.0 >Reporter: Tim Armstrong >Assignee: Zoram Thanga >Priority: Minor > Labels: codegen, ramp-up > > This is the more interesting follow-on from IMPALA-3815. We should inline the > Compare() calls in the codegen'd TopN code to avoid the indirect function > pointer call. > The tricky aspect is that the Compare() calls are called from > std::priority_queue, and we don't have a way to force-inline those functions > at the moment. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org