EmmyMiao87 edited a comment on issue #7901:
URL: https://github.com/apache/incubator-doris/issues/7901#issuecomment-1056815743


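   # Join performance optimization
   
   ## Reduce unnecessary memory copies
   
   The output schema of a Join Node differs from its input schema. However, Doris's Join Node operator currently builds each result row by directly concatenating the tuples of its left and right children. In practice, the columns of a result row may be only a subset of the columns of the input rows, so this concatenation causes many useless memory copies.
   
   For example: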
   
   ```
   select a.k1 from a, b where a.k1=b.k1;
   ```
   Input schema: a.k1, b.k1
   Output schema: a.k1, b.k1
   Output schema after optimization: a.k1
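
The difference between concatenating tuples and emitting only the required columns can be sketched with a toy hash join. This is only an illustration of the idea, not Doris's implementation: the function names `hash_join` and `build_row`, the `output_cols` parameter, and the dict-based row layout are all assumptions made for the example.

```python
from collections import defaultdict


def build_row(left, right, output_cols=None):
    """Build one result row from a matching pair of input rows.

    With output_cols=None, every column of both inputs is copied
    (tuple concatenation). Otherwise only the listed columns are copied.
    """
    if output_cols is None:
        return {**left, **right}
    row = {}
    for col in output_cols:
        # Copy a value only if the output schema actually needs it.
        row[col] = left[col] if col in left else right[col]
    return row


def hash_join(left_rows, right_rows, left_key, right_key, output_cols=None):
    """Inner hash join over lists of dicts (hypothetical row format)."""
    # Build phase: index the right side by its join key.
    table = defaultdict(list)
    for r in right_rows:
        table[r[right_key]].append(r)

    # Probe phase: emit one result row per matching pair.
    result = []
    for l in left_rows:
        for r in table.get(l[left_key], ()):
            result.append(build_row(l, r, output_cols))
    return result


a = [{"a.k1": 1}, {"a.k1": 2}, {"a.k1": 3}]
b = [{"b.k1": 2}, {"b.k1": 3}, {"b.k1": 3}]

# select a.k1 from a, b where a.k1 = b.k1
full = hash_join(a, b, "a.k1", "b.k1")
pruned = hash_join(a, b, "a.k1", "b.k1", output_cols=["a.k1"])

copied_full = sum(len(row) for row in full)      # values copied when concatenating
copied_pruned = sum(len(row) for row in pruned)  # values copied when pruning
print(copied_full, copied_pruned)
```

In this toy case the pruned join copies half as many values, because `b.k1` is never written to the result rows.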
   
   ```
   MySQL [ssb]> select count(d_datekey) from lineorder inner join date on lo_orderdate = d_datekey;
   +--------------------+
   | count(`d_datekey`) |
   +--------------------+
   |          600037902 |
   +--------------------+
   1 row in set (10.555 sec)
   ```
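   Profiling with perf shows where most of the time is spent:
   <img width="1436" alt="image" src="https://user-images.githubusercontent.com/25147274/156358842-1d987caf-bea7-4a5a-9d63-0dca844d25f4.png">
   1. `replicate`, the function that fills in result values for the non-Join columns, **accounts for about 10%**.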
   
   # Test
   After PR: #8618 
   
   In the query below, the column lo_orderdate can be pruned after the join. The results are as follows:
   ```
   MySQL [ssb]> select count(d_datekey) from lineorder inner join date on lo_orderdate = d_datekey;
   +--------------------+
   | count(`d_datekey`) |
   +--------------------+
   |          600037902 |
   +--------------------+
   1 row in set (7.286 sec)
   
   MySQL [ssb]> set enable_hash_project=true;
   Query OK, 0 rows affected (0.001 sec)
   
   MySQL [ssb]> select count(d_datekey) from lineorder inner join date on lo_orderdate = d_datekey;
   +--------------------+
   | count(`d_datekey`) |
   +--------------------+
   |          600037902 |
   +--------------------+
   1 row in set (5.479 sec)
   ```
   
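   The test results show that enabling column pruning cuts the query time from 7.286 s to 5.479 s, which is in line with (and in this run exceeds) the expected improvement of about 10%.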
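
For reference, the relative change between the two runs can be computed directly from the timings reported above. This is a minimal sketch; the helper name `relative_reduction` is made up for the example, and the only inputs are the two wall-clock times printed by the MySQL client.

```python
def relative_reduction(before_s, after_s):
    """Fraction of the original run time that was saved."""
    return (before_s - after_s) / before_s


without_pruning_s = 7.286  # enable_hash_project not set
with_pruning_s = 5.479     # enable_hash_project=true

reduction = relative_reduction(without_pruning_s, with_pruning_s)
speedup = without_pruning_s / with_pruning_s

print(f"time saved: {reduction:.1%}")  # about 24.8%
print(f"speedup: {speedup:.2f}x")      # about 1.33x
```

These are single-run measurements, so the figures are best read as indicative rather than precise.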
   
   

