[jira] [Commented] (HUDI-2873) Support optimize data layout by sql and make the build more fast

Tao Meng (Jira) Mon, 17 Jan 2022 05:35:06 -0800


    [ 
https://issues.apache.org/jira/browse/HUDI-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477214#comment-17477214
 ]


Tao Meng commented on HUDI-2873:
--------------------------------

[~alexey.kudinkin]  [~shibei] 

1)  Support optimizing the data layout through Spark SQL, as Delta Lake does: 
OPTIMIZE xx_table ZORDER/HILBERT BY col1, col2;

2)  Introduce a new write operation that rewrites table data directly. At 
present, the clustering operation performs slightly worse than a direct 
overwrite.
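
For readers unfamiliar with the ZORDER option in item 1, the following is a
minimal sketch of the general Z-order (Morton) idea, not Hudi's or Delta
Lake's implementation. The helper name z_value, the 16-bit width, and the
sample rows are assumptions made up for illustration. It interleaves the bits
of two non-negative integer column values into a single sort key, so that rows
close in both columns tend to end up close together after sorting.

```python
def z_value(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton (Z-order) key.

    Hypothetical helper for illustration only; assumes non-negative integers.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # bits of x land on even positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # bits of y land on odd positions
    return z


# Made-up (col1, col2) pairs standing in for table rows.
rows = [(3, 0), (0, 1), (1, 1), (2, 3), (0, 0)]

# Sorting by the interleaved key gives the Z-order layout of the rows.
sorted_rows = sorted(rows, key=lambda r: z_value(r[0], r[1]))
print(sorted_rows)
```

A real layout optimization would also have to map arbitrary column types
(strings, decimals, timestamps) onto comparable integers before interleaving,
which this sketch leaves out.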

 

 

> Support optimize data layout by sql and make the build more fast
> ----------------------------------------------------------------
>
>                 Key: HUDI-2873
>                 URL: https://issues.apache.org/jira/browse/HUDI-2873
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: Performance, spark
>            Reporter: tao meng
>            Priority: Major
>             Fix For: 0.11.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)
