[ 
https://issues.apache.org/jira/browse/HUDI-7544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-7544:
---------------------------------
    Description: 
First, we need to summarize the access patterns to the LSM timeline:
 * Who reads from and writes to it, and at what frequency (i.e. once per query, 
once per table service x, or multiple times in a commit, etc.)
 * Understand the defaults that control performance (e.g. the completion time 
query view loading the last 7 days, or the LSM timeline, or something similar)
 * Flag anything that can cause correctness issues for writes/queries, based 
on the optimizations done or the design.
 * Finally, with the same or an updated benchmark, run a large LSM timeline and 
ensure it is performant and efficient (in terms of cloud API calls).
 * Ensure the LSM timeline is well maintained (compaction, etc. runs at the 
right frequency) with a long-running test, and ensure it does not leak memory, 
etc.

 

 

  was:
First, we need to summarize the access patterns to the LSM timeline:
 * Who reads from and writes to it, and at what frequency (i.e. once per query, 
once per table service x, or multiple times in a commit, etc.)
 * Understand the defaults that control performance (e.g. the completion time 
query view loading the last 7 days, or the LSM timeline, or something similar)
 * Flag anything that can cause correctness issues for writes/queries, based 
on the optimizations done or the design.
 * Finally, with the same or an updated benchmark, run a large LSM timeline and 
ensure it is performant and efficient (in terms of cloud API calls).

 

 


> Harden, Stress and Performance test the LSM timeline on cloud storage
> ---------------------------------------------------------------------
>
>                 Key: HUDI-7544
>                 URL: https://issues.apache.org/jira/browse/HUDI-7544
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: Vinoth Chandar
>            Assignee: Sagar Sumit
>            Priority: Blocker
>             Fix For: 1.0.0
>
>
> First, we need to summarize the access patterns to the LSM timeline:
>  * Who reads from and writes to it, and at what frequency (i.e. once per 
> query, once per table service x, or multiple times in a commit, etc.)
>  * Understand the defaults that control performance (e.g. the completion time 
> query view loading the last 7 days, or the LSM timeline, or something similar)
>  * Flag anything that can cause correctness issues for writes/queries, based 
> on the optimizations done or the design.
>  * Finally, with the same or an updated benchmark, run a large LSM timeline 
> and ensure it is performant and efficient (in terms of cloud API calls).
>  * Ensure the LSM timeline is well maintained (compaction, etc. runs at the 
> right frequency) with a long-running test, and ensure it does not leak 
> memory, etc.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
