[ 
https://issues.apache.org/jira/browse/HUDI-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952115#comment-16952115
 ] 

Vinoth Chandar edited comment on HUDI-289 at 10/15/19 5:06 PM:
---------------------------------------------------------------

The main thing to redo was: refactor the code so that these pipelines can be 
authored like a DAG. Right now, it's not very modular and it's a bit coupled.

Ideally, we have a bunch of test nodes, where each node is one of:

  - UpsertNode (doing an upsert)
  - InsertNode (doing an insert)
  - CheckDuplicatesNode (which checks if the table has any duplicates)
  - CheckCountsNode (which checks if the table has any missing rows)
  - CheckValuesNode (which checks if a few rows have expected values)
  - IncrementalPullNode (which incrementally pulls some rows, which may then 
be passed to another validator)
  - RollbackNode (injects a rollback)
  - CompactNode (forces a compaction)
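
To make the node idea a bit more concrete, here is a rough sketch of how a 
few of these nodes might look. It is only an illustration: the in-memory list 
of rows stands in for a Hudi table, and PipelineContext, execute() and 
run_pipeline() are made-up names, not existing Hudi APIs.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class PipelineContext:
    # Hypothetical stand-in for a Hudi table: a list of (key, value) rows.
    rows: list = field(default_factory=list)
    # Keys the workload has written so far, used to detect missing rows.
    expected_keys: set = field(default_factory=set)


class InsertNode:
    def __init__(self, records):
        self.records = records  # dict of key -> value

    def execute(self, ctx):
        # Blind append with no key lookup, so repeated keys create duplicates.
        ctx.rows.extend(self.records.items())
        ctx.expected_keys.update(self.records)


class UpsertNode:
    def __init__(self, records):
        self.records = records  # dict of key -> value

    def execute(self, ctx):
        # Drop older versions of the incoming keys, then append the new ones.
        ctx.rows = [(k, v) for k, v in ctx.rows if k not in self.records]
        ctx.rows.extend(self.records.items())
        ctx.expected_keys.update(self.records)


class CheckDuplicatesNode:
    def execute(self, ctx):
        counts = Counter(k for k, _ in ctx.rows)
        dupes = sorted(k for k, n in counts.items() if n > 1)
        if dupes:
            raise AssertionError(f"duplicate keys: {dupes}")


class CheckCountsNode:
    def execute(self, ctx):
        missing = ctx.expected_keys - {k for k, _ in ctx.rows}
        if missing:
            raise AssertionError(f"missing keys: {sorted(missing)}")


def run_pipeline(nodes, ctx):
    """Run the nodes in the given order; a failed check raises AssertionError."""
    for node in nodes:
        node.execute(ctx)
    return ctx
```

Because every node only exposes execute(ctx), write nodes and check nodes can 
be mixed in any order, e.g. run_pipeline([InsertNode({"a": 1}), 
UpsertNode({"a": 2}), CheckDuplicatesNode()], PipelineContext()).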

If these individual nodes can be decoupled and configured to plug and play, 
you can imagine how we could author tests using simple config files. That's 
my brain dump.
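
As a rough illustration of config-driven authoring, the sketch below reads a 
small JSON document that names each step, its node type and its upstream 
steps, then runs the steps in dependency order. The config layout, the 
REGISTRY of placeholder actions, and the load_dag()/run_dag() helpers are all 
assumptions made up for this example; real nodes would do actual work against 
a table. It relies on graphlib from the Python standard library (3.9+).

```python
import json
from graphlib import TopologicalSorter

# Hypothetical config format: step name -> node type and upstream step names.
CONFIG = """
{
  "first_insert": {"type": "InsertNode", "deps": []},
  "first_upsert": {"type": "UpsertNode", "deps": ["first_insert"]},
  "check_dupes":  {"type": "CheckDuplicatesNode", "deps": ["first_upsert"]},
  "check_counts": {"type": "CheckCountsNode", "deps": ["first_upsert"]},
  "compact":      {"type": "CompactNode", "deps": ["check_dupes", "check_counts"]}
}
"""


def make_action(node_type):
    # Placeholder action: it only records that the step ran.
    def action(step_name, trace):
        trace.append(f"{node_type}:{step_name}")
    return action


# Maps the "type" string in the config to the code that runs it.
REGISTRY = {
    node_type: make_action(node_type)
    for node_type in (
        "InsertNode", "UpsertNode", "CheckDuplicatesNode",
        "CheckCountsNode", "CompactNode",
    )
}


def load_dag(text):
    """Parse the config and reject unknown node types or dangling deps."""
    spec = json.loads(text)
    for name, entry in spec.items():
        if entry["type"] not in REGISTRY:
            raise ValueError(f"{name}: unknown node type {entry['type']!r}")
        for dep in entry["deps"]:
            if dep not in spec:
                raise ValueError(f"{name}: unknown dependency {dep!r}")
    return spec


def run_dag(spec):
    """Run every step after all of its dependencies; returns the run trace."""
    graph = {name: entry["deps"] for name, entry in spec.items()}
    trace = []
    # static_order() raises graphlib.CycleError if the config has a cycle.
    for name in TopologicalSorter(graph).static_order():
        REGISTRY[spec[name]["type"]](name, trace)
    return trace
```

With this shape, adding a new test scenario means writing another config 
rather than touching the runner, and the two check steps above can run in 
either order since neither depends on the other.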

Feel free to choose or ignore any part of it. Since this can be fairly 
involved, focusing on something simpler with a few nodes and iterating from 
there may also be a good approach.



was (Author: vc):
Main thing to redo was : Refactor the code in a way that these pipelines can be 
authored like a DAG. Right now, it's not very modular.  



> Implement a long running test for Hudi writing and querying end-end
> -------------------------------------------------------------------
>
>                 Key: HUDI-289
>                 URL: https://issues.apache.org/jira/browse/HUDI-289
>             Project: Apache Hudi (incubating)
>          Issue Type: Test
>          Components: Usability
>            Reporter: Vinoth Chandar
>            Assignee: vinoyang
>            Priority: Major
>
> We would need an equivalent of an end-to-end test which runs some workload 
> for at least a few hours, triggers various actions like commit, deltacommit, 
> rollback, and compaction, and ensures correctness of the code before every 
> release.
> P.S: Learn from all the CSS issues managing compaction.. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
