sivabalan narayanan created HUDI-5647:
-----------------------------------------

             Summary: Automate savepoint and restore tests
                 Key: HUDI-5647
                 URL: https://issues.apache.org/jira/browse/HUDI-5647
             Project: Apache Hudi
          Issue Type: Improvement
          Components: writer-core
            Reporter: sivabalan narayanan


Automate savepoint and restore tests



Scenarios to cover:

 

All scenarios below should be run:
with and without the metadata table,
on partitioned and non-partitioned datasets.
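
One way to cover these combinations is to parametrize each scenario over a small test matrix. The sketch below only enumerates the combinations; the axis names are hypothetical and are not tied to any Hudi config key.

```python
from itertools import product

# Hypothetical axis values; the issue names the axes but not how they are configured.
METADATA_ENABLED = [True, False]
PARTITIONED = [True, False]


def test_matrix():
    """Return every (metadata_enabled, partitioned) combination."""
    return list(product(METADATA_ENABLED, PARTITIONED))


for metadata_enabled, partitioned in test_matrix():
    print("metadata" if metadata_enabled else "no-metadata",
          "partitioned" if partitioned else "non-partitioned")
```

Each COW and MOR scenario would then be run once per entry of the matrix.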

COW

Format:
scenario being tested
timeline 
what to expect after restore. 

1. straightforward
C1, C2, savepoint C2. C3, C4, restore. 
should go back to C2. 
C3, C4 should be cleaned up. 

2. pending inflight. 
C1, C2, savepoint C2. C3, C4 inflight. restore. 
should go back to C2. 
C3, C4 should be cleaned up. 

3. completed rollbacks in timeline. 
C1, C2, savepoint C2, C3, C4 (RB_C3), C5. restore. 
should go back to C2. 
C3, C4(RB_C3), C5 should be cleaned up. 

4. pending rollbacks after savepoint. 

C1, C2, savepoint C2, C3, C4 (RB_C3) inflight. restore. 
should go back to C2. 
C3, C4 (RB_C3) should be cleaned up. 

5. clean commits after savepoint. 
C1, C2, savepoint C2, C3, C4, C5 (clean C1), C6, restore
should go back to C2. 
C3, C4, C5 (clean C1), C6 should be cleaned up.

6. clustering. 
C1, C2, savepoint C2. C3, C4.replace commit, C5, restore. 
should go back to C2. 
C3, C4.replace commit, C5 should be cleaned up. 

7. pending clustering after savepoint. 
C1, C2, savepoint C2. C3, C4.replace commit.inflight, C5, restore. 
should go back to C2.
C3, C4.replace commit files and C5 files should be cleaned up. 

8. completed clustering before savepoint. 
C1, C2, C3.replacecommit.complete, C4, savepoint C4, C5, restore. 
should go back to C4.
C5 should be cleaned up. 

9. pending clustering before savepoint. 
C1, C2, C3.replace commit.inflight, C3, C4, savepoint C4, C5, restore 
should go back to C4. 
C5 should be cleaned up. If the pipeline is restarted, C3.replace commit should be 
re-attempted. 
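
The COW expectations above can be written down as a toy timeline model, which may help when turning the scenarios into assertions. This is only a sketch under simplifying assumptions: `Instant`, `restore` and `pending_to_retry` are hypothetical names, not Hudi APIs, and scenario 9 is encoded with a single C3 instant (the pending replace commit).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instant:
    name: str                  # e.g. "C3"
    action: str = "commit"     # "commit", "rollback", "clean" or "replacecommit"
    state: str = "completed"   # "completed" or "inflight"


def restore(timeline, savepoint):
    """Split the timeline into (kept, removed) around the savepointed instant.

    Toy rule: every instant after the savepoint is removed, whatever its
    action or state; everything up to and including the savepoint is kept.
    """
    names = [i.name for i in timeline]
    idx = names.index(savepoint)  # raises ValueError for an unknown savepoint
    return timeline[: idx + 1], timeline[idx + 1:]


def pending_to_retry(kept):
    """Inflight instants that survive the restore and should be re-attempted."""
    return [i.name for i in kept if i.state == "inflight"]


# Scenario 3: completed rollback after the savepoint.
scenario_3 = [Instant("C1"), Instant("C2"), Instant("C3"),
              Instant("C4", action="rollback"), Instant("C5")]
kept, removed = restore(scenario_3, "C2")
print([i.name for i in kept], [i.name for i in removed])

# Scenario 9: pending clustering before the savepoint.
scenario_9 = [Instant("C1"), Instant("C2"),
              Instant("C3", action="replacecommit", state="inflight"),
              Instant("C4"), Instant("C5")]
kept, removed = restore(scenario_9, "C4")
print([i.name for i in removed], pending_to_retry(kept))
```

A real test would additionally check the data files on storage, which this model does not represent.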

MOR 

1. simple one
DC1, DC2, DC3, savepoint DC3. DC4, DC5. restore
should rollback DC4 and DC5 
No files will be cleaned up. only rollback log appends. 

2. simple one w/ compaction. 
DC1, DC2, DC3, C4, savepoint C4. DC5, DC6. restore
should rollback DC5 and DC6 
No files will be cleaned up. only rollback log appends. 

3. another one w/ compaction. 
DC1, DC2, DC3, savepoint DC3, DC4, C5, DC6, DC7. restore
should rollback DC4, C5, DC6 and DC7. 
latest file slice should be fully cleaned up, with a rollback log append for DC4 
in the first file slice. 

4. compaction and clean commits. 
DC1, DC2, DC3, savepoint DC3, DC4, C5, DC6, DC7, DC8, C9, C10.clean, DC11, DC12 
restore. 
should take the table back to DC3. 
Cleaner should not have cleaned up file slice 1 since it was part of savepoint. 
Entire file slice 2 and 3 should be cleaned up. 
i.e. C5, DC6, DC7, DC8, C9, C10.clean, DC11, DC12. and a rollback log append 
for DC4. 

5. pending compaction after savepoint. 
DC1, DC2, DC3, savepoint DC3, DC4, C5.pending. DC6, DC7. restore
should rollback until DC3. 
latest file slice should be fully deleted. for DC4 a rollback log append should 
be made. 

6. pending compaction before savepoint. 
DC1, DC2, DC3, C4.pending, DC5, savepoint DC5, DC6, DC7. restore
should rollback until DC5. 
rollback log appends for DC6 and DC7. 

7. compaction and clustering. completed clustering before savepoint. 
DC1, DC2, DC3, C4, DC5, C6.replacecommit.completed. DC7, savepoint DC7, DC8, 
DC9. restore
inspect what C6 does. likely it will create a new file group. and then start 
taking in DC7. 
should take the table back to DC7. 
rollback log appends for DC8 and DC9. 

8. compaction and clustering. completed clustering after savepoint. 
DC1, DC2, DC3, C4, DC5, savepoint DC5, C6.replacecommit.completed, DC7, DC8, 
restore
inspect what C6 does. likely it will create a new file group. and then start 
taking in DC7. 
should take the table back to DC5. 
latest file slice created by C6 should be cleaned up fully. 

9. pending clustering before savepoint. 
DC1, DC2, DC3, C4, DC5, C6.replacecommit.inflight. DC7, savepoint DC7, DC8, 
DC9. restore
should take the table back to DC7. 
rollback log appends for DC8 and DC9. when pipeline is restarted, C6 should be 
re-attempted and get to completion. 

10. pending clustering after savepoint. 
DC1, DC2, DC3, C4, DC5, savepoint DC5, C6.replacecommit.inflight, DC7, DC8, 
restore
should take the table back to DC5. 
latest file slice created by C6 should be cleaned up fully. 

11. completed rollbacks after savepoint. 
DC1, DC2, DC3, C4, savepoint C4. DC5, C6(RB_DC5), DC7. restore
should rollback DC5, C6 and DC7. 
No files will be cleaned up. only rollback log appends. 
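
For the compaction-only MOR scenarios (1 to 5), the split between "rollback log appends" and "file slice fully cleaned up" can be sketched with a toy model of a single file group. The assumptions here are mine, not taken from Hudi code: a name starting with "C" is a compaction (completed or pending) that starts a new file slice, anything else is a delta commit appended to the current slice, and clustering is ignored. `plan_mor_restore` is a hypothetical helper.

```python
def plan_mor_restore(timeline, savepoint):
    """Classify the instants after the savepoint for a single file group.

    Returns (rollback_appends, removed_with_slices):
      rollback_appends     delta commits written to a file slice that already
                           existed at the savepoint; undone via rollback log appends
      removed_with_slices  instants belonging to file slices created after the
                           savepoint; those slices are deleted entirely
    """
    sp = timeline.index(savepoint)  # raises ValueError for an unknown savepoint
    rollback_appends, removed_with_slices = [], []
    new_slice_started = False
    for name in timeline[sp + 1:]:
        if name.startswith("C"):  # compaction: a new file slice begins here
            new_slice_started = True
        if new_slice_started:
            removed_with_slices.append(name)
        else:
            rollback_appends.append(name)
    return rollback_appends, removed_with_slices


# Scenario 1: only delta commits after the savepoint.
print(plan_mor_restore(["DC1", "DC2", "DC3", "DC4", "DC5"], "DC3"))

# Scenario 3: a compaction after the savepoint starts a second file slice.
print(plan_mor_restore(["DC1", "DC2", "DC3", "DC4", "C5", "DC6", "DC7"], "DC3"))
```

Under these assumptions scenario 1 yields only rollback log appends, while scenario 3 yields one append (DC4) plus a fully removed second file slice.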

 

Few more cases to test:

 

case 1:
rolling back a commit that's already cleaned up: 
C1, C2, C3, C4, SP_C4, C5, C6, C7, C8, cleaner_C9 (cleaned up C1, C2, C3, C5), 
C10, restore. 

case 2: 
inflight clean after savepoint which is supposed to clean up files pertaining 
to a commit that will be rolled back by restore. 
C1, C2, C3, C4, SP_C4, C5, C6, C7, C8, cleaner_C9.inflight (cleaned up C1, C2, 
C3, C5), C10, restore. 

after restore:
C1, C2, C3, C4, SP_C4, cleaner_C9.inflight 
at some point, cleaner will retry. 

Fix: restore should first finish any pending clean after savepoint and then 
start the restore. 
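
The ordering proposed in the fix can be sketched as follows. This is a toy model, not Hudi code: `restore_finishing_cleans_first` is a hypothetical name, instants are plain tuples, and rolling back newest-first is an assumption of the sketch.

```python
def restore_finishing_cleans_first(timeline, savepoint):
    """timeline: list of (name, action, state) tuples in commit order.

    Returns (steps, remaining): the ordered actions taken, and the instants
    left on the timeline afterwards.
    """
    names = [name for name, _, _ in timeline]
    sp = names.index(savepoint)  # raises ValueError for an unknown savepoint
    steps = []
    # Step 1: complete any clean that is still inflight after the savepoint.
    for name, action, state in timeline[sp + 1:]:
        if action == "clean" and state == "inflight":
            steps.append(f"finish clean {name}")
    # Step 2: roll back everything after the savepoint, newest first.
    for name, _, _ in reversed(timeline[sp + 1:]):
        steps.append(f"rollback {name}")
    return steps, timeline[: sp + 1]


# Case 2: C1..C8 completed, C9 is an inflight clean, C10 completed, savepoint at C4.
case_2 = [(f"C{i}", "commit", "completed") for i in range(1, 9)]
case_2 += [("C9", "clean", "inflight"), ("C10", "commit", "completed")]
steps, remaining = restore_finishing_cleans_first(case_2, "C4")
print(steps)
print([name for name, _, _ in remaining])
```

With this ordering no inflight clean is left behind on the restored timeline, so there is nothing for the cleaner to retry later.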

 

More cases:

12:
rolling back a commit that's already cleaned up: 
C1, C2, C3, C4, SP_C4, C5, C6, C7, C8, cleaner_C9 (cleaned up C1, C2, C3, C5), 
C10, restore. 

13: 
inflight clean after savepoint which is supposed to clean up files pertaining 
to a commit that will be rolled back by restore. 
C1, C2, C3, C4, SP_C4, C5, C6, C7, C8, cleaner_C9.inflight (cleaned up C1, C2, 
C3, C5), C10, restore. 

after restore:
C1, C2, C3, C4, SP_C4, cleaner_C9.inflight 
at some point, cleaner will retry. 

When cleaner retries, it does succeed w/o any issues. 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
