guanziyue commented on code in PR #4913:
URL: https://github.com/apache/hudi/pull/4913#discussion_r1299430746


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##########
@@ -273,4 +280,31 @@ protected static Option<IndexedRecord> toAvroRecord(HoodieRecord record, Schema
       return Option.empty();
     }
   }
+
+  protected class AppendLogWriteCallback implements HoodieLogFileWriteCallback {
+    // Here we distinguish log files newly created from log files being appended to.
+    // Consider the following scenario:
+    // An appending task writes to a log file.
+    // (1) append to existing file file_instant_writetoken1.log.1
+    // (2) roll over and create file file_instant_writetoken2.log.2
+    // Then this task fails and is retried by a new task.
+    // (3) append to existing file file_instant_writetoken1.log.1
+    // (4) roll over and create file file_instant_writetoken3.log.2
+    // Finally, file_instant_writetoken2.log.2 should not be committed to hudi;
+    // we use a marker file to delete it.
+    // Keep in mind that a log file is not always fail-safe unless it never rolls over.
+

Review Comment:
   > > I see. so its an issue even w/ S3 like systems? Let me go over the scenario that you are referring to so that we are on same page.
   > > MDT disabled.
   > > Writer1: writer1 updates records in file group1 which already has a base file and 1 log file. writer1 writes log file2 and logfile3 (due to spark task retries). but ideally we just need only one log file, i.e. log file2.
   > > Writer2: Concurrently tries to do a snapshot read from the same table. Before the reconcile step for writer1 could execute, hudi returns all 3 log files (log file1, logfile2 and log file3) as part of FSView.
   > > Writer1: goes through the reconcile logic and deletes the extraneous log file. so log file3 is deleted.
   > > Writer2: continues w/ actual read, where it hits file not found exception wrt log file3.
   > 
   > My bad. Let me try to make things clearer. We actually discussed two problems here. The first is the issue fixed by [HUDI-6401](https://issues.apache.org/jira/browse/HUDI-6401). The marker file plays two roles: it flags the file writing operation (create or append), and it fences concurrent writes to the same file. A write op must first check the marker file to see whether it has won permission to write. If not, it should return a signal rather than throw an exception. Here I had used the create method, which was wrong because it throws an exception when a marker conflict happens. With [HUDI-6401](https://issues.apache.org/jira/browse/HUDI-6401), this is fully solved by replacing create with createIfNotExists. Now concurrent writing works fine.
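   The fencing behavior described above can be sketched as follows. This is a minimal stand-in, not Hudi's actual marker implementation: a hypothetical in-memory marker set where an atomic createIfNotExists-style call returns a boolean signal to the losing writer instead of throwing, so a retry task can detect that it lost the race for a file and react (e.g. roll over to a new write token).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class MarkerFenceSketch {
  // In-memory stand-in for the marker directory on storage (illustrative only).
  private static final Set<String> MARKERS = ConcurrentHashMap.newKeySet();

  /** Atomic check-and-create: true if this writer won the marker, false if it already exists. */
  public static boolean createIfNotExists(String markerName) {
    return MARKERS.add(markerName);
  }

  public static void main(String[] args) {
    // The first attempt (or the original failed task) wins the marker.
    boolean winner = createIfNotExists("file_instant_writetoken2.log.2.marker");
    // A retry hitting the same marker gets a boolean signal, not an exception.
    boolean loser = createIfNotExists("file_instant_writetoken2.log.2.marker");
    System.out.println(winner + " " + loser); // prints "true false"
  }
}
```

   With plain create semantics the second call would instead throw on the marker conflict, which is the behavior the fix removes.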
   
   
   
   > I see. so its an issue even w/ S3 like systems? Let me go over the scenario that you are referring to so that we are on same page.
   > 
   > MDT disabled.
   > 
   > Writer1: writer1 updates records in file group1 which already has a base file and 1 log file. writer1 writes log file2 and logfile3 (due to spark task retries). but ideally we just need only one log file, i.e. log file2.
   > 
   > Writer2: Concurrently tries to do a snapshot read from the same table. Before the reconcile step for writer1 could execute, hudi returns all 3 log files (log file1, logfile2 and log file3) as part of FSView.
   > 
   > Writer1: goes through the reconcile logic and deletes the extraneous log file. so log file3 is deleted.
   > 
   > Writer2: continues w/ actual read, where it hits file not found exception wrt log file3.
   
   Let's move on to the currently unsolved problem. It is entirely about write/read isolation. Even without concurrent writing, the problem exists.
   MDT disabled.
   t1, a writing job starts writing a log file to a FileGroup that already has a valid base file. The FG base instant time is t0.
   t2, the writing job creates two log files: fileid_t0.log.1_1_0_1, generated by a failed task, and fileid_t0.log.1_1_0_2, generated by a successful retry task.
   t3, a reading job starts. The FileSystemView shows that the FG has one base file and two log files, so it will read all of them.
   t4, the writing job reaches the reconcile step. fileid_t0.log.1_1_0_1 is invalid, so it is deleted and the commit finishes.
   t5, the reading job tries to read fileid_t0.log.1_1_0_1, but the file no longer exists. An error occurs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]