homatthew commented on code in PR #3818:
URL: https://github.com/apache/gobblin/pull/3818#discussion_r1380547570


##########
gobblin-modules/gobblin-orc/src/main/java/org/apache/gobblin/writer/GobblinBaseOrcWriter.java:
##########
@@ -259,6 +261,15 @@ public void commit()
       throws IOException {
     closeInternal();
     super.commit();
+    if(this.validateORCDuringCommit) {
+      try {
+        OrcFile.createReader(this.outputFile, new OrcFile.ReaderOptions(conf));
+      } catch (IOException ioException) {
+        log.error("Found error when validating ORC file {} during commit phase", this.outputFile, ioException);
+        log.error("Delete the malformed ORC file is successful: {}", this.fs.delete(this.outputFile, false));

Review Comment:
   Still not sure about the retries. If `fs.delete` fails, we won't delete 
the file, but we also won't retry the delete. This works when we call `commit()`, 
because we throw the `IOException` and that prevents the file from being moved. 
But we do not do this validation when we close the file in `close()`, which 
also calls `closeInternal()`. 
   
   If we flush the buffer, we should check afterwards that the file is valid.
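
A rough sketch of the shape I have in mind, not a drop-in change for `GobblinBaseOrcWriter`. Everything named here is hypothetical: `FileValidator` stands in for opening the file with the ORC reader, `FileDeleter` stands in for `fs.delete(path, false)`, and `maxDeleteAttempts` is an assumed setting. The idea is that both `commit()` and `close()` could call the same helper after `closeInternal()`, the delete is retried a bounded number of times, and the validation error is always rethrown so the caller never treats a malformed file as good.

```java
import java.io.IOException;

class ValidateAfterCloseSketch {

  /** Hypothetical stand-in for opening the written file with an ORC reader. */
  interface FileValidator {
    void validate(String path) throws IOException;
  }

  /** Hypothetical stand-in for a non-recursive file system delete. */
  interface FileDeleter {
    boolean delete(String path) throws IOException;
  }

  /**
   * Validates the file; on failure, tries to delete it up to maxDeleteAttempts
   * times and then rethrows the original validation error in every case.
   */
  static void validateOrDelete(String path, FileValidator validator,
      FileDeleter deleter, int maxDeleteAttempts) throws IOException {
    try {
      validator.validate(path);
    } catch (IOException validationError) {
      boolean deleted = false;
      for (int attempt = 1; attempt <= maxDeleteAttempts && !deleted; attempt++) {
        try {
          deleted = deleter.delete(path);
        } catch (IOException deleteError) {
          // Keep the delete failure attached to the original error.
          validationError.addSuppressed(deleteError);
        }
      }
      if (!deleted) {
        validationError.addSuppressed(new IOException(
            "Could not delete malformed file " + path + " after "
                + maxDeleteAttempts + " attempts"));
      }
      // Always propagate, so neither commit nor close reports success.
      throw validationError;
    }
  }
}
```

With something like this, a failed delete is at least visible to the caller as a suppressed exception instead of only a log line, and `close()` gets the same check as `commit()`.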



