homatthew commented on code in PR #3818:
URL: https://github.com/apache/gobblin/pull/3818#discussion_r1380547570
##########
gobblin-modules/gobblin-orc/src/main/java/org/apache/gobblin/writer/GobblinBaseOrcWriter.java:
##########
@@ -259,6 +261,15 @@ public void commit()
throws IOException {
closeInternal();
super.commit();
+ if(this.validateORCDuringCommit) {
+ try {
+ OrcFile.createReader(this.outputFile, new OrcFile.ReaderOptions(conf));
+ } catch (IOException ioException) {
+ log.error("Found error when validating ORC file {} during commit phase", this.outputFile, ioException);
+ log.error("Delete the malformed ORC file is successful: {}", this.fs.delete(this.outputFile, false));
Review Comment:
I'm still not sure about the retries. If `fs.delete` fails, the malformed
file stays in place and we never retry the delete. That is fine in `commit()`,
because we throw the `IOException` there, which prevents the file from being
moved. But we do not do this in `close()`, which also calls `closeInternal()`.
If we flush the buffer on that path, we should validate the file afterwards
there as well.
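
A rough sketch of the shape I have in mind, not a drop-in patch. `FileValidator` and `FileDeleter` are hypothetical stand-ins for the `OrcFile.createReader(...)` check and `this.fs.delete(this.outputFile, false)`, and `maxDeleteAttempts` is an assumed knob that does not exist in the PR. The idea is that the delete is retried a bounded number of times and the validation error is always rethrown, so neither `commit()` nor `close()` can silently leave a bad file behind:

```java
import java.io.IOException;

public class OrcValidationSketch {

  /** Hypothetical stand-in for opening the written file with an ORC reader. */
  public interface FileValidator {
    void validate(String path) throws IOException;
  }

  /** Hypothetical stand-in for a non-recursive file system delete. */
  public interface FileDeleter {
    boolean delete(String path) throws IOException;
  }

  /**
   * Validates the file. If validation fails, tries to delete the file up to
   * maxDeleteAttempts times, then rethrows the validation error in every case
   * so the caller cannot go on to publish the file.
   */
  public static void validateOrDelete(String path, FileValidator validator,
      FileDeleter deleter, int maxDeleteAttempts) throws IOException {
    try {
      validator.validate(path);
    } catch (IOException validationError) {
      boolean deleted = false;
      for (int attempt = 1; attempt <= maxDeleteAttempts && !deleted; attempt++) {
        try {
          deleted = deleter.delete(path);
        } catch (IOException deleteError) {
          // Keep the delete failure attached to the original error.
          validationError.addSuppressed(deleteError);
        }
      }
      if (!deleted) {
        validationError.addSuppressed(new IOException(
            "Could not delete malformed file " + path + " after "
                + maxDeleteAttempts + " attempts"));
      }
      throw validationError;
    }
  }
}
```

Both `commit()` and `close()` could call a helper like this right after `closeInternal()`, so the check runs wherever the buffer is flushed.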