[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table
lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table URL: https://github.com/apache/incubator-hudi/pull/1128#issuecomment-571919780 > @bvaradar Yes, it is part of the code, I checked it. I think the problem came because, I ran the job when the bug was introduced, so it created many empty files. After the bug was fixed, I assume it only fixed by adding content back to `clean.requested` files (new `clean.requested` files are not empty), but there is no code to handle the empty files that were created due to the bug. You're right, this pr did not handle the empty files which were created due to that bug, because most people haven't been affected by HUDI-308. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table
lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table URL: https://github.com/apache/incubator-hudi/pull/1128#issuecomment-569474638 > @lamber-ken : there is a bug in HoodieActiveTimeline.saveToCleanRequested(). It was never meant to be empty. The cleaner plan needs to be stored in requested file in the timeline > > Remove these 2 lines > > * // Plan is only stored in auxiliary folder > * createFileInMetaPath(instant.getFileName(), Option.empty(), false); > and add > > * createFileInMetaPath(instant.getFileName(), content, false); > > This should guarantee that requested clean file is non-empty and no longer need any special casing in archiving. Done. I think it is not logic error, unit tests should not cover it, because we don't create an empty file manually. WDYT? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table
lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table URL: https://github.com/apache/incubator-hudi/pull/1128#issuecomment-569474638 > @lamber-ken : there is a bug in HoodieActiveTimeline.saveToCleanRequested(). It was never meant to be empty. The cleaner plan needs to be stored in requested file in the timeline > > Remove these 2 lines > > * // Plan is only stored in auxiliary folder > * createFileInMetaPath(instant.getFileName(), Option.empty(), false); > and add > > * createFileInMetaPath(instant.getFileName(), content, false); > > This should guarantee that requested clean file is non-empty and no longer need any special casing in archiving. Done. I think it's a logic error, unit tests should not cover it, because we don't create an empty file manually. WDYT? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table
lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table URL: https://github.com/apache/incubator-hudi/pull/1128#issuecomment-569474638 > @lamber-ken : there is a bug in HoodieActiveTimeline.saveToCleanRequested(). It was never meant to be empty. The cleaner plan needs to be stored in requested file in the timeline > > Remove these 2 lines > > * // Plan is only stored in auxiliary folder > * createFileInMetaPath(instant.getFileName(), Option.empty(), false); > and add > > * createFileInMetaPath(instant.getFileName(), content, false); > > This should guarantee that requested clean file is non-empty and no longer need any special casing in archiving. Done. I think it's a logic error, unit tests should not cover it. WDYT? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table
lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table URL: https://github.com/apache/incubator-hudi/pull/1128#issuecomment-569461408 Hi @bvaradar, I went through the process again. There are three kinds of empty files(`*.clean.requested`, `*.commit.requested`, `*.deltacommit.requested`). When reading `*.clean.requested` files, will throw `Not an Avro data file` error. Because `org.apache.avro.file.DataFileReader#openReader` cat not read it. When reading `*.commit.requested`, `*.deltacommit.requested` files, `HoodieCommitMetadata#fromBytes` checks whether empty commit file or not, return a new HoodieCommitMetadata instance when meets empty commit file. `HoodieActiveTimeline#saveToCleanRequested` created the empty `*.clean.requested` files. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table
lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table URL: https://github.com/apache/incubator-hudi/pull/1128#issuecomment-569461408 Hi @bvaradar, I went through the process again. There are three kinds of empty files(`*.clean.requested`, `*.commit.requested`, `*.deltacommit.requested`). When reading `*.clean.requested` files, will throw `Not an Avro data file` error. Because `org.apache.avro.file.DataFileReader#openReader` cat not read it. When reading `*.commit.requested`, `*.deltacommit.requested`, `HoodieCommitMetadata#fromBytes` checks whether empty commit file or not, return a new HoodieCommitMetadata instance when meets empty commit file. `HoodieActiveTimeline#saveToCleanRequested` created the empty `*.clean.requested` files. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services