[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table

2020-01-07 Thread GitBox
lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to 
archive commits error when writing data to MOR/COW table
URL: https://github.com/apache/incubator-hudi/pull/1128#issuecomment-571919780
 
 
   > @bvaradar Yes, it is part of the code, I checked it. I think the problem 
came because, I ran the job when the bug was introduced, so it created many 
empty files. After the bug was fixed, I assume it only fixed by adding content 
back to `clean.requested` files (new `clean.requested` files are not empty), 
but there is no code to handle the empty files that were created due to the bug.
   
   You're right, this pr did not handle the empty files which were created due 
to that bug, because most people haven't been affected by HUDI-308. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table

2019-12-28 Thread GitBox
lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to 
archive commits error when writing data to MOR/COW table
URL: https://github.com/apache/incubator-hudi/pull/1128#issuecomment-569474638
 
 
   > @lamber-ken : there is a bug in 
HoodieActiveTimeline.saveToCleanRequested(). It was never meant to be empty. 
The cleaner plan needs to be stored in requested file in the timeline
   > 
   > Remove these 2 lines
   > 
   > * // Plan is only stored in auxiliary folder
   > * createFileInMetaPath(instant.getFileName(), Option.empty(), false);
   >   and add
   > 
   > * createFileInMetaPath(instant.getFileName(), content, false);
   > 
   > This should guarantee that requested clean file is non-empty and no longer 
need any special casing in archiving.
   
   Done. I think it is not logic error, unit tests should not cover it, because 
we don't create an empty file manually. WDYT?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table

2019-12-28 Thread GitBox
lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to 
archive commits error when writing data to MOR/COW table
URL: https://github.com/apache/incubator-hudi/pull/1128#issuecomment-569474638
 
 
   > @lamber-ken : there is a bug in 
HoodieActiveTimeline.saveToCleanRequested(). It was never meant to be empty. 
The cleaner plan needs to be stored in requested file in the timeline
   > 
   > Remove these 2 lines
   > 
   > * // Plan is only stored in auxiliary folder
   > * createFileInMetaPath(instant.getFileName(), Option.empty(), false);
   >   and add
   > 
   > * createFileInMetaPath(instant.getFileName(), content, false);
   > 
   > This should guarantee that requested clean file is non-empty and no longer 
need any special casing in archiving.
   
   Done. I think it's a logic error, unit tests should not cover it, because we 
don't create an empty file manually. WDYT?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table

2019-12-28 Thread GitBox
lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to 
archive commits error when writing data to MOR/COW table
URL: https://github.com/apache/incubator-hudi/pull/1128#issuecomment-569474638
 
 
   > @lamber-ken : there is a bug in 
HoodieActiveTimeline.saveToCleanRequested(). It was never meant to be empty. 
The cleaner plan needs to be stored in requested file in the timeline
   > 
   > Remove these 2 lines
   > 
   > * // Plan is only stored in auxiliary folder
   > * createFileInMetaPath(instant.getFileName(), Option.empty(), false);
   >   and add
   > 
   > * createFileInMetaPath(instant.getFileName(), content, false);
   > 
   > This should guarantee that requested clean file is non-empty and no longer 
need any special casing in archiving.
   
   Done. I think it's a logic error, unit tests should not cover it. WDYT?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table

2019-12-28 Thread GitBox
lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to 
archive commits error when writing data to MOR/COW table
URL: https://github.com/apache/incubator-hudi/pull/1128#issuecomment-569461408
 
 
   Hi @bvaradar, I went through the process again. There are three kinds of 
empty files(`*.clean.requested`, `*.commit.requested`, 
`*.deltacommit.requested`).
   
   When reading `*.clean.requested` files, will throw `Not an Avro data file` 
error. Because `org.apache.avro.file.DataFileReader#openReader` cat not read it.
   
   When reading `*.commit.requested`, `*.deltacommit.requested` files, 
`HoodieCommitMetadata#fromBytes` checks whether empty commit file or not, 
return a new HoodieCommitMetadata instance when meets empty commit file.
   
   `HoodieActiveTimeline#saveToCleanRequested` created the empty 
`*.clean.requested` files.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table

2019-12-28 Thread GitBox
lamber-ken edited a comment on issue #1128: [HUDI-453] Fix throw failed to 
archive commits error when writing data to MOR/COW table
URL: https://github.com/apache/incubator-hudi/pull/1128#issuecomment-569461408
 
 
   Hi @bvaradar, I went through the process again. There are three kinds of 
empty files(`*.clean.requested`, `*.commit.requested`, 
`*.deltacommit.requested`).
   
   When reading `*.clean.requested` files, will throw `Not an Avro data file` 
error. Because `org.apache.avro.file.DataFileReader#openReader` cat not read it.
   
   When reading `*.commit.requested`, `*.deltacommit.requested`, 
`HoodieCommitMetadata#fromBytes` checks whether empty commit file or not, 
return a new HoodieCommitMetadata instance when meets empty commit file.
   
   `HoodieActiveTimeline#saveToCleanRequested` created the empty 
`*.clean.requested` files.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services