Davis-Zhang-Onehouse commented on code in PR #12826:
URL: https://github.com/apache/hudi/pull/12826#discussion_r1964408231


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/LegacyArchivedMetaEntryReader.java:
##########
@@ -103,8 +104,8 @@ private Pair<HoodieInstant, Option<byte[]>> 
readInstant(GenericRecord record) {
           // should be json bytes.
           try {
             HoodieInstant instant = 
metaClient.getInstantGenerator().createNewInstant(HoodieInstant.State.COMPLETED,
 action, instantTime, stateTransitionTime);
-            org.apache.hudi.common.model.HoodieCommitMetadata commitMetadata = 
new CommitMetadataSerDeV1().deserialize(instant, 
getUTF8Bytes(actionData.toString()),
-                org.apache.hudi.common.model.HoodieCommitMetadata.class);
+            org.apache.hudi.common.model.HoodieCommitMetadata commitMetadata = 
new CommitMetadataSerDeV1().deserialize(
+                instant, Option.of(new 
ByteArrayInputStream(getUTF8Bytes(actionData.toString()))), 
org.apache.hudi.common.model.HoodieCommitMetadata.class);

Review Comment:
   here we start with byte[] of json -> go to Pojo HoodieCommitMetadata -> 
serialize to avro HoodieCommitMetadata.
   
   The method I wrote is for avro HoodieCommitMetadata to pojo 
HoodieCommitMetadata, which does not fit into the use case here



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanActionExecutor.java:
##########
@@ -193,10 +193,9 @@ protected Option<HoodieCleanerPlan> requestClean(String 
startCleanTime) {
         activeTimeline.deleteEmptyInstantIfExists(cleanInstant);
         HoodieInstant cleanPlanInstant = new 
HoodieInstant(HoodieInstant.State.INFLIGHT, cleanInstant.getAction(), 
cleanInstant.requestedTime(), 
InstantComparatorV1.REQUESTED_TIME_BASED_COMPARATOR);
         try {
-          Option<byte[]> content = 
activeTimeline.getInstantDetails(cleanPlanInstant);
           // Deserialize plan if it is non-empty
-          if (content.map(bytes -> bytes.length > 0).orElse(false)) {
-            return 
Option.of(TimelineMetadataUtils.deserializeCleanerPlan(content.get()));
+          if (!activeTimeline.isEmpty(cleanPlanInstant)) {

Review Comment:
   done



##########
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantReader.java:
##########
@@ -31,21 +31,23 @@
 public interface HoodieInstantReader {
   /**
    * Reads the provided instant's content into a stream for parsing.
+   *
    * @param instant the instant to read
    * @return an InputStream with the content
    */
-  default InputStream getContentStream(HoodieInstant instant) {
+  default Option<InputStream> getContentStream(HoodieInstant instant) {
     throw new RuntimeException("Not implemented");
   }
 
   /**
+   * dd
    * Reads the provided instant's content into a byte array for parsing.
    * @param instant the instant to read
    * @return an InputStream with the details
    */
   @Deprecated
   default Option<byte[]> getInstantDetails(HoodieInstant instant) {
-    try (InputStream inputStream = getContentStream(instant)) {
+    try (InputStream inputStream = getContentStream(instant).get()) {

Review Comment:
   So if we check the old code of getInstantDetails for byte[] in day 1, it 
unnecessarily warp the return value with Option.of(). The logic there never 
branching to return anything like Option.empty.
   
   Same case in getContentStream, we keep that un-wanted Option wrapper so the 
scope of code refactoring is less. Later we can follow up on it.
   
   To summarize, the old logic always assume the file should be there. If no, 
exception is thrown. For getContentStream it is either return with 
Option(something) or throw exception. I think it would be safer to stick to 
this behavior and let code error out if Option.empty() is somehow returned, as 
oppose to have extra logic implicitly supporting some new corner cases without 
notice.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to