Re: [PR] [HUDI-9664] Refactor HoodieReaderContext and move record APIs into RecordContext [hudi]

via GitHub Wed, 30 Jul 2025 06:56:57 -0700


lokeshj1703 commented on code in PR #13646:
URL: https://github.com/apache/hudi/pull/13646#discussion_r2242776084



##########
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java:
##########
@@ -483,65 +336,4 @@ public abstract ClosableIterator<T> mergeBootstrapReaders(ClosableIterator<T> sk
   public final UnaryOperator<T> projectRecord(Schema from, Schema to) {
     return projectRecord(from, to, Collections.emptyMap());
   }
-
-  /**
-   * Converts the ordering value to the specific engine type.
-   */
-  public final Comparable convertOrderingValueToEngineType(Comparable value) {
-    return value instanceof ArrayComparable
-        ? ((ArrayComparable) value).apply(comparable -> convertValueToEngineType(comparable))
-        : convertValueToEngineType(value);
-  }
-
-  /**
-   * Returns the value to a type representation in a specific engine.
-   * <p>
-   * This can be overridden by the reader context implementation on a specific engine to handle
-   * engine-specific field type system.  For example, Spark uses {@code UTF8String} to represent
-   * {@link String} field values, so we need to convert the values to {@code UTF8String} type
-   * in Spark for proper value comparison.
-   *
-   * @param value {@link Comparable} value to be converted.
-   *
-   * @return the converted value in a type representation in a specific engine.
-   */
-  public Comparable convertValueToEngineType(Comparable value) {
-    return value;
-  }
-
-  /**
-   * Extracts the record position value from the record itself.
-   *
-   * @return the record position in the base file.
-   */
-  public long extractRecordPosition(T record, Schema schema, String fieldName, long providedPositionIfNeeded) {
-    if (supportsParquetRowIndex()) {
-      Object position = getValue(record, schema, fieldName);
-      if (position != null) {
-        return (long) position;
-      } else {
-        throw new IllegalStateException("Record position extraction failed");
-      }
-    }
-    return providedPositionIfNeeded;
-  }
-
-  public boolean supportsParquetRowIndex() {
-    return false;
-  }
-
-  /**
-   * Encodes the given avro schema for efficient serialization.
-   */
-  public Integer encodeAvroSchema(Schema schema) {
-    return this.localAvroSchemaCache.cacheSchema(schema);
-  }
-
-  /**
-   * Decodes the avro schema with given version ID.
-   */
-  @Nullable
-  protected Schema decodeAvroSchema(Object versionId) {

Review Comment:
   The changes in this PR were made to provide a serializable reader context. 
Since most of the APIs that could be serialized were record-related, the new 
entity was named a record context. We can address this as part of HUDI-9602. 
This PR is needed so that the buffered record merger can be used for indexing.
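
   To illustrate the split described above, here is a minimal sketch, not the 
actual Hudi code. All class names (`SketchRecordContext`, 
`WideningRecordContext`, `SketchReaderContext`, `SerDe`) are hypothetical, and 
the Integer-to-Long conversion is only a stand-in for an engine-specific 
conversion such as the `UTF8String` case mentioned in the removed Javadoc. The 
idea it shows: record-level APIs live in a small `Serializable` object, while 
the reader context keeps its non-serializable state and delegates to it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the serializable, record-level part of a reader context.
class SketchRecordContext implements Serializable {
  private static final long serialVersionUID = 1L;

  // Identity by default; an engine-specific subclass may override this.
  public Comparable<?> convertValueToEngineType(Comparable<?> value) {
    return value;
  }
}

// Hypothetical engine-specific variant: widens Integer to Long (illustration only).
class WideningRecordContext extends SketchRecordContext {
  private static final long serialVersionUID = 1L;

  @Override
  public Comparable<?> convertValueToEngineType(Comparable<?> value) {
    if (value instanceof Integer) {
      return Long.valueOf(((Integer) value).longValue());
    }
    return value;
  }
}

// Hypothetical reader context: owns non-serializable state and exposes the record context.
class SketchReaderContext {
  // Stands in for state that cannot be serialized (storage handles, caches, ...).
  private final Object storageHandle = new Object();
  private final SketchRecordContext recordContext;

  SketchReaderContext(SketchRecordContext recordContext) {
    this.recordContext = recordContext;
  }

  SketchRecordContext getRecordContext() {
    return recordContext;
  }
}

// Helper that serializes and deserializes an object with plain Java serialization.
final class SerDe {
  private SerDe() {
  }

  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T obj) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(obj);
      }
      try (ObjectInputStream in =
               new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Round trip failed", e);
    }
  }
}
```

   In this sketch only the record context is shipped (for example, to a task 
that merges buffered records); the reader context itself never needs to be 
serialized.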



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
