Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

via GitHub Fri, 13 Oct 2023 08:46:17 -0700


amousavigourabi commented on code in PR #1141:
URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358432821



##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/api/ReadSupport.java:
##########
@@ -101,6 +120,24 @@ abstract public RecordMaterializer<T> prepareForRead(
           MessageType fileSchema,
           ReadContext readContext);
 
+  /**
+   * called in {@link org.apache.hadoop.mapreduce.RecordReader#initialize(org.apache.hadoop.mapreduce.InputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext)} in the back end
+   * the returned RecordMaterializer will materialize the records and add them to the destination
+   *
+   * @param configuration    the configuration
+   * @param keyValueMetaData the app specific metadata from the file
+   * @param fileSchema       the schema of the file
+   * @param readContext      returned by the init method
+   * @return the recordMaterializer that will materialize the records
+   */
+  public RecordMaterializer<T> prepareForRead(
+      ParquetConfiguration configuration,
+      Map<String, String> keyValueMetaData,
+      MessageType fileSchema,
+      ReadContext readContext) {
+    throw new UnsupportedOperationException("Override prepareForRead(ParquetConfiguration, Map<String, String>, MessageType, ReadContext)");

Review Comment:
   I followed the example set by `ReadSupport#init(Configuration, Map, 
MessageType)`. Since this error can only occur when you implement your own 
`ReadSupport` class, I am not sure the exception needs much more information. 
I'll add a reference to the `ReadSupport` class, though.
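
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a base-class overload that throws `UnsupportedOperationException` until a subclass overrides it. It does not use the real Parquet types: `SketchConfiguration`, `SketchReadSupport`, `LegacySupport`, `UpdatedSupport`, and `describe` are hypothetical stand-ins invented for illustration, and the method signature is shortened. The exception message names the class as well as the method, which is roughly the kind of reference mentioned above, though the exact wording in the PR may differ.

```java
import java.util.Collections;
import java.util.Map;

public class ReadSupportSketch {

  // Hypothetical stand-in for ParquetConfiguration; not the real interface.
  public interface SketchConfiguration {
    String get(String key);
  }

  // Hypothetical stand-in for ReadSupport<T>, reduced to one overload.
  public abstract static class SketchReadSupport<T> {
    // Non-abstract on purpose: existing subclasses keep compiling, and the
    // failure only surfaces if this overload is actually called.
    public T prepareForRead(SketchConfiguration configuration,
                            Map<String, String> keyValueMetaData) {
      throw new UnsupportedOperationException(
          "Override SketchReadSupport.prepareForRead(SketchConfiguration, Map<String, String>)");
    }
  }

  // A subclass written before the overload existed: it does not override it.
  public static class LegacySupport extends SketchReadSupport<String> {
  }

  // A subclass that opts in to the new overload.
  public static class UpdatedSupport extends SketchReadSupport<String> {
    @Override
    public String prepareForRead(SketchConfiguration configuration,
                                 Map<String, String> keyValueMetaData) {
      return keyValueMetaData.getOrDefault("writer", "unknown");
    }
  }

  // Calls the overload and reports either the result or the exception message.
  public static String describe(SketchReadSupport<String> support) {
    try {
      return support.prepareForRead(key -> null,
          Collections.singletonMap("writer", "sketch"));
    } catch (UnsupportedOperationException e) {
      return "unsupported: " + e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(describe(new LegacySupport()));
    System.out.println(describe(new UpdatedSupport()));
  }
}
```

Only code that calls the new overload on a subclass that has not overridden it ever sees the exception, which is why a short message that points at the class and method is arguably enough.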



