alexeykudinkin commented on code in PR #5427:
URL: https://github.com/apache/hudi/pull/5427#discussion_r858121011


##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/HoodieSparkUtils.scala:
##########
@@ -324,7 +326,14 @@ object HoodieSparkUtils extends SparkAdapterSupport {
       val name2Fields = tableAvroSchema.getFields.asScala.map(f => f.name() -> f).toMap
       // Here have to create a new Schema.Field object
       // to prevent throwing exceptions like "org.apache.avro.AvroRuntimeException: Field already used".
-      val requiredFields = requiredColumns.map(c => name2Fields(c))
+      val requiredFields = requiredColumns.filter(c => {

Review Comment:
   We should not relax this here, because `requiredColumns` will also contain query columns.
   
   Instead, we should provide `HoodieMergeOnReadRDD` with two Parquet readers:
   
   1. One primed for merging (i.e., with a schema containing the record key and precombine key)
   2. One primed for NO merging (i.e., whose schema could be essentially empty)
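
   The two-reader split could be sketched roughly as follows. This is a minimal, self-contained illustration only: the field names `_row_key` and `ts`, the object name `TwoReaderSketch`, and the column-list representation are all hypothetical stand-ins, not the actual Hudi implementation, which would build Avro/Parquet schemas rather than plain column lists.

   ```scala
   // Hypothetical sketch: two column-projection strategies, one per reader.
   object TwoReaderSketch {
     // Hypothetical names for the mandatory merge columns.
     val recordKeyField: String = "_row_key"
     val preCombineField: String = "ts"

     // Reader primed for merging: the projected schema must retain the
     // record-key and precombine-key columns on top of the query columns.
     def mergingReaderColumns(requiredColumns: Seq[String]): Seq[String] =
       (requiredColumns ++ Seq(recordKeyField, preCombineField)).distinct

     // Reader primed for NO merging: only the query columns are needed,
     // so the projected schema could be essentially empty.
     def nonMergingReaderColumns(requiredColumns: Seq[String]): Seq[String] =
       requiredColumns
   }
   ```

   With this split, the merging reader never loses the meta columns to projection pruning, while the non-merging reader avoids reading columns the query did not ask for.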


