cloud-fan commented on a change in pull request #31993:
URL: https://github.com/apache/spark/pull/31993#discussion_r615556485



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala
##########
@@ -40,7 +40,7 @@ import org.apache.spark.sql.types.{StructField, StructType}
 case class HadoopFsRelation(
     location: FileIndex,
     partitionSchema: StructType,
-    dataSchema: StructType,
+    dataSchema: StructType, // The top-level columns should not be pruned. Please see SPARK-34897.

Review comment:
       Can we add more details? For example:
   ```
   // The top-level columns in `dataSchema` should match the actual physical file schema, otherwise
   // the ORC data source may not work with the by-ordinal mode.
   ```
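
To illustrate why the suggested wording matters, here is a minimal sketch of by-ordinal resolution. It is a simplified model, not Spark's actual ORC reader: `ByOrdinalSketch`, `resolveByOrdinal`, and the column names are hypothetical, and it assumes the physical file carries placeholder names (such as `_col0`), so columns can only be matched by their position in `dataSchema`.

```scala
// Hypothetical, simplified model of by-ordinal column resolution.
// This is NOT Spark's implementation; all names here are illustrative.
object ByOrdinalSketch {
  // Map each required column to a physical position by taking its
  // index in the relation's data schema (by-ordinal mode).
  def resolveByOrdinal(dataSchema: Seq[String], required: Seq[String]): Seq[Int] =
    required.map(name => dataSchema.indexOf(name))

  def main(args: Array[String]): Unit = {
    // Assumed physical layout: placeholder names, so names cannot be matched.
    val physicalFile = Seq("_col0", "_col1", "_col2")

    // Data schema that matches the physical file, top-level column by column.
    val fullSchema = Seq("id", "name", "price")
    // Data schema after the top-level column "name" has been pruned.
    val prunedSchema = Seq("id", "price")

    val correct = resolveByOrdinal(fullSchema, Seq("price"))
    val shifted = resolveByOrdinal(prunedSchema, Seq("price"))

    println(correct.map(physicalFile)) // List(_col2): the intended column
    println(shifted.map(physicalFile)) // List(_col1): silently reads "name" data
  }
}
```

With the pruned schema, the ordinal of `price` shifts from 2 to 1, so the read lands on the wrong physical column without raising an error. That is the failure mode the comment on `dataSchema` should warn about.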



