sunchao commented on a change in pull request #32354:
URL: https://github.com/apache/spark/pull/32354#discussion_r625977975
##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/InMemoryTable.scala
##########
@@ -504,9 +509,39 @@ private class BufferedRowsReader(
    index < partition.rows.length
  }
-  override def get(): InternalRow = addMetadata(partition.rows(index))
+  override def get(): InternalRow = {
+    val originalRow = partition.rows(index)
+    val values = new Array[Any](nonMetadataColumns.length)
+    nonMetadataColumns.zipWithIndex.foreach { case (col, idx) =>
+      values(idx) = extractFieldValue(col, tableSchema, originalRow)
+    }
+    addMetadata(new GenericInternalRow(values))
+  }
  override def close(): Unit = {}
+
+  private def extractFieldValue(
+      field: StructField,
+      schema: StructType,
+      row: InternalRow): Any = {
+    val index = schema.fieldIndex(field.name)
Review comment:
Good question. Looking at `PushDownUtils.pruneColumns`, I see that we
apply `SQLConf.resolver` when nested column pruning is enabled, but it seems we
don't when it is disabled. IMO we should have a better contract between
Spark and data source implementors w.r.t.
`SupportsPushDownRequiredColumns.pruneColumns`: Spark should guarantee that
the `requiredSchema` passed to the method is a "subset" of the
relation's schema (e.g., the table schema).
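
To illustrate the kind of defensive code a data source ends up writing without such a guarantee, here is a rough sketch (the helper and its names are hypothetical, not part of this PR): if the `requiredSchema` field names are not resolved against the table schema, a plain `schema.fieldIndex(field.name)` can throw, so the implementor has to fall back to something like a case-insensitive lookup.

```scala
import org.apache.spark.sql.types.{StructField, StructType}

// Hypothetical helper, for illustration only: a lenient field lookup a data
// source might need if Spark does not guarantee that `requiredSchema` is a
// resolved subset of the table schema.
object LenientFieldLookup {
  def fieldIndex(schema: StructType, field: StructField): Int = {
    // Fast path: exact (case-sensitive) match, same as StructType.fieldIndex.
    schema.getFieldIndex(field.name).getOrElse {
      // Fallback: case-insensitive match, mimicking a lenient resolver.
      val idx = schema.fields.indexWhere(_.name.equalsIgnoreCase(field.name))
      if (idx < 0) {
        throw new IllegalArgumentException(
          s"Field ${field.name} does not exist in ${schema.simpleString}")
      }
      idx
    }
  }
}
```

If Spark guaranteed the "subset" property, the fallback wouldn't be needed and `fieldIndex` alone would suffice.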