zml1206 commented on code in PR #8584:
URL: https://github.com/apache/incubator-gluten/pull/8584#discussion_r1924628747
##########
gluten-substrait/src/main/scala/org/apache/gluten/execution/GlutenWholeStageColumnarRDD.scala:
##########
@@ -62,17 +59,6 @@ class GlutenWholeStageColumnarRDD(
private val numaBindingInfo = GlutenConfig.get.numaBindingInfo
override def compute(split: Partition, context: TaskContext):
Iterator[ColumnarBatch] = {
-
- // To support input_file_name(). According to semantic we should return
- // the exact file name a row belongs to. However in columnar engine it's
- // not easy to accomplish this. so we return a list of file(part) names
- split match {
- case FirstZippedPartitionsPartition(_, g: GlutenPartition, _) =>
- InputFileBlockHolderProxy.set(g.files.mkString(","))
- case _ =>
- InputFileBlockHolderProxy.unset()
- }
-
Review Comment:
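The previous implementation of the input file expressions had problems: as the
removed comment says, it returned a comma-separated list of the partition's
files rather than the exact file a row belongs to. #7124 improved on this by
pushing the input file expressions down to scanTransform or to the project
before the scan. The results now come from the native scan or from the Spark
thread local, so there is no need to keep setting this information in
InputFileBlockHolder here.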
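
To illustrate the difference, here is a minimal sketch, not the Gluten or Spark
code. `FileNameHolder`, `Row`, and `InputFileSketch` are hypothetical names made
up for this example. It contrasts setting one value per partition in a thread
local with having the scan attach the file name to each row:

```scala
// Hypothetical stand-in for a thread-local input-file holder.
// This is NOT Spark's InputFileBlockHolder or Gluten's proxy class.
object FileNameHolder {
  private val current = new ThreadLocal[String] {
    override protected def initialValue(): String = ""
  }
  def set(name: String): Unit = current.set(name)
  def unset(): Unit = current.remove()
  def get: String = current.get()
}

// Hypothetical row type: a scan that knows each row's file can attach it.
final case class Row(value: Int, inputFile: String)

object InputFileSketch {
  // Partition-level approach: one joined string is set before computing,
  // so every row sees the same list of files.
  def partitionLevel(files: Seq[String], values: Seq[Int]): Seq[String] = {
    FileNameHolder.set(files.mkString(","))
    try values.map(_ => FileNameHolder.get)
    finally FileNameHolder.unset()
  }

  // Scan-level approach: the file name is produced per row by the scan,
  // so no partition-wide state is needed.
  def scanLevel(perFile: Seq[(String, Seq[Int])]): Seq[Row] =
    perFile.flatMap { case (file, values) => values.map(Row(_, file)) }
}
```

With the first approach every row reports the full file list; with the second
each row reports only its own file, which is the semantics input_file_name()
is expected to have.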
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]