[PR] [VL] Follow-up to #8454 to add a `ensureVeloxBatch` API for limited use cases [incubator-gluten]

via GitHub Wed, 08 Jan 2025 00:14:21 -0800


zhztheplayer opened a new pull request, #8463:
URL: https://github.com/apache/incubator-gluten/pull/8463

This is a follow-up change for #8454.

Usually a Gluten query plan knows exactly the [batch type of data it
processes](https://github.com/apache/incubator-gluten/blob/55ef64b02b9daf70038b20d4671eb5704059c25e/gluten-core/src/main/scala/org/apache/gluten/execution/GlutenPlan.scala#L64-L72)
with the help of Gluten's transition planner. Table cache write is an
exception: vanilla Spark's cache generation code simply calls the API
`CachedBatchSerializer#convertColumnarBatchToCachedBatch` on a child plan with
`supportsColumnar=true`. Because we don't know which batch type the child plan
outputs, the implementation of
`CachedBatchSerializer#convertColumnarBatchToCachedBatch` has to do the
to-Velox batch conversion dynamically.

The patch adds an `ensureVeloxBatch` API for dynamic to-Velox batch
conversion. The API should only be used in table cache write or similar
scenarios where explicit transitions cannot be added.
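
As a rough illustration of the idea (not the code in this patch), the dynamic conversion can be pictured as a check-then-convert helper. Everything in the sketch below is a hypothetical stand-in: `BatchKind`, `Batch`, and the body of `ensureVeloxBatch` are simplified toy types, not Gluten's real classes or signatures.

```java
import java.util.Objects;

public class EnsureVeloxBatchSketch {
    // Hypothetical batch kinds; stand-ins for the batch types a child plan may emit.
    enum BatchKind { VANILLA, ARROW, VELOX }

    // Hypothetical, simplified batch: a kind tag plus a row count.
    static final class Batch {
        final BatchKind kind;
        final int numRows;

        Batch(BatchKind kind, int numRows) {
            this.kind = Objects.requireNonNull(kind);
            this.numRows = numRows;
        }
    }

    // Assumed behavior: return the input untouched when it is already a Velox
    // batch, otherwise convert it. The "conversion" here only re-tags the batch;
    // a real implementation would copy or offload the underlying data.
    static Batch ensureVeloxBatch(Batch input) {
        if (input.kind == BatchKind.VELOX) {
            return input; // already the expected type, nothing to do
        }
        return new Batch(BatchKind.VELOX, input.numRows);
    }

    public static void main(String[] args) {
        // The caller (e.g. a cache serializer) does not know the incoming type.
        Batch unknown = new Batch(BatchKind.VANILLA, 3);
        Batch velox = ensureVeloxBatch(unknown);
        System.out.println(velox.kind + " " + velox.numRows);

        // Calling it again on a Velox batch returns the same instance.
        System.out.println(ensureVeloxBatch(velox) == velox);
    }
}
```

The point of the sketch is the contract: the caller pays for a runtime type check on every batch, which is why such a helper is reserved for places where the planner cannot insert an explicit transition.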

The patch also adds a test case for the original issue, #8453.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
