Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20856
@HyukjinKwon good analysis!
Currently Spark is a little messy about what gets serialized and sent
to executors. Sometimes we send an entire query tree but only read a few
of its properties.
It seems to me it would be better to always do codegen on the driver side, to
avoid complex expression/plan operations on the executor side (not sure if
it's possible, cc @viirya @rednaxelafx @kiszk).
For this particular problem, I think we can just change these `val`s to
`lazy val` or `def` in `FileSourceScanExec`, with a unit test.
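
A rough sketch of the difference, using made-up classes rather than the real `FileSourceScanExec` (`Session`, `EagerScan`, `LazyScan` and `supportsBatch` here are hypothetical stand-ins, and the `null` session only imitates a node that was rebuilt somewhere the session is not available):

```scala
// Hypothetical stand-in for driver-only state; not a Spark class.
final class Session(val batchEnabled: Boolean)

// Eager: `supportsBatch` is computed inside the constructor, so building
// the node without a session fails right away.
class EagerScan(session: Session) {
  val supportsBatch: Boolean = session.batchEnabled
}

// Lazy: `supportsBatch` is computed on first access only, so a node that
// never reads it can be built without a session.
class LazyScan(session: Session) {
  lazy val supportsBatch: Boolean = session.batchEnabled
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Assumption: a null session models the "session not available" case.
    val eagerFailed = scala.util.Try(new EagerScan(null)).isFailure
    val lazyBuilt = scala.util.Try(new LazyScan(null)).isSuccess
    println(s"eager construction failed: $eagerFailed")
    println(s"lazy construction succeeded: $lazyBuilt")
  }
}
```

A `def` would behave like the `lazy val` at construction time, but would recompute the value on every call instead of caching it.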