Github user robert3005 commented on the issue:
https://github.com/apache/spark/pull/16575
This was posted mostly to get comments on what the expected behaviour is.
What's unclear is whether a Dataset can be shared across SparkSessions and, if
so, what the semantics and behaviour of that are. It seems off to me that the
physical plan of a HadoopFsRelation will use the SparkSession from the moment
it was defined. In cases where the SparkSession is bound to physical plans we
can use Dataset.ofRows to swap the SparkSession from the logical plan, but for
HadoopFsRelation that won't work. I'd think FileSourceScanExec would at least
use the session from the moment of execution (the one from SparkPlan). However,
if we can share Datasets across sessions, then it makes sense not to tie the
session to a concrete physical plan but to fetch it every time the plan gets
executed.
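
To make the two options concrete, here is a rough toy model of the difference, not Spark code. Every name in it (`Session`, `ActiveSession`, `DefinitionBoundScan`, `ExecutionBoundScan`) is a hypothetical stand-in invented for illustration; it only contrasts capturing the session when the plan node is defined with looking it up when the node is executed.

```python
# Toy model only: these classes are hypothetical stand-ins, not Spark APIs.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Session:
    name: str


class ActiveSession:
    """Stand-in for a 'currently active session' lookup."""

    _current: Optional[Session] = None

    @classmethod
    def set(cls, session: Session) -> None:
        cls._current = session

    @classmethod
    def get(cls) -> Session:
        if cls._current is None:
            raise RuntimeError("no active session")
        return cls._current


class DefinitionBoundScan:
    """Captures the session once, when the plan node is created."""

    def __init__(self) -> None:
        self._session = ActiveSession.get()

    def execute(self) -> str:
        return self._session.name


class ExecutionBoundScan:
    """Looks the session up each time the plan node is executed."""

    def execute(self) -> str:
        return ActiveSession.get().name


first = Session("first")
second = Session("second")

# Both plan nodes are defined while the first session is active.
ActiveSession.set(first)
bound_early = DefinitionBoundScan()
bound_late = ExecutionBoundScan()

# They are executed after a different session has become active.
ActiveSession.set(second)
early_result = bound_early.execute()  # session captured at definition time
late_result = bound_late.execute()    # session looked up at execution time
```

In this sketch the definition-bound node keeps using the first session even when executed under the second, which is the behaviour I find surprising; the execution-bound node follows whichever session is active when it runs.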