szehon-ho commented on PR #53572:
URL: https://github.com/apache/spark/pull/53572#issuecomment-3808531588
Chatted with @cloud-fan and others offline. It's not worth the complexity, so
I simplified the code.
The behavior changes slightly: previously, running df.cache() on the result of
some commands, like df = sql("SHOW TABLES") or df = sql("SHOW NAMESPACES"),
'snapshotted' the result again; now it is a no-op. The old behavior was
incorrect, though, as df.cache() should not trigger a second run for commands
as per the contract, and the user can simply run df = sql("") again if they
want the content as of the point where they used to run df.cache().
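To make the new contract concrete, here is a toy sketch in plain Python. It is not Spark code: `ToyCatalog`, `ToyCommandResult`, and `toy_sql_show_tables` are hypothetical names made up for illustration, and the model only assumes what is stated above (the command runs once when sql() is called, and cache() does not run it again).

```python
# Toy model only: none of these names are Spark APIs.
class ToyCatalog:
    def __init__(self):
        self.tables = []
        self.show_tables_runs = 0  # how many times the command executed

    def show_tables(self):
        self.show_tables_runs += 1
        return list(self.tables)


class ToyCommandResult:
    """Result of a command that was executed eagerly, exactly once."""

    def __init__(self, rows):
        self._rows = rows

    def cache(self):
        # No-op: the rows are already materialized, so nothing is re-run.
        return self

    def collect(self):
        return list(self._rows)


def toy_sql_show_tables(catalog):
    # The command runs here, at sql() time, not at cache() time.
    return ToyCommandResult(catalog.show_tables())


catalog = ToyCatalog()
catalog.tables.append("t1")
df = toy_sql_show_tables(catalog)   # first (and only) run for this df

catalog.tables.append("t2")         # catalog changes after the command ran
df.cache()                          # does not re-run the command
stale = df.collect()                # still reflects the state at sql() time

# To see the current state, run the command again instead of calling cache().
fresh = toy_sql_show_tables(catalog).collect()
```

In this model `stale` only contains the table that existed when the command first ran, while `fresh` picks up the later change because the command was issued again.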
@cloud-fan can you take a look? Thanks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]