GitHub user ivanko2000 commented on the issue:
https://github.com/apache/spark/pull/16609
I am a Spark newbie, so I may be missing some obvious features. Still, I think
a metadata field like "name" (not necessarily unique at the global/session
level) on a DataFrame/Dataset object would be useful beyond the purely visual
UI/'storage tab' use case (which apparently can be achieved by other means,
such as a temp view).
It could serve as an identifier/selector in a collection of dataframes,
either for internal 'business' logic or for integration with 'outside'
components that could expose those dataframes under a human-readable name if
the developer/owner considered it useful to assign one. The latter does seem
similar to the "temp view" use case, but here the name/label is strictly a
metadata element of the dataframe itself, so there is no need to track it
separately.
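As a rough illustration of the selector idea only: the sketch below is plain
Python rather than Spark code, so that it runs without a Spark session.
`NamedFrame` and `index_by_name` are hypothetical names made up for this
example, and `data` is a placeholder for a real DataFrame; none of this is an
existing Spark API.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass
class NamedFrame:
    """Hypothetical stand-in for a DataFrame that carries an optional name."""
    data: Any                   # placeholder for the real DataFrame
    name: Optional[str] = None  # label only; not required to be unique


def index_by_name(frames: Iterable[NamedFrame]) -> Dict[str, NamedFrame]:
    """Build a name -> frame lookup, skipping frames that have no name.

    Names need not be unique, so a later frame with the same name
    replaces an earlier one in the lookup.
    """
    return {f.name: f for f in frames if f.name is not None}


frames = [
    NamedFrame(data=[("alice", 3)], name="orders"),
    NamedFrame(data=[("bob", 7)], name="customers"),
    NamedFrame(data=[("tmp", 0)]),  # unnamed: not selectable by name
]

registry = index_by_name(frames)
selected = registry["orders"]  # pick a frame by its human-readable name
```

Because the name travels with the object, the lookup can be rebuilt from the
frames alone, without a separate name-to-frame table kept elsewhere.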
Actually, I see that the Dataset's **as**(_alias_) can be used for just
that purpose. Although the alias appears to be a "set-only" property (meant for
use as an alias in SQL queries) and is not intended to be read back, there is a
fairly straightforward way of retrieving it:
http://stackoverflow.com/questions/41249806/how-can-i-retrieve-the-alias-for-a-dataframe-in-spark
However, since that "accessor" appears to be version-sensitive, it would be
preferable to have a stable public accessor for this name/alias property.