Github user emlyn closed the pull request at:
https://github.com/apache/spark/pull/16609
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/16609
I'm happy for this to be closed.
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/16609
Nice, all I really want is a way to identify the RDDs in the storage tab, I
didn't realise it was possible with a temp view (but I think the second line
should be `df.sparkSession.catalog.cacheTable
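The temp-view workaround referenced above can be sketched as follows. This is a minimal, hedged example assuming Spark 2.x APIs; the app name, view name, and data are illustrative, not taken from the PR:

```scala
import org.apache.spark.sql.SparkSession

// Minimal self-contained sketch (Spark 2.x): caching through a named
// temporary view labels the entry on the UI's Storage tab, instead of it
// appearing as an anonymous RDD.
object CacheNameDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cache-name-demo")
      .getOrCreate()

    val df = spark.range(100).toDF("id")

    // Register a named view, then cache it via the catalog so the cache
    // entry carries the view name.
    df.createOrReplaceTempView("my_data")
    spark.catalog.cacheTable("my_data")

    df.count() // materialise the cache so the Storage tab shows the entry
    spark.stop()
  }
}
```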
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/16609
Any more feedback on whether this is a reasonable way to go about this (or
if not, what would be a better way)? I think some way of identifying RDDs on
the storage tab would greatly improve its
Github user emlyn commented on a diff in the pull request:
https://github.com/apache/spark/pull/16609#discussion_r97210569
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -322,6 +333,9 @@ class Dataset[T] private[sql](
}
override def
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/16609
Thanks @felixcheung, that test now seems to pass, although I still get `Had
test warnings or failures; see logs.`, and further up is:
```
Error in loadVignetteBuilder(pkgdir, TRUE
```
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/16609
I think I've renamed to storageName everywhere now. I had to add it to
`generics.R` too, otherwise the tests complained. I tried to guess what to put
there from surrounding entries, I hope I did
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/16609
I've modified the name in R to `storageName`, does anyone have any
opposition to that (or better ideas)?
Does it need any more tests? I'm not sure if it's possible to test that the
name
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/16609
As far as I'm concerned it's only to help identify entries on the UI
storage tab (although others may have other ideas), so `storageName` sounds
fine to me.
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/16609
@felixcheung oh, in that case should the existing ones in RDD.R be made
public (and possibly renamed)? They are public in the Python/Java/Scala APIs.
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/16609
@holdenk @davies I've adjusted the Python return pattern as mentioned in
the original PR (also in rdd.py), is this OK?
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/16609
@felixcheung @shivaram I agree it probably makes sense to rename it, what
about `rddName`? Any other ideas?
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/16609
@felixcheung I think I've addressed your comments from the original PR
(although the similarity to `names` remains), but I'd appreciate you giving it
another look over as I'm not very familiar with R
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/15343
I've managed to get the name appearing on the storage UI tab by passing it
through to the cache manager, PR in #16609.
I'd appreciate any comments (e.g. whether that's a reasonable way to do
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/16609
@cloud-fan yes, this name appears in the storage tab of the UI.
GitHub user emlyn opened a pull request:
https://github.com/apache/spark/pull/16609
[SPARK-8480] [CORE] [PYSPARK] [SPARKR] Add setName for Dataframe
## What changes were proposed in this pull request?
Add name / setName to Dataset/DataFrame in all languages, to match RDD
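As a hedged illustration of what the proposed surface might have looked like: the Dataset methods below are hypothetical (the PR was closed without merging, so they do not exist in Spark), while `RDD.setName`, noted at the end, is the existing API the proposal mirrors:

```scala
// Hypothetical sketch only: a name/setName pair on a Dataset-like class,
// mirroring RDD's existing methods. Not part of any released Spark API.
class NamedDataset[T](/* ... underlying Dataset[T] state ... */) {
  private var _name: String = _

  /** The assigned name, intended to label this Dataset on the Storage tab. */
  def name: String = _name

  /** Assign a name, mirroring RDD.setName; returns this for chaining. */
  def setName(newName: String): this.type = {
    _name = newName
    this
  }
}

// Existing counterpart that does work today:
//   df.rdd.setName("customer_features").cache()
```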
Github user emlyn commented on the issue:
https://github.com/apache/spark/pull/15343
@srowen I would find this really useful, it would be great if it could be
made to work.
What happens if you pass the name through to `cacheManager.cacheQuery`
[here](https://github.com/apache
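A rough sketch of that pass-through idea against Spark 2.x internals. Hedged heavily: `userSpecifiedName` is a hypothetical field, and the internal `CacheManager.cacheQuery` signature may differ across Spark versions:

```scala
// Hedged sketch (Spark 2.x internals): cacheQuery takes an Option[String]
// tableName that labels the cache entry on the Storage tab, so persist
// could forward a user-supplied name instead of None.
def persist(newLevel: StorageLevel): this.type = {
  sparkSession.sharedState.cacheManager.cacheQuery(
    this,
    tableName = userSpecifiedName, // hypothetical Option[String] set via setName
    storageLevel = newLevel)
  this
}
```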