I put some println statements in BlockManagerUI. I have RDDs that are cached in memory, and I see this:
******************* onStageSubmitted **********************
rddInfo: RDD "2" (2) Storage: StorageLevel(false, false, false, false, 1); CachedPartitions: 0; TotalPartitions: 1; MemorySize: 0.0 B;TachyonSize: 0.0 B; DiskSize: 0.0 B
_rddInfoMap: Map(2 -> RDD "2" (2) Storage: StorageLevel(false, false, false, false, 1); CachedPartitions: 0; TotalPartitions: 1; MemorySize: 0.0 B;TachyonSize: 0.0 B; DiskSize: 0.0 B)

******************* onTaskEnd **********************
Map(2 -> RDD "2" (2) Storage: StorageLevel(false, false, false, false, 1); CachedPartitions: 0; TotalPartitions: 1; MemorySize: 0.0 B;TachyonSize: 0.0 B; DiskSize: 0.0 B)

******************* onStageCompleted **********************
Map()

******************* onStageSubmitted **********************
rddInfo: RDD "7" (7) Storage: StorageLevel(false, false, false, false, 1); CachedPartitions: 0; TotalPartitions: 1; MemorySize: 0.0 B;TachyonSize: 0.0 B; DiskSize: 0.0 B
_rddInfoMap: Map(7 -> RDD "7" (7) Storage: StorageLevel(false, false, false, false, 1); CachedPartitions: 0; TotalPartitions: 1; MemorySize: 0.0 B;TachyonSize: 0.0 B; DiskSize: 0.0 B)

******************* onTaskEnd **********************
Map(7 -> RDD "7" (7) Storage: StorageLevel(false, false, false, false, 1); CachedPartitions: 0; TotalPartitions: 1; MemorySize: 0.0 B;TachyonSize: 0.0 B; DiskSize: 0.0 B)

******************* onStageCompleted **********************
Map()

The StorageLevels you see here are never the ones of my RDDs, and apparently updateRDDInfo never gets called (I had printlns in there too). I put two small sketches at the bottom of this mail, after the quoted thread: one of the cache-and-reuse pattern our applications follow, and one paraphrasing what I understand the fix to be.

On Tue, Apr 8, 2014 at 2:13 PM, Koert Kuipers <ko...@tresata.com> wrote:

> Yes, I am definitely using the latest.
>
>
> On Tue, Apr 8, 2014 at 1:07 PM, Xiangrui Meng <men...@gmail.com> wrote:
>
>> That commit fixed the exact problem you described. That is why I want to
>> confirm that you switched to the master branch. bin/spark-shell doesn't
>> detect code changes, so you need to run ./make-distribution.sh to
>> re-compile Spark first. -Xiangrui
>>
>>
>> On Tue, Apr 8, 2014 at 9:57 AM, Koert Kuipers <ko...@tresata.com> wrote:
>>
>>> Sorry, I meant to say: note that for a cached RDD in the spark shell it
>>> all works fine, but something is going wrong with the SPARK-APPLICATION-UI
>>> in our applications that extensively cache and re-use RDDs.
>>>
>>>
>>> On Tue, Apr 8, 2014 at 12:55 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>>
>>>> Note that for a cached RDD in the spark shell it all works fine, but
>>>> something is going wrong with the spark-shell in our applications that
>>>> extensively cache and re-use RDDs.
>>>>
>>>>
>>>> On Tue, Apr 8, 2014 at 12:33 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>>>
>>>>> I tried again with latest master, which includes the commit below, but
>>>>> the UI page still shows nothing on the storage tab.
>>>>> koert
>>>>>
>>>>>
>>>>> commit ada310a9d3d5419e101b24d9b41398f609da1ad3
>>>>> Author: Andrew Or <andrewo...@gmail.com>
>>>>> Date:   Mon Mar 31 23:01:14 2014 -0700
>>>>>
>>>>>     [Hot Fix #42] Persisted RDD disappears on storage page if re-used
>>>>>
>>>>>     If a previously persisted RDD is re-used, its information
>>>>>     disappears from the Storage page.
>>>>>
>>>>>     This is because the tasks associated with re-using the RDD do not
>>>>>     report the RDD's blocks as updated (which is correct). On stage
>>>>>     submit, however, we overwrite any existing
>>>>>
>>>>>     Author: Andrew Or <andrewo...@gmail.com>
>>>>>
>>>>>     Closes #281 from andrewor14/ui-storage-fix and squashes the
>>>>>     following commits:
>>>>>
>>>>>     408585a [Andrew Or] Fix storage UI bug
>>>>>
>>>>>
>>>>> On Mon, Apr 7, 2014 at 4:21 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>>>>
>>>>>> got it, thanks
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 7, 2014 at 4:08 PM, Xiangrui Meng <men...@gmail.com> wrote:
>>>>>>
>>>>>>> This is fixed in https://github.com/apache/spark/pull/281. Please try
>>>>>>> again with the latest master. -Xiangrui
>>>>>>>
>>>>>>> On Mon, Apr 7, 2014 at 1:06 PM, Koert Kuipers <ko...@tresata.com>
>>>>>>> wrote:
>>>>>>> > I noticed that for spark 1.0.0-SNAPSHOT, which I checked out a few
>>>>>>> > days ago (apr 5), the "application detail ui" no longer shows any
>>>>>>> > RDDs on the storage tab, despite the fact that they are definitely
>>>>>>> > cached.
>>>>>>> >
>>>>>>> > I am running spark in standalone mode.
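
For reference, here is a simplified sketch of the cache-and-reuse pattern our applications follow (the names and data are made up, and the getRDDStorageInfo call is only a driver-side sanity check, not something the real jobs do):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object CacheReuseSketch {
  def main(args: Array[String]): Unit = {
    // hypothetical app; the real jobs run on our standalone cluster
    val sc = new SparkContext(new SparkConf().setAppName("cache-reuse-sketch"))

    // cache an RDD once...
    val data = sc.parallelize(1 to 1000000)
      .map(_ * 2)
      .persist(StorageLevel.MEMORY_ONLY)

    // ...and re-use it across several jobs
    data.count()  // first job materializes and caches the blocks
    data.count()  // later jobs only re-use the cached blocks

    // driver-side check of what the block manager reports as cached
    sc.getRDDStorageInfo.foreach(println)

    sc.stop()
  }
}

This kind of re-use is exactly where the storage tab comes up empty for us.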
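
And for what it's worth, my reading of the commit message above is that the fix is supposed to stop onStageSubmitted from clobbering an RDDInfo that is already in _rddInfoMap with a fresh, empty one. A toy paraphrase of that idea (this is not the actual BlockManagerUI code, just how I understand it):

import scala.collection.mutable

// Toy model of the listener state; not the real RDDInfo class.
case class RddInfoSketch(id: Int, name: String,
                         storageLevelDesc: String,
                         memSize: Long, cachedPartitions: Int)

class StorageListenerSketch {
  private val _rddInfoMap = mutable.Map[Int, RddInfoSketch]()

  // Called with the (fresh, zero-sized) infos for the RDDs in a submitted stage.
  def onStageSubmitted(rddsInStage: Seq[RddInfoSketch]): Unit = {
    rddsInStage.foreach { fresh =>
      // buggy behavior: _rddInfoMap(fresh.id) = fresh  -- wipes out cached sizes
      // fixed behavior: only add the entry if we don't already have one
      _rddInfoMap.getOrElseUpdate(fresh.id, fresh)
    }
  }
}

The getOrElseUpdate only helps once an entry with real sizes is in the map; a brand-new entry still starts out with the default storage level and 0 bytes, which matches what my printlns show at stage submit.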