No, cache() only updates the bookkeeping of the existing RDD; it does not create a new one. It does return a reference, but that reference is the same RDD, so just calling "person.cache" works.
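For what it's worth, here is a minimal sketch of how one could check this outside the shell. The local master, the three in-memory sample rows and the CacheCheck object are my own assumptions for illustration; the code in the thread reads from HDFS through the spark-shell `sc` instead.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

case class Person(id: Int, col1: String)

// Hypothetical helper object, not part of the original code in this thread.
object CacheCheck {
  // Returns (cache() returned the same instance, storage level after cache(),
  // first count, second count).
  def check(sc: SparkContext): (Boolean, StorageLevel, Long, Long) = {
    // Assumed sample data standing in for the HDFS text file.
    val person = sc.parallelize(Seq("1, a", "2, b", "3, c"))
      .map(_.split(","))
      .map(p => Person(p(0).trim.toInt, p(1)))

    // cache() marks the RDD for persistence and returns the RDD itself.
    val returned = person.cache()
    val sameInstance = returned eq person
    val level = person.getStorageLevel

    // Nothing is stored until an action runs; the first count computes the
    // partitions and stores them, the second count can read them back.
    val first = person.count()
    val second = person.count()
    (sameInstance, level, first, second)
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("cache-check").setMaster("local[2]"))
    try println(check(sc)) finally sc.stop()
  }
}
```

If the first element of the result is true, then "person.cache" and "val cached = person.cache" refer to the same RDD, and the RDD should show up under Storage in the web UI once the first count has finished.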
I can't reproduce this. When I cache an RDD and then count it, it is persisted in memory and I see it in the web UI. Something else must be different about what's being executed.

On Wed, Apr 1, 2015 at 8:26 AM, Yuri Makhno <ymak...@gmail.com> wrote:
> cache() method returns new RDD so you have to use something like this:
>
> val person =
>   sc.textFile("hdfs://namenode_host:8020/user/person.txt").map(_.split(",")).map(p => Person(p(0).trim.toInt, p(1)))
>
> val cached = person.cache
>
> cached.count
>
> When you rerun count on cached you will see that the cache works.
>
> On Wed, Apr 1, 2015 at 9:35 AM, fightf...@163.com <fightf...@163.com> wrote:
>>
>> Hi
>> That is just the issue. After running person.cache we then run
>> person.count; however, there is still no cached data shown under
>> Storage in the web UI.
>>
>> Thanks,
>> Sun.
>>
>> ________________________________
>> fightf...@163.com
>>
>>
>> From: Taotao.Li
>> Date: 2015-04-01 14:02
>> To: fightfate
>> CC: user
>> Subject: Re: rdd.cache() not working ?
>> Rerun person.count and you will see the effect of the cache.
>>
>> person.cache does not cache the RDD immediately. The RDD is actually
>> cached after the first action (person.count here).
>>
>> ________________________________
>> From: fightf...@163.com
>> To: "user" <user@spark.apache.org>
>> Sent: Wednesday, April 1, 2015, 1:21:25 PM
>> Subject: rdd.cache() not working ?
>>
>> Hi, all
>>
>> We are running the following code snippet through spark-shell, but
>> cannot see any cached partitions under Storage in the web UI.
>>
>> Does this mean that cache is not working? If we issue person.count
>> again, we do not see any improvement in run time.
>>
>> Hope anyone can explain this a little.
>>
>> Best,
>>
>> Sun.
>>
>> case class Person(id: Int, col1: String)
>>
>> val person =
>>   sc.textFile("hdfs://namenode_host:8020/user/person.txt").map(_.split(",")).map(p => Person(p(0).trim.toInt, p(1)))
>>
>> person.cache
>>
>> person.count
>>
>> ________________________________
>> fightf...@163.com
>>
>> --
>>
>> Thanks & Best regards
>>
>> 李涛涛 Taotao · Li | Fixed Income@Datayes | Software Engineer