Hi,

According to my understanding, the contents of df.cache() are currently kept on the Java heap as a set of byte arrays (CachedBatch); see https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala#L58 . The data is accessed through the sun.misc.Unsafe APIs and may sometimes be compressed. CachedBatch is a private class, and this representation may change in the future.
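As a quick illustration (a rough sketch only, assuming Spark 2.x in local mode with a toy DataFrame; the app name and column names are made up for the example), df.cache() merely marks the plan, and the CachedBatch byte arrays are built on the heap when the first action runs; they are not reachable through any public API:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("cache-demo")        // hypothetical app name, for illustration only
    .master("local[*]")
    .getOrCreate()
  import spark.implicits._

  val df = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "value")
  df.cache()    // only marks the plan for in-memory columnar caching (lazy)
  df.count()    // first action builds the CachedBatch byte arrays on the JVM heap
  df.explain()  // physical plan now shows an InMemoryTableScan over InMemoryRelation

  spark.stop()

Even after the cache is materialized, the batches are only visible indirectly (for example in the explain() output or in the Storage tab of the web UI).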
In general, it is not easy to access this data through a C/C++ API.

Regards,
Kazuaki Ishizaki

From: Jacek Laskowski <ja...@japila.pl>
To: "jpivar...@gmail.com" <jpivar...@gmail.com>
Cc: dev <dev@spark.apache.org>
Date: 2016/05/29 08:18
Subject: Re: How to access the off-heap representation of cached data in Spark 2.0

Hi Jim,

There's no C++ API in Spark to access the off-heap data. Moreover, I also think "off-heap" has an overloaded meaning in Spark - for Tungsten and for persisting your data off-heap (it's all about memory, but for different purposes and with client- and internal APIs). That's my limited understanding of things (and I'm not even sure how trustworthy it is). Use with extreme caution.

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Sat, May 28, 2016 at 5:29 PM, jpivar...@gmail.com <jpivar...@gmail.com> wrote:
> Is this not the place to ask such questions? Where can I get a hint as to
> how to access the new off-heap cache, or C++ API, if it exists? I'm willing
> to do my own research, but I have to have a place to start. (In fact, this
> is the first step in that research.)
>
> Thanks,
> -- Jim
>
> --
> View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/How-to-access-the-off-heap-representation-of-cached-data-in-Spark-2-0-tp17701p17717.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.