Hi Vin,

From Spark 2.x, OFF_HEAP was changed to no longer directly interface with an external block store. The previous tight dependency was restrictive and reduced flexibility. It looks like the new version uses the executor's off-heap memory to allocate direct byte buffers, and does not interface with any external system for data storage. I am not aware of a way to connect the new version of OFF_HEAP to Alluxio.
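If you want to try the new OFF_HEAP behavior, a minimal configuration sketch would be something like the following (the size value is just an illustration; pick one that fits your executors):

    # spark-defaults.conf (or --conf flags on spark-submit)
    # Spark 2.x OFF_HEAP allocates direct byte buffers from the
    # executor's own off-heap memory, so both settings below are
    # needed before StorageLevel.OFF_HEAP takes effect:
    spark.memory.offHeap.enabled   true
    spark.memory.offHeap.size      2g

With that in place, persisting with StorageLevel.OFF_HEAP works as usual, just without any external block store behind it.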
You can experience similar benefits of the old OFF_HEAP <-> Tachyon mode, as well as additional benefits like unified namespace <http://www.alluxio.org/docs/master/en/Unified-and-Transparent-Namespace.html> or sharing in-memory data across applications, by using the Alluxio filesystem API <http://www.alluxio.org/docs/master/en/File-System-API.html>.

I hope this helps!

Thanks,
Gene

On Wed, Jan 4, 2017 at 10:50 AM, Vin J <winjos...@gmail.com> wrote:

> Until Spark 1.6, I see there were specific properties to configure, such as
> the external block store master URL (spark.externalBlockStore.url), to use
> the OFF_HEAP storage level, which made it clear that an external Tachyon
> type of block store was required/used for OFF_HEAP storage.
>
> Can someone clarify how this has been changed in Spark 2.x - because I do
> not see config settings anymore that point Spark to an external block store
> like Tachyon (now Alluxio) (or am I missing seeing it?)
>
> I understand there are ways to use Alluxio with Spark, but how about
> OFF_HEAP storage - can Spark 2.x OFF_HEAP RDD persistence still exploit
> Alluxio/an external block store? Any pointers to design decisions/Spark JIRAs
> related to this will also help.
>
> Thanks,
> Vin.
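P.S. For reference, the filesystem-API approach I mentioned above is just Spark reading and writing alluxio:// paths through the Hadoop-compatible client. A rough sketch (the master hostname and paths are placeholders; 19998 is Alluxio's default master port):

    // Assumes the Alluxio client jar is on the Spark classpath.
    // Data written this way lives in Alluxio and can be shared
    // across separate Spark applications.
    val data = sc.textFile("alluxio://alluxio-master:19998/input/data.txt")
    data.saveAsTextFile("alluxio://alluxio-master:19998/output/result")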