You are ignoring serde costs :-)

- Mridul

On Tue, Jul 8, 2014 at 8:48 PM, Aaron Davidson <[email protected]> wrote:
> Tachyon should only be marginally less performant than memory_only, because
> we mmap the data from Tachyon's ramdisk. We do not have to, say, transfer
> the data over a pipe from Tachyon; we can directly read from the buffers in
> the same way that Shark reads from its in-memory columnar format.
>
> On Tue, Jul 8, 2014 at 1:18 AM, qingyang li <[email protected]> wrote:
>
>> Hi, when I create a table, I can choose the cache strategy using
>> shark.cache. I think "shark.cache=memory_only" means the data is managed
>> by Spark and lives in the same JVM as the executor, while
>> "shark.cache=tachyon" means the data is managed by Tachyon, which is off
>> heap, so the data is not in the same JVM as the executor and Spark has to
>> load it from Tachyon for each SQL query. So, is Tachyon less efficient
>> than the memory_only cache strategy?
>> If yes, can we have Spark load all the data from Tachyon once for all SQL
>> queries? I want to use the Tachyon cache strategy because Tachyon is more
>> highly available than memory_only.
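
To illustrate the serde point raised at the top of the thread, here is a minimal Python sketch. It is not Shark or Tachyon code: a temporary file stands in for a block on Tachyon's ramdisk, pickle stands in for whatever serializer is actually used, and the function names are hypothetical. Mapping the file avoids pushing bytes through a pipe, but every read still has to deserialize those bytes back into objects, which is the cost the reply refers to.

```python
import mmap
import os
import pickle
import tempfile


def write_serialized(path, rows):
    """Serialize rows to a file (assumption: stands in for an off-heap block)."""
    with open(path, "wb") as f:
        f.write(pickle.dumps(rows))


def read_via_mmap(path):
    """Map the file read-only, then deserialize its contents.

    The mapping itself is cheap, but pickle.loads must rebuild every
    object on each call; that step is the serde cost.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as buf:
            # Slicing copies the mapped bytes, so the map can be closed safely.
            return pickle.loads(buf[:])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        block = os.path.join(tmp, "block.bin")
        write_serialized(block, [(i, "row%d" % i) for i in range(1000)])
        # Each query-like read repeats the deserialization work.
        for _ in range(3):
            rows = read_via_mmap(block)
        print(len(rows))
```

Timing read_via_mmap against a plain lookup of an already-built list would show the difference on a given machine; no numbers are claimed here.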
