The off-heap storage level is currently tied to Tachyon, but it might support 
other forms of off-heap storage later. However, it's not really designed to be 
mixed with the other storage levels. For this use case you may want to rely on 
memory locality and have some custom code to push the data to the accelerator. 
If you can think of a general way to extend the storage level concept to 
handle this, do send a proposal.
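
As a rough, untested sketch of what that custom route could look like (Scala; 
DeviceCache, OffHeapVsCustomCopy, and the input path are made-up names for 
illustration, not existing Spark APIs):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    // Hypothetical per-node cache; real code would call the card's SDK here.
    // As a Scala object it is instantiated once per executor JVM, which is
    // roughly the granularity you want for a device attached to the worker.
    object DeviceCache {
      private val resident = scala.collection.mutable.Set[Int]()
      def ensureOnDevice(partId: Int, rows: Array[String]): Unit = synchronized {
        if (!resident.contains(partId)) {
          // ... copy rows into the accelerator's memory ...
          resident += partId
        }
      }
    }

    object OffHeapVsCustomCopy {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("device-copy-sketch"))
        val rdd = sc.textFile("hdfs:///some/input")

        // Option 1: the new level keeps serialized blocks off-heap in Tachyon:
        //   rdd.persist(StorageLevel.OFF_HEAP)

        // Option 2: keep the source of truth inside Spark and copy each
        // partition to the device within the task, only if it is not already
        // resident; locality scheduling makes repeat hits on the same node likely.
        rdd.persist(StorageLevel.MEMORY_AND_DISK)
        val counts = rdd.mapPartitionsWithIndex { (partId, iter) =>
          val rows = iter.toArray
          DeviceCache.ensureOnDevice(partId, rows)
          Iterator(rows.length)
        }
        println(counts.reduce(_ + _))
        sc.stop()
      }
    }

The point is that the block's source of truth stays in Spark's block manager, 
and locality makes it likely that repeated tasks on the same partition land on 
the executor where the device copy already lives.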

Matei

On Apr 5, 2014, at 5:14 PM, Mridul Muralidharan <mri...@gmail.com> wrote:

> No, I am thinking along the lines of writing to an accelerator card or a
> dedicated card with its own memory.
> 
> Regards,
> Mridul
> On Apr 6, 2014 5:19 AM, "Haoyuan Li" <haoyuan...@gmail.com> wrote:
> 
>> Hi Mridul,
>> 
>> Do you mean the scenario where different Spark applications need to read
>> the same raw data, which is stored in a remote cluster or on remote
>> machines, and the goal is to load that remote raw data only once?
>> 
>> Haoyuan
>> 
>> 
>> On Sat, Apr 5, 2014 at 4:30 PM, Mridul Muralidharan <mri...@gmail.com
>>> wrote:
>> 
>>> Hi,
>>> 
>>>  We have a requirement to use (potentially) ephemeral storage that is
>>> not within the VM but is strongly tied to a worker node. The source of
>>> truth for a block would still be within Spark, but to actually do the
>>> computation we would need to copy the data to an external device, where
>>> it might lie around for a while. So data locality really helps: if the
>>> data is already present on the device, we can avoid a subsequent copy
>>> for later computations on the same block.
>>> 
>>> I was wondering if the recently added storage level for Tachyon would
>>> help in this case (note, Tachyon itself won't help; just the storage
>>> level might).
>>> What sort of guarantees does it provide? How extensible is it? Or is it
>>> strongly tied to Tachyon with only a generic name?
>>> 
>>> 
>>> Thanks,
>>> Mridul
>>> 
>> 
>> 
>> 
>> --
>> Haoyuan Li
>> Algorithms, Machines, People Lab, EECS, UC Berkeley
>> http://www.cs.berkeley.edu/~haoyuan/
>> 
