[ 
https://issues.apache.org/jira/browse/CRUNCH-505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388554#comment-14388554
 ] 

Micah Whitacre commented on CRUNCH-505:
---------------------------------------

I haven't looked much into Tachyon, but does it provide an implementation of 
Hadoop FileSystem, or does it mostly focus on the Java File API?  From my 
cursory glance I'm not seeing one.  If it did provide one, that'd be pretty 
easy to support.  The challenge here would be supporting Tachyon without 
requiring it for all consumers if it has a different API.

[1] - http://tachyon-project.org/Running-Hadoop-MapReduce-on-Tachyon.html

> Store intermediate data in memory only using Tachyon
> ----------------------------------------------------
>
>                 Key: CRUNCH-505
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-505
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.12.0
>            Reporter: Ioannis Kerkinos
>            Assignee: Josh Wills
>
> Tachyon is a memory-centric distributed storage system that enables reliable 
> data sharing at memory-speed. If used as the storage for intermediate data 
> (between MR jobs) it should improve performance as you won't have to go to 
> HDFS. In order to do so, the MUST_CACHE write type of Tachyon can be used. 
> This will enable data to be persisted in memory only without going to HDFS. 
> So the intermediate data will be read/written at memory-speed and only the 
> final result will be written in HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
