[jira] [Commented] (CRUNCH-470) Add hdfs/yarn minicluster crunch pipeline

Gabriel Reid (JIRA) Thu, 11 Sep 2014 04:17:25 -0700

    [ 
https://issues.apache.org/jira/browse/CRUNCH-470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129890#comment-14129890
 ]


Gabriel Reid commented on CRUNCH-470:
-------------------------------------

Do you mean adding a new Pipeline implementation (alongside MemPipeline, 
MRPipeline, and SparkPipeline)? The MRPipeline implementation already runs on 
YARN as long as Crunch is compiled for hadoop2, so a new Pipeline 
implementation shouldn't be needed for this.

On the other hand, if you're referring to testing pipelines on a 
pseudo-distributed mini cluster, that is already possible. This is what the 
HFileTargetIT integration test does: a mini-cluster (with HDFS, etc.) is spun 
up and the pipeline is run there.

> Add hdfs/yarn minicluster crunch pipeline
> -----------------------------------------
>
>                 Key: CRUNCH-470
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-470
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.3
>            Reporter: Rafal Wojdyla
>            Assignee: Josh Wills
>            Priority: Minor
>
> Crunch currently has two pipelines:
> * MemPipeline
> * MRPipeline
> MemPipeline is an in-memory pipeline based on the local in-memory MapReduce mode.
> MRPipeline is a distributed pipeline based on distributed MapReduce.
> Using an HDFS/YARN minicluster, it's possible to better emulate a Hadoop 
> cluster, and it could serve as a 'final test' before running on the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
