Oh, you're absolutely right. After checking my Maven dependency tree I
see that the mapreduce jars are brought in through a transitive
dependency from Crunch.
Maybe I got this all wrong, but I thought this was only an API
dependency? At runtime YARN will do all the scheduling and execution,
right? I'm prett
So I don't think using hadoop-yarn-client is right; that doesn't include
all of the hadoop-common stuff for accessing the filesystem or the
mapreduce stuff, so I'm honestly surprised the pipeline runs at all (I
suppose that technically it doesn't?). hadoop-yarn-client is what you would
use if you we
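One way to see which of these pieces actually end up on the client classpath is to probe for one representative class per artifact. The sketch below is a generic diagnostic and not something posted in this thread: the class name `ClasspathCheck` and its helper methods are hypothetical, and the class-to-artifact mapping in the comments assumes Hadoop 2.x packaging.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClasspathCheck {
    // One representative class per artifact (assumed mapping, Hadoop 2.x layout).
    static final String[] CLASSES = {
        "org.apache.hadoop.fs.FileSystem",              // hadoop-common
        "org.apache.hadoop.mapreduce.Job",              // hadoop-mapreduce-client-core
        "org.apache.hadoop.yarn.client.api.YarnClient", // hadoop-yarn-client
        "org.apache.crunch.Pipeline"                    // crunch-core
    };

    // Returns true if the class can be located without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Checks each name in order and records whether it was found.
    static Map<String, Boolean> check(String... classNames) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : classNames) {
            result.put(name, isPresent(name));
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Boolean> e : check(CLASSES).entrySet()) {
            System.out.println((e.getValue() ? "found   " : "MISSING ") + e.getKey());
        }
    }
}
```

Run it with the same classpath the job client uses. Any line marked MISSING points at an artifact that the declared dependencies do not pull in on the client side. It says nothing about what is available on the cluster nodes.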
OK, so I got it working now after doing apt install crunch on the name
node. Not really sure why it fixed the problem, though?
And I'm submitting the job using the YARN client with the following dependencies.
org.apache.crunch:crunch-core:0.9.0-cdh5.0.0
org.apache
Yes, a pseudo-distributed CDH5, but I realize now that I haven't
installed the apt packages for Crunch. I'm using the DistCache to
upload crunch-core-0.9.0-cdh5.0.0.jar instead. Does it matter?
One thing I noticed is that you're running
hadoop-client-2.3.0-cdh5.0.0 whereas I'm using
hadoop-yarn-cli
Hey Kristoffer,
Couldn't reproduce that in my crunch-demo project against my test cluster:
https://github.com/jwills/crunch-demo/tree/cdh5
So I hate asking dumb questions, but are you running against a CDH5 cluster?
J
On Wed, Jun 11, 2014 at 9:11 AM, Josh Wills wrote:
> That's very odd; let
That's very odd; let me see if I can reproduce it.
J
On Wed, Jun 11, 2014 at 7:23 AM, Kristoffer Sjögren
wrote:
> Hi
>
> I'm trying out Crunch on YARN on CDH5 (0.9.0-cdh5.0.0) and get some
> errors when trying to materialize results (see below). The job itself
> is super simple.
>
> PCollection