I'm sorry, but I don't understand your question. Is the output of the mapper
you're describing the key portion? If it is the key, then your data should
already be sorted by HouseHoldId since it occurs first in your key.
The SortComparator will tell Hadoop how to sort your data. So you use this
if y
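To illustrate the point about the key already being sorted by HouseHoldId: Hadoop sorts map output keys lexicographically by default, so a composite key that leads with HouseHoldId clusters naturally. A minimal Python sketch of that ordering (field values are made up):

```python
# Composite keys of the form "<householdId>\t<secondaryField>", as a
# mapper might emit them. A plain lexicographic sort (Hadoop's default
# for Text keys) already groups records by the leading HouseHoldId.
keys = [
    "hh2\t2013-09-20",
    "hh1\t2013-09-21",
    "hh1\t2013-09-19",
    "hh2\t2013-09-18",
]

sorted_keys = sorted(keys)  # default string comparison, like Hadoop's Text sort

# All hh1 records come before all hh2 records, and within each
# household the secondary field is also in order.
print(sorted_keys)
```

A custom SortComparator is only needed when the desired order differs from this default (e.g. numeric rather than lexicographic comparison).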
I need to improve my MR jobs, which use HBase as both source and sink.
Basically, I'm reading data from 3 HBase tables in the mapper, writing them
out as one huge string for the reducer to do some computation on and dump into
an HBase table.
Table1 ~ 19 million rows. Table2 ~ 2 million rows. Table3
I tried something, see if this helps. It's incomplete though.
https://github.com/brahul/singular
Thanks,
Rahul
On Fri, Sep 20, 2013 at 11:54 PM, Pradeep Gollakota wrote:
> Hi All,
>
> I've been trying to write a Yarn application and I'm completely lost. I'm
> using Hadoop 2.0.0-cdh4.4.0 (Cloude
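The "one huge string" pattern described above can be sketched independently of HBase: tag each value with its source table in the mapper, and have the reducer split the concatenated record back apart. Purely illustrative; the tags and delimiters are made up:

```python
def make_reducer_value(table1_row, table2_row, table3_row):
    """Mapper side: concatenate rows from the three source tables into
    one delimited string, tagging each part with its origin so the
    reducer can split it back apart."""
    parts = [
        "T1:" + table1_row,
        "T2:" + table2_row,
        "T3:" + table3_row,
    ]
    return "|".join(parts)

def split_reducer_value(value):
    """Reducer side: recover the per-table fields from the big string."""
    fields = {}
    for part in value.split("|"):
        tag, _, row = part.partition(":")
        fields[tag] = row
    return fields

combined = make_reducer_value("alice,42", "order-9", "2013-09-20")
print(split_reducer_value(combined))
```

In practice, tagged keys with a secondary sort (or serialized Writables) usually beat one giant string, since they avoid re-parsing in the reducer.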
Hi Jamal,
Streaming also supports sharing files with workers
See here
http://hadoop.apache.org/docs/stable/streaming.html#Working+with+Large+Files+and+Archives
Thanks
On Sat, Sep 21, 2013 at 6:50 AM, jamal sasha wrote:
> Hi,
> So in native Hadoop streaming, how do I send a helper file?
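As a concrete sketch of the file-sharing mechanism linked above: ship a helper module next to your streaming mapper with `-files`, and import it normally. File and function names here are made up:

```python
# mapper.py for Hadoop streaming. Ship the helper alongside it with:
#   hadoop jar hadoop-streaming.jar \
#       -files mapper.py,helper.py \
#       -mapper mapper.py -input ... -output ...
# The -files option places helper.py in each task's working directory,
# so a plain import works. (helper.py and normalize() are illustrative.)

try:
    import helper  # shipped via -files; lands in the task's cwd
except ImportError:
    helper = None  # running locally without the helper shipped

def map_line(line):
    # Use the shared helper when present; fall back to a trivial
    # transform so the sketch runs anywhere.
    clean = helper.normalize(line) if helper else line.strip().lower()
    return clean + "\t1"

# In a real job this would loop over sys.stdin; sample lines keep the
# sketch self-contained.
for line in ["  Hello  ", "WORLD"]:
    print(map_line(line))
```

So the code does not have to live in a single file; `-files` (and `-archives` for jars/zips) distributes the extra pieces to every task.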
Hi,
A few more questions.
"(which has 40 container slots)" >> Is that for the total cluster? Please give the details below
for the cluster:
1) yarn-site.xml -> what is the resource memory configured for per node?
2) yarn-site.xml -> what is the minimum resource allocation for the cluster?
3) yarn-resource-manager-log
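For reference, the two yarn-site.xml properties being asked about in (1) and (2) look like this; the values are examples, not recommendations:

```xml
<configuration>
  <!-- 1) Memory the NodeManager offers to containers on each node. -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
  </property>
  <!-- 2) Smallest container the scheduler will allocate; container
       requests are rounded up to a multiple of this value. -->
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
</configuration>
```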
Hi,
So in native Hadoop streaming, how do I send a helper file?
Like in core Hadoop, you can write your code in multiple files and then jar
it up...
But if I am using Hadoop streaming, does all my code have to be in a single file?
Is that so?
Oops.. wrong email thread :D Please ignore the previous email
On Fri, Sep 20, 2013 at 1:49 PM, jamal sasha wrote:
> Hi,
> So in native Hadoop streaming, is there no way to send a helper file?
> Like in core Hadoop, you can write your code in multiple files and then
> jar it up...
> But if
Hi,
So in native Hadoop streaming, is there no way to send a helper file?
Like in core Hadoop, you can write your code in multiple files and then jar
it up...
But if I am using Hadoop streaming, does all my code have to be in a single file?
Is that so?
On Wed, Sep 18, 2013 at 3:10 PM, Chris Embree
Hi All,
I've been trying to write a Yarn application and I'm completely lost. I'm
using Hadoop 2.0.0-cdh4.4.0 (Cloudera distribution). I've uploaded my
sample code to github at https://github.com/pradeepg26/sample-yarn
The problem is that my application master is exiting with a status of 1
(I'm e
Summary:
I want to harvest a subset of custom data from an Avro container structure,
and store it off as Avro data files. I'm having some difficulty in
determining the cleanest place to implement my logic.
Details:
I have a flume flow that is shipping a somewhat complex set of xml
structures to hd
Following up on my own post: I tried upgrading to 1.2.1, and the problem
seemed to go away. Now the 2NN is working fine. I'm now seeing:
2013-09-20 11:13:31,318 INFO
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Posted URL
hpctest3.realm.com:50070putimage=1&port=50090&machine=hpctest3.
It does seem to be some kind of authentication issue as I can see "Login
failure for null from keytab" in the NN log, but I don't understand why I
get it. The NN and 2NN are the same box. Below are the logs from each.
On the 2NN I see:
2013-09-20 10:01:59,338 INFO
org.apache.hadoop.security.Use
Hello Omkar,
Thanks for your reply.
Yes, all 4 points are correct.
However, my application is requesting, let's say, 100 containers on my cluster,
which has 40 container slots.
So I expected to see all container slots used, but that is not the case.
Just in case it matters, it is the only applicat
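The gap between containers requested and containers running usually comes down to capacity arithmetic like the following; a rough sketch with made-up cluster numbers:

```python
# Rough YARN capacity math (all numbers are illustrative).
nodes = 10
node_memory_mb = 8192        # yarn.nodemanager.resource.memory-mb per node
container_memory_mb = 2048   # what the application requests per container

slots_per_node = node_memory_mb // container_memory_mb
cluster_slots = nodes * slots_per_node      # total "container slots"

requested = 100
# The rest of the requests wait in the queue until slots free up.
# Whether the slots actually all fill also depends on scheduler queue
# limits and any other applications sharing the cluster.
concurrent = min(requested, cluster_slots)
print(slots_per_node, cluster_slots, concurrent)
```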
Right now it's MR-specific (TaskUmbilicalProtocol) - YARN doesn't have
any reusable items here yet, but there are easy-to-use RPC libs such
as Avro and Thrift out there that make it easy to do such things once
you define what you want in a schema/spec form.
On Fri, Sep 20, 2013 at 5:32 PM, John Lil
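A minimal sketch of the "define it in a schema" idea above, using plain JSON instead of Avro or Thrift so it runs anywhere; the message fields are invented:

```python
import json

# A tiny, hand-rolled "schema" for an AM<->task status message. With
# Avro or Thrift you would declare this in a .avsc/.thrift file and get
# the (de)serialization code generated; the shape of the exchange is
# the same.
STATUS_FIELDS = {"task_id": str, "progress": float, "state": str}

def encode_status(msg):
    """Validate against the schema, then serialize for the wire."""
    for field, ftype in STATUS_FIELDS.items():
        if not isinstance(msg.get(field), ftype):
            raise ValueError("bad or missing field: %s" % field)
    return json.dumps(msg).encode("utf-8")

def decode_status(payload):
    return json.loads(payload.decode("utf-8"))

wire = encode_status({"task_id": "attempt_001", "progress": 0.75,
                      "state": "RUNNING"})
print(decode_status(wire))
```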
I am using CDH4 and I am trying to access the GPU from the cleanup() method of
my mapper class using JOCL.
(Note: my normal code (without MapReduce) works fine on the GPU.)
When I execute my MapReduce code, it throws an error (specified below).
**Error***
Thanks Harsh. Is this protocol something that is available to all AMs/tasks?
Or is it up to each AM/task pair to develop their own protocol?
john
-Original Message-
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Thursday, September 19, 2013 9:20 PM
To:
Subject: Re: Task status query
Hi again,
I've run into this link:
http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201112.mbox/%3ccafe9998.2fef6%25ev...@yahoo-inc.com%3E
Looks like a nice idea. Has anyone tried something similar?
Thanks
On Wed, Sep 18, 2013 at 4:46 PM, Shahab Yunus wrote:
> Yes, you are corre