I'm sorry, but I don't understand your question. Is the output of the mapper
you're describing the key portion? If it is the key, then your data should
already be sorted by HouseHoldId since it occurs first in your key.
The SortComparator will tell Hadoop how to sort your data. So you use this
if y
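To illustrate the point about the key already being sorted by HouseHoldId: Hadoop sorts map output keys lexicographically by default, so a composite key that leads with HouseHoldId clusters naturally. A minimal Python sketch of that ordering (field values are made up):

```python
# Composite keys of the form "<householdId>\t<secondaryField>", as a
# mapper might emit them. A plain lexicographic sort (Hadoop's default
# for Text keys) already groups records by the leading HouseHoldId.
keys = [
    "hh2\t2013-09-20",
    "hh1\t2013-09-21",
    "hh1\t2013-09-19",
    "hh2\t2013-09-18",
]

sorted_keys = sorted(keys)  # default string comparison, like Hadoop's Text sort

# All hh1 records come before all hh2 records, and within each
# household the secondary field is also in order.
print(sorted_keys)
```

A custom SortComparator is only needed when the desired order differs from this default (e.g. numeric rather than lexicographic comparison).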
I need to improve my MR jobs, which use HBase as both source and sink.
Basically, I'm reading data from 3 HBase tables in the mapper, writing them
out as one huge string for the reducer to do some computation on and dump into
an HBase table.
Table1 ~ 19 million rows. Table2 ~ 2 million rows. Table3
I tried something, see if this helps. It's incomplete though.
https://github.com/brahul/singular
Thanks,
Rahul
On Fri, Sep 20, 2013 at 11:54 PM, Pradeep Gollakota wrote:
> Hi All,
>
> I've been trying to write a Yarn application and I'm completely lost. I'm
> using Hadoop 2.0.0-cdh4.4.0 (Cloude
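The "one huge string" pattern described above can be sketched independently of HBase: tag each value with its source table in the mapper, and have the reducer split the concatenated record back apart. Purely illustrative; the tags and delimiters are made up:

```python
def make_reducer_value(table1_row, table2_row, table3_row):
    """Mapper side: concatenate rows from the three source tables into
    one delimited string, tagging each part with its origin so the
    reducer can split it back apart."""
    parts = [
        "T1:" + table1_row,
        "T2:" + table2_row,
        "T3:" + table3_row,
    ]
    return "|".join(parts)

def split_reducer_value(value):
    """Reducer side: recover the per-table fields from the big string."""
    fields = {}
    for part in value.split("|"):
        tag, _, row = part.partition(":")
        fields[tag] = row
    return fields

combined = make_reducer_value("alice,42", "order-9", "2013-09-20")
print(split_reducer_value(combined))
```

In practice, tagged keys with a secondary sort (or serialized Writables) usually beat one giant string, since they avoid re-parsing in the reducer.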
Hi Jamal,
Streaming also supports sharing files with workers
See here
http://hadoop.apache.org/docs/stable/streaming.html#Working+with+Large+Files+and+Archives
Thanks
On Sat, Sep 21, 2013 at 6:50 AM, jamal sasha wrote:
> Hi,
> So in native Hadoop streaming, how do I send a helper file?
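As a concrete sketch of the file-sharing mechanism linked above: ship a helper module next to your streaming mapper with `-files`, and import it normally. File and function names here are made up:

```python
# mapper.py for Hadoop streaming. Ship the helper alongside it with:
#   hadoop jar hadoop-streaming.jar \
#       -files mapper.py,helper.py \
#       -mapper mapper.py -input ... -output ...
# The -files option places helper.py in each task's working directory,
# so a plain import works. (helper.py and normalize() are illustrative.)

try:
    import helper  # shipped via -files; lands in the task's cwd
except ImportError:
    helper = None  # running locally without the helper shipped

def map_line(line):
    # Use the shared helper when present; fall back to a trivial
    # transform so the sketch runs anywhere.
    clean = helper.normalize(line) if helper else line.strip().lower()
    return clean + "\t1"

# In a real job this would loop over sys.stdin; sample lines keep the
# sketch self-contained.
for line in ["  Hello  ", "WORLD"]:
    print(map_line(line))
```

So the code does not have to live in a single file; `-files` (and `-archives` for jars/zips) distributes the extra pieces to every task.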
Hi,
A few more questions.
"(which has 40 container slots)" >> Is that for the total cluster? Please give the details below
for the cluster:
1) yarn-site.xml -> what is the resource memory configured for per node?
2) yarn-site.xml -> what is the minimum resource allocation for the cluster?
3) yarn-resource-manager-log
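For reference, the two yarn-site.xml properties being asked about in (1) and (2) look like this; the values are examples, not recommendations:

```xml
<configuration>
  <!-- 1) Memory the NodeManager offers to containers on each node. -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
  </property>
  <!-- 2) Smallest container the scheduler will allocate; container
       requests are rounded up to a multiple of this value. -->
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
</configuration>
```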
Hi,
So in native Hadoop streaming, how do I send a helper file?
Like in core Hadoop, you can write your code in multiple files and then jar
it up...
But if I am using Hadoop streaming, does all my code have to be in a single file?
Is that so?
Oops.. wrong email thread :D Please ignore the previous email
On Fri, Sep 20, 2013 at 1:49 PM, jamal sasha wrote:
> Hi,
> So in native Hadoop streaming, is there no way to send a helper file?
> Like in core Hadoop, you can write your code in multiple files and then
> jar it up...
> But if
Hi,
So in native Hadoop streaming, is there no way to send a helper file?
Like in core Hadoop, you can write your code in multiple files and then jar
it up...
But if I am using Hadoop streaming, does all my code have to be in a single file?
Is that so?
On Wed, Sep 18, 2013 at 3:10 PM, Chris Embree
Hi All,
I've been trying to write a Yarn application and I'm completely lost. I'm
using Hadoop 2.0.0-cdh4.4.0 (Cloudera distribution). I've uploaded my
sample code to github at https://github.com/pradeepg26/sample-yarn
The problem is that my application master is exiting with a status of 1
(I'm e
Summary:
I want to harvest a subset of custom data from an Avro container structure,
and store it off as Avro data files. I'm having some difficulty in
determining the cleanest place to implement my logic.
Details:
I have a flume flow that is shipping a somewhat complex set of xml
structures to hd
Following up on my own post: I tried upgrading to 1.2.1, and the problem
seemed to go away. Now the 2NN is working fine. I'm now seeing:
2013-09-20 11:13:31,318 INFO
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Posted URL
hpctest3.realm.com:50070putimage=1&port=50090&machine=hpctest3.
It does seem to be some kind of authentication issue as I can see "Login
failure for null from keytab" in the NN log, but I don't understand why I
get it. The NN and 2NN are the same box. Below are the logs from each.
On the 2NN I see:
2013-09-20 10:01:59,338 INFO
org.apache.hadoop.security.Use
Hello Omkar,
Thanks for your reply.
Yes, all 4 points are correct.
However, my application is requesting, let's say, 100 containers on my cluster,
which has 40 container slots.
So I expected to see all container slots used, but that is not the case.
Just in case it matters, it is the only applicat
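The gap between containers requested and containers running usually comes down to capacity arithmetic like the following; a rough sketch with made-up cluster numbers:

```python
# Rough YARN capacity math (all numbers are illustrative).
nodes = 10
node_memory_mb = 8192        # yarn.nodemanager.resource.memory-mb per node
container_memory_mb = 2048   # what the application requests per container

slots_per_node = node_memory_mb // container_memory_mb
cluster_slots = nodes * slots_per_node      # total "container slots"

requested = 100
# The rest of the requests wait in the queue until slots free up.
# Whether the slots actually all fill also depends on scheduler queue
# limits and any other applications sharing the cluster.
concurrent = min(requested, cluster_slots)
print(slots_per_node, cluster_slots, concurrent)
```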
Right now it's MR-specific (TaskUmbilicalProtocol) - YARN doesn't have
any reusable items here yet, but there are easy-to-use RPC libs such
as Avro and Thrift out there that make it easy to do such things once
you define what you want in a schema/spec form.
On Fri, Sep 20, 2013 at 5:32 PM, John Lil
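A minimal sketch of the "define it in a schema" idea above, using plain JSON instead of Avro or Thrift so it runs anywhere; the message fields are invented:

```python
import json

# A tiny, hand-rolled "schema" for an AM<->task status message. With
# Avro or Thrift you would declare this in a .avsc/.thrift file and get
# the (de)serialization code generated; the shape of the exchange is
# the same.
STATUS_FIELDS = {"task_id": str, "progress": float, "state": str}

def encode_status(msg):
    """Validate against the schema, then serialize for the wire."""
    for field, ftype in STATUS_FIELDS.items():
        if not isinstance(msg.get(field), ftype):
            raise ValueError("bad or missing field: %s" % field)
    return json.dumps(msg).encode("utf-8")

def decode_status(payload):
    return json.loads(payload.decode("utf-8"))

wire = encode_status({"task_id": "attempt_001", "progress": 0.75,
                      "state": "RUNNING"})
print(decode_status(wire))
```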
I am using CDH4 and I am trying to access the GPU from the cleanup() method of
my mapper class using JOCL.
(Note: my normal code (without MapReduce) works fine on the GPU.)
When I execute my MapReduce code, it throws an error (specified below).
**Error***
Thanks Harsh. Is this protocol something that is available to all AMs/tasks?
Or is it up to each AM/task pair to develop their own protocol?
john
-Original Message-
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Thursday, September 19, 2013 9:20 PM
To:
Subject: Re: Task status query
Hi again,
I've run into this link:
http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201112.mbox/%3ccafe9998.2fef6%25ev...@yahoo-inc.com%3E
Looks like a nice idea. Has anyone tried something similar?
Thanks
On Wed, Sep 18, 2013 at 4:46 PM, Shahab Yunus wrote:
> Yes, you are corre