Re: How to sort key,value pair by value(In ascending)

2011-09-13 Thread ksgupta misc
Hi guys, thank you for your valuable suggestions. I see this works fine in cases where key values are unique. In my use case the values are as follows: *,,* 012742,3244,1 0028604164,2344,3 0062059017,2344,5 0075546701,2344,1 0130213268,2344,8 0140105425,5675,3 0141304286,5677,6 0195052668,3453,8

Re: How to sort key,value pair by value(In ascending)

2011-09-13 Thread Sudharsan Sampath
One way is to reverse the output in the mapper to emit <1, 10050> and, in the reducer, use a TreeSet to order your values for each value output in the reducer. With this, the output will be sorted as per your needs within each reducer. If you need a totally sorted output, you can use a single reducer or design your part
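
The idea above (swap each pair so the value becomes the key, then order the original keys inside each group) can be sketched outside Hadoop in plain Python. This is only an illustration under stated assumptions, not MapReduce code: the function name sort_pairs_by_value is hypothetical, a sorted list stands in for the TreeSet, and the sample pairs are the ones quoted in the original question of this thread.

```python
from collections import defaultdict


def sort_pairs_by_value(pairs):
    """Return (key, value) pairs ordered by value, then by key."""
    # "Map" step: swap each pair so the value becomes the grouping key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[value].append(key)

    # "Reduce" step: visit the values in ascending order and sort the
    # original keys inside each group (the role the TreeSet plays above).
    result = []
    for value in sorted(grouped):
        for key in sorted(grouped[value]):
            result.append((key, value))
    return result


if __name__ == "__main__":
    sample = [(10050, 1), (10070, 5), (10080, 7), (10102, 7), (10103, 4)]
    for key, value in sort_pairs_by_value(sample):
        print(key, value)
```

Because each group is a list rather than a set, keys that share the same value (10080 and 10102 in the sample) are all kept.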

RE: Issues starting TaskTracker

2011-09-13 Thread Shreya.Pal
Hi, I downloaded the Cloudera VM (https://ccp.cloudera.com/display/SUPPORT/Cloudera's+Hadoop+Demo+VM#Cloudera%27sHadoopDemoVM-DemoVMWareImage) for VMware and VMware Player. The VM is 64-bit but my OS is 32-bit. What can be the solution? Regards, Shreya From: Bejoy KS [mailto:bejoy.h

The problem with Hadoop and Iterative applications and merge join.

2011-09-13 Thread Kevin Burton
I was going to post this to my blog but I'm running into technical difficulties at the moment (don't ask), so I figured I'd just post it here and see if anyone has any feedback. I recently wrote an implementation of an algorithm in Pig which exposed some bugs / design flaws in the Hadoop core wh

How to sort key,value pair by value(In ascending)

2011-09-13 Thread ksgupta misc
Hi, I have content like *10103*,1042279,*4* *10070*,1001089,*5* *10102*,1015504,*7* *10080*,1024369,*7* *10050*,1025671,*1* ... from which I separated the key,value pairs and got the output after a single map and reduce as follows: 10050 1 10070 5 10080 7 10102 7 10103 4 ... I require t

Re: Hadoop Streaming job Fails - Permission Denied error

2011-09-13 Thread Jeremy Lewi
Bejoy, to redirect stdout, add the lines "import sys" and "sys.stdout=sys.stderr" to the top of your .py files (i.e. right after the shebang line). J On Tue, Sep 13, 2011 at 1:42 AM, Bejoy KS wrote: > Hi Harsh > Thank You for the response. I'm on Cloudera demo VM. It is on > hadoop 0.20 and has
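
A minimal sketch of what those two lines do inside a script. The helper names redirect_stdout_to_stderr and emit are hypothetical and not part of Hadoop Streaming. Since Streaming reads a script's output records from stdout, the sketch assumes you want to keep a handle to the original stream so records can still be written to it after the redirect.

```python
#!/usr/bin/env python
import sys


def redirect_stdout_to_stderr():
    """Point sys.stdout at stderr and return the original stdout."""
    original = sys.stdout
    sys.stdout = sys.stderr  # plain print() calls now land on stderr
    return original


def emit(stream, key, value):
    """Write one tab-separated record to the given stream."""
    stream.write("%s\t%s\n" % (key, value))


if __name__ == "__main__":
    records = redirect_stdout_to_stderr()
    print("debug: script started")   # goes to stderr after the redirect
    emit(records, "example-key", 1)  # still goes to the original stdout
```

With this layout, stray debugging prints end up in the task's stderr log rather than being mixed into the records the job consumes.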

RE: Failing to contact Am/History for jobs

2011-09-13 Thread Eric Payne
I've seen it too. When I get this, I restart the NM, RM, and HS, and it stops happening. I don't have a cause yet. -Eric From: Jeffrey Naisbitt [mailto:jnais...@yahoo-inc.com] Sent: Monday, September 12, 2011 12:23 PM To: mapreduce-...@hadoop.apache.org Subject: Failing to contact Am/History fo

setup and cleanup methods in MapReduce API

2011-09-13 Thread Sahana Bhat
Hi, in the new MapReduce APIs in Hadoop version 0.20.2, the methods setup( ) and cleanup( ) are run ONCE at the beginning and at the end of a task. How is this functionality provided in the older MapReduce APIs in version 0.20.2 of Hadoop? Can functionality similar to setup( ) be im

Re: Issues starting TaskTracker

2011-09-13 Thread Bejoy KS
Shreya, to add on: from the Cloudera website you can get images for different VMs like VMware, VirtualBox etc. Choose the appropriate one for your use as per your available software. To your question, it is definitely possible to run MapReduce programs from the Cloudera VM and in fact it is

Re: Issues starting TaskTracker

2011-09-13 Thread Bejoy KS
Hi Shreya, you can copy files from Windows to the Linux VM using any FTP tool like FileZilla. Open a terminal on your Linux VM and type ifconfig; the value given under 'inet addr:' is your IP address. Use this IP address and the default port (22) to connect to the Linux image from Windows thro

RE: Issues starting TaskTracker

2011-09-13 Thread Shreya.Pal
Hi Harsh, Version of Hadoop - hadoop-0.20.203.0. How do I make the process owner the same as the directory owner? The directory owner is Titun. Regards Shreya -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: Monday, September 12, 2011 10:50 PM To: mapreduce-user@hadoop.apache.org

RE: Issues starting TaskTracker

2011-09-13 Thread Shreya.Pal
Hi Harsh, is it possible to run my MapReduce programs in the Cloudera VM (the VM is run using VMware Player)? How can I copy my jar files and input data there? Regards, Shreya -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: Monday, September 12, 2011 10:50 PM To: mapreduce-us

Re: Hadoop Streaming job Fails - Permission Denied error

2011-09-13 Thread Bejoy KS
Hi Harsh, thank you for the response. I'm on the Cloudera demo VM. It is on Hadoop 0.20 and has Python installed. Do I have to do any further installation/configuration to get Python running? On Tue, Sep 13, 2011 at 1:36 PM, Harsh J wrote: > The env binary would be present, but do all your T

Re: Hadoop Streaming job Fails - Permission Denied error

2011-09-13 Thread Harsh J
The env binary would be present, but do all your TT nodes have Python properly installed on them? The env program can't find it, and that's probably why your scripts with a shebang don't run. On Tue, Sep 13, 2011 at 1:12 PM, Bejoy KS wrote: > Thanks Jeremy. But I didn't follow 'redirect "stdout" to

Re: Hadoop Streaming job Fails - Permission Denied error

2011-09-13 Thread Bejoy KS
Thanks Jeremy. But I didn't follow 'redirect "stdout" to "stderr" at the entry point to your mapper and reducer'. I'm basically a Java Hadoop developer and have no idea about Python programming. Could you please help me with more details, like the line of code I need to include to achieve this? Also I

Hadoop security configuration how to?

2011-09-13 Thread 周俊清
Hello everyone, I would like to configure my Hadoop cluster with the security module for authentication and authorization. The branch-0.20-security-203 version (http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203/) has Kerberos integrated into the source code. But