Re: Shuffle/sort

2012-06-05 Thread Harsh J
Hey Sean, check out http://www.slideshare.net/jhammerb/hadoop-map-reduce-arch-106883, a slightly dated, MR1-oriented presentation from Owen O'Malley that goes into enough depth to give a good overview of how things work (including how reduces pull data). After that, check out Chris Douglas' htt
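As a rough illustration of how each reduce task ends up with its own sorted, grouped input, here is a minimal in-memory sketch. It is not how Hadoop implements the shuffle: the real framework sorts and spills on the map side and merges fetched segments on the reduce side, whereas this sketch simply filters every map output by partition and lets a TreeMap do the sorting. The class name ShuffleSketch and the helpers kv and reducerInput are hypothetical; only the partition formula follows the usual hash-partitioning rule (non-negative hash code modulo the number of reducers).

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ShuffleSketch {
    /** Picks a reduce task for a key; the same key always lands on the same reducer. */
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    /** Hypothetical helper that builds one (key, value) map-output record. */
    static Map.Entry<String, Integer> kv(String key, int value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    /**
     * Collects, from every map task's output, the records that belong to one
     * reducer and groups them by key in sorted key order.
     */
    static TreeMap<String, List<Integer>> reducerInput(
            List<List<Map.Entry<String, Integer>>> mapOutputs, int reducer, int numReducers) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (List<Map.Entry<String, Integer>> oneMapOutput : mapOutputs) {
            for (Map.Entry<String, Integer> record : oneMapOutput) {
                if (partitionFor(record.getKey(), numReducers) == reducer) {
                    grouped.computeIfAbsent(record.getKey(), k -> new ArrayList<>())
                           .add(record.getValue());
                }
            }
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Two map tasks, each with its own output records.
        List<List<Map.Entry<String, Integer>>> mapOutputs = Arrays.asList(
                Arrays.asList(kv("a", 1), kv("b", 1)),
                Arrays.asList(kv("a", 1)));
        int numReducers = 2;
        for (int r = 0; r < numReducers; r++) {
            System.out.println("reducer " + r + " <- " + reducerInput(mapOutputs, r, numReducers));
        }
    }
}
```

The point of the sketch is that no global sort across all map outputs is needed: each reducer only has to see, in key order, the records whose keys hash to its own partition.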

Re: Cannot start name node after turning on hadoop security

2012-06-05 Thread Allan Yan
Figured it out: the Kerberos key was not created properly. Thanks, Allan On Mon, Jun 4, 2012 at 1:05 PM, Allan Yan wrote: > Sorry, the links should be: > > > http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201202.mbox/%3CCAMAD20=oKVRy_pDX6FWm=xvpz1pal0qcfqagssaxq8xugp7...@mail.gmail.com%3

mapreduce.job.max.split.locations just a warning in hadoop 1.0.3 but not in 2.0.1-alpha?

2012-06-05 Thread Jim Donofrio
final int max_loc = conf.getInt(MAX_SPLIT_LOCATIONS, 10);
if (locations.length > max_loc) {
    LOG.warn("Max block location exceeded for split: " + split
        + " splitsize: " + locations.length + " maxsize: " + max_loc);
    locations = Arrays.c
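The preview above is cut off before the assignment completes, so the following is only a sketch of the two behaviors the subject line contrasts. It assumes that the lenient path logs a warning and keeps only the first max_loc locations, and that a stricter path rejects the split outright. The class SplitLocationLimit, both method names, the use of Arrays.copyOf, and the choice of IllegalStateException are all hypothetical and are not taken from either Hadoop release.

```java
import java.util.Arrays;

public class SplitLocationLimit {
    // Mirrors the default of 10 passed to conf.getInt in the quoted snippet.
    static final int DEFAULT_MAX_LOCATIONS = 10;

    /** Lenient variant (assumed): warn, then keep only the first maxLoc locations. */
    static String[] trimWithWarning(String[] locations, int maxLoc) {
        if (locations.length > maxLoc) {
            System.err.println("Max block location exceeded: splitsize: "
                    + locations.length + " maxsize: " + maxLoc);
            return Arrays.copyOf(locations, maxLoc);
        }
        return locations;
    }

    /** Strict variant (assumed): refuse the split instead of trimming it. */
    static String[] rejectIfTooMany(String[] locations, int maxLoc) {
        if (locations.length > maxLoc) {
            throw new IllegalStateException("Max block location exceeded: splitsize: "
                    + locations.length + " maxsize: " + maxLoc);
        }
        return locations;
    }

    public static void main(String[] args) {
        String[] hosts = new String[12];
        for (int i = 0; i < hosts.length; i++) {
            hosts[i] = "host" + i;
        }
        // The lenient variant returns a shortened copy and leaves the input untouched.
        System.out.println(trimWithWarning(hosts, DEFAULT_MAX_LOCATIONS).length);
    }
}
```

With the lenient variant a job that exceeds the limit still runs, only with fewer locality hints; with the strict variant the same job fails at submission time, which would explain why the difference between releases is noticeable.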

Shuffle/sort

2012-06-05 Thread Barry, Sean F
"I was always wondering, after mapping, how each reduce task gets its input. It is said in Google's paper and Hadoop's documentation that a sort is done to aggregate the same key of the map output. But there is no detailed explanation of how it is implemented and my intuition is that perhaps a globa

hadoop file permission 1.0.3 (security)

2012-06-05 Thread Tony Dean
Can someone detail the options that are available to set file permissions at the Hadoop and OS level? Here's what I have discovered thus far:
dfs.permissions = true|false (works as advertised)
dfs.supergroup = supergroup (works as advertised)
dfs.umaskmode = umask (I believe this should be used
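For readers unfamiliar with umask arithmetic, here is a small sketch of the general rule: every permission bit set in the umask is cleared from the requested mode. Whether dfs.umaskmode is applied in exactly this way in Hadoop 1.0.3 is an assumption; the class UmaskSketch and its methods are hypothetical and only illustrate the bit arithmetic.

```java
public class UmaskSketch {
    /** Clears from the requested mode every permission bit that is set in the umask. */
    static int applyUmask(int requestedMode, int umask) {
        return requestedMode & ~umask & 0777;
    }

    /** Formats a mode as a three-digit octal string such as "644". */
    static String toOctal(int mode) {
        return String.format("%03o", mode);
    }

    public static void main(String[] args) {
        // A file requested as rw-rw-rw- under umask 022.
        System.out.println(toOctal(applyUmask(0666, 0022)));
        // A directory requested as rwxrwxrwx under umask 022.
        System.out.println(toOctal(applyUmask(0777, 0022)));
    }
}
```

Under this rule a umask of 022 turns 666 into 644 and 777 into 755, while a umask of 077 would leave access for the owner only.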

Re: mini node in a cluster

2012-06-05 Thread Pat Ferrel
OK, so remove the mini-node (client) from the master's slaves, since it's no longer a node. This will cause the client to not get started when the master starts. There is no init.d script on the client, only on the master, since the client was always started by the master through ssh and start-all.sh. The co

Re: What happens when I do not output anything from my mapper

2012-06-05 Thread murat migdisoglu
Hi Devaraj, indeed, the previous email that I sent you contained the -ls output of SequenceFileOutputFormat files, which include the signatures of the class. Hence the size was 87 bytes. Hadoop was creating "empty" files (in fact, files containing only the signature) before I started to use LazyOutputFormat. Regards

Hadoop Streaming Example - Issue

2012-06-05 Thread karanveer.singh
Hi, I am trying to run a Java program as a mapper using Hadoop Streaming, but I am getting the following error: "Cannot run program "new_code.class": java.io.IOException: error=2, No such file or directory at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)" The command being run is: /usr/b
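The error suggests the framework tried to execute the .class file directly, which the operating system cannot do. The command in the message is cut off, so the following is only a guess at what may help: a streaming mapper is an ordinary program that reads lines from standard input and writes tab-separated key/value records to standard output, and a Java class used this way generally has to be launched through the java command rather than named as the executable. The class StreamingMapperSketch and its mapLine helper are hypothetical and do not reproduce the poster's new_code class.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class StreamingMapperSketch {
    /** Turns one input line into "word<TAB>1" records, one per whitespace-separated token. */
    static List<String> mapLine(String line) {
        List<String> records = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                records.add(word + "\t1");
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Streaming feeds input splits on stdin and collects records from stdout.
        BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = reader.readLine()) != null) {
            for (String record : mapLine(line)) {
                System.out.println(record);
            }
        }
    }
}
```

The class can be checked locally without a cluster by piping a text file into it, for example with java StreamingMapperSketch reading from a redirected file, before wiring it into a streaming job.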

Re: Web Service Interface for triggering a Hadoop Job

2012-06-05 Thread Nitin Pawar
You may want to check this on the Mahout user group. Jun 5, 2012 9:14:46 AM org.apache.mahout.common.AbstractJob parseArguments SEVERE: Unexpected mapred.output.dir=output while processing Job-Specific Options: This looks like a command-line argument parsing error. On Tue, Jun 5, 2012 at 11:50 AM, Nik