Re: Best practices for hadoop shuffling/tuning?

2012-01-31 Thread praveenesh kumar
Can anyone please eyeball the config parameters as defined below and share their thoughts on this? Thanks, Praveenesh On Mon, Jan 30, 2012 at 6:20 PM, praveenesh kumar praveen...@gmail.com wrote: Hey guys, Just wanted to ask, are there any sort of best practices to be followed for hadoop

Re: refresh namenode topology cache

2012-01-31 Thread Sateesh Lakkarsu
We are using script-based mapping and got the order wrong for a couple of nodes; will look into a custom mapping class moving forward. Thanks On Tue, Jan 31, 2012 at 12:42 AM, Harsh J ha...@cloudera.com wrote: Sateesh, On Tue, Jan 31, 2012 at 12:24 AM, Sateesh Lakkarsu lakka...@gmail.com wrote:
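A script-based mapping of this kind (configured via topology.script.file.name) is usually a small shell script: Hadoop passes one or more host names or IPs as arguments and expects one rack path per argument on stdout, in the same order. A minimal hypothetical sketch (subnets and rack names invented):

```shell
#!/bin/sh
# Hypothetical topology script for topology.script.file.name.
# Maps each host/IP argument to a rack path, one line per argument.
resolve_rack() {
  case "$1" in
    10.0.1.*) echo /dc1/rack1 ;;
    10.0.2.*) echo /dc1/rack2 ;;
    *)        echo /default-rack ;;   # fallback for unknown hosts
  esac
}

for host in "$@"; do
  resolve_rack "$host"
done

resolve_rack 10.0.1.7   # prints /dc1/rack1
```

Getting the argument-to-output ordering right is exactly where a script like this can silently go wrong, which matches the mix-up described above.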

Hybrid Hadoop with fork/join ?

2012-01-31 Thread Rob Stewart
Hi, I'm investigating the feasibility of a hybrid approach to parallel programming, by fusing together the concurrent Java fork/join libraries with Hadoop... MapReduce, a paradigm suited for scalable execution over distributed memory + fork/join, a paradigm for optimal multi-threaded shared
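For readers unfamiliar with the fork/join half of this proposal, here is a minimal stand-alone sketch (class and threshold names hypothetical) of the divide-and-conquer pattern the java.util.concurrent fork/join libraries provide:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Recursively sum a slice of an array: split until the slice is small,
// then compute directly. This is the shared-memory pattern that would
// run inside a single node in the hybrid approach described above.
class RangeSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1000;
    private final long[] data;
    private final int lo, hi;

    RangeSum(long[] data, int lo, int hi) {
        this.data = data; this.lo = lo; this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {            // base case: sum sequentially
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += data[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;
        RangeSum left = new RangeSum(data, lo, mid);
        RangeSum right = new RangeSum(data, mid, hi);
        left.fork();                            // run left half asynchronously
        return right.compute() + left.join();   // compute right here, then join
    }
}

public class ForkJoinDemo {
    public static void main(String[] args) {
        long[] data = new long[10000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        long sum = new ForkJoinPool().invoke(new RangeSum(data, 0, data.length));
        System.out.println(sum); // 49995000
    }
}
```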

Re: Hadoop Datacenter Setup

2012-01-31 Thread Aaron Tokhy
That was another one of our options (PXE/NFS netboot). I'm just afraid of NFS locking up on me at random times. If you've had any success running a live read-only root filesystem off of NFS, I'd be more inclined to use it. If our image is small enough (~500 MB), it may make sense to

Re: Hybrid Hadoop with fork/join ?

2012-01-31 Thread Alejandro Abdelnur
Rob, Hadoop has a way to run Map tasks in multithreaded mode; look for MultithreadedMapRunner / MultithreadedMapper. Thanks. Alejandro. On Tue, Jan 31, 2012 at 7:51 AM, Rob Stewart robstewar...@gmail.com wrote: Hi, I'm investigating the feasibility of a hybrid approach to parallel

Re: Adding mahout math jar to hadoop mapreduce execution

2012-01-31 Thread Daniel Quach
For Hadoop 0.20.203 (the latest stable), is it sufficient to do this to parse the lib jars from the command line? public static void main(String[] args) { Configuration conf = new Configuration(); String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs(); Job job = new Job(conf,

S3 block file system

2012-01-31 Thread Madhu Ramanna
Hello, Is the S3 block file system still supported? I read in the AWS docs that the S3 block file system is deprecated; not sure if the doc is in error (or old). Thanks, Madhu

Re: Best practices for hadoop shuffling/tuning?

2012-01-31 Thread Arun C Murthy
Moving to mapreduce-user@, bcc common-user@. Please use project-specific lists. Your io.sort.mb is too high; you only have 1G of heap for the map task. Reduce parallel copies is too high too. On Jan 30, 2012, at 4:50 AM, praveenesh kumar wrote: Hey guys, Just wanted to ask, are there any sort
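Arun's point translates into mapred-site.xml settings along these lines. The values below are illustrative only, not a recommendation for every cluster; the key constraint is that io.sort.mb must fit well inside the map task heap:

```xml
<!-- mapred-site.xml: illustrative values only -->
<property>
  <name>io.sort.mb</name>
  <value>256</value> <!-- sort buffer; must fit comfortably inside the map heap -->
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value> <!-- 1G task heap; io.sort.mb is carved out of this -->
</property>
<property>
  <name>mapred.reduce.parallel.copies</name>
  <value>10</value> <!-- parallel map-output fetches during the shuffle -->
</property>
```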

Re: Namenode service not running on the Configured IP address

2012-01-31 Thread anil gupta
Hi Harsh/Praveenesh, Thanks for the reply, guys. I forgot to mention that /etc/hosts had the IP-to-DNS mapping, and yes, I deliberately used the IP address in the configuration because I thought that if I use the hostname, then how would hadoop know about the network interface on which it is

Re: Adding mahout math jar to hadoop mapreduce execution

2012-01-31 Thread Joey Echeverria
You also need to add the jar to the classpath so it's available in your main. You can do something like this: HADOOP_CLASSPATH=/usr/local/mahout/math/target/mahout-math-0.6-SNAPSHOT.jar hadoop jar ... -Joey On Tue, Jan 31, 2012 at 1:38 PM, Daniel Quach danqu...@cs.ucla.edu wrote: For Hadoop
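The full pattern Joey describes has two halves: HADOOP_CLASSPATH puts the jar on the client JVM's classpath (so the driver's main can see it), and -libjars ships the same jar to the map/reduce tasks. A sketch, with the job jar and driver class names hypothetical:

```shell
# Put the Mahout math jar on the client classpath (path from the thread above).
export HADOOP_CLASSPATH=/usr/local/mahout/math/target/mahout-math-0.6-SNAPSHOT.jar

# Then ship the same jar to the tasks with -libjars, which is picked up by
# GenericOptionsParser in the driver (job jar / class names hypothetical):
#   hadoop jar myjob.jar com.example.MyDriver -libjars "$HADOOP_CLASSPATH" in out

echo "$HADOOP_CLASSPATH"
```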

plugging in to hadoop

2012-01-31 Thread tejas
Hi, I am trying to develop a UI to display metrics for MapReduce and HDFS, and also provide some kind of monitoring. 1. With version 0.20, I was able to extend the org.apache.JobTrackerPlugin class by configuring it in mapred-site.properties. Is it possible to make a similar configuration

metrics2 not working

2012-01-31 Thread tejas
Hi, I set up hadoop 0.23. I configured hadoop-metrics.properties as well as hadoop-metrics2.properties to dump metrics files. However, I see that hadoop still does not pick up the metrics2 properties file and still logs as per the configuration in hadoop-metrics.properties. Is there some specific
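One thing worth double-checking is that hadoop-metrics2.properties actually declares a sink; without one, nothing is emitted even if the file is found. A minimal file-sink example (output filenames illustrative):

```properties
# hadoop-metrics2.properties -- minimal file sink (filenames illustrative)
*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
namenode.sink.file.filename=namenode-metrics.out
datanode.sink.file.filename=datanode-metrics.out
```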

Re: Does Hadoop 0.20.205 and Ganglia 3.1.7 compatible with each other ?

2012-01-31 Thread Merto Mertek
I would be glad to hear that too. I've set up the following: Hadoop 0.20.205, Ganglia frontend 3.1.7, Ganglia backend (gmetad) 3.1.7, RRDtool (http://www.rrdtool.org/) 1.4.5 - I had some trouble installing 1.4.4. Ganglia works only when hadoop is not running, so metrics are not published to gmetad
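One common pitfall with this combination: Ganglia 3.1.x changed the wire format, so the contexts in hadoop-metrics.properties need GangliaContext31 rather than the older GangliaContext. A sketch (gmond host/port illustrative):

```properties
# hadoop-metrics.properties -- Ganglia 3.1.x wire format (host/port illustrative)
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=gmond-host:8649
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
mapred.period=10
mapred.servers=gmond-host:8649
```

Using the plain GangliaContext against a 3.1.x gmond can corrupt its state, which would be consistent with Ganglia misbehaving only while hadoop is running.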

Re: plugging in to hadoop

2012-01-31 Thread real great..
I think there are a couple of projects already looking into a UI. Did you check them? On Wed, Feb 1, 2012 at 7:57 AM, tejas tejas.krishnamoor...@oracle.com wrote: Hi I am trying to develop a UI and display metrics of mapreduce and HDFS and also provide some kind of monitoring. 1. With version

Re: plugging in to hadoop

2012-01-31 Thread tejas
I am actually looking at implementing my own UI. I am looking to develop a monitoring and control framework for hadoop. I am mainly looking at hadoop 0.23. On 2/1/2012 8:36 AM, real great.. wrote: I think there are a couple of projects already looking into UI. did you check them? On Wed, Feb 1,

Re: Any samples of how to write a custom FileSystem

2012-01-31 Thread Harsh J
To write a custom filesystem, extend the FileSystem class. Depending on the scheme it is supposed to serve, create an entry fs.scheme.impl in core-site.xml; loading it via the FileSystem.get(URI, conf) API will then auto-load it for you, provided the URI you pass has the right scheme.
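The wiring Harsh describes, for a hypothetical myfs:// scheme backed by a hypothetical implementation class, would look like this in core-site.xml:

```xml
<!-- core-site.xml: map the hypothetical myfs:// scheme to a custom class -->
<property>
  <name>fs.myfs.impl</name>
  <value>com.example.MyFileSystem</value> <!-- extends org.apache.hadoop.fs.FileSystem -->
</property>
```

With that in place, FileSystem.get(URI.create("myfs://host/path"), conf) resolves the "myfs" scheme to fs.myfs.impl and instantiates the custom class.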

Re: Any samples of how to write a custom FileSystem

2012-01-31 Thread Alejandro Abdelnur
Steven, You could also look at HttpFSFileSystem in the hadoop-httpfs module; it is quite simple and self-contained. Cheers. Alejandro On Tue, Jan 31, 2012 at 8:37 PM, Harsh J ha...@cloudera.com wrote: To write a custom filesystem, extend the FileSystem class. Depending on the scheme it