Re: map task execution time

2012-04-05 Thread bikash sharma
Thanks Kai, I will try those.

On Thu, Apr 5, 2012 at 3:15 AM, Kai Voigt k...@123.org wrote:

 Hi,

 On 05.04.2012 at 00:20, bikash sharma wrote:

  Is it possible to get the execution time of the constituent map/reduce
  tasks of a MapReduce job (say sort) at the end of a job run?
  Preferably, can we obtain this programmatically?


 you can access the JobTracker's web UI and see the start and stop
 timestamps for every individual task.

 Since the JobTracker Java API is exposed, you can write your own
 application to fetch that data through your own code.

 Also, hadoop job on the command line can be used to read job statistics.
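 For example (a sketch; exact flags vary a little between releases, and the
 job ID and job output directory below are placeholders):

 $ hadoop job -list all                               # job IDs and their states
 $ hadoop job -status job_201204050815_0001           # counters and completion for one job
 $ hadoop job -history all /user/bikash/sort-output   # per-task start/finish times and durations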

 Kai


 --
 Kai Voigt
 k...@123.org







Re: map task execution time

2012-04-05 Thread bikash sharma
Yes, how can we use hadoop job to get MR job stats, especially
constituent task finish times?


On Thu, Apr 5, 2012 at 9:02 AM, Jay Vyas jayunit...@gmail.com wrote:

 (excuse the typo in the last email: I meant "I've been playing with Cinch",
 not "I've been with Cinch")

 On Thu, Apr 5, 2012 at 7:54 AM, Jay Vyas jayunit...@gmail.com wrote:

  How can hadoop job be used to read M/R statistics?
 
  On Thu, Apr 5, 2012 at 7:30 AM, bikash sharma sharmabiks...@gmail.com
 wrote:
 
  Thanks Kai, I will try those.
 
  On Thu, Apr 5, 2012 at 3:15 AM, Kai Voigt k...@123.org wrote:
 
   Hi,
  
   On 05.04.2012 at 00:20, bikash sharma wrote:
  
Is it possible to get the execution time of the constituent
 map/reduce
tasks of a MapReduce job (say sort) at the end of a job run?
Preferably, can we obtain this programmatically?
  
  
   you can access the JobTracker's web UI and see the start and stop
   timestamps for every individual task.
  
   Since the JobTracker Java API is exposed, you can write your own
   application to fetch that data through your own code.
  
   Also, hadoop job on the command line can be used to read job
  statistics.
  
   Kai
  
  
   --
   Kai Voigt
   k...@123.org
  
  
  
  
  
 
 
 
 
  --
  Jay Vyas
  MMSB/UCHC
 



 --
 Jay Vyas
 MMSB/UCHC



Re: getting the process id of mapreduce tasks

2011-09-30 Thread bikash sharma
Thanks so much Harsh!

On Thu, Sep 29, 2011 at 12:42 AM, Harsh J ha...@cloudera.com wrote:

 Hello Bikash,

 The tasks run on the tasktracker, so that is where you'll need to look
 for the process ID -- not the JobTracker/client.

 Crudely speaking,
 $ ssh tasktracker01 # or whichever.
 $ jps | grep Child | cut -d ' ' -f 1
 # And lo, PIDs to play with.
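 To go from those PIDs to per-task resource usage, one option (a sketch;
 assumes the sysstat pidstat tool is installed, since sar itself only reports
 system-wide figures) is:

 $ for pid in $(jps | grep Child | cut -d ' ' -f 1); do pidstat -u -r -p "$pid" 5 1; done
 # one 5-second sample of CPU and memory per task JVM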

 On Thu, Sep 29, 2011 at 12:15 AM, bikash sharma sharmabiks...@gmail.com
 wrote:
  Hi,
  Is it possible to get the process id of each task in a MapReduce job?
  When I run a mapreduce job and do a monitoring in linux using ps, i just
 see
  the id of the mapreduce job process but not its constituent map/reduce
  tasks.
  The use case is to monitor the resource usage of each task by using sar
  utility in linux with specific process id of task.
 
  Thanks,
  Bikash
 



 --
 Harsh J



Re: getting the process id of mapreduce tasks

2011-09-30 Thread bikash sharma
Thanks Varad.

On Wed, Sep 28, 2011 at 9:35 PM, Varad Meru meru.va...@gmail.com wrote:

 The process IDs of each individual task can be seen using the jps and
 jconsole commands provided by Java.

 The jconsole command, run from the command line, provides a GUI screen for
 monitoring running tasks within Java.

 The tasks are only visible as Java virtual machine instances in the OS
 system monitoring tools.


 Regards,
 Varad Meru
 ---
 Sent from my iPod

 On 29-Sep-2011, at 0:15, bikash sharma sharmabiks...@gmail.com wrote:

  Hi,
  Is it possible to get the process id of each task in a MapReduce job?
  When I run a mapreduce job and do a monitoring in linux using ps, i just
 see
  the id of the mapreduce job process but not its constituent map/reduce
  tasks.
  The use case is to monitor the resource usage of each task by using sar
  utility in linux with specific process id of task.
 
  Thanks,
  Bikash



linux containers with Hadoop

2011-09-30 Thread bikash sharma
Hi,
Does anyone know if Linux containers (a kernel-supported virtualization
technique for providing resource isolation across processes/applications)
have ever been used with Hadoop to provide resource isolation for map/reduce
tasks?
If yes, what are the up/down sides of such an approach, and how feasible is it
in the context of Hadoop?
Any pointers, such as papers, would be useful.

Thanks,
Bikash


Re: linux containers with Hadoop

2011-09-30 Thread bikash sharma
Thanks Edward. So Linux containers are mostly used in Hadoop for ensuring
isolation in the sense of security across MapReduce jobs from different users
(even Mesos seems to leverage the same), not for resource fairness?

On Fri, Sep 30, 2011 at 1:39 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

 On Fri, Sep 30, 2011 at 9:03 AM, bikash sharma sharmabiks...@gmail.com
 wrote:

  Hi,
  Does anyone knows if Linux containers (which are like kernel supported
  virtualization technique for providing resource isolation across
  process/appication) have ever been used with Hadoop to provide resource
  isolation for map/reduce tasks?
  If yes, what could be the up/down sides of such approach and how feasible
  it
  is in the context of Hadoop?
  Any pointers if any in terms of papers, etc would be useful.
 
  Thanks,
  Bikash
 

 Previously Hadoop launched map/reduce tasks as a single user; now, with
 security, tasks can be launched as different users in the same OS/VM. I would
 say the closest you can get to that isolation is the work done with Mesos:
 http://www.mesosproject.org/
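 If you want to experiment with container-style isolation by hand, cgroups can
 be applied from outside Hadoop; a very rough sketch (assumes the
 libcgroup/cgroup-tools utilities and cgroup v1; nothing in Hadoop sets this
 up for you, and the PID below is a placeholder):

 $ cgcreate -g cpu,memory:/mrtask                    # create a control group
 $ cgset -r memory.limit_in_bytes=1073741824 mrtask  # cap the group at 1 GB
 $ cgclassify -g cpu,memory:mrtask 12345             # move a running task JVM (PID 12345) into it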



getting the process id of mapreduce tasks

2011-09-28 Thread bikash sharma
Hi,
Is it possible to get the process ID of each task in a MapReduce job?
When I run a MapReduce job and monitor it in Linux using ps, I just see
the ID of the MapReduce job process but not its constituent map/reduce
tasks.
The use case is to monitor the resource usage of each task by using the sar
utility in Linux with the specific process ID of the task.

Thanks,
Bikash


Re: Getting the cpu, memory usage of map/reduce tasks

2011-09-27 Thread bikash sharma
Thanks Ralf.

On Mon, Sep 26, 2011 at 2:01 PM, Ralf Heyde ralf.he...@gmx.de wrote:

 Hi Bikash,

 every map/reduce task is, as far as I know, a single JVM instance, which you
 can configure and/or run with JVM options.
 Maybe you can track these JVMs by using some system tools.
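 For example, something along these lines (a sketch; assumes the task JVMs run
 under the 0.20-era Child class, so jps can pick them out):

 $ ps -o pid,pcpu,pmem,rss,args -p $(jps | grep Child | cut -d ' ' -f 1 | paste -sd, -)
 # CPU%, memory% and resident size for each task JVM currently running on this node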

 Regards,
 Ralf

 -Original Message-
 From: bikash sharma [mailto:sharmabiks...@gmail.com]
 Sent: Friday, 23 September 2011 20:58
 To: common-user@hadoop.apache.org; common-...@hadoop.apache.org
 Subject: Getting the cpu, memory usage of map/reduce tasks

 Hi -- Is it possible to get the CPU and memory usage of individual
 map/reduce tasks when a MapReduce job is run?
 I came across this JIRA issue, but was not sure about the exact way to
 access this in the current Hadoop distribution:
 https://issues.apache.org/jira/browse/MAPREDUCE-220

 Any help is highly appreciated.

 Thanks,
 Bikash




configuring different number of slaves for MR jobs

2011-09-27 Thread bikash sharma
Hi -- Can we specify a different set of slaves for each MapReduce job run?
I tried using the --config option and specified a different set of slaves in
the slaves config file. However, it does not use the selected set of slaves
but the one initially configured.

Any help?

Thanks,
Bikash


Re: configuring different number of slaves for MR jobs

2011-09-27 Thread bikash sharma
Thanks Suhas. I will try using HOD. The use case for me is some research
experiments with a different set of slaves for each job run.

On Tue, Sep 27, 2011 at 1:03 PM, Vitthal Suhas Gogate 
gog...@hortonworks.com wrote:

 The slaves file is used only by control scripts like {start/stop}-dfs.sh and
 {start/stop}-mapred.sh to start the data nodes and task trackers on a
 specified set of slave machines. It cannot be used effectively to change
 the size of the cluster for each M/R job (unless you want to restart the
 task trackers with a different number of slaves before every M/R job :)

 You can use the Hadoop JobTracker schedulers (Capacity/Fair-share) to allocate
 and share the cluster capacity effectively. There is also the option of using
 HOD (Hadoop on Demand) to dynamically allocate a cluster of the required
 number of nodes; this is typically used by QA/RE folks for testing purposes.
 Again, in production, resizing the HDFS cluster is not easy, as the nodes hold
 the data.

 --Suhas

 On Tue, Sep 27, 2011 at 8:50 AM, bikash sharma sharmabiks...@gmail.com
 wrote:

  Hi -- Can we specify a different set of slaves for each mapreduce job
 run.
  I tried using the --config option and specify different set of slaves in
  slaves config file. However, it does not use the selective slaves set but
  the one initially configured.
 
  Any help?
 
  Thanks,
  Biksah
 



Getting the cpu, memory usage of map/reduce tasks

2011-09-23 Thread bikash sharma
Hi -- Is it possible to get the CPU and memory usage of individual
map/reduce tasks when a MapReduce job is run?
I came across this JIRA issue, but was not sure about the exact way to
access this in the current Hadoop distribution:
https://issues.apache.org/jira/browse/MAPREDUCE-220

Any help is highly appreciated.

Thanks,
Bikash


automatic monitoring the utilization of slaves

2011-06-16 Thread bikash sharma
Hi -- Is there a way by which a slave can get a trigger when a Hadoop job
finishes on the master?
The use case is as follows:
I need to monitor CPU and memory utilization automatically, for which I need
to know the timestamps at which to start and stop the sar utility,
corresponding to the start and finish of the Hadoop job on the master.
It is simple to do on the master, since the Hadoop job runs there, but how do
we do it for the slaves?
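The closest I have is to drive everything from the master over ssh, along
these lines (a rough sketch; the slave list, intervals and paths are
placeholders), but I was hoping for something where the slaves themselves get
notified:

#!/bin/sh
# start sar on each slave, run the job, then stop sar again
for h in $(cat conf/slaves); do
  ssh "$h" "nohup sar -o /tmp/sar_$h.bin 5 > /dev/null 2>&1 &"
done
bin/hadoop jar hadoop-*-examples.jar sort input output   # the job whose lifetime we want to bracket
for h in $(cat conf/slaves); do
  ssh "$h" "pkill -x sar"
done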

Thanks.
Bikash


/etc/hosts related error?

2011-06-08 Thread bikash sharma
Hi
I am experiencing a lot of task failures while running any Hadoop
application.
In particular, I get the following warnings:
Error initializing attempt_201106081500_0018_r_00_0:
java.io.IOException: Could not obtain block: blk_-7386162385184325734_1214
file=/home/hadoop/data/mapred/system/job_201106081500_0018/job.xml

Looking in the forums, it seems this has something to do with the /etc/hosts
settings, because I also cannot access the JobTracker web interface via the
hostname, but can access it via the actual IP address.

I set /etc/hosts in all the VMs with one line per node of the form:
<ip address>  <hostname>
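For example, for the master (the address is taken from the logs in this
thread), making sure the real hostname is not also mapped to 127.0.0.1:

127.0.0.1       localhost
130.203.58.207  inti79.cse.psu.edu  inti79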

Any idea?

Thanks


hadoop cluster installation problems

2011-04-13 Thread bikash sharma
Hi,
I need to install Hadoop on a 16-node cluster. I have a couple of related
questions:
1. I have installed Hadoop in a shared directory, i.e., there is just one
place where all the Hadoop installation files exist, and all 16 nodes
use the same installation.
Is that an issue, or do I need to install Hadoop on each of these nodes in
their local directories separately?
2. I installed hadoop-0.21 and, after following the installation
instructions, when I tried formatting I got the following error:

/
Re-format filesystem in /var/tmp/data/dfs/name ? (Y or N) Y
11/04/13 09:16:23 INFO namenode.FSNamesystem: defaultReplication = 3
11/04/13 09:16:23 INFO namenode.FSNamesystem: maxReplication = 512
11/04/13 09:16:23 INFO namenode.FSNamesystem: minReplication = 1
11/04/13 09:16:23 INFO namenode.FSNamesystem: maxReplicationStreams = 2
11/04/13 09:16:23 INFO namenode.FSNamesystem: shouldCheckForEnoughRacks =
false
11/04/13 09:16:23 INFO security.Groups: Group mapping
impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
cacheTimeout=30
11/04/13 09:16:23 INFO namenode.FSNamesystem: fsOwner=bus145
11/04/13 09:16:23 INFO namenode.FSNamesystem: supergroup=supergroup
11/04/13 09:16:23 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/04/13 09:16:23 INFO namenode.FSNamesystem: isAccessTokenEnabled=false
accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
11/04/13 09:16:24 INFO common.Storage: Cannot lock storage
/var/tmp/data/dfs/name. The directory is already locked.
11/04/13 09:16:24 ERROR namenode.NameNode: java.io.IOException: Cannot lock
storage /var/tmp/data/dfs/name. The directory is already locked.
at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:617)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1426)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1444)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1242)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1348)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1368)

11/04/13 09:16:24 INFO namenode.NameNode: SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down NameNode at inti79.cse.psu.edu/130.203.58.207
/

3. I was using hadoop-0.20 before, and formatting was working fine.
4. Also, when I do bin/start-dfs.sh, I am able to see the NameNode and
DataNodes up; however, on bin/start-mapred.sh, I am not able to see the
JobTracker up on the master node, though TaskTrackers seem to be up on the
slaves.

Before upgrading to Hadoop-0.21, everything was working fine with
hadoop-0.20 including running benchmarks and getting stats.

Any suggestions in this regard are highly appreciated.

Thanks,
Bikash


Re: hadoop cluster installation problems

2011-04-13 Thread bikash sharma
p.s.
Also, while starting dfs using bin/start-dfs.sh, I get the following error:

2011-04-13 09:42:31,729 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = inti84.cse.psu.edu/130.203.58.212
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build =
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r
911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
/
2011-04-13 09:42:31,853 ERROR
org.apache.hadoop.hdfs.server.namenode.NameNode:
java.lang.NullPointerException
at
org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:134)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:156)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:160)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:175)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:279)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)

2011-04-13 09:42:31,854 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down NameNode at inti84.cse.psu.edu/130.203.58.212
/
2011-04-13 09:44:03,265 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = inti84.cse.psu.edu/130.203.58.212
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build =
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r
911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
/
2011-04-13 09:44:03,384 ERROR
org.apache.hadoop.hdfs.server.namenode.NameNode:
java.lang.NullPointerException
at
org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:134)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:156)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:160)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:175)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:279)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)


On Wed, Apr 13, 2011 at 9:20 AM, bikash sharma sharmabiks...@gmail.com wrote:

 Hi,
 I need to install hadoop on 16-node cluster. I have a couple of related
 questions:
 1. I have installed hadoop on a shared directory, i.e., there is just one
 place where the whole hadoop installation files exist and all the 16 nodes
 use the same installation.
 Is that an issue or I need to install hadoop on each of these nodes in
 their local directory separately?
 2. I installed hadoop-0.21 and after following the installation
 instructions, when i tried formatting, I get the following error:

 /
 Re-format filesystem in /var/tmp/data/dfs/name ? (Y or N) Y
 11/04/13 09:16:23 INFO namenode.FSNamesystem: defaultReplication = 3
 11/04/13 09:16:23 INFO namenode.FSNamesystem: maxReplication = 512
 11/04/13 09:16:23 INFO namenode.FSNamesystem: minReplication = 1
 11/04/13 09:16:23 INFO namenode.FSNamesystem: maxReplicationStreams = 2
 11/04/13 09:16:23 INFO namenode.FSNamesystem: shouldCheckForEnoughRacks =
 false
 11/04/13 09:16:23 INFO security.Groups: Group mapping
 impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
 cacheTimeout=30
 11/04/13 09:16:23 INFO namenode.FSNamesystem: fsOwner=bus145
 11/04/13 09:16:23 INFO namenode.FSNamesystem: supergroup=supergroup
 11/04/13 09:16:23 INFO namenode.FSNamesystem: isPermissionEnabled=true
 11/04/13 09:16:23 INFO namenode.FSNamesystem: isAccessTokenEnabled=false
 accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
 11/04/13 09:16:24 INFO common.Storage: Cannot lock storage
 /var/tmp/data/dfs/name. The directory is already locked.
 11/04/13 09:16:24 ERROR namenode.NameNode: java.io.IOException: Cannot lock
 storage /var/tmp/data/dfs/name. The directory is already locked.
 at
 org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:617)
 at
 org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1426)
 at
 org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1444)
 at
 org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java

cluster restart error

2011-04-12 Thread bikash sharma
Hi,
I changed some configuration parameters in the core-site.xml/mapred-site.xml
files and then stopped the dfs and mapred services.
While restarting them, I am unable to do so; looking at the logs, the
following error occurs:

2011-04-12 17:27:39,343 INFO org.mortbay.log: Logging to
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
org.mortbay.log.Slf4jLog
2011-04-12 17:27:39,453 ERROR org.apache.hadoop.mapred.TaskTracker: Can not
start task tracker because java.lang.NoClassDefFoundError:
org/json/JSONException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)
at
org.apache.hadoop.metrics.ContextFactory.getContext(ContextFactory.java:132)
at
org.apache.hadoop.metrics.MetricsUtil.getContext(MetricsUtil.java:56)
at
org.apache.hadoop.metrics.MetricsUtil.getContext(MetricsUtil.java:45)
at
org.apache.hadoop.mapred.TaskTracker$ShuffleServerMetrics.init(TaskTracker.java:250)
at org.apache.hadoop.mapred.TaskTracker.init(TaskTracker.java:917)
at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2834)
Caused by: java.lang.ClassNotFoundException: org.json.JSONException
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
... 8 more

2011-04-12 17:27:39,455 INFO org.apache.hadoop.mapred.TaskTracker:
SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down TaskTracker at inti79.cse.psu.edu/130.203.58.207
/
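From the trace, this looks like a JSON jar missing from the TaskTracker's
classpath; a quick check (a sketch, run from the Hadoop install directory;
the exact jar name varies by build):

$ ls lib/ | grep -i json      # is any json jar shipped under lib/ at all?
$ echo $HADOOP_CLASSPATH      # or is it expected to come from the environment?

If nothing turns up, copying a suitable json jar into lib/ on every
TaskTracker node and restarting should get past the NoClassDefFoundError.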


Also, I am unable to do namenode -format to start the cluster fresh.
Any suggestions please?

Thanks,
Bikash


Re: cluster restart error

2011-04-12 Thread bikash sharma
p.s.
Also, I am unable to connect when doing hadoop/bin/hadoop fs -ls, with the
following error:

inti76.cse.psu.edu: starting tasktracker, logging to
/i3c/hpcl/bus145/cse598g/hadoop/bin/../logs/hadoop-bus145-tasktracker-inti76.cse.psu.edu.out
inti79.cse.psu.edu 36% hadoop/bin/hadoop fs -ls
11/04/12 17:37:34 INFO ipc.Client: Retrying connect to server:
inti79.cse.psu.edu/130.203.58.207:54310. Already tried 0 time(s).
11/04/12 17:37:35 INFO ipc.Client: Retrying connect to server:
inti79.cse.psu.edu/130.203.58.207:54310. Already tried 1 time(s).
11/04/12 17:37:36 INFO ipc.Client: Retrying connect to server:
inti79.cse.psu.edu/130.203.58.207:54310. Already tried 2 time(s).


On Tue, Apr 12, 2011 at 5:34 PM, bikash sharma sharmabiks...@gmail.com wrote:

 Hi,
 I changed some config. parameters in core-site/mapred.xml files and then
 stopped dfs, mapred services.
 While restarting them again, I am unable to do so and looking at the logs,
 the following error occurs:

 2011-04-12 17:27:39,343 INFO org.mortbay.log: Logging to
 org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
 org.mortbay.log.Slf4jLog
 2011-04-12 17:27:39,453 ERROR org.apache.hadoop.mapred.TaskTracker: Can not
 start task tracker because java.lang.NoClassDefFoundError:
 org/json/JSONException
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:169)
 at
 org.apache.hadoop.metrics.ContextFactory.getContext(ContextFactory.java:132)
 at
 org.apache.hadoop.metrics.MetricsUtil.getContext(MetricsUtil.java:56)
 at
 org.apache.hadoop.metrics.MetricsUtil.getContext(MetricsUtil.java:45)
 at
 org.apache.hadoop.mapred.TaskTracker$ShuffleServerMetrics.init(TaskTracker.java:250)
 at
 org.apache.hadoop.mapred.TaskTracker.init(TaskTracker.java:917)
 at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2834)
 Caused by: java.lang.ClassNotFoundException: org.json.JSONException
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
 ... 8 more

 2011-04-12 17:27:39,455 INFO org.apache.hadoop.mapred.TaskTracker:
 SHUTDOWN_MSG:
 /
 SHUTDOWN_MSG: Shutting down TaskTracker at
 inti79.cse.psu.edu/130.203.58.207
 /


 Also, am unable to do namenode -format to fresh start the cluster.
 Any suggestions please?

 Thanks,
 Bikash



available hadoop logs

2011-04-08 Thread bikash sharma
Hi,
For research purposes, I need some real Hadoop MapReduce job traces (ideally
both inter- and intra-job, in terms of Hadoop job configuration parameters
like mapred.io.sort.factor).
Are there any freely available Hadoop traces corresponding to a real,
large setup?

Thanks,
Bikash


runtime resource change of applications

2011-04-01 Thread bikash sharma
Hi,
Can we dynamically vary the resource allocation/consumption (say memory,
cores) of Hadoop MR applications like sort?

Thanks,
Bikash


Chukwa setup issues

2011-04-01 Thread bikash sharma
Hi,
I am trying to set up Chukwa for a 16-node Hadoop cluster.
I followed the admin guide -
http://incubator.apache.org/chukwa/docs/r0.4.0/admin.html#Agents
However, I ran into the following two issues:
1. What collector port needs to be specified in the
conf/collectors file?
2. I am unable to see the collector running via the web browser.

Am I missing something?

Thanks in advance.

-bikash

p.s. - after I run the collector, nothing happens:
% bin/chukwa collector
2011-04-01 16:07:16.410::INFO:  Logging to STDERR via
org.mortbay.log.StdErrLog
2011-04-01 16:07:16.523::INFO:  jetty-6.1.11
2011-04-01 16:07:17.707::INFO:  Started SelectChannelConnector@0.0.0.0:
started Chukwa http collector on port 


Re: Chukwa setup issues

2011-04-01 Thread bikash sharma
Thanks Bill.
I am able to connect via the web now; I had actually put the wrong HTTP port
in the config file.
One follow-up question: if I run a MapReduce program, say TeraSort, how can
we link Chukwa to collect job metrics via the web?


On Fri, Apr 1, 2011 at 5:37 PM, Bill Graham billgra...@gmail.com wrote:

 Unfortunately conf/collectors is used in two different ways in Chukwa,
 each with a different syntax. This should really be fixed.

 1. The script that starts the collectors looks at it for a list of
 hostnames (no ports) to start collectors on. To start it just on one
 host, set it to localhost.
 2. The agent looks at that file for the list of collectors to attempt
 to communicate with. In that case the format is a list of HTTP urls
 with ports of the collectors.
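 For example (a sketch; the host and port are placeholders), the agent-side
 form would list full URLs, one per line:

 http://collector-host:8080

 while the start-script form would list bare hostnames, e.g. just localhost.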

 Can you telnet to port ? It looks like it's listening, but
 nothing's being sent. Is there anything in logs/collector.log?

 On Fri, Apr 1, 2011 at 1:09 PM, bikash sharma sharmabiks...@gmail.com
 wrote:
  Hi,
  I am trying to setup Chukwa for a 16-node Hadoop cluster.
  I followed the admin guide -
  http://incubator.apache.org/chukwa/docs/r0.4.0/admin.html#Agents
  However, I ran two the following issues:
  1. What should be the collector port that needs to be specified in
  conf/collectors file
  2. Am unable to see the collector running via web browser
 
  Am I missing something?
 
  Thanks in advance.
 
  -bikash
 
  p.s. - after i run collector, nothing happens
  % bin/chukwa collector
  2011-04-01 16:07:16.410::INFO:  Logging to STDERR via
  org.mortbay.log.StdErrLog
  2011-04-01 16:07:16.523::INFO:  jetty-6.1.11
  2011-04-01 16:07:17.707::INFO:  Started
 SelectChannelConnector@0.0.0.0:
  started Chukwa http collector on port 
 



Re: Chukwa setup issues

2011-04-01 Thread bikash sharma
I was trying to install HICC in Chukwa, but hicc.sh does not exist in the
repository.
Any idea?

-bikash

On Fri, Apr 1, 2011 at 5:57 PM, bikash sharma sharmabiks...@gmail.com wrote:

 Thanks Bill.
 I am able to connect via web now, actually had put wrong http port in
 config file.
 One following question - if i run a mapreduce program say terasort, how can
 we link chukwa to collect job metrics via web.


 On Fri, Apr 1, 2011 at 5:37 PM, Bill Graham billgra...@gmail.com wrote:

 Unfortunately conf/collectors is used in two different ways in Chukwa,
 each with a different syntax. This should really be fixed.

 1. The script that starts the collectors looks at it for a list of
 hostnames (no ports) to start collectors on. To start it just on one
 host, set it to localhost.
 2. The agent looks at that file for the list of collectors to attempt
 to communicate with. In that case the format is a list of HTTP urls
 with ports of the collectors.

 Can you telnet to port ? It looks like it's listening, but
 nothing's being sent. Is there anything in logs/collector.log?

 On Fri, Apr 1, 2011 at 1:09 PM, bikash sharma sharmabiks...@gmail.com
 wrote:
  Hi,
  I am trying to setup Chukwa for a 16-node Hadoop cluster.
  I followed the admin guide -
  http://incubator.apache.org/chukwa/docs/r0.4.0/admin.html#Agents
  However, I ran two the following issues:
  1. What should be the collector port that needs to be specified in
  conf/collectors file
  2. Am unable to see the collector running via web browser
 
  Am I missing something?
 
  Thanks in advance.
 
  -bikash
 
  p.s. - after i run collector, nothing happens
  % bin/chukwa collector
  2011-04-01 16:07:16.410::INFO:  Logging to STDERR via
  org.mortbay.log.StdErrLog
  2011-04-01 16:07:16.523::INFO:  jetty-6.1.11
  2011-04-01 16:07:17.707::INFO:  Started
 SelectChannelConnector@0.0.0.0:
  started Chukwa http collector on port 
 





Re: observe the effect of changes to Hadoop

2011-03-27 Thread bikash sharma
Thanks Steve. It worked.

On Sun, Mar 27, 2011 at 2:08 PM, Steve Loughran ste...@apache.org wrote:

 On 25/03/2011 14:10, bikash sharma wrote:

 Hi,
 For my research project, I need to add a couple of functions to the
 JobTracker.java source file to include additional information about the
 TaskTrackers' resource usage through heartbeat messages. I made those changes
 to the JobTracker.java file. However, I am not very clear on how to see these
 effects. I mean, what are the next steps in terms of building the entire
 Hadoop code base, using the built distribution, installing it again on the
 cluster, etc.?


 If you are working with the JobTracker, you only need to rebuild the
 mapreduce JARs, push the new JAR out to the JobTracker server, and restart
 that process.
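 For instance, a sketch (the target, jar and host names vary between the 0.20
 single tree and the 0.21 split projects; the paths are placeholders):

 $ cd /path/to/hadoop-source        # the tree containing the modified JobTracker.java
 $ ant jar                          # rebuild the core/mapreduce jar
 $ scp build/hadoop-*-core.jar jobtracker-host:/path/to/hadoop/
 $ ssh jobtracker-host '/path/to/hadoop/bin/hadoop-daemon.sh stop jobtracker; /path/to/hadoop/bin/hadoop-daemon.sh start jobtracker'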

 For more safety, put the same JAR on all the task trackers and shut down
 HDFS before the updates, but that's potentially overkill.


  Any elaboration on these steps will be very useful, since I do not have
 much experience in making modifications to a huge code base like Hadoop and
 observing the effects of these changes.


 I'd recommend getting everything working on a local machine in a single VM
 (the MiniMRCluster class helps), then moving to multiple VMs and finally, if
 the code looks good, a real cluster with data you don't value.

 -steve



pointers to Hadoop eclipse

2011-03-17 Thread bikash sharma
Hi,
Can someone please point me to a good reference that clearly explains how to
check out the Hadoop code base in Eclipse, make changes, and re-compile?
Actually, I want to change some part of Hadoop and see the effect of those
changes, preferably in Eclipse.
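What I have pieced together so far is roughly the following, though I am not
sure it is the recommended route (the ant target that generates Eclipse
project files differs between branches; ant -p lists the available targets):

$ svn checkout https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 hadoop-0.20
$ cd hadoop-0.20
$ ant -p               # look for the target that generates Eclipse project files
$ ant eclipse-files    # then: File > Import > Existing Projects into Workspace in Eclipse
$ ant jar              # re-compile after making changes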

Thanks,
Bikash


Re: definition of slots in Hadoop scheduling

2011-03-12 Thread bikash sharma
Thanks Allen.

On Sat, Mar 12, 2011 at 11:34 AM, Allen Wittenauer a...@apache.org wrote:


 (Removing common-dev, because this isn't really a dev question)

 On Feb 25, 2011, at 5:52 AM, bikash sharma wrote:

  Hi,
  How is task slot in Hadoop defined with respect to scheduling a
 map/reduce
  task on such slots available on TaskTrackers?


 On a TaskTracker, one sets how many maps and reduces one wants to
 run on that node.  The JobTracker is informed of this value.  When a job is
 getting scheduled, it compares the various tasks' input to see if a
 DataNode is providing a matching block.  If a block exists or is nearby, the
 task is scheduled on that node.





slot related question

2011-03-05 Thread bikash sharma
Hi,
This is a conceptual question:
1. Are the various resources shared across slots in Hadoop, or are resources
partitioned across slots?
2. Any thoughts on experiments using a Hadoop setup that could help confirm
the above?

Thanks,
Bikash


conceptual question regarding slots

2011-03-02 Thread bikash sharma
Hi,
Could someone shed some light on how, intuitively, fixed-type slots in
Hadoop have a negative impact on cluster utilization, as mentioned in Arun's
blog?
http://developer.yahoo.com/blogs/hadoop/posts/2011/02/mapreduce-nextgen/

Thanks,
Bikash


Re: TaskTracker not starting on all nodes

2011-03-01 Thread bikash sharma
Hi James,
Sorry for the late response. No, the same problem persists. I reformatted
HDFS, stopped the mapred and hdfs daemons, and restarted them (using
start-dfs.sh and start-mapred.sh from the master node). But surprisingly, out
of the 4-node cluster, two nodes have a TaskTracker running while the other
two do not (verified using jps). I guess that since I have Hadoop installed on
shared storage, that might be the issue? By the way, how do I start
the services independently on each node?
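(For the last question, bin/hadoop-daemon.sh seems to start a single daemon on
the node it is run on, e.g.:

$ bin/hadoop-daemon.sh start datanode      # run on the node itself
$ bin/hadoop-daemon.sh start tasktracker

but I am not sure whether that is the recommended way.)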

-bikash
On Sun, Feb 27, 2011 at 11:05 PM, James Seigel ja...@tynt.com wrote:

  Did you get it working?  What was the fix?

 Sent from my mobile. Please excuse the typos.

 On 2011-02-27, at 8:43 PM, Simon gsmst...@gmail.com wrote:

  Hey Bikash,
 
  Maybe you can manually start a  tasktracker on the node and see if there
 are
  any error messages. Also, don't forget to check your configure files for
  mapreduce and hdfs and make sure datanode can start successfully first.
  After all these steps, you can submit a job on the master node and see if
  there are any communication between these failed nodes and the master
 node.
  Post your error messages here if possible.
 
  HTH.
  Simon -
 
  On Sat, Feb 26, 2011 at 10:44 AM, bikash sharma sharmabiks...@gmail.com
 wrote:
 
  Thanks James. Well all the config. files and shared keys are on a shared
  storage that is accessed by all the nodes in the cluster.
  At times, everything runs fine on initialization, but at other times,
 the
  same problem persists, so was bit confused.
  Also, checked the TaskTracker logs on those nodes, there does not seem
 to
  be
  any error.
 
  -bikash
 
  On Sat, Feb 26, 2011 at 10:30 AM, James Seigel ja...@tynt.com wrote:
 
  Maybe your ssh keys aren’t distributed the same on each machine or the
  machines aren’t configured the same?
 
  J
 
 
  On 2011-02-26, at 8:25 AM, bikash sharma wrote:
 
  Hi,
  I have a 10 nodes Hadoop cluster, where I am running some benchmarks
  for
  experiments.
  Surprisingly, when I initialize the Hadoop cluster
  (hadoop/bin/start-mapred.sh), in many instances, only some nodes have
  TaskTracker process up (seen using jps), while other nodes do not have
  TaskTrackers. Could anyone please explain?
 
  Thanks,
  Bikash
 
 
 
 
 
 
  --
  Regards,
  Simon



disable pipelining in Hadoop

2011-03-01 Thread bikash sharma
Hi,
Is there a way to disable the use of pipelining, i.e., so that the reduce
phase is started only after the map phase is completed?
-bikash


Re: disable pipelining in Hadoop

2011-03-01 Thread bikash sharma
 Hi,
Thanks Benjamin and Bibek for the detailed explanations and pointers.
The question came after reading the paper "Real-time MapReduce Scheduling" (
http://repository.upenn.edu/cis_reports/942/), where, in their experimental
setup, they say they disabled the use of speculative execution and of
pipelining. Thus, I was wondering how to enforce the latter.

-bikash
On Tue, Mar 1, 2011 at 9:48 AM, Benjamin Gufler benjamin.guf...@tum.de wrote:

 On 2011-03-01 15:42, Bibek Paudel wrote:

 On Tue, Mar 1, 2011 at 3:27 PM, Benjamin Gufler benjamin.guf...@tum.de
  wrote:

  Is there a way to disable the use of pipelining, i.e., the reduce phase is
  started only after the map phase is completed?

 you need to configure the mapred.reduce.slowstart.completed.maps property in
 mapred-site.xml. It gives the percentage of mappers which must be complete
 before the first reducers are launched. By setting it to 1, you should
 obtain the wanted behaviour.
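 For example, per job (assuming the job driver goes through ToolRunner, so -D
 generic options are honoured), instead of setting it cluster-wide in
 mapred-site.xml:

 $ bin/hadoop jar hadoop-*-examples.jar sort -D mapred.reduce.slowstart.completed.maps=1.0 input output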

 I think this only schedules the reducers, and the scheduled reducers
 start copy (followed by sort) stages. The actual reduce functions
 are called only after all the intermediate data from all mappers have
 been copied over.


 The reduce functions cannot be called earlier anyway, as the last mapper
 to complete might produce output which must be processed on the first reduce
 invocation. So, if it was not the early copying and sorting, I think I
 didn't get your initial question, sorry.

 Benjamin



Re: TaskTracker not starting on all nodes

2011-03-01 Thread bikash sharma
Hi Sonal,
Thanks. I guess you are right. ps -ef exposes such processes.

-bikash

On Tue, Mar 1, 2011 at 1:29 PM, Sonal Goyal sonalgoy...@gmail.com wrote:

 Bikash,

 I have sometimes found hanging processes which jps does not report, but a
 ps -ef shows them. Maybe you can check this on the errant nodes..

 Thanks and Regards,
 Sonal
 Hiho - Hadoop ETL and Data Integration: https://github.com/sonalgoyal/hiho
 Nube Technologies http://www.nubetech.co

 http://in.linkedin.com/in/sonalgoyal






 On Tue, Mar 1, 2011 at 7:37 PM, bikash sharma sharmabiks...@gmail.com wrote:

 Hi James,
 Sorry for the late response. No, the same problem persists. I reformatted
 HDFS, stopped mapred and hdfs daemons and restarted them (using
 start-dfs.sh
 and start-mapred.sh from master node). But surprisingly out of 4 nodes
 cluster, two nodes have TaskTracker running while other two do not have
 TaskTrackers on them (verified using jps). I guess since I have the Hadoop
 installed on shared storage, that might be the issue? Btw, how do I start
 the services independently on each node?

 -bikash
 On Sun, Feb 27, 2011 at 11:05 PM, James Seigel ja...@tynt.com wrote:

   Did you get it working?  What was the fix?
 
  Sent from my mobile. Please excuse the typos.
 
  On 2011-02-27, at 8:43 PM, Simon gsmst...@gmail.com wrote:
 
   Hey Bikash,
  
   Maybe you can manually start a  tasktracker on the node and see if
 there
  are
   any error messages. Also, don't forget to check your configure files
 for
   mapreduce and hdfs and make sure datanode can start successfully
 first.
   After all these steps, you can submit a job on the master node and see
 if
   there are any communication between these failed nodes and the master
  node.
   Post your error messages here if possible.
  
   HTH.
   Simon -
  
   On Sat, Feb 26, 2011 at 10:44 AM, bikash sharma 
 sharmabiks...@gmail.com
  wrote:
  
   Thanks James. Well all the config. files and shared keys are on a
 shared
   storage that is accessed by all the nodes in the cluster.
   At times, everything runs fine on initialization, but at other times,
  the
   same problem persists, so was bit confused.
   Also, checked the TaskTracker logs on those nodes, there does not
 seem
  to
   be
   any error.
  
   -bikash
  
   On Sat, Feb 26, 2011 at 10:30 AM, James Seigel ja...@tynt.com
 wrote:
  
   Maybe your ssh keys aren’t distributed the same on each machine or
 the
   machines aren’t configured the same?
  
   J
  
  
   On 2011-02-26, at 8:25 AM, bikash sharma wrote:
  
   Hi,
   I have a 10 nodes Hadoop cluster, where I am running some
 benchmarks
   for
   experiments.
   Surprisingly, when I initialize the Hadoop cluster
   (hadoop/bin/start-mapred.sh), in many instances, only some nodes
 have
   TaskTracker process up (seen using jps), while other nodes do not
 have
   TaskTrackers. Could anyone please explain?
  
   Thanks,
   Bikash
  
  
  
  
  
  
   --
   Regards,
   Simon
 





TaskTracker not starting on all nodes

2011-02-26 Thread bikash sharma
Hi,
I have a 10-node Hadoop cluster, where I am running some benchmarks for
experiments.
Surprisingly, when I initialize the Hadoop cluster
(hadoop/bin/start-mapred.sh), in many instances only some nodes have the
TaskTracker process up (seen using jps), while other nodes do not have
TaskTrackers. Could anyone please explain?

Thanks,
Bikash


Re: TaskTracker not starting on all nodes

2011-02-26 Thread bikash sharma
Thanks James. Well, all the config files and shared keys are on shared
storage that is accessed by all the nodes in the cluster.
At times everything runs fine on initialization, but at other times the
same problem persists, so I was a bit confused.
Also, I checked the TaskTracker logs on those nodes; there does not seem to be
any error.

-bikash

On Sat, Feb 26, 2011 at 10:30 AM, James Seigel ja...@tynt.com wrote:

 Maybe your ssh keys aren’t distributed the same on each machine or the
 machines aren’t configured the same?

 J


 On 2011-02-26, at 8:25 AM, bikash sharma wrote:

  Hi,
  I have a 10 nodes Hadoop cluster, where I am running some benchmarks for
  experiments.
  Surprisingly, when I initialize the Hadoop cluster
  (hadoop/bin/start-mapred.sh), in many instances, only some nodes have
  TaskTracker process up (seen using jps), while other nodes do not have
  TaskTrackers. Could anyone please explain?
 
  Thanks,
  Bikash




definition of slots in Hadoop scheduling

2011-02-25 Thread bikash sharma
Hi,
How is a task slot in Hadoop defined with respect to scheduling a map/reduce
task on the slots available on TaskTrackers?

Thanks,
Bikash


Re: definition of slots in Hadoop scheduling

2011-02-25 Thread bikash sharma
Thanks very much Harsh. It seems then that slots are not defined in terms of
actual machine resource capacities such as CPU, memory, disk and network
bandwidth.

-bikash

On Fri, Feb 25, 2011 at 11:33 AM, Harsh J qwertyman...@gmail.com wrote:

 Please see this archived thread for a very similar question on what
 tasks really are:

 http://mail-archives.apache.org/mod_mbox/hadoop-general/201011.mbox/%3c126335.8536...@web112111.mail.gq1.yahoo.com%3E

 Right now, they're just a cap number for parallelization,
 hand-configured and irrespective of the machine's capabilities.
 However, a Scheduler may take a machine's state into account while
 assigning tasks to it.

 On Fri, Feb 25, 2011 at 7:22 PM, bikash sharma sharmabiks...@gmail.com
 wrote:
  Hi,
  How is task slot in Hadoop defined with respect to scheduling a
 map/reduce
  task on such slots available on TaskTrackers?
 
  Thanks,
  Bikash
 



 --
 Harsh J
 www.harshj.com



measure the resource usage of each map/reduce task

2011-02-22 Thread bikash sharma
Hi,
Is there any way in which we can measure the resource usage of each
map/reduce task running?
I was trying to use the sar utility to track each process's resource usage;
however, it seems these individual map/reduce tasks are not listed as
processes when I do ps -ex.
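(Each task attempt runs in its own child JVM, so grepping ps output for the
task runner class, assuming the 0.20-era org.apache.hadoop.mapred.Child,
should list them:

$ ps -ef | grep org.apache.hadoop.mapred.Child | grep -v grep

Using ps -ef rather than ps -ex may be what makes the difference here.)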

Thanks,
Bikash


task scheduling based on slots in Hadoop

2011-02-21 Thread bikash sharma
Hi,
Can anyone shed some more light on resource-based scheduling in Hadoop?
Specifically, are resources like CPU and memory partitioned across slots?
From the blog by Arun on the capacity scheduler,
http://developer.yahoo.com/blogs/hadoop/posts/2011/02/capacity-scheduler/
I understand that memory is the only resource supported; does that mean both
memory and CPU are partitioned across map/reduce tasks in slots?

Thanks in advance.

-bikash


measure the time taken by stragglers

2011-02-21 Thread bikash sharma
Hi,
Is there a way in which we can measure the execution time of straggler and
non-straggler tasks separately in Hadoop MapReduce?
-bikash