Hi,
I was working with a small Hadoop cluster while developing a new
scheduler; however, the cluster was used only for development purposes and
never in production, so I am wondering what obstacles you face in
typical day-to-day cluster administration?
We have been discussing with an
If I understand it right, HOD is meant mainly for merging existing HPC
clusters with Hadoop and for testing purposes..
I cannot find what the role of Torque is here (just initial node
allocation?) and which is the default scheduler of HOD? Probably the
scheduler from the hadoop
For distribution of load you can start by reading some chapters on the
different types of Hadoop schedulers. I have not yet studied the other
implementations, however a very simplified version of the distribution
concept is the following:
a) A TaskTracker asks for work (the heartbeat consists of a status
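The heartbeat-driven distribution concept above can be sketched in a few lines. This is a hedged toy model, not the actual Hadoop scheduler code; the class and method names (`FifoSketch`, `onHeartbeat`) are invented for illustration, and the real logic lives in `JobQueueTaskScheduler.assignTasks()` in the Hadoop source:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model of FIFO task assignment on a TaskTracker heartbeat.
// All names here are invented for the sketch.
public class FifoSketch {
    // Tasks waiting to run, oldest first (FIFO order).
    private final Queue<String> pendingTasks = new ArrayDeque<>();

    public void submit(String task) {
        pendingTasks.add(task);
    }

    // Called when a TaskTracker heartbeats in, reporting its free slots.
    // Hands out queued tasks until the tracker is full or the queue is empty.
    public List<String> onHeartbeat(int freeSlots) {
        List<String> assigned = new ArrayList<>();
        while (freeSlots-- > 0 && !pendingTasks.isEmpty()) {
            assigned.add(pendingTasks.poll());
        }
        return assigned;
    }
}
```

The point of the sketch is that the scheduler is passive: it only hands out work in response to heartbeats, which is why scheduling decisions happen per-tracker rather than globally.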
Anyone?
On 19 April 2012 17:34, Merto Mertek masmer...@gmail.com wrote:
I found that the closest document matching the current implementation of
the fair scheduler is this document
(http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-55.html) by
Matei Zaharia et al.. Another document, on delay scheduling, can be
found from 2010..
a) I am
I know that by design all unmarked jobs go to that pool, however I am
doing some testing and I am interested in whether it is possible to disable it..
Thanks
...@apache.org wrote:
We do it here by setting this:
<poolMaxJobsDefault>0</poolMaxJobsDefault>
So that you _must_ have a pool (that's configured with a different
maxRunningJobs) in order to run jobs.
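For reference, a minimal allocations file using that setting might look like the following sketch; the pool name "production" and its job limit are made-up examples:

```xml
<?xml version="1.0"?>
<allocations>
  <!-- No pool may run jobs by default... -->
  <poolMaxJobsDefault>0</poolMaxJobsDefault>
  <!-- ...so only explicitly configured pools can run anything. -->
  <pool name="production">
    <maxRunningJobs>10</maxRunningJobs>
  </pool>
</allocations>
```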
Hope this helps,
J-D
On Tue, Mar 13, 2012 at 10:49 AM, Merto Mertek masmer...@gmail.com
wrote
From the fairscheduler docs I assume the following should work:
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>pool.name</value>
</property>
<property>
  <name>pool.name</name>
  <value>${mapreduce.job.group.name}</value>
</property>
which means that the default pool will be the group of
, Feb 11, 2012 at 2:19 AM, Merto Mertek masmer...@gmail.com
wrote:
Varun, unfortunately I have had some problems with deploying a new version
on the cluster.. Hadoop is not picking up the new build in the lib folder
despite the classpath being set to it. The new build is picked up just if I
Hm.. I would first try to stop all the daemons with
$HADOOP_HOME/bin/stop-all.sh. Afterwards check that on the master and one
of the slaves no daemons are running (jps). Maybe you could check whether
the conf on the TaskTrackers is pointing to the right JobTracker
(mapred-site.xml).
/questions/9400739/hadoop-globstatus-and-deflate-files
On Wed, Feb 22, 2012 at 7:39 AM, Merto Mertek masmer...@gmail.com wrote:
Hm.. I would first try to stop all the daemons with
$HADOOP_HOME/bin/stop-all.sh. Afterwards check that on the master and one
of the slaves no daemons are running (jps)
I think the job configuration does not allow you such a setup, however maybe
I missed something..
I would probably tackle this problem from the scheduler source. The
default one is JobQueueTaskScheduler, which maintains a FIFO-based queue.
When a TaskTracker (your slave) tells the JobTracker that
likely receive some bad metrics.
Varun
On Wed, Feb 8, 2012 at 6:19 PM, Merto Mertek masmer...@gmail.com
wrote:
I will need your help. Please confirm whether the following procedure is
right.
I have a dev environment where I develop my scheduler (no Hadoop running)
and
a small cluster
I am having some trouble understanding how the whole thing works..
Compiling with ant works OK and I am able to compile a jar which is
afterwards deployed to the cluster. On the cluster I've set the
HADOOP_CLASSPATH variable to point just to jar files in the lib folder
($HD_HOME/lib/*.jar),
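As a sanity check, one common way to get an extra jar onto the daemon classpath on 0.20.x is an explicit entry in conf/hadoop-env.sh. This is a hedged config sketch; the jar name "myscheduler.jar" is a made-up example:

```shell
# conf/hadoop-env.sh -- sketch; adjust the jar name and path to your build.
# Listing the jar explicitly avoids relying on wildcard expansion,
# which older Hadoop startup scripts do not always handle as expected.
export HADOOP_CLASSPATH="$HADOOP_HOME/lib/myscheduler.jar:$HADOOP_CLASSPATH"
```

The daemons read this file at startup, so a full stop/start of the cluster is needed for the change to take effect.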
):
- http://dl.dropbox.com/u/4366344/gmetadBufferOverflow.Hadoop.patch
- http://dl.dropbox.com/u/4366344/gmetadBufferOverflow.gmetad.patch
Here's hoping this works for you,
Varun
On Tue, Feb 7, 2012 at 6:00 PM, Merto Mertek masmer...@gmail.com wrote:
Varun, have I missed your link
, 2012 at 4:58 AM, Merto Mertek masmer...@gmail.com wrote:
I have tried to run it but it keeps crashing..
- When you start gmetad and Hadoop is not emitting metrics, everything
is peachy.
Right, running just Ganglia without running Hadoop jobs seems stable for at
least a day
for
org.apache.hadoop.metrics.ganglia.GangliaContext in the
hadoop-metrics.properties lines above.
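For reference, a hadoop-metrics.properties sketch for a Ganglia 3.1.x backend might look like the following. This is hedged: the host and port are made-up examples, and note that for Ganglia 3.1 the 31-suffixed context class is needed, since plain GangliaContext speaks the older 3.0 wire protocol:

```properties
# conf/hadoop-metrics.properties -- sketch; host/port are examples.
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=gmetad.example.com:8649

mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
mapred.period=10
mapred.servers=gmetad.example.com:8649
```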
On Fri, Feb 3, 2012 at 1:07 PM, Merto Mertek masmer...@gmail.com
wrote:
I spent a lot of time trying to figure it out, however I did not find a
solution.
Problems in the logs pointed me to some bugs in the rrdupdate tool
the gmetad coring issue - the
warnings emitted about '4.9E-324' being out of range will continue, but I
know what's causing that as well (and hope that my patch fixes it for
free).
Varun
On Mon, Feb 6, 2012 at 2:39 PM, Merto Mertek masmer...@gmail.com wrote:
Yes I am encountering the same
protocol and leave the version 3.1 for further releases?
any help is really appreciated...
On 1 February 2012 04:04, Merto Mertek masmer...@gmail.com wrote:
I would be glad to hear that too.. I've set up the following:
Hadoop 0.20.205
Ganglia Front 3.1.7
Ganglia Back (gmetad) 3.1.7
RRDTool (http://www.rrdtool.org/) 1.4.5 - I had some trouble installing
1.4.4
Ganglia works just in case Hadoop is not running, so metrics are not
published to gmetad
Hi,
I am having problems with changing the default Hadoop scheduler (I assume
the default scheduler is a FIFO scheduler).
I am following the guide located in the hadoop/docs directory, however I am
not able to run it. The link for scheduling administration returns an HTTP
404 error (
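For what it's worth, switching to the fair scheduler on 0.20.x is usually done in mapred-site.xml, roughly as sketched below, assuming the fairscheduler contrib jar is already on the JobTracker's classpath:

```xml
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
```

The JobTracker has to be restarted after this change; the scheduler's admin page should then appear at the /scheduler path of the JobTracker web UI.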
I followed the same tutorial as you. If I am not wrong, the problem arises
because you first tried to run the node as a single node and then joined it
to the cluster (like Arpit mentioned). After testing that the new node works
ok, try to delete the content of the directory /app/hadoop/tmp/ and insert a
new build/jar to the cluster
- try it on a working cluster
Is there any other option for trying new functionality locally or in any
other way? Any comments and suggestions are welcome.
Thank you..
On 17 December 2011 21:58, Merto Mertek masmer...@gmail.com wrote:
Hi,
I am having some problems
Hi,
I am having some problems with running the following test file
org.apache.hadoop.mapred.TestFairScheduler
Nearly all tests fail, most of them with the error:
java.lang.RuntimeException: COULD NOT START JT. Here is a trace:
http://pastebin.com/Jx90sYbw
The code was checked out from the svn branch,
as the user
creation.
Your VM route will most likely work, but I can imagine the number of
hiccups during migration from that to the real cluster will not make it
worth your time.
Matt
-Original Message-
From: Merto Mertek [mailto:masmer...@gmail.com]
Sent: Friday, September 23, 2011 10:00 AM
To: common-user@hadoop.apache.org
Subject: Environment consideration for a research on scheduling
Hi,
I am receiving messages from two mailing lists (common-dev, common-user)
and I would like to stop receiving messages from JIRA. I am not a member of
the common-issues list. Can I disable this somehow? Thank you
opened/resolved/reopened
messages. The common-issues list receives everything.
On Fri, Sep 23, 2011 at 7:27 PM, Merto Mertek masmer...@gmail.com wrote:
Hi,
i am receiving messages from two mailing lists
(common-dev,common-user)
and I would like to disable receiving msg from jira. I am not a member
hehe :) you are right :)
On 23 September 2011 16:21, Harsh J ha...@cloudera.com wrote:
Merto,
Am sure your mail client has some form of filtering available in that case!
:-)
On Fri, Sep 23, 2011 at 7:49 PM, Merto Mertek masmer...@gmail.com wrote:
Probably there is not any option just
Hi,
in the first phase we are planning to establish a small cluster with few
commodity computer (each 1GB, 200GB,..). Cluster would run ubuntu server
10.10 and a hadoop build from the branch 0.20.204 (i had some issues with
version 0.20.203 with missing