The blocks will be invalidated on the datanode once it is returned to service.
If you want to save your namenode and network a lot of work, wipe the hdfs
block storage directory before returning the Datanode to service.
dfs.data.dir will be the directory; most likely the value is
${hadoop.tmp.dir}/dfs/data
J
We had trouble like that with some jobs, when the child ran additional
threads that were not marked as daemon threads. These kept the Child JVM from
exiting.
JMX was the cause in our case, but we have seen our JNI jobs do it also.
In the end we made a local mod that forced a System.exit in the finall
With large numbers of files you run the risk of the Datanodes timing out
when they are performing their block report and/or DU reports.
Basically, if a *find* in the dfs.data.dir takes more than 10 minutes, you
will have catastrophic problems with your HDFS.
At Attributor, with 2 million blocks on a da
We like compression if the data is readily compressible and large, as it
saves on IO time.
On Mon, Jan 26, 2009 at 9:35 AM, Mark Kerzner wrote:
> Doug,
> SequenceFile looks like a perfect candidate to use in my project, but are
> you saying that I better use uncompressed data if I am not interes
Sequence files rock, and you can use the *bin/hadoop dfs -text FILENAME*
command line tool to get a toString-level unpacking of the sequence file
key,value pairs.
If you provide your own key or value classes, you will need to implement a
toString method to get some use out of this. Also, your cla
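As a rough sketch of such a class (the class name and fields below are made up for illustration, not taken from this thread), all that -text needs beyond the usual Writable plumbing is a readable toString:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Hypothetical value class; -text calls toString() on each key and value it reads.
public class PageStats implements Writable {
  private long hits;
  private long bytes;

  public void write(DataOutput out) throws IOException {
    out.writeLong(hits);
    out.writeLong(bytes);
  }

  public void readFields(DataInput in) throws IOException {
    hits = in.readLong();
    bytes = in.readLong();
  }

  @Override
  public String toString() {
    // This is what bin/hadoop dfs -text prints for the value column.
    return hits + "\t" + bytes;
  }
}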
I believe that the schedule code in 0.19.0 has a framework for this, but I
haven't dug into it in detail yet.
http://hadoop.apache.org/core/docs/r0.19.0/capacity_scheduler.html
From what I gather you would set up 2 queues, each with guaranteed access to
1/2 of the cluster
Then you submit your jo
Hadoop just distributes to the available reduce execution slots. I don't
believe it pays attention to what machine they are on.
I believe the plan is to take account of data locality in the future (i.e.:
distribute tasks to machines that are considered more topologically close to
their input split first, bu
Are you changing the definition of hadoop.tmp.dir in the hadoop-site.xml
file?
1) the default location is in /tmp and your tmp watch cron job may be
deleting the files
2) if you change the location, or the location is removed, you will need to
reformat.
On Sun, Feb 1, 2009 at 3:39 PM, Mark Kerzn
If the write is taking place on a datanode, by design, 1 replica will be
written to that datanode.
The other replicas will be written to different nodes.
When you write on the namenode, it generally is not a datanode, and hadoop
will pseudo randomly allocate the replica blocks across all of your
If your datanodes are pausing and falling out of the cluster you will get a
large workload for the namenode of blocks to replicate and when the paused
datanode comes back, a large workload of blocks to delete.
These lists are stored in memory on the namenode.
The startup messages lead me to wonder
If you want them to also start automatically, and for the slaves.sh command
to work as expected, add the names to the conf/slaves file also.
On Fri, Jan 30, 2009 at 7:15 PM, Amandeep Khurana wrote:
> Thanks Lohit
>
>
> On Fri, Jan 30, 2009 at 7:13 PM, lohit wrote:
>
> > Just starting DataNode a
It is possible that your slaves are unable to contact the master due to a
network routing, firewall or hostname resolution error.
The alternative is that your namenode is either failing to start, or running
from a different configuration file and binding to a different port.
On Fri, Jan 30, 2009
ce Hadoop to distribute reduce tasks evenly
> across all the machines?
>
>
>
> On Jan 30, 2009, at 7:32 AM, jason hadoop wrote:
>
> Hadoop just distributes to the available reduce execution slots. I don't
>> believe it pays attention to what machine they are on.
>>
may have some task
trackers handling more reduces.
If mapred.tasktracker.reduce.tasks.maximum*Number_Of_Slaves == number of
reduces configured
and mapred.tasktracker.reduce.tasks.maximum == 1, you will get 1 reduce per
task tracker (almost always).
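As a sketch of that arithmetic in job setup code (the slave count is a made-up example value, and this assumes the client-side config matches what the task trackers use):

import org.apache.hadoop.mapred.JobConf;

public class ReduceCount {
  public static void main(String[] args) {
    JobConf conf = new JobConf(ReduceCount.class);
    int numberOfSlaves = 10; // hypothetical task tracker count for this cluster
    // Reduce slots per tracker, assuming the client config matches the trackers.
    int slotsPerTracker =
        conf.getInt("mapred.tasktracker.reduce.tasks.maximum", 2);
    // One wave of reduces: roughly one reduce per available slot in the cluster.
    conf.setNumReduceTasks(numberOfSlaves * slotsPerTracker);
  }
}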
On Sun, Feb 1, 2009 at 5:51 PM, jason hadoop wrote
The Datanodes use multiple threads with locking, and one of the assumptions
is that the block report (once per hour by default) takes little time. The
datanode will pause while the block report is running, and if it happens to
take a while, weird things start to happen.
On Fri, Jan 30, 2009 at 8:59
nks,
> Sean
>
> On Sun, Feb 1, 2009 at 4:00 PM, jason hadoop
> wrote:
>
> > If your datanodes are pausing and falling out of the cluster you will get
> a
> > large workload for the namenode of blocks to replicate and when the
> paused
> > datanode comes back,
t have a firewall so that shouldnt be a problem. I'll look into
> the
> other things once. How can I point the system to use a particular config
> file? Arent those fixed to hadoop-default.xml and hadoop-site.xml?
>
>
>
> On Sun, Feb 1, 2009 at 5:49 PM, jason hadoop
>
>>> Brian
>>>
>>>
>>> On Feb 1, 2009, at 6:11 PM, Sean Knapp wrote:
>>>
>>> Jason,
>>>
>>>> Thanks for the response. By falling out, do you mean a longer time since
>>>> last contact (100s+), or fully timed out
A reduce stall at 0% implies that the map tasks are not outputting any
records via the output collector.
You need to go look at the task tracker and the task logs on all of your
slave machines, to see if anything that seems odd appears in the logs.
On the tasktracker web interface detail screen for
If you have to do a time based solution, for now, simply close the file and
stage it, then open a new file.
Your reads will have to deal with the fact that the file is in multiple parts.
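A minimal sketch of such a roll, assuming hypothetical /incoming and /staged paths:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileRoller {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path working = new Path("/incoming/current"); // hypothetical in-progress file
    FSDataOutputStream out = fs.create(working);
    out.writeBytes("some records\n");
    // Time to roll: close the current file so its data becomes visible to readers,
    out.close();
    // stage the closed part under a unique name,
    fs.rename(working, new Path("/staged/part-" + System.currentTimeMillis()));
    // and open a fresh file for the next interval.
    out = fs.create(working);
    out.writeBytes("more records\n");
    out.close();
  }
}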
Warning: Datanodes get pokey if they have large numbers of blocks, and the
quickest way to do this is to create a lot
there was a thought of running a continuous find on
the dfs.data.dir to try to force the kernel to keep the inodes in memory,
but I think they abandoned that strategy.
On Mon, Feb 2, 2009 at 10:23 AM, Karl Kleinpaste wrote:
> On Sun, 2009-02-01 at 17:58 -0800, jason hadoop wrote:
> >
If you have a large number of ftp urls spread across many sites, simply set
that file to be your hadoop job input, and force the input split to be a
size that gives you good distribution across your cluster.
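One way to force that spread, sketched below, is NLineInputFormat, which turns every N lines of the URL list into its own split; the paths and lines-per-map value are illustrative, and a real job would replace the identity mapper with one that performs the fetch:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.NLineInputFormat;

public class FetchJob {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(FetchJob.class);
    conf.setJobName("ftp-fetch");
    // The file of ftp urls, one per line (hypothetical path).
    FileInputFormat.setInputPaths(conf, new Path("/input/ftp-urls.txt"));
    // Each map task receives 10 lines (urls), spreading the fetches across the cluster.
    conf.setInputFormat(NLineInputFormat.class);
    conf.setInt("mapred.line.input.format.linespermap", 10);
    // Stand-in mapper; a real job would use a mapper that performs the fetch.
    conf.setMapperClass(IdentityMapper.class);
    FileOutputFormat.setOutputPath(conf, new Path("/output/fetch-results"));
    JobClient.runJob(conf);
  }
}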
On Mon, Feb 2, 2009 at 3:23 PM, Steve Morin wrote:
> Does any one have a good suggestio
When I was at Attributor we experienced periodic odd XFS hangs that would
freeze up the Hadoop Server processes resulting in them going away.
Sometimes XFS would deadlock all writes to the log file and the server would
freeze trying to log a message. Can't even JSTACK the jvm.
We never had any trac
Do you really want to have a single task process all of the reduce outputs?
If you want all of your output processed by a set of map tasks, you can set
the output directory of your previous job to be the input directory of your
next job, ensuring that the framework knows how to read the key value
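A sketch of that chaining with sequence files, so the second job picks up the first job's key/value classes (paths are placeholders and the mapper/reducer setup is omitted):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;

public class TwoPass {
  public static void main(String[] args) throws Exception {
    Path handoff = new Path("/tmp/pass1-output"); // hypothetical intermediate directory

    JobConf first = new JobConf(TwoPass.class);
    FileInputFormat.setInputPaths(first, new Path("/input/raw"));
    FileOutputFormat.setOutputPath(first, handoff);
    // Sequence files record the key/value classes, so the next job knows how to read them.
    first.setOutputFormat(SequenceFileOutputFormat.class);
    first.setOutputKeyClass(LongWritable.class);
    first.setOutputValueClass(Text.class);
    JobClient.runJob(first);

    JobConf second = new JobConf(TwoPass.class);
    // The previous job's output directory is this job's input directory.
    FileInputFormat.setInputPaths(second, handoff);
    second.setInputFormat(SequenceFileInputFormat.class);
    FileOutputFormat.setOutputPath(second, new Path("/output/final"));
    JobClient.runJob(second);
  }
}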
An alternative is to have 2 Tasktracker clusters, where the nodes are on the
same machines.
One cluster is for IO intensive jobs and has a low number of map/reduces per
tracker,
the other cluster is for cpu intensive jobs and has a high number of
map/reduces per tracker.
The alternative, simpler m
If you are using the standard TextOutputFormat, and the output collector is
passed a null for the value, there will not be a trailing tab character
added to the output line.
output.collect( key, null );
will give you the behavior you are looking for if your configuration is as I
expect.
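For completeness, a minimal reducer sketch around that call (the Text/Text types are just an illustration):

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Emits only the key; with TextOutputFormat a null value means no trailing tab.
public class KeyOnlyReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {
  public void reduce(Text key, Iterator<Text> values,
                     OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    output.collect(key, null);
  }
}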
On Tue, F
Oops, you are using streaming, and I am not familiar with it.
As a terrible hack, you could set mapred.textoutputformat.separator to the
empty string, in your configuration.
On Tue, Feb 3, 2009 at 9:26 PM, jason hadoop wrote:
> If you are using the standard TextOutputFormat, and the output collec
t; map.output.key.field.separator parameters for this purpose, they
> don't work either. When hadoop sees empty string, it takes default tab
> character instead.
>
> Rasit
>
> 2009/2/4 jason hadoop
> >
> > Ooops, you are using streaming., and I am not familar.
>
The default task memory allocation size is set in the hadoop-default.xml
file for your configuration.
The parameter is mapred.child.java.opts, and the value is generally
-Xmx200m.
You may alter this value in your JobConf object before you submit the job
and the individual tasks will
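As a sketch of such a per-job override (the 512m heap below is only an example value, not a recommendation):

import org.apache.hadoop.mapred.JobConf;

public class ChildOpts {
  public static void main(String[] args) {
    JobConf conf = new JobConf(ChildOpts.class);
    // Override the default -Xmx200m for this job's map and reduce child JVMs.
    conf.set("mapred.child.java.opts", "-Xmx512m");
    // ... configure the rest of the job and submit it ...
  }
}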
Please examine the web console for the namenode.
The url for this should be http://*namenodehost*:50070/
This will tell you what datanodes are successfully connected to the
namenode.
If the number is 0, then either no datanodes are running, or none were able to
connect to the namenode at start, or wer
On your master machine, use the netstat command to determine what ports and
addresses the namenode process is listening on.
On the datanode machines, examine the log files to verify that the
datanode has attempted to connect to the namenode ip address on one of those
ports, and was successful.
You will have to increase the per user file descriptor limit.
For most linux machines the file /etc/security/limits.conf controls this on
a per user basis.
You will need to log in to a fresh shell session after making the changes to
see them. Any login shells started before the change and process sta
The other issue you may run into, with many files in your HDFS is that you
may end up with more than a few hundred thousand blocks on each of your
datanodes. At present this can lead to instability due to the way the
periodic block reports to the namenode are handled. The more blocks per
datanode, the
The .maximum values are only loaded by the Tasktrackers at server start time
at present, and any changes you make will be ignored.
2009/2/18 S D
> Thanks for your response Rasit. You may have missed a portion of my post.
>
> > On a different note, when I attempt to pass params via -D I get a us
I certainly hope it changes but I am unaware that it is in the todo queue at
present.
2009/2/18 S D
> Thanks Jason. That's useful information. Are you aware of plans to change
> this so that the maximum values can be changed without restarting the
> server?
>
> John
>
>
There is a moderate amount of setup and teardown in any hadoop job. It may
be that your 10 seconds are primarily that.
On Wed, Feb 18, 2009 at 11:29 AM, Philipp Dobrigkeit wrote:
> I am currently trying Map/Reduce in Eclipse. The input comes from an hbase
> table. The performance of my jobs is
For reasons that are not clear, in 19, the partitioner steps one character
past the end of the field unless you are very explicit in your key
specification.
One would assume that -k2 would pick up the second token, even if it was the
last field in the key, but -k2,2 is required
As near as I can te
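A sketch of being explicit about both ends of the field, assuming the KeyFieldBased classes shipped in your release's mapred.lib package (the separator and field choice are illustrative):

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.KeyFieldBasedComparator;
import org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner;

public class SecondFieldKey {
  public static void main(String[] args) {
    JobConf conf = new JobConf(SecondFieldKey.class);
    conf.setPartitionerClass(KeyFieldBasedPartitioner.class);
    conf.setOutputKeyComparatorClass(KeyFieldBasedComparator.class);
    // Be explicit: start AND end on field 2 (-k2,2), not the open-ended -k2.
    conf.set("mapred.text.key.partitioner.options", "-k2,2");
    conf.set("mapred.text.key.comparator.options", "-k2,2");
    conf.set("map.output.key.field.separator", "\t");
  }
}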
If you manually start the daemons, via hadoop-daemon.sh, the parent
directory of the hadoop-daemon.sh script will be used as the root directory
for the hadoop installation.
I do believe, but do not know, that the namenode/jobtracker does not
notice the actual file system location of the t
My 1st guess is that your application is running out of file
descriptors, possibly because your MultipleOutputFormat instance is opening
more output files than you expect.
Opening lots of files in HDFS is generally a quick route to bad job
performance if not job failure.
On Tue, Feb 24, 2009 at 6:
You may wish to look at the documentation on hadoop pipes, which provides an
interface for writing c++ map/reduce applications and a mechanism to pass
key/value data to C++ from hadoop.
The framework will read and write sequence files or map files, and provide
key/value pairs to the map function and r
the number of computers, can we solve this problem of
> > running out of file descriptors?
> >
> >
> >
> >
> > On Wed, Feb 25, 2009 at 11:07 AM, jason hadoop
> > wrote:
> > > My 1st guess is that your application is running out of file
> > >
If you have to you can reach through all of the class loaders and find the
instance of your singleton class that has the data loaded. It is awkward,
and
I haven't done this in java since the late 90's. It did work the last time I
did it.
On Sun, Mar 1, 2009 at 11:21 AM, Scott Carey wrote:
> You
The way you are specifying the section of your key to compare is reaching
beyond the end of the last part of the key.
Your key specification is not terminating explicitly on the last character
of the final field of the key.
if your key splits in to N parts, and you are comparing on the Nth part,
1"). When i limit the input size,
> it works fine, i think this because i limit the total number of the
> possible
> "key1,key2,key3" compositions. but when i increate the input size, this
> exception was thrown.
>
> 2009/3/2 jason hadoop
>
> > The way you ar
I see that when the host name of the node is also on the localhost line in
/etc/hosts
On Fri, Mar 6, 2009 at 9:38 AM, wrote:
>
> I see the same strange behavior on 2-node cluster with 0.18.3, 0.19.1 and
> snv's branch-0.20.0...
> 2 nodes:
> "master1" running NameNode, JobTracker, DataNode, Task
You can have your item in a separate jar and pin the reference so that it
becomes perm-gen, which will keep it resident. Then you can search the class loader
hierarchy for the reference.
A quick scan through the Child.java main loop shows no magic with class
loaders.
I wrote some code to check this against
There were a couple of fork timing errors in JDK 1.5 that occasionally
caused a sub-process fork to go bad; this could be the du/df being forked
off by the datanode and dying.
I can't find the references I had saved away at one point, from the java
forums, but perhaps this will get you started
speculative execution.
On Mon, Mar 9, 2009 at 12:19 PM, Nathan Marz wrote:
> I have the same problem with reducers going past 100% on some jobs. I've
> seen reducers go as high as 120%. Would love to know what the issue is.
>
>
> On Mar 9, 2009, at 8:45 AM, Doug Cook wrote:
>
>
>> Hi folks,
>>
be this bug:
>
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6671051
>
> However, this is using Java 1.6.0_11, and that bug was marked as fixed in
> 1.6.0_6 :(
>
> Any other ideas?
>
> Brian
>
>
> On Mar 9, 2009, at 2:21 PM, jason hadoop wrote:
>
>
I noticed this getting much worse with block compression on the intermediate
map outputs, in the cloudera patched 18.3. I just assumed it was speculative
execution.
I wonder if one of the patches in the cloudera version has had an effect on
this.
On Mon, Mar 9, 2009 at 2:34 PM, Owen O'Malley wro
Hadoop has support for S3, the compression support is handled at another
level and should also work.
On Mon, Mar 9, 2009 at 9:05 PM, Ken Weiner wrote:
> I have a lot of large zipped (not gzipped) files sitting in an Amazon S3
> bucket that I want to process. What is the easiest way to process
There are a couple of failures that happen in tests derived from
ClusterMapReduceTestCase that are run outside of the hadoop unit test
framework.
The basic issue is that the unit test doesn't have the benefit of a runtime
environment setup by the bin/hadoop script.
The classpath is usually missin
("javax.xml.parsers.SAXParserFactory","org.apache.xerces.jaxp.SAXParserFactoryImpl");
On Tue, Mar 10, 2009 at 2:28 PM, jason hadoop wrote:
> There are a couple of failures that happen in tests derived from
> ClusterMapReduceTestCase that are run outside of the hadoop unit test
> framework.
>
>
d out the details, but it has been
half a year since I dealt with this last.
Unless you are forking to run your junit tests, ant won't let you change
the class path for your unit tests - much chaos will ensue.
On Wed, Mar 11, 2009 at 4:39 AM, Steve Loughran wrote:
> jason hadoop wr
Finally remembered, we had saxon 6.5.5 in the class path, and the jetty
error was
09/03/11 08:23:20 WARN xml.XmlParser: EXCEPTION
javax.xml.parsers.ParserConfigurationException: AElfred parser is
non-validating
On Wed, Mar 11, 2009 at 8:01 AM, jason hadoop wrote:
> I am having trou
For a simple test, set the replication on your entire cluster to 6:
hadoop dfs -setrep -R -w 6 /
This will triple your disk usage and probably take a while, but then you are
guaranteed that all data is local.
You can also get a rough idea from the Job Counters, 'Data-local map tasks'
total field
wget http://namenode:port/*data/*filename
will return the file.
The namenode will redirect the http request to a datanode that has at least
some of the blocks in local storage to serve the actual request.
The key piece of course is the /data prefix on the file name.
port is the port that the w
If you use the Java System Property java.io.tmpdir, your reducer will use
the ./tmp directory in the local working directory allocated by the
framework for your task.
If you have a specialty file system for transient data, such as a tmpfs, use
that.
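A small sketch of picking that directory up from inside a task (the file name prefix is arbitrary):

import java.io.File;
import java.io.IOException;

public class TaskScratch {
  public static File newScratchFile() throws IOException {
    // Inside a task, java.io.tmpdir points at ./tmp in the task's working directory.
    File tmpDir = new File(System.getProperty("java.io.tmpdir"));
    return File.createTempFile("reduce-scratch", ".bin", tmpDir);
  }
}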
On Sun, Mar 15, 2009 at 4:08 PM, Mark Kerzner
fuse_dfs is a contrib package that is part of the standard hadoop
distribution tar ball, but not compiled, and does not compile without some
special ant flags.
There is a README in src/contrib/fuse-dfs/README, of the distribution, that
walks you through the process of compiling and using fuse_dfs.
Make all of your hadoop-metrics properties use the standard IP address of
your master node.
Then add a straight udp receive block to the gmond.conf of your master node.
Then point your gmetad.conf at your master node.
There are complete details in forthcoming book, and with this in it, should
be a
For a job using TextOutputFormat, the final output key value pairs will be
separated by the string defined in the key
mapred.textoutputformat.separator, which defaults to TAB
The string under stream.map.output.field.separator is used to split the
lines read back from the mapper into key, value, f
The exception reference to *org.apache.hadoop.hdfs.DistributedFileSystem*,
implies strongly that a hadoop-default.xml file, or at least a job.xml file
is present.
Since hadoop-default.xml is bundled into the hadoop-0.X.Y-core.jar, the
assumption is that the core jar is available.
The class not fou
If the search file data set is large, the issue becomes ensuring that only
the required portion of search file is actually read, and that those reads
are ordered, in search file's key order.
If the data set is small, most any of the common patterns will work.
I haven't looked at pig for a while,
don't see this as an issue yet, because I'm still puzzeled with how to
> write
> the job in plain MR. The join code is looking for an exact match in the
> keys
> and that is not what I need. Would a custom comperator which will look for
> a
> match in between the ranges, be
You may write an arbitrary number of output.collect commands.
You may even use MultipleOutputFormat, to separate and stream the
output.collect results to additional destinations.
Caution must be taken to ensure that large numbers of files are not created
when using MultipleOutputFormat.
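A sketch of the MultipleTextOutputFormat route (the per-key directory naming is hypothetical); you would register it with conf.setOutputFormat(ByKeyOutputFormat.class), and keep the number of distinct key prefixes small, per the caution above:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

// Routes each record to a file named after its key, e.g. US/part-00000.
public class ByKeyOutputFormat extends MultipleTextOutputFormat<Text, Text> {
  @Override
  protected String generateFileNameForKeyValue(Text key, Text value, String name) {
    // "name" is the default leaf name (part-NNNNN); prefix it with the key.
    return key.toString() + "/" + name;
  }
}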
On Fri,
Just for fun, chapter 9 in my book is a work through of solving this class
of problem.
On Thu, Mar 26, 2009 at 7:07 AM, jason hadoop wrote:
> For the classic map/reduce job, you have 3 requirements.
>
> 1) a comparator that provide the keys in ip address order, such that all
> ke
1) when running in pseudo-distributed mode, only 2 values for the reduce
count are accepted, 0 and 1. All other positive values are mapped to 1.
2) The single reduce task spawned has several steps, and each of these steps
accounts for about 1/3 of its overall progress.
The 1st third is collectin
It will probably be available in a week or so, as draft one isn't quite finished :)
On Thu, Apr 2, 2009 at 1:45 AM, Stefan Podkowinski wrote:
> .. and is not yet available as an alpha book chapter. Any chance uploading
> it?
>
> On Thu, Apr 2, 2009 at 4:21 AM, jason hadoop
> wro
HDFS only allocates as much physical disk space as is required for a block, up
to the block size for the file (+ some header data).
So if you write a 4k file, the single block for that file will be around 4k.
If you write a 65M file, there will be two blocks, one of roughly 64M, and
one of roughly 1M
This is discussed in chapter 8 of my book.
In short,
If both data sets are:
- in same key order
- partitioned with the same partitioner,
- the input format of each data set is the same, (necessary for this
simple example only)
A map side join will present all the key value pairs of e
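A sketch of wiring up such a map side join with the mapred.join package (paths are placeholders, and this assumes both inputs already satisfy the ordering and partitioning requirements listed above):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.join.CompositeInputFormat;

public class MapSideJoin {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(MapSideJoin.class);
    conf.setInputFormat(CompositeInputFormat.class);
    // Inner join of two sorted, identically partitioned sequence file data sets.
    conf.set("mapred.join.expr", CompositeInputFormat.compose(
        "inner", SequenceFileInputFormat.class,
        new Path("/data/left"), new Path("/data/right")));
    FileOutputFormat.setOutputPath(conf, new Path("/data/joined"));
    // The map method then sees each key once, with a TupleWritable holding one
    // value per joined data set.
  }
}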
From the 0.19.0 FSNamesystem.java, it looks like the timeout by default is
2 * 3000 + 300000 = 306000 msec, or 5 minutes 6 seconds.
If you have configured dfs.hosts.exclude in your hadoop-site.xml to point to
an empty file, that actually exists, you may add the name (as used in the
slaves file) for
Alpha chapters are available, and 8 should be available in the alpha's as
soon as draft one gets back from technical review.
On Sun, Apr 5, 2009 at 7:43 AM, Christian Ulrik Søttrup wrote:
> jason hadoop wrote:
>
>> This is discussed in chapter 8 of my book.
>>
>>
>
The data is flushed when the file is closed, or the amount written is an
even multiple of the block size specified for the file, which by default is
64meg.
There is no other way to flush the data to HDFS at present.
There is an attempt at this in 0.19.0 but it caused data corruption issues
and wa
Chapter 8 of my book covers this in detail, the alpha chapter should be
available at the apress web site
Chain mapping rules!
http://www.apress.com/book/view/1430219424
On Wed, Apr 8, 2009 at 3:30 PM, Nathan Marz wrote:
> You can also try decreasing the replication factor for the intermediate
>
Hi Sagar!
There is no reason for the body of your reduce method to do more than copy
and queue the key value set into an execution pool.
The close method will need to wait until all of the items finish
execution and potentially keep the heartbeat up with the task tracker by
periodically repor
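A rough sketch of that pattern (pool size, types, and the per-key work are all illustrative; note the key and values must be copied because the framework reuses those objects):

import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class AsyncReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {
  private final ExecutorService pool = Executors.newFixedThreadPool(4);
  private volatile Reporter lastReporter;

  public void reduce(Text key, Iterator<Text> values,
                     final OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    lastReporter = reporter;
    // Copy the key and values: the framework reuses these objects after reduce() returns.
    final Text keyCopy = new Text(key);
    final List<Text> valueCopies = new ArrayList<Text>();
    while (values.hasNext()) {
      valueCopies.add(new Text(values.next()));
    }
    pool.submit(new Runnable() {
      public void run() {
        try {
          for (Text v : valueCopies) {
            synchronized (output) { // collect() is not guaranteed to be thread safe
              output.collect(keyCopy, v);
            }
          }
        } catch (IOException e) {
          throw new RuntimeException(e);
        }
      }
    });
  }

  @Override
  public void close() throws IOException {
    pool.shutdown();
    try {
      // Wait for the queued work, poking the task tracker so the task is not declared dead.
      while (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        if (lastReporter != null) {
          lastReporter.progress();
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("interrupted while draining work queue");
    }
  }
}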
The following very simple program will tell the VM to drop the pages being
cached for a file. I tend to spin this in a for loop when making large tar
files, or otherwise working with large files, and the system performance
really smooths out.
Since it uses open(path) it will churn through the inode
If you pack your images into sequence files, as the value items, the cluster
will automatically do a decent job of ensuring that the input splits made
from the sequences files are local to the map task.
We did this in production at a previous job and it worked very well for us.
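A sketch of the packing step, assuming hypothetical /images/raw and /images/packed.seq paths, with the file name as the key and the raw bytes as the value:

import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class ImagePacker {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path out = new Path("/images/packed.seq"); // hypothetical destination
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, out, Text.class, BytesWritable.class);
    try {
      for (FileStatus stat : fs.listStatus(new Path("/images/raw"))) {
        // Key: the original file name; value: the raw image bytes.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        InputStream in = fs.open(stat.getPath());
        byte[] chunk = new byte[65536];
        int n;
        while ((n = in.read(chunk)) > 0) {
          buf.write(chunk, 0, n);
        }
        in.close();
        byte[] bytes = buf.toByteArray();
        writer.append(new Text(stat.getPath().getName()),
                      new BytesWritable(bytes));
      }
    } finally {
      writer.close();
    }
  }
}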
Might as well turn
I have a nice variant of this in the ch7 examples section of my book,
including a standalone wrapper around the virtual cluster for allowing
multiple test instances to share the virtual cluster - and allow an easier
time to poke around with the input and output datasets.
It even works decently und
be looking at the
> performance both with and without the cache.
>
> Brian
>
>
> On Apr 14, 2009, at 12:01 AM, jason hadoop wrote:
>
> The following very simple program will tell the VM to drop the pages being
>> cached for a file. I tend to spin this in a for loop whe
utely necessary for this test to work?
>
> Thanks again,
> bc
>
>
>
> jason hadoop wrote:
> >
> > I have a nice variant of this in the ch7 examples section of my book,
> > including a standalone wrapper around the virtual cluster for allowing
> > multiple
+ File.separator + "history");
looks like the hadoop.log.dir system property is not set, note: not
environment variable, not configuration parameter, but system property.
Try a *System.setProperty("hadoop.log.dir","/tmp");* in your code before you
initialize the virtu
Double check that there is no firewall in place.
At one point a bunch of new machines were kickstarted and placed in a
cluster and they all failed with something similar.
It turned out the kickstart script enabled the firewall with a rule
that blocked ports in the 50k range.
It took us a whi
Chaining described in chapter 8 of my book provides this to a limited
degree.
Cascading, http://www.cascading.org/, also supports complex flows. I do not
know how cascading works under the covers.
On Thu, Apr 16, 2009 at 8:23 AM, Shevek wrote:
> On Tue, 2009-04-14 at 07:59 -0500, Pankil Doshi w
you wrote or is it run when
> the system turns on?
> Mithila
>
> On Thu, Apr 16, 2009 at 1:06 AM, Mithila Nagendra
> wrote:
>
> > Thanks Jason! Will check that out.
> > Mithila
> >
> >
> > On Thu, Apr 16, 2009 at 5:23 AM, jason hadoop >wrote:
>
The firewall was run at system startup, I think there was a
/etc/sysconfig/iptables file present which triggered the firewall.
I don't currently have access to any centos 5 machines so I can't easily
check.
On Thu, Apr 16, 2009 at 6:54 PM, jason hadoop wrote:
> The kicksta
wall rules, if any
for a linux machine.
You should be able to use telnet to verify that you can connect from the
remote machine.
On Thu, Apr 16, 2009 at 9:18 PM, Mithila Nagendra wrote:
> Thanks! I ll see what I can find out.
>
> On Fri, Apr 17, 2009 at 4:55 AM, jason hadoop >wrote:
The traditional approach would be a Mapper class that maintained a member
variable in which you kept the max value record, and in the close method of your
mapper you output a single record containing that value.
The map method of course compares the current record against the max and
stores current in
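A sketch of that pattern for a stream of long values (the key name and types are illustrative):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Each map task emits exactly one record, its local maximum, from close().
public class MaxMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, LongWritable> {
  private long max = Long.MIN_VALUE;
  private boolean sawInput = false;
  private OutputCollector<Text, LongWritable> collector;

  public void map(LongWritable offset, Text line,
                  OutputCollector<Text, LongWritable> output, Reporter reporter)
      throws IOException {
    collector = output; // close() has no collector argument, so remember it here
    long value = Long.parseLong(line.toString().trim());
    if (!sawInput || value > max) {
      max = value;
      sawInput = true;
    }
  }

  @Override
  public void close() throws IOException {
    if (sawInput && collector != null) {
      // A single reduce then picks the global maximum from one record per map.
      collector.collect(new Text("max"), new LongWritable(max));
    }
  }
}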
f the work done in the reduce.
On Mon, Apr 20, 2009 at 4:26 AM, Shevek wrote:
> On Sat, 2009-04-18 at 09:57 -0700, jason hadoop wrote:
> > The traditional approach would be a Mapper class that maintained a member
> > variable that you kept the max value record, and in the close method
u're performing a
> > SQL-like operation in MapReduce; not always the best way to approach this
> > type of problem).
> >
> > Brian
> >
> > On Apr 20, 2009, at 8:25 PM, jason hadoop wrote:
> >
> >> The Hadoop Framework requires that a Map Phase b
ay to approach this
> type of problem).
>
> Brian
>
>
> On Apr 20, 2009, at 8:25 PM, jason hadoop wrote:
>
> The Hadoop Framework requires that a Map Phase be run before the Reduce
>> Phase.
>> By doing the initial 'reduce' in the map, a much smaller
There must be only 2 input splits being produced for your job.
Either you have 2 unsplitable files, or the input file(s) you have are not
large enough compared to the block size to be split.
Table 6-1 in chapter 06 gives a breakdown of all of the configuration
parameters that affect split size in
Most likely that machine is affected by some firewall somewhere that
prevents traffic on port 50075. The no route to host is a strong indicator,
particularly if the Datanode registered with the namenode.
On Tue, Apr 21, 2009 at 4:18 PM, Philip Zeyliger wrote:
> Very naively looking at the code, t
For reasons that I have never bothered to investigate I have never had a
cluster work when the hadoop.tmp.dir was not identical on all of the nodes.
My solution has always been to just make a symbolic link so that
hadoop.tmp.dir was identical and on the machine in question really ended up
in the f
/numbers -output /tmp/numbers_max_output -reducer
aggregate -mapper LongMax.pl -file /tmp/LongMax.pl
On Tue, Apr 21, 2009 at 7:42 PM, jason hadoop wrote:
> There is no reason to use a combiner in this case, as there is only a
> single output record from the map.
>
> Combiners buy you da
ey Jason,
>
> We've never had the hadoop.tmp.dir identical on all our nodes.
>
> Brian
>
>
> On Apr 22, 2009, at 10:54 AM, jason hadoop wrote:
>
> For reasons that I have never bothered to investigate I have never had a
>> cluster work when the hadoop.tmp.dir was
The no route to host message means one of two things: either there is no
actual route, which would have generated a different error, or some firewall
is sending back a new route message.
I have seen the no route to host problem several times, and it is usually
because there is a firewall in place
I wonder if this is an obscure case of out of file descriptors. I would
expect a different message out of the jvm core
On Wed, Apr 22, 2009 at 5:34 PM, Matt Massie wrote:
> Just for clarity: are you using any type of virtualization (e.g. vmware,
> xen) or just running the DataNode java process o
It looks like this is during the hdfs recovery phase of the cluster start.
Perhaps a tmp cleaner has removed some of the files, and now this portion of
the restart is causing a failure.
I am not terribly familiar with the job recovery code.
On Wed, Apr 22, 2009 at 11:44 AM, Tamir Kamara wrote:
I believe the datanode is the same physical machine as the namenode if I
understand this problem correctly.
(Which really puts paid to our suggestions about traceroute and firewalls.)
I have one question: is the ip address consistent? I think in one of the
thread mails, it was stated that the ip addr
Can you give us your network topology?
I see at least 3 ip addresses:
192.168.253.20, 192.168.253.32 and 192.168.253.21
In particular the fs.default.name which you have provided, the
hadoop-site.xml for each machine,
the slaves file with ip address mappings if needed, and a netstat -a -n -t
-