It happens right after the MR job (though once or twice it's happened
during). I am not using EBS, just HDFS between the machines. As for tasks,
there are 4 mappers and 0 reducers.
Richard J. Zak
-Original Message-
From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of
Yes, you may be overloading your machines that way because of the small number. One
thing to do would be to look in the logs for any signs of IOExceptions and
report them back here. Another thing you can do is change some configs.
Increase *dfs.datanode.max.xcievers* to 512 and set the
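For what it's worth (this is just a sketch of the usual way to set it, not part of the original advice), that limit goes into hdfs-site.xml on each datanode and takes effect after the datanodes are restarted:

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>512</value>
</property>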
Yes, we have observed such problems.
They are common in 0.18.2 and 0.19.0, exactly as you
described, when data-nodes become unstable.
There were several related issues; please take a look:
HADOOP-4997 workaround for tmp file handling on DataNodes
HADOOP-4663 - links to other related issues
HADOOP-4810
Hi,
Since I've upgraded to 0.19.0, I've been getting the following exceptions
when restarting jobs, or even when a failed reducer is being restarted by
the job tracker. It appears that stale file locks in the namenode don't get
properly released sometimes:
org.apache.hadoop.ipc.RemoteException:
It seems HDFS isn't as robust or reliable as the website says, and/or I
have a configuration issue.
Quite possible. How robust does the website say it is?
I agree that debugging failures like the following is pretty hard for casual
users. You need to look at the logs for the block, or run 'bin/hadoop
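(The command above is cut off; my guess, and only a guess, is fsck, which is the usual way to inspect block health, e.g.

bin/hadoop fsck /some/path -files -blocks -locations

which lists each file's blocks, their locations, and any missing or under-replicated blocks.)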
Can you please attach your latest version of this to
https://issues.apache.org/jira/browse/HADOOP-496?
Thanks,
Doug
Boris Musykantski wrote:
we have fixed up some patches in JIRA for support of a WebDAV server on
top of HDFS, updated them to work with a newer version (0.18.0 IIRC), and
added support for
I am looking to create some RA scripts and experiment with starting
Hadoop via the Linux-HA cluster manager. Linux-HA would handle restarting
downed nodes and eliminate the ssh key dependency.
Hi, esteemed group,
how would I form Maps in MapReduce to recursively look at every file in a
directory and do something with each file, such as produce a PDF or compute
its hash?
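(A rough sketch of one common approach -- the class and names below are mine, not from any of the papers: write a small text file with one HDFS path per line, e.g. by walking FileSystem.listStatus() recursively, feed that file to a job with zero reducers, and let each map() open and hash the path it is handed.)

import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical example: each input record is a line holding one HDFS path.
public class FileHashMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  public void map(LongWritable offset, Text pathLine,
                  OutputCollector<Text, Text> out, Reporter reporter)
      throws IOException {
    Path file = new Path(pathLine.toString().trim());
    // For a sketch only; in a real job you would keep the JobConf from configure().
    FileSystem fs = file.getFileSystem(new Configuration());

    MessageDigest md5;
    try {
      md5 = MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
      throw new IOException("MD5 not available: " + e);
    }

    // Stream the file through the digest so large files are not buffered in memory.
    FSDataInputStream in = fs.open(file);
    try {
      byte[] buf = new byte[64 * 1024];
      int n;
      while ((n = in.read(buf)) > 0) {
        md5.update(buf, 0, n);
      }
    } finally {
      in.close();
    }

    out.collect(pathLine, new Text(toHex(md5.digest())));
  }

  private static String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }
}

Producing a PDF would have the same shape: swap the digest for whatever per-file processing you need.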
For that matter, Google builds its index using MapReduce, or so the papers
say. First the crawlers store all the
Hi,
Sounds like you might want to look at the Nutch project architecture
and then see the Nutch on Hadoop tutorial -
http://wiki.apache.org/nutch/NutchHadoopTutorial It does web
crawling, and indexing using Lucene. It would be a good place to
start anyway for ideas, even if it doesn't end up
Thanks a lot for your help. I solved that problem by removing LDFLAGS
(containing libjvm.so) from the hdfs_test compilation. I had added that flag to compile
correctly using the Makefile, but it was the real problem. Only after removing it
was I able to run with ant.
Thanks,
Arifa
-Original Message-
Hey all, I wanted to reach out to the user / development community to
start identifying those of you who are interested in consulting /
contract work for new Hadoop deployments.
A number of our larger customers are asking for more extensive on-site
help than would normally happen under a support
Christophe,
I am writing my first Hadoop project now, I have 20 years of consulting experience,
and I am in Houston. Here is my resume: http://markkerzner.googlepages.com.
I have used EC2.
Sincerely,
Mark
On Fri, Jan 23, 2009 at 4:04 PM, Christophe Bisciglia
christo...@cloudera.com wrote:
Hey all,
Tim,
I looked there, but it is a setup manual. I read the MapReduce, Sawzall, and
MS papers on these, but I need best practices.
Thank you,
Mark
On Fri, Jan 23, 2009 at 3:22 PM, tim robertson timrobertson...@gmail.comwrote:
Hi,
Sounds like you might want to look at the Nutch project
Thanks Mark. I'll be getting in touch early next week.
Others, I see replies default straight to the list. Please feel free to
email just me (christo...@cloudera.com), unless, well, you're in the
mood to share your bio with everyone :-)
Cheers,
Christophe
On Fri, Jan 23, 2009 at 2:31 PM, Mark
Hi,
there is a performance penalty in Windows (pardon the expression) if you put
too many files in the same directory. The OS becomes very slow, stops seeing
them, and lies about their status to my Java requests. I do not know if this
is also a problem in Linux, but in HDFS - do I need to balance
If you are adding and deleting files in the directory, you might notice a
CPU penalty (for many loads, higher CPU on the NN is not an issue). This is
mainly because HDFS does a binary search on the files in a directory each
time it inserts a new file.
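(To make that concrete -- this is a toy illustration of the idea, not the actual namenode code: keeping the children of a directory sorted by name means every create does a binary search to find the slot and then shifts the rest of the array over.)

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model of a directory that keeps its children sorted by name.
class SortedDir {
  private final List<String> children = new ArrayList<String>();

  // O(log n) binary search to find the slot, then an O(n) shift to insert.
  boolean addChild(String name) {
    int idx = Collections.binarySearch(children, name);
    if (idx >= 0) {
      return false;                 // a child with this name already exists
    }
    children.add(-idx - 1, name);   // insertion point encoded by binarySearch
    return true;
  }
}

With a very large directory, that per-insert shift is what shows up as extra NameNode CPU.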
If the directory is relatively idle, then there
On Sat, Jan 24, 2009 at 10:03 AM, Mark Kerzner markkerz...@gmail.com wrote:
Hi,
there is a performance penalty in Windows (pardon the expression) if you put
too many files in the same directory. The OS becomes very slow, stops seeing
them, and lies about their status to my Java requests. I do
Raghu Angadi wrote:
If you are adding and deleting files in the directory, you might notice a
CPU penalty (for many loads, higher CPU on the NN is not an issue). This is
mainly because HDFS does a binary search on the files in a directory each
time it inserts a new file.
I should add that equal or
%Remaining fluctuates much more than %dfs used. This is because DFS shares
the disks with mapred, and mapred tasks may use a lot of disk space temporarily. So
trying to keep the same %free is impossible most of the time.
Hairong
On 1/19/09 10:28 PM, Billy Pearson sa...@pearsonwholesale.com wrote: