Hi there,
I get the following exception while appending to an existing file in my
HDFS. The error appears intermittently: if it does not show up, I can
append to the file successfully; if it appears, I cannot append to the
file.
Here is the error:
I use CDH 4.3.1 and run the TestHDFSCLI unit test, but I get the errors below:
2013-10-10 13:05:39,671 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(156)) -
---
2013-10-10 13:05:39,671 INFO cli.CLITestHelper
Hi Arinto,
Please disable this feature on smaller clusters:
dfs.client.block.write.replace-datanode-on-failure.policy
The reason for this exception is that you have replication set to 3 and,
from the logs, it looks like you have only 2 nodes in the cluster. When the
pipeline was first created we will
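A minimal hdfs-site.xml sketch of that change (these are standard HDFS
client settings, but please verify the values against your exact version):

<!-- hdfs-site.xml (client side): on clusters with very few DataNodes,
     stop the client from looking for a replacement DataNode when one
     fails in a write/append pipeline. -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>false</value>
</property>
<!-- Alternatively, keep the feature enabled but never replace: -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>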
Hello,
Thanks a lot for the information. It helped me figure out the solution to
this problem.
I posted the sketch of solution on StackOverflow
(http://stackoverflow.com/a/19295610/337194) for anybody who is interested.
Best regards,
Youssef Hatem
On Oct 9, 2013, at 14:08, Peter Marron
Hi,
We are working on building a MapReduce program that takes Avro input from
HDFS, gets the timestamp, and counts the number of events written on any
given day. We would like to make a program that does not need to have the
Avro data declared previously; rather, it would be best if it could read
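One way to do this, sketched below under assumptions the thread does not
confirm: it uses avro-mapred's AvroKeyInputFormat with GenericRecord, so no
generated Avro classes are needed (the writer's schema embedded in the data
files is used), and it assumes each record has a long "timestamp" field
holding epoch milliseconds.

import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.avro.generic.GenericRecord;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapreduce.AvroKeyInputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class EventsPerDay {

  // Emits (day, 1) per record; GenericRecord avoids generated classes.
  public static class DayMapper
      extends Mapper<AvroKey<GenericRecord>, NullWritable, Text, LongWritable> {
    private static final LongWritable ONE = new LongWritable(1);
    private final SimpleDateFormat day = new SimpleDateFormat("yyyy-MM-dd");

    @Override
    protected void map(AvroKey<GenericRecord> key, NullWritable ignored,
        Context context) throws IOException, InterruptedException {
      // Assumption: a long "timestamp" field with epoch milliseconds.
      long ts = (Long) key.datum().get("timestamp");
      context.write(new Text(day.format(new Date(ts))), ONE);
    }
  }

  // Sums the per-day counts.
  public static class SumReducer
      extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text day, Iterable<LongWritable> counts,
        Context context) throws IOException, InterruptedException {
      long sum = 0;
      for (LongWritable c : counts) sum += c.get();
      context.write(day, new LongWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "events-per-day");
    job.setJarByClass(EventsPerDay.class);
    // No input schema is set, so the schema stored in the files is used.
    job.setInputFormatClass(AvroKeyInputFormat.class);
    job.setMapperClass(DayMapper.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}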
Hi there,
I was running some mapreduce jobs on hadoop-2.1.0-beta. These are
multiple unit tests that can take more than a day to finish running.
However, I realized the logs for the jobs are being deleted somewhat more
quickly than the default 24-hour setting of mapreduce.job.userlog.retain.hours
Hi Reyane,
Did you try yarn.nodemanager.log.retain-seconds? Increasing that might
help. The default value is 10800 seconds, which is 3 hours.
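For example, a yarn-site.xml entry like this (86400 is just an example
value matching the 24 hours you expected):

<!-- yarn-site.xml on each NodeManager: keep local container logs for
     24 hours instead of the default 3 hours (10800 seconds). -->
<property>
  <name>yarn.nodemanager.log.retain-seconds</name>
  <value>86400</value>
</property>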
Thanks,
Kishore
On Thu, Oct 10, 2013 at 8:27 PM, Reyane Oukpedjo oukped...@gmail.com wrote:
Hi there,
I was running some mapreduce jobs on
We recently switched all our production clusters to JDK7, off the EOL JDK6.
The one big gotcha, and this was -not- specifically a problem with the
Hadoop framework, but you may have issues with your own applications or
clients, is with the Java 7 bytecode verifier, which can be disabled
with
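The message is cut off before naming the flag; it most likely refers to the
JDK7 split verifier switch (an assumption, not confirmed here), which can be
passed to MapReduce task JVMs via mapred-site.xml:

<!-- Assumption: the flag meant is -XX:-UseSplitVerifier, which makes
     JDK7 fall back to the old (JDK6-style) bytecode verifier. Keep your
     existing heap options; only the verifier flag is the point here. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx200m -XX:-UseSplitVerifier</value>
</property>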
Thanks problem solved.
Reyane OUKPEDJO
On 10 October 2013 11:10, Krishna Kishore Bonagiri
write2kish...@gmail.com wrote:
Hi Reyane,
Did you try yarn.nodemanager.log.retain-seconds? Increasing that might
help. The default value is 10800 seconds, which is 3 hours.
Thanks,
Hi,
I have a simple Grep job (from the bundled examples) that I am running on
an 11-node cluster. Each node has 2x8-core Intel Xeons (shows 32 CPUs with
HT on), 64GB RAM and 8 x 1TB disks. I have mappers set to 20 per node.
When I run the Grep job, I notice that CPU gets pegged to 100% on multiple
Actually... I believe that is expected behavior. Since your CPU is pegged
at 100%, you're not going to be IO bound. Typically, jobs tend to be CPU
bound or IO bound. If you're CPU bound, you expect to see low IO throughput;
if you're IO bound, you expect to see low CPU usage.
On Thu, Oct 10, 2013
Thanks Pradeep. Does it mean this job is a bad candidate for MR?
Interestingly, running the command-line '/bin/grep' under a streaming job
gives (1) much better disk throughput and (2) CPU load that is spread
almost evenly across all cores/threads (no CPU gets pegged to 100%).
On Thu, Oct 10, 2013
I don't think it necessarily means that the job is a bad candidate for MR.
It's a different type of a workload. Hortonworks has a great article on the
different types of workloads you might see and how that affects your
provisioning choices at
Hi,
I have a yarn application that launches a mapreduce job with a mapper
that uses a newer version of guava than the one hadoop is using. Because
of this, the mapper fails with a NoSuchMethodError. Is there a
way to indicate that application dependencies should be used over hadoop
Hi Albert,
If you are using the distributed cache to push the newer version of the
guava jars, you can try setting mapreduce.job.user.classpath.first to true.
If not, you can try overriding the value of mapreduce.application.classpath
to ensure that the dir where the newer guava jars are present
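A short sketch of the first suggestion, assuming the job is submitted
through the Java API; the HDFS path and jar name below are made up for
illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class UserClasspathFirst {
  public static Job createJob() throws Exception {
    Configuration conf = new Configuration();
    // Prefer jars shipped with the job over Hadoop's own copies.
    conf.setBoolean("mapreduce.job.user.classpath.first", true);
    Job job = Job.getInstance(conf, "job-with-newer-guava");
    // Ship the newer guava to every task via the distributed cache and
    // put it on the task classpath (hypothetical path).
    job.addFileToClassPath(new Path("/libs/guava-15.0.jar"));
    return job;
  }
}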
Thank you for the comprehensive answer,
When I inspect our NameNode UI, I see there are 3 datanodes up.
However, as you mentioned, the log only showed 2 datanodes up. Does it
mean that one of the datanodes was unreachable when we tried to append to
the files?
Best regards,
Arinto
Hi Guys,
We have a fairly decent sized Hadoop cluster of about 200 nodes and I was
wondering what the state of the art is if I want to aggregate and visualize
Hadoop ecosystem logs, particularly
1. Tasktracker logs
2. Datanode logs
3. Hbase RegionServer logs
One way is to use something like a
You can try Chukwa, which is one of the incubating projects under Apache. I
tried it before and liked it for aggregating logs.
On 11 Oct, 2013, at 1:36 PM, Sagar Mehta sagarme...@gmail.com wrote:
Hi Guys,
We have a fairly decent sized Hadoop cluster of about 200 nodes and I was
wondering what