That's the one ...
Sent from my iPhone
On Jan 20, 2012, at 6:28 PM, "Paul Ho" wrote:
> I think the balancing bandwidth property you are looking for is in
> hdfs-site.xml:
>
> <property>
>   <name>dfs.balance.bandwidthPerSec</name>
>   <value>402653184</value>
> </property>
> Set the value that makes most sense for your NIC
I think the balancing bandwidth property you are looking for is in
hdfs-site.xml:
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>402653184</value>
</property>
Set the value that makes most sense for your NIC. But I thought this is only
for balancing.
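If you'd rather not edit hdfs-site.xml and restart, some releases from this era also accept a runtime override via dfsadmin. This is a sketch, not verified against every version; check `hadoop dfsadmin -help` to confirm your release ships the subcommand:

```shell
# Override the balancer bandwidth at runtime (value is bytes/sec;
# 402653184 = 384 MB/s). Only works if your release includes the
# -setBalancerBandwidth dfsadmin subcommand.
hadoop dfsadmin -setBalancerBandwidth 402653184
```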
On Jan 20, 2012, at 3:43 PM, Michael Segel wrote:
> Ste
Steve,
Ok, first, your client connection to the cluster is a non-issue.
If you go into /etc/Hadoop/conf
(That's supposed to be a little h, but my iPhone knows what's best...)
Look and see what you have set for your bandwidth... I forget which parameter,
but there are only a couple that deal with ban
Interesting - I strongly suspect a disk IO or network problem since my code
is very simple and very fast.
If you add lines to generateSubStrings to limit String length to 100
characters (I think it is always that but this makes su
public static String[] generateSubStrings(String inp, int minLeng
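The signature above is cut off in the archive; as a point of reference, here is a hypothetical reconstruction. The parameter name minLength and the all-substrings behavior are guesses from the truncated signature and the surrounding discussion, not the poster's actual code:

```java
import java.util.ArrayList;
import java.util.List;

public class SubStrings {
    // Hypothetical sketch: return every substring of inp whose length
    // is at least minLength. The real implementation in the thread may
    // differ; this only illustrates the fan-out being discussed.
    public static String[] generateSubStrings(String inp, int minLength) {
        List<String> out = new ArrayList<String>();
        for (int start = 0; start < inp.length(); start++) {
            for (int end = start + minLength; end <= inp.length(); end++) {
                out.add(inp.substring(start, end));
            }
        }
        return out.toArray(new String[0]);
    }
}
```

Note the fan-out: under this sketch, a 100-character input line at minLength 1 produces 100*101/2 = 5050 output strings, which is consistent with the thread's point about one input record generating thousands of map outputs.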
Good catch on the Configured - in my tests it extends my subclass of
Configured, but I took out any dependencies on my environment.
One thing I can say for sure is that generateSubStrings() is not slow -
Every input line in my sample is 100 characters and the timing should be
very similar
from one run to the next.
This sample is a simplification of a more complex real problem where we see
timeouts when a
map generates signifi
On Fri, Jan 20, 2012 at 12:18 PM, Michel Segel wrote:
> Steve,
> If you want me to debug your code, I'll be glad to set up a billable
> contract... ;-)
>
> What I am willing to do is to help you to debug your code..
The code seems to work well for small input files and is basically a
standard sa
Well - I am running the job over a VPN, so I am not on a fast network to the
cluster.
The job runs fine for small input files - we did not run into issues until
the input file got into the multi-gigabyte range.
On Fri, Jan 20, 2012 at 11:29 AM, Raj V wrote:
> Steve
>
> There seems to be something wr
Every so often, you should do a context.progress() so that the
framework knows that this map is doing useful work. That will prevent
the framework from killing it after 10 mins. The framework
automatically does this every time you do a
context.write()/context.setStatus(), but if the map is stuck fo
Hi Steve, I ran your job on our cluster and it does not time out. I noticed
that each mapper runs for a long time: one way to avoid a timeout is to
update a user counter. As long as this counter is updated within 10
minutes, the task should not time out (as MR knows that something is being
done).
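The pattern both replies describe - touch a counter or call progress() every so often inside a long-running loop - can be factored out like this. This is an illustrative helper, not Hadoop API; in a real mapper the heartbeat action would be context.progress() or a counter increment via context.getCounter(...):

```java
// Illustrative helper (names are mine, not Hadoop's): fire a heartbeat
// action every `interval` records so a long-running loop keeps
// signalling liveness to the framework.
public class ProgressReporter {
    private final long interval;
    private final Runnable heartbeat;
    private long count = 0;

    public ProgressReporter(long interval, Runnable heartbeat) {
        this.interval = interval;
        this.heartbeat = heartbeat;
    }

    // Call once per record processed; fires the heartbeat every
    // `interval` calls.
    public void record() {
        if (++count % interval == 0) {
            heartbeat.run();
        }
    }
}
```

In a mapper you would construct this once in setup() with an interval of, say, a few thousand records, and call record() once per emitted output.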
Steve,
If you want me to debug your code, I'll be glad to set up a billable
contract... ;-)
What I am willing to do is to help you to debug your code...
Did you time how long it takes in the Mapper.map() method?
The reason I asked this is to first confirm that you are failing within a map()
met
Steve
There seems to be something wrong with either networking or storage. Why does
it take "hours" to generate a 4GB text file?
Raj
>
> From: Steve Lewis
> To: common-user ; Josh Patterson
>
> Sent: Friday, January 20, 2012 9:16 AM
> Subject: Problems with timeo
You are on the right path for sure.
Where are you updating the JCE policy jar? (I know the RM-NM case is
working after this, so just checking)
May be the datanodes are not using the same JRE that you updated with
the new policy jar? Can you check that? jsvc shouldn't cause any more
issues, it sho
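One way to check the "same JRE" question raised above. The paths assume a typical Oracle/Sun JDK layout of the period; adjust for your install:

```shell
# Find which java binary the datanode process actually launched with:
ps -ef | grep '[D]ataNode'
# Then confirm the unlimited-strength JCE policy jars are present in
# that JRE's security directory (not just in the one on your PATH):
ls "$JAVA_HOME"/jre/lib/security/local_policy.jar \
   "$JAVA_HOME"/jre/lib/security/US_export_policy.jar
```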
We have been having problems with mappers timing out after 600 sec when the
mapper writes many more records - say, thousands - for every
input record, even when the code in the mapper is small and fast. I have
no idea what could cause the system to be so slow and am reluctant to raise
the 600 sec l
Hi Folks,
I'm currently working on implementing a logging system for a new Hadoop
cluster I've set up. The way I've always seen these set up in the past was
logging split off by days, with individual files sharded off at around
10x HDFS block size. I haven't had any problems with this methodology
Hi all,
I have a fairly simple Hadoop Streaming map-reduce task which takes a
batch of 60-ish log files (each around 5-10MB gzip-compressed) and
processes them using a cdh3u2-based cluster. The map stage of the job
finishes successfully and reasonably swiftly, but the reduce is taking
hours to c
Hi,
Does anyone have some production-ready experience with the latest HDFS and
one of the FUSE-like products?
I've tried fuse-dfs and hdfs-fuse so far, but it looks like they're not
ported to newer Hadoop installations, and there are a lot of issues with
Java library compatibility.
What's, from your experience, the best way to
After removing the upper-case, the problem disappeared. Now I get the node
manager connected to the resource manager successfully.
Thank you Vinod.
But now I get another issue connecting to the Name Node from the Data Node.
The log in the Name Node is as follows:
2012-01-20 18:17:02,127 WARN ipc.Server (Server.java:
Hadoop 1.0.0 was released in Dec 2011, and it's in Beta version.
As per the below link, the security feature (strong authentication via the
Kerberos authentication protocol) is added in the Hadoop 1.0.0 release.
http://www.infoq.com/news/2012/01/apache-had
Thanks a lot guys, for such an illustrative explanation. I will go through
the links you sent and will get back with any doubts I have.
Thanks,
Praveenesh
On Thu, Jan 19, 2012 at 2:17 PM, Sameer Farooqui wrote:
> Hey Praveenesh,
>
> Here's a good article on HDFS by some senior Yahoo!, Facebook, Hor