Hi,
with your answer, and the questions of yours that I cannot answer, I realize
that I am missing a lot of Hadoop understanding. I will continue with my
analysis and read the documentation more deeply. Do you know of a tutorial or
similar resource where I can fully understand how Hadoop works and how it
performs the MR job?
Hi,
I am using a Hadoop MapReduce job together with MongoDB.
It runs against a database that is 252 GB in size. During the job the number of
connections goes above 8000, and we have already given it 9 GB of RAM. The job
is still crashing with an OutOfMemoryError with only 8% of the mapping done.
Are these numbers normal, or did we do something wrong?
I don't have any experience with MongoDB, but here are my 2 cents.
Your code is not efficient: using += on String is costly, and you could have
reused the Text object in your mapper. Text is a mutable class, so it can be
reused instead of being created over and over with new Text() in the mapper.
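To make the += point concrete, here is a tiny stand-alone illustration (the
data is made up; in the real job the fields would come from each record):

import java.util.Arrays;
import java.util.List;

public class ConcatExample {
    public static void main(String[] args) {
        List<String> fields = Arrays.asList("a", "b", "c");

        // += on String allocates a brand new String on every iteration...
        String slow = "";
        for (String field : fields) {
            slow += field + ",";
        }

        // ...while a StringBuilder appends into the same internal buffer.
        StringBuilder fast = new StringBuilder();
        for (String field : fields) {
            fast.append(field).append(',');
        }
        System.out.println(slow.equals(fast.toString()));  // prints true
    }
}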
Thanks for your answer.
To your questions:
1. "When you claim 96G RAM, I am not sure what you mean?"
It is not 96 GB of RAM; it is 9 GB that our test server has available (is that
too small?).
2. "Your code is not efficient, as using the += on String"
I need (or at least I don't have
Here are my suggestions, which were originally aimed at improving efficiency:
1) In your case you could use StringBuilder, whose append method should be more
efficient for concatenating your string data.
2) What I mean by reusing the Text object is shown in the mapper sketch below.
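Roughly like this (a minimal sketch, assuming the mongo-hadoop connector hands
the mapper BSONObject values; the class name, field names, and output key are
made up for illustration):

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.bson.BSONObject;

public class MyMapper extends Mapper<Object, BSONObject, Text, Text> {

    // Create the Text objects once and reuse them for every record,
    // instead of calling new Text(...) inside map().
    private final Text outKey = new Text();
    private final Text outValue = new Text();

    @Override
    protected void map(Object key, BSONObject value, Context context)
            throws IOException, InterruptedException {
        // Use a StringBuilder instead of += on String, so each
        // concatenation does not allocate a new String object.
        StringBuilder sb = new StringBuilder();
        sb.append(value.get("name")).append('\t').append(value.get("city"));

        outKey.set(String.valueOf(value.get("_id")));
        outValue.set(sb.toString());   // Text.set() reuses the same object
        context.write(outKey, outValue);
    }
}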