Hey
The logs are created when the map task spills during the FETCH job and
are stored under /home/hadoop/nodelogs/usercache/root/appcache. Their
total size grows to over 13 GB, which takes up so much disk space on the
datanode that I have to delete them by hand to keep Nutch running
smoothly.
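Since the files look like intermediate map output rather than plain log
files, one thing I am considering is enabling map output compression in
mapred-site.xml to shrink the spills. A minimal sketch, untested on my
cluster (property names are the Hadoop 2.7 defaults):

    <!-- sketch: compress intermediate map output to shrink spill files -->
    <property>
      <name>mapreduce.map.output.compress</name>
      <value>true</value>
    </property>
    <property>
      <!-- DefaultCodec avoids a native-library dependency;
           SnappyCodec is faster if the native libs are installed -->
      <name>mapreduce.map.output.compress.codec</name>
      <value>org.apache.hadoop.io.compress.DefaultCodec</value>
    </property>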
Also, I am not sure which parameter in log4j.properties should be
changed to reduce this volume.
Shubham Gupta
On 08/24/2016 05:20 PM, Markus Jelsma wrote:
If it is Nutch logging, change its level in conf/log4j.properties. It can also
be Hadoop logging.
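For example, raising the thresholds for the Nutch and Hadoop loggers in
conf/log4j.properties cuts most of the volume. A sketch; adjust the
logger names to whatever actually shows up in your logs:

    # log only warnings and errors from Nutch and Hadoop classes
    log4j.logger.org.apache.nutch=WARN
    log4j.logger.org.apache.hadoop=WARN
    # or raise the default threshold for everything
    log4j.rootLogger=WARN,DRFA

DRFA is the appender name in the stock Nutch log4j.properties; keep
whatever appender your file already references.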
M.
-----Original message-----
From:shubham.gupta <[email protected]>
Sent: Tuesday 23rd August 2016 8:15
To: [email protected]
Subject: Application creating huge amount of logs : Nutch 2.3.1 + Hadoop 2.7.1
Hey
I have integrated Nutch 2.3.1 with Hadoop 2.7.1; the fetcher.parse
property is set to TRUE and the backing database is MongoDB. While the
Nutch map job runs, it generates over 13 GB of node logs, and the cause
of this huge volume of files is unknown. Any suggestion would help.
Thanks in advance.
Shubham Gupta