The Nutch job started producing this huge amount of logs after the
fetcher.parse property was set to TRUE. Is there any relation between the two?
Shubham
On Thursday 08 September 2016 10:28 AM, shubham.gupta wrote:
What changes can be made in Nutch's log4j.properties to reduce the size
of the Nutch logs?
Shubham
On Wednesday 07 September 2016 04:04 AM, Markus Jelsma wrote:
I've seen Hadoop not honouring some log settings before. Are you
really sure these are org.apache.nutch.* logs? If so, and as said
before, change log4j.properties to not log INFO messages. If they are
Hadoop logs, of which there are many, then change some Hadoop
settings which I don't remember right now.
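As a sketch, raising the log threshold in conf/log4j.properties might look like the following (the exact logger names and appender in your file may differ; match them to what is already there):

```properties
# Drop INFO messages globally; keep the existing appender (here assumed DRFA)
log4j.rootLogger=WARN,DRFA

# Quiet the Nutch loggers specifically
log4j.logger.org.apache.nutch=WARN

# Hadoop's own classes can be quieted the same way if they share this config
log4j.logger.org.apache.hadoop=WARN
```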
Hadoop is notorious for verbose logging, and any job can easily create
hundreds of MBs of logs on datanodes, the HDFS master, the YARN master,
containers, etc. This is normal and should be expected. If your Hadoop
cluster is not designed to take GBs of logs, then your disk space is
simply too small, because so much is happening while a job runs.
Either increase disk space, or set all logging levels to WARN or higher.
In any case, a Hadoop cluster always logs more than Nutch does, so
Nutch logging is the least of your problems.
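For the per-task container logs, a sketch of lowering the job's log verbosity via mapred-site.xml (property names taken from Hadoop 2.x; verify them against your version's mapred-default.xml):

```xml
<!-- mapred-site.xml: log at WARN instead of the INFO default
     inside map and reduce task containers -->
<property>
  <name>mapreduce.map.log.level</name>
  <value>WARN</value>
</property>
<property>
  <name>mapreduce.reduce.log.level</name>
  <value>WARN</value>
</property>
```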
M.
-----Original message-----
From:shubham.gupta <[email protected]>
Sent: Tuesday 6th September 2016 6:57
To: [email protected]
Subject: Re: Application creating huge amount of logs : Nutch 2.3.1
+ Hadoop 2.7.1
Hey,
I have changed the user.log_retain size to 10 MB, but it is still
creating a huge amount of logs. This leads to the failure of a datanode
and the job fails. And if the logs are deleted periodically, the fetch
phase takes a lot of time and it is uncertain whether it will complete
or not.
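For reference, one Hadoop 2.x setting that caps per-task userlog size is sketched below; whether this is the same property as the user.log_retain setting mentioned above is an assumption, so check your version's mapred-default.xml:

```xml
<!-- mapred-site.xml: cap each task attempt's userlog size
     (value in KB; 0 means no limit) -->
<property>
  <name>mapreduce.task.userlog.limit.kb</name>
  <value>10240</value>
</property>
```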
Shubham Gupta
On Wednesday 24 August 2016 05:20 PM, Markus Jelsma wrote:
If it is Nutch logging, change its level in conf/log4j.properties.
It can also be Hadoop logging.
M.
-----Original message-----
From:shubham.gupta <[email protected]>
Sent: Tuesday 23rd August 2016 8:15
To: [email protected]
Subject: Application creating huge amount of logs : Nutch 2.3.1 +
Hadoop 2.7.1
Hey
I have integrated Nutch 2.3.1 with Hadoop 2.7.1; the fetcher.parse
property is set to TRUE and the database used is MongoDB. While the
map job of Nutch runs, it creates node logs over 13 GB in size, and
the cause of such a huge amount of log files is unknown. Any
suggestion would help.
Thanks in advance.
Shubham Gupta