The Nutch job started producing this huge amount of logs after the fetcher.parse property was set to TRUE. So, is there any relation to that?

Shubham

On Thursday 08 September 2016 10:28 AM, shubham.gupta wrote:
What changes can be made in Nutch's log4j.properties to reduce the size of the Nutch logs?

Shubham

On Wednesday 07 September 2016 04:04 AM, Markus Jelsma wrote:
I've seen Hadoop not honouring some log settings before. Are you really sure these are org.apache.nutch.* logs? If so, and as said before, change log4j.properties to not log INFO messages. If they are Hadoop logs, of which there are many, then change some Hadoop settings, which I don't remember right now.
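[Editor's note: a minimal sketch of the log4j.properties change suggested above. The DRFA appender name mirrors the one shipped in Nutch's default conf/log4j.properties; adjust to whatever appenders your file actually defines.]

```properties
# Raise the default threshold so INFO chatter is dropped entirely.
log4j.rootLogger=WARN,DRFA

# Nutch's fetcher/parser classes are the noisiest when fetcher.parse=true;
# keep both Nutch and Hadoop packages at WARN or higher.
log4j.logger.org.apache.nutch=WARN
log4j.logger.org.apache.hadoop=WARN
```

Nutch reads this file from its conf/ directory; when running under Hadoop, make sure the edited file is the one actually packaged into the job, or the cluster's own log4j.properties may win.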

Hadoop is notorious for verbose logging, and any job can easily create hundreds of MBs of logs on datanodes, the HDFS master, the YARN master, containers, etc. This is normal and should be expected, because so much is happening when a job runs. If your Hadoop cluster is not designed to take GBs of logs, then your disk space is simply too small. Either increase disk space, or set all logging levels to WARN or higher.
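[Editor's note: one way to keep container logs from filling datanode disks is to cap YARN's log retention in yarn-site.xml. The property names below are standard Hadoop 2.x settings; the values are illustrative examples only.]

```xml
<!-- yarn-site.xml excerpt: bound how long container logs are kept. -->
<property>
  <name>yarn.nodemanager.log.retain-seconds</name>
  <value>10800</value> <!-- keep local container logs for 3 hours -->
</property>
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value> <!-- move finished-job logs off the nodemanagers into HDFS -->
</property>
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>86400</value> <!-- delete aggregated logs after 1 day -->
</property>
```

Note that yarn.nodemanager.log.retain-seconds only applies when log aggregation is disabled; with aggregation on, the retain-seconds setting for aggregated logs governs cleanup.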

In any case, a Hadoop cluster always logs more than Nutch does, so Nutch logging is the least of your problems.

M.
  -----Original message-----
From:shubham.gupta <[email protected]>
Sent: Tuesday 6th September 2016 6:57
To: [email protected]
Subject: Re: Application creating huge amount of logs : Nutch 2.3.1 + Hadoop 2.7.1

Hey,

I have changed the user.log_retain size to 10 MB, but it is still creating a
huge volume of logs. This leads to the failure of the datanode and the job
fails. And if the logs are deleted periodically, then the fetch phase
takes a very long time and it is uncertain whether it will complete at all.
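[Editor's note: rather than deleting logs periodically, Hadoop 2.x can cap the size of each task's log directly. The property below exists in mapred-default.xml; the value is an example, and whether it addresses this particular growth depends on whether the logs are task userlogs at all.]

```xml
<!-- mapred-site.xml excerpt: limit per-task stdout/stderr/syslog size. -->
<property>
  <name>mapreduce.task.userlog.limit.kb</name>
  <value>10240</value> <!-- keep at most ~10 MB of userlog per task attempt; 0 = unlimited -->
</property>
```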

Shubham Gupta

On Wednesday 24 August 2016 05:20 PM, Markus Jelsma wrote:
If it is Nutch logging, change its level in conf/log4j.properties. It can also be Hadoop logging.
M.
   -----Original message-----
From:shubham.gupta <[email protected]>
Sent: Tuesday 23rd August 2016 8:15
To: [email protected]
Subject: Application creating huge amount of logs : Nutch 2.3.1 + Hadoop 2.7.1

Hey

I have integrated Nutch 2.3.1 with Hadoop 2.7.1; the fetcher.parse property is set to TRUE and the database used is MongoDB. While the Nutch map job runs, it creates node logs over 13 GB in size, and the cause of such a huge amount of log files is unknown. Any suggestion would
help.

Thanks in advance.

Shubham Gupta



