Hi Hany,
Yes, the parameter is set to 1000 MB (roughly 1 GB) by default, but it should 
also be noted that this configuration key has been deprecated for some time. 
Since we are using the 'new' MapReduce API, I suspect we should use 
`mapreduce.map.java.opts` and `mapreduce.reduce.java.opts` instead, so that is 
something we need to update.
Can you please provide a patch for this and submit it against the 1.x branch?

Now, to answer your question: these configuration parameters let you tune the 
heap size of the child JVMs for map and reduce tasks respectively. In the 
context of Nutch this might be useful if certain crawl phases, e.g. parsing, 
consume more heap memory. The right values will ultimately be crawl-specific.
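
As a rough sketch of what per-phase tuning could look like in a bash script 
(the function name, variable names and heap values below are made up for 
illustration and are not taken from the crawl script; only the two property 
names come from the discussion above):

```shell
#!/usr/bin/env bash

# Hypothetical helper: builds the -D options for the map and reduce child JVMs.
# $1 = max heap for map tasks, $2 = max heap for reduce tasks (e.g. 1000m)
heap_options() {
  local map_heap="$1"
  local reduce_heap="$2"
  echo "-D mapreduce.map.java.opts=-Xmx${map_heap} -D mapreduce.reduce.java.opts=-Xmx${reduce_heap}"
}

# Same heap size as the current default, expressed with the newer keys.
defaultHeap="$(heap_options 1000m 1000m)"

# Illustrative only: give map tasks more heap for a memory-hungry phase
# such as parsing. The right figure depends on the crawl.
parseHeap="$(heap_options 2000m 1000m)"

echo "default: $defaultHeap"
echo "parse:   $parseHeap"
```

The resulting string would then be passed to the relevant job together with 
the other -D options, so only the phases that need it get the larger heap.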
HTH
Lewis

On 2018/12/07 14:08:59, hany.n...@hsbc.com wrote: 
> Hello,
> 
> While checking the Nutch (1.15) crawl bash file, I noticed at line 211 that 
> 1000MB is statically set for Java: mapred.child.java.opts=-Xmx1000m
> 
> Any idea why? Can I change it? What will be the impact?
> Kind regards,
> Hany Shehata
> Enterprise Engineer
> Green Six Sigma Certified
> Solutions Architect, Marketing and Communications IT
> Corporate Functions | HSBC Operations, Services and Technology (HOST)
> ul. Kapelanka 42A, 30-347 Kraków, Poland
> __________________________________________________________________
> 
> Tie line: 7148 7689 4698
> External: +48 123 42 0698
> Mobile: +48 723 680 278
> E-mail: hany.n...@hsbc.com<mailto:hany.n...@hsbc.com>
> __________________________________________________________________
