[ https://issues.apache.org/jira/browse/NUTCH-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated NUTCH-1452:
----------------------------------------
Fix Version/s: 2.2, 1.7
> hadoop.job.history.user.location in nutch-default making job history useless
> ----------------------------------------------------------------------------
>
> Key: NUTCH-1452
> URL: https://issues.apache.org/jira/browse/NUTCH-1452
> Project: Nutch
> Issue Type: Bug
> Reporter: Ferdy Galema
> Fix For: 1.7, 2.2
>
>
> nutch-default still contains the property 'hadoop.job.history.user.location',
> which redirects the creation of history files from job output locations to a
> custom location. I noticed that the current value does not work well with
> Cloudera (I have tested CDH3u4), because ${hadoop.log.dir} is not defined.
> As a result, the job shows empty info in the jobtracker (with 'incomplete'
> job status). This happens only once the job moves to 'retired'; while it is
> still in 'completed', everything looks fine.
> This property can be set to 'none', because the job history is ALSO stored in
> the central jobtracker location anyway. The
> 'hadoop.job.history.user.location' property specifies an extra location. But
> if it is set to an invalid value, it seems to prevent the central history
> location from storing the history as well. For more details, see:
> http://hadoop.apache.org/common/docs/r1.0.3/cluster_setup.html
> Besides setting it to 'none', another option is to set it to 'history', which
> does work with CDH. (This writes all logs to 'history' in the user directory
> in the configured filesystem, usually DFS.) The final option is to simply
> remove this property and not meddle with Hadoop properties at all. But that
> requires all jobs to correctly ignore these files. I am not up to
> date on how well this currently works with Nutch jobs. This question is most
> relevant for trunk, since trunk relies heavily on the filesystem for jobs.
> What do you think?
> A) Set property to 'none'
> B) Set property to 'history'
> C) Remove property, see what happens, possibly fix jobs
> D) ?
> For now, I opt for A. But I think we need some more input from users of other
> distributions (for example official Hadoop 1.x) and also of Nutch trunk.
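
For reference, option A from the list above could look roughly like the following property entry, either in nutch-default.xml itself or as an override in nutch-site.xml. This is only a sketch: the property name and the value 'none' are taken from the issue description, while the description text is an assumption and not the wording of any committed patch.

```xml
<!-- Sketch of option A. The property name and the value 'none' come from the
     issue description; the <description> text is an assumption. -->
<property>
  <name>hadoop.job.history.user.location</name>
  <value>none</value>
  <description>Disables the extra per-user job history location. The job
  history is still kept in the central jobtracker location.</description>
</property>
```

Option B would use the same entry with the value 'history' instead of 'none'.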