[
https://issues.apache.org/jira/browse/NUTCH-881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081639#comment-13081639
]
Lewis John McGibbney commented on NUTCH-881:
--------------------------------------------
In Nutch trunk we currently only have the wiki as a repository for any Nutch
2.0 information. Is this satisfactory?
As far as I can tell, the documentation for Gora_trunk is produced using Apache
Forrest. I am reasonably familiar with using Forrest and it would be a great
benefit, as well as lessening the burden upon mailing lists, if we could
maintain a clean distribution of documentation bundled nicely into a
/trunk/docs and/or branch-1.4/docs directory from now on and for all future
official releases.
I think the only addition to the documentation we require on the website is a
formal tutorial (available as part of the Apache Nutch website), which we need
to add to /site resources and which we could maintain and direct users to as a
one-stop resource for Nutch branch/tags, then similarly a separate resource for
trunk.
With specific reference to Nutch trunk: by comparison, the Gora team has
provided a quick-start guide followed by a more in-depth tutorial, an approach
which we could apply to both branch-1.4 and 2.0 trunk. The quick-start guide
would only show users how to get trunk up and running, then the formal tutorial
would provide in-depth documentation on completing a crawl with either Nutch
1.4 or trunk 2.0. Does this sound reasonable?
Andrzej provided some good comments in the correspondence on NUTCH-881 which
should be addressed within any comprehensive documentation. I am very happy,
and pretty keen, to get this issue resolved, but I think we need to agree on
the specific tasks which need to be addressed, basically laying the path for
everything this issue encompasses.
> Good quality documentation for Nutch
> ------------------------------------
>
> Key: NUTCH-881
> URL: https://issues.apache.org/jira/browse/NUTCH-881
> Project: Nutch
> Issue Type: Improvement
> Components: documentation
> Affects Versions: 2.0
> Reporter: Andrzej Bialecki
>
> This is, and has been, a long standing request from Nutch users. This becomes
> an acute need as we redesign Nutch 2.0, because the collective knowledge and
> the Wiki will no longer be useful without a massive amount of editing.
> IMHO the reference documentation should be in SVN, and not on the Wiki - the
> Wiki is good for casual information and recipes but I think it's too messy
> and not reliable enough as a reference.
> I propose to start with the following:
> 1. let's decide on the format of the docs. Each format has its own pros and
> cons:
> * HTML: easy to work with, but formatting may be messy unless we edit it by
> hand, at which point it's no longer so easy... Good toolchains to convert to
> other formats, but limited expressiveness of larger structures (e.g. book,
> chapters, TOC, multi-column layouts, etc).
> * Docbook: learning curve is higher, but not insurmountable... Naturally
> yields very good structure. Figures/diagrams may be problematic - different
> renderers (html, pdf) like to treat the scaling and placing somewhat
> differently.
> * Wiki-style (Confluence or TWiki): easy to use, but limited control over
> larger structures. Maven Doxia can format cwiki, twiki, and a host of other
> formats to e.g. html and pdf.
> * other?
> 2. start documenting the main tools and the main APIs (e.g. the plugins and
> all the extension points). We can of course reuse material from the Wiki and
> from various presentations (e.g. the ApacheCon slides).