[ https://issues.apache.org/jira/browse/FOR-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Williams updated FOR-677:
-----------------------------
Fix Version/s: 0.10 (was: 0.9-dev)
Moving to next release.
> leading slash in gathered URIs causes double the number of links to be
> processed
> --------------------------------------------------------------------------------
>
> Key: FOR-677
> URL: https://issues.apache.org/jira/browse/FOR-677
> Project: Forrest
> Issue Type: Bug
> Components: Core operations
> Affects Versions: 0.7, 0.8
> Reporter: David Crossley
> Fix For: 0.10
>
>
> Doing 'forrest' starts at the virtual document called linkmap.html where the
> Cocoon crawler gathers the initial set of links, then starts crawling and
> generating pages. Any new links are pushed onto the linkmap. However, for
> some sites, such as our own "seed-sample" and our "site-author", there is a
> sudden jump in the number of URIs remaining to be processed.
> This is due to a URI with a leading slash (e.g. /samples/faq.html). When that
> URI is processed, it gains a whole new set of links all with leading slashes,
> and so the list of URIs is potentially doubled.
> This issue could be due to a user error, i.e. adding a link that deliberately
> begins with a slash. Sometimes, that is unavoidable.
> However, we do have a sitemap transformer to "relativize" and "absolutize"
> the links. Should it always trim the leading slash? Or are there cases where
> that should not happen, so that we cannot generalise?
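
To make the doubling described above concrete, here is a minimal sketch in Python. It is not Forrest or Cocoon code: the `crawl` and `normalize` functions, the `SITE` link table, and the rule that links found on a page requested with a leading slash inherit that slash are all assumptions made for illustration, based only on the behaviour reported in this issue.

```python
from collections import deque

# Hypothetical site: maps a slash-free URI to the hrefs found on that page.
SITE = {
    "linkmap.html": ["index.html", "samples/index.html"],
    "index.html": ["samples/index.html"],
    "samples/index.html": ["index.html", "/samples/faq.html"],  # one stray slash
    "samples/faq.html": ["index.html", "samples/index.html"],
}


def normalize(uri: str) -> str:
    """Trim leading slashes so '/a/b.html' and 'a/b.html' share one key."""
    return uri.lstrip("/")


def crawl(start, links_of, trim_leading_slash):
    """Breadth-first crawl; returns the URIs in the order they were processed.

    Assumption: links gathered from a page that was requested with a leading
    slash come back with a leading slash too, as the issue describes.
    """
    seen = {start}
    queue = deque([start])
    processed = []
    while queue:
        uri = queue.popleft()
        processed.append(uri)
        prefix = "/" if uri.startswith("/") else ""
        for href in links_of.get(uri.lstrip("/"), []):
            # An href that already has a slash keeps it; others inherit the prefix.
            link = href if href.startswith("/") else prefix + href
            if trim_leading_slash:
                link = normalize(link)
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return processed


untrimmed = crawl("linkmap.html", SITE, trim_leading_slash=False)
trimmed = crawl("linkmap.html", SITE, trim_leading_slash=True)
print(len(untrimmed), untrimmed)
print(len(trimmed), trimmed)
```

In this toy site the single link to /samples/faq.html causes index.html and samples/index.html to be queued a second time in their slash-prefixed forms, while the trimmed run visits each page once. Whether trimming is always safe is exactly the open question in the issue; the sketch assumes every URI is relative to the site root, which may not hold for every site.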
--
This message is automatically generated by JIRA.