Hello,

Since this is my first post to this list, on which I've been lurking for four 
months, permit me to introduce myself: my name is Florian, I have done some 
work on the TM4J project (http://www.tm4j.org), and I'm currently working on 
integrating TM4J with Cocoon in order to generate web sites from XML Topic 
Maps.

Everything I've done so far works nicely in a servlet environment. What I'm 
trying to do now is use the Cocoon CLI to generate the web site structure 
offline.
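
For reference, this is roughly how I'm driving the CLI. The snippet below is a 
trimmed sketch of my cli.xconf; I've reproduced the element and attribute 
names from the sample file shipped with 2.1 as best I can remember them, so 
please treat them as approximate rather than authoritative:

    <cocoon verbose="true" follow-links="true">
      <context-dir>build/webapp</context-dir>
      <dest-dir>dest</dest-dir>
      <broken-links type="text" file="brokenlinks.txt"/>
      <!-- the start URI from which links are supposed to be crawled -->
      <uri src="index.html"/>
    </cocoon>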

Now I'm confronted with the LinkGatherer throwing numerous NPEs, which causes 
offline generation to stop immediately after processing the initial target 
URL. Effectively, it does not do any link crawling at all. This appears to be 
a known problem, as it was mentioned in an earlier post on the dev list[1], 
but it does not seem to have been fixed. There was also some debate in a 
thread started on 7-30 on this list[2], but it didn't come to any conclusion.

To track this problem down a little further, I tried using the LinkGatherer 
in the sitemap and accessing the link list from the servlet environment:

<!-- ... -->
    <map:transformer name="links"
                     src="org.apache.cocoon.sitemap.LinkGatherer"/>
<!-- ... -->
    <map:pipeline id="links">
      <map:match pattern="links">
        <map:generate type="file" src="src/fgh.xtm"/>
        <map:transform type="xslt" src="xsl/tm.xsl">
          <!-- lots of parameters here -->
        </map:transform>
        <map:transform type="links"/>
        <map:serialize type="xml"/>
      </map:match>
    </map:pipeline>
<!-- ... -->

The result of the XSLT transformation is an XHTML document containing a 
number of src and href attributes. The outcome: the LinkGatherer throws a 
NullPointerException (line numbers refer to the 2.1 release):

Original Exception: java.lang.NullPointerException
        at org.apache.cocoon.sitemap.LinkGatherer.simpleLink(LinkGatherer.java:85)
        at org.apache.cocoon.xml.xlink.XLinkPipe.startElement(XLinkPipe.java:124)
        at org.apache.cocoon.xml.xlink.ExtendedXLinkPipe.startElement(ExtendedXLinkPipe.java:132)
        at org.apache.cocoon.components.sax.XMLTeePipe.startElement(XMLTeePipe.java:118)
        at org.apache.cocoon.components.sax.XMLByteStreamInterpreter.parse(XMLByteStreamInterpreter.java:134)
        at org.apache.cocoon.components.sax.XMLByteStreamInterpreter.deserialize(XMLByteStreamInterpreter.java:110)
        at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:270)
        at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:492)
        at [...]

This appears to be the same problem that occurs when using the CLI. I also 
tried a different approach, using LinkSerializer:

    <map:pipeline id="links">
      <map:match pattern="links">
        <map:generate type="file" src="src/fgh.xtm"/>
        <map:transform type="xslt" src="xsl/tm.xsl">
          <!-- lots of parameters here -->
        </map:transform>
        <map:serialize type="links"/>
      </map:match>
    </map:pipeline>
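
(For completeness: "links" here refers to the LinkSerializer declaration from 
the root sitemap, which, if I'm reading it correctly, looks roughly like the 
following; I haven't changed it:)

    <map:serializer name="links"
                    mime-type="application/x-cocoon-links"
                    src="org.apache.cocoon.serialization.LinkSerializer"/>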

Now it's getting a bit bizarre: the LinkSerializer correctly outputs the very 
first link in the result document (a link to a CSS stylesheet) but ignores all 
the others (the numerous <a href="..."> elements, for example).

It's probably worth mentioning that this behavior appears to be limited to 
the subsitemap mounted from my ~/public_html directory; I've been unable to 
reproduce it with the rest of the Cocoon samples. The Tomcat process which 
serves as Cocoon's servlet environment runs as root, so I think we can rule 
out file permission issues.
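
In case it matters, the subsitemap is mounted from the root sitemap roughly 
like this (the pattern and paths are only an approximation of my setup):

    <map:match pattern="~florian/**">
      <map:mount uri-prefix="~florian/"
                 src="/home/florian/public_html/"
                 check-reload="yes"/>
    </map:match>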

[Side note, something else which baffled me: shouldn't <map:views> defined in 
the root sitemap be available to all subsitemaps unless overridden? For my 
part, I can't use the content, pretty-content, or links views unless I 
redeclare them in my subsitemap (see the snippet below). Serializers, 
transformers, generators, and apparently everything else defined in the root 
sitemap do work in the subsitemap, though.]
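
For the record, the redeclaration that makes the links view usable in my 
subsitemap looks roughly like this (copied, as far as I can tell, from the 
root sitemap):

    <map:views>
      <map:view name="links" from-position="last">
        <map:serialize type="links"/>
      </map:view>
    </map:views>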

All problems mentioned occur both in the 2.1 release and in a CVS checkout as 
of yesterday (8-29).

Now, my questions:
1. Could someone give me a clue as to what might be causing the NPEs in the 
LinkGatherer?
2. Where can I find documentation on configuration parameters for the 
LinkGatherer, if any exist?
3. What might be causing this strange behavior of the LinkSerializer?
4. Provided I get the links view working, how can I configure the CLI to use 
"the old way" instead of the LinkGatherer? It's been mentioned[3] that this is 
possible, yet I haven't found out how to do so.

I'd greatly appreciate any hints which may point me in the right direction.

Thanks and best regards,

Florian



[1]
On Saturday 17 May 2003 23:45, Upayavira wrote:
| > java.lang.NullPointerException
| >         at
| >         org.apache.cocoon.environment.AbstractEnvironment.release(Abst
| >         ractEnvironment.java:511) at
|
| Bleurgh. Don't know where to start on that one. But lets look at the above
| and see if that helps.

[2]
On Wednesday 30 July 2003 01:20, Maik Dobryn wrote:
| I want to use the Cocoon 2.1m3 command line interface to generate an
| offline version of my project site.
| The online version works fine but the offline version includes some broken
| links (nevertheless, brokenlinks.xml is still empty).

[3]
On Saturday 17 May 2003 23:45, Upayavira wrote:
| Just to note - you can still use the old method with the CLI (i.e.
| requesting each page 3 times), the option is still there to do it just the
| same as it was. In fact, I believe that is the default behaviour.
