On Sat, Aug 30, 2003 at 09:27:17AM +0200, Florian G. Haas wrote:
> Hello,
>
> since this is my first post to this list, on which I've been lurking for four
> months, permit me to introduce myself: My name is Florian, I have done some
> work on the TM4J project (http://www.tm4j.org), and I'm currently working on
> integrating TM4J with Cocoon for the purpose of generating web sites from XML
> Topic Maps.

Cool :) If you come up with something generic and useful, we'd be very
interested in including it in Forrest[1]. Kal Ahmed once brought up the idea
of using TMs in Forrest when the project was just starting:

http://marc.theaimsgroup.com/?t=101647336300007&r=1&w=2

Forrest is just now getting to the stage of needing an internal metadata
model (TMs or RDF).

> Everything I've done so far works nicely in a servlet environment. What I'm
> trying to do now is use the Cocoon CLI to generate the web site structure
> offline.
>
> Now, I'm confronted with the LinkGatherer throwing numerous NPEs in the
> process, which is why offline generation stops processing immediately after
> the initial target URL. Effectively, it does not do any link crawling at all.
> This appears to be a known problem as it's been mentioned in an earlier post
> on the dev list[1], but seems not to have been fixed. There was also some
> debate in a thread started on 7-30 on this list[2], but it didn't come to any
> conclusion.
>
> Trying to track this problem down a little further, I have tried using the
> LinkGatherer in the sitemap and access the link list from the servlet
> environment:
>
> <!-- ... -->
> <map:transformer name="links"
>     src="org.apache.cocoon.sitemap.LinkGatherer"/>
> <!-- ... -->
> <map:pipeline id="links">
>   <map:match pattern="links">
>     <map:generate type="file" src="src/fgh.xtm"/>
>     <map:transform type="xslt" src="xsl/tm.xsl">
>       <!-- lots of parameters here -->
>     </map:transform>
>     <map:transform type="links"/>

I don't think you're meant to be explicitly adding the LinkGatherer. It will
be added automatically to the pipeline, just before the serializer. And then,
only when you are using the 'new' CLI implementation (passing
--xconf=cli.xconf to org.apache.cocoon.Main).

>     <map:serialize type="xml"/>
>   </map:match>
> </map:pipeline>
> <!-- ... -->
....
> This appears to be the same problem that occurs when using the CLI.
> I also tried a different approach, using LinkSerializer:
>
> <map:pipeline id="links">
>   <map:match pattern="links">
>     <map:generate type="file" src="src/fgh.xtm"/>
>     <map:transform type="xslt" src="xsl/tm.xsl">
>       <!-- lots of parameters here -->
>     </map:transform>
>     <map:serialize type="links"/>
>   </map:match>
> </map:pipeline>

FYI, LinkSerializer is usually used in conjunction with a 'links' view. Eg.
from Forrest:

<map:views>
  <map:view name="links" from-position="last">
    <map:transform src="resources/stylesheets/filterlinks.xsl">
      <map:parameter name="ctxbasedir" value="{realpath:.}/"/>
    </map:transform>
    <map:serialize type="links"/>
  </map:view>
</map:views>

And then, this is only invoked if you are using the old CLI implementation
(*not* using cli.xconf).

Btw, Forrest's Ant script has examples of both the new (linkgatherer) and old
(link view) CLI usages in its script:

http://cvs.apache.org/viewcvs.cgi/xml-forrest/src/resources/forrest-shbat/forrest.build.xml?rev=1.82&content-type=text/vnd.viewcvs-markup

> Now it's getting a bit bizarre: The LinkSerializer correctly outputs the very
> first link in the result document (a link to a CSS stylesheet), but ignores
> all others (numerous <a href="..."> elements, for example).

Hmm, not sure. Is your XML using namespaces? But in any case, the
LinkSerializer wouldn't do much good all on its lonesome in a pipeline; needs
to be in a view.

> I guess it's worth mentioning that this behavior appears to be limited to the
> mounted subsitemap in my ~/public_html directory, I've been unable to produce
> this behavior for the rest of the Cocoon samples. The Tomcat process which
> serves as Cocoon's servlet environment runs as root, so I guess we can rule
> out file permission-related issues.
>
> [Side note, something else which baffled me: Shouldn't <map:views> defined in
> the root sitemap be available to all subsitemaps unless overridden?
> I for my part can't use neither the content, nor pretty-content, nor links
> view unless redeclaring it in my subsitemap. Serializers, transformers, and
> generators, and apparently everything else defined in the root sitemap works
> in the subsitemap, though.]

Works fine for me in Forrest:

http://localhost:8888/body-index.html?cocoon-view=links

What does your link view definition look like?

> All problems mentioned occur both in the 2.1 release and in a CVS checkout as
> of yesterday (8-29).
>
> Now, my questions:
> 1. Could someone give me a clue as to what may be causing the NPEs for the
>    LinkGatherer?
> 2. Where can I find documentation on configuration parameters for the
>    LinkGatherer, if any?
> 3. What might be causing this strange behaviour for the LinkSerializer?
> 4. How, provided I get the links view working, can I configure the CLI to use
>    "the old way" instead of the LinkGatherer? It's been mentioned[3] that this
>    is possible, yet I haven't found out how to do so.

You might want to check Forrest out of CVS and use its
sitemap/cli.xconf/ant script as reference. Currently we're using the new CLI
(twice as fast as the old'un), but support for the old one is still in
forrest.build.xml (commented out), if you want to experiment with that.

--Jeff

[1] http://xml.apache.org/forrest/

> I'd greatly appreciate any hints which may point me in the right direction.
>
> Thanks and best regards,
>
> Florian
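
For reference, the two points made in this reply (no explicit LinkGatherer
step, and a 'links' view for the LinkSerializer) could be combined in a
subsitemap roughly as follows. This is only a minimal sketch, not a tested
configuration: it reuses the file names from the quoted sitemap
(src/fgh.xtm, xsl/tm.xsl), while the match pattern, the html serializer, and
the decision to redeclare the view locally are assumptions made for
illustration. It also assumes the 'links' and 'html' serializers are
inherited from the root sitemap.

```xml
<?xml version="1.0"?>
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">

  <!-- Assumption: the view is redeclared here because the root sitemap's
       views were reported as not being visible in the subsitemap. -->
  <map:views>
    <map:view name="links" from-position="last">
      <!-- The LinkSerializer lives in the view, not in the pipeline. -->
      <map:serialize type="links"/>
    </map:view>
  </map:views>

  <map:pipelines>
    <map:pipeline>
      <!-- Hypothetical match pattern, chosen for illustration only. -->
      <map:match pattern="index.html">
        <map:generate type="file" src="src/fgh.xtm"/>
        <map:transform type="xslt" src="xsl/tm.xsl">
          <!-- stylesheet parameters omitted -->
        </map:transform>
        <!-- No explicit LinkGatherer transform: per the reply above, the
             new CLI adds it automatically just before the serializer. -->
        <map:serialize type="html"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>

</map:sitemap>
```

With something like this in place, requesting the page with
?cocoon-view=links appended (as in the Forrest URL above) should show what
the old CLI would crawl. The new CLI (--xconf=cli.xconf) would not consult
the view and would rely on the automatically added LinkGatherer instead.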
