Hello Upayavira,...
to you too, thanks a lot for the quick reply.
On Saturday 30 August 2003 12:59, Upayavira wrote:
| Can you supply full stack traces for some of these? I'll see if I can
| find out what's going on.
OK. This is from the 2.1 release, full cocoon.sh output.
java.lang.NullPointerException
at java.io.Reader.<init>(Reader.java:61)
at java.io.InputStreamReader.<init>(InputStreamReader.java:55)
at com.hp.hpl.deli.Workspace.getResource(Workspace.java:620)
at com.hp.hpl.deli.VocabularyConfig.<init>(VocabularyConfig.java:137)
at com.hp.hpl.deli.Vocabulary.<init>(Vocabulary.java:66)
at com.hp.hpl.deli.Workspace$1$CreateWorkspace.<init>(Workspace.java:331)
at com.hp.hpl.deli.Workspace.configure(Workspace.java:498)
at org.apache.cocoon.components.deli.DeliImpl.initialize(DeliImpl.java:148)
at
This is in fact a problem initialising the Deli block. I don't know much about it, and I can't really explain why it should fail in the CLI but not in the servlet, but I'm pretty sure I've seen this before. It is a workaround, not a solution, but if you rebuild Cocoon excluding the Deli block, you'll get rid of this exception. Maybe I should add 'avoid deli' to the CLI docs :-(
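A note on the two topmost frames of the trace: they are what you would see if a null InputStream were handed to the InputStreamReader constructor, for example because a resource lookup found nothing. The sketch below only reproduces that JDK behaviour in isolation. The class name and the resource path are made up for illustration, and it says nothing about why Deli's Workspace.getResource would end up with a null stream under the CLI.

```java
import java.io.InputStream;
import java.io.InputStreamReader;

// Hypothetical demo class, not part of Cocoon or Deli.
class NullStreamDemo {

    // Reports what happens when a reader is built on top of the given stream.
    static String describe(InputStream in) {
        try {
            new InputStreamReader(in);
            return "ok";
        } catch (NullPointerException e) {
            // The Reader superclass constructor rejects a null argument.
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        // getResourceAsStream returns null (it does not throw) when the
        // resource cannot be found. The path here is an invented example.
        InputStream missing =
            NullStreamDemo.class.getResourceAsStream("/no/such/resource.xml");
        System.out.println("stream is null: " + (missing == null));
        System.out.println("reader construction: " + describe(missing));
    }
}
```

If this is what happens inside Deli, the thing to look for would be a configuration or vocabulary file that can be resolved in the servlet environment but not from the CLI's working directory or classpath. That is a guess, not something the trace proves.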
Setting the logkit level to DEBUG yields these interesting results in sitemap.log:
DEBUG (2003-08-30) 14:03.21:692 [sitemap.generator.file] (Unknown-URI) Unknown-thread/FileGenerator: processing file src/fgh.xtm
DEBUG (2003-08-30) 14:03.21:692 [sitemap.generator.file] (Unknown-URI) Unknown-thread/FileGenerator: file resolved to file:/home/fgh/public_html/src/fgh.xtm
DEBUG (2003-08-30) 14:03.22:713 [sitemap] (Unknown-URI) Unknown-thread/ExtendedXLinkPipe: Transforming to XLink: URI=http://www.w3.org/1999/xhtml NAME=link RAW=link ATT=href NS=http://www.w3.org/1999/xhtml VALUE=../css/tm4web.css
DEBUG (2003-08-30) 14:03.23:928 [sitemap] (Unknown-URI) Unknown-thread/ExtendedXLinkPipe: Transforming to XLink: URI=http://www.w3.org/1999/xhtml NAME=a RAW=a ATT=href NS=http://www.w3.org/1999/xhtml VALUE=mailto:[EMAIL PROTECTED]
DEBUG (2003-08-30) 14:03.23:937 [sitemap] (Unknown-URI) Unknown-thread/ExtendedXLinkPipe: Transforming to XLink: URI=http://www.w3.org/1999/xhtml NAME=a RAW=a ATT=href NS=http://www.w3.org/1999/xhtml VALUE=http://validator.w3.org/check/referer
DEBUG (2003-08-30) 14:03.23:938 [sitemap] (Unknown-URI) Unknown-thread/ExtendedXLinkPipe: Transforming to XLink: URI=http://www.w3.org/1999/xhtml NAME=img RAW=img ATT=src NS=http://www.w3.org/1999/xhtml VALUE=http://www.w3.org/Icons/valid-xhtml10
DEBUG (2003-08-30) 14:03.23:939 [sitemap] (Unknown-URI) Unknown-thread/ExtendedXLinkPipe: Transforming to XLink: URI=http://www.w3.org/1999/xhtml NAME=a RAW=a ATT=href NS=http://www.w3.org/1999/xhtml VALUE=http://jigsaw.w3.org/css-validator
DEBUG (2003-08-30) 14:03.23:940 [sitemap] (Unknown-URI) Unknown-thread/ExtendedXLinkPipe: Transforming to XLink: URI=http://www.w3.org/1999/xhtml NAME=img RAW=img ATT=src NS=http://www.w3.org/1999/xhtml VALUE=http://jigsaw.w3.org/css-validator/images/vcss.gif
As pointed out in my earlier reply to Jeff, the result document contains 12 links. Why is ExtendedXLinkPipe apparently resolving only 6?
Can you post some of the document that you're scanning? Some with links that are found and some with links that aren't?
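One way to cross-check the count independently of Cocoon is to run the generated page through a plain SAX parser and list every href and src attribute. The sketch below does that. It is a stand-alone illustration under my own assumptions (the LinkCounter class, the collect method, and the sample markup are invented), not the logic ExtendedXLinkPipe actually uses, which may well apply further conditions before it treats an attribute as a link.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Cocoon: lists href/src attributes.
class LinkCounter extends DefaultHandler {

    final List<String> links = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        for (int i = 0; i < atts.getLength(); i++) {
            String name = atts.getLocalName(i);
            if ("href".equals(name) || "src".equals(name)) {
                // Record the element, the attribute and its value.
                links.add(qName + "@" + name + "=" + atts.getValue(i));
            }
        }
    }

    // Parses a well-formed XML string and returns the collected entries.
    static List<String> collect(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        LinkCounter handler = new LinkCounter();
        factory.newSAXParser()
               .parse(new InputSource(new StringReader(xml)), handler);
        return handler.links;
    }

    public static void main(String[] args) throws Exception {
        // Invented sample markup; substitute the real serialized page.
        String page = "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head>"
            + "<link href=\"../css/tm4web.css\"/></head><body>"
            + "<a href=\"http://validator.w3.org/check/referer\">"
            + "<img src=\"http://www.w3.org/Icons/valid-xhtml10\"/></a>"
            + "</body></html>";
        for (String link : collect(page)) {
            System.out.println(link);
        }
    }
}
```

Comparing this list with the ExtendedXLinkPipe lines in sitemap.log should show which of the 12 links never reach the pipe, for instance whether the missing ones sit in a different namespace or use a different attribute.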
...
Both extend the ExtendedXLinkPipe, so they're using the same method to extract links.
| >3. What might be causing this strange behaviour for the LinkSerializer?
|
| Have you looked at the content of the page you generate if you replace
| the link serialiser with an XML one?
Yes, of course. XML and HTML. Both work fine. As I said, the contents of the starting URL are rendered just fine even when using the CLI. It's just that no links are ever crawled. follow-links is of course set to "true" in cli.xconf.
| >4. How, provided I get the links view working, can I configure the CLI to
| > use "the old way" instead of the LinkGatherer? It's been mentioned[3]
| > that this is possible, yet I haven't found out how to do so.
|
| You set the CLI to 'confirm-extensions', which also switches it to link
| view mode. I'm planning to add an option to separate out
| confirm-extensions and use-link-view. If it would help you, I'd happily
| add it soon.
Hmmm, when setting confirm-extensions to "true", in sitemap.log I still get the same ExtendedXLinkPipe messages quoted above, and also this one:
DEBUG (2003-08-30) 14:29.15:606 [sitemap] (Unknown-URI) Unknown-thread/ResourceLimitingPool: Put a org.apache.cocoon.sitemap.LinkGatherer back into the pool.
Curious. I'll look into it.
What you can do is improve your link view to use an XSL that simplifies your page down to just the links you want and then run that into the linkSerializer. A hack, but it might at least get you going.
I can't help but surmise that even when confirm-extensions="true", a LinkGatherer is at play somewhere. However, a separate "use-link-view" parameter would be very helpful.
I'll add it.
| If we work together, I think we'll fix this.
Well I hope this helps! :-)
It certainly does. But I've given you a few more assignments above!
Regards, Upayavira
