Hi,

As I wrote yesterday, I successfully crawled the default publication. After deploying my own publication I tried to crawl it and got a null pointer exception. I was close to going nuts when I noticed one difference between the default publication and mine: the missing robots.txt. After writing and saving one and uncommenting the reference to it in crawler-live.xconf (it is commented out in the example at http://lenya.apache.org/1_2_x/components/search/lucene.html), it works!

But I still have some problems, and since you were kind enough to answer my earlier questions, I hope you will keep doing so :-))

I still have the following problem:
After a crawl there are files under mypub/work/search/lucene/htdocs_dump/live, as I defined in crawler-live.xconf, but the crawled files end up in unintended subdirectories lenya/mypub/live/ below that directory.
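
To make the observed layout concrete, here is a small Python sketch of what seems to happen. This is not Lenya code: the function name dump_path and the mapping itself are only my assumption, based on where the files ended up, that the crawler appends the URL path to the dump directory.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def dump_path(dump_dir: str, url: str) -> str:
    """Hypothetical mapping (an assumption, not Lenya's actual code):
    append the path part of the crawled URL to the dump directory."""
    url_path = urlparse(url).path.lstrip("/")  # e.g. "lenya/mypub/live/index.html"
    return str(PurePosixPath(dump_dir) / url_path)


print(dump_path(
    "mypub/work/search/lucene/htdocs_dump/live",
    "http://127.0.0.1:8080/lenya/mypub/live/index.html",
))
```

If that assumption is right, the extra lenya/mypub/live/ directories are simply the URL path being mirrored, and not a mistake in my htdocs-dump-dir setting.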

So I think I have not fully understood crawler-live.xconf.
I want to crawl http://127.0.0.1:8080/lenya/mypub/live, so my crawler-live.xconf looks like this:
 <user-agent>lenya</user-agent>
 <base-url href="http://127.0.0.1:8080/lenya/mypub/live/index.html"/>
 <scope-url href="http://127.0.0.1:8080/lenya/mypub/live/"/>
 <uri-list src="../../work/search/lucene/uris.txt"/>
 <htdocs-dump-dir src="../../work/search/lucene/htdocs_dump/live"/>

<robots src="robots.txt" />
# domain="lenya.apache.org" I commented this attribute out because I don't understand what it does. Should I put something like "127.0.0.1" there?

Next question: does the crawl have to be run in the source tree of the publication, which is then deployed? (That is, crawl the deployed publication, but write the results to the source tree and then deploy again.)

If I change something in my publication (XSLT, xconf, ...), should I make the change in the source and then deploy?

Thanks in advance for your patience and your answers.

Franz
