Hi,
as I wrote yesterday, I successfully crawled the default publication. After
deploying my own publication I tried to crawl it and got a null pointer
exception. Close to going nuts, I finally spotted one difference between the
default publication and mine: the missing robots.txt. After writing and
saving one and uncommenting the reference to it in crawler-live.xconf (it is
commented out in
http://lenya.apache.org/1_2_x/components/search/lucene.html), it works!
But I still have some problems, and since you were kind enough to answer my
questions before, I hope you will keep doing so :-))
The first problem is this:
After a crawl, there are files in the directory
mypub/work/search/lucene/htdocs_dump/live, as I defined in
crawler-live.xconf, but the crawled files end up in (unintended)
subdirectories lenya/mypub/live/ below it.
So I think I did not fully understand crawler-live.xconf.
I want to crawl http://127.0.0.1:8080/lenya/mypub/live, so my
crawler-live.xconf looks like this:
<user-agent>lenya</user-agent>
<base-url href="http://127.0.0.1:8080/lenya/mypub/live/index.html"/>
<scope-url href="http://127.0.0.1:8080/lenya/mypub/live/"/>
<uri-list src="../../work/search/lucene/uris.txt"/>
<htdocs-dump-dir src="../../work/search/lucene/htdocs_dump/live"/>
<robots src="robots.txt" />
# domain="lenya.apache.org" I commented this out because I don't understand
what it does. Should I put something like "127.0.0.1" in there?
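To make the directory layout I am seeing concrete, here is a small Python
sketch of the rule the crawler appears to follow: it seems to append the
whole URL path to the htdocs-dump-dir. This is only my guess from the
observed output, not Lenya's actual code, and the function name dump_path is
hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def dump_path(dump_dir: str, url: str) -> PurePosixPath:
    """Assumed mapping: dump directory + full path component of the URL."""
    # Drop the leading "/" so the URL path is joined below dump_dir
    # instead of replacing it.
    url_path = urlsplit(url).path.lstrip("/")
    return PurePosixPath(dump_dir) / url_path


# Values taken from the crawler-live.xconf above.
print(dump_path(
    "work/search/lucene/htdocs_dump/live",
    "http://127.0.0.1:8080/lenya/mypub/live/index.html",
))
```

If that guess is right, the extra lenya/mypub/live/ level simply mirrors the
URL path and is not caused by a mistake in my configuration.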
Next question: does the crawl have to be run in the source of the
publication, which is then deployed? (That is, crawl the deployed
publication, write the result into the source, and then deploy again.)
And if I change something in my publication (XSLT, xconf, ...), should I
make the change in the source and then deploy?
Thanks in advance for your patience and your answers,
Franz