Hi,

I am a newbie to Nutch. I have only just managed to run the Nutch tutorials.
My requirement is to crawl only .htm files from my intranet, and the crawl
should ignore session IDs in the URLs.

After that I want to put all the crawled URLs in an XML file. I want to write
the URLs in sitemap format, so that the file can later be submitted to Google.
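
To make the goal concrete, here is a rough sketch in plain Java of the output
step I have in mind. It does not use any Nutch API. The class name
SitemapSketch, the helper names, and the example URLs are made up for
illustration, and I am assuming the session ID shows up as a ";jsessionid=..."
path parameter, which may not match every application.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Nutch: turns a list of crawled URLs
// into a sitemap document following the sitemaps.org 0.9 schema.
public class SitemapSketch {

    // Assumption: the session ID is carried as ";jsessionid=..." in the path.
    static String stripSessionId(String url) {
        return url.replaceAll("(?i);jsessionid=[^?#]*", "");
    }

    // Keep only URLs whose path ends in ".htm"; query and fragment are ignored.
    static boolean isHtm(String url) {
        String path = url.split("[?#]", 2)[0];
        return path.toLowerCase().endsWith(".htm");
    }

    // Escape the five characters that are special in XML text.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Build the sitemap as a string: one <url><loc>...</loc></url> per kept URL.
    static String toSitemap(List<String> urls) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String raw : urls) {
            String url = stripSessionId(raw);
            if (!isHtm(url)) {
                continue; // skip anything that is not a .htm page
            }
            sb.append("  <url><loc>").append(escapeXml(url)).append("</loc></url>\n");
        }
        sb.append("</urlset>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        urls.add("http://intranet.example/docs/a.htm;jsessionid=ABC123");
        urls.add("http://intranet.example/docs/b.pdf");
        System.out.print(toSitemap(urls));
    }
}
```

What I do not know is how to get the list of crawled URLs out of Nutch in the
first place, and whether the .htm and session ID filtering is better done
during the crawl rather than afterwards as in this sketch.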

Is there any way I can achieve this? If so, please point me to a solution as
soon as possible.

Please help me.

cheers,
utsavi


-- 
View this message in context: 
http://www.nabble.com/writing-urls-to-xml-files-tf3427891.html#a9554639
Sent from the Nutch - User mailing list archive at Nabble.com.
