Thanks,
It works now: I pass Crawl a folder containing a plain text file with
URLs. I am testing with a single URL.
At some point I get:
050815 162137 parsing \tmp\nutch\mapred\local\job_q3s4ai.xml
050815 162137 parsing file:/C:/workspace/MapRed/conf/nutch-site.xml
java.io.IOException: File already exists:\tmp\nutch\mapred\local\map_pel04v\part-0.out
        at org.apache.nutch.fs.LocalFileSystem.create(LocalFileSystem.java:135)
        at org.apache.nutch.fs.LocalFileSystem.create(LocalFileSystem.java:102)
Fuad
-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Monday, August 15, 2005 4:30 PM
To: [email protected]
Subject: Re: MapRed - Injector - urlDir - Format?
Fuad Efendi wrote:
> Which parameter should I pass to Crawl? It should be directory
> containing smth. in which format?
As before, inject takes flat text files of URLs, one URL per line. If you
wish to inject DMOZ URLs, there is now a utility main() that will
convert the DMOZ file to such a file.
Doug
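
For anyone setting up the same test: the input discussed in this thread is a
directory holding a flat text file of URLs, one per line. A minimal sketch of
preparing such a directory follows. The class name SeedDir, the method
writeSeedFile, the file name urls.txt, the default directory name "urls" and
the example URL are all assumptions made for illustration; none of them come
from Nutch, and the sketch uses only the standard Java library.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class SeedDir {
    // Writes the given URLs, one per line, to <dir>/urls.txt and returns
    // the path of the file. The file name "urls.txt" is a hypothetical
    // choice for this sketch, not something Nutch requires.
    public static Path writeSeedFile(Path dir, List<String> urls) throws IOException {
        Files.createDirectories(dir);          // create the folder if it is missing
        Path file = dir.resolve("urls.txt");
        Files.write(file, urls, StandardCharsets.UTF_8); // one list entry per line
        return file;
    }

    public static void main(String[] args) throws IOException {
        // Directory name comes from the first argument, defaulting to "urls"
        // (an assumed name for illustration).
        Path dir = Paths.get(args.length > 0 ? args[0] : "urls");
        Path file = writeSeedFile(dir, Arrays.asList("http://www.example.com/"));
        System.out.println("Wrote " + file);
    }
}
```

The resulting directory (not the file itself) is what would then be passed as
the URL directory argument, as described above.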