Thanks,

It works now. I pass Crawl a folder containing a plain text file with
URLs. For testing, I pass a single URL.

At some point I get:
050815 162137 parsing \tmp\nutch\mapred\local\job_q3s4ai.xml
050815 162137 parsing file:/C:/workspace/MapRed/conf/nutch-site.xml
java.io.IOException: File already exists:\tmp\nutch\mapred\local\map_pel04v\part-0.out
        at org.apache.nutch.fs.LocalFileSystem.create(LocalFileSystem.java:135)
        at org.apache.nutch.fs.LocalFileSystem.create(LocalFileSystem.java:102)

Fuad

-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED] 
Sent: Monday, August 15, 2005 4:30 PM
To: [email protected]
Subject: Re: MapRed - Injector - urlDir - Format?


Fuad Efendi wrote:
> Which parameter should I pass to Crawl? It should be a directory
> containing something, but in which format?

As before, inject takes a flat text file of URLs, one per line.  If you
wish to inject DMOZ URLs, there is now a utility main() that will
convert the DMOZ file to such a file.
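
For illustration, here is a minimal Java sketch of preparing such an
input: a directory holding one plain text file with one URL per line.
The directory name "urls", the file name "seed.txt", and the class and
method names are hypothetical and not part of Nutch; only the
one-URL-per-line layout comes from the description above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a Nutch class: builds a URL directory by hand.
class SeedFileSketch {

    // Creates urlDir if needed and writes the URLs into fileName, one per line.
    static Path writeSeedFile(Path urlDir, String fileName, List<String> urls)
            throws IOException {
        Files.createDirectories(urlDir);
        Path seedFile = urlDir.resolve(fileName);
        // Files.write emits each list element followed by a line separator.
        Files.write(seedFile, urls, StandardCharsets.UTF_8);
        return seedFile;
    }

    public static void main(String[] args) throws IOException {
        // Assumed names: "urls" and "seed.txt" are placeholders.
        Path seedFile = writeSeedFile(
                Paths.get("urls"),
                "seed.txt",
                Arrays.asList("http://www.example.com/"));
        System.out.println("Wrote " + seedFile.toAbsolutePath());
    }
}
```

The resulting directory (here "urls") is what would then be passed as
the URL directory argument.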

Doug

