Hi Luke,

Thanks for the help.

I ran the command you gave me:

[EMAIL PROTECTED] nutch-0.5]# bin/nutch crawl conf/crawl-urlfilter.txt -dir crawl.test -depth 3 > crawl.log 2>&1 &

but it gives me a similar error:

[EMAIL PROTECTED] nutch-0.5]# cat crawl.log
041026 133046 loading file:/root/install/nutch-0.5/conf/nutch-default.xml
041026 133047 loading file:/root/install/nutch-0.5/conf/crawl-tool.xml
041026 133047 loading file:/root/install/nutch-0.5/conf/nutch-site.xml
041026 133047 crawl started in: crawl.test
041026 133047 rootUrlFile = conf/crawl-urlfilter.txt
041026 133047 threads = 10
041026 133047 depth = 3
Exception in thread "main" java.io.IOException: Invalid argument
        at sun.nio.ch.FileChannelImpl.lock0(Native Method)
        at sun.nio.ch.FileChannelImpl.lock(FileChannelImpl.java:490)
        at net.nutch.db.WebDBWriter.<init>(WebDBWriter.java:1464)
        at net.nutch.db.WebDBWriter.createWebDB(WebDBWriter.java:1424)
        at net.nutch.tools.WebDBAdminTool.main(WebDBAdminTool.java:157)
        at net.nutch.tools.CrawlTool.main(CrawlTool.java:84)
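
The trace shows the exception coming from the FileChannel.lock call made while WebDBWriter is being constructed, not from parsing the root URL file. A minimal sketch for checking whether that kind of lock fails on its own in the directory that holds crawl.test is below. The class name LockCheck and the scratch file name are made up for illustration and are not part of Nutch, and the code assumes a current JDK (it uses try-with-resources) rather than the one behind this log.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class LockCheck {
    /**
     * Tries to take an exclusive lock on a scratch file inside dir.
     * Returns true if the lock was acquired, false if any I/O error occurred.
     */
    public static boolean canLock(File dir) {
        // Hypothetical scratch file; removed again in the finally block.
        File scratch = new File(dir, "lockcheck.tmp");
        try (RandomAccessFile raf = new RandomAccessFile(scratch, "rw");
             FileChannel channel = raf.getChannel();
             FileLock lock = channel.lock()) {
            return lock.isValid();
        } catch (IOException e) {
            // Print the failure so it can be compared with the Nutch trace.
            System.err.println("lock failed in " + dir + ": " + e);
            return false;
        } finally {
            scratch.delete();
        }
    }

    public static void main(String[] args) {
        // Directory to test; defaults to the current directory.
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(dir.getAbsolutePath() + " lockable: " + canLock(dir));
    }
}
```

Running it from the nutch-0.5 directory and again against a directory on a different filesystem (for example `java LockCheck /tmp`) would show whether the lock failure depends on where the crawl directory lives.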

I can't seem to get rid of this error:

Exception in thread "main" java.io.IOException: Invalid argument

Any ideas??

Thanks,
Michael.

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Luke Baker
Sent: 26 October 2004 13:28
To: [EMAIL PROTECTED]
Subject: Re: [Nutch-general] Intranet crawl fails


On 10/26/2004 06:53 AM, Dunson-Odusanya, Michael wrote:
> Can someone help a newbie??!!
> 
> I've set up Nutch according to the tutorial
> http://www.nutch.org/docs/en/tutorial.html
> created a flat file called urls, edited conf/crawl-urlfilter.txt

If that's the file that has the URLs, then you need to change the 
command you ran.  Change:
> [EMAIL PROTECTED] nutch-0.5]# bin/nutch crawl urls -dir crawl.test -depth 3
to
> [EMAIL PROTECTED] nutch-0.5]# bin/nutch crawl conf/crawl-urlfilter.txt -dir crawl.test -depth 3

(All one line of course.)  If you want to see all available arguments or 
figure out what a particular one is, you can just run the command with 
no arguments.
$ bin/nutch crawl
Usage: CrawlTool (-local | -ndfs <nameserver:port>) <root_url_file> [-dir d] [-threads n] [-depth i] [-showThreadID]

Luke


> 
> but when I run the command I get:
> 
> [EMAIL PROTECTED] nutch-0.5]# bin/nutch crawl urls -dir crawl.test -depth 3
> 041026 102539 loading file:/root/install/nutch-0.5/conf/nutch-default.xml
> 041026 102539 loading file:/root/install/nutch-0.5/conf/crawl-tool.xml
> 041026 102539 loading file:/root/install/nutch-0.5/conf/nutch-site.xml
> 041026 102539 crawl started in: crawl.test
> 041026 102539 rootUrlFile = urls
> 041026 102539 threads = 10
> 041026 102539 depth = 3
> Exception in thread "main" java.io.IOException: Invalid argument
>         at sun.nio.ch.FileChannelImpl.lock0(Native Method)
>         at sun.nio.ch.FileChannelImpl.lock(FileChannelImpl.java:490)
>         at net.nutch.db.WebDBWriter.<init>(WebDBWriter.java:1464)
>         at net.nutch.db.WebDBWriter.createWebDB(WebDBWriter.java:1424)
>         at net.nutch.tools.WebDBAdminTool.main(WebDBAdminTool.java:157)
>         at net.nutch.tools.CrawlTool.main(CrawlTool.java:84)
> 
> Can anybody highlight the rookie mistake?
> 
> Many thanks,
> Michael.

