OK, I will update and rebuild the code and try it again.
From: "Dennis Kubes" <[EMAIL PROTECTED]>
Reply-To: [email protected]
To: <[email protected]>
Subject: RE: please help!! It always return 0 hit.
Date: Fri, 7 Apr 2006 12:47:25 -0500
Copying from Hadoop to local and then performing a search on the index is a
question that needs to be posted to the list. My guess would be that you
have an older version of the code and there were some bugs copying crc
files. I think I remember something about that on the list a little while
back. So you might want to update and rebuild your code base.
If you want to do a crawl and search without using Hadoop, follow the Nutch
0.8 tutorial on the website (not the wiki) for a regular crawl. You would also
want to set fs.default.name to local and comment out the rest of the
hadoop-site.xml file options. Also make sure to set searcher.dir in the
nutch-site.xml file in the WEB-INF/classes directory to the absolute path of
the crawl directory, as below.
Dennis
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>local</value>
</property>
<property>
<name>searcher.dir</name>
<value>C:\TESTBED\NUTCH\CRAWLED</value>
</property>
</configuration>
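Before digging further, it can help to confirm that the path given to searcher.dir really points at a complete crawl directory. The following is only a sketch in Python: the function name check_crawl_dir is hypothetical, and the list of expected subdirectories (crawldb, linkdb, segments, indexes) is an assumption about what a Nutch 0.8 crawl leaves behind, so adjust it to match your own crawl output.

```python
import os

# Assumed layout of a crawl directory; adjust to match your own crawl output.
EXPECTED_SUBDIRS = ("crawldb", "linkdb", "segments", "indexes")


def check_crawl_dir(path):
    """Return a list of problems found with the directory used as searcher.dir."""
    problems = []
    if not os.path.isabs(path):
        problems.append("path is not absolute: %s" % path)
    if not os.path.isdir(path):
        problems.append("directory does not exist: %s" % path)
        return problems
    for name in EXPECTED_SUBDIRS:
        if not os.path.isdir(os.path.join(path, name)):
            problems.append("missing subdirectory: %s" % name)
    return problems
```

Calling check_crawl_dir with the value from searcher.dir returns an empty list when nothing looks wrong, and otherwise one short message per problem.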
-----Original Message-----
From: lin yuan [mailto:[EMAIL PROTECTED]
Sent: Friday, April 07, 2006 4:33 AM
To: [email protected]
Subject: please help!! It always return 0 hit.
Hi Dennis,
Following your tutorial
(http://wiki.apache.org/nutch/NutchHadoopTutorial),
I have set up Nutch and Hadoop. So far so good, but when I perform a search,
it always returns 0 hits.
So I wanted to do a search without Hadoop, and used the following command:
bin/hadoop dfs -copyToLocal crawled crawled
It seems that something is wrong. Would you give me some tips to debug it?
I am using Nutch 0.8, revision 392087.
The output was:
060407 172334 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/hadoop-default.xml
060407 172334 parsing file:/nutch/search/conf/hadoop-site.xml
060407 172334 No FS indicated, using default:boxA:9000
060407 172334 Client connection to 127.0.0.1:9000: starting
060407 172335 Problem opening checksum file: /user/nutch/crawled/indexes/part-00000/index.done. Ignoring with exception java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/nutch/crawled/indexes/part-00000/.index.done.crc
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:120)
        at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
.
060407 172335 Problem opening checksum file: /user/nutch/crawled/indexes/part-00001/index.done. Ignoring with exception java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/nutch/crawled/indexes/part-00001/.index.done.crc
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:120)
        at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
.
060407 172335 Problem opening checksum file: /user/nutch/crawled/indexes/part-00002/index.done. Ignoring with exception java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/nutch/crawled/indexes/part-00002/.index.done.crc
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:120)
        at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
.
060407 172335 Problem opening checksum file: /user/nutch/crawled/indexes/part-00003/index.done. Ignoring with exception java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/nutch/crawled/indexes/part-00003/.index.done.crc
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:120)
        at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
.
060407 172335 Problem opening checksum file: /user/nutch/crawled/indexes/part-00004/index.done. Ignoring with exception java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/nutch/crawled/indexes/part-00004/.index.done.crc
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:120)
        at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
.
060407 172335 Problem opening checksum file: /user/nutch/crawled/indexes/part-00005/index.done. Ignoring with exception java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/nutch/crawled/indexes/part-00005/.index.done.crc
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:120)
        at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
.
060407 172335 Problem opening checksum file: /user/nutch/crawled/indexes/part-00006/index.done. Ignoring with exception java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/nutch/crawled/indexes/part-00006/.index.done.crc
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:120)
        at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
.
060407 172335 Problem opening checksum file: /user/nutch/crawled/indexes/part-00007/index.done. Ignoring with exception java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/nutch/crawled/indexes/part-00007/.index.done.crc
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:120)
        at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
.
060407 172335 Problem opening checksum file: /user/nutch/crawled/indexes/part-00008/index.done. Ignoring with exception java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/nutch/crawled/indexes/part-00008/.index.done.crc
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:120)
        at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
.
060407 172335 Problem opening checksum file: /user/nutch/crawled/indexes/part-00009/index.done. Ignoring with exception java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/nutch/crawled/indexes/part-00009/.index.done.crc
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:120)
        at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
.
060407 172335 Problem opening checksum file: /user/nutch/crawled/indexes/part-00010/index.done. Ignoring with exception java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/nutch/crawled/indexes/part-00010/.index.done.crc
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:120)
        at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
.
060407 172335 Problem opening checksum file: /user/nutch/crawled/indexes/part-00011/index.done. Ignoring with exception java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/nutch/crawled/indexes/part-00011/.index.done.crc
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:120)
        at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
.
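To see which files in a local copy lack a checksum companion, a short script can walk the directory produced by -copyToLocal. This is a rough sketch in Python, assuming the naming visible in the log above, where the checksum for index.done is a hidden file named .index.done.crc in the same directory. The function name find_missing_crc is hypothetical.

```python
import os


def find_missing_crc(root):
    """Yield data files under root that have no '.<name>.crc' sibling."""
    for dirpath, _dirnames, filenames in os.walk(root):
        present = set(filenames)
        for name in sorted(filenames):
            # Skip the checksum files themselves.
            if name.startswith(".") and name.endswith(".crc"):
                continue
            if "." + name + ".crc" not in present:
                yield os.path.join(dirpath, name)
```

A missing checksum file is not necessarily an error on its own (the log shows it being ignored), so the result is best read as a list of places to look, not as proof of a broken copy.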
Best regards,
Lin Yuan
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general