Indeed, you are correct. Thanks.
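
For anyone finding this thread later, here is a minimal Python sketch of how I understand the tagging to work. It is not the plugin's actual code. It assumes a URL is matched by plain substring against each whitelist and blacklist entry, that the config root element is <subcollections>, and that the index field to query is named subcollection, as suggested in the reply quoted below. The helper names (load_subcollections, subcollections_for, build_query) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed version of the config quoted below
# (every <id> is closed by </id>). The root element name is an assumption.
SUBCOLLECTIONS_XML = """
<subcollections>
  <subcollection>
    <name>wiki</name>
    <id>wiki</id>
    <whitelist>http://www.philadelphiariders.com/wiki</whitelist>
    <blacklist />
  </subcollection>
  <subcollection>
    <name>moto-web</name>
    <id>moto-web</id>
    <whitelist>http://www.philadelphiariders.com/c/dmoz</whitelist>
    <blacklist />
  </subcollection>
  <subcollection>
    <name>gallery</name>
    <id>gallery</id>
    <whitelist>http://www.philadelphiariders.com/gallery</whitelist>
    <blacklist />
  </subcollection>
</subcollections>
"""


def load_subcollections(xml_text):
    """Parse the config into a list of (name, whitelist, blacklist) tuples."""
    root = ET.fromstring(xml_text)  # raises ParseError on mismatched tags
    result = []
    for node in root.iter("subcollection"):
        name = (node.findtext("name") or "").strip()
        # Entries are split on whitespace; an empty element gives an empty list.
        whitelist = (node.findtext("whitelist") or "").split()
        blacklist = (node.findtext("blacklist") or "").split()
        result.append((name, whitelist, blacklist))
    return result


def subcollections_for(url, collections):
    """Names of subcollections whose whitelist matches url and whose blacklist does not.

    Substring matching is an assumption about the plugin's behaviour.
    """
    names = []
    for name, whitelist, blacklist in collections:
        allowed = any(pattern in url for pattern in whitelist)
        denied = any(pattern in url for pattern in blacklist)
        if allowed and not denied:
            names.append(name)
    return names


def build_query(field, value, terms=""):
    """Build a field-restricted query string such as 'subcollection:wiki loudon'."""
    return f"{field}:{value} {terms}".strip()


if __name__ == "__main__":
    cols = load_subcollections(SUBCOLLECTIONS_XML)
    print(subcollections_for("http://www.philadelphiariders.com/wiki/Loudon", cols))  # ['wiki']
    print(build_query("subcollection", "wiki", "loudon"))  # subcollection:wiki loudon
```

A page under /wiki is tagged wiki, so the search has to name the field the plugin writes (subcollection), not collection.
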

jay jiang wrote:
> Shouldn't that be subcollection:wiki instead? Also I assumed you had
> subcollection added to plugin.includes in the config file
> (nutch-site.xml).
>
> Andrew Libby wrote:
>
>> I've applied the patch in the ticket linked to below. I browsed the
>> patch to try to figure out how to use this plugin, and I'm having
>> trouble getting it working.
>> Before I get into the details, if someone has a source of information
>> describing how nutch starts up and initializes plugins, so that I can
>> get a feel for whether this patch is even being used properly in the
>> system, I'd very much appreciate it.
>>
>> ----
>>
>> Here's what I did:
>>
>> Added patches with patch -p0 < subcollection.2.path
>>
>> Compiled tarball with ant tar
>>
>> Extracted tarball in my runtime location with
>> tar -zxvpf - nutch-0.8-dev.tar.gz
>>
>> Created urls/urls.txt containing my site name
>> (http://www.philadelphiariders.com/)
>>
>> Edited crawl-urlfilter.xml to accept aforementioned site name
>>
>> Edited subcollections.xml and added the following:
>>
>> <subcollection>
>> <name>wiki</name>
>> <id>wiki</name>
>> <whitelist>http://www.philadelphiariders.com/wiki</whitelist>
>> <blacklist />
>> </subcollection>
>>
>> <subcollection>
>> <name>moto-web</name>
>> <id>moto-web</name>
>> <whitelist>http://www.philadelphiariders.com/c/dmoz</whitelist>
>> <blacklist />
>> </subcollection>
>>
>> <subcollection>
>> <name>gallery</name>
>> <id>gallery</id>
>> <whitelist>http://www.philadelphiariders.com/gallery</whitelist>
>> <blacklist />
>> </subcollection>
>>
>> Crawled/indexed my site with ./bin/nutch crawl urls -dir ../nutch-index
>>
>> When I start tomcat and do some test searching, I get links from the
>> wiki area without a collection field added to the query. But if I do a
>> query like:
>>
>> collection:wiki loudon
>>
>> which should return documents, I get none. Additionally, if I simply
>> query collection:wiki, I get no hits.
>>
>> If anyone has any ideas, I'll be very grateful.
>>
>>
>> Zaheed Haque wrote:
>>
>>> Maybe this could help you..
>>>
>>> http://issues.apache.org/jira/browse/NUTCH-201
>>>
>>> Cheers
>>
>

--
Andrew Libby
[EMAIL PROTECTED]
http://philadelphiariders.com/

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
