Hi,

I'm trying to get subcollections working in Nutch 1.0-dev, and have crawled our intranet with subcollection.xml configured as below. However, when I submit a query to search.jsp that includes the subcollection, e.g.

    subcollection:im database

I don't get any results, whereas the same query without subcollection:im does return results.

Is this configured wrongly? I realise that subcollection.xml doesn't support regular expressions, but I wasn't sure whether I could just put in part of the URL, or whether I had to put in the full stem pattern, e.g. http://planet.somedomain.com/level1/

Thanks,
Ed.

<subcollections>
  <subcollection>
    <name>default</name>
    <id>default</id>
    <whitelist> </whitelist>
    <blacklist>
      planet.somedomain.com/general/aptrix/bani.nsf/Content/Weekly+news
      /aptprop.nsf/Content/Americas+
      /aptprop.nsf/Content/AB+CityFlyer+
      /aptprop.nsf/Content/CityFlyer+
      /im/barch/
      /im/dms/
      /im/tech/
    </blacklist>
  </subcollection>
  <subcollection>
    <name>im</name>
    <id>im</id>
    <whitelist>
      planet.somedomain.com/general/aptrix/aptim.nsf/
      planet.somedomain.com/im/barch/
      planet.somedomain.com/im/dms/
      planet.somedomain.com/im/tech/
    </whitelist>
    <blacklist />
  </subcollection>
  <subcollection>
    <name>news</name>
    <id>news</id>
    <whitelist>
      planet.somedomain.com/general/aptrix/bani.nsf/Content/Weekly+news
    </whitelist>
    <blacklist />
  </subcollection>
</subcollections>
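To make the question concrete, here is a rough sketch of how I am assuming the matching works: a URL belongs to a subcollection if it contains any whitelist entry as a plain substring and contains no blacklist entry. This is only my assumption, not confirmed behaviour of the subcollection plugin. The function name and the sample URLs are made up for illustration, and only the im and news entries from the config above are included.

```python
# Sketch of the ASSUMED matching rule (plain substring tests, no regex).
# Not taken from the Nutch source; names here are hypothetical.

SUBCOLLECTIONS = [
    {
        "id": "im",
        "whitelist": [
            "planet.somedomain.com/general/aptrix/aptim.nsf/",
            "planet.somedomain.com/im/barch/",
            "planet.somedomain.com/im/dms/",
            "planet.somedomain.com/im/tech/",
        ],
        "blacklist": [],
    },
    {
        "id": "news",
        "whitelist": [
            "planet.somedomain.com/general/aptrix/bani.nsf/Content/Weekly+news",
        ],
        "blacklist": [],
    },
]


def matching_subcollections(url, subcollections):
    """Return the ids of subcollections the URL would fall into,
    assuming substring matching against whitelist and blacklist."""
    matched = []
    for sub in subcollections:
        whitelisted = any(pattern in url for pattern in sub["whitelist"])
        blacklisted = any(pattern in url for pattern in sub["blacklist"])
        if whitelisted and not blacklisted:
            matched.append(sub["id"])
    return matched


if __name__ == "__main__":
    # Hypothetical page URL under one of the whitelisted "im" paths.
    print(matching_subcollections(
        "http://planet.somedomain.com/im/dms/index.html", SUBCOLLECTIONS))
```

If that assumption holds, partial URLs such as planet.somedomain.com/im/dms/ should be enough in the whitelist, and the lack of results would have to come from somewhere else.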
