On 17.03.2006 at 20:22, MagRaj wrote:


Thanks, Marko, for your suggestion.

But here is my problem. Below are the config files with the sample data I
have:

urls.txt contains 5 URLs (just as an example)
--------------------------------------------------------
http://foo.com/broker/broker_name_1/
http://foo.com/broker/broker_name_2/
http://foo.com/broker/broker_name_3/
http://foo.com/broker/broker_name_4/
http://foo.com/broker/broker_name_5/



Ah, OK, I understand.


I tried what you suggested, but it didn't work:
(site:foo.com/broker/broker_name_1 <Search_test>)

That does not work. The site field contains only the host, not directories.
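
To illustrate the host/directory distinction, here is a minimal Python sketch. It is not Nutch code; it only assumes, as stated above, that the site field holds the host part of a URL. The helper name split_host_and_path is made up for this example.

```python
from urllib.parse import urlparse

def split_host_and_path(url):
    """Split a URL into its host and its path (hypothetical helper)."""
    parts = urlparse(url)
    return parts.netloc, parts.path

# The host is the same for all five sample URLs; only the path differs,
# so a query on the host alone cannot tell the brokers apart.
host, path = split_host_and_path("http://foo.com/broker/broker_name_1/")
print(host)
print(path)
```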


How can I implement the above requirement?

Hm. You could generate 5 segments, each one generated and fetched with a different regex-urlfilter.txt:
segment1:
+foo.com/broker/broker_name_1
-.

segment2:
+foo.com/broker/broker_name_2
-.

etc.
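
The per-segment filter files could be produced with a small script instead of by hand. The following is only a sketch in Python, not part of Nutch: the function names are made up, and the "^https?://" anchor is an added assumption that is a little stricter than the rules written above.

```python
import re

def build_filter_rules(url_prefix):
    """Return the lines of a regex-urlfilter.txt that accepts only url_prefix.

    url_prefix is a host plus path without the scheme,
    e.g. "foo.com/broker/broker_name_1/" (hypothetical helper).
    """
    # Escape regex metacharacters (such as the dot in the host name)
    # so that they match literally.
    accept = "+^https?://" + re.escape(url_prefix)
    # "-." rejects every URL that the accept rule did not match.
    return [accept, "-."]

def rules_for_segments(url_prefixes):
    """Map a segment label (segment1, segment2, ...) to its filter lines."""
    return {
        "segment%d" % number: build_filter_rules(prefix)
        for number, prefix in enumerate(url_prefixes, start=1)
    }

if __name__ == "__main__":
    prefixes = ["foo.com/broker/broker_name_%d/" % n for n in range(1, 6)]
    for label, lines in rules_for_segments(prefixes).items():
        print(label + ":")
        print("\n".join(lines))
        print()
```

Each block of printed lines would then be saved as the regex-urlfilter.txt used while generating and fetching the corresponding segment.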

After that, each segment contains only the URLs you want. You cannot restrict a search to a specific segment, but you could write an indexing plugin that indexes the segment name; then you can filter the hits by segment. Still, I don't think any of these hints is a really good solution, because the workflow is very complicated.

Marko






_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
