On 17.03.2006 at 20:22, MagRaj wrote:
Thanks Marko for your suggestion.
But here is my problem. Find below the config files with the sample
data I have:
urls.txt contains 5 URLs (just as an example):
--------------------------------------------------------
http://foo.com/broker/broker_name_1/
http://foo.com/broker/broker_name_2/
http://foo.com/broker/broker_name_3/
http://foo.com/broker/broker_name_4/
http://foo.com/broker/broker_name_5/
Ah, OK, I understand.
I tried as you mentioned, but it didn't work.
(site:foo.com/broker/broker_name_1 <Search_test>)
This does not work: the site field contains only the host, not
directories.
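
To make this concrete, here is a minimal Python sketch (not Nutch code) that splits the sample URLs into host and path. It assumes the site field corresponds to the URL host, as described above; the helper name host_and_path is made up for illustration.

```python
from urllib.parse import urlparse


def host_and_path(url):
    """Split a URL into its (host, path) parts."""
    parts = urlparse(url)
    return parts.netloc, parts.path


# The five sample URLs from urls.txt.
urls = [f"http://foo.com/broker/broker_name_{i}/" for i in range(1, 6)]

for url in urls:
    host, path = host_and_path(url)
    # Every URL has the same host; only the path differs.
    print(host, path)
```

All five URLs share the host foo.com, so a query restricted by host alone cannot tell the brokers apart.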
How can I implement the above requirement?
Hm. You can generate 5 segments, each one generated and fetched with
a different regex-urlfilter.txt:
segment1:
+foo.com/broker/broker_name_1
-.
segment2:
+foo.com/broker/broker_name_2
-.
etc.
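
The per-segment filters can be sketched in Python as follows. This is only a simulation, not Nutch code: it assumes the filter takes the first rule whose regex is found anywhere in the URL and rejects URLs that match no rule. The names rules_for and accepts are hypothetical, and the paths are taken from urls.txt.

```python
import re


def rules_for(broker_path):
    """Build the rule list for one segment: allow one broker, reject the rest."""
    # re.escape makes the dot in "foo.com" match literally.
    return [("+", re.escape(broker_path)), ("-", ".")]


def accepts(rules, url):
    """Apply the rules in order; the first matching rule decides (assumed semantics)."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    return False  # assumption: no matching rule means the URL is rejected


def as_filter_text(rules):
    """Render the rules in the '+pattern' / '-pattern' layout shown above."""
    return "\n".join(sign + pattern for sign, pattern in rules)


urls = [f"http://foo.com/broker/broker_name_{i}/" for i in range(1, 6)]

for i in range(1, 6):
    rules = rules_for(f"foo.com/broker/broker_name_{i}/")
    kept = [u for u in urls if accepts(rules, u)]
    print(f"segment{i}:", kept)
```

The trailing slash in each pattern is a deliberate choice in this sketch: without it, a rule for broker_name_1 would also match a URL such as broker_name_10.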
After that, every segment contains only the URLs you want. You cannot
restrict a search to a specific segment out of the box, but you can
write an indexing plugin that indexes the segment name. With that field
in the index, you can filter the hits down to a specific segment.
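
The filtering step could look roughly like this in Python. It is a sketch only: it assumes each hit carries a "segment" field written by such an indexing plugin, and the hit layout and the function name hits_from_segment are invented for illustration.

```python
def hits_from_segment(hits, segment):
    """Keep only the hits whose (hypothetical) 'segment' field matches."""
    return [hit for hit in hits if hit.get("segment") == segment]


# Made-up hits, shaped as if an indexing plugin had stored the segment name.
hits = [
    {"url": "http://foo.com/broker/broker_name_1/", "segment": "segment1"},
    {"url": "http://foo.com/broker/broker_name_2/", "segment": "segment2"},
    {"url": "http://foo.com/broker/broker_name_1/", "segment": "segment1"},
]

for hit in hits_from_segment(hits, "segment1"):
    print(hit["url"])
```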
But I think none of these hints is a really good solution, because
the workflow is very intricate.
Marko
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general