Re: search results of a defined list of urls and a usual websearch

Tom Smets Tue, 12 Apr 2005 14:58:20 -0700

ok, sounds quite easy. the only problem is that i have never written anything in java, and i don't have any idea about the structure of nutch ;)

but nevertheless, if there is no other way,
i'll give it a try...

so for the moment, two more questions:

1. is there a howto somewhere on how to start writing plugins,
or are there just the api docs?

2. is the java source of the language identifier plugin you are talking about
located at
"trunk/src/plugin/languageidentifier/src/java/org/apache/nutch/analysis/lang"?


thanks,
tom

You can write your own index filter plugin and query filter, and add metadata to the index to identify the "start urls". Take a look at the language identifier to get an idea.
Stefan
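
To make the idea above a bit more concrete, here is a minimal plain-Java sketch of the logic only. It deliberately does not use any Nutch classes: SeedUrlTagger, the field name "seed", fieldsFor() and matches() are all hypothetical names made up for illustration, and a real plugin would have to do the equivalent work inside Nutch's index filter and query filter extension points. The assumption is that a page is marked at index time if its url is in the start list, and that the marker is checked at search time only when a "start urls only" search is requested.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, NOT part of Nutch: shows the tagging logic only.
final class SeedUrlTagger {
    // Hypothetical name of the extra index field that marks a start url.
    static final String FIELD = "seed";

    private final Set<String> seeds = new HashSet<>();

    SeedUrlTagger(List<String> seedUrls) {
        for (String url : seedUrls) {
            seeds.add(normalize(url));
        }
    }

    // Very rough normalization (assumption): trim and drop one trailing slash.
    static String normalize(String url) {
        String u = url.trim();
        return u.endsWith("/") ? u.substring(0, u.length() - 1) : u;
    }

    // "Index filter" step: the fields that would be stored with the page.
    Map<String, String> fieldsFor(String url) {
        Map<String, String> fields = new HashMap<>();
        fields.put("url", url);
        fields.put(FIELD, String.valueOf(seeds.contains(normalize(url))));
        return fields;
    }

    // "Query filter" step: a whole-web search accepts every page,
    // a restricted search accepts only pages marked as start urls.
    static boolean matches(Map<String, String> fields, boolean seedsOnly) {
        return !seedsOnly || "true".equals(fields.get(FIELD));
    }

    public static void main(String[] args) {
        SeedUrlTagger tagger =
            new SeedUrlTagger(Arrays.asList("http://example.org/", "http://example.com/a"));
        Map<String, String> page = tagger.fieldsFor("http://example.net/x");
        System.out.println(matches(page, false)); // whole-web search
        System.out.println(matches(page, true));  // start urls only
    }
}
```

With a marker like this, one index built from the whole-web crawl would be enough: the restricted search is the same search plus one extra condition on the marker field.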
On 12.04.2005 at 19:33, Tom Smets wrote:
hello list, i have a list of about 3000 urls which i want to crawl. further i want to start a webcrawl with those urls as the initial fetchlist.

later i want to have the possibility to choose between a search over just the 3000 urls and a whole-web-search.

is it possible to use just one database (from the whole-web search) to get the desired results, or do i need to build two databases?
thanks,
tom
---------------------------------------------------------------
company:                http://www.media-style.com
forum:          http://www.text-mining.org
blog:                   http://www.find23.net
