karthik085 wrote:
Which Nutch plugins are available that can do a job similar to the
following Google features? (More about Google features:
http://www.google.com/advanced_search?hl=en)
* File format
* Date
* Domain
* Topic-specific searches (Web/Images/Video...)
* Search within results
* Q/A (For example, 'weather 60004' gives weather data for Arlington Heights,
IL)
* Suggest
* Did you mean?
* Similar pages
* Analytics
Are any of these features already implemented in Nutch? Is there any other
way, without using plugins? Which version do these plugins work with?
Hi,
Well, not all of them are implemented in Nutch. This is because some of
the tasks are very challenging, and some of them could be projects in
their own right.
To start with, you can index file formats and dates with the index-more
plugin, and query them with the query-more plugin. Topic-specific
searches can be imitated by searching on the MIME type field, although
this is not a straightforward solution. Searching within results is not
implemented either, although it would not be difficult to add.
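
As a rough illustration of why searching within results is not hard: if
the search engine requires every query term to match (an assumption
here), then a refined search is just the original query with the extra
terms appended. The class and method names below are hypothetical and
not part of Nutch; this is only a sketch of the idea.

```java
// Hypothetical helper, not part of Nutch: builds the query for a
// "search within results" request by appending the refinement terms
// to the original query string.
final class RefineQuery {

    // Returns the original query narrowed by the refinement.
    // Assumes the engine treats all terms as required (AND semantics).
    static String within(String original, String refinement) {
        String base = original == null ? "" : original.trim();
        String extra = refinement == null ? "" : refinement.trim();
        if (base.isEmpty()) {
            return extra;
        }
        if (extra.isEmpty()) {
            return base;
        }
        return base + " " + extra;
    }

    public static void main(String[] args) {
        System.out.println(within("nutch plugins", "spell checker"));
    }
}
```

The combined string would then be submitted as an ordinary query, so no
separate result-filtering step is needed.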
Question answering is a broad topic; Hakia and Start are two
references. But as far as I understand, by Q/A you mean Google's
solution. Google uses a query dispatcher to forward the query to, say,
a weather server or a finance server, and displays those results along
with the regular query results. To implement such a feature, you would
need to have the weather, finance, or, say, music data. As far as I
know, this is not one of the project goals of Nutch.
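
To make the dispatcher idea concrete, here is a minimal sketch. It is
not how Google or Nutch does it; the class name, the keyword-based
routing rule, and the backends (plain functions returning strings) are
all assumptions made for illustration. A query whose first word matches
a registered keyword is sent to that specialised backend, and the
regular web search is always run as well.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical query dispatcher, not part of Nutch.
final class QueryDispatcher {

    // Maps a leading keyword (e.g. "weather") to a specialised backend.
    private final Map<String, Function<String, String>> verticals = new LinkedHashMap<>();

    // The regular full-text search, used for every query.
    private final Function<String, String> webSearch;

    QueryDispatcher(Function<String, String> webSearch) {
        this.webSearch = webSearch;
    }

    void register(String keyword, Function<String, String> backend) {
        verticals.put(keyword.toLowerCase(Locale.ROOT), backend);
    }

    // Returns the specialised answer first (if any), then the web results.
    List<String> dispatch(String query) {
        List<String> results = new ArrayList<>();
        // Split into the leading keyword and the rest of the query.
        String[] parts = query.trim().split("\\s+", 2);
        Function<String, String> vertical =
                verticals.get(parts[0].toLowerCase(Locale.ROOT));
        if (vertical != null && parts.length == 2) {
            results.add(vertical.apply(parts[1]));
        }
        results.add(webSearch.apply(query));
        return results;
    }

    public static void main(String[] args) {
        QueryDispatcher dispatcher = new QueryDispatcher(q -> "web results for: " + q);
        // Stand-in backend; a real one would need actual weather data.
        dispatcher.register("weather", zip -> "weather data for: " + zip);
        for (String line : dispatcher.dispatch("weather 60004")) {
            System.out.println(line);
        }
    }
}
```

The routing code is the easy part; the hard part, as noted above, is
having the weather, finance, or music data behind each backend.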
A spell checker is implemented under the contrib/web2 directory.