prashant_nutch wrote:
Thanks for your valuable comment on subcollection,
but I still have some issues.
1. Enabling subcollection in nutch-site.xml means it applies at crawl time. Is it
possible to apply it directly on the index instead (that is, at search time)?
Nutch plugins can implement several extension points. Subcollection
implements both the IndexingFilter extension point, so that
subcollections are inserted into the index, and the QueryFilter extension
point, so that you can search in the subcollection field. This means that
if you enable the subcollection plugin in nutch-site.xml, both indexing and
querying on the subcollection field are enabled.
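To make the two roles concrete, here is a rough, self-contained Java sketch. It is not the Nutch plugin API: the class SubcollectionSketch, its methods, the example.org URLs, and the use of regular expressions to define a subcollection are all assumptions made up for illustration. It only models the idea that the field is written at indexing time and matched at query time, so a query on the field finds nothing if the field was never written.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative model only; none of these names come from Nutch.
class SubcollectionSketch {

    // Hypothetical definitions: subcollection name -> URL pattern.
    static final Map<String, Pattern> DEFINITIONS = new LinkedHashMap<>();
    static {
        DEFINITIONS.put("docs", Pattern.compile("^http://example\\.org/docs/.*"));
        DEFINITIONS.put("blog", Pattern.compile("^http://example\\.org/blog/.*"));
    }

    // Indexing-time role: pick the subcollection value stored with a URL,
    // or null when no definition matches.
    static String tag(String url) {
        for (Map.Entry<String, Pattern> e : DEFINITIONS.entrySet()) {
            if (e.getValue().matcher(url).matches()) {
                return e.getKey();
            }
        }
        return null;
    }

    // Build a toy "index": URL -> stored subcollection field.
    // With taggingEnabled == false the field is never written.
    static Map<String, String> buildIndex(List<String> urls, boolean taggingEnabled) {
        Map<String, String> index = new LinkedHashMap<>();
        for (String url : urls) {
            index.put(url, taggingEnabled ? tag(url) : null);
        }
        return index;
    }

    // Query-time role: keep only URLs whose stored field equals the name.
    static List<String> search(Map<String, String> index, String subcollection) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : index.entrySet()) {
            if (subcollection.equals(e.getValue())) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        urls.add("http://example.org/docs/a");
        urls.add("http://example.org/blog/b");
        urls.add("http://example.org/other/c");

        // Field written at indexing time: the query can match it.
        System.out.println(search(buildIndex(urls, true), "docs"));
        // Field never written: the same query has nothing to match.
        System.out.println(search(buildIndex(urls, false), "docs"));
    }
}
```

In this toy model, turning the feature on only at search time cannot help, because the documents already in the index carry no subcollection value. That is also one possible reason for a query returning 0 hits.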
2. In your message, can you explain the comment
"subcollection also includes a query plugin"?
By enabling the Subcollection plugin, you can search in the
subcollection field. For example:
<term1> subcollection:<term2>
I followed the steps you mentioned,
but when I run a query like
subcollection:<name of subcollection> <word for search>
I still get 0 hits.
You should open your indexes in Luke or lucli and check whether the URLs
are indexed correctly.
Can you explain Subcollection in more depth? Our aim is to search within a
specific URL.
Check the readme file in the src/plugin/subcollection directory.
Is there any other way besides subcollection?
I assume that you want to search on a set of URLs (matching a regular
expression) rather than a single URL. If not, then there is no point in
using subcollection.