Hi Gautham,
I am using Nutch 0.8 and implemented a new searchable field by
following the query-lang plugin.
Try doing the same as query-lang, at least for testing.
Also, don't forget to create a new plugin.xml and define the fields
parameter.
It works for me, and I think it should work for you too.
BasicQueryFilter queries the index on several fields, but with the
same term in each, for example:
+(url:java anchor:java content:java title:java ...)
In your case, as I understand it, you want to query the index with a
different term per field, like:
+author:Gautham +title:Nutch +description:Java
In that case you have to build your own query and then pass it as a
parameter to the search function (for example, in NutchBean).
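To make the difference between the two query shapes above concrete, here is a minimal sketch in plain Java that only builds the query strings. It does not use any Nutch or Lucene classes, and the names QueryShapes, sameTermAcrossFields and termPerField are hypothetical, chosen only for this illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Nutch: it only formats query strings.
final class QueryShapes {

    // One term searched in several fields: +(url:java anchor:java ...)
    static String sameTermAcrossFields(List<String> fields, String term) {
        StringJoiner clauses = new StringJoiner(" ", "+(", ")");
        for (String field : fields) {
            clauses.add(field + ":" + term);
        }
        return clauses.toString();
    }

    // A different required term per field: +author:Gautham +title:Nutch
    // Pass a LinkedHashMap if the clause order matters.
    static String termPerField(Map<String, String> termsByField) {
        StringJoiner clauses = new StringJoiner(" ");
        for (Map.Entry<String, String> entry : termsByField.entrySet()) {
            clauses.add("+" + entry.getKey() + ":" + entry.getValue());
        }
        return clauses.toString();
    }
}
```

The first method corresponds to what BasicQueryFilter is described as doing; the second is the shape you would have to build yourself before handing the query to the search function.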
Actually, you are right about the tutorials and documentation.
Compared to other Apache products, Nutch is really poorly documented.
Thank God we have this mailing list; otherwise I would be lost :-)
Regards,
M
Milan Krendzelak
Senior Software Developer
________________________________
From: Gautham Pai [mailto:[EMAIL PROTECTED]
Sent: Wed 10/10/2007 16:24
To: [email protected]
Subject: Re: Custom field query
Still, no luck. I am not able to search on a single field, let alone
multiple fields per class.
I tried debugging the code and this is what I found:
* I see the field listed in the FIELD_NAMES HashSet in
  QueryFilters.java.
* LuceneQueryOptimizer's optimize method calls searcher.search, and
  this returns no TopDocs in the case of author. If I do a search on
  "url", it works fine and I see results.
* I tried changing the boost value. No effect.
The fields that I am searching on are not tokenized. I don't have any
analyzers defined. Is this a problem?
What else could be wrong?
Could this be a problem with Lucene, or am I missing some
configuration?
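On the tokenization question: an untokenized Lucene field is indexed as a single term holding the whole field value, so a term query only matches when the query text equals that value exactly, including case. The sketch below models this with plain Java sets instead of Lucene. The class FieldTermsSketch and its methods are hypothetical, and the "tokenizer" is a simplified stand-in for an analyzer, not Lucene's actual behavior.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of indexed terms; it does not call Lucene.
final class FieldTermsSketch {

    // Untokenized: the entire field value becomes one term, unchanged.
    static Set<String> untokenizedTerms(String value) {
        Set<String> terms = new HashSet<>();
        terms.add(value);
        return terms;
    }

    // Tokenized (simplified): lowercase, then split on non-alphanumerics.
    static Set<String> tokenizedTerms(String value) {
        Set<String> terms = new HashSet<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // A term query matches only if the exact term is present.
    static boolean matches(Set<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }
}
```

Under this model, an untokenized value such as "Gautham Pai" is found only by that exact string, so it is worth checking whether the query term is being lowercased or otherwise altered before it reaches the index.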
Thanks,
Gautham
Sagar Naik-2 wrote:
Hey,
Please see the answers to the questions below.
Gautham Pai wrote:
I have seen this question being asked multiple times in this forum.
However, this has confused me more, because each reply has its own
approach to solving the issue and no one has outlined the steps in one
place. The tutorials seem to be a bit outdated too.
The version of Nutch I am using is 0.9.
I have 3 custom fields that I have added via an IndexingFilter. The
fields are: author, title and description. I now intend to provide
support for querying these fields as:
author:Gautham
title:Nutch
etc.
I added an Author class as follows:
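package query;  // matches class="query.Author" in plugin.xml

import org.apache.hadoop.conf.Configuration;
import org.apache.nutch.searcher.RawFieldQueryFilter;

/** Query filter for the custom "author" field. */
public class Author extends RawFieldQueryFilter {
  private Configuration conf;

  public Author() {
    super("author", 5f);  // field name and boost
  }

  public void setConf(Configuration conf) {
    this.conf = conf;
  }

  public Configuration getConf() {
    return this.conf;
  }
}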
and made an entry in plugin.xml as:
<extension id="query.Author"
           name="Author"
           point="org.apache.nutch.searcher.QueryFilter">
  <implementation id="Author"
                  class="query.Author">
    <parameter name="fields" value="author"/>
  </implementation>
</extension>
When I use NutchBean to perform the query, I see no results. I also
tried changing the RawFieldQueryFilter to QueryFilter and following the
approach used in the query-more plugin. It does not seem to work
either.
The questions I have specifically are:
* Do I need to create one class per custom field that I intend to
  support in queries?
Generally, one class for all the custom fields is sufficient. In your
case too, you should be able to manage with one class.
* Should I use RawFieldQueryFilter or QueryFilter?
RawFieldQueryFilter implements QueryFilter, so I would use
RawFieldQueryFilter.
* Should I make an entry as <parameter name="fields" value="author"/>
  or <parameter name="fields" value="DEFAULT"/> in plugin.xml?
In your case,
<parameter name="fields" value="author, title, description"/>
should solve the problem.
Check out the constructor of the org.apache.nutch.searcher.QueryFilters
class.
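As an illustration of how a multi-valued fields parameter like the one above could be turned into individual field names, here is a small sketch in plain Java. It assumes that commas and whitespace both act as separators; that assumption, the class name FieldsParameterSketch and the method parseFieldNames are mine, so check the QueryFilters source for the actual behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper; it does not read plugin.xml or call Nutch.
final class FieldsParameterSketch {

    // Splits a value such as "author, title, description" into names.
    // Assumption: commas and whitespace are both treated as separators.
    static List<String> parseFieldNames(String value) {
        List<String> names = new ArrayList<>();
        if (value == null) {
            return names;  // a missing parameter yields no field names
        }
        StringTokenizer tokens = new StringTokenizer(value, " ,\t\n\r");
        while (tokens.hasMoreTokens()) {
            names.add(tokens.nextToken());
        }
        return names;
    }
}
```

With this splitting rule, "author, title, description" and "author title description" give the same three names, so the exact spacing in plugin.xml would not matter.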
Any help or pointers are greatly appreciated.
Thanks,
Gautham.
--
View this message in context:
http://www.nabble.com/Custom-field-query-tf4596454.html#a13138143
Sent from the Nutch - User mailing list archive at Nabble.com.