Hi,

I was able to solve the problem today. The problem was in the way I was indexing. I used UN_TOKENIZED for my fields, because that is what this tutorial uses: http://wiki.apache.org/nutch/WritingPluginExample
I changed that to TOKENIZED and everything works fine now. Why does the tutorial use UN_TOKENIZED?

Jasper and Julien, thanks for the help.

Gautham.

Jasper Kamperman wrote:
>
> Yup, the patch solved our problem. Actually it's more the other way
> around, Julien Nioche published the patch as a result of solving our
> problem :-).
>
> Jasper
>
> On Oct 10, 2007, at 1:53 PM, Gautham Pai wrote:
>
>>
>> I see you had a similar issue here:
>> http://www.nabble.com/Field-based-search-on-metadata-tf4213684.html#a12045840
>>
>> Were you able to solve the problem? I am facing the exact same
>> issue as is mentioned in the thread.
>>
>> The problem of being able to query multiple fields using just one
>> class is secondary. I am right now trying to solve the basic problem
>> of querying one custom field at a time.
>>
>> Does the patch help me with this?
>>
>> Gautham.
>>
>> Jasper Kamperman wrote:
>>>
>>> You might want to check out this patch:
>>> https://issues.apache.org/jira/browse/NUTCH-563
>>> From what I understand of your questions, it might
>>> help solve your issues.
>>>
>>> Jasper
>>>
>>> On Oct 10, 2007, at 9:08 AM, Milan Krendzelak wrote:
>>>
>>>> Hi Gautham,
>>>>
>>>> I am using Nutch 0.8 and implemented the new search field
>>>> following the query-lang plugin.
>>>> Try to do the same as query-lang, let's say just for testing...
>>>> Also don't forget to create a new plugin.xml and define the fields
>>>> parameter.
>>>> It works for me, and I think it should work for you too.
>>>>
>>>> BasicQueryFilter is used to query the index on different fields but
>>>> with the same term,
>>>> for example +(url:java anchor:java content:java title:java ...)
>>>> In your case, as I understand it, you want to query the index with
>>>> different terms, like: +author:Gautham +title:Nutch +description:Java
>>>> In this case you have to build your own query and then pass the
>>>> query as a parameter to the search function (for example in NutchBean).
>>>>
>>>> Actually you are right about the tutorial and documentation.
>>>> Compared to other Apache products, Nutch is really poorly documented.
>>>> Thank god we have this mailing list, otherwise I would be lost :-)
>>>>
>>>> Regards,
>>>> M
>>>>
>>>> Milan Krendzelak
>>>> Senior Software Developer
>>>>
>>>> ________________________________
>>>>
>>>> From: Gautham Pai [mailto:[EMAIL PROTECTED]
>>>> Sent: Wed 10/10/2007 16:24
>>>> To: [email protected]
>>>> Subject: Re: Custom field query
>>>>
>>>> Still, no luck. I am not able to search on a single field, let alone
>>>> multiple fields per class.
>>>>
>>>> I tried debugging the code and this is what I found:
>>>>
>>>> * I see the field listed in the FIELD_NAMES HashSet in
>>>>   QueryFilters.java.
>>>> * LuceneQueryOptimizer's optimize method has a call to
>>>>   searcher.search, and this returns no TopDocs in the case of author.
>>>>   If I do a search on "url" it works fine and I see results.
>>>> * I tried changing the boost value. No effect.
>>>>
>>>> The fields that I am searching on are not tokenized. I don't have
>>>> any analyzers defined. Is this a problem?
>>>>
>>>> What else could be wrong?
>>>>
>>>> Could this be a problem with Lucene or am I missing some
>>>> configuration?
>>>>
>>>> Thanks,
>>>> Gautham
>>>>
>>>> Sagar Naik-2 wrote:
>>>>>
>>>>> Hey,
>>>>> Please see the answers to the questions below.
>>>>> Gautham Pai wrote:
>>>>>> I have seen this question being asked multiple times in this
>>>>>> forum. However, this has confused me more because each has its own
>>>>>> approach to solving the issue and no one has outlined the steps in
>>>>>> one place. The tutorials seem to be a bit outdated too.
>>>>>>
>>>>>> The version of Nutch I am using is 0.9.
>>>>>>
>>>>>> I have 3 custom fields that I have added via an IndexingFilter.
>>>>>> The fields are: author, title and description. I now intend to
>>>>>> provide support for querying these fields as:
>>>>>> author:Gautham
>>>>>> title:Nutch
>>>>>> etc.
>>>>>>
>>>>>> I added an Author class as follows:
>>>>>>
>>>>>> public class Author extends RawFieldQueryFilter {
>>>>>>   private Configuration conf;
>>>>>>
>>>>>>   public Author() {
>>>>>>     super("author", 5f);
>>>>>>   }
>>>>>>
>>>>>>   public void setConf(Configuration conf) {
>>>>>>     this.conf = conf;
>>>>>>   }
>>>>>>
>>>>>>   public Configuration getConf() {
>>>>>>     return this.conf;
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>> and made an entry in plugin.xml as:
>>>>>>
>>>>>> <extension id="query.Author"
>>>>>>            name="Author"
>>>>>>            point="org.apache.nutch.searcher.QueryFilter">
>>>>>>   <implementation id="Author"
>>>>>>                   class="query.Author">
>>>>>>     <parameter name="fields" value="author"/>
>>>>>>   </implementation>
>>>>>> </extension>
>>>>>>
>>>>>> When I use NutchBean to perform the query, I see no results. I
>>>>>> also tried changing the RawFieldQueryFilter to QueryFilter and
>>>>>> following the approach used in the query-more plugin. It does not
>>>>>> seem to work either.
>>>>>>
>>>>>> The questions I have specifically are:
>>>>>> * Do I need to create one class per custom field that I intend to
>>>>>>   provide support for query?
>>>>>>
>>>>> Generally, one class for all the custom fields is sufficient.
>>>>> In your case too, you should be able to manage with one class.
>>>>>> * Should I use RawFieldQueryFilter or QueryFilter?
>>>>>>
>>>>> RawFieldQueryFilter implements QueryFilter, so I would use
>>>>> RawFieldQueryFilter.
>>>>>> * Should I make an entry as: <parameter name="fields"
>>>>>>   value="author"/> or
>>>>>>   <parameter name="fields" value="DEFAULT"/> in plugin.xml?
>>>>>>
>>>>> In your case,
>>>>>
>>>>> <parameter name="fields" value="author, title, description"/>
>>>>> should solve the problem.
>>>>> Check out the constructor of the
>>>>> org.apache.nutch.searcher.QueryFilters class.
>>>>>
>>>>>> Any help or pointers are greatly appreciated.
>>>>>>
>>>>>> Thanks,
>>>>>> Gautham.
>>>>>>
>>>>
>>>> --
>>>> View this message in context: http://www.nabble.com/Custom-field-query-tf4596454.html#a13138143
>>>> Sent from the Nutch - User mailing list archive at Nabble.com.
>>>>
>>
>> --
>> View this message in context: http://www.nabble.com/Custom-field-query-tf4596454.html#a13144552
>> Sent from the Nutch - User mailing list archive at Nabble.com.
>>

--
View this message in context: http://www.nabble.com/Custom-field-query-tf4596454.html#a13281583
Sent from the Nutch - User mailing list archive at Nabble.com.
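
P.S. For anyone who finds this thread later: nobody above spelled out why the switch from UN_TOKENIZED to TOKENIZED helps, so here is a rough sketch of what is probably going on. An untokenized field is indexed as one exact term (the whole value, case preserved), while a tokenized field is split into separate terms by the analyzer. The code below is a toy inverted index in plain Java, not the Lucene or Nutch API; the class name, method names and the "split on non-alphanumerics and lowercase" analyzer are all made up for illustration, and I am assuming the query side lowercases the search term.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index for illustration only; NOT the Lucene or Nutch API. */
class TokenizationSketch {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Untokenized: the whole field value becomes one exact term. */
    void addUntokenized(int docId, String value) {
        addTerm(value, docId);
    }

    /** Tokenized: split on non-alphanumerics and lowercase (assumed analyzer behaviour). */
    void addTokenized(int docId, String value) {
        for (String token : value.split("[^\\p{Alnum}]+")) {
            if (!token.isEmpty()) {
                addTerm(token.toLowerCase(Locale.ROOT), docId);
            }
        }
    }

    /** Exact term lookup, the way a single-term query hits the index. */
    Set<Integer> search(String term) {
        return postings.getOrDefault(term, Collections.emptySet());
    }

    private void addTerm(String term, int docId) {
        postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
    }

    public static void main(String[] args) {
        TokenizationSketch untokenized = new TokenizationSketch();
        untokenized.addUntokenized(1, "Gautham Pai");

        TokenizationSketch tokenized = new TokenizationSketch();
        tokenized.addTokenized(1, "Gautham Pai");

        // Assumption: the query side lowercases the user's term (author:Gautham).
        String queryTerm = "Gautham".toLowerCase(Locale.ROOT);

        System.out.println("untokenized hits: " + untokenized.search(queryTerm)); // prints []
        System.out.println("tokenized hits:   " + tokenized.search(queryTerm));   // prints [1]
    }
}
```

In the untokenized index the only term is the literal "Gautham Pai", so a lookup for "gautham" finds nothing; in the tokenized one the terms are "gautham" and "pai", so the same lookup matches. Whether this is exactly what happened in my setup I can't say for sure, but it is consistent with author: queries returning no TopDocs until the field was indexed as TOKENIZED.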
