Query Searc

Emmanuel Sat, 16 Feb 2008 23:05:26 -0800

Hi Guys,

I've been trying to understand how the search results are produced from
all the parameters entered.


My objectives are the following:

   - sort my results by score
   - limit the number of duplicates
   - build paging in my HTML page according to the number of results
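
To make these objectives concrete, here is a minimal sketch of what I am
after. It is not Nutch code: DedupPager, dedup and pageCount are hypothetical
names, and the input is assumed to be the dedup values (for example host
names) of hits already sorted by score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Nutch.
class DedupPager {

    /** Keeps at most maxHitsPerDup entries per dedup value, preserving score order. */
    static List<String> dedup(List<String> dedupValuesByScore, int maxHitsPerDup) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> kept = new ArrayList<>();
        for (String value : dedupValuesByScore) {
            // Count how many times this dedup value has appeared so far.
            int count = seen.merge(value, 1, Integer::sum);
            if (count <= maxHitsPerDup) {
                kept.add(value);
            }
        }
        return kept;
    }

    /** Number of pages needed to show totalResults with pageSize results per page. */
    static int pageCount(int totalResults, int pageSize) {
        return (totalResults + pageSize - 1) / pageSize; // integer ceiling
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList(
                "a.com", "a.com", "b.com", "a.com", "c.com", "b.com", "b.com");
        List<String> kept = dedup(hosts, 2);
        System.out.println(kept);                      // filtered hits, in score order
        System.out.println(pageCount(kept.size(), 2)); // pages for the filtered total
    }
}
```

The point of the sketch is that the page count has to be computed from the
filtered total (kept.size() here), not from the raw total.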

So I looked at the code in the NutchBean class, and more precisely at the
method named:

public Hits search(Query query, int numHits, int maxHitsPerDup, String
dedupField, String sortField, boolean reverse)

I tried to understand the code, but it leads to the following questions:

   1. We start a search to get a list of hits and then loop over that list.
   Inside the loop, however, we rerun the search with some terms prohibited
   when needed. I don't really understand why we do this inside the loop:
   hits.getTotal() may return a different value after the new search, so the
   loop could miss some items. Is there any reason for this?
   2. We optimize the query by calling optQuery.addProhibitedTerm with each
   dedupValue found. However, it looks like we are limited to 20 dedupValues.
   Why do we have this limitation? What happens if we have more dedupValues
   to exclude?
   3. numHitsRaw = (int)(numHitsRaw * rawHitsFactor); is executed inside
   the loop each time we search again with a new prohibited term. This means
   numHitsRaw can end up multiplied by rawHitsFactor as many as 20 times
   (numHitsRaw * rawHitsFactor^20). That is a huge request, and it causes an
   out-of-memory error on my web server. Is this normal? Why do we need such
   a large number of results?
   4. getTotalHits is obviously not the exact number of results we end up
   with. It is the number of results before our filter on the dedup value is
   applied. So how can I know how many results are available after that
   filter, in order to build paging in my HTML page? Maybe we need to add
   another field to the Hits object to hold the total number of filtered
   results, don't we?
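
A rough sketch of the arithmetic behind question 3. It is not Nutch code:
RawHitsGrowth and rawHitsRequested are hypothetical names. It assumes the
first request asks for numHits * rawHitsFactor raw hits and uses a factor of
2 and numHits of 10 purely as example values. It also uses long instead of
int so the numbers do not overflow.

```java
// Hypothetical illustration only; not part of Nutch.
class RawHitsGrowth {

    /** Raw hits requested after the given number of re-searches. */
    static long rawHitsRequested(int numHits, double rawHitsFactor, int reSearches) {
        // Assumed initial request size (an assumption, see the note above).
        long numHitsRaw = (long) (numHits * rawHitsFactor);
        for (int i = 0; i < reSearches; i++) {
            // Same update as the line quoted in question 3, once per re-search.
            numHitsRaw = (long) (numHitsRaw * rawHitsFactor);
        }
        return numHitsRaw;
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 1, 5, 10, 20}) {
            System.out.println(n + " re-searches -> " + rawHitsRequested(10, 2.0, n));
        }
    }
}
```

Under these assumptions the request size grows geometrically: 20 raw hits at
first, 20480 after 10 re-searches, and 20971520 after 20.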

Thanks in advance for any clarification
