Hello

First of all, great work to everyone involved in Lucene.NET; it is a great project. We are going to use it as the search engine for a new project that is due to go live in the next few months. The only problem is that we get great results when searching a single field, but our users will generally be selecting from two or three fields, and using the MultiFieldQueryParser for that does not seem to produce great results. I was wondering if anyone might be able to help us?

To create the index, which is written to disk, I am using the code below:

luceneDocument.Add(new Field(Constant.LuceneConstant_FoodItemId, foodItem.FoodItemId.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_ParentFoodItemId, foodItem.ParentFoodItemId.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_DataProviderItemId, foodItem.DataProviderItemId.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_EANcode, (foodItem.EuropeanArticleNumberCode == null) ? string.Empty : foodItem.EuropeanArticleNumberCode.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_Summary, foodItem.Summary, Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_StorageTypeId, foodItem.StorageTypeId.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_FoodItemSupplierId, foodItem.FoodItemSupplierId.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_FoodItemBrandId, foodItem.FoodItemBrandId.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_MeasurementType, ((int)foodItem.MeasurementType).ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_NoOfUnits, foodItem.NoOfUnits.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_PackSize, foodItem.PackSize.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_PortionSize, foodItem.PortionSize.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_Approved, foodItem.Approved.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_Created, foodItem.Created.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_Updated, foodItem.Updated.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));

The index contains about 20,000 food products, which creates an index of about 6 MB. The code to search this index is below:

public static SortableResultsCollection<FoodItem> FoodItemSearch(short departmentCode, short ailseCode, short shelfCode, short supermarketId, byte searchType, string searchQuery, int startValue, int amountOfResults)
       {
           List<string> queryFieldList = new List<string>();
           List<string> queryList = new List<string>();
           List<BooleanClause.Occur> queryClauseList = new List<BooleanClause.Occur>();
           //Process the department code
           if (departmentCode > 0)
           {
               queryList.Add(Constant.LuceneConstant_CategoryTopLevel);
               queryFieldList.Add(departmentCode.ToString());
               queryClauseList.Add(BooleanClause.Occur.MUST);
           }
           //Process the aisle code
           if (ailseCode > 0)
           {
               queryList.Add(Constant.LuceneConstant_CategorySecondLevel);
               queryFieldList.Add(ailseCode.ToString());
               queryClauseList.Add(BooleanClause.Occur.MUST);
           }
           //Process the shelf code
           if (shelfCode > 0)
           {
               queryList.Add(Constant.LuceneConstant_CategoryThirdLevel);
               queryFieldList.Add(shelfCode.ToString());
               queryClauseList.Add(BooleanClause.Occur.MUST);
           }
           //Process the supermarket
           if (supermarketId > 0)
           {
               queryList.Add(Constant.LuceneConstant_FoodItemSupplierId);
               queryFieldList.Add(supermarketId.ToString());
               queryClauseList.Add(BooleanClause.Occur.MUST);
           }
           //Process the search query
           if (searchQuery != string.Empty)
           {
               if (searchType == 1)
               {
                   queryList.Add(Constant.LuceneConstant_Summary);
                   queryFieldList.Add(searchQuery);
                   queryClauseList.Add(BooleanClause.Occur.MUST);
               }
               else
               {
                   queryList.Add(Constant.LuceneConstant_EANcode);
                   queryFieldList.Add(searchQuery);
                   queryClauseList.Add(BooleanClause.Occur.MUST);
               }
           }
           //Create the arrays to pass to the query
           string[] queryFieldArray = new string[queryFieldList.Count];
           string[] queryArray = new string[queryList.Count];
           BooleanClause.Occur[] occurArray = new BooleanClause.Occur[queryList.Count];
           //Assign the list data to the array
           int rowCount = queryList.Count - 1;
           for (int i = 0; i <= rowCount; i++)
           {
               queryFieldArray[i] = queryFieldList[i];
               queryArray[i] = queryList[i];
               occurArray[i] = queryClauseList[i];
           }
           Query query = MultiFieldQueryParser.Parse(queryFieldArray, queryArray, occurArray, new StandardAnalyzer());
           EGC.SortableResultsCollection<FoodItem> collection = new EGC.SortableResultsCollection<FoodItem>();
           string indexPath = ConfigurationManager.AppSettings[CONST_LUCENE_PATH];
           IndexSearcher indexSearcher = new IndexSearcher(indexPath);
           Hits searchHits = indexSearcher.Search(query);
           int totalHitsCount = searchHits.Length();
           collection.TotalResultCount = totalHitsCount - 1;
           collection.StartResult = startValue;
           //Work out the exclusive end index of the requested page, capped at the number of hits
           int hitsCount = startValue + amountOfResults;
           if (hitsCount > totalHitsCount)
           {
               hitsCount = totalHitsCount;
           }
           if (totalHitsCount > 0)
           {
               for (int hitNumber = startValue; hitNumber < hitsCount; hitNumber++)
               {
                   collection.Add(LuceneDocumentToFoodItem(searchHits.Doc(hitNumber)));
               }
           }
           //Close the index searcher
           indexSearcher.Close();
           return collection;
       }

I know the code is not amazing at the moment, but we are trying to get the concept right before we do any tuning. I have tried changing BooleanClause.Occur.MUST to BooleanClause.Occur.SHOULD, but we have seen no improvement in the search. Is using the MultiFieldQueryParser the correct way to search for products in the index, or do I need to run single-field searches and then apply filters to narrow the results?
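
To make the alternative concrete, here is a rough, untested sketch of what I have in mind instead of MultiFieldQueryParser: the free-text part is parsed against the tokenized Summary field, and the ID restriction is added as an exact TermQuery, both combined in a BooleanQuery. The class name FoodItemQuerySketch and the Build method are hypothetical and not part of our code; I am also assuming the text passed in is non-empty and that the caller handles any ParseException.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

// Hypothetical helper for illustration only; not part of our current code.
public static class FoodItemQuerySketch
{
    public static Query Build(string summaryField, string summaryText,
                              string supplierField, short supermarketId)
    {
        BooleanQuery combined = new BooleanQuery();

        // The Summary field is TOKENIZED, so the user's text goes through the analyzer.
        // Assumes summaryText is non-empty; Parse may throw on malformed input.
        QueryParser parser = new QueryParser(summaryField, new StandardAnalyzer());
        combined.Add(parser.Parse(summaryText), BooleanClause.Occur.MUST);

        // The ID fields are UN_TOKENIZED, so match the exact indexed term
        // rather than running the value through the analyzer.
        if (supermarketId > 0)
        {
            Term supplierTerm = new Term(supplierField, supermarketId.ToString());
            combined.Add(new TermQuery(supplierTerm), BooleanClause.Occur.MUST);
        }

        return combined;
    }
}
```

The returned query would be passed to indexSearcher.Search in place of the one produced by MultiFieldQueryParser.Parse. I do not know whether this would actually rank better, which is part of what I am asking.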

Any help would be gratefully received.

Thanks

Ollie Castle

