RE: Alternative to looping through Hits

Franklin Simmons Fri, 02 Oct 2009 09:13:03 -0700

You could try using TopFieldDocCollector, TopDocs and a custom 
FieldSelector that loads only the DocumentId field.  String.Join is fairly 
quick I think. This might be overkill though ;-)


...

// reader is assumed to be the IndexReader open on the index being searched.
Lucene.Net.Search.TopFieldDocCollector collector =
      new TopFieldDocCollector(reader, Sort.RELEVANCE, max_hits);

search.Search(query, null, collector);

Lucene.Net.Search.TopDocs top_docs = collector.TopDocs();
string [] values = new string[top_docs.scoreDocs.Length];
MyFieldSelector field_selector = new MyFieldSelector("DocumentId");

for(int i = 0; i < values.Length; i++) 
{
      Lucene.Net.Search.ScoreDoc score_document = top_docs.scoreDocs[i];
      // Load only the DocumentId field instead of the whole stored document.
      Lucene.Net.Documents.Document document =
            search.Doc(score_document.doc, field_selector);
      values[i] = document.GetFieldable("DocumentId").StringValue();
}

string csv = String.Join(", ", values);


...
// Loads only the named stored field and skips every other field.
class MyFieldSelector : Lucene.Net.Documents.FieldSelector
{
      string field_name;

      public MyFieldSelector(string field_name)
      {
            this.field_name = field_name;
      }

      public Lucene.Net.Documents.FieldSelectorResult Accept(string field_name)
      {
            if (this.field_name == field_name)
                  return Lucene.Net.Documents.FieldSelectorResult.LOAD;
            return Lucene.Net.Documents.FieldSelectorResult.NO_LOAD;
      }
}

-----Original Message-----
From: Trevor Watson [mailto:[email protected]] 
Sent: Friday, October 02, 2009 10:40 AM
To: [email protected]
Subject: Alternative to looping through Hits

I am currently attempting to create a comma separated list of IDs from a 
given Hits collection.

However, when we end up processing 6,000 or more hits, it takes 25-30 
seconds per collection.  I've been trying to find a faster way to change 
the search results to the comma separated list.  Do any of you have any 
advice?  Thanks in advance.

Trevor Watson


My current code looks like

Lucene.Net.Search.Searcher search = new 
Lucene.Net.Search.IndexSearcher(string.Format("c:\\sv_index\\" + 
jobId.ToString()));
            Lucene.Net.Search.Hits hits = search.Search(query);

            string docIds = "";
            totalDocuments = hits.Length();

           
          // Test #1
            Lucene.Net.Search.HitIterator hi =
                (Lucene.Net.Search.HitIterator)hits.Iterator();
            while (hi.MoveNext())
                docIds += ((Lucene.Net.Search.Hit)hi.Current).GetDocument()
                    .GetField("DocumentId").StringValue() + ", ";

          // Test #2
            for (int iCount = 0; iCount < totalDocuments; iCount++)
            {
                Lucene.Net.Documents.Document docHit = hits.Doc(iCount);

                docIds += docHit.GetField("DocumentId").StringValue() + ", ";
            }
