Re: Query question

jeff . richley Sat, 04 Nov 2006 19:10:40 -0800

Thought I attached the code :)

package com.infinity.naxx.sandbox;


import java.io.IOException;
import java.util.Iterator;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hit;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.RAMDirectory;

public class LuceneTest {
        public static void main(String[] args) {
                try {
                        Analyzer analyzer = new KeywordAnalyzer();
                        RAMDirectory directory = new RAMDirectory();
                        IndexWriter writer = new IndexWriter(directory, analyzer, true);

                        Document document = new Document();
                        Field location = new Field("location", "/a/b/c",
                                        Field.Store.YES,
                                        Field.Index.UN_TOKENIZED);
                        document.add(location);
                        Field name = new Field("name", "Jeff Richley",
                                        Field.Store.YES,
                                        Field.Index.UN_TOKENIZED);
                        document.add(name);

                        writer.addDocument(document);
                        writer.optimize();
                        writer.close();

                        IndexSearcher searcher = new IndexSearcher(directory);
                        QueryParser parser = new QueryParser("name", analyzer);
                        Query query = parser.parse("\"Jeff Richley\"");

                        System.out.println("Searching: " + query.toString());

                        Hits hits = searcher.search(query);
                        System.out.println("There were " + hits.length() + " hits");

                        for (Iterator iter = hits.iterator(); iter.hasNext();) {
                                Hit hit = (Hit) iter.next();
                                System.out.println(hit.getScore() + " "
                                                + hit.getDocument().get("location"));
                        }
                } catch (IOException e) {
                        e.printStackTrace();
                } catch (ParseException e) {
                        e.printStackTrace();
                }
        }
}




> I know I am getting very close on this one but can't seem to get the score
> above .306.  My guess is that I need to do something different in my
> query.  If at all possible, could you take a quick look at my test code
> and point me in the correct direction?  I know everyone is very busy, so
> any help would be greatly appreciated.
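
One possible reading of the .306 figure, offered as a sketch rather than a
diagnosis: with a single document in the index, the classic idf formula
ln(numDocs / (docFreq + 1)) + 1 works out to about 0.307, and for a one-term
query the rest of the score collapses to 1. The class below only redoes that
arithmetic in plain Java. It does not call Lucene, the class and method names
are made up for illustration, and it assumes the default Similarity, no
boosts, and a field norm of 1.0.

```java
// Standalone arithmetic sketch; hypothetical names, no Lucene calls.
class IdfSketch {
    // Assumed classic idf: ln(numDocs / (docFreq + 1)) + 1
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Assumed score of a single-term query: queryWeight * fieldWeight
    static double singleTermScore(int tf, int docFreq, int numDocs, double fieldNorm) {
        double idf = idf(docFreq, numDocs);
        double queryNorm = 1.0 / Math.sqrt(idf * idf);        // 1 / sqrt(sum of squared weights)
        double queryWeight = idf * queryNorm;                 // about 1.0 when there is one term
        double fieldWeight = Math.sqrt(tf) * idf * fieldNorm; // tf factor is sqrt(frequency)
        return queryWeight * fieldWeight;
    }

    public static void main(String[] args) {
        // One document, one matching term, as in the test code above.
        System.out.println(singleTermScore(1, 1, 1, 1.0));
        // The same match in a hypothetical index of 1000 documents.
        System.out.println(singleTermScore(1, 1, 1000, 1.0));
    }
}
```

If that reading is right, the low number reflects the tiny test index rather
than a weak match, so the score is not a percentage of how well the document
matched.
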
>
>>
>> : 1.) I have data like name="Jeff" lastname="Richley" age="33" and I need
>> : to be able to query by any combination such as name="Jeff" age="33".
>> : But if I query with name="Jeffrey" there is no match.
>> :
>> : 2.) The name value pairs are not really controlled until the end user is
>> : inserting information or querying.  I may have the data from the
>> : previous example and then have another that has address information and
>> : then something totally unrelated such as stock prices.  The point is, I
>> : can't guarantee what exactly will be in the data.
>>
>> Lucene will fit your needs nicely because of your second point: Documents
>> don't need to all have the same fields.
>>
>> as to your first point, and your question about 100% matches, Lucene
>> should be able to meet your needs perfectly, you just have to understand
>> how to ask it the right question.  lemme give you a quick check list of
>> things to keep in mind, and as you dig into the documentation these will
>> make more sense...
>>
>>  1) Use only UN_TOKENIZED fields when adding your documents, and if you
>> use QueryParser to build your queries for you, use the KeywordAnalyzer to
>> make sure no lowercasing or stemming takes place.
>>  2) OMIT_NORMs when indexing .. they only matter if you want the lengths
>> of fields to affect the score, and you don't -- you only want to know if
>> it matched or not.
>>  3) if you want to require name="jeff" and age="33" make sure you
>> construct a query where all clauses are mandatory .. the default in the
>> query parser is "SHOULD", meaning at least one clause has to match and
>> any other matching clauses just increase the score.
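
A minimal sketch of point 3, assuming the usual QueryParser syntax where a
leading "+" marks a clause as required. The helper only builds the query
string in plain Java. RequiredQuerySketch, quote and requireAll are
hypothetical names, and the escaping is an assumption (backslash and double
quote only), so check it against the parser version you are using.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a query string in which every clause is required.
class RequiredQuerySketch {
    // Wrap the value in double quotes so it reaches the analyzer as one unit.
    // Assumption: escaping backslash and double quote is enough here.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Produces e.g. +name:"Jeff" +age:"33"; field names are used as given.
    static String requireAll(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('+').append(e.getKey()).append(':').append(quote(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the clauses in insertion order.
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("name", "Jeff");
        fields.put("age", "33");
        System.out.println(requireAll(fields)); // prints +name:"Jeff" +age:"33"
    }
}
```

The resulting string could then be handed to parser.parse(...) with the
KeywordAnalyzer from point 1. I believe QueryParser also offers
setDefaultOperator(QueryParser.AND_OPERATOR) as another way to make every
clause required, but please verify that against the javadoc for your version.
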
>>
>>
>>
>> -Hoss
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
>>
>>
>
>
> Jeff Richley, Vice President
> Southeast Virginia Java Users Group
> [EMAIL PROTECTED]
> http://www.sevajug.org


Jeff Richley, Vice President
Southeast Virginia Java Users Group
[EMAIL PROTECTED]
http://www.sevajug.org


