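This is really a FAQ (prefix queries and case sensitivity).
StandardAnalyzer lowercases all tokens before indexing, so the index
holds "novell", not "Novell". QueryParser does not run prefix, wildcard,
or fuzzy terms through the analyzer, so your upper-case "Nove*",
"Novel~", and "N?vell" are compared as typed and match nothing.
Try the same queries in lower case.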
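
One possible workaround is to lowercase the user's query string before
handing it to QueryParser.parse(query, "problem", analyzer). What follows
is only a minimal sketch in plain Java: the class PrefixQueryCase and the
helper lowercaseTerms are hypothetical names, not part of Lucene, and the
sketch assumes that field names are already lower case (as they are in
your index) and that only AND, OR, NOT, and TO need to keep their case.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class PrefixQueryCase {
    // Query-syntax keywords that must stay upper case to be read as operators.
    private static final Set<String> OPERATORS =
            new HashSet<>(Arrays.asList("AND", "OR", "NOT", "TO"));

    /**
     * Hypothetical helper (not a Lucene API): lowercases every
     * whitespace-separated token of the query except the operators above,
     * so that wildcard, prefix, and fuzzy terms match the lowercased index.
     */
    static String lowercaseTerms(String query) {
        StringBuilder out = new StringBuilder();
        for (String token : query.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(OPERATORS.contains(token)
                    ? token
                    : token.toLowerCase(Locale.ROOT));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Queries taken from the table in the original message.
        for (String q : new String[] {"Nove*", "Novel~", "N?vell", "usb AND Liste"}) {
            System.out.println(q + " -> " + lowercaseTerms(q));
        }
    }
}
```

The result of lowercaseTerms would then be passed to QueryParser in place
of the raw input. Note that the sketch also lowercases field-name prefixes
and collapses runs of whitespace, which is harmless for the queries shown
here but may not be for others.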
Regarding your indexing snippet: if an exception (an SQLException, say)
is thrown before 'conn' is assigned, 'conn' is still null and
conn.close() in the finally block throws an NPE.
And what happens to your writer.close() if you get an exception? It is
skipped. You could put writer.close() in the finally block too,
checking both references for null first.
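
The shape of that finally block could look roughly like the sketch below.
It is an illustration only: Resource is a hypothetical stand-in for both
IndexWriter and your DBConnection (neither real class is used here), and
the failOnConnect flag merely simulates an exception thrown before 'conn'
is assigned.

```java
import java.util.ArrayList;
import java.util.List;

class SafeCloseSketch {
    /** Hypothetical stand-in for IndexWriter / DBConnection; logs when closed. */
    static class Resource implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Resource(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add("closed " + name);
        }
    }

    /** Runs the indexing skeleton and returns a log of what happened. */
    static List<String> buildIndex(boolean failOnConnect) {
        List<String> log = new ArrayList<>();
        Resource writer = null;
        Resource conn = null;
        try {
            writer = new Resource("writer", log);
            if (failOnConnect) {
                // Simulates a failure before 'conn' has been assigned.
                throw new IllegalStateException("simulated connection failure");
            }
            conn = new Resource("conn", log);
            log.add("indexed");
        } catch (Exception ex) {
            log.add("error: " + ex.getMessage());
        } finally {
            // Close each resource exactly once, and only if it was opened.
            if (writer != null) {
                writer.close();
            }
            if (conn != null) {
                conn.close();
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(buildIndex(false));
        System.out.println(buildIndex(true));
    }
}
```

With this layout the writer is closed even when the connection fails, and
the null checks keep the finally block from throwing an NPE of its own.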
Otis
--- Marcel Stör <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> My Lucene IndexSearcher returns too few hits when I use some extended
> query syntax. I'll give examples of my query/hits pairs at the
> bottom.
> I'm indexing a database table:
>
> <CODE>
> //creating a full index
> DBConnection conn = null;
> try {
>     Analyzer analyzer = new StandardAnalyzer();
>     IndexWriter writer = new IndexWriter(indexFile, analyzer, true);
>     conn = DBConnectionFactory.getDBConnetion(DBConnectionFactory.ORACLE);
>     String query = "select id, category, problem, solution from kedb.solution";
>     conn.open(cred);
>     ResultSet rs = conn.execute(query);
>     while (rs.next()) {
>         Document doc = new Document();
>         doc.add(Field.Text("id", rs.getString(1)));
>         doc.add(Field.Text("category", rs.getString(2)));
>         doc.add(Field.Text("problem", rs.getString(3)));
>         doc.add(Field.UnStored("solution", rs.getString(4)));
>         writer.addDocument(doc);
>     }
>     writer.close();
>     conn.close();
> } catch (Exception ex) {
>     ex.printStackTrace();
> } finally {
>     conn.close();
> }
>
> //searching the index
> Analyzer analyzer = new StandardAnalyzer();
> Searcher searcher = new IndexSearcher(IndexReader.open(indexFile));
> Query q = QueryParser.parse(query, "problem", analyzer);
> Hits hits = searcher.search(q);
> for (int i = 0; i < hits.length(); i++) {
>     if (0 == cat.compareTo("")
>             || 0 == hits.doc(i).get("category").compareTo(cat)) {
>         ids.add(hits.doc(i).get("id"));
>         cats.add(hits.doc(i).get("category"));
>         probs.add(hits.doc(i).get("problem"));
>     }
> }
> </CODE>
>
> This is sample data from the table. Right, it's German and not
> English!
> I tried GermanAnalyzer instead of StandardAnalyzer but that made the
> search results even worse.
>
> ID;CATEGORY;PROBLEM;SOLUTION;
> 1;2;Irgendetwas mit dem Novell Server stimmt nicht;Server neu booten;
> 2;15;User kann sich nicht mehr am Novell anmelden. Kabel ist eingesteckt. Neubooten nützt nichts. Baum wird nicht gefunden.;Baum manuell suchen über Button auf Login-Maske.;
> 3;9;Makros in Word funktionieren nicht;Optionen geändert (Medium Level).
> Extras - Optionen - Makros;
> 4;5;Scanner funktioniert nicht. Gerät erscheint nicht in der Liste der USB Geräte.;Neusten Treiber installieren.
> VORSICHT: NT 4.0 unterstützt kein USB!!!;
> 5;16;Maus spinnt;Reinigen/Auswechseln;
> 6;16;Eheprobleme;Adresse einer Beratungsstelle vermitteln;
> 7;11;Browser bringt imme eine Sexseite als Startseite;Extras - Optionen - Startseite festlegen;
>
>
> #   Query                     /hits  /expected
> 1   Novell                    /2     /2
> 2   Nove*                     /0     /2
> 3   Novel~                    /0     /2
> 4   N?vell                    /0     /2
> 5   +Scanner +Baum            /0     /0
> 6   Scanner -USB              /0     /0
> 7   usb AND liste             /1     /1
> 8   Mikros~                   /1     /1
> 9   Ehe*                      /1     /0
> 10  "Word Optionen"~10        /1     /0
> 11  "Browser Sexseite"~10     /1     /1
>
>
> Boolean operators never seem to be a problem (ok, they're the easiest
> to implement ;-)). Fuzzy, wildcard, and proximity searches just don't
> return enough hits. Numbers 10 and 11 are the most surprising: at 10
> the proximity search fails, but at 11 everything is fine!
>
> If you have any tips on how to improve the search process, please let
> me know.
>
> Regards,
> Marcel
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>