On 01 May 2006, at 02:53, Andi Vajda wrote:
Secondly, it doesn't seem to be possible (in PyLucene 1.9.1) to
search an untokenized field using a term that contains spaces. For
a document that has a creator "Doe J", the query
creator:"Doe J"
doesn't return any results, and
creator:Doe J
doesn't match what it needs to.
Again, please send in code that reproduces the problem. If you can
make sure that what you're trying to do works in Java Lucene, that's
a plus.
Ideally, your sample code would be organized as unit tests.
Writing the tests was a good idea: I realised that StandardAnalyzer
was converting the search terms to lowercase when used via
QueryParser, but not when adding untokenized fields to the document
using IndexWriter, so the two weren't matching. Fixed now, thanks (and
it's presumably not a PyLucene problem).
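For the record, the mismatch can be shown in plain Python, without
PyLucene at all (the analyze() helper below is just a stand-in
imitating StandardAnalyzer's split-and-lowercase step, not the real
API): QueryParser lowercases the query terms, the UN_TOKENIZED field
stores the value verbatim, so the two sides never compare equal.

```python
# Stand-in for StandardAnalyzer as applied by QueryParser:
# tokenize on whitespace, then lowercase each token.
def analyze(text):
    return [token.lower() for token in text.split()]

indexed_term = "Doe J"             # UN_TOKENIZED field: stored verbatim
query_terms = analyze("Doe J")     # what QueryParser actually searches for

print(query_terms)                 # ['doe', 'j'] -- neither equals 'Doe J'
print(indexed_term in query_terms) # False: no hit against the index term
```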
alf.
--------
#!/usr/bin/env python
from PyLucene import *

filestore = FSDirectory.getDirectory("test", True)
analyzer = StandardAnalyzer()
filewriter = IndexWriter(filestore, analyzer, True)

doc = Document()
doc.add(Field('author-space', "Doe J", Field.Store.YES,
              Field.Index.UN_TOKENIZED))
doc.add(Field('author-space-tok', "Doe J", Field.Store.YES,
              Field.Index.TOKENIZED))
doc.add(Field('author-underscore', "Doe_J", Field.Store.YES,
              Field.Index.UN_TOKENIZED))
doc.add(Field('author-underscore-tok', "Doe_J", Field.Store.YES,
              Field.Index.TOKENIZED))
filewriter.addDocument(doc)
filewriter.close()

searcher = IndexSearcher("test")
for q in ("Doe J", "Doe_J"):
    for f in ("author-space", "author-space-tok",
              "author-underscore", "author-underscore-tok"):
        #query = QueryParser.parse(q, f, analyzer)  # only works for tokenized fields
        query = TermQuery(Term(f, q))  # only works for untokenized fields
        hits = searcher.search(query)
        print "\nQ: %s\nQuery: %s\n" % (q, query)
        for i, doc in hits:
            print "Result: %s\n" % doc[f]
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev