I have a couple of questions regarding indexing and searching a
document that has repeated values for the same field (specifically,
the authors of a document, in this case):
Firstly, I'm adding the repeated field with this code:
    for creator in creators:
        doc.add(Field('creator', creator, Field.Store.YES,
                      Field.Index.UN_TOKENIZED))
but can't find a way to read those fields back out from the index. If
I use
    for author in hits[i]["creator"]:
        print author
then only the first "creator" entry is returned for that document, and
it gets split into individual letters. In other words,
hits[i]["creator"] is a string and not a list.
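To make the symptom concrete without PyLucene, here is a minimal plain-Python sketch. The variable names and the second author ("Smith A") are made up for illustration; the string simply stands in for what hits[i]["creator"] appears to return.

```python
# Stand-in for hits[i]["creator"], which appears to hold only the first value.
first_creator = "Doe J"

# Looping over a string walks its characters, which is the behaviour I see.
letters = [ch for ch in first_creator]

# What the loop was meant to walk: one entry per stored creator.
# "Smith A" is a hypothetical second author, not real data.
creators = ["Doe J", "Smith A"]
names = [name for name in creators]
```

So the loop itself is fine; what I'm missing is a way to get all of the stored values for the field as a list.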
Secondly, it doesn't seem to be possible (in PyLucene 1.9.1) to
search an untokenized field using a term that contains spaces. For a
document that has a creator "Doe J", the query
    creator:"Doe J"
doesn't return any results, and
    creator:Doe J
doesn't match the full value "Doe J" either.
Has anyone found solutions to these problems already? For the second I
could just replace spaces with underscores during indexing, but
that wouldn't be an ideal solution.
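For reference, the underscore workaround would look roughly like the sketch below. The helper names encode_creator and creator_query are hypothetical, and I haven't verified how the query parser's analyzer treats the resulting term (it may lowercase or split it), so treat this as an outline rather than a tested fix.

```python
def encode_creator(name):
    """Replace spaces with underscores so the value forms a single term."""
    return name.replace(" ", "_")


def creator_query(name, field="creator"):
    """Build a query string such as creator:Doe_J for the encoded value."""
    return "%s:%s" % (field, encode_creator(name))
```

The same encode_creator() would have to be applied to each creator before doc.add() and to the search term before querying. Stored values would then contain underscores too, and a name that already contains an underscore could not be decoded unambiguously, which is part of why this isn't ideal.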
alf.