On Sat, 12 May 2007, Andra Tori wrote:

I have problems creating binary fields. I have narrowed the problem down
to the non-ASCII characters in the supposedly binary data of the field.

Here's the testcase:
------------------------------------
import PyLucene

a = PyLucene.Field("show_tokens", '\xf3',
       PyLucene.Field.Store.YES)
------------------------------------

Lucene expects Unicode strings. If you pass in a regular byte string such as '\xf3', PyLucene will assume it is UTF-8 encoded when converting it to Unicode for Lucene.

Given that '\xf3' is not a valid UTF-8 string, the conversion fails. If you're going to use non-UTF-8 strings with PyLucene, you need to convert them to Unicode yourself first, with u'\xf3' or unicode('\xe9', 'iso-8859-1'), for example.
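
To illustrate the conversion, here is a minimal sketch in plain Python with no PyLucene calls at all. The helper name to_unicode is hypothetical, and ISO-8859-1 is only an assumption about how the bytes were encoded; substitute whatever encoding your data really uses.

```python
# Sketch only: plain Python, no PyLucene involved.
# Assumption: the byte string from the test case is ISO-8859-1 encoded text.
raw = b'\xf3'


def to_unicode(data, encoding='iso-8859-1'):
    """Hypothetical helper: decode a byte string with an explicit encoding."""
    return data.decode(encoding)


# A lone 0xf3 byte is not a complete UTF-8 sequence, so decoding it
# as UTF-8 raises UnicodeDecodeError.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the encoding the data was actually written in succeeds
# and yields a one-character Unicode string.
text = to_unicode(raw)
```

The resulting text value is a Unicode string, which is the kind of value to hand to Field in place of the raw byte string.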

Andi..
