On Fri, 30 Nov 2007, Felix Schwarz wrote:
Andi Vajda wrote:
The segfault seems to be in testSimpleKeywordAnalyzer() before:
self.assertEqual(ts.next().termText(), input)
The program terminates immediately after ts.next().
Could it be that there is a mismatch in unicode char width between the
python you compiled PyLucene with and the python you're running it with
(which should be the same, really) ?
How can I check this?
I'm just using the Python which comes with CentOS 5 and did not modify
anything in PyLucene (besides some Makefile/setup.py stuff).
The name of the function on the stack, 'PyUnicodeUCS4_FromUnicode', suggests
this could be the case.
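A quick check from the Python side, as a minimal sketch: sys.maxunicode is 65535 on a narrow (2-byte) build and 1114111 on a wide (4-byte) build. The helper name unicode_width_bytes is made up for illustration. This only reports the interpreter you run it with, not the one PyLucene was compiled against, and on Python 3.3 and later it always reports the wide value.

```python
import sys

def unicode_width_bytes(maxunicode=sys.maxunicode):
    """Return 2 for a narrow (UCS2) build, 4 for a wide (UCS4) build."""
    # Narrow builds report 0xFFFF; wide builds report 0x10FFFF.
    return 2 if maxunicode == 0xFFFF else 4

if __name__ == "__main__":
    print("sys.maxunicode = %d, Py_UNICODE is %d bytes wide"
          % (sys.maxunicode, unicode_width_bytes()))
```

Run it with the same python binary that runs the tests.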
To debug this, use gdb. You can recompile PyLucene with DEBUG=1 to disable
optimizations and get a better gdb experience.
Edit JCCEnv.cpp and add:
printf("sizeof(Py_UNICODE) == sizeof(jchar): %d\n",
sizeof(Py_UNICODE) == sizeof(jchar));
to the top of the JCCEnv::fromJString function and rebuild.
If it prints '1', I suspect a problem: unless I'm mistaken,
PyUnicodeUCS4_FromUnicode expects 4-byte unicode chars, yet Java's jchar is
2-byte. Python comes in two unicode flavors: 2-byte wide (UCS2) and 4-byte
wide (UCS4).
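To see why such a mismatch would matter, here is a toy illustration in plain Python (not what JCC actually does; the function name is hypothetical): it takes a buffer of 2-byte code units, as Java would produce, and reads it back as 4-byte units, as a UCS4 consumer would.

```python
import struct

def reinterpret_utf16_as_ucs4(text):
    """Read the 2-byte code units of `text` as if they were 4-byte units."""
    raw = text.encode("utf-16-le")       # 2 bytes per char, like jchar
    raw += b"\x00" * (-len(raw) % 4)     # pad to a multiple of 4 bytes
    count = len(raw) // 4
    return list(struct.unpack("<%dI" % count, raw))

units = reinterpret_utf16_as_ucs4(u"ab")
# Two 2-byte chars collapse into one bogus 4-byte value above 0x10FFFF.
print([hex(u) for u in units])
```

The reader also sees half as many characters as were written, so real code given the original length would run past the end of the buffer, which is the kind of thing that could end in a segfault.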
Of course, I could be completely wrong and misleading you. Only stepping
through gdb can actually tell.
Andi..