I played with different settings in wrd.cfg, but all I achieved is that I no
longer get ZeroDivisionErrors; stemming is now disabled.
But since a lot of entries are missing in the indexes (i.e. you can’t find most
of our media), I tried to regenerate all indexes with --force.
With that I get a lot of errors in invenio.err:
===================
* 2014-02-11 16:15:31 -> WARNING: <class '_mysql_exceptions.Warning'>:
Incorrect string value: '\xD0 \xD0\xA0\xD0\xB5...' for column 'term' at row 1
(/usr/local/lib/python2.6/dist-packages/invenio/dbquery.py:258)
** Traceback details
File "/opt/invenio/bin/bibindex", line 66, in <module>
main()
File "/usr/local/lib/python2.6/dist-packages/invenio/bibindex_engine.py",
line 1461, in main
task_submit_check_options_fnc=task_submit_check_options)
File "/usr/local/lib/python2.6/dist-packages/invenio/bibtask.py", line 606,
in task_init
ret = _task_run(task_run_fnc)
File "/usr/local/lib/python2.6/dist-packages/invenio/bibtask.py", line 1146,
in _task_run
if callable(task_run_fnc) and task_run_fnc():
File "/usr/local/lib/python2.6/dist-packages/invenio/bibindex_engine.py",
line 1928, in task_run_core
wordTable.add_recIDs(final_recIDs, task_get_option("flush"))
File "/usr/local/lib/python2.6/dist-packages/invenio/bibindex_engine.py",
line 865, in add_recIDs
self.put_into_db()
File "/usr/local/lib/python2.6/dist-packages/invenio/bibindex_engine.py",
line 677, in put_into_db
self.put_word_into_db(word, ind_id)
File "/usr/local/lib/python2.6/dist-packages/invenio/bibindex_engine.py",
line 747, in put_word_into_db
set = self.load_old_recIDs(word, index_id)
File "/usr/local/lib/python2.6/dist-packages/invenio/bibindex_engine.py",
line 724, in load_old_recIDs
res = run_sql(query, (word,))
File "/usr/local/lib/python2.6/dist-packages/invenio/dbquery.py", line 258,
in run_sql
rc = cur.execute(sql, param)
File "/usr/lib/pymodules/python2.6/MySQLdb/cursors.py", line 168, in execute
if not self._defer_warnings: self._warning_check()
File "/usr/lib/pymodules/python2.6/MySQLdb/cursors.py", line 82, in
_warning_check
warn(w[-1], self.Warning, 3)
File "/usr/local/lib/python2.6/dist-packages/invenio/errorlib.py", line 591,
in new_showwarning
traceback.print_stack(file=invenio_err)
===================
The string value above is different every time; the rest is the same.
What is this 'term' column?
Is it possible that there’s a Unicode processing problem somewhere? Most of our
documents are in Cyrillic (some even have Cyrillic file names); a Greek user
recently reported errors similar to the ones we get.
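
One quick way to test the Unicode suspicion is to check whether the bytes shown
in the warning are valid UTF-8 at all. The sketch below is plain Python 3 and
independent of Invenio; the helper name first_invalid_utf8 is made up for
illustration, and only the bytes visible before the "..." in the log are used,
since the rest of the value is not shown there.

```python
# Bytes copied from the visible part of the MySQL warning above.
# Everything after the "..." in the log is unknown and left out.
sample = b"\xd0 \xd0\xa0\xd0\xb5"


def first_invalid_utf8(data):
    """Return (offset, reason) of the first UTF-8 decoding error, or None.

    Hypothetical helper for illustration; not part of Invenio or MySQLdb.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, exc.reason
    return None


# Report where decoding of the logged bytes fails, if it fails.
print(first_invalid_utf8(sample))

# Skip the first two bytes (0xD0 and the space) and decode the remainder.
print(sample[2:].decode("utf-8"))
```

If the first offending offset is 0, the term starts with a UTF-8 lead byte that
has no continuation byte, while the bytes after the space decode as ordinary
Cyrillic. That would be consistent with a multi-byte character being cut in the
middle somewhere before the insert, but this is only a guess from one log line,
not a confirmed diagnosis.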
Greetlings, Hraban
---
http://www.fiee.net
https://www.cacert.org (I'm an assurer)