Hi Tibor,
thanks for your suggestions:
Does search_pattern() CLI query find your records well? Here is an
example for the demo site:
$ python -c "from invenio.search_engine import search_pattern; \
print search_pattern(p='POETRY', f='collection')"
intbitset([75, 76, 104])
If you find good record IDs, then the problem is not with the collection
index, but with the collection tree and/or collection definitions.
For a demo site collection as well as for our own collection, this finds
no records and returns an empty tuple, so I tried this:
If you don't find your records, then try a direct MARC search:
$ python -c "from invenio.search_engine import search_pattern; \
print search_pattern(p='POETRY', f='980__a')"
This returns results for a demo site collection as well as for our
own collection! (intbitset([105, 106, 107, 108, 109, ..., 5440390,
5440391, 5440392, 5440393, 5440394]))
If the latter succeeds, but not the former, then the problem is with the
definition of the logical field `collection'. Just some wild-guess
debugging hints.
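The two checks quoted above can be combined into one small triage helper. This is only a minimal sketch: `diagnose` and `fake_search` are hypothetical names, and `search` stands in for `search_pattern` from `invenio.search_engine`, which is not imported here so that the sketch runs without an Invenio installation.

```python
def diagnose(search, value):
    """Classify the problem from the two searches described above.

    `search` is any callable accepting the keyword arguments p and f
    and returning a collection of record IDs (a hypothetical stand-in
    for search_pattern).
    """
    by_field = search(p=value, f='collection')  # logical field lookup
    by_tag = search(p=value, f='980__a')        # direct MARC tag lookup
    if len(by_field) > 0:
        return 'collection tree or collection definitions'
    if len(by_tag) > 0:
        return "definition of the logical field 'collection'"
    return 'no records found by either search'


def fake_search(p, f):
    # Hypothetical data mimicking the situation reported in this mail:
    # the MARC tag search finds records, the logical field search does not.
    if f == '980__a':
        return [105, 106, 107]
    return []


print(diagnose(fake_search, 'POETRY'))
```

On a real installation one would pass `search_pattern` itself instead of `fake_search`.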
FYI, we have not changed any configuration or definitions concerning the
collection field since successfully loading the first 60 000 records.
What could be wrong with the definition of the logical field 'collection'?
In "Manage logical fields", 'collection' has the MARC tag "980__%" (and
we used the 980 tag in our loaded records).
In "Manage Indexes", all indexes have more than 0 records except the
collection index. Yet the collection index is linked to the
collection field?!
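To illustrate why the tag "980__%" should cover the 980__a subfield used in our records, here is a minimal sketch. It assumes that the trailing % acts as a wildcard for any run of characters, as in an SQL LIKE pattern; `tag_matches` is a hypothetical helper for illustration, not an Invenio function, and it treats the underscores as literal characters.

```python
import re


def tag_matches(pattern, tag):
    """Return True if `tag` matches `pattern`, where % is a wildcard.

    Assumption: only % is special; every other character, including
    the underscores that stand for blank indicators, is literal.
    """
    # Escape the literal parts and join them with ".*" for each %.
    regex = '.*'.join([re.escape(part) for part in pattern.split('%')])
    return re.match('^' + regex + '$', tag) is not None


print(tag_matches('980__%', '980__a'))
print(tag_matches('980__%', '245__a'))
```

Under this assumption the first call reports a match and the second does not, so the tag definition itself looks consistent with the records we loaded.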
Here are the results for your further debugging hints:
What about `Collection Status' page? Is everything `OK' there?
<http://localhost/admin/websearch/websearchadmin.py?colID=1&ln=en&mtype=perform_checkcollectionstatus>
Yes, everything is 'OK' there.
Anything interesting in `/opt/invenio/var/log/bibsched_task_*.err' or in
`/opt/invenio/var/log/invenio.err' files in this respect?
Nothing directly concerning bibindex. The only other 'new' problem is a
bibupload TypeError (it seems to occur on each bibupload). These logs are
attached. I don't know whether it is related to our 'result display /
collection' problem.
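A note on the attached traceback: it shows rec_id = 'None' at the failing line, directly after the "Duplicate entry '0' for key 1" error from the creation of the new record. The sketch below reproduces only that failure mode, under the assumption that the record creation returned None after the failed INSERT. `safe_message` is a hypothetical guard for illustration, not the Invenio code.

```python
# Same format string as on line 216 of bibupload.py in the traceback.
MESSAGE = " -Creation of a new record id (%d): DONE"

# Formatting None with %d raises TypeError, as seen in the log.
try:
    MESSAGE % None
    raised = False
except TypeError:
    raised = True
print(raised)


def safe_message(rec_id):
    """Hypothetical guard: report the failure instead of crashing."""
    if rec_id is None:
        return " -Creation of a new record id: FAILED"
    return MESSAGE % rec_id


print(safe_message(None))
print(safe_message(105))
```

If this reading is right, the TypeError is only a follow-up symptom, and the database error reported just before it would be the thing to investigate.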
Also, what Invenio version are you using, and does your database run well in
UTF-8 mode?
The database runs in UTF-8 mode; here are the further details:
* Hostname: zb0035.zb.kfa-juelich.de
* Invenio version: 0.99.90.20100628
* Python version: 2.4.3 (#1, Jun 11 2009, 14:09:58) [GCC 4.1.2 20080704 (Red Hat 4.1.2-44)]
* Apache version: Apache/2.2.3 (Red Hat) (Release 43.el5) [/usr/sbin/httpd];
Apache/2.2.3 (Red Hat) (Release 43.el5) [/usr/sbin/httpd.event];
Apache/2.2.3 (Red Hat) (Release 43.el5) [/usr/sbin/httpd.worker]
* MySQLdb version: 1.2.1_p2
* MySQL version:
- version: 5.0.77-log
- character_set_client: utf8
- character_set_connection: utf8
- character_set_database: utf8
- character_set_results: utf8
- character_set_server: utf8
- character_set_system: utf8
- collation_connection: utf8_general_ci
- collation_database: utf8_general_ci
- collation_server: utf8_general_ci
>>> System details detected successfully.
It would be great if you had more hints for locating and solving this
problem.
Best Regards
Cornelia
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
2010-11-26 12:24:12 --> Task #7713 started.
2010-11-26 12:24:12 --> Input file '/cdsware/home/cdsware/scopus_simple_test.xml', input mode 'replace_or_insert'.
2010-11-26 12:24:12 --> Error during the creation_new_record function : (1062, "Duplicate entry '0' for key 1")
2010-11-26 12:24:12 --> Task #7713 finished. [ERROR]:
The following problem occurred on <http://zb0035.zb.kfa-juelich.de> (CDS Invenio 0.99.90.20100628)
>> 2010-11-26 12:24:12 -> TypeError: int argument required
>>> User details
No client information available
>>> Traceback details
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/invenio/bibtask.py", line 817, in _task_run
    if callable(task_run_fnc) and task_run_fnc():
  File "/usr/lib/python2.4/site-packages/invenio/bibupload.py", line 1976, in task_run_core
    pretend=task_get_option('pretend'))
  File "/usr/lib/python2.4/site-packages/invenio/bibupload.py", line 216, in bibupload
    write_message(" -Creation of a new record id (%d): DONE" % rec_id, verbose=2)
TypeError: int argument required
Locals by frame, innermost last
>>>> Frame ? in /cdsware/cds-invenio/bin/bibupload at line 55
*******************************************************************************
52 import sys
53 sys.exit(1)
54
----> 55 main()
*******************************************************************************
__builtins__ = "<module '__builtin__' (built-in)>"
__name__ = "'__main__'"
__file__ = "'/cdsware/cds-invenio/bin/bibupload'"
main = '<function main at 0x99a648c>'
__doc__ = "'\\nBibUpload: Receive MARC XML file and
update the appropriate database tables according to options.\\n\\n Usage:
bibupload [options] input.xml\\n Examples: \\n $ bibupload -i
input.xml\\n\\n Options:\\n -a, --append new fields are
appended to the existing record\\n -c, --correct fields are
replaced by the new ones in the existing record\\n -f, --format
takes only the FMT fields into account. Does not update\\n -i, --insert
[...]
>>>> Frame main in /usr/lib/python2.4/site-packages/invenio/bibupload.py at
>>>> line 1821
*******************************************************************************
1818 "pretend"
1819 ]),
1820
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
----> 1821 task_run_fnc=task_run_core)
1822
1823 def task_submit_elaborate_specific_parameter(key, value, opts, args):
1824 """ Given the string key it checks it's meaning, eventually
using the
*******************************************************************************
>>>> Frame task_init in /usr/lib/python2.4/site-packages/invenio/bibtask.py at
>>>> line 351
*******************************************************************************
348 ret = _task_run(task_run_fnc)
349 write_message("ERROR: The Python Profiler is not
installed!", stream=sys.stderr)
350 else:
----> 351 ret = _task_run(task_run_fnc)
352 if not ret:
353 write_message("Error occurred. Exiting.",
sys.stderr)
354 except Exception, e:
*******************************************************************************
task_submit_check_options_fnc = 'None'
task_stop_helper_fnc = 'None'
version = "'$Id$'"
help_specific_usage = "' -a, --append\\t\\tnew fields are appended
to the existing record\\n -c, --correct\\t\\tfields are replaced by the new
ones in the existing record\\n -f, --format\\t\\ttakes only the FMT fields
into account. Does not update\\n -i, --insert\\t\\tinsert the new record in
the database\\n -r, --replace\\t\\tthe existing record is entirely replaced by
the new one\\n -z, --reference\\tupdate references (update only 999 fields)\\n
-d, --delete\\t\\tspecified fields are deleted in existing reco [...]
to_be_submitted = 'False'
argv = "['/cdsware/cds-invenio/bin/bibupload', '-ir',
'/cdsware/home/cdsware/scopus_simple_test.xml']"
specific_params = "('ircazdS:fno', ['insert', 'replace',
'correct', 'append', 'reference', 'delete', 'stage=', 'format', 'notimechange',
'holdingpen', 'pretend'])"
task_run_fnc = '<function task_run_core at 0x99a656c>'
authorization_action = "'runbibupload'"
task_submit_elaborate_specific_parameter_fnc = '<function
task_submit_elaborate_specific_parameter at 0x99a64c4>'
authorization_msg = "'BibUpload Task Submission'"
description = "'Receive MARC XML file and update appropriate
database\\ntables according to options.\\nExamples:\\n $ bibupload -i
input.xml\\n'"
>>>> Frame _task_run in /usr/lib/python2.4/site-packages/invenio/bibtask.py at
>>>> line 824
*******************************************************************************
821 except SystemExit:
822 pass
823 except:
----> 824 register_exception(alert_admin=True)
825 task_update_status("ERROR")
826 finally:
827 task_status = task_read_status()
*******************************************************************************
sleeptime = "''"
time_now = '1290770652.2199969'
pidfile_name =
"'/cdsware/cds-invenio/var/run/bibsched_task_7713.pid'"
task_run_fnc = '<function task_run_core at 0x99a656c>'
task_status = "'WAITING'"
pidfile = "<closed file
'/cdsware/cds-invenio/var/run/bibsched_task_7713.pid', mode 'w' at 0x9901ec0>"
>>>> Frame task_run_core in
>>>> /usr/lib/python2.4/site-packages/invenio/bibupload.py at line 1976
*******************************************************************************
1973
opt_stage_to_start_from=task_get_option('stage_to_start_from'),
1974
opt_notimechange=task_get_option('notimechange'),
1975 oai_rec_id=record_id,
----> 1976 pretend=task_get_option('pretend'))
1977 if error[0] == 1:
1978 if record:
1979 write_message(record_xml_output(record),
*******************************************************************************
record_id = "''"
recs = "[{'245': [([('a', 'Testdatensatz')], ' ', ' ',
'', 2)], '100': [([('a', 'Plott, Cornelia')], ' ', ' ', '', 1)], '980':
[([('a', 'SCOPUS')], ' ', ' ', '', 3)]}]"
record = "{'245': [([('a', 'Testdatensatz')], ' ', ' ',
'', 2)], '100': [([('a', 'Plott, Cornelia')], ' ', ' ', '', 1)], '980':
[([('a', 'SCOPUS')], ' ', ' ', '', 3)]}"
error = '0'
>>>> Frame bibupload in /usr/lib/python2.4/site-packages/invenio/bibupload.py at line 216
*******************************************************************************
213 insert_mode_p = True
214 # Insert the record into the bibrec databases to have a recordId
215 rec_id = create_new_record(pretend=pretend)
----> 216 write_message(" -Creation of a new record id (%d): DONE" % rec_id, verbose=2)
217
218 # we add the record Id control field to the record
219 error = record_add_field(record, '001', controlfield_value=rec_id)
*******************************************************************************
opt_mode = "'replace_or_insert'"
opt_stage_to_start_from = '1'
record_deleted_p = 'False'
opt_tag = 'None'
pretend = 'False'
rec_id = 'None'
opt_notimechange = '0'
oai_rec_id = "''"
insert_mode_p = 'True'
error = 'None'
record = "{'245': [([('a', 'Testdatensatz')], ' ', ' ',
'', 2)], '100': [([('a', 'Plott, Cornelia')], ' ', ' ', '', 1)], '980':
[([('a', 'SCOPUS')], ' ', ' ', '', 3)]}"
-------------------------------------------------------------------------------------------------------------------
2010-11-26 12:25:16 --> Task #7714 started.
2010-11-26 12:25:16 --> Input file '/cdsware/home/cdsware/scopus_test.xml', input mode 'replace_or_insert'.
2010-11-26 12:25:16 --> Error during the creation_new_record function : (1062, "Duplicate entry '0' for key 1")
2010-11-26 12:25:16 --> Task #7714 finished. [ERROR]:
The following problem occurred on <http://zb0035.zb.kfa-juelich.de> (CDS Invenio 0.99.90.20100628)
>> 2010-11-26 12:25:16 -> TypeError: int argument required
>>> User details
No client information available
>>> Traceback details
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/invenio/bibtask.py", line 817, in _task_run
    if callable(task_run_fnc) and task_run_fnc():
  File "/usr/lib/python2.4/site-packages/invenio/bibupload.py", line 1976, in task_run_core
    pretend=task_get_option('pretend'))
  File "/usr/lib/python2.4/site-packages/invenio/bibupload.py", line 216, in bibupload
    write_message(" -Creation of a new record id (%d): DONE" % rec_id, verbose=2)
TypeError: int argument required
Locals by frame, innermost last
>>>> Frame ? in /cdsware/cds-invenio/bin/bibupload at line 55
*******************************************************************************
52 import sys
53 sys.exit(1)
54
----> 55 main()
*******************************************************************************
__builtins__ = "<module '__builtin__' (built-in)>"
__name__ = "'__main__'"
__file__ = "'/cdsware/cds-invenio/bin/bibupload'"
main = '<function main at 0x9f6e48c>'
__doc__ = "'\\nBibUpload: Receive MARC XML file and
update the appropriate database tables according to options.\\n\\n Usage:
bibupload [options] input.xml\\n Examples: \\n $ bibupload -i
input.xml\\n\\n Options:\\n -a, --append new fields are
appended to the existing record\\n -c, --correct fields are
replaced by the new ones in the existing record\\n -f, --format
takes only the FMT fields into account. Does not update\\n -i, --insert
[...]
>>>> Frame main in /usr/lib/python2.4/site-packages/invenio/bibupload.py at
>>>> line 1821
*******************************************************************************
1818 "pretend"
1819 ]),
1820
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
----> 1821 task_run_fnc=task_run_core)
1822
1823 def task_submit_elaborate_specific_parameter(key, value, opts, args):
1824 """ Given the string key it checks it's meaning, eventually
using the
*******************************************************************************
>>>> Frame task_init in /usr/lib/python2.4/site-packages/invenio/bibtask.py at
>>>> line 351
*******************************************************************************
348 ret = _task_run(task_run_fnc)
349 write_message("ERROR: The Python Profiler is not
installed!", stream=sys.stderr)
350 else:
----> 351 ret = _task_run(task_run_fnc)
352 if not ret:
353 write_message("Error occurred. Exiting.",
sys.stderr)
354 except Exception, e:
*******************************************************************************
task_submit_check_options_fnc = 'None'
task_stop_helper_fnc = 'None'
version = "'$Id$'"
help_specific_usage = "' -a, --append\\t\\tnew fields are appended
to the existing record\\n -c, --correct\\t\\tfields are replaced by the new
ones in the existing record\\n -f, --format\\t\\ttakes only the FMT fields
into account. Does not update\\n -i, --insert\\t\\tinsert the new record in
the database\\n -r, --replace\\t\\tthe existing record is entirely replaced by
the new one\\n -z, --reference\\tupdate references (update only 999 fields)\\n
-d, --delete\\t\\tspecified fields are deleted in existing reco [...]
to_be_submitted = 'False'
argv = "['/cdsware/cds-invenio/bin/bibupload', '-ir',
'/cdsware/home/cdsware/scopus_test.xml']"
specific_params = "('ircazdS:fno', ['insert', 'replace',
'correct', 'append', 'reference', 'delete', 'stage=', 'format', 'notimechange',
'holdingpen', 'pretend'])"
task_run_fnc = '<function task_run_core at 0x9f6e56c>'
authorization_action = "'runbibupload'"
task_submit_elaborate_specific_parameter_fnc = '<function
task_submit_elaborate_specific_parameter at 0x9f6e4c4>'
authorization_msg = "'BibUpload Task Submission'"
description = "'Receive MARC XML file and update appropriate
database\\ntables according to options.\\nExamples:\\n $ bibupload -i
input.xml\\n'"
>>>> Frame _task_run in /usr/lib/python2.4/site-packages/invenio/bibtask.py at
>>>> line 824
*******************************************************************************
821 except SystemExit:
822 pass
823 except:
----> 824 register_exception(alert_admin=True)
825 task_update_status("ERROR")
826 finally:
827 task_status = task_read_status()
*******************************************************************************
sleeptime = "''"
time_now = '1290770716.167927'
pidfile_name =
"'/cdsware/cds-invenio/var/run/bibsched_task_7714.pid'"
task_run_fnc = '<function task_run_core at 0x9f6e56c>'
task_status = "'WAITING'"
pidfile = "<closed file
'/cdsware/cds-invenio/var/run/bibsched_task_7714.pid', mode 'w' at 0x9ec9d58>"
>>>> Frame task_run_core in
>>>> /usr/lib/python2.4/site-packages/invenio/bibupload.py at line 1976
*******************************************************************************
1973
opt_stage_to_start_from=task_get_option('stage_to_start_from'),
1974
opt_notimechange=task_get_option('notimechange'),
1975 oai_rec_id=record_id,
----> 1976 pretend=task_get_option('pretend'))
1977 if error[0] == 1:
1978 if record:
1979 write_message(record_xml_output(record),
*******************************************************************************
record_id = "''"
recs = "[{'594': [([('a', 'ar')], ' ', ' ', '', 28)],
'598': [([('a', 'Copyright 2008 Elsevier B.V., All rights reserved.')], ' ', '
', '', 29)], '980': [([('a', 'SCOPUS')], ' ', ' ', '', 66)], '700': [([('a',
'Raina C.'), ('0', '8341768600'), ('l', 'ind'), ('u', 'Department of Food
Technology S.L. Institute of Engineering and Technology, Longowal, Sangrur,
Punjab 148 106, ind')], ' ', ' ', '', 54), ([('a', 'Singh S.'), ('0',
'15849538900'), ('l', 'ind'), ('u', 'Department of Food Technology S.L. Insti
[...]
record = "{'594': [([('a', 'ar')], ' ', ' ', '', 28)],
'598': [([('a', 'Copyright 2008 Elsevier B.V., All rights reserved.')], ' ', '
', '', 29)], '980': [([('a', 'SCOPUS')], ' ', ' ', '', 66)], '700': [([('a',
'Raina C.'), ('0', '8341768600'), ('l', 'ind'), ('u', 'Department of Food
Technology S.L. Institute of Engineering and Technology, Longowal, Sangrur,
Punjab 148 106, ind')], ' ', ' ', '', 54), ([('a', 'Singh S.'), ('0',
'15849538900'), ('l', 'ind'), ('u', 'Department of Food Technology S.L. Instit
[...]
error = '0'
>>>> Frame bibupload in /usr/lib/python2.4/site-packages/invenio/bibupload.py at line 216
*******************************************************************************
213 insert_mode_p = True
214 # Insert the record into the bibrec databases to have a recordId
215 rec_id = create_new_record(pretend=pretend)
----> 216 write_message(" -Creation of a new record id (%d): DONE" % rec_id, verbose=2)
217
218 # we add the record Id control field to the record
219 error = record_add_field(record, '001', controlfield_value=rec_id)
*******************************************************************************
opt_mode = "'replace_or_insert'"
opt_stage_to_start_from = '1'
record_deleted_p = 'False'
opt_tag = 'None'
pretend = 'False'
rec_id = 'None'
opt_notimechange = '0'
oai_rec_id = "''"
insert_mode_p = 'True'
error = 'None'
record = "{'594': [([('a', 'ar')], ' ', ' ', '', 28)],
'598': [([('a', 'Copyright 2008 Elsevier B.V., All rights reserved.')], ' ', '
', '', 29)], '980': [([('a', 'SCOPUS')], ' ', ' ', '', 66)], '700': [([('a',
'Raina C.'), ('0', '8341768600'), ('l', 'ind'), ('u', 'Department of Food
Technology S.L. Institute of Engineering and Technology, Longowal, Sangrur,
Punjab 148 106, ind')], ' ', ' ', '', 54), ([('a', 'Singh S.'), ('0',
'15849538900'), ('l', 'ind'), ('u', 'Department of Food Technology S.L. Instit
[...]