This is an automated notification sent by LCG Savannah.
It relates to:
task #6809, project CDS Invenio
==============================================================================
LATEST MODIFICATIONS of task #6809:
==============================================================================
Update of task #6809 (project cdsware):
Status: None => Done
Percent Complete: 0% => 100%
Open/Closed: Open => Closed
==============================================================================
OVERVIEW of task #6809:
==============================================================================
URL:
<http://savannah.cern.ch/task/?6809>
Summary: improve citation indexer progress logging
Project: CDS Invenio
Submitted by: simko
Submitted on: 2008-04-30 12:10
Should Start On: 2008-04-30 00:00
Should be Finished on: 2008-04-30 00:00
Category: BibRank
Priority: 5 - Normal
Status: Done
Privacy: Public
Percent Complete: 100%
Assigned to: man
Open/Closed: Closed
Discussion Lock: Any
Effort: 0.00
_______________________________________________________
The citation indexer may take many hours to run (e.g. 8 hours for
150K records). Currently it does not seem to update its status
while it runs; see the large time gap between log messages:
| 2008-04-14 23:51:39 --> Task #50 started.
| 2008-04-14 23:51:39 --> Running rank method: times cited.
| Execution time for generating citation informations by parsing xml contents: 355.899999999
|
| Execution time for analyzing the citation information generating the dictionary:
| checking ref number: 29237.77
| checking ref ypvt: 0.0
| checking rec number: 108.819999999
| checking rec ypvt: 1015.43
| total time of ref_analyze: 30362.02
| Total time of software: 30725.6972189
| 2008-04-15 08:23:46 --> Running rank method: journal impact factor.
It also writes to stderr instead of stdout:
| 2008-04-14 23:51:40 --> Last update 2008-04-13 20:00:34 records: 123011 updates: 123011
| 2008-04-15 08:11:40 --> Checking records referred to in new records
| 2008-04-15 08:12:14 --> Checking authors in new records
| 2008-04-15 08:23:45 --> size of reversedict 322550
| 2008-04-15 08:23:46 --> size of citationdict 306640
| 2008-04-15 08:23:46 --> size of selfcitedbydict 47206
| 2008-04-15 08:23:46 --> size of selfcitdict 45711
We have to clean up its progress message logging so that (i) the
task updates its status more regularly, so we can see where the
time is spent; and (ii) the task also updates its progress DB
column, so that the admin can see its progress in the bibsched
monitor. See e.g. what the regular word indexer (bibindex) does.
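
The two requirements above could be sketched roughly as follows. This
is only an illustration, not the actual BibRank code: the function
process_with_progress and its parameters handle, report and every are
hypothetical names, and report stands in for whatever call updates the
task's progress column.

```python
import sys
import time


def process_with_progress(records, handle, report, every=1000, out=sys.stdout):
    """Process records, emitting a progress message every `every` items.

    `handle` does the per-record work; `report` is a hypothetical callback
    standing in for the call that updates the task's progress column.
    """
    total = len(records)
    started = time.time()
    for done, record in enumerate(records, 1):
        handle(record)
        # Report at regular intervals and once more at the very end.
        if done % every == 0 or done == total:
            elapsed = time.time() - started
            # Progress messages go to stdout rather than stderr.
            out.write("Done %d/%d records (%.1f s elapsed)\n"
                      % (done, total, elapsed))
            out.flush()
            report("%d/%d" % (done, total))
    return total
```

With a reporting interval small enough relative to the run time, both
the log and the monitor would show where a multi-hour run currently
stands instead of staying silent until the end.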
_______________________________________________________
Carbon-Copy List:
CC Address | Comment
------------------------------------+-----------------------------
3346 | -UPD-
1576 | -SUB-