This is an automated notification sent by LCG Savannah.
It relates to:
task #3227, project CDS Invenio
==============================================================================
LATEST MODIFICATIONS of task #3227:
==============================================================================
Update of task #3227 (project cdsware):
Assigned to: skaplun => lmarian
_______________________________________________________
Follow-up Comment #4:
At the same time a more modern algorithm might be implemented...
See also:
<http://cdsware.cern.ch/repo/?p=personal/cds-invenio-sam.git;a=shortlog;h=bibrank-cythonization>
==============================================================================
OVERVIEW of task #3227:
==============================================================================
URL:
<http://savannah.cern.ch/task/?3227>
Summary: debug math in ranking algorithm
Project: CDS Invenio
Submitted by: simko
Submitted on: 2006-03-29 17:44
Should Start On: 2006-03-29 00:00
Should be Finished on: 2006-03-29 00:00
Category: BibRank
Priority: 5 - Normal
Status: None
Privacy: Public
Percent Complete: 10%
Assigned to: lmarian
Open/Closed: Open
Discussion Lock: Any
Effort: 0.00
_______________________________________________________
http://cdsware.cern.ch:8000/search.py?p=recid:72&rm=wrd&ln=en
sometimes gives errors like:
  File "/log/cdsware-DEMOPLUS/lib/python/cdsware/bibrank_record_sorter.py", line 230, in rank_records
    result = find_similar(rank_method_code, pattern[0][6:], hitset, rank_limit_relevance, verbose)
  File "/log/cdsware-DEMOPLUS/lib/python/cdsware/bibrank_record_sorter.py", line 381, in find_similar
    if len(tf_values) <= methods[rank_method_code]["max_nr_words_lower"] or (len(term_recs)>= methods[rank_method_code]["min_nr_words_docs"] and (((float(len(term_recs)) / float(methods[rank_method_code]["col_size"])) <= methods[rank_method_code]["max_word_occurence"]) and ((float(len(term_recs)) / float(methods[rank_method_code]["col_size"]))>= methods[rank_method_code]["min_word_occurence"]))): #too complicated...something must be done
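The condition at line 381 is hard to audit in its one-line form. As a rough sketch only, it could be split into a small helper with the same boolean logic. The helper name `term_is_usable` and its parameters are hypothetical and not part of bibrank_record_sorter.py. The traceback above does not record which exception was raised, so the guard against a zero "col_size" is an assumption about one plausible cause, not a confirmed diagnosis.

```python
def term_is_usable(n_tf_values, n_term_recs, cfg):
    """Hypothetical refactoring of the one-line condition in find_similar.

    cfg stands in for methods[rank_method_code]; the key names (including
    the original spelling "occurence") are copied from the traceback.
    """
    # First alternative of the original "or": few enough terms overall.
    if n_tf_values <= cfg["max_nr_words_lower"]:
        return True
    # The term must appear in a minimum number of records.
    if n_term_recs < cfg["min_nr_words_docs"]:
        return False
    col_size = float(cfg["col_size"])
    if col_size <= 0:
        # Assumption: with an empty collection the ratio is undefined,
        # so the term is rejected instead of dividing by zero.
        return False
    occurrence = n_term_recs / col_size
    return cfg["min_word_occurence"] <= occurrence <= cfg["max_word_occurence"]
```

Computing the ratio once and naming it makes the two range checks readable, and the early returns mirror the short-circuit order of the original expression.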
_______________________________________________________
Follow-up Comments:
-------------------------------------------------------
Date: 2010-02-09 16:34 By: Samuele Kaplun <skaplun>
At the same time a more modern algorithm might be implemented...
See also:
<http://cdsware.cern.ch/repo/?p=personal/cds-invenio-sam.git;a=shortlog;h=bibrank-cythonization>
-------------------------------------------------------
Date: 2008-11-06 10:16 By: Samuele Kaplun <skaplun>
Moreover, nowadays Cython might be used to greatly improve the speed of
low-level maths. Trying to cythonize the maths part might help in
debugging it...
-------------------------------------------------------
Date: 2008-05-27 12:32 By: Tibor Simko <simko>
Sometimes we also get (e.g. for recID 1103777):
Forced traceback (most recent call last)
  File "/usr/lib64/python2.3/site-packages/invenio/bibrank.py", line 150, in task_run_core
    func_object(key)
  File "/usr/lib64/python2.3/site-packages/invenio/bibrank_word_indexer.py", line 1207, in word_similarity
    return word_index(run)
  File "/usr/lib64/python2.3/site-packages/invenio/bibrank_word_indexer.py", line 839, in word_index
    update_rnkWORD(options["table"], options["modified_words"])
Traceback (most recent call last):
  File "/usr/lib64/python2.3/site-packages/invenio/bibrank_word_indexer.py", line 1114, in update_rnkWORD
    Nj[j] = Nj.get(j, 0) + math.pow(Gi[t] * (1 + math.log(tf[0])), 2)
OverflowError: math range error
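One possible way to keep a single bad term from aborting the whole indexing run is to compute the squared weight in a guarded helper and skip values that are out of range. This is only a sketch: the function names are hypothetical, it is written for current Python 3 and not the Python 2.3 shown in the traceback, and the ticket does not say which operand triggered the error. It relies on the fact that in Python 3 `math.log` raises ValueError for a non-positive argument and `math.pow` raises OverflowError when the result does not fit in a float.

```python
import math

def squared_term_weight(gi, tf):
    """Return (gi * (1 + log(tf))) ** 2, or None if it cannot be computed.

    gi and tf play the roles of Gi[t] and tf[0] in update_rnkWORD.
    """
    if tf <= 0:
        # log() is undefined here; treat the term as unusable.
        return None
    base = gi * (1 + math.log(tf))
    if not math.isfinite(base):
        return None
    try:
        return math.pow(base, 2)
    except OverflowError:
        # Result exceeds the float range ("math range error").
        return None

def accumulate_norm(Nj, j, gi, tf):
    """Add the squared weight to Nj[j]; return False if the term was skipped."""
    weight = squared_term_weight(gi, tf)
    if weight is None:
        return False
    Nj[j] = Nj.get(j, 0) + weight
    return True
```

Whether skipping, clamping, or logging the offending record is the right policy depends on how the norms are used later; the sketch only shows where the check could go.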
-------------------------------------------------------
Date: 2006-05-02 16:07 By: Martin Vesely <vesely>
<http://cdsware.cern.ch:8000/search.py?p=recid:38&rm=wrd&ln=en>
results in:
An error occured when trying to rank the search result
Unexpected error: integer division or modulo by zero
Traceback:
  File "/log/cdsware-DEMOPLUS/lib/python/cdsware/bibrank_record_sorter.py", line 230, in rank_records
    result = find_similar(rank_method_code, pattern[0][6:], hitset, rank_limit_relevance, verbose)
  File "/log/cdsware-DEMOPLUS/lib/python/cdsware/bibrank_record_sorter.py", line 389, in find_similar
    (reclist, hitset) = sort_record_relevance_findsimilar(recdict, rec_termcount, hitset, rank_limit_relevance, verbose)
  File "/log/cdsware-DEMOPLUS/lib/python/cdsware/bibrank_record_sorter.py", line 543, in sort_record_relevance_findsimilar
    w = int(w * 100 / divideby)
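The failing statement scales a raw score to a percentage of `divideby`, so the error appears whenever `divideby` is zero. A minimal sketch of a guarded version follows. The function name and the dict-of-scores input are hypothetical, and returning 0 for every record when there is nothing to normalise against is an assumption about the desired behaviour, not something the ticket specifies.

```python
def normalise_to_percent(scores, divideby):
    """Scale raw relevance scores to integer percentages of divideby.

    scores maps a record ID to its raw weight w.
    """
    if not divideby:
        # Assumption: with a zero divisor all records are equally
        # (ir)relevant, so report 0 instead of raising ZeroDivisionError.
        return {recid: 0 for recid in scores}
    return {recid: int(w * 100 / divideby) for recid, w in scores.items()}
```

For example, `normalise_to_percent({1: 5, 2: 10}, 10)` yields 50 for record 1 and 100 for record 2, while a zero `divideby` yields 0 for every record.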
_______________________________________________________
Carbon-Copy List:
CC Address | Comment
------------------------------------+-----------------------------
2195 | -COM-
1922 | -UPD-
1576 | -SUB-
==============================================================================