If I use TrecContentSource to index a collection, it puts the doc name into
the docname field, just as I like.
Say I have a doc with
<DOCNO>DOCID0001</DOCNO>
The problem is that it concatenates the iteration number to this document name:

name = name + "_" + iteration;

this produces a docname of DOCID0001_0, which won't work if I am trying to
use the quality package to measure relevance.

Does anyone object to changing TrecContentSource to *not* do this?
I would think the primary reason to use it is to measure relevance, and
for that the docname has to match the DOCNO exactly.

Alternatively, we could change DocNameExtractor in the quality package to
ignore this _<iteration> suffix. Either approach is fine with me.
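
To make the second option concrete, here is a rough sketch of what ignoring the suffix could look like, assuming the suffix is always an underscore followed by digits at the end of the name. The class and method names are made up for illustration; this is not the actual DocNameExtractor or TrecContentSource code.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene.
public class IterationSuffix {
    // Assumption: the suffix is a trailing "_" followed by one or more digits.
    private static final Pattern SUFFIX = Pattern.compile("_\\d+$");

    // Mirrors the concatenation quoted above: name + "_" + iteration.
    static String withSuffix(String name, int iteration) {
        return name + "_" + iteration;
    }

    // Removes a single trailing "_<digits>" if present; otherwise returns the name unchanged.
    static String strip(String docName) {
        return SUFFIX.matcher(docName).replaceFirst("");
    }

    public static void main(String[] args) {
        String indexed = withSuffix("DOCID0001", 0);
        System.out.println(indexed);        // DOCID0001_0
        System.out.println(strip(indexed)); // DOCID0001
    }
}
```

One caveat with stripping on the reading side: a DOCNO that legitimately ends in an underscore plus digits would be truncated too, which is an argument for not appending the suffix in the first place.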
-- 
Robert Muir
rcm...@gmail.com
