Samuele,

Yes, that needs to be redefined. When a record is deleted, 99% of the time
it is a duplicate record that has been found, so citations on both records
"double count". This is pretty important: we get a lot of client questions
about citation counts that seem to drop for them.

Deleted records should be unindexed.   

Mike


-----Original Message-----
From: Samuele Kaplun [mailto:[email protected]] 
Sent: Thursday, September 30, 2010 6:43 AM
To: project-cdsware-developers
Subject: DELETED records definition

Dear all,

what is the definition of a DELETED record in Invenio?

According to BibIndex and BibEdit, a record is deleted when "DELETED" has
been put in tag 980__c.

However, by how the "collection" logical field is defined (i.e. with
"980__%"), any subfield of tag 980__ having the value "DELETED" will
correspond to a deleted record. 

Thus BibIndex will remove from its indexes only the records matched by
980__c, while other records will still be indexed and will need to be
manually stripped away, as happens in webcoll:
[...]
            # B - collection does have dbquery, so compute it:
            #     (note: explicitly remove DELETED records)
            if CFG_CERN_SITE:
                reclist = search_pattern(None, self.dbquery + \
                                         ' -collection:"DELETED" -collection:"DUMMY"')
            else:
                reclist = search_pattern(None, self.dbquery + ' -collection:"DELETED"')
[...]

So, records using 980__a or 980__b will be deleted according to WebColl
(and thus WebSearch), but not according to BibIndex. They are in a kind
of limbo state.
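
To make the mismatch concrete, here is a minimal sketch comparing the two
definitions. It is not Invenio code: the RECORDS dictionary and both helper
functions are hypothetical, and records are modelled simply as a mapping from
tag plus subfield code to a list of values. The narrow check mimics the
980__c rule, the broad one mimics the "980__%" pattern of the "collection"
field, and the difference between the two is the set of limbo records.

```python
# Hypothetical in-memory records (not the Invenio API): each record maps
# a MARC tag + subfield code to the list of values stored there.
RECORDS = {
    1: {"980__a": ["ARTICLE"]},   # a normal record
    2: {"980__c": ["DELETED"]},   # deleted under both definitions
    3: {"980__a": ["DELETED"]},   # deleted only under the broad definition
    4: {"980__b": ["DELETED"]},   # deleted only under the broad definition
}


def deleted_by_980c(record):
    """Narrow definition: "DELETED" appears in 980__c only."""
    return "DELETED" in record.get("980__c", [])


def deleted_by_980_any(record):
    """Broad definition: "DELETED" appears in any 980__ subfield (980__%)."""
    return any(
        "DELETED" in values
        for field, values in record.items()
        if field.startswith("980__")
    )


narrow = {recid for recid, rec in RECORDS.items() if deleted_by_980c(rec)}
broad = {recid for recid, rec in RECORDS.items() if deleted_by_980_any(rec)}

# Records hidden by the broad rule but still kept by the narrow one.
limbo = broad - narrow
print(sorted(limbo))
```

With this sample data the limbo set contains exactly the records whose
"DELETED" value sits in a 980__ subfield other than c.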

Should we enlarge the definition implied by BibIndex?

Cheers!
        Sam

-- 
Samuele Kaplun
Invenio Developer ** <http://invenio-software.org/>
