#548: BibMatch: match validation
--------------------------+--------------------------------
  Reporter:  jlavik       |      Owner:  jlavik
      Type:  enhancement  |     Status:  closed
  Priority:  major        |  Milestone:
 Component:  BibMatch     |    Version:
Resolution:  fixed        |   Keywords:  matching, workflow
--------------------------+--------------------------------
Changes (by Jan Aage Lavik <jan.age.lavik@…>):

 * status:  in_merge => closed
 * resolution:   => fixed


Comment:

 In [d39a330c306e3dcb2f70bca11c143df91747bc4e]:
 {{{
 #!CommitTicketReference repository=""
 revision="d39a330c306e3dcb2f70bca11c143df91747bc4e"
 BibMatch: match validation

 * Adds a new sub-module for the match validation step, which compares
   records after searching for potentially matching records.
   (fixes #548)

   * Various methods are used when comparing records, for example
     special metrics for comparing authors, titles and identifiers.

     These comparison methods are configurable per (sub-)field and
     act as rules for matching records. These rules can be grouped
     in rulesets using regular expressions, allowing records to
     be compared differently based on content. (fixes #183)

   * For an exact match to happen, all defined comparison rules must
     succeed. If they do not all succeed, but the ratio of success
     is above a certain (configurable) limit, the match is considered
     fuzzy. Two or more matching fields MUST be found, unless
     certain MARC fields have been configured as 'final' or 'joker'
     types, i.e. identifier fields such as DOI or ISBN.

   * Adds another configuration option to limit the maximum number
     of search results to compare for a single search query.

 * Both match validation and fuzzy searching can be disabled using the
   CLI options '--no-valid' and '--no-fuzzy' respectively.

 * New CLI option available, '--ascii', for transliterating record values
   to ASCII before being used in searching and matching. XML entities,
   like &amp;, are transformed to UTF-8 before searches.

 * Adds a configuration module specific to BibMatch internal globals.

 * Enables automatic logging of BibMatch runs, providing information
   about record matching results.

 * Also adds applicable regression tests, a new unit-test module and
   brand new admin and hacking guides.

 * Detects if any input records are badly parsed by BibRecord.
 }}}
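
 The exact/fuzzy decision described in the commit message can be
 illustrated with a minimal Python sketch. This is not the BibMatch
 code: the names RuleResult, classify_match, FUZZY_RATIO_LIMIT and
 MIN_MATCHING_FIELDS are hypothetical, the 0.65 limit is a placeholder
 for the configurable ratio, and 'final' and 'joker' fields are
 collapsed into a single "decisive" flag that only lifts the two-field
 minimum, which is an assumption about their semantics.

```python
from dataclasses import dataclass

# Placeholder values; the commit only states that the ratio is configurable.
FUZZY_RATIO_LIMIT = 0.65
MIN_MATCHING_FIELDS = 2


@dataclass
class RuleResult:
    tag: str                # MARC (sub-)field the rule was applied to
    matched: bool           # True if the comparison rule succeeded
    decisive: bool = False  # assumed stand-in for a 'final'/'joker' field


def classify_match(results, fuzzy_limit=FUZZY_RATIO_LIMIT):
    """Classify one candidate record as 'exact', 'fuzzy' or 'none'."""
    if not results:
        return "none"
    matched = [r for r in results if r.matched]
    # Two or more fields must match, unless a decisive field matched.
    has_decisive = any(r.decisive for r in matched)
    if len(matched) < MIN_MATCHING_FIELDS and not has_decisive:
        return "none"
    # Exact match: every defined comparison rule succeeded.
    if len(matched) == len(results):
        return "exact"
    # Fuzzy match: not all rules succeeded, but the success ratio
    # is above the limit.
    if len(matched) / len(results) > fuzzy_limit:
        return "fuzzy"
    return "none"
```

 For example, a candidate for which three of four rules succeed has a
 success ratio of 0.75 and would be reported as fuzzy under the
 placeholder limit, while a single matching DOI rule marked as decisive
 would be accepted despite the two-field minimum.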

-- 
Ticket URL: </ticket/548#comment:3>
Invenio <http://invenio-software.org>
