#251: BibFormat: problems with format_record() and live incomplete MARCXML
-----------------------+----------------------------------------------------
 Reporter:  simko      |       Owner:      
     Type:  defect     |      Status:  new 
 Priority:  critical   |   Milestone:  v1.0
Component:  BibFormat  |     Version:      
 Keywords:             |  
-----------------------+----------------------------------------------------
 There are some problems when using `format_record()` on MARCXML
 snippets that lack some of the expected fields and/or a real record
 ID (tag 001).

 1) A small problem is that recID cannot really be `None` if MARCXML is
 passed, since it leads to tracebacks in statements like:

 {{{
 register_exception(prefix="An error occured while formatting record %i in %s" % \
                    (recID, of),
                    alert_admin=True)
 }}}

 We can live with this by passing a fake recID, but we should probably
 document it in the docstring.
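
 For illustration, here is a minimal sketch of why `%i` breaks when recID
 is `None`, and of one way the message could tolerate it. The helper
 names `record_label` and `error_prefix` are hypothetical and are not
 part of BibFormat; this is only an assumption about how a guard might
 look, not the actual fix.

```python
def record_label(recID):
    """Return a printable label for recID, tolerating None."""
    # "%i" % None raises TypeError, so fall back to a placeholder.
    if recID is None:
        return "(no recID)"
    return "%i" % recID


def error_prefix(recID, of):
    """Build the error message prefix without assuming an integer recID."""
    # %s is used for the record label because it is already a string.
    return "An error occurred while formatting record %s in %s" % (
        record_label(recID), of)
```

 With this guard, `error_prefix(None, 'xe')` returns a message instead
 of raising a `TypeError`.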

 2) The real problem is that some output formats, such as EndNote and
 RefWorks, seem to assume the presence of many fields. This is not the
 case for, e.g., external items in baskets, which have only a handful of
 fields defined and do not even have `001`.

 A simple test case that fails:

 {{{
 z = """<?xml version="1.0" encoding="UTF-8"?>
   <record>
     <controlfield tag="001">1234</controlfield>
     <datafield tag="100" ind1=" " ind2=" ">
       <subfield code="a">Doe, J</subfield>
     </datafield>
     <datafield tag="245" ind1=" " ind2=" ">
       <subfield code="a">On the foo and bar</subfield>
     </datafield>
   </record>"""
 from invenio.bibformat import format_record
 format_record(1234, 'xe', xml_record=z, on_the_fly=True)
 }}}

 A test value that works:

 {{{
 z = format_record(1,'xm')
 }}}

 but eliminate `001` from the snippet and it will stop working:

 {{{
 z = z.replace('<controlfield tag="001">1</controlfield>\n  ','')
 }}}

 The typical error is:

 {{{
 In [19]: format_record(1234, 'xe', xml_record=z, on_the_fly=True)
 Entity: line 2: parser error : Start tag expected, '<' not found

 ^
 Out[19]: '<abbr class="unapi-id" title="1234"></abbr>\n'
 }}}
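
 To tell the two cases above apart before calling `format_record()`,
 a caller could check whether a snippet carries tag `001` at all. The
 sketch below uses only the Python standard library; the function name
 `has_record_id` is hypothetical and not part of Invenio, and it assumes
 the snippet is a single well-formed `<record>` element, with or without
 an XML namespace.

```python
import xml.etree.ElementTree as ET


def has_record_id(marcxml):
    """Return True if the MARCXML snippet has a non-empty controlfield 001."""
    # Strip surrounding whitespace so the XML declaration, if any, comes first,
    # and pass bytes so the encoding declaration is honoured by the parser.
    root = ET.fromstring(marcxml.strip().encode("utf-8"))
    for element in root.iter():
        # Drop a possible "{namespace}" prefix from the tag name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name != "controlfield":
            continue
        if element.get("tag") == "001" and (element.text or "").strip():
            return True
    return False
```

 A caller could use the result to decide whether to pass a fake recID
 or to skip output formats that depend on `001`.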

-- 
Ticket URL: <http://invenio-software.org/ticket/251>
Invenio <http://invenio-software.org>
