https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35104

--- Comment #116 from Martin Renvoize (ashimema) 
<[email protected]> ---
Created attachment 194221
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=194221&action=edit
Bug 35104: Strip non-XML characters gracefully with per-field error tracking

Rewrite Metadata::store to strip non-XML characters automatically rather
than throwing a Koha::Exceptions::Metadata::Invalid exception.  When
stripping is needed:

* Every bad character is located individually via _find_nonxml_chars(),
  which scans the raw MARCXML and records the field reference (e.g.
  336$a), character ordinal, 1-based position within the subfield value,
  and a two-line context snippet generated by _context_snippet().

* Context snippets show up to 30 characters either side of the bad
  character, replacing it with the appropriate Unicode Control Picture
  (U+2400-U+241F for C0 controls, U+2421 for DEL, U+FFFD otherwise) so
  the location is visible even in plain text.  A second line carries a
  caret (^) aligned beneath the replacement glyph.

* Each occurrence is stored as a separate row in biblio_metadata_errors
  with error_type 'nonxml_stripped'.  On a clean re-save the existing
  error rows are deliberately left in place: they are review flags
  requiring explicit human resolution and must not be silently cleared.
  Only a save that triggers fresh stripping replaces the existing set.

* A new stripped_on_last_store() method lets callers distinguish between
  "stripping just happened" and "pre-existing errors are present", so
  the UI can avoid spuriously re-displaying a save-time warning when the
  record is simply re-saved without changes.
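
To make the per-character tracking above concrete, here is a minimal sketch in Python. It is an illustration only, not the Perl code in the patch. The names find_nonxml_chars and context_snippet mirror the helpers named above, but their signatures, the list of (field reference, value) pairs used as input (the patch scans raw MARCXML instead), and the returned dict keys are all assumptions. The character class is the XML 1.0 "Char" production.

```python
import re

# Anything outside the XML 1.0 "Char" production counts as a bad character.
NONXML_RE = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def control_picture(ch):
    """Map a character to a visible stand-in glyph."""
    cp = ord(ch)
    if cp <= 0x1F:
        return chr(0x2400 + cp)   # C0 controls -> U+2400..U+241F
    if cp == 0x7F:
        # DEL is legal in XML 1.0, so NONXML_RE never matches it; the
        # mapping is kept only because the description lists it.
        return "\u2421"
    return "\uFFFD"               # everything else


def context_snippet(value, index, width=30):
    """Return two lines: the context with glyphs, then an aligned caret."""
    def show(text):
        # One-for-one replacement, so caret alignment is preserved.
        return NONXML_RE.sub(lambda m: control_picture(m.group()), text)

    before = show(value[max(0, index - width):index])
    after = show(value[index + 1:index + 1 + width])
    line1 = before + control_picture(value[index]) + after
    line2 = " " * len(before) + "^"
    return line1 + "\n" + line2


def find_nonxml_chars(subfields):
    """subfields: iterable of (field_ref, value) pairs, e.g. ("336$a", "...")."""
    errors = []
    for field_ref, value in subfields:
        for m in NONXML_RE.finditer(value):
            errors.append({
                "field": field_ref,
                "ordinal": ord(m.group()),
                "position": m.start() + 1,   # 1-based within the value
                "snippet": context_snippet(value, m.start()),
            })
    return errors


if __name__ == "__main__":
    for err in find_nonxml_chars([("336$a", "te\x01xt")]):
        print(err["field"], err["ordinal"], err["position"])
        print(err["snippet"])
```

Each match yields its own entry, which corresponds to the one-row-per-occurrence storage described above.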

If the MARCXML cannot be recovered at all even after stripping, the
existing Koha::Exceptions::Metadata::Invalid exception is still thrown.
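
The overall store behaviour can be sketched the same way. This is a hedged Python illustration, not the Koha implementation: MetadataSketch, InvalidMetadata and the in-memory error_rows list are hypothetical stand-ins for Koha::Biblio::Metadata, Koha::Exceptions::Metadata::Invalid and the biblio_metadata_errors table. The ordering (validate first, then replace error rows) is also an assumption.

```python
import re
import xml.etree.ElementTree as ET

# Anything outside the XML 1.0 "Char" production counts as a bad character.
NONXML_RE = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


class InvalidMetadata(Exception):
    """Hypothetical stand-in for Koha::Exceptions::Metadata::Invalid."""


class MetadataSketch:
    def __init__(self):
        self.xml = None
        self.error_rows = []      # stand-in for biblio_metadata_errors
        self._stripped = False

    def store(self, raw_xml):
        self._stripped = False
        offsets = [m.start() for m in NONXML_RE.finditer(raw_xml)]
        cleaned = NONXML_RE.sub("", raw_xml) if offsets else raw_xml

        # Still unparseable after stripping: give up as before.
        try:
            ET.fromstring(cleaned)
        except ET.ParseError as exc:
            raise InvalidMetadata(str(exc)) from exc

        if offsets:
            # Fresh stripping replaces the existing set of error rows.
            self.error_rows = [
                {"error_type": "nonxml_stripped", "offset": o}
                for o in offsets
            ]
            self._stripped = True
        # A clean save leaves existing rows untouched: they are review
        # flags that need explicit human resolution.
        self.xml = cleaned
        return self

    def stripped_on_last_store(self):
        return self._stripped


if __name__ == "__main__":
    rec = MetadataSketch()
    rec.store("<r><s>a\x01b</s></r>")
    print(rec.stripped_on_last_store(), len(rec.error_rows))
    rec.store(rec.xml)            # clean re-save
    print(rec.stripped_on_last_store(), len(rec.error_rows))
```

In this sketch a caller would check stripped_on_last_store() to decide whether to show a save-time warning, and look at the error rows separately to show outstanding review flags.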

Sponsored-by: OpenFifth
