I'm trying to import a number of bib records with "special" characters in the MARC fields. I've gotten as far as running direct_ingest.pl, but I'm noticing that biblio_fingerprint.js chokes on a few of them.
Looking a bit closer, I noticed that biblio_fingerprint.js chops character codes down to their two least significant hex digits. For example, biblio_fingerprint.js turns "Č" (U+010C) into U+000C ("form feed"), which causes direct_ingest.pl to skip the record and output the following error:
"Couldn't process record: invalid character encountered while parsing JSON string"
Attached is a sample record that causes this problem for me (the tarball includes both the original MARCXML and the BRE file generated from it by marc2bre.pl).
Any help would be appreciated! I can open a bug on Launchpad, too, if needed.
Cheers,
Warren
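To illustrate the truncation described above, here is a minimal JavaScript sketch. It is not the code from biblio_fingerprint.js; the helper names `truncateToLowByte` and `parsesAsJsonString` are hypothetical, and masking with `0xff` is only an assumption about how the two low hex digits might end up being kept. It shows that keeping the low byte of U+010C gives U+000C, and that a raw form feed inside a JSON string is rejected by a JSON parser.

```javascript
// Hypothetical helper (not from biblio_fingerprint.js): keeps only the
// low 8 bits of every UTF-16 code unit, i.e. the two least significant
// hex digits of each character code.
function truncateToLowByte(str) {
  let out = "";
  for (let i = 0; i < str.length; i++) {
    out += String.fromCharCode(str.charCodeAt(i) & 0xff);
  }
  return out;
}

// Hypothetical helper: wraps the text in double quotes WITHOUT escaping
// it and reports whether the result parses as a JSON string.
function parsesAsJsonString(text) {
  try {
    JSON.parse('"' + text + '"');
    return true;
  } catch (e) {
    return false;
  }
}

const original = "\u010C"; // "Č"
const mangled = truncateToLowByte(original);

// 0x010C & 0xff is 0x0C, the form feed control character.
console.log(mangled.charCodeAt(0).toString(16)); // prints "c"

// The original character is fine unescaped in a JSON string; the raw
// form feed is not, because JSON requires control characters to be escaped.
console.log(parsesAsJsonString(original)); // prints true
console.log(parsesAsJsonString(mangled));  // prints false
```

If this is indeed what happens, the JSON error would be a symptom rather than the cause: escaping the form feed would make the record parse, but the "Č" would still be lost.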
iii_prob_record.tar.gz
Description: GNU Zip compressed data