On Mon, Nov 17, 2008 at 9:46 PM, Warren Layton <[EMAIL PROTECTED]> wrote:
> What I suspect is happening is that direct_ingest.pl rejects records
> that have an accented character between square brackets ("[" and "]")
> in a field. For example, a record with the following 260 subfield will
> be rejected:
>
> <subfield code=\"b\">[Bibliothèque nationale du Canada],</subfield>
>
> However, if I remove _either_ the square brackets _or_ the "̀",
> the record will be successfully processed.

Just a quick follow-up on this problem. It also occurs in two other
scenarios:

1) The opening and closing square brackets can be spread over multiple
   subfields.
2) The problem also occurs if the accented character/diacritic is placed
   between two escaped double-quotes (\").

For example, a record containing the following subfield will produce the
same error:

<subfield code=\"b\">\"Systèmes solaires\", </subfield>

I have traced the execution of the script through
/openils/lib/perl5/OpenILS/Application/Ingest.pm, and the script is dying
in "sub biblio_fingerprint" (API name: open-ils.ingest.fingerprint.xml).
Specifically, "biblio_fingerprint.js" seems to be where the problem
occurs. I suspect that a regular expression is getting tripped up
somewhere.

Cheers,
Warren
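
In case it helps anyone check their own records, here is a minimal sketch
that flags text matching the two patterns described above: an accented
character between "[" and "]", or between two escaped double-quotes (\").
The function name hasDelimitedDiacritic is hypothetical and is not part of
Evergreen. The sketch does not reproduce the failure in
biblio_fingerprint.js; it only finds candidate strings. It assumes it is
given the text content of a subfield (not the whole XML element, whose
attribute quotes would confuse the second pattern) and that "accented"
means any character that decomposes to a base letter plus a mark in the
U+0300 to U+036F range.

```javascript
// Hypothetical helper, not part of Evergreen: returns true when the text has
// an accented character between "[" and "]" or between two escaped
// double-quotes (\").
function hasDelimitedDiacritic(text) {
  // Decompose first, so a precomposed "è" becomes "e" followed by U+0300.
  const decomposed = text.normalize('NFD');

  // A combining mark (U+0300 to U+036F) somewhere inside [ ... ].
  const inBrackets = /\[[^\]]*[\u0300-\u036f][^\]]*\]/;

  // A combining mark somewhere inside \" ... \" (backslash plus quote).
  const inEscapedQuotes = /\\"[^"]*[\u0300-\u036f][^"]*\\"/;

  return inBrackets.test(decomposed) || inEscapedQuotes.test(decomposed);
}

// The two subfield values from this thread, plus two variants that ingest fine.
console.log(hasDelimitedDiacritic('[Bibliothèque nationale du Canada],')); // true
console.log(hasDelimitedDiacritic('\\"Systèmes solaires\\", '));           // true
console.log(hasDelimitedDiacritic('Bibliothèque nationale du Canada,'));   // false
console.log(hasDelimitedDiacritic('[Bibliotheque nationale du Canada],')); // false
```

For the first scenario above (brackets spread over multiple subfields), the
same check can be run on the concatenated text of the subfields in a field.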