Re: Marc::XML with MARC21
Oops, I forgot to attach the script as promised, didn't I?

I also meant to say that this is a fine place to discuss questions about EPrints too, although it might be good to ask on the EPrints-specific lists as well, where there may be more EPrints eyes.

//Ed

[Attachment: test.pl]
Re: Marc::XML with MARC21
Hi Michele:

Yes, I see a UTF-8 encoding error in that file when I check it with xmllint (from the libxml2 package):

    e...@curry:~/Downloads$ xmllint marc.xml
    marc.xml:1: parser error : Input is not proper UTF-8, indicate encoding !
    Bytes: 0xE0 0x20 0x3A 0x3C
    ld code="b">le infrastrutture, l' organizzazione, i contratti e le responsabilit

This causes MARC::Record->new_from_xml to blow up too, with a somewhat unhelpful error:

    not well-formed (invalid token) at line 1, column 1533, byte 1533 at /usr/lib/perl5/XML/Parser.pm line 187

It looks like your XML file might be in ISO-8859-1 (at least that is what the unix file command told me):

    e...@curry:~/Projects/marc-xml$ file marc.xml
    marc.xml: ISO-8859 text, with very long lines, with no line terminators

So you could try converting your XML string with Encode before handing it off to MARC::Record->new_from_xml. from_to is a plain function that converts the string in place and is not exported by default, so call it with its package name rather than as a class method:

    use Encode;
    Encode::from_to($xml, 'iso-8859-1', 'utf-8');

I attached the full script, which seems to work OK. Note: if you are on Ubuntu, it looks like their libmarc-xml-perl package (v0.88) is a few versions behind the latest on CPAN (v0.92), and v0.88 doesn't handle namespaces properly.

//Ed
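To make the conversion step above concrete, here is a minimal sketch that uses only the core Encode module. The helper names is_valid_utf8 and to_utf8_bytes and the sample string are made up for illustration; they are not taken from the attached test.pl. The sketch assumes the input is either already valid UTF-8 or ISO-8859-1, as the file output suggested for marc.xml.

```perl
use strict;
use warnings;
use Encode ();

# Return 1 if the byte string is well-formed UTF-8, 0 otherwise.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() may modify its input when CHECK is set
    return eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

# Convert ISO-8859-1 bytes to UTF-8, but only if they are not valid UTF-8 already.
# Assumption: anything that is not valid UTF-8 is ISO-8859-1.
sub to_utf8_bytes {
    my ($bytes) = @_;
    return $bytes if is_valid_utf8($bytes);
    my $copy = $bytes;
    Encode::from_to($copy, 'iso-8859-1', 'utf-8');    # converts $copy in place
    return $copy;
}

# Hypothetical sample: "\xE0 :" mirrors the bytes xmllint reported (0xE0 0x20 0x3A).
my $latin1 = "<subfield code=\"b\">responsabilit\xE0 :</subfield>";
my $utf8   = to_utf8_bytes($latin1);

printf "valid before: %d, valid after: %d\n",
    is_valid_utf8($latin1), is_valid_utf8($utf8);
```

The converted string could then be passed to MARC::Record->new_from_xml in place of the raw file contents. Checking validity first avoids double-encoding a file that was already UTF-8.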
Re: Marc::XML with MARC21
Hi Ed,

I ran some tests and everything works correctly, as you said.

At this point I am having some trouble importing MARC records into EPrints with the MARC plugin developed by Jose Miguel. But maybe this is not the right place to ask about it :-)

Thanks,
Michele

Ed Summers wrote:
> Hi Michele:
>
> I copied and pasted the XML from your email and ran it through a
> simple test script (both attached) and the record seemed to be parsed
> ok. What do you see if you run the attached test.pl?
>
> //Ed

--
|| Michele Pinassi
|| System Manager Area Sistema Biblioteche - UniSi
|| https://sites.google.com/a/unisi.it/o-zone/
|| Support: +39.577.232299 (int. 2299)
|| Personal: +39.577.232477 (int. 2477)
|| FAX: +39.577.232430 (int. 2430)
Re: Marc::XML with MARC21
Dear Ed,

Yes, it works as expected! I have just tried again with my marc.xml (attached) and there seem to be encoding problems. Maybe Aleph doesn't export in UTF-8?

Thanks for your help,
Michele

Ed Summers wrote:
> Hi Michele:
>
> I copied and pasted the XML from your email and ran it through a
> simple test script (both attached) and the record seemed to be parsed
> ok. What do you see if you run the attached test.pl?
>
> //Ed

[Attachment: marc.xml. The archive stripped the XML tags; only the namespace declarations and the text content remain:]

http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
^cam^^22^^i^4507626628814075913ITServizio Bibliotecario SeneseRICAVI, 262 p. ;24 cmNavigazione da diportoLegislazioneAntonini,AlfredoMorandi,FrancescoitaLa navigazione da diporto :le infrastrutture, l' organizzazione, i contratti e le responsabilità :atti del convegno, Trieste, 27 marzo 1998 /a cura di Alfredo Antonini e Francesco MorandiMilano :Giuffrè1999Collana del Dipartimento di scienze giuridiche e della Facoltà di giurisprudenza dell' Università di Modena e Reggio EmiliaNuova serie ;0048Collana del Dipartimento di scienze giuridiche e della Facoltà di giurisprudenza dell' Università di Modena e Reggio Emilia0048343.4509620^^sxx^|r^|||
Re: Marc::XML with MARC21
Hi Michele:

I copied and pasted the XML from your email and ran it through a simple test script (both attached) and the record seemed to be parsed OK. What do you see if you run the attached test.pl?

//Ed

[Attachments: test.pl, and a copy of the MARCXML record with the same text content as in Michele's message above]
Re: Marc::XML with MARC21
> my $file = MARC::Record->new_from_xml($marc->serialize(),"UTF-8","MARC21");
> $epdata = $plugin->EPrints::Plugin::Import::MARC::convert_input(
> $file );
>
> and here the trouble starts: only a few metadata fields are interpreted
> correctly, and a lot of data is lost.

Ummm, so what metadata makes it through? I see examples of what you feed it, but not what is coming out.

Just from a quick look at the MARCXML, the only thing that seems really weird right away is the 008 control field trailing at the end of the record. I don't know what the XSD says about ordering, but typically all the controlfields are at the top of a MARC record.

Jon Gorman
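As a rough way to check the ordering point above, the sketch below flags control fields (tags 001 to 009) that appear after the first data field. The function name misplaced_control_fields and the tag sequence are hypothetical, since the archived record lost its tags. With MARC::Record, the tag list could presumably be built with map { $_->tag } $record->fields.

```perl
use strict;
use warnings;

# Return the tags of control fields (00X) that appear after the first data field.
sub misplaced_control_fields {
    my @tags = @_;
    my $seen_data = 0;
    my @late;
    for my $tag (@tags) {
        if ($tag =~ /^00/) {
            push @late, $tag if $seen_data;    # control field after a data field
        }
        else {
            $seen_data = 1;                    # tags 010 and up are data fields
        }
    }
    return @late;
}

# Hypothetical tag sequence with the 008 at the end, as described for this record.
my @tags = qw(001 003 245 260 300 008);
my @late = misplaced_control_fields(@tags);
print "control fields after data fields: @late\n";
```

An empty result would mean all control fields come first. This only tests field order; it says nothing about whether the importer actually depends on it.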