I'm evaluating Linux software for retrieving bibliographic data.

A great package is YAZ, a compact toolkit that provides access to the
Z39.50 protocol.  It comes with the command-line tool yaz-client, which
lets you access a Z39.50 server, e.g. the Library of Congress at
z3950.loc.gov:7090/voyager .
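
For illustration, here is a minimal sketch of how such a session could
be scripted from Python.  It assumes that yaz-client is on the PATH and
that it reads its commands (open, format, find, show, quit) from
standard input when not run interactively.  The helper names
build_session and run_session and the title query are made up for this
example.

```python
import shutil
import subprocess


def build_session(target, pqf_query, count=1):
    """Return the text of a yaz-client command script (hypothetical helper)."""
    commands = [
        f"open {target}",     # connect to the Z39.50 server
        "format usmarc",      # ask for MARC records
        f"find {pqf_query}",  # search, query given in prefix (PQF) notation
        f"show 1+{count}",    # fetch `count` records starting at hit 1
        "quit",
    ]
    return "\n".join(commands) + "\n"


def run_session(script, timeout=60):
    """Feed the script to yaz-client on stdin; return raw output bytes.

    Returns None when yaz-client is not installed.  Assumption: the
    client accepts commands on stdin when it is not attached to a tty.
    """
    if shutil.which("yaz-client") is None:
        return None
    result = subprocess.run(
        ["yaz-client"],
        input=script.encode("ascii"),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        timeout=timeout,
    )
    return result.stdout


if __name__ == "__main__":
    # Use attribute 1=4 (title) as an example search.
    script = build_session("z3950.loc.gov:7090/voyager", '@attr 1=4 "unicode"')
    print(script)
```

The output is kept as bytes on purpose, because the encoding of the
returned records is exactly what is in doubt.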

Unfortunately, there is a problem with the encoding.  According to
http://lcweb.loc.gov/z3950/lcserver.html :

    The server does not support the complete MARC 21 character set.
    Diacritics and many special characters are not encoded correctly.
    Diacritics and special characters are currently being converted
    to an proprietary Endeavor Latin-1 representation.

Evidently the software and the file format are able to represent all
the encoding information, but the server does not make it properly
available.  Thus, tools like
http://www.ece.arizona.edu/~denny/python_nest/MARCxUDC.tar.gz will
fail.  A Google search suggests that this problem has been known for at
least two years; what can we do about it?

Is it possible to map the "proprietary Endeavor Latin-1 representation"
to UTF-8?  Are there other Z39.50 servers that could be used as a
fallback (London, Paris)?  German university libraries do not seem to
offer direct Z39.50 access, only Web gateways :-(
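
On the first question: if the server's output turned out to be plain
ISO 8859-1, which is only an assumption here since the LoC page calls
it "proprietary", the conversion of a single field value would be as
simple as the sketch below.  The function name latin1_field_to_utf8 is
invented for this example.

```python
import unicodedata


def latin1_field_to_utf8(raw: bytes) -> bytes:
    """Re-encode one field value from Latin-1 to UTF-8.

    Assumption: the bytes really are ISO 8859-1.  Decoding as Latin-1
    never raises, so wrong input yields wrong characters, not an error.
    """
    text = raw.decode("latin-1")
    # Normalize to precomposed form (NFC) so each accented letter is
    # one code point where Unicode has one.
    text = unicodedata.normalize("NFC", text)
    return text.encode("utf-8")


if __name__ == "__main__":
    # 0xFC is "u with diaeresis" in Latin-1.
    print(latin1_field_to_utf8(b"M\xfcller").decode("utf-8"))
```

This only works field by field.  A MARC record stores byte lengths and
offsets in its leader and directory, so re-encoding a whole record in
place would also require recomputing those values.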

-- 
[EMAIL PROTECTED] (work) / [EMAIL PROTECTED] (home):              |
http://www.suse.de/~ke/                                  |      ,__o
Free Translation Project:                                |    _-\_<,
http://www.iro.umontreal.ca/contrib/po/HTML/             |   (*)/'(*)