[CODE4LIB] Bibframe contracts

2014-08-27 Thread Ford, Kevin
Dear All,

The Library of Congress has issued two solicitations (RFQs) for 
Bibframe-related development work.  We want to be sure to advertise these 
possibilities to this community.  You can read more about them at the below 
links.

The first is for a Bibframe Search and Display tool (to be clear, this is /not/ 
to create a search and display tool for the entire LC catalog).

https://www.fbo.gov/index?s=opportunity&mode=form&id=11db76388c0caafa72f6bd6ccb3d159f&tab=core&_cview=0

The second is for a Bibframe Profiles editor (that is, an editor for the 
Profiles themselves):

https://www.fbo.gov/index?s=opportunity&mode=form&id=927b167c07002045e51cb8c53485fc4e&tab=core&_cview=0

Proposals relating to both are due next week (Aug 6).

We want to encourage any interested developer, or developer team, to respond to 
the RFQs.  I frankly do not know what rules may pertain to bidding on US 
government contracts, but a quick review of the requirements for registering 
for government contracting suggests that it isn't too arduous:

http://www.sba.gov/content/register-government-contracting

For example, a DUNS number and EIN can be acquired in a day.

I think these could make ideal side projects for a small team of interested 
developers from this community.  Many of you have the skills and expertise 
needed to produce a very interesting software solution to the above 
solicitations.

I can't answer any questions about these contracts in this forum, but you can 
use the contacts listed at the bottom of the above pages if you have questions. 
 Those inquiries are forwarded to us, we answer them, and then the information 
is posted publicly so that everyone interested in the opportunity has access to 
the same information.

Yours,
Kevin


--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress
Washington, DC


Re: [CODE4LIB] Announcement: Two New Vocabularies added to LC's Linked Data Service

2014-06-26 Thread Ford, Kevin
For a variety of reasons, no, we do not have a SPARQL endpoint.

Yours,
Kevin


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Péter Király
 Sent: Thursday, June 26, 2014 6:34 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Announcement: Two New Vocabularies added to
 LC's Linked Data Service
 
 Hi Kevin,
 
 2014-06-25 23:00 GMT+02:00 Ford, Kevin k...@loc.gov:
  The Library of Congress is pleased to make two new vocabularies
  available as linked data
 
 congratulations, it's very useful. I have a question: do you have a SPARQL
 endpoint as well?
 
 Regards,
 Péter
 
 --
 Péter Király
 software developer
 
 Europeana - http://europeana.eu
 eXtensible Catalog - http://eXtensibleCatalog.org


[CODE4LIB] Bibframe survey

2014-06-25 Thread Ford, Kevin
Dear All,

Please see below a copied-and-pasted message, which was posted to the Bibframe 
listserv and also to a number of (mostly) cataloging listservs.  Although it was 
developed and sponsored by the Program for Cooperative Cataloging (PCC), we're 
interested in the broadest possible feedback from the library community.  The 
code4lib community - made up of developers and other library tech types - is 
a vital element within the broader community, and one we see as a key 
stakeholder in this process, and we'd very much like your feedback on the below 
survey.  



On June 20, 2014, the Library of Congress announced its desire to collaborate 
with the Program for Cooperative Cataloging in the endorsement and support of 
BIBFRAME as the model to help the library community move into the Linked Data 
environment.

PCC and LC strongly encourage the PCC membership and the broader library 
community to become more knowledgeable and attuned to the development and 
rollout of BIBFRAME and how it fits within libraries and the larger Linked Data 
sphere.

The PCC Secretariat has created a BIBFRAME survey that aims to assess the 
current level of understanding of BIBFRAME within the PCC community and the 
wider information community.  The survey also asks for ways in which 
information and announcements on BIBFRAME can be shared more widely within the 
communities. 

The PCC Secretariat encourages all PCC members to take the survey, and requests 
that PCC members share the survey widely with colleagues in all spheres of 
library work - vendors, systems, acquisitions, and other areas. 

You do not need to be a PCC member in order to take the survey!

The survey should take approximately 10 minutes or less to complete, and you 
may remain anonymous if you wish.

https://www.surveymonkey.com/s/PCC-BIBFRAME-2014

The survey will close on Monday, July 14, 2014.

--

I can vouch that it should take only a little of your valuable time to 
complete. 

Cordially,
Kevin


--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress
Washington, DC


[CODE4LIB] Announcement: Two New Vocabularies added to LC's Linked Data Service

2014-06-25 Thread Ford, Kevin
The Library of Congress is pleased to make two new vocabularies available as 
linked data from LC's Linked Data Service, ID.LOC.GOV:  the Library of Congress 
Medium of Performance Thesaurus for Music (LCMPT) and the American Folklore 
Society's Ethnographic Thesaurus (AFSET).  The LCMPT is a linked data 
representation of terminology to describe the instruments, voices, etc., used 
in the performance of musical works.  The AFSET is a linked data representation 
of terms that can be used to improve access to information about folklore, 
ethnomusicology, ethnology, and related fields.  

While LCMPT is relatively small, with fewer than 1,000 entries, AFSET includes 
more than 16,000 concepts.

Bulk downloads have been made available from the Downloads page for each 
dataset.   On a related note, a number of bulk downloads - such as those for 
Children's Subject Headings and Genre Form Headings - have also been updated.

**

Please explore them for yourself at

LCMPT - http://id.loc.gov/authorities/performanceMediums
AFSET - http://id.loc.gov/vocabulary/ethnographicTerms

**
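
For the programmatically inclined, here is a rough sketch (Python 3) of 
fetching one of these resources with content negotiation.  The URI below is 
just the LCMPT scheme URI - any concept URI from either vocabulary should 
behave the same way, assuming the service honors an RDF Accept header as 
described under Background:

import urllib.request

# The LCMPT scheme URI from above; swap in any individual concept URI.
TERM_URI = "http://id.loc.gov/authorities/performanceMediums"

# Ask for RDF/XML rather than the HTML page a browser would get.
req = urllib.request.Request(TERM_URI,
                             headers={"Accept": "application/rdf+xml"})
with urllib.request.urlopen(req) as resp:
    print(resp.headers.get("Content-Type"))
    print(resp.read(400).decode("utf-8", errors="replace"))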

Contact Us about ID:
As always, your feedback is important and welcomed.  Though we are interested 
in all forms of constructive commentary on all topics related to ID, we're 
particularly interested in how the data available from ID.LOC.GOV is used.  
Your contributions directly inform service enhancements.

You can send comments or report any problems to us via the ID feedback form or 
ID listserv (see the web site).

Background:
The LC Linked Data Service was first made available in May 2009 and offered the 
Library of Congress Subject Headings (LCSH), the Library's initial entry into 
the Linked Data environment. In part by assigning each vocabulary and each data 
value within it a unique resource identifier (URI), the service provides a 
means for machines to semantically access, use, and harvest authority and 
vocabulary data that adheres to W3C recommendations, such as Simple Knowledge 
Organization System (SKOS), and the more detailed vocabulary MADS/RDF.  In this 
way, the LC Linked Data Service also makes government data publicly and freely 
available in the spirit of the Open Government directive. Although the primary 
goal of the service is to enable machine access to Library of Congress data, a 
web interface serves human users searching and browsing the vocabularies.  The 
new datasets join the term and code lists already available through the service:

* Library of Congress Subject Headings (LCSH)
* Library of Congress Children's Subject Headings
* Library of Congress Genre/Form Terms
* Library of Congress / NACO Name Authority File
* Library of Congress / LCC (select schedules)
* Thesaurus of Graphic Materials
* Cultural Heritage Organizations
* MARC Code List for Relators
* MARC Code List for Countries (which reference their equivalent ISO 3166 codes)
* MARC Code List for Geographic Areas
* MARC Code List for Languages (which have been cross referenced with ISO 
639-1, 639-2, and 639-5, where appropriate)
* PREMIS vocabularies

The above code lists also contain links with appropriate LCSH and LC/NAF 
headings.

LC's Linked Data Service is managed by the Network Development and MARC 
Standards Office of the Library of Congress.


--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress
Washington, DC


Re: [CODE4LIB] [CODE4LIB] HEADS UP - Government shutdown will mean *.loc.gov is going offline October 1

2013-09-30 Thread Ford, Kevin
All *.loc.gov web sites will be closed, including the two you quoted.

The Internet Archive's Way Back Machine is probably your best bet for these 
types of things:

http://web.archive.org/web/*/http://www.loc.gov/marc/
http://web.archive.org/web/*/http://www.loc.gov/standards/sourcelist/index.html

Yours,
Kevin

--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress
Washington, DC


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Becky Yoose
 Sent: Monday, September 30, 2013 4:32 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] [CODE4LIB] HEADS UP - Government shutdown will
 mean *.loc.gov is going offline October 1
 
 FYI - this also means that there's a very good chance that the MARC
 standards site [1] and the Source Codes site [2] will be down as well.
 I
 don't know if there are any mirror sites out there for these pages.
 
 [1] http://www.loc.gov/marc/
 [2] http://www.loc.gov/standards/sourcelist/index.html
 
 Thanks,
 Becky, about to be (forcefully) departed with her standards
 documentation
 
 
 On Mon, Sep 30, 2013 at 11:39 AM, Jodi Schneider
 jschnei...@pobox.comwrote:
 
  Interesting -- thanks, Birkin -- and tell us what you think when you
 get it
  implemented!
 
  :) -Jodi
 
 
  On Mon, Sep 30, 2013 at 5:19 PM, Birkin Diana birkin_di...@brown.edu
  wrote:
 
...you'd want to create a caching service...
  
  
   One solution for a relevant particular problem (not full-blown
  linked-data
   caching):
  
   http://en.wikipedia.org/wiki/XML_Catalog
  
   excerpt: "However, if they are absolute URLs, they only work when your
   network can reach them. Relying on remote resources makes XML processing
   susceptible to both planned and unplanned network downtime."
  
   We'd heard about this a while ago, but, Jodi, you and David Riordan
 and
   Congress have caused a temporary retreat from normal sprint-work
 here at
   Brown today to investigate implementing this!  :/
  
   The particular problem that would affect us: if your processing
 tool
   checks, say, an loc.gov mods namespace url, that processing will
 fail if
   the loc.gov url isn't available, unless you've implemented xml
 catalog,
   which is a formal way to locally resolve such external references.
  
   -b
   ---
   Birkin James Diana
   Programmer, Digital Technologies
   Brown University Library
   birkin_di...@brown.edu
  
  
   On Sep 30, 2013, at 7:15 AM, Uldis Bojars capts...@gmail.com
 wrote:
  
What are best practices for preventing problems in cases like
 this when
   an
important Linked Data service may go offline?
   
--- originally this was a reply to Jodi which she suggested to
 post on
   the
list too ---
   
A safe [pessimistic?] approach would be to say we don't trust
   [reliability
of] linked data on the Web as services can and will go down and
 to
  cache
everything.
   
In that case you'd want to create a caching service that would
 keep
   updated
copies of all important Linked Data sources and a fall-back
 strategy
  for
switching to this caching service when needed. Like archive.org
 for
   Linked
Data.
   
Some semantic web search engines might already have subsets of
 Linked
   Data
web cached, but not sure how much they cover (e.g., if they have
 all of
   LoC
data, up-to-date).
   
If one were to create such a service how to best update it,
 considering
you'd be requesting *all* Linked Data URIs from each source? An
  efficient
approach would be to regularly load RDF dumps for every major
 source if
available (e.g., LoC says - here's a full dump of all our RDF
 data ...
   and
a .torrent too).
   
What do you think?
   
Uldis
   
   
On 29 September 2013 12:33, Jodi Schneider jschnei...@pobox.com
  wrote:
   
Any best practices for caching authorities/vocabs to suggest for
 this
thread on the Code4Lib list?
   
 Linked Data authorities & vocabularies at Library of Congress (
   id.loc.gov)
are going to be affected by the website shutdown -- because of
 lack of
government funds.
   
-Jodi
  
 


Re: [CODE4LIB] Marcive.com hosts are compromised

2013-08-30 Thread Ford, Kevin
Righty.  I had to view the source, but I saw the injected text.

I gave the one contact I know at marcive a call.  She saw it too.

Yours,
Kevin


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Sam Kome
 Sent: Friday, August 30, 2013 3:24 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Marcive.com hosts are compromised
 
 Sorry about that - I mistype 'Marcive' all the time. Despite that, it
 is the site I meant, sans 'h'.
 
 It will resolve correctly but I wouldn't advise visiting - take
 precautions.
 Google search results also suggest it is compromised and the page
 sources contain pharma metadata.
 I emailed and then called the technical contact number.  Got a response
 on the phone, sounded like they were unaware but would look into it.
 
 Our Collections folks report not receiving expected reports this month
 so the problem may be fairly old.
 
 SK
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Ford, Kevin
 Sent: Friday, August 30, 2013 12:04 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Marcive.com hosts are compromised
 
 http://marcive.com goes to the right place for me.  It is the one you
 mentioned in the subject line of your email.
 
 http://marchive.com (note the h) goes to a domain squatter.  It is
 the one you mentioned in the body of your email.
 
 Which one is causing you the issue?
 
 Cordially,
 Kevin
 
 
  -Original Message-
  From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
  Of Sam Kome
  Sent: Friday, August 30, 2013 2:07 PM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: [CODE4LIB] Marcive.com hosts are compromised
 
  Based on the pharmaceutical ads in their page sources and the fact
  that our Cisco Iron Port has blacklisted them, I have to regretfully
  report that marchive.com has been compromised.  Does anyone know the
  relevant
  contact(s) there to notify?
 
  Sam Kome | Assistant Director, RD |The Claremont Colleges Library
  Claremont University Consortium |800 N. Dartmouth Ave |Claremont, CA
  91711
  Phone (909) 621-8866 |Fax (909) 621-8517
  |sam_k...@cuc.claremont.edumailto:%7csam_k...@cuc.claremont.edu


Re: [CODE4LIB] Marcive.com hosts are compromised

2013-08-30 Thread Ford, Kevin
http://marcive.com goes to the right place for me.  It is the one you mentioned 
in the subject line of your email.

http://marchive.com (note the h) goes to a domain squatter.  It is the one 
you mentioned in the body of your email.

Which one is causing you the issue?

Cordially,
Kevin


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Sam Kome
 Sent: Friday, August 30, 2013 2:07 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] Marcive.com hosts are compromised
 
 Based on the pharmaceutical ads in their page sources and the fact that
 our Cisco Iron Port has blacklisted them, I have to regretfully report
 that marchive.com has been compromised.  Does anyone know the relevant
 contact(s) there to notify?
 
 Sam Kome | Assistant Director, RD |The Claremont Colleges Library
 Claremont University Consortium |800 N. Dartmouth Ave |Claremont, CA
 91711
 Phone (909) 621-8866 |Fax (909) 621-8517
 |sam_k...@cuc.claremont.edumailto:%7csam_k...@cuc.claremont.edu


[CODE4LIB] Announcement: Cultural Heritage Organizations Vocabulary Published

2013-08-29 Thread Ford, Kevin
The Library of Congress is pleased to make the Cultural Heritage Organizations 
vocabulary available as linked data from LC's Linked Data Service, ID.LOC.GOV.  
The Cultural Heritage Organizations vocabulary is a linked data representation 
of the MARC Organizations code list, which, among other uses, is an essential 
reference tool for those dealing with MARC records, for systems reporting 
library holdings, for many interlibrary loan systems, and for those who may be 
organizing cooperative projects on a regional, national, or international 
scale. 

While the Cultural Heritage Organizations vocabulary focuses on US 
institutions, with over 30,000 defined, it also includes codes for institutions 
in other countries that have requested them.  However, MARC codes are not 
in other countries that have requested them.  However, MARC codes are not 
assigned for institutions in Canada, Germany, or the United Kingdom unless the 
institution is a branch of a US institution.  Overall, the vocabulary contains 
over 36,000 entries.

Bulk downloads of the Cultural Heritage Organizations vocabulary are also 
available from the downloads page.

**

Please explore the Cultural Heritage Organizations vocabulary yourself at

http://id.loc.gov/vocabulary/organizations

**

Contact Us about ID:
As always, your feedback is important and welcomed.  Though we are interested 
in all forms of constructive commentary on all topics related to ID, we're 
particularly interested in how the data available from ID.LOC.GOV is used.  
Your contributions directly inform service enhancements.

You can send comments or report any problems to us via the ID feedback form or 
ID listserv (see the web site).

Background:
The LC Linked Data Service was first made available in May 2009 and offered the 
Library of Congress Subject Headings (LCSH), the Library's initial entry into 
the Linked Data environment. In part by assigning each vocabulary and each data 
value within it a unique resource identifier (URI), the service provides a 
means for machines to semantically access, use, and harvest authority and 
vocabulary data that adheres to W3C recommendations, such as Simple Knowledge 
Organization System (SKOS), and the more detailed vocabulary MADS/RDF.  In this 
way, the LC Linked Data Service also makes government data publicly and freely 
available in the spirit of the Open Government directive. Although the primary 
goal of the service is to enable machine access to Library of Congress data, a 
web interface serves human users searching and browsing the vocabularies.  The 
new datasets join the term and code lists already available through the service:

* Library of Congress Subject Headings (LCSH)
* Library of Congress Children's Subject Headings
* Library of Congress Genre/Form Terms
* Library of Congress / NACO Name Authority File
* Library of Congress / LCC (select schedules)
* Thesaurus of Graphic Materials
* MARC Code List for Relators
* MARC Code List for Countries (which reference their equivalent ISO 3166 codes)
* MARC Code List for Geographic Areas
* MARC Code List for Languages (which have been cross referenced with ISO 
639-1, 639-2, and 639-5, where appropriate)
* PREMIS vocabularies

The above code lists also contain links with appropriate LCSH and LC/NAF 
headings.

LC's Linked Data Service is managed by the Network Development and MARC 
Standards Office of the Library of Congress.


--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress
Washington, DC


Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-13 Thread Ford, Kevin
Dear Dana,

Thanks for the detail.  Based on the few example comparisons I've seen, I like 
your MARC records much more.  Not only are they richer, they break up the 
data better.

Yours,
Kevin


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Dana Pearson
 Sent: Wednesday, June 12, 2013 7:20 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] best way to make MARC files available to anyone
 
 Kevin, Eric
 
 7zip worked fine to unzip and records look pretty good since they used
 653 and preserved the string from the metadata element with the hyphens.
  However the records do not do subfield d in 100 or 700 fields and
 thus such content appears in the 245$c.  245$a seems to go missing with
 some frequency.  MarcEdit does not report any errors though.
 
 My original intent was just to keep my XSLT skills sharp while I had
 some free time last August.  After creating the stylesheet, I then had
 no free time until January when I could devote 2 or 3 hours to the post
 transform editing.  Thought I'd just dive in but the pool was much
 deeper than I had anticipated.
 
 Do think libraries will prefer my edited versions although different in
 non-access points as well.  Incidentally, not many additions since my
 harvest.
 
 First record in the Project Gutenberg produced records:
 
 =LDR  00721cam a22002293a 4500
 =001  27384
 =003  PGUSA
 =008  081202s2008xxu|s|000\|\eng\d
 =040  \\$aPGUSA$beng
 =042  \\$adc
 =050  \4$aPQ
 =100  1\$aDumas, Alexandre, 1802-1870
 =245  10$a$h[electronic resource] /$cby Alexandre, 1802-1870 Dumas
 =260  \\$bProject Gutenberg,$c2008
 =500  \\$aProject Gutenberg
 =506  \\$aFreely available.
 =516  \\$aElectronic text
 =653  \0$aFrance -- History -- Regency, 1715-1723 -- Fiction
 =653  \0$aOrléans, Philippe, duc d', 1674-1723 -- Fiction
 =830  \0$aProject Gutenberg$v27384
 =856  40$uhttp://www.gutenberg.org/etext/27384
 =856  42$uhttp://www.gutenberg.org/license$3Rights
 
 couldn't readily find the above item but here's an example of my
 records by the same author.
 
 =LDR  01002nam a22002535  4500
 =001  PG18997
 =006  md
 =007  cr||n\|||muaua
 =008  \\s2006utu|o|||eng\d
 =042  \\$adc
 =090  \\$aPQ
 =092  \0$aeBooks
 =100  1\$aDumas, Alexandre,$d1802-1870.
 =245  14$aThe Vicomte de Bragelonne$h[electronic resource] :$bOr Ten
 Years Later being the completion of The Three Musketeers And Twenty
 Years After /$cAlexandre Dumas.
 =260  \\$aSalt Lake City :$bProject Gutenberg Literary Archive
 Foundation,$c2006.
 =300  \\$a1 online resource :$bmultiple file formats.
 =500  \\$aRecords generated from Project Gutenberg RDF data.
 =540  \\$aApplicable license:$uhttp://www.gutenberg.org/license
 =650  \0$aAdventure stories.
 =650  \0$aHistorical fiction.
 =651  \0$aFrance$vHistory$yLouis XIV, 1643-1715$vFiction.
 =655  \0$aElectronic books.
 =710  2\$aProject Gutenberg.
 =856  40$uhttp://www.gutenberg.org/etext/18997$zClick to access.
 
 thanks for your interest..
 
 regards,
 dana
 
 
 On Wed, Jun 12, 2013 at 9:10 AM, Ford, Kevin k...@loc.gov wrote:
 
  Hi Dana,
 
  Out of curiosity, how does your crosswalk differ from Project
  Gutenberg's MARC files?  See, e.g.:
 
 
 
 http://www.gutenberg.org/wiki/Gutenberg:Offline_Catalogs#MARC_Records_.28automatically_generated.29
 
  Yours,
  Kevin
 
  --
  Kevin Ford
  Network Development and MARC Standards Office Library of Congress
  Washington, DC
 
 
 
   -Original Message-
   From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On
 Behalf
   Of Dana Pearson
   Sent: Tuesday, June 11, 2013 9:24 PM
   To: CODE4LIB@LISTSERV.ND.EDU
   Subject: [CODE4LIB] best way to make MARC files available to anyone
  
   I have crosswalked the Project Gutenberg RDF/DC metadata to MARC.
 I
   would like to make these files available to any library that is
   interested.
  
   I thought that I would put them on my website via FTP but don't
 know
   if that is the best way.  Don't have an ftp client myself so was
   thinking that that may be now passé.
  
   I tried using Google Drive with access available via the link to
 two
   versions of the files, UTF8 and MARC8.  However, it seems that that
   is not a viable solution.  I can access the files with the URLs
   provided by setting the access to anyone with the URL but doesn't
   work for some of those testing it for me or with the links I have
 on my webpage..
  
   I have five folders with files of about 38 MB total.  I have
   separated the ebooks, audio books, juvenile content, miscellaneous
   and non-Latin scripts such as Chinese, Modern Greek.  Most of the
   content is in the ebook folder.
  
   I would like to make access as easy as possible.
  
   Google Drive seems to work for me.  Here's the link to my page with
   the links in case you would like to look at the folders.  Works for
   me but not for everyone who's tried it.
  
   http://dbpearsonmlis.com

Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Ford, Kevin
Hi Dana,

Out of curiosity, how does your crosswalk differ from Project Gutenberg's MARC 
files?  See, e.g.:

http://www.gutenberg.org/wiki/Gutenberg:Offline_Catalogs#MARC_Records_.28automatically_generated.29

Yours,
Kevin

--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress 
Washington, DC



 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Dana Pearson
 Sent: Tuesday, June 11, 2013 9:24 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] best way to make MARC files available to anyone
 
 I have crosswalked the Project Gutenberg RDF/DC metadata to MARC.  I
 would like to make these files available to any library that is
 interested.
 
 I thought that I would put them on my website via FTP but don't know if
 that is the best way.  Don't have an ftp client myself so was thinking
 that that may be now passé.
 
 I tried using Google Drive with access available via the link to two
 versions of the files, UTF8 and MARC8.  However, it seems that that is
 not a viable solution.  I can access the files with the URLs provided
 by setting the access to anyone with the URL but doesn't work for some
 of those testing it for me or with the links I have on my webpage..
 
 I have five folders with files of about 38 MB total.  I have separated
 the ebooks, audio books, juvenile content, miscellaneous and non-Latin
 scripts such as Chinese, Modern Greek.  Most of the content is in the
 ebook folder.
 
 I would like to make access as easy as possible.
 
 Google Drive seems to work for me.  Here's the link to my page with the
 links in case you would like to look at the folders.  Works for me but
 not for everyone who's tried it.
 
 http://dbpearsonmlis.com/ProjectGutenbergMarcRecords.html
 
 thanks,
 dana
 
 --
 Dana Pearson
 dbpearsonmlis.com


Re: [CODE4LIB] best way to make MARC files available to anyone

2013-06-12 Thread Ford, Kevin
Doh!

I read all the emails in the thread except for Eric's, which asked the same 
question.

Either way, his or mine, nevertheless curious.

Kevin

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Eric Phetteplace
 Sent: Tuesday, June 11, 2013 10:57 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] best way to make MARC files available to anyone
 
 Dana - perhaps a public Dropbox folder? Or just put the files up on
 your site somewhere, served with a Content-Disposition: attachment
 header so they trigger a download when accessed? E.g. here's a
 StackOverflow thread on that:
 http://stackoverflow.com/questions/9195304/how-to-use-content-disposition-for-force-a-file-to-download-to-the-hard-drive
 If they must be a recognized MIME type, you could compress
 them as .zip or .tar.gz files on the server, which would reduce
 download time either way.
 
 I did try clicking the links on your site and they never downloaded,
 the request just timed out.
 
 Not to discredit what you're doing, which is great, but aren't MARC
 records already available for Project Gutenberg? See their offline
 catalogs page:
 http://www.gutenberg.org/wiki/Gutenberg:Offline_Catalogs#MARC_Records_.28offsite.29
 
 Best,
 Eric Phetteplace
 Emerging Technologies Librarian
 Chesapeake College
 Wye Mills, MD
 
 
 On Tue, Jun 11, 2013 at 9:24 PM, Dana Pearson
 dbpearsonm...@gmail.comwrote:
 
  I have crosswalked the Project Gutenberg RDF/DC metadata to MARC.  I
  would like to make these files available to any library that is
 interested.
 
  I thought that I would put them on my website via FTP but don't know
  if that is the best way.  Don't have an ftp client myself so was
  thinking that that may be now passé.
 
  I tried using Google Drive with access available via the link to two
  versions of the files, UTF8 and MARC8.  However, it seems that that
 is
  not a viable solution.  I can access the files with the URLs provided
  by setting the access to anyone with the URL but doesn't work for
 some
  of those testing it for me or with the links I have on my webpage..
 
  I have five folders with files of about 38 MB total.  I have
 separated
  the ebooks, audio books, juvenile content, miscellaneous and non-
 Latin
  scripts such as Chinese, Modern Greek.  Most of the content is in the
 ebook folder.
 
  I would like to make access as easy as possible.
 
  Google Drive seems to work for me.  Here's the link to my page with
  the links in case you would like to look at the folders.  Works for
 me
  but not for everyone who's tried it.
 
  http://dbpearsonmlis.com/ProjectGutenbergMarcRecords.html
 
  thanks,
  dana
 
  --
  Dana Pearson
  dbpearsonmlis.com
 


Re: [CODE4LIB] LOC Subject Headings API

2013-06-05 Thread Ford, Kevin
Dear Josh,

Take a look at Mike's email below, which may have quickly fallen down the inbox, 
helped along by an unhelpful reply.  It shows the suggest pattern, but to repeat 
the general pattern:

This will provide auto-suggestions for Subjects, ChildrensSubjects, GenreForms, 
and Names:
http://id.loc.gov/authorities/suggest/?q=Hounds

This will provide auto-suggestions for Subjects only (replace "subjects" 
with "names" for only names, and so on):
http://id.loc.gov/authorities/subjects/suggest/?q=Hounds
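
And a minimal sketch of calling the suggest service from Python 3, assuming 
the four-part OpenSearch suggestions response shown in Mike's curl example 
below (query, labels, result counts, URIs):

import json
import urllib.parse
import urllib.request

def loc_suggest(query, scheme="authorities"):
    # scheme may be "authorities" (everything) or, e.g., "authorities/subjects"
    url = "http://id.loc.gov/%s/suggest/?q=%s" % (
        scheme, urllib.parse.quote(query))
    with urllib.request.urlopen(url) as resp:
        q, labels, counts, uris = json.load(resp)
    return list(zip(labels, uris))

for label, uri in loc_suggest("Hounds", "authorities/subjects"):
    print(label, "->", uri)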

Yours,
Kevin

--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress
Washington, DC



 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Michael J. Giarlo
 Sent: Tuesday, June 04, 2013 8:05 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] LOC Subject Headings API
 
 How about id.loc.gov's OpenSearch-powered autosuggest feature?
 
 mjg@moby:~$ curl "http://id.loc.gov/authorities/suggest/?q=Biology"
 ["Biology",["Biology","Biology Colloquium","Biology Curators'
 Group","Biology Databook Editorial Board (U.S.)","Biology and Earth
 Sciences Teaching Institute","Biology and Management of True Fir in the
 Pacific Northwest Symposium (1981 : Seattle, Wash.)","Biology and
 Resource Management Program (Alaska Cooperative Park Studies
 Unit)","Biology and behavior series","Biology and environment
 (Macmillan Press)","Biology and management of old-growth forests"],["1
 result","1 result","1 result","1 result","1 result","1 result","1
 result","1 result","1 result","1 result"],
 ["http://id.loc.gov/authorities/subjects/sh85014203",
 "http://id.loc.gov/authorities/names/n79006962",
 "http://id.loc.gov/authorities/names/n90639795",
 "http://id.loc.gov/authorities/names/n85100466",
 "http://id.loc.gov/authorities/names/nr97041787",
 "http://id.loc.gov/authorities/names/n85276541",
 "http://id.loc.gov/authorities/names/n82057525",
 "http://id.loc.gov/authorities/names/n90605518",
 "http://id.loc.gov/authorities/names/nr2001011448",
 "http://id.loc.gov/authorities/names/no94028058"]]
 
 -Mike
 
 
 
 On Tue, Jun 4, 2013 at 7:51 PM, Joshua Welker jwel...@sbuniv.edu
 wrote:
 
  I did see that, and it will work in a pinch. But the authority file
 is
  pretty massive--almost 1GB-- and would be difficult to handle in an
  automated way and without completely killing my web app due to memory
  constraints while searching the file. Thanks, though.
 
  Josh Welker
 
 
  -Original Message-
  From: Bryan Baldus [mailto:bryan.bal...@quality-books.com]
  Sent: Tuesday, June 04, 2013 6:39 PM
  To: Code for Libraries; Joshua Welker
  Subject: RE: LOC Subject Headings API
 
  On Tuesday, June 04, 2013 6:31 PM, Joshua Welker [jwel...@sbuniv.edu]
  wrote:
  I am building an auto-suggest feature into our library's search box,
  and
  I am wanting to include LOC subject headings in my suggestions list.
  Does anyone know of any web service that allows for automated
  harvesting of LOC Subject Headings? I am also looking for name
 authorities, for that matter.
  Any format will be acceptable to me: RDF, XML, JSON, HTML, CSV... I
  have spent a while Googling with no luck, but this seems like the
 sort
  of general-purpose thing that a lot of people would be interested in.
  I feel like I must be missing something. Any help is appreciated.
 
  Have you seen http://id.loc.gov/ with bulk downloads in various
  formats at http://id.loc.gov/download/
 
  I hope this helps,
 
  Bryan Baldus
  Senior Cataloger
  Quality Books Inc.
  The Best of America's Independent Presses
  1-800-323-4241x402
  bryan.bal...@quality-books.com
  eij...@cpan.org
  http://home.comcast.net/~eijabb/
 


Re: [CODE4LIB] LOC Subject Headings API

2013-06-05 Thread Ford, Kevin
 This would work, except I would need a way to get all the subjects
 rather than just biology.
-- If you want all the subjects, period, take a look at the download page:

http://id.loc.gov/download/

There are bulk downloads for LCSH and the LC/NACO file of Names.

The suggest service (described in a separate email) is designed to give you the 
top 10 best matches based on a left-anchored search, so that it may function 
as a real-time type-ahead service.

Yours,
Kevin

--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress
Washington, DC




 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Joshua Welker
 Sent: Wednesday, June 05, 2013 9:14 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] LOC Subject Headings API
 
 This would work, except I would need a way to get all the subjects
 rather than just biology. Any idea how to do that? I tried removing the
 querystring from the URL and changing "Biology" in the URL to "" with
 no success.
 
 Josh Welker
 
 
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Michael J. Giarlo
 Sent: Tuesday, June 04, 2013 7:05 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] LOC Subject Headings API
 
 How about id.loc.gov's OpenSearch-powered autosuggest feature?
 
 mjg@moby:~$ curl "http://id.loc.gov/authorities/suggest/?q=Biology"
 ["Biology",["Biology","Biology Colloquium","Biology Curators'
 Group","Biology Databook Editorial Board (U.S.)","Biology and Earth
 Sciences Teaching Institute","Biology and Management of True Fir in the
 Pacific Northwest Symposium (1981 : Seattle, Wash.)","Biology and
 Resource Management Program (Alaska Cooperative Park Studies
 Unit)","Biology and behavior series","Biology and environment
 (Macmillan Press)","Biology and management of old-growth forests"],["1
 result","1 result","1 result","1 result","1 result","1 result","1
 result","1 result","1 result","1 result"],
 ["http://id.loc.gov/authorities/subjects/sh85014203",
 "http://id.loc.gov/authorities/names/n79006962",
 "http://id.loc.gov/authorities/names/n90639795",
 "http://id.loc.gov/authorities/names/n85100466",
 "http://id.loc.gov/authorities/names/nr97041787",
 "http://id.loc.gov/authorities/names/n85276541",
 "http://id.loc.gov/authorities/names/n82057525",
 "http://id.loc.gov/authorities/names/n90605518",
 "http://id.loc.gov/authorities/names/nr2001011448",
 "http://id.loc.gov/authorities/names/no94028058"]]
 
 -Mike
 
 
 
 On Tue, Jun 4, 2013 at 7:51 PM, Joshua Welker jwel...@sbuniv.edu
 wrote:
 
  I did see that, and it will work in a pinch. But the authority file
 is
  pretty massive--almost 1GB-- and would be difficult to handle in an
  automated way and without completely killing my web app due to memory
  constraints while searching the file. Thanks, though.
 
  Josh Welker
 
 
  -Original Message-
  From: Bryan Baldus [mailto:bryan.bal...@quality-books.com]
  Sent: Tuesday, June 04, 2013 6:39 PM
  To: Code for Libraries; Joshua Welker
  Subject: RE: LOC Subject Headings API
 
  On Tuesday, June 04, 2013 6:31 PM, Joshua Welker [jwel...@sbuniv.edu]
  wrote:
  I am building an auto-suggest feature into our library's search box,
  and
  I am wanting to include LOC subject headings in my suggestions list.
  Does anyone know of any web service that allows for automated
  harvesting of LOC Subject Headings? I am also looking for name
 authorities, for that matter.
  Any format will be acceptable to me: RDF, XML, JSON, HTML, CSV... I
  have spent a while Googling with no luck, but this seems like the
 sort
  of general-purpose thing that a lot of people would be interested in.
  I feel like I must be missing something. Any help is appreciated.
 
  Have you seen http://id.loc.gov/ with bulk downloads in various
  formats at http://id.loc.gov/download/
 
  I hope this helps,
 
  Bryan Baldus
  Senior Cataloger
  Quality Books Inc.
  The Best of America's Independent Presses
  1-800-323-4241x402
  bryan.bal...@quality-books.com
  eij...@cpan.org
  http://home.comcast.net/~eijabb/
 


Re: [CODE4LIB] LOC Subject Headings API

2013-06-05 Thread Ford, Kevin
 it looks like LCSH
 is moving past this string-based hierarchy in favor of one expressed in
 terms of linked data.
-- Oh, I've never received that impression.  Pre-coordination - which you 
referred to as "hierarchical sets of terms" - is alive and well.  A number of 
studies were done in the second half of the 2000s that looked at the creation 
of LCSH headings.  Pre-coordination received significant attention in these 
studies and was ultimately confirmed as a good thing.  

Who knows why the precoordinated heading that was once used for "Mexican War, 
1846-1848" was replaced, but that probably happened in 1986 (or 1991) based on 
the creation and most-recent modification times on that record.  In other 
words, at a time when the notion of Linked Data was non-existent.

Yours,
Kevin





 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Ethan Gruber
 Sent: Wednesday, June 05, 2013 9:41 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] LOC Subject Headings API
 
 Are you referring to hierarchical sets of terms, like "United States--
 History--War with Mexico, 1845-1848"?  This is an earlier established
 term of http://id.loc.gov/authorities/subjects/sh85140201 (now labeled
 "Mexican War, 1846-1848").  Ed Summers or Kevin Ford are in a better
 position to discuss the change of terminology, but it looks like LCSH
 is moving past this string-based hierarchy in favor of one expressed in
 terms of linked data.
 
 Ethan
 
 
 On Wed, Jun 5, 2013 at 9:32 AM, Joshua Welker jwel...@sbuniv.edu
 wrote:
 
  I've seen those, but I can't figure out where on the id.loc.gov site
  there is actually a URL that provides a list of authority terms. All
  the links on the site seem to link to other pages within the site.
 
  Josh Welker
 
 
  -Original Message-
  From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
  Of Dana Pearson
  Sent: Tuesday, June 04, 2013 6:42 PM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: Re: [CODE4LIB] LOC Subject Headings API
 
  Joshua,
 
  There are different formats at LOC:
 
  http://id.loc.gov/authorities/subjects.html
 
  dana
 
 
  On Tue, Jun 4, 2013 at 6:31 PM, Joshua Welker jwel...@sbuniv.edu
 wrote:
 
   I am building an auto-suggest feature into our library's search box,
   and I am wanting to include LOC subject headings in my suggestions
   list. Does anyone know of any web service that allows for automated
   harvesting of LOC Subject Headings? I am also looking for name
  authorities, for that matter.
   Any format will be acceptable to me: RDF, XML, JSON, HTML, CSV... I
   have spent a while Googling with no luck, but this seems like the
   sort of general-purpose thing that a lot of people would be
 interested in.
   I feel like I must be missing something. Any help is appreciated.
  
   Josh Welker
   Electronic/Media Services Librarian
   College Liaison
   University Libraries
   Southwest Baptist University
   417.328.1624
  
 
 
 
  --
  Dana Pearson
  dbpearsonmlis.com
 


[CODE4LIB] K Class added to ID.LOC.GOV

2013-04-10 Thread Ford, Kevin
The Library of Congress is pleased to make the K Class - Law Classification - 
and all its subclasses available as linked data from LC's Linked Data Service, 
ID.LOC.GOV.  K Class joins the B, N, M, and Z Classes released in June 2012.  
With about 2.2 million new resources added to ID.LOC.GOV, K Class is nearly 
eight times larger than the B, M, N, and Z Classes combined.  It is four times 
larger than LCSH.  If it is not the largest class, it is second only to the P 
Class (Literature) in the Library of Congress Classification system.

We have also taken the opportunity to re-compute and reload the B, M, N, and Z 
classes in response to a few reported errors.  Our gratitude to Caroline Arms 
for her work crawling through B, M, N, and Z and identifying a number of these 
issues.

The classification section of ID.LOC.GOV remains a beta offering.  More work is 
needed not only to add the additional classes to the system but also to 
continue to work out issues with the data.

We continue to encourage the submission of use cases describing how users would 
like to utilize the LCC data.

**

Please explore the K Class for yourself at

http://id.loc.gov/authorities/classification/K

or all of the classes at

http://id.loc.gov/authorities/classification

**

Contact Us about ID:
As always, your feedback is important and welcomed. Though we are interested in 
all forms of constructive commentary on all topics related to ID, we're 
particularly interested in how the data available from ID.LOC.GOV is used. Your 
contributions directly inform service enhancements.

You can send comments or report any problems to us via the ID feedback form or 
ID listserv (see the web site).

Background:
The LC Linked Data Service was first made available in May 2009 and offered the 
Library of Congress Subject Headings (LCSH), the Library's initial entry into 
the Linked Data environment. In part by assigning each vocabulary and each data 
value within it a unique resource identifier (URI), the service provides a 
means for machines to semantically access, use, and harvest authority and 
vocabulary data that adheres to W3C recommendations, such as Simple Knowledge 
Organization System (SKOS), and the more detailed vocabulary MADS/RDF. In this 
way, the LC Linked Data Service also makes government data publicly and freely 
available in the spirit of the Open Government directive. Although the primary 
goal of the service is to enable machine access to Library of Congress data, a 
web interface serves human users searching and browsing the vocabularies.  The 
new datasets join the term and code lists already available through the service:

* Library of Congress Subject Headings (LCSH)
* Library of Congress Children's Subject Headings
* Library of Congress Genre/Form Terms
* Library of Congress / NACO Name Authority File
* Thesaurus of Graphic Materials
* MARC Code List for Relators
* MARC Code List for Countries (which reference their equivalent ISO 3166 codes)
* MARC Code List for Geographic Areas
* MARC Code List for Languages (which have been cross referenced with ISO 
639-1, 639-2, and 639-5, where appropriate)
* PREMIS vocabularies for Cryptographic Hash Functions, Preservation Events, 
and Preservation Level Roles

The above code lists also contain links with appropriate LCSH and LC/NAF 
headings.

LC's Linked Data Service is managed by the Network Development and MARC 
Standards Office of the Library of Congress.


--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress
Washington, DC


[CODE4LIB] LC Systems Maintenance this coming weekend (9-12 November)

2012-11-08 Thread Ford, Kevin
All Library of Congress systems will be taken offline beginning Friday evening. 
 This includes LCCN Permalink, Z39.50 and SRU services, ID.LOC.GOV, all 
listservs, and, of course, the catalog.  *All* Library systems. Service will be 
restored by Tuesday.

The Library of Congress has planned extensive electrical work and power 
maintenance for this coming weekend.  As a protective measure, all Library 
systems will be powered down.  The maintenance period is scheduled for 
completion by Tuesday morning, when it is expected all Library systems will 
have been restored to normal operation.  Though it is anticipated work will not 
be fully completed until late Monday (or very early Tuesday morning), services 
will start coming back online many hours before then.

We regret any inconvenience this may cause.
 
Kevin

--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress
Washington, DC


Re: [CODE4LIB] haititrust

2012-08-03 Thread Ford, Kevin
Ideally, you shouldn't need the hathifiles.

The HathiTrust search page links to an OpenSearch document [1], which 
promisingly identifies an RSS feed and a JSON serialization of the search 
results.  Neither appears to work.  In theory, doing as Jon says and then 
appending "view=rss" would get you an RSS feed.  There is a contact email in 
the OpenSearch document you might try.

FWIW, if you look at the search page HTML, there is a "fixme" note in an HTML 
comment - the same comment, incidentally, that also comments out the RSS feed 
link in the HTML.
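
If you want to poke at it yourself, here is a quick sketch (Python 3) that 
pulls down the OpenSearch description document and lists the Url templates it 
advertises:

import urllib.request
import xml.etree.ElementTree as ET

OSD = "http://catalog.hathitrust.org/Search/OpenSearch?method=describe"
NS = "{http://a9.com/-/spec/opensearch/1.1/}"

with urllib.request.urlopen(OSD) as resp:
    root = ET.parse(resp).getroot()

# Each Url element names a response type and a search URL template.
for url in root.iter(NS + "Url"):
    print(url.get("type"), "->", url.get("template"))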

Yours,

Kevin

[1] http://catalog.hathitrust.org/Search/OpenSearch?method=describe





 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Jon Stroop
 Sent: Friday, August 03, 2012 11:15 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] haititrust
 
 You can do an empty query in their catalog, and use the Original
 Location facet to filter to a holding library. Programatically, I'm
 not sure, but you'd probably need to use the Hathi files:
 http://www.hathitrust.org/hathifiles.
 
 -Jon
 
 On 08/03/2012 11:07 AM, Eric Lease Morgan wrote:
  If I needed/wanted to know what materials held by my library were
 also
  in the HaitTrust, then programmatically how could I figure this out?
  In other words, do you know of a way to query the HaitTrust and limit
  the results to items my library owns? --Eric Lease Morgan


[CODE4LIB] 4 LCC Classes added to LC Linked Data Service

2012-06-13 Thread Ford, Kevin
Announcement:  4 LC Classification Classes added to LC Linked Data Service 

The Library of Congress is pleased to make available the B, M, N, and Z Classes 
of the Library of Congress Classification (LCC) from the Library's Linked Data 
Service (ID.LOC.GOV).

This effort not only provides URIs and resources for LCC schedules and tables 
required to synthesize a classification number, but also Linked Data resources 
for each derivable classification number and classification range within the 
entire Class hierarchy.  A small ontology has been developed to accurately 
represent the semantics for LCC data; it will follow shortly.

The publication of these LCC classes as Linked Data is presently a beta 
offering.  As such, this announcement is limited to those groups and users who 
will most benefit from this offering, and from whom we anticipate we will 
likely receive the most valuable feedback at this time.

Because LCC is sufficiently different from the other data available at 
ID.LOC.GOV, notably LCSH and Names, it is anticipated that more time and user 
feedback will be needed to fully work out any remaining issues and to maximize 
the data's usability.  Indeed, we encourage the submission of use cases about 
how users would like to use the data.  

Because this is an area of active development, no bulk downloads of these 
classes are being published at this time.  On the other hand, it is hoped that 
more LCC Classes will be added quickly in the near future.

**

Please explore it for yourself at 

http://id.loc.gov/authorities/classification 

**

Contact Us about ID:  
As always, your feedback is important and welcomed. Though we are interested in 
all forms of constructive commentary on all topics related to ID, we're 
particularly interested in how the data available from ID.LOC.GOV is used. Your 
contributions directly inform service enhancements.  

You can send comments or report any problems to us via the ID feedback form or 
ID listserv (see the web site).

Background:
The LC Linked Data Service was first made available in May 2009 and offered the 
Library of Congress Subject Headings (LCSH), the Library's initial entry into 
the Linked Data environment. In part by assigning each vocabulary and each data 
value within it a unique resource identifier (URI), the service provides a 
means for machines to semantically access, use, and harvest authority and 
vocabulary data that adheres to W3C recommendations, such as Simple Knowledge 
Organization System (SKOS), and the more detailed vocabulary MADS/RDF. In this 
way, the LC Linked Data Service also makes government data publicly and freely 
available in the spirit of the Open Government directive. Although the primary 
goal of the service is to enable machine access to Library of Congress data, a 
web interface serves human users searching and browsing the vocabularies.  The 
new datasets join the term and code lists already available through the 
service: 

* Library of Congress Subject Headings (LCSH)
* Library of Congress Children's Subject Headings
* Library of Congress Genre/Form Terms
* Library of Congress / NACO Name Authority File
* Thesaurus of Graphic Materials
* MARC Code List for Relators
* MARC Code List for Countries (which reference their equivalent ISO 3166 codes)
* MARC Code List for Geographic Areas
* MARC Code List for Languages (which have been cross referenced with ISO 
639-1, 639-2, and 639-5, where appropriate)
* PREMIS vocabularies for Cryptographic Hash Functions, Preservation Events, 
and Preservation Level Roles 

The above code lists also contain links with appropriate LCSH and LC/NAF 
headings.

--
Kevin Ford
Network Development & MARC Standards Office
Library of Congress
Washington, DC


[CODE4LIB] MARC Magic for file

2012-05-23 Thread Ford, Kevin
I finally had occasion today (read: remembered) to see if the *nix "file" 
command would recognize a MARC record file.  I haven't tested extensively, but 
it did identify the file as a "MARC21 Bibliographic" record.  It also correctly 
identified a MARC21 Authority record.  I'm running the most recent version of 
Ubuntu (12.04 - precise pangolin).

I write because the inclusion of a MARC21 rule in the "file" magic database 
stems from a Code4lib exchange that started in March 2011 [1] (it ends 
in April if you want to go crawling for the entire thread).

Rgds,

Kevin

[1] https://listserv.nd.edu/cgi-bin/wa?A2=ind1103&L=CODE4LIB&T=0&F=&S=&P=112728

--
Kevin Ford
Network Development and MARC Standards Office
Library of Congress
Washington, DC


Re: [CODE4LIB] MARC Magic for file

2012-05-23 Thread Ford, Kevin
 Does it work for bulk files?
-- It passed on a file containing 215 MARC Bibs and on a file containing 2,574 
MARC Auth records.  Don't know if you consider these "bulk," but there is more 
than 1 record in each file (caveat: "file" stops after evaluating the first 
line, so of the 2,574 Auth records, the last 2,573 could be invalid).  It 
failed on a file containing all of LC Classification.  I need to figure out 
why.

 Kevin, do you have examples of the output?
-- I received "MARC21 Bibliographic" and "MARC21 Authority," respectively.  In 
theory, if Leader positions 20-23 are not "4500" then "(non-conforming)" should 
be appended to the identification.  If requested, the mimetype - 
application/marc - should also be output.
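
Not the actual magic(5) rule, but a sketch of the same test in Python, using 
the 24-byte leader: byte 6 distinguishes the record type ('z' marks an 
authority record; the sketch lumps the other leader/06 values in with 
bibliographic for simplicity), and positions 20-23 carry the entry map that 
should read "4500":

def identify_marc(path):
    with open(path, "rb") as f:
        leader = f.read(24).decode("ascii", errors="replace")
    if len(leader) < 24:
        return "not MARC (record too short)"
    # Leader/06: 'z' = authority; most other values are bibliographic types.
    kind = "MARC21 Authority" if leader[6] == "z" else "MARC21 Bibliographic"
    # Leader/20-23: the entry map, fixed at "4500" in conforming MARC21.
    if leader[20:24] != "4500":
        kind += " (non-conforming)"
    return kind

print(identify_marc("records.mrc"))  # hypothetical file name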

Rgds,

Kevin




 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Ross Singer
 Sent: Wednesday, May 23, 2012 3:29 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] MARC Magic for file
 
 Wow, this is pretty cool.
 
 Kevin, do you have examples of the output?
 
 Does it work for bulk files?
 
 I mean, I could just try this on my Ubuntu machine, but it's all the
 way downstairs...
 
 -Ross.
 
 On May 23, 2012, at 3:14 PM, Ford, Kevin wrote:
 
  I finally had occasion today (read: remembered) to see if the *nix
 file command would recognize a MARC record file.  I haven't tested
 extensively, but it did identify the file as MARC21 Bibliographic
 record.  It also correctly identified a MARC21 Authority Record.  I'm
 running the most recent version of Ubuntu (12.04 - precise pangolin).
 
  I write because the inclusion of a file MARC21 specification rule
 in the magic.db stems from a Code4lib exchange that started in March
 2011 [1] (it ends in April if you want to go crawling for the entire
 thread).
 
  Rgds,
 
  Kevin
 
  [1]
  https://listserv.nd.edu/cgi-
 bin/wa?A2=ind1103L=CODE4LIBT=0F=S=P=1
  12728
 
  --
  Kevin Ford
  Network Development and MARC Standards Office Library of Congress
  Washington, DC


Re: [CODE4LIB] Author authority records to create publication feed?

2012-04-13 Thread Ford, Kevin
Hi Paul,

I can't really offer any suggestions but to say that this is a problem area 
presently.  In fact, there was a recent workshop, held in connection with the 
Spring CNI Membership Meeting, designed specifically to look at this problem 
(and author identity management more generally).  You can read more about it 
from the announcement here [1], but the idea was to bring a number of the 
larger actors (Web of Science, arXiv, ORCID, ISNI, VIAF, LC/NACO, and a few 
more) involved in managing authorial identity together to learn about the work 
being done, and to discuss improved ways, to disambiguate scholarly identities 
and then diffuse and share that information within and across the library and 
scholarly publishing realms.  Clifford Lynch, who moderated the meeting, will 
publish a post-workshop report in a few weeks [2].  Perhaps of additional 
interest, [2] also contains a link to the report of a similar workshop held in 
London about international author identity.

Inititatives like ISNI [3] and ORCID [4], which mint identifiers for (public, 
authorial) identities, and VIAF, which has done so much to aggregate the 
authority records of the participating libraries (while also assigning them an 
identifier), are essential to disambiguating one identity from another and 
assigning unique identifiers to those identities.  For identifiers like ORCIDs, 
the faculty member's sponsoring organization might acquire the ORCID for 
him/her, after which the faculty member will/may know and use the identifier in 
situations such as grant applications, publishing, etc. (though it may still 
be early days for this activity).  Part of the process, however, is 
diffusing the identifier across the library and scholarly publishing domains, 
all the while matching it with the correct identity (and identifier) in another 
system.  That said, when ISNIs and ORCIDs and, perhaps, VIAF identifiers start 
to make their way into Web of Science, arXiv, the LC/NACO file, and many other 
places, we - developers looking to create RSS feeds of author publications 
across services without having to deal with same-name problems or variants - 
might then have the hook we need to generate RSS feeds for author publications 
from such services as JSTOR, EBSCO, arXiv, Web of Science, etc.

Alternatively, you'd have to get your faculty members to submit their entire 
publication history to academia.edu (as Ethan suggested), after which the 
community would have to request an RSS feed of that history, or to an 
institutional repository (as Chad suggested), but I understand these types of 
things are an uphill battle with (often busy, underpaid) faculty.

Cordially,

Kevin


[1] http://www.cni.org/news/cni-workshop-scholarly-id/
[2] https://mail2.cni.org/Lists/CNI-ANNOUNCE/Message/113744.html
[3] http://www.isni.org/
[4] http://about.orcid.org/






 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Paul Butler (pbutler3)
 Sent: Friday, April 13, 2012 9:25 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] Author authority records to create publication feed?
 
 Howdy All,
 
 Some folks from across campus just came to my door with this question.
 I am still trying to work through the possibilities and problems, but
 thought others might have encountered something similar.
 
 They are looking for a way to create a feed (RSS, or anything else that
 might work) for each faculty member on campus to collect and link to
 their publications, which can then be embedded into their faculty
 profile webpage (in WordPress).
 
 I realize the vendors (JSTOR, EBSCO, etc.) allow author RSS feeds, but
 that really does not allow for disambiguation between folks with the
 same name and variants in name citation.  It appears Web of Science has
 author authority records and a set of apis, but we currently do not
 subscribe to WoS and am waiting for a trial to test.  What we need is
 something similar to this: http://arxiv.org/help/author_identifiers
 
 We can ask faculty members to upload their own citations and then just
 auto link out to something like Serials Solutions' Journal Finder,  but
 that is likely not sustainable.
 
 So, any suggestions - particularly free or low cost solutions.  Thanks!
 
 Cheers, Paul
 +-+-+-+-+-+-+-+-+-+-+-+-+
 Paul R Butler
 Assistant Systems Librarian
 Simpson Library
 University of Mary Washington
 1801 College Avenue
 Fredericksburg, VA 22401
 540.654.1756
 libraries.umw.edu
 
 Sent from the mighty Dell Vostro 230.


[CODE4LIB] Bulk Download of Names Available

2011-08-11 Thread Ford, Kevin
Bulk downloads of the Library of Congress *Name* Authority File (NAF) are now 
available.  The current bulk download is only MADS/RDF.  We'll make a SKOS/RDF 
download available in the near future.  We are offering two serializations:  
n-triples and RDF/XML.

The LC *Subject* Heading (LCSH) file continues to be available for download as 
SKOS/RDF, but now LCSH is also available in MADS/RDF (in the same 
serializations).  The MADS/RDF version makes it possible to identify the types 
of subject headings and subheadings.

They may be downloaded here:  http://id.loc.gov/download/
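
If you want a quick look inside one of the n-triples dumps, line-oriented 
processing works nicely.  A minimal Python sketch, assuming a downloaded file 
named names.madsrdf.nt (take the actual filename from the download page) and 
filtering on the MADS/RDF authoritativeLabel predicate:

PRED = "<http://www.loc.gov/mads/rdf/v1#authoritativeLabel>"

with open("names.madsrdf.nt", encoding="utf-8") as nt:
    for line in nt:
        parts = line.split(" ", 2)  # subject, predicate, object
        if len(parts) == 3 and parts[1] == PRED:
            # the object keeps its quotes and any language tag
            print(parts[0], parts[2].rstrip(" .\n"))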

The data dumps are very much a work in progress. Please report problems, 
issues, and wishes on the ID.LOC.GOV listserv: 

http://listserv.loc.gov/cgi-bin/wa?SUBED1=IDA=1


--
Kevin Ford
Digital Project Coordinator
Network Development  MARC Standards Office 
Library of Congress
101 Independence Avenue, SE
Washington, DC 20540-4402

Email: k...@loc.gov
Tel: 202 707 3526


[CODE4LIB] Names Added to ID.LOC.GOV

2011-08-10 Thread Ford, Kevin
Announcement:  New Vocabulary Data Added to LC Authorities and Vocabularies 
Service 

The Library of Congress is pleased to make available additional vocabularies 
from its Authorities and Vocabularies web service (ID.LOC.GOV), which provides 
access to Library of Congress standards and vocabularies as Linked Data. The 
new dataset is: 

* Library of Congress Name Authority File (LC/NAF)

In addition, the service has been enhanced to provide separate access to the 
following datasets which have been a part of the LCSH dataset access:

* Library of Congress Genre/Form Terms
* Library of Congress Children's Headings

The LC/NAF data are published in RDF using the MADS/RDF and SKOS/RDF 
vocabularies, as are the other datasets. Individual concepts are accessible at 
the ID.LOC.GOV web service via a web browser interface or programmatically via 
content-negotiation. The vocabulary data are available for bulk download in 
MADS and SKOS RDF (the Name file and main LCSH file will be available by 
Friday, August 12).
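
For anyone who has not tried the content-negotiation route, here is a minimal 
sketch in Python (the name URI below is just an example record; the Accept 
header asks for RDF/XML):

from urllib.request import Request, urlopen

uri = "http://id.loc.gov/authorities/names/n79021164"  # example record
req = Request(uri, headers={"Accept": "application/rdf+xml"})
with urlopen(req) as resp:
    print(resp.read().decode("utf-8")[:300])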

**Please explore it for yourself at http://id.loc.gov.**

Contact Us about ID:  
As always, your feedback is important and welcomed. Though we are interested in 
all forms of constructive commentary on all topics related to ID, we're 
particularly interested in how the data available from ID.LOC.GOV is used. Your 
contributions directly inform service enhancements.  

The addition of Names has resulted in considerable changes to the ID.LOC.GOV 
backend.  Although we have endeavored to bring the service up with all pieces 
in place, please be patient as we work out any remaining kinks.  

You can send comments or report any problems to us via the ID feedback form or 
ID listserv (see the web site).

Background:
The Authorities and Vocabularies web service was first made available in May 
2009 and offered the Library of Congress Subject Headings (LCSH), the Library's 
initial entry into the Linked Data environment. In part by assigning each 
vocabulary and each data value within it a unique resource identifier (URI), 
the service provides a means for machines to semantically access, use, and 
harvest authority and vocabulary data that adheres to W3C recommendations, such 
as Simple Knowledge Organization System (SKOS), and the more detailed 
vocabulary MADS/RDF. In this way, the Authorities and Vocabularies web service 
also makes government data publicly and freely available in the spirit of the 
Open Government directive. Although the primary goal of the service is to 
enable machine access to Library of Congress data, a web interface serves human 
users searching and browsing the vocabularies.  The new datasets join the term 
and code lists already available through the service: 

* Library of Congress Subject Headings (LCSH)
* Thesaurus of Graphic Materials
* MARC Code List for Relators
* MARC Code List for Countries (which reference their equivalent ISO 3166 codes)
* MARC Code List for Geographic Areas 
* MARC Code List for Languages (which have been cross referenced with ISO 
639-1, 639-2, and 639-5, where appropriate)
* PREMIS vocabularies for Cryptographic Hash Functions, Preservation Events, 
and Preservation Level Roles 

The above code lists also contain links with appropriate LCSH and LC/NAF 
headings.  Additional vocabularies will be added in the future, including 
additional PREMIS controlled vocabularies.


-- 
Kevin Ford
Digital Project Coordinator
Network Development  MARC Standards Office 
Library of Congress
101 Independence Avenue, SE
Washington, DC 20540-4402

Email: k...@loc.gov
Tel: 202 707 3526


Re: [CODE4LIB] TIFF Metadata to XML?

2011-07-18 Thread Ford, Kevin
Exiftool [1] and trusty ImageMagick [2] will work.  With ImageMagick it is as 
easy as:

convert image.tiff image.xmp

Members of the Visual Resources Association (VRA) have been working on/with 
embedded metadata for a few years now.  There may be something more to glean 
from the working group's wiki [3].
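
For a batch job of Edward's size (60k files), a thin wrapper around Exiftool 
is probably the least effort.  A rough sketch, assuming exiftool is on the 
PATH and using its -X switch, which emits RDF/XML (the paths are placeholders):

import pathlib
import subprocess

src = pathlib.Path("/path/to/tiffs")  # placeholder
for tiff in sorted(src.rglob("*.tif*")):
    # exiftool -X prints a file's embedded metadata as RDF/XML
    result = subprocess.run(["exiftool", "-X", str(tiff)],
                            capture_output=True, text=True, check=True)
    tiff.with_suffix(".xml").write_text(result.stdout, encoding="utf-8")

An XSLT pass (or more Python) could then map the fields you care about into 
your pseudo-Dublin Core.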

Cordially,

Kevin


[1] http://www.sno.phy.queensu.ca/~phil/exiftool/
[2] http://www.imagemagick.org/script/index.php
[3] http://metadatadeluxe.pbworks.com/w/page/20792238/FrontPage
 

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Edward M. Corrado
 Sent: Monday, July 18, 2011 9:18 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] TIFF Metadata to XML?
 
 Hello All,
 
 Before I re-invent the wheel or try many different programs, does
 anyone have a suggestion on a good way to extract embedded Metadata
 added by cameras and (more importantly) photo-editing programs such as
 Photoshop from TIFF files and save it as as XML? I have  60k photos
 that have metadata including keywords, descriptions, creator, and other
 fields embedded in them and I need to extract the metadata so I can
 load them into our digital archive.
 
 Right now, after looking at a few tools and having done a number of
 Google searches and haven't found anything that seems to do what I want.
 As of now I am leaning towards extracting the metadata using
 exiv2 and creating a script (shell, perl, whatever) to put the fields I
 need into a pseudo-Dublin Core XML format. I say pseudo because I have
 a few fields that are not Dublin Core. I am assuming there is a better
 way. (Although part of me thinks it might be easier to do that then
 exporting to XML and using XSLT to transform the file since I might
 need to do a lot of cleanup of the data regardless.)
 
 Anyway, before I go any further, does anyone have any
 thoughts/ideas/suggestions?
 
 Edward


Re: [CODE4LIB] source of marc geographic code?

2011-06-23 Thread Ford, Kevin
The GeographicArea codes have been available from [1] in XML [2] since at least 
late 2007 [3].  I can't say with 100% certainty that the XML structure has 
remained perfectly consistent since 2007, but eyeballing the 2007 version and 
comparing it to currently available file suggests that the structure has 
remained consistent.

The GACS codes are also available from ID, as has been pointed out.  The entire 
list is available for download at [4].  Let me acknowledge, though, that the 
labels for the URIs (incidentally, the GACS code is the last token of the URI)  
are not part of the RDF/N-triples/JSON at [5].  This sounds like a feature 
request - and a useful one at that.  Would that be an accurate interpretation 
of this thread?
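
In the meantime, pulling the code out of a URI (or building a URI from a code) 
is mechanical; a trivial Python sketch, assuming the URI shape used by the 
geographic areas vocabulary at [5]:

uri = "http://id.loc.gov/vocabulary/geographicAreas/n-us"
code = uri.rstrip("/").rsplit("/", 1)[-1]
print(code)  # n-us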

Cordially,

Kevin

--
Network Development  MARC Standards Office

[1] http://www.loc.gov/marc/geoareas/gacshome.html
[2] http://www.loc.gov/standards/codelists/gacs.xml
[3] http://web.archive.org/web/20071129170212/http://www.loc.gov/marc/geoareas/
[4] http://id.loc.gov/download/
[5] http://id.loc.gov/vocabulary/geographicAreas.html



From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jonathan 
Rochkind [rochk...@jhu.edu]
Sent: Wednesday, June 22, 2011 21:43
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] source of marc geographic code?

 The result was that a few meetings later LC announced that they
 had coded the MARC online pages in XML, and were generating the HTML
 from that. I think I was mis-understood.

No doubt, but man if they'd then just SHARE that XML with us at a persistent 
URL, and keep the structure of that XML the same, that'd be really useful!


Re: [CODE4LIB] LCSH and Linked Data

2011-04-07 Thread Ford, Kevin
Actually, it appears to depend on whose Authority record you're looking at.  
The Canadians, Australians, and Israelis have it as a CorporateName (110), as 
do the French (210 - unimarc); LC and the Germans say it's a Geographic Name.

In the case of LCSH, therefore, it would be a 151.  Regardless, it is in VIAF.

Warmly,

Kevin




From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of LeVan,Ralph 
[le...@oclc.org]
Sent: Thursday, April 07, 2011 11:34
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] LCSH and Linked Data

If you look at the fields those names come from, I think they mean
England as a corporation, not England as a place.

Ralph

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
Of
 Owen Stephens
 Sent: Thursday, April 07, 2011 11:28 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] LCSH and Linked Data

 Still digesting Andrew's response (thanks Andrew), but

 On Thu, Apr 7, 2011 at 4:17 PM, Ya'aqov Ziso yaaq...@gmail.com
wrote:

  *Currently under id.loc.gov you will not find name authority
records, but
  you can find them at viaf.org*.
  *[YZ]*  viaf.org does not include geographic names. I just checked
there
  England.
 

 Is this not the relevant VIAF entry?
 http://viaf.org/viaf/142995804


 --
 Owen Stephens
 Owen Stephens Consulting
 Web: http://www.ostephens.com
 Email: o...@ostephens.com


Re: [CODE4LIB] MARC magic for file

2011-03-28 Thread Ford, Kevin
I couldn't get Simon's MARC 21 Magic file to work.  Among other issues, I 
received "line too long" errors.  But, since I've been curious about this for 
some time, I figured I'd take a whack at it myself.  Try this:

#
# MARC 21 Magic  (Second cut)

# Set at position 0
0   short   0x 

# leader ends with 4500
20 string  4500

# leader starts with 5 digits, followed by codes specific to MARC format
0 regex/1 (^[0-9]{5})[acdnp][^bhlnqsu-z]  MARC Bibliographic
0 regex/1 (^[0-9]{5})[acdnosx][z] MARC Authority
0 regex/1 (^[0-9]{5})[cdn][uvxy]  MARC Holdings
0 regex/1 (^[0-9]{5})[acdn][w]MARC Classification
0 regex/1 (^[0-9]{5})[cdn][q] MARC Community

I've also attached it to this email to preserve the tabs.  

In any event, I can confirm it works on MARC Bib, MARC Authority, and MARC 
Classification files I have bumping around my computer.  I've not tested it on 
MARC Holdings and MARC Community.

Do let us/me know if it works for you (and the community generally).  I can see 
about submitting it for formal inclusion in the magic file.
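
And if you'd like to spot-check files without rebuilding your magic database, 
here is a rough Python equivalent of the same leader tests (my own sketch, not 
part of the magic file):

import re
import sys

TESTS = [
    (r"^[0-9]{5}[acdnp][^bhlnqsu-z]", "MARC Bibliographic"),
    (r"^[0-9]{5}[acdnosx]z", "MARC Authority"),
    (r"^[0-9]{5}[cdn][uvxy]", "MARC Holdings"),
    (r"^[0-9]{5}[acdn]w", "MARC Classification"),
    (r"^[0-9]{5}[cdn]q", "MARC Community"),
]

# the leader is the first 24 bytes of the record
leader = open(sys.argv[1], "rb").read(24).decode("ascii", "replace")
if leader[20:24] != "4500":
    print("leader does not end in 4500")
else:
    labels = [label for pattern, label in TESTS if re.match(pattern, leader)]
    print(labels[0] if labels else "not recognized as MARC")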

Warmly,

Kevin

--
Library of Congress
Network Development and MARC Standards Office




From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Simon Spero 
[s...@unc.edu]
Sent: Thursday, March 24, 2011 12:28
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] MARC magic for file

Some of the problems in your first cut are:

1. Offsets for regex are given in terms of lines.  MARC files don't have
newlines in them, unless you're Millennium, in which case they can be
inserted every 200,000 bytes to keep things interesting.
2.  Byte matches match byte values, so 20 byte 4   is looking for the
binary value, not the ascii digit.
3.  Sometimes you need to prime the buffer before you can do a regexp match.

Is this good enough?


# MARC 21 Magic  (First cut)
#  indicator count must be 2
10 string 2
#  leader must end in 4500
20 string 4500
#  leader must start with five digits, a record status, and a record
type
0 regex ^([0-9]{5})[acdnp][acdefgijkmoprt][abcims] MARC Bibliographic
0 regex ^([0-9]{5})[acdnp][z] MARC Authority

Simon


On Wed, Mar 23, 2011 at 8:09 PM, William Denton w...@pobox.com wrote:

 Has anyone figured out the magic necessary for file to recognize MARC
 files?

 If you don't know it, file is a Unix command that tells you what kind of
 file a file is.  For example:

 $ file 101015_001.mp3
 101015_001.mp3: Audio file with ID3 version 2.3.0, contains: MPEG ADTS,
 layer III, v1, 192 kbps, 44.1 kHz, Stereo

 $ file P126.jpg
 P126.jpg: JPEG image data, EXIF standard, comment: AppleMark

 It's a really useful command.  I assume it's on OSX, but I don't know. You
 can get it for Windows with Cygwin.

 The problem is, file doesn't grok MARC:

 $ file catalog.01.mrc
 catalog.01.mrc: data

 I took a stab at getting the magic defined, but it didn't work.  I'll
 include what I used below.  You can put it into a magic.txt file, and then
 use

 file -m magic.txt some_file.mrc

 to test it.  It'll tell you the file is MARC Bibliographic ... but it also
 thinks that PDFs, JPEGs, and text files are MARC.  That's no good.

 It'd be great if the MARC magic got into the central magic database so
 everyone would be able to recognize various MARC file types.

 Bill


 # --- clip'n'test
 # MARC 21 for Bibliographic Data
 # http://www.loc.gov/marc/bibliographic/bdleader.html
 #
 # This doesn't work properly

 0 string x

 5    regex  [acdnp]
 6    regex  [acdefgijkmoprt]
 7    regex  [abcims]
 8    regex  [\ a]
 9    regex  [\ a]
 10   byte  x
 11   byte  x
 12   string  x
 17   regex [\ 12345678uz]
 18   regex  [\ aciu]
 19   regex  [\ abc] MARC Bibliographic

 #20   byte 4
 #21   byte 5
 #22   byte 0
 #23   byte 0   MARC Bibliographic

 # --- end clip'n'test

 --
 William Denton, Toronto : miskatonic.org www.frbr.org openfrbr.org





[CODE4LIB] New Vocabs Added to ID.LOC.GOV

2011-01-04 Thread Ford, Kevin
Announcement: New Vocabularies Added to LC Authorities and Vocabularies Service 

The Library of Congress is pleased to make available new vocabularies from its 
Authorities and Vocabularies web service (ID.LOC.GOV), which provides access to 
Library of Congress standards and vocabularies as Linked Data.  The new 
additions include:

MARC Code List for Countries
MARC Code List for Geographic Areas
MARC Code List for Languages

The MARC Countries entries include references to their equivalent ISO 3166 
codes.  The MARC Languages have been cross referenced with ISOs 639-1, 639-2, 
and 639-5, where appropriate.  Additional vocabularies will be added in the 
future, including additional PREMIS controlled vocabularies.

The vocabulary data are published in RDF using the SKOS/RDF Vocabulary.  
Individual concepts are accessible via the ID.LOC.GOV web service via a web 
browser interface or programmatically via content-negotiation.  The vocabulary 
data are also available for bulk download.  A new bulk download of LCSH will be 
available tomorrow, 5 January 2011.

As always, your feedback is important and welcomed.  Though we are interested 
in all forms of constructive commentary on all topics related to ID, we're 
particularly interested in how the data available from ID.LOC.GOV is used.  
Your contributions directly inform service enhancements.  

The Authorities and Vocabularies web service was first made available in May 
2009 and offered the Library of Congress Subject Headings (LCSH), the Library's 
initial entry into the Linked Data movement.  In part by assigning each 
vocabulary and each data value within it a unique resource identifier (URI), 
the service provides a means for machines to semantically access, use, and 
harvest authority and vocabulary data that adheres to W3C recommendations, such 
as Simple Knowledge Organization System (SKOS).  In this way, the Authorities 
and Vocabularies web service also makes government data publicly and freely 
available in the spirit of the Open Government directive.  Although the primary 
goal of the service is to enable machine access to Library of Congress data, a 
web interface serves human users searching and browsing the vocabularies.

Please explore it for yourself at http://id.loc.gov.


*

Kevin M. Ford
Digital Project Coordinator
Network Development  MARC Standards Office
Library of Congress
101 Independence Avenue, SE
Washington, DC 20540-4402


[CODE4LIB] MADS/RDF for review

2010-11-19 Thread Ford, Kevin
Announcement: MADS/RDF for review

A MADS/RDF ontology developed at the Library of Congress is available for a 
public review period until Jan. 14, 2011.  The MADS/RDF (Metadata Authority 
Description Schema in RDF) vocabulary is a data model for authority and 
vocabulary data used within the library and information science (LIS) 
community, which is inclusive of museums, archives, and other cultural 
institutions. It is presented as an OWL ontology.

Documentation and the ontology are available at: 
http://www.loc.gov/standards/mads/rdf/

Based on the MADS/XML schema, MADS/RDF provides a means to record data from the 
Machine Readable Cataloging (MARC) Authorities format in RDF for use in 
semantic applications and Linked Data projects. MADS/RDF is a knowledge 
organization system designed for use with controlled values for names 
(personal, corporate, geographic, etc.), thesauri, taxonomies, subject heading 
systems, and other controlled value lists. It is closely related to SKOS, the 
Simple Knowledge Organization System and a widely supported and adopted RDF 
vocabulary. Unlike SKOS, however, which is very broad in its application, 
MADS/RDF is designed specifically to support authority data as used by and 
needed in the LIS community and its technology systems. Given the close 
relationship between the aim of MADS/RDF and the aim of SKOS, the MADS ontology 
has been fully mapped to SKOS.

Community feedback is encouraged and welcomed. The MODS listserv - MADS/XML is 
maintained as part of the community work on MODS (Metadata Object Description 
Schema) - is the preferred forum for feedback: 
http://listserv.loc.gov/listarch/mods.html (send mail to: 
m...@listserv.loc.gov).  Kevin Ford, the primary architect of the model, will 
be responding on that forum in order to have an open discussion.


*

Kevin M. Ford
Digital Project Coordinator
Network Development  MARC Standards Office
Library of Congress
101 Independence Avenue, SE
Washington, DC 20540-4402


Re: [CODE4LIB] dc:identifier in Google XML

2010-07-19 Thread Ford, Kevin
Dear David,

I believe they're codes for universities.  UCSC is probably Univ of Calif Santa 
Cruz.  UOM is University of Michigan.  (You'll see STANFORD and OCLC in the 
results also, though OCLC is not a university).

I tracked one of the items in the ATOM feed to the UM record:

http://mirlyn.lib.umich.edu/Record/000680081/Details#tabs

The ID you see in the ATOM feed is buried in one of the 974 fields of the UM 
MARC record.

HTH,
Kevin





From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of David Kane 
[dk...@wit.ie]
Sent: Monday, July 19, 2010 12:55 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] dc:identifier in Google XML

HI All,

I am getting data from google books that I do not understand in the
dc:identifier field.

I understand ISBN: ISSN: LCCN: OCLC:

but UOM:, and UCSC:?

Can anyone help with what these two mean.  Are they Universities?
Here is a snippet of xml;

  <dc:format>book</dc:format>
   <dc:identifier>r0xMMAAJ</dc:identifier>
   <dc:identifier>UOM:39015035700759</dc:identifier>
   <dc:subject>Medical</dc:subject>
   <dc:title>Abstracts [of the] annual meeting</dc:title>

... generated from this URL:
http://www.google.com/books/feeds/volumes?q=Abstracts%20of%20the%20annual%20meeting

Thanks,

David.

--
David Kane, MLIS.
Systems Librarian
Waterford Institute of Technology
Ireland
http://library.wit.ie/
T: ++353.51302838
M: ++353.876693212


Re: [CODE4LIB] Any web services that can help sort out this for me.

2010-06-17 Thread Ford, Kevin
Following on Dave's recommendation, you could also use Google Books' Data API 
[1].  Search for the book, get a structured ATOM feed as a response, presume 
the first hit is your book, and then follow the ATOM feed link for that book's 
metadata.  It isn't going to be perfect; I'd be interested to know the eventual 
ratio of perfect versus missed matches.
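
A rough sketch of that heuristic in Python, using the volumes feed from the 
developer's guide at [1] and taking the first hit on faith (hence the 
imperfect ratio):

from urllib.parse import quote
from urllib.request import urlopen
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
DC = "{http://purl.org/dc/elements/1.1/}"

def first_hit_identifiers(citation):
    url = "http://www.google.com/books/feeds/volumes?q=" + quote(citation)
    root = ET.parse(urlopen(url)).getroot()
    entry = root.find(ATOM + "entry")  # presume the first hit is the book
    if entry is None:
        return []
    return [i.text for i in entry.findall(DC + "identifier")]

print(first_hit_identifiers("Progress in Smart Materials and Structures"))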

Good luck,
Kevin

[1] http://code.google.com/apis/books/docs/gdata/developers_guide_protocol.html



From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Dave Caroline 
[dave.thearchiv...@gmail.com]
Sent: Thursday, June 17, 2010 5:43 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Any web services that can help sort out this for me.

what definition of large list: 10, 100, 1000, ...?

yes google

copy title part Progress in Smart Materials and Structures paste in
google box press return

first hit for the first line has the isbn, or you could script it and
use the Open Library API and get the isbn back possibly

Dave Caroline

On Thu, Jun 17, 2010 at 9:59 AM, David Kane dk...@wit.ie wrote:
 Hi, I have large amounts of data like this:

 <yawn>
 Reece, P. L., (2006), Progress in Smart Materials and Structures, Nova
 Ghosh, S. K., (2008), Self-healing materials: fundamentals, design
 strategies and applications, Wiley
 A.Y.K. Chan, Biomedical Device Technology: Principles  Design,
 Charles C. Thomas, 2008.
 L.J. Street, Introduction to Biomedical Engineering Technology, CRC
 Press, 2007.
 </yawn>

 ... one book per line.

 they are not in any order.

 I am lazy.  So, is there a web service out there that I can throw this
 stuff at to organise it for me and ideally find the ISBNs.

 Long shot, I know.

 But thanks,

 David.


 --
 David Kane
 Systems Librarian
 Waterford Institute of Technology
 Ireland
 http://library.wit.ie/
 davidfk...@googlewave.com
 T: ++353.51302838
 M: ++353.876693212



Re: [CODE4LIB] XForms EAD editor sandbox available

2009-11-13 Thread Ford, Kevin
We've been using Orbeon Forms for about a year now for cataloging our digital 
collections.  We use Fedora Commons, so a tool that takes XML as input and 
outputs XML seemed a no-brainer.  It has worked very nicely for editing VRA 
Core4 records.  But, instead of doing anything terribly fancy with Orbeon, we 
simply use the little sandbox application that comes with Orbeon (there's an 
online demo [1]).  The URL to the XForm is part of the query string.  This 
solution has greatly reduced our time investment in making Orbeon part of our 
workflow and, more importantly, in getting Orbeon to work for us.  All that 
being said, Ethan's sharp-looking EAD editor makes me jealous that we haven't 
created our own custom editor.

As for Orbeon's performance, once we worked out some quirks, we've been quite 
happy with Orbeon.  Orbeon hosts a useful performance and tuning page [2].  We 
also learned that it is helpful to stop the Orbeon app and restart it about 
once every two weeks as performance can become progressively slower.  It seems 
to need a little reboot.  In any event, a typical XForm for us is about 200k, 
with a number of authority lists, one of which includes nearly 1500 items.  
Orbeon loads and renders the XForm fairly quickly (less than 4 seconds) and 
editing performance hasn't been an issue either, which is great considering 
that a 1500-item-subject-authority drop down list is created for each subject 
being added to a record.

Moving such a large XForm to a server-based solution was necessary.  Our XForm 
cataloging application, which began with a simple DC record and focused on 
producing a viable XForm, initially used the Mozilla XForms add-on [3].  The 
Firefox add-on, which of course runs on the client, easily scaled for a VRA 
Core4 record, but it couldn't handle a burgeoning subject authority file.  
Hence the need for a quick alternative solution.

-Kevin

[1] http://www.orbeon.com/ops/xforms-sandbox/
[2] http://wiki.orbeon.com/forms/doc/developer-guide/performance-tuning
[3] http://www.mozilla.org/projects/xforms/

--
Kevin Ford
Library Digital Collections
Columbia College Chicago



-Original Message-
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Andrew 
Ashton
Sent: Friday, November 13, 2009 8:37 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] XForms EAD editor sandbox available

Nice job, Ethan.  This looks really cool.

We have an Orbeon-based MODS editor, but I have found Orbeon to be a bit
tough to develop/maintain and more heavyweight than we really need.  We're
considering more Xforms implementations, but I would love to find a more
lightweight Xforms application.  Does anyone have any recommendations?

The only one I know of is XSLTForms (http://www.agencexml.com/xsltforms) but
I haven't messed with it yet.

-Andy

On 11/13/09 9:13 AM, Eric Hellman e...@hellman.net wrote:

 XForms and Orbeon are very interesting tools for developing metadata
 management tools.

 The ONIX developers have used this stack to produce an interface for ONIX-PL
 called OPLE that people should try out.

 http://www.jisc.ac.uk/whatwedo/programmes/pals3/onixeditor.aspx

 Questions about Orbeon relate to performance and integrability, but I think
 it's an impressive use of XForms nonetheless.

 - Eric

 On Nov 12, 2009, at 1:30 PM, Ethan Gruber wrote:

 Hello all,

 Over the past few months I have been working on and off on a research
 project to develop a XForms, web-based editor for EAD finding aids that runs
 within the Orbeon tomcat application.  While still in a very early alpha
 stage (I have probably put only 60-80 hours of work into it thus far), I
 think that it's ready for a general demonstration to solicit opinions,
 criticism, etc. from librarians, and technical staff.

 Background:
 For those not familiar with XForms, it is a W3C standard for creating
 next-generation forms.  It is powerful and can allow you to create XML in
 the way that it is intended to be created, without limits to repeatability,
 complex hierarchies, or mixed content.  Orbeon adds a level on top of that,
 taking care of all the ajax calls, serialization, CRUD operations, and a
 variety of widgets that allow nice features like tabs and
 autocomplete/autosuggest that can be bound to authority lists and controlled
 access terms.  By default, Orbeon reads and writes data from and to an eXist
 database that comes packaged with it, but you can have it serialize the XML
 to disk or have it interact with any REST interface such as Fedora.

 Goals:
 Ultimately, I wish to create a system of forms that can open any EAD
 2002-compliant XML file without any data loss or XML transformation
 whatsoever.  I think that this is the shortcoming of systems such as Archon
 and Archivists' Toolkit.  I want to integrate authority lists that can be
 integrated into certain fields with autosuggest (such as corporate names,
 people, and subjects).  If there is demand, I can build a public interface
 for