I was going to try to reduce the space a bit by focusing on 650 fields. Each record with a Dewey number will become one tab-separated line: the Dewey number first, followed by each 650 field in order. So something like:

305.42/0973 <tab> Women's rights -- United States -- History -- Sources. <tab> Women -- United States -- History -- Sources <tab> Manuscripts, American -- Facsimiles.

I thought it might be a place to start at least … it’s running on an ec2 instance right now :-)

//Ed

On Dec 10, 2013, at 4:26 PM, Karen Coyle <[email protected]> wrote:

> I've often thought that this would be an interesting exercise if someone
> would undertake it.
>
> Just a reminder: in theory (IN THEORY) the first subject heading in an LC
> record is the one most semantically close to the assigned subject
> classification. So perhaps a first pass with the FIRST 6xx might give a more
> refined matching. And then it would be interesting to compare that with the
> results using all 600-651's.
>
> kc
>
> On 12/10/13, 1:18 PM, Edward Summers wrote:
>> Not a naive idea at all. If you have the stomach for it, you could extract
>> the Subject Heading / Dewey combinations out of say the LC Catalog MARC data
>> [1] to use as training data for some kind of clustering [2] algorithm. You
>> might even be able to do something simple like keep a count of the Dewey
>> ranges associated with each subject heading.
>>
>> I’m kind of curious myself, so I could work on getting the subject heading /
>> dewey combinations if you want?
>>
>> //Ed
>>
>> [1] https://archive.org/details/marc_records_scriblio_net
>> [2] https://en.wikipedia.org/wiki/Cluster_analysis
>>
>> On Dec 10, 2013, at 8:18 AM, Irina Arndt <[email protected]> wrote:
>>
>>> Hi CODE4LIB,
>>>
>>> we would like to add DDC classes to a bunch of MARC records, which contains
>>> only LoC Subject Headings.
>>> Does anybody know, if a mapping between LCSH and DDC is anywhere existent
>>> (and available)?
>>>
>>> I understood, that WebDewey
>>> http://www.oclc.org/dewey/versions/webdewey.en.html might provide such a
>>> service, but
>>>
>>> · we are no OCLC customers or subscribers to WebDewey
>>>
>>> · even if we were, I'm not sure, if the service matches our needs
>>>
>>> I'm thinking of a tool, where I can upload my list of subject headings and
>>> get back a list, where the matching Dewey classes have been added (but a
>>> 'simple' csv file with LCSH terms and DDC classes would be helpful as well-
>>> I am fully aware, that neither LCSH nor DDC are simple at all...) . Naïve
>>> idea...?
>>>
>>> Thanks for any clues,
>>> Irina
>>>
>>>
>>> -------
>>>
>>> Irina Arndt
>>> Max Planck Digital Library (MPDL)
>>> Library System Coordinator
>>> Amalienstr. 33
>>> D-80799 Muenchen, Germany
>>>
>>> Tel. +49 89 38602-254
>>> Fax +49 89 38602-290
>>>
>>> Email: [email protected]
>>> http://www.mpdl.mpg.de
>
> --
> Karen Coyle
> [email protected] http://kcoyle.net
> m: 1-510-435-8234
> skype: kcoylenet
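
P.S. For anyone who wants to play along: the "keep a count of the Dewey ranges associated with each subject heading" idea could be sketched roughly as below, once the tab-separated file exists. This is only an illustration, not the script running on the ec2 instance. The function names, the choice of the top-level hundreds class as the "range", and the crude trailing-period cleanup are all assumptions of mine; the one sample line is the example from this message.

```python
from collections import Counter, defaultdict

# One line in the layout described in this message:
# Dewey number, then each 650 heading, separated by tabs.
SAMPLE = [
    "305.42/0973\t"
    "Women's rights -- United States -- History -- Sources.\t"
    "Women -- United States -- History -- Sources\t"
    "Manuscripts, American -- Facsimiles.",
]


def dewey_class(number):
    """Return the top-level hundreds class, e.g. '305.42/0973' -> '300'.

    Assumption: the first three characters are the class digits.
    Anything else is treated as unusable and yields None.
    """
    head = number.strip()[:3]
    if len(head) == 3 and head.isdigit():
        return head[0] + "00"
    return None


def normalize_heading(heading):
    """Crude cleanup (an assumption): trim whitespace and trailing periods."""
    return heading.strip().rstrip(".").strip()


def tally_dewey_by_heading(lines):
    """Map each subject heading to a Counter of the Dewey classes seen with it."""
    counts = defaultdict(Counter)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # no headings on this line
        cls = dewey_class(fields[0])
        if cls is None:
            continue  # skip lines without a usable Dewey number
        for heading in fields[1:]:
            heading = normalize_heading(heading)
            if heading:
                counts[heading][cls] += 1
    return counts


counts = tally_dewey_by_heading(SAMPLE)
for heading, classes in sorted(counts.items()):
    # most_common(1) gives the Dewey class most often seen with this heading
    print(heading, "->", classes.most_common(1))
```

To try Karen's suggestion of a first pass with only the first subject heading, the inner loop could iterate over `fields[1:2]` instead of `fields[1:]`, and the two sets of counts could then be compared.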
