[CODE4LIB] NYPL Drupal Camp
Please excuse cross-postings. This message is being posted to multiple lists.

Please join us for the first-ever NYPL Drupal Camp!

In January 2010 the New York Public Library unveiled a soup-to-nuts re-engineering of its website, moving 15 years of digital sprawl into a modern, open source content management system: Drupal. Drupal not only enables us to better organize and interrelate our web content, it also lets us turn over daily control of various local website areas to the staff who know them best. Six months into this new era, staff from across the Library are learning how to update their own location information and events calendars, and are experimenting with exciting new tools like blogs, audio and video to build a dynamic new digital experience for our patrons.

Now, with half a year under our belts, we’d like to share our experiences with other libraries in the first-ever NYPL Drupal Camp. Whether you’re considering adopting Drupal for your website, or are already an old hand, this will be a great opportunity to learn directly from NYPL digital staff, to share your own insights, and perhaps lay groundwork for collaboration.

This event will take place over two days: Thursday August 26 and Friday August 27, from 9 am to 5 pm.

The first day will be a series of presentations by NYPL staff with plenty of opportunities for questions and answers. Topics for the first day include (subject to change):

• Vision
• Content development
• Staff training
• Project management/staffing
• Information architecture
• User testing process
• Policy
• Infrastructure
• Content migration
• Development
• Vendors/IT relationships

The second day will be an un-conference format during which attendees set the schedule of sessions, some of which can be all-day code sprints.

The workshop is free. Coffee will be provided, along with a listing of nearby lunch options.

WHAT: NYPL Drupal Camp
WHERE: Science, Industry and Business Library
188 Madison Avenue, Lower Level, Room 018
New York, NY 10016
WHEN: Thursday August 26 and Friday August 27, 9 am to 5 pm
COST: Free
HOW: http://nypldrupalcamp.eventbrite.com/
LIMIT: Please note that registration is limited to 2 participants per organization. We would recommend that one project/content manager and one technical staff member attend NYPL Drupal Camp.

Hope to see you there!
Re: [CODE4LIB] NYPL Drupal Camp
I would definitely fly (drive, bike, walk...) from LA for this if it didn't conflict with DrupalCon CPH ;(

Cary

On Wed, Jul 7, 2010 at 11:03 AM, Michelle Misner mmis...@nypl.org wrote:
[snip]

--
Cary Gordon
The Cherry Hill Company
http://chillco.com
[CODE4LIB] schema for some web page
So in our MARC records, we have these 856 links, the meaning of which is basically some web page related to the entity at hand. You don't really know the relation; the granularity is not there.

So, fine, data is data, there ought to be some way to model this in standard XML/RDF/DC/whatever, right?

It's not dc:identifier, because dc:identifier ends up including all sorts of URIs that are not really web pages at all; they are just identifiers of various kinds. The MARC 856s are URIs, it's true, but they really _aren't_ URIs given as identifiers. They do not necessarily identify the item at hand at all, but they DO necessarily lead to a web page with some "see also" relationship to the entity at hand.

So... how would you include this in, say, a DC set in XML or RDF? Is there any common way people have done this in the past?

Yeah, I _could_ just expose MODS or MARCXML or what have you. But I'm looking for some vocabulary that will handle MARC 856s, and that will also handle other kinds of "see also" links from other formats in the future, when I add other formats to my corpus.

Any ideas?

Jonathan
Re: [CODE4LIB] schema for some web page
Isn't that pretty much what dc:relation is for?

From http://dublincore.org/documents/dcmi-terms/#elements-relation

Label: Relation
Definition: A related resource.
Comment: Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system.

On 7 July 2010 23:32, Jonathan Rochkind rochk...@jhu.edu wrote:
[snip]
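The dc:relation suggestion above can be pictured with a small sketch. It is only an illustration, not something posted in the thread: the function name `relation_triples` and the example.org URIs are made up, it uses the `http://purl.org/dc/terms/relation` property, and it assumes the 856 URLs are already valid absolute URIs that need no escaping.

```python
# Property URI for dcterms:relation ("A related resource").
DCTERMS_RELATION = "http://purl.org/dc/terms/relation"


def relation_triples(record_uri, urls):
    """Return one N-Triples line per 856 URL, linking the record to that URL.

    Assumes record_uri and every URL are absolute URIs that are safe to
    place between angle brackets without further escaping.
    """
    return [f"<{record_uri}> <{DCTERMS_RELATION}> <{url}> ." for url in urls]


if __name__ == "__main__":
    # Hypothetical record URI and 856 $u value, for illustration only.
    for line in relation_triples(
        "http://example.org/record/123",
        ["http://example.org/toc/123.html"],
    ):
        print(line)
```

This says nothing about what kind of relation each link expresses, which is exactly the lack of granularity raised in the original question.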
Re: [CODE4LIB] schema for some web page
Mike:

For sure dc:relation works, and has some subproperties that are a bit more specific, but it's still pretty much a blunt instrument. I know I sound like a broken record, but RDA has a LOT of relationships to choose from--these are the WEMI-to-WEMI relationships:

http://metadataregistry.org/schemaprop/list/schema_id/13.html

There are also:

RDA Relationships for Persons, Corporate Bodies, Families:
http://metadataregistry.org/schemaprop/list/schema_id/22.html

and RDA Relationships for Concepts, Events, Objects, Places:
http://metadataregistry.org/schemaprop/list/schema_id/23.html

Diane

On 7/7/10 6:42 PM, Mike Taylor wrote:
[snip]
Re: [CODE4LIB] schema for some web page
Hi Jonathan,

> So in our marc records, we have these 856 links, the meaning of which is basically some web page related to the entity at hand. You don't really know the relation, the granularity is not there.

There is some *minimal* indication of the relationship via the second indicator of the 856 (and subfield $3, for a related resource) [1]:

Second Indicator - Relationship
Relationship between the electronic resource at the location specified in field 856 and the item described in the record as a whole. Used to provide further information about the relationship if it is not a one-to-one relationship.

# - No information provided

0 - Resource
Electronic location in field 856 is for the same resource described by the record as a whole. In this case, the item represented by the bibliographic record is an electronic resource. If the data in field 856 relates to a constituent unit of the resource represented by the record, subfield $3 is used to specify the portion(s) to which the field applies. The display constant Electronic resource: may be generated.

1 - Version of resource
Location in field 856 is for the same resource described by the record as a whole. In this case, the item represented by the bibliographic record is not electronic but an electronic version is available. If the data in field 856 relates to a constituent unit of the resource represented by the record, subfield $3 is used to specify the portion(s) to which the field applies. The display constant Electronic version: may be generated.

2 - Related resource
Location in field 856 is for an electronic resource that is related to the bibliographic item described by the record. In this case, the item represented by the bibliographic record is not the electronic resource itself. Subfield $3 can be used to further characterize the relationship between the electronic item identified in field 856 and the item represented by the bibliographic record as a whole. The display constant Related electronic resource: may be generated.

8 - No display constant generated

Of course, subfield $3 values are not any kind of controlled vocabulary, so it's hard to do much with them programmatically.

-- Michael

[1] From: http://www.loc.gov/marc/holdings/hd856.html

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# do...@uta.edu
# http://rocky.uta.edu/doran/

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Mike Taylor
Sent: Wednesday, July 07, 2010 5:42 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] schema for some web page

[snip]
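To make the second-indicator values quoted above concrete, here is a minimal sketch that reads 856 fields out of a MARCXML record with Python's standard library and labels each link with its indicator meaning. The function name `links_856` and the sample record are invented for illustration, and the sketch keeps only the first occurrence of each subfield code, which a real catalog would probably not want to assume.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

# Second-indicator labels as listed in the LC documentation quoted above.
# A blank indicator (shown as "#" in the documentation) means no information.
RELATIONSHIP = {
    "0": "Resource",
    "1": "Version of resource",
    "2": "Related resource",
    "8": "No display constant generated",
}


def links_856(marcxml):
    """Yield (relationship label, $3 text or None, $u URL) per 856 field."""
    root = ET.fromstring(marcxml)
    for field in root.iter(f"{{{MARC_NS}}}datafield"):
        if field.get("tag") != "856":
            continue
        label = RELATIONSHIP.get(field.get("ind2", " "), "No information provided")
        subfields = {}
        for sf in field.findall(f"{{{MARC_NS}}}subfield"):
            # Simplification: keep only the first value for each subfield code.
            subfields.setdefault(sf.get("code"), sf.text)
        if "u" in subfields:
            yield label, subfields.get("3"), subfields["u"]


# Hypothetical record, for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="856" ind1="4" ind2="2">
    <subfield code="3">Table of contents</subfield>
    <subfield code="u">http://example.org/toc/123.html</subfield>
  </datafield>
</record>"""

if __name__ == "__main__":
    for link in links_856(SAMPLE):
        print(link)
```

The indicator gives a coarse, controlled signal; the free-text $3 is carried along untouched, since it is uncontrolled.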
Re: [CODE4LIB] schema for some web page
On Wed, Jul 7, 2010 at 7:00 PM, Doran, Michael D do...@uta.edu wrote:
> Of course, subfield $3 values are not any kind of controlled vocabulary, so it's hard to do much with them programmatically.

A few years ago I analyzed the subfield 3 values in the Library of Congress data up at the Internet Archive [1]. Of course it's really simple to extract, but I just pushed it up to GitHub, mainly to share the results [2].

I extracted all the subfield 3 values from the 12M? records, and then counted them up to see how often they repeated [3]. As you can see it's hardly controlled, but it might be worthwhile coming up with some simple heuristics and properties for the familiar ones: you could imagine dcterms:description being used for Publisher description, etc.

Of course the $3 in your catalog data might be different from LC's, but maybe we could come up with a list of common ones on a wiki somewhere, and publish a little vocabulary that covered the important relations?

//Ed

[1] http://www.archive.org/details/marc_records_scriblio_net
[2] http://github.com/edsu/beat
[3] http://github.com/edsu/beat/raw/master/types.txt
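The counting step described above can be sketched in a few lines. This is not the code in the GitHub repository mentioned in the message; it is a stand-alone illustration that assumes the $3 strings have already been extracted into a Python iterable, and the name `count_subfield3` is made up.

```python
from collections import Counter


def count_subfield3(values):
    """Count how often each $3 string occurs, most frequent first.

    Empty and whitespace-only values are skipped; surrounding whitespace
    is trimmed, but case and punctuation are left exactly as found.
    """
    counts = Counter(v.strip() for v in values if v and v.strip())
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical input, for illustration only.
    sample = [
        "Table of contents",
        "Publisher description",
        "Table of contents",
        " ",
    ]
    for value, n in count_subfield3(sample):
        print(n, value)
```

Leaving case and punctuation untouched is deliberate here: the point of such a tally is to see how messy the raw values are before deciding on any heuristics.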
Re: [CODE4LIB] schema for some web page
And one more (tiny, compared to edsu's) data point. You can see the $3 values from over 10,000 records that had 856 fields from an original 1 million records from the UC Berkeley catalog here:

http://roytennant.com/proto/856/?string=%243

in all of its, uh, gory detail.

But I agree that there is some low hanging fruit here. It wouldn't take a rocket scientist (heck, even I can figure this out) to do a case insensitive string match on "table of contents", for example.

But Michael's point still stands -- this is an uncontrolled field, so it can get messy pretty quickly. In the end, I think if we focus on the 20 percent that we can do something useful with we might just get an 80 percent return. After all, in Ed's list, taking the first half-a-dozen items and variations on PDF would cover probably 99% of the cases.

Roy

On Wed, Jul 7, 2010 at 9:28 PM, Ed Summers e...@pobox.com wrote:
[snip]
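The case-insensitive matching idea above might look something like the following sketch. The mapping is purely illustrative: pairing "publisher description" with dcterms:description follows the suggestion earlier in the thread, while pairing "table of contents" with dcterms:tableOfContents and falling back to dcterms:relation are assumptions made here, not a vocabulary anyone in the thread agreed on. The names `HEURISTICS`, `FALLBACK`, and `property_for` are hypothetical.

```python
# Illustrative mapping only: lowercase phrase -> candidate property name.
HEURISTICS = [
    ("table of contents", "dcterms:tableOfContents"),  # assumption
    ("publisher description", "dcterms:description"),  # suggested in the thread
]

# Assumed fallback when nothing matches: the generic "related resource".
FALLBACK = "dcterms:relation"


def property_for(subfield3):
    """Pick a property for an 856 link from its free-text $3 value.

    Matching is a case-insensitive substring test, so values such as
    "Table of Contents only" still match "table of contents".
    """
    text = (subfield3 or "").casefold()
    for phrase, prop in HEURISTICS:
        if phrase in text:
            return prop
    return FALLBACK


if __name__ == "__main__":
    for value in ["Table of Contents only", "PUBLISHER DESCRIPTION", None]:
        print(repr(value), "->", property_for(value))
```

Substring matching trades precision for coverage, which fits the 20/80 argument above; anything unrecognized simply stays a generic related link.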