[CODE4LIB] NYPL Drupal Camp

2010-07-07 Thread Michelle Misner
Please excuse cross-postings. This message is being posted to multiple
lists.

Please join us for the first-ever NYPL Drupal Camp!

In January 2010 the New York Public Library unveiled a soup-to-nuts
re-engineering of its website, moving 15 years of digital sprawl into a
modern, open source content management system: Drupal. Drupal not only
enables us to better organize and interrelate our web content, it also lets
us turn over daily control of various local website areas to the staff who
know them best. Six months into this new era, staff from across the Library
are learning how to update their own location information and events
calendars, and are experimenting with exciting new tools like blogs, audio
and video to build a dynamic new digital experience for our patrons.

Now, with half a year under our belts, we’d like to share our experiences
with other libraries in the first-ever NYPL Drupal Camp. Whether you’re
considering adopting Drupal for your website, or are already an old hand,
this will be a great opportunity to learn directly from NYPL digital staff,
to share your own insights, and perhaps lay groundwork for collaboration.

This event will take place over two days: Thursday August 26 and Friday
August 27, from 9 am to 5 pm.

The first day will be a series of presentations by NYPL staff with plenty of
opportunities for questions and answers.
Topics for the first day include (subject to change):
• Vision
• Content development
• Staff training
• Project management/staffing
• Information architecture
• User testing process
• Policy
• Infrastructure
• Content migration
• Development
• Vendors/IT relationships

The second day will be in an un-conference format, during which attendees set
the schedule of sessions, some of which can be all-day code sprints.

The workshop is free. Coffee will be provided, along with a listing of nearby
lunch options.

WHAT: NYPL Drupal Camp

WHERE: Science, Industry and Business Library
188 Madison Avenue, Lower Level, Room 018
New York, NY 10016

WHEN: Thursday August 26 and Friday August 27
9 am to 5 pm.

COST: Free

HOW: http://nypldrupalcamp.eventbrite.com/

LIMIT: Please note that registration is limited to 2 participants per
organization. We recommend that one project/content manager and one
technical staff member attend NYPL Drupal Camp.

Hope to see you there!


Re: [CODE4LIB] NYPL Drupal Camp

2010-07-07 Thread Cary Gordon
I would definitely fly (drive, bike, walk...) from LA for this if it
didn't conflict with DrupalCon CPH ;(

Cary


-- 
Cary Gordon
The Cherry Hill Company
http://chillco.com


[CODE4LIB] schema for some web page

2010-07-07 Thread Jonathan Rochkind
So in our MARC records, we have these 856 links, the meaning of which is
basically "some web page related to the entity at hand". You don't
really know the relation; the granularity is not there.


So, fine, data is data, there ought to be some way to model this in 
standard XML/RDF/DC/whatever, right?


It's not dc:identifier, because dc:identifier ends up including all
sorts of URIs that are not really web pages at all; they are just
identifiers of various kinds. The MARC 856s are URIs, it's true, but
they really _aren't_ URIs given as identifiers: they do not
necessarily identify the item at hand at all, but they DO necessarily
lead to a web page with some "see also" relationship to the entity at hand.


So... how would you include this in, say, a DC set in XML or RDF?  Is 
there any common way people have done this in the past?


Yeah, I _could_ just expose MODS or MARCXML or what have you. But I'm
looking for some vocabulary that will handle MARC 856s, but also in the
future handle some other kind of "see also" link from other formats,
when I add other formats into my corpus. Any ideas?


Jonathan


Re: [CODE4LIB] schema for some web page

2010-07-07 Thread Mike Taylor
Isn't that pretty much what dc:relation is for?  From
http://dublincore.org/documents/dcmi-terms/#elements-relation

Label:      Relation
Definition: A related resource.
Comment:    Recommended best practice is to identify the related resource
            by means of a string conforming to a formal identification system.
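
As a rough illustration of the dc:relation approach (a sketch, not anything
posted in the thread): assuming the 856 $u URLs have already been pulled out
of the MARC record, Python's standard library can emit one dc:relation
element per URL. The wrapper element name "record", the function name and the
sample URLs are all made up for the example; only the dc: namespace URI is
the real one.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15 "simple" Dublin Core elements, conventionally bound to dc:
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def relations_for(urls):
    """Build a <record> element holding one dc:relation per 856 $u URL."""
    record = ET.Element("record")  # hypothetical wrapper element
    for url in urls:
        rel = ET.SubElement(record, f"{{{DC_NS}}}relation")
        rel.text = url
    return record


# Made-up URLs standing in for the $u subfields of two 856 fields.
sample = ["http://example.org/toc.html", "http://example.org/description.html"]
xml_text = ET.tostring(relations_for(sample), encoding="unicode")
print(xml_text)
```

Note that this keeps only the link itself; whatever the 856 indicators or $3
said about the nature of the relationship is dropped, which is the "blunt
instrument" trade-off of a bare dc:relation.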




Re: [CODE4LIB] schema for some web page

2010-07-07 Thread Diane I. Hillmann

Mike:

For sure dc:relation works, and has some subproperties that are a bit more
specific, but it's still pretty much a blunt instrument.  I know I sound
like a broken record, but RDA has a LOT of relationships to choose
from--these are the WEMI-to-WEMI relationships:
http://metadataregistry.org/schemaprop/list/schema_id/13.html


There are also: RDA Relationships for Persons, Corporate Bodies, 
Families: http://metadataregistry.org/schemaprop/list/schema_id/22.html
and RDA Relationships for Concepts, Events, Objects, Places: 
http://metadataregistry.org/schemaprop/list/schema_id/23.html


Diane



Re: [CODE4LIB] schema for some web page

2010-07-07 Thread Doran, Michael D
Hi Jonathan,

> So in our marc records, we have these 856 links, the meaning of which is
> basically some web page related to the entity at hand. You don't
> really know the relation, the granularity is not there.

There is some *minimal* indication of the relationship via the second indicator 
of the 856 (and subfield $3, for a related resource) [1]:

  Second Indicator - Relationship

    Relationship between the electronic resource at the location specified
    in field 856 and the item described in the record as a whole.

    Used to provide further information about the relationship if it is not
    a one-to-one relationship.

    # - No information provided

    0 - Resource
        Electronic location in field 856 is for the same resource described
        by the record as a whole. In this case, the item represented by the
        bibliographic record is an electronic resource. If the data in field
        856 relates to a constituent unit of the resource represented by the
        record, subfield $3 is used to specify the portion(s) to which the
        field applies. The display constant "Electronic resource:" may be
        generated.

    1 - Version of resource
        Location in field 856 is for the same resource described by the
        record as a whole. In this case, the item represented by the
        bibliographic record is not electronic but an electronic version is
        available. If the data in field 856 relates to a constituent unit of
        the resource represented by the record, subfield $3 is used to
        specify the portion(s) to which the field applies. The display
        constant "Electronic version:" may be generated.

    2 - Related resource
        Location in field 856 is for an electronic resource that is related
        to the bibliographic item described by the record. In this case, the
        item represented by the bibliographic record is not the electronic
        resource itself. Subfield $3 can be used to further characterize the
        relationship between the electronic item identified in field 856 and
        the item represented by the bibliographic record as a whole. The
        display constant "Related electronic resource:" may be generated.

    8 - No display constant generated

Of course, subfield $3 values are not any kind of controlled vocabulary, so 
it's hard to do much with them programmatically. 
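
To make the indicator values above a little more concrete, here is a small
Python sketch that turns a second indicator (plus an optional $3) into a
relationship label and display text. The function name, the dict layout and
the "Unknown" fallback are assumptions made for illustration; the labels and
display constants are only the ones quoted in the excerpt above.

```python
# Second-indicator values of field 856, with the display constants quoted above.
# A blank indicator is written "#" in the documentation.
RELATIONSHIP = {
    " ": ("No information provided", None),
    "0": ("Resource", "Electronic resource:"),
    "1": ("Version of resource", "Electronic version:"),
    "2": ("Related resource", "Related electronic resource:"),
    "8": ("No display constant generated", None),
}


def describe_856(indicator2, subfield_3=None):
    """Return (relationship label, display text) for one 856 field.

    `indicator2` is the second indicator as a one-character string ("#" and
    blank are treated alike); `subfield_3` is the free-text $3 value, if any.
    """
    key = " " if indicator2 in ("#", " ", "") else indicator2
    label, constant = RELATIONSHIP.get(key, ("Unknown", None))
    # Display text is the constant (if any) followed by the $3 text (if any).
    parts = [p for p in (constant, subfield_3) if p]
    return label, " ".join(parts)


print(describe_856("2", "Table of contents"))
```

The label is coarse (five values at most), which is why the free-text $3 ends
up carrying most of the useful detail.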

-- Michael

[1] From: http://www.loc.gov/marc/holdings/hd856.html

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# do...@uta.edu
# http://rocky.uta.edu/doran/
 



Re: [CODE4LIB] schema for some web page

2010-07-07 Thread Ed Summers
On Wed, Jul 7, 2010 at 7:00 PM, Doran, Michael D do...@uta.edu wrote:
> Of course, subfield $3 values are not any kind of controlled vocabulary, so
> it's hard to do much with them programmatically.

A few years ago I analyzed the subfield 3 values in the Library of
Congress data up at the Internet Archive [1]. Of course it's really
simple to extract, but I just pushed it up to GitHub, mainly to share
the results [2].

I extracted all the subfield 3 values from the 12M? records, and then
counted them up to see how often they repeated [3]. As you can see
it's hardly controlled, but it might be worthwhile coming up with some
simple heuristics and properties for the familiar ones: you could
imagine dcterms:description being used for "Publisher description",
etc.
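
The counting step is simple enough to sketch in a few lines of Python. This
is not the code in the repository linked below, just an illustration under
stated assumptions: each 856 field is represented as a plain dict mapping
subfield code to value (a stand-in for whatever a real MARC parser returns),
and the sample data is invented.

```python
from collections import Counter


def count_subfield_3(fields):
    """Count how often each $3 value occurs across a set of 856 fields.

    `fields` is an iterable of dicts mapping subfield code to value, a
    simplified stand-in for parsed MARC 856 fields.
    """
    counts = Counter()
    for subfields in fields:
        value = subfields.get("3")
        if value:
            counts[value.strip()] += 1
    return counts


# Invented sample data.
sample_fields = [
    {"3": "Table of contents", "u": "http://example.org/a"},
    {"3": "Publisher description", "u": "http://example.org/b"},
    {"3": "Table of contents", "u": "http://example.org/c"},
    {"u": "http://example.org/d"},  # an 856 with no $3 at all
]
for value, n in count_subfield_3(sample_fields).most_common():
    print(n, value)
```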

Of course the $3 in your catalog data might be different from LC's, but
maybe we could come up with a list of common ones on a wiki somewhere,
and publish a little vocabulary that covered the important relations?

//Ed

[1] http://www.archive.org/details/marc_records_scriblio_net
[2] http://github.com/edsu/beat
[3] http://github.com/edsu/beat/raw/master/types.txt


Re: [CODE4LIB] schema for some web page

2010-07-07 Thread Roy Tennant
And one more (tiny, compared to edsu's) data point. You can see the $3
values from over 10,000 records that had 856 fields from an original 1
million records from the UC Berkeley catalog here:

http://roytennant.com/proto/856/?string=%243

in all of its, uh, gory detail. But I agree that there is some low-hanging
fruit here. It wouldn't take a rocket scientist (heck, even I can figure
this out) to do a case-insensitive string match on "table of contents", for
example. But Michael's point still stands -- this is an uncontrolled field,
so it can get messy pretty quickly. In the end, I think if we focus on the
20 percent that we can do something useful with we might just get an 80
percent return. After all, in Ed's list, taking the first half-a-dozen items
and variations on PDF would cover probably 99% of the cases.
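
The kind of case-insensitive matching described above might look roughly like
this in Python. It is only a sketch: the canonical labels and the regular
expressions are guesses at common variants, not a vetted list drawn from
either data set.

```python
import re

# Hypothetical canonical labels; the patterns are guesses at common variants.
PATTERNS = [
    ("table of contents", re.compile(r"table of contents", re.IGNORECASE)),
    ("publisher description", re.compile(r"publisher('s)? description", re.IGNORECASE)),
    ("pdf", re.compile(r"\bpdf\b", re.IGNORECASE)),
]


def normalize_subfield_3(value):
    """Return a canonical label for a free-text $3 string, or None if nothing matches."""
    for label, pattern in PATTERNS:
        if pattern.search(value):
            return label
    return None


for raw in ["TABLE OF CONTENTS only", "Full text (PDF)", "Finding aid"]:
    print(repr(raw), "->", normalize_subfield_3(raw))
```

Anything that returns None falls into the messy remainder and would still
need a human look, or simply a generic "related resource" treatment.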
Roy
