Re: [CODE4LIB] Loris

2013-11-08 Thread Ed Summers
Hey, that's great. This work would make a great blog post/article I think.

 On Nov 8, 2013, at 5:13 PM, Jon Stroop jstr...@princeton.edu wrote:
 
 Whoops, wait.
 Chris Thatcher did the original work to add support for IIIF 1.0 to OSD.
 Then I made some changes and added support for 1.1. Credit
 where credit is due.
 -Js
 
 On 11/08/2013 04:40 PM, Jon Stroop wrote:
 Ed,
 
 I added support for IIIF syntax to OpenSeadragon:
 
 https://github.com/openseadragon/openseadragon/blob/master/src/iiif1_1tilesource.js
 
 so it just works. Not sure if Ian has cut a release recently, but it's on 
 the master branch anyway.
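To make the "just works" part concrete, here is a small sketch of the URL syntax that a IIIF Image API 1.1 tile source generates when it requests tiles; the base URL and identifier below are made up for illustration, not taken from the thread.

```python
# Sketch of IIIF Image API 1.1 URL construction -- the syntax a IIIF 1.1
# tile source produces when asking the server for tiles. The base URL and
# identifier here are hypothetical.

def iiif_image_url(base, identifier, region="full", size="full",
                   rotation=0, quality="native", fmt="jpg"):
    """Build a {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format} URL."""
    return "%s/%s/%s/%s/%s/%s.%s" % (
        base, identifier, region, size, rotation, quality, fmt)

# A full-size image:
print(iiif_image_url("http://example.org/loris", "page1"))
# One 256-pixel-wide tile from the region starting at (512, 512):
print(iiif_image_url("http://example.org/loris", "page1",
                     region="512,512,256,256", size="256,"))
```

A viewer like OpenSeadragon only has to fill in the region and size parameters per tile; everything else comes from the image's info document.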
 
 -Js
 
 On 11/08/2013 04:00 PM, Edward Summers wrote:
 On Nov 8, 2013, at 3:05 PM, Jon Stroop jstr...@princeton.edu  wrote:
 And here's a sample of the server backing
 OpenSeadragon [2]: http://goo.gl/Gks6lR
 Thanks for sharing that Jon. Did you have to do much to get OpenSeadragon 
 to talk iiif?
 
 //Ed
 


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ed Summers
On Wed, Nov 6, 2013 at 3:47 AM, Ben Companjen
ben.compan...@dans.knaw.nl wrote:
 The URIs you gave get me to webpages *about* the Declaration of
 Independence. I'm sure it's just a copy/paste mistake, but in this context
 you want the exact right URIs of course. And by "better" I guess you meant
 probably more widely used and probably longer lasting? :)

 LOC URI for the DoI (the work) is without .html:
 http://id.loc.gov/authorities/names/n79029194

 VIAF URI for the DoI is without trailing /:
 http://viaf.org/viaf/179420344

Thanks for that Ben. IMHO it's (yet another) illustration of why the
W3C's approach to educating the world about URIs for real world things
hasn't quite caught on, while RESTful ones (promoted by the IETF)
have. If someone as knowledgeable as Karen can do that, what does it
say about our ability as practitioners to use URIs this way, and about
our ability to write software that does it as well?

In a REST world, when you get a 200 OK it doesn't mean the resource is
a Web Document. The resource can be anything, you just happened to
successfully get a representation of it. If you like you can provide
hints about the nature of the resource in the representation, but the
resource itself never goes over the wire, the representation does.
It's a subtle but important difference in two ways of looking at Web
architecture.
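A toy sketch may help make that distinction concrete: one resource, several representations, and only a representation ever goes over the wire. Everything here (the data, the function, the media types served) is invented for the example, not part of any real service.

```python
# Toy illustration of the REST distinction: the resource (here, the
# Declaration of Independence, a real-world thing) never goes over the
# wire -- only a representation of it does. All names are made up.

REPRESENTATIONS = {
    # one resource, several representations keyed by media type
    "text/html": "<h1>Declaration of Independence</h1>",
    "application/json": '{"name": "Declaration of Independence"}',
}

def get(resource_uri, accept="text/html"):
    """Return (status, content_type, body) -- a representation, never the resource."""
    body = REPRESENTATIONS.get(accept)
    if body is None:
        return 406, None, None   # no acceptable representation available
    return 200, accept, body     # 200 OK: a representation was sent

status, ctype, body = get("http://example.org/doi", accept="application/json")
print(status, ctype)
```

The 200 says the GET succeeded, not that the resource is a web document; the resource identified by the URI stays abstract.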

If you find yourself interested in making up your own mind about this
you can find the RESTful definitions of resource and representation in
the IETF HTTP RFCs, most recently as of a few weeks ago in draft [1].
You can find language about Web Documents (or at least its more recent
variant, Information Resource) in the W3C's Architecture of the World
Wide Web [2].

Obviously I'm biased towards the IETF's position on this. This is just
my personal opinion from my experience as a Web developer trying to
explain Linked Data to practitioners, looking at the Web we have, and
chatting with good friends who weren't afraid to tell me what they
thought.

//Ed

[1] http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-24#page-7
[2] http://www.w3.org/TR/webarch/#id-resources


Re: [CODE4LIB] more suggestions for code4lib.org

2013-11-06 Thread Ed Summers
On Mon, Nov 4, 2013 at 11:31 PM, Kevin Hawkins
kevin.s.hawk...@ultraslavonic.info wrote:
 b) Modify whatever code sends formatted job postings to this list so that it
 includes the location of the position.

That would be shortimer, and I think it should be doing what you suggest now?


https://github.com/code4lib/shortimer/commit/acb57090d4842920c9f92c684810f3c618f0a21e

If not let me know, create a github issue, or send a pull request :-)

//Ed


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Ed Summers
On Sun, Nov 3, 2013 at 3:45 PM, Eric Lease Morgan emor...@nd.edu wrote:
 This is hard. The Semantic Web (and RDF) is an attempt at codifying knowledge using
 a strict syntax, specifically a strict syntax of triples. It is very
 difficult for humans to articulate knowledge, let alone codify it. How
 realistic is the idea of the Semantic Web? I wonder this not because I don't
 think the technology can handle the problem. I say this because I think
 people can't (or can only with great difficulty) succinctly articulate knowledge. Or
 maybe knowledge does not fit into triples?

I think you're right Eric. I don't think knowledge can be encoded
completely in triples, any more than it can be encoded completely in
finding aids or books.

One thing that I (naively) wasn't fully aware of when I started
dabbling in the Semantic Web and Linked Data is how much the technology
is entangled with debates about the philosophy of language. These
debates play out in a variety of ways, but most notably in
disagreements about the nature of a resource (httpRange-14) in Web
Architecture. Shameless plug: Dorothea Salo and I tried to write about
how some of this impacts the domain of the library/archive [1].

One of the strengths of RDF is its notion of a data model that is
behind the various serializations (xml, ntriples, json, n3, turtle,
etc). I'm with Ross though: I find it much easier to read RDF as Turtle
or JSON-LD than as RDF/XML.
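To make the serialization point concrete, here is the same single statement hand-written in Turtle and in RDF/XML; parsing the XML with the standard library shows both carry the identical subject, predicate, and object. The book URI and title are invented for the example (no RDF library is assumed).

```python
# The same triple hand-written in two RDF serializations. Parsing the
# RDF/XML with the standard library confirms both say the same thing --
# only the syntax differs. The example data is made up.
import xml.etree.ElementTree as ET

turtle = """
@prefix dcterms: <http://purl.org/dc/terms/> .
<http://example.org/book> dcterms:title "Moby Dick" .
"""
print(turtle.strip())  # the Turtle form, shown for comparison

rdf_xml = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.org/book">
    <dcterms:title>Moby Dick</dcterms:title>
  </rdf:Description>
</rdf:RDF>
"""

root = ET.fromstring(rdf_xml)
desc = root.find("{http://www.w3.org/1999/02/22-rdf-syntax-ns#}Description")
subject = desc.get("{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about")
title = desc.find("{http://purl.org/dc/terms/}title").text
print(subject, title)  # the same statement the Turtle expresses
```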

//Ed

[1] http://arxiv.org/abs/1302.4591


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Ed Summers
On Tue, Nov 5, 2013 at 10:07 AM, Karen Coyle li...@kcoyle.net wrote:
 I have suggested (repeatedly) to LC on the BIBFRAME list that they should
 use turtle rather than RDF/XML in their examples -- because I suspect that
 they may be doing some XML-think in the background. This seems to be the
 case because in some of the BIBFRAME documents the examples are in XML but
 not RDF/XML. I find this rather ... disappointing.

I think you'll find that many people and organizations are much more
familiar with xml and its data model than they are with rdf. Sometimes
when people with a strong background in xml come to rdf they naturally
want to keep thinking in terms of xml. This is possible up to a point,
but it eventually hampers understanding.

//Ed


Re: [CODE4LIB] Python and Ruby

2013-07-29 Thread Ed Summers
On Mon, Jul 29, 2013 at 12:57 PM, Peter Schlumpf
pschlu...@earthlink.net wrote:
 Imagine if the library community had its own programming/scripting language, 
 at least one that is domain relevant.  What would it look like?

Ok, I think I'm going to have nightmares about that.

//Ed


Re: [CODE4LIB] Python and Ruby

2013-07-29 Thread Ed Summers
On Mon, Jul 29, 2013 at 1:11 PM, Ross Singer rossfsin...@gmail.com wrote:
 Over the NISO standardization process required to form the exploratory
 committee.

Thanks for answering the question better than I could have ever
dreamed of answering it.

//Ed


Re: [CODE4LIB] Wordpress: Any way to selectively control caching for content areas on a page?

2013-05-29 Thread Ed Summers
If your Wordpress happens to be fronted by Varnish you might get some
mileage out of using Edge Side Includes (ESI)


https://www.varnish-software.com/static/book/Content_Composition.html#edge-side-includes

If you google for Edge Side Includes and Wordpress you'll find
articles like this one describing how ESIs were used with Wordpress:

http://timbroder.com/2012/12/getting-started-with-varnish-edge-side-includes-and-wordpress.html

So, it might be do-able.
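A minimal sketch of the idea, assuming a Varnish 3-era setup: the surrounding page stays in the cache, while the include below is resolved on every request (given ESI processing is enabled for the page in the VCL, e.g. something like `set beresp.do_esi = true;` in `vcl_fetch`). The fragment URL is hypothetical.

```html
<!-- Hypothetical Wordpress template output: Varnish caches the page,
     but fetches this fragment fresh on each request when ESI processing
     is turned on for it in the VCL. -->
<div id="hours">
  <esi:include src="/fragments/hours-of-operation/" />
</div>
```

The hours-of-operation fragment would be served by Wordpress with a short (or no) TTL, while the menu and map stay in the long-lived cached page.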

//Ed

On Tue, May 28, 2013 at 5:30 PM, Wilhelmina Randtke rand...@gmail.com wrote:
 In a Wordpress site, is there a way to allow site-wide caching, but force
 certain areas of a page to reload on each visit?

 For example, if  on a specific page there is a huge navigational menu that
 never changes, a map that rarely changes, and hours of operation which
 change frequently (as often as holidays), is there a way to force only the
 hours of operation to reload when a person revisits the page?

 -Wilhelmina Randtke


[CODE4LIB] GLAM Wiki Google Hangout (Today, 12:00PM EDT)

2013-05-03 Thread Ed Summers
Some folks interested in the role of Wikipedia in Galleries,
Libraries, Archives and Museums are doing a Google Hangout today at
Noon (EDT).

http://en.wikipedia.org/wiki/Wikipedia:GLAM/GLAMout

Today's anchor topic is the work that OCLC has been doing in adding
authority data from VIAF to Wikipedia and Wikidata. But there will be
space for discussing other things.

//Ed


Re: [CODE4LIB] GLAM Wiki Google Hangout (Today, 12:00PM EDT)

2013-05-03 Thread Ed Summers
Sorry, as the page indicates, the hangout time is 12 PDT ... not EDT.

//Ed

On Fri, May 3, 2013 at 9:04 AM, Ed Summers e...@pobox.com wrote:
 Some folks interested in the role of Wikipedia in Galleries,
 Libraries, Archives and Museums are doing a Google Hangout today at
 Noon (EDT).

 http://en.wikipedia.org/wiki/Wikipedia:GLAM/GLAMout

 Today's anchor topic is the work that OCLC has been doing in adding
 authority data from VIAF to Wikipedia and Wikidata. But there will be
 space for discussing other things.

 //Ed


Re: [CODE4LIB] Job: Digital Library Application Developer at Princeton University

2013-03-13 Thread Ed Summers
Hi Bill,

There actually is a bit of manual curation that goes on behind the scenes.

shortimer (the app at jobs.code4lib.org) subscribes to the code4lib
discussion list looking for emails that have "job" in the title. It
also subscribes to the atom/rss feeds of 5 or 6 relevant job sites.
When it finds a job at any of these places it puts it in a queue [1]
where it waits for some logged in user to come along and:

* decide if it's appropriate for code4lib (not all are)
* make sure it hasn't already been posted recently (a duplicate)
* assign an employer, location, and any tags (using Freebase entities
behind the scenes)
* clean up any formatting issues
* click "publish", which pushes it out to the Web, Twitter and here
(the discussion list) if it didn't originate from here
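The intake step described above can be sketched roughly like this; this is not shortimer's actual code, just an illustration of the filter-then-queue flow, with invented subject lines.

```python
# Rough sketch (not shortimer's actual code) of the intake step: emails
# whose subject mentions "job" and that haven't been seen before get
# queued for a human curator to review, tag, and publish.

def needs_curation(subject, already_posted):
    """True if the message looks like a job posting we haven't handled."""
    if "job" not in subject.lower():
        return False                       # not a job posting
    if subject.strip() in already_posted:  # crude duplicate check
        return False
    return True

queue = []
seen = {"Job: Metadata Librarian at Example University"}
for subj in ["Job: Digital Archivist at Example College",
             "Job: Metadata Librarian at Example University",
             "Re: lunch plans"]:
    if needs_curation(subj, seen):
        queue.append(subj)  # a logged-in curator de-dupes, tags, publishes
print(queue)
```

The real duplicate check is a human judgment call (as the Princeton/Princeton Theological mix-up elsewhere in this thread shows), which is why the queue feeds curators rather than publishing automatically.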

Your question made me curious to see how many edits have been made by
curators so far: 10,451. This isn't bad considering the site has only
been in operation for a year (28 edits/day) and is operating on the
kindness of strangers. You can see which users have edited jobs on the
job view pages [2]. Mark Matienzo and Jodi Schneider deserve a special
thanks for their work curating job postings.

I hope this takes some of the mystery out of jobs.code4lib.org.
Patches to the about page [3] and elsewhere are (of course) welcome
:-)

//Ed

[1] http://jobs.code4lib.org/curate/
[2] http://jobs.code4lib.org/users/
[3] https://github.com/code4lib/shortimer/blob/master/jobs/templates/about.html

On Mon, Mar 11, 2013 at 10:05 PM, William Denton w...@pobox.com wrote:
 On 11 March 2013, Ed Summers wrote:

 Apologies for this duplicate...I leaned too heavily on the new "recent
 jobs from this employer" feature, which didn't alert me to the duplicate since
 it was posted under "Princeton Theological Seminary" and I put it under
 "Princeton University".


 Ed, does this amazing jobs site require your hand on the dial?  I thought
 you'd coded it all into magic and it just worked.

 Bill
 --
 William Denton
 Toronto, Canada
 http://www.miskatonic.org/


Re: [CODE4LIB] Job: Digital Library Application Developer at Princeton University

2013-03-11 Thread Ed Summers
Apologies for this duplicate...I leaned too heavily on the new "recent
jobs from this employer" feature, which didn't alert me to the duplicate since
it was posted under "Princeton Theological Seminary" and I put it under
"Princeton University".

//Ed

On Mon, Mar 11, 2013 at 3:24 PM,  j...@code4lib.org wrote:
 Princeton Theological Seminary Library seeks a Digital Library Application
 Developer. Reporting to the Digital Initiatives Librarian, this position works
 with a small, collaborative team of librarians and technologists to design,
 develop, and test web applications for searching and displaying the Library's
 digital resources.


 Responsibilities:

   * Works collaboratively with the Digital Initiatives team to design, 
 develop, and test web applications using Agile practices.
   * Writes and refactors XQuery, HTML, CSS, and JavaScript code for new and 
 existing web applications built on native XML databases (MarkLogic Server).
   * Tests web applications in multiple browsers on multiple platforms; 
 identifies, tracks, and resolves bugs.
 Qualifications:

   * Bachelor's degree or equivalent combination of education and professional 
 experience.
   * Experience developing web applications using one or more established 
 programming languages/frameworks. MVC experience preferred.
   * Experience programmatically processing XML documents. Experience with 
 XQuery or XSLT preferred. Experience with native XML databases preferred.
   * Experience with tools or frameworks for automated testing of web 
 applications preferred.
   * Enthusiasm for learning and applying new technologies.
 Princeton Theological Seminary is an equal opportunity employer. For details,
 and for information on how to apply, please see
 http://www.ptsem.edu/index.aspx?id=1260



 Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6746/


Re: [CODE4LIB] jobs.code4lib.org and job locations

2013-02-24 Thread Ed Summers
Hi Péter,

I probably didn't explain myself very well at 2am :-) The Google Map
was just meant as a demonstration of the underlying geo coordinates in
the job data (exposed as georss in the Atom feed). Eventually I would
like to get a map working on jobs.code4lib.org using LeafletJS [1] or
some other toolkit, where the map can be more customized, and display
more results. The feed is paged, so there's only so much that can be
displayed currently. It might also be interesting to see a map for the
last year of posts, to see general trends in hiring on a map. And I'd
like to get location specific feeds set up for people who still use
feed readers to keep up with things. They do still exist, don't they?

That being said, please share whatever you come up with!

//Ed

[1] http://leafletjs.com/

On Sun, Feb 24, 2013 at 8:39 AM, Péter Király kirun...@gmail.com wrote:
 Hi Ed,

 thank you for your work, it is a very nice job! I have one comment: some
 job descriptions are too lengthy to fit in one screen, so I have
 to move the map down to see the top of the description. After I
 close the window the map doesn't jump back to the original viewport.
 There is a JS solution to this issue; I'll send it to you later.

 Thanks again!
 Péter

 2013/2/24 Ed Summers e...@pobox.com:
 If you happen to post jobs to code4lib.org you'll notice that you can
 now add a location for the job. In fact you are required to fill it in
 when posting.

 The location input field uses Freebase Suggest just like the employer
 and tag fields. When you select an employer the location will
 auto-populate with the employer's headquarters location, but you can
 change it if the job happens to be somewhere else...which does happen
 from time to time. I retroactively applied as many locations as I
 could using the employer.

 One nice side effect (other than seeing where the job is for in the
 UI) is having lat/lon geo-coordinates for the job. I haven't built any
 maps into the UI yet, but I did expose the coordinates in the Atom
 feed which lets you do this:

 https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/

 The small number of markers is because this is just the first page of
 the feed, e.g.

 https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/2/
 https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/3/
 https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/4/
 ...

 If someone has an interest in playing with LeafletJS or something to
 get some map views into jobs.code4lib.org proper that might be a fun
 experiment, if you have any spare time.

 Many thanks to Ted Lawless for the work to get this going, and also to
 Mark Matienzo for tirelessly assigning employers to the historic job
 postings. There are still a few kinks to work out (some historic
 postings that had addresses in non-standard places in the freebase
 data), but please feel free to file issue tickets on Github [1] if you
 notice anything odd.

 //Ed

 [1] https://github.com/code4lib/shortimer



 --
 Péter Király
 software developer

 Europeana - http://europeana.eu
 eXtensible Catalog - http://eXtensibleCatalog.org


Re: [CODE4LIB] jobs.code4lib.org and job locations

2013-02-24 Thread Ed Summers
Hi Gary,

Great idea, and it was easy to implement. For example you can now get
tag related feeds:

http://jobs.code4lib.org/feed/tag/digital-preservation/
http://jobs.code4lib.org/feed/tag/python/
http://jobs.code4lib.org/feed/tag/web-archiving/
http://jobs.code4lib.org/feed/tag/fedora-repository-architecture/
etc ...

Your feed reader should be able to pick up on the feed url, but a
click to the feed icon on the tag specific jobs pages will take you to
the feed if not. I also added the feed URLs as a column to the tag
page:

http://jobs.code4lib.org/tags/

It's kind of neat to see them on a map, e.g.


https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/tag/digital-preservation/

Thanks for the idea!

//Ed

On Sun, Feb 24, 2013 at 10:44 AM, Gary McGath develo...@mcgath.com wrote:
 Definitely! I'd be more interested in job-category feeds than location,
 though.

 On 2/24/13 10:32 AM, Ed Summers wrote:
 And I'd
 like to get location specific feeds set up for people who still use
 feed readers to keep up with things. They do still exist don't they?

 --
 Gary McGath, Professional Software Developer
 http://www.garymcgath.com


Re: [CODE4LIB] jobs.code4lib.org and job locations

2013-02-24 Thread Ed Summers
Chris, as you saw, Chad started to tinker with maps and
jobs.code4lib.org, but the more the merrier. Just fork the repo on
github and try out some things if you have the energy/interest. If you
want a snapshot of the MySQL database to play around with the full
dataset let me know privately and I'll get it to you.

//Ed

On Sun, Feb 24, 2013 at 3:35 PM, Chris Fitzpatrick
chrisfitz...@gmail.com wrote:
 hi,
 has anyone volunteered for the mapping feature? if not, I'd like to take a
 crack at it, as I want to get more practical django experience under
 my belt. and since this list has gotten me two jobs, I would love to give
 some payback. just don't want to duplicate any work someone else has
 started. b, chris.
 On 24 Feb 2013 20:08, Gary McGath develo...@mcgath.com wrote:

 It works very nicely with Sage, which is what I use to follow feeds.
 Thanks!

 On 2/24/13 1:45 PM, Ed Summers wrote:
  Hi Gary,
 
  Great idea, and it was easy to implement. For example you can now get
  tag related feeds:
 
  http://jobs.code4lib.org/feed/tag/digital-preservation/
  http://jobs.code4lib.org/feed/tag/python/
  http://jobs.code4lib.org/feed/tag/web-archiving/
  http://jobs.code4lib.org/feed/tag/fedora-repository-architecture/
  etc ...
 



 --
 Gary McGath, Professional Software Developer
 http://www.garymcgath.com



[CODE4LIB] jobs.code4lib.org and job locations

2013-02-23 Thread Ed Summers
If you happen to post jobs to code4lib.org you'll notice that you can
now add a location for the job. In fact you are required to fill it in
when posting.

The location input field uses Freebase Suggest just like the employer
and tag fields. When you select an employer the location will
auto-populate with the employer's headquarters location, but you can
change it if the job happens to be somewhere else...which does happen
from time to time. I retroactively applied as many locations as I
could using the employer.

One nice side effect (other than seeing where the job is for in the
UI) is having lat/lon geo-coordinates for the job. I haven't built any
maps into the UI yet, but I did expose the coordinates in the Atom
feed which lets you do this:

https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/

The small number of markers is because this is just the first page of
the feed, e.g.

https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/2/
https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/3/
https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/4/
...
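Pulling the coordinates back out of the feed is straightforward; here is a sketch using a made-up Atom snippet with the standard Atom and GeoRSS-Simple namespaces (the real feed's entries and coordinates will differ).

```python
# Sketch of extracting georss point coordinates from an Atom feed like
# the one at jobs.code4lib.org/feed/ -- the feed snippet here is invented,
# but uses the standard Atom and GeoRSS-Simple namespaces.
import xml.etree.ElementTree as ET

feed = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss">
  <entry>
    <title>Job: Digital Archivist at Example College</title>
    <georss:point>38.9072 -77.0369</georss:point>
  </entry>
</feed>
"""

ATOM = "{http://www.w3.org/2005/Atom}"
GEORSS = "{http://www.georss.org/georss}"

points = []
for entry in ET.fromstring(feed).findall(ATOM + "entry"):
    point = entry.find(GEORSS + "point")
    if point is not None:
        lat, lon = (float(n) for n in point.text.split())
        points.append((lat, lon))  # ready to hand to LeafletJS markers
print(points)
```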

If someone has an interest in playing with LeafletJS or something to
get some map views into jobs.code4lib.org proper that might be a fun
experiment, if you have any spare time.

Many thanks to Ted Lawless for the work to get this going, and also to
Mark Matienzo for tirelessly assigning employers to the historic job
postings. There are still a few kinks to work out (some historic
postings that had addresses in non-standard places in the freebase
data), but please feel free to file issue tickets on Github [1] if you
notice anything odd.

//Ed

[1] https://github.com/code4lib/shortimer


Re: [CODE4LIB] jobs.code4lib.org and job locations

2013-02-23 Thread Ed Summers
On Sun, Feb 24, 2013 at 2:14 AM, Ed Summers e...@pobox.com wrote:
 If you happen to post jobs to code4lib.org you'll notice that you can
 now add a location for the job. In fact you are required to fill it in
 when posting.

s/code4lib.org/jobs.code4lib.org/

That's what I get for writing email at 2am I guess...

//Ed

On Sun, Feb 24, 2013 at 2:14 AM, Ed Summers e...@pobox.com wrote:
 If you happen to post jobs to code4lib.org you'll notice that you can
 now add a location for the job. In fact you are required to fill it in
 when posting.

 The location input field uses Freebase Suggest just like the employer
 and tag fields. When you select an employer the location will
 auto-populate with the employer's headquarters location, but you can
 change it if the job happens to be somewhere else...which does happen
 from time to time. I retroactively applied as many locations as I
 could using the employer.

 One nice side effect (other than seeing where the job is for in the
 UI) is having lat/lon geo-coordinates for the job. I haven't built any
 maps into the UI yet, but I did expose the coordinates in the Atom
 feed which lets you do this:

 https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/

 The small number of markers is because this is just the first page of
 the feed, e.g.

 https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/2/
 https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/3/
 https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/4/
 ...

 If someone has an interest in playing with LeafletJS or something to
 get some map views into jobs.code4lib.org proper that might be a fun
 experiment, if you have any spare time.

 Many thanks to Ted Lawless for the work to get this going, and also to
 Mark Matienzo for tirelessly assigning employers to the historic job
 postings. There are still a few kinks to work out (some historic
 postings that had addresses in non-standard places in the freebase
 data), but please feel free to file issue tickets on Github [1] if you
 notice anything odd.

 //Ed

 [1] https://github.com/code4lib/shortimer


Re: [CODE4LIB] Job: Data Services Manager at Pennsylvania State University

2013-02-06 Thread Ed Summers
Sorry for the duplication on the recent CDL/UC3 jobs by the way. I saw
them pop up on the digital-curation list, got excited and posted them
on jobs.code4lib.org without seeing that Stephen already had. Oh well,
two for the price of one I guess, or is that 4 for the price of 2? [1]

Mea culpa,
//Ed

[1] except their free, so uhh, yeah...

On Wed, Feb 6, 2013 at 8:59 AM,  j...@code4lib.org wrote:
 Digital Library Technologies (DLT), a unit within Information Technology
 Services at Penn State University, is seeking a Data Services Manager to lead
 the development of new data services to support teaching, research and
 outreach at Penn State.


 The Data Services Manager will be responsible for the development of services
 to support data throughout its lifecycle, including long-term archival data
 storage, preservation, and management, the management of restricted data, and
 database hosting. The Data Services Manager will collaborate with diverse
 constituencies at Penn State (ITS, the IT Leadership Council, the University
 Libraries, and researchers/faculty) and with our peers nationally, to design,
 develop, and implement sustainable data services that meet existing and
 emerging needs.


 This job will be filled as a level 3, or level 4, depending upon the
 successful candidate's competencies, education, and experience. Typically
 requires a Master's degree or higher plus four years of related experience, or
 an equivalent combination of education and experience for a level 3.
 Additional experience and/or education and competencies are required for
 higher level jobs. The successful candidate will demonstrate knowledge of and
 experience with data management infrastructure, specifically storage and
 repository technologies, standards, and practices; maintain an awareness of
 emerging trends and developments in the data storage and repository domains;
 have knowledge of information management practices and principles such as
 metadata, data lifecycle, and digital preservation practices.


 The Data Services Manager will be passionate about working hands-on with
 technology; have excellent problem-solving skills; demonstrate proven ability
 to lead complex and cross-organizational projects; provide outstanding
 customer service; and have excellent interpersonal communication and
 relationship-building skills.



 Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6071/


Re: [CODE4LIB] Job: Data Services Manager at Pennsylvania State University

2013-02-06 Thread Ed Summers
s/their/they're/

But I guess there's no such thing as a free job posting, really.

Yeah, I'm done now. Thanks.

//Ed

On Wed, Feb 6, 2013 at 4:05 PM, Ed Summers e...@pobox.com wrote:
 Sorry for the duplication on the recent CDL/UC3 jobs by the way. I saw
 them pop up on the digital-curation list, got excited and posted them
 on jobs.code4lib.org without seeing that Stephen already had. Oh well,
 two for the price of one I guess, or is that 4 for the price of 2? [1]

 Mea culpa,
 //Ed

 [1] except their free, so uhh, yeah...

 On Wed, Feb 6, 2013 at 8:59 AM,  j...@code4lib.org wrote:
 Digital Library Technologies (DLT), a unit within Information Technology
 Services at Penn State University, is seeking a Data Services Manager to lead
 the development of new data services to support teaching, research and
 outreach at Penn State.


 The Data Services Manager will be responsible for the development of services
 to support data throughout its lifecycle, including long-term archival data
 storage, preservation, and management, the management of restricted data, and
 database hosting. The Data Services Manager will collaborate with diverse
 constituencies at Penn State (ITS, the IT Leadership Council, the University
 Libraries, and researchers/faculty) and with our peers nationally, to design,
 develop, and implement sustainable data services that meet existing and
 emerging needs.


 This job will be filled as a level 3, or level 4, depending upon the
 successful candidate's competencies, education, and experience. Typically
 requires a Master's degree or higher plus four years of related experience, 
 or
 an equivalent combination of education and experience for a level 3.
 Additional experience and/or education and competencies are required for
 higher level jobs. The successful candidate will demonstrate knowledge of and
 experience with data management infrastructure, specifically storage and
 repository technologies, standards, and practices; maintain an awareness of
 emerging trends and developments in the data storage and repository domains;
 have knowledge of information management practices and principles such as
 metadata, data lifecycle, and digital preservation practices.


 The Data Services Manager will be passionate about working hands-on with
 technology; have excellent problem-solving skills; demonstrate proven ability
 to lead complex and cross-organizational projects; provide outstanding
 customer service; and have excellent interpersonal communication and
 relationship-building skills.



 Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6071/


Re: [CODE4LIB] Adding authority control to IR's that don't have it built in

2013-01-31 Thread Ed Summers
Hi Jason,

Heh, sorry for the long response below. You always ask interesting questions :-D

I would highly recommend that vocabulary management apps like this
assign an identifier to each entity, that can be expressed as a URL.
If there is any kind of database backing the app you will get the
identifier for free (primary key, etc). So for example let's say you
have a record for John Chapman, who is on the faculty at OSU, which
has a primary key of 123 in the database, you would have a
corresponding URL for that record:

  http://id.library.osu.edu/person/123

When someone points their browser at that URL they get back a nice
HTML page describing John Chapman. I would strongly recommend that
schema.org microdata and/or opengraph protocol RDFa be layered into
the page for SEO purposes, as well as anyone who happens to be doing
scraping.  I would also highly recommend adding a sitemap to enable
discovery, and synchronization.

Having that URL is handy because you could add different machine
readable formats that hang off of it, which you can express as links
in your HTML, for example let's say you want to have JSON, RDF and XML
representations:

  http://id.library.osu.edu/person/123.json
  http://id.library.osu.edu/person/123.xml
  http://id.library.osu.edu/person/123.rdf

If you want to get fancy you can content negotiate between the generic
URL and the format-specific URLs, e.g.

  curl -i --header "Accept: application/json" http://id.library.osu.edu/person/123
  HTTP/1.1 303 See Other
  date: Thu, 31 Jan 2013 10:47:44 GMT
  server: Apache/2.2.14 (Ubuntu)
  location: http://id.library.osu.edu/person/123
  vary: Accept-Encoding

But that's gravy.
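The negotiation logic itself is small; a minimal sketch, with an illustrative media-type-to-extension mapping rather than any real service's rules:

```python
# Minimal sketch of content negotiation on a generic entity URL: based on
# the Accept header, 303-redirect to a format-specific URL. The mapping
# and paths are illustrative only.

FORMATS = {
    "application/json": ".json",
    "application/rdf+xml": ".rdf",
    "application/xml": ".xml",
}

def negotiate(path, accept):
    """Return (status, location) for a GET on the generic URL."""
    ext = FORMATS.get(accept)
    if ext is None:
        return 200, None        # no machine format asked for: serve the HTML page
    return 303, path + ext      # 303 See Other: the format-specific URL

print(negotiate("/person/123", "application/json"))  # (303, '/person/123.json')
```

In a real deployment this would live in the web framework's routing or in Apache/nginx configuration rather than a bare function, but the decision is the same.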

What exactly you put in these representations is a somewhat open
question I think. I'm a bit biased towards SKOS for the RDF because
it's lightweight, this is exactly its use case, it is flexible (you
can layer other assertions in easily), and (full disclosure) I helped
with the standardization of it. If you did do this you could use
JSON-LD for the JSON, or just come up with something that works.
Likewise for the XML. You might want to consider supporting JSON-P for
the JSON representation, so that it can be used from JavaScript in
other people's applications.

It might be interesting to come up with some norms here for
interoperability on a Wiki somewhere, or maybe a prototype of some
kind. But the focus should be on what you need to actual use it in
some app that needs vocabulary management. Focusing on reusing work
that has already been done helps a lot too. I think that helps ground
things significantly. I would be happy to discuss this further if you
want.

Whatever the format, I highly recommend you try to have the data link
out to other places on the Web that are useful. So for example the
record for John Chapman could link to his department page, blog, VIAF,
Wikipedia, Google Scholar Profile, etc. This work tends to require
human eyes, even if helped by a tool (Autosuggest, etc), so what you
do may have to be limited, or at least an ongoing effort. Managing
them (link scrubbing) is an ongoing effort too. But fitting your stuff
into the larger context of the Web will mean that other people will
want to use your identifiers. It's the dream of Linked Data I guess.

Lastly I recommend you have an OpenSearch API [1], which is pretty easy,
almost trivial, to put together. This would allow people to write
software to search for "John Chapman" and get back results (there
might be more than one) in Atom, RSS or JSON.  OpenSearch also has a
handy AutoSuggest format, which some JavaScript libraries work with.
The nice thing about OpenSearch is that browsers' search boxes support
it too.
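The "almost trivial" part is an OpenSearch description document; a minimal sketch, with a hypothetical service name and search URL template:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical OpenSearch description document; clients discover it
     via a <link rel="search"> in the site's HTML. -->
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>OSU Name Authority</ShortName>
  <Description>Search local name authority records</Description>
  <Url type="application/atom+xml"
       template="http://id.library.osu.edu/search?q={searchTerms}&amp;page={startPage?}"/>
</OpenSearchDescription>
```

The `{searchTerms}` and optional `{startPage?}` placeholders come from the OpenSearch URL template syntax; the server just has to answer that search URL with an Atom (or RSS/JSON) result list.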

I guess this might sound like an information architecture more than an
API. Hopefully it makes sense. Having a page that documents all this,
with "API" written across the top, and that hopefully includes terms of
service, can help a lot with use by others.

//Ed

PS. I should mention that Jon Phipps and Diane Hillman's work on the
Metadata Registry [2] did a lot to inform my thinking about the use of
URLs to identify these things. The metadata registry is used for
making the RDA and IFLA's FRBR vocabulary. It handles lots of stuff
like versioning, etc ... which might be nice to have. Personally I
would probably start small before jumping to installing the Metadata
Registry, but it might be an option for you.

[1] http://www.opensearch.org
[2] http://trac.metadataregistry.org/

On Wed, Jan 30, 2013 at 3:47 PM, Jason Ronallo jrona...@gmail.com wrote:
 Ed,

 Any suggestions or recommendations on what such an API would look
 like, what response format(s) would be best, and how to advertise the
 availability of a local name authority API? Who should we expect would
 use our local name authority API? Are any of the examples from the big
 authority databases like VIAF ones that would be good to follow for
 API design and response formats?

 Jason

 On Wed, Jan 30, 2013 at 3:15 PM, Ed Summers e

Re: [CODE4LIB] Adding authority control to IR's that don't have it built in

2013-01-31 Thread Ed Summers
Of course after sending that I noticed a mistake, the curl example
should look like:

  curl -i --header "Accept: application/json" http://id.library.osu.edu/person/123
  HTTP/1.1 303 See Other
  date: Thu, 31 Jan 2013 10:47:44 GMT
  server: Apache/2.2.14 (Ubuntu)
  location: http://id.library.osu.edu/person/123.json
  vary: Accept-Encoding

I didn't have it redirecting to the JSON previously.
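The server-side logic behind that kind of response can be sketched as a small content-negotiation function. This is just an illustration of the pattern, not the actual OSU implementation, and the media-type-to-extension mapping is an assumption:

```python
# Map negotiable media types to the extension of the representation
# we redirect to with 303 See Other. Hypothetical mapping.
EXTENSIONS = {
    "application/json": ".json",
    "application/rdf+xml": ".rdf",
    "text/html": ".html",
}

def see_other(path, accept_header):
    """Return the Location header value for a 303 redirect,
    based on the first acceptable media type we recognize."""
    for media_type in accept_header.split(","):
        media_type = media_type.split(";")[0].strip()  # drop q-values
        if media_type in EXTENSIONS:
            return path + EXTENSIONS[media_type]
    return path + ".html"  # default representation

print(see_other("/person/123", "application/json"))  # → /person/123.json
```

A real implementation would also honor q-value ordering, but even this naive first-match approach covers the common curl and browser cases.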

//Ed

On Wed, Jan 30, 2013 at 4:19 PM, Phillips, Mark mark.phill...@unt.edu wrote:
 Thanks for the prompt Ed,

 We've had a stupid simple vocabulary app for a few years now which we use
 to manage all of our controlled vocabularies [1]. These are represented in
 our metadata editing application as drop-downs and type-ahead values as
 described in the first email in this thread. Nothing too exciting. The
 entire vocabulary app is available to our systems as XML, Python or JSON
 objects. When we export our records as RDF we try to use the links for
 these values instead of the strings.

 We are currently working on another simple app to manage names for our
 system (UNT Name App). It takes into account some of the use cases
 described in this thread, such as disambiguation, variant names, and the
 all-important linking to other vocabularies, of which VIAF, LC, and
 Wikipedia are the primary expected targets. Once populated, it is to be
 integrated into the metadata editing system to provide auto-complete
 functions for the various name fields in our repository.

 As far as technology, we've tried to crib off the Chronicling America site
 as much as possible and follow the pattern of using the Suggestions
 extension of OpenSearch [2] to provide the API.

 Mark

 [1] http://digital2.library.unt.edu/vocabularies/
 [2] http://www.opensearch.org/Specifications/OpenSearch/Extensions/Suggestions/1.1
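As a concrete sketch of the OpenSearch Suggestions format referenced above: a response is a four-element JSON array of query, completions, descriptions, and result URLs. The names and URLs below are invented for illustration:

```python
import json

# What a name app's suggestions endpoint might return for "chapman".
# All names, notes, and URLs here are hypothetical.
response = json.dumps([
    "chapman",
    ["Chapman, John", "Chapman, John Jay"],
    ["b. 1960, Dept. of History", "b. 1862, essayist"],
    ["http://id.example.edu/name/1", "http://id.example.edu/name/2"],
])

# A client unpacks the four parallel arrays and pairs them up.
query, names, notes, urls = json.loads(response)
for name, url in zip(names, urls):
    print(name, url)
```

Because the format is just parallel arrays, it works with most jQuery-style autocomplete widgets with little or no glue code.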



 
 From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Ed Summers 
 [e...@pobox.com]
 Sent: Wednesday, January 30, 2013 2:15 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Adding authority control to IR's that don't have it 
 built in

 On Tue, Jan 29, 2013 at 5:19 PM, Kyle Banerjee kyle.baner...@gmail.com 
 wrote:
 This would certainly be a possibility for other projects, but the use case
 we're immediately concerned with requires an authority file that's
 maintained by our local archives. It contains all kinds of information
 about people (degrees, nicknames, etc) as well as terminology which is not
 technically kosher but which we know people use.

 Just as an aside really, I think there's a real opportunity for
 libraries and archives to make their local thesauri and name indexes
 available for integration into other applications both inside and
 outside their institutional walls. Wikipedia, Freebase, VIAF are
 great, but their notability guidelines aren't always the greatest match
 for cultural heritage organizations. So seriously consider putting a
 little web app around the information you have, using it for
 maintaining the data, making it available programmatically (API), and
 linking it out to other databases (VIAF, etc) as needed.

 A briefer/pithier way of saying this is to quote Mark Matienzo [1]

   Sooner or later, everyone needs a vocabulary management app.

 :-)

 //Ed

 PS. I think Mark Phillips has done some interesting work in this area
 at UNT. But I don't have anything to point you at, maybe Mark is tuned
 in, and can chime in.

 [1] https://twitter.com/anarchivist/status/269654403701682176


Re: [CODE4LIB] Adding authority control to IR's that don't have it built in

2013-01-30 Thread Ed Summers
On Tue, Jan 29, 2013 at 5:19 PM, Kyle Banerjee kyle.baner...@gmail.com wrote:
 This would certainly be a possibility for other projects, but the use case
 we're immediately concerned with requires an authority file that's
 maintained by our local archives. It contains all kinds of information
 about people (degrees, nicknames, etc) as well as terminology which is not
 technically kosher but which we know people use.

Just as an aside really, I think there's a real opportunity for
libraries and archives to make their local thesauri and name indexes
available for integration into other applications both inside and
outside their institutional walls. Wikipedia, Freebase, VIAF are
great, but their notability guidelines aren't always the greatest match
for cultural heritage organizations. So seriously consider putting a
little web app around the information you have, using it for
maintaining the data, making it available programmatically (API), and
linking it out to other databases (VIAF, etc) as needed.

A briefer/pithier way of saying this is to quote Mark Matienzo [1]

  Sooner or later, everyone needs a vocabulary management app.

:-)

//Ed

PS. I think Mark Phillips has done some interesting work in this area
at UNT. But I don't have anything to point you at, maybe Mark is tuned
in, and can chime in.

[1] https://twitter.com/anarchivist/status/269654403701682176


Re: [CODE4LIB] Adding authority control to IR's that don't have it built in

2013-01-30 Thread Ed Summers
On Tue, Jan 29, 2013 at 10:41 PM, Bill Dueber b...@dueber.com wrote:
 Right -- I'd like to show the FAST stuff as facets in our catalog search
 (or, at least try it out and see if anyone salutes). So I'd need to inject
 the FAST data into the records at index time.

Alas, I can't help you with that. I haven't heard of FAST being
distributed before, but I suppose it must have. Where is Roy when you
need him?

//Ed


Re: [CODE4LIB] Adding authority control to IR's that don't have it built in

2013-01-29 Thread Ed Summers
Hi Kyle,

If you are thinking of doing name or subject authority control you
might want to check out OCLC's VIAF AutoSuggest service [1] and FAST
AutoSuggest [2]. There are also autosuggest searches for the name and
subject authority files, that are lightly documented in their
OpenSearch document [3].

In general, I really like this approach, and I think it has a lot of
potential for newer cataloging interfaces. I'll describe two scenarios
that I'm familiar with, that have worked quite well (so far). Note,
these aren't IR per-se, but perhaps they will translate to your
situation.

As part of the National Digital Newspaper Program LC has a simple app
so that librarians can create essays that describe newspapers in
detail. Rather than making this part of our public website we created
an Essay Editor as a standalone django app that provides a web based
editing environment for authoring the essays. Part of this process is
linking up the essay with the correct newspaper. Rather than load all
the newspapers that could be described into the Essay Editor, and keep
them up to date, we exposed an OpenSearch API in the main Chronicling
America website (where all the newspaper records are loaded and
maintained) [4]. It has been working quite well so far.

Another example is the jobs.code4lib.org website that allows people to
enter jobs announcements. I wanted to make sure that it was possible
to view jobs by organization [5], or skill [6] -- so some form of
authority control was needed. I ended up using Freebase Suggest [7]
that makes it quite easy to build simple forms that present users with
subsets of Freebase entities, depending on what they type. A nice side
benefit of using Freebase is that you get descriptive text and images
for the employers and topics for free. It has been working pretty well
so far. There is a bit of an annoying conflict between the Freebase
CSS and Twitter Bootstrap, which might be resolved by updating
Bootstrap. Also, I've noticed Freebase's service slowing down a bit
lately, which hopefully won't degrade further.

The big caveat here is that these external services are dependencies.
If they go down, a significant portion of your app might go down too.
Minimizing this dependency, or allowing things to degrade well, is good to
keep in mind. Also, it's worth remembering identifiers (if they are
available) for the selected matches, so that they can be used for
linking your data with the external resource. A simple string might
change.

I hope this helps. Thanks for the question, I think this is an area
where we can really improve some of our back-office interfaces and
applications.

//Ed

[1] 
http://www.oclc.org/developer/documentation/virtual-international-authority-file-viaf/request-types#autosuggest
[2] http://experimental.worldcat.org/fast/assignfast/
[3] http://id.loc.gov/authorities/opensearch/
[4] http://chroniclingamerica.loc.gov/about/api/#autosuggest
[5] 
http://jobs.code4lib.org/employer/university-of-illinois-at-urbana-champaign/
[6] http://jobs.code4lib.org/jobs/ruby/
[7] http://wiki.freebase.com/wiki/Freebase_Suggest
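Calling one of these autosuggest services amounts to building a query URL and parsing the JSON that comes back. A sketch, using the VIAF AutoSuggest URL pattern from [1] as documented at the time (treat the exact path as an assumption that could change):

```python
from urllib.parse import urlencode

def suggest_url(base, query):
    """Build an autosuggest query URL with a properly encoded term."""
    return base + "?" + urlencode({"query": query})

url = suggest_url("http://viaf.org/viaf/AutoSuggest", "Chapman, John")
print(url)
# An app would then fetch it, e.g.:
#   import json, urllib.request
#   results = json.load(urllib.request.urlopen(url))
```

The main thing the helper buys you is correct percent-encoding of commas and spaces in name headings, which are easy to get wrong by hand.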

On Tue, Jan 29, 2013 at 11:59 AM, Kyle Banerjee kyle.baner...@gmail.com wrote:
 How are libraries doing this and how well is it working?

 Most systems that even claim to have authority control simply allow a
 controlled keyword list. But this does nothing for the see and see also
 references that are essential for many use cases (people known by many
 names, entities that change names, merge or whatever over time, etc).

 The two most obvious solutions to me are to write an app that provides this
 information interactively as the query is typed (requires access to the
 search box) or to have a record that serves as a disambiguation page (might
 not be noticed by the user for a variety of reasons). Are there other
 options, and what do you recommend?

 Thanks,

 kyle


Re: [CODE4LIB] Adding authority control to IR's that don't have it built in

2013-01-29 Thread Ed Summers
I think that Mike Giarlo and Michael Witt used the FAST AutoSuggest as
part of their databib project [1]. But are you talking about bringing
the data down for a local index?

//Ed

[1] http://databib.org/

On Tue, Jan 29, 2013 at 4:45 PM, Bill Dueber b...@dueber.com wrote:
 Has anyone created a nice little wrapper around FAST? I'd like to test out
 including FAST subjects in our catalog, but am hoping someone else went
 through the work of building the code to do it :-) I know FAST has a web
 interface, but I've got about 10M records and would rather use something
 local.


 On Tue, Jan 29, 2013 at 4:36 PM, Ed Summers e...@pobox.com wrote:

 Hi Kyle,

 If you are thinking of doing name or subject authority control you
 might want to check out OCLC's VIAF AutoSuggest service [1] and FAST
 AutoSuggest [2]. There are also autosuggest searches for the name and
 subject authority files, that are lightly documented in their
 OpenSearch document [3].

 In general, I really like this approach, and I think it has a lot of
 potential for newer cataloging interfaces. I'll describe two scenarios
 that I'm familiar with, that have worked quite well (so far). Note,
 these aren't IR per-se, but perhaps they will translate to your
 situation.

 As part of the National Digital Newspaper Program LC has a simple app
 so that librarians can create essays that describe newspapers in
 detail. Rather than making this part of our public website we created
 an Essay Editor as a standalone django app that provides a web based
 editing environment for authoring the essays. Part of this process is
 linking up the essay with the correct newspaper. Rather than load all
 the newspapers that could be described into the Essay Editor, and keep
 them up to date, we exposed an OpenSearch API in the main Chronicling
 America website (where all the newspaper records are loaded and
 maintained) [4]. It has been working quite well so far.

 Another example is the jobs.code4lib.org website that allows people to
 enter jobs announcements. I wanted to make sure that it was possible
 to view jobs by organization [5], or skill [6] -- so some form of
 authority control was needed. I ended up using Freebase Suggest [7]
 that makes it quite easy to build simple forms that present users with
 subsets of Freebase entities, depending on what they type. A nice side
 benefit of using Freebase is that you get descriptive text and images
 for the employers and topics for free. It has been working pretty well
 so far. There is a bit of an annoying conflict between the Freebase
 CSS and Twitter Bootstrap, which might be resolved by updating
 Bootstrap. Also, I've noticed Freebase's service slowing down a bit
 lately, which hopefully won't degrade further.

 The big caveat here is that these external services are dependencies.
 If they go down, a significant portion of your app might go down too.
 Minimizing this dependency, or allowing things to degrade well, is good to
 keep in mind. Also, it's worth remembering identifiers (if they are
 available) for the selected matches, so that they can be used for
 linking your data with the external resource. A simple string might
 change.

 I hope this helps. Thanks for the question, I think this is an area
 where we can really improve some of our back-office interfaces and
 applications.

 //Ed

 [1]
 http://www.oclc.org/developer/documentation/virtual-international-authority-file-viaf/request-types#autosuggest
 [2] http://experimental.worldcat.org/fast/assignfast/
 [3] http://id.loc.gov/authorities/opensearch/
 [4] http://chroniclingamerica.loc.gov/about/api/#autosuggest
 [5]
 http://jobs.code4lib.org/employer/university-of-illinois-at-urbana-champaign/
 [6] http://jobs.code4lib.org/jobs/ruby/
 [7] http://wiki.freebase.com/wiki/Freebase_Suggest

 On Tue, Jan 29, 2013 at 11:59 AM, Kyle Banerjee kyle.baner...@gmail.com
 wrote:
  How are libraries doing this and how well is it working?
 
  Most systems that even claim to have authority control simply allow a
  controlled keyword list. But this does nothing for the see and see also
  references that are essential for many use cases (people known by many
  names, entities that change names, merge or whatever over time, etc).
 
  The two most obvious solutions to me are to write an app that provides
 this
  information interactively as the query is typed (requires access to the
  search box) or to have a record that serves as a disambiguation page
 (might
  not be noticed by the user for a variety of reasons). Are there other
  options, and what do you recommend?
 
  Thanks,
 
  kyle




 --
 Bill Dueber
 Library Systems Programmer
 University of Michigan Library


Re: [CODE4LIB] Zoia

2013-01-24 Thread Ed Summers
On Thu, Jan 24, 2013 at 10:01 AM, Mark A. Matienzo
mark.matie...@gmail.com wrote:
 More to the point, no other decision about code4lib in terms of
 action or policy has been made ever. This is new territory for us.

It's not really that new. We've voted on t-shirts, logos, and whether
or not to have jobs.code4lib.org post here--perhaps other things that
I'm forgetting. I'm not saying we need to vote on the anti-harassment
policy to make it real--it's already real. Not everyone may respect
it, but hopefully we'll all continue being nice people and won't have
to worry about enforcing it. It's hard to imagine anyone being against
it. Personally, I find it regrettable that it's even necessary, but it
is what it is.

Voting can be a nice way of testing the waters for something. I found
the survey on the jobs.code4lib.org email posting very helpful. But
voting on everything would get very tedious, and boring very quickly I
imagine. code4lib has always seemed much more freeform than that to
me. I really liked Bethany's description of lazy consensus [1] at the
last conference.

//Ed

[1] http://nowviskie.org/2012/lazy-consensus/


Re: [CODE4LIB] Zoia

2013-01-24 Thread Ed Summers
On Thu, Jan 24, 2013 at 4:04 PM, Shaun Ellis sha...@princeton.edu wrote:
 Determining whether action should be taken on harassment should not be based
 on a popularity contest.  That would be a fail, and that's what Karen is
 right to point out.

I added ABSTENTIONS.txt and OPPOSERS.txt to the anti-harassment github
repository [1] to supplement the SUPPORTERS.txt, for people who want
to record their particular view on this issue. If you want to record
your view you can fork the repository, add your name to the
appropriate file, and send a pull request. Perhaps that's good enough
for now? I don't disagree that ambiguity around this issue is
problematic, but I also think that trying to remove all ambiguity from
it may prove to be difficult, and damaging.

//Ed

[1] https://github.com/code4lib/antiharassment-policy


Re: [CODE4LIB] Group Decision Making (was Zoia)

2013-01-24 Thread Ed Summers
So we have a reasonable policy in place. Can we now tackle the creepy
things as they come up? I am not opposed to voting about this. It just
seems like a crazy thing to do, because I can't imagine anyone would
be opposed to it. But maybe I lack imagination.

//Ed

On Thu, Jan 24, 2013 at 4:49 PM, BWS Johnson abesottedphoe...@yahoo.com wrote:
 Salve!


 I am uneasy about coming up with a policy for banning people (from
 what?) and voting on it, before it's demonstrated that it's even
 needed. Can't we just tackle these issues as they come up, in context,
 rather than in the abstract?

 Or has a specific issue come up, and I'm just being daft?

 It's needed. It was requested. Specifically creepy things happening is 
 why this came up. The policy is necessary to help people deal with things as 
 they come up in context.

 I'm uneasy about voting on minority rights. That usually doesn't go well, 
 and it almost always misses the point.

 Cheers,
 Brooke


Re: [CODE4LIB] Conference roommate

2013-01-22 Thread Ed Summers
Whoever is rooming with Gabe, be sure to remind him to bring his Ukulele.

//Ed

On Tue, Jan 22, 2013 at 4:22 PM, Gabriel Farrell gsf...@gmail.com wrote:
 And the code4lib community comes through again. I now have a roommate. See
 you all at the conference!


 On Mon, Jan 21, 2013 at 11:59 AM, Gabriel Farrell gsf...@gmail.com wrote:

 I'm looking for a roommate for a room at the conference hotel Monday
 through Thursday. I've also posted at
 http://wiki.code4lib.org/index.php/2013_room_ride_share. References
 available upon request.



Re: [CODE4LIB] Zoia

2013-01-22 Thread Ed Summers
On Tue, Jan 22, 2013 at 4:01 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
 Thanks to whoever removed the 'poledance' plugin (REALLY? that existed? if
 it makes you feel any better, I don't think anyone who hangs out in
 #code4lib even knew it existed, and it never got used).

I knew it existed, and I even invoked it a few times. Although, If
this war on humor keeps up, I'm unlikely to hang out in #code4lib
much longer.

//Ed

PS. I really didn't expect the Spanish Inquisition.


[CODE4LIB] code4lib.org domain

2012-12-18 Thread Ed Summers
HI all,

I've owned the code4lib.org domain since 2005 and have been thinking it
might be wise to transfer ownership of it to someone else. Sometimes I
forget to pay bills, and miss emails, and it seems like the domain
means something to a larger group of people.

With Ryan Ordway's help Oregon State University indicated they would
be willing to take over administration of the domain. They also have
been responsible for running the Drupal instance at code4lib.org and
the Mediawiki instance at wiki.code4lib.org -- so it seems like a
logical move.

But I thought I would bring it up here first in the interests of
transparency, community building and whatnot, to see if there were any
objections or ideas.

//Ed


Re: [CODE4LIB] code4lib.org domain

2012-12-18 Thread Ed Summers
On Tue, Dec 18, 2012 at 4:58 PM, Wilhelmina Randtke rand...@gmail.com wrote:
 Pay for it shouldn't be an issue.  It's like $10 a year to register the
 domain, right?  So, don't make a big deal out of OSU paying for it.  The
 fee is negligible.

Yes, it's not so much a matter of money as it is remembering to pay it :-)

 The key concern is how committed to OSU is Ryan Ordway, and what's the
 climate there like.  I see this as transferring to the people who are
 currently technical contacts at OSU, not to a faceless organization.  If
 they already hold several other URLs, and have a policy and timeframe for
 tracking and renewing these then that's a plus.

OSU is committed enough to have a Domain Name Committee to evaluate
these matters, which accepted the proposal to host code4lib.org. The
first code4lib conference was held at OSU, and there are several
active long time OSU folks who have helped create the code4lib
community...so it's not as if there's no connection between the
organization and this community.

I am not disagreeing with your assessment about individual vs
organizational ownership. But I am saying I don't want to be that
individual anymore, and that OSU is the best option for not letting
the domain lapse.

//Ed


[CODE4LIB] Just Solve the File Format Problem month: can you help?

2012-11-02 Thread Ed Summers
I imagine you've heard about the Just Solve the Problem month already,
but if not, I thought Chris Rusbridge's email to the
digital-preservation list was a good call for participation in the
project ...

//Ed

-- Forwarded message --
From: Chris Rusbridge c.rusbri...@googlemail.com
Date: Thu, Nov 1, 2012 at 4:00 PM
Subject: Just Solve the File Format Problem month: can you help?
To: digital-preservat...@jiscmail.ac.uk

Some of you will know that Jason Scott, Rogue Archivist, is raising a
citizen's army to attempt to solve the file format problem* in the
month of November, 2012. The work is taking place via a wiki at
http://justsolve.archiveteam.org/index.php/Main_Page, with a band of
volunteers (you need to register to make changes to the wiki, by
sending a username and email address to justso...@textfiles.com). I've
added a few formats and groups of formats myself (at least as
skeletons or empty placeholders).

The best form of help is for some of you who know more about rarer
data formats to register and help by editing the wiki yourself. It's
pretty easy; I've never used MediaWiki before, and everything I've
done so far has been by finding something like it and adapting the
wiki source. Other people can make it beautiful and standardised later
on!

If you can't do that, you could email me information about missing
data formats. This should include as much as possible of:

- name, and what it's for (ie brief description)
- web site with some authoritative information
- web site with some examples, etc.

Let's try and capture ALL these formats. As Jason says in his own
inimitable way Let's make that goddam army!.

* Note, the problem is only vaguely defined, and after some angst
(eg see 
http://unsustainableideas.wordpress.com/2012/07/04/the-solution-is-42-what-was-the-problem/),
I think that's OK. Gathering a huge amount of information about file
formats in one place will be a BIG HELP.

--
Chris Rusbridge
Mobile: +44 791 7423828
Email: c.rusbri...@gmail.com
Adopt the email charter! http://emailcharter.org/


Re: [CODE4LIB] Corrections to Worldcat/Hathi/Google

2012-08-27 Thread Ed Summers
On Mon, Aug 27, 2012 at 10:36 AM, Ross Singer rossfsin...@gmail.com wrote:
 For MARC data, while I don't know of any examples of this, it seems like 
 something like CouchDB [2] and marc-in-json [3] would be a fantastic way to 
 make something like this available.

Great idea...and there are 4 years of transactions for LC record
create/update/deletes up at Internet Archive:

http://archive.org/details/marc_loc_updates

//Ed


Re: [CODE4LIB] Corrections to Worldcat/Hathi/Google

2012-08-27 Thread Ed Summers
On Mon, Aug 27, 2012 at 8:49 AM, Karen Coyle li...@kcoyle.net wrote:
 Actually, Ed, this would not only make for a good blog post (please, so it
 doesn't get lost in email space), but I would love to see a discussion of
 what kind of revision control would work:

 1) for libraries (git is gawdawful nerdy)
 2) for linked data

I think you know as well as me that linked data is gawdawful nerdy too :-)

 p.s. the Ramsay book is now showing on Open Library, and the subtitle is
 correct... perhaps because the record is from the LC MARC service :-)
 http://openlibrary.org/works/OL16528530W/Reading_machines

"Perhaps" being the operative word. Being able to concretely answer
these provenance questions is important. Actually, I'm not sure it was
ever incorrect at OpenLibrary. At least I don't think I used it as an
example in my "Genealogy of a Typo" post.

//Ed


Re: [CODE4LIB] Corrections to Worldcat/Hathi/Google

2012-08-27 Thread Ed Summers
On Mon, Aug 27, 2012 at 1:33 PM, Corey A Harper corey.har...@nyu.edu wrote:
 I think there's a useful distinction here. Ed can correct me if I'm
 wrong, but I suspect he was not actually suggesting that Git itself be
 the user-interface to a github-for-data type service, but rather that
 such a service can be built *on top* of an infrastructure component
 like GitHub.

Yes, I wasn't saying that we could just plonk our data into Github,
and pat ourselves on the back for a good day's work :-) I guess I was
stating the obvious: technologies like Git have made once hard
problems like decentralized version control much, much easier...and
there might be some giants shoulders to stand on.

//Ed


Re: [CODE4LIB] Corrections to Worldcat/Hathi/Google

2012-08-26 Thread Ed Summers
Thanks for sharing this bit of detective work. I noticed something
similar fairly recently myself [1], but didn't discover as plausible
of a scenario for what had happened as you did. I imagine others have
noticed this network effect before as well.

On Tue, Aug 21, 2012 at 11:42 AM, Lars Aronsson l...@aronsson.se wrote:
 And sure enough, there it is,
 http://clio.cul.columbia.edu:7018/vwebv/holdingsInfo?bibId=1439352
 But will my error report to Worldcat find its way back
 to CLIO? Or if I report the error to Columbia University,
 will the correction propagate to Google, Hathi and Worldcat?
 (Columbia asks me for a student ID when I want to give
 feedback, so that removes this option for me.)

I realize this probably will sound flippant (or overly grandiose), but
innovating solutions to this problem, where there isn't necessarily
one metadata master that everyone is slaved to, seems to be one of the
more important and interesting problems that our sector faces.

When Columbia University can become the source of a bibliographic
record for Google Books, HathiTrust and OpenLibrary, etc how does this
change the hub and spoke workflows (with OCLC as the hub) that we are
more familiar with? I think this topic is what's at the heart of the
discussions about a "GitHub for data" [2,3], since decentralized
version control systems [4] allow for the evolution of more organic,
push/pull, multimaster workflows...and platforms like Github make them
socially feasible, easy and fun.

I also think Linked Library Data, where bibliographic descriptions are
REST enabled Web resources identified with URLs, and patterns such as
webhooks [5] make it easy to trigger update events could be part of an
answer. Feed technologies like Atom, RSS and the work being done on
ResourceSync also seem important technologies for us to use to allow
people to poll for changes [6]. And being able to say where you have
obtained data from, possibly using something like the W3C Provenance
vocabulary [7] also seems like an important part of the puzzle.
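As a small illustration of the feed-polling idea, here is a sketch that parses an Atom feed of record update events. The feed content is invented, standing in for a hypothetical endpoint like http://example.org/records/feed:

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# An invented Atom feed announcing a bibliographic record update.
# A real consumer would fetch this over HTTP and remember the last
# <updated> value it saw, so each poll only processes new entries.
feed_xml = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Record updates</title>
  <entry>
    <title>Record n79029194 updated</title>
    <id>http://example.org/records/n79029194</id>
    <updated>2012-08-26T12:00:00Z</updated>
  </entry>
</feed>"""

feed = ET.fromstring(feed_xml)
for entry in feed.findall(ATOM + "entry"):
    rec_id = entry.find(ATOM + "id").text
    updated = entry.find(ATOM + "updated").text
    print(rec_id, updated)
```

Webhooks invert this pattern (the publisher POSTs to you instead of you polling), but the entry payload can be the same Atom structure either way.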

I'm sure there are other (and perhaps better) creative analogies or
tools that could help solve this problem. I think you're probably
right that we are starting to see the errors more now that more
library data is becoming part of the visible Web via projects like
GoogleBooks, HathiTrust, OpenLibrary and other enterprising libraries
that design their catalogs to be crawlable and indexable by search
engines.

But I think it's more fun to think about (and hack on) what grassroots
things we could be doing to help these new bibliographic data
workflows to grow and flourish than to get piled under by the errors,
and a sense of futility...

Or it might make for a good article or dissertation topic :-)

//Ed

[1] http://inkdroid.org/journal/2011/12/25/genealogy-of-a-typo/
[2] http://www.informationdiet.com/blog/read/we-need-a-github-for-data
[3] http://sunlightlabs.com/blog/2010/we-dont-need-a-github-for-data/
[4] http://en.wikipedia.org/wiki/Distributed_revision_control
[5] https://help.github.com/articles/post-receive-hooks
[6] http://www.niso.org/workrooms/resourcesync/
[7] http://www.w3.org/TR/prov-primer/


Re: [CODE4LIB] Job: Senior Java Developer (CACI) at Library of Congress

2012-08-14 Thread Ed Summers
On Tue, Aug 14, 2012 at 9:02 AM,  j...@code4lib.org wrote:
 The Software Developer will serve as a member of the repository development
 team at the Library of Congress. The candidate will be responsible for
 participating in the definition, design, and development of the software,
 tools and technologies that satisfy functional requirements, within the scope,
 schedule, and priorities as assigned by the project manager and/or technical
 lead. The candidate must be familiar with the entire lifecycle of software
 development, and have experience creating and maintaining applications for
 production environments. The candidate must be familiar with debugging
 software issues in the production environment.

Btw, if anyone wants to know more about this job and wants to chat
about it informally let me know...

//Ed


Re: [CODE4LIB] It's all job postings!

2012-08-06 Thread Ed Summers
150 people responded about whether jobs.code4lib.org postings should
come to the discussion list:

yes: 132
no: 10
who cares: 8

93% in support or agnostic seems to be a good indicator that the
postings should continue to come to the list for now.

//Ed


Re: [CODE4LIB] It's all job postings!

2012-08-02 Thread Ed Summers
On Thu, Aug 2, 2012 at 9:35 AM, Moynihan, Terry
terry.moyni...@analog.com wrote:
 I can't understand why this would be an issue in a profession (librarian) 
 that is very tiny compared to most. I also can't understand why it would be a 
 problem when 50% of college graduates can't get any job let alone one in 
 their field. The US and World economies stink, and more jobs have been lost 
 than ever before in the history of the world. There are still 100's of 
 millions of people without any job and a few job postings are an issue??

 Perhaps a step back to the reality of what's really important in life...

Thanks for this Terry. You expressed exactly the frustration that led
me to hack on jobs.code4lib.org in the first place. I know too many
people struggling to find work, and to find work they love.

As Dan Chudnov pointed out in his code4lib keynote this year, the
library/archive profession is in the midst of a pretty big
upheaval/transformation. So, the other goal of jobs.code4lib.org is to
help document the skills and jobs that are in demand, to help
educators teach their students relevant skills so that they can find
jobs. I also wanted it to assist life long learners who were
interested in refreshing their skillset. Ideas for improving the site
are welcome in the issue tracker on Github [1].

//Ed

[1] https://github.com/code4lib/shortimer/issues


Re: [CODE4LIB] It's all job postings!

2012-08-02 Thread Ed Summers
On Thu, Aug 2, 2012 at 10:29 AM, Barbara Cormack
bcorm...@corvendesign.com wrote:
 I would vote for including more information in the postings, as some have
 come through without any details about the job or the hiring institution, or
 links. Usually a little searching turns this up, but not always.

Just so I understand, have you tried clicking on the jobs.code4lib.org
URL included at the bottom of the posting? If not does this link need
to be more obvious?

//Ed


Re: [CODE4LIB] Job: Digital Projects and Technology Librarian at Yale University

2012-07-20 Thread Ed Summers
I'm not sure if it helps, but jobs.code4lib.org picked this up
downstream from a libgig post yesterday:


http://publicboard.libgig.com/job/digital-projects-and-technology-librarian-new-haven-ct-yale-university-b56f0fd024/?d=1&source=rss_page&utm_source=twitterfeed&utm_medium=twitter

//Ed

On Fri, Jul 20, 2012 at 8:46 AM, Friscia, Michael
michael.fris...@yale.edu wrote:
 No, it is not possible to submit when the job is closed. I'm trying to get 
 clarification if closing it was intentional. Sorry for the confusion.

 I should add that I don't have anything to do with the job except my 
 department is named in the description as a collaborating partner.

 ___
 Michael Friscia
 Manager, Digital Library & Programming Services

 Yale University Library
 (203) 432-1856


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of 
 Matthew Sherman
 Sent: Friday, July 20, 2012 8:36 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Job: Digital Projects and Technology Librarian at 
 Yale University

 So even though it says closed to further applications one is actually able
 to submit?

 On Fri, Jul 20, 2012 at 5:27 AM, Friscia, Michael
 michael.fris...@yale.eduwrote:

 I just asked, our internal locks are only for the first 7 days during
 which the jobs won't even appear in the system unless you work for Yale.
 ___
 Michael Friscia
 Manager, Digital Library & Programming Services
 Yale University Library
 (203) 432-1856







 On 7/19/12 11:47 PM, Simon Spero sesunc...@gmail.com wrote:

 Maybe it's just closed to internal applicants- some sort of Yale lock?
 On Jul 19, 2012 11:25 PM, Matthew Sherman matt.r.sher...@gmail.com
 wrote:
 
  There is a slight problem here.  The posting says it is *closed to
 further
  applications*.  Can someone from Yale explain/look into that?  I would
  very much like to apply.
 
  On Thu, Jul 19, 2012 at 5:54 PM, Simon Spero sesunc...@gmail.com
 wrote:
 
   On Thu, Jul 19, 2012 at 6:35 PM, j...@code4lib.org wrote:
  
  * May be required to assist with disaster recovery efforts.
   
  
  
PREFERRED EDUCATION, EXPERIENCE AND SKILLS
  * Advanced degree in theology or a related field.
   
  
   Rise, take up thy bed, and walk
  
 



Re: [CODE4LIB] Reminder - call for proposals, New England code4lib!

2012-07-07 Thread Ed Summers
On Fri, Jul 6, 2012 at 2:51 PM, Stern, Randall randy_st...@harvard.edu wrote:
 This will be a great opportunity to meet your peers at local institutions and 
 generate conversation on code4lib related topics in which you are interested! 
 Please add your proposals now (please, by August 1) for

 (a) Prepared talks (20 minutes)
 (b) Lightning talks (5 minutes)
 (c) Posters

Maybe it's just me, but doesn't it seem a bit odd to submit proposals
months in advance for lightning talks? My experience of lightning
talks is that people can sign up for them at the event, so they can be
more spontaneous and of-the-moment.

//Ed


Re: [CODE4LIB] Reminder - call for proposals, New England code4lib!

2012-07-07 Thread Ed Summers
On Sat, Jul 7, 2012 at 9:28 AM, Carol Bean beanwo...@gmail.com wrote:
 I thought the distinction was that Lightning talks are very short and more 
 informal.

well, that too :-)


[CODE4LIB] code4lib.org down?

2012-06-25 Thread Ed Summers
Paging Oregon State: do we know why code4lib.org isn't responding?

http://code4lib.org/

HTTP requests currently seem to timeout.

//Ed

PS. Thanks to Carol Bean for noticing it, and bringing it up in #code4lib :-)


Re: [CODE4LIB] Job: Agile Project Manager at AudioVisual Preservation Solutions

2012-06-11 Thread Ed Summers
Oops, sorry about that Mark. I should have looked more carefully
before adding this after seeing it in your TweetStream. I'll remove
the duplicate. Also happened today with the Yale posting. I guess I
need to come up with some smarts to detect duplicates.

//Ed

On Mon, Jun 11, 2012 at 5:43 PM,  j...@code4lib.org wrote:
 **Job Description:**

 AudioVisual Preservation Solutions (AVPS) seeks an experienced (mid-level)
 Agile Project Manager to provide essential support and facilitation to an open
 source software development project for the public media archival
 community. The position will begin on July 1, 2012 and
 continue through October 2013, with the possibility of
 extension. The project manager will both play a critical
 leadership role in the Agile development process as well as act as the primary
 liaison for clients and stakeholders.


 This position is full time, based at our office in New York
 City. No reimbursement for relocation costs will be
 provided.


 **Responsibilities**

  * Oversee the entire project, including overall project planning, project 
 coordination and software development
  * Oversee Agile development of the application
  * Develop and document comprehensive project plans, timelines, milestones 
 and deliverables
  * Manage the complete software development lifecycle
  * Lead the development and management of project requirements, system 
 features, and user stories
  * Carefully track and coordinate project progress, ensuring the timely 
 completion of deliverables
  * Continually prioritize and organize project goals in a way that is clearly 
 accessible to all stakeholders
  * Manage and track project progress through web-based collaboration tools
  * Organize and facilitate regular project meetings, including iteration and 
 release planning, daily stand-up meetings, demos, and reviews
  * Be the primary point of contact for all stakeholders, including clients, 
 developers, stakeholders and AVPS team members. Answer questions, and field 
 inquiries to appropriate team members as needed
  * Develop documentation and guidelines for software
  * Help train users of the application
  * Supervise hand off of application to product owners upon completion of 
 contract
  * Travel to meetings as needed (10%)
 **Desired Skills and Experience**

  * At least three years in a project management role
  * Demonstrated experience with Agile software development coordination, 
 using frameworks such as Scrum or Feature Driven Development (FDD)
  * Demonstrated leadership skills, with the ability to manage distributed, 
 remote teams
  * Excellent verbal, written, presentation, and interpersonal communication 
 skills
  * Extremely organized, responsive, and detail oriented
  * Experience managing projects with project tracking, issue tracking, and 
 collaboration software such as JIRA and Confluence
  * Excellent MS Office skills on Mac and PC platforms, Google Docs, 
 diagramming skills using a variety of software such as OmniGraffle
  * Certified Scrum Master and/or PMP Certification a plus
  * Knowledge of library and information science, video and audio production, 
 and/or public media a plus
 AudioVisual Preservation Solutions (AVPS) is a full service audiovisual
 preservation and information management consulting firm serving the
 educational, broadcasting, government, non-profit, and corporate
 sectors. With a strong focus on professional standards and
 best practices, open communication, efficient workflows, and the innovative
 use and development of technological resources, AVPS brings a broad knowledge
 base and extensive experience to efficiently and effectively meeting the
 challenges faced in the preservation and access of digital content.


 To Apply please submit resume and cover letter (including salary requirements
 if applicable) in PDF format to care...@avpreserve.com by June 22, 2012.



 Brought to you by code4lib jobs: http://jobs.code4lib.org/job/1003/


Re: [CODE4LIB] OCLC / British Library / VIAF / Wikipedia

2012-06-04 Thread Ed Summers
On Fri, Jun 1, 2012 at 7:48 PM, Stuart Yeates stuart.yea...@vuw.ac.nz wrote:
 There's a discussion going on on Wikipedia that may be of interest to 
 subscribers of this list:

Thanks for the heads up Stuart! It is an interesting discussion, and
one that hopefully can build on the excellent work that Jakob Voss and
others [1] have done on German Wikipedia with the Deutsche Bibliothek.

//Ed

[1] http://meta.wikimedia.org/wiki/Transwiki:Wikimania05/Paper-JV2


Re: [CODE4LIB] MARC Magic for file

2012-05-24 Thread Ed Summers
On Wed, May 23, 2012 at 6:16 PM, Kyle Banerjee
baner...@orbiscascade.org wrote:
 I'm not sure whether to laugh or cry that it's a sign of progress that a 40
 year old utility designed to identify file types is now just beginning to
 be able to recognize a format that's been around for almost 50 years...

Laugh :-)

//Ed


[CODE4LIB] duplicate jobs postings from jobs.code4lib.org

2012-05-18 Thread Ed Summers
I just wanted to apologize for 3 duplicate job postings that were sent
today. Now that there are multiple job curators who are finding jobs
and putting them on jobs.code4lib.org it is important to double check
that a job hasn't been posted already. At the minimum I think this is
a social convention that curators should follow if they want to post
jobs on jobs.code4lib.org. Perhaps there is something shortimer [1]
could do to help prevent this: such as warning when a given job URL
has been used before, etc.
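Shortimer could guard against reposts by normalizing each submitted URL before comparing it with past postings. A minimal sketch of that idea (hypothetical code, not shortimer's actual implementation; the tracking-parameter list and trailing-slash rule are assumptions):

```javascript
// Hypothetical duplicate-URL guard -- not actual shortimer code.
// Normalize a job URL so trivial variants compare equal.
function normalizeJobUrl(raw) {
  const u = new URL(raw);
  u.hash = ''; // drop fragments
  for (const p of ['utm_source', 'utm_medium', 'utm_campaign']) {
    u.searchParams.delete(p); // strip common tracking parameters
  }
  const s = u.toString();
  return s.endsWith('/') ? s.slice(0, -1) : s; // ignore trailing slash
}

const seen = new Set();

// Warn-before-save check: has this (normalized) URL been posted already?
function isDuplicate(url) {
  const key = normalizeJobUrl(url);
  if (seen.has(key)) return true;
  seen.add(key);
  return false;
}
```

A warning in the curator form when isDuplicate() fires would catch most reposts without blocking legitimate new listings.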

Anyhow, thanks for your patience :-)
//Ed

[1] https://github.com/code4lib/shortimer


Re: [CODE4LIB] Anyone using node.js?

2012-05-09 Thread Ed Summers
On Wed, May 9, 2012 at 3:47 AM, Berry, Rob robert.be...@liverpool.ac.uk wrote:
 You almost certainly should not rewrite an entire codebase from scratch 
 unless there's an extremely good reason to do so. JoelOnSoftware did a good 
 piece on it - http://www.joelonsoftware.com/articles/fog69.html.

 Why has your project manager decided Node.js is the way to go instead of 
 something like Python or Perl? Just because it's a shiny new technology? 
 Python's got Twisted and Perl has POE if you want to do asynchronous 
 programming. They also both have a very large number of excellent quality 
 libraries to do innumerable other things.

I totally agree, it's all about the right tool for the job.

Just to clarify, NodeJS is quite a bit different than Twisted and POE
because the entire language and its supporting libraries are written
for event driven programming from the bottom up. When using Twisted
and POE you may end up needing existing libraries that are
synchronous, so the wins aren't as great, and things can
get...complicated. For a pretty even handed description of this check
out Paul Querna's blog post about why Rackspace decided to switch from
Twisted to NodeJS for their cloud monitoring dashboard applications
[1].

I am not saying Perl and Python are not good tools (they are), just
that the benefits of using NodeJS are not all hype.

//Ed

[1] 
http://journal.paul.querna.org/articles/2011/12/18/the-switch-python-to-node-js/


Re: [CODE4LIB] Anyone using node.js?

2012-05-09 Thread Ed Summers
On Wed, May 9, 2012 at 4:50 AM, Berry, Rob robert.be...@liverpool.ac.uk wrote:
 Though re Python I would say mixing Django with Twisted is a fairly blatant 
 error. There are libraries built on Twisted to serve web-pages, and if you're 
 doing event-driven programming you should really be using them.

Heh, but part of your argument for using POE or Twisted was that they
also both have a very large number of excellent quality libraries to
do innumerable other things. I think it's more like a slippery slope
of mixing programming paradigms than it is a blatant error. Also, I
think it was specifically the Django ORM code that bit them hardest,
not HTTP calls. Yes there are ORM options like adbmapper, but I think
you increasingly find yourself in the weeds on the fringe of the
Python community.

//Ed


Re: [CODE4LIB] Anyone using node.js?

2012-05-09 Thread Ed Summers
On Wed, May 9, 2012 at 5:17 AM, Berry, Rob robert.be...@liverpool.ac.uk wrote:
 No, fair enough, you are right. If that's the paradigm you want it would be a 
 better bet to go for a language that has it built in from the ground up.

And (just so it isn't lost) you are absolutely right to question
whether there is a legitimate reason for wanting to do the rewrite :-)

//Ed


Re: [CODE4LIB] Anyone using node.js?

2012-05-08 Thread Ed Summers
I've been using NodeJS in a few side projects lately, and have come to
like it quite a bit for certain types of applications: specifically
applications that need to do a lot of I/O in memory constrained
environments. A recent one is Wikitweets [1] which provides a real
time view of tweets on Twitter that reference Wikipedia. Similarly
Wikistream [2] monitors ~30 Wikimedia IRC channels for information
about Wikipedia articles being edited and publishes them to the Web.

For both these apps the socket.io library for NodeJS provided a really
nice abstraction for streaming data from the server to the client
using a variety of mechanisms: web sockets, flash socket, long
polling, JSONP polling, etc. NodeJS' event driven programming model
made it easy to listen to the Twitter stream, or the ~30 IRC channels,
while simultaneously holding open socket connections to browsers to
push updates to--all from within one process. Doing this sort of thing
in a more typical web application stack like Apache or Tomcat can get
very expensive where each client connection is a new thread or
process--which can lead to lots of memory being used.

If you've done any JavaScript programming in the browser, it will seem
familiar, because of the extensive use of callbacks. This can take
some getting used to, but it can be a real win in some cases,
especially in applications that are more I/O bound than CPU bound.
Ryan Dahl (the creator of NodeJS) gave a presentation [4] to a PHP
group last year which does a really nice job of describing how NodeJS
is different, and why it might be useful for you. If you are new to
event driven programming I wouldn't underestimate how much time you
might spend feeling like you are turning your brain inside out.

In general I was really pleased with the library support in NodeJS,
and the amount of activity there is in the community. The ability to
run the same code on the server as in the browser might be of some
interest. Also, being able use libraries like jQuery or PhantomJS in
command line programs is pretty interesting for things like screen
scraping the tagsoup HTML that is so prevalent on the Web.

If you end up needing to do RDF and XML processing from within NodeJS
and you aren't finding good library support you might want to find
databases (Sesame, eXist, etc) that have good HTTP APIs and use
something like request [5] if there isn't already support for it. I
wrote up why NodeJS was fun to use for Wikistream on my blog if you
are interested [6].

I recommend you try doing something small to get your feet wet with
NodeJS first before diving in with the rewrite. Good luck!

//Ed

[1] http://wikitweets.herokuapp.com
[2] http://wikistream.inkdroid.org
[3] http://inkdroid.org/journal/2011/11/07/an-ode-to-node/
[4] http://www.youtube.com/watch?v=jo_B4LTHi3I
[5] https://github.com/mikeal/request
[6] http://inkdroid.org/journal/2011/11/07/an-ode-to-node/

On Tue, May 8, 2012 at 5:24 PM, Randy Fischer randy.fisc...@gmail.com wrote:
 On Mon, May 7, 2012 at 11:17 PM, Ethan Gruber ewg4x...@gmail.com wrote:



 It was recently suggested to me that a project I am working on may adopt
 node.js for its architecture (well, be completely re-written for node.js).
 I don't know anything about node.js, and have only heard of it in some
 passing discussions on the list.  I'd like to know if anyone on code4lib
 has experience developing in this platform, and what their thoughts are on
 it, positive or negative.



  It's a very interesting project - I think of it as a kind of non-preemptive
 multitasking framework, very much like POE in the Perl world, but with a
 more elegant way of managing the event queue.

 Where it could shine is that it accepts streaming, non-blocking HTTP
 requests.  So for large PUTs and POSTs, it could be a real win (most other
 web-server arrangements are going to require completed uploads of the
 request, followed by a hand-off to your framework of an opened file
 descriptor to a temporary file).

 My naive tests with it a year or so ago gave inconsistent results, though
 (sometime the checksums of large PUTs were right, sometimes not).

 And of course to scale up, do SSL, etc, you'll really need to put something
 like Apache in front of it - then you lose the streaming capability.  (I'd
 love to hear I'm wrong here).


 -Randy Fischer


[CODE4LIB] code4lib journal site statistics

2012-04-16 Thread Ed Summers
Just a quick note to let you know that site statistics for Code4lib
Journal [1] are going to be emailed regularly to the c4lj-discuss
Google Group [2]. The stats are provided as CSV attachments from
Google Analytics, which include page views, visitors and traffic
sources.

If you have any suggestions/ideas please let us know at
jour...@code4lib.org or on c4lj-discuss. Thanks to Jason Ronallo for
the idea to do this.

//Ed

[1] http://journal.code4lib.org
[2] https://groups.google.com/d/msg/c4lj-discuss/J-kqRtyrcnM/WYxLbw9YncUJ


Re: [CODE4LIB] Author authority records to create publication feed?

2012-04-16 Thread Ed Summers
Two other projects that are worth taking a look at are VIVO [1] and
BibApp [2]. Both take the approach of enabling institutions to manage
information about their faculty, which can then be federated more
widely. I guess the reality is that there will be lots of identifiers
for faculty, and simple systems that allow them to be collaboratively
and meaningfully linked together are a good way forward.

//Ed

[1] http://vivoweb.org/
[2] http://bibapp.org/

On Fri, Apr 13, 2012 at 1:03 PM, Paul Butler (pbutler3)
pbutl...@umw.edu wrote:
 Thank you all for your suggestions! Kevin's excellent email confirms my 
 suspicions.

 I am working on plans to transform our digital repository to a more broadly 
 defined IR, so that will likely be our focus down the road.  However, any 
 solution that requires faculty input without an immediate, tangible benefit 
 will likely gain slow traction.

 I will pass along the suggestions and go from there.

 Cheers, Paul
 +-+-+-+-+-+-+-+-+-+-+-+-+
 Paul R Butler
 Assistant Systems Librarian
 Simpson Library
 University of Mary Washington
 1801 College Avenue
 Fredericksburg, VA 22401
 540.654.1756
 libraries.umw.edu

 Sent from the mighty Dell Vostro 230.


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ford, 
 Kevin
 Sent: Friday, April 13, 2012 10:50 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Author authority records to create publication feed?

 Hi Paul,

 I can't really offer any suggestions but to say that this is a problem area 
 presently.  In fact, there was a recent workshop, held in connection with the 
 Spring CNI Membership Meeting, designed specifically to look at this problem 
 (and author identity management more generally).  You can read more about it 
 from the announcement here [1], but the idea was to bring a number of the 
 larger actors (Web of Science, arXiv, ORCID, ISNI, VIAF, LC/NACO, and a few 
 more) involved in managing authorial identity together to learn about the 
 work being done, and to discuss improved ways, to disambiguate scholarly 
 identities and then diffuse and share that information within and across the 
 library and scholarly publishing realms.  Clifford Lynch, who moderated the 
 meeting, will publish a post-workshop report in a few weeks [2].  Perhaps of 
 additional interest, [2] also contains a link to the report of a similar 
 workshop held in London about international author identity.

 Initiatives like ISNI [3] and ORCID [4], which mint identifiers for (public, 
 authorial) identities, and VIAF, which has done so much to aggregate the 
 authority records of the participating libraries (while also assigning them 
 an identifier), are essential to disambiguating one identity from another and 
 assigning unique identifiers to those identities.  For identifiers like 
 ORCIDs, the faculty member's sponsoring organization might acquire the ORCID 
 for him/her, after which the faculty member will/may know and use the 
 identifier in situations such as grant applications, publishing, etc. (though 
 it might also be early days for this activity also).   Part of the process, 
 however, is diffusing the identifier across the library and scholarly 
 publishing domains, all the while matching it with the correct identity (and 
 identifer) in another system.  That said, when ISNIs and ORCIDs and, perhaps, 
  VIAF identifiers start to make their way into Web of Science, arXiv, the LC/NACO 
  file, and many other places, we - developers looking to create RSS feeds of author 
  publications across services but without having to deal with same-name 
  problems or variants - might then have the hook we need to generate RSS feeds 
  for author publications from such services as JSTOR, EBSCO, arXiv, Web of 
  Science, etc.

 Alternatively, you'd have to get your faculty members to submit their entire 
 publication history to academia.edu (as Ethan suggested), after which the 
 community would have to request an RSS feed of that history, or an 
 institutional repository (as Chad suggested), but I understand these types of 
 things are an uphill battle with (often busy, underpaid) faculty.

 Cordially,

 Kevin


 [1] http://www.cni.org/news/cni-workshop-scholarly-id/
 [2] https://mail2.cni.org/Lists/CNI-ANNOUNCE/Message/113744.html
 [3] http://www.isni.org/
 [4] http://about.orcid.org/






 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
 Of Paul Butler (pbutler3)
 Sent: Friday, April 13, 2012 9:25 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] Author authority records to create publication feed?

 Howdy All,

 Some folks from across campus just came to my door with this question.
 I am still trying to work through the possibilities and problems, but
 thought others might have encountered something similar.

 They are looking for a way to create a feed (RSS, or anything else
 that might work) for each faculty member on campus to 

Re: [CODE4LIB] Job: Lead Web Developer at Florida State University

2012-04-09 Thread Ed Summers
On Mon, Apr 9, 2012 at 2:02 PM, GORE, EMILY eg...@fsu.edu wrote:
 My apologies to all for the multiple listings, and I did forget to get 
 approval from Roy T. for all of them.  Please forgive!

No worries Emily. If there is a way the jobs.code4lib.org admin
interface can be improved definitely let me know.

//Ed


Re: [CODE4LIB] Job: at ScraperWiki

2012-03-13 Thread Ed Summers
Hi Jodi,

Was there a reason why you included the Pool temperatures, company
registrations, dairy prices … in the job description at:

http://jobs.code4lib.org/job/842

I almost flagged the posting as spam...

//Ed

On Tue, Mar 13, 2012 at 9:03 AM,  j...@code4lib.org wrote:
 Pool temperatures, company registrations, dairy prices …


 ScraperWiki is a Silicon Valley style startup, in Liverpool, UK. We're
 changing the world of open data, and how data science is done together on the
 Internet.


 We're looking for a data scientist who…


 Loves data, and what can be done with it.

 Able to code in Ruby or Python, but willing to learn the other.

 Good at communicating with non-technical people.

 Happy to responsively give our corporate customers what they need.

 Some practical things…


 We're an innovative, funded startup. Things will change lots, as we find how
 our business works. We'd like you to enjoy and help with that.

 Must be willing to either relocate to Liverpool or to commute to our offices
 which are near the University. We might be able to organise working visas.

 To apply - send the following:


 Links to two scrapers that you've made on ScraperWiki, involving a dataset
 that you find interesting for some reason.

 Similarly, a link to a view you've made on ScraperWiki (can be related to the
 two scrapers).

 A link to your resume/CV

 Any questions you have about the job.

 Along to fran...@scraperwiki.com with the word swjob4 in the subject (and yes,

 that means no agencies, unless the candidates do that themselves)


 … Oil wells, marathon results, planning applications





 Brought to you by code4lib jobs: http://jobs.code4lib.org/job/842/


Re: [CODE4LIB] Job: at ScraperWiki

2012-03-13 Thread Ed Summers
Oh I see it's in the job description you got from the ScraperWiki blog post:


http://blog.scraperwiki.com/2012/03/13/job-advert-data-scientist-web-scraper/

On Tue, Mar 13, 2012 at 12:40 PM, Ed Summers e...@pobox.com wrote:
 Hi Jodi,

 Was there a reason why you included the Pool temperatures, company
 registrations, dairy prices … in the job description at:

    http://jobs.code4lib.org/job/842

 I almost flagged the posting as spam...

 //Ed

 On Tue, Mar 13, 2012 at 9:03 AM,  j...@code4lib.org wrote:
 Pool temperatures, company registrations, dairy prices …


 ScraperWiki is a Silicon Valley style startup, in Liverpool, UK. We're
 changing the world of open data, and how data science is done together on the
 Internet.


 We're looking for a data scientist who…


 Loves data, and what can be done with it.

 Able to code in Ruby or Python, but willing to learn the other.

 Good at communicating with non-technical people.

 Happy to responsively give our corporate customers what they need.

 Some practical things…


 We're an innovative, funded startup. Things will change lots, as we find how
 our business works. We'd like you to enjoy and help with that.

 Must be willing to either relocate to Liverpool or to commute to our offices
 which are near the University. We might be able to organise working visas.

 To apply - send the following:


 Links to two scrapers that you've made on ScraperWiki, involving a dataset
 that you find interesting for some reason.

 Similarly, a link to a view you've made on ScraperWiki (can be related to the
 two scrapers).

 A link to your resume/CV

 Any questions you have about the job.

 Along to fran...@scraperwiki.com with the word swjob4 in the subject (and 
 yes,

 that means no agencies, unless the candidates do that themselves)


 … Oil wells, marathon results, planning applications





 Brought to you by code4lib jobs: http://jobs.code4lib.org/job/842/


Re: [CODE4LIB] Job: at ScraperWiki

2012-03-13 Thread Ed Summers
On Tue, Mar 13, 2012 at 12:49 PM, Chad Benjamin Nelson
cnelso...@gsu.edu wrote:
 I think it is just some examples of the weird and interesting data in 
 scraperwiki.

Yeah, I guess it would be kind of pointless spam eh? :-)

//Ed


Re: [CODE4LIB] Q.: MARC8 vs. MARC/Unicode and pymarc and misencoded III records

2012-03-12 Thread Ed Summers
On Fri, Mar 9, 2012 at 12:12 PM, Godmar Back god...@gmail.com wrote:
 Here's my hand ||*(  [1].

||*)

I'm sorry that I was so unhelpful w/ the patches welcome message on
your docfix. You're right, it was antagonistic of me to suggest you
send a patch for something so simple. Plus, it wasn't even accurate,
because I actually wanted a pull request :-)

I've been amazed at how much github can speed fixes getting into the
codebase--even very small ones. Using the machinery of git (fork,
commit, push, pull request, merge) leaves a trail which is extremely
helpful for surfacing who is helping with what at the source code
level. It would be great if the students that you mentioned who are
using pymarc knew that they have the ability to participate at this
level as well.

One of the reasons why we moved pymarc over to github was to enable
more people to more easily maintain the software. I agree that there
are some dusty corners of pymarc that could use some cleanup, and that
character encoding is probably the cruftiest of the cruft. Perhaps
python3 compatibility will be good time to rethink how some of it
works? At any rate, I hope that you will keep helping the project out,
we need it.

//Ed

PS. thanks for being you Mike :-)


Re: [CODE4LIB] Q.: MARC8 vs. MARC/Unicode and pymarc and misencoded III records

2012-03-12 Thread Ed Summers
On Mon, Mar 12, 2012 at 10:14 AM, Godmar Back god...@gmail.com wrote:
 Here's a make-up pull request especially made for you :-)

 https://github.com/edsu/pymarc/pull/25

Merged! :-D

//Ed


Re: [CODE4LIB] Q.: MARC8 vs. MARC/Unicode and pymarc and misencoded III records

2012-03-08 Thread Ed Summers
Hi Terry,

On Thu, Mar 8, 2012 at 2:36 PM, Reese, Terry
terry.re...@oregonstate.edu wrote:
 This is one of the reasons you really can't trust the information found in 
 position 9.  This is one of the reasons why when I wrote MarcEdit, I utilize 
 a mixed process when working with data and determining characterset -- a 
 process that reads this byte and takes the information under advisement, but 
 in the end treats it more as a suggestion and one part of a larger heuristic 
 analysis of the record data to determine whether the information is in UTF8 
 or not.  Fortunately, determining if a set of data is in UTF8 or something 
 else, is a fairly easy process.  Determining the something else is much more 
 difficult, but generally not necessary.

Can you describe in a bit more detail how MARCEdit sniffs the record
to determine the encoding? This has come up enough times w/ pymarc to
make it worth implementing.
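For what it's worth, the easy part Terry mentions -- checking whether a run of bytes is valid UTF-8 -- comes down to a strict decoder that rejects malformed sequences. A sketch of that first step only (not MarcEdit's actual heuristic, which layers more analysis on top):

```javascript
// First step of an encoding heuristic: strictly decode as UTF-8.
// If decoding fails, the data is "something else" (often MARC-8).
// Illustrative sketch only -- not MarcEdit's actual implementation.
function looksLikeUtf8(bytes) {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch (e) {
    return false; // malformed sequence: not UTF-8
  }
}
```

Treating the Leader/09 byte as one vote and this check as another, as Terry describes, avoids trusting records that lie about their own encoding.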

//Ed


Re: [CODE4LIB] code4lib.org back up, along with wiki.code4lib.org and planet.code4lib.org

2012-02-20 Thread Ed Summers
Hoorah, thanks RyanW and RyanO! Striking while the iron is hot, would
it be possible to verify that routine backups are happening for the
drupal and mediawiki databases on code4lib.org?

On Mon, Feb 20, 2012 at 5:07 PM, Wick, Ryan ryan.w...@oregonstate.edu wrote:
 We're back up and running, thanks to Ryan Ordway. Let me know if you notice 
 something that isn't working as expected.

 Ryan Wick
 Information Technology Consultant
  Special Collections & Archives Research Center
 Oregon State University Libraries
 http://osulibrary.oregonstate.edu/specialcollections


[CODE4LIB] code4lib.org

2012-02-16 Thread Ed Summers
I apologize if this has already come up, but has there been any
announcement about the code4lib.org drupal and mediawiki outages at
Oregon State?

//Ed


Re: [CODE4LIB] GetLamp screening at Code4Lib

2012-02-01 Thread Ed Summers
Shoot, I'm just realizing now I'm also double booked for the newcomers
dinner ... was there another option for the Get Lamp showing?

On Tue, Jan 31, 2012 at 5:16 PM, Dongqing Xie d...@fsu.edu wrote:
 Adam Wead aw...@rockhall.org wrote:

Shouldn't be a problem.  As I understand it, the screening is basically 
plugging in a laptop to the TV and watching the movie.

...adam



On Jan 31, 2012, at 4:34 PM, Michael J. Giarlo wrote:

 Just curious: is there a chance that we can arrange for subsequent
 viewings?  I ask because a number of us have late newcomer dinner
 reservations.  Maybe we can run it during the craft beer drink-up,
 too, for instance?

 Not trying to make this complicated.

 -Mike


 On Tue, Jan 31, 2012 at 16:28, Adam Wead aw...@rockhall.org wrote:
 Hi all,

 So far the preferred time for the GetLamp showing is Tuesday at 9 pm.  
 I'll close the Doodle poll tomorrow at 5 EST to give everyone a chance to 
 vote.

 http://doodle.com/p4c32i3b2ybsrkbh

 ...adam



 [http://donations.rockhall.com/Logo_WWR.gif]http://rockhall.com/exhibits/women-who-rock/
 This communication is a confidential and proprietary business 
 communication. It is intended solely for the use of the designated 
 recipient(s). If this communication is received in error, please contact 
 the sender and delete this communication.




Re: [CODE4LIB] OCLC control number access

2012-02-01 Thread Ed Summers
On Tue, Jan 31, 2012 at 3:45 PM, Stuart Spore spore...@nyu.edu wrote:
 If I can be forgiven a possibly naive question, is it possible to quickly
 and freely get a list of all the OCLC control numbers associated with an
 OCLC symbol (your own or someone else's) without resorting to any elaborate
 (& contractual) batchload service or the like?

If it's your own symbol I guess you could dump your OPAC as MARC and
rifle through it with a script? If it's someone else's you could ask
them for a dump of their opac as MARC and rifle through it. I seem to
remember the Worldcat API had some support for giving holdings
information...

That probably wasn't very helpful, since it's probably not going to be
quick but sometimes the obvious answer isn't very obvious.

//Ed
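As a rough sketch of the "rifle through it with a script" step, assuming the records have been dumped to text and carry OCLC numbers in 035 $a with the usual (OCoLC) prefix (the function name and sample data are made up for illustration):

```python
import re

# Matches '(OCoLC)ocm00012345' style values, tolerating the optional
# ocm/ocn prefix and leading zeros.
OCLC_PATTERN = re.compile(r"\(OCoLC\)(?:oc[mn])?0*(\d+)")

def oclc_numbers(text):
    """Pull unique OCLC control numbers out of a text dump of MARC records."""
    return sorted({int(m) for m in OCLC_PATTERN.findall(text)})

dump = "035  $a (OCoLC)ocm00012345\n035  $a (OCoLC)67890\n"
```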


Re: [CODE4LIB] GetLamp screening at Code4Lib

2012-02-01 Thread Ed Summers
On Wed, Feb 1, 2012 at 5:52 AM, Ed Summers e...@pobox.com wrote:
 Shoot, I'm just realizing now I'm also double booked for the newcomers
 dinner ... was there another option for the Get Lamp showing?

Adam reminded me in #code4lib that the newcomers dinner starts at 6
and will likely be over by 9. So I'm not double booked after all.
Maybe some of the newcomer dinners could even segue into watching Get
Lamp if there is interest?

//Ed


Re: [CODE4LIB] Digital Object Viewer

2012-01-31 Thread Ed Summers
If by "digital objects" you mean images, we've been getting a lot of
mileage out of OpenSeaDragon [1] at the Library of Congress. You do
have to pre-generate the Deep Zoom (DZI) files [2], or you can implement
your own server-side tiling code to do it on the fly.

As a space-vs-time trade-off we generate tiles on the fly in
Chronicling America [3], since there are millions of newspaper page
images. But in the World Digital Library [4] we generate DZI files.
Chris Thatcher, one of the developers at LC, has a fork of the Codeplex
repo on GitHub [5], to which we are applying some fixes, since GitHub
is a lot easier to navigate and use than Codeplex.

If you are curious here are some samples of the viewer in action:

http://chroniclingamerica.loc.gov/lccn/sn85066387/1912-01-31/ed-1/seq-1/
http://www.wdl.org/en/item/4106/zoom/#group=1&page=4

//Ed

[1] http://openseadragon.codeplex.com/
[2] https://github.com/openzoom/deepzoom.py
[3] http://chroniclingamerica.loc.gov
[4] http://wdl.org
[5] https://github.com/thatcher/openseadragon
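For context on what either approach has to produce: a Deep Zoom pyramid halves the image at each level down to 1x1, so the tile layout is pure arithmetic. A small sketch of that math (not OpenSeaDragon's or LC's actual code):

```python
import math

def dzi_levels(width, height, tile_size=256):
    """Compute the Deep Zoom pyramid for an image.

    Returns (level, w, h, cols, rows) for each level, from the 1x1
    top of the pyramid down to full resolution -- the layout a DZI
    tiler (pre-generated or on-the-fly) has to serve.
    """
    max_level = math.ceil(math.log2(max(width, height)))
    levels = []
    for level in range(max_level + 1):
        scale = 2 ** (max_level - level)
        w = max(1, math.ceil(width / scale))
        h = max(1, math.ceil(height / scale))
        levels.append((level, w, h,
                       math.ceil(w / tile_size),
                       math.ceil(h / tile_size)))
    return levels
```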


Re: [CODE4LIB] Digital Object Viewer

2012-01-31 Thread Ed Summers
Yes, it's my understanding that OpenSeaDragon is basically a
JavaScript implementation of the OpenZoom flash code...and that they
work on roughly the same DZI files. But my knowledge of OpenZoom is
very limited, so take that with a grain of salt.

//Ed

On Tue, Jan 31, 2012 at 12:07 PM, Raymond Yee raymond@gmail.com wrote:
 Thanks, Ed, for pointing out OpenSeaDragon -- I didn't know about it.
 I've been aware of another similar open source project:

 http://www.openzoom.org/

 that makes use of Flash -- though the openzoom github repo has
 openzoom.js (https://github.com/openzoom/openzoom.js).  I've used the
 Python toolkit of openzoom (https://github.com/openzoom/deepzoom.py) to
 generate tiles.

 -Raymond

 On 1/31/12 8:59 AM, Ed Summers wrote:
 If by "digital objects" you mean images, we've been getting a lot of
 mileage out of OpenSeaDragon [1] at the Library of Congress. You do
 have to pre-generate the Deep Zoom (DZI) files [2], or you can implement
 your own server-side tiling code to do it on the fly.

 As a space-vs-time trade-off we generate tiles on the fly in
 Chronicling America [3], since there are millions of newspaper page
 images. But in the World Digital Library [4] we generate DZI files.
 Chris Thatcher, one of the developers at LC, has a fork of the Codeplex
 repo on GitHub [5], to which we are applying some fixes, since GitHub
 is a lot easier to navigate and use than Codeplex.

 If you are curious here are some samples of the viewer in action:

     http://chroniclingamerica.loc.gov/lccn/sn85066387/1912-01-31/ed-1/seq-1/
     http://www.wdl.org/en/item/4106/zoom/#group=1&page=4

 //Ed

 [1] http://openseadragon.codeplex.com/
 [2] https://github.com/openzoom/deepzoom.py
 [3] http://chroniclingamerica.loc.gov
 [4] http://wdl.org
 [5] https://github.com/thatcher/openseadragon



[CODE4LIB] jobs.code4lib.org

2012-01-31 Thread Ed Summers
(apologies if you already saw this on the code4libcon list)

There were some questions on #code4lib IRC today about
jobs.code4lib.org. Jonathan is right, it is a bit wacky, but
hopefully in a good way. I was going to grab a lightning talk slot at
the conference to talk about it, but here is a brief summary that may help.

jobs.code4lib.org is a Python Django application called shortimer that
is on github [1]. Jobs end up on jobs.code4lib.org via two workflows:

1. posting via email:

- lots of people post job ads to the code4lib mailing list, so
shortimer subscribes to the list and tries to find job postings in the
emails it receives
- if it finds what looks like a job it extracts what metadata it can,
and adds it to its database in a non-published state
- logged in users can curate jobs [2] (clean up job titles, add the
employer, job URL, and any tags that seem relevant) and then hit
publish
- when someone publishes a job it will show up on the homepage [3]
- when someone publishes a job the code4lib twitter account [4] will
tweet the job announcement

2. posting via website

- a logged in user can go to a web form [5] and post a new job
- when they hit publish an email will go to the discussion list, and
will get tweeted

That's pretty much it. Freebase is used as a controlled vocabulary for
tags and employers, which has some benefits in displaying jobs by a
topic like Ruby [6]. It's even possible to get some general trend
reporting [7].

This is a long way of saying: if you have jobs to announce before or
at the conference please feel free to try out jobs.code4lib.org :-) Of
course there is a whole lot of value in a physical board at the
conference and/or a wiki with people that can answer questions in
person though. There's no replacing that...

//Ed

[1] http://github.com/code4lib/shortimer
[2] http://jobs.code4lib.org/curate/
[3] http://jobs.code4lib.org
[4] http://twitter.com/code4lib
[5] http://jobs.code4lib.org/job/new/
[6] http://jobs.code4lib.org/jobs/ruby/
[7] http://jobs.code4lib.org/reports/
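The email-sniffing step in workflow 1 could be as simple as a keyword heuristic over message subjects. This toy version is only an illustration of the idea, not shortimer's actual matching logic:

```python
import re

# Words that suggest a list email is a job posting. The list of
# hints is invented for this sketch.
JOB_HINTS = re.compile(
    r"\b(job|position|vacancy|opening|hiring)\b", re.IGNORECASE)

def looks_like_job(subject):
    """Crude heuristic for spotting job postings in list email subjects."""
    return bool(JOB_HINTS.search(subject))
```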


Re: [CODE4LIB] jobs.code4lib.org

2012-01-31 Thread Ed Summers
I guess it's rarely a good idea to respond to your own post, but I
forgot to add that when a job is published on jobs.code4lib.org it
will show up in the site's Atom feed [1]. The feed should be usable by
your feed reader of choice, and could also be useful if you want to
syndicate the jobs elsewhere.

//Ed

[1] http://jobs.code4lib.org/feed/

PS. It was kind of fun to finally use the "tag" link relation to mark
up the job tags in the feed with Freebase URLs. For example:

<entry>
...
<link rel="tag" title="Unix"
href="http://www.freebase.com/view/en/unix" type="text/html" />
<link rel="tag" title="Unix [JSON]"
href="http://www.freebase.com/experimental/topic/standard/en/unix"
type="application/json" />
<link rel="tag" title="Unix [RDF]"
href="http://rdf.freebase.com/rdf/en.unix" type="application/rdf+xml"
/>
</entry>
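If you want to emit links like these yourself, the stdlib is enough. A sketch using xml.etree (the `tag_link` helper is hypothetical, not shortimer's code):

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)  # serialize Atom as the default namespace

def tag_link(entry, title, href, mime="text/html"):
    """Append an Atom <link rel="tag"> element to an entry."""
    ET.SubElement(entry, f"{{{ATOM}}}link",
                  rel="tag", title=title, href=href, type=mime)

entry = ET.Element(f"{{{ATOM}}}entry")
tag_link(entry, "Unix", "http://www.freebase.com/view/en/unix")
xml = ET.tostring(entry, encoding="unicode")
```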


Re: [CODE4LIB] marc in json

2011-12-02 Thread Ed Summers
Thanks for all the helpful guidance. I'll work on getting the JSON
implementation updated before releasing it.

I don't know if it's of interest, but the Twitter firehose (as delivered
by Gnip) is line-oriented JSON. Each line is a tweet and all its
metadata. This format is handy for doing things like counting the
number of records with a 'wc -l' instead of having to parse the
JSON...which can be expensive when there can be 10M an hour.

//Ed
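A tiny illustration of why the line-oriented layout is handy: records can be counted, sampled, or split without ever parsing the JSON. (The sample tweets are made up.)

```python
import io
import json

tweets = [{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}]

# Write line-oriented JSON: one complete JSON object per line.
buf = io.StringIO()
for tweet in tweets:
    buf.write(json.dumps(tweet) + "\n")

# Counting records is now just counting lines -- the moral
# equivalent of `wc -l`, no JSON parsing needed.
count = buf.getvalue().count("\n")
```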


[CODE4LIB] marc in json

2011-12-01 Thread Ed Summers
Martin Czygan recently added JSON support to pymarc [1]. Before this
gets rolled into a release I was wondering if it might make sense to
bring the implementation in line with Ross Singer's proposed JSON
serialization for MARC [2]. After quickly looking around it seems to
be what got implemented in ruby-marc [3] and PHP's File_MARC [4]. It
also looked like there was a MARC::Record branch [5] for doing
something similar, but I'm not sure if that has been released yet.

It seems like a no-brainer to bring it in line, but I thought I'd ask
since I haven't been following the conversation closely.

//Ed

[1] 
https://github.com/edsu/pymarc/commit/245ea6d7bceaec7215abe788d61a0b34a6cd849e
[2] 
http://dilettantes.code4lib.org/blog/2010/09/a-proposal-to-serialize-marc-in-json/
[3] https://github.com/ruby-marc/ruby-marc/blob/master/lib/marc/record.rb#L227
[4] 
http://pear.php.net/package/File_MARC/docs/latest/File_MARC/File_MARC_Record.html#methodtoJSON
[5] 
http://marcpm.git.sourceforge.net/git/gitweb.cgi?p=marcpm/marcpm;a=shortlog;h=refs/heads/marc-json
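For anyone who hasn't seen the proposal, a record serialized this way looks roughly like the following (the field values are made up; see Ross's post [2] for the authoritative shape):

```python
import json

# Rough shape of the proposed MARC-in-JSON serialization: control
# fields map tag to a string, data fields map tag to indicators plus
# an ordered list of single-subfield objects.
record = {
    "leader": "00714cam a2200205 a 4500",
    "fields": [
        {"001": "12345"},
        {"245": {"ind1": "1", "ind2": "0",
                 "subfields": [{"a": "Anna Karenina"}]}},
    ],
}

serialized = json.dumps(record)
```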


Re: [CODE4LIB] Library News (à la ycombinator's hackernews)

2011-12-01 Thread Ed Summers
On Wed, Nov 30, 2011 at 10:51 PM, Matthew Phillips
mphill...@law.harvard.edu wrote:
 I'm the guy that did the hacking (with help from my coworkers, Jeff and 
 David) to get Hacker News up and running. If you have technical questions 
 about the site, shoot them my way.

Nice work. It's great to see it starting to get used.

 Mark is right, Library News is running the news.arc source from 
 https://github.com/nex3/arc. I had to do a little customization, but the code 
 worked out of the box for me.

 I'm really interested in seeing Library News blossom. If you have input, 
 please share it. I'd also be excited to get a couple of community leaders to 
 become moderators for the site (drop me an email if you want to volunteer 
 yourself/someone).

I noticed that news.arc has some RSS functionality [1]. Does it seem
easy/possible to add a link element for the RSS feeds to the HTML,
e.g.

  <link rel="alternate" type="application/rss+xml" title="Library
News" href="http://news.librarycloud.org/rss" />

//Ed

[1] https://github.com/nex3/arc/blob/master/news.arc#L2239


Re: [CODE4LIB] HTML5 Microdata, schema.org, and digital collections

2011-12-01 Thread Ed Summers
Damn auto-complete :-) Oh well, I guess everyone knows how inept I am now!
On Thu, Dec 1, 2011 at 1:03 PM, Ed Summers e...@pobox.com wrote:
 Excellent! Thanks for working with the situation :-)

 //Ed

 On Thu, Dec 1, 2011 at 9:55 AM, Jason Ronallo jrona...@gmail.com wrote:
 Ed,
 I'd like to still fit the article into the next issue. I agree that
 the cultural heritage community needs more exposure to these new web
 standards. With the increased interest in linked data, the landscape
 of choices for how to expose your data has become more complex, and I
 hope the article can get the discussion going and provide some
 guidance there.

 I also see this as an opportunity for me to get something out there
 relatively early on this topic, and coming before my talk is good
 timing.

 Jason

 On Thu, Dec 1, 2011 at 9:06 AM, Ed Summers e...@pobox.com wrote:
 Hi Jason,

 Let me just say again how bad I feel for dropping this on the floor. I
 feel even more guilty because more discussion about the use of
 html5/microdata in the cultural heritage community is desperately
 needed.

 So is it OK to still try to fit your article into the next issue, or
 should we push it to issue 17?

 //Ed

 On Thu, Dec 1, 2011 at 9:00 AM, Jason Ronallo jrona...@gmail.com wrote:
 Hi, Ed,
 I'm glad to hear from you and the journal. What I had when I submitted
 a proposal to the journal was just a proposal and an implementation,
 so I won't be able to have a draft to you before the end of the month.
 I'll try to share something with you sooner than that, though.

 I'll be happy to license the article US CC-BY and the code as open
 source (hopefully MIT).

 Thank you,

 Jason



 On Thu, Dec 1, 2011 at 3:59 AM, Ed Summers e...@pobox.com wrote:
 Hi Jason,

 I'm pleased to tell you that your recent proposal for an article about
 HTML5 Microdata has been provisionally accepted to the Code4Lib Journal.
 The editorial committee is interested in your proposal, and would like
 to see a draft. I have to apologize however, since through an
 oversight of my own this email should have been sent almost a month
 ago, and was not (more on this below).

 As a member of the Code4Lib Journal editorial committee, I will be
 your contact for this article, and will work with you to get it ready
 for publication.

 We hope to publish your article in issue 16 of the Journal, which is
 scheduled to appear Jan 30, 2012. Incidentally, this is good timing
 for your code4lib talk on the same topic!
 The official deadline for submission
 of a complete draft is Friday, December 2. But since I dropped the
 ball on getting this email out to you promptly I completely understand
 if you can't hit that date. Looking at the deadlines [1] for issue 16
 I can see that the 2nd draft is due Dec 30th, which is perhaps a more
 realistic goal for a draft. Please send whatever you have as soon as
 you can and we can get started. Upon receipt of the draft, I will
 work with you to address any changes recommended by the Editorial
 Committee.  More information about our author guidelines may be found
 at http://journal.code4lib.org/article-guidelines.

 Please note that final drafts must be approved by a vote of the
 Editorial Committee before being published.

 We also require all authors to agree to US CC-BY licensing for the
 articles we publish in the journal.  We recommend that any included
 code also have some type of code-specific open source license (such as
 the GPL).

 We look forward to seeing a complete draft and hope to include it in
 the Journal.  Thank you for submitting to us, and feel free to contact
 me directly with any questions.

 If you could drop me a line acknowledging receipt of this email, that
 would be great.

 //Ed

 [1] http://wiki.code4lib.org/index.php/Code4Lib_Journal_Deadlines


Re: [CODE4LIB] HTML5 Microdata, schema.org, and digital collections

2011-12-01 Thread Ed Summers
Excellent! Thanks for working with the situation :-)

//Ed

On Thu, Dec 1, 2011 at 9:55 AM, Jason Ronallo jrona...@gmail.com wrote:
 Ed,
 I'd like to still fit the article into the next issue. I agree that
 the cultural heritage community needs more exposure to these new web
 standards. With the increased interest in linked data, the landscape
 of choices for how to expose your data has become more complex, and I
 hope the article can get the discussion going and provide some
 guidance there.

 I also see this as an opportunity for me to get something out there
 relatively early on this topic, and coming before my talk is good
 timing.

 Jason

 On Thu, Dec 1, 2011 at 9:06 AM, Ed Summers e...@pobox.com wrote:
 Hi Jason,

 Let me just say again how bad I feel for dropping this on the floor. I
 feel even more guilty because more discussion about the use of
 html5/microdata in the cultural heritage community is desperately
 needed.

 So is it OK to still try to fit your article into the next issue, or
 should we push it to issue 17?

 //Ed

 On Thu, Dec 1, 2011 at 9:00 AM, Jason Ronallo jrona...@gmail.com wrote:
 Hi, Ed,
 I'm glad to hear from you and the journal. What I had when I submitted
 a proposal to the journal was just a proposal and an implementation,
 so I won't be able to have a draft to you before the end of the month.
 I'll try to share something with you sooner than that, though.

 I'll be happy to license the article US CC-BY and the code as open
 source (hopefully MIT).

 Thank you,

 Jason



 On Thu, Dec 1, 2011 at 3:59 AM, Ed Summers e...@pobox.com wrote:
 Hi Jason,

 I'm pleased to tell you that your recent proposal for an article about
 HTML5 Microdata has been provisionally accepted to the Code4Lib Journal.
 The editorial committee is interested in your proposal, and would like
 to see a draft. I have to apologize however, since through an
 oversight of my own this email should have been sent almost a month
 ago, and was not (more on this below).

 As a member of the Code4Lib Journal editorial committee, I will be
 your contact for this article, and will work with you to get it ready
 for publication.

 We hope to publish your article in issue 16 of the Journal, which is
 scheduled to appear Jan 30, 2012. Incidentally, this is good timing
 for your code4lib talk on the same topic!
 The official deadline for submission
 of a complete draft is Friday, December 2. But since I dropped the
 ball on getting this email out to you promptly I completely understand
 if you can't hit that date. Looking at the deadlines [1] for issue 16
 I can see that the 2nd draft is due Dec 30th, which is perhaps a more
 realistic goal for a draft. Please send whatever you have as soon as
 you can and we can get started. Upon receipt of the draft, I will
 work with you to address any changes recommended by the Editorial
 Committee.  More information about our author guidelines may be found
 at http://journal.code4lib.org/article-guidelines.

 Please note that final drafts must be approved by a vote of the
 Editorial Committee before being published.

 We also require all authors to agree to US CC-BY licensing for the
 articles we publish in the journal.  We recommend that any included
 code also have some type of code-specific open source license (such as
 the GPL).

 We look forward to seeing a complete draft and hope to include it in
 the Journal.  Thank you for submitting to us, and feel free to contact
 me directly with any questions.

 If you could drop me a line acknowledging receipt of this email, that
 would be great.

 //Ed

 [1] http://wiki.code4lib.org/index.php/Code4Lib_Journal_Deadlines


[CODE4LIB] vivosearchlight

2011-11-20 Thread Ed Summers
On Tue, Nov 1, 2011 at 7:44 AM, John Fereira ja...@cornell.edu wrote:
 If you want to see what node.js can do to implement a search mechanism take a 
 look something one of my colleagues developed.  http://vivosearchlight.org

 It installs a bookmarklet in your browser (takes about 5 seconds) that will 
 initiate a search against a solr index that contains user profile information 
 from several institutions using VIVO (a semantic web application).  From any 
 web page, clicking on the Vivo Searchlight button in your browser will 
 initiate a search and find experts with expertise relevant to the content of 
 the page.  Highlight some text on the page and it will re-execute a search 
 with just those words.

Thanks for sharing, John. That's really a neat idea, even if the
results don't seem particularly relevant for some tests I tried. I was
curious how it does the matching of page text against the profiles. I
see from the description at http://vivosearchlight.org that
ElasticSearch is being used instead of Solr. Any chance Miles
Worthington (ok I googled) would be willing to share the source code
on his github account [1], or elsewhere?

//Ed

[1] https://github.com/milesworthington


Re: [CODE4LIB] Life and Literature Code Challenge

2011-09-01 Thread Ed Summers
On Wed, Aug 31, 2011 at 3:38 PM, John Mignault j...@mignault.net wrote:
 Through local and global digitization efforts, BHL has digitized over
 32 million pages of taxonomic literature, representing over 45,000
 titles and 87,000 volumes (January 2011). The entire -corpus- dataset
 is freely available and accessible via many open methods.

<shamelessSelfPromotion>Incidentally, there are 1,440 links from 952
Wikipedia articles to the BHL [1].</shamelessSelfPromotion>

//Ed

[1] http://linkypedia.inkdroid.org/websites/34/


Re: [CODE4LIB] ruby-zoom port to 1.9.2

2011-09-01 Thread Ed Summers
Brice,

Do you have a rubyforge account/email that I can use when requesting
that you are added as an admin? I can't seem to get the `gem owner`
command to respect my authoritay...

//Ed

On Tue, Aug 30, 2011 at 6:12 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
 If you're unable to contact the original authors, you can contact the folks 
 who maintain rubygems.org, and ask them to give you the rights to release a 
 new version of the gem from (and pointing to) your repo, and effectively take 
 over the gem.

 Alternately, in your fork you link to, you should update the instructions to 
 make it clear that to install this fork, gem install isn't going to do it!

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Brice 
 Stacey
 Sent: Thursday, August 25, 2011 1:25 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] ruby-zoom port to 1.9.2

 FYI -



 I've finished porting ruby-zoom to 1.9.2, including the extended
 services. Repo is here: https://github.com/bricestacey/ruby-zoom



 I reached out to the original authors and haven't gotten a response, so
 it looks like it might never be integrated into the original project. If
 anyone has any ideas on how I might get these changes into it, please
 let me know.



 Brice Stacey



 From: Brice Stacey
 Sent: Tuesday, August 09, 2011 11:08 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: ruby-zoom port to 1.9.2



 Hi -



 I'd just like to let everyone know I did some work yesterday on
 ruby-zoom to port most of the code to 1.9.2. All of the standard z39.50
 features are ported. The only feature left is the packages, which allow
 for the extended services.



 I'd appreciate anyone that uses it to provide feedback.



 The git repository can be found here:
 https://github.com/bricestacey/ruby-zoom



 Installation:

 Install YAZ

 Clone the repo

 Run rake clean build package

 Gem install pkg/zoom-0.4.1



 If anyone has experience working with C and/or YAZ and would like to
 help finish this port off, I'd greatly appreciate it. Otherwise, I'll
 probably just drop the support entirely from my fork (since I won't need
 it going forward anyway). I've also contacted the original authors,
 hopefully they follow-up.



 Brice Stacey

 Digital Library Services

 University of Massachusetts Boston

 brice.sta...@umb.edu

 617-287-5921





Re: [CODE4LIB] ruby-zoom port to 1.9.2

2011-09-01 Thread Ed Summers
I opened a ticket with rubygems folks:

http://help.rubygems.org/discussions/problems/720-ruby-zoom-ownership

Maybe that will help get you the ability to maintain this module.
Thanks jrochkind for the help in #code4lib channel...

//Ed


Re: [CODE4LIB] OPDS 1.1 review period

2011-07-28 Thread Ed Summers
edsu--

Except that it was from a month ago and the review period is over. Oh
well, I guess the v1.1 of opds might be of interest still ... it is to
me at least.

/me slowly inches towards the door

//Ed

On Thu, Jul 28, 2011 at 4:04 PM, Ed Summers e...@pobox.com wrote:
 This might be of potential interest to code4lib folks who deal w/
 ebooks ... //Ed

 -- Forwarded message --
 From: Hadrien Gardeur hadrien.gard...@feedbooks.com
 Date: Sun, Jun 19, 2011 at 12:47 PM
 Subject: OPDS 1.1 review period
 To: atom-syn...@imc.org

 Hello,

 The OPDS community just posted the final draft for OPDS 1.1:
 http://opds-spec.org/2011/06/15/opds-1-1-call-for-comments/
 During this two week review period, we're actively looking for any
 kind of feedback about the spec.

 OPDS is based on Atom and is widely used by book retailers & libraries
 to distribute electronic publications on any device. Some of the most
 popular ebook applications on iOS (Stanza, Bluefire Reader) and on
 Android (Aldiko, FBReader) are compatible with this standard.

 Hadrien



Re: [CODE4LIB] RDF for opening times/hours?

2011-06-08 Thread Ed Summers
On Wed, Jun 8, 2011 at 10:00 AM, Simon Spero s...@unc.edu wrote:
 [cue edsu ]

And people wonder why Google/Yahoo/Bing chose to favor html5 microdata
on schema.org :-)

//Ed


Re: [CODE4LIB] wikipedia/author disambiguation

2011-05-31 Thread Ed Summers
On Tue, May 31, 2011 at 11:55 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 The LCCN one does not work. Tries to take me to:
 http://errol.oclc.org/laf/n79021614.html

 Which results in an HTTP 500 error from the OCLC server.

 Since this template apparently generates a URL to an OCLC service (rather
 than LC? I guess maybe LC itself doesn't have the right permalinks?), I
 think that OCLC probably ought to fix this. If the template is not creating
 the right URL, I guess you've got to work with wikipedia to fix it. Or fix
 your end to accept those URLs properly.

As far as I know there aren't any permalinks for name authority
records at loc.gov that use the LCCN. I've heard informally from some
folks at OCLC that they plan to redirect these links to a URL at
loc.gov if/when the name authority records are available from there.
But I have no idea when that will happen unfortunately.

//Ed


Re: [CODE4LIB] wikipedia/author disambiguation

2011-05-31 Thread Ed Summers
a bit of a Freudian slip there, I suppose :-)

s/could/couldn't/

//Ed

On Tue, May 31, 2011 at 3:17 PM, Ed Summers e...@pobox.com wrote:
 On Tue, May 31, 2011 at 12:48 PM, Thomas Berger t...@gymel.com wrote:
 Currently about 150.000 articles on wikipedia.de carry the associated
 PND number, many of them also LoC-NA and VIAF numbers:

 Makes me wonder if we could use inter-wiki links to automatically
 update some of the en.wikipedia articles based on the viaf links in
 de.wikipedia. Could hurt to see how many there are I suppose.

 //Ed



Re: [CODE4LIB] wikipedia/author disambiguation

2011-05-31 Thread Ed Summers
On Tue, May 31, 2011 at 12:48 PM, Thomas Berger t...@gymel.com wrote:
 Currently about 150.000 articles on wikipedia.de carry the associated
 PND number, many of them also LoC-NA and VIAF numbers:

Makes me wonder if we could use inter-wiki links to automatically
update some of the en.wikipedia articles based on the viaf links in
de.wikipedia. Could hurt to see how many there are I suppose.

//Ed


Re: [CODE4LIB] Adding VIAF links to Wikipedia

2011-05-27 Thread Ed Summers
On Thu, May 26, 2011 at 2:49 PM, Mark A. Matienzo m...@matienzo.org wrote:
 I would suggest engaging with the GLAM-WIKI project within Wikimedia
 http://outreach.wikimedia.org/wiki/GLAM.  They held an unconference
 recently in New York http://meta.wikimedia.org/wiki/GLAMcamp_NYC.

I second Mark's advice. The GLAM-WIKI folks are actively seeking
participation in Wikipedia from libraries. I would recommend you
specifically talk to Liam Wyatt and Katie Filbert (cc'd here) about
the ins and outs of library bots on Wikipedia. They helped organize
the recent event in NYC. Liam was the Wikipedian in Residence at the
British Museum.

//Ed

PS. some context for Liam and Katie: folks from OCLC are asking about
policies regarding bots adding links to viaf.org for name authorities,
similar to what Wikimedia-Germany have done. Full thread here:
http://www.mail-archive.com/code4lib@listserv.nd.edu/msg10562.html


Re: [CODE4LIB] Adding VIAF links to Wikipedia

2011-05-27 Thread Ed Summers
On Thu, May 26, 2011 at 2:01 PM, Ralph LeVan ralphle...@gmail.com wrote:
 OCLC Research would desperately love to add VIAF links to Wikipedia
 articles, but it seems to be very difficult.  The OpenLibrary folks tried to
 do it a while back and ended up getting their plans severely curtailed.  The
 discussion at Wikipedia is captured here:
 http://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/OpenlibraryBot

Ralph if you read that entire discussion it sounds like the bot was
approved. Am I missing something?

//Ed


Re: [CODE4LIB] wikipedia/author disambiguation

2011-05-26 Thread Ed Summers
It's the server unfortunately. I think OCLC is trying to figure out
what to do with errol ... there's a thread on the wc-devnet-l if you
are interested:


http://listserv.oclc.org/scripts/wa.exe?A2=ind1105d&L=wc-devnet-l&T=0&F=P&X=4D30895CB90D4C912F&P=73

//Ed

On Thu, May 26, 2011 at 5:15 PM, Graham Seaman gra...@theseamans.net wrote:
 The lccn links from the template have been giving a java exception for
 the last few days at least: does the template or the server need fixing?


Re: [CODE4LIB] wikipedia/author disambiguation

2011-05-25 Thread Ed Summers
The user profile pages that reference the website should eventually (1
or 2 days) turn up under the Users tab, e.g.

http://linkypedia.inkdroid.org/websites/23/users/

I don't see you there yet though :-)

//Ed

On Wed, May 25, 2011 at 5:03 PM, Karen Coyle li...@kcoyle.net wrote:
 Hi, Ed. Do you pick up user pages or just wikipedia entry pages? (I added
 mine to my user page, just for fun.)

 kc


Re: [CODE4LIB] wikipedia/author disambiguation

2011-05-24 Thread Ed Summers
Big +1 for promoting the use of the Authority Control Wikipedia
template. I know I'm being a bit of a broken record, but you can watch
as people add these by looking at or subscribing to:

http://linkypedia.inkdroid.org/websites/23/pages/

Also, re: Jonathan's good advice to check out Wikipedia Miner [1] I
just ran across Duke [2] today, which looks like it could help guide
record linking a bit.


Duke is a fast and flexible deduplication (or entity resolution, or
record linkage) engine written in Java on top of Lucene. At the moment
(2011-04-07) it can process 1,000,000 records in 11 minutes on a
standard laptop in a single thread.


Haven't tried it yet, so YMMV, etc.

//Ed

[1] http://wikipedia-miner.sourceforge.net/
[2] http://code.google.com/p/duke/
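To make the record-linkage idea concrete, here is a toy compare-and-threshold matcher on name tokens. It is far cruder than what Duke does, but shows the basic shape of a deduplication comparison:

```python
import re

def tokens(name):
    """Lowercased word tokens of a name string, punctuation ignored."""
    return set(re.findall(r"\w+", name.lower()))

def jaccard(a, b):
    """Jaccard similarity between the token sets of two name strings."""
    sa, sb = tokens(a), tokens(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def looks_like_match(a, b, threshold=0.5):
    """Toy record-linkage decision: similar enough to call a duplicate."""
    return jaccard(a, b) >= threshold
```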


Re: [CODE4LIB] linking catalog records to IMDB

2011-04-27 Thread Ed Summers
On Wed, Apr 27, 2011 at 12:14 PM, Roy Tennant roytenn...@gmail.com wrote:
 For what it's worth, I see over 7,000 links to IMDB from WorldCat records.

Sounds like a good excuse to use <yourFavoriteProgrammingLanguage> to
rip through the 20k DVD records, look them up via the WorldCat API,
see if there's an IMDB URL, and add it back into the record if you find
one.

Oh, and report back here with what you find :-)

//Ed
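The "rip through the records" step might look like this, assuming a text dump of the MARC with IMDb links in 856 $u (the regex and function name are illustrative; the WorldCat API lookup is left out):

```python
import re

# IMDb title URLs carry a stable ttNNNNNNN identifier.
IMDB_URL = re.compile(r"https?://(?:www\.)?imdb\.com/title/(tt\d+)")

def imdb_ids(marc_text):
    """Pull unique IMDb title identifiers out of a text dump of
    MARC records (e.g. 856 $u values)."""
    return sorted(set(IMDB_URL.findall(marc_text)))

dump = "856 40 $u http://www.imdb.com/title/tt0111161/\n"
```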


Re: [CODE4LIB] What do you wish you had time to learn?

2011-04-26 Thread Ed Summers
Fun question, my list:

- data mining (the algorithms, the tools, etc)
- go (the programming language)
- hadoop

Not necessarily inter-related mind you :-)

//Ed

On Tue, Apr 26, 2011 at 8:30 AM, Edward Iglesias
edwardigles...@gmail.com wrote:
 Hello All,

 I am doing a presentation at RILA (Rhode Island Library Association) on
 changing skill sets for Systems Librarians.  I did a formal survey a while
 back (if you participated, thank you) but this stuff changes so quickly I
 thought I would ask this another way.  What do you wish you had time to
 learn?

 My list includes


 CouchDB(NoSQL in general)
 neo4j
 nodejs
 prototype
 API Mashups
 R

 Don't be afraid to include Latin or Greek History.  I'm just going for a
 snapshot of System angst at not knowing everything.

 Thanks,


 ~
 Edward Iglesias
 Systems Librarian
 Central Connecticut State University



Re: [CODE4LIB] Planned changes to the VIAF RDF

2011-04-12 Thread Ed Summers
On Tue, Apr 12, 2011 at 1:49 PM, Young,Jeff (OR) jyo...@oclc.org wrote:
 The only VIAF contributors we're aware of today that publish their own 
 authority Linked Data are Deutsche Nationalbibliothek, National Library of 
 Sweden, and the National Széchényi Library (Hungary).

Let's hope the trend continues :-)

//Ed


Re: [CODE4LIB] FW: VIAF linked data and non-Latin searching

2011-04-11 Thread Ed Summers
Nice, Jeff. I really like the simplified VIAF RDF. In particular I
like how you've modeled the deprecation of resources. Are you planning
to use a 301, e.g. http://viaf.org/viaf/77390479/ ->
http://viaf.org/viaf/77390479 ?
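
While waiting on a server-side 301, a client could normalize the
deprecated trailing-slash form itself. A minimal sketch; the
strip-the-slash rule is just inferred from the pair of URIs above:

```python
def canonical_viaf_uri(uri):
    """Map a deprecated trailing-slash VIAF URI to its canonical form."""
    return uri.rstrip("/")

print(canonical_viaf_uri("http://viaf.org/viaf/77390479/"))
# A client could also confirm the server behavior by issuing a request
# without following redirects and checking for a 301 plus Location header.
```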

//Ed

On Mon, Apr 11, 2011 at 3:18 PM, Young,Jeff (OR) jyo...@oclc.org wrote:
 Here is some information about pending updates to the VIAF Linked Data.



 I'm working on before/after diagrams to better explain the differences
 and will share them soon. Questions and comments are welcome.



 Jeff



 From: Hickey,Thom
 Sent: Monday, April 11, 2011 12:57 PM
 To: v...@listserv.log.gov
 Cc: Young,Jeff (OR)
 Subject: VIAF linked data and non-Latin searching



 Non-Latin searching:



 We believe we have resolved a recurring issue with non-Latin searching
 failing (it had to do with restarting VIAF in different environments).
 If anyone still has issues with this, please let us know.



 Linked Data:



 We have taken another look at the RDF generated for linked data.  The
 attached files show a personal, corporate and geographic (there are few
 pure geographic records in VIAF as of yet, but a mixed record such as
 Missouri's may be identified as geographic) record rendered in RDF.



 We think the new records are both simpler and easier to understand and
 use.  The biggest difference is that we have eliminated the
 viaf:NameAuthorityCluster that acted as a record hub. Formerly, this
 record hub was responsible for linking to the separately identified
 primary entity. In the new record structure, contributed authorities
 bypass this record hub and link directly to the primary entity
 themselves. The description of the primary entity appears first in the
 record inside an rdf:Description element followed by skos:Concept
 entries, one for each source file, each of which links back to the
 primary entity via foaf:focus.



 We have included some deprecated identifiers matching those used in
 previous RDF, which may help those processing it as linked data.  For
 those simply parsing it as XML and pulling information out of it, we
 have switched to fully qualified URIs which should make that easier.



 We will probably phase the new RDF in over the next two months.  This
 month we will generate both for those getting full dumps of VIAF, then
 next month switch both the online and offline versions to the new
 format.



 For those with suggestions about the new format, this would be an ideal
 time to let us know.  If we stay with the schedule outlined above we
 have until mid to late May before the new formats are in production.



 --Th
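
The record shape Thom describes (a primary rdf:Description followed by
per-source skos:Concept entries that link back via foaf:focus) can be
walked with nothing but the standard library. The sample record below
is invented for illustration, not real VIAF output:

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Invented toy record mirroring the described layout: one primary
# entity, then one skos:Concept per source file pointing back at it.
sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#"
  xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <rdf:Description rdf:about="http://viaf.org/viaf/77390479"/>
  <skos:Concept rdf:about="http://example.org/source/a">
    <foaf:focus rdf:resource="http://viaf.org/viaf/77390479"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/source/b">
    <foaf:focus rdf:resource="http://viaf.org/viaf/77390479"/>
  </skos:Concept>
</rdf:RDF>"""

def focus_targets(xml_text):
    """List the foaf:focus target of every skos:Concept in the record."""
    root = ET.fromstring(xml_text)
    return [
        focus.get(f"{{{RDF}}}resource")
        for concept in root.findall(f"{{{SKOS}}}Concept")
        for focus in concept.findall(f"{{{FOAF}}}focus")
    ]

print(focus_targets(sample))
```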




Re: [CODE4LIB] Documentation request for the marc gem

2011-03-16 Thread Ed Summers
Hi Tony,

Just in case it wasn't obvious, the source code is on GitHub [1]. As
Ross said, please consider forking it and sending a pull request for
any documentation improvements you want to do.

//Ed

[1] https://github.com/ruby-marc/ruby-marc

On Tue, Mar 15, 2011 at 3:18 PM, Ross Singer rossfsin...@gmail.com wrote:
 Hi Tony, I'm glad that ruby-marc appears to be generally useful.

 Another (even simpler) way to do what you want is:

 record.to_marc

 Which, I think, would do the same thing you're doing with MARC::Writer.encode.

 If you want to write up a block of text to plop into the README, feel
 free to send me some copy (wholesale edits also welcome).

 Thanks,
 -Ross.

 On Tue, Mar 15, 2011 at 2:40 PM, Tony Zanella tony.zane...@gmail.com wrote:
 Hello all,
 If I may suggest adding to the documentation for the marc gem
 (http://marc.rubyforge.org/)...

 Currently, the documentation gives examples for how to read, create and write
 MARC records.

 The source code also includes an encode method in MARC::Writer, which came
 in handy for me when I needed to send an encoded record off to be archived on
 the fly, without writing it to the filesystem.

 That method isn't in the documentation, but it would be nice to see there! It
 could be as simple as:

 # encoding a record
 MARC::Writer.encode(record)

 Thanks for your consideration!
 Tony




Re: [CODE4LIB] dealing with Summon

2011-03-02 Thread Ed Summers
On Wed, Mar 2, 2011 at 11:38 AM, Godmar Back god...@gmail.com wrote:
 Like I said at the beginning of this thread, this is only tangentially
 a Code4Lib issue, and certainly the details aren't.  But perhaps the
 general problem is (?)

More than anything this seems like a documentation issue. From my seat
in the peanut gallery it seems like Godmar should be able to answer
these sorts of questions by looking at the Summon Search API
Documentation [1] for responses (which is quite nice btw).

Oh, and I think it's great to see this thread on code4lib, where other
people have been known to create an API or three. So thanks Godmar,
for asking here...

//Ed

[1] http://api.summon.serialssolutions.com/help/api/search/response


Re: [CODE4LIB] Low Cost Digitization of Manuscript Collections

2011-03-01 Thread Ed Summers
Hi Jody,

Thanks for sending along this information about Cabaniss. I'd be
curious to hear how your per-page costs compare with other projects,
such as Oregon State [1] (which I just wandered across in Google).

The notes from your project wiki [2] are really interesting. In
particular the details about linking from the EAD documents to the
item views using the PURLs struck my eye [3]. Did you have a PURL
server already set up at your institution, or is this something you
did as part of this project? Was there a real advantage to doing that
instead of thoughtfully managing a URL namespace with Cool URLs [4]. I
know I'm biased, but it sure was nice to see URLs in use instead of
Handles :-)

I haven't done EAD work in a while, and was wondering what the ns2
namespace is in the linking example on the wiki, e.g.

<dao id="u0003_252_002" ns2:title="u0003_252_002"
ns2:href="http://purl.lib.ua.edu/148" ns2:actuate="onRequest"
ns2:show="new"/>

Last of all I was curious about the EAD viewing software you are
developing to stand in for Acumen. Is this work still underway?

Sorry for all the questions. I guess that's what you get for doing
interesting stuff :-)

//Ed

[1] 
http://wiki.library.oregonstate.edu/confluence/pages/viewpage.action?pageId=19327
[2] http://www.lib.ua.edu/wiki/digcoll/
[3] http://www.lib.ua.edu/wiki/digcoll/index.php/Scripted_Links_in_EADs
[4] http://www.w3.org/Provider/Style/URI.html

On Tue, Mar 1, 2011 at 9:03 PM, Jody DeRidder j...@jodyderidder.com wrote:
 (Apologies for cross posting)

 For Immediate Release
 Contact Person:  Jody L. DeRidder
 Email: jlderid...@ua.edu
 Phone: (205) 348-0511

 Completed UA Libraries Grant Project Provides Model for Low-Cost
 Digitization of Cultural Heritage Materials

 The University of Alabama Libraries has completed a grant project which
 demonstrates a model of low-cost digitization and web delivery of
 manuscript materials.  Funded by the National Archives and Records
 Administration (NARA) National Historical Publications and Records
 Commission (NHPRC), the project digitized a large and nationally important
 manuscript collection related to the emancipation of slaves:  the Septimus
 D. Cabaniss Papers.  This digitization grant (NAR10-RD-10033-10) extended
 for 14 months (ended February 2011), and has provided online access to
 46,663 images for less than $1.50 per page:
 http://acumen.lib.ua.edu/u0003_252.

 The model is designed to enable institutions to mass-digitize manuscript
 collections at a minimal cost, leveraging the extensive series
 descriptions already available in the collection finding aid to provide
 search and retrieval.  Digitized content for the collection is linked from
 the finding aid, providing online access to 31.8 linear feet of valuable
 archival material that otherwise would never be web-available.  We have
 developed software and workflows to support the process and web delivery
 of material regardless of the current method of finding aid access.  More
 information is available on the grant website:
 http://www.lib.ua.edu/libraries/hoole/cabaniss .

 The Septimus D. Cabaniss Collection (1815-1889) was selected as exemplary
 of the legal difficulties encountered in efforts to emancipate slaves in
 the Deep South. Cabaniss was a prominent southern attorney who served as
 executor for the estate of the wealthy Samuel Townsend, who sought to
 manumit and leave property to a selection of his slaves, many of whom were
 his children.  Samuel Townsend’s open admission to fathering slave
 children and his willingness to take responsibility for their care,
 combined with the letters from the former slaves themselves, dated before
 and after the Civil War, will inform social and racial historians. Legal
 scholars will be enlightened by Cabaniss' detailing of the sophisticated
 legal mechanism of using a trust to free slaves. Valuable collections such
 as this have a promise of open access via the web when the cost of
 digitization is lowered by avoiding item-level description.

 Usability testing was included in the grant project, and preliminary
 results indicate that this method of web delivery is as learnable for
 novices as access to the digitized materials via item-level descriptions.
 In addition, provision of web delivery of manuscript content via the
 finding aid provides the much-needed context preferred by experienced
 researchers.




 Jody DeRidder
 Digital Services
 University of Alabama Libraries
 Tuscaloosa, Alabama 35487
 (205) 348-0511
 j...@jodyderidder.com
 jlderid...@ua.edu



Re: [CODE4LIB] graphML of a social network in archival context

2011-02-18 Thread Ed Summers
Hi Brian,

It is *awesome* to see the SNAC data being released with an open
license--and it's also really interesting to see the code for loading
it into neo4j. How have you been liking neo4j so far? Is the neo4j
graph database something that you have been using in SNAC? Have you
been interacting with it mainly via gremlin, the REST API, and/or
Java?

Just as an aside, I noticed that there are 66 edges that lack labels,
and 8332 'associateWith' labels that probably should be
'associatedWith'? I'm also kind of curious to hear more about what
'associatedWith' means, is that something from EAC? I noticed that it
can connect people, corporate bodies and families.

ed@curry:~/Datasets/eac/eac-graph-load-data-2011-02$ grep edge
graph-snac-example.xml | perl -ne '/label="(.+)"/; print "$1\n";' |
sort | uniq -c | sort -n
   66
   8332 associateWith
  99907 correspondedWith
 382855 associatedWith
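
The same tally can be done in a few lines of Python; as with the shell
pipeline, this assumes edges carry a literal label="..." attribute
rather than nested <data> elements, and the toy input is invented:

```python
import re
from collections import Counter

def edge_label_counts(graphml_text):
    """Count the label attribute on each <edge>, like the grep/perl pipeline."""
    labels = []
    for edge in re.findall(r"<edge[^>]*>", graphml_text):
        m = re.search(r'label="([^"]*)"', edge)
        # Unlabeled edges are counted under the empty string.
        labels.append(m.group(1) if m else "")
    return Counter(labels)

sample = (
    '<edge source="a" target="b" label="correspondedWith"/>'
    '<edge source="b" target="c" label="associatedWith"/>'
    '<edge source="a" target="c" label="associatedWith"/>'
)
print(edge_label_counts(sample))
```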

Thanks for sending this update! Sorry for all the questions, but this
is cool stuff.

//Ed

On Thu, Feb 17, 2011 at 8:37 PM, Brian Tingle
brian.tingle.cdlib@gmail.com wrote:
 Hi,

 As a part of our work on the Social Networks and Archival Context
 Project [1], the SNAC team is pleased to release more early results of
 our ongoing research.

 A property graph [2] of correspondedWith and associatedWith
 relationships between corporate, personal, and family identities is
 made available under the Open Data Commons Attribution License [3] in
 the form of a graphML file [4].  The graph expresses 245,367
 relationships between 124,152 named entities.

 The graphML file, as well as the scripts to create and load a graph
 database from EAC or graphML, are available on google code [5]

 We are still researching how to map from the property graph model to
 RDF, but this graph processing stack will likely power the interactive
 visualization of the historical social networks we are developing.

 Please let us know if you have any feedback about the graph, how it is
 licensed, or if you create something cool with the data.

 -- Brian

 [1] http://socialarchive.iath.virginia.edu/

 [2] http://engineering.attinteractive.com/2010/12/a-graph-processing-stack/

 [3] http://www.opendatacommons.org/licenses/by/

 [4] http://graphml.graphdrawing.org/

 [5] 
 http://code.google.com/p/eac-graph-load/downloads/detail?name=eac-graph-load-data-2011-02.tar

 Research funded by the National Endowment for the Humanities 
 http://www.neh.gov/



[CODE4LIB] livefeed /about

2011-02-11 Thread Ed Summers
I just wanted to also say thanks for the livestream from code4lib
Bloomington. The stream, IRC and twitter in combination were
*extremely* useful from afar. I missed out on the craft-beers, but at
least I got to see them [1], and there's always next year :-) I don't
know if the bar has been set, but I think amplifying the conference
this way could be a really good option for scaling the conference
without requiring the number of actual participants (and the size of
the venue) to increase. It also helps for those who can't pay for the
travel & lodging when travel budgets are on the wane.

Somewhat unrelatedly, I've seen some discussion about the place for
galleries, libraries and museums in the code4lib community [2].
Personally (despite its name) I've always thought of code4lib as being
about more than just code and libraries. I also noticed that
http://code4lib.org didn't have an about page. So I added one [3].
Please help edit it into shape if you care about this sorta thing.

//Ed

[1] http://twitpic.com/3y0zw5
[2] http://twitter.com/#!/wragge/statuses/35926310920396800
[3] http://code4lib.org/about


Re: [CODE4LIB] asist2010 meetup?

2010-10-27 Thread Ed Summers
Whoops, that was bus 61B not 61D.

//Ed

15:23  edsu @quote get 3
15:23  zoia edsu: Quote #3: edsu, your source for bad advice since, well,
  forever! (added by edsu at 09:46 PM, September 06, 2005)


On Tue, Oct 26, 2010 at 10:46 AM, Ed Summers e...@pobox.com wrote:
 Kind of last minute and random, but If you are at ASIST in Pittsburgh
 and want to get out of the downtown for some pizza at Aiello's in
 Squirrel Hill please join Raymond Yee and myself there at 7pm.

    http://www.aiellospizza.com/

 It looks like a simple ride on the 61D bus:

    http://bit.ly/hilton-to-aiellos

 And Raymond may be able to drive some folks back if they don't want to
 taxi or bus back.

 //Ed


