Re: [CODE4LIB] Loris
Hey, that's great. This work would make a great blog post/article I think. On Nov 8, 2013, at 5:13 PM, Jon Stroop jstr...@princeton.edu wrote: Whoops, wait. Chris Thatcher wrote the original code to add support for IIIF 1.0 to OSD. Then I made some changes and added support for 1.1. Credit where credit is due -Js On 11/08/2013 04:40 PM, Jon Stroop wrote: Ed, I added support for IIIF syntax to OpenSeadragon: https://github.com/openseadragon/openseadragon/blob/master/src/iiif1_1tilesource.js so it just works. Not sure if Ian has cut a release recently, but it's on the master branch anyway. -Js On 11/08/2013 04:00 PM, Edward Summers wrote: On Nov 8, 2013, at 3:05 PM, Jon Stroop jstr...@princeton.edu wrote: And here's a sample of the server backing OpenSeadragon[2]: http://goo.gl/Gks6lR Thanks for sharing that Jon. Did you have to do much to get OpenSeadragon to talk iiif? //Ed
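For context, the IIIF Image API 1.1 requests that the tile source generates follow a fixed URL template: {identifier}/{region}/{size}/{rotation}/{quality}.{format}. A minimal sketch of building such a URL (the base URL and identifier here are made up, not a real Loris install):

```python
def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="native", fmt="jpg"):
    """Build an IIIF Image API 1.1 request URL.

    Parameter order is fixed by the spec:
    {identifier}/{region}/{size}/{rotation}/{quality}.{format}
    ("native" is the 1.1 quality keyword; 2.0 renamed it "default".)
    """
    return "%s/%s/%s/%s/%s/%s.%s" % (
        base.rstrip("/"), identifier, region, size, rotation, quality, fmt)

# e.g. a 256x256 tile from the top-left corner, scaled to 256px wide
print(iiif_image_url("http://example.org/loris", "pudl0001",
                     region="0,0,256,256", size="256,"))
# -> http://example.org/loris/pudl0001/0,0,256,256/256,/0/native.jpg
```

A viewer like OpenSeadragon issues many of these tile requests as you pan and zoom, which is why "it just works" once the tile source understands the syntax.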
Re: [CODE4LIB] rdf serialization
On Wed, Nov 6, 2013 at 3:47 AM, Ben Companjen ben.compan...@dans.knaw.nl wrote: The URIs you gave get me to webpages *about* the Declaration of Independence. I'm sure it's just a copy/paste mistake, but in this context you want the exact right URIs of course. And by better I guess you meant probably more widely used and probably longer lasting? :) LOC URI for the DoI (the work) is without .html: http://id.loc.gov/authorities/names/n79029194 VIAF URI for the DoI is without trailing /: http://viaf.org/viaf/179420344 Thanks for that Ben. IMHO it's (yet another) illustration of why the W3C's approach to educating the world about URIs for real world things hasn't quite caught on, while the RESTful approach (promoted by the IETF) has. If someone as knowledgeable as Karen can do that, what does it say about our ability as practitioners to use URIs this way, and about our ability to write software to do it as well? In a REST world, when you get a 200 OK it doesn't mean the resource is a Web Document. The resource can be anything; you just happened to successfully get a representation of it. If you like you can provide hints about the nature of the resource in the representation, but the resource itself never goes over the wire; the representation does. It's a subtle but important difference between two ways of looking at Web architecture. If you find yourself interested in making up your own mind about this you can find the RESTful definitions of resource and representation in the IETF HTTP RFCs, most recently as of a few weeks ago in draft [1]. You can find language about Web Documents (or at least their more recent variant, Information Resource) in the W3C's Architecture of the World Wide Web [2]. Obviously I'm biased towards the IETF's position on this. This is just my personal opinion from my experience as a Web developer trying to explain Linked Data to practitioners, looking at the Web we have, and chatting with good friends who weren't afraid to tell me what they thought.
//Ed [1] http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-24#page-7 [2] http://www.w3.org/TR/webarch/#id-resources
Re: [CODE4LIB] more suggestions for code4lib.org
On Mon, Nov 4, 2013 at 11:31 PM, Kevin Hawkins kevin.s.hawk...@ultraslavonic.info wrote: b) Modify whatever code sends formatted job postings to this list so that it includes the location of the position. That would be shortimer, and I think it should be doing what you suggest now? https://github.com/code4lib/shortimer/commit/acb57090d4842920c9f92c684810f3c618f0a21e If not let me know, create a github issue, or send a pull request :-) //Ed
Re: [CODE4LIB] rdf serialization
On Sun, Nov 3, 2013 at 3:45 PM, Eric Lease Morgan emor...@nd.edu wrote: This is hard. The Semantic Web (and RDF) is an attempt at codifying knowledge using a strict syntax, specifically a strict syntax of triples. It is very difficult for humans to articulate knowledge, let alone codify it. How realistic is the idea of the Semantic Web? I wonder this not because I don’t think the technology can handle the problem. I say this because I think people can’t (or have great difficulty) succinctly articulating knowledge. Or maybe knowledge does not fit into triples? I think you're right Eric. I don't think knowledge can be encoded completely in triples, any more than it can be encoded completely in finding aids or books. One thing that I (naively) wasn't fully aware of when I started dabbling in the Semantic Web and Linked Data is how much the technology is entangled with debates about the philosophy of language. These debates play out in a variety of ways, but most notably in disagreements about the nature of a resource (httpRange-14) in Web Architecture. Shameless plug: Dorothea Salo and I tried to write about how some of this impacts the domain of the library/archive [1]. One of the strengths of RDF is its notion of a data model that sits behind the various serializations (xml, ntriples, json, n3, turtle, etc). I'm with Ross though: I find it much easier to read rdf as turtle or json-ld than as rdf/xml. //Ed [1] http://arxiv.org/abs/1302.4591
Re: [CODE4LIB] rdf serialization
On Tue, Nov 5, 2013 at 10:07 AM, Karen Coyle li...@kcoyle.net wrote: I have suggested (repeatedly) to LC on the BIBFRAME list that they should use turtle rather than RDF/XML in their examples -- because I suspect that they may be doing some XML thinking in the background. This seems to be the case because in some of the BIBFRAME documents the examples are in XML but not RDF/XML. I find this rather ... disappointing. I think you'll find that many people and organizations are much more familiar with xml and its data model than they are with rdf. Sometimes when people with a strong background in xml come to rdf they naturally want to keep thinking in terms of xml. This is possible up to a point, but it eventually hampers understanding. //Ed
Re: [CODE4LIB] Python and Ruby
On Mon, Jul 29, 2013 at 12:57 PM, Peter Schlumpf pschlu...@earthlink.net wrote: Imagine if the library community had its own programming/scripting language, at least one that is domain relevant. What would it look like? Ok, I think I'm going to have nightmares about that. //Ed
Re: [CODE4LIB] Python and Ruby
On Mon, Jul 29, 2013 at 1:11 PM, Ross Singer rossfsin...@gmail.com wrote: Over the NISO standardization process required to form the exploratory committee. Thanks for answering the question better than I could have ever dreamed of answering it. //Ed
Re: [CODE4LIB] Wordpress: Any way to selectively control caching for content areas on a page?
If your Wordpress happens to be fronted by Varnish you might get some mileage out of using Edge Side Includes (ESI) https://www.varnish-software.com/static/book/Content_Composition.html#edge-side-includes If you google for Edge Side Includes and Wordpress you'll find some articles like this one describing how ESIs were used with Wordpress: http://timbroder.com/2012/12/getting-started-with-varnish-edge-side-includes-and-wordpress.html So, it might be doable. //Ed On Tue, May 28, 2013 at 5:30 PM, Wilhelmina Randtke rand...@gmail.com wrote: In a Wordpress site, is there a way to allow site-wide caching, but force certain areas of a page to reload on each visit? For example, if on a specific page there is a huge navigational menu that never changes, a map that rarely changes, and hours of operation which change frequently (as often as holidays), is there a way to force only the hours of operation to reload when a person revisits the page? -Wilhelmina Randtke
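The basic ESI idea, sketched under the assumption of a Varnish 3-era setup (the fragment URL and markup here are hypothetical, not from a real WordPress theme): the theme emits an esi:include tag where the hours belong, and Varnish stitches a fresh copy of that fragment into the otherwise-cached page on every request.

```html
<!-- In the cached WordPress page template: everything here is cacheable -->
<div id="menu">...big navigation menu...</div>
<div id="map">...rarely-changing map...</div>
<!-- Varnish replaces this tag with a fresh fetch of /esi/hours/ -->
<esi:include src="/esi/hours/" />
```

```
# Varnish VCL (version 3 syntax): process ESI tags in pages,
# and never cache the hours fragment itself
sub vcl_fetch {
    set beresp.do_esi = true;
    if (req.url ~ "^/esi/") {
        set beresp.ttl = 0s;
    }
}
```

The /esi/hours/ URL would be a tiny WordPress endpoint (or even a static file) that renders just the hours-of-operation markup.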
[CODE4LIB] GLAM Wiki Google Hangout (Today, 12:00PM EDT)
Some folks interested in the role of Wikipedia in Galleries, Libraries, Archives and Museums are doing a Google Hangout today at Noon (EDT). http://en.wikipedia.org/wiki/Wikipedia:GLAM/GLAMout Today's anchor topic is the work that OCLC has been doing in adding authority data from VIAF to Wikipedia and Wikidata. But there will be space for discussing other things. //Ed
Re: [CODE4LIB] GLAM Wiki Google Hangout (Today, 12:00PM EDT)
Sorry, as the page indicates, the hangout time is 12 PDT ... not EDT. //Ed On Fri, May 3, 2013 at 9:04 AM, Ed Summers e...@pobox.com wrote: Some folks interested in the role of Wikipedia in Galleries, Libraries, Archives and Museums are doing a Google Hangout today at Noon (EDT). http://en.wikipedia.org/wiki/Wikipedia:GLAM/GLAMout Today's anchor topic is the work that OCLC has been doing in adding authority data from VIAF to Wikipedia and Wikidata. But there will be space for discussing other things. //Ed
Re: [CODE4LIB] Job: Digital Library Application Developer at Princeton University
Hi Bill, There actually is a bit of manual curation that goes on behind the scenes. shortimer (the app at jobs.code4lib.org) subscribes to the code4lib discussion list looking for emails that have job in the title. It also subscribes to the atom/rss feeds of 5 or 6 relevant job sites. When it finds a job at any of these places it puts them in a queue [1] where it waits for some logged in user to come along and:

* decide if it's appropriate for code4lib (not all are)
* make sure it hasn't already been posted recently (a duplicate)
* assign an employer, location, and any tags (using Freebase entities behind the scenes)
* clean up any formatting issues
* click publish, which pushes it out to the Web, Twitter and here (the discussion list) if it didn't originate from here

Your question made me curious to see how many edits have been made by curators so far: 10,451. This isn't bad considering the site has only been in operation for a year (28 edits/day) and is operating on the kindness of strangers. You can see which users have edited jobs on the job view pages [2]. Mark Matienzo and Jodi Schneider deserve a special thanks for their work curating job postings. I hope this takes some of the mystery out of jobs.code4lib.org. Patches to the about page [3] and elsewhere are (of course) welcome :-) //Ed [1] http://jobs.code4lib.org/curate/ [2] http://jobs.code4lib.org/users/ [3] https://github.com/code4lib/shortimer/blob/master/jobs/templates/about.html On Mon, Mar 11, 2013 at 10:05 PM, William Denton w...@pobox.com wrote: On 11 March 2013, Ed Summers wrote: Apologies for this duplicate...I leaned too heavily on the new recent jobs from this employer feature, which didn't alert me to the duplicate since it was posted under Princeton Theological Seminary and I put it under Princeton University. Ed, does this amazing jobs site require your hand on the dial? I thought you'd coded it all into magic and it just worked. Bill -- William Denton Toronto, Canada http://www.miskatonic.org/
Re: [CODE4LIB] Job: Digital Library Application Developer at Princeton University
Apologies for this duplicate...I leaned too heavily on the new recent jobs from this employer feature, which didn't alert me to the duplicate since it was posted under Princeton Theological Seminary and I put it under Princeton University. //Ed On Mon, Mar 11, 2013 at 3:24 PM, j...@code4lib.org wrote: Princeton Theological Seminary Library seeks a Digital Library Application Developer. Reporting to the Digital Initiatives Librarian, this position works with a small, collaborative team of librarians and technologists to design, develop, and test web applications for searching and displaying the Library's digital resources.

Responsibilities:

* Works collaboratively with the Digital Initiatives team to design, develop, and test web applications using Agile practices.
* Writes and refactors XQuery, HTML, CSS, and JavaScript code for new and existing web applications built on native XML databases (MarkLogic Server).
* Tests web applications in multiple browsers on multiple platforms; identifies, tracks, and resolves bugs.

Qualifications:

* Bachelor's degree or equivalent combination of education and professional experience.
* Experience developing web applications using one or more established programming languages/frameworks. MVC experience preferred.
* Experience programmatically processing XML documents. Experience with XQuery or XSLT preferred. Experience with native XML databases preferred.
* Experience with tools or frameworks for automated testing of web applications preferred.
* Enthusiasm for learning and applying new technologies.

Princeton Theological Seminary is an equal opportunity employer. For details, and for information on how to apply, please see http://www.ptsem.edu/index.aspx?id=1260 Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6746/
Re: [CODE4LIB] jobs.code4lib.org and job locations
Hi Péter, I probably didn't explain myself very well at 2am :-) The Google Map was just meant as a demonstration of the underlying geo coordinates in the job data (exposed as georss in the Atom feed). Eventually I would like to get a map working on jobs.code4lib.org using LeafletJS [1] or some other toolkit, where the map can be more customized, and display more results. The feed is paged, so there's only so much that can be displayed currently. It might also be interesting to see a map for the last year of posts, to see general trends in hiring on a map. And I'd like to get location specific feeds set up for people who still use feed readers to keep up with things. They do still exist don't they? That being said, please share whatever you come up with! //Ed [1] http://leafletjs.com/ On Sun, Feb 24, 2013 at 8:39 AM, Péter Király kirun...@gmail.com wrote: Hi Ed, thank you for your work, it is a very nice job! I have one comment: some job descriptions are too lengthy to fit in one screen, so I have to move the map down to see the top of the description. After I close the window the map doesn't jump back to the original viewport. There is a JS solution for this issue; I'll send it to you later. Thanks again! Péter 2013/2/24 Ed Summers e...@pobox.com: If you happen to post jobs to code4lib.org you'll notice that you can now add a location for the job. In fact you are required to fill it in when posting. [...] -- Péter Király software developer Europeana - http://europeana.eu eXtensible Catalog - http://eXtensibleCatalog.org
Re: [CODE4LIB] jobs.code4lib.org and job locations
Hi Gary, Great idea, and it was easy to implement. For example you can now get tag related feeds: http://jobs.code4lib.org/feed/tag/digital-preservation/ http://jobs.code4lib.org/feed/tag/python/ http://jobs.code4lib.org/feed/tag/web-archiving/ http://jobs.code4lib.org/feed/tag/fedora-repository-architecture/ etc ... Your feed reader should be able to pick up on the feed url, but a click to the feed icon on the tag specific jobs pages will take you to the feed if not. I also added the feed URLs as a column to the tag page: http://jobs.code4lib.org/tags/ It's kind of neat to see them on a map, e.g. https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/tag/digital-preservation/ Thanks for the idea! //Ed On Sun, Feb 24, 2013 at 10:44 AM, Gary McGath develo...@mcgath.com wrote: Definitely! I'd be more interested in job-category feeds than location, though. On 2/24/13 10:32 AM, Ed Summers wrote: And I'd like to get location specific feeds set up for people who still use feed readers to keep up with things. They do still exist don't they? -- Gary McGath, Professional Software Developer http://www.garymcgath.com
Re: [CODE4LIB] jobs.code4lib.org and job locations
Chris, as you saw, Chad started to tinker with maps and jobs.code4lib.org, but the more the merrier. Just fork the repo on github and try out some things if you have the energy/interest. If you want a snapshot of the MySQL database to play around with the full dataset let me know privately and I'll get it to you. //Ed On Sun, Feb 24, 2013 at 3:35 PM, Chris Fitzpatrick chrisfitz...@gmail.com wrote: hi, has anyone volunteered for the mapping feature? if not, I'd like to take a crack at it as I am wanting to get more practical django experience under my belt. and since this list has gotten me two jobs, I would love to give some payback. just dont want to duplicate any work someone else has started. b, chris. On 24 Feb 2013 20:08, Gary McGath develo...@mcgath.com wrote: It works very nicely with Sage, which is what I use to follow feeds. Thanks! On 2/24/13 1:45 PM, Ed Summers wrote: Hi Gary, Great idea, and it was easy to implement. For example you can now get tag related feeds: http://jobs.code4lib.org/feed/tag/digital-preservation/ http://jobs.code4lib.org/feed/tag/python/ http://jobs.code4lib.org/feed/tag/web-archiving/ http://jobs.code4lib.org/feed/tag/fedora-repository-architecture/ etc ... -- Gary McGath, Professional Software Developer http://www.garymcgath.com
[CODE4LIB] jobs.code4lib.org and job locations
If you happen to post jobs to code4lib.org you'll notice that you can now add a location for the job. In fact you are required to fill it in when posting. The location input field uses Freebase Suggest just like the employer and tag fields. When you select an employer the location will auto-populate with the employer's headquarters location, but you can change it if the job happens to be somewhere else...which does happen from time to time. I retroactively applied as many locations as I could using the employer. One nice side effect (other than seeing where the job is for in the UI) is having lat/lon geo-coordinates for the job. I haven't built any maps into the UI yet, but I did expose the coordinates in the Atom feed which lets you do this: https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/ The small number of markers is because this is just the first page of the feed, e.g. https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/2/ https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/3/ https://maps.google.com/maps?q=http://jobs.code4lib.org/feed/4/ ... If someone has an interest in playing with LeafletJS or something to get some map views into jobs.code4lib.org proper that might be a fun experiment, if you have any spare time. Many thanks to Ted Lawless for the work to get this going, and also to Mark Matienzo for tirelessly assigning employers to the historic job postings. There are still a few kinks to work out (some historic postings that had addresses in non-standard places in the freebase data), but please feel free to file issue tickets on Github [1] if you notice anything odd. //Ed [1] https://github.com/code4lib/shortimer
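The lat/lon coordinates ride along in the Atom feed as georss:point elements, which is what lets Google Maps plot the feed. A sketch of pulling them out with just the standard library (the entry below is invented, but mimics the feed's shape; a georss:point is "lat lon"):

```python
import xml.etree.ElementTree as ET

# A trimmed-down Atom entry in the shape the jobs feed uses (values made up).
atom = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss">
  <entry>
    <title>Digital Library Application Developer</title>
    <georss:point>40.3573 -74.6672</georss:point>
  </entry>
</feed>"""

ns = {"atom": "http://www.w3.org/2005/Atom",
      "georss": "http://www.georss.org/georss"}

root = ET.fromstring(atom)
for entry in root.findall("atom:entry", ns):
    title = entry.findtext("atom:title", namespaces=ns)
    point = entry.findtext("georss:point", namespaces=ns)
    lat, lon = (float(v) for v in point.split())  # georss:point is "lat lon"
    print(title, lat, lon)
```

The same loop pointed at http://jobs.code4lib.org/feed/ (fetched page by page) would be enough to feed markers into LeafletJS or similar.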
Re: [CODE4LIB] jobs.code4lib.org and job locations
On Sun, Feb 24, 2013 at 2:14 AM, Ed Summers e...@pobox.com wrote: If you happen to post jobs to code4lib.org you'll notice that you can now add a location for the job. In fact you are required to fill it in when posting. s/code4lib.org/jobs.code4lib.org/ That's what I get for writing email at 2am I guess... //Ed
Re: [CODE4LIB] Job: Data Services Manager at Pennsylvania State University
Sorry for the duplication on the recent CDL/UC3 jobs by the way. I saw them pop up on the digital-curation list, got excited and posted them on jobs.code4lib.org without seeing that Stephen already had. Oh well, two for the price of one I guess, or is that 4 for the price of 2? [1] Mea culpa, //Ed [1] except their free, so uhh, yeah... On Wed, Feb 6, 2013 at 8:59 AM, j...@code4lib.org wrote: Digital Library Technologies (DLT), a unit within Information Technology Services at Penn State University, is seeking a Data Services Manager to lead the development of new data services to support teaching, research and outreach at Penn State. The Data Services Manager will be responsible for the development of services to support data throughout its lifecycle, including long-term archival data storage, preservation, and management, the management of restricted data, and database hosting. The Data Services Manager will collaborate with diverse constituencies at Penn State (ITS, the IT Leadership Council, the University Libraries, and researchers/faculty) and with our peers nationally, to design, develop, and implement sustainable data services that meet existing and emerging needs. This job will be filled as a level 3, or level 4, depending upon the successful candidate's competencies, education, and experience. Typically requires a Master's degree or higher plus four years of related experience, or an equivalent combination of education and experience for a level 3. Additional experience and/or education and competencies are required for higher level jobs. The successful candidate will demonstrate knowledge of and experience with data management infrastructure, specifically storage and repository technologies, standards, and practices; maintain an awareness of emerging trends and developments in the data storage and repository domains; have knowledge of information management practices and principles such as metadata, data lifecycle, and digital preservation practices. 
The Data Services Manager will be passionate about working hands-on with technology; have excellent problem-solving skills; demonstrate proven ability to lead complex and cross-organizational projects; provide outstanding customer service; and have excellent interpersonal communication and relationship-building skills. Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6071/
Re: [CODE4LIB] Job: Data Services Manager at Pennsylvania State University
s/their/they're/ But I guess there's no such thing as a free job posting, really. Yeah, I'm done now. Thanks. //Ed On Wed, Feb 6, 2013 at 4:05 PM, Ed Summers e...@pobox.com wrote: Sorry for the duplication on the recent CDL/UC3 jobs by the way. I saw them pop up on the digital-curation list, got excited and posted them on jobs.code4lib.org without seeing that Stephen already had. Oh well, two for the price of one I guess, or is that 4 for the price of 2? [1] Mea culpa, //Ed [1] except their free, so uhh, yeah...
Re: [CODE4LIB] Adding authority control to IR's that don't have it built in
Hi Jason, Heh, sorry for the long response below. You always ask interesting questions :-D I would highly recommend that vocabulary management apps like this assign an identifier to each entity, that can be expressed as a URL. If there is any kind of database backing the app you will get the identifier for free (primary key, etc). So for example let's say you have a record for John Chapman, who is on the faculty at OSU, which has a primary key of 123 in the database; you would have a corresponding URL for that record:

http://id.library.osu.edu/person/123

When someone points their browser at that URL they get back a nice HTML page describing John Chapman. I would strongly recommend that schema.org microdata and/or opengraph protocol RDFa be layered into the page for SEO purposes, as well as for anyone who happens to be doing scraping. I would also highly recommend adding a sitemap to enable discovery and synchronization. Having that URL is handy because you could add different machine readable formats that hang off of it, which you can express as links in your HTML. For example let's say you want to have JSON, RDF and XML representations:

http://id.library.osu.edu/person/123.json
http://id.library.osu.edu/person/123.xml
http://id.library.osu.edu/person/123.rdf

If you want to get fancy you can content negotiate between the generic URL and the format specific URLs, e.g.

curl -i --header "Accept: application/json" http://id.library.osu.edu/person/123

HTTP/1.1 303 See Other
date: Thu, 31 Jan 2013 10:47:44 GMT
server: Apache/2.2.14 (Ubuntu)
location: http://id.library.osu.edu/person/123
vary: Accept-Encoding

But that's gravy. What exactly you put in these representations is a somewhat open question I think. I'm a bit biased towards SKOS for the RDF because it's lightweight, this is exactly its use case, it is flexible (you can layer other assertions in easily), and (full disclosure) I helped with the standardization of it.
If you did do this you could use JSON-LD for the JSON, or just come up with something that works. Likewise for the XML. You might want to consider supporting JSON-P for the JSON representation, so that it can be used from JavaScript in other people's applications. It might be interesting to come up with some norms here for interoperability on a Wiki somewhere, or maybe a prototype of some kind. But the focus should be on what you need to actually use it in some app that needs vocabulary management. Focusing on reusing work that has already been done helps a lot too. I think that helps ground things significantly. I would be happy to discuss this further if you want. Whatever the format, I highly recommend you try to have the data link out to other places on the Web that are useful. So for example the record for John Chapman could link to his department page, blog, VIAF, Wikipedia, Google Scholar Profile, etc. This work tends to require human eyes, even if helped by a tool (Autosuggest, etc), so what you do may have to be limited, or at least an ongoing effort. Managing them (link scrubbing) is an ongoing effort too. But fitting your stuff into the larger context of the Web will mean that other people will want to use your identifiers. It's the dream of Linked Data I guess. Lastly I recommend you have an OpenSearch API [1], which is pretty easy, almost trivial, to put together. This would allow people to write software to search for John Chapman and get back results (there might be more than one) in Atom, RSS or JSON. OpenSearch also has a handy AutoSuggest format, which some JavaScript libraries work with. The nice thing about OpenSearch is that browsers' search boxes support it too. I guess this might sound like an information architecture more than an API. Hopefully it makes sense. Having a page that documents all this, with "API" written across the top, that hopefully includes terms of service, can help a lot with use by others. //Ed PS.
I should mention that Jon Phipps and Diane Hillman's work on the Metadata Registry [2] did a lot to inform my thinking about the use of URLs to identify these things. The metadata registry is used for making the RDA and IFLA's FRBR vocabulary. It handles lots of stuff like versioning, etc ... which might be nice to have. Personally I would probably start small before jumping to installing the Metadata Registry, but it might be an option for you. [1] http://www.opensearch.org [2] http://trac.metadataregistry.org/ On Wed, Jan 30, 2013 at 3:47 PM, Jason Ronallo jrona...@gmail.com wrote: Ed, Any suggestions or recommendations on what such an API would look like, what response format(s) would be best, and how to advertise the availability of a local name authority API? Who should we expect would use our local name authority API? Are any of the examples from the big authority databases like VIAF ones that would be good to follow for API design and response formats? Jason On Wed, Jan 30, 2013 at 3:15 PM, Ed Summers e
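The 303 dance in the curl example above can be sketched, server-side, as a small routing helper. This is an illustration only: the FORMATS table, the URL, and the function are hypothetical, and a production implementation should parse Accept headers fully (q-value ordering, wildcards).

```python
# Map preferred media types to format-specific URL suffixes (a sketch).
FORMATS = {
    "application/json": ".json",
    "application/rdf+xml": ".rdf",
    "application/xml": ".xml",
    "text/html": "",  # the generic HTML page itself
}

def see_other(url, accept):
    """Pick a 303 Location for a generic entity URL based on an Accept header."""
    for media_type in accept.split(","):
        media_type = media_type.split(";")[0].strip()  # drop q-values
        if media_type in FORMATS:
            return url + FORMATS[media_type]
    return url  # fall back to the generic page

print(see_other("http://id.library.osu.edu/person/123", "application/json"))
# -> http://id.library.osu.edu/person/123.json
```

The web framework would then send `HTTP/1.1 303 See Other` with that value in the Location header, which is exactly what the corrected curl transcript later in this thread shows.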
Re: [CODE4LIB] Adding authority control to IR's that don't have it built in
Of course after sending that I noticed a mistake, the curl example should look like:

    curl -i --header "Accept: application/json" http://id.library.osu.edu/person/123

    HTTP/1.1 303 See Other
    date: Thu, 31 Jan 2013 10:47:44 GMT
    server: Apache/2.2.14 (Ubuntu)
    location: http://id.library.osu.edu/person/123.json
    vary: Accept-Encoding

I didn't have it redirecting to the JSON previously. //Ed On Wed, Jan 30, 2013 at 4:19 PM, Phillips, Mark mark.phill...@unt.edu wrote: Thanks for the prompt Ed, We've had a stupid simple vocabulary app for a few years now which we use to manage all of our controlled vocabularies [1]. These are represented in our metadata editing application as drop-downs and type ahead values as described in the first email in this thread. Nothing too exciting. The entire vocabulary app is available to our systems as xml, python or json objects. When we export our records as RDF we try and use the links for these values instead of the strings. We are currently working on another simple app to manage names for our system (UNT Name App). It takes into account some of the use cases described in this thread such as disambiguation, variant names, and the all important linking to other vocabularies of which VIAF, LC, and Wikipedia are the primary expected targets. Once populated it is to be integrated into the metadata editing system to provide auto-complete functions to the various name fields in our repository. As far as technology we've tried to crib off the Chronicling America site as much as possible and follow the pattern of using the suggestions extension of OpenSearch [2] to provide the API. 
Mark [1] http://digital2.library.unt.edu/vocabularies/ [2] http://www.opensearch.org/Specifications/OpenSearch/Extensions/Suggestions/1.1 From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Ed Summers [e...@pobox.com] Sent: Wednesday, January 30, 2013 2:15 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Adding authority control to IR's that don't have it built in On Tue, Jan 29, 2013 at 5:19 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: This would certainly be a possibility for other projects, but the use case we're immediately concerned with requires an authority file that's maintained by our local archives. It contains all kinds of information about people (degrees, nicknames, etc) as well as terminology which is not technically kosher but which we know people use. Just as an aside really, I think there's a real opportunity for libraries and archives to make their local thesauri and name indexes available for integration into other applications both inside and outside their institutional walls. Wikipedia, Freebase, VIAF are great, but their notability guidelines aren't always the greatest match for cultural heritage organizations. So seriously consider putting a little web app around the information you have, using it for maintaining the data, making it available programmatically (API), and linking it out to other databases (VIAF, etc) as needed. A briefer/pithier way of saying this is to quote Mark Matienzo [1] Sooner or later, everyone needs a vocabulary management app. :-) //Ed PS. I think Mark Phillips has done some interesting work in this area at UNT. But I don't have anything to point you at, maybe Mark is tuned in, and can chime in. [1] https://twitter.com/anarchivist/status/269654403701682176
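The 303-redirect content negotiation in Ed's curl example can be sketched in a few lines. This is a toy handler, not the actual id.library.osu.edu implementation; the MIME-to-extension mapping is an assumption for illustration, extrapolated from the `.json` path in the example:

```python
def negotiate(path, accept):
    """Toy version of the 303 pattern in the curl example: a request
    for a person URI gets a See Other redirect to a format-specific
    representation chosen from the Accept header. The resource itself
    never goes over the wire, only a representation of it does."""
    formats = [
        ("application/json", ".json"),
        ("text/html", ".html"),  # assumed additional representation
    ]
    for mime, ext in formats:
        if mime in accept:
            return 303, {"Location": path + ext}
    return 406, {}  # no acceptable representation

status, headers = negotiate("/person/123", "application/json")
print(status, headers["Location"])  # 303 /person/123.json
```

A real server would also parse quality values in the Accept header and emit a Vary: Accept header, as the Apache response in the example does.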
Re: [CODE4LIB] Adding authority control to IR's that don't have it built in
On Tue, Jan 29, 2013 at 5:19 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: This would certainly be a possibility for other projects, but the use case we're immediately concerned with requires an authority file that's maintained by our local archives. It contains all kinds of information about people (degrees, nicknames, etc) as well as terminology which is not technically kosher but which we know people use. Just as an aside really, I think there's a real opportunity for libraries and archives to make their local thesauri and name indexes available for integration into other applications both inside and outside their institutional walls. Wikipedia, Freebase, VIAF are great, but their notability guidelines aren't always the greatest match for cultural heritage organizations. So seriously consider putting a little web app around the information you have, using it for maintaining the data, making it available programmatically (API), and linking it out to other databases (VIAF, etc) as needed. A briefer/pithier way of saying this is to quote Mark Matienzo [1] Sooner or later, everyone needs a vocabulary management app. :-) //Ed PS. I think Mark Phillips has done some interesting work in this area at UNT. But I don't have anything to point you at, maybe Mark is tuned in, and can chime in. [1] https://twitter.com/anarchivist/status/269654403701682176
Re: [CODE4LIB] Adding authority control to IR's that don't have it built in
On Tue, Jan 29, 2013 at 10:41 PM, Bill Dueber b...@dueber.com wrote: Right -- I'd like to show the FAST stuff as facets in our catalog search (or, at least try it out and see if anyone salutes). So I'd need to inject the FAST data into the records at index time. Alas, I can't help you with that. I haven't heard of FAST being distributed before, but I suppose it must have been. Where is Roy when you need him? //Ed
Re: [CODE4LIB] Adding authority control to IR's that don't have it built in
Hi Kyle, If you are thinking of doing name or subject authority control you might want to check out OCLC's VIAF AutoSuggest service [1] and FAST AutoSuggest [2]. There are also autosuggest searches for the name and subject authority files, that are lightly documented in their OpenSearch document [3]. In general, I really like this approach, and I think it has a lot of potential for newer cataloging interfaces. I'll describe two scenarios that I'm familiar with, that have worked quite well (so far). Note, these aren't IRs per se, but perhaps they will translate to your situation. As part of the National Digital Newspaper Program LC has a simple app so that librarians can create essays that describe newspapers in detail. Rather than making this part of our public website we created an Essay Editor as a standalone django app that provides a web based editing environment for authoring the essays. Part of this process is linking up the essay with the correct newspaper. Rather than load all the newspapers that could be described into the Essay Editor, and keep them up to date, we exposed an OpenSearch API in the main Chronicling America website (where all the newspaper records are loaded and maintained) [4]. It has been working quite well so far. Another example is the jobs.code4lib.org website that allows people to enter jobs announcements. I wanted to make sure that it was possible to view jobs by organization [5], or skill [6] -- so some form of authority control was needed. I ended up using Freebase Suggest [7] that makes it quite easy to build simple forms that present users with subsets of Freebase entities, depending on what they type. A nice side benefit of using Freebase is that you get descriptive text and images for the employers and topics for free. It has been working pretty well so far. There is a bit of an annoying conflict between the Freebase CSS and Twitter Bootstrap, which might be resolved by updating Bootstrap. 
Also, I've noticed Freebase's service slowing down a bit lately, which hopefully won't degrade further. The big caveat here is that these external services are dependencies. If they go down, a significant portion of your app might go down too. Minimizing this dependency, or allowing things to degrade well, is good to keep in mind. Also, it's worth remembering identifiers (if they are available) for the selected matches, so that they can be used for linking your data with the external resource. A simple string might change. I hope this helps. Thanks for the question, I think this is an area where we can really improve some of our back-office interfaces and applications. //Ed [1] http://www.oclc.org/developer/documentation/virtual-international-authority-file-viaf/request-types#autosuggest [2] http://experimental.worldcat.org/fast/assignfast/ [3] http://id.loc.gov/authorities/opensearch/ [4] http://chroniclingamerica.loc.gov/about/api/#autosuggest [5] http://jobs.code4lib.org/employer/university-of-illinois-at-urbana-champaign/ [6] http://jobs.code4lib.org/jobs/ruby/ [7] http://wiki.freebase.com/wiki/Freebase_Suggest On Tue, Jan 29, 2013 at 11:59 AM, Kyle Banerjee kyle.baner...@gmail.com wrote: How are libraries doing this and how well is it working? Most systems that even claim to have authority control simply allow a controlled keyword list. But this does nothing for the see and see also references that are essential for many use cases (people known by many names, entities that change names, merge or whatever over time, etc). The two most obvious solutions to me are to write an app that provides this information interactively as the query is typed (requires access to the search box) or to have a record that serves as a disambiguation page (might not be noticed by the user for a variety of reasons). Are there other options, and what do you recommend? Thanks, kyle
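The see/see-also behavior Kyle is asking about, where typing a variant name surfaces the controlled heading, can be sketched with a toy in-memory index. The authority data below is invented, and a real system would back this with its maintained authority file rather than a literal dict:

```python
def build_index(authorities):
    """Toy autosuggest index mapping each heading and its variant names
    (the 'see' references) back to the authoritative heading, so that
    typing a variant still surfaces the controlled form."""
    index = {}
    for heading, variants in authorities.items():
        for name in [heading] + variants:
            index.setdefault(name.lower(), set()).add(heading)
    return index

def autosuggest(index, prefix, limit=10):
    """Return authoritative headings whose name or variant starts with
    the typed prefix, the way a type-ahead widget would as-you-type."""
    prefix = prefix.lower()
    hits = set()
    for name, headings in index.items():
        if name.startswith(prefix):
            hits.update(headings)
    return sorted(hits)[:limit]

idx = build_index({"Twain, Mark, 1835-1910": ["Clemens, Samuel Langhorne"]})
print(autosuggest(idx, "clem"))  # the variant resolves to the heading
```

Wiring a function like `autosuggest` behind an OpenSearch Suggestions endpoint gives exactly the interactive-query option Kyle describes, without requiring the user to notice a disambiguation record.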
Re: [CODE4LIB] Adding authority control to IR's that don't have it built in
I think that Mike Giarlo and Michael Witt used the FAST AutoSuggest as part of their databib project [1]. But are you talking about bringing the data down for a local index? //Ed [1] http://databib.org/ On Tue, Jan 29, 2013 at 4:45 PM, Bill Dueber b...@dueber.com wrote: Has anyone created a nice little wrapper around FAST? I'd like to test out including FAST subjects in our catalog, but am hoping someone else went through the work of building the code to do it :-) I know FAST has a web interface, but I've got about 10M records and would rather use something local. On Tue, Jan 29, 2013 at 4:36 PM, Ed Summers e...@pobox.com wrote: Hi Kyle, If you are thinking of doing name or subject authority control you might want to check out OCLC's VIAF AutoSuggest service [1] and FAST AutoSuggest [2]. There are also autosuggest searches for the name and subject authority files, that are lightly documented in their OpenSearch document [3]. In general, I really like this approach, and I think it has a lot of potential for newer cataloging interfaces. I'll describe two scenarios that I'm familiar with, that have worked quite well (so far). Note, these aren't IRs per se, but perhaps they will translate to your situation. As part of the National Digital Newspaper Program LC has a simple app so that librarians can create essays that describe newspapers in detail. Rather than making this part of our public website we created an Essay Editor as a standalone django app that provides a web based editing environment for authoring the essays. Part of this process is linking up the essay with the correct newspaper. Rather than load all the newspapers that could be described into the Essay Editor, and keep them up to date, we exposed an OpenSearch API in the main Chronicling America website (where all the newspaper records are loaded and maintained) [4]. It has been working quite well so far. Another example is the jobs.code4lib.org website that allows people to enter jobs announcements. 
I wanted to make sure that it was possible to view jobs by organization [5], or skill [6] -- so some form of authority control was needed. I ended up using Freebase Suggest [7] that makes it quite easy to build simple forms that present users with subsets of Freebase entities, depending on what they type. A nice side benefit of using Freebase is that you get descriptive text and images for the employers and topics for free. It has been working pretty well so far. There is a bit of an annoying conflict between the Freebase CSS and Twitter Bootstrap, which might be resolved by updating Bootstrap. Also, I've noticed Freebase's service slowing down a bit lately, which hopefully won't degrade further. The big caveat here is that these external services are dependencies. If they go down, a significant portion of your app might go down too. Minimizing this dependency, or allowing things to degrade well, is good to keep in mind. Also, it's worth remembering identifiers (if they are available) for the selected matches, so that they can be used for linking your data with the external resource. A simple string might change. I hope this helps. Thanks for the question, I think this is an area where we can really improve some of our back-office interfaces and applications. //Ed [1] http://www.oclc.org/developer/documentation/virtual-international-authority-file-viaf/request-types#autosuggest [2] http://experimental.worldcat.org/fast/assignfast/ [3] http://id.loc.gov/authorities/opensearch/ [4] http://chroniclingamerica.loc.gov/about/api/#autosuggest [5] http://jobs.code4lib.org/employer/university-of-illinois-at-urbana-champaign/ [6] http://jobs.code4lib.org/jobs/ruby/ [7] http://wiki.freebase.com/wiki/Freebase_Suggest On Tue, Jan 29, 2013 at 11:59 AM, Kyle Banerjee kyle.baner...@gmail.com wrote: How are libraries doing this and how well is it working? Most systems that even claim to have authority control simply allow a controlled keyword list. 
But this does nothing for the see and see also references that are essential for many use cases (people known by many names, entities that change names, merge or whatever over time, etc). The two most obvious solutions to me are to write an app that provides this information interactively as the query is typed (requires access to the search box) or to have a record that serves as a disambiguation page (might not be noticed by the user for a variety of reasons). Are there other options, and what do you recommend? Thanks, kyle -- Bill Dueber Library Systems Programmer University of Michigan Library
Re: [CODE4LIB] Zoia
On Thu, Jan 24, 2013 at 10:01 AM, Mark A. Matienzo mark.matie...@gmail.com wrote: More to the point, no other decision about code4lib in terms of action or policy has been made ever. This is new territory for us. It's not really that new. We've voted on tshirts, logos, and whether or not to have jobs.code4lib.org post here--perhaps other things that I'm forgetting. I'm not saying we need to vote on the anti-harassment policy to make it real--it's already real. Not everyone may respect it, but hopefully we'll all continue being nice people and won't have to worry about enforcing it. It's hard to imagine anyone being against it. Personally, I find it regrettable that it's even necessary, but it is what it is. Voting can be a nice way of testing the waters for something. I found the survey on the jobs.code4lib.org email posting very helpful. But voting on everything would get very tedious, and boring very quickly I imagine. code4lib has always seemed much more freeform than that to me. I really liked Bethany's description of lazy consensus [1] at the last conference. //Ed [1] http://nowviskie.org/2012/lazy-consensus/
Re: [CODE4LIB] Zoia
On Thu, Jan 24, 2013 at 4:04 PM, Shaun Ellis sha...@princeton.edu wrote: Determining whether action should be taken on harassment should not be based on a popularity contest. That would be a fail, and that's what Karen is right to point out. I added ABSTENTIONS.txt and OPPOSERS.txt to the anti-harassment github repository [1] to supplement the SUPPORTERS.txt, for people who want to record their particular view on this issue. If you want to record your view you can fork the repository, add your name to the appropriate file, and send a pull request. Perhaps that's good enough for now? I don't disagree that ambiguity around this issue is problematic, but I also think that trying to remove all ambiguity from it may prove to be difficult, and damaging. //Ed [1] https://github.com/code4lib/antiharassment-policy
Re: [CODE4LIB] Group Decision Making (was Zoia)
So we have a reasonable policy in place. Can we now tackle the creepy things as they come up? I am not opposed to voting about this. It just seems like a crazy thing to do, because I can't imagine anyone would be opposed to it. But maybe I lack imagination. //Ed On Thu, Jan 24, 2013 at 4:49 PM, BWS Johnson abesottedphoe...@yahoo.com wrote: Salve! I am uneasy about coming up with a policy for banning people (from what?) and voting on it, before it's demonstrated that it's even needed. Can't we just tackle these issues as they come up, in context, rather than in the abstract? Or has a specific issue come up, and I'm just being daft? It's needed. It was requested. Specifically creepy things happening is why this came up. The policy is necessary to help people deal with things as they come up in context. I'm uneasy about voting on minority rights. That usually doesn't go well, and it almost always misses the point. Cheers, Brooke
Re: [CODE4LIB] Conference roommate
Whoever is rooming with Gabe, be sure to remind him to bring his Ukulele. //Ed On Tue, Jan 22, 2013 at 4:22 PM, Gabriel Farrell gsf...@gmail.com wrote: And the code4lib community comes through again. I now have a roommate. See you all at the conference! On Mon, Jan 21, 2013 at 11:59 AM, Gabriel Farrell gsf...@gmail.com wrote: I'm looking for a roommate for a room at the conference hotel Monday through Thursday. I've also posted at http://wiki.code4lib.org/index.php/2013_room_ride_share. References available upon request.
Re: [CODE4LIB] Zoia
On Tue, Jan 22, 2013 at 4:01 PM, Jonathan Rochkind rochk...@jhu.edu wrote: Thanks to whoever removed the 'poledance' plugin (REALLY? that existed? if it makes you feel any better, I don't think anyone who hangs out in #code4lib even knew it existed, and it never got used). I knew it existed, and I even invoked it a few times. Although, if this war on humor keeps up, I'm unlikely to hang out in #code4lib much longer. //Ed PS. I really didn't expect the Spanish Inquisition.
[CODE4LIB] code4lib.org domain
Hi all, I've owned the code4lib.org domain since 2005 and have been thinking it might be wise to transfer ownership of it to someone else. Sometimes I forget to pay bills, and miss emails, and it seems like the domain means something to a larger group of people. With Ryan Ordway's help Oregon State University indicated they would be willing to take over administration of the domain. They also have been responsible for running the Drupal instance at code4lib.org and the Mediawiki instance at wiki.code4lib.org -- so it seems like a logical move. But I thought I would bring it up here first in the interests of transparency, community building and whatnot, to see if there were any objections or ideas. //Ed
Re: [CODE4LIB] code4lib.org domain
On Tue, Dec 18, 2012 at 4:58 PM, Wilhelmina Randtke rand...@gmail.com wrote: Pay for it shouldn't be an issue. It's like $10 a year to register the domain, right? So, don't make a big deal out of OSU paying for it. The fee is negligible. Yes, it's not so much a matter of money as it is remembering to pay it :-) The key concern is how committed to OSU is Ryan Ordway, and what's the climate there like. I see this as transferring to the people who are currently technical contacts at OSU, not to a faceless organization. If they already hold several other URLs, and have a policy and timeframe for tracking and renewing these then that's a plus. OSU is committed enough to have a Domain Name Committee to evaluate these matters, which accepted the proposal to host code4lib.org. The first code4lib conference was held at OSU, and there are several active long time OSU folks who have helped create the code4lib community...so it's not as if there's no connection between the organization and this community. I am not disagreeing with your assessment about individual vs organizational ownership. But I am saying I don't want to be that individual anymore, and that OSU is the best option for not letting the domain lapse. //Ed
[CODE4LIB] Just Solve the File Format Problem month: can you help?
I imagine you've heard about the Just Solve the Problem month already, but if not, I thought Chris Rusbridge's email to the digital-preservation list was a good call for participation in the project ... //Ed -- Forwarded message -- From: Chris Rusbridge c.rusbri...@googlemail.com Date: Thu, Nov 1, 2012 at 4:00 PM Subject: Just Solve the File Format Problem month: can you help? To: digital-preservat...@jiscmail.ac.uk Some of you will know that Jason Scott, Rogue Archivist, is raising a citizen's army to attempt to solve the file format problem* in the month of November, 2012. The work is taking place via a wiki at http://justsolve.archiveteam.org/index.php/Main_Page, with a band of volunteers (you need to register to make changes to the wiki, by sending a username and email address to justso...@textfiles.com). I've added a few formats and groups of formats myself (at least as skeletons or empty placeholders). The best form of help is for some of you who know more about rarer data formats to register and help by editing the wiki yourself. It's pretty easy; I've never used MediaWiki before, and everything I've done so far has been by finding something like it and adapting the wiki source. Other people can make it beautiful and standardised later on! If you can't do that, you could email me information about missing data formats. This should include as much as possible of: - name, and what it's for (ie brief description) - web site with some authoritative information - web site with some examples, etc. Let's try and capture ALL these formats. As Jason says in his own inimitable way Let's make that goddam army!. * Note, the problem is only vaguely defined, and after some angst (eg see http://unsustainableideas.wordpress.com/2012/07/04/the-solution-is-42-what-was-the-problem/), I think that's OK. Gathering a huge amount of information about file formats in one place will be a BIG HELP. 
-- Chris Rusbridge Mobile: +44 791 7423828 Email: c.rusbri...@gmail.com Adopt the email charter! http://emailcharter.org/
Re: [CODE4LIB] Corrections to Worldcat/Hathi/Google
On Mon, Aug 27, 2012 at 10:36 AM, Ross Singer rossfsin...@gmail.com wrote: For MARC data, while I don't know of any examples of this, it seems like something like CouchDB [2] and marc-in-json [3] would be a fantastic way to make something like this available. Great idea...and there are 4 years of transactions for LC record create/update/deletes up at Internet Archive: http://archive.org/details/marc_loc_updates //Ed
Re: [CODE4LIB] Corrections to Worldcat/Hathi/Google
On Mon, Aug 27, 2012 at 8:49 AM, Karen Coyle li...@kcoyle.net wrote: Actually, Ed, this would not only make for a good blog post (please, so it doesn't get lost in email space), but I would love to see a discussion of what kind of revision control would work: 1) for libraries (git is gawdawful nerdy) 2) for linked data I think you know as well as me that linked data is gawdawful nerdy too :-) p.s. the Ramsay book is now showing on Open Library, and the subtitle is correct... perhaps because the record is from the LC MARC service :-) http://openlibrary.org/works/OL16528530W/Reading_machines perhaps being the operative word. Being able to concretely answer these provenance questions is important. Actually, I'm not sure it was ever incorrect at OpenLibrary. At least I don't think I used it as an example in my Genealogy of a Typo post. //Ed
Re: [CODE4LIB] Corrections to Worldcat/Hathi/Google
On Mon, Aug 27, 2012 at 1:33 PM, Corey A Harper corey.har...@nyu.edu wrote: I think there's a useful distinction here. Ed can correct me if I'm wrong, but I suspect he was not actually suggesting that Git itself be the user-interface to a github-for-data type service, but rather that such a service can be built *on top* of an infrastructure component like GitHub. Yes, I wasn't saying that we could just plonk our data into Github, and pat ourselves on the back for a good day's work :-) I guess I was stating the obvious: technologies like Git have made once hard problems like decentralized version control much, much easier...and there might be some giants' shoulders to stand on. //Ed
Re: [CODE4LIB] Corrections to Worldcat/Hathi/Google
Thanks for sharing this bit of detective work. I noticed something similar fairly recently myself [1], but didn't discover as plausible a scenario for what had happened as you did. I imagine others have noticed this network effect before as well. On Tue, Aug 21, 2012 at 11:42 AM, Lars Aronsson l...@aronsson.se wrote: And sure enough, there it is, http://clio.cul.columbia.edu:7018/vwebv/holdingsInfo?bibId=1439352 But will my error report to Worldcat find its way back to CLIO? Or if I report the error to Columbia University, will the correction propagate to Google, Hathi and Worldcat? (Columbia asks me for a student ID when I want to give feedback, so that removes this option for me.) I realize this probably will sound flippant (or overly grandiose), but innovating solutions to this problem, where there isn't necessarily one metadata master that everyone is slaved to, seems to be one of the more important and interesting problems that our sector faces. When Columbia University can become the source of a bibliographic record for Google Books, HathiTrust and OpenLibrary, etc, how does this change the hub and spoke workflows (with OCLC as the hub) that we are more familiar with? I think this topic is what's at the heart of the discussions about a github-for-data [2,3], since decentralized version control systems [4] allow for the evolution of more organic, push/pull, multimaster workflows...and platforms like Github make them socially feasible, easy and fun. I also think Linked Library Data, where bibliographic descriptions are REST enabled Web resources identified with URLs, and patterns such as webhooks [5] that make it easy to trigger update events, could be part of an answer. Feed technologies like Atom, RSS and the work being done on ResourceSync also seem important technologies for us to use to allow people to poll for changes [6]. 
And being able to say where you have obtained data from, possibly using something like the W3C Provenance vocabulary [7] also seems like an important part of the puzzle. I'm sure there are other (and perhaps better) creative analogies or tools that could help solve this problem. I think you're probably right that we are starting to see the errors more now that more library data is becoming part of the visible Web via projects like GoogleBooks, HathiTrust, OpenLibrary and other enterprising libraries that design their catalogs to be crawlable and indexable by search engines. But I think it's more fun to think about (and hack on) what grassroots things we could be doing to help these new bibliographic data workflows to grow and flourish than to get piled under by the errors, and a sense of futility... Or it might make for a good article or dissertation topic :-) //Ed [1] http://inkdroid.org/journal/2011/12/25/genealogy-of-a-typo/ [2] http://www.informationdiet.com/blog/read/we-need-a-github-for-data [3] http://sunlightlabs.com/blog/2010/we-dont-need-a-github-for-data/ [4] http://en.wikipedia.org/wiki/Distributed_revision_control [5] https://help.github.com/articles/post-receive-hooks [6] http://www.niso.org/workrooms/resourcesync/ [7] http://www.w3.org/TR/prov-primer/
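The feed-polling idea above, where downstream systems watch an Atom or RSS feed of record changes, might look like this sketch; the feed content and record identifiers are invented for illustration:

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def changed_since(atom_xml, since):
    """Return the ids of feed entries updated after a given timestamp,
    the way a downstream catalog might poll for upstream corrections."""
    changed = []
    for entry in ET.fromstring(atom_xml).iter(ATOM + "entry"):
        updated = entry.find(ATOM + "updated").text
        if updated > since:  # ISO 8601 UTC strings sort chronologically
            changed.append(entry.find(ATOM + "id").text)
    return changed

# Invented two-entry feed of bibliographic record updates.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><id>record:1</id><updated>2012-08-20T00:00:00Z</updated></entry>
  <entry><id>record:2</id><updated>2012-08-25T00:00:00Z</updated></entry>
</feed>"""

print(changed_since(FEED, "2012-08-22T00:00:00Z"))  # only record:2
```

A poller like this, run on a schedule and remembering the timestamp of its last check, is the pull counterpart to the push-style webhooks mentioned in the message.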
Re: [CODE4LIB] Job: Senior Java Developer (CACI) at Library of Congress
On Tue, Aug 14, 2012 at 9:02 AM, j...@code4lib.org wrote: The Software Developer will serve as a member of the repository development team at the Library of Congress. The candidate will be responsible for participating in the definition, design, and development of the software, tools and technologies that satisfy functional requirements, within the scope, schedule, and priorities as assigned by the project manager and/or technical lead. The candidate must be familiar with the entire lifecycle of software development, and have experience creating and maintaining applications for production environments. The candidate must be familiar with debugging software issues in the production environment. Btw, if anyone wants to know more about this job and wants to chat about it informally let me know... //Ed
Re: [CODE4LIB] It's all job postings!
150 people responded about whether jobs.code4lib.org posting should come to the discussion list: yes: 132 no: 10 who cares: 8 93% in support or agnostic seems to be a good indicator that the postings should continue to come to the list for now. //Ed
Re: [CODE4LIB] It's all job postings!
On Thu, Aug 2, 2012 at 9:35 AM, Moynihan, Terry terry.moyni...@analog.com wrote: I can't understand why this would be an issue in a profession (librarian) that is very tiny compared to most. I also can't understand why it would be a problem when 50% of college graduates can't get any job let alone one in their field. The US and World economies stink, and more jobs have been lost than ever before in the history of the world. There are still 100's of millions of people without any job and a few job postings are an issue?? Perhaps a step back to the reality of what's really important in life... Thanks for this Terry. You expressed exactly the frustration that led me to hack on jobs.code4lib.org in the first place. I know too many people struggling to find work, and to find work they love. As Dan Chudnov pointed out in his code4lib keynote this year, the library/archive profession is in the midst of a pretty big upheaval/transformation. So, the other goal of jobs.code4lib.org is to help document the skills and jobs that are in demand, to help educators teach their students relevant skills so that they can find jobs. I also wanted it to assist lifelong learners who were interested in refreshing their skillset. Ideas for improving the site are welcome in the issue tracker on Github [1]. //Ed [1] https://github.com/code4lib/shortimer/issues
Re: [CODE4LIB] It's all job postings!
On Thu, Aug 2, 2012 at 10:29 AM, Barbara Cormack bcorm...@corvendesign.com wrote: I would vote for including more information in the postings, as some have come through without any details about the job or the hiring institution, or links. Usually a little searching turns this up, but not always. Just so I understand, have you tried clicking on the jobs.code4lib.org URL included at the bottom of the posting? If not does this link need to be more obvious? //Ed
Re: [CODE4LIB] Job: Digital Projects and Technology Librarian at Yale University
I'm not sure if it helps, but jobs.code4lib.org picked this up downstream from a libgig post yesterday: http://publicboard.libgig.com/job/digital-projects-and-technology-librarian-new-haven-ct-yale-university-b56f0fd024/?d=1&source=rss_page&utm_source=twitterfeed&utm_medium=twitter //Ed On Fri, Jul 20, 2012 at 8:46 AM, Friscia, Michael michael.fris...@yale.edu wrote: No, it is not possible to submit when the job is closed. I'm trying to get clarification if closing it was intentional. Sorry for the confusion. I should add that I don't have anything to do with the job except my department is named in the description as a collaborating partner. ___ Michael Friscia Manager, Digital Library Programming Services Yale University Library (203) 432-1856 -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Matthew Sherman Sent: Friday, July 20, 2012 8:36 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Job: Digital Projects and Technology Librarian at Yale University So even though it says closed to further applications one is actually able to submit? On Fri, Jul 20, 2012 at 5:27 AM, Friscia, Michael michael.fris...@yale.edu wrote: I just asked, our internal locks are only for the first 7 days during which the jobs won't even appear in the system unless you work for Yale. ___ Michael Friscia Manager, Digital Library Programming Services Yale University Library (203) 432-1856 On 7/19/12 11:47 PM, Simon Spero sesunc...@gmail.com wrote: Maybe it's just closed to internal applicants- some sort of Yale lock? On Jul 19, 2012 11:25 PM, Matthew Sherman matt.r.sher...@gmail.com wrote: There is a slight problem here. The posting says it is *closed to further applications*. Can someone from Yale explain/look into that? I would very much like to apply. On Thu, Jul 19, 2012 at 5:54 PM, Simon Spero sesunc...@gmail.com wrote: On Thu, Jul 19, 2012 at 6:35 PM, j...@code4lib.org wrote: * May be required to assist with disaster recovery efforts. 
PREFERRED EDUCATION, EXPERIENCE AND SKILLS * Advanced degree in theology or a related field. Rise, take up thy bed, and walk
Re: [CODE4LIB] Reminder - call for proposals, New England code4lib!
On Fri, Jul 6, 2012 at 2:51 PM, Stern, Randall randy_st...@harvard.edu wrote: This will be a great opportunity to meet your peers at local institutions and generate conversation on code4lib related topics in which you are interested! Please add your proposals now (please, by August 1) for (a) Prepared talks (20 minutes) (b) Lightning talks (5 minutes) (c) Posters Maybe it's just me, but doesn't it seem a bit odd to submit proposals months in advance for lightning talks? My experience of lightning talks is that people can sign up for them at the event, so they can be more spontaneous and of-the-moment. //Ed
Re: [CODE4LIB] Reminder - call for proposals, New England code4lib!
On Sat, Jul 7, 2012 at 9:28 AM, Carol Bean beanwo...@gmail.com wrote: I thought the distinction was that Lightning talks are very short and more informal. well, that too :-)
[CODE4LIB] code4lib.org down?
Paging Oregon State: do we know why code4lib.org isn't responding? http://code4lib.org/ HTTP requests currently seem to time out. //Ed PS. Thanks to Carol Bean for noticing it, and bringing it up in #code4lib :-)
Re: [CODE4LIB] Job: Agile Project Manager at AudioVisual Preservation Solutions
Oops, sorry about that Mark. I should have looked more carefully before adding this after seeing it in your TweetStream. I'll remove the duplicate. Also happened today with the Yale posting. I guess I need to come up with some smarts to detect duplicates. //Ed On Mon, Jun 11, 2012 at 5:43 PM, j...@code4lib.org wrote: **Job Description:** AudioVisual Preservation Solutions (AVPS) seeks an experienced (mid-level) Agile Project Manager to provide essential support and facilitation to an open source software development project for the public media archival community. The position will begin on July 1, 2012 and continue through October 2013, with the possibility of extension. The project manager will both play a critical leadership role in the Agile development process as well as act as primary liaison for clients and stakeholders. This position is full time, based at our office in New York City. No reimbursement for relocation costs will be provided. **Responsibilities** * Oversee the entire project, including overall project planning, project coordination and software development * Oversee Agile development of the application * Develop and document comprehensive project plans, timelines, milestones and deliverables * Manage the complete software development lifecycle * Lead the development and management of project requirements, system features, and user stories * Carefully track and coordinate project progress, ensuring the timely completion of deliverables * Continually prioritize and organize project goals in a way that is clearly accessible to all stakeholders * Manage and track project progress through web-based collaboration tools * Organize and facilitate regular project meetings, including iteration and release planning, daily stand-up meetings, demos, and reviews * Be the primary point of contact for all stakeholders, including clients, developers, and AVPS team members. 
Answer questions, and field inquiries to appropriate team members as needed * Develop documentation and guidelines for software * Help train users of the application * Supervise hand off of application to product owners upon completion of contract * Travel to meetings as needed (10%) **Desired Skills and Experience** * At least three years in a project management role * Demonstrated experience with Agile software development coordination, using frameworks such as Scrum or Feature Driven Development (FDD) * Demonstrated leadership skills, with the ability to manage distributed, remote teams * Excellent verbal, written, presentation, and interpersonal communication skills * Extremely organized, responsive, and detail oriented * Experience managing projects with project tracking, issue tracking, and collaboration software such as JIRA and Confluence * Excellent MS Office skills on Mac and PC platforms, Google Docs, diagramming skills using a variety of software such as OmniGraffle * Certified Scrum Master and/or PMP Certification a plus * Knowledge of library and information science, video and audio production, and/or public media a plus AudioVisual Preservation Solutions (AVPS) is a full service audiovisual preservation and information management consulting firm serving the educational, broadcasting, government, non-profit, and corporate sectors. With a strong focus on professional standards and best practices, open communication, efficient workflows, and the innovative use and development of technological resources, AVPS brings a broad knowledge base and extensive experience to efficiently and effectively meeting the challenges faced in the preservation and access of digital content. To Apply please submit resume and cover letter (including salary requirements if applicable) in PDF format to care...@avpreserve.com by June 22, 2012. Brought to you by code4lib jobs: http://jobs.code4lib.org/job/1003/
Re: [CODE4LIB] OCLC / British Library / VIAF / Wikipedia
On Fri, Jun 1, 2012 at 7:48 PM, Stuart Yeates stuart.yea...@vuw.ac.nz wrote: There's a discussion going on on Wikipedia that may be of interest to subscribers of this list: Thanks for the heads up Stuart! It is an interesting discussion, and one that hopefully can build on the excellent work that Jakob Voss and others [1] have done on German Wikipedia with the Deutsche Bibliothek. //Ed [1] http://meta.wikimedia.org/wiki/Transwiki:Wikimania05/Paper-JV2
Re: [CODE4LIB] MARC Magic for file
On Wed, May 23, 2012 at 6:16 PM, Kyle Banerjee baner...@orbiscascade.org wrote: I'm not sure whether to laugh or cry that it's a sign of progress that a 40 year old utility designed to identify file types is now just beginning to be able to recognize a format that's been around for almost 50 years... Laugh :-) //Ed
[CODE4LIB] duplicate jobs postings from jobs.code4lib.org
I just wanted to apologize for 3 duplicate job postings that were sent today. Now that there are multiple job curators who are finding jobs and putting them on jobs.code4lib.org it is important to double check that a job hasn't been posted already. At a minimum I think this is a social convention that curators should follow if they want to post jobs on jobs.code4lib.org. Perhaps there is something shortimer [1] could do to help prevent this: such as warning when a given job URL has been used before, etc. Anyhow, thanks for your patience :-) //Ed [1] https://github.com/code4lib/shortimer
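One way the warn-when-a-URL-has-been-used-before idea might work (a sketch only; `normalize_url` and the in-memory `seen` set are hypothetical, not part of shortimer): normalize each incoming job URL before comparing it against previously posted ones, so trivial variations like tracking parameters, letter case, or a trailing slash don't defeat the match.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_url(url):
    """Normalize a job URL for duplicate detection: lowercase the
    scheme and host, drop the query string and fragment, and strip
    any trailing slash from the path."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))

seen = set()

def is_duplicate(url):
    """Return True if an equivalent URL has been posted before."""
    key = normalize_url(url)
    if key in seen:
        return True
    seen.add(key)
    return False
```

This would catch reposts like `http://jobs.example.org/job/842/` versus `http://JOBS.example.org/job/842?utm_source=twitter`, which differ only in case, a trailing slash, and a tracking parameter.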
Re: [CODE4LIB] Anyone using node.js?
On Wed, May 9, 2012 at 3:47 AM, Berry, Rob robert.be...@liverpool.ac.uk wrote: You almost certainly should not rewrite an entire codebase from scratch unless there's an extremely good reason to do so. JoelOnSoftware did a good piece on it - http://www.joelonsoftware.com/articles/fog69.html. Why has your project manager decided Node.js is the way to go instead of something like Python or Perl? Just because it's a shiny new technology? Python's got Twisted and Perl has POE if you want to do asynchronous programming. They also both have a very large number of excellent quality libraries to do innumerable other things. I totally agree, it's all about the right tool for the job. Just to clarify, NodeJS is quite a bit different than Twisted and POE because the entire language and its supporting libraries are written for event driven programming from the bottom up. When using Twisted and POE you may end up needing existing libraries that are synchronous, so the wins aren't as great, and things can get...complicated. For a pretty even handed description of this check out Paul Querna's blog post about why Rackspace decided to switch from Twisted to NodeJS for their cloud monitoring dashboard applications [1]. I am not saying Perl and Python are not good tools (they are) just that the benefits of using NodeJS are not all hype. //Ed [1] http://journal.paul.querna.org/articles/2011/12/18/the-switch-python-to-node-js/
Re: [CODE4LIB] Anyone using node.js?
On Wed, May 9, 2012 at 4:50 AM, Berry, Rob robert.be...@liverpool.ac.uk wrote: Though re Python I would say mixing Django with Twisted is a fairly blatant error. There are libraries built on Twisted to serve web-pages, and if you're doing event-driven programming you should really be using them. Heh, but part of your argument for using POE or Twisted was that they also both have a very large number of excellent quality libraries to do innumerable other things. I think it's more like a slippery slope of mixing programming paradigms than it is a blatant error. Also, I think it was specifically the Django ORM code that bit them hardest, not HTTP calls. Yes there are ORM options like adbmapper, but I think you increasingly find yourself in the weeds on the fringe of the Python community. //Ed
Re: [CODE4LIB] Anyone using node.js?
On Wed, May 9, 2012 at 5:17 AM, Berry, Rob robert.be...@liverpool.ac.uk wrote: No, fair enough, you are right. If that's the paradigm you want it would be a better bet to go for a language that has it built in from the ground up. And (just so it isn't lost) you are absolutely right to question whether there is a legitimate reason for wanting to do the rewrite :-) //Ed
Re: [CODE4LIB] Anyone using node.js?
I've been using NodeJS in a few side projects lately, and have come to like it quite a bit for certain types of applications: specifically applications that need to do a lot of I/O in memory constrained environments. A recent one is Wikitweets [1] which provides a real time view of tweets on Twitter that reference Wikipedia. Similarly Wikistream [2] monitors ~30 Wikimedia IRC channels for information about Wikipedia articles being edited and publishes them to the Web. For both these apps the socket.io library for NodeJS provided a really nice abstraction for streaming data from the server to the client using a variety of mechanisms: web sockets, flash socket, long polling, JSONP polling, etc. NodeJS' event driven programming model made it easy to listen to the Twitter stream, or the ~30 IRC channels, while simultaneously holding open socket connections to browsers to push updates to--all from within one process. Doing this sort of thing in a more typical web application stack like Apache or Tomcat can get very expensive where each client connection is a new thread or process--which can lead to lots of memory being used. If you've done any JavaScript programming in the browser, it will seem familiar, because of the extensive use of callbacks. This can take some getting used to, but it can be a real win in some cases, especially in applications that are more I/O bound than CPU bound. Ryan Dahl (the creator of NodeJS) gave a presentation [4] to a PHP group last year which does a really nice job of describing how NodeJS is different, and why it might be useful for you. If you are new to event driven programming I wouldn't underestimate how much time you might spend feeling like you are turning your brain inside out. In general I was really pleased with the library support in NodeJS, and the amount of activity there is in the community. The ability to run the same code on the server as in the browser might be of some interest. 
Also, being able to use libraries like jQuery or PhantomJS in command line programs is pretty interesting for things like screen scraping the tagsoup HTML that is so prevalent on the Web. If you end up needing to do RDF and XML processing from within NodeJS and you aren't finding good library support you might want to find databases (Sesame, eXist, etc) that have good HTTP APIs and use something like request [5] if there isn't already support for it. I wrote up why NodeJS was fun to use for Wikistream on my blog if you are interested [6]. I recommend you try doing something small to get your feet wet with NodeJS first before diving in with the rewrite. Good luck! //Ed [1] http://wikitweets.herokuapp.com [2] http://wikistream.inkdroid.org [3] http://inkdroid.org/journal/2011/11/07/an-ode-to-node/ [4] http://www.youtube.com/watch?v=jo_B4LTHi3I [5] https://github.com/mikeal/request [6] http://inkdroid.org/journal/2011/11/07/an-ode-to-node/ On Tue, May 8, 2012 at 5:24 PM, Randy Fischer randy.fisc...@gmail.com wrote: On Mon, May 7, 2012 at 11:17 PM, Ethan Gruber ewg4x...@gmail.com wrote: It was recently suggested to me that a project I am working on may adopt node.js for its architecture (well, be completely re-written for node.js). I don't know anything about node.js, and have only heard of it in some passing discussions on the list. I'd like to know if anyone on code4lib has experience developing in this platform, and what their thoughts are on it, positive or negative. It's a very interesting project - I think of it as a kind of non-preemptive multitasking framework, very much like POE in the Perl world, but with a more elegant way of managing the event queue. Where it could shine is that it accepts streaming, non-blocking HTTP requests. So for large PUTs and POSTs, it could be a real win (most other web-server arrangements are going to require completed uploads of the request, followed by a hand-off to your framework of an opened file descriptor to a temporary file). 
My naive tests with it a year or so ago gave inconsistent results, though (sometimes the checksums of large PUTs were right, sometimes not). And of course to scale up, do SSL, etc, you'll really need to put something like Apache in front of it - then you lose the streaming capability. (I'd love to hear I'm wrong here). -Randy Fischer
[CODE4LIB] code4lib journal site statistics
Just a quick note to let you know that site statistics for Code4lib Journal [1] are going to be emailed regularly to the c4lj-discuss Google Group [2]. The stats are provided as CSV attachments from Google Analytics, which include page views, visitors and traffic sources. If you have any suggestions/ideas please let us know at jour...@code4lib.org or on c4lj-discuss. Thanks to Jason Ronallo for the idea to do this. //Ed [1] http://journal.code4lib.org [2] https://groups.google.com/d/msg/c4lj-discuss/J-kqRtyrcnM/WYxLbw9YncUJ
Re: [CODE4LIB] Author authority records to create publication feed?
Two other projects that are worth taking a look at are VIVO [1] and BibApp [2]. Both take the approach of enabling institutions to manage information about their faculty, which can then be federated more widely. I guess the reality is that there will be lots of identifiers for faculty, and simple systems that allow them to be collaboratively and meaningfully linked together are a good way forward. //Ed [1] http://vivoweb.org/ [2] http://bibapp.org/ On Fri, Apr 13, 2012 at 1:03 PM, Paul Butler (pbutler3) pbutl...@umw.edu wrote: Thank you all for your suggestions! Kevin's excellent email confirms my suspicions. I am working on plans to transform our digital repository to a more broadly defined IR, so that will likely be our focus down the road. However, any solution that requires faculty input without an immediate, tangible benefit will likely gain slow traction. I will pass along the suggestions and go from there. Cheers, Paul +-+-+-+-+-+-+-+-+-+-+-+-+ Paul R Butler Assistant Systems Librarian Simpson Library University of Mary Washington 1801 College Avenue Fredericksburg, VA 22401 540.654.1756 libraries.umw.edu Sent from the mighty Dell Vostro 230. -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ford, Kevin Sent: Friday, April 13, 2012 10:50 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Author authority records to create publication feed? Hi Paul, I can't really offer any suggestions but to say that this is a problem area presently. In fact, there was a recent workshop, held in connection with the Spring CNI Membership Meeting, designed specifically to look at this problem (and author identity management more generally). 
You can read more about it from the announcement here [1], but the idea was to bring a number of the larger actors (Web of Science, arXiv, ORCID, ISNI, VIAF, LC/NACO, and a few more) involved in managing authorial identity together to learn about the work being done, and to discuss improved ways to disambiguate scholarly identities and then diffuse and share that information within and across the library and scholarly publishing realms. Clifford Lynch, who moderated the meeting, will publish a post-workshop report in a few weeks [2]. Perhaps of additional interest, [2] also contains a link to the report of a similar workshop held in London about international author identity. Initiatives like ISNI [3] and ORCID [4], which mint identifiers for (public, authorial) identities, and VIAF, which has done so much to aggregate the authority records of the participating libraries (while also assigning them an identifier), are essential to disambiguating one identity from another and assigning unique identifiers to those identities. For identifiers like ORCIDs, the faculty member's sponsoring organization might acquire the ORCID for him/her, after which the faculty member will/may know and use the identifier in situations such as grant applications, publishing, etc. (though it might also be early days for this activity also). Part of the process, however, is diffusing the identifier across the library and scholarly publishing domains, all the while matching it with the correct identity (and identifier) in another system. That said, when ISNIs and ORCIDs and, perhaps, VIAF identifiers start to make their ways into Web of Science, arXiv, LC/NACO file, and many other places, we - developers looking to create RSS feeds of author publications across services but without having to deal with same-name problems or variants - might then have the hook we need to generate RSS feeds for author publications from such services as JSTOR, EBSCO, arXiv, Web Of Science, etc. 
Alternatively, you'd have to get your faculty members to submit their entire publication history to academia.edu (as Ethan suggested), after which the community would have to request an RSS feed of that history, or an institutional repository (as Chad suggested), but I understand these types of things are an uphill battle with (often busy, underpaid) faculty. Cordially, Kevin [1] http://www.cni.org/news/cni-workshop-scholarly-id/ [2] https://mail2.cni.org/Lists/CNI-ANNOUNCE/Message/113744.html [3] http://www.isni.org/ [4] http://about.orcid.org/ -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Paul Butler (pbutler3) Sent: Friday, April 13, 2012 9:25 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Author authority records to create publication feed? Howdy All, Some folks from across campus just came to my door with this question. I am still trying to work through the possibilities and problems, but thought others might have encountered something similar. They are looking for a way to create a feed (RSS, or anything else that might work) for each faculty member on campus to
Re: [CODE4LIB] Job: Lead Web Developer at Florida State University
On Mon, Apr 9, 2012 at 2:02 PM, GORE, EMILY eg...@fsu.edu wrote: My apologies to all for the multiple listings, and I did forget to get approval from Roy T. for all of them. Please forgive! No worries Emily. If there is a way the jobs.code4lib.org admin interface can be improved definitely let me know. //Ed
Re: [CODE4LIB] Job: at ScraperWiki
Hi Jodi, Was there a reason why you included the Pool temperatures, company registrations, dairy prices … in the job description at: http://jobs.code4lib.org/job/842 I almost flagged the posting as spam... //Ed On Tue, Mar 13, 2012 at 9:03 AM, j...@code4lib.org wrote: Pool temperatures, company registrations, dairy prices … ScraperWiki is a Silicon Valley style startup, in Liverpool, UK. We're changing the world of open data, and how data science is done together on the Internet. We're looking for a data scientist who… Loves data, and what can be done with it. Able to code in Ruby or Python, but willing to learn the other. Good at communicating with non-technical people. Happy to responsively give our corporate customers what they need. Some practical things… We're an innovative, funded startup. Things will change lots, as we find how our business works. We'd like you to enjoy and help with that. Must be willing to either relocate to Liverpool or to commute to our offices which are near the University. We might be able to organise working visas. To apply - send the following: Links to two scrapers that you've made on ScraperWiki, involving a dataset that you find interesting for some reason. Similarly, a link to a view you've made on ScraperWiki (can be related to the two scrapers). A link to your resume/CV Any questions you have about the job. Along to fran...@scraperwiki.com with the word swjob4 in the subject (and yes, that means no agencies, unless the candidates do that themselves) … Oil wells, marathon results, planning applications Brought to you by code4lib jobs: http://jobs.code4lib.org/job/842/
Re: [CODE4LIB] Job: at ScraperWiki
Oh I see it's in the job description you got from the ScraperWiki blog post: http://blog.scraperwiki.com/2012/03/13/job-advert-data-scientist-web-scraper/ On Tue, Mar 13, 2012 at 12:40 PM, Ed Summers e...@pobox.com wrote: Hi Jodi, Was there a reason why you included the Pool temperatures, company registrations, dairy prices … in the job description at: http://jobs.code4lib.org/job/842 I almost flagged the posting as spam... //Ed On Tue, Mar 13, 2012 at 9:03 AM, j...@code4lib.org wrote: Pool temperatures, company registrations, dairy prices … ScraperWiki is a Silicon Valley style startup, in Liverpool, UK. We're changing the world of open data, and how data science is done together on the Internet. We're looking for a data scientist who… Loves data, and what can be done with it. Able to code in Ruby or Python, but willing to learn the other. Good at communicating with non-technical people. Happy to responsively give our corporate customers what they need. Some practical things… We're an innovative, funded startup. Things will change lots, as we find how our business works. We'd like you to enjoy and help with that. Must be willing to either relocate to Liverpool or to commute to our offices which are near the University. We might be able to organise working visas. To apply - send the following: Links to two scrapers that you've made on ScraperWiki, involving a dataset that you find interesting for some reason. Similarly, a link to a view you've made on ScraperWiki (can be related to the two scrapers). A link to your resume/CV Any questions you have about the job. Along to fran...@scraperwiki.com with the word swjob4 in the subject (and yes, that means no agencies, unless the candidates do that themselves) … Oil wells, marathon results, planning applications Brought to you by code4lib jobs: http://jobs.code4lib.org/job/842/
Re: [CODE4LIB] Job: at ScraperWiki
On Tue, Mar 13, 2012 at 12:49 PM, Chad Benjamin Nelson cnelso...@gsu.edu wrote: I think it is just some examples of the weird and interesting data in scraperwiki. Yeah, I guess it would be kind of pointless spam eh? :-) //Ed
Re: [CODE4LIB] Q.: MARC8 vs. MARC/Unicode and pymarc and misencoded III records
On Fri, Mar 9, 2012 at 12:12 PM, Godmar Back god...@gmail.com wrote: Here's my hand ||*( [1]. ||*) I'm sorry that I was so unhelpful w/ the patches welcome message on your docfix. You're right, it was antagonistic of me to suggest you send a patch for something so simple. Plus, it wasn't even accurate, because I actually wanted a pull request :-) I've been amazed at how much github can speed fixes getting into the codebase--even very small ones. Using the machinery of git (fork, commit, push, pull request, merge) leaves a trail which is extremely helpful for surfacing who is helping with what at the source code level. It would be great if the students that you mentioned who are using pymarc knew that they have the ability to participate at this level as well. One of the reasons why we moved pymarc over to github was to enable more people to more easily maintain the software. I agree that there are some dusty corners of pymarc that could use some cleanup, and that character encoding is probably the cruftiest of the cruft. Perhaps python3 compatibility will be a good time to rethink how some of it works? At any rate, I hope that you will keep helping the project out, we need it. //Ed PS. thanks for being you Mike :-)
Re: [CODE4LIB] Q.: MARC8 vs. MARC/Unicode and pymarc and misencoded III records
On Mon, Mar 12, 2012 at 10:14 AM, Godmar Back god...@gmail.com wrote: Here's a make-up pull request especially made for you :-) https://github.com/edsu/pymarc/pull/25 Merged! :-D //Ed
Re: [CODE4LIB] Q.: MARC8 vs. MARC/Unicode and pymarc and misencoded III records
Hi Terry, On Thu, Mar 8, 2012 at 2:36 PM, Reese, Terry terry.re...@oregonstate.edu wrote: This is one of the reasons you really can't trust the information found in position 9. This is one of the reasons why when I wrote MarcEdit, I utilize a mixed process when working with data and determining character set -- a process that reads this byte and takes the information under advisement, but in the end treats it more as a suggestion and one part of a larger heuristic analysis of the record data to determine whether the information is in UTF8 or not. Fortunately, determining if a set of data is in UTF8 or something else, is a fairly easy process. Determining the something else is much more difficult, but generally not necessary. Can you describe in a bit more detail how MarcEdit sniffs the record to determine the encoding? This has come up enough times w/ pymarc to make it worth implementing. //Ed
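Terry's observation that determining whether a set of data is in UTF8 "is a fairly easy process" boils down to attempting a strict decode. A minimal Python sketch (an illustration of the general heuristic only, not MarcEdit's or pymarc's actual code; `sniff_encoding` is a hypothetical helper):

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8.  Pure ASCII
    also passes, which is fine: ASCII is a subset of UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def sniff_encoding(raw: bytes, leader_09: str) -> str:
    """Take the Leader/09 value under advisement, but let the
    byte-level check win when the two disagree (hypothetical)."""
    if looks_like_utf8(raw):
        return "utf-8"
    # Not valid UTF-8, whatever Leader/09 claims: treat the record as
    # the "something else" camp (MARC-8 for most legacy MARC data).
    return "marc-8"
```

As Terry notes, the hard half is identifying what the "something else" is; the easy half is that invalid UTF-8 byte sequences reliably fail a strict decode.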
Re: [CODE4LIB] code4lib.org back up, along with wiki.code4lib.org and planet.code4lib.org
Hoorah, thanks RyanW and RyanO! Striking while the iron is hot, would it be possible to verify that routine backups are happening for the drupal and mediawiki databases on code4lib.org? On Mon, Feb 20, 2012 at 5:07 PM, Wick, Ryan ryan.w...@oregonstate.edu wrote: We're back up and running, thanks to Ryan Ordway. Let me know if you notice something that isn't working as expected. Ryan Wick Information Technology Consultant Special Collections & Archives Research Center Oregon State University Libraries http://osulibrary.oregonstate.edu/specialcollections
[CODE4LIB] code4lib.org
I apologize if this has already come up, but has there been any announcement about the code4lib.org drupal and mediawiki outages at Oregon State? //Ed
Re: [CODE4LIB] GetLamp screening at Code4Lib
Shoot, I'm just realizing now I'm also double booked for the newcomers dinner ... was there another option for the Get Lamp showing? On Tue, Jan 31, 2012 at 5:16 PM, Dongqing Xie d...@fsu.edu wrote: Adam Wead aw...@rockhall.org wrote: Shouldn't be a problem. As I understand it, the screening is basically plugging in laptop to the TV and watching the movie. ...adam On Jan 31, 2012, at 4:34 PM, Michael J. Giarlo wrote: Just curious: is there a chance that we can arrange for subsequent viewings? I ask because a number of us have late newcomer dinner reservations. Maybe we can run it during the craft beer drink-up, too, for instance? Not trying to make this complicated. -Mike On Tue, Jan 31, 2012 at 16:28, Adam Wead aw...@rockhall.org wrote: Hi all, So far the preferred time for the GetLamp showing is Tuesday at 9 pm. I'll close the Doodle poll tomorrow at 5 EST to give everyone a chance to vote. http://doodle.com/p4c32i3b2ybsrkbh ...adam This communication is a confidential and proprietary business communication. It is intended solely for the use of the designated recipient(s). If this communication is received in error, please contact the sender and delete this communication.
Re: [CODE4LIB] OCLC control number access
On Tue, Jan 31, 2012 at 3:45 PM, Stuart Spore spore...@nyu.edu wrote: If I can be forgiven a possibly naive question, is it possible to quickly and freely get a list of all the OCLC control numbers associated with an OCLC symbol (your own or someone else's) without resorting to any elaborate ( contractual) batchload service or the like? If it's your own symbol I guess you could dump your OPAC as MARC and rifle through it with a script? If it's someone else's you could ask them for a dump of their opac as MARC and rifle through it. I seem to remember the Worldcat API had some support for giving holdings information... That probably wasn't very helpful, since it's probably not going to be quick but sometimes the obvious answer isn't very obvious. //Ed
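The "rifle through it with a script" step could look something like this (a sketch; it assumes OCLC control numbers appear in 035 $a values with the conventional `(OCoLC)` prefix, which is common practice but not guaranteed in every catalog):

```python
import re

# OCLC control numbers conventionally appear in 035 $a with an
# "(OCoLC)" prefix, sometimes followed by an ocm/ocn/on letter prefix
# and leading zeros.  The pattern is illustrative, not a specification.
OCLC_PATTERN = re.compile(r"\(OCoLC\)\s*(?:ocm|ocn|on)?0*(\d+)")

def oclc_numbers(field_035_values):
    """Pull the distinct OCLC numbers out of a sequence of 035 $a values."""
    found = set()
    for value in field_035_values:
        m = OCLC_PATTERN.search(value)
        if m:
            found.add(m.group(1))
    return sorted(found, key=int)
```

With pymarc you would feed it the 035 $a values pulled from each record in the MARC dump and collect the results as you go.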
Re: [CODE4LIB] GetLamp screening at Code4Lib
On Wed, Feb 1, 2012 at 5:52 AM, Ed Summers e...@pobox.com wrote: Shoot, I'm just realizing now I'm also double booked for the newcomers dinner ... was there another option for the Get Lamp showing? Adam reminded me in #code4lib that the newcomers dinner starts at 6 and will likely be over by 9. So I'm not double booked after all. Maybe some of the newcomer dinners could even segue into watching Get Lamp if there is interest? //Ed
Re: [CODE4LIB] Digital Object Viewer
If by digital objects you mean images we've been getting a lot of mileage out of OpenSeaDragon [1] at the Library of Congress. You do have to pre-generate the deep zoom (DZI) files [2] or you can implement your own server side tiling code to do it on the fly. As a space vs time trade off we generate tiles on the fly in Chronicling America [3], since there are millions of newspaper page images. But in the World Digital Library [4] we generate DZI files. Chris Thatcher, one of the developers at LC, has a fork of the codeplex repo on GitHub [5], which we are applying some fixes to, since GitHub is a lot easier to navigate and use than Codeplex. If you are curious here are some samples of the viewer in action: http://chroniclingamerica.loc.gov/lccn/sn85066387/1912-01-31/ed-1/seq-1/ http://www.wdl.org/en/item/4106/zoom/#group=1page=4 //Ed [1] http://openseadragon.codeplex.com/ [2] https://github.com/openzoom/deepzoom.py [3] http://chroniclingamerica.loc.gov [4] http://wdl.org [5] https://github.com/thatcher/openseadragon
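For a sense of the space vs time trade off, the deep zoom pyramid math is easy to sketch (illustration only; a tool like deepzoom.py [2] does the actual tile generation): each level halves the image dimensions down to 1x1, and each level is cut into fixed-size tiles.

```python
import math

def dzi_levels(width, height):
    """Number of levels in a Deep Zoom pyramid: level 0 is 1x1 and
    the top level is the full-resolution image."""
    return int(math.ceil(math.log2(max(width, height)))) + 1

def tiles_at_level(width, height, level, max_level, tile_size=254):
    """How many tiles a given level needs, halving the dimensions
    once per step down from the top level."""
    scale = 2 ** (max_level - level)
    w = max(1, math.ceil(width / scale))
    h = max(1, math.ceil(height / scale))
    return math.ceil(w / tile_size) * math.ceil(h / tile_size)
```

Pre-generating DZI files means computing and storing every tile at every level up front; tiling on the fly (as in Chronicling America) trades that storage for CPU at request time.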
Re: [CODE4LIB] Digital Object Viewer
Yes, it's my understanding that OpenSeaDragon is basically a JavaScript implementation of the OpenZoom flash code...and that they work on roughly the same DZI files. But my knowledge of OpenZoom is very limited, so take that with a grain of salt. //Ed On Tue, Jan 31, 2012 at 12:07 PM, Raymond Yee raymond@gmail.com wrote: Thanks, Ed, for pointing out OpenSeaDragon -- I didn't know about it. I've been aware of another similar open source project: http://www.openzoom.org/ that makes use of Flash -- though the openzoom github repo has openzoom.js (https://github.com/openzoom/openzoom.js). I've used the Python toolkit of openzoom (https://github.com/openzoom/deepzoom.py) to generate tiles. -Raymond On 1/31/12 8:59 AM, Ed Summers wrote: If by digital objects you mean images we've been getting a lot of mileage out of OpenSeaDragon [1] at the Library of Congress. You do have to pre-generate the deep zoom (DZI) files [2] or you can implement your own server side tiling code to do it on the fly. As a space vs time trade off we generate tiles on the fly in Chronicling America [3], since there are millions of newspaper page images. But in the World Digital Library [4] we generate DZI files. Chris Thatcher, one of the developers at LC, has a fork of the codeplex repo on GitHub [5], which we are applying some fixes to, since GitHub is a lot easier to navigate and use than Codeplex. If you are curious here are some samples of the viewer in action: http://chroniclingamerica.loc.gov/lccn/sn85066387/1912-01-31/ed-1/seq-1/ http://www.wdl.org/en/item/4106/zoom/#group=1page=4 //Ed [1] http://openseadragon.codeplex.com/ [2] https://github.com/openzoom/deepzoom.py [3] http://chroniclingamerica.loc.gov [4] http://wdl.org [5] https://github.com/thatcher/openseadragon
[CODE4LIB] jobs.code4lib.org
(apologies if you already saw this on the code4libcon list) There were some questions on #code4lib IRC today about jobs.code4lib.org. Jonathan is right, it is a bit wacky, but hopefully in a good way. I was going to grab a lightning talk slot at the conference to talk about it, but here is a brief summary that may help. jobs.code4lib.org is a Python Django application called shortimer that is on GitHub [1]. Jobs end up on jobs.code4lib.org via two workflows:

1. posting via email:
- lots of people post job ads to the code4lib mailing list, so shortimer subscribes to the list and tries to find job postings in the emails it receives
- if it finds what looks like a job it extracts what metadata it can, and adds it to its database in a non-published state
- logged in users can curate jobs [2] (clean up job titles, add the employer, job URL, and any tags that seem relevant) and then hit publish
- when someone publishes a job it will show up on the homepage [3]
- when someone publishes a job the code4lib twitter account [4] will tweet the job announcement

2. posting via website:
- a logged in user can go to a web form [5] and post a new job
- when they hit publish an email will go to the discussion list, and it will get tweeted

That's pretty much it. Freebase is used as a controlled vocabulary for tags and employers, which has some benefits in displaying jobs by a topic like Ruby [6]. It's even possible to get some general trend reporting [7]. This is a long way of saying: if you have jobs to announce before or at the conference please feel free to try out jobs.code4lib.org :-) Of course there is a whole lot of value in a physical board at the conference and/or a wiki with people that can answer questions in person though. There's no replacing that...
//Ed [1] http://github.com/code4lib/shortimer [2] http://jobs.code4lib.org/curate/ [3] http://jobs.code4lib.org [4] http://twitter.com/code4lib [5] http://jobs.code4lib.org/job/new/ [6] http://jobs.code4lib.org/jobs/ruby/ [7] http://jobs.code4lib.org/reports/
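[Editor's note: to make the email workflow above concrete, here is a hypothetical sketch of the kind of subject-line heuristic involved. This is illustrative only — shortimer's actual parsing lives in the repo and differs:]

```python
import re

# Hypothetical pattern: the real heuristics also look at the message body;
# this just shows the extract-then-curate idea described above.
JOB_SUBJECT = re.compile(
    r'^(?:\[CODE4LIB\]\s*)?Job(?:\s+Posting)?:\s*(?P<title>.+?)'
    r'(?:\s+at\s+(?P<employer>.+))?$', re.IGNORECASE)

def extract_job(subject):
    """Return a draft job dict (non-published, awaiting curation) or None."""
    m = JOB_SUBJECT.match(subject.strip())
    if not m:
        return None
    return {
        "title": m.group("title"),
        "employer": m.group("employer"),  # often None; curators fill it in
        "published": False,               # a human hits publish later
    }

print(extract_job("[CODE4LIB] Job: Metadata Librarian at Example University"))
```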
Re: [CODE4LIB] jobs.code4lib.org
I guess it's rarely a good idea to respond to your own post, but I forgot to add that when a job is published on jobs.code4lib.org it will show up in the site's Atom feed [1]. The feed should be usable by your feed reader of choice, and could also be useful if you want to syndicate the jobs elsewhere. //Ed [1] http://jobs.code4lib.org/feed/ PS. It was kind of fun to finally use the tag link relation to mark up the job tags in the feed with Freebase URLs. For example:

<entry>
  ...
  <link rel="tag" title="Unix" href="http://www.freebase.com/view/en/unix" type="text/html" />
  <link rel="tag" title="Unix [JSON]" href="http://www.freebase.com/experimental/topic/standard/en/unix" type="application/json" />
  <link rel="tag" title="Unix [RDF]" href="http://rdf.freebase.com/rdf/en.unix" type="application/rdf+xml" />
</entry>
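[Editor's note: for anyone wanting to emit the same markup, a sketch using Python's stdlib ElementTree. The URL patterns are copied from the example above; the helper function is mine, not shortimer's:]

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

def add_tag_links(entry, tag, freebase_key):
    """Append rel="tag" links for a Freebase topic key like "en/unix"."""
    variants = [
        (tag, "http://www.freebase.com/view/%s" % freebase_key, "text/html"),
        (tag + " [JSON]",
         "http://www.freebase.com/experimental/topic/standard/%s" % freebase_key,
         "application/json"),
        (tag + " [RDF]",
         "http://rdf.freebase.com/rdf/%s" % freebase_key.replace("/", "."),
         "application/rdf+xml"),
    ]
    for title, href, mime in variants:
        ET.SubElement(entry, "{%s}link" % ATOM,
                      rel="tag", title=title, href=href, type=mime)

entry = ET.Element("{%s}entry" % ATOM)
add_tag_links(entry, "Unix", "en/unix")
print(ET.tostring(entry).decode())
```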
Re: [CODE4LIB] marc in json
Thanks for all the helpful guidance. I'll work on getting the JSON implementation updated before releasing it. I don't know if it's of interest but the Twitter firehose (as delivered by Gnip) is line-oriented JSON. Each line is a tweet and all its metadata. This format is handy for doing things like counting the number of records with a 'wc -l' instead of having to parse the JSON... which can be expensive when there can be 10M an hour. //Ed
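[Editor's note: a toy illustration of why line-oriented JSON makes that cheap — counting needs no parsing at all, while field access still gets a full parse per line. The tweet data here is made up:]

```python
import io
import json

# Three records, one JSON object per line (the Gnip-style firehose layout).
firehose = io.StringIO(
    '{"id": 1, "text": "tweet one"}\n'
    '{"id": 2, "text": "tweet two"}\n'
    '{"id": 3, "text": "tweet three"}\n')

# The wc -l equivalent: count lines, never touch the JSON.
count = sum(1 for line in firehose)

# Parse only when you actually need the fields.
firehose.seek(0)
records = [json.loads(line) for line in firehose]

print(count, records[0]["text"])
```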
[CODE4LIB] marc in json
Martin Czygan recently added JSON support to pymarc [1]. Before this gets rolled into a release I was wondering if it might make sense to bring the implementation in line with Ross Singer's proposed JSON serialization for MARC [2]. After quickly looking around it seems to be what got implemented in ruby-marc [3] and PHP's File_MARC [4]. It also looked like there was a MARC::Record branch [5] for doing something similar, but I'm not sure if that has been released yet. It seems like a no-brainer to bring it in line, but I thought I'd ask since I haven't been following the conversation closely. //Ed [1] https://github.com/edsu/pymarc/commit/245ea6d7bceaec7215abe788d61a0b34a6cd849e [2] http://dilettantes.code4lib.org/blog/2010/09/a-proposal-to-serialize-marc-in-json/ [3] https://github.com/ruby-marc/ruby-marc/blob/master/lib/marc/record.rb#L227 [4] http://pear.php.net/package/File_MARC/docs/latest/File_MARC/File_MARC_Record.html#methodtoJSON [5] http://marcpm.git.sourceforge.net/git/gitweb.cgi?p=marcpm/marcpm;a=shortlog;h=refs/heads/marc-json
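[Editor's note: for reference, one reading of the proposed layout [2] looks roughly like this — control fields as single-key objects, variable fields carrying indicators and an ordered subfield list. This is an illustration; the proposal itself is authoritative:]

```python
import json

# Sketch of a marc-in-json record: the leader as a string, fields as an
# ordered list so MARC field order survives the round trip.
record = {
    "leader": "00000nam a2200000 a 4500",
    "fields": [
        {"001": "12345"},                    # control field: tag -> value
        {"245": {                            # variable field
            "ind1": "1",
            "ind2": "0",
            "subfields": [
                {"a": "Example title :"},    # ordered single-key objects
                {"b": "a subtitle."},
            ],
        }},
    ],
}

print(json.dumps(record, indent=2))
```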
Re: [CODE4LIB] Library News (à la ycombinator's hackernews)
On Wed, Nov 30, 2011 at 10:51 PM, Matthew Phillips mphill...@law.harvard.edu wrote: I'm the guy that did the hacking (with help from my coworkers, Jeff and David) to get Hacker News up and running. If you have technical questions about the site, shoot them my way. Nice work. It's great to see it starting to get used. Mark is right, Library News is running the news.arc source from https://github.com/nex3/arc I had to do a little customization, but the code worked out of the box for me. I'm really interested in seeing Library News blossom. If you have input, please share it. I'd also be excited to get a couple of community leaders to become moderators for the site (drop me an email if you want to volunteer yourself/someone). I noticed that news.arc has some RSS functionality [1]. Does it seem easy/possible to add a link element for the RSS feed to the HTML, e.g.

<link rel="alternate" type="application/rss+xml" title="Library News" href="http://news.librarycloud.org/rss" />

//Ed [1] https://github.com/nex3/arc/blob/master/news.arc#L2239
Re: [CODE4LIB] HTML5 Microdata, schema.org, and digital collections
Damn auto-complete :-) Oh well, I guess everyone knows how inept I am now! On Thu, Dec 1, 2011 at 1:03 PM, Ed Summers e...@pobox.com wrote: Excellent! Thanks for working with the situation :-) //Ed On Thu, Dec 1, 2011 at 9:55 AM, Jason Ronallo jrona...@gmail.com wrote: Ed, I'd like to still fit the article into the next issue. I agree that the cultural heritage community needs more exposure to these new web standards. With the increased interest in linked data, the landscape of choices for how to expose your data has become more complex, and I hope the article can get the discussion going and provide some guidance there. I also see this as an opportunity for me to get something out there relatively early on this topic, and coming before my talk is good timing. Jason On Thu, Dec 1, 2011 at 9:06 AM, Ed Summers e...@pobox.com wrote: Hi Jason, Let me just say again how bad I feel for dropping this on the floor. I feel even more guilty because more discussion about the use of html5/microdata in the cultural heritage community is desperately needed. So is it OK to still try to fit your article into the next issue, or should we push it to issue 17? //Ed On Thu, Dec 1, 2011 at 9:00 AM, Jason Ronallo jrona...@gmail.com wrote: Hi, Ed, I'm glad to hear from you and the journal. What I had when I submitted a proposal to the journal was just a proposal and an implementation, so I won't be able to have a draft to you before the end of the month. I'll try to share something with you sooner than that, though. I'll be happy to license the article US CC-BY and the code as open source (hopefully MIT). Thank you, Jason On Thu, Dec 1, 2011 at 3:59 AM, Ed Summers e...@pobox.com wrote: Hi Jason, I'm pleased to tell you that your recent proposal for an article about HTML5 Microdata has been provisionally accepted to the Code4Lib Journal. The editorial committee is interested in your proposal, and would like to see a draft. 
I have to apologize however, since through an oversight of my own this email should have been sent almost a month ago, and was not (more on this below). As a member of the Code4Lib Journal editorial committee, I will be your contact for this article, and will work with you to get it ready for publication. We hope to publish your article in issue 16 of the Journal, which is scheduled to appear Jan 30, 2012. Incidentally, this is good timing for your code4lib talk on the same topic! The official deadline for submission of a complete draft is Friday, December 2. But since I dropped the ball on getting this email out to you promptly I completely understand if you can't hit that date. Looking at the deadlines [1] for issue 16 I can see that the 2nd draft is due Dec 30th, which is perhaps a more realistic goal for a draft. Please send whatever you have as soon as you can and we can get started. Upon receipt of the draft, I will work with you to address any changes recommended by the Editorial Committee. More information about our author guidelines may be found at http://journal.code4lib.org/article-guidelines. Please note that final drafts must be approved by a vote of the Editorial Committee before being published. We also require all authors to agree to US CC-BY licensing for the articles we publish in the journal. We recommend that any included code also have some type of code-specific open source license (such as the GPL). We look forward to seeing a complete draft and hope to include it in the Journal. Thank you for submitting to us, and feel free to contact me directly with any questions. If you could drop me a line acknowledging receipt of this email, that would be great. //Ed [1] http://wiki.code4lib.org/index.php/Code4Lib_Journal_Deadlines
Re: [CODE4LIB] HTML5 Microdata, schema.org, and digital collections
Excellent! Thanks for working with the situation :-) //Ed On Thu, Dec 1, 2011 at 9:55 AM, Jason Ronallo jrona...@gmail.com wrote: Ed, I'd like to still fit the article into the next issue. I agree that the cultural heritage community needs more exposure to these new web standards. With the increased interest in linked data, the landscape of choices for how to expose your data has become more complex, and I hope the article can get the discussion going and provide some guidance there. I also see this as an opportunity for me to get something out there relatively early on this topic, and coming before my talk is good timing. Jason On Thu, Dec 1, 2011 at 9:06 AM, Ed Summers e...@pobox.com wrote: Hi Jason, Let me just say again how bad I feel for dropping this on the floor. I feel even more guilty because more discussion about the use of html5/microdata in the cultural heritage community is desperately needed. So is it OK to still try to fit your article into the next issue, or should we push it to issue 17? //Ed On Thu, Dec 1, 2011 at 9:00 AM, Jason Ronallo jrona...@gmail.com wrote: Hi, Ed, I'm glad to hear from you and the journal. What I had when I submitted a proposal to the journal was just a proposal and an implementation, so I won't be able to have a draft to you before the end of the month. I'll try to share something with you sooner than that, though. I'll be happy to license the article US CC-BY and the code as open source (hopefully MIT). Thank you, Jason On Thu, Dec 1, 2011 at 3:59 AM, Ed Summers e...@pobox.com wrote: Hi Jason, I'm pleased to tell you that your recent proposal for an article about HTML5 Microdata has been provisionally accepted to the Code4Lib Journal. The editorial committee is interested in your proposal, and would like to see a draft. I have to apologize however, since through an oversight of my own this email should have been sent almost a month ago, and was not (more on this below). 
As a member of the Code4Lib Journal editorial committee, I will be your contact for this article, and will work with you to get it ready for publication. We hope to publish your article in issue 16 of the Journal, which is scheduled to appear Jan 30, 2012. Incidentally, this is good timing for your code4lib talk on the same topic! The official deadline for submission of a complete draft is Friday, December 2. But since I dropped the ball on getting this email out to you promptly I completely understand if you can't hit that date. Looking at the deadlines [1] for issue 16 I can see that the 2nd draft is due Dec 30th, which is perhaps a more realistic goal for a draft. Please send whatever you have as soon as you can and we can get started. Upon receipt of the draft, I will work with you to address any changes recommended by the Editorial Committee. More information about our author guidelines may be found at http://journal.code4lib.org/article-guidelines. Please note that final drafts must be approved by a vote of the Editorial Committee before being published. We also require all authors to agree to US CC-BY licensing for the articles we publish in the journal. We recommend that any included code also have some type of code-specific open source license (such as the GPL). We look forward to seeing a complete draft and hope to include it in the Journal. Thank you for submitting to us, and feel free to contact me directly with any questions. If you could drop me a line acknowledging receipt of this email, that would be great. //Ed [1] http://wiki.code4lib.org/index.php/Code4Lib_Journal_Deadlines
[CODE4LIB] vivosearchlight
On Tue, Nov 1, 2011 at 7:44 AM, John Fereira ja...@cornell.edu wrote: If you want to see what node.js can do to implement a search mechanism take a look at something one of my colleagues developed. http://vivosearchlight.org It installs a bookmarklet in your browser (takes about 5 seconds) that will initiate a search against a solr index that contains user profile information from several institutions using VIVO (a semantic web application). From any web page, clicking on the Vivo Searchlight button in your browser will initiate a search and find experts with expertise relevant to the content of the page. Highlight some text on the page and it will re-execute a search with just those words. Thanks for sharing John. That's really a neat idea, even if the results don't seem particularly relevant for some tests I tried. I was curious how it does the matching of page text against the profiles. I see from the description at http://vivosearchlight.org that ElasticSearch is being used instead of Solr. Any chance Miles Worthington (ok I googled) would be willing to share the source code on his github account [1], or elsewhere? //Ed [1] https://github.com/milesworthington
Re: [CODE4LIB] Life and Literature Code Challenge
On Wed, Aug 31, 2011 at 3:38 PM, John Mignault j...@mignault.net wrote: Through local and global digitization efforts, BHL has digitized over 32 million pages of taxonomic literature, representing over 45,000 titles and 87,000 volumes (January 2011). The entire -corpus- dataset is freely available and accessible via many open methods. <shamelessSelfPromotion>Incidentally, there are 1,440 links from 952 Wikipedia articles to the BHL [1].</shamelessSelfPromotion> //Ed [1] http://linkypedia.inkdroid.org/websites/34/
Re: [CODE4LIB] ruby-zoom port to 1.9.2
Brice, Do you have a rubyforge account/email that I can use when requesting that you are added as an admin? I can't seem to get the `gem owner` command to respect my authoritay... //Ed On Tue, Aug 30, 2011 at 6:12 PM, Jonathan Rochkind rochk...@jhu.edu wrote: If you're unable to contact the original authors, you can contact the folks who maintain rubygems.org, and ask them to give you the rights to release a new version of the gem from (and pointing to) your repo, and effectively take over the gem. Alternately, in your fork you link to, you should update the instructions to make it clear that to install this fork, gem install isn't going to do it! -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Brice Stacey Sent: Thursday, August 25, 2011 1:25 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] ruby-zoom port to 1.9.2 FYI - I've finished porting ruby-zoom to 1.9.2, including the extended services. Repo is here: https://github.com/bricestacey/ruby-zoom I reached out to the original authors and haven't gotten a response, so it looks like it might never be integrated into the original project. If anyone has any ideas on how I might get these changes into it, please let me know. Brice Stacey From: Brice Stacey Sent: Tuesday, August 09, 2011 11:08 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: ruby-zoom port to 1.9.2 Hi - I'd just like to let everyone know I did some work yesterday on ruby-zoom to port most of the code to 1.9.2. All of the standard z39.50 features are ported. The only feature left is the packages, which allow for the extended services. I'd appreciate anyone that uses it to provide feedback. The git repository can be found here: https://github.com/bricestacey/ruby-zoom

Installation:
- Install YAZ
- Clone the repo
- Run: rake clean build package
- gem install pkg/zoom-0.4.1

If anyone has experience working with C and/or YAZ and would like to help finish this port off, I'd greatly appreciate it.
Otherwise, I'll probably just drop the support entirely from my fork (since I won't need it going forward anyway). I've also contacted the original authors, hopefully they follow-up. Brice Stacey Digital Library Services University of Massachusetts Boston brice.sta...@umb.edu 617-287-5921
Re: [CODE4LIB] ruby-zoom port to 1.9.2
I opened a ticket with rubygems folks: http://help.rubygems.org/discussions/problems/720-ruby-zoom-ownership Maybe that will help get you the ability to maintain this module. Thanks jrochkind for the help in #code4lib channel... //Ed
Re: [CODE4LIB] OPDS 1.1 review period
edsu-- Except that it was from a month ago and the review period is over. Oh well, I guess the v1.1 of opds might be of interest still ... it is to me at least. /me slowly inches towards the door //Ed On Thu, Jul 28, 2011 at 4:04 PM, Ed Summers e...@pobox.com wrote: This might be of potential interest to code4lib folks who deal w/ ebooks ... //Ed -- Forwarded message -- From: Hadrien Gardeur hadrien.gard...@feedbooks.com Date: Sun, Jun 19, 2011 at 12:47 PM Subject: OPDS 1.1 review period To: atom-syn...@imc.org Hello, The OPDS community just posted the final draft for OPDS 1.1: http://opds-spec.org/2011/06/15/opds-1-1-call-for-comments/ During this two week review period, we're actively looking for any kind of feedback about the spec. OPDS is based on Atom and is widely used by book retailers & libraries to distribute electronic publications on any device. Some of the most popular ebook applications on iOS (Stanza, Bluefire Reader) and on Android (Aldiko, FBReader) are compatible with this standard. Hadrien
Re: [CODE4LIB] RDF for opening times/hours?
On Wed, Jun 8, 2011 at 10:00 AM, Simon Spero s...@unc.edu wrote: [cue edsu ] And people wonder why Google/Yahoo/Bing chose to favor html5 microdata on schema.org :-) //Ed
Re: [CODE4LIB] wikipedia/author disambiguation
On Tue, May 31, 2011 at 11:55 AM, Jonathan Rochkind rochk...@jhu.edu wrote: The LCCN one does not work. Tries to take me to: http://errol.oclc.org/laf/n79021614.html Which results in an HTTP 500 error from the OCLC server. Since this template apparently generates a URL to an OCLC service (rather than LC? I guess maybe LC itself doesn't have the right permalinks?), I think that OCLC probably ought to fix this. If the template is not creating the right URL, I guess you've got to work with wikipedia to fix it. Or fix your end to accept those URLs properly. As far as I know there aren't any permalinks for name authority records at loc.gov that use the LCCN. I've heard informally from some folks at OCLC that they plan to redirect these links to a URL at loc.gov if/when the name authority records are available from there. But I have no idea when that will happen unfortunately. //Ed
Re: [CODE4LIB] wikipedia/author disambiguation
a bit of a Freudian slip there I suppose :-) s/could/couldn't/ //Ed On Tue, May 31, 2011 at 3:17 PM, Ed Summers e...@pobox.com wrote: On Tue, May 31, 2011 at 12:48 PM, Thomas Berger t...@gymel.com wrote: Currently about 150.000 articles on wikipedia.de carry the associated PND number, many of them also LoC-NA and VIAF numbers: Makes me wonder if we could use inter-wiki links to automatically update some of the en.wikipedia articles based on the viaf links in de.wikipedia. Could hurt to see how many there are I suppose. //Ed
Re: [CODE4LIB] wikipedia/author disambiguation
On Tue, May 31, 2011 at 12:48 PM, Thomas Berger t...@gymel.com wrote: Currently about 150.000 articles on wikipedia.de carry the associated PND number, many of them also LoC-NA and VIAF numbers: Makes me wonder if we could use inter-wiki links to automatically update some of the en.wikipedia articles based on the viaf links in de.wikipedia. Could hurt to see how many there are I suppose. //Ed
Re: [CODE4LIB] Adding VIAF links to Wikipedia
On Thu, May 26, 2011 at 2:01 PM, Ralph LeVan ralphle...@gmail.com wrote: OCLC Research would desperately love to add VIAF links to Wikipedia articles, but it seems to be very difficult. The OpenLibrary folks tried to do it a while back and ended up getting their plans severely curtailed. The discussion at Wikipedia is captured here: http://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/OpenlibraryBot Ralph if you read that entire discussion it sounds like the bot was approved. Am I missing something? //Ed
Re: [CODE4LIB] wikipedia/author disambiguation
It's the server unfortunately. I think OCLC is trying to figure out what to do with errol ... there's a thread on the wc-devnet-l if you are interested: http://listserv.oclc.org/scripts/wa.exe?A2=ind1105d&L=wc-devnet-l&T=0&F=P&X=4D30895CB90D4C912F&P=73 //Ed On Thu, May 26, 2011 at 5:15 PM, Graham Seaman gra...@theseamans.net wrote: The lccn links from the template have been giving a java exception for the last few days at least: does the template or the server need fixing?
Re: [CODE4LIB] wikipedia/author disambiguation
The user profile pages that reference the website should eventually (1 or 2 days) turn up under the Users tab, e.g. http://linkypedia.inkdroid.org/websites/23/users/ I don't see you there yet though :-) //Ed On Wed, May 25, 2011 at 5:03 PM, Karen Coyle li...@kcoyle.net wrote: Hi, Ed. Do you pick up user pages or just wikipedia entry pages? (I added mine to my user page, just for fun.) kc
Re: [CODE4LIB] wikipedia/author disambiguation
Big +1 for promoting the use of the Authority Control Wikipedia template. I know I'm being a bit of a broken record, but you can watch as people add these by looking at or subscribing to: http://linkypedia.inkdroid.org/websites/23/pages/ Also, re: Jonathan's good advice to check out Wikipedia Miner [1] I just ran across Duke [2] today, which looks like it could help guide record linking a bit. Duke is a fast and flexible deduplication (or entity resolution, or record linkage) engine written in Java on top of Lucene. At the moment (2011-04-07) it can process 1,000,000 records in 11 minutes on a standard laptop in a single thread. Haven't tried it yet, so YMMV, etc. //Ed [1] http://wikipedia-miner.sourceforge.net/ [2] http://code.google.com/p/duke/
Re: [CODE4LIB] linking catalog records to IMDB
On Wed, Apr 27, 2011 at 12:14 PM, Roy Tennant roytenn...@gmail.com wrote: For what it's worth, I see over 7,000 links to IMDB from WorldCat records. Sounds like a good excuse to use <yourFavoriteProgrammingLanguage/> to rip through the 20k DVD records, look them up via the WorldCat API, see if there's an IMDB URL, and add it back into the record if you find one. Oh, and report back here with what you find :-) //Ed
Re: [CODE4LIB] What do you wish you had time to learn?
Fun question, my list:
- data mining (the algorithms, the tools, etc)
- go (the programming language)
- hadoop
Not necessarily inter-related mind you :-) //Ed On Tue, Apr 26, 2011 at 8:30 AM, Edward Iglesias edwardigles...@gmail.com wrote: Hello All, I am doing a presentation at RILA (Rhode Island Library Association) on changing skill sets for Systems Librarians. I did a formal survey a while back (if you participated, thank you) but this stuff changes so quickly I thought I would ask this another way. What do you wish you had time to learn? My list includes:
- CouchDB (NoSQL in general)
- neo4j
- nodejs
- prototype
- API Mashups
- R
Don't be afraid to include Latin or Greek History. I'm just going for a snapshot of System angst at not knowing everything. Thanks, ~ Edward Iglesias Systems Librarian Central Connecticut State University
Re: [CODE4LIB] Planned changes to the VIAF RDF
On Tue, Apr 12, 2011 at 1:49 PM, Young,Jeff (OR) jyo...@oclc.org wrote: The only VIAF contributors we're aware of today that publish their own authority Linked Data are Deutsche Nationalbibliothek, National Library of Sweden, and the National Széchényi Library (Hungary). Let's hope the trend continues :-) //Ed
Re: [CODE4LIB] FW: VIAF linked data and non-Latin searching
Nice, Jeff. I really like the simplified VIAF RDF. In particular I like how you've modeled the deprecation of resources. Are you planning to use a 301, e.g. http://viaf.org/viaf/77390479/ -> http://viaf.org/viaf/77390479 ? //Ed On Mon, Apr 11, 2011 at 3:18 PM, Young,Jeff (OR) jyo...@oclc.org wrote: Here is some information about pending updates to the VIAF Linked Data. I'm working on before/after diagrams to better explain the differences and will share them soon. Questions and comments are welcome. Jeff From: Hickey,Thom Sent: Monday, April 11, 2011 12:57 PM To: v...@listserv.log.gov Cc: Young,Jeff (OR) Subject: VIAF linked data and non-Latin searching Non-Latin searching: We believe we have resolved a recurring issue with non-Latin searching failing (it had to do with restarting VIAF in different environments). If anyone still has issues with this, please let us know. Linked Data: We have taken another look at the RDF generated for linked data. The attached files show a personal, corporate and geographic (there are few pure geographic records in VIAF as of yet, but a mixed record such as Missouri's may be identified as geographic) record rendered in RDF. We think the new records are both simpler and easier to understand and use. The biggest difference is that we have eliminated the viaf:NameAuthorityCluster that acted as a record hub. Formerly, this record hub was responsible for linking to the separately identified primary entity. In the new record structure, contributed authorities bypass this record hub and link directly to the primary entity themselves. The description of the primary entity appears first in the record inside an rdf:Description element followed by skos:Concept entries, one for each source file, each of which links back to the primary entity via foaf:focus. We have included some deprecated identifiers matching those used in previous RDF, which may help those processing it as linked data.
For those simply parsing it as XML and pulling information out of it, we have switched to fully qualified URIs which should make that easier. We will probably phase the new RDF in over the next two months. This month we will generate both for those getting full dumps of VIAF, then next month switch both the online and offline versions to the new format. For those with suggestions about the new format, this would be an ideal time to let us know. If we stay with the schedule outlined above we have until mid to late May before the new formats are in production. --Th
Re: [CODE4LIB] Documentation request for the marc gem
Hi Tony, Just in case it wasn't obvious, the source code is on GitHub [1]. As Ross said, please consider forking it and sending a pull request for any documentation improvements you want to do. //Ed [1] https://github.com/ruby-marc/ruby-marc On Tue, Mar 15, 2011 at 3:18 PM, Ross Singer rossfsin...@gmail.com wrote: Hi Tony, I'm glad that ruby-marc appears to be generally useful. Another (even simpler) way to do what you want is:

record.to_marc

Which, I think, would do the same thing you're doing with MARC::Writer.encode. If you want to write up a block of text to plop into the README, feel free to send me some copy (wholesale edits also welcome). Thanks, -Ross. On Tue, Mar 15, 2011 at 2:40 PM, Tony Zanella tony.zane...@gmail.com wrote: Hello all, If I may suggest adding to the documentation for the marc gem (http://marc.rubyforge.org/)... Currently, the documentation gives examples for how to read, create and write MARC records. The source code also includes an encode method in MARC::Writer, which came in handy for me when I needed to send an encoded record off to be archived on the fly, without writing it to the filesystem. That method isn't in the documentation, but it would be nice to see there! It could be as simple as:

# encoding a record
MARC::Writer.encode(record)

Thanks for your consideration! Tony
Re: [CODE4LIB] dealing with Summon
On Wed, Mar 2, 2011 at 11:38 AM, Godmar Back god...@gmail.com wrote: Like I said at the beginning of this thread, this is only tangentially a Code4Lib issue, and certainly the details aren't. But perhaps the general problem is (?) More than anything this seems like a documentation issue. From my seat in the peanut gallery it seems like Godmar should be able to answer these sorts of questions by looking at the Summon Search API Documentation [1] for responses (which is quite nice btw). Oh, and I think it's great to see this thread on code4lib, where other people have been known to create an API or three. So thanks Godmar, for asking here... //Ed [1] http://api.summon.serialssolutions.com/help/api/search/response
Re: [CODE4LIB] Low Cost Digitization of Manuscript Collections
Hi Jody, Thanks for sending along this information about Cabaniss. I'd be curious to hear how your per-page costs compare with other projects, such as Oregon State [1] (which I just wandered across in Google). The notes from your project wiki [2] are really interesting. In particular the details about linking from the EAD documents to the item views using the PURLs struck my eye [3]. Did you have a PURL server already set up at your institution, or is this something you did as part of this project? Was there a real advantage to doing that instead of thoughtfully managing a URL namespace with Cool URLs [4]? I know I'm biased, but it sure was nice to see URLs in use instead of Handles :-) I haven't done EAD work in a while, and was wondering what the ns2 namespace is in the linking example on the wiki, e.g.

<dao id="u0003_252_002" ns2:title="u0003_252_002" ns2:href="http://purl.lib.ua.edu/148" ns2:actuate="onRequest" ns2:show="new"/>

Last of all I was curious about the EAD viewing software you are developing to stand in for Acumen. Is this work still underway? Sorry for all the questions. I guess that's what you get for doing interesting stuff :-) //Ed [1] http://wiki.library.oregonstate.edu/confluence/pages/viewpage.action?pageId=19327 [2] http://www.lib.ua.edu/wiki/digcoll/ [3] http://www.lib.ua.edu/wiki/digcoll/index.php/Scripted_Links_in_EADs [4] http://www.w3.org/Provider/Style/URI.html On Tue, Mar 1, 2011 at 9:03 PM, Jody DeRidder j...@jodyderidder.com wrote: (Apologies for cross posting) For Immediate Release Contact Person: Jody L. DeRidder Email: jlderid...@ua.edu Phone: (205) 348-0511 Completed UA Libraries Grant Project Provides Model for Low-Cost Digitization of Cultural Heritage Materials The University of Alabama Libraries has completed a grant project which demonstrates a model of low-cost digitization and web delivery of manuscript materials.
Funded by the National Archives and Records Administration (NARA) National Historical Publications and Records Commission (NHPRC), the project digitized a large and nationally important manuscript collection related to the emancipation of slaves: the Septimus D. Cabaniss Papers. This digitization grant (NAR10-RD-10033-10) extended for 14 months (ended February 2011), and has provided online access to 46,663 images for less than $1.50 per page: http://acumen.lib.ua.edu/u0003_252. The model is designed to enable institutions to mass-digitize manuscript collections at a minimal cost, leveraging the extensive series descriptions already available in the collection finding aid to provide search and retrieval. Digitized content for the collection is linked from the finding aid, providing online access to 31.8 linear feet of valuable archival material that otherwise would never be web-available. We have developed software and workflows to support the process and web delivery of material regardless of the current method of finding aid access. More information is available on the grant website: http://www.lib.ua.edu/libraries/hoole/cabaniss . The Septimus D. Cabaniss Collection (1815-1889) was selected as exemplary of the legal difficulties encountered in efforts to emancipate slaves in the Deep South. Cabaniss was a prominent southern attorney who served as executor for the estate of the wealthy Samuel Townsend, who sought to manumit and leave property to a selection of his slaves, many of whom were his children. Samuel Townsend’s open admission to fathering slave children and his willingness to take responsibility for their care, combined with the letters from the former slaves themselves, dated before and after the Civil War, will inform social and racial historians. Legal scholars will be enlightened by Cabaniss' detailing of the sophisticated legal mechanism of using a trust to free slaves. 
Valuable collections such as this have a promise of open access via the web when the cost of digitization is lowered by avoiding item-level description. Usability testing was included in the grant project, and preliminary results indicate that this method of web delivery is as learnable for novices as access to the digitized materials via item-level descriptions. In addition, provision of web delivery of manuscript content via the finding aid provides the much-needed context preferred by experienced researchers. Jody DeRidder Digital Services University of Alabama Libraries Tuscaloosa, Alabama 35487 (205) 348-0511 j...@jodyderidder.com jlderid...@ua.edu
Re: [CODE4LIB] graphML of a social network in archival context
Hi Brian, It is *awesome* to see the SNAC data being released with an open license--and it's also really interesting to see the code for loading it into neo4j. How have you been liking neo4j so far? Is the neo4j graph database something that you have been using in SNAC? Have you been interacting with it mainly via gremlin, the REST API, and/or Java? Just as an aside, I noticed that there are 66 edges that lack labels, and 8332 'associateWith' labels that probably should be 'associatedWith'? I'm also kind of curious to hear more about what 'associatedWith' means, is that something from EAC? I noticed that it can connect people, corporate bodies and families. ed@curry:~/Datasets/eac/eac-graph-load-data-2011-02$ grep edge graph-snac-example.xml | perl -ne '/label="(.+?)"/; print "$1\n";' | sort | uniq -c | sort -n 66 8332 associateWith 99907 correspondedWith 382855 associatedWith Thanks for sending this update! Sorry for all the questions, but this is cool stuff. //Ed On Thu, Feb 17, 2011 at 8:37 PM, Brian Tingle brian.tingle.cdlib@gmail.com wrote: Hi, As a part of our work on the Social Networks and Archival Context Project [1], the SNAC team is pleased to release more early results of our ongoing research. A property graph [2] of correspondedWith and associatedWith relationships between corporate, personal, and family identities is made available under the Open Data Commons Attribution License [3] in the form of a graphML file [4]. The graph expresses 245,367 relationships between 124,152 named entities. The graphML file, as well as the scripts to create and load a graph database from EAC or graphML, are available on Google Code [5]. We are still researching how to map from the property graph model to RDF, but this graph processing stack will likely power the interactive visualization of the historical social networks we are developing. Please let us know if you have any feedback about the graph, how it is licensed, or if you create something cool with the data. 
-- Brian [1] http://socialarchive.iath.virginia.edu/ [2] http://engineering.attinteractive.com/2010/12/a-graph-processing-stack/ [3] http://www.opendatacommons.org/licenses/by/ [4] http://graphml.graphdrawing.org/ [5] http://code.google.com/p/eac-graph-load/downloads/detail?name=eac-graph-load-data-2011-02.tar Research funded by the National Endowment for the Humanities http://www.neh.gov/
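Ed's grep/perl tally above could also be sketched in Python with a streaming parse, which avoids loading the whole ~490k-edge file at once. This is a minimal sketch, assuming (as the grep output suggests) that each <edge> element in the graphML carries its label as a plain `label` attribute; the filename `graph-snac-example.xml` is the one from the thread.

```python
# Tally edge labels in a graphML file without loading it all into memory.
# Assumes edges look like: <edge source="n1" target="n2" label="correspondedWith"/>
from collections import Counter
import xml.etree.ElementTree as ET

def edge_label_counts(path):
    counts = Counter()
    # iterparse yields elements as their end tags are seen, so memory
    # stays flat even for a few hundred thousand edges.
    for _, elem in ET.iterparse(path):
        if elem.tag.endswith('edge'):  # tag may be namespace-qualified
            counts[elem.get('label', '')] += 1
            elem.clear()
    return counts

# e.g.:
# for label, n in edge_label_counts('graph-snac-example.xml').most_common():
#     print(n, label)
```

Run against the released file, this should reproduce the `uniq -c` counts above, including the 66 label-less edges (counted under the empty string).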
[CODE4LIB] livefeed /about
I just wanted to also say thanks for the livestream from code4lib Bloomington. The stream, IRC and twitter in combination were *extremely* useful from afar. I missed out on the craft beers, but at least I got to see them [1], and there's always next year :-) I don't know if the bar has been set, but I think amplifying the conference this way could be a really good option for scaling the conference without requiring the number of actual participants (and the size of the venue) to increase. It also helps those who can't pay for travel and lodging when travel budgets are on the wane. Somewhat unrelatedly, I've seen some discussion about the place for galleries, libraries and museums in the code4lib community [2]. Personally (despite its name) I've always thought of code4lib as being about more than just code and libraries. I also noticed that http://code4lib.org didn't have an about page. So I added one [3]. Please help edit it into shape if you care about this sorta thing. //Ed [1] http://twitpic.com/3y0zw5 [2] http://twitter.com/#!/wragge/statuses/35926310920396800 [3] http://code4lib.org/about
Re: [CODE4LIB] asist2010 meetup?
Whoops, that was bus 61B, not 61D. //Ed 15:23 edsu @quote get 3 15:23 zoia edsu: Quote #3: edsu, your source for bad advice since, well, forever! (added by edsu at 09:46 PM, September 06, 2005) On Tue, Oct 26, 2010 at 10:46 AM, Ed Summers e...@pobox.com wrote: Kind of last minute and random, but if you are at ASIST in Pittsburgh and want to get out of downtown for some pizza at Aiello's in Squirrel Hill, please join Raymond Yee and me there at 7pm. http://www.aiellospizza.com/ It looks like a simple ride on the 61D bus: http://bit.ly/hilton-to-aiellos And Raymond may be able to drive some folks back if they don't want to taxi or bus back. //Ed
[CODE4LIB] asist2010 meetup?
Kind of last minute and random, but if you are at ASIST in Pittsburgh and want to get out of downtown for some pizza at Aiello's in Squirrel Hill, please join Raymond Yee and me there at 7pm. http://www.aiellospizza.com/ It looks like a simple ride on the 61D bus: http://bit.ly/hilton-to-aiellos And Raymond may be able to drive some folks back if they don't want to taxi or bus back. //Ed