Re: [CODE4LIB] code4lib journal
I suppose that I shall have to write an article for the Code4Lib journal entitled "Dealing Gracefully with Idiotic Journal Names". Our software has exception code for "THE Journal", but it still is a problem. At 7:50 PM -0700 5/3/06, Roy Tennant wrote: Eric, surely you must realize it was a vote. Once it was put to a vote, under specific rules, if the rules are followed (and they were, as far as I know), then we're stuck with it, for good or evil. Unfortunately, I think poor Jeff Davis was quite against that choice. Roy On May 3, 2006, at 7:27 PM, Eric Hellman wrote: Here's the latest on the code4lib journal: "/lib/dev: A Journal for Library Programmers" won the journal name vote. (See http://www.code4lib.org/node/96 for more details.) The idea of a journal name that contains punctuation in the title is so breathtakingly idiotic that I can only assume that it is a reference to the bug in the name of the computer language C++. -- Eric Hellman, Director, OCLC Openly Informatics Division [EMAIL PROTECTED] 2 Broad St., Suite 208, Bloomfield, NJ 07003 tel 1-973-509-7800 fax 1-734-468-6216 http://www.openly.com/1cate/ 1 Click Access To Everything
[CODE4LIB] Link Evaluator Firefox Extension
OCLC Openly Informatics has just released some free, open-source software that we developed to help libraries deal with one aspect of the eResource fulfillment problem: how does a library easily and quickly determine which journals and other electronic resources have disappeared because of subscription snafus? It's a free add-on for the Firefox web browser that acts as an advanced link checker. It's a spin-off of the link-checking technology that we have been developing as part of the machinery we use to maintain the linking knowledgebases we produce for 1Cate, Worldcat and offerings from other companies. In addition to basic link checking, Link Evaluator can check for the presence of green-flag and red-flag phrases in linked-to resources. By working inside the Firefox browser, Link Evaluator can exactly duplicate a user's experience and detect authentication problems. The green- and red-flag phrases can be supplied by editing preferences, or by inserting them into an HTML page of links to evaluate. Link Evaluator is also multi-threaded, so it works up to 4 times faster than other link-checking plugins. The developer behind this work is Filip Babalievsky; you can contact either of us with questions or kudos; there's also an email list that we've set up. For more information and downloads, please see http://openly.oclc.org/linkevaluator/ and http://openly.oclc.org/pr/15012007.html Eric
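The flag-phrase idea is simple enough to sketch. What follows is a hypothetical reimplementation in Python, not Link Evaluator's actual code, and the phrase lists are invented examples:

```python
# Hypothetical sketch of a green-/red-flag phrase check, not the shipped
# Link Evaluator code. The phrase lists below are invented examples; in the
# real extension they come from preferences or the page of links itself.
GREEN_FLAGS = ["Full Text", "Download PDF"]             # access looks good
RED_FLAGS = ["login required", "subscription expired"]  # access problem

def evaluate_page(page_text):
    """Classify a fetched page as 'red', 'green', or 'unknown'."""
    text = page_text.lower()
    if any(p.lower() in text for p in RED_FLAGS):
        return "red"
    if any(p.lower() in text for p in GREEN_FLAGS):
        return "green"
    return "unknown"
```

Red flags are checked first so that a page advertising full text behind a login wall is still flagged as a problem.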
[CODE4LIB] OCLC xISBN service is moving - February 13
February 13, not 14. I must have valentines on my mind!
[CODE4LIB] xISBN moved
The xISBN service has now moved out of OCLC Research and is now being supported by the Openly Informatics Division of OCLC. For more information about the xISBN service, please visit http://www.worldcat.org/affiliate/webservices/xisbn/app.jsp There was a delay in the transfer caused by the snowstorm in Ohio. We have one problem reported: a poorly documented feature (to redirect to OPACs) was not implemented in the replacement service; we expect this to be fixed shortly. If you want updates on this issue, please subscribe to the xidentifier-l mailing list. In looking at the usage patterns, we've noticed that a very small number of users are trying to just suck up all the xISBN data using the web service. While this may have been reasonable when the data was static, it is not appropriate now that the data is being frequently updated. We are now able to supply the COMPLETE xISBN data file every month to customers who need this for their application; if this applies to you, please contact us at [EMAIL PROTECTED]. One of the national libraries is already working with us in this way. Another usage pattern seems to be pulling data for the contents of a library catalog; if you are interested in a batch-update facility, please contact us. Some uses appear to be very commercial. If you have a commercial application not allowed by the existing terms and conditions for free use, please contact us at [EMAIL PROTECTED] for a commercial agreement. The new web service is rated at 10 million requests per day, so don't be afraid to send traffic to xISBN. If you expect to use more than 1,000 requests per day, you must contact us. The existing XML response format does not allow us to indicate in the response whether an application is over the limit or not without breaking it, and we've chosen not to break anyone's application.
Eric
Re: [CODE4LIB] OpenURL validation services
I would recommend that you send this query to the OpenURL listserv [EMAIL PROTECTED]. At one point there was something at Caltech that did this; I'm not sure if it made it to a final release. Hi-- Is there any existing code that can validate the descriptive metadata of an OpenURL ContextObject? For example, http://www.openurl.info/registry/docs/mtx/info:ofi/fmt:kev:mtx:journal states that auinit1 can have zero or one value and it must be the first author's first initial. Is there something into which I can input an OpenURL to see whether indeed the auinit1 param value is only one character (either A-Z or a-z) and has no more than one occurrence... plus all the other constraints for the other parameters in the matrix on http://www.openurl.info/registry/docs/mtx/info:ofi/fmt:kev:mtx:journal ? --SET
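The auinit1 constraints quoted above (zero or one occurrence, value a single letter) are easy to check mechanically. A minimal sketch, assuming the KEV key appears as rft.auinit1 in a journal ContextObject query string; a real validator would cover every field in the registry matrix:

```python
import re
from urllib.parse import parse_qs

# Sketch of checking just the auinit1 constraint from the journal KEV
# matrix: at most one occurrence, value exactly one letter A-Z or a-z.
# Assumes the key is "rft.auinit1" as used in KEV context objects.
def validate_auinit1(openurl_query):
    """Return True if rft.auinit1 in the query string satisfies the matrix."""
    params = parse_qs(openurl_query, keep_blank_values=True)
    values = params.get("rft.auinit1", [])
    if len(values) > 1:                              # zero or one occurrence
        return False
    if values and not re.fullmatch(r"[A-Za-z]", values[0]):
        return False                                 # must be a single letter
    return True
```

A full validator would just be a table of such per-key rules (repeatability, pattern, controlled vocabulary) applied the same way.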
Re: [CODE4LIB] E-Resource Access Management Services
I posted some comments in Web4Lib on this, but on code4lib, I'd like to be a bit more provocative and get y'all bothered. * What online resources would you collect? - A collection of online resources is an oxymoron. * How would you connect people to these new collections? - Why do you think it is you that will do the connecting? * How will you control and manage these services? - If you want to be in control of these resources, take the blue pill. * How will you provide your users with the most correct information possible? - Take the red pill. Eric We've talked a lot about this at OCLC; it should be noted that credit for starting this discussion goes to our colleagues at SerialsSolutions. Tim McCormick here at OCLC Openly likes to ask the question "What is [this interesting-sounding concept] in opposition to?" I think that the idea that the ERAMS concept is in opposition to is what our colleagues at ExLibris are calling URM, or Universal Resource Management. To accentuate the differences: ERAMS: electronic resources need an entirely new/different management infrastructure. URM: libraries need a single management infrastructure for all their resources. Of course there are important truths on both sides of this argument, but you can see why Serials Solutions and Ex Libris are arguing the sides they have chosen. ERAMS E-Resource Access Management Services http://www.erams.org/ We are looking for the first 50 participants who are willing to visualize a library not focused solely on print resource management and willing to go out on a limb and conceptualize the library which is focused on user access and management of online resource services. Four questions we will be brainstorming about today, to try to develop our future scenario, are: * What online resources would you collect? * How would you connect people to these new collections? * How will you control and manage these services? * How will you provide your users with the most correct information possible?
Please join Jill Emery, Bonnie Tijerina, and Elizabeth Winter to learn more about the ERAMS concept and the future possibility this concept holds for libraries. Where: Marriott Inner Harbor at Camden Yards, Chesapeake Room When: Saturday, March 31, 2007 Time: 2:00 - 4:00 PM Light refreshments will be available. Please RSVP to Jill Emery at [EMAIL PROTECTED] by March 29, 2007. To learn more and participate in future discussions, please visit http://www.erams.org/
[CODE4LIB] more metadata from xISBN
The API for xISBN that Xiaoming Liu previewed at the Code4Lib meeting is now officially launched and supported, and provisions for commercial use are now in place. For those of you who missed it, in addition to related ISBNs, xISBN will now also return metadata such as title, edition, language and publication year that can be used to distinguish manifestations of a work. xISBN supports a RESTful API, as well as OpenURL and unAPI, and can return results in a variety of formats. xISBN is free for non-commercial, low-volume use. API details are at http://xisbn.worldcat.org/xisbnadmin/doc/api.htm With your support, we will continue to develop this service along with other Worldcat-based machine-to-machine services. Eric
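As an illustration of consuming such a response, here is a sketch that parses an xISBN-style XML reply into (ISBN, title, year) tuples. The element and attribute names here are assumptions for illustration, not the documented schema; consult the API docs linked above for the real response formats:

```python
import xml.etree.ElementTree as ET

# Illustrative only: the element/attribute names are assumptions, not the
# documented xISBN schema (see the API docs linked in the post above).
SAMPLE = """<rsp stat="ok">
  <isbn title="Flatland" year="1992" lang="eng" ed="repr.">0486272630</isbn>
  <isbn title="Flatland" year="1998" lang="eng">048627263X</isbn>
</rsp>"""

def related_isbns(xml_text):
    """Return (isbn, title, year) tuples from an xISBN-style XML response."""
    root = ET.fromstring(xml_text)
    return [(e.text, e.get("title"), e.get("year"))
            for e in root.findall("isbn")]
```

The per-ISBN metadata is what lets a client distinguish manifestations of a work, as described above, instead of treating the related ISBNs as interchangeable.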
Re: [CODE4LIB] more metadata from xISBN
As long as LibX is free and not being used as a way to drive Amazon revenue, I don't see how it could be considered to be commercial. We've studied our logs pretty carefully. Most of the sites that have exceeded the limit we set were commercial sites doing bulk harvest. You can track the xISBN use by LibX by getting an affiliate id. Eric At 2:32 PM -0400 5/9/07, Godmar Back wrote: Interesting. Thom Hickey commented a while ago about LibX's use of xISBN (*): "I suspect that eventually the LibX xISBN support will become both less visible and more automatic." We were indeed planning on making it more automatic. For instance, a user visiting a vendor's page such as Amazon might be presented with options from their library catalog, based on related ISBNs found via xISBN. Would that qualify as noncommercial use? For instance, if LibX with this feature were installed on a public library machine, 500 requests per day might easily be exceeded. Matters would be even worse if multiple library machines were to share an IP because they are hidden behind a NAT device or proxy. - Godmar (*) http://outgoing.typepad.com/outgoing/2006/05/libx_and_xisbn.html
Re: [CODE4LIB] more metadata from xISBN
At 4:41 PM -0400 5/9/07, Godmar Back wrote: On 5/9/07, Eric Hellman [EMAIL PROTECTED] wrote: We've studied our logs pretty carefully. Most of the sites that have exceeded the limit we set were commercial sites doing bulk harvest. You can track the xISBN use by LibX by getting an affiliate id. LibX is a client-side tool. We're not a user of xISBN; we provide clients who have installed it the option to use xISBN. I know, and I had to explain that to the legal department! Also, keep in mind that an important reason to use OCLC's xISBN service - rather than using an alternate service or using the data directly - is Jeff Young's OAI bookmark service, specifically the know-how he's put into searching multiple catalogs and his keeping a database of which library uses which catalog. That, as I understand, is still not part of the officially supported xISBN, though. We will improve on that service...
[CODE4LIB] js calling js hack (was: A new generation of OPAC enhancements)
IIRC, this hack doesn't work in older versions of IE unless you remove the type=text/javascript attribute. (See http://openly.oclc.org/jake/instant.html ) This is one of the few examples of a choice you have to make between having something work and having your page pass strict validation. Eric At 2:59 PM -0400 5/14/07, Jonathan Rochkind wrote: For what it's worth, I've used that same weird SCRIPT hack to insert dynamically generated code onto my OPAC screen for other purposes too. It was initially suggested to me by Dave Pattern. It's a useful hack. Jonathan Altay Guvench wrote: Hi Godmar, Tim asked me to join the list and discussion on the LibraryThing widgets. You're right that, with Ajax, we're bound by the same-origin restriction. But we can dynamically change the page content after loading, by eschewing traditional Ajax. New content is delivered through dynamically-inserted script tags. For example, you can set an onclick that adds a tag like this to the head: <script src="http://www.librarything.com/get_content.php?tag=foo" type="text/javascript"></script> Server-side, get_content.php generates the response on the fly, e.g. echo "document.getElementById('tagbrowser').innerHTML = 'books tagged foo'". As long as the response header in get_content is set to javascript, the browser should interpret it correctly. As for the hardwired DOM finagling you saw in Danbury's OPAC, in most cases, the table[3] stuff isn't necessary. Typically, a library will simply edit their OPAC's HTML template to include empty widget divs (e.g. <div id="ltfl_tagbrowse" class="ltfl"></div>) wherever they'd like the widgets. Then a single script tag finds those divs and inserts the contents onload. However, there were some permissions issues with the Danbury OPAC that didn't allow for this. (They could only edit the OPAC footer.) The workaround was to dynamically insert the LTFL divs using custom javascript in the footer. That said, like I mentioned, this isn't necessary in most cases.
We've tested it in a few systems, and generally speaking, our widgets are DOM-agnostic. Altay
Re: [CODE4LIB] good web service api
Eric, I'll address only the XML design for your first link, and I'll ask questions, not because I want to know the answer, but because the answers determine whether your XML response is good. Is this response supposed to work only for MyLibrary? Does version refer to MyLibrary or to the response format? Why is there an error element if there's not been an error? Why is there a message element if there's no message? Why is code PCDATA rather than a CDATA attribute? Are the facets always ordered? Is there a difference between a facet_name and a name? Is the name and note attached to the facet or contained by the facet? Why do facets have ids? Is id an ID? I also find that it helps to try to read things like this aloud in English. "MyLibrary, could I get all the facets you have?" "Hello, I'm MyLibrary version 0.001. I would like to report an error with code zero and no message. As for my facets, a facet, which I id as 2, is named as a facet Formats, and I should note as a facet it is The physical manifestation of information, and that's what I have to say about this facet. Another facet, which I id as 3, is named as a facet People, and I should note as a facet it is Human beings both real and fictional, and that's what I have to say about this facet. etc... And those are all the facets. Goodbye!" At 2:15 PM -0400 6/29/07, Eric Lease Morgan wrote: What are the characteristics of a good Web Service API? We here at Notre Dame finished writing the MyLibrary Perl API a long time ago. It does what it was designed to do, and we believe others could benefit from our experience. At the same time, the API is based on Perl and we realize not everybody likes (or knows) Perl. Some would say, "Why didn't you write it in X?" where X is their favorite programming language. Well, that is just not practical. I believe the solution to the dilemma is a Web Service API against MyLibrary similar to the Web Services API provided by Fedora.
Any script sends name/value pairs on a URL and gets back XML. This way any language can be used against MyLibrary (which we are now calling a digital library framework toolbox). Here are the only two working examples I have: http://dewey.library.nd.edu/mylibrary/ws/?obj=facet&cmd=getAll http://dewey.library.nd.edu/mylibrary/ws/?obj=facet&cmd=getOne&id=2 Try a few errors: http://dewey.library.nd.edu/mylibrary/ws/?obj=facet&cmd=getNone http://dewey.library.nd.edu/mylibrary/ws/?obj=facet&cmd=getOne&id=45 http://dewey.library.nd.edu/mylibrary/ws/?obj=facet&cmd=getOne&id=x I can create all sorts of commands like: * get all facets * get all terms * get all librarians * get all resources * get all resources classified with this term * create facet, term, librarian, or resource * edit facet, term, etc. * delete facet, term, etc. Given such an interface, library (MyLibrary) content can be used in all sorts of venues much more easily. While I don't expect anybody here to know what commands to support, I am more interested in the how. What are the characteristics of good name/value pairs? Should they be terse or verbose? To what extent should the name/value pairs require a version number or a stylesheet parameter? Besides being encoded in XML, what should the output look like? What are characteristics of good XML in this regard? Heck, then there is authorization: how do I keep people from deleting resources and such? This might be just an academic exercise, but with the advent of more and more Web Services computing I thought I might give a MyLibrary Web Services API a whirl. -- Eric Lease Morgan University Libraries of Notre Dame code4lib_fridays++
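To make the design questions concrete, here is one sketch of a response shaped along the lines of the critique earlier in the thread: no error element unless there actually is an error, and each facet's name and note contained by the facet. The element names are illustrative, not a proposed MyLibrary format:

```python
import xml.etree.ElementTree as ET

# Illustrative response builder: the error element appears only on error,
# and name/note are children of each facet. Element names are made up for
# this sketch, not taken from the actual MyLibrary web service.
def facets_response(facets, error=None):
    """Serialize facets as XML; facets is a list of (id, name, note)."""
    root = ET.Element("facets", version="1.0")
    if error is not None:                      # error: (code, message)
        err = ET.SubElement(root, "error", code=str(error[0]))
        err.text = error[1]
        return ET.tostring(root, encoding="unicode")
    for fid, name, note in facets:
        f = ET.SubElement(root, "facet", id=str(fid))
        ET.SubElement(f, "name").text = name
        ET.SubElement(f, "note").text = note
    return ET.tostring(root, encoding="unicode")
```

Reading this aloud is less painful: "Here are the facets" or "Here is an error", never both at once.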
Re: [CODE4LIB] Citation parsing?
Having written a pretty decent citation parser 10 years ago (in AppleScript!), and having seen a lot of people take whacks at the problem, I have to say that it's pretty easy to write one that works on 70-80% of citations, particularly if you stick to one scholarly subject area. On the other hand, it's really quite hard to write a citation parser that gets better than 90% of citations for a general corpus. The main problem is that scholarly works are written by creative, ingenious people. When applied to citations, creativity and ingenuity are disasters for automated parsers. Parsers working on the computer science literature have come the farthest, mostly because the convention in computer science literature is to always include the article title. The most impressive thing to me about Google Scholar when it was first released was to see how far they had taken the citation parsing outside of the computer science literature. Still, they have a ways to go; most of the progress they've made seems to be by cheating (i.e. backing the citation out of the linking, which means they're just piggybacking on the work done by Inera and others). (Hint: one of the very best performing open source citation parsers was written (in Perl) by Steve Lawrence, who was at NEC at the time, as part of ResearchIndex AKA CiteSeer. It was released as pseudo open source, but not so easy to separate. It relied heavily on the availability of the article title. Steve has been at Google for a while. Steve apparently wasn't involved in Scholar, but you have to assume he and Anurag did a fair amount of comparing notes.) Anyway, almost all parsers rely on a set of heuristics. I have not seen any parsers that do a good job of managing their heuristics in a scalable way. A successful open-source attack on this problem would have the following characteristics: 1. able to efficiently handle and manage large numbers of parsing and scoring heuristics; 2. easy for contributors to add parsing and scoring heuristics; 3. able to use contextual information (is the citation from a physics article or from a history monograph?) in application and scoring of heuristics. Eric It's on our list of Big Problems To Solve; I'm hoping to have time to tackle it later this year :) -n On Jul 18, 2007, at 12:57 PM, Jonathan Rochkind wrote: Ha! If it's not too difficult, then with all the time you've spent looking at it extensively, how come you don't have a solution yet? Just kidding. :) Jonathan Nathan Vack wrote: We've looked at this pretty extensively, and we're pretty certain there's nothing downloadable that does a good enough job. However, it's by no means impossible -- it seems to be undergrad thesis-level work in Singapore: http://wing.comp.nus.edu.sg/parsCit/ There used to be a paper describing this approach (essentially treating citation parsing as a natural language processing task and using a maximum entropy algorithm) online... the page even cites it... but it seems to be gone now. FWIW, it didn't look too difficult. -Nate On Jul 17, 2007, at 6:16 PM, Jonathan Rochkind wrote: Does anyone have any decent open source code to parse a citation? I'm talking about a completely narrative citation like someone might cut-and-paste from a bibliography or web page. I realize there are a number of different formats this could be in (not to mention the human error problems that always occur with human-entered free text) -- but thinking about it, I suspect that with some work you could get something that worked reasonably well (if not perfect). So I'm wondering if anyone has done this work. (One of the commercial legal products -- I forget if it's Lexis or West -- does this with legal citations -- a more limited domain -- quite well. I'm not sure if any of the commercial bibliographic citation management software does this?)
The goal, as you can probably guess, is a box that the user can paste a citation into; make an OpenURL out of it; show the user where to get the citation. I'm pretty confident something useful could be created here, with enough time put into it. But sadly, it's probably more time than anyone has individually. Unless someone's done it already? Hopefully, Jonathan -- Jonathan Rochkind Sr. Programmer/Analyst The Sheridan Libraries Johns Hopkins University 410.516.8886 rochkind (at) jhu.edu
Re: [CODE4LIB] Citation parsing?
On Jul 18, 2007, at 10:04 PM, Eric Hellman wrote: Also, even in (many) scholarly journals, editorial consistency is almost unbelievably poor -- lots of times, the rules just aren't followed. Punctuation gets missed, journal names (especially abbreviations!) are misspelled... and so on. Rule-based and heuristic systems are always going to have problems in those cases. Heuristics are perhaps the only way to deal with the lack of a consistent format (e.g. a cluster of words including "journal of" is likely to contain a journal name). If you have a halfway decent journal name parser (such as the one in our OpenURL software), it already contains a large list of journal misspellings. In a lot of ways, I think the problem is fundamentally similar to identifying parts of speech in natural language (which has lots of the same ambiguities) -- and the same techniques that succeed there will probably yield the most robust results for citation parsing. Have people been able to do a decent job of identifying parts of speech in natural language?
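The "journal of" heuristic can be sketched as a toy: a run of capitalized words around "Journal of" is taken as a candidate journal name. Real parsers stack many such scored heuristics; this shows just one, and the regex is only an approximation:

```python
import re

# Toy version of one heuristic: a cluster of capitalized words containing
# "Journal of" probably holds the journal name. A real parser would score
# many such heuristics against each other rather than trusting one regex.
def guess_journal_name(citation):
    """Return a candidate journal name from a citation string, or None."""
    m = re.search(r"((?:[A-Z][a-z]+\s+)?Journal\s+of(?:\s+[A-Z][a-z]+)+)",
                  citation)
    return m.group(1) if m else None
```

It fails on abbreviations, misspellings, and titles that merely contain "Journal of", which is exactly why single heuristics don't get past the 70-80% plateau discussed above.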
Re: [CODE4LIB] library find and bibliographic citation export?
On Sep 27, 2007, at 9:59 PM, Steve Toub wrote: A reminder that the data model for OpenURL/COinS does not have all metadata fields: only one author allowed, no abstract, etc. That's incorrect; an OpenURL context object may contain any number of author names (though the parsed author-name fields describe only the first author). And to be fair, there exists no metadata format that has all metadata fields. Eric
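Concretely, repeating the rft.au key is how a KEV context object carries several authors. A minimal sketch (the names and title are made up):

```python
from urllib.parse import urlencode

# Sketch of a KEV context object with several rft.au values. Repeating
# rft.au is legal; the parsed fields (rft.aulast, rft.auinit1, ...) would
# describe only the first author. The example values are invented.
def journal_kev(title, authors):
    """Build a KEV query string with one rft.au pair per author."""
    pairs = [("url_ver", "Z39.88-2004"),
             ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
             ("rft.jtitle", title)]
    pairs += [("rft.au", a) for a in authors]   # one pair per author
    return urlencode(pairs)
```

Using a list of pairs (not a dict) is the point: it is what lets the same key appear more than once in the serialized OpenURL.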
Re: [CODE4LIB] library find and bibliographic citation export?
SFX uses a proprietary mechanism to trigger fetch that is not part of the OpenURL standard. The usefulness of this mechanism, however, motivated the very rich fetch functionality in the 1.0 version of the standard. If you care at all about interoperability, you should avoid the SFX trigger mechanism. COinS recommends against using ContextObjects with fetch (by-reference metadata) because it is thought that diverse agents will not be able to deal with them; however, link server systems with full Z39.88 implementations should have no problem with them. Eric On Sep 28, 2007, at 11:59 PM, Tom Keays wrote: I'm certainly no expert, but my understanding is that you have to embed the extra authors into a call (a "fetch" in the SFX lingo) using a private identifier at the end of the OpenURL string. It's more complicated than that, of course, since there has to be an sid included in addition to the pid and, at least in the SFX-biased documentation, this so-called fetch operation is intended to be done using Z39.50 or HTML-based (REST?) calls to a data server determined by the source. Pretty messy, huh? http://www.exlibrisgroup.com/resources/sfx/sfx_for_ips_aug_2002.pdf The upshot is that multiple authors aren't directly supported in COinS because a COinS URL is by design generic and therefore can't know how to deal with the specific sid and pid requirements of a fetch. However, I would think Umlaut ought to be able to handle it since it has SFX at its core. On 9/28/07, Jonathan Rochkind [EMAIL PROTECTED] wrote: Can you tell me how to legally include more than one author name in an OpenURL context object? I've been a bit confused about this myself, and happen to be dealing with it presently too.
[CODE4LIB] OpenURL Referrer for IE
OCLC's OpenURL Referrer is now available for Internet Explorer! Previously available only for Firefox, this popular browser extension inserts OpenURLs into Google Scholar and Google News Archive search results. It also detects and makes links out of web COinS, such as those found in Wikipedia and Worldcat.org. The extension can be downloaded for free at http://openly.oclc.org/openurlref/ OpenURL Referrer uses your institution's link resolver settings from the OCLC WorldCat Registry, so there is no need to manually configure the extension. Institutions can register their resolver in the OCLC WorldCat Registry by visiting http://worldcat.org/registry/institutions. All institutions can register for free, even if they are not OCLC member libraries. The IE version leans much more heavily on the Worldcat Resolver Registry than the Firefox version does. The reason for this is that IE does not have a nice XUL-based way to make user interfaces, so we instead rely on the Registry to do baseurl management. I hope this proves to be useful. Eric
[CODE4LIB] Openly Jake is closed for renovation.
I continue to be surprised at the continuing use of Openly Jake, considering that it's been over 7 years since the data it delivers was last maintained. Over the past month, the system has become increasingly unstable, due in part to *heavy usage*, and on examination, it looks like we'll need to do some serious renovation (the usual chain of OS/VM/container updates); as a result, jake will not be available for at least the next week. I can't make any promises, but there is a possibility that we can also enable a switch to an up-to-date knowledgebase. If you have any questions or concerns, please don't hesitate to contact me. Eric Hellman, Director, OCLC New Jersey [EMAIL PROTECTED] 2 Broad St., Suite 208 tel 1-973-509-7800 fax 1-734-468-6216 Bloomfield, NJ 07003 http://openly.oclc.org/
Re: [CODE4LIB] Multiple ISBNs in COInS?
It's an excellent point. The resolver's knowledgebase needs to know which ISSN a vendor has bundled content under, and ideally it will be able to access that content no matter which ISSN/eISSN is in the OpenURL metadata. I'm thinking of a particular vendor that uses the ISSN in its URL syntax, but without a knowledgebase, it's hard to predict which ISSN is the one they use! On Feb 29, 2008, at 8:38 AM, Kyle Banerjee wrote: I agree; an ISSN is not an identifier for an article. But in general, a resolver should be smart enough to know what serial is meant even if a variant ISSN is supplied. To prevent multiple searches, the resolver has to know how a title is referenced in the target source. This requires precalculation using a service or data file like xISBN, but one that includes ISSNs. However, it is important to keep in mind that sources such as the library catalog sometimes require multiple ISSNs to retrieve all holdings data unless this information is combined before it is loaded into the resolver knowledgebase. Between cataloging rules that influence how serials are issued (specifically, the practice known as successive entry cataloging, which spreads individual titles across multiple records because of piddly variations in issues) and things that occur at the publishing end of things, many journals are known by multiple ISSNs. Practices like these are not user friendly -- even reference librarians don't seem to understand them -- so database providers typically combine all the issues so they can be considered part of one unit. Vendor-provided data about such titles will likely include only one of these ISSNs (most likely the most recent one, but that is not guaranteed). Unlike vendors, catalogers can be counted on to spread the holdings statements across multiple records and ISSNs if the cataloging rules so prescribe. This may sound like cataloging minutia, but this dynamic affects a number of very popular titles.
Resolving only one ISSN could easily lead people to think an issue they need is not available when it is on hand. kyle -- Kyle Banerjee Digital Services Program Manager Orbis Cascade Alliance [EMAIL PROTECTED] / 541.359.9599
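The precalculation Kyle describes amounts to collapsing a journal's ISSN variants (print, electronic, successive-entry) to one canonical key before the knowledgebase is queried. A toy sketch with invented ISSNs:

```python
# Toy sketch of an ISSN-variant table. The ISSNs and journal ids below are
# invented; a real resolver would build this mapping from a service or data
# file that groups related ISSNs, as discussed in the thread above.
ISSN_ALIASES = {
    "0000-0019": "J-1",   # print ISSN
    "1234-5678": "J-1",   # electronic ISSN for the same journal
    "8765-4321": "J-2",   # a different journal
}

def canonical_journal(issn):
    """Map any known ISSN variant to the journal's canonical id, or None."""
    return ISSN_ALIASES.get(issn)
```

With all variants mapped to one key, a request carrying any of a journal's ISSNs resolves to the same holdings, avoiding the "not available when it is on hand" failure.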
[CODE4LIB] Open positions at OCLC New Jersey
Products we build here include: Worldcat Link Manager, the Knowledgebase (used by a number of other vendors), xISBN, xISSN, OpenURL Referrer (a Firefox add-on), and Link Evaluator (a Firefox add-on). We have just started work on a project to change the way that libraries throughout the world organize what they do. As a big non-profit corporation, we have solid benefits and good insulation from the coming recession. As a small location, we have flexible hours and a casual working environment. As a non-profit, OCLC doesn't pay like a bank or other big businesses would; a corollary of that is that people who work here actually want to work here. For me, the greatest thing about what we do is that millions of people around the world benefit from the work we do. We're located 10 miles west of Manhattan (GSP exit 148), 1 block from the Bloomfield train station. If you're interested in a position at OCLC NJ, feel free to write me an e-mail ([EMAIL PROTECTED]) telling me why you're a good match, and submit a resume at https://jobs-oclc.icims.com/oclc_jobs/jobs/candidate/jobs.jsp?ss=1&searchLocation=US-NJ so that you exist in the minds of HR. Eric
Re: [CODE4LIB] COinS in OL?
Not just the book pages, I might add! Wikipedia probably has the most non-book COinS deployed; Worldcat is the premier site for book COinS. A recent but impressive addition to the COinSiverse is ResearchBlogging; see http://ResearchBlogging.org Eric On 12/1/08 11:08 AM, Karen Coyle [EMAIL PROTECTED] wrote: I have a question to ask for the Open Library folks and I couldn't quite figure out where to ask it. This seems like a good place. Would it be useful to embed COinS in the book pages of the Open Library? Does anyone think they might make use of them? Thanks, kc Eric Hellman, Director OCLC New Jersey [EMAIL PROTECTED] 2 Broad St., Suite 208 tel 1-973-509-7800 Bloomfield, NJ 07003 http://nj.oclc.org/
Re: [CODE4LIB] COinS in OL?
Just catching up on the discussion here... For the benefit of those who aren't on the OpenURL list, it was pointed out that you can put an LCCN in a ContextObject using info URIs: rft_id=info:lccn/93004427 Eric On 12/1/08 1:28 PM, Jonathan Rochkind [EMAIL PROTECTED] wrote: I'm not sure there's any good way to include a DDC or LCC in an SAP1 OpenURL for COinS. Same with subject vocabularies. Really, I'm pretty sure there is NOT, in fact. But if there is, sure, throw them in, put in anything you've got. But this re-affirms my suggestion that there might be a better microformat-ish way to embed information in the page in addition to OpenURL. COinS/OpenURL is important because we have an established infrastructure for it, but it's actually pretty limited and not always the easiest to work with. Jonathan Eric Hellman, Director OCLC New Jersey [EMAIL PROTECTED] 2 Broad St., Suite 208 tel 1-973-509-7800 Bloomfield, NJ 07003 http://nj.oclc.org/
Re: [CODE4LIB] COinS in OL?
Yep. There's no URI for LCC. You could put LCC in the subject field of a dublin core profile metadata format ContextObject. But it's not clear why anyone would want to do that. On 12/8/08 10:41 AM, Jonathan Rochkind [EMAIL PROTECTED] wrote: LCCN--Library of Congress Control Number--eg 98013779--, yes. LCC--Library of Congress Classification--eg BF575.H27 W35 1991--I don't think so. Jonathan Eric Hellman wrote: just catch up on the discussion here... for the benefit of those who aren't on the openurl list, it was pointed out that you can put lccn in a ContextObject using info uri's: rft_id=info:lccn/93004427 Eric Hellman, Director OCLC New Jersey [EMAIL PROTECTED] 2 Broad St., Suite 208 tel 1-973-509-7800 Bloomfield, NJ 07003 http://nj.oclc.org/
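[Editor's note] Eric's workaround — carrying an LCC class number in the subject field of a Dublin Core profile ContextObject — can be sketched as a KEV query string like this. Field names follow the Z39.88 KEV conventions; the title and class number are invented examples.

```python
from urllib.parse import urlencode

# Hypothetical sketch: since LCC has no URI scheme, ride it in the
# subject field of a DC-profile ContextObject. Values are illustrative.
def dc_coins_query(title, lcc_class):
    params = [
        ("url_ver", "Z39.88-2004"),
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:dc"),  # DC metadata format
        ("rft.title", title),
        ("rft.subject", lcc_class),  # no URI for LCC, so plain text here
    ]
    return urlencode(params)
```

Whether any resolver would do anything useful with that subject field is, as Eric says, another question.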
Re: [CODE4LIB] registering info: uris?
I'll bite. There are actually a number of http URLs that work like http://dx.doi.org/10./j.1475-4983.2007.00728.x One of them is http://doi.wiley.com/10./j.1475-4983.2007.00728.x Another is run by CrossRef; some OpenURL link servers also have DOI proxy capability. So for code to extract the DOI reliably from http URLs, the code needs to know all the possibilities for the DOI proxy stem. The proxies also tend to have optional parameters that can control the resolution. In principle, the info:doi/ stem addresses this. On Apr 1, 2009, at 7:27 AM, Ross Singer wrote: What I don't understand is the reason to express that identifier as: info:doi/10./j.1475-4983.2007.00728.x when http://dx.doi.org/10./j.1475-4983.2007.00728.x Eric Hellman e...@hellman.net (personal) http://hellman.net/eric/
Re: [CODE4LIB] registering info: uris?
No, that's not at all what it implies. The ofi/nam identifiers were minted as identifiers for namespaces of identifiers, not as a wrapper scheme for the identifiers themselves. Yes, it's a bit TOO meta, but they can be safely ignored unless a new profile is desired. On Apr 5, 2009, at 10:31 AM, Karen Coyle wrote: Jonathan Rochkind wrote: URI for an ISBN or SuDocs? I don't think the GPO is going anywhere, but the GPO isn't committing to supporting an http URI scheme, and whoever is, who knows if they're going anywhere. That issue is certainly mitigated by Ross using purl.org for these, instead of his own personal http URI. But another issue that makes us want a controlling authority is increasing the chances that everyone will use the _same_ URI. If GPO were behind the purl.org/ NET/sudoc URIs, those chances would be high. Just Ross on his own, the chances go down, later someone else (OCLC, GPO, some other guy like Ross) might accidentally create a 'competitor', which would be unfortunate. Note this isn't as much of a problem for born web resources -- nobody's going to accidentally create an alternate URI for a dbpedia term, because anybody that knows about dbpedia knows that it lives at dbpedia. So those are my thoughts. Now everyone else can argue bitterly over them for a while. :) The ones that really puzzle me, however, are the OpenURL info namespace URIs for ftp, http, https and info. This implies that EVERY identifier used by OpenURL needs an info URI, even if it is a URI in its own right. They are under info:ofi/nam, which is called Namespace reserved for registry identifiers of namespaces. There's something so circular about this that I just get a brain dump when I try to understand it. Does it make sense to anyone? kc -- --- Karen Coyle / Digital Library Consultant kco...@kcoyle.net http://www.kcoyle.net ph.: 510-540-7596 skype: kcoylenet fx.: 510-848-3913 mo.: 510-435-8234 Eric Hellman http://hellman.net/eric/
Re: [CODE4LIB] Use of rft.identifier in COiNS?
Wikipedia uses the DC metadata format for non-book objects; rft.identifier is part of the DC metadata format. If you are describing a book, you want to use the book metadata format. Jonathan is correct: rft_id is a possible place to put an accession number for a book, but only if you can make a URI out of it. You might also consider rft_dat, as long as you include rfr_id. Great to hear LibraryThing is looking at COinS! Eric On Apr 29, 2009, at 4:01 PM, Chris Catalfo wrote: Hi all, I am trying to find the best way to include an item's accession number (i.e. ILS system id) in a COinS span. This is in the context of library catalog pages where I'd like to be able to retrieve the ILS accession number to return to LibraryThing for Libraries. I see no mention of an rft.identifier key/value pair on the COinS site's brief guide to books [1]. It does, however, appear as an element in the COinS online generator for generic items [2]. Googling returned a couple of results using rft.identifier to hold urls. Can anyone enlighten me as to whether using rft.identifier to hold the ILS accession number is valid? Or suggest a more suitable key/value pair? Thanks for any help you can provide. Chris Catalfo Programmer, LibraryThing [1] http://ocoins.info/cobgbook.html [2] http://generator.ocoins.info/?sitePage=info/dc.html Eric Hellman 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net (personal) http://hellman.net/eric/
Re: [CODE4LIB] A Book Grab by Google
Should note that Google could be paying $100,000,000+ to rights holders without getting ANYTHING in return in the absence of a settlement; that's what the copyright attorneys I've talked to believe would have been the ruling by the court had the suit gone to trial. And if that happened, libraries would get nothing, not even the scans. I don't see how bashing Google (which is NOT what the library association briefs are doing, btw) for gaps in US and international Copyright Law (orphan works, for example) will end up helping libraries. My blog at http://go-to-hellman.blogspot.com/ is no longer secret. Eric
Re: [CODE4LIB] A Book Grab by Google
But the argument being trotted out is that having orphan works available through Google would HURT libraries, which is a somewhat different discussion. The arguments I see for that (as applied to libraries other than the Internet Archive) are: 1. Asset devaluation. Just as DeBeers would be hurt if Google started selling cheap diamonds, because their stock of diamonds would be devalued, libraries would find their collections devalued. 2. Competition. Patrons would have attractive alternatives to visiting libraries to access information locked onto paper. But let's imagine that Internet Archive was shoehorned into the settlement. How would these arguments change? Asset devaluation would presumably be worse as the price was driven down, and libraries (other than IA) would be faced with more competition, not less. On the other hand, presumably libraries would gain more options in indexing and thus improved access to their collections. Eric http://hellman.net/eric/ On May 20, 2009, at 3:47 PM, st...@archive.org wrote: On 5/20/09 11:19 AM, Eric Hellman wrote: I don't see how bashing Google (which is NOT what the library association briefs are doing, btw) for gaps in US and international Copyright Law (orphan works, for example) will end up helping libraries. i think the concern is that the settlement could give _only_ Google the right to scan orphaned works, and no one else. that certainly wouldn't help libraries. /st...@archive.org
Re: [CODE4LIB] A Book Grab by Google
I think one thing in Karen's comment is incorrect. As far as I can tell, the 'most favored nation' clause does NOT apply in the situation that Karen assumes it would be most likely to come into play. MFN appears to apply only if the registry licenses orphan works. It's an odd provision if you assume that the registry can't license orphan works; commentators such as Randy Picker have also commented on this oddness; as Karen mentions, it could be meant to come into play if orphan works legislation is enacted. You can examine the legalese yourself at http://go-to-hellman.blogspot.com/2009/04/does-google-really-get-orphan-monopoly.html Eric On May 20, 2009, at 2:54 PM, Karen Coyle wrote: Eric Hellman wrote: Should note that Google could be paying $100,000,000+ to rights holders without getting ANYTHING in return in the absence of a settlement; that's what the copyright attorneys I've talked to believe would have been the ruling by the court had the suit gone to trial. And if that happened, libraries would get nothing, not even the scans. I don't see how bashing Google (which is NOT what the library association briefs are doing, btw) for gaps in US and international Copyright Law (orphan works, for example) will end up helping libraries. My blog at http://go-to-hellman.blogspot.com/ is no longer secret. Eric Another important note is that the settlement is the collective desires of the entities representing rights holders (Author's Guild and Assn Am. Publishers) and Google. Because the settlement talks were done under NDA, we can only guess at which aspects of the settlement were proposed/championed by which participants. From the little bit that has been revealed by folks who were there (because they are still under NDA) the AAP had strong demands and was probably equal to Google, if not more so, in terms of its ability to carve out what it felt was the best deal. The settlement is a compromise, with everyone getting *some* of what they wanted, and no one getting *all*.
In answer to the question you pose on your blog: The key question is this: Would the Book Rights Registry have the ability to authorize a Google competitor to copy and use Orphan works? The legal folks I've heard speak about this say that the answer is no. Only the court can authorize the copying and use of Orphan works outside of what copyright law already states, and this settlement waives liability under the law only for Google. The registry cannot change the legal status of Orphan works under the copyright law in a way that would permit copying of them as in-copyright works. The registry sets prices, so if someone else found a way to copy Orphan works legally (say, if we got orphan works legislation), the registry might be used by them as the middle-man for payments. Most likely the registry would be used for non-Orphan works, because the rights holder could make a deal with the registry to give permission for copying, with $$ going to the registry and on to the rights holder. This is exactly what the Copyright Clearance Center does -- it serves as a central licensing agency for copyright holders. I assume that this is the area where the 'most favored nation' clause would be most likely to come into play -- basically, if Google Books is successful, rights holders might want to make deals with other entities for similar product lines. Whether or not the suit itself would have gone against Google is a matter of debate. I've heard it both ways. Google folks state (and because they say this publicly it has to be considered at least partially a PR statement) that the lawsuit would have gone on for years (true), and they didn't want to wait that long to be able to know what they could and could not do with this project. That makes sense, but it also is possible that they weren't as sure that they'd win as they'd stated when they started the project. 
kc -- --- Karen Coyle / Digital Library Consultant kco...@kcoyle.net http://www.kcoyle.net ph.: 510-540-7596 skype: kcoylenet fx.: 510-848-3913 mo.: 510-435-8234
Re: [CODE4LIB] A Book Grab by Google
The oddness was remarked upon in Randy Picker's talk at the Columbia conference on the Google Book Search Settlement. Orphan works is not a term that occurs in the settlement agreement. Rightsholders other than Registered Rightsholders are orphan parents. Careful commentators refer to the initial monopoly on orphan works created by the settlement agreement, because we don't know what will happen down the road. On May 20, 2009, at 8:14 PM, Karen Coyle wrote: Eric, can you cite a section for this? Because I haven't seen this interpretation elsewhere, and I don't read it in the section you cite, which doesn't seem to me to mention orphan works. I will point to Grimmelmann: http://works.bepress.com/cgi/viewcontent.cgi?article=1024&context=james_grimmelmann pp. 10-11. Grimmelmann thinks that the monopoly on orphan works is what will give Google the edge that keeps away competition, but he doesn't interpret the MFN clause as relating only to orphan works. kc Eric Hellman wrote: I think one thing in Karen's comment is incorrect. As far as I can tell, the 'most favored nation' clause does NOT apply in the situation that Karen assumes it would be most likely to come into play. MFN appears to apply only if the registry licenses orphan works. It's an odd provision if you assume that the registry can't license orphan works; commentators such as Randy Picker have also commented on this oddness; as Karen mentions, it could be meant to come into play if orphan works legislation is enacted. You can examine the legalese yourself at http://go-to-hellman.blogspot.com/2009/04/does-google-really-get-orphan-monopoly.html Eric On May 20, 2009, at 2:54 PM, Karen Coyle wrote: Eric Hellman wrote: Should note that Google could be paying $100,000,000+ to rights holders without getting ANYTHING in return in the absence of a settlement; that's what the copyright attorneys I've talked to believe would have been the ruling by the court had the suit gone to trial.
And if that happened, libraries would get nothing, not even the scans. I don't see how bashing Google (which is NOT what the library association briefs are doing, btw) for gaps in US and international Copyright Law (orphan works, for example) will end up helping libraries. My blog at http://go-to-hellman.blogspot.com/ is no longer secret. Eric Another important note is that the settlement is the collective desires of the entities representing rights holders (Author's Guild and Assn Am. Publishers) and Google. Because the settlement talks were done under NDA, we can only guess at which aspects of the settlement were proposed/championed by which participants. From the little bit that has been revealed by folks who were there (because they are still under NDA) the AAP had strong demands and was probably equal to Google, if not more so, in terms of its ability to carve out what it felt was the best deal. The settlement is a compromise, with everyone getting *some* of what they wanted, and no one getting *all*. In answer to the question you pose on your blog: The key question is this: Would the Book Rights Registry have the ability to authorize a Google competitor to copy and use Orphan works? The legal folks I've heard speak about this say that the answer is no. Only the court can authorize the copying and use of Orphan works outside of what copyright law already states, and this settlement waives liability under the law only for Google. The registry cannot change the legal status of Orphan works under the copyright law in a way that would permit copying of them as in-copyright works. The registry sets prices, so if someone else found a way to copy Orphan works legally (say, if we got orphan works legislation), the registry might be used by them as the middle-man for payments.
Most likely the registry would be used for non-Orphan works, because the rights holder could make a deal with the registry to give permission for copying, with $$ going to the registry and on to the rights holder. This is exactly what the Copyright Clearance Center does -- it serves as a central licensing agency for copyright holders. I assume that this is the area where the 'most favored nation' clause would be most likely to come into play -- basically, if Google Books is successful, rights holders might want to make deals with other entities for similar product lines. Whether or not the suit itself would have gone against Google is a matter of debate. I've heard it both ways. Google folks state (and because they say this publicly it has to be considered at least partially a PR statement) that the lawsuit would have gone on for years (true), and they didn't want to wait that long to be able to know what they could and could not do with this project. That makes sense, but it also is possible that they weren't as sure that they'd win as they'd stated when
[CODE4LIB] Google Fusion Tables
Google Fusion Tables appear to be aimed at collaborative linked database development. It's a bit early (pre-alpha, no web services, no publishing), but it looks really interesting. Does anyone have ideas how to take advantage of this in libraries? Google Labs Blog: http://googleresearch.blogspot.com/2009/06/google-fusion-tables.html Google Fusion Tables: http://tables.googlelabs.com/Home My review: http://go-to-hellman.blogspot.com/2009/06/linked-data-vs-google-fusion-tables.html Eric Hellman 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net (personal) http://hellman.net/eric/
[CODE4LIB] proxying Google Book Search and advertising networks to protect patron privacy
Recent attention to privacy concerns about Google Book Search has led me to investigate whether any libraries are using tools such as proxy servers to enhance patron privacy when using Google Book Search. Similarly, advertising networks (web bugs, for example) could be proxied for the same reason. I would be very interested to hear from any libraries that have done either of these things and of their experiences doing so. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] proxying Google Book Search and advertising networks to protect patron privacy
I doubt that very much. It's very common for corporate sites to channel all their traffic through gateways. I would assume that Google was smart enough to recognize that your usage pattern was not that of many users coming from a single IP address, but rather that of a harvesting robot. The two activities have very different log signatures. On Aug 5, 2009, at 12:13 PM, Tim Spalding wrote: I suspect that proxying Google will trigger an automatic throttle. Early on, a number of us hit GB hard, trying to figure out what they had, and got stopped. Tim On Wed, Aug 5, 2009 at 9:59 AM, Eric Hellman e...@hellman.net wrote: Recent attention to privacy concerns about Google Book Search have led me to investigate whether any libraries are using tools such as proxy servers to enhance patron privacy when using Google Book Search. Similarly, advertising networks (web bugs, for example) could be proxied for the same reason. I would be very interested to hear from any libraries that have done either of these things and of their experiences doing so. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] proxying Google Book Search and advertising networks to protect patron privacy
No doubt throttling is used for API calls, but IP address throttling of the full user interface ought to be managed quite differently. If anyone has seen that occur, I would be interested to hear of it. On Aug 5, 2009, at 2:34 PM, Jon Gorman wrote: On Wed, Aug 5, 2009 at 1:05 PM, Eric Hellman e...@hellman.net wrote: I doubt that very much. It's very common for corporate sites to channel all their traffic through gateways. I would assume that google was smart enough to recognize that your usage pattern was not that of many users coming from a single IP address, but rather that of a harvesting robot. The two activities have very different log signatures. Uh, actually, Google has in the past throttled some services based on the ip address. I'm pretty sure it was mentioned before on this list and I can verify it myself. Look for some of Jonathan Rochkind's questions about a year ago. The original api used with GBS seemed very prone to this. I know others hit issues and when our consortium tried to use a proxy of the original api due to some technical issues they ran into this. (First couple of hundred hits would be golden, the rest just would return http errors). There's a newer one out there now that apparently doesn't use this throttling, but I'm not positive of the details. An organization may still have to warn Google about it. There's a reason why the original api strongly encouraged folks to do things via an ajaxy call on the client. I'm guessing part of the reason for the new api was to address these issues. Jon Gorman Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
IIRC you can also elide url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx as that is the default. If you don't add DC metadata, which seems like a good idea, you'll definitely want to include something that will help you to persist your replacement record. For example, a label or description for the link. On Sep 14, 2009, at 9:48 AM, O.Stephens wrote: I'm working on a project called TELSTAR (based at the Open University in the UK) which is looking at the integration of resources into an online learning environment (see http://www.open.ac.uk/telstar for the basic project details). The project focuses on the use of References/Citations as the way in which resources are integrated into the teaching material/environment. We are going to use OpenURL to provide links (where appropriate) from references to full text resources. Clearly for journals, articles, and a number of other formats this is a relatively well understood practice, and implementing this should be relatively straightforward. However, we also want to use OpenURL even where the reference is to a more straightforward web resource - e.g. a web page such as http://www.bbc.co.uk . This is in order to ensure that links provided in the course material are persistent over time. A brief description of what we perceive to be the problem and the way we are tackling it is available on the project blog at http://www.open.ac.uk/blogs/telstar/2009/09/14/managing-link-persistence-with-openurls/ (any comments welcome). What we are considering is the best way to represent a web page (or similar - pdf etc.) in an OpenURL. It looks like we could do something as simple as: http://resolver.address/?url_ver=Z39.88-2004&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx&rft_id=http%3A%2F%2Fwww.bbc.co.uk Is this sufficient (and correct)? Should we consider passing fuller metadata? If the latter should we use the existing KEV DC representation, or should we be looking at defining a new metadata format? Any help would be very welcome.
Thanks, Owen Owen Stephens TELSTAR Project Manager Library and Learning Resources Centre The Open University Walton Hall Milton Keynes, MK7 6AA T: +44 (0) 1908 858701 F: +44 (0) 1908 653571 E: o.steph...@open.ac.ukmailto:o.steph...@open.ac.uk The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England Wales and a charity registered in Scotland (SC 038302). Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
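[Editor's note] The OpenURL Owen proposes, with Eric's suggested label added for persistence, can be sketched like this. The resolver base address is a placeholder; `rft.title` as the label field is an assumption.

```python
from urllib.parse import urlencode

# Minimal sketch of an OpenURL for a plain web resource: the target URL
# goes percent-encoded into rft_id. The resolver address is invented.
def openurl_for_web_page(resolver_base, target_url, label=None):
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_id", target_url),  # urlencode percent-encodes the URL
    ]
    if label:
        # optional descriptive metadata to help maintain the link later;
        # using rft.title for this is an assumption, not part of the spec
        params.append(("rft.title", label))
    return resolver_base + "?" + urlencode(params)
```

For example, `openurl_for_web_page("http://resolver.address/", "http://www.bbc.co.uk", "BBC homepage")` yields a query string with `rft_id=http%3A%2F%2Fwww.bbc.co.uk`, matching the form in Owen's message.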
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Could you give us examples of http urls in rft_id that are like that? I've never seen such. On Sep 14, 2009, at 11:58 AM, Jonathan Rochkind wrote: In general, identifiers in URI form are put in rft_id that are NOT meant for providing to the user as a navigable URL. So the receiving software can't assume that whatever url is in rft_id represents an actual access point (available to the user) for the document. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
As I'm sure you're aware, the OpenURL spec only talks about providing services, and resolving to full text is only one of many possible services. If *all* you know about a referent is the url, then redirecting the user to the url is going to be the best you can do in almost all cases. In particular, I don't think the dublin core profile, which is what Owen suggests to use, has much to say about resolving to full text. http://catalog.library.jhu.edu/bib/NUM identifies a catalog record -- I mean, what else would you use to id the catalog record? Unless you've implemented the httpRange-14 303 redirect recommendation in your catalog (http://www.w3.org/TR/cooluris/), it shouldn't be construed as identifying the thing it describes, except as a private id, and you should use another field for that. IIRC Google, Worldcat, and Wikipedia used rft_id. I'm not in a position to answer any questions about specific link resolver software that I no longer am associated with, however good it is/was. Eric On Sep 14, 2009, at 12:57 PM, Jonathan Rochkind wrote: Well, in the 'wild' I barely see any rft_id's at all, heh. Aside from the obvious non-http URIs in rft_id, I'm not sure if I've seen http URIs that don't resolve to full text. BUT -- you can do anything with an http URI that you can do with an info uri. There is no requirement or guarantee in any spec that an HTTP uri will resolve at all, let alone resolve to full text for the document cited in an OpenURL. The OpenURL spec says that rft_id is An Identifier Descriptor unambiguously specifies the Entity by means of a Uniform Resource Identifier (URI). It doesn't say that it needs to resolve to full text. In my own OpenURL link-generating software, I _frequently_ put identifiers which are NOT open access URLs to full text in rft_id. Because there's no other place to put them.
And I frequently use http URIs even for things that don't resolve to full text, because the conventional wisdom is to always use http for URIs, whether or not they resolve at all, and certainly no requirement that they resolve to something in particular like full text. Examples that I use myself when generating OpenURL rft_ids, of http URIs that do not resolve to full text include ones identifying bib records in my own catalog: http://catalog.library.jhu.edu/bib/NUM [ Will resolve to my catalog record, but not to full text!] Or similarly, WorldCat http URIs. Or, an rft_id to unambiguously identify something in terms of its Google Books record: http://books.google.com/books?id=tl8MCAAJ Also, URIs to unambiguously specify a referent in terms of sudoc: http://purl.org/NET/sudoc/ [sudoc]= will, as the purl is presently set up by rsinger, resolve to a GPO catalog record, but there's no guarantee of online public full text. I'm pretty sure what I'm doing is perfectly appropriate based on the definition of rft_id, but it's definitely incompatible with a receiving link resolver assuming that all rft_id http URIs will resolve to full text for the rft cited. I don't think it's appropriate to assume that just because a URI is http, that means it will resolve to full text -- it's merely an identifier that unambiguously specifies the referent, same as any other URI scheme. Isn't that what the sem web folks are always insisting in the arguments about how it's okay to use http URIs for any type of identifier at all -- that http is just an identifier (at least in a context where all that's called for is a URI to identify), you can't assume that it resolves to anything in particular? (Although it's nice when it resolves to RDF saying more about the thing identified, it's certainly not expected that it will resolve to full text). Eric, out of curiosity, will your own link resolver software automatically take rft_id's and display them to the user as links?
Jonathan Eric Hellman wrote: Could you give us examples of http urls in rft_id that are like that? I've never seen such. On Sep 14, 2009, at 11:58 AM, Jonathan Rochkind wrote: In general, identifiers in URI form are put in rft_id that are NOT meant for providing to the user as a navigable URL. So the receiving software can't assume that whatever url is in rft_id represents an actual access point (available to the user) for the document. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
It's not correct to say that rft_val has no use; when used, it should contain a URL-encoded package of XML or KEV metadata. It would be correct to say it is very rarely used. On Sep 14, 2009, at 1:40 PM, Rosalyn Metz wrote: ok no one shoot me for doing this: in section 9.1 Namespaces [Registry] of the OpenURL standard (z39.88) it actually provides an example of using a URL in the rfr_id field, and i wonder why you couldn't just do the same thing for the rft_id also there is a field called rft_val which currently has no use. this might be a good one for it. just my 2 cents. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
If you have a URL that can be used for a resource that you are describing in metadata, resolvers can do a better job providing services to users if it is put in the openurl. The only place to put it is rft_id. So let's not let one resolver's incapacity prevent other resolvers from providing better services. If you want to make an OpenURL for a web page, its url is in almost all cases the best unambiguous identifier you could possibly think of. Putting dead http URIs in rft_id is not really a very useful thing to do. On Sep 14, 2009, at 1:45 PM, Jonathan Rochkind wrote: Eric Hellman wrote: http://catalog.library.jhu.edu/bib/NUM identifies a catalog record- I mean what else would you use to id the catalog record. unless you've implemented the http-range 303 redirect recommendation in your catalog (http://www.w3.org/TR/cooluris/), it shouldn't be construed as identifying the thing it describes, except as a private id, and you should use another field for that. Of course. But how is a link resolver supposed to know that, when all it has is rft_id=http://catalog.library.jhu.edu/bib/NUM ?? I suggest that this is a kind of ambiguity in OpenURL: many of us are using rft_id in some contexts to simply provide an unambiguous identifier, and in other cases to provide an end-user access URL (which may not be a good unambiguous identifier at all!), with no way for the link resolver to tell which was intended. So I don't think it's a good idea to do this. I think the community should choose one, and based on the language of the OpenURL spec, rft_id is meant to be an unambiguous identifier, not an end-user access URL. So ideally another way would be provided to send something intended as an end-user access URL in an OpenURL. But OpenURL is pretty much a dead spec that is never going to be developed further in any practical way. So, really, I recommend avoiding OpenURL in favor of standard web approaches whenever you can.
But sometimes you can't, and OpenURL really is the best tool for the job. I use it all the time. And it constantly frustrates me with its lack of flexibility and clarity, leading to people using it in ambiguous ways. Jonathan Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
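The kind of ContextObject being argued over above can be sketched in a few lines. The following builds a KEV (key/encoded-value) OpenURL 1.0 query string whose rft_id carries the cited page's own URL, as Eric advocates; the resolver base URL and the title are hypothetical.

```javascript
// Build a KEV OpenURL 1.0 ContextObject whose rft_id is the HTTP URI
// of the web page being cited. The resolver base URL is hypothetical.
function buildOpenURL(resolverBase, pageUrl, title) {
  const params = new URLSearchParams({
    url_ver: "Z39.88-2004",
    url_ctx_fmt: "info:ofi/fmt:kev:mtx:ctx",
    rft_val_fmt: "info:ofi/fmt:kev:mtx:dc", // Dublin Core KEV format
    "rft.title": title,
    rft_id: pageUrl, // the page's URL as the referent identifier
  });
  return resolverBase + "?" + params.toString();
}

const link = buildOpenURL(
  "http://resolver.example.edu/openurl", // hypothetical resolver
  "http://www.bbc.co.uk/",
  "BBC Homepage"
);
```

A receiving resolver sees rft_id=http%3A%2F%2Fwww.bbc.co.uk%2F and must decide, as Jonathan notes, whether that URI is meant as an identifier or as an access URL; nothing in the ContextObject itself says which.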
Re: [CODE4LIB] Implementing OpenURL for simple web resources
What I'm doing is perfectly appropriate based on the definition of rft_id, but it's definitely incompatible with a receiving link resolver assuming that all rft_id HTTP URIs will resolve to full text for the rft cited. I don't think it's appropriate to assume that just because a URI is HTTP, that means it will resolve to full text -- it's merely an identifier that unambiguously specifies the referent, same as any other URI scheme. Isn't that what the sem web folks are always insisting on in the arguments about how it's okay to use HTTP URIs for any type of identifier at all -- that an HTTP URI is just an identifier (at least in a context where all that's called for is a URI to identify), and you can't assume that it resolves to anything in particular? (Although it's nice when it resolves to RDF saying more about the thing identified, it's certainly not expected that it will resolve to full text.) Eric, out of curiosity, will your own link resolver software automatically take rft_ids and display them to the user as links? Jonathan Eric Hellman wrote: Could you give us examples of HTTP URLs in rft_id that are like that? I've never seen such. On Sep 14, 2009, at 11:58 AM, Jonathan Rochkind wrote: In general, identifiers in URI form are put in rft_id that are NOT meant for providing to the user as a navigable URL. So the receiving software can't assume that whatever URL is in rft_id represents an actual access point (available to the user) for the document. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
You're absolutely correct; in fact, all the ent_val fields are reserved for future use! They went in and out of the spec; I'm trying to remember from my notes. It's better that they're out. On Sep 14, 2009, at 2:05 PM, Rosalyn Metz wrote: sorry eric, i was reading straight from the documentation, and according to it, it has no use. On Mon, Sep 14, 2009 at 1:55 PM, Eric Hellman e...@hellman.net wrote: It's not correct to say that rft_val has no use; when used, it should contain a URL-encoded package of XML or KEV metadata. It would be correct to say it is very rarely used. On Sep 14, 2009, at 1:40 PM, Rosalyn Metz wrote: ok, no one shoot me for doing this: in section 9.1 Namespaces [Registry] of the OpenURL standard (Z39.88) it actually provides an example of using a URL in the rfr_id field, and i wonder why you couldn't just do the same thing for the rft_id. also there is a field called rft_val which currently has no use. this might be a good one for it. just my 2 cents. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
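For concreteness, here is a sketch of the rarely used metadata-by-value mechanism Eric describes: an inner KEV package, itself URL-encoded, carried in rft_val. The field names follow the KEV journal format; the values are made up for illustration.

```javascript
// Sketch of metadata-by-value: an inner KEV metadata package,
// URL-encoded, carried inside the ContextObject in rft_val.
// The journal-title and volume values are hypothetical.
function withByValueMetadata(fields) {
  const pkg = new URLSearchParams(fields).toString(); // inner KEV package
  return new URLSearchParams({
    url_ver: "Z39.88-2004",
    rft_val_fmt: "info:ofi/fmt:kev:mtx:journal", // format of the package
    rft_val: pkg, // the URL-encoded package itself
  }).toString();
}

const ctx = withByValueMetadata({
  "rft.jtitle": "American Quarterly",
  "rft.volume": "58",
});
```

Note the double encoding: the inner package's `=` and `&` come out as `%3D` and `%26` in the outer query string, which is part of why this mechanism saw so little use.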
Re: [CODE4LIB] Implementing OpenURL for simple web resources
I can't imagine that SFX has some fundamental assumption that an HTTP URL in rft_id is never ever something that can be used for access, and even if it did, it would be letting the tail wag the dog to suggest that other resolvers should not do so; some do. There are also resolvers that pre-check URLs -- at least there were before Ex Libris acquired LinkFinder Plus. So it's possible for a resolver agent to discover whether a URL leads somewhere or not. On Sep 14, 2009, at 2:23 PM, Jonathan Rochkind wrote: I disagree. Putting in URIs that unambiguously identify the referent, and in some cases provide additional 'hooks' by virtue of additional identifiers (local bibID, OCLCnum, LCCN, etc.), is a VERY useful thing to do, to me. Whether or not they resolve to an end-user-appropriate web page. If you want to use rft_id to instead be an end-user-appropriate access URL (which may or may not be a suitable unambiguous persistent identifier), I guess it depends on how many of the actually existing in-the-wild link resolvers will, in what contexts, treat an HTTP URI as an end-user-appropriate access URL. If a lot of the in-the-wild link resolvers will, that may be a practically useful thing to do. Thus me asking if the one you had knowledge of did or didn't. I'm 99% sure that SFX will not, in any context, treat an rft_id as an appropriate end-user access URL. Certainly providing an appropriate end-user access URL _is_ a useful thing to do. So is providing an unambiguous persistent identifier. Both are quite useful things to do; they're just different things. Shame that OpenURL kinda implies that you can use the same data element for both. OpenURL's not alone there, though; DC does the same thing. Jonathan Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Nate's point is what I was thinking about in this comment in my original reply: If you don't add DC metadata, which seems like a good idea, you'll definitely want to include something that will help you to persist your replacement record -- for example, a label or description for the link. I should also point out a solution that could work for some people, but not you: put rewrite rules in the gateways serving your network. A bit dangerous and kludgy, but we've seen kludgier things. On Sep 14, 2009, at 4:24 PM, O.Stephens wrote: Nate has a point here -- what if we end up with a commonly used URI pointing at a variety of different things over time, so that it is used to indicate different content each time? However, the problem with a 'short URL' solution (tr.im, PURL, etc.), or indeed any locally assigned identifier that acts as a key, is that, as described in the blog post, you need prior knowledge of the short URL/identifier to use it. The only 'identifier' our authors know for a website is its URL -- and it seems contrary for us to ask them to use something else. I'll need to think about Nate's point -- is this common or an edge case? Is there any other approach we could take? Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
I could see a number of advantages to this in the local context:

- Consistency: references to websites get treated the same as references to journal articles -- this means a single approach on the course side, with flexibility.
- Usage stats: we could collect these anyway, but if we do it via OpenURL we get them in the same place as the stats about usage of other scholarly material, and could consider driving personalisation services off the data (like the bX product from Ex Libris).
- Appropriate copy problem: for resources we subscribe to with authentication mechanisms, there is (I think) an equivalent of the 'appropriate copy' issue as with journal articles -- we can push a URI for 'Web of Science' to the correct version of Web of Science via a local authentication method (using EZproxy, for us).

The problem with the approach (as Nate and Eric mention) is that any approach that relies on the URI as an identifier (whether using OpenURL or a script) is going to have problems, as the same URI could be used to identify different resources over time. I think Eric's suggestion of using additional information to help differentiate is worth looking at, but I suspect that this is going to cause us problems -- although I'd say that it is likely to cause us much less work than the alternative, which is allocating every single reference to a web resource used in our course material its own persistent URL. The use case we are currently looking at is only within our own (authenticated) learning environment -- so these OpenURLs are not going to appear in the wild, so to some extent perhaps it doesn't matter what we do -- but it still seems sensible to me to look at what 'good practice' might look like. I hope this is clear -- I'm still struggling with some of this, and sometimes it doesn't make complete sense to me, but that's my best stab at explaining my thinking at the moment. Again, I appreciate the comments. Jonathan said "But you seem to understand what's up." I wish I did! 
I guess that I'm reasonably confident that the approach I'm describing has some chance of doing the job -- whether it is the best approach I'm not so sure about. Owen The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England and Wales and a charity registered in Scotland (SC 038302). Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
I think using locally meaningful IDs in rft_id is a misuse and a mistake. Locally meaningful data should go in rft_dat, accompanied by rfr_id. just sayin' On Sep 15, 2009, at 11:52 AM, Jonathan Rochkind wrote: I do like Ross's solution, if you really wanna use OpenURL. I'm much more comfortable with the idea of including a URI based on your own local service in rft_id than including any old public URL in rft_id. Then at least your link resolver can say: if what's in rft_id begins with (e.g.) http://telstar.open.ac.uk/, THEN I know this is one of these PURL-type things, and I know that sending the user to it will result in a redirect to an end-user-appropriate access URL. Cause that's my concern with putting random URLs in rft_id: that there's no way to know if they are intended as end-user-appropriate access URLs or not, and in putting things in rft_id that aren't really good identifiers for the referent at all. But using your own local service ID, now you really DO have something that's appropriately considered a persistent identifier for the referent, AND you have a straightforward way to tell when the rft_id of this context is intended as an access URL. Jonathan Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
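Eric's rft_dat-plus-rfr_id suggestion looks like this in KEV form. A sketch only: the referrer ID and the local key below are hypothetical.

```javascript
// Carry a locally meaningful key in rft_dat, scoped by rfr_id so the
// resolver knows who assigned it. The info:sid value and the key
// itself are hypothetical illustrations.
function localKeyContextObject(localKey) {
  return new URLSearchParams({
    url_ver: "Z39.88-2004",
    rfr_id: "info:sid/telstar.open.ac.uk:courselinks", // who assigned the key
    rft_dat: localKey, // private data, meaningful only to this referrer
  }).toString();
}

const example = localKeyContextObject("bib1234");
```

The contrast with putting the same key in rft_id is that rft_dat makes no claim of being a globally unambiguous identifier; it is explicitly referrer-private.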
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Yes, you can. On Sep 15, 2009, at 11:41 AM, Ross Singer wrote: I can't remember if you can include both metadata-by-reference keys and metadata-by-value, but you could have by-reference (rft_ref=http://telstar.open.ac.uk/1234&rft_ref_fmt=RIS or something) point at your citation db to return a formatted citation. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
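Ross's by-reference idea, sketched as a KEV query. The citation-db URL is the one from his example; the format token is assumed to be a plain "RIS" label for illustration.

```javascript
// Metadata-by-reference: instead of sending citation fields inline,
// point the resolver at a service that returns them in a named format.
function byReferenceContextObject(refUrl, refFormat) {
  return new URLSearchParams({
    url_ver: "Z39.88-2004",
    rft_ref: refUrl,        // where the resolver can fetch the metadata
    rft_ref_fmt: refFormat, // format of what it will get back, e.g. RIS
  }).toString();
}

const ref = byReferenceContextObject("http://telstar.open.ac.uk/1234", "RIS");
```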
Re: [CODE4LIB] Implementing OpenURL for simple web resources
The process by which a URI comes to identify something other than the stuff you get by resolving it can be mysterious -- I've blogged about it a bit: http://go-to-hellman.blogspot.com/2009/07/illusion-of-internet-identity.html In the case of WorldCat or Google, it's fame. If you think a URI can be usable outside your institution for identification purposes, and your institution can maintain some sort of identification machinery for as long as the OpenURL is expected to be useful, then it's fine to use it in rft_id. If you intend the URI to connote identity only in the context that you're building URLs for, then use rft_dat, which is there for exactly that purpose. On Sep 15, 2009, at 12:17 PM, Jonathan Rochkind wrote: If it's a URI that is indeed an identifier that unambiguously identifies the referent, as the standard says... I don't see how that's inappropriate in rft_id. Isn't that what it's for? I mentioned before that I put things like http://catalog.library.jhu.edu/bib/1234 in my rft_ids. Putting http://somewhere.edu/our-purl-server/1234 in rft_id seems very analogous to me. Both seem appropriate. I'm not sure what makes a URI locally meaningful or not. What makes http://www.worldcat.org/bibID or http://books.google.com/book?id=foo globally meaningful but http://catalog.library.jhu.edu/bib/1234 or http://somewhere.edu/our-purl-server/1234 locally meaningful? If it's a URI that is reasonably persistent and unambiguously identifies the referent, then it's an identifier and is appropriate for rft_id, says me. Jonathan Eric Hellman wrote: I think using locally meaningful IDs in rft_id is a misuse and a mistake. Locally meaningful data should go in rft_dat, accompanied by rfr_id. just sayin' On Sep 15, 2009, at 11:52 AM, Jonathan Rochkind wrote: I do like Ross's solution, if you really wanna use OpenURL. I'm much more comfortable with the idea of including a URI based on your own local service in rft_id than including any old public URL in rft_id. 
Then at least your link resolver can say: if what's in rft_id begins with (e.g.) http://telstar.open.ac.uk/, THEN I know this is one of these PURL-type things, and I know that sending the user to it will result in a redirect to an end-user-appropriate access URL. Cause that's my concern with putting random URLs in rft_id: that there's no way to know if they are intended as end-user-appropriate access URLs or not, and in putting things in rft_id that aren't really good identifiers for the referent at all. But using your own local service ID, now you really DO have something that's appropriately considered a persistent identifier for the referent, AND you have a straightforward way to tell when the rft_id of this context is intended as an access URL. Jonathan Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
[CODE4LIB] Another way to do link maintenance
The thread on Implementing OpenURL for simple web resources inspired me to write an article on all the things that redirectors can be used for: http://go-to-hellman.blogspot.com/2009/09/redirector-chain-mashup-design-pattern.html Having thought about the original problem a bit, it strikes me that going a bit farther than what Ross suggests could be a nice solution. Have an onLoad JavaScript call your link maintenance database and then rewrite the links in your page. This could be implemented in a JSON sort of way (and no OpenURL). Here's why. There will be situations where you want to maintain the anchor text as well as the link, and this solution allows you to do it. Also, a well-crafted JavaScript will allow all the links to work (well, the good ones, at least) even if your link maintenance service goes down or disappears. Eric On Sep 15, 2009, at 11:47 AM, Ross Singer wrote: Oh yeah, one thing I left off -- in Moodle, it would probably make sense to link to the URL in the a tag: <a href="http://bbc.co.uk/">The Beeb!</a> but use a JavaScript onMouseDown action to rewrite the link to route through your funky link resolver path, a la Google. That way, the page works like any normal web page; a right mouse click / Copy Link Location gives the user the real URL to copy and paste, but normal behavior funnels through the link resolver. -Ross. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
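The onLoad rewriting Eric proposes reduces to a small pure function over the page's hrefs. A sketch under the assumption that the maintenance table arrives as JSON from the link maintenance service; in a browser this would run from an onLoad handler over document.links.

```javascript
// Look each href up in a link-maintenance table and substitute the
// current URL. Unknown links are left untouched, so the page still
// works if the maintenance service is down or has no entry.
// The table and URLs below are hypothetical.
function rewriteLinks(hrefs, maintenanceTable) {
  return hrefs.map(function (href) {
    return Object.prototype.hasOwnProperty.call(maintenanceTable, href)
      ? maintenanceTable[href]
      : href; // fall through: good links keep working
  });
}

const rewritten = rewriteLinks(
  ["http://old.example.org/a", "http://stable.example.org/b"],
  { "http://old.example.org/a": "http://new.example.org/a" }
);
```

The fall-through branch is what gives the graceful-degradation property claimed above: an empty or missing table yields the original links unchanged.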
[CODE4LIB] DIY Book Scanner
I was at a conference last Friday where Dan Reetz demoed his open-source homemade book scanner. Code4Libbers who are involved with low-budget scanning projects may want to check it out: http://www.diybookscanner.org/ (website for Dan's DIY book scanner) http://www.instructables.com/id/DIY-High-Speed-Book-Scanner-from-Trash-and-Cheap-C/ (instructions for making the scanner) http://www.diybookscanner.org/news/?p=17 (more pictures) http://www.diybookscanner.org/forum/ (the DIY scanner community forum) Blog posts: Harry Lewis: http://www.bitsbook.com/2009/10/do-it-yourself-book-scanning/ Robin Sloan: http://www.themillions.com/2009/10/bringing-book-scanning-home.html Me: http://go-to-hellman.blogspot.com/2009/10/revolution-will-be-digitized-by-cheap.html Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] XForms EAD editor sandbox available
XForms and Orbeon are very interesting technologies for developing metadata management tools. The ONIX developers have used this stack to produce an interface for ONIX-PL called OPLE that people should try out. http://www.jisc.ac.uk/whatwedo/programmes/pals3/onixeditor.aspx Questions about Orbeon relate to performance and integrability, but I think it's an impressive use of XForms nonetheless. - Eric On Nov 12, 2009, at 1:30 PM, Ethan Gruber wrote: Hello all, Over the past few months I have been working on and off on a research project to develop an XForms-based web editor for EAD finding aids that runs within the Orbeon Tomcat application. While still in a very early alpha stage (I have probably put only 60-80 hours of work into it thus far), I think that it's ready for a general demonstration to solicit opinions, criticism, etc. from librarians and technical staff. Background: For those not familiar with XForms, it is a W3C standard for creating next-generation forms. It is powerful and can allow you to create XML in the way that it is intended to be created, without limits on repeatability, complex hierarchies, or mixed content. Orbeon adds a level on top of that, taking care of all the Ajax calls, serialization, CRUD operations, and a variety of widgets that allow nice features like tabs and autocomplete/autosuggest that can be bound to authority lists and controlled access terms. By default, Orbeon reads and writes data from and to an eXist database that comes packaged with it, but you can have it serialize the XML to disk or have it interact with any REST interface such as Fedora. Goals: Ultimately, I wish to create a system of forms that can open any EAD 2002-compliant XML file without any data loss or XML transformation whatsoever. I think that this is the shortcoming of systems such as Archon and Archivists' Toolkit. I want to integrate authority lists into certain fields with autosuggest (such as corporate names, people, and subjects). 
If there is demand, I can build a public interface for viewing the entire EAD collection, complete with Solr for faceted browse and search, but this is secondary to producing a form that people with some basic archiving knowledge and EAD background can use to easily and effectively create finding aids. A public interface is the easy part, in any case. It wouldn't take more than a week or two to build something fairly nice and robust. Here is the link: http://beta.scholarslab.org:9080/cocoon/eaditor/ I should stress that the application is *not complete.* I am using Cocoon for providing a list of EAD content in the system. I will remove that application eventually and utilize Orbeon's internal pipelining features to achieve the same objective. I haven't delved too deeply into Orbeon's pipelines yet. Here are some things to note:

1. If you click on a link to open the main part of the guide or any of its components, you have to click the Load link at the top of the form. Forms aren't being loaded on page load yet.
2. Elements that accept mixed content per the EAD 2002 schema (e.g. paragraphs) only accept PCDATA. I haven't worked on mixed content yet; it is by far the most challenging aspect of the project.
3. I only have a few c-level elements available to add.
4. Not all did elements are available yet.
5. A lot of the generic attributes, like type and label, are not available for editing yet. This may be the type of thing that is best customized per institution relative to their own best practices. I don't want more input fields than necessary right now.
6. The only thing you can add into the archdesc right now is the dsc. Once I finish all of the c-level elements, I can just put some xi:includes into the archdesc XForm file to show them at the archdesc level.

I think those are the major issues for now. As I stated earlier, this is sort of a pre-alpha. The project is open source and available (through svn) to anyone who wants it: http://code.google.com/p/eaditor/ . 
I have put together an easy package to get the application up and running without difficulty. All you have to do is unzip the download, go into the Apache Tomcat folder and execute the startup script. This assumes you have nothing running on port 8080 already. Download page: http://code.google.com/p/eaditor/downloads/list Wiki instructions: http://code.google.com/p/eaditor/wiki/QuickstartInstallation?ts=1257887453&updated=QuickstartInstallation Comments, questions, criticism welcome. The editor is a sandbox. Feel free to experiment. Ethan Gruber University of Virginia Library Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Assigning DOI for local content
Having incorporated the handle client software into my own stuff rather easily, I'm pretty sure that's not true. On Nov 19, 2009, at 12:51 PM, Ross Singer wrote: The caveat being that the initial access point is provided via HTTP. But then again, so is http://hdl.handle.net/, which is, in fact, the only way currently in practice to dereference handles. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Assigning DOI for local content
For example, if you don't want to rely on dx.doi.org as your gateway to the handle system for DOI resolution, it would be quite easy for me to deploy my own gateway at dx.hellman.net. I might want to do this if I were an organization paranoid about security and didn't want to disclose to anybody which DOIs my organization was resolving. Or, I might want to directly access metadata in the handle system without going through the HTTP gateways, to provide a service other than resolution. Does this answer your question, Ross? On Nov 20, 2009, at 2:31 PM, Ross Singer wrote: On Fri, Nov 20, 2009 at 2:23 PM, Eric Hellman e...@hellman.net wrote: Having incorporated the handle client software into my own stuff rather easily, I'm pretty sure that's not true. Fair enough. The technology is binding-independent. So you are using and sharing handles using some protocol other than HTTP? I'm more interested in the sharing part of that question. What is the format of the handle identifier in this context? What advantage does it bring over HTTP? -Ross. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Assigning DOI for local content
On Nov 23, 2009, at 1:32 PM, Ross Singer wrote: On Mon, Nov 23, 2009 at 1:07 PM, Eric Hellman e...@hellman.net wrote: Does this answer your question, Ross? Yes, sort of. My question was not so much if you can resolve handles via bindings other than HTTP (since that's one of the selling points of handles) as it was: do people actually use this in the real world? Well, the short answer to that question is yes. I think the discussion veered out of the zone of my understanding the point of it. The original question related to whether a journal should register CrossRef DOIs, and the short answer to that, as far as I'm concerned, is an emphatic yes. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
[CODE4LIB] Support for attending Code4Lib2010
I hope to be in Asheville. But with the Global Economic Downturn, I worry that some people who might have a lot to contribute and the most to gain may be unable to go due to having lost their job or being in a library with horrific budget cuts. So, together with Eric Lease Morgan (who has been involved with Code4Lib from the very start) I'm putting up a bit of money to support the expenses of people who want to go to Code4Lib 2010. If other donors can join Eric and myself, that would be wonderful, but so far I'm guessing that together we can support the travel expenses of two relatively frugal people. If you would like to be considered, please send me an email as soon as possible, and before I wake up on Monday, December 14 at the latest. Please describe your economic hardship, your travel budget, and what you hope to get from the conference. Eric and I will use arbitrary and uncertain methods to decide who to support, and we'll inform you of our decision in time for you to register or not on Wednesday December 16, when registration opens. more at http://go-to-hellman.blogspot.com/2009/12/supporting-attendance-at-code4lib.html Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA glue...@twitter e...@hellman.net http://go-to-hellman.blogspot.com/
[CODE4LIB] Update: Support for attending Code4Lib2010
We now have four community members joining together to support the expenses of people who want to go to Code4Lib 2010, so it's likely that we'll be able to support more than two people's travel expenses. I should mention that support will be informal and discreet; it's not like the scholarships offered by Brown/OSU. If you would like to be considered, please send me an email as soon as possible, and before I wake up on Monday, December 14, at the latest. Please describe your economic hardship, your travel budget, and what you hope to get from the conference. We will use arbitrary and uncertain methods to decide who to support, and we'll inform you of our decision in time for you to register or not on Wednesday December 16, when registration opens. If you want to go and money's a problem, don't hesitate to ask. more at http://go-to-hellman.blogspot.com/2009/12/supporting-attendance-at-code4lib.html Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA glue...@twitter e...@hellman.net http://go-to-hellman.blogspot.com/
[CODE4LIB] Update: Support for attending Code4Lib2010
I'm happy to report that the ad hoc committee to support attendance at Code4Lib will be able to provide the requested help. I'd also like to thank Serials Solutions for their offer of support. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
[CODE4LIB] OpenURL aggregator not doing so well
Take a look at http://openurl.code4lib.org/aggregator Any ideas how to make it work better? Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Twitter annotations and library software
I think Twitter annotations would be a good use for http://thing-described-by.org/ or a functional equivalent. The payload of the annotation would simply be a description URI and a namespace and value for descriptions by reference.

1. The mechanism would be completely generic, usable for any sort of reference, not siloed in libraryland. In other words, we might actually get people to adopt it.
2. Libraryland descriptions could use BIBO or RDA or both or whatever, and could be concise or verbose.
3. Descriptions could be easily reused.

I'll write this up a bit more and would be interested in comments, but it's where this post was going: http://go-to-hellman.blogspot.com/2010/04/when-shall-we-link.html Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Twitter annotations and library software
I mean, really, if the folks at RefWorks, EndNote, Papers, Zotero and LibX don't have crash programs underway to integrate Twitter clients into their software to send and receive reference metadata payloads they can use in the Twitter annotation field, they really ought to hire me to come and bash some sense into them. Really. I still think by-reference payloads, as described at http://go-to-hellman.blogspot.com/2010/04/when-shall-we-link.html, would go the farthest, but surely these folks know very well what they can send and receive. Eric On Apr 28, 2010, at 4:17 AM, Jakob Voss wrote: Hi, it's funny how quickly you vote against BibTeX, but at least it is a format that is frequently used in the wild to create citations. If you call BibTeX undocumented and garbage, then how do you call MARC, which is far more difficult to make use of? My assumption was that there are specific use cases for bibliographic data in Twitter annotations: I. Identify a publication -- this can *only* be done seriously with identifiers like ISBN, DOI, OCLCNum, LCCN, etc. II. Deliver a citation -- use a citation-oriented format (BibTeX, CSL, RIS). I was not voting explicitly for BibTeX, but at least there is a large community that can make use of it. I strongly favour CSL (http://citationstyles.org/) because: - there is a JavaScript CSL processor. JavaScript is kind of a punishment, but it is the natural environment for the Web 2.0 mashup crowd that is going to implement applications that use Twitter annotations - there are dozens of CSL citation styles, so you can display a citation in any way you want As Ross pointed out, RIS would be an option too, but I miss the easy open-source tools that create citations from RIS data. Any other relevant format that I know (Bibont, MODS, MARC, etc.) does not aim at identification or citation in the first place but tries to model the full variety of bibliographic metadata. If your use case is III. 
Provide semantic properties and connections of a publication -- then you should look at the Bibliographic Ontology. But III does *not* just subsume use case II; it is a different story, one that is not being told by normal people but only by metadata experts, semantic web gurus, library system developers, etc. (I would count myself among these groups). If you want such complex data, then you should use systems other than Twitter for data exchange anyway. A list of CSL metadata fields can be found at http://citationstyles.org/downloads/specification.html#appendices and the JavaScript processor (which is also used in Zotero) provides more information for developers: http://groups.google.com/group/citeproc-js Cheers Jakob P.S.: An example of a CSL record from the JavaScript client:

{ "title": "True Crime Radio and Listener Disenchantment with Network Broadcasting, 1935-1946", "author": [ { "family": "Razlogova", "given": "Elena" } ], "container-title": "American Quarterly", "volume": "58", "page": "137-158", "issued": { "date-parts": [ [2006, 3] ] }, "type": "article-journal" }

-- Jakob Voß jakob.v...@gbv.de, skype: nichtich Verbundzentrale des GBV (VZG) / Common Library Network Platz der Goettinger Sieben 1, 37073 Göttingen, Germany +49 (0)551 39-10242, http://www.gbv.de Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
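Once quoted as valid JSON, a CSL record like Jakob's can be consumed with almost no code. A toy formatter -- emphatically not citeproc-js, and the citation style it emits is invented for illustration -- just to show how directly CSL fields map to a citation:

```javascript
// Jakob's CSL-JSON record, as valid JSON.
const record = {
  "title": "True Crime Radio and Listener Disenchantment with Network Broadcasting, 1935-1946",
  "author": [{ "family": "Razlogova", "given": "Elena" }],
  "container-title": "American Quarterly",
  "volume": "58",
  "page": "137-158",
  "issued": { "date-parts": [[2006, 3]] },
  "type": "article-journal"
};

// Toy formatter: first author, year, title, container, volume, pages.
// A real application would hand the record to a CSL processor instead.
function toyCitation(csl) {
  const name = csl.author[0].family + ", " + csl.author[0].given;
  const year = csl.issued["date-parts"][0][0];
  return name + " (" + year + "). " + csl.title + ". " +
    csl["container-title"] + " " + csl.volume + ": " + csl.page + ".";
}
```

The point of CSL, as Jakob says, is that the same record plus a different style file yields a differently formatted citation; this sketch hard-codes one layout only to keep it self-contained.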
[CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
Since this thread has turned into a discussion of OpenURL... I have to say that during the OpenURL 1.0 standardization process, we definitely had moments of despair. Today, I'm willing to derive satisfaction from "it works" and overlook shortcomings. It might have been otherwise. What I hope for is that OpenURL 1.0 eventually takes a place alongside SGML as a too-complex standard that directly paves the way for a universally adopted foundational technology like XML. What I fear is that it takes a place alongside MARC as an anachronistic standard that paralyzes an entire industry. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Twitter annotations and library software
OK, back to Tim's specific question. I'm not sure why you want to put bib data in a tweet at all for your application. Why not just use a shortened URL pointing at your page of metadata? That page could offer metadata via BIBO, Open Graph and FOAF in RDFa, COinS, RIS, etc., using established methods to serve multiple applications at once. When Twitter annotations come along, the URL can be put in the annotation field. Eric On Apr 21, 2010, at 6:08 AM, Tim Spalding wrote: Have C4Lers looked at the new Twitter annotations feature? http://www.sitepoint.com/blogs/2010/04/19/twitter-introduces-annotations-hash-tags-become-obsolete/ I'd love to get some people together to agree on a standard book annotation format, so two people can tweet about the same book or other library item, and they or someone else can pull that together. I'm inclined to start adding it to the I'm talking about and I'm adding links on LibraryThing. I imagine it could be easily added to many library applications too—anywhere there is or could be a "share this on Twitter" link, including OPACs, citation managers, library event feeds, etc. Also, wouldn't it be great to show the world another interesting, useful and cool use of library data that OCLC's rules would prohibit? So the question is the format. Only a maniac would suggest MARC. For size and other reasons, even MODS is too much. But perhaps we can borrow the barest of field names from MODS, COinS, or from the most commonly used bibliographic format, Amazon XML. Thoughts? Tim -- Check out my library at http://www.librarything.com/profile/timspalding Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
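The "page of metadata" approach is easy to sketch for the COinS piece: embed an OpenURL 1.0 book ContextObject in a span's title attribute so citation tools can scrape it. The title, author, and ISBN below are illustrative placeholders, not data from the thread:

```python
# Sketch: emit a COinS <span> carrying an OpenURL 1.0 (Z39.88-2004) book
# ContextObject, one of the "established methods" the reply mentions.
# The book data here is a made-up example.
from urllib.parse import urlencode
from html import escape

def coins_span(title, author, isbn):
    """Build a COinS <span> for a book, suitable for embedding in HTML."""
    kev = urlencode({
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",
        "rft.btitle": title,
        "rft.au": author,
        "rft.isbn": isbn,
    })
    # escape() turns the KEV ampersands into &amp; for the HTML attribute
    return '<span class="Z3988" title="%s"></span>' % escape(kev)

print(coins_span("Moby Dick", "Melville, Herman", "0123456789"))
```

The same page could carry RDFa and a link to RIS alongside this span; nothing about the formats conflicts, which is the argument for a URL over inline bib data.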
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
Even the best standard in the world can only do so much! On Apr 29, 2010, at 1:14 PM, Ed Summers wrote: On Thu, Apr 29, 2010 at 12:08 PM, Eric Hellman e...@hellman.net wrote: Since this thread has turned into a discussion on OpenURL... I have to say that during the OpenURL 1.0 standardization process, we definitely had moments of despair. Today, I'm willing to derive satisfaction from "it works" and overlook shortcomings. It might have been otherwise. Personally, I've followed enough OpenURL enabled hyperlink dead ends to contest "it works". //Ed
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
May I just add here that of all the things we've talked about in these threads, perhaps the only thing that will still be in use a hundred years from now will be Unicode. إن شاء الله On Apr 29, 2010, at 7:40 PM, Alexander Johannesen wrote: However, I'd like to add here that I happen to love XML, even from an integration perspective, but maybe that stems from understanding all those tedious bits no one really cares about, like id(s) and refid(s) (and all the indexing goodness that comes from it), canonical datasets, character sets and Unicode, all that schema craziness (including Schematron and RelaxNG), XPath and XQuery (and all the sub-standards), XSLT and so on. I love it all, and not because of the generic simplicity itself (simple in the default mode of operation, I might add), but because of a) modeling advantages, b) cross-environment language and schema support, and c) ease of creation. (I don't like how easily well-formedness breaks, though. That sucks) Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
Ha! One of the things OpenURL 1.0 fixed was to wire in UTF-8 encoding. Much of the MARC data in circulation also uses UTF-8 encoding. Some of it even uses it correctly. On Apr 29, 2010, at 8:58 PM, Alexander Johannesen wrote: On Fri, Apr 30, 2010 at 10:54, Eric Hellman e...@hellman.net wrote: May I just add here that of all the things we've talked about in these threads, perhaps the only thing that will still be in use a hundred years from now will be Unicode. إن شاء الله May I remind you that we're still using MARC. Maybe you didn't mean in the library world ... *rimshot* Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
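The joke lands because MARC's UTF-8 problems are usually not invalid bytes but valid UTF-8 carrying mangled text. A minimal sketch of the classic double-encoding failure, plus one recovery heuristic (the repair assumes exactly one accidental Latin-1 round trip, which is an assumption, not a general fix):

```python
# The mangled bytes below are still *valid* UTF-8, which is why naive
# encoding validation passes them; only the text itself is wrong.
text = "café"
utf8 = text.encode("utf-8")                        # b'caf\xc3\xa9'
mojibake = utf8.decode("latin-1").encode("utf-8")  # a lossy hand-off somewhere
assert mojibake.decode("utf-8") == "cafÃ©"         # decodes fine, reads wrong

# Recovery heuristic: undo one accidental Latin-1 round trip.
repaired = mojibake.decode("utf-8").encode("latin-1").decode("utf-8")
assert repaired == "café"
```

Records like this "use UTF-8" in the sense that they declare it and decode cleanly; catching them takes content-level heuristics, not an encoding check.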
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
Eek. I was hoping for something much simpler. Do you realize that you're asking for a service taxonomy? On Apr 30, 2010, at 10:22 AM, Ross Singer wrote: I think the basis of a response could actually be another context object with the 'services' entity containing a list of services/targets that are formatted in some way that is appropriate for the context and the referent entity enhanced with whatever the resolver can add to the puzzle.
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
I'll try to find out. Sent from Eric Hellman's iPhone On May 2, 2010, at 4:10 PM, stuart yeates stuart.yea...@vuw.ac.nz wrote: But the interesting use case isn't OpenURL over HTTP, the interesting use case (for me) is OpenURL on a disconnected eBook reader resolving references from one ePub to other ePub content on the same device. Can OpenURL be used like that?
[CODE4LIB] Safari extensions
Has anyone played with the new Safari extensions capability? I'm looking at you, Godmar. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] MARCXML - What is it for?
I think you'd have a very hard time demonstrating any speed advantage to MARC over MARCXML. XML parsers have been speed-optimized out the wazoo. If there exists a MARC parser that has ever been speed-optimized without serious compromise, I'm sure someone on this list will have a good story about it. On Oct 25, 2010, at 3:05 PM, Patrick Hochstenbach wrote: Dear Nate, There is a trade-off: do you want very fast processing of data? Go for binary data. Do you want to share your data globally and easily in many (not per se library-related) environments? Go for XML/RDF. Open your data and do both :-) Pat Sent from my iPhone On 25 Oct 2010, at 20:39, Nate Vack njv...@wisc.edu wrote: Hi all, I've just spent the last couple of weeks delving into and decoding a binary file format. This, in turn, got me thinking about MARCXML. In a nutshell, it looks like it's supposed to contain the exact same data as a normal MARC record, except in XML form. As in, it should be round-trippable. What's the advantage to this? I can see using a human-readable format for poorly-documented file formats -- they're relatively easy to read and understand. But MARC is well, well-documented, with more than one free implementation in cursory searching. And once you know a binary file's format, it's no harder to parse than XML, and the data's smaller and processing faster. So... why the XML? Curious, -Nate Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
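To make the parsing comparison concrete, here is a toy side-by-side: the same field pulled from a MARCXML-style record with the stdlib's C-accelerated parser, and from an ISO 2709-style binary record by offset arithmetic. Both records are fabricated minimal examples, and any timings this prints say nothing about production MARC tooling:

```python
# Illustrative only: both records are hand-built toys, not real MARC data.
import timeit
import xml.etree.ElementTree as ET

XML = ('<record><datafield tag="245">'
       '<subfield code="a">Moby Dick</subfield></datafield></record>')

# Toy ISO 2709 layout: 24-char leader (record length at 0-4, base address of
# data at 12-16), one 12-char directory entry (tag, length, start), then data.
FT = "\x1e"  # field terminator
REC = "00048nam a2200037ua 4500" + "245001000000" + FT + "Moby Dick" + FT + "\x1d"

def parse_xml():
    return ET.fromstring(XML).find(".//subfield").text

def parse_binary():
    base = int(REC[12:17])            # base address of data, from the leader
    length = int(REC[27:31])          # field length, from the directory entry
    start = int(REC[31:36])           # field start offset within the data
    return REC[base + start: base + start + length].rstrip(FT)

assert parse_xml() == parse_binary() == "Moby Dick"
for name, fn in (("xml", parse_xml), ("binary", parse_binary)):
    print(name, timeit.timeit(fn, number=10000))
```

The binary slice is simple once the offsets are known, which is Nate's point; the counterpoint in the reply is that the XML path gets decades of parser optimization and tooling for free.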
Re: [CODE4LIB] mailing list administratativia
I vote for changing the limit threshold to PI * (eventual length of this meta-thread). On Oct 27, 2010, at 3:37 PM, Alexander Johannesen wrote: On Thu, Oct 28, 2010 at 2:44 AM, Doran, Michael D do...@uta.edu wrote: Can that limit threshold be raised? If so, are there reasons why it should not be raised? Is it to throttle spam or something? 50 seems rather low, and it's rather depressing to have a lively discussion throttled like that. Not to mention I thought I was simply kicked out for living things up (especially given my reasonable follow-up was where the throttling began). Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen --- Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] mailing list administratativia
I expect the length of the thread to be irrational; so perhaps that's not a problem. On Oct 27, 2010, at 6:18 PM, Ray Denenberg, Library of Congress wrote: I think the constraint is that it has to be a rational number. -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Eric Hellman Sent: Wednesday, October 27, 2010 5:58 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] mailing list administratativia I vote for changing the limit threshold to PI * (eventual length of this meta-thread). On Oct 27, 2010, at 3:37 PM, Alexander Johannesen wrote: On Thu, Oct 28, 2010 at 2:44 AM, Doran, Michael D do...@uta.edu wrote: Can that limit threshold be raised? If so, are there reasons why it should not be raised? Is it to throttle spam or something? 50 seems rather low, and it's rather depressing to have a lively discussion throttled like that. Not to mention I thought I was simply kicked out for living things up (especially given my reasonable follow-up was where the throttling began). Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen --- Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] c4l2011 location + reg. open time
I believe that would be Indiana Memorial Union on the campus of IU in Bloomington, Indiana Sent from my iPad On Dec 8, 2010, at 10:50 PM, Karen Coyle li...@kcoyle.net wrote: I can't find anything on the wiki that says WHERE c4l2011 will be. (I thought IMU was a hint, but that comes out as International Medical University in Malaysia as the top link.) That would be useful information. Also, if registration opens at 9, what time zone is that? kc p.s. Just because I haven't been paying attention doesn't mean I don't CARE. -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
[CODE4LIB] question about coding in libraries
For my talk at Code4Lib, I'm trying to find or gather statistics about the number of people doing any sort of code in libraries. My initial attempts to quantify this have failed. I would appreciate info from list members. If you'd like to help, send me two numbers 1. The number of people employed at or on contract to your library whose major responsibilities include software development or maintenance. Broadly defined. 2. The total FTE staff at your library. (send to me, not the list, I will summarize) Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
[CODE4LIB] AGPL for libraries (was: A to Z lists)
Hej Tony! Great to hear of your effort; I hope you have chosen to implement the NISO 1.0 standard. I would urge you to carefully consider your choice of license, however. As I wrote last year when the issue came up in Koha, using AGPL instead of the less restrictive GPL can have some unintended consequences. http://go-to-hellman.blogspot.com/2010/07/koha-community-considers-affero-license.html It is still a reality today that many library resources have APIs that are provided only to customers and often come with interface licenses incompatible with the GPL. If you use AGPL, a library that modified the software to use it with one of these resources would be in violation of your license, even if they did not redistribute the software. If that's your intention, then fine, but please make sure you understand the implications. Also, please don't confuse AGPL, which is a restrictive license rooted in copyright law, with public domain, which has no restrictions on use. Eric On Feb 17, 2011, at 4:34 AM, Tony Mattsson wrote: Hi, We are at the final stages of building an EBM system with AZ-list and OpenURL resolver developed in LAMP (with Ajax) which we will release into the public domain (AGPL). I'll put up a notice on this list when it's done, and you can try it out to see if it measures up :=) Tony Mattsson IT-Librarian Landstinget Dalarna Bibliotek och informationscentral http://materio.fabicutv.com -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On behalf of Michele DeSilva Sent: 16 February 2011 22:18 To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] A to Z lists Hi Code4Lib-ers, I want to chime in and say that I, too, enjoyed the streaming archive from the conference. I also have a question: my library has a horribly antiquated A to Z list of databases and online resources (it's based in Access). We'd like to do something that looks more modern and is far more user friendly.
I found a great article in the Code4Lib Journal (issue 12, by Danielle Rosenthal and Mario Bernado) about building a searchable A to Z list using Drupal. I'm also wondering what other institutions have done as far as in-house solutions. I know there're products we could buy, but, like everyone else, we don't have much money at the moment. Thanks for any info or advice! Michele DeSilva Central Oregon Community College Library Emerging Technologies Librarian 541-383-7565 mdesi...@cocc.edu Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] GPL incompatible interfaces
The Metalib API is not public, to my knowledge; I don't know whether it gets disclosed under an NDA. And you can't run or develop Xerxes without an ExLibris license, because it depends on a proprietary and unspecified data set. I'm sure that's legal, but it's not true to the spirit of copyleft. The main effect of using GPL for Xerxes is that it prevents ExLibris from distributing (but not using) proprietary versions of Xerxes. If that is the intent of the developers, then perhaps AGPL would be a better tool for them to wield. None of this should be taken as a criticism of the Xerxes developers. On Feb 18, 2011, at 3:50 AM, graham wrote: That's very different from saying something with a GPL license can't use a proprietary interface. As if, for example, Xerxes couldn't use the Metalib API - without which it would be pointless. As I understand him, Eric is saying that there are interfaces to library software which actually have a license or contract which blocks GPLed software from using them. It would be a kind of 'viral BSD' license, killing free software (in the FSF sense) but leaving proprietary or open source (in your Apache/MIT sense) untouched. I haven't seen any examples myself, and can't quite see how it would be done legally. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
[CODE4LIB] Gluejar is hiring
Hi, Everyone! http://go-to-hellman.blogspot.com/2011/03/gluejar-is-hiring.html Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] [dpla-discussion] Rethinking the library part of DPLA
The DPLA listserv is probably too impractical for most of Code4Lib, but Nate Hill (who's on this list as well) made this contribution there, which I think deserves attention from library coders here. On Apr 5, 2011, at 11:15 AM, Nate Hill wrote: It is awesome that the project Gutenberg stuff is out there, it is a great start. But libraries aren't using it right. There's been talk on this list about the changing role of the public library in people's lives, there's been talk about the library brand, and some talk about what 'local' might mean in this context. I'd suggest that we should find ways to make reading library ebooks feel local and connected to an immediate community. Brick and mortar library facilities are public spaces, and librarians are proud of that. We have collections of materials in there, and we host programs and events to give those materials context within the community. There's something special about watching a child find a good book, and then show it to his or her friend and talk about how awesome it is. There's also something special about watching a senior citizens book group get together and discuss a new novel every month. For some reason, libraries really struggle with treating their digital spaces the same way. I'd love to see libraries creating online conversations around ebooks in much the same way. Take a title from project Gutenberg: The Adventures of Huckleberry Finn. Why not host that book directly on my library website so that it can be found at an intuitive URL, www.sjpl.org/the-adventures-of-huckleberry-finn and then create a forum for it? The URL itself takes care of the 'local' piece; certainly my most likely visitors will be San Jose residents- especially if other libraries do this same thing. The brand remains intact, when I launch this web page that holds the book I can promote my library's identity. 
The interface is no problem because I can optimize the page to load well on any device and I can link to different formats of the book. Finally, and most importantly, I've created a local digital space for this book so that people can converse about it via comments, uploaded pictures, video, whatever. I really think this community conversation and context-creation around materials is a big part of what makes public libraries special. Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ Gluejar is hiring! e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] [dpla-discussion] Rethinking the library part of DPLA
The challenge I like to present to libraries is this: imagine that your entire collection is digital. Does it include Shakespeare? Does it include Moby Dick? Yes! Just because you don't have to pay for these works, doesn't mean that they don't belong in your library. And what if many modern works become available for free via Creative Commons licensing? Is it the library's role to promote these works, or should a library be promoting primarily the works it's paying for patrons to use? That's why I thought Nate's suggestions were worthy of attention from people who could potentially do practical things. The other hope is that if libraries can do compelling things with public domain content, there's no reason they couldn't do the same things with in-copyright material appropriately licensed. If the experience works, the rightsholders will see the value. On Apr 10, 2011, at 10:05 AM, Karen Coyle wrote: I appreciate the spirit of this, but despair at the idea that libraries organize their services around public domain works, thus becoming early 20th century institutions. The gap between 1923 and 2011 is huge, and it makes no sense to users that a library provide services based on publication date, much less that enhanced services stop at 1923. kc Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ Gluejar is hiring! e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] What do you wish you had time to learn?
This thread got me thinking about what I learned during a time when I actually had time to learn whatever I wanted to: Applied Epistemology (reading list supplied mostly by @edsu) Copyright Law (reading list supplied mostly by @grimmelm) Writing and Journalism Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ Gluejar is hiring! e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] Seth Godin on The future of the library
Some ebooks, in fact some of the greatest ever written, already cost less than razor blades. Eric (who just finished writing a chapter on open-access e-books) On May 16, 2011, at 7:52 PM, Luciano Ramalho wrote: 1) Why quote the ebook price in 1962 dollars? The reality in 2011 is that Kindle books in general are too expensive, particularly when comparing their cost with the paper counterparts (think about variable costs in paperbacks, logistics etc; it is pretty obvious the cost reductions are not being fully reflected in consumer prices). Given the current situation, I see no evidence that ebooks will cost less than razor blades, ever. Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] Seth Godin on The future of the library
Exactly. I apologize if my comment was perceived as coy, but I've chosen to invest in the possibility that Creative Commons licensing is a viable way forward for libraries, authors, readers, etc. Here's a link to the last of a 5-part series on open-access ebooks. I hope it inspires work in the code4lib community to make libraries more friendly to free stuff. http://go-to-hellman.blogspot.com/2011/05/open-access-ebooks-part-5-changing.html On May 18, 2011, at 7:20 PM, David Friggens wrote: Some ebooks, in fact some of the greatest ever written, already cost less than razor blades. Do you mean ones not under copyright? Those, plus Creative Commons etc.
Re: [CODE4LIB] Seth Godin on The future of the library
Karen, The others who have responded while I was off, you know, doing stuff, have done a much better job of answering your question than I would have. I would have said something glib like almost all ways, with respect to open-access digital materials. There's a shift in library mindset that has to occur along with the transition from print to digital. The clearest example that I've seen is the typical presentation of pretend-it's-print out-of-copyright material. A library will have purchased PIP access to an annotated edition of a Shakespeare play, or a new translation of Crime and Punishment. But the public domain versions of these works (which are perfectly good) don't exist in the catalog. A patron looking for ebook versions of these works will then frequently be denied access because another patron has already checked out the licensed version. That can't be justified by any vision for libraries that I can think of. It can't be justified because it's hard or time consuming, or because there is a flood of PD Crime and Punishments clamoring for attention. It's just a result of unthinking and we-haven't-done-that-before. It's my hope that there are a number of not-so-hard problems around this situation that people on this list have the tools to solve. Eric On May 19, 2011, at 1:30 AM, Karen Coyle wrote: Quoting Eric Hellman e...@hellman.net: Exactly. I apologize if my comment was perceived as coy, but I've chosen to invest in the possibility that Creative Commons licensing is a viable way forward for libraries, authors, readers, etc. Here's a link to the last of a 5-part series on open-access ebooks. I hope it inspires work in the code4lib community to make libraries more friendly to free stuff. Eric, In what ways do you think that libraries today are not friendly to free stuff?
kc http://go-to-hellman.blogspot.com/2011/05/open-access-ebooks-part-5-changing.html On May 18, 2011, at 7:20 PM, David Friggens wrote: Some ebooks, in fact some of the greatest ever written, already cost less than razor blades. Do you mean ones not under copyright? Those, plus Creative Commons etc. -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] Adding VIAF links to Wikipedia
We talked a bit about this at LOD-LAM; Asaf Bartov of the Wikimedia foundation offered to help make this work better. email me if you need a a contact. On Jun 2, 2011, at 10:40 AM, Ralph LeVan wrote: Yes, the bot was approved, but in a much more limited application that was initially intended (make a link between Wikipedia records and corresponding OpenLibrary records.) And the conversation was quite rancorous for granting permission to an organization philosophically much closer to Wikipedia than OCLC would seem to be. I don't think we'll be able to make this happen without a lot of help. Ralph On Fri, May 27, 2011 at 3:45 PM, Ed Summers e...@pobox.com wrote: On Thu, May 26, 2011 at 2:01 PM, Ralph LeVan ralphle...@gmail.com wrote: OCLC Research would desperately love to add VIAF links to Wikipedia articles, but it seems to be very difficult. The OpenLibrary folks tried to do it a while back and ended up getting their plans severely curtailed. The discussion at Wikipedia is captured here: http://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/OpenlibraryBot Ralph if you read that entire discussion it sounds like the bot was approved. Am I missing something? //Ed
[CODE4LIB] JHU integration of PD works
Getting back to the subject of a previous thread (and digesting some wonderful contributions by Karen, Alex, Jeremy and Ed C.), I dug around some links that Jonathan posted, and I think they're worth further discussion. The way that JHU has integrated Public Domain works into its catalog results with Umlaut is brilliant and pragmatic; the new catalog (Catalyst) interface based on Blacklight is a great improvement on the older Horizon version: https://catalyst.library.jhu.edu/catalog/bib_816990 Clearly, Jonathan has gone through the process of getting his library to think through the integration, and it seems to work. Has there been any opposition? What are the reasons that this sort of integration is not more widespread? Are they technical or institutional? What can be done by producers of open access content to make this work better and easier? Are unified approaches being touted by vendors delivering something really different? Looking forward, I wonder whether the print-first, then enrich-with-digital strategy required by today's infrastructure and work flow will decline compared to a more Googlish web-first strategy. Eric Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ 41 Watchung Plaza #132, Montclair NJ 07042 e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
[CODE4LIB] privacy enhanced implementation of Like button
Has anyone seen, used or written a wrapper script for Facebook Like buttons (http://developers.facebook.com/docs/opengraph/ ) that prevents the leakage of all user browsing behavior to Facebook? For example, the script might invoke the facebook script on an OnClick event. Eric Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ 41 Watchung Plaza #132, Montclair NJ 07042 e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
[CODE4LIB] New thread: Why are you doing what you're doing?
I think it's a good question, worth asking about *every* dev position being hired for. I would be interested to hear an answer from others on the list. In fact, I think the price of putting a position announcement on Code4lib should be a willingness to answer "why?". And "why not?" is a pretty pathetic answer. For me, I'm doing what I'm doing because I think it's important and because no one else is doing it. I hope there are many others with a similar answer. Eric
Re: [CODE4LIB] Examples of visual searching or browsing
I'm surprised that no one has mentioned the stars of the DPLA sprint: ShelfLife http://librarylab.law.harvard.edu/dpla/demo/app/ and BookWorm http://bookworm.culturomics.org/ Eric On Oct 27, 2011, at 4:27 PM, Julia Bauder wrote: Dear fans of cool Web-ness, I'm looking for examples of projects that use visual (=largely non-text and non-numeric) interfaces to let patrons browse/search collections. Things like the GeoSearch on North Carolina Maps[1], or projects that use Simile's Timeline or Exhibit widgets[2] to provide access to collections (e.g., what's described here: https://letterpress.uchicago.edu/index.php/jdhcs/article/download/59/70), or in-the-wild uses of Recollection[3]. I'm less interested in knowing about tools (although I'm never *uninterested* in finding out about cool tools) than about production or close-to-production sites that are making good use of these or similar tools to provide visual, non-linear access to collections. Who's doing slick stuff in this area that deserves a look? Thanks! Julia Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ 41 Watchung Plaza #132, Montclair NJ 07042 e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] Library News (à la ycombinator's hackernews)
And the discussion at Hacker News is illuminating... http://news.ycombinator.com/item?id=3272980 On Nov 29, 2011, at 1:30 PM, Mark A. Matienzo wrote: On Tue, Nov 29, 2011 at 1:25 PM, Jonathan Rochkind rochk...@jhu.edu wrote: Don't know if the link is in error, or what. Anyone know what software Hacker News and this Library News clone are based on, for real, and where to look at the source/documentation? Trying to google for what open source software Hacker News runs on, I'm not having any luck. Hacker News, and presumably Library News, both run using news.arc, which is written in the Arc dialect of Lisp. The news program is packaged with the Arc distribution: https://github.com/nex3/arc/blob/master/news.arc Mark A. Matienzo Digital Archivist, Manuscripts and Archives, Yale University Library Technical Architect, ArchivesSpace
Re: [CODE4LIB] Pandering for votes for code4lib sessions
I think that it's not out of bounds to ask people for c4l votes unless you're offering tangible rewards in exchange for said votes. Tangible rewards as used here shall in no circumstance be construed to apply to any offers of beer or its nonalcoholic equivalent. Non-alcoholic equivalent as used here, shall in no way be construed to imply that there is such a thing.
Re: [CODE4LIB] Pandering for votes for code4lib sessions
It's also worth noting that the voters (so far) have done a super job. If your talk is not making the cut, don't take it as a reflection or judgment on you or your work. It just means that voters want to save you for next year. And if your talk IS making the cut, it's probably because voters want the chance to make snide remarks about you on the backchannel. (I'll only be able to attend virtually this year. Please don't ask to take away my vote!) Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ 41 Watchung Plaza #132, Montclair NJ 07042 e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] site vulnerabilities
I gave a lightning talk on XSS vulnerabilities in library software at the first Code4Lib conference. You'll be happy to know that, as bad as things are, they've improved considerably! I showed several ILS vendors how I could insert arbitrary javascripts into their products. Some of them fixed their products in the next update cycle; some took a couple of years. One particularly nasty vulnerability I am unable to talk about; it was so nasty and close to home. But the general problem persists. Perhaps an outing process would be useful. Eric On Dec 9, 2011, at 10:54 AM, Erin Germ wrote: Good morning group, I don't mean to be an alarmist, but I follow some sites that list XSS and other vulnerabilities for web sites. Among the latest updates with site vulnerabilities were a few from libraries. Some of these are dated a couple months ago but they are now just being pushed out and still have a status of unfixed. If you would like to know if your site(s) are on the list, I would start by checking http://www.xssed.com/ V/R Erin
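Most of the OPAC holes of that era were reflected XSS: a search term echoed into the results page unescaped. A minimal sketch of the pattern and the one-call fix (the "Results for" template is hypothetical, not any vendor's actual code):

```python
# Sketch of reflected XSS: a query string echoed into HTML output.
# Escaping at the point of output is the standard one-call fix.
from html import escape

query = '<script>alert("owned")</script>'   # attacker-supplied search term
unsafe = "Results for: %s" % query          # script would execute in a browser
safe = "Results for: %s" % escape(query)    # rendered as inert text instead

assert "<script>" not in safe
print(safe)
```

The injected scripts demonstrated to the ILS vendors were variations on this: any page that reflects user input without escaping is exploitable.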
Re: [CODE4LIB] What software for a digital library
At Gluejar, we decided to use Django for our Unglue.it website, which will open in January. As someone who built a web framework from scratch in Java, I've found that the Django design aligned with mine where I got it right and didn't where I got it wrong. I'm still getting used to Python, but I'm quite happy with Django. Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ 41 Watchung Plaza #132, Montclair NJ 07042 e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] site vulnerabilities
By the way, to whoever decided it would be fun to reply by checking the Gluejar website for XSS vulnerabilities: by all means, tell everyone about it! Eric On Dec 16, 2011, at 10:14 PM, Michael J. Giarlo wrote: On Fri, Dec 16, 2011 at 21:42, Eric Hellman e...@hellman.net wrote: You'll be happy to know that as bad as things are, they've improved considerably! I showed several ILS vendors how I could insert arbitrary javascripts into their products. Some of them fixed their products in the next update cycle, some took a couple of years. One particularly nasty vulnerability I am unable to talk about, it was so nasty and close to home. But the general problem persists. Perhaps an outing process would be useful. Leaks4Lib? +1 -Mike
Re: [CODE4LIB] too much Metadata
Related: http://go-to-hellman.blogspot.com/2009/06/when-are-you-collecting-too-much-data.html On Feb 10, 2012, at 3:57 PM, Patrick Berry wrote: So, one question I forgot to toss out at the Ask Anything session is: When do you know you have enough metadata? You'll know it when you have it isn't the response I'm looking for. So, I'm sure you're wondering what the context for this question is, and honestly there is none. This is geared towards CONTENTdm or DSpace or Omeka or Millennium. I've seen groups not plan enough for collecting data, and I've seen groups that have been planning so long they forgot what they were supposed to be collecting in the first place. So, I'll just throw that vague question out there and see who wants to take a swing. Thanks, Pat/@pberry
[CODE4LIB] Unglue.it has launched
There's even the beginnings of an API: https://unglue.it/api/help Lots of work left to do, though! Not much point unless the campaigns succeed. Eric
Re: [CODE4LIB] EPUB and ILS indexing
This is an area where the code4lib community can have a huge impact. Conversely, if the Code4lib community doesn't have a big impact, we're in trouble. I urge everyone to have a look at the OS projects that SourceFabric is involved in. In particular, BookType is a Django web app that lets people collaboratively produce EPUB ebooks. If you want to implement a community ebook publishing platform, this is what you want to hop onto. I'm really glad to see Henri-Damien looking at this; I think he could use help! Eric On Oct 29, 2012, at 1:11 PM, Henri-Damien LAURENT henridamien.laur...@free.fr wrote: On 29/10/2012 14:55, Jodi Schneider wrote: Sounds great! Have you thought about starting from OPDS? http://opds-spec.org/about/ Thanks for that hint, Jodi. Nope, I had not thought about using OPDS. It looks really great. But from what I know of ILSes, ATOM feeds are not yet getting indexed straight into the catalog. But that could be something great. Might be worth talking to some EPUB folks -- for instance Peter Brantley, or else folks from threepress.org? I am already in contact with some people from the EPUB world (namely SourceFabric, Gluejar, and tea-ebook). But could be interesting to have more feedback. -Jodi On Mon, Oct 29, 2012 at 12:19 PM, Henri-Damien LAURENT henridamien.laur...@free.fr wrote: Hi, I am about to write a tool which would help index EPUB into ILSes. My first guess is to produce ISO2709 or MARCXML records from EPUB files, but since MARCXML or ISO2709 is not really what I would call the most portable (UNIMARC and MARC21 may both be handled in the same file format), I am rather considering producing OAI-DC or HTML5 + schema.org + Dublin Core, but that would rely on EPUB3. Any comment, anyone? Has anyone considered such a tool? Is there any hidden corpse lurking around I should be aware of? Have a nice day -- Henri-Damien LAURENT -- Henri-Damien LAURENT
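For what it's worth, the OPF package file inside an EPUB already carries its metadata as Dublin Core elements, so a converter like the one Henri-Damien describes can start by just reading them out. A rough Python sketch using only the standard library (the function name and output shape are my own invention, not any existing tool):

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def epub_to_dc(path):
    """Pull the Dublin Core elements out of an EPUB's OPF package file."""
    with zipfile.ZipFile(path) as z:
        # META-INF/container.xml names the OPF file that holds the metadata.
        container = ET.fromstring(z.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", NS).get("full-path")
        opf = ET.fromstring(z.read(opf_path))
        metadata = opf.find("opf:metadata", NS)
        record = {}
        for el in metadata:
            if el.tag.startswith("{%s}" % NS["dc"]):
                field = el.tag.split("}", 1)[1]  # e.g. "title", "creator"
                record.setdefault(field, []).append((el.text or "").strip())
        return record
```

From a dict like that, serializing OAI-DC (or mapping onto MARC fields) is a straightforward second step; the hard parts Henri-Damien raises are the format politics, not the extraction.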
[CODE4LIB] Code, Inclusiveness, and Fear
On Tuesday night I went to the NYTech Meetup. They get 800+ people to come once a month to watch demos of the latest thing. One of the presentations was from Hackers Union. I was cringing, because it was like a caricature of how to present an uninviting impression to anyone who wasn't white, male and 20-something. Complete with jokes about how to pick up girls in bars. In front of an audience about 30% non-male, 40% non-white, and 50% non-20-something. I thought to myself, if they did that at Code4Lib, it would NOT be received well, to say the least. And this morning I happened to scan through many of the recent threads on the listserv. And the thread on what is coding, including the existential digressions. What makes Code4Lib different from any other group I know of in the library world is that it rejects fear of code. Much of the library world fears code, and most of that fear is unfounded. And the code we need to fear is not so scary once we know how to fear it. The threads about having anti-harassment policies are a good thing, because we want to remove the fear that surrounds code. Talking about it is a big step towards addressing fear. Let's try to make sure that having a policy doesn't stop us from talking about the need to eliminate the fear. As to who is a part of the Code4Lib community, I think you don't have to be a coder; you just have to reject fear of code. A big part of the conferences is creating space to help people make the transition from being oppressed by fear of code to being liberated by the possibilities of code. OK, back to work for me- unfortunately not the code part. Eric Eric Hellman President, Gluejar, Inc. Founder, Unglue.it https://unglue.it/ http://go-to-hellman.blogspot.com/ twitter: @gluejar
Re: [CODE4LIB] Code, Inclusiveness, and Fear
We need to fear malicious code. To do that, we need to think about all the ways people can misuse, abuse and attack our systems. We need to cross our t's, dot our i's, and shine lots of light. Eric On Dec 6, 2012, at 1:17 PM, Gabriel Farrell gsf...@gmail.com wrote: one that rings true with me. I hope we can continue to live up to it. I want to make sure we're on the same page, though. To be clear, which code should we fear?
[CODE4LIB] early history of isbn/issn linking
I'm working on a little project on the early history of bibliographic linking. I'm looking for examples where plain-text documents with ISBNs or ISSNs were auto-linked to library catalogs or Amazon or whatnot. Any nominations for who did this first and documented it? Eric Eric Hellman President, Gluejar.Inc. Founder, Unglue.it https://unglue.it/ http://go-to-hellman.blogspot.com/ twitter: @gluejar
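To make the question concrete, here is roughly what an early auto-linker had to do: spot ISBN-shaped strings in plain text, validate the check digit so random digit runs don't get linked, and wrap hits in a link to some resolver. A Python sketch for ISBN-10s (the resolver URL is a placeholder, not any real service, and this is my illustration rather than any historical implementation):

```python
import re

# ISBN-10: nine digits plus a final digit or X, optionally hyphen/space separated,
# optionally preceded by an "ISBN" label.
ISBN_RE = re.compile(r"\b(?:ISBN[-: ]*)?((?:\d[- ]?){9}[\dXx])\b")

def isbn10_checksum_ok(bare):
    """Weighted mod-11 check digit test for a 10-character bare ISBN."""
    if len(bare) != 10:
        return False
    total = 0
    for i, ch in enumerate(bare):
        if ch == "X" and i != 9:
            return False
        total += (10 - i) * (10 if ch == "X" else int(ch))
    return total % 11 == 0

def autolink_isbns(text, resolver="https://example.org/isbn/"):
    """Replace valid ISBN-10s in plain text with HTML links to a resolver."""
    def link(m):
        bare = m.group(1).replace("-", "").replace(" ", "").upper()
        if isbn10_checksum_ok(bare):
            return '<a href="%s%s">%s</a>' % (resolver, bare, m.group(0))
        return m.group(0)  # bad check digit: leave the text alone
    return ISBN_RE.sub(link, text)
```

In the catalog-linking case the resolver would be an OPAC search URL keyed on the normalized ISBN; for Amazon it was the ASIN-style product URL.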
[CODE4LIB] You are a *pedantic* coder. So what am I?
OK, pedant, tell us why you think methods that can be over-ridden are static. Also, tell us why you think classes in Java are not instances of java.lang.Class. On Feb 18, 2013, at 1:39 PM, Justin Coyne jus...@curationexperts.com wrote: To be pedantic, Ruby and JavaScript are more Object Oriented than Java because they don't have primitives and (in Ruby's case) because classes are themselves objects. Unlike Java, both Python and Ruby can properly override static methods on sub-classes. The Java language made many compromises, as it was designed as a bridge to Object Oriented programming for programmers who were used to writing C and C++. -Justin
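For anyone following along, the behavior under dispute is easy to demonstrate in Python: class-side methods dispatch dynamically and can be overridden in subclasses (unlike Java statics, which are resolved at compile time), and classes are themselves objects, instances of type. A small sketch (example classes are mine, purely illustrative):

```python
class Animal:
    @classmethod
    def describe(cls):
        # cls is whichever class the method was called on, so dispatch is dynamic
        return "an animal named %s" % cls.__name__

class Dog(Animal):
    @classmethod
    def describe(cls):
        # subclasses can override class-side methods
        return "a dog"

print(Animal.describe())      # → an animal named Animal
print(Dog.describe())         # → a dog
print(isinstance(Dog, type))  # → True: the class itself is an object
```

Eric's counterpoint stands too: Java classes are reified at runtime as instances of java.lang.Class, so "classes are objects" is not unique to Ruby and Python; the real difference is the static-dispatch rule for static methods.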
[CODE4LIB] githubs for poetry, legal docs
Given the discussion of how github is not really so accessible to non-coders, I thought I'd mention these attempts to put version control into the mainstream. Github for writers: It sounds like that's what Blaine Cook is doing with Poetica.com Github for legal agreements: We've started using Docracy.com to help us manage legal agreements. Eric Eric Hellman President, Gluejar.Inc. Founder, Unglue.it https://unglue.it/ http://go-to-hellman.blogspot.com/ twitter: @gluejar
Re: [CODE4LIB] Anyone working with iPython?
I use it all the time. If anyone has played with Mathematica notebooks, it's the same thing, with Python, and other languages apparently on the way. Eric Hellman President, Gluejar, Inc. Founder, Unglue.it https://unglue.it/ http://go-to-hellman.blogspot.com/ twitter: @gluejar On Dec 19, 2013, at 12:48 PM, Roy Tennant roytenn...@gmail.com wrote: Our Wikipedian in Residence, Max Klein, brought IPython [1] to my attention recently, and even in just the little exploration I've done with it so far I'm quite impressed. Although you could call it interactive Python, that doesn't begin to put across the full range of capabilities, as when I first heard that I thought Great, a Python shell where you enter a command, hit return, and it executes. Great. Just what I need. NOT. But I was SO WRONG. It certainly can and does do that, but also so much more. You can enter blocks of code that then execute. Those blocks don't even have to be Python. They can be Ruby or Perl or bash. There are built-in functions of various kinds that it (oddly) calls magic. But perhaps the killer bit is the idea of Notebooks that can capture all of your work in a way that is also editable and completely web-ready. This last part is probably difficult to understand until you experience it. Anyway, I was curious if others have been working with it and if so, what they are using it for. I can think of all kinds of things I might want to do with it, but hearing from others can inspire me further, I'm sure. Thanks, Roy [1] http://ipython.org/