Re: [CODE4LIB] code4lib journal
I suppose that I shall have to write an article for the Code4Lib journal entitled "Dealing Gracefully with Idiotic Journal Names". Our software has exception code for "THE Journal", but it still is a problem. At 7:50 PM -0700 5/3/06, Roy Tennant wrote: Eric, surely you must realize it was a vote. Once it was put to a vote, under specific rules, if the rules are followed (and they were, as far as I know), then we're stuck with it, for good or evil. Unfortunately, I think poor Jeff Davis was quite against that choice. Roy On May 3, 2006, at 7:27 PM, Eric Hellman wrote: Here's the latest on the code4lib journal: "/lib/dev: A Journal for Library Programmers" won the journal name vote. (See http://www.code4lib.org/node/96 for more details.) The idea of a journal name that contains punctuation in the title is so breathtakingly idiotic that I can only assume that it is a reference to the bug in the name of the computer language C++. -- Eric Hellman, Director, OCLC Openly Informatics Division [EMAIL PROTECTED] 2 Broad St., Suite 208, Bloomfield, NJ 07003 tel 1-973-509-7800 fax 1-734-468-6216 http://www.openly.com/1cate/ 1 Click Access To Everything
[CODE4LIB] Link Evaluator Firefox Extension
OCLC Openly Informatics has just released some free, open-source software that we developed to help libraries deal with one aspect of the eResource fulfillment problem: how does a library easily and quickly determine which journals and other electronic resources have disappeared because of subscription snafus? It's a free add-on for the Firefox web browser that acts as an advanced link checker. It's a spin-off of the link-checking technology that we have been developing as part of the machinery we use to maintain the linking knowledgebases we produce for 1Cate, Worldcat and offerings from other companies. In addition to basic link checking, Link Evaluator can check for the presence of green-flag and red-flag phrases in linked-to resources. By working inside the Firefox browser, Link Evaluator can exactly duplicate a user's experience and detect authentication problems. The green- and red-flag phrases can be supplied by editing preferences, or by inserting them into an HTML page of links to evaluate. Link Evaluator is also multi-threaded, so it works up to 4 times faster than other link-checking plugins. The developer behind this work is Filip Babalievsky; you can contact either of us with questions or kudos; there's also an email list that we've set up. For more information and downloads, please see http://openly.oclc.org/linkevaluator/ and http://openly.oclc.org/pr/15012007.html Eric
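The flag-phrase idea is simple enough to sketch. What follows is a hypothetical reimplementation in Python, not Link Evaluator's actual code, and the phrase lists are invented examples:

```python
# Hypothetical sketch of a green-/red-flag phrase check, not the shipped
# Link Evaluator code. The phrase lists below are invented examples; in the
# real extension they come from preferences or the page of links itself.
GREEN_FLAGS = ["Full Text", "Download PDF"]             # access looks good
RED_FLAGS = ["login required", "subscription expired"]  # access problem

def evaluate_page(page_text):
    """Classify a fetched page as 'red', 'green', or 'unknown'."""
    text = page_text.lower()
    if any(p.lower() in text for p in RED_FLAGS):
        return "red"
    if any(p.lower() in text for p in GREEN_FLAGS):
        return "green"
    return "unknown"
```

Red flags are checked first so that a page advertising full text behind a login wall is still flagged as a problem.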
[CODE4LIB] OCLC xISBN service is moving - February 13
February 13, not 14. I must have valentines on my mind!
[CODE4LIB] xISBN moved
The xISBN service has now moved out of OCLC Research and is now being supported by the Openly Informatics Division of OCLC. For more information about the xISBN service, please visit http://www.worldcat.org/affiliate/webservices/xisbn/app.jsp There was a delay in the transfer caused by the snowstorm in Ohio. We have one problem reported: a poorly documented feature (to redirect to OPACs) was not implemented in the replacement service; we expect this to be fixed shortly. If you want updates on this issue, please subscribe to the xidentifier-l mailing list. In looking at the usage patterns, we've noticed that a very small number of users are trying to just suck up all the xISBN data using the web service. While this may have been reasonable when the data was static, it is not appropriate now that the data is being frequently updated. We are now able to supply the COMPLETE xISBN data file every month to customers who need this for their application; if this applies to you, please contact us at [EMAIL PROTECTED]. One of the national libraries is already working with us in this way. Another usage pattern seems to be pulling data for the contents of a library catalog; if you are interested in a batch-update facility, please contact us. Some uses appear to be very commercial. If you have a commercial application not allowed by the existing terms and conditions for free use, please contact us at [EMAIL PROTECTED] for a commercial agreement. The new web service is rated at 10 million requests per day, so don't be afraid to send traffic to xISBN. If you expect to use more than 1,000 requests per day, you must contact us. The existing XML response format does not allow us to indicate in the response whether an application is over the limit or not without breaking it, and we've chosen not to break anyone's application.
Eric
Re: [CODE4LIB] OpenURL validation services
I would recommend that you send this query to the OpenURL listserv [EMAIL PROTECTED]. At one point there was something at Caltech that did this; I'm not sure if it made it to a final release. Hi-- Is there any existing code that can validate the descriptive metadata of an OpenURL ContextObject? For example, http://www.openurl.info/registry/docs/mtx/info:ofi/fmt:kev:mtx:journal states that auinit1 can have zero or one value and it must be the first author's first initial. Is there something into which I can input an OpenURL to see whether indeed the auinit1 param value is only one character (either A-Z or a-z) and has no more than one occurrence... plus all the other constraints for the other parameters in the matrix on http://www.openurl.info/registry/docs/mtx/info:ofi/fmt:kev:mtx:journal ? --SET
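The auinit1 constraints quoted above (zero or one occurrence, value a single letter) are easy to check mechanically. A minimal sketch, assuming the KEV key appears as rft.auinit1 in a journal ContextObject query string; a real validator would cover every field in the registry matrix:

```python
import re
from urllib.parse import parse_qs

# Sketch of checking just the auinit1 constraint from the journal KEV
# matrix: at most one occurrence, value exactly one letter A-Z or a-z.
# Assumes the key is "rft.auinit1" as used in KEV context objects.
def validate_auinit1(openurl_query):
    """Return True if rft.auinit1 in the query string satisfies the matrix."""
    params = parse_qs(openurl_query, keep_blank_values=True)
    values = params.get("rft.auinit1", [])
    if len(values) > 1:                              # zero or one occurrence
        return False
    if values and not re.fullmatch(r"[A-Za-z]", values[0]):
        return False                                 # must be a single letter
    return True
```

A full validator would just be a table of such per-key rules (repeatability, pattern, controlled vocabulary) applied the same way.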
Re: [CODE4LIB] E-Resource Access Management Services
I posted some comments in Web4Lib on this, but on code4lib, I'd like to be a bit more provocative and get y'all bothered. * What online resources would you collect? - A collection of online resources is an oxymoron. * How would you connect people to these new collections? - Why do you think it is you that will do the connecting? * How will you control and manage these services? - If you want to be in control of these resources, take the blue pill. * How will you provide your users with the most correct information possible? - Take the red pill. Eric We've talked a lot about this at OCLC; it should be noted that credit for starting this discussion goes to our colleagues at SerialsSolutions. Tim McCormick here at OCLC Openly likes to ask the question "What is [this interesting-sounding concept] in opposition to?" I think that the idea that the ERAMS concept is in opposition to is what our colleagues at ExLibris are calling URM, or Universal Resource Management. To accentuate the differences: ERAMS: electronic resources need an entirely new/different management infrastructure. URM: libraries need a single management infrastructure for all their resources. Of course there are important truths on both sides of this argument, but you can see why Serials Solutions and Ex Libris are arguing the sides they have chosen. ERAMS E-Resource Access Management Services http://www.erams.org/ We are looking for the first 50 participants who are willing to visualize a library not focused solely on print resource management and willing to go out on a limb and conceptualize the library which is focused on user access and management of online resource services. Four questions we will be brainstorming about today, to try to develop our future scenario, are: * What online resources would you collect? * How would you connect people to these new collections? * How will you control and manage these services? * How will you provide your users with the most correct information possible?
Please join Jill Emery, Bonnie Tijerina, and Elizabeth Winter to learn more about the ERAMS concept and the future possibility this concept holds for libraries. Where: Marriott Inner Harbor at Camden Yards, Chesapeake Room When: Saturday, March 31, 2007 Time: 2:00 - 4:00 PM Light refreshments will be available. Please RSVP to Jill Emery at [EMAIL PROTECTED] by March 29, 2007. To learn more and participate in future discussions, please visit http://www.erams.org/
[CODE4LIB] more metadata from xISBN
The API for xISBN that Xiaoming Liu previewed at the Code4Lib meeting is now officially launched and supported, and provisions for commercial use are now in place. For those of you who missed it, in addition to related ISBNs, xISBN will now also return metadata such as title, edition, language and publication year that can be used to distinguish manifestations of a work. xISBN supports a RESTful API, as well as OpenURL and unAPI, and can return results in a variety of formats. xISBN is free for non-commercial, low-volume use. API details are at http://xisbn.worldcat.org/xisbnadmin/doc/api.htm With your support, we will continue to develop this service along with other Worldcat-based machine-to-machine services. Eric
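As an illustration of consuming such a response, here is a sketch that parses an xISBN-style XML reply into (ISBN, title, year) tuples. The element and attribute names here are assumptions for illustration, not the documented schema; consult the API docs linked above for the real response formats:

```python
import xml.etree.ElementTree as ET

# Illustrative only: the element/attribute names are assumptions, not the
# documented xISBN schema (see the API docs linked in the post above).
SAMPLE = """<rsp stat="ok">
  <isbn title="Flatland" year="1992" lang="eng" ed="repr.">0486272630</isbn>
  <isbn title="Flatland" year="1998" lang="eng">048627263X</isbn>
</rsp>"""

def related_isbns(xml_text):
    """Return (isbn, title, year) tuples from an xISBN-style XML response."""
    root = ET.fromstring(xml_text)
    return [(e.text, e.get("title"), e.get("year"))
            for e in root.findall("isbn")]
```

The per-ISBN metadata is what lets a client distinguish manifestations of a work, as described above, instead of treating the related ISBNs as interchangeable.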
Re: [CODE4LIB] more metadata from xISBN
As long as LibX is free and not being used as a way to drive Amazon revenue, I don't see how it could be considered to be commercial. We've studied our logs pretty carefully. Most of the sites that have exceeded the limit we set were commercial sites doing bulk harvest. You can track the xISBN use by LibX by getting an affiliate id. Eric At 2:32 PM -0400 5/9/07, Godmar Back wrote: Interesting. Thom Hickey commented a while ago about LibX's use of xISBN (*): "I suspect that eventually the LibX xISBN support will become both less visible and more automatic." We were indeed planning on making it more automatic. For instance, a user visiting a vendor's page such as Amazon might be presented with options from their library catalog, based on related ISBNs found via xISBN. Would that qualify as noncommercial use? For instance, if LibX with this feature were installed on a public library machine, 500 requests per day might easily be exceeded. Matters would be even worse if multiple library machines were to share an IP because they are hidden behind a NAT device or proxy. - Godmar (*) http://outgoing.typepad.com/outgoing/2006/05/libx_and_xisbn.html
Re: [CODE4LIB] more metadata from xISBN
At 4:41 PM -0400 5/9/07, Godmar Back wrote: On 5/9/07, Eric Hellman [EMAIL PROTECTED] wrote: We've studied our logs pretty carefully. Most of the sites that have exceeded the limit we set were commercial sites doing bulk harvest. You can track the xISBN use by LibX by getting an affiliate id. LibX is a client-side tool. We're not a user of xISBN; we provide clients who have installed it the option to use xISBN. I know, and I had to explain that to the legal department! Also, keep in mind that an important reason to use OCLC's xISBN service - rather than using an alternate service or using the data directly - is Jeff Young's OAI bookmark service, specifically the know-how he's put into searching multiple catalogs and his keeping a database of which library uses which catalog. That, as I understand, is still not part of the officially supported xISBN, though. We will improve on that service...
[CODE4LIB] js calling js hack (was: A new generation of OPAC enhancements)
IIRC, this hack doesn't work in older versions of IE unless you remove the type=text/javascript attribute. (See http://openly.oclc.org/jake/instant.html ) This is one of the few examples of a choice you have to make between having something work and having your page pass strict validation. Eric At 2:59 PM -0400 5/14/07, Jonathan Rochkind wrote: For what it's worth, I've used that same weird SCRIPT hack to insert dynamically generated code onto my OPAC screen for other purposes too. It was initially suggested to me by Dave Pattern. It's a useful hack. Jonathan Altay Guvench wrote: Hi Godmar, Tim asked me to join the list and discussion on the LibraryThing widgets. You're right that, with Ajax, we're bound by the same-origin restriction. But we can dynamically change the page content after loading, by eschewing traditional Ajax. New content is delivered through dynamically-inserted script tags. For example, you can set an onclick that adds a tag like this to the head: <script src="http://www.librarything.com/get_content.php?tag=foo" type="text/javascript"></script> Server-side, get_content.php generates the response on the fly, e.g. echo "document.getElementById('tagbrowser').innerHTML = 'books tagged foo'". As long as the response header in get_content is set to javascript, the browser should interpret it correctly. As for the hardwired DOM finagling you saw in Danbury's OPAC, in most cases, the table[3] stuff isn't necessary. Typically, a library will simply edit their OPAC's HTML template to include empty widget divs (e.g. <div id="ltfl_tagbrowse" class="ltfl"></div>) wherever they'd like the widgets. Then a single script tag finds those divs and inserts the contents onload. However, there were some permissions issues with the Danbury OPAC that didn't allow for this. (They could only edit the OPAC footer.) The workaround was to dynamically insert the LTFL divs using custom javascript in the footer. That said, like I mentioned, this isn't necessary in most cases.
We've tested it in a few systems, and generally speaking, our widgets are DOM-agnostic. Altay
Re: [CODE4LIB] good web service api
Eric, I'll address only the XML design for your first link, and I'll ask questions, not because I want to know the answer, but because the answers determine whether your XML response is good. Is this response supposed to work only for MyLibrary? Does version refer to MyLibrary or to the response format? Why is there an error element if there's not been an error? Why is there a message element if there's no message? Why is code PCDATA rather than a CDATA attribute? Are the facets always ordered? Is there a difference between a facet_name and a name? Is the name and note attached to the facet or contained by the facet? Why do facets have ids? Is id an ID? I also find that it helps to try to read things like this aloud in English. "MyLibrary, could I get all the facets you have?" "Hello, I'm MyLibrary version 0.001. I would like to report an error with code zero and no message. As for my facets, a facet, which I id as 2, is named as a facet Formats, and I should note as a facet it is The physical manifestation of information, and that's what I have to say about this facet. Another facet, which I id as 3, is named as a facet People, and I should note as a facet it is Human beings both real and fictional, and that's what I have to say about this facet. etc... And those are all the facets. Goodbye!" At 2:15 PM -0400 6/29/07, Eric Lease Morgan wrote: What are the characteristics of a good Web Service API? We here at Notre Dame finished writing the MyLibrary Perl API a long time ago. It does what it was designed to do, and we believe others could benefit from our experience. At the same time, the API is based on Perl and we realize not everybody likes (or knows) Perl. Some would say, "Why didn't you write it in X?" where X is their favorite programming language. Well, that is just not practical. I believe the solution to the dilemma is a Web Service API against MyLibrary similar to the Web Services API provided by Fedora.
Any script sends name/value pairs on a URL and gets back XML. This way any language can be used against MyLibrary (which we are now calling a digital library framework toolbox). Here are the only two working examples I have: http://dewey.library.nd.edu/mylibrary/ws/?obj=facet&cmd=getAll http://dewey.library.nd.edu/mylibrary/ws/?obj=facet&cmd=getOne&id=2 Try a few errors: http://dewey.library.nd.edu/mylibrary/ws/?obj=facet&cmd=getNone http://dewey.library.nd.edu/mylibrary/ws/?obj=facet&cmd=getOne&id=45 http://dewey.library.nd.edu/mylibrary/ws/?obj=facet&cmd=getOne&id=x I can create all sorts of commands like: * get all facets * get all terms * get all librarians * get all resources * get all resources classified with this term * create facet, term, librarian, or resource * edit facet, term, etc. * delete facet, term, etc. Given such an interface, library (MyLibrary) content can be used in all sorts of venues much more easily. While I don't expect anybody here to know what commands to support, I am more interested in the how. What are the characteristics of good name/value pairs? Should they be terse or verbose? To what extent should the name/value pairs require a version number or a stylesheet parameter? Besides being encoded in XML, what should the output look like? What are characteristics of good XML in this regard? Heck, then there is authorization: how do I keep people from deleting resources and such? This might be just an academic exercise, but with the advent of more and more Web Services computing I thought I might give a MyLibrary Web Services API a whirl. -- Eric Lease Morgan University Libraries of Notre Dame code4lib_fridays++
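To make the design questions concrete, here is one sketch of a response shaped along the lines of the critique earlier in the thread: no error element unless there actually is an error, and each facet's name and note contained by the facet. The element names are illustrative, not a proposed MyLibrary format:

```python
import xml.etree.ElementTree as ET

# Illustrative response builder: the error element appears only on error,
# and name/note are children of each facet. Element names are made up for
# this sketch, not taken from the actual MyLibrary web service.
def facets_response(facets, error=None):
    """Serialize facets as XML; facets is a list of (id, name, note)."""
    root = ET.Element("facets", version="1.0")
    if error is not None:                      # error: (code, message)
        err = ET.SubElement(root, "error", code=str(error[0]))
        err.text = error[1]
        return ET.tostring(root, encoding="unicode")
    for fid, name, note in facets:
        f = ET.SubElement(root, "facet", id=str(fid))
        ET.SubElement(f, "name").text = name
        ET.SubElement(f, "note").text = note
    return ET.tostring(root, encoding="unicode")
```

Reading this aloud is less painful: "Here are the facets" or "Here is an error", never both at once.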
Re: [CODE4LIB] Citation parsing?
Having written a pretty decent citation parser 10 years ago (in AppleScript!), and having seen a lot of people take whacks at the problem, I have to say that it's pretty easy to write one that works on 70-80% of citations, particularly if you stick to one scholarly subject area. On the other hand, it's really quite hard to write a citation parser that gets better than 90% of citations for a general corpus. The main problem is that scholarly works are written by creative, ingenious people. When applied to citations, creativity and ingenuity are disasters for automated parsers. Parsers working on the computer science literature have come the farthest, mostly because the convention in computer science literature is to always include the article title. The most impressive thing to me about Google Scholar when it was first released was to see how far they had taken the citation parsing outside of the computer science literature. Still, they have a ways to go; most of the progress they've made seems to be by cheating (i.e. backing the citation out of the linking, which means they're just piggybacking on the work done by Inera and others). (Hint: one of the very best performing open source citation parsers was written (in Perl) by Steve Lawrence, who was at NEC at the time, as part of ResearchIndex AKA CiteSeer. It was released as pseudo open source, but not so easy to separate. It relied heavily on the availability of the article title. Steve has been at Google for a while. Steve apparently wasn't involved in Scholar, but you have to assume he and Anurag did a fair amount of comparing notes.) Anyway, almost all parsers rely on a set of heuristics. I have not seen any parsers that do a good job of managing their heuristics in a scalable way. A successful open-source attack on this problem would have the following characteristics: 1. able to efficiently handle and manage large numbers of parsing and scoring heuristics; 2. easy for contributors to add parsing and scoring heuristics; 3. able to use contextual information (is the citation from a physics article or from a history monograph?) in application and scoring of heuristics. Eric It's on our list of Big Problems To Solve; I'm hoping to have time to tackle it later this year :) -n On Jul 18, 2007, at 12:57 PM, Jonathan Rochkind wrote: Ha! If it's not too difficult, then with all the time you've spent looking at it extensively, how come you don't have a solution yet? Just kidding. :) Jonathan Nathan Vack wrote: We've looked at this pretty extensively, and we're pretty certain there's nothing downloadable that does a good enough job. However, it's by no means impossible -- it seems to be undergrad thesis-level work in Singapore: http://wing.comp.nus.edu.sg/parsCit/ There used to be a paper describing this approach (essentially treating citation parsing as a natural language processing task and using a maximum entropy algorithm) online... the page even cites it... but it seems to be gone now. FWIW, it didn't look too difficult. -Nate On Jul 17, 2007, at 6:16 PM, Jonathan Rochkind wrote: Does anyone have any decent open source code to parse a citation? I'm talking about a completely narrative citation like someone might cut-and-paste from a bibliography or web page. I realize there are a number of different formats this could be in (not to mention the human error problems that always occur with human-entered free text) -- but thinking about it, I suspect that with some work you could get something that worked reasonably well (if not perfect). So I'm wondering if anyone has done this work. (One of the commercial legal products -- I forget if it's Lexis or West -- does this with legal citations -- a more limited domain -- quite well. I'm not sure if any of the commercial bibliographic citation management software does this?)
The goal, as you can probably guess, is a box that the user can paste a citation into; make an OpenURL out of it; show the user where to get the citation. I'm pretty confident something useful could be created here, with enough time put into it. But sadly, it's probably more time than anyone has individually. Unless someone's done it already? Hopefully, Jonathan -- Jonathan Rochkind Sr. Programmer/Analyst The Sheridan Libraries Johns Hopkins University 410.516.8886 rochkind (at) jhu.edu
Re: [CODE4LIB] Citation parsing?
On Jul 18, 2007, at 10:04 PM, Eric Hellman wrote: Also, even in (many) scholarly journals, editorial consistency is almost unbelievably poor -- lots of times, the rules just aren't followed. Punctuation gets missed, journal names (especially abbreviations!) are misspelled... and so on. Rule-based and heuristic systems are always going to have problems in those cases. Heuristics are perhaps the only way to deal with the lack of a consistent format (e.g. a cluster of words including "journal of" is likely to contain a journal name). If you have a halfway decent journal name parser (such as the one in our OpenURL software), it already contains a large list of journal misspellings. In a lot of ways, I think the problem is fundamentally similar to identifying parts of speech in natural language (which has lots of the same ambiguities) -- and the same techniques that succeed there will probably yield the most robust results for citation parsing. Have people been able to do a decent job of identifying parts of speech in natural language?
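The "journal of" heuristic can be sketched as a toy: a run of capitalized words around "Journal of" is taken as a candidate journal name. Real parsers stack many such scored heuristics; this shows just one, and the regex is only an approximation:

```python
import re

# Toy version of one heuristic: a cluster of capitalized words containing
# "Journal of" probably holds the journal name. A real parser would score
# many such heuristics against each other rather than trusting one regex.
def guess_journal_name(citation):
    """Return a candidate journal name from a citation string, or None."""
    m = re.search(r"((?:[A-Z][a-z]+\s+)?Journal\s+of(?:\s+[A-Z][a-z]+)+)",
                  citation)
    return m.group(1) if m else None
```

It fails on abbreviations, misspellings, and titles that merely contain "Journal of", which is exactly why single heuristics don't get past the 70-80% plateau discussed above.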
Re: [CODE4LIB] library find and bibliographic citation export?
On Sep 27, 2007, at 9:59 PM, Steve Toub wrote: A reminder that the data model for OpenURL/COinS does not have all metadata fields: only one author allowed, no abstract, etc. That's incorrect; an OpenURL context object may contain any number of author names (though the parsed author-name fields describe only the first author). And to be fair, there exists no metadata format that has all metadata fields. Eric
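Concretely, repeating the rft.au key is how a KEV context object carries several authors. A minimal sketch (the names and title are made up):

```python
from urllib.parse import urlencode

# Sketch of a KEV context object with several rft.au values. Repeating
# rft.au is legal; the parsed fields (rft.aulast, rft.auinit1, ...) would
# describe only the first author. The example values are invented.
def journal_kev(title, authors):
    """Build a KEV query string with one rft.au pair per author."""
    pairs = [("url_ver", "Z39.88-2004"),
             ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
             ("rft.jtitle", title)]
    pairs += [("rft.au", a) for a in authors]   # one pair per author
    return urlencode(pairs)
```

Using a list of pairs (not a dict) is the point: it is what lets the same key appear more than once in the serialized OpenURL.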
Re: [CODE4LIB] library find and bibliographic citation export?
SFX uses a proprietary mechanism to trigger fetch that is not part of the OpenURL standard. The usefulness of this mechanism, however, motivated the very rich fetch functionality in the 1.0 version of the standard. If you care at all about interoperability, you should avoid the SFX trigger mechanism. COinS recommends against using ContextObjects with fetch (by-reference metadata) because it is thought that diverse agents will not be able to deal with them; however, link server systems with full Z39.88 implementations should have no problem with them. Eric On Sep 28, 2007, at 11:59 PM, Tom Keays wrote: I'm certainly no expert, but my understanding is that you have to embed the extra authors into a call (a "fetch" in the SFX lingo) using a private identifier at the end of the OpenURL string. It's more complicated than that, of course, since there has to be an sid included in addition to the pid and, at least in the SFX-biased documentation, this so-called fetch operation is intended to be done using Z39.50 or HTML-based (REST?) calls to a data server determined by the source. Pretty messy, huh? http://www.exlibrisgroup.com/resources/sfx/sfx_for_ips_aug_2002.pdf The upshot is that multiple authors aren't directly supported in COinS because a COinS URL is by design generic and therefore can't know how to deal with the specific sid and pid requirements of a fetch. However, I would think Umlaut ought to be able to handle it since it has SFX at its core. On 9/28/07, Jonathan Rochkind [EMAIL PROTECTED] wrote: Can you tell me how to legally include more than one author name in an OpenURL context object? I've been a bit confused about this myself, and happen to be dealing with it presently too.
[CODE4LIB] OpenURL Referrer for IE
OCLC's OpenURL Referrer is now available for Internet Explorer! Previously available only for Firefox, this popular browser extension inserts OpenURLs into Google Scholar and Google News Archive search results. It also detects and makes links out of web COinS, such as those found in Wikipedia and Worldcat.org. The extension can be downloaded for free at http://openly.oclc.org/openurlref/ OpenURL Referrer uses your institution's link resolver settings from the OCLC WorldCat Registry, so there is no need to manually configure the extension. Institutions can register their resolver in the OCLC WorldCat Registry by visiting http://worldcat.org/registry/institutions. All institutions can register for free, even if they are not OCLC member libraries. The IE version leans much more heavily on the Worldcat Resolver Registry than the Firefox version does. The reason for this is that IE does not have a nice XUL-based way to make user interfaces, so we instead rely on the Registry to do baseurl management. I hope this proves to be useful. Eric
[CODE4LIB] Openly Jake is closed for renovation.
I continue to be surprised at the continuing use of Openly Jake, considering that it's been over 7 years since the data it delivers was last maintained. Over the past month, the system has become increasingly unstable, due in part to *heavy usage*, and on examination, it looks like we'll need to do some serious renovation (the usual chain of OS/VM/container updates); as a result, jake will not be available for at least the next week. I can't make any promises, but there is a possibility that we can also enable a switch to an up-to-date knowledgebase. If you have any questions or concerns, please don't hesitate to contact me. Eric Hellman, Director, OCLC New Jersey [EMAIL PROTECTED] 2 Broad St., Suite 208 tel 1-973-509-7800 fax 1-734-468-6216 Bloomfield, NJ 07003 http://openly.oclc.org/
Re: [CODE4LIB] Multiple ISBNs in COInS?
It's an excellent point. The resolver's knowledgebase needs to know which ISSN a vendor has bundled content under, and ideally it will be able to access that content no matter which ISSN/eISSN is in the OpenURL metadata. I'm thinking of a particular vendor that uses the ISSN in its URL syntax, but without a knowledgebase, it's hard to predict which ISSN is the one they use! On Feb 29, 2008, at 8:38 AM, Kyle Banerjee wrote: I agree; an ISSN is not an identifier for an article. But in general, a resolver should be smart enough to know what serial is meant even if a variant ISSN is supplied. To prevent multiple searches, the resolver has to know how a title is referenced in the target source. This requires precalculation using a service or data file like xISBN, but one that includes ISSNs. However, it is important to keep in mind that sources such as the library catalog sometimes require multiple ISSNs to retrieve all holdings data unless this information is combined before it is loaded into the resolver knowledgebase. Between cataloging rules that influence how serials are issued (specifically, the practice known as successive entry cataloging, which spreads individual titles across multiple records because of piddly variations in issues) and things that occur at the publishing end of things, many journals are known by multiple ISSNs. Practices like these are not user friendly -- even reference librarians don't seem to understand them -- so database providers typically combine all the issues so they can be considered part of one unit. Vendor-provided data about such titles will likely include only one of these ISSNs (most likely the most recent one, but that is not guaranteed). Unlike vendors, catalogers can be counted on to spread the holdings statements across multiple records and ISSNs if the cataloging rules so prescribe. This may sound like cataloging minutia, but this dynamic affects a number of very popular titles.
Resolving only one ISSN could easily lead people to think an issue they need is not available when it is on hand. kyle -- Kyle Banerjee Digital Services Program Manager Orbis Cascade Alliance [EMAIL PROTECTED] / 541.359.9599
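The precalculation Kyle describes amounts to collapsing a journal's ISSN variants (print, electronic, successive-entry) to one canonical key before the knowledgebase is queried. A toy sketch with invented ISSNs:

```python
# Toy sketch of an ISSN-variant table. The ISSNs and journal ids below are
# invented; a real resolver would build this mapping from a service or data
# file that groups related ISSNs, as discussed in the thread above.
ISSN_ALIASES = {
    "0000-0019": "J-1",   # print ISSN
    "1234-5678": "J-1",   # electronic ISSN for the same journal
    "8765-4321": "J-2",   # a different journal
}

def canonical_journal(issn):
    """Map any known ISSN variant to the journal's canonical id, or None."""
    return ISSN_ALIASES.get(issn)
```

With all variants mapped to one key, a request carrying any of a journal's ISSNs resolves to the same holdings, avoiding the "not available when it is on hand" failure.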
[CODE4LIB] Open positions at OCLC New Jersey
Products we build here include: Worldcat Link Manager, the Knowledgebase (used by a number of other vendors), xISBN, xISSN, OpenURL Referrer (a Firefox add-on), and Link Evaluator (a Firefox add-on). We have just started work on a project to change the way that libraries throughout the world organize what they do. As a big non-profit corporation, we have solid benefits and good insulation from the coming recession. As a small location, we have flexible hours and a casual working environment. As a non-profit, OCLC doesn't pay like a bank or other big businesses would; a corollary of that is that people who work here actually want to work here. For me, the greatest thing about what we do is that millions of people around the world benefit from the work we do. We're located 10 miles west of Manhattan (GSP exit 148), 1 block from the Bloomfield train station. If you're interested in a position at OCLC NJ, feel free to write me an e-mail ([EMAIL PROTECTED]) telling me why you're a good match, and submit a resume at https://jobs-oclc.icims.com/oclc_jobs/jobs/candidate/jobs.jsp?ss=1&searchLocation=US-NJ so that you exist in the minds of HR. Eric
Re: [CODE4LIB] COinS in OL?
Not just the book pages, I might add! Wikipedia probably has the most non-book COinS deployed; Worldcat is the premier site for book COinS. A recent but impressive addition to the COinSiverse is ResearchBlogging; see http://ResearchBlogging.org Eric On 12/1/08 11:08 AM, Karen Coyle [EMAIL PROTECTED] wrote: I have a question to ask for the Open Library folks and I couldn't quite figure out where to ask it. This seems like a good place. Would it be useful to embed COinS in the book pages of the Open Library? Does anyone think they might make use of them? Thanks, kc Eric Hellman, Director OCLC New Jersey [EMAIL PROTECTED] 2 Broad St., Suite 208 tel 1-973-509-7800 Bloomfield, NJ 07003 http://nj.oclc.org/
Re: [CODE4LIB] COinS in OL?
Just catching up on the discussion here... For the benefit of those who aren't on the OpenURL list, it was pointed out that you can put an LCCN in a ContextObject using info URIs: rft_id=info:lccn/93004427 Eric On 12/1/08 1:28 PM, Jonathan Rochkind [EMAIL PROTECTED] wrote: I'm not sure there's any good way to include a DDC or LCC in an SAP1 OpenURL for COinS. Same with subject vocabularies. Really, I'm pretty sure there is NOT, in fact. But if there is, sure, throw them in, put in anything you've got. But this re-affirms my suggestion that there might be a better microformat-ish way to embed information in the page in addition to OpenURL. COinS/OpenURL is important because we have an established infrastructure for it, but it's actually pretty limited and not always the easiest to work with. Jonathan Eric Hellman, Director OCLC New Jersey [EMAIL PROTECTED] 2 Broad St., Suite 208 tel 1-973-509-7800 Bloomfield, NJ 07003 http://nj.oclc.org/
Re: [CODE4LIB] COinS in OL?
Yep. There's no URI for LCC. You could put LCC in the subject field of a dublin core profile metadata format ContextObject. But it's not clear why anyone would want to do that. On 12/8/08 10:41 AM, Jonathan Rochkind [EMAIL PROTECTED] wrote: LCCN--Library of Congress Control Number--eg 98013779--, yes. LCC--Library of Congress Classification--eg BF575.H27 W35 1991--I don't think so. Jonathan Eric Hellman wrote: just catch up on the discussion here... for the benefit of those who aren't on the openurl list, it was pointed out that you can put lccn in a ContextObject using info uri's: rft_id=info:lccn/93004427 Eric Hellman, Director OCLC New Jersey [EMAIL PROTECTED] 2 Broad St., Suite 208 tel 1-973-509-7800 Bloomfield, NJ 07003 http://nj.oclc.org/
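[Editor's note] Eric's workaround — carrying an LCC class number in the subject field of a Dublin Core profile ContextObject — can be sketched as a KEV query string like this. Field names follow the Z39.88 KEV conventions; the title and class number are invented examples.

```python
from urllib.parse import urlencode

# Hypothetical sketch: since LCC has no URI scheme, ride it in the
# subject field of a DC-profile ContextObject. Values are illustrative.
def dc_coins_query(title, lcc_class):
    params = [
        ("url_ver", "Z39.88-2004"),
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:dc"),  # DC metadata format
        ("rft.title", title),
        ("rft.subject", lcc_class),  # no URI for LCC, so plain text here
    ]
    return urlencode(params)
```

Whether any resolver would do anything useful with that subject field is, as Eric says, another question.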
Re: [CODE4LIB] registering info: uris?
I'll bite. There are actually a number of http URLs that work like http://dx.doi.org/10./j.1475-4983.2007.00728.x One of them is http://doi.wiley.com/10./j.1475-4983.2007.00728.x Another is run by CrossRef; some OpenURL link servers also have DOI proxy capability. So for code to extract the DOI reliably from http URLs, the code needs to know all the possibilities for the DOI proxy stem. The proxies also tend to have optional parameters that can control the resolution. In principle, the info:doi/ stem addresses this. On Apr 1, 2009, at 7:27 AM, Ross Singer wrote: What I don't understand is the reason to express that identifier as: info:doi/10./j.1475-4983.2007.00728.x when http://dx.doi.org/10./j.1475-4983.2007.00728.x Eric Hellman e...@hellman.net (personal) http://hellman.net/eric/
Re: [CODE4LIB] registering info: uris?
No, that's not at all what it implies. The ofi/nam identifiers were minted as identifiers for namespaces of identifiers, not as a wrapper scheme for the identifiers themselves. Yes, it's a bit TOO meta, but they can be safely ignored unless a new profile is desired. On Apr 5, 2009, at 10:31 AM, Karen Coyle wrote: Jonathan Rochkind wrote: URI for an ISBN or SuDocs? I don't think the GPO is going anywhere, but the GPO isn't committing to supporting an http URI scheme, and whoever is, who knows if they're going anywhere. That issue is certainly mitigated by Ross using purl.org for these, instead of his own personal http URI. But another issue that makes us want a controlling authority is increasing the chances that everyone will use the _same_ URI. If GPO were behind the purl.org/ NET/sudoc URIs, those chances would be high. Just Ross on his own, the chances go down, later someone else (OCLC, GPO, some other guy like Ross) might accidentally create a 'competitor', which would be unfortunate. Note this isn't as much of a problem for born web resources -- nobody's going to accidentally create an alternate URI for a dbpedia term, because anybody that knows about dbpedia knows that it lives at dbpedia. So those are my thoughts. Now everyone else can argue bitterly over them for a while. :) The ones that really puzzle me, however, are the OpenURL info namespace URIs for ftp, http, https and info. This implies that EVERY identifier used by OpenURL needs an info URI, even if it is a URI in its own right. They are under info:ofi/nam, which is called Namespace reserved for registry identifiers of namespaces. There's something so circular about this that I just get a brain dump when I try to understand it. Does it make sense to anyone? kc -- --- Karen Coyle / Digital Library Consultant kco...@kcoyle.net http://www.kcoyle.net ph.: 510-540-7596 skype: kcoylenet fx.: 510-848-3913 mo.: 510-435-8234 Eric Hellman http://hellman.net/eric/
Re: [CODE4LIB] Use of rft.identifier in COiNS?
Wikipedia uses the DC metadata format for non-book objects; rft.identifier is part of the DC metadata format. If you are describing a book, you want to use the book metadata format. Jonathan is correct: rft_id is a possible place to put an accession number for a book, but only if you can make a URI out of it. You might also consider rft_dat, as long as you include rfr_id. Great to hear LibraryThing is looking at COinS! Eric On Apr 29, 2009, at 4:01 PM, Chris Catalfo wrote: Hi all, I am trying to find the best way to include an item's accession number (i.e. ILS system id) in a COinS span. This is in the context of library catalog pages where I'd like to be able to retrieve the ILS accession number to return to LibraryThing for Libraries. I see no mention of an rft.identifier key/value pair on the COinS site's brief guide to books [1]. It does, however, appear as an element in the COinS online generator for generic items [2]. Googling returned a couple of results using rft.identifier to hold urls. Can anyone enlighten me as to whether using rft.identifier to hold the ILS accession number is valid? Or suggest a more suitable key/value pair? Thanks for any help you can provide. Chris Catalfo Programmer, LibraryThing [1] http://ocoins.info/cobgbook.html [2] http://generator.ocoins.info/?sitePage=info/dc.html Eric Hellman 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net (personal) http://hellman.net/eric/
Re: [CODE4LIB] A Book Grab by Google
Should note that Google could be paying $100,000,000+ to rights holders without getting ANYTHING in return in the absence of a settlement; that's what the copyright attorneys I've talked to believe would have been the ruling by the court had the suit gone to trial. And if that happened, libraries would get nothing, not even the scans. I don't see how bashing Google (which is NOT what the library association briefs are doing, btw) for gaps in US and international Copyright Law (orphan works, for example) will end up helping libraries. My blog at http://go-to-hellman.blogspot.com/ is no longer secret. Eric
Re: [CODE4LIB] A Book Grab by Google
But the argument being trotted out is that having orphan works available through Google would HURT libraries, which is a somewhat different discussion. The arguments I see for that (as applied to libraries other than the Internet Archive) are: 1. Asset devaluation. Just as DeBeers would be hurt if Google started selling cheap diamonds, because their stock of diamonds would be devalued, libraries would find their collections devalued. 2. Competition. Patrons would have attractive alternatives to visiting libraries to access information locked onto paper. But let's imagine that Internet Archive was shoehorned into the settlement. How would these arguments change? Asset devaluation would presumably be worse as the price was driven down, and libraries (other than IA) would be faced with more competition, not less. On the other hand, presumably libraries would gain more options in indexing and thus improved access to their collections. Eric http://hellman.net/eric/ On May 20, 2009, at 3:47 PM, st...@archive.org wrote: On 5/20/09 11:19 AM, Eric Hellman wrote: I don't see how bashing Google (which is NOT what the library association briefs are doing, btw) for gaps in US and international Copyright Law (orphan works, for example) will end up helping libraries. i think the concern is that the settlement could give _only_ Google the right to scan orphaned works, and no one else. that certainly wouldn't help libraries. /st...@archive.org
Re: [CODE4LIB] A Book Grab by Google
I think one thing in Karen's comment is incorrect. As far as I can tell, the 'most favored nation' clause does NOT apply in the situation that Karen assumes it would be most likely to come into play. MFN appears to apply only if the registry licenses orphan works. It's an odd provision if you assume that the registry can't license orphan works; commentators such as Randy Picker have also commented on this oddness; as Karen mentions, it could be meant to come into play if orphan works legislation is enacted. You can examine the legalese yourself at http://go-to-hellman.blogspot.com/2009/04/does-google-really-get-orphan-monopoly.html Eric On May 20, 2009, at 2:54 PM, Karen Coyle wrote: Eric Hellman wrote: Should note that Google could be paying $100,000,000+ to rights holders without getting ANYTHING in return in the absence of a settlement; that's what the copyright attorneys I've talked to believe would have been the ruling by the court had the suit gone to trial. And if that happened, libraries would get nothing, not even the scans. I don't see how bashing Google (which is NOT what the library association briefs are doing, btw) for gaps in US and international Copyright Law (orphan works, for example) will end up helping libraries. My blog at http://go-to-hellman.blogspot.com/ is no longer secret. Eric Another important note is that the settlement is the collective desires of the entities representing rights holders (Author's Guild and Assn Am. Publishers) and Google. Because the settlement talks were done under NDA, we can only guess at which aspects of the settlement were proposed/championed by which participants. From the little bit that has been revealed by folks who were there (because they are still under NDA) the AAP had strong demands and was probably equal to Google, if not more so, in terms of its ability to carve out what it felt was the best deal. The settlement is a compromise, with everyone getting *some* of what they wanted, and no one getting *all*.
In answer to the question you pose on your blog: The key question is this: Would the Book Rights Registry have the ability to authorize a Google competitor to copy and use Orphan works? The legal folks I've heard speak about this say that the answer is no. Only the court can authorize the copying and use of Orphan works outside of what copyright law already states, and this settlement waives liability under the law only for Google. The registry cannot change the legal status of Orphan works under the copyright law in a way that would permit copying of them as in-copyright works. The registry sets prices, so if someone else found a way to copy Orphan works legally (say, if we got orphan works legislation), the registry might be used by them as the middle-man for payments. Most likely the registry would be used for non-Orphan works, because the rights holder could make a deal with the registry to give permission for copying, with $$ going to the registry and on to the rights holder. This is exactly what the Copyright Clearance Center does -- it serves as a central licensing agency for copyright holders. I assume that this is the area where the 'most favored nation' clause would be most likely to come into play -- basically, if Google Books is successful, rights holders might want to make deals with other entities for similar product lines. Whether or not the suit itself would have gone against Google is a matter of debate. I've heard it both ways. Google folks state (and because they say this publicly it has to be considered at least partially a PR statement) that the lawsuit would have gone on for years (true), and they didn't want to wait that long to be able to know what they could and could not do with this project. That makes sense, but it also is possible that they weren't as sure that they'd win as they'd stated when they started the project. 
kc -- --- Karen Coyle / Digital Library Consultant kco...@kcoyle.net http://www.kcoyle.net ph.: 510-540-7596 skype: kcoylenet fx.: 510-848-3913 mo.: 510-435-8234
Re: [CODE4LIB] A Book Grab by Google
The oddness was remarked upon in Randy Picker's talk at the Columbia conference on the Google Book Search Settlement. Orphan works is not a term that occurs in the settlement agreement. Rightsholders other than Registered Rightsholders are orphan parents. Careful commentators refer to the initial monopoly on orphan works created by the settlement agreement, because we don't know what will happen down the road. On May 20, 2009, at 8:14 PM, Karen Coyle wrote: Eric, can you cite a section for this? Because I haven't seen this interpretation elsewhere, and I don't read it in the section you cite, which doesn't seem to me to mention orphan works. I will point to Grimmelmann: http://works.bepress.com/cgi/viewcontent.cgi?article=1024&context=james_grimmelmann pp. 10-11. Grimmelmann thinks that the monopoly on orphan works is what will give Google the edge that keeps away competition, but he doesn't interpret the MFN clause as relating only to orphan works. kc Eric Hellman wrote: I think one thing in Karen's comment is incorrect. As far as I can tell, the 'most favored nation' clause does NOT apply in the situation that Karen assumes it would be most likely to come into play. MFN appears to apply only if the registry licenses orphan works. It's an odd provision if you assume that the registry can't license orphan works; commentators such as Randy Picker have also commented on this oddness; as Karen mentions, it could be meant to come into play if orphan works legislation is enacted. You can examine the legalese yourself at http://go-to-hellman.blogspot.com/2009/04/does-google-really-get-orphan-monopoly.html Eric On May 20, 2009, at 2:54 PM, Karen Coyle wrote: Eric Hellman wrote: Should note that Google could be paying $100,000,000+ to rights holders without getting ANYTHING in return in the absence of a settlement; that's what the copyright attorneys I've talked to believe would have been the ruling by the court had the suit gone to trial.
And if that happened, libraries would get nothing, not even the scans. I don't see how bashing Google (which is NOT what the library association briefs are doing, btw) for gaps in US and international Copyright Law (orphan works, for example) will end up helping libraries. My blog at http://go-to-hellman.blogspot.com/ is no longer secret. Eric Another important note is that the settlement is the collective desires of the entities representing rights holders (Author's Guild and Assn Am. Publishers) and Google. Because the settlement talks were done under NDA, we can only guess at which aspects of the settlement were proposed/championed by which participants. From the little bit that has been revealed by folks who were there (because they are still under NDA) the AAP had strong demands and was probably equal to Google, if not more so, in terms of its ability to carve out what it felt was the best deal. The settlement is a compromise, with everyone getting *some* of what they wanted, and no one getting *all*. In answer to the question you pose on your blog: The key question is this: Would the Book Rights Registry have the ability to authorize a Google competitor to copy and use Orphan works? The legal folks I've heard speak about this say that the answer is no. Only the court can authorize the copying and use of Orphan works outside of what copyright law already states, and this settlement waives liability under the law only for Google. The registry cannot change the legal status of Orphan works under the copyright law in a way that would permit copying of them as in-copyright works. The registry sets prices, so if someone else found a way to copy Orphan works legally (say, if we got orphan works legislation), the registry might be used by them as the middle-man for payments.
Most likely the registry would be used for non-Orphan works, because the rights holder could make a deal with the registry to give permission for copying, with $$ going to the registry and on to the rights holder. This is exactly what the Copyright Clearance Center does -- it serves as a central licensing agency for copyright holders. I assume that this is the area where the 'most favored nation' clause would be most likely to come into play -- basically, if Google Books is successful, rights holders might want to make deals with other entities for similar product lines. Whether or not the suit itself would have gone against Google is a matter of debate. I've heard it both ways. Google folks state (and because they say this publicly it has to be considered at least partially a PR statement) that the lawsuit would have gone on for years (true), and they didn't want to wait that long to be able to know what they could and could not do with this project. That makes sense, but it also is possible that they weren't as sure that they'd win as they'd stated when
[CODE4LIB] Google Fusion Tables
Google Fusion Tables appear to be aimed at collaborative linked database development. It's a bit early (pre-alpha, no web services, no publishing), but it looks really interesting. Does anyone have ideas how to take advantage of this in libraries? Google Labs Blog: http://googleresearch.blogspot.com/2009/06/google-fusion-tables.html Google Fusion Tables: http://tables.googlelabs.com/Home My review: http://go-to-hellman.blogspot.com/2009/06/linked-data-vs-google-fusion-tables.html Eric Hellman 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net (personal) http://hellman.net/eric/
[CODE4LIB] proxying Google Book Search and advertising networks to protect patron privacy
Recent attention to privacy concerns about Google Book Search has led me to investigate whether any libraries are using tools such as proxy servers to enhance patron privacy when using Google Book Search. Similarly, advertising networks (web bugs, for example) could be proxied for the same reason. I would be very interested to hear from any libraries that have done either of these things and of their experiences doing so. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] proxying Google Book Search and advertising networks to protect patron privacy
I doubt that very much. It's very common for corporate sites to channel all their traffic through gateways. I would assume that Google was smart enough to recognize that your usage pattern was not that of many users coming from a single IP address, but rather that of a harvesting robot. The two activities have very different log signatures. On Aug 5, 2009, at 12:13 PM, Tim Spalding wrote: I suspect that proxying Google will trigger an automatic throttle. Early on, a number of us hit GB hard, trying to figure out what they had, and got stopped. Tim On Wed, Aug 5, 2009 at 9:59 AM, Eric Hellman e...@hellman.net wrote: Recent attention to privacy concerns about Google Book Search have led me to investigate whether any libraries are using tools such as proxy servers to enhance patron privacy when using Google Book Search. Similarly, advertising networks (web bugs, for example) could be proxied for the same reason. I would be very interested to hear from any libraries that have done either of these things and of their experiences doing so. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] proxying Google Book Search and advertising networks to protect patron privacy
No doubt throttling is used for API calls, but IP address throttling of the full user interface ought to be managed quite differently. If anyone has seen that occur, I would be interested to hear of it. On Aug 5, 2009, at 2:34 PM, Jon Gorman wrote: On Wed, Aug 5, 2009 at 1:05 PM, Eric Hellman e...@hellman.net wrote: I doubt that very much. It's very common for corporate sites to channel all their traffic through gateways. I would assume that google was smart enough to recognize that your usage pattern was not that of many users coming from a single IP address, but rather that of a harvesting robot. The two activities have very different log signatures. Uh, actually, Google has in the past throttled some services based on the ip address. I'm pretty sure it was mentioned before on this list and I can verify it myself. Look for some of Jonathan Rochkind's questions about a year ago. The original api used with GBS seemed very prone to this. I know others hit issues and when our consortium tried to use a proxy of the original api due to some technical issues they ran into this. (First couple of hundred hits would be golden, the rest just would return http errors). There's a newer one out there now that apparently doesn't use this throttling, but I'm not positive of the details. An organization may still have to warn Google about it. There's a reason why the original api strongly encouraged folks to do things via an ajaxy call on the client. I'm guessing part of the reason for the new api was to address these issues. Jon Gorman Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
IIRC you can also elide url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx as that is the default. If you don't add DC metadata, which seems like a good idea, you'll definitely want to include something that will help you to persist your replacement record. For example, a label or description for the link. On Sep 14, 2009, at 9:48 AM, O.Stephens wrote: I'm working on a project called TELSTAR (based at the Open University in the UK) which is looking at the integration of resources into an online learning environment (see http://www.open.ac.uk/telstar for the basic project details). The project focuses on the use of References/Citations as the way in which resources are integrated into the teaching material/environment. We are going to use OpenURL to provide links (where appropriate) from references to full text resources. Clearly for journals, articles, and a number of other formats this is a relatively well understood practice, and implementing this should be relatively straightforward. However, we also want to use OpenURL even where the reference is to a more straightforward web resource - e.g. a web page such as http://www.bbc.co.uk . This is in order to ensure that links provided in the course material are persistent over time. A brief description of what we perceive to be the problem and the way we are tackling it is available on the project blog at http://www.open.ac.uk/blogs/telstar/2009/09/14/managing-link-persistence-with-openurls/ (any comments welcome). What we are considering is the best way to represent a web page (or similar - pdf etc.) in an OpenURL. It looks like we could do something as simple as: http://resolver.address/?url_ver=Z39.88-2004&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx&rft_id=http%3A%2F%2Fwww.bbc.co.uk Is this sufficient (and correct)? Should we consider passing fuller metadata? If the latter should we use the existing KEV DC representation, or should we be looking at defining a new metadata format? Any help would be very welcome.
Thanks, Owen Owen Stephens TELSTAR Project Manager Library and Learning Resources Centre The Open University Walton Hall Milton Keynes, MK7 6AA T: +44 (0) 1908 858701 F: +44 (0) 1908 653571 E: o.steph...@open.ac.ukmailto:o.steph...@open.ac.uk The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England Wales and a charity registered in Scotland (SC 038302). Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
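[Editor's note] The OpenURL Owen proposes, with Eric's suggested label added for persistence, can be sketched like this. The resolver base address is a placeholder; `rft.title` as the label field is an assumption.

```python
from urllib.parse import urlencode

# Minimal sketch of an OpenURL for a plain web resource: the target URL
# goes percent-encoded into rft_id. The resolver address is invented.
def openurl_for_web_page(resolver_base, target_url, label=None):
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_id", target_url),  # urlencode percent-encodes the URL
    ]
    if label:
        # optional descriptive metadata to help maintain the link later;
        # using rft.title for this is an assumption, not part of the spec
        params.append(("rft.title", label))
    return resolver_base + "?" + urlencode(params)
```

For example, `openurl_for_web_page("http://resolver.address/", "http://www.bbc.co.uk", "BBC homepage")` yields a query string with `rft_id=http%3A%2F%2Fwww.bbc.co.uk`, matching the form in Owen's message.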
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Could you give us examples of http urls in rft_id that are like that? I've never seen such. On Sep 14, 2009, at 11:58 AM, Jonathan Rochkind wrote: In general, identifiers in URI form are put in rft_id that are NOT meant for providing to the user as a navigable URL. So the receiving software can't assume that whatever url is in rft_id represents an actual access point (available to the user) for the document. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
As I'm sure you're aware, the OpenURL spec only talks about providing services, and resolving to full text is only one of many possible services. If *all* you know about a referent is the url, then redirecting the user to the url is going to be the best you can do in almost all cases. In particular, I don't think the dublin core profile, which is what Owen suggests to use, has much to say about resolving to full text. http://catalog.library.jhu.edu/bib/NUM identifies a catalog record -- I mean, what else would you use to id the catalog record? Unless you've implemented the httpRange-14 303 redirect recommendation in your catalog (http://www.w3.org/TR/cooluris/), it shouldn't be construed as identifying the thing it describes, except as a private id, and you should use another field for that. IIRC Google, Worldcat, and Wikipedia used rft_id. I'm not in a position to answer any questions about specific link resolver software that I no longer am associated with, however good it is/was. Eric On Sep 14, 2009, at 12:57 PM, Jonathan Rochkind wrote: Well, in the 'wild' I barely see any rft_id's at all, heh. Aside from the obvious non-http URIs in rft_id, I'm not sure if I've seen http URIs that don't resolve to full text. BUT -- you can do anything with an http URI that you can do with an info uri. There is no requirement or guarantee in any spec that an HTTP uri will resolve at all, let alone resolve to full text for the document cited in an OpenURL. The OpenURL spec says that rft_id is An Identifier Descriptor unambiguously specifies the Entity by means of a Uniform Resource Identifier (URI). It doesn't say that it needs to resolve to full text. In my own OpenURL link-generating software, I _frequently_ put identifiers which are NOT open access URLs to full text in rft_id. Because there's no other place to put them.
And I frequently use http URIs even for things that don't resolve to full text, because the conventional wisdom is to always use http for URIs, whether or not they resolve at all, and certainly no requirement that they resolve to something in particular like full text. Examples that I use myself when generating OpenURL rft_ids, of http URIs that do not resolve to full text include ones identifying bib records in my own catalog: http://catalog.library.jhu.edu/bib/NUM [ Will resolve to my catalog record, but not to full text!] Or similarly, WorldCat http URIs. Or, an rft_id to unambiguously identify something in terms of its Google Books record: http://books.google.com/books?id=tl8MCAAJ Also, URIs to unambiguously specify a referent in terms of sudoc: http://purl.org/NET/sudoc/ [sudoc]= will, as the purl is presently set up by rsinger, resolve to a GPO catalog record, but there's no guarantee of online public full text. I'm pretty sure what I'm doing is perfectly appropriate based on the definition of rft_id, but it's definitely incompatible with a receiving link resolver assuming that all rft_id http URIs will resolve to full text for the rft cited. I don't think it's appropriate to assume that just because a URI is http, that means it will resolve to full text -- it's merely an identifier that unambiguously specifies the referent, same as any other URI scheme. Isn't that what the sem web folks are always insisting in the arguments about how it's okay to use http URIs for any type of identifier at all -- that http is just an identifier (at least in a context where all that's called for is a URI to identify), you can't assume that it resolves to anything in particular? (Although it's nice when it resolves to RDF saying more about the thing identified, it's certainly not expected that it will resolve to full text). Eric, out of curiosity, will your own link resolver software automatically take rft_id's and display them to the user as links?
Jonathan Eric Hellman wrote: Could you give us examples of http urls in rft_id that are like that? I've never seen such. On Sep 14, 2009, at 11:58 AM, Jonathan Rochkind wrote: In general, identifiers in URI form are put in rft_id that are NOT meant for providing to the user as a navigable URL. So the receiving software can't assume that whatever url is in rft_id represents an actual access point (available to the user) for the document. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
It's not correct to say that rft_val has no use; when used, it should contain a URL-encoded package of XML or KEV metadata. It would be correct to say it is very rarely used. On Sep 14, 2009, at 1:40 PM, Rosalyn Metz wrote: ok no one shoot me for doing this: in section 9.1 Namespaces [Registry] of the OpenURL standard (z39.88) it actually provides an example of using a URL in the rfr_id field, and i wonder why you couldn't just do the same thing for the rft_id also there is a field called rft_val which currently has no use. this might be a good one for it. just my 2 cents. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
If you have a URL that can be used for a resource that you are describing in metadata, resolvers can do a better job providing services to users if it is put in the openurl. The only place to put it is rft_id. So let's not let one resolver's incapacity prevent other resolvers from providing better services. If you want to make an OpenURL for a web page, its url is in almost all cases the best unambiguous identifier you could possibly think of. Putting dead http URIs in rft_id is not really a very useful thing to do. On Sep 14, 2009, at 1:45 PM, Jonathan Rochkind wrote: Eric Hellman wrote: http://catalog.library.jhu.edu/bib/NUM identifies a catalog record- I mean what else would you use to id the catalog record. unless you've implemented the http-range 303 redirect recommendation in your catalog (http://www.w3.org/TR/cooluris/), it shouldn't be construed as identifying the thing it describes, except as a private id, and you should use another field for that. Of course. But how is a link resolver supposed to know that, when all it has is rft_id=http://catalog.library.jhu.edu/bib/NUM ?? I suggest that this is a kind of ambiguity in OpenURL: many of us are using rft_id in some contexts to simply provide an unambiguous identifier, and in other cases to provide an end-user access URL (which may not be a good unambiguous identifier at all!), with no way for the link resolver to tell which was intended. So I don't think it's a good idea to do this. I think the community should choose one, and based on the language of the OpenURL spec, rft_id is meant to be an unambiguous identifier, not an end-user access URL. So ideally another way would be provided to send something intended as an end-user access URL in an OpenURL. But OpenURL is pretty much a dead spec that is never going to be developed further in any practical way. So, really, I recommend avoiding OpenURL in favor of standard web approaches whenever you can.
But sometimes you can't, and OpenURL really is the best tool for the job. I use it all the time. And it constantly frustrates me with its lack of flexibility and clarity, leading to people using it in ambiguous ways. Jonathan Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
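The kind of ContextObject being argued over above can be sketched in a few lines. The following builds a KEV (key/encoded-value) OpenURL 1.0 query string whose rft_id carries the cited page's own URL, as Eric advocates; the resolver base URL and the title are hypothetical.

```javascript
// Build a KEV OpenURL 1.0 ContextObject whose rft_id is the HTTP URI
// of the web page being cited. The resolver base URL is hypothetical.
function buildOpenURL(resolverBase, pageUrl, title) {
  const params = new URLSearchParams({
    url_ver: "Z39.88-2004",
    url_ctx_fmt: "info:ofi/fmt:kev:mtx:ctx",
    rft_val_fmt: "info:ofi/fmt:kev:mtx:dc", // Dublin Core KEV format
    "rft.title": title,
    rft_id: pageUrl, // the page's URL as the referent identifier
  });
  return resolverBase + "?" + params.toString();
}

const link = buildOpenURL(
  "http://resolver.example.edu/openurl", // hypothetical resolver
  "http://www.bbc.co.uk/",
  "BBC Homepage"
);
```

A receiving resolver sees rft_id=http%3A%2F%2Fwww.bbc.co.uk%2F and must decide, as Jonathan notes, whether that URI is meant as an identifier or as an access URL; nothing in the ContextObject itself says which.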
Re: [CODE4LIB] Implementing OpenURL for simple web resources
What I'm doing is perfectly appropriate based on the definition of rft_id, but it's definitely incompatible with a receiving link resolver assuming that all rft_id HTTP URIs will resolve to full text for the rft cited. I don't think it's appropriate to assume that just because a URI is HTTP, that means it will resolve to full text -- it's merely an identifier that unambiguously specifies the referent, same as any other URI scheme. Isn't that what the sem web folks are always insisting on in the arguments about how it's okay to use HTTP URIs for any type of identifier at all -- that an HTTP URI is just an identifier (at least in a context where all that's called for is a URI to identify), and you can't assume that it resolves to anything in particular? (Although it's nice when it resolves to RDF saying more about the thing identified, it's certainly not expected that it will resolve to full text.) Eric, out of curiosity, will your own link resolver software automatically take rft_ids and display them to the user as links? Jonathan Eric Hellman wrote: Could you give us examples of HTTP URLs in rft_id that are like that? I've never seen such. On Sep 14, 2009, at 11:58 AM, Jonathan Rochkind wrote: In general, identifiers in URI form are put in rft_id that are NOT meant for providing to the user as a navigable URL. So the receiving software can't assume that whatever URL is in rft_id represents an actual access point (available to the user) for the document. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
You're absolutely correct; in fact, all the ent_val fields are reserved for future use! They went in and out of the spec; I'm trying to remember from my notes. It's better that they're out. On Sep 14, 2009, at 2:05 PM, Rosalyn Metz wrote: sorry eric, i was reading straight from the documentation, and according to it, it has no use. On Mon, Sep 14, 2009 at 1:55 PM, Eric Hellman e...@hellman.net wrote: It's not correct to say that rft_val has no use; when used, it should contain a URL-encoded package of XML or KEV metadata. It would be correct to say it is very rarely used. On Sep 14, 2009, at 1:40 PM, Rosalyn Metz wrote: ok, no one shoot me for doing this: in section 9.1 Namespaces [Registry] of the OpenURL standard (Z39.88) it actually provides an example of using a URL in the rfr_id field, and i wonder why you couldn't just do the same thing for the rft_id. also there is a field called rft_val which currently has no use. this might be a good one for it. just my 2 cents. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
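For concreteness, here is a sketch of the rarely used metadata-by-value mechanism Eric describes: an inner KEV package, itself URL-encoded, carried in rft_val. The field names follow the KEV journal format; the values are made up for illustration.

```javascript
// Sketch of metadata-by-value: an inner KEV metadata package,
// URL-encoded, carried inside the ContextObject in rft_val.
// The journal-title and volume values are hypothetical.
function withByValueMetadata(fields) {
  const pkg = new URLSearchParams(fields).toString(); // inner KEV package
  return new URLSearchParams({
    url_ver: "Z39.88-2004",
    rft_val_fmt: "info:ofi/fmt:kev:mtx:journal", // format of the package
    rft_val: pkg, // the URL-encoded package itself
  }).toString();
}

const ctx = withByValueMetadata({
  "rft.jtitle": "American Quarterly",
  "rft.volume": "58",
});
```

Note the double encoding: the inner package's `=` and `&` come out as `%3D` and `%26` in the outer query string, which is part of why this mechanism saw so little use.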
Re: [CODE4LIB] Implementing OpenURL for simple web resources
I can't imagine that SFX has some fundamental assumption that an HTTP URL in rft_id is never ever something that can be used for access, and even if it did, it would be letting the tail wag the dog to suggest that other resolvers should not do so; some do. There are also resolvers that pre-check URLs -- at least there were before Ex Libris acquired LinkFinder Plus. So it's possible for a resolver agent to discover whether a URL leads somewhere or not. On Sep 14, 2009, at 2:23 PM, Jonathan Rochkind wrote: I disagree. Putting in URIs that unambiguously identify the referent, and in some cases provide additional 'hooks' by virtue of additional identifiers (local bibID, OCLCnum, LCCN, etc.), is a VERY useful thing to do, to me. Whether or not they resolve to an end-user-appropriate web page. If you want to use rft_id to instead be an end-user-appropriate access URL (which may or may not be a suitable unambiguous persistent identifier), I guess it depends on how many of the actually existing in-the-wild link resolvers will, in what contexts, treat an HTTP URI as an end-user-appropriate access URL. If a lot of the in-the-wild link resolvers will, that may be a practically useful thing to do. Thus me asking if the one you had knowledge of did or didn't. I'm 99% sure that SFX will not, in any context, treat an rft_id as an appropriate end-user access URL. Certainly providing an appropriate end-user access URL _is_ a useful thing to do. So is providing an unambiguous persistent identifier. Both are quite useful things to do; they're just different things. Shame that OpenURL kinda implies that you can use the same data element for both. OpenURL's not alone there, though; DC does the same thing. Jonathan Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Nate's point is what I was thinking about in this comment in my original reply: If you don't add DC metadata, which seems like a good idea, you'll definitely want to include something that will help you to persist your replacement record -- for example, a label or description for the link. I should also point out a solution that could work for some people, but not you: put rewrite rules in the gateways serving your network. A bit dangerous and kludgy, but we've seen kludgier things. On Sep 14, 2009, at 4:24 PM, O.Stephens wrote: Nate has a point here -- what if we end up with a commonly used URI pointing at a variety of different things over time, so that it is used to indicate different content each time? However, the problem with a 'short URL' solution (tr.im, PURL, etc.), or indeed any locally assigned identifier that acts as a key, is that, as described in the blog post, you need prior knowledge of the short URL/identifier to use it. The only 'identifier' our authors know for a website is its URL -- and it seems contrary for us to ask them to use something else. I'll need to think about Nate's point -- is this common or an edge case? Is there any other approach we could take? Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
I could see a number of advantages to this in the local context:

- Consistency: references to websites get treated the same as references to journal articles -- this means a single approach on the course side, with flexibility.
- Usage stats: we could collect these anyway, but if we do it via OpenURL we get them in the same place as the stats about usage of other scholarly material, and could consider driving personalisation services off the data (like the bX product from Ex Libris).
- Appropriate copy problem: for resources we subscribe to with authentication mechanisms, there is (I think) an equivalent of the 'appropriate copy' issue as with journal articles -- we can push a URI for 'Web of Science' to the correct version of Web of Science via a local authentication method (using EZproxy, for us).

The problem with the approach (as Nate and Eric mention) is that any approach that relies on the URI as an identifier (whether using OpenURL or a script) is going to have problems, as the same URI could be used to identify different resources over time. I think Eric's suggestion of using additional information to help differentiate is worth looking at, but I suspect that this is going to cause us problems -- although I'd say that it is likely to cause us much less work than the alternative, which is allocating every single reference to a web resource used in our course material its own persistent URL. The use case we are currently looking at is only within our own (authenticated) learning environment -- so these OpenURLs are not going to appear in the wild, so to some extent perhaps it doesn't matter what we do -- but it still seems sensible to me to look at what 'good practice' might look like. I hope this is clear -- I'm still struggling with some of this, and sometimes it doesn't make complete sense to me, but that's my best stab at explaining my thinking at the moment. Again, I appreciate the comments. Jonathan said "But you seem to understand what's up." I wish I did! 
I guess that I'm reasonably confident that the approach I'm describing has some chance of doing the job -- whether it is the best approach I'm not so sure about. Owen The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England and Wales and a charity registered in Scotland (SC 038302). Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
I think using locally meaningful IDs in rft_id is a misuse and a mistake. Locally meaningful data should go in rft_dat, accompanied by rfr_id. just sayin' On Sep 15, 2009, at 11:52 AM, Jonathan Rochkind wrote: I do like Ross's solution, if you really wanna use OpenURL. I'm much more comfortable with the idea of including a URI based on your own local service in rft_id than including any old public URL in rft_id. Then at least your link resolver can say: if what's in rft_id begins with (e.g.) http://telstar.open.ac.uk/, THEN I know this is one of these PURL-type things, and I know that sending the user to it will result in a redirect to an end-user-appropriate access URL. Cause that's my concern with putting random URLs in rft_id: that there's no way to know if they are intended as end-user-appropriate access URLs or not, and in putting things in rft_id that aren't really good identifiers for the referent at all. But using your own local service ID, now you really DO have something that's appropriately considered a persistent identifier for the referent, AND you have a straightforward way to tell when the rft_id of this context is intended as an access URL. Jonathan Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
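Eric's rft_dat-plus-rfr_id suggestion looks like this in KEV form. A sketch only: the referrer ID and the local key below are hypothetical.

```javascript
// Carry a locally meaningful key in rft_dat, scoped by rfr_id so the
// resolver knows who assigned it. The info:sid value and the key
// itself are hypothetical illustrations.
function localKeyContextObject(localKey) {
  return new URLSearchParams({
    url_ver: "Z39.88-2004",
    rfr_id: "info:sid/telstar.open.ac.uk:courselinks", // who assigned the key
    rft_dat: localKey, // private data, meaningful only to this referrer
  }).toString();
}

const example = localKeyContextObject("bib1234");
```

The contrast with putting the same key in rft_id is that rft_dat makes no claim of being a globally unambiguous identifier; it is explicitly referrer-private.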
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Yes, you can. On Sep 15, 2009, at 11:41 AM, Ross Singer wrote: I can't remember if you can include both metadata-by-reference keys and metadata-by-value, but you could have by-reference (rft_ref=http://telstar.open.ac.uk/1234&rft_ref_fmt=RIS or something) point at your citation db to return a formatted citation. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
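Ross's by-reference idea, sketched as a KEV query. The citation-db URL is the one from his example; the format token is assumed to be a plain "RIS" label for illustration.

```javascript
// Metadata-by-reference: instead of sending citation fields inline,
// point the resolver at a service that returns them in a named format.
function byReferenceContextObject(refUrl, refFormat) {
  return new URLSearchParams({
    url_ver: "Z39.88-2004",
    rft_ref: refUrl,        // where the resolver can fetch the metadata
    rft_ref_fmt: refFormat, // format of what it will get back, e.g. RIS
  }).toString();
}

const ref = byReferenceContextObject("http://telstar.open.ac.uk/1234", "RIS");
```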
Re: [CODE4LIB] Implementing OpenURL for simple web resources
The process by which a URI comes to identify something other than the stuff you get by resolving it can be mysterious -- I've blogged about it a bit: http://go-to-hellman.blogspot.com/2009/07/illusion-of-internet-identity.html In the case of WorldCat or Google, it's fame. If you think a URI can be usable outside your institution for identification purposes, and your institution can maintain some sort of identification machinery for as long as the OpenURL is expected to be useful, then it's fine to use it in rft_id. If you intend the URI to connote identity only in the context that you're building URLs for, then use rft_dat, which is there for exactly that purpose. On Sep 15, 2009, at 12:17 PM, Jonathan Rochkind wrote: If it's a URI that is indeed an identifier that unambiguously identifies the referent, as the standard says... I don't see how that's inappropriate in rft_id. Isn't that what it's for? I mentioned before that I put things like http://catalog.library.jhu.edu/bib/1234 in my rft_ids. Putting http://somewhere.edu/our-purl-server/1234 in rft_id seems very analogous to me. Both seem appropriate. I'm not sure what makes a URI locally meaningful or not. What makes http://www.worldcat.org/bibID or http://books.google.com/book?id=foo globally meaningful but http://catalog.library.jhu.edu/bib/1234 or http://somewhere.edu/our-purl-server/1234 locally meaningful? If it's a URI that is reasonably persistent and unambiguously identifies the referent, then it's an identifier and is appropriate for rft_id, says me. Jonathan Eric Hellman wrote: I think using locally meaningful IDs in rft_id is a misuse and a mistake. Locally meaningful data should go in rft_dat, accompanied by rfr_id. just sayin' On Sep 15, 2009, at 11:52 AM, Jonathan Rochkind wrote: I do like Ross's solution, if you really wanna use OpenURL. I'm much more comfortable with the idea of including a URI based on your own local service in rft_id than including any old public URL in rft_id. 
Then at least your link resolver can say: if what's in rft_id begins with (e.g.) http://telstar.open.ac.uk/, THEN I know this is one of these PURL-type things, and I know that sending the user to it will result in a redirect to an end-user-appropriate access URL. Cause that's my concern with putting random URLs in rft_id: that there's no way to know if they are intended as end-user-appropriate access URLs or not, and in putting things in rft_id that aren't really good identifiers for the referent at all. But using your own local service ID, now you really DO have something that's appropriately considered a persistent identifier for the referent, AND you have a straightforward way to tell when the rft_id of this context is intended as an access URL. Jonathan Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
[CODE4LIB] Another way to do link maintenance
The thread on Implementing OpenURL for simple web resources inspired me to write an article on all the things that redirectors can be used for: http://go-to-hellman.blogspot.com/2009/09/redirector-chain-mashup-design-pattern.html Having thought about the original problem a bit, it strikes me that going a bit farther than what Ross suggests could be a nice solution. Have an onLoad JavaScript call your link maintenance database and then rewrite the links in your page. This could be implemented in a JSON sort of way (and no OpenURL). Here's why. There will be situations where you want to maintain the anchor text as well as the link, and this solution allows you to do it. Also, a well-crafted JavaScript will allow all the links to work (well, the good ones, at least) even if your link maintenance service goes down or disappears. Eric On Sep 15, 2009, at 11:47 AM, Ross Singer wrote: Oh yeah, one thing I left off -- in Moodle, it would probably make sense to link to the URL in the a tag: <a href="http://bbc.co.uk/">The Beeb!</a> but use a JavaScript onMouseDown action to rewrite the link to route through your funky link resolver path, a la Google. That way, the page works like any normal web page; a right mouse click / Copy Link Location gives the user the real URL to copy and paste, but normal behavior funnels through the link resolver. -Ross. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
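The onLoad rewriting Eric proposes reduces to a small pure function over the page's hrefs. A sketch under the assumption that the maintenance table arrives as JSON from the link maintenance service; in a browser this would run from an onLoad handler over document.links.

```javascript
// Look each href up in a link-maintenance table and substitute the
// current URL. Unknown links are left untouched, so the page still
// works if the maintenance service is down or has no entry.
// The table and URLs below are hypothetical.
function rewriteLinks(hrefs, maintenanceTable) {
  return hrefs.map(function (href) {
    return Object.prototype.hasOwnProperty.call(maintenanceTable, href)
      ? maintenanceTable[href]
      : href; // fall through: good links keep working
  });
}

const rewritten = rewriteLinks(
  ["http://old.example.org/a", "http://stable.example.org/b"],
  { "http://old.example.org/a": "http://new.example.org/a" }
);
```

The fall-through branch is what gives the graceful-degradation property claimed above: an empty or missing table yields the original links unchanged.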
[CODE4LIB] DIY Book Scanner
I was at a conference last Friday where Dan Reetz demoed his open-source homemade book scanner. Code4Libbers who are involved with low-budget scanning projects may want to check it out: http://www.diybookscanner.org/ (website for Dan's DIY book scanner) http://www.instructables.com/id/DIY-High-Speed-Book-Scanner-from-Trash-and-Cheap-C/ (instructions for making the scanner) http://www.diybookscanner.org/news/?p=17 (more pictures) http://www.diybookscanner.org/forum/ (the DIY scanner community forum) Blog posts: Harry Lewis: http://www.bitsbook.com/2009/10/do-it-yourself-book-scanning/ Robin Sloan: http://www.themillions.com/2009/10/bringing-book-scanning-home.html Me: http://go-to-hellman.blogspot.com/2009/10/revolution-will-be-digitized-by-cheap.html Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] XForms EAD editor sandbox available
XForms and Orbeon are very interesting technologies for developing metadata management tools. The ONIX developers have used this stack to produce an interface for ONIX-PL called OPLE that people should try out. http://www.jisc.ac.uk/whatwedo/programmes/pals3/onixeditor.aspx Questions about Orbeon relate to performance and integrability, but I think it's an impressive use of XForms nonetheless. - Eric On Nov 12, 2009, at 1:30 PM, Ethan Gruber wrote: Hello all, Over the past few months I have been working on and off on a research project to develop an XForms-based web editor for EAD finding aids that runs within the Orbeon Tomcat application. While still in a very early alpha stage (I have probably put only 60-80 hours of work into it thus far), I think that it's ready for a general demonstration to solicit opinions, criticism, etc. from librarians and technical staff. Background: For those not familiar with XForms, it is a W3C standard for creating next-generation forms. It is powerful and can allow you to create XML in the way that it is intended to be created, without limits on repeatability, complex hierarchies, or mixed content. Orbeon adds a level on top of that, taking care of all the Ajax calls, serialization, CRUD operations, and a variety of widgets that allow nice features like tabs and autocomplete/autosuggest that can be bound to authority lists and controlled access terms. By default, Orbeon reads and writes data from and to an eXist database that comes packaged with it, but you can have it serialize the XML to disk or have it interact with any REST interface such as Fedora. Goals: Ultimately, I wish to create a system of forms that can open any EAD 2002-compliant XML file without any data loss or XML transformation whatsoever. I think that this is the shortcoming of systems such as Archon and Archivists' Toolkit. I want to integrate authority lists into certain fields with autosuggest (such as corporate names, people, and subjects). 
If there is demand, I can build a public interface for viewing the entire EAD collection, complete with Solr for faceted browse and search, but this is secondary to producing a form that people with some basic archiving knowledge and EAD background can use to easily and effectively create finding aids. A public interface is the easy part, in any case. It wouldn't take more than a week or two to build something fairly nice and robust. Here is the link: http://beta.scholarslab.org:9080/cocoon/eaditor/ I should stress that the application is *not complete.* I am using Cocoon for providing a list of EAD content in the system. I will remove that application eventually and utilize Orbeon's internal pipelining features to achieve the same objective. I haven't delved too deeply into Orbeon's pipelines yet. Here are some things to note:

1. If you click on a link to open the main part of the guide or any of its components, you have to click the Load link at the top of the form. Forms aren't being loaded on page load yet.
2. Elements that accept mixed content per the EAD 2002 schema (e.g. paragraphs) only accept PCDATA. I haven't worked on mixed content yet; it is by far the most challenging aspect of the project.
3. I only have a few c-level elements available to add.
4. Not all did elements are available yet.
5. A lot of the generic attributes, like type and label, are not available for editing yet. This may be the type of thing that is best customized per institution relative to their own best practices. I don't want more input fields than necessary right now.
6. The only thing you can add into the archdesc right now is the dsc. Once I finish all of the c-level elements, I can just put some xi:includes into the archdesc XForm file to show them at the archdesc level.

I think those are the major issues for now. As I stated earlier, this is sort of a pre-alpha. The project is open source and available (through svn) to anyone who wants it: http://code.google.com/p/eaditor/ . 
I have put together an easy package to get the application up and running without difficulty. All you have to do is unzip the download, go into the Apache Tomcat folder and execute the startup script. This assumes you have nothing running on port 8080 already. Download page: http://code.google.com/p/eaditor/downloads/list Wiki instructions: http://code.google.com/p/eaditor/wiki/QuickstartInstallation?ts=1257887453&updated=QuickstartInstallation Comments, questions, criticism welcome. The editor is a sandbox. Feel free to experiment. Ethan Gruber University of Virginia Library Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Assigning DOI for local content
Having incorporated the handle client software into my own stuff rather easily, I'm pretty sure that's not true. On Nov 19, 2009, at 12:51 PM, Ross Singer wrote: The caveat being that the initial access point is provided via HTTP. But then again, so is http://hdl.handle.net/, which is, in fact, the only way currently in practice to dereference handles. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Assigning DOI for local content
For example, if you don't want to rely on dx.doi.org as your gateway to the handle system for DOI resolution, it would be quite easy for me to deploy my own gateway at dx.hellman.net. I might want to do this if I were an organization paranoid about security and didn't want to disclose to anybody which DOIs my organization was resolving. Or, I might want to directly access metadata in the handle system without going through the HTTP gateways, to provide a service other than resolution. Does this answer your question, Ross? On Nov 20, 2009, at 2:31 PM, Ross Singer wrote: On Fri, Nov 20, 2009 at 2:23 PM, Eric Hellman e...@hellman.net wrote: Having incorporated the handle client software into my own stuff rather easily, I'm pretty sure that's not true. Fair enough. The technology is binding-independent. So you are using and sharing handles using some protocol other than HTTP? I'm more interested in the sharing part of that question. What is the format of the handle identifier in this context? What advantage does it bring over HTTP? -Ross. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Assigning DOI for local content
On Nov 23, 2009, at 1:32 PM, Ross Singer wrote: On Mon, Nov 23, 2009 at 1:07 PM, Eric Hellman e...@hellman.net wrote: Does this answer your question, Ross? Yes, sort of. My question was not so much if you can resolve handles via bindings other than HTTP (since that's one of the selling points of handles) as it was: do people actually use this in the real world? Well, the short answer to that question is yes. I think the discussion veered out of the zone of my understanding the point of it. The original question related to whether a journal should register CrossRef DOIs, and the short answer to that, as far as I'm concerned, is an emphatic yes. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
[CODE4LIB] Support for attending Code4Lib2010
I hope to be in Asheville. But with the Global Economic Downturn, I worry that some people who might have a lot to contribute and the most to gain may be unable to go due to having lost their job or being in a library with horrific budget cuts. So, together with Eric Lease Morgan (who has been involved with Code4Lib from the very start) I'm putting up a bit of money to support the expenses of people who want to go to Code4Lib 2010. If other donors can join Eric and myself, that would be wonderful, but so far I'm guessing that together we can support the travel expenses of two relatively frugal people. If you would like to be considered, please send me an email as soon as possible, and before I wake up on Monday, December 14 at the latest. Please describe your economic hardship, your travel budget, and what you hope to get from the conference. Eric and I will use arbitrary and uncertain methods to decide who to support, and we'll inform you of our decision in time for you to register or not on Wednesday December 16, when registration opens. more at http://go-to-hellman.blogspot.com/2009/12/supporting-attendance-at-code4lib.html Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA glue...@twitter e...@hellman.net http://go-to-hellman.blogspot.com/
[CODE4LIB] Update: Support for attending Code4Lib2010
We now have four community members joining together to support the expenses of people who want to go to Code4Lib 2010, so it's likely that we'll be able to support more than two people's travel expenses. I should mention that support will be informal and discreet; it's not like the scholarships offered by Brown/OSU. If you would like to be considered, please send me an email as soon as possible, and before I wake up on Monday, December 14, at the latest. Please describe your economic hardship, your travel budget, and what you hope to get from the conference. We will use arbitrary and uncertain methods to decide who to support, and we'll inform you of our decision in time for you to register or not on Wednesday December 16, when registration opens. If you want to go and money's a problem, don't hesitate to ask. more at http://go-to-hellman.blogspot.com/2009/12/supporting-attendance-at-code4lib.html Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA glue...@twitter e...@hellman.net http://go-to-hellman.blogspot.com/
[CODE4LIB] Update: Support for attending Code4Lib2010
I'm happy to report that the ad hoc committee to support attendance at Code4Lib will be able to provide the requested help. I'd also like to thank Serials Solutions for their offer of support. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
[CODE4LIB] OpenURL aggregator not doing so well
Take a look at http://openurl.code4lib.org/aggregator Any ideas how to make it work better? Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Twitter annotations and library software
I think Twitter annotations would be a good use for http://thing-described-by.org/ or a functional equivalent. The payload of the annotation would simply be a description URI and a namespace and value for descriptions by reference.

1. The mechanism would be completely generic, usable for any sort of reference, not siloed in libraryland. In other words, we might actually get people to adopt it.
2. Libraryland descriptions could use BIBO or RDA or both or whatever, and could be concise or verbose.
3. Descriptions could be easily reused.

I'll write this up a bit more and would be interested in comments, but it's where this post was going: http://go-to-hellman.blogspot.com/2010/04/when-shall-we-link.html Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Twitter annotations and library software
I mean, really, if the folks at RefWorks, EndNote, Papers, Zotero and LibX don't have crash programs underway to integrate Twitter clients into their software to send and receive reference metadata payloads they can use in the Twitter annotation field, they really ought to hire me to come and bash some sense into them. Really. I still think by-reference payloads, as described at http://go-to-hellman.blogspot.com/2010/04/when-shall-we-link.html, would go the farthest, but surely these folks know very well what they can send and receive. Eric On Apr 28, 2010, at 4:17 AM, Jakob Voss wrote: Hi, it's funny how quickly you vote against BibTeX, but at least it is a format that is frequently used in the wild to create citations. If you call BibTeX undocumented and garbage, then how do you call MARC, which is far more difficult to make use of? My assumption was that there are specific use cases for bibliographic data in Twitter annotations: I. Identify a publication -- this can *only* be done seriously with identifiers like ISBN, DOI, OCLCNum, LCCN, etc. II. Deliver a citation -- use a citation-oriented format (BibTeX, CSL, RIS). I was not voting explicitly for BibTeX, but at least there is a large community that can make use of it. I strongly favour CSL (http://citationstyles.org/) because: - there is a JavaScript CSL processor. JavaScript is kind of a punishment, but it is the natural environment for the Web 2.0 mashup crowd that is going to implement applications that use Twitter annotations - there are dozens of CSL citation styles, so you can display a citation in any way you want As Ross pointed out, RIS would be an option too, but I miss the easy open-source tools that create citations from RIS data. Any other relevant format that I know (Bibont, MODS, MARC, etc.) does not aim at identification or citation in the first place but tries to model the full variety of bibliographic metadata. If your use case is III. 
Provide semantic properties and connections of a publication -- then you should look at the Bibliographic Ontology. But III does *not* just subsume use case II; it is a different story, one that is not being told by normal people but only by metadata experts, semantic web gurus, library system developers, etc. (I would count myself among these groups). If you want such complex data, then you should use systems other than Twitter for data exchange anyway. A list of CSL metadata fields can be found at http://citationstyles.org/downloads/specification.html#appendices and the JavaScript processor (which is also used in Zotero) provides more information for developers: http://groups.google.com/group/citeproc-js Cheers Jakob P.S.: An example of a CSL record from the JavaScript client:

{ "title": "True Crime Radio and Listener Disenchantment with Network Broadcasting, 1935-1946", "author": [ { "family": "Razlogova", "given": "Elena" } ], "container-title": "American Quarterly", "volume": "58", "page": "137-158", "issued": { "date-parts": [ [2006, 3] ] }, "type": "article-journal" }

-- Jakob Voß jakob.v...@gbv.de, skype: nichtich Verbundzentrale des GBV (VZG) / Common Library Network Platz der Goettinger Sieben 1, 37073 Göttingen, Germany +49 (0)551 39-10242, http://www.gbv.de Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
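Once quoted as valid JSON, a CSL record like Jakob's can be consumed with almost no code. A toy formatter -- emphatically not citeproc-js, and the citation style it emits is invented for illustration -- just to show how directly CSL fields map to a citation:

```javascript
// Jakob's CSL-JSON record, as valid JSON.
const record = {
  "title": "True Crime Radio and Listener Disenchantment with Network Broadcasting, 1935-1946",
  "author": [{ "family": "Razlogova", "given": "Elena" }],
  "container-title": "American Quarterly",
  "volume": "58",
  "page": "137-158",
  "issued": { "date-parts": [[2006, 3]] },
  "type": "article-journal"
};

// Toy formatter: first author, year, title, container, volume, pages.
// A real application would hand the record to a CSL processor instead.
function toyCitation(csl) {
  const name = csl.author[0].family + ", " + csl.author[0].given;
  const year = csl.issued["date-parts"][0][0];
  return name + " (" + year + "). " + csl.title + ". " +
    csl["container-title"] + " " + csl.volume + ": " + csl.page + ".";
}
```

The point of CSL, as Jakob says, is that the same record plus a different style file yields a differently formatted citation; this sketch hard-codes one layout only to keep it self-contained.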
[CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
Since this thread has turned into a discussion of OpenURL... I have to say that during the OpenURL 1.0 standardization process, we definitely had moments of despair. Today, I'm willing to derive satisfaction from "it works" and overlook shortcomings. It might have been otherwise. What I hope for is that OpenURL 1.0 eventually takes a place alongside SGML as a too-complex standard that directly paves the way for a universally adopted foundational technology like XML. What I fear is that it takes a place alongside MARC as an anachronistic standard that paralyzes an entire industry. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Twitter annotations and library software
OK, back to Tim's specific question. I'm not sure why you want to put bib data in a tweet at all for your application. Why not just use a shortened URL pointing at your page of metadata? That page could offer metadata via BIBO, Open Graph and FOAF in RDFa, COinS, RIS, etc., using established methods to serve multiple applications at once. When Twitter annotations come along, the URL can be put in the annotation field. Eric On Apr 21, 2010, at 6:08 AM, Tim Spalding wrote: Have C4Lers looked at the new Twitter annotations feature? http://www.sitepoint.com/blogs/2010/04/19/twitter-introduces-annotations-hash-tags-become-obsolete/ I'd love to get some people together to agree on a standard book annotation format, so two people can tweet about the same book or other library item, and they or someone else can pull that together. I'm inclined to start adding it to the I'm talking about and I'm adding links on LibraryThing. I imagine it could be easily added to many library applications too—anywhere there is or could be a "share this on Twitter" link, including OPACs, citation managers, library event feeds, etc. Also, wouldn't it be great to show the world another interesting, useful and cool use of library data that OCLC's rules would prohibit? So the question is the format. Only a maniac would suggest MARC. For size and other reasons, even MODS is too much. But perhaps we can borrow the barest of field names from MODS, COinS, or from the most commonly used bibliographic format, Amazon XML. Thoughts? Tim -- Check out my library at http://www.librarything.com/profile/timspalding Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
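The "page of metadata" approach is easy to sketch for the COinS piece: embed an OpenURL 1.0 book ContextObject in a span's title attribute so citation tools can scrape it. The title, author, and ISBN below are illustrative placeholders, not data from the thread:

```python
# Sketch: emit a COinS <span> carrying an OpenURL 1.0 (Z39.88-2004) book
# ContextObject, one of the "established methods" the reply mentions.
# The book data here is a made-up example.
from urllib.parse import urlencode
from html import escape

def coins_span(title, author, isbn):
    """Build a COinS <span> for a book, suitable for embedding in HTML."""
    kev = urlencode({
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",
        "rft.btitle": title,
        "rft.au": author,
        "rft.isbn": isbn,
    })
    # escape() turns the KEV ampersands into &amp; for the HTML attribute
    return '<span class="Z3988" title="%s"></span>' % escape(kev)

print(coins_span("Moby Dick", "Melville, Herman", "0123456789"))
```

The same page could carry RDFa and a link to RIS alongside this span; nothing about the formats conflicts, which is the argument for a URL over inline bib data.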
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
Even the best standard in the world can only do so much! On Apr 29, 2010, at 1:14 PM, Ed Summers wrote: On Thu, Apr 29, 2010 at 12:08 PM, Eric Hellman e...@hellman.net wrote: Since this thread has turned into a discussion on OpenURL... I have to say that during the OpenURL 1.0 standardization process, we definitely had moments of despair. Today, I'm willing to derive satisfaction from "it works" and overlook shortcomings. It might have been otherwise. Personally, I've followed enough OpenURL enabled hyperlink dead ends to contest "it works". //Ed
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
May I just add here that of all the things we've talked about in these threads, perhaps the only thing that will still be in use a hundred years from now will be Unicode. إن شاء الله On Apr 29, 2010, at 7:40 PM, Alexander Johannesen wrote: However, I'd like to add here that I happen to love XML, even from an integration perspective, but maybe that stems from understanding all those tedious bits no one really cares about, like id(s) and refid(s) (and all the indexing goodness that comes from it), canonical datasets, character sets and Unicode, all that schema craziness (including Schematron and RelaxNG), XPath and XQuery (and all the sub-standards), XSLT and so on. I love it all, and not because of the generic simplicity itself (simple in the default mode of operation, I might add), but because of a) modeling advantages, b) cross-environment language and schema support, and c) ease of creation. (I don't like how easily well-formedness breaks, though. That sucks) Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
Ha! One of the things OpenURL 1.0 fixed was to wire in UTF-8 encoding. Much of the MARC data in circulation also uses UTF-8 encoding. Some of it even uses it correctly. On Apr 29, 2010, at 8:58 PM, Alexander Johannesen wrote: On Fri, Apr 30, 2010 at 10:54, Eric Hellman e...@hellman.net wrote: May I just add here that of all the things we've talked about in these threads, perhaps the only thing that will still be in use a hundred years from now will be Unicode. إن شاء الله May I remind you that we're still using MARC. Maybe you didn't mean in the library world ... *rimshot* Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
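The joke lands because MARC's UTF-8 problems are usually not invalid bytes but valid UTF-8 carrying mangled text. A minimal sketch of the classic double-encoding failure, plus one recovery heuristic (the repair assumes exactly one accidental Latin-1 round trip, which is an assumption, not a general fix):

```python
# The mangled bytes below are still *valid* UTF-8, which is why naive
# encoding validation passes them; only the text itself is wrong.
text = "café"
utf8 = text.encode("utf-8")                        # b'caf\xc3\xa9'
mojibake = utf8.decode("latin-1").encode("utf-8")  # a lossy hand-off somewhere
assert mojibake.decode("utf-8") == "cafÃ©"         # decodes fine, reads wrong

# Recovery heuristic: undo one accidental Latin-1 round trip.
repaired = mojibake.decode("utf-8").encode("latin-1").decode("utf-8")
assert repaired == "café"
```

Records like this "use UTF-8" in the sense that they declare it and decode cleanly; catching them takes content-level heuristics, not an encoding check.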
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
Eek. I was hoping for something much simpler. Do you realize that you're asking for a service taxonomy? On Apr 30, 2010, at 10:22 AM, Ross Singer wrote: I think the basis of a response could actually be another context object with the 'services' entity containing a list of services/targets that are formatted in some way that is appropriate for the context and the referent entity enhanced with whatever the resolver can add to the puzzle.
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
I'll try to find out. Sent from Eric Hellman's iPhone On May 2, 2010, at 4:10 PM, stuart yeates stuart.yea...@vuw.ac.nz wrote: But the interesting use case isn't OpenURL over HTTP, the interesting use case (for me) is OpenURL on a disconnected eBook reader resolving references from one ePub to other ePub content on the same device. Can OpenURL be used like that?
[CODE4LIB] Safari extensions
Has anyone played with the new Safari extensions capability? I'm looking at you, Godmar. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] MARCXML - What is it for?
I think you'd have a very hard time demonstrating any speed advantage to MARC over MARCXML. XML parsers have been speed-optimized out the wazoo. If there exists a MARC parser that has ever been speed-optimized without serious compromise, I'm sure someone on this list will have a good story about it. On Oct 25, 2010, at 3:05 PM, Patrick Hochstenbach wrote: Dear Nate, There is a trade-off: do you want very fast processing of data? Go for binary data. Do you want to share your data globally and easily in many (not per se library-related) environments? Go for XML/RDF. Open your data and do both :-) Pat Sent from my iPhone On 25 Oct 2010, at 20:39, Nate Vack njv...@wisc.edu wrote: Hi all, I've just spent the last couple of weeks delving into and decoding a binary file format. This, in turn, got me thinking about MARCXML. In a nutshell, it looks like it's supposed to contain the exact same data as a normal MARC record, except in XML form. As in, it should be round-trippable. What's the advantage to this? I can see using a human-readable format for poorly-documented file formats -- they're relatively easy to read and understand. But MARC is well, well-documented, with more than one free implementation in cursory searching. And once you know a binary file's format, it's no harder to parse than XML, and the data's smaller and processing faster. So... why the XML? Curious, -Nate Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
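To make the parsing comparison concrete, here is a toy side-by-side: the same field pulled from a MARCXML-style record with the stdlib's C-accelerated parser, and from an ISO 2709-style binary record by offset arithmetic. Both records are fabricated minimal examples, and any timings this prints say nothing about production MARC tooling:

```python
# Illustrative only: both records are hand-built toys, not real MARC data.
import timeit
import xml.etree.ElementTree as ET

XML = ('<record><datafield tag="245">'
       '<subfield code="a">Moby Dick</subfield></datafield></record>')

# Toy ISO 2709 layout: 24-char leader (record length at 0-4, base address of
# data at 12-16), one 12-char directory entry (tag, length, start), then data.
FT = "\x1e"  # field terminator
REC = "00048nam a2200037ua 4500" + "245001000000" + FT + "Moby Dick" + FT + "\x1d"

def parse_xml():
    return ET.fromstring(XML).find(".//subfield").text

def parse_binary():
    base = int(REC[12:17])            # base address of data, from the leader
    length = int(REC[27:31])          # field length, from the directory entry
    start = int(REC[31:36])           # field start offset within the data
    return REC[base + start: base + start + length].rstrip(FT)

assert parse_xml() == parse_binary() == "Moby Dick"
for name, fn in (("xml", parse_xml), ("binary", parse_binary)):
    print(name, timeit.timeit(fn, number=10000))
```

The binary slice is simple once the offsets are known, which is Nate's point; the counterpoint in the reply is that the XML path gets decades of parser optimization and tooling for free.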
Re: [CODE4LIB] mailing list administratativia
I vote for changing the limit threshold to PI * (eventual length of this meta-thread). On Oct 27, 2010, at 3:37 PM, Alexander Johannesen wrote: On Thu, Oct 28, 2010 at 2:44 AM, Doran, Michael D do...@uta.edu wrote: Can that limit threshold be raised? If so, are there reasons why it should not be raised? Is it to throttle spam or something? 50 seems rather low, and it's rather depressing to have a lively discussion throttled like that. Not to mention I thought I was simply kicked out for living things up (especially given my reasonable follow-up was where the throttling began). Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen --- Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] mailing list administratativia
I expect the length of the thread to be irrational; so perhaps that's not a problem. On Oct 27, 2010, at 6:18 PM, Ray Denenberg, Library of Congress wrote: I think the constraint is that it has to be a rational number. -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Eric Hellman Sent: Wednesday, October 27, 2010 5:58 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] mailing list administratativia I vote for changing the limit threshold to PI * (eventual length of this meta-thread). On Oct 27, 2010, at 3:37 PM, Alexander Johannesen wrote: On Thu, Oct 28, 2010 at 2:44 AM, Doran, Michael D do...@uta.edu wrote: Can that limit threshold be raised? If so, are there reasons why it should not be raised? Is it to throttle spam or something? 50 seems rather low, and it's rather depressing to have a lively discussion throttled like that. Not to mention I thought I was simply kicked out for living things up (especially given my reasonable follow-up was where the throttling began). Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen --- Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] c4l2011 location + reg. open time
I believe that would be Indiana Memorial Union on the campus of IU in Bloomington, Indiana Sent from my iPad On Dec 8, 2010, at 10:50 PM, Karen Coyle li...@kcoyle.net wrote: I can't find anything on the wiki that says WHERE c4l2011 will be. (I thought IMU was a hint, but that comes out as International Medical University in Malaysia as the top link.) That would be useful information. Also, if registration opens at 9, what time zone is that? kc p.s. Just because I haven't been paying attention doesn't mean I don't CARE. -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
[CODE4LIB] question about coding in libraries
For my talk at Code4Lib, I'm trying to find or gather statistics about the number of people doing any sort of code in libraries. My initial attempts to quantify this have failed. I would appreciate info from list members. If you'd like to help, send me two numbers 1. The number of people employed at or on contract to your library whose major responsibilities include software development or maintenance. Broadly defined. 2. The total FTE staff at your library. (send to me, not the list, I will summarize) Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
[CODE4LIB] AGPL for libraries (was: A to Z lists)
Hej Tony! Great to hear of your effort; I hope you have chosen to implement the NISO 1.0 standard. I would urge you to carefully consider your choice of license, however. As I wrote last year when the issue came up in Koha, using AGPL instead of the less restrictive GPL can have some unintended consequences. http://go-to-hellman.blogspot.com/2010/07/koha-community-considers-affero-license.html It is still a reality today that many library resources have APIs that are provided only to customers and often come with interface licenses incompatible with the GPL. If you use AGPL, a library that modified the software to use it with one of these resources would be in violation of your license, even if they did not redistribute the software. If that's your intention, then fine, but please make sure you understand the implications. Also, please don't confuse AGPL, which is a restrictive license rooted in copyright law, with public domain, which has no restrictions on use. Eric On Feb 17, 2011, at 4:34 AM, Tony Mattsson wrote: Hi, We are at the final stages of building an EBM system with AZ-list and OpenURL resolver developed in LAMP (with Ajax) which we will release into the public domain (AGPL). I'll put up a notice on this list when it's done, and you can try it out to see if it measures up :=) Tony Mattsson IT-Librarian Landstinget Dalarna Bibliotek och informationscentral http://materio.fabicutv.com -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On behalf of Michele DeSilva Sent: 16 February 2011 22:18 To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] A to Z lists Hi Code4Lib-ers, I want to chime in and say that I, too, enjoyed the streaming archive from the conference. I also have a question: my library has a horribly antiquated A to Z list of databases and online resources (it's based in Access). We'd like to do something that looks more modern and is far more user friendly.
I found a great article in the Code4Lib Journal (issue 12, by Danielle Rosenthal and Mario Bernado) about building a searchable A to Z list using Drupal. I'm also wondering what other institutions have done as far as in-house solutions. I know there're products we could buy, but, like everyone else, we don't have much money at the moment. Thanks for any info or advice! Michele DeSilva Central Oregon Community College Library Emerging Technologies Librarian 541-383-7565 mdesi...@cocc.edu Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] GPL incompatible interfaces
The Metalib API is not public, to my knowledge; I don't know whether it gets disclosed under an NDA. And you can't run or develop Xerxes without an ExLibris license, because it depends on a proprietary and unspecified data set. I'm sure that's legal, but it's not true to the spirit of copyleft. The main effect of using GPL for Xerxes is that it prevents ExLibris from distributing (but not using) proprietary versions of Xerxes. If that is the intent of the developers, then perhaps AGPL would be a better tool for them to wield. None of this should be taken as a criticism of the Xerxes developers. On Feb 18, 2011, at 3:50 AM, graham wrote: That's very different from saying something with a GPL license can't use a proprietary interface. As if, for example, Xerxes couldn't use the Metalib API - without which it would be pointless. As I understand him, Eric is saying that there are interfaces to library software which actually have a license or contract which blocks GPLed software from using them. It would be a kind of 'viral BSD' license, killing free software (in the FSF sense) but leaving proprietary or open source (in your Apache/MIT sense) untouched. I haven't seen any examples myself, and can't quite see how it would be done legally. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
[CODE4LIB] Gluejar is hiring
Hi, Everyone! http://go-to-hellman.blogspot.com/2011/03/gluejar-is-hiring.html Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] [dpla-discussion] Rethinking the library part of DPLA
The DPLA listserv is probably too impractical for most of Code4Lib, but Nate Hill (who's on this list as well) made this contribution there, which I think deserves attention from library coders here. On Apr 5, 2011, at 11:15 AM, Nate Hill wrote: It is awesome that the project Gutenberg stuff is out there, it is a great start. But libraries aren't using it right. There's been talk on this list about the changing role of the public library in people's lives, there's been talk about the library brand, and some talk about what 'local' might mean in this context. I'd suggest that we should find ways to make reading library ebooks feel local and connected to an immediate community. Brick and mortar library facilities are public spaces, and librarians are proud of that. We have collections of materials in there, and we host programs and events to give those materials context within the community. There's something special about watching a child find a good book, and then show it to his or her friend and talk about how awesome it is. There's also something special about watching a senior citizens book group get together and discuss a new novel every month. For some reason, libraries really struggle with treating their digital spaces the same way. I'd love to see libraries creating online conversations around ebooks in much the same way. Take a title from project Gutenberg: The Adventures of Huckleberry Finn. Why not host that book directly on my library website so that it can be found at an intuitive URL, www.sjpl.org/the-adventures-of-huckleberry-finn and then create a forum for it? The URL itself takes care of the 'local' piece; certainly my most likely visitors will be San Jose residents- especially if other libraries do this same thing. The brand remains intact, when I launch this web page that holds the book I can promote my library's identity. 
The interface is no problem because I can optimize the page to load well on any device and I can link to different formats of the book. Finally, and most importantly, I've created a local digital space for this book so that people can converse about it via comments, uploaded pictures, video, whatever. I really think this community conversation and context-creation around materials is a big part of what makes public libraries special. Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ Gluejar is hiring! e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] [dpla-discussion] Rethinking the library part of DPLA
The challenge I like to present to libraries is this: imagine that your entire collection is digital. Does it include Shakespeare? Does it include Moby Dick? Yes! Just because you don't have to pay for these works, doesn't mean that they don't belong in your library. And what if many modern works become available for free via Creative Commons licensing? Is it the library's role to promote these works, or should a library be promoting primarily the works it's paying for patrons to use? That's why I thought Nate's suggestions were worthy of attention from people who could potentially do practical things. The other hope is that if libraries can do compelling things with public domain content, there's no reason they couldn't do the same things with in-copyright material appropriately licensed. If the experience works, the rightsholders will see the value. On Apr 10, 2011, at 10:05 AM, Karen Coyle wrote: I appreciate the spirit of this, but despair at the idea that libraries organize their services around public domain works, thus becoming early 20th century institutions. The gap between 1923 and 2011 is huge, and it makes no sense to users that a library provide services based on publication date, much less that enhanced services stop at 1923. kc Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ Gluejar is hiring! e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] What do you wish you had time to learn?
This thread got me thinking about what I learned during a time when I actually had time to learn whatever I wanted to: Applied Epistemology (reading list supplied mostly by @edsu) Copyright Law (reading list supplied mostly by @grimmelm) Writing and Journalism Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ Gluejar is hiring! e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] Seth Godin on The future of the library
Some ebooks, in fact some of the greatest ever written, already cost less than razor blades. Eric (who just finished writing a chapter on open-access e-books) On May 16, 2011, at 7:52 PM, Luciano Ramalho wrote: 1) Why quote the ebook price in 1962 dollars? The reality in 2011 is that Kindle books in general are too expensive, particularly when comparing their cost with the paper counterparts (think about variable costs in paperbacks, logistics etc; it is pretty obvious the cost reductions are not being fully reflected in consumer prices). Given the current situation, I see no evidence that ebooks will cost less than razor blades, ever. Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] Seth Godin on The future of the library
Exactly. I apologize if my comment was perceived as coy, but I've chosen to invest in the possibility that Creative Commons licensing is a viable way forward for libraries, authors, readers, etc. Here's a link to the last of a 5-part series on open-access ebooks. I hope it inspires work in the code4lib community to make libraries more friendly to free stuff. http://go-to-hellman.blogspot.com/2011/05/open-access-ebooks-part-5-changing.html On May 18, 2011, at 7:20 PM, David Friggens wrote: Some ebooks, in fact some of the greatest ever written, already cost less than razor blades. Do you mean ones not under copyright? Those, plus Creative Commons etc.
Re: [CODE4LIB] Seth Godin on The future of the library
Karen, The others who have responded while I was off, you know, doing stuff, have done a much better job of answering your question than I would have. I would have said something glib like almost all ways, with respect to open-access digital materials. There's a shift in library mindset that has to occur along with the transition from print to digital. The clearest example that I've seen is the typical presentation of pretend-it's-print out-of-copyright material. A library will have purchased PIP access to an annotated edition of a Shakespeare play, or a new translation of Crime and Punishment. But the public domain versions of these works (which are perfectly good) don't exist in the catalog. A patron looking for ebook versions of these works will then frequently be denied access because another patron has already checked out the licensed version. That can't be justified by any vision for libraries that I can think of. It can't be justified because it's hard or time consuming, or because there is a flood of PD Crime and Punishments clamoring for attention. It's just a result of unthinking and we-haven't-done-that-before. It's my hope that there are a number of not-so-hard problems around this situation that people on this list have the tools to solve. Eric On May 19, 2011, at 1:30 AM, Karen Coyle wrote: Quoting Eric Hellman e...@hellman.net: Exactly. I apologize if my comment was perceived as coy, but I've chosen to invest in the possibility that Creative Commons licensing is a viable way forward for libraries, authors, readers, etc. Here's a link to the last of a 5-part series on open-access ebooks. I hope it inspires work in the code4lib community to make libraries more friendly to free stuff. Eric, In what ways do you think that libraries today are not friendly to free stuff?
kc http://go-to-hellman.blogspot.com/2011/05/open-access-ebooks-part-5-changing.html On May 18, 2011, at 7:20 PM, David Friggens wrote: Some ebooks, in fact some of the greatest ever written, already cost less than razor blades. Do you mean ones not under copyright? Those, plus Creative Commons etc. -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] Adding VIAF links to Wikipedia
We talked a bit about this at LOD-LAM; Asaf Bartov of the Wikimedia foundation offered to help make this work better. email me if you need a a contact. On Jun 2, 2011, at 10:40 AM, Ralph LeVan wrote: Yes, the bot was approved, but in a much more limited application that was initially intended (make a link between Wikipedia records and corresponding OpenLibrary records.) And the conversation was quite rancorous for granting permission to an organization philosophically much closer to Wikipedia than OCLC would seem to be. I don't think we'll be able to make this happen without a lot of help. Ralph On Fri, May 27, 2011 at 3:45 PM, Ed Summers e...@pobox.com wrote: On Thu, May 26, 2011 at 2:01 PM, Ralph LeVan ralphle...@gmail.com wrote: OCLC Research would desperately love to add VIAF links to Wikipedia articles, but it seems to be very difficult. The OpenLibrary folks tried to do it a while back and ended up getting their plans severely curtailed. The discussion at Wikipedia is captured here: http://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/OpenlibraryBot Ralph if you read that entire discussion it sounds like the bot was approved. Am I missing something? //Ed
[CODE4LIB] JHU integration of PD works
Getting back to the subject of a previous thread (and digesting some wonderful contributions by Karen, Alex, Jeremy and Ed C.), I dug around some links that Jonathan posted, and I think they're worth further discussion. The way that JHU has integrated Public Domain works into its catalog results with Umlaut is brilliant and pragmatic; the new catalog (Catalyst) interface based on Blacklight is a great improvement on the older Horizon version: https://catalyst.library.jhu.edu/catalog/bib_816990 Clearly, Jonathan has gone through the process of getting his library to think through the integration, and it seems to work. Has there been any opposition? What are the reasons that this sort of integration is not more widespread? Are they technical or institutional? What can be done by producers of open access content to make this work better and easier? Are unified approaches being touted by vendors delivering something really different? Looking forward, I wonder whether the print-first, then enrich-with-digital strategy required by today's infrastructure and work flow will decline compared to a more Googlish web-first strategy. Eric Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ 41 Watchung Plaza #132, Montclair NJ 07042 e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
[CODE4LIB] privacy enhanced implementation of Like button
Has anyone seen, used or written a wrapper script for Facebook Like buttons (http://developers.facebook.com/docs/opengraph/ ) that prevents the leakage of all user browsing behavior to Facebook? For example, the script might invoke the facebook script on an OnClick event. Eric Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ 41 Watchung Plaza #132, Montclair NJ 07042 e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
[CODE4LIB] New thread: Why are you doing what you're doing?
I think it's a good question, worth asking about *every* dev position being hired for. I would be interested to hear an answer from others on the list. In fact, I think the price of putting a position announcement on Code4lib should be a willingness to answer "why?". And "why not?" is a pretty pathetic answer. For me, I'm doing what I'm doing because I think it's important and because no one else is doing it. I hope there are many others with a similar answer. Eric
Re: [CODE4LIB] Examples of visual searching or browsing
I'm surprised that no one has mentioned the stars of the DPLA sprint: ShelfLife http://librarylab.law.harvard.edu/dpla/demo/app/ and BookWorm http://bookworm.culturomics.org/ Eric On Oct 27, 2011, at 4:27 PM, Julia Bauder wrote: Dear fans of cool Web-ness, I'm looking for examples of projects that use visual (=largely non-text and non-numeric) interfaces to let patrons browse/search collections. Things like the GeoSearch on North Carolina Maps[1], or projects that use Simile's Timeline or Exhibit widgets[2] to provide access to collections (e.g., what's described here: https://letterpress.uchicago.edu/index.php/jdhcs/article/download/59/70), or in-the-wild uses of Recollection[3]. I'm less interested in knowing about tools (although I'm never *uninterested* in finding out about cool tools) than about production or close-to-production sites that are making good use of these or similar tools to provide visual, non-linear access to collections. Who's doing slick stuff in this area that deserves a look? Thanks! Julia Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ 41 Watchung Plaza #132, Montclair NJ 07042 e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] Library News (à la ycombinator's hackernews)
And the discussion at Hacker News is illuminating... http://news.ycombinator.com/item?id=3272980 On Nov 29, 2011, at 1:30 PM, Mark A. Matienzo wrote: On Tue, Nov 29, 2011 at 1:25 PM, Jonathan Rochkind rochk...@jhu.edu wrote: Don't know if the link is in error, or what. Anyone know what software Hacker News and this Library News clone are based on, for real, and where to look at the source/documentation? Trying to google for what open source software Hacker News runs on, I'm not having any luck. Hacker News, and presumably Library News, both run using news.arc, which is written in the Arc dialect of Lisp. The news program is packaged with the Arc distribution: https://github.com/nex3/arc/blob/master/news.arc Mark A. Matienzo Digital Archivist, Manuscripts and Archives, Yale University Library Technical Architect, ArchivesSpace
Re: [CODE4LIB] Pandering for votes for code4lib sessions
I think that it's not out of bounds to ask people for c4l votes unless you're offering tangible rewards in exchange for said votes. Tangible rewards as used here shall in no circumstance be construed to apply to any offers of beer or its nonalcoholic equivalent. Non-alcoholic equivalent as used here, shall in no way be construed to imply that there is such a thing.
Re: [CODE4LIB] Pandering for votes for code4lib sessions
It's also worth noting that the voters (so far) have done a super job. If your talk is not making the cut, don't take it as a reflection or judgment on you or your work. It just means that voters want to save you for next year. And if your talk IS making the cut, it's probably because voters want the chance to make snide remarks about you on the backchannel. (I'll only be able to attend virtually this year. Please don't ask to take away my vote!) Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ 41 Watchung Plaza #132, Montclair NJ 07042 e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] site vulnerabilities
I gave a lightning talk on XSS vulnerabilities in library software at the first Code4Lib conference. You'll be happy to know that, as bad as things are, they've improved considerably! I showed several ILS vendors how I could insert arbitrary javascripts into their products. Some of them fixed their products in the next update cycle; some took a couple of years. One particularly nasty vulnerability I am unable to talk about; it was so nasty and close to home. But the general problem persists. Perhaps an outing process would be useful. Eric On Dec 9, 2011, at 10:54 AM, Erin Germ wrote: Good morning group, I don't mean to be an alarmist, but I follow some sites that list XSS and other vulnerabilities for web sites. Among the latest updates with site vulnerabilities were a few from libraries. Some of these are dated a couple months ago but they are now just being pushed out and still have a status of unfixed. If you would like to know if your site(s) are on the list, I would start by checking http://www.xssed.com/ V/R Erin
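Most of the OPAC holes of that era were reflected XSS: a search term echoed into the results page unescaped. A minimal sketch of the pattern and the one-call fix (the "Results for" template is hypothetical, not any vendor's actual code):

```python
# Sketch of reflected XSS: a query string echoed into HTML output.
# Escaping at the point of output is the standard one-call fix.
from html import escape

query = '<script>alert("owned")</script>'   # attacker-supplied search term
unsafe = "Results for: %s" % query          # script would execute in a browser
safe = "Results for: %s" % escape(query)    # rendered as inert text instead

assert "<script>" not in safe
print(safe)
```

The injected scripts demonstrated to the ILS vendors were variations on this: any page that reflects user input without escaping is exploitable.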
Re: [CODE4LIB] What software for a digital library
At Gluejar, we decided to use Django for our Unglue.it website, which will open in January. As someone who built a web framework from scratch in Java, I've found that the Django design aligned with mine where I got it right and didn't where I got it wrong. I'm still getting used to Python, but I'm quite happy with Django. Eric Hellman President, Gluejar, Inc. http://www.gluejar.com/ 41 Watchung Plaza #132, Montclair NJ 07042 e...@hellman.net http://go-to-hellman.blogspot.com/ @gluejar
Re: [CODE4LIB] site vulnerabilities
By the way, to whoever decided it would be fun to reply by checking the Gluejar website for XSS vulnerabilities: by all means, tell everyone about it! Eric On Dec 16, 2011, at 10:14 PM, Michael J. Giarlo wrote: On Fri, Dec 16, 2011 at 21:42, Eric Hellman e...@hellman.net wrote: You'll be happy to know that as bad as things are, they've improved considerably! I showed several ILS vendors how I could insert arbitrary javascripts into their products. Some of them fixed their products in the next update cycle, some took a couple of years. One particularly nasty vulnerability I am unable to talk about, it was so nasty and close to home. But the general problem persists. Perhaps an outing process would be useful. Leaks4Lib? +1 -Mike
Re: [CODE4LIB] too much Metadata
Related: http://go-to-hellman.blogspot.com/2009/06/when-are-you-collecting-too-much-data.html On Feb 10, 2012, at 3:57 PM, Patrick Berry wrote: So, one question I forgot to toss out at the Ask Anything session is: When do you know you have enough metadata? You'll know it when you have it isn't the response I'm looking for. So, I'm sure you're wondering what the context for this question is, and honestly there is none. This is geared towards CONTENTdm or DSpace or Omeka or Millennium. I've seen groups not plan enough for collecting data, and I've seen groups that have been planning so long they forgot what they were supposed to be collecting in the first place. So, I'll just throw that vague question out there and see who wants to take a swing. Thanks, Pat/@pberry
[CODE4LIB] Unglue.it has launched
There's even the beginnings of an API: https://unglue.it/api/help Lots of work left to do, though! Not much point unless the campaigns succeed. Eric
Re: [CODE4LIB] EPUB and ILS indexing
This is an area where the code4lib community can have a huge impact. Conversely, if the Code4lib community doesn't have a big impact, we're in trouble. I urge everyone to have a look at the OS projects that SourceFabric is involved in. In particular, BookType is a Django web app that lets people collaboratively produce EPUB ebooks. If you want to implement a community ebook publishing platform, this is what you want to hop onto. I'm really glad to see Henri-Damien looking at this; I think he could use help! Eric On Oct 29, 2012, at 1:11 PM, Henri-Damien LAURENT henridamien.laur...@free.fr wrote: On 29/10/2012 14:55, Jodi Schneider wrote: Sounds great! Have you thought about starting from OPDS? http://opds-spec.org/about/ Thanks for that hint, Jodi. Nope, I had not thought about using OPDS. It looks really great. But from what I know of ILSes, ATOM feeds are not yet getting indexed straight into the catalog. But that could be something great. Might be worth talking to some EPUB folks -- for instance Peter Brantley, or else folks from threepress.org? I am already in contact with some people from the EPUB world (namely SourceFabric, Gluejar, and tea-ebook). But could be interesting to have more feedback. -Jodi On Mon, Oct 29, 2012 at 12:19 PM, Henri-Damien LAURENT henridamien.laur...@free.fr wrote: Hi, I am about to write a tool which would help index EPUB into ILSes. My first guess is to produce ISO2709 or MARCXML records from EPUB files, but since MARCXML or ISO2709 is not really what I would call the most portable (UNIMARC and MARC21 may both be handled in the same file format), I am rather considering producing OAI-DC or HTML5 + schema.org + Dublin Core, but that would rely on EPUB3. Any comment, anyone? Has anyone considered such a tool? Is there any hidden corpse lurking around I should be aware of? Have a nice day -- Henri-Damien LAURENT -- Henri-Damien LAURENT
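For what it's worth, the OPF package file inside an EPUB already carries its metadata as Dublin Core elements, so a converter like the one Henri-Damien describes can start by just reading them out. A rough Python sketch using only the standard library (the function name and output shape are my own invention, not any existing tool):

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def epub_to_dc(path):
    """Pull the Dublin Core elements out of an EPUB's OPF package file."""
    with zipfile.ZipFile(path) as z:
        # META-INF/container.xml names the OPF file that holds the metadata.
        container = ET.fromstring(z.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", NS).get("full-path")
        opf = ET.fromstring(z.read(opf_path))
        metadata = opf.find("opf:metadata", NS)
        record = {}
        for el in metadata:
            if el.tag.startswith("{%s}" % NS["dc"]):
                field = el.tag.split("}", 1)[1]  # e.g. "title", "creator"
                record.setdefault(field, []).append((el.text or "").strip())
        return record
```

From a dict like that, serializing OAI-DC (or mapping onto MARC fields) is a straightforward second step; the hard parts Henri-Damien raises are the format politics, not the extraction.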
[CODE4LIB] Code, Inclusiveness, and Fear
On Tuesday night I went to the NYTech Meetup. They get 800+ people to come once a month to watch demos of the latest thing. One of the presentations was from Hackers Union. I was cringing, because it was like a caricature of how to present an uninviting impression to anyone who wasn't white, male and 20-something. Complete with jokes about how to pick up girls in bars. In front of an audience about 30% non-male, 40% non-white, and 50% non-20-something. I thought to myself, if they did that at Code4Lib, it would NOT be received well, to say the least. And this morning I happened to scan through many of the recent threads on the listserv. And the thread on what is coding, including the existential digressions. What makes Code4Lib different from any other group I know of in the library world is that it rejects fear of code. Much of the library world fears code, and most of that fear is unfounded. And the code we need to fear is not so scary once we know how to fear it. The threads about having anti-harassment policies are a good thing, because we want to remove the fear that surrounds code. Talking about it is a big step towards addressing fear. Let's try to make sure that having a policy doesn't stop us from talking about the need to eliminate the fear. As to who is a part of the Code4Lib community, I think you don't have to be a coder; you just have to reject fear of code. A big part of the conferences is creating space to help people make the transition from being oppressed by fear of code to being liberated by the possibilities of code. OK, back to work for me- unfortunately not the code part. Eric Eric Hellman President, Gluejar, Inc. Founder, Unglue.it https://unglue.it/ http://go-to-hellman.blogspot.com/ twitter: @gluejar
Re: [CODE4LIB] Code, Inclusiveness, and Fear
We need to fear malicious code. To do that, we need to think about all the ways people can misuse, abuse and attack our systems. We need to cross our t's, dot our i's, and shine lots of light. Eric On Dec 6, 2012, at 1:17 PM, Gabriel Farrell gsf...@gmail.com wrote: one that rings true with me. I hope we can continue to live up to it. I want to make sure we're on the same page, though. To be clear, which code should we fear?
[CODE4LIB] early history of isbn/issn linking
I'm working on a little project on the early history of bibliographic linking. I'm looking for examples where plain-text documents with ISBNs or ISSNs were auto-linked to library catalogs or Amazon or whatnot. Any nominations for who did this first and documented it? Eric Eric Hellman President, Gluejar.Inc. Founder, Unglue.it https://unglue.it/ http://go-to-hellman.blogspot.com/ twitter: @gluejar
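To make the question concrete, here is roughly what an early auto-linker had to do: spot ISBN-shaped strings in plain text, validate the check digit so random digit runs don't get linked, and wrap hits in a link to some resolver. A Python sketch for ISBN-10s (the resolver URL is a placeholder, not any real service, and this is my illustration rather than any historical implementation):

```python
import re

# ISBN-10: nine digits plus a final digit or X, optionally hyphen/space separated,
# optionally preceded by an "ISBN" label.
ISBN_RE = re.compile(r"\b(?:ISBN[-: ]*)?((?:\d[- ]?){9}[\dXx])\b")

def isbn10_checksum_ok(bare):
    """Weighted mod-11 check digit test for a 10-character bare ISBN."""
    if len(bare) != 10:
        return False
    total = 0
    for i, ch in enumerate(bare):
        if ch == "X" and i != 9:
            return False
        total += (10 - i) * (10 if ch == "X" else int(ch))
    return total % 11 == 0

def autolink_isbns(text, resolver="https://example.org/isbn/"):
    """Replace valid ISBN-10s in plain text with HTML links to a resolver."""
    def link(m):
        bare = m.group(1).replace("-", "").replace(" ", "").upper()
        if isbn10_checksum_ok(bare):
            return '<a href="%s%s">%s</a>' % (resolver, bare, m.group(0))
        return m.group(0)  # bad check digit: leave the text alone
    return ISBN_RE.sub(link, text)
```

In the catalog-linking case the resolver would be an OPAC search URL keyed on the normalized ISBN; for Amazon it was the ASIN-style product URL.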
[CODE4LIB] You are a *pedantic* coder. So what am I?
OK, pedant, tell us why you think methods that can be over-ridden are static. Also, tell us why you think classes in Java are not instances of java.lang.Class. On Feb 18, 2013, at 1:39 PM, Justin Coyne jus...@curationexperts.com wrote: To be pedantic, Ruby and JavaScript are more Object Oriented than Java because they don't have primitives and (in Ruby's case) because classes are themselves objects. Unlike Java, both Python and Ruby can properly override static methods on sub-classes. The Java language made many compromises, as it was designed as a bridge to Object Oriented programming for programmers who were used to writing C and C++. -Justin
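For anyone following along, the behavior under dispute is easy to demonstrate in Python: class-side methods dispatch dynamically and can be overridden in subclasses (unlike Java statics, which are resolved at compile time), and classes are themselves objects, instances of type. A small sketch (example classes are mine, purely illustrative):

```python
class Animal:
    @classmethod
    def describe(cls):
        # cls is whichever class the method was called on, so dispatch is dynamic
        return "an animal named %s" % cls.__name__

class Dog(Animal):
    @classmethod
    def describe(cls):
        # subclasses can override class-side methods
        return "a dog"

print(Animal.describe())      # → an animal named Animal
print(Dog.describe())         # → a dog
print(isinstance(Dog, type))  # → True: the class itself is an object
```

Eric's counterpoint stands too: Java classes are reified at runtime as instances of java.lang.Class, so "classes are objects" is not unique to Ruby and Python; the real difference is the static-dispatch rule for static methods.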
[CODE4LIB] githubs for poetry, legal docs
Given the discussion of how github is not really so accessible to non-coders, I thought I'd mention these attempts to put version control into the mainstream. Github for writers: It sounds like that's what Blaine Cook is doing with Poetica.com Github for legal agreements: We've started using Docracy.com to help us manage legal agreements. Eric Eric Hellman President, Gluejar.Inc. Founder, Unglue.it https://unglue.it/ http://go-to-hellman.blogspot.com/ twitter: @gluejar
Re: [CODE4LIB] Anyone working with iPython?
I use it all the time. If anyone has played with Mathematica notebooks, it's the same thing, with Python, and other languages apparently on the way. Eric Hellman President, Gluejar, Inc. Founder, Unglue.it https://unglue.it/ http://go-to-hellman.blogspot.com/ twitter: @gluejar On Dec 19, 2013, at 12:48 PM, Roy Tennant roytenn...@gmail.com wrote: Our Wikipedian in Residence, Max Klein, brought IPython [1] to my attention recently, and even in just the little exploration I've done with it so far I'm quite impressed. Although you could call it interactive Python, that doesn't begin to put across the full range of capabilities, as when I first heard that I thought Great, a Python shell where you enter a command, hit return, and it executes. Great. Just what I need. NOT. But I was SO WRONG. It certainly can and does do that, but also so much more. You can enter blocks of code that then execute. Those blocks don't even have to be Python. They can be Ruby or Perl or bash. There are built-in functions of various kinds that it (oddly) calls magic. But perhaps the killer bit is the idea of Notebooks that can capture all of your work in a way that is also editable and completely web-ready. This last part is probably difficult to understand until you experience it. Anyway, I was curious if others have been working with it and if so, what they are using it for. I can think of all kinds of things I might want to do with it, but hearing from others can inspire me further, I'm sure. Thanks, Roy [1] http://ipython.org/