Re: [CODE4LIB] Twitter annotations and library software

2010-06-15 Thread Jakob Voss

On 07.06.2010 16:15, Jay Luker wrote:

Hi all,

I found this thread rather interesting and figured I'd try and revive
the convo since apparently some things have been happening in the
twitter annotation space in the past month. I just read on techcrunch
that testing of the annotation features will commence next week [1].
Also it appears that an initial schema for a book type has been
defined [2].


 [1] http://techcrunch.com/2010/06/02/twitter-annotations-testing/
 [2] http://apiwiki.twitter.com/Annotations-Overview#RecommendedTypes


Have any code4libbers gotten involved in this beyond just opining on list?


I don't think so - the discussion slipped into general data modelling 
questions. For the specific, limited use case of twitter annotations I 
bet the recommended format from [2] will be fine (title is implied as 
a common attribute, url is optional):


{"book": {
  "title": "...",
  "author": "...",
  "isbn": "...",
  "year": "...",
  "url": "..."
}}

The only thing I miss is an article type with a doi field for non-books.
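Just as an illustration, a hypothetical article type could look like the
sketch below (the type name and fields are only my guess, not part of the
recommended types):

import json

# Hypothetical "article" annotation, modelled on the recommended "book" type.
# Type name and fields (journal, doi) are assumptions for illustration only.
annotation = {"article": {
    "title": "...",
    "author": "...",
    "journal": "...",
    "year": "2010",
    "doi": "10.1234/example",   # placeholder DOI
    "url": "..."
}}

payload = json.dumps(annotation)
# Twitter annotations are limited to 512 bytes, so check the size.
assert len(payload.encode("utf-8")) <= 512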

Cheers,
Jakob


--
Jakob Voß jakob.v...@gbv.de, skype: nichtich
Verbundzentrale des GBV (VZG) / Common Library Network
Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
+49 (0)551 39-10242, http://www.gbv.de


Re: [CODE4LIB] Twitter annotations and library software

2010-06-07 Thread Jay Luker
Hi all,

I found this thread rather interesting and figured I'd try and revive
the convo since apparently some things have been happening in the
twitter annotation space in the past month. I just read on techcrunch
that testing of the annotation features will commence next week [1].
Also it appears that an initial schema for a book type has been
defined [2].

Have any code4libbers gotten involved in this beyond just opining on list?

--jay

[1] http://techcrunch.com/2010/06/02/twitter-annotations-testing/
[2] http://apiwiki.twitter.com/Annotations-Overview#RecommendedTypes


Re: [CODE4LIB] Twitter annotations and library software

2010-04-30 Thread Owen Stephens
Alex,

Could you expand on how you think the problem that OpenURL tackles would
have been better approached with existing mechanisms? I'm not debating this
necessarily, but from my perspective when OpenURL was first introduced it
solved a real problem that I hadn't seen solved before.

Owen

On Thu, Apr 29, 2010 at 11:55 PM, Alexander Johannesen 
alexander.johanne...@gmail.com wrote:

 Hi,

 On Thu, Apr 29, 2010 at 22:47, Walker, David dwal...@calstate.edu wrote:
  I would suggest it's more because, once you step outside of the
  primary use case for OpenURL, you end-up bumping into *other* standards.

 These issues were raised all the way back when it was created, as well. I
 guess it's easy to be clever in hindsight. :) Here's what I wrote
 about it 5 years ago (http://shelter.nu/blog-159.html):

 So let's talk about 'Not invented here' first, because surely, we're
 all guilty of this one from time to time. For example, lately I dug
 into the ANSI/NISO Z39.88-2004 standard, better known as OpenURL. I
 was looking at it critically, I have to admit, comparing it to what I
 already knew about Web Services, SOA, HTTP,
 Google/Amazon/Flickr/Del.icio.us APIs, and various Topic Maps and
 semantic web technologies (I was the technical editor of 'Explorer's
 Guide to the Semantic Web').

 I think I can sum up my experiences with OpenURL as such: why? Why
 has the library world invented a new way of doing things that can
 already be done quite well? Now, there is absolutely nothing wrong
 with the standard per se (except a pretty darn awful choice of
 name!!), so I'm not here criticising the technical merits and the work
 put into it. No, it's a simple 'why' that I have yet to get a decent
 answer to, even after talking to the OpenURL bigwigs about it. I mean,
 come on; convince me! I'm not unreasonable, truly, really; I just
 want to be convinced that we need this over anything else.


 Regards,

 Alex
 --
  Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
 --- http://shelter.nu/blog/ --
 -- http://www.google.com/profiles/alexander.johannesen ---




-- 
Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: o...@ostephens.com


Re: [CODE4LIB] Twitter annotations and library software

2010-04-30 Thread Owen Stephens
Tim,

I'd vote for adopting the same approach as COinS on the basis that it already
has some level of adoption, and we know it covers at least some of what
libraries and academic users might want to do (it is used by both libraries
and consumer tools such as Zotero). We are talking about books (from what
you've said), so we don't have to worry about other formats (although it does
mean we can handle journal articles and some other material as well for no
extra effort).

Mendeley and Zotero already speak COinS, it is pretty simple, and there are
already several code libraries to deal with it.

It isn't where I hope we end up in the long term, but if we are talking about
this happening tomorrow, why not use something that is relatively simple,
already has a good set of implementations, and is known to work for several
cases of embedding book metadata in a web environment?

Owen

On Thu, Apr 29, 2010 at 7:01 PM, Jakob Voss jakob.v...@gbv.de wrote:

 Dear Tim,


 you wrote:

 So this is my recommended framework for proceeding. Tim, I'm afraid
 you'll actually have to do the hard work yourself.


 No, I don't. Because the work isn't fundamentally that hard. A
 complex standard might be, but I never for a moment considered
 anything like that. We have *512 bytes*, and it needs to be usable by
 anyone. Library technology is usually fatally over-engineered, but
 this is a case where that approach isn't even possible.


 Jonathan gave a very good summary - you just have to pick what your main
 focus of embedding bibliographic data is.


 A) I favour using the CSL-Record format which I summarized at

 http://wiki.code4lib.org/index.php/Citation_Style_Language

 because I had in mind that people want a nice-looking citation of the
 publication that someone tweeted about. The drawback is that CSL is less
 widely adopted and will not always fit in 512 bytes.


 B) If your main focus is to link tweets about the same publication (and
 other stuff about this publication) then you must embed identifiers.
 LibraryThing is mainly based on two identifiers:

 1) ISBN to identify editions
 2) LT work ids to identify works

 I wonder why LT work ids have not been picked up more widely, although you
 thankfully provide a full mapping to ISBN at
 http://www.librarything.com/feeds/thingISBN.xml.gz - but never mind. I
 thought that some LT records also contain other identifiers such as OCLC
 number, LOC number etc., but maybe I am wrong. The best way to specify
 identifiers is to use a URI (all relevant identifiers that I know of have a
 URI form). For ISBN it is

 urn:isbn:{ISBN13}

 For LT Work-ID you can use the URL with your .com top level domain:

 http://www.librarything.com/work/{LTWORKID}

 That would work for tweets about books with an ISBN and for tweets about a
 work, which should cover 99.9% of tweets from LT about single publications
 anyway.
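 A small sketch of how such identifier annotations could be put together (my
 own illustration; the key names in the payload are assumptions):

def isbn_uri(isbn13):
    # URN form of an ISBN, e.g. urn:isbn:9780316066525 (placeholder number)
    return "urn:isbn:" + isbn13

def lt_work_uri(work_id):
    # LibraryThing work URL, using the .com top-level domain
    return "http://www.librarything.com/work/" + work_id

# Hypothetical identifier-based annotation; the key names are assumptions.
annotation = {"book": {
    "isbn": isbn_uri("9780316066525"),   # placeholder ISBN
    "work": lt_work_uri("1060"),         # placeholder LT work id
}}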


 C) If your focus is to let people search for a publication in libraries
 and to copy bibliographic data into reference management software, then
 COinS is the way to go. COinS is based on OpenURL, which I and others have
 ranted about because it is a crappy library standard like MARC. But unlike
 other metadata formats, COinS usually fits in less than 512 bytes.
 Furthermore, you may have to deal with it for LibraryThing for Libraries
 anyway.
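 As a rough sketch (mine, not a normative example), a minimal book
 ContextObject in KEV form - the kind of thing COinS carries - and a check
 that it stays within the 512-byte limit:

from urllib.parse import urlencode

# Minimal OpenURL ContextObject in KEV form, book format (as used by COinS).
fields = {
    "ctx_ver": "Z39.88-2004",
    "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",
    "rft.btitle": "Example Title",      # placeholder metadata
    "rft.au": "Example, Author",
    "rft.date": "2010",
    "rft.isbn": "9780316066525",        # placeholder ISBN
}
kev = urlencode(fields)

# For a Twitter annotation the payload must fit in 512 bytes.
assert len(kev.encode("utf-8")) <= 512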


 Although I strongly favour CSL as a practising library scientist and
 developer, I must admit that for LibraryThing the best way is to embed
 identifiers (ISBN and LT Work-ID) and maybe COinS. As long as LibraryThing
 does not open up to more complex publications like preprints of
 proceedings articles in series etc., but mainly deals with books and works,
 this will make LibraryThing users happy.


  Then, three years from now, we can all conference-tweet about a CIL talk,
 about all the cool ways libraries are using Twitter, and how it's such a
 shame that the annotations standard wasn't designed with libraries in mind.


 How about a bet instead of a vote? In three years, will there be:

 a) No relevant Twitter annotations anyway
 b) Twitter annotations but not used much for bibliographic data
 c) A rich variety of incompatible bibliographic annotation standards
 d) Semantic Web will have solved every problem anyway
 ..

 Cheers
 Jakob

 --
 Jakob Voß jakob.v...@gbv.de, skype: nichtich
 Verbundzentrale des GBV (VZG) / Common Library Network
 Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
 +49 (0)551 39-10242, http://www.gbv.de




-- 
Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: o...@ostephens.com


Re: [CODE4LIB] Twitter annotations and library software

2010-04-30 Thread Alexander Johannesen
On Fri, Apr 30, 2010 at 18:47, Owen Stephens o...@ostephens.com wrote:
 Could you expand on how you think the problem that OpenURL tackles would
 have been better approached with existing mechanisms?

As we all know, it's pretty much a spec for a way to template incoming
and outgoing URLs, defining some functionality along the way. As such,
URLs with basic URI templates and rewriting have been around for a
long time. Even older than that are the basics of HTTP, which has
status codes and functionality to do exactly the same. We've been
doing link resolving since the mid-'90s, either as CGI scripts or as
Apache modules, so none of this was new. A URI comes in, you look it up
in a database, you cross-check it against other REQUEST parameters (or
sessions, if you must, as well as IP addresses) and pop out a 303
(with some possible rewriting of the outgoing URL) (plus the hack we
needed at the time to also create dummy pages with META tags
*shudder*).
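Roughly this, in a toy sketch (illustrative only, not any particular
resolver; the lookup table stands in for the real knowledge base):

from wsgiref.simple_server import make_server
from urllib.parse import parse_qs

# Toy lookup table standing in for the real knowledge base.
KNOWLEDGE_BASE = {"10.1234/example": "http://publisher.example.org/article/1"}

def resolver(environ, start_response):
    # A URI comes in as a query parameter, you look it up, you 303 out.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    doi = params.get("id", [""])[0]
    target = KNOWLEDGE_BASE.get(doi)
    if target:
        start_response("303 See Other", [("Location", target)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"no target found"]

if __name__ == "__main__":
    make_server("", 8000, resolver).serve_forever()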

So the idea was to standardize a way to do this, and it was a good
idea as such. OpenURL *could* have had great potential if it
actually defined something tangible, something concrete like a model
of interaction or basic rules for fishing and catching tokens and the
like, and as someone else mentioned, the 0.1 version was quite a good
start. But by the time 1.0 came out, all the goodness had turned
so generic and flexible, in such a complex way, that handling it turned
you right off it. The standard was also written in very difficult
language, and in particular didn't use enough of the normal geeky
language sysadmins are used to. The more I tried to wrap my head around
it, the more I felt like just going back to CGI scripts that looked
stuff up in a database. It was easier to hack legacy code, which, well,
defeats the purpose, no?

Also, forgive me if I've forgotten important details; I've suppressed
this part of my life. :)


Kind regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] Twitter annotations and library software

2010-04-30 Thread Owen Stephens
Thanks Alex,

This makes sense, and yes, I see what you're saying - and yes, if you end up
going back to custom coding because it's easier, that does seem to defeat the
purpose.

However I'd argue that actually OpenURL 'succeeded' because it did manage to
get some level of acceptance (ignoring the question of whether it is v0.1 or
v1.0) - the cost of developing 'link resolvers' would have been much higher
if we'd been doing something different for each publisher/platform. In this
sense (I'd argue) sometimes crappy standards are better than none.

We've used OpenURL v1.0 in a recent project and because we were able to
simply pick up code already done for Zotero, and  we already had an OpenURL
resolver, the amount of new code we needed for this was minimal.

I think the point about link resolvers doing stuff that Apache and CGI
scripts were already doing is a good one - and I've argued before that what
we actually should do is separate some of this out (a bit like Jonathan did
with Umlaut) into an application that can answer questions about location
(what is generally called the KnowledgeBase in link resolvers) and the
applications that deal with analysing the context and doing the redirection.

(To introduce another tangent in a tangential thread, interestingly (I
think!) I'm having a not dissimilar debate about Linked Data at the moment -
there are many who argue that it is too complex and that as long as you have
a nice RESTful interface you don't need to get bogged down in ontologies and
RDF etc. I'm still struggling with this one - my instinct is that it will
pay to standardise, but so far I've not managed to convince even myself that
this is more than wishful thinking at the moment.)

Owen




-- 
Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: o...@ostephens.com


Re: [CODE4LIB] Twitter annotations and library software

2010-04-30 Thread Alexander Johannesen
On Fri, Apr 30, 2010 at 20:29, Owen Stephens o...@ostephens.com wrote:
 However I'd argue that actually OpenURL 'succeeded' because it did manage to
 get some level of acceptance (ignoring the question of whether it is v0.1 or
 v1.0) - the cost of developing 'link resolvers' would have been much higher
 if we'd been doing something different for each publisher/platform. In this
 sense (I'd argue) sometimes crappy standards are better than none.

Well, perhaps. I see OpenURL as the natural progression from PURL;
both have had their degree of success, though I'm careful using
that word as I live on the outside of the library world. It may well
be a success on the inside. :)

 I think the point about Link Resolvers doing stuff that Apache and CGI
 scripts were already doing is a good one - and I've argued before that what
 we actually should do is separate some of this out (a bit like Johnathan did
 with Umlaut) into an application that can answer questions about location
 (what is generally called the KnowledgeBase in link resolvers) and the
 applications that deal with analysing the context and the redirection

Yes, splitting it into smaller chunks is always smart, especially with
complex issues. For example, in the Topic Maps world, the whole standard
(reference model, data model, query language, constraint language, XML
exchange language, various notational languages) is wrapped up with a
guide in the middle. Make them into smaller parcels, and put your
flexibility there. If you pop it all into one, no one will read it
and fully understand it. (And don't get me started on the WS-* set of
standards with the same issues ...)

 (To introduce another tangent in a tangential thread, interestingly (I
 think!) I'm having a not dissimilar debate about Linked Data at the moment -
 there are many who argue that it is too complex and that as long as you have
 a nice RESTful interface you don't need to get bogged down in ontologies and
 RDF etc. I'm still struggling with this one - my instinct is that it will
 pay to standardise but so far I've not managed to convince even myself this
 is more than wishful thinking at the moment)

Ah, now this is certainly up my alley. As you might have seen, I'm a
Topic Maps guy, and in our model we have a distinction between three
different kinds of identities: internal, external indicators, and
published subject identifiers. The RDF world only had rdf:about, so
when you used 'www.somewhere.org', were you talking about that thing,
or does that thing represent something you're talking about? Tricky
stuff, which has these days become a *huge* problem with Linked Data.
And yes, they're trying to solve that by issuing an HTTP 303 status
code as a means of declaring the identifiers imperative, which is a
*lot* of resolving to do on any substantial set of data, and in my
eyes a huge ugly hack. (And what if your Internet connection goes down? Tough.)

Anyway, here's more on these identity problems ;
   http://www.ontopia.net/topicmaps/materials/identitycrisis.html

As to the RESTful notions, they only take you as far as content-types
can take you. Sure, you can glean semantics from them, but I reckon
there's an impedance mismatch right at the thing librarians have
got down pat: metadata vs. data. CRUD or, in this example, GPPD
(get/post/put/delete), which aren't a dichotomy btw, can only
determine behavior that enables certain semantic paradigms, but cannot
speak about more complex relationships or even modest models. (Very
often models aren't actionable. :)

The funny thing is that after all these years of working with Topic
Maps I find that these hard issues have been solved years ago, and the
rest of the world is slowly catching up to it. I blame the lame
DAML+OIL background of RDF and OWL, to be honest; a model too simple
to be elegantly advanced and too complex to be easily useful.


Kind regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Jakob Voss

Jonathan Rochkind wrote:

Call me pedantic, but if you do not have an identifier then there is no 
hope of identifying the publication by means of metadata. You only 
*describe* it with metadata and use additional heuristics (mostly 
search engines) to hopefully identify the publication based on the 
description.
  
But the entire OpenURL infrastructure DOES this, and does it without 
using search engines. It's a real world use case that has a solution in 
production! So, yeah, I call you pedantic for wanting to pretend the use 
case and the real world solution doesn't exist. :)


As you said, OpenURL is an *infrastructure*. It only makes sense if you 
have resolvers that map an OpenURL to a unique publication. These 
resolvers do the identification, while OpenURL only describes (as long as 
you do not put a unique publication identifier into an OpenURL). In 
contrast, an identifier can be used to compare and search publications 
without any additional infrastructure.


You can call it description rather than identification if you like, 
that is a question of terminology. But it's description that is meant to 
uniquely identify a particular publication, and that a whole bunch of 
software in use every day successfully uses to identify a particular 
publication.


It's not just terminology if you can either simply compare two strings for 
equality (identification) or you need an infrastructure with knowledge 
bases and specific software (to make use of a description).


OpenURL is of no use if you separate it from the existing infrastructure, 
which is mainly held by companies. No sane person will try to build an 
open alternative infrastructure, because OpenURL is a crappy 
library standard like MARC etc. This rant on OpenURL summarizes it well:


http://cavlec.yarinareth.net/2006/10/13/i-hate-library-standards/

The OpenURL specification is a 119 page PDF - that alone is a reason to 
run away as fast as you can.


If a twitter annotation setup wants to be able to identify publications 
that don't have standard identifiers, then you don't want to ignore this 
use case and how software actually in production currently deals with 
it. You can perhaps find a better way to deal with it -- I'm certainly 
not arguing for OpenURL as the be-all end-all; I rather hate OpenURL, 
actually.  But dismissing it as impossible is indeed pedantic, since 
it's being done!


If a twitter annotation setup wants to get adopted then it should not be 
built on a crappy, complex library standard like OpenURL.


 It IS a hacky and error-prone solution, to be sure.   But it's the
 best solution we've got, because it's simply a fact that we have many
 publications we want to identify that lack standard identifiers.

Ok, back to serious: Bibliographic Twitter annotations should be 
designed in a way that libraries (or whoever provides those knowledge 
bases aka OpenURL resolvers) can use to look up a publication by its 
metadata. So there should be a transformation


Twitter annotation => OpenURL

If you choose CSL as the bibliographic input format you can hopefully create 
a CSL style that does not produce a citation but an OpenURL - Voilà!


I must admit that this solution is based on the open assumption that the CSL 
record format contains all information needed for OpenURL, which may not be 
the case. A good place to start is the function createContextObject in


https://www.zotero.org/svn/extension/trunk/chrome/content/zotero/xpcom/ingester.js

which is used by Zotero to create OpenURLs.
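A very rough sketch of such a transformation (my own illustration; the CSL 
side follows CSL-JSON field names and the mapping is deliberately minimal):

from urllib.parse import urlencode

def csl_to_openurl(csl):
    # Minimal CSL-JSON -> OpenURL KEV mapping, journal format; a sketch only.
    kev = {
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.atitle": csl.get("title", ""),
        "rft.jtitle": csl.get("container-title", ""),
        "rft.date": str(csl.get("issued", {}).get("date-parts", [[""]])[0][0]),
    }
    if csl.get("DOI"):
        kev["rft_id"] = "info:doi/" + csl["DOI"]
    return urlencode(kev)

record = {"title": "An Example Article", "container-title": "Example Journal",
          "issued": {"date-parts": [[2010]]}, "DOI": "10.1234/example"}
print(csl_to_openurl(record))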

Cheers
Jakob

--
Jakob Voß jakob.v...@gbv.de, skype: nichtich
Verbundzentrale des GBV (VZG) / Common Library Network
Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
+49 (0)551 39-10242, http://www.gbv.de


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread MJ Suhonos
Okay, I know it's cool to hate on OpenURL, but I feel I have to clarify a few 
points:

 OpenURL is of no use if you separate it from the existing infrastructure 
 which is mainly held by companies. No sane person will try to build an open 
 alternative infrastructure because OpenURL is a crappy library standard like 
 MARC etc.

OpenURL is mostly implemented by libraries, yes, but it isn't necessarily 
*just* a library standard - this is akin to saying that Dublin Core is a 
library standard.  Only sort of.

The other issue I have is that — although Jonathan used the term to make a 
point — OpenURL is *not* an infrastructure, it is a protocol.  Condemning the 
current OpenURL infrastructure (which is mostly a vendor-driven oligopoly) is 
akin to saying in 2004 that HTTP and HTML suck because Firefox hadn't been 
released yet and all we had was IE6.  Don't condemn the standard because of the 
implementation.

 The OpenURL specification is a 119 page PDF - that alone is a reason to run 
 away as fast as you can.

The main reason for this is that OpenURL can do much, much, much more than 
the simple "resolve a unique copy" use case that libraries use it for.  We're 
using maybe 1% of the spec for 99% of our practice, probably because librarians 
weren't imaginative (as Jim Weinheimer would say) enough to think of other use 
cases beyond that most pressing one.

I'd contend that OpenURL, like other technologies (cough XML) is greatly 
misunderstood, and therefore abused, and therefore discredited.  I think there 
is also often confusion between the KEV schemas and OpenURL itself (which is 
really what Dorothea's blog rant is about); I'm certainly guilty of this 
myself, as Jonathan can attest.

You don't *have* to use the KEVs with OpenURL, you can use anything, including 
eg. Dublin Core.

 If a twitter annotation setup wants to get adopted then it should not be 
 built on a crappy, complex library standard like OpenURL.

I don't quite understand this (but I think I agree) — twitter annotation should 
be built on a data model, and then serialized via whatever protocols make sense 
(which may or may not include OpenURL).

 I must admit that this solution is based on the open assumption that the CSL 
 record format contains all information needed for OpenURL, which may not be 
 the case.
 …

A good example.  And this is where you're exactly right that we need better 
tools, namely OpenURL resolvers which can do much more than they do now.  I've 
had the idea for a number of years now that OpenURL functionality should be 
merged into aggregation / discovery layer (eg. OAI harvester)-type systems, 
because, like OAI-PMH, OpenURL can *transport metadata*, we just don't use it 
for that in practice.

A ContextObject is just a triple that makes a single assertion about two 
entities (resources): that A references B.  Just like an RDF statement using 
http://purl.org/dc/terms/references, but with more focus on describing the 
entities rather than the assertion.
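For comparison, the same assertion as a bare RDF statement (a sketch using 
rdflib; both URIs are placeholders):

from rdflib import Graph, URIRef

g = Graph()
citing = URIRef("http://example.org/paper/A")   # placeholder for entity A
cited = URIRef("urn:isbn:9780316066525")        # placeholder for entity B
g.add((citing, URIRef("http://purl.org/dc/terms/references"), cited))

print(g.serialize(format="nt"))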

Maybe if I put it that way, OpenURL sounds a little less crappy.

MJ


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Ross Singer
On Thu, Apr 29, 2010 at 8:17 AM, MJ Suhonos m...@suhonos.ca wrote:
 Okay, I know it's cool to hate on OpenURL, but I feel I have to clarify a few 
 points:


It's not that it's cool to hate on OpenURL, but if you've really
worked with it it's easy to grow bitter.

snip
 Maybe if I put it that way, OpenURL sounds a little less crappy.

No, OpenURL is still crappy and it will always be crappy, I'm afraid,
because it's tremendously complicated, mainly from the fact that it
tries to do too much.

The reason that context-sensitive services based on bibliographic
citations comprise 99% of all OpenURL activity is because:
A) that was the problem it was originally designed to solve
B) it's the only thing it really does well (and OpenURL 1.0's
insistence on being able to solve any problem almost takes that
strength away from it)

The barriers to entry + the complexity of implementation almost
guarantee that there's a better or, at any rate, easier alternative to
any problem.

The difference between OpenURL and DublinCore is that the RDF
community picked up on DC because it was simple and did exactly what
they needed (and nothing more).  A better analogy would be Z39.50 or
SRU: two non-library-specific protocols that, for their own reasons,
haven't seen much uptake outside of the library community.

-Ross.


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Mike Taylor
On 29 April 2010 13:17, MJ Suhonos m...@suhonos.ca wrote:
 The OpenURL specification is a 119 page PDF - that alone is a reason to run 
 away as fast as you can.

 The main reason for this is because OpenURL can do much, much, much more than 
 the simple resolve a unique copy use case that libraries use it for.  We're 
 using maybe 1% of the spec for 99% of our practice, probably because 
 librarians weren't imaginative (as Jim Weinheimer would say) enough to think 
 of other use cases beyond that most pressing one.

It's worth contrasting this with the original OpenURL specification,
now retro-numbered as v0.1:
http://www.openurl.info/registry/docs/pdf/openurl-01.pdf
This is the one that everyone implemented in a burst of enthusiasm
earlier this decade.  You know, in the way almost no-one's
implemented v1.0.

That document is TEN pages long.  Eight, really, since the total count
includes a page containing the foreword written after the event and a
page of acknowledgements consisting of a single 11-word sentence.

Can we be surprised that this specification attracted more interest
than the one fifteen times longer?

OpenURL 1.0 took that simple, comprehensible spec -- one that you
could read over lunch and fully understand -- and blew it up into a
super-generalised exercise in architecture astronautics.  And then
provided ANOTHER big document explaining how you can profile OpenURL
1.0 to make it do the stuff that v0.1 does (i.e. what you actually
WANT it to do) -- except, of course, that it expresses the same
concepts in a different way, so that v0.1 and v1.0 OpenURLs are
mutually incomprehensible.

All of this to support vapour use-cases that no-one has taken
advantage of because no-one ever needed to do that stuff.  So the sum
achievement of OpenURL 1.0 has been (A) to fill people with fear of
what used to be a very useful and perfectly straightforward
specification, and (B) where implemented at all, to balkanise
implementations.

 I'd contend that OpenURL, like other technologies (cough XML) is greatly 
 misunderstood, and therefore abused, and therefore discredited.  I think 
 there is also often confusion between the KEV schemas and OpenURL itself 
 (which is really what Dorothea's blog rant is about); I'm certainly guilty of 
 this myself, as Jonathan can attest.

 You don't *have* to use the KEVs with OpenURL, you can use anything, 
 including eg. Dublin Core.

Yeah.

So long as you don't mind that only 0.01% of the world's OpenURL
resolvers will know what to do with them.


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Walker, David
 We're using maybe 1% of the spec for 99% of our practice, 
 probably because librarians weren't imaginative (as Jim 
 Weinheimer would say) enough to think of other use cases 
 beyond that most pressing one.

I would suggest it's more because, once you step outside of the primary use 
case for OpenURL, you end up bumping into *other* standards.

Dorothea's blog post that Jakob referenced in his message is a good example of 
that.  She was trying to use OpenURL (via COinS) to get data into Zotero.  
Mid-way through the post she wonders if maybe she should have gone with unAPI 
instead.  

And, in fact, I think that would have been a better approach.  unAPI is better 
at doing that particular task than OpenURL.  And I think that may explain why 
OpenURL hasn't become the One Standard to Rule Them All, even though it kind of 
presents itself that way.

--Dave

==
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu



Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread MJ Suhonos
 It's not that it's cool to hate on OpenURL, but if you've really
 worked with it it's easy to grow bitter.

Well, fair enough.  Perhaps what I'm defending isn't OpenURL per se, but rather 
the concept of being able to transport descriptive assertions the way the 1.0 
spec proposes.

 The reason that context-sensitive services based on bibliographic
 citations comprise 99% of all OpenURL activity is because:
 A) that was the problem it was originally designed to solve

Yes, right.  And neither libraries nor vendors moved past this when 1.0 came 
out for the reasons described (too complex, no immediate use cases).

 The barriers to entry + the complexity of implementation almost
 guarantee that there's a better or, at any rate, easier alternative to
 any problem.

Let me be clear: I am *all* for a better system — even the first RSS specs were 
fragmented and crappy, which led to Atom.  But for the time they were around, 
they were useful, if kludgy.  My only point (and I think, Jonathan's) is that 
OpenURL, for better or worse, *exists* and *works* now, if not ideally.  If it 
sucks, the onus is on us, I think, to improve it or produce something better.

 The difference between OpenURL and DublinCore is that the RDF
 community picked up on DC because it was simple and did exactly what
 they needed (and nothing more).

Actually the difference between OpenURL and DC is that one is a transport 
protocol and one is a metadata schema.  :-)  But I get your and Mike's point 
about OpenURL 1.0 being too complicated for librarians to bother with.

 All of this to support vapour use-cases that no-one has taken
 advantage of because no-one ever needed to do that stuff.  So the sum
 achievement of OpenURL 1.0 has been (A) to fill people with fear of
 what used to be a very useful and perfectly straightforward
 specification, and (B) where implemented at all, to balkanise
 implementations.

Sounds a lot like Z39.50, to me, actually.  I guess I just see this as a 
classic example of librarians (and of course I'm generalizing) sitting with a 
tool-in-hand and saying this isn't good enough, tossing it in the trash, and 
then lamenting the lack of tools for doing useful things.  Sort of like MODS 
(for those on the NGC4lib list).  I know we're supposed to be pragmatists on 
C4L, but do we just relegate ourselves to doing stuff we need to do, or 
pushing our existing tools to experiment?

 You don't *have* to use the KEVs with OpenURL, you can use anything, 
 including eg. Dublin Core.
 
 Yeah.
 
 So long as you don't mind that only 0.01% of the world's OpenURL
 resolvers will know what to do with them.

Absolutely.  So how about we build some better resolvers and do useful and 
interesting new things with them?  Like, Twitter annotations.  :-)

MJ


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread MJ Suhonos
Let me correct myself (for the detail-oriented among us):

 Actually the difference between OpenURL and DC is that one is a transport 
 protocol and one is a metadata schema.  :-)

OpenURL is a *serialization format* which happens to be actionable by a 
transport protocol (HTTP), which is its main benefit.


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Jonathan Rochkind

I agree that OpenURL is crappy.

My point was that the problem case -- 'identifying' (or describing 
sufficiently for identification, if you like to call it that) 
publications that do not have standard identifiers -- is a real one.  
OpenURL _does_ solve it.   You _probably_ don't want to ignore this 
problem case in a twitter annotation scenario.  If you can solve it 
_better_ than OpenURL, then all the better. Or if you decide 
intentionally to exclude it from your scenario, that's fine; you know 
your intended domain.  

But OpenURL, despite its crappiness, _does_ address this problem case 
reasonably effectively, and it is really in use.


I'm certainly not trying to be an OpenURL booster.  But it works, and 
until/unless we have something better, it is addressing a problem case 
that is really important in many scenarios (like getting users to 
licensed full text, naturally).


Jonathan



Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Jonathan Rochkind
Yes, what MJ said is indeed exactly my perspective as well. 




Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Tim Spalding
Can we just hold a vote or something?

I'm happy to do whatever the community here wants and will actually
use. I want to do something that will be usable by others. I also
favor something dead simple, so it will be implemented. If we don't
reach some sort of conclusion, this is an interesting waste of time. I
propose only people engaged in doing something along these lines get
to vote?

Tim


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Jonathan Rochkind
I wouldn't count on the community using anything, just because random 
people on the listserv voted on it.


If you're coding it, you should take account of the feedback, and then 
go on and create something that YOU will use, and makes sense to you.  
And then hope other people do too.  That's pretty much the best you can do.


Vote by random people on a listserv is hardly a guarantee of getting a 
standard that actually works, or that people actually use -- just look 
at OpenURL 1.0!




Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Rosalyn Metz
I'm going to throw in my two cents.

I don't think (and correct me if I'm wrong) we have mentioned once what
a user might actually put in a twitter annotation.  A book title?  An
article title?  A link?

I think creating some super-complicated thing for a twitter annotation
dooms it to failure.  After all, it's twitter... make it short and
sweet.

Also, the 1.0 document for OpenURL isn't really that bad (yes, I have
read it).  A good portion of it is a chart with the different metadata
elements.  Also, OpenURL could conceivably refer to an animal and then
link to a bunch of resources on that animal, but no one has done that.
I don't think that's a problem with OpenURL; I think that's a problem
with the metadata sent by vendors to link resolvers and librarians'
lack of creativity (yes, I did make a ridiculous generalization that
was not intended to offend anyone, but inevitably it will).  Having
been a vendor who has worked with OpenURL, I know that the information
databases send seriously affects what you can actually do in a link
resolver.








Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Ross Singer
On Thu, Apr 29, 2010 at 10:32 AM, Rosalyn Metz rosalynm...@gmail.com wrote:
 I'm going to throw in my two cents.

 I dont think (and correct me if i'm wrong) we have mentioned once what
 a user might actually put in a twitter annotation.  a book title?  an
 article title? a link?

I think the idea is these would be machine generated from an
application.  So, imagine LT, Amazon, Delicious Library or SFX having
a Tweet this! button and *that* provides the annotation (not the
user).

 i think creating some super complicated thing for a twitter annotation
 dooms it to failure.  after all, its twitter...make it short and
 sweet.

Indeed, it's limited.

 also the 1.0 document for OpenURL isn't really that bad (yes I have
 read it).  a good portion of it is a chart with the different metadata
 elements.  also open url could conceivably refer to an animal and then
 link to a bunch of resources on that animal, but no one has done that.
  i don't think that's a problem with OpenURL i think thats a problem
 with the metadata sent by vendors to link resolvers and librarians
 lack of creativity (yes i did make a ridiculous generalization that
 was not intended to offend anyone but inevitably it will).  having
 been a vendor who has worked with openurl, i know that the informaiton
 databases send seriously effects (affects?) what you can actually do
 in a link resolver.

No, this is the mythical promise of 1.0, but delivery is, frankly,
much more complicated than that.  It is impractical to expect an
OpenURL link resolver to make sense of any old thing you throw at it
and return sensible results.  This is the point of the community
profiles: to narrow the infinite possibilities a bit.  None of our
current profiles would support the scenario you speak of, and if such
a service were to be devised, I would be surprised if it were built on
OpenURL.

I think it's very easy to underestimate how complicated it is to
actually build something using OpenURL since in the abstract it seems
like a very logical solution to any problem.

-Ross.








Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Benjamin Young
At #ldow2010 on Tuesday there was a presentation on semantic Twitter 
via TwitLogic:

http://twitlogic.fortytwo.net/

You can download the full paper if you're really curious:
http://events.linkeddata.org/ldow2010/papers/ldow2010_paper16.pdf

The Twitter Annotations system was mentioned at the end as a possible side 
option.  There's bound to be a good bit of talk in the Linked Data 
community on strapping RDF/RDFa onto Twitter Annotations, but I believe 
that's only just beginning.


Additionally (as someone outside of the library community proper), 
OpenURL's dependence on resolvers would be the largest concern.  Anyone 
could build similar "real thing" URLs and use 303 See Other redirects 
to return one or more digital resources about that real thing.  See 
this for more information:

http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039

Enjoy the reads,
Benjamin

--
President
BigBlueHat
P: 864.232.9553
W: http://www.bigbluehat.com/
http://www.linkedin.com/in/benjaminyoung




Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Jonathan Rochkind

Benjamin Young wrote:
Additionally (as someone outside of the library community proper), 
OpenURL's dependence on resolvers would be the largest concern.


This is a misconception.  An OpenURL context object can be created to 
provide structured semantic citation information, without any dependence 
on a resolver.  Just as a way of serializing structured semantic 
citation information in a standard way.


This is basically what COinS does.

Now, the largest concern with OpenURL to me is actually just that it's 
way harder to understand and work with than it should be to meet its 
primary use cases, and that means trying to use it as a standard for a 
new use case is probably asking for trouble on the adoption curve.


So here are the questions, my own summary analysis of this thread:

1.  What are the citations you think users would want to attach to a 
tweet? 
   a. Will they ALL have standard identifiers that can be expressed as 
some form of URI (ISBN, DOI, etc.)?
   b. Or is there an important enough subset of citations that will 
NOT have standard identifiers that you still want to support?



If you choose 'a' above, then the solution to me seems clear:  Simply 
attach a URI as your 'citation metadata' -- be willing to use info: 
URIs for ISBNs, ISSNs, LCCNs, OCLCnums, DOIs.   It should be clearly 
identified as the identifier for the thing cited by this tweet somehow, but 
the 'payload' is just a URI.   [ I know some people don't like 
non-resolvable info: URIs.  I like 'em, and THIS use case shows why. It 
allows you to attach an ISBN to a tweet as a URI right now, today, 
keeping your metadata schema simple (just a URI) while still allowing 
ISBNs. ]
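A sketch of option 'a' (the prefix-to-identifier mapping here is my own 
illustration; double-check the relevant registries before relying on it):

# Turn common standard identifiers into URIs to ride along with a tweet.
PREFIXES = {
    "doi": "info:doi/",
    "lccn": "info:lccn/",
    "oclcnum": "info:oclcnum/",
    "isbn": "urn:isbn:",
    "issn": "urn:issn:",
}

def identifier_uri(kind, value):
    return PREFIXES[kind] + value

# Hypothetical annotation payload; the key names are assumptions.
annotation = {"cites": {"id": identifier_uri("doi", "10.1234/example")}}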


And then we're done if we choose 'a' above, it's pretty simple.

If you choose 'b' above, then you need a way to identify (or describe 
sufficiently for identification) publications that do not have standard 
identifiers.


An OpenURL context object using the standard scholarly formats (the 
only ones actually being used much in the real world) is ONE such way 
that is _actually_ being used today for _just_ this purpose.  So it 
would be worth looking at. You could try to use it whole cloth, or you 
could just take the element schema from the scholarly formats and 
re-purpose it. You could try to fix some of its problems. (There are 
many.)


Or you could ignore OpenURL (or rather than ignore, review it briefly 
for ideas) and use one of the other formats that haven't really 
caught on yet, but might be designed a lot better than OpenURL.   
Examples brought up in this thread include something by Jakob Voss (that 
I don't have the URL handy for), some kind of citation-in-JSON format 
(that I don't have the URL handy for), and Bibo in RDF (that I don't 
have the URL handy for).  If you decide to go with any of these, it's 
probably worth _comparing_ them to OpenURL to make sure what can be 
expressed in OpenURL with the standard scholarly formats can _also_ be 
expressed in the format you chose. (Last time I looked at Bibo, I recall 
there was no place to put a standard identifier like a DOI.  So maybe 
using Bibo + URI for standard identifiers would suffice, etc.)


So this is my recommended framework for proceeding. Tim, I'm afraid 
you'll actually have to do the hard work yourself.  Standards creation 
is hard.   You aren't going to get something good just by getting some 
listserv to vote.  Many of us involved in this discussion may find this 
intellectually interesting, but may have no actual use _ourselves_ for 
such a format anyway.  If Amazon or someone like that comes up with 
something, it will end up becoming the 'de facto' standard, so I 
recommend trying to talk to Amazon to see if they're thinking about this 
-- or just wait to see if/what Amazon comes up with, and use that.


Jonathan


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Rosalyn Metz
ok right now exlibris has a recommender service for sfx that stores
metadata from an openurl.  lets say a vendor bothered to pass an
element like rft.subject=hippo (which is most likely unlikely to
happen since they can't even pass an issn half the time).  that
subject got stored in the recommender service.

next time a child saw something in ebsco animals about hippos they
could click the find this button (or whatever it says) and the
recommender service could bring up everything on hippos.  the openurl
that would be passed would be something like
http://your.linkresolver.com/name?rft.subject=hippo

yes this is simplistic, but its more creative then say doing something
boring like just bringing up the full text or doing something half ass
creative like bringing up articles that are cited in the footnotes.
and to say something like rft.subject (or whatever it might be called)
is out of the scope of group profiles is a little absurd since we are
talking about things that already have subjects attached to them (see
any database or other library related system).

of course you'll probably want to talk about next how subjects aren't
standardized and that makes it possible.  that is true, but that isn't
openurl's fault or the link resolver's fault, that's the database
vendors who refuse to get with the program.






On Thu, Apr 29, 2010 at 11:02 AM, Ross Singer rossfsin...@gmail.com wrote:
 On Thu, Apr 29, 2010 at 10:32 AM, Rosalyn Metz rosalynm...@gmail.com wrote:
 I'm going to throw in my two cents.

 I dont think (and correct me if i'm wrong) we have mentioned once what
 a user might actually put in a twitter annotation.  a book title?  an
 article title? a link?

 I think the idea is these would be machine generated from an
 application.  So, imagine LT, Amazon, Delicious Library or SFX having
 a Tweet this! button and *that* provides the annotation (not the
 user).

 i think creating some super complicated thing for a twitter annotation
 dooms it to failure.  after all, its twitter...make it short and
 sweet.

 Indeed, it's limited.

 also the 1.0 document for OpenURL isn't really that bad (yes I have
 read it).  a good portion of it is a chart with the different metadata
 elements.  also open url could conceivably refer to an animal and then
 link to a bunch of resources on that animal, but no one has done that.
  i don't think that's a problem with OpenURL i think thats a problem
 with the metadata sent by vendors to link resolvers and librarians
 lack of creativity (yes i did make a ridiculous generalization that
 was not intended to offend anyone but inevitably it will).  having
 been a vendor who has worked with openurl, i know that the informaiton
 databases send seriously effects (affects?) what you can actually do
 in a link resolver.

 No, this is the mythical promise of 1.0, but delivery is, frankly,
 much more complicated than that.  It is impractical to expect an
 OpenURL link resolver to make sense of any old thing you throw at it
 and return sensible results.  This is the point of the community
 profiles, to narrow the infinite possibilities a bit.  None of our
 current profiles would support the scenario you speak of and I would
 be surprised if such a service were to be devised, that it would be
 built on OpenURL.

 I think it's very easy to underestimate how complicated it is to
 actually build something using OpenURL since in the abstract it seems
 like a very logical solution to any problem.

 -Ross.




 On Thu, Apr 29, 2010 at 10:23 AM, Tim Spalding t...@librarything.com wrote:
 Can we just hold a vote or something?

 I'm happy to do whatever the community here wants and will actually
 use. I want to do something that will be usable by others. I also
 favor something dead simple, so it will be implemented. If we don't
 reach some sort of conclusion, this is an interesting waste of time. I
 propose only people engaged in doing something along these lines get
 to vote?

 Tim





Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Ross Singer
On Thu, Apr 29, 2010 at 11:21 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 (Last
 time I looked at Bibo, I recall there was no place to put a standard
 identifier like a DOI.  So maybe using Bibo + URI for standard identifier
 would suffice. etc.)

BIBO has all sorts of identifiers (including DOI):

http://bibotools.googlecode.com/svn/bibo-ontology/trunk/doc/dataproperties/doi___1125128004.html

As well as ISBN (10 and 13), ISSN/e-issn, LCCN, EAN, OCLCNUM, and more.
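
So a minimal by-value description in RDF could be as small as this 
Turtle sketch (the property names are BIBO/Dublin Core; the subject URI, 
title and DOI value are made up for illustration):

  @prefix bibo: <http://purl.org/ontology/bibo/> .
  @prefix dcterms: <http://purl.org/dc/terms/> .

  <http://example.org/article/42> a bibo:Article ;
      dcterms:title "An example article" ;
      bibo:doi "10.1234/example" .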

-Ross.


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Tim Spalding
 So this is my recommended framework for proceeding. Tim, I'm afraid you'll 
 actually have to do the hard work yourself.

No, I don't. Because the work isn't fundamentally that hard. A complex
standard might be, but I never for a moment considered anything like
that. We have *512 bytes*, and it needs to be usable by anyone.
Library technology is usually fatally over-engineered, but this is a
case where that approach isn't even possible.

 You aren't going to get something good just by getting some listserv to vote.

My suggestion was to have people interested in actually using it vote.

 Many of us involved in this discussion may find this intellectually 
 interesting, but may have no actual use _ourselves_ for such a format anyway.

Oh, I bet half of you guys have sharing buttons on your OPAC or
elsewhere. And many of you are on Twitter and, at least occasionally,
discuss a book.

 If Amazon or someone like that comes up with something, it will end up 
 becoming the 'de facto' standard, so I recommend trying to talk to Amazon to 
 see if they're thinking about this -- or just wait to see if/what Amazon 
 comes up with, and use that.

You're right. It's a thankless task to get even a subset of library
technologists to agree on something like this. It'd be less important
if I didn't know the Amazon solution will leave off key pieces
libraries need.

Then, three years from now, we can all conference-tweet about a CIL
talk, about all the cool ways libraries are using Twitter, and how
it's such a shame that the annotations standard wasn't designed with
libraries in mind.

Best,
Tim


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Ross Singer
I still don't really see how what you're talking about would
practically be accomplished.

For one, to have rft.subject, like you mention, would require using
the dublincore context set.  Since that wouldn't be useful on its own
for the services that link resolvers currently offer, OpenURL sources
(i.e. A&I database providers) would have to support SAP 2 (XML)
context objects so they can pass the book/journal/patent/etc. referent
metadata along with the Dublin Core referent metadata.  It also
becomes a POST rather than a simple link (GET).
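
For reference, the Dublin Core KEV version of just that one hint would be 
something like this hand-rolled, hypothetical link (and that's before you 
try to also carry the journal/book metadata, which is where the XML 
context object comes in):

http://your.linkresolver.com/name?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dc&rft.subject=hippo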

What I'm saying is it ups the requirements on all ends of the
ecosystem, for what?

What you're talking about would be *much* more easily implemented via
SRU and CQL (or OpenSearch), anyway, since your example is really
performing a search.  Since OpenURL doesn't have any semblance of
standardized response format, a client wouldn't know what to do with
the response, anyway.
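
Something like this hypothetical SRU searchRetrieve link (assuming a 
server that supports the dc context set) already gives you a 
standardized query *and* a standardized XML response:

http://sru.example.org/catalog?operation=searchRetrieve&version=1.1&query=dc.subject%3D%22hippo%22&maximumRecords=10&recordSchema=dc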

-Ross.

On Thu, Apr 29, 2010 at 11:29 AM, Rosalyn Metz rosalynm...@gmail.com wrote:
 ok right now exlibris has a recommender service for sfx that stores
 metadata from an openurl.  lets say a vendor bothered to pass an
 element like rft.subject=hippo (which is most likely unlikely to
 happen since they can't even pass an issn half the time).  that
 subject got stored in the recommender service.

 next time a child saw something in ebsco animals about hippos they
 could click the find this button (or whatever it says) and the
 recommender service could bring up everything on hippos.  the openurl
 that would be passed would be something like
 http://your.linkresolver.com/name?rft.subject=hippo

 yes this is simplistic, but its more creative then say doing something
 boring like just bringing up the full text or doing something half ass
 creative like bringing up articles that are cited in the footnotes.
 and to say something like rft.subject (or whatever it might be called)
 is out of the scope of group profiles is a little absurd since we are
 talking about things that already have subjects attached to them (see
 any database or other library related system).

 of course you'll probably want to talk about next how subjects aren't
 standardized and that makes it possible.  that is true, but that isn't
 openurl's fault or the link resolvers fault, thats the database
 vendors who refuse to get with the program.






 On Thu, Apr 29, 2010 at 11:02 AM, Ross Singer rossfsin...@gmail.com wrote:
 On Thu, Apr 29, 2010 at 10:32 AM, Rosalyn Metz rosalynm...@gmail.com wrote:
 I'm going to throw in my two cents.

 I dont think (and correct me if i'm wrong) we have mentioned once what
 a user might actually put in a twitter annotation.  a book title?  an
 article title? a link?

 I think the idea is these would be machine generated from an
 application.  So, imagine LT, Amazon, Delicious Library or SFX having
 a Tweet this! button and *that* provides the annotation (not the
 user).

 i think creating some super complicated thing for a twitter annotation
 dooms it to failure.  after all, its twitter...make it short and
 sweet.

 Indeed, it's limited.

 also the 1.0 document for OpenURL isn't really that bad (yes I have
 read it).  a good portion of it is a chart with the different metadata
 elements.  also open url could conceivably refer to an animal and then
 link to a bunch of resources on that animal, but no one has done that.
  i don't think that's a problem with OpenURL i think thats a problem
 with the metadata sent by vendors to link resolvers and librarians
 lack of creativity (yes i did make a ridiculous generalization that
 was not intended to offend anyone but inevitably it will).  having
 been a vendor who has worked with openurl, i know that the informaiton
 databases send seriously effects (affects?) what you can actually do
 in a link resolver.

 No, this is the mythical promise of 1.0, but delivery is, frankly,
 much more complicated than that.  It is impractical to expect an
 OpenURL link resolver to make sense of any old thing you throw at it
 and return sensible results.  This is the point of the community
 profiles, to narrow the infinite possibilities a bit.  None of our
 current profiles would support the scenario you speak of and I would
 be surprised if such a service were to be devised, that it would be
 built on OpenURL.

 I think it's very easy to underestimate how complicated it is to
 actually build something using OpenURL since in the abstract it seems
 like a very logical solution to any problem.

 -Ross.




 On Thu, Apr 29, 2010 at 10:23 AM, Tim Spalding t...@librarything.com 
 wrote:
 Can we just hold a vote or something?

 I'm happy to do whatever the community here wants and will actually
 use. I want to do something that will be usable by others. I also
 favor something dead simple, so it will be implemented. If we don't
 reach some sort of conclusion, this is an interesting waste of time. I
 propose only people engaged in doing something along these lines get
 to vote?

 Tim


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Eric Hellman
OK, back to Tim's specific question.

I'm not sure why you want to put bib data in a tweet at all for your 
application. Why not just use a shortened URL pointing at your page of 
metadata? That page could offer metadata via BIBO, Open Graph and FOAF in RDFa, 
COinS, RIS, etc. using established methods to serve multiple applications at 
once. When Twitter annotations come along, the URL can be put in the annotation 
field.
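
That way the whole annotation could be as small as, say (namespace, key 
and URL all purely hypothetical):

{ "biblio": { "url": "http://ex.ample/r/12345" } }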

Eric

On Apr 21, 2010, at 6:08 AM, Tim Spalding wrote:

 Have C4Lers looked at the new Twitter annotations feature?
 
 http://www.sitepoint.com/blogs/2010/04/19/twitter-introduces-annotations-hash-tags-become-obsolete/
 
 I'd love to get some people together to agree on a standard book
 annotation format, so two people can tweet about the same book or
 other library item, and they or someone else can pull that together.
 
 I'm inclined to start adding it to the I'm talking about and I'm
 adding links on LibraryThing. I imagine it could be easily added to
 many library applications too—anywhere there is or could be a share
 this on Twitter link, including OPACs, citation managers, library
 event feeds, etc.
 
 Also, wouldn't it be great to show the world another interesting,
 useful and cool use of library data that OCLC's rules would prohibit?
 
 So the question is the format. Only a maniac would suggest MARC. For
 size and other reasons, even MODS is too much. But perhaps we can
 borrow the barest of field names from MODS, COinS, or from the most
 commonly used bibliographic format, Amazon XML.
 
 Thoughts?
 
 Tim
 
 -- 
 Check out my library at http://www.librarything.com/profile/timspalding

Eric Hellman
President, Gluejar, Inc.
41 Watchung Plaza, #132
Montclair, NJ 07042
USA

e...@hellman.net 
http://go-to-hellman.blogspot.com/


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Jakob Voss

Dear Tim,

you wrote:

So this is my recommended framework for proceeding. Tim, I'm afraid
you'll actually have to do the hard work yourself.


No, I don't. Because the work isn't fundamentally that hard. A
complex standard might be, but I never for a moment considered
anything like that. We have *512 bytes*, and it needs to be usable by
anyone. Library technology is usually fatally over-engineered, but
this is a case where that approach isn't even possible.


Jonathan did a very good summary - you just have to pick what your main 
focus of embedding bibliographic data is.



A) I favour using the CSL-Record format which I summarized at

http://wiki.code4lib.org/index.php/Citation_Style_Language

because I had in mind that people want to have a nice looking citation 
of the publication that someone tweeted about. The drawback is that CSL 
is less adopted and will not always fit in 512 bytes



B) If your main focus is to link Tweets about the same publication (and 
other stuff about this publication) then you must embed identifiers. 
LibraryThing is mainly based on two identifiers


1) ISBN to identify editions
2) LT work ids to identify works

I wonder why LT work ids have not caught on more although you thankfully 
provide a full mapping to ISBN at 
http://www.librarything.com/feeds/thingISBN.xml.gz - but never mind. I 
thought that some LT records also contain other identifiers such as OCLC 
number, LOC number etc. but maybe I am wrong. The best way to specify 
identifiers is to use an URI (all relevant identifiers that I know have 
an URI form). For ISBN it is


urn:isbn:{ISBN13}

For LT Work-ID you can use the URL with your .com top level domain:

http://www.librarything.com/work/{LTWORKID}

That would cover tweets about books with an ISBN and tweets about a 
work, which will make up 99.9% of tweets from LT about single 
publications anyway.
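
A tweet annotation along those lines could then be as small as the 
following (the key names and the work number are invented for 
illustration; the identifier syntax is as above):

{ "pub": {
    "isbn": "urn:isbn:9780316769488",
    "work": "http://www.librarything.com/work/3203347"
} }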



C) If your focus is to let people search for a publication in libraries 
and to copy bibliographic data into reference management software, then 
COinS is the way to go. COinS is based on OpenURL, which I and others 
ranted about because it is a crappy library standard like MARC. But 
unlike other metadata formats COinS usually fits in less than 512 bytes. 
Furthermore you may have to deal with it for LibraryThing for Libraries 
anyway.



Although I strongly favour CSL as a practising library scientist and 
developer, I must admit that for LibraryThing the best way is to embed 
identifiers (ISBN and LT Work-ID) and maybe COinS. As long as 
LibraryThing does not open up to more complex publications like 
preprints of proceedings articles in series etc. but mainly deals with 
books and works, this will make LibraryThing users happy.


Then, three years from now, we can all conference-tweet about a CIL 
talk, about all the cool ways libraries are using Twitter, and how 
it's such a shame that the annotations standard wasn't designed with 
libraries in mind.


How about a bet instead of a vote? In three years, will there be:

a) No relevant Twitter annotations anyway
b) Twitter annotations but not used much for bibliographic data
c) A rich variety of incompatible bibliographic annotation standards
d) Semantic Web will have solved every problem anyway
..

Cheers
Jakob

--
Jakob Voß jakob.v...@gbv.de, skype: nichtich
Verbundzentrale des GBV (VZG) / Common Library Network
Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
+49 (0)551 39-10242, http://www.gbv.de


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Benjamin Young
I vote (heh) for "d", which will look a lot like "c" anyway, but with 
smatterings of owl:sameAs and httpRange-14-style 303s to keep things 
interesting. :)


--
President
BigBlueHat
P: 864.232.9553
W: http://www.bigbluehat.com/
http://www.linkedin.com/in/benjaminyoung


On 4/29/10 2:01 PM, Jakob Voss wrote:

How about a bet instead of voting. In three years will there be:

a) No relevant Twitter annotations anyway
b) Twitter annotations but not used much for bibliographic data
c) A rich variety of incompatible bibliographic annotation standards
d) Semantic Web will have solved every problem anyway 


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Alexander Johannesen
Hi,

On Thu, Apr 29, 2010 at 22:47, Walker, David dwal...@calstate.edu wrote:
 I would suggest it's more because, once you step outside of the
 primary use case for OpenURL, you end-up bumping into *other* standards.

These issues were raised all the back when it was created, as well. I
guess it's easy to be clever in hindsight. :) Here's what I wrote
about it 5 years ago (http://shelter.nu/blog-159.html) ;

So let's talk about 'Not invented here' first, because surely, we're
all guilty of this one from time to time. For example, lately I dug
into the ANSI/NISO Z39.88 -2004 standard, better known as OpenURL. I
was looking at it critically, I have to admit, comparing it to what I
already knew about Web Services, SOA, http,
Google/Amazon/Flickr/Del.icio.us API's, and various Topic Maps and
semantic web technologies (I was the technical editor of Explorers
Guide to the Semantic Web)

I think I can sum up my experiences with OpenURL as such; why? Why
have the library world invented a new way of doing things that already
can be done quite well already? Now, there is absolutely nothing wrong
with the standard per se (except a pretty darn awful choice of
name!!), so I'm not here criticising the technical merits and the work
put into it. No, it's a simple 'why' that I have yet to get a decent
answer to, even after talking to the OpenURL bigwigs about it. I mean,
come on; convince me! I'm not unreasonable, no truly, really, I just
want to be convinced that we need this over anything else.


Regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Ed Summers
On Tue, Apr 27, 2010 at 7:02 AM, Jakob Voss jakob.v...@gbv.de wrote:
 If you want to put bibliographic metadata
 into twitter annotations (good idea) you first need to clarify the basic
 purpose of embedding this information. I see two of them:

 I. Identification: To identify other tweets and resources that refer to the
 same publication

 II. Description: To nicely show which publication someone refers to.

I think this is right. I wonder, would you consider a potential use
case for Description to also provide machine readable data for a
resource when a standard identifier is not known?

It would be interesting to explore what identifiers + csl (and other
options) would look like in a twitter annotation if you had time to
mock something up in a wiki somewhere :-)

//Ed

[1] http://citationstyles.org/citation-style-language/schema/


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Jakob Voss

Hi

it's funny how quickly you vote against BibTeX, but at least it is a 
format that is frequently used in the wild to create citations. If you 
call BibTeX undocumented and garbage, then what do you call MARC, which 
is far more difficult to make use of?


My assumption was that there is a specific use case for bibliographic 
data in twitter annotations:


I. Identify publication = this can *only* be done seriously with 
identifiers like ISBN, DOI, OCLCNum, LCCN etc.


II. Deliver a citation = use a citation-oriented format (BibTeX, CSL, RIS)

I was not voting explicitly for BibTeX but at least there is a large 
community that can make use of it. I strongly favour CSL 
(http://citationstyles.org/) because:


- there is a JavaScript CSL-Processor. JavaScript is kind of a 
punishment but it is the natural environment for the Web 2.0 Mashup 
crowd that is going to implement applications that use Twitter annotations


- there are dozens of CSL citation styles so you can display a citation 
in any way you want


As Ross pointed out RIS would be an option too, but I miss the easy open 
source tools that use RIS to create citations from RIS data.


Any other relevant format that I know (Bibont, MODS, MARC etc.) does not 
aim at identification or citation in the first place but tries to model 
the full variety of bibliographic metadata. If your use case is


III. Provide semantic properties and connections of a publication

Then you should look at the Bibliographic Ontology. But III does *not* 
just subsume use case II - it is a different story that is not being 
told by normal people but only by metadata experts, semantic web gurus, 
library system developers etc. (I would count myself among these 
groups). If you want such complex data then you should use systems 
other than Twitter for data exchange anyway.


A list of CSL metadata fields can be found at

http://citationstyles.org/downloads/specification.html#appendices

and the JavaScript-Processor (which is also used in Zotero) provides 
more information for developers: http://groups.google.com/group/citeproc-js


Cheers
Jakob

P.S: An example of a CSL record from the JavaScript client:

{
  "title": "True Crime Radio and Listener Disenchantment with Network Broadcasting, 1935-1946",
  "author": [ {
    "family": "Razlogova",
    "given": "Elena"
  } ],
  "container-title": "American Quarterly",
  "volume": "58",
  "page": "137-158",
  "issued": { "date-parts": [ [2006, 3] ] },
  "type": "article-journal"
}


--
Jakob Voß jakob.v...@gbv.de, skype: nichtich
Verbundzentrale des GBV (VZG) / Common Library Network
Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
+49 (0)551 39-10242, http://www.gbv.de


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Ed Summers
On Wed, Apr 28, 2010 at 4:17 AM, Jakob Voss jakob.v...@gbv.de wrote:
 P.S: An example of a CSL record from the JavaScript client:

 {
 title: True Crime Radio and Listener Disenchantment with Network
 Broadcasting, 1935-1946,
  author: [ {
    family: Razlogova,
    given: Elena
  } ],
  container-title: American Quarterly,
  volume: 58,
  page: 137-158,
  issued: { date-parts: [ [2006, 3] ] },
  type: article-journal
 }

This looks really nice for the Description side. Has the JSON
serialization for CSL been detailed anywhere yet?

//Ed


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Owen Stephens
We've had problems with RIS on a recent project. Although there is a
specification (http://www.refman.com/support/risformat_intro.asp), it is (I
feel) lacking enough rigour to ever be implemented consistently. The most
common issue in the wild that I've seen is use of different tags for the
same information (which the specification does not nail down enough to know
when each should be used):

Use of TI or T1 for primary title
Use of AU or A1 for primary author
Use of UR, L1 or L2 to link to 'full text'

Perhaps more significantly the specification doesn't include any field
specifically for a DOI, but despite this EndNote (owned by ISI ResearchSoft,
who are also responsible for the RIS format specification) includes the DOI
in a DO field in its RIS output - not to specification.
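
To make that concrete, here is a hand-written RIS record of the sort we 
see in the wild (the DOI value is a placeholder; note the DO tag, which 
is the EndNote extension rather than anything in the published spec, and 
that other tools would emit T1/A1 where this record uses TI/AU):

TY  - JOUR
TI  - True Crime Radio and Listener Disenchantment with Network Broadcasting, 1935-1946
AU  - Razlogova, Elena
JO  - American Quarterly
VL  - 58
SP  - 137
EP  - 158
PY  - 2006
DO  - 10.1234/example
ER  - 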

Owen

On Wed, Apr 28, 2010 at 9:17 AM, Jakob Voss jakob.v...@gbv.de wrote:

 Hi

 it's funny how quickly you vote against BibTeX, but at least it is a format
 that is frequently used in the wild to create citations. If you call BibTeX
 undocumented and garbage then how do you call MARC which is far more
 difficult to make use of?

 My assumption was that there is a specific use case for bibliographic data
 in twitter annotations:

 I. Identifiy publication = this can *only* be done seriously with
 identifiers like ISBN, DOI, OCLCNum, LCCN etc.

 II. Deliver a citation = use a citation-oriented format (BibTeX, CSL, RIS)

 I was not voting explicitly for BibTeX but at least there is a large
 community that can make use of it. I strongly favour CSL (
 http://citationstyles.org/) because:

 - there is a JavaScript CSL-Processor. JavaScript is kind of a punishment
 but it is the natural environment for the Web 2.0 Mashup crowd that is going
 to implement applications that use Twitter annotations

 - there are dozens of CSL citation styles so you can display a citation in
 any way you want

 As Ross pointed out RIS would be an option too, but I miss the easy open
 source tools that use RIS to create citations from RIS data.

 Any other relevant format that I know (Bibont, MODS, MARC etc.) does not
 aim at identification or citation at the first place but tries to model the
 full variety of bibliographic metadata. If your use case is

 III. Provide semantic properties and connections of a publication

 Then you should look at the Bibliographic Ontology. But III does *not*
 just subsume usecase II. - it is a different story that is not beeing told
 by normal people but only but metadata experts, semantic web gurus, library
 system developers etc. (I would count me to this groups). If you want such
 complex data then you should use other systems but Twitter for data exchange
 anyway.

 A list of CSL metadata fields can be found at

 http://citationstyles.org/downloads/specification.html#appendices

 and the JavaScript-Processor (which is also used in Zotero) provides more
 information for developers: http://groups.google.com/group/citeproc-js

 Cheers
 Jakob

 P.S: An example of a CSL record from the JavaScript client:

 {
 title: True Crime Radio and Listener Disenchantment with Network
 Broadcasting, 1935-1946,
  author: [ {
family: Razlogova,
given: Elena
  } ],
  container-title: American Quarterly,
  volume: 58,
  page: 137-158,
  issued: { date-parts: [ [2006, 3] ] },
  type: article-journal

 }


 --
 Jakob Voß jakob.v...@gbv.de, skype: nichtich
 Verbundzentrale des GBV (VZG) / Common Library Network
 Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
 +49 (0)551 39-10242, http://www.gbv.de




-- 
Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: o...@ostephens.com


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Jakob Voss

Ed Summers wrote:


II. Description: To nicely show which publication someone refers to.


I think this is right. I wonder, would you consider a potential use
case for Description to also provide machine readable data for a
resource when a standard identifier is not known?


There are lookup services to get a standard identifier when only some 
bibliographic data is known - mainly OpenURL. I have not investigated 
whether you can easily map CSL format to OpenURL or if you need to also 
embed the OpenURL as twitter annotation. However all lookup services 
that I know are either crappy or proprietary or both. This is not a 
technical issue but just based on a lack of data (hopefully to get 
better with more linked open data). Given enough open bibliographic data 
anyone can create a lookup service where you throw in some title, author 
and such and get back an identified record. I think there are also 
some services called library catalogs for this purpose.


Anyway this is nothing that can be solved with a bibliographic data 
format alone. Either you have a standard identifier or you do not. If 
you do not, you must rely on third-party services that run independently 
of your bibliographic data.



It would be interesting to explore what identifiers + csl (and other
options) would look like in a twitter annotation if you had time to
mock something up in a wiki somewhere :-)


I summarized my findings on CSL at

http://wiki.code4lib.org/index.php/Citation_Style_Language

and included some ideas of CSL and other data in twitter annotations. 
Feel free to modify!
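
For instance, one such strawman (the annotation namespace is invented; 
the inner fields follow the CSL input format, and all the values are 
made up):

{ "cite": {
    "id": "info:doi/10.1000/182",
    "csl": {
      "title": "An example article",
      "author": [ { "family": "Doe", "given": "Jane" } ],
      "container-title": "Example Journal",
      "issued": { "date-parts": [ [2010] ] },
      "type": "article-journal"
    }
} }

Minified, that comes to roughly 200-odd bytes, so it sits comfortably 
inside the 512-byte limit.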


Cheers
Jakob

--
Jakob Voß jakob.v...@gbv.de, skype: nichtich
Verbundzentrale des GBV (VZG) / Common Library Network
Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
+49 (0)551 39-10242, http://www.gbv.de


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Walker, David
I was also just working on DOI with RIS.

It looks like both Endnote and Refworks recognize 'DO' for DOIs.  But 
apparently Zotero does not.  If Zotero supported it, I'd say we'd have a de 
facto standard on our hands.

In fact, I couldn't figure out how to pass a DOI to Zotero using RIS.  Or, at 
least, in my testing I never saw the DOI show-up in Zotero.  I don't really use 
Zotero, so I may have missed it.

--Dave

==
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu

From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Owen Stephens 
[o...@ostephens.com]
Sent: Wednesday, April 28, 2010 2:26 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Twitter annotations and library software

We've had problems with RIS on a recent project. Although there is a
specification (http://www.refman.com/support/risformat_intro.asp), it is (I
feel) lacking enough rigour to ever be implemented consistently. The most
common issue in the wild that I've seen is use of different tags for the
same information (which the specification does not nail down enough to know
when each should be used):

Use of TI or T1 for primary title
Use of AU or A1 for primary author
Use of UR, L1 or L2 to link to 'full text'

Perhaps more significantly the specification doesn't include any field
specifically for a DOI, but despite this EndNote (owned by ISI ResearchSoft,
who are also responsible for the RIS format specification) includes the DOI
in a DO field in its RIS output - not to specification.

Owen

On Wed, Apr 28, 2010 at 9:17 AM, Jakob Voss jakob.v...@gbv.de wrote:

 Hi

 it's funny how quickly you vote against BibTeX, but at least it is a format
 that is frequently used in the wild to create citations. If you call BibTeX
 undocumented and garbage then how do you call MARC which is far more
 difficult to make use of?

 My assumption was that there is a specific use case for bibliographic data
 in twitter annotations:

 I. Identifiy publication = this can *only* be done seriously with
 identifiers like ISBN, DOI, OCLCNum, LCCN etc.

 II. Deliver a citation = use a citation-oriented format (BibTeX, CSL, RIS)

 I was not voting explicitly for BibTeX but at least there is a large
 community that can make use of it. I strongly favour CSL (
 http://citationstyles.org/) because:

 - there is a JavaScript CSL-Processor. JavaScript is kind of a punishment
 but it is the natural environment for the Web 2.0 Mashup crowd that is going
 to implement applications that use Twitter annotations

 - there are dozens of CSL citation styles so you can display a citation in
 any way you want

 As Ross pointed out RIS would be an option too, but I miss the easy open
 source tools that use RIS to create citations from RIS data.

 Any other relevant format that I know (Bibont, MODS, MARC etc.) does not
 aim at identification or citation at the first place but tries to model the
 full variety of bibliographic metadata. If your use case is

 III. Provide semantic properties and connections of a publication

 Then you should look at the Bibliographic Ontology. But III does *not*
 just subsume usecase II. - it is a different story that is not beeing told
 by normal people but only but metadata experts, semantic web gurus, library
 system developers etc. (I would count me to this groups). If you want such
 complex data then you should use other systems but Twitter for data exchange
 anyway.

 A list of CSL metadata fields can be found at

 http://citationstyles.org/downloads/specification.html#appendices

 and the JavaScript-Processor (which is also used in Zotero) provides more
 information for developers: http://groups.google.com/group/citeproc-js

 Cheers
 Jakob

 P.S: An example of a CSL record from the JavaScript client:

 {
 title: True Crime Radio and Listener Disenchantment with Network
 Broadcasting, 1935-1946,
  author: [ {
family: Razlogova,
given: Elena
  } ],
  container-title: American Quarterly,
  volume: 58,
  page: 137-158,
  issued: { date-parts: [ [2006, 3] ] },
  type: article-journal

 }


 --
 Jakob Voß jakob.v...@gbv.de, skype: nichtich
 Verbundzentrale des GBV (VZG) / Common Library Network
 Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
 +49 (0)551 39-10242, http://www.gbv.de




--
Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: o...@ostephens.com


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Owen Stephens
Unfortunately RefWorks only imports DO - not exports! We now recommend using
RefWorks XML when exporting (for our project) - which is fine, but not
publicly documented as far as I know :(

Zotero recommends using BibTeX for importing from RefWorks, I think

Owen

On Wed, Apr 28, 2010 at 2:05 PM, Walker, David dwal...@calstate.edu wrote:

 I was also just working on DOI with RIS.

 It looks like both Endnote and Refworks recognize 'DO' for DOIs.  But
 apparently Zotero does not.  If Zotero supported it, I'd say we'd have a de
 facto standard on our hands.

 In fact, I couldn't figure out how to pass a DOI to Zotero using RIS.  Or,
 at least, in my testing I never saw the DOI show-up in Zotero.  I don't
 really use Zotero, so I may have missed it.

 --Dave

 ==
 David Walker
 Library Web Services Manager
 California State University
 http://xerxes.calstate.edu
 
 From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Owen
 Stephens [o...@ostephens.com]
 Sent: Wednesday, April 28, 2010 2:26 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Twitter annotations and library software

 We've had problems with RIS on a recent project. Although there is a
 specification (http://www.refman.com/support/risformat_intro.asp), it is
 (I
 feel) lacking enough rigour to ever be implemented consistently. The most
 common issue in the wild that I've seen is use of different tags for the
 same information (which the specification does not nail down enough to know
 when each should be used):

 Use of TI or T1 for primary title
 Use of AU or A1 for primary author
 Use of UR, L1 or L2 to link to 'full text'

 Perhaps more significantly the specification doesn't include any field
 specifically for a DOI, but despite this EndNote (owned by ISI
 ResearchSoft,
 who are also responsible for the RIS format specification) includes the DOI
 in a DO field in its RIS output - not to specification.

 Owen

 On Wed, Apr 28, 2010 at 9:17 AM, Jakob Voss jakob.v...@gbv.de wrote:

  Hi
 
  it's funny how quickly you vote against BibTeX, but at least it is a
 format
  that is frequently used in the wild to create citations. If you call
 BibTeX
  undocumented and garbage then how do you call MARC which is far more
  difficult to make use of?
 
  My assumption was that there is a specific use case for bibliographic
 data
  in twitter annotations:
 
  I. Identifiy publication = this can *only* be done seriously with
  identifiers like ISBN, DOI, OCLCNum, LCCN etc.
 
  II. Deliver a citation = use a citation-oriented format (BibTeX, CSL,
 RIS)
 
  I was not voting explicitly for BibTeX but at least there is a large
  community that can make use of it. I strongly favour CSL (
  http://citationstyles.org/) because:
 
  - there is a JavaScript CSL-Processor. JavaScript is kind of a punishment
  but it is the natural environment for the Web 2.0 Mashup crowd that is
 going
  to implement applications that use Twitter annotations
 
  - there are dozens of CSL citation styles so you can display a citation
 in
  any way you want
 
  As Ross pointed out RIS would be an option too, but I miss the easy open
  source tools that use RIS to create citations from RIS data.
 
  Any other relevant format that I know (Bibont, MODS, MARC etc.) does not
  aim at identification or citation at the first place but tries to model
 the
  full variety of bibliographic metadata. If your use case is
 
  III. Provide semantic properties and connections of a publication
 
  Then you should look at the Bibliographic Ontology. But III does *not*
  just subsume usecase II. - it is a different story that is not beeing
 told
  by normal people but only but metadata experts, semantic web gurus,
 library
  system developers etc. (I would count me to this groups). If you want
 such
  complex data then you should use other systems but Twitter for data
 exchange
  anyway.
 
  A list of CSL metadata fields can be found at
 
  http://citationstyles.org/downloads/specification.html#appendices
 
  and the JavaScript-Processor (which is also used in Zotero) provides more
  information for developers: http://groups.google.com/group/citeproc-js
 
  Cheers
  Jakob
 
  P.S: An example of a CSL record from the JavaScript client:
 
  {
  title: True Crime Radio and Listener Disenchantment with Network
  Broadcasting, 1935-1946,
   author: [ {
 family: Razlogova,
 given: Elena
   } ],
   container-title: American Quarterly,
   volume: 58,
   page: 137-158,
   issued: { date-parts: [ [2006, 3] ] },
   type: article-journal
 
  }
 
 
  --
  Jakob Voß jakob.v...@gbv.de, skype: nichtich
  Verbundzentrale des GBV (VZG) / Common Library Network
  Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
  +49 (0)551 39-10242, http://www.gbv.de
 



 --
 Owen Stephens
 Owen Stephens Consulting
 Web: http://www.ostephens.com
 Email: o...@ostephens.com




-- 
Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com

Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread MJ Suhonos
 - there is a JavaScript CSL-Processor. JavaScript is kind of a punishment but 
 it is the natural environment for the Web 2.0 Mashup crowd that is going to 
 implement applications that use Twitter annotations

A quick word of caution here; we got excited about citeproc-js until learning 
that it actually requires a specific extension compiled into the Javascript 
interpreter, E4X: 
http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#javascript-interpreters

This is fine and cool, but is not as widely supported as Javascript itself; eg. 
Internet Explorer, Chrome, Safari, and a number of server-side Javascript 
engines do not have E4X support:
http://en.wikipedia.org/wiki/E4x
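
For anyone wondering what E4X actually adds: it makes XML a native 
literal type in the language, which (as I understand it) citeproc-js 
leans on for handling the CSL style XML. A minimal illustration -- the 
first line is a syntax error on engines without E4X:

var style = <style xmlns="http://purl.org/net/xbiblio/csl" class="in-text"/>;
var cls = style.@class;   // E4X attribute access, yields "in-text"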

That said, I'm very excited about CSL in general and this thread in particular 
— structured citation parsing is what I dream about at night.  Great stuff.

MJ


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Jonathan Rochkind

Jakob Voss wrote:
I. Identifiy publication = this can *only* be done seriously with 
identifiers like ISBN, DOI, OCLCNum, LCCN etc.
  
Ah, but for better or for worse, that's not the world we live in. We 
have LOTS of publications that either lack such identifiers altogether, 
or where information about identifiers is not available. (Mostly the 
former). That we need to identify. This is an actual use case, you can't 
just dismiss it by saying it can't be done!


The biggest example is pretty much every scholarly journal article. (A 
significant _minority_ have DOI or pmid; the majority have neither). 

And we DO identify these articles, by a description meant to serve as 
identification, often by using OpenURL. Maybe we're not doing it 
seriously, but it's a real use case, and it's being done in the wild 
in production.


Jonathan

  


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Jonathan Rochkind

Jakob Voss wrote:


There are lookup services to get a standard identifier when only some 
bibliographic data is known - mainly OpenURL.
A standard identifier is not always _available_ -- even if you have 
access to a service to look up standard identifiers (a not necessarily 
realistic expectation for real-world use cases), not every publication 
HAS a standard identifier.


Jonathan

  


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Jonathan Rochkind
Has anyone actually gotten a _server-side_ process up and running that 
uses CSL to produce formatted citations?  Using citeproc-js with a 
custom-compiled JS interpreter, or anything else?


This is what I'm interested in -- I'm not concerned with making it run 
in a browser, so a custom-compiled JS interpreter isn't a showstopper.  
But it is still something that I'm not familiar with doing, so it's going to 
take me a while to figure out how to set up.  If anyone has already set 
anything up (using citeproc-js or anything else we may not know about), 
can you let us know, and maybe share your tips/instructions/code?


Jonathan

MJ Suhonos wrote:

- there is a JavaScript CSL-Processor. JavaScript is kind of a punishment but 
it is the natural environment for the Web 2.0 Mashup crowd that is going to 
implement applications that use Twitter annotations



A quick word of caution here; we got excited about citeproc-js until learning that it 
actually requires a specific extension compiled into the Javascript interpreter, E4X: 
http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#javascript-interpreters

This is fine and cool, but is not as widely supported as Javascript itself; eg. 
Internet Explorer, Chrome, Safari, and a number of server-side Javascript 
engines do not have E4X support:
http://en.wikipedia.org/wiki/E4x

That said, I'm very excited about CSL in general and this thread in particular 
— structured citation parsing is what I dream about at night.  Great stuff.

MJ

  


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Jakob Voss

Jonathan Rochkind wrote:


Jakob Voss wrote:
I. Identifiy publication = this can *only* be done seriously with 
identifiers like ISBN, DOI, OCLCNum, LCCN etc.
  
Ah, but for better or for worse, that's not the world we live in. We 
have LOTS of publications that either lack such identifiers altogether, 
or where information about identifiers is not available. (Mostly the 
former). That we need to identify. This is an actual use case, you can't 
just dismiss it by saying it can't be done!


Call me pedantic but if you do not have an identifier then there is no 
hope of identifying the publication by means of metadata. You only 
*describe* it with metadata and use additional heuristics (mostly search 
engines) to hopefully identify the publication based on the description.


But these additional heuristics are not part of the metadata, while a 
well-defined identifier implies a standard of how the identifier has 
been created and how it can be looked up.


The last hope if there is no identifier is to create one. For instance 
our library system creates internal record numbers (such as OCLC 
numbers) which can be reused. You can also define an algorithm that 
creates a hash as an identifier, like the bibkey I mentioned. But as 
long as there is no identifier, there is no identification independent 
of a bibliographic database that already contains the record to search in.
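
Just to illustrate the general shape of such a key -- this is a rough 
sketch, *not* the actual bibkey/BibSonomy algorithm, whose normalization 
rules are spelled out on the wiki page I linked elsewhere in this 
thread -- it boils down to normalize, concatenate, hash:

// rough sketch in Node-style JavaScript
var crypto = require('crypto');

function roughBibKey(title, firstAuthorSurname, year) {
  // lower-case and strip everything except letters and digits
  function norm(s) {
    return String(s).toLowerCase().replace(/[^a-z0-9]+/g, '');
  }
  var basis = [norm(title), norm(firstAuthorSurname), norm(year)].join('|');
  return crypto.createHash('md5').update(basis).digest('hex');
}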


Jakob

--
Jakob Voß jakob.v...@gbv.de, skype: nichtich
Verbundzentrale des GBV (VZG) / Common Library Network
Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
+49 (0)551 39-10242, http://www.gbv.de


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Jonathan Rochkind

Jakob Voss wrote:


Call me pedantic but if you do not have an identifier than there is no 
hope to identity the publication by means of metadata. You only 
*describe* it with metadata and use additional heuristics (mostly search 
engines) to hopefully identify the publication based on the description.
  
But the entire OpenURL infrastructure DOES this, and does it without 
using search engines. It's a real-world use case that has a solution in 
production! So, yeah, I call you pedantic for wanting to pretend the use 
case and the real-world solution don't exist. :)


You can call it description rather than identification if you like; 
that is a question of terminology. But it's description that is meant to 
uniquely identify a particular publication, and that a whole bunch of 
software in use every day successfully uses to identify a particular 
publication.


It IS a hacky and error-prone solution, to be sure.   But it's the best 
solution we've got, because it's simply a fact that we have many 
publications we want to identify that lack standard identifiers.


If a twitter annotation setup wants to be able to identify publications 
that don't have standard identifiers, then you don't want to ignore this 
use case and how software actually in production currently deals with 
it. You can perhaps find a better way to deal with it -- I'm certainly 
not arguing for OpenURL as the be-all and end-all, I rather hate OpenURL 
actually.  But dismissing it as impossible is indeed pedantic, since 
it's being done!


Jonathan

  


Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread Eric Hellman
I mean, really, if the folks at RefWorks, EndNote, Papers, Zotero and LibX 
don't have crash programs underway to integrate Twitter clients into their 
software to send and receive  reference metadata payloads they can use in the 
Twitter annotation field, they really ought to hire me to come and bash some 
sense into them. Really.

I still think by-reference payloads, as described at 
http://go-to-hellman.blogspot.com/2010/04/when-shall-we-link.html, 
would go the farthest, but surely these folks know very well what they can send 
and receive.

Eric

On Apr 28, 2010, at 4:17 AM, Jakob Voss wrote:

 Hi
 
 it's funny how quickly you vote against BibTeX, but at least it is a format 
 that is frequently used in the wild to create citations. If you call BibTeX 
 undocumented and garbage then how do you call MARC which is far more 
 difficult to make use of?
 
 My assumption was that there is a specific use case for bibliographic data in 
 twitter annotations:
 
 I. Identifiy publication = this can *only* be done seriously with 
 identifiers like ISBN, DOI, OCLCNum, LCCN etc.
 
 II. Deliver a citation = use a citation-oriented format (BibTeX, CSL, RIS)
 
 I was not voting explicitly for BibTeX but at least there is a large 
 community that can make use of it. I strongly favour CSL 
 (http://citationstyles.org/) because:
 
 - there is a JavaScript CSL-Processor. JavaScript is kind of a punishment but 
 it is the natural environment for the Web 2.0 Mashup crowd that is going to 
 implement applications that use Twitter annotations
 
 - there are dozens of CSL citation styles so you can display a citation in 
 any way you want
 
 As Ross pointed out RIS would be an option too, but I miss the easy open 
 source tools that use RIS to create citations from RIS data.
 
 Any other relevant format that I know (Bibont, MODS, MARC etc.) does not aim 
 at identification or citation at the first place but tries to model the full 
 variety of bibliographic metadata. If your use case is
 
 III. Provide semantic properties and connections of a publication
 
 Then you should look at the Bibliographic Ontology. But III does *not* just 
 subsume usecase II. - it is a different story that is not beeing told by 
 normal people but only but metadata experts, semantic web gurus, library 
 system developers etc. (I would count me to this groups). If you want such 
 complex data then you should use other systems but Twitter for data exchange 
 anyway.
 
 A list of CSL metadata fields can be found at
 
 http://citationstyles.org/downloads/specification.html#appendices
 
 and the JavaScript-Processor (which is also used in Zotero) provides more 
 information for developers: http://groups.google.com/group/citeproc-js
 
 Cheers
 Jakob
 
 P.S: An example of a CSL record from the JavaScript client:
 
 {
 title: True Crime Radio and Listener Disenchantment with Network 
 Broadcasting, 1935-1946,
  author: [ {
family: Razlogova,
given: Elena
  } ],
 container-title: American Quarterly,
 volume: 58,
 page: 137-158,
 issued: { date-parts: [ [2006, 3] ] },
 type: article-journal
 }
 
 
 -- 
 Jakob Voß jakob.v...@gbv.de, skype: nichtich
 Verbundzentrale des GBV (VZG) / Common Library Network
 Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
 +49 (0)551 39-10242, http://www.gbv.de

Eric Hellman
President, Gluejar, Inc.
41 Watchung Plaza, #132
Montclair, NJ 07042
USA

e...@hellman.net 
http://go-to-hellman.blogspot.com/


Re: [CODE4LIB] Twitter annotations and library software

2010-04-27 Thread Jakob Voss

Hi Tim,

you wrote:


Unless someone can come up with a perfect pre-cooked format—one that
not only covers what we need but is also super easy and
space-efficient (we have only 1/2k to use!)—Why don't we just decide
on:

'simplebib' : {

}

and start filling in fields. I don't think it makes sense to
externalize the information under another URL, at least in the first
instance. That at least doubles the calls involved, and makes whatever
you build dependent on lots of external services that may or may not
work.


Oh yeah, let's create just another ad-hoc metadata format, because 
obviously there are not enough different formats out there already!


To be honest: I admire your multitude of good ideas and efforts but this 
is one of the rare counterexamples. If you want to put bibliographic 
metadata into twitter annotations (good idea) you first need to clarify 
the basic purpose of embedding this information. I see two of them:


I. Identification: To identify other tweets and resources that refer to 
the same publication


II. Description: To nicely show which publication someone refers to.


The purpose of identification can be served by the following means:

a). standard identifiers
b). standard identifiers
c). standard identifiers

Examples of standard identifiers include ISBN, OCLC Number, ASIN, 
LibraryThing Work-ID, well-defined bibliographic hash keys [*] etc.



The purpose of description can best be served by a format that can 
easily be displayed for human beings. You can either use a simple 
string or a well-known format. A string can be displayed but people will 
put all different citation formats in there. Right now there are only 
two established metadata formats that aim at creating a citation:


a) BibTeX
b) The input format of the Citation Style Language (CSL)

I bet that CSL is the easier way to go. See http://citationstyles.org/ 
for details and examples.
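
For comparison, the same kind of record in hand-written BibTeX, using 
the journal article that serves as the CSL example elsewhere in this 
thread (so treat it as a sketch, not canonical data):

@article{razlogova2006,
  author  = {Razlogova, Elena},
  title   = {True Crime Radio and Listener Disenchantment with Network Broadcasting, 1935--1946},
  journal = {American Quarterly},
  volume  = {58},
  pages   = {137--158},
  year    = {2006}
}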



Cheers
Jakob

[*] See http://www.gbv.de/wikis/cls/Bibliographic_Hash_Key for a 
description of the mapping mechanism that is also used in BibSonomy to 
match BibTeX records.


--
Jakob Voß jakob.v...@gbv.de, skype: nichtich
Verbundzentrale des GBV (VZG) / Common Library Network
Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
+49 (0)551 39-10242, http://www.gbv.de


Re: [CODE4LIB] Twitter annotations and library software

2010-04-27 Thread Ross Singer
On Tue, Apr 27, 2010 at 7:02 AM, Jakob Voss jakob.v...@gbv.de wrote:

 The purpose of description can best be served by a format that can easily be
 displayed for human beeings. You can either use a simple string or a
 well-known format. A string can be displayed but people will put all
 different citation formats in there. Right now there are only two
 established metadata formats that aim at creating a citation:

 a) BibTeX
 b) The input format of the Citation Style Language (CSL)

This isn't entirely true.  There's RIS
(http://en.wikipedia.org/wiki/RIS_%28file_format%29) and BIBO
(http://bibliontology.com/) is starting to become quite common in the
linked data sphere.

There's also BibJSON (http://www.bibkn.org/bibjson/index.html), which
I've had a browser tab open for months with the intention of actually
looking at, and which seems quite well suited to how Twitter will
store annotations.
similar to yours -- why another citation format and why bind it so
closely to a particular serialization?

-Ross.


Re: [CODE4LIB] Twitter annotations and library software

2010-04-27 Thread Tom Pasley
-1 for BibTex!

It can be hard to comprehensively parse without inadvertently creating garbage.

Tom

On Wed, Apr 28, 2010 at 1:00 AM, Ross Singer rossfsin...@gmail.com wrote:
 On Tue, Apr 27, 2010 at 7:02 AM, Jakob Voss jakob.v...@gbv.de wrote:

 The purpose of description can best be served by a format that can easily be
 displayed for human beeings. You can either use a simple string or a
 well-known format. A string can be displayed but people will put all
 different citation formats in there. Right now there are only two
 established metadata formats that aim at creating a citation:

 a) BibTeX
 b) The input format of the Citation Style Language (CSL)

 This isn't entirely true.  There's RIS
 (http://en.wikipedia.org/wiki/RIS_%28file_format%29) and BIBO
 (http://bibliontology.com/) is starting to become quite common in the
 linked data sphere.

 There's also BibJSON (http://www.bibkn.org/bibjson/index.html) which
 I've had a browser tab open for months with the intention of actually
 looking at and actually seems quite well suited for how Twitter will
 store annotations.  My opinion of it all along, however, has been very
 similar to yours -- why another citation format and why bind it so
 closely to a particular serialization?

 -Ross.



Re: [CODE4LIB] Twitter annotations and library software

2010-04-27 Thread stuart yeates

Jakob Voss wrote:

a) BibTeX


Can I vote against BibTex, please?

At the core of BibTeX is a language called 'BST' -- or at least that's 
the file extension used, which is as close as it comes to a name.


This is an entirely undocumented language written to work on a patchily 
documented format. It's stack-based (not unlike PostScript), with 
special operation(s) to manipulate names based on deep assumptions about 
names and the ways they are formatted. These assumptions, by and large, 
hold for the personal names of North American English speakers (but I 
seem to recall is unable to correctly format the name of the President 
of the USA due to his title). The further you move from names of North 
American English speakers, the more they break (non-ASCII characters, 
eastern order names, complex titles, non-standard capitalisation, etc, 
etc, etc).


BST is non-recursive, attempting to execute recursive functions gives 
the error Curse on you, wizard, before you recurse on me. Yes, the BST 
interpreter does refer to users as wizards, which seems less cool 
after the first 12 hours of debugging.


Users have adapted to BibTeX by using an experimental 
approach---tinkering with the BibTeX entries until they 'look right,' 
which in most cases involves cramming everything into what BibTeX thinks 
of as the surname, because BibTeX never omits or initialises the surname.


If we're going to use a bibliographic framework, please, please, please 
don't make it BibTeX.


cheers
stuart
--
Stuart Yeates
http://www.nzetc.org/   New Zealand Electronic Text Centre
http://researcharchive.vuw.ac.nz/ Institutional Repository


Re: [CODE4LIB] Twitter annotations and library software

2010-04-21 Thread Mark A. Matienzo
On Wed, Apr 21, 2010 at 6:08 AM, Tim Spalding t...@librarything.com wrote:
 I'd love to get some people together to agree on a standard book
 annotation format, so two people can tweet about the same book or
 other library item, and they or someone else can pull that together.

 I'm inclined to start adding it to the I'm talking about and I'm
 adding links on LibraryThing. I imagine it could be easily added to
 many library applications too—anywhere there is or could be a share
 this on Twitter link, including OPACs, citation managers, library
 event feeds, etc.

By this description alone it seems to me that OpenURL, perhaps
implemented as some variation on COinS, would make the most sense.
With OpenURL, the fields have already been defined. Perhaps the
underlying JSON for the annotation could look something like the
following:

{ 'annotations':
  { 'z3988':
    { 'contextobject':
      'ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.issn=1045-4438' }
  }
}

Additionally, one could specify an optional resolver parameter if so desired.
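
For example, hypothetically (the 'resolver' key name and the resolver 
URL are just placeholders):

{ 'annotations':
  { 'z3988':
    { 'contextobject': 'ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.issn=1045-4438',
      'resolver': 'http://resolver.example.edu/openurl' }
  }
}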

Mark A. Matienzo
Digital Archivist, Manuscripts and Archives
Yale University Library


Re: [CODE4LIB] Twitter annotations and library software

2010-04-21 Thread Karen Coombs
Have you looked at the citation microformat (
http://microformats.org/wiki/citation) ? Don't know where work with this
stands but it seems pretty interesting to me.

Karen

On Wed, Apr 21, 2010 at 8:21 AM, Mark A. Matienzo m...@matienzo.org wrote:

 On Wed, Apr 21, 2010 at 6:08 AM, Tim Spalding t...@librarything.com
 wrote:
  I'd love to get some people together to agree on a standard book
  annotation format, so two people can tweet about the same book or
  other library item, and they or someone else can pull that together.
 
  I'm inclined to start adding it to the I'm talking about and I'm
  adding links on LibraryThing. I imagine it could be easily added to
  many library applications too—anywhere there is or could be a share
  this on Twitter link, including OPACs, citation managers, library
  event feeds, etc.

 By this description alone it seems to me that OpenURL, perhaps
 implemented as some variation on COinS, would make the most sense.
 With OpenURL, the fields have already been defined. Perhaps the
 underlying JSON for the annotation could look something like the
 following:

 { 'annotations':
  { 'z3988':
{ 'contextobject':

 'ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.issn=1045-4438'}
  }
 }

 Additionally, one could specify an optional resolver parameter if so
 desired.

 Mark A. Matienzo
 Digital Archivist, Manuscripts and Archives
 Yale University Library



Re: [CODE4LIB] Twitter annotations and library software

2010-04-21 Thread Ed Summers
On Wed, Apr 21, 2010 at 6:08 AM, Tim Spalding t...@librarything.com wrote:
 I'm inclined to start adding it to the I'm talking about and I'm
 adding links on LibraryThing. I imagine it could be easily added to
 many library applications too—anywhere there is or could be a share
 this on Twitter link, including OPACs, citation managers, library
 event feeds, etc.

You might want to add it now, but I don't think annotations are
available yet in Twitter. If you haven't seen it already, Marcel Molina
of Twitter outlined how annotations might work in a post he made to
the twitter-api discussion list [1].

 So the question is the format. Only a maniac would suggest MARC. For
 size and other reasons, even MODS is too much. But perhaps we can
 borrow the barest of field names from MODS, COinS, or from the most
 commonly used bibliographic format, Amazon XML.

 Thoughts?

It sounds like a good idea to have a common pattern. I could
definitely see a use case for wanting to aggregate conversations
around books and such.

//Ed


Re: [CODE4LIB] Twitter annotations and library software

2010-04-21 Thread Ed Summers
whoops, forgot my footnote :-)

[1] 
http://groups.google.com/group/twitter-api-announce/browse_thread/thread/fa5da2608865453


Re: [CODE4LIB] Twitter annotations and library software

2010-04-21 Thread Tim Spalding
I was wondering if there was a good microformat. The trick is that the
citation format is very much about stuff that gets displayed, and
lacks the critical linking IDs you'd want—ISBN, ISSN, LCCN, OCLC, ASIN,
EAN, etc.

If people know of others that would work, maybe that's the answer.

On Wed, Apr 21, 2010 at 8:38 AM, Karen Coombs librarywebc...@gmail.com wrote:
 Have you looked the the citation microformat (
 http://microformats.org/wiki/citation) ? Don't know where work with this
 stands but it seems pretty interesting to me.

 Karen


Re: [CODE4LIB] Twitter annotations and library software

2010-04-21 Thread Jonathan Rochkind
So almost all of those identifiers can be formatted as a URI, although 
sometimes it takes an info: URI, which some people don't like but I do, 
for reasons relevant to their usefulness here.


ISBN, ISSN, LCCN, and OCLCnum all have registered info: URI 
sub-schemes.  I once tried to figure out how to express an EAN as a URI, 
and I think I _did_ eventually find _something_, but it was kind of 
confusing and hard to track down (The EAN/UPC/etc people have some info 
URI subschemes registered too, I think, but it's hard to figure out what 
it all means).  For ASIN, I have been in the habit of using an Amazon 
http URI, the problem is that Amazon really offers several http URIs for 
the same ASIN, so you kind of just have to pick one format.


Oh, and you can do DOI as an info: URI too.

So your annotation _could_ simply be a URI.  And get a lot of stuff. 
But this leaves out a lot of things that don't really have good 
identifiers at all:   Articles in popular (not scholarly) 
newspapers/journals;   most daily newspapers as titles themselves (don't 
usually have an ISSN);  Movies;  books too old to have (or for other odd reasons 
lacking) an ISBN (or lccn or oclcnum).  Scholarly articles that don't 
have a DOI (the majority of them).


Maybe you could use the citation microformat extended to take arbitrary 
URI identifiers?  So for stuff without an identifier, you've got the 
citation details, but you can still stick identifiers in with URIs?
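
As a rough Python sketch of that 'citation details plus whatever identifier 
URIs you have' shape; every field name below is invented for illustration, 
and the identifier values are placeholders:

# Sketch of a citation-plus-identifiers payload. Field names and
# identifier values are placeholders, not a settled vocabulary.
import json

no_identifier = {
    'cite': {
        'title': 'Some newspaper article with no DOI',
        'container': 'The Daily Example',
        'date': '2010-04-21',
        'identifiers': [],   # nothing better than the citation itself
    }
}

with_identifiers = {
    'cite': {
        'title': 'Some Book',
        'identifiers': [
            'info:isbn:1234556X',               # placeholder value
            'http://amazon.com/asin/whatever',  # one of Amazon's several http forms
        ],
    }
}

print(json.dumps(no_identifier))
print(json.dumps(with_identifiers))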


And as someone else mentioned, this _is_ pretty much the use-case of 
traditional OpenURL, and it does handle it well enough: allowing you to 
put enough structured citation in to identify the referent for things 
without identifiers, and allowing you to put arbitrary URIs in rft_id.   
But OpenURL is kind of a monster to work with.   And it doesn't deal too 
well with certain kinds of citations like movies or music either; it's 
really focused on published textual materials.


Jonathan

Tim Spalding wrote:

I was wondering if there was a good microformat. The trick is that the
citation format is very much about stuff that gets displayed, and
lacks the critical linking ids you'd want—ISBN, SSN, LCCN, OCLC, ASIN,
EAN, etc.

If people know of others that would work, maybe that's the answer.

On Wed, Apr 21, 2010 at 8:38 AM, Karen Coombs librarywebc...@gmail.com wrote:
  

Have you looked the the citation microformat (
http://microformats.org/wiki/citation) ? Don't know where work with this
stands but it seems pretty interesting to me.

Karen


Re: [CODE4LIB] Twitter annotations and library software

2010-04-21 Thread Tim Spalding
Unless someone can come up with a perfect pre-cooked format—one that
not only covers what we need but is also super easy and
space-efficient (we have only 1/2k to use!)—why don't we just decide
on:

'simplebib' : {

}

and start filling in fields. I don't think it makes sense to
externalize the information under another URL, at least in the first
instance. That at least doubles the calls involved, and makes whatever
you build dependent on lots of external services that may or may not
work.
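
Purely as a strawman, here is a Python sketch of filling in such a 'simplebib' 
object and keeping the JSON inside the 1/2k budget; every field name is 
invented for the example.

# Strawman: build a 'simplebib' payload and, if the JSON blows the
# budget, drop the least essential fields first. All field names are
# invented for illustration.
import json

MAX_BYTES = 512  # the 1/2k mentioned above

def simplebib(title, author=None, year=None, isbn=None, url=None):
    fields = {'title': title, 'author': author, 'year': year,
              'isbn': isbn, 'url': url}
    bib = {k: v for k, v in fields.items() if v is not None}
    payload = {'simplebib': bib}
    for optional in ('url', 'author', 'year'):
        if len(json.dumps(payload).encode('utf-8')) <= MAX_BYTES:
            break
        bib.pop(optional, None)
    return payload

print(json.dumps(simplebib('Some Book', author='Some Author', isbn='1234556X')))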

Best,
Tim

On Wed, Apr 21, 2010 at 10:45 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 So almost all of those identifiers can be formatted as a URI.   Although
 sometimes it takes an info: uri, which some people don't like, but I like,
 for reasons relevant to their usefulness here.

 ISBN, ISSN, LCCN, and OCLCnum all have registered info: URI sub-schemes.  I
 once tried to figure out how to express an EAN as a URI, and I think I _did_
 eventually find _something_, but it was kind of confusing and hard to track
 down (The EAN/UPC/etc people have some info URI subschemes registered too, I
 think, but it's hard to figure out what it all means).  For ASIN, I have
 been in the habit of using an Amazon http URI, the problem is that Amazon
 really offers several http URIs for the same ASIN, so you kind of just have
 to pick one format.

 Oh, and you can do DOI as an info: URI too.

 So your annotation _could_ simply be a URI.  And get a lot of stuff. But
 this leaves out a lot of things that don't really have good identifiers at
 all:   Articles in popular (not scholarly) newspapers/journals;   most daily
 newspapers as titles themselves (don't usually have an ISSN);  Movies;
  books too old (or for other odd reasons lacking) an ISBN (or lccn or
 oclcnum).  Scholarly articles that don't have a DOI (the majority of them).

 Maybe you could use the citation microformat extended to take arbitrary URI
 identifiers?  So for stuff without an identifier, you've got the citation
 details, but you can still stick identifiers in with URIs?

 And as someone else mentioned, this _is_ pretty much the use-case of
 traditional OpenURL, and it does handle it well enough: allowing you put
 enough structured citation in to identify the referent for things without
 identifiers, allowing you to put arbitrary URIs  in rft_id.   But OpenURL is
 kind of a monster to work with.   And doesn't deal too well with certain
 kinds of citations like movies or music either, it's really focused on
 published textual materials.

 Jonathan

 Tim Spalding wrote:

 I was wondering if there was a good microformat. The trick is that the
 citation format is very much about stuff that gets displayed, and
 lacks the critical linking ids you'd want—ISBN, SSN, LCCN, OCLC, ASIN,
 EAN, etc.

 If people know of others that would work, maybe that's the answer.

 On Wed, Apr 21, 2010 at 8:38 AM, Karen Coombs librarywebc...@gmail.com
 wrote:


 Have you looked the the citation microformat (
 http://microformats.org/wiki/citation) ? Don't know where work with this
 stands but it seems pretty interesting to me.

 Karen







-- 
Check out my library at http://www.librarything.com/profile/timspalding


Re: [CODE4LIB] Twitter annotations and library software

2010-04-21 Thread Mark A. Matienzo
On Wed, Apr 21, 2010 at 10:58 AM, Tim Spalding t...@librarything.com wrote:
 Unless someone can come up with a perfect pre-cooked format—one that
 not only covers what we need but is also super easy and
 space-efficient (we have only 1/2k to use!)—Why don't we just decide
 on:

 'simplebib' : {

 }

 and start filling in fields.

Because I don't think we've decided anything. I for one don't think we
should have yet another arbitrary citation format floating around the
Web.

Mark A. Matienzo
Digital Archivist, Manuscripts and Archives
Yale University Library


Re: [CODE4LIB] Twitter annotations and library software

2010-04-21 Thread Jonathan Rochkind
Just to clarify: my suggestion, encoding identifiers as URIs, is NOT 
externalizing the information under another URL.   It is just picking 
a standard format for identifiers, the identifier format of the web, to 
re-use standards and cut down on custom vocabulary. If your 'simplebib' 
idea made sense, it could look like:


'simplebib' : {
   identifier:  info:isbn:1234556X
}

or identifier: info:oclcnum:whatever
etc.

Note that info URIs not only don't need to be looked up from another 
URL to resolve -- info URIs are actually un-resolvable!   While the 
ASIN http URI is (sort of) resolvable, it still doesn't _need_ to be 
looked up to resolve. Nothing is externalized.


'simplebib' : {
  identifier:  http://amazon.com/asin/whatever
}
or whatever.

Likewise for OpenURL.  Despite the name, OpenURL is, in practice, a 
standard vocabulary/encoding for citation details; it is not a method of 
'externalizing the information'. This is an OpenURL context object in 
KEV format that identifies a particular book:


rft.title=Manufacturing Consent&rft.au=Noam Chomsky&rft_id=info:isbn:whatever


Etc.
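
It is worth noting how little machinery the KEV form needs on the consuming 
side; a quick Python sketch using only the standard library and the example 
context object above:

# Sketch: a KEV context object is just a query string, so the standard
# library can pull the citation details back out without any external
# lookup. Nothing is externalized.
from urllib.parse import parse_qs

kev = 'rft.title=Manufacturing Consent&rft.au=Noam Chomsky&rft_id=info:isbn:whatever'
fields = {k: v[0] for k, v in parse_qs(kev).items()}
print(fields['rft.title'], '|', fields['rft_id'])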


If you want to make up your own brand new citation format, then of 
course that is within your capabilities.  It seems to me that trying to 
re-use as much existing infrastructure as possible is good.  Even if 
that's just re-using URI infrastructure (including info: URIs).  
Especially if you expect anyone other than you to 'adopt' this.


Jonathan

Tim Spalding wrote:

Unless someone can come up with a perfect pre-cooked format—one that
not only covers what we need but is also super easy and
space-efficient (we have only 1/2k to use!)—Why don't we just decide
on:

'simplebib' : {

}

and start filling in fields. I don't think it makes sense to
externalize the information under another URL, at least in the first
instance. That at least doubles the calls involved, and makes whatever
you build dependent on lots of external services that may or may not
work.

Best,
Tim

On Wed, Apr 21, 2010 at 10:45 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
  

So almost all of those identifiers can be formatted as a URI.   Although
sometimes it takes an info: uri, which some people don't like, but I like,
for reasons relevant to their usefulness here.

ISBN, ISSN, LCCN, and OCLCnum all have registered info: URI sub-schemes.  I
once tried to figure out how to express an EAN as a URI, and I think I _did_
eventually find _something_, but it was kind of confusing and hard to track
down (The EAN/UPC/etc people have some info URI subschemes registered too, I
think, but it's hard to figure out what it all means).  For ASIN, I have
been in the habit of using an Amazon http URI, the problem is that Amazon
really offers several http URIs for the same ASIN, so you kind of just have
to pick one format.

Oh, and you can do DOI as an info: URI too.

So your annotation _could_ simply be a URI.  And get a lot of stuff. But
this leaves out a lot of things that don't really have good identifiers at
all:   Articles in popular (not scholarly) newspapers/journals;   most daily
newspapers as titles themselves (don't usually have an ISSN);  Movies;
 books too old (or for other odd reasons lacking) an ISBN (or lccn or
oclcnum).  Scholarly articles that don't have a DOI (the majority of them).

Maybe you could use the citation microformat extended to take arbitrary URI
identifiers?  So for stuff without an identifier, you've got the citation
details, but you can still stick identifiers in with URIs?

And as someone else mentioned, this _is_ pretty much the use-case of
traditional OpenURL, and it does handle it well enough: allowing you put
enough structured citation in to identify the referent for things without
identifiers, allowing you to put arbitrary URIs  in rft_id.   But OpenURL is
kind of a monster to work with.   And doesn't deal too well with certain
kinds of citations like movies or music either, it's really focused on
published textual materials.

Jonathan

Tim Spalding wrote:


I was wondering if there was a good microformat. The trick is that the
citation format is very much about stuff that gets displayed, and
lacks the critical linking ids you'd want—ISBN, SSN, LCCN, OCLC, ASIN,
EAN, etc.

If people know of others that would work, maybe that's the answer.

On Wed, Apr 21, 2010 at 8:38 AM, Karen Coombs librarywebc...@gmail.com
wrote:

  

Have you looked the the citation microformat (
http://microformats.org/wiki/citation) ? Don't know where work with this
stands but it seems pretty interesting to me.

Karen



Re: [CODE4LIB] Twitter annotations and library software

2010-04-21 Thread Eric Hellman
I think Twitter annotations would be a good use for 
http://thing-described-by.org/ or a functional equivalent. The payload of the 
annotation would simply be a description URI plus a namespace and value, used 
for descriptions by reference.

1. the mechanism would be completely generic, usable for any sort of reference, 
not siloed in libraryland. In other words, we might actually get people to 
adopt it.
2. libraryland descriptions could use BIBO or RDA or both or whatever, and 
could be concise or verbose
3. descriptions could be easily reused

I'll write this up a bit more and would be interested in comment, but it's 
where this post was going:
http://go-to-hellman.blogspot.com/2010/04/when-shall-we-link.html
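
In the meantime, here is a purely hypothetical Python/JSON sketch of what a 
description-by-reference annotation might carry; the key names and the 
description URI are invented for illustration and may look nothing like the 
eventual write-up.

# Hypothetical sketch: the annotation payload is only a pointer to a
# description that lives elsewhere, plus a hint about the vocabulary
# it uses. Key names and URIs are invented for illustration.
import json

annotation = {
    'describedby': {
        'href': 'http://example.org/descriptions/some-book',  # placeholder description URI
        'vocab': 'bibo',   # could be BIBO, RDA, both, or whatever
    }
}
print(json.dumps(annotation))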



Eric Hellman
President, Gluejar, Inc.
41 Watchung Plaza, #132
Montclair, NJ 07042
USA

e...@hellman.net 
http://go-to-hellman.blogspot.com/