Ah, hadn't heard about that service! I don't think any publishers/databases are 
likely to use it on their journal article pages so I'm probably safe. But it 
does have a 'lovely' example of all the annoying punctuation a DOI can legally 
contain...

Deborah

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Joe 
Hourcle
Sent: Wednesday, 22 May 2013 2:03 p.m.
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] DOI scraping

On May 21, 2013, at 9:40 PM, Fitchett, Deborah wrote:

> Joe and Owen--
>
> Thanks for the ideas!
>
> It's a bit of the opposite goal to LibX, in that rather than having a 
> title/DOI/whatever from some random site and wanting to get to the full-text 
> article, I'm looking at the use case of academics who are already viewing the 
> full-text article and want a link that they can share with students.  Even 
> aside from the proxy prefix, the url in their browser may include (or consist 
> entirely of) session gunk.
>
> I'll try a regexp and see how far that gets me. I'm a bit trepidatious about 
> the way the DOI standard allows just about any character imaginable, but at 
> least there's the 10. prefix. Am also considering that if DOIs also appear in 
> the article's bibliography I'll need to make sure the javascript can 
> distinguish between them and the DOI for the article itself; but a lot of 
> this might be 'cross that bridge if I come to it' stuff.
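
For what it's worth, the regexp approach described above might look something like the sketch below. Nothing here comes from the thread: the function name findDois, the 4-9 digit registrant code after "10.", and the rule of trimming trailing sentence punctuation are all assumptions, and the suffix is matched as "anything that isn't whitespace" precisely because so many characters are legal. It returns matches in document order and makes no attempt to tell the article's own DOI from those in its bibliography.

```javascript
// Assumed pattern (not from the thread): "10." + a 4-9 digit registrant
// code + "/" + any run of non-whitespace characters as the suffix.
const DOI_PATTERN = /\b10\.\d{4,9}\/\S+/g;

// Hypothetical helper: return every DOI-like string found in the text,
// in the order it appears.
function findDois(text) {
  const matches = text.match(DOI_PATTERN) || [];
  // Heuristic: a trailing ".", "," or ";" is more likely sentence
  // punctuation than part of the DOI, so strip it. This will mangle
  // a DOI that genuinely ends in one of those characters.
  return matches.map((doi) => doi.replace(/[.,;]+$/, ""));
}

console.log(findDois("The DOI is 10.1000/182, see also doi:10.1234/abc(def)<x>;2-O"));
```

The DOIs in the usage line are made-up strings for illustration only.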


Crap.  I just remembered:

        http://shortdoi.org/

... I don't know if any publishers are actually using them, or if they're just 
for people to use on twitter & other social media.

The real problem with them is that they don't have the '10.' string in them.

You can probably get away with just tracking the resolving form of them:

        http://doi[.]org/(\w+)

And ignore the

        10/(\w+)

form.
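
A rough sketch of tracking only the resolving form, in case it helps. The function name findShortDoiCodes is made up, and two things go beyond the pattern above: it also accepts https, and it skips links whose path starts with "10." so that ordinary full DOIs under the same resolver aren't mistaken for short codes. Treat both as assumptions.

```javascript
// Resolver-form pattern from the message above, with two assumed tweaks:
// "https" is also accepted, and paths starting with "10." are skipped
// because those are full DOIs rather than short codes.
const SHORT_DOI_LINK = /https?:\/\/doi\.org\/(?!10\.)(\w+)/g;

// Hypothetical helper: collect the short codes from every resolver-style
// link in the text. The bare "10/code" form is deliberately ignored.
function findShortDoiCodes(text) {
  const codes = [];
  for (const match of text.matchAll(SHORT_DOI_LINK)) {
    codes.push(match[1]);
  }
  return codes;
}

// "abcde" is a placeholder code, not a real short DOI.
console.log(findShortDoiCodes("See http://doi.org/abcde and http://doi.org/10.1000/182"));
```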

-Joe


