Re: [CODE4LIB] Implementing OpenURL for simple web resources
> You might find the WebCite service [1] to be of some use.

Thanks - I'll have a look (although obviously Mike's experience is worrying).

> Of course it cannot work retroactively, so it is best if researchers use it in the first place.

It seems the number of our authors/researchers using bibliographic management s/w at all is pretty small. This is anecdotal, but it reflects my experience across the sector - a few academics are interested in this, but the majority are not.

Owen

Owen Stephens
TELSTAR Project Manager
Library and Learning Resources Centre
The Open University
Walton Hall
Milton Keynes, MK7 6AA
T: +44 (0) 1908 858701
F: +44 (0) 1908 653571
E: o.steph...@open.ac.uk

The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302).
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Thanks Erik,

Yes - generally references to web sites require a 'route of access' (i.e. URL) and 'date accessed' - because, of course, the content of the website may change over time. Strictly you are right - if you are going to link to the resource, it should be to the version of the page that was available at the time the author accessed it. This time aspect is something I'm thinking about more as a result of the conversations on this thread.

The 'date accessed' seems like a good way of differentiating different possible resolutions of a single URL. Unfortunately references don't have a specified format for the date, and it can be expressed in a variety of ways - typically you'll see something like 'Accessed 14 September 2009', but as far as I know it could be 'Accessed 14/09/09' or I guess 'Accessed 09/14/09' etc.

It is also true that the intent of a reference can vary - sometimes the intent is to point at a website, and sometimes to point to the content of a website at a moment in time (thinking loosely in FRBR terms, I guess you'd say that sometimes you want to reference the work/expression, and sometimes the manifestation? - although I know FRBR gets complicated when you look at digital representations, a whole other discussion).

To be honest, our project is not going to delve into this too much - limited both by time (we finish in February) and practicalities (I just don't think the library/institution is going to want to look at snapshotting websites, or finding archived versions, for each course we run - I suspect it would be less effort to update the course to use a more current reference in the cases where this problem really manifests itself).

One of the other things I've come to realise is that although it is nice to be able to access material that is referenced, the reference primarily recognises the work of others and puts your work into context - access is only a secondary concern. It is perfectly possible and OK to reference material that is not generally available: as a reader I may not have access to certain material, and over time material is destroyed, so when referencing rare or unique texts it may become absolutely impossible to access the referenced source.

I think for research publications there is a genuine and growing issue - especially when we start to consider the practice of referencing datasets, which is just starting to become common in scientific research. If the dataset grows over time, will it be possible to see the version of the dataset used when doing a specific piece of research?

Owen

Owen Stephens
TELSTAR Project Manager, The Open University
E: o.steph...@open.ac.uk

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Erik Hetzner
Sent: 15 September 2009 18:12
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Implementing OpenURL for simple web resources

Hi Owen, all:

This is a very interesting problem.

At Tue, 15 Sep 2009 10:04:09 +0100, O.Stephens wrote:
[...]
> If we look at a website it is pretty difficult to reference it without including the URL - it seems to be the only good way of describing what you are actually talking about (how many people think of websites by 'title', 'author' and 'publisher'?). For me, this leads to an immediate confusion between the description of the resource and the route of access to it. So, to differentiate, I'm starting to think of the http URI in a reference like this as a URI, but not necessarily a URL. We then need some mechanism to check, given a URI, what is the URL.
[...]
> The problem with the approach (as Nate and Eric mention) is that any approach that relies on the URI as an identifier (whether using OpenURL or a script) is going to have problems, as the same URI could be used to identify different resources over time. I think Eric's suggestion of using additional information to help differentiate is worth looking at, but I suspect that this is going to cause us problems - although I'd say that it is likely to cause us much less work than the alternative, which is allocating every single reference to a web resource used in our course material its own persistent URL.
[...]

I might be misunderstanding you, but I think that you are leaving out the implicit dimension of time here - when was the URL referenced? What can we use to represent the tuple (URL, date), and how do we retrieve an appropriate representation of this tuple? Is the most appropriate representation the most recent version of the page, wherever it may have moved? Or is the most appropriate representation the page as it existed in the past? I would argue that the most appropriate representation would be the page as it existed in the past, not what the page looks like now - but I am [...]
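Erik's (URL, date) tuple can be resolved concretely against the Internet Archive's public Wayback Machine availability API - the endpoint and the JSON shape below are the real archive.org API, while the function names and the sample reply are ours, and no live call is made here:

```python
# Sketch: resolving a (URL, date accessed) pair against the Wayback
# Machine availability API at archive.org.
import json
import urllib.parse

WAYBACK_API = "https://archive.org/wayback/available"

def wayback_query(url, accessed):
    """Build an availability-API request for `url` as of `accessed`,
    a YYYYMMDD string taken from the reference's 'date accessed'."""
    qs = urllib.parse.urlencode({"url": url, "timestamp": accessed})
    return WAYBACK_API + "?" + qs

def closest_snapshot(response_body):
    """Pull the closest archived copy, if any, out of the API's JSON reply."""
    data = json.loads(response_body)
    snap = data.get("archived_snapshots", {}).get("closest")
    if snap and snap.get("available"):
        return snap["url"]
    return None

# A reply of the shape the API returns (sample data, not a live call):
sample = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20090824000000/http://www.bbc.co.uk/",
            "timestamp": "20090824000000",
        }
    }
})
```

A reference's 'date accessed', normalised to YYYYMMDD, is enough to ask for the page as it existed at citation time rather than whatever the URL serves today - though, as noted in the thread, the variety of date formats in real references makes that normalisation the hard part.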
Re: [CODE4LIB] Implementing OpenURL for simple web resources
At Wed, 16 Sep 2009 13:39:42 +0100, O.Stephens wrote:
[...]
> I think for research publications there is a genuine and growing issue - especially when we start to consider the practice of referencing datasets which is just starting to become common practice in scientific research. If the dataset grows over time, will it be possible to see the version of the dataset used when doing a specific piece of research?

You might find the WebCite service [1] to be of some use. Of course it cannot work retroactively, so it is best if researchers use it in the first place.

best, Erik Hetzner

1. http://www.webcitation.org/

;; Erik Hetzner, California Digital Library
;; gnupg key id: 1024D/01DB07E3
Re: [CODE4LIB] Implementing OpenURL for simple web resources
> True. How, from the OpenURL, are you going to know that the rft is meant to represent a website? I guess that was part of my question.

But no one has suggested defining a new metadata profile for websites (which I probably would avoid tbh). DC doesn't seem to offer a nice way of doing this (that is, saying 'this is a website'), although there are perhaps some bits and pieces (format, type) that could be used to give some indication (but I suspect not unambiguously).

> But I still think what you want is simply a purl server. What makes you think you want OpenURL in the first place? But I still don't really understand what you're trying to do: "deliver consistency of approach across all our references" -- so are you using OpenURL for its more conventional use too, but you want to tack on a purl-like functionality to the same software that's doing something more like a conventional link resolver? I don't completely understand your use case.

I wouldn't use OpenURL just to get a persistent URL - I'd almost certainly look at PURL for this. But I want something slightly different: I want our course authors to be able to use whatever URL they know for a resource, but still try to ensure that the link works persistently over time. I don't think it is reasonable for a user to have to know a 'special' URL for a resource - and the PURL approach means establishing a PURL for every resource used in our teaching material, whether or not it moves in the future - which is an overhead it would be nice to avoid.

You can hit delete now if you aren't interested, but ... perhaps if I just say a little more about the project I'm working on it may clarify.

The project I'm working on is concerned with referencing and citation. We are looking at how references appear in teaching material (esp. online) and how they can be reused by students in their personal environment (in essays, later study, or something else). The references that appear can be to anything - books, chapters, journals, articles, etc. Increasingly, of course, there are references to web-based materials.

For print material, references generally describe the resource and nothing more, but for digital material references are expected not only to describe the resource, but also to state a route of access to it. This tends to be a bad idea when (for example) referencing e-journals, as we know the problems that surround this - many different routes of access to the same item. OpenURLs work well in this situation and seem to me like a sensible (and perhaps the only viable) solution. So we can say that for journals/articles it is sensible to ignore any URL supplied as part of the reference, and to form an OpenURL instead. If there is a DOI in the reference (which is increasingly common) then that can be used to form a URL using DOI resolution, but it makes more sense to me to hand this off to another application rather than bake it into the reference - and OpenURL resolvers are reasonably well placed to do this.

If we look at a website, it is pretty difficult to reference it without including the URL - it seems to be the only good way of describing what you are actually talking about (how many people think of websites by 'title', 'author' and 'publisher'?). For me, this leads to an immediate confusion between the description of the resource and the route of access to it. So, to differentiate, I'm starting to think of the http URI in a reference like this as a URI, but not necessarily a URL. We then need some mechanism to check, given a URI, what is the URL.

Now I could do this with a script - just pass the URI to a script that checks what URL to use against a list and redirects the user if necessary. On this point Jonathan said:

> if the usefulness of your technique does NOT count on being inter-operable with existing link resolver infrastructure... PERSONALLY I would be using OpenURL, I don't think it's worth it

- but it struck me that if we were passing a URI to a script, why not pass it in an OpenURL? I could see a number of advantages to this in the local context:

- Consistency: references to websites get treated the same as references to journal articles - this means a single approach on the course side, with flexibility.

- Usage stats: we could collect these whatever we do, but if we do it via OpenURL we get them in the same place as the stats about usage of other scholarly material, and could consider driving personalisation services off the data (like the bX product from Ex Libris).

- Appropriate copy problem: for resources we subscribe to with authentication mechanisms there is (I think) an equivalent of the 'appropriate copy' issue as with journal articles - we can push a URI to 'Web of Science' to the correct version of Web of Science via a local authentication method (using ezproxy for us).

The problem with this approach (as Nate and Eric mention) is that any approach that relies on the URI as an identifier (whether using OpenURL or a script) is going to have problems, as the same URI could be used to identify different resources over time.
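As a sketch of what 'passing the URI in an OpenURL' might look like: a KEV-format ContextObject using the Dublin Core referent format, with the URI in rft_id. The key names below are from Z39.88-2004; the resolver base URL and rfr_id are made-up placeholders, and carrying 'date accessed' in rft_dat (the referent private-data key) is a local convention, not part of the standard:

```python
# Sketch: building a KEV OpenURL for a web resource, described with
# the Dublin Core referent format.
import urllib.parse

# Hypothetical local resolver base URL:
RESOLVER = "http://resolver.example.ac.uk/openurl"

def website_openurl(url, title=None, accessed=None):
    """Build a KEV-format OpenURL for a web resource."""
    pairs = [
        ("url_ver", "Z39.88-2004"),
        ("url_ctx_fmt", "info:ofi/fmt:kev:mtx:ctx"),
        ("rfr_id", "info:sid/example.ac.uk:telstar"),  # assumed referrer id
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:dc"),
        ("rft_id", url),           # the cited URI is the identifier...
        ("rft.identifier", url),   # ...and doubles as the DC identifier
    ]
    if title:
        pairs.append(("rft.title", title))
    if accessed:
        # 'Date accessed' has no standard key; rft_dat (referent
        # private data) carries it here by local convention.
        pairs.append(("rft_dat", accessed))
    return RESOLVER + "?" + urllib.parse.urlencode(pairs)
```

Note this sidesteps the 'this is a website' problem discussed above: nothing in the DC keys says "website" unambiguously, which is why a local rfr_id (so the resolver knows the source) ends up doing most of the work.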
Re: [CODE4LIB] Implementing OpenURL for simple web resources
I agree with this Rosalyn.

The issue that Nate brought up was that the content at http://www.bbc.co.uk could change over time, and old content might be moved to another URI - http://archive.bbc.co.uk or something. So if course A references http://www.bbc.co.uk on 24/08/09, and the content that was on http://www.bbc.co.uk on 24/08/09 moves to http://archive.bbc.co.uk, we can use the mechanism I propose to trap the links to http://www.bbc.co.uk and redirect to http://archive.bbc.co.uk. However, if at a later date course B references http://www.bbc.co.uk, we have no way of knowing whether they mean the stuff that is currently on http://www.bbc.co.uk or the stuff that used to be on http://www.bbc.co.uk and is now on http://archive.bbc.co.uk - and we have a redirect that is being applied across the board.

Thinking about it, references are required to include a date of access when citing websites, so this is probably the best piece of information to use to know where to resolve to (and we can put this in the DC metadata). Whether this will just get too confusing is a good question - I'll have a think about this.

Owen

PS Using the date we could even consider resolving to the Internet Archive copy of a website if one is available - this might be useful.

Owen Stephens
TELSTAR Project Manager, The Open University
E: o.steph...@open.ac.uk

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Rosalyn Metz
Sent: 14 September 2009 21:52
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Implementing OpenURL for simple web resources

oops...just re-read original post s/professor/article

also your link resolver should be creating a context object with each request. this context object is what makes the openurl unique. so if you want uniqueness for stats purposes i would imagine the link resolver is already doing that (and just another reason to use an rfr_id that you create).

On Mon, Sep 14, 2009 at 4:45 PM, Rosalyn Metz rosalynm...@gmail.com wrote:
> Owen, rft_id isn't really meant to be a unique identifier (although it can be in situations like a pmid or doi). are you looking for it to be? if so why? if professor A is pointing to http://www.bbc.co.uk and professor B is pointing to http://www.bbc.co.uk why do they have to have unique OpenURLs.
> Rosalyn

On Mon, Sep 14, 2009 at 4:41 PM, Eric Hellman e...@hellman.net wrote:
> Nate's point is what I was thinking about in this comment in my original reply: "If you don't add DC metadata, which seems like a good idea, you'll definitely want to include something that will help you to persist your replacement record. For example, a label or description for the link." I should also point out a solution that could work for some people but not you - put rewrite rules in the gateways serving your network. A bit dangerous and kludgy, but we've seen kludgier things.

On Sep 14, 2009, at 4:24 PM, O.Stephens wrote:
> Nate has a point here - what if we end up with a commonly used URI pointing at a variety of different things over time, and so being used to indicate different content each time. However, the problem with a 'short URL' solution (tr.im, purl etc), or indeed any locally assigned identifier that acts as a key, is that, as described in the blog post, you need prior knowledge of the short URL/identifier to use it. The only 'identifier' our authors know for a website is its URL - and it seems contrary for us to ask them to use something else. I'll need to think about Nate's point - is this common or an edge case? Is there any other approach we could take?

Eric Hellman
President, Gluejar, Inc.
41 Watchung Plaza, #132
Montclair, NJ 07042 USA
e...@hellman.net
http://go-to-hellman.blogspot.com/
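The trap-and-redirect mechanism, extended with the 'date of access' idea above, amounts to a lookup keyed on the URI plus the date the reference was made. A minimal sketch - the table entries, dates, and archive host are invented for illustration, not real BBC arrangements:

```python
# Sketch: pick a redirect target for a cited URI based on when the
# reference was accessed.
import datetime

# Each entry says where content accessed from that URI during a period
# lives *now*: content accessed from 2000 until the (invented)
# September 2009 reorganisation has moved to archive.bbc.co.uk;
# content accessed after it is still in place.
REDIRECTS = {
    "http://www.bbc.co.uk": [
        (datetime.date(2000, 1, 1), "http://archive.bbc.co.uk"),
        (datetime.date(2009, 9, 1), "http://www.bbc.co.uk"),
    ],
}

def resolve(uri, accessed):
    """Return the URL a reference to `uri`, accessed on date
    `accessed`, should redirect to today."""
    history = REDIRECTS.get(uri)
    if not history:
        return uri  # no mapping known; pass the URI through unchanged
    target = uri
    # Latest entry whose effective date is not after the access date wins.
    for effective_from, location in sorted(history):
        if effective_from <= accessed:
            target = location
    return target
```

With this shape, course A's 24/08/09 reference resolves to the archive location once the move is recorded, while course B's later reference passes through to the current site - the across-the-board redirect problem goes away at the cost of maintaining dated entries.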
Re: [CODE4LIB] Implementing OpenURL for simple web resources
you could force a timestamp if people don't include a date. and i like the idea of going to the Internet Archive copy of a website, because then you're not having to get into the business of handling www.bbc.co.uk differently than cnn.com and someblog.org. i also like the idea of using a redirect.

you could theoretically write a source parser (i'm assuming you're using SFX based on what you said about bX) that says: if my rfr_id = mylocalid and the item is a website (however you choose to identify the website... which, if you're writing your own source parser, you could put "website" in the rft_genre even though it's not technically in the metadata format - you just want your source parser to forward the url on anyway, so the link resolver isn't actually going to do anything with it), bypass everything and just redirect to the internet archive copy of the website.

all of this is of course kind of hackish... but really, isn't the whole thing hackish? there were a few source parsers that would be good models for writing something like this, but i have no idea if they still exist because i haven't looked at the back end of sfx in about a year.

On Tue, Sep 15, 2009 at 5:12 AM, O.Stephens o.steph...@open.ac.uk wrote:
> I agree with this Rosalyn. The issue that Nate brought up was that the content at http://www.bbc.co.uk could change over time, and old content might be moved to another URI - http://archive.bbc.co.uk or something. [...]
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Thanks Rosalyn,

As you say, we could push a custom value into rft_genre. I'm a bit torn on this, as I guess I'm trying to do something that isn't 'hacky' - or at least not from the OpenURL end of it. It might be that this is just wishful thinking, and that I'm trying to fool myself into thinking I'm 'sticking to the standard' when the likelihood of what I'm doing being transferable to other scenarios is zero (although Eric's comments make me hope not).

Yes, we are using SFX. What I'm proposing on the SFX end, as the path of least resistance, is writing a source parser for our learning environment which can do a 'fetch' for an alternative URL, or use the primary URL, and put it in an SFX internal field, rft_856. We can then use the existing target parser 856_URL, which displays the contents of rft_856 in the menu. Combined with some logic which forces this as the only option under certain circumstances, we can then push the user directly to the resulting URL.

Owen

Owen Stephens
TELSTAR Project Manager, The Open University
E: o.steph...@open.ac.uk

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Rosalyn Metz
Sent: 15 September 2009 14:42
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Implementing OpenURL for simple web resources

> you could force a timestamp if people don't include a date. and i like the idea of going to the Internet Archive copy of a website [...]
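SFX source parsers are written in Perl, but the fetch-or-fall-back logic described above is simple enough to sketch language-neutrally (Python here). rft_id and rft_856 are the field names from the message; the lookup table is a hypothetical stand-in for the real 'fetch' against a maintained redirect list:

```python
# Sketch of the proposed source-parser logic: look up an alternative
# URL for the cited URI, fall back to the URI itself, and put the
# result where the 856_URL target parser will find it.
def lookup_alternative(uri):
    """Stand-in for the parser's 'fetch' against a locally maintained
    redirect list (hypothetical data)."""
    alternatives = {"http://www.bbc.co.uk": "http://archive.bbc.co.uk"}
    return alternatives.get(uri)

def enrich_context_object(ctx):
    """Given a parsed ContextObject (a plain dict here), set rft_856
    to the alternative URL if one is known, else to the cited URI
    itself, so the existing 856_URL target parser can send the user
    there."""
    uri = ctx.get("rft_id")
    if uri:
        ctx["rft_856"] = lookup_alternative(uri) or uri
    return ctx
```

The appeal of this shape is that the resolver machinery stays standard: only the source parser knows about the redirect list, and the 856_URL target parser just displays whatever lands in rft_856.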
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Owen,

I might have missed it in this message - my eyes are starting to glaze over at this point in the thread - but can you describe how the input of these resources would work? What I'm basically asking is: what would the professor need to do to add a new citation for a 70 year old book; a journal on PubMed; a URL to CiteSeer? How does their input make it into your database?

-Ross.

On Tue, Sep 15, 2009 at 5:04 AM, O.Stephens o.steph...@open.ac.uk wrote:
> > True. How, from the OpenURL, are you going to know that the rft is meant to represent a website? I guess that was part of my question.
> But no one has suggested defining a new metadata profile for websites (which I probably would avoid tbh). [...]
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Ross - no you didn't miss it. There are 3 ways that references might be added to the learning environment:

1. An author (or, realistically, a proxy on behalf of the author) can insert a reference into a structured Word document from an RIS file. This structured document (XML) then goes through a 'publication' process which pushes the content to the learning environment (Moodle), including rendering the references from RIS format into a specified style, with links.
2. An author/librarian/other can import references to a 'resources' area in our learning environment (Moodle) from an RIS file
3. An author/librarian/other can subscribe to an RSS feed from a RefWorks 'RefShare' folder within the 'resources' area of the learning environment

In general the project is focussing on the use of RefWorks - so although the RIS files could be created by any suitable s/w, we are looking specifically at RefWorks. How you get the reference into RefWorks is something we are looking at currently. The best approach varies depending on the type of material:

- For websites it looks like the 'RefGrab-it' bookmarklet/browser plugin (depending on your browser) is the easiest way of capturing website details.
- For books, probably a Union catalogue search from within RefWorks
- For journal articles, probably a federated search engine (SS 360 is what we've got)

Any of these could be entered by hand of course, as could several other kinds of reference. Entering the references into RefWorks could be done by an author, but it is more likely to be done by a member of clerical staff or a librarian/library assistant.

Owen

Owen Stephens
TELSTAR Project Manager
Library and Learning Resources Centre
The Open University
Walton Hall
Milton Keynes, MK7 6AA
T: +44 (0) 1908 858701
F: +44 (0) 1908 653571
E: o.steph...@open.ac.uk

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Ross Singer
Sent: 15 September 2009 15:56
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Implementing OpenURL for simple web resources

Owen, I might have missed it in this message -- my eyes are starting to glaze over at this point in the thread -- but can you describe how the input of these resources would work? What I'm basically asking is -- what would the professor need to do to add a new: citation for a 70 year old book; journal on PubMed; URL to CiteSeer? How does their input make it into your database?

-Ross.

On Tue, Sep 15, 2009 at 5:04 AM, O.Stephens o.steph...@open.ac.uk wrote:

True. How, from the OpenURL, are you going to know that the rft is meant to represent a website? I guess that was part of my question. But no one has suggested defining a new metadata profile for websites (which I probably would avoid tbh). DC doesn't seem to offer a nice way of doing this (that is, saying 'this is a website'), although there are perhaps some bits and pieces (format, type) that could be used to give some indication (but I suspect not unambiguously)

But I still think what you want is simply a purl server. What makes you think you want OpenURL in the first place?
But I still don't really understand what you're trying to do: deliver consistency of approach across all our references -- so are you using OpenURL for its more conventional use too, but you want to tack on a purl-like functionality to the same software that's doing something more like a conventional link resolver? I don't completely understand your use case.

I wouldn't use OpenURL just to get a persistent URL - I'd almost certainly look at PURL for this. But I want something slightly different. I want our course authors to be able to use whatever URL they know for a resource, but still try to ensure that the link works persistently over time. I don't think it is reasonable for a user to have to know a 'special' URL for a resource - and the PURL approach means establishing a PURL for every resource used in our teaching material whether or not it moves in the future, which is an overhead it would be nice to avoid.
Re: [CODE4LIB] Implementing OpenURL for simple web resources
A suggestion on how to get a prof to enter a URL. I use this bookmarklet to add a URL to Hacker News:

javascript:window.location=%22http://news.ycombinator.com/submitlink?u=%22+encodeURIComponent(document.location)+%22&t=%22+encodeURIComponent(document.title)

I'm tempted to suggest an API based on OpenURL, but I fear the 10 emails it would provoke.

On Sep 15, 2009, at 10:56 AM, Ross Singer wrote:

Owen, I might have missed it in this message -- my eyes are starting to glaze over at this point in the thread -- but can you describe how the input of these resources would work? What I'm basically asking is -- what would the professor need to do to add a new: citation for a 70 year old book; journal on PubMed; URL to CiteSeer? How does their input make it into your database?

-Ross.
Re: [CODE4LIB] Implementing OpenURL for simple web resources
O.Stephens wrote:

True. How, from the OpenURL, are you going to know that the rft is meant to represent a website? I guess that was part of my question. But no one has suggested defining a new metadata profile for websites (which I probably would avoid tbh). DC doesn't seem to offer a nice way of doing this (that is, saying 'this is a website'), although there are perhaps some bits and pieces (format, type) that could be used to give some indication (but I suspect not unambiguously)

Yeah, I don't think there IS any good way to do this. Well, wait, okay, you could use a DC metadata package, and try to convey "web site" in dc.type. For the OpenURL dc.type it is _recommended_ that you use a term from the DCMI Type vocabulary, but that only lets you say something like it's an InteractiveResource or Text or Software. Unless InteractiveResource is sufficient to convey what you need, you could disregard the suggestion (not requirement) that the OpenURL DC metadata schema type element contain a DCMI Type vocabulary term, and just put something else there: "Website". If you want to go this route, probably mint a URI (perhaps using purl.org) so you put an actual URI there, rather than a string literal, to represent "Website".

Now, you've still wound up with something that is somewhat local/custom, that other resolvers are not going to understand. But frankly, I think anything you're going to wind up with is something that you aren't going to be able to trust arbitrary resolvers in the wild to do anything in particular with. Which may not be a requirement for you anyway.

(Which is why I personally find a new OpenURL metadata format to be a complete non-starter. I don't think OpenURL's abstract core actually provides much practical benefit; a new metadata format might as well be an entirely new standard, for the practical benefit you get from it. Other link resolvers that aren't yours are unlikely to ever do anything with your new format, and if they do, whoever implements that is going to have almost as much work to do as if it hadn't been OpenURL at all. If I wanted a really abstract metadata framework to create a new profile/schema on top of, I'd choose DCMI, not OpenURL. DCMI is also so abstract that it doesn't make sense to just say "my app can take DCMI" (just like it doesn't make any sense to say "my app can take OpenURL" -- it's all about the profiles/schemas). But at least DCMI is a lot more flexible, and still has an active body of people working on maintaining and developing and adopting it.)

Jonathan
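A sketch of Jonathan's suggestion: describe the referent using the OpenURL DC metadata format, with a coined URI in the type element standing for "website". The type URI below is entirely hypothetical (the kind of thing one might mint at purl.org), not a registered term.

```python
from urllib.parse import urlencode

# Hypothetical URI coined to mean "website", as Jonathan suggests minting.
# NOT a real registered DCMI Type term.
WEBSITE_TYPE_URI = "http://purl.org/example/types/Website"

def dc_context_object(uri, title):
    """Build KEV parameters describing a website via the DC metadata format."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:dc"),  # the OpenURL DC format
        ("rft.type", WEBSITE_TYPE_URI),              # coined URI, not a DCMI term
        ("rft.title", title),
        ("rft.identifier", uri),
    ]
    return urlencode(params)

kev = dc_context_object("http://www.bbc.co.uk/", "BBC Homepage")
```

As Jonathan says, a resolver in the wild would have no idea what to do with that `rft.type` value; only a resolver configured locally to recognise the coined URI could act on it.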
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Wait, are you really going to try to do this with _SFX_ too? I missed that part. Oh boy. Seriously, I think you are in for a world of painful hacky kludge.

Rosalyn Metz wrote:

Owen, the reason I suggest a source parser rather than a target parser is that handling the OpenURL based on the source rather than the target shaves a bit of time off. Attached is a slide I created (back in the day when it was my job to create such slides... no, I don't sit around in my hole creating slides because I'm bored... although.) that shows the process an OpenURL goes through. So the source parser in this example would come into play before the OpenURL metadata hits the SFX KB. It would bypass the bottom half of the slide completely and reduce any weird formatting that SFX might try to do to the metadata with a value like "website" (if you tell SFX you're looking for an article but you're really looking for a book, it sometimes ignores metadata unrelated to an article even though you might actually need it). If you never let it get to that point, then you don't need to worry about that feature coming into play.

Source parsers aren't used as frequently as they once were, but they used to be a way to retrieve more metadata from databases that didn't create useful OpenURLs (not that many vendors create useful OpenURLs now...). But if you go a hackish route you could use a source parser like a redirect rather than using it to fetch more metadata. If none of this makes sense let me know and I can try to describe it better off list so as not to bore people into oblivion.

Rosalyn

On Tue, Sep 15, 2009 at 9:52 AM, O.Stephens o.steph...@open.ac.uk wrote:

Thanks Rosalyn. As you say, we could push a custom value into rfr_genre. I'm a bit torn on this, as I guess I'm trying to do something that isn't 'hacky' - or at least not from the OpenURL end of it. It might be that this is just wishful thinking, and that I'm just trying to fool myself into thinking I'm 'sticking to the standard' when the likelihood of what I'm doing being transferable to other scenarios is zero (although Eric's comments make me hope not).

Yes, we are using SFX. What I'm proposing on the SFX end, as the path of least resistance, is writing a source parser for our learning environment which can do a 'fetch' for an alternative URL, or use the primary URL, and put it in an SFX internal field, rft_856. We can then use the existing target parser 856_URL, which displays the contents of rft_856 in the menu. Combined with some logic which forces this as the only option under certain circumstances, we can then push the user directly to the resulting URL.

Owen

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Rosalyn Metz
Sent: 15 September 2009 14:42
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Implementing OpenURL for simple web resources

You could force a timestamp if people don't include a date. And I like the idea of going to the Internet Archive copy of a website, because then you're not having to get into the business of handling www.bbc.co.uk differently than cnn.com and someblog.org. I also like the idea of using a redirect.

You could theoretically write a source parser (I'm assuming you're using SFX based on what you said about bX) that says: if my rfr_id = mylocalid and the item is a website (however you choose to identify the website... which, if you're writing your own source parser, you could put "website" in the rft_genre even though it's not technically a metadata format - you just want your source parser to forward the URL on anyway, so the link resolver isn't actually going to do anything with it), bypass everything and just direct to the Internet Archive copy of the website. All of this is of course kind of hackish... but really, isn't the whole thing hackish? There were a few source parsers that would be good models for writing something like this, but I have no idea if they still exist because I haven't looked at the back end of SFX in about a year.

On Tue, Sep 15, 2009 at 5:12 AM, O.Stephens o.steph...@open.ac.uk wrote:

I agree with this, Rosalyn. The issue that Nate brought up was that the content at http://www.bbc.co.uk could change over time, and old content might be moved to another URI - http://archive.bbc.co.uk or something. So if course A references http://www.bbc.co.uk on 24/08/09, and the content that was on http://www.bbc.co.uk on 24/08/09 moves to http://archive.bbc.co.uk, we can use the mechanism I propose to trap the links to http://www.bbc.co.uk and redirect to http://archive.bbc.co.uk. However, if at a later date course B references http://www.bbc.co.uk we have no way of knowing whether they mean the stuff that is currently on http://www.bbc.co.uk
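Rosalyn's "redirect to the Internet Archive, forcing a timestamp if no date was given" idea can be sketched quite compactly. The Wayback Machine serves the snapshot nearest to a `YYYYMMDDhhmmss` timestamp embedded in the URL; the function below just builds that URL (it does not check that a snapshot actually exists).

```python
from datetime import datetime

def wayback_url(url, accessed=None):
    """Build a Wayback Machine URL for `url` as of the access date.

    If no access date is supplied, force a timestamp of "now", as
    Rosalyn suggests.
    """
    when = accessed or datetime.utcnow()
    stamp = when.strftime("%Y%m%d%H%M%S")
    return "http://web.archive.org/web/%s/%s" % (stamp, url)

# Owen's example: course A referenced the BBC homepage on 24/08/09.
link = wayback_url("http://www.bbc.co.uk", datetime(2009, 8, 24))
```

This sidesteps the bbc.co.uk-vs-cnn.com problem Rosalyn mentions: every website gets the same treatment, and the access date (not a per-site rule) picks the version.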
Re: [CODE4LIB] Implementing OpenURL for simple web resources
O.Stephens wrote:

Thanks Rosalyn. As you say, we could push a custom value into rfr_genre. I'm a bit torn on this, as I guess I'm trying to do something that isn't 'hacky' - or at least not from the OpenURL end of it. It might be that this is just wishful thinking, and that I'm just trying to fool myself into thinking I'm 'sticking to the standard' when the likelihood of what I'm doing being transferable to other scenarios is zero (although Eric's comments make me hope not)

Heh, that is my opinion. Everything I've ever tried to do with OpenURL that isn't part of the original 0.1 use case has ended up very hacky, despite my best efforts.
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Do you think? I reckon it is just a few lines of code in a custom source parser... It only needs to:

- Check that rft_id contains an http URI (regexp)
- Define a fetch ID based on this URI (possibly plus date/other metadata)
- Get a URL (or null) from a lookup service
- Insert the URL, or the rft_id value, into rft_856

Simple!

Owen

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Jonathan Rochkind
Sent: 15 September 2009 16:30
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Implementing OpenURL for simple web resources

Wait, are you really going to try to do this with _SFX_ too? I missed that part. Oh boy. Seriously, I think you are in for a world of painful hacky kludge.
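Those four steps can be sketched as follows. This is only an illustration of the logic: a real SFX source parser would be Perl against SFX's internals, and the lookup table here stands in for whatever lookup service the project actually runs.

```python
from urllib.parse import urlparse

# Toy stand-in for the project's URL lookup service -- entirely hypothetical.
LOOKUP = {
    "http://www.bbc.co.uk/": "http://archive.bbc.co.uk/",
}

def resolve_target(rft_id):
    """The four steps: check for an http URI, build a fetch ID,
    consult the lookup service, and fall back to the rft_id itself."""
    if urlparse(rft_id).scheme not in ("http", "https"):
        return None  # not a web resource; leave it to the normal resolver
    fetch_id = rft_id  # could also fold in an access date or other metadata
    return LOOKUP.get(fetch_id) or rft_id  # URL from lookup, or the URI as-is

moved = resolve_target("http://www.bbc.co.uk/")       # known to have moved
stable = resolve_target("http://example.org/page")    # not in the lookup
```

Whatever `resolve_target` returns would be pushed into `rft_856` for the 856_URL target parser to display (or redirect to).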
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Given that the burden of creating these links is entirely on RefWorks/Telstar, OpenURL seems as good a choice as anything (since anything would require some other service, anyway). As long as the profs aren't expected to mess with it, I'm not sure that *how* you do the indirection matters all that much and, as you say, there are added bonuses to keeping it within SFX.

It seems to me, though, that your rft_id should be a URI to the db you're using to store their references, so your CTX would look something like:

http://res.open.ac.uk/?rfr_id=info:/telstar.open.ac.uk&rft_id=http://telstar.open.ac.uk/1234&dc.identifier=http://bbc.co.uk/

# not url encoded because I have, you know, a life.

I can't remember if you can include both metadata-by-reference keys and metadata-by-value, but you could have by-reference (rft_ref=http://telstar.open.ac.uk/1234&rft_ref_fmt=RIS or something) point at your citation db to return a formatted citation. This way your citations are unique -- somebody pointing at today's London Times front page isn't the same as somebody else's on a different day.

While I'm shocked that I agree with using OpenURL for this, it seems as reasonable as any other solution. That being said, unless you can definitely offer some other service besides linking to the resource, I'd avoid the resolver menu completely.

-Ross.
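For completeness, Ross's context object with the URL encoding he skipped ("because I have, you know, a life") might be generated like this. The `/1234` record id is his illustrative value, and the base URL and sid are taken from his sketch, not from a real deployment.

```python
from urllib.parse import urlencode

# Ross's example CTX, properly URL-encoded. All values are illustrative.
params = [
    ("rfr_id", "info:/telstar.open.ac.uk"),          # sid as in Ross's sketch
    ("rft_id", "http://telstar.open.ac.uk/1234"),     # local reference-db URI
    ("dc.identifier", "http://bbc.co.uk/"),           # the actual target URL
]
ctx = "http://res.open.ac.uk/?" + urlencode(params)
```

The point of the design survives the encoding: `rft_id` names a record in the citation db (so the citation is unique to a course and a date), while `dc.identifier` carries the public URL along by value.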
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Oh yeah, one thing I left off -- in Moodle, it would probably make sense to link to the URL in the <a> tag:

<a href="http://bbc.co.uk/">The Beeb!</a>

but use a JavaScript onMouseDown action to rewrite the link to route through your funky link resolver path, a la Google. That way, the page works like any normal webpage - right click > Copy Link Location gives the user the real URL to copy and paste - but normal behavior funnels through the link resolver.

-Ross.
Re: [CODE4LIB] Implementing OpenURL for simple web resources
I'm thinking about it :) Logically I think we can avoid this as we have the context based on the rfr_id (for which we are proposing rfr_id=info:sid/learn.open.ac.uk:[course code], at the risk of more comment!), which seems to me equivalent. I guess it is just a matter of where you do the work, since in SFX we'll end up constructing a 'fetch' to the same location anyway. The amount of work involved to change it one way or the other is probably trivial though. I'm not sure I agree that what I'm proposing puts 'random' URLs in the rft_id, although I do accept that this is a moot point if other resolvers don't do something useful with them (or worse, make incorrect assumptions about them) - perhaps this is something I could survey as part of the project... (although it's all moot if we are only doing this within an internal environment and no-one else ever does it!) Owen Owen Stephens TELSTAR Project Manager Library and Learning Resources Centre The Open University Walton Hall Milton Keynes, MK7 6AA T: +44 (0) 1908 858701 F: +44 (0) 1908 653571 E: o.steph...@open.ac.uk -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Jonathan Rochkind Sent: 15 September 2009 16:52 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Implementing OpenURL for simple web resources I do like Ross's solution, if you really wanna use OpenURL. I'm much more comfortable with the idea of including a URI based on your own local service in rft_id, than including any old public URL in rft_id. Then at least your link resolver can say: if what's in rft_id begins with (e.g.) http://telstar.open.ac.uk/, THEN I know this is one of these purl type things, and I know that sending the user to it will result in a redirect to an end-user-appropriate access URL.
Cause that's my concern with putting random URLs in rft_id: that there's no way to know if they are intended as end-user-appropriate access URLs or not, and in putting things in rft_id that aren't really good identifiers for the referent at all. But using your own local service ID, now you really DO have something that's appropriately considered a persistent identifier for the referent, AND you have a straightforward way to tell when the rft_id of this context is intended as an access URL. Jonathan

Ross Singer wrote: Oh yeah, one thing I left off -- In Moodle, it would probably make sense to link to the URL in the a tag: <a href="http://bbc.co.uk/">The Beeb!</a> but use a javascript onMouseDown action to rewrite the link to route through your funky link resolver path, a la Google. That way, the page works like any normal webpage, right mouse click-Copy Link Location gives the user the real URL to copy and paste, but normal behavior funnels through the link resolver. -Ross.

On Tue, Sep 15, 2009 at 11:41 AM, Ross Singer rossfsin...@gmail.com wrote: Given that the burden of creating these links is entirely on RefWorks Telstar, OpenURL seems as good a choice as anything (since anything would require some other service, anyway). As long as the profs aren't expected to mess with it, I'm not sure that *how* you do the indirection matters all that much and, as you say, there are added bonuses to keeping it within SFX. It seems to me, though, that your rft_id should be a URI to the db you're using to store their references, so your CTX would look something like: http://res.open.ac.uk/?rfr_id=info:/telstar.open.ac.uk&rft_id=http://telstar.open.ac.uk/1234&dc.identifier=http://bbc.uk.co/ # not url encoded because I have, you know, a life.
I can't remember if you can include both metadata-by-reference keys and metadata-by-value, but you could have by-reference (rft_ref=http://telstar.open.ac.uk/1234&rft_ref_fmt=RIS or something) point at your citation db to return a formatted citation. This way your citations are unique -- somebody pointing at today's London Times frontpage isn't the same as somebody else's on a different day. While I'm shocked that I agree with using OpenURL for this, it seems as reasonable as any other solution. That being said, unless you can definitely offer some other service besides linking to the resource, I'd avoid the resolver menu completely. -Ross. On Tue, Sep 15, 2009 at 11:17 AM, O.Stephens o.steph...@open.ac.uk wrote: […]
Re: [CODE4LIB] Implementing OpenURL for simple web resources
I think using locally meaningful ids in rft_id is a misuse and a mistake. Locally meaningful data should go in rft_dat, accompanied by rfr_id. Just sayin'. On Sep 15, 2009, at 11:52 AM, Jonathan Rochkind wrote: […] Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Yes, you can. On Sep 15, 2009, at 11:41 AM, Ross Singer wrote: I can't remember if you can include both metadata-by-reference keys and metadata-by-value, but you could have by-reference (rft_ref=http://telstar.open.ac.uk/1234&rft_ref_fmt=RIS or something) point at your citation db to return a formatted citation. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
On Tue, Sep 15, 2009 at 12:06 PM, Eric Hellman e...@hellman.net wrote: Yes, you can. In this case, I say punt on dc.identifier, throw the URL in rft_id (since, Eric, you had some concern regarding using the local id for this?) and let the real URL persistence/resolution work happen with the by-ref negotiation. -Ross. On Sep 15, 2009, at 11:41 AM, Ross Singer wrote: I can't remember if you can include both metadata-by-reference keys and metadata-by-value, but you could have by-reference (rft_ref=http://telstar.open.ac.uk/1234&rft_ref_fmt=RIS or something) point at your citation db to return a formatted citation. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
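For what it's worth, the combination Ross describes - the cited URL in rft_id plus a by-reference pointer (rft_ref / rft_ref_fmt) at the citation database - could be assembled like this. A sketch only: the hostnames and the course-code style rfr_id come from the thread's examples, and "1234" is a hypothetical citation-db record id.

```python
from urllib.parse import urlencode

# Context object combining by-value and by-reference descriptors,
# per Ross's suggestion in this thread (hostnames are illustrative).
params = {
    "url_ver": "Z39.88-2004",
    "rfr_id": "info:sid/learn.open.ac.uk:AB123",  # hypothetical course code
    "rft_id": "http://www.bbc.co.uk/",            # the cited URL itself
    "rft_ref": "http://telstar.open.ac.uk/1234",  # by-reference: citation db record
    "rft_ref_fmt": "RIS",                         # format the reference resolves to
}
# urlencode handles the percent-encoding that hand-built examples omit.
openurl = "http://res.open.ac.uk/?" + urlencode(params)
print(openurl)
```

The by-reference pair is what makes each citation unique over time: two references to the same live URL, captured on different days, point at different citation-db records.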
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Hi Owen, all: This is a very interesting problem. At Tue, 15 Sep 2009 10:04:09 +0100, O.Stephens wrote: […] If we look at a website it is pretty difficult to reference it without including the URL - it seems to be the only good way of describing what you are actually talking about (how many people think of websites by 'title', 'author' and 'publisher'?). For me, this leads to an immediate confusion between the description of the resource and the route of access to it. So, to differentiate, I'm starting to think of the http URI in a reference like this as a URI, but not necessarily a URL. We then need some mechanism to check, given a URI, what is the URL. […] The problem with the approach (as Nate and Eric mention) is that any approach that relies on the URI as an identifier (whether using OpenURL or a script) is going to have problems, as the same URI could be used to identify different resources over time. I think Eric's suggestion of using additional information to help differentiate is worth looking at, but I suspect that this is going to cause us problems - although I'd say that it is likely to cause us much less work than the alternative, which is allocating every single reference to a web resource used in our course material its own persistent URL. […] I might be misunderstanding you, but I think that you are leaving out the implicit dimension of time here - when was the URL referenced? What can we use to represent the tuple (URL, date), and how do we retrieve an appropriate representation of this tuple? Is the most appropriate representation the most recent version of the page, wherever it may have moved? Or is the most appropriate representation the page as it existed in the past? I would argue that the most appropriate representation would be the page as it existed in the past, not what the page looks like now - but I am biased, because I work in web archiving.
Unfortunately this is a problem that has not been very well addressed by the web architecture people, or the web archiving people. The web architecture people start from the assumption that http://example.org/ is the same resource, which only varies in its representation as a function of time, not in its identity as a resource. The web archiving people create closed systems and do not think about how to store and resolve the tuple (URL, date). I know this doesn't help with your immediate problem, but I think these are important issues. best, Erik Hetzner ;; Erik Hetzner, California Digital Library ;; gnupg key id: 1024D/01DB07E3
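Erik's (URL, date) tuple is easy to represent; the hard part is the resolution service he describes. As a rough illustration of the idea, assuming a Wayback-style archive that accepts /web/&lt;timestamp&gt;/&lt;url&gt; paths (any local web archive exposing the same shape of service would do):

```python
from datetime import date

def archived_url(url, accessed):
    """Resolve a (URL, accessed-date) tuple to a Wayback-style snapshot URL.
    The archive prefix is illustrative, not part of anyone's proposal here."""
    return f"http://web.archive.org/web/{accessed.strftime('%Y%m%d')}/{url}"

# A reference is then the URL plus its 'date accessed', not the URL alone.
reference = ("http://www.bbc.co.uk/", date(2009, 9, 14))
print(archived_url(*reference))
# http://web.archive.org/web/20090914/http://www.bbc.co.uk/
```

The point is only that once the accessed date travels with the URL, a resolver can choose between the live page and the page as it existed when cited.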
Re: [CODE4LIB] Implementing OpenURL for simple web resources
The process by which a URI comes to identify something other than the stuff you get by resolving it can be mysterious - I've blogged about it a bit: http://go-to-hellman.blogspot.com/2009/07/illusion-of-internet-identity.html In the case of WorldCat or Google, it's fame. If you think a URI can be usable outside your institution for identification purposes, and your institution can maintain some sort of identification machinery as long as the OpenURL is expected to be useful, then it's fine to use it in rft_id. If you intend the URI to connote identity only in the context that you're building URLs for, then use rft_dat, which is there for exactly that purpose. On Sep 15, 2009, at 12:17 PM, Jonathan Rochkind wrote: If it's a URI that is indeed an identifier that unambiguously identifies the referent, as the standard says... I don't see how that's inappropriate in rft_id. Isn't that what it's for? I mentioned before that I put things like http://catalog.library.jhu.edu/bib/1234 in my rft_ids. Putting http://somewhere.edu/our-purl-server/1234 in rft_id seems very analogous to me. Both seem appropriate. I'm not sure what makes a URI locally meaningful or not. What makes http://www.worldcat.org/bibID or http://books.google.com/book?id=foo globally meaningful but http://catalog.library.jhu.edu/bib/1234 or http://somewhere.edu/our-purl-server/1234 locally meaningful? If it's a URI that is reasonably persistent and unambiguously identifies the referent, then it's an identifier and is appropriate for rft_id, says me. Jonathan Eric Hellman wrote: […] Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
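Jonathan's heuristic - only treat rft_id as a navigable access URL when it begins with your own service's prefix - takes a few lines to sketch. Hedged: the telstar.open.ac.uk prefix is just the hostname used as an example in this thread, and a real resolver's routing is more involved than a string test.

```python
from urllib.parse import urlparse, parse_qs

# The institution's own purl-style prefix (the thread's example hostname).
LOCAL_PREFIX = "http://telstar.open.ac.uk/"

def classify_rft_id(openurl):
    """Treat rft_id as a navigable access URL only when it points at our
    own service; otherwise treat it as an opaque identifier."""
    query = parse_qs(urlparse(openurl).query)
    rft_id = query.get("rft_id", [""])[0]  # parse_qs percent-decodes values
    return "local-purl" if rft_id.startswith(LOCAL_PREFIX) else "opaque-identifier"

print(classify_rft_id(
    "http://res.open.ac.uk/?url_ver=Z39.88-2004"
    "&rft_id=http%3A%2F%2Ftelstar.open.ac.uk%2F1234"))
# local-purl
```

Anything that fails the prefix test falls back to normal resolver behaviour, which is exactly the "don't assume an http URI resolves to full text" caution raised later in the thread.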
Re: [CODE4LIB] Implementing OpenURL for simple web resources
IIRC you can also elide url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx as that is the default. If you don't add DC metadata, which seems like a good idea, you'll definitely want to include something that will help you to persist your replacement record. For example, a label or description for the link. On Sep 14, 2009, at 9:48 AM, O.Stephens wrote: I'm working on a project called TELSTAR (based at the Open University in the UK) which is looking at the integration of resources into an online learning environment (see http://www.open.ac.uk/telstar for the basic project details). The project focuses on the use of References/Citations as the way in which resources are integrated into the teaching material/environment. We are going to use OpenURL to provide links (where appropriate) from references to full text resources. Clearly for journals, articles, and a number of other formats this is a relatively well understood practice, and implementing this should be relatively straightforward. However, we also want to use OpenURL even where the reference is to a more straightforward web resource - e.g. a web page such as http://www.bbc.co.uk . This is in order to ensure that links provided in the course material are persistent over time. A brief description of what we perceive to be the problem and the way we are tackling it is available on the project blog at http://www.open.ac.uk/blogs/telstar/2009/09/14/managing-link-persistence-with-openurls/ (any comments welcome). What we are considering is the best way to represent a web page (or similar - pdf etc.) in an OpenURL. It looks like we could do something as simple as: http://resolver.address/?url_ver=Z39.88-2004&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx&rft_id=http%3A%2F%2Fwww.bbc.co.uk Is this sufficient (and correct)? Should we consider passing fuller metadata? If the latter should we use the existing KEV DC representation, or should we be looking at defining a new metadata format? Any help would be very welcome.
Thanks, Owen Owen Stephens TELSTAR Project Manager Library and Learning Resources Centre The Open University Walton Hall Milton Keynes, MK7 6AA T: +44 (0) 1908 858701 F: +44 (0) 1908 653571 E: o.steph...@open.ac.ukmailto:o.steph...@open.ac.uk The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England Wales and a charity registered in Scotland (SC 038302). Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
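The minimal context object Owen describes can be generated with all three KEV parameters properly percent-encoded; a sketch, with resolver.address standing in for the real resolver exactly as in his message:

```python
from urllib.parse import urlencode

# Owen's minimal OpenURL: just an identifier for the referent,
# plus the version and context-format keys.
params = {
    "url_ver": "Z39.88-2004",
    "url_ctx_fmt": "info:ofi/fmt:kev:mtx:ctx",
    "rft_id": "http://www.bbc.co.uk",
}
openurl = "http://resolver.address/?" + urlencode(params)
print(openurl)
# http://resolver.address/?url_ver=Z39.88-2004&url_ctx_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Actx&rft_id=http%3A%2F%2Fwww.bbc.co.uk
```

Note that urlencode also escapes the colons and slashes in the url_ctx_fmt value, which hand-written examples in the thread leave bare; resolvers generally accept either, but the encoded form is unambiguous.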
Re: [CODE4LIB] Implementing OpenURL for simple web resources
2009/9/14 O.Stephens o.steph...@open.ac.uk: However, we also want to use OpenURL even where the reference is to a more straightforward web resource - e.g. a web page such as http://www.bbc.co.uk. This is in order to ensure that links provided in the course material are persistent over time. [...] What we are considering is the best way to represent a web page (or similar - pdf etc.) in an OpenURL. It looks like we could do something as simple as: http://resolver.address/?url_ver=Z39.88-2004&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx&rft_id=http%3A%2F%2Fwww.bbc.co.uk How would this link be more persistent than http://www.bbc.co.uk/ ?
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Because we can manipulate how we resolve the OpenURL if we want to and redirect the user to an alternative location if we know the resource has moved from the original URL. OK, the BBC homepage is not likely to move, but many other pages are less stable of course. Owen Owen Stephens TELSTAR Project Manager Library and Learning Resources Centre The Open University Walton Hall Milton Keynes, MK7 6AA T: +44 (0) 1908 858701 F: +44 (0) 1908 653571 E: o.steph...@open.ac.uk -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Mike Taylor Sent: 14 September 2009 15:06 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Implementing OpenURL for simple web resources […]
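Owen's answer - the resolver gives you one place to repair moved links - amounts to a lookup table consulted at resolution time. A minimal sketch, assuming the resolver keeps its own locally maintained table of known moves (the 'oldpage' entry is invented for illustration; nothing here is SFX-specific):

```python
from urllib.parse import urlparse, parse_qs

# Locally maintained map of URLs known to have moved (hypothetical entry).
MOVED = {"http://www.bbc.co.uk/oldpage": "http://www.bbc.co.uk/newpage"}

def resolve(openurl):
    """Redirect target for an OpenURL whose rft_id is a web page: the
    updated location if we know the page moved, else the URL as cited."""
    rft_id = parse_qs(urlparse(openurl).query).get("rft_id", [""])[0]
    return MOVED.get(rft_id, rft_id)

print(resolve("http://resolver.address/?url_ver=Z39.88-2004"
              "&rft_id=http%3A%2F%2Fwww.bbc.co.uk%2Foldpage"))
# http://www.bbc.co.uk/newpage
```

This also makes Erik's later point concrete: the scheme only pays off for the slice of cases where a page moved and the publisher maintains no redirect, and someone has to keep the table current.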
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Could you give us examples of http URLs in rft_id that are like that? I've never seen such. On Sep 14, 2009, at 11:58 AM, Jonathan Rochkind wrote: In general, identifiers in URI form are put in rft_id that are NOT meant for providing to the user as a navigable URL. So the receiving software can't assume that whatever URL is in rft_id represents an actual access point (available to the user) for the document. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
At Mon, 14 Sep 2009 14:48:23 +0100, O.Stephens wrote: […] Here are some things that I would take into consideration, not related to the technical OpenURL question, but I think relevant anyhow. a) What will people do if the service that you provide goes away?
A good thing about the OpenURL that you have above is that even if your resolver no longer works, a savvy user can see that the OpenURL is supposed to point at http://www.bbc.co.uk/. A bad thing about the old URL that you have on your blog: http://routes.open.ac.uk/ixbin/hixclient.exe?_IXDB_=routes_IXSPFX_=gsubmit-button=summary$+with+res_id+is+res9377 is that when that URL stops working - I will bet money it will stop working before www.bbc.co.uk stops working - nobody will know what it meant.

b) How can you ensure that your service will not go away? What is the institutional commitment? If you can't provide a stronger commitment than, e.g., www.bbc.co.uk, is this worth doing?

c) Who will maintain the database that redirects www.bbc.co.uk to www.neobbc.co.uk? (see second part of b above)

d) Is there a simpler solution to this problem than OpenURL?

e) Finally: how many problems will this solve? It seems to me that this is only useful in the case of URL A1 moving to A2 (e.g., following an organization rename) where the organization does not maintain a redirect. In other words, it is not particularly useful in cases where URL A1 goes away completely (in which case there is no unarchived URL to go to) and where a redirect is maintained from A1 to A2 (in which case there is no need to maintain your own redirect). How many instances of this are there? Maybe there are many; www.bbc.co.uk is a bad example, but a journal article online might move around a lot. Hope that is useful! Thanks for reading. best, Erik Hetzner
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Most link resolvers aren't going to know what to do with that -- they aren't going to know that that OpenURL is meant to represent a web page, and that the URL in rft_id should be provided to the user. In general, identifiers in URI form are put in rft_id that are NOT meant for providing to the user as a navigable URL. So the receiving software can't assume that whatever URL is in rft_id represents an actual access point (available to the user) for the document. Sadly, I'm not sure what a better solution is though. OpenURL is very frustrating. Jonathan O.Stephens wrote: […]
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Well, in the 'wild' I barely see any rft_id's at all, heh. Aside from the obvious non-http URIs in rft_id, I'm not sure if I've seen http URIs that don't resolve to full text. BUT -- you can do anything with an http URI that you can do with an info URI. There is no requirement or guarantee in any spec that an http URI will resolve at all, let alone resolve to full text for the document cited in an OpenURL. The OpenURL spec says that rft_id is an Identifier Descriptor that "unambiguously specifies the Entity by means of a Uniform Resource Identifier (URI)". It doesn't say that it needs to resolve to full text. In my own OpenURL link-generating software, I _frequently_ put identifiers which are NOT open access URLs to full text in rft_id. Because there's no other place to put them. And I frequently use http URIs even for things that don't resolve to full text, because the conventional wisdom is to always use http for URIs, whether or not they resolve at all, and there is certainly no requirement that they resolve to something in particular like full text. Examples that I use myself when generating OpenURL rft_ids, of http URIs that do not resolve to full text, include ones identifying bib records in my own catalog: http://catalog.library.jhu.edu/bib/NUM [will resolve to my catalog record, but not to full text!]. Or similarly, WorldCat http URIs. Or an rft_id to unambiguously identify something in terms of its Google Books record: http://books.google.com/books?id=tl8MCAAJ Also, URIs to unambiguously specify a referent in terms of sudoc: http://purl.org/NET/sudoc/[sudoc] will, as the purl is presently set up by rsinger, resolve to a GPO catalog record, but there's no guarantee of online public full text. I'm pretty sure what I'm doing is perfectly appropriate based on the definition of rft_id, but it's definitely incompatible with a receiving link resolver assuming that all rft_id http URIs will resolve to full text for the rft cited.
I don't think it's appropriate to assume that just because a URI is http, that means it will resolve to full text -- it's merely an identifier that unambiguously specifies the referent, same as any other URI scheme. Isn't that what the sem web folks are always insisting in the arguments about how it's okay to use http URIs for any type of identifier at all -- that http is just an identifier (at least in a context where all that's called for is a URI to identify), you can't assume that it resolves to anything in particular? (Although it's nice when it resolves to RDF saying more about the thing identified, it's certainly not expected that it will resolve to full text). Eric, out of curiosity, will your own link resolver software automatically take rft_id's and display them to the user as links? Jonathan Eric Hellman wrote: […]
Re: [CODE4LIB] Implementing OpenURL for simple web resources
As I'm sure you're aware, the OpenURL spec only talks about providing services, and resolving to full text is only one of many possible services. If *all* you know about a referent is the url, then redirecting the user to the url is going to be the best you can do in almost all cases. In particular, I don't think the dublin core profile, which is what Owen suggests to use, has much to say about resolving to full text. http://catalog.library.jhu.edu/bib/NUM identifies a catalog record- I mean what else would you use to id the catalog record. unless you've implemented the http-range 303 redirect recommendation in your catalog (http://www.w3.org/TR/cooluris/), it shouldn't be construed as identifying the thing it describes, except as a private id, and you should use another field for that. IIRC Google, Worldcat, and Wikipedia used rft_id. I'm not in a position to answer any questions about specific link resolver software that I no longer am associated with, however good it is/was. Eric On Sep 14, 2009, at 12:57 PM, Jonathan Rochkind wrote: Well, in the 'wild' I barely see any rft_id's at all, heh. Aside from the obvious non-http URIs in rft_id, I'm not sure if I've seen http URIs that don't resolve to full text. BUT -- you can do anything with an http URI that you can do with an info uri. There is no requirement or guarantee in any spec that an HTTP uri will resolve at all, let alone resolve to full text for the document cited in an OpenURL. The OpenURL spec says that rft_id is An Identifier Descriptor unambiguously specifies the Entity by means of a Uniform Resource Identifier (URI). It doesn't say that it needs to resolve to full text. In my own OpenURL link-generating software, I _frequently_ put identifiers which are NOT open access URLs to full text in rft_id. Because there's no other place to put them. 
And I frequently use http URIs even for things that don't resolve to full text, because the conventional wisdom is to always use http for URIs, whether or not they resolve at all, and certainly no requirement that they resolve to something in particular like full text. Examples that I use myself when generating OpenURL rft_ids, of http URIs that do not resolve to full text include ones identifying bib records in my own catalog: http://catalog.library.jhu.edu/bib/NUM [ Will resolve to my catalog record, but not to full text!] Or similarly, WorldCat http URIs. Or, an rft_id to unambigously identify something in terms of it's Google Books record: http://books.google.com/books?id=tl8MCAAJ Also, URIs to unambiguously specify a referent in terms of sudoc: http://purl.org/NET/sudoc/ [sudoc]= will, as the purl is presently set up by rsinger, resolve to a GPO catalog record, but there's no guarantee of online public full text. I'm pretty sure what I'm doing is perfectly appropriate based on the definition of rft_id, but it's definitely incompatible with a receiving link resolver assuming that all rft_id http URIs will resolve to full text for the rft cited. I don't think it's appropriate to assume that just because a URI is http, that means it will resolve to full text -- it's merely an identifier that unambiguously specifies the referent, same as any other URI scheme. Isn't that what the sem web folks are always insisting in the arguments about how it's okay to use http URIs for any type of identifier at all -- that http is just an identifier (at least in a context where all that's called for is a URI to identify), you can't assume that it resolves to anything in particular? (Although it's nice when it resolves to RDF saying more about the thing identified, it's certainly not expected that it will resolve to full text). Eric, out of curiosity, will your own link resolver software automatically take rft_id's and display them to the user as links? 
Jonathan Eric Hellman wrote: Could you give us examples of http URLs in rft_id that are like that? I've never seen such. On Sep 14, 2009, at 11:58 AM, Jonathan Rochkind wrote: In general, identifiers in URI form are put in rft_id that are NOT meant for providing to the user as a navigable URL. So the receiving software can't assume that whatever URL is in rft_id represents an actual access point (available to the user) for the document. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
ok no one shoot me for doing this: in section 9.1 Namespaces [Registry] of the OpenURL standard (z39.88) it actually provides an example of using a URL in the rfr_id field, and i wonder why you couldn't just do the same thing for the rft_id also there is a field called rft_val which currently has no use. this might be a good one for it. just my 2 cents. 
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Eric Hellman wrote: http://catalog.library.jhu.edu/bib/NUM identifies a catalog record- I mean what else would you use to id the catalog record. unless you've implemented the http-range 303 redirect recommendation in your catalog (http://www.w3.org/TR/cooluris/), it shouldn't be construed as identifying the thing it describes, except as a private id, and you should use another field for that. Of course. But how is a link resolver supposed to know that, when all it has is rft_id=http://catalog.library.jhu.edu/bib/NUM ? I suggest that this is a kind of ambiguity in OpenURL: many of us are using rft_id, in some contexts, simply to provide an unambiguous identifier, and in other cases to provide an end-user access URL (which may not be a good unambiguous identifier at all!), with no way for the link resolver to tell which was intended. So I don't think it's a good idea to do this. I think the community should choose one, and based on the language of the OpenURL spec, rft_id is meant to be an unambiguous identifier, not an end-user access URL. So ideally another way would be provided to send something intended as an end-user access URL in an OpenURL. But OpenURL is pretty much a dead spec that is never going to be developed further in any practical way. So, really, I recommend avoiding OpenURL in favor of non-library-specific web standards whenever you can. But sometimes you can't, and OpenURL really is the best tool for the job. I use it all the time. And it constantly frustrates me with its lack of flexibility and clarity, leading to people using it in ambiguous ways. Jonathan
Re: [CODE4LIB] Implementing OpenURL for simple web resources
On Mon, Sep 14, 2009 at 8:48 AM, O.Stephens o.steph...@open.ac.uk wrote: What we are considering is the best way to represent a web page (or similar - pdf etc.) in an OpenURL. It looks like we could do something as simple as: http://resolver.address/?url_ver=Z39.88-2004&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx&rft_id=http%3A%2F%2Fwww.bbc.co.uk Wait, are you using the target URL as a uniquely identifying key? URLs can change not only location, but meaning. Consider: What if www.bbc.co.uk changes address, and www.bbc.co.uk becomes something else you want to link to? Then you'd have rft_id=http://www.bbc.co.uk really refer to http://new.example.co.uk, and you'd need to use a dummy rft_id to refer to whatever new content lives at http://www.bbc.co.uk. Unless I'm totally confused as to what you're trying to do, this seems like a bad idea. An artificial key is a better choice. If you want something like tr.im, use something like tr.im. Cheers, -Nate
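For concreteness, Owen's kind of OpenURL can be generated mechanically. Here is a minimal Python sketch; the resolver address is the placeholder from his example and `make_openurl` is an invented helper name, while `urlencode` takes care of the percent-encoding that turns `http://www.bbc.co.uk` into `http%3A%2F%2Fwww.bbc.co.uk`:

```python
from urllib.parse import urlencode

def make_openurl(resolver_base, target_url):
    # Build a KEV-format OpenURL whose referent identifier is a plain URL.
    # KEV values must be percent-encoded; urlencode handles that.
    params = [
        ("url_ver", "Z39.88-2004"),
        ("url_ctx_fmt", "info:ofi/fmt:kev:mtx:ctx"),
        ("rft_id", target_url),
    ]
    return resolver_base + "?" + urlencode(params)

print(make_openurl("http://resolver.address/", "http://www.bbc.co.uk"))
```

Whether a given resolver does anything sensible with such a context object is, of course, exactly the question being debated in this thread.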
Re: [CODE4LIB] Implementing OpenURL for simple web resources
It's not correct to say that rft_val has no use; when used, it should contain a URL-encoded package of XML or KEV metadata. It would be correct to say it is very rarely used. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Huh, I can't even FIND a section 9.1 in the z39.88 standard. Are we looking at the same z39.88 standard? Mine only goes up to chapter 4. Oh wait, there it is in Chapter 3, section 9.1, okay. While that example contains an http URI, I would say it's intended as an unambiguous identifier URI that happens to use an http scheme, not an end-user access URL. Although the weird thing is, in every other context the docs use an info:sid URI for rfr_id, to the extent that I thought you were REQUIRED to use an info:sid in rfr_id; I didn't even know you could use an http URI as that example does, weird. For instance, while Chapter 3 Section 9.1 uses that example of rfr_id=http://www.sciencedirect.com, over on page 14 in Chapter 1, they use this example for the same entity: rfr_id=info:sid/elsevier.com:ScienceDirect It certainly doesn't surprise me anymore when the z39.88 standard contains ambiguity or confusing/conflicting examples. I wonder if there's more on this that is conflicting or confusing in the scholarly format application profiles, or in the KEV implementation guidelines. Probably. Yep, that's where I got the rfr_id=sid idea from! The KEV implementation guidelines say: Referrer Identifiers are defined in the source identifier Namespace `info:ofi/nam:info:sid:'. They are identified using the `info:sid/' scheme for the identification of collections. It is unclear how the KEV Implementation Guidelines justify saying that an rfr_id is always info:sid, when the actual z39.88 actually uses an http rfr_id example. Who knows which one was the mistake. Seriously, don't use OpenURL unless you really can't find anything else that will do, or you actually want your OpenURLs to be used by the existing 'in the wild' OpenURL resolvers. In the latter case, don't count on them doing anything in particular or consistent with 'novel' OpenURLs, like ones that put an end-user access URL in rft_id... 
don't expect actually existing in-the-wild OpenURL resolvers to do anything in particular with that. Jonathan 
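Since the standard's own examples use both an http URI and an info:sid URI for the very same referrer, receiving software that cares about rfr_id pretty much has to tolerate both. A hypothetical normalization sketch in Python (the function name and the idea of reducing both forms to a bare source name are illustrative, not anything the spec prescribes):

```python
from urllib.parse import urlparse

def normalize_rfr_id(rfr_id):
    # Reduce either rfr_id form seen in Z39.88's own examples to a source name.
    if rfr_id.startswith("info:sid/"):
        # "info:sid/elsevier.com:ScienceDirect" -> "elsevier.com:ScienceDirect"
        return rfr_id[len("info:sid/"):]
    if rfr_id.startswith(("http://", "https://")):
        # "http://www.sciencedirect.com" -> "www.sciencedirect.com"
        return urlparse(rfr_id).netloc
    return rfr_id  # pass any other URI scheme through untouched
```

Whether the two forms really name "the same" referrer is a judgment call the resolver operator has to make; this just shows that accepting both is cheap.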
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Yes, what Nate said is what I'm trying to say. Nate, they aren't _trying_ to use the target URL as a uniquely identifying key. They're _trying_ to use it as... a target URL! They just can't find anywhere but rft_id to stick a target URL. But the problem with that is exactly what Nate demonstrated: based on existing use and the spec and best practices with URIs, many will assume that the rft_id IS intended to be an unambiguous identifier, not a target URL. And there's no way to tell the difference between an rft_id intended to be the former (as many of us do) and the latter. Another way is needed to actually supply a target URL in an OpenURL. You can do it in rft_dat, I guess, in a custom way -- depending on your context and use case. Sometimes that doesn't work (like when you might want to add your data to an already existing vendor-specific rft_dat). Or, you can do what I increasingly do, and just add your own non-OpenURL query parameter to a KEV (or your own namespaced XML to an XML, even easier): my.url=http://whatever.com or whatever. You can't count on any existing link resolvers recognizing it, but they should safely ignore it, and you can't count on any existing link resolver doing what you want with an rft_id _anyway_. It remains shocking to me that even DCTerms doesn't supply any way to provide an end-user access URL as distinct from an identifier. They are not always the same thing; much as the semweb crowd would _like_ people to always make them the same thing, we aren't there yet. Jonathan 
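Jonathan's private-parameter approach can be sketched concretely. Here `my.url` is his illustrative (non-standard) key, and the rft_id value is just an example identifier; a resolver that knows about the extension can read the access URL back with `parse_qs`, while others should simply skip the unrecognized key:

```python
from urllib.parse import urlencode, parse_qs

# Build a normal KEV OpenURL, then carry the end-user access URL in a
# private, non-OpenURL parameter ("my.url" is an illustrative name only).
params = [
    ("url_ver", "Z39.88-2004"),
    ("rft_id", "info:oclcnum/12345"),   # unambiguous identifier (example value)
    ("my.url", "http://whatever.com"),  # private end-user access URL
]
openurl = "http://resolver.address/?" + urlencode(params)

# A cooperating resolver picks the private parameter back out; resolvers
# that don't know it should just ignore the unknown key.
query = parse_qs(openurl.split("?", 1)[1])
access_url = query.get("my.url", [None])[0]
print(access_url)
```

The design point is exactly the one made above: the identifier and the access URL travel in separate slots, so neither has to do double duty.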
Re: [CODE4LIB] Implementing OpenURL for simple web resources
If you have a URL that can be used for a resource that you are describing in metadata, resolvers can do a better job providing services to users if it is put in the OpenURL. The only place to put it is rft_id. So let's not let one resolver's incapacity prevent other resolvers from providing better services. If you want to make an OpenURL for a web page, its URL is in almost all cases the best unambiguous identifier you could possibly think of. Putting dead http URIs in rft_id is not really a very useful thing to do. 
Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
sorry eric, i was reading straight from the documentation and according to it, it has no use. 
Re: [CODE4LIB] Implementing OpenURL for simple web resources
What the spec for z39.88 says is that rfr_id (and all the other _id's) must be URIs. The info:sid namespace was defined to allow minting of identifiers for the specific purpose of identifying referrers. The info URI scheme was defined to allow non-resolving identifiers to have a place to live within URI-land. Documents written by standards committees are often not as clear as they should be, but it's hard to get consensus across an industry without getting a committee together. Social process is so much harder than technology. 
Re: [CODE4LIB] Implementing OpenURL for simple web resources
I disagree. Putting in URIs that unambiguously identify the referent, and in some cases provide additional 'hooks' by virtue of additional identifiers (local bibID, OCLCnum, LCCN, etc.), is a VERY useful thing to do, to me -- whether or not they resolve to an end-user-appropriate web page. If you want to use rft_id to instead be an end-user-appropriate access URL (which may or may not be a suitable unambiguous persistent identifier), I guess it depends on how many of the actually existing in-the-wild link resolvers will, in what contexts, treat an http URI as an end-user-appropriate access URL. If a lot of the in-the-wild link resolvers will, that may be a practically useful thing to do. Thus me asking if the one you had knowledge of did or didn't. I'm 99% sure that SFX will not, in any context, treat an rft_id as an appropriate end-user access URL. Certainly providing an appropriate end-user access URL _is_ a useful thing to do. So is providing an unambiguous persistent identifier. Both are quite useful things to do; they're just different things. Shame that OpenURL kinda implies that you can use the same data element for both. OpenURL's not alone there, though; DC does the same thing. Jonathan 
Re: [CODE4LIB] Implementing OpenURL for simple web resources
i'd like to point out that perhaps the reason that SFX (and other link resolvers) don't use the rft_id in a particular way is because no one has pushed it to. for example, it is possible for you to have the word dinosaur link to an openurl and provide services for dinosaurs, but the question is: 1) who would provide a link in their articles on webpages to an openurl about dinosaurs, and 2) do users really care. if i were in Owen's position i might create an openurl that looked like this: http://resolver.address/?url_ver=Z39.88-2004&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx&rft_id=mylocalid&rft_dat=<url>http://www.bbc.co.uk</url> where mylocalid is an id i created to give these special openurls and where <url> is the tag that i made up to help my resolver identify the data i'm sending privately. i could then write a program so my link resolver knows that the information contained in the private data field (rft_dat) and identified by <url> should direct you to that url. you might also need to make up some other tags (like <pdf> if all your pdfs are in one spot). 
Re: [CODE4LIB] Implementing OpenURL for simple web resources
whoops, that should be rfr_id not rft_id. http://resolver.address/?url_ver=Z39.88-2004&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx&rfr_id=mylocalid&rft_dat=<url>http://www.bbc.co.uk</url> On Mon, Sep 14, 2009 at 2:39 PM, Rosalyn Metz rosalynm...@gmail.com wrote: i'd like to point out that perhaps the reason that SFX (and other link resolvers) don't use the rft_id in a particular way is because no one has pushed it to. for example, it is possible for you to have the word dinosaur link to an openurl and provide services for dinosaurs, but the question is: 1) who would provide a link in their articles or webpages to an openurl about dinosaurs? 2) do users really care? if i were in Owen's position i might create an openurl that looked like this: http://resolver.address/?url_ver=Z39.88-2004&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx&rft_id=mylocalid&rft_dat=<url>http://www.bbc.co.uk</url> where mylocalid is an id i created to give these special openurls and where <url> is the tag that i made up to help my resolver identify the data i'm sending privately. i could then write a program so my link resolver knows that the information contained in the private data field (rft_dat) and identified by <url> should direct you to that url. you might also need to make up some other tags (like <pdf> if all your pdfs are in one spot). On Mon, Sep 14, 2009 at 2:23 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I disagree. Putting URIs that unambiguously identify the referent, and in some cases provide additional 'hooks' by virtue of additional identifiers (local bibID, OCLCnum, LCCN, etc) is a VERY useful thing to do to me. Whether or not they resolve to an end-user appropriate web page or not. If you want to use rft_id to instead be an end-user appropriate access URL (which may or may not be a suitable unambiguous persistent identifier), I guess it depends on how many of the actually existing in-the-wild link resolvers will, in what contexts, treat an http URI as an end-user appropriate access URL. 
If a lot of the in-the-wild link resolvers will, that may be a practically useful thing to do. Thus me asking if the one you had knowledge of did or didn't. I'm 99% sure that SFX will not, in any context, treat an rft_id as an appropriate end-user access URL.
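To make Rosalyn's suggestion concrete, here is a minimal sketch of building such an OpenURL in Python. The resolver address, the "mylocalid" referrer id, and the <url> wrapper are the invented placeholders from her message, not a real service or a registered tag:

```python
# Sketch of the OpenURL Rosalyn describes: the target URL rides in the
# private-data field (rft_dat), wrapped in a made-up <url> tag that only
# the local resolver knows how to interpret.
from urllib.parse import urlencode

def build_openurl(resolver_base, target_url, rfr_id="mylocalid"):
    """Return a KEV OpenURL with the target wrapped inside rft_dat."""
    params = {
        "url_ver": "Z39.88-2004",
        "url_ctx_fmt": "info:ofi/fmt:kev:mtx:ctx",
        "rfr_id": rfr_id,                         # locally assigned referrer id
        "rft_dat": "<url>%s</url>" % target_url,  # private data, home-grown tag
    }
    return resolver_base + "?" + urlencode(params)

print(build_openurl("http://resolver.address/", "http://www.bbc.co.uk"))
```

Note that urlencode percent-encodes the angle brackets and slashes, which is what you would want on the wire; the receiving resolver has to decode and unwrap them again.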
Re: [CODE4LIB] Implementing OpenURL for simple web resources
You're absolutely correct, in fact, all the ent_val fields are reserved for future use! They went in and out of the spec. I'm trying to remember from my notes. It's better that they're out. On Sep 14, 2009, at 2:05 PM, Rosalyn Metz wrote: sorry eric, i was reading straight from the documentation and according to it it has no use. On Mon, Sep 14, 2009 at 1:55 PM, Eric Hellman e...@hellman.net wrote: It's not correct to say that rft_val has no use; when used, it should contain a URL-encoded package of xml or kev metadata. it would be correct to say it is very rarely used. On Sep 14, 2009, at 1:40 PM, Rosalyn Metz wrote: ok no one shoot me for doing this: in section 9.1 Namespaces [Registry] of the OpenURL standard (z39.88) it actually provides an example of using a URL in the rfr_id field, and i wonder why you couldn't just do the same thing for the rft_id also there is a field called rft_val which currently has no use. this might be a good one for it. just my 2 cents. Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
Re: [CODE4LIB] Implementing OpenURL for simple web resources
2009/9/14 Jonathan Rochkind rochk...@jhu.edu: Seriously, don't use OpenURL unless you really can't find anything else that will do, or you actually want your OpenURLs to be used by the existing 'in the wild' OpenURL resolvers. In the latter case, don't count on them doing anything in particular or consistent with 'novel' OpenURLs, like ones that put an end-user access URL in rft_id... don't expect actually existing in the wild OpenURLs to do anything in particular with that. Jonathan, I am getting seriously mixed messages from you on this thread. In one message, you'll strongly insist that some facility in OpenURL is or isn't useful; in the next, you'll be saying that the whole standard is dead. The last time I was paying serious attention to OpenURL, that certainly wasn't true -- has something happened in the last few months to make it so?
Re: [CODE4LIB] Implementing OpenURL for simple web resources
It's not dead in the sense that it is not in use -- it is in wide use. It is dead in the sense that, in my opinion, it is not going to evolve or change. It is highly unlikely that the majority of 'in the wild' OpenURL link resolvers or generators (referrers) are going to do anything with OpenURL other than what they do with it now -- which is basically using OpenURL like the pre-standardized 0.1 version, but in some (but not all) cases with the actual syntax updated to standardized 1.0. So it's in use, in a very basic way, not taking advantage of the 'sophisticated' features that OpenURL 1.0 tried to add. I say 'tried' because not only did they not catch on, but they are done in a very confusing and not entirely consistent way. Most implementers of OpenURL generators or link resolvers do not understand the sophisticated aspects of OpenURL 1.0; they're just plugging bits into query parameters from templates, and often doing so in a non-standards-compliant, illegal way that ends up working anyway because the whole infrastructure has been built assuming certain things that are not in fact specified will be done or not done. I think ideas to add extensions OR new metadata formats (which OpenURL 1.0 makes possible) are a lost cause. They are never going to catch on, and even trying to make them catch on is often going to mess things up for the established infrastructure that assumes they won't be done (like, depending on who you talk to in this thread, assuming that rft_id is always suitable as an end-user access URL; or assuming it never is. heh.). And the only reason to use OpenURL instead of something a LOT less confusing is its current wide adoption. So, on new features, I throw up my hands. If you sense a mixed message or ambivalence from me, it's because I am ambivalent. I have spent a lot of time with OpenURL in working on a significant project or two. I know it fairly well, and have used it and will continue to use it to do all sorts of stuff. 
And I have come to really dislike it, and wish it were different than it is. It's got the wrong kinds of abstraction and lacks the right kinds of flexibility. Nonetheless, I think we're stuck with it for _certain_ things, so I am concerned that people trying to move forward not increase its flaws yet further; that's important to me. So, yeah, I care about how people are using it even though I don't like it, because I use it too and need to inter-operate with all sorts of stuff. But for a given project that did not necessarily need to inter-operate with the existing link resolver infrastructure, if I could find a metadata standard other than OpenURL to use, especially one that's not just library-world and (unlike OpenURL IMO) makes easy things simple and provides flexibility to do more complicated things more complicatedly... I'd never consider using OpenURL. Jonathan
Re: [CODE4LIB] Implementing OpenURL for simple web resources
On Mon, 14 Sep 2009, Mike Taylor wrote: 2009/9/14 Jonathan Rochkind rochk...@jhu.edu: Seriously, don't use OpenURL unless you really can't find anything else that will do, or you actually want your OpenURLs to be used by the existing 'in the wild' OpenURL resolvers. In the latter case, don't count on them doing anything in particular or consistent with 'novel' OpenURLs, like ones that put an end-user access URL in rft_id... don't expect actually existing in the wild OpenURLs to do anything in particular with that. Jonathan, I am getting seriously mixed messages from you on this thread. In one message, you'll strongly insist that some facility in OpenURL is or isn't useful; in the next, you'll be saying that the whole standard is dead. The last time I was paying serious attention to OpenURL, that certainly wasn't true -- has something happened in the last few months to make it so? My interpretation of the part of Jonathan's response that you quoted was basically: don't use OpenURL when you're just looking for persistent URLs. The whole point of OpenURL was that the local resolver could determine the best way to get you the resource (e.g., digital library vs. ILL vs. giving you a specific room/shelf). If you're using OpenURLs for the reason of having it work with the established network of resolvers, don't get cute w/ encoding the information, as you can't rely on it to work. ... From what I've seen of the thread (and I admit, I didn't read every message), what's needed here is PURL, not OpenURL. -Joe
Re: [CODE4LIB] Implementing OpenURL for simple web resources
I can't imagine that SFX has some fundamental assumption that an http URL in rft_id is never ever something that can be used for access, and even if it did, it would be letting the tail wag the dog to suggest that other resolvers should not do so; some do. There are also resolvers that pre-check urls, at least there were before exlibris acquired linkfinderplus. So it's possible for a resolver agent to discover whether a url leads somewhere or not.
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Yep, that's a pretty good summary of my personal advice Joe, thanks. Obviously others like Eric may have other opinions, that's just mine.
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Of course it's possible. But if you're counting on all (or a majority) of actually in-the-wild resolvers to do a certain thing, you should probably do some investigation to see if that assumption is true. Some probably do, some probably don't. If the usefulness of your technique depends on most doing so, then you should check it. And if the usefulness of your technique does NOT count on being inter-operable with existing link resolver infrastructure... PERSONALLY I would not be using OpenURL; I don't think it's worth it. Incidentally, I have been unhappy with the pre-check an http URL before giving it to the user technique. First of all, WAY too much stuff on the web still returns an HTTP 200 OK response message for what are in fact error pages (including MOST of our licensed scholarly content providers). Secondly, even if it's not an error page, there's no way for pre-checking to tell you WHAT the page has, what it represents and how useful it might be to the user. I don't think it serves the user well to say Here's a URL that'll give you something related to this citation, click on it to find out what. I want to tell them if it's going to be full text, or just a description, or what. As well as have the software be able to display it differently depending. I think that's important to making the service useful. Jonathan
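For reference, the pre-check technique being criticized here amounts to something like the following sketch; the helper names are illustrative, not any vendor's actual implementation. The status-only check is exactly why it falls short: a "soft 404" served with 200 OK passes, and no status code tells you whether the page is full text or just a description.

```python
# Sketch of a resolver-side URL pre-check. The fundamental limit Jonathan
# points out is visible here: all we ever learn is a status code.
import urllib.request

def looks_alive(status):
    """Status-only liveness: any 2xx/3xx counts as alive. Error pages
    served with 200 OK ('soft 404s') pass this test, which is the flaw."""
    return 200 <= status < 400

def precheck(url, timeout=5):
    """HEAD the URL; any network failure counts as dead."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return looks_alive(resp.status)
    except OSError:
        return False
```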
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Thanks for all the responses so far. I'm not at my desk right now, but some quick responses... I'm not suggesting that we should necessarily assume that an http URI in the rft_id would resolve to the resource - but that this is a possibility (if we want, we can restrict behaviour to rfr contexts where we know this is sensible behaviour). I guess this is part of the reason that I'm asking the question: if we also know that the rft was a website, it would be very odd behaviour (I think) to use an http URI in the rft_id that wasn't the website URL? I certainly wouldn't assume that even given an http URI in the rft_id that was a URL for the resource, it would be the best place to resolve for our users (similar to the situation of having a DOI supplied in rft_id - this may help, but locally access may be via an aggregator etc.) What I'm actually suggesting (I think) is that we strictly treat an http URI in the rft_id field as a URI, initially ignoring the fact that it may be a URL. We would (as Nate said) use the URI as a key to identify the destination URL - although in some cases that could end up being the original URI (i.e. if the lookup service had nothing to say about it). Nate has a point here - what if we end up with a commonly used URI pointing at a variety of different things over time, and so it is used to indicate different content each time? However the problem with a 'short URL' solution (tr.im, purl etc), or indeed any locally assigned identifier that acts as a key, is that as described in the blog post you need prior knowledge of the short URL/identifier to use it. The only 'identifier' our authors know for a website is its URL - and it seems contrary for us to ask them to use something else. I'll need to think about Nate's point - is this common or an edge case? Is there any other approach we could take? 
I have some sympathy with the point that Jonathan makes about avoiding the use of 'library specific' standards, but it feels like this is one of those situations where it is appropriate - though if there are suggestions of more general methods that achieve the same outcomes I'd be interested. As an aside, one other advantage of using OpenURL in this context is that it delivers consistency of approach across all our references - but perhaps this isn't a good enough reason to adopt it if there are good alternatives. Owen
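The approach Owen describes (treat the http URI in rft_id purely as a key, consult a lookup service, and fall back to the URI itself when the service has nothing to say) could be sketched like this; the redirect table entries are invented examples, not real OU data:

```python
# Sketch: the referent URI is a lookup key first, and a URL only as a
# fallback. A fuller service would presumably also consult the 'date
# accessed' from the reference, which is ignored here.
redirects = {
    # invented example: a page that has moved since the course was written
    "http://www.example.org/old-page": "http://www.example.org/new-home",
}

def resolve(rft_id):
    """Return the destination URL for a referent URI; if the lookup
    table has nothing to say, the URI itself is the destination."""
    return redirects.get(rft_id, rft_id)
```

This also shows Nate's worry in miniature: nothing stops one key from being remapped to entirely different content over time.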
Re: [CODE4LIB] Implementing OpenURL for simple web resources
O.Stephens wrote: I guess it is part of the reason that I'm asking the question, as if we also know that the rft was a website, it would be very odd behaviour (I think) to use an http URI in the rft_id that wasn't the website URL? True. How, from the OpenURL, are you going to know that the rft is meant to represent a website? But I still think what you want is simply a purl server. What makes you think you want OpenURL in the first place? But I still don't really understand what you're trying to do: deliver consistency of approach across all our references -- so are you using OpenURL for its more conventional use too, but you want to tack on purl-like functionality to the same software that's doing something more like a conventional link resolver? I don't completely understand your use case. But you seem to understand what's up, so you can decide what will work best!
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Nate's point is what I was thinking about in this comment in my original reply: If you don't add DC metadata, which seems like a good idea, you'll definitely want to include something that will help you to persist your replacement record. For example, a label or description for the link. I should also point out a solution that could work for some people but not you - put rewrite rules in the gateways serving your network. A bit dangerous and kludgy, but we've seen kludgier things.
Re: [CODE4LIB] Implementing OpenURL for simple web resources
Owen, rft_id isn't really meant to be a unique identifier (although it can be in situations like a pmid or doi). are you looking for it to be? if so why? if professor A is pointing to http://www.bbc.co.uk and professor B is pointing to http://www.bbc.co.uk why do they have to have unique OpenURLs? Rosalyn
Re: [CODE4LIB] Implementing OpenURL for simple web resources
On Mon, 14 Sep 2009, Eric Hellman wrote: The original question was whether it's a good idea to use OpenURL for a URL persistence application. Issues with using PURL are that 1. it only works with one website - PURL paths don't travel, though there have been proposals to get around this. 2. There's not a really good way to package metadata with the PURL reference. If there was some standard other than OpenURL that would do the trick, then we'd probably not be looking at OpenURL - there I agree with Jonathan. Not a standard per se, but issue #1 can be handled by some sort of high-availability system -- it can even be as simple as a few organizations federating together to resolve each other's items, and then setting up a load balancer or the sloppy route of DNS round-robin. For issue #2 ... there's always the issue of linking straight to the object vs. presenting the metadata. If the goal is to track metadata and not just link straight to the object, you might be able to use purls to link to OAI-ORE [1] documents. (although, I admit, I've never actually used ORE myself) You can't maintain a tr.im url or a bit.ly url, by the way. Those services can't support post-creation modification of the target url, because if they did, they'd get killed by spam. Agreed. [1] http://www.openarchives.org/ore/ -Joe
Re: [CODE4LIB] Implementing OpenURL for simple web resources
oops... just re-read original post: s/professor/article. also, your link resolver should be creating a context object with each request. this context object is what makes the openurl unique. so if you want uniqueness for stats purposes i would imagine the link resolver is already doing that (and just another reason to use an rfr_id that you create).
Re: [CODE4LIB] Implementing OpenURL for simple web resources
I still think that rft_id IS meant to be a unique identifier! Again, 5.2.1 of z3988 says of the *_id fields: An Identifier Descriptor unambiguously specifies the Entity by means of a Uniform Resource Identifier (URI). I guess 'unambiguous' isn't exactly the same thing as 'unique', depending on what you mean by 'unique', okay. But it's pretty clear to me that rft_id is meant to be an identifier for the referent, meaning the same rft_id should not correspond to two different referents. (But the same referent can certainly have multiple rft_ids; although it's inconvenient when it has multiple ones within the same scheme/system, it happens and is not disastrous.)
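On the receiving side, a resolver that wants to honor this reading of 5.2.1 first has to pull the rft_id values out of the KEV OpenURL. A minimal sketch with the Python standard library (the example OpenURL is invented):

```python
# Sketch: extract referent identifiers from a 1.0 KEV OpenURL. Per section
# 5.2.1 each value is a URI meant to unambiguously identify the referent;
# rft_id is repeatable, so we return a list.
from urllib.parse import urlparse, parse_qs

def referent_ids(openurl):
    """Return all rft_id values found in the OpenURL's query string."""
    query = parse_qs(urlparse(openurl).query)
    return query.get("rft_id", [])

example = ("http://resolver.example.edu/?url_ver=Z39.88-2004"
           "&rft_id=http%3A%2F%2Fwww.bbc.co.uk%2F")
print(referent_ids(example))  # ['http://www.bbc.co.uk/']
```

What the code cannot do, of course, is the thing debated in this thread: tell from the OpenURL alone whether that URI was meant as an identifier or as an end-user access URL.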