Re: [CODE4LIB] Assigning DOI for local content
For example, if you don't want to rely on dx.doi.org as your gateway to the handle system for DOI resolution, it would be quite easy for me to deploy my own gateway at dx.hellman.net. I might want to do this if I were an organization paranoid about security and didn't want to disclose to anybody what DOIs my organization was resolving. Or, I might want to directly access metadata in the handle system without going through the HTTP gateways, to provide a service other than resolution. Does this answer your question, Ross?

On Nov 20, 2009, at 2:31 PM, Ross Singer wrote: On Fri, Nov 20, 2009 at 2:23 PM, Eric Hellman e...@hellman.net wrote: Having incorporated the handle client software into my own stuff rather easily, I'm pretty sure that's not true. Fair enough. The technology is binding independent. So you are using and sharing handles using some protocol other than HTTP? I'm more interested in the sharing part of that question. What is the format of the handle identifier in this context? What advantage does it bring over HTTP? -Ross.

Eric Hellman President, Gluejar, Inc. 41 Watchung Plaza, #132 Montclair, NJ 07042 USA e...@hellman.net http://go-to-hellman.blogspot.com/
[CODE4LIB] calling another webpage within CGI script
Hi all, I'm moving to a new web server and struggling to get it configured properly. The problem of the moment: having a Perl CGI script call another web page in the background and make decisions based on its content.

On the old server I used an antique Perl script called hcat (from the Pelican book, http://oreilly.com/openbook/webclient/ch04.html); I've also tried curl and LWP::Simple. In all three cases, I get the same behavior: it works just fine on the command line, but when called by the web server through a CGI script, the LWP (or other socket connection) gets no results. It sounds like a permissions thing, but I don't know what kind of permissions setting to tinker with.

In the test script below, my command line outputs:

    Content-type: text/plain
    Getting URL: http://www.npr.org
    885 lines

Whereas the web output just says "Getting URL: http://www.npr.org" - and doesn't even get to the "Couldn't get" error message. Any clue how I can make use of a web page's contents from w/in a CGI script? (The actual application has to do with exporting data from our catalog, but I need to work out the basic mechanism first.)

Here's the script I'm using.

    #!/bin/perl
    use LWP::Simple;
    print "Content-type: text/plain\n\n";
    my $url = "http://www.npr.org";
    print "Getting URL: $url\n";
    my $content = get $url;
    die "Couldn't get $url" unless defined $content;
    @lines = split(/\n/, $content);
    foreach (@lines) { $i++; }
    print "\n\n$i lines\n\n";

Any ideas? Thanks
Ken
Re: [CODE4LIB] calling another webpage within CGI script
Ken, The difference is that when you run the script from the command line you are executing the file as /owner/, whereas when you access it through the browser it runs as /other/. Looking at the output you sent, I believe it might not be executing the complete script. Try setting permissions to 707 or 777 to start with. You may have to create a temporary directory to test with. Let me know if you have any questions, Vishwam

Vishwam Annam Wright State University Libraries 120 Paul Laurence Dunbar Library 3640 Colonel Glenn Hwy. Dayton, OH 45435 Office: 937-775-3262 FAX 937-775-2356

Ken Irwin wrote: Hi all, I'm moving to a new web server and struggling to get it configured properly. ...
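One way to check the owner-vs-other question directly -- a minimal sketch, and whoami.cgi is just a hypothetical name -- is a tiny CGI that reports which user it's actually running as:

    #!/usr/bin/perl
    # whoami.cgi: report the real and effective user this CGI runs as,
    # to confirm the owner-vs-other distinction above. Sketch only.
    use strict;
    use warnings;
    print "Content-type: text/plain\n\n";
    # $< is the real UID, $> the effective UID; getpwuid maps them to names
    print "Real user: ", scalar getpwuid($<), " (uid $<)\n";
    print "Effective user: ", scalar getpwuid($>), " (uid $>)\n";

Run it from the command line and then from the browser; if the two report different users, that's the account whose permissions matter.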
Re: [CODE4LIB] calling another webpage within CGI script
Hi Ken, This may be obvious, but when running from the command line, stdout and stderr are often interleaved, whereas on the web server you see stdout in the browser and stderr in the web server error log. Your script is probably exiting with an error either at the 'get' line (line 6) or at the 'die' line (line 7) -- terminating the script is what 'die' does. Have you checked your web server error log to see what the error is on your 'get' call? Matt

On Mon, Nov 23, 2009 at 7:17 AM, Ken Irwin kir...@wittenberg.edu wrote: Hi all, I'm moving to a new web server and struggling to get it configured properly. ...
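A quick way to see those errors without digging through the log -- a minimal sketch, assuming the standard CGI::Carp module is available -- is to route fatal errors to the browser while debugging:

    #!/usr/bin/perl
    # Debugging aid: show die() messages in the browser instead of
    # only in the web server's error log. Remove once things work.
    use strict;
    use warnings;
    use CGI::Carp qw(fatalsToBrowser);
    use LWP::Simple;

    print "Content-type: text/plain\n\n";
    my $url = "http://www.npr.org";
    my $content = get($url);
    die "Couldn't get $url" unless defined $content;
    print "Got ", length($content), " bytes\n";

With fatalsToBrowser in effect, whatever is killing the script shows up in the web output instead of silently ending the page.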
Re: [CODE4LIB] Assigning DOI for local content
On Mon, Nov 23, 2009 at 1:07 PM, Eric Hellman e...@hellman.net wrote: Does this answer your question, Ross?

Yes, sort of. My question was not so much whether you can resolve handles via bindings other than HTTP (since that's one of the selling points of handles) as whether people actually use this in the real world. Of course, it may be impossible to answer that question since, by your example, such people may not actually be letting anybody know that they're doing that (although you would probably be somebody with insider knowledge on this topic).

Also, with your use cases, would these services be impossible if the only binding was HTTP? Presumably dx.hellman.net would need to harvest its metadata from somewhere, which seems like it would leave a footprint. It also needs some mechanism to stay in sync with the master index. Your non-resolution service also seems to be looking these things up in realtime. Would a RESTful or SOAP API (*shudder*) not accomplish the same goal?

Really, though, the binding argument is less the issue here than whether you believe http URIs are valid identifiers or not, since there's no reason a URI couldn't be dereferenced via other bindings, either. -Ross.
Re: [CODE4LIB] Assigning DOI for local content
But minting DOIs requires a Registration Agency, which, as far as I understand it, requires $1,000 and approval from the International DOI Federation.[1] Roy

[1] http://www.doi.org/handbook_2000/governance.html#7.2.2

On 11/23/09 10:07 AM, Eric Hellman e...@hellman.net wrote: For example, if you don't want to rely on dx.doi.org as your gateway to the handle system for DOI resolution, it would be quite easy for me to deploy my own gateway at dx.hellman.net. ...
Re: [CODE4LIB] calling another webpage within CGI script
Ken, I tested your script on my server and it also worked for me on the command line and failed via my web server. All I did was add /usr to your path to perl and it worked: #!/usr/bin/perl Roy

On 11/23/09 8:17 AM, Ken Irwin kir...@wittenberg.edu wrote: Hi all, I'm moving to a new web server and struggling to get it configured properly. ...
Re: [CODE4LIB] calling another webpage within CGI script
On Mon, 23 Nov 2009, Ken Irwin wrote: Hi all, I'm moving to a new web server and struggling to get it configured properly. ... Any ideas?

I'd suggest testing the result of the call, rather than just looking for content, as an empty response could be a result of the server you're connecting to. (Unlikely in this case, but it happens once in a while, particularly if you turn off redirection or support caching.) Unfortunately, you might have to use LWP::UserAgent, rather than LWP::Simple:

    #!/bin/perl --
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new( timeout => 60 );
    my $response = $ua->get('http://www.npr.org/');
    if ( $response->is_success() ) {
        my $content = $response->decoded_content();
        ...
    } else {
        print "HTTP Error: ", $response->status_line(), "\n";
    }
    __END__

(And changing the shebang line to my location of perl, your version worked via both CGI and command line.)

Oh ... and you don't need the foreach loop: my $i = @lines; -Joe
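Putting the pieces from this thread together -- Roy's shebang fix plus the status check above -- Ken's test script might end up looking something like this (a sketch, assuming perl lives at /usr/bin/perl on the new server):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    print "Content-type: text/plain\n\n";
    my $url = 'http://www.npr.org';
    print "Getting URL: $url\n";

    my $ua = LWP::UserAgent->new( timeout => 60 );
    my $response = $ua->get($url);

    # Report failures to the browser instead of die()-ing to the error log
    unless ( $response->is_success() ) {
        print "Couldn't get $url: ", $response->status_line(), "\n";
        exit;
    }

    # Assigning an array to a scalar counts its elements; no foreach needed
    my @lines = split /\n/, $response->decoded_content();
    my $count = @lines;
    print "\n\n$count lines\n\n";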
Re: [CODE4LIB] Assigning DOI for local content
Hi all, couldn't resist jumping in on this one:

But it appears that the handle system is quite a bit more fleshed out than a simple purl server; it's a distributed, protocol-independent network. ... why NOT use handle instead of purl?

I think it's also worth adding that handles (and DOIs) can also be used to create PURLs, e.g. http://purl.org/handles/10.1074/jbc.M004545200 (which isn't a real link) -- in fact, there's no reason why you couldn't use a PURL server as a local handle resolver, aside from the fact that it wouldn't be participating in the handle network. See, for example: http://www.ukoln.ac.uk/distributed-systems/poi/

One thing PURL has going for it is that it has defined meanings for HTTP response codes; these are similar to REST responses, though I don't know if they're the same. The most recent documentation mentions that PURL servers are RESTful, but I suspect this is part of the recent re-tooling of PURL. http://purl.oclc.org/docs/help.html#rest

The other potential advantage of PURLs that I can see is the ability to do partial redirects, e.g. http://purl.org/redirect/xx -> http://some.server/long.path/x -- though one could make the case that this might be useful for directing handle requests to the appropriate servers, e.g. http://purl.org/handles/10.123/xx -> http://handleserver1/xx and http://purl.org/handles/10.456/xx -> http://doiserver2/xx ...

Overall, I tend to agree that handles seem more flexible -- or at least, less tied to URL and HTTP -- than PURLs. Not having to rely on a specific server for resolution is a fairly major bonus (think DNS-style round-robin resolver querying for handles; not possible with PURLs). MJ

PS. At the risk of reposting potentially old news: http://web.mit.edu/handle/www/purl-eval.html
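To make that partial-redirect idea concrete: the prefix-to-server mapping is just a lookup table. A sketch in Perl (handleserver1 and doiserver2 are hypothetical hosts, as above):

    #!/usr/bin/perl
    # Route a handle to a resolver based on its prefix, as in the
    # http://purl.org/handles/... example. Hosts are hypothetical.
    use strict;
    use warnings;

    my %resolver_for_prefix = (
        '10.123' => 'http://handleserver1',
        '10.456' => 'http://doiserver2',
    );

    sub redirect_url {
        my ($handle) = @_;                      # e.g. "10.123/xx"
        my ($prefix) = $handle =~ m{^([^/]+)/}; # part before the first slash
        return unless defined $prefix;
        my $base = $resolver_for_prefix{$prefix} or return;
        return "$base/$handle";
    }

    print redirect_url('10.123/xx'), "\n";  # http://handleserver1/10.123/xx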
Re: [CODE4LIB] Assigning DOI for local content
On Mon, Nov 23, 2009 at 2:52 PM, Jonathan Rochkind rochk...@jhu.edu wrote: Well, here's the trick about handles, as I understand it. A handle, for instance, a DOI, is 10.1074/jbc.M004545200.

Well, actually, it could be:

10.1074/jbc.M004545200
doi:10.1074/jbc.M004545200
info:doi/10.1074/jbc.M004545200

etc. But there's still got to be some mechanism to get from there to:

http://dx.doi.org/10.1074/jbc.M004545200 or http://dx.hellman.net/10.1074/jbc.M004545200

I don't see why it's any different, fundamentally, than:

http://purl.hellman.net/?purl=http%3A%2F%2Fpurl.org%2FNET%2Fdoi%2F10.1074%2Fjbc.M004545200

besides being prettier. Anyway, my argument wasn't that PURL was technologically more sound than handles -- PURL services have a major single-point-of-failure problem -- it's just that I don't buy the argument that handles are somehow superior because they aren't limited to HTTP. What I'm saying is that there are plenty of valid reasons to value handles more than PURLs (or any other indirection service), but independence from HTTP isn't one of them. -Ross.

While, for DOI handles, normally we resolve that using dx.doi.org, at http://dx.doi.org/10.1074/jbc.M004545200, that is not actually a requirement of the handle system. You can resolve it through any handle server, over HTTP or otherwise. Even if it's still over HTTP, it doesn't have to be at dx.doi.org; it can be via any handle resolver. For instance, check this out, it works: http://hdl.handle.net/10.1074/jbc.M004545200 'Cause the DOI is really just a subset of handles, any resolver participating in the handle network can resolve 'em. In Eric's hypothetical use case, that could be a local enterprise handle resolver of some kind.

(Although I'm not totally sure that would keep your usage data private; the documentation I've seen compares the handle network to DNS. It's a distributed system, and I'm not sure in what cases handle resolution requests are sent 'upstream' by the handle resolver, and whether actual individual lookups are revealed by that or not. But in any case, when Ross suggests -- "Presumably dx.hellman.net would need to harvest its metadata from somewhere, which seems like it would leave a footprint. It also needs some mechanism to stay in sync with the master index." -- my reading suggests this is _built into_ the handle protocol; it's part of handle from the very start (again, the DNS analogy, with the emphasis on the distributed resolution aspect), so you don't need to invent it yourself. The details of exactly how it works, I don't know enough to say.)

Now, I'm somewhat new to this stuff too; I don't completely understand how it works. Apparently hdl.handle.net can deal with any handle globally, while presumably dx.doi.org can only deal with the subset of handles that are also DOIs. And apparently you can have a handle resolver that works over something other than HTTP too. (Although Ross argues, why would you want to? And I'm inclined to agree.) But it appears that the handle system is quite a bit more fleshed out than a simple PURL server; it's a distributed, protocol-independent network. The protocol-independent part may or may not be useful, but it certainly seems like it could be; it doesn't hurt to provide for it in advance. The distributed part seems pretty cool to me. So if it's no harder to set up, maintain, and use a handle server than a PURL server (this is a big 'if', I'm not sure if that's the case), and handle can do everything PURL can do and quite a bit more (I'm pretty sure that is the case)... why NOT use handle instead of purl? It seems like handle is a more fleshed out, robust, full-featured thing than purl. Jonathan

Presumably dx.hellman.net would need to harvest its metadata from somewhere ... -Ross.
Re: [CODE4LIB] Assigning DOI for local content
The actual handle is 10.1074/jbc.M004545200. If your software wants to get a handle to give to any handle resolver of its choice, it's going to have to parse the doi: or info: versions to get the handle out first. The info version is a URI that has a DOI handle embedded in it. The doi version is... um, I dunno, just a convention, I think, that has a DOI handle embedded in it.

Likewise, if your software had a URI, and was smart enough to know that the URI "http://dx.doi.org/10.1074/jbc.M004545200" actually had a handle embedded in it, it could strip the handle out, and then resolve it against some other handle server that participates in the handle network, like hdl.handle.net. But that would be kind of going against the principle of treating URIs as opaque identifiers and not parsing them for internal data.

But me, I end up going against that principle all the time in actual practice, for scenarios kind of analogous to, but less well-defined and spec'd than, getting the actual handle out of the URI and resolving it against some other service. For instance, getting an OCLCnum out of an http://worldcat.oclc.org/ URI, to resolve against my local catalog that knows something about OCLCnums, but doesn't know anything about http://worldcat.oclc.org URIs that happen to have an OCLCnum embedded in them. Or getting an ASIN out of a http://www.amazon.com/ URI, to resolve against Amazon's _own_ web services, which ironically know something about ASINs but don't know anything about www.amazon.com URIs that have an ASIN embedded in them. Actually quite analogous to getting the actual handle out of an http://dx.doi.org or http://hdl.handle.net URI, in order to resolve against the resolver of choice. Jonathan

Ross Singer wrote: On Mon, Nov 23, 2009 at 2:52 PM, Jonathan Rochkind rochk...@jhu.edu wrote: Well, here's the trick about handles ...
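As an illustration of the parsing Jonathan describes -- a sketch, not anyone's production code -- pulling the bare handle out of the common DOI/handle spellings might look like:

    #!/usr/bin/perl
    # Extract the bare handle from the common DOI/handle spellings
    # (doi:, info:doi/, dx.doi.org, hdl.handle.net). Sketch only.
    use strict;
    use warnings;

    sub bare_handle {
        my ($id) = @_;
        return $1 if $id =~ m{^doi:(.+)}i;
        return $1 if $id =~ m{^info:doi/(.+)}i;
        return $1 if $id =~ m{^https?://(?:dx\.doi\.org|hdl\.handle\.net)/(.+)}i;
        return $id;  # assume it's already bare, like 10.1074/jbc.M004545200
    }

    # All of these print 10.1074/jbc.M004545200:
    print bare_handle($_), "\n" for (
        '10.1074/jbc.M004545200',
        'doi:10.1074/jbc.M004545200',
        'info:doi/10.1074/jbc.M004545200',
        'http://dx.doi.org/10.1074/jbc.M004545200',
        'http://hdl.handle.net/10.1074/jbc.M004545200',
    );

Once you have the bare handle, you can hand it to whichever resolver you like -- which is exactly the opaque-identifier principle being bent.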
Re: [CODE4LIB] Assigning DOI for local content
Interesting stuff. I never really thought about it before that DOIs can be served up by the Handle server. E.G., http://dx.doi.org/10.1074/jbc.M004545200 = http://hdl.handle.net/10.1074/jbc.M004545200 But, even more surprising to me was realizing that Handles can be resolved by the DOI server. Or presumably any DOI server. http://hdl.handle.net/2027.42/46087 = http://dx.doi.org/2027.42/46087 I suppose I should have understood this point since the Handle service does sort of obliquely say this. http://www.handle.net/factsheet.html Anyway, good to have it made explicit. Tom

On Mon, Nov 23, 2009 at 4:03 PM, Jonathan Rochkind rochk...@jhu.edu wrote: The actual handle ...
Re: [CODE4LIB] Assigning DOI for local content
What happens if the main doi resolver goes down? I'd be interested to see how well a local resolver works when blocked from this upstream server. Are there any other upstream servers? Ben

On Nov 23, 2009 10:10 PM, Tom Keays tomke...@gmail.com wrote: Interesting stuff. ...
Re: [CODE4LIB] Assigning DOI for local content
What do you mean by a local resolver? If you're talking about a local handle resolver adhering to the handle spec... well, then it depends on the handle spec, I guess, which I don't know. But since all the handle documentation keeps saying "like DNS", I'd imagine it has similar (or better) redundancy built into it as DNS does. But I don't know.

Poking around on handle.net, it looks like the handle infrastructure supports this, but you would have had to actually configure 'backup' handle resolvers -- similar to DNS in that if the DNS for your domain goes down, and you _haven't_ gotten someone else at another location to be a 'backup' resolver for you, and specified them as a nameserver in your DNS record... then you're out of luck. But the protocol supports that, and if you have done it (as most everyone does with DNS), you're good. I have no idea if 'most everyone' does it with handle or not, but handle supports it.

Note that if dx.doi.org goes down, you obviously won't be able to resolve at dx.doi.org -- but IF it works as I think (I'm still confused), AND dx.doi.org has distributed its handles to a backup resolver, then you'd still be able to resolve via hdl.handle.net, or via your own local handle resolver (which will in turn find the backup resolver). http://www.handle.net/lhs.html Jonathan

Ben O'Steen wrote: What happens if the main doi resolver goes down? ...
Re: [CODE4LIB] Assigning DOI for local content
More info here too: http://www.handle.net/introduction.html

This handle stuff is interesting, but I don't entirely understand it. I guess if the Global Handle Service really went down, it would be similar to a root-level DNS server going down -- you'd be in trouble, somewhat mitigated by whatever data your local resolver had cached. Of course, CNRI maintains several failover mirrors of the Global Handle Service for that reason. (Much as we'd hope all the root-level DNS servers are thoroughly failover-ed.) Jonathan

Ben O'Steen wrote: What happens if the main doi resolver goes down? ...
Re: [CODE4LIB] Web analytics for POST data
Alejandro Garza Gonzalez wrote: 1) You *can* use GA and some Javascript embedded in your III pages to log events (as they're called in GA lingo). The javascript (depending on your coding wizardry level) could track anything from hovers over elements, form submission, next page events, etc.

Hi Alejandro, Thanks for a great suggestion. I tried poking around at it; it seems to me like Events aren't built for what I'm really interested in doing, namely systematic exploration and analysis of the search sessions. IOW, let's say a form looks like t=finn a=twain l=circ,reserve. It looks like I could log this as three separate events, or one; but either way, how would one analyze this? I'm not interested (solely) in how many times this particular query was entered.

I started looking at ways to funnel the params into my own tracking script, the prototype of which just writes a line to a text file with a JSON serialization of the form data; but I'm not a JS ninja, so I'm still trying to figure out how to get around the XSS problems. Ruddy III turnkey...

-- Yitzchak Schaffer Systems Manager Touro College Libraries 33 West 23rd Street New York, NY 10010 Tel (212) 463-0400 x5230 Fax (212) 627-3197 Email yitzchak.schaf...@tourolib.org
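For what it's worth, the server side of a tracking prototype like the one described above can be quite small. A sketch, assuming the JSON module from CPAN is installed; track.cgi and the log path are hypothetical names:

    #!/usr/bin/perl
    # track.cgi: log submitted form params as one JSON line per request.
    # Sketch only -- lock down the log path before any real use.
    use strict;
    use warnings;
    use CGI;
    use JSON;

    my $q = CGI->new;
    # param() with no arguments lists parameter names; with a name, in
    # list context, it returns all values, so multi-valued fields survive.
    my %params = map { $_ => [ $q->param($_) ] } $q->param;

    open my $log, '>>', '/var/log/search-sessions.log' or die "open: $!";
    print {$log} encode_json({ time => time(), params => \%params }), "\n";
    close $log;

    print $q->header('text/plain'), "ok\n";

The cross-domain part is the harder bit; the usual workaround is an image beacon -- page JavaScript sets the src of a 1x1 image to track.cgi with the query string attached -- which sidesteps the cross-site scripting restriction entirely.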