Re: [CODE4LIB] GPO PURLs
This is indeed an interesting problem - we are all dependent on a centralized service node.

Just got off the phone with GPO 9 am 9/1/09. I was told they are now up to 50% of PURLs restored, but the script is running very slowly, line by line, because the server (they're updating the production server while it is up) is experiencing unusually heavy load from the user community and from bots scheduled to troll at the beginning of the month.

Jonathan LeBreton
Sr. Associate University Librarian
Temple University Libraries
voice: 215-204-8231
fax: 215-204-5201
email: lebre...@temple.edu
email: jonat...@temple.edu

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of James Jacobs
Sent: Monday, August 31, 2009 6:06 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] GPO PURLs

Hi all, (cross-posted to purl-dev)

I'm a documents librarian (and member of the Depository Library Council) and usually just a lurker over here. Thanks Keith and Patricia for the easy workaround. I shared this with govdoc-l and on my blog: http://freegovinfo.info/node/2704 See especially the comment that as of today, only 3,677 PURLs out of 116,237 have been restored (3.1%).

I would love to hear your thoughts/ideas for how this kind of critical system failure can be averted in the future from a technological standpoint. Is it possible to mirror a purl server? Will the same issue occur when GPO moves to handles in FDsys (http://www.handle.net/)? Will a distributed infrastructure as I've briefly mapped out be able to handle these types of critical system crashes better?

Please let me know and I'd be happy to share your ideas with GPO and the documents community.

Best,
James Jacobs

Keith Jenkins wrote:

Thanks to everyone who helped me confirm that the GPO PURL server is down. An official announcement on the GPO Listserv said:

The PURL Server is currently inaccessible. GPO is working with IT staff to restore service as soon as possible. We regret any inconvenience caused by the server problems. An updated listserv will be sent once service is restored.

While the server is down, here is one workaround (thanks to Patricia Duplantis):

1. Go to http://catalog.gpo.gov/
2. Click Advanced Search
3. Search for word in URL/PURL, enter the PURL
4. Click Go
5. The original URL at the time of cataloging should appear in a 53x note.

This incident, however, illuminates a weakness in PURL systems: access is broken when the PURL server breaks, even though the documents are still online at their original URLs.

Maybe someone more familiar with PURL systems can tell me... is there any way to harvest data from a PURL server, so that a backup/mirror can be available?

Keith

--
James R. Jacobs
International Documents Librarian
Green Library, Stanford University
P: (650) 725-1030
E: jrjac...@stanford.edu
AIM: LibrarianJames
T: @freegovinfo

The more beautiful questions demand the more beautiful answers, and if we can learn to ask them, we stand a chance of steering clear of shipwreck on our jury-rigged and not so distant star. --Lewis Lapham, Lapham's Quarterly I(3), Summer, 2008, p.17.

---
This message may have been intercepted and read by U.S. government agencies including the FBI, CIA, and NSA without notice or warrant or knowledge of sender or recipient.
Re: [CODE4LIB] GPO PURLs
On Tue, Sep 1, 2009 at 11:53 AM, Jonathan Rochkind rochk...@jhu.edu wrote:

Of course, one failure in X (10?) years is fairly good reliability... depending on how long it takes them to get everything back working 100%. If it's back by tomorrow, one outage in 10 years is pretty good. If it takes a week to get back, not so good.

It's been 8 days so far... hopefully it will be back to normal soon.

I'm not sure how long GPO has been serving PURLs, but if we assume this is the first failure in 10 years, then that's still 99.8% uptime, which isn't bad. If the 0.2% were spread evenly across ten years, it would hardly be noticeable, but when it happens all at once, it certainly does seem worse.

Keith
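As a rough check on the arithmetic above, here is a minimal Python sketch. The function name uptime_percent and the 365.25-day year are assumptions made for illustration; the 8-day outage and the 10-year service life are the figures guessed at in the message, not confirmed numbers from GPO.

```python
def uptime_percent(outage_days: float, service_years: float) -> float:
    """Percentage of time a service was up, given total outage time in days."""
    total_days = service_years * 365.25  # assumed average year length
    return 100.0 * (1.0 - outage_days / total_days)


# 8 days of outage over an assumed 10 years of service
print(f"{uptime_percent(8, 10):.2f}%")  # prints 99.78%
```

Rounded to one decimal place this gives the 99.8% quoted above, with the remaining roughly 0.2% being the outage itself.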
Re: [CODE4LIB] GPO PURLs
Roy++

I agree: while we might use technology to preserve things, it is only a tool to help preserve things. It is at best the how, not the which, what, why, and when.

Edward

Roy Tennant wrote:

I think this episode also illustrates, once again, that preservation is not about technology at all, it's about *institutional commitment*. The kind of institutional commitment that would have implemented and maintained the kinds of procedures that Jonathan described. Without institutional commitment, no technology on earth can save you.

Roy

On 9/1/09 9:00 AM, Jonathan Rochkind rochk...@jhu.edu wrote:

I'd add that not only does it sound like GPO maintained no failover backup, it sounds, based on Jonathan Lebreton's report, like they didn't even maintain an offline backup, since they're needing to regenerate the purl database from raw data, rather than simply restoring from a backup, which would generally be much quicker than the process that Jonathan Lebreton seems to be describing.

From what info we have, it sounds like GPO simply, well, was very very far from 'best practices' for a service meant to be robustly reliable. On the other hand, we're just going from sort of third hand hearsay, maybe they were doing things more right than it sounds, but some kind of catastrophic unexpected 'perfect storm' still happened to bring everything down. Maybe 48 hours of outage in 10 years (how long has GPO purl been running? Have there been outages like this before?) is appropriate reliability for the level of importance of this service. I dunno.

Jonathan

Jonathan Lebreton wrote:

This is indeed an interesting problem - we are all dependent on a centralized service node. Just got off the phone with GPO 9 am 9/1/09.
Re: [CODE4LIB] GPO PURLs
Hi all, (cross-posted to purl-dev)

I'm a documents librarian (and member of the Depository Library Council) and usually just a lurker over here. Thanks Keith and Patricia for the easy workaround. I shared this with govdoc-l and on my blog: http://freegovinfo.info/node/2704 See especially the comment that as of today, only 3,677 PURLs out of 116,237 have been restored (3.1%).

I would love to hear your thoughts/ideas for how this kind of critical system failure can be averted in the future from a technological standpoint. Is it possible to mirror a purl server? Will the same issue occur when GPO moves to handles in FDsys (http://www.handle.net/)? Will a distributed infrastructure as I've briefly mapped out be able to handle these types of critical system crashes better?

Please let me know and I'd be happy to share your ideas with GPO and the documents community.

Best,
James Jacobs

Keith Jenkins wrote:

Thanks to everyone who helped me confirm that the GPO PURL server is down. An official announcement on the GPO Listserv said:

The PURL Server is currently inaccessible. GPO is working with IT staff to restore service as soon as possible. We regret any inconvenience caused by the server problems. An updated listserv will be sent once service is restored.

While the server is down, here is one workaround (thanks to Patricia Duplantis):

1. Go to http://catalog.gpo.gov/
2. Click Advanced Search
3. Search for word in URL/PURL, enter the PURL
4. Click Go
5. The original URL at the time of cataloging should appear in a 53x note.

This incident, however, illuminates a weakness in PURL systems: access is broken when the PURL server breaks, even though the documents are still online at their original URLs.

Maybe someone more familiar with PURL systems can tell me... is there any way to harvest data from a PURL server, so that a backup/mirror can be available?

Keith

--
James R. Jacobs
International Documents Librarian
Green Library, Stanford University
P: (650) 725-1030
E: jrjac...@stanford.edu
AIM: LibrarianJames
T: @freegovinfo

The more beautiful questions demand the more beautiful answers, and if we can learn to ask them, we stand a chance of steering clear of shipwreck on our jury-rigged and not so distant star. --Lewis Lapham, Lapham's Quarterly I(3), Summer, 2008, p.17.

---
This message may have been intercepted and read by U.S. government agencies including the FBI, CIA, and NSA without notice or warrant or knowledge of sender or recipient.
Re: [CODE4LIB] GPO PURLs
On Thu, Aug 27, 2009 at 4:37 PM, Keith Jenkins k...@cornell.edu wrote:

Maybe someone more familiar with PURL systems can tell me... is there any way to harvest data from a PURL server, so that a backup/mirror can be available?

This would be a great question for the purl-dev discussion list too: http://www.purlz.org/mailman/listinfo/purl-dev

//Ed
Re: [CODE4LIB] GPO PURLs
Thanks to everyone who helped me confirm that the GPO PURL server is down. An official announcement on the GPO Listserv said:

The PURL Server is currently inaccessible. GPO is working with IT staff to restore service as soon as possible. We regret any inconvenience caused by the server problems. An updated listserv will be sent once service is restored.

While the server is down, here is one workaround (thanks to Patricia Duplantis):

1. Go to http://catalog.gpo.gov/
2. Click Advanced Search
3. Search for word in URL/PURL, enter the PURL
4. Click Go
5. The original URL at the time of cataloging should appear in a 53x note.

This incident, however, illuminates a weakness in PURL systems: access is broken when the PURL server breaks, even though the documents are still online at their original URLs.

Maybe someone more familiar with PURL systems can tell me... is there any way to harvest data from a PURL server, so that a backup/mirror can be available?

Keith
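On the harvesting question, one possible approach (a sketch only, not anything GPO or the PURL software is known to provide) relies on the fact that a PURL resolves by answering an HTTP request with a redirect. A client that already holds a list of PURLs, for example pulled from its own catalog records, could ask for each one without following the redirect and record the Location header while the server is still up. The function names resolve_once and harvest are hypothetical, and the sketch assumes the server answers HEAD requests with a 3xx status.

```python
import http.client
from urllib.parse import urlsplit


def resolve_once(purl, timeout=10.0):
    """Return the redirect target of one PURL, or None if it does not redirect."""
    parts = urlsplit(purl)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # HEAD asks only for headers; http.client never follows redirects itself.
        conn.request("HEAD", path)
        resp = conn.getresponse()
        if 300 <= resp.status < 400:
            return resp.getheader("Location")
        return None
    finally:
        conn.close()


def harvest(purls):
    """Map each known PURL to its current target (None where none was found)."""
    return {purl: resolve_once(purl) for purl in purls}
```

Calling harvest() on a list of PURLs returns a dictionary that could be saved locally and consulted during an outage. It only captures PURLs the client already knows about, and it has to be run before the server goes down, so it is a partial backup rather than a true mirror.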