Re: [CODE4LIB] GPO PURLs

2009-09-01 Thread Jonathan Lebreton
This is indeed an interesting problem - we are all dependent on a
centralized service node.  

Just got off the phone with GPO 9 am 9/1/09.  
I was told they are now up to 50% of PURLs restored but the script is
running very slowly line-by-line since the server (they're updating the
production server while it is up) is experiencing unusually heavy load
from the user community and bots scheduled to troll at beginning of the
month.  

Jonathan LeBreton 
Sr. Associate University Librarian
Temple University Libraries
voice: 215-204-8231
fax: 215-204-5201
email:  lebre...@temple.edu
email:  jonat...@temple.edu








Re: [CODE4LIB] GPO PURLs

2009-09-01 Thread Keith Jenkins
On Tue, Sep 1, 2009 at 11:53 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 Of course, one failure in X (10?) years is fairly good reliability...
 depending on how long it takes them to get everything back working 100%. If
 it's back by tomorrow, one outage in 10 years pretty good. If it takes a
 week to get back, not so good.

It's been 8 days so far... hopefully it will be back to normal soon.
I'm not sure how long GPO has been serving PURLs, but if we assume
this is the first failure in 10 years, then that's still 99.8% uptime,
which isn't bad.  If the 0.2% were spread evenly across ten years, it
would hardly be noticeable, but when it happens all at once, it
certainly does seem worse.

Keith
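
The uptime estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the 10-year service period is the assumption stated in the message, not a confirmed figure, and the function name is made up for illustration.

```python
def uptime_percent(outage_days: float, period_years: float) -> float:
    """Percentage of the period during which the service was available."""
    period_days = period_years * 365.25  # average year length, leap years included
    return 100.0 * (1.0 - outage_days / period_days)


# 8 days of outage over an assumed 10 years of service.
print(f"{uptime_percent(8, 10):.2f}%")  # 99.78%
```

Rounded to one decimal place this gives the 99.8% quoted above.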


Re: [CODE4LIB] GPO PURLs

2009-09-01 Thread Edward M. Corrado

Roy++

I agree: while we might use technology to preserve things, it is only a 
tool to help preserve things. It is at best the how, not the which, 
what, why, and when.


Edward



Roy Tennant wrote:

I think this episode also illustrates, once again, that preservation is not
about technology at all, it's about *institutional commitment*. The kind of
institutional commitment that would have implemented and maintained the
kinds of procedures that Jonathan described. Without institutional
commitment, no technology on earth can save you.
Roy


On 9/1/09 9:00 AM, Jonathan Rochkind rochk...@jhu.edu wrote:

  

I'd add that not only does it sound like GPO maintained no failover
backup, it sounds, based on Jonathan Lebreton's report,  like they
didn't even maintain an offline backup, since they're needing to
regenerate the purl database from raw data, rather than simply restoring
from a backup, which would generally be much quicker than the process
that Jonathan Lebreton seems to be describing.

From what info we have, it sounds like GPO simply, well, was very very
far from 'best practices' for a service meant to be robustly reliable.
On the other hand, we're just going from sort of third hand hearsay,
maybe they were doing things more right than it sounds, but some kind of
catastrophic unexpected 'perfect storm' still happened to bring
everything down. Maybe 48 hours of outage in 10 years (how long has GPO
purl been running? Have there been outages like this before?) is
appropriate reliability for the level of importance of this service. I
dunno.

Jonathan
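
To make the "offline backup" point concrete, here is a minimal sketch of dumping a redirect table to a flat file and restoring it in one bulk transaction rather than line by line. Everything about the storage is an assumption: nothing in this thread says what database or schema GPO's PURL server uses, so the `purls` table, its columns, and the sample PURL paths below are invented for illustration, with SQLite standing in for the real store.

```python
import csv
import io
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an empty in-memory database with a hypothetical PURL table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE purls (purl_path TEXT PRIMARY KEY, target_url TEXT NOT NULL)"
    )
    return conn


def dump_purls(conn: sqlite3.Connection, out) -> int:
    """Write every (purl_path, target_url) pair to `out` as CSV; return the count."""
    writer = csv.writer(out)
    count = 0
    query = "SELECT purl_path, target_url FROM purls ORDER BY purl_path"
    for row in conn.execute(query):
        writer.writerow(row)
        count += 1
    return count


def restore_purls(conn: sqlite3.Connection, src) -> int:
    """Load a CSV dump back in a single transaction; return the row count."""
    rows = [tuple(r) for r in csv.reader(src) if r]
    with conn:  # commits once at the end, or rolls back on error
        conn.executemany("INSERT OR REPLACE INTO purls VALUES (?, ?)", rows)
    return len(rows)


# Usage with made-up data: back up a "live" table, then rebuild from the dump.
live = make_db()
with live:
    live.executemany(
        "INSERT INTO purls VALUES (?, ?)",
        [
            ("/GPO/LPS1", "http://example.gov/a.pdf"),
            ("/GPO/LPS2", "http://example.gov/b.pdf"),
        ],
    )

backup = io.StringIO()  # stands in for a file kept off the production server
dump_purls(live, backup)

backup.seek(0)
rebuilt = make_db()
restored = restore_purls(rebuilt, backup)
print(restored, "PURLs restored")
```

The design point is only that a periodic dump kept off the production machine turns recovery into a bulk load instead of a regeneration from raw data.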


Re: [CODE4LIB] GPO PURLs

2009-08-31 Thread James Jacobs

Hi all, (cross-posted to purl-dev)

I'm a documents librarian (and member of the Depository Library Council) 
and usually just a lurker over here. Thanks Keith and Patricia for the 
easy workaround. I shared this with govdoc-l and on my blog:


http://freegovinfo.info/node/2704

See especially the comment that as of today, only 3,677 PURLs out of 
116,237 have been restored (3.1%). I would love to hear your 
thoughts/ideas for how this kind of critical system failure can be 
averted in the future from a technological standpoint. Is it possible to 
mirror a purl server? Will the same issue occur when GPO moves to 
handles in FDsys (http://www.handle.net/)? Will a distributed 
infrastructure as I've briefly mapped out be able to handle these types 
of critical system crashes better?


Please let me know and I'd be happy to share your ideas with GPO and the 
documents community.


Best,

James Jacobs




Keith Jenkins wrote:

Thanks to everyone who helped me confirm that the GPO PURL server is
down.  An official announcement on the GPO Listserv said:
   The PURL Server is currently inaccessible. GPO is working with IT
staff to restore service as soon as possible. We regret any
inconvenience caused by the server problems. An updated listserv will
be sent once service is restored.

While the server is down, here is one workaround (thanks to Patricia Duplantis):
   1. Go to http://catalog.gpo.gov/
   2. Click Advanced Search
   3. Search for word in URL/PURL, enter the PURL
   4. Click Go
   5. The original URL at the time of cataloging should appear in a 53x note.

This incident, however, illuminates a weakness in PURL systems: access
is broken when the PURL server breaks, even though the documents are
still online at their original URLs.

Maybe someone more familiar with PURL systems can tell me... is there
any way to harvest data from a PURL server, so that a backup/mirror
can be available?

Keith


--
James R. Jacobs
International Documents Librarian
Green Library, Stanford University
P: (650) 725-1030 E: jrjac...@stanford.edu
AIM: LibrarianJames T: @freegovinfo

The more beautiful questions demand the more beautiful answers,
and if we can learn to ask them, we stand a chance of steering
clear of shipwreck on our jury-rigged and not so distant star.
--Lewis Lapham, Lapham's Quarterly I(3), Summer, 2008, p.17.

---
This message may have been intercepted and read by U.S. government
agencies including the FBI, CIA, and NSA without notice or warrant or
knowledge of sender or recipient.

 (\
{|||8-
 (/
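
On the question of whether a distributed setup would cope better: one simple reading is "try the central resolver, and fall back to a locally held copy of the mappings if it is unreachable". The sketch below shows only that idea. The resolver functions, the `purl.example` host, and the mirror contents are hypothetical, and it says nothing about how handles or FDsys actually work.

```python
from typing import Callable, Iterable, Optional

# A resolver takes a PURL and returns its target URL, or None if unknown.
Resolver = Callable[[str], Optional[str]]


def resolve_with_fallback(purl: str, resolvers: Iterable[Resolver]) -> Optional[str]:
    """Ask each resolver in order; skip any that are down or have no answer."""
    for resolver in resolvers:
        try:
            target = resolver(purl)
        except OSError:
            continue  # this source is unreachable, try the next one
        if target is not None:
            return target
    return None


def primary_down(purl: str) -> Optional[str]:
    """Simulates the central server being offline."""
    raise ConnectionError("primary PURL server unreachable")


# Made-up mirror contents, e.g. loaded from a periodic export.
LOCAL_MIRROR = {"http://purl.example/GPO/LPS1": "http://example.gov/a.pdf"}


def local_mirror(purl: str) -> Optional[str]:
    return LOCAL_MIRROR.get(purl)


result = resolve_with_fallback(
    "http://purl.example/GPO/LPS1", [primary_down, local_mirror]
)
print(result)
```

This only helps if the mirror is kept reasonably current, which is again a matter of procedure rather than code.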


Re: [CODE4LIB] GPO PURLs

2009-08-29 Thread Ed Summers
On Thu, Aug 27, 2009 at 4:37 PM, Keith Jenkins k...@cornell.edu wrote:
 Maybe someone more familiar with PURL systems can tell me... is there
 any way to harvest data from a PURL server, so that a backup/mirror
 can be available?

This would be a great question for the purl-dev discussion list too:

  http://www.purlz.org/mailman/listinfo/purl-dev

//Ed


Re: [CODE4LIB] GPO PURLs

2009-08-27 Thread Keith Jenkins
Thanks to everyone who helped me confirm that the GPO PURL server is
down.  An official announcement on the GPO Listserv said:
  The PURL Server is currently inaccessible. GPO is working with IT
staff to restore service as soon as possible. We regret any
inconvenience caused by the server problems. An updated listserv will
be sent once service is restored.

While the server is down, here is one workaround (thanks to Patricia Duplantis):
  1. Go to http://catalog.gpo.gov/
  2. Click Advanced Search
  3. Search for word in URL/PURL, enter the PURL
  4. Click Go
  5. The original URL at the time of cataloging should appear in a 53x note.

This incident, however, illuminates a weakness in PURL systems: access
is broken when the PURL server breaks, even though the documents are
still online at their original URLs.

Maybe someone more familiar with PURL systems can tell me... is there
any way to harvest data from a PURL server, so that a backup/mirror
can be available?

Keith
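
One way to approach the harvesting question, assuming nothing about the PURL software beyond the fact that it answers with HTTP redirects: request each known PURL without following the redirect and record the Location header. The sketch below runs against a throwaway local server with made-up paths and targets, so it is safe to execute as is. It would only work against a real server while that server is up, and only for PURLs whose identifiers are already known (for example from catalog records). Whether the PURL software offers a bulk export is not something this thread establishes.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Made-up redirect table standing in for a real PURL server's data.
FAKE_TABLE = {
    "/GPO/LPS1": "http://example.gov/a.pdf",
    "/GPO/LPS2": "http://example.gov/b.pdf",
}


class FakePurlHandler(BaseHTTPRequestHandler):
    """Answers HEAD requests with a 302 redirect, as a PURL server would."""

    def do_HEAD(self):
        target = FAKE_TABLE.get(self.path)
        if target is None:
            self.send_error(404)
            return
        self.send_response(302)
        self.send_header("Location", target)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


def harvest(host: str, port: int, paths: list) -> dict:
    """Return {purl_path: target_url} for every path that redirects."""
    mirror = {}
    for path in paths:
        conn = http.client.HTTPConnection(host, port, timeout=10)
        try:
            conn.request("HEAD", path)  # http.client never follows redirects
            resp = conn.getresponse()
            location = resp.getheader("Location")
            if resp.status in (301, 302, 303, 307) and location:
                mirror[path] = location
        finally:
            conn.close()
    return mirror


# Usage: start the fake server on a free local port, harvest, shut down.
server = HTTPServer(("127.0.0.1", 0), FakePurlHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    mirror = harvest(
        "127.0.0.1", server.server_port, ["/GPO/LPS1", "/GPO/LPS2", "/GPO/missing"]
    )
finally:
    server.shutdown()
    server.server_close()

print(mirror)
```

A real harvest of this kind should be rate-limited, since heavy automated load on the server is part of what slowed the restoration.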