[Dspace-tech] Persistent identifiers, again

2008-04-07 Thread Gary McGath
As some of you may recall, I've been looking into implementing a 
non-Handle approach to persistent identifiers in our local version of 
DSpace. My approach has been to replace HandleManager bodily. So far 
I've gotten it to the point where it has the same functionality as 
out-of-the-box DSpace; that is, the handles don't work when accessed as 
external identifiers, but a manual substitution of the URL base produces 
a working URL.

Each time I blow away DSpace (as often happens during early stages of 
code development), I've noticed that the IDs start over from 1. It 
looks to me as if the HandleManager doesn't provide a strong guarantee 
of uniqueness; if DSpace is totally reinitialized, old Handles may be 
recycled. If this happened in a production environment, Handles for 
obsolete objects could point to the wrong object, rather than failing to 
resolve as they should.

This makes me think that for our environment, which stresses the 
unique in URNs, I need to add a stronger guarantee of uniqueness. It 
also seems like a minor bug in DSpace.

Have I missed anything? Does something come into play when a live handle 
server is used, which provides a stronger guarantee of uniqueness?

I'm aware, by the way, that DSpace 1.6 has totally new code for 
non-Handle persistent identifiers. Our schedule doesn't let us wait. 
Don't blame me, I'm only the programmer. :)

-- 
Gary McGath
Digital Library Software Engineer
Harvard University Library Office for Information Systems
http://hul.harvard.edu/~gary/index.html


___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] Persistent identifiers, again

2008-04-07 Thread James Rutherford
On Mon, Apr 07, 2008 at 03:04:58PM +, Gary McGath wrote:
> Each time I blow away DSpace (as often happens during early stages of
> code development), I've noticed that the IDs start over from 1. It
> looks to me as if the HandleManager doesn't provide a strong guarantee
> of uniqueness; if DSpace is totally reinitialized, old Handles may be
> recycled. If this happened in a production environment, Handles for
> obsolete objects could point to the wrong object, rather than failing to
> resolve as they should.

This is because the local part of the Handle comes from a database
sequence.
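
To illustrate the point, here is a minimal sketch of a sequence-backed minter. It is not DSpace code: the class SequenceHandleSketch, its in-memory map, and the example prefix 123456789 are assumptions made purely for illustration. A counter stands in for the database sequence, and creating a new instance stands in for reinitializing the installation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a sequence-backed handle minter; not DSpace code.
class SequenceHandleSketch {
    private long next = 1;  // a freshly created database sequence starts at 1
    private final Map<String, String> handles = new HashMap<>();  // handle -> object name
    private final String prefix;

    SequenceHandleSketch(String prefix) {
        this.prefix = prefix;
    }

    // Mint a handle whose local part is simply the next sequence value.
    String mint(String objectName) {
        String handle = prefix + "/" + next++;
        handles.put(handle, objectName);
        return handle;
    }

    // Look up the object a handle currently points to (null if unknown).
    String resolve(String handle) {
        return handles.get(handle);
    }

    public static void main(String[] args) {
        SequenceHandleSketch before = new SequenceHandleSketch("123456789");
        String oldHandle = before.mint("old thesis");

        // "Blowing away" the installation: the sequence starts over.
        SequenceHandleSketch after = new SequenceHandleSketch("123456789");
        String newHandle = after.mint("unrelated report");

        System.out.println(oldHandle.equals(newHandle));  // the handle was recycled
        System.out.println(after.resolve(oldHandle));     // old handle, wrong object
    }
}
```

In this sketch both instances mint 123456789/1, so the old handle resolves to the unrelated object instead of failing to resolve.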

> This makes me think that for our environment, which stresses the
> unique in URNs, I need to add a stronger guarantee of uniqueness. It
> also seems like a minor bug in DSpace.

I think you are correct, yes.

> Have I missed anything? Does something come into play when a live handle
> server is used, which provides a stronger guarantee of uniqueness?

Not really, no. You're still responsible for knowing what points where.

> I'm aware, by the way, that DSpace 1.6 has totally new code for
> non-Handle persistent identifiers. Our schedule doesn't let us wait.
> Don't blame me, I'm only the programmer. :)

Understandable :) For the 1.6 code, Richard Jones and I spent quite
a long time looking at methods of generating locally unique names for
content that would be independent of the database sequences (motivated
by the DAO work, which could enable repositories to use something other
than an RDBMS if they so desire). This doesn't specifically address your
problem, though. From 1.6 forward, UUIDs will be used to identify
content, but the Handles (the default implementation of all the new
identifier code) will be generated in (approximately) the same way under
the hood. Check the 'handle' table in the DB and the 'createHandle'
method of HandleManager for more info.
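
As a rough illustration of a name that is independent of any database sequence, the sketch below uses the standard java.util.UUID class. It is not the DSpace 1.6 implementation; the class UuidNameSketch and the method mintLocalName are hypothetical names chosen for this example.

```java
import java.util.UUID;

// Hypothetical sketch; not the DSpace 1.6 identifier code.
class UuidNameSketch {
    // Mint a name from a random (version 4) UUID. No counter or database
    // state is involved, so reinitializing the store does not restart it.
    static String mintLocalName() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Two names minted independently; a collision is extremely unlikely.
        System.out.println(mintLocalName());
        System.out.println(mintLocalName());
    }
}
```

Because such a name carries no sequence state, a wipe and reinstall does not restart it, which is the property the sequence-derived local part of a Handle lacks.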

Having said that, I think the Handle should be written into the Item
metadata when it is minted, so there will be some record there (as
dc.identifier or dc.identifier.uri).
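
The idea of recording the Handle in the Item metadata at minting time could look roughly like the sketch below. The class ItemMetadataSketch and its methods are hypothetical and do not reproduce the DSpace Item API; storing the Handle as a URI under http://hdl.handle.net/ in dc.identifier.uri is likewise an assumption for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an item with Dublin Core metadata; not the DSpace Item class.
class ItemMetadataSketch {
    private final Map<String, List<String>> metadata = new LinkedHashMap<>();

    // Append a value to a metadata field, creating the field if needed.
    void addMetadata(String field, String value) {
        metadata.computeIfAbsent(field, k -> new ArrayList<>()).add(value);
    }

    // Return the values of a field (empty list if the field is absent).
    List<String> getMetadata(String field) {
        return metadata.getOrDefault(field, new ArrayList<>());
    }

    // Record a freshly minted handle in the item's own metadata, so a copy
    // of the identifier exists outside the handle table.
    static void recordHandle(ItemMetadataSketch item, String handle) {
        item.addMetadata("dc.identifier.uri", "http://hdl.handle.net/" + handle);
    }

    public static void main(String[] args) {
        ItemMetadataSketch item = new ItemMetadataSketch();
        recordHandle(item, "123456789/42");
        System.out.println(item.getMetadata("dc.identifier.uri"));
    }
}
```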

cheers,

Jim

-- 
James Rutherford  |  Hewlett-Packard Limited registered Office:
Research Engineer |  Cain Road,
HP Labs   |  Bracknell,
Bristol, UK   |  Berks
+44 117 312 7066  |  RG12 1HN.
[EMAIL PROTECTED]   |  Registered No: 690597 England

