Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-27 Thread James Rutherford
On Fri, Jul 27, 2007 at 12:20:55AM -0400, Ekaterina Pechekhonova wrote:
 I meant to say that building a flexible ExternalIdentifier service is a
 challenging task (as James himself has said), so I am concerned that at
 some point the configuration options will be dropped (because of lack of
 time, etc.) and the bitstream identifier assignment will be turned on and
 difficult to disable.
  
 This is less likely to happen if, instead of one ExternalIdentifier
 implementation, DSpace has a mechanism to plug in different
 implementations (with the default implementation replicating current
 functionality).

I have actually already done this. What I'm having trouble with is
working through the code and figuring out what to use now in the places
where DSpace used to use Handles (i.e. strings of various forms). It is
much more appropriate in most of these places to use internal
identifiers, and in the places where this information is exposed
externally (e.g. OAI-PMH), we need to figure out how best to fall back
if no external identifiers are available, and to make this mechanism
consistent throughout the application. I should emphasise that (in my
opinion) I've done the hard bit; now we need to decide what on earth to
do with it. I can't follow the model that is currently employed, both
because it is broken and because it no longer applies. I would like to
see some kind of identifier task force emerge post-1.5 to properly sort
this out before the 1.6 release, when I hope to include it. The
discussions on the mailing lists have been informative, but they don't
get code written ;)
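
One way to picture the fallback problem is a single helper that every externally facing module (OAI-PMH and the like) calls, so that the rule lives in one place. The sketch below is in Python for brevity rather than DSpace's Java, and every name in it (`RepoObject`, `public_identifier`, the `internal:` prefix) is hypothetical, not part of DSpace:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RepoObject:
    """Hypothetical stand-in for an archived object; not a DSpace class."""
    internal_id: int
    external_ids: List[str] = field(default_factory=list)


def public_identifier(obj: RepoObject, local_prefix: str = "internal:") -> str:
    """Return the identifier to expose outside the application.

    Prefer the first external identifier; if none has been assigned,
    fall back to a locally scoped form of the internal identifier.
    """
    if obj.external_ids:
        return obj.external_ids[0]
    return f"{local_prefix}{obj.internal_id}"
```

Whether a locally scoped internal identifier is an acceptable thing to expose at all is exactly the open question; the sketch only shows that the decision can be made once rather than per module.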

cheers,

Jim

-- 
James Rutherford  |  Hewlett-Packard Limited registered Office:
Research Engineer |  Cain Road,
HP Labs   |  Bracknell,
Bristol, UK   |  Berks
+44 117 312 7066  |  RG12 1HN.
[EMAIL PROTECTED]   |  Registered No: 690597 England

The contents of this message and any attachments to it are confidential and
may be legally privileged. If you have received this message in error, you
should delete it from your system immediately and advise the sender. To any
recipient of this message within HP, unless otherwise stated you should
consider this message and attachments as HP CONFIDENTIAL.

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now   http://get.splunk.com/
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-26 Thread Ekaterina Pechekhonova
Hello Mark,

I think I wasn't clear in my previous post, which happens to
non-native speakers, sorry about that.

I meant to say that building a flexible ExternalIdentifier service is a
challenging task (as James himself has said), so I am concerned that at
some point the configuration options will be dropped (because of lack of
time, etc.) and the bitstream identifier assignment will be turned on and
difficult to disable.
 
This is less likely to happen if, instead of one ExternalIdentifier
implementation, DSpace has a mechanism to plug in different
implementations (with the default implementation replicating current
functionality).

Actually, I am just re-iterating what Robert and Graham have already suggested.

Regards.

Kate

Ekaterina Pechekhonova
Digital Library Programmer/Analyst
New York University
Libraries
email: [EMAIL PROTECTED]
phone: 212-992-9993

- Original Message -
From: Mark Diggory [EMAIL PROTECTED]
Date: Wednesday, July 25, 2007 9:16 pm
Subject: Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
To: Ekaterina Pechekhonova [EMAIL PROTECTED]
Cc: DSpace Tech dspace-tech@lists.sourceforge.net, Robert Tansley [EMAIL PROTECTED]

 Hello Kate,
 
 On Jul 25, 2007, at 4:48 PM, Ekaterina Pechekhonova wrote:
 
 
  We need to reiterate and recognize that in the DSpace 2.0
  Architectural Model it was of strong interest that Metadata can be
  attached at any level of the Data Model (Community, Collection, Item,
  Manifestation and Content), and that, for all practical purposes,
  External Identifiers are just Metadata and as such should be
  assignable across the entire model either manually or dynamically.
 
  If I understand it right, nobody has a problem with "should be
  assignable", but some people (including myself) are concerned that it
  will become "should be assigned" at some point. I personally think
  that is more likely to happen if an attempt is made to stick to one
  implementation of the ExternalIdentifier Service instead of allowing
  multiple implementations.
 
 No, IMO, that's exactly the opposite of the direction that
 developers such as James are seeking. We're trying to undo the
 restrictions forced on the system through hardcoded logic and forced
 dependency on specific technologies (like Handle Services) in favor
 of shifting it into a configurable strategy that gives an IR manager
 the ability to set up the system the way they see fit. For DSpace 1.6
 there will continue to be a default configuration and implementation
 which is Handle-based, but the goal is to allow for other
 implementations in parallel with, or completely replacing, the default.
 
 -Mark
 
 ~
 Mark R. Diggory - DSpace Systems Manager
 MIT Libraries, Systems and Technology Services
 Massachusetts Institute of Technology
 Office: E25-131
 Phone: (617) 253-1096
 
 
 


Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-25 Thread Mark Diggory
Hello Kate,

On Jul 25, 2007, at 4:48 PM, Ekaterina Pechekhonova wrote:


 We need to reiterate and recognize that in the DSpace 2.0
 Architectural Model it was of strong interest that Metadata can be
 attached at any level of the Data Model (Community, Collection, Item,
 Manifestation and Content), and that, for all practical purposes,
 External Identifiers are just Metadata and as such should be
 assignable across the entire model either manually or dynamically.

 If I understand it right, nobody has a problem with "should be
 assignable", but some people (including myself) are concerned that it
 will become "should be assigned" at some point. I personally think
 that is more likely to happen if an attempt is made to stick to one
 implementation of the ExternalIdentifier Service instead of allowing
 multiple implementations.

No, IMO, that's exactly the opposite of the direction that
developers such as James are seeking. We're trying to undo the
restrictions forced on the system through hardcoded logic and forced
dependency on specific technologies (like Handle Services) in favor
of shifting it into a configurable strategy that gives an IR manager
the ability to set up the system the way they see fit. For DSpace 1.6
there will continue to be a default configuration and implementation
which is Handle-based, but the goal is to allow for other
implementations in parallel with, or completely replacing, the default.

-Mark

~
Mark R. Diggory - DSpace Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology
Office: E25-131
Phone: (617) 253-1096





Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-25 Thread Ekaterina Pechekhonova

 We need to reiterate and recognize that in the DSpace 2.0
 Architectural Model it was of strong interest that Metadata can be
 attached at any level of the Data Model (Community, Collection, Item,
 Manifestation and Content), and that, for all practical purposes,
 External Identifiers are just Metadata and as such should be
 assignable across the entire model either manually or dynamically.

If I understand it right, nobody has a problem with "should be assignable",
but some people (including myself) are concerned that it will become
"should be assigned" at some point. I personally think that is more likely
to happen if an attempt is made to stick to one implementation of the
ExternalIdentifier Service instead of allowing multiple implementations.


Regards.

Kate Pechekhonova
 
 ~
 Mark R. Diggory - DSpace Systems Manager
 MIT Libraries, Systems and Technology Services
 Massachusetts Institute of Technology
 Office: E25-131
 Phone: (617) 253-1096
 
 
 


Re: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?

2007-07-24 Thread James Rutherford
On Mon, Jul 23, 2007 at 09:48:39AM +1000, Gary Browne wrote:
 James, thanks for raising this issue and in particular getting people to
 put their money where their collective mouths are.

Believe me, if I could possibly avoid it, I would leave the issue well
alone ;)

I'm sure this will come as no surprise to anyone, but it seems like this
issue has highlighted some conflicts of opinion. As I see it there are
(broadly) two camps: those who believe that every meaningful tier in the
DSpace content hierarchy should get external identifiers, and those who
don't (or at least those who can't decide and so want it to be
configurable). From the responses on- and off-list, it seems there are
more people in the former camp (which is basically what I expected).
While this kind of debate could usually be resolved with a "make it
configurable" argument, I have a fairly major concern with this, which I
will try to outline briefly for the brave few who are still following
this thread.

Users (and administrators) crave consistency. If we make this assignment
configurable, there is no guarantee of consistency of application
between collections, or even in a single collection over extended
periods of time. The usual arguments about what we intend people to do
with the tools we provide versus what they actually do apply as ever.
This flexibility could leave repositories in a very messy state. It also
adds another degree of complexity to the new identifier system I'm
putting in place. The configurable parameters (if we are going to please
everyone) would be:

 * whether or not to assign external identifiers at all
 * which external identifier system to use by default
 * whether or not external identifiers are re-assignable
 * whether or not new versions of objects get new identifiers
 * which tiers in the content hierarchy get identifiers (if any)

I'm sure I've missed a few, but does that sound like something that is
reasonable to want / implement / support?
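
To make the size of that configuration surface concrete, the five options above could be gathered into one settings object. This is only an illustrative Python sketch (DSpace itself is Java, and none of these field names exist in DSpace). The defaults are assumptions, chosen so that the tiers mirror the current behaviour of Handles on communities, collections and Items but not on files:

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class IdentifierPolicy:
    """Hypothetical bundle of the five options listed above."""
    assign_external_ids: bool = True   # assign external identifiers at all?
    default_scheme: str = "handle"     # which identifier system by default
    reassignable: bool = False         # may an identifier be re-pointed?
    new_id_per_version: bool = True    # do new versions get new identifiers?
    tiers: FrozenSet[str] = frozenset({"community", "collection", "item"})

    def should_assign(self, tier: str) -> bool:
        """True if an object in the given tier should receive an external ID."""
        return self.assign_external_ids and tier in self.tiers
```

Even in this toy form, two repositories (or one repository before and after a settings change) can answer `should_assign("bitstream")` differently, which is the consistency concern raised above.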

cheers,

Jim

-- 
James Rutherford  |  Hewlett-Packard Limited registered Office:
Research Engineer |  Cain Road,
HP Labs   |  Bracknell,
Bristol, UK   |  Berks
+44 117 312 7066  |  RG12 1HN.
[EMAIL PROTECTED]   |  Registered No: 690597 England



Re: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?

2007-07-24 Thread Graham Triggs
On Tue, 2007-07-24 at 09:22 +0100, James Rutherford wrote:
 As I see it there are
 (broadly) two camps: those who believe that every meaningful tier in the
 DSpace content hierarchy should get external identifiers, and those who
 don't (or at least those who can't decide and so want it to be
 configurable).

Do you mind if I take my tent and pitch a little further down the road?
Partly because I've seen images of the swollen rivers over your way, but
mostly because I can decide and that's why I want it configurable ;)

 Users (and administrators) crave consistency. If we make this assignment
 configurable, there is no guarantee of consistency of application
 between collections, or even in a single collection over extended
 periods of time.

They crave accuracy as well, and consistency isn't the same thing ;)

 The configurable parameters (if we are going to please
 everyone) would be:
 
  * whether or not to assign external identifiers at all
  * which external identifier system to use by default
  * whether or not external identifiers are re-assignable
  * whether or not new versions of objects get new identifiers
  * which tiers in the content hierarchy get identifiers (if any)
 
 I'm sure I've missed a few, but does that sound like something that is
 reasonable to want / implement / support?

Providing the options that modify the use of an identifier system apply
on a per-system basis, that sounds like a reasonable list of what should
be possible.

But, I think we are getting a little tied up around the idea that it may
only be a single implementation that has all these possibilities
available as configuration options - and that need not be the case at
all.

i.e. a pluggable 'ExternalIdentifierManager', which supports managing a
single identifier system (configured by default to be Handles), that
out of the box replicates the existing behaviour (EIDs for Items, not
bitstreams) and can easily be configured to also assign EIDs to
bitstreams. Beyond that, more advanced cases can be handled not by
adding more and more configuration options, but by switching out the
implementation.
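
A rough sketch of that shape, in Python for brevity (DSpace is Java). The `ExternalIdentifierManager` name comes from the paragraph above; the method names, the `hdl:` form with the placeholder prefix `123456789`, and the registry lookup are assumptions made purely for illustration:

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Type


class ExternalIdentifierManager(ABC):
    """Pluggable policy: decides whether an object gets an EID, and which."""

    @abstractmethod
    def assign(self, object_type: str, internal_id: int) -> Optional[str]:
        """Return a new external identifier, or None if none is assigned."""


class DefaultHandleManager(ExternalIdentifierManager):
    """Replicates the existing behaviour: EIDs for Items, not bitstreams."""

    def __init__(self, prefix: str = "123456789", bitstreams: bool = False):
        self.prefix = prefix
        self.types = {"item"} | ({"bitstream"} if bitstreams else set())
        self.counter = 0

    def assign(self, object_type: str, internal_id: int) -> Optional[str]:
        if object_type not in self.types:
            return None
        self.counter += 1
        return f"hdl:{self.prefix}/{self.counter}"


# Advanced cases are handled by registering another implementation here,
# not by adding more options to DefaultHandleManager.
IMPLEMENTATIONS: Dict[str, Type[ExternalIdentifierManager]] = {
    "default": DefaultHandleManager,
}


def load_manager(name: str, **options) -> ExternalIdentifierManager:
    """Instantiate the implementation selected in configuration."""
    return IMPLEMENTATIONS[name](**options)
```

With this layout, `load_manager("default")` reproduces the out-of-the-box behaviour and `load_manager("default", bitstreams=True)` is the one easy switch; anything more exotic is a new class in the registry.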

G
This email has been scanned by Postini.
For more information please visit http://www.postini.com




Re: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?

2007-07-24 Thread Robert Tansley
On 24/07/07, Graham Triggs [EMAIL PROTECTED] wrote:

 But, I think we are getting a little tied up around the idea that it may
 only be a single implementation that has all these possibilities
 available as configuration options - and that need not be the case at
 all.

+1 - I doubt we could come up with one uber-flexible implementation
with an exhaustive set of options that takes care of the lot.  It's
probably better to provide an interface which lets people drop in
different implementations that implement any complex policy they
choose.

It then becomes a little more manageable for curators, as the majority
can just select from a menu and say 'we use the XXX implementation'
instead of having to specify which of the smörgåsbord of configuration
options they've tweaked and how.

Rob



Re: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?

2007-07-24 Thread Robert Tansley
On 24/07/07, James Rutherford [EMAIL PROTECTED] wrote:
 On Tue, Jul 24, 2007 at 10:12:52AM -0400, Robert Tansley wrote:
  But, I think we are getting a little tied up around the idea that it may
  only be a single implementation that has all these possibilities
  available as configuration options - and that need not be the case at
  all.
 
  +1 - I doubt we could come up with one uber-flexible implementation
  with an exhaustive set of options that takes care of the lot.  It's
  probably better to provide an interface which lets people drop in
  different implementations that implement any complex policy they
  choose.

 OK, this is starting to sound like a more reasonable attack. I'm still
 not 100% sure I understand how this approach will help, but I could just
 be misunderstanding what you're getting at. In my opinion, the class
 that understands (eg) Handles shouldn't be in charge of deciding where
 it attaches itself to objects. Or do you mean implementations of the
 ExternalIdentifierManager?

I've been round this question myself a number of times!

The problem is I'm not sure how easy it is to disentangle the logic of
assigning IDs (what gets an ID of what form) and minting IDs.  If the
'Handle' piece of the implementation could be as simple as a 'mint'
method (i.e. if all identifiers were context-free) it would be easy to
abstract out.  However, IDs may depend both on the object type and
related objects -- e.g. bitstream IDs may include the item ID as a
path component, or the version number etc -- and the ID scheme itself.
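
The difference between the two kinds of minting can be sketched as follows (Python for brevity; the function names and identifier formats are hypothetical, not taken from any ID scheme's specification). A context-free minter needs nothing but a counter, whereas a contextual one has to be handed the parent item's identifier and possibly a version number, which is what drags assignment logic into the minting code:

```python
import itertools
from typing import Callable, Optional


def make_context_free_minter(prefix: str) -> Callable[[], str]:
    """Return a mint() that needs no knowledge of the object being named."""
    counter = itertools.count(1)
    return lambda: f"{prefix}/{next(counter)}"


def mint_contextual(item_id: str, bitstream_seq: int,
                    version: Optional[int] = None) -> str:
    """Build a bitstream ID with the item ID as a path component."""
    ident = f"{item_id}/{bitstream_seq}"
    if version is not None:
        ident += f";v{version}"  # illustrative version suffix only
    return ident
```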

So maybe the 'assignment' interface is as simple as an Event
listener/consumer that decides, based on the event and examination of
the object, whether an identifier should be assigned and what it should
be.  Then it just needs to register somewhere else (maybe even just
adding it to object metadata) that the new ID is assigned to that
object, and the rest of the DSpace system is happy.
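
Sketched in Python for brevity, such a consumer might look like the following. The event shape, the `identifier` metadata key and the `urn:uuid:` choice are all assumptions for illustration; this is not the DSpace event API:

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class RepoObject:
    """Hypothetical object with a type and a free-form metadata map."""
    object_type: str
    metadata: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Event:
    """Hypothetical event: what happened, and to which object."""
    kind: str  # e.g. "create", "modify"
    subject: RepoObject


class IdentifierConsumer:
    """Decides from the event and the object whether to assign an ID."""

    def __init__(self, types: Set[str]):
        self.types = types

    def consume(self, event: Event) -> None:
        obj = event.subject
        if event.kind != "create" or obj.object_type not in self.types:
            return
        if obj.metadata.get("identifier"):
            return  # already identified; never assign twice
        # Register the new ID simply by adding it to the object's metadata.
        new_id = f"urn:uuid:{uuid.uuid4()}"
        obj.metadata.setdefault("identifier", []).append(new_id)
```

A contextual variant would differ only in how `new_id` is built, for example by examining the parent item of a bitstream before minting.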

You could create a flexible Handle system class that could sit behind
a couple of different implementations of the above (e.g. a
context-free one, and one that assigns contextual bitstream IDs, for
example) but I find it hard to believe that a single interface would
be sufficient for all different ID schemes (Handle, info:, PURL, UUID,
...)

Rob



Re: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?

2007-07-24 Thread Ekaterina Pechekhonova
 Hi Rob,

 However, IDs may depend both on the object type and
 related objects -- e.g. bitstream IDs may include the item ID as a
 path component, or the version number etc -- and the ID scheme itself.
 
Do you think it's appropriate for external IDs to support internal system
hierarchy and version numbers? (I am not arguing, just wondering.)

Kate

 


Re: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?

2007-07-22 Thread Gary Browne
James, thanks for raising this issue and in particular getting people to
put their money where their collective mouths are.

+1 from me.

BTW, the discussions about admin seem to focus on us, with little
discussion about what the end user would like. My end users, for their
part, want to see a persistent identifier for their bitstream.

Cheers
Gary


Gary Browne
Development Programmer
Library IT Services
University of Sydney
ph: 9351-5946

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of James Rutherford
Sent: Thursday, 19 July 2007 7:51 PM
To: DSpace Tech
Subject: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?

Hi all,

As part of my work on the identification mechanisms employed by DSpace
(both internal and externally managed identifiers) I've decided that
it's better to assign external identifiers to files than to not (which
is the current policy). Whether or not external identifiers are used
(and if so, which ones) will be configurable, but what I would like to
know is whether anyone has a problem with assigning external identifiers
to files at all.

If possible, could we keep this to a +/- 1 system (+1 to assign them to
files, -1 to not). I definitely don't want to turn this into a
discussion about identification systems more generally otherwise we'll
be here until next year. Also, if you vote -1 a short explanation about
why would be useful.

cheers,

Jim

-- 
James Rutherford  |  Hewlett-Packard Limited registered Office:
Research Engineer |  Cain Road,
HP Labs   |  Bracknell,
Bristol, UK   |  Berks
+44 117 312 7066  |  RG12 1HN.
[EMAIL PROTECTED]   |  Registered No: 690597 England



Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-20 Thread James Rutherford
On Thu, Jul 19, 2007 at 06:46:20PM +0100, Graham Triggs wrote:
 Possibilities that should be supportable:
 * reassigning the existing id (handle) to a new file

This is, in my humble opinion, pure evil. How can you consider something
to be an *identifier* if you can't actually guarantee that it identifies
something? What if an object is replicated between repositories and the
file is only replaced in one? The user would be seeing two objects that
claim to be the same thing (given that they have the same identifier)
but would actually be confronted with different content! We can't ever
dilute the notion of what it means to be an identifier just to make our
lives easier or to reduce the cost of assigning new identifiers or
managing relationships between them.

 * provide a fallback mapping - for example, to the item
 * reporting that the id is invalid

These are more reasonable, but see my previous email on the subject.

cheers,

Jim

-- 
James Rutherford  |  Hewlett-Packard Limited registered Office:
Research Engineer |  Cain Road,
HP Labs   |  Bracknell,
Bristol, UK   |  Berks
+44 117 312 7066  |  RG12 1HN.
[EMAIL PROTECTED]   |  Registered No: 690597 England



Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-20 Thread James Rutherford
On Thu, Jul 19, 2007 at 01:47:19PM -0400, Ekaterina Pechekhonova wrote:
 +1, as it is usually better to have more options than fewer.
 However, in our repository we keep identifiers at the item level, as we
 see the item as the repository's atomic structure, and we think it's
 pretty important that file identifiers are an option, not a default.

Do you actually oppose assigning them, or just making them visible to
the user? Naturally the citation link would be exactly as it is now
(i.e. pointing to an Item). Personally, I can't see any reason why having
this information available would be harmful, so I'm interested to
understand why you might think it was a bad idea (I'm very open to input
here).

cheers,

Jim

-- 
James Rutherford  |  Hewlett-Packard Limited registered Office:
Research Engineer |  Cain Road,
HP Labs   |  Bracknell,
Bristol, UK   |  Berks
+44 117 312 7066  |  RG12 1HN.
[EMAIL PROTECTED]   |  Registered No: 690597 England



Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-20 Thread Graham Triggs
From: James Rutherford [EMAIL PROTECTED]
 This is, in my humble opinion, pure evil. How can you consider something
 to be an *identifier* if you can't actually guarantee that it identifies
 something?

I absolutely agree. But how can you guarantee that it resolves to what it is 
meant to be identifying if you completely disallow the possibility to 
reassign it?

I was tempted to say that you shouldn't be allowed to delete a file that has 
an external identifier (or at least that the default implementation 
shouldn't). As soon as I realised that wouldn't be possible, you have to 
consider the possibility of reassigning the handle.

Remember, that such a reassignment is (or rather should only be used for) 
altering the resolution of the identifier - which doesn't automatically mean 
that you are conceptually changing what it identifies.

G 



Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-20 Thread Robert Tansley
See the arch review notes (I think
http://wiki.dspace.org/index.php/ArchReviewNotesThur was the day we
really hammered on it) for decisions made about identifiers and
versions.  We spent a lot of time on that!  (Basically there's an
identifier for the 'latest version' of each item and sub-part, and
each version also gets a separate ID).
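
A minimal sketch of that scheme, in Python for brevity; the class and the `/vN` identifier form are invented here for illustration and are not what the architecture review specified:

```python
from typing import Dict, List


class VersionedIdentifiers:
    """One 'latest version' ID per object, plus a separate ID per version."""

    def __init__(self) -> None:
        # latest_id -> stored targets, oldest first
        self._versions: Dict[str, List[str]] = {}

    def add_version(self, latest_id: str, target: str) -> str:
        """Record a new version and return its own, version-specific ID."""
        versions = self._versions.setdefault(latest_id, [])
        versions.append(target)
        return f"{latest_id}/v{len(versions)}"

    def resolve(self, identifier: str) -> str:
        """Resolve either kind of ID to a stored target."""
        if identifier in self._versions:
            # The 'latest' ID moves as versions are added.
            return self._versions[identifier][-1]
        # A version-specific ID always resolves to the same target.
        latest_id, _, number = identifier.rpartition("/v")
        return self._versions[latest_id][int(number) - 1]
```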

One reason you might not want to assign context-free external
identifiers like Handles to bitstreams is that Handles are not free,
they require maintenance over time.  (Contextual or 'hierarchical'
identifiers like info:item_handle/bitstream_id are 'cheaper' in this
regard -- you don't have to maintain the context and relationships ad
infinitum as they're implicit in the identifier.)
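
For instance, with a contextual identifier the parent item can be recovered by parsing alone, with no lookup table to maintain. A small Python sketch, assuming an identifier of the illustrative `info:item_handle/bitstream_id` form mentioned above, where the last path segment is the bitstream ID (the exact syntax is an assumption, not a registered `info:` namespace):

```python
from typing import Tuple


def split_contextual_id(identifier: str) -> Tuple[str, str]:
    """Split 'info:<item_handle>/<bitstream_id>' into its two parts.

    The item handle itself contains a slash (prefix/suffix), so only the
    final segment is treated as the bitstream ID.
    """
    scheme, _, rest = identifier.partition(":")
    if scheme != "info" or "/" not in rest:
        raise ValueError(f"not a contextual identifier: {identifier!r}")
    item_handle, _, bitstream_id = rest.rpartition("/")
    return item_handle, bitstream_id
```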

Another point is that in assigning an identifier, you need to be very
clear on exactly what you're identifying.  It sounds like there's an
assumption here that (say) a Handle assigned to a bitstream is
identifying that exact sequence of 1s and 0s, rather than 'chapter 2
of the book'.  Giving symmetrical identifiers to both logical
constructs ('item', FRBR work etc) and 'physical' constructs (an exact
sequence of 1s and 0s) could get confusing -- what is identified by
one may change over time to reflect changes in technology etc. but the
other is set in stone.

This was the thinking behind the original decision to have Handles
assigned to Items but not Bitstreams:  The Item is what is
'persistent', the particular (potentially numerous) files within may
change.  So giving out apparently identical identifiers that actually
have different levels of persistence guarantee (or, maintaining all
those file identifiers and their relationships for items forever) was
something we wanted to avoid.  We regarded access to bitstreams as a
'service' on the item.

So in short flexibility is the key!  Definitely good to be able to
assign external IDs to files, but people need to be clear on what
they're identifying, and free to give out contextual identifiers or
differently-schemed identifiers to other elements like bitstreams or
metadata records.

On 20/07/07, James Rutherford [EMAIL PROTECTED] wrote:
 I definitely don't want to turn this into a
 discussion about identification systems more generally otherwise we'll
 be here until next year

See you in 2008...



Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-20 Thread Graham Triggs
From: James Rutherford [EMAIL PROTECTED]
 Assigning or displaying? I find it hard to believe that you actually
 have a problem with giving identifiers to files, but I can understand
 why you might not want your users to know about them.

Because I don't believe that an identifier should be assigned to something
unless you are accepting the possibility for it to be used - there are too
many implications to its assignment.

G 

This email has been scanned by Postini.
For more information please visit http://www.postini.com




Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-20 Thread Graham Triggs
From: James Rutherford [EMAIL PROTECTED]
 I absolutely agree. But how can you guarantee that it resolves to what it
 is meant to be identifying if you completely disallow the possibility to
 reassign it?

 I'd flip this around and say how can you guarantee that it resolves to
 what it is meant to be identifying if you *do* allow the possibility to
 reassign it. Oh, what a can of worms!

You can't. But that isn't the issue. If you are going to have the situation 
where an id may not resolve correctly, then you have to have the tools to be 
able to correct - even if that can create problems through misuse.

 I was tempted to say that you shouldn't be allowed to delete a file that
 has an external identifier (or at least that the default implementation
 shouldn't). As soon as I realised that wouldn't be possible, you have to
 consider the possibility of reassigning the handle.

 This isn't actually strictly true. Once we have versioning, it could
 well be impossible (presumably at the discretion of the repository
 curator) to delete *anything*, only to be able to create a new head
 version of the container that doesn't hold any reference to the file you
 wanted to delete.

As nice as that would be in theory, and even if it is the likely 'normal
operation', you will always have to cater for being able to completely erase
a file or item (e.g. for legal reasons).

 Remember that in systems with versioning, deletion is
 a very different concept to systems where versioning isn't supported.
 The points I have made so far assume we are working with a system that
 supports versioning.

Yes, the external id could refer - and continue to refer - to a 'deleted' 
but still accessible file. But bear in mind that we should make no 
assumption of what external id system(s) are used for the assignment, and 
that system may not be providing a persistent identifier. So we can't assume 
what the appropriate behaviour of handling that id is on file / item 
deletion.

 Remember, that such a reassignment is (or rather should only be used for)
 altering the resolution of the identifier - which doesn't automatically
 mean that you are conceptually changing what it identifies.

 Danger danger! Surely we would just be giving our adopters enough rope
 with which to hang themselves by doing this. It is pretty obvious that
 people will never use things the way we've decided that they should, no
 matter how much we jump up and down and tell them that it's the wrong
 thing to do.

True - but I could argue that by even having the ability to assign external 
/ persistent identifiers to anything you are giving adopters enough rope to 
hang themselves.  But having them is also a fundamental part of 
preservation, and they are likely hanging themselves if they don't use them 
(appropriately).

There are so many issues that I don't think it's possible to ever write a
system where it would be impossible for adopters to hang themselves
(with enough functionality to sustain a diverse community). The best we can
do is minimize the potential for these problems in 'normal' operation, and 
provide extra (separate) functionality that can try to correct problems that 
do arise.

G 



Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-20 Thread Zhiwu Xie
I tend to agree with Graham:

1. That can of worms may already have been opened as soon as people
approached the persistent identifier issue. Items are relatively stable,
so we don't have much of a problem assigning handles to them, but
the same can't be said of files. At least for now the admin is
free to delete/add files at will. Adding versioning helps but may
not completely eliminate the problem.

2. It would be nice to have a sensible tool/solution accompanying
the assignment of handles to files, at least to help with most of the
admin use cases where an identifier needs to be changed/re-assigned. I
remember someone has written a note for W3C arguing that the
persistent identifier issue is purely an administrative problem. I
think he's right to some extent. To administer well, the admin needs
tools. With the tools in hand, it's perhaps up to the admin to use
them wisely. BTW I think the admin can always mess up if s/he wants to.

Zhiwu







Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-19 Thread Dorothea Salo
+1 for file identifiers; we do it already.

Use case (one of many): Professor is writing a print textbook, is
putting sample code in repository, wants permanent URLs for the
component files that can be printed in the book.

Dorothea



Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-19 Thread Mark Diggory
Jim,

Just for clarification to the group... It's unclear what you're
requesting a vote on: Anyone can assign an external identifier to a
Bitstream in an external identification system. Can you answer the  
two following questions for the group?

1.) Are you suggesting that we will support an infrastructure in  
DSpace to assign those via an external identifier service (currently  
the Handle System in 1.5, ExternalIdentifierManager in 1.6)?

2.) Are you suggesting that Internal Identifiers (currently sequence  
id's) will be structured differently in 1.6 such that one can assign  
one to a bitstream independent of the parent items identifier?

Feeding the fire,
Mark



On Jul 19, 2007, at 5:51 AM, James Rutherford wrote:

 Hi all,

 As part of my work on the identification mechanisms employed by DSpace
 (both internal and externally managed identifiers) I've decided that
 it's better to assign external identifiers to files than to not (which
 is the current policy). Whether or not external identifiers are used
 (and if so, which ones) will be configurable, but what I would like to
 know is whether anyone has a problem with assigning external  
 identifiers
 to files at all.

 If possible, could we keep this to a +/- 1 system (+1 to assign  
 them to
 files, -1 to not). I definitely don't want to turn this into a
 discussion about identification systems more generally otherwise we'll
 be here until next year. Also, if you vote -1 a short explanation  
 about
 why would be useful.

 cheers,

 Jim


~
Mark R. Diggory - DSpace Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology
Office: E25-131
Phone: (617) 253-1096





Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-19 Thread James Rutherford
On Thu, Jul 19, 2007 at 11:14:15AM -0400, Mark Diggory wrote:
 Just for clarification to the group... It's unclear what you're
 requesting a vote on: Anyone can assign an external identifier to a
 Bitstream in an external identification system. Can you answer the  
 two following questions for the group?

Only if I can answer them in the wrong order :)

 2.) Are you suggesting that Internal Identifiers (currently sequence  
 id's) will be structured differently in 1.6 such that one can assign  
 one to a bitstream independent of the parent items identifier?

Yes, definitely. At the moment, files get UUIDs, which are used in the
URL space instead of sequence ids etc. For example, the following URL
points to a PDF in my testing repository:

[my_base_url]/dspace/resource/uuid:d0875067-6853-4c54-9d72-ff7888e43c42
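
As a rough illustration of that URL shape (not the actual DSpace
implementation), the sketch below builds and parses such a resource
URL. The "/dspace/resource/uuid:" path is copied from the example
above; the function names and the example.org base URL are
hypothetical.

```python
import uuid

# Path shape taken from the example URL above; treating it as a fixed
# prefix is an assumption of this sketch.
RESOURCE_PREFIX = "/dspace/resource/uuid:"


def resource_url(base_url: str, bitstream_uuid: uuid.UUID) -> str:
    """Build a resource URL for an object identified by a UUID."""
    return f"{base_url.rstrip('/')}{RESOURCE_PREFIX}{bitstream_uuid}"


def parse_resource_path(path: str) -> uuid.UUID:
    """Extract the UUID from a resource path, rejecting anything else."""
    if not path.startswith(RESOURCE_PREFIX):
        raise ValueError(f"not a UUID resource path: {path!r}")
    # uuid.UUID raises ValueError if the remainder is not a valid UUID.
    return uuid.UUID(path[len(RESOURCE_PREFIX):])


pdf_id = uuid.UUID("d0875067-6853-4c54-9d72-ff7888e43c42")
url = resource_url("http://example.org", pdf_id)
assert parse_resource_path(url[len("http://example.org"):]) == pdf_id
```

Because the UUID belongs to the file itself, nothing in the URL
depends on the parent item's identifier.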

 1.) Are you suggesting that we will support an infrastructure in  
 DSpace to assign those via an external identifier service (currently  
 the Handle System in 1.5, ExternalIdentifierManager in 1.6)?

It would work exactly the same for Bitstreams as it currently does for
Items. This is purely about enlarging the scope (and expanding the
flexibility) of what we're currently doing, not necessarily about
changing the way we do it (at least not yet).

cheers,

Jim

-- 
James Rutherford  |  Hewlett-Packard Limited registered Office:
Research Engineer |  Cain Road,
HP Labs   |  Bracknell,
Bristol, UK   |  Berks
+44 117 312 7066  |  RG12 1HN.
[EMAIL PROTECTED]   |  Registered No: 690597 England




Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-19 Thread Zhiwu Xie
Then what will happen if I remove one file from an item and add
another file? Will the new file get the old handle or a new one, or
can I choose?

Thanks,

Zhiwu







Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-19 Thread Graham Triggs
On Thu, 2007-07-19 at 12:03 -0400, Robert Tansley wrote:
 +1 on ability to assign arbitrary external IDs to bitstreams.
 
 +1 on ability to assign 'hierarchical' external IDs to bitstreams (for
 graceful fallback if files are deleted etc).

Minor point, but you don't actually need a hierarchical external ID to
do this. The external id is associated with the file, not part of it,
and so deleting the file doesn't mean that the external id is or has to
be deleted. So, entirely possible for the external id to be retained by
the system, but reassigned to a 'graceful fallback' state.

 -1 on assigning Handles to bitstreams in 'out of the box' config.

Essentially, agree with the voting.

G


Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-19 Thread Graham Triggs
On Thu, 2007-07-19 at 10:06 -0600, Zhiwu Xie wrote:
 Then what will happen if I remove one file from an item and add
 another file? Will the new file get the old handle or a new one or I
 can choose?
 

How the deletion of objects that have been assigned external ids is
handled is not something that DSpace should force on to a repository,
although it would need to have some kind of sensible default.

Although it isn't even the case that a repository can or should have a
single way of dealing with removal of an externally identified file -
the appropriate course of action would be influenced by the factors that
led to the removal / replacement.

Possibilities that should be supportable:

* reassigning the existing id (handle) to a new file
* provide a fallback mapping - for example, to the item
* reporting that the id is invalid
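
The three possibilities above could be sketched roughly as follows.
This is only an illustration under stated assumptions: the
IdentifierRegistry class, the OnDelete policy names and the string
targets are all hypothetical and do not correspond to any existing
DSpace API. The one idea taken from the discussion is that the external
id is held apart from the file, so deleting the file does not have to
delete the id.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class OnDelete(Enum):
    REASSIGN = "reassign"          # point the id at a replacement file
    FALLBACK_TO_ITEM = "fallback"  # point the id at the parent item
    INVALIDATE = "invalidate"      # keep the id, report it as invalid


@dataclass
class Resolution:
    target: Optional[str]
    valid: bool


class IdentifierRegistry:
    """Hypothetical store mapping external ids to what they resolve to."""

    def __init__(self) -> None:
        self._targets: Dict[str, Optional[str]] = {}

    def assign(self, external_id: str, target: str) -> None:
        self._targets[external_id] = target

    def resolve(self, external_id: str) -> Resolution:
        target = self._targets[external_id]  # KeyError: id never assigned
        return Resolution(target, target is not None)

    def file_deleted(self, external_id: str, policy: OnDelete,
                     item: Optional[str] = None,
                     replacement: Optional[str] = None) -> None:
        # The id itself is retained; only its resolution changes.
        if policy is OnDelete.REASSIGN:
            if replacement is None:
                raise ValueError("REASSIGN needs a replacement target")
            self._targets[external_id] = replacement
        elif policy is OnDelete.FALLBACK_TO_ITEM:
            if item is None:
                raise ValueError("FALLBACK_TO_ITEM needs the parent item")
            self._targets[external_id] = item
        else:
            self._targets[external_id] = None
```

Passing the policy per call rather than fixing it per repository
reflects the point that the appropriate action depends on why the file
was removed or replaced; a configured default could still be layered
on top.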



G


Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?

2007-07-19 Thread Larry Stone
+1   assign external identifiers to files
