Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
On Fri, Jul 27, 2007 at 12:20:55AM -0400, Ekaterina Pechekhonova wrote:
> I meant to say that building a flexible ExternalIdentifier service is a challenging task (as James himself has said), so I am concerned that at some point the configuration options will be dropped (because of lack of time, etc.) and the bitstream identifier assignment will be turned on and difficult to disable. This is less likely to happen if, instead of one ExternalIdentifier implementation, DSpace has a mechanism to plug in different implementations (and the default implementation replicates current functionality).

I have actually already done this. What I'm having trouble with is working through the code and figuring out what to use now where DSpace used to use Handles (ie: strings of various forms). It is much more appropriate in most of these places to use internal identifiers, and in the places where this information is exposed externally (eg OAI-PMH), we need to figure out how best to fall back if no external identifiers are available, and to make this mechanism consistent throughout the application.

I should emphasise that (in my opinion) I've done the hard bit; now we need to decide what on earth to do with it. I can't follow the model that is currently employed both because it is broken, and because it no longer applies. I would like to see some kind of identifier task force emerge post 1.5 to properly sort this out before the 1.6 release when I hope to include it. The discussions on the mailing lists have been informative, but they don't get code written ;)

cheers,
Jim

--
James Rutherford  | Hewlett-Packard Limited registered Office:
Research Engineer | Cain Road,
HP Labs           | Bracknell,
Bristol, UK       | Berks
+44 117 312 7066  | RG12 1HN.
[EMAIL PROTECTED] | Registered No: 690597 England

The contents of this message and any attachments to it are confidential and may be legally privileged. If you have received this message in error, you should delete it from your system immediately and advise the sender. To any recipient of this message within HP, unless otherwise stated you should consider this message and attachments as HP CONFIDENTIAL.

-
This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now http://get.splunk.com/
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
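A minimal sketch of the fallback idea James describes: expose an external identifier where one exists, and otherwise derive something from the internal identifier. Everything here is hypothetical and for illustration only; `RepositoryObject`, `public_identifier` and the URL layout are assumptions, not DSpace classes or conventions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RepositoryObject:
    """Hypothetical stand-in for a repository object (item, bitstream, ...)."""
    internal_id: int
    object_type: str
    external_ids: List[str] = field(default_factory=list)


def public_identifier(obj: RepositoryObject, base_url: str) -> str:
    """Return the identifier to expose externally (e.g. in an OAI-PMH record).

    Prefer the first external identifier if one is assigned; otherwise fall
    back to a URL built from the internal identifier. The URL layout is an
    assumption made for this sketch.
    """
    if obj.external_ids:
        return obj.external_ids[0]
    return f"{base_url.rstrip('/')}/{obj.object_type}/{obj.internal_id}"
```

Routing every externally visible identifier through one function like this is one way to get the consistency across the application that the message asks for.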
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
Hello Mark,

I think I wasn't clear in my previous post, which happens to non-native speakers; sorry about that.

I meant to say that building a flexible ExternalIdentifier service is a challenging task (as James himself has said), so I am concerned that at some point the configuration options will be dropped (because of lack of time, etc.) and the bitstream identifier assignment will be turned on and difficult to disable. This is less likely to happen if, instead of one ExternalIdentifier implementation, DSpace has a mechanism to plug in different implementations (and the default implementation replicates current functionality).

Actually, I am just reiterating what Robert and Graham have already suggested.

Regards,
Kate

Ekaterina Pechekhonova
Digital Library Programmer/Analyst
New York University Libraries
email: [EMAIL PROTECTED]
phone: 212-992-9993

- Original Message -
From: Mark Diggory [EMAIL PROTECTED]
Date: Wednesday, July 25, 2007 9:16 pm
Subject: Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
To: Ekaterina Pechekhonova [EMAIL PROTECTED]
Cc: DSpace Tech dspace-tech@lists.sourceforge.net, Robert Tansley [EMAIL PROTECTED]

Hello Kate,

On Jul 25, 2007, at 4:48 PM, Ekaterina Pechekhonova wrote:
>> We need to reiterate and recognize that in the DSpace 2.0 Architectural Model it was of strong interest that Metadata can be attached at any level of the Data Model (Community, Collection, Item, Manifestation and Content). For all practical purposes, External Identifiers are just Metadata and as such should be assignable across the entire model either manually or dynamically.
>
> If I understand it right, nobody has problems with "should be assignable", but some people (including myself) are concerned that it will become "should be assigned" at some point. I personally think that it's more likely to happen if an attempt is made to stick to one implementation of ExternalIdentifier Service instead of allowing multiple implementations.

No, IMO, that's exactly the opposite of the direction that the developers such as James are seeking. We're trying to undo the restrictions forced on the system through hardcoded logic and forced dependency on specific technologies (like Handle Services) in favor of shifting it into a configurable strategy that gives an IR manager the ability to set up the system the way they see fit. For DSpace 1.6 there will continue to be a default configuration and implementation which is handle based, but the goal is to allow for other implementations in parallel or completely replacing the default.

-Mark

~
Mark R. Diggory - DSpace Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology
Office: E25-131
Phone: (617) 253-1096
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
Hello Kate,

On Jul 25, 2007, at 4:48 PM, Ekaterina Pechekhonova wrote:
>> We need to reiterate and recognize that in the DSpace 2.0 Architectural Model it was of strong interest that Metadata can be attached at any level of the Data Model (Community, Collection, Item, Manifestation and Content). For all practical purposes, External Identifiers are just Metadata and as such should be assignable across the entire model either manually or dynamically.
>
> If I understand it right, nobody has problems with "should be assignable", but some people (including myself) are concerned that it will become "should be assigned" at some point. I personally think that it's more likely to happen if an attempt is made to stick to one implementation of ExternalIdentifier Service instead of allowing multiple implementations.

No, IMO, that's exactly the opposite of the direction that the developers such as James are seeking. We're trying to undo the restrictions forced on the system through hardcoded logic and forced dependency on specific technologies (like Handle Services) in favor of shifting it into a configurable strategy that gives an IR manager the ability to set up the system the way they see fit. For DSpace 1.6 there will continue to be a default configuration and implementation which is handle based, but the goal is to allow for other implementations in parallel or completely replacing the default.

-Mark

~
Mark R. Diggory - DSpace Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology
Office: E25-131
Phone: (617) 253-1096
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
> We need to reiterate and recognize that in the DSpace 2.0 Architectural Model it was of strong interest that Metadata can be attached at any level of the Data Model (Community, Collection, Item, Manifestation and Content). For all practical purposes, External Identifiers are just Metadata and as such should be assignable across the entire model either manually or dynamically.

If I understand it right, nobody has problems with "should be assignable", but some people (including myself) are concerned that it will become "should be assigned" at some point. I personally think that it's more likely to happen if an attempt is made to stick to one implementation of ExternalIdentifier Service instead of allowing multiple implementations.

Regards,
Kate Pechekhonova
Re: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?
On Mon, Jul 23, 2007 at 09:48:39AM +1000, Gary Browne wrote:
> James, thanks for raising this issue and in particular getting people to put their money where their collective mouths are.

Believe me, if I could possibly avoid it, I would leave the issue well alone ;)

I'm sure this will come as no surprise to anyone, but it seems like this issue has highlighted some conflicts of opinion. As I see it there are (broadly) two camps: those who believe that every meaningful tier in the DSpace content hierarchy should get external identifiers, and those who don't (or at least those who can't decide and so want it to be configurable). From the responses on- and off-list, it seems there are more people in the former camp (which is basically what I expected).

While this kind of debate could usually be resolved with a "make it configurable" argument, I have a fairly major concern with this, which I will try to outline briefly for the brave few who are still following this thread.

Users (and administrators) crave consistency. If we make this assignment configurable, there is no guarantee of consistency of application between collections, or even in a single collection over extended periods of time. The usual arguments about what we intend people to do with the tools we provide versus what they actually do apply as ever. This flexibility could leave repositories in a very messy state. It also adds another degree of complexity to the new identifier system I'm putting in place. The configurable parameters (if we are going to please everyone) would be:

* whether or not to assign external identifiers at all
* which external identifier system to use by default
* whether or not external identifiers are re-assignable
* whether or not new versions of objects get new identifiers
* which tiers in the content hierarchy get identifiers (if any)

I'm sure I've missed a few, but does that sound like something that is reasonable to want / implement / support?

cheers,
Jim
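To make the size of that configuration surface concrete, the five parameters listed in the message above can be written down as a single policy object. This is only a sketch: the class name, field names and default values are assumptions chosen for illustration and do not correspond to any actual DSpace configuration keys.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class IdentifierPolicy:
    """Hypothetical container for the five options listed in the thread."""
    assign_external_ids: bool = True       # assign external identifiers at all?
    default_scheme: str = "handle"         # which identifier system by default
    reassignable: bool = False             # may identifiers be re-assigned?
    new_id_per_version: bool = True        # do new versions get new identifiers?
    # Tiers of the content hierarchy that receive identifiers (assumed default).
    tiers: FrozenSet[str] = frozenset({"community", "collection", "item"})

    def should_assign(self, tier: str) -> bool:
        """True if an object in the given tier should get an external ID."""
        return self.assign_external_ids and tier in self.tiers
```

For example, `IdentifierPolicy(tiers=frozenset({"item", "bitstream"}))` would describe a repository that also identifies files. Even this small object allows many combinations, which is the consistency concern raised in the message.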
Re: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?
On Tue, 2007-07-24 at 09:22 +0100, James Rutherford wrote:
> As I see it there are (broadly) two camps: those who believe that every meaningful tier in the DSpace content hierarchy should get external identifiers, and those who don't (or at least those who can't decide and so want it to be configurable).

Do you mind if I take my tent and pitch a little further down the road? Partly because I've seen images of the swollen rivers over your way, but mostly because I can decide and that's why I want it configurable ;)

> Users (and administrators) crave consistency. If we make this assignment configurable, there is no guarantee of consistency of application between collections, or even in a single collection over extended periods of time.

They crave accuracy as well, and consistency isn't the same thing ;)

> The configurable parameters (if we are going to please everyone) would be:
>
> * whether or not to assign external identifiers at all
> * which external identifier system to use by default
> * whether or not external identifiers are re-assignable
> * whether or not new versions of objects get new identifiers
> * which tiers in the content hierarchy get identifiers (if any)
>
> I'm sure I've missed a few, but does that sound like something that is reasonable to want / implement / support?

Provided the options that modify the use of an identifier system apply on a per-system basis, that sounds like a reasonable list of what should be possible.

But I think we are getting a little tied up around the idea that it may only be a single implementation that has all these possibilities available as configuration options, and that need not be the case at all. That is, there could be a pluggable 'ExternalIdentifierManager' which supports managing a single identifier system (configured by default to be Handles), which out of the box replicates the existing behaviour (EIDs for Items, not bitstreams) and which can easily be configured to also assign EIDs to bitstreams.

Beyond that, more advanced cases can be handled not by adding more and more configuration options, but by switching out the implementation.

G
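A rough sketch of the pluggable arrangement Graham describes: one small interface, a default implementation that mirrors current behaviour (identifiers for items only), and an alternative that is selected by name rather than by piling up options. The class names, the registry, the `hdl:` prefix and the placeholder Handle prefix `123456789` are all assumptions for illustration; none of this is existing DSpace code.

```python
from abc import ABC, abstractmethod
from itertools import count
from typing import Dict, Optional, Type


class ExternalIdentifierManager(ABC):
    """Hypothetical plug-in point: decides whether and how to identify objects."""

    @abstractmethod
    def assign(self, object_type: str, internal_id: int) -> Optional[str]:
        """Return a new external identifier, or None if none is assigned."""


class ItemOnlyHandleManager(ExternalIdentifierManager):
    """Default behaviour in this sketch: Handle-like IDs for items only."""
    identified_types = ("item",)

    def __init__(self, prefix: str = "123456789") -> None:
        self._prefix = prefix
        self._sequence = count(1)  # stand-in for a persistent counter

    def assign(self, object_type: str, internal_id: int) -> Optional[str]:
        if object_type not in self.identified_types:
            return None
        return f"hdl:{self._prefix}/{next(self._sequence)}"


class ItemAndBitstreamHandleManager(ItemOnlyHandleManager):
    """Alternative implementation that also identifies bitstreams."""
    identified_types = ("item", "bitstream")


# Curators pick an implementation by name instead of tuning many options.
IMPLEMENTATIONS: Dict[str, Type[ExternalIdentifierManager]] = {
    "item-only": ItemOnlyHandleManager,
    "item-and-bitstream": ItemAndBitstreamHandleManager,
}


def load_manager(name: str) -> ExternalIdentifierManager:
    """Instantiate the implementation registered under the given name."""
    return IMPLEMENTATIONS[name]()
```

With this shape, `load_manager("item-only").assign("bitstream", 11)` returns `None`, while the same call on the `"item-and-bitstream"` implementation mints an identifier.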
Re: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?
On 24/07/07, Graham Triggs [EMAIL PROTECTED] wrote:
> But, I think we are getting a little tied up around the idea that it may only be a single implementation that has all these possibilities available as configuration options - and that need not be the case at all.

+1 - I doubt we could come up with one uber-flexible implementation with an exhaustive set of options that takes care of the lot. It's probably better to provide an interface which lets people drop in different implementations that implement any complex policy they choose.

It then becomes a little more manageable for curators, as the majority can just select from a menu and say 'we use the XXX implementation' instead of having to specify which of the smörgåsbord of configuration options they've tweaked and how.

Rob
Re: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?
On 24/07/07, James Rutherford [EMAIL PROTECTED] wrote:
> On Tue, Jul 24, 2007 at 10:12:52AM -0400, Robert Tansley wrote:
>>> But, I think we are getting a little tied up around the idea that it may only be a single implementation that has all these possibilities available as configuration options - and that need not be the case at all.
>>
>> +1 - I doubt we could come up with one uber-flexible implementation with an exhaustive set of options that takes care of the lot. It's probably better to provide an interface which lets people drop in different implementations that implement any complex policy they choose.
>
> OK, this is starting to sound like a more reasonable attack. I'm still not 100% sure I understand how this approach will help, but I could just be misunderstanding what you're getting at. In my opinion, the class that understands (eg) Handles shouldn't be in charge of deciding where it attaches itself to objects. Or do you mean implementations of the ExternalIdentifierManager?

I've been round this question myself a number of times! The problem is I'm not sure how easy it is to disentangle the logic of assigning IDs (what gets an ID of what form) and minting IDs. If the 'Handle' piece of the implementation could be as simple as a 'mint' method (i.e. if all identifiers were context-free) it would be easy to abstract out. However, IDs may depend both on the object type and related objects -- e.g. bitstream IDs may include the item ID as a path component, or the version number etc -- and the ID scheme itself.

So maybe the 'assignment' interface is as simple as an Event listener/consumer that decides, based on the event and examination of the object, whether and what identifier should be assigned. Then it just needs to register somewhere else (maybe even just adding it to object metadata) that the new ID is assigned to that object, and the rest of the DSpace system is happy.

You could create a flexible Handle system class that could sit behind a couple of different implementations of the above (e.g. a context-free one, and one that assigns contextual bitstream IDs), but I find it hard to believe that a single interface would be sufficient for all different ID schemes (Handle, info:, PURL, UUID, ...).

Rob
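The listener/consumer idea in Rob's message could look roughly like the sketch below: the consumer decides whether an object gets an identifier, a separate minter decides what form it takes, and the result is recorded in the object's metadata. All names (`Obj`, `IdentifierConsumer`, `mint_uuid`, `mint_contextual`, the `"identifier"` metadata key) are hypothetical; this is not the DSpace event API.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Obj:
    """Hypothetical repository object carrying simple key/value metadata."""
    object_type: str
    internal_id: int
    parent: Optional["Obj"] = None
    metadata: Dict[str, List[str]] = field(default_factory=dict)


def mint_uuid(obj: Obj) -> str:
    """Context-free minter: the result does not depend on the object."""
    return f"urn:uuid:{uuid.uuid4()}"


def mint_contextual(obj: Obj) -> str:
    """Contextual minter: embeds the parent's identifier as a path prefix."""
    if obj.parent is None or not obj.parent.metadata.get("identifier"):
        raise ValueError("contextual IDs need a parent that already has an ID")
    return f"{obj.parent.metadata['identifier'][0]}/{obj.internal_id}"


class IdentifierConsumer:
    """Reacts to 'object created' events and assigns identifiers by type."""

    def __init__(self, minters: Dict[str, Callable[[Obj], str]]) -> None:
        self._minters = minters

    def on_created(self, obj: Obj) -> Optional[str]:
        minter = self._minters.get(obj.object_type)
        if minter is None:
            return None  # this policy does not identify this kind of object
        new_id = minter(obj)
        # Registering the ID is just a metadata write in this sketch.
        obj.metadata.setdefault("identifier", []).append(new_id)
        return new_id
```

A consumer built as `IdentifierConsumer({"item": mint_uuid, "bitstream": mint_contextual})` gives items context-free IDs and bitstreams IDs that extend their item's ID, which is the split between assignment and minting that the message is trying to tease apart.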
Re: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?
Hi Rob,

> However, IDs may depend both on the object type and related objects -- e.g. bitstream IDs may include the item ID as a path component, or the version number etc -- and the ID scheme itself.

Do you think it's appropriate for external IDs to support internal system hierarchy and version numbers? (I am not arguing, just wondering.)

Kate
Re: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?
James, thanks for raising this issue and in particular getting people to put their money where their collective mouths are.

+1 from me.

BTW, the discussions about admin seem to focus on us, with little discussion about what the end user would like. As for my end users, they want to see a persistent identifier for their bitstream.

Cheers
Gary

Gary Browne
Development Programmer
Library IT Services
University of Sydney
ph: 9351-5946

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of James Rutherford
Sent: Thursday, 19 July 2007 7:51 PM
To: DSpace Tech
Subject: [Dspace-tech] [vote] Do we want to assign external identifiers(Handles) to files?

Hi all,

As part of my work on the identification mechanisms employed by DSpace (both internal and externally managed identifiers) I've decided that it's better to assign external identifiers to files than to not (which is the current policy). Whether or not external identifiers are used (and if so, which ones) will be configurable, but what I would like to know is whether anyone has a problem with assigning external identifiers to files at all.

If possible, could we keep this to a +/- 1 system (+1 to assign them to files, -1 to not). I definitely don't want to turn this into a discussion about identification systems more generally otherwise we'll be here until next year. Also, if you vote -1 a short explanation about why would be useful.

cheers,
Jim
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
On Thu, Jul 19, 2007 at 06:46:20PM +0100, Graham Triggs wrote:
> Possibilities that should be supportable:
>
> * reassigning the existing id (handle) to a new file

This is, in my humble opinion, pure evil. How can you consider something to be an *identifier* if you can't actually guarantee that it identifies something? What if an object is replicated between repositories and the file is only replaced in one? The user would be seeing two objects that claim to be the same thing (given that they have the same identifier) but would actually be confronted with different content! We can't ever dilute the notion of what it means to be an identifier just to make our lives easier or to reduce the cost of assigning new identifiers or managing relationships between them.

> * provide a fallback mapping - for example, to the item
> * reporting that the id is invalid

These are more reasonable, but see my previous email on the subject.

cheers,
Jim
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
On Thu, Jul 19, 2007 at 01:47:19PM -0400, Ekaterina Pechekhonova wrote:
> +1, as it is usually better to have more options than fewer. However, in our repository we keep identifiers on item level as we see item as a repository atomic structure, and we think it's pretty important that file identifiers will be an option, not a default.

Do you actually oppose assigning them, or just making them visible to the user? Naturally the citation link would be exactly as it is now (ie: pointing to an Item). Personally, I can't see any reason why having this information available would be harmful, so I'm interested to understand why you might think it was a bad idea (I'm very open to input here).

cheers,
Jim
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
From: James Rutherford [EMAIL PROTECTED]
> This is, in my humble opinion, pure evil. How can you consider something to be an *identifier* if you can't actually guarantee that it identifies something?

I absolutely agree. But how can you guarantee that it resolves to what it is meant to be identifying if you completely disallow the possibility to reassign it?

I was tempted to say that you shouldn't be allowed to delete a file that has an external identifier (or at least that the default implementation shouldn't). As soon as I realised that wouldn't be possible, you have to consider the possibility of reassigning the handle.

Remember that such a reassignment is (or rather should only be used for) altering the resolution of the identifier, which doesn't automatically mean that you are conceptually changing what it identifies.

G
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
See the arch review notes (I think http://wiki.dspace.org/index.php/ArchReviewNotesThur was the day we really hammered on it) for decisions made about identifiers and versions. We spent a lot of time on that! (Basically there's an identifier for the 'latest version' of each item and sub-part, and each version also gets a separate ID).

One reason you might not want to assign context-free external identifiers like Handles to bitstreams is that Handles are not free; they require maintenance over time. (Contextual or 'hierarchical' identifiers like info:item_handle/bitstream_id are 'cheaper' in this regard -- you don't have to maintain the context and relationships ad infinitum as they're implicit in the identifier.)

Another point is that in assigning an identifier, you need to be very clear on exactly what you're identifying. It sounds like there's an assumption here that (say) a Handle assigned to a bitstream is identifying that exact sequence of 1s and 0s, rather than 'chapter 2 of the book'. Giving symmetrical identifiers to both logical constructs ('item', FRBR work etc) and 'physical' constructs (an exact sequence of 1s and 0s) could get confusing -- what is identified by one may change over time to reflect changes in technology etc. but the other is set in stone.

This was the thinking behind the original decision to have Handles assigned to Items but not Bitstreams: the Item is what is 'persistent', the particular (potentially numerous) files within may change. So giving out apparently identical identifiers that actually have different levels of persistence guarantee (or, maintaining all those file identifiers and their relationships for items forever) was something we wanted to avoid. We regarded access to bitstreams as a 'service' on the item.

So in short, flexibility is the key! Definitely good to be able to assign external IDs to files, but people need to be clear on what they're identifying, and free to give out contextual identifiers or differently-schemed identifiers to other elements like bitstreams or metadata records.

On 20/07/07, James Rutherford [EMAIL PROTECTED] wrote:
> I definitely don't want to turn this into a discussion about identification systems more generally otherwise we'll be here until next year

See you in 2008...
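As a small illustration of why contextual identifiers are described above as 'cheaper', the sketch below builds and parses identifiers of the informal form `info:item_handle/bitstream_id` used in the message. The relationship to the item is recoverable from the string itself, so no separate mapping has to be maintained. The function names are hypothetical, the example Handle uses the placeholder prefix `123456789`, and the `info:` form simply mirrors the message's notation rather than any registered namespace.

```python
from typing import NamedTuple


class BitstreamRef(NamedTuple):
    """Parsed form of a contextual bitstream identifier."""
    item_handle: str
    bitstream_id: int


def make_contextual_id(item_handle: str, bitstream_id: int) -> str:
    """Build an identifier of the form info:<item_handle>/<bitstream_id>."""
    return f"info:{item_handle}/{bitstream_id}"


def parse_contextual_id(identifier: str) -> BitstreamRef:
    """Recover the item Handle and bitstream ID from a contextual identifier.

    Handles themselves contain a slash (prefix/suffix), so the bitstream ID
    is taken to be whatever follows the last slash.
    """
    if not identifier.startswith("info:"):
        raise ValueError(f"not a contextual identifier: {identifier!r}")
    item_handle, _, tail = identifier[len("info:"):].rpartition("/")
    if not item_handle or not tail.isdigit():
        raise ValueError(f"malformed contextual identifier: {identifier!r}")
    return BitstreamRef(item_handle, int(tail))
```

Here `make_contextual_id("123456789/42", 7)` yields `info:123456789/42/7`, and parsing that string returns the item Handle and bitstream ID without consulting any lookup table.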
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
From: James Rutherford [EMAIL PROTECTED]
> Assigning or displaying? I find it hard to believe that you actually have a problem with giving identifiers to files, but I can understand why you might not want your users to know about them.

Because I don't believe that an identifier should be assigned to something unless you are accepting the possibility for it to be used - there are too many implications to its assignment.

G
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
From: James Rutherford [EMAIL PROTECTED]

>> I absolutely agree. But how can you guarantee that it resolves to what it is meant to be identifying if you completely disallow the possibility to reassign it?

> I'd flip this around and say how can you guarantee that it resolves to what it is meant to be identifying if you *do* allow the possibility to reassign it. Oh, what a can of worms!

You can't. But that isn't the issue. If you are going to have the situation where an id may not resolve correctly, then you have to have the tools to be able to correct it - even if that can create problems through misuse.

>> I was tempted to say that you shouldn't be allowed to delete a file that has an external identifier (or at least that the default implementation shouldn't). As soon as I realised that wouldn't be possible, you have to consider the possibility of reassigning the handle.

> This isn't actually strictly true. Once we have versioning, it could well be impossible (presumably at the discretion of the repository curator) to delete *anything*, only to be able to create a new head version of the container that doesn't hold any reference to the file you wanted to delete.

As nice as that would be in theory, and even if it is the likely 'normal operation', you will always have to cater for being able to completely erase a file or item (i.e. legal issues).

> Remember that in systems with versioning, deletion is a very different concept to systems where versioning isn't supported. The points I have made so far assume we are working with a system that supports versioning.

Yes, the external id could refer - and continue to refer - to a 'deleted' but still accessible file. But bear in mind that we should make no assumption about which external id system(s) are used for the assignment, and that system may not be providing a persistent identifier. So we can't assume what the appropriate handling of that id is on file / item deletion.

>> Remember that such a reassignment is (or rather should only be used for) altering the resolution of the identifier - which doesn't automatically mean that you are conceptually changing what it identifies.

> Danger danger! Surely we would just be giving our adopters enough rope with which to hang themselves by doing this. It is pretty obvious that people will never use things the way we've decided that they should, no matter how much we jump up and down and tell them that it's the wrong thing to do.

True - but I could argue that by even having the ability to assign external / persistent identifiers to anything you are giving adopters enough rope to hang themselves. But having them is also a fundamental part of preservation, and they are likely hanging themselves if they don't use them (appropriately).

There are so many issues that I don't think it's possible to ever write a system where it would be impossible for adopters to hang themselves (with enough functionality to sustain a diverse community). The best we can do is minimize the potential for these problems in 'normal' operation, and provide extra (separate) functionality that can try to correct problems that do arise.

G
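To make the two notions of deletion in this exchange concrete, here is a minimal sketch. It is not DSpace code: the class `VersionedItem` and its methods are hypothetical names invented for illustration, and it assumes a container is simply a list of versions, each holding a set of file ids. It contrasts "deleting" by creating a new head version with completely erasing a file from every version.

```python
from typing import FrozenSet, Iterable, List


class VersionedItem:
    """Hypothetical versioned container: each version is a set of file ids."""

    def __init__(self, files: Iterable[str]) -> None:
        self._versions: List[FrozenSet[str]] = [frozenset(files)]

    @property
    def head(self) -> FrozenSet[str]:
        # The most recent version is the one users normally see.
        return self._versions[-1]

    def remove_file(self, file_id: str) -> None:
        # "Deletion" under versioning: add a new head version that no
        # longer references the file; older versions still hold it.
        self._versions.append(self.head - {file_id})

    def purge_file(self, file_id: str) -> None:
        # Complete erasure (e.g. for legal reasons): drop the file from
        # every version, so nothing can resolve to it any more.
        self._versions = [version - {file_id} for version in self._versions]

    def is_reachable(self, file_id: str) -> bool:
        # True if any version, head or historical, still references the file.
        return any(file_id in version for version in self._versions)
```

After `remove_file`, an external id could still resolve to the file through an older version; after `purge_file` it cannot, which is the case where reassignment, a fallback, or an "invalid" response has to be decided.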
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
I tend to agree with Graham:

1. That can of worms may already have been opened as soon as people approached the persistent identifier issue. Items are relatively stable, so we don't have much of a problem assigning handles to them, but the same can't be said of files. At least for now the admin is free to delete/add files at will. Adding versioning helps but may not completely eliminate the problem.

2. It would be nice to have a sensible tool/solution accompanying the assignment of handles to files, at least to help with most of the admin use cases in which the identifier needs to be changed/re-assigned.

I remember someone has written a note for W3C arguing that the persistent identifier issue is purely an administrative problem. I think he's right to some extent. To administer well, the admin needs tools. With the tools in hand, it is then perhaps up to the admin to use them wisely. BTW I think the admin can always mess up if s/he wants to.

Zhiwu

On 7/20/07, Graham Triggs [EMAIL PROTECTED] wrote:
> [...]
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
+1 for file identifiers; we do it already.

Use case (one of many): Professor is writing a print textbook, is putting sample code in repository, wants permanent URLs for the component files that can be printed in the book.

Dorothea
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
Jim,

Just for clarification to the group... It's unclear what you're requesting a vote on: anyone can assign an external identifier to a Bitstream in an external identification system. Can you answer the two following questions for the group?

1.) Are you suggesting that we will support an infrastructure in DSpace to assign those via an external identifier service (currently the Handle System in 1.5, ExternalIdentifierManager in 1.6)?

2.) Are you suggesting that Internal Identifiers (currently sequence ids) will be structured differently in 1.6 such that one can assign one to a bitstream independent of the parent item's identifier?

Feeding the fire,
Mark

On Jul 19, 2007, at 5:51 AM, James Rutherford wrote:

> Hi all,
>
> As part of my work on the identification mechanisms employed by DSpace (both internal and externally managed identifiers) I've decided that it's better to assign external identifiers to files than to not (which is the current policy). Whether or not external identifiers are used (and if so, which ones) will be configurable, but what I would like to know is whether anyone has a problem with assigning external identifiers to files at all.
>
> If possible, could we keep this to a +/- 1 system (+1 to assign them to files, -1 to not). I definitely don't want to turn this into a discussion about identification systems more generally, otherwise we'll be here until next year. Also, if you vote -1, a short explanation of why would be useful.
>
> cheers, Jim

~ Mark R. Diggory - DSpace Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology
Office: E25-131 Phone: (617) 253-1096
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
On Thu, Jul 19, 2007 at 11:14:15AM -0400, Mark Diggory wrote:

> Just for clarification to the group... It's unclear what you're requesting a vote on: anyone can assign an external identifier to a Bitstream in an external identification system. Can you answer the two following questions for the group?

Only if I can answer them in the wrong order :)

> 2.) Are you suggesting that Internal Identifiers (currently sequence ids) will be structured differently in 1.6 such that one can assign one to a bitstream independent of the parent item's identifier?

Yes, definitely. At the moment, files get UUIDs, which are used in the URL space instead of sequence ids etc. For example, the following URL points to a PDF in my testing repository:

[my_base_url]/dspace/resource/uuid:d0875067-6853-4c54-9d72-ff7888e43c42

> 1.) Are you suggesting that we will support an infrastructure in DSpace to assign those via an external identifier service (currently the Handle System in 1.5, ExternalIdentifierManager in 1.6)?

It would work exactly the same for Bitstreams as it currently does for Items. This is purely about enlarging the scope (and expanding the flexibility) of what we're currently doing, not necessarily about changing the way we do it (at least not yet).

cheers, Jim
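As a rough illustration of the URL shape Jim describes, the sketch below builds and parses a resource URL from a UUID. Only the `/dspace/resource/uuid:<uuid>` path pattern comes from the message; the base URL `http://repository.example.org` and the helper names are assumptions made up for this example, and nothing here is taken from the DSpace code base.

```python
import uuid

# Hypothetical base URL; the original message elides it as [my_base_url].
BASE_URL = "http://repository.example.org"


def resource_url(internal_id: uuid.UUID, base_url: str = BASE_URL) -> str:
    """Build a URL of the form <base>/dspace/resource/uuid:<uuid>."""
    return f"{base_url}/dspace/resource/uuid:{internal_id}"


def parse_resource_url(url: str) -> uuid.UUID:
    """Recover the UUID from a resource URL; raise ValueError if malformed."""
    prefix, separator, tail = url.rpartition("/uuid:")
    if not separator or not prefix.endswith("/dspace/resource"):
        raise ValueError(f"not a resource URL: {url}")
    # uuid.UUID itself raises ValueError for a badly formed hex string.
    return uuid.UUID(tail)
```

Because the UUID belongs to the file itself, a URL of this kind does not depend on the parent item's identifier or on a sequence id within it.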
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
Then what will happen if I remove one file from an item and add another file? Will the new file get the old handle or a new one, or can I choose?

Thanks,
Zhiwu

On 7/19/07, James Rutherford [EMAIL PROTECTED] wrote:
> [...]
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
On Thu, 2007-07-19 at 12:03 -0400, Robert Tansley wrote:

> +1 on ability to assign arbitrary external IDs to bitstreams.
> +1 on ability to assign 'hierarchical' external IDs to bitstreams (for graceful fallback if files are deleted etc).

Minor point, but you don't actually need a hierarchical external ID to do this. The external id is associated with the file, not part of it, and so deleting the file doesn't mean that the external id is or has to be deleted. So it is entirely possible for the external id to be retained by the system, but reassigned to a 'graceful fallback' state.

> -1 on assigning Handles to bitstreams in 'out of the box' config.

Essentially, agree with the voting.

G
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
On Thu, 2007-07-19 at 10:06 -0600, Zhiwu Xie wrote:

> Then what will happen if I remove one file from an item and add another file? Will the new file get the old handle or a new one or I can choose?

How deletion of objects assigned external ids is handled is not something that DSpace should force on to a repository, although it would need to have some kind of sensible default. Indeed, it isn't even the case that a repository can or should have a single way of dealing with removal of an externally identified file - the appropriate course of action would be influenced by the factors that led to the removal / replacement.

Possibilities that should be supportable:

* reassigning the existing id (handle) to a new file
* providing a fallback mapping - for example, to the item
* reporting that the id is invalid

G
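The three possibilities listed above could be sketched roughly as follows. This is an illustration under stated assumptions rather than anything from DSpace: `RemovalAction`, `Binding` and `ExternalIdRegistry` are hypothetical names, and the registry is just an in-memory table that keeps external ids separate from the objects they resolve to, so that removing a file never has to remove its id.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class RemovalAction(Enum):
    REASSIGN = "reassign"      # point the id at a replacement file
    FALLBACK = "fallback"      # point the id at a fallback, e.g. the parent item
    INVALIDATE = "invalidate"  # report the id as invalid from now on


@dataclass
class Binding:
    target: Optional[str]  # internal id of the object the external id resolves to
    valid: bool = True


class ExternalIdRegistry:
    """Hypothetical table of external ids, stored apart from the files."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Binding] = {}

    def assign(self, external_id: str, target: str) -> None:
        self._bindings[external_id] = Binding(target)

    def on_file_removed(self, external_id: str, action: RemovalAction,
                        new_target: Optional[str] = None) -> None:
        binding = self._bindings[external_id]
        if action is RemovalAction.INVALIDATE:
            # Keep the record so the id is never silently reused.
            binding.target, binding.valid = None, False
            return
        # REASSIGN and FALLBACK both change only the resolution target;
        # the caller decides which object is appropriate.
        if new_target is None:
            raise ValueError(f"{action.value} requires a new target")
        binding.target, binding.valid = new_target, True

    def resolve(self, external_id: str) -> Optional[str]:
        binding = self._bindings.get(external_id)
        return binding.target if binding and binding.valid else None
```

Passing the action in per call, rather than fixing it in the registry, reflects the point that a repository need not have a single way of handling removal; a configurable default could be layered on top.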
Re: [Dspace-tech] [vote] Do we want to assign external identifiers (Handles) to files?
+1 assign external identifiers to files