Bugs item #1715242, was opened at 2007-05-08 15:22
Message generated for change (Settings changed) made by tdonohue
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=119984&aid=1715242&group_id=19984
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Out of Date
Priority: 5
Private: No
Submitted By: Larry Stone (lcs8)
Assigned to: Nobody/Anonymous (nobody)
Summary: SequenceID gets reused if highest-SID bitstream is deleted
Initial Comment:
The SequenceID of a Bitstream is supposed to be a
"persistent" identifier. There is an edge case where it is not. This is
reproducible in DSpace 1.4.1, and undoubtedly in 1.4.2 as well. Running on Sun
Java 1.6, RHEL Linux, PostgreSQL 7.something, Tomcat 5.5.
Quoting email with the description:
> First, it is assigned sequentially and IDs are not reused if a bitstream
> is deleted. There is no magic ordering, and it was *not* intended for
> organizing a set of bitstreams into a meaningful sequence (e.g. PDF
> chapters of a book). Its sole purpose is to provide a *durable* unique
> ID for a bitstream - think of it as a 'sub-handle' ID - modulo an item
There's actually a bug in the data model, then. It's possible to get
the same sequence ID reused, because when adding a Bitstream, the code
only looks for the highest existing SequenceID and increments that.
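The allocation described here can be modelled in a few lines. This is a minimal sketch in Python, not DSpace's actual code: the `Item` class and its `add_bitstream` / `delete_bitstream` methods are hypothetical names, and the only behaviour taken from this report is that a new bitstream gets the highest existing SequenceID plus one.

```python
class Item:
    """Hypothetical stand-in for an item; not the DSpace API."""

    def __init__(self):
        self.bitstreams = {}  # SequenceID -> bitstream name

    def add_bitstream(self, name):
        # Allocation as described in the report: highest existing ID plus one.
        sid = max(self.bitstreams, default=0) + 1
        self.bitstreams[sid] = name
        return sid

    def delete_bitstream(self, sid):
        del self.bitstreams[sid]


item = Item()
item.add_bitstream("license.txt")        # an "invisible" bitstream also takes an ID
foo_sid = item.add_bitstream("foo.pdf")  # newest bitstream, highest ID
item.delete_bitstream(foo_sid)
bar_sid = item.add_bitstream("bar.pdf")
print(foo_sid, bar_sid)  # 2 2: bar.pdf inherits foo.pdf's old ID
```

In this model, deleting any bitstream other than the highest-numbered one does not cause reuse, which is why the problem only shows up as an edge case.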
1. Take an existing Item, go into the "Edit Item" admin page
(/dspace/tools/edit-item), and add a new Bitstream with a distinctive name.
Say, "foo.pdf".
2. Determine its SequenceID. Go to the Item page
/dspace/handle/<my-handle> and observe the "View/Open" link next
to your bitstream; the path element after its handle is the SequenceID.
It should be the highest SequenceID there, since it was most recently added.
There are some "invisible" Bitstreams (like licenses) that also take
up SIDs.
3. Go back to the "Edit" page and delete that newest bitstream.
4. Add a different bitstream with a different name, say, "bar.pdf".
5. Go to a freshly-loaded copy of the Item page, and observe that
"bar.pdf" has the same SequenceID that "foo.pdf" had before.
----------------------------------------------------------------------
Comment By: Robert Tansley (rtansley)
Date: 2007-06-13 09:51
Message:
Logged In: YES
user_id=166234
Originator: NO
This is a bug in the code that allocates the sequence IDs, not a problem
with the data model per se.
The sequence IDs should *not* be reused, as they are constituent parts of
a bitstream's 'persistent' or external ID. The intention has always been
that the Handle in combination with the sequence ID can be used to
constitute a persistent ID for a bitstream. ("Assigning a persistent ID to
a bitstream" does not necessarily mean e.g. minting a new Handle.)
For example, a bitstream's 'assigned' persistent ID might be
info:dspace/1721.1/123/1 where '1' is the sequence ID.
Hence we need a way to ensure that sequence IDs are always unique within
an item, and that previously used sequence IDs are never re-assigned.
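One way to meet that requirement is to keep a per-item high-water mark that deletion never lowers. The following is a minimal sketch under assumptions, not a proposed patch: `Item`, `last_sid`, `add_bitstream` and `delete_bitstream` are hypothetical names that do not come from the DSpace code base.

```python
class Item:
    """Hypothetical model of an item with a never-decreasing ID counter."""

    def __init__(self):
        self.bitstreams = {}  # SequenceID -> bitstream name
        self.last_sid = 0     # highest SequenceID ever handed out

    def add_bitstream(self, name):
        # Allocate from the counter, not from the IDs that currently exist.
        self.last_sid += 1
        self.bitstreams[self.last_sid] = name
        return self.last_sid

    def delete_bitstream(self, sid):
        # Deleting a bitstream leaves last_sid untouched.
        del self.bitstreams[sid]


item = Item()
foo_sid = item.add_bitstream("foo.pdf")
item.delete_bitstream(foo_sid)
bar_sid = item.add_bitstream("bar.pdf")
print(foo_sid, bar_sid)  # 1 2: the deleted ID is not handed out again
```

For this to hold across restarts, the counter would have to be stored with the item rather than recomputed from the surviving bitstreams.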
----------------------------------------------------------------------
Comment By: James Rutherford (jrutherford)
Date: 2007-06-13 08:46
Message:
Logged In: YES
user_id=1472833
Originator: NO
This can be attacked in one of two ways (as I see it). 1. Add an actual
database sequence for bitstreams that we can pull the last value from, or
2. stop putting so much stock in the sequence id. I don't really see why we
can't recycle them when bitstreams are deleted. It will always be the case
that Bitstreams in a given Item will have unique sequence ids, so why does
it matter? If it doesn't do what we thought it did, I'm not sure that makes
it a bug ;) The fact that we need it to be "durable" or some kind of
"sub-handle" only demonstrates that the current Bitstream implementation
needs an overhaul to remove such hacks, and provides yet another case for
actually assigning persistent identifiers to Bitstreams.
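Option 1 could look roughly like the sketch below. It is only an illustration: it uses Python's `sqlite3` module with a per-item counter table in place of a real database sequence (SQLite has no sequence objects), and the table name `item_sequence`, its columns, and the function `next_sequence_id` are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per item holding the last SequenceID issued.
conn.execute(
    "CREATE TABLE item_sequence ("
    "item_id INTEGER PRIMARY KEY, last_sid INTEGER NOT NULL)"
)


def next_sequence_id(conn, item_id):
    """Return the next SequenceID for item_id; never hands out an ID twice."""
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT OR IGNORE INTO item_sequence (item_id, last_sid) VALUES (?, 0)",
            (item_id,),
        )
        conn.execute(
            "UPDATE item_sequence SET last_sid = last_sid + 1 WHERE item_id = ?",
            (item_id,),
        )
        (sid,) = conn.execute(
            "SELECT last_sid FROM item_sequence WHERE item_id = ?",
            (item_id,),
        ).fetchone()
    return sid


first = next_sequence_id(conn, 42)
second = next_sequence_id(conn, 42)
other = next_sequence_id(conn, 7)
print(first, second, other)  # 1 2 1
```

Because the counter lives in its own row, deleting a bitstream has no effect on the next value issued for that item.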
----------------------------------------------------------------------
_______________________________________________
Dspace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-devel