Though it is a tangent, I just want to comment on the versioning
work we've done on Dryad/NESCent and how it relates to sequence IDs.

On Wed, Oct 26, 2011 at 11:14 AM, Tim Donohue <[email protected]> wrote:
> My answer would be that bitstream 'sequence_id' should be unique
> *per-Item*. Admittedly though, this doesn't seem to be documented anywhere.

It has absolutely no uniqueness constraint on it in the database.

   sequence_id             INTEGER
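
Since nothing in the schema enforces uniqueness, duplicates have to be
looked for explicitly. A minimal sketch of such a check, run from Python
against an in-memory SQLite database. The three tables are cut down to
the columns needed here (the real DSpace tables have many more), and the
sample rows and the function name find_duplicate_sequence_ids are made
up for illustration:

```python
import sqlite3

# Cut-down stand-ins for the DSpace tables; only the columns used below.
SCHEMA = """
CREATE TABLE bitstream (bitstream_id INTEGER PRIMARY KEY, name TEXT, sequence_id INTEGER);
CREATE TABLE bundle2bitstream (bundle_id INTEGER, bitstream_id INTEGER);
CREATE TABLE item2bundle (item_id INTEGER, bundle_id INTEGER);
"""

# For each item, list every sequence_id carried by more than one bitstream.
DUPLICATES_SQL = """
SELECT i2b.item_id, b.sequence_id, COUNT(DISTINCT b.bitstream_id) AS n
FROM item2bundle i2b
JOIN bundle2bitstream b2b ON b2b.bundle_id = i2b.bundle_id
JOIN bitstream b ON b.bitstream_id = b2b.bitstream_id
WHERE b.sequence_id IS NOT NULL
GROUP BY i2b.item_id, b.sequence_id
HAVING COUNT(DISTINCT b.bitstream_id) > 1
ORDER BY i2b.item_id, b.sequence_id
"""


def find_duplicate_sequence_ids(conn):
    """Return (item_id, sequence_id, bitstream_count) rows for collisions."""
    return conn.execute(DUPLICATES_SQL).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Hypothetical data: item 1 has two bundles, and bitstreams 11 and 12
# both carry sequence_id 2; item 2 is clean.
conn.executemany(
    "INSERT INTO bitstream VALUES (?, ?, ?)",
    [(10, "a.pdf", 1), (11, "b.pdf", 2), (12, "c.pdf", 2), (20, "d.pdf", 1)],
)
conn.executemany(
    "INSERT INTO item2bundle VALUES (?, ?)", [(1, 100), (1, 101), (2, 200)]
)
conn.executemany(
    "INSERT INTO bundle2bitstream VALUES (?, ?)",
    [(100, 10), (100, 11), (101, 12), (200, 20)],
)

duplicates = find_duplicate_sequence_ids(conn)
print(duplicates)  # [(1, 2, 2)]
```

The same SELECT should carry over to PostgreSQL largely unchanged, but
I have only reasoned about it against the toy tables above.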

>
> The reason why I think it is *per-Item* is based on how it is
> implemented in our code/database.  There is only one 'sequence_id' per
> 'bitstream' (it's a column in the 'bitstream' table). But, as you noted,
> a bitstream may be linked into one or more bundles. At least the data
> model supports this idea of a bitstream in multiple bundles (even though
> I'm not sure if there's actually a way to map to multiple bundles from
> the UI or CLI).

We do link Bitstreams not only to multiple bundles; the bundles also
span Items.  This allows Bitstreams that do not change to be reused
without having to replicate them across Item Versions.  This was the
approach proposed and discussed in the original DSpace 2.0
Architectural Review.  We just implemented it in the Versioning
Addon for DSpace.  Note that we have to adjust certain behaviors in
DSpace Item/Bundle/Bitstream deletion to ensure the integrity of the
database, so that Bitstreams that are in previous versions are not
"deleted" completely when they are deleted in the current version via
the Item Edit or Submission interfaces.

So, yes, we still rely on the sequence ID across all the Item/Bundles
that are referencing the Bitstream. As we are basically "replicating"
the "structure" of the Item, the Bitstream sequence IDs are still
aligned across all the Item/Bundles in a version history.


> So, currently, based on the data model, it'd be impossible for a single
> bitstream to have different 'sequence_id' values for different Bundles
> (for that to work, the 'sequence_id' would need to be on the
> 'bundle2bitstream' table instead of the 'bitstream' table).

Agreed, we had to work within this constraint for versioning.
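
To make the constraint concrete, here is a small sketch (again Python
with an in-memory SQLite database, cut-down tables, and made-up IDs; the
helper sequence_ids_by_item is hypothetical). One bitstream is linked
into the bundles of two Item versions, and because sequence_id lives on
the bitstream row, both versions always see the same value:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bitstream (bitstream_id INTEGER PRIMARY KEY, sequence_id INTEGER);
CREATE TABLE bundle2bitstream (bundle_id INTEGER, bitstream_id INTEGER);
CREATE TABLE item2bundle (item_id INTEGER, bundle_id INTEGER);

-- Bitstream 10 is shared: bundle 100 belongs to item 1 (old version),
-- bundle 200 belongs to item 2 (new version).
INSERT INTO bitstream VALUES (10, 1);
INSERT INTO item2bundle VALUES (1, 100), (2, 200);
INSERT INTO bundle2bitstream VALUES (100, 10), (200, 10);
""")


def sequence_ids_by_item(conn, bitstream_id):
    """Return (item_id, sequence_id) for every item that links the bitstream."""
    return conn.execute(
        """
        SELECT i2b.item_id, b.sequence_id
        FROM bitstream b
        JOIN bundle2bitstream b2b ON b2b.bitstream_id = b.bitstream_id
        JOIN item2bundle i2b ON i2b.bundle_id = b2b.bundle_id
        WHERE b.bitstream_id = ?
        ORDER BY i2b.item_id
        """,
        (bitstream_id,),
    ).fetchall()


before = sequence_ids_by_item(conn, 10)

# Changing the value touches the single bitstream row, so it changes
# for every item version at once.
conn.execute("UPDATE bitstream SET sequence_id = 5 WHERE bitstream_id = 10")
after = sequence_ids_by_item(conn, 10)

print(before)  # [(1, 1), (2, 1)]
print(after)   # [(1, 5), (2, 5)]
```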

> All this being said, it makes me wonder how you ended up with an item
> that has two bitstreams with the same 'sequence_id'.

And the only recourse would be to shift one of the bitstreams to have
a new sequence id. There is a capability in the XMLUI that will allow
you to resolve strictly on the filename without the bitstream sequence
id in the URL, in which case it will return the first bitstream that
matches the filename.
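
As a rough illustration of both points (this is not DSpace code; the
Bitstream class and the two helpers are hypothetical): the sketch below
gives the later of two colliding bitstreams the next largest sequence ID
in the item, and shows why a filename-only lookup simply returns
whichever match comes first.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bitstream:
    # Hypothetical stand-in for a row of the 'bitstream' table.
    bitstream_id: int
    name: str
    sequence_id: int


def reassign_duplicates(bitstreams: List[Bitstream]) -> None:
    """Give each bitstream whose sequence_id is already taken the next largest value."""
    next_id = max((b.sequence_id for b in bitstreams), default=0) + 1
    seen = set()
    # Walk in bitstream_id order so the oldest bitstream keeps its value.
    for b in sorted(bitstreams, key=lambda b: b.bitstream_id):
        if b.sequence_id in seen:
            b.sequence_id = next_id
            next_id += 1
        seen.add(b.sequence_id)


def find_by_name(bitstreams: List[Bitstream], name: str) -> Optional[Bitstream]:
    """Filename-only lookup: the first bitstream with a matching name wins."""
    for b in bitstreams:
        if b.name == name:
            return b
    return None


item = [
    Bitstream(10, "report.pdf", 1),
    Bitstream(11, "data.csv", 2),
    Bitstream(12, "data.csv", 2),  # collides with bitstream 11
]

reassign_duplicates(item)
print([b.sequence_id for b in item])               # [1, 2, 3]
print(find_by_name(item, "data.csv").bitstream_id)  # 11
```

Keep in mind that when a bitstream is shared across Item versions, a new
sequence ID applies to every version that references it, since the value
is stored on the bitstream itself.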

> Looking at the code in Item.update(), it looks like this should not be
> "possible", as the code which assigns 'sequence_id' always attempts to
> assign the next largest value. Surprisingly, I don't see any other code
> in our DSpace API that even calls "bistream.setSequenceID()"
>
> Obviously, somehow your data still managed to end up with the same
> sequence_id though. :)  I'm just not sure how.

My thoughts drift to:

-- SQL code to update the ID (primary key) generating sequences, if some
-- import operation has set explicit IDs.
--
-- Sequences are used to generate IDs for new rows in the database.  If a
-- bulk import operation, such as an SQL dump, specifies primary keys for
-- imported data explicitly, the sequences are out of sync and need updating.
-- This SQL code does just that.
--
-- This should rarely be needed; any bulk import should be performed using the
-- org.dspace.content API which is safe to use concurrently and in multiple
-- JVMs.  The SQL code below will typically only be required after a direct
-- SQL data dump from a backup or somesuch.

http://scm.dspace.org/svn/repo/dspace/trunk/dspace/etc/postgres/update-sequences.sql
-- 

Mark Diggory
2888 Loker Avenue East, Suite 305, Carlsbad, CA. 92010
Esperantolaan 4, Heverlee 3001, Belgium
http://www.atmire.com

_______________________________________________
Dspace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-devel
