I'd say:
1.) We should consider the MIT proposal to do away with Bundles. It's
critical that we reduce the complexity in this area of DSpace and align
it with the Fedora Object Model.  Bundles are poorly designed at the
moment and are inadequate for capturing the internal structure of the
content, as we see in other approaches like Fedora.  The Hydra Common
Module suggests this detail should be captured in a METS structMap
bitstream, so that it can contain a richer hierarchy than a fixed
layer of bundle structures can represent.
2.) The uniqueness constraint and the sequence_id should be expressed
in the database; such constraints shouldn't be bound in application
code. This is an example of a poor design choice about where to persist
this detail. It should not have been stored on the bitstream table. It
should have been in the bundle2bitstream table, or, if you want it
unique across all bitstreams in an item, there should have been an
item2bitstream table that stored the sequence id. Then the uniqueness
constraint could have been enforced properly.
Why? Because the sequence id is irrelevant to the Bitstream itself; it
is an attribute of the container/collection object that is aggregating
the bitstreams. A sequence id is meaningless in relation to the
Bitstreams that are used as Community and Collection logos, etc. It is
only relevant to the Item/Bundle container holding the Bitstream.
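
To make point 2 concrete, here is a rough, untested-against-DSpace sketch of
what I mean by putting the sequence id on the join table. The table and column
names (item2bitstream and friends) are my own illustration, not the real DSpace
schema, and I'm using an in-memory SQLite database from Python only so the
example runs anywhere. The part that matters is the UNIQUE (item_id,
sequence_id) constraint, which lets the DBMS itself reject a duplicate.

```python
import sqlite3

# Hypothetical schema for illustration only; NOT the actual DSpace tables.
# The sequence id lives on the join table, so the database can enforce
# "unique per item" with a plain UNIQUE constraint.
SCHEMA = """
CREATE TABLE item (item_id INTEGER PRIMARY KEY);
CREATE TABLE bitstream (bitstream_id INTEGER PRIMARY KEY);
CREATE TABLE item2bitstream (
    item_id      INTEGER NOT NULL REFERENCES item(item_id),
    bitstream_id INTEGER NOT NULL REFERENCES bitstream(bitstream_id),
    sequence_id  INTEGER NOT NULL,
    PRIMARY KEY (item_id, bitstream_id),
    UNIQUE (item_id, sequence_id)
);
"""


def make_db():
    """Create an in-memory database with two items and three bitstreams."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO item VALUES (?)", [(1,), (2,)])
    conn.executemany("INSERT INTO bitstream VALUES (?)", [(10,), (11,), (12,)])
    conn.commit()
    return conn


def attach(conn, item_id, bitstream_id, sequence_id):
    """Return True if the row was accepted, False if the DBMS rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO item2bitstream VALUES (?, ?, ?)",
                (item_id, bitstream_id, sequence_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = make_db()
    print(attach(db, 1, 10, 1))  # first bitstream of item 1
    print(attach(db, 1, 11, 1))  # same sequence id in the same item
    print(attach(db, 2, 12, 1))  # same sequence id in a different item
```

With this layout a Bitstream used as a logo simply never gets a row in the
join table, so it carries no sequence id at all, and bad data written behind
the Java API's back is refused by the database rather than discovered later.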
Mark
On Wed, Oct 26, 2011 at 2:25 PM, Tim Donohue <[email protected]> wrote:
> Mark,
>
> I agree with your concept. It'd be nice if the DBMS could enforce this
> for us.  I hadn't even thought of creating a more complex constraint
> based on a larger query/view.   Although it doesn't look "pretty", I'd
> be OK with it if we could ensure it'd get the job done (without a huge
> performance hit or anything).  I doubt this is the only place we aren't
> properly enforcing constraints on our data at the DBMS level, but it
> would be nice to plug up one hole.
>
> - Tim
>
> On Wednesday, October 26, 2011 4:10:37 PM, Mark H. Wood wrote:
>> CREATE VIEW BitstreamSequences AS
>>     SELECT item_id, bundle_id, bitstream_id, sequence_id
>>       FROM Item
>>       JOIN Item2Bundle USING(item_id)
>>       JOIN Bundle2Bitstream USING(bundle_id)
>>       JOIN Bitstream USING(bitstream_id);
>>
>> -- Standard SQL assertion syntax; few DBMSs actually implement it.
>> CREATE ASSERTION ic_bitstream_sequences CHECK (
>>   NOT EXISTS (SELECT * FROM BitstreamSequences AS bs1 WHERE
>>   EXISTS (SELECT * FROM BitstreamSequences AS bs2
>>    WHERE bs1.item_id = bs2.item_id
>>    AND bs1.bitstream_id != bs2.bitstream_id
>>    AND bs1.sequence_id = bs2.sequence_id)
>>   )
>> );
>>
>> (I think -- not tested, and my SQL-fu is perhaps not yet that strong.)
>>
>>> All that being said, obviously Mark Wood found out that we really aren't
>>> properly checking that a sequence_id is unique per Item.  All we are
>>> doing is attempting to ensure that, when you add bitstreams through the
>>> DSpace Java API (either via UI or CLI), each of those bitstreams will be
>>> assigned a sequence_id which is unique within its associated Item.
>>>
>>> But, if "bad data" gets in your DB somehow (by bypassing our DSpace Java
>>> API), we aren't warning the user or telling the user that his/her data
>>> is "invalid" (which is what Mark Wood discovered the hard way).
>>
>> The question, then, is whether to use the DBMS to check this stuff or
>> build something custom.  I think that integrity checks need to be as
>> close as possible to the data, which suggests the DBMS.  Looking at
>> the monstrosity I wrote above, I'm not so sure. :-/
>>
>>
>>
>> ------------------------------------------------------------------------------
>> The demand for IT networking professionals continues to grow, and the
>> demand for specialized networking skills is growing even more rapidly.
>> Take a complimentary Learning@Cisco Self-Assessment and learn
>> about Cisco certifications, training, and career opportunities.
>> http://p.sf.net/sfu/cisco-dev2dev
>>
>>
>> _______________________________________________
>> Dspace-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dspace-devel
>



-- 

Mark Diggory
2888 Loker Avenue East, Suite 305, Carlsbad, CA. 92010
Esperantolaan 4, Heverlee 3001, Belgium
http://www.atmire.com
