I'd say:

1.) We should consider the MIT proposal to do away with Bundles. It's
critical that we reduce the complexity in this area of DSpace and align
it with the Fedora Object Model. Bundles are poorly designed at the
moment and are inadequate for capturing the internal structure of the
content, as we see in other approaches like Fedora. The Hydra Common
Module suggests this detail should be captured in a METS structmap
bitstream, so that it can contain a richer hierarchy than a fixed layer
of bundle structures can represent.

2.) The uniqueness constraint on the sequence_id should be expressed in
the database; such constraints shouldn't be bound up in application code.
This is an example of a poor design choice about where to persist this
detail. The sequence id should not have been stored on the bitstream
table. It should have been in the bundle2bitstream table or, if you want
it unique across all bitstreams in an item, in an item2bitstream table
that stored the sequence id. Then the uniqueness constraint could have
been enforced properly.

Why? Because the sequence id is irrelevant to the Bitstream itself; it is
an attribute of the container/collection object that is aggregating the
bitstreams. A sequence id is meaningless for the Bitstreams that are
used as Community and Collection logos, etc. It is only relevant to the
Item/Bundle container holding the Bitstream.

Mark

On Wed, Oct 26, 2011 at 2:25 PM, Tim Donohue <[email protected]> wrote:
> Mark,
>
> I agree with your concept. It'd be nice if the DBMS could enforce this
> for us. I hadn't even thought of creating a more complex constraint
> based on a larger query/view. Although it doesn't look "pretty", I'd
> be OK with it if we could ensure it'd get the job done (without a huge
> performance hit or anything). I doubt this is the only place we aren't
> properly enforcing constraints on our data at the DBMS level, but it
> would be nice to plug up one hole.
>
> - Tim
>
> On Wednesday, October 26, 2011 4:10:37 PM, Mark H. Wood wrote:
>> CREATE VIEW BitstreamSequences AS
>>   SELECT item_id, bundle_id, sequence_id
>>     FROM Item
>>       JOIN Item2Bundle USING(item_id)
>>       JOIN Bundle2Bitstream USING(bundle_id)
>>       JOIN Bitstream USING(bitstream_id);
>>
>> CREATE CONSTRAINT ic_bitstream_sequences CHECK (
>>   EXISTS (SELECT * FROM BitstreamSequences AS bs1 WHERE
>>     NOT EXISTS (SELECT * FROM BitstreamSequences AS bs2
>>       WHERE bs1.item_id = bs2.item_id AND bs1.bundle_id != bs2.bundle_id
>>         AND bs1.sequence_id = bs2.sequence_id)
>>   )
>> );
>>
>> (I think -- not tested, and my SQL-fu is perhaps not yet that strong.)
>>
>>> All that being said, obviously Mark Wood found out that we really aren't
>>> properly checking that a sequence_id is unique per Item. All we are
>>> doing is attempting to ensure that, when you add bitstreams through the
>>> DSpace Java API (either via UI or CLI), each of those bitstreams will be
>>> assigned a sequence_id which is unique within its associated Item.
>>>
>>> But, if "bad data" gets in your DB somehow (by bypassing our DSpace Java
>>> API), we aren't warning the user or telling the user that his/her data
>>> is "invalid" (which is what Mark Wood discovered the hard way).
>>
>> The question, then, is whether to use the DBMS to check this stuff or
>> build something custom. I think that integrity checks need to be as
>> close as possible to the data, which suggests the DBMS. Looking at
>> the monstrosity I wrote above, I'm not so sure. :-/
>>
>>
>>
>> ------------------------------------------------------------------------------
>> The demand for IT networking professionals continues to grow, and the
>> demand for specialized networking skills is growing even more rapidly.
>> Take a complimentary Learning@Cisco Self-Assessment and learn
>> about Cisco certifications, training, and career opportunities.
>> http://p.sf.net/sfu/cisco-dev2dev
>>
>>
>> _______________________________________________
>> Dspace-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dspace-devel
--
Mark Diggory
2888 Loker Avenue East, Suite 305, Carlsbad, CA. 92010
Esperantolaan 4, Heverlee 3001, Belgium
http://www.atmire.com
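
A rough illustration of point 2 in the message, offered as a sketch under stated assumptions rather than DSpace's actual schema: it supposes a hypothetical item2bitstream link table that carries sequence_id and declares UNIQUE (item_id, sequence_id), so the database itself rejects duplicates. The table layout, the helper names build_db and link, and the use of an in-memory SQLite database through Python's sqlite3 module are all assumptions made only to keep the example self-contained.

```python
import sqlite3

# Hypothetical schema: sequence_id lives on the item-to-bitstream link,
# not on the bitstream row. Names are illustrative, not DSpace's real DDL.
SCHEMA = """
CREATE TABLE item (item_id INTEGER PRIMARY KEY);
CREATE TABLE bitstream (bitstream_id INTEGER PRIMARY KEY);
CREATE TABLE item2bitstream (
    item_id      INTEGER NOT NULL REFERENCES item(item_id),
    bitstream_id INTEGER NOT NULL REFERENCES bitstream(bitstream_id),
    sequence_id  INTEGER NOT NULL,
    PRIMARY KEY (item_id, bitstream_id),
    UNIQUE (item_id, sequence_id)
);
"""


def build_db():
    """Create an in-memory database with two items and three bitstreams."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO item VALUES (?)", [(1,), (2,)])
    conn.executemany("INSERT INTO bitstream VALUES (?)", [(10,), (11,), (12,)])
    conn.commit()
    return conn


def link(conn, item_id, bitstream_id, sequence_id):
    """Attach a bitstream to an item; return False if the DB rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO item2bitstream VALUES (?, ?, ?)",
                (item_id, bitstream_id, sequence_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = build_db()
    print(link(db, 1, 10, 1))  # first bitstream in item 1
    print(link(db, 1, 11, 1))  # same sequence_id again in item 1
    print(link(db, 2, 12, 1))  # same sequence_id in a different item
```

With this layout a bitstream that is never linked to an item, such as a Community or Collection logo, simply has no sequence id, and a row written outside the Java API would still be subject to the same constraint.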
