On Aug 15, 2008, at 12:15 PM, John Preston wrote:

> On Fri, Aug 15, 2008 at 1:40 PM, Richard Rodgers <[EMAIL PROTECTED]> wrote:
>> On Fri, 2008-08-15 at 10:12 -0700, Mark Diggory wrote:
>>>> On Aug 15, 2008, at 9:36 AM, John Preston wrote:
>>>>> Hi. Can anyone say how I can re-use a bitstream sequence number. The
>>>>> use case is the following....
>>>
>>> On Aug 15, 2008, at 10:01 AM, Mark H. Wood wrote:
>>>
>>>> Allowed or not, this sounds risky. If you are overloading the
>>>> sequence number with a new meaning, this practice is likely to bite
>>>> you again and again, since the developing stock code won't recognize
>>>> your second meaning and will take no pains to preserve it....
>>>
>>> Mark is correct about overloading the semantics here. Note, We
>>> adjusted the behavior behind the dspace 1.5 XMLUI (but not the JSPUI)
>>> to allow for unsequenced name resolution of the bitstreams. For
>>> instance:
>>> ...
>>> It certainly would have been much easier to key Bitstreams on the
>>> name rather than a sequence id in the original architecture. I've
>>> seen requests such as yours numerous times during my history of
>>> working on DSpace and being able to reference resources by simple
>>> assignable predictable names rather than internally generated
>>> sequence ids makes life on the outside of DSpace easier and 3rd party
>>> tooling more powerful. This is something I hope to take into the 2.0
>>> development initiative.
>>
>> Easier perhaps, but unfortunately the Bitstream filename need not be
>> unique, so is a problematic candidate for a durable reference.

Richard, that is the crux of my criticism. It would be easier and more useful all around if the name were part of the identifier/revisioning strategy for the item in DSpace 2.0, with the name serving as the identifier for the bitstream within the scope of that Item and its item-wide revision id. The current XMLUI support is a transition somewhere between the original DSpace behavior and this Item revisioning end-goal of 2.0.

Likewise, John's case is yet another example of why we need the ability to assign such identifiers rather than have them assigned internally. John seeks to supply an updated version of a file without having to remove all the bitstreams and recreate them in order to reconstruct all the local references to that specific bitstream within his item, so it's a reasonable use case.

I encountered this when creating the DDI metadata (relative URIs) describing the data files I ported from the Virtual Data Center to DSpace:

http://dspace.mit.edu/handle/1721.1/39118

Where I might have:

http://dspace.mit.edu/bitstream/handle/1721.1/39126/1/study.xml

how would I define my DDI's relative references to the other bitstreams prior to having ingested the entire package representing the Item into DSpace, when my external application doesn't have access to this internally generated sequence id until after the fact? (That's rhetorical and answered below.)

http://dspace-test.mit.edu/bitstream/handle/1721.1/39126/3/womenpolicymakers_census_dta.tab
http://dspace-test.mit.edu/bitstream/handle/1721.1/39126/2/womenpolicymakers_census.dta
http://dspace-test.mit.edu/bitstream/handle/1721.1/39126/5/womenpolicymakers_parta_dta.tab

Rather than the above, reserving the name to be the unique identifier and eliminating the bitstream sequence id from the path allows me this flexibility:

http://dspace-test.mit.edu/bitstream/handle/1721.1/39126/study.xml?sequence=1
http://dspace-test.mit.edu/bitstream/handle/1721.1/39126/womenpolicymakers_census_dta.tab?sequence=3
http://dspace-test.mit.edu/bitstream/handle/1721.1/39126/womenpolicymakers_census.dta?sequence=2
http://dspace-test.mit.edu/bitstream/handle/1721.1/39126/womenpolicymakers_parta_dta.tab?sequence=5

These can all be relatively referenced easily (without uniqueness constraints), provided the heuristic for resolution is sensible and predictable:

./study.xml
./womenpolicymakers_census_dta.tab
./womenpolicymakers_census.dta
./womenpolicymakers_parta_dta.tab

I admit this heuristic is currently poorly defined and could use adjustment to return the bitstream with the same name and latest sequence id, thus becoming, in a sense, a "poor man's" revisioning system for 1.5.

And if I wish to retain the granularity of the sequence id as a revision identifier when referring to the bitstream:

./study.xml?sequence=1
./womenpolicymakers_census_dta.tab?sequence=3
./womenpolicymakers_census.dta?sequence=2
./womenpolicymakers_parta_dta.tab?sequence=5

Because of this "chicken-and-egg" problem that DSpace (pre-1.5 XMLUI) creates, I had to abandon any attempts to capture changes to the bitstreams (or even a bitstream's initial sequence id), given the lack of granularity in the Import/Package Ingest process. The only way that applications can resolve the above relative URIs is to have a mechanism that tolerates the usage of a composite identifier, name[?sequence=revision id], as a unique identifier, with a sane default when the sequence id is absent: refer to the latest. I don't think this is an unrealistic behavior to want out of the system.

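The composite identifier described above can be sketched roughly as follows. This is only an illustration in Python, not DSpace code: the Bitstream class and the resolve function are hypothetical names, and the sample data (including a second upload of the census .tab file with sequence id 6) is invented for the example. It assumes the "latest" bitstream is simply the one with the highest sequence id among those sharing a name.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs, urlsplit


@dataclass(frozen=True)
class Bitstream:
    """Hypothetical stand-in for a bitstream within one item."""
    name: str
    sequence: int


def resolve(bitstreams, reference):
    """Resolve a 'name[?sequence=N]' reference within one item.

    With ?sequence=N, return the bitstream with that name and sequence id.
    Without it, return the bitstream with that name and the highest
    sequence id. Return None when nothing matches.
    """
    parts = urlsplit(reference)
    # Keep only the last path segment, so "./study.xml" becomes "study.xml".
    name = parts.path.rsplit("/", 1)[-1]
    candidates = [b for b in bitstreams if b.name == name]

    query = parse_qs(parts.query)
    if "sequence" in query:
        wanted = int(query["sequence"][0])
        for b in candidates:
            if b.sequence == wanted:
                return b
        return None

    # No sequence given: default to the latest revision of that name.
    return max(candidates, key=lambda b: b.sequence, default=None)


# Invented sample item: the census .tab file has been uploaded twice.
item = [
    Bitstream("study.xml", 1),
    Bitstream("womenpolicymakers_census.dta", 2),
    Bitstream("womenpolicymakers_census_dta.tab", 3),
    Bitstream("womenpolicymakers_parta_dta.tab", 5),
    Bitstream("womenpolicymakers_census_dta.tab", 6),
]

latest = resolve(item, "./womenpolicymakers_census_dta.tab")
pinned = resolve(item, "./womenpolicymakers_census_dta.tab?sequence=3")
```

Note that names need not be unique here: two bitstreams may share a name as long as their sequence ids differ, which is what lets the bare name act as a stable relative reference.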
SVN/ViewVC handles the subject elegantly by returning the most recent revision of a file:

http://dspace.svn.sourceforge.net/viewvc/dspace/branches/dspace-1_5_x/dspace/docs/html/index.html

and by allowing the other revisions of the filename, which is unique within the current revision, to be returned from more complex queries against it:

http://dspace.svn.sourceforge.net/viewvc/dspace/branches/dspace-1_5_x/dspace/docs/html/index.html?revision=3044

In fact, this allows a very elegant relative reference solution to arise that doesn't require recalculation to place relative references into the system. It also eliminates the need for a special service like HTMLServlet to resolve these references by searching for matching paths in the bitstream names. (Simply try navigating the above documentation in the repository.)

> How will the versioning scheme, that I recall being talked about some
> time ago, work. Did it not need to keep a stable reference to a
> bitstream along with versions
>
> John
>

Yes, it does intend to, but the scheme as written in the architectural review is currently outdated, given a number of new considerations around the usage of UUIDs and referring to resources without nested hierarchies of identifiers. There was also a bit of recent work at the Bristol meeting around relying on underlying support for versioning in the storage layers of the new 2.0 architecture. However, that's not completely thought out either.

My current viewpoint on the subject is that the versioning discussion in the architectural review outlined a need to have versioning be at the Item level only. This means that revisions would be referred to via an item revision id rather than via individual bitstream sequence ids.

For instance:

http://host/resource/[Item ID]/[Item_Version_ID]/[Manifestation_ID]/[File_ID]

This might result in something that looks like:

http://host/resource/Item_X/Version_1/Manifestation_Y/study.xml
http://host/resource/Item_X/Version_1/Manifestation_Y/womenpolicymakers_census_dta.tab
http://host/resource/Item_X/Version_1/Manifestation_Y/womenpolicymakers_census.dta
http://host/resource/Item_X/Version_1/Manifestation_Y/womenpolicymakers_parta_dta.ta

http://host/resource/Item_X/Version_2/Manifestation_Y/study.xml
http://host/resource/Item_X/Version_2/Manifestation_Y/womenpolicymakers_census_dta.tab
http://host/resource/Item_X/Version_2/Manifestation_Y/womenpolicymakers_census.dta
http://host/resource/Item_X/Version_2/Manifestation_Y/womenpolicymakers_parta_dta.ta

Here I have replaced only "womenpolicymakers_census_dta.tab"; the other referenced Bitstreams are simply retained and mapped to the new version id. This furthers my proposed strategy above by still retaining the relative reference capabilities within the "critical bitstream portion" of the path.

We also talked about the following defaulting to the latest version, not unlike the behavior of SVN/ViewVC:

http://host/resource/Item_X/Manifestation_Y/study.xml
http://host/resource/Item_X/Manifestation_Y/womenpolicymakers_census_dta.tab
http://host/resource/Item_X/Manifestation_Y/womenpolicymakers_census.dta
http://host/resource/Item_X/Manifestation_Y/womenpolicymakers_parta_dta.ta

Note, if you're confused about what a "Manifestation" is: in the DSpace 2.0 model it represents a replacement for the Bundle that is properly exposed and aligns with the Manifestation conceptualized in the FRBR area of research.

Cheers,
Mark

~~~~~~~~~~~~~
Mark R. Diggory - DSpace Developer and Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology
Home Page: http://purl.org/net/mdiggory/homepage

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech

