Dorothea, I'll take this as an opportunity to summarize what I know about various projects that are already happening in the community to address these topics.
On Sep 12, 2008, at 8:48 AM, Dorothea Salo wrote:

> Apologies for the lateness. We're finally getting 1.5 in shape to roll
> out, and I'm just a little stressed about that.

You should certainly be voicing any issues or concerns you may have to the community. We are glad to advise on best approaches for rolling out a 1.5 release (as you know, at MIT we just finished this rollout).

> REPOSITORY MIGRATION
>
> Work is underway to enable wholesale repository migration between
> platforms via OAI-ORE. The winning entry in the hackfest at Open
> Repositories 2008 nearly managed an entire migration between DSpace
> and Fedora.

I think the Fedora and DSpace communities are actually looking for something even more synergistic than "migration" out of ORE and other projects. We did just finish a GSoC project on using a Fedora repository as the storage layer for the DSpace Assetstore:

http://wiki.dspace.org/index.php/Google_Summer_of_Code_2008_Fedora_Integration

There is certainly a strong push in both the DSpace and Fedora communities to become more involved and collaborative with each other. With the 2.0 work there is an opportunity to see even more reuse of storage between DSpace and Fedora, allowing all DSpaceObject data (including policies and permissions) to be mapped to Fedora under the hood. Likewise, the 2.0 model rework seeks to allow metadata to be attached to any DSpaceObject, which opens the door for a richer expression of DSpace objects that fits better with both the ORE and Fedora storage models.

While hackfests are interesting opportunities to explore ideas, without the resulting code being publicized and brought into the community, I'm wary of the outcome being anything more than just an example of the point we all know: given that we are storing similar content, we also ultimately have similar use cases and underlying implementation strategies.
I see this as not much different from Scott Yeadon's work to present Content Interchange at OR2007. The only (albeit big) benefit is a third-party expression/mapping of both tools to ORE rather than METS.

> # Content Interchange and the Invisible Repository
> Scott Yeadon
>
> The Australian National University (ANU) will be undertaking
> development work for the Australian Partnership for Sustainable
> Repositories (APSR) in 2007. Much of this work will be focused
> around repository interoperability and the integration of a
> repository service within the university’s application
> infrastructure. This presentation will discuss and demonstrate some
> of the prototype DSpace-related development work undertaken so far
> and planned for further development in 2007. Specifically: a METS
> SIP/DIP profile intended to be used as a national standard for the
> meaningful exchange of digital objects between repositories;
> separation of concerns at a functional level so an institution can
> select best-of-breed software, with an example using Open Journal
> Systems (OJS) to manage publication workflow, DSpace to manage
> preservation and Manakin as an access/publication point; and a
> Manakin theme incorporating Google Earth and Google Maps
> functionality.

> ...
> ONE CHANGE
>
> Asked what the one change would be that would advance DSpace furthest
> toward the ideal repository system, these possibilities came up (in
> rough rank by interest):

Thank you for instigating these responses, Dorothea. I would like to make some comments.

First, I would add that aligning DSpace, in terms of model and capability, with lower-level storage solutions (like Fedora) is one of the important requirements in the 2.0 development roadmap, and that this level of integration is a very high priority for the Foundation and the core 2.0 development team. Anyone who has questions about how/what is being planned for 2.0 should voice them to the team and the community at large.
We are working to solidify the architectural prototype and bring together these designs. Once we have a tangible body of work, we will be opening the effort to review by the community at large.

That said, I know we also have the following projects already in the works in the community:

> * file versioning

I mentored a GSoC project in 2007 that addresses this; our intention is to merge it into the trunk at some point in the near future:

http://wiki.dspace.org/index.php/Google_Summer_of_Code_2007_Versioning

Likewise, we have a project within the MIT Libraries, initially prototyped by Larry Stone, to implement the new History system for DSpace:

http://wiki.dspace.org/index.php/HistorySystemPrototype

> * embargoes

Elliot Metsger has been hard at work on a prototype for handling embargoes that is a fork of the 1.5.x codebase; he is also working on porting this to the trunk.

> * eliminating the need for server restarts

Not sure about this topic. It's not necessarily a "feature" of DSpace so much as a matter of how one deploys it in a production environment.

> * authority control

Authority control is an important topic and one that I've had some ideas about, but I haven't seen it come to fruition yet. I'll say that these projects, however, relate to it.

Bitstream Format Renovation (initially prototyped by Larry Stone):

http://wiki.dspace.org/index.php/BitstreamFormat_Renovation

This is a critical project that MIT Libraries has been working on over the last year. It basically provides authority control over the Bitstream Formats, which will enable DSpace to use services like PRONOM and GDFR to supply more uniform and consistent format detection and control. By using a Global Format Registry, DSpace instances can all share a common set of known formats that will be updated on a regular basis to reflect the changes in digital media that occur over time.
In the DSpace DAO refactoring work that James Rutherford, Richard Jones, Graham Triggs and I participated in (which now resides in the DSpace trunk), there was a critical effort to refactor out the ability to assign and manage External Identifiers such as Handles, DOIs, PURLs, ARKs, etc.

I'm of the opinion that what we are seeking is ultimately a more universal solution here. ExternalIdentifiers and BitstreamFormats are really cases of controlled metadata fields, and these fields are backed by services with the following levels of capability:

Level 1 Read - The ability to list, search or validate a specific metadata value (literal string or identifier) within an external service.
Level 2 Write - The ability to mint a new metadata value (literal string or identifier) within such a service.

We already see such "endpoints" evolving at the LoC and other leaders in the field of metadata standardization and classification.

> * API to the repository layer

We have an API to the repository layer. It's called the "DSpace API". I'm not sure what is meant otherwise. If you are referring to a pluggable layer that will allow one to implement one's own Bitstream Assetstore solution, this is currently a project that Richard Rodgers has been spearheading at MIT Libraries, and there is a prototype API available. The intention is to see it become part of DSpace shortly; the only question seems to be which version. If my understanding is correct, this API was also used as a critical enhancement to DSpace to support integration work with Fedora in the DSpace/Fedora GSoC project.

> * multiple instances of DSpace run from a single codebase

The Maven build system allows one to set up numerous separate configurations that reuse the same DSpace codebase across them all. Again, this request is too vague. Are you referring to sharing jars in a Tomcat server instance or JEE container?
That is something Graham Triggs has been working on, cleaning up the codebase to improve that capability.

> * componentization of DSpace

DSpace 1.5.x was our first major reorganization to support componentization of DSpace. Maven allows you to write separate module projects for DSpace and include them in your build process. This allows not only the separation of your customizations from the original core codebase, but also the multiple-instance setup described in the previous point.

> EMBARGOES
>
> Bram Luyten shared this video of their DSpace embargo function:
> <http://screencast.com/t/hinfBuq3fU> Elliot Metsger shared
> <http://wiki.dspace.org/index.php/User:Emetsger:Embargo>, and its FAQ
> at <http://maven.mse.jhu.edu/embargo/faq.html>
>
> A more nuanced implementation of OAI-PMH would be helpful to several
> chatters. There was general agreement that withdrawn and embargoed
> items should not export metadata via OAI-PMH. The ability to have
> OAI-PMH only disseminate items designated as "full-text" (or otherwise
> complete) was also desired.
>
> Embargoed items should not come up in browses or searches, of course,
> nor should they be crawlable by search engines. However, some items
> can be halfway-private: metadata can be available (including via
> OAI-PMH), but the files should not be downloadable. Access Control
> Lists were raised as one potential solution.

Those are certainly requirements of the Embargo project that Elliot Metsger has been working very hard at. I think the core developers unanimously embrace that as an important feature and regard the exposure as a serious issue that needs fixing.

I hope this summary of known projects assists the community in understanding where work is currently going on and what the overall "tack" of our community's informal development roadmap is. I do think this discussion has been very fruitful and gives the developers within the community a platform to clarify the work that they are doing.
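
For anyone who finds the quoted embargo requirements easier to read as code, here is a rough sketch of the visibility rules as I read them. It is only an illustration, not code from the Embargo project or from DSpace: the class EmbargoVisibility, the State values, and the Item record are all hypothetical names, and the rule that an embargo simply lapses to fully public on its end date is my own assumption.

```java
import java.time.LocalDate;

/** Illustrative only: none of these types exist in DSpace. */
public class EmbargoVisibility {

    /** Hypothetical states named after the cases in the quoted notes. */
    public enum State { PUBLIC, METADATA_ONLY, EMBARGOED, WITHDRAWN }

    /** Hypothetical item record: a state plus an optional embargo end date. */
    public static final class Item {
        final State state;
        final LocalDate embargoEnd; // null when no end date is set

        public Item(State state, LocalDate embargoEnd) {
            this.state = state;
            this.embargoEnd = embargoEnd;
        }
    }

    /** Assumption: an embargo lapses to PUBLIC on or after its end date. */
    public static State effectiveState(Item item, LocalDate today) {
        boolean lapsed = item.state == State.EMBARGOED
                && item.embargoEnd != null
                && !today.isBefore(item.embargoEnd);
        return lapsed ? State.PUBLIC : item.state;
    }

    /** Metadata exposure covers browse, search, crawlers and OAI-PMH alike. */
    public static boolean exposeMetadata(Item item, LocalDate today) {
        State s = effectiveState(item, today);
        return s == State.PUBLIC || s == State.METADATA_ONLY;
    }

    /** "Halfway-private" items show metadata but never allow file download. */
    public static boolean allowDownload(Item item, LocalDate today) {
        return effectiveState(item, today) == State.PUBLIC;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2008, 9, 12);
        Item embargoed = new Item(State.EMBARGOED, LocalDate.of(2009, 1, 1));
        Item halfway = new Item(State.METADATA_ONLY, null);
        System.out.println("embargoed, metadata exposed: " + exposeMetadata(embargoed, today));
        System.out.println("halfway, metadata exposed:   " + exposeMetadata(halfway, today));
        System.out.println("halfway, download allowed:   " + allowDownload(halfway, today));
    }
}
```

Keeping the metadata decision and the download decision as two separate checks is what makes the "halfway-private" case expressible; a real implementation would more likely derive both from resource policies (the Access Control Lists mentioned in the notes) than from a single state flag.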
Sincerely,
Mark Diggory

~~~~~~~~~~~~~
Mark R. Diggory - DSpace Developer and Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology
Home Page: http://purl.org/net/mdiggory/homepage
