Hey, Tim. That's an excellent explanation, thanks! I now understand the trade-offs between AIP's -s and -r modes. Re-creating item mappings is indeed a tricky issue.
In the notes you added to the AIP docs you mention a possible strategy for dealing with this (#3). I will have to think about the problem a bit more and decide whether it's worth my time as a sysadmin, or if I should just create my ~10 collections manually, export/import all items using CSV, and tell my editors to suck it up and re-map them. We originally wanted to use AIP because a few dozen items have bitstreams, but now it's become a problem I've spent a few too many hours solving. ;)

Cheers,

Alan

On 01/22/2014 07:51 PM, Tim Donohue wrote:
> I realized these item mapping issues were not well documented. So, I've
> added a warning to the "Submitting an AIP Hierarchy" section of the AIP
> Backup & Restore docs:
>
> https://wiki.duraspace.org/display/DSDOC4x/AIP+Backup+and+Restore#AIPBackupandRestore-SubmittinganAIPHierarchy
>
> See the warning that says "Item Mappings may not be maintained when
> submitting an AIP hierarchy". There are a few possible workarounds noted
> there. Unfortunately, none are "perfect" at this time.
>
> On 1/22/2014 10:32 AM, Tim Donohue wrote:
>> Hi Alan,
>>
>> Because you are running the AIP import in "-s" mode, this acts as a *new
>> submission* and will assign *new handles* to Items, Collections and
>> Communities.
>>
>> The reason this is important to note is that in AIPs, *handles are the
>> unique identifier used to maintain relationships between objects*. Let's
>> repeat that: *In AIPs, handles are the unique identifier used to
>> maintain relationships between objects* :)
>>
>> What this means is the following:
>>
>> * Suppose you have a "DSpaceInstance#1" containing an Item with Handle
>> "1234/10" which is in a Collection with Handle "1234/2" and mapped to
>> another Collection with Handle "1234/5"
>> * When you export this Item to an AIP, it will generate an AIP named
>> "ITEM@1234/10.zip".
>> Inside this AIP (in a METS file), it will be recorded
>> that this Item is "owned" by a Collection with Handle "1234/2" and
>> mapped to a Collection with Handle "1234/5".
>> * When you *import* this Item's AIP into another DSpace
>> ("DSpaceInstance#2") using the "-s" option, here's what happens. By default,
>> "-s" will import the Item to whatever Collection you specify (i.e. it
>> ignores the "parent object handle" specified in the AIP). So, the Item
>> will end up under the Collection you expect.
>> * HOWEVER, Item Mappings are an entirely different issue. When it comes
>> to Item Mappings, DSpace will just map the Item to the Collection(s)
>> specified in the AIP, as unfortunately DSpace has no way to determine if
>> the Handle of the "mapped" Collections has changed or not. DSpace also
>> has no way to 100% verify that the Collection with Handle "1234/5" in
>> "DSpaceInstance#2" is the SAME AS the Collection with Handle "1234/5" in
>> "DSpaceInstance#1".
>>
>> So, the problem here may be that you are using the "-s" option to import
>> Communities/Collections. When using the -s option, DSpace is going to
>> assign a *brand new handle* to each Community/Collection during the
>> import process (unless you specify "-o ignoreHandle=false" to keep the
>> existing handle). Although DSpace will retain the hierarchy of newly
>> submitted Communities/Collections/Items (because "-o
>> ignoreParent=true" is the default), it may have difficulty in maintaining
>> the *Item Mappings* between collections (as mappings are always recorded
>> by Collection Handle, and Collection Handles may have changed when you
>> moved this content between DSpace instances).
>>
>> This is one of the big differences between "-r" (restore) and "-s"
>> (submit) modes. The former (-r) ensures that Handles are
>> maintained/restored (therefore item mappings & everything else will be
>> restored properly). The latter (-s) specifically assigns *new Handles*
>> to all objects.
>> This has the potential to cause issues with Item
>> Mapping, though a Community->Collection->Item hierarchy will work fine.
>>
>> Not sure if that helps, but I think this is what you are seeing. It's
>> essentially a "known issue", because unfortunately the only "unique
>> external identifier" DSpace has is Handles. Therefore, when an object's
>> Handle *changes*, attempting to maintain all mappings becomes extremely
>> complex.
>>
>> - Tim
>>
>>
>> On 1/22/2014 10:04 AM, Alan Orth wrote:
>>> Hi,
>>>
>>> I'm trying to migrate a community hierarchy between two different DSpace
>>> instances using AIP Import (in -s mode), and I'm seeing unpredictable
>>> behavior with mapped items.
>>>
>>> I've been trying to identify a pattern, but so far have only identified
>>> the following cases:
>>>
>>> * Some item views show only some of the collections they are mapped
>>> to, but if you navigate to another collection you can see it there
>>> * Some items are mapped to incorrect collections entirely
>>>
>>> Has anyone else noticed this? Both DSpaces are 3.1 with PostgreSQL 9.1,
>>> on Linux of course.
>>>
>>> Thanks,
>>>
>>> --
>>> Alan Orth
>>> [email protected]
>>> http://alaninkenya.org
>>> http://mjanja.co.ke
>>> "I have always wished for my computer to be as easy to use as my
>>> telephone; my wish has come true because I can no longer figure out
>>> how to use my telephone." -Bjarne Stroustrup, inventor of C++
>>> GPG Public Key: 0xf92c4bd91084bb5de14e20be9470dd588dd1026c
>>>
>>> _______________________________________________
>>> DSpace-tech mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>>> List Etiquette:
>>> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
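
The handle problem described in this thread can be illustrated with a small, self-contained Python sketch. This is a toy model only, not DSpace code: the `ItemAIP` class, the `submit_collections` and `remapped_mappings` helpers, and the assumption that the target instance hands out Handles sequentially starting at "1234/5" are all hypothetical. It shows how a mapping recorded by Handle can land on the wrong Collection after a "-s" style import, and how an old-to-new Handle table could in principle be used to re-map items afterwards.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ItemAIP:
    """Toy stand-in for an Item AIP: relationships are stored as Handles."""
    handle: str
    owner: str                                        # Handle of the owning Collection
    mapped: List[str] = field(default_factory=list)   # Handles of mapped Collections


def submit_collections(old_handles: List[str], prefix: str, start: int) -> Dict[str, str]:
    """Mimic a "-s" style import: every Collection gets a brand new Handle.

    Returns a table of old Handle -> new Handle. Sequential numbering is an
    assumption made for this illustration.
    """
    return {old: f"{prefix}/{start + i}" for i, old in enumerate(old_handles)}


def naive_mappings(aip: ItemAIP) -> List[str]:
    """Use the mapped Handles exactly as recorded in the AIP."""
    return list(aip.mapped)


def remapped_mappings(aip: ItemAIP, handle_map: Dict[str, str]) -> List[str]:
    """Hypothetical workaround: translate recorded Handles via the old -> new table."""
    return [handle_map[h] for h in aip.mapped if h in handle_map]


# Collections and Item as they exist in the source instance (from the example above).
source_collections = ["1234/2", "1234/5"]
item = ItemAIP(handle="1234/10", owner="1234/2", mapped=["1234/5"])

# Assume the target instance assigns new Handles starting at 1234/5.
handle_map = submit_collections(source_collections, prefix="1234", start=5)

naive = naive_mappings(item)                 # still points at old Handle "1234/5"
fixed = remapped_mappings(item, handle_map)  # points at the re-created Collection

print("old -> new:", handle_map)
print("naive mapping:", naive)
print("re-mapped:", fixed)
```

In this toy run the recorded Handle "1234/5" now belongs to the Collection that used to be "1234/2", so the naive mapping silently points at the wrong Collection, which matches the "mapped to incorrect collections entirely" symptom reported above. Keeping a table of old and new Handles during migration is one possible way to repair mappings afterwards.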

