Hey, Tim.

That's an excellent explanation, thanks!  I now understand the
trade-offs between AIP in -s and -r mode.  Re-creating item mappings is
indeed a tricky issue.

In the notes you added to the AIP docs you mention a possible strategy
for dealing with this (#3).  I will have to think about the problem a
bit more and decide whether it's worth my time or not as a sysadmin, or
if I should just create my ~10 collections manually and then
export/import all items using CSV and tell my editors to suck it up and
re-map them.
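
For the CSV route, the re-mapping step could in principle be scripted
rather than done by hand. A rough sketch only, assuming the exported
metadata CSV keeps collection handles in a "collection" column separated
by "||" (owning collection first, mapped collections after it). The
function name, the handle table and the sample row are all hypothetical:

```python
import csv
import io


def remap_collections(csv_text, handle_map):
    """Rewrite old collection handles in the "collection" column.

    Assumption: multiple handles in one cell are separated by "||".
    Handles missing from handle_map are left unchanged.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        handles = row["collection"].split("||")
        row["collection"] = "||".join(handle_map.get(h, h) for h in handles)
        writer.writerow(row)
    return out.getvalue()


# Hypothetical old -> new collection handles, built by hand after
# creating the collections on the new instance.
old_to_new = {"1234/2": "5678/20", "1234/5": "5678/21"}
exported = "id,collection,dc.title\n10,1234/2||1234/5,Example item\n"
print(remap_collections(exported, old_to_new))
```

The old-to-new table still has to be assembled manually, which for ~10
collections is a small job.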

We originally wanted to use AIP because a few dozen items have
bitstreams, but now it's become a problem I've spent a few-too-many
hours solving. ;)

Cheers,

Alan

On 01/22/2014 07:51 PM, Tim Donohue wrote:
> I realized these item mapping issues were not well documented. So, I've
> added a warning to the "Submitting an AIP Hierarchy" section of the AIP
> Backup & Restore docs:
> 
> https://wiki.duraspace.org/display/DSDOC4x/AIP+Backup+and+Restore#AIPBackupandRestore-SubmittinganAIPHierarchy
> 
> 
> See the warning that says "Item Mappings may not be maintained when
> submitting an AIP hierarchy". There are a few possible workarounds noted
> there. Unfortunately, none are "perfect" at this time.
> 
> On 1/22/2014 10:32 AM, Tim Donohue wrote:
>> Hi Alan,
>>
>> Because you are running the AIP import in "-s" mode, this acts as a *new
>> submission* and will assign *new handles* to Items, Collections and
>> Communities.
>>
>> The reason this is important to note is that in AIPs, *handles are the
>> unique identifier used to maintain relationships between objects*. Let's
>> repeat that: *In AIPs, handles are the unique identifier used to
>> maintain relationships between objects* :)
>>
>> What this means is the following:
>>
>> * Suppose you have a "DSpaceInstance#1" containing an Item with Handle
>> "1234/10" which is in a Collection with Handle "1234/2" and mapped to
>> another Collection with Handle "1234/5"
>> * When you export this Item to an AIP, it will generate an AIP named
>> "ITEM@1234/10.zip". Inside this AIP (in a METS file) it will be recorded
>> that this Item is "owned" by a Collection with Handle "1234/2" and
>> mapped to a Collection with Handle "1234/5".
>> * When you *import* this Item's AIP into another DSpace
>> ("DSpaceInstance#2") using "-s" option, here's what happens. By default,
>> "-s" will import the Item to whatever Collection you specify (i.e. it
>> ignores the "parent object handle" specified in the AIP). So, the Item
>> will end up under the Collection you expect.
>> * HOWEVER, Item Mappings are an entirely different issue. When it comes
>> to Item Mappings, DSpace will just map the Item to the Collection(s)
>> specified in the AIP, as unfortunately DSpace has no way to determine if
>> the Handle of the "mapped" Collections has changed or not. DSpace also
>> has no way to 100% verify that the Collection with Handle "1234/5" in
>> "DSpaceInstance#2" is the SAME AS the Collection with Handle "1234/5" in
>> "DSpaceInstance#1".
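
A toy model of the behavior described in the bullets above, not DSpace
code: a plain dict stands in for the target instance, and the function
and collection names are made up. It only illustrates why looking up
mappings by handle can land an item in the wrong collection once
handles have changed:

```python
def submit_item(aip, collections, parent_handle):
    """Toy "-s" import: return (owner name, mapped names) on the target.

    The owner comes from the parent given at import time, not from the AIP.
    Mappings are resolved purely by the handles recorded in the AIP.
    Assumption: handles unknown to the target are silently skipped here.
    """
    owner = collections[parent_handle]
    mapped = [collections[h] for h in aip["mapped_to"] if h in collections]
    return owner, mapped


# The AIP exported from instance #1, as in the example above.
aip = {"handle": "1234/10", "owned_by": "1234/2", "mapped_to": ["1234/5"]}

# Instance #2: the imported collections received new handles, while
# "1234/5" happens to belong to an unrelated collection.
instance2 = {
    "1234/20": "Reports",
    "1234/21": "Maize",
    "1234/5": "Theses",
}

owner, mapped = submit_item(aip, instance2, parent_handle="1234/20")
print(owner, mapped)  # Reports ['Theses']
```

The owner is right because it was given explicitly; the mapping is wrong
because "1234/5" no longer means the same collection.
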
>>
>> So, the problem here may be that you are using the "-s" option to import
>> Communities/Collections. When using the -s option, DSpace is going to
>> assign a *brand new handle* to each Community/Collection during the
>> import process (unless you specify "-o ignoreHandle=false" to keep the
>> existing handle).  Although DSpace will retain the hierarchy of newly
>> submitted Communities/Collections/Items (because "-o
>> ignoreParent=true" is the default), it may have difficulty in maintaining
>> the *Item Mappings* between collections (as mappings are always recorded
>> by Collection Handle, and Collection Handles may have changed when you
>> moved this content between DSpace instances).
>>
>> This is one of the big differences between "-r" (restore) and "-s"
>> (submit) modes. The former (-r) ensures that Handles are
>> maintained/restored (therefore item mappings & everything else will be
>> restored properly).  The latter (-s) specifically assigns *new Handles*
>> to all objects. This has the potential to cause issues with Item
>> Mapping, though a Community->Collection->Item hierarchy will work fine.
>>
>> Not sure if that helps, but I think this is what you are seeing.  It's
>> essentially a "known issue", because unfortunately the only "unique
>> external identifier" DSpace has is Handles. Therefore, when an object's
>> Handle *changes*, attempting to maintain all mappings becomes extremely
>> complex.
>>
>> - Tim
>>
>>
>> On 1/22/2014 10:04 AM, Alan Orth wrote:
>>> Hi,
>>>
>>> I'm trying to migrate a community hierarchy between two different DSpace
>>> instances using AIP Import (in -s mode), and I'm seeing unpredictable
>>> behavior with mapped items.
>>>
>>> I've been trying to identify a pattern, but so far have only identified
>>> the following cases:
>>>
>>>   * Some item views show only some of the collections they are mapped
>>>     to, but if you navigate to another collection you can see it there
>>>   * Some items are mapped to incorrect collections entirely
>>>
>>> Has anyone else noticed this? Both DSpaces are 3.1 with PostgreSQL 9.1,
>>> on Linux of course.
>>>
>>> Thanks,
>>>
>>> -- 
>>> Alan Orth
>>> [email protected]
>>> http://alaninkenya.org
>>> http://mjanja.co.ke
>>> "I have always wished for my computer to be as easy to use as my
>>> telephone; my wish has come true because I can no longer figure out
>>> how to use my telephone." -Bjarne Stroustrup, inventor of C++
>>> GPG Public Key: 0xf92c4bd91084bb5de14e20be9470dd588dd1026c
>>>
>>> _______________________________________________
>>> DSpace-tech mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>>> List Etiquette:
>>> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
>>>

-- 
Alan Orth
[email protected]
http://alaninkenya.org
http://mjanja.co.ke
"I have always wished for my computer to be as easy to use as my
telephone; my wish has come true because I can no longer figure out how
to use my telephone." -Bjarne Stroustrup, inventor of C++
GPG Public Key: 0xf92c4bd91084bb5de14e20be9470dd588dd1026c
