Thanks, both Tim and Helix. Yes, I initially looked into the "-r" mode, but then realized that, as Tim mentioned, our development instance doesn't necessarily create proper handles. Our development instance is more of a code-testing ground, and we don't sync the content very frequently. The date-related metadata isn't necessarily correct either, as the date of accession into the development instance (for quality assurance) isn't necessarily the accession date we'd want.

I think I'll have to rely on a two-step approach: first ingesting via AIP to get the community/collection hierarchy and bitstreams, then cleaning up the metadata of the resulting community to remove the "old" URIs, accession dates, etc.

Thanks for bouncing some ideas around!

Alan

On 01/16/2014 06:39 PM, Tim Donohue wrote:
> Hi Alan,
>
> On 1/16/2014 9:10 AM, Alan Orth wrote:
>> Hi,
>>
>> I've got a development instance where we uploaded a few hundred items
>> (in one community and several collections). Our editors spent some time
>> manually uploading bit streams to many of these items. Now I want to
>> migrate the community and its hierarchy to the production instance. We
>> can't use the CSV via "Export Metadata" because of the bit streams, so
>> I've been looking at using AIP, ie:
>>
>> dspace packager -s -a -t AIP -e [email protected] -p 10568/0 33474.zip
>>
>> This works great, but the resulting items now have two of each of the
>> following fields:
>> - dc.date.accessioned
>> - dc.date.available
>> - dc.identifier.uri
>>
>> I can't figure out a work flow that doesn't produce this effect...
>
> These three fields are unfortunately auto-generated by DSpace whenever
> you treat an AIP as a submission information package (SIP), which is
> what the -s option does. Essentially, the '-s' option assumes this is
> new content, so DSpace defines these fields as:
> * dc.date.accessioned - the date this new content was added to DSpace
> * dc.date.available - the date this new content became available
>   in DSpace (i.e. finished approval workflow)
> * dc.identifier.uri - the assigned Handle for this object
>
> For your situation, you may need to consider some metadata related
> questions.
>
> * Does your development instance assign proper Handles? If not, then
>   you *need* Production to assign a new dc.identifier.uri. This may
>   mean that you'll have to unfortunately do some post-metadata cleanup
>   (perhaps via the Bulk Metadata Editor) of the invalid "development"
>   handles in the dc.identifier.uri fields. DSpace never overwrites or
>   removes existing metadata.
>
> * Do you want the "date.accessioned" and "date.available" fields to be
>   set to the dates the Item was added to *development* or to
>   *production*? If the latter, again, you may unfortunately need to do
>   some post-metadata cleanup, as DSpace specifically *never*
>   removes/overwrites existing metadata fields.
>
>
> Depending on your setup/answers to your questions, there are three
> possible AIP import options I can see:
>
> 1a) Use "Restore/Replace" option instead (-r) when migrating to
> Production.
>
> If you treat this as an AIP "restoration" then DSpace will skip
> creating "date.accessioned", "date.available" and "identifier.uri"
> fields and assume that the provided values in the AIPs are correct (as
> it assumes you are restoring a set of deleted objects). WARNING: If
> the 'dc.identifier.uri' in the AIP does NOT correspond to a valid
> Handle, then you will end up with invalid Handles in Production! (See
> next option.)
>
> More on Restore/Replace:
> https://wiki.duraspace.org/display/DSDOC4x/AIP+Backup+and+Restore#AIPBackupandRestore-Restoring/ReplacingusingAIP(s)
>
>
> 1b) When using "Restore/Replace", you may want/need to override some
> of the default options. For example, restoration will always assume
> the 'dc.identifier.uri' is a valid Handle (so a new Handle will not be
> assigned). Restoration will also always attempt to restore an object
> under the *specified* parent object in the AIP -- so, this means if a
> Collection was under a Community with ID "123456789/1" in your
> development instance, then it will be restored under a Community of
> the *same ID* in Production.
>
> Luckily, these defaults can be overridden. See the 'ignoreHandle' and
> 'ignoreParent' Advanced options documented here:
>
> https://wiki.duraspace.org/display/DSDOC4x/AIP+Backup+and+Restore#AIPBackupandRestore-AdditionalPackagerOptions
>
>
> 2) The other option is to still use Submission (-s) option, but use
> one or more of the Advanced options (in 1b) to tweak the defaults.
>
> I know this is a lot of info, but hopefully it gives you some ideas to
> go on.
>
> - Tim

-- 
Alan Orth
[email protected]
http://alaninkenya.org
http://mjanja.co.ke
"I have always wished for my computer to be as easy to use as my telephone; my wish has come true because I can no longer figure out how to use my telephone." -Bjarne Stroustrup, inventor of C++
GPG Public Key: 0xf92c4bd91084bb5de14e20be9470dd588dd1026c
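
The metadata cleanup step of the two-step approach described above could look roughly like the sketch below. It is only an illustration under several assumptions: that the community's metadata has been exported to CSV with the dc.identifier.uri, dc.date.accessioned and dc.date.available columns included, that multiple values in a cell are separated by "||", that the production handle prefix is 10568 (taken from the packager command above), and that the production-assigned dates are always the later ones. The function names are hypothetical and not part of DSpace.

```python
import csv
import io

# "||" separates multiple values in one cell of a DSpace metadata CSV.
VALUE_SEPARATOR = "||"
URI_FIELD = "dc.identifier.uri"
DATE_FIELDS = {"dc.date.accessioned", "dc.date.available"}


def base_field(header):
    """Drop a language qualifier such as "[en_US]" from a column header."""
    return header.split("[", 1)[0].strip()


def split_values(cell):
    """Split a multi-valued cell, ignoring empty entries."""
    return [v for v in (cell or "").split(VALUE_SEPARATOR) if v]


def keep_production_uris(cell, production_prefix):
    """Keep only URIs under the production handle prefix.

    If no value matches, the cell is returned unchanged so that an item
    never loses all of its identifiers.
    """
    kept = [v for v in split_values(cell) if f"/{production_prefix}/" in v]
    return VALUE_SEPARATOR.join(kept) if kept else cell


def keep_latest_date(cell):
    """Keep the latest timestamp in the cell.

    Assumption: all values are ISO 8601 UTC strings in the same format,
    so plain string comparison orders them chronologically.
    """
    values = split_values(cell)
    return max(values) if values else cell


def clean_row(row, production_prefix):
    """Return a copy of one CSV row with the duplicated fields reduced."""
    cleaned = {}
    for header, cell in row.items():
        field = base_field(header)
        if field == URI_FIELD:
            cell = keep_production_uris(cell, production_prefix)
        elif field in DATE_FIELDS:
            cell = keep_latest_date(cell)
        cleaned[header] = cell
    return cleaned


def clean_csv(text, production_prefix):
    """Clean every row of an exported metadata CSV given as a string."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        writer.writerow(clean_row(row, production_prefix))
    return out.getvalue()
```

The cleaned CSV would then be reviewed by hand and re-imported through the Bulk Metadata Editor. Leaving a cell untouched when no URI matches the production prefix is a deliberate choice, so that a wrong prefix cannot silently wipe identifiers.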
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

