Ah, thank you, helix. That's exactly what I needed. I hadn't noticed the command-line metadata export (I've only been using the web one).
After importing via AIP (to capture the community/collection hierarchy and bitstreams), I exported the metadata to CSV and manually reset dc.date.accessioned and dc.date.available to a sensible string, like 2014-01-21T18:58:46Z. A minor detail, but it will probably help some future travelers.

Cheers,

Alan

On 01/21/2014 12:13 AM, helix84 wrote:
> See the -a flag and the configuration option.
>
> https://wiki.duraspace.org/display/DSDOC4x/Batch+Metadata+Editing#BatchMetadataEditing-ExportFunction
>
> On Jan 20, 2014 8:25 PM, "Alan Orth" <[email protected]> wrote:
>> Hi,
>>
>> I've just decided I will export the metadata (CSV) and clean it up manually, then re-import before I export via AIP. This works great for the dc.identifier.uri (handle link), but I just realized that dc.date.accessioned and dc.date.available aren't in the exported metadata.
>>
>> I assume these fields are in the database, so I'll have to use SQL to clean them up after importing via AIP? I'm not sure where to look in the DB...
>>
>> Thanks,
>>
>> Alan
>>
>> On 01/17/2014 09:12 AM, Alan Orth wrote:
>>> Thanks, both Tim and Helix.
>>>
>>> Yes, I initially looked into the "-r" mode, but then realized that, as Tim mentioned, our development instance doesn't necessarily create proper handles. Our development instance is more of a code-testing ground, and we don't sync the content very frequently. Also, the date-related metadata isn't necessarily correct either, as the accession into the development instance (for quality assurance) isn't necessarily the accession date we'd want.
>>>
>>> I think I'll have to rely on a two-step approach: first ingesting via AIP to get the community/collection hierarchy and bitstreams, then metadata cleanup of the resulting community to clean the "old" URIs and accession dates etc.
>>>
>>> Thanks for bouncing some ideas around!
>>>
>>> Alan
>>>
>>> On 01/16/2014 06:39 PM, Tim Donohue wrote:
>>>> Hi Alan,
>>>>
>>>> On 1/16/2014 9:10 AM, Alan Orth wrote:
>>>>> Hi,
>>>>>
>>>>> I've got a development instance where we uploaded a few hundred items (in one community and several collections). Our editors spent some time manually uploading bitstreams to many of these items. Now I want to migrate the community and its hierarchy to the production instance. We can't use the CSV via "Export Metadata" because of the bitstreams, so I've been looking at using AIP, i.e.:
>>>>>
>>>>> dspace packager -s -a -t AIP -e [email protected] -p 10568/0 33474.zip
>>>>>
>>>>> This works great, but the resulting items now have two of each of the following fields:
>>>>> - dc.date.accessioned
>>>>> - dc.date.available
>>>>> - dc.identifier.uri
>>>>>
>>>>> I can't figure out a workflow that doesn't produce this effect...
>>>>
>>>> These three fields are unfortunately auto-generated by DSpace whenever you treat an AIP as a submission information package (SIP), which is what the -s option does. Essentially, the '-s' option assumes this is new content, so DSpace defines these fields as:
>>>> * dc.date.accessioned - the date this new content was added to DSpace
>>>> * dc.date.available - the date this new content became available in DSpace (i.e. finished approval workflow)
>>>> * dc.identifier.uri - the assigned Handle for this object
>>>>
>>>> For your situation, you may need to consider some metadata related questions.
>>>>
>>>> * Does your development instance assign proper Handles? If not, then you *need* Production to assign a new dc.identifier.uri. This may mean that you'll have to unfortunately do some post-metadata cleanup (perhaps via the Bulk Metadata Editor) of the invalid "development" handles in the dc.identifier.uri fields. DSpace never overwrites or removes existing metadata.
>>>>
>>>> * Do you want the "date.accessioned" and "date.available" fields to be set to the dates the Item was added to *development* or to *production*? If the latter, again, you may unfortunately need to do some post-metadata cleanup, as DSpace specifically *never* removes/overwrites existing metadata fields.
>>>>
>>>> Depending on your setup/answers to your questions, there are three possible AIP import options I can see:
>>>>
>>>> 1a) Use "Restore/Replace" option instead (-r) when migrating to Production.
>>>>
>>>> If you treat this as an AIP "restoration" then DSpace will skip creating "date.accessioned", "date.available" and "identifier.uri" fields and assume that the provided values in the AIPs are correct (as it assumes you are restoring a set of deleted objects). WARNING: If the 'dc.identifier.uri' in the AIP does NOT correspond to a valid Handle, then you will end up with invalid Handles in Production! (See next option.)
>>>>
>>>> More on Restore/Replace:
>>>> https://wiki.duraspace.org/display/DSDOC4x/AIP+Backup+and+Restore#AIPBackupandRestore-Restoring/ReplacingusingAIP(s)
>>>>
>>>> 1b) When using "Restore/Replace", you may want/need to override some of the default options. For example, restoration will always assume the 'dc.identifier.uri' is a valid Handle (so a new Handle will not be assigned). Restoration will also always attempt to restore an object under the *specified* parent object in the AIP -- so, this means if a Collection was under a Community with ID "123456789/1" in your development instance, then it will be restored under a Community of the *same ID* in Production.
>>>>
>>>> Luckily, these defaults can be overridden.
>>>> See the 'ignoreHandle' and 'ignoreParent' Advanced options documented here:
>>>>
>>>> https://wiki.duraspace.org/display/DSDOC4x/AIP+Backup+and+Restore#AIPBackupandRestore-AdditionalPackagerOptions
>>>>
>>>> 2) The other option is to still use Submission (-s) option, but use one or more of the Advanced options (in 1b) to tweak the defaults.
>>>>
>>>> I know this is a lot of info, but hopefully it gives you some ideas to go on.
>>>>
>>>> - Tim
>>
>> --
>> Alan Orth
>> [email protected]
>> http://alaninkenya.org
>> http://mjanja.co.ke
>> "I have always wished for my computer to be as easy to use as my telephone; my wish has come true because I can no longer figure out how to use my telephone." -Bjarne Stroustrup, inventor of C++
>> GPG Public Key: 0xf92c4bd91084bb5de14e20be9470dd588dd1026c
>>
>> ------------------------------------------------------------------------------
>> CenturyLink Cloud: The Leader in Enterprise Cloud Services.
>> Learn Why More Businesses Are Choosing CenturyLink Cloud For Critical Workloads, Development Environments & Everything In Between.
>> Get a Quote or Start a Free Trial Today.
>> http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
>> _______________________________________________
>> DSpace-tech mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>> List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

--
Alan Orth
[email protected]
http://alaninkenya.org
http://mjanja.co.ke
"I have always wished for my computer to be as easy to use as my telephone; my wish has come true because I can no longer figure out how to use my telephone." -Bjarne Stroustrup, inventor of C++
GPG Public Key: 0xf92c4bd91084bb5de14e20be9470dd588dd1026c
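
P.S. For anyone who would rather script the manual date reset described at the top of this message, here is a minimal Python sketch, not a tested DSpace tool. It assumes the CSV came from the metadata export with the -a flag (so the date columns are present), that a column name may carry a language suffix such as [en_US], and that duplicated values in one cell are joined with "||". The function name reset_dates and the sample row are made up for illustration.

```python
import csv
import io

# The two date fields mentioned in this thread.
DATE_FIELDS = ("dc.date.accessioned", "dc.date.available")


def reset_dates(csv_text, timestamp):
    """Return csv_text with every non-empty date cell set to timestamp.

    Hypothetical helper: a cell holding several values joined by "||"
    is collapsed to the single timestamp.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    header = rows[0]
    # Match "dc.date.available" as well as "dc.date.available[en_US]".
    targets = [i for i, name in enumerate(header)
               if name.split("[")[0] in DATE_FIELDS]
    for row in rows[1:]:
        for i in targets:
            # Leave empty cells alone so unused language variants stay empty.
            if i < len(row) and row[i]:
                row[i] = timestamp
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Made-up sample row showing the doubled dates after an AIP import.
    sample = (
        "id,collection,dc.date.accessioned,dc.date.available[en_US],dc.title\n"
        "1,10568/1,2013-11-01T10:00:00Z||2014-01-20T09:00:00Z,"
        "2013-11-01T10:00:00Z||2014-01-20T09:00:00Z,Example\n"
    )
    print(reset_dates(sample, "2014-01-21T18:58:46Z"))
```

The cleaned file would then go back in through the batch metadata import; check the changes it reports before applying them.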
_______________________________________________ DSpace-tech mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/dspace-tech List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

