Also, to give you an actual solution that'll work right now, here's what
I've done in the past (when forgetting to concatenate PDFs properly and
apply coverpages!):

1. Use the DSpace CLI "export" tool to export all items in the collection
you're working on:
"[dspace]/bin/export -d /path/to/export/dir -i 12345/1 -n 0 -t COLLECTION",
where 12345/1 is the handle of your collection, /path/to/export/dir is
the directory where you want your exported items to be stored, and -n 0 is
the sequence number the exported item directories start from.
(In DSpace 1.6 and later, you can use "[dspace]/bin/dspace export" instead
of "[dspace]/bin/export".)

2. Back the export directory up, just in case :)

3. Using a tool like PDFTK (http://www.accesspdf.com/pdftk/), write a Bash
or Perl script (or Windows batch file) to loop through directories and
concatenate the coverpage and original PDF into a new PDF, and delete the
old one.
The exact script you want will vary, but I can give some Linux examples if
you're interested. Note that the concatenation will fail on PDFs that are
secured.

4. Generate a map file associating the sequence numbers of your exported
items (the numbered directories in your export directory) with their
handles. This would be a lot easier if we could tell the dspace export CLI
app to generate a mapfile for us, but unfortunately that hasn't quite made
it into DSpace yet. I can offer some script examples here, too.

5. Use the generated mapfile to do a replace (run "[dspace]/bin/import -h"
to see usage) of all the exported items back into the collection you're
working in.
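
For Step 4, the mapfile can be built from the export directory itself. The
sketch below is in Python rather than Bash or Perl, and it rests on an
assumption you should check against your own export: that the exporter
writes a one-line file called "handle" into each numbered item directory.
Each mapfile line is the item directory name, a space, then the handle. The
function names are made up for illustration. The last function only
assembles the Step 5 command and does not run it; the flags are the ones I'd
expect "[dspace]/bin/import -h" to list, so compare before using them, and
the e-person and paths are placeholders.

```python
import sys
from pathlib import Path


def build_mapfile_lines(export_dir):
    """Return 'item_dir handle' lines, one per numbered item directory."""
    item_dirs = [p for p in Path(export_dir).iterdir()
                 if p.is_dir() and p.name.isdigit()]
    lines = []
    for item_dir in sorted(item_dirs, key=lambda p: int(p.name)):
        handle_file = item_dir / "handle"  # assumption: written by the exporter
        if not handle_file.is_file():
            # Skip rather than guess a handle for this item.
            print(f"no handle file in {item_dir}, skipping", file=sys.stderr)
            continue
        handle = handle_file.read_text().strip()
        if handle:
            lines.append(f"{item_dir.name} {handle}")
    return lines


def write_mapfile(export_dir, mapfile_path):
    """Write the mapfile and return the number of items listed in it."""
    lines = build_mapfile_lines(export_dir)
    Path(mapfile_path).write_text("".join(line + "\n" for line in lines))
    return len(lines)


def replace_command(dspace_dir, eperson, collection, source_dir, mapfile,
                    dry_run=True):
    """Assemble (but do not run) the import call for a replace."""
    cmd = [f"{dspace_dir}/bin/import", "-r",
           "-e", eperson,      # e-person performing the replace
           "-c", collection,   # collection handle, e.g. 12345/1
           "-s", source_dir,   # the (modified) export directory
           "-m", mapfile]
    if dry_run:
        cmd.append("-t")       # assumption: -t is the importer's test mode
    return cmd
```

Calling write_mapfile("/path/to/export/dir", "/path/to/mapfile") writes one
line per exported item. Items without a handle file are reported and left
out, so compare the returned count with the number of item directories
before going on to Step 5.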

This approach doesn't actually generate the coverpages; it assumes you have
a single PDF that you just want to attach to the front of your theses. It
also won't preserve the originals unless you change Step 3 a little: keep
the old file, add a new entry for the combined PDF to the item's "contents"
file, and change the bundle of the original to something other than
"ORIGINAL" or "CONTENT".
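
Step 3, in the variant that keeps the originals, could be sketched roughly
like this in Python. It assumes pdftk is on your PATH, that each item
directory has a "contents" file with tab-separated lines such as
"thesis.pdf<TAB>bundle:ORIGINAL", and that a line with no bundle field means
ORIGINAL. The "covered_" prefix, the "PRESERVATION" bundle name and the
function names are all made up for illustration. The "run" parameter exists
only so the pdftk call can be swapped out when testing.

```python
import subprocess
from pathlib import Path

COVER_PREFIX = "covered_"             # hypothetical naming scheme for new PDFs
PRESERVATION_BUNDLE = "PRESERVATION"  # hypothetical bundle for the originals


def pdftk_cat_command(cover, original, output):
    """pdftk call that writes cover followed by original into output."""
    return ["pdftk", str(cover), str(original), "cat", "output", str(output)]


def bundle_of(fields):
    """Bundle named on a contents line; assumed to default to ORIGINAL."""
    for field in fields[1:]:
        if field.startswith("bundle:"):
            return field[len("bundle:"):]
    return "ORIGINAL"


def rewrite_contents(lines, original_name, new_name):
    """List new_name in ORIGINAL and move original_name to the other bundle."""
    result = []
    for line in lines:
        fields = line.split("\t")
        if fields[0] == original_name and bundle_of(fields) == "ORIGINAL":
            extras = [f for f in fields[1:] if not f.startswith("bundle:")]
            result.append(f"{new_name}\tbundle:ORIGINAL")
            result.append("\t".join(
                [original_name, f"bundle:{PRESERVATION_BUNDLE}"] + extras))
        else:
            result.append(line)
    return result


def add_coverpages(export_dir, cover_pdf, run=subprocess.run):
    """Prepend cover_pdf to every ORIGINAL PDF; return the PDFs pdftk rejected."""
    failed = []
    for contents in sorted(Path(export_dir).glob("*/contents")):
        item_dir = contents.parent
        lines = contents.read_text().splitlines()
        for line in list(lines):  # snapshot: lines is rebuilt inside the loop
            fields = line.split("\t")
            name = fields[0]
            if not name.lower().endswith(".pdf") or bundle_of(fields) != "ORIGINAL":
                continue
            target = item_dir / (COVER_PREFIX + name)
            cmd = pdftk_cat_command(cover_pdf, item_dir / name, target)
            if run(cmd).returncode != 0:
                # Typically a secured PDF: leave this file as exported.
                if target.exists():
                    target.unlink()
                failed.append(item_dir / name)
                continue
            lines = rewrite_contents(lines, name, target.name)
        contents.write_text("".join(entry + "\n" for entry in lines))
    return failed
```

Calling add_coverpages("/path/to/export/dir", "/path/to/coverpage.pdf")
returns the PDFs that pdftk refused, which you'd then deal with by hand.
Running it twice on the same directory would attach the coverpage twice, so
start each attempt from a fresh copy of the backup made in Step 2.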

If you want to give this a try, I'd suggest testing everything out on a
small collection on a test server first, and performing backups at each step
of the way, because there are a lot of failure opportunities when working on
a whole set of items.

Cheers,

Kim.

On 21 July 2010 13:02, Kim Shepherd <kim.sheph...@gmail.com> wrote:

> Hi Andrew,
>
> I've had a few informal discussions with people around this as well, and
> one approach we hit on was to create a new media-filter plugin to create
> "access versions" of PDFs, with dynamically-generated coverpages attached.
>
> The "media filter" plugins in DSpace are responsible for things like
> extraction of full text from PDFs (creating the .pdf.txt bitstreams),
> creating thumbnail images (e.g. the .jpg.jpg bitstreams), and so on.
>
> Advantages to this approach:
>
> - The plugin would run regularly with the other media-filter plugins, so
> there is no manual effort on the part of repository administrators
>
> - The plugin would create a new version of the PDF for public access, so
> you can still keep the original archived as a preservation copy rather than
> manipulating the only copy in DSpace (and upsetting the checksum checker, as
> helix mentioned in their reply)
>
> - The items being filtered are already archived, so information like handle
> URI can be used in the coverpage.
>
> I believe something like this might already be used by one or two DSpace
> repositories but I'm not 100% sure so I don't want to name any names yet
> ;-).
>
> It's definitely something that's on my radar. If I come up with a demo
> plugin I'll be sure to share it with the community.
>
> Cheers!
>
> Kim
>
>
>
> On 21 July 2010 09:48, White, Andrew <andrew.wh...@lincoln.ac.nz> wrote:
>
>> Can anyone suggest a way to bulk add 1 page to every PDF bitstream in a
>> particular collection, perhaps using export/import or perhaps SWORD?
>>
>>
>>
>> We want to add a coversheet to all existing theses in our DSpace archive,
>> without having to do it one-by-one.
>>
>>
>>
>> Thanks
>>
>> *Andrew White*
>> *Information Technology Librarian*
>>
>> *George Forbes Memorial Library*
>> *PO Box 64*
>> *Lincoln University*
>> *Lincoln 7647*
>> *Christchurch, New Zealand*
>>
>> *p* +64 3 321 8542 | *f* +64 3 325 2944
>> *e* andrew.wh...@lincoln.ac.nz | *w* library.lincoln.ac.nz
>>
>> *Lincoln University, Te Whare Wanaka o Aoraki*
>> *New Zealand's Specialist Land Based University*
>>
>> ------------------------------------------------------------------------------
>> This SF.net email is sponsored by Sprint
>> What will you do first with EVO, the first 4G phone?
>> Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first
>> _______________________________________________
>> DSpace-tech mailing list
>> DSpace-tech@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>>
>>
>