In case you are interested, we have a few tools that automate the creation of the packager/ingest folders that Mark has described:
https://github.com/Georgetown-University-Libraries/File-Analyzer/wiki/DSpace-Institutional-Repository-Ingest

Once you have created these folders, the DSpace command line tools let you perform many batch operations.

Terry

On Tue, Feb 28, 2017 at 6:06 AM, Mark Wood <[email protected]> wrote:

> The spreadsheet is not necessary; lots of people use that method because
> they are familiar with spreadsheet tools.  The important thing is that you
> somehow create the archive format that DSpace will try to read.
>
> It shouldn't be too difficult to write a script that would, for example:
> walk a directory of PDFs, use pdftk to extract some minimal metadata, and
> lay out a tree of directories in Simple Archive Format that can be read
> with 'bin/dspace import'.  The content can be organized in any way you
> wish.  If you want each PDF to be a separate item, have your script create
> one subdirectory per PDF with one content file in it.
>
> DSpace does demand a few metadata fields.  I think you'll need to provide
> at least a title.  If necessary, you should be able to edit those later,
> but you'll need to provide *some* value.  You could just generate serial
> numbers for temporary titles: "Title 1", "Title 2"...; or use the filename
> as the title.
>
> If you'd rather use one of the Packager formats, again it really doesn't
> matter how you proceed so long as you end up with a format that the
> Packager can read.  Again, if you describe each content file within a
> separate item, that's what you'll get.
>
> --
> You received this message because you are subscribed to the Google Groups
> "DSpace Technical Support" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/dspace-tech.
> For more options, visit https://groups.google.com/d/optout.
--
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
http://georgetown-university-libraries.github.io/ <https://www.library.georgetown.edu/lit/code>
425-298-5498 (Seattle, WA)
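
For anyone who wants to try the script approach Mark describes, here is a rough sketch in Python. It is only an illustration under a few assumptions: it uses the filename as a temporary title rather than calling pdftk, the function name build_saf and the item_NNNN directory names are made up for this example, and it writes only the minimal Simple Archive Format pieces (a dublin_core.xml with a title, a contents file, and the PDF itself), one item directory per PDF.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def build_saf(pdf_dir, saf_dir):
    """Create one Simple Archive Format item directory per PDF under pdf_dir.

    Hypothetical helper for illustration; returns the item directories created.
    """
    pdf_dir, saf_dir = Path(pdf_dir), Path(saf_dir)
    saf_dir.mkdir(parents=True, exist_ok=True)

    # Walk the source tree once, up front, and keep only PDF files.
    pdfs = sorted(
        p for p in pdf_dir.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )

    items = []
    for n, pdf in enumerate(pdfs, start=1):
        item = saf_dir / f"item_{n:04d}"
        item.mkdir()  # fails if the item directory already exists

        # One content file per item.
        shutil.copy2(pdf, item / pdf.name)

        # 'contents' lists the content files, one filename per line.
        (item / "contents").write_text(pdf.name + "\n", encoding="utf-8")

        # Minimal metadata: the filename (without extension) as a temporary title.
        root = ET.Element("dublin_core")
        title = ET.SubElement(root, "dcvalue", element="title", qualifier="none")
        title.text = pdf.stem
        (item / "dublin_core.xml").write_bytes(ET.tostring(root, encoding="utf-8"))

        items.append(item)
    return items


if __name__ == "__main__":
    import sys

    for created in build_saf(sys.argv[1], sys.argv[2]):
        print(created)
```

Run as "python build_saf.py /path/to/pdfs /path/to/saf" (the script name is arbitrary), this should leave a tree that 'bin/dspace import' can be pointed at. The titles are placeholders and can be edited later in DSpace.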
