The spreadsheet is not necessary; lots of people use that method because they are familiar with spreadsheet tools. The important thing is that you end up, by whatever means, with the archive format that DSpace expects to read.
It shouldn't be too difficult to write a script that would, for example, walk a directory of PDFs, use pdftk to extract some minimal metadata, and lay out a tree of directories in Simple Archive Format that can be read with 'bin/dspace import'.

The content can be organized in any way you wish. If you want each PDF to be a separate item, have your script create one subdirectory per PDF with one content file in it.

DSpace does demand a few metadata fields. I think you'll need to provide at least a title. You should be able to edit those values later if necessary, but you'll need to provide *some* value up front. You could just generate serial numbers for temporary titles ("Title 1", "Title 2", ...) or use the filename as the title.

If you'd rather use one of the Packager formats, again it really doesn't matter how you proceed so long as you end up with a format that the Packager can read. Again, if you describe each content file within a separate item, that's what you'll get.
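To make the Simple Archive Format idea concrete, here is a minimal Python sketch of such a script. It is an illustration under a few assumptions, not a tested importer: the function name build_saf and the item_0001-style directory names are my own inventions, it uses the filename as the title instead of calling pdftk, and it writes only a single title field. The layout (one directory per item holding the content file, a "contents" file listing it, and a dublin_core.xml) follows the Simple Archive Format as I understand it, so check it against the documentation for your DSpace version before importing anything real.

```python
import shutil
from pathlib import Path
from xml.sax.saxutils import escape


def build_saf(pdf_dir, out_dir):
    """Create one Simple Archive Format item directory per PDF found under pdf_dir.

    Returns the number of item directories written.
    """
    pdf_dir, out_dir = Path(pdf_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Collect the PDFs first so the listing is stable while we copy files.
    pdfs = sorted(
        p for p in pdf_dir.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )

    for n, pdf in enumerate(pdfs, start=1):
        # Hypothetical naming scheme; any unique directory name should do.
        item_dir = out_dir / f"item_{n:04d}"
        item_dir.mkdir()  # fails rather than overwrite an existing item

        # One content file per item.
        shutil.copy2(pdf, item_dir / pdf.name)

        # "contents" lists the item's content files, one per line.
        (item_dir / "contents").write_text(pdf.name + "\n", encoding="utf-8")

        # Assumption: the filename (without extension) serves as a
        # temporary title that can be edited in DSpace later.
        title = escape(pdf.stem)
        (item_dir / "dublin_core.xml").write_text(
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            "<dublin_core>\n"
            f'  <dcvalue element="title" qualifier="none">{title}</dcvalue>\n'
            "</dublin_core>\n",
            encoding="utf-8",
        )

    return len(pdfs)
```

Calling build_saf("/path/to/pdfs", "/path/to/saf") (both paths are placeholders) should leave a directory tree under the second path that you can point 'bin/dspace import' at. Keep the output directory outside the PDF directory so that a second run does not pick up the copies.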
