On Mon, Apr 7, 2008 at 8:30 AM, Esha Datta <[EMAIL PROTECTED]> wrote:
> NYU is looking at e-publishing in general and how it ties in with
> preservation requirements. Have any of you done any work with PDF/A
> and generating access files from that format? We have a number of
> books that will be converted to the pdf format. We're looking at
> PDF/A for ingestion into our preservation repository (a DSpace
> instance) and generating access files from it. How easy/difficult
> was it to generate a workflow for working with PDFs, generating
> PDF/As, and access files from PDF/As.

One more vote for OJS, here -- we're running the Journal of Insect
Science[1] with it, very successfully.

With respect to generating HTML and PDFs, my understanding (it's a
little fuzzy) is that we have manuscripts converted to XML by a third
party, and then use a combination of XSLT and Prince to generate
professional-quality documents. Prince isn't cheap, but man, is it
good at what it does. If you were gonna start at this again, you might
be able to build a wrapper around Gecko or WebKit to do the work...
but that'd take time.

IIRC, it's all pretty cheap (dunno if I can disclose our XML
processing rate -- suffice to say, it's cheaper than undergrads), and
takes somewhere around 3-4 hours per article. I think there's fairly
good potential for economies of scale, were we to add more titles.

A great person to talk to is Andrew Gough <[EMAIL PROTECTED]> -- he
developed most of the workflow and procedures we use at Madison (I've
copied him here, in case this message contains gross inaccuracies).

Cheers,

-Nate

[1]: http://insectscience.org/
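
For illustration only, the XML -> XSLT -> Prince pipeline mentioned above might look roughly like the sketch below. This is a guess at the general shape of such a workflow, not the actual Madison setup: it assumes xsltproc as the XSLT processor (the message doesn't say which one is used), and the file names and the build_commands/convert helpers are hypothetical.

```python
import subprocess
from pathlib import Path


def build_commands(xml_path, xsl_path, out_dir):
    """Return the two command lines for one article: XSLT step, then Prince.

    Assumes xsltproc and prince are on PATH; both names are assumptions
    about the toolchain, not details taken from the message.
    """
    xml_path, out_dir = Path(xml_path), Path(out_dir)
    html_path = out_dir / (xml_path.stem + ".html")
    pdf_path = out_dir / (xml_path.stem + ".pdf")
    # Step 1: transform the vendor-supplied XML into HTML with a stylesheet.
    xslt_cmd = ["xsltproc", "-o", str(html_path), str(xsl_path), str(xml_path)]
    # Step 2: typeset that HTML (plus its CSS) into a PDF with Prince.
    prince_cmd = ["prince", str(html_path), "-o", str(pdf_path)]
    return xslt_cmd, prince_cmd


def convert(xml_path, xsl_path, out_dir, dry_run=False):
    """Run both steps in order; with dry_run=True, only return the commands."""
    cmds = build_commands(xml_path, xsl_path, out_dir)
    if not dry_run:
        Path(out_dir).mkdir(parents=True, exist_ok=True)
        for cmd in cmds:
            # check=True stops the pipeline if either tool reports an error.
            subprocess.run(cmd, check=True)
    return cmds
```

Calling convert("ms/article42.xml", "article-to-html.xsl", "out") would leave out/article42.html as the access copy alongside out/article42.pdf. Note that this yields an ordinary PDF; whether the result meets PDF/A requirements would need to be checked separately.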