Norman, I would love to be just as optimistic as you seem to be. Unfortunately, given the evidence, I just can't. In any case, I would like to invite you to a hackathon we are currently organizing.
Join us in Montpellier for a one-day event to hack on scholarly PDFs! Currently, the bulk of peer-reviewed scientific knowledge is locked up in PDF documents, which are difficult to get information out of. We want to change that. If you're interested in hacking on PDFs and exploring ways to access scholarly data in modern ways, this hackathon is for you. http://scholrev.org/hackathon/

On Fri, May 3, 2013 at 12:36 AM, Norman Gray <[email protected]> wrote:
>
> Alexander, hello.
>
> On 2013 May 2, at 22:49, Alexander Garcia Castro <[email protected]>
> wrote:
>
>> Hi Norman, I have heard the same from Adobe people: it's not the PDF, it
>> is YOU not being wise enough to know how to generate a PDF.
>> Unfortunately, I don't work with PDFs generated by me; I have to deal
>> with those coming from publishers. Probably they should attend a
>> training on generating PDFs.
>
> Hence my comment that journals could very usefully give more of a lead here.
>
> I think that some journal publishers _are_ trying to do things here, partly
> in order to back up their assertions that they add value to the publication
> process, but also to address their own production problems. It was Elsevier
> who sponsored an 'Executable PDF' challenge
> <http://www.executablepapers.com/>. Various other people are putting effort
> in as well, obviously, but as you point out, the publishers have to be
> involved. I have a couple of links at <https://pinboard.in/u:nxg/t:beyondpdf>
>
> Like I said: it's only fairly recently that the desire to put metadata into
> PDFs has spread beyond a few nuts. The area is still pretty immature.
>
>> It is great to hear "libraries for destructuring and rummaging around
>> in PDFs are not very easy to use (no need for 'jailbreaking')". Please
>> point us to such libraries and tutorials for destructuring the PDF. So
>> far, for practical purposes, the content is locked up and in deep need
>> of jailbreaking so that it can be effectively used. But, as you pointed
>> out, it may be just because we don't know how to generate PDFs. BTW, I
>> am ccing this to Casey; we work together and we are eager to hear
>> about those libraries.
>
> Well, there's pdflib <http://www.pdflib.com/>, which is expensive but clearly
> supported, libpdf <https://sourceforge.net/projects/libpdf/>, which is free
> but which I know nothing about, and PDFBox <http://pdfbox.apache.org/> which
> is also free, and which I've made light use of, in order to extract metadata
> from PDFs into an Atom feed (I can share this with you if you want, but it's
> not really polished).
>
> There are some libraries mentioned at
> <http://en.wikipedia.org/wiki/List_of_PDF_software>
>
> There's probably more, but that might be a start. Were you trying to grok
> PDF straight from the spec? Hardcore!
>
> All the best,
>
> Norman
>
>
> --
> Norman Gray : http://nxg.me.uk
> SUPA School of Physics and Astronomy, University of Glasgow, UK
>

--
Alexander Garcia
http://www.alexandergarcia.name/
http://www.usefilm.com/photographer/75943.html
http://www.linkedin.com/in/alexgarciac
