I was reading the PNAS author guidelines and I came across this gem:

Datasets: Supply Excel (.xls), RTF, or PDF files. This file type will be 
published in raw format and will not be edited or composed.

Did I read those last two file formats correctly? I have actually come across a 
dataset in supplementary information that was several dozen pages of PDF. It 
was effectively impossible to extract the data from this document. (I can dig 
it up if pressed, probably.) I had no idea that the authors may have been 
encouraged to submit their data like that.

Does a premier scientific journal actually request data to be in PDF format?

I can think of dozens of other formats that would be more fitting. They are 
summarized here:

http://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats
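
To make the point concrete, here is a minimal sketch (Python standard library 
only) of what a machine-readable supplementary dataset could look like. The 
column names and values are made up purely for illustration, and CSV and JSON 
are just two of the many formats on that list.

```python
import csv
import io
import json

# Hypothetical measurements, invented purely for illustration.
rows = [
    {"sample": "A", "temperature_c": 21.5, "count": 12},
    {"sample": "B", "temperature_c": 22.0, "count": 7},
]


def to_csv(records):
    """Serialize a list of dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def from_csv(text):
    """Parse CSV text back into a list of dicts (values come back as strings)."""
    return list(csv.DictReader(io.StringIO(text)))


csv_text = to_csv(rows)
json_text = json.dumps(rows, indent=2)

# JSON round-trips with types intact; CSV leaves type restoration to the reader.
assert json.loads(json_text) == rows
assert [r["sample"] for r in from_csv(csv_text)] == ["A", "B"]
```

Either file can be read back in a couple of lines by anyone, which is exactly 
what a multi-page PDF table does not allow.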

What is the scholarly equivalent of a torch-and-pitchfork march, and how can we 
organize such a march to encourage journals to require proper serialization 
formats for datasets in supplementary info?

James

P.S. I am aware that it is better to submit data to a dedicated repository, but 
let's consider those cases where research produces data for which there is not 
yet a dedicated repository.
