On Nov 17, 2010, at 1:24 AM, Tim Gruene wrote:
> On Wed, Nov 17, 2010 at 12:12:10AM -0800, James Stroud wrote:
>> 
>> [...] 
>> The operative word is "dataset", which is a subset of all things "data".
>> 
>> A dataset should be in a format that
>> 
>> 1. can be validated
>> 2. is structured
>> 3. is machine readable
> 
> 
> What do these items have to do with a journal or an article therein? Why
> should a journal be concerned with conserving data?

For Posterity.

I did a five-minute search for an example, and the best I could do with the 
patience I had was this:

http://onlinelibrary.wiley.com/doi/10.1002/pmic.200700038/suppinfo

You'll see in the available PDF file Tables S1-S3. Were I to look for any 
significant amount of time, I could find much more egregious examples.

For this particular example, your eyes may deceive you into thinking that the 
PDF file can be parsed and the data represented in the tables extracted with a 
script of some sort. But, if you have the patience, go to Table S3 and start 
selecting text at "Accession Number" in the heading. You'll find that the 
selection goes down that column only about half way and then begins selecting 
at the next column, "Swissprot Identifier".

So basically, the data represented in these tables is useless for any 
computational analysis by the end user except for (1) those who wish to type 
the data in by hand or (2) individuals like Dr. Merritt who can presumably just 
read the data and do the analysis in cranio.
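
By contrast, here is a minimal sketch of what the three criteria above (validated, structured, machine readable) could buy the end user if the same kind of table were deposited as plain CSV. The column names, the load_table helper, and the sample values are all hypothetical placeholders; none of them come from the paper or from any journal's actual requirements.

```python
import csv
import io

# Hypothetical column names, loosely modeled on the Table S3 headings.
EXPECTED_COLUMNS = ["accession_number", "swissprot_identifier"]


def load_table(text):
    """Parse CSV text, validate its shape, and return a list of row dicts."""
    reader = csv.DictReader(io.StringIO(text))
    # Validation step 1: the header must match the expected structure exactly.
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    for line_no, row in enumerate(reader, start=2):
        # Validation step 2: DictReader stores surplus fields under the key
        # None and fills missing fields with None, so both cases are caught.
        if None in row or any(not value for value in row.values()):
            raise ValueError(f"malformed row at line {line_no}: {row}")
        rows.append(row)
    return rows


# Placeholder values for illustration only, not taken from the paper.
SAMPLE = (
    "accession_number,swissprot_identifier\n"
    "ACC0001,EXAMPLE1_HUMAN\n"
    "ACC0002,EXAMPLE2_HUMAN\n"
)

rows = load_table(SAMPLE)
print(len(rows), "rows loaded")
```

With a file like this, each column stays attached to its row, and a truncated or shifted row raises an error instead of silently running into the next column.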

James
