Dear Tom, dear All,

I think work patterns that involve "constant visualisation" are quite tightly coupled to working with image data. In other fields, such as sequence analysis, the objects of enquiry don't lend themselves to any meaningful visualisation, and the predominant modes of working with them are logical / non-visual. The scenario of Nelle in the shell lesson is a good example of this: visually checking thousands of Pacific garbage gyre protein files isn't really feasible, but programmatic checks such as `ls *[^AB].txt` or `wc -l *.txt | sort -n | head` are viable, and they scale. Perhaps these could be considered simple "quality metrics".
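
As a rough illustration, the `ls` and `wc -l` checks mentioned above could also be written as small reusable functions. The sketch below is in Python and is only an approximation of the shell one-liners; the function names `unexpected_names` and `shortest_files`, the default of five results, and the assumption that valid file names end in `A` or `B` before `.txt` are all illustrative choices, not part of the lesson.

```python
from pathlib import Path


def unexpected_names(data_dir, allowed_suffixes=("A", "B")):
    """List .txt files whose name does not end in an allowed letter.

    Roughly mirrors `ls *[^AB].txt`; the allowed letters are an assumption.
    """
    return sorted(
        p.name
        for p in Path(data_dir).glob("*.txt")
        if not p.stem.endswith(tuple(allowed_suffixes))
    )


def shortest_files(data_dir, n=5):
    """Return up to n (line_count, name) pairs, fewest lines first.

    Roughly mirrors `wc -l *.txt | sort -n | head`: like `wc -l`,
    it counts newline characters rather than parsing the content.
    """
    counts = []
    for p in Path(data_dir).glob("*.txt"):
        counts.append((p.read_bytes().count(b"\n"), p.name))
    return sorted(counts)[:n]
```

Calling `unexpected_names("data")` and `shortest_files("data")` on a (hypothetical) `data` directory would flag oddly named files and suspiciously short ones without anyone having to open a single file by hand.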

From a more general perspective, it seems to me that becoming more independent from data is an indicator of progress in scientific understanding. If we don't understand a thing or phenomenon, all we can do is gather and record data. But once we have a principled scientific model, we can predict the phenomenon in question, and deduce from the model which data is necessary for the prediction. Computing often has great potential to facilitate such progress. Seen this way, "dissociation" from data is not necessarily a bad thing.

Best regards, Jan

On Fri, May 06, 2016 at 05:07:24AM +0000, Tom Wright wrote:
> I was inspired to post this by one of the posts in the "word /
> PowerPoint all wrong" thread.
> In my opinion, one of the pitfalls of 'our' programmatic way of working
> with data is that it is easy to move further away from the raw data.
> As a little background I typically work on biomedical imaging data (optical
> coherence tomography and very high resolution images of the human retina).
> In my own work I am often caught by two traps. The first is garbage in
> garbage out. I often lack suitable metrics of quality and when poor quality
> data is only processed in .CSV format this lack of quality becomes
> invisible. The second trap relates to the unknown nature of disease induced
> changes. Often the most interesting changes are only observed under careful
> examination of images. While these specific examples relate to imaging
> data, I'm sure the problems are not limited to this modality.
> My approach to addressing these issues is constant visualisation of data,
> something made easier by R and knitr and where possible the development and
> use of quality metrics.
> My question and hope is that other people have addressed these issues. If
> you have any thoughts or suggestions I'd love to hear them.
>
> Thx.
> _______________________________________________
> Discuss mailing list
> [email protected]
> http://lists.software-carpentry.org/mailman/listinfo/discuss_lists.software-carpentry.org

--
 +- Jan T. Kim -------------------------------------------------------+
 |             email: [email protected]                                |
 |             WWW:   http://www.jtkim.dreamhosters.com/              |
 *-----=< hierarchical systems are for files, not for humans >=-----*
