Dear Julien,

On Thu, Mar 19, 2020 at 08:47:20AM +0000, Julien Cappèle wrote:
> Though I agree with you Clemens that raw images are amazing to work
> with as you can use any software you are confortable with, we cannot
> forget that depositing several TB of data for each lab would be bad
> for ecological reason.

Of course, there are ecological (carbon footprint) considerations - and
there are lots of papers and studies about that. I haven't looked at
any numbers, but maybe some points:

 * A lot of data is already stored (e.g. at synchrotrons) and would
   "only" need to be made "visible" via a DOI (caveat: I realise that
   there are huge technical issues with that).

 * How does that energy consumption compare with the energy used to
   perform the experiment in the first place?

 * If by having that data available we can improve software and the
   way experiments are done: wouldn't that potentially save energy in
   the long run (avoiding poor or unnecessary experiments in the first
   place)?

 * We are looking at a move to increase the number of raw image data
   depositions for deposited PDB structures - not at a requirement to
   deposit raw images for every PDB structure or even for every
   dataset ever collected. At the moment there are about 4500 image
   datasets available for about 100000 PDB X-Ray structures,
   i.e. ~5%.

> And because detectors are always improving (thank you all!), size of
> data will increase exponentially.

True ... and some types of experiment can benefit from those larger,
faster and more numerous types of datasets - if done correctly.

> Could it be possible for a new/already existing software to store
> reflections (area, intensity from center to border, position x/y on
> the image, and information of the image) in a lightweight and text
> only file ? Possibly a new format to be used for integration ?

See my other reply: this all assumes that the initial processing step
caught all spots (and nothing else) on the 2D image correctly.

There have been all kinds of initiatives about raw data deposition (in
no particular order):

  https://www.iucr.org/resources/data/dddwg
  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5331468/
  https://www.sciencedaily.com/releases/2016/11/161108130045.htm
  https://journals.iucr.org/d/issues/2016/11/00/yt5099/
  https://onlinelibrary.wiley.com/iucr/doi/10.1107/S0909049513020724
  http://scripts.iucr.org/cgi-bin/paper?S0907444908015540
  https://scripts.iucr.org/cgi-bin/paper?dz5309
  https://bl831.als.lbl.gov/~jamesh/lossy_compression/

So we've been there before. Let's see if we can't do at least
something for the clearly important structures and work right now -
and worry about some long-term impact later (having maybe learned
something along the way). Just because we could be doing something now
doesn't mean we will have to keep doing this in 1-N years' time,
right ;-)

Cheers

Clemens
