Dear Tim,

Short answer: forensics & chain of trust. $wonder-drug binding to $interesting-target is published and is carried downstream. Some time later it is discovered that the binding mode in vivo differs from the published structure, and you want to be able to verify (or otherwise) all of the steps which were taken to arrive at that structure. For this you need the original data. You also need other things, but without the original diffraction images all you have is an easily faked table of numbers.
Not saying that this happens frequently, but there have been cases where it has. Making the raw data available is a useful check, as properly simulating such data *including detector artefacts* is hard. One opinion; clearly others are equally valid.

Another comment I will make is that people are completely happy to pay large sums for lab equipment & consumables. Surely storing the data that are the basis of your science is just another consumable? You could draw a parallel with buying screens - clearly you test all of the conditions in case some work - here we’re talking about storing all of your data in case you need *some* later. Like with crystallisation conditions, you don’t usually know a priori which you need.

Cheerio
Graeme

On 23 Oct 2015, at 10:16, Tim Gruene <tim.gru...@psi.ch> wrote:

Dear all,

I have wondered if it is really worth the effort (and disk space) for central long-term storage of diffraction images. What fraction of such data will ever be looked at in the future after the respective project has been published? Even if some revolutionary new technology were developed, I guess this would mostly be applied to current rather than old projects.

Given the substantial energy consumption of long-term storage (including DVDs and tape, as these have to be produced), the gross benefit might be greater from deleting old data at some point, saving energy and effort for more current things.

I have been through a few disk crashes. Often I was annoyed because I had to reinstall a new computer, and sometimes I could not recover some data which I would have liked to. But in fact it often cleaned my computer, and life went on even without access to whatever got lost.

So what is the scientific argument behind long-term storage of diffraction images other than academic interest in re-processing the data?
As mentioned above, I guess that the benefit of re-processing the data may only be minor and effort might be better spent on concurrent projects.

Best wishes,
Tim

On Wednesday, October 21, 2015 06:03:21 PM Allister Crow wrote:

On the last point about storing diffraction images, I wonder what the community's opinion is of uploading images to the Zenodo archive for safe-keeping and sharing?

The Zenodo project is being run by the folks at CERN, and is EU funded to support scientific data sharing. (Zenodo.org <http://zenodo.org>)

Until the PDB does this, perhaps this is one of the better ways through which we can ensure preservation (or at least another backup) of our most important diffraction images?

- Ally

PS: I should also say that I originally learned of Zenodo from Graeme Winter at Diamond.

-----------------
Allister Crow
Department of Pathology
University of Cambridge
Google Scholar Profile <http://bit.ly/11ga7Sq>
Research Gate Profile <http://bit.ly/137Ytt4>
Departmental Page <http://www.path.cam.ac.uk/directory/allister-crow>

On 21 Oct 2015, at 17:03, William G. Scott <wgsc...@ucsc.edu> wrote:

Dear CCP4 Citizenry:

I’m worried about medium to long-term data storage and integrity. At the moment, our lab uses mostly HFS+ formatted filesystems on our disks, which is the OS X default. HFS+ always struck me as somewhat fragile, and resource forks at best are a (seemingly needless) headache, at least as far as crystallography datasets go. (True, you can do HFS-compression and losslessly shrink your images by a factor of 2, or shrink your ccp4 installation, but these are fairly minor advantages).

I read the CCP4 wiki page

http://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php/Filesystems

that summarizes some of the other options.
From what I have read, there and elsewhere, it seems like zfs and btrfs might be significantly better alternatives to HFS+, but I really would like to get a sense of what others have experienced with these, or other, equally or more robust options. I don’t feel like I know enough to critically evaluate the information. Anyone know what the NSA uses?

I recently created a de novo backup of some personal data on an external HFS+ drive (photos, movies, music, etc). I was very unpleasantly surprised to find several files had been silently corrupted. (In the case of a movie file, for example, the file would play but could not be copied. In another case, a music file would not copy, yet it had identical md5sum and sha1 checksums when compared to an uncorrupted redundant backup I had. I’m still puzzled by this, but it suggests the resource fork might be the source of the corruption, and, more worrisome still, that conventional checksums aren’t detecting some silently corrupted data, so I am not even sure if zfs self-healing would be the answer.)

Since we as a community are now encouraging primary X-ray diffraction images to be stored, I can only imagine the problem could be ubiquitous, and a discussion might be worth having. (I apologize if this has been addressed previously; I did search the archive.)

All the best,
Bill

William G. Scott
Director, Program in Biochemistry and Molecular Biology
Professor, Department of Chemistry and Biochemistry
and The Center for the Molecular Biology of RNA
University of California at Santa Cruz
Santa Cruz, California 95064
USA

--
--
Paul Scherrer Institut
Dr. Tim Gruene
- persoenlich -
OFLC/102
CH-5232 Villigen PSI
phone: +41 (0)56 310 5297
GPG Key ID = A46BEE1A
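Editorial sketch, not part of the original messages: Bill's message describes comparing md5sum and sha1 checksums of a file against a redundant backup. One way to make that comparison routine for a directory of diffraction images is to record a checksum manifest when the data are archived and re-check it later. The following is a minimal sketch under stated assumptions: the function names (`sha256_of`, `build_manifest`, `verify`) are hypothetical, SHA-256 is swapped in for the md5/sha1 tools mentioned in the thread, and only each file's ordinary contents are hashed. As Bill's own experience suggests, a check of this kind would not necessarily notice damage confined to a resource fork or other filesystem metadata.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return (missing, changed) lists of relative paths against a stored manifest."""
    root = Path(root)
    missing, changed = [], []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            missing.append(rel)          # file vanished since the manifest was built
        elif sha256_of(target) != expected:
            changed.append(rel)          # contents no longer match the recorded digest
    return missing, changed
```

In use, the dictionary returned by `build_manifest` would be saved alongside the archive (for example as JSON) and passed to `verify` against each backup copy; two empty lists mean every recorded file is present with unchanged contents.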