We do remote-access data collection from here, and retrieving raw data 
from a synchrotron is becoming a full-time job for a synchrotron spokesperson 
at a large lab. I once tried to read those CDs with the archived data. One of 
the disks failed. SSRL is nice enough to store the images for a while, but I 
retrieve and store locally all partially integrated data. I can rescale them 
later if needed, and they make good proof (in my opinion) of the experimental 
origin of my structures.

I believe we should send those data to the PDB together with the processed 
structure factors to validate data quality. They are small enough to send 
over the Internet. Prevention is the best way to stop crime.

Alex Aleshin, staff scientist
Sanford-Burnham Medical Research Institute 
10901 North Torrey Pines Road
La Jolla, California 92037



On Apr 5, 2012, at 7:02 AM, Bosch, Juergen wrote:

> I would say everybody probably keeps too many junk datasets around - at least 
> I do. And I run into the trouble of having to buy new TB plates every now and 
> then.
> I think my group currently acquires on average ~700 GB of raw images per year 
> (compressed). If we were to keep only the useful datasets, we would probably 
> be down to 10% of that. But as always, you hope for the best and keep 
> some data considered junk in 2009 that might be useful in 2015.
> 
> Jürgen
> 
> On Apr 5, 2012, at 9:08 AM, Roger Rowlett wrote:
> 
>> FYI, every NSF grant proposal now must have a data management plan that 
>> describes how all experimental data will be archived and in what formats. 
>> I'm not sure how seriously these plans are monitored, but a plan must be 
>> provided nevertheless. Is anyone NOT archiving their original data in some 
>> way?
>> 
>> Roger Rowlett
>> 
>> On Apr 5, 2012 7:16 AM, "John R Helliwell" <jrhelliw...@gmail.com> wrote:
>> Dear 'aales...@burnham.org',
>> 
>> Re the pixel detector; yes this is an acknowledged raw data archiving
>> challenge; possible technical solutions include:- summing to make
>> coarser images ie in angular range, lossless compression (nicely
>> described on this CCP4bb by James Holton) or preserving a sufficient
>> sample of data....(but nb this debate is certainly not yet concluded).
>> 
>> Re "And all this hassle is for the only real purpose of preventing data 
>> fraud?"
>> 
>> Well.....Why publish data?
>> Please let me offer some reasons:
>> • To enhance the reproducibility of a scientific experiment
>> • To verify or support the validity of deductions from an experiment
>> • To safeguard against error
>> • To allow other scholars to conduct further research based on
>> experiments already conducted
>> • To allow reanalysis at a later date, especially to extract 'new'
>> science as new techniques are developed
>> • To provide example materials for teaching and learning
>> • To provide long-term preservation of experimental results and future
>> access to them
>> • To permit systematic collection for comparative studies
>> • And, yes, To better safeguard against fraud than is apparently the
>> case at present
>> 
>> Also to (probably) comply with your funding agency's grant conditions:-
>> Increasingly, funding agencies are requesting or requiring data
>> management policies (including provision for retention and access) to
>> be taken into account when awarding grants. See e.g. the Research
>> Councils UK Common Principles on Data Policy
>> (http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx) and the Digital
>> Curation Centre overview of funding policies in the UK
>> (http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies).
>> See also http://forums.iucr.org/viewtopic.php?f=21&t=58 for discussion
>> on policies relevant to crystallography in other countries. Nb these
>> policies extend over derived, processed and raw data, ie without
>> really an adequate clarity of policy from one to the other stages of
>> the 'data pyramid' (see
>> http://www.stm-assoc.org/integration-of-data-and-publications).
>> 
>> 
>> And just to mention IUCr Journals Notes for Authors for biological
>> macromolecular structures, where we have our ie macromolecular
>> crystallography's version of the 'data pyramid' :-
>> 
>> (1) Derived data
>> • Atomic coordinates, anisotropic or isotropic displacement
>> parameters, space group information, secondary structure and
>> information about biological functionality must be deposited with the
>> Protein Data Bank before or in concert with article publication; the
>> article will link to the PDB deposition using the PDB reference code.
>> • Relevant experimental parameters, unit-cell dimensions are required
>> as an integral part of article submission and are published within the
>> article.
>> 
>> (2) Processed experimental data
>> • Structure factors must be deposited with the Protein Data Bank
>> before or in concert with article publication; the article will link
>> to the PDB deposition using the PDB reference code.
>> 
>> (3) Primary experimental data (here I give small and macromolecule
>> Notes for Authors details):-
>> For small-unit-cell crystal/molecular structures and macromolecular
>> structures IUCr journals have no current binding policy regarding
>> publication of diffraction images or similar raw data entities.
>> However, the journals welcome efforts made to preserve and provide
>> primary experimental data sets. Authors are encouraged to make
>> arrangements for the diffraction data images for their structure to be
>> archived and available on request.
>> For articles that present the results of powder diffraction profile
>> fitting or refinement (Rietveld) methods, the primary diffraction
>> data, i.e. the numerical intensity of each measured point on the
>> profile as a function of scattering angle, should be deposited.
>> Fibre data should contain appropriate information such as a photograph
>> of the data. As primary diffraction data cannot be satisfactorily
>> extracted from such figures, the basic digital diffraction data should
>> be deposited.
>> 
>> 
>> Finally to mention that many IUCr Commissions are interested in the
>> possibility of establishing community practices for the orderly
>> retention and referencing of raw data sets, and the IUCr would like to
>> see such data sets become part of the routine record of scientific
>> research in the future, to the extent that this proves feasible and
>> cost-effective.
>> I draw your attention therefore to the IUCr Forum on such matters at:-
>> http://forums.iucr.org/
>> Within this Forum you can find for example the ICSU convened Strategic
>> Coordinating Committee on Information and Data fairly recent report;
>> within this we learn of many other areas of science efforts on data
>> archiving and eg that the radio astronomy square kilometre array will
>> pose the biggest raw data archiving challenge on the planet.[Our needs
>> are thereby relatively modest.]
>> 
>> The IUCr Diffraction Data Deposition Working Group is actively
>> addressing all these various issues.
>> We welcome your input at the IUCr Forum, which will thereby be most
>> timely. Thank you.
>> 
>> Best wishes,
>> Yours sincerely,
>> John
>> Professor John R Helliwell DSc
>> 
>> 
>> On Thu, Apr 5, 2012 at 1:24 AM, aaleshin <aales...@burnham.org> wrote:
>> > People who raise their voices for prolonged storage of raw images miss the
>> > simple fact that the volume of collected data increases as fast as, if
>> > not faster than, the cost of storage space drops. I just had an opportunity
>> > to collect data with the PILATUS detector at SSRL, and I can tell you that
>> > monster allows slicing the data 4-5 times thinner than other detectors do.
>> > Some people also like collecting very redundant data sets. Even now,
>> > transferring and storing raw data from a synchrotron is a pain in the neck,
>> > but in a few years it may become simply impractical. And all this hassle is
>> > for the only real purpose of preventing data fraud? Aren't there cheaper and
>> > more adequate solutions to the problem?
>> >
>> > I also wonder why, after the first occurrence of data fraud several years
>> > ago, the PDB did not take any action to prevent it in the future? Or are
>> > administrative actions simply impossible nowadays without a mega-dollar
>> > grant?
>> >
>> >
>> 
>> 
>> --
> 
> ......................
> Jürgen Bosch
> Johns Hopkins University
> Bloomberg School of Public Health
> Department of Biochemistry & Molecular Biology
> Johns Hopkins Malaria Research Institute
> 615 North Wolfe Street, W8708
> Baltimore, MD 21205
> Office: +1-410-614-4742
> Lab:      +1-410-614-4894
> Fax:      +1-410-955-2926
> http://web.mac.com/bosch_lab/
> 
> 
> 
> 
