Re: [ccp4bb] problem in scaling the Zn-MAD data
Dear Deepthi,

is it just a typo, or do your last two sentences say that your data do NOT scale in P312 but scale well in P321? Did you try POINTLESS for space group determination? I have not used MOLREP for this purpose and cannot judge how reliable the self-rotation function is for space group determination. Depending on your resolution and cell dimensions you may have very few reflections for determining the screw axis. Do your data suffer from radiation damage? Does the cell grow during integration? Did you fix the detector distance during integration?

Best wishes, Tim

On 04/04/12 23:07, Deepthi wrote:
Hello everyone, I have a problem scaling the MAD data which was collected a week ago. The data were collected at 1.5 A resolution using three wavelengths for a Zn-MAD experiment. When scaling the data, the number of rejections and the chi2 values were very high even after adjusting the error scale factor and error model. The space group I used was P312, which I obtained by running a self-rotation function in MOLREP. When I scale my data in space group P312 the chi2 and rejections are huge, but the data scale well in P321. Can anyone explain what's going on? Thank you very much, Deepthi

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen
GPG Key ID = A46BEE1A
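The P321 vs P312 ambiguity can be made concrete: the two point groups share the same threefold axis but place their twofold axes differently, so they declare different sets of reflections symmetry-equivalent. A minimal sketch (not anyone's actual script; the operator lists are the standard point-group equivalents from International Tables):

```python
# Symmetry-equivalent Miller indices under the two trigonal point groups.
# Merging data under the wrong set averages genuinely different intensities,
# which shows up as high chi^2 and many rejections during scaling.

def equivalents_321(h, k, l):
    """Equivalents in point group 321 (twofolds along the a, b, -a-b directions)."""
    return {(h, k, l), (-h - k, h, l), (k, -h - k, l),
            (k, h, -l), (h, -h - k, -l), (-h - k, k, -l)}

def equivalents_312(h, k, l):
    """Equivalents in point group 312 (twofolds rotated 30 deg from those of 321)."""
    return {(h, k, l), (-h - k, h, l), (k, -h - k, l),
            (-k, -h, -l), (-h, h + k, -l), (h + k, -k, -l)}

if __name__ == "__main__":
    hkl = (1, 2, 3)
    only_321 = equivalents_321(*hkl) - equivalents_312(*hkl)
    only_312 = equivalents_312(*hkl) - equivalents_321(*hkl)
    print("equivalent only under 321:", sorted(only_321))
    print("equivalent only under 312:", sorted(only_312))
```

If the true symmetry is 321 but the data are scaled in 312 (or vice versa), the reflections in the `only_...` sets get merged with unrelated ones, which is one plausible reason the chi2 and rejection counts explode in one setting but not the other.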
Re: [ccp4bb] Who is using 64-bit Linux?
Hi David,

I'm curious - do you mean running on a 32-bit CentOS box, or running the 32-bit Mosflm executable on a 64-bit CentOS box? We did have one report of problems with the 32-bit exe on a 64-bit box, which (seemingly) randomly gave one of two different results (either the same failure, or success) - but that was fixed in the beta we released in July last year, and didn't occur with the 64-bit exe at all. We really are grateful to people who tell us about the bugs they find rather than struggle on in silence!

[David wrote:] Funnily enough, I can't get iMosflm running reliably on 32-bit CentOS 5 or CentOS 6, but I can on the 64-bit versions. We have everything (CCP4, Coot, iMosflm, XDS, phenix, best, etc.) running in 64-bit and intend to move all user computers to a uniform 64-bit environment at the next shutdown, as it is more difficult to support both 32- and 64-bit environments. David
--
David Aragao, PhD | Research Fellow - MX | Australian Synchrotron
p: (03) 8540 4121 | f: (03) 8540 4200 | m: 0467 775 203
david.ara...@synchrotron.org.au | www.synchrotron.org.au
800 Blackburn Road, Clayton, Victoria 3168, Australia

Harry
--
Dr Harry Powell, MRC Laboratory of Molecular Biology, MRC Centre, Hills Road, Cambridge, CB2 0QH
Re: [ccp4bb] problem in scaling the Zn-MAD data
Hi,

On Wed, Apr 04, 2012 at 02:07:58PM -0700, Deepthi wrote: [original question quoted in full above; trimmed]

When you say 'scaling the data for MAD experiments': do you mean scaling the various scans of your 3-wavelength MAD data in a single scaling job? Unless you already took care of this during data integration, remember that your separate scans could have been indexed differently and therefore don't match up. See e.g. http://www.ccp4.ac.uk/html/reindexing.html for some lookup tables in P312 and P321. You can use the CCP4 program 'reindex' on MTZ files if needed. But I guess most modern data-processing and scaling programs will take care of that automatically anyway?

Cheers, Clemens

--
Clemens Vonrhein, Ph.D. (vonrhein AT GlobalPhasing DOT com)
Global Phasing Ltd., Sheraton House, Castle Park, Cambridge CB3 0AX, UK
BUSTER Development Group (http://www.globalphasing.com)
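The reindexing Clemens describes is just a fixed permutation and sign change of the Miller indices applied to every reflection. A minimal illustration (this is a sketch, not the CCP4 `reindex` program itself; `k,h,-l` is one of the standard alternative-indexing operators for trigonal lattices, the kind listed in the lookup tables linked above):

```python
# Apply a reindexing operator, given in the usual "k,h,-l" string notation,
# to Miller indices - the same transformation CCP4 'reindex' applies to
# every reflection in an MTZ file.

def reindex(hkl, op="k,h,-l"):
    """Transform one (h, k, l) triple by a reindexing operator string."""
    h, k, l = hkl
    env = {"h": h, "k": k, "l": l}
    # Each comma-separated term is a small expression in h, k, l.
    return tuple(eval(term, {"__builtins__": {}}, env) for term in op.split(","))

if __name__ == "__main__":
    # A scan indexed in the alternative setting, brought into line:
    scan2 = [(1, 2, 3), (0, 0, 6), (2, -1, 4)]
    print([reindex(r) for r in scan2])
```

Only after all scans share one indexing convention do the symmetry-equivalent observations actually line up, so merging statistics computed across differently indexed scans are meaningless until this step is done (or done automatically by the processing software).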
Re: [ccp4bb] arp_waters still available?
Dear Bernhard,

arp_waters is very old code, and it gets even older as we speak. Try ARP/wARP version 7.2, where you can run the same task:
- from the command line ($warpbin/auto_solvent.sh)
- from the CCP4i GUI (ARP/wARP Solvent)
- from ArpNavigator (Model Solvent)
There are both 32- and 64-bit versions.

Best regards, Victor

On 05/04/2012 05:23, Bernhard Rupp (Hofkristallrat a.D.) wrote: Dear Developers, in some older scripts I still call the CCP4 version of arp_waters, which worked well for dummy-atom picking. It does not seem to be included in recent 64-bit CCP4 packages. Does anyone perhaps have a precompiled 64-bit version of arp_waters that might run on RHEL 6.2? Best regards, BR
-
Bernhard Rupp
001 (925) 209-7429
+43 (676) 571-0536
b...@ruppweb.org
hofkristall...@gmail.com
http://www.ruppweb.org/
-
No animals were hurt or killed during the production of this email.
-
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear 'aales...@burnham.org',

Re the pixel detector: yes, this is an acknowledged raw-data archiving challenge; possible technical solutions include summing to make coarser images (i.e. in angular range), lossless compression (nicely described on this CCP4bb by James Holton), or preserving a sufficient sample of the data (but NB this debate is certainly not yet concluded).

Re 'And all this hassle is for the only real purpose of preventing data fraud?': Well, why publish data? Please let me offer some reasons:
• To enhance the reproducibility of a scientific experiment
• To verify or support the validity of deductions from an experiment
• To safeguard against error
• To allow other scholars to conduct further research based on experiments already conducted
• To allow reanalysis at a later date, especially to extract 'new' science as new techniques are developed
• To provide example materials for teaching and learning
• To provide long-term preservation of experimental results and future access to them
• To permit systematic collection for comparative studies
• And, yes, to better safeguard against fraud than is apparently the case at present

Also to (probably) comply with your funding agency's grant conditions: increasingly, funding agencies are requesting or requiring data management policies (including provision for retention and access) to be taken into account when awarding grants. See e.g. the Research Councils UK Common Principles on Data Policy (http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx) and the Digital Curation Centre overview of funding policies in the UK (http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies). See also http://forums.iucr.org/viewtopic.php?f=21&t=58 for discussion on policies relevant to crystallography in other countries.

NB these policies extend over derived, processed and raw data, i.e. without really an adequate clarity of policy from one stage of the 'data pyramid' to the next (see http://www.stm-assoc.org/integration-of-data-and-publications).

And just to mention the IUCr Journals Notes for Authors for biological macromolecular structures, where we have our, i.e. macromolecular crystallography's, version of the 'data pyramid':

(1) Derived data
• Atomic coordinates, anisotropic or isotropic displacement parameters, space group information, secondary structure and information about biological functionality must be deposited with the Protein Data Bank before or in concert with article publication; the article will link to the PDB deposition using the PDB reference code.
• Relevant experimental parameters and unit-cell dimensions are required as an integral part of article submission and are published within the article.

(2) Processed experimental data
• Structure factors must be deposited with the Protein Data Bank before or in concert with article publication; the article will link to the PDB deposition using the PDB reference code.

(3) Primary experimental data (here I give both the small-molecule and the macromolecule Notes for Authors details): For small-unit-cell crystal/molecular structures and macromolecular structures, IUCr journals have no current binding policy regarding publication of diffraction images or similar raw data entities. However, the journals welcome efforts made to preserve and provide primary experimental data sets. Authors are encouraged to make arrangements for the diffraction data images for their structure to be archived and available on request. For articles that present the results of powder diffraction profile fitting or refinement (Rietveld) methods, the primary diffraction data, i.e. the numerical intensity of each measured point on the profile as a function of scattering angle, should be deposited. Fibre data should contain appropriate information such as a photograph of the data. As primary diffraction data cannot be satisfactorily extracted from such figures, the basic digital diffraction data should be deposited.

Finally, to mention that many IUCr Commissions are interested in the possibility of establishing community practices for the orderly retention and referencing of raw data sets, and the IUCr would like to see such data sets become part of the routine record of scientific research in the future, to the extent that this proves feasible and cost-effective. I draw your attention therefore to the IUCr Forum on such matters at http://forums.iucr.org/. Within this Forum you can find, for example, the fairly recent report of the ICSU-convened Strategic Coordinating Committee on Information and Data; within it we learn of data-archiving efforts in many other areas of science, e.g. that the radio astronomy Square Kilometre Array will pose the biggest raw-data archiving challenge on the planet. (Our needs are thereby relatively modest.) The IUCr Diffraction Data Deposition Working Group is actively addressing all these various issues. We welcome your input at the IUCr Forum, which will thereby be most timely. Thank you.

Best wishes, Yours
[ccp4bb] CCP-EM positions now available
[Cross-posted from the 3DEM mailing list.] --Gerard -- Forwarded message -- Date: Wed, 4 Apr 2012 16:34:39 +0100 From: Helen Saibil h.sai...@mail.cryst.bbk.ac.uk To: 3DEM Mailing List 3...@ncmir.ucsd.edu Subject: [3dem] CCP-EM positions now available

Dear Colleagues,

We have been awarded a Partnership grant by the MRC to provide computational support for UK scientists using electron cryo-microscopy for structural biology. One of the major aims is to create a Collaborative Computational Project, CCP-EM, by analogy with similar successful projects in macromolecular crystallography (CCP4) and biological nuclear magnetic resonance spectroscopy (CCPN). We seek two excellent and motivated computational scientists to support the Partnership grant and the CCP-EM project. These posts will have a wide variety of responsibilities, including writing community code, improving the usability of existing code, providing training, and supporting individual scientists. The first post will focus on technical aspects, building community tools and improving the programs available. The second post will focus more on the scientific requirements of the community. The posts are located at the Research Complex at Harwell, alongside the core group of CCP4, but the postholders will be expected to travel throughout the UK and interact with international groups to support the collaboration. Applications must be made through the RCUK Shared Services recruitment portal https://ext.ssc.rcuk.ac.uk/ using the references IRC50385 and IRC50666. Informal enquiries may be made to Martyn Winn (martyn.w...@stfc.ac.uk).

Best wishes, Martyn Winn, Richard Henderson, Alan Roseman, Peter Rosenthal, Helen Saibil and Ardan Patwardhan
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear Colleagues,

Clearly, no system will be able to preserve perfectly every pixel of every dataset collected at a cost that can be afforded. Resources are finite and we must set priorities. I would suggest that, in order of declining priority, we try our best to retain:
1. raw data that might tend to refute published results
2. raw data that might tend to support published results
3. raw data that may be of significant use in currently ongoing studies, either in refutation or support
4. raw data that may be of significant use in future studies

While no archiving system can be perfect, we should not let the search for a perfect solution prevent us from working with currently available good solutions, and even in this era of tight budgets, there are good solutions.

Regards, Herbert

On 4/5/12 7:16 AM, John R Helliwell wrote: [full message quoted above; trimmed]
Re: [ccp4bb] very informative - Trends in Data Fabrication
FYI, every NSF grant proposal now must have a data management plan that describes how all experimental data will be archived and in what formats. I'm not sure how seriously these plans are monitored, but a plan must be provided nevertheless. Is anyone NOT archiving their original data in some way?

Roger Rowlett

On Apr 5, 2012 7:16 AM, John R Helliwell jrhelliw...@gmail.com wrote: [full message quoted above; trimmed]
Re: [ccp4bb] very informative - Trends in Data Fabrication
I would say everybody probably keeps too many junk datasets around - at least I do. And I run into the trouble of having to buy new TB plates every now and then. I think on average my group currently acquires ~700 GB of raw images (compressed) per year; if we were to keep only the useful datasets we would probably be down to 10% of that. But, as always, you hope for the best and keep some data considered junk in 2009 which might be useful in 2015.

Jürgen

On Apr 5, 2012, at 9:08 AM, Roger Rowlett wrote: [message quoted above; trimmed]
[ccp4bb] Category 4 Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear Herbert,

Category 4, we find in Manchester, is tricky, for want of a better word. Needless to say, we have collaborators on our Crystallography Research Service who request datasets from e.g. ten years ago that are now urgently needed for writing up publications. So we are keeping everything, although the raw diffraction images only from recent years, and NB we will soon be assisted by the University of Manchester's centralised Data Repository for its researchers. (Incidentally, I have kept all of my film oscillation data, including later Laue data, back to approximately 1977, which fills a whole wall's worth of shelving, ~10 metres.)

Greetings, John

Prof John R Helliwell DSc FInstP CPhys FRSC CChem F Soc Biol.
Chair School of Chemistry, University of Manchester, Athena Swan Team.
http://www.chemistry.manchester.ac.uk/aboutus/athena/index.html

On 5 Apr 2012, at 13:50, Herbert J. Bernstein y...@bernstein-plus-sons.com wrote: [message quoted above; trimmed]
[ccp4bb] Via Annual Reports...Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear Roger, At the recent ICSTI Workshop on Delivering Data in science the NSF presenter, when I asked about monitoring, replied that the PIs' annual reports should include data management aspects. See http://www.icsti.org/spip.php?rubrique42 Best wishes, John Prof John R Helliwell DSc FInstP CPhys FRSC CChem F Soc Biol. Chair School of Chemistry, University of Manchester, Athena Swan Team. http://www.chemistry.manchester.ac.uk/aboutus/athena/index.html On 5 Apr 2012, at 14:08, Roger Rowlett rrowl...@colgate.edu wrote: FYI, every NSF grant proposal now must have a data management plan that describes how all experimental data will be archived and in what formats. I'm not sure how seriously these plans are monitored, but a plan must be provided nevertheless. Is anyone NOT archiving their original data in some way? Roger Rowlett On Apr 5, 2012 7:16 AM, John R Helliwell jrhelliw...@gmail.com wrote: Dear 'aales...@burnham.org', Re the pixel detector; yes this is an acknowledged raw data archiving challenge; possible technical solutions include:- summing to make coarser images ie in angular range, lossless compression (nicely described on this CCP4bb by James Holton) or preserving a sufficient sample of data(but nb this debate is certainly not yet concluded). Re And all this hassle is for the only real purpose of preventing data fraud? Well.Why publish data? 
Please let me offer some reasons: • To enhance the reproducibility of a scientific experiment • To verify or support the validity of deductions from an experiment • To safeguard against error • To allow other scholars to conduct further research based on experiments already conducted • To allow reanalysis at a later date, especially to extract 'new' science as new techniques are developed • To provide example materials for teaching and learning • To provide long-term preservation of experimental results and future access to them • To permit systematic collection for comparative studies • And, yes, To better safeguard against fraud than is apparently the case at present Also to (probably) comply with your funding agency's grant conditions:- Increasingly, funding agencies are requesting or requiring data management policies (including provision for retention and access) to be taken into account when awarding grants. See e.g. the Research Councils UK Common Principles on Data Policy (http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx) and the Digital Curation Centre overview of funding policies in the UK (http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies). See also http://forums.iucr.org/viewtopic.php?f=21t=58 for discussion on policies relevant to crystallography in other countries. Nb these policies extend over derived, processed and raw data, ie without really an adequate clarity of policy from one to the other stages of the 'data pyramid' ((see http://www.stm-assoc.org/integration-of-data-and-publications). 
And just to mention IUCr Journals Notes for Authors for biological macromolecular structures, where we have our ie macromolecular crystallography's version of the 'data pyramid' :- (1) Derived data • Atomic coordinates, anisotropic or isotropic displacement parameters, space group information, secondary structure and information about biological functionality must be deposited with the Protein Data Bank before or in concert with article publication; the article will link to the PDB deposition using the PDB reference code. • Relevant experimental parameters, unit-cell dimensions are required as an integral part of article submission and are published within the article. (2) Processed experimental data • Structure factors must be deposited with the Protein Data Bank before or in concert with article publication; the article will link to the PDB deposition using the PDB reference code. (3) Primary experimental data (here I give small and macromolecule Notes for Authors details):- For small-unit-cell crystal/molecular structures and macromolecular structures IUCr journals have no current binding policy regarding publication of diffraction images or similar raw data entities. However, the journals welcome efforts made to preserve and provide primary experimental data sets. Authors are encouraged to make arrangements for the diffraction data images for their structure to be archived and available on request. For articles that present the results of powder diffraction profile fitting or refinement (Rietveld) methods, the primary diffraction data, i.e. the numerical intensity of each measured point on the profile as a function of scattering angle, should be deposited. Fibre data should contain appropriate information such as a photograph of the data. As primary diffraction data cannot be satisfactorily extracted from such figures, the basic digital diffraction data should be deposited. Finally to mention that
[ccp4bb] mtz2cif capable of handling map coefficients
It seems that deposition of map coefficients is a good idea. Does someone have an mtz2cif that can handle this? Thanks! F - Francis E. Reyes M.Sc. 215 UCB University of Colorado at Boulder
Re: [ccp4bb] mtz2cif capable of handling map coefficients
Have you tried mtz2various (with cif output)? Pete Francis E Reyes wrote: It seems that deposition of map coefficients is a good idea. Does someone have an mtz2cif that can handle this? Thanks! F - Francis E. Reyes M.Sc. 215 UCB University of Colorado at Boulder
Re: [ccp4bb] mtz2cif capable of handling map coefficients
On Thursday, April 05, 2012 08:25:05 am Francis E Reyes wrote: It seems that deposition of map coefficients is a good idea. Does someone have an mtz2cif that can handle this? Maybe I missed something. What is accomplished by depositing map coefficients that isn't done better by depositing Fo and Fc? Ethan -- Ethan A Merritt Biomolecular Structure Center, K-428 Health Sciences Bldg University of Washington, Seattle 98195-7742
Re: [ccp4bb] mtz2cif capable of handling map coefficients
I have not tried it, but the latest version of the rcsb program sf-convert is supposed to support it (see version 1.2 released March 23) http://sw-tools.pdb.org/apps/SF-CONVERT/index.html http://sw-tools.pdb.org/apps/SF-CONVERT/doc/V1-2-00/documentation.html (Version 1.2 is not yet available as a binary download) Regards, Mitch -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Francis E Reyes Sent: Thursday, April 05, 2012 8:25 AM To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] mtz2cif capable of handling map coefficients It seems that deposition of map coefficients is a good idea. Does someone have an mtz2cif that can handle this? Thanks! F - Francis E. Reyes M.Sc. 215 UCB University of Colorado at Boulder
Re: [ccp4bb] mtz2cif capable of handling map coefficients
Fc doesn't contain the weighting scheme used in the creation of the map coefficients, so Fc would require some sort of program to be run to recreate those for both 2Fo-Fc and Fo-Fc maps. By which time you might as well run a single cycle of the refinement program in question to generate new map coefficients - so I don't see the benefit of Fc. The map coefficients, on the other hand, are a checkpoint of the maps being looked at by the author at the time of deposition and don't require programs beyond a typical visualization program (e.g. Coot) to view. Phil Jeffrey Princeton On 4/5/12 12:00 PM, Ethan Merritt wrote: On Thursday, April 05, 2012 08:25:05 am Francis E Reyes wrote: It seems that deposition of map coefficients is a good idea. Does someone have an mtz2cif that can handle this? Maybe I missed something. What is accomplished by depositing map coefficients that isn't done better by depositing Fo and Fc? Ethan
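The weighting-scheme point can be made concrete. The sketch below computes sigma-A style 2mFo-DFc and mFo-DFc coefficients from hypothetical per-reflection values of Fo, Fc, the figure of merit m and the D factor; all the numbers are invented for illustration, and in practice a refinement program estimates m and D from the data, which is exactly the information a bare Fc column does not carry.

```python
import numpy as np

# Hypothetical per-reflection quantities for three reflections: observed
# and calculated amplitudes, calculated phases (degrees), figure of
# merit m and sigma-A style D factor. None of these come from real data.
fo = np.array([120.0, 85.0, 40.0])
fc = np.array([110.0, 90.0, 30.0])
phic = np.deg2rad(np.array([15.0, 250.0, 95.0]))
m = np.array([0.9, 0.8, 0.6])
D = np.array([0.95, 0.93, 0.90])

# Sigma-A weighted map coefficients:
#   2mFo - DFc for the "2Fo-Fc" style map,
#    mFo - DFc for the difference map,
# both carrying the calculated phase.
f_2fofc = (2.0 * m * fo - D * fc) * np.exp(1j * phic)
f_fofc = (m * fo - D * fc) * np.exp(1j * phic)

print(np.abs(f_2fofc))
print(np.abs(f_fofc))
```

Note that a negative mFo-DFc value simply flips the phase of that difference-map coefficient; the amplitude stays positive.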
Re: [ccp4bb] problem in scaling the Zn-MAD data
Hello I arrived at the P312 space group by running a self-rotation function using MOLREP. The maps show the space group as P312. I was scaling the data individually for each wavelength. None of the three wavelengths scale in the P312 space group. On Thu, Apr 5, 2012 at 2:17 AM, Clemens Vonrhein vonrh...@globalphasing.com wrote: Hi, On Wed, Apr 04, 2012 at 02:07:58PM -0700, Deepthi wrote: Hello everyone I have a problem scaling the MAD data which was collected a week ago. The data was collected at 1.5 A resolution using three wavelengths for Zn-MAD experiments. Scaling the data for MAD experiments, the number of rejections and chi2 values were very high even after adjusting the error-scale factor and error model. The space group I used was P312, which I obtained by running a self-rotation function in MOLREP. When I scale my data using the P312 space group the chi2 and rejections were huge. But the data scales well in the P321 space group. Can anyone explain what's going on? When you say 'Scaling the data for MAD experiments': do you mean scaling the various scans for your 3-wvl MAD data in a single scaling job? Unless you already took care of this during data integration, remember that your separate scans could have been indexed differently and therefore don't match up. See e.g. http://www.ccp4.ac.uk/html/reindexing.html for some lookup tables in P312 and P321. You can use the CCP4 program 'reindex' on MTZ files if needed. But I guess most modern data-processing and scaling programs will take care of that automatically anyway? Cheers Clemens -- *** * Clemens Vonrhein, Ph.D. vonrhein AT GlobalPhasing DOT com * * Global Phasing Ltd. * Sheraton House, Castle Park * Cambridge CB3 0AX, UK * * BUSTER Development Group (http://www.globalphasing.com) *** -- Deepthi
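Clemens' point about alternative indexing can be illustrated with a toy sketch. This is not what 'reindex' or pointless do internally; it just shows why two scans of the same crystal can disagree until one is transformed by an alternative-indexing operator, here (h,k,l) -> (k,h,-l), one of the operators valid for a trigonal lattice. All intensities are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy intensities keyed by Miller index (synthetic, not a real data set).
hkl = [(h, k, l) for h in range(4) for k in range(4) for l in range(-2, 3)]
truth = {idx: float(rng.gamma(2.0, 50.0)) for idx in hkl}

def reindex(idx, op):
    """Apply a reindexing operator given as a 3x3 integer matrix."""
    return tuple(int(x) for x in np.array(op) @ np.array(idx))

# One alternative-indexing operator for trigonal lattices:
# (h, k, l) -> (k, h, -l). Scan 2 is deliberately indexed this way.
OP = [[0, 1, 0], [1, 0, 0], [0, 0, -1]]

scan1 = dict(truth)
scan2 = {reindex(idx, OP): val for idx, val in truth.items()}

def correlation(a, b, op=None):
    """Correlation between common reflections, optionally reindexing b."""
    xs, ys = [], []
    for idx, val in a.items():
        key = reindex(idx, op) if op else idx
        if key in b:
            xs.append(val)
            ys.append(b[key])
    return float(np.corrcoef(xs, ys)[0, 1])

print("as indexed:     %.2f" % correlation(scan1, scan2))
print("after (k,h,-l): %.2f" % correlation(scan1, scan2, OP))
```

A scaling program faced with mismatched scans sees exactly the first situation: high chi2 and many rejections until the indexing is made consistent.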
Re: [ccp4bb] problem in scaling the Zn-MAD data
Hi - Let me just add that P312 is a very uncommon space group for protein crystals, much less common than P321. (This doesn't mean you don't have it - it's just unlikely.) If you look at PDB statistics: P 3 1 2 : 12 structures P3(1) 1 2: 61 structures P3(2) 1 2: 85 structures P 3 2 1 : 278 structures P3(1) 2 1: 2354 structures P3(2) 2 1: 2533 structures This also suggests, by the way, that you have a screw axis that you haven't accounted for yet. It won't affect your data scaling, but it sure will affect your molecular replacement job! Hope that helps, Matt On 4/5/12 12:31 PM, Deepthi wrote: Hello I arrived at the P312 space group by running a self-rotation function using MOLREP. The maps show the space group as P312. I was scaling the data individually for each wavelength. None of the three wavelengths scale in the P312 space group. -- Matthew Franklin, Ph. D. Senior Research Scientist New York Structural Biology Center 89 Convent Avenue, New York, NY 10027 (646) 275-7165
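The screw axis Matt mentions shows up as systematic absences along the axial row: for a 3(1) or 3(2) screw, only the 00l reflections with l = 3n are observed. A minimal sketch of that test, with invented I/sigma values (absences alone cannot distinguish 3(1) from 3(2); that needs the anomalous or refinement evidence):

```python
# Hypothetical mean I/sigma(I) for the 00l axial reflections.
axial = {
    1: 0.4, 2: 0.7, 3: 45.0, 4: 0.5, 5: 0.9, 6: 52.0,
    7: 0.3, 8: 0.6, 9: 48.0, 10: 0.8, 11: 0.2, 12: 40.0,
}

def screw_axis_consistent(axial, period=3, cutoff=3.0):
    """True if only axial reflections with l divisible by `period`
    are significantly observed (I/sigma above `cutoff`)."""
    for l, i_over_sig in axial.items():
        present = i_over_sig > cutoff
        if present != (l % period == 0):
            return False
    return True

print(screw_axis_consistent(axial, period=3))  # 3(1)/3(2) pattern?
print(screw_axis_consistent(axial, period=2))  # 2(1) pattern?
```

Programs such as pointless automate this kind of axial-reflection analysis, but with only a few 00l reflections measured the statistics can be weak, which is why the screw axis is easy to miss.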
Re: [ccp4bb] mtz2cif capable of handling map coefficients
On Thursday, April 05, 2012 09:30:25 am Phil Jeffrey wrote: Fc doesn't contain the weighting scheme used in the creation of the map coefficients, so Fc would require some sort of program to be run to recreate those for both 2Fo-Fc and Fo-Fc maps. The viewers I am familiar with do this for themselves on the fly. No need to involve additional programs. In fact, generating and storing map coefficients is not part of my work flow, since none of the programs I normally use need them to be pre-calculated. By which time you might as well run a single cycle of the refinement program in question to generate new map coefficients - so I don't see the benefit of Fc. You must use a different tool set than I do. The map coefficients, on the other hand, are a checkpoint of the maps being looked at by the author at the time of deposition and don't require programs beyond a typical visualization program (i.e. Coot) to view. But is that a good thing or a bad thing? I would rather make my own call about weighting and choice of maps, so I would rather have the Fo and Fc. Anyhow, Coot reads in and displays maps just fine from an mtz or cif file containing Fo and Fc but no map coefficients. It is true that usually you want to have a value for the FOM or other weight available also. cheers, Ethan Phil Jeffrey Princeton On 4/5/12 12:00 PM, Ethan Merritt wrote: On Thursday, April 05, 2012 08:25:05 am Francis E Reyes wrote: It seems that deposition of map coefficients is a good idea. Does someone have an mtz2cif that can handle this? Maybe I missed something. What is accomplished by depositing map coefficients that isn't done better by depositing Fo and Fc? Ethan -- Ethan A Merritt Biomolecular Structure Center, K-428 Health Sciences Bldg University of Washington, Seattle 98195-7742
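What a viewer does "on the fly" from coefficients is, at bottom, an inverse Fourier transform. The sketch below turns a handful of invented amplitudes and phases into a real-space density grid; it ignores space-group symmetry, resolution cutoffs and weighting, all of which a real viewer handles, so treat it as a cartoon of the synthesis only.

```python
import numpy as np

N = 8  # grid points per cell edge (toy grid)
grid = np.zeros((N, N, N), dtype=complex)

refl = [  # (h, k, l, F, phi_degrees) -- all values invented
    (1, 0, 0, 10.0, 0.0),
    (0, 1, 1, 6.0, 90.0),
    (2, 1, 0, 4.0, 180.0),
]
for h, k, l, f, phi in refl:
    c = f * np.exp(1j * np.deg2rad(phi))
    grid[h % N, k % N, l % N] = c
    # Friedel mate F(-h) = F(h)* makes the resulting density real.
    grid[-h % N, -k % N, -l % N] = np.conj(c)

density = np.fft.ifftn(grid).real * N**3  # unnormalized synthesis
print(density.shape, float(density[0, 0, 0]))
```

Because the grid is filled Hermitian-symmetrically, the imaginary part of the transform vanishes to numerical precision, which is the sanity check any map-synthesis code should pass.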
Re: [ccp4bb] mtz2cif capable of handling map coefficients
On Thu, 5 Apr 2012, Ethan Merritt wrote: On Thursday, April 05, 2012 09:30:25 am Phil Jeffrey wrote: Fc doesn't contain the weighting scheme used in the creation of the map coefficients, so Fc would require some sort of program to be run to recreate those for both 2Fo-Fc and Fo-Fc maps. The viewers I am familiar with do this for themselves on the fly. No need to involve additional programs. In fact, generating and storing map coefficients is not part of my work flow, since none of the programs I normally use need them to be pre-calculated. Ethan, If you load a mtz file from refmac or BUSTER then this file contains Map Coefficients. Different programs and protocols produce different maps. So I second Phil's comment that including map coefficients in deposition is a really good thing. It will enable people to see exactly the maps as seen by the depositor (and to do so in a few years time). Hence we have included map coefficients in 3 recent depositions 3syu, 3urp, 3v56 (using a prototype mtz2cif tool that is not quite ready for release yet). We have also worked out how to patch ccp4 cif2mtz so that it can do the reverse process see https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ccp4bb;325e1870.1112 Regards, Oliver
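For readers wondering what an mtz2cif-style tool actually emits for map coefficients: the sketch below writes a toy _refln loop. The item names pdbx_FWT / pdbx_PHWT follow my reading of the PDBx/mmCIF dictionary and should be checked against the current dictionary before depositing; the reflection values are invented.

```python
reflections = [
    # h, k, l, FWT, PHWT -- hypothetical map coefficients
    (1, 0, 2, 135.2, 42.7),
    (1, 1, 3, 88.6, 211.0),
    (2, 0, 1, 54.1, 95.3),
]

lines = ["loop_",
         "_refln.index_h",
         "_refln.index_k",
         "_refln.index_l",
         "_refln.pdbx_FWT",
         "_refln.pdbx_PHWT"]
for h, k, l, fwt, phwt in reflections:
    lines.append(f"{h:4d} {k:4d} {l:4d} {fwt:10.2f} {phwt:8.2f}")

cif_block = "\n".join(lines)
print(cif_block)
```

A real converter would of course also carry the data block header, cell and symmetry items, and the Fo/sigma(Fo) columns alongside the coefficients.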
[ccp4bb] PDF position Structural Cell Biology - Michigan
Hi, I would like to advertise a position on behalf of Prof. Lois Weisman (Michigan). Interested parties should contact her directly (lweisman.off...@gmail.com). -Amir Postdoctoral Fellow Position Integration of high resolution structures with biology Seeking a highly motivated postdoctoral fellow to initiate a new project in the general areas of phosphoinositide signaling or myosin V-based transport. We are a highly multidisciplinary, interactive laboratory, and use diverse techniques to address cutting-edge questions on the roles and regulation of PtdIns(3,5)P2 and PtdIns(5)P in yeast and mice. Specifically, we are determining how the lipid kinase Fab1/PIKfyve is regulated, identifying the upstream signaling pathways and downstream targets. In addition, we seek to determine the precise roles of Fab1/PIKfyve in the nervous system, and how minor perturbations in this pathway lead to profound neurodegeneration. Another critical area of research is determining how myosin V attaches to cargoes. We recently found that the cargo-binding domain of myosin V interacts directly with a subunit of the exocyst tethering complex [Jin et al. (2011) Dev. Cell. 21(6):1156-70]. This paper illustrates the power of combining yeast genetics with high-resolution structures. Many additional projects in our lab are poised to utilize this type of approach. For an overview of recent projects, see our website: http://www.lsi.umich.edu/facultyresearch/labs/weisman Qualifications: A Ph.D. in the life sciences. Experience in molecular, biochemical and/or cell biological techniques. A level of publications appropriate for the current level of training. The applicant should be dedicated to research in the life sciences and have a strong desire to make major contributions.
Please submit a cover letter, a CV with the names of three references, and a brief paragraph indicating a project of interest to lweisman.off...@gmail.com The University of Michigan is an equal opportunity/affirmative action employer.
Re: [ccp4bb] mtz2cif capable of handling map coefficients
On Thursday, April 05, 2012 10:48:16 am Oliver Smart wrote: On Thu, 5 Apr 2012, Ethan Merritt wrote: On Thursday, April 05, 2012 09:30:25 am Phil Jeffrey wrote: Fc doesn't contain the weighting scheme used in the creation of the map coefficients, so Fc would require some sort of program to be run to recreate those for both 2Fo-Fc and Fo-Fc maps. The viewers I am familiar with do this for themselves on the fly. No need to involve additional programs. In fact, generating and storing map coefficients is not part of my work flow, since none of the programs I normally use need them to be pre-calculated. Ethan, If you load a mtz file from refmac or BUSTER then this file contains Map Coefficients. Different programs and protocols produce different maps. I am bowing out of this discussion with apologies for any confusion that I caused. I have realized that there may be a generational difference in understanding the term map coefficient (or else my poor brain is just not functioning as well as it ought to). I thought that the proposal was to require depositing the equivalent of a ccp4 *.map file, i.e. the real-space side of the Fourier transform. I see now that people are using map coefficient to mean weighted F, which was not what I originally understood. please carry on! Ethan So I second Phil's comment that including map coefficients in deposition is a really good thing. It will enable people to see exactly the maps as seen by the depositor (and to do so in a few years time). Hence we have included map coefficients in 3 recent depositions 3syu, 3urp, 3v56 (using a prototype mtz2cif tool that is not quite ready for release yet). We have also worked out how to patch ccp4 cif2mtz so that it can do the reverse process see https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ccp4bb;325e1870.1112 Regards, Oliver -- Ethan A Merritt Biomolecular Structure Center, K-428 Health Sciences Bldg University of Washington, Seattle 98195-7742
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear John, Thank you for a very informative letter about the IUCr activities towards archiving the experimental data. I feel that I did not explain myself properly. I do not object to archiving the raw data; I just believe that the current methodology of validating data at the PDB is insufficiently robust and requires modification. Implementation of raw image storage and validation will take considerable time, while the recent incidents of presumed data fraud demonstrate that the issue is urgent. Moreover, presenting calculated structure factors in place of the experimental data is not the only abuse that the current validation procedure invites. There might be more numerous occurrences of data massaging, like overestimation of the resolution or data quality; the system does not allow these to be verified. The IUCr and PDB follow the American taxation policy, where the responsibility for fraud is placed on people, and the agency does not take sufficient action to prevent it. I believe this is inefficient and inhumane. Making a routine check of submitted data at a somewhat lower level would reduce the temptation to overestimate the unclearly defined quality statistics and make model fabrication more difficult to accomplish. Many people do it unknowingly, and catching them afterwards does no good. I suggested turning the current incident, which might be too complex for burning heretics, into something productive that is done as soon as possible, something that will prevent fraud from occurring. Since my persistent trolling at ccp4bb did not take any effect (until now), I wrote a bad-English letter to the PDB administration, encouraging them to take urgent action. Those who are willing to count the grammar mistakes in it can read the message below.
With best regards, Alexander Aleshin, staff scientist Sanford-Burnham Medical Research Institute 10901 North Torrey Pines Road La Jolla, California 92037 Dear PDB administrators, I am writing to you regarding the recently publicized story about the submission of calculated structure factors to the PDB entry 3k79 (http://journals.iucr.org/f/issues/2012/04/00/issconts.html). This presumed fraud (or mistake) occurred just several years after another, more massive fabrication of PDB structures (Acta Cryst. (2010). D66, 115) that affected many scientists, including myself. The repetitiveness of these events indicates that the current mechanism of structure validation by the PDB is not sufficiently robust. Moreover, it is completely incapable of detecting smaller mischief such as overestimation of the data resolution and quality. There are two approaches to handling fraud problems: (1) increasing policing and punishment, or (2) making fraud too difficult to implement. Obviously, the second approach is more humane and efficient. This issue has been discussed on several occasions by the ccp4bb community, and some members began promoting the idea of submitting raw crystallographic images as a fraud repellent. However, this validation approach is neither easy nor cheap; moreover, it requires considerable manpower to conduct on a day-to-day basis. Indeed, indexing data sets is sometimes a nontrivial problem and cannot always be accomplished automatically. For this reason, submitting the indexed and partially integrated data (such as .x files from HKL2000 or the output MTZ file from Mosflm) appears to be a cheaper substitute for image storing/validating. Analysis of the partially integrated data provides almost the same means of fraud prevention as the images. Indeed, the observed cases of data fraud suggest that they would most likely be attempted by a biochemist-crystallographer, who is insufficiently educated to fabricate the partially processed data.
A method developer, on the contrary, does not have a reasonable incentive to forge a particular structure, unless he teams up with a similarly minded biologist. But the latter scenario is very improbable and has not yet been detected. The most valuable benefit of using the partially processed data as a validation tool would be the standardization of the definition of data resolution and the detection of inappropriate massaging of experimental data. Implementation of this approach requires minuscule adaptation of the current system, which most practicing crystallographers would accept (in my humble opinion). The data-storage requirement would be only ~1000-fold higher than the current one, and transferring the new data to the PDB could still be done over the Internet. Moreover, storing the raw data is not required after the validation is done. A program such as Scala from CCP4 could easily be adapted to process the validation data and compare them with a conventional set of structure factors. Precise consistency of the two sets is not necessary. They only need to agree within statistically
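The comparison proposed above amounts to an agreement statistic between two independently processed intensity sets. A toy sketch, with all values synthetic: honestly reprocessed data should correlate strongly with the deposited set (up to scale and noise), while fabricated data with merely a plausible distribution should not.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical merged intensities from the deposited structure factors.
n = 200
deposited = rng.gamma(2.0, 100.0, size=n)

# Re-scaled validation data: same signal, overall scale factor, and
# ~5% multiplicative measurement noise.
validation = 1.7 * deposited * rng.normal(1.0, 0.05, size=n)

# Fabricated data: unrelated values with a similar distribution.
fabricated = rng.gamma(2.0, 100.0, size=n)

def linear_cc(a, b):
    """Pearson correlation, a simple version of the agreement check."""
    return float(np.corrcoef(a, b)[0, 1])

print("validation vs deposited: %.3f" % linear_cc(deposited, validation))
print("fabricated vs deposited: %.3f" % linear_cc(deposited, fabricated))
```

A real check would of course work reflection-by-reflection on matched Miller indices and allow for resolution-dependent scaling, but the principle is the same: exact agreement is unnecessary, statistical agreement is enough.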
[ccp4bb] MOSFLM- Image compatibility
Can MOSFLM work with image files of type .x (BNL X6A) ? I am having no luck... I know it can do .cbf (BNL X25) for instance. Thanks a lot
[ccp4bb] Position in Structural Biology of Signaling
There is an immediate opening for a protein crystallographer position at Harvard Medical School's Children's Hospital. The research is focused on the structural and functional investigation of the Wnt signaling pathway. The project is a close collaborative effort between Professor Xi He's and Professor Jia-huai Wang's labs. The successful candidate should have a Ph.D. in structural biology. The candidate will be responsible for determining crystal structures using X-ray crystallography. He/she should also have experience in molecular biology and protein biochemistry. Interested candidates can email a CV, three contacts for references, as well as an email address and a telephone number to Dr. Jia-huai Wang at jw...@red.dfci.harvard.edu or Dr. Xi He at xi...@childrens.harvard.edu. For more information regarding the Wang and He Laboratories, please see the websites: http://wang.dfci.harvard.edu and http://www.childrenshospital.org/cfapps/research/data_admin/Site160/mainpageS160P0.html The information in this e-mail is intended only for the person to whom it is addressed. If you believe this e-mail was sent to you in error and the e-mail contains patient information, please contact the Partners Compliance HelpLine at http://www.partners.org/complianceline . If the e-mail was sent to you in error but does not contain patient information, please contact the sender and properly dispose of the e-mail.
[ccp4bb] postdoctoral position at the NIH focused on understanding the mechanism of cytoskeletal regulators
Postdoctoral positions are available in the Cell Biology and Biophysics Unit headed by Dr. Antonina Roll-Mecak at the National Institute of Neurological Disorders and Stroke. The Roll-Mecak Laboratory is interested in understanding the interplay between microtubules and their regulators and how tubulin post-translational modifications tune the behavior of motors and microtubule associated proteins (see for instance Szyk et al., 2011. Nature Struct. Molec. Biol. 8(11): 1250-8; Roll-Mecak, A. and Vale, R.D. 2008. Nature, 451(7176):363-7; Roll-Mecak, A. and McNally, F.J. 2010. Curr. Opin. Cell Biol., 22(1):96-103). We use a combination of biochemistry, structural biology, cell biology and single-molecule fluorescence techniques. Thus, a postdoctoral fellow in the lab would have the opportunity to move between these techniques and build upon an already strong background in one of these areas. We value a vibrant and collaborative environment where lab members share ideas, reagents and expertise and want to work on fundamental problems in cytoskeletal biology. The Roll-Mecak lab is located in the Porter Center for Neuroscience on the NIH main campus in Bethesda. The NIH has a long tradition of research excellence in cytoskeletal biology and offers a stimulating environment for postdoctoral fellows interested in interdisciplinary training in cell biology and biophysics. The research facilities at NIH are outstanding and the lab has state-of-the-art equipment such as crystallization robots, liquid handling systems, TIRF and confocal microscopes. For more information, please visit: http://intra.ninds.nih.gov/rm_lab/ The position will be fully funded by the NIH, and is available immediately. 
We are looking for candidates who wish to work on mechanistic problems related to the microtubule cytoskeleton and have a strong background in at least two of these areas: molecular biology, protein biochemistry, structural biology, cell biology, microscopy or single molecule motor biophysics. Other details: candidates should preferably have less than 2 years of postdoctoral experience. Please send a CV, a one-page research experience summary, and contact information of three references to anton...@mail.nih.gov Please write “Postdoctoral application” in the subject header.
Re: [ccp4bb] MOSFLM- Image compatibility
Thanks for kindly pointing that out (despite the level of stupidity on my part). Those were not the raw images... They were Denzo output files. It's been a while.
Re: [ccp4bb] very informative - Trends in Data Fabrication
This discussion has been interesting, and it's provided an interesting forum for those interested in dealing with fraud in science. I've not contributed anything to this thread, but the message from Alexander Aleshin prodded me to say some things that I haven't heard expressed before. 1. The sky is not falling! The errors in the birch pollen antigen pointed out by Bernhard are interesting, and the reasons behind them might be troubling. However, the self-correcting functions of scientific research found the errors, and current publication methods permitted an airing of the problem. It took some effort, but the scientific method prevailed. 2. Depositing raw data frames will make little difference in identifying and correcting structural problems like this one. Nor will new requirements for deposition of this or that detail. What's needed for finding the problems is time and interest on the part of someone who's able to look at a structure critically. Deposition of additional information could be important for that critical look, but deposition alone (at least with today's software) will not be sufficient to find incorrect structures. 3. The responsibility for a fraudulent or wrong or poorly-determined structure lies with the investigator, not the society of crystallographers. My political leanings are left-of-central, but I still believe in individual responsibility for behavior and actions. If someone messes up a structure, they're accountable for the results. 4. Adding to the deposition requirements will not make our science more efficient. Perhaps it's different in other countries, but the administrative burden for doing research in the United States is growing. It would be interesting to know the balance between the waste that comes from a wrong structure and the waste that comes from having each of us deal with additional deposition requirements. 5. 
The real danger that arises from cases of wrong or fraudulent science is that it erodes the trust we have in each others results. No one has time or resources to check everything, so science is based on trust. There are efforts underway outside crystallographic circles to address this larger threat to all science, and we should be participating in those discussions as much as possible. Ron On Thu, 5 Apr 2012, aaleshin wrote: Dear John,Thank you for a very informative letter about the IUCr activities towards archiving the experimental data. I feel that I did not explain myself properly. I do not object archiving the raw data, I just believe that current methodology of validating data at PDB is insufficiently robust and requires a modification. Implementation of the raw image storage and validation will take a considerable time, while the recent incidents of a presumable data frauds demonstrate that the issue is urgent. Moreover, presenting the calculated structural factors in place of the experimental data is not the only abuse that the current validation procedure encourages to do. There might be more numerous occurances of data massaging like overestimation of the resolution or data quality, the system does not allow to verify them. IUCr and PDB follows the American taxation policy, where the responsibility for a fraud is placed on people, and the agency does not take sufficient actions to prevent it. I believe it is inefficient and inhumane. Making a routine check of submitted data at a bit lower level would reduce a temptation to overestimate the unclearly defined quality statistics and make the model fabrication more difficult to accomplish. Many people do it unknowingly, and catching them afterwards makes no good. I suggested to turn the current incidence, which might be too complex for burning heretics, into something productive that is done as soon as possible, something that will prevent fraud from occurring. 
Since my persistent trolling at ccp4bb has not had any effect (until now), I wrote a bad-English letter to the PDB administration, encouraging them to take urgent action. Those who are willing to count the grammar mistakes in it can read the message below. With best regards, Alexander Aleshin, staff scientist, Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, California 92037

Dear PDB administrators, I am writing to you regarding the recently publicized story about the submission of calculated structure factors for the PDB entry 3k79 (http://journals.iucr.org/f/issues/2012/04/00/issconts.html). This presumable fraud (or mistake) occurred just a few years after another, more massive fabrication of PDB structures (Acta Cryst. (2010). D66, 115) that affected many scientists, including myself. The repetitiveness of these events indicates that the current mechanism of structure validation at the PDB is not sufficiently robust. Moreover, it is completely incapable of
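The routine deposition-time check Aleshin advocates can be made concrete. Below is a minimal sketch (my own illustration, not any actual PDB pipeline; the function names, threshold, and synthetic amplitudes are all hypothetical) of the kind of plausibility test that would catch the 3k79-style abuse: genuine measurements carry experimental noise, so the R-factor between "observed" and calculated amplitudes never approaches zero, whereas copied calculated amplitudes give R = 0.

```python
import random

def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor: sum|Fo - Fc| / sum|Fo|."""
    num = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    den = sum(abs(fo) for fo in f_obs)
    return num / den

def looks_fabricated(f_obs, f_calc, threshold=0.02):
    """Flag implausibly perfect model-to-data agreement.

    Real data have measurement errors, so R stays well above zero;
    an R-factor below the (hypothetical) 2% threshold suggests the
    'observed' amplitudes may simply be calculated ones."""
    return r_factor(f_obs, f_calc) < threshold

# Synthetic demonstration with 5000 made-up amplitudes:
random.seed(0)
f_calc = [random.uniform(10.0, 1000.0) for _ in range(5000)]
# Genuine-looking data: calculated amplitudes with ~5% Gaussian noise,
# which keeps the R-factor near 0.04.
noisy_obs = [fc * random.gauss(1.0, 0.05) for fc in f_calc]
# Fabricated data: Fobs copied verbatim from Fcalc, so R is exactly zero.
copied_obs = list(f_calc)

print(looks_fabricated(noisy_obs, f_calc))   # False
print(looks_fabricated(copied_obs, f_calc))  # True
```

Such a test is of course only a crude first filter; real validation would also look at intensity statistics, error models, and resolution-dependent behavior, but even this level of automation runs in milliseconds per entry.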
Re: [ccp4bb] very informative - Trends in Data Fabrication
I also don't really worry about the images as a primary means of fraud prevention, although that may be a useful side effect. These cases are spectacular but so rare that they alone would not justify the effort. It may be a useful political instrument for making that argument and getting funding, but that is a bit of a double-edged sword and harm can be done, see (5). The real point to me seems to be: a) Is there something in the images, and in between the casually indexed main reflections, that we do not use right now but that would ultimately allow us to get better structures? I think there is, and it has been said before: superstructures, modulation, diffuse contributions, etc. A processed data file does not help here. But do we need the old image data for that, or rather new ones from modern detectors? Where is the cost/benefit cutoff here? b) Looking at how some structures are refined, there is little reason to believe that data processing would be done more competently by untrained casual users (except that much of the data processing is done with the help of beamline personnel, who do know how to do it). Had we images, the next step could then be PDB_reprocess. A processed data file does not help much there either. c) Discarding your primary data is generally considered bad form... @AlexA: Arguing with the PDB is not really useful. They did not generate the bad data. Best, BR

-----Original Message----- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Ronald E Stenkamp Sent: Thursday, April 05, 2012 1:04 PM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication [...]
Re: [ccp4bb] very informative - Trends in Data Fabrication
Well, it looks like my opinion about the importance of validating data at the moment of submission has not attracted much support; that is sad but understandable. Automatically redoing the PDB structures by professionals is a good idea. I myself suggested a similar thing 10 years ago at Accelrys (we were developing a tool for detecting and remodeling changes in protein-ligand structures due to ligand binding), but there was not much financial interest. How much the raw images would enhance the remodeling process is an open question, but good luck getting it funded.

c) Discarding your primary data is generally considered bad form... Agreed, but it is a big burden on labs to maintain archives of their raw data indefinitely. Even the IRS allows records to be discarded after some time. What is wrong with partially integrated data in terms of structure validation?

@AlexA: Arguing with the PDB is not really useful. I have not argued yet, but I'll take your advice.

They did not generate the bad data. This is genuine American thinking! But they might create conditions that would prevent its deposition. I think I should stop heating up this discussion. Regards, Alex

On Apr 5, 2012, at 2:11 PM, Bernhard Rupp (Hofkristallrat a.D.) wrote: [...]
Re: [ccp4bb] very informative - Trends in Data Fabrication
Oy weh.

c) Discarding your primary data is generally considered bad form... Agreed, but it is a big burden on labs to maintain archives of their raw data indefinitely. Even the IRS allows records to be discarded after some time. But you DO have to file in the first place, right? How long to keep them is an entirely different question.

What is wrong with partially integrated data in terms of structure validation? Who thinks something is wrong with that idea? Section 3.1 under figure 3 of said incendiary pamphlet states: '...yadayada...when unmerged data or images for proper reprocessing are not available owing to the unfortunate absence of a formal obligation to deposit unmerged intensity data or diffraction images.'

They did not generate the bad data. This is genuine American thinking! OK, the US citizens on the BB might take this one up on my behalf, gospodin ;-) See you at the Lubyanka.

But they might create conditions that would prevent their deposition. Sure. We are back to the 2007 Reid shoe-bomber argument. If you make PDB deposition a total pain for everybody, you don't get compliance, you get defiance. Ever seen any happy faces in a TSA check line? Anyhow, image deposition will come. Over and out, BR
Re: [ccp4bb] very informative - Trends in Data Fabrication
Alright, if image deposition is the only way out, then I am for it, but please make sure that the synchrotrons will do it for me... On Apr 5, 2012, at 7:58 PM, Bernhard Rupp (Hofkristallrat a.D.) wrote: [...]
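If synchrotrons did take on the archiving, the end-of-session step could be as mechanical as bundling the frames with integrity checksums. A hypothetical sketch of such a push-button step (the function name, file layout, and manifest format are my own illustration, not any beamline's actual pipeline):

```python
import hashlib
import json
import pathlib
import tarfile

def archive_frames(frame_dir, out_prefix):
    """Bundle a directory of diffraction images into a compressed tar
    archive plus a SHA-256 manifest, so a depositor (or reviewer) can
    later verify that no frame was altered or lost in transit."""
    frame_dir = pathlib.Path(frame_dir)

    # Hash every file in the frame directory before packing it.
    manifest = {
        f.name: hashlib.sha256(f.read_bytes()).hexdigest()
        for f in sorted(frame_dir.iterdir())
        if f.is_file()
    }

    # Pack the whole directory into <out_prefix>.tar.gz.
    with tarfile.open(f"{out_prefix}.tar.gz", "w:gz") as tar:
        tar.add(frame_dir, arcname=frame_dir.name)

    # Write the manifest alongside the archive.
    pathlib.Path(f"{out_prefix}.manifest.json").write_text(
        json.dumps(manifest, indent=2, sort_keys=True))
    return manifest
```

A beamline cron job could run this after each data collection; the manifest alone is tiny and could accompany a PDB deposition even if the frames themselves live in a synchrotron-side archive.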
Re: [ccp4bb] very informative - Trends in Data Fabrication
How should they? They have no clue which of the 20 datasets was actually the one used to solve your structure. If you ask James Holton, he has suggested going back to the archived data after a certain time and trying to solve the undeposited structures then :-) [Where is James, anyhow? I haven't seen a post from him recently.] Seriously, I think it is in our own interest to submit the images that led to a structure solution somewhere. And as others have mentioned, bad data and good data alike can serve educational purposes. Just as an example: http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/1Y13 Jürgen

On Apr 5, 2012, at 11:46 PM, aaleshin wrote: [...]

Jürgen Bosch, Johns Hopkins University, Bloomberg School of Public Health, Department of Biochemistry & Molecular Biology, Johns Hopkins Malaria Research Institute, 615 North Wolfe Street, W8708, Baltimore, MD 21205. Office: +1-410-614-4742, Lab: +1-410-614-4894, Fax: +1-410-955-2926, http://web.mac.com/bosch_lab/
Re: [ccp4bb] very informative - Trends in Data Fabrication
Did you ever play, as a child, the game called "broken phone"? Someone quickly whispers something to a neighbor, and so on, until the words come back to the author. A very funny game. My original thesis was that downloading/depositing the raw images would be a pain in the neck for crystallographers, so why not begin with the partially processed data, like the .x files from HKL2000? People should be trained to hardships gradually... On Apr 5, 2012, at 8:57 PM, Bosch, Juergen wrote: [...]