Re: [ccp4bb] PDB secondary structure assignments
Hi Miha,

I thought the PDB actually uses DSSP. Perhaps it is a different version; there have been some new releases recently. Anyway, there is no reason why you should stick to the assignment of the PDB. If another program gives slightly different results, you can use those as long as you make it obvious which program you used (cite the program). The next CCP4 release will have the official dssp (which is used to make the DSSP databank).

Cheers,
Robbie

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Pavšic, Miha
Sent: Wednesday, July 03, 2013 16:01
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] PDB secondary structure assignments

Dear CCP4BB members,

What is the usual practice regarding secondary structure assignments when preparing publication figures of protein structures and topology diagrams from deposited PDB files? The deposited PDB files already contain such assignments in the header section (using PROMOTIF?). Should these assignments be obeyed, or is it common to use other software/algorithms (e.g., DSSP and Stride)? In my case, assignments using DSSP differ slightly from the PDB assignments in regions of short alpha-helical structure (corresponding to stretches of 4 aa residues).

Thank you for your suggestions.

Regards,
Miha

**
Miha Pavsic, Ph.D.
University of Ljubljana
Faculty of Chemistry and Chemical Technology
Chair of Biochemistry
Cesta v Mestni log 88a
SI-1000 Ljubljana
Slovenia
e-mail miha.pav...@fkkt.uni-lj.si
skype mihapavsic
phone (lab) +386 1 2419 488
fax +386 1 2419 487
**
Re: [ccp4bb] Rfree is 20%,why still green and red density?
Hi Bernhard,

The formula from Tickle applies to the weighted/generalized/Hamilton free R-factor. From k-fold cross-validation tests we observed that the 'regular' R-free has a standard deviation of R-free * (Nref)^(-1/2).

Cheers,
Robbie

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp
Sent: Wednesday, June 26, 2013 13:31
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Rfree is 20%,why still green and red density?

you may have only a few hundred and thus not get a reliable Rfree value.

The estimate for the error in R-free as a function of the number of reflections is as follows: Brunger initially estimated^35 that the uncertainty in R-free is proportional to (Nref)^(-1/2), which is reasonable to assume because this is how uncertainties vary with sample size. Tickle et al. finally showed^38 that the relative uncertainty in R-free is exactly equal to (2 Nref)^(-1/2), confirming Brunger's initial estimate, with a constant of proportionality of 2^(-1/2). Following this proportionality, ~1000 reflections are sufficient to obtain a better than 1% precision for an overall R-free in the 20-30% range, i.e. 'a few hundred' is still not too bad.

Best,
BR
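To put numbers on the two estimates discussed in this thread, here is a minimal Python sketch. It assumes that Nref means the number of free (test-set) reflections and that both expressions are relative uncertainties that are multiplied by R-free to get an absolute value. The function names are illustrative only and do not come from any refinement program.

```python
import math


def rfree_sigma_tickle(r_free: float, n_free: int) -> float:
    """Absolute uncertainty from the relative uncertainty (2*Nref)^(-1/2).

    Per the thread, this form applies to the weighted/generalized free R.
    """
    return r_free / math.sqrt(2 * n_free)


def rfree_sigma_kfold(r_free: float, n_free: int) -> float:
    """Absolute uncertainty from the empirical estimate R-free * Nref^(-1/2)."""
    return r_free / math.sqrt(n_free)


if __name__ == "__main__":
    # Assumed example values: R-free of 0.25 with test sets of various sizes.
    for n_free in (300, 1000, 2000):
        print(
            n_free,
            round(rfree_sigma_tickle(0.25, n_free), 4),
            round(rfree_sigma_kfold(0.25, n_free), 4),
        )
```

For an R-free of 0.25 and 1000 free reflections, both estimates stay below 0.01, which is in line with the "better than 1% precision" remark in the quoted message.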
Re: [ccp4bb] R too low?
Hi Sue,

Can you give the rmsZ for the bonds and angles (from the Refmac output)? I never could figure these rmsd values out... I'm guessing that the restraints are too loose, or at least not optimal. Perhaps they went overboard with the TLS as well (sometimes fewer TLS groups give much better R and R-free values). I'm not sure anything in particular is wrong with the data processing. They should optimize the restraint weights in refinement first. In this case tighter B-factor restraint weights might do the trick.

Gratuitous plug: throw the model and data into PDB_REDO (which uses Refmac too) and see if it gives better refinement results.

Cheers,
Robbie

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Roberts, Sue A - (suer)
Sent: Wednesday, June 26, 2013 17:45
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] R too low?

Hello Everyone,

I have two data sets, from the same crystal form (space group P32) of the same protein, collected at 100 K at SSRL, at about 2.2 A resolution, that refine to R = 0.14, Rf = 0.26 (refmac/TLS). This is a molecular replacement solution, from a model with about 40% homology (after MR, density was apparent for some missing or misbuilt residues, so I don't think the structure is stuck in the wrong place). The Fo-Fc map is essentially featureless. The 2Fo-Fc map doesn't look as good as it should; for instance, there are very few water molecules to be found. The data reduction statistics look OK, and the resolution cutoff is pretty conservative. There is one molecule in the asymmetric unit, so no NCS. There is no twinning either.

It seemed to me that the R is too low, not the Rf too high. More normally, R ends up about .18 - .20 for a data set at this resolution. I reprocessed the images with a different data processing program and redid the MR.
The data reduction statistics look similar, the resolution is the same, but now the structure refines to R = 0.20, Rf = 0.24 (same free R set of reflections chosen, still refmac/TLS.) The maps look more normal. Further rebuilding took us to R = 0.18, Rf = 0.22 So, the question I have (and that I've been asked by the student and PI) is: What was the problem with the original data set? What should I be looking for in the data reduction log files, for instance, or in the refinement log? The large R - free R spread is characteristic of overfitting, but the geometry is not too loose (rmsd bonds = 0.14), there are plenty of reflections (both working and free). Can anyone point me toward a reason R would be low? Thanks Sue Dr. Sue A. Roberts Dept. of Chemistry and Biochemistry University of Arizona 1041 E. Lowell St., Tucson, AZ 85721 Phone: 520 621 8171 or 520 621 4168 s...@email.arizona.edu http://www.cbc.arizona.edu/xray or http://www.cbc.arizona.edu/facilities/x-ray_diffraction
Re: [ccp4bb] AW: Twinning problem - almost solved.
Hi Herman,

Tighter restraints typically close the gap between R and R-free. This does not mean one should just tighten the restraints to satisfy one's own (or a referee's) idea of what the gap should be. I don't think there is a clear target for how large or small the gap should be. If you optimize the restraints to get the best (free) likelihood, you usually get a reasonable R gap without explicitly optimizing it.

Cheers,
Robbie

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Eleanor Dodson
Sent: Friday, June 21, 2013 14:21
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] AW: Twinning problem - almost solved.

At your resolution that seems to me a reasonable gap between R and Rfree?
Eleanor

On 21 Jun 2013, at 12:28, herman.schreu...@sanofi.com wrote:

Dear Bulletin Board,

After some headbanging (Refmac5 had helpfully created gap records for all insertions and deletions present in the structure), I got refmac5 running with the TWIN option. Refmac5 also found the k,h,-l domain and rejected the other possible domains because they were too small. The R-factors are now extremely good (~14%) and the Rfrees are acceptable to me (~24%). Since I found the difference between R and Rfree somewhat large, I have been playing with the weighting. By using a weight of 0.01, I can bring the R-factor up to 18%, but the Rfree stays about the same or even gets a little worse.

My question: is there a way to bring R and Rfree closer together, or is it related to the twinned data and something we have to live with?

Best regards,
Herman

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Miller, Mitchell D.
Sent: Thursday, 20 June 2013 17:43
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Twinning problem

You are welcome. For the benefit of others who may search the archives in the future, let me also correct two errors below (a typo and a mis-recollection). Specifically, I was thinking that phenix.refine was now able to refine multiple twin laws, but according to Nat Echols on the phenix mailing list http://phenix-online.org/pipermail/phenixbb/2013-March/019538.html phenix.refine only handles 1 twin law at this time. (My typo was that our second structure was 3nuz, with twin fractions 0.38, 0.32, 0.16 and 0.14 -- not 2nuz.)

A useful search for deposited structures mentioning tetartohedral twinning:
http://www.ebi.ac.uk/pdbe-srv/view/search?search_type=all_text&text=TETARTOHEDRALLY+OR+TETARTOHEDRAL

Regards,
Mitch

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of herman.schreu...@sanofi.com
Sent: Thursday, June 20, 2013 8:04 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] AW: Twinning problem

Dear Mitch (and Philip and Phil),

It is clear that I should give refmac a go with the non-detwinned F's and just the TWIN command.

Thank you for your suggestions,
Herman

-Original Message-
From: Miller, Mitchell D. [mailto:mmil...@slac.stanford.edu]
Sent: Thursday, 20 June 2013 16:18
To: Schreuder, Herman RD/DE
Subject: RE: Twinning problem

Hi Herman,

Have you considered the possibility that your crystals are tetartohedrally twinned? That is, more than one of the twin laws may apply to your crystals. E.g. in P32 it is possible to have tetartohedral twinning, which would have 4 twin domains: (h,k,l), (k,h,-l), (-h,-k,l) and (-k,-h,-l). Perfect tetartohedral twinning of P3 would merge in P622 and each twin domain would have a fraction of 0.25. We have had 2 cases like this (the first, 2PRX, was before there was support for this type of twinning except in shelxl, and we ended up with refined twin fractions of 0.38, 0.28, 0.19, 0.15 for the deposited crystal; a 2nd crystal that we did not deposit had twin fractions of 0.25, 0.27, 0.17, 0.31). The 2nd case we had was after support for twinning (including tetartohedral twinning) was added to refmac (and I think phenix.refine can also handle this).
For 2NUZ, it was P32 with refined twin fractions of 0.25, 0.27, 0.17, 0.31. Pietro Roversi wrote a review of tetartohedral twinning for the CCP4 proceedings issues of acta D http://dx.doi.org/10.1107/S0907444912006737 I would try refinement with refmac using the original (non-detwinned F's) with just the TWIN command to see if it ends up keeping twin fractions for all 3 operators (4 domains) -- especially with crystals 1 and 3 which appear to have the largest estimates of the other twin fractions. Regards, Mitch == Mitchell Miller, Ph.D. Joint Center for Structural Genomics Stanford Synchrotron Radiation Lightsource 2575 Sand Hill Rd -- SLAC MS 99 Menlo Park, CA 94025 Phone: 1-650-926-5036 FAX: 1-650-926-3292 -Original
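The four twin domains listed in Mitch's message can be made concrete with a small Python sketch. It assumes the usual twinning model in which the observed intensity is the twin-fraction-weighted sum of the intensities of the twin-related reflections; the function names are hypothetical and this is not how Refmac or any other program implements it.

```python
from typing import Dict, Sequence, Tuple

HKL = Tuple[int, int, int]


def twin_related_indices(hkl: HKL) -> Dict[str, HKL]:
    """Apply the four operators quoted in the thread to one reflection index."""
    h, k, l = hkl
    return {
        "h,k,l": (h, k, l),
        "k,h,-l": (k, h, -l),
        "-h,-k,l": (-h, -k, l),
        "-k,-h,-l": (-k, -h, -l),
    }


def twinned_intensity(
    domain_intensities: Sequence[float], twin_fractions: Sequence[float]
) -> float:
    """Weighted sum of domain intensities; twin fractions must add up to 1."""
    if len(domain_intensities) != len(twin_fractions):
        raise ValueError("need one twin fraction per domain intensity")
    if abs(sum(twin_fractions) - 1.0) > 1e-6:
        raise ValueError("twin fractions must sum to 1")
    return sum(a * i for a, i in zip(twin_fractions, domain_intensities))


if __name__ == "__main__":
    print(twin_related_indices((1, 2, 3)))
    # Twin fractions reported in the thread for the deposited 2PRX crystal;
    # the four intensities are made-up illustration values.
    print(twinned_intensity([100.0, 200.0, 300.0, 400.0], [0.38, 0.28, 0.19, 0.15]))
```

With perfect tetartohedral twinning all four fractions are 0.25, so the four twin-related reflections end up with the same observed intensity, which is why the data then appear to merge in the higher symmetry.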
Re: [ccp4bb] Concerns about statistics
Hi Andrea,

Any choice you make about a resolution cut-off based on a rule of thumb can be called into question by a referee who uses a different rule of thumb. So if you choose a metric + cut-off that is anything less than very conservative (say I/sigI 1), you have to be able to defend your choice either with a reference or with evidence from experiments. This is where the 'paired refinement' of the Karplus and Diederichs paper kicks in: you can show that you can get useful information out of the extra high-resolution reflections by comparing refinement results.

So what you can do is first solve your structure, then build and refine using a conservative resolution cut-off. Once you are nearing the final stages of the process, you can gradually go for higher resolutions using the paired refinement procedure. That way you have some results to support your choice of resolution cut-off. Who knows, when you reach the best resolution cut-off you may be able to add some more details to your structure model that you would have missed otherwise.

If you think that doing the paired refinement is too much work, you can try PDB_REDO. If you give it a PDB file with a resolution cut-off in REMARK 2 or 3 lower than the maximal resolution of your reflection file, it will automatically use paired refinement to find the best resolution cut-off (yes, this is a self-plug!).

HTH,
Robbie Joosten

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Andrea Edwards
Sent: Thursday, June 13, 2013 17:15
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Concerns about statistics

Hello group,

I have some rather (embarrassingly) basic questions to ask. Mainly: when deciding the resolution limit, which statistics are the most important? I have always been taught that the highest resolution bin should be chosen with I/sig no less than 2.0, Rmerg no more than 40%, and %Completeness as high as possible. However, I am currently faced with a set of statistics that are clearly outside these criteria. Is it acceptable to cut off the resolution using I/sig as low as 1.5 as long as the completeness is greater than 75%? Another way to put this: if % completeness is the new criterion for choosing your resolution limit (instead of Rmerg or I/sig), then what %completeness is too low to be considered? Also, I am aware that Rmerg increases with redundancy; is it acceptable to report Rmerg (or Rsym) at 66% and 98% with redundancy at 3.8 and 2.4 for the highest resolution bin of these crystals?

I appreciate any comments.
-A
Re: [ccp4bb] Fwd: [ccp4bb] pdbset
In Windows:

    findstr /b /v ANISOU input.pdb > output.pdb

Cheers,
Robbie

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Tim Gruene
Sent: Tuesday, June 11, 2013 10:40
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Fwd: [ccp4bb] pdbset

Dear Swastik Phulera,

unless you need the ANISOU cards, you can first remove them with e.g.

    grep -v ^ANISOU yourfile.pdb > yournewfile.pdb

before running pdbset. (I hope you don't work on a Windows machine; in that case you would probably first have to find a way to install 'grep', a command common on unixoid operating systems.)

By the way: how did you get negative B-values into your PDB file?

Best,
Tim

On 06/11/2013 10:22 AM, Swastik Phulera wrote:

-- Forwarded message --
From: Swastik Phulera swastik.phul...@gmail.com
Date: Tue, Jun 11, 2013 at 1:51 PM
Subject: Re: [ccp4bb] pdbset
To: Tim Gruene t...@shelx.uni-ac.gwdg.de

Dear Tim, Miguel,

Thanks for your suggestions. The program does work now, but it seems that it can't handle ANISOU records. It gives an error:

    PDBSET: *** AnisoU present: cannot reset B ***

Is there any other program which would set minimum B-factors for me? Also, I am looking for a program that would set the maximum occupancy to a desired value (it seems that pdbset can only play with the minimum values).

On Tue, Jun 11, 2013 at 12:11 PM, Tim Gruene t...@shelx.uni-ac.gwdg.de wrote:

Dear Swastik Phulera,

after the word 'output.pdb' you must first hit the Enter key, which takes you into the program pdbset. Then you type

    B_reset Minimum 0
    END

and the program runs. If you wish to do it without interaction, e.g. in a script, you can use the shell construct '<<':

    pdbset XYZIN input.pdb XYZOUT output.pdb << eof
    B_reset MINIMUM 0
    eof

Best,
Tim

On 06/11/2013 08:15 AM, Swastik Phulera wrote:

Dear All,

I am trying to use pdbset from the terminal and am constantly getting an error:

    [XYZ@NCCS3 110613]$ pdbset XYZIN input.pdb XYZOUT output.pdb B_reset MINIMUM 0
    CCP4 library signal ccp4_general:Use: logical name file name (Error)
    raised in ccp4fyp
    pdbset: Use: logical name file name
    pdbset: Use: logical name file name
    Times: User: 0.0s System: 0.0s Elapsed: 0:00

Does anyone have any idea what's wrong here?

Swastik Phulera

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen
GPG Key ID = A46BEE1A
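For readers without grep or findstr, the same clean-up can be sketched in Python. This is not a pdbset feature; `clean_pdb_lines` is a hypothetical helper that drops ANISOU records and, as Swastik asked, clamps B-factors to a minimum and occupancies to a maximum. It assumes standard fixed-column PDB ATOM/HETATM records (occupancy in columns 55-60, B-factor in columns 61-66).

```python
from typing import Iterable, List


def clean_pdb_lines(
    lines: Iterable[str], b_min: float = 0.0, occ_max: float = 1.0
) -> List[str]:
    """Drop ANISOU records; clamp B-factors from below and occupancies from above."""
    cleaned = []
    for line in lines:
        record = line[:6]
        if record == "ANISOU":
            continue  # same effect as: grep -v ^ANISOU
        if record in ("ATOM  ", "HETATM") and len(line.rstrip("\n")) >= 66:
            occupancy = min(float(line[54:60]), occ_max)
            b_factor = max(float(line[60:66]), b_min)
            # Rewrite only the two numeric fields, keeping all other columns.
            line = f"{line[:54]}{occupancy:6.2f}{b_factor:6.2f}{line[66:]}"
        cleaned.append(line)
    return cleaned


if __name__ == "__main__":
    # File names are placeholders, as in the commands above.
    with open("input.pdb") as handle:
        result = clean_pdb_lines(handle, b_min=0.0, occ_max=1.0)
    with open("output.pdb", "w") as handle:
        handle.writelines(result)
```

Removing ANISOU records throws away the anisotropic displacement parameters, so this only makes sense when they are not needed, as Tim points out.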
Re: [ccp4bb] Formageddon is upon us... Important news from wwPDB! (help-4246)
Hi Rachel,

Thanks for clarifying this. Things will be much easier when SPLIT entries are consolidated. That is, when all the software is updated as well. Entries are split only for reasons of size. 1apg was, sort of, an example of the contrary (see the, now obsolete, reflection file). I thought there were more examples, but fortunately I couldn't find any quickly.

Cheers,
Robbie

We will continue the practice that any structure deposited as a single PDBx/mmCIF file will be divided into SPLIT files and added to the ftp archive as usual, in all supported formats (PDB, PDBx/mmCIF, PDBML/XML). Entries are split only for reasons of size. SPLIT entries currently in the archive will remain as they are until sometime in 2014, when we plan to consolidate all existing and new SPLIT entries into single large files.

Regards,
Rachel

Rachel Kramer Green, Ph.D.
RCSB PDB
kra...@rcsb.rutgers.edu
Twitter: https://twitter.com/#!/buildmodels
Facebook: http://www.facebook.com/RCSBPDB

On 5/24/2013 12:41 PM, Robbie Joosten wrote:

Perhaps a silly question: will old entries with SPLIT records be superseded by consolidated entries? And what about entries split for other reasons than size (there are only a few of those, and they are old)?

Cheers,
Robbie

From: Gerard DVD Kleywegt
Sent: 24-5-2013 20:21
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Formageddon is upon us... Important news from wwPDB!

Dear colleagues,

I would like to draw your attention to a notification from the wwPDB partners about Deposition and Release of PDB Entries Containing Large Structures - see: http://www.wwpdb.org/news/news_2013.html#22-May-2013

There are major changes afoot in the way large structures are handled in the PDB, as well as in the deposition and annotation procedures and software used by the wwPDB sites.
This is of immediate relevance for depositors and users of large structures, but also for software developers and anyone who routinely processes the entire PDB archive or its weekly releases (bioinformatics resources, etc.). From 2014, it will affect essentially everyone who deposits, uses or processes PDB entries. If you have any questions about the new deposition system or the procedures for handling large structures or any of the other changes, please contact: i...@wwpdb.org Please pass on this information to anyone likely to be affected by the upcoming changes. Thanks! On behalf of the Worldwide Protein Data Bank, --Gerard Kleywegt --- Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK ger...@ebi.ac.uk . pdbe.org Secretary: Pauline Haslam pdbe_ad...@ebi.ac.uk
Re: [ccp4bb] Formageddon is upon us... Important news from wwPDB!
Perhaps a silly question: will old entries with SPLIT records be superseded by consolidated entries? And what about entries split for other reasons than size (there are only a few of those, and they are old)?

Cheers,
Robbie

From: Gerard DVD Kleywegt
Sent: 24-5-2013 20:21
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Formageddon is upon us... Important news from wwPDB!

Dear colleagues,

I would like to draw your attention to a notification from the wwPDB partners about Deposition and Release of PDB Entries Containing Large Structures - see: http://www.wwpdb.org/news/news_2013.html#22-May-2013

There are major changes afoot in the way large structures are handled in the PDB, as well as in the deposition and annotation procedures and software used by the wwPDB sites. This is of immediate relevance for depositors and users of large structures, but also for software developers and anyone who routinely processes the entire PDB archive or its weekly releases (bioinformatics resources, etc.). From 2014, it will affect essentially everyone who deposits, uses or processes PDB entries.

If you have any questions about the new deposition system or the procedures for handling large structures or any of the other changes, please contact: i...@wwpdb.org

Please pass on this information to anyone likely to be affected by the upcoming changes. Thanks!

On behalf of the Worldwide Protein Data Bank,

--Gerard Kleywegt

---
Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK
ger...@ebi.ac.uk . pdbe.org
Secretary: Pauline Haslam pdbe_ad...@ebi.ac.uk
Re: [ccp4bb] LINK or LINKR
Hi Eleanor,

The recent versions of Refmac work well with the records in PDB format. According to the list of bug fixes on the website, Refmac should now take the distance from the PDB file (it used to complain about the distance record). Changing the 1.48 to 1.61 in the new LINK record should do the trick.

So much for the theory; in practice there are still a lot of difficulties in dealing with LINKs.

1) I noticed the LINK record in the output has a different symmetry record; are the two equivalent?
2) The PDB generates LINK records upon deposition, even for things that were not restrained by LINKs in refinement, which may misrepresent the refinement.
3) The LINK records in the PDB give the actual distance, not the target. This means that you can accidentally replace good restraint targets with poor ones, simply by loading a (previously poorly refined or mis-annotated) PDB file.
4) There is no consensus dictionary or repository for LINKs at the PDB. The CCP4 dictionary has a number of LINKs, but is quite incomplete.
5) Some target LINK lengths, especially in ion coordination, vary with context even if the atoms involved are the same.

Cheers,
Robbie

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Eleanor Dodson
Sent: Friday, April 26, 2013 13:30
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] LINK or LINKR

Is there any consensus about the accepted format for this? I believe Garib uses LINKR to add a link name to the record (I can't find a description in the documentation, though), but the documentation also says REFMAC provides a link between symmetry-related atoms like this, with the target distance here:

    LINK1P DG A 11.61000 O3' DC A 2 1555 6554 i

But REFMAC a) ignores the given distance and b) writes it out as:

    LINK PDG A 1 O3' DC A 2 1555 2554 1.48

This is in agreement with the PDB definition but with a wrong distance, presumably derived in the innards of the dictionary.

PDB LINK definition:

    12345678901234567890123456789012345678901234567890123456789012345678901234567890
    LINK         O   GLY A  49                NA    NA A6001     1555   1555  2.98
    LINK         OG1 THR A  51                NA    NA A6001     1555   1555  2.72

coot seems to refuse to read the LINKR at all!

Confused
Eleanor
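Since much of the confusion above comes from column positions, here is a small Python sketch of a parser for the standard PDB LINK record, using the column layout of the wwPDB format description (atom names in columns 13-16 and 43-46, symmetry operators in 60-65 and 67-72, distance in 74-78). The `LinkRecord` class and `parse_link` function are hypothetical names; the sketch deliberately rejects Refmac's LINKR variant, whose layout is not described in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkRecord:
    atom1: str
    res1: str
    chain1: str
    seq1: int
    atom2: str
    res2: str
    chain2: str
    seq2: int
    sym1: str
    sym2: str
    length: Optional[float]  # the distance as written in the file, not a target


def parse_link(line: str) -> LinkRecord:
    """Parse one fixed-column PDB LINK record (not LINKR)."""
    if not line.startswith("LINK  "):
        raise ValueError("not a standard PDB LINK record")
    rec = line.rstrip("\n").ljust(80)
    dist = rec[73:78].strip()
    return LinkRecord(
        atom1=rec[12:16].strip(), res1=rec[17:20].strip(),
        chain1=rec[21], seq1=int(rec[22:26]),
        atom2=rec[42:46].strip(), res2=rec[47:50].strip(),
        chain2=rec[51], seq2=int(rec[52:56]),
        sym1=rec[59:65].strip(), sym2=rec[66:72].strip(),
        length=float(dist) if dist else None,
    )


if __name__ == "__main__":
    # The first example record from the PDB definition quoted above,
    # assembled piecewise so that the column positions are explicit.
    example = (
        "LINK" + " " * 9 + "O" + " " * 3 + "GLY A  49" + " " * 16
        + "NA" + " " * 4 + "NA A6001" + " " * 5 + "1555"
        + " " * 3 + "1555" + " " * 2 + "2.98"
    )
    print(parse_link(example))
```

Note that the parsed `length` is whatever distance the file contains, which is exactly the pitfall in Robbie's point 3: it should not be mistaken for a restraint target.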
Re: [ccp4bb] refinement hanging--what am I missing?
Hi Patrick, Did you try using a different refinement program (e.g. Refmac)? Which type of NCS restraints did you use, global or local (torsion- or distance-based)? Have you tried optimizing your restraint weights? Have you tried running a huge number of refinement cycles? You can also try running PDB_REDO (plug plug) which will try a number of things to improve your model. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Patrick Loll Sent: Saturday, April 27, 2013 00:32 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] refinement hanging--what am I missing? Responding to a couple of questions from Ethan, Charlie, and Phil: Ethan: Using the default bulk solvent modeling in Phenix; no selenium, but I'll double-check wavelengths as a sanity check for scattering factors (but several other native data sets from the same synchrotron trip refined beautifully, so I suspect there's no gross boo-boos of this nature...) Charlie: Solvent regions are pretty clean; I haven't tried any flipping (these are molecular replacement models, so it didn't occur to me...). I tried applying NCS in one case (the smaller cell) and it had no apparent effect on the refinement. The Fo-Fc map has no strong features crying out for interpretation. Just based on geometry and map appearance, I'd be inclined to say the refinement is done, were it not for the crappy R values. Phil: I used TLS for refinement in both xtal forms; it gives a small improvement (0.01-0.03) in Rfree in both cases, but nothing magic. I simply used one monomer/TLS group (these are ubiquitin variants, so the monomer itself is pretty much a little rock, without any internal domain motions). There are the usual complement of disordered side chains, but nothing unusual, and 98% of the main chain is accounted for. Haven't tried Arp/wArp yet... Excellent thoughts, keep those cards and letters coming. I'm still chewing on the substantive comments from Dean and Adrian... 
Thanks, Pat On 26 Apr 2013, at 6:17 PM, Carter, Charlie wrote: Hi Pat, Your stats aren't all that bad, but I share your discomfort. Do the solvent regions retain any significant features? Have you tried flipping those features? Have you applied NCS? What does the Fo - Fc map look like? Charlie On Apr 26, 2013, at 5:38 PM, Patrick Loll wrote: Hi all, Here is a problem that's been annoying me, and demanding levels of thought all out of proportion with the importance of the project: I have two related crystal forms of the same small protein. In both cases, the data look quite decent, and extend beyond 2 A, but the refinement stalls with statistics that are just bad enough to make me deeply uncomfortable. However, the maps look pretty good, and there's no obvious path to push the refinement further. Xtriage doesn't raise any red flags, nor does running the data through the Yeates twinning server. Xtal form 1: P22(1)2(1), a=29.0, b=57.4, c=67.4; 2 molecules/AU. Resolution of data ~ 1.9 Å. Refinement converges with R/Rfree = 0.24/0.27 Xtal form 2: P2(1)2(1)2(1), a=59.50, b=61.1, c=67.2; 4 molecules/AU. Resolution of data ~ 1.7 Å. Refinement converges w/ R/Rfree = 0.21/0.26 As you would expect, the packing is essentially the same in both crystal forms. It's interesting to note (but is it relevant?) that the packing is quite dense- -solvent content is only 25-30%. This kind of stalling at high R values smells like a twin problem, but it's not clear to me what specific kind of twinning might explain this behavior. Any thoughts about what I might be missing here? Thanks, Pat - -- Patrick J. Loll, Ph. D. Professor of Biochemistry Molecular Biology Director, Biochemistry Graduate Program Drexel University College of Medicine Room 10-102 New College Building 245 N. 15th St., Mailstop 497 Philadelphia, PA 19102-1192 USA (215) 762-7706 pat.l...@drexelmed.edu
Re: [ccp4bb] Alternate sugar conformations in refmac 5.5.0110
Hi Markus, You could try changing your Refmac version. The version you are using is ancient. You may have an old version in your PATH next to the new one because your CCP4 seems up to date. AFAICT there is nothing wrong with the LINKR or the HETATM records Sent from my Windows Phone From: Markus Meier Sent: 2013-04-18 22:19 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] Alternate sugar conformations in refmac 5.5.0110 Dear all, I am trying to refine a beta-D-N-Acetyl glucosamine moiety (NAG) linked to an asparagine (ASN) in Refmac. (version CCP4 6.3: Refmac_5.5.0110 version 5.5.0110 : 08/05/10) The sugar has two alternate conformations that occupy the same position in the electron density but are rotated 180° relative to each other. Even though I have defined two alternate conformations for the asparagine side chain, the sugar and the beta-link, Refmac pushes the two conformations apart, out of the electron density (while still honouring the link to Asn). So it seems that Refmac applies repulsive forces between the alternate conformations. Did anyone else experience this and/or can suggest a fix? All help is very much appreciated! 
My definitions for the alternate conformations in the PDB file are below:

    LINKRC1 ANAG E 1 ND2AASN A 116NAG-ASN
    LINKRC1 BNAG E 1 ND2BASN A 116NAG-ASN
    HETATM 125 C1 ANAG E 1 30.978 40.626 -25.446 0.50 98.96 EC
    HETATM 126 C2 ANAG E 1 30.428 41.897 -26.138 0.50100.81 EC
    HETATM 127 N2 ANAG E 1 29.783 41.751 -27.447 0.50 94.26 EN
    HETATM 128 C7 ANAG E 1 28.807 42.561 -27.924 0.50 89.45 EC
    HETATM 129 O7 ANAG E 1 28.325 42.377 -29.035 0.50 83.16 EO
    HETATM 130 C8 ANAG E 1 28.252 43.720 -27.128 0.50 87.63 EC
    HETATM 131 C3 ANAG E 1 31.618 42.847 -26.221 0.50102.77 EC
    HETATM 132 O3 ANAG E 1 31.420 43.888 -27.159 0.50102.25 EO
    HETATM 133 C4 ANAG E 1 31.840 43.369 -24.802 0.50103.37 EC
    HETATM 134 O4 ANAG E 1 32.969 44.217 -24.782 0.50104.91 EO
    HETATM 135 C5 ANAG E 1 32.040 42.223 -23.785 0.50 94.99 EC
    HETATM 136 C6 ANAG E 1 31.564 42.633 -22.381 0.50 86.60 EC
    HETATM 137 O6 ANAG E 1 32.632 42.591 -21.462 0.50 77.76 EO
    HETATM 138 O5 ANAG E 1 31.458 40.954 -24.130 0.50 99.19 EO
    HETATM 139 C1 BNAG E 1 30.271 40.925 -24.108 0.50 98.96 EC
    HETATM 140 C2 BNAG E 1 31.415 41.878 -23.684 0.50100.81 EC
    HETATM 141 N2 BNAG E 1 32.048 41.664 -22.378 0.50 94.26 EN
    HETATM 142 C7 BNAG E 1 33.331 41.981 -22.079 0.50 89.45 EC
    HETATM 143 O7 BNAG E 1 33.784 41.778 -20.959 0.50 83.16 EO
    HETATM 144 C8 BNAG E 1 34.276 42.586 -23.092 0.50 87.63 EC
    HETATM 145 C3 BNAG E 1 30.816 43.277 -23.775 0.50102.77 EC
    HETATM 146 O3 BNAG E 1 31.569 44.240 -23.061 0.50102.25 EO
    HETATM 147 C4 BNAG E 1 30.718 43.597 -25.266 0.50103.37 EC
    HETATM 148 O4 BNAG E 1 30.115 44.862 -25.440 0.50104.91 EO
    HETATM 149 C5 BNAG E 1 29.907 42.532 -26.038 0.50 94.99 EC
    HETATM 150 C6 BNAG E 1 30.373 42.426 -27.500 0.50 86.60 EC
    HETATM 151 O6 BNAG E 1 29.320 42.743 -28.381 0.50 77.76 EO
    HETATM 152 O5 BNAG E 1 29.866 41.216 -25.458 0.50 99.19 EO
    ATOM 745 N ASN A 116 27.207 36.475 -25.453 1.00 62.15 AN
    ATOM 746 CA AASN A 116 28.475 37.144 -25.135 0.50 61.77 AC
    ATOM 747 CB AASN A 116 28.182 38.336 -24.223 0.50 69.02 AC
    ATOM 748 CG AASN A 116 29.037 39.555 -24.495 0.50 79.45 AC
    ATOM 749 OD1AASN A 116 28.656 40.644 -24.047 0.50 83.18 AO
    ATOM 750 ND2AASN A 116 30.178 39.419 -25.218 0.50 87.72 AN
    ATOM 751 C ASN A 116 29.353 36.143 -24.389 1.00 53.28 AC
    ATOM 752 O ASN A 116 29.778 36.380 -23.266 1.00 48.83 AO
    ATOM 753 CA BASN A 116 28.575 37.144 -25.135 0.50 61.77 AC
    ATOM 754 CB BASN A 116 28.282 38.336 -24.223 0.50 69.02 AC
    ATOM 755 CG BASN A 116 29.487 39.201 -23.920 0.50 79.45 AC
    ATOM 756 OD1BASN A 116 30.611 38.710 -24.084 0.50 83.18 AO
    ATOM 757 ND2BASN A 116 29.299 40.471 -23.478 0.50 87.72 AN

The Refmac log output shows that the link description was recognized correctly for each conformation:

    WARNING : residue: NAG 1 chain:EE atom: O1 is absent in coord_file
    WARNING : link(spec):NAG-ASN is found dist = 1.466 ideal_dist= 1.439 ch:EE res: 1 NAG
Re: [ccp4bb] Angle restraints
Hi Kavya, Which validation program did you use? How big is the deviation (in sigma values)? Is it the only outlier? What is your overall bond angle rmsZ? Using external restraints is a bit over the top here, especially if it is the only outlier. If your rmsZ is high (close to or over 1) then you may want to try tighter geometric restraints overall. In a normal distribution it is not surprising to find a 'true' outlier, so if your structure is large you need to worry less. Cheers, Robbie Sent from my Windows Phone From: Kavyashree Manjunath Sent: 2013-04-15 07:13 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] Angle restraints Dear users, Validation of a structure showed a deviation in the angle between atoms NH1-CZ-NE and NH2-CZ-NE in the arginine residue. Several trials of modification of the orientation failed to solve this problem. I also confirmed by deleting the side chain and refining, it confirmed the presence of complete side chain. So I proceeded to use external restraints for these two angles in refmac5 (version 5.6.0117). The keyword was as follows - external angle first chain A residue 93 atom NE next chain A residue 93 atom CZ next chain A residue 93 atom NH1 value 120.3 sigma 0.5 external angle first chain A residue 93 atom NH2 next chain A residue 93 atom CZ next chain A residue 93 atom NE value 120.3 sigma 0.5 Still there is no change in the angle, it continues to have the same deviation. So kindly suggest whether there is any error in the keyword provided or other way to handle this problem. Thanking you Regards Kavya -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
Re: [ccp4bb] Angle restraints
Dear Kavya,

First try Herman's suggestions. You can try changing the restraint weight, but it will probably not solve the problem; it may only hide it. If you cannot solve the problem and you have done the best you can, you can deposit the model with the outlier. The PDB does not reject models with (minor) issues. Future users will be warned by REMARK 500 if they want to use that specific arginine for further research.

Cheers, Robbie

Sent from my Windows Phone

From: Kavyashree Manjunath Sent: 2013-04-15 09:35 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Angle restraints

Sir, I used the RCSB validation server. The deviations are given below:

Deviation  Resi  AT1 - AT2 - AT3  Bond Angle  Dict Value  Std Dev
   3.2     ARG   NE  - CZ  - NH1    123.5       120.3       0.5
  -3.4     ARG   NE  - CZ  - NH2    116.9       120.3       0.5

Thus it is more than 6 sigma in both cases. It's not an outlier. zANGL is 0.628 and Rms BondAngle is 1.4077. Would you suggest that I ignore this deviation? But will it not be a problem during PDB submission? What is the reason I am getting a deviation like this? Should I reduce the weighting term further?

Thanking you Regards Kavya
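The "more than 6 sigma" figure in this thread follows from dividing each reported deviation by the dictionary standard deviation. A minimal Python sketch using only the numbers from the validation report quoted above; the function name `angle_z` is hypothetical and not part of any validation program:

```python
def angle_z(observed, ideal, sigma):
    """Z-score of a bond angle: deviation from the dictionary value in units of sigma."""
    return (observed - ideal) / sigma


# (observed, dictionary value, dictionary sigma) in degrees, as reported for the arginine.
angles = {
    "NE-CZ-NH1": (123.5, 120.3, 0.5),
    "NE-CZ-NH2": (116.9, 120.3, 0.5),
}

z_scores = {name: angle_z(*values) for name, values in angles.items()}
for name, z in z_scores.items():
    print(f"{name}: Z = {z:+.1f}")
```

Both angles come out beyond 6 sigma with opposite signs, i.e. one angle is too wide and the other too narrow by a similar amount.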
Re: [ccp4bb] Puzzling Structure
Hi Martyn,

I think the question is where the error was made - seeing the uploaded file would clear this up. But it seems unlikely to me that the depositor saw a huge R factor discrepancy at the end of refinement and just blithely uploaded it. So scenario 3 :- PDB : we cannot reproduce your R factor with our programs Depositor : that's your problem mate - it was fine when it left me...up to you to sort it... Which seems a sort of reasonable attitude to me.

Not quite, the depositor has to give, i.e. type, the space group (example depositions: https://www.ebi.ac.uk/pdbe-xdep/autodep/AutoDep?param=QovCsvhNv06Mpnr%2BvIkqqfuqeeBd8leFNAVymZgS%2Fe7mULyfrCaTMN8jsyaGZUTyUDQyN3gMF3o%3D). Don't ask me why, because it is clearly a source of error.

Checking back to see the space group misparsed and re-running the water moving-rem500 - and validation scripts would have cleared this problem with no action needed from the depositor.

I totally agree that the annotator could have intercepted the problem. But the responsibility lies with the depositor. Hooray for bureaucracy!

Cheers, Robbie

Cheers Martyn
Re: [ccp4bb] Puzzling Structure
Hi Martyn, A shame then that these 'helpful' annotators did not make use of Pavel's basic sanity on the space group (*mentioned below) and check back to the one listed in the uploaded PDB file. As far as I know, EDS is run on all new depositions at PDBe. I don't know whether they already did that when this model was deposited. Even if they did, this may not solve the problem because the PDB does not refuse models. Possible scenarios (there may be more): 1) Pretty bad case - Annotator: We cannot reproduce your R-factors in EDS. Could you check the annotated coordinate and reflection files? - Depositor: Not interested (the paper is almost accepted anyway). Approve model as-is. 2) Worst case - Annotator: ... (doesn't notice the problem) - Depositor: ... (doesn't notice the problem) I often wonder why the PDB does not make the deposited coordinate file publicly available so that these sorts of issues can be checked and tracked. Good point, I wonder about that as well. Also about whether depositors would like that? The whole PDB data (excluding the EMDB that was recently merged into it) amounts to about a laptop hard drive's worth of data - so surely space can be made for the deposited coordinates? (and restraint files which will be very useful for other workers including pdb-redo). Yes, having access to certain restraint files (particularly for LINKs) would be very nice. That said, a proper repository of consensus-restraints for hetero compounds and LINKs would be more reliable than potentially different restraints for each PDB entry. Having the depositors' uploaded data would help me understand other puzzling features of structures such as the current 4GRV.pdb which seems to have a list of TLS groups but contains not a single ANISOU line!... I'm not a big fan of using ANISOU records for TLS contributions anyway ;-) But, more seriously, PDB entries should adhere to the PDB standard. 
Cheers, Robbie Cheers Martyn *In this particular case attempting to calculate R-factor using data and model files and making sure that the R you get is not twice as large as published one would entirely suffice -:) Pavel
Re: [ccp4bb] Puzzling Structure
Waters are moved during annotation using the perceived space group's symmetry operation. So if the authors give the wrong space group, then the annotation pipeline understandably messes things up. If the originally uploaded PDB file was kept by PDBe, then the problem can be recovered quite easily by the annotators. Perhaps the topic starter, Michel Fodje, can send a bug report to PDBe. In my experience, the annotators are very helpful resolving these matters. potential flame Hoping that the depositors solve the problem by themselves, is probably in vain: There are many crystallographers who do not read the CCP4BB (which is a shame, really); they didn't notice the enormous amount of water related bumps in their final model (which is in the validation report you get after deposition and in REMARK 500 of the PDB file you have to approve); they also didn't notice the huge number of symmetry-related bumps; the R-factors in the PDB file are different from (and better than) the ones in Table 1. Also notice that the paper was submitted on April 21st 2009 and the model was deposited on June 29th 2009. Paper accepted on July 8th 2009. But I'm sure the referees had a chance to properly assess the quality of the structure model ;-) / potential flame Cheers, Robbie P.S. It's pretty awesome that the problem was solved in less than 20 minutes by the CCP4BB (that is, by Phoebe Rice) -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Garib N Murshudov Sent: Friday, April 12, 2013 21:39 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Puzzling Structure It is typo: R factor for p212121 - 0.4 for p21212- around 0.18 Although water seem to have been moved around using p212121 On 12 Apr 2013, at 16:33, Phoebe A. Rice wrote: Looks like a typo to me: if you change the CRYST space group record from P212121 to P21212, as the paper says it is, the packing problem goes away. ++ Phoebe A. Rice Dept. 
of Biochemistry Molecular Biology The University of Chicago 773 834 1723; pr...@uchicago.edu http://bmb.bsd.uchicago.edu/Faculty_and_Research/ http://www.rsc.org/shop/books/2008/9780854042722.asp From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Michel Fodje [michel.fo...@lightsource.ca] Sent: Friday, April 12, 2013 2:17 PM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Puzzling Structure By the way, you will need to show symmetry atoms to see the problem. -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Michel Fodje Sent: April-12-13 1:14 PM To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] Puzzling Structure Has anyone else noticed a problem with the structure of the N-terminal capsid domain of HIV-2 PDB 2wlv. Load it up to in coot and navigate to residue B118. /Michel. Dr Garib N Murshudov Group Leader, MRC Laboratory of Molecular Biology Francis Crick Avenue Cambridge Biomedical Campus Cambridge CB2 0QH UK Email: ga...@mrc-lmb.cam.ac.uk Web http://www.mrc-lmb.cam.ac.uk http://www.mrc-lmb.cam.ac.uk/
Re: [ccp4bb] Rfree reflections
Hi Tim,

I don't think the 5-10% or 500-1000 reflections are real rules, but rather practical choices. The error margin in R-free is inversely proportional to the square root of the number of reflections in your test set and proportional to R-free itself. So for R-free to be 'significant' you need some absolute number of reflections to reach your cut-off of significance. This is where the 1000 comes from (500 is really pushing the limit).

You want to make sure the error margins in R and R-free are not too far apart, and you probably also want to keep the test set representative of the whole data set (this is particularly important because we use hold-out validation: you only get one shot at validating). This is where the 5%-10% comes from.

Another consideration for going for the 5%-10% thing is that this makes it feasible to do 'full' (i.e. k-fold) cross-validation: you only have to do 10 to 20 refinements. If you went for 1000 reflections you would have to do 48 refinements for the average dataset. Personally, I take 5% and increase this percentage to a maximum of 10% if using 5% gives me a test set smaller than 1000 reflections.

HTH, Robbie

-Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Tim Gruene Sent: Tuesday, March 26, 2013 09:33 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] Rfree reflections

Dear all, I recall that the set of Rfree reflections should be 500-1000, rather than 5-10%, but I cannot find the reference for it (maybe Ian Tickle?). I would therefore like to be confirmed or corrected: Is there an absolute number required for Rfree to be significant, i.e. 500-1000 irrespective of the total number of unique reflections in the data set, or is it 5-10% (as a compromise)?

Thanks and regards, Tim

-- Dr Tim Gruene Institut fuer anorganische Chemie Tammannstr. 4 D-37077 Goettingen GPG Key ID = A46BEE1A
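Robbie's personal rule of thumb in the reply above (take 5%, and go up to at most 10% when 5% would give fewer than 1000 test reflections) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and raising the fraction "just enough" to reach 1000 reflections is an assumption, since the message does not say how the percentage is stepped up.

```python
def free_set_fraction(n_reflections, base=0.05, ceiling=0.10, target=1000):
    """Fraction of reflections to flag for R-free under the rule of thumb above."""
    if n_reflections <= 0:
        raise ValueError("n_reflections must be positive")
    if n_reflections * base >= target:
        return base
    # Assumption: raise the fraction just enough to reach the target,
    # but never beyond the 10% ceiling.
    return min(ceiling, target / n_reflections)


def n_folds(fraction):
    """Number of refinements needed for full k-fold cross-validation."""
    return round(1.0 / fraction)


for n in (50000, 15000, 8000):
    frac = free_set_fraction(n)
    print(f"{n} reflections: {frac:.1%} test set, {n_folds(frac)} refinements for k-fold")
```

With 5% the full cross-validation needs 20 refinements and with 10% it needs 10, which is the range mentioned in the message.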
Re: [ccp4bb] Rfree reflections
Hi Tim,

The derivation of sigma(Rw-free) is in this paper: Tickle et al., Acta Cryst. (2000). D56, 442-450. Note the difference between the sigma of the weighted/generalized/Hamilton R-free and that of the 'regular' R-free (there is a 2 there somewhere). From my own tests (10-fold cross-validation on 38 small datasets) I also find sigma(R-free) = R-free/sqrt(Ntest). For large datasets you really do not need to do k-fold cross-validation, because sigma(R-free) can be predicted quite well. We just need to realize that it exists.

Cheers, Robbie

-Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Tim Gruene Sent: Tuesday, March 26, 2013 11:05 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Rfree reflections

Hi Robbie, thank you for the explanation. Heinz Gut and Michael Hadders pointed me at Axel Brunger's publication Methods Enzymol. 1997;277:366-96, http://www.ncbi.nlm.nih.gov/pubmed/18488318, which is where I got the notion of 500-1000 from. In this article a decrease of the error margin of Rfree with n^(1/2) is mentioned (p.384), but only as an observation. Is your statement 'inverse proportional with the number of reflections' based on some statistical treatment, or also just on observation?

It is a pity that k-fold cross-validation is not standard routine, because it seems so easy and so quick to do with today's computers and a simple script. But that's probably like reminding people not to use R_int anymore in favour of R_meas...

Cheers, Tim
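To get a feel for the empirical relation sigma(R-free) = R-free/sqrt(Ntest) quoted in Robbie's reply, here is a small Python sketch. It covers only the 'regular' R-free as stated in that message; the function name is hypothetical, and the weighted/Hamilton variant is deliberately left out because the message does not spell out its formula.

```python
import math


def sigma_rfree(r_free, n_test):
    """Expected standard deviation of the 'regular' R-free: R-free / sqrt(Ntest)."""
    if n_test <= 0:
        raise ValueError("n_test must be positive")
    return r_free / math.sqrt(n_test)


# R-free of 25% evaluated for a few test-set sizes.
for n_test in (500, 1000, 2000):
    print(f"Ntest = {n_test}: sigma(R-free) = {sigma_rfree(0.25, n_test):.4f}")
```

For an R-free of 0.25 and 1000 test reflections this gives a sigma of roughly 0.008, which is why differences in R-free of a few tenths of a percent should not be over-interpreted.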
Re: [ccp4bb] Query regarding the use of anisotropic temperature factor and ideal rmsAngle and rmsBond length values
Dear Sonali,

There is no such thing as an ideal rmsd for bonds and angles at a given resolution. IMO you should use rmsZ, which also doesn't have an ideal value. If it's below 1 you're good. As for isotropic vs anisotropic, you can use a Hamilton test if you do two refinements changing only the B-factor model. Ethan Merritt had a paper about that last year in Acta D. Now for the self plug: PDB_REDO automatically selects the B-factor model based on this test (and some more selectors) and will also optimize your weights to get proper rmsZ values for geometry.

Cheers, Robbie

Sent from my Windows Phone

From: sonali dhindwal Sent: 2013-03-17 09:06 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] Query regarding the use of anisotropic temperature factor and ideal rmsAngle and rmsBond length values

Dear All, We would like some suggestions regarding refinement of data in Refmac. We have data with resolution up to 1.5 A, with an overall redundancy of 5.5 (3.7 in the high-resolution bin) and an I over sigma of 21 overall (2.2 in the last resolution bin). When we first did isotropic refinement we used the automatic weighting term, which gave good Rfree and Rfactor of 18.4 and 16.9 but high rmsBond and rmsAngle of 0.027 and 2.5 respectively. We were able to improve the rmsBond and rmsAngle values by decreasing the weighting term to 0.5. But when we do anisotropic refinement with a weighting term of 0.5 it gives Rfree, Rfactor and FOM of 16.8, 15.0 and 90.7 respectively, and rmsBond and rmsAngle of 0.0074 and 1.25. Now, we want to know what the ideal values for rmsAngle and rmsBond should be at such a resolution and, secondly, whether we can use anisotropic refinement with such data. All your suggestions will be highly valuable. Thanks in advance.

-- Sonali Dhindwal “Live as if you were to die tomorrow. Learn as if you were to live forever.”
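The rmsZ that Robbie recommends is the root-mean-square of the per-restraint Z-scores, so unlike a plain rmsd it accounts for the different target sigmas of different bonds or angles. A minimal Python sketch follows; the function name and the example angles, targets and sigmas are hypothetical illustration values, not taken from the thread or from any restraint dictionary.

```python
import math


def rms_z(observed, ideal, sigma):
    """Root-mean-square Z-score over parallel sequences of restraint values."""
    if not observed or not (len(observed) == len(ideal) == len(sigma)):
        raise ValueError("need equal-length, non-empty sequences")
    z_squared = [((o - i) / s) ** 2 for o, i, s in zip(observed, ideal, sigma)]
    return math.sqrt(sum(z_squared) / len(z_squared))


# Hypothetical bond angles (degrees) with made-up targets and sigmas.
obs = [110.9, 121.5, 118.2, 111.0]
tgt = [111.0, 120.3, 119.4, 110.1]
sig = [2.7, 1.6, 1.1, 1.9]
print(f"rmsZ = {rms_z(obs, tgt, sig):.2f}")
```

An rmsZ below 1 means the model deviates from the restraint targets, on average, by less than the dictionary sigma.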
Re: [ccp4bb] Query regarding the use of anisotropic temperature factor and ideal rmsAngle and rmsBond length values
Small addition to Ian's comment. The value you give with 'weight auto $value' is a starting value. Refmac will gradually change it if needed (it's autoweighting after all) and your starting value does matter somewhat. Based on Ian's advice PDB_REDO uses a starting value of 2.50, which seems to do the trick most of the time.

Cheers, Robbie

Sent from my Windows Phone

From: Ian Tickle Sent: 2013-03-17 13:15 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Query regarding the use of anisotropic temperature factor and ideal rmsAngle and rmsBond length values

Hi Sonali

The 'WEIGHT MATRIX Wm' scale factor, to which I assume you're referring, is on a relative, not absolute, scale so is not comparable between different models, i.e. the results will be substantially different when you change the model keeping Wm fixed, as you discovered. If you want the weight to be more easily comparable for different models (and also for data with different resolution limits) you need it to be on an absolute scale: this is what you get with 'WEIGHT AUTO Wa'.

For a theoretically perfect model the optimal value of Wa would be 1, i.e. the geometrical and X-ray weights would be on the same absolute scale, though in practice the optimal value of Wa usually turns out to be a little higher than 1 (say between 1 and 4). The default Wa value (using just 'WEIGHT AUTO') is 10: I find this is suitable for refining MR solutions where you need the model to be less geometrically rigid so the X-ray contribution needs to be inflated relatively, but as the model improves Wa needs to be decreased towards the theoretical value of 1 (or certainly not much less than 1).

Note that this Wa is conveniently the same as the Wa used in X-PLOR, CNS and phenix.refine (so it can be transferred between programs), though in the latter case I believe it stands for weight(absolute), not weight(automatic). If I have this wrong, no doubt the X-PLOR/CNS/phenix.refine people will put me straight!
Other than that I agree with everything Robbie said you should heed his advice. Cheers -- Ian
Re: [ccp4bb] validating ligand density
Hi Srinivasan, I always use edstats from the command line: 'edstats.pl HKLIN 101m_0cyc.mtz XYZIN 101m_0cyc.pdb'. I hadn't noticed it was well-hidden (or the limit case: not present) in CCP4i. I guess a feature request is in order. Cheers, Robbie Date: Wed, 13 Mar 2013 03:42:45 +0800 From: sreera...@yahoo.co.in Subject: Re: [ccp4bb] validating ligand density To: CCP4BB@JISCMAIL.AC.UK Thank you very much to all those who suggested a way out for our situation. We have so far refined the occupancies on phenix and the ligand shows an uniform occupancy of 0.8 post the refinement. The B-factors of the ligand are around 20 (atleast for the regions with a well defined electron density), which is slightly lower than the average B-factors of the whole structure which is 24. We have a few poorly defined regions in our electron density. This was the starting point of our problem and it remains to be a problem. @ Robbie --- we would like to run EDSTAT on CCP4 but we dont find the program in both 6.3.0 and 6.3.1 versions. It would be kind to know if we are doing something wrong to not find it on the ensuite. @ Herman We did add the cryo from the crystallisation condition as another strategy but that also doesnt look too convincing. There is just that enough more to the density to think its our substrate. The density also does not compare well with the apo structures. @ Eleanor We set the occupancies to zero and refined the structure but we did not get any conclusive answers from it. We have a continious density for the best part of the ligand; but as you mentioned a few carbon atoms which are wobbly are poorly defined. We will look carefully into the geometrical restraints as you suggested. 
Thank you all again for the suggestions!Srinivasan From: Bosch, Juergen jubo...@jhsph.edu To: CCP4BB@JISCMAIL.AC.UK Sent: Tuesday, 12 March 2013 4:32 PM Subject: Re: [ccp4bb] [ccp4bb] validating ligand density Going back to the initial question.I would recommend looking at AFITThttp://www.eyesopen.com/afitt Works like a dream (in certain cases). Jürgen P.S. I wish I had some stocks from them but I don't.. Jürgen Bosch Johns Hopkins University Bloomberg School of Public Health Department of Biochemistry Molecular Biology Johns Hopkins Malaria Research Institute 615 North Wolfe Street, W8708 Baltimore, MD 21205 Office: +1-410-614-4742 Lab: +1-410-614-4894 Fax: +1-410-955-2926 http://lupo.jhsph.edu On Mar 12, 2013, at 11:25 AM, herman.schreu...@sanofi.com herman.schreu...@sanofi.com wrote: You are the one who should judge your statement, but it looks plausible to me. Now that I think of it: why do we need referees if every scientist should judge their own hypothesis? Publication will be a lot faster if we no longer need to heed the remarks of some grumpy referees and send in revision after revision. Also the number of publications will increase significantly if every scientist is allowed to judge their own papers! HS From: Jacob Keller [mailto:j-kell...@fsm.northwestern.edu] Sent: Tuesday, March 12, 2013 4:14 PM To: Schreuder, Herman RD/DE Cc: CCP4BB@jiscmail.ac.uk Subject: Re: [ccp4bb] validating ligand density Dear Jacob, You are overinterpreting, the statement is about judging, not proving a hypothesis. I am sure Mr. Edwards judged his statement to be ok. I guess there is a good likelihood that you are right, but who am I to judge? 
JPK Herman From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Jacob Keller Sent: Tuesday, March 12, 2013 3:44 PM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] validating ligand density One final quote that is not in the twilight paper summarizes it nicely: The scientist must be the judge of his own hypotheses, not the statistician. A.F.W. Edwards (1992) in Likelihood - An account of the statistical concept of likelihood and its application to scientific inference , p. 34. There must be a lot of thinking behind this statement--while it seems plausible, it seems far from proven prima facie. Also, it assumes that the scientist is not a statistician. Jacob Btw, the book is good reading. Best, BR -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Robbie Joosten Sent: Tuesday, March 12, 2013 10:03 AM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] validating ligand density Dear Srinivasan, Although the Twilight program can only look at deposited PDB entries, the tips about ligand validation in the paper are very useful. I
Re: [ccp4bb] validating ligand density
Dear Srinivasan, Although the Twilight program can only look at deposited PDB entries, the tips about ligand validation in the paper are very useful. I suggest you start from there. You can use EDSTATS in CCP4 to get real-space validation scores. Also look at the difference map metrics it gives (and the maps themselves of course), they will tell you whether you misidentified your ligand. Occupancy refinement in Refmac can also help you: if the occupancy drops a lot something is wrong. That can be partial binding (not that much of a problem) or worse, a ligand that isn't there. By the way, I've been playing with that recently and some ligands/hetero compounds in the PDB were so incredibly 'not there' that Refmac would crash (that bug seems to be fixed in the latest version). HTH, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of R.Srinivasan Sent: Monday, March 11, 2013 23:03 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] validating ligand density Hello all, We co-crystallized an inactive variant of our enzyme in the presence of substrate and have determined the structure at 1.85A. Now, we want to validate the fitting of the ligand into the electron density. We tried validating using the difference map (2Fo-Fc) after refining the structure without the ligand. But, it is still a bit inconclusive if the density fits the ligand. It would be very kind to know if there are tools for validating this electron density. We were excited about twilight but turns out it can only be used with deposited structure. We will appreciate your help and suggestions. Many thanks, Srinivasan
Re: [ccp4bb] Rfree flag
Hi Tim, Our approach is a bit different. We first try to establish whether the R-free set is biased, by checking whether R-free is surprisingly low compared to R given the data parameter ratio. If this is the case (or if we chose a new R-free set for some reason, e.g. because it was too small) PDB_REDO resets the B-factors to 0.5*(Wilson B-factor) and then does more cycles of refinement to ensure it converges. This should get rid of the bias. We compared this to the 'perturb coordinates' approach and in most cases there wasn't any difference. In some cases the perturbed coordinates were outside the radius of convergence of Refmac (the version 5.2 or perhaps 5.4) particularly in cases with NCS. So coordinate perturbation was just not worth it. This was before NCS restraints were properly implemented in PDB_REDO (hurray for local NCS/LSSR!), so this issue must be much smaller now. Of course the rebuilding round of PDB_REDO followed by more refinemnt, will cause enough model perturbation if the above results are not convincing enough ;) We also do full (well, k-fold) cross-validation for small data sets in which the different test sets are all-but-one completely biased. Here, too, the B-factor resetting works well enough. That said we might add a few extra cycles of refinement here to be on the safe side. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Tim Gruene Sent: Thursday, February 28, 2013 10:33 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Rfree flag -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Dear Kavya, As far as I understand the PDBRedo project attempts to make the reflections unbiased from the structure by a random shift of coordinates (e.g. 'NOISE' keyword in pdbset, although I am not aware of an investigation about whether this actually does make remove bias. It is safest to keep the same Rfree-set. 
Regards, Tim On 02/28/2013 06:54 AM, Kavyashree Manjunath wrote: Dear users, Is it mandatory to use the same reflections for Rfree calculations of a ligand bound data as that of its native? Thank you With Regards Kavya -- Dr Tim Gruene Institut fuer anorganische Chemie Tammannstr. 4 D-37077 Goettingen GPG Key ID = A46BEE1A
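To make the B-factor reset described in Robbie's reply concrete, here is a minimal sketch. It is not PDB_REDO's actual code: the function name `reset_b_factors` and the `scale` argument are made up for illustration, and it only assumes the standard fixed-column PDB layout in which the B-factor occupies columns 61-66 of ATOM/HETATM records.

```python
def reset_b_factors(pdb_lines, wilson_b, scale=0.5):
    """Return PDB lines with every ATOM/HETATM B-factor set to scale * wilson_b.

    Hypothetical helper: mimics the 'reset to 0.5 * Wilson B' idea only.
    """
    new_b = f"{scale * wilson_b:6.2f}"  # B-factor field is 6 characters wide
    out = []
    for line in pdb_lines:
        # Only touch coordinate records that are long enough to hold a B-factor
        if line.startswith(("ATOM  ", "HETATM")) and len(line) >= 66:
            line = line[:60] + new_b + line[66:]  # columns 61-66 (0-based 60:66)
        out.append(line)
    return out
```

After such a reset the model still needs extra refinement cycles to converge, as described in the message; the sketch only covers the reset step.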
Re: [ccp4bb] Building sugars
Hi Folmer, Just to add some tips: Concerning the naming as one molecule: the sugar monomers get the same chain ID as the protein they are connected to and arbitrary residue numbers. I usually start numbering from 1000 to prevent overlap with the numbering of the amino acids. 1) Just don't use insertion codes, some people find it upsetting ;) And keep the residue numbering consistent between NCS copies. 2) The glycosciences.de portal has many tools for dealing with carbohydrates: http://www.glycosciences.de/ I really like PDB-care and CARP for validation in the building and refinement process. 3) When using TLS you should try to figure out whether it's useful to add the sugars to the group of the linked protein residue or to have specific groups for your sugar trees. Cheers, Robbie HS. From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Folmer Fredslund Sent: Thursday, February 21, 2013 12:33 PM To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] Building sugars Dear all, What's the correct way to build and refine sugar polymers? I am currently building several structures with different kinds of sugar polymers bound to them. Searching for similar ligands in the PDB, I end up with e.g. trisaccharides that are named as one molecule, even though they are indeed made up of three individual sugars with bonds between them. Thank you for any pointers. Best regards, Folmer -- Folmer Fredslund
Re: [ccp4bb] Building sugars
Hi Folmer, RAF is in the PDB ligand dictionary with status 'REL' so you can use it. If RAF is a subset of something bigger, then perhaps you should use monosaccharide building blocks. If in doubt, ask a PDB annotator. Anyway, PDB-care will check whether the connectivity in a compound named RAF matches the standard description of RAF. CARP will check the torsion angles between the monosaccharide building blocks. HTH, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Folmer Fredslund Sent: Thursday, February 21, 2013 15:37 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Building sugars Hi all Thank you all for your replies. I might have expressed myself poorly, but I am not talking about covalently linked sugar modifications, so for my purpose there's no need to be concerned about insertion codes ;-) The glycosciences.de link is really useful. There does not seem to be a test to verify correct PDB nomenclature though. Or perhaps RAF (for raffinose, a trisaccharide) is OK to use? Best regards, Folmer 2013/2/21 Robbie Joosten robbie_joos...@hotmail.com Hi Folmer, Just to add some tips: Concerning the naming as one molecule: the sugar monomers get the same chain ID as the protein they are connected to and arbitrary residue numbers. I usually start numbering from 1000 to prevent overlap with the numbering of the amino acids. 1) Just don't use insertion codes, some people find it upsetting ;) And keep the residue numbering consistent between NCS copies. 2) The glycosciences.de portal has many tools for dealing with carbohydrates: http://www.glycosciences.de/ I really like PDB-care and CARP for validation in the building and refinement process. 3) When using TLS you should try to figure out whether it's useful to add the sugars to the group of the linked protein residue or to have specific groups for your sugar trees. Cheers, Robbie HS. 
Re: [ccp4bb] Link problem with Refmac.
Hi Ian, The warning refers to a MET 59 in chain A whereas you only have MET 72. That is very suspicious. Non-sequential residues further apart than x Angstrom automatically get a gap record. Have you tried a newer version of Refmac, because this feature was added quite a while ago? What is your setting for 'MAKE CONN' when you run Refmac? Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Ian Tickle Sent: Monday, February 18, 2013 17:32 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] Link problem with Refmac. All, I'm having a problem with Refmac (v. 5.7.0025) that I don't understand. It's linking 2 residues that it shouldn't be. Here's the relevant message in the log file:
WARNING : large distance for conn:TRANS dist = 10.768 ch:AA res: 58 THR -- 59 MET ideal_dist= 1.329
Note that there are no LINK (or LINKR) records in the PDB header. Here are the input co-ords for the relevant residues (not linked):
ATOM    887  N   THR A  58      13.587   1.365  19.814  1.00 14.28      A    N
ATOM    888  CA  THR A  58      14.743   1.126  18.960  1.00 17.64      A    C
ATOM    890  CB  THR A  58      14.325   0.613  17.567  1.00 17.69      A    C
ATOM    892  OG1 THR A  58      13.605   1.650  16.879  1.00 15.24      A    O
ATOM    894  CG2 THR A  58      13.505  -0.658  17.658  1.00 17.33      A    C
ATOM    898  C   THR A  58      15.573   2.346  18.631  1.00 22.80      A    C
ATOM    899  O   THR A  58      15.144   3.492  18.842  1.00 20.41      A    O
ATOM    956  N   MET A  72      13.605  -6.845  13.378  1.00 43.23      A    N
ATOM    957  CA  MET A  72      12.268  -6.980  12.733  1.00 39.06      A    C
ATOM    959  CB  MET A  72      12.308  -6.361  11.331  1.00 42.06      A    C
ATOM    962  CG  MET A  72      12.455  -4.846  11.320  1.00 43.45      A    C
ATOM    965  SD  MET A  72      13.020  -4.153   9.755  1.00 46.07      A    S
ATOM    966  CE  MET A  72      14.695  -4.789   9.653  1.00 49.84      A    C
ATOM    970  C   MET A  72      11.544  -8.344  12.624  1.00 36.94      A    C
ATOM    971  O   MET A  72      10.314  -8.353  12.558  1.00 34.24      A    O
Here are the same residues (linked) after refinement:
ATOM    887  N   THR A  58      14.212   0.104  18.340  1.00 43.09      A    N
ATOM    888  CA  THR A  58      14.332  -1.166  17.541  1.00 45.12      A    C
ATOM    890  CB  THR A  58      12.906  -1.657  17.309  1.00 39.26      A    C
ATOM    892  OG1 THR A  58      12.400  -1.039  16.117  1.00 38.40      A    O
ATOM    894  CG2 THR A  58      12.010  -1.301  18.435  1.00 33.96      A    C
ATOM    898  C   THR A  58      14.805  -1.376  16.064  1.00 59.98      A    C
ATOM    899  O   THR A  58      15.304  -0.470  15.386  1.00 69.73      A    O
ATOM    901  N   MET A  72      14.609  -2.641  15.623  1.00 61.67      A    N
ATOM    902  CA  MET A  72      13.990  -2.997  14.308  1.00 60.32      A    C
ATOM    904  CB  MET A  72      14.898  -2.730  13.093  1.00 71.29      A    C
ATOM    907  CG  MET A  72      14.126  -2.345  11.812  1.00 73.22      A    C
ATOM    910  SD  MET A  72      12.912  -3.499  11.087  1.00 69.42      A    S
ATOM    911  CE  MET A  72      13.917  -4.503   9.996  1.00 63.68      A    C
ATOM    915  C   MET A  72      13.413  -4.438  14.205  1.00 59.57      A    C
ATOM    916  O   MET A  72      12.199  -4.599  14.130  1.00 60.33      A    O
Residues 59-71 are present but in a poorly defined loop so I definitely do not want residues 58 and 72 linked! I'm puzzled because I'm sure it never used to do this, i.e. you had to specify a LINK if you wanted one and Refmac was smart enough to recognise that residues across a break should not be linked. So how do I tell it NOT to link them? Cheers -- Ian
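The distance in the Refmac warning can be checked by hand from the input coordinates of THR 58 C and MET 72 N. A minimal sketch, assuming a purely illustrative 2.5 Angstrom cut-off for calling a chain break (Refmac's real threshold is not stated in this thread, and `peptide_gap` is a made-up helper name):

```python
import math

def peptide_gap(c_xyz, n_xyz, max_dist=2.5):
    """Return (distance, is_gap) for a C(i)...N(i+1) atom pair.

    max_dist is an assumed cut-off for illustration, not Refmac's value.
    """
    d = math.dist(c_xyz, n_xyz)  # Euclidean distance (Python 3.8+)
    return d, d > max_dist

# Input-model coordinates quoted in the message above
thr58_c = (15.573, 2.346, 18.631)
met72_n = (13.605, -6.845, 13.378)

dist, is_gap = peptide_gap(thr58_c, met72_n)
print(f"C-N distance: {dist:.3f} A, gap: {is_gap}")
```

The computed distance agrees with the 10.768 reported in the log, far from the ideal peptide bond length of 1.329, so a distance-based test would treat this pair as a break no matter how the residues are numbered.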
Re: [ccp4bb] Link problem with Refmac.
Hi Ian, I avoid renumbering whenever I can. If I do have to renumber things (e.g. to get proper connectivity in PDB entry 2j8g), I do it by hand. So no help there. As for dealing with insertion codes in general, why not try to convince the developers of the 'brain-damaged' programs to support insertion codes? I've asked quite a few for this sort of update and many were very helpful. The problem is that most developers discover the existence of insertion codes after they set up a data structure for the coordinates. Adding support afterwards can be quite a hassle. The more users ask for such support, the more likely it will be implemented. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Ian Tickle Sent: Monday, February 18, 2013 19:40 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Link problem with Refmac. Hi Robbie OK I just realised what's going on. In my script I renumber the input PDB file (starting at 1 for each chain and incrementing by 1) and keep the mapping so I can renumber it back afterwards for human consumption. So you're completely correct: there is indeed a residue A59 after renumbering! This is to avoid headaches with brain-damaged programs that can't cope with insertion codes and residue numbers out of sequence. So I guess I'm going to have to be smarter in my renumbering program and make sure I maintain any increasing gaps in the numbering which indicate real gaps in the sequence and only renumber over insertions and decreasing gaps. It doesn't actually matter what the new numbers are since the user never sees them. But this must be a common problem: how do others handle this? E.g. pdbset blindly renumbers with an increment of 1 (and anyway it doesn't renumber any LINK, SSBOND and CISPEP records as I do) so it would have the same problem. Cheers -- Ian On 18 February 2013 17:09, Robbie Joosten robbie_joos...@hotmail.com wrote: Hi Ian, The warning refers to a MET 59 in chain A whereas you only have MET 72. 
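The gap-preserving renumbering Ian proposes in this thread (keep increasing gaps, renumber only over insertions and decreasing gaps) could look roughly like the sketch below. It is not Ian's script: `renumber_keep_gaps` is a hypothetical name, and the input is assumed to be the (residue number, insertion code) pairs of one chain in file order.

```python
def renumber_keep_gaps(residues, start=1):
    """Map (resnum, icode) pairs, in file order, to new residue numbers.

    Increasing jumps in the old numbering are kept (they may be real gaps);
    insertions (same number) and decreasing jumps just advance by 1.
    """
    mapping = {}
    new = start
    prev = None
    for num, icode in residues:
        if prev is not None:
            step = num - prev
            new += step if step > 1 else 1
        mapping[(num, icode)] = new
        prev = num
    return mapping

# Hypothetical chain: a break between 58 and 72, and an insertion 73A
chain = [(57, ""), (58, ""), (72, ""), (73, ""), (73, "A"), (74, "")]
for old, new in renumber_keep_gaps(chain).items():
    print(old, "->", new)
```

Keeping the mapping as a dictionary makes it easy to renumber back afterwards, and the same mapping would have to be applied to LINK, SSBOND and CISPEP records. Call it once per chain, since numbering restarts for each chain.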
Re: [ccp4bb] refmac5 MMA bug
Hi Ed, This is a 'compatibility' option in Refmac that internally renames atoms. If you comment out 'MMA .C7 CM' in your mon_lib_list.cif file, the problem will disappear. Cheers, Robbie Date: Sun, 10 Feb 2013 23:35:25 -0500 From: epozh...@umaryland.edu Subject: [ccp4bb] refmac5 MMA bug To: CCP4BB@JISCMAIL.AC.UK I see a strange issue with a model that includes O1-methyl-mannose (three letter code MMA). Basically, refmac fails and says that C7 is missing in the model while CM is absent from the library. The problem is that there is no CM atom in the pdb file, while C7 is right there. This happens with Refmac_5.7.0029, and I see no obvious issues with the corresponding cif-file in the monomer library. -- Oh, suddenly throwing a giraffe into a volcano to make water is crazy? Julian, King of Lemurs
Re: [ccp4bb] refmac5 MMA bug
Hi Ed, C7 is the correct name for the atom. Instead of commenting out the line you could swap the C7 and the CM and then Refmac would correct the atom name if it is wrong. This is of course very user friendly, but it also keeps users from using the correct atom names (similar to the nucleic acid naming problem). So I prefer causing an error message. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Ed Pozharski Sent: Monday, February 11, 2013 15:07 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] refmac5 MMA bug On Mon, 2013-02-11 at 09:56 +0100, Robbie Joosten wrote: This is a 'compatibility' option in Refmac that internally renames atoms. If you comment out 'MMA .C7 CM' in your mon_lib_list.cif file, the problem will disappear. Robbie, thanks a lot - this fixes it. Is this still considered a bug? From what I understand, the data_comp_synonym_atom_list entry indicates that whenever the MMA C7 atom is encountered, it will be internally renamed to CM. However, the $CCP4_LIB/data/monomers/m/MMA.cif should then refer to CM as well. But that cif-file still uses C7. Maybe this gets fixed in ccp4 updates, which reminds me to get that set up at last. Cheers, Ed. -- After much deep and profound brain things inside my head, I have decided to thank you for bringing peace to our home. Julian, King of Lemurs
Re: [ccp4bb] generating electron density from PDB and structure factor file
Just to add some more possibilities: - You can download maps from EDS or models and maps from PDB_REDO straight into CCP4mg. - You can download PDB_REDO maps and models into PyMOL using this plugin (http://www.cmbi.ru.nl/pdb_redo/pymol.html) for which we should thank Ed Pozharski. Note that this does require a working and sourced CCP4 installation to convert mtz files into maps. Cheers, Robbie Date: Mon, 4 Feb 2013 05:15:40 -0800 From: jan_i...@yahoo.com Subject: [ccp4bb] generating electron density from PDB and structure factor file To: CCP4BB@JISCMAIL.AC.UK Dear All, I would like to know what is the best possible way to generate the density from the published pdb file. thanks, Bhat
Re: [ccp4bb] RMSD Citation
Note that we discuss rmsZ values in the paper, not rmsd. This is done on purpose; rmsd values do not take the standard deviation of bond lengths into account. This makes it needlessly difficult to compare values. Consider reporting rmsZ instead of rmsd. Cheers, Robbie Sent from my Windows Phone From: Randy Read Sent: 2013-01-29 23:08 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] RMSD Citation Dear Peter, Shameless plug: you could do worse than to read the report of the X-ray Validation Task Force of the wwPDB (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3195755/), which includes citations to the original literature such as the Engh & Huber studies on bond lengths and angles, and their standard deviations. Best wishes, Randy Read - Randy J. Read Department of Haematology, University of Cambridge Cambridge Institute for Medical Research Tel: +44 1223 336500 Wellcome Trust/MRC Building Fax: +44 1223 336827 Hills Road E-mail: rj...@cam.ac.uk Cambridge CB2 0XY, U.K. www-structmed.cimr.cam.ac.uk On 29 Jan 2013, at 20:47, Peter Randolph wrote: My advisor has told me that an acceptable range for publication is an RMSD for bonds of ~ 0.01 A and for angles of 2.0 degrees (with a proper R and R-free). Does anyone know where these values came from and if there is a specific citation to go along with it? Thanks, Peter
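To illustrate the difference between rmsd and rmsZ mentioned in Robbie's reply, here is a minimal sketch. The bond lengths, target values and standard deviations are invented for the example (they are not Engh & Huber values), and `rmsd_and_rmsz` is a hypothetical helper name.

```python
import math

def rmsd_and_rmsz(observed, ideal, sigma):
    """Return (rmsd, rmsZ) for paired observed/ideal values with target sigmas.

    rmsd ignores sigma; rmsZ divides each deviation by its own sigma first.
    """
    n = len(observed)
    dev = [o - i for o, i in zip(observed, ideal)]
    rmsd = math.sqrt(sum(d * d for d in dev) / n)
    rmsz = math.sqrt(sum((d / s) ** 2 for d, s in zip(dev, sigma)) / n)
    return rmsd, rmsz

# Made-up bonds: identical 0.01 A deviations, different target sigmas
observed = [1.54, 1.34]
ideal = [1.53, 1.33]
sigma = [0.02, 0.01]

rmsd, rmsz = rmsd_and_rmsz(observed, ideal, sigma)
print(f"rmsd = {rmsd:.3f} A, rmsZ = {rmsz:.2f}")
```

Both bonds deviate by the same amount, so rmsd cannot tell them apart, whereas rmsZ counts the deviation of the tightly restrained bond as twice as significant. That is why rmsZ values are easier to compare between models.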
Re: [ccp4bb] off topic: DSSP
Hi Nat, DSSP recently went open source with a very liberal license. So you can consider using the real DSSP now. This may also be the moment to integrate DSSP in CCP4. Cheers, Robbie Sent from my Windows Phone From: Nat Echols Sent: 2013-01-28 17:32 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] off topic: DSSP On Mon, Jan 28, 2013 at 8:04 AM, Antony Oliver antony.oli...@sussex.ac.uk wrote: If you don't mind using the ksDSSP implementation, it is already installed with the phenix suite if you have it. Correct, but although the method is supposed to be the same, the output is not, and there are bugs in how it presents helix annotations. So I'm not sure it's a reliable substitute for the original DSSP - we use it in Phenix to calculate secondary structure restraints, with some extra filtering to catch the buggy annotations. (Unfortunately it was the only open-source program I could find for this purpose.) -Nat
[ccp4bb] FW: [ccp4bb] off topic: DSSP
---BeginMessage--- Hi, I thought the latest version of dssp (ftp://ftp.cmbi.ru.nl/pub/software/dssp/dssp-2.1.0.tgz) contains the fixes to build a Mac version. However, it may be that you have to remove the -static flag from the Makefile. If so, please let me know; I have no Mac capable of running 10.5 to test this. If it doesn't work, please send me the output so I can try to infer the required extra changes. best regards, -maarten From: Robbie Joosten [robbie_joos...@hotmail.com] Sent: Saturday, 26 January 2013 10:16 To: Maarten Hekkelman Subject: FW: [ccp4bb] off topic: DSSP Hi Maarten, Any idea? Regards, Robbie Sent from my Windows Phone From: Rashmi Panigrahi Sent: 2013-01-26 10:11 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] off topic: DSSP Hi All Apologies for the off topic. Could someone help me install DSSP on Mac OS X 10.5.8? I tried downloading dssp-2.0.3.tbz (ftp://ftp.cmbi.ru.nl/pub/software/dssp/dssp-2.0.3.tbz), dssp-2.0.4-linux-amd64 (ftp://ftp.cmbi.ru.nl/pub/software/dssp/dssp-2.0.4-linux-amd64) and dssp-2.1.0.tgz (ftp://ftp.cmbi.ru.nl/pub/software/dssp/dssp-2.1.0.tgz) from http://swift.cmbi.ru.nl/gv/dssp/ I tried doing make followed by make install. I got an error on compiling. Thanks in advance -- rashmi ---End Message---
Re: [ccp4bb] freerflag bug
Hi Ed, I've had this problem as well. It's the result of the very small R-free set fraction. There is an error routine that catches really small R-free sets, but 0.5% gets through and triggers a problem. My workaround is to just use a larger R-free set fraction (more than 1%). The version number is actually an improvement: 6.2 was just the version of CCP4. The program now has its own version number. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Edwin Pozharski Sent: Saturday, January 26, 2013 22:06 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] freerflag bug FREERFLAG fails as part of the uniqueify script to complete a test set on a cif-file downloaded from the PDB. My observations: 1) The bug is version-dependent - freerflag 6.2 succeeds, freerflag 1.1 fails (comes with ccp4 6.3, apparently newer than version 6.2 from ccp4 6.2) 2) The input cif-file has 105382 working and 502 test reflections (deposited pdb file confirms that as a relatively small 0.5% test set size). 3) freerflag throws this error: Error in LWREFL_NOEXIT: mindx 2 not open for write! FREERFLAG: LWREFL: failed to write reflection I have not done any updates since 6.3.0, but I do not see any mention of freerflag in the updates summary. I can supply the input mtz file that produces the error if needed. Thanks, Ed. -- Edwin Pozharski, PhD University of Maryland, Baltimore
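For reference, the test set fraction quoted in this thread follows directly from the reflection counts. A small sketch (the helper name `free_set_fraction` is made up, and the 1% threshold is simply Robbie's suggested workaround, not a FREERFLAG constant):

```python
def free_set_fraction(n_work, n_test):
    """Fraction of all reflections that are flagged as the R-free (test) set."""
    return n_test / (n_work + n_test)

# Counts quoted in Ed's message
fraction = free_set_fraction(105382, 502)
print(f"test set: {100 * fraction:.2f}% of all reflections")

# Flag sets at or below the 'more than 1%' workaround threshold
too_small = fraction <= 0.01
print("consider a larger R-free set" if too_small else "fraction looks fine")
```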
Re: [ccp4bb] refmac5 vs phenix refine mixed up
I noticed that Refmac has done the 1vs0 thing correctly for ages, which is very useful because mix-ups between the work set and test set used to be quite common in the reflection files at the PDB (Refmac saved me a lot of extra work with this). Dealing with this problem is very simple as the smallest set is typically the test set. Phenix however needs to deal with the CCP4 type reflection binning. Now the size of the sets cannot be used, which means that you have to find a smarter solution. So I wonder how this is implemented. Does Phenix use the (reasonable) assumption that the test set is labeled 1.00 or 0.00? Or does it also check the sets with other labels? Cheers, Robbie Sent from my Windows Phone From: Garib N Murshudov Sent: 2013-01-25 10:46 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] refmac5 vs phenix refine mixed up Dear Tim In principle if a user defines freer flag then refmac knows about that (unless freer flag is 0 then refmac assumes that it is default). In this case (if freer defined by user) then it is not altered. regards Garib On 25 Jan 2013, at 09:14, Tim Gruene wrote: Dear Pavel, dear Garib, how do you figure out automatically the correct flag? (I hope both phenix and refmac will allow the user to manually override the software's decision) Cheers, Tim On 01/24/2013 07:47 PM, Pavel Afonine wrote: Hi, It would be nice if default setting was the same in different suites. it's a nice idea of course, but I feel it is impractical as it would require changing a lot of software, both modern and legacy. However, given array of flags it is algorithmically trivial to figure out what is test and work flags. That's what phenix.refine has been doing since its beginning (2005). And my understanding is that Refmac does this too. As always, there are corner cases here, but it's better than nothing. 
Plus, programs (at least phenix.refine, can't speak for others) tell which flag was actually used, and they provide an option to define the flag value to use. Pavel -- Dr Tim Gruene Institut fuer anorganische Chemie Tammannstr. 4 D-37077 Goettingen GPG Key ID = A46BEE1A Dr Garib N Murshudov Group Leader, MRC Laboratory of Molecular Biology Hills Road Cambridge CB2 0QH UK Email: ga...@mrc-lmb.cam.ac.uk Web http://www.mrc-lmb.cam.ac.uk
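The 'smallest set is the test set' heuristic from this thread, with a fallback for CCP4-style binning, might be sketched as follows. This is not how Refmac or phenix.refine actually implement it; `guess_test_flag` is a hypothetical function, and treating flag 0 as the test set when there are many similar-sized bins is an assumption based on the CCP4 default mentioned by Garib.

```python
from collections import Counter

def guess_test_flag(flags):
    """Guess which free-R flag value marks the test set.

    Two distinct values: the less common one is taken as the test set.
    More than two values (CCP4-style bins): assume the convention flag 0.
    """
    counts = Counter(flags)
    if len(counts) == 2:
        return min(counts, key=counts.get)
    if 0 in counts:
        return 0
    raise ValueError("cannot guess the test set flag; please specify it")

# 'Work = 1, test = 0' and the swapped labelling both resolve correctly
print(guess_test_flag([1] * 95 + [0] * 5))
print(guess_test_flag([0] * 95 + [1] * 5))
# CCP4-style: 20 equally sized bins labelled 0..19
print(guess_test_flag(list(range(20)) * 10))
```

A real implementation should report the flag it picked and let the user override it, as both Tim and Pavel point out.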
Re: [ccp4bb] B-factors
Dear Urmi, The way you switched from Phenix to Refmac may not have resulted in the flat B-factor model in Ethan's paper. You should really do a thorough test in which you reset the B-factors before you start refinement. Shameless plug: PDB_REDO will do this automatically and has a few fallback options for cases in which the Hamilton test is inconclusive. Your R-factors are quite low for your resolution, which suggests that you may have been a bit too conservative when picking your resolution cut-off. If you have more data you can try using that as well. This may also help your choice of B-factor model. It will improve your data/parameter ratio. HTH, Robbie Netherlands Cancer Institute www.cmbi.ru.nl/pdb_redo Sent from my Windows Phone From: Ethan Merritt Sent: 2013-01-25 01:36 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] B-factors On Thursday, January 24, 2013 03:52:12 pm Urmi Dhagat wrote: Hi all, I have been refining twinned data (at 3.1 A resolution) using refmac. My R and Rfree values are 19.6 and 26.2 respectively with NCS restraints and isotropic B-factor refinement. I am not sure whether it is a good idea to refine individual B-factors at this resolution. I have also tried refining the same model in phenix but this time not refining the B-factors. My Rfactor and Rfree are 25 and 32 respectively. Refining with TLS in Phenix drops R factors to 23 and 29. I would suspect it is possible to do better than that. My thoughts on how to approach it were written up for a past CCP4 Study Weekend and appeared in Acta D last year: To B or not to B Acta D 68:468 (2012). You can find a link to the PDF on the TLSMD web site http://skuld.bmsc.washington.edu/~tlsmd/references.html Ethan Then I used the output PDB from phenix and refined it in CCP4 (selecting overall B-factor refinement option instead of Isotropic) and my R factors are R work=16 and Rfree =21. 
If Rfree reflections are refined by refmac upon switching from phenix to refmac then does this contaminate the Rfree set? Should switching between refinement programs Phenix and Refmac be avoided? Urmi Dhagat -- Ethan A Merritt Biomolecular Structure Center, K-428 Health Sciences Bldg University of Washington, Seattle 98195-7742
Re: [ccp4bb] Hi clashscore
Hi Supratim, The clashscore gives the relative number of clashes, not their severity. This makes it difficult to see what your specific problem is. Severe clashes (with large overlaps) are usually the result of errors in your model and need individual attention. Light bumps can usually be solved by optimizing the refinement parameters. Overrestraining bonds and angles can cause a lot of clashes. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of supratim dey Sent: Wednesday, January 23, 2013 09:47 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] Hi clashscore Hi I am refining my 2 angstrom structure using phenix windows based software. After many refinements my clashscore is not reducing at all and shows a value of 12. Can anybody suggest how to reduce the clashscore? Is there any technique to do it? Or do I have to deal with each individual clash mentioned in the list manually? Supratim
Re: [ccp4bb] refining against weak data and Table I stats
Hi Douglas, Using two Table Is is a good way to show the difference between the two cut-offs, but I assume you will only discuss one of the models in your paper. IMO you only need to deposit the high res model, so there should be no problems with resolution conflicts in the PDB file. The annotators will probably help you if there is a problem with Rmerge 1.00. As for the title of your paper: nobody forces you to put a resolution in it if it causes too much of a stir. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Boaz Shaanan Sent: Friday, December 07, 2012 12:21 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] refining against weak data and Table I stats Hi, I'm sure Kay will have something to say about this but I think the idea of the K & K paper was to introduce new (more objective) standards for deciding on the resolution, so I don't see why another table is needed. Cheers, Boaz Boaz Shaanan, Ph.D. Dept. of Life Sciences Ben-Gurion University of the Negev Beer-Sheva 84105 Israel E-mail: bshaa...@bgu.ac.il Phone: 972-8-647-2220 Skype: boaz.shaanan Fax: 972-8-647-2992 or 972-8-646-1710 From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Douglas Theobald [dtheob...@brandeis.edu] Sent: Friday, December 07, 2012 1:05 AM To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] refining against weak data and Table I stats Hello all, I've followed with interest the discussions here about how we should be refining against weak data, e.g. data with I/sigI 2 (perhaps using all bins that have a significant CC1/2 per Karplus and Diederichs 2012). This all makes statistical sense to me, but now I am wondering how I should report data and model stats in Table I. Here's what I've come up with: report two Table I's. For comparability to legacy structure stats, report a classic Table I, where I call the resolution whatever bin I/sigI=2. 
Use that as my high res bin, with high res bin stats reported in parentheses after global stats. Then have another Table (maybe Table I* in supplementary material?) where I report stats for the whole dataset, including the weak data I used in refinement. In both tables report CC1/2 and Rmeas. This way, I don't redefine the (mostly) conventional usage of resolution, my Table I can be compared to precedent, I report stats for all the data and for the model against all data, and I take advantage of the information in the weak data during refinement. Thoughts? Douglas Douglas L. Theobald Assistant Professor Department of Biochemistry Brandeis University Waltham, MA 02454-9110 dtheob...@brandeis.edu http://theobald.brandeis.edu/
Re: [ccp4bb] thanks god for pdbset
Hi Ian, It's easy to forget about LINK records and such when dealing with the coordinates (I recently had to fix a bug in my own code for that). The problem with insertion codes is that they are very poorly defined in the PDB standard. Does 128A come before or after 128? There is no strict rule for that; instead they are used in order of appearance. This makes it hard for programmers to stick to agreed standards. Instead, people would rather ignore insertion codes altogether. They are really poorly supported by many programs. Perhaps switching to mmCIF gets rid of the problem. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Ian Tickle Sent: Wednesday, December 05, 2012 16:39 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] thanks god for pdbset The last time I tried the pdbset renumber command because of issues with insertion codes in certain programs, it failed to also renumber the LINK, SSBOND and CISPEP records. Needless to say, thanking god (or even God) was not my first thought! (more along the lines of why can't software developers stick to the agreed standards?). I haven't tried it with the latest version, maybe it's fixed now. -- Ian On 5 December 2012 07:58, Francois Berenger beren...@riken.jp wrote: Especially the renumber command that changes residue insertion codes into an increment of the impacted residue numbers. Regards, F.
Re: [ccp4bb] thanks god for pdbset
Hi Ian, The 'standard' you describe below is more of a suggestion than a rule. The PDB does not enforce a numbering scheme, which is particularly annoying when dealing with engineered proteins with linkers or domains of different proteins (they come with all sorts of numbering schemes). Of course, when you use the ATOM records and distance criteria you should be able to work out what is connected and where the gaps are. Unfortunately, this is not always properly implemented in software (I had a nice recent case with a gap in an insertion in a nucleic acid, that caused problems working out the connectivity). When dealing with ranges of residues, e.g. in TLS group descriptions, numbering issues with (or without) insertion codes can be a real pain because ranges can be somewhat ambiguous. In theory, it is easy and insertion codes (or other numbering issues) should not be a problem at all. In practice, as Ed pointed out, it is a big mess. Cheers, Robbie -Original Message- From: Ian Tickle [mailto:ianj...@gmail.com] Sent: Wednesday, December 05, 2012 17:26 To: Robbie Joosten Cc: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] thanks god for pdbset I had always assumed that ASCII sort order was the standard so ' 128A' comes after ' 128 ' in the collating sequence, and indeed the PDB documentation seems to make it clear that it comes after, e.g. in the section describing the ATOM record:
REFERENCE PROTEIN NUMBERING    HOMOLOGOUS PROTEIN NUMBERING
---------------------------    ----------------------------
59                             59
60                             60
61
62                             62

REFERENCE PROTEIN NUMBERING    HOMOLOGOUS PROTEIN NUMBERING
---------------------------    ----------------------------
85                             85
86                             86
                               86A
                               86B
87                             87
But does it actually matter if the insertion comes before? Surely the sequence is completely defined by the file order, regardless of the residue numbering, not by the alphanumeric sorting order? So if 86A comes immediately before 86 in the file then you must assume that 86A C is linked to 86 N (assuming of course that the bond length is sensible), if after then it's 86 C to 86A N. 
Cheers -- Ian On 5 December 2012 16:02, Robbie Joosten robbie_joos...@hotmail.com wrote: Hi Ian, It's easy to forget about LINK records and such when dealing with the coordinates (I recently had to fix a bug in my own code for that). The problem with insertion codes is that they are very poorly defined in the PDB standard. Does 128A come before or after 128? There is no strict rule for that; instead they are used in order of appearance. This makes it hard for programmers to stick to agreed standards. Instead, people tend to ignore insertion codes altogether. They are really poorly supported by many programs. Perhaps switching to mmCIF gets rid of the problem. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Ian Tickle Sent: Wednesday, December 05, 2012 16:39 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] thanks god for pdbset The last time I tried the pdbset renumber command because of issues with insertion codes in certain programs, it failed to also renumber the LINK, SSBOND and CISPEP records. Needless to say, thanking god (or even God) was not my first thought! (more along the lines of why can't software developers stick to the agreed standards?). I haven't tried it with the latest version, maybe it's fixed now. -- Ian On 5 December 2012 07:58, Francois Berenger beren...@riken.jp wrote: Especially the renumber command that changes residue insertion codes into an increment of the impacted residue numbers. Regards, F.
Re: [ccp4bb] thanks god for pdbset
Hi Peter, Thanks for the info. I'd better go check whether my code assumes insertion codes are not digits. Cheers, Robbie Date: Wed, 5 Dec 2012 17:57:58 + From: pkel...@globalphasing.com Subject: Re: [ccp4bb] thanks god for pdbset To: CCP4BB@JISCMAIL.AC.UK Hi Robbie, On Wed, 2012-12-05 at 17:02 +0100, Robbie Joosten wrote: Hi Ian, It's easy to forget about LINK records and such when dealing with the coordinates (I recently had to fix a bug in my own code for that). The problem with insertion codes is that they are very poorly defined in the PDB standard. Does 128A come before or after 128? There is no strict rule for that; instead they are used in order of appearance. This makes it hard for programmers to stick to agreed standards. Instead, people tend to ignore insertion codes altogether. They are really poorly supported by many programs. Perhaps switching to mmCIF gets rid of the problem. Properly used, the PDB exchange dictionary for mmCIF can indeed sort this out. In addition to the PDB-style residue number + insertion code, it has an item for the residue sequence number in the chain (running from 1 .. n). The relevant item names are:

_atom_site.pdbx_PDB_residue_no
_atom_site.pdbx_PDB_ins_code

and:

_entity_poly_seq.num

One thing to be careful of is cases where the insertion code is a digit (which does happen sometimes). I have seen code many times where an assumption is made that the insertion code is not a digit, and this assumption is used to separate the residue number from the insertion code (e.g. a user is asked to enter a residue number + insertion code as a single item). If the insertion code is a digit, this won't work. This is easy to handle in the fixed-width PDB format:

85
851
852
86

but if it gets written to mmCIF incorrectly as:

loop_
_atom_site.pdbx_PDB_residue_no
_atom_site.pdbx_PDB_ins_code
85 .
851 .
852 .
86 .

instead of the correct:

loop_
_atom_site.pdbx_PDB_residue_no
_atom_site.pdbx_PDB_ins_code
85 .
85 1
85 2
86 .

it can be really hard to sort out later on. Regards, Peter. -- Peter Keller Tel.: +44 (0)1223 353033 Global Phasing Ltd., Fax.: +44 (0)1223 366889 Sheraton House, Castle Park, Cambridge CB3 0AX United Kingdom
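The digit-insertion-code pitfall Peter describes can be avoided by reading the residue number and insertion code strictly by column position. The following is a minimal sketch, assuming standard fixed-width ATOM/HETATM records (chain ID in column 22, residue number in columns 23-26, insertion code in column 27). The names ResidueId, parse_residue_id, residues_in_file_order and make_ca_record are hypothetical and only serve the illustration; make_ca_record builds a truncated record purely for the demo.

```python
from typing import Iterable, List, NamedTuple


class ResidueId(NamedTuple):
    chain: str
    number: int
    icode: str  # empty string when the record has no insertion code


def parse_residue_id(record: str) -> ResidueId:
    """Read chain, residue number and insertion code by column position."""
    if not record.startswith(("ATOM  ", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    record = record.ljust(27)      # tolerate records cut off after column 26
    chain = record[21]             # column 22
    number = int(record[22:26])    # columns 23-26
    icode = record[26].strip()     # column 27; may be a letter OR a digit
    return ResidueId(chain, number, icode)


def residues_in_file_order(records: Iterable[str]) -> List[ResidueId]:
    """List residues in order of appearance, without any re-sorting."""
    residues: List[ResidueId] = []
    for record in records:
        if record.startswith(("ATOM  ", "HETATM")):
            rid = parse_residue_id(record)
            if not residues or residues[-1] != rid:
                residues.append(rid)
    return residues


def make_ca_record(serial: int, chain: str, number: int, icode: str = " ") -> str:
    """Hypothetical helper: the first 27 columns of a CA ATOM record."""
    return f"ATOM  {serial:5d}  CA  ALA {chain}{number:4d}{icode}"


if __name__ == "__main__":
    demo = [
        make_ca_record(1, "A", 85),
        make_ca_record(2, "A", 85, "1"),
        make_ca_record(3, "A", 85, "2"),
        make_ca_record(4, "A", 86),
    ]
    for rid in residues_in_file_order(demo):
        print(rid.number, repr(rid.icode))
```

Because the split is positional, ' 851' is read as residue 85 with insertion code '1' rather than as residue 851, and the residue order is taken from the file instead of from any sorting rule.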
Re: [ccp4bb] how many cycles to settle B-factor?
Hi Jim, The speed at which the B-factor converges depends on many factors. The B-factor restraint weight that Herman and I mentioned (the one you should optimise before changing occupancies!) is an important factor. How far your atomic coordinates are from where they should end up is also important. In Refmac, there is a big difference between isotropic and anisotropic B-factors (the latter converge much more slowly). So all in all it is very difficult to predict when the B-factor converges. The good news, however, is that more cycles of refinement should not hurt your model (if they do, your refinement settings are non-optimal). So grab a cup of coffee and run enough cycles. In Refmac 30 cycles is enough for most isotropic B-factors, but in some cases I use many more. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Jim Brannigan Sent: Tuesday, November 20, 2012 10:49 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] how many cycles to settle B-factor? As an adjunct to the B-factor vs occupancy thread, I might expect the B-factor to halve if the occupancy is set to 0.5 (but I'm only a molecular biologist). If so, how many cycles of refmac would it take for the value to settle? My limited experience is that it doesn't exactly halve, but I do tend to use fewer cycles at later stages of refinement. Thanks, Jim Brannigan
Re: [ccp4bb] occupancy vs. Bfactors
Hi Grant, This is part of the recurring side chain discussion. There is no consensus in the community about what the optimal approach is. In your current approach you are adding a model parameter (occupancy) to improve the fit with the experimental data (remove negative difference density). You should ask yourself whether you really need to add that parameter. Are you not overfitting? Is there any clear evidence that the atoms are not always there? The alternative model you propose (full occupancy, high B) has fewer parameters and explains more of the structure (you account for all atoms the protein has, prior knowledge). This model probably also better reflects the uncertainty of the coordinates of the side chains involved. If your B-factor restraints are not too tight, the difference density should also disappear (equal explanation of the experimental data). To me that would be a better model. HTH, Robbie Date: Mon, 19 Nov 2012 23:36:56 + From: gdmi...@students.latrobe.edu.au Subject: [ccp4bb] occupancy vs. Bfactors To: CCP4BB@JISCMAIL.AC.UK Hello all, I'm currently working on a structure in which, if I stub a certain side chain, phenix/coot shows me a large green blob which looks strikingly similar to the side chain; when I put it in and run another refinement the blob turns red. Basically I was just playing around and I changed the occupancy of the side chain and now there are no complaints. But I was thinking, should I have changed the B-factors instead? Should I have left well enough alone? If I lower the occupancy manually and do not include alternate conformations, have I introduced modelling bias? Could someone recommend some good articles I could read on exactly how to correctly fix this problem? Thanks, GM
Re: [ccp4bb] Convention on residue numbering of fusion proteins?
Hi Meindert, The PDB will let you do what you want and as a result there are a few PDB entries with crazy residue numbering. I would use insertion codes only for real insertions or engineered linkers. Like Nat said, they are a nightmare for many programmers, which is why they are poorly supported by many programs. So go with Mitch's suggestion and offset the residue numbers of the second protein, by some value that makes it clear that the residues are not from another part of the protein. You can add a comment in REMARK 999 if you want to provide extra explanation. According to the PDB standard you do not need a LINK: the connectivity of residues is implied by the order in which they appear in the SEQRES records. That said, programs may do quite different things here. FYI, many programs assume that residue numbering is unidirectional, i.e. always increasing (or in some double stranded DNA molecules in the PDB, always decreasing). So avoid things like going from residue 299 to 300 to 170 to 171. This can cause big problems, for instance when you define your TLS group from residue 200 to residue 173. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Nat Echols Sent: Tuesday, October 23, 2012 19:01 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Convention on residue numbering of fusion proteins? On Tue, Oct 23, 2012 at 9:55 AM, Meindert Lamers mlamers@mrc-lmb.cam.ac.uk wrote: Is there any convention on the numbering of residues in a fusion protein? I have a structure of two domains fused together but would like to keep the biological numbering intact. 1st domain: residue 200-300 (protein A). 2nd domain: residue 170-350 (protein B). The fusion is between A300 and B170. Is it OK to label them chain A and B and create a LINK between the two (thus keeping the biological residue number intact)? Or do I have to start the 2nd domain with residue number 301 (and lose all biological information)?
You could use the insertion code: the first domain could be residues 200A - 300A, the second domain would be residues 170B - 350B, e.g. ATOM 2743 CA THR A 300A -9.899 6.476 21.720 1.00 27.53 C ATOM 2750 CA VAL A 170B -6.589 4.599 21.939 1.00 32.82 C but the chain ID stays the same, with no BREAK or TER record (and no LINK required). The insertion code can be a pain to deal with from a programmer's perspective, and it makes it more difficult to specify residue ranges, but I think this is exactly what it's supposed to be used for. -Nat
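Robbie's warning about non-monotonic numbering is easy to check before deposition. The sketch below flags every place where the residue number drops within a chain and shows how an offset on the second domain removes the reversal. The function names and the offset of 1000 are assumptions made for the illustration; the thread does not prescribe a particular offset.

```python
from typing import List, Sequence, Tuple


def find_numbering_reversals(numbers: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (previous, current) pairs where the residue number decreases.

    Equal consecutive numbers are allowed, since residues that differ only
    in insertion code share a number.
    """
    return [(prev, cur) for prev, cur in zip(numbers, numbers[1:]) if cur < prev]


def apply_offset(numbers: Sequence[int], offset: int) -> List[int]:
    """Shift a block of residue numbers by a constant offset."""
    return [n + offset for n in numbers]


if __name__ == "__main__":
    domain_a = list(range(200, 301))   # protein A, residues 200-300
    domain_b = list(range(170, 351))   # protein B, residues 170-350

    # Biological numbering kept as-is: the chain jumps from 300 back to 170.
    print(find_numbering_reversals(domain_a + domain_b))   # [(300, 170)]

    # Hypothetical offset chosen for this example only.
    OFFSET = 1000
    shifted = domain_a + apply_offset(domain_b, OFFSET)
    print(find_numbering_reversals(shifted))                # []
```

With the offset, protein B runs from 1170 to 1350, so the biological number is still recoverable by subtracting the offset, and a residue range such as a TLS group never has a start that is larger than its end.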
Re: [ccp4bb] anisotropic refinement
If we define high resolution as 1.2A and higher, there are ~1500 high-resolution entries (ignoring entries without experimental data). Of these ~1000 were refined with anisotropic B-factors by our method. We use the Hamilton R-ratio test to support our B-factor model choice. The ~1000 is an underestimate because the B-factor model used wasn't stored properly for the older entries. If more accurate numbers are needed, they can be mined from the PDB_REDO databank. HTH, Robbie Joosten Netherlands Cancer Institute www.cmbi.ru.nl/pdb_redo Date: Thu, 11 Oct 2012 12:17:39 -0700 From: merr...@u.washington.edu Subject: Re: [ccp4bb] anisotropic refinement To: CCP4BB@JISCMAIL.AC.UK On Thursday, October 11, 2012 11:50:37 am Rex Palmer wrote: Dear CCP4'ers With the occurrence of more and more high resolution protein structures does anyone know at present how many such structures have been successfully refined anisotropically? When we tried to categorize refinement protocols in the PDB at the end of 2009 we identified about 1200 protein structures that had been given full anisotropic treatment. Zucker et al, Acta Cryst. (2010). D66, 889–900 However, using automated search of the PDB it is hard to distinguish full aniso refinement from structures refined with TLS but having missing or malformed TLS records. As to successfully, that's a separate question :-) Maybe Robbie Joosten has more recent numbers from the PDB-Redo project, and a comment on success? Ethan -- Ethan A Merritt Biomolecular Structure Center, K-428 Health Sciences Bldg University of Washington, Seattle 98195-7742
Re: [ccp4bb] ideal rms bond length
Hi Faisal, There is no such thing as the ideal deviation from ideal geometry. As long as your rmsZ values are below 1 (which they are), it's okay. Note that, in contrast to popular belief, rmsd has no useful meaning for bonds and angles. Generally, rmsZ goes down with resolution but the correlation is not that high. It does not mean that rmsZ should be a particular value at a certain resolution. That said, you should optimise your restraints to get the most likely model. In Refmac that means minimizing -LLfree. You can do this by hand, but there are also automated procedures to do that. Shameless plug: PDB_REDO has such an automated procedure. HTH, Robbie Joosten Date: Wed, 3 Oct 2012 03:19:48 +0530 From: faisaltari...@gmail.com Subject: [ccp4bb] ideal rms bond length To: CCP4BB@JISCMAIL.AC.UK Dear all, I request you to please answer my basic query about the ideal acceptable rms bond length obtained during refmac refinement. Is the data acceptable in my case, which is as follows:

Ncyc  Rfact   Rfree   FOM    -LL      -LLfree  rmsBOND  zBOND  rmsANGL  zANGL  rmsCHIRAL
0     0.2090  0.2079  0.875  226315.  11985.5  0.0278   1.389  2.718    1.261  0.198
1     0.2064  0.2284  0.850  226313.  12201.1  0.0285   1.427  2.733    1.271  0.204
2     0.2076  0.2373  0.837  226944.  12289.9  0.0248   1.242  2.598    1.200  0.187
3     0.2092  0.2429  0.828  227495.  12341.7  0.0222   1.107  2.458    1.128  0.173
4     0.2100  0.2468  0.822  227753.  12372.4  0.0211   1.053  2.377    1.086  0.166
5     0.2104  0.2500  0.818  227942.  12395.7  0.0204   1.021  2.326    1.061  0.161
6     0.2108  0.2522  0.814  228075.  12411.5  0.0200   0.999  2.289    1.042  0.158
7     0.2111  0.2537  0.812  228162.  12421.8  0.0197   0.984  2.265    1.030  0.156
8     0.2113  0.2550  0.810  228228.  12430.5  0.0194   0.971  2.243    1.020  0.154
9     0.2114  0.2559  0.809  228300.  12436.1  0.0192   0.962  2.228    1.012  0.153
10    0.2116  0.2568  0.808  228348.  12441.7  0.0191   0.957  2.218    1.008  0.152
11    0.2118  0.2574  0.807  228394.  12446.2  0.0190   0.951  2.210    1.004  0.151
12    0.2119  0.2581  0.806  228421.  12449.6  0.0189   0.948  2.203    1.001  0.151
13    0.2119  0.2585  0.805  228440.  12452.7  0.0189   0.944  2.198    0.998  0.150
14    0.2120  0.2590  0.805  228461.  12455.0  0.0188   0.941  2.194    0.996  0.150
15    0.2121  0.2593  0.804  228480.  12456.9  0.0188   0.939  2.190    0.995  0.150

-- Regards Faisal School of Life Sciences JNU
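The difference between rmsd and rmsZ that Robbie points to can be made concrete: rmsZ divides each deviation by the standard deviation of its restraint target before averaging, so tight and loose restraints are put on the same scale. Below is a minimal sketch; the bond lengths and sigmas are invented for the illustration and are not taken from any restraint dictionary.

```python
import math
from typing import Sequence


def rmsd(observed: Sequence[float], ideal: Sequence[float]) -> float:
    """Root-mean-square deviation from the ideal values (in Angstrom)."""
    squares = [(o - i) ** 2 for o, i in zip(observed, ideal)]
    return math.sqrt(sum(squares) / len(squares))


def rms_z(observed: Sequence[float], ideal: Sequence[float],
          sigma: Sequence[float]) -> float:
    """Root-mean-square Z-score: each deviation is scaled by its target sigma."""
    z_scores = [(o - i) / s for o, i, s in zip(observed, ideal, sigma)]
    return math.sqrt(sum(z * z for z in z_scores) / len(z_scores))


if __name__ == "__main__":
    # Invented example: two bonds that both deviate by 0.02 A from target,
    # one with a tight restraint (sigma 0.01 A), one with a loose one (0.04 A).
    ideal = [1.52, 1.23]
    observed = [1.54, 1.25]
    sigma = [0.01, 0.04]

    print(f"rmsd = {rmsd(observed, ideal):.3f} A")
    print(f"rmsZ = {rms_z(observed, ideal, sigma):.2f}")
```

Both bonds contribute equally to the rmsd, but the tightly restrained bond sits two sigmas from its target and dominates the rmsZ. That is why a single rmsd target cannot be applied to a mixture of restraints, whereas an rmsZ below 1 has the same meaning for all of them.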
Re: [ccp4bb] B-iso vs. B-aniso
Dear Yuri, Why do you think you need 36 reflections per atom when atoms with anisotropic B-factors only have 9 parameters? You can get away with far fewer in many cases, especially if you have good restraints. As Ethan points out, a drop in R-free after adding many parameters may be misleading. Proper testing will give you a clearer example. The Hamilton test in Ethan's paper is implemented in PDB_REDO (http://scripts.iucr.org/cgi-bin/paper?ba5174) and I had a quick look at some refinement statistics for structures with ~21 reflections/atom (like your case): according to PDB_REDO's strict criteria anisotropic B-factors are acceptable in two thirds of the cases. This was tested with Refmac on 285 PDB entries; ShelX's new restraints may well increase the success rate. HTH, Robbie Joosten Netherlands Cancer Institute www.cmbi.ru.nl/pdb_redo -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Yuri Pompeu Sent: Monday, September 17, 2012 20:32 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] B-iso vs. B-aniso Dear community, The protein model I am refining has 400 amino acids (3320 atoms). Some real quick calculations tell me that to properly refine it anisotropically, I would need 119,520 observations. Given my unit-cell dimensions and space group it is equivalent to about a 1.24 A complete data set. However, I have had a couple of cases where anisotropic B-factor refinement significantly improved R-work and R-free, while maintaining a reasonable gap for lower resolution models (1.4-1.5 A, around 70,000 reflections). What is the proper way of modelling the B-factors? Any thoughts and/or opinions from the community are welcome. Cheers,
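The arithmetic behind this exchange is short enough to spell out. The sketch below uses only the numbers quoted in the thread (3320 atoms, about 70,000 reflections, 119,520 observations) together with the usual parameter counts of 4 per atom for isotropic refinement (x, y, z, B) and 9 per atom for anisotropic refinement (x, y, z and six ADP terms). It ignores restraints, which add effective observations, so the ratios are a lower bound on how well determined the model is.

```python
PARAMS_PER_ATOM_ISO = 4     # x, y, z, B
PARAMS_PER_ATOM_ANISO = 9   # x, y, z and six anisotropic displacement terms


def reflections_per_atom(n_reflections: int, n_atoms: int) -> float:
    return n_reflections / n_atoms


def observations_per_parameter(n_reflections: int, n_atoms: int,
                               params_per_atom: int) -> float:
    return n_reflections / (n_atoms * params_per_atom)


if __name__ == "__main__":
    n_atoms = 3320
    n_reflections = 70_000      # "around 70,000 reflections" at 1.4-1.5 A

    print("anisotropic parameters:", n_atoms * PARAMS_PER_ATOM_ANISO)
    print(f"reflections per atom:   {reflections_per_atom(n_reflections, n_atoms):.1f}")
    print(f"obs/param (isotropic):   "
          f"{observations_per_parameter(n_reflections, n_atoms, PARAMS_PER_ATOM_ISO):.2f}")
    print(f"obs/param (anisotropic): "
          f"{observations_per_parameter(n_reflections, n_atoms, PARAMS_PER_ATOM_ANISO):.2f}")

    # Yuri's 119,520 observations correspond to 36 reflections per atom.
    print("Yuri's target per atom: ", 119_520 / n_atoms)
```

The roughly 21 reflections per atom that Robbie mentions follows directly from 70,000 / 3320, and it still leaves more than two reflections per anisotropic parameter before restraints are counted.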
Re: [ccp4bb] compatibility issue between coot and refmac
Dear Norman, Refmac version 5.0 sounds unlikely, the version with CCP4 6.3 is 5.7.0029. Anyway, your DNA seems to have asterisks in the atom names, which is 'so last decade' (they were removed in 2008). Refmac and Coot may not be equally forgiving for legacy formats. IMO neither should be. The best you can do is upgrade your PDB to a new format. There is a tool in CCP4i for that and the MolProbity server can do it as well. HTH, Robbie Date: Sun, 26 Aug 2012 00:40:02 -0700 From: zhunor...@gmail.com Subject: [ccp4bb] compatibility issue between coot and refmac To: CCP4BB@JISCMAIL.AC.UK Dear ccp4 forum, Does anybody know which version of coot is compatible with which version of ccp4i, or more specifically, refmac? I seem to be having a DNA recognition problem between the two programs. I have isolated the problem to the differences between naming conventions. However, manipulating the naming convention only makes the pdb file work in one of the programs but not both. If it works for refmac, the real space refinement in coot does not work. The warning that pops up says 'failed to match these atoms names (O1P, C5*...etc)to the dictionary', and if it works for coot, refmac does not process the pdb file at all. Naturally the first thing I thought of was to synchronize the dictionaries in both of these programs, but that doesn't seem to be straightforward. I suspect it has something to do with the import cif dictionary but I don't know where to import it from. That's why I think it might be easier to find a version of coot that's compatible, or vice versa. The versions of the programs I am currently running are coot 0.6.2 and ccp4 6.3.0 interface 2.2.0. The version of refmac in ccp4 is 5.0. Please let me know if I am on the right track or if there is an easier way to do this. Thanks Norm
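To illustrate what "upgrading to the new format" means for the atom names in Norman's warning, here is a minimal sketch of the two nucleic acid renamings involved: asterisks become primes (C5* to C5') and the phosphate oxygens O1P/O2P become OP1/OP2. It covers only these names; the remediation also changed other things (for example residue names), so this is not a replacement for the CCP4i or MolProbity tools Robbie mentions. The function name is hypothetical.

```python
# Legacy phosphate oxygen names and their remediated equivalents.
LEGACY_PHOSPHATE_NAMES = {"O1P": "OP1", "O2P": "OP2"}


def modernise_atom_name(name: str) -> str:
    """Map a legacy nucleic acid atom name to the remediated convention.

    Works on the bare atom name (whitespace stripped); placing the result
    back into the fixed-width atom name field is left to the caller.
    """
    name = name.strip()
    name = LEGACY_PHOSPHATE_NAMES.get(name, name)
    return name.replace("*", "'")


if __name__ == "__main__":
    for legacy in ("O1P", "O2P", "C5*", "O3*", "N9"):
        print(f"{legacy:4s} -> {modernise_atom_name(legacy)}")
```

Names without an asterisk or legacy phosphate label, such as N9, pass through unchanged, so the function can be applied to every atom of a nucleic acid residue.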
Re: [ccp4bb] protein sequence database with conservation score annotation
Dear Yuan Shang, HSSP provides multiple sequence alignments with conservation scores per position. It is originally PDB derived in the sense that a multiple sequence alignment already exists for each PDB entry. You can also make HSSP entries from sequence alone, but you should contact the HSSP maintainers because I'm not sure this service is public yet. You can also cheat a bit and just get the HSSP for the closest homologue in the PDB. HTH, Robbie Joosten Netherlands Cancer Institute From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of ?? Sent: Wednesday, August 22, 2012 06:17 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] protein sequence database with conservation score annotation Hi everyone, Could anyone here recommend such a database to me? I've searched the web and only found tools like 'consurf'. Databases like 'consurf' are important for the analysis of the currently known structures. However, for the original discovery of new domains, sequence databases with such conservation score annotation could be as important as secondary structure prediction. The 'conservation score' may not be as accurate as that from the 'consurf database', which is based on the 3-D alignment, but the information in such a database could be much more helpful, especially for some new proteins, or protein regions without any structure available. Best regards, Yuan SHANG HKUST
Re: [ccp4bb] large difference between r rfree during refinement
Hi Faisal, It looks like your restraints are simply not tight enough. Try optimizing the restraint weight. You should also run more cycles of refinement to make sure it converges. The initial gap between R and R-free is pretty small. Did you do much refinement before this run? Cheers, Robbie From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Faisal Tarique Sent: Monday, July 23, 2012 17:52 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] large difference between r rfree during refinement Dear all, I have two basic queries. 1) While refining my structure (resolution 2.2A) I face a problem with the gap between R and Rfree, which is around 8. The mosaicity of my data is 1.8. Is this difference due to large mosaicity or to twinning? For twinning I got nothing when I checked with Xtriage.

Ncyc  Rfact   Rfree   FOM    -LL     -LLfree  rmsBOND  zBOND  rmsANGL  zANGL  rmsCHIRAL
0     0.2280  0.2488  0.827  75366.  4001.1   0.0318   1.602  2.061    0.978  0.192
1     0.2023  0.2493  0.828  73747.  3981.8   0.0193   0.966  2.020    0.947  0.136
2     0.1928  0.2506  0.827  73291.  3983.6   0.0183   0.914  2.028    0.952  0.132
3     0.1886  0.2509  0.826  73116.  3986.7   0.0178   0.889  2.021    0.950  0.131
4     0.1860  0.2515  0.824  73050.  3990.8   0.0175   0.872  2.015    0.948  0.131
5     0.1847  0.2524  0.822  73036.  3995.1   0.0172   0.856  2.005    0.944  0.131
6     0.1838  0.2530  0.821  73034.  3998.9   0.0169   0.842  1.993    0.938  0.131
7     0.1832  0.2545  0.819  73046.  4002.6   0.0167   0.832  1.983    0.935  0.130
8     0.1828  0.2553  0.818  73054.  4005.4   0.0165   0.822  1.973    0.931  0.130
9     0.1825  0.2567  0.816  73070.  4008.3   0.0163   0.815  1.964    0.927  0.130
10    0.1824  0.2578  0.815  73085.  4010.8   0.0162   0.807  1.955    0.922  0.130

2) What is the best way to add water during structure solution? Is it better to do it automatically during refinement in refmac, or is arpwarp solvent a better option? -- Regards Faisal School of Life Sciences JNU
Re: [ccp4bb] Chiral volume outliers SO4
Dear All, I'm with Dale on this one. It's better to have a standard and roll with it, than allow for ambiguity. The discussion just happened to start with a rather silly example as Tim pointed out. The ligand 1N1 (http://ligand-expo.rcsb.org/reports/1/1N1/1N1_D3L1.gif) is a better example: The atoms N5 and N6 can have inverted chirality. If it is just one of the two, then the molecule is distorted (IFF the restraint file is correct!). If both have inverted chirality then the problem can be fixed by label swapping. Hacking the restraint file to allow both positive and negative chirality would allow you to distort the molecule. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Tim Gruene Sent: Friday, July 13, 2012 10:59 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Chiral volume outliers SO4 Dear all, I am surprised by the discussion about chirality of an utterly centrosymmetric molecule. Shouldn't the four oxygen atoms be, at least from a QM point of view, indistinguishable? What reason is there to maintain a certain 'order' in the human-induced numbering scheme? Cheers, Tim On 07/13/12 00:22, Dale Tronrud wrote: While this change has made your symptom go away it is stretching it a bit to call this a fix. You have not corrected the root problem that the names you have given your atoms do not match the convention which is being applied for SO4 groups. Changing the cif means that you don't have to worry about it, but people who study such details will be forced to deal with the incorrect labels of your model in the future. Wouldn't it just be easier to swap the names of two oxygen atoms in each SO4, leaving the cif alone? Your difficulties will go away and people using your model in the future will also have a simpler life. This labeling problem is not new. The fight to standardize the labeling of the methyl groups in Valine and Leucine was raging in the 1980's.
Standardizing the labels on the PO4 groups in DNA/RNA was much more recent. It helps everyone when you know you can overlay two models and have a logical solution without a rotation matrix with a determinate of -1. Besides, you will continue to be bitten by this problem as you use other programs, until you actually swap some labels. Dale Tronrud On 07/12/12 15:00, Joel Tyndall wrote: Hi all, Thanks very much to all who responded so quickly. The fix is a one liner in the SO4.cif file (last line) SO4 chir_01 S O1 O2 O3both which I believe is now in the 6.3.0 release. Interestingly the chirality parameters were not in the SO4.cif file in 6.1.3 but then appeared in 6.2.0. Once again I'm very happy to get to the bottom of this and get it fixed. I do wonder if it had become over parametrised. Cheers Joel -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Robbie Joosten Sent: Thursday, 12 July 2012 12:16 a.m. To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Chiral volume outliers SO4 Hi Ian, @Ian: You'd be surprised how well Refmac can flatten sulfates if you have a chiral volume outlier (see Figure 1d in Acta Cryst. D68: 484-496 (2012)). But this is only because the 'negative' volume sign was erroneously used in the chiral restraint instead of 'both' (or better still IMO no chiral restraint at all), right? If so I don't find it surprising at all that Refmac tried to flip the sulphate and ended up flattening it. Seems to be a good illustration of the GIGO (garbage in - garbage out) principle. Just because the garbage input in this case is in the official CCP4 distribution and not (as is of course more commonly the case) perpetrated by the user doesn't make it any less garbage. The problem is that in the creation of chiral volume targets chemically equivalent (groups of) atoms are not recognized as such. So any new or recreated restraint files will have either 'positiv' or 'negativ' and the problem starts all over again. 
That is why it is better to stay consistent and choose one chirality (the same one as in the 'ideal' coordinates in the PDB ligand descriptions). This will also make it easier compare ligands after aligning them (this applies to ligands more complex than sulfate). Obviously, users should not be forced to deal with these things. Programs like Refmac and COOT should fix chiral volume inversions for the user, because it is only relevant inside the computer. That is the idea of chiron, just fix these 'problems' automatically by swapping equivalent atoms whenever Refmac gives a chiral volume inversion warning. It should make life a bit easier. The point I was making is that in this and similar cases you don't need a chiral restraint at all: surely 4
Re: [ccp4bb] Chiral volume outliers SO4
Hi Andrew, Indeed, provided the atom labeling is correct the chiral volume restraint actually says whether the groups on the ring are axial or equatorial. The cif files do not define that any other way, so without the restraint the description of the molecule is ambiguous. Note that the chirality restraint only describes the hand; the actual chiral volume is calculated from the angle restraints. In the more general case I think people are talking different languages. There is chemical chirality (the real deal) and computational chirality. In refinement and structure comparison the latter does matter (again, see Dale's post). Problems with restraints are one of the reasons why there are relatively many problems with ligands in the PDB (e.g. http://www.springerlink.com/content/eu28538101v7v885/). Cheers, Robbie -Original Message- From: Andrew Purkiss [mailto:a.purk...@mail.cryst.bbk.ac.uk] Sent: Friday, July 13, 2012 14:09 To: Robbie Joosten Cc: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Chiral volume outliers SO4 Dear Robbie and ccp4bb, Is 1N1 not a different type of problem though, where a chirality restraint is valid and so the atom labelling is important? Are you saying that we should always use the cif dictionary, even when there are errors? Surely in the SO4 case, as Ian said, it is better to remove the unnecessary restraint altogether. The sulphate cif file seems to have had this bug introduced in the current CCP4 version. Andrew On Fri, 2012-07-13 at 12:54 +0200, Robbie Joosten wrote: Dear All, I'm with Dale on this one. It's better to have a standard and roll with it, than allow for ambiguity. The discussion just happened to start with a rather silly example as Tim pointed out. The ligand 1N1 (http://ligand-expo.rcsb.org/reports/1/1N1/1N1_D3L1.gif) is a better example: The atoms N5 and N6 can have inverted chirality. If it is just one of the two, then the molecule is distorted (IFF the restraint file is correct!).
If both have inverted chirality then the problem can be fixed by label swapping. Hacking the restraint file to allow both positive and negative chirality would allow you to distort the molecule. Cheers, Robbie -- Andrew Purkiss X-ray Laboratory London Research Institute Cancer Research UK
Re: [ccp4bb] Chiral volume outliers SO4
Hi Joel, I prefer the swapping of atom names, which is pretty much what the program chiron does, over hacking the restraint file. The latter makes the problem reappear as soon as you use your PDB file on a machine with an 'unhacked' restraint file. @Ian: You'd be surprised how well Refmac can flatten sulfates if you have a chiral volume outlier (see Figure 1d in Acta Cryst. D68: 484-496 (2012)). Cheers, Robbie From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Eleanor Dodson Sent: Wednesday, July 11, 2012 12:15 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Chiral volume outliers SO4 Well - the problem may well be here - in the REFMAC dictionary (see $CLIBD/monomers.s.SO4.cif ) the chiral volume calculation uses the order of the O numbering around the S atom. So if your O numbering is not right handed you will have the chiral volume calculated as positive, not negative. There are various fixes. Edit the SO4 and swap the numbering of 2 of the O atoms - eg use labels: S O2 O1 O3 Or edit the SO4.cif to have

SO4 chir_01 S O1 O2 O3 both

Eleanor

loop_
_chem_comp_chir.comp_id
_chem_comp_chir.id
_chem_comp_chir.atom_id_centre
_chem_comp_chir.atom_id_1
_chem_comp_chir.atom_id_2
_chem_comp_chir.atom_id_3
_chem_comp_chir.volume_sign
SO4 chir_01 S O1 O2 O3 negativ

On 11 July 2012 10:59, Tim Gruene t...@shelx.uni-ac.gwdg.de wrote: Dear Joel, out of curiosity: what is the Chiral volume outliers issue? Cheers, Tim On 07/11/12 01:00, Joel Tyndall wrote: Hi people, We are refining a structure with sulfates and we are getting the Chiral volume outliers issue. I understand the problem as being computational where the oxygens are in reality equivalent but computationally named differently. I have seen the recent Acta Cryst D paper (April 2012 - PDB_REDO) which talks about this issue and mentions the development of Chiron which could fix this issue. Is there a way to fix this problem using existing tools (or editing)?
We have ~20 sulfates in our protein (10-mer system) Thanks heaps Joel _ Joel Tyndall, PhD Senior Lecturer in Medicinal Chemistry National School of Pharmacy University of Otago PO Box 56 Dunedin 9054 New Zealand Skype: jtyndall Ph: +64 3 479 7293 -- Dr Tim Gruene Institut fuer anorganische Chemie Tammannstr. 4 D-37077 Goettingen GPG Key ID = A46BEE1A
Re: [ccp4bb] Chiral volume outliers SO4
Hi Ian, @Ian: You'd be surprised how well Refmac can flatten sulfates if you have a chiral volume outlier (see Figure 1d in Acta Cryst. D68: 484-496 (2012)). But this is only because the 'negative' volume sign was erroneously used in the chiral restraint instead of 'both' (or better still IMO no chiral restraint at all), right? If so I don't find it surprising at all that Refmac tried to flip the sulphate and ended up flattening it. Seems to be a good illustration of the GIGO (garbage in - garbage out) principle. Just because the garbage input in this case is in the official CCP4 distribution and not (as is of course more commonly the case) perpetrated by the user doesn't make it any less garbage. The problem is that in the creation of chiral volume targets chemically equivalent (groups of) atoms are not recognized as such. So any new or recreated restraint files will have either 'positiv' or 'negativ' and the problem starts all over again. That is why it is better to stay consistent and choose one chirality (the same one as in the 'ideal' coordinates in the PDB ligand descriptions). This will also make it easier to compare ligands after aligning them (this applies to ligands more complex than sulfate). Obviously, users should not be forced to deal with these things. Programs like Refmac and COOT should fix chiral volume inversions for the user, because it is only relevant inside the computer. That is the idea of chiron: just fix these 'problems' automatically by swapping equivalent atoms whenever Refmac gives a chiral volume inversion warning. It should make life a bit easier. The point I was making is that in this and similar cases you don't need a chiral restraint at all: surely 4 bond lengths and 6 bond angles define the chiral volume pretty well already? Or are there cases where without a chiral restraint the refinement still tries to flip the chirality (I would find that hard to believe). I agree with you for sulfate, and also for phosphate ;).
I don't know what happens in other compounds at poor resolution, when bond and angle targets (and their SDs) are not equivalent. I guess that some angle might 'give way' before others. That is something that should be tested. I have a growing list of chiral centers that have this problem if you are interested. Cheers, Robbie
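A minimal sketch of why renaming chemically equivalent atoms flips the sign of a chiral volume. It assumes the common definition of the chiral volume as the triple product of the three bond vectors from the central atom; the exact convention inside Refmac may differ, and the idealised sulfate coordinates are made up for illustration.

```python
def chiral_volume(centre, a, b, c):
    """Signed triple product of the bond vectors centre->a, centre->b, centre->c."""
    va, vb, vc = ([p[i] - centre[i] for i in range(3)] for p in (a, b, c))
    cross = (
        vb[1] * vc[2] - vb[2] * vc[1],
        vb[2] * vc[0] - vb[0] * vc[2],
        vb[0] * vc[1] - vb[1] * vc[0],
    )
    return sum(va[i] * cross[i] for i in range(3))


# Hypothetical idealised sulfate: S at the origin, three oxygens on
# tetrahedral directions at roughly 1.47 A (illustrative value only).
d = 1.47 / 3 ** 0.5
s = (0.0, 0.0, 0.0)
o1, o2, o3 = (d, d, d), (d, -d, -d), (-d, d, -d)

v = chiral_volume(s, o1, o2, o3)
# Same coordinates, but the names O1 and O2 are exchanged:
v_swapped = chiral_volume(s, o2, o1, o3)
print(v, v_swapped)
```

The geometry is identical in both calls; only the atom labels differ, yet the sign inverts. A restraint with a fixed 'positiv' or 'negativ' target would therefore push one of the two labellings towards a flat group.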
Re: [ccp4bb] Problems with CCP4MG
Dear Regina, Re. 2) Which program gave the virus warning? Internet Explorer warns about executable files that are not downloaded frequently. This warning can usually be ignored (I got a similar warning for a nightly build of Coot yesterday). If your antivirus program gives a warning I'd be slightly more worried. Cheers, Robbie Date: Thu, 5 Jul 2012 09:51:30 -0700 From: reginaketter...@yahoo.com Subject: [ccp4bb] Problems with CCP4MG To: CCP4BB@JISCMAIL.AC.UK Dear All; I have two problems with the visualization program that I am hoping people can help me with: 1) I downloaded a model from the Swiss-Model repository that I have been trying to open on CCP4MG version 2.4.2. I have it saved as a .pdb file but when it loads, CCP4MG hangs and appears to convert the file into a .pdb.html file. I can remove any spurious lines in the file, resave, reload, to no avail. I can open the .pdb file without problems in Coot and Pymol. 2) I decided to try opening on a Windows machine (don't judge), so downloaded CCP4MG executable today (version 2.5.2). When installing, I received two messages indicating that the executable is a virus. Is anyone aware of this problem; am I missing a program necessary to open the files? Looking forward to any insights, Regina
Re: [ccp4bb] pdb sequence search
Hi Ed, If you are looking for a specific protein, why not get all PDB files with a DBREF record pointing at the uniprot record of the protein you want? You can do a simple text search in the PDB, e.g. 'MYG_PHYCA'. Cheers, Robbie Date: Fri, 22 Jun 2012 22:39:12 -0400 From: epozh...@umaryland.edu Subject: Re: [ccp4bb] pdb sequence search To: CCP4BB@JISCMAIL.AC.UK Tim, I did not understand your objection against solution 1 - is it because it is not automated? You can sort the results by max. Ident so that you can scroll down to the limit you set yourself. More that it does not generate a list of PDB IDs. What I want to do is to find every structure of a particular protein and line them all up. I am not saying it's not doable with option 1, it's just not too convenient. Why do you think an identity cut-off was a good criterion? I usually cut by E-value because I assume the developers of blast know what they are doing and I have the impression they consider the E-value a better criterion than the max. Ident. Because I want all the structures of a particular protein itself, not its homologues. I just went through several cycles of reducing E-value down to 1e-100, and I still get one hit included at 88% identity. Setting E-value cutoff to 0 doesn't work, it just returns them all. Well, thanks to you I now see how to figure out the cutoff - the results are sorted by E-values and list them, so I can just go to the first non-identical hit and use a slightly smaller number. It's just that sequence identity is easier for me to interpret and it's (emotionally) easier to select a cutoff at, say, no more than 5 mutations rather than E-value of 10e-150. Cheers, Ed. Cheers -- Oh, suddenly throwing a giraffe into a volcano to make water is crazy? Julian, King of Lemurs
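The DBREF text search suggested above could be sketched roughly as follows. This assumes the DBREF lines have already been collected from the PDB files of interest; the function name is hypothetical, the entry codes 1XXX, 2YYY and 3ZZZ are made up, and the accession columns are illustrative.

```python
def pdb_ids_for_uniprot(dbref_lines, uniprot_id):
    """Return the sorted PDB ID codes whose DBREF records mention uniprot_id."""
    ids = set()
    for line in dbref_lines:
        fields = line.split()
        # Only DBREF records count; the ID code is the second field.
        if len(fields) > 2 and fields[0] == "DBREF" and uniprot_id in fields[2:]:
            ids.add(fields[1])
    return sorted(ids)


# Made-up example records (entry codes are placeholders, not real entries).
SAMPLE = [
    "DBREF  1XXX A    1   153  UNP    P02185   MYG_PHYCA        1    153",
    "DBREF  2YYY A    1   129  UNP    P00698   LYSC_CHICK      19    147",
    "DBREF  3ZZZ B    1   153  UNP    P02185   MYG_PHYCA        1    153",
    "REMARK   2 MYG_PHYCA MENTIONED OUTSIDE A DBREF RECORD",
]

print(pdb_ids_for_uniprot(SAMPLE, "MYG_PHYCA"))
```

Matching on whole whitespace-separated fields avoids hits on identifiers that merely contain the search string; unlike a sequence search, this gives a plain list of PDB IDs directly.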
Re: [ccp4bb] correlations of B-factors and resolution
Hi Tim, With small test sets, R-free doesn't become meaningless; you just have to take into account that R-free has an error margin which is higher than for cases with a large test set. Few people report this error margin, but with a small data set you can easily do K-fold cross validation. I.e. do K refinements with K = 1/(test set fraction) and report R and R-free as averages with a standard deviation (instead of what we call cross validation, but is actually holdout validation). The CCP4 program freerflag already splits your data set in K groups to make it easier for the user. I do this automatically in PDB_REDO if the test set contains fewer than 500 reflections. It's amazing how much R-free is influenced by the choice of one's test set. Cheers, Robbie Date: Wed, 16 May 2012 16:06:24 +0200 From: t...@shelx.uni-ac.gwdg.de Subject: Re: [ccp4bb] correlations of B-factors and resolution To: CCP4BB@JISCMAIL.AC.UK Dear Qiang, without much explanation, rather from experience, the average B-factor rises as resolution drops. It does make sense in a way because high B-factors indicate some degree of disorder and disorder is usually the cause for the resolution limit. 48A^2 for a 2.4A structure sounds perfectly fine to me, I would not worry provided that all other statistics seem sound. High solvent content surely affects the B-values. The larger the solvent channels and smaller the contact area between the molecules, the more likely they become less stable and less ordered. R and Rfree seem also very good, although the gap is relatively tight. Did you make sure your Rfree set contains at least 500 reflections? The default of 5% often used can lead to fewer reflections than 500 at medium or low resolution, and with fewer than 500 reflections Rfree becomes statistically meaningless - at least according to Axel Brunger's article about that topic. 
Cheers, Tim On 05/16/12 15:46, Qiang Chen wrote: Dear all, I have a 2.4A structure(pdb code 3LAF)with an average protein b-factor of 48. I wonder whether it's acceptable. Is there a direct correlation of b-factor and resolution? The R and Rfree are 21.1% and 23.1%, respectively. This structure has a very high solvent content, 75%. Does it affect the b-factors? Thanks a lot! Qiang -- Dr Tim Gruene Institut fuer anorganische Chemie Tammannstr. 4 D-37077 Goettingen GPG Key ID = A46BEE1A
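The reporting step of the K-fold approach described in this thread (K = 1/(test set fraction), then R-free as an average with a standard deviation) can be sketched as below. The function names are hypothetical and the R-free values are placeholders; the K refinements themselves would be run with a refinement program and are not shown.

```python
from statistics import mean, stdev


def k_from_fraction(test_fraction):
    """Number of folds implied by the test set fraction, K = 1 / fraction."""
    return round(1 / test_fraction)


def summarise(r_values):
    """Mean and sample standard deviation over the K refinements."""
    return mean(r_values), stdev(r_values)


k = k_from_fraction(0.05)  # a 5% test set gives 20 folds

# Placeholder R-free values, one per fold of a hypothetical 10-fold run.
rfree = [0.231, 0.238, 0.226, 0.244, 0.229, 0.235, 0.240, 0.227, 0.233, 0.236]
avg, sd = summarise(rfree)
print(f"K = {k}; R-free = {avg:.3f} +/- {sd:.3f} over {len(rfree)} folds")
```

Reporting the spread alongside the mean makes the dependence of R-free on the choice of test set visible, which a single holdout value hides.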
Re: [ccp4bb] Ligand geometry
Hi Uma, How different are your NADs optimised in Refmac and Coot? Are you sure you are using the same geometric restraints? Coot has to know where Refmac's restraint files are. This info is passed through an environment setting on your computer (I don't know the name by heart. Anyone?). Are you using Windows, Linux or OSX or something else? You can try to find more details about geometric outliers by checking Refmac's log file. That way you may find which specific bond/angle is the problem. Cheers, Robbie Date: Sat, 28 Apr 2012 11:47:58 -0400 From: rosiso2...@gmail.com Subject: [ccp4bb] Ligand geometry To: CCP4BB@JISCMAIL.AC.UK Dear All: I use Refmac5 to refine my model. After the run, I check the model quality by Coot. Here is the problem: In Coot, the ligand - NAD, has bad geometry as indicated by a big red bar, while the geometry of NAD fits nicely with the electron density. If I use refine tools (i.e. regularize Zone or real space refine zone), the geometry of NAD turns to perfect with bond, angle and so on. But the ligand slightly turns away from the electron density map. If I run Refmac5 again with this modified model, the NAD turns back, fits nicely to the electron density, but gives a red bar in coot geometry. The Refinement Parameters in Refmac5 are set @ use automatic weight and use experimental sigmas to weight X-ray terms. Thank you for advice and comments Ros
Re: [ccp4bb] Ligand geometry
Quasi on-topic rant: I would advise against using the 'both' option for any well defined ligand. It's a hack to avoid thinking about which atom belongs where and it allows you to be inconsistent. This makes it difficult for others to use your model, because aligning atoms of ligands becomes needlessly complicated. To the eye an oxygen is an oxygen, to a computer O1 is different from O2. Just stick to the definition given by the PDB (see Ligand Expo). It's there for a reason. Cheers, Robbie Date: Sun, 29 Apr 2012 11:14:01 +0200 From: hraaijmak...@xs4all.nl Subject: Re: [ccp4bb] Ligand geometry To: CCP4BB@JISCMAIL.AC.UK Hmm, what are the perfect bonds, angles for NAD in your protein? Remember that reactive groups can be in a stressed conformation, compared to ideal in vacuo conformations, as part of their function. Anyway, you'll have to check the restraints definition file (.cif). Bond lengths and angles are usually ok, but make sure only chiral atoms are defined as chiral, others need to be deleted or defined as both. Check that the torsion angles make chemical sense, especially the repetition factor for rotatable bonds. Rotatable bonds next to aromatic rings are often problematic. You might need to set high sigmas, and repetition factors (x/360 degrees). On the other hand, you say that refmac behaves well, so the weighting scheme can't be far off. Cheers, Hans. And Uma Ratu schreef: Dear All: I use Refmac5 to refine my model. After the run, I check the model quality by Coot. Here is the problem: In Coot, the ligand - NAD, has bad geometry as indicated by a big red bar, while the geometry of NAD fits nicely with the electron density. If I use refine tools (i.e. regularize Zone or real space refine zone), the geometry of NAD turns to perfect with bond, angle and so on. But the ligand slightly turns away from the electron density map. If I run Refmac5 again with this modified model, the NAD turns back, fits nicely to the electron density, but gives a red bar in coot geometry. 
The Refinement Parameters in Refmac5 are set @ use automatic weight and use experimental sigmas to weight X-ray terms. Thank you for advice and comments Ros
Re: [ccp4bb] Refmac and sigma value
Hi Uma, The optimal weight is indeed resolution dependent, but hard to predict. In Refmac you can follow LLfree when you optimize the restraint weight and also keep an eye on the gap between R and R-free (it should not be too wide). Like Rob said, your geometry should be 'reasonable'. This may be a bit vague, but there is no clear target for bond/angle rmsd at a given resolution (some referees will disagree). If you look at the rmsZ values Refmac gives, the target is a bit clearer: rmsZ 1.000. The average rmsZ does go down with resolution (i.e. lower resolution gives lower rmsZ), but an ideal value cannot be given easily (or at all). Tightening the restraints improves the effective data/parameter ratio of your model. You can also improve it by adding additional restraints (e.g. NCS restraints) or by removing parameters (e.g. changing the complexity of your B-factor model). Note that the absence of geometric outliers does not prove that your model is optimal. If you use too tight restraints you can end up hiding genuine fitting errors. Cheers, Robbie Date: Fri, 27 Apr 2012 10:04:11 +0200 From: herman.schreu...@sanofi.com Subject: Re: [ccp4bb] Refmac and sigma value To: CCP4BB@JISCMAIL.AC.UK It all will depend on the resolution. At low resolution, relaxing the geometric restraints will allow the refinement program to tweak the model such that the difference between Fobs and Fcalc is minimized, but not that the model gets closer to the truth. I once struggled for a long time with a 3.5Åish data set with a protein where the most important feature was a rather flexible loop. It was before maximum likelihood methods and Rfrees and the only way I could get rid of the model bias was to use extremely tight geometric restraints. The Rfactor would go up, but suddenly the electron density maps would no longer accept incorrectly placed side chains and new features, not present in the model, would appear. 
So my advice: at low resolution use as tight restraints as possible and monitor with Rfree if you are going in the right direction. At high or very high resolution, you can follow what your diffraction data tells you. In fact many very high resolution structures (< 1.5 Å) have higher rmsd's for bond lengths and angles than medium resolution structures. However, at medium or low resolution there is not enough data to justify relaxing the geometric restraints too much. Best regards, Herman From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Robert Nicholls Sent: Friday, April 27, 2012 9:25 AM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Refmac and sigma value Hi Uma, Altering sigma affects the strength of geometry restraints throughout the model - bonds, angles, etc. Choosing a very low sigma will cause geometry to be more tightly restrained towards ideal values, which is why you observe improvements in Coot validation. Note that strengthening the geometry weight causes the observations (data) to be less influential in refinement. The risk of this is that your model may no longer appropriately/optimally describe your data. You can assess this locally by manual inspection of the electron density, and globally by considering overall refinement statistics (as reported at the bottom of the Refmac5 log file). Ideally, you want your model to both describe the data and have reasonable geometry. Regards Rob On 26 Apr 2012, at 21:26, Uma Ratu wrote: Hi, Alex: Which sigma do you mean? The one for automatic weight, not for Jelly-body refinement. I did not turn the Jelly-body refinement on. Thanks Ros On Thu, Apr 26, 2012 at 4:08 PM, aaleshin aales...@burnham.org wrote: Hi Uma, Which sigma do you mean? The one for Jelly-body refinement? J-B sigma=0.01 means very small fraction of the gradient will be used in each step. 
It is usually used with very low resolution (less than 3A) Alex On Apr 26, 2012, at 11:38 AM, Uma Ratu wrote: Dear All: I use Refmac5 to refine my structure model. When I set the sigma value to 0.3 (as recommended from tutorial), the resulting model has many red bars by coot validation (geometry, rotamer, especially, Temp Factor). I then lower the sigma value to 0.1, the resulting model is much improved by coot validation. I then lower the sigma value to 0.01, the resulting model is almost perfect, by coot validation and Molprobity. My question is: what is the risk of a very low sigma value? Thank you for your advice Ros
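For readers unfamiliar with the rmsZ values mentioned in this thread, here is a minimal sketch assuming the usual definition: the root mean square of the Z-scores (model value minus ideal value, divided by the restraint sigma). The function name and the numbers are hypothetical, and Refmac's own bookkeeping may differ in detail.

```python
from math import sqrt


def rms_z(restraints):
    """Root-mean-square Z-score over (model, ideal, sigma) triples."""
    z_squared = [((model - ideal) / sigma) ** 2 for model, ideal, sigma in restraints]
    return sqrt(sum(z_squared) / len(z_squared))


# Hypothetical bond restraints: (model length, ideal length, sigma) in Angstrom.
bonds = [
    (1.530, 1.520, 0.020),  # Z = 0.5
    (1.340, 1.330, 0.010),  # Z = 1.0
]
print(round(rms_z(bonds), 3))
```

Because each deviation is scaled by its own sigma, rmsZ stays comparable across restraint types, which is why it gives a clearer target than a raw bond or angle rmsd.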
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear CCP4BBers, The PDB_REDO entry Bernhard referred to in his interesting and very thorough article was automatically deleted because the original PDB entry was obsoleted. Since access to the 'experimental' data of any study is important, we have made a compressed copy of the PDB_REDO entry available at http://www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2 Our apologies to those who have looked for this entry in vain. Best wishes, Robbie Joosten (on behalf of the PDB_REDO team) Biochemistry Netherlands Cancer Institute P.S. The whole fraud thing seems to have interfered with the annual April fools' post on CCP4BB. Let's hope this will not happen again. -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Michel Fodje Sent: Saturday, March 31, 2012 21:55 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication Very interesting Response to Detection and analysis of unusual features in the structural model and structure-factor data of a birch pollen allergen doi:10.1107/S1744309112008433 a quote from the response: Author Schwarzenbacher admits to the allegations of data fabrication and deeply apologizes to the co-authors and the scientific community for all the problems this has caused . Note added in proof: subsequent to the acceptance of this article for publication, author Schwarzenbacher withdrew his admission of the allegations. From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp (Hofkristallrat a.D.) [hofkristall...@gmail.com] Sent: Saturday, March 31, 2012 12:42 PM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication This is an unresolved problem, and no real satisfactory solution exists, because the underlying reasons for zero occupancy can be different. For people who understand this and look at electron density, it is not a problem. For users who rely on some graphics program displaying only atom coordinates, it can be. 
The same holds for manipulation of B-factors, trading high B-factors against reduced occupancy, and other (almost always purely cosmetic but still confusing or inconsistent) practices. Best, BR From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Nian Huang Sent: Saturday, March 31, 2012 11:29 AM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication I don't model zero occupancy in my model. But can't the refinement programs just treat those atoms with zero occupancy as missing atoms? Nian Huang On Sat, Mar 31, 2012 at 10:26 AM, Bosch, Juergen jubo...@jhsph.edu wrote: really fascinating, bringing back the discussion for a repository for your collected frames. Jürgen Acta Cryst. (2012). F68, 366-376 doi:10.1107/S1744309112008421 Detection and analysis of unusual features in the structural model and structure-factor data of a birch pollen allergen B. Rupp Abstract: Physically improbable features in the model of the birch pollen structure Bet v 1d (PDB entry 3k78) are faithfully reproduced in electron density generated with the deposited structure factors, but these structure factors themselves exhibit properties that are characteristic of data calculated from a simple model and are inconsistent with the data and error model obtained through experimental measurements. The refinement of the 3k78 model against these structure factors leads to an isomorphous structure different from the deposited model with an implausibly small R value (0.019). The abnormal refinement is compared with normal refinement of an isomorphous variant structure of Bet v 1l (PDB entry 1fm4). 
A variety of analytical tools, including the application of Diederichs plots, R plots and bulk-solvent analysis are discussed as promising aids in validation. The examination of the Bet v 1d structure also cautions against the practice of indicating poorly defined protein chain residues through zero occupancies. The recommendation to preserve diffraction images is amplified. .. Jürgen Bosch Johns Hopkins University Bloomberg School of Public Health Department of Biochemistry Molecular Biology Johns Hopkins Malaria Research Institute 615 North Wolfe Street, W8708 Baltimore, MD 21205 Office: +1-410-614-4742 Lab: +1-410-614-4894 Fax: +1-410-955-2926 http://web.mac.com/bosch_lab/
Re: [ccp4bb] REFMAC Riding Hydrogens
Hi Everyone, Pavel's statement is likely a bit of an exaggeration, but he has a valid (yet hard to prove) point. The default in CCP4i was (and is?) to use hydrogens only if present in the input file. This is IMO not a safe default. Because there were some reporting errors in the past (http://proteincrystallography.org/ccp4bb/message18808.html) it is hard to tell from the PDB when refinement with hydrogens became hip. Discussions on this BB show that the use of riding hydrogens is still not fully accepted, especially at low resolution (where they actually help most). Cheers, Robbie From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Pavel Afonine Sent: Monday, March 05, 2012 21:53 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] REFMAC Riding Hydrogens Dear Tim, good catch, thanks; I could craft that phrase more carefully! Although often it may not be quite fair to take phrases out of context: this newsletter article was written in the context of macromolecular refinement. And yes, recently may be a broad term -:) All the best, Pavel On Mon, Mar 5, 2012 at 12:45 PM, Tim Gruene t...@shelx.uni-ac.gwdg.de wrote: Dear Pavel, you may want to add to the structures mentioned in [1] one or two organic structures present in the Cambridge Database. Until recently it was customary to ignore hydrogen atoms throughout the process of crystallographic X-ray structure determination. [1] 'recently' as in 1997 [2]? Even though 1997 is probably a poor estimation of the corresponding year... Cheers, Tim [1] On contribution of hydrogen atoms to X-ray scattering http://www.phenix-online.org/newsletter/ [2] http://shelx.uni-ac.gwdg.de/SHELX/shelx.pdf On 03/05/2012 09:14 PM, Pavel Afonine wrote: Hi, On Mon, Mar 5, 2012 at 11:52 AM, Matthew Franklin mfrank...@nysbc.org wrote: Adding the riding hydrogens generally gives you some improvement in R factors even with a good quality (i.e. stereochemically correct) model. 
and here are the results of more or less systematic test that prove this: see On contribution of hydrogen atoms to X-ray scattering here: http://www.phenix-online.org/newsletter/ Pavel -- Dr Tim Gruene Institut fuer anorganische Chemie Tammannstr. 4 D-37077 Goettingen GPG Key ID = A46BEE1A
Re: [ccp4bb] on Rwork and Rfree
Hi Dialing, Most water picking tools are rather overenthusiastic and end up placing some waters at places where they should not be. This causes some overfitting and an increase of R-free. I'm hideously old-fashioned and recommend conservatively building waters by hand. There are some good validation tools that help get rid of excess water: centrifuge in PDB_REDO does the basic work; check/delete waters in Coot highlights other suspicious waters; WHAT_CHECK checks hydrogen bonding thoroughly, finds nonsense clusters of water and also finds possible ions. It must be noted that all these tools break down at very high resolution where you may get alternate waters. Fortunately, this problem doesn't occur very often. Cheers, Robbie From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Dialing Pretty Sent: Tuesday, February 07, 2012 09:31 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] on Rwork and Rfree Dear All, After we refine the structure of the protein to a satisfactory Rwork and Rfree, we pick water by phenix refine, and I find Rfree always increases slightly after the water picking refinement. Do you have any idea to solve this problem or any comment? Cheers, Dialing
Re: [ccp4bb] reliable/unreliable maps?
Hi Frank, EDS already does that. Even so, reproducing the R-factor does not prove that the map is reliable. See for instance 3frk for which the deposited dataset is much smaller and less complete than the one used for refinement. The map from EDS is therefore completely model biased. I only recently started looking for this problem of lower-than-reported completeness. I have not found a lot of cases, but already too many. Fortunately, at least a few depositors deposited the rest of the dataset after I sent a bug report to the PDB (e.g. 3mbs). Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Frank von Delft Sent: Tuesday, January 10, 2012 16:23 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] reliable/unreliable maps? Or just print both Rfactors...? On 10/01/2012 15:21, Luca Pellegrini wrote: Hi Paul, What would you rather it say, I'm happy to change the message. The EDS does not think that this is a reliable map, in that it is or may be inconsistent with what the authors were looking at during deposition? How about Warning: the R-factor calculated for this map differs significantly from the published R-factor? Then we can discuss what is significant ;-) Luca Luca Pellegrini Department of Biochemistry University of Cambridge 80 Tennis Court Road Cambridge CB2 1GA - UK Email: lp...@cam.ac.uk Tel: 0044-1223-760469 Fax: 0044-1223-766002 Sanger building, room 3.59
Re: [ccp4bb] chirality problem
Hi Phil, It is an annoying problem, especially for Phe and Tyr which have standard rotamers close to the critical chi angles (-90 and +90). Asp and Glu do not have standard rotamers near critical angles, so the problem should be much smaller (but I still get them too often). If Val, Leu and Arg problems reoccur after refinement, then there is something seriously wrong. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Phil Evans Sent: Monday, January 09, 2012 12:54 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] chirality problem The problem with fixing the nomenclature problems in Coot is that they are back again after the next round of refinement (or at least some of them are, if they are right on the edge of an arbitrary distinction) - indeed irritating Phil On 9 Jan 2012, at 11:43, Paul Emsley wrote: On 08/01/12 10:36, ccp4 wrote: Won't coot fix the nomenclature issue, then you can check whether you have a real chirality problem - eg a squashed flattened VAL.. It will indeed [1]. So Afshan need only read in the file, Press OK and then Save. Robbie and I think that it is more likely than not that Afshan did not really have a chirality problem. Afshan and Kim have been in touch and confirm that it is the adit validation report that describes a nomenclature error on a VAL CB as a chirality problem (rather than anything from CCP4). Paul. [1] well, modern ones do [2] [2] and you can turn it off (some people find the feature annoying)
Re: [ccp4bb] chirality problem
Hi Afshan, I assumed, because you mentioned only VAL and LEU, that you were referring to the CB (VAL) and CG (LEU) as problematic chiral centers. Paul is right that these atoms are not chiral in a chemical sense, but they are in a computational sense because every connected atom has a unique name. The PDB is pretty strict in this sense (as it should be), but they could/should call it a nomenclature error. They could also just swap the atom names like I described and solve the problem for you. Anyway, please give a few more details about your problem. A computational chirality problem can be a serious problem for refinement: if the chirality is wrong due to swapped atom names, the chiral volume restraint will try to invert your chiral center. This can lead to malformed geometry, typically flattening of the group. This means that a computational chirality problem can lead to a 'real' chirality problem. In Refmac, this will not happen for LEU or VAL, but it will happen for things like SO4, GOL, and a whole lot of other more interesting hetero compounds. @ Paul, I don't think it will be a CCP4 program that reported the problem. Does the 'fix nomenclature problems' option in Coot also do VAL and LEU? Cheers, Robbie Date: Fri, 6 Jan 2012 10:44:58 + From: paul.ems...@bioch.ox.ac.uk Subject: Re: [ccp4bb] chirality problem To: CCP4BB@JISCMAIL.AC.UK Hi Afshan, This is not the solution if you are right about the problem being one of chirality (and it is if it is not and is merely an issue of nomenclature (as I suspect is the case)). So the question is, if the problem is indeed one of nomenclature, what software (if any) described it as a chirality issue? If it is one of ours we should fix that. Paul On 05/01/12 11:44, Robbie Joosten wrote: Hi Afshan, Just swap the (names of) the CD and CG atoms, no need for refinement. The CCP4 dictionary allows both chiralities for LEU and VAL, so Refmac won't detect the problem. 
The problem is still very real to many programs so it should be fixed. Cheers, Robbie Joosten Date: Thu, 5 Jan 2012 02:46:30 -0800 From: afshan...@yahoo.com Subject: [ccp4bb] chirality problem To: CCP4BB@JISCMAIL.AC.UK Dear Users, I am having difficulty validating my structure with the PDB server. I have solved my structure and now want to submit it to the PDB, but during the validation process I get chirality problems, especially for VAL and LEU residues. In total 18 residues deviate from the expected chirality. How can I solve this problem? Any suggestion would be highly appreciated. Best Regards AFSHAN
Re: [ccp4bb] chirality problem
Hi Afshan, Just swap the (names of) the CD and CG atoms, no need for refinement. The CCP4 dictionary allows both chiralities for LEU and VAL, so Refmac won't detect the problem. The problem is still very real to many programs so it should be fixed. Cheers, Robbie Joosten Date: Thu, 5 Jan 2012 02:46:30 -0800 From: afshan...@yahoo.com Subject: [ccp4bb] chirality problem To: CCP4BB@JISCMAIL.AC.UK Dear Users, I am having difficulty validating my structure with the PDB server. I have solved my structure and now want to submit it to the PDB, but during the validation process I get chirality problems, especially for VAL and LEU residues. In total 18 residues deviate from the expected chirality. How can I solve this problem? Any suggestion would be highly appreciated. Best Regards AFSHAN
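The name swap suggested above can be sketched as follows, reading "the CD and CG atoms" as CD1/CD2 of LEU and CG1/CG2 of VAL. The function name is hypothetical; the code assumes standard fixed-column PDB ATOM records and leaves attached hydrogens and alternate conformations untouched, so treat it as an illustration rather than a replacement for the nomenclature fix in Coot.

```python
# Pairs of chemically equivalent atoms whose names may need exchanging.
SWAPS = {"LEU": ("CD1", "CD2"), "VAL": ("CG1", "CG2")}


def swap_names(atom_lines, res_name, chain, res_seq):
    """Exchange the two equivalent atom names in one residue; coordinates stay put."""
    first, second = SWAPS[res_name]
    rename = {first: second, second: first}
    out = []
    for line in atom_lines:
        if (
            line.startswith(("ATOM", "HETATM"))
            and len(line) >= 26
            and line[17:20] == res_name      # residue name, columns 18-20
            and line[21] == chain            # chain ID, column 22
            and int(line[22:26]) == res_seq  # residue number, columns 23-26
        ):
            name = line[12:16].strip()       # atom name, columns 13-16
            if name in rename:
                # Three-character carbon names start in column 14.
                line = line[:12] + " " + rename[name].ljust(3) + line[16:]
        out.append(line)
    return out
```

Only the labels change, so no refinement is needed afterwards; the residues to fix would come from the validation report.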
Re: [ccp4bb] How to assess geometry in a model?
Hi Matt, WHAT_CHECK writes out a file called check.db that contains per-residue scores for several quality metrics. It is fairly easy to parse. Cheers, Robbie Date: Thu, 8 Dec 2011 23:08:45 -0500 From: mattw...@gmail.com Subject: [ccp4bb] How to assess geometry in a model? To: CCP4BB@JISCMAIL.AC.UK Hi Folks I'm looking for a way to score each atom (or residue) in a model based on its geometry. I know these scores exist because various software packages speak of outliers, even including a sigma value in some cases. So I'm looking for a simple way to get a complete list (not just outliers). Does anyone know of a package that can be made to output these scores? Thanks, Matt
Re: [ccp4bb] How to distinguish between Na+ and Mg2+?
Hi Florian, There are quite a few tools that do this check for you. To name a few: WASP (old but good, build the ion as water), WHAT_CHECK (http://swift.cmbi.ru.nl/servers/html/index.html), Check My Metal and probably quite a few others. All of them use the bond valence sum, but they all have a different implementation so the results may differ. That said, it is usually reasonably easy to tell Na+ and Mg2+ apart. At the risk of stating the obvious: think of what you added in the crystallization, buffer counter ions are easily overlooked. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Florian Sauer Sent: Thursday, December 01, 2011 19:40 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] How to distinguish between Na+ and Mg2+? Dear CCP4BBers, I'm refining the structure of a Ca-binding protein with several EF-hands against 2.2A data. There is clear density for an ion in several EF-hands coordinated by Asp/Glu, Ser, one backbone O and one water (coordination number 5 and 6). Ca2+ can be excluded as there are no anomalous difference peaks at these sites when I calculate a map from data collected at 2A wavelength. I suppose that either Na+ or Mg2+ are bound. I'd like to ask whether there is a clear way to distinguish between both ions in a model from data at this resolution. Thank you in advance for your suggestions, Florian P.S. The distances between ion and protein are: Asp/Glu carboxyl O: 2.04-2.75A Ser OH: 1.98-2.6A Backbone O: 2.4-2.65A
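A rough sketch of the bond valence sum that these tools build on, using the standard form s = exp((R0 - R)/b) summed over the coordinating atoms. The R0 values for Na-O and Mg-O and b = 0.37 are quoted from memory of commonly tabulated parameters and should be treated as assumptions to be checked against a current table; the six distances are hypothetical, and the listed programs use their own, more careful implementations.

```python
from math import exp

# Assumed bond valence parameters for metal-oxygen bonds (Angstrom);
# verify against a published parameter table before relying on them.
R0 = {"NA": 1.80, "MG": 1.693}
B = 0.37


def bond_valence_sum(element, distances):
    """Sum of exp((R0 - d) / B) over all metal-oxygen distances d."""
    return sum(exp((R0[element] - d) / B) for d in distances)


# Hypothetical octahedral site with six oxygens at 2.07 A.
site = [2.07] * 6
for element, charge in (("NA", 1), ("MG", 2)):
    bvs = bond_valence_sum(element, site)
    print(f"{element}: valence sum {bvs:.2f} (expected charge {charge})")
```

The candidate whose valence sum lies closest to its formal charge is the more plausible assignment; with distances as spread out as those quoted in the question, the result should be checked per site.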
Re: [ccp4bb] FreeR in the case of few reflections
Hi Aaron, You don't explain why you have so few reflections. Is it a small cell, low resolution or just really bad data? Assuming it's not the last one and your data is reasonably complete, I would try this:
- Divide your reflections into six groups (and check that these groups are really of equal size).
- Refine with one set excluded and optimize your refinement protocol. Do a lot of cycles of refinement to ensure that the refinement converges.
- Generate maps using all reflections (i.e. do not exclude the set you excluded in refinement). If you leave out 17% of your reflections you either get poor maps due to missing Fourier terms or your maps will be very biased towards your model.
- Once you are content with your model, do six refinements with different sets excluded like Pavel said. You can reset the B-factor if you worry about model bias. Use even more cycles of refinement than before to be sure your refinements converge.
- Report ALL the R-free values in your publication and describe the methods really well.
- Deposit the model with the R-free closest to the mean.
HTH, Robbie From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Pavel Afonine Sent: Friday, November 18, 2011 17:25 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] FreeR in the case of few reflections Hi Aaron, here is what I would do: - create 10-20 independent test sets containing 5% of reflections (make sure lattice symmetry is taken into account - Phenix does it by default); - solve and refine the structure for each of the data sets (make sure you use a refinement strategy that does not give a very poor Rfree-Rwork gap like you have right now: 28/40). See how final models, maps and R-factors are different. That will give you an idea about reliability of the results you get and starting point for further thoughts. Of course this is not the only way to tackle this problem, but a possibility. 
Pavel On Fri, Nov 18, 2011 at 6:36 AM, Aaron Alt aa...@ibv.csic.es wrote: Hi all, I have data indexed in I23 with ~3000 unique reflections. Having set aside 10% of these my refinements still go berserk. The maps do look fine though. The same happens when reindexed in lower symmetries. Phenix autobuild finishes for example with 28/40 and I get similiar results (although it took me longer) tracing manually and refining with refmac. Does it make sense to set aside 500 reflections in my case, which would be ~17% of the data? What is the correct way to deal with data of this type? Ignoring the Rfree completely? A nice weekend to all, Aaron
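The bookkeeping behind the k-fold approach described in this thread can be sketched as follows. It is a minimal illustration under stated assumptions: reflections are identified by plain integer indices, the split is purely random (it ignores the lattice symmetry that Pavel mentions, which a real test-set generator should respect), and the function names are made up for this example. The refinements themselves are not shown.

```python
import random
import statistics


def split_into_test_sets(n_reflections, k=6, seed=0):
    """Randomly divide reflection indices into k groups of near-equal size."""
    indices = list(range(n_reflections))
    random.Random(seed).shuffle(indices)
    # Taking every k-th index gives groups that differ by at most one member.
    return [indices[i::k] for i in range(k)]


def summarise_rfree(rfree_values):
    """Return the mean, the sample standard deviation and the index of the
    refinement whose R-free lies closest to the mean."""
    mean = statistics.mean(rfree_values)
    spread = statistics.stdev(rfree_values)
    closest = min(
        range(len(rfree_values)), key=lambda i: abs(rfree_values[i] - mean)
    )
    return mean, spread, closest


if __name__ == "__main__":
    groups = split_into_test_sets(3000, k=6)
    print([len(group) for group in groups])
```

With the six R-free values in hand, `summarise_rfree` gives the numbers to report and tells you which of the six models to deposit.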
Re: [ccp4bb] weight matrix and R-FreeR gap optimization
Hi James, That is not exactly a lot of info to decide the best weight. The optimal weight is (very loosely) resolution dependent. At normal resolutions the optimal matrix weight is usually well below 1.0. Start at 0.3 and try a few weights to see what works best for your data. To close the R-free gap you can also try to optimize other refinement parameters such as NCS restraints, B-factor model (and restraint weight). Jelly body restraints sometimes work really well to keep the R-free gap sensible, especially at low resolution. Cheers, Robbie From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of james09 pruza Sent: Tuesday, November 08, 2011 06:40 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] weight matrix and R-FreeR gap optimization Dear ccp4bbers, I wonder if someone can help me defining proper weight matrix term in Refmac5 to lower the R-FreeR gap. The log file indicates weight matrix of 1.98 with a gap of 7. Thanks for suggestions in advance. James
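Trying a few weights around a starting value is easy to organise with a small helper. The sketch below is only one way to do it: the geometric spacing of the trial weights and the selection rule (lowest R-free, with the smaller R-free/R gap as tie-breaker) are assumptions made for this example, not something Refmac prescribes, and the function names are hypothetical. Running the refinements and collecting the R-factors is left out.

```python
def candidate_weights(start=0.3, factor=2.0, steps=2):
    """Return trial matrix weights spaced geometrically around `start`."""
    return [start * factor ** i for i in range(-steps, steps + 1)]


def pick_weight(results):
    """Pick a weight from {weight: (r_work, r_free)}.

    Assumed rule: lowest R-free wins; ties go to the smaller gap.
    """
    return min(
        results,
        key=lambda w: (results[w][1], results[w][1] - results[w][0]),
    )


if __name__ == "__main__":
    print(candidate_weights())
```

Feed `pick_weight` the R and R-free values from the trial refinements, keyed by the weight used for each run.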
Re: [ccp4bb] refmac 5.6 ccp4 6.2.0
Hi Kenneth, This looks like an off-by-one bug in the restraint generation. Typical sources are weird LINKs, wrong atom names and bad luck. I suggest you make sure you have the very latest Refmac and dictionary and try setting up a new refinement instead of recycling an old job. If that doesn't work, we may need a closer look at your input model. Cheers, Robbie Date: Thu, 27 Oct 2011 20:48:49 -0500 From: satys...@wisc.edu Subject: [ccp4bb] refmac 5.6 ccp4 6.2.0 To: CCP4BB@JISCMAIL.AC.UK Has anyone had problems with Refmac 5.6? I tried refining our stucture at 1.24 A, aniso with H in riding position and it just exploded! I get error in distances such as Standard External All Bonds: 3270 0 3270 Angles: 4923 0 4923 Chirals: 214 0 214 Planes: 368 0 368 Torsions: 957 0 957 --- Number of reflections in file 90428 Number of reflections read 90428 CGMAT cycle number = 1 Bond distance outliers Bond distance deviations from the ideal 10.000Sigma will be monitored A 5 PRO C A - A 5 PRO HB1 A mod.= 3.344 id.= 1.329 dev= -2.015 sig.= 0.014 A 5 PRO C B - A 5 PRO HB1 B mod.= 2.997 id.= 1.329 dev= -1.668 sig.= 0.014 A 5 PRO HB1 A - A 5 PRO HG1 A mod.= 2.292 id.= 1.458 dev= -0.834 sig.= 0.021 A 5 PRO HB1 A - A 6 LEU HD23A mod.= 7.407 id.= 0.860 dev= -6.547 sig.= 0.020 A 5 PRO HB1 B - A 5 PRO HG1 B mod.= 2.247 id.= 1.458 dev= -0.789 sig.= 0.021 A 5 PRO HB1 B - A 6 LEU HD23B mod.= 6.529 id.= 0.860 dev= -5.669 sig.= 0.020 A 5 PRO HG1 A - A 5 PRO HD1 A mod.= 2.267 id.= 0.980 dev= -1.287 sig.= 0.020 A 5 PRO HG1 A - A 6 LEU N A mod.= 4.860 id.= 1.530 dev= -3.330 sig.= 0.020 A 5 PRO HG1 A - A 6 LEU HD21A mod.= 6.129 id.= 1.525 dev= -4.604 sig.= 0.021 A 5 PRO HG1 B - A 5 PRO HD1 B mod.= 2.236 id.= 0.980 dev= -1.256 sig.= 0.020 A 5 PRO HG1 B - A 6 LEU N B mod.= 4.922 id.= 1.530 dev= -3.392 sig.= 0.020 A 5 PRO HG1 B - A 6 LEU HD21B mod.= 6.664 id.= 1.525 dev= -5.139 sig.= 0.021 A 6 LEU N A - A 6 LEU CA A mod.= 1.467 id.= 0.970 dev= -0.497 sig.= 0.020 A 6 LEU N A - A 6 LEU HA A mod.= 2.005 
id.= 0.970 dev= -1.035 sig.= 0.020 A 6 LEU N A - A 6 LEU CB A mod.= 2.497 id.= 1.530 dev= -0.967 sig.= 0.020 A 6 LEU N B - A 6 LEU CA B mod.= 1.469 id.= 0.970 dev= -0.499 sig.= 0.020 A 6 LEU N B - A 6 LEU HA B mod.= 2.032 id.= 0.970 dev= -1.062 sig.= 0.020 A 6 LEU N B - A 6 LEU CB B mod.= 2.446 id.= 1.530 dev= -0.916 sig.= 0.020 A 6 LEU CB A - A 6 LEU HB2 A mod.= 0.969 id.= 1.521 dev= 0.552 sig.= 0.020 A Rfree goes form 17 to 28 and R from 15 to 25. Coot map looks like a bunch of busted insect parts. I use the exact same input using ccp4 6.1.13 and Refmac 5.5 and all is good. I am forced to use the old ccp4 and refmac to publish. Rf 17 R 15. thanks -- Kenneth A. Satyshur, M.S.,Ph.D. Associate Scientist University of Wisconsin Madison, Wisconsin 53706 608-215-5207
Re: [ccp4bb] raw data deposition
Hi Francis, Even though they are not published, there are enough models in the PDB for which reevaluation of the crystallographic data leads to new biological insight. Unfortunately, a lot of the insight is of the type that ligand doesn't really bind, or at least not in that pose. Another nice one is a sequencing error in a Uniprot entry that became obvious after critically looking at the structure and the maps (the authors, of both structure and sequence, acknowledge the problem, but the entry is not yet fixed, so no names). Yesterday, I had a case where I didn't so much mistrust the model, but I would still have liked to have access to the images. There was something weird in the maps that was also clearly there in pictures of the maps in the linked publication, but it was not discussed. Needless to say, I'm in favour of depositing images. At least for published structure models. There is still a lot of interesting things to find in current and future PDB entries. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of James Stroud Sent: Friday, October 28, 2011 07:57 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] raw data deposition On Oct 27, 2011, at 5:22 PM, Francis E Reyes wrote: So I ask again, are there literature examples where reevaluation of the crystallographic data has directly resulted in new biological insights into the system being modeled? This is a poor criterion on which to base any conclusions or decisions. We can blame the lack of examples on unavailability of the data. Right now, I'd love to get my hands on the raw images for a particular cryoEM data set, but they are not available--only the maps. But the maps assume one symmetry and I have a hypothesis that the true symmetry is different. I could test my hypothesis by reprocessing the data were it available. James
Re: [ccp4bb] should the final model be refined against full datset
Hi Ed,
This is a follow up (or a digression) to James comparing test set to missing reflections. I also heard this issue mentioned before but was always too lazy to actually pursue it. So. The role of the test set is to prevent overfitting. Let's say I have the final model and I monitored the Rfree every step of the way and can conclude that there is no overfitting. Should I do the final refinement against complete dataset? IMCO, I absolutely should. The test set reflections contain information, and the final model is actually biased towards the working set. Refining using all the data can only improve the accuracy of the model, if only slightly.
Hmm, if your R-free set is small the added value will also be small. If it is relatively big, then your previously established optimal weights may no longer be optimal. A more elegant thing to do would be to refine the model with, say, 20 different 5% R-free sets, deposit the ensemble and report the average R(-free) plus a standard deviation. AFAIK, this is what the R-free set numbers that CCP4's FREERFLAG generates are for. Of course, in that case you should do enough refinement (and perhaps rebuilding) to make sure each R-free set is free.
The second question is practical. Let's say I want to deposit the results of the refinement against the full dataset as my final model. Should I not report the Rfree and instead insert a remark explaining the situation? If I report the Rfree prior to the test set removal, it is certain that every validation tool will report a mismatch. It does not seem that the PDB has a mechanism to deal with this.
The deposited R-free sets in the PDB are quite frequently 'unfree' or the wrong set was deposited (checking this is one of the recommendations in the VTF report in Structure). So at the moment you would probably get away with depositing an unfree R-free set ;)
Cheers, Robbie
Cheers, Ed. -- Oh, suddenly throwing a giraffe into a volcano to make water is crazy? Julian, King of Lemurs
Re: [ccp4bb] Superpose, SSM
One would assume that Windows software would read DOS/Windows type text files... Open the file in Wordpad. Unlike Notepad, it is able to work with Windows and Unix type text files. If you edit something and save the file, it will be in Windows style. If Superpose stops on that, it should really be updated. I'm sure that there are Windows versions of the programs Unix2dos and Dos2unix, which were the programs to use to convert one type to the other. You can also use Word to search and replace the linefeeds. Good luck with this very retro problem. Cheers, Robbie Date: Mon, 26 Sep 2011 17:07:50 -0600 From: xtald...@gmail.com Subject: Re: [ccp4bb] Superpose, SSM To: CCP4BB@JISCMAIL.AC.UK I think something in your workflow is inserting dos line feeds (\n\r or \r\n, I can't remember which). If I have guessed correctly, you want to remove those \rs before proceeding (or never let them get in there in the first place). You claim to open it with MS something, which would insert dos line feeds as part of Operation Vendor Lock. Did you happen to save it, perhaps by habit? That would do the trick. It might even do something insidious and insert those linefeeds without your purposefully saving the document. Your best bet to fix the file after corruption is vim (used to be that crystallographers could use real text editors). The command in vim is: :%s/\r//g You might find some third party utility that fixes linefeeds for $30.00 somewhere, if vim is too retro. Otherwise, you may want to start over, skip checking it out in MS something, and go straight to superpose. James On Sep 26, 2011, at 2:51 PM, Matthias Zebisch wrote: Hi again, Thanks for your quick replies but I think I did not make myself clear. Here is what I'm doing:
1) superpose proteinA.pdb onto proteinB.pdb : works, but gives out proteinA_lsq1.pdb with extra empty lines (not the anisou lines ;o) )
2) superpose proteinA_lsq1.pdb onto proteinC.pdb : doesn't work because proteinA_lsq1.pdb cannot be read
Any ideas?
Even if there is some compatibility issue between CCP4 and windows, I guess superpose should be able to read its own files, shouldnt it? Thanks, Matthias On 9/26/2011 9:13 PM, Jacob Keller wrote: I vaguely recall notepad doing something wacky with files in certain cases...why don't you get the excellent text editor NoteTab Light [sic] (I use it all the time--free and works great), then take a look at your files and see whether MS notepad altered the files. JPK On Mon, Sep 26, 2011 at 2:42 PM, Matthias Zebisch matthias.zebi...@bbz.uni-leipzig.de wrote: Dear CCP4 users, I am using the ccp4i version 6.2.0 under windows 7. I've come across a problem with superpose. The outputfile appears to have additional line feeds (see picture) which, however are not seen in the windows notepad. The structure can also be opened in coot and pymol. However, it is not possible to use it within CCP4, eg. for a subsequent superposition. Is this problem known to anybody and is there a simple workaround available? I need to compare hell of a lot of relative domain orientations... I did not have this problem on a second computer with ccp4 6.1.2. When I updated to 6.2.0, the situation was as described above. Any help will be highly appreciated, Thanks, Matthias
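For those without vim or dos2unix at hand, the same clean-up can be sketched in Python. This is a minimal example that mirrors the vim command quoted in this thread by deleting every carriage return (DOS/Windows line endings are \r\n, Unix ones are \n). Whether stray carriage returns are really what breaks the Superpose output is the thread's working assumption, not something verified here; the function names and file names are hypothetical.

```python
def strip_carriage_returns(data: bytes) -> bytes:
    """Remove every carriage return, like the vim command :%s/\\r//g."""
    return data.replace(b"\r", b"")


def convert_file(source_path: str, target_path: str) -> None:
    """Write a copy of source_path without carriage returns to target_path."""
    # Binary mode keeps Python from translating line endings on its own.
    with open(source_path, "rb") as handle:
        data = handle.read()
    with open(target_path, "wb") as handle:
        handle.write(strip_carriage_returns(data))
```

A call such as `convert_file("proteinA_lsq1.pdb", "proteinA_lsq1_unix.pdb")` would then give a file to use for the next superposition.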
Re: [ccp4bb] number of cycles in refmac
Dear Protein Chemistry (?), When R and R-free drift apart you are probably refining with suboptimal weights. If anything, it proves you still have work to do. At convergence R and R-free do not really change anymore, so neither does the difference. If you have already done a lot of rebuilding and refinement, 20 cycles is usually enough (but more cycles shouldn't hurt). Cheers, Robbie Date: Fri, 26 Aug 2011 20:29:59 +0530 From: proteinchemistr...@gmail.com Subject: Re: [ccp4bb] number of cycles in refmac To: CCP4BB@JISCMAIL.AC.UK Dear Dr Ian, from your argument I could not understand how many cycles to refine before submitting the coordinates to the PDB. What is the upper limit: 100, a thousand or a million? According to my understanding, it is more logical to stop the refinement when over-refinement is taking place (when R and Rfree are going in opposite directions and the LLG has stabilized). AR
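The convergence criterion described here, R and R-free no longer really changing, can be written down as a simple check on the per-cycle values. The sketch below is one possible reading: the window of five cycles and the tolerance of 0.001 are arbitrary choices for illustration, and the function name is hypothetical. The per-cycle values would have to be taken from the refinement log, which is not shown.

```python
def has_converged(r_work, r_free, window=5, tolerance=0.001):
    """Return True if both R and R-free varied by no more than `tolerance`
    over the last `window` cycles."""
    if len(r_work) != len(r_free):
        raise ValueError("need one R and one R-free value per cycle")
    if len(r_work) < window:
        return False

    def spread(values):
        tail = values[-window:]
        return max(tail) - min(tail)

    return spread(r_work) <= tolerance and spread(r_free) <= tolerance
```

If the check fails after the planned number of cycles, run more cycles rather than stopping at an arbitrary count.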
Re: [ccp4bb] Another paper structure retracted
Hi Dale, The data looks fine but the refinement for 3kj5, 2qns's 'improved' model, is still pretty poor. Looking at the EDS maps for this entry there is some (model bias) density for the ligand but, it is clearly not there. The PDB_REDO optimization (http://www.cmbi.ru.nl/pdb_redo/kj/3kj5/index.html) improves the model substantially (with the exception of the Ramachandran plot) and removes pretty much all the density of the ligand. I'd say it's clear that 3kj5 was poorly refined, and that proper refinement would have made it very clear that the ligand isn't there. I just hope that nobody cited the structural aspect of the paper. Cheers, Robbie -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Dale Tronrud Sent: Thursday, August 11, 2011 18:13 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Another paper structure retracted It is my understanding that there is no retracted category in the wwPDB. Models are obsoleted, usually with a replacement but sometimes without. I don't know of a way to distinguish between those models obsoleted for gross error and those simply replaced by one of higher quality. Surely this is a continuum. In the current case, I've seen nothing to suggest that the diffraction data is bad, although I've not looked at it hard, so the model's creator could run a proper refinement and submit a replacement. This model is so poor that I can't be sure, but I suspect that that model would be an apo form and probably of little interest. This is a good case for an obsolete without replacement. I should note that, while the paper has been retracted, I see no indication that the entry 2QNS has been obsoleted. Perhaps that update is still in the pipeline. Dale Tronrud On 08/11/11 06:46, Garib N Murshudov wrote: Dear all Does anybody have the list (pdb as well as structure factors) of all retracted structures? 
regards Garib On 10 Aug 2011, at 22:01, David Schuller wrote: Time to fuel up the gossip engines for the approaching weekend: http://www.sciencedirect.com/science/article/pii/S096921260800186X RETRACTED: Structure of the Parathyroid Hormone Receptor C Terminus Bound to the G-Protein Dimer Gβ1γ2. Structure, Volume 16, Issue 7, 9 July 2008, Pages 1086-1094. Structure 2QNS withdrawn. -- All Things Serve the Beam David J. Schuller modern man in a post-modern world MacCHESS, Cornell University schul...@cornell.edu Garib N Murshudov Structural Studies Division MRC Laboratory of Molecular Biology Hills Road Cambridge CB2 0QH UK Email: ga...@mrc-lmb.cam.ac.uk Web http://www.mrc-lmb.cam.ac.uk
Re: [ccp4bb] **Possible spam**How to convert CNS PDB format to the most current version of the PDB format?
Dear Li(?), The MolProbity server fixes the atom naming before the actual validation. You can use that. The ATOM/HETATM conversion is not needed; the PDB will do that for you when you deposit your structure model. If you really need it now, I guess it's easy enough to do with your favourite scripting language. Cheers, Robbie Date: Tue, 9 Aug 2011 16:48:23 +0900 From: lihua...@naver.com Subject: [ccp4bb] **Possible spam**How to convert CNS PDB format to the most current version of the PDB format? To: CCP4BB@JISCMAIL.AC.UK Dear all, How do I convert the CNS PDB format to the most current version of the PDB format? E.g. HN (excluding the N-terminus) in CNS output files should be changed to H. And I don't know why CNS puts the N-terminal PCA in the ATOM category. How can I change it to HETATM? Thanks! College of Pharmacy Pusan National University Busan, Korea
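A script for the two changes asked about here could look roughly like the sketch below. It assumes standard fixed-column PDB records (record name in columns 1-6, atom name in columns 13-16, residue name in columns 18-20) and that CNS writes the backbone amide hydrogen as HN; other naming differences between CNS and the current PDB format are not handled. The function names and file names are hypothetical.

```python
def fix_cns_record(line: str) -> str:
    """Rename HN to H and turn PCA ATOM records into HETATM records."""
    if not line.startswith(("ATOM  ", "HETATM")):
        return line
    atom_name = line[12:16]
    residue_name = line[17:20]
    if atom_name.strip() == "HN":
        # Four-character atom name field with H in the second position.
        line = line[:12] + " H  " + line[16:]
    if residue_name == "PCA" and line.startswith("ATOM  "):
        line = "HETATM" + line[6:]
    return line


def fix_cns_file(source_path: str, target_path: str) -> None:
    """Apply fix_cns_record to every line of a PDB file."""
    with open(source_path) as source, open(target_path, "w") as target:
        for line in source:
            target.write(fix_cns_record(line))
```

Run the result through a validation server afterwards; that also catches any naming differences this sketch does not cover.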
Re: [ccp4bb] Sodium ion vs. Water
The server seems to be up again. Give it another try. Cheers, Robbie Date: Tue, 2 Aug 2011 12:43:43 -0700 From: bourn...@yahoo.com Subject: Re: [ccp4bb] Sodium ion vs. Water To: robbie_joos...@hotmail.com; CCP4BB@JISCMAIL.AC.UK Hi Robbie and all- The link from that page is dead...and an inquiry to the webmaster bounced. Anyone know where WASP is now? Thanks Christina Oklahoma State University From: Robbie Joosten robbie_joos...@hotmail.com To: CCP4BB@JISCMAIL.AC.UK Sent: Tuesday, August 2, 2011 2:20 PM Subject: Re: [ccp4bb] Sodium ion vs. Water Dear Young-Jin, If you model it as water, you can use WASP. It's an old program but still accessible here: http://xray.bmc.uu.se/cgi-bin/gerard/rama_server.pl Cheers, Robbie Date: Tue, 2 Aug 2011 14:24:09 -0400 From: yj...@brandeis.edu Subject: [ccp4bb] Sodium ion vs. Water To: CCP4BB@JISCMAIL.AC.UK Dear CCP4 experts, I would like to ask if there is a clear way to distinguish Na+ and HOH molecules in the electron density map. The table I have suggests that the Na to O distance lies between 2.35 and 2.45 A, which makes it very hard to tell these two candidates apart. Na+ may have several coordination numbers though (5, 6). This started from the question of whether the site where the divalent metal is located can instead be occupied by water or sodium. I would appreciate any valuable suggestions. Best, Young-Jin
Re: [ccp4bb] Sodium ion vs. Water
Dear Young-Jin, If you model it as water, you can use WASP. It's an old program but still accessible here: http://xray.bmc.uu.se/cgi-bin/gerard/rama_server.pl Cheers, Robbie Date: Tue, 2 Aug 2011 14:24:09 -0400 From: yj...@brandeis.edu Subject: [ccp4bb] Sodium ion vs. Water To: CCP4BB@JISCMAIL.AC.UK Dear CCP4 experts, I would like to ask if there is a clear way to distinguish Na+ and HOH molecules in the electron density map. The table I have suggests that the Na to O distance lies between 2.35 and 2.45 A, which makes it very hard to tell these two candidates apart. Na+ may have several coordination numbers though (5, 6). This started from the question of whether the site where the divalent metal is located can instead be occupied by water or sodium. I would appreciate any valuable suggestions. Best, Young-Jin
Re: [ccp4bb] research paper
Jung-Hoon, This is a so-called WaReZ request, which could get you banned from a lot of web forums. Of course, we are all guilty of it on some occasions. The best way to get an article is to ask the authors; they are allowed to give away free copies (depending on the journal I guess). Hooray for authors who pay open access fees. Cheers, Robbie Date: Thu, 28 Jul 2011 12:23:10 -0400 From: f...@bernstein-plus-sons.com Subject: Re: [ccp4bb] research paper To: CCP4BB@JISCMAIL.AC.UK The article is available for purchase for $40. Journals cannot survive without funding which can come from many sources - subscriptions, author payment to make the article open-access, etc. But asking someone to provide a 'free' copy without Acta's permission is tantamount to theft. Frances Bernstein Bernstein + Sons, Information Systems Consultants, 5 Brewster Lane, Bellport, NY 11713-2803 Frances C. Bernstein f...@bernstein-plus-sons.com 1-631-286-1339 FAX: 1-631-286-1999 On Thu, 28 Jul 2011, Ed Pozharski wrote: On Thu, 2011-07-28 at 14:35 +, Jung-Hoon Lee wrote: Acta Cryst D63 (2007), 550-554. I can't believe Cornell has no access to Acta D. -- Hurry up before we all come back to our senses! Julian, King of Lemurs
Re: [ccp4bb] Straw poll: polysaccharide building?
Hi Kim and Kevin, Even then you can have chirality inversions during real-space refinement, which would destroy the SWEET input model from. There is no substitute for common sense (and validation) here. That said, Kevin, something to autobuild carbohydrates (given a sequence) would be awesome. I'd use it a lot. Just don't make a WMD (weapon of model destruction). Cheers, Robbie Date: Tue, 26 Jul 2011 11:06:03 +0100 From: henr...@ebi.ac.uk Subject: Re: [ccp4bb] Straw poll: polysaccharide building? To: CCP4BB@JISCMAIL.AC.UK Yes but it is easier to take the sweet model for the required sequence and fit that to density rather than do it residue by residue which will lead to glycan structures unknown to the source kim Dear Kim, I asume that Kevin plans to build in electron density maps. As far as I can see Sweet will produce a model unhindered by experimental data. Herman -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Kim Henrick Sent: Tuesday, July 26, 2011 11:44 AM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Straw poll: polysaccharide building? 
Why not use http://glycosciences.de/modeling/sweet2/doc/index.php which works perfectly and would save the duplication of effort? Cut and paste
[text diagram of an example SWEET2 input: a branched N-glycan linked to Asn, built from a-D-Neup5Ac, b-D-Galp, b-D-GlcpNAc, a-L-Fucp, a-D-Manp and b-D-Manp residues; the branch layout was lost in extraction]
and click, and your output is as attached. Apart from the poor excuse for a PDB file, it has the model with glycosidic torsion angles as expected, as in glycomapsdb. Straw poll: Are you interested in software to autobuild polysaccarides? Kevin p.s. I expect I'll have to spend at least a year working on the problem before I spell polysaccharide consistently.
Re: [ccp4bb] Creating a non-minimal mmCIF dictionary for DNA?
Hi Brittney, DNA is pretty standard so the restraints should be in the dictionary. Perhaps the DNA in your model has non-standard residue names (PDBv2). Are your bases called DT, DA, etc? Do your atom names have * or '? Cheers, Robbie Date: Tue, 26 Jul 2011 12:45:08 -0400 From: bmanv...@umaryland.edu Subject: [ccp4bb] Creating a non-minimal mmCIF dictionary for DNA? To: CCP4BB@JISCMAIL.AC.UK I am trying to refine a protein-DNA structure, and I am having difficulties refining the DNA using COOT. I used the ccp4i Phaser program to molecular replace the protein and DNA simultaneously (there are solved crystal structures of the free protein and a similar DNA bound to another protein), but I am only able to manually refine the protein in COOT. If I try to refine the DNA, a window pops up with the following statement: No Restraints Found! Non-existent or minimal description of restrained residues. How do I create a non-minimal mmCIF dictionary for just the DNA? I am relatively new to crystallography and would appreciate any guidance. Brittney
Re: [ccp4bb] unusual sighting of a crystal structure
Hi Artem, Thanks for that nice example of a protein structure used to pimp a movie. Ribbon representations are always the scariest. Cheers, Robbie Date: Sat, 16 Jul 2011 10:57:21 -0500 From: artem.evdoki...@gmail.com Subject: [ccp4bb] unusual sighting of a crystal structure To: CCP4BB@JISCMAIL.AC.UK Fellow structural biologists, I just caught a brief glimpse of a crystal structure (looks like an Fv complex or maybe an IG-like receptor ectodomain complex?) in the trailer for the upcoming 'scary virus' movie Contagion and thought you'd want to share the amusement. Sorry about the 300K attachment :) Artem
Re: [ccp4bb] output individual redundancies
Hi Ed, I was recently looking for that value myself, but couldn't find it. I suppose (at some point) it may be useful information to deposit. If something is a mean value, it is nice to know how many individual values were used to construct that mean. Unfortunately, there doesn't seem to be a cif token for that. Cheers, Robbie Date: Fri, 15 Jul 2011 09:26:39 +0100 From: p...@mrc-lmb.cam.ac.uk Subject: Re: [ccp4bb] output individual redundancies To: CCP4BB@JISCMAIL.AC.UK No M/ISYM is different it's the symmetry number plus a full or partial flag. Ed. You could count them from the unmerged output as you say, or I could make you a special version of SCALA or Aimless maybe next week Phil Sent from my iPhone On 14 Jul 2011, at 23:15, Ethan Merritt merr...@u.washington.edu wrote: On Thursday, July 14, 2011 02:55:26 pm Ed Pozharski wrote: I am looking for a way to output redundancy per individual reflection, preferably for scala but if that is not possible then maybe for scalepack. If you read the unmerged file from scalepack into ccp4 using combat, it creates a data column with label M/ISYM that I think is what you are asking for. You can use the Import Unmerged Data (Combat) tab in the ccp4i GUI. Ethan From my (admittedly quick) look at the scala manual it seems that I can use something like UNMERGED output option to exclude outliers and then would need to write a bit of code to calculate the redundancies. But I hope that I missed something and there is a secret keyword that would add redundancies to the merged mtz file. Cheers, Ed. -- Ethan A Merritt Biomolecular Structure Center, K-428 Health Sciences Bldg University of Washington, Seattle 98195-7742
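The "bit of code" needed to count redundancies from unmerged output can be very small. The sketch below assumes the observations have already been read into (h, k, l) tuples, that the indices are already reduced to the asymmetric unit, and that rejected outliers have been left out; reading the unmerged file itself is not shown, and the function name is hypothetical.

```python
from collections import Counter


def count_observations(unmerged_hkls):
    """Return a Counter mapping each unique (h, k, l) to the number of
    times it was observed."""
    return Counter(tuple(hkl) for hkl in unmerged_hkls)


if __name__ == "__main__":
    # Hypothetical observations, indices already in the asymmetric unit.
    observations = [(1, 0, 0), (1, 0, 0), (1, 1, 0), (1, 0, 0), (2, 1, 3)]
    for hkl, n in sorted(count_observations(observations).items()):
        print(hkl, n)
```

The counts could then be written next to the merged intensities in whatever format is convenient.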
Re: [ccp4bb] large R-Rfree difference in final structure
Hi Careina, Assuming you don't suffer from a very poor data parameter ratio that would lead to such a large R-free/R, you need to improve your refinement. If you have NCS you should use local NCS restraints. You could also try jelly-body restraints, although they may not work at your resolution. Cheers, Robbie Date: Wed, 13 Jul 2011 08:38:38 -0700 From: careinaedgo...@yahoo.com Subject: [ccp4bb] large R-Rfree difference in final structure To: CCP4BB@JISCMAIL.AC.UK Dear ccp4 bulletin board I just have a slight concern regarding my Rwork Rfree difference. I have a structure that I have solved. I am reasonably content that it is complete because it has refined well, it no longer has bad geometries and contacts and all the rotamers, ramachandra, bond lengths etc are good. It gives favourable scores on molprobity and procheck. My only concern is the R factor difference. The resolution of the structure is 2.3A. The R factor is 0.24 after refinement but the Rfree is 0.33 which seems to me to be rather high. Should I be concerned? During refinement Rfree only drops from about 0.36 to 0.33 while the R factor drops from 0.31 to 0.24.. I have removed automatic weighting in refmac in order to constrain my bond lengths and angles during a couple of rounds of refinement. This did not have any effect on the R factors, however. I am fairly content that the space group I have chosen is correct so I am not sure what else could cause the big difference in R factors? There is no twinning. Can I be satisfied that my structure is correct despite the high R free or should I be doing other checks/ trying other things before I can submit this structure? Thank you for any help Careina
Re: [ccp4bb] low resolution refinement
When in doubt, try both. In my personal experience, adding hydrogens always works. Especially at low resolution. But don't take my word for it, experiment a little. Cheers, Robbie Date: Sun, 10 Jul 2011 16:01:59 +0800 From: caiq...@gmail.com Subject: Re: [ccp4bb] low resolution refinement To: CCP4BB@JISCMAIL.AC.UK Hi, Thank you very much. In the example5 of this page http://www.ysbl.york.ac.uk/~garib/refmac/docs/usage/examples.html#exam5, It seems that for 3A dataset, MAKE HYDRogens No. Is it mean that the hydrogen just usefull for high resolution data? 2011/7/10 Robbie Joosten robbie_joos...@hotmail.com Hi Qixu, In CCP4i the option is in the refinement parameters: Use hydrogen atoms: [build all hydrogens] and [] output to coordinate file What is does is build all hydrogens at the expected coordinates and constrain them in refinement (i.e. adding hydrogens does not add extra parameters to the model). The effect on explaining your experminetal data is typically small, but the hydrogens help with the VdW restraints. In effect they reduce the number of bumps and improve your torsion angles. You can use a reference structure to generate external restraints: http://www.ysbl.york.ac.uk/~garib/refmac/data/refmac_news.html#External I hope someone else on the BB can explain how. I think it is also explained in the talk and tutorials of the Refmac website. HTH, Robbie From: caiq...@gmail.com Date: Sun, 10 Jul 2011 00:44:25 +0800 Subject: Re: [ccp4bb] low resolution refinement To: robbie_joos...@hotmail.com CC: CCP4BB@jiscmail.ac.uk Hi, Thank you for your suggestion. Could you tell me what is riding hydrogens? And it seems there is not reference model function in refmac5.6? 2011/7/9 Robbie Joosten robbie_joos...@hotmail.commailto:robbie_joos...@hotmail.com Dear Qixu, refamac 5.6 works well at these resolutions. You can add commands to your refinement in CCP4i by using the 'Run and view command script' (or something like that) option and just typing in the extra commands. 
Jelly-body has worked very well for me (although I use tigheter restraints than the default). Also local NCS works well (provided you have NCS). I never used reference structures, but I heared good things about it. Don't forget to use riding hydrogens, for some reason it is not the deafault. Perhaps you should also switch of the automatic X-ray weighting in favour of optimizing the matrix weight yourself (start with 0.05 and compare refinements for higher and lower values). HTH, Robbie Date: Sat, 9 Jul 2011 16:59:29 +0800 From: caiq...@gmail.commailto:caiq...@gmail.com Subject: [ccp4bb] low resolution refinement To: CCP4BB@JISCMAIL.AC.UKmailto:CCP4BB@JISCMAIL.AC.UK Dear all, Recently, I refine two low resolution structures in refmac 5.5. Their resolutions are 3A and 3.5A respectively. For 3A structure, after MR by phaser and rigidbody refinementrestraint refinement by refmac5.5, I got R factor 25% and R free 35%. And then each time, after my model building in coot and restraint refinement by refmac 5.5, the R factor stays 25%, but R free increases to 38%, even 39%. For 3.5A structure, the R factor stays 27%, but R free increases from 37% to 42% after my slightly model building in coot. Could you help me to find the reason? Maybe the reason is the overfit of the structure? I found that new version of refmac 5.6 has many new features for low resolution refinement, such as jelly boy, secondary structure restraints. But I don't know how to use these new features in old version ccp4i (6.1.13)? I also used phenix.refine with the reference model ( I have high resolution model for one domain of the low resolution protein) and secondary structure restraints, but it seams the same. Any suggestion? BTW, is that simulator annealing not suitable for low resolution structure? I used the simulator annealing method of CNS and phenix.refine, but the geometry of the structure is always destroyed seriously. Could you help me? Thank you very much!
Re: [ccp4bb] low resolution refinement
Dear Qixu, Refmac 5.6 works well at these resolutions. You can add commands to your refinement in CCP4i by using the 'Run and view command script' (or something like that) option and just typing in the extra commands. Jelly-body has worked very well for me (although I use tighter restraints than the default). Also local NCS works well (provided you have NCS). I never used reference structures, but I heard good things about them. Don't forget to use riding hydrogens; for some reason it is not the default. Perhaps you should also switch off the automatic X-ray weighting in favour of optimizing the matrix weight yourself (start with 0.05 and compare refinements for higher and lower values). HTH, Robbie Date: Sat, 9 Jul 2011 16:59:29 +0800 From: caiq...@gmail.com Subject: [ccp4bb] low resolution refinement To: CCP4BB@JISCMAIL.AC.UK Dear all, Recently, I refined two low resolution structures in refmac 5.5. Their resolutions are 3A and 3.5A respectively. For the 3A structure, after MR by phaser and rigid-body refinement and restrained refinement by refmac 5.5, I got R factor 25% and R free 35%. And then each time, after my model building in coot and restrained refinement by refmac 5.5, the R factor stays 25%, but R free increases to 38%, even 39%. For the 3.5A structure, the R factor stays 27%, but R free increases from 37% to 42% after my slight model building in coot. Could you help me to find the reason? Maybe the reason is overfitting of the structure? I found that the new version of refmac 5.6 has many new features for low resolution refinement, such as jelly body and secondary structure restraints. But I don't know how to use these new features in the old version of ccp4i (6.1.13)? I also used phenix.refine with the reference model (I have a high resolution model for one domain of the low resolution protein) and secondary structure restraints, but it seems the same. Any suggestion? BTW, is simulated annealing not suitable for low resolution structures?
I used the simulated annealing method of CNS and phenix.refine, but the geometry of the structure is always destroyed seriously. Could you help me? Thank you very much!
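The advice above (riding hydrogens, local NCS, jelly-body restraints, and a hand-picked matrix weight that is compared against higher and lower values) can be sketched as a small Python helper that writes out the extra keyword lines for each run. This is only a sketch under stated assumptions: the keyword spellings are recalled from the Refmac documentation and should be checked against your Refmac version, and the function names, the jelly-body sigma of 0.01, and the cycle count are illustrative placeholders, not values taken from the thread.

```python
def refmac_keywords(matrix_weight, jelly_sigma=0.01, cycles=20):
    """Build keyword lines for one restrained refinement run.

    Keyword spellings are assumptions recalled from the Refmac documentation;
    jelly_sigma and cycles are placeholder values, not recommendations.
    """
    return [
        "make hydrogen all",                    # riding hydrogens
        "ncsr local",                           # local NCS restraints
        f"ridge distance sigma {jelly_sigma}",  # jelly-body restraints
        f"weight matrix {matrix_weight}",       # fixed matrix weight instead of auto weighting
        f"ncyc {cycles}",                       # number of refinement cycles
        "end",
    ]


def weight_scan(start=0.05, factors=(0.5, 1.0, 2.0)):
    """Return {matrix_weight: keyword_lines} for weights around the start value."""
    weights = [round(start * factor, 4) for factor in factors]
    return {weight: refmac_keywords(weight) for weight in weights}


if __name__ == "__main__":
    for weight, lines in weight_scan().items():
        print(f"# matrix weight {weight}")
        print("\n".join(lines))
```

Each keyword block would be pasted into the 'Run and view command script' window (or a command script) for a separate run, and the runs compared on R-free and geometry.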
Re: [ccp4bb] low resolution refinement
Hi Qixu, In CCP4i the option is in the refinement parameters: Use hydrogen atoms: [build all hydrogens] and [] output to coordinate file. What it does is build all hydrogens at the expected coordinates and constrain them in refinement (i.e. adding hydrogens does not add extra parameters to the model). The effect on explaining your experimental data is typically small, but the hydrogens help with the VdW restraints. In effect they reduce the number of bumps and improve your torsion angles. You can use a reference structure to generate external restraints: http://www.ysbl.york.ac.uk/~garib/refmac/data/refmac_news.html#External I hope someone else on the BB can explain how. I think it is also explained in the talk and tutorials on the Refmac website. HTH, Robbie From: caiq...@gmail.com Date: Sun, 10 Jul 2011 00:44:25 +0800 Subject: Re: [ccp4bb] low resolution refinement To: robbie_joos...@hotmail.com CC: CCP4BB@jiscmail.ac.uk Hi, Thank you for your suggestion. Could you tell me what riding hydrogens are? And it seems there is no reference model function in refmac 5.6?
Re: [ccp4bb] Waters in ADIT
Dear Petr, Did you try WHAT_CHECK? It has a number of tests for water and will take indirect interactions with the macromolecule into account. Cheers, Robbie Date: Wed, 22 Jun 2011 16:01:45 +0200 From: arnaud.goepf...@unibas.ch Subject: Re: [ccp4bb] Waters in ADIT To: CCP4BB@JISCMAIL.AC.UK Dear Petr, I guess ADIT only looks for interactions between water molecules and the protein and does not take into account the interactions between water molecules. So if a water molecule interacts with another water molecule but not with the protein, ADIT will give you this error report even though the water is well coordinated. On Jun 22, 2011, at 1:18 PM, Petr Kolenko wrote: Dear colleagues, I want to deposit one structure, but ADIT reports tens more waters that are further than 3.5 Å away from macromolecule atoms. I inspected about half of them manually, but all of them are OK. I have observed this incorrect behavior of ADIT also in one previous structure for deposition, but just ignored three or four reports, because I knew I was doing the right thing. Does anyone know how to solve this problem? I have already tried: - changing HETATM to ATOM - assigning a different chain ID for waters to have the same ID as the protein chain - renumbering of residues (not in this case, but the previous one) I do not have to solve this problem, but I do not want to have such a strange Validation Report from ADIT. Many thanks for any idea. Petr PS: Not important, but refined with REFMAC5 at medium resolution. -- Petr Kolenko kole...@imc.cas.cz http://kolda.webz.cz
Re: [ccp4bb] Help! low resolution protein-DNA complex
Dear Xun,
I have a 3.2A dataset for a protein-DNA complex. The protein is a homodimer, and the DNA is almost palindromic (except one base pair in the middle and two or three base pairs at both ends). It is my first time solving structures, and unfortunately the resolution is low. Nobody in our lab has used ccp4 or phenix, so I am really frustrated as a second year student.
Your frustration is understandable. It is somewhat of an expectation in academia that your advisor will either help you directly or, if she/he is not familiar with the methodology you are forced to use, will find someone to help you. The questions you ask surely may be answered by someone in your department. IMHO, a second year student should not be left alone to battle his first structure which happens to be a 3.2A protein/DNA complex.
Indeed, this is just asking for problems. It's a good call that you asked for help. Perhaps your supervisor can arrange for you to be embedded in a crystallography lab for a while. That should give you easy access to people with experience.
I mainly used ccp4. So far, the best R/Rfree I got is 0.27/0.34.
and that is not bad given the resolution
You are heading the right way. You should be able to close the R/R-free gap a bit more.
I went to the crystallography meeting, and people suggested that I rely more on geometry. I remember I got a DNA restraints file and a refmac script from someone on this mailing list, and that really helped (otherwise the DNA base pairing will be weird). Can someone tell me how to restrain the protein (helix)?
one way of doing it would be to restrain the hydrogen bonds that stabilize the helix. It is not advisable at higher resolution, but sounds alright at 3.2A. I once used a restraint file to keep DNA sane by forcing Watson-Crick pairing; the helical restraints would work pretty much in the same way. Look at the structure of the restraint file that you have and modify it to include the helix-stabilizing hydrogen bonds.
I like real-space refining everything in Coot with tight helical restraints. You may need to change the default restraint weight matrix (lower numbers give tighter restraints). The options are under the R/RC button.
People also suggested that I include NCS and TLS in the refinement, but I don't know how to. For NCS, should I define a region that is the same in both monomers? Should I use tight or loose restraints? For TLS, I don't have a clue.
Yes and tight (at least at first). For TLS you may want to take a look at the TLSMD server. (Also, consider tighter restraints on B-factors). Otherwise, just define TLS for the whole thing, then protein and DNA separately, then individual monomers and whatever pieces of DNA common sense suggests would move together. Keep whatever combination gives you the lowest Rfree.
In Refmac you can use local NCS, which takes away the need to mess with NCS selections (which can be really difficult). Although it is not needed for Refmac, you should make sure that the same residues in different monomers have the same residue number. Be conservative with TLS (in the beginning). One group per chain sounds right. In the case of your DNA you can consider putting both chains in one group. Tight B-factor weights may be needed; you could also try one overall B factor. I personally only do that when TLS works well. Oh, and always use riding hydrogens in refinement. It helps a lot at low resolution, because of the VDW restraints. For that same reason you should not be too conservative with the sidechains (at least not for the ones in the core of the protein). Since you have only started building you should probably go through the entire structure a few times. After that, use structure validation tools frequently. WHAT_CHECK and Molprobity are must-use tools for that. Coot also has many useful features for validation. Good luck.
Cheers, Robbie Joosten
Re: [ccp4bb] Follow-up: non-waters among structured solvent atoms
For the record:
- UNK is for unknown residues only. That means that you know that you are looking at an amino acid, you just don't know which. You should assign element types. It used to be defined to CB (just like ALA), it now goes to CG. I don't see the point of this update.
- UNL is for unknown hetero compounds.
- UNX is for unknown solo atoms.
- DN is for unknown deoxynucleotides.
Cheers, Robbie
Date: Fri, 17 Jun 2011 12:22:50 +0100 From: twom...@globalphasing.com Subject: Re: [ccp4bb] Follow-up: non-waters among structured solvent atoms To: CCP4BB@JISCMAIL.AC.UK On 16 Jun 2011, at 17:19, Pavel Afonine wrote: Hi, On Thu, Jun 16, 2011 at 7:49 AM, Jan Dohnalek dohnalek...@gmail.com wrote: Modeling more UNKNOWN atoms might be the future for these cases? one needs to specify chemical element type in 77-78 position, otherwise these records are useless. But if you know the chemical element type then there's no point in calling it UNK. BUSTER uses the scattering factors for oxygen for modelling X, on the grounds that you'll have put in an X because it doesn't look enough unlike water to be obviously something else. Tom
Re: [ccp4bb] non-waters among structured solvent atoms
Hi Wolfram, This was an early study on the subject: http://www.ncbi.nlm.nih.gov/pubmed/8594192 The software is still accessible via the STAN server. Cheers, Robbie Date: Tue, 14 Jun 2011 17:51:21 -0400 From: wtem...@gmail.com Subject: [ccp4bb] non-waters among structured solvent atoms To: CCP4BB@JISCMAIL.AC.UK Dear colleagues, following a discussion in our lab, I have volunteered to dig out articles from the literature about erroneous assignments of non-water entities such as metal ions, halides in protein models. For example I have the faint recollection that data mining of the PDB for suspect water assemblies matching the geometry of coordinated cations has previously been described. But none of my google searches has turned up the references I was looking for. Could someone point me in the right direction, please? Many thanks, Wolfram Tempel
Re: [ccp4bb] refmac problem with anisotropic Us
Hi Ethan,
I also reset the temperature factors to 20 at the beginning of each refinement round. The refinement resolution is 75 to 1.8 A, and the space group is C2, if it matters.
I am virtually certain that refinement of individual anisotropic U^ij terms cannot be justified at 1.8A. Too many parameters, too few observations.
In some cases (high solvent) there may be enough reflections to give your atoms anisotropic B-factors. Look for instance at PDB entry 1tv4, it has 19.5 reflections per atom, 1rzh has even more. When in doubt, I use a Hamilton test to see if anisotropic Bs are acceptable. That said, of course your residual B-factors after TLS can be anisotropic, but TLS + anisotropic Bs don't seem right to me either. You might as well use anisotropic B-factors and then extract TLS tensors from them. That was the original point of TLS anyway. Cheers, Robbie
The problem comes when I take the output from such a run, model build in Coot, and feed it back in for the next round of refinement. As you can see from the summary table below (trimmed for brevity), the refinement blows up rather badly, making the R factors and the geometry substantially worse.
Ncyc  Rfact   Rfree   FOM    -LL      -LLfree  rmsBOND  zBOND  rmsANGL  zANGL  rmsCHIRAL
   0  0.3376  0.3576  0.556  115671.  6318.1   0.0      0.0    0.0      0.0    0.0
   5  0.2676  0.2921  0.680  109411.  6009.2   0.0      0.0    0.0      0.0    0.0
   9  0.2529  0.2785  0.703  108589.  5978.4   0.0      0.0    0.0      0.0    0.0
  10  0.2516  0.2787  0.710  108459.  5975.2   0.0170   0.829  1.534    0.731  0.097
  15  0.2503  0.2876  0.711  108711.  6042.1   0.0241   0.998  1.729    0.833  0.111
  20  0.2739  0.3152  0.672  110787.  6147.3   0.0268   1.103  1.837    0.885  0.119
  25  0.2949  0.3388  0.638  112219.  6215.4   0.0255   1.049  1.808    0.866  0.116
  30  0.3114  0.3575  0.609  113094.  6257.7   0.0243   1.006  1.794    0.854  0.114
This refinement also throws out an enormous number of warnings about adjacent atoms' B factors being substantially different. Most of these warnings appear to involve the autobuilt riding hydrogens and their adjacent heavy atoms. If I use pdbcur to strip out the ANISOU lines, but otherwise keep the file and refinement protocol unchanged, it goes along nicely:
Ncyc  Rfact   Rfree   FOM    -LL      -LLfree  rmsBOND  zBOND  rmsANGL  zANGL  rmsCHIRAL
   0  0.2912  0.3233  0.667  112220.  6151.2   0.0      0.0    0.0      0.0    0.0
   5  0.2476  0.2787  0.751  107853.  5934.7   0.0      0.0    0.0      0.0    0.0
   9  0.2468  0.2764  0.754  107501.  5915.6   0.0      0.0    0.0      0.0    0.0
  10  0.2469  0.2763  0.754  107480.  5914.2   0.0170   0.829  1.534    0.731  0.097
  15  0.1933  0.2387  0.810  101501.  5723.3   0.0238   0.994  1.791    0.868  0.118
  20  0.1849  0.2327  0.817  100446.  5694.7   0.0239   0.992  1.826    0.884  0.120
  25  0.1818  0.2316  0.821  100034.  5681.0   0.0223   0.925  1.775    0.855  0.114
  30  0.1804  0.2296  0.824  99820.   5669.4   0.0221   0.913  1.763    0.848  0.113
(Without TLS refinement, the final R and Rfree would be 0.1896 and 0.2503.) So, what's happening here? Does Refmac not like ANISOU lines in the input PDB file? I don't usually work with structures at a resolution high enough to warrant aniso B refinement, so I haven't encountered this before. Thanks for any advice, Matt
-- Ethan A Merritt Biomolecular Structure Center, K-428 Health Sciences Bldg University of Washington, Seattle 98195-7742
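The two summary tables quoted above can be compared programmatically. The following Python sketch copies the Ncyc, Rfact and Rfree columns from the message and flags a run whose final R-free ends up well above its best value. The function names and the 0.01 tolerance are assumptions made for illustration; they are not part of Refmac or of the original thread.

```python
# Ncyc, Rfact, Rfree copied from the two summary tables quoted in the message.
WITH_ANISOU = """
 0 0.3376 0.3576
 5 0.2676 0.2921
 9 0.2529 0.2785
10 0.2516 0.2787
15 0.2503 0.2876
20 0.2739 0.3152
25 0.2949 0.3388
30 0.3114 0.3575
"""

WITHOUT_ANISOU = """
 0 0.2912 0.3233
 5 0.2476 0.2787
 9 0.2468 0.2764
10 0.2469 0.2763
15 0.1933 0.2387
20 0.1849 0.2327
25 0.1818 0.2316
30 0.1804 0.2296
"""


def parse_rows(text):
    """Turn whitespace-separated 'Ncyc Rfact Rfree' lines into tuples."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        rows.append((int(fields[0]), float(fields[1]), float(fields[2])))
    return rows


def best_cycle(rows):
    """Return the (cycle, Rfact, Rfree) row with the lowest R-free."""
    return min(rows, key=lambda row: row[2])


def has_diverged(rows, tolerance=0.01):
    """True if the final R-free is more than `tolerance` above the best R-free."""
    return rows[-1][2] - best_cycle(rows)[2] > tolerance


if __name__ == "__main__":
    for label, text in (("with ANISOU", WITH_ANISOU), ("without ANISOU", WITHOUT_ANISOU)):
        rows = parse_rows(text)
        print(label, "best cycle:", best_cycle(rows)[0], "diverged:", has_diverged(rows))
```

For the run that kept the ANISOU records, the lowest R-free is reached at cycle 9 and the final value is far higher, which is the blow-up described in the message; the run without ANISOU records ends at its best R-free.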
Re: [ccp4bb] Zotero style
Hi Bjorn, Thank you for the file. I gave it a go, but it kept saying 'this.style is undefined' or Word used the next style in the list. It seems like a compatibility problem; Zotero has changed a lot since 2009. I tried some hacking, but that didn't work. Cheers, Robbie Date: Wed, 4 May 2011 12:06:32 -0700 From: bj...@msg.ucsf.edu Subject: Re: [ccp4bb] Zotero style To: CCP4BB@JISCMAIL.AC.UK I made one some time ago (attached). It's not perfect (missing definitions for e.g. books) but works well for article-references. -Bjørn -- Bjørn Panyella Pedersen Macromolecular Structure Group Dept. of Biochemistry and Biophysics University of California, San Francisco
Re: [ccp4bb] Insertion codes
Hi Ed,
Personally I don't care one way or the other, but it may be pointed out that if D25 is actually number 37 in a homologous protein, it should be D37. Just as acknowledgement of the (somewhat purist) point of view that the residue number should denote its linear distance from the N-terminus.
But which N-terminus should we use? The N-terminus of the protein, the one of the construct, or the N-terminus of what is ordered in the PDB file? And what about deletions, isn't it useful to have gaps in the residue numbering indicating a deletion? Getting proper residue numbering is difficult and there will always be exceptions. Dealing with all the different possible schemes is a nightmare. That is why residue numbering is always one of the first topics in structural bioinformatics. The PDB now seems to follow the numbering from UniProt, which makes things a lot clearer, but fusion proteins now lead to crazy jumps in the residue numbering, resulting in chains with numbers going from 100 to 1200 and back to 300. For many well-studied groups of proteins insertion codes help the biological interpretation of the structures. Unfortunately, insertion codes are surprisingly poorly supported by software that uses PDB files, especially outside crystallography (but even CCP4 software has some remaining problems). I hope this thread will at least increase awareness of the existence of insertion codes. It is very much needed... Cheers, Robbie
Cheers, Ed. -- Hurry up before we all come back to our senses! Julian, King of Lemurs
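Since the message above notes that insertion codes are often poorly handled by software, here is a minimal Python sketch of reading them correctly from a fixed-column PDB ATOM record. It relies on the PDB format's column layout (chain ID in column 22, residue number in columns 23-26, insertion code in column 27); the helper names and the example residue are made up for illustration.

```python
def format_atom(serial, name, resname, chain, resseq, icode, x, y, z,
                occupancy=1.0, bfactor=20.0, element="C"):
    """Write one fixed-column ATOM record (78 characters, altLoc left blank)."""
    return (
        f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain:1s}{resseq:4d}{icode:1s}   "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occupancy:6.2f}{bfactor:6.2f}          {element:>2s}"
    )


def residue_id(line):
    """Return (chain, residue number, insertion code) from an ATOM/HETATM line.

    The insertion code is an empty string when column 27 is blank, so
    residue 25 and residue 25A are kept apart instead of being merged.
    """
    chain = line[21]
    resseq = int(line[22:26])
    icode = line[26].strip()
    return chain, resseq, icode


if __name__ == "__main__":
    plain = format_atom(1, " CA ", "ASP", "A", 25, " ", 1.0, 2.0, 3.0)
    inserted = format_atom(2, " CA ", "GLY", "A", 25, "A", 4.0, 5.0, 6.0)
    print(residue_id(plain))
    print(residue_id(inserted))
```

Keeping the insertion code as part of the residue identifier, rather than parsing only the number, is the design choice that matters here. Residue order should be taken from the file itself, since sorting on number and insertion code does not reproduce the intended sequence order in every entry.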
[ccp4bb] Zotero style
Hi Everyone, Does anyone have a Zotero style template for Acta Cryst and the like that (s)he wishes to share? I cannot find it in the repository, but perhaps someone has made one for private use. Cheers, Robbie Joosten Biochemistry Netherlands Cancer Institute
Re: [ccp4bb] Zotero style
Hi Ian, Indeed the Word 2007 template is very good (I never got the 2003 version working well), but it becomes a problem when your co-authors are Mac users with old Word versions. Cheers, Robbie Date: Wed, 4 May 2011 12:18:59 +0100 Subject: Re: [ccp4bb] Zotero style From: ianj...@gmail.com To: robbie_joos...@hotmail.com CC: CCP4BB@jiscmail.ac.uk Hi Robbie My understanding is that the templates for Acta Cryst. are here: http://journals.iucr.org/services/wordstyle.html or here: http://journals.iucr.org/d/services/latexstyle.html At least that's what I've always used. Cheers -- Ian On Wed, May 4, 2011 at 8:05 AM, Robbie Joosten robbie_joos...@hotmail.com wrote: Hi Everyone, Does anyone have a Zotero style template for Acta Cryst and the like that (s)he wishes to share? I cannot find it in the repository, but perhaps someone has made one for private use. Cheers, Robbie Joosten Biochemistry Netherlands Cancer Institute
Re: [ccp4bb] Zotero style
Hi Darren, Thank you for the link. It may be a useful tool. Unfortunately, the site was buggy in IE9. It worked much better in FF4, but it stopped when I tried to generate the final style file. It turns out that you can also import an EndNote style into Zotero. I tried this for the .ens file on the Acta Cryst site but the import destroyed all the punctuation. Still it may provide a good starting point to create a final Zotero style. Cheers, Robbie From: h...@embl.fr Date: Wed, 4 May 2011 13:33:15 +0200 Subject: Re: [ccp4bb] Zotero style To: robbie_joos...@hotmail.com CC: CCP4BB@jiscmail.ac.uk You can use this http://www.somwhere.org/csl/ to build your style. Darren On 4 May 2011 09:05, Robbie Joosten robbie_joos...@hotmail.com wrote: Hi Everyone, Does anyone have a Zotero style template for Acta Cryst and the like that (s)he wishes to share? I cannot find it in the repository, but perhaps someone has made one for private use. Cheers, Robbie Joosten Biochemistry Netherlands Cancer Institute -- ** Dr. Darren Hart, Team Leader High Throughput Protein Lab Grenoble Outstation European Molecular Biology Laboratory (EMBL) ** www.embl.fr/research/unit/hart/index.html For funded access to ESPRIT construct screening via EU FP7 PCUBE: http://tinyurl.com/ydnrwg4 Email: h...@embl.fr Tel: +33 4 76 20 77 68 Fax: +33 4 76 20 71 99 Skype: hartdarren Postal address: EMBL, 6 rue Jules Horowitz, BP181, 38042 Grenoble, Cedex 9, France **
Re: [ccp4bb] anisotropy vs TLS
Dear Kenneth, IMO there is no resolution cut-off to decide to go from TLS to individual anisotropic Bs. I use the number of reflections per atom. You are refining 9 parameters per atom so you need quite a lot. When I have 18 ref/atom I switch to anisotropic. I try both isotropic and anisotropic Bs with 13.5 reflections per atom. You need good evidence that the anisotropic model is better than an isotropic model; looking at R-free is not good enough. When you add so many parameters R-free will drop anyway. Ethan Merritt discussed a good test for this at the CCP4 study weekend. If you use Refmac, I have a tool that uses that method to compare the logfiles from two models and helps decide which model is best. Combining TLS and anisotropic Bs is a bit over the top. You could use anisotropic Bs and then use TLSMD to extract the bulk movement. Cheers, Robbie
Date: Thu, 7 Apr 2011 19:39:39 -0500 From: satys...@wisc.edu Subject: [ccp4bb] anisotropy vs TLS To: CCP4BB@JISCMAIL.AC.UK peoples: I know that TLS is a group B factor for regions of proteins that are moving the same. It is used in low res structures. But at what resolution does one begin anisotropic, i.e. individual aniso for each atom, and leave TLS out. Or can one still use TLS to first compensate for large motions and then dampen down the individual atoms with aniso ADP? If both the aniso and TLS are used, how does a person interpret the results? What programs are there to see just what is large body motions and what is atoms. thanks -- Kenneth A. Satyshur, M.S.,Ph.D. Associate Scientist University of Wisconsin Madison, Wisconsin 53706 608-215-5207
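The reflections-per-atom argument above comes down to simple arithmetic, sketched here in Python. The 9 parameters per atom for an anisotropic model are taken from the message (x, y, z plus six U^ij terms); the 4 parameters for an isotropic model (x, y, z, B) and the function name are assumptions added for the comparison, and restraints, which add further observations, are ignored.

```python
def observations_per_parameter(n_reflections, n_atoms, anisotropic):
    """Ratio of reflections to refined parameters, ignoring restraints.

    Anisotropic atoms carry 9 parameters (x, y, z and six U^ij terms);
    isotropic atoms carry 4 (x, y, z and one B-factor).
    """
    params_per_atom = 9 if anisotropic else 4
    return n_reflections / (n_atoms * params_per_atom)


if __name__ == "__main__":
    # Hypothetical data set with 18 reflections per atom.
    n_reflections, n_atoms = 18000, 1000
    print("isotropic:  ", observations_per_parameter(n_reflections, n_atoms, False))
    print("anisotropic:", observations_per_parameter(n_reflections, n_atoms, True))
```

At 18 reflections per atom the anisotropic model still has two reflections for every parameter, while at lower ratios the margin shrinks quickly, which is why a formal test rather than R-free alone is recommended in the message.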