Re: [ccp4bb] add ligand with AceDRG
Indeed, as Diana points out: PDB's own components.cif defines LIG as:

  _chem_comp.id LIG
  _chem_comp.name "3-PYRIDIN-4-YL-2,4-DIHYDRO-INDENO[1,2-.C.]PYRAZOLE"
  _chem_comp.type NON-POLYMER
  _chem_comp.pdbx_type HETAIN
  _chem_comp.formula "C15 H11 N3"

So they probably should fix that. Also, that chem_comp.name seems to be associated with a variety of ligand IDs with different formulae, and it also turns up as a synonym of others. Things seem to be a little wayward in there.

Phil Jeffrey
Princeton

On 4/26/24 10:40 AM, Diana Tomchick wrote:
But I think that is a mistake; if you search for LIG in the PDB, it brings up a definite ligand that has that 3-letter code.
Diana
Sent from my iPhone

On Apr 26, 2024, at 8:04 AM, Deborah Harrus wrote:
Dear all, just to clarify, "LIG" is also a reserved code, so it's safe to use. See https://www.wwpdb.org/news/news?year=2023#656f4404d78e004e766a96c6
Kind regards,
Deborah Harrus
PDBe
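If you want to check locally what the PDB currently ships for a given three-letter code, a minimal Python sketch using gemmi (assuming you have a local copy of components.cif from the wwPDB; the file path is a placeholder):

  import gemmi

  doc = gemmi.cif.read('components.cif')   # each chem comp is its own data block
  block = doc.find_block('LIG')
  if block:
      for tag in ('_chem_comp.id', '_chem_comp.name',
                  '_chem_comp.type', '_chem_comp.formula'):
          print(tag, block.find_value(tag))
  else:
      print('no block for that code')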
Re: [ccp4bb] request for applications
:: I expect to have ~ $1e12 USD on current ledgers.

Presumably via the Bankman-Fried algorithm.

Phil

On 4/1/24 3:01 AM, James Holton wrote:
Hey Everyone, it may sound like an incredibly boring thing that there has never been a formal mathematical proof that finding the prime factors of very large numbers doesn't have a more efficient algorithm than simply trying every single one of them. Nevertheless, to this day, encryption keys and indeed blockchain-based cryptocurrencies hinge upon how computationally hard it is to find these large prime factors. And yet, no one has ever proven that there is not a more efficient way.
[snip]
Re: [ccp4bb] Fragile Crystals
Hello Morgan

In addition to the other good suggestions, I have a few observations of my own.

If your crystals crack without handling or adding anything to the drop, then they are extremely environment-sensitive. If that's the case, testing at room temperature will be problematic, because that tends to be somewhat stressful on the crystal either mechanically (ye olde capillary mount method) or via dehydration (loop mounts with the sleeve).

Growing in the presence of at least a little cryoprotectant, as per Vaheh, would be less stressful than multi-step processes like Tao-Hsin's advice, unless your crystals can re-anneal after stress. Mounting directly from the drop is probably essential, and mounting under oil is a good thing to try in addition - apart from anything else, oil on the drop slows down the environmental changes. MiTeGen mounts might be less stressful on some crystals than standard nylon loops if they are mechanically sensitive. Spending some time optimizing the mechanics of your freezing technique might help significantly in reducing the amount of time your crystal dehydrates while moving through air. (Jim Pflugrath's article is full of useful information: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4461322/ )

Small crystals often freeze more smoothly than large ones - even for robust crystals like tetragonal lysozyme. Try a lot of crystals - I've had projects where two different crystals in the same loop from the same drop showed radically different diffraction. I've also encountered several cases where the appearance of disorder varies within a crystal when using a microfocus synchrotron beamline (I mostly use FMX and AMX at NSLS-II).

Lastly, really cranky crystals ring a distant bell of something we encountered in the p19(INK4d)-Cdk6 structure back in the 1990s. I think it was Jie-Oh Lee who did the hard work on this, but in many instances crystals cracked in situ merely on opening the drop, and the fix was to add a cross-linker to the well, reseal the drop and wait for the cross-linker to diffuse: "The crystals were pretreated with glutaraldehyde (diffused into the drop from a reservoir of 30% glutaraldehyde) to reduce their tendency to crack and lose diffraction along b* and c*." https://www.nature.com/articles/26155#Sec9 Most crystals don't love being cross-linked, and I would call this a successful instance of a desperation maneuver.

Good luck.
Phil Jeffrey
Princeton

On 11/22/23 11:44 AM, Blake, Morgan Elizabeth wrote:
Hello, I am a PhD student working on a crystallography project to wrap up my dissertation research. I have purified a complex of two proteins, and I can consistently grow crystals in 10% PEG 3350, 0.2 M KSCN, 0.1 M BIS-TRIS propane pH 7.5. These crystals have sharp edges and can grow to a large size (greater than 0.5 mm), but the crystals seem to be very fragile. When we open the drops to harvest the crystals, we have little time to harvest them before they crack. When we move the crystals to a cryoprotectant, over time they start fracturing. We've tried using different percentages of glycerol, ethylene glycol, PEG 400, and oil for cryoprotectants with no success. Needless to say, the crystals do not diffract well, with spot patterns that look very streaky/mosaic, which I presume is due to the defects that we see in harvesting/handling. We have screened for alternate crystallization conditions, but we seem to get the same morphology in other conditions.
Does anyone have suggestions for additives we could use post-crystallization to help stabilize our crystals? Thanks for your advice!
Re: [ccp4bb] Cannot select any recommended SG for the protein BsAlaDH
1. Completeness is primarily an issue with using the right point group and crystal system, not the actual space group (e.g. in the primitive orthorhombic Laue class mmm, the space groups P222, P2221, P21212 and P212121 should all have essentially the same completeness).

2. If "refinalize" in CrysAlisPro doesn't let you choose the right point group and system, then you should process the data with another program. XDS, MOSFLM, DIALS, autoPROC etc. might work, and I have to believe they'd be better at scaling your data.

3. If you can export the unscaled data from CrysAlisPro, you might be able to feed it into POINTLESS and AIMLESS for scaling.

4. On the model front, go find an AlphaFold model - they have worked for me multiple times in molecular replacement so far.

Phil Jeffrey
Princeton

On 8/2/23 3:00 PM, CENGIZ KAAN FERAH wrote:
Hello, I'm trying to process the data I gathered from XRD for the protein BsAlaDH. Unfortunately, with the method I know of in CrysAlisPro, I cannot select the recommended space group for the protein. This results in the data not being complete. Still, I can get good unit cells and angles. Another problem is that this protein has no structure published in the PDB, and the homolog proteins do not have high similarities, so I cannot really find a suitable space group. Can someone give me a hand with this issue? Thank you.
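Point (1) is easy to see concretely - a small Python sketch using gemmi (method names as I recall them from the gemmi Python API; the grouping logic is mine, nothing CrysAlisPro-specific):

  import gemmi

  # Space groups in the same Laue class merge to essentially the same
  # completeness; list the primitive orthorhombic (mmm) ones:
  for sg in gemmi.spacegroup_table():
      if sg.laue_str() == 'mmm' and sg.hm.startswith('P '):
          print(sg.number, sg.hm)   # P 2 2 2, P 2 2 21, P 21 21 2, P 21 21 21, ...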
Re: [ccp4bb] CIF file problem
Hello Ning

CheckCIF checks small-molecule crystallographic CIF files - the dictionaries and the expectations on the contents are not the same as for mmCIF, although the underlying syntax is the same. I'm unaware of anything that does the equivalent of CheckCIF for macromolecular CIF files.

Phil Jeffrey
Princeton

On 4/20/23 4:02 PM, Ning Li wrote:
Hi all, does anybody know why I got this error message:
  Checking for embedded fcf data in CIF ...
  No extractable fcf data in found in CIF
as I uploaded the CIF file to https://checkcif.iucr.org/ for structure validation? The CIF file was directly from phenix.refine during structure refinement. Appreciate your help.
Ning
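The nearest scriptable thing to a "CheckCIF lite" for mmCIF is a syntax/bookkeeping check - a sketch with gemmi (this catches malformed CIF only, nothing like CheckCIF's chemistry and geometry tests):

  import sys
  import gemmi

  try:
      doc = gemmi.cif.read(sys.argv[1])   # raises on CIF syntax errors
      doc.check_for_missing_values()      # raises if any tag lacks a value
      doc.check_for_duplicates()          # raises on duplicated tags/frames
      print('syntax OK, blocks:', [b.name for b in doc])
  except Exception as err:
      print('problem:', err)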
Re: [ccp4bb] Problem with crystal structure solution
Hello Gargi

I don't think you mean Fc. Apart from anything else, there's not enough room for 4 more domains per Fab. I think you mean the CL:CH1 domain dimer of the Fab. Fabs have a well-characterized variability in the "elbow angle" between the VL:VH domain pair and the CL:CH1 domain pair. I suspect you've tried molecular replacement without trying a range of different elbow angles.

I see +ve difference map peaks within domains and -ve difference map peaks along the polypeptide chain, which is refinement's way of saying that it doesn't want atoms in the latter location and does want atoms in the former. There's also some quite serious clashing (chain interdigitation) in places, which can't possibly be correct.

The easiest way to fix this is to re-do your molecular replacement with separate VL:VH and CL:CH1 models, which will inherently allow their correct relative placement to be modeled. If you managed to get a TFZ of 42 with flawed models, this should be a pretty easy thing to pull off, but if not please contact me off-list.

Phil Jeffrey
Princeton

On 3/27/23 1:52 PM, Kher, Gargi M wrote:
Hello, I obtained diffraction data for one of my crystallographic projects. Data collection determined the space group to be P1 at ~2.67 Å. I solved it using MR (it contains a Fab-Fab complex) and the Phaser solution (TFZ of 42.0), which is close to the Matthews coefficient prediction, placed 16 Fabs in the unit cell (8 copies of the same Fab-Fab complex). The Matthews coefficient was predicted to be 2.44 with 9 copies in the ASU, but for 8 and 10 copies in the unit cell, the Matthews coefficients were 2.75 and 2.20, respectively. I did set up MR jobs searching for 9 and 10 Fab-Fab copies in the unit cell, but still only 8 Fabs were placed. While most of the Fabs fit nicely in the density, 6 Fcs (chains e, A, M, Q, U, and c) do not fit within the electron density. I have tried re-processing my data, but P1 seems to be the "correct" space group, and Xtriage does not indicate any red flags. There is some extra positive density visible close to these Fcs. I believe these 6 Fcs might be in a different orientation/position than how they're currently being placed, and should fit into the additional positive density I'm seeing. However, I have been unable to place them correctly, either by using rigid-body refinement in Phenix for these 6 Fc domains or by doing it manually. Does anyone have ideas as to what might be happening here and what I could do to try and fix this? I've attached my PDB and MTZ files for your reference.
Thank you,
Gargi Kher
Re: [ccp4bb] To Trim or Not to Trim
On 3/10/23 4:05 AM, Julia Griese wrote:
Hi all, my impression has been that the most common approach these days is to "let the B-factors take care of it", but I might be wrong. Maybe it's time to run another poll? Personally, I call any other approach R-factor cosmetics. The goal in model building is not to achieve the lowest possible R-factors, it's to build the most physically meaningful, most likely to be correct, model.

And I could call your approach "model cosmetics". If you can't see the side-chain, you don't know where it is, and you probably don't even know where the centroid of the distribution is. Only in the case of very short side-chains with few rotamers can you make a reasonable volume approximation to where the side-chain is and "let the B-factors" smear out the density to cover a range of the projected conformations. For longer side-chains, if you put it in a single conformation, you are very likely NOT coming close to correctly modeling the actual distribution of conformations.

So let's circle back to "most likely to be correct model" and ask what we *actually* know about where the atoms are. Put your disordered Arg in with 10 alternate conformations, each with a refined relative occupancy, and then let the B-factors smear that lot out - that's your better model.

Phil Jeffrey
Princeton
Re: [ccp4bb] what would be the best metric to assess the quality of an mtz file?
CCP4's PEAKMAX program would be quite scriptable.

Phil

On 10/27/21 1:58 PM, Murpholino Peligro wrote:
So... how can I get a metric for noise in electron density maps? First thing that occurred to me: open in Coot and do Validate -> Difference Map Peaks -> get number of peaks (is this scriptable?). Or, second: phenix.real_space_correlation detail=residue file.pdb file.mtz
Thanks again
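A sketch of the PEAKMAX route from Python (keyword spelling from memory - check the PEAKMAX documentation; the map and output file names are placeholders):

  import subprocess

  # Run CCP4's peakmax on a difference map, then count the peaks it writes.
  keywords = "threshold rms 3.0\nend\n"
  subprocess.run(['peakmax', 'mapin', 'fofc.map', 'xyzout', 'peaks.pdb'],
                 input=keywords, text=True, check=True)

  n_peaks = sum(1 for line in open('peaks.pdb')
                if line.startswith(('ATOM', 'HETATM')))
  print(n_peaks, 'difference map peaks above 3 sigma')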
Re: [ccp4bb] Lowering R factor from small molecule structure
Unlike macromolecular crystallography, small-molecule crystallography is infrequently starved for data. So it makes no sense at all to extend your data to e.g. I/sigI of 1.0 and Rmeas > 80% unless you want your R1 to be >10% for no good reason or utility - which is what was behind my suggestion: test to see if the data cutoff is an issue. It's also about the fastest test you can do in SHELXL.

> Yes, ANIS and adding hydrogens (in SHELXL) are good things to do - with 0.8Å data most small molecule crystallographers would do this as a first step after fitting all the non-H atoms.

Actually, adding aniso B's and hydrogens too soon will mess up your disorder modeling, so blanket statements like that work for well-behaved structures but not so much for more challenging ones. E.g. of the four structures I've done this week, one had significant main-molecule disorder, so that comes ahead of adding hydrogens, and refining unrestrained aniso B (as is the default) for disordered atoms is asking for trouble. It's not as cookie-cutter as you represent, and I stick to all my suggestions.

Phil Jeffrey
Princeton

On 6/4/21 4:27 AM, Harry Powell - CCP4BB wrote:
Hi, yes, ANIS and adding hydrogens (in SHELXL) are good things to do - with 0.8Å data most small-molecule crystallographers would do this as a first step after fitting all the non-H atoms. One thing I can't agree with is cutting the resolution of your data _unless_ you have a very, very good reason to do so. Normal small-molecule refinements will use data to ~0.8Å and not use a cut-off based on resolution or I/sig(I). A good dataset will often go to higher resolution, and small-molecule crystallographers will be very happy to use these data (unless, as I say, they have a very good reason not to), and would certainly have to "explain to the referees" why they didn't if they ignored a systematic chunk. Something else that you might not have thought of - have you actually told SHELXL what the reflection data are, i.e. are they F, F^2, intensity? It's perfectly possible to solve a small-molecule structure by e.g. telling the program you're giving it F^2 but actually giving it F, but refinement would be somewhat less straightforward. SHELXL normally uses F^2 in refinement; macromolecular programs still normally use F (AFAIK). What programs did you use for processing the diffraction data? Of course, lowering the R factor is not the objective of the exercise - a lower R-factor is a consequence of having a model that fits the data better. I would be strongly inclined to ask a small-molecule crystallographer (or someone with a strong background in it) to have a look at your data & model - they could probably give you a definitive answer by return of e-mail. Just my two ha'porth.
Harry

On 4 Jun 2021, at 03:10, Jon Cooper wrote:
Agreed, ANIS is the command to try.
Sent from ProtonMail mobile

On 3 Jun 2021, 20:18, Philip D. Jeffrey wrote:
R1 of 17% is bad for a small molecule. 0.8 Å is in the eye of the beholder - if you're using macromolecular cutoffs then these might be too aggressive for small-molecule-type refinement stats - try a more conservative cutoff like 0.9 and see how that changes R1. However, I suspect it's more to do with how your model is fitting the data. Have you refined anisotropic B-factors? Have you added hydrogens?
I would suggest non-CCP4 programs like Olex2 or ShelXle as the interface for the refinements - I use the latter and it's somewhat Coot-like, with useful features particular to small molecules. Also PLATON has some things (like expand-to-P1 and SQUEEZE) that, respectively, might be useful to explore space-group issues and disordered solvent. PLATON also has a means to check for some forms of twinning.

Phil Jeffrey
Princeton

From: CCP4 bulletin board on behalf of Jacob Summers
Sent: Thursday, June 3, 2021 2:49 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Lowering R factor from small molecule structure

Greetings! I am currently trying to reduce the R factor of a cyclic small-molecule peptoid in ShelXle. The maximum resolution of the data is 0.8 angstroms. The molecule itself fits the density very well, but there are a few unexplained densities around the molecule which do not seem to be anything in the crystallization conditions. The R1 factor of the refinement is 17.07% but I am unsure how to lower this value. Any ideas on how to better refine this molecule, or fill densities, to lower the R1 factor? I do not have much experience working with small-molecule refinement or with ShelX.
Thanks so much,
Jacob Summers
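The cutoff test suggested above is a one-line change to the .ins file - a sketch (SHEL and ANIS are standard SHELXL instructions; the HFIX atom name is hypothetical):

  SHEL 999 0.9   ! use only data between 999 and 0.9 A for this refinement run
  ANIS           ! anisotropic Uij for non-H atoms (once the model warrants it)
  HFIX 43 C5     ! riding aromatic H on a (hypothetical) carbon C5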
Re: [ccp4bb] (scattering factors) f and f" for Sr Heavy atom
http://skuld.bmsc.washington.edu/scatter/AS_periodic.html
http://skuld.bmsc.washington.edu/scatter/data/Sr.dat

would probably work for initial values. 9700 eV for that wavelength.
https://people.mbi.ucla.edu/sumchan/crystallography/ang-eV_convertor.html

Phil Jeffrey
Princeton

On 1/22/21 2:36 PM, rohit kumar wrote:
Hello all, I have data collected at a wavelength of 1.2782 Å (for a Sr heavy atom) at a resolution of 1.6 Å. I was trying to run Crank in CCP4 for SAD phasing, and it asks me to fill in the values of the scattering factors f' and f" for the heavy atom. Can anyone please help with how to calculate, or where to find, these f' and f" values for Sr? Please let me know if you need any information from my side. Thank you in advance.
Regards,
Dr. Rohit Kumar Singh
Postdoctoral fellow
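The wavelength-to-energy conversion behind that 9700 eV figure is one line of arithmetic - a minimal Python sketch (12398.42 eV*Å is just hc in those units):

  wavelength = 1.2782                 # Angstrom
  energy_ev = 12398.42 / wavelength   # E[eV] = hc / lambda
  print(round(energy_ev))             # -> 9700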
Re: [ccp4bb] Finding partial occupancy monomer by MR ?
Thanks for the suggestions.

The idea that it's related to a trigonal space group and twinning or a pseudo space group is an interesting one, but this is C2221 and the intensity stats don't show twinning. Twinned P21 -> C2221 doesn't solve the non-unit occupancy in this case. Since the other monomers are full-occupancy, it can't be 3 overlapping dimers, so the phenomenon is rather unusual in my finite experience. (Also, there is only one set of Se peaks for this 4th monomer.)

I used Herman's suggestion of finding 3 monomers first (with very large RFZ/TFZ/LLG, since the monomers had been refined against the data) because that's very fast. And then Phaser took a long while to not find the 4th monomer. Once I figure out how to make modern versions of Phaser "fail quickly" like the older versions, I'll scan a range of homology % and see if that changes anything.

Phil

On 12/10/20 9:46 AM, Schreuder, Herman /DE wrote:
Dear Phil,
0.32 is awfully close to 1/3, which brings a nice mathematical puzzle to my mind: to see if the 1/3 occupancy is somehow related to the 3 fully occupied monomers... It may also be related to a (trigonal??) space group... You have probably already tried it, but Phaser has the option to give it already-solved molecules and ask it to search for additional molecules. Here I would indeed lower the expected % homology significantly, to crudely compensate for the low occupancy. In contrast to the advice of Dale, I would play around with the % homology to find the value which works best.
My 2 cents,
Herman

-----Original Message-----
From: CCP4 bulletin board, on behalf of Phil Jeffrey
Sent: Thursday, December 10, 2020 14:49
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Finding partial occupancy monomer by MR ?

Preamble: I have an interesting crystal form with 3 monomers (~400 aa) at full occupancy and apparently one at much reduced occupancy. It was built recently from Se-SAD and was in moderately good condition: Rfree = 32% for the trimer at 2.6 Å. In recent refinement cycles it became obvious that there was a 4th monomer in a region of weaker/choppy 2Fo-Fc and Fo-Fc density that corresponded to a "confusing" set of low-occupancy SeMet sites found by SHELXD and Phaser-EP. The experimental map was bad in that region and was probably flattened during density modification anyway, in retrospect.

Question: Phaser failed to find the 4th monomer after trivially finding the other 3 with a recent version of the monomer. I'm wondering if there's a way to indicate "this one is partial occupancy" to Phaser, or if there's a way to improve the odds of success beyond just lowering the expected % homology. Or if anyone has had success with other programs. This is perhaps a rare edge case, but I naively expected Phaser to work. In the end I used the weak SeMet sites to locate the monomer, and the occupancy appears to be around 0.32 in refinement.
Cheers,
Phil Jeffrey
Princeton
[ccp4bb] Finding partial occupancy monomer by MR ?
Preamble: I have an interesting crystal form with 3 monomers (~400 aa) at full occupancy and apparently one at much reduced occupancy. It was built recently from Se-SAD and was in moderately good condition: Rfree = 32% for the trimer at 2.6 Å. In recent refinement cycles it became obvious that there was a 4th monomer in a region of weaker/choppy 2Fo-Fc and Fo-Fc density that corresponded to a "confusing" set of low-occupancy SeMet sites found by SHELXD and Phaser-EP. The experimental map was bad in that region and was probably flattened during density modification anyway, in retrospect.

Question: Phaser failed to find the 4th monomer after trivially finding the other 3 with a recent version of the monomer. I'm wondering if there's a way to indicate "this one is partial occupancy" to Phaser, or if there's a way to improve the odds of success beyond just lowering the expected % homology. Or if anyone has had success with other programs. This is perhaps a rare edge case, but I naively expected Phaser to work. In the end I used the weak SeMet sites to locate the monomer, and the occupancy appears to be around 0.32 in refinement.

Cheers,
Phil Jeffrey
Princeton
Re: [ccp4bb] Model refinement problems when upgrading Phenix
Hello Juan

First, there's a phenix.refine bulletin board, on which you might attract the attention of the developers, which might help: http://www.phenix-online.org/mailman/listinfo/phenixbb

I've been using 1.17-3644 without issues after transitioning from something older. Consider downgrading to something new-ish.

At first glance this looks like a change in the weighting between the X-ray term and the (sum of the) geometric terms - if you are using a wxc_scale command or an explicit weighting value, I'd turn that off and see what happens. But you say that you've been using weight optimization, which seems to suggest otherwise. What are your Rwork, Rfree, RMSD bonds, RMSD angles, and Ramachandran stats for the same model in the two program versions? If the weighting has changed, Rwork vs geometry should be a pretty easy indicator. If you get worse geometry with the same Rwork, that's a lot more troubling.

And try REFMAC. REFMAC is usually faster, and on a couple of high-resolution projects gave a significant drop in Rfree. Usually they are comparable, but it's worth running both to see what happens.

Things that traditionally give me issues in phenix.refine: the real-space refine subprocess sometimes "unrefines" my structure (try turning it off), and there appears to be enough of a difference between the first and subsequent passes of the weight estimation that the weight-optimization scheme gets thrown off.

Phil Jeffrey
Princeton

On 7/21/20 10:40 AM, JUAN ESTEVEZ GALLEGO wrote:
Dear all, I have been working on the refinement of a crystal structure using phenix.refine from the 1.12-2828-Intel-Linux-2.6 version of Phenix. I have recently replaced my computer with a MacBook and upgraded Phenix to the 1.18.2-3874-MacOS version. However, I found that the refinement introduced a huge number of outliers, especially C-beta and Ramachandran. The structure is at 2.5 Å resolution and I used the following refinement strategy: XYZ coordinates, real-space, individual B-factors, occupancies, X-ray/stereochemistry weight optimization, no experimental phase restraint, automatic metal and ligand linking, and automatic correction of N/H/Q errors. I also tried using TLS instead of individual B-factors, but the problems persist. Does anybody know why this could be happening? Thanks a lot for your help!
Best,
Juan
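If you want to A/B test the weighting directly, command-line overrides are the quickest route - a sketch (parameter names from memory of phenix.refine of that vintage; verify against phenix.refine --show-defaults):

  phenix.refine model.pdb data.mtz \
      strategy=individual_sites+individual_adp \
      optimize_xyz_weight=False wxc_scale=0.5

The strategy line omits the real-space refinement step, and the last two arguments replace weight optimization with a fixed X-ray/stereochemistry weight, so you can vary wxc_scale by hand and watch Rwork against the geometry RMSDs.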
Re: [ccp4bb] number of frames to get a full dataset?
The people that already use multiplicity are going to find reasons why it's the superior naming scheme - although the underlying reason has a lot to do with negative associations with 'redundant', perhaps heightened in the current environment. And conversely, 'redundant' works for many others - Graeme's pragmatic defense of multiplicity actually works both ways: any person who takes the trouble to read the stats table, now exiled to Supplementary Data, knows what it means.

Surely, then, the only way forward in this almost totally irrelevant discussion is to come up with a universally-loathed nomenclature that pleases nobody, preferably an acronym whose origins will be lost to history and the dusty CCP4 archives (which contain threads similar to this one). I humbly submit: NFDOF, the Nearly Futile Data Overcollection Factor? [*]

Or, even better, could we not move on to equally pointless discussions of the inappropriateness of "R-factor"? I have a long history of rearguard action trying to give stupid acronyms a wider audience, so you're guaranteed to hear from me on this for years. (Personally I'm pining for Gerard Kleywegt to resume his quest for overextended naming rationales, of which ValLigURL is a personal 'favo[u]rite'. But I'm just old-fashioned.)

Ironically,
Phil Jeffrey
Princeton

[* I too have collected 540 degrees in P1 to solve a SAD structure, just because I could, hence "nearly".]
[** The actual answer to this thread is: history is written by the authors of scaling programs - and I think the Americans are currently losing at this game, thus perilously close to making themselves redundant.]

On 6/30/20 4:14 AM, Winter, Graeme (DLSLtd,RAL,LSCI) wrote:
Or, we could accept the fact that crystallographers are kinda used to the multiplicity of an individual Miller index being different from the multiplicity of observations, and in Table 1 know which one you mean? Given that they add new information (at the very least to the scaling model), they are strictly not "redundant". The amount that anyone outside of methods development cares about the "epsilon" multiplicity of reflections is ... negligible? Sorry for chucking pragmatism into a dogmatic debate.
Cheerio, Graeme
Re: [ccp4bb] refinement of 0.73A data in shelxl
That doesn't sound right re: PART numbers. Classically:

  PART 1   majority disordered atoms, with FVAR/occupancy of e.g. "21." instead of the usual "11."
  PART 2   minority disordered atoms, with FVAR/occupancy of e.g. "-21."
  PART 0   back to the undisordered atoms

The 21.000/-21.000 pairs make the sum of the occupancies add to 1.0, but the actual value of each group is defined by the second free variable. See http://shelx.uni-goettingen.de/shelxl_html.php#PART - the "PART 1" atoms would not interact with the "PART 2" atoms. There's even an example for a disordered SER in the documentation.

PART -n is used for disorders that overlap with themselves on symmetry axes: "If n is negative, the generation of special position constraints is suppressed and bonds to symmetry generated atoms with the same or a different non-zero PART number are excluded; this is suitable for a solvent molecule disordered on a special position of higher symmetry than the molecule can take".

I use PART 1/PART 2/PART 0 all the time in "small molecule world", but I've used PART -1 precisely once.

Phil Jeffrey
Princeton

On 2/6/20 4:15 PM, Tim Gruene wrote:
Dear Matthias, some developers introduce new features of their refinement programs with the words "... which has been there in SHELXL since the beginning of time". If you are only looking for two conformations, you are looking for the combination of free variable number N with PART N and PART -N. In case you deal with more than two conformations, take a look at SUMP (as Jon suggested). The use of free variables is easier to explain right at the computer, so please ask a colleague near your office who is familiar with SHELXL for the details.
Best, Tim

On Thursday, February 6, 2020 8:10:01 PM CET, Barone, Matthias wrote:
Sorry if the mail was not clear. I figured that out now, yes. As I wrote in the update, I found this stupid error I made, and now everything looks good. Now that I have a feeling for how SHELXL works, I miss one of its features in the PDB format, namely the possibility to link the occupancies of a double conformation to another moiety, say a water or a double conformation of the ligand. Is there a way to use something similar to FVAR in a PDB file?
Dr. Matthias Barone
AG Kuehne, Rational Drug Design, Leibniz-Forschungsinstitut für Molekulare Pharmakologie (FMP), Robert-Rössle-Strasse 10, 13125 Berlin, Germany. Phone: +49 (0)30 94793-284

From: bogba...@yahoo.co.uk
Sent: Thursday, February 6, 2020 5:01:14 PM
To: Barone, Matthias
Cc: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] refinement of 0.73A data in shelxl

Hello, hope I can help. OK, so here is the DISP table...

  SFAC C H CL N O
  DISP $C   0.00510  0.00239    15.73708
  DISP $H  -0.2      0.0         0.66954
  DISP $CL  0.18845  0.21747  1035.16450
  DISP $N   0.00954  0.00480    28.16118
  DISP $O   0.01605  0.00875    47.79242

If we take these coordinates...

  N    3  0.414964 -0.147635  0.116896  11.0      0.19533  0.44341 =
  H0A  2  0.427823 -0.138656  0.123256  11.0     -1.5
  C    1  0.348035 -0.160776  0.110979  11.0      0.20723  0.28451 =
  O    4  0.363785 -0.174154  0.102906  11.0      0.21226  0.22954 =
  SG   5  0.177303  0.101267  0.040572  10.04000  0.06849  0.03024 =
  O    4  0.241304  0.071735  0.038567  10.96000  0.14982  0.12755 =

... the first N (followed by 3) is being assigned the scattering factors of chlorine, because that element is 3rd in the SFAC list. The SG (followed by 5) is being assigned the scattering factors of O, because the latter is 5th in the SFAC list. I think you need to check that these assignments and the chlorine occupancy are OK.
Jon Cooper

On 6 Feb 2020 11:13, "Barone, Matthias" wrote:
Dear community, here is an update on my SHELXL problem.
I solved it after an epiphany last night in bed... I tried countless things to get the positive density on the Cl under control. Markus suggested that the density came from a radiolysed chloride, so I tried to superimpose chlorinated and radiolysed ligands. However, that did not lead to anything fruitful. Remember that I tried to incorporate the DISP of Cl into the .ins file. This is the original of the protein .ins, with chloride just pasted in as the last element:

  SFAC C H N O S CL
  DISP $C   0.00510  0.00239    15.73708
  DISP $H  -0.2      0.0         0.66954
  DISP $N   0.00954  0.00480    28.16118
  DISP $O   0.01605  0.00875    47.79242
  DISP $S   0.15995  0.16998   812.87489
  DISP $CL  0.18845  0.21747  1035.16450

The upper list only creates positive density on the chloride; the rest of the map is clean and looks the same as if you omitted the DISP line of Cl altogether. The following list is coming from the .ins file of the converted PRODRG file:

  SFAC C H CL N O
  DISP $C   0.00510  0.00239    15.73708
  DISP
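To make the PART/FVAR bookkeeping explained above concrete, a sketch of a two-conformer atom pair (atom names, coordinates and values are hypothetical; see the PART section of the SHELXL manual):

  FVAR  0.25143  0.70   ! fv(1) = overall scale, fv(2) = major conformer occupancy
  ...
  PART 1
  OG1  4  0.1234  0.2345  0.3456  21.000  0.05   ! occupancy = 1.0 * fv(2)
  PART 2
  OG2  4  0.1300  0.2400  0.3300 -21.000  0.05   ! occupancy = 1.0 * (1 - fv(2))
  PART 0

The same second free variable can be attached to any other atom (a water, a ligand conformer) by giving it a sof of 21.000 or -21.000, which is exactly the occupancy-linking being asked about.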
Re: [ccp4bb] Another difficult MR case
Are you *sure* there's no translational NCS? For example, your first molecular replacement solution out of Phenix shows

  EULER 293.6 27.7 288.7 FRAC -0.02 0.02 0.02

(that's "first molecule at origin in P1") and

  EULER 294.0 27.9 288.8 FRAC -0.37 0.02 0.02

which is essentially the same orientation, and a translation down one crystallographic axis (a*). And this suggests to me that either Xtriage or Phaser is missing something here. Does Phaser find translational NCS in its initial data analysis? Unmodeled translational NCS could cause significant problems with the molecular replacement search.

Phil Jeffrey
Princeton

On 8/29/19 11:28 AM, Napoleão wrote:
Dear all, sorry for the long post. I have a data set obtained from a crystal produced after incubating a protease with a protein which is mostly composed of an antiparallel beta sheet. I have tried numerous approaches to solve it, and failed. Molecular replacement using Phaser and the protease or the protein as a template yields no solution. However, molecular replacement using only part of the beta sheet yields LLG=320 TFZ==28.0 (see below). The apparently good data extend to 1.9 Å as processed by XDS, and the space group is P1 (Pointless agrees). XDS info below:

  SPACE_GROUP_NUMBER= 1
  UNIT_CELL_CONSTANTS= 44.43 72.29 77.30 97.802 89.939 101.576
  a = 9.647E-01, b = 3.176E-03, ISa = 18.07

  RESOLUTION  NUMBER OF REFLECTIONS         COMPLETENESS  R-FACTOR          COMPARED  I/SIGMA  R-meas  CC(1/2)  Anomal  SigAno  Nano
  LIMIT       OBSERVED  UNIQUE   POSSIBLE   OF DATA       observed expected                                     Corr
   1.90       24890     19149    23814      80.4%         58.1%    63.7%    11482     0.77     82.2%   63.8*     3      0.694    492
   total      163756    125884   146938     85.7%         10.6%    10.8%    75744     3.78     15.0%   99.0*    -3      0.761   5834

Xtriage in Phenix 1.16-3549 gives me all green lights (print below), suggesting the data present no twinning, no translational NCS, no ice rings, and are not anisotropic. http://fullonline.org/science/phenix_xtriage_green.png

Molecular replacement in Phaser yields single solutions like:

  Solution annotation (history):
  SOLU SET RFZ=3.0 TFZ=* PAK=0 LLG=29 RFZ=2.8 TFZ=8.8 PAK=1 LLG=310 TFZ==27.6 LLG=320 TFZ==28.0
  SOLU SPAC P 1
  SOLU 6DIM ENSE ensemble1 EULER 293.6 27.7 288.7 FRAC -0.02 0.02 0.02 BFAC -6.03
  SOLU 6DIM ENSE ensemble1 EULER 294.0 27.9 288.8 FRAC -0.37 0.02 0.02 BFAC -6.52
  SOLU ENSEMBLE ensemble1 VRMS DELTA -0.1983 RMSD 0.49 #VRMS 0.21

or partial solutions like:

  Partial Solution #1 annotation (history):
  SOLU SET RFZ=3.7 TFZ=* PAK=0 LLG=32 RFZ=2.8 TFZ=13.0 PAK=0 LLG=317 TFZ==30.2 LLG=331 TFZ==30.5 RFZ=2.4 TFZ=7.2 PAK=0 LLG=464 TFZ==18.5 RFZ=2.7 TFZ=5.7 PAK=1 LLG=501 TFZ==6.8 LLG=509 TFZ==6.6
  SOLU SPAC P 1
  SOLU 6DIM ENSE ensemble1 EULER 85.4 153.0 138.5 FRAC -0.01 -0.00 -0.00 BFAC -12.30
  SOLU 6DIM ENSE ensemble1 EULER 86.2 153.2 139.5 FRAC -0.36 -0.01 -0.01 BFAC -9.16
  SOLU 6DIM ENSE ensemble1 EULER 83.8 152.3 135.9 FRAC -0.00 0.00 -0.25 BFAC 1.52
  SOLU 6DIM ENSE ensemble1 EULER 191.2 109.1 39.3 FRAC -0.27 -0.01 0.22 BFAC 10.18
  SOLU ENSEMBLE ensemble1 VRMS DELTA -0.0447 RMSD 0.49 #VRMS 0.44

However, after 1 refinement round in phenix.refine (final: r_work = 0.4881, r_free = 0.5009) I got densities that are part good and part bad, and if I delete the bad parts and refine again, the good parts become bad. Please check the prints:
http://fullonline.org/science/good_part_of_density.png
http://fullonline.org/science/bad_part_of_density.png

What is the explanation for these molecular replacement results? What else should I try? ARCIMBOLDO takes 2+ days to run and yields no good solution. Thank you!
Regards,
Napo
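The giveaway in the two EULER/FRAC lines quoted at the top of this post can be checked with three lines of arithmetic - a sketch (numbers copied from the Phaser output above):

  import numpy as np

  frac1 = np.array([-0.02, 0.02, 0.02])   # copy 1, essentially same orientation
  frac2 = np.array([-0.37, 0.02, 0.02])   # copy 2
  dt = (frac1 - frac2) % 1.0              # fractional translation between copies
  print(dt)                               # -> [0.35 0. 0.]: a pure translation
                                          #    along a, a candidate tNCS vector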
Re: [ccp4bb] problem with the symmetry water molecule
Hello Firdous

You are seeing two because you are displaying crystallographic symmetry, and so you see the water's symmetry mate. Coot only places one (check the PDB file) but displays a second generated by symmetry. It pays to place that water molecule as precisely as possible on the symmetry axis - i.e. make it as close as possible to its symmetry mate - so that refinement programs will treat it as a special-position water and eliminate the extra one.

Phil Jeffrey
Princeton

On 3/11/19 12:09 PM, Firdous Tarique wrote:
Hello everyone. I am having a difficult time fitting a water molecule which is right at the centre of symmetry. Every time I try to fit one water molecule, it fits two, because the symmetry atom is at the same place. What is the best way to solve this problem? I am talking about the water molecule where two molecules are placed at one position (4th position in the semicircle, having both pink and purple).
Thanks
Firdous
[attachment: Screen Shot 2019-03-11 at 12.00.21 PM.png]
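One way to make the water "as close as possible to its symmetry mate" is to snap it onto the axis by averaging it with that mate - a small numpy sketch in fractional coordinates (R and t stand for the relevant symmetry operator; the values below are an example two-fold along c):

  import numpy as np

  R = np.array([[-1, 0, 0], [0, -1, 0], [0, 0, 1]])  # example 2-fold along c
  t = np.zeros(3)
  x = np.array([0.503, -0.002, 0.25])                # water, fractional coords

  mate = R @ x + t
  x_special = (x + mate) / 2.0   # for a 2-fold, the midpoint lies on the axis
  print(x_special)               # -> [0. 0. 0.25]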
Re: [ccp4bb] translational NCS & twinning
Donghyuk

The combination of two things gives me cause for concern:

1. You've reindexed something that apparently scaled OK in point group 622 into point group 2, with a smaller cell. Since it's hard to fake that sort of data agreement in 622, I assume your data is at the very least pseudo-622.

2. You've modeled that additional symmetry using a whole array of twin operators and some non-crystallographic symmetry.

This may in fact be the correct model, but there's a significant risk that you're inappropriately modeling something.

Let's assume for a moment that your small-cell C2 refinement with 6 twin operators improves the model somewhat. Now go back and generate scaled data sets in all possible point groups suggested by Pointless for that data, and try to find molecular replacement solutions with your partially-refined model, testing every possible space group in every possible point group based on the Pointless suggestions. The idea is to model more of the 622 pseudo-symmetry as crystallographic symmetry, using fewer twin operators in refinement. If you test all of these combinations, which might be quite extensive, you might find one or more that fit nearly as well as your current C2 solution and have higher symmetry. You should take a hard look at those potential solutions.

Only when you've thoroughly exhausted alternative molecular replacement solutions can you be confident that your C2 model is in fact the only reasonable explanation of your data. But as it stands it is rather atypical, and it warrants further investigation.

Additional evidence that your C2 cell is the only reasonable model would be identifiable electron-density differences between each chain in your (presumed) multi-chain model.

Cheers,
Phil Jeffrey
Princeton

On 1/10/19 11:08 AM, Donghyuk Shin wrote:
Dear Jacob Keller and Vipul, thank you both very much for the reply. Regarding the R-values, I am just wondering whether the huge gap between refinements with and without the twin operators is possible even if the crystal is not twinned?
Best wishes,
Donghyuk
[ccp4bb] Refmac: removing selected ligand hydrogens after making a link
I've got a couple of instances where I have non-standard amino acids, nevertheless present in the monomer dictionary, that have additional non-peptide covalent linkages. I've figured out how to define these, but if I opt to output hydrogens as a diagnostic, I see that Refmac doesn't delete the ligand hydrogens that were present at the linkage point. Nothing catastrophic happens in refinement, but extra atoms lying along other covalent bonds make me a little queasy.

Is there something (non-obvious) I can put in an additional user-defined .cif library to do this? Or do I simply define a new version of the monomer (without the errant hydrogen) and hope that it overwrites the previous definition? I'm doing this at borderline atomic resolution.

Thanks,
Phil Jeffrey
Princeton
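One avenue that might work (untested, from memory of the monomer library's modification mechanism - compare the tag names against the mods in $CLIBD_MON/list/mon_lib_list.cif before trusting this): define a _chem_mod that deletes the offending hydrogen, and hang it off the link via the link's mod_id field. Something like:

  data_mod_DELH1
  loop_
  _chem_mod_atom.mod_id
  _chem_mod_atom.function
  _chem_mod_atom.atom_id
  _chem_mod_atom.new_atom_id
   DELH1  delete  HB2  .

with _chem_link.mod_id_1 (or mod_id_2) in the link definition set to DELH1. The mod id, the data block name and the atom name HB2 are all hypothetical here.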
Re: [ccp4bb] high TFZ score but 50% of Rfree
R = 50% at a resolution of 2.2 Å is a lot different to 50% at, say, 3.5 Å resolution. What happens if you refine it at 3.0 or 3.5 Å? What's the model vs target sequence % identity?

Phil Jeffrey
Princeton

On 11/17/17 3:23 PM, Yue Li wrote:
Dear all, I have several datasets (the best reaching 2.2 Å resolution) giving space group C2 with two molecules in the asymmetric unit (65.1% solvent content). When running MR using a template (<20% sequence identity to the target molecule), I got a solution with high TFZ (23.7) and LLG (842). However, Rfree sticks at 50% in structure refinement using Phenix. There are no complaints from Xtriage - no twinning, no translational NCS. I think the structure solution looks reasonable, in that it can explain the formation of the three disulfide bonds through the sequence threading. I tried to search for three molecules in the asymmetric unit, but the final solution gives me two molecules. Do you have any suggestions for this high-Rfree problem? Thank you very much for your help.
All the very best,
Simon
Re: [ccp4bb] AW: Another troublesome dataset (High Rfree after MR)
Rarely do I disagree with the wit and wisdom of James Holton, but R1 is not a property that Macromolecular World is unaware of. R1 is just Rwork:

  R1 = Σ | |Fo| - |Fc| | / Σ |Fo|

However, e.g. George Sheldrick's SHELXL reports it based on a 4 sig(F) cutoff as well as on all data. Example:

  R1 = 0.0421 for 27579 Fo > 4sig(Fo) and 0.0488 for all 30318 data
  wR2 = 0.1153, GooF = S = 1.083, Restrained GooF = 1.083 for all data

(this Small Molecule World structure is not yet finished). wR2 is a weighted R-factor based on |F|^2. See http://shelx.uni-ac.gwdg.de/SHELX/shelxl_user_guide.pdf

The CIF file stores the two different R1 values as:

  _refine_ls_R_factor_all  0.0488
  _refine_ls_R_factor_gt   0.0421

So don't expect that labeling anything "R1" uniquely defines whatever sigma cutoff you are actually using. It's not implicit: you must specify it. But preferably don't report it at all, and just use it for diagnostic purposes.

Phil Jeffrey
Princeton

On 10/16/17 11:02 AM, James Holton wrote:
If you suspect that weak data (such as all the spot-free hkls beyond your anisotropic resolution limits) are driving up your Rwork/Rfree, then a good sanity check is to compute "R1". Most macromolecular crystallographers don't know what "R1" is, but it is not only commonplace but required in small-molecule crystallography. All you do
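Computing R1 with and without the cutoff is a couple of lines once you have Fo, sig(Fo) and Fc arrays in hand - a Python sketch (array names are placeholders):

  import numpy as np

  def r1(fo, fc, sig_fo=None, cutoff=None):
      """R1 = sum| |Fo| - |Fc| | / sum|Fo|, optionally for Fo > cutoff*sig(Fo)."""
      fo, fc = np.abs(fo), np.abs(fc)
      if cutoff is not None:
          keep = fo > cutoff * sig_fo
          fo, fc = fo[keep], fc[keep]
      return np.sum(np.abs(fo - fc)) / np.sum(fo)

  # r1(fo, fc)                      -> R1 for all data (_refine_ls_R_factor_all)
  # r1(fo, fc, sig_fo, cutoff=4.0)  -> SHELXL-style R1 for Fo > 4 sig(Fo)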
[ccp4bb] Job Opening - Biophysics Facility Manager, Chemistry and Molecular Biology Depts, Princeton University
Biophysics Facility Manager
Associate Professional Specialist Rank

The Departments of Chemistry and Molecular Biology at Princeton University seek an Associate or more senior Professional Specialist to manage a new state-of-the-art Biophysics Core Facility. The successful candidate will play a leading role in equipping the Facility with the latest instrumentation and in interfacing with a vibrant community of faculty and research scientists. This individual will also serve as Analytical Spectroscopist for the Spectrometry and Small Instruments Core Facilities.

Applicants must have a Ph.D. and demonstrated proficiency working with macromolecules using technologies such as analytical ultracentrifugation, surface plasmon resonance (SPR), microscale thermophoresis, and/or isothermal titration calorimetry (ITC). Applicants must apply online at https://www.princeton.edu/acad-positions/position/3421 and submit a cover letter, CV, and the names and email addresses of 3 references. Appointment is for one year, with renewal contingent on satisfactory performance. This position is subject to the University's background check policy.

Princeton University is an Equal Opportunity/Affirmative Action Employer, and all qualified applicants will receive consideration for employment without regard to age, race, color, religion, sex, sexual orientation, gender identity or expression, national origin, disability status, protected veteran status, or any other characteristic protected by law.

For questions not covered by the application URL above, email Prof. Hughson at hugh...@princeton.edu

Cheers,
Phil Jeffrey
Princeton
[ccp4bb] Assistant Professor position in Cryo-Electron Microscopy/Tomography at Princeton University
The Molecular Biology Department at Princeton University invites applications for a tenure-track faculty position at the Assistant Professor level. We are seeking a colleague whose research will leverage high-resolution cryo-electron microscopy and/or cryo-electron tomography in the study of outstanding biological questions. The successful candidate will join a friendly, highly collaborative faculty and will have access to superb resources, including a new 300 kV Titan Krios TEM. We seek faculty members with a strong commitment to teaching, mentoring, and fostering a climate that embraces both excellence and diversity.

See the link: https://puwebp.princeton.edu/AcadHire/apply/application.xhtml?listingId=2821

Applications must be received by October 31, 2017.

Note: I'm just the messenger and not involved at all in the search process.

Phil Jeffrey
Macromolecular crystallography facility manager
Princeton
Re: [ccp4bb] Problem with a cell content
Hello Anna

You've already found the correct number of molecules in the asymmetric unit. 21% Rwork is a quite respectable value for a structure at this resolution, and while 80% solvent is a relatively rare occurrence, it's not unprecedented (a couple of years back I did one at 3.0 Å with 75% solvent - PDB 4U6U). If you were missing half your asymmetric unit from your model, Rwork would be held up in the mid-30% range and there would be regions of relatively high difference density outside the model.

Phil Jeffrey
Princeton

On 7/11/17 12:31 PM, Koromyslova, Anna wrote:
Dear CCP4 members, I am working on a structure of a protein in complex with an antibody fragment (approx. 50 kDa together). Molecular replacement with closely related proteins always comes up with one complex in the asymmetric unit, although the MW of protein to which Matthews applies is 125 kDa and corresponds to two complexes. Phaser gives two warnings: "Large non-origin Patterson peak indicates that translational NCS is present" and "Solutions with Z-scores greater than 27.2 (the threshold indicating a definite solution) were rejected for failing packing test". I couldn't get a solution with two subunits, although I have tried multiple combinations, including only the conserved parts of both proteins and different space groups including P1. Phenix AutoBuild also yielded only one complex. So, the question is whether I can use the structure as is, despite the very high solvent content (80%), or should I try something else. I would be very grateful for any suggestions. When the solution with a single complex is refined, the statistics are the following:

  R-work                0.2129
  R-free                0.2459
  Matthews coefficient  6.22
  Percentage solvent    80.22
  Resolution range (Å)  48.34 - 2.9 (2.98 - 2.9)
  Space group           P 62 2 2
  Unit cell             167.45 167.45 143.538 90 90 120
  Multiplicity          19.1 (18.3)
  Completeness (%)      99.44 (94.39)
  Mean I/sigma(I)       24.59 (2.71)
  Wilson B-factor       64.28
  R-merge               0.1256 (1.186)
  R-meas                0.1291
  CC1/2                 0.999 (0.85)
  CC*                   1 (0.959)

Thank you very much for your help,
Anna

Dr. Anna Koromyslova, Postdoctoral researcher
German Cancer Research Center (DKFZ), F150, Im Neuenheimer Feld 242, D-69120 Heidelberg, Germany
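The Matthews-to-solvent arithmetic in Anna's table is easy to verify - a one-function Python sketch (1.23 is the usual constant, roughly the 0.74 cm^3/g protein partial specific volume times 1.66):

  def solvent_fraction(vm):
      # Matthews (1968): protein volume fraction ~ 1.23/Vm, Vm in A^3/Da
      return 1.0 - 1.23 / vm

  print(solvent_fraction(6.22))  # -> 0.802, the ~80% quoted above
  print(solvent_fraction(2.44))  # -> ~0.50, a much more typical value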
[ccp4bb] RuH3R system bits and pieces (USA)
I've just started decommissioning our aged Rigaku X-ray system prior to replacement, and before I consign every bit to surplus and the scrap yard, there's a possibility that someone else in the USA is nursing an old RuH3R/RaxisIV++ system and could find use for a board or TMP controller or ...

The only item that might have residual value is the optical upgrade that we did 7 years ago - a Xenocs Fox2D Cu 25-25P multilayer. I have one anode rebuild kit that dates from mid-2013. I have an unopened (but decade+ old) box of filaments (CN4892V2).

Please email me directly, not to the list.

Thanks,
Phil Jeffrey
Princeton
Re: [ccp4bb] Fwd: how to calculate a difference map between two heterodimers in heterotetrameric protein
Tricky - perhaps this could be viewed as "anti-averaging", methodologically.

1. Use USF programs MAMA and IMP to generate a mask and optimize the NCS operator (or skip this step if you feel you know yours accurately).

2. Use CCP4's MAPROT to rotate the map of one monomer onto the other.

3. Conceivably use USF program COMDEM to create the "difference map", assuming it will tolerate a weighting of -1. Or perhaps MAPROT will take a negative scale. Or MAPMASK - there's a range of map manipulation programs that can scale an input map, but I've never tried to invert one.

Unless you actually wanted Fourier coefficients, it shouldn't be impossible to create a masked volume of the difference between two maps after rotating one of them.

Phil Jeffrey
Princeton

On 3/10/17 5:24 PM, Oleg Zadvornyy wrote:
Dear All, we are working on a tetrameric protein containing 2 heterodimers which are related to each other by a 2-fold symmetry. There are differences at the active site between the two heterodimers in the crystals, and we would like to make a difference map to compare one heterodimer to the other. I would really appreciate your advice and suggestions on how to perform this comparison.
Thank you,
Oleg

Oleg A. Zadvornyy, PhD
Institute of Biological Chemistry, Washington State University
237 Clark Hall, Pullman, WA 99163
Tel: (509)-335-9837, Lab: (509)-335-1958
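Once the MAPROT step has put the two monomer maps on the same grid, the subtraction itself is simple - a hedged gemmi/numpy sketch (file names are placeholders; this assumes identical grids and ignores masking):

  import numpy as np
  import gemmi

  m1 = gemmi.read_ccp4_map('monomerA.map')
  m2 = gemmi.read_ccp4_map('monomerB_rotated_onto_A.map')

  a1 = np.array(m1.grid, copy=False)   # no-copy view onto the map values
  a2 = np.array(m2.grid, copy=False)
  a1 -= a2                             # i.e. the "weighting of -1", in place

  m1.update_ccp4_header()
  m1.write_ccp4_map('difference.map')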
Re: [ccp4bb] CCP4BB Digest - 13 Jan 2017 to 14 Jan 2017 (#2017-15)
On 1/19/17 3:54 PM, Panneerselvam, Saravanan wrote: We observed additional density around ADP that fits perfectly like a gamma phosphate Hello Saravanan At 1.4 Angstrom resolution wouldn't that suggest that you've somehow got ATP in there ? I don't think I understand the other option - were you proposing an ADP-O-C(O)2 arrangement to explain the density ? Surely that has a rather different shape, considerably different scattering power at the center of the terminal group (C vs P) and probably different X-O bond lengths. All of these should show in the density maps at 1.4 Å, although the bond length issue could be quite subtle. Phil Jeffrey Princeton mimicking like ATP bound state, surrounded and coordinated by two metal ions (resolution is 1.4A). There is a change in space group (from I212121 to P212121) and further important conformation changes are observed around the ATP binding pocket and a distant region. This is the only xtal we obtained in this space group, and all other xtals (measured 10 xtals) from the same plate belong to I212121. Thanks for your help and time! Saravanan
Re: [ccp4bb] OT: mapping PDB to mmCIF data quantities
Thanks Jose - I missed that one. REMARK 2 is somewhat ambiguous with: _refine.ls_d_res_high and _reflns.d_resolution_high although the former makes more sense and seems to be what corresponds to REMARK 2. Haven't yet seen an entry with only _reflns.d_resolution_high and not _refine.ls_d_res_high but there are several where the resolution of refinement is apparently significantly higher than the resolution of the source data: 1AU7, 1AW7 etc. Cheers Phil Jeffrey Princeton On 7/8/15 10:04 AM, Jose Manuel Duarte wrote: This looks like the mapping you are after: http://mmcif.wwpdb.org/docs/pdb_to_pdbx_correspondences.html It maps only the structured PDB data items to their equivalent mmCIF items. For instance REMARK 2 is not there, but REMARK 200 is. The resolution value should then be in REMARK 200 RESOLUTION RANGE HIGH (corresponding to mmCIF data item _refine.ls_d_res_high). Jose
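[A quick command-line check of which of the two items a given entry actually carries - the entry file here is a hypothetical local copy of the mmCIF:

grep -E '_refine\.ls_d_res_high|_reflns\.d_resolution_high' 1au7.cif
]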
Re: [ccp4bb] crystal habit/morphology and the relationship to unit cell contents
I would have thought that what the indexing routine defined as [001] vs [00-1] would be essentially random as one would obtain the equivalent indexing in 622 in both up and down alignment of the crystallographic a/b/c axes with respect to crystal morphology. Phil Jeffrey Princeton On 6/1/15 1:44 PM, Scott Lovell wrote: Hi Paul, If you have access to diffractometer equipped with a 4-circle goniometer, you should be able to index the faces of the crystals. All you need to do is collect some images to index the lattice and determine the orientation matrix. Most instruments have software that allows one to then orient specific faces or crystallographic directions relative to various directions of the instrument (eg. camera, phi axis, direct beam, etc). So after indexing, you could then orient the [001] direction of the crystal towards the camera to determine if this is the top or the base. You can also determine the direction of the a/b axes [100] and [010] relative to the crystal and index the other faces. If you can also measure the interfacial angles, this may help you to confirm the indices. If you do this for a number of samples, is the top face always the [001] direction or is it the [00-1] direction for other crystals? Assuming that you are growing these crystals by hanging drop, my guess is that the base is in contact with the coverslip during growth and you observe this half pyramid habit. If you were to grow the crystals using the floating drop method, to prevent contact with the plate materials, would the crystals form a bipyramidal habit? Or do you see crystals in the current drop that have the same habit but are not in contact with the plate materials? Scott -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Paul Paukstelis Sent: Monday, June 01, 2015 11:21 AM To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] crystal habit/morphology and the relationship to unit cell contents I'm interested in knowing how to figure out the relationship between the unit cell contents and the crystal habit in these crystals (small attachment, two roughly orthogonal views). Space group is P64 (enantiomeric) , and you can clearly see the six-fold. The question becomes how to determine which direction the screw axis is going with respect to top and the base of the pyramidal crystals (right image) so I can gauge how/why the crystals grow this way based on the cell contents. Thanks in advance. --paul
Re: [ccp4bb] proton scattering by X-rays
Mark, In the small-molecule crystal structures I work with it's relatively common to see localized difference electron density along covalent bonds or in the places you'd expect to see lone pairs during refinement after you've fit and modeled the atoms reasonably well and the phases are pretty good. It's usually not as strong as difference density for hydrogens, before you put them in, but it's often pretty clearly visible once you have. (I use SHELXLE as an interface for small molecule refinements because of a somewhat Coot-like experience in viewing maps). Phil Jeffrey Princeton What you CAN do in fact is appropriately subtract spherical electron density from the experimental density and see what is left (i.e. directional ED that is 'surplus'). I tried to quickly find a paper on that, they exist, and they show that experimental density does confirm what we learn in chemistry class, orbitals are not imaginary. Mark
Re: [ccp4bb] Molecular Replacement model preparation
That document is fairly old and is in dire need of revision to reflect the modern arsenal of programs. Nevertheless: Putting the hinge axis along Z was a trick told to me by Steven Sheriff back in the days when we worked on Fab structures - which after all are classical examples of hinged molecules. One would search with separate domain fragments - split either side of the hinge - and the Z-orientation trick makes it easier to spot pairs of peaks from each search model that are related to each other. In the Fab world we searched with Fv models (VH:VL heterodimer) and CH1:CL constant region heterodimeric models. Peaks related solely by hinge motion would have similar alpha and beta angles and potentially different gamma (Crowther convention Eulerian angles). Historical note: this was back in the days when it was possible to remember the names of all the Fab fragments that were in PDB and their respective IDs. This ploy was more important in the days before Phaser or Molrep, which will now gleefully try a long list of rotation function peaks for you quite quickly, so manually parsing the list of rotation function peaks is rather unnecessary. And perhaps counter-productive. Split your molecule apart at the hinge, giving fragment1 and fragment2. Attempt to find both fragments independently. Choose the one that gives the best results: TFZ score or LLG score or discrimination between possible space groups or whatever you like. Then, attempt to find the *other* fragment in the context of that first solution (see the sketch after this message). Phil Jeffrey Princeton On 10/5/14 3:34 AM, Luzuokun wrote: Dear all, I'm doing molecular replacement using Phaser. My protein is predicted to have two domains with a "hinge" linking them. The model sequence identity is 0.27. But the MR result is poor. I've tried other programmes (Molrep, MrBump, Balbes, ...) But no improvement was observed. I think that this is due to the "open" or "closed" conformation around the hinge. I was told that I could place the Z axis along the hinge (http://xray0.princeton.edu/~phil/Facility/Guides/MolecularReplacement.html), could anyone tell me more details about how to do next? Thanks! Lu Zuokun
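[To make the two-stage recipe concrete, a minimal Phaser script sketch - untested, with placeholder file names, IDENTITY taken from Lu's 27% figure, and keywords worth checking against the Phaser documentation. With both SEARCH lines in a single MR_AUTO job, Phaser places frag1 first and then hunts for frag2 in the context of that partial solution, which is the second step described above:

phaser <<EOF
MODE MR_AUTO
HKLIN data.mtz
LABIN F=F SIGF=SIGF
ENSEMBLE frag1 PDB fragment1.pdb IDENTITY 27
ENSEMBLE frag2 PDB fragment2.pdb IDENTITY 27
COMPOSITION PROTEIN SEQUENCE full.seq NUMBER 1
SEARCH ENSEMBLE frag1 NUMBER 1
SEARCH ENSEMBLE frag2 NUMBER 1
ROOT hinge_mr
EOF

To follow the advice literally, run two single-fragment jobs first and keep whichever fragment scores better as frag1.]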
Re: [ccp4bb] Extract Euler angles from fractional coordinate matrix
The orthogonal/fractional matrix is outlined here: http://www.iucr.org/__data/assets/pdf_file/0009/7011/19_06_cowtan_coordinate_frames.pdf Sorry to say I apparently ditched my old Fortran o2f and f2o programs to do that. Bear in mind, however, that orthogonal has no fixed orientation with respect to fractional - for most space groups ncode 1 is often used but for primitive monoclinic ncode 3 is sometimes used, and I think the matrix shown in Kevin Cowtan's document above corresponds to ncode 1. Phil Jeffrey Princeton On 9/4/14 3:55 PM, Chen Zhao wrote: I am sorry, just to clarify, the fractional coordinate matrix I referred to is a rotational matrix in the fractional coordinate system. On Thu, Sep 4, 2014 at 3:52 PM, Chen Zhao c.z...@yale.edu mailto:c.z...@yale.edu wrote: Hi all, I am just curious whether there are some tools extracting the Euler angles from a fractional coordinate matrix. I have no luck searching it online. Alternatively, I found the analytical solution for the Euler angles from an orthogonal coordinate matrix. So in the worst case, my problem reduces to calculating the transformation matrix between the fractional and orthogonal coordinate system. I feel a little bit at a loss because it is 6 years since I last studied linear algebra. How can I calculate this for a specific unit cell? Thanks a lot in advance! Sincerely, Chen
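[For reference, the ncode 1 orthogonalization matrix described in Kevin Cowtan's document (a along x, b in the x-y plane); with it, a rotation F expressed in fractional coordinates converts to the orthogonal frame, from which the Euler angles follow analytically:

O = \begin{pmatrix} a & b\cos\gamma & c\cos\beta \\ 0 & b\sin\gamma & c\,\frac{\cos\alpha-\cos\beta\cos\gamma}{\sin\gamma} \\ 0 & 0 & \frac{V}{ab\sin\gamma} \end{pmatrix},
\qquad V = abc\sqrt{1-\cos^2\alpha-\cos^2\beta-\cos^2\gamma+2\cos\alpha\cos\beta\cos\gamma},
\qquad R_{\mathrm{orth}} = O\,F\,O^{-1}
]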
Re: [ccp4bb] Proper detwinning?
Chris, To change the axis ordering for e.g. changing which cell edge is the P21 B axis use an hkl matrix command. Probably can do this via the Macros during scaling but I distrust this and just edit scl.in by hand and run it as scalepack scl.in For k,l,h reindexing use hkl matrix 0 1 0 0 0 1 1 0 0 For l, h, k reindexing use hkl matrix 0 0 1 1 0 0 0 1 0 Systematic absences would be an anecdotal indicator that it is/isn't P212121. That would show strong systematic absences for (h,0,0), (0,k,0), (0,0,l) reflections. (Or reflexions if one prefers). While not impossible I would think it statistically unlikely to observe such absences if the data was really P1, P2 or P21. Going back to the images to eyeball the actual reflections on the display can be pretty illuminating. I don't remember Scalepack giving much detail in postrefinement but paying attention to the positional chi^2 values during integration might give clues about how far from 90 those unit cell axes are wandering if you try integrating in different space groups. There's also a method (or was, last time I tried it) to change the way Scalepack postrefines unit cell dimensions (value per frame or value per crystal) which might also help. More hacking of scl.in might be required. However I'm usually pretty happy if my R-free drops 12% at 2.0 Angstrom resolution when going from P21 to P1. I would look for legitimate deviations between previously identical monomers in the map and probably consider using NCS to reduce the random deviation between monomers that actually are identical by symmetry. You may have assigned the crystallographic 21 down the wrong unit cell axis in that P21 test case. Phil Jeffrey Princeton On 7/11/14 7:33 PM, Chris Fage wrote: Nat and Misha, Thank you for the suggestions. Xtriage does indeed detect twinning in P1, reporting similar values for |L|, L^2, and twin fraction as in P212121. The unit cell dimensions for the 2.0-A structure (P1) are: 72.050 105.987 201.142 89.97 89.98 89.94 P 1 The unit cell dimensions for the 2.8-A structure (P212121) are: 75.456 115.154 202.022 90.00 90.00 90.00 P 21 21 21 I have been processing in HKL2000, which only recognizes one set of unit cell parameters for each Bravais lattice (does anyone know how to change this?). Specifically, for a primitive monoclinic unit cell it estimates: 104.53 71.82 200.99 89.86 91.80 91.16 This is the unit cell which refined to Rwork/Rfree ~ 27%/34%. Indexing in mosflm gives three options for primitive monoclinic: 105.6 71.7 200.9 90.0 90.1 90.0 71.7 105.6 201.0 90.0 89.9 90.0 71.7 200.9 105.6 90.0 90.3 90.0 Attempting to integrate in any of these space groups leads to a fatal error in subroutine MASKIT. I can also use the index multiple lattices feature to get a whole slew of potential space group; however, integrating reflections leads to the same fatal error. Finally, Zanuda tells me that P212121 is the best space group, according to R-factors. However, I do not believe P212121 is the correct assignment. Best, Chris On 7/10/14, Isupov, Michail m.isu...@exeter.ac.uk wrote: I would recommend to run ZANUDA in the default mode from ccp4i or on CCP4 web server. ZANUDA has resolved several similar cases for me. Misha From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Chris Fage [cdf...@gmail.com] Sent: 10 July 2014 01:14 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] Proper detwinning? 
Hi Everyone, Despite modelling completely into great electron density, Rwork/Rfree stalled at ~38%/44% during refinement of my 2.0-angstrom structure (P212121, 4 monomers per asymmetric unit). Xtriage suggested twinning, with |L| = 0.419, L^2 = 0.245, and twin fraction = 0.415-0.447. However, there are no twin laws in this space group. I reprocessed the dataset in P21 (8 monomers/AU), which did not alter Rwork/Rfree, and in P1 (16 monomers/AU), which dropped Rwork/Rfree to ~27%/32%. Xtriage reported the pseudo-merohedral twin laws below. P21: h, -k, -l P1: h, -k, -l; -h, k, -l; -h, -k, l Performing intensity-based twin refinement in Refmac5 dropped Rwork/Rfree to ~27%/34% (P21) and ~18%/22% (P1). Would it be appropriate to continue with twin refinement in space group P1? How do I know I'm taking the right approach? Interestingly, I solved the structure of the same protein in P212121 at 2.8 angstroms from a different crystal. Rwork/Rfree bottomed out at ~21%/26%. One unit cell dimension is 9 angstroms greater in the twinned dataset than in the untwinned. Thank you for any suggestions! Regards, Chris
Re: [ccp4bb] PDB passes 100,000 structure milestone
As long as it's just a Technical Comments section - an obvious concern would be the signal/noise in the comments themselves. I'm sure PDB would not relish having to moderate that lot. Alternatively PDB can overtly link to papers that discuss technical issues that reference the particular structure - wrong or fraudulent structures are often associated with refereed publications that point that out, and structures with significant errors often show up in that way too. I once did a journal club on Muller (2013) Acta Cryst F69:1071-1076 and wish that could be associated with the relevant PDB file(s). Phil Jeffrey Princeton On 5/14/14 1:37 PM, Gloria Borgstahl wrote: I vote for Z's idea On Wed, May 14, 2014 at 12:32 PM, Zachary Wood z...@bmb.uga.edu mailto:z...@bmb.uga.edu wrote: Hello All, Instead of placing the additional burden of policing on the good people at the PDB, perhaps the entry page for each structure could contain a comments section. Then the community could point out serious concerns for the less informed users. At least that will give users some warning in the case of particularly worrisome structures. The authors of course could still reply to defend their structure, and it may encourage some people to even correct their errors. Best regards, Z *** Zachary A. Wood, Ph.D. Associate Professor Department of Biochemistry Molecular Biology University of Georgia Life Sciences Building, Rm A426B 120 Green Street Athens, GA 30602-7229 Office: 706-583-0304 tel:706-583-0304 Lab: 706-583-0303 tel:706-583-0303 FAX: 706-542-1738 tel:706-542-1738 ***
Re: [ccp4bb] KD of dimerization, off topic
That's an extremely useful link - thanks to Will Stanley for posting that one. For a VP-ITC machine I'd guess that you need to load the injector with about 500ul of protein at a concentration of 80x the Kd or more. Notice that Alan Cooper was injecting 10 microliters of protein at 2mM with a 12 microMolar dissociation constant, per injection. You would probably want to maintain that approximate ratio (~170x) because it's mostly a question of measuring deltaH with a decent signal-to-noise per injection. I recall that it takes up to 500 microLiters to load the injection syringe on a VP-ITC without air gap between plunger tip and injection point - unless someone's got a nice trick to reduce that. The rule of thumb from the VP-ITC manual - and from practical experience on our machine here - for A+B = AB is using at least 10x the Kd in the sample chamber and about 80x the Kd in the injector. That's not exactly the same situation, but 80x vs 170x suggests that the considerations are much the same. Phil Jeffrey Princeton On 2/14/14 12:52 PM, Keller, Jacob wrote: What a nice idea this ITC dilution is--a great example of a wet lab technique learned en passant on the ccp4bb. I wonder what range of Kds could feasibly be measured with existing calorimeter sensitivities? JPK -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Will Stanley Sent: Friday, February 14, 2014 12:18 PM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] KD of dimerization, off topic Hi Careina, Since alternative methods are being suggested... ITC can be good for quantitating a monomer-dimer equilibrium by diluting dimers out from a concentrated solution (which obviously favours the dimer) - and presuming a reasonable Kon/Koff. Alan Cooper has kindly figured out the data fitting for the rest of us: http://www.chem.gla.ac.uk/staff/alanc/itcdil.pdf I think Alan was using a VP-ITC when he was doing this stuff. Lower volumes - and presumably concentrations if the KD is small enough - are feasible in an ITC200. The protein is recoverable anyway. All the best, Will.
Re: [ccp4bb] Fwd: undefined edensity blob at glutamine sidechain
Priyank, I think it's too far out from the Calpha to be Trp, and also not quite flat enough. The map contains more information than just the shape as to identity. Compare the max peak height of the 2Fo-Fc blob near the Gln to neighboring element heights:
* Much higher than C/N/O ? Probably not citrate, propanol etc.
* Much higher than S ? Probably not Cl, Mg, S, SO4, PO4.
K is only a little heavier than S. So you might cut down the range of possibilities considerably. Partially occupied elements can be an issue, but you could narrow your range of options. You can also do the expedient thing and drop an Au in there with e.g. 0.5 occupancy (or a range of occupancies) and see what the refinement does. Phil Jeffrey Princeton On 12/10/13 7:44 AM, PriyankMaindola wrote: dear members i am trying to solve this crystal structure but I am puzzled by an undefined blob that appeared at a glutamine residue after refinement. I have attached pics of that below. Is it a covalent modification of the acid-amide side chain... as there is no charged environment around and the density seems continuous. please suggest. The following reagents were encountered by the protein during purification, crystallization and soaking: phenyl methyl sulfonyl fluoride, benzamidine, tris, dtt (could it be cyclized dtt?), k[au(cn)2], acidic pH, isopropanol, citrate, sulfate, phosphate, K+, Na+, Cl-. map contour: 2fo-fc: 1 rmsd; fo-fc (green): 3 rmsd -- *Priyank*
Re: [ccp4bb] How can I find the other molecule in the asymmetric unit?
Meisam: Probabilities are just that: many of us have had structures with large solvent contents that are statistically unlikely. Pedantic quibble: "It scales in P21 Space group with 7% linear Rfactor" really means that it scales in primitive monoclinic with a reasonable Rsymm, and I hope you also checked P2 as well as P21 when doing molecular replacement. P2 is rare, but not unprecedented. When you say "refine just the back bone" do you mean you're refining just a poly-ALA model or a non-mutated one ? Because if so, absent the side-chains and any waters, an R-factor of 31% is quite good. If so, add side-chains and then waters and continue refining and see how things look. 3-fold averaging across the current monomers plus your decent resolution should make the sequence interpretation straightforward. But:
* if you can see interpretable density for the fourth molecule, build in secondary structure elements, refine those, repeat until you can see a substructure you recognize, then place the monomer manually.
* use Arp/wArp to autobuild your structure - this has the benefit that often the map you'll get out will be a very good one even if you build the remaining monomer manually. If you're lucky it'll build it all for you.
* Autobuild in Phenix can also do much the same (I would do the first two, and perhaps all three, in parallel until one emerges a clear winner - see the sketch after this message)
* if the above still doesn't resolve things, consider the possibility that the fourth molecule is not what you think it is, or may be statistically disordered
Phil Jeffrey Princeton On 11/21/13 12:35 PM, Meisam wrote: Dear CCP4ers I have a data set that diffracts to 1.96 Å. It scales in P21 Space group with a 7% linear Rfactor. The Matthews coefficient gives 10% probability for 3 molecules in the asymmetric unit, 53% for 4 molecules, and 36% for 5 molecules. Molecular replacement just finds 3 molecules in the asymmetric unit. Running Phaser also gives a partial solution with 3 molecules. When I refine just the back bone of the protein for the 3 molecules the Rfree/Rwork does not go better than 34% / 31%, and when I run the molecular replacement on the refined structure again, and I fix it as a model to search for another molecule, it still does not find it. I have attached a photo to show the density for the fourth molecule in the asymmetric unit. What is the solution to this? Thanks in advance for your help Meisam
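[A sketch of the Phenix route from the third bullet above - argument names from memory, so check phenix.autobuild --help; all file names are placeholders:

phenix.autobuild data=data.mtz model=three_monomers.pdb seq_file=protein.seq

Both this and ARP/wARP run unattended, so the run-in-parallel advice costs little beyond CPU time.]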
Re: [ccp4bb] Orientation of molecules
* Open the molecular replacement solution in Coot
* Display crystal packing (Draw -> Cell & Symmetry), perhaps as Calphas only
* Find the symmetry-related instance of copyB that is in the correct position relative to copyA according to your preferences
* Use File -> Save Symmetry Coordinates to write the structure transposed by that operator (note: select the menu option, then click on the copyB instance)
* Since Coot will write the entire structure transposed by that symop, assemble the desired solution from copyA from the mol.rep. solution and copyB from the transposed solution. I'm a Luddite so I use emacs and/or grep for this (a sketch follows this message).
Phil Jeffrey Princeton 11/21/13 1:11 PM, Appu kumar wrote: Dear All, I think i have not explained my problem precisely. This may be a weird one but let me elaborate more. I have a protein moleculeA, with an N-term and a C-term end. Structurally, it is a dimer with an anti-parallel arrangement, i.e. the two copies associate so that they are arranged in antiparallel fashion (N-term of copyA is beside C-term of copyB). So when i am searching for two copies of the molecule in phaser it is giving me two copies in a parallel arrangement. So my question is, how to tell phaser that after fixing the orientation of the first copy, to change the orientation of the 2nd copy with respect to the first one so that their n-terminal and c-terminal ends lie beside each other. I am looking for your valuable suggestion. Thank you
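[The assembly step as an awk sketch in the same Luddite spirit - chain IDs are hypothetical, column 22 holds the chain ID in PDB format, and TER/ANISOU bookkeeping is left to the reader:

# copyA from the MR solution, copyB from the Coot-transposed file
awk '/^(ATOM|HETATM)/ && substr($0,22,1)=="A"' mr_solution.pdb > assembled.pdb
awk '/^(ATOM|HETATM)/ && substr($0,22,1)=="B"' copyB_transposed.pdb >> assembled.pdb
echo END >> assembled.pdb
]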
Re: [ccp4bb] Weird MR result
Hello Niu, 1. We need extra information. What program did you use ? What's the similarity (e.g. % identity) of your model. What's your space group ? Did you try ALL the space groups in your point group in ALL the permutations (e.g. in primitive orthorhombic there are 8 possibilities). 1a. My best guess on limited info is that you've got a partial solution in the wrong space group with only part of the molecules at their correct position. 2. I recently had a very unusual case where I could solve a structure in EITHER P41212 or P43212 with similar statistics, but that I would see interpenetrating electron density for a second, partial occupancy molecule no matter which of these space groups I tried (and it showed this when I expanded the data to P1). Might conceivably be a 2:1 enantiomorphic twin, in retrospect, but we obtained a more friendly crystal form. I hope you don't have something like that, but it's possible. Phil Jeffrey Princeton On 11/14/13 5:22 PM, Niu Tou wrote: Dear All, I have a strange MR case which do not know how to interpret, I wonder if any one had similar experiences. The output model does not fit into the map at all, as shown in picture 1, however the map still looks good in part regions. From picture 2 we can see even clear alpha helix. I guess this could not be due to some random density, and I have tried to do MR with a irrelevant model without producing such kind of regular secondary structure. This data has a long c axis, and in most parts the density are still not interpretable. I do not know if this is a good starting point. Could any one give some suggestions? Many thanks! Best, Niu
Re: [ccp4bb] Problematic PDBs
From the original ABC transporter retraction: http://www.sciencemag.org/content/314/5807/1875.2.full The Protein Data Bank (PDB) files 1JSQ, 1PF4, and 1Z2R for MsbA and 1S7B and 2F2M for EmrE have been moved to the archive of obsolete PDB entries You can get your hands on them via URLs like: ftp://ftp.rcsb.org/pub/pdb/data/structures/obsolete/XML/js/1jsq.xml.gz Phil Jeffrey Princeton On 10/17/13 10:26 AM, Nat Echols wrote: On Thu, Oct 17, 2013 at 6:51 AM, Lucas lucasbleic...@gmail.com mailto:lucasbleic...@gmail.com wrote: I wonder if there's a list of problematic structures somewhere that I could use for that practice? Apart from a few ones I'm aware of because of (bad) publicity, what I usually do is an advanced search on PDB for entries with poor resolution and bound ligands, then checking then manually, hopefully finding some examples of creative map interpretation. But it would be nice to have specific examples for each thing that can go wrong in a PDB construction. This would be a good place to start: http://www.ncbi.nlm.nih.gov/pubmed/23385452 The retracted ABC transporter structures are also good, although less obvious to the untrained eye. I forget what the PDB IDs are but I'll see if I can dig them up. -Nat
Re: [ccp4bb] A case of perfect pseudomerehedral twinning?
Hello Yarrow, Since you have a refined molecular replacement solution I recommend using that rather than global intensity statistics. Obviously if you solve in P21 and it's really P212121 you should have twice the number of molecules in the asymmetric unit and one half of the P21 asymmetric unit should be identical to the other half. Since you've got decent resolution I think you can determine the real situation for yourself: one approach would be to test whether you can symmetrize the P21 asymmetric unit so that the two halves are identical. You could do this via stiff NCS restraints (cartesian would be better than dihedral). After all, the relative XYZs and even B-factors would be more or less identical if you've rescaled a P212121 crystal form in P21. If something violates the NCS then it can't really be P212121. Alternatively you can look for clear/obvious symmetry breaking between the two halves: different side-chain rotamers for surface side-chains for example. If you've got an ordered, systematic difference in electron density between the two halves of the asymmetric unit in P21 then that's a basis for describing it as P21 rather than P212121. However if the two halves look nearly identical, down to equivalent water molecule densities, then you've got no experimental evidence that P21 with 2x molecules generates a better model than P212121 with 1x molecules. An averaging program would show very high correlation between the two halves of the P21 asymmetric unit if it was really P212121 and you could overlap the maps corresponding to the different monomers using those programs. Phil Jeffrey Princeton
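[One low-effort way to run the symmetrization test is REFMAC5's automatic NCS restraints - not precisely the stiff Cartesian restraints described above, but the closest single-keyword approximation. Keyword spellings from memory and file names are placeholders; tighten the restraint weights if needed:

refmac5 hklin p21_data.mtz xyzin p21_model.pdb hklout symtest.mtz xyzout symtest.pdb <<EOF
ncsr local
ncyc 10
end
EOF

Residues that refuse to obey the restraints - rotamer flips, shifted waters, diverging B-factors - are then the candidates for genuine symmetry breaking.]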
Re: [ccp4bb] mmCIF as working format?
On 8/7/13 8:27 PM, Ethan Merritt wrote: That would be a bug. But it hasn't been true for any version of coot that I have used. As you say, this is a common thing to do and I am certain I would have noticed if it didn't work. I just checked that it isn't true for 0.7.1-pre. Thanks. Turns out I'm using 0.7 and 0.7-pre on the octacore Mac and the laptop I use for building - slightly different versions updated at different times. I'll change versions. Apropos the other point I invariably do segment reordering via xemacs cut and paste although clearly Peek2 needs a reorder command. Phil
Re: [ccp4bb] mmCIF as working format?
Questionable practice is writing an interpretation program for operations that can be handled simply at the command line. Programs that use the API that Eugene implicitly refers to are no panacea, e.g. Coot has strange restrictions on things like changing the chain label that can be fixed in a matter of seconds by editing the PDB file in e.g. xemacs (an example follows at the end of this message). Which means that when I'm building a large structure with multiple chain fragments present during the build process, I've edited those intermediate PDB files tens of times in a single day. While alternative programs exist to do almost everything, I prefer something that works well, works quickly, and provides instant visual feedback. CCP4 and Phenix are stuck in a batch processing paradigm that I don't find useful for these manipulations. While PDB is limited and has a lot of redundant information, it's for the latter reason that it's a rather useful format for quickly making changes in a text editor. It's certainly far faster than using any GUI, and it's also faster than the command line in many instances - and I have my own command line programs for hacking PDB files (and ultimately whatever formats come next). Using mmCIF as an archive format makes sense, but I doubt it's going to make building structures any easier except for particularly large structures where some extended-PDB format might work just as well or better. Phil Jeffrey Princeton On 8/5/13 9:53 AM, Pavel Afonine wrote: Editing (for example, PDB files) by hand is a questionable practice. If you know programming use either existing reliable parsers (available for both, PDB and CIF) or write your own jiffy.
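[The chain-label example as a hypothetical one-liner - column 22 is the chain ID in PDB format; TER records are not handled here:

# relabel chain B as chain D
sed -E 's/^((ATOM  |HETATM).{15})B/\1D/' in.pdb > out.pdb
]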
Re: [ccp4bb] HKL2000 sigma cutoff
Ursula, I/sigI of -3 as I recall. Are you sure that the downstream programs you are using aren't the ones applying the cutoff ? Scalepack is, in general, perfectly happy to write negative intensities to output.sca and certainly is doing so as of HKL3000. Perhaps you need to use the TRUNCATE YES option in Truncate ? Does the output MTZ from Scalepack2mtz show the number of reflections you expect ? Phil Jeffrey Princeton On 7/5/13 3:24 PM, Ursula Schulze-Gahmen wrote: Sorry for the non-CCP4 question. I am confused about the sigma cutoff used by HKL2000 for scaling. I scaled a data set to 3.0 A resolution. I collected a complete dataset to 2.8A, but the I/sigma is about 1.0 at 3.0 A. The scaling logfile in HKL2000 shows 100% completeness in the highest resolution shell, but about 50% of the reflections are below I/sigma = 0 in the highest resolution shell. I am guessing that these negative reflections are not being written out, because the output file from HKL2000 does not have 100% completeness anymore. I would like to include these negative reflections. Is there a setting in HKL2000 that I can change or do I need to switch to a different program. Ursula
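[For reference, a sketch of the CCP4 import route that keeps the negative intensities - keywords from memory and the space group is a placeholder, so check the SCALEPACK2MTZ and TRUNCATE documentation. TRUNCATE YES applies the French & Wilson treatment, so weak and negative intensities still yield sensible amplitudes:

scalepack2mtz hklin output.sca hklout scaled.mtz <<EOF
symmetry P212121
end
EOF
truncate hklin scaled.mtz hklout truncated.mtz <<EOF
truncate yes
end
EOF
]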
Re: [ccp4bb] High Rwork/Rfree values
Haiying, As far as I can tell you've got a successful solution in molecular replacement via Phaser and then gone and refined it in the wrong space group. Based on what you've told us: you took your initial data in primitive orthorhombic and solved for the structure in Phaser while sampling all possible space groups. Phaser is telling you that your *original* data indexing is truly space group P22(1)2(1) and if you take that m.r. solution/data combination and simply *assign* the space group it should work in Refmac. In fact Phaser should have written the correct space group in the PDB file header. If you refine your original MTZ native data file with the PDB file Phaser wrote, what do you get ? You seem to have reindexed the data but not rotated the model (or re-run molecular replacement). That makes the model and data out-of-sync. Phaser does not reindex the data internally, and that's why it tries eight space groups in primitive orthorhombic rather than just the minimal set P222, P222(1), P2(1)2(1)2, P2(1)2(1)2(1). The others that it tries are alternative settings of these space groups (where appropriate). If you want to refine in P2(1)2(1)2 then reindex the data (h,k,l) -> (k,l,h) and re-run molecular replacement with the reindexed MTZ file. If the above is a misinterpretation of what you wrote, my alternative advice on this is:
1. throw the thing at Arp/wArp and look hard at the maps you get out. The structure might have changed more than you thought.
2. rescale the data in P1 and put it into Pointless and/or Xtriage to check for twinning and point group assignment
3. I'm fairly sure that the (72.6, 78.0, 112.5) and (66.5, 70.5, 137.0) cells are unrelated but #2 will show that.
4. If all else fails solve it in P1 and find the space group by inspection afterwards
Phil Jeffrey Princeton
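[The reindexing step as a sketch (POINTLESS syntax; file names hypothetical), after which molecular replacement is re-run against the reindexed file:

pointless hklin native.mtz hklout reindexed.mtz <<EOF
reindex k,l,h
EOF
]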
Re: [ccp4bb] refinement hanging--what am I missing?
Pat, Try TLS - I usually don't invoke it at this type of resolution but in one case I saw it make a surprisingly significant improvement. I would also be tempted to put the structures through Arp/wArp and see if it lowers the R-free any more - rightly or wrongly I view this as the lowest reasonably achievable R-factor with isotropic modeling - and especially look at the maps after it has finished in case it shows up anything you had missed. When I had P21 - P2x212x twinning the R-free held up in the mid-30's at 2 Angstrom resolution so absent any indications in Truncate or Xtriage I wouldn't suggest that. A final question is how much disordered structure is missing from your models ? Could a partly ordered but unmodeled segment be driving up R-free ? Cheers Phil Jeffrey Princeton On 4/26/13 5:38 PM, Patrick Loll wrote: Hi all, Here is a problem that's been annoying me, and demanding levels of thought all out of proportion with the importance of the project: I have two related crystal forms of the same small protein. In both cases, the data look quite decent, and extend beyond 2 A, but the refinement stalls with statistics that are just bad enough to make me deeply uncomfortable. However, the maps look pretty good, and there's no obvious path to push the refinement further. Xtriage doesn't raise any red flags, nor does running the data through the Yeates twinning server. Xtal form 1: P22(1)2(1), a=29.0, b=57.4, c=67.4; 2 molecules/AU. Resolution of data ~ 1.9 Å. Refinement converges with R/Rfree = 0.24/0.27 Xtal form 2: P2(1)2(1)2(1), a=59.50, b=61.1, c=67.2; 4 molecules/AU. Resolution of data ~ 1.7 Å. Refinement converges w/ R/Rfree = 0.21/0.26 As you would expect, the packing is essentially the same in both crystal forms. It's interesting to note (but is it relevant?) that the packing is quite dense--solvent content is only 25-30%. This kind of stalling at high R values smells like a twin problem, but it's not clear to me what specific kind of twinning might explain this behavior. Any thoughts about what I might be missing here? Thanks, Pat --- Patrick J. Loll, Ph. D. Professor of Biochemistry Molecular Biology Director, Biochemistry Graduate Program Drexel University College of Medicine Room 10-102 New College Building 245 N. 15th St., Mailstop 497 Philadelphia, PA 19102-1192 USA (215) 762-7706 pat.l...@drexelmed.edu
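[Picking up the TLS suggestion at the top of Phil's reply, a minimal REFMAC5 sketch - keywords from memory, and groups.tls is a hypothetical TLS-group definition file (e.g. one group per chain):

refmac5 hklin data.mtz xyzin model.pdb tlsin groups.tls tlsout refined.tls \
        hklout refined.mtz xyzout refined.pdb <<EOF
refi tlsc 5
ncyc 10
end
EOF
]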
Re: [ccp4bb] Off-topic: PDB statistics
From my own db program:
Number of entries in histogram: 711
Total number of instances : 78467
 0  48249  0.6149  MOLECULAR REPLACEMENT
 1   8557  0.1091  NULL
 2   5632  0.0718  SAD
 3   5128  0.0654  MAD
 4   3600  0.0459  FOURIER SYNTHESIS
 5   1762  0.0225  OTHER
 6   1171  0.0149  MIR
 7    511  0.0065  SIRAS
 8    505  0.0064  DIFFERENCE FOURIER
 9    392  0.0050  MIRAS
10    229  0.0029  AB INITIO
11    226  0.0029  MR
12    151  0.0019  RIGID BODY REFINEMENT
13    146  0.0019  ISOMORPHOUS REPLACEMENT
14    110  0.0014  AB INITIO PHASING
15    109  0.0014  MULTIPLE ISOMORPHOUS
16     83  0.0011  N/A
17     75  0.0010  SIR
18     70  0.0009  RIGID BODY
19     64  0.0008  DIRECT METHODS
20     50  0.0006  RE-REFINEMENT USING
21     37  0.0005  DIFFERENCE FOURIER PLUS
22     36  0.0005  ISOMORPHOUS
23     34  0.0004  REFINEMENT
24     30  0.0004  MOLREP
25     26  0.0003  SE-MET MAD PHASING
26     25  0.0003  RIGID-BODY REFINEMENT
27     24  0.0003  ISOMORPHOUS METHOD
etc
It's a very heterogeneous field, that REMARK 3 field, and the ones above are the most dominant entries (note the 8,557 that are NULL that are in fact crystal structures). At least in some versions of ADIT the guidance that RCSB gives about this field is very weak, which accounts for the variation. I'm interested in what "ab initio phasing" really means, but I've been too lazy to mine the actual entries for details. Phil Jeffrey Princeton On 4/15/13 9:48 AM, Raji Edayathumangalam wrote: Hi Folks, Does anyone know of an accurate way to mine the PDB for what percent of total X-ray structures deposited as on date were done using molecular replacement? I got hold of a pie chart for the same from my Google search for 2006 but I'd like to get hold of the most current statistics, if possible. The PDB has all kinds of statistics but not one with numbers or percent of X-ray structures deposited sorted by various phasing types or X-ray structure determination methods. For example, an Advanced Search on the PDB site pulls up the following:
Total current structures by X-ray: 78960
48666 by MR
5139 by MAD
5672 by SAD
1172 by MIR
94 by MIR (when the word is completely spelled out)
75 by SIR
5 by SIR (when the word is completely spelled out)
That leaves about 19,000 X-ray structures either solved by other phasing methods (seems unlikely) or somehow unaccounted for in the way I am searching. Maybe the way I am doing the searches is no good. Does someone have a better way to do this? Thanks much. Raji -- Raji Edayathumangalam Instructor in Neurology, Harvard Medical School Research Associate, Brigham and Women's Hospital Visiting Research Scholar, Brandeis University
Re: [ccp4bb] refinement protein structure
That's quite brave - shipping your entire structure to people that could be actual competitors. But it was fun to play at 1.4 Angstrom over lunch.
Practical points:
* not everyone loves 12Mb of attachments in one email in their inbox, so if you do this again please put the files on a webserver and point us there
Structural points:
* the map looks pretty good, but I think the sequence is misassigned in some regions (e.g. A118-A122 etc). Automation is a good tool but a poor master, and extreme caution is required before taking the results too literally. Usually you'd expect a 1.4 Angstrom map to be easy to autobuild but I recently had a sequence misassignment at just that resolution. That map was trivial to interpret with the correct sequence however - one of the joys of working with Arp/wArp at 1.4 Angstrom.
* the large number of positive difference density blobs and water molecules clustered in what otherwise would be the solvent void strongly suggest that there's a second molecule present. If I take redfluorescentprotein_refine_10.pdb (waters removed) and exptl_fobs_phases_freeR_flags.mtz and ask Phaser to look for two molecules, it finds them quite successfully (for the record, an LLG of 15111 using nominal sequence identity of 90%). I will send this to you off-list. Please note that Phaser is using a different origin for this molecular replacement solution so the coordinates and your previous map do not overlap. This rather nicely explains why your structure had an R-factor in the 40's despite being a half-way decent model. The new MR solution has an R-free in the 30's in the phenix.refine job I'm running right now.
Going forward I suggest you utilize the Arp/wArp program to autobuild your structure for you, starting from the molecular replacement solution (or, perhaps, with it stripped to ALA). While you could use Autobuild, this is the CCP4 list and so you should use CCP4 programs. Phil Jeffrey Princeton On 3/27/13 12:22 PM, Tom Van den Bergh wrote: Dear members of ccp4bb, I need some help with the refinement of my structure of a variant of mRFP (monomer red fluorescent protein, sequence in attachment). I have done molecular replacement with phaser with model 2VAD of the protein database. Then i have done some model building with phenix.autobuild. (2 pdb's (overall...), freeR flags and log file attached) When i refine with phenix.refine my structure i get a R-value of 0,42 which is still way too high. (redfluorescent protein.pdb, .mtz and logfile attached) When i look at the structure in coot i find many unmodelled blobs and many outliers in density analysis and rotamer analysis. The problem is that there are so many problems with my structure, that i dont know where to begin. Could you try some refinement for me, because this is the first structure that i need to solve as a student and i dont have too much experience with it. Greetings, Tom
Re: [ccp4bb] vitrification vs freezing
Perhaps it's an artisan organic locavore fruit cake. Either way, your *crystal* is not vitrified. The solvent in your crystal might be glassy but your protein better still hold crystalline order (cf. ice) or you've wasted your time. Ergo, cryo-cooled is the description to use. Phil Jeffrey Princeton On 11/15/12 1:14 PM, Nukri Sanishvili wrote: s: An alternative way to avoid the argument and discussion all together is to use cryo-cooled. Tim: You go to a restaurant, spend all that time and money and order a fruitcake? Cheers, N.
Re: [ccp4bb] Compounded problems for molecular replacement
Hello Jose, Depending on what data integration program you used, trying XDS may help you out a little with spot overlap. Example #3 in my rather out-of-date page: http://xray0.princeton.edu/~phil/Facility/Guides/MolecularReplacement.html illustrates how you could find 8 domains, especially if you pay attention to the rotation angle values for the candidate domain solutions. This example did not have twinning but did have a little pseudo-centering. This is a 15 year-old example from back when I was using AMORE, so I should clearly rewrite that page. Additionally, if the inter-domain flexibility is restricted to rotation about a single axis, it would be a good idea to rotate your model so that this rotation axis is parallel to the Z axis. This was a method that was exploited with Fab structures (whose elbow angle is a fairly restricted rotation). If so oriented, rotation function peaks relating different domains in the same molecule should show very similar alpha, beta and differ in gamma. Good luck, Phil Jeffrey Princeton On 10/26/12 8:27 AM, Seijo, Jose A. Cuesta wrote: Hi all, I am dealing with a molecular replacement problem for a 60KDa protein composed of 2 rigid domains joined by a flexible linker which can move relative to each other. Sequence identity for my best model is 46%, evenly spread, so in principle this should be a tractable problem. Then the problems start to pile up: a) The unit cell is 56.7Å, 288.5Å, 69.4Å, 90, 93.5, 90. Spacegroup P21. Rmerge 12% to 2.4Å. The data also merges relatively well (Rmerge 16%) in P222 with the same a, c and b axes, now of course in that order. In the P21 case, that corresponds to 4 monomers in the asymmetric unit with a solvent content of approx. 50%, giving me 8 domains to find if I separate them. b) The 288 axis means that my data show some overlap in almost all orientations (might be corrected in the future with new datasets), so that my low resolution data are likely unreliable. c) Intensity distributions suggest twinning in either point group. Actually, they are beyond the perfect twinning case, which I attribute to the reflection overlaps making the strong reflections weaker (integration box too small) and the small ones stronger (from tails of adjacent strong ones). Of course the latter would mean that the twin fraction estimation is unreliable, but all moments, etc show perfect twin statistics, so I am assuming that there is indeed perfect twinning of some sort. So, the question is, what is the best strategy to deal with this many (4 or 8) body / noisy / twinned problem? I am trying EPMR with many bodies, but I suspect the twinning would throw it off the right track, and one domain seems to be too little of the diffracting matter to show any sort of discrimination between solutions and non-solutions if I do the usual serial searches. I plan to let autotracing programs be the judge of success, but I am not sure of how well those can deal with twinning. Can Arp-Warp use twinned data? Thanks in advance for any tips. Jose. Jose Antonio Cuesta-Seijo, PhD Carlsberg Laboratory Gamle Carlsberg Vej 10 DK-1799 Copenhagen V Denmark Tlf +45 3327 5332 Email josea.cuesta.se...@carlsberglab.dk
Re: [ccp4bb] Determining Rpim value
If the alternative to reprocessing your data with XDS, iMosflm, Xia2, autoProc etc is unpalatable, might I suggest the nearly-as-unpalatable method as follows: If you can still run Scalepack on all your .x files, put the line NO MERGE ORIGINAL INDEX in the scalepack script file. Get the .sca or .hkl file out of that. Use the following - strictly no warranties - script:
# Assumes scalepack.hkl is created with NO MERGE ORIGINAL INDEX
#
pointless SCAIN scalepack.hkl HKLOUT scalepack.mtz <<EOF
NAME PROJECT mydata CRYSTAL mydata1
CELL 75.2 75.2 135.8 90.000 90.000 90.000
EOF
#
#
scala hklin scalepack.mtz hklout scala.mtz \
      scales scala.scales \
      rogues scala.rogues \
      normplot scala.norm \
      anomplot scala.anom <<EOF
bins 20
resolution 2.9
run 1 all
resolution run 1 high 2.9
name run 1 project AUTOMATIC crystal DEFAULT dataset scalepack
scales constant
exclude sdmin 2.0
sdcorrection fixsdb noadjust norefine both 1.0 0.0
anomalous off
EOF
Corrections, comments or outright repudiation of this script quite welcome - this was my first attempt. Phil Jeffrey Princeton On 9/4/12 6:14 PM, Michelle Deaton wrote: I am trying to obtain an Rpim (precision indicating merging R-factor) value for a dataset that I have already processed with HKL2000/Scalepack and refined. Is there a straightforward way to obtain this value from my data? From what I understand, most of my options involve going back and obtaining unmerged intensities. I am hoping there may be a way for me to avoid having to backtrack that far, as this data is now very far along in the refinement process. Thank you, Michelle Deaton University of Denver Department of Chemistry and Biochemistry
Re: [ccp4bb] value of observed criterion sigma
HKL2000 does not have an observed criterion sigma (F) since Scalepack deals with intensities. Leave that entry blank. Scalepack uses observed criterion sigma (I) = -3. On question #2 you always want to quote the statistics (completeness, Rsym, I/sigI etc) for the highest resolution shell but I'm not sure it makes any sense to report it for the lowest resolution shell unless your data is unusually incomplete there. The default for PDB REMARK 200 is just the high resolution shell and the overall values for the entire dataset. Also be aware that last time I checked the I/sigI reported by Scalepack in the log file is <I>/<sigma(I)> and not <I/sigma(I)> for the shell. The PDB format in REMARK 200 wants the latter. One of these days one hopes RCSB might include Rmeas in REMARK 200. Phil Jeffrey Princeton On 7/31/12 8:54 AM, Faisal Tarique wrote: Dear all i have two basic queries 1) i have processed my data in HKL 2000 and during pdb submission i need to know the value of observed criterion sigma (F) and observed criterion sigma (I). 2) during entering data in the "resolution shell" category, whether one needs to mention the statistics of each and every resolution shell or whether only two entries, i.e. the maximum resolution and minimum resolution entries, are enough in the whole columns. -- Regards Faisal School of Life Sciences JNU
[ccp4bb] SCALA keywords for merging Scalepack (no merge original index) data ?
I'm not exactly a Scala veteran so am looking for advice as to what would be the best way to run Scala in the following scenario:
* data integrated with Denzo
* data scaled with Scalepack and output with NO MERGE ORIGINAL INDEX
* .sca data imported into MTZ via Pointless
Do I just use the SCALA keywords:
run 1 all
onlymerge
anomalous off
or would there be a better set of commands ? I've toyed with:
sdcorrection norefine
scales constant
bfactor off
reject merge 4
anomalous off
which produces similar results. NO MERGE ORIGINAL INDEX data is already scaled and Lp-corrected so I want to avoid applying things twice. Thanks Phil Jeffrey Princeton
Re: [ccp4bb] P21221 to P21212 conversion
The program that does the indexing in HKL is Denzo. Denzo doesn't care about the space group. It cares about the point group (cf. Ethan's point) and the cell dimensions, because it integrates the data without regard to the symmetry expressed in the intensities - however it does take notice of the restrictions placed on cell dimensions by point groups. Denzo therefore picks primitive orthorhombic cells in abc. Scalepack scales the integrated data but does not reindex the data if you tell it the space group is P22121. Therefore unit cell choice in HKL is by default driven by cell edge size. Scalepack has the ability to reindex the data, for those of us that like to work in P21212 rather than P22121. On Mon, May 7, 2012 at 3:33 PM, Ethan Merritt Scaling is done in a point group, not a space group. My quibble with this statement is that the output reflection data from Scalepack differs depending on what space group you tell it, since systematic absences along h00, 0k0 and 00l in P2x2x2x are not written out. The number of reflections affected is quite small, of course. Phil Jeffrey Princeton On 5/7/12 4:48 PM, Jacob Keller wrote: Is it true that HKL adopts the naming convention of putting the screw axes first and then naming abc if possible, whereas CCP4 just makes the cell abc? E.g., would HKL ever output by default a p22121 dataset, or would it automatically be p21212? JPK On Mon, May 7, 2012 at 3:33 PM, Ethan Merritt merr...@u.washington.edu mailto:merr...@u.washington.edu wrote: On Monday, May 07, 2012 01:09:25 pm Shya Biswas wrote: Hi all, I was wondering if anyone knows how to convert the P21221 to P21212 spacegroup in HKL2000. I scaled the data set in P21212 in HKL 2000 but I got a correct MR solution in P21221 spacegroup. Shya: Scaling is done in a point group, not a space group. The point group P222 contains both space groups P2(1)22(1) and P2(1)2(1)2, so your original scaling is correct in either case. It is not clear from your query which of two things happened: 1) The MR solution kept the same a, b, and c axis assignments but made a different call on whether each axis did or did not correspond to a 2(1) screw. In this case you don't need to do anything to your files. Just make sure that you keep the new space group as you go forward into refinement. 2) The MR solution kept the orginal screw-axis identifications but permuted the axes to the standard setting (non-screw axis is labelled c). In this case you will need to construct a file containing the permuted indices. For example, the reflection originally labeled (h=1 k=2 l=3) is now (h=3 k=1 l=2). There are several programs that can help you do this, including the HKL2000 GUI. But you do not need to go back into HKL if you don't want to. You could, for example, use the ccp4i GUI to select - Reflection Data Utilities - Reindex Reflections Define Transformation Matrix by entering reflection transformation h=l k=h l=k Ethan I have a script file that runs with scalepack but was wondering if there is an easier way to do it with HKL2000 gui mode. thanks, Shya -- Ethan A Merritt Biomolecular Structure Center, K-428 Health Sciences Bldg University of Washington, Seattle 98195-7742 -- *** Jacob Pearson Keller Northwestern University Medical Scientist Training Program email: j-kell...@northwestern.edu mailto:j-kell...@northwestern.edu ***
Re: [ccp4bb] indexing(?) question in P21
Wolfram, Did you solve these structures independently by molecular replacement ? It sounds like your two solutions might be related by alternative origins (0,1/2 along a,c). If you translate the second example along the a axis by -a/2 does it refine with similar R-factors ? (A PDBSET sketch follows this message.) Phil Jeffrey Princeton On 4/19/12 1:35 PM, wtempel wrote: Hello all, I am puzzled by this situation: I have two different crystals of the same protein, one in the presence, one in the absence of a ligand. Both structures refine nicely in space group P21. Cell constants (a,b,c,beta) are (i) 61,124,61,119 (a<c by a hair) and (ii) 59,125,61,118. There is a SINGLE protein molecule in the ASU. To facilitate future analysis and comparison between both structures, I have tried (incl. reindexing) to refine both structures with as similar as possible translational/rotational states. I failed to do any better than having them offset by approx. 32A exactly along the a-axis. Considering the pseudo-hexagonal cell and the extent of the offset being so close to a/2, c/2 or b/3, I have the feeling that I am missing something. What could it be? Thank you as always. Wolfram Tempel
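[The -a/2 translation as a PDBSET sketch - keyword spelling from memory; PDBSET takes the cell from the CRYST1 record and SYMGEN applies a fractional operator to all coordinates:

pdbset xyzin model2.pdb xyzout model2_shifted.pdb <<EOF
symgen X-1/2,Y,Z
EOF

If the shifted model refines with essentially the same R-factors, the two solutions were alternative-origin descriptions of the same structure.]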
Re: [ccp4bb] mtz2cif capable of handling map coefficients
Fc doesn't contain the weighting scheme used in the creation of the map coefficients, so Fc would require some sort of program to be run to recreate those for both 2Fo-Fc and Fo-Fc maps. By which time you might as well run a single cycle of the refinement program in question to generate new map coefficients - so I don't see the benefit of Fc. The map coefficients, on the other hand, are a checkpoint of the maps being looked at by the author at the time of deposition and don't require programs beyond a typical visualization program (i.e. Coot) to view. Phil Jeffrey Princeton On 4/5/12 12:00 PM, Ethan Merritt wrote: On Thursday, April 05, 2012 08:25:05 am Francis E Reyes wrote: It seems that deposition of map coefficients is a good idea. Does someone have an mtz2cif that can handle this? Maybe I missed something. What is accomplished by depositing map coefficients that isn't done better by depositing Fo and Fc? Ethan
Re: [ccp4bb] Using intrinsically bound Zn atoms for phasing
Self-referentially: I once used the structural Zn of p53 to do a Zn MAD structure of a p53:53BP1 complex at 2.5 Angstrom with one zinc per 450 residues. Apparently using 1.283, 1.282 and 1.262 Angstroms (i.e. the Zinc edge). http://genesdev.cshlp.org/content/16/5/583.long But of course do your own fluorescence scan. The advantage of structural metals is full occupancy and a relatively lower B-factor. That map was actually pretty good, and since it came out of MLPHARE I don't doubt modern programs like SHARP could make it quite a lot better. Phil Jeffrey Princeton On 3/6/12 3:09 PM, Deepthi wrote: Hi I am trying to solve the structure of an engineered protein. The protein is crystallized with Zn bound to it. We collected a 1.5 Å dataset. Molecular Replacement didn't yield a good match for the protein. I want to try MAD taking advantage of the Zn atoms in the protein. I am not sure what wavelength i should use to collect the diffraction data for Zn. any suggestions? Thank You Deepthi -- Deepthi
Re: [ccp4bb] Merging data collected at two different wavelength
Can I be dogmatic about this ?
"Multiwavelength anomalous diffraction" from Hendrickson (1991) Science Vol. 254 no. 5028 pp. 51-58
"Multiwavelength anomalous diffraction (MAD)" from the CCP4 proceedings http://www.ccp4.ac.uk/courses/proceedings/1997/j_smith/main.html
"Multi-wavelength anomalous-diffraction (MAD)" from Terwilliger Acta Cryst. (1994). D50, 11-16
etc.
I don't see where the problem lies: a SAD experiment is a single-wavelength experiment where you are using the anomalous/dispersive signals for phasing; a MAD experiment is a multiple-wavelength version of SAD. Hopefully one picks an appropriate range of wavelengths for whatever complex case one has. One can have SAD and MAD datasets that exploit anomalous/dispersive signals from multiple difference sources. This after all is one of the things that SHARP is particularly good at accommodating. If you're not using the anomalous/dispersive signals for phasing, you're collecting native data. After all C,N,O,S etc all have a small anomalous signal at all wavelengths, and metalloproteins usually have even larger signals, so the mere presence of a theoretical anomalous difference does not make it a SAD dataset. ALL datasets contain some anomalous/dispersive signals, most of the time way down in the noise. Phil Jeffrey Princeton On 1/18/12 12:48 PM, Francis E Reyes wrote: Using the terms 'MAD' and 'SAD' has always been confusing to me when considering more complex phasing cases. What happens if you have intrinsic Zn's, collect a 3wvl experiment and then derivatize it with SeMet or a heavy atom? Or the MAD+native scenario (SHARP) ? Instead of using MAD/SAD nomenclature I favor explicitly stating whether dispersive/anomalous/isomorphous differences (and what heavy atoms for each) were used in phasing. Aren't analyzing the differences (independent of source) the important bit anyway? F - Francis E. Reyes M.Sc. 215 UCB University of Colorado at Boulder
Re: [ccp4bb] should the final model be refined against full datset
Let's say you have two isomorphous crystals of two different protein-ligand complexes: same protein, different ligand, same xtal form. Conventionally you'd keep the same free set reflections (hkl values) between the two datasets to reduce biasing. However if the first model had been refined against all reflections there is no longer a free set for that model - all hkl's have seen the atoms during refinement - and so your R-free in the second complex is initially biased toward the model from the first complex. [*] The tendency is to do less refinement in these sorts of isomorphous cases than in molecular replacement solutions, because the structural changes are usually far less (it is isomorphous after all), so there's a risk that the R-free will not be allowed to fully float free of that initial bias. That makes your R-free look better than it actually is. This is rather strongly analogous to using different free sets in the two datasets. However I'm not sure that this is as big a deal as it is being made to sound. It can be dealt with straightforwardly. However refining against all the data weakens the use of R-free as a validation tool for that particular model, so the people that like to judge structures based on a single number (i.e. R-free) are going to be quite put out. It's also the case that the best model probably *is* the one based on a careful last round of refinement against all data, as long as nothing much changes. That would need to be quantified in some way(s). Phil Jeffrey Princeton [* Your R-free is also initially model-biased, to varying extents, in cases where the data are significantly non-isomorphous or you're using two different xtal forms] I still don't understand how a structure model refined with all data would negatively affect the determination and/or refinement of an isomorphous structure using a different data set (even without doing SA first). Quyen On Oct 14, 2011, at 4:35 PM, Nat Echols wrote: On Fri, Oct 14, 2011 at 1:20 PM, Quyen Hoang wrote: Sorry, I don't quite understand your reasoning for how the structure is rendered useless if one refined it with all data. "Useless" was too strong a word (it's Friday, sorry). I guess simulated annealing can address the model-bias issue, but I'm not totally convinced that this solves the problem. And not every crystallographer will run SA every time he/she solves an isomorphous structure, so there's a real danger of misleading future users of the PDB file. The reported R-free, of course, is still meaningless in the context of the deposited model. Would your argument also apply to all the structures that were refined before R-free existed? Technically, yes - but how many proteins are there whose only representatives in the PDB were refined this way? I suspect very few; in most cases, a more recent model should be available. -Nat
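On the mechanics of keeping a common free set: one way is to carry the FreeR_flag column from the first MTZ over to the second. A rough sketch with the gemmi Python library (a later tool than this thread; filenames and the FreeR_flag label are assumptions, and note the CCP4 convention that flag 0 marks the test set):

# Copy FreeR_flag from an isomorphous dataset onto new data so both
# complexes share the same test-set hkls.
import numpy as np
import gemmi

old = gemmi.read_mtz_file('complex1.mtz')           # has FreeR_flag
new = gemmi.read_mtz_file('complex2_noflags.mtz')

iflag = old.column_labels().index('FreeR_flag')
flags = {tuple(map(int, row[:3])): row[iflag]
         for row in np.array(old, copy=False)}

data = np.array(new, copy=False)
# reflections absent from the old file default to 1, i.e. the working set
# under the CCP4 convention where FreeR_flag == 0 is the test set
col = np.array([[flags.get(tuple(map(int, r[:3])), 1.0)] for r in data],
               dtype=np.float32)

new.add_column('FreeR_flag', 'I')
new.set_data(np.hstack([data, col]).astype(np.float32))
new.write_to_file('complex2_flags.mtz')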
Re: [ccp4bb] Apparent twinning in P 1 21 1
Yuri, Detwinning relies on having both twin-related reflections present to calculate either or both of the detwinned data values. Therefore it magnifies incompleteness, depending on where your missing data lie with respect to the twin operator. I'd recommend against trying to do this with a twin fraction close to 0.5. From the DETWIN docs: Itrue(h1) = ((1-tf)*Itw(h1) - tf*Itw(h2)) / (1-2tf) where tf = twin fraction. So 1/(1-2tf) becomes a large number, and it's multiplying a weighted term of the form (Itw(h1) - Itw(h2)), which becomes a very small number as the twin fraction approaches 0.5. The latter difference can easily be less than sigma(I), and so the signal/noise of your data plummets. Better to use REFMAC's and phenix.refine's abilities to compensate for the twin fraction directly in refinement and leave your data as they are. Phil Jeffrey Princeton On 9/29/11 10:03 AM, Yuri Pompeu wrote: After I ran DETWIN with the estimated 0.46 alpha, my completeness for the detwinned data is now down to 54%!!! Is this normal behavior? (I am guessing yes since the lower symmetry untwinned data is P1 21 1)
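A quick numerical illustration of that error propagation (a toy simulation written for this point, not anything from DETWIN itself):

# The detwinned intensity is a difference of two noisy observations scaled
# by 1/(1-2*tf), so sigma(Itrue) blows up as tf approaches 0.5.
import numpy as np

def detwin(i1, i2, tf):
    return ((1.0 - tf) * i1 - tf * i2) / (1.0 - 2.0 * tf)

rng = np.random.default_rng(0)
itrue1, itrue2, sigma, n = 1000.0, 900.0, 30.0, 100000
for tf in (0.10, 0.30, 0.45, 0.49):
    # simulate twinned observations with measurement noise
    o1 = (1 - tf) * itrue1 + tf * itrue2 + rng.normal(0, sigma, n)
    o2 = tf * itrue1 + (1 - tf) * itrue2 + rng.normal(0, sigma, n)
    print('tf=%.2f  sigma of detwinned I ~ %6.0f  (measurement sigma %.0f)'
          % (tf, detwin(o1, o2, tf).std(), sigma))

At tf = 0.49 the noise on the detwinned intensities is some 35 times the measurement noise, which is the plummet described above.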
[ccp4bb] Postdoctoral position at Princeton University
Postdoctoral position at Princeton University A position is available in the laboratory of Prof. Fred Hughson to apply biochemical and structural approaches to the study of bacterial cell-cell communication, also known as quorum sensing. We are especially interested in the receptors bacteria use to detect small-molecule signals emitted by other cells, and in identifying and characterizing antagonists that block communication. A strong background in biochemistry and/or x-ray crystallography is essential. Please e-mail cover letter, c.v., and names of three references to hugh...@princeton.edu.
Re: [ccp4bb] image file extensions
find . \( -name '*.osc' -o -name '*.img' \) -type f -size +3000 -print -exec bzip2 '{}' \; is a personal favorite, along those lines, with ample opportunities for customization. (If the above command line wraps, it's all supposed to be on one line. The escaped parentheses matter: find's implicit -and binds more tightly than -o, so without them the size test and the bzip2 would apply only to the *.img files. Also, -size +3000 is in 512-byte blocks, i.e. files bigger than about 1.5 MB.) Phil Jeffrey Princeton On 2/9/11 2:46 PM, David Schuller wrote: /bin/ls -lR | sort -nk5 | tail -40 will list the largest files in the directory tree. Those are probably the ones you need to compress.
Re: [ccp4bb] Let's talk pseudotranslational symmetry (or maybe it's bad data).
Is there a program that does? I was under the impression that they were all equally good/bad at this, because any solution that agrees with the PTS has quite a high score and any solution that doesn't has a low score, irrespective of the correctness of the placement of the molecules. In one case that ritually defeats me, with quite strong pseudo-centering, this seems to be true for heavy atom searches also. Phil Jeffrey Princeton On 2/9/11 5:08 PM, Jon Schuermann wrote: I would NOT use Phaser for MR with PTS present. It doesn't handle it correctly yet, since the likelihood targets don't account for PTS. Others may be able to explain it better.
Re: [ccp4bb] Looking for the following values...
On 1/13/11 2:48 PM, J. Fleming wrote: Hi All, I'm about ready to deposit my structure and have used pdb_extract to aid in the process. Unfortunately the following values were not found and are required by ADIT: 1) Under Data Collection, Reflections section: Observed criterion sigma(F) and Observed criterion sigma(I) There is no criterion for sigma(F) applied in Denzo/HKL2000, not least because data processing programs like Denzo and Scalepack work with intensities and not structure factor moduli. The default sigma(I) cutoff is -3. See: http://www.hkl-xray.com/hkl_web1/hkl/Scalepack_Keywords.html (keyword SIGMA CUTOFF) 2) Under Refinement, Refinement Statistics section: Number unique reflections (all) If your refinement program does not write it into the header of the PDB file, and the description of the value does not make immediate sense to you, omit it. Some of the requested values are defined rather vaguely, and a field matching this name doesn't show up in the REMARK 3 refinement template for PHENIX-derived PDB files (http://www.wwpdb.org/docs.html). I haven't deposited lately, but if I were to hazard a *guess* it might approximate to the number of reflections you would have used in refinement if you hadn't applied magnitude or sigma(F) cutoffs, and prior to PHENIX rejecting reflections as gross statistical outliers. One straightforward way to get this number would be to use CAD to write a new MTZ file containing only reflections within the resolution limits used in refinement, and look in the log file to see what the output reflection count was - assuming, of course, that the cell dimensions defined in your MTZ file are the same ones that you used in refinement. Refinement programs vary in their policy about handling reflections with |F|=0. The loss of reflections would manifest as a difference between the completeness in data collection and the completeness in refinement. Phil Jeffrey Princeton
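If you'd rather count directly than read CAD logs, a minimal sketch with the gemmi Python library (a modern tool, not something available to the original poster; the file name and limits are hypothetical):

# Count reflections inside the refinement resolution limits.
import gemmi

mtz = gemmi.read_mtz_file('refinement_data.mtz')
d = mtz.make_d_array()          # d-spacing of every reflection in the file
dmin, dmax = 2.0, 30.0          # high- and low-resolution limits used in refinement
n = int(((d >= dmin) & (d <= dmax)).sum())
print(n, 'reflections between', dmax, 'and', dmin, 'Angstrom')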
Re: [ccp4bb] finding I/Sigma(I) from HKL Scalepack
On 11/1/10 4:18 PM, Radisky, Evette S., Ph.D. wrote: Two questions: (1) Is this I/Sigma(I) what is generally reported in the literature for data processed with the HKL suite? Possibly. I say possibly because nobody appears to footnote their I/sigma(I) rows in their data processing tables, so it's impossible to tell which statistic they are reporting - the mean of the individual I/sigma(I) values and the ratio of mean I to mean sigma(I) travel under the same name. The editorial/proof-reading staff at journals aren't catching this ambiguity. I personally report the mean I/sigma(I), but wrote my own program to do it from the .sca files. Phil Jeffrey Princeton
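That program isn't reproduced here, but a rough Python equivalent might look like this - assuming the common merged .sca layout of three header lines followed by fixed-width records of h, k, l in 4-character fields and then I and sigma(I) in 8-character fields (a guess at the format; check against your own files):

# Overall mean I/sigma(I) from a merged scalepack .sca file.
import sys

ratios = []
with open(sys.argv[1]) as f:
    for line in list(f)[3:]:            # skip the three header lines
        try:
            i = float(line[12:20])      # intensity field
            sig = float(line[20:28])    # sigma field
        except ValueError:
            continue                    # tolerate short or malformed lines
        if sig > 0:
            ratios.append(i / sig)

print('mean I/sigma(I) = %.2f over %d reflections'
      % (sum(ratios) / len(ratios), len(ratios)))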
[ccp4bb] Faculty Position, Dept of Molecular Biology, Princeton University
Faculty Position Department of Molecular Biology Princeton University The Department of Molecular Biology at Princeton University invites applications for a tenure-track faculty position at the assistant professor level. We are seeking an outstanding investigator in the area of biochemistry and structural biology. We are particularly interested in candidates whose plans to address fundamental biological questions include the use of X-ray crystallography. Applicants must have an excellent record of research productivity and demonstrate the ability to develop a rigorous research program. All applicants must have a Ph.D. or M.D. with postdoctoral research experience and a commitment to teaching at the undergraduate and graduate levels. Applications must be submitted online at http://jobs.princeton.edu, requisition #1000770, and should include a cover letter, curriculum vitae and a short summary of research interests. We also require three letters of recommendation. All materials must be submitted as PDF files. For full consideration, applications should be received by December 1, 2010. Princeton University is an Equal Opportunity Employer and complies with applicable EEO and affirmative action regulations.
Re: [ccp4bb] Lousy diffraction at home but fantastic at the synchrotron?
Often this reflects crystal size - a small crystal in a big beam (or one with a long path in air) on a home source would see the small diffraction signal drop below the noise level quite quickly, often at the low resolution intensity dip that sits very approximately around 6 Angstrom. On a synchrotron source, with a tight low-divergence beam that more closely matches the crystal dimensions, that same crystal will appear to do rather better. Also one is more likely to expose the crystal longer (in terms of total photon numbers) at a synchrotron, which itself begets better signal/noise. Alternatively: everyone tries harder before synchrotron trips. Phil Jeffrey Princeton On 9/28/10 1:27 PM, Francis E Reyes wrote: Hi all, I'm interested in the scenario where crystals were screened at home and gave lousy diffraction (say 8-10A) but when illuminated with synchrotron radiation gave reasonable diffraction (3A). Why the discrepancy? Thanks F
Re: [ccp4bb] Deposition of riding H: R-factor is overrated
On 9/15/10 3:54 PM, Ed Pozharski wrote: Don't you agree that using the riding model does not add additional refinable parameters? (snip) ...instance, when hydrogens are added, the average N-H distance is 1.1(5), but upon refinement the value is down to 0.85998(4). So the riding hydrogen model is imperfect. At least with phenix.refine you can measure it, unlike the default behavior of REFMAC (but you can tell it to write hydrogens out, I believe). Obviously this question is not one amenable to a simple answer. In some sense (as per George) riding hydrogens are merely a restraint. In some other sense they are fundamentally a part of the model - they have very directional properties via bumping restraints that most certainly alter the atomic model for the heavy atoms in a very direct way via collision. Since the nature of these atoms - locationally specific - differs from the more amorphous extended-atom restraints (CH3E for a methyl in CNS, etc.), it could make sense to include them in the model at deposition. As far as I know we do not delete atoms from the final model that contribute to scattering and geometric restraints under any other circumstances, except perhaps in the nearly-as-contentious "how do I model my disordered side-chain" case. Also not amenable to a simple answer. Both approaches (REFMAC-esque and PHENIX-esque) have their merits. I doubt I'm the only person here conflicted over what to do about it. However this thread appears to have reached the point where not much new ground is being broken. Phil Jeffrey Princeton
[ccp4bb] Question: Refmac5 stats reported in pdb REMARK 3
Compare these two lines from phenix.refine:

REMARK 3 NUMBER OF REFLECTIONS : 46001
REMARK 3 FREE R VALUE TEST SET COUNT : 2339

with those from refmac, ostensibly using the same data and start pdb:

REMARK 3 NUMBER OF REFLECTIONS : 43672
REMARK 3 FREE R VALUE TEST SET COUNT : 2339

I know there are 46011 reflections with |F| > 0 in the files I used. phenix.refine removes 10 of these as outliers. The 46001 remaining reported in REMARK 3 *include* the test set. With REFMAC, 43672+2339=46011, so it appears that Refmac reports just the *working* set count in that first line, excluding the test set. Is this a bug with one program or the other, or a bug in the PDB definition of REMARK 3? http://www.wwpdb.org/documentation/format23/remark3.html This appears to be a source of inconsistency. phenix.refine 1.6-289, refmac5 5.4.0077 (I'm apparently a Luddite). Phil Jeffrey Princeton
Re: [ccp4bb] problem in calculation of elbow angle.
Inherently you want to calculate: 1. the approximately two-fold relationship between VH and VL; 2. the approximately two-fold relationship between CL and CH1. You can use many programs for that (e.g. LSQMAN), but ideally you want a program that will report direction cosines for the rotation axis in this superimposition. Particularly wacky CDR conformations could conceivably confuse automatic alignment programs, so you could delete those. You should check the superimposed alignments for sanity (e.g. correspondence of the disulfide bonds). Then calculate the elbow angle for the Fab from the dot product of the direction cosines of the VL:VH and CL:CH1 axes. Phil Jeffrey Princeton tarique khan wrote: Dear all, I am trying to calculate the elbow angle of my Fab structure using the online software developed by Robyn L. Stanfield et al., but it gives a solution with the following errors: WARNING: rotation matrix contains significant additional contributions WARNING: and deviates significantly from a pseudo-twofold: 0.721 WARNING: rotation matrix contains significant additional contributions WARNING: and deviates significantly from a pseudo-twofold: 0.726 WARNING: there have been deviations from expected values - please read the log above! (No guarantee that the calculated elbow angle is meaningful.) The elbow angle is probably 174.5 deg. Kindly suggest some other way of accurately calculating the elbow angle. regards. Tarique khan
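The dot-product step is small enough to show. A sketch with numpy (not the Stanfield server's code; the two input matrices would come from whatever superposition program you used, and note the axis directions carry a sign ambiguity, so conventions for which arc to report vary between programs):

# Elbow angle from the two pseudo-twofold rotation matrices:
# R_v superposes VL onto VH, R_c superposes CL onto CH1.
import numpy as np

def rotation_axis(R):
    # the rotation axis is the eigenvector of R with eigenvalue +1
    w, v = np.linalg.eig(R)
    axis = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return axis / np.linalg.norm(axis)

def elbow_angle(R_v, R_c):
    cosang = np.dot(rotation_axis(R_v), rotation_axis(R_c))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))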
Re: [ccp4bb] superimposing Mtz maps
This sounds a little like multi-crystal averaging without the averaging. If you were using the Uppsala program suite, you could perhaps do the following [define one xtal as reference, the other as target]: 1. Make a mask on the map grid for the reference xtal (program MAMA). 2. Establish the operator for the reference-to-target transformation, e.g. from protein superimposition with LSQMAN. 3. Improve this operator for the two density maps (program MAVE) - do not average the maps, which would otherwise be the usual step. 4. Expand the reference map into the target xtal, by lying to MAVE that the reference map is the averaged map (program MAVE). There also appears to be an EZ skewing option in MAVE. Make sure you make the mask around the unknown density large enough so that MAVE/Improve has some density to work with for optimization that isn't just the unknown blob. You could always use a larger mask for this step and a smaller one for the MAVE/Expand step. Phil Jeffrey Princeton Rana Refaey wrote: Hi, I have two maps from two different crystals with the same space group; both show an unknown density in the same place. I wanted to superimpose the maps to see if it is the same/similar density. Any ideas how to do this? Thank you, Regards Rana
Re: [ccp4bb] MAD wavelength
Always take the scan results ahead of the typical values unless they are obviously wrong. Only use the predicted values if the scan is broken or too weak (e.g. very small crystals), and in that case I'd be tempted to add 10-20 eV to the typical peak energy to make sure you weren't actually collecting the inflection point, since the two are typically very close in SeMet. In my NSLS X29-dominated data collections, I find I end up using something like this for non-oxidized SeMet: Peak: 12664 eV, 0.9790 Angstrom (usually in range 12662-12664); Infl: 12662 eV, 0.9792 Angstrom (usually in range 12660-12662). I also typically use a high energy remote: 12860 eV, 0.964 Angstrom, give or take a few eV. This tends to translate well between the relatively small number of beamlines that I personally end up using. But I always prefer to take the results from the Chooch analysis of the scan from the actual crystal. Cheers (and good luck) Phil Jeffrey Princeton Jerry McCully wrote: Dear All: Next week we are going to try some seleno-Met labeled crystals. We checked the literature to try to find out the peak wavelength that has been used for SAD or MAD data collection, but the values are slightly different (maybe 50 eV) in different papers. I guess this is due to the discrepancy between the fluorescence scan and the theoretical values of f' and f''. When we collect the data, which wavelength should we use? Should we trust the scanning results?
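For converting between the two sets of units, wavelength in Angstrom = 12398.4 / energy in eV (the standard hc conversion), e.g.:

# photon energy (eV) to wavelength (Angstrom)
def ev_to_angstrom(ev):
    return 12398.4 / ev

print(ev_to_angstrom(12664))   # ~0.9790, the Se peak value quoted above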
Re: [ccp4bb] multi-domain protein with identical tertiary structure
The cadmium-utilizing marine diatom carbonic anhydrase (CA) protein has three consecutive CA domains that have very similar structures but non-identical sequences. See: Structure and metal exchange in the cadmium carbonic anhydrase of marine diatoms. Xu Y, Feng L, Jeffrey PD, Shi Y, Morel FM. Nature. 2008 Mar 6;452(7183):56-61. Shankar Prasad Kanaujia wrote: Dear CCP4 users, Is there any multi-domain protein (with at least two domains) which has an identical tertiary structure for each domain? Thanking you. -regards shankar
Re: [ccp4bb] phasing with se-met at low resolution
However, do not get too excited if this resolution limit is 6 A. Although 6 A phases are better than no phases at all, have you ever LOOKED at a 6 A map? It can be very hard to tell if it is protein or not, even with perfect phases and all the right hand choices, etc. If the map is a 6 Angstrom SeMet map you may well be right, since if the signal goes to 6 Angstrom the data at 7 Angstrom isn't that hot either. However if this was a Ta6Br12 6 Angstrom map then it can look quite pretty for the resolution, because the 7 Angstrom SAD data in that case can be pretty good. Case in point is the one we collected for the PP2A ABC holoenzyme: it cleared up all sorts of things about the partial molecular replacement solution, including some reassurance that the desperation WD40 ensemble MR solution was actually correct. At 6 A, the WD40 looked somewhat like a bagel (or a Bundt cake if one is familiar), but the helices in one of the other subunits (A) were actually nicely resolved. Excitement may be warranted, even at 6 Angstrom. Phil Jeffrey Princeton
Re: [ccp4bb] pointless question
It also has the same/analogous bug in space group P3 with Pointless v1.2.10 - I wasn't sure if I was missing something obvious and went back to using my default combination of REINDEX/SCALEIT for the tests and reindexing. Phil Jeffrey Princeton Robert Nolte wrote: An output file is created on hklout with or without the -copy flag. The problem is why it is picking the unity matrix (solution #2) for the reindexing operator rather than the first matrix (-h -k l) that it identifies under reindexing (which is clearly the correct answer). -----Original Message----- From: Jan Abendroth Sent: May 6, 2009 2:45 PM To: Robert Nolte Subject: Re: [ccp4bb] pointless question Hi Bob, including a -copy flag might not be totally pointless: pointless -copy hklin ... Jan 2009/5/6 Robert Nolte: I'm hoping someone can help me with a pointless problem. I am trying to reindex data into an orientation that I used to solve the structure initially. While I can get pointless to give me the reindexing needed to make the new data match the old data for the project, when I ask it to write the data to HKLOUT it does not carry out the reindexing. I was under the impression from the documentation that it would write out the reindexed solution. Am I doing something wrong or have I found a bug in my particular space group? I seem to recall getting this to work on a different project in the past. I have also tried a number of different versions of pointless, and all give me the same results. The output file is shown below. Thanks in advance for any help. Regards, Bob Nolte

pointless hklin input.mtz hklref reference.mtz hklout reindex.mtz > pointless.log

### contents of pointless.log ###

CCP4 6.1: POINTLESS version 1.2.23 : 26/09/08
User: unknown   Run date: 28/4/2009   Run time: 13:48:05
Please reference: Collaborative Computational Project, Number 4. 1994. The CCP4 Suite: Programs for Protein Crystallography. Acta Cryst. D50, 760-763. as well as any specific reference in the program write-up.
OS type: linux   Release Date: 26th September 2008
POINTLESS 1.2.23 - Determine Laue group from unmerged intensities. Phil Evans, MRC LMB, Cambridge. Uses cctbx routines by Ralf Grosse-Kunstleve et al.

Reading reference data set from file reference.mtz
Maximum resolution in file reference.mtz: 1.810
Columns for F, sigF (squared to I): F_881 SIGF_881
Number of valid observations read: 18733
Highest resolution: 1.81
Unit cell: 72.66 72.66 65.98 90.00 90.00 120.00
Space group: P 3 2 1
Spacegroup information obtained from library file:
Logical Name: SYMINFO   Filename: /apps/ccp4/ccp4-6.1.0/lib/data/syminfo.lib

Maximum resolution in file input.mtz: 1.870
Columns for F, sigF (squared to I): F_880 SIGF_880
Number of valid observations read: 17028
Highest resolution: 1.87
Unit cell: 72.58 72.58 66.20 90.00 90.00 120.00
Space group: P 3 2 1

Possible alternative indexing schemes. Operators labelled "exact" are exact by symmetry. For inexact options, deviations are from the original cell (root mean square deviation between base vectors). Maximum accepted RMS deviation between test and reference cells (TOLERANCE) = 2.0
[h,k,l]    exact
[-h,-k,l]  exact

Normalising reference dataset: Log(I) fit for intensity normalisation: B (slope) -18.82
Normalising test dataset: Log(I) fit for intensity normalisation: B (slope) -18.21
Alternative indexing relative to reference file reference.mtz $TEXT:Result
Re: [ccp4bb] NCBHT: severe warning
Posting private emails on a public email list is rarely considered good form; in fact, on some email lists it would get you thrown off fairly quickly, especially considering your intended purpose. (Who is the list admin here?) If you're going to post semi-humorous, way-off-topic posts, you should consider tolerating a few ill-humored replies - at least that particular responder didn't post to the entire list. Phil Jeffrey Marius Schmidt wrote: Interesting, isn't it? :-), nice person. [rest of content removed]
Re: [ccp4bb] Mac pro
The Mac Pro is what I use for all my crystallography calculations. The vast majority of programs run, the one major sticking point being the older version of HKL, but I believe that HKL2000 may well run on OSX now. I use XDS and/or MOSFLM if I want to reprocess on this machine. With most packages the bad old days of actually having to edit the Makefile to install programs are past - you can either install via Fink (especially with all the fine work from Bill Scott) or via .dmg files, and failing that most packages will compile with not too much pain. Coot did make the CPU almost glow on my MacBook when I installed all the dependencies via Fink back in early 2008, however. The principal problem with the Mac Pro is that it is difficult to get it to run stereo - the one supported configuration, last time I checked, was absurdly expensive. If you need stereo (and can actually find a CRT monitor) Linux supports a wider array of options, but I enjoy the relatively seamless integration of a conventional desktop environment with Unix on the Mac. There are also options for virtualization of Windoze and Linux via software such as Parallels, although I have yet to test this out. OSX has minor quirks, like the filesystem's default case-insensitivity, which makes OSX treat e.g. the filenames MyJunkData.sca and myjunkdata.sca as the same file rather than following the expected Unix behavior. But in practice I rarely find this to be an issue. Phil Jeffrey Princeton Sheemei wrote: Dear all, I am thinking of getting an Apple Mac Pro desktop computer. I was wondering, do all crystallography programs run on it? I think there are Mac OSX versions of CCP4, CNS, SHELX etc., but how about programs in the Uppsala Software Factory etc.? Also, is it difficult to install these programs - are there problems? Is Linux still a safer choice? sheemei
Re: [ccp4bb] interface
Which brings up something about PISA. If I run PISA on PDB entry 2IE3, which I'm familiar with, I get the following numbers from PISA and CCP4's AREAIMOL (surface areas in Angstrom^2) for the A:C interface.

PISA for 2IE3, automatic A:C interface selection: 907.9 (a crystal packing interface is larger than this, but this surface is the A:C interface)

AREAIMOL, with some editing of 2IE3 to separate the chains:
Chain A        25,604.4
Chain C        11,847.4
Total          37,451.8
Chains A+C     35,576.6
Difference      1,875.2
Difference/2      937.6

For buried S.A. I agree with Steve Darnell's definition. However PISA appears to be reporting half that value, or what it calls "interface area". Potentially confusing. Phil Jeffrey Princeton Steven Darnell wrote: Sorry, that equation should read: Buried_Surface_Area = ASA_unbound1 + ASA_unbound2 - ASA_bound (ASA = Accessible Surface Area). The way I wrote it before would give you a negative value. Regards, Steve Darnell
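The same arithmetic can be reproduced with the freesasa Python module (a later tool than this thread; file names are hypothetical, with chains A and C split out of 2IE3 beforehand, and the absolute numbers will differ a little from AREAIMOL's since parameters and algorithms differ):

# Buried surface area = ASA(A) + ASA(C) - ASA(A+C complex)
import freesasa

def asa(path):
    return freesasa.calc(freesasa.Structure(path)).totalArea()

buried = asa('2ie3_A.pdb') + asa('2ie3_C.pdb') - asa('2ie3_AC.pdb')
print('buried: %.1f A^2; PISA-style interface area ~ %.1f'
      % (buried, buried / 2))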
[ccp4bb] Bug/feature in Phaser 2.1.1 over solution scoring/culling (long)
This is on OSX Tiger 10.4.11 on a G5 machine, Phaser 2.1.1, CCP4 6.0.2. Is anyone else seeing the following? In a feature that seems new-ish in Phaser, intermediate solutions get culled after the translation function and before the packing tests:

(begin snippet)
Purge solutions according to highest LLG from TFs in other spacegroups
Best Space Group so far: P 61 2 2
Percent used for purge = 0.75
Top LLG value for purge = 28.8034
Mean LLG used for purge = 6.03998
Cutoff LLG used for purge = 23.1125
Number of solutions stored before purge = 33
Number of solutions stored (deleted) after purge = 0 (33)
Purging of the results of the translation function in this spacegroup using the highest LLG value so far (from searches in other spacegroups) deleted all solutions
(end snippet)

which is all well and good, except in this particular case the solutions for P6122 have LLGs in the 14-16 range. What Phaser appears to be doing is picking up the best-case LLGs (and therefore LLG cutoffs) for translation function peaks from another space group (P622), none of which passed the packing criteria, and then inappropriately applying them to trial solutions in subsequent space groups that have lower LLGs but may in fact be better candidate solutions and survive the packing test. In the normal course of events, you'd hope that the best LLG corresponds to the correct solution in the correct space group, but I'll gladly concede that I'm using a marginal model with unimpressive data, and it'll probably fail anyway. This feature in Phaser would seem to potentially speed up correct solutions with good models and data when using SGALTERNATIVE ALL, but may in fact make the performance worse with bad models and poor data. Unless I've missed something here, the LLG score/cutoff test needs to be based on trial solutions that have survived the packing test, not on peaks from the translation function before that test. I'm using a fairly conventional script (not the GUI) in this case:

#
phaser << EOF
MODE MR_AUTO
HKLIN ptcr1680_pk_truncate.mtz
LABIN F=F SIGF=SIGF
TITLE Just a phaser script
COMPOSITION PROTEIN MW 28000 NUMBER 3
RESOLUTION 10. 3.2
SGALTERNATIVE ALL
ENSEMBLE ensemble1 PDBFILE helix_16.pdb IDENT 75.0
SEARCH ENSEMBLE ensemble1 NUMBER 1
ROOT phaser1
END
EOF
Re: [ccp4bb] Finding NCS operator from one heavy atom site? (long)
This almost does what you want, but not quite. To quote from the NCS6D manual: "NCS6D uses a set of BONES or PDB atoms as input and tries to find a set of rotations and translations which maximise the correlation coefficient between the density at the (BONES) atoms and those at the same atoms after application of the operator." So you cannot use a mask in NCS6D - you can in IMP. In the case where I did something like this, I could see a single helix near the SeMet sites, so I built this helix, then used the following script to find the first NCS relationship:

#!/bin/csh -f
#
/usr/local/Uppsala/rave_osx/osx_ncs6d << EOF
eden_400.ext
P
ncs6d_probe.pdb
1
p21212.sym
30.5 6.5 23.0
Y
0 359 10
0 179 10
0 359 10
-10 10 2
-10 10 2
-10 10 2
L
rt_best.o
EOF

Then I wrote a little C program that broke out each of the 100-or-so NCS operators in rt_best.o into files called rt_test_NN.o (NN = integer) and ran each and every one of them through IMP:

#!/bin/csh -f
#
foreach file (rt_test_*.o)
\rm LOG
/usr/local/Uppsala/rave_osx/osx_imp MAPSIZE 3500 << EOF >! LOG
eden_400.ext
model.mask
p21212.sym
$file
Automatic
1. .02
2.0 .1 .01 .0001
2
Proper
Complete
Quit
rt_test_new.o
EOF
set cc = `grep "Correlation Coefficient" LOG`
echo $file
echo $cc
end
#

I guess you could create a fur ball of Calpha positions for the initial model to force NCS6D into a sort of volume average - or peak-pick the map around the Se sites - I have not tried this. I found that without the IMP step there were too many similar and unimpressive solutions for the NCS operator, and the top one was not in general the correct one. This approach has the potential to consume quite a lot of CPU, but the initial map was relatively ugly and ultimately it worked rather well. Others might have more elegant ideas. Phil Jeffrey Princeton Partha Chakrabarti wrote: Hi, Apologies for a non-CCP4 question in the strict sense. I am trying to work out the NCS operators for a three-wavelength Se-MAD dataset which has only one site. The map is hardly interpretable. I came across the USF Rave package, and what I am aiming to do is create a mask around the heavy atom site (found by SHELX or Solve) using mama or so (ideally from resolve.mtz but not necessarily), translate it to the other heavy atom site(s), do a 6D search with NCS6D and perhaps refine the best CC a bit with IMP. If it works, I could try to use the NCS operator in DM or Resolve etc. I was wondering if someone already has C-shell scripts for dealing with such a situation. Of course if there are other programs for such a task within CCP4, I could give it a try. Best Regards, Partha
Re: [ccp4bb] CNS 1.2.2 binary running out of memory
You need to use the syntax:

unlimit stacksize
unlimit datasize
unlimit memoryuse

and I have these in my .cshrc. I can get this under OSX 10.5 (albeit on an old G5 chip machine):

cputime      unlimited
filesize     unlimited
datasize     unlimited
stacksize    65532 kbytes
coredumpsize unlimited
memoryuse    unlimited
descriptors  256
memorylocked unlimited
maxproc      266

In the above there's an undesirable unlimited core dump size because I have this account set up for debugging. On 10.4 on similar hardware I get:

cputime      unlimited
filesize     unlimited
datasize     unlimited
stacksize    65536 kbytes
coredumpsize 0 kbytes
memoryuse    unlimited
descriptors  256
memorylocked unlimited
maxproc      100

Hope this helps, Phil Jeffrey Princeton hari jayaram wrote: Hi, since I am not on the cnsbb yet I am posting this here. I downloaded the cns 1.2.2 intel build and was trying to run a simulated annealing refinement on my MacBook Pro (Intel) running 10.5.2. However the annealing job crashes roughly 40 minutes into the refinement with the following message: "There is not enough memory available to the program. This may be because of too little physical memory (RAM) or too little swap space on the machine. It could also be the result of user or system limits. On most Unix systems the limit command can be used to check the current user limits. Please check that the datasize, memoryuse and vmemoryuse limits are set at a large enough value." Unfortunately on Leopard it seems that unlimit and limit are not available under bash. Further, when I use csh, I get the following values for the limits:

[mango:~/aps_04_21_2008/p10_2] hari% limit
cputime      unlimited
filesize     unlimited
datasize     6144 kbytes
stacksize    8192 kbytes
coredumpsize 0 kbytes
memoryuse    unlimited
descriptors  256
memorylocked unlimited
maxproc      266

In the same csh shell unlimit returns:

[mango:~/aps_04_21_2008/p10_2] hari% unlimit
unlimit: descriptors: Can't remove limit (Invalid argument)

[snip]
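For the bash side of the question, bash spells these limits with the ulimit builtin rather than csh's limit/unlimit (standard bash options; the exact hard caps vary by OS, and the Mac may refuse some of them):

# bash equivalents of the csh unlimit commands, e.g. in ~/.bashrc
ulimit -d unlimited   # datasize
ulimit -s unlimited   # stacksize (the OS may enforce a hard cap)
ulimit -m unlimited   # memoryuse
ulimit -a             # list the current limits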
Re: [ccp4bb] an over refined structure
Here I will disagree. R-free rewards you for putting an atom in density where an atom belongs. It doesn't necessarily reward you for putting the *right* atom in that density, but it does become difficult to do that under normal circumstances unless you have approximately the right structure. However in the case of multi-copy refinement at low resolution, the refinement is perfectly capable of shoving any old atom into density corresponding to any other old atom if you give it enough leeway. Remember that there's a big difference between R-free for a single copy (45%) and a 16-fold multicopy (38%) in MsbA's P1 form, and almost the same amount (41% vs 33%) with MsbA's P21 form. (These are E. coli and V. cholerae respectively.) Both single-copy and multicopy refinements were NCS-restrained, as far as I know. So there's evidence, without simulation, that the 12-fold or 16-fold multicopy refinements are worth 7-8% in R-free, and I'm doubtful that NCS can generate that sort of gain in either crystal form. I've certainly never seen that in my own experience at low resolution. I've been meaning to put online the Powerpoint from the CCP4 talk with all these numbers in it, but I regret it's sitting on my iBook at home as of writing. Phil Jeffrey Dean Madden wrote: It is true that multicopy refinement was essential for the suppression of Rwork. However, the whole point of the Rfree is that it is supposed to be independent of the number of parameters you're refining. Simply throwing multiple copies of the model into the refinement shouldn't have affected Rfree, IF IT WERE TRULY FREE. It was almost certainly NCS-mediated spillover that allowed the multicopy, parameter-driven reduction in Rwork to pull down the Rfree values as well. The experiment is probably not worth the time it would take to do, but I suspect that if the MsbA and EmrE test sets had been chosen in thin shells, then Rfree wouldn't have shown nearly the improvement it did. Dean
Re: [ccp4bb] an over refined structure
While NCS probably played a role in the first crystal form of MsbA (P1, 8 monomers), this is also the one that showed the greatest improvement in R-free once the structure was correctly redetermined (7% or 14% depending on which refinement protocols you compare). The other crystal form of MsbA and the crystal forms of EmrE didn't have particularly high-copy NCS (2 dimers, 4 monomers, dimer, 2 tetramers) and the R-frees were somewhat comparable in all cases (31-36% for the redetermined structures). The *major* source of the R-free suppression in all these cases was the inappropriate use of multi-copy refinement at low resolution. Phil Jeffrey Princeton Dean Madden wrote: Hi Dirk, I disagree with your final sentence. Even if you don't apply NCS restraints/constraints during refinement, there is a serious risk of NCS contaminating your Rfree. Consider the limiting case in which the NCS is produced simply by working in an artificially low symmetry space-group (e.g. P1, when the true symmetry is P2): in this case, putting one symmetry mate in the Rfree set, and one in the Rwork set will guarantee that Rfree tracks Rwork. The same effect applies to a large extent even if the NCS is not crystallographic. Bottom line: thin shells are not a perfect solution, but if NCS is present, choosing the free set randomly is *never* a better choice, and almost always significantly worse. Together with multicopy refinement, randomly chosen test sets were almost certainly a major contributor to the spuriously good Rfree values associated with the retracted MsbA and EmrE structures. Best wishes, Dean Dirk Kostrewa wrote: Dear CCP4ers, I'm not convinced that thin shells are sufficient: I think, in principle, one should omit thick shells (greater than the diameter of the G-function of the molecule/assembly that is used to describe NCS-interactions in reciprocal space), and use the inner thin layer of these thick shells, because only those should be completely independent of any working set reflections. But this would be too expensive given the low number of observed reflections that one usually has ... However, if you don't apply NCS restraints/constraints, there is no need for any such precautions. Best regards, Dirk. Am 07.02.2008 um 16:35 schrieb Doug Ohlendorf: It is important when using NCS that the Rfree reflections be selected in thin resolution shells. That way application of NCS should not mix Rwork and Rfree sets. Normal random selection of Rfree + NCS (especially 4x or higher) will drive Rfree down unfairly. Doug Ohlendorf -----Original Message----- From: CCP4 bulletin board [mailto:[EMAIL PROTECTED]] On Behalf Of Eleanor Dodson Sent: Tuesday, February 05, 2008 3:38 AM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] an over refined structure I agree that the difference in Rwork to Rfree is quite acceptable at your resolution. You cannot/should not use R-factors as a criterion for structure correctness. As Ian points out - choosing a different Rfree set of reflections can change Rfree a good deal. Certain NCS operators can relate reflections exactly, making it hard to get a truly independent free R set, and there are other reasons to make it a blunt-edged tool. The map is the best validator - are there blobs still not fitted? (maybe side chains you have placed wrongly..) Are there many positive or negative peaks in the difference map? How well does the NCS match the 2 molecules? etc etc. Eleanor George M.
Sheldrick wrote: Dear Sun, If we take Ian's formula for the ratio of R(free) to R(work) from his paper Acta D56 (2000) 442-450 and make some reasonable approximations, we can reformulate it as: R(free)/R(work) = sqrt[(1+Q)/(1-Q)] with Q = 0.025*p*d^3*(1-s), where s is the fractional solvent content, d is the resolution, p is the effective number of parameters refined per atom after allowing for the restraints applied, d^3 means d cubed and sqrt means square root. The difficult number to estimate is p. It would be 4 for an isotropic refinement without any restraints. I guess that p=1.5 might be an appropriate value for a typical protein refinement (giving an R-factor ratio of about 1.4 for s=0.6 and d=2.8). In that case, your R-factor ratio of 0.277/0.215 = 1.29 is well within the allowed range! However it should be added that this formula is almost a self-fulfilling prophecy. If we relax the geometric restraints we increase p, which then leads to a larger 'allowed' R-factor ratio! Best wishes, George
Re: [ccp4bb] an over refined structure
If you think about it, there is an analogy to relaxing geometrical constraints, which also allows the refinement to put atoms into density. The reason it usually doesn't help Rfree is that the density is spurious. At least some of the incorrect structure determinations of the early 90's (that spurred the introduction of Rfree etc.) had high rms deviations, suggesting that this is how the overfitting occurred. Nevertheless, once hit with a bit of simulated annealing, the Rfree values of such models deteriorated significantly. If memory serves, the incorrect structures of the 1990's would have had relaxed geometry precisely because they needed to do that to reduce R, and R used to be the primary indicator of structure quality in the days before R-free was introduced. There's quite a big difference between the latitude afforded by relaxing geometry and the degree of freedom allowed by multicopy refinement. Simply increasing the RMS bond length deviations from 0.012 to 0.035 Angstrom would move atoms on average by only a fraction of a bond length, which is not really enough to jump between different atom locations. In any event, the MsbA statistics can be simply explained from an expectation of what happens if you overfit your (wrong) structure using techniques inappropriate for the resolution: R-work goes down; R-free goes down less; (R-free - R-work) goes up. And this happens in general with the use of multicopy refinement at anything less than quite high resolution - I'm thinking in particular of a comment in Chen & Chapman (2001) Biophys J vol. 8, 1466-1472. So I see no reason to suggest NCS is having a particularly extreme, perhaps unprecedented, effect. Phil Jeffrey (still working on converting Micro$loth Powerpoint to html)
Re: [ccp4bb] Missing scatter deffinition in CNS
According to the error message your offending atom has type chemical=FPAF. scatter.lib assigns scattering factors based on chemical type, and there are entries for F and F-1 but of course not FPAF - this is likely the source of your problem. The quick fix is to make your own copy of scatter.lib, add an entry for FPAF, and edit the CNS input files that reference scatter.lib to pick up the local copy. Phil Jeffrey Princeton Jian Wu wrote: Dear all, I am refining a structure in which there is a fluorine atom in the inhibitor. When I run the energy minimization in CNS, an unusual error happens for this atom: Program version= 1.1 File version= 1.1 CONNECt: selected atoms form 9 covalently disconnected set(s) list of isolated (non-covalently bonded) atoms: --none-- list of isolated (non-covalently bonded) di-atomic molecules: --none-- %XRASSOC-ERR: missing SCATter definition for ( $RX4 300 FAF ) chemical=FPAF %XRASSOC error encountered: missing SCATter definition for SELEcted atoms. (CNS is in mode: SET ABORT=NORMal END) * ABORT mode will terminate program execution. * Program will stop immediately. I have checked the topology file, the parameter file, and the scatter.lib file, but found nothing unusual in these files. Has anyone ever encountered this problem before? Any suggestion would be welcome, and thank you in advance! Best Regards, Jian Wu -- Jian Wu Ph.D. Student Institute of Biochemistry and Cell Biology Shanghai Institutes for Biological Sciences Chinese Academy of Sciences (CAS) Tel: 0086-21-54921117
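Concretely, a sketch of the scatter.lib idiom (from memory, not verified against any particular CNS release - take the nine Cromer-Mann coefficients from the existing fluorine entry rather than trusting anything typed here):

SCATter ( chemical FPAF ) a1 b1 a2 b2 a3 b3 a4 b4 c
! where a1 ... c are copied verbatim from the existing
! SCATter ( chemical F ) line in scatter.lib

Alternatively, renaming the atom's chemical type to plain F in the topology/parameter files avoids touching scatter.lib at all.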
Re: [ccp4bb] B-factor Space gr questions!
Wouldn't the desirability of this depend on the extent to which the molecule has moved between the high-resolution and low-resolution datasets? I would have thought that there was an effective information transfer between R-work and R-free once the rigid body movements became too large, which might provide one with an over-optimistic idea of what the R-free would be with the high-resolution model against the low-resolution data. Phil Princeton NJ Edward A Berry wrote: Even if the free-R set is not preserved for the new crystal, R and R-free tend to diverge rapidly once any kind of fitting with a low data/parameter ratio is performed, so I think the new structure must not have been refined much beyond rigid body (and overall B, which is included in any kind of refinement). And that choice may be well justified. Ed cdekker wrote: Hi, Your reply to the ccp4bb has confused me a bit. I am currently refining a low-res structure and realise that I don't know what to expect for final R and Rfree - it is definitely not what most people would publish. So the absolute values of R and Rfree are not telling me much; the only gauge I have is that as long as both R and Rfree are decreasing I am improving the model (and yes, at the moment that is only rigid body refinement). In your email reply you suggest that a refinement to convergence, even though it will lead to an increased Rfree (and lower R - a classic case of overfitting!), would give a better model than the rigid-body-refined-only model. This is what confuses me. I can see your reasoning that starting with an atomic model to solve low-res data can lead to this behaviour, but then should the solution not be a modification of the starting model (maybe high B-factors?) to compensate for the difference in resolution of model and data? Carien On 4 Jun 2007, at 19:38, Edward A Berry wrote: Ibrahim M. Moustafa wrote: The last question: In the same paper, for the complex structure R and Rfree are equal (30%) - is that an indication of improper refinement in these published structures? I'd love to hear your comments on that too. Several times I solved low resolution structures using high resolution models, and noticed that R-free increased during atomic positional refinement. This could be expected from the assertion that after refinement to convergence, the final values should not depend on the starting point: if I had started with a crude model and refined against low resolution data, Rfree would not have gone as low as with the high-resolution model, so if I start with the high resolution model and refine, Rfree should worsen to the same value as the structure converges to the same point. Thinking about the main purpose of the Rfree statistic, in a very real way this tells me that the model was better before this step of refinement, and it would be better to omit the minimization step. Perhaps this is what the authors did. On the other hand it does not seem quite right to submit a model that has simply been rigid-body-refined against the data - I would prefer to refine to convergence and submit the best model that can be supported by the data alone, rather than a better model which is really the model from a better dataset repositioned in the new crystal. Ed
Re: [ccp4bb] Stop Refmac from refining B factors?
Harry M. Greenblatt wrote: You should be refining an overall temperature factor at that resolution. It's one of the choices in the list, instead of isotropic. I disagree with this. At that (3.2 Angstrom) resolution I've often found that a tightly restrained individual B-factor refinement gives a significantly lower R-free than a single overall B-factor. I also prefer it to grouped B-factors in CNS, because the latter are not geometrically restrained and show a lot of physically unreasonable waywardness (although often a similar R-free to individual B's). Individual B's can also be restrained by non-crystallographic symmetry, and as far as I can tell grouped B's are not. I think one has to explore all possibilities rather than take one fixed approach to working at modest resolutions, and the optimal solution is likely to be different for different structures. Phil Jeffrey Princeton, NJ Hi, I have a little problem with B-factor refinement. I'm using the CCP4i interface, Refmac 5.2.0019, a resolution of 30-3.2 A (I tried 8-3.2 A as well, it doesn't make a big difference for this problem), and a current Rfree of 30.4%. Refmac refines the B-factors so that they are nearly the same for main chain and side chain, and I don't like that (or could it make sense in any way?). Moreover, my structure is a protein complex, and Refmac is mainly doing this for one component of the complex. If I take the B-factors from the original uncomplexed protein (around 18, 1.75 A) and add 44 to them with moleman to get them in the range they are in the complex, Refmac flattens them remarkably in only 5 cycles of restrained refinement. Does anyone have an explanation for this? I am pretty sure that the complex components are in the right place; I see beautiful density and everything I should see at this resolution. Here is what I tried further: * I de-selected "Refine isotropic temperature factors" in the Refmac interface. There was no REFI BREF ISOT any more in the com file. But there was also no difference in the B-factors compared to when there _was_ REFI BREF ISOT in the com file... So does Refmac just _ignore_ my wish not to refine B-factors? (The REFI keywords were as follows: type REST - resi MLKF - meth CGMAT - is there any B-factor thing hidden in this?) * I played around with the geometric parameters. If I select the B-factor values there (the keywords are TEMP|BFAC wbskal sigb1 sigb2 sigb3 sigb4), it does not make _any_ difference what values I fill in there; the resulting B-factors are always the same (but different from when I don't use the TEMP keyword, and even flatter). Default for WBSCAL is 1.0; I tried 10, 1.0, 0.1, 0.01, and the equivalent numbers for the sigbs. Thanks for any thoughts on this, Eva - Harry M. Greenblatt Staff Scientist Dept of Structural Biology Weizmann Institute of Science Phone: 972-8-934-3625 Rehovot, 76100 Facsimile: 972-8-934-4159 Israel
Re: ccp4bb on new site
As far as the subject header line is concerned, ye olde ListServ command: SET CCP4BB SUBJECTHDR would probably work if one emailed it to the server (i.e. [EMAIL PROTECTED], *not* CCP4BB@JISCMAIL.AC.UK), or you can do it via the web interface. It appears that the mail/web command interface will not let you change the Reply-To feature. Phil Jeffrey Kjeldgaard Morten wrote: Unfortunately, it appears that JISCMAIL is using the outdated LISTSERV software to run its mailing lists, so there is not much hope of getting such things as the [ccp4bb] subject tag and reply-to-sender features back :-( Morten --Morten Kjeldgaard, asc. professor, MSc, PhD