Re: [ccp4bb] outliers

2022-11-08 Thread Dale Tronrud
   The second part of your question has to do with assessing the 
probability of correctness of a model by comparing the distribution of 
the individual values of geometry items with the distribution observed 
in large sets of high quality crystal structures.  Certainly, if your 
model has many more large deviants than expected from the observed 
distribution of deviants in quality models I would have doubts about it. 
 (I would also like to say that too few large deviants is a mark of 
shame too, but read on.)


   Actually, this is nothing more than comparing the rmsd bond lengths 
and rmsd bond angles with the rmsd's of the restraint library.  You are 
basically fitting a Normal distribution to both sets of observations and 
comparing their sigmas.  Remember when we used to do that?  We still do, 
implicitly, when we publish these rmsd's in Table 1.


   What we have learned is that a model with rmsd's that are too large 
is certainly suspect, but people only rarely produce such models any 
more.  The real complication is that we, as a community, have decided 
based on other criteria that it is best for our models to have rmsd's 
for geometry that are much smaller than the rmsd's of our restraint 
libraries.


   The rmsd bond length of the quality models that I've seen tends to be 
around 0.02 A.  Looking in the PDB we tend to prefer 0.01 A and often 
less.  There are good reasons for this, based on the fact that low 
resolution data cannot define the correct values of the deviants and in 
that case we prefer to have deviants that are too small than deviants 
that have the correct magnitude distribution but are not related to the 
"real" deviants on a bond-by-bond basis.  (SigmaA weighting comes to 
mind as a similar solution to a similar problem.)


   If we assess the reliability of our models by looking to see if the 
distribution of deviants matches that of the library, all of our models 
will be flagged as extremely unlikely.  Does that mean that matching the 
distributions will improve the model, as measured by the reliability of 
the individual or relative locations of the atoms?  I don't think so.


Dale E. Tronrud

On 11/8/2022 3:25 PM, James Holton wrote:

Thank you Ian for your quick response!

I suppose what I'm really trying to do is put a p-value on the 
"geometry" of a given PDB file.  As in: what are the odds the deviations 
from ideality of this model are due to chance?


I am leaning toward the need to take all the deviations in the structure 
together as a set, but, as Joao just noted, it just "feels wrong" 
to tolerate a 3-sigma deviate.  Even more wrong to tolerate 4 sigma, 5 
sigma. And 6 sigma deviates are really difficult to swallow unless you 
have trillions of data points.


To put it down in equations, is the p-value of a structure with 1000 
bonds in it with one 3-sigma deviate given by:


a)  p = 1-erf(3/sqrt(2))
or
b)  p = 1-erf(3/sqrt(2))**1000
or
c) something else?



On 11/8/2022 2:56 PM, Ian Tickle wrote:

Hi James

I don't think it's meaningful to ask whether the deviation of a single 
bond length (or anything else that's single) from its expected value 
is significant, since as you say there's always some finite 
probability that it occurred purely by chance.  Statistics can only 
meaningfully be applied to samples of a 'reasonable' size.  I know 
there are statistics designed for small samples but not for samples of 
size 1 !  It's more meaningful to talk about distributions.  For 
example if 1% of the sample contained deviations > 3 sigma when you 
expected there to be only 0.3 %, that is probably significant (but it 
still has a finite probability of occurring by chance), as would be 
finding no deviations > 3 sigma (for a reasonably large sample to 
avoid sampling errors).
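
The comparison of observed and expected outlier fractions described above can be sketched as a binomial tail calculation. This is only an illustration under stated assumptions: the deviates are independent and Gaussian, the sample holds 1000 bonds, and the helper names (binom_pmf, binom_tail) are made up for this example rather than taken from any validation program.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def binom_tail(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return 1.0 - sum(binom_pmf(i, n, p) for i in range(k))

# Two-sided Gaussian probability of exceeding 3 sigma (about 0.27%).
P3 = 1.0 - math.erf(3.0 / math.sqrt(2.0))

N_BONDS = 1000  # assumed sample size, not a value from the thread's data

# 1% of 1000 bonds beyond 3 sigma means 10 or more outliers.
p_ten_or_more = binom_tail(10, N_BONDS, P3)
# The opposite extreme: not a single deviate beyond 3 sigma.
p_none = binom_pmf(0, N_BONDS, P3)

print(f"P(>= 10 of {N_BONDS} beyond 3 sigma) = {p_ten_or_more:.2e}")
print(f"P(none of {N_BONDS} beyond 3 sigma)  = {p_none:.3f}")
```

Under these assumptions both outcomes are improbable, the first far more so than the second, which is the sense in which either an excess or a complete absence of 3-sigma deviates can be informative.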


Cheers

-- Ian


On Tue, Nov 8, 2022, 22:22 James Holton  wrote:

OK, so let's suppose there is this bond in your structure that is
stretched a bit.  Is that for real? Or just a random fluke?  Let's say
for example it's a CA-CB bond that is supposed to be 1.529 A long, but in
your model it's 1.579 A.  This is 0.05 A too long. Doesn't seem like
much, right? But the "sigma" given to such a bond in our geometry
libraries is 0.016 A.  These sigmas are typically derived from a
database of observed bonds of similar type found in highly accurate
structures, like small molecules. So, that makes this a 3-sigma outlier.
Assuming the distribution of deviations is Gaussian, that's a pretty
unlikely thing to happen. You expect 3-sigma deviates to appear less
than 0.3% of the time.  So, is that significant?

But, then again, there are lots of other bonds in the structure. Let's
say there are 1000. With that many samplings from a Gaussian
distribution you generally expect to see a 3-sigma deviate at least
once.  That is, do an "experiment" where you pick 1000 Gaussian-random
numbers from a distribution with a standard deviation 

Re: [ccp4bb] outliers

2022-11-08 Thread Kay Diederichs
On Tue, 8 Nov 2022 15:25:03 -0800, James Holton  wrote:

>Thank you Ian for your quick response!
>
>I suppose what I'm really trying to do is put a p-value on the
>"geometry" of a given PDB file.  As in: what are the odds the deviations
>from ideality of this model are due to chance?
>
>I am leaning toward the need to take all the deviations in the structure
>together as a set, but, as Joao just noted, it just "feels wrong"
>to tolerate a 3-sigma deviate.  Even more wrong to tolerate 4 sigma, 5
>sigma. And 6 sigma deviates are really difficult to swallow unless you
>have trillions of data points.
>
>To put it down in equations, is the p-value of a structure with 1000
>bonds in it with one 3-sigma deviate given by:
>
>a)  p = 1-erf(3/sqrt(2))
>or
>b)  p = 1-erf(3/sqrt(2))**1000
>or
>c) something else?
>

p = 1-erf(3/sqrt(2))**1000 (= 0.933 thus quite likely to happen) is the p-value 
of a structure with 1000 bonds in it, with one or more deviations > 3-sigma. 
(the words after the comma differ from how you express it)

But keep in mind: one person's outlier may be another person's Nobel Prize!

Best,
Kay

P.S. I used 1-erf(3/sqrt(2))^1000 at wolframalpha.com for the numerical 
calculation
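
For anyone who prefers to check the arithmetic locally, options (a) and (b) can be evaluated with the Python standard library. A minimal sketch, assuming independent Gaussian deviates; the function names are illustrative only.

```python
import math

def p_single(k_sigma):
    """Two-sided probability that one Gaussian deviate exceeds k_sigma."""
    return 1.0 - math.erf(k_sigma / math.sqrt(2.0))

def p_at_least_one(k_sigma, n):
    """Probability that at least one of n independent deviates exceeds k_sigma."""
    # erf(k/sqrt(2)) is the chance that a single deviate stays within k sigma;
    # raising it to the n-th power gives the chance that all n stay within.
    return 1.0 - math.erf(k_sigma / math.sqrt(2.0)) ** n

p_a = p_single(3.0)              # option (a): one bond considered in isolation
p_b = p_at_least_one(3.0, 1000)  # option (b): any of 1000 bonds

print(f"(a) p = {p_a:.4f}")  # about 0.0027
print(f"(b) p = {p_b:.3f}")  # about 0.933
```

Note that `**` binds more tightly than the subtraction, so expression (b) is 1 minus the 1000th power, as intended.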



To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/


Re: [ccp4bb] outliers

2022-11-08 Thread Dale Tronrud
   Let's say you have decided that you want to know if the CA-CB bond 
of residue 123 in your favorite protein differs from the expected value 
for that type of bond.  You solve the structure and refine a model 
against your crystallographic data, then look at residue 123's CA-CB 
bond and find that it is 3 sigma from the expected value.  Is this 
observation unlikely given the uncertainties in the parameters of the model?


   Now, let's look at a different case.  You have solved and refined a 
model of your favorite protein.  After examining all of 1000 bond 
lengths in your model you notice that the CA-CB bond of residue 123 is 3 
sigma from its expected value.  Is this observation unlikely given the 
uncertainties in the parameters of the model?


   Even though you are looking at the same bond in the same model and 
see exactly the same thing, the calculation of the probability that this 
bond is actually different from the usual is very different.  The 
calculation that you want to perform - the classic p-value test based on a 
Normal distribution - is valid for the first case but is quite 
inappropriate for the second.


   It is clearly much more likely that, among 1000 bonds, one of them 
will have a deviation of 3 sigma.  In fact I would say it is a near 
certainty.
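
The "near certainty" can be checked with a small simulation. This is a sketch only, assuming the 1000 deviations are independent and Gaussian with unit sigma; the function name, trial count and seed are arbitrary choices for the illustration.

```python
import random

def fraction_with_outlier(n_bonds=1000, k_sigma=3.0, n_trials=2000, seed=0):
    """Fraction of simulated models whose worst |deviation| exceeds k_sigma."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_trials):
        # One simulated "model": n_bonds deviations in units of sigma.
        worst = max(abs(rng.gauss(0.0, 1.0)) for _ in range(n_bonds))
        if worst > k_sigma:
            hits += 1
    return hits / n_trials

frac = fraction_with_outlier()
print(f"models with at least one 3-sigma bond: {100.0 * frac:.1f}%")
```

The fraction should land close to the analytical value of 1 - erf(3/sqrt(2))**1000, roughly 93%, with some scatter from the finite number of trials.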


   This twist of statistical analysis was never discussed in the basic 
classes on stats that I took and most scientists tend to ignore it.  To 
avoid the apparent paradox that you are confronting you have to include 
in your calculations the consequences of the actual question you have asked.


   There are huge problems with calculating this sort of "significance" 
because it is quite tempting to change your question after the fact and 
conclude that something is significant when it is not.  TNT always 
produced a list of the geometry outliers after refinement.  If you 
notice that a residue in the active site is present in that list, you 
will be tempted to forget that this residue was brought to your 
attention by a search over all geometry restraints and not a prior 
interest in the active site.


   This is a problem that many other fields of research are contending 
with.  One solution is to publish the questions you hope your model will 
answer before you perform the research.  That is certainly difficult 
with our sort of research.


   An example from another area might be helpful.  A researcher 
performs a survey of a lot of people asking questions about their diet 
and about their medical history.  Very often the published conclusion 
will be that, say, dietary item number 5 is correlated with medical 
condition number 12.  These studies tend to assess the significance of 
this result by just comparing the odds of these two items having the 
observed magnitude of correlation.


   This ignores the fact that a host of correlations were calculated 
and only this one was "significant".  If the survey had 20 dietary 
factors and 20 conditions then 400 comparisons were made and it was a 
virtual certainty that one of them would be "significant" unless the 
proper correction is made to the probability calculations.
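
To put rough numbers on the survey example, here is a minimal sketch. The 0.05 threshold is an assumption (no threshold is named above), the 400 tests are treated as independent, and the Bonferroni adjustment is shown only as one common choice of correction, not necessarily the one intended above.

```python
ALPHA = 0.05       # assumed per-test significance threshold
N_TESTS = 20 * 20  # 20 dietary factors x 20 conditions

# Chance that at least one of the tests looks "significant" purely by chance.
p_any = 1.0 - (1.0 - ALPHA) ** N_TESTS

# Bonferroni: shrink the per-test threshold so the family-wise rate stays near ALPHA.
bonferroni_alpha = ALPHA / N_TESTS
p_any_corrected = 1.0 - (1.0 - bonferroni_alpha) ** N_TESTS

print(f"uncorrected chance of a false 'hit': {p_any:.6f}")
print(f"per-test threshold after correction: {bonferroni_alpha:.6f}")
print(f"corrected chance of a false 'hit':   {p_any_corrected:.4f}")
```

The same reasoning applies to a list of geometry outliers: a bond picked from a search over 1000 restraints has to clear a much stricter threshold than a bond chosen before looking.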


Dale E. Tronrud

Re: [ccp4bb] outliers

2022-11-08 Thread Petr Kolenko
Dear James,
I hope I will not disrupt this thread and that you will continue with the 
theoretical aspects. But what is the motivation for such a general question? 
In my crystal structures, I usually have many bonds outside the 3-sigma 
interval, especially in the case of ligands, and not only for 
low-resolution structures. However, they are not reported as bond outliers in our 
validation programs. Would you suggest refining with tighter geometry restraints?
Regarding the last question in your contribution, I believe we should report 
RMSD together with the extremes in Table 1. I have seen a crystal structure 
where one significant outlier was “masked” by the nearly perfect rest of the model.
Best regards,
Petr

From: CCP4 bulletin board  on behalf of James Holton
Sent: Wednesday, November 9, 2022 12:51 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] outliers

Thank you for this.

Hmmm.

Interesting, and good to know the expected distribution of extreme values.

However, what I'm more worried about is how to evaluate the other 999 points?  
Let's say I'm trying to compare two 1000-member sets (A and B) that both have an 
extreme value of 3, but for the other 999 they are all 2sigma in "A" and 1sigma 
in "B".  Clearly, "B" is better than "A", but how to quantify?

On 11/8/2022 3:34 PM, Petrus Zwart wrote:
Hi James,

This is what you need.

https://en.wikipedia.org/wiki/Generalized_extreme_value_distribution

The distribution of a maximum of 1k random variates looks like this, and the 
(fitted by eye) analytical distribution associated with it seems to have a 
decent fit - as expected.

[image.png]

The idea of a p-value to judge the quality of a structure is interesting. 
xtriage uses this mechanism to flag suspicious normalized intensities, the idea 
being that in a small dataset it is less likely to see a large E value as 
compared to in a large dataset.
The issue of course is that the total intensity of a normalized intensity is 
bound by the number of atoms and the underlying assumption used is that it can 
be potentially infinitely large. It still is a decent metric I think.

P


P



[ccp4bb] Research Fellow position in Singapore

2022-11-08 Thread hans song
We are seeking two enthusiastic and self-motivated postdoctoral research
fellows. One candidate will work on Hippo signalling pathway and drug
design and the other candidate will work on developing circular RNA as the
second generation vaccine.



Requirements:

·   PhD in biochemistry, molecular & cell biology, virology or related
    field
·   Experience in protein expression using insect or mammalian cells
·   Experience in RNA in vitro transcription and purification
·   Good publication record
·   Competent in scientific writing



Singapore is an excellent environment to do research with many research
groups close by in Biopolis [https://www.a-star.edu.sg/imcb]. A*STAR offers
highly competitive salaries with benefits including housing allowance and
1-3 months performance bonus.



Interested applicants should send a CV and three confidential letters of
reference to: Prof. Haiwei Song (Email: hai...@imcb.a-star.edu.sg),
Institute of Molecular and Cell Biology, A*STAR, Singapore.





[ccp4bb] Mini map aide updates

2022-11-08 Thread Jon Cooper
For info, or more likely amusement, the Mini Map Aide website which I put 
together a few months ago for looking at electron density maps, primarily on 
mobile devices:

http://minimapai.de

now allows tweaks to be made to individual amino acids and basic regularisation 
to be done, so there definitely is no excuse to shrink from building, or at 
least tweaking, models. It is still a bit harebrained and I'm not sure how the 
results of my geometry minimiser would stand up in any validation tests but, 
used cautiously, it seems to give better rmsd's than the 1974 lysozyme models 
;-0

Best wishes, Jon Cooper.
jon.b.coo...@protonmail.com

Sent with [Proton Mail](https://proton.me/) secure email.





[ccp4bb] Postdoc position at Duke University

2022-11-08 Thread Seok-Yong Lee
Dear all,

We are recruiting a postdoc.

A NIH-funded post-doctoral research position is available in the laboratory
of Seok-Yong Lee at the Duke University School of Medicine to study the
structural and mechanistic basis of membrane transport processes. The lab’s
current research focuses on three membrane transport processes: Ca2+ permeation
in somatosensation, drug/metabolite transport, and polysaccharide transport
in microbial cell wall synthesis. The Lee Lab applies a broad
multidisciplinary approach to examine these processes using structural,
biochemical, biophysical, chemical biology, and computational methods. The
successful applicant will work on advancing our mechanistic understanding
of somatosensation into broader biological contexts.



Recent publications from the lab are:

Yin et al., *Science*, 2022
Wright, Fedor, et al., *Nature*, 2022
Ren et al., *Nat. Struct. & Mol. Biol.*, 2022
Kwon et al., *Nat. Commun.*, 2022
Kwon et al., *Nat. Struct. & Mol. Biol.*, 2021
Suo et al., *Neuron*, 2020
Yin et al., *Science*, 2019



Further information can be found on the Lee lab website (
https://sites.duke.edu/leelab/).



This is a great opportunity for talented biochemists/structural biologists
interested in linking structural biology to sensory neuroscience who would
enjoy being embedded in a scientifically diverse and highly collaborative
group. A strong background in either structural biology (cryo-EM or X-ray)
or membrane protein biochemistry is a plus, but not required.



Interested candidates should send their application material, including
CV/resume, list of publications, a statement of research interests as well
as the contact information of three references, to seok-yong@duke.edu





Re: [ccp4bb] outliers

2022-11-08 Thread James Holton

Thank you for this.

Hmmm.

Interesting, and good to know the expected distribution of extreme values.

However, what I'm more worried about is how to evaluate the other 999 
points?  Let's say I'm trying to compare two 1000-member sets (A and B) 
that both have an extreme value of 3, but for the other 999 they are all 
2sigma in "A" and 1sigma in "B".  Clearly, "B" is better than "A", but how 
to quantify?
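
One possible way to put numbers on the A versus B comparison, offered only as a sketch and not as an answer from the thread, is to summarise each set by its rms Z-score and by how far its sum of squared Z-scores sits from the chi-square expectation. This assumes independent Gaussian deviates with correct sigmas; the function names are illustrative.

```python
import math

def rms_z(z_scores):
    """Root-mean-square of deviations expressed in units of sigma."""
    return math.sqrt(sum(z * z for z in z_scores) / len(z_scores))

def chi2_excess_in_sigmas(z_scores):
    """Distance of sum(z^2) from its expected value n, in units of sqrt(2n).

    For n independent standard normal deviates, sum(z^2) follows a
    chi-square distribution with mean n and variance 2n.
    """
    n = len(z_scores)
    chi2 = sum(z * z for z in z_scores)
    return (chi2 - n) / math.sqrt(2.0 * n)

set_a = [2.0] * 999 + [3.0]  # 999 deviates at 2 sigma, one at 3 sigma
set_b = [1.0] * 999 + [3.0]  # 999 deviates at 1 sigma, one at 3 sigma

for name, zs in (("A", set_a), ("B", set_b)):
    print(f"{name}: rms Z = {rms_z(zs):.3f}, "
          f"chi2 excess = {chi2_excess_in_sigmas(zs):+.1f} sigma")
```

Both sets share the same extreme value, so a maximum-based statistic cannot separate them, whereas the summed statistic puts A far out in the chi-square tail and leaves B close to expectation.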




Re: [ccp4bb] outliers

2022-11-08 Thread Petrus Zwart
Hi James,

This is what you need.

https://en.wikipedia.org/wiki/Generalized_extreme_value_distribution

The distribution of a maximum of 1k random variates looks like this, and
the (fitted by eye) analytical distribution associated with it seems to
have a decent fit - as expected.

[image: image.png]
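
For a quick check without fitting anything, the distribution of the maximum can also be written down exactly when the deviates are independent and Gaussian. The sketch below is an illustration under that assumption and works with absolute deviations; it is not the fit shown in the plot, and the function names are made up for the example.

```python
import math

def cdf_max_abs(x, n):
    """P(largest |z| among n independent standard normal deviates <= x)."""
    return math.erf(x / math.sqrt(2.0)) ** n

def quantile_max_abs(q, n, lo=0.0, hi=10.0):
    """Invert cdf_max_abs by bisection (the CDF increases monotonically with x)."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cdf_max_abs(mid, n) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

N = 1000
median = quantile_max_abs(0.5, N)
upper = quantile_max_abs(0.99, N)

print(f"P(max |z| <= 3) for N={N}: {cdf_max_abs(3.0, N):.3f}")
print(f"median of max |z|: {median:.2f} sigma")
print(f"99th percentile of max |z|: {upper:.2f} sigma")
```

Read this way, the largest deviate in a set of 1000 becomes surprising only once it sits well above the median of this distribution, not merely above 3 sigma.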

The idea of a p-value to judge the quality of a structure is interesting.
xtriage uses this mechanism to flag suspicious normalized intensities, the
idea being that in a small dataset it is less likely to see a large E value
as compared to in a large dataset.
The issue of course is that the total intensity of a normalized intensity
is bound by the number of atoms and the underlying assumption used is
that it can be potentially infinitely large. It still is a decent metric I
think.

P


P


On Tue, Nov 8, 2022 at 3:25 PM James Holton  wrote:

> Thank you Ian for your quick response!
>
> I suppose what I'm really trying to do is put a p-value on the "geometry"
> of a given PDB file.  As in: what are the odds the deviations from ideality
> of this model are due to chance?
>
> I am leaning toward the need to take all the deviations in the structure
> together as a set, but, as Joao just noted, it just "feels wrong" to
> tolerate a 3-sigma deviate.  Even more wrong to tolerate 4 sigma, 5 sigma.
> And 6 sigma deviates are really difficult to swallow unless you have
> trillions of data points.
>
> To put it down in equations, is the p-value of a structure with 1000 bonds
> in it with one 3-sigma deviate given by:
>
> a)  p = 1-erf(3/sqrt(2))
> or
> b)  p = 1-erf(3/sqrt(2))**1000
> or
> c) something else?
>
>
>
> On 11/8/2022 2:56 PM, Ian Tickle wrote:
>
> Hi James
>
> I don't think it's meaningful to ask whether the deviation of a single
> bond length (or anything else that's single) from its expected value is
> significant, since as you say there's always some finite probability that
> it occurred purely by chance.  Statistics can only meaningfully be applied
> to samples of a 'reasonable' size.  I know there are statistics designed
> for small samples but not for samples of size 1 !  It's more meaningful to
> talk about distributions.  For example if 1% of the sample contained
> deviations > 3 sigma when you expected there to be only 0.3 %, that is
> probably significant (but it still has a finite probability of occurring by
> chance), as would be finding no deviations > 3 sigma (for a reasonably
> large sample to avoid sampling errors).
>
> Cheers
>
> -- Ian
>
>
> On Tue, Nov 8, 2022, 22:22 James Holton  wrote:
>
>> OK, so let's suppose there is this bond in your structure that is
>> stretched a bit.  Is that for real? Or just a random fluke?  Let's say
>> for example it's a CA-CB bond that is supposed to be 1.529 A long, but in
>> your model it's 1.579 A.  This is 0.05 A too long. Doesn't seem like
>> much, right? But the "sigma" given to such a bond in our geometry
>> libraries is 0.016 A.  These sigmas are typically derived from a
>> database of observed bonds of similar type found in highly accurate
>> structures, like small molecules. So, that makes this a 3-sigma outlier.
>> Assuming the distribution of deviations is Gaussian, that's a pretty
>> unlikely thing to happen. You expect 3-sigma deviates to appear less
>> than 0.3% of the time.  So, is that significant?
>>
>> But, then again, there are lots of other bonds in the structure. Let's
>> say there are 1000. With that many samplings from a Gaussian
>> distribution you generally expect to see a 3-sigma deviate at least
>> once.  That is, do an "experiment" where you pick 1000 Gaussian-random
>> numbers from a distribution with a standard deviation of 1.0. Then, look
>> for the maximum over all 1000 trials. Is that one > 3 sigma? It probably
>> is. If you do this "experiment" millions of times it turns out seeing at
>> least one 3-sigma deviate in 1000 tries is very common. Specifically,
>> about 93% of the time. It is rare indeed to have every member of a
>> 1000-deviate set all lie within 3 sigmas.  So, we have gone from one
>> 3-sigma deviate being highly unlikely to being a virtual certainty if
>> you look at enough samples.
>>
>> So, my question is: is a 3-sigma deviate significant?  Is it significant
>> only if you have one bond in the structure?  What about angles? What if
>> you have 500 bonds and 500 angles?  Do they count as 1000 deviates
>> together? Or separately?
>>
>> I'm sure the more mathematically inclined out there will have some
>> intelligent answers for the rest of us, however, if you are not a
>> mathematician, how about a vote?  Is a 3-sigma bond length deviation
>> significant? Or not?
>>
>> Looking forward to both kinds of responses,
>>
>> -James Holton
>> MAD Scientist

Re: [ccp4bb] outliers

2022-11-08 Thread James Holton

Thank you Ian for your quick response!

I suppose what I'm really trying to do is put a p-value on the 
"geometry" of a given PDB file.  As in: what are the odds the deviations 
from ideality of this model are due to chance?


I am leaning toward the need to take all the deviations in the structure 
together as a set, but, as Joao just noted, it just "feels wrong" 
to tolerate a 3-sigma deviate.  Even more wrong to tolerate 4 sigma, 5 
sigma. And 6 sigma deviates are really difficult to swallow unless you 
have trillions of data points.


To put it down in equations, is the p-value of a structure with 1000 
bonds in it with one 3-sigma deviate given by:


a)  p = 1-erf(3/sqrt(2))
or
b)  p = 1-erf(3/sqrt(2))**1000
or
c) something else?








To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/


Re: [ccp4bb] [EXTERNAL] [ccp4bb] outliers

2022-11-08 Thread Dias, Joao M.
Hi James,
My initial thought is that a 3-sigma bond length deviation is significant.
When we run an experiment and observe something that doesn't match what we 
expect, it could be an error, or it could be the thread that leads to a new 
discovery. 

If the experimental data support that deviation, I would look at the data 
carefully because they may hold a surprise. Maybe it is a different bond than 
you were expecting? Maybe a different geometry? 
I assume you are talking about a protein crystal structure with data collected 
at a synchrotron, but you should analyze the data and ask: what is the 
resolution? What is the completeness? What is the B-factor? How did you refine 
those bonds?
If you have low resolution you might have more variability, but we typically 
impose tighter restraints during refinement and so might not observe those 
variations. 

When you represent your crystal structure by a model, you have to consider that 
a crystal gives you an average over the structures in the lattice, and an 
average over time, during which radiation damage is also occurring.

If the experiment is well done, the data might point you to something relevant.
Maybe the atoms you are treating as a C-C bond are not what you thought, and 
therefore the bond is different.
One example is the interaction of proteins with metals, where we initially 
refine the model assuming one metal and the data then show a mixture of metals, 
or a completely different one. I know you will do a fluorescence scan at your 
MAD beamline to sort it out.

Best wishes,
Joao


Re: [ccp4bb] outliers

2022-11-08 Thread Ian Tickle
Hi James

I don't think it's meaningful to ask whether the deviation of a single bond
length (or anything else that's single) from its expected value is
significant, since as you say there's always some finite probability that
it occurred purely by chance.  Statistics can only meaningfully be applied
to samples of a 'reasonable' size.  I know there are statistics designed
for small samples but not for samples of size 1 !  It's more meaningful to
talk about distributions.  For example if 1% of the sample contained
deviations > 3 sigma when you expected there to be only 0.3 %, that is
probably significant (but it still has a finite probability of occurring by
chance), as would be finding no deviations > 3 sigma (for a reasonably
large sample to avoid sampling errors).
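
As a rough illustration of the distribution-level comparison described above, the following Python sketch asks how likely it is to see 1% of 1000 deviates beyond 3 sigma, or none at all, when 0.27% are expected. It assumes independent Gaussian deviates and uses a plain binomial tail sum; the sample size of 1000 and the function names are assumptions chosen for the example, not anything a refinement program reports.

```python
import math

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), by summing the lower terms."""
    lower = sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))
    return 1.0 - lower

def binom_none(n, p):
    """P(X == 0) for X ~ Binomial(n, p)."""
    return (1.0 - p) ** n

n_deviates = 1000                              # assumed sample size
p3 = math.erfc(3.0 / math.sqrt(2.0))           # expected fraction beyond 3 sigma

# 1% of the sample beyond 3 sigma means 10 or more outliers out of 1000
p_excess = binom_upper_tail(10, n_deviates, p3)

# No outliers at all in the same sample
p_none = binom_none(n_deviates, p3)

print(f"P(10 or more beyond 3 sigma) = {p_excess:.2e}")
print(f"P(none beyond 3 sigma)       = {p_none:.3f}")
```

With 1000 deviates the "none beyond 3 sigma" case comes out near 7%, so by itself it is only mildly surprising at that sample size; it becomes much less likely as the number of deviates grows.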

Cheers

-- Ian




[ccp4bb] outliers

2022-11-08 Thread James Holton
OK, so let's suppose there is this bond in your structure that is 
stretched a bit.  Is that for real? Or just a random fluke?  Let's say 
for example it's a CA-CB bond that is supposed to be 1.529 A long, but in 
your model it's 1.579 A.  This is 0.05 A too long. Doesn't seem like 
much, right? But the "sigma" given to such a bond in our geometry 
libraries is 0.016 A.  These sigmas are typically derived from a 
database of observed bonds of similar type found in highly accurate 
structures, like small molecules. So, that makes this a 3-sigma outlier. 
Assuming the distribution of deviations is Gaussian, that's a pretty 
unlikely thing to happen. You expect 3-sigma deviates to appear less 
than 0.3% of the time.  So, is that significant?
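
The arithmetic for this example bond can be written out as a small Python sketch. It assumes a Gaussian distribution of deviations, as stated above, and takes the two-sided tail because a bond can be too long or too short. The function names are illustrative only.

```python
import math

def z_score(observed, ideal, sigma):
    """Deviation from the library value in units of the library sigma."""
    return (observed - ideal) / sigma

def two_sided_p(z):
    """Chance that a Gaussian deviate is at least |z| sigma from ideal."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Numbers from the CA-CB example: ideal 1.529 A, model 1.579 A, sigma 0.016 A
z = z_score(observed=1.579, ideal=1.529, sigma=0.016)
p = two_sided_p(z)

print(f"deviation = {z:.3f} sigma")
print(f"two-sided tail probability = {p:.4f}")
```

The deviation works out to slightly more than 3 sigma (0.05/0.016), so the tail probability is somewhat below the 0.27% that corresponds to exactly 3 sigma.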


But, then again, there are lots of other bonds in the structure. Let's 
say there are 1000. With that many samplings from a Gaussian 
distribution you generally expect to see a 3-sigma deviate at least 
once.  That is, do an "experiment" where you pick 1000 Gaussian-random 
numbers from a distribution with a standard deviation of 1.0. Then, look 
for the largest absolute value over all 1000 draws. Is that one > 3 
sigma? It probably is. If you do this "experiment" millions of times it 
turns out seeing at least one 3-sigma deviate in 1000 tries is very 
common. Specifically, about 93% of the time. It is rare indeed to have 
every member of a 1000-deviate set all lie within 3 sigmas.  So, we have 
gone from one 3-sigma deviate being highly unlikely to being a virtual 
certainty if you look at enough samples.
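
A minimal Python sketch of this "experiment" follows. It uses a few thousand repeats rather than millions so that it runs in about a second, and the function name, seed and trial count are arbitrary choices for the illustration. The 93% figure corresponds to counting deviates of either sign (|x| > 3); counting only the positive tail (maximum > 3) gives a lower number, which the `two_sided` switch lets you check.

```python
import random

def fraction_with_outlier(n_deviates=1000, n_trials=2000, threshold=3.0,
                          two_sided=True, seed=0):
    """Fraction of trials in which at least one deviate exceeds the threshold."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_trials):
        draws = (rng.gauss(0.0, 1.0) for _ in range(n_deviates))
        if two_sided:
            found = any(abs(x) > threshold for x in draws)
        else:
            found = any(x > threshold for x in draws)
        if found:
            hits += 1
    return hits / n_trials

frac_two_sided = fraction_with_outlier(two_sided=True)
frac_one_sided = fraction_with_outlier(two_sided=False)

print(f"at least one |x| > 3 in 1000 draws: {frac_two_sided:.3f}")
print(f"at least one  x  > 3 in 1000 draws: {frac_one_sided:.3f}")
```

The simulated fractions should scatter around the analytic values 1 - (1 - p)**1000, with p = 0.0027 for the two-sided case and p = 0.00135 for the one-sided case.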


So, my question is: is a 3-sigma deviate significant?  Is it significant 
only if you have one bond in the structure?  What about angles? What if 
you have 500 bonds and 500 angles?  Do they count as 1000 deviates 
together? Or separately?


I'm sure the more mathematically inclined out there will have some 
intelligent answers for the rest of us, however, if you are not a 
mathematician, how about a vote?  Is a 3-sigma bond length deviation 
significant? Or not?


Looking forward to both kinds of responses,

-James Holton
MAD Scientist





[ccp4bb] Postdoc Opening in X-ray, Cryo-EM, cell biology, at Princeton

2022-11-08 Thread Alexei Korennykh
A postdoctoral or more senior researcher opportunity is available 
immediately in the laboratory of Alexei Korennykh in the Department of 
Molecular Biology at Princeton University. We invite candidates with 
relevant PhD degrees to work in the fields of human cell biology and 
structural biology. Our research explores molecular and cellular 
mechanisms of immune receptors, which recognize “dark matter RNA” 
(non-coding RNAs) and shape autoimmune disorders and cancers.


Some of our core methods include tissue culture, cell biology, CRISPR 
knockout generation, various flavors of RNA-seq, RNA biochemistry, RNA 
biology, X-ray crystallography, Cryo-EM. High-resolution cryo-EM work is 
conducted at the recently built Cryo-EM center at Princeton.


Our publications:
https://pubmed.ncbi.nlm.nih.gov/?term=korennykh=date=50 

https://scholar.princeton.edu/korennykhlab/home 



Interested applicants should apply to Alexei directly, 
akore...@princeton.edu.


Please include a cover letter and a current CV.





Re: [ccp4bb] Optimising a Nanobody interface computationally to improve binding

2022-11-08 Thread Jonathan Elegheert

Hello,

Indeed, Rosetta-ddG can evaluate the effect of residue mutations. Also, have a 
look at the program FoldX for per-residue evaluation of mutations 
(https://foldxsuite.crg.eu/). There is also the free web tool FireProt, whose 
algorithm uses FoldX as a pre-filter to select beneficial mutations, which are 
then checked in a second round using Rosetta-ddG 
(https://loschmidt.chemi.muni.cz/fireprotweb/).

Overall, my understanding is that these programs evaluate destabilising 
mutations more accurately than stabilising / affinity-enhancing mutations … but 
they may still give interesting leads.

Cheers
Jonathan

-
Jonathan Elegheert, PhD
Team Leader
CNRS Scientist

Interdisciplinary Institute for NeuroScience (IINS) 

UMR5297 CNRS / UB
Centre Broca Nouvelle-Aquitaine
146 rue Léo Saignat BP 61292
33076 Bordeaux Cedex 
France

T: +33 (0)5 33 51 48 57
E: jonathan.eleghe...@u-bordeaux.fr
W: https://www.iins.u-bordeaux.fr/ELEGHEERT
Twitter: @ElegheertLab 
-







[ccp4bb] Optimising a Nanobody interface computationally to improve binding

2022-11-08 Thread Jonas Emsley


Dear all

I have a protein engineering question.

If you have a nanobody-ligand complex structure, is there a program that can 
use the crystal structure to suggest engineering improvements to give a 
tighter-binding nanobody? It need not just be in the CDRs. Maybe Rosetta can 
do this?

Any suggestions would be very welcome.

cheers

jonas




##
Dr Jonas Emsley
Professor of Macromolecular Crystallography,
Nottingham Biodiscovery Institute
School of Pharmacy,
University of Nottingham,
University Park,
Nottingham.
NG72RD
Tel: +44 1158467092
Fax: +44 1158468002
email:jonas.ems...@nottingham.ac.uk
http://www.nottingham.ac.uk/research/groups/structural-biology/index.aspx
https://orcid.org/-0002-8949-8030








