Re: Full analysis issue

Sébastien Morin Fri, 09 May 2008 08:45:48 -0700

Hi Ed,

First, thanks a lot for your help!

Second, I have to apologize for the length of this mail...

Ok...

My system is a 271-residue globular protein (230 residues with data at 3
fields = 2070 observables). A homologous protein is being studied in
the lab, and analysing its relaxation data using either the
diffusion-seeded approach in ModelFree or the new protocol of the
full_analysis script yields similar results, with a high mean S2 (~0.90)
and a few Rex (15-20) throughout the protein. Thus, the problem with my
system is probably external to the approaches and the user...

Ok...

I tried using ModelFree with relax (script palmer.py: ModelFree as the
engine for optimization, with relax handling the automation and AIC
model selection) and got results similar to those from the
full_analysis.py approach... For the two situations tested (see below),
no oscillation occurred. Here are some stats:

=======================================================================
Approach        Diff     Iter  Chi2    AIC     Nb_Rex  <Rex>_+-_StdDev
==============  =======  ====  ======  ======  ======  ===============
palmer          prolate  15    ~12990  ~14060  182     1.602_+-_0.770

palmer_hybrid   prolate  12    ~ 2715  ~ 3660  129     0.902_+-_0.571

full            prolate   5    ~13090  ~14125  181     1.671_+-_0.782

full_hybrid     prolate   7    ~ 2750  ~ 3720  145     2.431_+-_1.546
=======================================================================

It seems that the new protocol is not the source of the problem.
Moreover, it is obvious from the AIC values (and also from the diffusion
tensor details, not shown here) that the hybrid (without the highly
flexible C-terminus) is a better description of the system. However, as
is seen here, the Rex values seem quite small and there are far too many
Rex terms (> 50 % of all residues)... These may thus not be significant,
but then how can one exclude such "artifacts" when doing iterative
optimization (with either approach)? How can one decide to choose a
model without Rex when iterating to find the best diffusion tensor?
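
As a quick check on the table above, here is a minimal sketch, assuming
the usual definition AIC = chi2 + 2k (k being the total number of
optimised parameters). The chi2 and AIC numbers are the approximate
values from the table, so the implied k is only approximate, and the
names `runs` and `implied_parameters` are just for illustration:

```python
# Approximate (chi2, AIC) values copied from the table above.
runs = {
    "palmer":        (12990, 14060),
    "palmer_hybrid": (2715, 3660),
    "full":          (13090, 14125),
    "full_hybrid":   (2750, 3720),
}


def implied_parameters(chi2, aic):
    """Return k implied by AIC = chi2 + 2k (assumed definition)."""
    return (aic - chi2) / 2.0


implied_k = {
    name: implied_parameters(chi2, aic) for name, (chi2, aic) in runs.items()
}

for name, k in implied_k.items():
    chi2, aic = runs[name]
    print(f"{name:14s} chi2={chi2:6d}  AIC={aic:6d}  implied k={k:.1f}")
```

Under this assumption, the chi2 drop of roughly 10000 between the
non-hybrid and hybrid runs is much larger than the change in the 2k
penalty, so it is the chi2 term that drives the AIC difference.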

Ok...

Maybe, as you proposed, the problem arises because the crystal
structure is inappropriate for describing the solution structure...
The crystal structure I use has a resolution of 1.95 Å. Protons were not
visible but were added using CHARMM. Moreover, different snapshots from
molecular mechanics in CHARMM were also tested to see if fluctuations in
NH bond orientation could yield better optimizations... This was not the
case.

I'll try to assess this issue of the crystal structure by running tests
(with both the palmer.py and full_analysis.py approaches) using a
different structure (a point mutant), also from crystallography... The
resolution of this structure is also quite low (1.75 Å). Anyway, I don't
have a choice since no solution structure exists, nor any better crystal
structure... If the crystal structure does turn out to be the cause of
this problem, what can one do? Is one obliged to do the analysis with a
local_tm or a sphere diffusion tensor? Is that a waste if one does so
with good quality data at three fields?

Ok...

What about the AIC for the local_tm model vs. the ellipsoid in the
full_analysis approach? Here are some stats:

=======================================================================
Approach     Models  Diff       AIC
===========  ======  =========  ======
full         m1-m5   local_tm   ~ 4510
full         m1-m5   ellipsoid  ~12710

full         m0-m9   local_tm   ~ 4410
full         m0-m9   ellipsoid  ~ 5210

full_hybrid  m1-m5   local_tm   ~ 4510
full_hybrid  m1-m5   ellipsoid  ~ 4720 *

full_hybrid  m0-m9   local_tm   ~ 4410
full_hybrid  m0-m9   ellipsoid  ~ 4570 **
=======================================================================
*  not converged after 35 rounds (oscillates)
** not converged after 26 rounds (oscillates)
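
To read the table above at a glance, here is a minimal sketch that
computes the AIC difference (ellipsoid minus local_tm) for each row. The
values are the approximate ones from the table, and the dictionary
layout is just an illustrative choice:

```python
# Approximate AIC values copied from the table above:
# (approach, model set) -> {diffusion model: AIC}
aic = {
    ("full", "m1-m5"):        {"local_tm": 4510, "ellipsoid": 12710},
    ("full", "m0-m9"):        {"local_tm": 4410, "ellipsoid": 5210},
    ("full_hybrid", "m1-m5"): {"local_tm": 4510, "ellipsoid": 4720},  # not converged
    ("full_hybrid", "m0-m9"): {"local_tm": 4410, "ellipsoid": 4570},  # not converged
}

# A positive difference means the local_tm model has the lower AIC.
delta = {key: v["ellipsoid"] - v["local_tm"] for key, v in aic.items()}

for (approach, models), d in delta.items():
    print(f"{approach:12s} {models:6s} dAIC(ellipsoid - local_tm) = {d:+d}")
```

The difference stays positive in all four cases, although the two hybrid
ellipsoid values come from runs that had not converged.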

As said before, the hybrid improves the description of the diffusion.
However, there is still a problem. First, the local_tm diffusion is
still selected over the ellipsoid (even if the difference is now
smaller). Second, the ellipsoid optimizations don't converge and
oscillate...
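
One way to spot such oscillations automatically is to look for a
repeated set of selected models across rounds. The following is only a
sketch: `find_cycle` is a hypothetical helper (not part of relax or
ModelFree), and the three-residue history is invented for illustration.
A repeated selection is only a hint of a cycle, since the optimised
parameter values would have to repeat as well.

```python
def find_cycle(history):
    """Return (start, period) for the first repeated state, or None.

    `history` is a list of hashable per-round states, for example a
    tuple holding the model selected for each residue in that round.
    """
    seen = {}
    for round_index, state in enumerate(history):
        if state in seen:
            start = seen[state]
            return start, round_index - start
        seen[state] = round_index
    return None


# Hypothetical example: models selected for three residues over 4 rounds.
history = [
    ("m2", "m4", "m1"),
    ("m2", "m5", "m1"),
    ("m4", "m5", "m1"),
    ("m2", "m5", "m1"),  # same selection as round index 1
]

print(find_cycle(history))
```

A period of 1 would mean two identical consecutive rounds (converged as
far as model selection goes), while a period of 2 or more would
correspond to the kind of oscillation described here.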

Now, what about the Rex and slow motions (ts) in the local_tm
diffusion? Here are some stats:

=======================================================================
Approach     Models  Diff       Nb_Rex  Nb_ts
===========  ======  =========  ======  =====
full         m1-m5   local_tm    58      30
full         m1-m5   ellipsoid  171      21

full         m0-m9   local_tm    63      41
full         m0-m9   ellipsoid  144      49

full_hybrid  m1-m5   local_tm    58      30
full_hybrid  m1-m5   ellipsoid  142 *    28

full_hybrid  m0-m9   local_tm    64      41
full_hybrid  m0-m9   ellipsoid  145 **   50
=======================================================================
*  not converged after 35 rounds (oscillates)
** not converged after 26 rounds (oscillates)

As you can see, there are far more Rex terms with the ellipsoid, which
probably means that there is a problem with the diffusion tensor... For
the slow ns motions, there don't seem to be significantly more with the
ellipsoid description... Moreover, the sphere diffusion tensor, which is
not NH-vector-orientation-dependent, also has a high number of Rex
terms, similar ns motions, and AIC values similar to (just a bit higher
than) what is observed for the ellipsoid:

=======================================================================
Approach     Models  Diff       Nb_Rex  Nb_ts  AIC
===========  ======  =========  ======  =====  ======
full         m1-m5   sphere     191      20    ~15200

full         m0-m9   sphere     155      47    ~ 5640

full_hybrid  m1-m5   sphere     145      31    ~ 5190

full_hybrid  m0-m9   sphere     153      47    ~ 5030
=======================================================================
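
To put the Nb_Rex counts from the tables above in proportion, here is a
minimal sketch for the non-hybrid ("full") runs, which use all 230
residues with data. The hybrid runs are left out because the number of
residues they retain is not given here; the variable names are just for
illustration:

```python
N_RESIDUES = 230  # residues with data in the non-hybrid ("full") runs

# Nb_Rex values copied from the tables above: (model set, diffusion) -> count
nb_rex = {
    ("m1-m5", "local_tm"): 58,
    ("m1-m5", "ellipsoid"): 171,
    ("m1-m5", "sphere"): 191,
    ("m0-m9", "local_tm"): 63,
    ("m0-m9", "ellipsoid"): 144,
    ("m0-m9", "sphere"): 155,
}

rex_fraction = {key: n / N_RESIDUES for key, n in nb_rex.items()}

for (models, diffusion), fraction in rex_fraction.items():
    print(f"{models:6s} {diffusion:10s} {100 * fraction:5.1f} % of residues with Rex")
```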

Should the sphere diffusion tensor yield results similar to the
local_tm? If there is a major difference between those two, does it mean
that concerted motions may be present and that a hybrid model could
solve the issue?

Ok...

Now, are there concerted motions apparent from the local_tm results? I
plotted the results from the local_tm run after AIC model selection
(Would it be better to look at the local_tm run for model 1 or 2
only? Can model selection here bias the results?) and couldn't find
any obvious link between different parts of the protein for one or more
parameters among S2, S2f, S2s, Rex, te, tf, ts, chi2.

However, a weak relationship seems to exist between the local_tm
distribution and the domain (the inverse is seen for S2, but to a lesser
extent). In the tm1 run, the local_tm is also a bit smaller in the same
domain [a small difference of 0.5-1.0 ns for values of ~13 ns], but the
S2 values are similar, which points to a difference between the two
domains.

My protein is globular, but has two structural domains side by side, an
all-alpha domain and an alpha/beta domain. In the homologous protein,
there seems to be Rex at the interface (which spans a surface of four
10-residue beta strands, which is big and is expected to be quite
rigid). Maybe the two domains are a bit different in my system, which
could cause the problems I encounter. I'll try to assess this by running
full_analysis runs on the different domains alone...

Ok...

Well, I'm out of ideas now... If you have any ideas that could help,
they will be more than welcome!

I hope this discussion can also help other people solve difficulties
encountered in their analysis or help them get more information out of
their system...

Thanks a lot once more !

Cheers !

Sébastien

P.S. Again, sorry for the length of the mail...

Edward d'Auvergne wrote:
> Hi,
>
> I've been thinking about this one for a while, but I don't know
> exactly what the problem is.  I have a few ideas that may help though.
>  This could either be some type of interesting dynamics, or be caused
> by something a bit more sobering.
>
> Firstly though, it is worth comparing the local tm model to the best
> of the global diffusion tensor models (the ellipsoid).  It could be
> that if the AIC values are similar, then the local tm model and the
> global diffusion model are statistically similar and that it would be
> safe to go with either.  In this case, it is worth very carefully
> comparing the description of the internal dynamics.  For this, do not
> compare selected models - that is not what is of interest.  It should
> be the overall picture of the dynamics reported by the parameters.
> For example if Rex is statistically close to zero then, from the
> perspective of the internal motions, models m2 and m4 are the same.
>
> Assuming that the local tm global model is significantly better than
> the other models, another option could be that you have very
> interesting global concerted dynamics occurring in the molecule.  This
> would mean that the standard single global diffusion model (sphere,
> spheroid, or ellipsoid) is insufficient to describe these motions.
> This is what the hybrid models in relax were designed for, but maybe
> these don't describe certain large scale motions well enough (hence
> your use of these didn't resolve the problem).  These aren't a proper
> mathematical solution to the complex physics of coupled diffusion
> processes and hence may be insufficient.
>
> It might be worth trying the normal model-free analysis of starting
> with the diffusion tensor, rather than my new technique which starts
> with the internal dynamics, to see if you end up with a different
> result.  It could be that the new technique in the full_analysis.py
> script is somehow failing, although I doubt that will be the case.
> The oscillation you see in point 3 is found by using Art Palmer's
> Modelfree program as well with a standard analysis - this was one of
> the motivators for me to start looking into and fixing problems with
> model-free analysis - but it is inherent to the iterative procedure
> required for convergence.  Have you tried the analysis with Modelfree
> or Dasha?  And if so, how do the chi-squared and AIC values compare?
>
> Alternatively, the reason could be quite simple.  It could possibly be
> that the structure you have used in the analysis is not accurate
> enough.  If it is a crystal structure, maybe it doesn't represent the
> solution structure well.  The analysis is highly dependent upon the XH
> bond vector orientations, and if this is slightly out it could cause a
> bias and the introduction of artificial motions (either Rex or
> nanosecond motions).  It will also affect the determination of the
> diffusion tensor.  These artificial motions are unlikely to be present
> in the local tm model though, so this is a good check.
>
> The Rex in the ellipsoid model is an indication that something could
> be wrong with the global model.  Whether it is interesting large scale
> motions which are insufficiently described by the ellipsoid, whether
> the technique cannot find the real solution, or whether this is caused
> by structural inaccuracies, that I cannot tell.  Is the structure of
> the protein released?  What is the system which is being studied?
> What are the AIC values like for each global model?  Anyway, hopefully
> one of these ideas may be of help in sorting out the problem.
>
> Regards,
>
> Edward
>
>
>
>
>
> On Mon, May 5, 2008 at 9:23 PM, Sébastien Morin
> <[EMAIL PROTECTED]> wrote:
>   
>> Hi,
>>
>>  I am currently using relax with the full_analysis.py script.
>>
>>  I face several problems for which I can't find any solution...
>>
>>  1.
>>  With all my data (230 residues at 3 fields, for a total of 2070
>>  observables), the best diffusion model is the local tm. This is not
>>  normal as this protein is globular. However, the C-terminus residues have
>>  really high chi2 values... Yet, when excluding the C-terminus, the best
>>  diffusion model is still the local tm. Maybe some other residues are
>>  highly flexible and should be rejected... Maybe also some residues have
>>  bad data... What is a good strategy to find residues I should exclude
>>  from my analysis ?
>>
>>
>>  2.
>>  When I look at optimized results from the ellipsoid runs (second best
>>  choice after local tm), I see lots of Rex (~ 50 % of residues), which is a
>>  bit annoying... The diffusion tensor may not be well optimized... This
>>  may be related to problem 1...
>>
>>
>>  3.
>>  In different situations, some runs (prolate or ellipsoid, i.e. the
>>  diffusion tensor that should best describe my system) never converge and
>>  oscillate between 2 or more AIC values. Some residues oscillate between
>>  2 or more models, but these residues are not special as to their
>>  relaxation data or position in the protein...
>>
>>
>>  Consistency testing and reduced spectral density mapping show that my
>>  data are of good quality and are consistent with each other...
>>
>>  I tried with different structures (crystal structure with added protons,
>>  MM snapshots), but always got the same kind of results...
>>
>>  I tried several hybrids (with no C-ter, with no C-ter and several loops,
>>  etc), but always got the same kind of results...
>>
>>  Also, chi2 values are quite high for most residues (5-20 on average)...
>>
>>  What should I do now ? Do you have any idea ?
>>
>>  Thanks a lot for any help or idea !!!!!!!
>>
>>
>>  Exhausted Séb
>>
>>  _______________________________________________
>>  relax (http://nmr-relax.com)
>>
>>  This is the relax-users mailing list
>>  [email protected]
>>
>>  To unsubscribe from this list, get a password
>>  reminder, or change your subscription options,
>>  visit the list information page at
>>  https://mail.gna.org/listinfo/relax-users
>>
>>     
>
>   

