Hi Ed
I didn't have time to try your tips, but they should help me out when I
try to run the full_analysis.py script again...
I'll let you know if it works well or if I still get long computation
times...
Cheers
Séb :)
Edward d'Auvergne wrote:
> Hi,
>
> On 9/17/07, Sebastien Morin <[EMAIL PROTECTED]> wrote:
>
>> Hi Ed,
>>
>> First, there were some bad assignments in my data set. I used the automatic
>> assignment (which takes an assigned peak list and propagates it to other
>> peak lists) procedure within NMRPipe for the first time and some peaks were
>> badly assigned.
>>
>
> Although this is a problem because of the bond vector orientations, its
> effect should be incorrect internal motions rather than long computation
> times.
>
>
>
>> Second, the PDB file is quite good as it is a representative conformation
>> from a 60 ns MD simulation using CHARMM. That said, the protein moves in the
>> simulation and, hence, the orientations also change. I could take another
>> conformation, which is what I'll do to cross-validate my models, but
>> nevertheless the orientations will change and subtle changes will appear.
>> This shouldn't be an issue since the vectors that move a lot in the
>> simulations should have correlating relaxation properties and that should be
>> seen in the models chosen.
>>
>
> The orientation changes should only affect the Euler angle values of
> the diffusion tensor. Nothing else should be affected by this. The
> internal motions of the simulation will affect the results of the
> analysis, but the overall orientation really doesn't matter unless you
> are comparing these Euler angles.
>
>
>
>> Third, here are the stats for the ellipsoid optimization :
>>
>> round  t_total_(h)  t_opt_(h)  iter_opt  model_change  tm      a     b      g     chi2                comments
>> =====  ===========  =========  ========  ============  ======  ====  =====  ====  ==================  ======================
>>  1     146          144        207       ---           12.423  18.8  159.7  99.1  9282.2280010132217  ok
>>  2      49           47         62       215           12.463  74.7  152.0  94.3  8793.0777454789404  ok
>>  3      16           14         19        16           12.448  78.0  152.3  96.9  8767.5325004348124  ok
>>  4      12           10         13         1           12.445  80.2  151.9  97.9  8765.5659442063006  ok
>>  5      19           17         23         2           12.445  83.1  151.7  98.3  8761.0001889287214  ok
>>  6      25           23         27         1           12.452  80.9  151.4  96.2  8744.6870170285692  ok
>>  7      16           14         19         1           12.445  83.1  151.7  98.3  8761.0001889287269  almost_5
>>  8      25           23         28         1           12.452  80.9  151.4  96.2  8744.6870170285729  almost_6
>>  9      14           12         17         1           12.445  83.1  151.7  98.3  8761.0001889287269  almost_5_and_exactly_7
>> 10      29           27         33         1           12.452  80.9  151.4  96.2  8744.6870170285656  almost_6_and_8
>> 11     stopped...................................
>>
>
> Are these stats from the results in the 'opt' directories? Can you
> possibly pin-point where in the calculation the problem is? One
> option is to increase the verbosity flag 'print_flag' in the
> minimise() user function. This may help in seeing the problem.
>
>
>
>> As you can see, there is a kind of interchange between two runs at the end
>> of the optimization. In fact, from round 5 on, there is only one residue
>> for which the model changes, and it's always the same one. It switches
>> between model 5 and model 6. With model 6, tf is ~17, ts is ~25000 and S2
>> is ~0.73 (chi2 ~40 in the aic file, but there ts is ~1200). With model 5,
>> ts is ~650 and S2 is ~0.78 (chi2 ~50 in the aic file). How come such a
>> high ts (25000) isn't eliminated..?
>>
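A quick way to confirm this kind of back-and-forth between rounds is to compare the chi2 values pairwise. The snippet below is only a minimal sketch in plain Python, not part of relax. The chi2 values are copied from the ellipsoid table above, and the helper name find_repeats and the tolerance are assumptions made for illustration.

```python
import math

# chi2 per round, copied from the ellipsoid optimisation table above.
chi2_by_round = {
    1: 9282.2280010132217,
    2: 8793.0777454789404,
    3: 8767.5325004348124,
    4: 8765.5659442063006,
    5: 8761.0001889287214,
    6: 8744.6870170285692,
    7: 8761.0001889287269,
    8: 8744.6870170285729,
    9: 8761.0001889287269,
    10: 8744.6870170285656,
}


def find_repeats(chi2, rel_tol=1e-9):
    """Return (earlier_round, later_round) pairs with matching chi2.

    Hypothetical helper: two rounds "match" when their chi2 values agree
    to within rel_tol, which is an arbitrary choice for this sketch.
    """
    rounds = sorted(chi2)
    pairs = []
    for i, later in enumerate(rounds):
        for earlier in rounds[:i]:
            if math.isclose(chi2[earlier], chi2[later], rel_tol=rel_tol):
                pairs.append((earlier, later))
    return pairs


repeats = find_repeats(chi2_by_round)
for earlier, later in repeats:
    print("round %d repeats round %d" % (later, earlier))
```

With the numbers above, the odd rounds from 5 on match each other and so do the even rounds from 6 on, which is the two-state interchange described in this paragraph.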
>
> In mathematical modelling, model elimination or model validation must
> occur prior to the model selection step. This is when ts is at ~1.2
> ns, and hence the model is not eliminated. The final optimisation is
> shifting ts up to 25 ns, and this is likely to be the thing causing
> the optimisation to take soooo long! Is there something particular
> with this residue?
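The ordering described here (elimination first, then selection) can be sketched in a few lines of plain Python. This is not relax code, and several things are assumptions made for illustration: the cutoff of 1.5 * tm for internal correlation times, the parameter counts k, the AIC written as chi2 + 2k, and the use of the two AIC rows from rounds 9 and 10 as if they were candidates in the same round.

```python
TM_PS = 12452.0  # tm of 12.452 ns from the round 10 row, converted to ps
CUTOFF = 1.5     # assumed elimination factor, not stated in this thread

# Values taken from the AIC rows of the table; k is an assumed parameter count.
candidates = [
    {"model": "m5", "k": 3, "chi2": 52.0, "ts": 698.0},
    {"model": "m6", "k": 4, "chi2": 39.0, "ts": 1173.0},
]


def eliminated(ts_ps, tm_ps=TM_PS, cutoff=CUTOFF):
    """Flag a model whose slow internal correlation time exceeds cutoff * tm."""
    return ts_ps > cutoff * tm_ps


def select_by_aic(models):
    """Drop eliminated models first, then pick the lowest AIC = chi2 + 2k."""
    survivors = [m for m in models if not eliminated(m["ts"])]
    return min(survivors, key=lambda m: m["chi2"] + 2 * m["k"])


best = select_by_aic(candidates)
print("selected:", best["model"])

# The ts reached only in the final optimisation would have failed the check,
# but by then the elimination step has already been passed.
print("ts = 24904 ps eliminated?", eliminated(24904.0))
```

Under these assumptions both models pass elimination with their pre-selection ts values, and the 24904 ps value only appears afterwards, which is the situation described above.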
>
> The iteration numbers are low, but these may be the number of
> iterations of the method of multipliers algorithm. For each iteration
> there could possibly be thousands of steps of the Newton subalgorithm.
> I can't remember how the iteration number is generated, but the
> print_flag option may show if this is the case.
>
>
>
>> round AIC_or_OPT model S2 S2f S2s tf ts chi2
>> ===== ========== ===== === ==== ==== ====== ====== =========
>> 9 AIC 5 0.78 0.96 0.81 None 698 52
>> 10 AIC 6 0.78 0.97 0.80 11.2 1173 39
>> 9 OPT 5 0.78 0.96 0.81 None 630 ---
>> 10 OPT 6 0.73 0.93 0.79 16.8 24904 ---
>>
>>
>> Fourth, the previous runs were made on 4 different computers which give
>> almost exactly the same calculation time, differing by maybe 10-15%...
>> This shouldn't be what's causing those extremely long times...
>>
>
> This is unlikely to be the problem, but I was just wondering in case
> there was an operating system or platform specific bug possibly in the
> Numeric code.
>
>
>
>> Fifth, I used the default algorithm within the full_analysis.py script.
>> How can I change the optimization algorithm so it's a two stage procedure
>> like you proposed ? Should I run several times with MIN_ALGOR = 'simplex'
>> and, after a few runs (maybe when the chi2 and number of iterations get to a
>> plateau) switch to MIN_ALGOR = 'newton' ?
>>
>
> Simply have two lines, one after the other, in the code where the
> minimise() user function is located. I.e. in the file 'full_analysis.py'
> of the current 1.2 repository line:
>
> # Minimise all parameters.
> minimise('simplex', run=name)
> minimise(MIN_ALGOR, run=name)
>
> # Write the results.
> ...
>
>
> That should be enough to solve the problem (hopefully).
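The idea behind the two calls (a rough, derivative-free pass to get near the minimum, then a Newton pass to finish to high precision) can be illustrated outside of relax. The following is only a toy sketch in plain Python on a one-dimensional function. The pattern search stands in for simplex and is not the simplex algorithm, the Newton step has no Hessian modification or constraints, and all function names are hypothetical.

```python
def f(x):
    """Toy target function with its minimum at x = 2."""
    return (x - 2.0) ** 2 + 0.1 * (x - 2.0) ** 4


def df(x):
    """First derivative of f."""
    return 2.0 * (x - 2.0) + 0.4 * (x - 2.0) ** 3


def d2f(x):
    """Second derivative of f (always positive here)."""
    return 2.0 + 1.2 * (x - 2.0) ** 2


def pattern_search(func, x, step=1.0, tol=1e-2):
    """Stage 1: crude derivative-free search.

    Try x - step and x + step; move if one is better, otherwise halve the step.
    """
    while step > tol:
        best = min((x - step, x, x + step), key=func)
        if best == x:
            step /= 2.0
        x = best
    return x


def newton(grad, hess, x, tol=1e-12, max_iter=50):
    """Stage 2: plain Newton iterations, returning (x, iterations used)."""
    for i in range(max_iter):
        delta = grad(x) / hess(x)
        x -= delta
        if abs(delta) < tol:
            return x, i + 1
    return x, max_iter


x_coarse = pattern_search(f, x=-10.3)
x_fine, n_newton = newton(df, d2f, x_coarse)
print("coarse: %.6f  fine: %.12f  newton iterations: %d" % (x_coarse, x_fine, n_newton))
```

The point of the split is that the Newton stage starts close to the minimum, so it needs only a handful of iterations.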
>
> Cheers,
>
> Edward
>
>
>
>
>> I think that's almost everything I can find now...
>>
>> Let me know if you know how to catch those problems before they appear...
>>
>> Cheers
>>
>>
>> Séb :)
>>
>>
>>
>>
>>
>>
>> Edward d'Auvergne wrote:
>> Hi,
>>
>> I've been trying to think of what could possibly be causing these
>> really long times, but I'm really not sure what is happening.
>> Unfortunately there just was not enough information in the post to
>> decipher the key to this problem. Is there something special about
>> those 7 residues? How accurate do you think their orientations are in
>> the PDB file you are using? And how accurate is the PDB file itself
>> with respect to all parts of the system?
>>
>> Have you had a chance to investigate further as to what the issue
>> might be? For example, which part of the calculation is taking the
>> time? Is it the global optimisation of all parameters? Are the final
>> results of each round similar or completely different (in terms of
>> selected models and parameter values)? How do the iteration numbers
>> compare at each stage? Essentially a fine analysis and comparison of the
>> results files and the printout from relax will be necessary to track
>> down this abnormal computation time. Oh, are you running these on the
>> same computer as the previous analysis?
>>
>> As for the optimisation algorithm being stuck, if you've used the
>> default algorithm then this shouldn't happen. Optimisation should
>> terminate. There are certain very rare situations where the algorithm
>> known as the GMW Hessian modification, which is used by default as a
>> subalgorithm by the Newton algorithm in relax, can take large amounts
>> of time to complete. You'll see this as an increase in the number of
>> iterations by 4 to 5 orders of magnitude. One way to test this is to
>> use a lower quality optimisation algorithm first and then complete to
>> high precision with the Newton algorithm. In this case I would use
>> simplex first followed by the default Newton algorithm and its default
>> subalgorithms. In all cases constraints should be used. This will
>> only solve the long computation times if the GMW algorithm is at
>> fault.
>>
>> Regards,
>>
>> Edward
>>
>>
>> On 9/4/07, Sebastien Morin <[EMAIL PROTECTED]> wrote:
>>
>>
>> Hi all,
>>
>> I am using the full_analysis.py script with data at three magnetic fields.
>>
>> After a first complete cycle (going through the final optimization), I
>> realized that a few residues had extremely high chi-squared values (>
>> 1000) no matter the diffusion model or model-free model chosen...
>>
>> So I removed those residues (7 out of 222) and started the full_analysis
>> protocol again.
>>
>> However, the optimization times are now extremely long and I should get
>> the final results in weeks...
>>
>>
>> Here are the available times (for local_tm, sphere and ellipsoid) :
>>
>>
>> Diffusion_model  Round  Time-before_N=222  X2       Time-now_N=215  X2
>> ===============  =====  =================  =======  ==============  =======
>> local_tm         ---    12h30              45949    14h30           5802     OK, X2 much smaller
>>
>> sphere           init   ---                1154338  ---             249255
>>                  1      2h30               65654    36h00           10303    Long, but X2 much smaller
>>                  2      2h30               65654    > 30h00
>>
>> ellipsoid        init   ---                753535   ---             177764
>>                  1      4h00               64592    > 67h00         ??
>>                  2      2h30               64592    not_there_yet
>>
>> Is it possible that the algorithms get stuck somewhere during the
>> optimization..?
>>
>> I thought that removing badly fit residues would, on the contrary, speed
>> up calculations...
>>
>> Thanks for ideas !
>>
>>
>> Sébastien :)
>>
>> --
>> ______________________________________
>> _______________________________________________
>> | |
>> || Sebastien Morin ||
>> ||| Etudiant au PhD en biochimie |||
>> |||| Laboratoire de resonance magnetique nucleaire ||||
>> ||||| Dr Stephane Gagne |||||
>> |||| CREFSIP (Universite Laval, Quebec, CANADA) ||||
>> ||| 1-418-656-2131 #4530 |||
>> || ||
>> |_______________________________________________|
>> ______________________________________
>>
>>
>>
>> _______________________________________________
>> relax (http://nmr-relax.com)
>>
>> This is the relax-users mailing list
>> [email protected]
>>
>> To unsubscribe from this list, get a password
>> reminder, or change your subscription options,
>> visit the list information page at
>> https://mail.gna.org/listinfo/relax-users
>>
>>
>>
>>
>>
>>
>>
>>
>
>