Re: Problem during "final" run of d'Auvergne Protocol

Edward d'Auvergne Tue, 06 Mar 2012 03:50:12 -0800

Hi,

Thank you for all the details.  That really helps in narrowing down
the bug!  From all the info, the bug is without doubt within the
multi-processor package.  Cheers.  If you have a little time, we can
work together and fix this.  The changes/fixes will go into the
repository version, so you'll need a copy of that for testing.  Do you
have the Subversion program installed?  If so, you can obtain the most
up-to-date copy from the repository by typing:


$ svn co svn://svn.gna.org/svn/relax/1.3 relax-1.3

or if this doesn't work:

$ svn co http://svn.gna.org/svn/relax/1.3 relax-1.3

If you already have a checked out copy, you can update to the newest
copy by typing:

$ svn up
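
As a rough sketch of the choice between the two cases above (fresh
checkout versus updating an existing copy), the small Python helper
below picks the command to run.  The function name svn_command and the
default directory name are illustrative assumptions only; the
repository URL is the one given above.

```python
import os

# Repository URL as given in this message.
REPO_URL = "svn://svn.gna.org/svn/relax/1.3"


def svn_command(workdir="relax-1.3", url=REPO_URL):
    """Return (command, directory to run it in) for obtaining the newest copy.

    Hypothetical helper:  if 'workdir' already looks like a checked out copy
    (it contains a '.svn' directory), update it, otherwise check out afresh.
    """
    if os.path.isdir(os.path.join(workdir, ".svn")):
        # Existing checked out copy:  equivalent to running 'svn up' inside it.
        return ["svn", "up"], workdir

    # No copy yet:  equivalent to 'svn co <url> <workdir>'.
    return ["svn", "co", url, workdir], "."
```

The returned pair could then be handed to something like
subprocess.check_call(cmd, cwd=directory), assuming the svn program is
installed and on the path.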

I'll look at the second bug you've identified later.  It would be
appreciated if you created a second bug report for that problem too.
I would not recommend reverting to earlier relax versions due to the
number of bug fixes and other problems solved since then.  This should
not affect the model-free results, but the bugs could bite elsewhere.
Hopefully I can fix this problem quickly.
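
To illustrate the kind of failure suspected in this thread (an
exception being raised again while the error handler is printing the
traceback), here is a minimal Python sketch of a handler which cannot
itself raise.  This is not the relax multi-processor code and not
necessarily how the bug will be fixed; the names safe_report and
run_guarded are hypothetical, and the sketch assumes Python 3 rather
than the Python 2.6 shown in the tracebacks.

```python
import sys
import traceback


def safe_report(exc, stream=sys.stderr):
    """Print the traceback for 'exc', never letting the reporting itself raise."""
    try:
        traceback.print_exception(type(exc), exc, exc.__traceback__, file=stream)
    except Exception:
        # Printing the full traceback failed (for example with a MemoryError),
        # so fall back to a one-line message instead of re-entering the handler.
        try:
            stream.write("Error: %s\n" % type(exc).__name__)
        except Exception:
            # Nothing more can be done safely, so give up silently.
            pass


def run_guarded(task, report=safe_report):
    """Run 'task', returning True on success and False if it raised."""
    try:
        task()
        return True
    except Exception as exc:
        # The error is reported exactly once and is never raised again.
        report(exc)
        return False
```

The point of the design is only that a second error inside the handler
is swallowed rather than handled by the same code path again, so the
process ends with a short message instead of slowly exhausting memory.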

Cheers,

Edward


P. S.  For reference, the bug report is https://gna.org/bugs/?19528.



On 6 March 2012 12:18, Hugh RW Dannatt <[email protected]> wrote:
> Hi Edward,
>
> Your description sounds very likely to be the cause of the problem:
> during the time when no output is being produced, the computer gets
> gradually slower and slower before finally giving up.
>
> The error is reproducible such that I have tried it on a couple of
> different machines and it has failed several times at the same stage.
> The error messages tend to vary a little, however. Here are another 2
> of the outputs given when the program has failed (I should clarify all
> of these messages came from runs done on the same machine, and the
> second was run with option "-d" but it hasn't helped very much):-
>
> Simulation 492
> Simulation 493
> Simulation 494
> Simulation 495
> Simulation 496
> Simulation 497
> Simulation 498
> Simulation 499
> Simulation 500
> Traceback (most recent call last):
>  File "/usr/local/relax-1.3.13/multi/uni_processor.py", line 136, in run
>    self.callback.init_master(self)
>  File "/usr/local/relax-1.3.13/multi/processor.py", line 263, in default_init_master
> Traceback (most recent call last):
>  File "/usr/local/bin/relax", line 7, in <module>
>    relax.start()
>  File "/usr/local/relax-1.3.13/relax.py", line 100, in start
>    processor.run()
>  File "/usr/local/relax-1.3.13/multi/uni_processor.py", line 139, in run
>    self.callback.handle_exception(self, e)
>  File "/usr/local/relax-1.3.13/multi/processor.py", line 250, in default_handle_exception
>    traceback.print_exc(file=sys.stderr)
>  File "/usr/lib/python2.6/traceback.py", line 227, in print_exc
>    print_exception(etype, value, tb, limit, file)
>  File "/usr/lib/python2.6/traceback.py", line 125, in print_exception
>    print_tb(tb, limit, file)
>  File "/usr/lib/python2.6/traceback.py", line 69, in print_tb
>    line = linecache.getline(filename, lineno, f.f_globals)
>  File "/usr/lib/python2.6/linecache.py", line 14, in getline
>    lines = getlines(filename, module_globals)
>  File "/usr/lib/python2.6/linecache.py", line 40, in getlines
>    return updatecache(filename, module_globals)
>  File "/usr/lib/python2.6/linecache.py", line 136, in updatecache
>    lines = fp.readlines()
> MemoryError
> 9203.219u 258.488s 8:05:09.46 32.5%     0+0k 90962440+0io 2215895pf+0w
>
> ------------------
>
> Simulation 489
> Simulation 490
> Simulation 491
> Simulation 492
> Simulation 493
> Simulation 494
> Simulation 495
> Simulation 496
> Simulation 497
> Simulation 498
> Simulation 499
> Simulation 500
> debug> Execution lock:  Release by 'script UI' ('script' mode).
> debug> Execution lock:  Release by 'script UI' ('script' mode).
> Traceback (most recent call last):
>  File "/progs/Linux/bin/relax13", line 7, in <module>
>    relax.start()
>  File "/progs/relax-1.3.13/relax.py", line 100, in start
>    processor.run()
>  File "/progs/relax-1.3.13/multi/uni_processor.py", line 139, in run
>    self.callback.handle_exception(self, e)
>  File "/progs/relax-1.3.13/multi/processor.py", line 250, in default_handle_exception
>    traceback.print_exc(file=sys.stderr)
>  File "/usr/lib/python2.6/traceback.py", line 227, in print_exc
>    print_exception(etype, value, tb, limit, file)
> MemoryError
>
> 8006.268u 542.873s 8:34:11.81 27.7%     0+0k 225824840+0io 6192344pf+0w
>
> ------------------
>
> If the number of MC simulations is dropped even as little as 100, the
> program finishes the fitting successfully, though I then get an error
> message to do with the grace files (I've not been using them so I'm
> not bothered about this though it will be of interest to you no
> doubt):-
>
> Data pipe 'final':  The ts value of 2.6285e-08 is greater than 1.9714e-08, eliminating simulation 94 of spin system ':218@N'.
> Data pipe 'final':  The ts value of 2.6285e-08 is greater than 1.9714e-08, eliminating simulation 95 of spin system ':218@N'.
>
> relax> monte_carlo.error_analysis(prune=0.0)
>
> relax> results.write(file='results', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final', compress_type=1, force=True)
> Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/results.bz2' for writing.
>
> relax> grace.write(x_data_type='spin', y_data_type='s2', spin_id=None, plot_data='value', file='s2.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
> Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/s2.agr' for writing.
>
> relax> grace.write(x_data_type='spin', y_data_type='s2f', spin_id=None, plot_data='value', file='s2f.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
> Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/s2f.agr' for writing.
>
> relax> grace.write(x_data_type='spin', y_data_type='s2s', spin_id=None, plot_data='value', file='s2s.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
> Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/s2s.agr' for writing.
>
> relax> grace.write(x_data_type='spin', y_data_type='te', spin_id=None, plot_data='value', file='te.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
> Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/te.agr' for writing.
>
> relax> grace.write(x_data_type='spin', y_data_type='tf', spin_id=None, plot_data='value', file='tf.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
> Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/tf.agr' for writing.
>
> relax> grace.write(x_data_type='spin', y_data_type='ts', spin_id=None, plot_data='value', file='ts.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
> Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/ts.agr' for writing.
>
> relax> grace.write(x_data_type='spin', y_data_type='rex', spin_id=None, plot_data='value', file='rex.agr', dir='/ld10c/home1/hugh/data/pgm298bq/relax/final/grace', force=True, norm=False)
> Opening the file '/ld10c/home1/hugh/data/pgm298bq/relax/final/grace/rex.agr' for writing.
> debug> Execution lock:  Release by 'script UI' ('script' mode).
> debug> Execution lock:  Release by 'script UI' ('script' mode).
> Traceback (most recent call last):
>  File "/ld10c/progs/relax-1.3.13/prompt/interpreter.py", line 383, in exec_script
>    runpy.run_module(module, globals)
>  File "/usr/lib/python2.6/runpy.py", line 140, in run_module
>    fname, loader, pkg_name)
>  File "/usr/lib/python2.6/runpy.py", line 34, in _run_code
>    exec code in run_globals
>  File "/ld10c/home1/hugh/data/pgm298bq/relax/dauvergne_protocol_lessMC.py", line 216, in <module>
>    dAuvergne_protocol(pipe_name=name, diff_model=DIFF_MODEL, mf_models=MF_MODELS, local_tm_models=LOCAL_TM_MODELS, grid_inc=GRID_INC, min_algor=MIN_ALGOR, mc_sim_num=MC_NUM, conv_loop=CONV_LOOP)
>  File "/ld10c/progs/relax-1.3.13/auto_analyses/dauvergne_protocol.py", line 223, in __init__
>    self.execute()
>  File "/ld10c/progs/relax-1.3.13/auto_analyses/dauvergne_protocol.py", line 710, in execute
>    self.write_results()
>  File "/ld10c/progs/relax-1.3.13/auto_analyses/dauvergne_protocol.py", line 837, in write_results
>    self.interpreter.grace.write(x_data_type='spin', y_data_type='rex', file='rex.agr',       dir=dir, force=True)
>  File "/ld10c/progs/relax-1.3.13/prompt/grace.py", line 103, in write
>    grace.write(x_data_type=x_data_type, y_data_type=y_data_type, spin_id=spin_id, plot_data=plot_data, file=file, dir=dir, force=force, norm=norm)
>  File "/ld10c/progs/relax-1.3.13/generic_fns/grace.py", line 366, in write
>    write_xy_header(sets=len(data[0]), file=file, data_type=[x_data_type, y_data_type], seq_type=seq_type, set_names=set_names, norm=norm)
>  File "/ld10c/progs/relax-1.3.13/generic_fns/grace.py", line 600, in write_xy_header
>    units = return_units(data_type[i])
>  File "/ld10c/progs/relax-1.3.13/specific_fns/model_free/main.py", line 2394, in return_units
>    raise RelaxNoSpinSpecError
> RelaxNoSpinSpecError: RelaxError: The spin system must be specified.
>
>
> 3510.479u 20.741s 59:07.76 99.5%        0+0k 0+3368io 0pf+0w
>
> ------------------
>
> Finally, this is the output from relax --info as requested:-
>
>                                            relax 1.3.13
>
>                              Molecular dynamics by NMR data analysis
>
>                             Copyright (C) 2001-2006 Edward d'Auvergne
>                         Copyright (C) 2006-2011 the relax development team
>
> This is free software which you are welcome to modify and redistribute
> under the conditions of the
> GNU General Public License (GPL).  This program, including all
> modules, is licensed under the GPL
> and comes with absolutely no warranty.  For details type 'GPL' within
> the relax prompt.
>
> Assistance in using the relax prompt and scripting interface can be
> accessed by typing 'help' within
> the prompt.
>
> Processor fabric:  Uni-processor.
>
> Hardware information:
>    Machine:                 i686
>    Processor:
>
> System information:
>    System:                  Linux
>    Release:                 2.6.32-37-generic
>    Version:                 #81-Ubuntu SMP Fri Dec 2 20:35:14 UTC 2011
>    GNU/Linux version:       Ubuntu 10.04 lucid
>    Distribution:            Ubuntu 10.04 lucid
>    Full platform string:
> Linux-2.6.32-37-generic-i686-with-Ubuntu-10.04-lucid
>
> Software information:
>    Architecture:            32bit ELF
>    Python version:          2.6.5
>    Python branch:           tags/r265
>    Python build:            r265:79063, Apr 16 2010 13:09:56
>    Python compiler:         GCC 4.4.3
>    Python implementation:   CPython
>    Python revision:         79063
>    Numpy version:           1.3.0
>    Libc version:            glibc 2.4
>
> Python packages (most are optional):
>
> Package              Installed       Version         Path
> minfx                True            Unknown         /ld10c/progs/relax-1.3.13/minfx
> bmrblib              True            Unknown         /ld10c/progs/relax-1.3.13/bmrblib
> numpy                True            1.3.0           /usr/lib/python2.6/dist-packages/numpy
> scipy                True            0.7.0           /usr/lib/python2.6/dist-packages/scipy
> wxPython             False
> mpi4py               False
> epydoc               False
> optparse             True            1.5.3           /usr/lib/python2.6/optparse.pyc
> readline             True                            /usr/lib/python2.6/lib-dynload/readline.so
> profile              True                            /usr/lib/python2.6/profile.pyc
> bz2                  True                            /usr/lib/python2.6/lib-dynload/bz2.so
> gzip                 True                            /usr/lib/python2.6/gzip.pyc
> os.devnull           True                            /usr/lib/python2.6/os.pyc
>
> Compiled relax C modules:
>    Relaxation curve fitting: True
>
> ------------------
>
> Apologies for all the detail but I'm not really sure what to do here.
> If it is the multi-processor part of it that is failing, is installing
> relax 1.3.11 an option? I previously had 1.3.10 installed and the
> commands seem to have changed quite a lot since then. What is your
> opinion on the validity of error estimates based on 100 simulations?
>
> Thanks
>
> Hugh
>
>
>
> On 5 March 2012 08:33, Edward d'Auvergne <[email protected]> wrote:
>> Hi Hugh,
>>
>> I'm pretty sure this error has not been encountered before.  It at
>> least hasn't been reported.  I've never seen anything close to this
>> before, but I would guess that this is an infinitely recursive
>> exception (the error is being caught but, in the process, the error
>> occurs again, being caught a second time, then the 3rd error occurs,
>> is caught a 3rd time, with this continuing until your computer runs
>> out of RAM and swap space and relax is killed by the operating
>> system).  The error seems to occur within the error handling portion of
>> Gary Thompson's multi-processor framework (you are using the
>> uni-processor fabric of the framework here), so maybe Gary might know
>> a solution?
>>
>> Is this error reproducible?  For testing, can you drop the number of
>> Monte Carlo simulations down to say 5?  Running relax with the debug
>> flag might also help:
>>
>> $ relax --debug
>>
>> or:
>>
>> $ relax -d
>>
>> Are you using the GUI or scripting user interface?  The output of:
>>
>> $ relax --info
>>
>> might also be useful.  As for your data set being too large, relax has
>> been used on much bigger systems before so this should not be an
>> issue.  One last thing, would you be able to create a bug report for
>> this error (https://gna.org/bugs/?func=additem&group=relax)?  All of
>> the info/log files can then be pasted/attached there, and it is a
>> useful future reference for anyone who encounters the same or a
>> similar bug.
>>
>> Cheers,
>>
>> Edward
>>
>>
>>
>> On 2 March 2012 12:33, Hugh RW Dannatt <[email protected]> wrote:
>>> Dear All,
>>>
>>> Having completed the fitting of 1 dataset without any problems, I am
>>> now moving onto another. Everything has worked fine until I change the
>>> DIFF_MODEL to "final" and try to run the program again to get error
>>> estimates on my fitted parameters.
>>>
>>> The program successfully re-opens all the results file and selects the
>>> diffusion model. Then all 500 simulations are done without issue, but
>>> as soon as the program has finished this, it stops outputting anything
>>> to the screen for a long time (>12 hrs). During this time, the CPU and
>>> Memory use is very high and the computer runs slowly. Eventually I get
>>> a "Memory Error" and a whole load of messages outputted to the screen,
>>> which I have pasted below. I should emphasize that all the stages of
>>> running this program with different diffusion models have run fine,
>>> and the computer I'm using is a relatively fast machine (dual core
>>> Pentium 4, 2 GB RAM).
>>>
>>> Has anyone had a similar problem? This dataset is larger than the
>>> previous one which fit without issue (current one has 6 measurements
>>> per 176 residues), but I can't imagine this being the cause of this
>>> problem.
>>>
>>> Thanks
>>>
>>> Hugh
>>>
>>> ----
>>>
>>> Simulation 485
>>> Simulation 486
>>> Simulation 487
>>> Simulation 488
>>> Simulation 489
>>> Simulation 490
>>> Simulation 491
>>> Simulation 492
>>> Simulation 493
>>> Simulation 494
>>> Simulation 495
>>> Simulation 496
>>> Simulation 497
>>> Simulation 498
>>> Simulation 499
>>> Simulation 500
>>>
>>>
>>> Traceback (most recent call last):
>>>  File "/progs/relax-1.3.13/multi/uni_processor.py", line 136, in run
>>>    self.callback.init_master(self)
>>>  File "/progs/relax-1.3.13/multi/processor.py", line 263, in
>>> default_init_master
>>>    self.master.run()
>>>  File "/progs/relax-1.3.13/relax.py", line 171, in run
>>>    self.interpreter.run(self.script_file)
>>>  File "/progs/relax-1.3.13/prompt/interpreter.py", line 300, in run
>>>    return run_script(intro=self.__intro_string, local=locals(),
>>> script_file=script_file, quit=self.__quit_flag,
>>> show_script=self.__show_script,
>>> raise_relax_error=self.__raise_relax_error)
>>>  File "/progs/relax-1.3.13/prompt/interpreter.py", line 610, in run_script
>>>    return console.interact(intro, local, script_file, quit,
>>> show_script=show_script, raise_relax_error=raise_relax_error)
>>>  File "/progs/relax-1.3.13/prompt/interpreter.py", line 495, in interact_script
>>>    exec_script(script_file, local)
>>>  File "/progs/relax-1.3.13/prompt/interpreter.py", line 383, in exec_script
>>>    runpy.run_module(module, globals)
>>>  File "/usr/lib/python2.6/runpy.py", line 140, in run_module
>>>    fname, loader, pkg_name)
>>>  File "/usr/lib/python2.6/runpy.py", line 34, in _run_code
>>>    exec code in run_globals
>>>  File "/home1/hugh/data/pgm298bq/relax/dauvergne_protocol.py", line
>>> 216, in <module>
>>>    dAuvergne_protocol(pipe_name=name, diff_model=DIFF_MODEL,
>>> mf_models=MF_MODELS, local_tm_models=LOCAL_TM_MODELS,
>>> grid_inc=GRID_INC, min_algor=MIN_ALGOR, mc_sim_num=MC_NUM,
>>> conv_loop=CONV_LOOP)
>>>  File "/progs/relax-1.3.13/auto_analyses/dauvergne_protocol.py", line
>>> 223, in __init__
>>> Traceback (most recent call last):
>>>  File "/progs/Linux/bin/relax13", line 7, in <module>
>>>    relax.start()
>>>  File "/progs/relax-1.3.13/relax.py", line 100, in start
>>>    processor.run()
>>>  File "/progs/relax-1.3.13/multi/uni_processor.py", line 139, in run
>>>    self.callback.handle_exception(self, e)
>>>  File "/progs/relax-1.3.13/multi/processor.py", line 250, in
>>> default_handle_exception
>>>    traceback.print_exc(file=sys.stderr)
>>>  File "/usr/lib/python2.6/traceback.py", line 227, in print_exc
>>>    print_exception(etype, value, tb, limit, file)
>>>  File "/usr/lib/python2.6/traceback.py", line 125, in print_exception
>>>    print_tb(tb, limit, file)
>>>  File "/usr/lib/python2.6/traceback.py", line 69, in print_tb
>>>    line = linecache.getline(filename, lineno, f.f_globals)
>>>  File "/usr/lib/python2.6/linecache.py", line 14, in getline
>>>    lines = getlines(filename, module_globals)
>>>  File "/usr/lib/python2.6/linecache.py", line 40, in getlines
>>>    return updatecache(filename, module_globals)
>>>  File "/usr/lib/python2.6/linecache.py", line 136, in updatecache
>>>    lines = fp.readlines()
>>> MemoryError
>>> 9078.655u 666.933s 10:55:29.66 24.7%    0+0k 241482000+0io 6665721pf+0w
>>>
>>> _______________________________________________
>>> relax (http://nmr-relax.com)
>>>
>>> This is the relax-users mailing list
>>> [email protected]
>>>
>>> To unsubscribe from this list, get a password
>>> reminder, or change your subscription options,
>>> visit the list information page at
>>> https://mail.gna.org/listinfo/relax-users
>
>
>
> --
> Hugh Dannatt
> PhD Student Researcher
>
> Prof. Jon Waltho Lab
> Department of Molecular Biology & Biotechnology
> University of Sheffield
> Firth Court
> Western Bank
> Sheffield
> S10 2TN
>
> 0114 222 2729
