Hi Ian,

On Thursday 10 July 2008 06:03:54 am Ian Harry wrote:
> Hi all,
>
> My colleagues and I use, and have used, matplotlib and its TeX
> capabilities quite extensively to create plots to assist in the
> gravitational wave searches we perform (and it has been a great tool for
> us :-) ). However, recently we have been running into problems since we
> started automating our plot generation by running multiple plotting jobs
> concurrently using the condor scheduler (and dagmans). Many of our plotting
> jobs fail with messages such as the one below:
>
> ---snip---
>
> Traceback (most recent call last):
>   File "/home/romain/Projects/ligovirgo/s5_2yr_lv_lowcbc_20080625/868815014-868901414/868815014-868901414/inj001_summary_plots/../executables/plotinjnum", line 298, in ?
>     'eff_dist_h')
>   File "/home/romain/Projects/ligovirgo/s5_2yr_lv_lowcbc_20080625/868815014-868901414/868815014-868901414/inj001_summary_plots/../executables/plotinjnum", line 119, in plot_found_missed
>     fname_thumb = InspiralUtils.savefig_pylal(filename=fname, doThumb=True, dpi_thumb=opts.figure_resolution)
>   File "/home/romain/codes/s5_2yr_lv_lowcbc_20080625/pylal/lib64/python2.4/site-packages/pylal/InspiralUtils.py", line 58, in savefig_pylal
>     fig.savefig(filename_thumb, dpi=dpi_thumb)
>   ....
>   File "/usr/lib64/python2.4/site-packages/matplotlib/texmanager.py", line 259, in make_png
>     os.remove(outfile)
> OSError: [Errno 2] No such file or directory: '/home/romain/.matplotlib/tex.cache/ae479c90ff242327b54af004a0846188.output'
>
> ---snip---
>
> My feeling is that when the code invokes the TeX 'bit' it creates a temp
> file in ~/.matplotlib/tex.cache and then deletes it and all other temp tex
> files when it finishes the TeX 'bit'. This would cause problems if another
> job is in the middle of running TeX when the first job deletes its temp
> files!
>
> We are running a slightly old version of matplotlib (0.87.7); as we run on
> multiple clusters, our sys admins tend to update software only when there is
> a need to, and we have had no other problems with matplotlib. I apologize if
> this has been fixed in the meantime (I did do a quick search of the mailing
> list archive but found nothing). All our clusters currently run Fedora Core
> 4 (we're going to move to CentOS 5).
>
> Currently we are getting around this by forcing condor to retry the failed
> jobs 2 or 3 times, which catches most of these errors. Another solution would
> be to limit the number of jobs running to 1, but as we run dagmen from
> within one 'super' dagman it would prove difficult to limit jobs from
> multiple dagmen.
>
> Anyway, if anyone has any ideas of how to solve this I would appreciate
> it. Also, if there are any options where we can set the location of these
> temp tex files and use a different directory for each job (or stop
> matplotlib deleting other temp files), that would help us.
I'm really hesitant to mess around with the location of the temp files. It was
a bit painful trying to get usetex to work across platforms. Instead, would you
try replacing:

    os.remove(outfile)

with:

    try:
        os.remove(outfile)
    except OSError:
        pass

Let me know if that fixes it, and if you need to wrap any other file deletions.

Thanks,
Darren

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users
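A slightly narrower variant of the try/except suggested in the reply can be sketched as follows. It ignores only the "No such file or directory" case seen in the traceback (errno 2, ENOENT) and still raises on other failures such as permission errors. The helper name remove_if_present is hypothetical and is not part of matplotlib; the code is written for current Python, so on Python 2.4 the except clause would need the older "except OSError, exc:" spelling.

```python
import errno
import os


def remove_if_present(path):
    """Delete path, treating "already gone" as success.

    Hypothetical helper (not a matplotlib function). Returns True if this
    call removed the file and False if the file was already missing, for
    example because a concurrent job deleted it first.
    """
    try:
        os.remove(path)
    except OSError as exc:
        # Swallow only ENOENT (errno 2); re-raise every other OSError.
        if exc.errno != errno.ENOENT:
            raise
        return False
    return True
```

In texmanager.py this would stand in for the bare os.remove(outfile) call, assuming the only failure worth ignoring is another job having removed the file already.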