Re: [galaxy-dev] python egg cache exists error

Jorrit Boekel Tue, 18 Sep 2012 07:38:39 -0700

Hi again,

I have looked into this matter a little bit more, and it looks like this is happening:


- tasked job is split

- task commands are sent to workers (I am running 8-core high-CPU extra large workers on EC2)

- per task, worker runs env.sh for the respective tool
- per task, worker runs scripts/extract_dataset_part.py

- this script issues import statements (the ones for simplejson and galaxy.model.mapping have caused me problems)
- which lead to unzipping .so libraries from python eggs into the nodes' /home/galaxy/.python-eggs
- this runs into lib/pkg_resources.py and its _bypass_ensure_directory method, which creates the temporary dir for the egg unzip
- since there are 8 processes on the node, sometimes this method tries to mkdir a directory that was just made by another process after the isdir check.
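
To illustrate the suspected race (an isdir check followed by a mkdir that another process wins), here is a minimal sketch of a directory-creation helper that tolerates a concurrent creator. The name ensure_directory is hypothetical and this is not the code in lib/pkg_resources.py; it only shows the general pattern of attempting the creation and treating "already exists" as success.

```python
import errno
import os


def ensure_directory(path):
    """Create path if it is missing; do not fail if another process wins the race.

    Hypothetical helper for illustration, not the pkg_resources implementation.
    """
    try:
        os.makedirs(path)
    except OSError as exc:
        # EEXIST means something already sits at this path. That is fine
        # when it is a directory (for example, one just created by another
        # task); any other error, or a non-directory at the path, is real.
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```

Because the creation is attempted first and the "already exists" case is handled afterwards, there is no window between a check and a mkdir for another of the 8 processes to slip into.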

That last point is my guessing. I don't really know how to solve this in a non-hackish way, so until someone finds out, I may use reading from a 'eggs_extracted.txt' file to determine if the eggs have been extracted. And locking the file when writing to it of course.
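
A rough sketch of that marker-file idea, assuming a Unix node where fcntl.flock is available. The function name extract_once, the extract callback and the "done" marker content are all made up for illustration; only the 'eggs_extracted.txt' file name comes from the plan above.

```python
import fcntl
import os


def extract_once(marker_path, extract):
    """Call extract() at most once per node, guarded by an exclusive file lock.

    Returns True if this call did the extraction, False if the marker
    showed it had already been done. Hypothetical helper, Unix only.
    """
    # "a+" creates the marker if needed and never truncates existing content.
    with open(marker_path, "a+") as marker:
        # Blocks until every other process holding the lock has released it.
        fcntl.flock(marker, fcntl.LOCK_EX)
        try:
            marker.seek(0)
            if marker.read().strip() == "done":
                return False
            extract()
            marker.write("done\n")
            marker.flush()
            os.fsync(marker.fileno())
            return True
        finally:
            fcntl.flock(marker, fcntl.LOCK_UN)
```

The first task on a node to take the lock would do the extraction (for instance by triggering the problematic imports) while the other tasks wait, then read the marker and skip it.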


cheers,
jorrit

On 09/14/2012 10:57 AM, Jorrit Boekel wrote:

Dear list,
I am running galaxy-dist on Amazon EC2 through Cloudman, and am using the enable_tasked_jobs to run jobs in parallel. Yes, I know it's not recommended in production. My jobs usually get split in 72 parts, and sometimes (but not always, maybe in 30-50% of cases), errors are returned concerning the python egg cache, usually:
[Errno 17] File exists: '/home/galaxy/.python-eggs'

or something like
[Errno 17] File exists: '/home/galaxy/.python-eggs/simplejson-2.1.1-py2.7-linux-x86_64-ucs4.egg-tmp'
The errors arise AFAIK from when scripts/extract_dataset_part.py is run. I am guessing that the tmp python egg dir is created for every task of the mentioned 72, that they sometimes coincide and that this leads to an error.
I would like to solve this problem, but before doing so, I'd like to know if someone else has already fixed it in a galaxy-central changeset.
cheers,
jorrit

___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

 http://lists.bx.psu.edu/
