Hi Rob,

Indeed, I have had (and still have) some difficulties with setting temporary directories. The problem was that FTP-uploaded data was first copied to TMPDIR before being put in the database directory.

My solution: I extended the /tmp partition to several GB by mounting a bigger device over it.

In addition, I have a large network share mounted on /mnt/galaxytemp, and __new_file_path__ points there. A bit messy indeed.

I just had another discussion about the temporary directories with John Chilton and Jeremy Goecks, which you can read here: https://bitbucket.org/galaxy/galaxy-central/pull-request/139/letting-cuffdiff-use-__new_file_path__-as/diff

From what I understood, __new_file_path__ is going to be phased out in favour of __job_working_directory__. But apparently the job_working_directory is not a temporary directory: in my case it contains symlinks from the job_working_directory to database/files, and by default it is part of Galaxy's database directory.

The suggestion is to set the TMPDIR environment variable to a directory you specify. I have one file in /home/galaxy that contains the environment settings and is sourced by the init script that launches Galaxy.
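
To illustrate why setting TMPDIR works: Python's tempfile module consults TMPDIR (and then TEMP and TMP) before falling back to directories such as /tmp. Below is a minimal sketch of that lookup; the helper name tempdir_for is made up for this example and is not part of Galaxy.

```python
import os
import tempfile


def tempdir_for(tmpdir_value):
    """Return the directory tempfile picks when TMPDIR is set to tmpdir_value.

    Hypothetical helper for illustration only. The directory must exist and
    be writable, otherwise tempfile moves on to its next candidate.
    """
    saved_env = os.environ.get("TMPDIR")
    saved_cache = tempfile.tempdir
    try:
        os.environ["TMPDIR"] = tmpdir_value
        tempfile.tempdir = None  # clear the cached choice so TMPDIR is re-read
        return tempfile.gettempdir()
    finally:
        # Put the environment and the cached value back as they were.
        if saved_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = saved_env
        tempfile.tempdir = saved_cache
```

With an existing, writable directory such as my /mnt/galaxytemp mount (used here only as an example path), tempdir_for("/mnt/galaxytemp") should return that directory instead of /tmp.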



Cheers,


Joachim Jacob

Rijvisschestraat 120, 9052 Zwijnaarde
Tel: +32 9 244.66.34
Bioinformatics Training and Services (BITS)
http://www.bits.vib.be
@bitsatvib



On Wed 27 Mar 2013 08:44:08 PM CET, Rob Hooft wrote:
Joachim, Nate,

Leon Mei pointed me to a mailing list post of August 2012 where you
two discussed a problem with uploads to Galaxy filling up /tmp. I
think I have traced this down now after we suffered from this too
several times.

There are a number of places where the location of temporary files is
configurable in Galaxy, but there is (at least) one place that uses the
Python default directory (it can be set with TMPDIR or some other
environment variables, but if you do not set them it is often /tmp).
The "unconfigurable" place is tools/data_source/upload.py, where the
code reads:

    if dataset.type == 'url':
        try:
            page = urllib.urlopen( dataset.path ) #page will be .close()ed by sniff methods
            temp_name, dataset.is_multi_byte = sniff.stream_to_file(
                page, prefix='url_paste',
                source_encoding=util.get_charset_from_http_headers( page.headers ) )
        except Exception, e:
            file_err( 'Unable to fetch %s\n%s' % ( dataset.path, str( e ) ),
                      dataset, json_file )
            return
        dataset.path = temp_name

sniff.stream_to_file uses the tempfile module, and since there is no
"dir=" in the argument list to this call, the temporary file is made
in /tmp. The central solution for the main galaxy code is in
lib/galaxy/config.py:

        self.new_file_path = resolve_path(
            kwargs.get( "new_file_path", "database/tmp" ), self.root )
        tempfile.tempdir = self.new_file_path

But this assignment to "tempdir" does not help in this case. Is that
because upload.py is a tool?
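
If upload.py indeed runs as a separate process (that is my assumption
here), the effect can be reproduced outside Galaxy. The sketch below
sets tempfile.tempdir in one process and asks a freshly started Python
interpreter which directory it would use; the names child_tempdir and
demo are made up for this example.

```python
import os
import subprocess
import sys
import tempfile

CHILD_CODE = "import tempfile; print(tempfile.gettempdir())"


def child_tempdir(env=None):
    """Start a fresh Python interpreter and return the tempdir it would use."""
    out = subprocess.check_output([sys.executable, "-c", CHILD_CODE], env=env)
    return out.decode().strip()


def demo(custom_dir):
    """Compare an in-process tempfile.tempdir override with a TMPDIR override."""
    saved = tempfile.tempdir
    try:
        # Same kind of assignment as in config.py: affects this process only.
        tempfile.tempdir = custom_dir
        in_parent = tempfile.gettempdir()
        # The child imports its own copy of tempfile and never sees the override.
        in_child = child_tempdir()
    finally:
        tempfile.tempdir = saved
    # An environment variable, in contrast, is inherited by the child.
    child_env = dict(os.environ, TMPDIR=custom_dir)
    in_child_with_env = child_tempdir(child_env)
    return in_parent, in_child, in_child_with_env
```

Called with an existing, writable directory, demo should report that
directory for the parent and for the child started with TMPDIR set, but
the system default for the child started without it.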

It would be nice to fix this, which we can obviously do ourselves for
our andromeda deployment, but it would be better to do it centrally.
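
As a rough sketch of the kind of fix I mean: the function that writes
the stream to disk takes the target directory explicitly and the caller
passes it in. The function below is a hypothetical stand-in written for
this mail, not Galaxy's sniff.stream_to_file, and how the tool would
learn the configured directory is left open.

```python
import io
import os
import tempfile


def stream_to_temp_file(stream, prefix="", dir=None, chunk_size=1 << 20):
    """Copy a binary stream to a new temporary file and return its path.

    Hypothetical stand-in, not Galaxy code. When dir is None, tempfile
    falls back to its default directory (TMPDIR, or often /tmp).
    """
    fd, temp_name = tempfile.mkstemp(prefix=prefix, dir=dir)
    with os.fdopen(fd, "wb") as out:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
    return temp_name


# Usage: an explicit directory keeps the data out of the default tempdir.
demo_dir = tempfile.mkdtemp()  # stands in for the configured new_file_path
demo_path = stream_to_temp_file(
    io.BytesIO(b"example data"), prefix="url_paste", dir=demo_dir)
```

In upload.py the call would then carry an extra dir= argument next to
prefix='url_paste'.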

Regards,

Rob

--
Rob W.W. Hooft
Chief Technology Officer BioAssist, Netherlands Bioinformatics Centre
http://www.nbic.nl/    Skype: robhooft    GSM: +31 6 27034319
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
 http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
 http://galaxyproject.org/search/mailinglists/
