Hi Rob,
Indeed, I have had some difficulties with setting temporary
directories. The problem was that data uploaded via FTP was first copied
to TMPDIR before being put in the database directory.
My solution: I extended the /tmp partition to several GB by mounting
a bigger device over it.
In addition, I have a large network share, which is mounted on
/mnt/galaxytemp. The __new_file_path__ points there. A bit messy
indeed.
I just had another discussion about the temporary directories with John
Chilton and Jeremy Goecks, which you can read here:
https://bitbucket.org/galaxy/galaxy-central/pull-request/139/letting-cuffdiff-use-__new_file_path__-as/diff
From what I understood, __new_file_path__ is going to be phased out in
favour of __job_working_directory__. But apparently, the
job_working_directory is not a temporary directory: in my case, it
contains symlinks from the job_working_directory to database/files. In
addition, job_working_directory is by default part of the database
directory of Galaxy.
The suggestion is to set the TMPDIR environment variable to a directory
you specify. I have one file in /home/galaxy that contains the
environment settings, and which gets sourced in the init script that
launches Galaxy.
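As a rough check that the variable is actually picked up, the sketch
below starts a fresh Python process with TMPDIR set and asks the
tempfile module which directory it would use. The helper name
tempdir_seen_by_child and the scratch directory are made up for
illustration; they are not part of Galaxy.

```python
import os
import subprocess
import sys
import tempfile


def tempdir_seen_by_child(tmpdir):
    """Return the temp directory a new Python process reports when TMPDIR=tmpdir."""
    # Copy the current environment and override only TMPDIR.
    env = dict(os.environ, TMPDIR=tmpdir)
    out = subprocess.check_output(
        [sys.executable, "-c", "import tempfile; print(tempfile.gettempdir())"],
        env=env,
    )
    return out.decode().strip()


if __name__ == "__main__":
    # Hypothetical stand-in for a large scratch area such as /mnt/galaxytemp.
    scratch = tempfile.mkdtemp(prefix="galaxy_scratch_")
    print(tempdir_seen_by_child(scratch))
```

If the printed path is not the directory you exported, the environment
file is probably not being sourced by the process that launches Galaxy.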
Cheers,
Joachim Jacob
Rijvisschestraat 120, 9052 Zwijnaarde
Tel: +32 9 244.66.34
Bioinformatics Training and Services (BITS)
http://www.bits.vib.be
@bitsatvib
On Wed 27 Mar 2013 08:44:08 PM CET, Rob Hooft wrote:
Joachim, Nate,
Leon Mei pointed me to a mailing list post of August 2012 where you
two discussed a problem with uploads to Galaxy filling up /tmp. I
think I have traced this down now after we suffered from this too
several times.
There are a number of places where temporary files are configurable in
Galaxy, but there is (at least) one place that uses the Python default
directory (it can be set with TMPDIR or some other environment variables,
but if you don't set it, it is often /tmp). The "unconfigurable" place is
tools/data_source/upload.py, where the code reads:
    if dataset.type == 'url':
        try:
            page = urllib.urlopen( dataset.path ) #page will be .close()ed by sniff methods
            temp_name, dataset.is_multi_byte = sniff.stream_to_file( page, prefix='url_paste', source_encoding=util.get_charset_from_http_headers( page.headers ) )
        except Exception, e:
            file_err( 'Unable to fetch %s\n%s' % ( dataset.path, str( e ) ), dataset, json_file )
            return
        dataset.path = temp_name
sniff.stream_to_file uses the tempfile module, and since there is no
"dir=" in the argument list to this call, the temporary file is made
in Python's default temporary directory, which is usually /tmp. The
central solution for the main Galaxy code is in lib/galaxy/config.py:
    self.new_file_path = resolve_path( kwargs.get( "new_file_path", "database/tmp" ), self.root )
    tempfile.tempdir = self.new_file_path
But this assignment to "tempfile.tempdir" does not help in this case,
presumably because upload.py is a tool and runs in its own process?
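A minimal sketch of why that would be so, assuming tools are launched
as separate processes: tempfile.tempdir is a module-level variable, so
setting it in one Python process is not visible in another. The names
child_tempdir and demo, and the directory used, are hypothetical.

```python
import subprocess
import sys
import tempfile

CHILD_CODE = "import tempfile; print(tempfile.gettempdir())"


def child_tempdir():
    """Ask a separate Python process which temp directory it would use."""
    out = subprocess.check_output([sys.executable, "-c", CHILD_CODE])
    return out.decode().strip()


def demo():
    """Set tempfile.tempdir in this process and compare with a child process."""
    original = tempfile.tempdir
    # Hypothetical stand-in for the configured new_file_path directory.
    custom = tempfile.mkdtemp(prefix="new_file_path_")
    try:
        tempfile.tempdir = custom           # same kind of assignment as in config.py
        parent_view = tempfile.gettempdir()  # this process now uses custom
        child_view = child_tempdir()         # the child never saw the assignment
    finally:
        tempfile.tempdir = original          # restore the previous setting
    return custom, parent_view, child_view


if __name__ == "__main__":
    custom, parent_view, child_view = demo()
    print("parent uses:", parent_view)
    print("child uses: ", child_view)
```

The child falls back to TMPDIR (or the platform default), which is why
an environment variable reaches tools while the assignment does not.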
It would be nice to fix this, which we can obviously do ourselves for
our andromeda deployment, but it would be better to do it centrally.
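One possible shape for such a fix, as a sketch only: give the helper
that writes the stream an explicit directory argument and pass it on to
tempfile. The function stream_to_file_in and its tmp_dir parameter are
invented here for illustration; this is not the actual signature of
sniff.stream_to_file.

```python
import io
import os
import tempfile

CHUNK_SIZE = 1 << 16  # read the stream in 64 KiB pieces


def stream_to_file_in(stream, tmp_dir=None, prefix="url_paste"):
    """Copy a binary stream to a new temporary file and return its path.

    With tmp_dir=None, tempfile uses its default directory (TMPDIR, or
    often /tmp); otherwise the file is created inside tmp_dir.
    """
    fd, temp_name = tempfile.mkstemp(prefix=prefix, dir=tmp_dir)
    with os.fdopen(fd, "wb") as out:
        while True:
            chunk = stream.read(CHUNK_SIZE)
            if not chunk:
                break
            out.write(chunk)
    return temp_name


if __name__ == "__main__":
    # Hypothetical target directory standing in for a configured temp path.
    target = tempfile.mkdtemp(prefix="upload_tmp_")
    print(stream_to_file_in(io.BytesIO(b"example payload"), tmp_dir=target))
```

How the tool would learn the configured directory (command-line
argument, environment variable, or something else) is left open here.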
Regards,
Rob
--
Rob W.W. Hooft
Chief Technology Officer BioAssist, Netherlands Bioinformatics Centre
http://www.nbic.nl/ Skype: robhooft GSM: +31 6 27034319