Thank you, Iyad. Indeed, setting ulimit -s to unlimited got me further.
I can now see that a job gets generated and submitted. However, Galaxy
crashes immediately after that.
Job <108038> is submitted to queue <short>.
*** glibc detected *** python: free(): invalid pointer: 0x00007fff79f10b64 ***
======= Backtrace: =========
< further output is omitted >

Tracking the job through the scheduler reveals that the job finished
successfully.
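
Since the limit set with ulimit -s only applies to the shell it was set in
and to the processes started from it, it may be worth confirming what the
Python process itself sees. A minimal sketch using the standard resource
module; the function name describe_stack_limit is only illustrative and is
not part of Galaxy:

```python
import resource


def describe_stack_limit():
    """Return the (soft, hard) stack limits of this process as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)

    def fmt(value):
        # getrlimit reports bytes; ulimit -s reports kilobytes.
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return "%d kB" % (value // 1024)

    return fmt(soft), fmt(hard)


if __name__ == "__main__":
    soft, hard = describe_stack_limit()
    print("stack soft limit: %s, hard limit: %s" % (soft, hard))
```

Running this from the same shell (or init script) that starts Galaxy should
show whether the process inherits the raised limit.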

The command in the job script is something like this:

python /galaxy-dist/tools/data_source/upload.py /galaxy-dist \
  /galaxy-dist/database/tmp/tmpGY5_lI /galaxy-dist/database/tmp/tmpr7VKGy \
  1:/galaxy-dist/database/job_working_directory/000/1/dataset_1_files:/galaxy-dist/database/files/000/dataset_1.dat

usage: upload.py <root> <datatypes_conf> <json paramfile> <output spec> ...

I cannot re-run it because only the first of the two files in the tmp
folder (tmpGY5_lI) is still there. The second (the json paramfile,
tmpr7VKGy) is gone. I presume dataset_1.dat is the output, and it is there.
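
To make that check repeatable, here is a rough sketch that takes the
arguments in the order given by the usage line and reports which of the
referenced paths are missing. It assumes the output spec has the form
<id>:<files_path>:<path>, as the argument above suggests; the helper names
parse_output_spec and missing_inputs are made up for this illustration and
are not taken from upload.py:

```python
import os


def parse_output_spec(spec):
    """Split '<id>:<files_path>:<path>' into (int id, files_path, path)."""
    dataset_id, files_path, path = spec.split(":", 2)
    return int(dataset_id), files_path, path


def missing_inputs(args):
    """Return the paths referenced by the upload.py arguments that do not exist.

    args follows the usage line:
    <root> <datatypes_conf> <json paramfile> <output spec> ...
    """
    root, datatypes_conf, json_paramfile = args[0], args[1], args[2]
    candidates = [root, datatypes_conf, json_paramfile]
    for spec in args[3:]:
        _, _files_path, path = parse_output_spec(spec)
        # Only the output file is checked; the *_files directory may
        # legitimately be absent.
        candidates.append(path)
    return [p for p in candidates if not os.path.exists(p)]


if __name__ == "__main__":
    args = [
        "/galaxy-dist",
        "/galaxy-dist/database/tmp/tmpGY5_lI",
        "/galaxy-dist/database/tmp/tmpr7VKGy",
        "1:/galaxy-dist/database/job_working_directory/000/1/dataset_1_files:"
        "/galaxy-dist/database/files/000/dataset_1.dat",
    ]
    for path in missing_inputs(args):
        print("missing: %s" % path)
```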

The second half of the job script executes set_metadata.sh.
I can run it by hand without issues (is this a db update?).

One significant difference between the setup that works and the one that
doesn't is that the working setup sits on local disk, whereas the failing
one sits on Lustre. Could that be relevant?

By the way, is there a way to remove the pending job?
When I restart Galaxy, it promptly crashes again due to the stuck job.

When Galaxy starts, the only error that I see is this:
IOError: [Errno 2] No such file or directory:
'./tools/mutation/visualize.xml'
While it might be a good question why the mutation directory is not there,
the error is very likely not relevant to the issue.

So I'm open to further suggestions as to how to understand what's going on.

Thank you


On 10 June 2014 19:24, Kandalaft, Iyad <iyad.kandal...@agr.gc.ca> wrote:

>  This is just a guess, which may help you troubleshoot.
>
> It could be that Python is reaching a stack limit: run ulimit -s and
> set it to a higher value if required
>
> I’m completely guessing here but is it possible that the DRMAA is missing
> a linked library on the redhat system – check with ldd?
>
>
>
> Regards,
>
> Iyad Kandalaft
>
>
> Microbial Biodiversity Bioinformatics
>
> Agriculture and Agri-Food Canada | Agriculture et Agroalimentaire Canada
> 960 Carling Ave.| 960 Ave. Carling
>
> Ottawa, ON| Ottawa (ON) K1A 0C6
>
> E-mail Address / Adresse courriel  iyad.kandal...@agr.gc.ca
> Telephone | Téléphone 613-759-1228
> Facsimile | Télécopieur 613-759-1701
> Teletypewriter | Téléimprimeur 613-773-2600
> Government of Canada | Gouvernement du Canada
>
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/
