Re: [galaxy-dev] SGE and Galaxy (a different approach)

2011-04-06 Thread Bram Slabbinck

Hi andrew

What you need to do is to add the qsub parameter '-sync y'. This puts a 
hold on the qsub command and makes it wait until the SGE job is finished.
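
For a tool whose wrapper is a Python script rather than bash, the same blocking submission could be sketched as below. This is only an illustration: the helper names build_qsub_command and submit_and_wait are made up here, and it assumes qsub is on the PATH of the user running Galaxy. With '-sync y', qsub normally returns the exit code of the finished job, so the caller (and therefore Galaxy) sees the real result rather than just a successful submission.

```python
import subprocess


def build_qsub_command(script, extra_args=()):
    """Return the argument list for a blocking SGE submission.

    '-sync y' makes qsub wait until the job has finished instead of
    returning right after submission.
    """
    return ["qsub", "-sync", "y"] + list(extra_args) + [script]


def submit_and_wait(script, extra_args=()):
    """Submit the script and block until qsub returns.

    Returns qsub's exit code.
    """
    return subprocess.call(build_qsub_command(script, extra_args))
```

A wrapper would then call submit_and_wait("myscript.sh") and exit with the returned code.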


regards
Bram

On 05/04/2011 18:27, andrew stewart wrote:
I'm aware of how to configure Galaxy to use SGE in universe_wsgi.ini, 
however what I want to do is a little different.  Because I only want 
certain processes to be submitted to the queue, I'd rather control 
this at the tool configuration level (the xml wrapper).  For example:


<command interpreter="bash">
qsub myscript.sh
</command>

This will work, except that the status of the job (in Galaxy) shows as 
completed even though the job has simply been submitted to SGE. 
 Basically Galaxy 'loses track' of the process because the submission 
process (myscript.sh) has completed even if the actual job hasn't.


Has anyone else tried anything like this before, or have anything 
helpful to suggest?  One thought is to somehow cause the myscript.sh 
process to pause until the SGE job has completed... somehow.


Any advice appreciated.

Thanks,
Andrew


___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

   http://lists.bx.psu.edu/


--
=
Bram Slabbinck, PhD

Bioinformatics & Systems Biology
VIB Department of Plant Systems Biology, Ghent University
Technologiepark 927, 9052 Gent, BELGIUM

Tel:+32 (0)9 33 13 822
Fax:+32 (0)9 33 13 809
Email: bram.slabbi...@psb.ugent.be
WWW: http://bioinformatics.psb.ugent.be
=
Services and consulting in bioinformatics
  http://www.arctix.be
=


Re: [galaxy-dev] samtools sam-to-bam problem

2011-04-06 Thread Ryan Golhar
Any ideas why I would get this?  If I run the sam_to_bam python script 
from the shell, I get the same error:


(galaxy_env)[galaxy@vail pbs]$ sh 471.sh
Linux vail 2.6.18-194.3.1.el5xen #1 SMP Sun May 2 04:26:43 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
Samtools Version: 0.1.14 (r933:170)
Error extracting alignments from 
(/home/galaxy/galaxy-dist/database/files/000/dataset_785.dat),


However, running the samtools command by hand works fine.

On 4/5/11 5:58 PM, Ryan Golhar wrote:

I've performed an alignment using BWA on a file of paired-end illumina
reads. The SAM file looks fine, and contains header information. I'm
converting it to BAM using the sam to bam converter, however it
consistently errors out after running for a while. The error is:

Error extracting alignments from
(/home/galaxy/galaxy-dist/database/files/000/dataset_785.dat), 

but no error message is provided. Line 156 of sam_to_bam.py is where the
error is thrown, and the exception object e appears to be empty (I think).

BTW - If I run the samtools command from the shell by hand, the BAM file
is created properly. I do see information on stderr:

$ samtools view -bt /data/genomes/H_sapiens/hg19/hg19.fa.fai -o
/tmp/killme.bam /home/galaxy/galaxy-dist/database/files/000/dataset_785.dat
[samopen] SAM header is present: 25 sequences.

I'm using samtools version 0.1.14 (r933:170) on Linux, 64-bit.

What do I do?

Ryan


Re: [galaxy-dev] SGE and Galaxy (a different approach)

2011-04-06 Thread andrew stewart
Ah this is exactly what I was looking for.  Thanks!


[galaxy-dev] Relative file path in Galaxy

2011-04-06 Thread Zhe Chen
Hi,

My tool has a script that reads a file in the same directory as the script.
When I use a relative path to read the file, it works if I run the script
directly, but not when it is called from Galaxy. An absolute path works,
but I would like to know whether there is a way to do it with a relative path.

Can you give me some suggestions?



myTool.pl uses the following code to read the file in the same directory:

my $genusfiles = "genus.txt";
open (GENUSGP, $genusfiles)  or die "$genusfiles: $! \n";
close(GENUSGP);


Thanks


Re: [galaxy-dev] Relative file path in Galaxy

2011-04-06 Thread Peter Cock

I would expect that you can look at the first entry in argv, which
should give you the path to the script being run, myTool.pl (in Perl
that path is held in $0 rather than in @ARGV), then use that to
construct the path to genus.txt.
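
The symptom is consistent with Galaxy starting the tool from a working directory other than the one the script lives in (an assumption, not confirmed in this thread), so a bare relative path resolves against the wrong directory. A minimal sketch of building the path from the script's own location, written in Python for illustration; the helper name path_next_to_script is hypothetical. In Perl the equivalent starting point is $0, or the script directory exposed by the core FindBin module as $FindBin::Bin.

```python
import os
import sys


def path_next_to_script(filename, script_path=None):
    """Return the absolute path of a file that sits beside the running script."""
    if script_path is None:
        script_path = sys.argv[0]  # path that was used to invoke the script
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(script_dir, filename)
```

The tool would then open path_next_to_script("genus.txt") instead of the bare file name.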

Peter



[galaxy-dev] Use of galaxy with Illumina Pipeline RTA / OLB/ CASAVA / GERALD etc

2011-04-06 Thread WATSON Mick
Hi

I'd be interested to hear from anyone who has tried, or even thought about, 
using galaxy to manage the Illumina pipeline (RTA, OLB, CASAVA, GERALD and all 
those familiar names).

Thanks
Mick
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

Re: [galaxy-dev] Trouble viewing interval files in UCSC with Apache+XSendFile

2011-04-06 Thread Assaf Gordon
Until a better solution comes along, this tiny patch makes the temp files 
world-readable:

http://cancan.cshl.edu/labmembers/gordon/files/apache_xsendfile_temp_files.patch
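
The patch itself is not reproduced here. As a rough sketch of the general idea only (not the contents of the patch), a temporary file created with tempfile.mkstemp() can be made readable by other users with os.chmod(). The function name make_readable_temp_file and the ".bed" suffix are made up for this example.

```python
import os
import stat
import tempfile


def make_readable_temp_file(suffix=".bed"):
    """Create a temp file and relax its permissions to rw-r--r-- (0644)."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # only the path is needed in this sketch
    # mkstemp() creates the file readable and writable by the owner only;
    # add read access for group and others so the web server can read it.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH)
    return path
```

Keep in mind that a world-readable temp file is visible to every local user, which may or may not be acceptable on a shared server.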

Assaf Gordon wrote, On 04/05/2011 11:14 AM:
 Hello,
 
 I've encountered a strange combination of factors that results in file access 
 problems.
 Perhaps I'm doing something wrong - any advice will be appreciated.
 
 I'm using Apache + XSendFile.
 
 When viewing a strict BED file, everything works, because the actual dataset 
 filename is passed on to Apache/XSendFile.
 
 When viewing any other kind of interval file, Galaxy first creates a 
 temporary file in strict BED format ( in Interval::as_ucsc_display_file() ).
 This method uses tempfile.mkstemp(), which is documented to create a 
 temporary file with user-only read/write access (no group or world access).
 So when the file name is passed on to Apache/XSendFile, Apache can't read 
 the file and returns 404.
 
 I'm wondering whether anyone else has encountered this problem, or whether 
 my configuration is somehow wrong.
 
 Thanks,
  -gordon


Re: [galaxy-dev] samtools sam-to-bam problem

2011-04-06 Thread Ryan Golhar
So it looks like I can get small sam files converted to bam files, but 
not large sam files (~50GB-80GB).  I'm still trying to debug this, but 
not sure what's going on.


Has anyone else run into anything like this?




Re: [galaxy-dev] samtools sam-to-bam problem

2011-04-06 Thread Ryan Golhar

Alright, I'm at a loss

I can run the sam to bam converter on a small sam file but not a big sam 
file.  The small SAM file is only 65K, the big SAM file is 44G.  I have 
more than 8TB of free space.


Running the job script from the shell results in the small conversion 
succeeding and the big one failing.  The return code from samtools is 0 in 
both instances, so I can't think of any reason why the script is hitting 
an exception.


I even added a write statement to stdout to double-check the return code 
and stderr message and they are the same in both cases.


Why is this failing in one case and not the other?  I'm stuck.  Help

Ryan



Re: [galaxy-dev] samtools sam-to-bam problem

2011-04-06 Thread Assaf Gordon

Just another example why python's misleadingly simple idioms are quite 
dangerous in production code (couldn't help myself from teasing about python... 
sorry about that).

Seems like line 150 in sam_to_bam.py tries to read the entire BAM file into 
memory just to find out if it's empty or not...

As a stop gap solution with minimal changes, change line 150 from:
if len( open( tmp_aligns_file_name ).read() ) == 0:
to
if len( open( tmp_aligns_file_name ).read(10) ) == 0:

Which will read up to the first 10 bytes (instead of the entire file).

A slightly better (but still wrong) solution is to simply check the file size, 
with:
if os.path.getsize(tmp_aligns_file_name) == 0:

But it's still wrong because even an invalid sam file will create a non-empty BAM file 
(when using samtools view -bt) - the BAM file will still contain the 
chromosome names and sizes.

Example:

$ cat mm9.fa.fai
chr1            197195432   6           50  51
chr10           129993255   201139354   50  51
chr11           121843856   333732482   50  51
chr12           121257530   458013223   50  51
chr13           120284312   581695911   50  51
chr13_random    400311      704385924   50  51
chr14           125194864   704794249   50  51
chr15           103494974   832493018   50  51
...
...

$ cat 1.sam
Hello World
This is not a SAM file

$ samtools view -bt mm9.fa.fai -o 1.bam 1.sam
[sam_header_read2] 35 sequences loaded.
[sam_read1] reference 'This is not a SAM file' is recognized as '*'.
[main_samview] truncated file.

$ ls -l 1.*
-rw-r--r-- 1 gordon hannon 348 Apr  7 00:57 1.bam
-rw-r--r-- 1 gordon hannon  35 Apr  7 00:57 1.sam



So in short, this whole sam-to-bam wrapper tool is not suitable for large SAM 
files (if they don't fit entirely in memory), and not for error checking of 
invalid SAM files.
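
The two emptiness checks mentioned above can be compared in a small sketch. The function names are invented for this example, and neither check says anything about whether the BAM file contains valid alignments; they only avoid loading the whole file into memory.

```python
import os


def is_empty_by_peek(path, nbytes=10):
    """Read at most nbytes; an empty result means the file has no content."""
    with open(path, "rb") as handle:
        return len(handle.read(nbytes)) == 0


def is_empty_by_size(path):
    """Ask the filesystem for the size instead of reading any data."""
    return os.path.getsize(path) == 0
```

Both run in constant memory regardless of file size, unlike read() with no argument, which builds a string as large as the file.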


-gordon


On 04/07/2011 12:30 AM, Ryan Golhar wrote:

Here's what I get:

(galaxy_env)[galaxy@vail pbs]$ sh ./big.sh
Samtools Version: 0.1.14 (r933:170)
Traceback (most recent call last):
  File "/home/galaxy/galaxy-dist/tools/samtools/sam_to_bam.py", line 150, in __main__
    if len( open( tmp_aligns_file_name ).read() ) == 0:
MemoryError
Error extracting alignments from
(/home/galaxy/galaxy-dist/database/files/000/dataset_785.dat),
(galaxy_env)[galaxy@vail pbs]$



On 4/6/11 7:29 PM, Assaf Gordon wrote:

Ryan,

Since we're shooting in the dark here, best to try and understand
what's the exception.

Add the following line to the beginning of sam_to_bam.py:
import traceback

and add the following line to sam_to_bam.py line 156 (before the
call to stop_err):
traceback.print_exc()

Hopefully this will print out which exception you're getting, and
where it is thrown from.
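
A minimal sketch of that change is shown below. It does not reproduce the real sam_to_bam.py: stop_err here is a simplified stand-in for the script's helper, run_conversion is a hypothetical placeholder for the code inside the try block, and a MemoryError is raised on purpose to show what gets printed.

```python
import sys
import traceback


def stop_err(msg):
    """Simplified stand-in: report the message on stderr."""
    sys.stderr.write("%s\n" % msg)


def run_conversion(path):
    """Placeholder for the real work; fails on purpose for the demo."""
    raise MemoryError("simulated failure while reading %s" % path)


def convert(path):
    try:
        run_conversion(path)
    except Exception:
        # Print the exception type, its message and the line it came from
        # before the generic error message hides them.
        traceback.print_exc()
        stop_err("Error extracting alignments from (%s)" % path)
        return 1
    return 0
```

Calling convert("dataset_785.dat") writes the full traceback to stderr, followed by the generic error line.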

-gordon




Re: [galaxy-dev] samtools sam-to-bam problem

2011-04-06 Thread Assaf Gordon

<shameless plug>
If your sam file already contains header lines, you can use our version of the 
sam-to-bam wrapper.
It works without python and without writing a temporary (non-sorted) bam file 
to disk.
Not so fast, and with minimal error checking - but it mostly works.

http://cancan.cshl.edu/labmembers/gordon/files/cshl_sam_to_bam.tar.bz2
</shameless plug>
