Re: [galaxy-dev] Error on local galaxy using SAM-to-BAM tool on a cluster

2012-04-12 Thread Louise-Amélie Schmitt

Hi,

I'm having the same issue. Has it been fixed since then?

Thanks,
L-A


On 07/11/2011 21:43, Nate Coraor wrote:

On Nov 4, 2011, at 1:11 PM, Carlos Borroto wrote:


Hi,

Reading a little more about this problem, I see Galaxy uses the Python
tempfile library (http://docs.python.org/library/tempfile.html),
specifically at line 70 in tools/samtools/sam_to_bam.py:
tmp_dir = tempfile.mkdtemp()

mkdtemp should honor the TMPDIR, TEMP or TMP environment variables. I
set up all three of them in ~/.bashrc with no results. I'm using:
default_cluster_job_runner = drmaa://-q all.q -V/

With -V I was hoping to be able to export all my environment
variables, which seems to work for everything else, but not for the
TMP variables.

I ended up hardcoding the dir argument, which is not a good workaround,
as I'm guessing this is not the only tool that will run into this
problem:
tmp_dir = tempfile.mkdtemp( dir='/home/cborroto/galaxy_dist/database/tmp')

Any advice? On a more SGE-related note, is there a way for me to
debug what environment I'm getting when running Galaxy jobs?
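
A minimal sketch of a less brittle variant of the workaround above: take
the directory as a parameter and fall back to tempfile's default when it
is not usable. The helper name make_work_dir and its preferred argument
are hypothetical and not part of sam_to_bam.py; how the preferred path
would reach the tool (for example from new_file_path) is left open.

```python
import os
import tempfile


def make_work_dir(preferred=None):
    """Create a temporary directory, preferring `preferred` when usable.

    `preferred` is a hypothetical parameter for illustration; it is not
    something sam_to_bam.py accepts.
    """
    if preferred and os.path.isdir(preferred) and os.access(preferred, os.W_OK):
        # An explicit dir= takes precedence over TMPDIR, TEMP and TMP.
        return tempfile.mkdtemp(dir=preferred)
    # Otherwise let tempfile pick its default (TMPDIR, TEMP, TMP, then
    # platform defaults such as /tmp).
    return tempfile.mkdtemp()
```

Passing a path that does not exist on the execution host simply falls
back to the default location instead of raising an error.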

Hi Carlos,

Try submitting an SGE job from the command line and having a look at the 
environment variables set on the execution host.  Most likely, SGE is setting 
its own TMPDIR variable which would be overriding the value set with -V.

--nate
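
One way to follow this suggestion is to submit a small script as an SGE
job and read its output. A minimal sketch, assuming Python is available
on the execution host; the function name tmp_environment is made up for
illustration.

```python
import os
import tempfile


def tmp_environment(environ=None):
    """Return the temp-related variables that tempfile consults."""
    if environ is None:
        environ = os.environ
    # tempfile checks these names in this order before platform defaults.
    return {name: environ.get(name) for name in ("TMPDIR", "TEMP", "TMP")}


if __name__ == "__main__":
    for name, value in tmp_environment().items():
        print("%s=%s" % (name, value))
    # The directory tempfile.mkdtemp() would use when called without dir=.
    print("tempfile.gettempdir() -> %s" % tempfile.gettempdir())
```

Running it once on the submit host and once as a cluster job shows
whether the scheduler replaced TMPDIR on the execution host.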


Thanks,
Carlos

On Fri, Nov 4, 2011 at 11:22 AM, Carlos Borroto
carlos.borr...@gmail.com  wrote:

Hi Jen,

Thanks for the quick response. The workaround you describe could work,
but I might run into trouble later on.

My interest is to develop a workflow for GATK, which has very strict
requirements on the input BAM file. One of them is that the sorting
has to be exactly the same as the reference. My reference is not
sorted lexicographically (chr1, chr10, chr11, ...), but instead is
sorted karyotypically (chr1, chr2, ...). I don't think I'll be able to
do this with Filter and Sort -> Sort. Also, GATK needs the header for
the @RG tags, which I could resolve by just reintroducing the header
later on, but it would still be cumbersome.
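
To illustrate the difference between the two orderings, here is a small
sketch. The key function karyotypic_key is hypothetical, and the
placement of X, Y and M after the numbered chromosomes is an assumption;
the order that actually matters is whatever order the reference uses.

```python
def karyotypic_key(name):
    """Sort key placing numbered chromosomes in numeric order.

    The X, Y, M ordering is an assumption for illustration only.
    """
    label = name[3:] if name.startswith("chr") else name
    if label.isdigit():
        return (0, int(label), "")
    special = {"X": 1, "Y": 2, "M": 3, "MT": 3}
    return (special.get(label, 4), 0, label)


contigs = ["chr10", "chr2", "chrX", "chr1", "chr11", "chrM", "chrY"]

# Plain string sorting compares character by character.
lexicographic = sorted(contigs)
# Numeric-aware sorting keeps chr2 ahead of chr10.
karyotypic = sorted(contigs, key=karyotypic_key)

print(lexicographic)
print(karyotypic)
```

A plain text sort, such as the one a tabular sort tool performs on the
chromosome column, yields the first ordering rather than the second.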

I'll work on my galaxy/cluster configuration and see if I can find why
the SAM-to-BAM tool is failing.

Thanks again,
Carlos

On Thu, Nov 3, 2011 at 6:35 PM, Jennifer Jackson j...@bx.psu.edu wrote:

Hello Carlos,

If what you want is a sorted SAM file, then the tool Filter and Sort ->
Sort may be a better choice. A SAM file is a tabular file.

If there is header data at the beginning of the SAM file, it can be removed
before running Sort with the tool Filter and Sort -> Select (with a not
matching regex). Alternatively, you can choose not to include header output
as a BWA option.

Perhaps this will solve the immediate problem?

Best,

Jen
Galaxy team

On 11/3/11 12:43 PM, Carlos Borroto wrote:

Hi,

I'm running into this error:
Error sorting alignments from
(/tmp/5800600.1.all.q/tmpXOc5mD/tmpAZCzt_),

when using the SAM-to-BAM tool on a locally installed Galaxy using an SGE
cluster. I'm using the latest version of galaxy-dist. I'm guessing I
have a problem with the configuration for the tmp folder. I have this
in universe_wsgi.ini:
# Temporary files are stored in this directory.
new_file_path = /home/cborroto/galaxy_dist/database/tmp

But I don't see this directory being used, and from the error it looks
like /tmp on the node is used. I wonder if this is the problem, as I
don't know if there is enough space in the local /tmp directory on the
nodes. I ran the same tool on a subset of the same SAM file and it ran
fine.

Also, I see this in the description of the tool:
This tool uses the SAMTools toolkit to produce an indexed BAM file
based on a sorted input SAM file.

But what I actually need is to sort a SAM file output from bwa, and I
haven't found any other way than converting it to BAM. Looking at
sam_to_bam.py I see the BAM file will also be sorted. Would it be
wrong to feed an unsorted SAM file into this tool?

Finally, just to be sure there is nothing wrong with the initial SAM
file, I ran samtools view ... and samtools sort ... on this file
manually outside of Galaxy and it ran fine.

Thanks in advance,
Carlos
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

   http://lists.bx.psu.edu/

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support





Re: [galaxy-dev] GATK tools

2012-04-12 Thread Daniel Blankenberg
Hi Rob,

At this time, these tools are still included within the main Galaxy 
distribution under tools/gatk. I would recommend using the latest version 
available from galaxy-central until they are migrated into the toolshed.  
Please let us know if you encounter any issues.


Thanks for using Galaxy,

Dan


On Apr 10, 2012, at 11:36 AM, Robert Chase wrote:

 Hello,
 
 One of our users has been using the GATK (beta) tools on the main instance of 
 galaxy and would like to have them locally accessible. I have looked at the 
 main toolshed but I couldn't find them there. Is there a place I can download 
 the GATK files and xml files?
 
 -Rob




[galaxy-dev] param type=select and multiple=true

2012-04-12 Thread Yuri D'Elia
Hi everyone. I'm trying to use a <param type="select" display="checkboxes"
multiple="true"> in a tool, but Galaxy doesn't seem to work correctly if I
unselect all.

This is what I have:

<param name="test" type="select" display="checkboxes" multiple="true"
       label="A test">
  <option value="a" selected="true">A</option>
  <option value="b" selected="true">B</option>
  <option value="c">C</option>
</param>

If I unselect all, execute the tool, and then try to re-run the tool by
examining the output dataset, options A and B are selected, but they
shouldn't be! When the history is converted to a workflow, this breaks the
intent of the user!

Second problem: how do I test for single values in the command template?
Right now I'm (wrongly?) doing:

  #if 'a' in $test.value:
...
  #end if

but it breaks badly when all values are unselected ($test.value does not
exist, and thus Cheetah fails to compile the template).

My solution would be:

  #set $foo = str($test).split(',')
  #if 'a' in $foo:
  #end if

but this is _incredibly_ ugly. When I use a multi-select, I often want to see 
if there's some single parameter in it. Any suggestions?
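
The split-based test above can be sketched in plain Python to show how
the empty selection might be handled. This assumes that an empty
multi-select renders as the string "None" (or an empty string) when
converted with str(); the helper name selected_values is hypothetical
and not part of Galaxy.

```python
def selected_values(raw):
    """Return the selected option values as a list, empty if none.

    Assumes `raw` stringifies to a comma-separated list, and to "None"
    or "" when nothing is selected.
    """
    if raw is None:
        return []
    text = str(raw).strip()
    if not text or text == "None":
        return []
    # Drop empty fragments that a stray comma would produce.
    return [value for value in text.split(",") if value]


print("a" in selected_values("a,b"))
print("a" in selected_values("None"))
```

Without the explicit empty case, splitting the string "None" would give
a one-element list, which hides the fact that nothing was selected.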



[galaxy-dev] Installation issue...

2012-04-12 Thread Gregory Miles
Hello,
  I am trying to install Galaxy on Linux. I ran the sh run.sh command and 
was greeted with the following error:

  File "/install_folder/galaxy-dist/lib/galaxy/datatypes/registry.py", line 146
    finally:
  ^
SyntaxError: invalid syntax

I am using Python version 2.7.1 and was hoping that anyone who has seen this 
before might be able to assist me. Thank you very much in advance for your help.

Greg


-- 
Dr. Gregory Miles
Bioinformatics Specialist
Cancer Institute of New Jersey @ UMDNJ
Office: (732) 235 8817
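
A combined try/except/finally block is only valid syntax from Python 2.5
onwards, so one possible explanation (an assumption, not confirmed in
this thread) is that the interpreter actually running Galaxy is older
than the 2.7.1 reported above. A minimal sketch for checking which
interpreter a given python command resolves to; the function name
supports_unified_try is made up for illustration.

```python
import sys


def supports_unified_try(version_info=None):
    """Return True if try/except/finally may be combined (Python >= 2.5)."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= (2, 5)


# Report which interpreter is actually running this script.
print("interpreter: %s" % sys.executable)
print("version: %s" % sys.version.split()[0])
print("unified try/except/finally supported: %s" % supports_unified_try())
```

Running it with the same python command and environment that run.sh
uses shows whether a different interpreter is being picked up.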

[galaxy-dev] delete data library via API

2012-04-12 Thread Leon Mei
Hi guys,

Is it possible to delete datasets in a shared data library via the API,
assuming the modify permission has been granted?

We see delete.py in the API script folder but can't figure out how to get
it to work.

Thanks!

Leon

-- 
Hailiang (Leon) Mei
Netherlands Bioinformatics Center
BioAssist NGS Taskforce
 - http://ngs.nbic.nl
 - https://wiki.nbic.nl/index.php/Next_Generation_Sequencing
Skype: leon_mei
Mobile: +31 6 41709231

[galaxy-dev] Creating a workflow of workflows

2012-04-12 Thread Aaron Gallagher
Hi again,

I'm trying to set up a workflow for Galaxy (6799:40f1816d6857), but I can't
seem to find a way to make a workflow that invokes another workflow. It seems
like this wouldn't be unreasonable, since workflows have defined inputs and
outputs.

There might be another way to accomplish what I'm trying to ultimately do,
though. We have a set of analyses and conversions, and want to assemble a
workflow that goes through a long series of steps. The workflow can roughly be
organized into three parts. Once the workflow is finished, we might want to
rerun the third or second and third parts after modifying some of the
intermediate files.

It seems like this might also be accomplished by somehow marking that some
input files should be substituted instead of using the outputs of some of the
steps in the workflow.

Anyway, I've been unable to find a simple way of doing this.

I could make each part of the workflow a tool, but if I also wanted to expose
the tools used in each part, I would be duplicating those invocations and
increasing the maintenance burden.

I could also make separate workflows for parts 1+2+3, 2+3, and 3, but that
would also be duplicating work and increasing the maintenance burden.

I could also make these three completely separate workflows, but that would mean
that the user would have to remember to run each of these separately. Each part
can take a very long time, so it would be useful to have this set up so that a
user can start from the first part on a Friday and have the output from the
third part available on Monday.

Thanks in advance for any help,
~Aaron

