Re: [galaxy-dev] running tools within tool

2014-02-10 Thread Peter Cock
The BLAST+ binaries support multi-threaded operation, which is handled
via the $GALAXY_SLOTS environment variable. This should be set
automatically by Galaxy via your job runner settings, which allows you to
(for example) allocate four cores to each BLAST job.
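
To illustrate the threading side, here is a minimal Python sketch of how a wrapper script might pick up the slot count. Only $GALAXY_SLOTS and the BLAST+ -num_threads option are real; the helper names, the fallback of one thread, and the file and database names are assumptions made up for this example, not code from the wrappers.

```python
import os

def galaxy_slots(default=1, env=None):
    """Return the core count Galaxy allocated to this job, or a default."""
    env = os.environ if env is None else env
    value = env.get("GALAXY_SLOTS", "")
    # Fall back to the default if the variable is unset or not a positive integer.
    return int(value) if value.isdigit() and int(value) > 0 else default

def blastn_command(query, db, out, env=None):
    """Build a blastn argument list that uses the allocated cores."""
    return ["blastn", "-query", query, "-db", db, "-out", out,
            "-num_threads", str(galaxy_slots(env=env))]

# Hypothetical file and database names, with four slots allocated.
print(blastn_command("query.fasta", "nt", "hits.tabular",
                     env={"GALAXY_SLOTS": "4"}))
```

Passing the environment in as a dictionary keeps the helper easy to try outside a Galaxy job; inside a job it would simply read os.environ.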

In addition, the BLAST+ wrappers also support high level parallelism
by task splitting if use_tasked_jobs = True is enabled in your
universe_wsgi.ini configuration file. Essentially, the FASTA input
query files are broken up into batches of 1000 sequences, a separate
BLAST child job is run for each chunk, and then the BLAST output
files are merged (in order). This is transparent for the end user.
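
The splitting step can be sketched roughly as follows. This is a simplified stand-alone illustration, not Galaxy's actual splitting code; the function name and the tiny example records are invented, and the batch size is shrunk from 1000 to 2 so the behaviour is easy to see.

```python
def split_fasta(lines, batch_size=1000):
    """Yield lists of FASTA lines, each holding at most batch_size records."""
    batch, count = [], 0
    for line in lines:
        if line.startswith(">"):
            # A new record begins; start a new batch if the current one is full.
            if count == batch_size:
                yield batch
                batch, count = [], 0
            count += 1
        batch.append(line)
    if batch:
        yield batch

# Three made-up records, split into batches of two.
records = [">seq1\n", "ACGT\n", ">seq2\n", "GGCC\n", "TTAA\n", ">seq3\n", "AAAA\n"]
batches = list(split_fasta(records, batch_size=2))
print(len(batches))
```

Because the batches are produced in input order, running one child job per batch and concatenating the results in the same order preserves the query order.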

Each tool enables this via its XML file, e.g.

<parallelism method="multi" split_inputs="query" split_mode="to_size"
split_size="1000" merge_outputs="output1"></parallelism>

This requires splitting support in the FASTA input datatypes, and
merging support in the selected output datatype (e.g. BLAST XML,
tabular, etc). This is done by methods in the Python datatype classes.
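
For a line-oriented format such as tabular output, the merge can be as simple as concatenating the per-chunk files in order. The class below is a hypothetical stand-in written for this example, not one of the real Galaxy datatype classes, and the file names and rows are invented. A format like BLAST XML would need more care, since the document header and footer cannot simply be repeated for every chunk.

```python
import os
import tempfile

class TabularLike:
    """Hypothetical stand-in for a datatype class whose outputs can be merged."""

    @staticmethod
    def merge(split_files, output_file):
        """Concatenate per-chunk outputs into one file, preserving chunk order."""
        with open(output_file, "w") as out:
            for path in split_files:
                with open(path) as chunk:
                    for line in chunk:
                        out.write(line)

# Demonstration with two tiny chunk outputs in a temporary directory.
workdir = tempfile.mkdtemp()
chunk_paths = []
for i, text in enumerate(["q1\ts1\t99.0\n", "q2\ts7\t87.5\n"]):
    path = os.path.join(workdir, "chunk%d.tabular" % i)
    with open(path, "w") as handle:
        handle.write(text)
    chunk_paths.append(path)

merged_path = os.path.join(workdir, "merged.tabular")
TabularLike.merge(chunk_paths, merged_path)
with open(merged_path) as handle:
    merged_text = handle.read()
print(merged_text, end="")
```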

It would be interesting to see if any of John's work on collections
of files of the same type might fit nicely with this approach (and
thus avoid the disk IO overhead of the merge step?).

Peter


On Mon, Feb 10, 2014 at 1:56 AM, Ketan Maheshwari
ketancmaheshw...@gmail.com wrote:
 Thanks Dannon for the reference. I checked out the tool and installed it from
 the Tool Shed on my local Galaxy instance. I also read the related paper,
 which says that the BLAST executables run in parallel by partitioning the
 input files into fragments and running batches in parallel. That sounds
 cool. I browsed the code but could not find the exact mechanism. Is the
 parallelism at the workflow level (aka branch parallelism), or is it at the
 tool level, that is, does the tool itself invoke parallel code?

 Thanks,
 Ketan


 On Thu, Feb 6, 2014 at 9:42 AM, Dannon Baker dannon.ba...@gmail.com
 wrote:

 Ketan,

 Have you taken a look at galaxy's built-in parallelism framework?  For a
 great current example of a tool using this, look at Peter's NCBI BLAST+
 wrappers.  https://github.com/peterjc/galaxy_blast

 -Dannon
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] running tools within tool

2014-02-10 Thread Ketan Maheshwari
Thanks Dannon for the reference. I checked out the tool and installed it from
the Tool Shed on my local Galaxy instance. I also read the related paper,
which says that the BLAST executables run in parallel by partitioning the
input files into fragments and running batches in parallel. That sounds
cool. I browsed the code but could not find the exact mechanism. Is the
parallelism at the workflow level (aka branch parallelism), or is it at the
tool level, that is, does the tool itself invoke parallel code?

Thanks,
Ketan


On Thu, Feb 6, 2014 at 9:42 AM, Dannon Baker dannon.ba...@gmail.com wrote:

 Ketan,

 Have you taken a look at galaxy's built-in parallelism framework?  For a
 great current example of a tool using this, look at Peter's NCBI BLAST+
 wrappers.  https://github.com/peterjc/galaxy_blast

 -Dannon


 On Thu, Feb 6, 2014 at 10:32 AM, Ketan Maheshwari 
 ketancmaheshw...@gmail.com wrote:

 Hi John, Alex, All,

 To elaborate on the motivation behind my question about running tools within
 a tool: the first concern is running a tool in parallel at large scale. For
 example, if I need to find a pattern in 1000 files via the Galaxy Select tool
 from the Text and Filter tool-group, I am limited to providing one file at a
 time to the tool, which will take a long time to finish. Please correct me if
 there is a more sophisticated way to approach this problem. The second,
 related concern is running a tool in parallel on one or more HPC resources.

 We want to write a generic wrapper Galaxy tool, powered by the Swift parallel
 framework, such that it can run any arbitrary Galaxy tool in parallel on HPC
 resources. Currently, we have developed this capability, but only for
 external executables, which is not the most secure way of using Galaxy as I
 understand from the previous conversation.

 Having such a wrapper tool in a standard way is desirable so that it
 preserves the tool contract and binding within the Galaxy environment, that
 is, it maintains the history and metadata conventions of Galaxy.

 Thanks,
 Ketan


 On Wed, Feb 5, 2014 at 3:53 PM, John Chilton chil...@msi.umn.edu wrote:

 Galaxy has an API that is capable of running tools - certainly this is
 one path forward on something like this. I am not sure it is the best
 path forward though. Probably the best way to enhance Galaxy's
 execution capabilities is to extend the Galaxy core framework itself -
 this has its own downsides though.

 If you can offer more details about how you would like to enhance
 Galaxy - what it cannot do that you would like it to do - I or others
 may be able to provide more specific ideas. Otherwise, sorry I have
 not been of more help.

 -John



 On Tue, Feb 4, 2014 at 2:51 PM, Ketan Maheshwari ke...@mcs.anl.gov
 wrote:
  Hi,
 
  This is a question I posted to the galaxy-user mailing list a while back and
  was redirected to dev for possible answers:
  Is it possible in Galaxy to design a tool whose sole purpose is to run other
  tools? This is motivated by our desire to enhance the execution capabilities
  of existing tools via a generic tool which acts as a wrapper.
 
  Thanks,
  Ketan
 

Re: [galaxy-dev] Error introduced with Fastq Groomer

2014-02-10 Thread graham etherington (TSL)
Hi Philippe,
I’m unable to suggest a reason as to why this has happened, other than some 
sort of corruption whilst the job was running, but I would point out two things 
to you.

1. I don’t think you need to run the fastq groomer on your data anyway as it’s 
in Illumina 1.8+ format, which should already be in fastqsanger format.
2. It appears that the fastq groomer hasn’t worked as the quality scores 
haven’t changed format.  (A general question to anyone here – will fastq 
groomer change the quality format of reads that are already in fastqsanger 
format?)
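
As a rough way to check the first point, the quality characters themselves can be inspected. The sketch below is a heuristic written for this example and has nothing to do with how fastq groomer works internally; the function name is invented. It relies on the fact that characters below ';' (ASCII 59) cannot occur in the older +64 encodings, so seeing any of them means the data already use the Sanger (Phred+33) offset.

```python
def guess_phred_offset(quality_lines):
    """Return 33 if the qualities must be Phred+33, or None if undetermined."""
    lowest = min(ord(ch) for line in quality_lines for ch in line.strip())
    # Anything below ';' (ASCII 59) is impossible with a +64 offset.
    return 33 if lowest < 59 else None

# Quality string copied from the read in the original report.
quality = ("@@@FFFHHHE@=FDEGCCG2A7CDFHEF:B?BDEGGHGICHC9B@FGEHEGG;"
           "F=GHI==CE:;BBCC@CC;8=?=CA;ACC")
print(guess_phred_offset([quality]))
```

A result of None only means the sample is ambiguous, since high-quality Phred+33 reads can also consist entirely of characters at or above ';'.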

Cheers,
Graham


Dr. Graham Etherington
Bioinformatics Support Officer,
The Sainsbury Laboratory,
Norwich Research Park,
Norwich NR4 7UH.
UK
Tel: +44 (0)1603 450601

From: Philippe Moncuquet philippe.m...@gmail.com
Date: Monday, 10 February 2014 03:50
To: Galaxy Dev galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] Error introduced with Fastq Groomer

Hi,

Some unexpected symbols were introduced while grooming my fastq file

Before


@DJTPB5M1:327:C3PC4ACXX:6:1104:9355:84986 1:N:0:GTCCGC
GAGCCTTGCTAGGAGAGGGAAGGTGGAAGATCATCATTTCCAGGAGAGCACTGCTAGCAGGAAGCCACGTCTGCATTACACGCTTCATTAGGGACTTCCC
+
@@@FFFHHHE@=FDEGCCG2A7CDFHEF:B?BDEGGHGICHC9B@FGEHEGG;F=GHI==CE:;BBCC@CC;8=?=CA;ACC

After


@DJTPB5M1:327:C3PC4ACXX:6:1104:9355:84986 1:N:0:GTCCGC
GAGCCTTGCTAGGAGAGGGAAGGTGGAAGATCATCATTTCCAGGAGAGCACTGCTAGCAGGAAGCCACG+1�CATTACACGCTTCATTAGGGACTTCCC
+
@@@FFFHHHE@=FDEGCCG2A7CDFHEF:B?BDEGGHGICHC9B@FGEHEGG;F=GHI==CE:;BBCC@CC;8=?=CA;ACC


I relaunched this step and was not able to reproduce the bug. Any ideas about
this problem? Have you come across the same problem before?
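
A check along the following lines could flag affected reads in a groomed file. It is only a sketch: the function name and the shortened example reads are made up, it assumes unwrapped four-line FASTQ records, and it treats anything other than A, C, G, T and N in the sequence line as suspicious.

```python
def find_corrupt_reads(lines, allowed="ACGTN"):
    """Return (read_id, bad_characters) for reads with unexpected sequence symbols."""
    problems = []
    for i in range(0, len(lines) - 3, 4):  # each FASTQ record spans four lines
        header, sequence = lines[i].strip(), lines[i + 1].strip()
        bad = sorted(set(sequence.upper()) - set(allowed))
        if bad:
            problems.append((header[1:], bad))
    return problems

# Two shortened, invented reads; the second mimics the corruption seen above.
fastq_lines = [
    "@read1\n", "CCACGTCTGCATTA\n", "+\n", "@@@FFFHHHE@=FD\n",
    "@read2\n", "CCACG+1CATTA\n", "+\n", "@@@FFFHHHE@=\n",
]
problems = find_corrupt_reads(fastq_lines)
print(problems)
```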


Regards,

Philip

Re: [galaxy-dev] Error introduced with Fastq Groomer

2014-02-10 Thread John Chilton
The groomer was recently migrated to the tool shed - this has not been
released as part of a galaxy-dist though so I assume you are still
running a version of the fastq groomer bundled with Galaxy? If yes,
what version of Galaxy are you running (i.e. can you attach the output
of hg summary)?

-John



[galaxy-dev] Shed_tools couldn't be installed due to lack of proxy support. Change in lib/tool_shed/util/common_util.py solves the problem.

2014-02-10 Thread Luca Toldo
Dear Galaxians,
I am behind a proxy with authentication, and I think I will contribute a
wiki page on how to install and configure galaxy for saving other people's
time...

In this mail however I would like to propose a change in the
lib/tool_shed/util/common_util.py

The problem I had was that not all tool wrappers could be installed using
the web interface, since I was always getting a "host not found" error
generated by the urllib2.py call.

From that error message (found in paster.log) it was clear that the problem
was due to the fact that I am behind a firewall with authentication, and
setting the http_proxy variable in run.sh was not sufficient.

I then modified lib/tool_shed/util/common_util.py following the
instructions that I found at

http://stackoverflow.com/questions/34079/how-to-specify-an-authenticated-proxy-for-a-python-http-connection

and now it finally works.

Below is what I modified. It is certainly NOT the appropriate way of doing
it from a proper IT point of view (since it is not good practice to
hard-code passwords in source code). Furthermore, my code changes will
disappear with the next mercurial update ... I therefore really hope that
one of you smart people could modify the codebase accordingly, perhaps
reusing the http_proxy ... proxy_user and proxy_password system variables
:-?

--

import urllib2  # already imported at the top of common_util.py

def tool_shed_get( app, tool_shed_url, uri ):
    """Make contact with the tool shed via the uri provided."""
    registry = app.tool_shed_registry
    # CHANGES FOR PROXY AUTH (proxy URL, user and password are placeholders)
    password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password( None, 'http://myproxy.domain.com:8080', r'MYDOMAIN\MYUSER', 'MYPASS' )
    proxy_handler = urllib2.ProxyHandler( { 'http': 'http://myproxy.domain.com:8080' } )
    proxy_auth_handler = urllib2.ProxyBasicAuthHandler( password_mgr )
    urlopener = urllib2.build_opener( proxy_handler, proxy_auth_handler )
    # PREVIOUS CODE
    # urlopener = urllib2.build_opener()
    # password_mgr = registry.password_manager_for_url( tool_shed_url )
    # if password_mgr is not None:
    #     auth_handler = urllib2.HTTPBasicAuthHandler( password_mgr )
    #     urlopener.add_handler( auth_handler )
    response = urlopener.open( uri )
    content = response.read()
    response.close()
    return content
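
As a sketch of the environment-variable idea, the opener could be built from settings instead of hard-coded values. This version uses Python 3's urllib.request, which provides the same handler classes as urllib2. The PROXY_USER and PROXY_PASSWORD names and the build_proxy_opener helper are assumptions for illustration only; they are not existing Galaxy settings or functions.

```python
import urllib.request

def build_proxy_opener(env):
    """Build a URL opener that authenticates to the HTTP proxy described by env."""
    proxy_url = env.get("http_proxy")
    if not proxy_url:
        # No proxy configured: fall back to the default opener.
        return urllib.request.build_opener()
    handlers = [urllib.request.ProxyHandler({"http": proxy_url})]
    user, password = env.get("PROXY_USER"), env.get("PROXY_PASSWORD")
    if user and password:
        password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
        password_mgr.add_password(None, proxy_url, user, password)
        handlers.append(urllib.request.ProxyBasicAuthHandler(password_mgr))
    return urllib.request.build_opener(*handlers)

# Placeholder values matching the ones used above; no request is sent here.
settings = {"http_proxy": "http://myproxy.domain.com:8080",
            "PROXY_USER": "MYDOMAIN\\MYUSER", "PROXY_PASSWORD": "MYPASS"}
opener = build_proxy_opener(settings)
print([type(handler).__name__ for handler in opener.handlers])
```

In real use the function would be called with os.environ, so the credentials stay out of the source tree and survive a mercurial update.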

[galaxy-dev] Feb 10, 2014 Galaxy Distribution News Brief

2014-02-10 Thread Jennifer Jackson

Feb 10, 2014 Galaxy Distribution News Brief
https://wiki.galaxyproject.org/News/2014_02_10_Galaxy_Distribution

*Complete News Brief
https://wiki.galaxyproject.org/DevNewsBriefs/2014_02_10*


*Highlights:*

 * Visualization upgrades, including Trackster CSS styling
 * Multiple Tools migrated to the Tool Shed for a leaner distribution
 * Redesign of UI rendering: new icons, new font, history pane updates
 * API functionality upgrades featuring a new master admin API key and
 * Tool Shed updates with a focus on repository metadata, displays,
   installs, and tests
 * Over 35 new community contributions added

http://getgalaxy.org

http://bitbucket.org/galaxy/galaxy-dist

http://galaxy-dist.readthedocs.org

new:       $ hg clone https://bitbucket.org/galaxy/galaxy-dist#stable

upgrade:   $ hg pull
           $ hg update release_2014.02.10

/Thanks for using Galaxy!/

The Galaxy Team https://wiki.galaxyproject.org/Galaxy%20Team
