[galaxy-dev] Download data by data manager with non generic ftp-proxy

2015-06-15 Thread Tobias Zimmermann

Dear All,
I have some problems with the download of genome files by the data 
manager. I tried to download some genome files from the UCSC server with 
the data manager of galaxy ( name: data_manager_fetch_genome_all_fasta, 
revision: cca219f2b212), but it fails because of a timeout. Our galaxy 
server is in an internal network without connection to the Internet. To 
access the Internet we are required to go through an http-proxy before any 
web-services can be accessed.


The file
/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_all_fasta/cca219f2b212/data_manager_fetch_genome_all_fasta/data_manager/data_manager_fetch_genome_all_fasta.py
manages the download of the file. Line 183 shows the problem: here the 
FTP object from the Python ftplib module tries to establish the 
connection to the UCSC FTP server, and this connection fails. So here is 
my question:


How can I download data with the data manager through an http-proxy?


Regards
Tobias
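
For illustration only (nothing in this thread confirms a fix): ftplib opens a direct connection to the FTP server and does not speak to HTTP proxies, whereas Python's urllib can hand an ftp:// URL to an HTTP proxy. Below is a minimal sketch of that idea, assuming Python 3, a proxy that accepts FTP URLs over HTTP, and hypothetical host names (proxy.example.org, ftp.example.org). It is not the data manager's own code, and the helper names are made up for this example.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends ftp:// and http:// requests via proxy_url."""
    # ProxyHandler maps a URL scheme to the proxy that should serve it.
    handler = urllib.request.ProxyHandler({"ftp": proxy_url, "http": proxy_url})
    return urllib.request.build_opener(handler)


def fetch(url, proxy_url, timeout=60):
    """Download url through the proxy and return the raw bytes."""
    opener = build_proxy_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Hypothetical values: replace with the site's proxy and a real file URL.
    data = fetch("ftp://ftp.example.org/path/to/genome.fa.gz",
                 "http://proxy.example.org:3128")
    print(len(data), "bytes downloaded")
```

Whether this works depends on the proxy. If the proxy only allows HTTP(S), fetching the same file from an http:// mirror may be the simpler route.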

___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
 https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
 http://galaxyproject.org/search/mailinglists/

[galaxy-dev] GalaxyAdmins Online Meetup this Thursday: Peter Briggs on Galactic Engineering at Manchester

2015-06-15 Thread Dave Clements
Hello all,

The June 2015 online GalaxyAdmins meetup
https://wiki.galaxyproject.org/Community/GalaxyAdmins/Meetups/2015_06_18 will
be this Thursday, 18 June, at 5pm Central European Time (see your local time
http://bit.ly/1F7zN8l).

Peter Briggs http://galacticengineer.blogspot.co.uk/ of the University of
Manchester http://www.manchester.ac.uk/ and author of the *Galactic
Engineer http://galacticengineer.blogspot.co.uk/* blog will talk about
his work deploying Galaxy and developing tools within the Bioinformatics
Core Facility http://www.ls.manchester.ac.uk/research/facilities/#bioinf at
the University of Manchester.

John Chilton https://wiki.galaxyproject.org/JohnChilton will also give an
update on recent and upcoming work on Planemo
https://github.com/galaxyproject/planemo, a set of command-line utilities
to assist in building and publishing Galaxy tools.

We'll also discuss the upcoming GalaxyAdmins meetup, and the hackathons at
GCC2015 http://gcc2015.tsl.ac.uk/.

We'll post a link on the meetup page
https://wiki.galaxyproject.org/Community/GalaxyAdmins/Meetups/2015_06_18
on Thursday.

Hope to see/hear you on Thursday,
Dave C

-- 
http://galaxyproject.org/
http://getgalaxy.org/
http://usegalaxy.org/
https://wiki.galaxyproject.org/

Re: [galaxy-dev] minor visual bug found when using Firefox

2015-06-15 Thread John Chilton
Yeah - I can confirm this. I don't know how to fix it though - Firefox
and Chrome just render these data list elements differently. You can
disable the drop-down arrow in Chrome - but not enable it in Firefox -
and I like the default Chrome behavior better :(. I guess the correct
thing to do here is to find some jQuery widget library or
something that does this in a custom way instead of relying on the
browser.

-John



On Mon, Jun 1, 2015 at 2:41 PM, Ryan G ngsbioinformat...@gmail.com wrote:
 In the workflow editor, if you add an Input dataset collection, the
 Collection Type: shows list.  No dropdown arrow appears using Firefox.  A
 dropdown arrow appears when you mouse over using Chrome.

 In Firefox, if you mouse over where the dropdown box should be, you can
 still click to choose list of datasets...

 For the longest time, I thought this was broken until a co-worker showed me
 it works with Chrome.  I'd make a Trello card for this but I can't access
 Trello from behind the company firewall


Re: [galaxy-dev] functional tests / database lock

2015-06-15 Thread John Chilton
Not exactly - I feel like sqlite should be able to do the kinds of
things Galaxy does without this occurring - it should just be
inefficient. We are probably doing something across multiple threads
that we should not be - presumably related to job handling. After a
recent sqlalchemy upgrade - there is now a new error message in the
logs that should help track down the problem:

sqlalchemy.pool.NullPool ERROR 2015-06-15 09:53:47,547 Exception during reset or similar
Traceback (most recent call last):
  File "build/bdist.linux-x86_64/egg/sqlalchemy/pool.py", line 571, in _finalize_fairy
    fairy._reset(pool)
  File "build/bdist.linux-x86_64/egg/sqlalchemy/pool.py", line 705, in _reset
    pool._dialect.do_rollback(self)
  File "build/bdist.linux-x86_64/egg/sqlalchemy/engine/default.py", line 412, in do_rollback
    dbapi_connection.rollback()
ProgrammingError: SQLite objects created in a thread can only be used in that same thread.The object was created in thread id 140442561648384 and this is thread id 140442989479680

At any rate - we recommend using postgres in production for a reason -
these sqlite errors are pretty common.
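
The ProgrammingError in that log can be reproduced outside Galaxy with nothing but the standard library. A minimal sketch, assuming Python 3 and the default check_same_thread=True setting of sqlite3.connect; the function name is made up for this example and this is not Galaxy code:

```python
import sqlite3
import threading


def query_from_other_thread():
    """Create a connection in this thread, use it in another, collect errors."""
    conn = sqlite3.connect(":memory:")  # check_same_thread defaults to True
    errors = []

    def worker():
        try:
            # The connection was created in the main thread, so sqlite3
            # refuses to use it from this one.
            conn.execute("SELECT 1")
        except sqlite3.ProgrammingError as exc:
            errors.append(str(exc))

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    conn.close()
    return errors


if __name__ == "__main__":
    for message in query_from_other_thread():
        print(message)
```

This only shows the rule that sqlite3 enforces. Which part of Galaxy shares a connection across threads is still the open question.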

-John


On Fri, Jun 12, 2015 at 11:15 AM, Ryan G ngsbioinformat...@gmail.com wrote:
 Is there a known cause for this?

 On Fri, Jun 12, 2015 at 10:52 AM, Nicola Soranzo nsora...@tiscali.it
 wrote:

 Hi Ryan,
 there is no known fix for this, apart from moving to PostgreSQL.

 Cheers,
 Nicola

 On 12.06.2015 16:43 Ryan G wrote:

 Hi all - I'm running functional tests on a set of tools using
 'run_tests.sh -sid ...'.  A few of the tests fail.  When I go back and run
 the tool tests individually, they succeed.  So I looked into why they fail
 when running them in batch, and it turns out to be a database lock.  I'm
 using the default SQLite database that run_tests.sh creates.  Does this
 need to be tuned somehow?
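
 To illustrate what "database is locked" means at the SQLite level, here is a minimal sketch, assuming Python 3 and SQLite's default rollback-journal mode. The table and function names are made up for this example, and it says nothing about where Galaxy itself holds the lock: one connection keeps an exclusive transaction open while a second connection with a short timeout tries to read.

```python
import os
import sqlite3
import tempfile


def demonstrate_lock():
    """Return the error text a second connection gets from a locked database."""
    fd, path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    try:
        # isolation_level=None means autocommit, so the explicit BEGIN and
        # ROLLBACK statements below are sent to SQLite unchanged.
        writer = sqlite3.connect(path, isolation_level=None)
        writer.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, value TEXT)")
        writer.execute("BEGIN EXCLUSIVE")  # hold the lock like a busy writer

        reader = sqlite3.connect(path, timeout=0.1)  # give up after 0.1 s
        message = None
        try:
            reader.execute("SELECT id FROM demo WHERE value = ?", ("abc",))
        except sqlite3.OperationalError as exc:
            message = str(exc)
        finally:
            reader.close()
            writer.execute("ROLLBACK")
            writer.close()
        return message
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demonstrate_lock())
```

 Raising the timeout only makes the second connection wait longer. It does not remove the contention.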


 galaxy.jobs.handler INFO 2015-06-12 10:28:35,854 (14) Job dispatched
 Error - : (OperationalError) database is locked u'SELECT api_keys.id AS
 api_keys_id, api_keys.create_time AS api_keys_create_time, api_keys.user_id
 AS api_keys_user_id, api_keys.key AS api_keys_key \nFROM api_keys \nWHERE
 api_keys.key = ?' (u'e4fe9bf787edb9ef746714beb6212bc8',)
 URL:
 http://localhost:8363/api/histories/cb227cec8ca83994?key=e4fe9bf787edb9ef746714beb6212bc8
 File
 '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/middleware/error.py',
 line 149 in __call__
   app_iter = self.application(environ, sr_checker)
 File
 '/apps/sys/galaxy/galaxy-v15.05-staging/eggs/Paste-1.7.5.1-py2.7.egg/paste/recursive.py',
 line 84 in __call__
   return self.application(environ, start_response)
 File
 '/apps/sys/galaxy/galaxy-v15.05-staging/eggs/Paste-1.7.5.1-py2.7.egg/paste/httpexceptions.py',
 line 633 in __call__
   return self.application(environ, start_response)
 File
 '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/base.py',
 line 132 in __call__
   return self.handle_request( environ, start_response )
 File
 '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/base.py',
 line 159 in handle_request
   trans = self.transaction_factory( environ )
 File
 '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/webapp.py',
 line 71 in
   self.set_transaction_factory( lambda e: self.transaction_chooser( e,
 galaxy_app, session_cookie ) )
 File
 '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/webapp.py',
 line 102 in transaction_chooser
   return GalaxyWebTransaction( environ, galaxy_app, self, session_cookie )
 File
 '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/webapp.py',
 line 194 in __init__
   self.error_message = self._authenticate_api( session_cookie )
 File
 '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/webapp.py',
 line 283 in _authenticate_api
   provided_key = self.sa_session.query( self.app.model.APIKeys ).filter(
 self.app.model.APIKeys.table.c.key == api_key ).one()
 File 'build/bdist.linux-x86_64/egg/sqlalchemy/orm/query.py', line 2184 in
 one
 File 'build/bdist.linux-x86_64/egg/sqlalchemy/orm/query.py', line 2227 in
 __iter__
 File 'build/bdist.linux-x86_64/egg/sqlalchemy/orm/query.py', line 2242 in
 _execute_and_instances
 File 'build/bdist.linux-x86_64/egg/sqlalchemy/engine/base.py', line 1449
 in execute
 File 'build/bdist.linux-x86_64/egg/sqlalchemy/engine/base.py', line 1584
 in _execute_clauseelement
 File 'build/bdist.linux-x86_64/egg/sqlalchemy/engine/base.py', line 1698
 in _execute_context
 File 'build/bdist.linux-x86_64/egg/sqlalchemy/engine/base.py', line 1691
 in _execute_context
 File 'build/bdist.linux-x86_64/egg/sqlalchemy/engine/default.py', line 331
 in do_execute
 OperationalError: (OperationalError) database is locked u'SELECT
 api_keys.id AS api_keys_id, api_keys.create_time AS api_keys_create_time,
 api_keys.user_id AS api_keys_user_id, api_keys.key AS api_keys_key \nFROM
 api_keys \nWHERE api_keys.key = ?' (u'e4fe9bf787edb9ef746714beb6212bc8',)






Re: [galaxy-dev] Data Collections

2015-06-15 Thread John Chilton
On Wed, Jun 10, 2015 at 4:04 PM, Alexander Vowinkel
vowinkel.alexan...@gmail.com wrote:
 Hi Folks,

 thank you so far for the previous help. I got much further.
 Now I'm stuck with data collections.

 Because this is quite a list, I appreciate also answers to parts of my
 questions ;)

 I have two issues:
 A) manual definition of data collections (any type) by user and/or admin
 B) definition of data collections as input/output of a tool and inside a
 workflow


 A) manual
 Basically I would like to create
 i) a list of fastq files (unpaired)
 ii) a paired set of two fastq files
 iii) a list of each two paired fastq files

 How can I do that?
 By using the web app? As user? As admin?
 By working via ssh on the server?

So each of these got much easier/more robust with the most recent release.

For the user perspective - for any of these options you will want to
load the fastq files into a history, open the manage multiple datasets
option 
(https://wiki.galaxyproject.org/Histories#Managing_Multiple_Datasets_Easily),
select the datasets, and then choose the list type from the menu. Each
will cause a widget to pop up allowing you to group the datasets (into
a list, a pair, or a list of pairs  depending on your selection).

The most complicated option is the list of pairs - this option is
demonstrated in the first video in Anton's recent NGS 101 -
Reference-based RNA-seq series
(https://vimeo.com/channels/884356/128265983). More information at
https://wiki.galaxyproject.org/Learn/GalaxyNGS101.

For all user-centric scenarios - you will need to get the plain
datasets into a history first. FTP upload for instance doesn't support
creating collections directly - you can import datasets and then
create them. Likewise - data libraries do not currently support
dataset collections. I believe there are Trello cards for both of
these issues.

For admins - there is a dataset collection API - I can point you at
examples if you want - but this doesn't seem to be your interest.
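
As a rough idea of what such an API request carries, here is a sketch that only builds the JSON payloads for a list, a pair, and a list of pairs. The field names (instance_type, history_id, collection_type, element_identifiers, src) are my understanding of the dataset collection API and should be checked against the Galaxy version in use. The helper names and the dataset and history ids are hypothetical, and no request is actually sent.

```python
import json


def hda(name, dataset_id):
    """Element identifier pointing at an existing history dataset."""
    return {"name": name, "src": "hda", "id": dataset_id}


def list_payload(history_id, name, datasets):
    """datasets: iterable of (element_name, dataset_id) tuples."""
    return {
        "instance_type": "history",
        "history_id": history_id,
        "name": name,
        "collection_type": "list",
        "element_identifiers": [hda(n, d) for n, d in datasets],
    }


def paired_payload(history_id, name, forward_id, reverse_id):
    """A single pair: elements are named forward and reverse."""
    return {
        "instance_type": "history",
        "history_id": history_id,
        "name": name,
        "collection_type": "paired",
        "element_identifiers": [hda("forward", forward_id),
                                hda("reverse", reverse_id)],
    }


def list_of_pairs_payload(history_id, name, pairs):
    """pairs: iterable of (sample_name, forward_id, reverse_id) tuples."""
    elements = [
        {
            "name": sample,
            "src": "new_collection",  # nested collection created on the fly
            "collection_type": "paired",
            "element_identifiers": [hda("forward", fwd), hda("reverse", rev)],
        }
        for sample, fwd, rev in pairs
    ]
    return {
        "instance_type": "history",
        "history_id": history_id,
        "name": name,
        "collection_type": "list:paired",
        "element_identifiers": elements,
    }


if __name__ == "__main__":
    payload = list_of_pairs_payload("HISTORY_ID", "samples",
                                    [("s1", "DATASET_1", "DATASET_2")])
    print(json.dumps(payload, indent=2))
```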



 B) in tool/workflow
 Here I also have different approaches I would like to realize:
 i) use a collection as input for a tool
 ii) create a collection as output of a tool
 ii.1) from known # of output parameters
 ii.2) from unknown # of output parameters

 For these things I was trying to find some tools in toolshed to see how they
 do it, but I couldn't quite adopt it.

I would look in the following directory instead of the tool shed -
https://github.com/galaxyproject/galaxy/tree/dev/test/functional/tools.
These are the tools used to drive the testing of the collections
implementation and contain some very stripped down examples of what is
possible.


 i) use a collection as input for a tool
 this is well documented - realizable by type=data_collection and the
 collection_type.
 Unfortunately I can't test this because I can't create a collection so far
 ;) - see A

Indeed :). Some good examples here are the tools in the RNA-seq
pipeline - Tophat, Bowtie2, etc.


 ii) create a collection as output of a tool
 Here it gets blurry for me.

So one can get very far without ever creating an output from a tool
explicitly. I contend most of the time - if you have a list of bam
files and you want to create another list of bam files - you just want
to map some operation over them. This is demonstrated in that RNA-seq
outline - and talked about in a more theoretical way in my GCC talk
from last year http://bit.ly/gcc2014workflows.

There are definitely cases when you want to explicitly create
collections though - the current best documentation on this is going
to be the pull request that added them - not the implementation but
the description which actually lays out these same categories and how
to handle them with explicit complete examples.
https://bitbucket.org/galaxy/galaxy-central/pull-request/634/allow-tools-to-explicitly-produce-dataset

Hopefully this helps - please follow up with additional questions as
you have them. I am keen to see more developers leveraging dataset
collections.

Thanks a bunch.
-John


 ii.1) from known # of output parameters
 Here I didn't find a tool. I just thought, it might be a simpler case than
 ii.2 and
 good to understand the concept.
 I would be glad if someone could explain the way(s) to do this.

 ii.2) from unknown # of output parameters
 For this I found barcode splitter tools (also from devteam) that have
 different approaches.
 But. Their output (defined in xml) is only some report file.
 The output files seem to be fed into the history.
 And here I don't know how to get hands on these files when I want to use
 them to feed them into the next step during a workflow.

 Help highly appreciated!

 Thanks!
 Alexander


Re: [galaxy-dev] Data Collections

2015-06-15 Thread Alexander Vowinkel
Thank you for this detailed description!

I already have a followup question.
I'm working on Galaxy Cloudman:

 Galaxy is at revision: 93cda3eb81 (master branch) from 11 Jun 2015


But I can only find Build dataset pair|list, not List of Dataset Pairs
as in the video. In which version is that implemented?

Best,
Alexander


[galaxy-dev] Repository installation error - missing

2015-06-15 Thread Alexander Vowinkel
Hi,

I have a custom tool_conf.xml in gbs_tools/gbs_tool_conf.xml
It is added to the config and loaded without problems.
<toolbox tool_path="./gbs_tools">

But when I want to install tools from a toolshed (test and main)
to it (Shed tool configuration file), it reports "Error during
installation".

Clicking on the tool, it says "This repository is not installed
correctly (see the Repository installation error below). [...]"
But there is no "Repository installation error".
So, how can I find out what the problem is?
Installing in the default tool_shed_conf.xml works.

I also checked if it's a problem with the tool_path (I assume it
points to the galaxy root). So I created a directory gbs_tools
in gbs_tools, in case it is relative to the xml file directory.
No difference.

Thanks for help!
Alexander

[galaxy-dev] config files in cloudman

2015-06-15 Thread Alexander Vowinkel
Hi,

the config/galaxy.ini is being overwritten when restarting Galaxy
via the CloudMan Console.
Unfortunately I can't find the file that is used as the template.

I searched on the whole machine for *.cloud, galaxy.ini* and
universe_wsgi.ini*. The only file I found was
/mnt/cm/installed_files/universe_wsgi.ini.cloud. Changes to this
file don't have any effect.
The corresponding bucket also doesn't contain this file.

Where is this blueprint coming from?
And where can I change it?

I am using the next version of CloudMan (dev).
I created my own instance with the CloudMan playbook.

Thanks a lot!
Alexander