[galaxy-dev] Download data by data manager with non generic ftp-proxy
Dear All,

I have a problem downloading genome files with the data manager. I tried to download some genome files from the UCSC server with the Galaxy data manager (name: data_manager_fetch_genome_all_fasta, revision: cca219f2b212), but it fails because of a timeout.

Our Galaxy server is in an internal network without a direct connection to the Internet. To access the Internet we are required to go through an HTTP proxy before any web services can be reached.

The file /shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_all_fasta/cca219f2b212/data_manager_fetch_genome_all_fasta/data_manager/data_manager_fetch_genome_all_fasta.py manages the download of the file. Line 183 shows the problem: there the FTP object from the Python ftplib module tries to establish the connection to the UCSC FTP server, and this connection fails.

So here is my question: how can I download data with the data manager through an HTTP proxy?

Regards
Tobias

___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
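The ftplib module opens a direct FTP connection and has no notion of an HTTP proxy, which would explain the timeout described above. One possible workaround, sketched here in Python 3, is to fetch the ftp:// URL through urllib, which can hand FTP requests to an HTTP proxy. This is only a sketch and not the data manager's own code: the function names and the proxy address are hypothetical, and it assumes the site proxy is able to relay FTP requests received over HTTP.

```python
import shutil
import urllib.request


def build_ftp_proxy_opener(proxy_url):
    """Return an opener that sends ftp:// and http:// requests to proxy_url.

    proxy_url is a placeholder for the site's HTTP proxy; whether that
    proxy can actually relay FTP requests is an assumption.
    """
    proxies = {"ftp": proxy_url, "http": proxy_url}
    return urllib.request.build_opener(urllib.request.ProxyHandler(proxies))


def fetch_via_proxy(url, target_path, proxy_url, timeout=60):
    """Download url to target_path through the proxy (hypothetical helper)."""
    opener = build_ftp_proxy_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        with open(target_path, "wb") as out:
            shutil.copyfileobj(response, out)
    return target_path


# Example call (not executed here; host names are placeholders):
# fetch_via_proxy("ftp://ftp.example.org/genomes/some_genome.fa.gz",
#                 "some_genome.fa.gz", "http://proxy.example.org:3128")
```

urllib also picks up the ftp_proxy and http_proxy environment variables by default, so setting those for the Galaxy process may be enough for code that already uses urllib, but not for code that calls ftplib directly.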
[galaxy-dev] GalaxyAdmins Online Meetup this Thursday: Peter Briggs on Galactic Engineering at Manchester
Hello all,

The June 2015 online GalaxyAdmins meetup https://wiki.galaxyproject.org/Community/GalaxyAdmins/Meetups/2015_06_18 will be this Thursday, 18 June, at 5pm Central European Time (see your local time http://bit.ly/1F7zN8l).

Peter Briggs http://galacticengineer.blogspot.co.uk/ of the University of Manchester http://www.manchester.ac.uk/ and author of the *Galactic Engineer http://galacticengineer.blogspot.co.uk/* blog will talk about his Galaxy work deploying Galaxy and developing tools within the Bioinformatics Core Facility http://www.ls.manchester.ac.uk/research/facilities/#bioinf at the University of Manchester.

John Chilton https://wiki.galaxyproject.org/JohnChilton will also give an update on recent and upcoming work on Planemo https://github.com/galaxyproject/planemo, a set of command-line utilities to assist in building and publishing Galaxy tools.

We'll also discuss the upcoming GalaxyAdmins meetup, and the hackathons at GCC2015 http://gcc2015.tsl.ac.uk/.

We'll post a link on the meetup page https://wiki.galaxyproject.org/Community/GalaxyAdmins/Meetups/2015_06_18 on Thursday.

Hope to see/hear you on Thursday,
Dave C

--
http://galaxyproject.org/ http://getgalaxy.org/ http://usegalaxy.org/ https://wiki.galaxyproject.org/
Re: [galaxy-dev] minor visual bug found when using Firefox
Yeah - I can confirm this. I don't know how to fix it though - Firefox and Chrome just render these data list elements differently. You can disable the drop-down arrow in Chrome - but not enable it in Firefox - and I like the default Chrome behavior better :(.

I guess the correct thing to do here is to find some jQuery widget library or something that does this in a custom way instead of relying on the browser.

-John

On Mon, Jun 1, 2015 at 2:41 PM, Ryan G ngsbioinformat...@gmail.com wrote:
> In the workflow editor, if you add an Input dataset collection, the Collection Type: shows list. No drop-down arrow appears using Firefox. A drop-down arrow appears when you mouse over using Chrome. In Firefox, if you mouse over where the drop-down box should be, you can still click to choose list of datasets...
>
> For the longest time, I thought this was broken until a co-worker showed me it works with Chrome.
>
> I'd make a Trello card for this but I can't access Trello from behind the company firewall.
Re: [galaxy-dev] functional tests / database lock
Not exactly - I feel like sqlite should be able to do the kinds of things Galaxy does without this occurring - it should just be inefficient. We are probably doing something across multiple threads that we should not be - presumably related to job handling. After a recent sqlalchemy upgrade there is now a new error message in the logs that should help track down the problem:

    sqlalchemy.pool.NullPool ERROR 2015-06-15 09:53:47,547 Exception during reset or similar
    Traceback (most recent call last):
      File build/bdist.linux-x86_64/egg/sqlalchemy/pool.py, line 571, in _finalize_fairy
        fairy._reset(pool)
      File build/bdist.linux-x86_64/egg/sqlalchemy/pool.py, line 705, in _reset
        pool._dialect.do_rollback(self)
      File build/bdist.linux-x86_64/egg/sqlalchemy/engine/default.py, line 412, in do_rollback
        dbapi_connection.rollback()
    ProgrammingError: SQLite objects created in a thread can only be used in that same thread.The object was created in thread id 140442561648384 and this is thread id 140442989479680

At any rate - we recommend using postgres in production for a reason - these sqlite errors are pretty common.

-John

On Fri, Jun 12, 2015 at 11:15 AM, Ryan G ngsbioinformat...@gmail.com wrote:
> Is there a known cause for this?
>
> On Fri, Jun 12, 2015 at 10:52 AM, Nicola Soranzo nsora...@tiscali.it wrote:
>> Hi Ryan,
>> there is no known fix for this, apart from moving to PostgreSQL.
>> Cheers,
>> Nicola
>>
>> On 12.06.2015 16:43 Ryan G wrote:
>>> Hi all - I'm running functional tests on a set of tools using 'run_tests.sh -sid ...'. A few of the tests fail. When I go back and run the tool tests individually, they succeed. So I looked into why they fail when running them in batch, and it turns out to be a database lock. I'm using the default sqlite database that run_tests.sh creates. Does this need to be tuned somehow?
>>>
>>> galaxy.jobs.handler INFO 2015-06-12 10:28:35,854 (14) Job dispatched
>>> Error - : (OperationalError) database is locked u'SELECT api_keys.id AS api_keys_id, api_keys.create_time AS api_keys_create_time, api_keys.user_id AS api_keys_user_id, api_keys.key AS api_keys_key \nFROM api_keys \nWHERE api_keys.key = ?' (u'e4fe9bf787edb9ef746714beb6212bc8',)
>>> URL: http://localhost:8363/api/histories/cb227cec8ca83994?key=e4fe9bf787edb9ef746714beb6212bc8
>>> File '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/middleware/error.py', line 149 in __call__
>>>   app_iter = self.application(environ, sr_checker)
>>> File '/apps/sys/galaxy/galaxy-v15.05-staging/eggs/Paste-1.7.5.1-py2.7.egg/paste/recursive.py', line 84 in __call__
>>>   return self.application(environ, start_response)
>>> File '/apps/sys/galaxy/galaxy-v15.05-staging/eggs/Paste-1.7.5.1-py2.7.egg/paste/httpexceptions.py', line 633 in __call__
>>>   return self.application(environ, start_response)
>>> File '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/base.py', line 132 in __call__
>>>   return self.handle_request( environ, start_response )
>>> File '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/base.py', line 159 in handle_request
>>>   trans = self.transaction_factory( environ )
>>> File '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/webapp.py', line 71 in
>>>   self.set_transaction_factory( lambda e: self.transaction_chooser( e, galaxy_app, session_cookie ) )
>>> File '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/webapp.py', line 102 in transaction_chooser
>>>   return GalaxyWebTransaction( environ, galaxy_app, self, session_cookie )
>>> File '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/webapp.py', line 194 in __init__
>>>   self.error_message = self._authenticate_api( session_cookie )
>>> File '/apps/sys/galaxy/galaxy-v15.05-staging/lib/galaxy/web/framework/webapp.py', line 283 in _authenticate_api
>>>   provided_key = self.sa_session.query( self.app.model.APIKeys ).filter( self.app.model.APIKeys.table.c.key == api_key ).one()
>>> File 'build/bdist.linux-x86_64/egg/sqlalchemy/orm/query.py', line 2184 in one
>>> File 'build/bdist.linux-x86_64/egg/sqlalchemy/orm/query.py', line 2227 in __iter__
>>> File 'build/bdist.linux-x86_64/egg/sqlalchemy/orm/query.py', line 2242 in _execute_and_instances
>>> File 'build/bdist.linux-x86_64/egg/sqlalchemy/engine/base.py', line 1449 in execute
>>> File 'build/bdist.linux-x86_64/egg/sqlalchemy/engine/base.py', line 1584 in _execute_clauseelement
>>> File 'build/bdist.linux-x86_64/egg/sqlalchemy/engine/base.py', line 1698 in _execute_context
>>> File 'build/bdist.linux-x86_64/egg/sqlalchemy/engine/base.py', line 1691 in _execute_context
>>> File 'build/bdist.linux-x86_64/egg/sqlalchemy/engine/default.py', line 331 in do_execute
>>> OperationalError: (OperationalError) database is locked u'SELECT api_keys.id AS api_keys_id, api_keys.create_time AS api_keys_create_time, api_keys.user_id AS api_keys_user_id, api_keys.key AS api_keys_key \nFROM api_keys \nWHERE api_keys.key = ?' (u'e4fe9bf787edb9ef746714beb6212bc8',)
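The two SQLite errors in this thread can be reproduced outside Galaxy with nothing but the Python standard library. The sketch below is an illustration only, written for Python 3 and independent of Galaxy and sqlalchemy: the function names and the api_keys demo table are made up for the example, and it assumes SQLite's default rollback-journal mode.

```python
import os
import sqlite3
import tempfile
import threading


def cross_thread_error():
    """Use a connection from a thread other than the one that created it."""
    conn = sqlite3.connect(":memory:")  # check_same_thread defaults to True
    caught = []

    def worker():
        try:
            conn.execute("SELECT 1")
        except sqlite3.ProgrammingError as exc:
            caught.append(str(exc))

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    conn.close()
    return caught[0] if caught else None


def locked_error():
    """Read from a database file while another connection holds an exclusive lock."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite")
        # isolation_level=None puts the connection in autocommit mode,
        # so BEGIN/ROLLBACK are issued explicitly below.
        writer = sqlite3.connect(path, isolation_level=None)
        writer.execute("CREATE TABLE api_keys (key TEXT)")
        writer.execute("BEGIN EXCLUSIVE")
        # timeout=0 makes the reader fail immediately instead of waiting.
        reader = sqlite3.connect(path, timeout=0)
        try:
            reader.execute("SELECT key FROM api_keys").fetchall()
            return None
        except sqlite3.OperationalError as exc:
            return str(exc)
        finally:
            reader.close()
            writer.execute("ROLLBACK")
            writer.close()


print(cross_thread_error())
print(locked_error())
```

The first function triggers the same ProgrammingError class as in the sqlalchemy log, and the second triggers an OperationalError reporting a locked database. A database server such as PostgreSQL does not lock a whole file this way, which is consistent with the advice given in the thread.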
Re: [galaxy-dev] Data Collections
On Wed, Jun 10, 2015 at 4:04 PM, Alexander Vowinkel vowinkel.alexan...@gmail.com wrote:
> Hi Folks,
> thank you so far for the previous help. I got much further. Now I'm stuck with data collections. Because this is quite a list, I also appreciate answers to parts of my questions ;)
>
> I have two issues:
> A) manual definition of data collections (any type) by user and/or admin
> B) definition of data collections as input/output of a tool and inside a workflow
>
> A) manual
> Basically I would like to create
> i) a list of fastq files (unpaired)
> ii) a paired set of two fastq files
> iii) a list of paired fastq files (two files per pair)
> How can I do that? By using the web app? As user? As admin? By working via ssh on the server?

So each of these got much easier/more robust with the most recent release.

For the user perspective - for any of these options you will want to load the fastq files into a history, open the manage multiple datasets option (https://wiki.galaxyproject.org/Histories#Managing_Multiple_Datasets_Easily), select the datasets, and then choose the list type from the menu. Each will cause a widget to pop up allowing you to group the datasets (into a list, a pair, or a list of pairs depending on your selection). The most complicated option is the list of pairs - this option is demonstrated in the first video in Anton's recent NGS 101 - Reference-based RNA-seq series (https://vimeo.com/channels/884356/128265983). More information at https://wiki.galaxyproject.org/Learn/GalaxyNGS101.

For all user-centric scenarios - you will need to get the plain datasets into a history first. FTP upload for instance doesn't support creating collections directly - you can import datasets and then create them. Likewise - data libraries do not currently support dataset collections. I believe there are Trello cards for both of these issues.

For admins - there is a dataset collection API - I can point you at examples if you want - but this doesn't seem to be your interest.

> B) in tool/workflow
> Here I also have different approaches I would like to realize:
> i) use a collection as input for a tool
> ii) create a collection as output of a tool
> ii.1) from known # of output parameters
> ii.2) from unknown # of output parameters
> For these things I was trying to find some tools in the toolshed to see how they do it, but I couldn't quite adapt it.

I would look in the following directory instead of the tool shed - https://github.com/galaxyproject/galaxy/tree/dev/test/functional/tools. These are the tools used to drive the testing of the collections implementation and contain some very stripped down examples of what is possible.

> i) use a collection as input for a tool
> This is well documented - realizable by type=data_collection and the collection_type. Unfortunately I can't test this because I can't create a collection so far ;) - see A

Indeed :). Good examples here are the tools in the RNA-seq pipeline - Tophat, Bowtie2, etc.

> ii) create a collection as output of a tool
> Here it gets blurry for me.

So one can get very far without ever creating an output from a tool explicitly. I contend most of the time - if you have a list of bam files and you want to create another list of bam files - you just want to map some operation over them. This is demonstrated in that RNA-seq outline - and talked about in a more theoretical way in my GCC talk from last year http://bit.ly/gcc2014workflows.

There are definitely cases when you want to explicitly create collections though - the current best documentation on this is going to be the pull request that added them - not the implementation but the description, which actually lays out these same categories and how to handle them with explicit complete examples. https://bitbucket.org/galaxy/galaxy-central/pull-request/634/allow-tools-to-explicitly-produce-dataset

Hopefully this helps - please follow up with additional questions as you have them. I am keen to see more developers leveraging dataset collections.

Thanks a bunch.

-John

> ii.1) from known # of output parameters
> Here I didn't find a tool. I just thought it might be a simpler case than ii.2 and good for understanding the concept. I would be glad if someone could explain the way(s) to do this.
>
> ii.2) from unknown # of output parameters
> For this I found barcode splitter tools (also from devteam) that have different approaches. But their output (defined in xml) is only some report file. The output files seem to be fed into the history. And here I don't know how to get hold of these files when I want to feed them into the next step of a workflow.
>
> Help highly appreciated! Thanks!
> Alexander
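For the dataset collection API mentioned above, the request body for a list of pairs might be assembled roughly as follows. This is a hedged sketch only: the helper names are hypothetical, the field names ("type", "collection_type", "element_identifiers", "src") reflect my understanding of the payload accepted when posting to a history's contents and are not confirmed by this thread, and the dataset ids are placeholders. Check the API documentation of your Galaxy release before relying on it.

```python
import json


def paired_element(name, forward_id, reverse_id):
    """Describe one forward/reverse pair of history datasets (hypothetical helper)."""
    return {
        "name": name,
        "src": "new_collection",
        "collection_type": "paired",
        "element_identifiers": [
            {"name": "forward", "src": "hda", "id": forward_id},
            {"name": "reverse", "src": "hda", "id": reverse_id},
        ],
    }


def list_of_pairs_payload(collection_name, pairs):
    """Build the assumed request body for a list:paired collection.

    pairs is an iterable of (sample_name, forward_id, reverse_id) tuples.
    """
    return {
        "type": "dataset_collection",
        "collection_type": "list:paired",
        "name": collection_name,
        "element_identifiers": [
            paired_element(name, fwd, rev) for name, fwd, rev in pairs
        ],
    }


# Placeholder ids; real ones would come from the history's dataset listing.
payload = list_of_pairs_payload(
    "rnaseq reads", [("sample1", "DATASET_ID_1", "DATASET_ID_2")]
)
print(json.dumps(payload, indent=2))
```

A plain list or a single pair would use the same structure with collection_type set to "list" or "paired" and the dataset entries placed directly in element_identifiers.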
Re: [galaxy-dev] Data Collections
Thank you for these detailed descriptions! I already have a follow-up question.

I'm working on Galaxy CloudMan; Galaxy is at revision 93cda3eb81 (master branch, from 11 Jun 2015). But I can only find "Build dataset pair|list", not "List of Dataset Pairs" like in the video. At what version is that implemented?

Best,
Alexander

2015-06-15 10:17 GMT-05:00 John Chilton jmchil...@gmail.com:
> For the user perspective - for any of these options you will want to load the fastq files into a history, open the manage multiple datasets option ( https://wiki.galaxyproject.org/Histories#Managing_Multiple_Datasets_Easily ), select the datasets, and then choose the list type from the menu. Each will cause a widget to pop up allowing you to group the datasets (into a list, a pair, or a list of pairs depending on your selection). The most complicated option is the list of pairs - this option is demonstrated in the first video in Anton's recent NGS 101 - Reference-based RNA-seq series (https://vimeo.com/channels/884356/128265983).
[galaxy-dev] Repository installation error - missing
Hi,

I have a custom tool_conf.xml in gbs_tools/gbs_tool_conf.xml. It is added to the config and loaded without problems.

    <toolbox tool_path="./gbs_tools">

But when I want to install tools from a toolshed (test and main) into it (Shed tool configuration file), it reports "Error" during installation. Clicking on the tool, it says "This repository is not installed correctly (see the Repository installation error below). [...]" But there is no Repository installation error shown. So, how can I find out what the problem is?

Installing into the default tool_shed_conf.xml works.

I also checked whether it's a problem with the tool_path (I assume it is relative to the Galaxy root). So I created a directory gbs_tools inside gbs_tools, in case it is relative to the directory of the xml file. No difference.

Thanks for help!
Alexander
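One way to narrow down a tool_path question like this is to print where the path would resolve under each candidate base directory and check whether the process can write there. The sketch below only parses the XML and joins paths. It does not reproduce Galaxy's own resolution logic, the function name and the directories are hypothetical, and which base directory Galaxy really uses remains an open assumption.

```python
import os
import xml.etree.ElementTree as ET


def resolve_tool_path(conf_xml, base_dir):
    """Return the tool_path attribute of a toolbox config resolved against base_dir.

    conf_xml is the XML text of the config file; returns None when the
    root element has no tool_path attribute.
    """
    root = ET.fromstring(conf_xml)
    tool_path = root.get("tool_path")
    if tool_path is None:
        return None
    return os.path.normpath(os.path.join(base_dir, tool_path))


conf = '<toolbox tool_path="./gbs_tools"></toolbox>'
# Hypothetical candidates: the Galaxy root and the directory holding the XML file.
for base in ("/srv/galaxy", "/srv/galaxy/gbs_tools"):
    resolved = resolve_tool_path(conf, base)
    print(base, "->", resolved, "| writable:", os.access(resolved, os.W_OK))
```

If neither candidate is writable by the user running Galaxy, that could be worth ruling out first; the server log at the time of the failed installation is the more direct source for the actual error.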
[galaxy-dev] config files in cloudman
Hi,

the config/galaxy.ini is being overwritten when restarting Galaxy via the CloudMan Console. Unfortunately I can't find the file that is used as the template. I searched the whole machine for *.cloud, galaxy.ini* and universe_wsgi.ini*. The only file I found was /mnt/cm/installed_files/universe_wsgi.ini.cloud. Changes to this file don't have any effect. The corresponding bucket also doesn't contain this file.

Where is this blueprint coming from? And where can I change it?

I am using the next version of CloudMan (dev). I created my own instance with the CloudMan playbook.

Thanks a lot!
Alexander