Re: [galaxy-dev] Tool development - Selecting a single item from input dataset.
I have created a Trello card to track this request here: https://trello.com/c/qCtBBB8n. Any chance you can share the tool with me?

I understand that this instinct is too simplistic, but it seems to me that the option to repeatedly run a tool over many datasets should not be the concern of the tool author; it is a user concern. If we add the option, perhaps we could stipulate that it requires some other parameter to be dependent on it (I assume you have a dependent parameter in the repeat?).

-John

On Thu, Jan 22, 2015 at 5:50 AM, Vimalkumar Velayudhan vi...@biotechcoder.com wrote:

Thanks Peter. I see how this feature would be useful, but the program I'm writing a wrapper for has an argument with values corresponding to the input files. I am using a repeat tag to maintain this order. With the multi-run option, files are selected in a random manner and added to the job queue. It is best not to display the multi-run option in this case. I see there is a TODO on this already: https://bitbucket.org/galaxy/galaxy-dist/src/a2308bdc93b897af974766b190abe019ade49e9a/lib/galaxy/tools/parameters/basic.py?at=default#cl-2084

For now, I have set allow=False, but I believe this is best set at the param tag level:

<param type="data" multirun="false" />

Vimal

On Wed, Jan 21, 2015 at 2:26 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

I think this is the (relatively new) Galaxy ability to automatically run N copies of your tool given N input files, making N outputs; it is related to the collections work. (This is possible if your tool takes a single input file.)

Peter

On Tue, Jan 20, 2015 at 6:17 PM, Vimalkumar Velayudhan vi...@biotechcoder.com wrote:

Hi all,

I am trying to create a select box with the possibility of selecting only a single item from the input dataset (figure 1). This works fine, but the option for selecting multiple files is still visible (figure 2). The multiple=false attribute has no effect. Figure: http://i.imgur.com/oJVFCoF.png

I have the following in my XML:
<param format="tabular" name="ribo_files" type="data" label="Select Ribo-Seq alignment file" multiple="false" />

Any suggestions?

galaxy-dist revision 5f4c13d622b8

Regards,
Vimalkumar Velayudhan

___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Re: [galaxy-dev] data collections - workflow - bug?
Hey,

Really intensive database operations - including dataset collections, but other things too, like multi-running tools or workflows over many individual datasets - can very easily overwhelm the default sqlite database. This is frustrating and shouldn't happen, but unfortunately it does. I would recommend using a PostgreSQL database when testing out dataset collections.

The good news is that it is easier than ever to get a fully fledged production-quality server thanks to Bjoern's Docker server (https://github.com/bgruening/docker-galaxy-stable) - it comes bundled with Postgres and Slurm, so it should be able to handle the collection operations. If you need to run Galaxy on a non-containerized server (for instance because that is where the software is), more information on setting up Galaxy can be found here: https://wiki.galaxyproject.org/Admin/Config/Performance/ProductionServer

Here is a Trello card to track progress on the database optimization efforts if you are interested: https://trello.com/c/UPLsMKQI

Very sorry.

-John

On Mon, Jan 26, 2015 at 9:35 AM, Torsten Houwaart houwa...@informatik.uni-freiburg.de wrote:

Hello Galaxy Devs,

I was using data collections (for the first time) for a new workflow of ours and I ran into this problem. There was no complaint by the workflow editor and I could start the workflow, but then what you see below happened. If you need more information about the workflow or otherwise, let me know.

Best,
Torsten H.
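[Editorial aside: for anyone following along, switching Galaxy off sqlite is a one-line config change once a Postgres database exists. The key below is real for Galaxy of this vintage (universe_wsgi.ini); the role, password, and database names are placeholders.]

```ini
; universe_wsgi.ini - database settings (connection values are placeholders)
; The sqlite default roughly corresponds to:
;database_connection = sqlite:///./database/universe.sqlite?isolation_level=IMMEDIATE

; Point Galaxy at PostgreSQL instead:
database_connection = postgresql://galaxy_user:secret@localhost:5432/galaxy_db
```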
job traceback:

Traceback (most recent call last):
  File "/usr/local/galaxy/galaxy-dist/lib/galaxy/jobs/runners/__init__.py", line 565, in finish_job
    job_state.job_wrapper.finish( stdout, stderr, exit_code )
  File "/usr/local/galaxy/galaxy-dist/lib/galaxy/jobs/__init__.py", line 1250, in finish
    self.sa_session.flush()
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/scoping.py", line 114, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/session.py", line 1718, in flush
    self._flush(objects)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/session.py", line 1789, in _flush
    flush_context.execute()
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/unitofwork.py", line 331, in execute
    rec.execute(self)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/unitofwork.py", line 475, in execute
    uow
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/persistence.py", line 59, in save_obj
    mapper, table, update)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/persistence.py", line 485, in _emit_update_statements
    execute(statement, params)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/engine/base.py", line 1449, in execute
    params)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/engine/base.py", line 1584, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/engine/base.py", line 1698, in _execute_context
    context)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/engine/base.py", line 1691, in _execute_context
    context)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/engine/default.py", line 331, in do_execute
    cursor.execute(statement, parameters)
DBAPIError: (TransactionRollbackError) deadlock detected
DETAIL: Process 3144 waits for ShareLock on transaction 2517124; blocked by process 3143.
Process 3143 waits for ShareLock on transaction 2517123; blocked by process 3144.
HINT: See server log for query details.
'UPDATE workflow_invocation SET update_time=%(update_time)s WHERE workflow_invocation.id = %(workflow_invocation_id)s' {'update_time': datetime.datetime(2015, 1, 26, 14, 20, 4, 155440), 'workflow_invocation_id': 5454}
Re: [galaxy-dev] concatenate_datasets docker image can't create output
Can you send me a screenshot of the Tool form right before you hit submit? It is odd that two commands are executed that barely differ at all. There were some recent UI changes, and I want to make sure that you are passing two inputs once and not submitting one input twice in the newly named "batch" mode.

container.sh is generated in command_factory.py using containers.py - here are some relevant links:

https://bitbucket.org/galaxy/galaxy-central/src/tip/lib/galaxy/jobs/command_factory.py?at=default
https://bitbucket.org/galaxy/galaxy-central/src/tip/lib/galaxy/tools/deps/containers.py?at=default
https://bitbucket.org/galaxy/galaxy-central/src/tip/lib/galaxy/tools/deps/docker_util.py?at=default

-John

On Mon, Jan 26, 2015 at 2:17 PM, Jeltje van Baren jeltje.van.ba...@gmail.com wrote:

I'm still trying to debug this. How do I change the working directory? I can't seem to find out how all those directory mounts get passed to docker - the catDocker.xml appears to generate a 'cat' command that I assume ends up in ()/job_working_directory/000/12/container.sh, but I can't find where that's happening - container.sh isn't listed in any file under galaxy-dist. job_conf.xml lists the directories that get passed as variables - again, when and where are those defined?

Thanks,
-Jeltje

On Fri, Jan 23, 2015 at 1:35 PM, Jeltje van Baren jeltje.van.ba...@gmail.com wrote:

I tried the second solution (setting everything to rw) and, while I can confirm that the command is changed accordingly, the results are the same: two empty output files and the same error message.
Command from paster.log:

docker run -e GALAXY_SLOTS=$GALAXY_SLOTS -v /inside/home/jeltje/exp/varscan2/programs/galaxy-dist:/inside/home/jeltje/exp/varscan2/programs/galaxy-dist:rw -v /inside/home/jeltje/exp/varscan2/programs/galaxy-dist/tools/docker:/inside/home/jeltje/exp/varscan2/programs/galaxy-dist/tools/docker:rw -v /inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/job_working_directory/000/12:/inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/job_working_directory/000/12:rw -v /inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/files:/inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/files:rw -w /inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/job_working_directory/000/12 --net none busybox:ubuntu-14.04 /inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/job_working_directory/000/12/container.sh; return_code=$?; if [ -f /inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/job_working_directory/000/12/working_file ] ; then cp /inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/job_working_directory/000/12/working_file /inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/files/000/dataset_19.dat ; fi; sh -c "exit $return_code"

-Jeltje

On Fri, Jan 23, 2015 at 12:24 PM, John Chilton jmchil...@gmail.com wrote:

My first thought is the bug mentioned here - https://lists.galaxyproject.org/pipermail/galaxy-dev/2014-November/020892.html - along with potential workarounds. Are you able to confirm that this is or is not the problem at all?

-John

On Fri, Jan 23, 2015 at 2:37 PM, Jeltje van Baren jeltje.van.ba...@gmail.com wrote:

Hi,

I'm following instructions at https://github.com/apetkau/galaxy-hackathon-2014. When I try to run the concatenate-datasets tool on two input files, two odd things happen. First, TWO nearly identical commands are generated in the History panel, differing only in their output filename. In paster.log, only the second one shows up.
Second, the program fails, with this info in the History:

An error occurred with this dataset:
Galaxy slots passed through contain as 1
/inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/job_working_directory/000/10/container.sh: line 2: can't create /inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/files/000/dataset_14.dat

This happens for both commands in the History panel; only the output filename differs in the other error (dataset_15.dat). Oddly, dataset_14.dat and dataset_15.dat are both created during this command; they just end up empty.

Paster.log:

galaxy.jobs.runners DEBUG 2015-01-23 11:08:35,973 (10) command is:
docker inspect busybox:ubuntu-14.04 > /dev/null 2>&1
[ $? -ne 0 ] && docker pull busybox:ubuntu-14.04 > /dev/null 2>&1
docker run -e GALAXY_SLOTS=$GALAXY_SLOTS -v /inside/home/jeltje/exp/varscan2/programs/galaxy-dist:/inside/home/jeltje/exp/varscan2/programs/galaxy-dist:ro -v /inside/home/jeltje/exp/varscan2/programs/galaxy-dist/tools/docker:/inside/home/jeltje/exp/varscan2/programs/galaxy-dist/tools/docker:ro -v /inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/job_working_directory/000/10:/inside/home/jeltje/exp/varscan2/programs/galaxy-dist/database/job_working_directory/000/10:rw -v
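[Editorial aside: the docker command in the log above is mechanical: one -v host:container:mode flag per mounted directory, -w for the working directory, then the image and the container.sh path. The sketch below is purely illustrative of that shape - it is not Galaxy's actual command_factory.py/containers.py code, and all names in it are made up.]

```python
def build_docker_run(image, command_script, volumes, workdir, net="none"):
    """Assemble a docker run command line of the shape seen in paster.log.

    volumes is a list of (host_path, container_path, mode) triples,
    where mode is "ro" or "rw".
    """
    parts = ["docker", "run", "-e", "GALAXY_SLOTS=$GALAXY_SLOTS"]
    for host, container, mode in volumes:
        # Each mounted directory becomes a -v host:container:mode flag.
        parts.extend(["-v", "%s:%s:%s" % (host, container, mode)])
    # Working directory inside the container, network mode, image, and
    # finally the generated wrapper script to execute.
    parts.extend(["-w", workdir, "--net", net, image, command_script])
    return " ".join(parts)

if __name__ == "__main__":
    print(build_docker_run(
        "busybox:ubuntu-14.04",
        "/jwd/container.sh",
        [("/data", "/data", "ro"), ("/jwd", "/jwd", "rw")],
        "/jwd",
    ))
```

Seen this way, the rw-vs-ro question in the thread is just the third field of each -v triple.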
Re: [galaxy-dev] concatenate_datasets docker image can't create output
See attached for screenshot! Thanks for the links.

On Mon, Jan 26, 2015 at 11:37 AM, John Chilton jmchil...@gmail.com wrote:
[snip]
[galaxy-dev] Galaxy XML from python's argparse
Howdy devs,

I put together a library recently, and since it seems to be functional I thought I'd share it with the rest of -dev in case it's of interest to anyone. If anyone has feedback, bugs/issues, or PRs, I'd be happy to receive them!

https://github.com/erasche/gxargparse

So, what is it? gxargparse is a drop-in replacement for argparse which can generate Galaxy Tool XML on demand. When I say drop-in replacement, I mean it. Through some python magic, as soon as you `pip install gxargparse`, your argparse will (maybe*) be transparently wrapped by gxargparse, and you'll have the --generate_galaxy_xml flag available, which will generate Galaxy Tool XML. This means *no code changes required*, and free tool XML generation.

However, beware that it is free tool XML; you will likely need to make some manual corrections to it before publishing tools (repeat labels, for instance). Still, if you're converting an argparse tool with hundreds of arguments for use in Galaxy, this could save you a lot of initial manual work.

* I say *maybe* because it depends a bit on your python module load order, which is something completely outside of my control. The package comes with a command line tool (https://github.com/erasche/gxargparse#it-doesnt-work) which spits out a path you can stick in PYTHONPATH to fix this issue.

Where to get it? Now available on PyPI (https://pypi.python.org/pypi/gxargparse) and GitHub (https://github.com/erasche/gxargparse). I *strongly* recommend against installing it system-wide, as any bugs in it could render all argparse-based python tools broken on your system. It's much more reasonable to use it in a virtualenv.
Known Problems:
- argument_groups are not dealt with specially
- prefix_chars and other lesser-used features are not (yet) supported
- anything with a repeat is a bit of a hack - no translation from argparse to conditionals yet figured out

Bug reports/suggestions are welcome: https://github.com/erasche/gxargparse/issues/

Cheers,
Eric

--
Eric Rasche
Programmer II
Center for Phage Technology
Rm 312A, BioBio
Texas A&M University
College Station, TX 77843
404-692-2048
e...@tamu.edu
rasche.e...@yandex.ru
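[Editorial aside: the "no code changes required" claim is the whole point, so a concrete example helps. The script below is an ordinary, hypothetical argparse CLI with nothing gxargparse-specific in it; the idea is that with gxargparse installed and first on the import path, running it with --generate_galaxy_xml would emit tool XML instead of running the tool.]

```python
import argparse

def build_parser():
    # A perfectly ordinary argparse interface - exactly what gxargparse
    # introspects to produce <param> tags. All names here are invented.
    parser = argparse.ArgumentParser(description="Count lines in a tabular file")
    parser.add_argument("input", help="input tabular file")
    parser.add_argument("--skip-header", action="store_true",
                        help="ignore the first line")
    return parser

def count_lines(lines, skip_header=False):
    # Core logic kept separate from argument parsing for easy testing.
    n = len(lines)
    return n - 1 if skip_header and n > 0 else n

if __name__ == "__main__":
    args = build_parser().parse_args()
    with open(args.input) as handle:
        print(count_lines(handle.readlines(), args.skip_header))
```

With plain argparse this behaves as usual; the extra flag only appears when the gxargparse wrapper is the argparse module actually imported.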
Re: [galaxy-dev] CloudMan + Ansible + AWS
Thanks for the pointer, this is great stuff and I had not seen it before. I've wanted to do something similar for a while now, and it's becoming even more important as we loop in GCE. I'll definitely read through and play with this and see what we might be able to use, contribute back to, and so on.

-Dannon

On Mon, Jan 26, 2015 at 12:28 PM, Brad Chapman chapm...@fastmail.com wrote:

Enis, John and all;

I spotted ansible-cloudman on GitHub today, which reminded me I've been meaning to write about the approach we set up late last year to run bcbio on AWS. It uses elasticluster (https://github.com/gc3-uzh-ch/elasticluster), which has the advantage of being all Ansible scripts and bootstrapping from standard images -- so no more making AMIs. It also uses SLURM instead of SGE, which is a nice change.

We wrote an interface that automates all of the stuff you need to set up on AWS: IAM users, VPCs and what not. It is a pretty streamlined process from the command line, including specifying the cluster size and stopping/starting it: https://bcbio-nextgen.readthedocs.org/en/latest/contents/cloud.html#aws-setup

I also wrote up some benchmarking work to give an idea of using it in practice: http://bcb.io/2014/12/19/awsbench/

All of the code is here: https://github.com/chapmanb/bcbio-nextgen-vm

As always, happy to overlap/share with whatever y'all decide to do. We could make bcbio-specific stuff optional as needed, although it is pretty lightweight -- just the driver scripts and a Docker image of bcbio. It's basically a ready-to-use cluster with this little extra added, so hopefully it could be useful for future plans with CloudMan.

Hope this is useful,
Brad