Re: [galaxy-dev] Examples of Galaxy tools in the toolsheds that install and run JAR files properly?
Hi Melissa,

Just commenting on point 1 of your email: did you try the “Reset metadata” option? See screenshot below: [image: image001.jpg]

For points 2 and 3: I normally look at the status (Installed/Green) and test whether the correct version appears in the menu for the users. I tend to ignore other messages or temporary(?) hiccups in the installation process web page.

Regards,
Pieter.

From: melissa.s.cl...@gmail.com [mailto:melissa.s.cl...@gmail.com] On Behalf Of Melissa Cline
Sent: Wednesday, 3 September 2014 3:23
To: Dave Bouvier
Cc: Lukasse, Pieter; Peter Cock; Galaxy Dev
Subject: Re: [galaxy-dev] Examples of Galaxy tools in the toolsheds that install and run JAR files properly?

Peter, Pieter and Dave, thank you for the pointers to your tools - they've been extremely helpful! Now I can see how the process is supposed to work. I'm not really sure where mine is going wrong, but maybe someone here will have ideas.

So folks, I've had partial success with a repository that includes a tool_dependencies.xml, which sets two environment variables and moves a JAR file from REPOSITORY_INSTALL_DIR to INSTALL_DIR. But the following things suggest to me that it's only a partial success, and I'm very interested in any insights to clear them up.

1. I have my repository checked into an internal tool shed. Somewhere in the course of development, I specified a buggy dependency, and now I can't seem to clear it up. When I go to (re)install my tool from the tool shed, here's what I see in the way of dependencies:

Tool dependencies - these dependencies may not be required by tools in this repository

Name      Version  Type             Orphan
JAR_PAHT           set_environment  yes

and as you might imagine, I've checked the obvious things, and there is no more reference to JAR_PAHT (typos and all) anywhere in or around my tool, or in my galaxy-dist directory. It seems like there's some old metadata cruft that hasn't been cleared out. How can I clear it out?

2. When I install my tool, I see the following messages in paster.log:

---
tool_shed.galaxy_install.repository_dependencies.repository_dependency_manager DEBUG 2014-09-02 17:13:59,279 Building repository dependency relationships...
172.30.0.22 - - [02/Sep/2014:17:13:59 -0700] POST /admin_toolshed/prepare_for_install HTTP/1.1 200 - http://tcga1:1235/admin_toolshed/prepare_for_install?tool_shed_url=http://medbook.ucsc.edu:9009/repository_ids=decab5ee1e95b10bchangeset_revisions=ff9b02e50bcf; Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36
172.30.0.22 - - [02/Sep/2014:17:14:02 -0700] POST /admin_toolshed/repository_installation_status_updates HTTP/1.1 200 - http://tcga1:1235/admin_toolshed/prepare_for_install; Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36
tool_shed.util.shed_util_common DEBUG 2014-09-02 17:14:03,477 Error attempting to get tool shed status for installed repository start_xena: HTTP Error 404: Not Found
Attempting older 'check_for_updates' method.
---

Should I be concerned about these last messages? I don't remember seeing them with Peter, Pieter and Dave's tools.

3. After my tool has installed, there is no INSTALLATION.log in my INSTALL_DIR. Does this mean that the installation process somehow terminated early? There is an env.sh file, with the correct values of the environment variables I'm setting in my tool_dependencies.xml, and my jar file is copied to INSTALL_DIR. In my Galaxy window, my tool is indicated as Installed, in green. Here are the last messages I see in paster.log:

tool_shed.galaxy_install.install_manager DEBUG 2014-09-02 17:14:04,630 Changing status for tool dependency installXena from Installing to Installed.
tool_shed.galaxy_install.install_manager DEBUG 2014-09-02 17:14:04,669 Tool dependency installXena version 1.0 has been installed in /inside/home/cline/src/galaxy-dist/tool_dependencies/installXena/1.0/melissacline/start_xena/3e4683c94d8d.
172.30.0.22 - - [02/Sep/2014:17:13:59 -0700] POST /admin_toolshed/manage_repositories HTTP/1.1 302 - http://tcga1:1235/admin_toolshed/prepare_for_install; Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36
172.30.0.22 - - [02/Sep/2014:17:14:04 -0700] GET /admin_toolshed/monitor_repository_installation?tool_shed_repository_ids=3f5830403180d620 HTTP/1.1 200 - http://tcga1:1235/admin_toolshed/prepare_for_install; Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36
172.30.0.22 - - [02/Sep/2014:17:14:05 -0700] POST /admin_toolshed/repository_installation_status_updates HTTP/1.1 200 - http://tcga1:1235/admin_toolshed/prepare_for_install; Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143
[galaxy-dev] Startup error after restoring Galaxy DB from backup
I have tried to restart Galaxy after restoring my database from a backup. Here is the error message I get in the log file. Any idea what is wrong and how to fix this problem?

--
galaxy.jobs DEBUG 2014-09-03 08:33:46,367 Loading job configuration from /export/users/galaxy/galaxy-test/universe_wsgi.ini
galaxy.jobs DEBUG 2014-09-03 08:33:46,367 Done loading job configuration
Traceback (most recent call last):
  File /export/users/galaxy/galaxy-test/lib/galaxy/webapps/galaxy/buildapp.py, line 35, in app_factory
    app = UniverseApplication( global_conf = global_conf, **kwargs )
  File /export/users/galaxy/galaxy-test/lib/galaxy/app.py, line 102, in __init__
    self.toolbox = tools.ToolBox( tool_configs, self.config.tool_path, self )
  File /export/users/galaxy/galaxy-test/lib/galaxy/tools/__init__.py, line 118, in __init__
    self.load_integrated_tool_panel_keys()
  File /export/users/galaxy/galaxy-test/lib/galaxy/tools/__init__.py, line 283, in load_integrated_tool_panel_keys
    tree = parse_xml( self.integrated_tool_panel_config )
  File /export/users/galaxy/galaxy-test/lib/galaxy/util/__init__.py, line 132, in parse_xml
    tree = ElementTree.parse(fname)
  File /export/users/galaxy/galaxy-test/eggs/elementtree-1.2.6_20050316-py2.6.egg/elementtree/ElementTree.py, line 859, in parse
    tree.parse(source, parser)
  File /export/users/galaxy/galaxy-test/eggs/elementtree-1.2.6_20050316-py2.6.egg/elementtree/ElementTree.py, line 583, in parse
    parser.feed(data)
  File /export/users/galaxy/galaxy-test/eggs/elementtree-1.2.6_20050316-py2.6.egg/elementtree/ElementTree.py, line 1242, in feed
    self._parser.Parse(data, 0)
ExpatError: not well-formed (invalid token): line 117, column 1
Removing PID file /var/run/paster.pid
--

Thanks,
Graeme

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
___
Please keep all replies on the list by using reply all in your mail client.
To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Re: [galaxy-dev] directory as an input file
Collections are one potential answer for how users can specify the set of stuff that belongs in the directory. For explicitly dealing with applications that consume directories - I think it is best to just create the directory and link in files (if possible) before the tool runs.

    <command>
    mkdir input_dir;
    #for $i, $input_file in enumerate($input_files)#
    ln -s $input_file input_dir/$i;
    #end for
    my_application input_dir
    </command>
    <inputs>
        <param name="input_files" type="data" format="bam" multiple="true" />
        ...

You can also do this sort of thing in a wrapper. If needed you can build more interesting command-lines this way that add extensions, use names, etc. peptideshaker is an example of a fairly complex tool that uses an idiom like this and doesn't resort to a helper wrapper (https://toolshed.g2.bx.psu.edu/repository/browse_repository?id=13a5bad5c984db6f#).

I am currently working on support for tools that actually produce collections of files this way - I think I will probably land up adding some high-level utilities for doing stuff like this for those scenarios. But if your tool just produces a couple of files and consumes a directory - no need to necessarily resort to collections (as the tool author - your users will probably want to if they want to use these tools in workflows).

-John

On Tue, Sep 2, 2014 at 9:51 PM, Peter Cock p.j.a.c...@googlemail.com wrote:

You might be able to do this by accepting a collection of SAM/BAM files as input instead. This is quite a new feature in Galaxy, see: https://wiki.galaxyproject.org/News/2014_06_02_Galaxy_Distribution

Peter

On Wed, Sep 3, 2014 at 10:00 AM, Philippe Moncuquet philippe.m...@gmail.com wrote:

Hi, I am trying to write a wrapper for a tool that takes a directory containing SAM/BAM files as an input. I am not sure how to do that; is there another tool that implements this that I can have a look at? Any suggestions would be greatly appreciated.
Regards, Philip
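For the "do this sort of thing in a wrapper" route mentioned in this thread, here is a minimal Python sketch of the same idiom: create a directory, symlink each input into it under an indexed name, then hand the directory to the application. The function names, the `.bam` extension default, and the `my_application` placeholder are assumptions made for illustration; none of this is taken from an existing Galaxy tool.

```python
import os
import subprocess


def stage_inputs(input_paths, input_dir, extension=".bam"):
    """Symlink each input into input_dir as 0.bam, 1.bam, ...; return the link paths."""
    os.makedirs(input_dir, exist_ok=True)
    links = []
    for index, path in enumerate(input_paths):
        link = os.path.join(input_dir, "%d%s" % (index, extension))
        # Link rather than copy, so large SAM/BAM files are not duplicated.
        os.symlink(os.path.abspath(path), link)
        links.append(link)
    return links


def run_on_directory(application, input_paths, input_dir="input_dir"):
    """Stage the inputs, then call `application input_dir` (hypothetical CLI shape)."""
    stage_inputs(input_paths, input_dir)
    return subprocess.call([application, input_dir])
```

A wrapper script would call something like `run_on_directory("my_application", sys.argv[1:])` from the tool's command line. Adding the extension to the link name is one of the "more interesting" variations hinted at above, useful when the application picks files by suffix.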
Re: [galaxy-dev] Startup error after restoring Galaxy DB from backup
Hi Graeme,

It looks like your integrated_tool_panel.xml file has been corrupted. You can move/remove this file and it will be recreated the next time Galaxy is started up.

Thanks for using Galaxy,
Dan

On Sep 3, 2014, at 3:54 AM, Graeme Grimes graeme.gri...@igmm.ed.ac.uk wrote:

I have tried to restart Galaxy after restoring my database from a backup. Here is the error message I get in the log file. Any idea what is wrong and how to fix this problem?

--
galaxy.jobs DEBUG 2014-09-03 08:33:46,367 Loading job configuration from /export/users/galaxy/galaxy-test/universe_wsgi.ini
galaxy.jobs DEBUG 2014-09-03 08:33:46,367 Done loading job configuration
Traceback (most recent call last):
  File /export/users/galaxy/galaxy-test/lib/galaxy/webapps/galaxy/buildapp.py, line 35, in app_factory
    app = UniverseApplication( global_conf = global_conf, **kwargs )
  File /export/users/galaxy/galaxy-test/lib/galaxy/app.py, line 102, in __init__
    self.toolbox = tools.ToolBox( tool_configs, self.config.tool_path, self )
  File /export/users/galaxy/galaxy-test/lib/galaxy/tools/__init__.py, line 118, in __init__
    self.load_integrated_tool_panel_keys()
  File /export/users/galaxy/galaxy-test/lib/galaxy/tools/__init__.py, line 283, in load_integrated_tool_panel_keys
    tree = parse_xml( self.integrated_tool_panel_config )
  File /export/users/galaxy/galaxy-test/lib/galaxy/util/__init__.py, line 132, in parse_xml
    tree = ElementTree.parse(fname)
  File /export/users/galaxy/galaxy-test/eggs/elementtree-1.2.6_20050316-py2.6.egg/elementtree/ElementTree.py, line 859, in parse
    tree.parse(source, parser)
  File /export/users/galaxy/galaxy-test/eggs/elementtree-1.2.6_20050316-py2.6.egg/elementtree/ElementTree.py, line 583, in parse
    parser.feed(data)
  File /export/users/galaxy/galaxy-test/eggs/elementtree-1.2.6_20050316-py2.6.egg/elementtree/ElementTree.py, line 1242, in feed
    self._parser.Parse(data, 0)
ExpatError: not well-formed (invalid token): line 117, column 1
Removing PID file /var/run/paster.pid
--

Thanks,
Graeme
-- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
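Before moving integrated_tool_panel.xml out of the way as suggested in this thread, it can help to confirm that the file really is the one that fails to parse. The following is a small sketch using Python's standard library parser, not Galaxy's own code; the helper names and the `.corrupt` suffix are assumptions chosen for illustration.

```python
import shutil
import xml.etree.ElementTree as ElementTree


def find_xml_error(path):
    """Return None if the file parses as XML, else the parser's error message."""
    try:
        ElementTree.parse(path)
    except ElementTree.ParseError as error:
        # The message has the same shape as the one in the traceback,
        # e.g. "not well-formed (invalid token): line N, column M".
        return str(error)
    return None


def move_aside_if_corrupt(path, suffix=".corrupt"):
    """Rename a malformed XML file to path + suffix; return the new path, or None."""
    if find_xml_error(path) is None:
        return None
    backup = path + suffix
    shutil.move(path, backup)
    return backup
```

Renaming rather than deleting keeps the damaged file around for inspection; per the reply above, Galaxy recreates integrated_tool_panel.xml on the next startup.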
Re: [galaxy-dev] Concept for a Galaxy Versioned Fasta Data Retrieval Tool
Hi,

There have been a few comments about how general we could make the system for Galaxy use or just as a stand-alone command line driven tool. So some notes below about what I could see it taking on. Given the scale of the sequencing data problem, I'm sure the Galaxy community has important feedback on this.

I looked at git annex and it appears to me that though it promises to keep track of and synchronize network located files, it doesn't do versioning on them - am I wrong about that? I also looked at https://code.google.com/p/leveldb/ , also a key value database which relies more heavily on indexes - but I see that though this is well-tuned to answering key queries, it isn't particularly good at storing and retrieving entire versions of a database that could be many gigabytes long, which is our mission.

It is relatively easy to generalize the simple keydb prototype I wrote so that it can handle any key-value database - including binary content and even binary key data, not just text (fasta sequences). So a name change for the tool is a good idea.

I want a versioning system that doesn't assume the incoming master file of key-value pairs is in the same order as it was on a previous import run. I was afraid that any arbitrary change in the order of content on the source server could completely destroy the efficiency of a differential approach. Git assumes its content is like a document - so if the fasta entries are rearranged it generates a slew of inserts and deletes and in fact provides no benefit. I tested helping git overcome this hurdle by converting the fasta content to 1 line key/value fasta entries, and sorting them before git processing. That seemed to work for some smaller and larger nucleotide fasta files (tested 10m to 2gb) but failed when it came to processing protein fasta files; though possibly that was because of the fasta data line length. That became another concern - thinking that git was failing because each line of the input file was many thousands of characters long. So having done a keydb versioning engine that works and performs as well as git, I am definitely shying away from git now as unreliable on certain kinds of data.

The keydb approach is able to generate a version file at about the same speed that it takes to read the latest version of the same db, i.e. at 50mb/s on a standard hard drive. An extension to keydb that enables it to take in just a list of adds or deletes or updates is desirable but that can come later. More efficiency can be had by fine-tuning the updates so that one whole line of key-value doesn't have to replace the previous one but that's for later too.

A note on generalization: the keydb approach works where the keys are a sparse array. There's nothing stopping the keys from representing a 2D or 3D sparse array of data as long as the coordinates are coded uniquely into the one key list.

For those interested in versioning XML data there is an interesting summary of the challenges here: http://useless-factor.blogspot.ca/2008/01/matching-diffing-and-merging-xml.html . It leaves me thinking that quick versioning of xml data could only be accomplished if it could somehow be converted into a key-value db, i.e. with each top level xml record identified by a unique key.

I could see breaking larger keydb databases up into smaller chunks for data retrieval and fast parallel processing - the usual approach being to separate the sorted key-value db out into files based on the first character or two in the key of each record.

Does this go along with people's expectations?
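To make the order-independence point concrete, here is a minimal Python sketch of a key-based diff between two versions of a fasta file. It is not the keydb prototype described above; the function names are hypothetical, and it holds everything in memory, so it only illustrates the idea rather than the multi-gigabyte case.

```python
def read_fasta(lines):
    """Collapse multi-line fasta records into a {header: sequence} dict."""
    parts = {}
    key = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            key = line[1:]
            parts[key] = []
        elif key is not None:
            parts[key].append(line)
    return {k: "".join(v) for k, v in parts.items()}


def diff_versions(old, new):
    """Return (added, deleted, updated); the order of entries in either input is irrelevant."""
    added = {k: new[k] for k in new if k not in old}
    deleted = sorted(k for k in old if k not in new)
    updated = {k: new[k] for k in new if k in old and old[k] != new[k]}
    return added, deleted, updated


def apply_diff(old, added, deleted, updated):
    """Rebuild the newer version from the older one plus the three change sets."""
    removed = set(deleted)
    result = {k: v for k, v in old.items() if k not in removed}
    result.update(added)
    result.update(updated)
    return result
```

Because records are matched by key rather than by position, a source file whose entries were merely rearranged yields three empty change sets, which is the property a line-oriented diff lacks.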
Cheers,
Damion

From: Björn Grüning [bjoern.gruen...@gmail.com]
Sent: Monday, September 01, 2014 12:47 PM
To: Dooley, Damion; Björn Grüning; galaxy-dev@lists.bx.psu.edu
Cc: Hsiao, William
Subject: Re: [galaxy-dev] Concept for a Galaxy Versioned Fasta Data Retrieval Tool

On 25.08.2014 at 18:05, Dooley, Damion wrote:

Ok, I'll be very happy to see what you've accomplished there. I will read through what you've done when I return from vacation in a week! A key need is to have whatever data comes in show up as linked data in one's history to avoid server overhead; a second objective was to not need to modify existing workflows - as long as they could work off data in history that is typed appropriately. So your 'select type' solution sounds intriguing!

And certainly interested in your use of git - I tried using git, using a 1-line fasta data format, but git seemed to choke on protein fasta files? And did it run into performance problems with larger files? That was my experience. I think I read its authors say that its upper limit was 15gb.

This is probably true for one large file. I have been storing the entire PDB in git for a few years. One entry one file and it works fine. Do you know git annex? https://git-annex.branchable.com/

That was the motivation for writing a simple