Re: [galaxy-dev] Defining $GALAXY_SLOTS for use in tool wrappers
On Saturday, 12.10.2013 at 19:42 +0100, Peter Cock wrote:

On Thu, Aug 1, 2013 at 10:27 AM, Nicola Soranzo sora...@crs4.it wrote:

On 2013-07-30 17:18, Peter Cock wrote:

Hello all,

Re:
http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-June/010153.html
http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-October/011557.html

Something I raised during GCC2013, and which we also talked about via Twitter, was a Galaxy environment variable for use within tool wrappers that sets the number of threads/CPUs to use. The idea is that you can configure a default value, and then override this per runner or per tool etc.

Thanks Peter for pushing this idea, I totally support this proposal. In the meantime, I've been using for my tools the solution by Jim Johnson for his CD-HIT wrapper: http://toolshed.g2.bx.psu.edu/view/jjohnson/cdhit
But this requires the system administrator to modify both the tool env.sh and job_conf.xml and keep them in sync.

Is there an open Trello card for this?

A Trello card would be useful indeed.

Nicola

Better than a Trello card, we now have a pull request from John: https://bitbucket.org/galaxy/galaxy-central/pull-request/236/job-runner-enhancements-galaxy_slots/diff

And thanks to John it's merged! Time for testing and migrating our tools :)

Peter

___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
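As a rough illustration of how a wrapper script might consume the variable discussed in this thread, here is a minimal Python sketch. The helper name `galaxy_slots` and the fallback of 1 thread are assumptions made for this example; the thread only establishes that Galaxy exports $GALAXY_SLOTS to the job environment.

```python
import os


def galaxy_slots(default=1):
    """Return the thread count from $GALAXY_SLOTS, or `default` if it is unset or invalid.

    The helper name and the default of 1 are illustrative assumptions.
    """
    raw = os.environ.get("GALAXY_SLOTS", "")
    try:
        slots = int(raw)
    except ValueError:
        # Unset (empty string) or not a number: fall back to the default.
        return default
    return slots if slots > 0 else default
```

A wrapper could then pass the result to the underlying tool's thread option (for example `-num_threads` for blastn), so that the administrator only has to configure the slot count in one place.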
Re: [galaxy-dev] Interact with running job?
Hej

Thanks for your input.

-- Jonas Hagberg
BILS - Bioinformatics Infrastructure for Life Sciences - http://bils.se
e-mail: jonas.hagb...@bils.se, jonas.hagb...@scilifelab.se
phone: +46-(0)70 6683869
address: SciLifeLab, Box 1031, 171 21 Solna, Sweden

On Wed, Oct 16, 2013 at 3:18 PM, John Chilton chil...@msi.umn.edu wrote:

Hello Jonas,

I don't believe this is currently doable in Galaxy and it may be difficult to add. I worry that some job/file/cluster configurations Galaxy could be in might not even be set up in such a way that intermediate files, standard output, etc. would be available to the Galaxy processes until the job is complete.

Nonetheless, if you have a fairly simple cluster configuration you could probably output intermediate files/logging files as part of your job and the Galaxy processes would be able to read them. The UI will show these files as running, but you may be able to hack up the Galaxy framework to provide additional logic and processing. If you do want to do this, I am not entirely sure where to start, but lib/galaxy/tools, lib/galaxy/jobs, templates/webapps/galaxy/history/, static/scripts/mvc/history, and static/scripts/mvc/dataset may be places to look.

Great, I guessed I needed to do some extra hacking on the Galaxy framework.

If you come up with some changes to Galaxy that allow supporting this use case or come up with specific recommendations for changes we can make to make Galaxy more amenable, please let us know.

Yes, I will give it some thought.

A more Galaxy-friendly approach might be to break up your job into several jobs/tools and string them together with a workflow. This would allow users to inspect intermediate results as the workflow is processed. I understand though that most jobs/tools cannot be broken up in this fashion.

Yes, it could be a possibility. I need to talk more to the application guy about this. But I think it is possible.

Sorry I could not be of more help,

Just the help I needed to get started. Many thanks!
-John

On Wed, Oct 16, 2013 at 7:40 AM, Jonas Hagberg jonas.hagb...@scilifelab.se wrote:

Hej

I am new to Galaxy. I am reading the documentation but could not really find an answer. I would like to create a tool that, once it is executing on a cluster, lets the user interact with the running job: get the current output, make a plot, and see how far the job has come, for example by clicking on a special status link on the running job in the history, or something like that. How would one do this in Galaxy? Is there any way to do it today with the tools in Galaxy? It would be great to get some first guidance on where to start.

cheers

-- Jonas Hagberg
BILS - Bioinformatics Infrastructure for Life Sciences - http://bils.se
e-mail: jonas.hagb...@bils.se, jonas.hagb...@scilifelab.se
phone: +46-(0)70 6683869
address: SciLifeLab, Box 1031, 171 21 Solna, Sweden
Re: [galaxy-dev] Deploying LOC files for tool built-in data during a tool installation
Hi Dan,

On Tue, Oct 15, 2013 at 7:40 PM, Daniel Blankenberg d...@bx.psu.edu wrote:

Hi all,

I think what we have are two similar, but somewhat separate problems:

1.) We need a way via the UI for an admin to be able to add additional configuration entries to data tables / .loc files.

For 1.), we now have Data Managers. A Data Manager will do all the heavy lifting of adding additional data table entries, e.g. for bwa, it can build the mapping indexes and add the properly delimited line to the .loc file. These are accessed through the admin interface, under Manage local data. Data Managers are installed from a ToolShed, or can be installed manually. In addition to direct interactive usage, Data Manager tools can be included in workflows or accessed via the tools API. Not only does the use of a Data Manager remove the technical burdens/concerns of adding new entries to a data table / .loc file, it also provides for the same reproducibility and provenance tracking that is afforded to regular Galaxy tools.

You said there that Data Managers can be used within a workflow. I don't quite follow - aren't the Data Managers restricted to administrators only?

If you don't mind me picking two specific examples of direct personal interest - which lead me to ask if there is a default Data Manager which just offers a web GUI for editing any *.loc file as a table?

-- Blast2GO - http://toolshed.g2.bx.psu.edu/view/peterjc/blast2go

This tool wrapper uses blast2go.loc which should list one or more Blast2GO *.properties files. These can in principle be used for advanced things like changing evidence weighting codes etc. However, the primary point is to point to different Blast2GO databases. There have been a series of (date stamped) public (free) Blast2GO databases, and my tool installation script already sets up the *.properties files for the most recent databases (which it uses for a unit test), which was your point 2 (below).
The local Galaxy administrator may need to add extra entries to the blast2go.loc file, for instance when there is a new public database release, or if they set up a local database (recommended). This seems to be an easy case (since there is little that we can automate). A simple interface for adding lines to the *.loc files would be enough, assuming it includes a file select browser.

-- BLAST+ - http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus/

This uses blastdb.loc (nucleotides), blastdb_p.loc (proteins) etc. A simple interface for adding lines to the *.loc files would be useful, although the oddities of BLAST database naming might need a little code on top of a plain file select browser (the database name is the file path stem without the *.nal, *.pal, etc. extension). Is there potential for offering to automatically create databases from the all_fasta data table you mention below?

The documentation for Data Managers is currently limited to the tutorial-style doc here: http://wiki.galaxyproject.org/Admin/Tools/DataManagers/HowTo/Define; a more formal / config syntax type of page will also be made available, although the tutorial is a pretty inclusive description of the steps needed to define a Data Manager.

Could I suggest you add that information (paraphrase what you just said in this email) to the main page: http://wiki.galaxyproject.org/Admin/Tools/DataManagers
I think that would help.

2.) We need a way to bootstrap/initialize a Galaxy installation with data table / .loc file entries ('built-in data') during installation for
a.) a 'production' Galaxy instance - this would include local dev/testing/etc instances
b.) automated testing framework - tests should run fast, but meaningfully test a tool, e.g., the horse mitochondrial genome could be a fine built-in genome for running automated tool tests, but not desired to be automatically installed into a production Galaxy instance

For 2.): bootstrapping data during an installation process is something that still needs to be more completely spec'd out and implemented. ...

OK, so the Data Manager work does not yet cover bootstrapping (installing data as part of tool installation from the tool shed etc).

Regarding 2(b), Greg and I talked about this earlier in the thread and I filed Trello Card 1165 on a related issue: https://trello.com/c/P90b5Pa0/1165-functional-tests-need-separate-loc-files-to-the-live-production-loc-files-e-g-loc-test

Thanks, Peter
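The BLAST database naming quirk mentioned above (the database name is the file path without the *.nal / *.pal extension) is the kind of "little code" a file browser front end would need. A minimal sketch follows; the function name `blast_db_name` is hypothetical, and only the two alias extensions named in the message are handled, since the message leaves the rest as "etc".

```python
import os

# Only the alias-file extensions named in the message; the "etc." is not filled in here.
ALIAS_EXTENSIONS = (".nal", ".pal")


def blast_db_name(path):
    """Return the BLAST database name for a selected file path (hypothetical helper).

    If the path ends in a known alias extension, strip it; otherwise
    return the path unchanged.
    """
    root, ext = os.path.splitext(path)
    return root if ext in ALIAS_EXTENSIONS else path
```

The returned value is what would go into the path column of blastdb.loc or blastdb_p.loc, rather than the name of any single file on disk.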
Re: [galaxy-dev] RFC: remove trailing semicolons from command line - Broken bowtie2_wrapper on some SGE systems
Hi John,

Is all your semi-colon fixing on the trunk? I've found another bug in this area (patch below), which is showing up with task splitting under SGE. e.g. this job is meant to be running and merging 5 BLAST XML files, stderr:

nothing to merge for /mnt/galaxy/galaxy-central/database/files/000/dataset_498.dat (expected 5 files)

/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_0:
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_0/galaxy_288_60.sh: line 13: syntax error near unexpected token `;'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_0/galaxy_288_60.sh: line 13: `/mnt/galaxy/galaxy-central/extract_dataset_parts.sh /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_0; blastn -query /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_0/dataset_495.dat -db /mnt/scratch/local/blast/ncbi/nt -task megablast -evalue 0.001 -out /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_0/dataset_498.dat -outfmt 5 -num_threads 8 -dust yes -strand both -max_target_seqs 3; return_code=$?; cd /mnt/galaxy/galaxy-central; ; sh -c exit $return_code'

/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_1:
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_1/galaxy_288_61.sh: line 13: syntax error near unexpected token `;'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_1/galaxy_288_61.sh: line 13: `/mnt/galaxy/galaxy-central/extract_dataset_parts.sh /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_1; blastn -query /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_1/dataset_495.dat -db /mnt/scratch/local/blast/ncbi/nt -task megablast -evalue 0.001 -out /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_1/dataset_498.dat -outfmt 5 -num_threads 8 -dust yes -strand both -max_target_seqs 3; return_code=$?; cd /mnt/galaxy/galaxy-central; ; sh -c exit $return_code'

/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_2:
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_2/galaxy_288_62.sh: line 13: syntax error near unexpected token `;'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_2/galaxy_288_62.sh: line 13: `/mnt/galaxy/galaxy-central/extract_dataset_parts.sh /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_2; blastn -query /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_2/dataset_495.dat -db /mnt/scratch/local/blast/ncbi/nt -task megablast -evalue 0.001 -out /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_2/dataset_498.dat -outfmt 5 -num_threads 8 -dust yes -strand both -max_target_seqs 3; return_code=$?; cd /mnt/galaxy/galaxy-central; ; sh -c exit $return_code'

/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_3:
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_3/galaxy_288_63.sh: line 13: syntax error near unexpected token `;'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_3/galaxy_288_63.sh: line 13: `/mnt/galaxy/galaxy-central/extract_dataset_parts.sh /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_3; blastn -query /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_3/dataset_495.dat -db /mnt/scratch/local/blast/ncbi/nt -task megablast -evalue 0.001 -out /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_3/dataset_498.dat -outfmt 5 -num_threads 8 -dust yes -strand both -max_target_seqs 3; return_code=$?; cd /mnt/galaxy/galaxy-central; ; sh -c exit $return_code'

/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_4:
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_4/galaxy_288_64.sh: line 13: syntax error near unexpected token `;'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_4/galaxy_288_64.sh: line 13: `/mnt/galaxy/galaxy-central/extract_dataset_parts.sh /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_4; blastn -query /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_4/dataset_495.dat -db /mnt/scratch/local/blast/ncbi/nt -task megablast -evalue 0.001 -out /mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_4/dataset_498.dat -outfmt 5 -num_threads 8 -dust yes -strand both -max_target_seqs 3; return_code=$?; cd /mnt/galaxy/galaxy-central; ; sh -c exit $return_code'

Note the repeated semi-colon, which causes the child jobs to fail (patch below). There is however a second bug, the merge fails yet the history entry is still green (success). The merge method is raising a ValueError yet it is being ignored.

Regards, Peter

$ hg branch
default
$ hg tip
changeset: 12028:8e001dc9675c
tag: tip
user: John Chilton jmchil...@gmail.com
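The `; ;` in the generated script line above is what you get when an empty fragment is joined into a command string with "; ". The sketch below shows one defensive way to build such a line. It is not the patch referred to in this message (which is not reproduced here) and not Galaxy's actual code; the function name `join_commands` is hypothetical.

```python
def join_commands(parts):
    """Join shell command fragments with '; ', skipping empty fragments.

    Illustrative only: trailing semicolons and surrounding whitespace are
    stripped from each fragment, so an empty or ';'-only fragment cannot
    produce the '; ;' sequence that the shell rejects. Fragments that
    legitimately end in ';;' (case statements) are not handled.
    """
    cleaned = [p.strip().rstrip(";").strip() for p in parts]
    return "; ".join(p for p in cleaned if p)
```

For instance, `join_commands(["cd /mnt/galaxy/galaxy-central", "", "sh -c 'exit $return_code'"])` yields a line with a single separator between the two real commands.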
Re: [galaxy-dev] RFC: remove trailing semicolons from command line - Broken bowtie2_wrapper on some SGE systems
On Thu, Oct 17, 2013 at 10:53 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

Hi John,

Is all your semi-colon fixing on the trunk? I've found another bug in this area (patch below), which is showing up with task splitting under SGE. ...

Note the repeated semi-colon, which causes the child jobs to fail (patch below). There is however a second bug, the merge fails yet the history entry is still green (success). The merge method is raising a ValueError yet it is being ignored.

That bug in detecting failures may be a symptom of something more general. I've just had a split-BLAST job return this as stderr, yet the history entry is green:

Job output not returned from cluster

Peter
Re: [galaxy-dev] RFC: remove trailing semicolons from command line - Broken bowtie2_wrapper on some SGE systems
On Thu, Oct 17, 2013 at 11:36 AM, John Chilton chil...@msi.umn.edu wrote:

I broke the TaskWrapper last week with my exit code handling fix, the double semi-colon thing you are seeing there. Your fix would break non-task split jobs so that is probably the problem(?) Hopefully? Do you want to revert 8e001dc9675c and pull in the changeset I just pushed out? Otherwise, I will test out task splitting later today. I am very sorry.

-John

Thanks John,

Those of us running with galaxy-central expect minor breakage from time to time - I do this to avoid more pain if the problems were not spotted by the community and reached galaxy-dist and thus our production Galaxy instance. (And with the job splitting not being enabled by default, I am aware that I am in a relatively small group of Galaxy admins using it.)

I don't think my fix hurts non-task split jobs, but I will now try your fix on the default branch: https://bitbucket.org/galaxy/galaxy-central/commits/329ea7a83af4f389a7c95ee4559d88c6fec0211b

This appears to also address a metadata issue, which if I am lucky may be the fix for this issue?: http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-October/017031.html

Thanks, Peter
Re: [galaxy-dev] Tabular data not displaying in main panel
Hello everyone,

any news on this bug? I have the same issue here, with the error

TypeError: column_types is undefined @ http://galaxy.immunbio.mpg.de/static/scripts/mvc/data.js:188

The suggested fix works for me, but the resulting formatting isn't nice.

Best regards, Sarah

- Original Message -
From: Ian Misner imis...@umd.edu
To: galaxy-dev@lists.bx.psu.edu
Sent: Friday, October 4, 2013 1:17:54 PM
Subject: Re: [galaxy-dev] Tabular data not displaying in main panel

Hello All,

I'm having the same issue with tabular data not displaying, but I'm afraid I'm much newer to running a local Galaxy instance. Apparently switching to https://bitbucket.org/galaxy/galaxy-central/src/a477486bf18eafdd14dd7ba1e91e17f1b05e8121/scripts/functional_tests.py?at=stable but frankly I don't know how to do that. Any help would be appreciated. Here is my current branch information:

changeset: 10421:a477486bf18e
branch: stable
tag: tip
user: Nate Coraor n...@bx.psu.edu
date: Thu Sep 26 11:02:58 2013 -0400
summary: Bugfix for tool-to-destination mapping, tool ids are lowercased but the mapping id was not lowercased.

changeset: 10411:c42567f43aa7
user: greg
date: Mon Aug 19 13:19:56 2013 -0400
summary: Filter invalid objects when generating the list of repository_dependencies objects that are associated with a tool shed repository installed into Galaxy.

Cheers
Ian
Re: [galaxy-dev] Tabular data not displaying in main panel
On Thu, Oct 17, 2013 at 1:01 PM, Sarah Diehl di...@ie-freiburg.mpg.de wrote:

Hello everyone,

any news on this bug? I have the same issue here, with the error

TypeError: column_types is undefined @ http://galaxy.immunbio.mpg.de/static/scripts/mvc/data.js:188

The suggested fix works for me, but the resulting formatting isn't nice.

Best regards, Sarah

No news about the root cause of the bug (the change to data.js just tackles the symptoms, with the downside of messing up the column alignment). I filed an issue on Trello: https://trello.com/c/it0oXXeT/1190-tabular-data-not-displaying-in-main-panel-data-js-error

Peter
Re: [galaxy-dev] Problem configuring Galaxy with an Apache proxy
Hello Erwan,

This issue is caused by a missing trailing slash in your proxy definition. That's why the root page will load, but nothing requiring any more depth than that.

# Explanation

(For example, I run mine in a subdirectory)

localhost:8080/galaxy

The root page makes the request to localhost:8080/galaxy and that succeeds. All sub-pages (e.g., style sheets, etc.) make requests that look like localhost:8080/galaxystatic/welcome.html and fail. Just add your missing slash and you'll be fine.

In the future, I find setting LogLevel Debug in my Apache conf helpful for this sort of thing, and watching the logs as I make HTTP requests to Galaxy.

Cheers, Eric

On 10/17/2013 08:39 AM, Erwan Delage wrote:

Hello everyone,

I'm having trouble setting up Galaxy with an Apache proxy. I edited the Apache conf file with the following lines:

RewriteEngine on
RewriteRule ^/static/style/(.*) /home/nate/galaxy-dist/static/june_2007_style/blue/$1 [L]
RewriteRule ^/static/scripts/(.*) /home/nate/galaxy-dist/static/scripts/packed/$1 [L]
RewriteRule ^/static/(.*) /home/nate/galaxy-dist/static/$1 [L]
RewriteRule ^/favicon.ico /home/nate/galaxy-dist/static/favicon.ico [L]
RewriteRule ^/robots.txt /home/nate/galaxy-dist/static/robots.txt [L]
RewriteRule ^(.*) http://localhost:8080$1 [P]

The redirection seems to work, as I reach a web page that contains some of Galaxy's links but, as you can see in the attached snapshot, the web page does not really look like the usual Galaxy welcome page :) Have you already faced that problem, or do you have any idea how to solve it?

Thanks, Erwan

-- Eric Rasche
Programmer II
Center for Phage Technology
Texas A&M University
College Station, TX 77843
404-692-2048
e...@tamu.edu
rasche.e...@yandex.ru
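To see why a missing trailing slash produces URLs like localhost:8080/galaxystatic/welcome.html, it helps to model prefix-based proxy mapping as plain string substitution. The sketch below is a simplification for illustration, not Apache's implementation; the function name `map_through_proxy` and the prefix/target values are hypothetical, since the thread does not show the exact directive that was missing the slash.

```python
def map_through_proxy(request_path, prefix, target):
    """Swap a matched URL prefix for the backend target (simplified model).

    Returns None when the request does not fall under `prefix`.
    """
    if not request_path.startswith(prefix):
        return None
    # Whatever follows the prefix is appended to the target verbatim,
    # so the target must end in '/' whenever the prefix does.
    return target + request_path[len(prefix):]


# Target without a trailing slash: the path segments run together.
broken = map_through_proxy("/galaxy/static/welcome.html", "/galaxy/",
                           "http://localhost:8080/galaxy")

# Target with the trailing slash: the path is preserved.
fixed = map_through_proxy("/galaxy/static/welcome.html", "/galaxy/",
                          "http://localhost:8080/galaxy/")
```

Under this model `broken` is the "galaxystatic" style of URL described above, while `fixed` keeps the slash between "galaxy" and "static".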
Re: [galaxy-dev] Deploying LOC files for tool built-in data during a tool installation
Hi Peter,

Please see replies inline, below.

Thanks, Dan

On Oct 17, 2013, at 5:36 AM, Peter Cock wrote:

Hi Dan,

On Tue, Oct 15, 2013 at 7:40 PM, Daniel Blankenberg d...@bx.psu.edu wrote:

Hi all,

I think what we have are two similar, but somewhat separate problems:

1.) We need a way via the UI for an admin to be able to add additional configuration entries to data tables / .loc files.

For 1.), we now have Data Managers. A Data Manager will do all the heavy lifting of adding additional data table entries, e.g. for bwa, it can build the mapping indexes and add the properly delimited line to the .loc file. These are accessed through the admin interface, under Manage local data. Data Managers are installed from a ToolShed, or can be installed manually. In addition to direct interactive usage, Data Manager tools can be included in workflows or accessed via the tools API. Not only does the use of a Data Manager remove the technical burdens/concerns of adding new entries to a data table / .loc file, it also provides for the same reproducibility and provenance tracking that is afforded to regular Galaxy tools.

You said there that Data Managers can be used within a workflow. I don't quite follow - aren't the Data Managers restricted to administrators only?

This is correct. Admins can run workflows containing Data Managers, while standard users cannot. Additionally, the selection list for any installed Data Managers will only appear within the workflow editor for an admin.

If you don't mind me picking two specific examples of direct personal interest - which lead me to ask if there is a default Data Manager which just offers a web GUI for editing any *.loc file as a table?

Something like this for adding entries could be done now, although currently existing entries cannot be modified or removed by using Data Managers. There is not currently a generic Data Manager written that will do this though.
On my list of things to do is to write a Data Manager that would generically make use of our datacache rsync server, but there is not an ETA for this. Another one, or the same one, could also make use of S3, which would be particularly useful for Cloud instances.

-- Blast2GO - http://toolshed.g2.bx.psu.edu/view/peterjc/blast2go

This tool wrapper uses blast2go.loc which should list one or more Blast2GO *.properties files. These can in principle be used for advanced things like changing evidence weighting codes etc. However, the primary point is to point to different Blast2GO databases. There have been a series of (date stamped) public (free) Blast2GO databases, and my tool installation script already sets up the *.properties files for the most recent databases (which it uses for a unit test), which was your point 2 (below). The local Galaxy administrator may need to add extra entries to the blast2go.loc file, for instance when there is a new public database release, or if they set up a local database (recommended). This seems to be an easy case (since there is little that we can automate). A simple interface for adding lines to the *.loc files would be enough, assuming it includes a file select browser.

In this case, you could define a blast2go Data Manager that would allow the selection of the external public (free) Blast2GO database that the user wants. A code file could be used to populate this list dynamically from the external server's contents until a more generalized way of doing so is made available to tool parameters. The underlying Data Manager tool would then retrieve the database and return a JSON description of the fields to add to the data table .loc file. This same Data Manager could also be allowed to add a file locally from a server's filesystem. We don't have a filesystem select widget for tools yet, but you could use a textbox for manual entry or use a select list/drill down with dynamic code for this.
A ServerFileToolParameter could be defined to list server contents directly, but we would want to make sure that ordinary tool devs are aware of it being a bit of a security risk, depending upon how it is used (we usually don't want ordinary users selecting random files off of the filesystem in normal tools).

It may be worthwhile to have a look at the Reference Genome / all_fasta data manager (http://testtoolshed.g2.bx.psu.edu/view/blankenberg/data_manager_fetch_genome_all_fasta), which can grab reference genome FASTAs from UCSC, NCBI, a URL, a Galaxy History, or a Directory on the server (copy or symlink) and then populates the all_fasta table.

-- BLAST+ - http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus/

This uses blastdb.loc (nucleotides), blastdb_p.loc (proteins) etc. A simple interface for adding lines to the *.loc files would be useful, although the oddities of BLAST database naming might need a little code on top of a plain file select browser (the database name is the file path stem without the
Re: [galaxy-dev] Job splitting
Hi Peter,

bjoern.gruen...@gmail.com wrote:

Hi Peter,

I saw you are working again on the job splitting features. I may have a feature request:

Well, not really working on it - just reporting a regression using it.

Ah ok :)

For some tools you have different file formats as input, but the split and merge functions are defined for only one or two of them. If the filetype is one of these non-splittable ones, Galaxy will crash. I think a better way would be to default to non-split mode? I guess that isn't supported. Does that make sense to you? Any pointer where to look? It's not urgent, but it was on my todo list. I just remembered it while reading your mail.

Yes, that makes perfect sense. Possibly here in lib/galaxy/jobs/__init__.py it needs to look at the datatype to see whether that supports splitting:

    def can_split( self ):
        # Should the job handler split this job up?
        return self.app.config.use_tasked_jobs and self.tool.parallelism

We could/should take this discussion to galaxy-dev,

Sure, I thought it would be easier, some missing return or something. I will have a deeper look at it when I have some more time.

Trello card is here: https://trello.com/c/lIKKwiC1

Thanks! Bjoern

Regards, Peter
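The suggestion above (fall back to non-split mode when an input datatype cannot be split) could look roughly like the following. This is a standalone sketch under stated assumptions, not a patch against lib/galaxy/jobs/__init__.py: the free function, its parameters, and the `splittable` attribute are all hypothetical stand-ins, because the thread does not say how Galaxy datatypes advertise split support.

```python
def can_split(use_tasked_jobs, parallelism, split_datatypes):
    """Decide whether a job should be split into tasks (illustrative sketch).

    use_tasked_jobs -- stand-in for app.config.use_tasked_jobs
    parallelism     -- stand-in for tool.parallelism (falsy if not declared)
    split_datatypes -- datatype objects of the inputs that would be split;
                       the `splittable` attribute is a hypothetical marker
    """
    if not (use_tasked_jobs and parallelism):
        return False
    # Fall back to a normal, unsplit job unless every input supports splitting.
    return all(getattr(dt, "splittable", False) for dt in split_datatypes)
```

The design choice here is to degrade gracefully: a tool that accepts both splittable and non-splittable formats still runs, just as a single job, instead of failing when it meets the non-splittable one.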
Re: [galaxy-dev] RFC: remove trailing semicolons from command line - Broken bowtie2_wrapper on some SGE systems
On Thu, Oct 17, 2013 at 5:49 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

On Thu, Oct 17, 2013 at 11:36 AM, John Chilton chil...@msi.umn.edu wrote:

I broke the TaskWrapper last week with my exit code handling fix, the double semi-colon thing you are seeing there. Your fix would break non-task split jobs so that is probably the problem(?) Hopefully? Do you want to revert 8e001dc9675c and pull in the changeset I just pushed out? Otherwise, I will test out task splitting later today. I am very sorry.

-John

Thanks John,

Those of us running with galaxy-central expect minor breakage from time to time - I do this to avoid more pain if the problems were not spotted by the community and reached galaxy-dist and thus our production Galaxy instance.

Well, they have given me commit access so expect a lot more minor breakage :).

(And with the job splitting not being enabled by default, I am aware that I am in a relatively small group of Galaxy admins using it.)

I don't think my fix hurts non-task split jobs, but I will now try your fix on the default branch:

I think it will in at least some cases if metadata is getting set externally, I don't see how it is preventing some commands from running together, I could totally be wrong though. At any rate, I tested the version with my fix on your blast wrappers, with task splitting on and off, submitting to a DRM and using the local job runner, and they all seemed to work. Let me know if your Galaxy instance is unconvinced.

https://bitbucket.org/galaxy/galaxy-central/commits/329ea7a83af4f389a7c95ee4559d88c6fec0211b

This appears to also address a metadata issue, which if I am lucky may be the fix for this issue?: http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-October/017031.html

I doubt your luck is so good. That problem looks like some sort of disk caching issue to me (Galaxy process and worker node having inconsistent views of the same file system), I doubt this will fix it. Though hopefully I am wrong on both counts :).
Thanks for reporting the problem and the fix! -John Thanks, Peter ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Re: [galaxy-dev] RFC: remove trailing semicolons from command line - Broken bowtie2_wrapper on some SGE systems
On Thu, Oct 17, 2013 at 5:04 PM, John Chilton chil...@msi.umn.edu wrote:

On Thu, Oct 17, 2013 at 5:49 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

On Thu, Oct 17, 2013 at 11:36 AM, John Chilton chil...@msi.umn.edu wrote:

(And with the job splitting not being enabled by default, I am aware that I am in a relatively small group of Galaxy admins using it.) I don't think my fix hurts non-task split jobs, but I will now try your fix on the default branch:

I think it will in at least some cases if metadata is getting set externally. I don't see how it is preventing some commands from running together, though I could totally be wrong. At any rate, I tested the version with my fix on your blast wrappers, with task splitting on and off, submitting to a DRM and using the local job runner, and they all seemed to work. Let me know if your Galaxy instance is unconvinced.

Seems fine so far :)

This appears to also address a metadata issue, which if I am lucky may be the fix for this issue?: http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-October/017031.html

I doubt your luck is so good. That problem looks like some sort of disk caching issue to me (Galaxy process and worker node having inconsistent views of the same file system), so I doubt this will fix it. Though hopefully I am wrong on both counts :).

Disk caching makes sense as a root cause - I've not had this happen consistently or reproducibly yet so it may well return.

Thanks for reporting the problem and the fix! -John

No problem, thank you for addressing it so promptly. Peter
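For anyone following the thread subject, the double semicolon problem can be illustrated with a small sketch. This is not Galaxy's code: `join_commands` is a hypothetical helper, and it assumes command fragments are joined with `; `. If a fragment already ends in `;`, naive joining produces `;;`, which POSIX shells reject as a syntax error outside `case` statements. Stripping trailing semicolons before joining avoids that.

```python
def join_commands(fragments):
    """Join shell command fragments with '; ' without producing ';;'.

    Hypothetical helper for illustration only; not part of Galaxy.
    Limitation: it does not special-case an escaped '\\;' (as used by
    'find -exec'), which a real implementation would need to preserve.
    """
    cleaned = []
    for fragment in fragments:
        # Remove surrounding whitespace and any run of trailing
        # semicolons/whitespace such as "cmd ; ;".
        fragment = fragment.strip().rstrip("; \t\n")
        if fragment:  # skip fragments that were empty or only ';'
            cleaned.append(fragment)
    return "; ".join(cleaned)


if __name__ == "__main__":
    # The first fragment already ends in ';', as a wrapper command might.
    print(join_commands(["echo one;", "echo two"]))
```

Joining the same two fragments naively with `"; ".join(...)` would give `echo one;; echo two`, which is the kind of command line that fails.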
[galaxy-dev] file download returns empty file or webpage error
Hello,

I've looked through the archives and I see that this issue has been raised before (http://dev.list.galaxyproject.org/file-download-returns-empty-file-or-webpage-fails-with-ERR-CONNECTION-CLOSED-tp4415687.html) but without a noted resolution. I am running Apache as a proxy, but downloads from multiple machines and browsers are either empty or return an error after clicking the download button.

I enabled Apache xsendfile:

    <Location "/">
        XSendFile on
        XSendFileAllowAbove on
        XSendFilePath /galaxy
    </Location>

And configured Galaxy to be aware of this in universe_wsgi.ini:

    apache_xsendfile = True

I have checked the Apache error logs and do not see any issues reported. Any help would be appreciated. I've ended up changing apache_xsendfile = false just so that I can download data. How do I get Galaxy, Apache and xsendfile to work properly?

Cheers, Ian
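As background on why a download can come back empty, here is a generic sketch of the X-Sendfile handshake. It is not Galaxy's implementation: `sendfile_app`, `call_app`, and the file path are hypothetical names chosen for illustration. With X-Sendfile enabled, the application returns an empty body plus an `X-Sendfile` header naming a file, and the front-end server is expected to replace that body with the file's contents. If the front end does not act on the header, the client is left with the empty body, which is one possible explanation for the symptom described above.

```python
def sendfile_app(environ, start_response):
    """Minimal WSGI app that hands file delivery to the front-end server.

    Illustration only; the path below is a hypothetical example.
    """
    path = "/galaxy/example.dat"
    headers = [
        ("Content-Type", "application/octet-stream"),
        # The front-end server is expected to read this header and
        # stream the named file to the client.
        ("X-Sendfile", path),
    ]
    start_response("200 OK", headers)
    # The application itself sends no file data.
    return [b""]


def call_app(app):
    """Call a WSGI app directly and return (status, headers, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app({}, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    status, headers, body = call_app(sendfile_app)
    # Without a front end that honours X-Sendfile, this empty body is
    # all a client would receive.
    print(status, headers.get("X-Sendfile"), len(body))
```

Inspecting the response headers that reach the browser (for example, whether `X-Sendfile` is still present) may help show whether the front end is processing the header at all.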
[galaxy-dev] Zero padding corruption using galaxy with torque on AFS
Hi everyone,

I'm currently setting up Galaxy to run on top of AFS using Torque for handling jobs. Everything is set up according to the wiki documentation but I'm having a weird filesystem corruption problem. The setup is the following:

machine A: runs Galaxy. Galaxy home folder is on AFS.
machine B: runs the Torque server and shares Galaxy's home folder.

When I launch a process via Galaxy everything works as expected but the output file becomes corrupted with 4 KB of leading zero bytes (file NC_010473.tabular). This corruption is reproducible at all times, regardless of file size. In the attached example, the original file is NC_010473.faa. If I restart the OpenAFS client or flush the AFS file cache the corruption goes away. However, if I re-run the same script created by Galaxy through Torque the corruption doesn't happen. Hence it only happens if launched via Galaxy. I also tried both the DRMAA and the PBS modules but the corruption remained. Does anyone know what could be the cause of this?

Thanks, Renato

>gi|170079664|ref|YP_001728984.1| thr operon leader peptide [Escherichia coli str. K-12 substr. DH10B]
MKRISTTITTTITITTGNGAG
>gi|170079665|ref|YP_001728985.1| bifunctional aspartokinase I/homoserine dehydrogenase I [Escherichia coli str. K-12 substr. DH10B]
MRVLKFGGTSVANAERFLRVADILESNARQGQVATVLSAPAKITNHLVAMIEKTISGQDALPNISDAERI
FAELLTGLAAAQPGFPLAQLKTFVDQEFAQIKHVLHGISLLGQCPDSINAALICRGEKMSIAIMAGVLEA
RGHNVTVIDPVEKLLAVGHYLESTVDIAESTRRIAASRIPADHMVLMAGFTAGNEKGELVVLGRNGSDYS
AAVLAACLRADCCEIWTDVDGVYTCDPRQVPDARLLKSMSYQEAMELSYFGAKVLHPRTITPIAQFQIPC
LIKNTGNPQAPGTLIGASRDEDELPVKGISNLNNMAMFSVSGPGMKGMVGMAARVFAAMSRARISVVLIT
QSSSEYSISFCVPQSDCVRAERAMQEEFYLELKEGLLEPLAVTERLAIISVVGDGMRTLRGISAKFFAAL
ARANINIVAIAQGSSERSISVVVNNDDATTGVRVTHQMLFNTDQVIEVFVIGVGGVGGALLEQLKRQQSW
LKNKHIDLRVCGVANSKALLTNVHGLNLENWQEELAQAKEPFNLGRLIRLVKEYHLLNPVIVDCTSSQAV
ADQYADFLREGFHVVTPNKKANTSSMDYYHQLRYAAEKSRRKFLYDTNVGAGLPVIENLQNLLNAGDELM
KFSGILSGSLSYIFGKLDEGMSFSEATTLAREMGYTEPDPRDDLSGMDVARKLLILARETGRELELADIE
IEPVLPAEFNAEGDVAAFMANLSQLDDLFAARVAKARDEGKVLRYVGNIDEDGVCRVKIAEVDGNDPLFK
VKNGENALAFYSHYYQPLPLVLRGYGAGNDVTAAGVFADLLRTLSWKLGV
>gi|170079666|ref|YP_001728986.1| homoserine kinase [Escherichia coli str. K-12 substr. DH10B]
MVKVYAPASSANMSVGFDVLGAAVTPVDGALLGDVVTVEAAETFSLNNLGRFADKLPSEPRENIVYQCWE
RFCQELGKQIPVAMTLEKNMPIGSGLGSSACSVVAALMAMNEHCGKPLNDTRLLALMGELEGRISGSIHY
DNVAPCFLGGMQLMIEENDIISQQVPGFDEWLWVLAYPGIKVSTAEARAILPAQYRRQDCIAHGRHLAGF
IHACYSRQPELAAKLMKDVIAEPYRERLLPGFRQARQAVAEIGAVASGISGSGPTLFALCDKPETAQRVA
DWLGKNYLQNQEGFVHICRLDTAGARVLEN
>gi|170079667|ref|YP_001728987.1| threonine synthase [Escherichia coli str. K-12 substr. DH10B]
MKLYNLKDHNEQVSFAQAVTQGLGKNQGLFFPHDLPEFSLTEIDEMLKLDFVTRSAKILSAFIGDEIPQE
ILEERVRAAFAFPAPVANVESDVGCLELFHGPTLAFKDFGGRFMAQMLTHIAGDKPVTILTATSGDTGAA
VAHAFYGLPNVKVVILYPRGKISPLQEKLFCTLGGNIETVAIDGDFDACQALVKQAFDDEELKVALGLNS
ANSINISRLLAQICYYFEAVAQLPQETRNQLVVSVPSGNFGDLTAGLLAKSLGLPVKRFIAATNVNDTVP
RFLHDGQWSPKATQATLSNAMDVSQPNNWPRVEELFRRKIWQLKELGYAAVDDETTQQTMRELKELGYTS
EPHAAVAYRALRDQLNPGEYGLFLGTAHPAKFKESVEAILGETLDLPKELAERADLPLLSHNLPADFAAL
RKLMMNHQ
>gi|170079668|ref|YP_001728988.1| hypothetical protein ECDH10B_0005 [Escherichia coli str. K-12 substr. DH10B]
MKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDHGWWKQHYEWRGNRWHL
HGPRHHKKAPHDHHGGHGPGKHHR
>gi|170079669|ref|YP_001728989.1| hypothetical protein ECDH10B_0006 [Escherichia coli str. K-12 substr. DH10B]
MLILISPAKTLDYQSPLTTTRYTLPELLDNSQQLIHEARKLTPPQISTLMRISDKLAGINAARFHDWQPD
FTPANARQAILAFKGDVYTGLQAETFSEDDFDFAQQHLRMLSGLYGVLRPLDLMQPYRLEMGIRLENARG
KDLYQFWGDIITNKLNEALAAQGDNVVINLASDEYFKSVKPKKLNAEIIKPVFLDEKNGKFKIISFYAKK
ARGLMSRFIIENRLTKPEQLTGFNSEGYFFDEDSSSNGELVFKRYEQR
>gi|170079670|ref|YP_001728990.1| transporter [Escherichia coli str. K-12 substr. DH10B]
MPDFFSFINSVLWGSVMIYLLFGAGCWFTFRTGFVQFRYIRQFGKSLKNSIHPQPGGLTSFQSLCTSLAA
RVGSGNLAGVALAITAGGPGAVFWMWVAAFIGMATSFAECSLAQLYKERDVNGQFRGGPAWYMARGLGMR
WMGVLFAVFLLIAYGIIFSGVQANAVARALSFSFDFPPLVTGIILAVFTLLAITRGLHGVARLMQGFVPL
MAIIWVLTSLVICVMNIGQLPHVIWSIFESAFGWQEAAGGAAGYTLSQAITNGFQRSMFSNEAGMGSTPN
AASWPPHPAAQGIVQMIGIFIDTLVICTASAMLILLAGNGTTYMPLEGIQLIQKAMRVLMGSWGAE
FVTLVVILFAFSSIVANYIYAENNLFFLRLNNPKAIWCLRICTFATVIGGTLLSLPLMWQLADIIMACMA
ITNLTAILLLSPVVHTIASDYLRQRKLGVRPVFDPLRYPDIGRQLSPDAWDDVSQE
>gi|170079671|ref|YP_001728991.1| transaldolase B [Escherichia coli str. K-12 substr. DH10B]
MTDKLTSLRQYTTVVADTGDIAAMKLYQPQDATTNPSLILNAAQIPEYRKLIDDAVAWAKQQSNDRAQQI
VDATDKLAVNIGLEILKLVPGRISTEVDARLSYDTEASIAKAKRLIKLYNDAGISNDRILIKLASTWQGI
RAAEQLEKEGINCNLTLLFSFAQARACAEAGVFLISPFVGRILDWYKANTDKKEYAPAEDPGVVSVSEIY
QYYKEHGYETVVMGASFRNIGEILELAGCDRLTIAPALLKELAESEGAIERKLSYTGEVKARPARITESE
FLWQHNQDPMAVDKLAEGIRKFAIDQEKLEKMIGDLL
>gi|170079672|ref|YP_001728992.1| molybdenum cofactor biosynthesis protein MogA [Escherichia coli str. K-12 substr. DH10B]
MNTLRIGLVSISDRASSGVYQDKGIPALEEWLTSALTTPFELETRLIPDEQAIIEQTLCELVDEMSCHLV
LTTGGTGPARRDVTPDATLAVADREMPGFGEQMRQISLHFVPTAILSRQVGVIRKQALILNLPGQPKSIK
ETLEGVKDAEGNVVVHGIFASVPYCIQLLEGPYVETAPEVVAAFRPKSARRDVSE
>gi|170079673|ref|YP_001728993.1| hypothetical