Re: [galaxy-dev] Defining $GALAXY_SLOTS for use in tool wrappers

2013-10-17 Thread Bjoern Gruening
On Saturday, 12.10.2013, at 19:42 +0100, Peter Cock wrote:
 On Thu, Aug 1, 2013 at 10:27 AM, Nicola Soranzo sora...@crs4.it wrote:
  On 2013-07-30 17:18, Peter Cock wrote:
 
  Hello all,
 
  Re:
  http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-June/010153.html
  http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-October/011557.html
 
  Something I raised during the GCC2013, and we talked about
  via Twitter as well was a Galaxy environment variable for use
  within Tool Wrappers setting the number of threads/CPUs to
  use.
 
  The idea is that you can configure a default value, and then
  override this per runner or per tool etc.
 
 
  Thanks Peter for pushing this idea, I totally support this proposal.
  In the meantime, for my tools I've been using the solution from
  Jim Johnson's CD-HIT wrapper:
 
  http://toolshed.g2.bx.psu.edu/view/jjohnson/cdhit
 
  But this requires the system administrator to modify both the tool
  env.sh and job_conf.xml to be in sync.
 
  Is there an open Trello card for this?
 
  A Trello card would be useful indeed.
 
  Nicola
 
 Better than a Trello card, we now have a pull request from John:
 https://bitbucket.org/galaxy/galaxy-central/pull-request/236/job-runner-enhancements-galaxy_slots/diff

And thanks to John, it's merged! Time for testing and migrating our
tools :)
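
For wrapper authors migrating their tools, here is a minimal sketch of how
a Python wrapper script might pick up the value. It assumes only that Galaxy
exports GALAXY_SLOTS into the job environment, as discussed in this thread;
the helper name galaxy_slots and the fallback of 1 are illustrative and not
part of Galaxy itself.

```python
import os


def galaxy_slots(default=1):
    """Return the thread count from $GALAXY_SLOTS, or `default` if unusable."""
    raw = os.environ.get("GALAXY_SLOTS", "")
    try:
        slots = int(raw)
    except ValueError:
        # Variable unset or not a number: fall back to the default.
        return default
    # Guard against zero or negative values.
    return slots if slots > 0 else default
```

The returned value could then be passed to the underlying tool's own thread
option (for example -num_threads for BLAST+), so the administrator only needs
to configure the slot count in one place.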


 Peter
 ___
 Please keep all replies on the list by using reply all
 in your mail client.  To manage your subscriptions to this
 and other Galaxy lists, please use the interface at:
   http://lists.bx.psu.edu/
 
 To search Galaxy mailing lists use the unified search at:
   http://galaxyproject.org/search/mailinglists/





Re: [galaxy-dev] Interact with running job?

2013-10-17 Thread Jonas Hagberg
Hi

Thanks for your input.

-- 
Jonas Hagberg
BILS - Bioinformatics Infrastructure for Life Sciences - http://bils.se
e-mail: jonas.hagb...@bils.se, jonas.hagb...@scilifelab.se
phone: +46-(0)70 6683869
address: SciLifeLab, Box 1031, 171 21 Solna, Sweden


On Wed, Oct 16, 2013 at 3:18 PM, John Chilton chil...@msi.umn.edu wrote:

 Hello Jonas,

   I don't believe this is currently doable in Galaxy and it may be
 difficult to add. I worry that in some job/file/cluster configurations
 intermediate files, standard output, etc. would not be available to
 the Galaxy processes until the job is complete.

   Nonetheless, if you have a fairly simple cluster configuration you
 could probably output intermediate files/logging files as part of your
 job and the Galaxy processes would be able to read them. The UI will
 show these files as running, but you may be able to hack up the
 Galaxy framework to provide additional logic and processing. If you do
 want to do this, I am not entirely sure where to start, but
 lib/galaxy/tools, lib/galaxy/jobs, templates/webapps/galaxy/history/,
 static/scripts/mvc/history, and static/scripts/mvc/dataset may be places
 to look.


Great, I guessed I would need to hack on the Galaxy framework a bit.
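
As a starting point for the "intermediate logging files" idea, here is a
rough sketch of reading the most recent lines of a progress log while the job
is still running. It assumes the job writes a plain-text log somewhere the
Galaxy process can read (for example a shared file system); the function name
tail_lines is hypothetical and nothing here is existing Galaxy code.

```python
from collections import deque


def tail_lines(path, count=20):
    """Return up to the last `count` lines of a text file as a list."""
    # errors="replace" keeps the reader alive if the job is mid-write
    # and a multi-byte character is only partially flushed.
    with open(path, "r", errors="replace") as handle:
        # deque with maxlen keeps only the final `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

A status view could call this periodically and plot or display whatever the
tool has written so far.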


   If you come up with some changes to Galaxy that allow supporting
 this use case or come up with specific recommendations for changes we
 can make to make Galaxy more amenable, please let us know.

Yes I will give it a thought.




   A more Galaxy friendly approach might be to break up your job into
 several jobs/tools and string them together with a workflow. This
 would allow users to inspect intermediate results as the workflow is
 processed. I understand though most jobs/tools cannot be broken up in
 this fashion.


Yes it could be a possibility. I need to talk more to the application guy
about this. But I think it is possible.


 Sorry I could not be of more help,

Just the help I needed to get started.

Many thanks!


 -John


 On Wed, Oct 16, 2013 at 7:40 AM, Jonas Hagberg
 jonas.hagb...@scilifelab.se wrote:
  Hi
 
  I am new to Galaxy. I have been reading the documentation but could not
  find an answer. I would like to create a tool that, once it is executing
  on a cluster, lets the user interact with the running job: get the
  current output, make a plot, and see how far the job has come, for
  example by clicking on a special status link on the running job in the
  history.
 
  How would one do this in Galaxy? Is there any way to do it today with
  the tools in Galaxy?
  It would be great to get some first guidance on where to start.
 
  cheers
  --
  Jonas Hagberg
  BILS - Bioinformatics Infrastructure for Life Sciences - http://bils.se
  e-mail: jonas.hagb...@bils.se, jonas.hagb...@scilifelab.se
  phone: +46-(0)70 6683869
  address: SciLifeLab, Box 1031, 171 21 Solna, Sweden
 

Re: [galaxy-dev] Deploying LOC files for tool built-in data during a tool installation

2013-10-17 Thread Peter Cock
Hi Dan,

On Tue, Oct 15, 2013 at 7:40 PM, Daniel Blankenberg d...@bx.psu.edu wrote:
 Hi all,

 I think what we have are two similar, but somewhat separate problems:
 1.) We need a way via the UI for an admin to be able to add additional
 configuration entries to data tables / .loc files.

 For 1.), we now have Data Managers. A Data Manager will do all the
 heavy lifting of adding additional data table entries. e.g. for bwa, it can
 build the mapping indexes and add the properly delimited line to the
 .loc file. These are accessed through the admin interface, under Manage
 local data. Data Managers are installed from a ToolShed, or can be
 installed manually. In addition to direct interactive usage, Data Manager
 tools can be included in workflows or accessed via the tools API. Not
 only does the use of a Data Manager remove the technical burdens/
 concerns of adding new entries to a data table / .loc file, it also provides
 for the same reproducibility and provenance tracking that is afforded
 to regular Galaxy tools.

You said there that Data Managers can be used within a workflow.
I don't quite follow - aren't the Data Managers restricted to
administrators only?

If you don't mind me picking two specific examples of direct
personal interest - which lead me to ask if there is a default
Data Manager which just offers a web GUI for editing any *.loc
file as a table?

--

Blast2GO - http://toolshed.g2.bx.psu.edu/view/peterjc/blast2go
This tool wrapper uses blast2go.loc which should list one or more
Blast2GO *.properties files. These can in principle be used for
advanced things like changing evidence weighting codes etc.
However, the primary point is to point to different Blast2GO
databases.

There have been a series of (date stamped) public (free) Blast2GO
databases, and my tool installation script already sets up the
*.properties files for the most recent databases (which it uses
for a unit test), which was your point 2 (below).

The local Galaxy administrator may need to add extra entries
to the blast2go.loc file, for instance when there is a new public
database release, or if they set up a local database (recommended).

This seems to be an easy case (since there is little that we can
automate). A simple interface for adding lines to the *.loc files
would be enough, assuming it includes a file select browser.

--

BLAST+ - http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus/
This uses blastdb.loc (nucleotides), blastdb_p.loc (proteins) etc.
A simple interface for adding lines to the *.loc files would be
useful, although the oddities of BLAST database naming might
need a little code on top of a plain file select browser (the database
name is the file path stem without the *.nal, *.pal, etc. extension).
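
To illustrate the "little code" this would need, here is a sketch that turns
a selected file into the BLAST database name and a tab-separated *.loc line.
The extension list is an assumption and not exhaustive, the function names
are hypothetical, and the three-column layout (id, caption, path) is my
reading of the sample blastdb.loc file rather than a formal specification.

```python
import os

# Alias and index file extensions to strip (assumed, not exhaustive).
BLAST_EXTENSIONS = (".nal", ".pal", ".nin", ".pin", ".nhr", ".phr", ".nsq", ".psq")


def blast_db_path(selected_file):
    """Return the database name BLAST expects: the path without extension."""
    stem, extension = os.path.splitext(selected_file)
    if extension.lower() in BLAST_EXTENSIONS:
        return stem
    # No recognised BLAST extension: assume the path is already a db name.
    return selected_file


def loc_line(value, caption, selected_file):
    """Build one tab-separated line for a blastdb.loc style file."""
    return "\t".join([value, caption, blast_db_path(selected_file)])
```

For example, selecting /mnt/scratch/local/blast/ncbi/nt.nal would yield the
path /mnt/scratch/local/blast/ncbi/nt, which is the form passed to -db.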

Is there potential for offering to automatically create databases
from the all_fasta data table you mention below?

 The documentation for Data Managers is currently limited to the
 tutorial-style doc here: 
 http://wiki.galaxyproject.org/Admin/Tools/DataManagers/HowTo/Define;
 a more formal / config syntax type of page will also be made available,
 although the tutorial is a pretty inclusive description of the steps needed
 to define a Data Manager.

Could I suggest you add that information (paraphrase what you just
said in this email) to the main page:

http://wiki.galaxyproject.org/Admin/Tools/DataManagers

I think that would help.


 2.) We need a way to bootstrap/initialize a Galaxy installation with data
 table/ .loc file entries ('built-in data') during installation for
 a.) a 'production' Galaxy instance - this would include local
  dev/testing/etc instances
 b.) automated testing framework - tests should run fast, but
  meaningfully test a tool, e.g., the horse mitochondrial
  genome could be a fine built-in genome for running
  automated tool tests, but not desired to be automatically
  installed into a production Galaxy instance


 For 2.): bootstrapping data during an installation process is something
 that still needs to be more completely spec'd out and implemented. ...

OK, so the Data Manager work does not yet cover bootstrapping
(installing data as part of tool installation from the tool shed etc).

Regarding 2(b), Greg and I talked about this earlier in the thread and
I filed Trello Card 1165 on a related issue:
https://trello.com/c/P90b5Pa0/1165-functional-tests-need-separate-loc-files-to-the-live-production-loc-files-e-g-loc-test

Thanks,

Peter


Re: [galaxy-dev] RFC: remove trailing semicolons from command line - Broken bowtie2_wrapper on some SGE systems

2013-10-17 Thread Peter Cock
Hi John,

Is all your semi-colon fixing on the trunk? I've found another bug in
this area (patch below), which is showing up with task splitting under
SGE.

e.g. this job is meant to be running and merging 5 BLAST XML files, stderr:

nothing to merge for
/mnt/galaxy/galaxy-central/database/files/000/dataset_498.dat
(expected 5 files)

/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_0:
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_0/galaxy_288_60.sh:
line 13: syntax error near unexpected token `;'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_0/galaxy_288_60.sh:
line 13: `/mnt/galaxy/galaxy-central/extract_dataset_parts.sh
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_0;
blastn -query 
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_0/dataset_495.dat
  -db /mnt/scratch/local/blast/ncbi/nt -task megablast -evalue 0.001
-out 
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_0/dataset_498.dat
-outfmt 5 -num_threads 8 -dust yes -strand both -max_target_seqs
3; return_code=$?; cd /mnt/galaxy/galaxy-central; ; sh -c exit
$return_code'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_1:
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_1/galaxy_288_61.sh:
line 13: syntax error near unexpected token `;'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_1/galaxy_288_61.sh:
line 13: `/mnt/galaxy/galaxy-central/extract_dataset_parts.sh
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_1;
blastn -query 
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_1/dataset_495.dat
  -db /mnt/scratch/local/blast/ncbi/nt -task megablast -evalue 0.001
-out 
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_1/dataset_498.dat
-outfmt 5 -num_threads 8 -dust yes -strand both -max_target_seqs
3; return_code=$?; cd /mnt/galaxy/galaxy-central; ; sh -c exit
$return_code'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_2:
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_2/galaxy_288_62.sh:
line 13: syntax error near unexpected token `;'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_2/galaxy_288_62.sh:
line 13: `/mnt/galaxy/galaxy-central/extract_dataset_parts.sh
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_2;
blastn -query 
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_2/dataset_495.dat
  -db /mnt/scratch/local/blast/ncbi/nt -task megablast -evalue 0.001
-out 
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_2/dataset_498.dat
-outfmt 5 -num_threads 8 -dust yes -strand both -max_target_seqs
3; return_code=$?; cd /mnt/galaxy/galaxy-central; ; sh -c exit
$return_code'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_3:
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_3/galaxy_288_63.sh:
line 13: syntax error near unexpected token `;'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_3/galaxy_288_63.sh:
line 13: `/mnt/galaxy/galaxy-central/extract_dataset_parts.sh
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_3;
blastn -query 
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_3/dataset_495.dat
  -db /mnt/scratch/local/blast/ncbi/nt -task megablast -evalue 0.001
-out 
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_3/dataset_498.dat
-outfmt 5 -num_threads 8 -dust yes -strand both -max_target_seqs
3; return_code=$?; cd /mnt/galaxy/galaxy-central; ; sh -c exit
$return_code'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_4:
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_4/galaxy_288_64.sh:
line 13: syntax error near unexpected token `;'
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_4/galaxy_288_64.sh:
line 13: `/mnt/galaxy/galaxy-central/extract_dataset_parts.sh
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_4;
blastn -query 
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_4/dataset_495.dat
  -db /mnt/scratch/local/blast/ncbi/nt -task megablast -evalue 0.001
-out 
/mnt/galaxy/galaxy-central/database/job_working_directory/000/288/task_4/dataset_498.dat
-outfmt 5 -num_threads 8 -dust yes -strand both -max_target_seqs
3; return_code=$?; cd /mnt/galaxy/galaxy-central; ; sh -c exit
$return_code'

Note the repeated semi-colon, which causes the child jobs to fail (patch below).
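
To make the failure mode concrete, here is a small sketch of joining shell
command fragments so that an empty fragment cannot produce "; ;". This is not
the patch referred to above and not Galaxy's actual code; the function name
join_commands is hypothetical.

```python
def join_commands(fragments):
    """Join shell command fragments with '; ', skipping empty ones."""
    # Drop surrounding whitespace and any trailing semicolon on each fragment
    # so that the separator is only ever added once.
    cleaned = [fragment.strip().rstrip(";").strip() for fragment in fragments]
    return "; ".join(fragment for fragment in cleaned if fragment)
```

With an empty metadata command in the middle, the result contains a single
separator between "cd ..." and "sh -c ..." rather than the doubled one seen
in the error output above.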

There is however a second bug, the merge fails yet the history entry
is still green (success). The merge method is raising a ValueError yet
it is being ignored.

Regards,

Peter

$ hg branch
default

$ hg tip
changeset:   12028:8e001dc9675c
tag:         tip
user:        John Chilton jmchil...@gmail.com

Re: [galaxy-dev] RFC: remove trailing semicolons from command line - Broken bowtie2_wrapper on some SGE systems

2013-10-17 Thread Peter Cock
On Thu, Oct 17, 2013 at 10:53 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 Hi John,

 Is all your semi-colon fixing on the trunk? I've found another bug in
 this area (patch below), which is showing up with task splitting under
 SGE.

 ...

 Note the repeated semi-colon, which causes the child jobs to fail (patch 
 below).

 There is however a second bug, the merge fails yet the history entry
 is still green (success). The merge method is raising a ValueError yet
 it is being ignored.

That bug in detecting failures may be a symptom of something more general:
I've just had a split-BLAST job return this as stderr, yet the history entry
is green: Job output not returned from cluster

Peter


Re: [galaxy-dev] RFC: remove trailing semicolons from command line - Broken bowtie2_wrapper on some SGE systems

2013-10-17 Thread Peter Cock
On Thu, Oct 17, 2013 at 11:36 AM, John Chilton chil...@msi.umn.edu wrote:
 I broke the TaskWrapper last week with my exit code handling fix,
 the double semi-colon thing you are seeing there. Your fix would break
 non-task split jobs so that is probably the problem(?) Hopefully? Do you want
 to revert 8e001dc9675c and pull in the changeset I just pushed out?

 Otherwise, I will test out task splitting later today.

 I am very sorry.

 -John

Thanks John,

Those of us running with galaxy-central expect minor breakage from
time to time - I do this to avoid more pain if the problems were
not spotted by the community and reached galaxy-dist and
thus our production Galaxy instance.

(And with the job splitting not being enabled by default, I am
aware that I am in a relatively small group of Galaxy admins
using it.)

I don't think my fix hurts non-task split jobs, but I will now try your
fix on the default branch:

https://bitbucket.org/galaxy/galaxy-central/commits/329ea7a83af4f389a7c95ee4559d88c6fec0211b

This appears to also address a metadata issue, which if I am lucky
may be the fix for this issue?:

http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-October/017031.html

Thanks,

Peter


Re: [galaxy-dev] Tabular data not displaying in main panel

2013-10-17 Thread Sarah Diehl
Hello everyone,

Any news on this bug? I have the same issue here, with the error
TypeError: column_types is undefined @
http://galaxy.immunbio.mpg.de/static/scripts/mvc/data.js:188

The suggested fix works for me, but the resulting formatting isn't nice.

Best regards,
Sarah

- Original Message -
From: Ian Misner imis...@umd.edu
To: galaxy-dev@lists.bx.psu.edu
Sent: Friday, October 4, 2013 1:17:54 PM
Subject: Re: [galaxy-dev] Tabular data not displaying in main panel

Hello All,

I'm having the same issue with tabular data not displaying, but I'm afraid I'm
much newer to running a local Galaxy instance.

Apparently switching to
https://bitbucket.org/galaxy/galaxy-central/src/a477486bf18eafdd14dd7ba1e91e17f1b05e8121/scripts/functional_tests.py?at=stable
would help, but frankly I don't know how to do that. Any help would be appreciated.


Here is my current branch information

changeset:   10421:a477486bf18e
branch:      stable
tag:         tip
user:        Nate Coraor n...@bx.psu.edu
date:        Thu Sep 26 11:02:58 2013 -0400
summary:     Bugfix for tool-to-destination mapping, tool ids are lowercased but the mapping id was not lowercased.

changeset:   10411:c42567f43aa7
user:        greg
date:        Mon Aug 19 13:19:56 2013 -0400
summary:     Filter invalid objects when generating the list of repository_dependencies objects that are associated with a tool shed repository installed into Galaxy.



Cheers
Ian





Re: [galaxy-dev] Tabular data not displaying in main panel

2013-10-17 Thread Peter Cock
On Thu, Oct 17, 2013 at 1:01 PM, Sarah Diehl di...@ie-freiburg.mpg.de wrote:
 Hello everyone,

 any news on this bug? I have the same issue here, with the error
 TypeError: column_types is undefined @ 
 http://galaxy.immunbio.mpg.de/static/scripts/mvc/data.js:188

 The suggested fix works for me, but the resulting formatting isn't nice.

 Best regards,
 Sarah

No news about the root cause of the bug (the change to data.js
just tackles the symptoms with the downside of messing up the
column alignment). I filed an issue on Trello:

https://trello.com/c/it0oXXeT/1190-tabular-data-not-displaying-in-main-panel-data-js-error

Peter


Re: [galaxy-dev] Problem configuring Galaxy with an Apache proxy

2013-10-17 Thread Eric Rasche

Hello Erwan,

This issue is caused by a missing trailing slash in your proxy definition.

That's why the root page will load, but nothing requiring any more depth
than that.

# Explanation

(For example, I run mine in a subdirectory)
localhost:8080/galaxy

The root page makes the request to

localhost:8080/galaxy

and that succeeds.

All sub-pages (e.g., style sheets) make requests that look like

localhost:8080/galaxystatic/welcome.html

and fail. Just add your missing slash and you'll be fine.
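
The effect can be mimicked with plain string concatenation. This is only an
illustration of the explanation above, not how Apache matches or rewrites
URLs internally; the function name proxy_target is hypothetical.

```python
def proxy_target(backend_prefix, request_remainder):
    """Naively append the rest of the request path to the backend prefix."""
    return backend_prefix + request_remainder


# Without the trailing slash the two path components run together.
broken = proxy_target("localhost:8080/galaxy", "static/welcome.html")
# With the trailing slash the sub-page resolves where expected.
fixed = proxy_target("localhost:8080/galaxy/", "static/welcome.html")
```

Here broken is localhost:8080/galaxystatic/welcome.html, the failing request
shown above, while fixed is localhost:8080/galaxy/static/welcome.html.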

In the future, I find that setting LogLevel debug in my Apache conf and
watching the logs as I make HTTP requests to Galaxy is helpful for this
sort of thing.

Cheers,
Eric



On 10/17/2013 08:39 AM, Erwan Delage wrote:
 Hello everyone,
 
 I'm having trouble setting up Galaxy with an Apache Proxy.
 
 I edited the Apache conf file to add the following lines:
 
 RewriteEngine on
 RewriteRule ^/static/style/(.*) /home/nate/galaxy-dist/static/june_2007_style/blue/$1 [L]
 RewriteRule ^/static/scripts/(.*) /home/nate/galaxy-dist/static/scripts/packed/$1 [L]
 RewriteRule ^/static/(.*) /home/nate/galaxy-dist/static/$1 [L]
 RewriteRule ^/favicon.ico /home/nate/galaxy-dist/static/favicon.ico [L]
 RewriteRule ^/robots.txt /home/nate/galaxy-dist/static/robots.txt [L]
 RewriteRule ^(.*) http://localhost:8080$1 [P]
 
 The redirection seems to work, as I reach a webpage that contains some
 Galaxy links but, as you can see in the attached snapshot, the webpage
 does not really look like the usual Galaxy welcome page :)
 
 Have you already faced this problem, or do you have any idea how to solve it?
 
 Thanks,
 
 Erwan,
 
 
 
 
-- 
Eric Rasche
Programmer II
Center for Phage Technology
Texas A&M University
College Station, TX 77843
404-692-2048
e...@tamu.edu
rasche.e...@yandex.ru


Re: [galaxy-dev] Deploying LOC files for tool built-in data during a tool installation

2013-10-17 Thread Daniel Blankenberg
Hi Peter,

Please see replies inline, below.


Thanks,

Dan


On Oct 17, 2013, at 5:36 AM, Peter Cock wrote:

 Hi Dan,
 
 On Tue, Oct 15, 2013 at 7:40 PM, Daniel Blankenberg d...@bx.psu.edu wrote:
 Hi all,
 
 I think what we have are two similar, but somewhat separate problems:
 1.) We need a way via the UI for an admin to be able to add additional
 configuration entries to data tables / .loc files.
 
 For 1.), we now have Data Managers. A Data Manager will do all the
 heavy lifting of adding additional data table entries. e.g. for bwa, it can
 build the mapping indexes and add the properly delimited line to the
 .loc file. These are accessed through the admin interface, under Manage
 local data. Data Managers are installed from a ToolShed, or can be
 installed manually. In addition to direct interactive usage, Data Manager
 tools can be included in workflows or accessed via the tools API. Not
 only does the use of a Data Manager remove the technical burdens/
 concerns of adding new entries to a data table / .loc file, it also provides
 for the same reproducibility and provenance tracking that is afforded
 to regular Galaxy tools.
 
 You said there that Data Managers can be used within a workflow.
 I don't quite follow - aren't the Data Managers restricted to
 administrators only?

This is correct. Admins can run workflows containing Data Managers, while 
standard users cannot. Additionally, the selection list for any installed Data 
Managers will only appear within the workflow editor for an admin.



 If you don't mind me picking two specific examples of direct
 personal interest - which lead me to ask if there is a default
 Data Manager which just offers a web GUI for editing any *.loc
 file as a table?

Something like this for adding entries could be done now, although currently 
existing entries cannot be modified or removed by using Data Managers. There is 
not currently a generic Data Manager written that will do this though. 

On my list of things to do is to write a Data Manager that would generically 
make use of our datacache rsync server, but there is not an ETA for this. 
Another one, or the same one, could also make use of S3, which would be 
particularly useful for Cloud instances.


 --
 
 Blast2GO - http://toolshed.g2.bx.psu.edu/view/peterjc/blast2go
 This tool wrapper uses blast2go.loc which should list one or more
 Blast2GO *.properties files. These can in principle be used for
 advanced things like changing evidence weighting codes etc.
 However, the primary point is to point to different Blast2GO
 databases.
 
 There have been a series of (date stamped) public (free) Blast2GO
 databases, and my tool installation script already sets up the
 *.properties files for the most recent databases (which it uses
 for a unit test), which was your point 2 (below).
 
 The local Galaxy administrator may need to add extra entries
 to the blast2go.loc file, for instance when there is a new public
 database release, or if they setup a local database (recommended).
 
 This seems to be an easy case (since there is little that we can
 automate). A simple interface for adding lines to the *.loc files
 would be enough, assuming it includes a file select browser.

In this case, you could define a blast2go Data Manager that would allow
selection of the external public (free) Blast2GO database that the user wants.
A code file could be used to populate this list dynamically from the external
server's contents until a more generalized way of doing so is made available to
tool parameters. The underlying Data Manager tool would then retrieve the
database and return a JSON description of the fields to add to the data table
.loc file.
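
As a rough sketch of that last step, the Data Manager tool could serialise
the new row like this. The top-level "data_tables" layout follows the
tutorial as I understand it and should be checked against it; the table name
blast2go, the columns value/name/path, and the function name
data_manager_json are assumptions made for illustration.

```python
import json


def data_manager_json(table_name, entry):
    """Serialise one new data table row as a JSON string."""
    # One list of row dictionaries per data table name.
    payload = {"data_tables": {table_name: [entry]}}
    return json.dumps(payload, indent=2, sort_keys=True)
```

A hypothetical call would pass a dictionary such as
{"value": ..., "name": ..., "path": ...} pointing at the downloaded
*.properties file, and write the returned string to the tool's output file.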

This same Data Manager could be allowed to add a file locally from a server's
filesystem. We don't have a filesystem select widget for tools yet, but you
could use a textbox for manual entry or use a select list/drill down with
dynamic code for this. A ServerFileToolParameter could be defined to list
server contents directly, but we would want to make sure that ordinary tool
devs are aware of it being a bit of a security risk, depending upon how it is
used (we usually don't want ordinary users selecting random files off of the
filesystem in normal tools).

It may be worthwhile to have a look at the Reference Genome / all_fasta data
manager
(http://testtoolshed.g2.bx.psu.edu/view/blankenberg/data_manager_fetch_genome_all_fasta),
which can grab reference genome FASTAs from UCSC, NCBI, a URL, a Galaxy
History, or a directory on the server (copy or symlink) and then populate the
all_fasta table.



 --
 
 BLAST+ - http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus/
 This uses blastdb.loc (nucleotides), blastdb_p.loc (proteins) etc.
 A simple interface for adding lines to the *.loc files would be
 useful, although the oddities of BLAST database naming might
 need a little code on top of a plain file select browser (the database
 name is the file path stem without the 

Re: [galaxy-dev] Job splitting

2013-10-17 Thread Bjoern Gruening
Hi Peter,

 bjoern.gruen...@gmail.com wrote:
  Hi Peter,
 
  I saw you are working again on the job splitting features. I may have a
  feature request:
 
 Well, not really working on it - just reporting a regression using it.

Ah ok :)

  For some tools you have different file formats as input but only for one
  or two of them the split and merge function are defined. If the filetype
  is now one of these non splittable ones, Galaxy will crash. I think a
  better way would be to default to non-split mode?
 
 I guess that isn't supported.
 
  Does that make sense to you? Any pointers on where to look? It's not urgent,
  but it was on my to-do list. I just remembered it while reading your mail.
 
 Yes, that makes perfect sense. Possibly here in lib/galaxy/jobs/__init__.py
 it needs to look at the datatype to see if that supports splitting:
 
 def can_split( self ):
     # Should the job handler split this job up?
     return self.app.config.use_tasked_jobs and self.tool.parallelism
 
 We could/should take this discussion to galaxy-dev,

Sure, I thought it would be easier, some missing return or
something. I will have a deeper look at it when I have some more time.
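
For the record, here is a standalone sketch of the behaviour I have in mind:
fall back to non-split mode when an input's datatype cannot be split. It is
not Galaxy code. The Datatype classes and the supports_split flag are
hypothetical stand-ins, and the real check in lib/galaxy/jobs/__init__.py
would have to use whatever the datatype classes actually expose.

```python
class Datatype(object):
    """Hypothetical stand-in for a Galaxy datatype."""
    supports_split = False


class Fasta(Datatype):
    """Example datatype with split and merge defined."""
    supports_split = True


class Tabular(Datatype):
    """Example datatype without split support (inherits False)."""


def can_split(use_tasked_jobs, parallelism, split_input_datatypes):
    """Return True only if splitting is enabled and every input allows it."""
    if not (use_tasked_jobs and parallelism):
        return False
    # A single non-splittable input means the job runs unsplit.
    return all(datatype.supports_split for datatype in split_input_datatypes)
```

With this, a tool with parallelism configured would still run as a single
job when given, say, a tabular input instead of failing.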

Trello card is here:
https://trello.com/c/lIKKwiC1

Thanks!
Bjoern


 Regards,
 
 Peter





Re: [galaxy-dev] RFC: remove trailing semicolons from command line - Broken bowtie2_wrapper on some SGE systems

2013-10-17 Thread John Chilton
On Thu, Oct 17, 2013 at 5:49 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Thu, Oct 17, 2013 at 11:36 AM, John Chilton chil...@msi.umn.edu wrote:
 I broke the TaskWrapper last week with my exit code handling fix; that is
 the double semicolon thing you are seeing there. Your fix would break
 non-task split jobs, so that is probably the problem(?) Hopefully? Do you want
 to revert 8e001dc9675c and pull in the changeset I just pushed out?

 Otherwise, I will test out task splitting later today.

 I am very sorry.

 -John

 Thanks John,

 Those of us running with galaxy-dist expect minor breakage from
 time to time - I do this to avoid more pain if the problems were
 not spotted by the community and reached galaxy-central and
 thus our production Galaxy instance.


Well they have given me commit access so expect a lot more minor breakage :).

 (And with the job splitting not being enabled by default, I am
 aware that I am in a relatively small group of Galaxy admins
 using it.)

 I don't think my fix hurts non-task split jobs, but I will now try your
 fix on the default branch:

I think it will in at least some cases if metadata is getting set
externally. I don't see how it prevents some commands from
running together. I could be totally wrong though.

At any rate, I tested the version with my fix on your blast wrappers,
with task splitting on and off, submitting to a DRM and using the
local job runner, and they all seemed to work. Let me know if your
galaxy instance is unconvinced.
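
For readers following the thread, here is a minimal sketch of the general idea of avoiding double semicolons when command fragments are chained. It is not the changeset referenced in this thread; `join_commands` is a hypothetical helper name, and it deliberately ignores shell constructs where `;;` is meaningful, such as `case` statements.

```python
def join_commands(commands):
    """Join shell command fragments with ';' without producing ';;'."""
    cleaned = []
    for command in commands:
        # Drop surrounding whitespace, then any trailing semicolons
        # and the whitespace mixed in with them.
        stripped = command.strip().rstrip("; \t")
        if stripped:
            cleaned.append(stripped)
    return "; ".join(cleaned)
```

Empty fragments are skipped, so a fragment that consisted only of a semicolon does not leave a stray separator behind.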


 https://bitbucket.org/galaxy/galaxy-central/commits/329ea7a83af4f389a7c95ee4559d88c6fec0211b

 This appears to also address a metadata issue, which if I am lucky
 may be the fix for this issue?:

 http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-October/017031.html

I doubt your luck is so good. That problem looks like some sort of
disk caching issue to me (the Galaxy process and worker node having
inconsistent views of the same file system), so I doubt this will fix it.
Though hopefully I am wrong on both counts :).

Thanks for reporting the problem and the fix!

-John


 Thanks,

 Peter


Re: [galaxy-dev] RFC: remove trailing semicolons from command line - Broken bowtie2_wrapper on some SGE systems

2013-10-17 Thread Peter Cock
On Thu, Oct 17, 2013 at 5:04 PM, John Chilton chil...@msi.umn.edu wrote:
 On Thu, Oct 17, 2013 at 5:49 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Thu, Oct 17, 2013 at 11:36 AM, John Chilton chil...@msi.umn.edu wrote:

 (And with the job splitting not being enabled by default, I am
 aware that I am in a relatively small group of Galaxy admins
 using it.)

 I don't think my fix hurts non-task split jobs, but I will now try your
 fix on the default branch:

 I think it will in at least some cases if metadata is getting set
 externally. I don't see how it prevents some commands from
 running together. I could be totally wrong though.

 At any rate, I tested the version with my fix on your blast wrappers,
 with task splitting on and off, submitting to a DRM and using the
 local job runner, and they all seemed to work. Let me know if your
 galaxy instance is unconvinced.

Seems fine so far :)

 This appears to also address a metadata issue, which if I am lucky
 may be the fix for this issue?:

 http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-October/017031.html

 I doubt your luck is so good. That problem looks like some sort of
 disk caching issue to me (the Galaxy process and worker node having
 inconsistent views of the same file system), so I doubt this will fix it.
 Though hopefully I am wrong on both counts :).

Disk caching makes sense as a root cause - I've not had this
happen consistently or reproducibly yet so it may well return.

 Thanks for reporting the problem and the fix!

 -John

No problem, thank you for addressing it so promptly.

Peter


[galaxy-dev] file download returns empty file or webpage error

2013-10-17 Thread UMD Bioinformatics
Hello,

I've looked through the archives and I see that this issue has been raised 
before 
(http://dev.list.galaxyproject.org/file-download-returns-empty-file-or-webpage-fails-with-ERR-CONNECTION-CLOSED-tp4415687.html)
 but without a noted resolution. 

I am running Apache as a proxy, but downloads from multiple machines and 
browsers are either empty or return an error after clicking the download 
button. 

I enabled Apache XSendFile:

<Location "/">
    XSendFile on
    XSendFileAllowAbove on
    XSendFilePath /galaxy
</Location>

And I configured Galaxy to be aware of this in universe_wsgi.ini with 
apache_xsendfile = True.

I have checked the apache error logs and do not see any issues reported.

Any help would be appreciated.

I've ended up setting apache_xsendfile = False just so that I can download 
data. How do I get Galaxy, Apache and XSendFile to work together properly?



Cheers,
Ian



[galaxy-dev] Zero padding corruption using galaxy with torque on AFS

2013-10-17 Thread Renato Alves
Hi everyone,

I'm currently setting up galaxy to run on top of AFS using torque for
handling jobs.

Everything is set up according to the wiki documentation, but I'm having a
weird filesystem corruption problem.

The setup is the following:

machine A: runs galaxy. Galaxy home folder is on AFS.
machine B: runs torque server and shares Galaxy's home folder.

When I launch a process via Galaxy, everything works as expected, but the
output file becomes corrupted with 4 KB of leading zero bytes (file
NC_010473.tabular). This corruption is reproducible at all times,
regardless of file size. In the attached example, the original file is
NC_010473.faa.

If I restart the openafs client or flush the AFS file cache the
corruption goes away. However, if I re-run the same script created by
galaxy through torque the corruption doesn't happen. Hence it only
happens if launched via Galaxy.

I also tried both the DRMAA and the PBS modules but the corruption remained.
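
To check an output file for the zero padding described above, a small script along these lines may help. It is only a diagnostic sketch: `count_leading_zero_bytes` is a hypothetical helper, not part of Galaxy, torque or OpenAFS, and it only counts NUL bytes at the start of a file.

```python
def count_leading_zero_bytes(path, chunk_size=4096):
    """Count the NUL bytes that appear before the first non-NUL byte."""
    count = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                # End of file: the file was empty or consisted only of NULs.
                return count
            stripped = chunk.lstrip(b"\x00")
            count += len(chunk) - len(stripped)
            if stripped:
                # Found the first non-NUL byte, so stop reading.
                return count
```

Running it on a file such as NC_010473.tabular from both machine A and machine B would show whether the two hosts see the same leading bytes; a result of 4096 on one host and 0 on the other would fit the 4 KB padding reported here.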

Does anyone know what could be the cause of this?

Thanks,
Renato
>gi|170079664|ref|YP_001728984.1| thr operon leader peptide [Escherichia coli str. K-12 substr. DH10B]
MKRISTTITTTITITTGNGAG
>gi|170079665|ref|YP_001728985.1| bifunctional aspartokinase I/homoserine dehydrogenase I [Escherichia coli str. K-12 substr. DH10B]
MRVLKFGGTSVANAERFLRVADILESNARQGQVATVLSAPAKITNHLVAMIEKTISGQDALPNISDAERI
FAELLTGLAAAQPGFPLAQLKTFVDQEFAQIKHVLHGISLLGQCPDSINAALICRGEKMSIAIMAGVLEA
RGHNVTVIDPVEKLLAVGHYLESTVDIAESTRRIAASRIPADHMVLMAGFTAGNEKGELVVLGRNGSDYS
AAVLAACLRADCCEIWTDVDGVYTCDPRQVPDARLLKSMSYQEAMELSYFGAKVLHPRTITPIAQFQIPC
LIKNTGNPQAPGTLIGASRDEDELPVKGISNLNNMAMFSVSGPGMKGMVGMAARVFAAMSRARISVVLIT
QSSSEYSISFCVPQSDCVRAERAMQEEFYLELKEGLLEPLAVTERLAIISVVGDGMRTLRGISAKFFAAL
ARANINIVAIAQGSSERSISVVVNNDDATTGVRVTHQMLFNTDQVIEVFVIGVGGVGGALLEQLKRQQSW
LKNKHIDLRVCGVANSKALLTNVHGLNLENWQEELAQAKEPFNLGRLIRLVKEYHLLNPVIVDCTSSQAV
ADQYADFLREGFHVVTPNKKANTSSMDYYHQLRYAAEKSRRKFLYDTNVGAGLPVIENLQNLLNAGDELM
KFSGILSGSLSYIFGKLDEGMSFSEATTLAREMGYTEPDPRDDLSGMDVARKLLILARETGRELELADIE
IEPVLPAEFNAEGDVAAFMANLSQLDDLFAARVAKARDEGKVLRYVGNIDEDGVCRVKIAEVDGNDPLFK
VKNGENALAFYSHYYQPLPLVLRGYGAGNDVTAAGVFADLLRTLSWKLGV
>gi|170079666|ref|YP_001728986.1| homoserine kinase [Escherichia coli str. K-12 substr. DH10B]
MVKVYAPASSANMSVGFDVLGAAVTPVDGALLGDVVTVEAAETFSLNNLGRFADKLPSEPRENIVYQCWE
RFCQELGKQIPVAMTLEKNMPIGSGLGSSACSVVAALMAMNEHCGKPLNDTRLLALMGELEGRISGSIHY
DNVAPCFLGGMQLMIEENDIISQQVPGFDEWLWVLAYPGIKVSTAEARAILPAQYRRQDCIAHGRHLAGF
IHACYSRQPELAAKLMKDVIAEPYRERLLPGFRQARQAVAEIGAVASGISGSGPTLFALCDKPETAQRVA
DWLGKNYLQNQEGFVHICRLDTAGARVLEN
>gi|170079667|ref|YP_001728987.1| threonine synthase [Escherichia coli str. K-12 substr. DH10B]
MKLYNLKDHNEQVSFAQAVTQGLGKNQGLFFPHDLPEFSLTEIDEMLKLDFVTRSAKILSAFIGDEIPQE
ILEERVRAAFAFPAPVANVESDVGCLELFHGPTLAFKDFGGRFMAQMLTHIAGDKPVTILTATSGDTGAA
VAHAFYGLPNVKVVILYPRGKISPLQEKLFCTLGGNIETVAIDGDFDACQALVKQAFDDEELKVALGLNS
ANSINISRLLAQICYYFEAVAQLPQETRNQLVVSVPSGNFGDLTAGLLAKSLGLPVKRFIAATNVNDTVP
RFLHDGQWSPKATQATLSNAMDVSQPNNWPRVEELFRRKIWQLKELGYAAVDDETTQQTMRELKELGYTS
EPHAAVAYRALRDQLNPGEYGLFLGTAHPAKFKESVEAILGETLDLPKELAERADLPLLSHNLPADFAAL
RKLMMNHQ
>gi|170079668|ref|YP_001728988.1| hypothetical protein ECDH10B_0005 [Escherichia coli str. K-12 substr. DH10B]
MKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDHGWWKQHYEWRGNRWHL
HGPRHHKKAPHDHHGGHGPGKHHR
>gi|170079669|ref|YP_001728989.1| hypothetical protein ECDH10B_0006 [Escherichia coli str. K-12 substr. DH10B]
MLILISPAKTLDYQSPLTTTRYTLPELLDNSQQLIHEARKLTPPQISTLMRISDKLAGINAARFHDWQPD
FTPANARQAILAFKGDVYTGLQAETFSEDDFDFAQQHLRMLSGLYGVLRPLDLMQPYRLEMGIRLENARG
KDLYQFWGDIITNKLNEALAAQGDNVVINLASDEYFKSVKPKKLNAEIIKPVFLDEKNGKFKIISFYAKK
ARGLMSRFIIENRLTKPEQLTGFNSEGYFFDEDSSSNGELVFKRYEQR
>gi|170079670|ref|YP_001728990.1| transporter [Escherichia coli str. K-12 substr. DH10B]
MPDFFSFINSVLWGSVMIYLLFGAGCWFTFRTGFVQFRYIRQFGKSLKNSIHPQPGGLTSFQSLCTSLAA
RVGSGNLAGVALAITAGGPGAVFWMWVAAFIGMATSFAECSLAQLYKERDVNGQFRGGPAWYMARGLGMR
WMGVLFAVFLLIAYGIIFSGVQANAVARALSFSFDFPPLVTGIILAVFTLLAITRGLHGVARLMQGFVPL
MAIIWVLTSLVICVMNIGQLPHVIWSIFESAFGWQEAAGGAAGYTLSQAITNGFQRSMFSNEAGMGSTPN
AASWPPHPAAQGIVQMIGIFIDTLVICTASAMLILLAGNGTTYMPLEGIQLIQKAMRVLMGSWGAE
FVTLVVILFAFSSIVANYIYAENNLFFLRLNNPKAIWCLRICTFATVIGGTLLSLPLMWQLADIIMACMA
ITNLTAILLLSPVVHTIASDYLRQRKLGVRPVFDPLRYPDIGRQLSPDAWDDVSQE
>gi|170079671|ref|YP_001728991.1| transaldolase B [Escherichia coli str. K-12 substr. DH10B]
MTDKLTSLRQYTTVVADTGDIAAMKLYQPQDATTNPSLILNAAQIPEYRKLIDDAVAWAKQQSNDRAQQI
VDATDKLAVNIGLEILKLVPGRISTEVDARLSYDTEASIAKAKRLIKLYNDAGISNDRILIKLASTWQGI
RAAEQLEKEGINCNLTLLFSFAQARACAEAGVFLISPFVGRILDWYKANTDKKEYAPAEDPGVVSVSEIY
QYYKEHGYETVVMGASFRNIGEILELAGCDRLTIAPALLKELAESEGAIERKLSYTGEVKARPARITESE
FLWQHNQDPMAVDKLAEGIRKFAIDQEKLEKMIGDLL
>gi|170079672|ref|YP_001728992.1| molybdenum cofactor biosynthesis protein MogA [Escherichia coli str. K-12 substr. DH10B]
MNTLRIGLVSISDRASSGVYQDKGIPALEEWLTSALTTPFELETRLIPDEQAIIEQTLCELVDEMSCHLV
LTTGGTGPARRDVTPDATLAVADREMPGFGEQMRQISLHFVPTAILSRQVGVIRKQALILNLPGQPKSIK
ETLEGVKDAEGNVVVHGIFASVPYCIQLLEGPYVETAPEVVAAFRPKSARRDVSE
gi|170079673|ref|YP_001728993.1| hypothetical