[galaxy-dev] Troubleshooting file uploads (to Data Library)

2013-08-21 Thread Clare Sloggett
Hi,

I am having trouble uploading files to a Data Library, and I'm not sure
where to begin troubleshooting. I'm uploading from a URL (but I had a
similar issue from the desktop). The symptom is that the datasets in the
library have the message "This job is queued" and never seem to progress.

I am one of very few users of this instance (quite likely the only user
right this moment). I don't think the server is busy, so I'm not sure why
the file uploads don't seem to be proceeding. How can I investigate
further?

Thanks,
Clare

-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759
___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: [galaxy-dev] Troubleshooting file uploads (to Data Library)

2013-08-21 Thread Clare Sloggett
Sorry, missed some information: there are a handful of files, 35MB
(gzipped) each. The issue occurs even if I only try to upload one of them
though. The server is a 16-core machine.


On 21 August 2013 17:08, Clare Sloggett s...@unimelb.edu.au wrote:

 Hi,

 I am having trouble uploading files to a Data Library, and I'm not sure
 where to begin troubleshooting. I'm uploading from a URL (but I had a
 similar issue from the desktop). The symptom is that the datasets in the
 library have the message "This job is queued" and never seem to progress.

 I am one of very few users of this instance (quite likely the only user
 right this moment). I don't think the server is busy, so I'm not sure why
 the file uploads don't seem to be proceeding. How can I investigate
 further?

 Thanks,
 Clare

 --

 Clare Sloggett
 Research Fellow / Bioinformatician
 Life Sciences Computation Centre
 Victorian Life Sciences Computation Initiative
 University of Melbourne, Parkville Campus
 187 Grattan Street, Carlton, Melbourne
 Victoria 3010, Australia
 Ph: 03 903 53357  M: 0414 854 759




-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759

Re: [galaxy-dev] Troubleshooting file uploads (to Data Library)

2013-08-21 Thread Clare Sloggett
Hi Hans and all,

The issue turned out to be a more general one with our job runners (it was
also stopping non-data-transfer jobs running, if I'd noticed). Yep, I do
specify file format when uploading.

Thanks for your help!

Clare


On 21 August 2013 17:38, Hans-Rudolf Hotz h...@fmi.ch wrote:

 Hi Clare

 a few points to start:

  - do you define the 'File Format'?
(don't use 'Auto-detect' for big files)

  - and similar to a recent question on the list: check your proxy
settings


 Regards, Hans-Rudolf



 On 08/21/2013 09:09 AM, Clare Sloggett wrote:


 Sorry, missed some information: there are a handful of files, 35MB
 (gzipped) each. The issue occurs even if I only try to upload one of
 them though. The server is a 16-core machine.


 On 21 August 2013 17:08, Clare Sloggett s...@unimelb.edu.au
 mailto:s...@unimelb.edu.au wrote:

 Hi,

 I am having trouble uploading files to a Data Library, and I'm not
 sure where to begin troubleshooting. I'm uploading from a URL (but I
 had a similar issue from the desktop). The symptom is that the
 datasets in the library have the message "This job is queued" and
 never seem to progress.

 I am one of very few users of this instance (quite likely the only
 user right this moment). I don't think the server is busy, so I'm
 not sure why the file uploads don't seem to be proceeding. How can
 I investigate further?

 Thanks,
 Clare

 --

 Clare Sloggett
 Research Fellow / Bioinformatician
 Life Sciences Computation Centre
 Victorian Life Sciences Computation Initiative
 University of Melbourne, Parkville Campus
 187 Grattan Street, Carlton, Melbourne
 Victoria 3010, Australia
 Ph: 03 903 53357  M: 0414 854 759




 --

 Clare Sloggett
 Research Fellow / Bioinformatician
 Life Sciences Computation Centre
 Victorian Life Sciences Computation Initiative
 University of Melbourne, Parkville Campus
 187 Grattan Street, Carlton, Melbourne
 Victoria 3010, Australia
 Ph: 03 903 53357  M: 0414 854 759








-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759

[galaxy-dev] two versions of show datasets API call?

2013-03-18 Thread Clare Sloggett
Hi all,

This is a using-the-API question; not sure if it belongs in galaxy-dev or
galaxy-user!

There seem to be two ways to retrieve metadata for a dataset, one in the
Histories API and one in the Datasets API. They return different
information.

So for instance if I call
http://galaxy-vic.genome.edu.au/api/datasets/cb5d3b9eef2b9275?key=myapikey
I see

{
    "data_type": "fastqsanger",
    "deleted": false,
    "file_size": 16439610,
    "genome_build": "?",
    "id": 397,
    "metadata_data_lines": null,
    "metadata_dbkey": "?",
    "metadata_sequences": null,
    "misc_blurb": "15.7 MB",
    "misc_info": "uploaded fastqsanger file",
    "model_class": "HistoryDatasetAssociation",
    "name": "https://bioblend.s3.amazonaws.com/C1_R1_1.chr4.fq",
    "purged": false,
    "state": "ok",
    "visible": true
}

But if I call 
http://galaxy-vic.genome.edu.au/api/histories/fb4122d2ca33443e/contents/cb5d3b9eef2b9275?key=myapikey

I see

{
    "accessible": true,
    "api_type": "file",
    "data_type": "fastqsanger",
    "deleted": false,
    "display_apps": [],
    "download_url": "/datasets/cb5d3b9eef2b9275/display?to_ext=fastqsanger",
    "file_ext": "fastqsanger",
    "file_name": "/mnt/all/cloudman/galaxy/clare/files/000/dataset_375.dat",
    "file_size": 16439610,
    "genome_build": "?",
    "hid": 1,
    "history_id": "fb4122d2ca33443e",
    "id": "cb5d3b9eef2b9275",
    "metadata_data_lines": null,
    "metadata_dbkey": "?",
    "metadata_sequences": null,
    "misc_blurb": "15.7 MB",
    "misc_info": "uploaded fastqsanger file",
    "model_class": "HistoryDatasetAssociation",
    "name": "https://bioblend.s3.amazonaws.com/C1_R1_1.chr4.fq",
    "peek": "<table cellspacing=\"0\" cellpadding=\"3\"><tr><td>@9453842/1</td></tr><tr><td>CAGATTATGGAATCACTTGAAACTGATATTAATTGCCGAAAGATGCATCTTTCACGTTAGGAAATGTTGCT</td></tr><tr><td>+</td></tr><tr><td>III</td></tr><tr><td>@9454359/1</td></tr><tr><td>GGAAATGAGTACAGCTATGCAACAGCTATCAGTAAGGCCGAAGAGTTTGATACTATTTCTGCATTGA</td></tr></table>",
    "purged": false,
    "state": "ok",
    "visible": true,
    "visualizations": []
}


The second version gives me much more information, including the History ID
(but ironically requires the History ID to make the call in the first
place). What I would ideally like is an API call which only requires
knowledge of the Dataset ID but provides all the information in the second
call.
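To make the difference concrete, the two payloads above can be compared
programmatically. The sketch below copies the values from the responses shown
(the "peek" field is elided for brevity) and lists the keys that only the
histories/contents call provides:

```python
# The payload from /api/datasets/<id>, as shown above.
datasets_api = {
    "data_type": "fastqsanger", "deleted": False, "file_size": 16439610,
    "genome_build": "?", "id": 397, "metadata_data_lines": None,
    "metadata_dbkey": "?", "metadata_sequences": None,
    "misc_blurb": "15.7 MB", "misc_info": "uploaded fastqsanger file",
    "model_class": "HistoryDatasetAssociation",
    "name": "https://bioblend.s3.amazonaws.com/C1_R1_1.chr4.fq",
    "purged": False, "state": "ok", "visible": True,
}

# The payload from /api/histories/<hid>/contents/<id> carries the same
# fields plus these extras ("peek" elided).
history_contents_api = dict(
    datasets_api,
    accessible=True, api_type="file", display_apps=[],
    download_url="/datasets/cb5d3b9eef2b9275/display?to_ext=fastqsanger",
    file_ext="fastqsanger",
    file_name="/mnt/all/cloudman/galaxy/clare/files/000/dataset_375.dat",
    hid=1, history_id="fb4122d2ca33443e", id="cb5d3b9eef2b9275",
    visualizations=[],
)

# Keys present only in the histories/contents response.
extra_fields = sorted(set(history_contents_api) - set(datasets_api))
print(extra_fields)
```

Running this shows that download_url, file_name and history_id (among others)
are only available via the histories route, which is exactly the asymmetry
described above.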

I am also a bit confused by the existence of the two different methods in
the first place. Is it necessary for it to be this way, or are they just
there for historical reasons?

If it would be desirable to either only have one show-dataset REST method
or to make the behaviour of the two identical, should I add a Trello card
for this?

Thanks,
Clare


-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759

Re: [galaxy-dev] BioBlend: Problem Running Example file

2013-03-17 Thread Clare Sloggett
Oops sorry, meant to keep the list cc'd - resending.

On 18 March 2013 16:18, Clare Sloggett s...@unimelb.edu.au wrote:

 Hi Rob,

 Were you using a very old version of the library and examples or was it
 quite recent? In any case, try grabbing the script from
 https://github.com/afgane/bioblend/tree/master/docs/examples , we have
 done a lot of bugfixing and documentation recently. If you just want to
 test your setup there are also now some much simpler, less end-to-end
 example scripts that don't use any admin permissions, and we'll add a few
 more.

 The create_library step is the step where you're going to run into
 problems if you don't have admin rights to the Galaxy instance. If you're
 running on localhost though you should be able to set up the account as an
 admin account.

 Finally... we've put up some much better docs on using workflows (and
 libraries) in bioblend, which might be useful:
 http://bioblend.readthedocs.org/ and specifically
 http://bioblend.readthedocs.org/en/latest/api_docs/galaxy/docs.html#run-a-workflow

 A lot of this is very recent so if you run into any bugs or anything that
 is just not clear, let me know, the feedback is very helpful!
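
 As an aside, the admin-rights caveat on create_library can be handled
 defensively. A minimal sketch, assuming a bioblend.galaxy.GalaxyInstance
 whose API key belongs to an admin account (get_libraries and create_library
 are BioBlend library-client methods; ensure_library itself is a hypothetical
 helper, not part of BioBlend):

```python
def ensure_library(gi, name):
    """Return the data library called `name`, creating it if absent.

    `gi` is a bioblend.galaxy.GalaxyInstance; create_library will fail
    unless the API key belongs to a Galaxy admin account.
    """
    existing = gi.libraries.get_libraries(name=name)
    if existing:
        return existing[0]  # reuse rather than create a duplicate
    return gi.libraries.create_library(name)
```

 Guarding the call this way also avoids re-creating "Imported data for API
 demo" on repeated runs of the example script.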

 Cheers,
 Clare

 On 8 March 2013 16:01, Rob Leclerc robert.lecl...@gmail.com wrote:

 I had trouble running blend4j, so I tried to jump into python (a language
 I have little experience with).

 I tried running the example run_imported_workflow.py

 % python run_imported_workflow.py http://localhost:8080 8c25bc83f6f9e4001dd21eb7b64f063f

 but I get an error:

 Initiating Galaxy connection
 Importing workflow
 Creating data library 'Imported data for API demo'

 Traceback (most recent call last):
   File run_imported_workflow.py, line 53, in module
 library_dict = gi.libraries.create_library(library_name)
   File
 build/bdist.macosx-10.6-intel/egg/bioblend/galaxy/libraries/__init__.py,
 line 27, in create_library
   File build/bdist.macosx-10.6-intel/egg/bioblend/galaxy/client.py,
 line 53, in _post
   File build/bdist.macosx-10.6-intel/egg/bioblend/galaxy/__init__.py,
 line 132, in make_post_request
   File
 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests-1.1.0-py2.7.egg/requests/models.py,
 line 604, in json
 return json.loads(self.text or self.content)
   File
 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/simplejson-3.1.0-py2.7-macosx-10.6-intel.egg/simplejson/__init__.py,
 line 454, in loads
 return _default_decoder.decode(s)
   File
 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/simplejson-3.1.0-py2.7-macosx-10.6-intel.egg/simplejson/decoder.py,
 line 374, in decode
 obj, end = self.raw_decode(s)
   File
 /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/simplejson-3.1.0-py2.7-macosx-10.6-intel.egg/simplejson/decoder.py,
 line 393, in raw_decode
 return self.scan_once(s, idx=_w(s, idx).end())
 simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1
 (char 0)


 Is there anything I am missing from the stock configuration which would
 cause this not to run out of the box?

 Cheers,
 Rob





 --

 Clare Sloggett
 Research Fellow / Bioinformatician
 Life Sciences Computation Centre
 Victorian Life Sciences Computation Initiative
 University of Melbourne, Parkville Campus
 187 Grattan Street, Carlton, Melbourne
 Victoria 3010, Australia
 Ph: 03 903 53357  M: 0414 854 759




-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759

[galaxy-dev] Displayed versions of tools in Galaxy

2012-12-11 Thread Clare Sloggett
Hi guys,

I wasn't sure if I should send this one to galaxy-user. I have just
confused myself about the versions of tools displayed within Galaxy.

If I select the TopHat tool, the tool UI says "Tophat for Illumina
(version 1.5.0)".
After I have run the job, the step panel in the History window says
"Info: TopHat v1.4.1".

If I select the Cufflinks tool, the tool UI says "Cufflinks (version 0.0.5)".
After running the job, the step panel says "Info: cufflinks v1.3.0".

Looking at the xml files, it does look like the version displayed
before running is the wrapper and the version displayed on the
resulting dataset is that of the command-line tool.

<tool id="tophat" name="Tophat for Illumina" version="1.5.0">
<!-- Wrapper compatible with Tophat versions 1.3.0 to 1.4.1 -->

<tool id="cufflinks" name="Cufflinks" version="0.0.5">
<!-- Wrapper supports Cufflinks versions v1.3.0 and newer -->

So looking at the XML it's very clear, but from the versions displayed
in Galaxy I was completely confused, particularly since the two tophat
version numbers happened to be similar (1.4.1 and 1.5.0). Maybe we
should change it so that when Galaxy says "version" it is always
explicit about whether it means the wrapper version or the tool version?

It also seems like there's no way for a user to discover the
command-line tool version without actually running the tool (or is
there)? Is this because Galaxy itself does not know this information?

All this came about because I'm trying to specify to users which
versions of tools my exported Workflow was built with, and I'm not
sure how to do it without confusing them.
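
One workaround when documenting a workflow is to pull the wrapper version (and
the comment naming the supported tool versions) straight out of the tool XML.
A minimal sketch; the embedded snippet mirrors the tophat wrapper quoted above:

```python
import xml.etree.ElementTree as ET

TOPHAT_XML = """\
<tool id="tophat" name="Tophat for Illumina" version="1.5.0">
  <!-- Wrapper compatible with Tophat versions 1.3.0 to 1.4.1 -->
</tool>
"""

def wrapper_version(tool_xml):
    """Return the version declared on the <tool> element.

    Note this is the *wrapper* version, not the version of the
    underlying command-line tool, which Galaxy only reports on the
    output dataset after a run.
    """
    return ET.fromstring(tool_xml).get("version")
```

For the snippet above, wrapper_version(TOPHAT_XML) returns "1.5.0", the same
number the tool UI shows, while the supported tool versions live only in the
free-text comment.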

Thanks,
Clare

-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759


Re: [galaxy-dev] How to remove a broken toolshed install

2012-10-29 Thread Clare Sloggett
Hi Greg,

 We had some occasions where we'd try to install a tool from the
 toolshed, and it would hang - it appeared that the hg pull was timing
 out.

 Was the timeout a regular occurrence?  if so, do you know that cause, and
 were you able to get it resolved?

It was repeated, but after a few tries the install would succeed
without me really 'resolving' the issue. We haven't run into the issue
in a while and I honestly have no idea if this is due to a galaxy
upgrade or if we had a temporary run of bad luck with those particular
tools.


 The October 5, 2012 Galaxy distribution news brief includes the following
 link to information about the process for handling repository installation
 errors, specifically when the errors occur during cloning.

 http://wiki.g2.bx.psu.edu/InstallingRepositoriesToGalaxy#Handling_repository_installation_errors

 If you're running an older version of Galaxy, you'll need to update to the
 October 5 release in order to have these features.

 The news brief release information is:

 upgrade: $ hg pull -u -r b5bda7a5c345

 Let me know if this is not what you're looking for.


Thanks again, this is great. I think we have upgraded past that point now.

Clare

-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759


[galaxy-dev] Illumina adaptor sequences in tools - copyright?

2012-10-29 Thread Clare Sloggett
Hi all,

We are looking at wrapping trimmomatic (
http://www.usadellab.org/cms/index.php?page=trimmomatic ). However to
run, it requires the Illumina adaptor sequences, which are copyright.
I was wondering if anyone has already dealt with this issue when
wrapping other tools and putting them up on a public galaxy instance
or in the Toolshed. For instance, I think that FastQC requires the
same sequences.

I would imagine that Illumina wouldn't want to stop people using these
sequences for analysis purposes, but I'm still thinking we might need
some sort of permission.

Have others dealt with this?

Thanks,
Clare

-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759


Re: [galaxy-dev] How to remove a broken toolshed install

2012-10-17 Thread Clare Sloggett
Hi Greg,

Thanks for this!

On 17 October 2012 01:17, Greg Von Kuster g...@bx.psu.edu wrote:

 I managed to break a toolshed-installed tool by fiddling with the
 files under shed_tools.

 As you've discovered, this is not a good thing to try.  Always use the Galaxy 
 interface features to perform tasks like this.


Actually, the reason I did this was because I didn't know how to solve
a different problem, so maybe I should ask you about that one as well.

We had some occasions where we'd try to install a tool from the
toolshed, and it would hang - it appeared that the hg pull was timing
out. In these cases the config files wouldn't get set up, but a
partial repository was pulled / directories were created, and the
repository files would then get in the way of trying to install the
tool (it seemed to think it was already there). The only way to fix it
seemed to be to manually delete the partially-pulled repository under
shed_tools. This worked fine for fixing failed installs. But, this
time, I thought (wrongly) that this had happened again and I deleted a
repository - then realised that it was actually installed and
registered in the database, etc.

So, if the hg pull times out, is there a right way to clean up the
resulting files? I got in the habit of doing it manually, which of
course is dangerous, because I didn't know any way to do it via the
admin interface.



 Depending on the changes you've made, you should be able to do the following:

 1. Manually remove the installed repository subdirectory hierarchy from disk.
 2. If the repository included any tools, manually remove entries for each of 
 them from the shed_tool-conf.xml file ( or the equivalent file you have 
 configured for handling installed repositories )
 3. Manually update the database using the following command (assuming your 
 installed repository is named 'bcftools_view' and it is the only repository 
 you have installed with that name) - letter capitalization is required:

 The following assumes you're using postgres:

 update tool_shed_repository set deleted=True, uninstalled=True,
 status='Uninstalled', error_message=NULL where name = 'bcftools_view';


Thanks very much! Yes it's postgres. I'll let you know if I succeed.

Clare

-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759


[galaxy-dev] How to remove a broken toolshed install

2012-10-15 Thread Clare Sloggett
Hi all,

I managed to break a toolshed-installed tool by fiddling with the
files under shed_tools. This led to a situation in which the Galaxy
admin interface claims the tool is still installed, but can't find any
files for it. I manually put the repository files where I think they
should go, but this didn't fix the situation, so what I really want to
do is just get rid of it altogether and reinstall cleanly. I'm not
certain that the tool was working properly before I fiddled with it,
either.

Galaxy won't let me uninstall, deactivate or update it (because it
can't find it properly) and it won't let me install it (because it
thinks it's installed). It also seems (judging by the last of the
errors below) to be unable to find some config information that it
expects, but I don't really understand what's going on there.

So my question is: given a messy, screwed up install, how can I
completely remove it and start from scratch? What are the different
components and config files I need to remove it from and are they all
manually accessible?

Thanks in advance for any help!


If it's relevant to my question, here are some of the behaviours I see
currently:

The tool appears as "Installed" under Admin > Manage installed tool
shed repositories, but doesn't show up in the tools panel.

If I try Repository Actions > "Get repository updates", I get the error:
"The directory containing the installed repository named
'bcftools_view' cannot be found."

But if I try Repository Actions > "Reset repository metadata", it
apparently works; I get:
"Metadata has been reset on repository bcftools_view."

And, if I try to 'Deactivate or uninstall' the apparently-installed
repository, I get:

URL: 
http://galaxy-tut.genome.edu.au/admin_toolshed/deactivate_or_uninstall_repository?id=a25e134c184d6e4b
Module paste.exceptions.errormiddleware:144 in __call__
  app_iter = self.application(environ, sr_checker)
Module paste.debug.prints:106 in __call__
  environ, self.app)
Module paste.wsgilib:543 in intercept_output
  app_iter = application(environ, replacement_start_response)
Module paste.recursive:84 in __call__
  return self.application(environ, start_response)
Module paste.httpexceptions:633 in __call__
  return self.application(environ, start_response)
Module galaxy.web.framework.base:160 in __call__
  body = method( trans, **kwargs )
Module galaxy.web.framework:205 in decorator
  return func( self, trans, *args, **kwargs )
Module galaxy.webapps.galaxy.controllers.admin_toolshed:452 in
deactivate_or_uninstall_repository
  remove_from_tool_panel( trans, tool_shed_repository, shed_tool_conf, 
 uninstall=remove_from_disk_checked )
Module galaxy.util.shed_util:1781 in remove_from_tool_panel
  tool_panel_dict = generate_tool_panel_dict_from_shed_tool_conf_entries( 
 trans, repository )
Module galaxy.util.shed_util:942 in
generate_tool_panel_dict_from_shed_tool_conf_entries
  tree = util.parse_xml( shed_tool_conf )
Module galaxy.util:135 in parse_xml
  tree = ElementTree.parse(fname)
Module elementtree.ElementTree:859 in parse
Module elementtree.ElementTree:576 in parse
TypeError: coercing to Unicode: need string or buffer, NoneType found



Thanks,
Clare

-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759


Re: [galaxy-dev] patch contribution (was Re: So I think I fixed a bug.)

2012-08-28 Thread Clare Sloggett





-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759



[galaxy-dev] Pull request, and missing step connections

2012-08-28 Thread Clare Sloggett
Hi guys,

I made a very small code change so that the workflows API will display
all the steps in a workflow, not just the inputs (these are still
displayed separately, as before).  I made a pull request even though
the change is small, just to learn what I'm doing:
https://bitbucket.org/galaxy/galaxy-central/pull-request/68/show-workflow-steps-and-connectors-in-api

But in doing so I noticed something in the workflows I don't
understand. I can't see a connection between workflow input datasets
and the steps they are inputting to. I doubt this is a bug, I just
don't know how it's supposed to work, so I'm not sure if my API change
is sufficient.

So for instance, if I create a small workflow with steps
Input Dataset -> TopHat (accepted_hits bam file) -> Cufflinks
and call the API on this workflow, I now see

{
id: f2db41e1fa331b3e,
inputs: {
1: {
label: Input SE fastq,
value: 
}
},
name: Tophat + cufflinks,
steps: {
1: {
id: 1,
input_steps: {},
tool_id: null,
type: data_input
},
2: {
id: 2,
input_steps: {},
tool_id: tophat,
type: tool
},
3: {
id: 3,
input_steps: {
input: {
source_step: 2,
step_output: accepted_hits
}
},
tool_id: cufflinks,
type: tool
}
},
url: /api/workflows/f2db41e1fa331b3e
}

The "inputs" field was there before; the "steps" field is the new bit.
So as expected, step 3 lists step 2 as an input. However step 2 does
not list step 1 as an input, even though the GUI shows that they are
connected and the workflow works. From other testing it seems that
Input Dataset steps never appear wired up in my API response. I'm
just iterating over all input_connections, so apparently steps of type
data_input are not in the list of WorkflowStep.input_connections .

Is this how it should be? How can I find out which steps an Input
Dataset is connected to?
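
To illustrate the gap, here is the "steps" payload above walked in Python
(a sketch; the upstream helper is hypothetical). Step 3 correctly reports
step 2 as its input, but step 2 reports nothing, even though the GUI shows
the data_input step wired to it:

```python
# The "steps" field from the API response above, as Python data.
steps = {
    "1": {"id": 1, "input_steps": {}, "tool_id": None, "type": "data_input"},
    "2": {"id": 2, "input_steps": {}, "tool_id": "tophat", "type": "tool"},
    "3": {"id": 3,
          "input_steps": {"input": {"source_step": 2,
                                    "step_output": "accepted_hits"}},
          "tool_id": "cufflinks", "type": "tool"},
}

def upstream(steps, step_id):
    """IDs of the steps feeding `step_id`, as recorded in input_steps."""
    conns = steps[str(step_id)]["input_steps"].values()
    return sorted(c["source_step"] for c in conns)
```

With this data, upstream(steps, 3) yields [2] but upstream(steps, 2) is
empty, which is the missing data_input connection described above.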

Thanks,
Clare

-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759


Re: [galaxy-dev] Using the tools API

2012-08-28 Thread Clare Sloggett
Hi Jeremy,

OK, that makes sense. Thanks again!

Clare

On 24 August 2012 02:17, Jeremy Goecks jeremy.goe...@emory.edu wrote:
 I think that handle_input() executes the tool?

 That's the intention and it should work but it hasn't been tested.

 Also, separately there
 is a method called _run_tool()  (although unlike _rerun_tool() I can't
 see anything that calls it).

 Looks like _run_tool is almost a copy of what's in create(). This is probably 
 legacy code from refactoring that hasn't been cleaned up yet.

 So, I thought from looking at the surface, that the tool-running code
 was there and that I just didn't know what data structure to pass into
 payload['inputs'] .  Is it not doing what I think?

 I think your inference is correct, but, yes, there's the problem of 
 specifying the tool input data structure. Tool inputs are specified as 
 dictionaries (often with nested dictionaries for things like conditionals), 
 so you could construct an appropriate input dictionary and could (likely) run 
 a tool. However, there's no help in the API right now to help you construct 
 an appropriate dictionary for a tool; this is the big missing piece in the 
 tools API.

 Best,
 J.
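
 To make the point about nested dictionaries concrete, here is what a
 tool-inputs dictionary might look like. Everything below is hypothetical:
 the tool id, parameter names, and conditional are invented for illustration;
 only the shape (flat parameters plus a nested dict for a conditional)
 follows the description above:

```python
# Hypothetical payload for running a tool via the API; the nested
# "adapter_cond" dict groups a conditional's selector together with
# its dependent parameters.
payload = {
    "tool_id": "example_filter",          # invented tool id
    "history_id": "fb4122d2ca33443e",
    "inputs": {
        "input": {"src": "hda", "id": "cb5d3b9eef2b9275"},
        "quality_cutoff": 20,
        "adapter_cond": {
            "use_adapter": "yes",         # the conditional's selector
            "adapter_seq": "AGATCGGAAGAGC",
        },
    },
}
```

 The missing piece Jeremy describes is that nothing in the API yet tells a
 client which keys and nesting a given tool expects, so a structure like this
 currently has to be built by reading the tool's XML by hand.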




-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759



Re: [galaxy-dev] egg distribution error when running galaxy-central

2012-08-23 Thread Clare Sloggett
Hi Nate & all,

I see - enthought changes the default python version, and virtualenv
was giving me a python version based on the version I used to run
virtualenv.
If I run
/usr/bin/python virtualenv.py galaxy_central_syspython
. galaxy_central_syspython/bin/activate

and then pull galaxy-central and run it, I don't get any egg errors,
either the build-by-hand ones or the more serious error that stopped
me originally.

Thanks!

Clare

On 22 August 2012 01:26, Nate Coraor n...@bx.psu.edu wrote:
 Hi Clare,

 If you use the system python, or a build from python.org, you should be fine. 
  It's Enthought python, which is only built for a single architecture, that's 
 the reason for having to do so much manual egg building.

 --nate

 On Aug 19, 2012, at 7:36 AM, Tomithy Too wrote:

 Hi Clare,

 I ran into the same problem as well when I upgraded my galaxy-central 
 version. I am running Mac Os10.6.8

 What I did was run the command $ pip install fabric

 It manually fetches the latest version of fabric from pip
 (http://www.pip-installer.org/en/latest/index.html), which is a package
 manager for Python, along with its dependencies ssh and pycrypto, which are
 the components causing the problem. I think it might be due to an erroneous
 version of the egg hosted on galaxy.

 Works fine after that for me.

 Cheers
 Tomithy



 On Wed, Aug 15, 2012 at 10:55 AM, Clare Sloggett s...@unimelb.edu.au wrote:
 Hi Scott,

 Thanks very much for this!

 virtualenv is ok I think:
 clare$ echo $PATH
 /Users/clare/galaxy/galaxy_central_env/bin: .

 which is where I set up my environment.

 I'm not using anything in particular outside Enthought, that I can
 think of. Enthought packages up a whole lot of things including scipy.

 The strange thing is that galaxy-dist runs but galaxy-central doesn't.
 So, I was hoping it would actually be a temporary bug in the egg
 distribution, but it sounds like the problem really is my environment.
 I don't understand how Enthought can be causing problems that
 virtualenv can't work around, but I've never really understood how
 python is structured in OSX! So I think it's probably worth me going
 through the effort of setting up a working environment in an ubuntu VM
 rather than running it on my Mac - I don't want to be asking you to
 pull code changes from an environment that's unusual.

 I'm setting it up in VirtualBox ubuntu now (which has python 2.7.1).
 So far I've pulled the code into the vm and run it, without
 virtualenv, and it gives none of the errors I see on the Mac. My plan
 is to both share the drive containing galaxy-central and share the
 network so that I can do both the editing and the browsing on my host
 machine, but if there are better ways advice is welcome!

 Thanks,
 Clare



 On 2 August 2012 07:26, Scott McManus scottmcma...@gatech.edu wrote:
 
  I haven't been able to reproduce this yet with the instructions you
  gave, but I'm not using the same environment. Can you give me an idea
  of what tools you're using outside of SciPy/NumPy/Enthought stuff?
 
  There is the possibility that the virtualenv.py script isn't being
  sourced correctly. We can check if it's actually using the correct
  environment by calling echo $PATH and checking that the path is
  pointing to the virtual environment. For example, I installed
  virtualenv stuff under /home/smcmanus/clare/galaxy_env/bin, and
  I got:
  (galaxy_env)$ echo $PATH
  /home/smcmanus/clare/galaxy_env/bin:/usr/local/bin:other stuff deleted
 
  -Scott
 
  - Original Message -
  Hi all,
 
  I'm trying to run galaxy-central on my laptop in order to play around
  with some changes, and I'm having trouble getting it to run. I can
  run
  galaxy-dist without problems and have been working with that (so its
  eggs are all installed already), but now I want to create a pull
  request so want to run galaxy-dist. I'm not trying to install any
  extra tools or data, just the code.
 
  I'm running on OSX 10.7.4 and using virtualenv. I have Enthought
  installed, and I assume I will be using its version of python by
  default. The default python seems to be 2.7.3.
 
  I'm using the same virtualenv environment for galaxy-dist and
  galaxy-central (though it doesn't seem to matter if I give
  galaxy-central its own environment, I see the same error). So the
  steps were:
  - create a virtualenv environment and activate it
  - get galaxy-dist and call run.sh - it asked me to build quite a lot
  of dependencies myself, which was just a matter of running the
  requested commands, and then it worked with no problems.
  - shut down galaxy-dist, and in another directory, get galaxy-central
  and call run.sh. I think it asked me to build a couple of
  dependencies, but then it gives up with the following:
 
  (galaxy_env)Clares-MacBook-Pro:galaxy-central clare$ sh run.sh
  --reload
  Some eggs are out of date, attempting to fetch...
  Warning: MarkupSafe (a dependent egg of Mako) cannot be fetched
  Warning: pycrypto

[galaxy-dev] Using the tools API

2012-08-23 Thread Clare Sloggett
Hi guys,

The Tools API is currently working for me from galaxy-central, but I'm
not sure how to correctly run a tool. Are there any example scripts,
as there are for some other parts of the API? Specifically, I want to
find out what the expected payload fields are when I POST to create() to
run a tool.

Some of the fields are clear to me just from the api/tools.py code
(e.g. 'tool_id') but others are not (e.g. how the input datasets and
parameters are specified).

A separate question:

How do we specify "Advanced" or conditional-dependent fields for a
tool? Some of these fields are necessary to run the tool at all.
For instance, on my system, calling
http://localhost:8080/api/tools/tophat?key=
returns
{
    "description": "Find splice junctions using RNA-seq data",
    "id": "tophat",
    "inputs": [
        {
            "html": "%3Cselect%20name%3D%22input1%22%3E%0A%3C/select%3E",
            "label": "RNA-Seq FASTQ file",
            "name": "input1",
            "type": "data"
        },
        {
            "label": "Conditional (refGenomeSource)",
            "name": "refGenomeSource"
        },
        {
            "label": "Conditional (singlePaired)",
            "name": "singlePaired"
        }
    ],
    "name": "Tophat for Illumina",
    "version": "1.5.0"
}

This is obviously only some of the inputs you see in the UI. I think
that all the "Advanced" fields are missing, and more importantly, any
input which is dependent on a conditional is missing. So the
refGenomeSource conditional is there, but the actual reference genome
field is not. The type of the reference genome field also presumably
depends on which value is supplied for the refGenomeSource
conditional.

Is there currently a way to specify (or see) these missing fields?

Thanks,
Clare


-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/
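
For readers following along: a sketch of what such a POST might look like. The
'tool_id' and 'inputs' field names come from api/tools.py, but everything else
here -- the 'history_id' field, the way datasets are encoded, the example ids
-- is an assumption, since the create path was undocumented at the time.

```python
import json

# Guessed payload shape: api/tools.py reads payload['tool_id'] and
# payload['inputs'], but the exact encoding of 'inputs' (datasets,
# conditional parameters) was undocumented at the time, so the values
# below are placeholders, not a confirmed format.
def build_tool_payload(tool_id, history_id, inputs):
    """Assemble a JSON body for a POST to /api/tools (speculative)."""
    return {"tool_id": tool_id, "history_id": history_id, "inputs": inputs}

payload = build_tool_payload(
    tool_id="tophat",
    history_id="some-history-id",          # placeholder
    inputs={"input1": "some-dataset-id"},  # placeholder dataset reference
)
print(json.dumps(payload, indent=2))

# The request itself would be an authenticated POST, roughly:
#   POST http://localhost:8080/api/tools?key=YOUR_API_KEY
# with Content-Type: application/json and the body printed above.
```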


Re: [galaxy-dev] Using the tools API

2012-08-23 Thread Clare Sloggett
Hi Jeremy,

Thanks for the info!

I am confused, though, because the code in tools.py was what made me
think I could run a tool with specified inputs, i.e. I was looking at:

def create( ... )

# Set up inputs.
inputs = payload[ 'inputs' ]

params = util.Params( inputs, sanitize = False )
template, vars = tool.handle_input( trans, params.__dict__ )


I think that handle_input() executes the tool? Also, separately there
is a method called _run_tool()  (although unlike _rerun_tool() I can't
see anything that calls it).

So, from looking at the surface, I thought that the tool-running code
was there and that I just didn't know what data structure to pass into
payload['inputs']. Is it not doing what I think?

Thanks,
Clare


On 23 August 2012 23:00, Jeremy Goecks jeremy.goe...@emory.edu wrote:
 Unfortunately, the tools API isn't at all complete right now. It was
 driven by Trackster/Sweepster needs, so rerunning tools works well but
 running tools from scratch doesn't.

 Practically, this means that the things you want to do, e.g.

 *view tool parameters;
 *set tool input datasets;

 are not yet supported.

 As always, community contributions are welcome and encouraged.

 Best,
 J.

 On Aug 23, 2012, at 4:10 AM, Clare Sloggett wrote:

 Hi guys,

 The Tools API is currently working for me from galaxy-central, but I'm
 not sure how to correctly run a tool. Are there any example scripts,
 as there are for some other parts of the API? Specifically, I want to
 find out what the expected payload fields are when I POST to create() to
 run a tool.

 Some of the fields are clear to me just from the api/tools.py code
 (e.g. 'tool_id') but others are not (e.g. how the input datasets and
 parameters are specified).

 A separate question:

 How do we specify "Advanced" or conditional-dependent fields for a
 tool? Some of these fields are necessary to run the tool at all.
 For instance, on my system, calling
 http://localhost:8080/api/tools/tophat?key=
 returns
 {
     "description": "Find splice junctions using RNA-seq data",
     "id": "tophat",
     "inputs": [
         {
             "html": "%3Cselect%20name%3D%22input1%22%3E%0A%3C/select%3E",
             "label": "RNA-Seq FASTQ file",
             "name": "input1",
             "type": "data"
         },
         {
             "label": "Conditional (refGenomeSource)",
             "name": "refGenomeSource"
         },
         {
             "label": "Conditional (singlePaired)",
             "name": "singlePaired"
         }
     ],
     "name": "Tophat for Illumina",
     "version": "1.5.0"
 }

 This is obviously only some of the inputs you see in the UI. I think
 that all the "Advanced" fields are missing, and more importantly, any
 input which is dependent on a conditional is missing. So the
 refGenomeSource conditional is there, but the actual reference genome
 field is not. The type of the reference genome field also presumably
 depends on which value is supplied for the refGenomeSource
 conditional.

 Is there currently a way to specify (or see) these missing fields?

 Thanks,
 Clare








-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759


Re: [galaxy-dev] egg distribution error when running galaxy-central

2012-08-20 Thread Clare Sloggett
Hi Tomithy,

Thanks, this worked for me too!

Just to be clear for interested devs:

If I run galaxy-dist on my Mac it asks me to build a whole series of
eggs by hand using scripts/scramble.py, and if I follow these
instructions, Galaxy runs. A bit tedious, but trivial to do.

If I run galaxy-central on my Mac the same thing happens for a few
dependencies, but then it gets stuck at the error I posted originally.
If I run `pip install fabric` as Tomithy suggests then I get the same
results as running galaxy-dist, i.e. Galaxy works after using
scramble.py a few times.

If I run galaxy-central (or presumably galaxy-dist) on ubuntu it
doesn't complain about any of the dependencies, doesn't get stuck, and
doesn't ask me to build any eggs by hand.

So I'm now wondering if, for code editing, I should use the ubuntu
environment I've set up even though the code is working natively on
the mac, just to avoid future complications.

Cheers,
Clare

On 19 August 2012 21:36, Tomithy Too tomithy@gmail.com wrote:
 Hi Clare,

 I ran into the same problem as well when I upgraded my galaxy-central
 version. I am running Mac OS 10.6.8.

 What I did to get it working was to run the command: $ pip install fabric

 This manually fetches the latest version of Fabric via pip
 (http://www.pip-installer.org/en/latest/index.html), a package
 manager for Python, along with its dependencies ssh and pycrypto, which are
 the components causing the problem. I think it might be due to an erroneous
 version of the egg hosted on Galaxy.

 Works fine after that for me.

 Cheers
 Tomithy



 On Wed, Aug 15, 2012 at 10:55 AM, Clare Sloggett s...@unimelb.edu.au
 wrote:

 Hi Scott,

 Thanks very much for this!

 virtualenv is ok I think:
 clare$ echo $PATH
 /Users/clare/galaxy/galaxy_central_env/bin: .

 which is where I set up my environment.

 I'm not using anything in particular outside Enthought, that I can
 think of. Enthought packages up a whole lot of things including scipy.

 The strange thing is that galaxy-dist runs but galaxy-central doesn't.
 So, I was hoping it would actually be a temporary bug in the egg
 distribution, but it sounds like the problem really is my environment.
 I don't understand how Enthought can be causing problems that
 virtualenv can't work around, but I've never really understood how
 python is structured in OSX! So I think it's probably worth me going
 through the effort of setting up a working environment in an ubuntu VM
 rather than running it on my Mac - I don't want to be asking you to
 pull code changes from an environment that's unusual.

 I'm setting it up in VirtualBox ubuntu now (which has python 2.7.1).
 So far I've pulled the code into the vm and run it, without
 virtualenv, and it gives none of the errors I see on the Mac. My plan
 is to both share the drive containing galaxy-central and share the
 network so that I can do both the editing and the browsing on my host
 machine, but if there are better ways advice is welcome!

 Thanks,
 Clare



 On 2 August 2012 07:26, Scott McManus scottmcma...@gatech.edu wrote:
 
  I haven't been able to reproduce this yet with the instructions you
  gave, but I'm not using the same environment. Can you give me an idea
  of what tools you're using outside of SciPy/NumPy/Enthought stuff?
 
  There is the possibility that the virtualenv.py script isn't being
  sourced correctly. We can check if it's actually using the correct
  environment by calling echo $PATH and checking that the path is
  pointing to the virtual environment. For example, I installed
  virtualenv stuff under /home/smcmanus/clare/galaxy_env/bin, and
  I got:
  (galaxy_env)$ echo $PATH
  /home/smcmanus/clare/galaxy_env/bin:/usr/local/bin:other stuff deleted
 
  -Scott
 
  - Original Message -
  Hi all,
 
  I'm trying to run galaxy-central on my laptop in order to play around
  with some changes, and I'm having trouble getting it to run. I can
  run
  galaxy-dist without problems and have been working with that (so its
  eggs are all installed already), but now I want to create a pull
  request so want to run galaxy-dist. I'm not trying to install any
  extra tools or data, just the code.
 
  I'm running on OSX 10.7.4 and using virtualenv. I have Enthought
  installed, and I assume I will be using its version of python by
  default. The default python seems to be 2.7.3.
 
  I'm using the same virtualenv environment for galaxy-dist and
  galaxy-central (though it doesn't seem to matter if I give
  galaxy-central its own environment, I see the same error). So the
  steps were:
  - create a virtualenv environment and activate it
  - get galaxy-dist and call run.sh - it asked me to build quite a lot
  of dependencies myself, which was just a matter of running the
  requested commands, and then it worked with no problems.
  - shut down galaxy-dist, and in another directory, get galaxy-central
  and call run.sh. I think it asked me to build a couple of
  dependencies, but then it gives up with the following

Re: [galaxy-dev] egg distribution error when running galaxy-central

2012-08-14 Thread Clare Sloggett
Hi Scott,

Thanks very much for this!

virtualenv is ok I think:
clare$ echo $PATH
/Users/clare/galaxy/galaxy_central_env/bin: .

which is where I set up my environment.

I'm not using anything in particular outside Enthought, that I can
think of. Enthought packages up a whole lot of things including scipy.

The strange thing is that galaxy-dist runs but galaxy-central doesn't.
So, I was hoping it would actually be a temporary bug in the egg
distribution, but it sounds like the problem really is my environment.
I don't understand how Enthought can be causing problems that
virtualenv can't work around, but I've never really understood how
python is structured in OSX! So I think it's probably worth me going
through the effort of setting up a working environment in an ubuntu VM
rather than running it on my Mac - I don't want to be asking you to
pull code changes from an environment that's unusual.

I'm setting it up in VirtualBox ubuntu now (which has python 2.7.1).
So far I've pulled the code into the vm and run it, without
virtualenv, and it gives none of the errors I see on the Mac. My plan
is to both share the drive containing galaxy-central and share the
network so that I can do both the editing and the browsing on my host
machine, but if there are better ways advice is welcome!

Thanks,
Clare



On 2 August 2012 07:26, Scott McManus scottmcma...@gatech.edu wrote:

 I haven't been able to reproduce this yet with the instructions you
 gave, but I'm not using the same environment. Can you give me an idea
 of what tools you're using outside of SciPy/NumPy/Enthought stuff?

 There is the possibility that the virtualenv.py script isn't being
 sourced correctly. We can check if it's actually using the correct
 environment by calling echo $PATH and checking that the path is
 pointing to the virtual environment. For example, I installed
 virtualenv stuff under /home/smcmanus/clare/galaxy_env/bin, and
 I got:
 (galaxy_env)$ echo $PATH
 /home/smcmanus/clare/galaxy_env/bin:/usr/local/bin:other stuff deleted

 -Scott

 - Original Message -
 Hi all,

 I'm trying to run galaxy-central on my laptop in order to play around
 with some changes, and I'm having trouble getting it to run. I can
 run
 galaxy-dist without problems and have been working with that (so its
 eggs are all installed already), but now I want to create a pull
 request so want to run galaxy-dist. I'm not trying to install any
 extra tools or data, just the code.

 I'm running on OSX 10.7.4 and using virtualenv. I have Enthought
 installed, and I assume I will be using its version of python by
 default. The default python seems to be 2.7.3.

 I'm using the same virtualenv environment for galaxy-dist and
 galaxy-central (though it doesn't seem to matter if I give
 galaxy-central its own environment, I see the same error). So the
 steps were:
 - create a virtualenv environment and activate it
 - get galaxy-dist and call run.sh - it asked me to build quite a lot
 of dependencies myself, which was just a matter of running the
 requested commands, and then it worked with no problems.
 - shut down galaxy-dist, and in another directory, get galaxy-central
 and call run.sh. I think it asked me to build a couple of
 dependencies, but then it gives up with the following:

 (galaxy_env)Clares-MacBook-Pro:galaxy-central clare$ sh run.sh --reload
 Some eggs are out of date, attempting to fetch...
 Warning: MarkupSafe (a dependent egg of Mako) cannot be fetched
 Warning: pycrypto (a dependent egg of Fabric) cannot be fetched
 Warning: simplejson (a dependent egg of WebHelpers) cannot be fetched
 Fetched http://eggs.g2.bx.psu.edu/ssh/ssh-1.7.14-py2.7.egg
 One of Galaxy's managed eggs depends on something which is missing,
 this is almost certainly a bug in the egg distribution.
 Dependency "ssh" requires "pycrypto>=2.1,!=2.4"
 Traceback (most recent call last):
   File "./scripts/fetch_eggs.py", line 30, in <module>
     c.resolve() # Only fetch eggs required by the config
   File "/Users/clare/galaxy/galaxy-central/lib/galaxy/eggs/__init__.py", line 345, in resolve
     egg.resolve()
   File "/Users/clare/galaxy/galaxy-central/lib/galaxy/eggs/__init__.py", line 168, in resolve
     dists = pkg_resources.working_set.resolve( ( self.distribution.as_requirement(), ), env, self.fetch )
   File "/Users/clare/galaxy/galaxy_env/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/pkg_resources.py", line 569, in resolve
     raise VersionConflict(dist,req) # XXX put more info here
 pkg_resources.VersionConflict: (ssh 1.7.14 (/Users/clare/galaxy/galaxy-central/eggs/ssh-1.7.14-py2.7.egg), Requirement.parse('pycrypto>=2.1,!=2.4'))
 Fetch failed.

 Any idea what is causing this?

 Thanks,
 Clare

 --

 Clare Sloggett
 Research Fellow / Bioinformatician
 Life Sciences Computation Centre
 Victorian Life Sciences Computation Initiative
 University of Melbourne, Parkville Campus
 187 Grattan Street, Carlton, Melbourne
 Victoria 3010, Australia
 Ph: 03 903 53357
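
The VersionConflict in this thread is mechanical: pycrypto 2.4 satisfies the
>=2.1 bound but is explicitly excluded by !=2.4, so pkg_resources rejects the
set. The same check can be re-implemented standalone as a sketch (Galaxy itself
uses pkg_resources; this simplified version is only for illustration and
handles plain dotted versions, not full PEP-style specifiers):

```python
import re

def parse_version(v):
    # "2.1" -> (2, 1); tuples compare element-wise, like versions do
    return tuple(int(p) for p in v.split("."))

def satisfies(version, requirement):
    """Check a version against a requirement like 'pycrypto>=2.1,!=2.4'.

    A simplified stand-in for what pkg_resources does internally: every
    comma-separated clause must hold for the version to be accepted.
    """
    ops = {
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        "==": lambda a, b: a == b,
        "!=": lambda a, b: a != b,
        ">":  lambda a, b: a > b,
        "<":  lambda a, b: a < b,
    }
    v = parse_version(version)
    for clause in requirement.split(","):
        m = re.search(r"(>=|<=|==|!=|>|<)([\d.]+)", clause)
        if not m:
            continue  # clause carries no version constraint
        op, bound = m.group(1), parse_version(m.group(2))
        if not ops[op](v, bound):
            return False
    return True

# pycrypto 2.4 is >= 2.1 but explicitly excluded, hence the conflict
print(satisfies("2.4", "pycrypto>=2.1,!=2.4"))   # False
print(satisfies("2.6", "pycrypto>=2.1,!=2.4"))   # True
```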

Re: [galaxy-dev] egg distribution error when running galaxy-central

2012-07-31 Thread Clare Sloggett
On 1 August 2012 15:28, Clare Sloggett s...@unimelb.edu.au wrote:
I can run
 galaxy-dist without problems and have been working with that (so its
 eggs are all installed already), but now I want to create a pull
 request so want to run galaxy-dist.

oops, of course I mean 'so want to run galaxy-central.'




-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357  M: 0414 854 759


Re: [galaxy-dev] Citations for tools

2012-06-21 Thread Clare Sloggett
Hi Peter,

Thanks, I didn't realise it had been discussed!

I don't know what would be a good markup system for citations.
However, the current situation is that people are putting their
citations into the help tag with no special markup, and it seems to
work reasonably well. Maybe a simple field is all that's needed?

Clare

On 18 June 2012 19:50, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Mon, Jun 18, 2012 at 10:29 AM, Clare Sloggett s...@unimelb.edu.au wrote:
 Hi all,

 I'd like to suggest, or request, a feature - I think that posting to
 galaxy-dev is the right place to start?

 After I've done an analysis, it would be useful to be given a list of
 references for all the tools I used in that history, which I could use
 to cite the appropriate papers.

 At the moment, it seems that most tool developers add a "please cite
 the following paper" note to the help tag in the wrapper so that it
 displays on the tool screen before you run it. I'd like to suggest:
 * adding a "cite" tag to the tool wrapper's XML,
 * adding a feature to the history UI which will list all the
 references to cite for a history.

 I think this would encourage people to cite the tools they use
 properly and hence encourage developers to put their tools into the
 toolshed! With the standard tools moving into the toolshed it will be
 really important for tool wrappers to be maintained.

 Any thoughts?

 Clare

 Hi Clare,

 We talked about this at the end of last year, and yes, it would
 be a good idea:
 http://lists.bx.psu.edu/pipermail/galaxy-dev/2011-December/007873.html

 Are you familiar enough with the area of semantic web/linked
 data to know what would be the best XML based markup to
 use for embedding the citations?

 Peter





-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357          M: 0414 854 759



[galaxy-dev] Citations for tools

2012-06-18 Thread Clare Sloggett
Hi all,

I'd like to suggest, or request, a feature - I think that posting to
galaxy-dev is the right place to start?

After I've done an analysis, it would be useful to be given a list of
references for all the tools I used in that history, which I could use
to cite the appropriate papers.

At the moment, it seems that most tool developers add a "please cite
the following paper" note to the help tag in the wrapper so that it
displays on the tool screen before you run it. I'd like to suggest:
* adding a "cite" tag to the tool wrapper's XML,
* adding a feature to the history UI which will list all the
references to cite for a history.

I think this would encourage people to cite the tools they use
properly and hence encourage developers to put their tools into the
toolshed! With the standard tools moving into the toolshed it will be
really important for tool wrappers to be maintained.

Any thoughts?

Clare

-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357          M: 0414 854 759

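
As an illustration of the proposal in this thread, a hypothetical "cite" tag
and a sketch of how a history view could aggregate citations might look like
the following. The tag name, its doi attribute, and the helper function are
all invented for illustration; no such tag existed in the tool XML schema at
the time of this discussion.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: a <cite> child of <tool>, carrying a DOI attribute
# and free-text citation. Purely a sketch of the proposal, not real schema.
TOOL_XML = """
<tool id="tophat" name="Tophat for Illumina">
  <cite doi="10.1093/bioinformatics/btp120">
    Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice
    junctions with RNA-Seq. Bioinformatics. 2009.
  </cite>
</tool>
"""

def citations_for_tools(tool_xml_docs):
    """Collect (tool name, doi, citation text) for every <cite> tag found."""
    refs = []
    for doc in tool_xml_docs:
        root = ET.fromstring(doc)
        for cite in root.findall("cite"):
            # normalize the whitespace of the free-text citation
            refs.append((root.get("name"), cite.get("doi"),
                         " ".join(cite.text.split())))
    return refs

# A history UI feature could run this over every tool used in the history:
for name, doi, text in citations_for_tools([TOOL_XML]):
    print(f"{name}: {text} (doi:{doi})")
```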


Re: [galaxy-dev] Interested in speaking with other institutions deploying Galaxy locally?

2012-05-09 Thread Clare Sloggett
.






-- 

Clare Sloggett
Research Fellow / Bioinformatician
Life Sciences Computation Centre
Victorian Life Sciences Computation Initiative
University of Melbourne, Parkville Campus
187 Grattan Street, Carlton, Melbourne
Victoria 3010, Australia
Ph: 03 903 53357          M: 0414 854 759



Re: [galaxy-dev] [galaxy-user] Using Galaxy Cloudman for a workshop

2011-12-01 Thread Clare Sloggett
Right! I did think to look for a 'share this cluster' command, I just
failed to find it. It all makes sense now, thanks.

On Thu, Dec 1, 2011 at 7:34 PM, Enis Afgan eaf...@emory.edu wrote:
 Hi Clare,
 The share string is generated when you share a cluster. The string is
 accessible on the shared cluster, when you click the green 'Share a cluster'
 icon next to the cluster name and then the top link Shared instances. You
 will get a list of the point in time shares of the cluster you have created.
 The share string will look something like
 this cm-cd53Bfg6f1223f966914df347687f6uf32/shared/2011-10-19--03-14
 You simply paste that string into the new cluster box you mentioned.
 Enis

 On Thu, Dec 1, 2011 at 6:31 AM, Clare Sloggett s...@unimelb.edu.au wrote:

 Hi Enis, Jeremy, and all,

 Thanks so much for all your help. I have another question which I
 suspect is just me missing something obvious.

 I'm guessing that when you cloned the cluster for your workshop, you
 used CloudMan's 'share-an-instance' functionality?
 When I launch a new cluster which I want to be a copy of an existing
 cluster, and select the share-an-instance option, it asks for the
 cluster share-string. How can I find this string for my existing
 cluster?

 Or have I got completely the wrong idea - did you actually clone the
 instance using AWS functionality?

 Thanks,
 Clare

 On Mon, Nov 21, 2011 at 5:37 PM, Enis Afgan eaf...@emory.edu wrote:
  Hi Clare,
  I don't recall what instance type we used earlier, but I think an Extra
  Large Instance is going to be fine. Do note that the master node is also
  being used to run jobs. However, if it's loaded by just the web server,
  SGE
  will typically just not schedule jobs to it.
 
  As far as the core/thread/slot concern goes, SGE sees each core as a slot.
  Each job in Galaxy simply requires 1 slot, even if it uses multiple threads
  (i.e., cores). What this means is that nodes will probably get overloaded if
  only the same type of job is being run (BWA), but if analyses are being run
  that use multiple tools, jobs will get spread over the cluster to balance
  the overall load a bit better than by simply looking at the number of slots.
 
  Enis
 
  On Mon, Nov 21, 2011 at 4:34 AM, Clare Sloggett s...@unimelb.edu.au
  wrote:
 
  Hi Jeremy,
 
  Also if you do remember what kind of Amazon node you used,
  particularly for the cluster's master node (e.g. an 'xlarge' 4-core
  15GB or perhaps one of the 'high-memory' nodes?), that would be a
  reassuring sanity check for me!
 
  Cheers,
  Clare
 
  On Mon, Nov 21, 2011 at 10:37 AM, Clare Sloggett s...@unimelb.edu.au
  wrote:
   Hi Jeremy, Enis,
  
   That makes sense. I know I can configure how many threads BWA uses in
   its wrapper, with bwa -t. But, is there somewhere that I need to tell
   Galaxy the corresponding information, ie that this command-line task
   will make use of up to 4 cores?
  
   Or, does this imply that there is always exactly one job per node? So
   if I have (for instance) a cluster made of 4-core nodes, and a
   single-threaded task (e.g. samtools), are the other 3 cores just
   going
   to waste or will the scheduler allocate multiple single-threaded jobs
   to one node?
  
   I've cc'd galaxy-dev instead of galaxy-user as I think the
   conversation has gone that way!
  
   Thanks again,
   Clare
  
  
   On Fri, Nov 18, 2011 at 2:36 PM, Jeremy Goecks
   jeremy.goe...@emory.edu
   wrote:
  
   On Fri, Nov 18, 2011 at 12:56 AM, Jeremy Goecks
   jeremy.goe...@emory.edu wrote:
  
   Scalability issues are more likely to arise on the back end than
   the
   front end, so you'll want to ensure that you have enough compute
   nodes. BWA
   uses four nodes by default--Enis, does the cloud config change
   this
   parameter?--so you'll want 4x50 or 200 total nodes if you want
   everyone to
   be able to run a BWA job simultaneously.
  
  
   Actually, one other question - this paragraph makes me realise that
   I
   don't really understand how Galaxy is distributing jobs. I had
   thought
   that each job would only use one node, and in some cases take
   advantage of multiple cores within that node. I'm taking a node
   to
   be a set of cores with their own shared memory, so in this case a
   VM
   instance, is this right? If some types of jobs can be distributed
   over
   multiple nodes, can I configure, in Galaxy, how many nodes they
   should
   use?
  
   You're right -- my word choices were poor. Replace 'node' with
   'core'
   in my paragraph to get an accurate suggestion for resources.
  
   Galaxy uses a job scheduler--SGE on the cloud--to distribute jobs to
   different cluster nodes. Jobs that require multiple cores typically
   run on a
   single node. Enis can chime in on whether CloudMan supports job
   submission
   over multiple nodes; this would require setup of an appropriate
   parallel
   environment and a tool that can make use of this environment.
  
   Good luck,
   J
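
The sizing arithmetic from this thread (50 simultaneous users, each running a
BWA job that uses 4 threads, with SGE counting one slot per core) can be
written down explicitly. The cores-per-instance figure below is an assumed
example for illustration; actual EC2 instance types vary.

```python
# Back-of-envelope cluster sizing for the workshop scenario discussed above.

def cores_needed(users, threads_per_job):
    """Total cores (SGE slots) so every user can run one job at once."""
    return users * threads_per_job

def instances_needed(total_cores, cores_per_instance):
    """Round up: a partially used instance still has to be running."""
    return -(-total_cores // cores_per_instance)

total = cores_needed(50, 4)
print(total)                       # 200 cores, matching the "4x50" figure
print(instances_needed(total, 4))  # 50 four-core instances (assumed size)
```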

[galaxy-dev] Removing nodes from a CloudMan instance

2011-11-30 Thread Clare Sloggett
Hi galaxy-devs,

Quick question: when using the cloud console on CloudMan, it's
possible to add different types of nodes (large, micro, etc) to the
virtual cluster using the 'Add Nodes' option at the top. I can also
remove a given number of nodes using the 'Remove Nodes' option at the
top. However, is there any way to control exactly which node (or more
importantly just which type of node) gets removed?

Thanks for any help!

Clare

-- 
E: s...@unimelb.edu.au
P: 03 903 53357
M: 0414 854 759


Re: [galaxy-dev] [galaxy-user] Using Galaxy Cloudman for a workshop

2011-11-30 Thread Clare Sloggett
Hi Enis, Jeremy, and all,

Thanks so much for all your help. I have another question which I
suspect is just me missing something obvious.

I'm guessing that when you cloned the cluster for your workshop, you
used CloudMan's 'share-an-instance' functionality?
When I launch a new cluster which I want to be a copy of an existing
cluster, and select the share-an-instance option, it asks for the
cluster share-string. How can I find this string for my existing
cluster?

Or have I got completely the wrong idea - did you actually clone the
instance using AWS functionality?

Thanks,
Clare

On Mon, Nov 21, 2011 at 5:37 PM, Enis Afgan eaf...@emory.edu wrote:
 Hi Clare,
 I don't recall what instance type we used earlier, but I think an Extra
 Large Instance is going to be fine. Do note that the master node is also
 being used to run jobs. However, if it's loaded by just the web server, SGE
 will typically just not schedule jobs to it.

 As far as the core/thread/slot concern goes, SGE sees each core as a slot.
 Each job in Galaxy simply requires 1 slot, even if it uses multiple threads
 (i.e., cores). What this means is that nodes will probably get overloaded if
 only the same type of job is being run (BWA), but if analyses are being run
 that use multiple tools, jobs will get spread over the cluster to balance
 the overall load a bit better than by simply looking at the number of slots.

 Enis

 On Mon, Nov 21, 2011 at 4:34 AM, Clare Sloggett s...@unimelb.edu.au wrote:

 Hi Jeremy,

 Also if you do remember what kind of Amazon node you used,
 particularly for the cluster's master node (e.g. an 'xlarge' 4-core
 15GB or perhaps one of the 'high-memory' nodes?), that would be a
 reassuring sanity check for me!

 Cheers,
 Clare

 On Mon, Nov 21, 2011 at 10:37 AM, Clare Sloggett s...@unimelb.edu.au
 wrote:
  Hi Jeremy, Enis,
 
  That makes sense. I know I can configure how many threads BWA uses in
  its wrapper, with bwa -t. But, is there somewhere that I need to tell
  Galaxy the corresponding information, ie that this command-line task
  will make use of up to 4 cores?
 
  Or, does this imply that there is always exactly one job per node? So
  if I have (for instance) a cluster made of 4-core nodes, and a
  single-threaded task (e.g. samtools), are the other 3 cores just going
  to waste or will the scheduler allocate multiple single-threaded jobs
  to one node?
 
  I've cc'd galaxy-dev instead of galaxy-user as I think the
  conversation has gone that way!
 
  Thanks again,
  Clare
 
 
  On Fri, Nov 18, 2011 at 2:36 PM, Jeremy Goecks jeremy.goe...@emory.edu
  wrote:
 
  On Fri, Nov 18, 2011 at 12:56 AM, Jeremy Goecks
  jeremy.goe...@emory.edu wrote:
 
  Scalability issues are more likely to arise on the back end than the
  front end, so you'll want to ensure that you have enough compute nodes. 
  BWA
  uses four nodes by default--Enis, does the cloud config change this
  parameter?--so you'll want 4x50 or 200 total nodes if you want everyone 
  to
  be able to run a BWA job simultaneously.
 
 
  Actually, one other question - this paragraph makes me realise that I
  don't really understand how Galaxy is distributing jobs. I had thought
  that each job would only use one node, and in some cases take
  advantage of multiple cores within that node. I'm taking a node to
  be a set of cores with their own shared memory, so in this case a VM
  instance, is this right? If some types of jobs can be distributed over
  multiple nodes, can I configure, in Galaxy, how many nodes they should
  use?
 
  You're right -- my word choices were poor. Replace 'node' with 'core'
  in my paragraph to get an accurate suggestion for resources.
 
  Galaxy uses a job scheduler--SGE on the cloud--to distribute jobs to
  different cluster nodes. Jobs that require multiple cores typically run 
  on a
  single node. Enis can chime in on whether CloudMan supports job submission
  over multiple nodes; this would require setup of an appropriate parallel
  environment and a tool that can make use of this environment.
 
  Good luck,
  J.
 
 
 
 
 
 
 
  --
  E: s...@unimelb.edu.au
  P: 03 903 53357
  M: 0414 854 759
 



 --
 E: s...@unimelb.edu.au
 P: 03 903 53357
 M: 0414 854 759





-- 
E: s...@unimelb.edu.au
P: 03 903 53357
M: 0414 854 759



Re: [galaxy-dev] [galaxy-user] Using Galaxy Cloudman for a workshop

2011-11-20 Thread Clare Sloggett
Hi Jeremy, Enis,

That makes sense. I know I can configure how many threads BWA uses in
its wrapper, with bwa -t. But, is there somewhere that I need to tell
Galaxy the corresponding information, i.e. that this command-line task
will make use of up to 4 cores?

Or, does this imply that there is always exactly one job per node? So
if I have (for instance) a cluster made of 4-core nodes, and a
single-threaded task (e.g. samtools), are the other 3 cores just going
to waste or will the scheduler allocate multiple single-threaded jobs
to one node?

I've cc'd galaxy-dev instead of galaxy-user as I think the
conversation has gone that way!

Thanks again,
Clare


On Fri, Nov 18, 2011 at 2:36 PM, Jeremy Goecks jeremy.goe...@emory.edu wrote:

 On Fri, Nov 18, 2011 at 12:56 AM, Jeremy Goecks jeremy.goe...@emory.edu 
 wrote:

 Scalability issues are more likely to arise on the back end than the front 
 end, so you'll want to ensure that you have enough compute nodes. BWA uses 
 four nodes by default--Enis, does the cloud config change this 
 parameter?--so you'll want 4x50 or 200 total nodes if you want everyone to 
 be able to run a BWA job simultaneously.


 Actually, one other question - this paragraph makes me realise that I
 don't really understand how Galaxy is distributing jobs. I had thought
 that each job would only use one node, and in some cases take
 advantage of multiple cores within that node. I'm taking a node to
 be a set of cores with their own shared memory, so in this case a VM
 instance, is this right? If some types of jobs can be distributed over
 multiple nodes, can I configure, in Galaxy, how many nodes they should
 use?

 You're right -- my word choices were poor. Replace 'node' with 'core' in my 
 paragraph to get an accurate suggestion for resources.

 Galaxy uses a job scheduler--SGE on the cloud--to distribute jobs to 
 different cluster nodes. Jobs that require multiple cores typically run on a 
 single node. Enis can chime in on whether CloudMan supports job submission 
 over multiple nodes; this would require setup of an appropriate parallel 
 environment and a tool that can make use of this environment.

 Good luck,
 J.







-- 
E: s...@unimelb.edu.au
P: 03 903 53357
M: 0414 854 759



[galaxy-dev] Missing requirements in xml wrappers in galaxy-dist?

2011-11-16 Thread Clare Sloggett
Hi James & all,

I have been getting some errors to do with the path environment
variable. For instance, when uploading a sam file to our local galaxy
instance, we got:
Traceback (most recent call last):
 File "/data/ugalaxy/galaxy-dist/tools/data_source/upload.py", line 394, in
   __main__()
...
 line 63, in _get_samtools_version
   output = subprocess.Popen( [ 'samtools' ], stderr=subprocess.PIPE,
stdout=subprocess.PIPE ).communicate()[1]
 File "/usr/local/lib/python2.7/subprocess.py", line 679, in __init__
   errread, errwrite)
 File "/usr/local/lib/python2.7/subprocess.py", line 1228, in _execute_child
   raise child_exception
OSError: [Errno 2] No such file or directory

I can post the full error if you'd like, but basically the problem was
that samtools wasn't in the PATH. This was because we have our tools
installed in a non-standard place, so we are depending on the
requirements being specified as James described below, and samtools
isn't specified as a requirement in upload.xml, so when upload.py
calls datatypes.py and tries to use samtools, it gives an error. I've
found a couple of other examples like this - for instance samtools is
also used by some picard scripts so should be specified as a
requirement in the picard wrappers.

These problems probably don't show up in most cases when people just
have the tools installed as root and on their PATH by default?
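
A quick way to see whether such dependencies will resolve is to check the
PATH of the account that runs the Galaxy server (the tool names below are
just the examples from this thread):

```shell
# Run as the user that owns the Galaxy server process.
# Reports where each wrapper-invoked binary resolves, if anywhere.
for tool in samtools bwa; do
    if path=$(command -v "$tool"); then
        echo "$tool -> $path"
    else
        echo "$tool: not found on PATH" >&2
    fi
done
```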

I'm going to be fixing these where I find them. Would it be helpful
for me to contribute these tweaks back or would it be better just to
raise an issue?

Thanks,
Clare

On Wed, Nov 16, 2011 at 4:33 PM, Clare Sloggett s...@unimelb.edu.au wrote:
 Looks like it's working! The problem I had run into, in hindsight, was
 a) I hadn't set tool_dependency_dir as I didn't know about it
 b) if I had set it, I was installing tools to
 $SW/tool-name/version-number/ but the default tool wrappers in
 galaxy-dist don't have version numbers set, so they will just look in
 $SW/tool-name/ .

 Can I suggest this be added to the wiki somewhere under the Admin
 pages? Apologies if it's there, I couldn't find it except under News
 Briefs at 
 http://wiki.g2.bx.psu.edu/News%20Briefs/2010_11_24?highlight=%28tool_dependency_dir%29
 .

 As well as being on the wiki it would be useful to have it (commented
 out by default) in universe_wsgi.ini. I think the tool_dependency_dir
 variable isn't in there at all at the moment, at least in the
 galaxy-dist I have. It would also be useful to have a brief mention or
 link to it on http://wiki.g2.bx.psu.edu/Admin/NGS%20Local%20Setup to
 save time for people like me who had tools installed in a non-standard
 place.

 Thanks again!
 Clare

 On Wed, Nov 16, 2011 at 3:01 PM, Clare Sloggett s...@unimelb.edu.au wrote:
 Great! Thanks James, this is exactly what I need.

 On Wed, Nov 16, 2011 at 2:20 PM, James Taylor ja...@jamestaylor.org wrote:
 On Nov 15, 2011, at 9:59 PM, Clare Sloggett wrote:

 If this is the case, what is the best way to install and maintain two
 versions of the same tool? I can write code into the wrapper to find
 the correct version of the tool in a given case, but I was wondering
 if there is a more standard 'galaxy' way to configure this.

 You should provide tool_dependency_dir in the config file and point it at 
 some directory $SW where you will install tools under.

 With this enabled, when a tool has a dependency, Galaxy will look for it 
 under that directory and attempt to run a script to setup the environment. 
 For example if you have tool with a dependency on foo version 1.3, Galaxy 
 will look for:

        $SW/foo/1.3/env.sh

 and if found will source it as part of the job submission script. This 
 usually contains something simple like

        PATH=$PACKAGE_BASE/bin:$PATH

 to add the binaries installed with the dependency to the path.

 Ideally all dependencies used by Galaxy tools would be installed in this 
 way.
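
Put together, the layout James describes can be sketched for a hypothetical
tool "foo" version 1.3 (directory names are examples; as far as I
understand, Galaxy sets PACKAGE_BASE to the dependency directory before
sourcing env.sh, which the sketch emulates by hand):

```shell
# Sketch of a tool_dependency_dir layout for hypothetical tool foo 1.3.
SW=$(mktemp -d)                      # stands in for tool_dependency_dir
mkdir -p "$SW/foo/1.3/bin"

# env.sh is sourced as part of the job submission script:
cat > "$SW/foo/1.3/env.sh" <<'EOF'
PATH=$PACKAGE_BASE/bin:$PATH
EOF

# Emulate the job script: set PACKAGE_BASE, then source env.sh.
PACKAGE_BASE="$SW/foo/1.3"
. "$SW/foo/1.3/env.sh"
echo "$PATH" | grep -q "$SW/foo/1.3/bin" && echo "foo 1.3 is on PATH"
```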





 --
 E: s...@unimelb.edu.au
 P: 03 903 53357
 M: 0414 854 759




 --
 E: s...@unimelb.edu.au
 P: 03 903 53357
 M: 0414 854 759




-- 
E: s...@unimelb.edu.au
P: 03 903 53357
M: 0414 854 759



[galaxy-dev] installing multiple versions of a tool / path configuration

2011-11-15 Thread Clare Sloggett
Hi all,

I am a little confused as to the right way to configure my installed NGS tools.

From the documentation I've found and from looking at the xml/python
wrappers, it looks to me like the NGS tool wrappers simply call the
tools and assume they will be in the galaxy account's PATH. For
instance, bwa_wrapper.py simply calls bwa in the shell. So it looks
to me like the default assumption is that bwa was installed as root
and is available to all users, and will always be on your PATH. Is
this right, or am I missing some configuration options that should
tell galaxy where to find the bwa binary?

If this is the case, what is the best way to install and maintain two
versions of the same tool? I can write code into the wrapper to find
the correct version of the tool in a given case, but I was wondering
if there is a more standard 'galaxy' way to configure this.

Sorry for the newbie questions again - I have been looking at the NGS
setup and Tools documentation, but I don't think I've found anything
on multiple installed versions of a tool. Also if there are any docs I
have missed on configuring path environment variables, please let me
know!

Thanks,
Clare

-- 
E: s...@unimelb.edu.au
P: 03 903 53357
M: 0414 854 759


Re: [galaxy-dev] installing multiple versions of a tool / path configuration

2011-11-15 Thread Clare Sloggett
Great! Thanks James, this is exactly what I need.

On Wed, Nov 16, 2011 at 2:20 PM, James Taylor ja...@jamestaylor.org wrote:
 On Nov 15, 2011, at 9:59 PM, Clare Sloggett wrote:

 If this is the case, what is the best way to install and maintain two
 versions of the same tool? I can write code into the wrapper to find
 the correct version of the tool in a given case, but I was wondering
 if there is a more standard 'galaxy' way to configure this.

 You should provide tool_dependency_dir in the config file and point it at 
 some directory $SW where you will install tools under.

 With this enabled, when a tool has a dependency, Galaxy will look for it 
 under that directory and attempt to run a script to setup the environment. 
 For example if you have tool with a dependency on foo version 1.3, Galaxy 
 will look for:

        $SW/foo/1.3/env.sh

 and if found will source it as part of the job submission script. This 
 usually contains something simple like

        PATH=$PACKAGE_BASE/bin:$PATH

 to add the binaries installed with the dependency to the path.

 Ideally all dependencies used by Galaxy tools would be installed in this way.





-- 
E: s...@unimelb.edu.au
P: 03 903 53357
M: 0414 854 759



Re: [galaxy-dev] installing multiple versions of a tool / path configuration

2011-11-15 Thread Clare Sloggett
Looks like it's working! The problem I had run into, in hindsight, was
a) I hadn't set tool_dependency_dir as I didn't know about it
b) if I had set it, I was installing tools to
$SW/tool-name/version-number/ but the default tool wrappers in
galaxy-dist don't have version numbers set, so they will just look in
$SW/tool-name/ .

Can I suggest this be added to the wiki somewhere under the Admin
pages? Apologies if it's there, I couldn't find it except under News
Briefs at 
http://wiki.g2.bx.psu.edu/News%20Briefs/2010_11_24?highlight=%28tool_dependency_dir%29
.

As well as being on the wiki it would be useful to have it (commented
out by default) in universe_wsgi.ini. I think the tool_dependency_dir
variable isn't in there at all at the moment, at least in the
galaxy-dist I have. It would also be useful to have a brief mention or
link to it on http://wiki.g2.bx.psu.edu/Admin/NGS%20Local%20Setup to
save time for people like me who had tools installed in a non-standard
place.

Thanks again!
Clare

On Wed, Nov 16, 2011 at 3:01 PM, Clare Sloggett s...@unimelb.edu.au wrote:
 Great! Thanks James, this is exactly what I need.

 On Wed, Nov 16, 2011 at 2:20 PM, James Taylor ja...@jamestaylor.org wrote:
 On Nov 15, 2011, at 9:59 PM, Clare Sloggett wrote:

 If this is the case, what is the best way to install and maintain two
 versions of the same tool? I can write code into the wrapper to find
 the correct version of the tool in a given case, but I was wondering
 if there is a more standard 'galaxy' way to configure this.

 You should provide tool_dependency_dir in the config file and point it at 
 some directory $SW where you will install tools under.

 With this enabled, when a tool has a dependency, Galaxy will look for it 
 under that directory and attempt to run a script to setup the environment. 
 For example if you have tool with a dependency on foo version 1.3, Galaxy 
 will look for:

        $SW/foo/1.3/env.sh

 and if found will source it as part of the job submission script. This 
 usually contains something simple like

        PATH=$PACKAGE_BASE/bin:$PATH

 to add the binaries installed with the dependency to the path.

 Ideally all dependencies used by Galaxy tools would be installed in this way.





 --
 E: s...@unimelb.edu.au
 P: 03 903 53357
 M: 0414 854 759




-- 
E: s...@unimelb.edu.au
P: 03 903 53357
M: 0414 854 759



Re: [galaxy-dev] Configuration of a local install - mi-deploy ?

2011-11-10 Thread Clare Sloggett
Hi Greg & Ross,

Thanks for this!

Yes, I confused the issue by mentioning the Tool Shed, sorry - it's
the binaries themselves I need to install. Essentially I think I need
to follow the steps at
http://wiki.g2.bx.psu.edu/Admin/NGS%20Local%20Setup , but I was
wondering about mi-deployment scripts as a better way to do this, and
whether that's standard practice.

After looking through the scripts they really seem like the best
option to me - it looks like they are set up so you mostly only need
to change configuration variables to get this to work.

The scripts are mentioned on the wiki instructions but don't seem to
be the default option (they are not bundled in galaxy-dist), so I
wondered if people are usually doing it this way?

Thanks,
Clare


On Fri, Nov 11, 2011 at 2:06 AM, Ross ross.laza...@gmail.com wrote:
 Clare,

 As Greg says, the tool wrappers exposed on Main all come with a
 mercurial checkout - but all the binary dependencies, genomic data and
 indexes do not. The tool shed is really the tool-wrapper shed - binary
 and data dependencies aren't there either.

 I haven't tried them but I'd guess that the deploy scripts take care
 of a lot of messy details but will probably need some work to fit your
 local setup.

 As to email for admin - my advice would be don't worry about it - if
 one user forgets their password you can reset it from the admin
 interface - that's the main use and if this is really a test instance,
 it's not a show stopper.

 On Thu, Nov 10, 2011 at 9:14 AM, Greg Von Kuster g...@bx.psu.edu wrote:
 Hello Clare,

 On Nov 9, 2011, at 11:11 PM, Clare Sloggett wrote:

 Hi all,

 Most of our playing around with Galaxy has been in getting it working
 on our local cloud, but now for the first time I'm configuring a
 non-cloud local install of galaxy-dist (set up as per
 http://wiki.g2.bx.psu.edu/Admin/Config/Performance/Production%20Server)

 So I have some naive questions!

 Would it be a sensible approach to grab the tools_fabfile script from
 mi-deployment and use it in this case? Or should I be using the Tool
 Shed for installing the base set of tools?


 The Galaxy distribution includes all of the tools you see on both the Penn 
 State test and main instances.  You can also get tools from the tool shed if 
 you want - see our wiki at http://wiki.g2.bx.psu.edu/Tool%20Shed for 
 information about how to do this.  I don't think you'll need to use the 
 tools_fabfile script from mi_deployment repo for your local instance.



 Also, if I have problems with this server sending out emails (which
 may be the case) am I going to run into trouble with user/password
 management or can I just admin everything manually? There will only be
 a very small number of users on this server.

 I'm not clear on this question, but you certainly shouldn't run into any 
 problems with user / password management within Galaxy.  From the Galaxy 
 Admin interface you have the ability to Manage users where you can reset 
 passwords if necessary.


 If I have missed any good 'getting started' documentation on
 config/admin of a local install please point me in the right
 direction. I've been looking at http://wiki.g2.bx.psu.edu/Admin .

 You found the right wiki.


 Thanks,
 Clare

 --
 E: s...@unimelb.edu.au
 P: 03 903 53357
 M: 0414 854 759

 Greg Von Kuster
 Galaxy Development Team
 g...@bx.psu.edu








 --
 Ross Lazarus MBBS MPH;
 Associate Professor, Harvard Medical School;
 Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;






-- 
E: s...@unimelb.edu.au
P: 03 903 53357
M: 0414 854 759



Re: [galaxy-dev] Using a Galaxy image on a local system

2011-05-05 Thread Clare Sloggett
Hi Alex & Enis,

Thanks very much!

I've only recently got VMware installed and am having a look at the NBIC VM
now. This is just the sort of thing I was looking for to get started quickly.

I've heard of the afgane project - it has been suggested to us that this
would be a good path to take for our local cloud. Is it specifically
designed for creating AMIs, or is it really for automated configuration and
deployment anywhere?

Thanks,
Clare

On Fri, Apr 29, 2011 at 8:48 AM, Enis Afgan eaf...@emory.edu wrote:

 Hi Clare,
 Once you have a VM setup and accessible via ssh, you can also use our
 scripts for automated configuration and deployment of dependencies and
 tools. These scripts are used to setup Galaxy Cloud and they're targeted for
 Ubuntu 10.04 but should be applicable to other distributions as well. The
 scripts are available here:
 https://bitbucket.org/afgane/mi-deployment/overview

 Good luck,
 Enis

 On Thu, Apr 28, 2011 at 3:16 AM, Clare Sloggett s...@unimelb.edu.au wrote:

 Hi all,

 I would like to set up Galaxy locally. At the moment, I'm just trying to
 use it on my desktop (a Mac, OSX 10.6.7), but later we will want a local
 server to play with.

 Rather than install galaxy and then install all the tools it can use (and
 deal with OSX issues for some of them), it seems simpler just to use a
 virtual machine, since there are images which get regularly updated and come
 with pretty much everything. Is there anything wrong with this approach?

 I know there are Amazon EC2 images for Galaxy. So far as I know there are
 not other kinds of images? So for using it on my desktop, I think my options
 are either to run an EC2-compatible system locally, or to try to convert the
 AMI to a VMWare or VirtualBox image. I was just wondering if anyone has
 already tried either of these approaches?

 Also, is it possible to get hold of the galaxy AMI files themselves?

 Any advice welcome!

 Thanks,
 Clare





