Re: [galaxy-dev] [galaxy-user] Using Galaxy Cloudman for a workshop

2011-12-01 Thread Enis Afgan
Hi Clare,
The share string is generated when you share a cluster. The string is
accessible on the shared cluster: click the green 'Share a cluster' icon next
to the cluster name and then the top link 'Shared instances'. You will get a
list of the point-in-time shares of the cluster you have created. The share
string will look something like this:
cm-cd53Bfg6f1223f966914df347687f6uf32/shared/2011-10-19--03-14
You simply paste that string into the new cluster box you mentioned.

Enis

On Thu, Dec 1, 2011 at 6:31 AM, Clare Sloggett s...@unimelb.edu.au wrote:

 Hi Enis, Jeremy, and all,

 Thanks so much for all your help. I have another question which I
 suspect is just me missing something obvious.

 I'm guessing that when you cloned the cluster for your workshop, you
 used CloudMan's 'share-an-instance' functionality?
 When I launch a new cluster which I want to be a copy of an existing
 cluster, and select the share-an-instance option, it asks for the
 cluster share-string. How can I find this string for my existing
 cluster?

 Or have I got completely the wrong idea - did you actually clone the
 instance using AWS functionality?

 Thanks,
 Clare

 On Mon, Nov 21, 2011 at 5:37 PM, Enis Afgan eaf...@emory.edu wrote:
  Hi Clare,
  I don't recall what instance type we used earlier, but I think an Extra
  Large Instance is going to be fine. Do note that the master node is also
  being used to run jobs. However, if it's loaded by just the web server, SGE
  will typically just not schedule jobs to it.

  As far as core/thread/slot concerns go, SGE sees each core as a slot. Each
  job in Galaxy simply requires 1 slot, even if it uses multiple threads
  (i.e., cores). What this means is that nodes will probably get overloaded if
  only the same type of job is being run (BWA), but if analyses are being run
  that use multiple tools, jobs will get spread over the cluster to balance
  the overall load a bit better than by simply looking at the number of slots.
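  For illustration, a minimal sketch of how the slot/thread mismatch could be
  closed at the SGE level, using the Python drmaa bindings and an assumed 'smp'
  parallel environment (neither the PE name nor the bwa command line below is
  part of the stock CloudMan setup; they are placeholders):

  import drmaa  # python-drmaa bindings, assumed to be installed

  s = drmaa.Session()
  s.initialize()
  jt = s.createJobTemplate()
  jt.remoteCommand = '/bin/sh'
  # hypothetical 4-thread BWA run; '-pe smp 4' reserves 4 slots to match -t 4
  jt.args = ['-c', 'bwa aln -t 4 ref.fa reads.fastq > reads.sai']
  jt.nativeSpecification = '-pe smp 4'
  job_id = s.runJob(jt)      # job id returned by SGE
  s.deleteJobTemplate(jt)
  s.exit()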
 
  Enis
 
  On Mon, Nov 21, 2011 at 4:34 AM, Clare Sloggett s...@unimelb.edu.au
 wrote:
 
  Hi Jeremy,
 
  Also if you do remember what kind of Amazon node you used,
  particularly for the cluster's master node (e.g. an 'xlarge' 4-core
  15GB or perhaps one of the 'high-memory' nodes?), that would be a
  reassuring sanity check for me!
 
  Cheers,
  Clare
 
  On Mon, Nov 21, 2011 at 10:37 AM, Clare Sloggett s...@unimelb.edu.au
  wrote:
   Hi Jeremy, Enis,
  
   That makes sense. I know I can configure how many threads BWA uses in
   its wrapper, with bwa -t. But, is there somewhere that I need to tell
    Galaxy the corresponding information, i.e. that this command-line task
   will make use of up to 4 cores?
  
   Or, does this imply that there is always exactly one job per node? So
   if I have (for instance) a cluster made of 4-core nodes, and a
   single-threaded task (e.g. samtools), are the other 3 cores just going
   to waste or will the scheduler allocate multiple single-threaded jobs
   to one node?
  
   I've cc'd galaxy-dev instead of galaxy-user as I think the
   conversation has gone that way!
  
   Thanks again,
   Clare
  
  
   On Fri, Nov 18, 2011 at 2:36 PM, Jeremy Goecks 
 jeremy.goe...@emory.edu
   wrote:
  
   On Fri, Nov 18, 2011 at 12:56 AM, Jeremy Goecks
   jeremy.goe...@emory.edu wrote:
  
    Scalability issues are more likely to arise on the back end than the
    front end, so you'll want to ensure that you have enough compute nodes. BWA
    uses four nodes by default--Enis, does the cloud config change this
    parameter?--so you'll want 4x50 or 200 total nodes if you want everyone to
    be able to run a BWA job simultaneously.
  
  
    Actually, one other question - this paragraph makes me realise that I
    don't really understand how Galaxy is distributing jobs. I had thought
    that each job would only use one node, and in some cases take
    advantage of multiple cores within that node. I'm taking a node to
    be a set of cores with their own shared memory, so in this case a VM
    instance, is this right? If some types of jobs can be distributed over
    multiple nodes, can I configure, in Galaxy, how many nodes they should
    use?
  
   You're right -- my word choices were poor. Replace 'node' with 'core'
   in my paragraph to get an accurate suggestion for resources.
  
    Galaxy uses a job scheduler--SGE on the cloud--to distribute jobs to
    different cluster nodes. Jobs that require multiple cores typically run on a
    single node. Enis can chime in on whether CloudMan supports job submission
    over multiple nodes; this would require setup of an appropriate parallel
    environment and a tool that can make use of this environment.
  
   Good luck,
   J.
  
  
  
  
  
  
  



 --
 E: s...@unimelb.edu.au
 P: 03 903 53357
 M: 0414 854 759


Re: [galaxy-dev] Removing nodes from a CloudMan instance

2011-12-01 Thread Enis Afgan
Unfortunately not. When removing nodes, CloudMan chooses from the nodes
that are currently not being used but among those the choice is random.
You can manually terminate a particular worker instance from the AWS
console and thus remove it from your cluster by force. CloudMan will then
reconfigure the cluster. Although this has been implemented and tested, it
is not really the recommended behavior, especially not repeatedly.
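For reference, the same manual removal can be scripted against the EC2 API; a
minimal sketch using the boto 2.x bindings (the region and instance id below
are placeholders, and credentials are assumed to come from the usual boto/AWS
environment settings):

import boto.ec2

conn = boto.ec2.connect_to_region('us-east-1')
# terminate one specific worker; CloudMan then reconfigures the cluster around it
conn.terminate_instances(instance_ids=['i-0123abcd'])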

I'll look into how to implement this option.
Enis

On Thu, Dec 1, 2011 at 6:20 AM, Clare Sloggett s...@unimelb.edu.au wrote:

 Hi galaxy-devs,

 Quick question: when using the cloud console on CloudMan, it's
 possible to add different types of nodes (large, micro, etc) to the
 virtual cluster using the 'Add Nodes' option at the top. I can also
 remove a given number of nodes using the 'Remove Nodes' option at the
 top. However, is there any way to control exactly which node (or more
 importantly just which type of node) gets removed?

 Thanks for any help!

 Clare

 --
 E: s...@unimelb.edu.au
 P: 03 903 53357
 M: 0414 854 759

Re: [galaxy-dev] GalaxyCloudman + CADDSuite

2011-12-01 Thread Enis Afgan
Hi Marcel,

However, when I create an AMI, terminate the cluster and create a new
 cluster using the new AMI, both /mnt/galaxyData and /mnt/galaxyTools do not
 exist anymore, i.e. /dev/sdg3 and /dev/sdg4 are not mounted automatically.
 If I mount those two devices manually, everything runs smoothly again.

 So, is there anything that I might have forgotten to do while creating the
 AMI? Is there a way to make sure that those devices will be mounted
 automatically?

It is not necessary to create a new AMI when wanting to customize your
cluster. Instead, on the admin interface, after you have modified the file
systems, there is an option to persist the static file systems (galaxyTools &
galaxyIndices). Once the process is completed and you restart the cluster,
just continue to use the same AMI. CloudMan will use the new, customized
data snapshots at runtime.

Let us know how it goes,
Enis


 Regards,
 Marcel



 On 11/30/11 3:57 PM, Enis Afgan wrote:

 Hi Marcel,
 It would be best to use 'galaxy' user to add any tools. To do so, after
 you've logged in as ubuntu user, simply execute:
 sudo su galaxy
 and you will become galaxy user. You can then make the desired
 modifications.

 Good luck,
 Enis

  On Wed, Nov 30, 2011 at 3:42 PM, Greg Von Kuster g...@bx.psu.edu wrote:

  Hello Marcel,

 In the future, please send all questions like this to the galaxy-dev
 mailing list, as doing so will streamline the process of getting a timely
 answer.  I believe Enis is best able to answer your questions.

 Thanks!


 On Nov 30, 2011, at 9:29 AM, Marcel Schumann wrote:

  Hi Greg,

 I'm currently trying to create a GalaxyCloudman version that includes

 CADDSuite.

 Thus, I launched GalaxyCloudman as described in your wiki and tried to
 modify it afterwards.

 Well, starting CloudMan worked without any problems... so far, so good :-)

 As described on
  http://wiki.g2.bx.psu.edu/Admin/Cloud/Customize%20Galaxy%20Cloud
 I could then log in via ssh as user 'ubuntu' (not as user 'galaxy').
 However, all files of the Galaxy installation belong to user and group
 'galaxy'.

 Thus my question: How should users be able to customize CloudMan? Is
 there some trick by which I can log in as 'galaxy', or do you have any
 other idea how to make this work? ;-)


 Sorry Greg, if you are not the correct contact in this case, but I found
 no specific contact or mailing list for CloudMan. Perhaps you could just
 forward this mail in that case ...



 Cheers,
 Marcel


 --
 Marcel Schumann

 University of Tuebingen
 Wilhelm Schickard Institute for Computer Science
 Division for Applied Bioinformatics
 Room C313, Sand 14, D-72076 Tuebingen

 phone:  +49 (0)7071-29 70437
 fax:  +49 (0)7071-29 5152
 email:  schum...@informatik.uni-tuebingen.de


 Greg Von Kuster
 Galaxy Development Team
 g...@bx.psu.edu







 --
 Marcel Schumann

 University of Tuebingen
 Wilhelm Schickard Institute for Computer Science
 Division for Applied Bioinformatics
 Room C313, Sand 14, D-72076 Tuebingen

 phone:  +49 (0)7071-29 70437
 fax:  +49 (0)7071-29 5152
 email:  schum...@informatik.uni-tuebingen.de


[galaxy-dev] synchronous data depositing

2011-12-01 Thread James Ireland
Greetings,

I've been attempting to return data to Galaxy via the synchronous data
depositing protocol.  Using the Biomart, UCSC Table Browser, etc as
examples in the data_source tools directory, I've been able to get the
initial GET request to my site just fine.  However, when I POST back to
galaxy I immediately get a redirect to the welcome page and Galaxy never
resubmits back to my site.

I was wondering if there is more to the protocol than is covered here:
http://wiki.g2.bx.psu.edu/Admin/Internals/Data%20Sources or perhaps
configuration I need to perform on my local Galaxy installation to
correctly handle the POSTs back to tool_runner?  Also, are there any code
examples I should be looking at?

Thanks for your help!
-James


-- 
J Ireland
www.5amsolutions.com | Software for Life(TM)
m: 415 484-DATA (3282)

Re: [galaxy-dev] [galaxy-user] Using Galaxy Cloudman for a workshop

2011-12-01 Thread Clare Sloggett
Right! I did think to look for a 'share this cluster' command; I just
failed to find it. It all makes sense now, thanks.

On Thu, Dec 1, 2011 at 7:34 PM, Enis Afgan eaf...@emory.edu wrote:
 Hi Clare,
 The share string is generated when you share a cluster. The string is
 accessible on the shared cluster, when you click the green 'Share a cluster'
 icon next to the cluster name and then the top link Shared instances. You
 will get a list of the point in time shares of the cluster you have created.
 The share string will look something like
 this cm-cd53Bfg6f1223f966914df347687f6uf32/shared/2011-10-19--03-14
 You simply paste that string into new cluster box you mentioned.
 Enis


Re: [galaxy-dev] GalaxyCloudman + CADDSuite

2011-12-01 Thread Enis Afgan
Hi Marcel,

On Thu, Dec 1, 2011 at 11:28 AM, Marcel Schumann 
schum...@informatik.uni-tuebingen.de wrote:

 Hi Enis,

 well, I know I do not have to create a new AMI if I want to reuse an
 instance myself.

 However, I would like to share the modified GalaxyCloudman version with
 other people and therefore I do have to create an AMI.

Unless you modify the system packages (i.e., your customizations are not
self-contained), you still don't have to create a new AMI to share a
cluster. There is the share-a-cluster option (icon next to the cluster
name). Just wanted to make sure you were aware of the functionality...


 Ok, I will try to make this work somehow ... but I guess there are no
 immediate clues as to what could have gone wrong? Or do you have any ideas
 what I should try?

CloudMan sets up the system at runtime, so it performs changes that then get
persisted when you create the AMI. It is therefore necessary to reverse those
changes before creating the AMI so that the next time a cluster is started, the
startup procedure proceeds as before. Did you see what's in the CloudMan
log (/mnt/cm/paster.log) on your customized AMI? That's probably the
easiest place to start and we can work from there.

Enis





Re: [galaxy-dev] Quota will not decrease with permanent delete

2011-12-01 Thread Glen Beane
You aren't the only one having problems with the disk usage reported in the
upper right.  I've also had problems with the total disk usage in our local
instance of Galaxy.

I've deleted all but a handful of files, and we have a cron job that purges the
files from disk after a certain number of days.  I've deleted everything but
one history, and for the last few weeks the size of that history has been
reported as about 200GB.  I don't have any other histories, or any deleted but
not purged datasets.  However, the total disk usage shown by Galaxy is about
1.5TB.


On Nov 30, 2011, at 2:24 PM, Mary Anne Alliegro wrote:

 Hi Galaxy Users,
 I have permanently deleted numerous files.
 My usage % has decreased, but this is NOT reflected in my GB report (upper
 right) - it remains the same.
 Am I missing some phantom trash bin?
 If not, will Galaxy recalculate my Gb usage so that I may proceed with my 
 project?
 Thank you,
 Mary Anne
 
 

--
Glen L. Beane
Senior Software Engineer
The Jackson Laboratory
(207) 288-6153




Re: [galaxy-dev] synchronous data depositing

2011-12-01 Thread Daniel Blankenberg
Hi James,

Can you let us know which revision of Galaxy you are using (hg head) and
any log output that appears when accessing or running the tool? Also, the list
of parameters and values that are being POSTed to Galaxy and a copy of your
tool.xml file would be useful.


Thanks for using Galaxy,

Dan




Re: [galaxy-dev] GalaxyCloudman + CADDSuite

2011-12-01 Thread Marcel Schumann

Hi Enis,

well, I know I do not have to create a new AMI if I want to reuse an 
instance myself.


However, I would like to share the modified GalaxyCloudman version with 
other people and therefore I do have to create an AMI.


Ok, I will try to make this work somehow ... but I guess there are no 
immediate clues as to what could have gone wrong? Or do you have any 
ideas what I should try?



Cheers,
Marcel


--
Marcel Schumann

University of Tuebingen
Wilhelm Schickard Institute for Computer Science
Division for Applied Bioinformatics
Room C313, Sand 14, D-72076 Tuebingen

phone:  +49 (0)7071-29 70437
fax:  +49 (0)7071-29 5152
email:  schum...@informatik.uni-tuebingen.de


Re: [galaxy-dev] GATK / R local install configuration

2011-12-01 Thread Carlos Borroto
Hi,

Any chance someone could give me a hint on this issue?

Thanks in advance,
Carlos

On Tue, Nov 29, 2011 at 2:02 PM, Carlos Borroto
carlos.borr...@gmail.com wrote:
 Hi,

 I'm testing the GATK pipeline and I ran into a problem with the Variant
 Recalibrator tool. It seems I don't have R and GATK correctly configured on
 my instance, as this tool is failing with this error:
 mv: 
 /Volumes/Data/Users/cjavier/galaxy_central/database/files/000/dataset_393.dat.pdf:
 No such file or directory

 I see this PDF is built with R and I also see this in the log file:
 INFO  17:11:44,260 VariantRecalibrator - Executing: Rscript
 /Volumes/Data/Users/cjavier/galaxy_central/database/files/000/dataset_393.dat
 WARN  17:11:48,407 RScriptExecutor - RScript exited with 1. Run with
 -l DEBUG for more info.
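 A quick, hypothetical diagnostic (to be run as the same user and environment
 the Galaxy job runner uses) to confirm that Rscript is on PATH and runnable;
 "RScript exited with 1" can simply mean R is missing or misconfigured for
 that user:

 import subprocess

 try:
     # Rscript prints its version and exits 0 when it is installed and on PATH
     subprocess.check_call(['Rscript', '--version'])
 except (OSError, subprocess.CalledProcessError) as err:
     print 'Rscript is not usable from this environment:', err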

 In the testing server this tool does work:
 http://test.g2.bx.psu.edu/u/cjav/h/variant-recalibrator---tutorial

 At least after making sure the input VCF file has an annotations string
 matching the one you want to select in Galaxy.

 Any help on what needs to be done to get this configuration right would be
 appreciated. BTW, I already have R and Rpy correctly configured and I can run
 tools like Statistics/Summary Statistics.

 Thanks!
 Carlos


Re: [galaxy-dev] Local installation of the tool shed

2011-12-01 Thread Greg Von Kuster
Hello Louise-Amelie,


On Dec 1, 2011, at 4:08 AM, Louise-Amélie Schmitt wrote:

 Yes this fixes the problem, it should work fine now :)
 
 Thanks a lot!
 
 Hehe, next issues:
 
 1) When I have my repo created and I click on the Upload files to 
 repository, I get a Not found error in the browser:
 Not Found
 The requested URL /toolshed/upload/upload was not found on this server.

This is probably due to your apache rewrite rule: 

RewriteRule ^/toolshed/upload/(.*) 
/home/galaxy/galaxy-dev/static/automated_upload/$1 [L]

Is this something proprietary you have set up on your local galaxy instance?  
Try removing it from your rewrite rules for your local tool shed.


 
 2) When we try to access our local toolshed from our Galaxy instance, it 
 appears in the Accessible tool sheds list along with your two public repos, 
 but when we click on it, the Valid repositories list is empty. Is this a 
 bug or does a repo have to actually contain files so it can appear in this 
 list?

The list of valid repositories will only include repositories that have content 
that is valid for a Galaxy instance.  These repositories are defined as either 
valid or in some cases downloadable (I'm working to replace the latter with 
the former).  See the following sections of the tool shed wiki for details:

http://wiki.g2.bx.psu.edu/Tool%20Shed#Repository_revisions:_downloadable_tool_versions
http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_tools_into_a_local_Galaxy_instance

 
 3) Where should the hgweb.config be?

This file should be left in the Galaxy root install directory.


 Actually, we have absolute paths since in the community_wsgi.ini we set the 
 file_path option to an absolute path.

I advise against this - I'm fairly certain it will pose problems at some point.
Tool shed paths should be relative.  Here are some example entries in one of
my local tool sheds:

[paths]

repos/test/filter = database/community_files/000/repo_1
repos/test/workflow_with_tools = database/community_files/000/repo_2
repos/test/heteroplasmy_workflow = database/community_files/000/repo_3



 
 Thanks for your patience!
 L-A
 

Greg Von Kuster
Galaxy Development Team
g...@bx.psu.edu




Re: [galaxy-dev] synchronous data depositing

2011-12-01 Thread James Ireland
Hi Dan,

Thanks for the quick response!

I have a feeling this is something silly that I'm missing.  As a reality
check I simply created a copy of the biomart.xml tool.xml file and called
it test_me.xml.  I  changed the tool_id and GALAXY_URL in the xml file as
you'll see and added a link in my tool_config.  Selecting this tool from
the Get Data folder brings me to Biomart as expected.  When I hit go, I
have the same issue of just being redirected to the welcome page.  The
server output is shown below.  The unchanged biomart tool continues to work
fine.

Galaxy server output:

127.0.0.1 - - [01/Dec/2011:08:08:27 -0700] POST
/tool_runner/test_me?type=textname=Homo%20sapiens%20genes%20(GRCh37.p5)URL=
http://www.biomart.org/biomart/martview/534024a0e03f1befd52ca87ba741b6c9?do_export=1resultsButton=1HTTP/1.1;
302 - 
http://www.biomart.org/biomart/martview/534024a0e03f1befd52ca87ba741b6c9;
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0.1) Gecko/20100101
Firefox/8.0.1

Here's my Galaxy revision info:
changeset:   6056:338ead4737ba
tag: tip
user:Nate Coraor n...@bx.psu.edu
date:Thu Sep 29 16:45:19 2011 -0400

Thanks again,
 -James

P.S.  It's my pleasure to use Galaxy!  ;)









-- 
J Ireland
www.5amsolutions.com | Software for Life(TM)
m: 415 484-DATA (3282)
<?xml version="1.0"?>
<!--
    If the value of 'URL_method' is 'get', the request will consist of the value of 'URL' coming back in
    the initial response.  If value of 'URL_method' is 'post', any additional params coming back in the
    initial response ( in addition to 'URL' ) will be encoded and appended to URL and a post will be performed.

    TODO: Hack to get biomart to work - the 'add_to_URL' param can be eliminated when the Biomart team encodes URL prior to sending, meanwhile
    everything including and beyond the first '&' is truncated from URL.  They said they'll let us know when this is fixed at their end.
-->
<tool name="BioMart" id="test_me" tool_type="data_source" version="1.0.1">
    <description>Central server</description>
    <command interpreter="python">data_source.py $output $__app__.config.output_size_limit</command>
    <inputs action="http://www.biomart.org/biomart/martview" check_values="false" method="get" target="_top">
        <display>go to BioMart Central $GALAXY_URL</display>
        <param name="GALAXY_URL" type="baseurl" value="/tool_runner/test_me" />
    </inputs>
    <request_param_translation>
        <request_param galaxy_name="URL" remote_name="URL" missing="">
            <append_param separator="&amp;" first_separator="?" join="=">
                <value name="_export" missing="1" />
                <value name="GALAXY_URL" missing="0" />
            </append_param>
        </request_param>
        <request_param galaxy_name="data_type" remote_name="exportView_outputformat" missing="tabular">
            <value_translation>
                <value galaxy_value="tabular" remote_value="TSV" />
            </value_translation>
        </request_param>
        <request_param galaxy_name="URL_method" remote_name="URL_method" missing="get" />
        <request_param galaxy_name="dbkey" remote_name="dbkey" missing="?" />
        <request_param galaxy_name="organism" remote_name="organism" missing="" />
        <request_param galaxy_name="table" remote_name="table" missing="" />
        <request_param galaxy_name="description" remote_name="description" missing="" />
        <request_param galaxy_name="name" remote_name="name" missing="Biomart query" />
        <request_param galaxy_name="info" 

Re: [galaxy-dev] synchronous data depositing

2011-12-01 Thread Daniel Blankenberg
Hi James,

For legacy reasons, Biomart uses a special-cased GALAXY_URL parameter, which is
likely the source of the problem. If you remove the '<param name="GALAXY_URL"
... />' input parameter, restart Galaxy and reload the Galaxy interface, does
it then work correctly?


Thanks for using Galaxy,

Dan


Re: [galaxy-dev] synchronous data depositing

2011-12-01 Thread James Ireland
Hi Dan,

BTW - is there a particular tool/tool.xml I should focus on for a
demonstration of current best practices?

Thanks,
 -James


On Thu, Dec 1, 2011 at 8:40 AM, James Ireland jirel...@5amsolutions.comwrote:

 Hi Dan,

  Ahhh... ok.  I had seen two forms of the url and was wondering what was up.

 So, simply removing the param creates a poorly formed url:


 http://127.0.0.1:8080/tool_runner?tool_id=test_me?type=textname=Homo%20sapiens%20genes%20%28GRCh37.p5%29URL=http://www.biomart.org/biomart/martview/1a5fc0bddc9f81dba837a1b1a5063691?do_export=1resultsButton=1

  Which causes much wailing and gnashing of teeth...

 Tool 'test_me?type=text' does not exist,
 kwd={'hsapiens_gene_ensembl__feature_page__attribute.ensembl_gene_id':
 u'on', 'hsapiens_gene_ensembl__filter.go_parent_name': u'',
 'hsapiens_gene_ensembl__filter.encode_region': u'5:131256415:132256414',
 'hsapiens_gene_ensembl__filtergroup.gene__visibility': u'show',
 'hsapiens_gene_ensembl__filter.start': u'1',
 'hsapiens_gene_ensembl__filter.marker_end': u'',
 'hsapiens_gene_ensembl__filter.id_list_limit_filters': u'ensembl_gene_id',
 'hsapiens_gene_ensembl__filter.type': u'manual_picks', 'export_subset':
 u'10', 'hsapiens_gene_ensembl__filter.somatic_variation_source': u'COSMIC',
 'defaulthsapiens_gene_ensembl__homologs__attributelist':
 [u'hsapiens_gene_ensembl__homologs__attribute.ensembl_gene_id',
 u'hsapiens_gene_ensembl__homologs__attribute.ensembl_transcript_id'],
 'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_gene_id':
 u'on', 'hsapiens_gene_ensembl__snp__attribute.ensembl_gene_id': u'on',
 'hsapiens_gene_ensembl__filtergroup.protein__visibility': u'hide',
 'exportView_outputformat': u'TSV', 'export_saveto': u'text',
 'export_dataset': u'0', 'URL': u'
 http://www.biomart.org/biomart/martview/1a5fc0bddc9f81dba837a1b1a5063691?do_export=1',
 'defaulthsapiens_gene_ensembl__transcript_event__attributelist':
 [u'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_gene_id',
 u'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_transcript_id'],
 'hsapiens_gene_ensembl__filter.transcript_status': u'KNOWN',
 'hsapiens_gene_ensembl__filter.with_transmembrane_domain': u'only',
 'menuNumber': u'0', 'hsapiens_gene_ensembl__filter.protein_fam_id_b...
 etc, etc

  If I correct the URL by hand (change the second '?' to '&') and resubmit, that
  seems to work!  I'll try using this url form on my own page now.
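  As a small illustration of why that second separator matters, here is a
  hedged sketch (Python 2 stdlib, matching the Galaxy stack of the time; the
  helper name and example values are made up) of appending parameters to a
  base URL that may or may not already carry a query string:

  import urllib
  import urlparse

  def append_params(base_url, extra):
      # Use '?' only when the base URL has no query string yet; otherwise '&'.
      parts = list(urlparse.urlsplit(base_url))
      query = urlparse.parse_qsl(parts[3], keep_blank_values=True)
      query.extend(sorted(extra.items()))
      parts[3] = urllib.urlencode(query)
      return urlparse.urlunsplit(parts)

  # append_params('http://127.0.0.1:8080/tool_runner?tool_id=test_me', {'type': 'text'})
  # -> 'http://127.0.0.1:8080/tool_runner?tool_id=test_me&type=text'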

 I'm surprised that the legacy url works until I change the tool name!  If
 you hear no more from me, you can assume it worked.

 Thanks!
  -James


Re: [galaxy-dev] synchronous data depositing

2011-12-01 Thread Daniel Blankenberg
Hi James,

tools/data_source/yeastmine.xml is a good example of a relatively simple
configuration. Ideally, the <request_param_translation/> tagset would not be
required if a more specific data_type parameter value were being provided by
the external site.
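For the external-site side of the hand-off, the parameter names visible in the
data_source configs above (URL, name, data_type) suggest the shape of the form
the remote application sends back to the GALAXY_URL it received. A minimal,
hypothetical sketch of rendering such an auto-submitting page (the field
values and the function name are placeholders, not part of any shipped Galaxy
code):

FORM_TEMPLATE = '''
<html><body onload="document.forms[0].submit();">
  <form action="%(galaxy_url)s" method="post">
    <input type="hidden" name="URL"       value="%(data_url)s" />
    <input type="hidden" name="name"      value="My query" />
    <input type="hidden" name="data_type" value="tabular" />
  </form>
</body></html>
'''

def galaxy_handoff(galaxy_url, data_url):
    # galaxy_url: the GALAXY_URL/tool_runner address Galaxy passed to the site
    # data_url:   where Galaxy should come back to fetch the actual data
    return FORM_TEMPLATE % {'galaxy_url': galaxy_url, 'data_url': data_url}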


Thanks for using Galaxy,

Dan



Re: [galaxy-dev] Job output not returned from cluster

2011-12-01 Thread Nate Coraor

On Nov 29, 2011, at 9:22 PM, Fields, Christopher J wrote:

 On Nov 29, 2011, at 3:13 AM, Peter Cock wrote:
 
 On Monday, November 28, 2011, Joseph Hargitai 
 joseph.hargi...@einstein.yu.edu wrote:
 Ed,
 
 we had the classic goof on our cluster with this. 4 nodes could not see the 
 /home/galaxy folder due to a missing entry in /etc/fstab. When the jobs hit 
 those nodes (which explains the randomness) we got the error message.
 
 Bothersome was the lack of good logs to go on. The error message was too 
 generic - however I discovered that Galaxy was depositing the error and our 
 messages in the /pbs folder and you could briefly read them before they got 
 deleted. There the message was the classic SGE input/output message - 
 /home/galaxy file not found.
 
  Hence my follow-up question - how can I have Galaxy NOT delete these SGE
  error and out files?
 
 best,
 joe
 
  Better yet, Galaxy should read the SGE o and e files and record their
  contents as it would for a directly executed tool's stdout and stderr.
 
 Peter
 
 ...or at least have the option to do so, maybe a level of verbosity.  I have 
 been bitten by lack of stderr output myself, where having it might have saved 
 some manual debugging.

Unless I'm misunderstanding, this is what Galaxy already does.  stdout/stderr
up to 32K are read from .o and .e and stored in job.stdout/job.stderr.  We do
need to just store them as files and make them accessible for each tool run;
this will hopefully happen sometime soonish.
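Not Galaxy's actual implementation, but a minimal sketch of the capped read
described above, with hypothetical file paths:

MAX_CAPTURE = 32768  # the 32K cap mentioned above

def read_capped(path, limit=MAX_CAPTURE):
    # Return at most `limit` bytes of a job's .o/.e file, or '' if it is missing.
    try:
        with open(path) as handle:
            return handle.read(limit)
    except IOError:
        return ''

# e.g. job_stdout = read_capped('/path/to/sge_job.o12345')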

--nate

 
 chris


Re: [galaxy-dev] proxy settings?

2011-12-01 Thread Nate Coraor
On Nov 29, 2011, at 9:35 PM, Smithies, Russell wrote:

 Found the cure – just required adding urllib2.ProxyHandler in the data_source 
 tools.
 Why doesn’t Galaxy pick up the system http_proxy variables?

Hi Russell,

Thanks for tracking down the problem.  Could you send a patch for this?
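A sketch of what such a patch could look like - not Russell's actual change -
forcing an explicit urllib2.ProxyHandler inside a data_source helper such as
data_source.py, since a daemonized Galaxy may not inherit the /etc/profile
variables and the 407 suggests the proxy also wants credentials (the proxy URL
below is a placeholder):

import os
import urllib2

proxy = os.environ.get('http_proxy', 'http://user:password@proxy.example.org:8080')
opener = urllib2.build_opener(urllib2.ProxyHandler({'http': proxy, 'ftp': proxy}))
urllib2.install_opener(opener)

# subsequent urllib2.urlopen() calls in the tool now go through the proxy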

--nate

  
 --Russell Smithies
  
  
 From: galaxy-dev-boun...@lists.bx.psu.edu 
 [mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Smithies, Russell
 Sent: Wednesday, 30 November 2011 9:09 a.m.
 To: galaxy-dev@lists.bx.psu.edu
 Subject: [galaxy-dev] proxy settings?
  
 I’m new to Galaxy so I’m not sure if this a Galaxy or linux/apache question .
  
 When I try to “Get Data” from UCSC or any other external site, I get a 407 
 error from our proxy as I need to authenticate.
 Is the request going out as the ‘galaxy’ user or ‘apache’ or the user that’s 
 logged in?
  I already have http_proxy and ftp_proxy configured in /etc/profile (we're
  running CentOS 6), but I assume there is a correct place to configure this for
  Galaxy?
  
 The error message I’m seeing is:
 An error occurred running this job: The remote data source application may be 
 off line, please try again later. Error: ('http error', 407, 'Proxy Access 
 Denied', httplib.HTTPMessage instance at 0x35d2998)
  
 Any ideas?
  
 Thanx,
  
 Russell Smithies
  
  
 


[galaxy-dev] dataset (file) visualization

2011-12-01 Thread Alfonso Núñez Salgado

Hi,

I'm new to Galaxy, but I've made several installations in different
distributions. At this moment I'm using RHEL 6.1 and I'm not able to
visualize certain files. After uploading a SAM file I click on the file
title and a preview of a few lines appears with the rest of the available
options. When I click on the eye button nothing happens. If I try to
download it, I receive an empty file.

This behaviour occurs with other file extensions (BAM, HTML), but not
with others (FASTA, FASTQ).


Using wget:

wget 
http://10.0.0.2:8090/galaxy/datasets/5969b1f7201f12ae/display/?preview=True;
--2011-12-01 18:18:56--  
http://10.0.0.2:8090/galaxy/datasets/5969b1f7201f12ae/display/?preview=True

Connecting to 10.0.0.2:8090... connected.
HTTP request sent, awaiting response... 10.0.0.2 - - 
[01/Dec/2011:18:18:56 +] GET 
/galaxy/datasets/5969b1f7201f12ae/display/?preview=True HTTP/1.0 200 - 
- Wget/1.12 (linux-gnu)

200 OK
Length: unspecified [text/plain]
Saving to: `index.html?preview=True.3'

Finally, index.html?preview=True.3 is an empty file and, obviously, no
errors are displayed with any of the debugging options.
I've done the same tests in other distributions and apparently everything
seems right.


Can any one of you shed some light on this unexplainable black hole?

--
=
Alfonso Núñez Salgado
Unidad de Bioinformática
Centro de Biologia Molecular Severo Ochoa
C/Nicolás Cabrera 1
Universidad Autónoma de Madrid
Cantoblanco, 28049 Madrid (Spain)
Phone: (34) 91-196-4633
Fax:   (34) 91-196-4420
web: http://ub.cbm.uam.es/
=



Re: [galaxy-dev] synchronous data depositing

2011-12-01 Thread James Ireland
Sorry - this truly is the last question.  Besides my question above on the
expected response type from my page, how do I get Galaxy to skip the
intermediate Execute button before the resubmission?

Sorry for all the questions and thanks for your help.
-James


On Thu, Dec 1, 2011 at 10:11 AM, James Ireland jirel...@5amsolutions.comwrote:

 Excellent!  Thanks, Dan.

 I'm now getting the post resubmits from Galaxy.  Works great!  Last
 question - can you provide any info on the expected http response type from
 my page back to Galaxy after the resubmit?  Returning a standard html
 response or text/csv doesn't seem to be cutting it.



 On Thu, Dec 1, 2011 at 9:13 AM, Daniel Blankenberg d...@bx.psu.edu wrote:

 Hi James,

 tools/data_source/yeastmine.xml is a good example of a relatively simple
 configuration. Ideally, the request_param_translation/ tagset would not
 be required, if a more specific data_type parameter value was being
 provided by the external site.


 Thanks for using Galaxy,

 Dan


 On Dec 1, 2011, at 12:05 PM, James Ireland wrote:

 Hi Dan,

 BTW - is there a particular tool/tool.xml I should focus on for a
 demonstration of current best practices?

 Thanks,
  -James


 On Thu, Dec 1, 2011 at 8:40 AM, James Ireland 
 jirel...@5amsolutions.comwrote:

 Hi Dan,

 Ahhh... ok.  I had seen two forms of the url ad was wondering what was
 up.

 So, simply removing the param creates a poorly formed url:


 http://127.0.0.1:8080/tool_runner?tool_id=test_me?type=textname=Homo%20sapiens%20genes%20%28GRCh37.p5%29URL=http://www.biomart.org/biomart/martview/1a5fc0bddc9f81dba837a1b1a5063691?do_export=1resultsButton=1

 Which causes much wailing and nashing of teeth...

 Tool 'test_me?type=text' does not exist,
 kwd={'hsapiens_gene_ensembl__feature_page__attribute.ensembl_gene_id':
 u'on', 'hsapiens_gene_ensembl__filter.go_parent_name': u'',
 'hsapiens_gene_ensembl__filter.encode_region': u'5:131256415:132256414',
 'hsapiens_gene_ensembl__filtergroup.gene__visibility': u'show',
 'hsapiens_gene_ensembl__filter.start': u'1',
 'hsapiens_gene_ensembl__filter.marker_end': u'',
 'hsapiens_gene_ensembl__filter.id_list_limit_filters': u'ensembl_gene_id',
 'hsapiens_gene_ensembl__filter.type': u'manual_picks', 'export_subset':
 u'10', 'hsapiens_gene_ensembl__filter.somatic_variation_source': u'COSMIC',
 'defaulthsapiens_gene_ensembl__homologs__attributelist':
 [u'hsapiens_gene_ensembl__homologs__attribute.ensembl_gene_id',
 u'hsapiens_gene_ensembl__homologs__attribute.ensembl_transcript_id'],
 'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_gene_id':
 u'on', 'hsapiens_gene_ensembl__snp__attribute.ensembl_gene_id': u'on',
 'hsapiens_gene_ensembl__filtergroup.protein__visibility': u'hide',
 'exportView_outputformat': u'TSV', 'export_saveto': u'text',
 'export_dataset': u'0', 'URL': u'
 http://www.biomart.org/biomart/martview/1a5fc0bddc9f81dba837a1b1a5063691?do_export=1',
 'defaulthsapiens_gene_ensembl__transcript_event__attributelist':
 [u'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_gene_id',
 u'hsapiens_gene_ensembl__transcript_event__attribute.ensembl_transcript_id'],
 'hsapiens_gene_ensembl__filter.transcript_status': u'KNOWN',
 'hsapiens_gene_ensembl__filter.with_transmembrane_domain': u'only',
 'menuNumber': u'0', 'hsapiens_gene_ensembl__filter.protein_fam_id_b...
 etc, etc

 If I correct the URL by hand (change the second ? to &) and resubmit,
 that seems to work!  I'll try using this url form on my own page now.

 I'm surprised that the legacy url works until I change the tool name!
 If you hear no more from me, you can assume it worked.

 Thanks!
  -James

 On Thu, Dec 1, 2011 at 8:25 AM, Daniel Blankenberg d...@bx.psu.edu wrote:

 Hi James,

 For legacy reasons, Biomart uses a special-cased GALAXY_URL parameter,
 which is likely the source of the problem. If you remove the '<param
 name=GALAXY_URL ... />' input parameter, restart Galaxy and reload
 the Galaxy interface, does it then work correctly?


 Thanks for using Galaxy,

 Dan

 On Dec 1, 2011, at 11:18 AM, James Ireland wrote:

 Hi Dan,

 Thanks for the quick response!

 I have a feeling this is something silly that I'm missing.  As a
 reality check I simply created a copy of the biomart.xml tool.xml file and
 called it test_me.xml.  I  changed the tool_id and GALAXY_URL in the xml
 file as you'll see and added a link in my tool_config.  Selecting this tool
 from the Get Data folder brings me to Biomart as expected.  When I hit
 go, I have the same issue of just being redirected to the welcome page.
 The server output is shown below.  The unchanged biomart tool continues to
 work fine.

 Galaxy server output:

 127.0.0.1 - - [01/Dec/2011:08:08:27 -0700] POST /tool_runner/test_me?type=text&name=Homo%20sapiens%20genes%20(GRCh37.p5)&URL=http://www.biomart.org/biomart/martview/534024a0e03f1befd52ca87ba741b6c9?do_export=1&resultsButton=1 HTTP/1.1 302 -
 

Re: [galaxy-dev] GalaxyCloudman + CADDSuite

2011-12-01 Thread Brad Chapman

Marcel;

 well, I know I do not have to create a new AMI if I want to reuse an 
 instance myself.
 
 However, I would like to share the modified GalaxyCloudman version with 
 other people and therefore I do have to create an AMI.

What Enis was suggesting is using the share-a-cluster functionality
built into CloudMan. This bundles your data volumes as snapshots and
prepares a sharable cluster that anyone can initiate.

In the CloudMan interface, there is a little green button next to the
cluster name that enables this. That is definitely the easiest way to
share and distribute modified CloudMan versions.

 Ok, I will try to make this work somehow ... but I guess there are no 
 immediate clues as to what could have gone wrong? Or do you have any 
 ideas what I should try?

After CloudMan boots once, you need to clean up some files before
preparing an AMI. This is the automated code we use to clean up for
prepping CloudMan compatible AMIs:

https://github.com/chapmanb/cloudbiolinux/blob/master/cloudbio/cloudman.py#L87

Be careful if you run that directly. It runs immediately before bundling
and removes the ssh keys (so you don't have a backdoor to the AMI you
are distributing) so you want to do it as the last thing. It also
assumes you have unmounted all of the associated Galaxy data libraries.
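
As a rough illustration only (this is a hypothetical sketch in plain Python,
not the cloudbiolinux code linked above), the kind of cleanup it performs
looks something like the following; the exact paths are assumptions and
should be checked against your own image before relying on this:

# Hypothetical sketch -- not the actual cloudbio/cloudman.py code.
# Run as root, as the very last step before bundling the AMI.
import glob
import os
import shutil

def cleanup_before_bundle():
    # Remove SSH host keys and authorized_keys so the published image
    # carries no login backdoor; cloud images typically regenerate host
    # keys on first boot.
    for key in glob.glob("/etc/ssh/ssh_host_*key*"):
        os.remove(key)
    for home in ("/root", "/home/ubuntu"):
        for name in (".ssh/authorized_keys", ".bash_history"):
            path = os.path.join(home, name)
            if os.path.exists(path):
                os.remove(path)
    # Drop CloudMan's per-cluster runtime state so a fresh cluster is
    # created when someone boots the new image (this path is an assumption).
    shutil.rmtree("/mnt/cm", ignore_errors=True)

if __name__ == "__main__":
    cleanup_before_bundle()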

Hope this helps,
Brad

___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/


[galaxy-dev] lims integration

2011-12-01 Thread Craig Blackhart
I am newish to Galaxy and trying to learn how I might integrate it with our
workflows and LIMS for automated data handling.  I am aware of the API and
have looked up all the documentation that I could find.  However, there are
many things I cannot make sense of, and I have not been able to find
information to help me out.  I think a good place to start asking questions
is with how to run workflow_execute.py: what is each of the parameters, and
where do I get the information for them?

 

Arguments:

* API key - got this and understand
* url - got this and understand
* workflow_id - I have created workflows and have been able to find what
  looks to be a workflow_id by clicking on the workflow name and selecting
  "Download or Export".  It seems this may be correct, is it?
* history - a named history to use?  Should this already exist?  I have no
  idea here.
* step=src=dataset_id - ??? I have no idea ???  I have seen how to create
  data libraries manually at the command line; does this factor in?

 

If anyone has information they can help me out with, it would be much
appreciated.

 

Thanks

 

Craig Blackhart

Computer Scientist

Applied Engineering Technologies

Los Alamos National Laboratory

505-665-6588

This message contains no information that requires ADC review

 

___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

Re: [galaxy-dev] lims integration

2011-12-01 Thread Edward Kirton
Hi Craig, unfortunately none of us here have played around with the API
yet.  I would recommend inquiring on the galaxy-dev mailing list
(galaxy-dev@lists.bx.psu.edu).

- workflows, histories, libraries, and datasets have IDs in the database
but they may be obscured in the URLs used in galaxy; in the db they're just
integer primary keys
- histories must exist to do any work, but you can create a new history

On Thu, Dec 1, 2011 at 4:32 PM, Craig Blackhart blackh...@lanl.gov wrote:

 I am newish to Galaxy and trying to learn how I might integrate it with
 our workflows and LIMS for automated data handling.  I am aware of the API
 and have looked up all the documentation that I could find.  However, there
 are many things I cannot make sense of, and have not been able to find
 information to help me out.  I think a good place to start asking questions
 is with how to run workflow_execute.py and ask what each of the parameters
 are and where to get the information from them


 Arguments

*API key – got this and understand

*url – got this and understand

*workflow_id – I have created workflows and have been able
 to find what looks to be a workflow_id by clicking on the workflow name and
 selecting “Download or Export”.  It seems this may be correct, is it?

*history – a named history to use?  Should this already
 exist?  I have no idea here.

*step=src=dataset_id - ??? I have no idea ???  I have seen
 how to create data libraries manually at the command line; does this factor
 in?


 If anyone has information they can help me out with, it would be much
 appreciated.


 Thanks


 Craig Blackhart

 Computer Scientist

 Applied Engineering Technologies

 Los Alamos National Laboratory

 505-665-6588

 This message contains no information that requires ADC review



___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

Re: [galaxy-dev] lims integration

2011-12-01 Thread Dannon Baker
Hi Craig,

Thanks for your interest in the galaxy API.  For the parameters you're 
uncertain about:

The 'workflow_id' is indeed the encoded workflow id.  You can get this by 
encoding it yourself, or by doing a GET on /api/workflows for a list of *all* 
workflows and their encoded ids (see example below).
The 'history' is either a name for a new history to be created, or a string of 
the form hist_id=<encoded history id>, if you have an existing 
history that you'd like the results to appear in.
And the last parameter is a three-part string.  The first part is the step that 
the input should be mapped to, the second part is the *type* of input it is, 
and the third part is the actual encoded id.  The type is going to be either 
'ldda' for a library dataset or 'hda' for a history dataset.

All of these encoded ids are discoverable through the API itself.  Try using 
scripts/api/display.py to view /api/workflows and 
/api/workflows/<encoded workflow id> to get a feel for what's available to you.
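
As a rough, hypothetical sketch of how those pieces fit together from an 
external (e.g. LIMS) script: the endpoint and field names ('workflow_id', 
'history', 'ds_map') and the example encoded ids below are assumptions based 
on the descriptions above and on scripts/api/workflow_execute.py, so verify 
them against the scripts/api/ directory of your own Galaxy tree:

# Hypothetical sketch of triggering a workflow run via the API from a
# LIMS-side script (Python 2, matching the scripts/api examples of this era).
import json
import urllib2

GALAXY_URL = "http://localhost:8080"   # assumption: your Galaxy instance
API_KEY = "your-api-key-here"          # your personal API key

def api_post(path, payload):
    # POST a JSON payload to the Galaxy API, authenticating with the key.
    url = "%s%s?key=%s" % (GALAXY_URL, path, API_KEY)
    req = urllib2.Request(url, json.dumps(payload),
                          {"Content-Type": "application/json"})
    return json.loads(urllib2.urlopen(req).read())

payload = {
    "workflow_id": "f2db41e1fa331b3e",      # placeholder encoded id from GET /api/workflows
    "history": "hist_id=1cd8e2f6b131e891",  # or just a name, to create a new history
    "ds_map": {
        # workflow step id -> {"src": "hda" or "ldda", "id": encoded dataset id}
        "42": {"src": "ldda", "id": "0a248a1f62a0cc04"},
    },
}
print api_post("/api/workflows", payload)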

Lastly, an example of how the run_workflow component of the api could be used 
as a part of an external pipeline can be found in 
scripts/api/example_watch_folder.py.  This script monitors a particular folder 
for files, uploads them to galaxy, and executes a workflow on them.
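
For illustration, here is a minimal, hypothetical sketch of that 
watch-a-folder pattern (not the actual example_watch_folder.py); the 
directory path is an assumption and the upload-and-run step is only a 
placeholder for an API call like the one sketched above:

# Minimal watch-a-folder loop (Python 2 style, matching the block above).
import os
import time

WATCH_DIR = "/data/incoming"   # assumption: wherever your LIMS drops files

def upload_and_run(path):
    # Placeholder: upload `path` to a Galaxy library or history via the API
    # and then trigger a workflow on the resulting dataset.
    print "would process %s" % path

def watch(poll_seconds=30):
    seen = set()
    while True:
        # Hand every not-yet-seen regular file to the processing step.
        for name in sorted(os.listdir(WATCH_DIR)):
            path = os.path.join(WATCH_DIR, name)
            if os.path.isfile(path) and path not in seen:
                upload_and_run(path)
                seen.add(path)
        time.sleep(poll_seconds)

if __name__ == "__main__":
    watch()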

Hope this helps, and definitely let me know if I can answer any more questions 
or if you have feedback about the API.

-Dannon


On Dec 1, 2011, at 7:32 PM, Craig Blackhart wrote:

 I am newish to Galaxy and trying to learn how I might integrate it with our 
 workflows and LIMS for automated data handling.  I am aware of the API and 
 have looked up all the documentation that I could find.  However, there are 
 many things I cannot make sense of, and have not been able to find 
 information to help me out.  I think a good place to start asking questions 
 is with how to run workflow_execute.py and ask what each of the parameters 
 are and where to get the information from them
  
 Arguments
*API key – got this and understand
*url – got this and understand
*workflow_id – I have created workflows and have been able to 
 find what looks to be a workflow_id by clicking on the workflow name and 
 selecting “Download or Export”.  It seems this may be correct, is it?
*history – a named history to use?  Should this already exist? 
  I have no idea here.
*step=src=dataset_id - ??? I have no idea ???  I have seen how 
 to create data libraries manually at the command line; does this factor in?
  
 If anyone has information they can help me out with, it would be much 
 appreciated.
  
 Thanks
  
 Craig Blackhart
 Computer Scientist
 Applied Engineering Technologies
 Los Alamos National Laboratory
 505-665-6588
 This message contains no information that requires ADC review
  


___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/


Re: [galaxy-dev] GalaxyCloudman + CADDSuite

2011-12-01 Thread Enis Afgan
Hi Marcel,

On Thu, Dec 1, 2011 at 10:35 PM, Marcel Schumann 
schum...@informatik.uni-tuebingen.de wrote:

 Hi Brad, hi Enis,

 ok, so one possible way should thus be to use the  _cleanup_ec2() function
 before creating an AMI via the amazon console and the second option perhaps
 is this share-a-cluster functionality.

 About the latter: sorry, I tried to find out what it really does, but did
 not find any documentation of it. So, how does this share-a-cluster
 functionality work in principle? If it does not create an AMI, does it just
 create snapshots of the individual disks that users have to mount
 (manually) or ... ?
 My point here is: I need to make a modified version of GalaxyCloudman
 available (i.e. one that contains the CADDSuite tools). I do not just want
 to share an _instance_ (a running cluster) but a cloudman image (as an AMI
 or by some other means) that includes all my tools, so that other users can
 easily start their own server.

The share-an-instance functionality creates a snapshot of the data volume
(/mnt/galaxyData), so any tool that is installed on that file system will
get bundled into the shared instance. Also, any modifications that you make
to your galaxy configuration (e.g., galaxy_tools.xml) become part of the
shared instance. When a user then starts a new cluster and provides the
cluster share string (this is available on the shared cluster after you
share it), they will get access to an exact replica of your cluster. Like
I mentioned earlier, unless you are modifying system libraries there is
absolutely no reason to create a new AMI. Simply confine your
customizations to the file system and use the cluster-sharing functionality.
Try it and see if it does what you need it to do.


 If you do indeed have a description of this somewhere that I just did not
 find, then I am sorry but would be grateful for a link ;-)

Sorry, but there's no documentation for this feature on the wiki yet.
However, there is a pretty detailed description of the functionality and
the process on the cloudman console after you click the green 'share a
cluster' button. I'll work on preparing some additional documentation on
this topic.

Enis


 Cheers,
 Marcel



 On 12/1/11 10:04 PM, Brad Chapman wrote:


 Marcel;

  well, I know I do not have to create a new AMI if I want to reuse an
 instance myself.

 However, I would like to share the modified GalaxyCloudman version with
 other people and therefore I do have to create an AMI.


 What Enis was suggesting is using the share-a-cluster functionality
 built into CloudMan. This bundles your data volumes as snapshots and
 prepares a sharable cluster that anyone can initiate.

 In the CloudMan interface, there is a little green button next to the
 cluster name that enables this. That is definitely the easiest way to
 share and distribute modified CloudMan versions.

  Ok, I will try to make this work somehow ... but I guess there are no
 immediate clues as to what could have gone wrong? Or do you have any
 ideas what I should try?


 After CloudMan boots once, you need to clean up some files before
 preparing an AMI. This is the automated code we use to clean up for
 prepping CloudMan compatible AMIs:

 https://github.com/chapmanb/cloudbiolinux/blob/master/cloudbio/cloudman.py#L87

 Be careful if you run that directly. It runs immediately before bundling
 and removes the ssh keys (so you don't have a backdoor to the AMI you
 are distributing) so you want to do it as the last thing. It also
 assumes you have unmounted all of the associated Galaxy data libraries.

 Hope this helps,
 Brad



 --
 Marcel Schumann

 University of Tuebingen
 Wilhelm Schickard Institute for Computer Science
 Division for Applied Bioinformatics
 Room C313, Sand 14, D-72076 Tuebingen

 phone:  +49 (0)7071-29 70437
 fax:  +49 (0)7071-29 5152
 email:  schum...@informatik.uni-tuebingen.de

___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/