Re: [galaxy-dev] [galaxy-user] Why SGE needed for galaxy ?

2013-02-12 Thread Zeeshan Ali Shah
Thanks Enis,
The paper is good; I was looking for something like it.

OK, I will use CBL. Just a small query:

We use CBL for the CloudMan image, I think. Should we use the same image
for the workers and have the CloudMan script use it when launching nodes?


BR

Zeeshan

On W7-Feb 11, 2013, at 9:27 PM, Enis Afgan wrote:

 Here's a link to the architecture paper: 
 http://onlinelibrary.wiley.com/doi/10.1002/cpe.1836/full
 
 Building CloudMan image from the repo you mention will not work - that repo 
 is for cloudman itself while you need an image capable of running cloudman. 
 For that, you should use CBL.
 
 Also, here are some instructions about setting up cloudman and galaxy on 
 OpenNebula cloud: 
 https://www.cloud.sara.nl/projects/mattiasdehollander-project/wiki (note that 
 this mentions use of mi-deployment set of scripts; since that, mi-deployment 
 has been merged into CBL).
 
 
 On Mon, Feb 11, 2013 at 8:21 PM, Zeeshan Ali Shah zas...@pdc.kth.se wrote:
 Hi, 
 
 Thanks for answers , 
 
 Actually, I was trying to understand CloudMan's role in Galaxy. Do you have an 
 architecture paper I should read? For example, what I understood is that 
 CloudMan runs a Python server and manages SGE through it via some Python 
 script (maybe I am wrong). 
 
 Our proposed installation is like this: users launch CloudMan from the 
 OpenNebula cloud; once CloudMan is running they can add more nodes, which 
 end up in the same private cloud. As you suggested, in this particular case I 
 should use the same image for both the master (CloudMan) and the workers, am I right?
 
 Actually, I built one image via https://bitbucket.org/galaxy/cloudman . Do you 
 suggest that I should build the image from the CBL tree you mentioned below, 
 which would have both CloudMan and Galaxy together? 
 
 BR
 
 Zeeshan
 
 On W6-Feb 8, 2013, at 10:21 PM, Enis Afgan wrote:
 
 As far as actually building the image, the recommended method is to use 
 CloudBioLinux build scripts: https://github.com/chapmanb/cloudbiolinux
 There is a CloudMan flavor of CBL that allows you to build only CloudMan- 
 and Galaxy-required parts: 
 https://github.com/chapmanb/cloudbiolinux/tree/master/contrib/flavor/cloudman
 
 
 On Sat, Feb 9, 2013 at 12:24 AM, Dannon Baker dannonba...@me.com wrote:
 The workers don't need their own copy of galaxy installed, but a shared 
 filesystem is a requirement for galaxy (in any cluster environment -- see 
 the galaxy wiki for more 
 http://wiki.galaxyproject.org/Admin/Config/Performance/Cluster).  Cloudman 
 handles managing NFS for you and sharing the galaxy/tools/index/data 
 volumes.  In order for workers to communicate with the master instance, 
 they'll need the cloudman installation as well, so you should use the same 
 image.
 
 Now that I've answered that, I'm not sure I totally understand your proposed 
 installation yet, but if you're suggesting bypassing cloudman for 
 installation on a private cloud it should be possible.  You'd want the 
 master instance up full time running as the galaxy front end, dispatching 
 jobs to a separate cluster managed by SGE/PBS/whatever.  Basically the 
 standard cluster configuration outlined in the wiki above, but you'd want 
 your worker nodes automatically configured to mount the shared directories 
 and join the PBS/SGE queue so they could handle jobs.
 
 Depending on what type of private cloud you're working with, it might be 
 easier to just see if you can get cloudman to work :)
 
 Lastly, I swapped this message to galaxy-dev since it's about installation 
 nuts and bolts.
 
 -Dannon
 
 On Feb 8, 2013, at 3:02 AM, Zeeshan Ali Shah zas...@pdc.kth.se wrote:
 
  Dear Enis, thanks for the reply; as you are a CloudMan developer, it is good 
  to see you on the list.
 
  Q2: On worker nodes, do we need Galaxy installed with its shared directories, 
  like galaxyindices and galaxydata?
  Q3: For a private cloud setup, do you prefer to have a master image with 
  CloudMan and Galaxy and use the same image for workers as well? Or can worker 
  images be a vanilla OS?
 
 
  BR
 
  Zeeshan
 
  On W6-Feb 7, 2013, at 11:50 PM, Enis Afgan wrote:
 
  Hi Zeeshan,
  In order to gain from the scalability of the cloud, SGE does need to run. 
  However, CloudMan sets all that up and manages it going forward.
 
  Enis
 
 
  On Fri, Feb 8, 2013 at 8:59 AM, Zeeshan Ali Shah zas...@pdc.kth.se 
  wrote:
  Hi,
 
  It seems that CloudMan needs SGE for scaling. Is SGE also needed when 
  running on a private cloud?
 
  Zeeshan
  ___
  The Galaxy User list should be used for the discussion of
  Galaxy analysis and other features on the public server
  at usegalaxy.org.  Please keep all replies on the list by
  using reply all in your mail client.  For discussion of
  local Galaxy instances and the Galaxy source code, please
  use the Galaxy Development list:
 
http://lists.bx.psu.edu/listinfo/galaxy-dev
 
  To manage your subscriptions to this and other 

[galaxy-dev] Bug in Galaxy Reports tool

2013-02-12 Thread Joachim Jacob |VIB|

Hi all,


Running ~/galaxy-dist/run_reports.sh brings up the Galaxy Reports 
interface on port 9001.
Clicking on 'Jobs per user', picking a user, picking a month, and then 
clicking on 'State' in the bar at the top clears the screen.



Thanks,
Joachim

--
Joachim Jacob

Rijvisschestraat 120, 9052 Zwijnaarde
Tel: +32 9 244.66.34
Bioinformatics Training and Services (BITS)
http://www.bits.vib.be
@bitsatvib

___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

 http://lists.bx.psu.edu/


Re: [galaxy-dev] [galaxy-user] Why SGE needed for galaxy ?

2013-02-12 Thread Dannon Baker
Yes, you use the same image for both masters and the workers.

-Dannon


Re: [galaxy-dev] Bug in Galaxy Reports tool

2013-02-12 Thread Greg Von Kuster
Hello Joachim,

Thanks for reporting this.  I've added a Trello card for this issue:

https://trello.com/card/galaxy-reports/506338ce32ae458f6d15e4b3/610

Greg Von Kuster



Re: [galaxy-dev] [galaxy-user] Why SGE needed for galaxy ?

2013-02-12 Thread Zeeshan Ali Shah
Thanks Dannon. 


BR

Zeeshan


[galaxy-dev] Preffered way of running a tool on multiple input files

2013-02-12 Thread Hagai Cohen
Hi,
I'm looking for a preferred way of running Bowtie (or any other tool) on
multiple input files and run statistics on the Bowtie output afterwards.

The input is a directory of files fastq1..fastq100
The bowtie output should be bed1...bed100
The statistics tool should run on bed1...bed100 and return xls1..xls100
Then I will write a tool which will get xls1..xls100 and merge them to one
final output.

I searched for similar cases, but I couldn't find anyone who has had this
problem before.
I can't use the parallelism tag, because what would the input for each
tool be? It should be a fastq file, not a directory of fastq files.
Nor would I like to run each fastq file in a different workflow, which
would create a mess.

I have thought of only two solutions:
1. Implement new datatypes, bed_dir and fastq_dir, and implement new tool
wrappers which take a folder instead of a file.
2. Merge the input files before sending them to Bowtie, and use the
parallelism tag to have them split and merged again by each tool.

Does anyone have a better suggestion?

Thanks,
Hagai

Re: [galaxy-dev] Preffered way of running a tool on multiple input files

2013-02-12 Thread Joachim Jacob |VIB|

Hi Hagai,

Actually, using a workflow, you are able to select multiple input files, 
and let the workflow run separately on all input files.


I would proceed by creating a data library for all your fastq files, 
which you can upload via FTP or from a system directory.
Using a sample of your fastq files, you can build the steps you want to 
perform in a history and extract a workflow from it.
Next, copy all fastq files from the data library into a new history, and run 
your workflow on all the input files.


I hope this helps you further,
Joachim


Joachim Jacob

Rijvisschestraat 120, 9052 Zwijnaarde
Tel: +32 9 244.66.34
Bioinformatics Training and Services (BITS)
http://www.bits.vib.be
@bitsatvib



Re: [galaxy-dev] Preffered way of running a tool on multiple input files

2013-02-12 Thread Hagai Cohen
Thanks for your answer.
I figured that there is an option to run a workflow on multiple files, but
I can't merge the outputs afterwards. I would like the workflow to return
one final output.

But you gave me another idea.
Can I somehow tell one workflow to run on another workflow's output?
If this can be done, I can run 100 different workflows with Bowtie and
statistics, each working on one fastq file, then run another workflow which
gets the 100 xls inputs and merges them into one.
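
The final merge step described above (many per-sample xls outputs combined
into one result) could be sketched as a small standalone script. This is only
an illustration, not an existing Galaxy tool: it assumes the "xls" outputs are
tab-separated text files that all share the same header line, and the function
name merge_tabular and the added "source" column are hypothetical.

```python
import csv
import os


def merge_tabular(input_paths, output_path):
    """Concatenate tab-separated files that share one header into one file.

    Assumption: every input has the same header row. A leading 'source'
    column (hypothetical) records which input file each row came from.
    """
    header = None
    with open(output_path, "w", newline="") as out_handle:
        writer = csv.writer(out_handle, delimiter="\t", lineterminator="\n")
        for path in input_paths:
            with open(path, newline="") as in_handle:
                reader = csv.reader(in_handle, delimiter="\t")
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty inputs
                if header is None:
                    # First non-empty input defines the header of the output.
                    header = file_header
                    writer.writerow(["source"] + header)
                elif file_header != header:
                    raise ValueError("header mismatch in %s" % path)
                for row in reader:
                    writer.writerow([os.path.basename(path)] + row)
    return output_path
```

Wrapped as a Galaxy tool, the input paths would presumably come from the
datasets selected in the history; keeping the source file name in each row is
one way to preserve which sample a line belongs to after the merge.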





Re: [galaxy-dev] Preffered way of running a tool on multiple input files

2013-02-12 Thread John Chilton
Hagai,

Jorrit Boekel and I have implemented essentially literally what you described.

https://bitbucket.org/galaxy/galaxy-central/pull-request/116/multiple-file-datasets-implementation

Merge this in to your Galaxy tree
https://bitbucket.org/jmchilton/galaxy-central-multifiles-feb2013.
Switch use_composite_multfiles to true in universe_wsgi.ini. Then you
automatically get a multiple file version of each of your datatypes
(so m:fastq, m:xls, etc...). Tools that process a singleton version of
a datatype can seamlessly process a multiple file version of that
dataset in parallel and the outputs that are created as a result are
going to be of the multifile type of the original types.

These datasets can be created using the multifile upload tool, a
directory on the FTP server, or via library imports via API.

Input names are preserved like you described.

Some huge caveats:
 - The Galaxy team has expressed reservations about this particular
implementation, so it will never be officially supported.
 - It's early days and this is very experimental (use at your own risk).
 - I am pretty sure it is not going to work with bed files, since
there is special logic in Galaxy to deal with bed indices (I think we
can work around it by declaring a concrete m:bed type and replicating
that logic; it's on the TODO list, but I'm happy to accept contributions :) ).

More discussion of this can be found at these places:
http://www.youtube.com/watch?v=DxJzEkOasu4
https://bitbucket.org/galaxy/galaxy-central/pull-request/116/multiple-file-datasets-implementation
http://dev.list.galaxyproject.org/pass-more-information-on-a-dataset-merge-td4656455.html

-John





Re: [galaxy-dev] Preffered way of running a tool on multiple input files

2013-02-12 Thread Joachim Jacob |VIB|

You cannot directly couple different workflows.

But you could indeed copy all outputs of the different workflows into 
one history, and create a separate workflow with your tool to work on 
all those input files.


Cheers,
Joachim

Joachim Jacob

Rijvisschestraat 120, 9052 Zwijnaarde
Tel: +32 9 244.66.34
Bioinformatics Training and Services (BITS)
http://www.bits.vib.be
@bitsatvib



Re: [galaxy-dev] Upload file : auto-detect based on file extension ?

2013-02-12 Thread James Taylor
I don't believe you can, although it should be possible to extend
upload to provide that information. However, is there no header you
can use in your filetype to detect it?
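
A header-based check of the kind suggested here might look like the sketch
below. It is only an illustration under assumptions: the magic bytes MYFMT1
and the class name are hypothetical, and the class is a plain Python stand-in
rather than a real Galaxy datatype. In Galaxy, the equivalent logic would go
into the sniff() method of the custom datatype class, which works from the
path of the uploaded data rather than the original file name.

```python
class ProprietaryFormatSketch(object):
    """Stand-in for a datatype class; not a real Galaxy base class."""

    # Hypothetical magic bytes; replace with whatever the real format starts with.
    MAGIC = b"MYFMT1"

    def sniff(self, filename):
        """Return True if the file at `filename` starts with the magic bytes."""
        try:
            with open(filename, "rb") as handle:
                return handle.read(len(self.MAGIC)) == self.MAGIC
        except IOError:
            # Unreadable or missing file: report "not this format".
            return False
```

Because the check looks only at the file content, it does not depend on the
name the file had when it was uploaded.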

--
James Taylor, Assistant Professor, Biology/CS, Emory University


On Mon, Feb 11, 2013 at 10:51 PM, David Angot dav...@intersect.org.au wrote:
 Hi,

 We are using a proprietary file format in some of our tools.
 I successfully added a new data type, but what I would like to do is to use
 the auto-detect when uploading the file, just based on the extension of the
 file.

 My guess is I have to override sniff() in the datatype class and test
 for the extension? Something like this:

 if file.endswith('.extension123'):
 ...

 But how do I get the original filename when it was uploaded ?

 Thanks,

 --
 David





[galaxy-dev] Datasets linked into Galaxy do not work

2013-02-12 Thread Sarah Diehl
Hi all,

I have problems with datasets that were linked into Galaxy data libraries (as 
Admin: Add datasets, select Upload files from filesystem paths and Link to 
files without copying to Galaxy). Those files cannot be looked at, downloaded 
or viewed in a genome browser.

Errors in the browser are:
The requested URL /datasets/126e6c4f4c2d468e/display/ was not found on this 
server.
The requested URL /library_common/download_dataset_from_folder was not found on 
this server.
An error occurred while accessing: 
http://galaxy.immunbio.mpg.de/display_application/7108f175b5be4900/igv_bam/local_default/d85d47a3ee6ecd54/data/galaxy_7108f175b5be4900.bam
 Read error; BinaryCodec in readmode; streamed file (filename not available)

I don't see any errors in the log files.

We do this kind of linking a lot and everything worked fine in the past. I need 
to use this feature and need it to work, due to hard disk space and data 
duplication.

Any help is appreciated.

Best regards,
Sarah


[galaxy-dev] Problems linking Data inside Galaxy

2013-02-12 Thread Gaueko Erge
Hi,

I have followed, carefully, the instructions posted in:
http://wiki.galaxyproject.org/Admin/Data%20Integration

Yet, despite my best efforts, when I get, say, all exons from the hg19 build and
try to fetch their sequences, Galaxy tells me that the sequences for hg19
are not there.

I find this surprising because I have downloaded:

rsync://datacache.g2.bx.psu.edu/indexes/
rsync://hgdownload.cse.ucsc.edu/gbdb/
rsync://hgdownload.cse.ucsc.edu/goldenPath/vicPac1/bigZips

databases and added the correct paths to the 'alignseq.loc' file

Any ideas will be appreciated

Thanks

--G

[galaxy-dev] Overwriting tools directory?

2013-02-12 Thread Amanda Zuzolo
Since I can't find this on any of the wikis: when updating Galaxy
through mercurial, is the tools directory affected? Our instance has
backups of the directory, but I want to know whether I will have to
take care of them when I pull down the latest release.

Thanks in advance.

-- 
Amanda Zuzolo
Bioengineering Major, George Mason University
Metabiome Informatics Group, Environmental Biocomplexity


Re: [galaxy-dev] Overwriting tools directory?

2013-02-12 Thread Björn Grüning
Hi Amanda,

The tools directory is also tracked and updated in mercurial, but only
the tools that are shipped with Galaxy. If you have inserted your own
tools, they won't be affected. If you modified tools that are part
of main Galaxy, then you will probably get a merge conflict, but
mercurial will tell you that.

Kind regards,
Bjoern

 Since I can't find this on any of the wikis: when updating Galaxy
 through mercurial, is the tools directory affected? Our instance has
 backups of the directory, but I want to know whether I will have to
 take care of them when I pull down the latest release.
 
 Thanks in advance.
 




Re: [galaxy-dev] Extract genomic DNA job error

2013-02-12 Thread Jeremy Goecks
 So we did the following:
 Downloaded the data
 Added the paths to .loc file
 restarted galaxy
 
 We still are getting the following error:
 
 2: Extract Genomic DNA on data 1
 empty
 format: fasta, database: hg19
 56 warnings, 1st is: Chromosome by name 'chr22' was not found for build 
 'hg19'. Skipped 56 invalid lines, 1st is #1, chr22   23487552
 23487738 NM_004914_cds_0_0_chr22_23487553_f 0   +

My guess is that your .loc file is not set up correctly; you'll need to make 
sure that there are tab characters (not spaces) separating the columns. 

If that's not the case, try finding the command line in the galaxy log and 
running it from the command line (make sure to prepend the command with 
'PYTHONPATH=./lib' to get needed Galaxy libraries) and adding debugging 
statements to see why your twobit file isn't being found. Line 200 of 
extract_genomic_dna.py may be the key failure point.
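
The tab check described above can be automated. The following is a rough sketch, not part of Galaxy: the function name, the sample entries and the expected column count are assumptions, so pass whatever column count your particular .loc file is documented to use.

```python
def check_loc_lines(lines, expected_columns):
    """Return (line_number, reason) pairs for entries that look malformed.

    Blank lines and lines starting with '#' are ignored.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != expected_columns:
            problems.append(
                (number, "expected %d tab-separated columns, found %d"
                 % (expected_columns, len(fields)))
            )
    return problems


if __name__ == "__main__":
    # Hypothetical entries: the first uses a tab, the second uses spaces.
    sample = [
        "# build<TAB>path\n",
        "hg19\t/data/hg19/hg19.2bit\n",
        "hg19    /data/hg19/hg19.2bit\n",
    ]
    for number, reason in check_loc_lines(sample, expected_columns=2):
        print("line %d: %s" % (number, reason))
```

To check a real file, pass the open file object as `lines`; an entry separated by spaces is reported as having a single column.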

Finally, please keep all replies on the mailing list for community purposes.

Thanks,
J.


[galaxy-dev] Custom Cheetah filters?

2013-02-12 Thread Smithies, Russell
I want to add a filter to strip whitespace and newlines from a text input box 
so I can pipe the sanitized string to a command.
Documentation is a bit sparse (and my Python a bit basic) so does anyone have 
an example?
Perhaps there's a better way of doing it - regex maybe?

Any ideas?

Thanx,

--Russell
--


===
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
===

[galaxy-dev] Regarding bug card #572 (Trello) and Galaxy Public Server

2013-02-12 Thread Matthew Paul
Regarding this issue (
https://trello.com/card/filter-and-sort-select-tool-not-dealing-with-special-characters-right/506338ce32ae458f6d15e4b3/572
):

No bug is apparent when using an up-to-date local instance of Galaxy (local
history on Trello card). However, when using the Galaxy public server the
bug is apparent (regular expressions are not registered). So we have come
to a few conjectures,

1) The bug has already been fixed in the newest version of Galaxy

or

2) Somehow, instantiating the public server is the source of the error
(which is unlikely).

If one can elucidate any misconceptions (especially about the current
version of the Galaxy public server) that we may have that would be great.
We would like to resolve this bug.
Thank You

[galaxy-dev] Fwd: Custom Cheetah filters?

2013-02-12 Thread Ross
Hi Russell,

There may be a better way, but this works for me in the Toolfactory to
create space and special character free names?

<param name="foo" type="text" value="" label="Foo">
  <sanitizer invalid_char="">
    <valid initial="string.letters,string.digits"/>
  </sanitizer>
</param>
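
For readers less familiar with the sanitizer, its effect here is roughly a whitelist: characters that are not letters or digits are replaced by the (empty) invalid character. The plain-Python illustration below is only a sketch of that behaviour, not Galaxy's implementation. The function name is made up, and string.ascii_letters is used because string.letters exists only in Python 2.

```python
import string

# Characters allowed through, mirroring "string.letters,string.digits".
ALLOWED = set(string.ascii_letters + string.digits)


def keep_letters_and_digits(text, replacement=""):
    """Replace every character outside ALLOWED with `replacement`."""
    return "".join(ch if ch in ALLOWED else replacement for ch in text)


if __name__ == "__main__":
    print(keep_letters_and_digits("my tool-name v2!"))
```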


On Wed, Feb 13, 2013 at 11:22 AM, Smithies, Russell 
russell.smith...@agresearch.co.nz wrote:

 I want to add a filter to strip whitespace and newlines from a text input
 box so I can pipe the sanitized string to a command.

 Documentation is a bit sparse (and my Python a bit basic) so does anyone
 have an example?

 Perhaps there’s a better way of doing it – regex maybe?

 Any ideas?

 Thanx,

 --Russell


Re: [galaxy-dev] Regarding bug card #572 (Trello) and Galaxy Public Server

2013-02-12 Thread Dannon Baker
Matthew,

This is related to the version of grep called by Galaxy.  GNU grep presents the 
issue while BSD grep works as expected.  I haven't really dug much, but this 
could probably be addressed by handling input parameters better in the tool 
wrapper, if you want to take a look.
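
One way a wrapper can handle input parameters more defensively is to pass the user's pattern to grep as its own argument instead of interpolating it into a shell string. The sketch below is only an illustration of that idea; it is not the actual Galaxy wrapper, the function names are hypothetical, and it has not been confirmed to resolve the GNU/BSD difference described above.

```python
import subprocess


def build_grep_command(pattern, path, invert=False):
    """Build an argv list for grep without involving a shell."""
    command = ["grep", "-E"]
    if invert:
        command.append("-v")
    # -e marks the next argument as the pattern even if it starts with '-';
    # "--" ends option parsing before the file name.
    command.extend(["-e", pattern, "--", path])
    return command


def select_lines(pattern, path, invert=False):
    """Run grep and return the matching lines as a list of strings."""
    result = subprocess.run(
        build_grep_command(pattern, path, invert),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # grep exits with 1 when nothing matched; larger codes signal an error.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()
```

Because no shell is involved, characters such as `|`, `$` or quotes in the pattern reach grep unchanged.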

-Dannon


On Feb 12, 2013, at 7:31 PM, Matthew Paul mrp...@g.cofc.edu wrote:

 Regarding this issue 
 (https://trello.com/card/filter-and-sort-select-tool-not-dealing-with-special-characters-right/506338ce32ae458f6d15e4b3/572):
 
 No bug is apparent when using an up-to-date local instance of Galaxy (local 
 history on Trello card). However, when using the Galaxy public server the bug 
 is apparent (regular expressions are not registered). So we have come to a 
 few conjectures, 
 
 1) The bug has already been fixed in the newest version of Galaxy
 
 or
 
 2) Somehow, instantiating the public server is the source of the error (which 
 is unlikely).
 
 If one can elucidate any misconceptions (especially about the current version 
 of the Galaxy public server) that we may have that would be great. We would 
 like to resolve this bug.
 Thank You
 
 


[galaxy-dev] tool integration SOAPdenovo

2013-02-12 Thread Jorge Andrade
Dear all,

Is there a repository where I can find SOAPdenovo tool wrappers for
Galaxy?  I would like to install this tool in our local Galaxy server.

Thanks,

Jorge

Re: [galaxy-dev] tool integration SOAPdenovo

2013-02-12 Thread Ross
I don't think there's one in a toolshed but someone's clearly done some
work on it at http://galaxy.cbiit.cuhk.edu.hk/

Perhaps you may be able to convince them of the many benefits of
contributing back to the community by sharing some of their tool code?



On Wed, Feb 13, 2013 at 12:32 PM, Jorge Andrade andrade.jo...@gmail.com wrote:

 Dear all,

 Is there a repository where I can find SOAPdenovo tool wrappers for
 Galaxy?  I would like to install this tool in our local Galaxy server.

 Thanks,

 Jorge




Re: [galaxy-dev] Fwd: Custom Cheetah filters?

2013-02-12 Thread Smithies, Russell
Bodged using string functions :-)

#set $data = ''.join([line for line in $seq_source_type.seq_paste.split() if line[0] != '>'])

Removes the fasta header line and any white-space so I can pipe sequence 
directly to blastn.
Means users can quickly paste in a bit of sequence for blasting without first 
having to upload the data to their history.
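
The same idea written as plain Python, as a sketch only (the function name is hypothetical). It works line by line instead of splitting on all whitespace, so the description words of a header such as ">seq1 test sequence" are dropped together with the header rather than being joined into the sequence.

```python
def sequence_from_paste(pasted):
    """Drop FASTA header lines and all whitespace from pasted text."""
    kept = []
    for line in pasted.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and header lines
        kept.append("".join(line.split()))  # remove spaces inside the line
    return "".join(kept)


if __name__ == "__main__":
    print(sequence_from_paste(">seq1 test sequence\nACGT ACGT\n\nTTGA\n"))
```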

--Russell
--

From: galaxy-dev-boun...@lists.bx.psu.edu 
[mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Ross
Sent: Wednesday, 13 February 2013 1:34 p.m.
To: galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] Fwd: Custom Cheetah filters?

Hi Russell,

There may be a better way, but this works for me in the Toolfactory to create 
space and special character free names?

<param name="foo" type="text" value="" label="Foo">
  <sanitizer invalid_char="">
    <valid initial="string.letters,string.digits"/>
  </sanitizer>
</param>

On Wed, Feb 13, 2013 at 11:22 AM, Smithies, Russell 
russell.smith...@agresearch.co.nz wrote:
I want to add a filter to strip whitespace and newlines from a text input box 
so I can pipe the sanitized string to a command.
Documentation is a bit sparse (and my Python a bit basic) so does anyone have 
an example?
Perhaps there's a better way of doing it - regex maybe?

Any ideas?

Thanx,

--Russell
--



Re: [galaxy-dev] Questions regarding updating the Galaxy on Amazon AWS

2013-02-12 Thread Enis Afgan
Hi Chun-Yuan,
Sorry to see you're running into so much trouble. The reality of
the situation is that the Update Galaxy button in CloudMan is currently
broken due to the changes in Galaxy that make the updates a rather manual
process. For the past several weeks, the team has been working on sorting
this out and we are getting closer to having that resolved. The upcoming
upgrade will include an update to Galaxy itself, a number of tools, and the
machine image. Also, a change in the architecture of the cloud deployment
will be necessary to accomplish this. Specifically, galaxyTools and
galaxyData volumes (ie, file systems) will be merged into a single file
system.

As far as mi-deployment goes - it has been deprecated at this point in
favor of CloudBioLinux (cloudbiolinux.org). Over the past 6 months or so,
the functionality from mi-deployment has been merged with CBL and is the
preferred way of building images and tools.

As far as getting a volume attached to a specific instance, the AWS console
allows you to attach a volume to an instance at a given device. Then, the
device becomes available on the attached instance. Note that the device ID
may differ from the one you used to attach the device as - this is noted in
the AWS console at the time of attaching a volume so you should look for a
device with that ID.
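
As an illustration of the device-ID point, here is a small sketch for locating the attached volume from inside the instance. It assumes the common case where a volume attached as /dev/sdX shows up under the kernel name /dev/xvdX; the function names are hypothetical, and the AWS console remains the authority on which name to expect.

```python
import os


def candidate_device_paths(requested):
    """Return device paths worth checking for a volume attached as `requested`."""
    candidates = [requested]
    directory, name = os.path.split(requested)
    if name.startswith("sd"):
        # Assumed renaming: /dev/sdf may appear as /dev/xvdf on the instance.
        candidates.append(os.path.join(directory, "xvd" + name[2:]))
    return candidates


def find_attached_device(requested, exists=os.path.exists):
    """Return the first candidate path that exists, or None if none do."""
    for path in candidate_device_paths(requested):
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    print(candidate_device_paths("/dev/sdf"))
    print(find_attached_device("/dev/sdf"))
```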

As far as support goes - unfortunately, we do not have resources to offer
phone support. Instead, you should subscribe and send emails to galaxy-dev
mailing list (http://lists.bx.psu.edu/listinfo/galaxy-dev). I have CC'd
that list now and posting to that list in the future should give the
biggest exposure to the questions you may have.

Hope this helps. Let us know if you have any more questions,
Enis



On Tue, Feb 12, 2013 at 9:04 AM, Chun-Yuan Huang hua...@uwm.edu wrote:

 Dear Dr. Afgan,

 I am in a lab that focuses on NGS analysis for Zebrafish Bis-seq dataset,
 and is about to use the Bismark tool
 (http://www.bioinformatics.babraham.ac.uk/projects/bismark)
 for that purpose.  We were very excited to learn that Bismark works in the
 Galaxy project that you are the main author of, and have been trying to
 install Bismark into Galaxy for the past few weeks. We have tried and
 received the following:
 1. Start up our own instance on Amazon AWS using ami-da58aab3,
 galaxy-cloudman-2011-03-22, and following the instructions from Galaxy Wiki
 to install Bismark manually, including the necessary modifications on the
 files universe_wsgi.ini, tool_conf.xml, tool_data_table_conf.xml.
 However, the installed Bismark kept generating error messages that we could
 not fully resolve.
 2.  We later noticed that Bismark can be installed from Tool Shed in a
 rather automatic way. But in order to take advantage of that, we need to
 update the Galaxy system to a more recent version in order to have the
 recent/automatic Tool Shed. Oddly, the link in the Cloudman Admin page for
 Update Galaxy from a provided repository doesn't work for us.  So we
 tried to update the galaxy manually, then the Bismark from Tool Shed. It
 worked partially, as some tools are functional but not others, including
 the Bismark. We are in the process of configuring individual tools as well
 as Bismark.
 3. During the frustration, I kept wondering whether there is a more
 systemic and better documented way of upgrading Galaxy and installing its
 tools such as Bismark. Although we may be able to figure out all the
 problems eventually, the time and work spent on it is rather expensive as
 compared to a stand-alone Linux box.
 4. So in looking out for a more systemic and better documented solution, I
 just came across your site for the mi-deployment (
 https://bitbucket.org/afgane/mi-deployment).
  By reading its overview, it seems this is exactly the solution I have been
 looking for (sorry for my ignorance on this matter, as I should have tried
 it from the beginning).

 I am in the process of trying mi-deployment. But in the meantime, I would
 like to ask a couple of questions:
 1. Is mi-deployment the right track I should be following, or I am still
 in the wrong place for my situation? Are there more detailed instructions
 on doing it? Do you have any other suggestions? I consider myself and my
 group with fair literacy on bioinformatic tools, NGS tools, and how Linux
 system works (but not experts on it).
 2. Per your instruction on mi-deployment, we have established a
 CloudBioLinux Ubuntu 12.04 instance (version dated to December 2012) and
 have set environment variables for both access key and secret access key,
 in addition to using pip to install boto and fabric. We've attempted to
 create an EBS volume and link it to our EC2 instance, but it does not show
 up within the /dev directory following a kernel restart. How may we get
 it to display such that we can then mount it with sudo mount /dev/VOLUME
 

[galaxy-dev] Circster save just hangs

2013-02-12 Thread Anthonius deBoer
Hi,

I had created a Trackster visualization that had not finished indexing yet,
and I decided to change it into a Circster visualization. I then added a few
more BAM files and tried to save the visualization, and now it just hangs
there... no errors in the logs so far, but no saving of the visualization
either...

Thon

[galaxy-dev] hg clone link on news brief incorrect

2013-02-12 Thread Anthonius deBoer
I think the hg clone link on the news brief is incorrect:

it states: hg clone https://bitbucket.org/galaxy-dist#stable

Probably should be hg clone https://bitbucket.org/galaxy/galaxy-dist#stable

Thon

[galaxy-dev] hg19 reference gnome for Tophat2

2013-02-12 Thread Sachit Adhikari
I downloaded the entire directory of UCSC for the reference genome of
Tophat2. It turns out that Tophat2 and Bowtie2 use the same reference
genome. I found a directory: Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index

with the files:

genome.1.bt2  genome.2.bt2  genome.3.bt2  genome.4.bt2  genome.rev.1.bt2
 genome.rev.2.bt2


While adding the reference genome, I need to edit bowtie2_indices.loc

Shall I replace:

/orig/path/hg19hg19hg19
 /depot/data2/galaxy/bowtie2/hg19/hg19

with

hg19   hg19hg19   Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index
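
A hedged sketch of how such an entry could be generated so that the columns are definitely separated by tabs. It assumes the four-column layout suggested by the lines above (build id, dbkey, display name, index path) and that the last column should be the index base path, i.e. the directory plus the shared file prefix ("genome" for genome.1.bt2 and its siblings). The /data prefix and the function name are hypothetical.

```python
import os


def bowtie2_loc_line(build_id, dbkey, display_name, index_dir, basename):
    """Return one tab-separated .loc entry ending in the index base path."""
    base_path = os.path.join(index_dir, basename)
    return "\t".join([build_id, dbkey, display_name, base_path])


if __name__ == "__main__":
    # Hypothetical absolute location of the downloaded UCSC directory.
    index_dir = "/data/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index"
    print(bowtie2_loc_line("hg19", "hg19", "hg19", index_dir, "genome"))
```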


Thanks,

Sachit

Re: [galaxy-dev] hg19 reference gnome for Tophat2

2013-02-12 Thread Sachit Adhikari
Also, do I have to make all the reference files executable?

On Wed, Feb 13, 2013 at 6:39 AM, Sachit Adhikari 
sachit.techner...@gmail.com wrote:

 I downloaded the entire directory of UCSC for the reference genome of
 Tophat2. It turns out that Tophat2 and Bowtie2 use the same reference
 genome. I found a directory: Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index

 with the files:

 genome.1.bt2  genome.2.bt2  genome.3.bt2  genome.4.bt2  genome.rev.1.bt2
  genome.rev.2.bt2


 While adding the reference genome, I need to edit bowtie2_indices.loc

 Shall I replace:

 /orig/path/hg19hg19hg19
  /depot/data2/galaxy/bowtie2/hg19/hg19

 with

 hg19   hg19hg19   Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index


 Thanks,

 Sachit


Re: [galaxy-dev] Custom Cheetah filters?

2013-02-12 Thread Björn Grüning
Hi Russell,

Also keep in mind that Cheetah is just Python. Maybe you can try
something like this:

$text.strip() or str($text).strip()
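
Note that strip() only trims the ends of a string; newlines or spaces in the middle survive. The short sketch below (function names are made up for illustration) contrasts it with removing all whitespace, which is closer to what the original question asks for.

```python
def strip_ends(text):
    """Remove leading and trailing whitespace only."""
    return text.strip()


def remove_all_whitespace(text):
    """Remove every whitespace character, including inner newlines."""
    return "".join(text.split())


if __name__ == "__main__":
    pasted = "  ACGT\nTTGA \n"
    print(repr(strip_ends(pasted)))
    print(repr(remove_all_whitespace(pasted)))
```

In a Cheetah template the second form would be the expression ''.join(str($text).split()).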

Cheers,
Bjoern

 I want to add a filter to strip whitespace and newlines from a text
 input box so I can pipe the sanitized string to a command.
 
 Documentation is a bit sparse (and my Python a bit basic) so does
 anyone have an example?
 
 Perhaps there’s a better way of doing it – regex maybe?
 
  
 
 Any ideas?
 
  
 
 Thanx,
 
  
 
 --Russell
 
 -- 
 
  
 
 
 

Re: [galaxy-dev] Preffered way of running a tool on multiple input files

2013-02-12 Thread Hagai Cohen
John, that seems great.
I will read this stuff and see if I can use it (the BED format isn't that
essential; Bowtie can output BAM instead).

If it won't work, I will try the other solution, which doesn't require changing
Galaxy's own code (creating hundreds of workflow runs, linking to their
outputs and running a last workflow with the merging tool; this solution
also distributes the work better).
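
The merging tool mentioned above could look roughly like the sketch below. It assumes the per-sample statistics are tab-separated text files sharing one header row, which is an assumption about the "xls" outputs; the function name and the added "source" column are hypothetical.

```python
import csv
import os


def merge_tabular(paths, out_path):
    """Concatenate tab-separated files with identical headers into one file.

    A leading "source" column records which input each row came from.
    """
    expected_header = None
    with open(out_path, "w", newline="") as out_handle:
        writer = csv.writer(out_handle, delimiter="\t", lineterminator="\n")
        for path in paths:
            with open(path, newline="") as in_handle:
                reader = csv.reader(in_handle, delimiter="\t")
                header = next(reader, None)
                if header is None:
                    continue  # empty input file
                if expected_header is None:
                    expected_header = header
                    writer.writerow(["source"] + header)
                elif header != expected_header:
                    raise ValueError("header mismatch in %s" % path)
                for row in reader:
                    writer.writerow([os.path.basename(path)] + row)
```

Called as merge_tabular(list_of_stat_files, "merged.tsv"), it writes one row per input row; wrapping it as a Galaxy tool would still need a tool XML that accepts multiple inputs.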

Because Galaxy is used a lot on sequencer output, I think someday it
should support this kind of job internally.
When I have a running solution, I will publish which solution I
used.

It's really great to know I'm not the first one to attack this problem.
Thanks for the advice.
Hagai




On Tue, Feb 12, 2013 at 5:42 PM, Joachim Jacob |VIB|
joachim.ja...@vib.be wrote:

 You cannot directly couple different workflows.

 But you could indeed copy all outputs of the different workflows into one
 history, and create a separate workflow with your tool to work on all those
 input files.

 Cheers,

 Joachim

 Joachim Jacob

 Rijvisschestraat 120, 9052 Zwijnaarde
 Tel: +32 9 244.66.34
 Bioinformatics Training and Services (BITS)
 http://www.bits.vib.be
 @bitsatvib

 On 02/12/2013 04:31 PM, Hagai Cohen wrote:


 Thanks for your answer.
 I figured that there is an option to run a workflow on multiple files,
 but I can't merge the outputs afterwards. I would like the workflow to
 return one final output.

 But you gave me another idea.
 Can I somehow tell one workflow to run on other workflow output?
 If this can be done, I can run 100 different workflows with bowtie 
 statistics, each working on one fastq file, then run another workflow which
 gets 100 xls inputs and merge them to one.




 On Tue, Feb 12, 2013 at 5:20 PM, Joachim Jacob |VIB| 
 joachim.ja...@vib.be wrote:

 Hi Hagai,

 Actually, using a workflow, you are able to select multiple input
 files, and let the workflow run separately on all input files.

 I would proceed by creating a data library for all your fastq
 files, which you can upload via FTP, or via a system directory.
 You can use a sample of your fastq files to create the steps in a
 history you want to perform, and extract a workflow out of it.
 Next, copy all fastq files from a data library in a new history,
 and run your workflow on all the input files.

 I hope this helps you further,
 Joachim


 Joachim Jacob

 Rijvisschestraat 120, 9052 Zwijnaarde
 Tel: +32 9 244.66.34

 Bioinformatics Training and Services (BITS)
 http://www.bits.vib.be
 @bitsatvib


 On 02/12/2013 04:02 PM, Hagai Cohen wrote:

 Hi,
 I'm looking for a preferred way of running Bowtie (or any
 other tool) on multiple input files and run statistics on the
 Bowtie output afterwards.

 The input is a directory of files fastq1..fastq100
 The bowtie output should be bed1...bed100
 The statistics tool should run on bed1...bed100 and return
 xls1..xls100
 Then I will write a tool which will get xls1..xls100 and merge
 them to one final output.

 I searched for similar cases, and I couldn't find anyone
 who had this problem before.
 I can't use the parallelism tag, because what would be the input
 for each tool? It should be a fastq file, not a directory of
 fastq files.
 Nor would I like to run each fastq file in a different
 workflow, creating a mess.

 I thought of only two solutions:
 1. Implement new datatypes, bed_dir & fastq_dir, and implement
 new tool wrappers which will get a folder instead of a file.
 2. Merge the input files before sending them to bowtie, and use
 the parallelism tag to have them split & merged again on
 each tool.

 Does anyone have any better suggestion?

 Thanks,
 Hagai










