[galaxy-user] Chip-SEq S.cerevisiae IOn-Torrent

2014-02-04 Thread Mónica Pérez Alegre
Hi everybody.

I'm trying to analyze ChIP-Seq data from Ion Torrent using Peak Calling/MACS. I 
have some questions:


· How do I establish the tag size? The median read length in my data 
is 156 bp and the maximum is 306 bp.

· Bandwidth: is this the sonication size?

Thanks in advance

Regards

☺If you have used the Services of the Genomics Unit of Cabimer, we would be 
grateful if you would give us a mention in future publications
Mónica Pérez Alegre, PhD
Genomics Unit
CABIMER-CSIC
Edif. CABIMER - Avda. Américo Vespucio s/n
Parque Científico y Tecnológico Cartuja 93
41092 Seville-SPAIN
Tlf:   +34 954 467 828
Fax: +34 954 461 664
http://www.cabimer.es/
http://www.cabimer.es/web/es/unidades-apoyo/genomica

___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using reply all in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:

  http://galaxyproject.org/search/mailinglists/

Re: [galaxy-user] ChIP-Seq Normalization to total number of reads

2013-12-02 Thread Björn Grüning
Hi Catheryn,

for ChIP-seq analysis, normalisation and BAM file correlation we use
deepTools. Here you can read more about it:

https://github.com/fidelram/deepTools

And here is the toolshed repository:
http://toolshed.g2.bx.psu.edu/view/bgruening/deeptools

Cheers,
Bjoern
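
For readers who want to see what normalising to the total number of reads
means in practice, here is a minimal Python sketch. It is not deepTools code:
the function names, the example bin counts and the pseudocount are all
illustrative assumptions. Each sample's binned counts are scaled to reads per
million mapped reads, and the scaled ChIP signal is then compared with its
scaled input as a log2 ratio.

```python
import math


def scale_to_rpm(bin_counts, total_mapped_reads):
    """Scale raw per-bin read counts to reads per million mapped reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    factor = 1_000_000 / total_mapped_reads
    return [count * factor for count in bin_counts]


def log2_ratio(chip_rpm, input_rpm, pseudocount=1.0):
    """Per-bin log2(ChIP / input); the pseudocount avoids division by zero."""
    return [
        math.log2((chip + pseudocount) / (inp + pseudocount))
        for chip, inp in zip(chip_rpm, input_rpm)
    ]


# Hypothetical example: two bins from a ChIP sample with 2 million mapped
# reads and from its input with 4 million mapped reads.
chip = scale_to_rpm([10, 20], 2_000_000)
control = scale_to_rpm([8, 8], 4_000_000)
ratios = log2_ratio(chip, control)
```

Scaling each dataset by its own library size first is what makes two
experiments with very different read totals comparable afterwards.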

 Dear Galaxy,
 
  
 
 I am trying to analyze my ChIP-Seq data from Illumina using Galaxy. I
 have 2 datasets that I want to compare after normalizing each of them
 to their respective inputs, and these 2 datasets have very different
 number of reads to start with, is there a way to first normalize each
 dataset to total number of reads in Galaxy?
 
  
 
 Thanks. Your help is very much appreciated.
 
  
 
 Catheryn
 
 
 
  
 
 


[galaxy-user] ChIP-Seq Normalization to total number of reads

2013-12-01 Thread Wooi Lim
Dear Galaxy,

I am trying to analyze my ChIP-Seq data from Illumina using Galaxy. I have 2 
datasets that I want to compare after normalizing each of them to their 
respective inputs, and these 2 datasets have very different number of reads to 
start with, is there a way to first normalize each dataset to total number of 
reads in Galaxy?

Thanks. Your help is very much appreciated.

Catheryn




[galaxy-user] ChIP-seq data analysis question

2012-06-04 Thread cjt5
Hello,

My name is Christopher Terranova and I am an M.S. student at the University of
Buffalo, SUNY. I have been attempting to analyze my MACS data using Galaxy. I
already have my custom peaks on the UCSC Genome Browser and have some specific
questions. I am attempting to show how my peaks (and peak center coordinates)
relate to gene units (+/- TSS and genic) and intergenic regions specifically. I
have been attempting to do this in two different ways and am not sure whether I
am doing it correctly. Below I list the steps I have been using, with particular
questions highlighted near my problem. I would also like to apologize for this
extended e-mail; however, I have only been working with Galaxy for approximately
a month, and figuring out all the manipulations is rather difficult. If someone
can answer my questions I would greatly appreciate it!

These questions relate specifically to promoters-

1. Retrieving TSS coordinates

1. Go to the UCSC genome browser, click Tables in the top of the page, and
select mouse mm9 as the organism
2. Select RefSeq genes in tracks, BED as the output format, and check Send
output to galaxy
3. Click Get output, then Send output to galaxy, and you are redirected to
your Galaxy account, which contains an additional dataset
4. Use the Galaxy Filter tool (left column) to select all + strand genes
5. Use the Cut tool (left column) to extract columns 1,2,2,4,5,6 (**is the
c2 column repeated twice??**) in order to build a BED file containing the TSS
for all + strand genes
6. Do the same for the genes on the - strand
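
A note on the repeated column in step 5: listing column 2 twice appears to be
intentional, since it writes the gene start into both the start and end fields
so that each record collapses to the single TSS position. For genes on the -
strand the TSS is the end coordinate, so the column to repeat there would
presumably be column 3. The same logic as a small Python sketch, assuming
six-column BED input (the function name and the example gene are hypothetical):

```python
def tss_record(bed_fields):
    """Collapse a 6-column BED gene record to its transcription start site.

    For + strand genes the TSS is the start column; for - strand genes it is
    the end column. The position is written to both start and end, which is
    what cutting columns 1,2,2,4,5,6 does for the + strand.
    """
    chrom, start, end, name, score, strand = bed_fields[:6]
    if strand == "+":
        tss = start
    elif strand == "-":
        tss = end
    else:
        raise ValueError("unrecognised strand: %r" % strand)
    return [chrom, tss, tss, name, score, strand]


# Hypothetical gene on each strand.
plus = tss_record(["chr1", "100", "500", "geneA", "0", "+"])
minus = tss_record(["chr1", "100", "500", "geneB", "0", "-"])
```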

Computing peak center coordinates

1. In Galaxy, select the tool Compute expression on every row in the left
column (Text manipulation section)
2. As the expression, enter c2+(c3-c2+1)/2, round result YES
3. Select the dataset containing the peaks for one of the TFs (HNF4a or CBPA),
and click Execute; this creates a new dataset with an additional column
containing the coordinate of the peak center
4. Now select the tool Cut, and extract the columns c1,c6,c6,c4,c5 (**is the
c6 column repeated twice??**) to create a new BED file containing the peak
center
5. Edit the metadata of this new dataset (clicking on the small pencil icon),
and change the format to BED
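
The expression in step 2 and the repeated c6 in step 4 can be mirrored in
Python as a rough check. This sketch assumes a five-column peak file in which
the Compute step appends the center as a sixth column; writing that value into
both the start and end fields is what repeating c6 would achieve. The function
names are hypothetical, and Python's round() sends exact halves to the nearest
even integer, which may differ from how the Galaxy tool rounds.

```python
def peak_center(start, end):
    """Same arithmetic as the Galaxy expression c2+(c3-c2+1)/2, rounded."""
    return round(start + (end - start + 1) / 2)


def center_record(peak_fields):
    """Turn chrom, start, end, name, score into a record located at the center."""
    chrom, start, end, name, score = peak_fields[:5]
    center = str(peak_center(int(start), int(end)))
    # Equivalent of cutting c1,c6,c6,c4,c5 once the center is in column 6.
    return [chrom, center, center, name, score]


# Hypothetical peak spanning 100..199.
example = center_record(["chr1", "100", "199", "peak1", "0"])
```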

Computing distance to closest TSS

1. Select the tool Fetch closest non-overlapping feature, select the new
dataset containing the peak center coordinates, and the dataset containing the
mouse TSS. A new dataset is created containing, for each peak, the closest TSS
2. Compute the distance from the peak center to the closest TSS using the
Compute expression on every row tool (**what expression should I use to do
this**)
3. Plot the distribution using the Histogram of a numeric column tool
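
On the expression asked about in step 2: the distance is in essence a
subtraction of the TSS position from the peak center position, so in Galaxy it
would be a difference between the two columns that hold those values. Which
column numbers those are depends on the layout of the joined dataset and is
not stated in this thread. A hedged Python sketch of the calculation, with a
sign convention (negative means upstream of the TSS) that is an assumption
rather than something the tutorial specifies:

```python
def distance_to_tss(peak_center, tss, strand):
    """Signed distance from the TSS to the peak center, in bp.

    Negative values are upstream of the TSS relative to the direction of
    transcription; positive values are downstream.
    """
    if strand not in ("+", "-"):
        raise ValueError("unrecognised strand: %r" % strand)
    offset = peak_center - tss
    return offset if strand == "+" else -offset


# Hypothetical peak center at 900 next to a TSS at 1000.
upstream_of_plus_gene = distance_to_tss(900, 1000, "+")
downstream_of_minus_gene = distance_to_tss(900, 1000, "-")
```

If only the magnitude matters for the histogram, the absolute value of the
difference is enough and the strand can be ignored.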

Second approach: I understand this does not identify the peak center closest to
the TSS or a particular strand; however, I still have a couple of questions.

Now we have a dataset corresponding to all human RefSeqs (34,765) and we want
to convert this set into one corresponding to human promoter regions. First, we
will make sure our dataset contains just the start and end coordinates of the
genes. Select the Text Manipulation tool and then Cut columns from a table. Set
cut columns to c1,c2,c3,c4,c6 (**Is this the right c1... conformation??**).
Make sure our previously downloaded RefSeq dataset is selected and click on
Execute. When this is finished, click on the pencil icon to assign names to the
columns. Set name to RefSeqs, click save, change the data type to interval,
and click save. Now click the pencil icon again to define the columns. Set the
start column to 2, the end column to 3, the strand column to 5, and the
Name/Identifier column to 4, and click on save. Now, go to the Operate on
Genomic Intervals section of the Tools menu and select Get flanks to get the
flanking regions for the RefSeq dataset we just created. Make sure our RefSeq
dataset is selected and that we want to get the upstream flanking regions for
this dataset. Set the length of the flanking region to 1000 to get the
coordinates for 1 kb upstream. Later on we could use different intervals. Click
on Execute. When this has finished, go to Operate on Genomic Intervals again
and select Join. Now set First query to Get flanks.. and Second query to the
peaks file of the MACS output and then click on Execute. We now end up with 710
regions where our ChIP-Seq peaks overlap with our 1 kb upstream region
(promoter region).
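
The promoter step above can be sketched in Python to make the strand handling
explicit. This is not the implementation of Galaxy's Get flanks or Join tools,
and it does not model the offset option; it only assumes half-open
(start, end) intervals and that "upstream" is taken relative to the gene's
strand. All names and coordinates are illustrative.

```python
def upstream_flank(start, end, strand, length=1000):
    """Return the (start, end) of the region `length` bp upstream of a gene."""
    if strand == "+":
        # Upstream of a + strand gene lies before its start; clamp at zero.
        return max(0, start - length), start
    if strand == "-":
        # Upstream of a - strand gene lies after its end.
        return end, end + length
    raise ValueError("unrecognised strand: %r" % strand)


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals on the same chromosome share a base."""
    return a_start < b_end and b_start < a_end


# Hypothetical + strand gene at 5000..8000 and a peak at 4500..4700.
promoter = upstream_flank(5000, 8000, "+")
peak_in_promoter = overlaps(promoter[0], promoter[1], 4500, 4700)
```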

Lastly, while not discussed here, what exactly does the offset command do when
getting flanks? 

Thank you very much and again, I apologize for the extensive questions!

Sincerely,
Christopher Terranova


Re: [galaxy-user] Chip-seq exercise

2012-03-25 Thread Casey Bergman
Hi Dan -

I agree this tutorial is very helpful.  In running through this exercise 
recently, however, I noticed a few very small things that could help improve 
this already excellent tutorial for new users:

1) You may want to explicitly tell people to create an account first if they 
haven't, so they can do the workflow parts without having to create an account 
midstream.

2) Change "reads have been reduced to those mapping to chr9" to "reads have 
been reduced to those mapping to chr19" -- fix typo

3) Change "Click the 'import this dataset' button above to add this dataset" to 
"Click the green circle with a + to the right of 'import this dataset' button 
above to add this dataset" -- make the location of the import button more 
explicit

4) Change "to your analysis history to being the analysis" to "to your analysis 
history to begin the analysis" -- fix typo

5) Change "You will need to change the reference genome build you are mapping 
against to mm9" to "You will need to change the reference genome build you are 
mapping against to Mouse (Mus musculus): mm9 Full" -- make instructions 
about which version of mm9 to use more explicit 

6) Change "Select your previous CTCF dataset for ChIP-seq tag file" to "Set 
your tag size to the same value as before and select your previous CTCF 
dataset" -- add reminder to set tag size again

Best regards,
Casey





On 22 Mar 2012, at 13:38, Daniel Blankenberg wrote:

 Hi Josh,
 
 Thank you for reporting this issue, it has been resolved. Please let us know 
 if you encounter additional problems in the future.
 
 
 Thanks for using Galaxy,
 
 Dan
 
 
 On Mar 21, 2012, at 8:18 PM, Joshua Udall wrote:
 
 Does anyone know what happened to the chip-seq exercise by James?
 
 It was part of a collection here:
 http://main.g2.bx.psu.edu/u/james/p/exercises
 
 and it was linked here:
 http://main.g2.bx.psu.edu/u/james/p/exercise-chip-seq
 
 
 
 But it is 'Not Found'.
 
 It was a very useful exercise.  Will it be back soon?
 
 Josh


Re: [galaxy-user] Chip-seq data

2011-11-30 Thread graham etherington (TSL)
Giuseppe,
Your ChIP-seq data is already in FASTQ format. It appears to have Illumina
quality scores, so you'll need to use the NGS: QC and manipulation -> FASTQ
Groomer tool, with 'Illumina 1.3+' as the input quality format and 'Sanger'
as the output quality format.
As to using MACS, I've never used it before but you should be able to get
your answers by reading the manual at:
http://liulab.dfci.harvard.edu/MACS/README.html
Hope this helps,
Graham

Dr. Graham Etherington
Bioinformatics Support Officer,
The Sainsbury Laboratory,
Norwich Research Park,
Norwich NR4 7UH.
UK
Tel: +44 (0)1603 450601
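
For context on the quality conversion described above: Illumina 1.3+ files
encode Phred scores as ASCII characters with an offset of 64, while the Sanger
format uses an offset of 33, so the conversion amounts to shifting every
quality character down by 31. The Python sketch below illustrates that shift
only; it is not the FASTQ Groomer implementation, and the function name is
hypothetical.

```python
ILLUMINA_OFFSET = 64  # Phred+64, used by Illumina 1.3+ pipelines
SANGER_OFFSET = 33    # Phred+33, the Sanger convention
MAX_SANGER_SCORE = 93  # highest score that still maps to printable ASCII


def illumina13_to_sanger(quality):
    """Re-encode a Phred+64 quality string as Phred+33."""
    converted = []
    for char in quality:
        score = ord(char) - ILLUMINA_OFFSET
        if score < 0:
            raise ValueError("character %r is below the Phred+64 range" % char)
        converted.append(chr(min(score, MAX_SANGER_SCORE) + SANGER_OFFSET))
    return "".join(converted)


# First four quality characters of the first read quoted in this thread.
sanger = illumina13_to_sanger("DMSS")
```

Only the quality line of each record changes; the identifier and sequence
lines are left untouched.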





On 29/11/2011 15:16, Giuseppe Petrosino petros...@ceinge.unina.it
wrote:

Hi, I have Illumina ChIP-seq data in txt format with this structure:


@HWI-EAS225:8:1:1:58#0/1
NAGAGTGCCCGGGTTCAGTTCTCAGCACCCATGTGG
+HWI-EAS225:8:1:1:58#0/1
DMSSUSSTTTUTSRQRTTTSSSUS
@HWI-EAS225:8:1:1:1803#0/1
NCCATGGGAAGAGCTGGGCAGGCGGGCCGAGCGAAG
+HWI-EAS225:8:1:1:1803#0/1
DLSTTSKOUTRRTTSSSSRPNNTOJOTSSRTB
@HWI-EAS225:8:1:1:1547#0/1
NAGGGGTGGGACTGGCACTTGCCTCTACCAGC
+HWI-EAS225:8:1:1:1547#0/1
DLVVVTPTUVVWVVUVVUWVVVWWWVVV


Can I convert it into FASTQ format? If so, how?
Furthermore, after using Map with Bowtie for Illumina, how can I use MACS
(Model-based Analysis of ChIP-Seq) if I have two files for IP samples and
two files for Control samples?
Thank you so much.

Giuseppe




[galaxy-user] Chip-seq data

2011-11-29 Thread Giuseppe Petrosino
Hi,
I have illumina ChipSeq data in txt format with this structure:

@HWI-EAS225:8:1:1:58#0/1
NAGAGTGCCCGGGTTCAGTTCTCAGCACCCATGTGG
+HWI-EAS225:8:1:1:58#0/1
DMSSUSSTTTUTSRQRTTTSSSUS
@HWI-EAS225:8:1:1:1803#0/1
NCCATGGGAAGAGCTGGGCAGGCGGGCCGAGCGAAG
+HWI-EAS225:8:1:1:1803#0/1
DLSTTSKOUTRRTTSSSSRPNNTOJOTSSRTB
@HWI-EAS225:8:1:1:1547#0/1
NAGGGGTGGGACTGGCACTTGCCTCTACCAGC
+HWI-EAS225:8:1:1:1547#0/1
DLVVVTPTUVVWVVUVVUWVVVWWWVVV

Can I convert it into FASTQ format? If so, how?
Furthermore, after using Map with Bowtie for Illumina, how can I use MACS
(Model-based Analysis of ChIP-Seq) if I have two files for IP samples and
two files for Control samples?
Thank you so much.

Giuseppe

[galaxy-user] ChIP seq on BED file

2011-09-30 Thread vasu punj
Jen,
 
I ran the flow as you suggested, but got the following error message. Do you 
have any suggestions? I added 0 and flipped the columns. Here are the first few 
lines of the input file:

chr1  12137   12336   61R33AAXX100706:1:79:7707:9270    0  -
chr1  31542   31741   61R33AAXX100706:1:37:11341:10600  1  -
chr1  39921   40120   61R33AAXX100706:1:2:16103:17629   2  -
chr1  93213   93412   61R33AAXX100706:1:113:14396:2056  3  -
chr1  109395  109594  61R33AAXX100706:1:13:8451:9619    4  -
chr1  146854  147053  61R33AAXX100706:1:53:15558:13513  5  -

The error message is as follows:

INFO  @ Fri, 30 Sep 2011 17:59:54: 
# ARGUMENTS LIST:
# name = macs_output
# format = BED
# ChIP-seq file = 
/data/CistromeAP/galaxy_database/files/000/198/dataset_198187.dat
# control file = None
# effective genome size = 2.79e+09
# band width = 300
# model fold = 10,30
# pvalue cutoff = 1.00e-05
# Small dataset will be scaled towards larger dataset.
# Range for calculating regional lambda is: 1 bps
 
INFO  @ Fri, 30 Sep 2011 17:59:54: #1 read tag files... 
INFO  @ Fri, 30 Sep 2011 17:59:54: #1 read treatment tags... 
INFO  @ Fri, 30 Sep 2011 18:00:02:  100 
INFO  @ Fri, 30 Sep 2011 18:00:11:  200 
INFO  @ Fri, 30 Sep 2011 18:00:21:  300 
INFO  @ Fri, 30 Sep 2011 18:00:30:  400 
INFO  @ Fri, 30 Sep 2011 18:00:39:  500 
Traceback (most recent call last):
  File "/usr/local/bin/macs14", line 358, in <module>
    main()
  File "/usr/local/bin/macs14", line 60, in main
    (treat, control) = load_tag_files_options (options)
  File "/usr/local/bin/macs14", line 330, in load_tag_files_options
    treat = tp.build_fwtrack()
  File "/usr/lib/python2.6/dist-packages/MACS14/IO/Parser.py", line 150, in build_fwtrack
    (chromosome,fpos,strand) = self.__fw_parse_line(thisline)
  File "/usr/lib/python2.6/dist-packages/MACS14/IO/Parser.py", line 187, in __fw_parse_line
    raise StrandFormatError(thisline,thisfields[5])
MACS14.IO.Parser.StrandFormatError: 'Strand information can not be recognized in this line: chr2\t121859840\t121860039\t61R33AAX\t.\t5837743,5837743'

 Thanks
 
Vasu

--- On Fri, 9/30/11, Jennifer Jackson j...@bx.psu.edu wrote:


From: Jennifer Jackson j...@bx.psu.edu
Subject: Re: [galaxy-user] BED to BAM conversion in Galaxy
To: shamsher jagat kanwar...@gmail.com
Cc: galaxy-u...@bx.psu.edu
Date: Friday, September 30, 2011, 9:08 AM


Hello,

The format of the BED file may be a problem. To be in BED format, an 
additional field is required for the score attribute. This would be 
column 5, moving the strand out to column 6.

To do this:

1 - use Text Manipulation -> Add column with the value 0
note: 0 is often used to represent a NULL or undefined score value in 
BED files. This field cannot be left as whitespace (two tabs); a 
placeholder value must be present.

2 - then use Text Manipulation -> Cut and cut out the columns in the 
proper BED file order, in this case c1,c2,c3,c4,c6,c5, to swap the 
last two

3 - change datatype to BED using the pencil icon/Edit attributes form
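
The effect of steps 1 and 2 can be sketched in Python, assuming the starting
file has five tab-separated columns (chromosome, start, end, name, strand).
Appending a score and then reordering as c1,c2,c3,c4,c6,c5 is the same as
inserting the score before the strand. The function name is hypothetical, and
the example row reuses a read name quoted elsewhere in this thread.

```python
def to_bed6(fields, score="0"):
    """Insert a placeholder score so a 5-column record becomes 6-column BED."""
    if len(fields) != 5:
        raise ValueError("expected 5 columns, found %d" % len(fields))
    chrom, start, end, name, strand = fields
    # Column 5 is the score, column 6 the strand, as BED requires.
    return [chrom, start, end, name, score, strand]


row = "chr1\t12137\t12336\t61R33AAXX100706:1:79:7707:9270\t-"
fixed = "\t".join(to_bed6(row.split("\t")))
```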

In Galaxy, many of the tools in NGS: Peak Calling will work with 
ChIP-seq data in BED format. Having a control would be helpful, but is 
not required by all tools.

Good luck with your project,

Jen
Galaxy team

On 9/29/11 9:31 PM, shamsher jagat wrote:
 Thanks Jen,
 My problem is I have ChIP-seq data where I have one Bed
 file with  coordinates-

 chr172402772422661PDWAAXX100706:4:19:6952:18071-

 Then there is a wig file. Is it possible that this data can be analyzed
 in Galaxy/Cistrome? I tried to use Cistrome, which gave me an error message.

 Thanks



 On Wed, Sep 28, 2011 at 3:46 PM, Jennifer Jackson j...@bx.psu.edu wrote:

     Hello,

     It is possible to go from SAM/BAM to BED, but not the reverse.
     SAM/BAM files contain the actual sequence data associated with the
     original aligned read. BED files only have the reference genome
     location of the alignment (no read sequence).

     It is possible to extract genomic sequence based on BED coordinates,
     but the resulting sequence would not necessarily be the same
     sequence as in the original aligned read (any variation would be lost).

     BED is very similar to Interval format, so Interval tools also work
     with BED format. A BED file is basically a 3-12 column, tab
     delimited file, so tools that work with Tabular data are also
     appropriate for BED file. Note that you may need to change the
     datatype to be interval or tab for certain tools to recognize a BED
     file as an input.

     Hopefully this helps,

     Jen
     Galaxy team




     On 9/22/11 2:55 PM, shamsher jagat wrote:

         Is it possible to use some tool in Galaxy to convert a BED file to a
         BAM/SAM file? In other words, do we have BEDTools or another option
         in Galaxy?

         Thanks


      

[galaxy-user] ChIP seq on BED file

2011-09-30 Thread Jennifer Jackson

Hello Vasu,

The score value should be 0 for each row. When adding the new column, 
set Iterate?: to the default no.


It also looks like there may be some inconsistencies in the original 
file. Are you certain it is 5 columns (exactly) for every row? Including 
the error row reported? Some detective work to get the file in the right 
format is probably necessary.


Tabs are good to check. Change the filetype to tabular, run the file 
through the Convert delimiters to TAB tool using Convert all: 
whitespace, next run Condense consecutive characters to clean up any 
trailing tabs, then change the filetype back to BED and assign the score 
column on the Edit Attributes form (pencil icon).
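
The detective work suggested above can also be done with a short script
outside Galaxy. This Python sketch flags rows that do not have exactly six
tab-separated columns or whose sixth column is not + or -. It assumes that is
what the peak caller needs from a BED file, which is an inference from the
StrandFormatError in this thread rather than from the MACS documentation; the
function name is hypothetical.

```python
VALID_STRANDS = {"+", "-"}


def find_bad_bed_lines(lines):
    """Return (line_number, problem) pairs for rows that look malformed."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 6:
            problems.append((number, "expected 6 columns, found %d" % len(fields)))
        elif fields[5] not in VALID_STRANDS:
            problems.append((number, "unrecognised strand %r" % fields[5]))
    return problems


# The second row is the line reported in the error message above.
rows = [
    "chr1\t12137\t12336\t61R33AAXX100706:1:79:7707:9270\t0\t-",
    "chr2\t121859840\t121860039\t61R33AAX\t.\t5837743,5837743",
    "chr1\t31542\t31741\tshort_row\t0",
]
bad = find_bad_bed_lines(rows)
```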


Hopefully this helps,

Jen
Galaxy team


On 9/30/11 3:13 PM, vasu punj wrote:

Jen,
I ran the flow as you suggested, but got the following error message. Do you
have any suggestions? I added 0 and flipped the columns. Here are the first few
lines of the input file:
chr1  12137   12336   61R33AAXX100706:1:79:7707:9270    0  -
chr1  31542   31741   61R33AAXX100706:1:37:11341:10600  1  -
chr1  39921   40120   61R33AAXX100706:1:2:16103:17629   2  -
chr1  93213   93412   61R33AAXX100706:1:113:14396:2056  3  -
chr1  109395  109594  61R33AAXX100706:1:13:8451:9619    4  -
chr1  146854  147053  61R33AAXX100706:1:53:15558:13513  5  -

The error message is as follows:

INFO  @ Fri, 30 Sep 2011 17:59:54:
# ARGUMENTS LIST:
# name = macs_output
# format = BED
# ChIP-seq file = 
/data/CistromeAP/galaxy_database/files/000/198/dataset_198187.dat
# control file = None
# effective genome size = 2.79e+09
# band width = 300
# model fold = 10,30
# pvalue cutoff = 1.00e-05
# Small dataset will be scaled towards larger dataset.
# Range for calculating regional lambda is: 1 bps

INFO  @ Fri, 30 Sep 2011 17:59:54: #1 read tag files...
INFO  @ Fri, 30 Sep 2011 17:59:54: #1 read treatment tags...
INFO  @ Fri, 30 Sep 2011 18:00:02:  100
INFO  @ Fri, 30 Sep 2011 18:00:11:  200
INFO  @ Fri, 30 Sep 2011 18:00:21:  300
INFO  @ Fri, 30 Sep 2011 18:00:30:  400
INFO  @ Fri, 30 Sep 2011 18:00:39:  500
Traceback (most recent call last):
  File "/usr/local/bin/macs14", line 358, in <module>
    main()
  File "/usr/local/bin/macs14", line 60, in main
    (treat, control) = load_tag_files_options (options)
  File "/usr/local/bin/macs14", line 330, in load_tag_files_options
    treat = tp.build_fwtrack()
  File "/usr/lib/python2.6/dist-packages/MACS14/IO/Parser.py", line 150, in build_fwtrack
    (chromosome,fpos,strand) = self.__fw_parse_line(thisline)
  File "/usr/lib/python2.6/dist-packages/MACS14/IO/Parser.py", line 187, in __fw_parse_line
    raise StrandFormatError(thisline,thisfields[5])
MACS14.IO.Parser.StrandFormatError: 'Strand information can not be recognized in this line: chr2\t121859840\t121860039\t61R33AAX\t.\t5837743,5837743'

Thanks
Vasu


[galaxy-user] Chip-seq

2011-09-21 Thread shamsher jagat
Can I analyze two BED files from a ChIP-seq experiment in Galaxy? I have one
file of input and the other of sample. Both of these files have peak locations.
Any suggestion of a workflow in Galaxy?

Re: [galaxy-user] Chip-Seq, Encode Peaks and Galaxy

2011-09-07 Thread Daniel Blankenberg
Hi Rad, Jorge,

Sorry for the delay in reply.  We have not yet released a pre-canned workflow 
to do this. However, if you are looking to associate one set of Genomic 
interval/region data with another set, Galaxy's interval operation tools are a 
good place to begin. There are good examples of using these tools available 
through screencasts (http://galaxycast.org), Galaxy 101 
(http://usegalaxy.org/galaxy101),  as well as the wiki 
(http://wiki.g2.bx.psu.edu/Learn/Interval%20Operations). 
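
As a rough illustration of what associating one set of intervals with another
involves, the Python sketch below lists the peaks that overlap a given variant.
It is not how Galaxy's interval operation tools are implemented; it assumes
half-open (chrom, start, end) intervals, and the peak labels and coordinates
are made up for the example.

```python
def peaks_overlapping(variant, peaks):
    """Return the peaks that share at least one base with the variant.

    `variant` is (chrom, start, end); each peak is (chrom, start, end, label).
    Intervals are half-open, as in BED.
    """
    chrom, start, end = variant
    return [
        peak
        for peak in peaks
        if peak[0] == chrom and peak[1] < end and start < peak[2]
    ]


# Hypothetical histone-mark peaks and a single-base variant.
example_peaks = [
    ("chr1", 100, 200, "markA"),
    ("chr1", 300, 400, "markB"),
    ("chr2", 100, 200, "markA"),
]
hits = peaks_overlapping(("chr1", 150, 151), example_peaks)
```

This linear scan is fine for a handful of regions; for genome-wide data the
dedicated interval tools are the better choice.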

Please let us know if we can provide additional information.


Thanks for using Galaxy,

Dan


On Jun 23, 2011, at 9:41 AM, Radhouane Aniba wrote:

 Thanks Jennifer
 
 Rad
 
 2011/6/23 Jorge Andrade andrade.jo...@gmail.com
 Please keep me in the loop as I am also interested in a similar workflow.
 Many thanks and best regards,
 Jorge
 
 
 On Thu, Jun 23, 2011 at 3:21 AM, Jennifer Jackson j...@bx.psu.edu wrote:
 Hello Rad,
 
 Dan will be able to help you get started and build up a workflow for your 
 analysis. He is currently on vacation, but will be returning soon and will 
 contact you directly when he returns.
 
 We are very sorry about the delayed reply. Please know that we definitely 
 want to help you to use Galaxy for your project,
 
 We will be in touch,
 
 Best,
 
 Jen
 Galaxy team
 
 
 
 On 6/17/11 10:55 AM, Radhouane Aniba wrote:
 Hi everyone,
 
 I have a list of genomic regions with some variants and would like to
 study the correlation between these variants and epigenomic marks such
 as histone modifications.
 
 From the ENCODE download page, I got some files corresponding to peaks of
 these histone modifications and would like to know if there is a way to
 create a pipeline using Galaxy to map my variants, depending on genomic
 regions, to the information I have from the histone modification peaks.
 
 Is there someone who can point me to a step by step to do things to
 start using Galaxy ?
 
 Thank you
 
 Rad
 
 
 
 
 
 -- 
 Jennifer Jackson
 http://usegalaxy.org/
 http://galaxyproject.org/
 
 
 
 
 -- 
 Radhouane Aniba
 Bioinformatics Postdoctoral Research Scientist
 Institute for Advanced Computer Studies
 Center for Bioinformatics and Computational Biology (CBCB)
 University of Maryland, College Park
 MD 20742
 

Re: [galaxy-user] Chip-Seq, Encode Peaks and Galaxy

2011-06-23 Thread Jorge Andrade
Please keep me in the loop as I am also interested in a similar workflow.
Many thanks and best regards,
Jorge

On Thu, Jun 23, 2011 at 3:21 AM, Jennifer Jackson j...@bx.psu.edu wrote:

 Hello Rad,

 Dan will be able to help you get started and build up a workflow for your
 analysis. He is currently on vacation, but will be returning soon and will
 contact you directly when he returns.

 We are very sorry about the delayed reply. Please know that we definitely
 want to help you to use Galaxy for your project,

 We will be in touch,

 Best,

 Jen
 Galaxy team



 On 6/17/11 10:55 AM, Radhouane Aniba wrote:

 Hi everyone,

 I have a list of genomic regions with some variants and would like to
 study the correlation between these variants and epigenomic marks such
 as histone modifications.

 From the ENCODE download page, I got some files corresponding to peaks of
 these histone modifications and would like to know if there is a way to
 create a pipeline using Galaxy to map my variants, depending on genomic
 regions, to the information I have from the histone modification peaks.

 Is there someone who can point me to a step by step to do things to
 start using Galaxy ?

 Thank you

 Rad






 --
 Jennifer Jackson
 http://usegalaxy.org/
 http://galaxyproject.org/


Re: [galaxy-user] Chip-Seq, Encode Peaks and Galaxy

2011-06-23 Thread Radhouane Aniba
Thanks Jennifer

Rad

2011/6/23 Jorge Andrade andrade.jo...@gmail.com

 Please keep me in the loop, as I am also interested in a similar workflow.
 Many thanks and best regards,
 Jorge


 On Thu, Jun 23, 2011 at 3:21 AM, Jennifer Jackson j...@bx.psu.edu wrote:

 Hello Rad,

 Dan will be able to help you get started and build up a workflow for your
 analysis. He is currently on vacation, but will be returning soon and will
 contact you directly when he returns.

 We are very sorry about the delayed reply. Please know that we definitely
 want to help you to use Galaxy for your project.

 We will be in touch,

 Best,

 Jen
 Galaxy team



 On 6/17/11 10:55 AM, Radhouane Aniba wrote:

 Hi everyone,

 I have a list of genomic regions with some variants and would like to
 study the correlation between these variants and epigenomic marks such
 as histone modifications.

 From the ENCODE download page, I got some files corresponding to peaks of
 these histone modifications and would like to know if there is a way to
 create a pipeline using Galaxy to map my variants, according to their
 genomic regions, to the information I have from the histone modification peaks.

 Is there someone who can point me to a step-by-step guide to start
 using Galaxy?

 Thank you

 Rad






 --
 Jennifer Jackson
 http://usegalaxy.org/
 http://galaxyproject.org/





-- 
Radhouane Aniba
Bioinformatics Postdoctoral Research Scientist
Institute for Advanced Computer Studies
Center for Bioinformatics and Computational Biology (CBCB)
University of Maryland, College Park
MD 20742

Re: [galaxy-user] Chip-Seq, Encode Peaks and Galaxy

2011-06-22 Thread Jennifer Jackson

Hello Rad,

Dan will be able to help you get started and build up a workflow for 
your analysis. He is currently on vacation, but will be returning soon 
and will contact you directly when he returns.


We are very sorry about the delayed reply. Please know that we 
definitely want to help you to use Galaxy for your project.


We will be in touch,

Best,

Jen
Galaxy team


On 6/17/11 10:55 AM, Radhouane Aniba wrote:

Hi everyone,

I have a list of genomic regions with some variants and would like to
study the correlation between these variants and epigenomic marks such
as histone modifications.

From the ENCODE download page, I got some files corresponding to peaks of
these histone modifications and would like to know if there is a way to
create a pipeline using Galaxy to map my variants, according to their
genomic regions, to the information I have from the histone modification peaks.

Is there someone who can point me to a step-by-step guide to start
using Galaxy?

Thank you

Rad






--
Jennifer Jackson
http://usegalaxy.org/
http://galaxyproject.org/


[galaxy-user] Chip-Seq, Encode Peaks and Galaxy

2011-06-17 Thread Radhouane Aniba
Hi everyone,

I have a list of genomic regions with some variants and would like to study
the correlation between these variants and epigenomic marks such as
histone modifications.

From the ENCODE download page, I got some files corresponding to peaks of these
histone modifications and would like to know if there is a way to create a
pipeline using Galaxy to map my variants, according to their genomic regions,
to the information I have from the histone modification peaks.

Is there someone who can point me to a step-by-step guide to start
using Galaxy?

Thank you

Rad