Hello,
To summarize, you want to find existing genes that:
1 - have overlap with your ChIP-seq dataset
2 - have overlap within 5000 bp upstream of known TSS intervals
The basic steps are:
a - obtain intervals for TSS
b - obtain intervals for ChIP-seq peaks
c - obtain intervals for existing genes (transcripts)
d - answer both #1 & #2 above by comparing a + b, then the result + c
using tools from the group "Operate on Genomic Intervals" plus other
data manipulation tools as needed
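As a rough illustration of what step d does conceptually, here is a minimal Python sketch of the two comparisons. It is not how the Galaxy tools are implemented; the names `overlaps` and `keep_overlapping`, the toy coordinates, and the 0-based half-open (BED-style) interval convention are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed representation: (chrom, start, end), 0-based, half-open.
Interval = Tuple[str, int, int]

def overlaps(a: Interval, b: Interval) -> bool:
    """True if a and b are on the same chromosome and share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def keep_overlapping(first: List[Interval], second: List[Interval]) -> List[Interval]:
    """Return the intervals of `first` that overlap any interval of `second`."""
    return [a for a in first if any(overlaps(a, b) for b in second)]

# Toy data, invented purely for illustration.
tss_flanks = [("chr1", 5000, 15000), ("chr2", 0, 6200)]    # from step a
peaks = [("chr1", 14000, 14500), ("chr3", 100, 900)]       # from step b
genes = [("chr1", 10000, 30000), ("chr2", 1200, 8000)]     # from step c

# Compare a + b, then compare the result with c.
flanks_with_peaks = keep_overlapping(tss_flanks, peaks)
genes_near_peaks = keep_overlapping(genes, flanks_with_peaks)
print(genes_near_peaks)
```

The pairwise scan is fine for a handful of intervals; for genome-scale data the Galaxy interval tools are the practical choice.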
For a, this was the prior question/reply.
For b, please see:
http://main.g2.bx.psu.edu/u/james/p/exercise-chip-seq
http://main.g2.bx.psu.edu/u/galaxyproject/p/using-galaxy-2012 -> Prot 3
For c, this is in the UCSC mailing list post, but also in several
Protocols of the Using Galaxy paper.
For d, see Prot 1 in the Using Galaxy paper for how to identify common
regions to address question #1. Prot 4 walks through all Genomic
Interval tools, plus the tools themselves have example graphics.
Hopefully this helps,
Jen
Galaxy team
On 7/23/12 7:33 AM, shamsher jagat wrote:
I want to have a list of genes from the UCSC browser or known genes.
Thanks
Kanwar
On Fri, Jul 20, 2012 at 8:00 PM, Jennifer Jackson <j...@bx.psu.edu> wrote:
Hello Kanwar,
On 7/20/12 3:31 PM, shamsher jagat wrote:
I am interested in getting regions flanking TSS. I am using
Galaxy and have downloaded TSS sites using the steps in this post:
https://lists.soe.ucsc.edu/pipermail/genome/2011-June/026175.html
Now what I would like to do is to get 5000 bp upstream and
downstream using the flank tool in Galaxy, but I realize it only gave me
options for gene start or whole gene.
The "Region:" options are:
1 - around start - meaning interval start coordinate
2 - around end - meaning interval end coordinate
3 - whole gene - meaning entire intervals
Pick option #1.
Is it possible to extract 5000 bp upstream and downstream
regions around the TSS?
The "Location of the flanking region/s:" options are:
4 - Upstream
5 - Downstream
6 - Both
Pick option #6 with "Length of the flanking region(s):" set to 5000.
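To make the coordinate arithmetic behind those settings concrete, here is a small Python sketch. It only illustrates the idea of a 5000 bp window on both sides of an interval's start coordinate; it is not the flank tool's code, and it does not claim to reproduce the tool's exact output layout or strand handling. The function name `flank_window`, the BED-style 0-based coordinates, and the choice to return one combined window are assumptions for this example.

```python
from typing import Tuple

# Assumed representation: (chrom, start, end), 0-based, half-open.
Interval = Tuple[str, int, int]

def flank_window(chrom: str, start: int, end: int, length: int = 5000) -> Interval:
    """Return a window of `length` bp on both sides of the interval start.

    `end` is accepted so a whole TSS interval can be passed in, but only
    `start` is used, mirroring the "around start" choice. The left edge is
    clamped at 0 so the window never runs off the chromosome start.
    Strand is ignored in this sketch.
    """
    left = max(0, start - length)
    right = start + length
    return (chrom, left, right)

# Toy TSS intervals, invented for illustration.
print(flank_window("chr1", 10000, 10001))
print(flank_window("chr1", 1200, 1201))
```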
Once I have that, I want to find non-overlapping
genes in my regions from ChIP-seq data.
Do you want to identify/label known genes or discover novel genes?
This part of your question is not clear. Could you explain in more
detail the end goal?
It is likely that some form of the tool "Operate on Genomic Intervals ->
Merge" will do what you want, but it is difficult to recommend the
correct option.
Going forward, sending questions to a single public list, as Brooke
also suggests, is best. It is generally considered good practice not
to start threads by posting the same email to two or more lists at
the same time.
Thanks!
jen
Galaxy team
Thanks
Kanwar
_____________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org. Please keep all replies on the list by
using "reply all" in your mail client. For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists,
please use the interface at:
http://lists.bx.psu.edu/
--
Jennifer Jackson
http://galaxyproject.org