Hello Abdullah,
The tool geecee takes fasta sequence as input. I am not sure whether you
have only the bed coordinates of the regions of interest or already have
the coordinates of the genes contained within these regions.
If you need the genes, then one choice is to extract a track from the
UCSC table browser to obtain transcripts in bed12 format with the tool
"Get Data: UCSC Main". Tracks in the group "Genes and Gene Predictions"
are most likely what you will want. You can read about the choices at
UCSC, but common selections include UCSC Genes, RefSeq Genes, etc. You
can get them all, then use tools in the group "Operate on Genomic
Intervals" to limit the group to just those that fit within the isochore
coordinate bounds.
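If it helps to see the containment logic spelled out, here is a minimal Python sketch of the same filter, done outside of Galaxy. It is only an illustration: the function names are hypothetical, it assumes tab-separated bed lines with a name in column 4, and it keeps a transcript only when it lies entirely inside an isochore. The Galaxy interval tools remain the simpler route.

```python
from collections import defaultdict


def parse_bed(lines):
    """Yield (chrom, start, end, name) from tab-separated BED lines."""
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and UCSC track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        name = fields[3] if len(fields) > 3 else None
        yield fields[0], int(fields[1]), int(fields[2]), name


def transcripts_within(isochore_lines, transcript_lines):
    """Return (isochore_name, transcript_name) pairs for transcripts that
    fall entirely inside an isochore on the same chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end, name in parse_bed(isochore_lines):
        by_chrom[chrom].append((start, end, name))

    hits = []
    for chrom, start, end, name in parse_bed(transcript_lines):
        for iso_start, iso_end, iso_name in by_chrom.get(chrom, []):
            # BED coordinates are 0-based, half-open, so plain comparisons work.
            if iso_start <= start and end <= iso_end:
                hits.append((iso_name, name))
    return hits
```

A transcript that straddles an isochore boundary is dropped here. Whether that is the right choice depends on how you want to assign genes to isochores.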
For a list of associated gene identifiers, the tables related to most
gene tracks at UCSC contain that sort of information. Do a separate
extract operation to obtain a file that contains the gene and transcript
identifiers, then join that file with the transcripts that remain after
the filtering above, to link in the gene name.
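The join itself is just a lookup on the transcript identifier. A rough Python sketch, assuming a hypothetical two-column tab-separated table (transcript identifier, then gene name); the real column layout depends on which UCSC table you extract, so check it before relying on this:

```python
def attach_gene_names(transcript_ids, id_table_lines, missing="NA"):
    """Pair each transcript identifier with its gene name.

    Assumes each table line is: transcript_id <TAB> gene_name (hypothetical
    layout). Transcripts with no entry in the table get the `missing` value.
    """
    tx_to_gene = {}
    for line in id_table_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # ignore malformed rows rather than fail
        tx_to_gene[fields[0]] = fields[1]
    return [(tx, tx_to_gene.get(tx, missing)) for tx in transcript_ids]
```

In Galaxy, the equivalent is a join of the two datasets on the transcript identifier column.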
Once you have the transcript coordinates, fasta sequence can be obtained
in two ways. If you want to do the GC counts off of the mRNA, use the
transcript identifiers in the UCSC Table browser again, choose sequence
output (not bed), and this time extract "mRNA" when prompted (not
genomic). If genomic sequence is fine, the tool "Fetch Sequences ->
Extract Genomic DNA" can be used.
Then use the fasta sequences as input to the "geecee" tool. The problems
you were having were most likely caused by giving the tool the wrong
type of input.
This is a lot of steps, and how you decide to organize the data before
running geecee will affect how the summary stats are calculated. Really,
any stretch of nucleotide fasta sequence can be used for input (I do not
know of an upper length bound, but there probably is one, so just watch
for that - if an error comes up, work with smaller regions). You could
also just convert the fasta sequence to tabular, and add up the total
bases, count Gs, count Cs, etc. then perform a calculation on your own.
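For the do-it-yourself route, the calculation is small enough to sketch in Python. This is an illustration only, with hypothetical function names, and it makes one assumption you may want to change: only A, C, G and T are counted in the denominator, so N and other ambiguity codes are ignored. The numbers geecee reports may be computed differently.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from the lines of a fasta file."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first word of the header line as the record name.
            name = (line[1:].split() or [""])[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_percent(seq):
    """Return GC content as a percentage of the A/C/G/T bases in seq.

    Assumption: N and other ambiguity codes are left out of the total.
    Returns 0.0 when the sequence has no A/C/G/T bases at all.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / total if total else 0.0
```

With one fasta record per isochore (or per transcript), looping over read_fasta() and calling gc_percent() on each sequence gives one GC value per record.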
See also "Regional Variation -> Feature coverage", "Graph/Display Data",
and "BEDTools"; each may be helpful for different reasons.
There are several tutorials that do many of these same basic operations
as part of the analysis or tool demos. Reviewing them will help you to
know how to structure inputs, use particular tools, etc., if you would
like the guidance. Under "Shared Pages", please see Galaxy 101 and Using
Galaxy 2012 for the introductory tutorials.
https://main.g2.bx.psu.edu/page/list_published
Best,
Jen
Galaxy project
On 6/8/13 6:17 PM, Abdullah Al Mahmud wrote:
Hi,
In my account I have uploaded a file named iso_mm10.bed. The bed file
contains coordinates of 6018 isochores of the mouse genome mm10. I want
to extract the GC% of each isochore, along with the list of genes present
in each isochore.
I tried using extract features, geecee, and many other tools from
Galaxy, but every time the result was either an error or no peak.
I will be grateful to you if you kindly give me an idea about how to
solve this problem.
Abdullah
--
Abdullah Al Mahmud, PhD
Postdoctoral fellow,
University of Montreal,
Lab. of Dr. Jacques Michaud
CHU Sainte-Justine Research Center,
Montreal, Quebec, Canada.
___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org. Please keep all replies on the list by
using "reply all" in your mail client. For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists,
please use the interface at:
http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/
--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org