DESeq and edgeR vignettes.
Valerie
On 09/28/11 19:46, rohan bareja wrote:
Hi,
I have summed the counts now to the gene level for 3'UTR.
I want to assess the relative amount of each 3’UTR end usage such as
what percentage of reads comes from each 3’UTR isoform?
I want to identify the different 3’UTR ends for each gene to get
alternative 3'UTR usage(disease vs control)?
Do you have any idea about how to proceed?
Thanks,
Rohan
--- On *Sat, 24/9/11, Valerie Obenchain /<[email protected]>/*wrote:
From: Valerie Obenchain <[email protected]>
Subject: Re: [Bioc-sig-seq] Reads in 3'utr
To: "rohan bareja" <[email protected]>
Cc: [email protected]
Date: Saturday, 24 September, 2011, 4:24 AM
On 09/23/2011 02:57 PM, rohan bareja wrote:
Hi,
utr=threeUTRsByTranscript(txdb,use.names=FALSE)
So,utr is GRangesList of length 33381
Then as u said,I did the following:
txBygene <- transcriptsBy(txdb, "gene")
geneID <- rep(names(txBygene), elementLengths(txBygene))
df <- data.frame(geneID=geneID,
txID=values(unlist(txBygene))[["tx_id"]])
This gives me a dataframe with 40,780 rows with gene ID and txID
from txBygene object.
geneID txID
40775 9994 11731
40776 9994 11730
40777 9997 38491
40778 9997 38489
40779 9997 38496
40780 9997 38497
Since my utr object is of length 33,381 ,my counts length is same
i.e 33,381
So I am not able to map the counts to the above data frame which
has transcript and gene IDs.
Yes, these lengths are different.
In this example we have utr regions from 58 transcripts.
> length(utr)
[1] 58
Those 58 transcripts can be matched to their gene ID's by looking
at the txBygene object. All of the transcripts fall into one (or
more) of 51 genes,
> length(txBygene)
[1] 51
There are multiple transcripts per gene so we expand the gene ID's
to map to the transcripts.
> dim(df)
[1] 79 2
This data.frame has all transcripts from the txdb mapped to the
gene ID's. Your utr data may contain only a subset of these
transcripts. That is something you need to check. Match the
desired transcript names to the df, pull out the gene IDs. You
then have the gene ID's for your utr regions and can split or
group your counts by gene.
Valerie
--- On *Fri, 23/9/11, Valerie Obenchain /<[email protected]>
</mc/[email protected]>/*wrote:
From: Valerie Obenchain <[email protected]>
</mc/[email protected]>
Subject: Re: [Bioc-sig-seq] Reads in 3'utr
To: "rohan bareja" <[email protected]>
</mc/[email protected]>
Cc: [email protected]
</mc/[email protected]>
Date: Friday, 23 September, 2011, 10:50 PM
Hi Rohan,
You can relate the counts for 3UTR regions to gene IDs
through the transcript IDs.
txdb_file <- system.file("extdata",
"UCSC_knownGene_sample.sqlite", package="GenomicFeatures")
txdb <- loadFeatures(txdb_file)
utr=threeUTRsByTranscript(txdb,use.names=FALSE)
The transcript names can be matched to the gene ID's through,
txBygene <- transcriptsBy(txdb, "gene")
geneID <- rep(names(txBygene), elementLengths(txBygene))
df <- data.frame(geneID=geneID,
txID=values(unlist(txBygene))[["tx_id"]])
Now you know what gene ID each tx count belongs to. You can
split your counts by gene ID ...
Valerie
On 09/20/2011 12:13 PM, rohan bareja wrote:
Hi everyone,
I am doing NGS analysis using bam files.I have counted reads in 3'utr region using
utr=threeUTRsByTranscript(txdb,use.names=FALSE)
countsUTR<- countOverlaps(utr,reads)
I have got the transcript level counts from this.How can I get the gene
level counts??It might sound silly but Does anybody have an idea on what type
of anaylses we can do from this countsUTR ?
Thanks,Rohan
[[alternative HTML version deleted]]
_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing