Re: [Bioc-sig-seq] RNA-seq: test for within sample differential counts

Michal Okoniewski Mon, 25 Oct 2010 08:06:25 -0700

Hi Michael,

Recently I have submitted to BioC a library called rnaSeqMap, whichincludes the code we use locally

for problems like yours and similar.

It has the implementation of Aumann-Lindell algorithm for findingirreducible regionsthat perhaps may be used here. It is not a purely statistical solution,it is more of data mining, but I wouldlook for the minimal level of coverage (mi), for which both your regionsare irreducible - in order to compare

them on one sample.
That's the outline of the receipe:

1. Take one region, create SeqReads object on it. rs <-newSeqReads(chr, st, en, strand)2. Fill the object with data using rs <- addDataToReadset() if you havethe reads data in your workspace,or use addExperimentsToReadset() if you are connected to xmap databasewith the seq_reads table added.3. Run nd <- getCoverageFromRS(rs, 1) to get the NucleotideDistrobject with coverage.4. Then - in some sort of adaptive procedure do: nd.reg <-findRegionsAsND(nd, mi, minsup)to get the minimal mi, for which distr(nd.reg) is filled with mi on allthe positions.


Then repeat the procedure for both regions and compare mi.

Alternatively one can think of some ttest comparing "discretized" withA-L algorithm values from distributions for both regions to make thecomparison more "statistically correct", eg:


 nd.rbc1 <- regionBasedCoverage(nd1, steps, max_mi)
 d1 <- distribs(nd.rbc1)
 nd.rbc2 <- regionBasedCoverage(nd2, steps, max_mi)
 d2 <- distribs(nd.rbc2)
t.test(d1, d2)

As a side note: irreducible region in the sense of Aumann and Lindell(2003) has to start and end with coverage > miSo if your region is not "filled on both sides" you may check arepresentative sub-region.And all the functions except addExperimentsToReadset() do not requirethe database connection.

Do not know if this is what you precisely asked for - just myapproximate solution :)


Cheers,
Michal


On 25/10/2010 11:58, Michael Dondrup wrote:

Hi,

I need some statistical advise for the following problem. Given an RNA-seq 
experiment I would like to assess
statistical significance of differential read-counts>within<  a sample. Given a 
sample with read counts
for two (adjacent) regions out of all all regions of the genome I am interested 
in, say gene A and intron B.

I wish to detect if region B has a significantly lower read count than A, 
lengths of regions A and B are known to be different,
so I think fisher's-exact test does not apply here. Region length should be 
taken into account for this, as I think that
the more positions are different between regions, the more significant the 
result should be. I also have biol. replicates,
but these replicates have different numbers of reads.

Packages like DEseq and edgeR seem to be tailored to between samples 
comparison. So which method
would you recommend for within sample comparison?

Thank you very much
Michael

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Re: [Bioc-sig-seq] RNA-seq: test for within sample differential counts

Reply via email to