Dear bioc-sig-sequencing,

I would like to determine a depth cutoff/threshold for a ChIP-seq experiment 
that corresponds to a chosen FDR (BasicChipSeq.pdf, A ChIP-Seq Data Analysis, 
pages 6 & 7, EX 2, 
http://www.bioconductor.org/workshops/2009/SeattleNov09/ChIP-seq/BasicChipSeq.pdf).

After reading in the two files (ctcf, gfp), I have two AlignedRead objects.  
Before running the code on pages 6 & 7 on the ctcf and gfp data (to compare 
the distribution of depths with the null distributions), I would like to 
account for (equalize) any difference in the number of reads between the ctcf 
and gfp data sets.  Is there a recommended way to do this?

For example, perhaps

1. One could use the R function 'sample' on whichever AlignedRead object 
(ctcf or gfp) has more reads, to draw a subset of reads equal in number to 
the smaller one?  This could be repeated, say, 3 times to control for sampling 
variation when determining the cutoff described above.


2. Or, somewhat like slide 25 in the workshop (CoverageEDA.pdf, 
http://bioconductor.org/packages/courses/seattle-01-2009/day3/CoverageEDA.pdf), 
one could find or write an R function that multiplies an Rle object, here ctcf 
or gfp (the depth value for each nucleotide), by the ratio of the number of 
reads in the two AlignedRead objects.  This would be followed by applying the 
'round' function, as done in slide 25, to give integer depth values in the 
Rle object?
(I note the '2009' in this URL should be '2010'?)
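
To make the two ideas concrete, here is a minimal base-R sketch on toy data.  
It uses plain vectors as stand-ins for the AlignedRead and Rle objects; that 
is an assumption, and I have not confirmed that those classes accept exactly 
the same calls.  The names ctcf_reads, gfp_reads, depth_ctcf and scale_factor 
are hypothetical and only for illustration.

```r
set.seed(1)  # make the toy example reproducible

# Toy stand-ins (hypothetical): one element per read in each sample.
ctcf_reads <- seq_len(1000)   # the larger library
gfp_reads  <- seq_len(600)    # the smaller library

# Option 1: downsample the larger set to the size of the smaller one,
# repeated 3 times to gauge sampling variation.
n_small <- min(length(ctcf_reads), length(gfp_reads))
subsamples <- replicate(
  3,
  ctcf_reads[sample(length(ctcf_reads), n_small)],  # without replacement
  simplify = FALSE
)

# Option 2: scale the per-nucleotide depth by the ratio of library sizes,
# then round to get whole-number depths again.
depth_ctcf   <- c(0, 0, 3, 5, 5, 2, 0, 1)            # toy depth vector
scale_factor <- length(gfp_reads) / length(ctcf_reads)
depth_scaled <- round(depth_ctcf * scale_factor)
```

One difference I can see between the two: downsampling keeps every depth a 
true read count, whereas scaling and rounding can change low depths 
unevenly, since anything that scales to below 0.5 rounds down to 0.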

Can someone comment?


Thanks,
P. Terry


