Hi Joern,

thanks for your email. You were right: I have explored the bowtie manual extensively, and I have been running most of the analyses with the --best option. My apologies to the bioc-sig subscribers, because my email is not 100% related to R/ShortRead, but I would appreciate any comments on either of these 2 questions:
- On statistical grounds: if read1 aligns to 5 places, each place is selected with probability 0.2, and if read2 aligns to 3 places, each place is chosen with probability 1/3 (about 0.33). Therefore, having read1 and read2 both reported at the same place X (assuming X is one of the candidate places for both reads and the choices are independent) would have probability 0.2 * 0.33, or about 0.067. If read3 ... read7 are also included, the probability would be 0.2 * 0.33 * prob(read3) * ... * prob(read7). If this is the case, would it be legitimate to assign a ChIP-seq peak to place X?

- If bowtie outputs the reads that align to repetitive regions (and arbitrarily selects one of these regions), could this be a good approximation for PET-seq data (which is intended to map the peaks in these repetitive regions, but is much more expensive)?

thanks very much,

bogdan

On Tue, Jul 7, 2009 at 1:46 AM, Joern Toedling <[email protected]> wrote:

> Hello,
>
> I think you should better ask questions of this kind to the developers
> of Bowtie rather than on this mailing list. However, I believe all of
> your questions are answered in the manual of bowtie. Basically you can
> customize the report output by supplying additional arguments to the
> bowtie call. "-k" specifies the maximum number of valid match positions
> for each read in the output. If you set "-k 1" (the default) I think a
> random position is returned.
> To my mind, the score is not directly given but the output column gives
> you details which bases do not match and since you also have the
> quality information of each base you can easily calculate the sum of
> mismatching base qualities.
> The tab-delimited output format (non-binary) can be read in by function
> readAligned(..., type="bowtie")
> and the resulting object of class AlignedRead also stores the mismatch
> information (at least in the current development version of package
> ShortRead).
>
> Best regards,
> Joern
>
> On Mon, 6 Jul 2009 14:17:34 -0700, Bogdan Tanasa wrote
> > Hi everyone.
> >
> > I would appreciate to have your comments on the following : when
> > aligning the solexa reads with bowtie, if a read aligns to multiple
> > genomic regions, is the highest-scored location picked up in the
> > final report (i.e. when using --best option) ? And if a read aligns
> > with the same score to multiple regions, would it be possible to see
> > the score of the alignment and the differences in the score among
> > multiple regions ? In this last scenario, a randomly picked location
> > among the equally scored genomic locations is reported ?
> >
> > thanks very much,
> >
> > bogdan
>
> ---
> Joern Toedling
> Institut Curie -- U900
> 26 rue d'Ulm, 75005 Paris, FRANCE
> Tel. +33 (0)156246926

[[alternative HTML version deleted]]

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
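
P.S. To make the arithmetic in my first question concrete, here is a minimal Python sketch. It assumes that each multi-mapping read is assigned uniformly at random to one of its candidate locations, that the reads are assigned independently, and that place X is among the candidates of every read. The function name joint_probability_at_site is only illustrative; it is not part of bowtie or ShortRead.

```python
from fractions import Fraction
from math import prod


def joint_probability_at_site(hit_counts):
    """Probability that every read is reported at the same site X.

    hit_counts[i] is the number of equally good locations for read i.
    Assumptions (mine, for illustration): uniform random choice among
    the candidates, independence between reads, and X being one of the
    candidates of every read.
    """
    if any(k < 1 for k in hit_counts):
        raise ValueError("each read needs at least one candidate location")
    # Each read lands on X with probability 1/k; multiply across reads.
    return prod((Fraction(1, k) for k in hit_counts), start=Fraction(1))


# read1 aligns to 5 places, read2 aligns to 3 places.
p = joint_probability_at_site([5, 3])
print(p)  # 1/15, i.e. roughly 0.067
```

With exact fractions the two-read case comes out as 1/15 rather than 0.3 * 0.2, and adding more reads only multiplies in further 1/k factors.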
