Hi Hervé, Michael,

That's great! I did try searching the message boards, but couldn't find anything.

Thanks for the help
Anna

On 01/12/10 05:46, Hervé Pagès wrote:
Hi Anna, Michael,

This is already the case if you use the current version of BSgenome:

    >  library(BSgenome.Mmusculus.UCSC.mm9)

    >  gr<- GRanges("chr5", IRanges(125821746, 125821945))

    >  gr
    GRanges with 1 range and 0 elementMetadata values
        seqnames                 ranges strand |
           <Rle>               <IRanges>   <Rle>  |
    [1]     chr5 [125821746, 125821945]      * |

    seqlengths
     chr5
       NA

    >  getSeq(Mmusculus, gr)
    [1]
"TGAACGCCCTCCACTCAAAATTCCGTGTCCCTCGGGGCCCTTTGCACTTCCCCCACTCGGAATTCCATATCCCTTTGGGCTTTTGCACACCCTCCATTCTGTGTCCCTCGTGGCCTTTGCACACTGTCCCCTCGGAATTCCATGTCTCCCGGGACCTTTGCACACCCTTCGTACAGAATTCTGTGTCCCTCGAGGCCTTA"

    >  sessionInfo()
    R version 2.12.0 (2010-10-15)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
     [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
     [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
     [7] LC_PAPER=en_US.utf8       LC_NAME=C
     [9] LC_ADDRESS=C              LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] BSgenome.Mmusculus.UCSC.mm9_1.3.16 BSgenome_1.18.2
    [3] Biostrings_2.19.1                  GenomicRanges_1.3.0
    [5] IRanges_1.8.5

    loaded via a namespace (and not attached):
    [1] Biobase_2.10.0

Cheers,
H.


On 11/30/2010 08:14 PM, Michael Lawrence wrote:
This is a sensible request, as it would be consistent with the behavior for
RangesList and RangedData. Anyone opposed to this change?

On Tue, Nov 30, 2010 at 6:43 AM, Anna Terry<[email protected]>wrote:

Hi,

Would it be possible for the BSgenome function getSeq to return the +
strand by default when given a GRanges object where the strand is "*" rather
than throw an error?  The strand is not known for ChIP-seq regions and so it
is sensible to have the strand as "*" when storing them in a GRanges object.

Anna

problem.gr
GRanges with 1 range and 6 elementMetadata values
     seqnames                 ranges strand |     score     count    unique
<Rle>   <IRanges>   <Rle>   |<numeric>   <integer>   <integer>
[1]     chr5 [125821746, 125821945]      * |  97.02651       124       116
     ref.count    height       max
<integer>   <numeric>   <numeric>
[1]        10  29.03108 125821846

seqlengths
          chr1         chr2         chr3 ...  chrY_random chrUn_random
     197195432    181748087    159599783 ...     58682461      5900358

problem.seq<- getSeq(Mmusculus, problem.gr)
Error in .extractSeqsFromDNAString(subject, ranges(grg), strand(grg)) :
   'strand' elements must be "+" or "-"
Calls: getSeq ... lapply ->   lapply ->   FUN ->   .extractSeqsFromDNAString

problem.seq<- getSeq(Mmusculus, problem.gr, strand="+")
Error in .extractSeqsFromDNAString(subject, ranges(grg), strand(grg)) :
   'strand' elements must be "+" or "-"
Calls: getSeq ... lapply ->   lapply ->   FUN ->   .extractSeqsFromDNAString

problem.seq<- getSeq(Mmusculus, seqnames(problem.gr), start=start(
problem.gr), end=end(problem.gr))
# works


sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-redhat-linux-gnu

locale:
   [1] LC_CTYPE=en_GB          LC_NUMERIC=C            LC_TIME=en_GB
   [4] LC_COLLATE=en_GB        LC_MONETARY=en_GB       LC_MESSAGES=en_GB
   [7] LC_PAPER=en_GB          LC_NAME=en_GB           LC_ADDRESS=en_GB
[10] LC_TELEPHONE=en_GB      LC_MEASUREMENT=en_GB
   LC_IDENTIFICATION=en_GB

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] XML_3.2-0                          BSgenome.Mmusculus.UCSC.mm9_1.3.16
[3] BSgenome_1.16.5                    GenomicRanges_1.0.9
[5] Biostrings_2.16.9                  IRanges_1.6.17
[7] rkward_0.5.3

loaded via a namespace (and not attached):
[1] Biobase_2.8.0 tools_2.11.1

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

        [[alternative HTML version deleted]]

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


--
Anna Terry
Lymphocyte Development
MRC Clinical Sciences Centre
Imperial College Faculty of Medicine
Hammersmith Hospital
Du Cane Road
London W12 0NN

Tel:    0208 3832140
        0208 3832145
Email: [email protected]

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Reply via email to