Hi Patrick,

Thanks for your answer.
We are developing a motif discovery and analysis pipeline for ChIP-Seq experiments. We're using BSgenome to convert a BED file into FASTA sequences with the getSeq function. We would like to get masked sequences to improve the motif analysis by eliminating repeats and other low-interest regions.

So, is there a way to get the masked sequence instead of the original sequence contained in the BSgenome object? Not only to "show" the sequence, but to convert the masked sequence into a string. Activating the masks on a chromosome only lets me visualize the masked sequence of the BSgenome object; I'm still not able to access the masked sequence itself.

Thanks,
Arnaud.

On 10-01-19 7:38 PM, "Patrick Aboyoun" <[email protected]> wrote:

Arnaud,

The BSgenome object, in this case Hsapiens, contains references to on-disk storage of information such as masks. Since this information is not in memory and the data stored on disk is considered read-only, you cannot change the mask information on a BSgenome object. Instead, you need to modify the masks chromosome by chromosome after they have been loaded into memory, as you showed in your code below.

What is the use case that motivated your e-mail? If you never want to deal with masks, you can always use the unmasked function to strip the masks when you load the chromosome:

> unmasked(Hsapiens$chr1)
  247249719-letter "DNAString" instance
seq: TAACCCTAACCCTAACCCTAACCCTAACCCTAACCC...NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN

Patrick

Droit Arnaud wrote:
> Hi,
>
> I'm wondering if anybody can help me generate masked (by RepeatMasker, for
> instance) sequences.
>
> I'm currently using BSgenome to extract sequences from a BED file, such as:
>
> library(BSgenome.Hsapiens.UCSC.hg18)
> genome <- Hsapiens
> FastaSeq <- getSeq(genome, "chr1", start=1000, end=1200, as.character=FALSE)
>
> I know that BSgenome contains masks that can be applied by using:
>
> chr1 <- genome$chr1
> active(masks(chr1)) <- TRUE
>
> So, I'm trying to use this to change the masks of the genome object. But I
> cannot modify it:
>
> active(masks(genome$chr1)) <- TRUE
> Error in `$<-`(`*tmp*`, "chr1", value = <S4 object of class "MaskedDNAString">) :
>   no method for assigning subsets of this S4 class
>
> Is there a way to get the masked sequence with the getSeq function?
>
> Thanks.
>
> Arnaud.
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
