Hello All,

I generated random nucleotide sequences whose trinucleotide frequencies are almost equal to those of a query sequence, using the "sample" function in the following way:

    seq1 <- paste(sample(alpha, 333, replace=TRUE, prob=freq), collapse="")

where "alpha" is a vector of the 64 trinucleotides possible from the set c("A","G","C","T") and *"freq" is a vector of the frequencies of those 64 trinucleotides in a given query sequence*.
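For concreteness, here is a minimal sketch of that setup in base R. The query sequence is made up for illustration, and counting trinucleotides in non-overlapping windows from position 1 is an assumption about how "freq" was built; overlapping windows would be an equally plausible choice.

```r
# All 64 trinucleotides over the four bases.
bases <- c("A", "G", "C", "T")
alpha <- do.call(paste0, expand.grid(bases, bases, bases, stringsAsFactors = FALSE))

# Hypothetical query sequence of 999 nt (illustration only).
set.seed(1)
query <- paste(sample(bases, 999, replace = TRUE, prob = c(0.4, 0.1, 0.2, 0.3)),
               collapse = "")

# Assumption: "freq" comes from non-overlapping trinucleotides starting at position 1.
starts <- seq(1, nchar(query) - 2, by = 3)
tri <- substring(query, starts, starts + 2)
freq <- as.numeric(table(factor(tri, levels = alpha))) / length(tri)

# Draw 333 trinucleotides with those probabilities, giving a 999-nt random sequence.
seq1 <- paste(sample(alpha, 333, replace = TRUE, prob = freq), collapse = "")
```

Note that each trinucleotide is drawn independently here, so nothing ties the last base of one draw to the first base of the next.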

Let's consider a random sequence generated in the way described above. Does the random sequence preserve the mono- and dinucleotide frequencies of the query sequence? I mean, are the mono- and dinucleotide frequencies of the random sequence similar to those of the query sequence?
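One way to check this is to compare overlapping k-mer frequencies of the two sequences for k = 1, 2, 3. The following is only a sketch in base R: "kmer_freq" and "max_freq_diff" are hypothetical helper names, and the two short sequences are made up for illustration.

```r
# Overlapping k-mer frequencies of a sequence string, over all 4^k possible k-mers.
kmer_freq <- function(s, k, bases = c("A", "G", "C", "T")) {
  all_kmers <- do.call(paste0, expand.grid(rep(list(bases), k), stringsAsFactors = FALSE))
  starts <- seq_len(nchar(s) - k + 1)
  kmers <- substring(s, starts, starts + k - 1)
  counts <- table(factor(kmers, levels = all_kmers))
  setNames(as.numeric(counts) / length(kmers), all_kmers)
}

# Largest absolute difference in k-mer frequency between two sequences.
max_freq_diff <- function(a, b, k) {
  max(abs(kmer_freq(a, k) - kmer_freq(b, k)))
}

# Hypothetical sequences (illustration only).
query <- "ATGGCGTACGTTAGCATGCCGTAA"
seq1  <- "GCGATGTAACGTTACAGCCCGATG"
for (k in 1:3) {
  cat(k, "-mers: max abs difference = ", max_freq_diff(query, seq1, k), "\n", sep = "")
}
```

With real data, "query" and "seq1" would be the query sequence and the sampled sequence, and a large difference for k = 2 alongside a small one for k = 3 would indicate that dinucleotide frequencies are not preserved.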

In one of the cases I worked with, the answer to the above question was "No". If that is the case, how can I generate a random sequence that preserves the mono-, di- and trinucleotide frequencies of the query sequence?

Regards,
Purnachander G

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
