I am developing a new suffix array construction algorithm that is not based 
on the KA, SA-IS or Skew algorithms. Its performance depends on Max(LCP), the 
largest value in the longest-common-prefix array of the suffix array. It will 
work perfectly for 8-bit character strings without any code change, but it 
needs some refinement to deal with genome data. 
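
To make the Max(LCP) quantity concrete, here is a minimal sketch that computes 
it for a small string. It is not the new algorithm: it builds the suffix array 
by naive sorting and derives the LCP array with Kasai's algorithm, and the 
function names (suffix_array, lcp_array, max_lcp) are chosen only for this 
illustration.

```python
def suffix_array(s):
    # Naive construction for illustration only: sort suffix start
    # positions by the suffix text itself (slow on large inputs).
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    # Kasai's algorithm: lcp[k] is the length of the longest common
    # prefix of the suffixes starting at sa[k-1] and sa[k]; lcp[0] = 0.
    n = len(s)
    rank = [0] * n
    for k, i in enumerate(sa):
        rank[i] = k
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


def max_lcp(s):
    # Largest entry of the LCP array; 0 for an empty string.
    if not s:
        return 0
    return max(lcp_array(s, suffix_array(s)))


print(max_lcp("banana"))    # 3  ("ana" is shared by "ana" and "anana")
print(max_lcp("AAAAAAAA"))  # 7
```

A long run of a single character gives Max(LCP) close to the string length, 
which is why repeats matter for an algorithm whose cost depends on this value.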

I would like to learn some domain-specific facts about genome DNA test data. 
I know nothing about DNA sequences or biology.
 
1. Which are the best books on genome DNA sequence processing for someone 
who is developing a new suffix array construction algorithm and wants it to 
work well for DNA analysis? 
2. Is there any existing suffix array construction algorithm whose 
performance depends on Max(LCP)?
3. A genome DNA test file contains only 4 characters: A, C, G and T. Is that 
right? I found another character, U, in RNA. Does such a file still contain 
only 4 characters? 
4. If the number of distinct characters in a file is limited to 4 and all 
repeating patterns are known, I can design special refinements to improve my 
algorithm's performance. I want to know whether, in addition to repetitions 
of 1-, 2-, 3- and 4-character units, repetitions of units of 5 or more 
characters are common. If they are, how many characters long is the largest 
commonly repeated unit? 
1-character repetition: AAAAAAAA...
2-character repetition: ACACACACACACACA... 
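
Questions 3 and 4 can also be checked empirically on any given sequence. The 
sketch below is only an illustration under my own assumptions: symbol_counts 
and smallest_repeat_unit are hypothetical helper names, the input is assumed 
to be a plain string already read from the file, and the example string 
containing N is made up to show a symbol outside A, C, G, T.

```python
from collections import Counter


def symbol_counts(seq):
    # Count each distinct symbol, ignoring case, to see whether the
    # alphabet really is limited to A, C, G and T.
    return Counter(seq.upper())


def smallest_repeat_unit(s):
    # Shortest prefix u such that s is a prefix of u repeated forever.
    # Uses the KMP failure function: smallest period = n - fail[n-1].
    n = len(s)
    if n == 0:
        return ""
    fail = [0] * n
    k = 0
    for i in range(1, n):
        while k > 0 and s[i] != s[k]:
            k = fail[k - 1]
        if s[i] == s[k]:
            k += 1
        fail[i] = k
    return s[: n - fail[-1]]


print(sorted(symbol_counts("ACGTNacgt")))       # ['A', 'C', 'G', 'N', 'T']
print(smallest_repeat_unit("AAAAAAAA"))         # A
print(smallest_repeat_unit("ACACACACACACACA"))  # AC
```

Running smallest_repeat_unit over windows of a real genome file would show 
which unit lengths actually occur, instead of assuming them in advance.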

Thank you.

Weng
