Jeff Shannon wrote:

(Plus, if this format might be used for RNA sequences as well as DNA sequences, you've got at least a fifth base to represent, which means you need at least three bits per base, which means only two bases per byte (or else base-encodings split across byte-boundaries).... That gets ugly real fast.)

Not to mention all the IUPAC symbols for incompletely specified bases (e.g. R = A or G).


http://www.chem.qmul.ac.uk/iubmb/misc/naseq.html
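
For what it's worth, here is a rough sketch of how the ambiguity codes could be handled: give each of A, C, G, T its own bit, and let an ambiguity code be the OR of the bases it may stand for. That needs four bits per symbol, so two symbols still fit in a byte without straddling byte boundaries. The bit assignments and the names `IUPAC`, `bits_per_symbol`, `pack` and `unpack` are my own choices for illustration, not part of any format discussed in this thread, and I am assuming U would simply share T's bit if RNA had to be covered.

```python
# One bit per unambiguous base; an ambiguity code is the OR of its members.
# These particular bit values are an assumption made for this sketch.
A, C, G, T = 1, 2, 4, 8
IUPAC = {
    "A": A, "C": C, "G": G, "T": T,
    "R": A | G, "Y": C | T, "S": C | G, "W": A | T,
    "K": G | T, "M": A | C,
    "B": C | G | T, "D": A | G | T, "H": A | C | T, "V": A | C | G,
    "N": A | C | G | T,
}
DECODE = {mask: sym for sym, mask in IUPAC.items()}


def bits_per_symbol(alphabet_size):
    """Smallest whole number of bits that can distinguish alphabet_size symbols."""
    return max(1, (alphabet_size - 1).bit_length())


def pack(seq):
    """Pack an IUPAC string into bytes, two symbols per byte, high nibble first."""
    codes = [IUPAC[s] for s in seq.upper()]
    if len(codes) % 2:
        codes.append(0)  # pad odd lengths with an all-zero nibble
    return bytes((hi << 4) | lo for hi, lo in zip(codes[::2], codes[1::2]))


def unpack(data, length):
    """Inverse of pack(); length is needed to drop the padding nibble."""
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    return "".join(DECODE[m] for m in nibbles[:length])
```

With this layout, `bits_per_symbol(4)` is 2, `bits_per_symbol(5)` is 3 (the case Jeff describes), and the 15 codes above need 4. Testing whether a symbol can stand for a given base is a single AND, e.g. `IUPAC["R"] & G` is nonzero.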

--
Robert Kern
[EMAIL PROTECTED]

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter
--
http://mail.python.org/mailman/listinfo/python-list