Thanks for the suggestions everyone. I think I agree that using partial application is the simplest solution, and I agree with Aaron's suggestion that the 'translate' function should automatically use the standard table without needing it to be supplied as an argument.

I'm having trouble testing my code out in ghci. I've been loading the Bio.Core module, abut I get error messages saying that:

Not in scope: data constructor `SeqData'

Shouldn't that data constructor be in scope because Bio.Core exports Bio.Core.Sequence, which in turn contains the expression:

type SeqData  = LC.ByteString

?

Dan

On 26/08/11 10:04 AM, Aaron D. Ball wrote:
Since the default is the default, I'd like to see

    translateWith :: TransTable ->  ...

    translate = translateWith default_table

by analogy with sort/sortBy.

On Fri, Aug 26, 2011 at 05:53, Matt Fenwick<mfenwick...@gmail.com>  wrote:
Hi Dan,
a common way to capture this pattern in Haskell is to use partially applied
functions (I think this is also what Christian recommended).
The type for your function is:  translate :: SeqData ->  Offset ->  Maybe
TransTable ->  SeqData
But if you change the argument order to:
translate :: TransTable ->  SeqData ->  Offset ->  SeqData
then you can add another function -- which is simply a partial application
of translate (assuming defaulttable has been defined):
defaultTranslate :: SeqData ->  Offset ->  SeqData
defaultTranslate = translate defaulttable
Caveats:
   1.  you have to change the order of arguments -- this may break existing
code
   2.  the caller has two different function names to choose from -- is this
a problem?  I'm not sure
Advantages:
   1.  one implementation
   2.  no 'Maybe's to make things complicated
   3.  other 'default' tables could be swapped in, through the use of
additional partial applications of 'translate'


On Fri, Aug 26, 2011 at 6:00 AM,<biohaskell-requ...@biohaskell.org>  wrote:

Send Biohaskell mailing list submissions to
        biohaskell@biohaskell.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://malde.org/cgi-bin/mailman/listinfo/biohaskell
or, via email, send a message with subject or body 'help' to
        biohaskell-requ...@biohaskell.org

You can reach the person managing the list at
        biohaskell-ow...@biohaskell.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Biohaskell digest..."


Today's Topics:

   1. Re: Mitochondrial Translation Table (Dan Fornika)
   2. Re: Mitochondrial Translation Table (Ketil Malde)
   3. Re: Mitochondrial Translation Table
      (Christian H?ner zu Siederdissen)


----------------------------------------------------------------------

Message: 1
Date: Thu, 25 Aug 2011 18:53:40 -0700
From: Dan Fornika<dforn...@gmail.com>
To: biohaskell@biohaskell.org
Subject: Re: [Biohaskell] Mitochondrial Translation Table
Message-ID:<4e56fca4.4090...@gmail.com>
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"

Hello,

I've gotten a start on this but am running into a few difficulties.
I've attached my code so far.

I'd like to implement a function "translate" that uses the standard
translation table by default, but can optionally use a different
translation table if it is passed as an argument.  Does it make sense to
write this with the type signature:

translate :: SeqData ->  Offset ->  Maybe TransTable ->  SeqData

I'd like to be able to call the function with only the first two
argument and get the hard-coded standard translation table, but if I
pass a third argument with type TransTable, I get the customized
translation.

As I've mentioned before on this list I'm still quite new to haskell, so
any tips are appreciated.

Dan

On 16/08/11 02:16 PM, Ketil Malde wrote:
Dan Fornika<dforn...@gmail.com>    writes:

I've been working on a mitochondrial genetics project, and I'd like to
see a human mitochondrial-specific translation function integrated
into BioHaskell.

Hm, yes, support for the non-canonical translations would be nice.

I see that Bio.Sequence.Phd has a standard translation table,

Bio.Sequence.SeqData, actually, but it's reexported in a lot of places.

so I've
made a version called trans_tbl_mito with the appropriate changes
integrated.

Currently, there's just a 'translate' that hardwires this table.  I
concede this is poor design, the table ought to be a parameter, but it
was what I needed at the time.

I know that there is a lot of re-factoring going on, and
I'm not sure what is the most efficient/elegant way to introduce this
feature.

How about a module Bio.Sequence.Translation containing:

    import Bio.Core.Sequence

    data Triple = ... -- e.g. String, or perhaps just SeqData
    data Amino  = ...
    data TransTable = [(Triple,Amino)]

    transtable_std, transtbl_mito, ... :: TransTable

    translate :: TransTable ->    SeqData ->    SeqData

    -- and perhaps versions for all frames, or some specific frame?

What does everybody think?

Dan: would you like to write this?  Feel free to rip what you want from
the old SeqData module.

-k
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Translation.hs
Type: text/x-haskell
Size: 6809 bytes
Desc: not available
URL:
<http://malde.org/pipermail/biohaskell/attachments/20110825/b5bc6166/attachment-0001.hs>

------------------------------

Message: 2
Date: Fri, 26 Aug 2011 06:06:49 +0200
From: Ketil Malde<ke...@malde.org>
To: biohaskell@biohaskell.org
Subject: Re: [Biohaskell] Mitochondrial Translation Table
Message-ID:<874o143jfq....@malde.org>
Content-Type: text/plain; charset=us-ascii

Dan Fornika<dforn...@gmail.com>  writes:

I'd like to implement a function "translate" that uses the standard
translation table by default, but can optionally use a different
translation table if it is passed as an argument.

Okay.

Does it make sense to write this with the type signature:

translate :: SeqData ->  Offset ->  Maybe TransTable ->  SeqData

To some extent.  You'll need to supply 'Nothing' as the third parameter,
so you might as well write 'default_table' or some such, avoiding the
maybe.

I'd like to be able to call the function with only the first two
argument and get the hard-coded standard translation table, but if I
pass a third argument with type TransTable, I get the customized
translation.

Variable number of arguments is possible (see e.g. Printf) but it
involves some trickery.  I think it might not be worth it, and suggest
'translate' always take an argument, possibly with a specialized
'translateStd' function supplying the default argument.

Regarding the implementation, you can simplify the 'translate' function
(since it basically repeats the same function twice) by using something
like:

  translate s' o' mb_tab = unfoldr codons (s',o',mytable)
       where mytable = fromMaybe trans_table_std mb_tab
             codons (s, o, trans_table) = ....

Also, -Wall will warn you about recycling names, which can cause errors.

-k
--
If I haven't seen further, it is by standing in the footprints of giants


------------------------------

Message: 3
Date: Fri, 26 Aug 2011 11:22:29 +0200
From: Christian H?ner zu Siederdissen<choe...@tbi.univie.ac.at>
To: Dan Fornika<dforn...@gmail.com>
Cc: biohaskell@biohaskell.org
Subject: Re: [Biohaskell] Mitochondrial Translation Table
Message-ID:<20110826092229.GA2001@workstation>
Content-Type: text/plain; charset="us-ascii"

Hi,

there are a number of possibilities:

(i) implicit parameters ;-)

http://www.haskell.org/ghc/docs/7.0.3/html/users_guide/other-type-extensions.html#implicit-parameters

(ii) this thing

http://neilmitchell.blogspot.com/2008/04/optional-parameters-in-haskell.html

(iii) what I prefer

myfun :: a ->  b ->  c
myFun a b = "complicated stuff"

defFun = myFun (default a)


Gruss,
Christian

* Dan Fornika<dforn...@gmail.com>  [26.08.2011 03:56]:
Hello,

I've gotten a start on this but am running into a few difficulties.
I've attached my code so far.

I'd like to implement a function "translate" that uses the standard
translation table by default, but can optionally use a different
translation table if it is passed as an argument.  Does it make
sense to write this with the type signature:

translate :: SeqData ->  Offset ->  Maybe TransTable ->  SeqData

I'd like to be able to call the function with only the first two
argument and get the hard-coded standard translation table, but if I
pass a third argument with type TransTable, I get the customized
translation.

As I've mentioned before on this list I'm still quite new to
haskell, so any tips are appreciated.

Dan

On 16/08/11 02:16 PM, Ketil Malde wrote:
Dan Fornika<dforn...@gmail.com>    writes:

I've been working on a mitochondrial genetics project, and I'd like to
see a human mitochondrial-specific translation function integrated
into BioHaskell.

Hm, yes, support for the non-canonical translations would be nice.

I see that Bio.Sequence.Phd has a standard translation table,

Bio.Sequence.SeqData, actually, but it's reexported in a lot of places.

so I've
made a version called trans_tbl_mito with the appropriate changes
integrated.

Currently, there's just a 'translate' that hardwires this table.  I
concede this is poor design, the table ought to be a parameter, but it
was what I needed at the time.

I know that there is a lot of re-factoring going on, and
I'm not sure what is the most efficient/elegant way to introduce this
feature.

How about a module Bio.Sequence.Translation containing:

   import Bio.Core.Sequence

   data Triple = ... -- e.g. String, or perhaps just SeqData
   data Amino  = ...
   data TransTable = [(Triple,Amino)]

   transtable_std, transtbl_mito, ... :: TransTable

   translate :: TransTable ->    SeqData ->    SeqData

   -- and perhaps versions for all frames, or some specific frame?

What does everybody think?

Dan: would you like to write this?  Feel free to rip what you want from
the old SeqData module.

-k

import Bio.Core.Sequence

import Data.List (unfoldr, intercalate, isPrefixOf)
import Data.Char (toUpper, isNumber)
import Data.Maybe
import qualified Data.ByteString.Lazy.Char8 as B
import qualified Data.ByteString.Lazy as BB

data Codon = String                      -- SeqData?

data Amino = Ala | Arg | Asn | Asp | Cys | Gln | Glu | Gly
            | His | Ile | Leu | Lys | Met | Phe | Pro | Ser
            | Thr | Tyr | Trp | Val
            | STP | Asx | Glx | Xle | Xaa -- unknowns
      deriving (Show,Eq)

data TransTable = TTable [(Codon, Amino)]

data InverseTransTable = ITTable [(Amino, Codon)]

{-# INLINE (!) #-}
(!) :: SeqData ->  Offset ->  Char
(!) (SeqData bs) = B.index bs

translate ::  SeqData ->  Offset ->  Maybe TransTable ->  SeqData
translate s' o' Nothing = unfoldr codons (s', o')
    where codons (s,o) =
              if o>  seqlength s - 3 then Nothing
              else Just (trans1 (map (s!) [o, o+1, o+2]), (s, o+3)
trans_table_std)
translate s' o' trans_table = unfoldr codons (s', o', trans_table)
    where codons (s, o, trans_table) =
              if o>  seqlength s - 3 then Nothing
              else Just (trans1 (map (s!) [o, o+1, o+2]), (s, o+3)
trans_table)

trans1 :: String ->  Maybe TransTable->  Amino
trans1 s Nothing = (flip lookup trans_table_std . map (repUT . toUpper))
s
     where repUT x = if x == 'U' then 'T' else x -- RNA uses U for T
trans1 s transl_table = (flip lookup transl_table . map (repUT .
toUpper)) s
     where repUT x = if x == 'U' then 'T' else x -- RNA uses U for T

-- amino2iupac :: [Amino] ->  SeqData
-- amino2iupac =

-- reverse_translate :: SeqData ->  TransTable ->  SeqData
-- reverse_translate =

-- rev_trans1 :: Amino ->  TransTable ->  String
-- rev_trans1 =

-- todo: extend with more IUPAC nucleotide wildcards?
-- | Convert a list of amino acids to a sequence in IUPAC format.
toIUPAC :: [Amino] ->  SeqData
toIUPAC = B.pack . map (fromJust . flip lookup iupac)

-- | Convert a sequence in IUPAC format to a list of amino acids.
fromIUPAC :: SeqData ->  [Amino]
fromIUPAC = map (maybe Xaa id . flip lookup iupac' . toUpper) . B.unpack

iupac :: [(Amino, Char)]
iupac = [(Ala, 'A'), (Arg, 'R'), (Asn, 'N')
         ,(Asp, 'D'), (Cys, 'C'), (Gln, 'Q')
         ,(Glu, 'E'), (Gly, 'G'), (His, 'H')
         ,(Ile, 'I'), (Leu, 'L'), (Lys, 'K')
         ,(Met, 'M'), (Phe, 'F'), (Pro, 'P')
         ,(Ser, 'S'), (Thr, 'T'), (Tyr, 'Y')
         ,(Trp, 'W'), (Val,'V')
         ,(Asx, 'B') -- Asn or Asp
         ,(Glx, 'Z') -- Gln or Glu
         ,(Xle, 'J') -- Ile or Leu
         ,(STP, '*')
         ,(Xaa, 'X')
         ]

iupac' :: [(Char,Amino)]
iupac' = map (\(a,b)->(b,a)) iupac

trans_table_std :: TransTable
trans_table_std = TTable [("AAA", Lys), ("AAC", Asn), ("AAG", Lys),
("AAT", Asn)
                          ,("ACA", Thr), ("ACC", Thr), ("ACG", Thr),
("ACT", Thr)
                          ,("AGA", Arg), ("AGC", Ser), ("AGG", Arg),
("AGT", Ser)
                          ,("ATA", Ile), ("ATC", Ile), ("ATG", Met),
("ATT", Ile)
                          ,("CAA", Gln), ("CAC", His), ("CAG", Gln),
("CAT", His)
                          ,("CCA", Pro), ("CCC", Pro), ("CCG", Pro),
("CCT", Pro)
                          ,("CGA", Arg), ("CGC", Arg), ("CGG", Arg),
("CGT", Arg),
                          ,("CTA", Leu), ("CTC", Leu), ("CTG", Leu),
("CTT", Leu),
                          ,("GAA", Glu), ("GAC", Asp), ("GAG", Glu),
("GAT", Asp),
                          ,("GCA", Ala), ("GCC", Ala), ("GCG", Ala),
("GCT", Ala),
                          ,("GGA", Gly), ("GGC", Gly), ("GGG", Gly),
("GGT", Gly),
                          ,("GTA", Val), ("GTC", Val), ("GTG", Val),
("GTT", Val),
                          ,("TAA", STP), ("TAC", Tyr), ("TAG", STP),
("TAT", Tyr),
                          ,("TCA", Ser), ("TCC", Ser), ("TCG", Ser),
("TCT", Ser),
                          ,("TGA", STP), ("TGC", Cys), ("TGG", Trp),
("TGT", Cys),
                          ,("TTA", Leu), ("TTC", Phe), ("TTG", Leu),
("TTT", Phe)
--                        ("RAT", Asx), ("RAC", Asx), ("SAA", Glx),
("SAG", Glx)
-- IUPAC R is (A or G), S is (C or G)
                          ]

trans_table_vertebrate_mito :: TransTable
trans_table_vertebrate_mito = TTable [
                                      , ("AAA", Lys), ("AAC", Asn),
("AAG", Lys), ("AAT", Asn),
   ("ACA", Thr), ("ACC", Thr), ("ACG", Thr), ("ACT", Thr),
   ("AGA", STP), ("AGC", Ser), ("AGG", STP), ("AGT", Ser),
   ("ATA", Met), ("ATC", Ile), ("ATG", Met), ("ATT", Ile),
   ("CAA", Gln), ("CAC", His), ("CAG", Gln), ("CAT", His),
   ("CCA", Pro), ("CCC", Pro), ("CCG", Pro), ("CCT", Pro),
   ("CGA", Arg), ("CGC", Arg), ("CGG", Arg), ("CGT", Arg),
   ("CTA", Leu), ("CTC", Leu), ("CTG", Leu), ("CTT", Leu),
   ("GAA", Glu), ("GAC", Asp), ("GAG", Glu), ("GAT", Asp),
   ("GCA", Ala), ("GCC", Ala), ("GCG", Ala), ("GCT", Ala),
   ("GGA", Gly), ("GGC", Gly), ("GGG", Gly), ("GGT", Gly),
   ("GTA", Val), ("GTC", Val), ("GTG", Val), ("GTT", Val),
   ("TAA", STP), ("TAC", Tyr), ("TAG", STP), ("TAT", Tyr),
   ("TCA", Ser), ("TCC", Ser), ("TCG", Ser), ("TCT", Ser),
   ("TGA", Trp), ("TGC", Cys), ("TGG", Trp), ("TGT", Cys),
   ("TTA", Leu), ("TTC", Phe), ("TTG", Leu), ("TTT", Phe)
--("RAT", Asx), ("RAC", Asx), ("SAA", Glx), ("SAG", Glx)  --IUPAC R is
(A or G), S is (C or G)
   ]

trans_table_yeast_nuc :: TransTable
trans_table_yeast_nuc = TTable [
   ("AAA", Lys), ("AAC", Asn), ("AAG", Lys), ("AAT", Asn),
   ("ACA", Thr), ("ACC", Thr), ("ACG", Thr), ("ACT", Thr),
   ("AGA", Arg), ("AGC", Ser), ("AGG", Arg), ("AGT", Ser),
   ("ATA", Met), ("ATC", Ile), ("ATG", Met), ("ATT", Ile),
   ("CAA", Gln), ("CAC", His), ("CAG", Gln), ("CAT", His),
   ("CCA", Pro), ("CCC", Pro), ("CCG", Pro), ("CCT", Pro),
   ("CGA", Arg), ("CGC", Arg), ("CGG", Arg), ("CGT", Arg),
   ("CTA", Leu), ("CTC", Thr), ("CTG", Thr), ("CTT", Thr),
   ("GAA", Glu), ("GAC", Asp), ("GAG", Glu), ("GAT", Asp),
   ("GCA", Ala), ("GCC", Ala), ("GCG", Ala), ("GCT", Ala),
   ("GGA", Gly), ("GGC", Gly), ("GGG", Gly), ("GGT", Gly),
   ("GTA", Val), ("GTC", Val), ("GTG", Val), ("GTT", Val),
   ("TAA", STP), ("TAC", Tyr), ("TAG", STP), ("TAT", Tyr),
   ("TCA", Ser), ("TCC", Ser), ("TCG", Ser), ("TCT", Ser),
   ("TGA", Trp), ("TGC", Cys), ("TGG", Trp), ("TGT", Cys),
   ("TTA", Leu), ("TTC", Phe), ("TTG", Leu), ("TTT", Phe)
--("RAT", Asx), ("RAC", Asx), ("SAA", Glx), ("SAG", Glx)  --IUPAC R is
(A or G), S is (C or G)
   ]

inverse_trans_table_std :: InverseTransTable
inverse_trans_table_std = ITTable [
   (Ala, "GCN"), (Leu, "YTN"), (Arg, "MGN"), (Lys, "AAR"),
   (Asn, "AAY"), (Met, "ATG"), (Asp, "GAY"), (Phe, "TTY"),
   (Cys, "TGY"), (Pro, "CCN"), (Gln, "CAR"), (Ser, "WSN"),
   (Glu, "GAR"), (Thr, "ACN"), (Gly, "GGN"), (Trp, "TGG"),
   (His, "CAY"), (Tyr, "TAY"), (Ile, "AUH"), (Val, "GUN"),
   (STP, "TRR"), (Asx, "RAY"), (Glx, "SAR"), (Xaa, "NNN")
   ]
_______________________________________________
Biohaskell mailing list
Biohaskell@biohaskell.org
http://malde.org/cgi-bin/mailman/listinfo/biohaskell

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
URL:
<http://malde.org/pipermail/biohaskell/attachments/20110826/a145e643/attachment-0001.pgp>

------------------------------

_______________________________________________
Biohaskell mailing list
Biohaskell@biohaskell.org
http://malde.org/cgi-bin/mailman/listinfo/biohaskell


End of Biohaskell Digest, Vol 9, Issue 4
****************************************


_______________________________________________
Biohaskell mailing list
Biohaskell@biohaskell.org
http://malde.org/cgi-bin/mailman/listinfo/biohaskell


_______________________________________________
Biohaskell mailing list
Biohaskell@biohaskell.org
http://malde.org/cgi-bin/mailman/listinfo/biohaskell
_______________________________________________
Biohaskell mailing list
Biohaskell@biohaskell.org
http://malde.org/cgi-bin/mailman/listinfo/biohaskell

Reply via email to