Hi Ken, I've also been working on the rosalind.info problem set. If you'd like to see my answer for this problem, you can see it here:
https://github.com/dfornika/rosalind/blob/master/04_gc_content/04_gc_content.hs If you'd prefer to work through the problem yourself, I can offer some advice to get started. The internal sequence data in the Sequence type in Bio.Sequence.Fasta is a lazy bytestring. You can access the sequence data with the 'seqdata' function from Bio.Sequence.Fasta, and convert it to a 'plain old' String with 'toStr'. If you need to work with they bytestrings, you can import Data.ByteString.Lazy.Char8. It should be imported 'qualified' so that it doesn't conflict with functions from the Prelude. Here is some code. I haven't had a chance to test it, so I apologize for any bugs. Dan module Main where import Bio.Core.Sequence import Bio.Sequence.Fasta import qualified Data.ByteString.Lazy.Char8 as B -- not necessary for this example, but this is how you would import the bytestring functions. main :: IO() main = do input <- readFasta "./input.fasta" let seqs = map (toStr . seqdata) seqs mapM_ putStrLn seqs On Mon, Oct 20, 2014 at 9:27 AM, Youens-Clark, Charles Kenneth - (kyclark) < kycl...@email.arizona.edu> wrote: > I’m currently working my way through the problems at “rosalind.info,” > implementing each of my solutions in Perl, Python, and Haskell. I’m stuck > on the “GC” problem (http://rosalind.info/problems/gc/) as I need to > parse a FASTA file with "Bio.Sequence.Fasta.” > > In ghci, I can easily do this: > > ghci> let f = readFasta "input.fasta" > ghci> f > [ID ----------------------------------------------------------------- > Rosalind_6404 > > COMMENT ------------------------------------------------------------ > > > DATA --------------------------------------------------------------- > 0 CCTGCGGAAG ATCGGCACTA GAATAGCCAG AACCGTTTCT CTGAGGCTTC CGGCCTTCCC > 60 TCCCACTAAT AATTCTGAGG,ID > ----------------------------------------------------------------- > Rosalind_5959 > > COMMENT ------------------------------------------------------------ > > > DATA --------------------------------------------------------------- > 0 CCATCGGTAG CGCATCCTTA GTCCAATTAA GTCCCTATCC AGGCGCTCCG CCGAAGGTCT > 60 ATATCCATTT GTCAGCAGAC ACGC,ID ———————————————————————————————— > … > > So I know it’s working, but I can’t figure out what to do with “f” now. I > can see it’s type: > > ghci> :t f > f :: IO > [Bio.Sequence.SeqData.Sequence Bio.Sequence.SeqData.Unknown] > > But what do I do with “f” to, say, iterate over the sequences and count > G/C content, get the length of the sequence, etc.? > > Thanks, > > Ken