Hi Ken,

I've also been working on the rosalind.info problem set.  If you'd like to
see my answer for this problem, you can see it here:

https://github.com/dfornika/rosalind/blob/master/04_gc_content/04_gc_content.hs

If you'd prefer to work through the problem yourself, I can offer some
advice to get started.  The internal sequence data in the Sequence type in
Bio.Sequence.Fasta is a lazy bytestring.  You can access the sequence data
with the 'seqdata' function from Bio.Sequence.Fasta, and convert it to a
'plain old' String with 'toStr'.

If you need to work with they bytestrings, you can import
Data.ByteString.Lazy.Char8. It should be imported 'qualified' so that it
doesn't conflict with functions from the Prelude.

Here is some code. I haven't had a chance to test it, so I apologize for
any bugs.

Dan

module Main where

import Bio.Core.Sequence
import Bio.Sequence.Fasta
import qualified Data.ByteString.Lazy.Char8 as B  -- not necessary for this
example, but this is how you would import the bytestring functions.

main :: IO()
main = do
  input <- readFasta "./input.fasta"
  let seqs = map (toStr . seqdata) seqs
  mapM_ putStrLn seqs

On Mon, Oct 20, 2014 at 9:27 AM, Youens-Clark, Charles Kenneth - (kyclark) <
kycl...@email.arizona.edu> wrote:

> I’m currently working my way through the problems at “rosalind.info,”
> implementing each of my solutions in Perl, Python, and Haskell.  I’m stuck
> on the “GC” problem (http://rosalind.info/problems/gc/) as I need to
> parse a FASTA file with "Bio.Sequence.Fasta.”
>
> In ghci, I can easily do this:
>
> ghci> let f = readFasta "input.fasta"
> ghci> f
> [ID -----------------------------------------------------------------
> Rosalind_6404
>
> COMMENT ------------------------------------------------------------
>
>
> DATA ---------------------------------------------------------------
>   0 CCTGCGGAAG ATCGGCACTA GAATAGCCAG AACCGTTTCT CTGAGGCTTC CGGCCTTCCC
>  60 TCCCACTAAT AATTCTGAGG,ID
> -----------------------------------------------------------------
> Rosalind_5959
>
> COMMENT ------------------------------------------------------------
>
>
> DATA ---------------------------------------------------------------
>   0 CCATCGGTAG CGCATCCTTA GTCCAATTAA GTCCCTATCC AGGCGCTCCG CCGAAGGTCT
>  60 ATATCCATTT GTCAGCAGAC ACGC,ID ————————————————————————————————
> …
>
> So I know it’s working, but I can’t figure out what to do with “f” now.  I
> can see it’s type:
>
> ghci> :t f
> f :: IO
>        [Bio.Sequence.SeqData.Sequence Bio.Sequence.SeqData.Unknown]
>
> But what do I do with “f” to, say, iterate over the sequences and count
> G/C content, get the length of the sequence, etc.?
>
> Thanks,
>
> Ken

Reply via email to