I’m trying to read FASTQ data with the Bio.Core.Sequence and Bio.Sequence.FastQ 
libraries.  My code so far:

> import Bio.Core.Sequence
> import Bio.Sequence.FastQ
> import System.Environment (getArgs)
> import qualified Data.ByteString.Lazy as B
> import qualified Data.ByteString.Lazy.Char8 as BC
> 
> main = do
>     (file:args) <- getArgs
>     seqs <- readSangerQ file
>     let first = head seqs
>     let label = seqid first
>     putStrLn $ ">" ++ BC.unpack (unSL label)
>     putStrLn $ BC.unpack (unSD $ seqdata first)
>     --let qual = BC.unpack (unQD $ seqqual first)
>     --putStrLn $ ">>" ++ qual ++ "<<"

And it runs:

> $ runghc Fastq.hs ../data/044_Eikenellacorrodens_BAA1152.bam.fastq
> >L9GAC:00013:00077
> CTTGCCGATTTATTTGCGGGTCGGCGAGCCAGTCGATACGTCGGTGTCAGGTGGCGGATAAGTCTAAACCGGAAACGGGGAAGT

First off, I’d appreciate any pointers on how to do this better.

Second, I don’t understand how to get the quality data as a list of integer 
values.

Thanks,

Ken
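
On the second question, a minimal sketch of one possible approach, assuming that unQD yields a lazy ByteString (as the commented-out lines above suggest) with one byte per base. The helper name phredScores is made up for illustration and is not part of either library. Whether the parser has already removed the Sanger offset of 33 is not something the post establishes, so the offset is left as a parameter: pass 0 if the bytes are already Phred scores, 33 if they are still the raw ASCII characters from the file.

```haskell
import qualified Data.ByteString.Lazy as B

-- | Hypothetical helper: turn a quality string into a list of Int scores.
-- Each byte is widened to Int and the given offset is subtracted.
-- Use offset 33 for raw Sanger-encoded characters, 0 if the offset
-- has already been stripped (an assumption to check against the library).
phredScores :: Int -> B.ByteString -> [Int]
phredScores offset = map (\w -> fromIntegral w - offset) . B.unpack
```

In the program above this might be used as `print (phredScores 0 (unQD (seqqual first)))`. If the printed values all sit between 33 and about 75, the offset is probably still present and 33 is the better argument.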
  • Parsing FASTQ Youens-Clark, Charles Kenneth - (kyclark)