On Jan 1, 2009 11:50pm, "Galchin, Vasili" <[email protected]> wrote:
> it is a bioinformatics standard .. . I am writing on this newsgroup in order 
> to try to be objective to get a "correct" and elegant answer .. in any case I 
> am helping on the bioinformatics code (you can see on Hackage). I am trying 
> to finish the 2Bit file format code ... it seems to me that bioinformatics as 
> an area is not clearly defined .... e.g. it is unclear to me whether "offset" 
> is a marshalled/serialized concept or or unmarshalled/unserialized concept 
> ..... this distinction is very important .... I will have to think about more 
> myself!
>
>
> Regards, Vasili
>


Here's some code using Data.Binary to store data as offsets into a
byte array.  I haven't tested it too much, so it may have bugs.  Maybe
there's some inspiration in there.

-Antoine

>>>>
import Data.Binary
import Data.Binary.Get
import Data.Binary.Put

import Data.ByteString.Lazy (ByteString)
import qualified Data.ByteString.Lazy as B

data TestStruct = TestStruct
    { property1 :: ByteString
    , property2 :: ByteString
    , property3 :: ByteString
    }
 deriving Show

{-

 The serialized format looks like (all big-endian):

 * first offset into data block (Word32)
 * second offset into data block (Word32)
 * third offset into data block (Word32)
 * length of bnary data block (Word32)
 * binary data block (Arbitrary binary data)

-}
instance Binary TestStruct where
    put struct =
        let data1 = property1 struct
            data2 = property2 struct
            data3 = property3 struct

            dataBlock = data1 `B.append` data2 `B.append` data3

            offset1 = 0
            offset2 = offset1 + B.length data1
            offset3 = offset2 + B.length data2

       in do
         putWord32be $ fromIntegral offset1
         putWord32be $ fromIntegral offset2
         putWord32be $ fromIntegral offset3

         putWord32be $ fromIntegral $ B.length dataBlock
         putLazyByteString dataBlock

    get = do
      offset1 <- getWord32be
      offset2 <- getWord32be
      offset3 <- getWord32be

      dataBlockLength <- getWord32be
      dataBlock <- B.drop (fromIntegral offset1) `fmap`
                   getLazyByteString (fromIntegral dataBlockLength)

      let (data1, rest1) =
              B.splitAt (fromIntegral $ offset2 - offset1) dataBlock
          (data2, rest2) =
              B.splitAt (fromIntegral $ offset3 - offset2 - offset1) rest1
          data3          = rest2

      return $ TestStruct data1 data2 data3
<<<<<
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to