[Haskell-cafe] representation on persistent store question

2009-01-01 Thread Galchin, Vasili
Hello,

Say I have several data structures that are marshalled(using Binary
class) and written out linearly on persistence store. I want to calculate
the offsets in bytes of these various data structures in a functional
language way. What is the suggested (elegant) way  ?

Regards, Vasili
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] representation on persistent store question

2009-01-01 Thread Antoine Latter
2009/1/1 Galchin, Vasili vigalc...@gmail.com:

 Say I have several data structures that are marshalled(using Binary
 class) and written out linearly on persistence store. I want to calculate
 the offsets in bytes of these various data structures in a functional
 language way. What is the suggested (elegant) way  ?


It doesn't look like the 'Put' monad in te binary package keeps track
of position in the output stream.

Is there a bigger-picture goal you're trying to achieve?  Maybe we
could suggest a better approach by stepping back a bit.

-Antoine
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] representation on persistent store question

2009-01-01 Thread Galchin, Vasili
The second data structure is an array of structure .. the third set of
structure are a series of bit lists ... Each array element  has an offset
for its  corresponding bit list:

[{, offset: Int64}] [[bit]]

when I marshall up all this offset should be the serialized/marshalled
offset of its correponding [bit]!!

Regards, Vasili


On Thu, Jan 1, 2009 at 10:51 PM, Antoine Latter aslat...@gmail.com wrote:

 2009/1/1 Galchin, Vasili vigalc...@gmail.com:
 
  Say I have several data structures that are marshalled(using Binary
  class) and written out linearly on persistence store. I want to calculate
  the offsets in bytes of these various data structures in a functional
  language way. What is the suggested (elegant) way  ?
 

 It doesn't look like the 'Put' monad in te binary package keeps track
 of position in the output stream.

 Is there a bigger-picture goal you're trying to achieve?  Maybe we
 could suggest a better approach by stepping back a bit.

 -Antoine

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] representation on persistent store question

2009-01-01 Thread Antoine Latter
On Jan 1, 2009 11:50pm, Galchin, Vasili vigalc...@gmail.com wrote:
 it is a bioinformatics standard .. . I am writing on this newsgroup in order 
 to try to be objective to get a correct and elegant answer .. in any case I 
 am helping on the bioinformatics code (you can see on Hackage). I am trying 
 to finish the 2Bit file format code ... it seems to me that bioinformatics as 
 an area is not clearly defined  e.g. it is unclear to me whether offset 
 is a marshalled/serialized concept or or unmarshalled/unserialized concept 
 . this distinction is very important  I will have to think about more 
 myself!


 Regards, Vasili



Here's some code using Data.Binary to store data as offsets into a
byte array.  I haven't tested it too much, so it may have bugs.  Maybe
there's some inspiration in there.

-Antoine


import Data.Binary
import Data.Binary.Get
import Data.Binary.Put

import Data.ByteString.Lazy (ByteString)
import qualified Data.ByteString.Lazy as B

data TestStruct = TestStruct
{ property1 :: ByteString
, property2 :: ByteString
, property3 :: ByteString
}
 deriving Show

{-

 The serialized format looks like (all big-endian):

 * first offset into data block (Word32)
 * second offset into data block (Word32)
 * third offset into data block (Word32)
 * length of bnary data block (Word32)
 * binary data block (Arbitrary binary data)

-}
instance Binary TestStruct where
put struct =
let data1 = property1 struct
data2 = property2 struct
data3 = property3 struct

dataBlock = data1 `B.append` data2 `B.append` data3

offset1 = 0
offset2 = offset1 + B.length data1
offset3 = offset2 + B.length data2

   in do
 putWord32be $ fromIntegral offset1
 putWord32be $ fromIntegral offset2
 putWord32be $ fromIntegral offset3

 putWord32be $ fromIntegral $ B.length dataBlock
 putLazyByteString dataBlock

get = do
  offset1 - getWord32be
  offset2 - getWord32be
  offset3 - getWord32be

  dataBlockLength - getWord32be
  dataBlock - B.drop (fromIntegral offset1) `fmap`
   getLazyByteString (fromIntegral dataBlockLength)

  let (data1, rest1) =
  B.splitAt (fromIntegral $ offset2 - offset1) dataBlock
  (data2, rest2) =
  B.splitAt (fromIntegral $ offset3 - offset2 - offset1) rest1
  data3  = rest2

  return $ TestStruct data1 data2 data3

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] representation on persistent store question

2009-01-01 Thread Galchin, Vasili
dude .. you rock ... let me check it out ;^)

Vasili


On Fri, Jan 2, 2009 at 12:24 AM, Antoine Latter aslat...@gmail.com wrote:

 On Jan 1, 2009 11:50pm, Galchin, Vasili vigalc...@gmail.com wrote:
  it is a bioinformatics standard .. . I am writing on this newsgroup in
 order to try to be objective to get a correct and elegant answer .. in any
 case I am helping on the bioinformatics code (you can see on Hackage). I am
 trying to finish the 2Bit file format code ... it seems to me that
 bioinformatics as an area is not clearly defined  e.g. it is unclear to
 me whether offset is a marshalled/serialized concept or or
 unmarshalled/unserialized concept . this distinction is very important
  I will have to think about more myself!
 
 
  Regards, Vasili
 


 Here's some code using Data.Binary to store data as offsets into a
 byte array.  I haven't tested it too much, so it may have bugs.  Maybe
 there's some inspiration in there.

 -Antoine

 
 import Data.Binary
 import Data.Binary.Get
 import Data.Binary.Put

 import Data.ByteString.Lazy (ByteString)
 import qualified Data.ByteString.Lazy as B

 data TestStruct = TestStruct
{ property1 :: ByteString
, property2 :: ByteString
, property3 :: ByteString
}
  deriving Show

 {-

  The serialized format looks like (all big-endian):

  * first offset into data block (Word32)
  * second offset into data block (Word32)
  * third offset into data block (Word32)
  * length of bnary data block (Word32)
  * binary data block (Arbitrary binary data)

 -}
 instance Binary TestStruct where
put struct =
let data1 = property1 struct
data2 = property2 struct
data3 = property3 struct

dataBlock = data1 `B.append` data2 `B.append` data3

offset1 = 0
offset2 = offset1 + B.length data1
offset3 = offset2 + B.length data2

   in do
 putWord32be $ fromIntegral offset1
 putWord32be $ fromIntegral offset2
 putWord32be $ fromIntegral offset3

 putWord32be $ fromIntegral $ B.length dataBlock
 putLazyByteString dataBlock

get = do
  offset1 - getWord32be
  offset2 - getWord32be
  offset3 - getWord32be

  dataBlockLength - getWord32be
  dataBlock - B.drop (fromIntegral offset1) `fmap`
   getLazyByteString (fromIntegral dataBlockLength)

  let (data1, rest1) =
  B.splitAt (fromIntegral $ offset2 - offset1) dataBlock
  (data2, rest2) =
  B.splitAt (fromIntegral $ offset3 - offset2 - offset1) rest1
  data3  = rest2

  return $ TestStruct data1 data2 data3
 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe