drtomc:
> >Anyone trying to do any of this?
>
> I've done some work in this area. I'm particularly interested in
> manipulating ASN.1 in haskell. Actually, my first use of Parsec was an
> ASN.1 parser. I'd done one previously in Spirit (the Boost C++ rip-off
> of parsec), but semantic actions were horrible in the extreme. Mmmm
> Parsec.
>
> In the indexing system I'm currently building in Haskell for my day
> job, I'm serializing several data structures, and using Data.Bits and
> Data.ByteString heavily.
>
> I was using HaXml, but I found it was very slow. So instead, I'm using
> an internal (within the indexing system) representation that is more
> akin to WBXML:
>
> import Data.ByteString as ByteString
> import Data.List as List
> import Data.Sequence as Seq
>
> data DocTree
>     = DocElem ByteString [(ByteString,ByteString)] [DocTree]
>     | DocText ByteString
>
> serialize tree = ByteString.concat $ Seq.toList $ execState
>                      (serialize' tree) Seq.empty
> serialize' (DocText txt) = do
>     stuff <- get
>     put (stuff |> pack [0])
>     putStr txt
> serialize' (DocElem name attrs kids) = do
>     stuff <- get
>     put (stuff |> pack [1])
>     putStr name
>     putNum (List.length attrs)
>     mapM_ (putPair putStr putStr) attrs
>     putNum (List.length kids)
>     mapM_ serialize' kids
>
> putStr ....
>
> You get the idea. Actually, the *real* code is trickier - it grovels
> first to find all the element names and numbers them. Likewise with
> attribute names (per element). The extra grovel is well worth it - it
> takes a little longer to serialize, but is more compact and
> deserializes quicker.
>
> Also worth noting - whether you compile a dictionary of element names
> or not, the result is much much much more space efficient than using
> HaXml, since it can all be decoded out of a single ByteString
> containing the document tree, with no actual string copying at all.
> That's the kind of [de]serialization I like. :-) Mind you, I still
> have to use HaXml when I first read documents into the system, and a
> very nice job it does too.
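
To make the quoted scheme concrete, here is a minimal, self-contained
sketch of that kind of tagged, length-prefixed encoding. It is not the
code from the indexing system: it swaps the State-over-Seq accumulator
for Data.ByteString.Builder, and the 4-byte big-endian length prefix,
the Parser synonym and the helper names (putNum, putStrB, getNum,
getStr, getMany, getPair, getTree, deserialize, example) are my own
assumptions. Only DocTree, the 0/1 tag bytes and the field order come
from the quoted text, and the element-name dictionary is left out.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import Data.ByteString.Builder
  (Builder, byteString, toLazyByteString, word32BE, word8)

data DocTree
  = DocElem B.ByteString [(B.ByteString, B.ByteString)] [DocTree]
  | DocText B.ByteString
  deriving (Eq, Show)

-- Assumption: counts and lengths are written as 4-byte big-endian words.
putNum :: Int -> Builder
putNum = word32BE . fromIntegral

-- A string is its byte length followed by its raw bytes.
putStrB :: B.ByteString -> Builder
putStrB s = putNum (B.length s) <> byteString s

-- Tag byte 0 marks text, tag byte 1 marks an element, as in the quote.
build :: DocTree -> Builder
build (DocText txt) = word8 0 <> putStrB txt
build (DocElem name attrs kids) =
  word8 1
    <> putStrB name
    <> putNum (length attrs)
    <> mconcat [putStrB k <> putStrB v | (k, v) <- attrs]
    <> putNum (length kids)
    <> mconcat (map build kids)

serialize :: DocTree -> B.ByteString
serialize = BL.toStrict . toLazyByteString . build

-- A hand-rolled parser: consume a prefix, return the value and the rest.
type Parser a = B.ByteString -> Maybe (a, B.ByteString)

getNum :: Parser Int
getNum bs
  | B.length bs < 4 = Nothing
  | otherwise =
      let (h, rest) = B.splitAt 4 bs
       in Just (B.foldl' (\acc w -> acc * 256 + fromIntegral w) 0 h, rest)

-- B.splitAt only slices, so the result shares the input buffer.
getStr :: Parser B.ByteString
getStr bs = do
  (n, rest) <- getNum bs
  if B.length rest < n then Nothing else Just (B.splitAt n rest)

getMany :: Int -> Parser a -> Parser [a]
getMany n p bs
  | n <= 0 = Just ([], bs)
  | otherwise = do
      (x, rest) <- p bs
      (xs, rest') <- getMany (n - 1) p rest
      Just (x : xs, rest')

getPair :: Parser (B.ByteString, B.ByteString)
getPair bs = do
  (k, r1) <- getStr bs
  (v, r2) <- getStr r1
  Just ((k, v), r2)

getTree :: Parser DocTree
getTree bs = do
  (tag, r0) <- B.uncons bs
  case tag of
    0 -> do
      (txt, r1) <- getStr r0
      Just (DocText txt, r1)
    1 -> do
      (name, r1) <- getStr r0
      (nAttrs, r2) <- getNum r1
      (attrs, r3) <- getMany nAttrs getPair r2
      (nKids, r4) <- getNum r3
      (kids, r5) <- getMany nKids getTree r4
      Just (DocElem name attrs kids, r5)
    _ -> Nothing

-- Succeeds only if the whole input is consumed.
deserialize :: B.ByteString -> Maybe DocTree
deserialize bs = case getTree bs of
  Just (t, rest) | B.null rest -> Just t
  _ -> Nothing

-- A small hypothetical document for trying the round trip.
example :: DocTree
example =
  DocElem (BC.pack "p")
    [(BC.pack "class", BC.pack "intro")]
    [DocText (BC.pack "hello"), DocElem (BC.pack "br") [] []]
```

With this sketch, deserialize (serialize example) should give back
Just example, and every name and text node in the decoded tree is a
slice of the one input ByteString rather than a fresh copy.
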

Can we do a cheap bytestring binding to libxml, to avoid any initial
String processing?

-- Don
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe