Thanks for the reply! The rotated lens is no problem (rotateR is from 
Data.Bits), but I'm afraid the data won't decode as UTF-8. Just to make 
sure I understand correctly: when you talk about re-encoding unused values, 
do you mean the values that would be left over if the parser I zoom into 
were something other than drawAll and didn't consume all the data provided 
by the span lens? I understand why it would be a problem if those leftovers 
weren't propagated back, but I'm not sure I understand why that decision 
can't be made before the data is rotated and decoded as text. Does it have 
to do with the data being bytestrings that get transformed in blocks rather 
than per byte?

Anyway, I'll have to go with your second option. Instead of breaking the 
parser into multiple code blocks (each of which has to be run with 
runStateT individually) in order to get at the bytestring producer, is it 
reasonable to use get and put from Control.Monad.State? That way I can keep 
everything in a single Parser: view the bytestring producer from "get" 
through the PB.span lens composed with the transformations, then "put" back 
the producer returned by span.
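
Something like the following is what I have in mind. It is only a sketch 
under a few assumptions: PB.span and PB.map from pipes-bytestring, toListM' 
from Pipes.Prelude, and (^.) from lens-family-core (Control.Lens should work 
too). I kept decodeLatin1 since the data isn't UTF-8. The `example` helper 
and its input bytes are made up purely for illustration.

```haskell
{-# LANGUAGE RankNTypes #-}

import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (get, put, runStateT)
import Data.Bits (rotateR)
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import qualified Data.Text as T
import Data.Text.Encoding (decodeLatin1)
import Lens.Family ((^.)) -- assumption: lens-family-core; Control.Lens also has (^.)
import Pipes
import qualified Pipes.ByteString as PB
import Pipes.Parse (Parser)
import qualified Pipes.Prelude as PPrelude

-- | Read a rotated, 0-terminated file name while staying inside one Parser.
-- The terminating 0 is NOT consumed here; it stays at the front of the
-- producer that is put back.
decodeFileName :: Monad m => Parser ByteString m String
decodeFileName = do
    p <- get
    -- Everything before the first 0 byte, rotated chunk by chunk.
    let nameBytes = p ^. PB.span (/= 0) >-> PB.map (`rotateR` 3)
    -- toListM' drains the name and hands back the producer returned by span.
    (chunks, rest) <- lift (PPrelude.toListM' nameBytes)
    put rest
    return (T.unpack (decodeLatin1 (BS.concat chunks)))

-- | Hypothetical usage: parse one name and collect the untouched bytes.
example :: IO (String, ByteString)
example = do
    (name, rest) <- runStateT decodeFileName
                        (yield (BS.pack [0x0B, 0x13, 0x00, 0xFF]))
    leftover <- BS.concat <$> PPrelude.toListM rest
    return (name, leftover)
```

Since nothing after the span is ever rotated or decoded, there is nothing 
to "un-rotate" when putting the rest back, which is why I think get/put 
might be safe here.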

Bonus question: If the rotated lens were simply Bits a => Int -> Lens' a a, 
could it be mapped/zoomed/something over a ByteString producer instead of 
including PB.map in the lens? That way rotated would be more reusable.
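
To make the bonus question concrete, here is a rough sketch of the 
element-level optic I mean. It is written directly in van Laarhoven style 
using only base, so it does not depend on any particular lens library. The 
names rotatedBits, viewRotated and overRotated are my own placeholders, not 
anything from pipes or lens.

```haskell
import Data.Bits (Bits, rotateL, rotateR)
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))
import Data.Word (Word8)

-- | Hypothetical element-level rotation optic. Viewing rotates right by n;
-- writing back rotates left by n, so the round trip is the identity.
rotatedBits :: (Bits a, Functor f) => Int -> (a -> f a) -> a -> f a
rotatedBits n f a = (`rotateL` n) <$> f (a `rotateR` n)

-- | "view" specialised to this optic: read the rotated value.
viewRotated :: Bits a => Int -> a -> a
viewRotated n = getConst . rotatedBits n Const

-- | "over" specialised to this optic: edit the rotated value, then undo
-- the rotation on the way back.
overRotated :: Bits a => Int -> (a -> a) -> a -> a
overRotated n g = runIdentity . rotatedBits n (Identity . g)

-- | Made-up sample byte: 0x0B rotated right by 3 is 0x61 ('a').
sample :: Word8
sample = 0x0B
```

What I don't see is how to lift something like this over a Producer 
ByteString without applying PB.map by hand in both directions.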

On Saturday, May 10, 2014 1:45:32 AM UTC+2, Gabriel Gonzalez wrote:
>
>  This works much better if you can make two small changes.
>
> First, I'm guessing that your `rotateR` function has some sort of inverse 
> named `rotateL`.  If it does, then you can make a rotation lens:
>
>     rotated :: Monad m
>             => Int
>             -> Lens' (Producer ByteString m x) (Producer ByteString m x)
>     -- PB.map is a Pipe, so it is attached to the producer with (>->)
>     rotated n = iso (>-> PB.map (`rotateR` n)) (>-> PB.map (`rotateL` n))
>
> Second, if you can use utf8 instead of latin1, then you can just write:
>
>     decodeFileName :: Monad m => Parser ByteString m String
>     decodeFileName =
>         zoom (PB.span (/= 0) . rotated 3 . PT.utf8 . from PT.packChars)
>              PP.drawAll
>
> The reason this works is that `rotated` and `utf8` contain extra 
> information for how to propagate unused bytes back to the original input 
> source.  In the case of `rotated` it reverses the original rotation, and 
> in the case of `utf8` it re-encodes them.
>
> If you don't have information for how to re-encode unused values, then you 
> must apply the rotation and encoding to the producer before feeding it to 
> the parser:
>
>     yourProducer :: Producer ByteString IO ()
>
>     runStateT PP.drawAll
>         (yourProducer ^. PB.span (/= 0)
>                       ^. to (>-> PB.map (`rotateR` n))
>                       ^. PT.utf8
>                       ^. from PT.packChars)
>         :: IO (String, Producer Char IO (... {- more nested producers -}))
>
> `pipes-parse` doesn't let you merge logic into the parser unless you also 
> include logic for how to propagate unused bytes to the input source.  
> Without that guarantee you get bugs related to silently dropping input 
> values.
>
> On 5/9/14, 11:06 AM, Torgeir Strand Henriksen wrote:
>  
> While working with a binary file format, I started out with this naive 
> code:
>
> import Data.Binary.Get (getWord8, getWord32le)
> import Data.Bits (rotateR)
> import Data.Text.Encoding (decodeLatin1)
> import qualified Pipes.Parse as P
> import qualified Pipes.Binary as P
> import qualified Pipes.ByteString as PB
> import qualified Data.Text as T
> import qualified Data.ByteString as BS
>
> entryParser tableStart = P.decodeGet $
>     (,,,) <$> decodeFilename
>           <*> fmap (tableStart +) getWord32le
>           <*> getWord32le
>           <*> getWord32le
>
> decodeFilename = T.unpack . decodeLatin1 . BS.pack <$> go where
>     go = do
>         c <- (`rotateR` 3) <$> getWord8
>         -- terminate on (and consume) the 0
>         if c /= 0 then (c :) <$> go else pure []
>  
> While it does work, I'm unhappy with decodeFilename as it basically 
> implements a combination of map and span/fold with explicit recursion. But 
> the underlying ByteString isn't available inside the Get monad without 
> consuming it, so using e.g. BS.span seems out of the question. Let's see if 
> lenses can come to the rescue:
>
> entryParser tableStart = do
>     nameChunks <- zoom (PB.span (/= 0)) P.drawAll
>     PB.drawByte -- draw the terminating 0
>     let fileName = T.unpack . decodeLatin1 . BS.map (flip rotateR 3)
>                  . BS.concat $ nameChunks
>     P.decodeGet $ (,,,) fileName <$> fmap (tableStart +) getWord32le
>                                  <*> getWord32le
>                                  <*> getWord32le
>  
> I like this better - map and span aren't implemented manually anymore - 
> but at the same time I was hoping for more. It doesn't seem right to work 
> directly on ByteStrings (i.e. BS.map instead of PB.map, and text instead of 
> pipes-text), and the combination of drawAll and concat is a bit awkward, 
> especially since drawAll is only for testing (even though all the tutorials 
> use it :) ). The latter point might be addressed by giving pipes-bytestring 
> a folding function similar to P.foldAll, but even so I wonder if there's a 
> more idiomatic way to do this?
>  -- 
> You received this message because you are subscribed to the Google Groups 
> "Haskell Pipes" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
>
>
>  
