On Wed, 2014-07-02 at 17:39 -0700, Gabriel Gonzalez wrote:
> This can be solved really elegantly by adding this function to
> `Pipes.Prelude`:
>
>     -- I need a better name for this
>     andFold :: Monad m => (x -> a -> x) -> x -> (x -> b)
>             -> Producer a m r -> Producer a m (r, b)

Yes please!

> Then you can make your code much more reusable by writing it as a `Fold`
> from `foldl`:
>
>     hash :: HashAlgorithm a => Fold ByteString (Digest a)
>
> Then you would apply `hash` to a `Producer` using:
>
>     purely andFold hash :: HashAlgorithm a => Producer ByteString m r
>         -> Producer ByteString m (r, Digest a)
>
> The benefit of factoring out the `hash` logic into a `Fold` is that:
>
> * You can then use it to fold other containers, like lists or vectors:
>
>       Control.Foldl.fold hash :: Foldable f => f ByteString -> Digest a
>
>   `conduit` also supports folding using the `Fold` type in
>   `conduit-extras` so your solution then generalizes to conduits, too.
>
> * You can apply other folds at the same time using the `Applicative`
>   instance for `Fold`:
>
>       purely andFold ((,) <$> hash <*> Control.Foldl.ByteString.length)
>           :: Producer ByteString m r
>           -> Producer ByteString m (r, (Digest a, Int64))
>
>   The above code computes the hash and the length in bytes simultaneously
>   in a single pass over the data.
>
> * People can use your `hash` `Fold` without a `pipes` dependency. I
>   design all my libraries to minimize coupling like this.

I'm familiar with the `foldl` library (even created a sort-of port to
OCaml), and it's really nice! If `andFold` were in `pipes`, I'd
certainly change my code to use a `Fold` instead. I might even try to
convince upstream to include such a `Fold` in `cryptohash`, so my
package becomes obsolete.

Nicolas

> On 7/2/14, 3:59 PM, Nicolas Trangez wrote:
> > On Wed, 2014-07-02 at 15:31 -0700, Gabriel Gonzalez wrote:
> >> I think the interface I would recommend would be:
> >>
> >>     hash :: (HashAlgorithm a, Monad m) => Producer ByteString m r ->
> >>         Producer ByteString m (Digest a, r)
> >>
> >> I noticed your version has two `Producer` layers. What's the role of
> >> the second `Producer`?
> > Good question :-) I expected to be able to write the function according
> > to the type you point out above, but when I wrote the thing, GHC
> > insisted the base `Producer` was required, and I tend to trust GHC when
> > it tells me such things.
> >
> > Now, since you made the same remark, and I tend to trust you as well
> > (;-)), I went back to the code and figured out all I need is a `lift`...
> >
> > So, first API change:
> >
> >     hash :: (HashAlgorithm a, Monad m)
> >          => Producer ByteString m r
> >          -> Producer ByteString m (r, Digest a)
> >
> > Much better!
> >
> > Now I'm wondering whether I'm overlooking any function in the Pipes
> > library which does what my internal function does:
> >
> >     hashInternal :: Monad m
> >                  => (a -> b -> a)
> >                  -> a
> >                  -> Producer b m r
> >                  -> Producer b m (r, a)
> >     hashInternal f = loop
> >       where
> >         loop !ctx p =
> >             lift (next p) >>=
> >                 either
> >                     (\r -> return (r, ctx))
> >                     (\(b, p') -> yield b >> loop (f ctx b) p')
> >
> > This seems like a fairly reasonable/common thing to do, no (modulo the
> > strictness, which could be pushed into `f`)?
> >
> > Thanks,
> >
> > Nicolas
> >
> >> On 7/2/14, 2:49 PM, Nicolas Trangez wrote:
> >>> All,
> >>>
> >>> Recently I was looking for a way to calculate the SHA256 hash of a
> >>> stream of ByteStrings, but not only calculate this hash: also pass on
> >>> the stream so it could be stored to a file (use-case: receiving data
> >>> from a socket, which needs to be saved to a file, named after a hash of
> >>> the content, then returning this 'name' to the client).
> >>>
> >>> I couldn't find any such library off-the-shelf in the Pipes ecosystem,
> >>> so I tried to implement something myself (with some help from IRC), and
> >>> just now pushed a first version of the code to GitHub.
> >>>
> >>> So, consider this a 'request for feedback/review' :-)
> >>>
> >>> The hashing itself is based on the 'cryptohash' library.
> >>>
> >>> There are 2 interfaces:
> >>> - One where a (strict or lazy) ByteString Producer can be wrapped,
> >>>   resulting in a new Producer which will yield the exact same values as
> >>>   the original one, but tupling the result of the original Producer with a
> >>>   digest. E.g.
> >>>
> >>>       hash :: (HashAlgorithm a, Monad m)
> >>>            => Producer ByteString (Producer ByteString m) r
> >>>            -> Producer ByteString m (r, Digest a)
> >>>
> >>> - One which acts as a real 'Pipe' which passes along every (strict or
> >>>   lazy) ByteString it receives, and updates a hashing Context kept in a
> >>>   State layer of the monad stack. E.g.
> >>>
> >>>       hashPipe :: (HashAlgorithm a, MonadState (Context a) m)
> >>>                => Pipe ByteString ByteString m r
> >>>
> >>>
> >>> These have some variants: working over streams of strict or lazy
> >>> bytestrings, requiring type inference to determine the hash type, or an
> >>> explicit value, or allowing to pass in an existing context (e.g.
> >>> generated by another stream before).
> >>>
> >>> Code @ https://github.com/NicolasT/pipes-cryptohash/
> >>>
> >>> Please let me know if you think the API could be enhanced, I'm doing
> >>> something utterly wrong,...
> >>>
> >>> Thanks!
> >>>
> >>> Nicolas
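
For reference, here is a minimal sketch of how the `andFold` combinator proposed in this thread could be written, assuming only the `pipes` and `foldl` packages. The name `andFold` is the placeholder from the thread and is not claimed to be an existing `Pipes.Prelude` export; the strict loop mirrors `hashInternal`, and `L.sum` and `L.length` stand in for a real `hash` fold so that the example needs no hashing library.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import qualified Control.Foldl as L
import Pipes

-- | Re-yield every element of the producer unchanged while folding over
-- the elements, and pair the fold result with the original return value.
-- Hypothetical name, taken from the thread.
andFold
    :: Monad m
    => (x -> a -> x)      -- step function
    -> x                  -- initial accumulator
    -> (x -> b)           -- extraction function
    -> Producer a m r
    -> Producer a m (r, b)
andFold step begin done = loop begin
  where
    -- The bang keeps the accumulator evaluated to WHNF on each step.
    loop !x p = do
        e <- lift (next p)
        case e of
            Left r        -> return (r, done x)
            Right (a, p') -> do
                yield a
                loop (step x a) p'

main :: IO ()
main = do
    -- Two folds combined via the Applicative instance of Fold and run
    -- in a single pass; the elements still flow downstream to `print`.
    let source = each [1, 2, 3 :: Int]
        folded = L.purely andFold ((,) <$> L.sum <*> L.length) source
    ((), (total, count)) <- runEffect (for folded (lift . print))
    print (total, count)
```

Because `andFold` takes the three components of a fold as separate arguments, `purely` from `foldl` can unpack any `Fold` into it, which is what would let a `hash` fold be shared with non-`pipes` code.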
