I was trying to figure out how to handle encodings
other than utf8 in connection with pipes-text, now that
it's clearer what one can hope for from the newer
version of `text`. It is easy enough to replicate the
`Codec` types used in enumerator and conduit, and then
define `decode` and `encode` functions. (See e.g.

    http://hackage.haskell.org/package/conduit-1.0.8/docs/src/Data-Conduit-Text.html#Codec
    http://hackage.haskell.org/package/enumerator-0.4.20/docs/src/Data-Enumerator-Text.html#Codec

)

I tried this, and it doesn't seem to raise any problem
except that it involves a dependency on conduit or
enumerator or else replicating their code (which is
what I did in my experiment).

But, especially given the types we wanted for e.g.

      decodeUtf8 :: Producer ByteString m r -> Producer Text m (Producer ByteString m r)

the type of the different `Codec`s ends up being very
close to a `Lens'` between `Producer ByteString m r`
and `Producer Text m (Producer ByteString m r)` (or
rather a defective `Iso`). That is, it is similar to
`span isValidUtf8Char`, which is a `Lens'` in the new
pipes-parse, and akin to all the lenses exported by
pipes-bytestring.

     type Codec m = forall r. Lens' (Producer ByteString m r) (Producer Text m (Producer ByteString m r))
     utf8 :: Monad m => Codec m
     latin1 :: Monad m => Codec m 
     decodeUtf8 p = p ^. utf8
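
To make the shape concrete, here is a minimal sketch of one such lens. It is only an illustration under stated assumptions: it uses a hypothetical 7-bit `ascii` codec instead of `utf8` (so the decoder stays short), defines `Lens'` and `view` locally rather than importing them from a lens library, and the names `decodeAscii`, `ascii` and `demo` are mine, not part of any package. The encoding direction is lossy for characters above U+00FF, which is exactly the "defective `Iso`" problem.

```haskell
{-# LANGUAGE RankNTypes #-}
import Control.Monad (join, unless)
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8
import Data.Functor.Const (Const (..))
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Pipes
import qualified Pipes.Prelude as P

-- Van Laarhoven lens, defined locally to avoid a lens dependency.
type Lens' a b = forall f. Functor f => (b -> f b) -> a -> f a

view :: ((b -> Const b b) -> a -> Const b a) -> a -> b
view l = getConst . l Const

type Codec m =
  forall r. Lens' (Producer ByteString m r)
                  (Producer Text m (Producer ByteString m r))

-- Decode while the bytes are 7-bit ASCII; on the first bad byte,
-- stop and return the undecoded remainder as the result.
decodeAscii :: Monad m
            => Producer ByteString m r
            -> Producer Text m (Producer ByteString m r)
decodeAscii p = do
  step <- lift (next p)
  case step of
    Left r -> return (return r)
    Right (bs, p') -> do
      let (ok, bad) = B.span (< 0x80) bs
      unless (B.null ok) (yield (TE.decodeLatin1 ok))
      if B.null bad
        then decodeAscii p'
        else return (yield bad >> p')

-- Hypothetical codec: view decodes, setting re-encodes and then
-- resumes with the leftover bytes. B8.pack truncates code points
-- above U+00FF, so this direction is lossy.
ascii :: Monad m => Codec m
ascii k p = fmap encodeBack (k (decodeAscii p))
  where
    encodeBack q = join (for q (yield . B8.pack . T.unpack))

-- Collect the decoded chunks, discarding the leftover producer.
demo :: [Text]
demo = P.toList (() <$ view ascii (each [B8.pack "ab", B8.pack "c\xE9z"]))
```

Here `view ascii` plays the role of `decodeUtf8 p = p ^. utf8` above; a real `utf8` would also have to carry incomplete multi-byte sequences across chunk boundaries, which this sketch sidesteps.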

(There are other possibilities.) One difference from the
conduit/enumerator `Codec` type is that they envisage
possible failure going from `Text` to `ByteString`, which
could certainly happen with `latin1`. But I wonder if
this is necessary given the purposes Gabriel has in
mind for this style of `Lens'`. The
`Producer Text ...` I am dealing with in many contexts
is a `Producer Text m (Producer ByteString m r)` where
the text is validated as utf8.

Another possibility is to introduce a special exception
type the way `conduit` and `enumerator` do, and then use
`MonadCatch` or something. So we'd end up with e.g.

      utf8 :: MonadCatch m => Lens' (Producer ByteString m r) (Producer Text m r)

Really that could be Iso or Kinda_Iso, I guess.
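
A rough sketch of the exception-throwing direction, again with a hypothetical 7-bit ASCII decoder standing in for utf8. The `CodecError` type and the names `decodeAsciiThrow`, `demoGood` and `demoBad` are invented for illustration; I am assuming the `exceptions` package, and only `MonadThrow` is needed to raise the error (`MonadCatch` would be for whoever wants to recover).

```haskell
import Control.Exception (Exception, SomeException)
import Control.Monad.Catch (MonadThrow, throwM)
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Pipes
import qualified Pipes.Prelude as P

-- Hypothetical exception type, in the spirit of the ones that
-- conduit and enumerator define for their codecs.
newtype CodecError = NotAscii ByteString
  deriving Show

instance Exception CodecError

-- Decode 7-bit ASCII chunks; throw in the base monad on the
-- first chunk that contains a byte outside that range.
decodeAsciiThrow :: MonadThrow m
                 => Producer ByteString m r
                 -> Producer Text m r
decodeAsciiThrow p = for p $ \bs ->
  if B.all (< 0x80) bs
    then yield (TE.decodeLatin1 bs)
    else lift (throwM (NotAscii bs))

-- Running in `Either SomeException` turns the throw into a Left.
demoGood, demoBad :: Either SomeException [Text]
demoGood = P.toListM (decodeAsciiThrow (each [B8.pack "ok"]))
demoBad  = P.toListM (decodeAsciiThrow (each [B8.pack "ok", B.pack [0xE9]]))
```

The result type is the plain `Producer Text m r`, which is what makes the `Iso`-like reading possible, at the price of losing the undecoded remainder when the exception fires.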

One problem with the text library is how deeply
entrenched its use of exceptions is. For example, on the
exception-avoiding principles we have adopted,

     pack :: Producer String m a -> Producer Text m a  -- i.e. map T.pack

ought really to be

     pack :: Producer String m a -> Producer Text m (Producer String m a)

since surrogate code points (U+D800 to U+DFFF), though
legal Haskell `Char`s, cannot be represented in `Text`
(`T.pack` replaces them with U+FFFD), no doubt for
reasonable reasons. So we ought to stop when we hit one
and return the rest. It is really no different from
`encodeLatin1`. It makes one want to use `U.Vector Char` ...
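
A sketch of what the stop-and-return-the-rest version of `pack` might look like. It assumes the unrepresentable `Char`s are the surrogate code points U+D800 to U+DFFF, which as far as I know are the ones `T.pack` silently replaces with U+FFFD; `isScalar`, `packStop` and `demo` are names made up for this example.

```haskell
import Control.Monad (unless)
import Data.Text (Text)
import qualified Data.Text as T
import Pipes
import qualified Pipes.Prelude as P

-- Assumption: Text can hold every Char except surrogate code points.
isScalar :: Char -> Bool
isScalar c = c < '\xD800' || c > '\xDFFF'

-- Pack strings until an unrepresentable Char turns up, then stop
-- and return the unconsumed input (starting at that Char).
packStop :: Monad m
         => Producer String m a
         -> Producer Text m (Producer String m a)
packStop p = do
  step <- lift (next p)
  case step of
    Left r -> return (return r)
    Right (s, p') -> do
      let (ok, bad) = span isScalar s
      unless (null ok) (yield (T.pack ok))
      if null bad
        then packStop p'
        else return (yield bad >> p')

-- Collect the packed chunks, discarding the leftover producer.
demo :: [Text]
demo = P.toList (() <$ packStop (each ["ab", "c\xD800z"]))
```

An `encodeLatin1` along these lines would have the same structure, with `span (<= '\xFF')` over each `Text` chunk in place of `span isScalar`.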

Sorry, these are somewhat half-baked and perhaps
confused thoughts that came to me after looking at
the swank new `pipes-parse` and `pipes-bytestring`,
having first thought about this `Codec` concept a little.
I was wondering whether some obvious excellent idea
might occur to someone.

Michael
