[Haskell-cafe] Parallel Data.Vector.generate?
Hello all.

I'm using the Data.Vector.generate function with a complicated creation function to create a long vector. Is it possible to parallelize the creation of each element? Alternatively, if there were something like parMap for vectors, I suppose I could pass id to Data.Vector.generate and use parMap. Does something like this exist?

Thanks,
Myles C. Maxfield

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parallel Data.Vector.generate?
Thanks for this! I didn't know about Repa, and it sounds like it's exactly what the doctor ordered. I think I'll port my entire program to it!

--Myles

On Thursday, March 28, 2013, Dmitry Dzhus wrote:

> 28.03.2013, 10:38, Myles C. Maxfield <myles.maxfi...@gmail.com>:
>
>> Hello all. I'm using the Data.Vector.generate function with a complicated creation function to create a long vector. Is it possible to parallelize the creation of each element? Alternatively, if there was something like parMap for vectors, I suppose I could pass id to Data.Vector.generate and use parMap. Does something like this exist?
>
> You may use computeP + fromFunction from Repa. Wrapping of vectors to Repa arrays (and vice versa) is O(1).
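[For the archives: a minimal sketch of the computeP + fromFunction approach described above. The wrapper name parGenerate is made up here, and the sketch assumes the repa and vector packages; computeUnboxedP forces each element of the delayed array in parallel, and toUnboxed is the O(1) unwrap back to a Data.Vector.Unboxed vector.]

```haskell
import Data.Array.Repa (Z (..), (:.) (..))
import qualified Data.Array.Repa as R
import qualified Data.Vector.Unboxed as VU

-- Hypothetical parallel analogue of Data.Vector.generate:
-- build a delayed Repa array from the index function, force it
-- in parallel, then unwrap the result as an unboxed vector.
parGenerate :: (VU.Unbox a, Monad m) => Int -> (Int -> a) -> m (VU.Vector a)
parGenerate n f =
  fmap R.toUnboxed (R.computeUnboxedP (R.fromFunction (Z :. n) (\(Z :. i) -> f i)))
```

[Compile with -threaded and run with +RTS -N to actually use multiple cores.]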
Re: [Haskell-cafe] Safe 'chr' function?
Thank you both for your answers. Consider this issue closed now :-)

--Myles

On Thu, Jan 3, 2013 at 12:05 AM, Michael Snoyman <mich...@snoyman.com> wrote:

> You could wrap chr with a call to spoon [1]. It's not the most elegant solution, but it works.
>
> [1] http://hackage.haskell.org/packages/archive/spoon/0.3/doc/html/Control-Spoon.html#v:spoon
>
> On Thu, Jan 3, 2013 at 9:50 AM, Myles C. Maxfield <myles.maxfi...@gmail.com> wrote:
>
>> [...]
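[A sketch of the spoon suggestion, assuming the spoon package linked above; spoon forces a pure value and returns Nothing if evaluation throws.]

```haskell
import Control.Spoon (spoon)
import Data.Char (chr)

-- chr throws on out-of-range code points; spoon catches the
-- exception from pure code and turns it into Nothing.
safeChr :: Int -> Maybe Char
safeChr = spoon . chr
```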
[Haskell-cafe] Safe 'chr' function?
Hello,

I'm working on a general text-processing library [1], and one of my quickcheck tests is designed to make sure that my library doesn't throw exceptions (it returns an Either type on failure). However, there are some inputs that cause me to pass bogus values to the 'chr' function (such as 1208914), which causes it to throw an exception. Is there a version of that function that is safe? (I'm hoping for something like Int -> Maybe Char.) Alternatively, is there a way to know ahead of time whether or not an Int will cause 'chr' to throw an exception?

Thanks,
Myles C. Maxfield

[1] http://hackage.haskell.org/package/punycode
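[To the second question: yes, the valid domain of chr is fixed, so a bounds check avoids the exception. A sketch; chrMaybe is just an illustrative name.]

```haskell
import Data.Char (chr)

-- chr is only defined for code points 0..0x10FFFF (i.e. up to
-- maxBound :: Char); check the bounds before calling it.
chrMaybe :: Int -> Maybe Char
chrMaybe i
  | i >= 0 && i <= 0x10FFFF = Just (chr i)
  | otherwise               = Nothing
```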
Re: [Haskell-cafe] Does anyone know where George Pollard is?
False alarm. He got back to me.

Thanks,
Myles

On Thu, Nov 8, 2012 at 12:20 AM, Ketil Malde <ke...@malde.org> wrote:

> Myles C. Maxfield <myles.maxfi...@gmail.com> writes:
>
>> Does anyone know where he is?
>
> On GitHub? https://github.com/Porges
>
> One of the repos was apparently updated less than a week ago.
>
>> If not, is there an accepted practice to resolve this situation? Should I upload my own 'idna2' package?
>
> You can always upload a fork, but unless you have a lot of new functionality that won't fit naturally in the old package, you can perhaps try a bit more to contact the original author.
>
> -k
[Haskell-cafe] Does anyone know where George Pollard is?
Hello,

I sent a message to George Pollard (por...@porg.es) about his 'idna' package [1] a couple days ago, but he hasn't responded. I'd like to depend on his package for something that I'm working on, but his package fails to build (the email I sent him includes a patch that should fix up the build problems). The package hasn't been updated for 3 years and he hasn't listed a source control repository.

Does anyone know where he is? If not, is there an accepted practice to resolve this situation? Should I upload my own 'idna2' package?

Thanks,
Myles

[1] http://hackage.haskell.org/package/idna
Re: [Haskell-cafe] Auto-termination and leftovers in Conduits
Cool! Thanks so much!

--Myles

On Sat, Oct 27, 2012 at 8:35 PM, Michael Snoyman <mich...@snoyman.com> wrote:

> The important issue here is that, when using =$, $=, and =$=, leftovers will be discarded. To see this more clearly, realize that the first line of sink is equivalent to:
>
>     out1 <- C.injectLeftovers CT.lines C.>+> CL.head
>
> So any leftovers from lines are lost once you move past that line. In order to get this to work, stick the consume inside the same composition:
>
>     sink = C.injectLeftovers CT.lines C.>+> do
>         out1 <- CL.head
>         out2 <- CL.consume
>         return (out1, T.unlines out2)
>
> Or:
>
>     sink = CT.lines C.=$ do
>         out1 <- CL.head
>         out2 <- CL.consume
>         return (out1, T.unlines out2)
>
> Michael
>
> On Sat, Oct 27, 2012 at 9:20 PM, Myles C. Maxfield <myles.maxfi...@gmail.com> wrote:
>
>> [...]
[Haskell-cafe] Auto-termination and leftovers in Conduits
Hey,

Say I have a stream of Data.Text.Text objects flowing through a conduit, where the divisions between successive Data.Text.Text items occur at arbitrary boundaries (maybe the source is sourceFile $= decode utf8). I'd like to create a Sink that returns a tuple of (the first line, the rest of the input). My first attempt at this looks like this:

    sink = do
        out1 <- CT.lines C.=$ CL.head
        out2 <- CL.consume
        return (out1, T.concat out2)

However, the following input provides:

    runIdentity $ CL.sourceList ["abc\nde", "f\nghi"] C.$$ sink
    (Just "abc","f\nghi")

But what I really want is (Just "abc", "\ndef\nghi").

I think this is due to the auto-termination you mention in [1]. My guess is that when CT.lines yields the first value (and CL.head then also yields it), execution is auto-terminated before CT.lines gets a chance to specify any leftovers.

How can I write this sink? (I know I can just use CL.consume and T.break (== '\n'), but I'm not interested in that. I'm trying to figure out how to get the behavior I'm looking for with conduits.)

Thanks,
Myles

[1] http://hackage.haskell.org/packages/archive/conduit/0.5.2.7/doc/html/Data-Conduit.html
Re: [Haskell-cafe] Hackage Package Discoverability
The last revision of the encoding package (0.6.7.1) was uploaded 6 days ago, so it's certainly not old. The package is also not unwieldy: the functions (runPut . encode punycode) and (runGet (decode punycode)) are equivalent to my 'encode' and 'decode' functions. In addition, it supports many more kinds of encodings and is much more general than my little library, and it is much more flexible because of its use of ByteSource and ByteSink. It seems like a hands-down win to me.

I've CC'ed the maintainer of the encoding package; maybe he can better reply about the encoding library.

On Mon, Oct 22, 2012 at 11:14 PM, Bryan O'Sullivan <b...@serpentine.com> wrote:

> On Tue, Oct 23, 2012 at 5:53 AM, Myles C. Maxfield <myles.maxfi...@gmail.com> wrote:
>
>> I am the author/maintainer of the 'punycode' hackage package. After 4 months, I just found that punycode conversion already exists in the Data.Encoding.BootString module inside the 'encoding' package. I'd like to deprecate my package in favor of the 'encoding' package.
>
> Please don't plan to do that. The encoding package may have filled a gap at some point, but now it looks old, unwieldy, inefficient (String), and weird (implicit parameters?) to me, and it's mostly obsolete (the standard I/O library has supported Unicode and encodings for a while now). I would not use the encoding package myself, for instance.
>
> Your punycode package, in contrast, has a simple API and looks easy to use. I'd suggest that you support the Text type as well as String, but otherwise please keep it around and maintain it.
[Haskell-cafe] Hackage Package Discoverability
Hello,

I am the author/maintainer of the 'punycode' hackage package. After 4 months, I just found that punycode conversion already exists in the Data.Encoding.BootString module inside the 'encoding' package. I'd like to deprecate my package in favor of the 'encoding' package. However, I would also like to solve the discoverability problem of people not knowing to look in the 'encoding' package when they're looking for the punycode algorithm. (I certainly didn't look there, and as a result, I re-implemented the algorithm.) My initial thought is to keep my package in the hackage database, but put a big label on it saying "DEPRECATED: Use Data.Encoding.BootString instead". Is there a better way to make this algorithm discoverable?

Thanks,
Myles
Re: [Haskell-cafe] Dynamic Programming with Data.Vector
Aha, there it is! Thanks so much. I didn't see it because it's under the Unfolding section instead of the Construction section.

--Myles

On Mon, Sep 17, 2012 at 6:07 AM, Roman Leshchinskiy <r...@cse.unsw.edu.au> wrote:

> Myles C. Maxfield wrote:
>
>> Overall, I'm looking for a function, similar to Data.Vector's 'generate' function, but instead of the generation function taking the destination index, I'd like it to take the elements that have previously been constructed. Is there such a function? If there isn't one, is this kind of function feasible to write? If such a function doesn't exist and is feasible to write, I'd be happy to try to write and contribute it.
>
> Indeed there is, it's called constructN (or constructrN if you want to construct it right to left).
>
> Roman
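[For later readers: constructN builds a vector left to right, handing the generation function the immutable prefix constructed so far, which is exactly the shape of a dynamic-programming fill. A small illustrative example, computing Fibonacci numbers:]

```haskell
import qualified Data.Vector as V

-- constructN n f builds a vector of length n; at each step f is
-- given the prefix of elements already constructed, so each new
-- element can be computed from earlier results in O(1).
fibs :: Int -> V.Vector Integer
fibs n = V.constructN n step
  where
    step prefix
      | k < 2     = 1
      | otherwise = prefix V.! (k - 1) + prefix V.! (k - 2)
      where k = V.length prefix
```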
[Haskell-cafe] Dynamic Programming with Data.Vector
Hello,

I've been writing dynamic programming (DP) algorithms in imperative languages for years, and I was thinking recently about how to use it in a Haskell context. In particular, I want to write a function that takes an ordered collection of items and produces a new item to insert into the ordered collection. The most straightforward way to do this would be to use a list, something like the following:

    recurse :: [Integer] -> [Integer]
    recurse l = newValue : recurse (take (length l + 1) infiniteList)
      where newValue = ...

    infiniteList :: [Integer]
    infiniteList = initialList ++ recurse initialList
      where initialList = ...

    solution :: Integer
    solution = infiniteList !! 5

I'm assuming that this can run fast because I'm assuming the 'take' function won't actually duplicate the list ([1] doesn't actually list the running time of 'take'). Is this a correct assumption to make?

Secondarily, and most importantly for me, I'm curious about how to make this fast when the computation of 'newValue' requires random access to the inputted list. I'm assuming that I would use Vectors instead of lists for this kind of computation, and [2] describes how I can use the O(1) 'slice' instead of 'take' above. However, both of Vector's cons and snoc functions are O(n), which defeats the purpose of using this kind of algorithm. Obviously, I can solve this problem with mutable vectors, but that's quite inelegant.

Overall, I'm looking for a function, similar to Data.Vector's 'generate' function, but instead of the generation function taking the destination index, I'd like it to take the elements that have previously been constructed. Is there such a function? If there isn't one, is this kind of function feasible to write? If such a function doesn't exist and is feasible to write, I'd be happy to try to write and contribute it.
[1] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-List.html#g:11
[2] http://hackage.haskell.org/packages/archive/vector/0.9.1/doc/html/Data-Vector.html#g:6
Re: [Haskell-cafe] Dynamic Programming with Data.Vector
Someone replied saying that I could use a HashMap and a fold to do this, and that solution should work quite well. Bonus points if there's a solution without the space overhead of a hashmap :-) I'm hoping for an unboxed vector.

--Myles

On Sun, Sep 16, 2012 at 12:40 PM, Myles C. Maxfield <myles.maxfi...@gmail.com> wrote:

> [...]
Re: [Haskell-cafe] Re-exports of resourcet in conduit
Yep, that was my problem. Thanks so much! GHC is smart enough to realize that Data.Conduit.ResourceT is the same as Control.Monad.Trans.Resource.ResourceT. Yay!

--Myles

On Sun, Jun 3, 2012 at 2:00 AM, Michael Snoyman <mich...@snoyman.com> wrote:

> The easiest thing to do is just build your code with cabal, which will ensure you're using consistent versions. (Similar questions came up twice recently on Stack Overflow [1][2].) Wiping out your ~/.ghc and installing from scratch should work also, but it's like using a tactical nuke instead of a scalpel.
>
> As for checking versions of dependencies, try `ghc-pkg describe conduit`.
>
> Michael
>
> [1] http://stackoverflow.com/questions/10729291/lifting-trouble-with-resourcet/10730909#10730909
> [2] http://stackoverflow.com/questions/10843547/snap-monad-liftio-and-ghc-7-4-1/10847401#10847401
>
> On Sun, Jun 3, 2012 at 3:01 AM, Myles C. Maxfield <myles.maxfi...@gmail.com> wrote:
>
>> [...]
[Haskell-cafe] Re-exports of resourcet in conduit
To: Michael Snoyman
CC: haskell-cafe

Hello,

I'm having a problem working with the conduit library, and was hoping you could help me out.

Data.Conduit re-exports ResourceT, MonadResource, and MonadThrow (but not ExceptionT) from Control.Monad.Trans.Resource. I have a conduit which operates on a monad in the MonadThrow class. I am trying to figure out which MonadThrow class this should be (the Data.Conduit.MonadThrow class, or the Control.Monad.Trans.Resource.MonadThrow class, since apparently GHC doesn't recognize them as the same, even though one is just a re-export of the other).

If a user of this conduit wants to chain this conduit up with something like sourceFile, the underlying monad has to be a member of Data.Conduit.MonadResource and whatever MonadThrow class I chose to use. I would like to be able to use an existing instance to lift the class of the inner monad to the class of the entire monad stack (so I don't have to tell the user of my conduit that they have to define their own instances), and the only rule that I can find that does that is the following from Data.Conduit:

    Data.Conduit.MonadThrow m => Data.Conduit.MonadThrow (Data.Conduit.ResourceT m)

However, GHC doesn't seem to think that Control.Monad.Trans.Resource.ExceptionT is an instance of Data.Conduit.MonadThrow:

    No instance for (Data.Conduit.MonadThrow (ExceptionT IO)) arising from a use of `.'

Control.Monad.Trans.Resource has a similar instance:

    Control.Monad.Trans.Resource.MonadThrow m => Control.Monad.Trans.Resource.MonadThrow (Control.Monad.Trans.Resource.ResourceT m)

but because sourceFile operates in the Data.Conduit.MonadResource class, and Control.Monad.Trans.Resource.ResourceT isn't a member of that class (it's only a member of Control.Monad.Trans.Resource.MonadResource), that doesn't help:

    No instance for (Data.Conduit.MonadResource (Control.Monad.Trans.Resource.ResourceT (ExceptionT IO))) arising from a use of `.'

It should be noted that neither module defines anything like the following:

    MonadResource m => MonadResource (ExceptionT m)

It seems like the underlying problem here is that:

1) I am required to use the Control.Monad.Trans.Resource.ExceptionT type, because Data.Conduit doesn't re-export it
2) I am required to use the Data.Conduit.MonadResource class, because sourceFile and others require it
3) There doesn't seem to be an existing instance that bridges between the two.

This seems like a fundamental flaw with re-exporting; it can only work if you re-export every single last thing from the original module. This doesn't seem tenable because the original module might not be under your control, so its author can add new symbols whenever he/she wants to.

I see three options to solve this problem:

1) Re-export Control.Monad.Trans.Resource.ExceptionT in Data.Conduit. This will work until someone adds something to the resourcet package and someone wants to use the new addition and Data.Conduit.ResourceT in the same stack.
2) Don't re-export anything in Data.Conduit; make sourceFile and others explicitly depend on types in another module, but this might break compatibility with existing programs if they use fully-qualified symbol names.
3) Make anyone who wants to use a monad stack in MonadThrow and MonadResource define their own instances. This is probably no good because it means that many different modules will implement the same instance in potentially many different ways.

I feel like option 2) is probably the best solution here. I'm perfectly happy to issue a pull request for whichever option you think is best, but I don't know which solution you think is best for your project. What do you think?

--Myles
Re: [Haskell-cafe] Re-exports of resourcet in conduit
It could be. Do you know how I can check which versions of packages other packages have built against with Cabal? Will it help if I remove all the relevant packages and then re-install only a single version?

Thanks,
Myles

On Saturday, June 2, 2012, Antoine Latter wrote:

> Is it possible that you are pulling in two different versions of the package that defines the MonadThrow class? That is, package a was built against version 1, but package b was built against version 2? This would make GHC think the type-classes were incompatible.
>
> This is just a guess - I have not tried what you are trying.
>
> On Jun 2, 2012 6:35 PM, Myles C. Maxfield <myles.maxfi...@gmail.com> wrote:
>
>> [...]
Re: [Haskell-cafe] Conduit Best Practices for leftover data
Thanks for responding to this. Some responses are inline.

On Sat, Apr 14, 2012 at 8:30 PM, Michael Snoyman <mich...@snoyman.com> wrote:

> On Thu, Apr 12, 2012 at 9:25 AM, Myles C. Maxfield <myles.maxfi...@gmail.com> wrote:
>
>> Hello, I am interested in the argument to Done, namely, leftover data. More specifically, when implementing a conduit/sink, what should the conduit specify for the (Maybe i) argument to Done in the following scenarios (please note that these scenarios only make sense if the type of 'i' is something in Monoid):
>>
>> 1) The conduit outputted the last thing that it felt like outputting, and exited willfully. There seem to be two options here - a) the conduit/sink should greedily gather up all the remaining input in the stream and mconcat them, or b) return the part of the last thing that never got represented in any part of anything outputted. Option b seems to make the most sense here.
>
> Yes, option (b) is definitely what's intended.
>
>> 2) Something upstream produced Done, so the second argument to NeedInput gets run. This is guaranteed to be run at the boundary of an item, so should it always return Nothing? Instead, should it remember all the input it has consumed for the current (yet-to-be-outputted) element, so it can let Data.Conduit know that, even though the conduit appeared to consume the past few items, it actually didn't (because it needs more input items to make an output)? Remembering this sequence could potentially have disastrous memory usage. On the other hand, it could also greedily gather everything remaining in the stream.
>
> No, nothing so complicated is intended. Most likely you'll never return any leftovers from the second field of NeedInput. One other minor point: it's also possible that the second field will be used if the *downstream* pipe returns Done.

Just to help me understand, what is a case when you want to specify something in this field? I can't think of a case when a Conduit would specify anything in this case.

>> 3) The conduit/sink encountered an error mid-item. In general, is there a commonly-accepted way to deal with this? If a conduit fails in the middle of an item, it might not be clear where it should pick up processing, so the conduit probably shouldn't even attempt to continue. It would probably be good to return some notion of where it was in the input when it failed. It could return (Done (???) (Left errcode)), but this requires that everything downstream in the pipeline be aware of Errcode, which is not ideal. I could use MonadError along with PipeM, but this approach completely abandons the part of the stream that has been processed successfully. I'd like to avoid using Exceptions if at all possible.
>
> Why avoid Exceptions? It's the right fit for the job. You can still keep your conduit pure by setting up an `ExceptionT Identity` stack, which is exactly how you can use the Data.Conduit.Text functions from pure code. Really, what you need to be asking is: is there any logical way to recover from an exception here?

I suppose this is a little off-topic, but do you prefer ExceptionT or ErrorT? Any exception/error that I'd be throwing is just a container around a String, so both of them will work fine for my purposes.

>> It doesn't seem that a user application even has any way to access leftover data anyway, so perhaps this discussion will only be relevant in a future version of Conduit. At any rate, any feedback you could give me on this issue would be greatly appreciated.
>
> Leftover data is definitely used:
>
> 1. If you compose together two `Sink`s with monadic bind, the leftovers from the first will be passed to the second. You can do that.

That's so cool! I never realized that Pipes are members of Monad.

> 2. If you use connect-and-resume ($$+), the leftovers are returned as part of the `Source`, and provided downstream.

This too is really neat :] I didn't realize how this worked.
Michael ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
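[Editorial note: the monadic-bind behavior Michael describes above can be seen directly with today's conduit API, which has evolved well past the 0.4-era Pipe constructors in this thread. This is a hedged sketch, not code from the thread; `peekThenConsume` is an illustrative name of mine:]

```haskell
import Data.Conduit (ConduitT, await, leftover, runConduitPure, (.|))
import qualified Data.Conduit.List as CL

-- A sink that pulls one element, pushes it back with `leftover`, and
-- then lets the monadically-bound CL.consume see that element again --
-- i.e. leftovers flow from one part of a bound sink to the next.
peekThenConsume :: Monad m => ConduitT Int o m (Maybe Int, [Int])
peekThenConsume = do
  mx <- await
  case mx of
    Nothing -> return (Nothing, [])
    Just x  -> do
      leftover x            -- return x to the stream as leftover data
      rest <- CL.consume    -- the next consumer sees x again
      return (Just x, rest)

-- runConduitPure (CL.sourceList [1,2,3] .| peekThenConsume)
--   == (Just 1, [1,2,3])
```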
Re: [Haskell-cafe] Conduit Best Practices for leftover data
2. If you use connect-and-resume ($$+), the leftovers are returned as part of the `Source`, and provided downstream.

I'm trying to figure out how to use this, but I'm getting a little bit confused. In particular, here is a conduit that produces an output for every 'i' inputs. I'm returning partial data when the input stream hits an EOF (and I verified that the partial data is correct with Debug.Trace), yet the output of 'partial' is ([[1,2,3,4,5]],[]) instead of ([[1,2,3,4,5]],[6,7,8]). Can you help me understand what's going on? Thanks, Myles

import qualified Data.Conduit as C
import qualified Data.Conduit.List as CL

-- functionally the same as concatenating all the inputs, then
-- repeatedly running splitAt on the concatenation.
takeConduit :: (Num a, Monad m) => a -> C.Pipe [a1] [a1] m ()
takeConduit i = takeConduitHelper i [] []
  where
    takeConduitHelper x lout lin
      | x == 0    = C.HaveOutput (takeConduitHelper i [] lin) (return ()) $ reverse lout
      | null lin  = C.NeedInput (takeConduitHelper x lout) (C.Done (Just $ reverse lout) ())
      | otherwise = takeConduitHelper (x - 1) (head lin : lout) $ tail lin

partial :: (Num t, Monad m, Enum t) => m ([[t]], [[t]])
partial = do
  (source, output) <- CL.sourceList [[1..8]] C.$$+ (takeConduit 5 C.=$ CL.consume)
  output' <- source C.$$ CL.consume
  return (output, output')
Re: [Haskell-cafe] Conduit Best Practices for leftover data
Sorry for the spam. A similar matter is the following program, where something downstream reaches EOF right after a conduit outputs a HaveOutput. Because the type of the early-close function is just 'r' or 'm r', there is no way for the conduit to return any partial output. This means that any extra values in the chunk the conduit read are lost. Is there some way around this?

-- takeConduit as in previous email
-- partial2 outputs ([[1,2,3,4,5]],[]) instead of ([[1,2,3,4,5]],[6,7,8])
monadSink :: Monad m => CI.Sink [a1] m ([[a1]], [[a1]])
monadSink = do
  output  <- takeConduit 5 C.=$ CL.take 1
  output' <- CL.consume
  return (output, output')

partial2 :: (Num t, Monad m, Enum t) => m ([[t]], [[t]])
partial2 = CL.sourceList [[1..8]] C.$$ monadSink

Thanks, Myles
[Haskell-cafe] Conduit Best Practices for leftover data
Hello, I am interested in the argument to Done, namely, leftover data. More specifically, when implementing a conduit/sink, what should the conduit specify for the (Maybe i) argument to Done in the following scenarios? (Please note that these scenarios only make sense if the type of 'i' is something in Monoid.)

1) The conduit outputted the last thing that it felt like outputting, and exited willfully. There seem to be two options here: a) the conduit/sink should greedily gather up all the remaining input in the stream and mconcat them, or b) return the part of the last thing that never got represented in any part of anything outputted. Option b seems to make the most sense here.

2) Something upstream produced Done, so the second argument to NeedInput gets run. This is guaranteed to be run at the boundary of an item, so should it always return Nothing? Instead, should it remember all the input it has consumed for the current (yet-to-be-outputted) element, so it can let Data.Conduit know that, even though the conduit appeared to consume the past few items, it actually didn't (because it needs more input items to make an output)? Remembering this sequence could potentially have disastrous memory usage. On the other hand, it could also greedily gather everything remaining in the stream.

3) The conduit/sink encountered an error mid-item. In general, is there a commonly-accepted way to deal with this? If a conduit fails in the middle of an item, it might not be clear where it should pick up processing, so the conduit probably shouldn't even attempt to continue. It would probably be good to return some notion of where it was in the input when it failed. It could return (Done (???) (Left errcode)), but this requires that everything downstream in the pipeline be aware of Errcode, which is not ideal. I could use MonadError along with PipeM, but this approach completely abandons the part of the stream that has been processed successfully.
I'd like to avoid using Exceptions if at all possible. It doesn't seem that a user application even has any way to access leftover data anyway, so perhaps this discussion will only be relevant in a future version of Conduit. At any rate, any feedback you could give me on this issue would be greatly appreciated. Thanks, Myles ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
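[Editorial note: to make option (b) above concrete, here is a pure, list-based model of the leftover convention being asked about. Ordinary lists stand in for the stream, and the `Maybe` mirrors the `Maybe i` argument to Done; `takeChunked` is an illustrative name of mine, not the conduit API:]

```haskell
-- Take n elements from a chunked stream.  Returns (elements taken,
-- leftover tail of the last chunk touched, untouched chunks) -- i.e.
-- option (b): only the unrepresented part of the last item is leftover.
takeChunked :: Int -> [[a]] -> ([a], Maybe [a], [[a]])
takeChunked _ []     = ([], Nothing, [])
takeChunked 0 cs     = ([], Nothing, cs)
takeChunked n (c:cs)
  | length c <= n    = let (out, lo, rest) = takeChunked (n - length c) cs
                       in (c ++ out, lo, rest)
  | otherwise        = (take n c, Just (drop n c), cs)

-- takeChunked 5 [[1,2,3],[4,5,6,7,8]]
--   == ([1,2,3,4,5], Just [6,7,8], [])
```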
Re: [Haskell-cafe] Mixing Unboxed Mutable Vectors and Parsers
It's a JPEG parser. Progressive JPEG is set up where there's a vector of Word8s, and some of the entries in the vector may be 0. The JPEG has a stream of bits, and the decoder is supposed to shift in one bit to each successive element in the vector, skipping over 0s, and stop when it reaches some specified number of 0s. So if your partially decoded vector is 2, 8, 0, 12, 0, 10, 6, 0, 2, 10 and the JPEG has this bit stream 1, 1, 0, 1, 0, 0, 1, 0, ... and the JPEG says "shift in until the 3rd zero is found", that would result in the partially decoded vector being 3, 9, 0, 12, 0, 11, 6, 0, 2, 10 with the leftover part of the stream being 0, 1, 0, ... The JPEG parser has to keep track of where it is in the partially decoded vector to know how many bits to shift in, and where they belong, so the next iteration is aligned to the right place. It would be possible to keep track of this stuff throughout the parsing, and have the result of the parse be a second delta framebuffer and apply it to the original after each scan is parsed, but that's fairly ugly and I'd like to avoid doing that. If that's what I have to do, though, I guess I have to do it. Isn't there a better way? --Myles

On Sat, Apr 7, 2012 at 11:56 PM, Stephen Tetley stephen.tet...@gmail.com wrote: Hi Myles It seems odd to mix parsing (consuming input) with mutation. What problem are you trying to solve, and are you sure you can't get better phase separation than this paragraph suggests? My first idea was to simply parse all the deltas, and later apply them to the input list. However, I can't do that because the value of the deltas depend on the value they're modifying. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
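[Editorial note: the refinement pass described above can be sketched as a small pure function. This is a hedged illustration of the email's description only, not code from any real JPEG decoder; `refine` is an illustrative name of mine, and "shift in one bit" is modeled as OR-ing the bit into the coefficient, which reproduces the example values given (2→3, 8→9, 10→11):]

```haskell
import Data.Bits ((.|.))
import Data.Word (Word8)

-- refine nZeros bits coeffs: OR one bit from the stream into each
-- nonzero coefficient, skip zeros without consuming bits, and stop
-- once the nZeros-th zero is reached.  Returns the updated vector
-- (as a list) and the leftover bits.
refine :: Int -> [Word8] -> [Word8] -> ([Word8], [Word8])
refine = go
  where
    go _ bs []           = ([], bs)                   -- vector exhausted
    go n bs (0:rest)
      | n <= 1           = (0 : rest, bs)             -- reached last zero: stop
      | otherwise        = let (rest', bs') = go (n - 1) bs rest
                           in (0 : rest', bs')        -- skip zero, count it
    go _ [] rest         = (rest, [])                 -- ran out of bits
    go n (b:bs) (c:rest) = let (rest', bs') = go n bs rest
                           in (c .|. b : rest', bs')  -- shift bit into coefficient

-- refine 3 [1,1,0,1,0,0,1,0] [2,8,0,12,0,10,6,0,2,10]
--   == ([3,9,0,12,0,11,6,0,2,10], [0,1,0])
```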
[Haskell-cafe] Mixing Unboxed Mutable Vectors and Parsers
CC: Maintainers of STMonadTrans, Vector, and JuicyPixels Hello, I am writing a Haskell Attoparsec parser which will modify 2-d arrays of small values (Word8, Int8, etc.). My first idea was to simply parse all the deltas, and later apply them to the input list. However, I can't do that because the value of the deltas depend on the value they're modifying. My first pass at this program used a function of the form:

p :: [[Word8]] -> Parser [[Word8]]

This approach works; however, the program uses far too much memory. Some quick testing shows that lists of Word8s are ~52.6x larger than unboxed vectors of Word8s, and boxed vectors of Word8s are ~7.5x larger than unboxed vectors of Word8s. A better approach would be to use Data.Vector.Unboxed.Mutable and do the mutations inline with the parser. Because mutable vectors require a monad in PrimMonad to do the mutations inside of, I'd have to use a monad transformer to combine Parser and something in PrimMonad. Attoparsec doesn't support being used as a monad transformer, so I can't say something like

p :: (PrimMonad m, UM.Unbox a) => M.MVector (PrimState m) (UM.MVector (PrimState m) a) -> ParserT m ()

I can't use Parsec (instead of Attoparsec) because I require streaming semantics -- eventually I'm going to hook this up to Data.Conduit and parse directly from the net. There is STT (in the package STMonadTrans), however, so I could potentially make the function result in STT Parser (). However, STT doesn't work with Data.Vector.Mutable or Data.Vector.Unboxed.Mutable, because STT isn't a member of the PrimMonad class (as far as I can tell). STT itself doesn't define unboxed mutable vectors (only boxed mutable vectors), but I feel that giving up unboxing isn't really an option because of the memory footprint. As a general observation, it seems silly to have two different mutable vector implementations, one for STT and the other for PrimMonad. So here are my questions: 1. Is it possible to implement PrimMonad with STT?
I looked around for a little while, but couldn't find anything that did this. 2. Otherwise, is it reasonable to try to implement unboxed mutable vectors in STT? I feel this is probably going down the wrong path. 3. Are there any parsers that support streaming semantics and being used as a monad transformer? This would require rewriting my whole program to use this new parser, but if that's what I have to do, then so be it. Thanks, Myles C. Maxfield ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
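[Editorial note: the in-place mutation itself (with the parser factored out) is straightforward once inside ST. A minimal sketch with hypothetical deltas, assuming the standard vector package API; `applyDeltas` is an illustrative name of mine:]

```haskell
import Control.Monad.ST (runST)
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as UM
import Data.Word (Word8)

-- Apply (index, delta) pairs to an unboxed vector in place.  Each
-- delta depends on the current value at its index, which is why the
-- deltas can't simply be computed up front and merged later.
applyDeltas :: [(Int, Word8)] -> U.Vector Word8 -> U.Vector Word8
applyDeltas deltas v = runST $ do
  mv <- U.thaw v                    -- mutable working copy
  mapM_ (\(i, d) -> do
           x <- UM.read mv i        -- read the value being modified
           UM.write mv i (x + d))   -- mutate it in place
        deltas
  U.freeze mv

-- applyDeltas [(0,1),(2,3)] (U.fromList [10,20,30])
--   == U.fromList [11,20,33]
```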
Re: [Haskell-cafe] Streaming to JuicyPixels
So I started working on moving JuicyPixels to a streaming interface, and have some observations. This is going to be a pretty major change, touching pretty much every function, and the end result will end up looking very close to the code that I have already written. I'm not nearly as close to the code as the author is, and I've already made some mistakes due to not understanding how the original code is structured and laid out. Because this is essentially turning out to be a rewrite, I think it makes more sense for me to just work on my own library, and release it as a streaming alternative to JuicyPixels. How do you feel about this, Vincent? Thanks, Myles

On Wed, Feb 22, 2012 at 12:30 PM, Vincent Berthoux vincent.berth...@gmail.com wrote: Hi, Please go ahead, and github is the perfect medium for code sharing :) Regards Vincent Berthoux
Re: [Haskell-cafe] Streaming to JuicyPixels
Let's put aside the issue of getting access to the pixels before the stream is complete. How would you feel if I implemented use of the STT monad transformer on top of Get in JuicyPixels, in order to get rid of the (remaining getBytes) call, and then exposed the underlying Get interface to callers? This would allow use for streaming. Is this something that you feel that I should pursue? I can send you a GitHub pull request when I'm done. Thanks, Myles

On Wed, Feb 22, 2012 at 5:01 AM, Vincent Berthoux vincent.berth...@gmail.com wrote: Hi, I can understand your performance problems; I bumped into them before the first release of Juicy Pixels, and it took a long time to get 'correct' performance out of the box. The IDCT is not the only 'hot point'; I got problems with intermediate data structures as well. Lists have proven a bit problematic performance-wise, so I rewrote with manual recursion some things which would have been easily implemented with forM_ or mapM_. I didn't know STT existed, so it opens a new area of reflection for the streaming Jpeg decoder: instead of using the (remaining getBytes) combo, staying in the Get monad might help. The reason to expose the ST monad is that I use it internally to mutate the final image directly, and I'd prefer to avoid freezing/unfreezing the underlying array. So in order to give access to the intermediate representation, a type could be (STT s (StateT JpegDecodingState Get) (MutableImage s PixelYCbCr8)) (the Jpeg decoder only produces Image PixelYCbCr8 internally). This should allow a freeze, then a color conversion. As it would induce a performance loss, this version should exist alongside the current implementation. This is not trivial, but it's far from impossible. For the IDCT implementation, I don't think a package makes much sense; if you want to use it, just grab it and customize the interface to your needs :). Regards Vincent Berthoux
[Haskell-cafe] Streaming to JuicyPixels
Hello, I am interested in the possibility of using JuicyPixels for streaming images from the web. It doesn't appear to expose any of its internally-used Serialize.Get functionality, which is problematic for streaming - I would not like to have to stream the whole image into a buffer before the decoder can start decoding. Are there any plans on exposing this API, so I can use the runGetPartial function to facilitate streaming? In addition, does the library do much readahead? There's no point in exposing a Get interface if it's just going to wait until the stream is done to start decoding anyway. Thanks, Myles C. Maxfield ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
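[Editorial note: for background, runGetPartial drives exactly this kind of incremental decoding: it returns a Result that is Done, Fail, or Partial (a continuation awaiting more input). A hedged sketch of a chunk-feeding driver, assuming the Result constructors of recent cereal versions (where Fail also carries the unconsumed input); `feedChunks` and `getPair` are illustrative names of mine, not library functions:]

```haskell
import qualified Data.ByteString as B
import Data.Serialize.Get (Get, Result (..), getWord8, runGetPartial)
import Data.Word (Word8)

-- Drive a Get parser with a list of chunks, supplying more input each
-- time the decoder suspends with Partial -- no up-front buffering.
feedChunks :: Get a -> [B.ByteString] -> Either String a
feedChunks _ []     = Left "no input"
feedChunks g (c:cs) = go (runGetPartial g c) cs
  where
    go (Done r _)  _      = Right r
    go (Fail e _)  _      = Left e
    go (Partial k) (x:xs) = go (k x) xs   -- decoder wants more: feed next chunk
    go (Partial _) []     = Left "ran out of input"

-- A toy parser with a fixed size, so it finishes without needing EOF.
getPair :: Get (Word8, Word8)
getPair = (,) <$> getWord8 <*> getWord8

-- feedChunks getPair [B.singleton 1, B.singleton 2] == Right (1, 2)
```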
Re: [Haskell-cafe] Streaming to JuicyPixels
Hello, and thanks for the quick reply. You're right that using (remaining getBytes) won't work for streaming, as it would pull the rest of the stream into a buffer, thereby delaying further processing until the image is completely downloaded. :-( I'm a little unclear about what you mean about the use of the ST monad. There is an STT monad transformer (http://hackage.haskell.org/packages/archive/STMonadTrans/0.2/doc/html/Control-Monad-ST-Trans.html), so you could wrap that around Get. Is that what you're referring to? As an aside: I didn't realize that JuicyPixels existed, so I wrote a JPEG decoder specifically designed for streaming - it doesn't request a byte of input unless it has to, uses a StateT (wrapped around Attoparsec) to keep track of which bit in the current byte is next, and does the Huffman decoding in-line. However, I didn't use ST for the IDCT, so my own decoder has serious performance problems. This prompted me to start searching around for a solution, and I came across JuicyPixels, which already exists and is much faster than my own implementation. I'm hoping to get rid of my own decoder and just use JuicyPixels. If you're curious, my own code is here: https://github.com/litherum/jpeg. Is it reasonable to extend JuicyPixels to fit my use case? It sounds like JuicyPixels wouldn't work so well as it stands. I'd be happy to do whatever work is necessary to help out and get JuicyPixels usable for me. However, if that would require a full (or near-full) rewrite, it might make more sense for me to use my own implementation with your IDCT. Is there a way we can share just the IDCT between our two repositories? Perhaps making a new IDCT8 library that we can both depend on? As for what API I'd like to be able to use, just a Get DynamicImage should suffice (assuming it has streaming semantics as described above). It would be really nice if it was possible to get at the incomplete image before the stream is completed (so the image could slowly update as more data arrives from the network), but I'm not quite sure how to elegantly express that. Do you have any ideas? I think that having 2 native jpeg decoders (actually 3, because of this package: http://hackage.haskell.org/package/jpeg) is detrimental to the Haskell community, and I would really like to use JuicyPixels :D Thanks, Myles C. Maxfield

On Mon, Feb 20, 2012 at 3:01 PM, Vincent Berthoux vincent.berth...@gmail.com wrote: Hi, I can expose the low level parsing, but you would only get the chunks/frames/sections of the image; Cereal is mainly used to parse the structure of the image, not to do the raw processing. For the raw processing, I rely on `remaining getBytes` to be able to manipulate data at bit level or to feed it to zlib, and the documentation clearly states that remaining doesn't work well with runGetPartial, so no read ahead, but even worse for streaming :). To be fair, I never thought of this use case, and exposing a partially decoded image would impose the use of the ST Monad somehow, and Serialize is not a monad transformer, making it a bit hard to implement. Out of curiosity, what kind of API would you hope for, for this kind of functionality? Regards Vincent Berthoux

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Contributing to http-conduit
After all these commits have been flying around, I have yet another question: the 'HTTP' package defines Network.Browser which is a State monad which keeps state about a browser (i.e. a cookie jar, a proxy, redirection parameters, etc.) It would be pretty straightforward to implement this kind of functionality on top of http-conduit. I was originally going to do it and release it as its own package, but it may be beneficial to add such a module to the existing http-conduit package. Should I add it in to the existing package, or release it as its own package? --Myles On Mon, Feb 6, 2012 at 12:15 AM, Michael Snoyman mich...@snoyman.comwrote: Just an FYI for everyone: Myles sent an (incredibly thorough) pull request to handle cookies: https://github.com/snoyberg/http-conduit/pull/13 Thanks! On Sun, Feb 5, 2012 at 8:20 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: 1. The spec defines a grammar for the attributes. They're in uppercase. 2. Yes - 1.3 is the first version that lists DiffTime as an instance of RealFrac (so I can use the 'floor' function to pull out the number of seconds to render it) 3. I'll see what I can do. --Myles On Sat, Feb 4, 2012 at 9:06 PM, Michael Snoyman mich...@snoyman.com wrote: Looks good, a few questions/requests: 1. Is there a reason to upper-case all the attributes? 2. Is the time = 1.3 a requirements? Because that can cause a lot of trouble for people. 3. Can you send the patch as a Github pull request? It's easier to track that way. Michael On Sat, Feb 4, 2012 at 1:21 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Here is the patch to Web.Cookie. I didn't modify the tests at all because they were already broken - they looked like they hadn't been updated since SetCookie only had 5 parameters. I did verify by hand that the patch works, though. Thanks, Myles On Thu, Feb 2, 2012 at 11:26 PM, Myles C. 
Maxfield myles.maxfi...@gmail.com wrote: Alright, I'll make a small patch that adds 2 fields to SetCookie: setCookieMaxAge :: Maybe DiffTime setCookieSecureOnly :: Bool I've also gotten started on those cookie functions. I'm currently writing tests for them. @Chris: The best advice I can give is that Chrome (what I'm using as a source on all this) has the data baked into a .cc file. However, they have directions in a README and a script which will parse the list and generate that source file. I recommend doing this. That way, the Haskell module would have 2 source files: one file that reads the list and generates the second file, which is a very large source file that contains each element in the list. The list should export `elem`-type queries. I'm not quite sure how to handle wildcards that appear in the list - that part is up to you. Thanks for helping out with this :] --Myles On Thu, Feb 2, 2012 at 10:53 PM, Michael Snoyman mich...@snoyman.com wrote: Looks good to me too. I agree with Aristid: let's make the change to cookie itself. Do you want to send a pull request? I'm also considering making the SetCookie constructor hidden like we have for Request, so that if in the future we realize we need to add some other settings, it doesn't break the API. Chris: I would recommend compiling it into the module. Best bet would likely be converting the source file to Haskell source. Michael On Fri, Feb 3, 2012 at 6:32 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Alright. After reading the spec, I have these questions / concerns: The spec supports the Max-Age cookie attribute, which Web.Cookies doesn't. I see two possible solutions to this. The first is to have parseSetCookie take a UTCTime as an argument which will represent the current time so it can populate the setCookieExpires field by adding the Max-Age attribute to the current time.
Alternatively, that function can return an IO SetCookie so it can ask for the current time by itself (which I think is inferior to taking the current time as an argument). Note that the spec says to prefer Max-Age over Expires. Add a field to SetCookie of type Maybe DiffTime which represents the Max-Age attribute Cookie code should be aware of the Public Suffix List as a part of its domain verification. The cookie code only needs to be able to tell if a specific string is in the list (W.Ascii -> Bool) I propose making an entirely unrelated package, public-suffix-list, with a module Network.PublicSuffixList, which will expose this function, as well as functions about parsing the list itself. Thoughts? Web.Cookie doesn't have a secure-only attribute. Adding one in is straightforward enough. The spec describes cookies as a property
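The "take the current time as an argument" option discussed above, together with the spec's rule that Max-Age wins over Expires, can be sketched as a pure function. Everything here is a hypothetical simplification: plain Integer seconds stand in for UTCTime/DiffTime, and `resolveExpiry` is a made-up name, not part of Web.Cookie.

```haskell
-- Sketch: resolve a parsed Max-Age and Expires into one absolute expiry
-- time, preferring Max-Age as RFC 6265 requires. Integer seconds stand
-- in for UTCTime/DiffTime; all names here are hypothetical.
resolveExpiry :: Integer          -- current time, in seconds
              -> Maybe Integer    -- parsed Max-Age (relative seconds)
              -> Maybe Integer    -- parsed Expires (absolute seconds)
              -> Maybe Integer    -- resulting absolute expiry time
resolveExpiry now (Just maxAge) _ = Just (now + maxAge)  -- Max-Age wins
resolveExpiry _   Nothing (Just t) = Just t              -- fall back to Expires
resolveExpiry _   Nothing Nothing  = Nothing             -- session cookie
```

Because the current time is an ordinary argument, the function stays pure and is trivial to test, which is the advantage Myles cites over returning IO SetCookie.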
Re: [Haskell-cafe] Contributing to http-conduit
That's a pretty reasonable thing to do. Didn't you say that I should keep the 'prefer max-age to expires' logic out of Web.Cookie? What do you think, Michael? --Myles On Sat, Feb 4, 2012 at 4:03 AM, Aristid Breitkreuz arist...@googlemail.com wrote: Is it possible to have both an Expires and a Max-age? If not, maybe you should make a type like data Expiry = NeverExpires | ExpiresAt UTCTime | ExpiresIn DiffTime 2012/2/4 Myles C. Maxfield myles.maxfi...@gmail.com: Here is the patch to Web.Cookie. I didn't modify the tests at all because they were already broken - they looked like they hadn't been updated since SetCookie only had 5 parameters. I did verify by hand that the patch works, though. Thanks, Myles On Thu, Feb 2, 2012 at 11:26 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Alright, I'll make a small patch that adds 2 fields to SetCookie: setCookieMaxAge :: Maybe DiffTime setCookieSecureOnly :: Bool I've also gotten started on those cookie functions. I'm currently writing tests for them. @Chris: The best advice I can give is that Chrome (what I'm using as a source on all this) has the data baked into a .cc file. However, they have directions in a README and a script which will parse the list and generate that source file. I recommend doing this. That way, the Haskell module would have 2 source files: one file that reads the list and generates the second file, which is a very large source file that contains each element in the list. The list should export `elem`-type queries. I'm not quite sure how to handle wildcards that appear in the list - that part is up to you. Thanks for helping out with this :] --Myles On Thu, Feb 2, 2012 at 10:53 PM, Michael Snoyman mich...@snoyman.com wrote: Looks good to me too. I agree with Aristid: let's make the change to cookie itself. Do you want to send a pull request?
I'm also considering making the SetCookie constructor hidden like we have for Request, so that if in the future we realize we need to add some other settings, it doesn't break the API. Chris: I would recommend compiling it into the module. Best bet would likely be converting the source file to Haskell source. Michael On Fri, Feb 3, 2012 at 6:32 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Alright. After reading the spec, I have these questions / concerns: The spec supports the Max-Age cookie attribute, which Web.Cookies doesn't. I see two possible solutions to this. The first is to have parseSetCookie take a UTCTime as an argument which will represent the current time so it can populate the setCookieExpires field by adding the Max-Age attribute to the current time. Alternatively, that function can return an IO SetCookie so it can ask for the current time by itself (which I think is inferior to taking the current time as an argument). Note that the spec says to prefer Max-Age over Expires. Add a field to SetCookie of type Maybe DiffTime which represents the Max-Age attribute Cookie code should be aware of the Public Suffix List as a part of its domain verification. The cookie code only needs to be able to tell if a specific string is in the list (W.Ascii -> Bool) I propose making an entirely unrelated package, public-suffix-list, with a module Network.PublicSuffixList, which will expose this function, as well as functions about parsing the list itself. Thoughts? Web.Cookie doesn't have a secure-only attribute. Adding one in is straightforward enough. The spec describes cookies as a property of HTTP, not of the World Wide Web. Perhaps Web.Cookie should be renamed? Just a thought; it doesn't really matter to me. As for Network.HTTP.Conduit.Cookie, the spec describes in section 5.3 Storage Model what fields a Cookie has.
Here is my proposal for the functions it will expose: receiveSetCookie :: SetCookie -> Req.Request m -> UTCTime -> Bool -> CookieJar -> CookieJar Runs the algorithm described in section 5.3 Storage Model The UTCTime is the current-time, the Bool is whether or not the caller is an HTTP-based API (as opposed to JavaScript or anything else) updateCookieJar :: Res.Response a -> Req.Request m -> UTCTime -> CookieJar -> (CookieJar, Res.Response a) Applies receiveSetCookie to a Response. The output CookieJar is stripped of any Set-Cookie headers. Specifies True for the Bool in receiveSetCookie computeCookieString :: Req.Request m -> CookieJar -> UTCTime -> Bool -> (W.Ascii, CookieJar) Runs the algorithm described in section 5.4 The Cookie Header The UTCTime and Bool are the same as in receiveSetCookie insertCookiesIntoRequest :: Req.Request m -> CookieJar -> UTCTime -> (Req.Request m, CookieJar) Applies computeCookieString to a Request. The output cookie jar has
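Aristid's suggested Expiry type from this message could be fleshed out roughly as below. This is only a sketch: Integer seconds stand in for UTCTime and DiffTime so the example stays dependency-free, and `absoluteExpiry` is an invented helper, not part of any package.

```haskell
-- Sketch of Aristid's single-field alternative to separate Expires and
-- Max-Age fields. Integer seconds are a hypothetical stand-in for
-- UTCTime (ExpiresAt) and DiffTime (ExpiresIn).
data Expiry
  = NeverExpires
  | ExpiresAt Integer   -- absolute time
  | ExpiresIn Integer   -- relative Max-Age
  deriving (Eq, Show)

-- With one Expiry field, the "prefer Max-Age over Expires" decision is
-- made once by the parser; consumers just resolve it against "now".
absoluteExpiry :: Integer -> Expiry -> Maybe Integer
absoluteExpiry _   NeverExpires  = Nothing
absoluteExpiry _   (ExpiresAt t) = Just t
absoluteExpiry now (ExpiresIn d) = Just (now + d)
```

The design appeal is exactly the point raised in the thread: the type makes it impossible to carry both an Expires and a Max-Age at once, so no downstream code has to remember the preference rule.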
Re: [Haskell-cafe] Contributing to http-conduit
1. The spec defines a grammar for the attributes. They're in uppercase. 2. Yes - 1.3 is the first version that lists DiffTime as an instance of RealFrac (so I can use the 'floor' function to pull out the number of seconds to render it) 3. I'll see what I can do. --Myles On Sat, Feb 4, 2012 at 9:06 PM, Michael Snoyman mich...@snoyman.com wrote: Looks good, a few questions/requests: 1. Is there a reason to upper-case all the attributes? 2. Is the time >= 1.3 a requirement? Because that can cause a lot of trouble for people. 3. Can you send the patch as a Github pull request? It's easier to track that way. Michael On Sat, Feb 4, 2012 at 1:21 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Here is the patch to Web.Cookie. I didn't modify the tests at all because they were already broken - they looked like they hadn't been updated since SetCookie only had 5 parameters. I did verify by hand that the patch works, though. Thanks, Myles On Thu, Feb 2, 2012 at 11:26 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Alright, I'll make a small patch that adds 2 fields to SetCookie: setCookieMaxAge :: Maybe DiffTime setCookieSecureOnly :: Bool I've also gotten started on those cookie functions. I'm currently writing tests for them. @Chris: The best advice I can give is that Chrome (what I'm using as a source on all this) has the data baked into a .cc file. However, they have directions in a README and a script which will parse the list and generate that source file. I recommend doing this. That way, the Haskell module would have 2 source files: one file that reads the list and generates the second file, which is a very large source file that contains each element in the list. The list should export `elem`-type queries. I'm not quite sure how to handle wildcards that appear in the list - that part is up to you. Thanks for helping out with this :] --Myles On Thu, Feb 2, 2012 at 10:53 PM, Michael Snoyman mich...@snoyman.com wrote: Looks good to me too.
I agree with Aristid: let's make the change to cookie itself. Do you want to send a pull request? I'm also considering making the SetCookie constructor hidden like we have for Request, so that if in the future we realize we need to add some other settings, it doesn't break the API. Chris: I would recommend compiling it into the module. Best bet would likely be converting the source file to Haskell source. Michael On Fri, Feb 3, 2012 at 6:32 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Alright. After reading the spec, I have these questions / concerns: The spec supports the Max-Age cookie attribute, which Web.Cookies doesn't. I see two possible solutions to this. The first is to have parseSetCookie take a UTCTime as an argument which will represent the current time so it can populate the setCookieExpires field by adding the Max-Age attribute to the current time. Alternatively, that function can return an IO SetCookie so it can ask for the current time by itself (which I think is inferior to taking the current time as an argument). Note that the spec says to prefer Max-Age over Expires. Add a field to SetCookie of type Maybe DiffTime which represents the Max-Age attribute Cookie code should be aware of the Public Suffix List as a part of its domain verification. The cookie code only needs to be able to tell if a specific string is in the list (W.Ascii -> Bool) I propose making an entirely unrelated package, public-suffix-list, with a module Network.PublicSuffixList, which will expose this function, as well as functions about parsing the list itself. Thoughts? Web.Cookie doesn't have a secure-only attribute. Adding one in is straightforward enough. The spec describes cookies as a property of HTTP, not of the World Wide Web. Perhaps Web.Cookie should be renamed? Just a thought; it doesn't really matter to me. As for Network.HTTP.Conduit.Cookie, the spec describes in section 5.3 Storage Model what fields a Cookie has.
Here is my proposal for the functions it will expose: receiveSetCookie :: SetCookie -> Req.Request m -> UTCTime -> Bool -> CookieJar -> CookieJar Runs the algorithm described in section 5.3 Storage Model The UTCTime is the current-time, the Bool is whether or not the caller is an HTTP-based API (as opposed to JavaScript or anything else) updateCookieJar :: Res.Response a -> Req.Request m -> UTCTime -> CookieJar -> (CookieJar, Res.Response a) Applies receiveSetCookie to a Response. The output CookieJar is stripped of any Set-Cookie headers. Specifies True for the Bool in receiveSetCookie computeCookieString :: Req.Request m -> CookieJar -> UTCTime -> Bool -> (W.Ascii, CookieJar) Runs the algorithm described in section 5.4 The Cookie Header
Re: [Haskell-cafe] Contributing to http-conduit
Here is the patch to Web.Cookie. I didn't modify the tests at all because they were already broken - they looked like they hadn't been updated since SetCookie only had 5 parameters. I did verify by hand that the patch works, though. Thanks, Myles On Thu, Feb 2, 2012 at 11:26 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Alright, I'll make a small patch that adds 2 fields to SetCookie: setCookieMaxAge :: Maybe DiffTime setCookieSecureOnly :: Bool I've also gotten started on those cookie functions. I'm currently writing tests for them. @Chris: The best advice I can give is that Chrome (what I'm using as a source on all this) has the data baked into a .cc file. However, they have directions in a README and a script which will parse the list and generate that source file. I recommend doing this. That way, the Haskell module would have 2 source files: one file that reads the list and generates the second file, which is a very large source file that contains each element in the list. The list should export `elem`-type queries. I'm not quite sure how to handle wildcards that appear in the list - that part is up to you. Thanks for helping out with this :] --Myles On Thu, Feb 2, 2012 at 10:53 PM, Michael Snoyman mich...@snoyman.com wrote: Looks good to me too. I agree with Aristid: let's make the change to cookie itself. Do you want to send a pull request? I'm also considering making the SetCookie constructor hidden like we have for Request, so that if in the future we realize we need to add some other settings, it doesn't break the API. Chris: I would recommend compiling it into the module. Best bet would likely be converting the source file to Haskell source. Michael On Fri, Feb 3, 2012 at 6:32 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Alright. After reading the spec, I have these questions / concerns: The spec supports the Max-Age cookie attribute, which Web.Cookies doesn't. I see two possible solutions to this.
The first is to have parseSetCookie take a UTCTime as an argument which will represent the current time so it can populate the setCookieExpires field by adding the Max-Age attribute to the current time. Alternatively, that function can return an IO SetCookie so it can ask for the current time by itself (which I think is inferior to taking the current time as an argument). Note that the spec says to prefer Max-Age over Expires. Add a field to SetCookie of type Maybe DiffTime which represents the Max-Age attribute Cookie code should be aware of the Public Suffix List as a part of its domain verification. The cookie code only needs to be able to tell if a specific string is in the list (W.Ascii -> Bool) I propose making an entirely unrelated package, public-suffix-list, with a module Network.PublicSuffixList, which will expose this function, as well as functions about parsing the list itself. Thoughts? Web.Cookie doesn't have a secure-only attribute. Adding one in is straightforward enough. The spec describes cookies as a property of HTTP, not of the World Wide Web. Perhaps Web.Cookie should be renamed? Just a thought; it doesn't really matter to me. As for Network.HTTP.Conduit.Cookie, the spec describes in section 5.3 Storage Model what fields a Cookie has. Here is my proposal for the functions it will expose: receiveSetCookie :: SetCookie -> Req.Request m -> UTCTime -> Bool -> CookieJar -> CookieJar Runs the algorithm described in section 5.3 Storage Model The UTCTime is the current-time, the Bool is whether or not the caller is an HTTP-based API (as opposed to JavaScript or anything else) updateCookieJar :: Res.Response a -> Req.Request m -> UTCTime -> CookieJar -> (CookieJar, Res.Response a) Applies receiveSetCookie to a Response. The output CookieJar is stripped of any Set-Cookie headers.
Specifies True for the Bool in receiveSetCookie computeCookieString :: Req.Request m -> CookieJar -> UTCTime -> Bool -> (W.Ascii, CookieJar) Runs the algorithm described in section 5.4 The Cookie Header The UTCTime and Bool are the same as in receiveSetCookie insertCookiesIntoRequest :: Req.Request m -> CookieJar -> UTCTime -> (Req.Request m, CookieJar) Applies computeCookieString to a Request. The output cookie jar has updated last-accessed-times. Specifies True for the Bool in computeCookieString evictExpiredCookies :: CookieJar -> UTCTime -> CookieJar Runs the algorithm described in the last part of section 5.3 Storage Model This will make the relevant part of 'http' look like: go count req'' cookie_jar'' = do now <- liftIO $ getCurrentTime let (req', cookie_jar') = insertCookiesIntoRequest req'' (evictExpiredCookies cookie_jar'' now) now res' <- httpRaw req' manager let (cookie_jar, res) = updateCookieJar res' req' now cookie_jar' case getRedirectedRequest req
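The shape of the 'go' loop quoted above, threading the cookie jar through each hop and stopping when there is no redirect or the count runs out, can be sketched with toy types. All names here (Response, CookieJar, followRedirects, the fetch function) are hypothetical stand-ins, not http-conduit's real API, and the loop is pure rather than living in a monad.

```haskell
-- Toy stand-ins for http-conduit's types, to show the control flow only.
data Response = Response
  { redirectTo :: Maybe String  -- where a 3xx response points, if anywhere
  , respBody   :: String
  }

type CookieJar = [String]       -- placeholder for the real jar

-- Thread the jar through each hop; stop at the redirect limit or when
-- the response carries no redirect, mirroring the quoted 'go' loop.
followRedirects :: Int
                -> (String -> CookieJar -> (Response, CookieJar))  -- "httpRaw" stand-in
                -> String -> CookieJar -> (Response, CookieJar)
followRedirects count fetch url jar =
  let (res, jar') = fetch url jar   -- perform one request, updating the jar
  in case redirectTo res of
       Just url' | count > 0 -> followRedirects (count - 1) fetch url' jar'
       _                     -> (res, jar')
```

The key property, as in the proposal, is that cookie updates from every intermediate response survive into the final result because the jar is an explicit loop argument.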
Re: [Haskell-cafe] Contributing to http-conduit
Alright. After reading the spec, I have these questions / concerns: - The spec supports the Max-Age cookie attribute, which Web.Cookies doesn't. - I see two possible solutions to this. The first is to have parseSetCookie take a UTCTime as an argument which will represent the current time so it can populate the setCookieExpires field by adding the Max-Age attribute to the current time. Alternatively, that function can return an IO SetCookie so it can ask for the current time by itself (which I think is inferior to taking the current time as an argument). Note that the spec says to prefer Max-Age over Expires. - Add a field to SetCookie of type Maybe DiffTime which represents the Max-Age attribute - Cookie code should be aware of the Public Suffix List (http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat) as a part of its domain verification. The cookie code only needs to be able to tell if a specific string is in the list (W.Ascii -> Bool) - I propose making an entirely unrelated package, public-suffix-list, with a module Network.PublicSuffixList, which will expose this function, as well as functions about parsing the list itself. Thoughts? - Web.Cookie doesn't have a secure-only attribute. Adding one in is straightforward enough. - The spec describes cookies as a property of HTTP, not of the World Wide Web. Perhaps Web.Cookie should be renamed? Just a thought; it doesn't really matter to me. As for Network.HTTP.Conduit.Cookie, the spec describes in section 5.3 Storage Model what fields a Cookie has.
Here is my proposal for the functions it will expose: - receiveSetCookie :: SetCookie -> Req.Request m -> UTCTime -> Bool -> CookieJar -> CookieJar - Runs the algorithm described in section 5.3 Storage Model - The UTCTime is the current-time, the Bool is whether or not the caller is an HTTP-based API (as opposed to JavaScript or anything else) - updateCookieJar :: Res.Response a -> Req.Request m -> UTCTime -> CookieJar -> (CookieJar, Res.Response a) - Applies receiveSetCookie to a Response. The output CookieJar is stripped of any Set-Cookie headers. - Specifies True for the Bool in receiveSetCookie - computeCookieString :: Req.Request m -> CookieJar -> UTCTime -> Bool -> (W.Ascii, CookieJar) - Runs the algorithm described in section 5.4 The Cookie Header - The UTCTime and Bool are the same as in receiveSetCookie - insertCookiesIntoRequest :: Req.Request m -> CookieJar -> UTCTime -> (Req.Request m, CookieJar) - Applies computeCookieString to a Request. The output cookie jar has updated last-accessed-times. - Specifies True for the Bool in computeCookieString - evictExpiredCookies :: CookieJar -> UTCTime -> CookieJar - Runs the algorithm described in the last part of section 5.3 Storage Model This will make the relevant part of 'http' look like: go count req'' cookie_jar'' = do now <- liftIO $ getCurrentTime let (req', cookie_jar') = insertCookiesIntoRequest req'' (evictExpiredCookies cookie_jar'' now) now res' <- httpRaw req' manager let (cookie_jar, res) = updateCookieJar res' req' now cookie_jar' case getRedirectedRequest req' (responseHeaders res) (W.statusCode (statusCode res)) of Just req -> go (count - 1) req cookie_jar Nothing -> return res I plan to not allow for a user-supplied cookieFilter function. If they want that functionality, they can re-implement the redirection-following logic. Any thoughts on any of this? Thanks, Myles On Wed, Feb 1, 2012 at 5:19 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Nope. I'm not. The RFC is very explicit about how to handle cookies.
As soon as I'm finished making sense of it (in terms of Haskell) I'll send another proposal email. On Feb 1, 2012 3:25 AM, Michael Snoyman mich...@snoyman.com wrote: You mean you're *not* making this proposal? On Wed, Feb 1, 2012 at 7:30 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Well, this is embarrassing. Please disregard my previous email. I should learn to read the RFC *before* submitting proposals. --Myles On Tue, Jan 31, 2012 at 6:37 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Here are my initial ideas about supporting cookies. Note that I'm using Chrome for ideas since it's open source. Network/HTTP/Conduit/Cookies.hs file Exporting the following symbols: type StuffedCookie = SetCookie A regular SetCookie can have Nothing for its Domain and Path attributes. A StuffedCookie has to have these fields set. type CookieJar = [StuffedCookie] Chrome's cookie jar is implemented as (the C++ equivalent of) Map W.Ascii StuffedCookie. The key is the eTLD+1 of the domain, so lookups for all cookies for a given domain are fast. I think I'll stay with just a list
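The list-based jar with expiry eviction that getRelevantCookies is described as performing above might look like this minimal sketch. The types are hypothetical simplifications (Integer seconds stand in for UTCTime, and a cookie is reduced to name/value/expiry); `evictExpired` and `storeCookie` are invented names covering only a tiny slice of RFC 6265 section 5.3.

```haskell
-- Hypothetical, simplified model: the jar is a plain list, as proposed.
data Cookie = Cookie
  { cName    :: String
  , cValue   :: String
  , cExpires :: Integer   -- absolute expiry time, in seconds
  } deriving (Eq, Show)

type CookieJar = [Cookie]

-- Drop every cookie whose expiry time is not after "now" (the eviction
-- half of getRelevantCookies / evictExpiredCookies).
evictExpired :: CookieJar -> Integer -> CookieJar
evictExpired jar now = filter (\c -> cExpires c > now) jar

-- Store a freshly received cookie, replacing any same-named cookie.
storeCookie :: Cookie -> Integer -> CookieJar -> CookieJar
storeCookie c now jar
  | cExpires c <= now = others          -- expired on arrival: only evict
  | otherwise         = c : others
  where others = filter (\c' -> cName c' /= cName c) jar
```

Passing `now` as an argument keeps both functions pure and testable, which is exactly the rationale given in the message for pulling the time out of getRelevantCookies.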
Re: [Haskell-cafe] Contributing to http-conduit
Alright, I'll make a small patch that adds 2 fields to SetCookie: setCookieMaxAge :: Maybe DiffTime setCookieSecureOnly :: Bool I've also gotten started on those cookie functions. I'm currently writing tests for them. @Chris: The best advice I can give is that Chrome (what I'm using as a source on all this) has the data baked into a .cc file. However, they have directions in a README and a script which will parse the list and generate that source file. I recommend doing this. That way, the Haskell module would have 2 source files: one file that reads the list and generates the second file, which is a very large source file that contains each element in the list. The list should export `elem`-type queries. I'm not quite sure how to handle wildcards that appear in the list - that part is up to you. Thanks for helping out with this :] --Myles On Thu, Feb 2, 2012 at 10:53 PM, Michael Snoyman mich...@snoyman.com wrote: Looks good to me too. I agree with Aristid: let's make the change to cookie itself. Do you want to send a pull request? I'm also considering making the SetCookie constructor hidden like we have for Request, so that if in the future we realize we need to add some other settings, it doesn't break the API. Chris: I would recommend compiling it into the module. Best bet would likely be converting the source file to Haskell source. Michael On Fri, Feb 3, 2012 at 6:32 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Alright. After reading the spec, I have these questions / concerns: The spec supports the Max-Age cookie attribute, which Web.Cookies doesn't. I see two possible solutions to this. The first is to have parseSetCookie take a UTCTime as an argument which will represent the current time so it can populate the setCookieExpires field by adding the Max-Age attribute to the current time.
Alternatively, that function can return an IO SetCookie so it can ask for the current time by itself (which I think is inferior to taking the current time as an argument). Note that the spec says to prefer Max-Age over Expires. Add a field to SetCookie of type Maybe DiffTime which represents the Max-Age attribute Cookie code should be aware of the Public Suffix List as a part of its domain verification. The cookie code only needs to be able to tell if a specific string is in the list (W.Ascii -> Bool) I propose making an entirely unrelated package, public-suffix-list, with a module Network.PublicSuffixList, which will expose this function, as well as functions about parsing the list itself. Thoughts? Web.Cookie doesn't have a secure-only attribute. Adding one in is straightforward enough. The spec describes cookies as a property of HTTP, not of the World Wide Web. Perhaps Web.Cookie should be renamed? Just a thought; it doesn't really matter to me. As for Network.HTTP.Conduit.Cookie, the spec describes in section 5.3 Storage Model what fields a Cookie has. Here is my proposal for the functions it will expose: receiveSetCookie :: SetCookie -> Req.Request m -> UTCTime -> Bool -> CookieJar -> CookieJar Runs the algorithm described in section 5.3 Storage Model The UTCTime is the current-time, the Bool is whether or not the caller is an HTTP-based API (as opposed to JavaScript or anything else) updateCookieJar :: Res.Response a -> Req.Request m -> UTCTime -> CookieJar -> (CookieJar, Res.Response a) Applies receiveSetCookie to a Response. The output CookieJar is stripped of any Set-Cookie headers. Specifies True for the Bool in receiveSetCookie computeCookieString :: Req.Request m -> CookieJar -> UTCTime -> Bool -> (W.Ascii, CookieJar) Runs the algorithm described in section 5.4 The Cookie Header The UTCTime and Bool are the same as in receiveSetCookie insertCookiesIntoRequest :: Req.Request m -> CookieJar -> UTCTime -> (Req.Request m, CookieJar) Applies computeCookieString to a Request.
The output cookie jar has updated last-accessed-times. Specifies True for the Bool in computeCookieString evictExpiredCookies :: CookieJar -> UTCTime -> CookieJar Runs the algorithm described in the last part of section 5.3 Storage Model This will make the relevant part of 'http' look like: go count req'' cookie_jar'' = do now <- liftIO $ getCurrentTime let (req', cookie_jar') = insertCookiesIntoRequest req'' (evictExpiredCookies cookie_jar'' now) now res' <- httpRaw req' manager let (cookie_jar, res) = updateCookieJar res' req' now cookie_jar' case getRedirectedRequest req' (responseHeaders res) (W.statusCode (statusCode res)) of Just req -> go (count - 1) req cookie_jar Nothing -> return res I plan to not allow for a user-supplied cookieFilter function. If they want that functionality, they can re-implement the redirection-following logic. Any thoughts on any of this? Thanks, Myles
Re: [Haskell-cafe] Contributing to http-conduit
Nope. I'm not. The RFC is very explicit about how to handle cookies. As soon as I'm finished making sense of it (in terms of Haskell) I'll send another proposal email. On Feb 1, 2012 3:25 AM, Michael Snoyman mich...@snoyman.com wrote: You mean you're *not* making this proposal? On Wed, Feb 1, 2012 at 7:30 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Well, this is embarrassing. Please disregard my previous email. I should learn to read the RFC *before* submitting proposals. --Myles On Tue, Jan 31, 2012 at 6:37 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Here are my initial ideas about supporting cookies. Note that I'm using Chrome for ideas since it's open source. Network/HTTP/Conduit/Cookies.hs file Exporting the following symbols: type StuffedCookie = SetCookie A regular SetCookie can have Nothing for its Domain and Path attributes. A StuffedCookie has to have these fields set. type CookieJar = [StuffedCookie] Chrome's cookie jar is implemented as (the C++ equivalent of) Map W.Ascii StuffedCookie. The key is the eTLD+1 of the domain, so lookups for all cookies for a given domain are fast. I think I'll stay with just a list of StuffedCookies just to keep it simple. Perhaps a later revision can implement the faster map. getRelevantCookies :: Request m -> CookieJar -> UTCTime -> (CookieJar, Cookies) Gets all the cookies from the cookie jar that should be set for the given Request. The time argument is whatever now is (it's pulled out of the function so the function can remain pure and easily testable) The function will also remove expired cookies from the cookie jar (given what now is) and return the filtered cookie jar putRelevantCookies :: Request m -> CookieJar -> [StuffedCookie] -> CookieJar Insert cookies from a server response into the cookie jar.
The first argument is only used for checking to see which cookies are valid (which cookies match the requested domain, etc, so site1.com can't set a cookie for site2.com) stuffCookie :: Request m -> SetCookie -> StuffedCookie If the SetCookie's fields are Nothing, fill them in given the Request from which it originated getCookies :: Response a -> ([SetCookie], Response a) Pull cookies out of a server response. Return the response with the Set-Cookie headers filtered out putCookies :: Request a -> Cookies -> Request a A wrapper around renderCookies. Inserts some cookies into a request. Doesn't overwrite cookies that are already set in the request These functions will be exported from Network.HTTP.Conduit as well, so callers can use them to re-implement redirection chains I won't implement a cookie filtering function (like what Network.Browser has) If you want to have arbitrary handling of cookies, re-implement redirection following. It's not very difficult if you use the API provided, and the 'http' function is open source so you can use that as a reference. I will implement the functions according to RFC 6265 I will also need to write the following functions. Should they also be exported? canonicalizeDomain :: W.Ascii -> W.Ascii turns ..a.b.c..d.com... to a.b.c.d.com Technically necessary for domain matching (Chrome does it) Perhaps unnecessary for a first pass? Perhaps we can trust users for now? domainMatches :: W.Ascii -> W.Ascii -> Maybe W.Ascii Does the first domain match against the second domain? If so, return the prefix of the first that isn't in the second pathMatches :: W.Ascii -> W.Ascii -> Bool Do the paths match? In order to implement domain matching, I have to have knowledge of the Public Suffix List so I know that sub1.sub2.pvt.k12.wy.us can set a cookie for sub2.pvt.k12.wy.us but not for k12.wy.us (because pvt.k12.wy.us is a suffix). There are a variety of ways to implement this.
As far as I can tell, Chrome does it by using a script (which a human periodically runs) which parses the list and creates a .cc file that is included in the build. I might be wrong about the execution of the script; it might be a build step. If it is a build step, however, it is suspicious that a build target would try to download a file... Any more elegant ideas? Feedback on any/all of the above would be very helpful before I go off into the weeds on this project. Thanks, Myles C. Maxfield On Sat, Jan 28, 2012 at 8:17 PM, Michael Snoyman mich...@snoyman.com wrote: Thanks, looks great! I've merged it into the Github tree. On Sat, Jan 28, 2012 at 8:36 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Ah, yes, you're completely right. I completely agree that moving the function into the Maybe monad increases readability. This kind of function is what the Maybe monad was designed for. Here is a revised patch. On Sat, Jan 28, 2012 at 8:28 AM, Michael Snoyman mich...@snoyman.com wrote
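The domainMatches and pathMatches functions proposed in this message could be sketched roughly as below, using String in place of W.Ascii. This is a loose reading of RFC 6265's matching rules for illustration, not the real implementation, and it deliberately ignores canonicalization and the Public Suffix List check discussed above.

```haskell
import Data.List (isPrefixOf, isSuffixOf)

-- Does the request domain match the cookie domain? If so, return the
-- prefix of the first that isn't in the second, as the proposal asks.
domainMatches :: String -> String -> Maybe String
domainMatches reqDomain cookieDomain
  | reqDomain == cookieDomain = Just ""
  | ('.' : cookieDomain) `isSuffixOf` reqDomain =
      Just (take (length reqDomain - length cookieDomain) reqDomain)
  | otherwise = Nothing   -- e.g. "badexample.com" must not match "example.com"

-- Path matching in the spirit of RFC 6265 section 5.1.4: the cookie
-- path must be the whole path, or a prefix ending at a "/" boundary.
pathMatches :: String -> String -> Bool
pathMatches reqPath cookiePath
  | reqPath == cookiePath = True
  | cookiePath `isPrefixOf` reqPath =
      "/" `isSuffixOf` cookiePath || reqPath !! length cookiePath == '/'
  | otherwise = False
```

Note how the suffix check is done against "." ++ cookieDomain, so that "badexample.com" does not accidentally match "example.com"; that boundary condition is the classic bug in naive domain matching.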
Re: [Haskell-cafe] Contributing to http-conduit
Here are my initial ideas about supporting cookies. Note that I'm using Chrome for ideas since it's open source. - Network/HTTP/Conduit/Cookies.hs file - Exporting the following symbols: - type StuffedCookie = SetCookie - A regular SetCookie can have Nothing for its Domain and Path attributes. A StuffedCookie has to have these fields set. - type CookieJar = [StuffedCookie] - Chrome's cookie jar is implemented as (the C++ equivalent of) Map W.Ascii StuffedCookie. The key is the eTLD+1 of the domain, so lookups for all cookies for a given domain are fast. - I think I'll stay with just a list of StuffedCookies just to keep it simple. Perhaps a later revision can implement the faster map. - getRelevantCookies :: Request m -> CookieJar -> UTCTime -> (CookieJar, Cookies) - Gets all the cookies from the cookie jar that should be set for the given Request. - The time argument is whatever now is (it's pulled out of the function so the function can remain pure and easily testable) - The function will also remove expired cookies from the cookie jar (given what now is) and return the filtered cookie jar - putRelevantCookies :: Request m -> CookieJar -> [StuffedCookie] -> CookieJar - Insert cookies from a server response into the cookie jar. - The first argument is only used for checking to see which cookies are valid (which cookies match the requested domain, etc, so site1.com can't set a cookie for site2.com) - stuffCookie :: Request m -> SetCookie -> StuffedCookie - If the SetCookie's fields are Nothing, fill them in given the Request from which it originated - getCookies :: Response a -> ([SetCookie], Response a) - Pull cookies out of a server response. Return the response with the Set-Cookie headers filtered out - putCookies :: Request a -> Cookies -> Request a - A wrapper around renderCookies. Inserts some cookies into a request.
- Doesn't overwrite cookies that are already set in the request - These functions will be exported from Network.HTTP.Conduit as well, so callers can use them to re-implement redirection chains - I won't implement a cookie filtering function (like what Network.Browser has) - If you want to have arbitrary handling of cookies, re-implement redirection following. It's not very difficult if you use the API provided, and the 'http' function is open source so you can use that as a reference. - I will implement the functions according to RFC 6265 - I will also need to write the following functions. Should they also be exported? - canonicalizeDomain :: W.Ascii - W.Ascii - turns ..a.b.c..d.com... to a.b.c.d.com - Technically necessary for domain matching (Chrome does it) - Perhaps unnecessary for a first pass? Perhaps we can trust users for now? - domainMatches :: W.Ascii - W.Ascii - Maybe W.Ascii - Does the first domain match against the second domain? - If so, return the prefix of the first that isn't in the second - pathMatches :: W.Ascii - W.Ascii - Bool - Do the paths match? - In order to implement domain matching, I have to have knowledge of the Public Suffix Listhttp://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat so I know that sub1.sub2.pvt.k12.wy.us can set a cookie for sub2.pvt.k12.wy.us but not for k12.wy.us (because pvt.k12.wy.us is a suffix). There are a variety of ways to implement this. - As far as I can tell, Chrome does it by using a script (which a human periodically runs) which parses the list at creates a .cc file that is included in the build. - I might be wrong about the execution of the script; it might be a build step. If it is a build step, however, it is suspicious that a build target would try to download a file... - Any more elegant ideas? Feedback on any/all of the above would be very helpful before I go off into the weeds on this project. Thanks, Myles C. 
Maxfield On Sat, Jan 28, 2012 at 8:17 PM, Michael Snoyman mich...@snoyman.comwrote: Thanks, looks great! I've merged it into the Github tree. On Sat, Jan 28, 2012 at 8:36 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Ah, yes, you're completely right. I completely agree that moving the function into the Maybe monad increases readability. This kind of function is what the Maybe monad was designed for. Here is a revised patch. On Sat, Jan 28, 2012 at 8:28 AM, Michael Snoyman mich...@snoyman.com wrote: On Sat, Jan 28, 2012 at 1:20 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: the fromJust should never fail, beceause of the guard statement: | 300 = code code 400 isJust l
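The canonicalizeDomain and domainMatches helpers proposed above can be sketched in plain Haskell. This is a hypothetical illustration over String rather than W.Ascii, and it deliberately ignores the Public Suffix List question discussed in the thread; nothing here is the library's actual implementation:

```haskell
import Data.List (intercalate, isSuffixOf)

type Domain = String  -- stand-in for W.Ascii

-- Split on dots, keeping empty labels so we can drop them afterwards.
splitDots :: String -> [String]
splitDots s = case break (== '.') s of
    (a, [])       -> [a]
    (a, _ : rest) -> a : splitDots rest

-- canonicalizeDomain: turns "..a.b.c..d.com..." into "a.b.c.d.com"
canonicalizeDomain :: Domain -> Domain
canonicalizeDomain = intercalate "." . filter (not . null) . splitDots

-- domainMatches: does the first domain match against the second?
-- If so, return the prefix of the first that isn't in the second.
domainMatches :: Domain -> Domain -> Maybe Domain
domainMatches d1 d2
    | d1 == d2                   = Just ""
    | ('.' : d2) `isSuffixOf` d1 = Just (take (length d1 - length d2 - 1) d1)
    | otherwise                  = Nothing
```

For example, domainMatches "sub1.sub2.pvt.k12.wy.us" "pvt.k12.wy.us" yields Just "sub1.sub2"; a real implementation would additionally consult the Public Suffix List before allowing the cookie to be set.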
Re: [Haskell-cafe] Contributing to http-conduit
Well, this is embarrassing. Please disregard my previous email. I should learn to read the RFC *before* submitting proposals. --Myles On Tue, Jan 31, 2012 at 6:37 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Here are my initial ideas about supporting cookies. Note that I'm using Chrome for ideas since it's open source. [snip: the quoted proposal is identical to the message above] On Sat, Jan 28, 2012 at 8:17 PM, Michael Snoyman mich...@snoyman.com wrote: Thanks, looks great! I've merged it into the GitHub tree. On Sat, Jan 28, 2012 at 8:36 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Ah, yes, you're completely right. I completely agree that moving the function into the Maybe monad increases readability. This kind of function is what the Maybe monad was designed for. Here
Re: [Haskell-cafe] Contributing to http-conduit
Ah, yes, you're completely right. I completely agree that moving the function into the Maybe monad increases readability. This kind of function is what the Maybe monad was designed for. Here is a revised patch. On Sat, Jan 28, 2012 at 8:28 AM, Michael Snoyman mich...@snoyman.com wrote: On Sat, Jan 28, 2012 at 1:20 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: the fromJust should never fail, because of the guard statement: | 300 <= code && code < 400 && isJust l'' && isJust l' = Just $ req Because of the order of the operators, it will only evaluate fromJust after it makes sure that the argument isJust. That function in particular shouldn't throw any exceptions - it should only return Nothing. Knowing that, I don't quite think I understand what your concern is. Can you elaborate? You're right, but I had to squint really hard to prove to myself that you're right. That's the kind of code that could easily be broken in future updates by an unwitting maintainer (e.g., me). To protect the world from me, I'd prefer if the code didn't have the fromJust. This might be a good place to leverage the Monad instance of Maybe. Michael getRedirectedRequest.2.patch Description: Binary data ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
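Michael's suggestion of leveraging the Monad instance of Maybe can be illustrated with a toy version of the guard in question. The types here are simplified stand-ins, not http-conduit's real Request and Response:

```haskell
import Control.Monad (guard)
import Data.Maybe (fromJust, isJust)

data Resp = Resp { respCode :: Int, respLocation :: Maybe String }

-- Guard-plus-fromJust style, as in the original patch: safe only
-- because the guard checks isJust first, which is fragile.
redirectGuard :: Resp -> Maybe String
redirectGuard res
    | 300 <= code && code < 400 && isJust loc = Just (fromJust loc)
    | otherwise                               = Nothing
  where
    code = respCode res
    loc  = respLocation res

-- The same logic in the Maybe monad: no fromJust anywhere, so a future
-- maintainer cannot break the invariant by reordering the conditions.
redirectMonadic :: Resp -> Maybe String
redirectMonadic res = do
    guard (300 <= respCode res && respCode res < 400)
    respLocation res
```

Both versions return Nothing for non-redirect statuses or a missing location header, but the monadic version makes the absence of partial functions obvious at a glance.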
Re: [Haskell-cafe] Contributing to http-conduit
the fromJust should never fail, because of the guard statement: | 300 <= code && code < 400 && isJust l'' && isJust l' = Just $ req Because of the order of the operators, it will only evaluate fromJust after it makes sure that the argument isJust. That function in particular shouldn't throw any exceptions - it should only return Nothing. Knowing that, I don't quite think I understand what your concern is. Can you elaborate? Thanks, Myles On Thu, Jan 26, 2012 at 12:59 AM, Michael Snoyman mich...@snoyman.com wrote: I'm a little worried about the use of `fromJust`, it will give users a very confusing error message, and the error might be triggered at the wrong point in the computation. I'd feel better if checkRedirect lived in either some Failure, an Either, or maybe even in IO itself. IO might make sense if we want to implement some cookie jar functionality in the future via mutable references. Michael On Thu, Jan 26, 2012 at 10:29 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Here is a patch regarding getRedirectedRequest. Comments are very welcome. --Myles C. Maxfield On Wed, Jan 25, 2012 at 10:21 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: I was planning on making the caller deal with keeping track of cookies between requests. My cookie idea only solves the problem of cookies persisting through a redirect chain - not between subsequent request chains. Do you think that Network.HTTP.Conduit should have a persistent cookie jar between caller's requests? I don't really think so. --Myles On Wed, Jan 25, 2012 at 9:28 PM, Michael Snoyman mich...@snoyman.com wrote: On Wed, Jan 25, 2012 at 8:18 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Alright, that's fine. I just wanted to be explicit about the interface we'd be providing. Taking the Request construction code out of 'http' and putting it into its own function should be a quick change - I'll have it to you soon.
One possible wrench - The existing code copies some fields (like the proxy) from the original request. In order to keep this functionality, the signature would have to be: checkRedirect :: Request m -> Response -> Maybe (Request m) Is that okay with you? I think I'd also like to call the function something different, perhaps 'getRedirectedRequest'. Is that okay? I'll also add an example to the documentation about how a caller would get the redirection chain by re-implementing redirection (by using the example in your previous email). Sounds great. As for cookie handling - I think Network.Browser has a pretty elegant solution to this. They allow a CookieFilter which has type of URI -> Cookie -> IO Bool. Cookies are only put in the cookie jar if the function returns True. There is a default CookieFilter, which behaves as we would expect, but the user can override this function. That way, if you don't want to support cookies, you can just pass in (\ _ _ -> return False). Also sounds good. If we're already expecting people that want specific functionality to re-implement the redirect-following code, this solution might be unnecessary. Do you think that such a concept would be beneficial for Network.HTTP.Conduit to implement? Yes, I can imagine that some people would want more fine-grained control of which cookies are accepted. Either way, I'll probably end up making a solution similar to your checkRedirect function that will just allow people to take SetCookies out of a Response and put Cookies into a Request. I'll probably also provide a default function which converts a SetCookie into a cookie by looking up the current time, inspecting the Request, etc. This will allow me to not have to change the type of Request or Response - the functions I'll be writing can deal with the raw Headers that are already in Requests and Responses. Modifying 'http' to use these functions will be straightforward. How does this sound to you? Sounds like a good plan to me.
I'm not entirely certain how you're planning on implementing the cookie jar itself. In other words, if I make a request, have a cookie set, and then make another request later, where will the cookie be stored in the interim, and how will the second request know to use it? Michael ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
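The shape being agreed on here, a pure function from the original request plus the response to a possible redirected request, can be sketched with simplified types. The real signature under discussion is getRedirectedRequest :: Request m -> Response -> Maybe (Request m); the field names below are hypothetical stand-ins:

```haskell
import Control.Monad (guard)

data Req = Req { reqUrl :: String, reqProxy :: Maybe String }
    deriving (Eq, Show)

data Resp = Resp { respCode :: Int, respLocation :: Maybe String }

-- The original request is needed so fields like the proxy carry over
-- into the redirected request via a record update.
getRedirectedRequest :: Req -> Resp -> Maybe Req
getRedirectedRequest req res = do
    guard (300 <= respCode res && respCode res < 400)
    loc <- respLocation res
    return req { reqUrl = loc }  -- proxy (and any other copied fields) preserved
```

This is exactly why the plain Response -> Maybe Request signature wasn't enough: without the original request, there is nothing to copy the proxy from.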
Re: [Haskell-cafe] Contributing to http-conduit
Here is a patch regarding getRedirectedRequest. Comments are very welcome. --Myles C. Maxfield On Wed, Jan 25, 2012 at 10:21 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: I was planning on making the caller deal with keeping track of cookies between requests. My cookie idea only solves the problem of cookies persisting through a redirect chain - not between subsequent request chains. Do you think that Network.HTTP.Conduit should have a persistent cookie jar between caller's requests? I don't really think so. --Myles On Wed, Jan 25, 2012 at 9:28 PM, Michael Snoyman mich...@snoyman.com wrote: On Wed, Jan 25, 2012 at 8:18 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Alright, that's fine. I just wanted to be explicit about the interface we'd be providing. Taking the Request construction code out of 'http' and putting it into its own function should be a quick change - I'll have it to you soon. One possible wrench - The existing code copies some fields (like the proxy) from the original request. In order to keep this functionality, the signature would have to be: checkRedirect :: Request m -> Response -> Maybe (Request m) Is that okay with you? I think I'd also like to call the function something different, perhaps 'getRedirectedRequest'. Is that okay? I'll also add an example to the documentation about how a caller would get the redirection chain by re-implementing redirection (by using the example in your previous email). Sounds great. As for cookie handling - I think Network.Browser has a pretty elegant solution to this. They allow a CookieFilter which has type of URI -> Cookie -> IO Bool. Cookies are only put in the cookie jar if the function returns True. There is a default CookieFilter, which behaves as we would expect, but the user can override this function. That way, if you don't want to support cookies, you can just pass in (\ _ _ -> return False). Also sounds good.
If we're already expecting people that want specific functionality to re-implement the redirect-following code, this solution might be unnecessary. Do you think that such a concept would be beneficial for Network.HTTP.Conduit to implement? Yes, I can imagine that some people would want more fine-grained control of which cookies are accepted. Either way, I'll probably end up making a solution similar to your checkRedirect function that will just allow people to take SetCookies out of a Response and put Cookies into a Request. I'll probably also provide a default function which converts a SetCookie into a cookie by looking up the current time, inspecting the Request, etc. This will allow me to not have to change the type of Request or Response - the functions I'll be writing can deal with the raw Headers that are already in Requests and Responses. Modifying 'http' to use these functions will be straightforward. How does this sound to you? Sounds like a good plan to me. I'm not entirely certain how you're planning on implementing the cookie jar itself. In other words, if I make a request, have a cookie set, and then make another request later, where will the cookie be stored in the interim, and how will the second request know to use it? Michael getRedirectedRequest.patch Description: Binary data ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Contributing to http-conduit
Alright, that's fine. I just wanted to be explicit about the interface we'd be providing. Taking the Request construction code out of 'http' and putting it into its own function should be a quick change - I'll have it to you soon. One possible wrench - The existing code copies some fields (like the proxy) from the original request. In order to keep this functionality, the signature would have to be: checkRedirect :: Request m -> Response -> Maybe (Request m) Is that okay with you? I think I'd also like to call the function something different, perhaps 'getRedirectedRequest'. Is that okay? I'll also add an example to the documentation about how a caller would get the redirection chain by re-implementing redirection (by using the example in your previous email). As for cookie handling - I think Network.Browser has a pretty elegant solution to this. They allow a CookieFilter which has type of URI -> Cookie -> IO Bool. Cookies are only put in the cookie jar if the function returns True. There is a default CookieFilter, which behaves as we would expect, but the user can override this function. That way, if you don't want to support cookies, you can just pass in (\ _ _ -> return False). If we're already expecting people that want specific functionality to re-implement the redirect-following code, this solution might be unnecessary. Do you think that such a concept would be beneficial for Network.HTTP.Conduit to implement? Either way, I'll probably end up making a solution similar to your checkRedirect function that will just allow people to take SetCookies out of a Response and put Cookies into a Request.
I'll probably also provide a default function which converts a SetCookie into a cookie by looking up the current time, inspecting the Request, etc. This will allow me to not have to change the type of Request or Response - the functions I'll be writing can deal with the raw Headers that are already in Requests and Responses. Modifying 'http' to use these functions will be straightforward. How does this sound to you? Thanks, Myles C. Maxfield On Wed, Jan 25, 2012 at 5:10 AM, Aristid Breitkreuz arist...@googlemail.com wrote: The nice thing is that this way, nobody can force me to handle cookies. ;-) Might be that usage patterns emerge, which we can then codify into functions later. On 25.01.2012 at 08:09, Michael Snoyman mich...@snoyman.com wrote: On Wed, Jan 25, 2012 at 9:01 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Sorry, I think I'm still a little confused about this. From the point of view of a library user, if I use the 'http' function, but want to know what final URL I ended up at, I would have to set redirects to 0, call http, call checkRedirect, and recurse until checkRedirect returns Nothing (or a count runs out). I would be handling the recursion of redirects myself. On one hand, this solution is lightweight and easy to implement in the library. On the other hand, the caller has to run each individual request themselves, keeping track of the number of requests (so there isn't an infinite loop). The loop is already implemented in the http function - I think it's reasonable to modify the existing loop rather than expect the caller to re-implement that logic. However, it's probably just as reasonable to say if you want to know what URL you end up at, you have to re-implement your own redirection-following logic. I do agree, however, that including a (possibly long, though explicitly bounded) [Ascii] along with every request is arbitrary, and probably not the best solution.
Can you think of a solution which allows the caller to know the url chain (or possibly just the last URL - that's the important one) without having to re-implement the redirection-following logic themselves? It sounds like if you had to choose, you would rather force a caller to re-implement redirection-following rather than include a URL chain in every Response. Is this correct? That's correct. I think knowing the final URL is a fairly arbitrary requirement, in the same boat as wanting redirect handling without supporting cookies. These to me fall well within the 20%: most people won't need them, so the API should not be optimized for them. There's also the fact that [Ascii] isn't nearly enough information to properly follow the chain. Next someone's going to want to know if a request was GET or POST, or whether it was a permanent or temporary redirect, or the exact text of the location header, or a million other things involved. If someone wants this stuff
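The Network.Browser-style filter mentioned in this exchange is easy to sketch. The CookieFilter type mirrors the one discussed; the cookie type and jar-filtering helper below are hypothetical stand-ins, not the real Network.Browser API:

```haskell
import Control.Monad (filterM)

type URI = String  -- stand-in for Network.URI's URI type

data Cookie = Cookie { cookieName :: String, cookieValue :: String }
    deriving (Eq, Show)

-- Cookies are only put in the jar if the filter returns True.
type CookieFilter = URI -> Cookie -> IO Bool

-- Default filter: accept everything, the behavior we would expect.
defaultCookieFilter :: CookieFilter
defaultCookieFilter _ _ = return True

-- Opting out of cookie handling entirely, as in (\ _ _ -> return False).
rejectAllCookies :: CookieFilter
rejectAllCookies _ _ = return False

-- Hypothetical helper: keep only the incoming cookies the filter accepts.
acceptCookies :: CookieFilter -> URI -> [Cookie] -> IO [Cookie]
acceptCookies f uri = filterM (f uri)
```

The filter lives in IO so a user-supplied implementation can consult mutable state or even prompt the user, which is what makes the design flexible.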
Re: [Haskell-cafe] Contributing to http-conduit
I was planning on making the caller deal with keeping track of cookies between requests. My cookie idea only solves the problem of cookies persisting through a redirect chain - not between subsequent request chains. Do you think that Network.HTTP.Conduit should have a persistent cookie jar between caller's requests? I don't really think so. --Myles On Wed, Jan 25, 2012 at 9:28 PM, Michael Snoyman mich...@snoyman.com wrote: On Wed, Jan 25, 2012 at 8:18 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Alright, that's fine. I just wanted to be explicit about the interface we'd be providing. Taking the Request construction code out of 'http' and putting it into its own function should be a quick change - I'll have it to you soon. One possible wrench - The existing code copies some fields (like the proxy) from the original request. In order to keep this functionality, the signature would have to be: checkRedirect :: Request m -> Response -> Maybe (Request m) Is that okay with you? I think I'd also like to call the function something different, perhaps 'getRedirectedRequest'. Is that okay? I'll also add an example to the documentation about how a caller would get the redirection chain by re-implementing redirection (by using the example in your previous email). Sounds great. As for cookie handling - I think Network.Browser has a pretty elegant solution to this. They allow a CookieFilter which has type of URI -> Cookie -> IO Bool. Cookies are only put in the cookie jar if the function returns True. There is a default CookieFilter, which behaves as we would expect, but the user can override this function. That way, if you don't want to support cookies, you can just pass in (\ _ _ -> return False). Also sounds good. If we're already expecting people that want specific functionality to re-implement the redirect-following code, this solution might be unnecessary. Do you think that such a concept would be beneficial for Network.HTTP.Conduit to implement?
Yes, I can imagine that some people would want more fine-grained control of which cookies are accepted. Either way, I'll probably end up making a solution similar to your checkRedirect function that will just allow people to take SetCookies out of a Response and put Cookies into a Request. I'll probably also provide a default function which converts a SetCookie into a cookie by looking up the current time, inspecting the Request, etc. This will allow me to not have to change the type of Request or Response - the functions I'll be writing can deal with the raw Headers that are already in Requests and Responses. Modifying 'http' to use these functions will be straightforward. How does this sound to you? Sounds like a good plan to me. I'm not entirely certain how you're planning on implementing the cookie jar itself. In other words, if I make a request, have a cookie set, and then make another request later, where will the cookie be stored in the interim, and how will the second request know to use it? Michael ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Contributing to http-conduit
On Mon, Jan 23, 2012 at 10:43 PM, Michael Snoyman mich...@snoyman.com wrote: On Tue, Jan 24, 2012 at 8:37 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: I have attached a patch to add a redirect chain to the Response datatype. Comments on this patch are very welcome. I thought that this isn't necessary since a client wanting to track all the redirects could just handle them manually by setting the redirect count to 0. It seems like a lot of work to re-implement the redirection-following code, just to know which URL the bytes are coming from. I feel that adding this field makes the library easier to use, but it's your call. I was originally going to include the entire Request object in the redirection chain, but Request objects are parameterized with a type 'm', so including a 'Request m' field would force the Response type to be parameterized as well. I felt that would be too large a change, so I made the type of the redirection chain W.Ascii. Perhaps it's worth using the 'forall' keyword to get rid of the pesky 'm' type parameter for Requests?

data RequestBody
    = RequestBodyLBS L.ByteString
    | RequestBodyBS S.ByteString
    | RequestBodyBuilder Int64 Blaze.Builder
    | forall m. RequestBodySource Int64 (C.Source m Blaze.Builder)
    | forall m. RequestBodySourceChunked (C.Source m Blaze.Builder)

There'd be no way to run the request body then (try compiling the code after that change). Yeah, I never actually tried this change to see if it works. I'll try it tonight after work. Michael Thanks, Myles ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
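Michael's objection, that an existentially quantified m leaves no way to run the body, can be demonstrated with toy stand-ins. Everything below is hypothetical (a fake Source, an Int payload instead of Blaze.Builder), but the typing problem is the same one the patch would hit:

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Toy stand-in for C.Source: just an action in some monad m.
newtype Source m a = Source { unSource :: m a }

data RequestBody
    = RequestBodyInt Int
    | forall m. Monad m => RequestBodySource (Source m Int)

-- Once m is hidden by the forall, a consumer cannot assume m ~ IO (or
-- any other particular monad), so a hypothetical
--   runBody :: RequestBody -> IO Int
-- would not typecheck for RequestBodySource. All we can do is things
-- that work for *every* monad, like inspecting which constructor we have:
describe :: RequestBody -> String
describe (RequestBodyInt n)    = "plain body: " ++ show n
describe (RequestBodySource _) = "source body in some unknown monad"
```

This is why the real RequestBody keeps m as a parameter: the caller, not the constructor, must choose the monad the source runs in.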
Re: [Haskell-cafe] Contributing to http-conduit
Sorry, I don't think I'm following. What would the meaning of the value returned from checkRedirect be? --Myles On Tue, Jan 24, 2012 at 10:47 AM, Michael Snoyman mich...@snoyman.com wrote: On Tue, Jan 24, 2012 at 6:57 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: On Mon, Jan 23, 2012 at 10:43 PM, Michael Snoyman mich...@snoyman.com wrote: On Tue, Jan 24, 2012 at 8:37 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: I have attached a patch to add a redirect chain to the Response datatype. Comments on this patch are very welcome. I thought that this isn't necessary since a client wanting to track all the redirects could just handle them manually by setting the redirect count to 0. It seems like a lot of work to re-implement the redirection-following code, just to know which URL the bytes are coming from. I feel that adding this field makes the library easier to use, but it's your call. If that's the concern, I'd much rather just expose a function to help with dealing with redirects, rather than sticking a rather arbitrary [Ascii] in everyone's Response. I think a function along the lines of: checkRedirect :: Response -> Maybe Request would fit the bill, and could be extracted from the current `http` function. Michael ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Contributing to http-conduit
Sorry, I think I'm still a little confused about this. From the point of view of a library user, if I use the 'http' function, but want to know what final URL I ended up at, I would have to set redirects to 0, call http, call checkRedirect, and recurse until checkRedirect returns Nothing (or a count runs out). I would be handling the recursion of redirects myself. On one hand, this solution is lightweight and easy to implement in the library. On the other hand, the caller has to run each individual request themselves, keeping track of the number of requests (so there isn't an infinite loop). The loop is already implemented in the http function - I think it's reasonable to modify the existing loop rather than expect the caller to re-implement that logic. However, it's probably just as reasonable to say if you want to know what URL you end up at, you have to re-implement your own redirection-following logic. I do agree, however, that including a (possibly long, though explicitly bounded) [Ascii] along with every request is arbitrary, and probably not the best solution. Can you think of a solution which allows the caller to know the URL chain (or possibly just the last URL - that's the important one) without having to re-implement the redirection-following logic themselves? It sounds like if you had to choose, you would rather force a caller to re-implement redirection-following rather than include a URL chain in every Response. Is this correct? Thanks for helping me out with this, Myles C. Maxfield On Tue, Jan 24, 2012 at 8:05 PM, Michael Snoyman mich...@snoyman.com wrote: It would be the new request indicated by the server response, if the server gave a redirect response. On Tue, Jan 24, 2012 at 9:05 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: Sorry, I don't think I'm following. What would the meaning of the value returned from checkRedirect be?
--Myles On Tue, Jan 24, 2012 at 10:47 AM, Michael Snoyman mich...@snoyman.com wrote: On Tue, Jan 24, 2012 at 6:57 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: On Mon, Jan 23, 2012 at 10:43 PM, Michael Snoyman mich...@snoyman.com wrote: On Tue, Jan 24, 2012 at 8:37 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote: I have attached a patch to add a redirect chain to the Response datatype. Comments on this patch are very welcome. I thought that this isn't necessary since a client wanting to track all the redirects could just handle them manually by setting the redirect count to 0. It seems like a lot of work to re-implement the redirection-following code, just to know which URL the bytes are coming from. I feel that adding this field makes the library easier to use, but it's your call. If that's the concern, I'd much rather just expose a function to help with dealing with redirects, rather than sticking a rather arbitrary [Ascii] in everyone's Response. I think a function along the lines of: checkRedirect :: Response -> Maybe Request would fit the bill, and could be extracted from the current `http` function. Michael ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
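The caller-side loop Myles describes (redirects set to 0, one request at a time, checkRedirect deciding whether to continue) looks roughly like this. It is written against abstract stand-ins, with perform playing the role of a single non-redirecting http call; it is not the real http-conduit API:

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Follow redirects by hand, collecting the chain of requests, with an
-- explicit bound on the number of hops so there is no infinite loop.
followRedirects
    :: Monad m
    => Int                 -- remaining redirects allowed
    -> (req -> m res)      -- perform one request, redirects disabled
    -> (res -> Maybe req)  -- checkRedirect: next request, if any
    -> req
    -> m ([req], res)      -- every request made, plus the final response
followRedirects n perform checkRedirect req = do
    res <- perform req
    case checkRedirect res of
        Just req' | n > 0 -> do
            (chain, final) <- followRedirects (n - 1) perform checkRedirect req'
            return (req : chain, final)
        _ -> return ([req], res)
```

The last element of the returned chain is the final URL the caller wanted to know. As a pure demonstration, with requests and responses as plain Ints and a fake "server" that redirects anything below 3, runIdentity (followRedirects 10 pure (\r -> if r < 3 then Just (r + 1) else Nothing) 0) gives ([0,1,2,3], 3).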
Re: [Haskell-cafe] Contributing to http-conduit
I have attached a patch to add a redirect chain to the Response datatype. Comments on this patch are very welcome. I was originally going to include the entire Request object in the redirection chain, but Request objects are parameterized with a type 'm', so including a 'Request m' field would force the Response type to be parameterized as well. I felt that would be too large a change, so I made the type of the redirection chain W.Ascii. Perhaps its worth using the 'forall' keyword to get rid of the pesky 'm' type parameter for Requests? data RequestBody = RequestBodyLBS L.ByteString | RequestBodyBS S.ByteString | RequestBodyBuilder Int64 Blaze.Builder | forall m. RequestBodySource Int64 (C.Source m Blaze.Builder) | forall m. RequestBodySourceChunked (C.Source m Blaze.Builder) --Myles On Mon, Jan 23, 2012 at 3:31 AM, Michael Snoyman mich...@snoyman.comwrote: On Mon, Jan 23, 2012 at 1:20 PM, Aristid Breitkreuz arist...@googlemail.com wrote: Rejecting cookies is not without precedent. If you must force cookie handling upon us, at least make it possible to selectively reject them. Aristid If you turn off automatic redirects, then you won't have cookie handling. I'd be interested to hear of a use case where you would want to avoid passing cookies after a redirect. Michael From d60bc1adf4af5a038432c35cde222654dfabf6dd Mon Sep 17 00:00:00 2001 From: Myles C. 
Maxfield lithe...@gmail.com Date: Mon, 23 Jan 2012 21:44:12 -0800 Subject: [PATCH] Adding a redirection chain field to Responses --- Network/HTTP/Conduit.hs |7 --- Network/HTTP/Conduit/Request.hs | 24 +++- Network/HTTP/Conduit/Response.hs |7 --- 3 files changed, 31 insertions(+), 7 deletions(-) diff --git a/Network/HTTP/Conduit.hs b/Network/HTTP/Conduit.hs index 794a62a..879d5a8 100644 --- a/Network/HTTP/Conduit.hs +++ b/Network/HTTP/Conduit.hs @@ -147,7 +147,7 @@ http - Manager - ResourceT m (Response (C.Source m S.ByteString)) http req0 manager = do -res@(Response status hs body) - +res@(Response _ status hs body) - if redirectCount req0 == 0 then httpRaw req0 manager else go (redirectCount req0) req0 @@ -160,7 +160,7 @@ http req0 manager = do where go 0 _ = liftBase $ throwIO TooManyRedirects go count req = do -res@(Response (W.Status code _) hs _) - httpRaw req manager +res@(Response uri (W.Status code _) hs _) - httpRaw req manager case (300 = code code 400, lookup location hs) of (True, Just l'') - do -- Prepend scheme, host and port if missing @@ -192,7 +192,8 @@ http req0 manager = do then GET else method l } -go (count - 1) req' +response - go (count - 1) req' +return $ response {requestChain = (head uri) : (requestChain response)} _ - return res -- | Get a 'Response' without any redirect following. diff --git a/Network/HTTP/Conduit/Request.hs b/Network/HTTP/Conduit/Request.hs index e6e8876..a777285 100644 --- a/Network/HTTP/Conduit/Request.hs +++ b/Network/HTTP/Conduit/Request.hs @@ -7,6 +7,7 @@ module Network.HTTP.Conduit.Request , ContentType , Proxy (..) , parseUrl +, unParseUrl , browserDecompress , HttpException (..) 
     , alwaysDecompress
@@ -39,7 +40,7 @@
 import qualified Network.HTTP.Types as W
 import Control.Exception (Exception, SomeException, toException)
 import Control.Failure (Failure (failure))
-import Codec.Binary.UTF8.String (encodeString)
+import Codec.Binary.UTF8.String (encode, encodeString)
 import qualified Data.CaseInsensitive as CI
 import qualified Data.ByteString.Base64 as B64
@@ -207,6 +208,27 @@ parseUrl2 full sec s = do
             (readDec rest)
     x -> error $ "parseUrl1: this should never happen: " ++ show x
+
+unParseUrl :: Request m -> W.Ascii
+unParseUrl Request { secure = secure'
+                   , host = host'
+                   , port = port'
+                   , path = path'
+                   , queryString = querystring'
+                   } = S.concat
+    [ "http"
+    , if secure' then "s" else S.empty
+    , "://"
+    , host'
+    , case (secure', port') of
+        (True, 443) -> S.empty
+        (True, p)   -> S.pack $ encode $ ":" ++ show p
+        (False, 80) -> S.empty
+        (False, p)  -> S.pack $ encode $ ":" ++ show p
+    , path'
+    , "?"
+    , querystring'
+    ]
+
 data HttpException = StatusCodeException W.Status W.ResponseHeaders
     | InvalidUrlException String String
     | TooManyRedirects
diff --git a/Network/HTTP/Conduit/Response.hs b/Network/HTTP/Conduit/Response.hs
index 5c6fd23..c183e34 100644
--- a/Network/HTTP/Conduit/Response.hs
+++ b/Network/HTTP/Conduit/Response.hs
@@ -33,7 +33,8 @@
 import Network.HTTP.Conduit.Chunk

 -- | A simple representation of the HTTP
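For readers following along, the reassembly logic in unParseUrl can be sketched as a standalone pure function. This is a simplified sketch, not http-conduit's actual API: SimpleRequest and its fields are hypothetical stand-ins for Request, and plain String stands in for W.Ascii.

```haskell
-- Hypothetical stand-in for http-conduit's Request; the field names
-- here are illustrative only.
data SimpleRequest = SimpleRequest
    { reqSecure :: Bool    -- True for https
    , reqHost   :: String
    , reqPort   :: Int
    , reqPath   :: String
    , reqQuery  :: String  -- including the leading '?', or empty
    }

-- Rebuild the URL a request was parsed from, omitting default ports.
unparse :: SimpleRequest -> String
unparse r = concat
    [ "http"
    , if reqSecure r then "s" else ""
    , "://"
    , reqHost r
    , case (reqSecure r, reqPort r) of
        (True, 443) -> ""           -- default https port: omit
        (False, 80) -> ""           -- default http port: omit
        (_, p)      -> ':' : show p
    , reqPath r
    , reqQuery r
    ]
```

Unlike the patch, this sketch folds the query string (with its '?') into a single field, so a request with no query produces no trailing '?'.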
Re: [Haskell-cafe] Contributing to http-conduit
Replies are inline. Thanks for the quick and thoughtful response!

On Sat, Jan 21, 2012 at 8:56 AM, Michael Snoyman mich...@snoyman.com wrote:

Hi Myles,

These sound like two solid features, and I'd be happy to merge in code to support it. Some comments below.

On Sat, Jan 21, 2012 at 8:38 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote:

To: Michael Snoyman, author and maintainer of http-conduit
CC: haskell-cafe

Hello! I am interested in contributing to the http-conduit library. I've been using it for a little while and reading through its source, but have felt that it could be improved with two features:

- Allowing the caller to know the final URL that ultimately resulted in the HTTP Source. Because httpRaw is not exported, the caller can't even re-implement the redirect-following code themselves. Ideally, the caller would be able to know not only the final URL, but also the entire chain of URLs that led to the final request. I was thinking that it would be even cooler if the caller could be notified of these redirects as they happen in another thread. There are a couple ways to implement this that I have been thinking about:
- A straightforward way would be to add a [W.Ascii] to the type of Response, and getResponse can fill in this extra field. getResponse already knows about the Request so it can tell if the response should be gunzipped.

What would be in the [W.Ascii], a list of all paths redirected to? Also, I'm not sure what gunzipping has to do with here, can you clarify?

Yes; my idea was to make the [W.Ascii] represent the list of all URLs redirected to, in order. My comment about gunzipping is only tangentially related. I meant that in the latest version of the code on GitHub, the getResponse function already takes a Request as an argument. This means that the getResponse function already knows what URL its data is coming from, so modifying the getResponse function to return that URL is simple.
(I mentioned gunzip because, as far as I can tell, the reason that getResponse *already* takes a Request is so that the function can tell if the request should be gunzipped.)

- It would be nice for the caller to be able to know in real time what URLs the request is being redirected to. A possible way to do this would be for the 'http' function to take an extra argument of type (Maybe (Control.Concurrent.Chan W.Ascii)) which httpRaw can push URLs into. If the caller doesn't want to use this variable, they can simply pass Nothing. Otherwise, the caller can create an IO thread which reads the Chan until some termination condition is met (Perhaps this will change the type of the extra argument to (Maybe (Chan (Maybe W.Ascii)))).

I like this solution, though I can see how it could be considered too heavyweight.

I do think it's too heavyweight. I think if people really want lower-level control of the redirects, they should turn off automatic redirect and allow 3xx responses.

Yeah, that totally makes more sense. As it stands, however, httpRaw isn't exported, so a caller has no way of knowing about each individual HTTP transaction. Exporting httpRaw solves the problem I'm trying to solve. If we export httpRaw, should we *also* make 'http' return the URL chain? Doing both is probably the best solution, IMHO.

- Making the redirection aware of cookies. There are redirects around the web where the first URL returns a Set-Cookie header and a 3xx code which redirects to another site that expects the cookie that the first HTTP transaction set. I propose to add an (IORef to a Data.Set of Cookies) to the Manager datatype, letting the Manager act as a cookie store as well as a repository of available TCP connections. httpRaw could deal with the cookie store. Network.HTTP.Types does not declare a Cookie datatype, so I would probably be adding one. I would probably take it directly from Network.HTTP.Cookie.

Actually, we already have the cookie package for this.
I'm not sure if putting the cookie store in the manager is necessarily the right approach, since I can imagine wanting to have separate sessions while reusing the same connections. A different approach could be adding a list of Cookies to both the Request and Response.

Ah, looks like you're the maintainer of that package as well! I didn't realize it existed. I should have, though; Yesod must need to know about cookies somehow.

As the http-conduit package stands, the headers of the original Request can be set, and the headers of the last Response can be read. Because cookies are implemented on top of headers, the caller knows about the cookies before and after the redirection chain. I'm more interested in the preservation of cookies *within* the redirection chain. As discussed earlier, exposing the httpRaw function allows the entire redirection chain
Re: [Haskell-cafe] Contributing to http-conduit
1. Oops - I overlooked the fact that the redirectCount attribute of a Request is exported (it isn't listed on the documentation at http://hackage.haskell.org/packages/archive/http-conduit/1.2.0/doc/html/Network-HTTP-Conduit.html probably because the constructor itself isn't exported. This seems like a flaw in Haddock...). Silly me. No need to export httpRaw.

2. I think that stuffing many arguments into the 'http' function is ugly. However, I'm not sure that the number of arguments to 'http' could ever reach an unreasonably large amount. Perhaps I have bad foresight, but I personally feel that adding cookies to the http request will be the last thing that we will need to add. Putting a bound on this growth of arguments makes me more willing to think about this option. On the other hand, using a BrowserAction to modify internal state is very elegant. Which approach do you think is best? I think I'm leaning toward the upper-level Browser module idea.

If there was to be a higher-level HTTP library, I would argue that the redirection code should be moved into it, and the only high-level function that the Network.HTTP.Conduit module would export is 'http' (or httpRaw). What do you think about this?

Thanks for helping me out with this,
Myles C. Maxfield

On Sun, Jan 22, 2012 at 9:56 PM, Michael Snoyman mich...@snoyman.com wrote:

On Sun, Jan 22, 2012 at 11:07 PM, Myles C. Maxfield myles.maxfi...@gmail.com wrote:

Replies are inline. Thanks for the quick and thoughtful response!

On Sat, Jan 21, 2012 at 8:56 AM, Michael Snoyman mich...@snoyman.com wrote:

Hi Myles,

These sound like two solid features, and I'd be happy to merge in code to support it. Some comments below.

On Sat, Jan 21, 2012 at 8:38 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote:

To: Michael Snoyman, author and maintainer of http-conduit
CC: haskell-cafe

Hello! I am interested in contributing to the http-conduit library.
I've been using it for a little while and reading through its source, but have felt that it could be improved with two features:

Allowing the caller to know the final URL that ultimately resulted in the HTTP Source. Because httpRaw is not exported, the caller can't even re-implement the redirect-following code themselves. Ideally, the caller would be able to know not only the final URL, but also the entire chain of URLs that led to the final request. I was thinking that it would be even cooler if the caller could be notified of these redirects as they happen in another thread. There are a couple ways to implement this that I have been thinking about:

A straightforward way would be to add a [W.Ascii] to the type of Response, and getResponse can fill in this extra field. getResponse already knows about the Request so it can tell if the response should be gunzipped.

What would be in the [W.Ascii], a list of all paths redirected to? Also, I'm not sure what gunzipping has to do with here, can you clarify?

Yes; my idea was to make the [W.Ascii] represent the list of all URLs redirected to, in order. My comment about gunzipping is only tangentially related. I meant that in the latest version of the code on GitHub, the getResponse function already takes a Request as an argument. This means that the getResponse function already knows what URL its data is coming from, so modifying the getResponse function to return that URL is simple. (I mentioned gunzip because, as far as I can tell, the reason that getResponse already takes a Request is so that the function can tell if the request should be gunzipped.)

It would be nice for the caller to be able to know in real time what URLs the request is being redirected to. A possible way to do this would be for the 'http' function to take an extra argument of type (Maybe (Control.Concurrent.Chan W.Ascii)) which httpRaw can push URLs into. If the caller doesn't want to use this variable, they can simply pass Nothing.
Otherwise, the caller can create an IO thread which reads the Chan until some termination condition is met (Perhaps this will change the type of the extra argument to (Maybe (Chan (Maybe W.Ascii)))).

I like this solution, though I can see how it could be considered too heavyweight.

I do think it's too heavyweight. I think if people really want lower-level control of the redirects, they should turn off automatic redirect and allow 3xx responses.

Yeah, that totally makes more sense. As it stands, however, httpRaw isn't exported, so a caller has no way of knowing about each individual HTTP transaction. Exporting httpRaw solves the problem I'm trying to solve. If we export httpRaw, should we also make 'http' return the URL chain? Doing both is probably the best solution, IMHO.

What's the difference between calling httpRaw and calling http with redirections turned off?

Making the redirection aware of cookies
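Since the thread keeps returning to what following (or not following) redirects means, here is a tiny pure model of the redirect loop being discussed. All the types are hypothetical; the real loop is the `go` helper inside http-conduit's `http`, which throws TooManyRedirects.

```haskell
-- A server is modeled as a pure function from a URL to an outcome:
-- either a redirect to another URL, or a final response body.
data Outcome = Redirect String | Done String

-- Follow redirects up to a limit, accumulating the chain of URLs
-- visited. With a limit of 0, the first redirect already fails,
-- which is one way to model "no redirects allowed".
follow :: Int -> (String -> Outcome) -> String -> Either String ([String], String)
follow limit server = go 0 []
  where
    go n chain url = case server url of
        Done body -> Right (reverse (url : chain), body)
        Redirect next
            | n >= limit -> Left "TooManyRedirects"
            | otherwise  -> go (n + 1) (url : chain) next
```

The accumulated first component is exactly the "URL chain" Myles wants `http` to return alongside the final body.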
Re: [Haskell-cafe] Contributing to http-conduit
Alright, that sounds good to me. I'll get started on it (the IORef idea). Thanks for the insight!

--Myles

On Sun, Jan 22, 2012 at 10:42 PM, Michael Snoyman mich...@snoyman.com wrote:

On Mon, Jan 23, 2012 at 8:31 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote:

1. Oops - I overlooked the fact that the redirectCount attribute of a Request is exported (it isn't listed on the documentation probably because the constructor itself isn't exported. This seems like a flaw in Haddock...). Silly me. No need to export httpRaw.

2. I think that stuffing many arguments into the 'http' function is ugly. However, I'm not sure that the number of arguments to 'http' could ever reach an unreasonably large amount. Perhaps I have bad foresight, but I personally feel that adding cookies to the http request will be the last thing that we will need to add. Putting a bound on this growth of arguments

I completely disagree here. If we'd followed this approach, rawBody, decompress, redirectCount, and checkStatus all would have been arguments. There's a reason we use a settings data type[1] here.

[1] http://www.yesodweb.com/blog/2011/10/settings-types

makes me more willing to think about this option. On the other hand, using a BrowserAction to modify internal state is very elegant. Which approach do you think is best? I think I'm leaning toward the upper-level Browser module idea.

If there was to be a higher-level HTTP library, I would argue that the redirection code should be moved into it, and the only high-level function that the Network.HTTP.Conduit module would export is 'http' (or httpRaw). What do you think about this?

I actually don't want to move the redirection code out from where it is right now. I think that redirection *is* a basic part of HTTP. I'd be more in favor of just bundling cookies in with the current API, possibly with the IORef approach I'd mentioned (unless someone wants to give a different idea).
Having a single API that provides both high-level and low-level approaches seems like a win to me.

Michael

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
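The settings-type pattern Michael cites boils down to an exported record of defaults that callers override with record-update syntax, so new options can be added without breaking existing call sites. A minimal sketch (field names are illustrative, not http-conduit's actual fields):

```haskell
-- Illustrative settings record in the style of the yesodweb
-- "settings types" post; not http-conduit's real API.
data HttpSettings = HttpSettings
    { redirectLimit  :: Int   -- maximum redirects to follow
    , decompressBody :: Bool  -- gunzip compressed responses?
    }

-- The library exports sensible defaults...
defaultSettings :: HttpSettings
defaultSettings = HttpSettings
    { redirectLimit  = 10
    , decompressBody = True
    }

-- ...and callers override only the fields they care about. Adding a
-- cookie-related field later would not break this call site.
noRedirects :: HttpSettings
noRedirects = defaultSettings { redirectLimit = 0 }
```

This is why the constructor stays unexported: callers must go through defaultSettings, which is what keeps future field additions backward compatible.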
Re: [Haskell-cafe] Contributing to http-conduit
I'm a little confused as to what you mean by 'cookie handling'. Do you mean cookies being set inside redirects for future requests inside the same redirect chain, or users being able to supply cookies to the first HTTP request and pull them out of the last HTTP response?

Clearly, making the original request specify 0 cookies is (will be) trivial. It is up to the caller to determine if he/she wants to pull cookies out of the last server response.

As for cookies getting set inside a redirect chain - I believe that the Internet is 'broken' without this. I believe a client which does not set cookies inside a redirect chain is a misbehaving client. Are you suggesting that we have a 'do not obey cookies inside a redirection chain, instead always blindly send this arbitrary (possibly empty) set of cookies' setting? That's fine with me, but we should at least put a big disclaimer around that option saying that its use leads to technically misbehaving client behavior.

Comments?

--Myles

On Sun, Jan 22, 2012 at 11:16 PM, Michael Snoyman mich...@snoyman.com wrote:

The only times cookies would be used would be:

1. If you explicitly use it.
2. If you have redirects turned on, and a page that redirects you also sets a cookie.

I would think that we would want (2) to be on regardless of user setting, do you disagree?

Michael

On Mon, Jan 23, 2012 at 8:46 AM, Aristid Breitkreuz arist...@googlemail.com wrote:

Just make sure Cookie handling can be disabled completely.

Aristid

On 23.01.2012 at 07:44, Michael Snoyman mich...@snoyman.com wrote:

On Mon, Jan 23, 2012 at 8:31 AM, Myles C. Maxfield myles.maxfi...@gmail.com wrote:

1. Oops - I overlooked the fact that the redirectCount attribute of a Request is exported (it isn't listed on the documentation probably because the constructor itself isn't exported. This seems like a flaw in Haddock...). Silly me. No need to export httpRaw.

2. I think that stuffing many arguments into the 'http' function is ugly.
However, I'm not sure that the number of arguments to 'http' could ever reach an unreasonably large amount. Perhaps I have bad foresight, but I personally feel that adding cookies to the http request will be the last thing that we will need to add. Putting a bound on this growth of arguments

I completely disagree here. If we'd followed this approach, rawBody, decompress, redirectCount, and checkStatus all would have been arguments. There's a reason we use a settings data type[1] here.

[1] http://www.yesodweb.com/blog/2011/10/settings-types

makes me more willing to think about this option. On the other hand, using a BrowserAction to modify internal state is very elegant. Which approach do you think is best? I think I'm leaning toward the upper-level Browser module idea.

If there was to be a higher-level HTTP library, I would argue that the redirection code should be moved into it, and the only high-level function that the Network.HTTP.Conduit module would export is 'http' (or httpRaw). What do you think about this?

I actually don't want to move the redirection code out from where it is right now. I think that redirection *is* a basic part of HTTP. I'd be more in favor of just bundling cookies in with the current API, possibly with the IORef approach I'd mentioned (unless someone wants to give a different idea). Having a single API that provides both high-level and low-level approaches seems like a win to me.

Michael
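For concreteness, the IORef-based cookie store being discussed could look something like the following sketch. All names here are hypothetical, and a Map from cookie name to value stands in for a real cookie type (which would also carry domain, path, and expiry):

```haskell
import Data.IORef
import qualified Data.Map as Map

-- Hypothetical cookie store: a mutable map from cookie name to value.
-- In the real proposal this IORef would live inside the Manager.
type CookieJar = IORef (Map.Map String String)

newCookieJar :: IO CookieJar
newCookieJar = newIORef Map.empty

-- Record a (name, value) pair from a Set-Cookie header seen on a
-- response partway through a redirect chain.
storeCookie :: CookieJar -> (String, String) -> IO ()
storeCookie jar (k, v) = modifyIORef jar (Map.insert k v)

-- Cookies to attach to the next request in the redirect chain.
currentCookies :: CookieJar -> IO [(String, String)]
currentCookies jar = fmap Map.toList (readIORef jar)
```

Michael's session objection is visible even in this sketch: if the jar lives in the Manager, two logical sessions sharing one Manager would also share cookies, whereas a jar passed per request (or per Request/Response, as he suggests) would not.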
[Haskell-cafe] Contributing to http-conduit
To: Michael Snoyman, author and maintainer of http-conduit
CC: haskell-cafe

Hello! I am interested in contributing to the http-conduit library. I've been using it for a little while and reading through its source, but have felt that it could be improved with two features:

- Allowing the caller to know the final URL that ultimately resulted in the HTTP Source. Because httpRaw is not exported, the caller can't even re-implement the redirect-following code themselves. Ideally, the caller would be able to know not only the final URL, but also the entire chain of URLs that led to the final request. I was thinking that it would be even cooler if the caller could be notified of these redirects as they happen in another thread. There are a couple ways to implement this that I have been thinking about:
- A straightforward way would be to add a [W.Ascii] to the type of Response, and getResponse can fill in this extra field. getResponse already knows about the Request so it can tell if the response should be gunzipped.
- It would be nice for the caller to be able to know in real time what URLs the request is being redirected to. A possible way to do this would be for the 'http' function to take an extra argument of type (Maybe (Control.Concurrent.Chan W.Ascii)) which httpRaw can push URLs into. If the caller doesn't want to use this variable, they can simply pass Nothing. Otherwise, the caller can create an IO thread which reads the Chan until some termination condition is met (Perhaps this will change the type of the extra argument to (Maybe (Chan (Maybe W.Ascii)))). I like this solution, though I can see how it could be considered too heavyweight.
- Making the redirection aware of cookies. There are redirects around the web where the first URL returns a Set-Cookie header and a 3xx code which redirects to another site that expects the cookie that the first HTTP transaction set.
I propose to add an (IORef to a Data.Set of Cookies) to the Manager datatype, letting the Manager act as a cookie store as well as a repository of available TCP connections. httpRaw could deal with the cookie store. Network.HTTP.Types does not declare a Cookie datatype, so I would probably be adding one. I would probably take it directly from Network.HTTP.Cookie.

I'd be happy to do both of these things, but I'm hoping for your input on how to go about this endeavor. Are these features even good to be pursuing? Should I be going about this entirely differently?

Thanks,
Myles C. Maxfield

P.S. I'm curious about the lack of Network.URI throughout Network.HTTP.Conduit. Is there a particular design decision that led you to use raw ascii strings?
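The Chan-based notification idea from the proposal above, in miniature: the redirect follower pushes each URL it visits into the channel and signals the end of the chain with Nothing, exactly the (Maybe (Chan (Maybe W.Ascii))) shape suggested. This is a sketch with String standing in for W.Ascii, not http-conduit code:

```haskell
import Control.Concurrent.Chan

-- Simulated redirect follower: announce each URL visited, then
-- signal the end of the chain with the Nothing terminator.
followWithLog :: Chan (Maybe String) -> [String] -> IO ()
followWithLog chan urls = do
    mapM_ (writeChan chan . Just) urls
    writeChan chan Nothing

-- Consumer: drain the channel until the Nothing terminator arrives.
-- In the proposal this would run in a separate forkIO'd thread while
-- the request is in flight.
collectUrls :: Chan (Maybe String) -> IO [String]
collectUrls chan = do
    m <- readChan chan
    case m of
        Nothing  -> return []
        Just url -> fmap (url :) (collectUrls chan)
```

Because Chan is unbounded, the follower never blocks on a slow consumer, which is part of why the thread considered this design heavyweight but workable.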
[Haskell-cafe] Network.Browser and Network.TLS
Hello! I am interested in extending the Network.HTTP code in the HTTP package to support HTTPS via TLS. A clear candidate is to use the Network.TLS module in the TLS library (because its TLS logic is written in pure Haskell, rather than any of the FFI libraries like Network.Curl or the OpenSSL package).

It's simple enough to provide an implementation of the Network.Stream.Stream typeclass around a TLSCtx, and this works for the Network.HTTP.Stream functions. However, I am interested in using the functionality in the Network.Browser module. This module uses the Network.HTTP.HandleStream interface, which is implemented directly on top of the Handle datatype and the Network.BufferType.BufferOp functions. HandleStreams seem to be for allowing the user to pull out an arbitrary data type out of an HTTP stream, not for doing any stream processing the way TLS does. As far as I can tell, the current typeclass system does not allow a TLSCtx to piggyback off of a HandleStream.

My assumption is that this interface is used for speed so the user doesn't have to convert some canonical type into the type that he/she desires in client code. TLS, however, must use a specific type to decode the bytes that it pulls out of the stream. I don't think it's reasonable to try to modify the TLS library to decode bytes from an arbitrary type. Decoding necessarily needs byte-level access to its input (and therefore output) streams in a manner extremely similar to the functions that ByteString provides. Perhaps I'm wrong about this, but the conclusion I've reached is that it doesn't make sense for TLS to use an arbitrary typeclass because the interface it requires is so similar to the existing ByteString datatype. If an application wants a specific type out of a TLS stream, it must necessarily convert the type in software. Any speed that might be gained by pulling your native type out of a network connection will be dwarfed anyway by the cost of decryption.
The Network.Stream functions allow this by using the String type for all data transfers (which is counterintuitive for binary data transfers). An implementation of Network.Stream.Stream using TLS would convert TLS's output ByteString into a String (possibly by doing something like ((map (toEnum . fromIntegral)) . unpack), which doesn't make a whole lot of sense and is fairly wasteful). A client program might even convert it back to a ByteString, so the client program must have knowledge about how the bytes are packed into the String.

Network.Browser only seems to have one function which isn't simply a state accessor/mutator: 'request'. This function gives the connection a type of HStream ty => HandleStream ty. As stated before, the HandleStream directly uses the Handle type. This means that, as far as I can tell, there is no way to fit TLS into the Network.Browser module as it stands because the types don't allow for it. Supporting TLS in Network.Browser would have to change the type of 'request' and therefore break every program out there which uses the Network.Browser module. It would be possible to create something like a 'requestHTTPS' function which returns a different type, but this is quite inelegant - there should be one function that inspects the scheme of the URI it is handed.

I am left with the conclusion that it is impossible to support TLS in Network.Browser without breaking many Haskell programs. It is obviously possible to fork the HTTP library, but I'd like to improve the state of the existing libraries. Likewise, it is possible to create a new module that supports HTTPS but has different typed functions and lots of code duplication with Network.Browser, but that is quite inelegant.

I suppose this is mostly directed at the maintainers of the HTTP and TLS libraries, Sigbjorn Finne and Vincent Hanquez, but I'd be grateful for your input on what I can do to contribute to the Haskell community regarding Network.Browser and Network.TLS.
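To make the criticized conversion concrete, here it is spelled out over [Word8] instead of ByteString, so the sketch needs no extra packages. Every byte round-trips, but the resulting String is just bytes widened to Chars in the 0-255 range, not decoded text, which is exactly the "knowledge about how the bytes are packed" burden described above:

```haskell
import Data.Word (Word8)
import Data.Char (chr, ord)

-- The ((map (toEnum . fromIntegral)) . unpack) trick from the message
-- above, with [Word8] in place of ByteString: widen each byte to a
-- Char. No decoding happens; code points stay in 0..255.
bytesToString :: [Word8] -> String
bytesToString = map (chr . fromIntegral)

-- The reverse direction a client would need in order to recover the
-- raw bytes from such a String.
stringToBytes :: String -> [Word8]
stringToBytes = map (fromIntegral . ord)
```

The round trip is lossless, but each 1-byte Word8 is carried as a multi-byte boxed Char, which is the waste the message complains about.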
Perhaps I should just use the Network.HTTP.Enumerator module and not deal with Network.Browser? Maybe I'm going about this in entirely the wrong way.

Thanks,
Myles C. Maxfield