Re: [Haskell-cafe] Database connection pool

2010-05-07 Thread Michael Snoyman
On Fri, May 7, 2010 at 1:02 AM, Bas van Dijk v.dijk@gmail.com wrote:

 On Thu, May 6, 2010 at 11:54 PM, Bas van Dijk v.dijk@gmail.com
 wrote:
  On Thu, May 6, 2010 at 7:48 PM, Bas van Dijk v.dijk@gmail.com
 wrote:
  On Thu, May 6, 2010 at 3:24 PM, Michael Snoyman mich...@snoyman.com
 wrote:
 
 
  On Thu, May 6, 2010 at 9:13 AM, Bryan O'Sullivan b...@serpentine.com
 wrote:
 
  On Wed, May 5, 2010 at 10:51 PM, Michael Snoyman mich...@snoyman.com
 
  wrote:
 
  * When a connection is released, is goes to the end of the pool, so
  connections get used evenly (not sure if this actually matters in
 practice).
 
  In practice, you're better off letting idle connections stay that way,
  because then your DB server can close connections and free up
 resources. In
  other words, when you're done with a connection, put it at the front
 of the
  reuse queue, not the back.
  You'll also want to handle the possibility that a connection that you
 grab
  from the pool has been closed by the server. Many connection pooling
  implementations I've seen get this wrong in subtle or expensive ways.
 
  Thanks for the feedback. I've gone ahead and implemented a simple
 resource
  pool module. Since I need it to work with monad transformer stacks,
 I've
  built it on top of MonadCatchIO-transformers. I've put the code up in a
 gist
  on github[1]. I would appreciate if anyone could review this,
 especially to
  make sure the exception handling code is correct. block and unblock in
  particular concern me.
  Thanks,
  Michael
  [1] http://gist.github.com/392078
 
  I also have a suggestion for your design. (Note however that I don't
  have much experience with resource pools.)
 
  In your current design a Pool has a fixed maximum number of opened
  resources. I can imagine situations where the maximum number of opened
  resources can change dynamically. For example due to plugging in (or
  out) a new blade server at run-time which will increase (or decrease)
  the maximum number of resources that can be handled.
 
  So what about changing:
 
  createPool :: IO a - Int - IO (Pool a)
  to:
  createPool :: IO (Maybe a) - IO (Pool a)
 
  so, instead of statically storing the maximum number of  opened
  resources (Int), the resource creation function will decide itself
  when it has created enough (Maybe a).
 
  Regards,
 
  Bas
 
 
  How about something like this:
 
 
 
  {-# LANGUAGE NoImplicitPrelude #-} -- (I like to be explicit)
 
  module Pool (Pool, new, withPool) where
 
  import Data.Function   ( ($), (.) )
  import Data.Maybe  ( Maybe(Nothing,Just), maybe )
  import Data.Functor( ($) )
  import Control.Monad   ( return, (=), (), (=), fail,
  join, liftM )
  import Control.Monad.STM   ( atomically )
  import Control.Concurrent.STM.TVar ( TVar, newTVarIO, readTVar, writeTVar
 )
  import Control.Monad.CatchIO   ( MonadCatchIO, block, finally )
  import Control.Monad.IO.Class  ( liftIO )
 
  newtype Pool r = Pool (TVar [r])
 
  new :: MonadCatchIO m = m (Pool r)
  new = liftIO $ Pool $ newTVarIO []
 
  withPool :: MonadCatchIO m = Pool r - m (Maybe r) - (r - m a) - m
 (Maybe a)
  withPool (Pool tv) mk f = block $ join $ liftIO $ atomically $ do
   rrs - readTVar tv
   case rrs of
 [] - return $ mk = maybe (return Nothing) with
 r:rs - writeTVar tv rs  return (with r)
 where
   with r = liftM Just (f r)
 `finally`
   liftIO (atomically $ writeTVar tv . (r:) = readTVar
 tv)
 
 
 
  Note that I don't store the resource creation action (m (Maybe r))
  inside the pool. It's just passed as an argument to withPool.
 
  Regards,
 
  Bas
 

 Note that it's probably better to pass the resource creation action as
 the first argument to withPool:

 withPool :: MonadCatchIO m = m (Maybe r) - Pool r - (r - m a) - m
 (Maybe a)

 This way it's easier to create specialized withPool functions by
 partially applying a specific resource creation action to withPool as
 in:

 withDBConsPool :: MonadCatchIO m = Pool DBCon - (DBCon - m a) - m
 (Maybe a)
 withDBConsPool = withPool connectWithDB

 Regards,

 Bas


Bas,

Thank you for all the very thorough comments. If I'm understanding
correctly, there are two categories of suggestion:

1) Make the resource exhaustion mechanism more extensible.
2) Avoid wormholes

Please tell me if I've missed something.

Regarding (1), I think your approach is definitely better for complex pools;
however, for the usually case, I think it would present a more difficult API
for users. I could definitely imagine wrapping an easier-to-use interface
around your final example.

Regarding (2), I was not aware of it, thank you for updating me on the
issue.

So, here's my idea of how to wrap your Pool module to provide a simple

Re: [Haskell-cafe] Database connection pool

2010-05-06 Thread Bryan O'Sullivan
On Wed, May 5, 2010 at 10:51 PM, Michael Snoyman mich...@snoyman.comwrote:

 * When a connection is released, is goes to the end of the pool, so
 connections get used evenly (not sure if this actually matters in practice).


In practice, you're better off letting idle connections stay that way,
because then your DB server can close connections and free up resources. In
other words, when you're done with a connection, put it at the front of the
reuse queue, not the back.

You'll also want to handle the possibility that a connection that you grab
from the pool has been closed by the server. Many connection pooling
implementations I've seen get this wrong in subtle or expensive ways.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Database connection pool

2010-05-06 Thread Michael Snoyman
On Thu, May 6, 2010 at 9:13 AM, Bryan O'Sullivan b...@serpentine.com wrote:

 On Wed, May 5, 2010 at 10:51 PM, Michael Snoyman mich...@snoyman.comwrote:

 * When a connection is released, is goes to the end of the pool, so
 connections get used evenly (not sure if this actually matters in practice).


 In practice, you're better off letting idle connections stay that way,
 because then your DB server can close connections and free up resources. In
 other words, when you're done with a connection, put it at the front of the
 reuse queue, not the back.

 You'll also want to handle the possibility that a connection that you grab
 from the pool has been closed by the server. Many connection pooling
 implementations I've seen get this wrong in subtle or expensive ways.


Thanks for the feedback. I've gone ahead and implemented a simple resource
pool module. Since I need it to work with monad transformer stacks, I've
built it on top of MonadCatchIO-transformers. I've put the code up in a gist
on github[1]. I would appreciate if anyone could review this, especially to
make sure the exception handling code is correct. block and unblock in
particular concern me.

Thanks,
Michael

[1] http://gist.github.com/392078
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Database connection pool

2010-05-06 Thread Bas van Dijk
On Thu, May 6, 2010 at 3:24 PM, Michael Snoyman mich...@snoyman.com wrote:


 On Thu, May 6, 2010 at 9:13 AM, Bryan O'Sullivan b...@serpentine.com wrote:

 On Wed, May 5, 2010 at 10:51 PM, Michael Snoyman mich...@snoyman.com
 wrote:

 * When a connection is released, is goes to the end of the pool, so
 connections get used evenly (not sure if this actually matters in practice).

 In practice, you're better off letting idle connections stay that way,
 because then your DB server can close connections and free up resources. In
 other words, when you're done with a connection, put it at the front of the
 reuse queue, not the back.
 You'll also want to handle the possibility that a connection that you grab
 from the pool has been closed by the server. Many connection pooling
 implementations I've seen get this wrong in subtle or expensive ways.

 Thanks for the feedback. I've gone ahead and implemented a simple resource
 pool module. Since I need it to work with monad transformer stacks, I've
 built it on top of MonadCatchIO-transformers. I've put the code up in a gist
 on github[1]. I would appreciate if anyone could review this, especially to
 make sure the exception handling code is correct. block and unblock in
 particular concern me.
 Thanks,
 Michael
 [1] http://gist.github.com/392078

In 'withPool' you correctly call 'block' to block (or mask)
asynchronous exceptions so that when you remove a resource from the
pool you can be sure that it will be added to the pool in the end,
even if an asynchronous exception is thrown to you in the middle of
your computation.

So it looks like your code is safe. However 'withPool'  contains, what
I call, an asynchronous exception wormhole.

'insertResource' calls 'modifyMVar_' which calls 'takeMVar' which will
block when the MVar is taken by another thread executing 'withPool'.
Now any operation which may itself block is, so called,
interruptible.  Interruptible means that even in the scope of a
'block' the operation will unblock (or unmask) asynchronous
exceptions! See [1] for details.

Note that I recently discussed this problem[2] on the libraries list
and also note that Simon Marlow is working on a solution.

I would advise you to replace the MVars with TVars which don't suffer
from asynchronous exception wormholes.

Regards,

Bas

[1] 
http://haskell.org/ghc/docs/6.12.1/html/libraries/base-4.2.0.0/Control-Exception.html#13
[2] http://hackage.haskell.org/trac/ghc/ticket/4035
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Database connection pool

2010-05-06 Thread Bas van Dijk
On Thu, May 6, 2010 at 3:24 PM, Michael Snoyman mich...@snoyman.com wrote:


 On Thu, May 6, 2010 at 9:13 AM, Bryan O'Sullivan b...@serpentine.com wrote:

 On Wed, May 5, 2010 at 10:51 PM, Michael Snoyman mich...@snoyman.com
 wrote:

 * When a connection is released, is goes to the end of the pool, so
 connections get used evenly (not sure if this actually matters in practice).

 In practice, you're better off letting idle connections stay that way,
 because then your DB server can close connections and free up resources. In
 other words, when you're done with a connection, put it at the front of the
 reuse queue, not the back.
 You'll also want to handle the possibility that a connection that you grab
 from the pool has been closed by the server. Many connection pooling
 implementations I've seen get this wrong in subtle or expensive ways.

 Thanks for the feedback. I've gone ahead and implemented a simple resource
 pool module. Since I need it to work with monad transformer stacks, I've
 built it on top of MonadCatchIO-transformers. I've put the code up in a gist
 on github[1]. I would appreciate if anyone could review this, especially to
 make sure the exception handling code is correct. block and unblock in
 particular concern me.
 Thanks,
 Michael
 [1] http://gist.github.com/392078

I also have a suggestion for your design. (Note however that I don't
have much experience with resource pools.)

In your current design a Pool has a fixed maximum number of opened
resources. I can imagine situations where the maximum number of opened
resources can change dynamically. For example due to plugging in (or
out) a new blade server at run-time which will increase (or decrease)
the maximum number of resources that can be handled.

So what about changing:

createPool :: IO a - Int - IO (Pool a)
to:
createPool :: IO (Maybe a) - IO (Pool a)

so, instead of statically storing the maximum number of  opened
resources (Int), the resource creation function will decide itself
when it has created enough (Maybe a).

Regards,

Bas
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Database connection pool

2010-05-06 Thread Bas van Dijk
On Thu, May 6, 2010 at 7:48 PM, Bas van Dijk v.dijk@gmail.com wrote:
 On Thu, May 6, 2010 at 3:24 PM, Michael Snoyman mich...@snoyman.com wrote:


 On Thu, May 6, 2010 at 9:13 AM, Bryan O'Sullivan b...@serpentine.com wrote:

 On Wed, May 5, 2010 at 10:51 PM, Michael Snoyman mich...@snoyman.com
 wrote:

 * When a connection is released, is goes to the end of the pool, so
 connections get used evenly (not sure if this actually matters in 
 practice).

 In practice, you're better off letting idle connections stay that way,
 because then your DB server can close connections and free up resources. In
 other words, when you're done with a connection, put it at the front of the
 reuse queue, not the back.
 You'll also want to handle the possibility that a connection that you grab
 from the pool has been closed by the server. Many connection pooling
 implementations I've seen get this wrong in subtle or expensive ways.

 Thanks for the feedback. I've gone ahead and implemented a simple resource
 pool module. Since I need it to work with monad transformer stacks, I've
 built it on top of MonadCatchIO-transformers. I've put the code up in a gist
 on github[1]. I would appreciate if anyone could review this, especially to
 make sure the exception handling code is correct. block and unblock in
 particular concern me.
 Thanks,
 Michael
 [1] http://gist.github.com/392078

 I also have a suggestion for your design. (Note however that I don't
 have much experience with resource pools.)

 In your current design a Pool has a fixed maximum number of opened
 resources. I can imagine situations where the maximum number of opened
 resources can change dynamically. For example due to plugging in (or
 out) a new blade server at run-time which will increase (or decrease)
 the maximum number of resources that can be handled.

 So what about changing:

 createPool :: IO a - Int - IO (Pool a)
 to:
 createPool :: IO (Maybe a) - IO (Pool a)

 so, instead of statically storing the maximum number of  opened
 resources (Int), the resource creation function will decide itself
 when it has created enough (Maybe a).

 Regards,

 Bas


How about something like this:


{-# LANGUAGE NoImplicitPrelude #-} -- (I like to be explicit)

module Pool (Pool, new, withPool) where

import Data.Function   ( ($), (.) )
import Data.Maybe  ( Maybe(Nothing,Just), maybe )
import Data.Functor( ($) )
import Control.Monad   ( return, (=), (), (=), fail,
join, liftM )
import Control.Monad.STM   ( atomically )
import Control.Concurrent.STM.TVar ( TVar, newTVarIO, readTVar, writeTVar )
import Control.Monad.CatchIO   ( MonadCatchIO, block, finally )
import Control.Monad.IO.Class  ( liftIO )

newtype Pool r = Pool (TVar [r])

new :: MonadCatchIO m = m (Pool r)
new = liftIO $ Pool $ newTVarIO []

withPool :: MonadCatchIO m = Pool r - m (Maybe r) - (r - m a) - m (Maybe a)
withPool (Pool tv) mk f = block $ join $ liftIO $ atomically $ do
  rrs - readTVar tv
  case rrs of
[] - return $ mk = maybe (return Nothing) with
r:rs - writeTVar tv rs  return (with r)
where
  with r = liftM Just (f r)
`finally`
  liftIO (atomically $ writeTVar tv . (r:) = readTVar tv)


Note that I don't store the resource creation action (m (Maybe r))
inside the pool. It's just passed as an argument to withPool.

Regards,

Bas
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Database connection pool

2010-05-06 Thread Bas van Dijk
On Thu, May 6, 2010 at 11:54 PM, Bas van Dijk v.dijk@gmail.com wrote:
 On Thu, May 6, 2010 at 7:48 PM, Bas van Dijk v.dijk@gmail.com wrote:
 On Thu, May 6, 2010 at 3:24 PM, Michael Snoyman mich...@snoyman.com wrote:


 On Thu, May 6, 2010 at 9:13 AM, Bryan O'Sullivan b...@serpentine.com 
 wrote:

 On Wed, May 5, 2010 at 10:51 PM, Michael Snoyman mich...@snoyman.com
 wrote:

 * When a connection is released, is goes to the end of the pool, so
 connections get used evenly (not sure if this actually matters in 
 practice).

 In practice, you're better off letting idle connections stay that way,
 because then your DB server can close connections and free up resources. In
 other words, when you're done with a connection, put it at the front of the
 reuse queue, not the back.
 You'll also want to handle the possibility that a connection that you grab
 from the pool has been closed by the server. Many connection pooling
 implementations I've seen get this wrong in subtle or expensive ways.

 Thanks for the feedback. I've gone ahead and implemented a simple resource
 pool module. Since I need it to work with monad transformer stacks, I've
 built it on top of MonadCatchIO-transformers. I've put the code up in a gist
 on github[1]. I would appreciate if anyone could review this, especially to
 make sure the exception handling code is correct. block and unblock in
 particular concern me.
 Thanks,
 Michael
 [1] http://gist.github.com/392078

 I also have a suggestion for your design. (Note however that I don't
 have much experience with resource pools.)

 In your current design a Pool has a fixed maximum number of opened
 resources. I can imagine situations where the maximum number of opened
 resources can change dynamically. For example due to plugging in (or
 out) a new blade server at run-time which will increase (or decrease)
 the maximum number of resources that can be handled.

 So what about changing:

 createPool :: IO a - Int - IO (Pool a)
 to:
 createPool :: IO (Maybe a) - IO (Pool a)

 so, instead of statically storing the maximum number of  opened
 resources (Int), the resource creation function will decide itself
 when it has created enough (Maybe a).

 Regards,

 Bas


 How about something like this:

 
 {-# LANGUAGE NoImplicitPrelude #-} -- (I like to be explicit)

 module Pool (Pool, new, withPool) where

 import Data.Function               ( ($), (.) )
 import Data.Maybe                  ( Maybe(Nothing,Just), maybe )
 import Data.Functor                ( ($) )
 import Control.Monad               ( return, (=), (), (=), fail,
 join, liftM )
 import Control.Monad.STM           ( atomically )
 import Control.Concurrent.STM.TVar ( TVar, newTVarIO, readTVar, writeTVar )
 import Control.Monad.CatchIO       ( MonadCatchIO, block, finally )
 import Control.Monad.IO.Class      ( liftIO )

 newtype Pool r = Pool (TVar [r])

 new :: MonadCatchIO m = m (Pool r)
 new = liftIO $ Pool $ newTVarIO []

 withPool :: MonadCatchIO m = Pool r - m (Maybe r) - (r - m a) - m (Maybe 
 a)
 withPool (Pool tv) mk f = block $ join $ liftIO $ atomically $ do
  rrs - readTVar tv
  case rrs of
    [] - return $ mk = maybe (return Nothing) with
    r:rs - writeTVar tv rs  return (with r)
    where
      with r = liftM Just (f r)
                `finally`
                  liftIO (atomically $ writeTVar tv . (r:) = readTVar tv)
 

 Note that I don't store the resource creation action (m (Maybe r))
 inside the pool. It's just passed as an argument to withPool.

 Regards,

 Bas


Note that it's probably better to pass the resource creation action as
the first argument to withPool:

withPool :: MonadCatchIO m = m (Maybe r) - Pool r - (r - m a) - m (Maybe a)

This way it's easier to create specialized withPool functions by
partially applying a specific resource creation action to withPool as
in:

withDBConsPool :: MonadCatchIO m = Pool DBCon - (DBCon - m a) - m (Maybe a)
withDBConsPool = withPool connectWithDB

Regards,

Bas
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe