Re: [Haskell] thread-local variables

2006-08-08 Thread Frederik Eaton
Hi Simon,

It is good that you support thread-local variables.

I have initialized a wiki page:

http://haskell.org/haskellwiki/Thread_local_storage

The main difference between my and your proposals, as I see it, is
that your proposal is based on keys which can be used for other
things.

I think that leads to an interface which is less natural. In my
proposal, the IOParam type is quite similar to an IORef - it has a
user-specified initial state, and the internal implementation is
hidden from the user - yours differs in both of these aspects.

 * I agree with Robert that a key issue is initialisation.  Maybe it
 should be possible to associate an initialiser with a key.  I have not
 thought this out.

I still don't understand this, so it is not mentioned on the wiki.

 *  A key issue is this: when forking a thread, does the new thread
 inherit the current thread's bindings, or does it get a
 freshly-initialised set.  Sometimes you want one, sometimes the other,
 alas.

I think the inheritance semantics are more useful and also more
general: If I wanted a freshly-initialized set of bindings, and I only
had inheritance semantics, then I could start a thread early on when
all the bindings are in their initial state, and have this thread read
actions from a channel and execute them in sub-threads of itself, and
implement a 'fork' variant based on this. More generally, I could do
the same thing from a sub-thread of the main thread - I could start a
thread with any set of bindings, and use it to launch other threads
with those bindings. In this way, the initial set of bindings is not
specially privileged over intermediate sets of bindings.
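
A minimal sketch of the launcher-thread idea above, in plain Concurrent Haskell. GHC has no thread-local bindings, so the inheritance itself is hypothetical here: under the proposed semantics, every thread forked by the launcher would inherit whatever bindings the launcher had when it was started. The names newLauncher and demo are illustrative assumptions, not part of either proposal.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forever, void)

-- | Start a launcher thread now, while the bindings are in the state we
-- want to preserve. The returned function is a 'fork' variant: it hands
-- the action to the launcher, which runs it in a sub-thread of itself.
-- (Hypothetical name; with inherited TLS the sub-thread would see the
-- launcher's bindings rather than the caller's.)
newLauncher :: IO (IO () -> IO ())
newLauncher = do
  requests <- newChan
  _ <- forkIO . forever $ do
    action <- readChan requests
    void (forkIO action)
  pure (writeChan requests)

-- | Fork one action through the launcher and wait for its result.
demo :: IO Int
demo = do
  forkFromLauncher <- newLauncher
  result <- newEmptyMVar
  forkFromLauncher (putMVar result 42)
  takeMVar result
```

Starting the launcher at program start gives the "freshly-initialised" behaviour; starting it from any later thread gives a fork that reproduces that thread's bindings instead.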

 On the GHC front, we're going to be busy with 6.6 etc until after ICFP,
 so nothing is going to happen fast -- which gives an opportunity to
 discuss it.  However it's just infeasible for the community at large to
 follow a long email thread like this one. My suggestion would be for the
 interested parties to proceed somewhat as we did with packages.
 (http://hackage.haskell.org/trac/ghc/wiki/GhcPackages)

I have put a page on the wiki summarizing the thread. However, I want
to say that I think that email is a better medium for most ongoing
discussions. (I'm not sure if I may have suggested the opposite
earlier) For those who are not interested in the discussion, it should
be easy in most mail readers to ignore or hide a long thread, or to
skip to the very end of it to get a rough idea of where things stand. 
I think it is a good idea to have proposals on a wiki, though, so that
the product of all agreed-upon amendments and alterations can be
easily referred to.

When discussions happen on a wiki, though, they often take the same
threaded form as email discussions (see Wikipedia) - but, they are
seen by fewer interested people, and the interface is clumsier (for
instance, I can subscribe to email notification when a wiki page
changes - thanks to whomever finally made this possible on
haskell.org, by the way - but I have to read the updated version to
figure out whether the modification was replying to me or another
poster; whereas my mail reader clearly flags messages where I appear
in the recipients list).

Frederik

-- 
http://ofb.net/~frederik/
___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: [Haskell] thread-local variables

2006-08-08 Thread Einar Karttunen
On 07.08 13:16, Frederik Eaton wrote:
  How would this work together with the FFI?
 
 It wouldn't, at least I wouldn't care if it didn't.


Suddenly breaking libraries that happen to use FFI behind your
back does not seem like a good conservative extension.

I think we should move the discussion to the wiki as Simon
suggested. I can create a wikipage if you don't want to.

 What about my example:
 
 newMain host environment program_args
 network_config locale terminal_settings
 stdin stdout stderr = do
 ...
 
 Now, let's see. We might want two threads to have the same network
 configuration, but a different view of the filesystem; or the same
 view of the filesystem, but a different set of environment variables;
 or the same environment, but different command line arguments. All
 three cases are pretty common in practice. We might also want to have
 the same arguments but different IO handles - as in a multi-threaded
 server application.

This won't be pretty even with TLS. Our fancy app will probably mix
in STM and pass callback actions to the thread processing
packets coming directly from the network interface. Quickly
the TLS approach seems problematic - we need to know what actions
depend on each other and how.

 And the part that implements the filesystem might want to access the
 network (if there is a network filesystem). And the part that starts
 processes with an environment might want to access the filesystem, for
 instance to read the code for the process and for shared libraries;
 and maybe it also wants to get the hostname from the network layer. 
 And the part that starts programs with arguments might want to access
 the environment (for instance, to get the current locale), as well as
 the filesystem (for instance, to read locale configuration files). And
 the part that accesses the IO handles might also want to access not
 just the program arguments but the environment, and the filesystem,
 and the network.

So we have the following dependencies:


FileSystem  -> Network
Environment -> FileSystem, Network
Arguments   -> Environment
IO Handles  -> Arguments,Environment,FS,Network

With TLS every one of them has type IO. Now the programmer is supposed
to know that he has to configure the network before using program
arguments? So a programmer first wanting to process command line
arguments and only then configuring network will probably have
hidden bugs.

It becomes very hard to know what different components depend on.

Even if we had to define all those instances that would be
1+2+1+3 = 7 instance declarations. Not 5^2 = 25 instances.
Or use small wrapper combinators (which I prefer).

btw how would the TLS solution elegantly handle that I'd like
separate network configurations for e.g.
IO Handle -> Network(socket) and
IO Handle -> FileSystem(NFS) -> Network
?

 So here is an example where we have nested layers, and each layer
 accesses most of the layers below it.

And this will cause problems. A good API should not encourage
going to the lower levels directly. If the lowest level changes
then with your design one has to make O(layers) changes instead of
O(1) if the layers are not available directly.

If one of the layers adds a new dependency then making sure it is
initialized and used correctly seems very hard to check.

 If we started with a library that dealt with OS devices such as the
 network, and used a special monad for that; and then if we built upon
 that a layer for keeping track of environment variables, with another
 monad; and then a layer for invoking executables with arguments; and
 then a layer for IO; all with monads - then we would have a good
 modular, extensible design, which, due to the interactions between
 layers, would, in Haskell, require code length which is quadratic in
 the number of layers.

The trick here is that most components should not talk with each
other. Composition and encapsulation are the keys to victory.

 (Of course, it's true that in real operating systems, each of these
 layers has its own set of interfaces to the other layers - so the
 monadic approach is actually not more verbose. But the point is that
 it's a reasonable design, with layers, and where each layer uses each
 of the ones below it. I want to write code which is designed the same
 way, but without the overhead)

Yes, the size of the code is dependent on the size of the API.
Making things explicit is more infrastructure at the start,
but makes things easier later on when they have to be changed.

 If you move it somewhere else, but forget to move the thread-local
 variables it refers to, then you'll get a compiler error.

I was meaning forgetting to initialize it - not omitting the whole
definition.

 db2 <- getIOParam db2Param
 withIOParam dbParam db2 $ ...

And one needs to make sure that the ... part does not need the
other database connection(s). Makes composing things hard.

 I'm still not sure I understand why thread pools are necessary, by the
 way. I thought forking 

Re: [Haskell] thread-local variables

2006-08-08 Thread Frederik Eaton
On Tue, Aug 08, 2006 at 04:21:06PM +0300, Einar Karttunen wrote:
 On 07.08 13:16, Frederik Eaton wrote:
   How would this work together with the FFI?
  
  It wouldn't, at least I wouldn't care if it didn't.
 
 Suddenly breaking libraries that happen to use FFI behind your
 back does not seem like a good conservative extension.

FFI already doesn't mix well with GHC's IO handles. What if I write to
file descriptor 1 before all data in stdout has been flushed? Is that
a reason not to allow FFI?

 I think we should move the discussion to the wiki as Simon
 suggested. I can create a wikipage if you don't want to.

http://haskell.org/haskellwiki/Thread_local_storage

I think the wiki is a good place for proposals, but not most
discussion.

  What about my example:
  
  newMain host environment program_args
  network_config locale terminal_settings
  stdin stdout stderr = do
  ...
  
  Now, let's see. We might want two threads to have the same network
  configuration, but a different view of the filesystem; or the same
  view of the filesystem, but a different set of environment variables;
  or the same environment, but different command line arguments. All
  three cases are pretty common in practice. We might also want to have
  the same arguments but different IO handles - as in a multi-threaded
  server application.
 
 This won't be pretty even with TLS. Our fancy app will probably mix
 in STM and pass callback actions to the thread processing
 packets coming directly from the network interface. Quickly
 the TLS approach seems problematic - we need to know what actions
 depend on each other and how.

I don't understand. Does TLS make such design harder or easier?

  And the part that implements the filesystem might want to access the
  network (if there is a network filesystem). And the part that starts
  processes with an environment might want to access the filesystem, for
  instance to read the code for the process and for shared libraries;
  and maybe it also wants to get the hostname from the network layer. 
  And the part that starts programs with arguments might want to access
  the environment (for instance, to get the current locale), as well as
  the filesystem (for instance, to read locale configuration files). And
  the part that accesses the IO handles might also want to access not
  just the program arguments but the environment, and the filesystem,
  and the network.
 
 So we have the following dependencies:
 
 
 FileSystem  -> Network
 Environment -> FileSystem, Network
 Arguments   -> Environment
and Filesystem
 IO Handles  -> Arguments,Environment,FS,Network
 
 With TLS every one of them has type IO. Now the programmer is supposed
 to know that he has to configure the network before using program
 arguments? So a programmer first wanting to process command line
 arguments and only then configuring network will probably have
 hidden bugs.

The running example is an example of an executable starting in an
operating system. So everything is already configured by the time it
starts, as you know.

My application will be no different - for instance, the
database-related parameter will be set; then a request thread will
start, and after parsing the request, a user-id parameter will be set,
and then the request-processing functions will be called. There is no
reason for the main server thread to call any of the
request-processing functions, because it doesn't have a request to
process.

 It becomes very hard to know what different components depend on.
 
 Even if we had to define all those instances that would be
 1+2+1+3 = 7 instance declarations. Not 5^2 = 25 instances.
 Or use small wrapper combinators (which I prefer).

O(x) doesn't mean the same as x.

 btw how would the TLS solution elegantly handle that I'd like
 separate network configurations for e.g.
 IO Handle -> Network(socket) and
 IO Handle -> FileSystem(NFS) -> Network
 ?

The filesystem could send its actions to be executed in a separate
thread, which has its own configuration?

  So here is an example where we have nested layers, and each layer
  accesses most of the layers below it.
 
 And this will cause problems. A good API should not encourage
 going to the lower levels directly. If the lowest level changes
 then with your design one has to make O(layers) changes instead of
 O(1) if the layers are not available directly.

No, you just write a compatibility wrapper over the new
implementation.

 If one of the layers adds a new dependency then making sure it is
 initialized and used correctly seems very hard to check.

I disagree.

  If we started with a library that dealt with OS devices such as the
  network, and used a special monad for that; and then if we built upon
  that a layer for keeping track of environment variables, with another
  monad; and then a layer for invoking executables with arguments; and
  then a layer for IO; all with monads - then we would have a good
  modular, extensible design, which, due to the 

Re: [Haskell] thread-local variables

2006-08-08 Thread Frederik Eaton
 Furthermore, can we move this thread from the Haskell mailing list
 (which should not have heavy traffic) to either Haskell-Café, or
 the libraries list?

Sure, moving to haskell-cafe.

Frederik

-- 
http://ofb.net/~frederik/


[Haskell-cafe] RE: [Haskell] thread-local variables

2006-08-08 Thread Simon Peyton-Jones
| I have initialized a wiki page:
| 
| http://haskell.org/haskellwiki/Thread_local_storage

Great

| I have put a page on the wiki summarizing the thread. However, I want
| to say that I think that email is a better medium for most ongoing
| discussions. 

I agree.  
Discussion by email
Outcomes on Wiki (including outcomes recording differences of viewpoint)

The goal is that the outcome is a comprehensible summary of the outcome of the 
discussion, for the benefit of the many who will not follow the evolving 
debate.  


Furthermore, can we move this thread from the Haskell mailing list (which 
should not have heavy traffic) to either Haskell-Café, or the libraries list?

Simon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: [Haskell] thread-local variables

2006-08-08 Thread Frederik Eaton
 Furthermore, can we move this thread from the Haskell mailing list
 (which should not have heavy traffic) to either Haskell-Café, or
 the libraries list?

Sure, moving to haskell-cafe.

Frederik

-- 
http://ofb.net/~frederik/


Re: [Haskell] thread-local variables

2006-08-07 Thread Frederik Eaton
On Sun, Aug 06, 2006 at 01:36:15PM +0300, Einar Karttunen wrote:
 On 06.08 02:41, Frederik Eaton wrote:
  Also, note that my proposal differs in that thread local variables are
  not writable, but can only be changed by calling (e.g. in my API)
  'withIOParam'. This is still just as general, because an IORef can be
  stored in a thread-local variable, but it makes it easier to reason
  about the more common use case where TLS is used to make IO a Reader;
  and it makes it easier to share modifiable state across more than one
  thread. I.e. if modifiable state is stored as 'IOParam (IORef a)' then
  the default is for the stored 'IORef a' to be shared across all
  threads; it can only be changed locally for a specified action and
  any sub-threads using 'withIOParam'; and if some library I use decides
  to fork a thread behind the scenes, it won't change my program's
  behavior.
 
 Perhaps a function like this would solve all our problems:
 
 -- | Tie all TLS references in the IO action to the current
 -- environment rather than the environment in which it will
 -- actually be executed.
 tieToCurrentTLS :: IO a -> IO (IO a)

Our problems? :) Well, it should be easy to implement. I think it's
a good idea.

  I think it is a good idea to have stdin, cwd, etc. be thread-local.
 
 How would this work together with the FFI?

It wouldn't, at least I wouldn't care if it didn't.

  I don't understand why the 'TL' monad is necessary, but I haven't read
  the proposal very carefully.
 
 The TL monad is necessary to make initialization order problems go
 away.

That's what it seemed like the intended purpose was, but I don't see
any initialization order problems in my proposal.

 On 05.08 19:56, Frederik Eaton wrote:
  That doesn't answer the question: What if my application has a need
  for several different sets of parameters - what if it doesn't make
  sense to combine them into a single monad? What if there are 'n'
  layers? Is it incorrect to say that the monadic approach requires code
  size O(n^2)?
 
 Well designed monadic approach does not require O(n^2). But if you
 want to design code in a way that requires O(n^2) code size you
 can do it.
 
 Parallel layers require O(layers).
 Nested layers hiding the lower layer need O(layers).
 
 This is not a problem in practice and makes refactoring very easy.

Is that true? I would be very careful when making generalizations
about all software design.

What about my example:

newMain host environment program_args
network_config locale terminal_settings
stdin stdout stderr = do
...

Now, let's see. We might want two threads to have the same network
configuration, but a different view of the filesystem; or the same
view of the filesystem, but a different set of environment variables;
or the same environment, but different command line arguments. All
three cases are pretty common in practice. We might also want to have
the same arguments but different IO handles - as in a multi-threaded
server application.

And the part that implements the filesystem might want to access the
network (if there is a network filesystem). And the part that starts
processes with an environment might want to access the filesystem, for
instance to read the code for the process and for shared libraries;
and maybe it also wants to get the hostname from the network layer. 
And the part that starts programs with arguments might want to access
the environment (for instance, to get the current locale), as well as
the filesystem (for instance, to read locale configuration files). And
the part that accesses the IO handles might also want to access not
just the program arguments but the environment, and the filesystem,
and the network.

So here is an example where we have nested layers, and each layer
accesses most of the layers below it.

kernel (networking, devices)
filesystem
linker
libc
application

If we started with a library that dealt with OS devices such as the
network, and used a special monad for that; and then if we built upon
that a layer for keeping track of environment variables, with another
monad; and then a layer for invoking executables with arguments; and
then a layer for IO; all with monads - then we would have a good
modular, extensible design, which, due to the interactions between
layers, would, in Haskell, require code length which is quadratic in
the number of layers.

(Of course, it's true that in real operating systems, each of these
layers has its own set of interfaces to the other layers - so the
monadic approach is actually not more verbose. But the point is that
it's a reasonable design, with layers, and where each layer uses each
of the ones below it. I want to write code which is designed the same
way, but without the overhead)

   And don't have any static guarantees that you have done all the proper
   initialization calls before you use them.
  
  Well, there are a lot of things I don't have static guarantees for. 
  For instance, sometimes I call the function 

RE: [Haskell] thread-local variables

2006-08-07 Thread Simon Peyton-Jones
| On Sat, Aug 05, 2006 at 02:18:58PM -0400, Robert Dockins wrote:
|  Sorry to jump into this thread so late.  However,  I'd like to take a moment
|  to remind everyone that some time ago I put a concrete proposal for
|  thread-local variables on the table.
| 
|  http://article.gmane.org/gmane.comp.lang.haskell.cafe/11010

I'm cautious about jumping into this swampy topic, but here are some
thoughts.

* The thoughts that Simon and I were considering about thread-local state
are quite close to Robert's proposal.  For myself, I am somewhat
persuaded that some form of implicitly-passed state in the IO monad
(without explicit parameters) is useful.   Examples I often think of are
- Allocating unique identifiers
- Making random numbers
- Where stdin and stdout should go
In all of these cases, a form of dynamic binding is just what we want:
send stdout to the current thread's stdout, use the current thread's
random number seed, etc.  

* There's no need to connect it to *state*.  The key top-level thing you
need is to allocate what Adrian Hey calls a thing with identity.
http://www.haskell.org/hawiki/GlobalMutableState.
I'll call it a key.  For example, rather than a 'threadlocal'
declaration, one might just have:

newkey foo :: Key Int

where 'newkey' is the keyword; this declares a new key with type (Key Int),
distinct from all other keys.

Now you can imagine that the IO monad could provide operations
withBinding :: Key a -> a -> IO b -> IO b
lookupBinding :: Key a -> IO a

very much like the dynamic-binding primitives that have popped up on
this thread.

* If you want *state*, you can have a (Key (IORef Int)).  Now you look
up the binding to get an IORef (or MVar, whatever you like) and you can
mutate that at will.  So this separates a thread-local *environment*
from thread-local *state*.

* Keys may be useful for purposes other than withBinding and
thread-local state.  One would also want to dynamically create new keys:
newKey :: IO (Key a)

* I agree with Robert that a key issue is initialisation.  Maybe it
should be possible to associate an initialiser with a key.  I have not
thought this out.

*  A key issue is this: when forking a thread, does the new thread
inherit the current thread's bindings, or does it get a
freshly-initialised set.  Sometimes you want one, sometimes the other,
alas.
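
A rough sketch of how the key operations above could be approximated in today's GHC, under loudly stated assumptions: each key carries an initial value (one possible answer to the initialisation question) and a private table of per-thread bindings indexed by ThreadId, standing in for real runtime support. The name newKeyWithDefault is hypothetical; withBinding and lookupBinding follow the signatures given above. In this sketch a forked thread sees the initial value rather than inheriting its parent's binding, which is only one of the two behaviours discussed.

```haskell
import Control.Concurrent (ThreadId, myThreadId)
import Control.Exception (bracket_)
import Data.IORef (IORef, atomicModifyIORef', modifyIORef', newIORef, readIORef)
import qualified Data.Map as Map

-- | A key holds its initial value and a table of per-thread bindings.
-- (Assumed representation, not the proposed implementation.)
data Key a = Key a (IORef (Map.Map ThreadId a))

-- | Hypothetical constructor that associates an initial value with the key.
newKeyWithDefault :: a -> IO (Key a)
newKeyWithDefault initial = Key initial <$> newIORef Map.empty

-- | Look up the calling thread's binding, falling back to the initial value.
lookupBinding :: Key a -> IO a
lookupBinding (Key initial table) = do
  tid <- myThreadId
  Map.findWithDefault initial tid <$> readIORef table

-- | Bind the key for the duration of the action, in the calling thread only,
-- and restore the previous binding afterwards (even on exceptions).
withBinding :: Key a -> a -> IO b -> IO b
withBinding (Key _ table) value action = do
  tid <- myThreadId
  previous <- Map.lookup tid <$> readIORef table
  let install binding =
        atomicModifyIORef' table (\m -> (Map.alter (const binding) tid m, ()))
  bracket_ (install (Just value)) (install previous) action

-- | Environment only: the binding is visible inside 'withBinding' and gone after.
demo :: IO (Int, Int, Int)
demo = do
  key <- newKeyWithDefault 0
  outer <- lookupBinding key
  inner <- withBinding key 5 (lookupBinding key)
  after <- lookupBinding key
  pure (outer, inner, after)

-- | State on top of the environment: a Key (IORef Int) that is mutated at will.
counterDemo :: IO Int
counterDemo = do
  key <- newKeyWithDefault =<< newIORef (0 :: Int)
  ref <- lookupBinding key
  modifyIORef' ref (+ 1)
  readIORef =<< lookupBinding key
```

Because the table is indexed by the thread that called withBinding, an action handed to a pre-existing worker thread would not see the caller's binding, which is the pool-versus-fork concern raised elsewhere in the thread.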


On the GHC front, we're going to be busy with 6.6 etc until after ICFP,
so nothing is going to happen fast -- which gives an opportunity to
discuss it.  However it's just infeasible for the community at large to
follow a long email thread like this one. My suggestion would be for the
interested parties to proceed somewhat as we did with packages.
(http://hackage.haskell.org/trac/ghc/wiki/GhcPackages)

* Make a Wiki page to describe a concrete proposal. 
 
* Where there are design choices, describe them

* If there are competing proposals that are incompatible, 
   make more than one Wiki page. 

* But strive to evolve to one or two, so that the rest of us are not
  faced with 10 proposals from 10 people!

This could happen on the GHC wiki or the Haskell wiki, though the latter
seems more appropriate.

Simon



Re: [Haskell] thread-local variables

2006-08-06 Thread Einar Karttunen
On 06.08 04:23, Frederik Eaton wrote:
 I also forgot to mention that if you hold on to a ThreadId, it
 apparently causes the whole thread to be retained. Simon Marlow
 explained this on 2005/10/18:

Actually this problem does not exist in the code.
The problem is encountered if children are tied to their parents,
that is they contain the ThreadId of the parent thread. In my
code this problem should not occur.

- Einar Karttunen


Re: [Haskell] thread-local variables

2006-08-06 Thread Einar Karttunen
On 06.08 02:41, Frederik Eaton wrote:
 Also, note that my proposal differs in that thread local variables are
 not writable, but can only be changed by calling (e.g. in my API)
 'withIOParam'. This is still just as general, because an IORef can be
 stored in a thread-local variable, but it makes it easier to reason
 about the more common use case where TLS is used to make IO a Reader;
 and it makes it easier to share modifiable state across more than one
 thread. I.e. if modifiable state is stored as 'IOParam (IORef a)' then
 the default is for the stored 'IORef a' to be shared across all
 threads; it can only be changed locally for a specified action and
 any sub-threads using 'withIOParam'; and if some library I use decides
 to fork a thread behind the scenes, it won't change my program's
 behavior.

Perhaps a function like this would solve all our problems:

-- | Tie all TLS references in the IO action to the current
-- environment rather than the environment in which it will
-- actually be executed.
tieToCurrentTLS :: IO a -> IO (IO a)


 I think it is a good idea to have stdin, cwd, etc. be thread-local.

How would this work together with the FFI?

 I don't understand why the 'TL' monad is necessary, but I haven't read
 the proposal very carefully.

The TL monad is necessary to make initialization order problems go
away.

- Einar Karttunen


Re: [Haskell] thread-local variables

2006-08-06 Thread Einar Karttunen
On 05.08 19:56, Frederik Eaton wrote:
 That doesn't answer the question: What if my application has a need
 for several different sets of parameters - what if it doesn't make
 sense to combine them into a single monad? What if there are 'n'
 layers? Is it incorrect to say that the monadic approach requires code
 size O(n^2)?

Well designed monadic approach does not require O(n^2). But if you
want to design code in a way that requires O(n^2) code size you
can do it.

Parallel layers require O(layers).
Nested layers hiding the lower layer need O(layers).

This is not a problem in practice and makes refactoring very easy.


  And don't have any static guarantees that you have done all the proper
  initialization calls before you use them.
 
 Well, there are a lot of things I don't have static guarantees for. 
 For instance, sometimes I call the function 'head', and the compiler
 isn't able to verify that the argument isn't an empty list. If I
 initialize my TLS to 'undefined' then I'll get a similar error
 message, at run time. For another example, I don't use monadic regions
 when I do file IO. I can live with that.

The problem is with refactoring: taking a piece of code,
reusing it somewhere else, and trying to figure out what
it needs.

  ... Also if we have two pieces of the same per-thread state that we
  wish to use in one thread (e.g. db-connections) then the TLS
  approach becomes quite hard.
 
 No harder than the monadic approach, in my opinion.

In the monadic approach adding a second db connection would involve:
1) add a line to the state record
2) add a db2query = withPart db2 . flip query
3) no changes elsewhere

If the DB API uses a TLS parameter of type Proxy DBH how would
you implement this in a nice manner for the TLS case?
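
The three monadic steps listed above might look roughly like this. It is a sketch under assumptions: DB, query, AppState and the reader-style App type are hypothetical stand-ins, and withPart is guessed from its use in step 2 rather than taken from any real library.

```haskell
-- Hypothetical stand-in for a database handle; a real one would wrap a connection.
newtype DB = DB String

-- Hypothetical query function: it only reports which handle ran the query.
query :: DB -> String -> IO String
query (DB name) sql = pure (name ++ ": " ++ sql)

-- The state record. Step 1 is adding the 'db2' field.
data AppState = AppState
  { db1 :: DB
  , db2 :: DB
  }

-- A reader-style action: it receives the whole state record explicitly.
type App a = AppState -> IO a

-- Run an IO function against one selected part of the state (assumed shape).
withPart :: (AppState -> part) -> (part -> IO a) -> App a
withPart select use state = use (select state)

db1query :: String -> App String
db1query = withPart db1 . flip query

-- Step 2: the one new definition needed for the second connection.
db2query :: String -> App String
db2query = withPart db2 . flip query
```

Code that only calls db1query is untouched by the change (step 3), apart from the single place where the AppState value is constructed.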

 You've redefined 'fork'. If I want a library which works with other
 libraries, that will not be an option. The original purpose of my
 posting to this thread was to ask for two standard functions which
 would let me define thread-local variables in a way which is
 interoperable with other libraries, to the same extent as 'withArgs'
 and 'withProgName' are.

All libraries which may fork may use a preallocated thread pool.
Thus they might not work with TLS. withArgs and withProgName are
global and not very thread-friendly.

- Einar Karttunen


Re: [Haskell] thread-local variables

2006-08-06 Thread Taral

On 8/5/06, Frederik Eaton wrote:

Also, note that my proposal differs in that thread local variables are
not writable, but can only be changed by calling (e.g. in my API)
'withIOParam'.

[snip]

and if some library I use decides
to fork a thread behind the scenes, it won't change my program's
behavior.


Yes, but if it passes your action to another pre-existing thread, it will.

What people seem to want is dynamic scoping. Why not implement that
instead of messing around with yucky thread stuff?

data IODynamicRef a

getDynamicRef :: IODynamicRef a -> IO a
setDynamicRef :: IODynamicRef a -> a -> IO b -> IO b

--
Taral
You can't prove anything.
   -- Gödel's Incompetence Theorem


Re: [Haskell] thread-local variables

2006-08-05 Thread Frederik Eaton
  Maybe I'm misunderstanding your position - maybe you think that I
  should use lots of different processes to segregate global state into
  separate contexts? Well, that's nice, but I'd rather not. For
  instance, I'm writing a server - and it's just not efficient to use a
  separate process for each request. And there are some things such as
  database connections, current user id, log files, various profiling
  data, etc., that I would like to be thread-global but not
  process-global.
 
 I have done many servers in Haskell. Usually I have threads allocated
 to specific tasks rather than specific requests.
 
 What guarantees does your code have that all the relevant parameters
 are already initialized - and how can a user of the code know
 which TLS variables need to be initialized? 

You could ask the same questions about process-global state, couldn't
you?

 If it is documented maybe it could be done at the level of an
 implicit parameter?

Do you think implicit parameters are better than TLS?

  Or maybe you think that certain types of global state should be
  privileged - for instance, that all of the things which are arguments
  to 'newMain' above are OK to have as global state, but that anything
  else should be passed as function arguments, thus making
  thread-localization moot. I disagree with this - I am a proponent of
  extensibility, and think that the language should make as few things
  as possible built-in. I want to define my own application-specific
  global state, and, additionally, I want to have it thread-global, not
  process-global.
 
 This can cause much fun with the FFI. If we change e.g. stdout to
 thread specific what should we do before each foreign call? Same
 with the other things that are related to the OS process in question.
 
 A thread is a context of execution while a process is a context for
 resources. Would you like to have multiple Haskell processes inside
 one OS process?

If you want to think of it that way, then sure.

 I don't consider these very different:
 1) use one thread from a pre-allocated pool to do a task
 2) fork a new thread to do the task
 
 With TLS they are vastly different.

If you don't consider them different, then you can start using (2)
instead of (1).

  You asked for an example, but, because of the nature of this topic, it
  would have to be a very large example to prove my point. Thread-local
  variables are things that only become really useful in large programs. 
  Instead, I've asked you to put yourself in my shoes - what if the bits
  of context that you already take for granted in your programs had to
  be thread-local? How would you cope, without thread-local variables,
  in such a situation?
 
 I have been using an application specific monad (newtyped transformer) and
 a clean set of functions so that the implementation is not hardcoded
 and can be changed easily. Thus I haven't had the same difficulties
 as you.
 
 I don't think many of the process global resources would make sense
 on a per-thread basis and I am not against all global state.

You say many, but the question is whether there are any.

   But I would say that I think I would find having to know what thread
   a particular bit of code was running in in order to grok it very
   strange,
  
  I agree that it is important to have code which is easy to understand.
  
  Usually, functions run in the same thread as their caller, unless they
  are passed to something with the word 'fork' in the name. That's a
  good rule of thumb that is in fact sufficient to let you understand
  the code I write. Also, if that's too much to remember, then since I'm
  only proposing and using non-mutable thread-local state (i.e. it
  behaves like a MonadReader), and since I'm not passing actions between
  threads as Einar is, then you can forget about the 'fork' caveat.
 
 The only problem appears when someone uses two libraries, one written
 by me and another written by you, and wonders why their program is
 failing in mysterious ways.

Can you give the API for your library? I have a hard time imagining
how it could not be obvious that a thread pool is being used.

  I think the code would in fact be more difficult to grok, if all of
  the things which I want to be thread-local were instead passed around
  as parameters, a la 'newMain'. This is simply because, in that
  scenario, there would be much more code to read, and it would be very
  repetitive. If I used special monads for my state, then the situation
  would be only slightly better - a single monad would not suffice, and
  I'd be faced with a plethora of 'lift' functions and redefinitions of
  'catch', as well as long type signatures and a crowded namespace.
 
 As said before the monadic approach can be quite clean. I haven't used
 implicit parameters that much, so I won't comment on them.

Perhaps you can give an example? As I said, a single monad won't
suffice for me, because different libraries only know about different
parts of the state. 

Re: [Haskell] thread-local variables

2006-08-05 Thread Einar Karttunen
On 05.08 14:32, Frederik Eaton wrote:
  If it is documented maybe it could be done at the level of an
  implicit parameter?
 
 Do you think implicit parameters are better than TLS?


Implicit parameters are explicit and the type checker
guards that they are not undefined (and thus are safe
in the presence of callbacks). I haven't used implicit
parameters extensively because I prefer the monadic
approach.

  I don't consider these very different:
  1) use one thread from a pre-allocated pool to do a task
  2) fork a new thread to do the task
  
  With TLS they are vastly different.
 
 If you don't consider them different, then you can start using (2)
 instead of (1).

Performance reasons, or access to shared resources. Also 2) would
mean in many cases making currently local state global which is
not nice.

 Can you give the API for your library? I have a hard time imagining
 how it could not be obvious that a thread pool is being used.

e.g. various

withFooResource :: (Foo -> IO a) -> IO a

can use worker threads.
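
A minimal sketch of that point, assuming a placeholder Foo type and a
hypothetical implementation that forks one worker per call (a real library
might use a pool instead). The signature is the one given above; nothing in
it tells the caller which thread runs the callback.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, throwIO, try)

-- Placeholder resource type (an assumption; the thread never defines Foo).
newtype Foo = Foo String

-- Re-raise an exception captured in the worker, or return its result.
rethrow :: Either SomeException a -> IO a
rethrow = either throwIO pure

-- Hypothetical implementation: the callback runs on a freshly forked
-- worker thread while the caller blocks until the result is available.
withFooResource :: (Foo -> IO a) -> IO a
withFooResource callback = do
  result <- newEmptyMVar
  _ <- forkIO (try (callback (Foo "pooled resource")) >>= putMVar result)
  takeMVar result >>= rethrow

-- Report the caller's ThreadId and the ThreadId seen inside the callback.
callerAndCallbackThreads :: IO (ThreadId, ThreadId)
callerAndCallbackThreads = do
  caller <- myThreadId
  inCallback <- withFooResource (\_ -> myThreadId)
  pure (caller, inCallback)
```

The two ThreadIds differ, so any state keyed on the current thread would be
looked up under a different key inside the callback than in the caller.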

  As said before the monadic approach can be quite clean. I haven't used
  implicit parameters that much, so I won't comment on them.
 
 Perhaps you can give an example? As I said, a single monad won't
 suffice for me, because different libraries only know about different
 parts of the state. With TLS, one can delimit the scope of parameters
 by making the references to them module-internal, for instance.
 
 With monads, I imagine that I'll need for each parameter
 
 (1) a MonadX class, with a liftX member
 (2) a catchX function
 (3) a MonadY instance, for each wrapped monad Y (thus the number of
 such instances will be O(n^2) where n is the number of parameters)

That is usually the wrong approach. Newtype something like
StateT AppState IO. Use something like:

runWithPart :: (AppState -> c) -> (c -> IO a) -> AppM a

to define nice actions for different parts of the libraries.
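
One way this could look, as a sketch only: AppState's fields, runAppM, and
the two describe functions are invented for illustration, and StateT is
taken from the transformers package. Only the runWithPart signature and the
"newtype over StateT AppState IO" idea come from the message above.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.State (StateT, evalStateT, gets)

-- Hypothetical application state; each library needs only one field.
data AppState = AppState
  { dbName :: String
  , logTag :: String
  }

-- The application monad: a newtype over StateT AppState IO.
newtype AppM a = AppM (StateT AppState IO a)

instance Functor AppM where
  fmap f (AppM m) = AppM (fmap f m)

instance Applicative AppM where
  pure = AppM . pure
  AppM f <*> AppM x = AppM (f <*> x)

instance Monad AppM where
  AppM m >>= k = AppM (m >>= \a -> let AppM n = k a in n)

-- Run an AppM computation from a given initial state (hypothetical helper).
runAppM :: AppState -> AppM a -> IO a
runAppM st (AppM m) = evalStateT m st

-- Project one part of the state and hand it to a plain IO action.
runWithPart :: (AppState -> c) -> (c -> IO a) -> AppM a
runWithPart part action = AppM (gets part >>= liftIO . action)

-- Two "library" actions that each know only their own part of the state.
describeDb :: String -> IO String
describeDb name = pure ("db=" ++ name)

describeLog :: String -> IO String
describeLog tag = pure ("log=" ++ tag)

demo :: IO (String, String)
demo = runAppM (AppState "orders" "req-1") $ do
  d <- runWithPart dbName describeDb
  l <- runWithPart logTag describeLog
  pure (d, l)
```

The library functions stay in plain IO and never mention AppState; only the
application code that calls runWithPart knows how the parts fit together.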

Usually this is very easy if one uses combinators and high level
constructs and messier if it is hard to find the right combinators.

If you look at the various web frameworks in Haskell you will notice
that most of them live happily with one monad and don't suffer from
problems because of that.

 With TLS, I need
 
 (1) a declaration x = unsafePerformIO $ newIOParam ...

And don't have any static guarantees that you have done all the proper
initialization calls before you use them.

In the previous example we were using a lot of libraries using hidden
state. How do we guarantee that they have valid values in TLS?
Also if we have two pieces of the same per-thread state that
we wish to use in one thread (e.g. db-connections) then the TLS
approach becomes quite hard.

Here is a naive and dirty implementation. The largest problem is that
TypeRep is not in Ord. An alternative approach using Dynamic would be
possible, but I like the connection between the key and the
associated type.

http://www.cs.helsinki.fi/u/ekarttun/haskell/TLS/

Not optimized for performance at all.

- Einar Karttunen

___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: [Haskell] thread-local variables

2006-08-05 Thread Robert Dockins
Sorry to jump into this thread so late.  However,  I'd like to take a moment 
to remind everyone that some time ago I put a concrete proposal for 
thread-local variables on the table.

http://article.gmane.org/gmane.comp.lang.haskell.cafe/11010

I believe this proposal addresses the initialization issues that Einar has 
been discussing.  In my proposal, thread-local variables always have some 
defined value, and they obtain their values at well-defined points.

The linked message also gives several use cases that I felt motivated the
proposal.

--
Rob Dockins

Talk softly and drive a Sherman tank.
Laugh hard, it's a long way to the bank.
   -- TMBG


Re: [Haskell] thread-local variables

2006-08-05 Thread Frederik Eaton
   As said before the monadic approach can be quite clean. I haven't used
   implicit parameters that much, so I won't comment on them.
  
  Perhaps you can give an example? As I said, a single monad won't
  suffice for me, because different libraries only know about different
  parts of the state. With TLS, one can delimit the scope of parameters
  by making the references to them module-internal, for instance.
  
  With monads, I imagine that I'll need for each parameter
  
  (1) a MonadX class, with a liftX member
  (2) a catchX function
  (3) a MonadY instance, for each wrapped monad Y (thus the number of
  such instances will be O(n^2) where n is the number of parameters)
 
 That is usually the wrong approach. Newtype something like
 StateT AppState IO. Use something like:
 
 runWithPart :: (AppState -> c) -> (c -> IO a) -> AppM a
 
 to define nice actions for different parts of the libraries.
 
 Usually this is very easy if one uses combinators and high level
 constructs and messier if it is hard to find the right combinators.
 
 If you look at the various web frameworks in Haskell you will notice
 that most of them live happily with one monad and don't suffer from
 problems because of that.

That doesn't answer the question: What if my application has a need
for several different sets of parameters - what if it doesn't make
sense to combine them into a single monad? What if there are 'n'
layers? Is it incorrect to say that the monadic approach requires code
size O(n^2)?

  With TLS, I need
  
  (1) a declaration x = unsafePerformIO $ newIOParam ...
 
 And don't have any static guarantees that you have done all the proper
 initialization calls before you use them.

Well, there are a lot of things I don't have static guarantees for. 
For instance, sometimes I call the function 'head', and the compiler
isn't able to verify that the argument isn't an empty list. If I
initialize my TLS to 'undefined' then I'll get a similar error
message, at run time. For another example, I don't use monadic regions
when I do file IO. I can live with that.

 ... Also if we have two pieces of the same per-thread state that we
 wish to use in one thread (e.g. db-connections) then the TLS
 approach becomes quite hard.

No harder than the monadic approach, in my opinion.

 Here is a naive and dirty implementation. The largest problem is that
 TypeRep is not in  Ord. An alternative approach using Dynamic would be
 possible, but I like the connection between the key 
 and the associated type.
 
 http://www.cs.helsinki.fi/u/ekarttun/haskell/TLS/
 
 Not optimized for performance at all.

You've redefined 'fork'. If I want a library which works with other
libraries, that will not be an option. The original purpose of my
posting to this thread was to ask for two standard functions which
would let me define thread-local variables in a way which is
interoperable with other libraries, to the same extent as 'withArgs'
and 'withProgName' are.

Frederik

-- 
http://ofb.net/~frederik/


Re: [Haskell] thread-local variables

2006-08-05 Thread Frederik Eaton
Hi Robert,

I looked over your proposal.

I'm not sure if I'm in favor of introducing a new keyword. It seems
unnecessary.

Also, note that my proposal differs in that thread local variables are
not writable, but can only be changed by calling (e.g. in my API)
'withIOParam'. This is still just as general, because an IORef can be
stored in a thread-local variable, but it makes it easier to reason
about the more common use case where TLS is used to make IO a Reader;
and it makes it easier to share modifiable state across more than one
thread. I.e. if modifiable state is stored as 'IOParam (IORef a)' then
the default is for the stored 'IORef a' to be shared across all
threads; it can only be changed locally for a specified action and
any sub-threads using 'withIOParam'; and if some library I use decides
to fork a thread behind the scenes, it won't change my program's
behavior.
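
The read-only semantics described here can be approximated in user code, as
in the sketch below. This is an emulation under loud assumptions, not the
proposed runtime support: the names newIOParam and withIOParam come from
this thread, getIOParam is a hypothetical reader, and the per-thread table
keyed on ThreadId gives no inheritance to child threads, which is exactly
the part that would need support from forkIO.

```haskell
import Control.Concurrent (ThreadId, myThreadId)
import Control.Exception (bracket_)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map as Map

-- A read-only parameter: a default value plus per-thread overrides.
data IOParam a = IOParam a (IORef (Map.Map ThreadId a))

newIOParam :: a -> IO (IOParam a)
newIOParam def = IOParam def <$> newIORef Map.empty

-- Hypothetical reader: the calling thread's binding, else the default.
getIOParam :: IOParam a -> IO a
getIOParam (IOParam def ref) = do
  tid <- myThreadId
  Map.findWithDefault def tid <$> readIORef ref

-- Rebind the parameter for the duration of one action in this thread,
-- restoring the previous binding afterwards, even on exceptions.
withIOParam :: IOParam a -> a -> IO b -> IO b
withIOParam (IOParam _ ref) new action = do
  tid <- myThreadId
  old <- Map.lookup tid <$> readIORef ref
  let set v = atomicModifyIORef' ref (\m -> (Map.alter (const v) tid m, ()))
  bracket_ (set (Just new)) (set old) action

-- Value seen before, inside, and after a withIOParam block.
demo :: IO (Int, Int, Int)
demo = do
  param <- newIOParam 0
  before <- getIOParam param
  inside <- withIOParam param 42 (getIOParam param)
  after <- getIOParam param
  pure (before, inside, after)
```

Because there is no write operation, the only way to change what
getIOParam returns is a scoped withIOParam, which is what makes the
parameter behave like a Reader environment. Mutable shared state would be
stored as IOParam (IORef a), as described above.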

I think it is a good idea to have stdin, cwd, etc. be thread-local.

I don't understand why the 'TL' monad is necessary, but I haven't read
the proposal very carefully.

Best,

Frederik

On Sat, Aug 05, 2006 at 02:18:58PM -0400, Robert Dockins wrote:
 Sorry to jump into this thread so late.  However,  I'd like to take a moment 
 to remind everyone that some time ago I put a concrete proposal for 
 thread-local variables on the table.
 
 http://article.gmane.org/gmane.comp.lang.haskell.cafe/11010
 
 I believe this proposal addresses the initialization issues that Einar has 
 been discussing.  In my proposal, thread-local variables always have some 
 defined value, and they obtain their values at well-defined points.
 
 The linked message also gives several use cases that I felt motivated the
 proposal.
 
 --
 Rob Dockins
 
 Talk softly and drive a Sherman tank.
 Laugh hard, it's a long way to the bank.
-- TMBG
 

-- 
http://ofb.net/~frederik/


Re: [Haskell] thread-local variables

2006-08-05 Thread Frederik Eaton
  Here is a naive and dirty implementation. The largest problem is that
  TypeRep is not in  Ord. An alternative approach using Dynamic would be
  possible, but I like the connection between the key 
  and the associated type.
  
  http://www.cs.helsinki.fi/u/ekarttun/haskell/TLS/
  
  Not optimized for performance at all.
 
 You've redefined 'fork'. If I want a library which works with other
 libraries, that will not be an option. The original purpose of my
 posting to this thread was to ask for two standard functions which
 would let me define thread-local variables in a way which is
 interoperable with other libraries, to the same extent as 'withArgs'
 and 'withProgName' are.

I also forgot to mention that if you hold on to a ThreadId, it
apparently causes the whole thread to be retained. Simon Marlow
explained this on 2005/10/18:

m One could argue that getting the parent ThreadId is something that
m should be supported natively by forkIO, and I might be inclined to agree.
m Unfortunately there are some subtleties: currently a ThreadId is
m represented by a pointer to the thread itself, which causes the thread
m to be kept alive.  This has implications not only for space leaks, but
m also for reporting deadlock: if you have a ThreadId for a thread, you
m can send it an exception with throwTo at any time, and hence the runtime
m can never determine that the thread is deadlocked so it will never get
m the NonTermination exception.  Perhaps we need two kinds of ThreadId: a
m weak one for use in Maps, and a strong one that you can use with
m throwTo.  But then building a Map in which some elements can be garbage
m collected is a bit tricky (it can be done though; see our old Memo table
m implementation in fptools/hslibs/util/Memo.hs).

So this is another problem with your implementation, and another
reason why I want TLS support in the standard libraries.
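
For reference, later versions of GHC's base library provide something close
to the "weak" ThreadId suggested in the quoted message: mkWeakThreadId in
Control.Concurrent. The sketch below shows only the basic round trip; the
helper names forkWeak and weakRefResolves are hypothetical, and it is not a
complete weak-keyed map.

```haskell
import Control.Concurrent (ThreadId, forkIO, killThread, mkWeakThreadId)
import Control.Concurrent.MVar (newEmptyMVar, takeMVar)
import System.Mem.Weak (Weak, deRefWeak)

-- Fork a thread and also return a weak reference to its ThreadId.
-- A table that stores only the weak reference does not by itself keep
-- the thread alive.
forkWeak :: IO () -> IO (ThreadId, Weak ThreadId)
forkWeak action = do
  tid <- forkIO action
  weak <- mkWeakThreadId tid
  pure (tid, weak)

-- While the thread is still reachable, the weak reference resolves back
-- to the same ThreadId.
weakRefResolves :: IO Bool
weakRefResolves = do
  gate <- newEmptyMVar
  (tid, weak) <- forkWeak (takeMVar gate)
  recovered <- deRefWeak weak
  killThread tid
  pure (recovered == Just tid)
```

Once the thread has become unreachable and been collected, deRefWeak
returns Nothing, so a per-thread table keyed this way needs a cleanup path
for dead entries.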

Frederik

-- 
http://ofb.net/~frederik/


Re: [Haskell] thread-local variables

2006-08-04 Thread Frederik Eaton
 As for the subject under discussion (thread local state), I am
 personally sceptical about it. Why do we need it? Are we talking
 about safety or just convenience/API elegance? I've never
 encountered a situation where I've needed thread local state,
 (but this does not necessarily make it evil:-)

OK. What if all Haskell processes, all over the world, were made into
threads in the same large process? There are a lot of things that are
currently global state - as in, process-global - which would have to
become non-global in some way - pretty much all interaction with the
world: file IO, networking, command line arguments, system
environment, etc.

You, Einar, and others seem to be arguing that the only way to make
these things non-global should be to either make them explicit
arguments to functions, or to have them appear explicitly in the type
of the application's primary monad.

For instance, this simple program:

main :: IO ()
main = do
  putStrLn "Hello world"

might, in Adrian Hey and Einar Karttunen's world, become:

newMain host environment program_args
        network_config locale terminal_settings
        stdin stdout stderr = do
  hPutStrLn stdout (defaultEncoding locale) "Hello world"

Now, some people might find this second version delightfully explicit,
but I'd have doubts about whether such people are actually trying to
get things done, or whether they see the language as an end in itself. 
As for me, I prefer the first version - it saves reading and typing,
and is perfectly clear, and I have work to do.

Maybe I'm misunderstanding your position - maybe you think that I
should use lots of different processes to segregate global state into
separate contexts? Well, that's nice, but I'd rather not. For
instance, I'm writing a server - and it's just not efficient to use a
separate process for each request. And there are some things such as
database connections, current user id, log files, various profiling
data, etc., that I would like to be thread-global but not
process-global.

Or maybe you think that certain types of global state should be
privileged - for instance, that all of the things which are arguments
to 'newMain' above are OK to have as global state, but that anything
else should be passed as function arguments, thus making
thread-localization moot. I disagree with this - I am a proponent of
extensibility, and think that the language should make as few things
as possible built-in. I want to define my own application-specific
global state, and, additionally, I want to have it thread-global, not
process-global.

You asked for an example, but, because of the nature of this topic, it
would have to be a very large example to prove my point. Thread-local
variables are things that only become really useful in large programs. 
Instead, I've asked you to put yourself in my shoes - what if the bits
of context that you already take for granted in your programs had to
be thread-local? How would you cope, without thread-local variables,
in such a situation?

 But I would say that I think I would find having to know what thread
 a particular bit of code was running in in order to grok it very
 strange,

I agree that it is important to have code which is easy to understand.

Usually, functions run in the same thread as their caller, unless they
are passed to something with the word 'fork' in the name. That's a
good rule of thumb that is in fact sufficient to let you understand
the code I write. Also, if that's too much to remember, then since I'm
only proposing and using non-mutable thread-local state (i.e. it
behaves like a MonadReader), and since I'm not passing actions between
threads as Einar is, then you can forget about the 'fork' caveat.

I think the code would in fact be more difficult to grok, if all of
the things which I want to be thread-local were instead passed around
as parameters, a la 'newMain'. This is simply because, in that
scenario, there would be much more code to read, and it would be very
repetitive. If I used special monads for my state, then the situation
would be only slightly better - a single monad would not suffice, and
I'd be faced with a plethora of 'lift' functions and redefinitions of
'catch', as well as long type signatures and a crowded namespace.

 unless there was some obvious technical reason why the
 thread local state needed to be thread local (can't think of any
 such reason right now).

Some things are not immediately obvious. If you don't like to think of
reasons, then just take my word for it that it would help me. A
facility for thread-local variables would be just another of many
facilities that programmers could choose from when designing their
code. I'm not asking you to change the way you program - I don't care
how other people program. I trust them to know what is best for their
particular application. It's none of my business, anyway.

Since Simon Marlow said that he had been considering a thread-local
variable facility, I merely wanted to voice my support:


Re: [Haskell] thread-local variables

2006-08-04 Thread Einar Karttunen
On 04.08 17:29, Frederik Eaton wrote:
 might, in Adrian Hey and Einar Karttunen's world, become:
 
 newMain host environment program_args
         network_config locale terminal_settings
         stdin stdout stderr = do
   hPutStrLn stdout (defaultEncoding locale) "Hello world"

Actually I have implemented network-libraries, and I don't remember
them requiring such things ;-)

I think our main difference is that when designing concurrent
applications in Haskell I frequently use monadic actions as
callbacks invoked in distant unrelated threads. Threading
behind the API is seen by me as mostly an implementation issue
as long as the service guarantees don't change.

You seem to use threads in a much more constrained fashion (my own
interpretation) which results in us seeing TLS from very different
perspectives.

 Maybe I'm misunderstanding your position - maybe you think that I
 should use lots of different processes to segregate global state into
 separate contexts? Well, that's nice, but I'd rather not. For
 instance, I'm writing a server - and it's just not efficient to use a
 separate process for each request. And there are some things such as
 database connections, current user id, log files, various profiling
 data, etc., that I would like to be thread-global but not
 process-global.

I have done many servers in Haskell. Usually I have threads allocated
to specific tasks rather than specific requests.

What guarantees does your code have that all the relevant parameters
are already initialized - and how can a user of the code know
which TLS variables need to be initialized? If it is documented
maybe it could be done at the level of an implicit parameter?

 Or maybe you think that certain types of global state should be
 privileged - for instance, that all of the things which are arguments
 to 'newMain' above are OK to have as global state, but that anything
 else should be passed as function arguments, thus making
 thread-localization moot. I disagree with this - I am a proponent of
 extensibility, and think that the language should make as few things
 as possible built-in. I want to define my own application-specific
 global state, and, additionally, I want to have it thread-global, not
 process-global.

This can cause much fun with the FFI. If we change e.g. stdout to
thread specific what should we do before each foreign call? Same
with the other things that are related to the OS process in question.

A thread is a context of execution while a process is a context for
resources. Would you like to have multiple Haskell processes inside
one OS process?

I don't consider these very different:
1) use one thread from a pre-allocated pool to do a task
2) fork a new thread to do the task

With TLS they are vastly different.

 You asked for an example, but, because of the nature of this topic, it
 would have to be a very large example to prove my point. Thread-local
 variables are things that only become really useful in large programs. 
 Instead, I've asked you to put yourself in my shoes - what if the bits
 of context that you already take for granted in your programs had to
 be thread-local? How would you cope, without thread-local variables,
 in such a situation?

I have been using an application specific monad (newtyped transformer) and
a clean set of functions so that the implementation is not hardcoded
and can be changed easily. Thus I haven't had the same difficulties
as you.

I don't think many of the process global resources would make sense
on a per-thread basis and I am not against all global state.

  But I would say that I think I would find having to know what thread
  a particular bit of code was running in in order to grok it very
  strange,
 
 I agree that it is important to have code which is easy to understand.
 
 Usually, functions run in the same thread as their caller, unless they
 are passed to something with the word 'fork' in the name. That's a
 good rule of thumb that is in fact sufficient to let you understand
 the code I write. Also, if that's too much to remember, then since I'm
 only proposing and using non-mutable thread-local state (i.e. it
 behaves like a MonadReader), and since I'm not passing actions between
 threads as Einar is, then you can forget about the 'fork' caveat.

The only problem appears when someone uses two libraries, one written
by me and another written by you, and wonders why their program is
failing in mysterious ways.

 I think the code would in fact be more difficult to grok, if all of
 the things which I want to be thread-local were instead passed around
 as parameters, a la 'newMain'. This is simply because, in that
 scenario, there would be much more code to read, and it would be very
 repetitive. If I used special monads for my state, then the situation
 would be only slightly better - a single monad would not suffice, and
 I'd be faced with a plethora of 'lift' functions and redefinitions of
 'catch', as well as long type signatures and a crowded namespace.

As 

Re: [Haskell] thread-local variables

2006-08-01 Thread Einar Karttunen
On 31.07 23:53, Adrian Hey wrote:
 Frederik Eaton wrote:
 On Mon, Jul 31, 2006 at 03:09:59PM +0300, Einar Karttunen wrote:
 On 31.07 03:18, Frederik Eaton wrote:
 4) the library runs the callback code in Tw where the TLS state is
invalid. This is even worse than a global variable in this case.
 
 If you have threads, and you have something which needs to be
 different among different threads, then it is hard for me to see how
 thread-local variables could be worse than global variables in any
 case at all.
 
 I haven't been following the technicalities of the particular
 scenario that's under discussion so I don't know exactly
 what either of you mean by (even) worse than global variables.
 
 I just want to point out that, as I (and a few others) see it at
 least, top level mutable state (aka global variables) is
 absolutely necessary sometimes for _SAFETY_ reasons.

I agree that global variables are sometimes the best solution.
My point in the quote was that in the example described
TLS would cause more trouble than global mutable state.

 But I would say that I think I would find having to know what thread
 a particular bit of code was running in in order to grok it very
 strange, unless there was some obvious technical reason why the
 thread local state needed to be thread local (can't think of any
 such reason right now).

I have to agree to this. It would be very nice to see good examples
of thread local state in action that would teach us (the sceptics)
why TLS is a good idea in Haskell - and maybe we would learn to
write better code with it. Something more than simply avoiding
a Reader monad / implicit parameters would be nice.

ps. Should we move this discussion to haskell-cafe?

- Einar Karttunen


Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)

2006-07-31 Thread Einar Karttunen
On 31.07 03:18, Frederik Eaton wrote:
 I don't think it's necessarily such a big deal. Presumably the library
 with the worker threads will have to be invoked somewhere. One should
 just make sure that it is invoked in the appropriate environment, for
 instance with the database connection already properly initialized.
 
 (*) One might even want to change the environment a little within each
 thread, for instance so that errors get logged to a thread-specific
 log file.

So we have the following:
1) the library is initialized and spawns worker thread Tw
2) application initializes the database connection and it
   is associated with the current thread Tc and all the children
   it will have (unless changed)
3) the application calls the library in Tc passing an IO action
   to it. The IO action refers to the TLS thinking it is in
   Tc where it is valid.
4) the library runs the callback code in Tw where the TLS state is
   invalid. This is even worse than a global variable in this case.
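
The four steps can be reproduced with a small emulation. This is a sketch
under stated assumptions: the per-thread store keyed on ThreadId and the
names bindLocal, readLocal, startWorker, and scenario are hypothetical, and
the emulation has no inheritance at all. Inheritance would not change the
outcome here, because Tw is forked before the binding is made.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forever)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import qualified Data.Map as Map

-- Emulated thread-local store: one String binding per thread.
type Store = IORef (Map.Map ThreadId String)

-- Bind a value for the calling thread only.
bindLocal :: Store -> String -> IO ()
bindLocal store v = do
  tid <- myThreadId
  modifyIORef' store (Map.insert tid v)

-- Read the calling thread's binding, if it has one.
readLocal :: Store -> IO (Maybe String)
readLocal store = do
  tid <- myThreadId
  Map.lookup tid <$> readIORef store

-- Step 1: the "library" spawns worker Tw, which runs submitted callbacks.
startWorker :: IO (MVar (IO ()))
startWorker = do
  jobs <- newEmptyMVar
  _ <- forkIO (forever (takeMVar jobs >>= id))
  pure jobs

-- Returns what the callback would see in Tc and what it sees in Tw.
scenario :: IO (Maybe String, Maybe String)
scenario = do
  store <- newIORef Map.empty
  jobs <- startWorker                               -- 1) Tw already exists
  bindLocal store "db-connection"                   -- 2) binding made in Tc
  inCaller <- readLocal store
  reply <- newEmptyMVar
  putMVar jobs (readLocal store >>= putMVar reply)  -- 3) callback handed over
  inWorker <- takeMVar reply                        -- 4) callback ran in Tw
  pure (inCaller, inWorker)
```

The caller sees its binding, while the same action run by the worker finds
none, which is the failure described in step 4.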

Of course one can argue that the application should first initialize
the database handle. But if the app uses worker threads (spawned
before library initialization) then things will break if a library
uses TLS and callbacks and they end up running in threads created
before the library initialization.

- Einar Karttunen



Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)

2006-07-31 Thread Einar Karttunen
On 31.07 14:03, Thomas Conway wrote:
 This is why I believe transaction-local variables are a more useful concept.
 You are guaranteed that there is only one thread accessing them, and
 they behave just like ordinary TVars except that each transaction has
 its own copy.

This seems like it could be useful. E.g. marking graph nodes while
traversing them.

 The argument to newLVar is an initial value that is used at the start
 of each transaction.  Note that this means that the value in an LVar
 does not persist between transactions. I agree that this limits their
 use, but simplifies them immensely, and doesn't stand in the way of their
 being useful for solving a bunch of problems.

I think that them reverting to the initial value is more useful
than persisting behavior.

- Einar Karttunen


Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)

2006-07-31 Thread Frederik Eaton
On Mon, Jul 31, 2006 at 03:09:59PM +0300, Einar Karttunen wrote:
 On 31.07 03:18, Frederik Eaton wrote:
  I don't think it's necessarily such a big deal. Presumably the library
  with the worker threads will have to be invoked somewhere. One should
  just make sure that it is invoked in the appropriate environment, for
  instance with the database connection already properly initialized.
  
  (*) One might even want to change the environment a little within each
  thread, for instance so that errors get logged to a thread-specific
  log file.
 
 So we have the following:
 1) the library is initialized and spawns worker thread Tw
 2) application initializes the database connection and it
is associated with the current thread Tc and all the children
it will have (unless changed)
 3) the application calls the library in Tc passing an IO action
to it. The IO action refers to the TLS thinking it is in
Tc where it is valid.
 4) the library runs the callback code in Tw where the TLS state is
invalid. This is even worse than a global variable in this case.

If you have threads, and you have something which needs to be
different among different threads, then it is hard for me to see how
thread-local variables could be worse than global variables in any
case at all.

 Of course one can argue that the application should first initialize
 the database handle. But if the app uses worker threads (spawned
 before library initialization) then things will break if a library
 uses TLS and callbacks and they end up running in threads created
 before the library initialization.

OK, sure. In certain situations you have to keep track of whether a
function to which you pass an action might be sending it off to be run
in a different thread. We've been over this. Perhaps users should be
warned in the documentation - and in the documentation for exceptions
as well. I really don't see that as a problem that would sneak up on
people, since if you're passing an action to a function, rather than
executing it yourself, then in most cases it should be clear to
programmers that the action will run in a different context if not a
different thread altogether. And if you want to force the context to
be the same, you wrap the action in something restoring that context,
just like you would have to do with your state transformer monad
stack.

Another way to write buggy code is to have it so bloated with extra
syntax - e.g. with monad conversions, or extra function parameters, as
you propose - that it becomes impossible to read and understand.

Frederik

-- 
http://ofb.net/~frederik/


Re: [Haskell] thread-local variables

2006-07-31 Thread Adrian Hey

Frederik Eaton wrote:

On Mon, Jul 31, 2006 at 03:09:59PM +0300, Einar Karttunen wrote:

On 31.07 03:18, Frederik Eaton wrote:
4) the library runs the callback code in Tw where the TLS state is
   invalid. This is even worse than a global variable in this case.


If you have threads, and you have something which needs to be
different among different threads, then it is hard for me to see how
thread-local variables could be worse than global variables in any
case at all.


I haven't been following the technicalities of the particular
scenario that's under discussion so I don't know exactly
what either of you mean by (even) worse than global variables.

I just want to point out that, as I (and a few others) see it at
least, top level mutable state (aka global variables) is
absolutely necessary sometimes for _SAFETY_ reasons.

BTW
If anybody still doesn't get it re. why we need top level
mutable state, the point is not to avoid explicit state handle
threading. The point is to avoid exposing newState constructors
as part of the IO library API (thereby making it invulnerable to
accidental state spoofing).

If you're going to deny library users the ability to create new
state handles then you have to make at least one ready made
state handle available at the top level. It just so happens
that in the common case where there can be only one such state
handle (for safety reasons) then you can completely eliminate
this from the exposed API.
/BTW
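
As a rough illustration of the point above, here is a minimal sketch of
a library that keeps its single state handle at the top level and never
exposes a constructor for it. The names ticketCounter and nextTicket
are hypothetical, and the unsafePerformIO/NOINLINE idiom is just the
usual GHC workaround, not a language feature designed for this purpose.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical library state: a counter that hands out ticket numbers.
-- In a real library this binding would not appear in the export list,
-- so users could neither create a second counter nor overwrite this one.
{-# NOINLINE ticketCounter #-}
ticketCounter :: IORef Int
ticketCounter = unsafePerformIO (newIORef 0)

-- The only operation a user would see. No state handle is threaded
-- through the API because there is exactly one.
nextTicket :: IO Int
nextTicket = atomicModifyIORef' ticketCounter (\n -> (n + 1, n))

-- Draw three tickets in a row.
demo :: IO [Int]
demo = mapM (const nextTicket) [1 .. 3 :: Int]
```

The NOINLINE pragma matters here: without it the compiler may inline
the binding and create more than one IORef.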

As for the subject under discussion (thread local state), I am
personally sceptical about it. Why do we need it? Are we talking
about safety or just convenience/API elegance? I've never
encountered a situation where I've needed thread local state,
(but this does not necessarily make it evil:-)

But I would say that I think I would find having to know what thread
a particular bit of code was running in in order to grok it very
strange, unless there was some obvious technical reason why the
thread local state needed to be thread local (can't think of any
such reason right now).

My 2p..

Regards
--
Adrian Hey





Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)

2006-07-30 Thread Einar Karttunen
On 29.07 13:25, Frederik Eaton wrote:
 I think support for thread-local variables is something which is
 urgently needed. It's very frustrating that using concurrency in
 Haskell is so easy and nice, yet when it comes to IORefs there is no
 way to get thread-local behavior. Furthermore, that one can make
 certain things thread-local (e.g. with withArgs, withProgName) makes
 the solution seem close at hand (although I can appreciate that it may
 not be). Yet isn't it just a matter of making a Map with existentially
 quantified values part of the state of each thread, just as the
 program name and arguments are also part of that state?

Are thread local variables really a good idea in Haskell?

If variables are thread local how would this combinator work:

withTimeOut :: Int -> IO a -> IO a
withTimeOut tout op = mdo
  mv <- newEmptyMVar
  wt <- forkIO $ do try op >>= tryPutMVar mv >> killThread kt
  kt <- forkIO $ do threadDelay tout
                    e <- tryPutMVar mv $ Left $ DynException $ toDyn TimeOutException
                    if e then killThread wt else return ()
  either throw return =<< takeMVar mv


Would it change the semantics of the action, as it is run in a
different thread (this is a must if there are potentially blocking FFI
calls)? Now if the action changes the thread local state then
it behaves differently. Do we really want that?
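
For readers who want to try the combinator, the following is a rough,
self-contained variant written against current base. It is a sketch
and not the code above: it returns Nothing on timeout instead of
throwing TimeOutException, and it avoids mdo. The demo action is an
assumption added for illustration; it shows that the action really does
run in a different thread from its caller, which is the situation the
question is about.

```haskell
import Control.Concurrent (forkIO, killThread, myThreadId, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, takeMVar, tryPutMVar)
import Control.Exception (SomeException, throwIO, try)

-- Run an action in a helper thread, giving up after the given number
-- of microseconds. Exceptions raised by the action are rethrown in the
-- caller.
withTimeOut :: Int -> IO a -> IO (Maybe a)
withTimeOut micros op = do
  mv <- newEmptyMVar
  worker <- forkIO $ do
    r <- try op
    _ <- tryPutMVar mv (Just r)
    return ()
  timer <- forkIO $ do
    threadDelay micros
    _ <- tryPutMVar mv Nothing
    return ()
  -- Whichever thread fills the MVar first decides the outcome.
  outcome <- takeMVar mv
  killThread worker
  killThread timer
  case outcome of
    Nothing -> return Nothing
    Just (Left e) -> throwIO (e :: SomeException)
    Just (Right x) -> return (Just x)

-- First component: does the action see the caller's ThreadId?
-- Second component: result of an action that is too slow.
demo :: IO (Maybe Bool, Maybe ())
demo = do
  me <- myThreadId
  sameThread <- withTimeOut 1000000 (fmap (== me) myThreadId)
  timedOut <- withTimeOut 10000 (threadDelay 2000000)
  return (sameThread, timedOut)
```

Any state keyed on the identity of the running thread would therefore
look different inside op than outside it, unless the new thread is
given the caller's bindings.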

Usually one can just add a monad that wraps IO/STM and provides the
variables one needs. This has the good side of making scoping
explicit.

- Einar Karttunen


Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)

2006-07-30 Thread Frederik Eaton
On Sun, Jul 30, 2006 at 12:35:42PM +0300, Einar Karttunen wrote:
 On 29.07 13:25, Frederik Eaton wrote:
  I think support for thread-local variables is something which is
  urgently needed. It's very frustrating that using concurrency in
  Haskell is so easy and nice, yet when it comes to IORefs there is no
  way to get thread-local behavior. Furthermore, that one can make
  certain things thread-local (e.g. with withArgs, withProgName) makes
  the solution seem close at hand (although I can appreciate that it may
  not be). Yet isn't it just a matter of making a Map with existentially
  quantified values part of the state of each thread, just as the
  program name and arguments are also part of that state?
 
 Are thread local variables really a good idea in Haskell?

Yes.

 If variables are thread local how would this combinator work:

Do read the code I posted. Please note I'm not suggesting that *all*
variables be thread local, I was proposing a special data-type for
that.

 withTimeOut :: Int -> IO a -> IO a
 withTimeOut tout op = mdo
   mv <- newEmptyMVar
   wt <- forkIO $ do try op >>= tryPutMVar mv >> killThread kt
   kt <- forkIO $ do threadDelay tout
                     e <- tryPutMVar mv $ Left $ DynException $ toDyn TimeOutException
                     if e then killThread wt else return ()
   either throw return =<< takeMVar mv
 
 
 Would it change the semantics of the action as it is run in a
 different thread (this is a must if there are potentially blocking FFI
 calls).

No, because the thread in which it runs inherits any thread-local
state from its parent.

 Now if the action changes the thread local state then
 it behaves differently. Do we really want that?

I'm not sure what you're suggesting. The API I proposed actually
doesn't let users discover when their actions are running in
sub-threads. (Can you write an example using that API?) However, even
if it did, I don't see a problem. Do you think that we should get rid
of 'myThreadId', for instance? I don't.

 Usually one can just add a monad that wraps IO/STM and provides the
 variables one needs. This has the good side of making scoping
 explicit.

That's easier said than done. Sometimes I take that route. But
sometimes I don't want 5 different monads wrapping each other, each
with its own 'lift' and 'catch' functions, making error messages
indecipherable and code difficult to read and debug. Do you propose
creating a special monad for file operations? For network operations? 
No? Then I don't see why I should have to make a special monad for
database operations. Or, if the answer was yes, then fine: obfuscate
your own code, but please don't ask me to do the same. Let's support
both ways of doing things, and we can be different.

Frederik

-- 
http://ofb.net/~frederik/


Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)

2006-07-30 Thread Einar Karttunen
On 30.07 11:49, Frederik Eaton wrote:
 No, because the thread in which it runs inherits any thread-local
 state from its parent.


So we have different threads modifying the thread-local state?
If it is a copy then updates are not propagated.

What about a design with 10 worker threads taking requests
from a Chan (IO ()) and running them (this occurs in real code).
To get things right they should use the TLS-context relevant
to each IO () rather than the thread.
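
The worker-pool case can be sketched as follows. This is only an
illustration under stated assumptions: the "thread-local state" is
modelled as one String label per thread in a global table, and
getContext, withContext, capture and startWorker are hypothetical
names, not part of any proposed API. The capture helper snapshots the
submitter's context and reinstalls it around the action wherever it
ends up running.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (bracket_)
import Control.Monad (forever, join)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map as M
import System.IO.Unsafe (unsafePerformIO)

-- Stand-in for thread-local state: one label per thread (assumption).
{-# NOINLINE contexts #-}
contexts :: IORef (M.Map ThreadId String)
contexts = unsafePerformIO (newIORef M.empty)

getContext :: IO String
getContext = do
  tid <- myThreadId
  M.findWithDefault "none" tid <$> readIORef contexts

setContext :: String -> IO ()
setContext c = do
  tid <- myThreadId
  atomicModifyIORef' contexts (\m -> (M.insert tid c m, ()))

-- Install a label for the duration of an action, then restore the old one.
withContext :: String -> IO a -> IO a
withContext c act = do
  old <- getContext
  bracket_ (setContext c) (setContext old) act

-- Snapshot the caller's context now; reinstall it wherever the action runs.
capture :: IO () -> IO (IO ())
capture act = do
  c <- getContext
  return (withContext c act)

-- A worker that labels itself and then runs whatever arrives on the channel.
startWorker :: Chan (IO ()) -> IO ThreadId
startWorker jobs = forkIO (setContext "worker" >> forever (join (readChan jobs)))

demo :: IO (String, String)
demo = do
  jobs <- newChan
  _ <- startWorker jobs
  plain <- newEmptyMVar
  wrapped <- newEmptyMVar
  withContext "request-42" $ do
    -- Submitted as is: the job observes the worker's context.
    writeChan jobs (getContext >>= putMVar plain)
    -- Submitted through capture: the job observes the submitter's context.
    capture (getContext >>= putMVar wrapped) >>= writeChan jobs
  r1 <- takeMVar plain
  r2 <- takeMVar wrapped
  return (r1, r2)
```

In this sketch the unwrapped job reports "worker" and the wrapped job
reports "request-42", so which behaviour you get depends on whether
the submitter remembers to wrap the action.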
 
  Now if the action changes the thread local state then
  it behaves differently. Do we really want that?
 
 I'm not sure what you're suggesting. The API I proposed actually
 doesn't let users discover when their actions are running in
 sub-threads. (Can you write an example using that API?) However, even
 if it did, I don't see a problem. Do you think that we should get rid
 of 'myThreadId', for instance? I don't.

I do consider using myThreadId bad form for most purposes.
It is nice to have for debugging output - and occasionally
for sending other threads a handle for asynchronous exceptions,
but this can lead to problems when changing threading patterns.

Usually nice code does not care in which thread it is run.

 
  Usually one can just add a monad that wraps IO/STM and provides the
  variables one needs. This has the good side of making scoping
  explicit.
 
 That's easier said than done. Sometimes I take that route. But
 sometimes I don't want 5 different monads wrapping each other, each
 with its own 'lift' and 'catch' functions, making error messages
 indecipherable and code difficult to read and debug. Do you propose
 creating a special monad for file operations? For network operations? 
 No? Then I don't see why I should have to make a special monad for
 database operations. Or, if the answer was yes, then fine: obfuscate
 your own code, but please don't ask me to do the same. Let's support
 both ways of doing things, and we can be different.

Usually I just define one custom monad for the application which
wraps the stack of monad transformers. Thus changing the monad stack
does not affect the application code. A quite clean and efficient
solution.
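
A minimal sketch of that single-application-monad style follows. To
stay within base it hand-rolls a reader-like newtype instead of using
an actual transformer stack, so treat App, Env, askEnv, localEnv,
liftApp and logMsg as hypothetical names standing in for whatever the
real application defines.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical application environment.
data Env = Env
  { envDbName :: String
  , envLog :: IORef [String]
  }

-- The one application monad. Application code only ever mentions App,
-- so the representation underneath can change without touching it.
newtype App a = App { runApp :: Env -> IO a }

instance Functor App where
  fmap f (App g) = App (fmap f . g)

instance Applicative App where
  pure x = App (\_ -> pure x)
  App f <*> App g = App (\e -> f e <*> g e)

instance Monad App where
  App g >>= k = App (\e -> g e >>= \a -> runApp (k a) e)

-- Read the current environment.
askEnv :: App Env
askEnv = App pure

-- Run a plain IO action inside App.
liftApp :: IO a -> App a
liftApp io = App (const io)

-- Run a sub-computation with a modified environment. The scope of the
-- change is explicit in the code.
localEnv :: (Env -> Env) -> App a -> App a
localEnv f (App g) = App (g . f)

-- Append a message, tagged with the current database name, to the log.
logMsg :: String -> App ()
logMsg s = do
  env <- askEnv
  liftApp (modifyIORef' (envLog env) (++ [envDbName env ++ ": " ++ s]))

demo :: IO [String]
demo = do
  ref <- newIORef []
  runApp
    (logMsg "connect" >> localEnv (\e -> e { envDbName = "replica" }) (logMsg "query"))
    (Env "primary" ref)
  readIORef ref
```

Here the environment travels with the action as an ordinary argument,
so running it in another thread does not change what it sees.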

My main objection to the TLS is that it looks like normal IO,
but changing the thread that evaluates it can break things in ways
that are hard to debug. E.g. we have an application that uses
TLS and passes an IO action to a library that happens to use
a pool of worker threads that is invisible to the application.
Or the same with the role of the application and library reversed.

Offering it up as a separate library should be ok as it would
be very easy to spot and take extra care not to cause problems.

- Einar Karttunen


Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)

2006-07-30 Thread Frederik Eaton
On Mon, Jul 31, 2006 at 03:54:29AM +0300, Einar Karttunen wrote:
 On 30.07 11:49, Frederik Eaton wrote:
  No, because the thread in which it runs inherits any thread-local
  state from its parent.
 
 So we have different threads modifying the thread-local state?
 If it is a copy then updates are not propagated.

As I said, please read my code. There are no updates.

 What about a design with 10 worker threads taking requests
 from a Chan (IO ()) and running them (this occurs in real code).
 To get things right they should use the TLS-context relevant
 to each IO () rather than the thread.

I could see how either behavior might be desirable, see below. (*)

 (snip)
 Usually I just define one custom monad for the application which
 wraps the stack of monad transformers. Thus changing the monad stack
 does not affect the application code. A quite clean and efficient
 solution.

That does sound like a clean approach. However, I think that my
approach would be cleaner; and in any case I think that both
approaches should be available to the programmer.

 My main objection to the TLS is that it looks like normal IO,
 but changing the thread that evaluates it can break things in ways
 that are hard to debug. E.g. we have an application that uses
 TLS and passes an IO action to a library that happens to use
 a pool of worker threads that is invisible to the application.
 Or the same with the role of the application and library reversed.

I don't think it's necessarily such a big deal. Presumably the library
with the worker threads will have to be invoked somewhere. One should
just make sure that it is invoked in the appropriate environment, for
instance with the database connection already properly initialized.

(*) One might even want to change the environment a little within each
thread, for instance so that errors get logged to a thread-specific
log file.

 Offering it up as a separate library should be ok as it would
 be very easy to spot and take extra care not to cause problems.

That's good to hear.

Regards,

Frederik

-- 
http://ofb.net/~frederik/


Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)

2006-07-30 Thread Thomas Conway

Hi All,

On 7/31/06, Einar Karttunen <ekarttun@cs.helsinki.fi> wrote:

 My main objection to the TLS is that it looks like normal IO,
 but changing the thread that evaluates it can break things in ways
 that are hard to debug. E.g. we have an application that uses
 TLS and passes an IO action to a library that happens to use
 a pool of worker threads that is invisible to the application.


This is why I believe transaction-local variables are a more useful concept.
You are guaranteed that there is only one thread accessing them, and
they behave just like ordinary TVars except that each transaction has
its own copy.

I think you'd need an API like

   type LVar a -- local var
   newLVar :: a -> STM (LVar a)
   readLVar :: LVar a -> STM a
   writeLVar :: LVar a -> a -> STM ()

The argument to newLVar is an initial value that is used at the start
of each transaction.  Note that this means that the value in an LVar
does not persist between transactions. I agree that this limits their
use, but it simplifies them immensely, and doesn't stand in the way of
their being useful for solving a bunch of problems.

cheers,
Tom


[Haskell] thread-local variables (was: Re: Implicit Parameters)

2006-07-29 Thread Frederik Eaton
Hi,

Sorry to bring up this thread from so long ago.

On Wed, Mar 01, 2006 at 11:53:42AM +, Simon Marlow wrote:
 Ashley Yakeley wrote:
  Simon Marlow wrote:
   Simon & I have discussed doing some form of thread-local state, which
   covers many uses of implicit parameters and is much preferable IMO.
   Thread-local state doesn't change your types, and it doesn't require
   passing any extra parameters at runtime.  It works perfectly well for
   the OS example you give, for example.
  Interesting. What would that look like in code?
 
 No concrete plans yet.  There have been proposals for thread-local
 variables in the past on this list and haskell-cafe, and other
 languages have similar features (eg. Scheme's support for dynamic
 scoping).  Doing something along these lines is likely to be quite
 straightforward to implement, won't require any changes to the type
 system, and gives you a useful form of implicit parameters without any
 of the drawbacks.
 
 The main difference from implicit parameters would be that thread-local
 variables would be restricted to the IO monad.

I think support for thread-local variables is something which is
urgently needed. It's very frustrating that using concurrency in
Haskell is so easy and nice, yet when it comes to IORefs there is no
way to get thread-local behavior. Furthermore, that one can make
certain things thread-local (e.g. with withArgs, withProgName) makes
the solution seem close at hand (although I can appreciate that it may
not be). Yet isn't it just a matter of making a Map with existentially
quantified values part of the state of each thread, just as the
program name and arguments are also part of that state?


import qualified Data.Map as M
import Data.Maybe
import Data.Unique
import Data.IORef
import Data.Typeable

-- only these 2 must be implemented:
withParams :: ParamsMap -> IO () -> IO ()
getParams :: IO ParamsMap
--

type ParamsMap = M.Map Unique Value

data Value = forall a . (Typeable a) => V a

type IOParam a = IORef (Unique, a)

newIOParam :: Typeable a => a -> IO (IOParam a)
newIOParam def = do
    k <- newUnique
    newIORef (k,def)

withIOParam :: Typeable a => IOParam a -> a -> IO () -> IO ()
withIOParam p value act = do
    (k,def) <- readIORef p
    m <- getParams
    withParams (M.insert k (V value) m) act

getIOParam :: Typeable a => IOParam a -> IO a
getIOParam p = do
    (k,def) <- readIORef p
    m <- getParams
    return $ fromMaybe def (M.lookup k m >>= (\ (V x) -> cast x))
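
To experiment with this API without runtime support, the two primitives
can be approximated in user code. The sketch below is an assumption
made for illustration, not the intended implementation: it keeps the
bindings in a global table keyed by ThreadId, and the helpers setParams
and forkInheriting are hypothetical. A plain forkIO would not inherit
anything here, which is why a fork wrapper is needed.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (bracket_, finally)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map as M
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)
import Data.Unique (Unique, newUnique)
import System.IO.Unsafe (unsafePerformIO)

data Value = forall a. Typeable a => V a
type ParamsMap = M.Map Unique Value
type IOParam a = IORef (Unique, a)

-- Assumption: one global table of bindings, keyed by ThreadId.
{-# NOINLINE table #-}
table :: IORef (M.Map ThreadId ParamsMap)
table = unsafePerformIO (newIORef M.empty)

getParams :: IO ParamsMap
getParams = do
  tid <- myThreadId
  M.findWithDefault M.empty tid <$> readIORef table

-- Hypothetical internal helper: overwrite the calling thread's bindings.
setParams :: ParamsMap -> IO ()
setParams m = do
  tid <- myThreadId
  atomicModifyIORef' table (\t -> (M.insert tid m t, ()))

withParams :: ParamsMap -> IO () -> IO ()
withParams m act = do
  old <- getParams
  bracket_ (setParams m) (setParams old) act

-- Hypothetical fork variant: the child starts with a copy of the
-- parent's bindings and removes its table entry when it finishes.
forkInheriting :: IO () -> IO ThreadId
forkInheriting act = do
  m <- getParams
  forkIO ((setParams m >> act) `finally` forget)
  where
    forget = do
      tid <- myThreadId
      atomicModifyIORef' table (\t -> (M.delete tid t, ()))

-- The three functions below follow the code in the post.
newIOParam :: Typeable a => a -> IO (IOParam a)
newIOParam def = do
  k <- newUnique
  newIORef (k, def)

withIOParam :: Typeable a => IOParam a -> a -> IO () -> IO ()
withIOParam p value act = do
  (k, _) <- readIORef p
  m <- getParams
  withParams (M.insert k (V value) m) act

getIOParam :: Typeable a => IOParam a -> IO a
getIOParam p = do
  (k, def) <- readIORef p
  m <- getParams
  return $ fromMaybe def (M.lookup k m >>= \(V x) -> cast x)

-- Read the parameter before, inside a forked child, and after a scoped
-- override.
demo :: IO (String, String, String)
demo = do
  p <- newIOParam "default"
  before <- getIOParam p
  out <- newEmptyMVar
  withIOParam p "scoped" $ do
    _ <- forkInheriting (getIOParam p >>= putMVar out)
    return ()
  child <- takeMVar out
  after <- getIOParam p
  return (before, child, after)
```

Because the child copies the parent's map at fork time, it still sees
"scoped" even if the parent has already left the withIOParam block.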


Frederik

P.S. I sent a message about this a while back, when I was trying to
implement my own version using ThreadId (not really a good approach).

-- 
http://ofb.net/~frederik/


Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)

2006-07-29 Thread Thomas Conway

I would also note that some form of transaction-local variable would
also be really handy for STM usage.

Tom