Re: [Haskell] thread-local variables
Hi Simon,

It is good that you support thread-local variables. I have initialized a wiki page:

http://haskell.org/haskellwiki/Thread_local_storage

The main difference between my proposal and yours, as I see it, is that your proposal is based on keys which can be used for other things. I think that leads to an interface which is less natural. In my proposal, the IOParam type is quite similar to an IORef - it has a user-specified initial state, and the internal implementation is hidden from the user - yours differs in both of these aspects.

> * I agree with Robert that a key issue is initialisation. Maybe it
>   should be possible to associate an initialiser with a key. I have
>   not thought this out.

I still don't understand this, so it is not mentioned on the wiki.

> * A key issue is this: when forking a thread, does the new thread
>   inherit the current thread's bindings, or does it get a
>   freshly-initialised set. Sometimes you want one, sometimes the
>   other, alas.

I think the inheritance semantics are more useful and also more general: If I wanted a freshly-initialized set of bindings, and I only had inheritance semantics, then I could start a thread early on when all the bindings are in their initial state, and have this thread read actions from a channel and execute them in sub-threads of itself, and implement a 'fork' variant based on this.

More generally, I could do the same thing from a sub-thread of the main thread - I could start a thread with any set of bindings, and use it to launch other threads with those bindings. In this way, the initial set of bindings is not specially privileged over intermediate sets of bindings.

> On the GHC front, we're going to be busy with 6.6 etc until after
> ICFP, so nothing is going to happen fast -- which gives an
> opportunity to discuss it. However it's just infeasible for the
> community at large to follow a long email thread like this one. My
> suggestion would be for the interested parties to proceed somewhat
> as we did with packages.
> (http://hackage.haskell.org/trac/ghc/wiki/GhcPackages)

I have put a page on the wiki summarizing the thread. However, I want to say that I think that email is a better medium for most ongoing discussions. (I'm not sure if I may have suggested the opposite earlier.) For those who are not interested in the discussion, it should be easy in most mail readers to ignore or hide a long thread, or to skip to the very end of it to get a rough idea of where things stand.

I think it is a good idea to have proposals on a wiki, though, so that the product of all agreed-upon amendments and alterations can be easily referred to. When discussions happen on a wiki, though, they often take the same threaded form as email discussions (see Wikipedia) - but they are seen by fewer interested people, and the interface is clumsier. For instance, I can subscribe to email notification when a wiki page changes (thanks to whoever finally made this possible on haskell.org, by the way), but I have to read the updated version to figure out whether the modification was replying to me or another poster; whereas my mail reader clearly flags messages where I appear in the recipients list.

Frederik

--
http://ofb.net/~frederik/
___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell
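The 'fork' variant described in the message above (start a thread while the bindings are in a known state, send it actions over a channel, and let it fork each one as a sub-thread of itself) can be sketched with plain Control.Concurrent. This is only an illustration under assumptions: the names Launcher, newLauncher, forkVia and runVia are hypothetical, and since GHC has no thread-local bindings for the sub-threads to inherit, the code shows just the plumbing such a variant would rest on.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forever, void)

-- | Handle to a launcher thread (hypothetical name). Actions written to
-- the channel are forked by the launcher, not by the sender.
newtype Launcher = Launcher (Chan (IO ()))

-- | Start a launcher from the current thread. Under inheritance
-- semantics, every thread it forks later would see the bindings that
-- were in effect at this point.
newLauncher :: IO Launcher
newLauncher = do
  chan <- newChan
  void . forkIO . forever $ readChan chan >>= void . forkIO
  pure (Launcher chan)

-- | The 'fork' variant: ask the launcher to fork the action.
forkVia :: Launcher -> IO () -> IO ()
forkVia (Launcher chan) = writeChan chan

-- | Fork via the launcher and wait for the result. If the action
-- throws, the result box is never filled and the caller stays blocked.
runVia :: Launcher -> IO a -> IO a
runVia launcher act = do
  box <- newEmptyMVar
  forkVia launcher (act >>= putMVar box)
  takeMVar box
```

Loaded into GHCi, `newLauncher >>= \l -> runVia l (pure "done")` runs the action in a thread forked by the launcher and hands back its result. A launcher started from a sub-thread would work the same way, which is the sense in which the initial bindings are not privileged.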
Re: [Haskell] thread-local variables
On 07.08 13:16, Frederik Eaton wrote: How would this work together with the FFI? It wouldn't, at least I wouldn't care if it didn't. Suddenly breaking libraries that happen to use FFI behind your back does not seem like a good conservative extension. I think we should move the discussion to the wiki as Simon suggested. I can create a wikipage if you don't want to. What about my example: newMain host environment program_args network_config locale terminal_settings stdin stdout stderr = do ... Now, let's see. We might want two threads to have the same network configuration, but a different view of the filesystem; or the same view of the filesystem, but a different set of environment variables; or the same environment, but different command line arguments. All three cases are pretty common in practice. We might also want to have the same arguments but different IO handles - as in a multi-threaded server application. This won't be pretty even with TLS. Our fancy app will probably mix in STM and pass callback actions to the thread processing packets coming directly from the network interface. Quickly the TLS approach seems problematic - we need to know what actions depend on each other and how. And the part that implements the filesystem might want to access the network (if there is a network filesystem). And the part that starts processes with an environment might want to access the filesystem, for instance to read the code for the process and for shared libraries; and maybe it also wants to get the hostname from the network layer. And the part that starts programs with arguments might want to access the environment (for instance, to get the current locale), as well as the filesystem (for instance, to read locale configuration files). And the part that accesses the IO handles might also want to access not just the program arguments but the environment, and the filesystem, and the network. 
So we have the following dependencies: FileSystem - Network Environment - FileSystem, Network Arguments - Environment IO Handles - Arguments,Environment,FS,Network With TLS every one of them has type IO. Now the programmer is supposed to know that he has to configure the network before using program arguments? So a programmer first wanting to process command line arguments and only then configuring network will probably have hidden bugs. It becomes very hard to know what different components depend on. Even if we had to define all those instances that would be 1+2+1+3 = 7 instance declarations. Not 5^2 = 25 instances. Or use small wrapper combinators (which I prefer). btw how would the TLS solution elegantly handle that I'd like separate network configurations for e.g. IO Handle - Network(socket) and IO Handle - FileSystem(NFS) - Network ? So here is an example where we have nested layers, and each layer accesses most of the layers below it. And this will cause problems. A good API should not encourage going to the lower levels directly. If the lowest level changes then with your design one has to make O(layers) changes instead of O(1) if the layers are not available directly. If one of the layers adds a new dependency then making sure it is initialized and used correctly seems very hard to check. If we started with a library that dealt with OS devices such as the network, and used a special monad for that; and then if we built upon that a layer for keeping track of environment variables, with another monad; and then a layer for invoking executables with arguments; and then a layer for IO; all with monads - then we would have a good modular, extensible design, which, due to the interactions between layers, would, in Haskell, require code length which is quadratic in the number of layers. The trick here is that most components should not talk with each other. Composition and encapsulation are the keys to victory. 
> (Of course, it's true that in real operating systems, each of these layers has its own set of interfaces to the other layers - so the monadic approach is actually not more verbose. But the point is that it's a reasonable design, with layers, and where each layer uses each of the ones below it. I want to write code which is designed the same way, but without the overhead)

Yes, the size of the code is dependent on the size of the API. Making things explicit is more infrastructure at the start, but makes things easier later on when they have to be changed.

> If you move it somewhere else, but forget to move the thread-local variables it refers to, then you'll get a compiler error.

I meant forgetting to initialize it - not omitting the whole definition.

>   db2 <- getIOParam db2Param
>   withIOParam dbParam db2 $ ...

And one needs to make sure that the ... part does not need the other database connection(s). That makes composing things hard.

> I'm still not sure I understand why thread pools are necessary, by the way. I thought forking
Re: [Haskell] thread-local variables
On Tue, Aug 08, 2006 at 04:21:06PM +0300, Einar Karttunen wrote: On 07.08 13:16, Frederik Eaton wrote: How would this work together with the FFI? It wouldn't, at least I wouldn't care if it didn't. Suddenly breaking libraries that happen to use FFI behind your back does not seem like a good conservative extension. FFI already doesn't mix well with GHC's IO handles. What if I write to file descriptor 1 before all data in stdout has been flushed? Is that a reason not to allow FFI? I think we should move the discussion to the wiki as Simon suggested. I can create a wikipage if you don't want to. http://haskell.org/haskellwiki/Thread_local_storage I think the wiki is a good place for proposals, but not most discussion. What about my example: newMain host environment program_args network_config locale terminal_settings stdin stdout stderr = do ... Now, let's see. We might want two threads to have the same network configuration, but a different view of the filesystem; or the same view of the filesystem, but a different set of environment variables; or the same environment, but different command line arguments. All three cases are pretty common in practice. We might also want to have the same arguments but different IO handles - as in a multi-threaded server application. This won't be pretty even with TLS. Our fancy app will probably mix in STM and pass callback actions to the thread processing packets coming directly from the network interface. Quickly the TLS approach seems problematic - we need to know what actions depend on each other and how. I don't understand. Does TLS make such design harder or easier? And the part that implements the filesystem might want to access the network (if there is a network filesystem). And the part that starts processes with an environment might want to access the filesystem, for instance to read the code for the process and for shared libraries; and maybe it also wants to get the hostname from the network layer. 
And the part that starts programs with arguments might want to access the environment (for instance, to get the current locale), as well as the filesystem (for instance, to read locale configuration files). And the part that accesses the IO handles might also want to access not just the program arguments but the environment, and the filesystem, and the network. So we have the following dependencies: FileSystem - Network Environment - FileSystem, Network Arguments - Environment and Filesystem IO Handles - Arguments,Environment,FS,Network With TLS every one of them has type IO. Now the programmer is supposed to know that he has to configure the network before using program arguments? So a programmer first wanting to process command line arguments and only then configuring network will probably have hidden bugs. The running example is an example of an executable starting in an operating system. So everything is already configured by the time it starts, as you know. My application will be no different - for instance, the database-related parameter will be set; then a request thread will start, and after parsing the request, a user-id parameter will be set, and then the request-processing functions will be called. There is no reason for the main server thread to call any of the request-processing functions, because it doesn't have a request to process. It becomes very hard to know what different components depend on. Even if we had to define all those instances that would be 1+2+1+3 = 7 instance declarations. Not 5^2 = 25 instances. Or use small wrapper combinators (which I prefer). O(x) doesn't mean same as x. btw how would the TLS solution elegantly handle that I'd like separate network configurations for e.g. IO Handle - Network(socket) and IO Handle - FileSystem(NFS) - Network ? The filesystem could send its actions to be executed in a separate thread, which has its own configuration? 
So here is an example where we have nested layers, and each layer accesses most of the layers below it. And this will cause problems. A good API should not encourage going to the lower levels directly. If the lowest level changes then with your design one has to make O(layers) changes instead of O(1) if the layers are not available directly. No, you just write a compatibility wrapper over the new implementation. If one of the layers adds a new dependency then making sure it is initialized and used correctly seems very hard to check. I disagree. If we started with a library that dealt with OS devices such as the network, and used a special monad for that; and then if we built upon that a layer for keeping track of environment variables, with another monad; and then a layer for invoking executables with arguments; and then a layer for IO; all with monads - then we would have a good modular, extensible design, which, due to the
Re: [Haskell] thread-local variables
> Furthermore, can we move this thread from the Haskell mailing list
> (which should not have heavy traffic) to either Haskell-Café, or the
> libraries list?

Sure, moving to haskell-cafe.

Frederik

--
http://ofb.net/~frederik/
[Haskell-cafe] RE: [Haskell] thread-local variables
| I have initialized a wiki page:
|
| http://haskell.org/haskellwiki/Thread_local_storage

Great.

| I have put a page on the wiki summarizing the thread. However, I want
| to say that I think that email is a better medium for most ongoing
| discussions.

I agree.
    Discussion by email
    Outcomes on Wiki (including outcomes recording differences of viewpoint)

The goal is that the Wiki page is a comprehensible summary of the outcome of the discussion, for the benefit of the many who will not follow the evolving debate.

Furthermore, can we move this thread from the Haskell mailing list (which should not have heavy traffic) to either Haskell-Café, or the libraries list?

Simon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell] thread-local variables
On Sun, Aug 06, 2006 at 01:36:15PM +0300, Einar Karttunen wrote: On 06.08 02:41, Frederik Eaton wrote: Also, note that my proposal differs in that thread local variables are not writable, but can only be changed by calling (e.g. in my API) 'withIOParam'. This is still just as general, because an IORef can be stored in a thread-local variable, but it makes it easier to reason about the more common use case where TLS is used to make IO a Reader; and it makes it easier to share modifiable state across more than one thread. I.e. if modifiable state is stored as 'IOParam (IORef a)' then the default is for the stored 'IORef a' to be shared across all threads; it can only be changed locally for a specified action and any sub-threads using 'withIOParam'; and if some library I use decides to fork a thread behind the scenes, it won't change my program's behavior. Perhaps a function like this would solve all our problems: -- | Tie all TLS references in the IO action to the current -- environment rather than the environment it will actually -- be executed. tieToCurrentTLS :: IO a - IO (IO a) Our problems? :) Well, it should be easy to implement. I think it's a good idea. I think it is a good idea to have stdin, cwd, etc. be thread-local. How would this work together with the FFI? It wouldn't, at least I wouldn't care if it didn't. I don't understand why the 'TL' monad is necessary, but I haven't read the proposal very carefully. The TL monad is necessary to make initialization order problems go away. That's what it seemed like the intended purpose was, but I don't see any initialization order problems in my proposal. On 05.08 19:56, Frederik Eaton wrote: That doesn't answer the question: What if my application has a need for several different sets of parameters - what if it doesn't make sense to combine them into a single monad? What if there are 'n' layers? Is it incorrect to say that the monadic approach requires code size O(n^2)? 
Well designed monadic approach does not require O(n^2). But if you want to design code in a way that requires O(n^2) code size you can do it. Parallel layers require O(layers). Nested layers hiding the lower layer need O(layers). This is not a problem in practice and makes refactoring very easy. Is that true? I would be very careful when making generalizations about all software design. What about my example: newMain host environment program_args network_config locale terminal_settings stdin stdout stderr = do ... Now, let's see. We might want two threads to have the same network configuration, but a different view of the filesystem; or the same view of the filesystem, but a different set of environment variables; or the same environment, but different command line arguments. All three cases are pretty common in practice. We might also want to have the same arguments but different IO handles - as in a multi-threaded server application. And the part that implements the filesystem might want to access the network (if there is a network filesystem). And the part that starts processes with an environment might want to access the filesystem, for instance to read the code for the process and for shared libraries; and maybe it also wants to get the hostname from the network layer. And the part that starts programs with arguments might want to access the environment (for instance, to get the current locale), as well as the filesystem (for instance, to read locale configuration files). And the part that accesses the IO handles might also want to access not just the program arguments but the environment, and the filesystem, and the network. So here is an example where we have nested layers, and each layer accesses most of the layers below it. 
kernel (networking, devices) filesystem linker libc application If we started with a library that dealt with OS devices such as the network, and used a special monad for that; and then if we built upon that a layer for keeping track of environment variables, with another monad; and then a layer for invoking executables with arguments; and then a layer for IO; all with monads - then we would have a good modular, extensible design, which, due to the interactions between layers, would, in Haskell, require code length which is quadratic in the number of layers. (Of course, it's true that in real operating systems, each of these layers has its own set of interfaces to the other layers - so the monadic approach is actually not more verbose. But the point is that it's a reasonable design, with layers, and where each layer uses each of the ones below it. I want to write code which is designed the same way, but without the overhead) And don't have any static guarantees that you have done all the proper initialization calls before you use them. Well, there are a lot of things I don't have static guarantees for. For instance, sometimes I call the function
RE: [Haskell] thread-local variables
| On Sat, Aug 05, 2006 at 02:18:58PM -0400, Robert Dockins wrote:
| Sorry to jump into this thread so late. However, I'd like to take a moment
| to remind everyone that some time ago I put a concrete proposal for
| thread-local variables on the table.
|
| http://article.gmane.org/gmane.comp.lang.haskell.cafe/11010

I'm cautious about jumping into this swampy topic, but here are some thoughts.

* The thoughts that Simon and I were considering about thread-local state are quite close to Robert's proposal. For myself, I am somewhat persuaded that some form of implicitly-passed state in the IO monad (without explicit parameters) is useful. Examples I often think of are
    - Allocating unique identifiers
    - Making random numbers
    - Where stdin and stdout should go
  In all of these cases, a form of dynamic binding is just what we want: send stdout to the current thread's stdout, use the current thread's random number seed, etc.

* There's no need to connect it to *state*. The key top-level thing you need is to allocate what Adrian Hey calls a "thing with identity".
  http://www.haskell.org/hawiki/GlobalMutableState
  I'll call it a key. For example, rather than a 'threadlocal' declaration, one might just have:
      newkey foo :: Key Int
  where 'newkey' is the keyword; this declares a new key with type (Key Int), distinct from all other keys. Now you can imagine that the IO monad could provide operations
      withBinding   :: Key a -> a -> IO b -> IO b
      lookupBinding :: Key a -> IO a
  very much like the dynamic-binding primitives that have popped up on this thread.

* If you want *state*, you can have a (Key (IORef Int)). Now you look up the binding to get an IORef (or MVar, whatever you like) and you can mutate that at will. So this separates a thread-local *environment* from thread-local *state*.

* Keys may be useful for purposes other than withBinding and thread-local state. One would also want to dynamically create new keys:
      newKey :: IO (Key a)

* I agree with Robert that a key issue is initialisation. Maybe it should be possible to associate an initialiser with a key. I have not thought this out.

* A key issue is this: when forking a thread, does the new thread inherit the current thread's bindings, or does it get a freshly-initialised set. Sometimes you want one, sometimes the other, alas.

On the GHC front, we're going to be busy with 6.6 etc until after ICFP, so nothing is going to happen fast -- which gives an opportunity to discuss it. However it's just infeasible for the community at large to follow a long email thread like this one. My suggestion would be for the interested parties to proceed somewhat as we did with packages (http://hackage.haskell.org/trac/ghc/wiki/GhcPackages):

* Make a Wiki page to describe a concrete proposal.
* Where there are design choices, describe them.
* If there are competing proposals that are incompatible, make more than one Wiki page.
* But strive to evolve to one or two, so that the rest of us are not faced with 10 proposals from 10 people!

This could happen on the GHC wiki or the Haskell wiki, though the latter seems more appropriate.

Simon
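To make the key-based operations in the message above concrete, here is a minimal library-level sketch of withBinding and lookupBinding. It is an illustration, not the proposal itself, and it rests on assumptions the message does not make: newKey takes a default value (a stand-in for the "initialiser" idea, where the message has newKey :: IO (Key a)), each key keeps a table of overrides indexed by ThreadId, and a freshly forked thread sees the default rather than inheriting its parent's binding. Data.Map comes from the containers package.

```haskell
import Control.Concurrent (ThreadId, myThreadId)
import Control.Exception (bracket_)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map

-- | A key: a default value plus a table of per-thread overrides.
-- (The record layout is an assumption of this sketch.)
data Key a = Key
  { keyDefault  :: a
  , keyBindings :: IORef (Map.Map ThreadId a)
  }

-- | Create a key. The default plays the role of an initialiser.
newKey :: a -> IO (Key a)
newKey def = Key def <$> newIORef Map.empty

-- | The calling thread's binding, or the default if it has none.
lookupBinding :: Key a -> IO a
lookupBinding key = do
  tid <- myThreadId
  Map.findWithDefault (keyDefault key) tid <$> readIORef (keyBindings key)

-- | Run an action with the key bound to a value in the calling thread,
-- restoring the previous binding afterwards, even on exceptions.
withBinding :: Key a -> a -> IO b -> IO b
withBinding key val action = do
  tid <- myThreadId
  let ref = keyBindings key
  old <- Map.lookup tid <$> readIORef ref
  let install = atomicModifyIORef' ref (\m -> (Map.insert tid val m, ()))
      restore = atomicModifyIORef' ref (\m -> (Map.alter (const old) tid m, ()))
  bracket_ install restore action
```

With `k <- newKey 0`, the expression `withBinding k 5 (lookupBinding k)` yields 5, and `lookupBinding k` yields 0 again afterwards. Storing an IORef as the bound value gives the environment/state split described in the message; inheritance on fork would need a wrapper around forkIO or runtime support, which this sketch leaves out.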
Re: [Haskell] thread-local variables
On 06.08 04:23, Frederik Eaton wrote:
> I also forgot to mention that if you hold on to a ThreadId, it
> apparently causes the whole thread to be retained. Simon Marlow
> explained this on 2005/10/18:

Actually this problem does not exist in the code. The problem is encountered if children are tied to their parents, that is, if they contain the ThreadId of the parent thread. In my code this problem should not occur.

- Einar Karttunen
Re: [Haskell] thread-local variables
On 06.08 02:41, Frederik Eaton wrote:
> Also, note that my proposal differs in that thread local variables are not writable, but can only be changed by calling (e.g. in my API) 'withIOParam'. This is still just as general, because an IORef can be stored in a thread-local variable, but it makes it easier to reason about the more common use case where TLS is used to make IO a Reader; and it makes it easier to share modifiable state across more than one thread. I.e. if modifiable state is stored as 'IOParam (IORef a)' then the default is for the stored 'IORef a' to be shared across all threads; it can only be changed locally for a specified action and any sub-threads using 'withIOParam'; and if some library I use decides to fork a thread behind the scenes, it won't change my program's behavior.

Perhaps a function like this would solve all our problems:

  -- | Tie all TLS references in the IO action to the current
  -- environment rather than the environment in which it will
  -- actually be executed.
  tieToCurrentTLS :: IO a -> IO (IO a)

> I think it is a good idea to have stdin, cwd, etc. be thread-local.

How would this work together with the FFI?

> I don't understand why the 'TL' monad is necessary, but I haven't read the proposal very carefully.

The TL monad is necessary to make initialization order problems go away.

- Einar Karttunen
Re: [Haskell] thread-local variables
On 05.08 19:56, Frederik Eaton wrote:
> That doesn't answer the question: What if my application has a need for several different sets of parameters - what if it doesn't make sense to combine them into a single monad? What if there are 'n' layers? Is it incorrect to say that the monadic approach requires code size O(n^2)?

A well-designed monadic approach does not require O(n^2). But if you want to design code in a way that requires O(n^2) code size you can do it. Parallel layers require O(layers). Nested layers hiding the lower layer need O(layers). This is not a problem in practice and makes refactoring very easy.

>> And don't have any static guarantees that you have done all the proper initialization calls before you use them.
>
> Well, there are a lot of things I don't have static guarantees for. For instance, sometimes I call the function 'head', and the compiler isn't able to verify that the argument isn't an empty list. If I initialize my TLS to 'undefined' then I'll get a similar error message, at run time. For another example, I don't use monadic regions when I do file IO. I can live with that.

The problem is with refactoring: taking a piece of code, reusing it somewhere else, and trying to figure out what it needs.

...

>> Also if we have two pieces of the same per-thread state that we wish to use in one thread (e.g. db-connections) then the TLS approach becomes quite hard.
>
> No harder than the monadic approach, in my opinion.

In the monadic approach adding a second db connection would involve:

1) add a line to the state record
2) add a db2query = withPart db2 . flip query
3) no changes elsewhere

If the DB API uses a TLS parameter of type Proxy DBH how would you implement this in a nice manner for the TLS case?

> You've redefined 'fork'. If I want a library which works with other libraries, that will not be an option. The original purpose of my posting to this thread was to ask for two standard functions which would let me define thread-local variables in a way which is interoperable with other libraries, to the same extent as 'withArgs' and 'withProgName' are.

Any library that forks may be using a preallocated thread pool, so it might not work with TLS. withArgs and withProgName are global and not very thread-friendly.

- Einar Karttunen
Re: [Haskell] thread-local variables
On 8/5/06, Frederik Eaton wrote:
> Also, note that my proposal differs in that thread local variables are not writable, but can only be changed by calling (e.g. in my API) 'withIOParam'.
> [snip]
> and if some library I use decides to fork a thread behind the scenes, it won't change my program's behavior.

Yes, but if it passes your action to another pre-existing thread, it will.

What people seem to want is dynamic scoping. Why not implement that instead of messing around with yucky thread stuff?

  data IODynamicRef a
  getDynamicRef :: IODynamicRef a -> IO a
  setDynamicRef :: IODynamicRef a -> a -> IO b -> IO b

--
Taral
"You can't prove anything."
    -- Gödel's Incompetence Theorem
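The objection about pre-existing threads can be made concrete with a small worker-pool sketch. The names Pool, newPool, runInPool and callerAndWorker are hypothetical and not taken from any library mentioned in the thread; the only point is that an action handed to a pool runs under the worker's ThreadId, so any state looked up per thread would be the worker's rather than the caller's.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forever, join, void)

-- | A one-worker pool: jobs are executed by a thread forked up front.
newtype Pool = Pool (Chan (IO ()))

-- | Fork the worker. It runs each job itself instead of forking a new
-- thread per job. A job that throws kills the worker in this sketch.
newPool :: IO Pool
newPool = do
  jobs <- newChan
  void (forkIO (forever (join (readChan jobs))))
  pure (Pool jobs)

-- | Hand an action to the worker and wait for its result.
runInPool :: Pool -> IO a -> IO a
runInPool (Pool jobs) act = do
  box <- newEmptyMVar
  writeChan jobs (act >>= putMVar box)
  takeMVar box

-- | The caller's thread and the thread that actually ran the action.
callerAndWorker :: Pool -> IO (ThreadId, ThreadId)
callerAndWorker pool = do
  caller <- myThreadId
  worker <- runInPool pool myThreadId
  pure (caller, worker)
```

The two ThreadIds returned by callerAndWorker differ, even though from the caller's side runInPool looks like an ordinary blocking call. A dynamically scoped reference of the kind sketched in the message above would need the binding to travel with the action, which the thread identity alone does not give.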
Re: [Haskell] thread-local variables
Maybe I'm misunderstanding your position - maybe you think that I should use lots of different processes to segregate global state into separate contexts? Well, that's nice, but I'd rather not. For instance, I'm writing a server - and it's just not efficient to use a separate process for each request. And there are some things such as database connections, current user id, log files, various profiling data, etc., that I would like to be thread-global but not process-global. I have done many servers in Haskell. Usually I have threads allocated to specific tasks rather than specific requests. What guarantees do your code have that all the relevant parameters are already initialized - and how can an user of the code know which TLS variables need to be initialized? You could ask the same questions about process-global state, couldn't you? If it is documented maybe it could be done at the level of an implicit parameter? Do you think implicit parameters are better than TLS? Or maybe you think that certain types of global state should be privileged - for instance, that all of the things which are arguments to 'newMain' above are OK to have as global state, but that anything else should be passed as function arguments, thus making thread-localization moot. I disagree with this - I am a proponent of extensibility, and think that the language should make as few things as possible built-in. I want to define my own application-specific global state, and, additionally, I want to have it thread-global, not process-global. This can cause much fun with the FFI. If we change e.g. stdout to thread specific what should be do before each foreign call? Same with the other things that are related to the OS process in question. A thread is a context of execution while a process is a context for resources. Would you like to have multiple Haskell processes inside one OS process? If you want to think of it that way, then sure. 
I don't consider these very different: 1) use one thread from a pre-allocated pool to do a task 2) fork a new thread to do the task With TLS they are vastly different. If you don't consider them different, then you can start using (2) instead of (1). You asked for an example, but, because of the nature of this topic, it would have to be a very large example to prove my point. Thread-local variables are things that only become really useful in large programs. Instead, I've asked you to put yourself in my shoes - what if the bits of context that you already take for granted in your programs had to be thread-local? How would you cope, without thread-local variables, in such a situation? I have been using an application specific monad (newtyped transformer) and a clean set of functions so that the implementation is not hardcoded and can be changed easily. Thus I haven't had the same difficulties as you. I don't think many of the process global resources would make sense on a per-thread basis and I am not against all global state. You say many, but the question is are there any. But I would say that I think I would find having to know what thread a particular bit of code was running in in order to grok it very strange, I agree that it is important to have code which is easy to understand. Usually, functions run in the same thread as their caller, unless they are passed to something with the word 'fork' in the name. That's a good rule of thumb that is in fact sufficient to let you understand the code I write. Also, if that's too much to remember, then since I'm only proposing and using non-mutable thread-local state (i.e. it behaves like a MonadReader), and since I'm not passing actions between threads as Einar is, then you can forget about the 'fork' caveat. The only problem appears when someone uses two libraries one written by me and an another written by you and wonders why is my program failing in mysterious ways. Can you give the API for your library? 
I have a hard time imagining how it could not be obvious that a thread pool is being used. I think the code would in fact be more difficult to grok, if all of the things which I want to be thread-local were instead passed around as parameters, a la 'newMain'. This is simply because, in that scenario, there would much more code to read, and it would be very repetitive. If I used special monads for my state, then the situation would be only slightly better - a single monad would not suffice, and I'd be faced with a plethora of 'lift' functions and redefinitions of 'catch', as well as long type signatures and a crowded namespace. As said before the monadic approach can be quite clean. I haven't used implicit parameters that much, so I won't comment on them. Perhaps you can give an example? As I said, a single monad won't suffice for me, because different libraries only know about different parts of the state.
Re: [Haskell] thread-local variables
On 05.08 14:32, Frederik Eaton wrote: If it is documented maybe it could be done at the level of an implicit parameter? Do you think implicit parameters are better than TLS? Implicit parameters are explicit and the type checker guards that they are not undefined (and thus are safe in the presence of callbacks). I haven't used implicit parameters extensively because I prefer the monadic approach. I don't consider these very different: 1) use one thread from a pre-allocated pool to do a task 2) fork a new thread to do the task With TLS they are vastly different. If you don't consider them different, then you can start using (2) instead of (1). Performance reasons or access to shared resources. Also 2) would mean in many cases making currently local state global which is not nice. Can you give the API for your library? I have a hard time imagining how it could not be obvious that a thread pool is being used. e.g. various withFooResource :: (Foo -> IO a) -> IO a can use worker threads. As said before the monadic approach can be quite clean. I haven't used implicit parameters that much, so I won't comment on them. Perhaps you can give an example? As I said, a single monad won't suffice for me, because different libraries only know about different parts of the state. With TLS, one can delimit the scope of parameters by making the references to them module-internal, for instance. With monads, I imagine that I'll need for each parameter (1) a MonadX class, with a liftX member (2) a catchX function (3) a MonadY instance, for each wrapped monad Y (thus the number of such instances will be O(n^2) where n is the number of parameters) That is usually the wrong approach. Newtype something like StateT AppState IO. Use something like: runWithPart :: (AppState -> c) -> (c -> IO a) -> AppM a to define nice actions for different parts of the libraries. Usually this is very easy if one uses combinators and high level constructs and messier if it is hard to find the right combinators. 
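For readers following the runWithPart suggestion, here is a minimal sketch of what such a monad might look like. It hand-rolls the state monad rather than importing StateT so that it needs only base. The AppState fields (dbConn, logPrefix) and the two library actions are hypothetical illustrations and are not taken from any code posted in this thread.

```haskell
-- Hypothetical application state; the field names are illustrative only.
data AppState = AppState
  { dbConn    :: String
  , logPrefix :: String
  }

-- Hand-rolled equivalent of a newtype around StateT AppState IO.
newtype AppM a = AppM { runAppM :: AppState -> IO (a, AppState) }

instance Functor AppM where
  fmap f (AppM g) = AppM $ \s -> do
    (a, s') <- g s
    return (f a, s')

instance Applicative AppM where
  pure a = AppM $ \s -> return (a, s)
  AppM f <*> AppM g = AppM $ \s -> do
    (h, s1) <- f s
    (a, s2) <- g s1
    return (h a, s2)

instance Monad AppM where
  AppM g >>= k = AppM $ \s -> do
    (a, s') <- g s
    runAppM (k a) s'

-- Project one part of the state and hand it to a library action that
-- knows nothing about the rest of AppState.
runWithPart :: (AppState -> c) -> (c -> IO a) -> AppM a
runWithPart part act = AppM $ \s -> do
  a <- act (part s)
  return (a, s)

-- Two "library" actions, each aware of only its own piece of state.
describeConn :: String -> IO String
describeConn conn = return ("using " ++ conn)

logLine :: String -> String -> IO String
logLine msg prefix = return (prefix ++ msg)

-- Application code combines them in the single AppM monad.
handler :: AppM (String, String)
handler = do
  a <- runWithPart dbConn describeConn
  b <- runWithPart logPrefix (logLine "request served")
  return (a, b)
```

The point of the sketch is only that each library sees its own projection of the state; whether this scales to several independently developed libraries is the question the rest of the thread argues about.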
If you look at the various web frameworks in Haskell you will notice that most of them live happily with one monad and don't suffer from problems because of that. With TLS, I need (1) a declaration x = unsafePerformIO $ newIOParam ... And don't have any static guarantees that you have done all the proper initialization calls before you use them. In the previous example we were using a lot of libraries using hidden state. How do we guarantee that they have valid values in TLS? Also if we have two pieces of the same per-thread state that we wish to use in one thread (e.g. db-connections) then the TLS approach becomes quite hard. Here is a naive and dirty implementation. The largest problem is that TypeRep is not in Ord. An alternative approach using Dynamic would be possible, but I like the connection between the key and the associated type. http://www.cs.helsinki.fi/u/ekarttun/haskell/TLS/ Not optimized for performance at all. - Einar Karttunen ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] thread-local variables
Sorry to jump into this thread so late. However, I'd like to take a moment to remind everyone that some time ago I put a concrete proposal for thread-local variables on the table. http://article.gmane.org/gmane.comp.lang.haskell.cafe/11010 I believe this proposal addresses the initialization issues that Einar has been discussing. In my proposal, thread-local variables always have some defined value, and they obtain their values at well-defined points. The linked message also gives several use cases that I felt motivated the proposal. -- Rob Dockins Talk softly and drive a Sherman tank. Laugh hard, it's a long way to the bank. -- TMBG
Re: [Haskell] thread-local variables
As said before the monadic approach can be quite clean. I haven't used implicit parameters that much, so I won't comment on them. Perhaps you can give an example? As I said, a single monad won't suffice for me, because different libraries only know about different parts of the state. With TLS, one can delimit the scope of parameters by making the references to them module-internal, for instance. With monads, I imagine that I'll need for each parameter (1) a MonadX class, with a liftX member (2) a catchX function (3) a MonadY instance, for each wrapped monad Y (thus the number of such instances will be O(n^2) where n is the number of parameters) That is usually the wrong approach. Newtype something like StateT AppState IO. Use something like: runWithPart :: (AppState -> c) -> (c -> IO a) -> AppM a to define nice actions for different parts of the libraries. Usually this is very easy if one uses combinators and high level constructs and messier if it is hard to find the right combinators. If you look at the various web frameworks in Haskell you will notice that most of them live happily with one monad and don't suffer from problems because of that. That doesn't answer the question: What if my application has a need for several different sets of parameters - what if it doesn't make sense to combine them into a single monad? What if there are 'n' layers? Is it incorrect to say that the monadic approach requires code size O(n^2)? With TLS, I need (1) a declaration x = unsafePerformIO $ newIOParam ... And don't have any static guarantees that you have done all the proper initialization calls before you use them. Well, there are a lot of things I don't have static guarantees for. For instance, sometimes I call the function 'head', and the compiler isn't able to verify that the argument isn't an empty list. If I initialize my TLS to 'undefined' then I'll get a similar error message, at run time. For another example, I don't use monadic regions when I do file IO. 
I can live with that. ... Also if we have two pieces of the same per-thread state that we wish to use in one thread (e.g. db-connections) then the TLS approach becomes quite hard. No harder than the monadic approach, in my opinion. Here is a naive and dirty implementation. The largest problem is that TypeRep is not in Ord. An alternative approach using Dynamic would be possible, but I like the connection between the key and the associated type. http://www.cs.helsinki.fi/u/ekarttun/haskell/TLS/ Not optimized for performance at all. You've redefined 'fork'. If I want a library which works with other libraries, that will not be an option. The original purpose of my posting to this thread was to ask for two standard functions which would let me define thread-local variables in a way which is interoperable with other libraries, to the same extent as 'withArgs' and 'withProgName' are. Frederik -- http://ofb.net/~frederik/ ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] thread-local variables
Hi Robert, I looked over your proposal. I'm not sure if I'm in favor of introducing a new keyword. It seems unnecessary. Also, note that my proposal differs in that thread local variables are not writable, but can only be changed by calling (e.g. in my API) 'withIOParam'. This is still just as general, because an IORef can be stored in a thread-local variable, but it makes it easier to reason about the more common use case where TLS is used to make IO a Reader; and it makes it easier to share modifiable state across more than one thread. I.e. if modifiable state is stored as 'IOParam (IORef a)' then the default is for the stored 'IORef a' to be shared across all threads; it can only be changed locally for a specified action and any sub-threads using 'withIOParam'; and if some library I use decides to fork a thread behind the scenes, it won't change my program's behavior. I think it is a good idea to have stdin, cwd, etc. be thread-local. I don't understand why the 'TL' monad is necessary, but I haven't read the proposal very carefully. Best, Frederik On Sat, Aug 05, 2006 at 02:18:58PM -0400, Robert Dockins wrote: Sorry to jump into this thread so late. However, I'd like to take a moment to remind everyone that some time ago I put a concrete proposal for thread-local variables on the table. http://article.gmane.org/gmane.comp.lang.haskell.cafe/11010 I believe this proposal addresses the initialization issues that Einar has been discussing. In my proposal, thread-local variables always have some defined value, and they obtain their values at well-defined points. The linked message also gives several use cases that I felt motivated the proposal. -- Rob Dockins Talk softly and drive a Sherman tank. Laugh hard, it's a long way to the bank. -- TMBG -- http://ofb.net/~frederik/
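The sharing behaviour described for 'IOParam (IORef a)' can be illustrated with a small Reader-style emulation. This is only a sketch under stated assumptions: it is not the proposed API and not real thread-local storage, since the parameter is threaded explicitly through a hypothetical wrapper type P. The names P, getParam, withParam, liftP and forkWait are all invented for the illustration; forkWait stands in for a fork that inherits the parent's binding, and it waits for the child only to keep the example deterministic.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for "IO with one read-only parameter of type e".
newtype P e a = P { runP :: e -> IO a }

instance Functor (P e) where
  fmap f (P g) = P (fmap f . g)

instance Applicative (P e) where
  pure x = P (\_ -> pure x)
  P f <*> P g = P (\e -> f e <*> g e)

instance Monad (P e) where
  P g >>= k = P (\e -> g e >>= \a -> runP (k a) e)

-- Read the current binding.
getParam :: P e e
getParam = P pure

-- Rebind the parameter for the dynamic extent of one action only.
withParam :: e -> P e a -> P e a
withParam e (P g) = P (\_ -> g e)

liftP :: IO a -> P e a
liftP io = P (\_ -> io)

-- Run an action in a new thread that inherits the caller's binding.
-- (If the child throws, the parent blocks; acceptable for a sketch.)
forkWait :: P e a -> P e a
forkWait (P g) = P $ \e -> do
  done <- newEmptyMVar
  _ <- forkIO (g e >>= putMVar done)
  takeMVar done

-- Increment whichever counter the parameter currently points at.
bump :: P (IORef Int) ()
bump = getParam >>= \r -> liftP (atomicModifyIORef' r (\n -> (n + 1, ())))

demo :: IO (Int, Int)
demo = do
  shared  <- newIORef 0
  private <- newIORef 100
  flip runP shared $ do
    bump                               -- parent thread updates the shared ref
    forkWait bump                      -- child inherits the same ref
    withParam private (forkWait bump)  -- rebinding is scoped to this action
    bump                               -- back on the shared ref
  (,) <$> readIORef shared <*> readIORef private
```

In this emulation the binding itself is immutable and only the IORef it points to is mutated, which is the Reader-like discipline the message argues for.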
Re: [Haskell] thread-local variables
Here is a naive and dirty implementation. The largest problem is that TypeRep is not in Ord. An alternative approach using Dynamic would be possible, but I like the connection between the key and the associated type. http://www.cs.helsinki.fi/u/ekarttun/haskell/TLS/ Not optimized for performance at all. You've redefined 'fork'. If I want a library which works with other libraries, that will not be an option. The original purpose of my posting to this thread was to ask for two standard functions which would let me define thread-local variables in a way which is interoperable with other libraries, to the same extent as 'withArgs' and 'withProgName' are. I also forgot to mention that if you hold on to a ThreadId, it apparently causes the whole thread to be retained. Simon Marlow explained this on 2005/10/18:

> One could argue that getting the parent ThreadId is something that should be supported natively by forkIO, and I might be inclined to agree. Unfortunately there are some subtleties: currently a ThreadId is represented by a pointer to the thread itself, which causes the thread to be kept alive. This has implications not only for space leaks, but also for reporting deadlock: if you have a ThreadId for a thread, you can send it an exception with throwTo at any time, and hence the runtime can never determine that the thread is deadlocked so it will never get the NonTermination exception. Perhaps we need two kinds of ThreadId: a weak one for use in Maps, and a strong one that you can use with throwTo. But then building a Map in which some elements can be garbage collected is a bit tricky (it can be done though; see our old Memo table implementation in fptools/hslibs/util/Memo.hs).

So this is another problem with your implementation, and another reason why I want TLS support in the standard libraries. Frederik -- http://ofb.net/~frederik/
Re: [Haskell] thread-local variables
As for the subject under discussion (thread local state), I am personally sceptical about it. Why do we need it? Are we talking about safety or just convenience/API elegance? I've never encountered a situation where I've needed thread local state, (but this does not necessarily make it evil:-) OK. What if all Haskell processes, all over the world, were made into threads in the same large process? There are a lot of things that are currently global state - as in, process-global - which would have to become non-global in some way - pretty much all interaction with the world: file IO, networking, command line arguments, system environment, etc. You, Einar, and others seem to be arguing that the only way to make these things non-global should be to either make them explicit arguments to functions, or to have them appear explicitly in the type of the application's primary monad. For instance, this simple program:

    main :: IO ()
    main = do
      putStrLn "Hello world"

might, in Adrian Hey and Einar Karttunen's world, become:

    newMain host environment program_args network_config locale
            terminal_settings stdin stdout stderr = do
      hPutStrLn stdout (defaultEncoding locale) "Hello world"

Now, some people might find this second version delightfully explicit, but I'd have doubts about whether such people are actually trying to get things done, or whether they see the language as an end in itself. As for me, I prefer the first version - it saves reading and typing, and is perfectly clear, and I have work to do. Maybe I'm misunderstanding your position - maybe you think that I should use lots of different processes to segregate global state into separate contexts? Well, that's nice, but I'd rather not. For instance, I'm writing a server - and it's just not efficient to use a separate process for each request. And there are some things such as database connections, current user id, log files, various profiling data, etc., that I would like to be thread-global but not process-global. 
Or maybe you think that certain types of global state should be privileged - for instance, that all of the things which are arguments to 'newMain' above are OK to have as global state, but that anything else should be passed as function arguments, thus making thread-localization moot. I disagree with this - I am a proponent of extensibility, and think that the language should make as few things as possible built-in. I want to define my own application-specific global state, and, additionally, I want to have it thread-global, not process-global. You asked for an example, but, because of the nature of this topic, it would have to be a very large example to prove my point. Thread-local variables are things that only become really useful in large programs. Instead, I've asked you to put yourself in my shoes - what if the bits of context that you already take for granted in your programs had to be thread-local? How would you cope, without thread-local variables, in such a situation? But I would say that I think I would find having to know what thread a particular bit of code was running in in order to grok it very strange, I agree that it is important to have code which is easy to understand. Usually, functions run in the same thread as their caller, unless they are passed to something with the word 'fork' in the name. That's a good rule of thumb that is in fact sufficient to let you understand the code I write. Also, if that's too much to remember, then since I'm only proposing and using non-mutable thread-local state (i.e. it behaves like a MonadReader), and since I'm not passing actions between threads as Einar is, then you can forget about the 'fork' caveat. I think the code would in fact be more difficult to grok, if all of the things which I want to be thread-local were instead passed around as parameters, a la 'newMain'. This is simply because, in that scenario, there would much more code to read, and it would be very repetitive. 
If I used special monads for my state, then the situation would be only slightly better - a single monad would not suffice, and I'd be faced with a plethora of 'lift' functions and redefinitions of 'catch', as well as long type signatures and a crowded namespace. unless there was some obvious technical reason why the thread local state needed to be thread local (can't think of any such reason right now). Some things are not immediately obvious. If you don't like to think of reasons, then just take my word for it that it would help me. A facility for thread-local variables would be just another of many facilities that programmers could choose from when designing their code. I'm not asking you to change the way you program - I don't care how other people program. I trust them to know what is best for their particular application. It's none of my business, anyway. Since Simon Marlow said that he had been considering a thread-local variable facility, I merely wanted to voice my support:
Re: [Haskell] thread-local variables
On 04.08 17:29, Frederik Eaton wrote: might, in Adrian Hey and Einar Karttunen's world, become:

    newMain host environment program_args network_config locale
            terminal_settings stdin stdout stderr = do
      hPutStrLn stdout (defaultEncoding locale) "Hello world"

Actually I have implemented network-libraries, and I don't remember them requiring such things ;-) I think our main difference is that when designing concurrent applications in Haskell I frequently use monadic actions as callbacks invoked in distant unrelated threads. Threading behind the API is seen by me as mostly an implementation issue as long as the service guarantees don't change. You seem to use threads in a much more constrained fashion (my own interpretation) which results in us seeing TLS from very different perspectives. Maybe I'm misunderstanding your position - maybe you think that I should use lots of different processes to segregate global state into separate contexts? Well, that's nice, but I'd rather not. For instance, I'm writing a server - and it's just not efficient to use a separate process for each request. And there are some things such as database connections, current user id, log files, various profiling data, etc., that I would like to be thread-global but not process-global. I have done many servers in Haskell. Usually I have threads allocated to specific tasks rather than specific requests. What guarantees does your code have that all the relevant parameters are already initialized - and how can a user of the code know which TLS variables need to be initialized? If it is documented maybe it could be done at the level of an implicit parameter? Or maybe you think that certain types of global state should be privileged - for instance, that all of the things which are arguments to 'newMain' above are OK to have as global state, but that anything else should be passed as function arguments, thus making thread-localization moot. 
I disagree with this - I am a proponent of extensibility, and think that the language should make as few things as possible built-in. I want to define my own application-specific global state, and, additionally, I want to have it thread-global, not process-global. This can cause much fun with the FFI. If we change e.g. stdout to thread specific what should we do before each foreign call? Same with the other things that are related to the OS process in question. A thread is a context of execution while a process is a context for resources. Would you like to have multiple Haskell processes inside one OS process? I don't consider these very different: 1) use one thread from a pre-allocated pool to do a task 2) fork a new thread to do the task With TLS they are vastly different. You asked for an example, but, because of the nature of this topic, it would have to be a very large example to prove my point. Thread-local variables are things that only become really useful in large programs. Instead, I've asked you to put yourself in my shoes - what if the bits of context that you already take for granted in your programs had to be thread-local? How would you cope, without thread-local variables, in such a situation? I have been using an application specific monad (newtyped transformer) and a clean set of functions so that the implementation is not hardcoded and can be changed easily. Thus I haven't had the same difficulties as you. I don't think many of the process global resources would make sense on a per-thread basis and I am not against all global state. But I would say that I think I would find having to know what thread a particular bit of code was running in in order to grok it very strange, I agree that it is important to have code which is easy to understand. Usually, functions run in the same thread as their caller, unless they are passed to something with the word 'fork' in the name. 
That's a good rule of thumb that is in fact sufficient to let you understand the code I write. Also, if that's too much to remember, then since I'm only proposing and using non-mutable thread-local state (i.e. it behaves like a MonadReader), and since I'm not passing actions between threads as Einar is, then you can forget about the 'fork' caveat. The only problem appears when someone uses two libraries one written by me and an another written by you and wonders why is my program failing in mysterious ways. I think the code would in fact be more difficult to grok, if all of the things which I want to be thread-local were instead passed around as parameters, a la 'newMain'. This is simply because, in that scenario, there would much more code to read, and it would be very repetitive. If I used special monads for my state, then the situation would be only slightly better - a single monad would not suffice, and I'd be faced with a plethora of 'lift' functions and redefinitions of 'catch', as well as long type signatures and a crowded namespace. As
Re: [Haskell] thread-local variables
On 31.07 23:53, Adrian Hey wrote: Frederik Eaton wrote: On Mon, Jul 31, 2006 at 03:09:59PM +0300, Einar Karttunen wrote: On 31.07 03:18, Frederik Eaton wrote: 4) the library runs the callback code in Tw where the TLS state is invalid. This is even worse than a global variable in this case. If you have threads, and you have something which needs to be different among different threads, then it is hard for me to see how thread-local variables could be worse than global variables in any case at all. I haven't been following the technicalities of the particular scenario that's under discussion so I don't know exactly what either of you mean by (even) worse than global variables. I just want to point out that, as I (and a few others) see it at least, top level mutable state (aka global variables) is absolutely necessary sometimes for _SAFETY_ reasons. I agree that global variables are sometimes the best solution. My point in the quote was that in the example described TLS would cause more trouble than global mutable state. But I would say that I think I would find having to know what thread a particular bit of code was running in in order to grok it very strange, unless there was some obvious technical reason why the thread local state needed to be thread local (can't think of any such reason right now). I have to agree to this. It would be very nice to see good examples of thread local state in action that would teach us (the sceptics) why TLS is a good idea in Haskell - and maybe we would learn to write better code with it. Something more than simply avoiding a Reader monad / implicit parameters would be nice. ps. Should we move this discussion to haskell-cafe? - Einar Karttunen ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)
On 31.07 03:18, Frederik Eaton wrote: I don't think it's necessarily such a big deal. Presumably the library with the worker threads will have to be invoked somewhere. One should just make sure that it is invoked in the appropriate environment, for instance with the database connection already properly initialized. (*) One might even want to change the environment a little within each thread, for instance so that errors get logged to a thread-specific log file. So we have the following: 1) the library is initialized and spawns worker thread Tw 2) application initializes the database connection and it is associated with the current thread Tc and all the children it will have (unless changed) 3) the application calls the library in Tc passing an IO action to it. The IO action refers to the TLS thinking it is in Tc where it is valid. 4) the library runs the callback code in Tw where the TLS state is invalid. This is even worse than a global variable in this case. Of course one can argue that the application should first initialize the database handle. But if the app uses worker threads (spawned before library initialization) then things will break if a library uses TLS and callbacks and they end up running in threads created before the library initialization. - Einar Karttunen ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
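Steps 1) to 4) above can be sketched with a deliberately naive emulation: a table keyed by ThreadId stands in for thread-local storage, and a single worker thread stands in for the library. Everything here (the TLS type, setLocal, getLocal, startWorker) is hypothetical and written only for this illustration. The emulation does not model inheritance at fork time, but in this scenario that makes no difference, because the worker Tw is created before the application binds its value.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forever)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- Naive emulation of thread-local storage: one value per ThreadId.
type TLS = IORef [(ThreadId, String)]

-- Bind a value for the calling thread.
setLocal :: TLS -> String -> IO ()
setLocal tls v = do
  me <- myThreadId
  atomicModifyIORef' tls (\kvs -> ((me, v) : kvs, ()))

-- Look up the value bound for the calling thread, if any.
getLocal :: TLS -> IO (Maybe String)
getLocal tls = do
  me <- myThreadId
  lookup me <$> readIORef tls

-- 1) The "library": spawns worker Tw and returns a function that runs
--    a callback on Tw and waits for its result.
startWorker :: IO (IO (Maybe String) -> IO (Maybe String))
startWorker = do
  jobs <- newEmptyMVar
  _ <- forkIO $ forever $ do
    (job, reply) <- takeMVar jobs
    result <- job
    putMVar reply result
  return $ \job -> do
    reply <- newEmptyMVar
    putMVar jobs (job, reply)
    takeMVar reply

scenario :: IO (Maybe String, Maybe String)
scenario = do
  tls <- newIORef []
  submit <- startWorker               -- 1) Tw exists before any binding
  setLocal tls "db-conn"              -- 2) binding made in caller thread Tc
  inCaller <- getLocal tls            --    visible in Tc
  inWorker <- submit (getLocal tls)   -- 3) + 4) callback actually runs in Tw
  return (inCaller, inWorker)
```

Here the callback finds a value when run in Tc and none when run in Tw, which is the mismatch being described.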
Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)
On 31.07 14:03, Thomas Conway wrote: This is why I believe transaction-local variables are a more useful concept. You are guaranteed that there is only one thread accessing them, and they behave just like ordinary TVars except that each transaction has its own copy. This seems like it could be useful. E.g. marking graph nodes while traversing them. The argument to newLVar is an initial value that is used at the start of each transaction. Note that this means that the value in an LVar does not persist between transactions. I agree that this limits their use, but simplifies them immensely, and doesn't stand in the way of their being useful for solving a bunch of problems. I think that them reverting to the initial value is more useful than persisting behavior. - Einar Karttunen
Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)
On Mon, Jul 31, 2006 at 03:09:59PM +0300, Einar Karttunen wrote: On 31.07 03:18, Frederik Eaton wrote: I don't think it's necessarily such a big deal. Presumably the library with the worker threads will have to be invoked somewhere. One should just make sure that it is invoked in the appropriate environment, for instance with the database connection already properly initialized. (*) One might even want to change the environment a little within each thread, for instance so that errors get logged to a thread-specific log file. So we have the following: 1) the library is initialized and spawns worker thread Tw 2) application initializes the database connection and it is associated with the current thread Tc and all the children it will have (unless changed) 3) the application calls the library in Tc passing an IO action to it. The IO action refers to the TLS thinking it is in Tc where it is valid. 4) the library runs the callback code in Tw where the TLS state is invalid. This is even worse than a global variable in this case. If you have threads, and you have something which needs to be different among different threads, then it is hard for me to see how thread-local variables could be worse than global variables in any case at all. Of course one can argue that the application should first initialize the database handle. But if the app uses worker threads (spawned before library initialization) then things will break if a library uses TLS and callbacks and they end up running in threads created before the library initialization. OK, sure. In certain situations you have to keep track of whether a function to which you pass an action might be sending it off to be run in a different thread. We've been over this. Perhaps users should be warned in the documentation - and in the documentation for exceptions as well. 
I really don't see that as a problem that would sneak up on people, since if you're passing an action to a function, rather than executing it yourself, then in most cases it should be clear to programmers that the action will run in a different context if not a different thread altogether. And if you want to force the context to be the same, you wrap the action in something restoring that context, just like you would have to do with your state transformer monad stack. Another way to write buggy code is to have it so bloated with extra syntax - e.g. with monad conversions, or extra function parameters, as you propose - that it becomes impossible to read and understand. Frederik -- http://ofb.net/~frederik/ ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] thread-local variables
Frederik Eaton wrote: On Mon, Jul 31, 2006 at 03:09:59PM +0300, Einar Karttunen wrote: On 31.07 03:18, Frederik Eaton wrote: 4) the library runs the callback code in Tw where the TLS state is invalid. This is even worse than a global variable in this case. If you have threads, and you have something which needs to be different among different threads, then it is hard for me to see how thread-local variables could be worse than global variables in any case at all. I haven't been following the technicalities of the particular scenario that's under discussion so I don't know exactly what either of you mean by (even) worse than global variables. I just want to point out that, as I (and a few others) see it at least, top level mutable state (aka global variables) is absolutely necessary sometimes for _SAFETY_ reasons. BTW If anybody still doesn't get it re. why we need top level mutable state, the point is not to avoid explicit state handle threading. The point is to avoid exposing newState constructors as part of the IO library API (thereby making it invulnerable to accidental state spoofing). If you're going to deny library users the ability to create new state handles then you have to make at least one ready made state handle available at the top level. It just so happens that in the common case where there can be only one such state handle (for safety reasons) then you can completely eliminate this from the exposed API. /BTW As for the subject under discussion (thread local state), I am personally sceptical about it. Why do we need it? Are we talking about safety or just convenience/API elegance? 
I've never encountered a situation where I've needed thread local state, (but this does not necessarily make it evil:-) But I would say that I think I would find having to know what thread a particular bit of code was running in in order to grok it very strange, unless there was some obvious technical reason why the thread local state needed to be thread local (can't think of any such reason right now). My 2p.. Regards -- Adrian Hey ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)
On 29.07 13:25, Frederik Eaton wrote: I think support for thread-local variables is something which is urgently needed. It's very frustrating that using concurrency in Haskell is so easy and nice, yet when it comes to IORefs there is no way to get thread-local behavior. Furthermore, that one can make certain things thread-local (e.g. with withArgs, withProgName) makes the solution seem close at hand (although I can appreciate that it may not be). Yet isn't it just a matter of making a Map with existentially quantified values part of the state of each thread, just as the program name and arguments are also part of that state? Are thread local variables really a good idea in Haskell? If variables are thread local how would this combinator work:

    withTimeOut :: Int -> IO a -> IO a
    withTimeOut tout op = mdo
      mv <- newEmptyMVar
      wt <- forkIO $ do try op >>= tryPutMVar mv
                        killThread kt
      kt <- forkIO $ do threadDelay tout
                        e <- tryPutMVar mv $ Left $ DynException $ toDyn TimeOutException
                        if e then killThread wt else return ()
      either throw return =<< takeMVar mv

Would it change the semantics of the action as it is run in a different thread (this is a must if there are potentially blocking FFI calls). Now if the action changes the thread local state then it behaves differently. Do we really want that? Usually one can just add a monad that wraps IO/STM and provides the variables one needs. This has the good side of making scoping explicit. - Einar Karttunen
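The concern about such combinators can be made concrete with a much smaller sketch. It is not withTimeOut itself (there is no timeout here), and runInWorker is a hypothetical helper written only for this illustration: it runs an action in a freshly forked thread and hands the result, or the exception, back to the caller. Any action that inspects its own thread identity, or any state keyed on it, can observe the difference.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, throwIO, try)

-- Run an action in a new thread and wait for its outcome, re-throwing
-- any exception in the caller (hypothetical helper, no timeout logic).
runInWorker :: IO a -> IO a
runInWorker op = do
  mv <- newEmptyMVar
  _ <- forkIO (try op >>= putMVar mv)
  r <- takeMVar mv
  case r of
    Left e  -> throwIO (e :: SomeException)
    Right x -> return x

-- The same action, myThreadId, yields different answers depending on
-- whether it runs directly or through the wrapper.
observedThreads :: IO (ThreadId, ThreadId)
observedThreads = do
  caller <- myThreadId
  worker <- runInWorker myThreadId
  return (caller, worker)
```

Whether thread-local bindings should follow the action into the forked thread (inheritance) or not is exactly the design choice under discussion.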
Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)
On Sun, Jul 30, 2006 at 12:35:42PM +0300, Einar Karttunen wrote:
> On 29.07 13:25, Frederik Eaton wrote:
> > I think support for thread-local variables is something which is urgently needed. It's very frustrating that using concurrency in Haskell is so easy and nice, yet when it comes to IORefs there is no way to get thread-local behavior. Furthermore, that one can make certain things thread-local (e.g. with withArgs, withProgName) makes the solution seem close at hand (although I can appreciate that it may not be). Yet isn't it just a matter of making a Map with existentially quantified values part of the state of each thread, just as the program name and arguments are also part of that state?
>
> Are thread local variables really a good idea in Haskell?

Yes.

> If variables are thread local how would this combinator work:

Do read the code I posted. Please note I'm not suggesting that *all* variables be thread local, I was proposing a special data-type for that.

>     withTimeOut :: Int -> IO a -> IO a
>     withTimeOut tout op = mdo
>       mv <- newEmptyMVar
>       wt <- forkIO $ do try op >>= tryPutMVar mv >> killThread kt
>       kt <- forkIO $ do threadDelay tout
>                         e <- tryPutMVar mv $ Left $ DynException $ toDyn TimeOutException
>                         if e then killThread wt else return ()
>       either throw return =<< takeMVar mv
>
> Would it change the semantics of the action as it is run in a different thread (this is a must if there are potentially blocking FFI calls).

No, because the thread in which it runs inherits any thread-local state from its parent.

> Now if the action changes the thread local state then it behaves differently. Do we really want that?

I'm not sure what you're suggesting. The API I proposed actually doesn't let users discover when their actions are running in sub-threads. (Can you write an example using that API?) However, even if it did, I don't see a problem. Do you think that we should get rid of 'myThreadId', for instance? I don't.

> Usually one can just add a monad that wraps IO/STM and provides the variables one needs. This has the good side of making scoping explicit.

That's easier said than done. Sometimes I take that route. But sometimes I don't want 5 different monads wrapping each other, each with its own 'lift' and 'catch' functions, making error messages indecipherable and code difficult to read and debug. Do you propose creating a special monad for file operations? For network operations? No? Then I don't see why I should have to make a special monad for database operations. Or, if the answer was yes, then fine: obfuscate your own code, but please don't ask me to do the same. Let's support both ways of doing things, and we can be different.

Frederik

-- 
http://ofb.net/~frederik/
Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)
On 30.07 11:49, Frederik Eaton wrote:
> No, because the thread in which it runs inherits any thread-local state from its parent.

So we have different threads modifying the thread-local state? If it is a copy then updates are not propagated.

What about a design with 10 worker threads taking requests from a Chan (IO ()) and running them (this occurs in real code). To get things right they should use the TLS-context relevant to each IO () rather than the thread.

> > Now if the action changes the thread local state then it behaves differently. Do we really want that?
>
> I'm not sure what you're suggesting. The API I proposed actually doesn't let users discover when their actions are running in sub-threads. (Can you write an example using that API?) However, even if it did, I don't see a problem. Do you think that we should get rid of 'myThreadId', for instance? I don't.

I do consider using myThreadId bad form for most purposes. It is nice to have for debugging output - and occasionally for sending other threads a handle for asynchronous exceptions, but this can lead to problems when changing threading patterns. Usually nice code does not care in which thread it is run.

> > Usually one can just add a monad that wraps IO/STM and provides the variables one needs. This has the good side of making scoping explicit.
>
> That's easier said than done. Sometimes I take that route. But sometimes I don't want 5 different monads wrapping each other, each with its own 'lift' and 'catch' functions, making error messages indecipherable and code difficult to read and debug. Do you propose creating a special monad for file operations? For network operations? No? Then I don't see why I should have to make a special monad for database operations. Or, if the answer was yes, then fine: obfuscate your own code, but please don't ask me to do the same. Let's support both ways of doing things, and we can be different.

Usually I just define one custom monad for the application which wraps the stack of monad transformers. Thus changing the monad stack does not affect the application code. A quite clean and efficient solution.

My main objection to the TLS is that it looks like normal IO, but changing the thread that evaluates it can break things in ways that are hard to debug. E.g. we have an application that uses TLS and passes an IO action to a library that happens to use a pool of worker threads that is invisible to the application. Or the same with the role of the application and library reversed.

Offering it up as a separate library should be ok as it would be very easy to spot and take extra care not to cause problems.

- Einar Karttunen
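To make the "one custom monad for the application" idea concrete, here is a minimal sketch using only base. The names Env, App, askEnv, localEnv, liftApp and logMsg are all hypothetical and chosen for illustration; a real application would more likely wrap ReaderT from a transformer library, which is what the message seems to describe.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical application environment: a database name and a shared log.
data Env = Env { envDbName :: String, envLog :: IORef [String] }

-- The single application monad: an IO action that can read the environment.
newtype App a = App { runApp :: Env -> IO a }

instance Functor App where
  fmap f (App g) = App (fmap f . g)

instance Applicative App where
  pure x = App (\_ -> pure x)
  App f <*> App g = App (\e -> f e <*> g e)

instance Monad App where
  App g >>= k = App (\e -> g e >>= \a -> runApp (k a) e)

-- Read the current environment.
askEnv :: App Env
askEnv = App return

-- Run a sub-computation in a modified environment; the scope is explicit.
localEnv :: (Env -> Env) -> App a -> App a
localEnv f (App g) = App (g . f)

-- Embed a plain IO action.
liftApp :: IO a -> App a
liftApp io = App (const io)

logMsg :: String -> App ()
logMsg s = do
  env <- askEnv
  liftApp (modifyIORef (envLog env) (s :))

-- Logs the database name in the outer scope and in an overridden inner scope.
demo :: IO [String]
demo = do
  ref <- newIORef []
  let app = do
        db <- envDbName <$> askEnv
        logMsg ("using " ++ db)
        localEnv (\e -> e { envDbName = "test" }) $ do
          db' <- envDbName <$> askEnv
          logMsg ("using " ++ db')
  runApp app (Env "main" ref)
  reverse <$> readIORef ref
```

Because the environment is passed as a value, an App action handed to another thread carries its bindings with it once runApp is applied, whichever thread ends up evaluating it.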
Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)
On Mon, Jul 31, 2006 at 03:54:29AM +0300, Einar Karttunen wrote:
> On 30.07 11:49, Frederik Eaton wrote:
> > No, because the thread in which it runs inherits any thread-local state from its parent.
>
> So we have different threads modifying the thread-local state? If it is a copy then updates are not propagated.

As I said, please read my code. There are no updates.

> What about a design with 10 worker threads taking requests from a Chan (IO ()) and running them (this occurs in real code). To get things right they should use the TLS-context relevant to each IO () rather than the thread.

I could see how either behavior might be desirable, see below. (*)

(snip)

> Usually I just define one custom monad for the application which wraps the stack of monad transformers. Thus changing the monad stack does not affect the application code. A quite clean and efficient solution.

That does sound like a clean approach. However, I think that my approach would be cleaner; and in any case I think that both approaches should be available to the programmer.

> My main objection to the TLS is that it looks like normal IO, but changing the thread that evaluates it can break things in ways that are hard to debug. E.g. we have an application that uses TLS and passes an IO action to a library that happens to use a pool of worker threads that is invisible to the application. Or the same with the role of the application and library reversed.

I don't think it's necessarily such a big deal. Presumably the library with the worker threads will have to be invoked somewhere. One should just make sure that it is invoked in the appropriate environment, for instance with the database connection already properly initialized.

(*) One might even want to change the environment a little within each thread, for instance so that errors get logged to a thread-specific log file.

> Offering it up as a separate library should be ok as it would be very easy to spot and take extra care not to cause problems.

That's good to hear.

Regards,

Frederik

-- 
http://ofb.net/~frederik/
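The worker-pool scenario discussed in this exchange can be sketched as follows. This is only an illustration of the design under debate, not code from either author: a pool of threads runs jobs taken from a Chan (IO ()), and each job captures the environment it needs in its closure, so the result does not depend on which worker runs it. Env, startWorkers and submit are hypothetical names.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (forM_, forever, join, replicateM, replicateM_)
import Data.List (sort)

-- Hypothetical per-request environment.
newtype Env = Env { envUser :: String }

-- Start n worker threads that run whatever IO actions arrive on the channel.
startWorkers :: Int -> IO (Chan (IO ()))
startWorkers n = do
  jobs <- newChan
  replicateM_ n (forkIO (forever (join (readChan jobs))))
  return jobs

-- Bind the environment to the job before it is queued, so the bindings
-- travel with the action rather than with the thread that runs it.
submit :: Chan (IO ()) -> Env -> (Env -> IO ()) -> IO ()
submit jobs env job = writeChan jobs (job env)

demo :: IO [String]
demo = do
  jobs <- startWorkers 10
  out <- newChan
  forM_ ["alice", "bob"] $ \u ->
    submit jobs (Env u) (\env -> writeChan out ("hello " ++ envUser env))
  -- Results may arrive in either order, so sort them before returning.
  sort <$> replicateM 2 (readChan out)
```

With thread-local bindings and inheritance instead, each worker would see the environment in force when the pool was started, which is the case both authors are weighing.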
Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)
Hi All,

On 7/31/06, Einar Karttunen <ekarttun@cs.helsinki.fi> wrote:
> My main objection to the TLS is that it looks like normal IO, but changing the thread that evaluates it can break things in ways that are hard to debug. E.g. we have an application that uses TLS and passes an IO action to a library that happens to use a pool of worker threads that is invisible to the application.

This is why I believe transaction-local variables are a more useful concept. You are guaranteed that there is only one thread accessing them, and they behave just like ordinary TVars except that each transaction has its own copy. I think you'd need an API like

    type LVar a -- local var
    newLVar   :: a -> STM (LVar a)
    readLVar  :: LVar a -> STM a
    writeLVar :: LVar a -> a -> STM ()

The argument to newLVar is an initial value that is used at the start of each transaction. Note that this means that the value in an LVar does not persist between transactions. I agree that this limits their use, but it simplifies them immensely, and doesn't stand in the way of their being useful for solving a bunch of problems.

cheers,
Tom
[Haskell] thread-local variables (was: Re: Implicit Parameters)
Hi,

Sorry to bring up this thread from so long ago.

On Wed, Mar 01, 2006 at 11:53:42AM +, Simon Marlow wrote:
> Ashley Yakeley wrote:
> > Simon Marlow wrote:
> > > Simon & I have discussed doing some form of thread-local state, which covers many uses of implicit parameters and is much preferable IMO. Thread-local state doesn't change your types, and it doesn't require passing any extra parameters at runtime. It works perfectly well for the OS example you give, for example.
> >
> > Interesting. What would that look like in code?
>
> No concrete plans yet. There have been proposals for thread-local variables in the past on this list and haskell-cafe, and other languages have similar features (eg. Scheme's support for dynamic scoping). Doing something along these lines is likely to be quite straightforward to implement, won't require any changes to the type system, and gives you a useful form of implicit parameters without any of the drawbacks. The main difference from implicit parameters would be that thread-local variables would be restricted to the IO monad.

I think support for thread-local variables is something which is urgently needed. It's very frustrating that using concurrency in Haskell is so easy and nice, yet when it comes to IORefs there is no way to get thread-local behavior. Furthermore, that one can make certain things thread-local (e.g. with withArgs, withProgName) makes the solution seem close at hand (although I can appreciate that it may not be). Yet isn't it just a matter of making a Map with existentially quantified values part of the state of each thread, just as the program name and arguments are also part of that state?

    import qualified Data.Map as M
    import Data.Maybe
    import Data.Unique
    import Data.IORef
    import Data.Typeable

    -- only these 2 must be implemented:
    withParams :: ParamsMap -> IO () -> IO ()
    getParams :: IO ParamsMap
    --

    type ParamsMap = M.Map Unique Value

    data Value = forall a . (Typeable a) => V a

    type IOParam a = IORef (Unique, a)

    newIOParam :: Typeable a => a -> IO (IOParam a)
    newIOParam def = do
        k <- newUnique
        newIORef (k,def)

    withIOParam :: Typeable a => IOParam a -> a -> IO () -> IO ()
    withIOParam p value act = do
        (k,def) <- readIORef p
        m <- getParams
        withParams (M.insert k (V value) m) act

    getIOParam :: Typeable a => IOParam a -> IO a
    getIOParam p = do
        (k,def) <- readIORef p
        m <- getParams
        return $ fromMaybe def (M.lookup k m >>= (\ (V x) -> cast x))

Frederik

P.S. I sent a message about this a while back, when I was trying to implement my own version using ThreadId (not really a good approach).

-- 
http://ofb.net/~frederik/
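The proposal leaves withParams and getParams to the runtime. To experiment with the API today, one can emulate them in user code. The sketch below is only such an emulation and rests on loud assumptions: the per-thread map lives in a global table keyed by ThreadId (the approach the P.S. calls not really a good one), inheritance only happens through a hypothetical forkInheriting wrapper rather than through plain forkIO, table entries for finished threads are never cleaned up, and the result types are generalised from IO () to IO a so the demo can return values. It needs the containers package for Data.Map.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (finally)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map as M
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)
import Data.Unique (Unique, newUnique)
import System.IO.Unsafe (unsafePerformIO)

data Value = forall a. Typeable a => V a
type ParamsMap = M.Map Unique Value

-- ASSUMPTION: a global table stands in for state the runtime would keep
-- inside each thread. Entries of finished threads are never removed.
{-# NOINLINE threadParams #-}
threadParams :: IORef (M.Map ThreadId ParamsMap)
threadParams = unsafePerformIO (newIORef M.empty)

getParams :: IO ParamsMap
getParams = do
  tid <- myThreadId
  M.findWithDefault M.empty tid <$> readIORef threadParams

-- Run act with the given bindings, restoring the previous ones afterwards.
withParams :: ParamsMap -> IO a -> IO a
withParams m act = do
  tid <- myThreadId
  old <- getParams
  let set x = atomicModifyIORef' threadParams (\t -> (M.insert tid x t, ()))
  set m
  act `finally` set old

-- Hypothetical fork variant: the child starts with the parent's bindings.
forkInheriting :: IO () -> IO ThreadId
forkInheriting act = do
  m <- getParams
  forkIO (withParams m act)

type IOParam a = IORef (Unique, a)

newIOParam :: Typeable a => a -> IO (IOParam a)
newIOParam def = do
  k <- newUnique
  newIORef (k, def)

withIOParam :: Typeable a => IOParam a -> a -> IO b -> IO b
withIOParam p value act = do
  (k, _) <- readIORef p
  m <- getParams
  withParams (M.insert k (V value) m) act

getIOParam :: Typeable a => IOParam a -> IO a
getIOParam p = do
  (k, def) <- readIORef p
  m <- getParams
  return $ fromMaybe def (M.lookup k m >>= \(V x) -> cast x)

-- Reads the parameter outside a binding, inside it, and from a child thread.
demo :: IO (String, String, String)
demo = do
  user <- newIOParam "nobody"
  outer <- getIOParam user
  done <- newEmptyMVar
  inner <- withIOParam user "alice" $ do
    _ <- forkInheriting (getIOParam user >>= putMVar done)
    getIOParam user
  child <- takeMVar done
  return (outer, inner, child)
```

In this emulation a thread started with plain forkIO sees only default values, which is the opposite of the inheritance semantics argued for in the thread; real support would have to change that inside the runtime.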
Re: [Haskell] thread-local variables (was: Re: Implicit Parameters)
I would also note that some form of transaction-local variable would also be really handy for STM usage.

Tom