[Haskell-cafe] Announcing postgresql-libpq-0.8.2.3
I just fixed a fairly serious performance problem with postgresql-libpq's binding to PQescapeStringConn; in was exhibiting a non-linear slowdown when more strings are escaped and retained. https://github.com/lpsmith/postgresql-libpq/commit/adf32ff26cdeca0a12fa59653b49c87198acc9ae If you are using postgresql-libpq's escapeStringConn, or a library that uses it (e.g. postgresql-simple, or persistent-postgresql), I do recommend upgrading. You may or may not see a performance improvement, depending on your particular use case, but if you do it can be quite substantial. It's not entirely clear to me what the root cause really is, but it certainly appears as though it's related to the (direct) use of mallocBytes, which was replaced with (indirect) calls to mallocForeignPtrBytes / mallocPlainForeignPtrBytes (through the bytestring package). In this case, it resulted in an asymptotic improvement in time complexity of some algorithms. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [database-devel] Announcing postgresql-libpq-0.8.2.3
I'll have to benchmark withMVar on my system, but (at least on my old laptop) a safe foreign function call is also on the order of 100ns. As c_PQescapeStringConn and c_PQescapeByteaConn are currently safe calls, that would limit the maximum time saved at ~50%. Perhaps it would make sense to make these unsafe calls as well, but the justification I used at the time was that the amount of time consumed by these functions is bounded by the length of the string being escaped, which is itself unbounded.Certainly postgresql-libpq is currently overly biased towards safe calls; though when I took the bindings over it was overly biased towards unsafe calls. (Though, arguably, it's worse to err on the side of making ffi calls unsafe.) I've also considered integrating calls to c_PQescapeStringConn with blaze-builder and/or bytestring-builder, which could help a fair bit, but would also introduce dependencies on the internals of these libraries when currently there is none. There is certainly a lot of room for optimizing query generation in postgresql-simple, this I've been well aware of since the beginning. And it probably would be worthwhile to move to protocol-level parameters which would avoid the need for escaping value parameters altogether, and open up the possibility of binary formats as well, which would be a huge performance improvement for things like numerical values and timestamps. Although IIRC, one downside is that this prevents multiple DML commands from being issued in a single request, which would subtly change the interface postgresql-simple exports. Best, Leon On Mon, Jul 8, 2013 at 10:00 PM, Joey Adams joeyadams3.14...@gmail.comwrote: On Mon, Jul 8, 2013 at 9:03 PM, Leon Smith leon.p.sm...@gmail.com wrote: I just fixed a fairly serious performance problem with postgresql-libpq's binding to PQescapeStringConn; in was exhibiting a non-linear slowdown when more strings are escaped and retained. I'd like to point out a somewhat related bottleneck in postgresql-simple (but not postgresql-libpq). Every PQescapeStringConn or PQescapeByteaConn call involves a withMVar, which is about 100ns on the threaded RTS on my system. Taking the Connection lock once for the whole buildQuery call might be much faster, especially for multi-row inserts and updates. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] A use case for *real* existential types
I've been working on a new Haskell interface to the linux kernel's inotify system, which allows applications to subscribe to and be notified of filesystem events. An application first issues a system call that returns a file descriptor that notification events can be read from, and then issues further system calls to watch particular paths for events. These return a watch descriptor (which is just an integer) that can be used to unsubscribe from those events. Now, of course an application can open multiple inotify descriptors, and when removing watch descriptors, you want to remove it from the same inotify descriptor that contains it; otherwise you run the risk of deleting a completely different watch descriptor. So the natural question here is if we can employ the type system to enforce this correspondence. Phantom types immediately come to mind, as this problem is almost the same as ensuring that STRefs are only ever used in a single ST computation. The twist is that the inotify interface has nothing analogous to runST, which does the heavy lifting of the type magic behind the ST monad. This twist is very simple to deal with if you have real existential types, with the relevant part of the interface looking approximately like init :: exists a. IO (Inotify a) addWatch :: Inotify a - FilePath - IO (Watch a) rmWatch :: Inotify a - Watch a - IO () UHC supports this just fine, as demonstrated by a mockup attached to this email. However a solution for GHC is not obvious to me. inotify.hs Description: Binary data ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A use case for *real* existential types
On Fri, May 10, 2013 at 9:00 AM, Andres Löh and...@well-typed.com wrote: This twist is very simple to deal with if you have real existential types, with the relevant part of the interface looking approximately like init :: exists a. IO (Inotify a) addWatch :: Inotify a - FilePath - IO (Watch a) rmWatch :: Inotify a - Watch a - IO () You can still do the ST-like encoding (after all, the ST typing trick is just an encoding of an existential), with init becoming like runST: init :: (forall a. Inotify a - IO b) - IO b addWatch :: Inotify a - FilePath - IO (Watch a) rmWatch :: Inotify a - Watch a - IO () Right, but my interface the Inotify descriptor has an indefinite extent, whereas your interface enforces a dynamic extent. I'm not sure to what degree this would impact use cases of this particular library, but in general moving a client program from the the first interface to the second can require significant changes to the structure of the program, whereas moving a client program from the second interface to the first is trivial. So I'd say my interface is more expressive. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A use case for *real* existential types
On Fri, May 10, 2013 at 9:04 AM, MigMit miguelim...@yandex.ru wrote: With that kind of interface you don't actually need existential types. Or phantom types. You can just keep Inotify inside the Watch, like this: Right, that is an alternative solution, but phantom types are a relatively simple and well understood way of enforcing this kind of property in the type system without incurring run-time costs. My inotify binding is intended to be as thin as possible, and given my proposed interface, you could implement your interface in terms of mine, making the phantom types disappear using the restricted existentials already available in GHC, and such a wrapper should be just as efficient as if you had implemented your interface directly. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A use case for *real* existential types
On Fri, May 10, 2013 at 5:49 PM, Alexander Solla alex.so...@gmail.comwrote: I'm not sure if it would work for your case, but have you considered using DataKinds instead of phantom types? At least, it seems like it would be cheap to try out. http://www.haskell.org/ghc/docs/7.4.2/html/users_guide/kind-polymorphism-and-promotion.html I do like DataKinds a lot, and I did think about them a little bit with respect to this problem, but a solution isn't obvious to me, and perhaps more importantly I'd like to be able to support older versions of GHC, probably back to 7.0 at least. The issue is that every call to init needs to return a slightly different type, and whether this is achieved via phantom types or datakinds, it seems to me some form of existential typing is required. As both Andres and MigMit pointed out, you can sort of achieve this by using a continuation-like construction and higher-ranked types (is there a name for this transform? I've seen it a number of times and it is pretty well known...), but this enforces a dynamic extent on the descriptor whereas the original interface I proposed allows an indefinite extent. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A use case for *real* existential types
A value has an indefinite extent if it's lifetime is independent of any block of code or related program structure, think malloc/free or new/gc. A value has a dynamic extent if is lifetime is statically determined relative to the dynamic execution of the program (e.g. a stack variable): in this case the type system ensures that no references to the inotify descriptor can exist after the callback returns. Best, Leon On Fri, May 10, 2013 at 6:52 PM, Alexander Solla alex.so...@gmail.comwrote: On Fri, May 10, 2013 at 3:31 PM, Leon Smith leon.p.sm...@gmail.comwrote: On Fri, May 10, 2013 at 5:49 PM, Alexander Solla alex.so...@gmail.comwrote: I'm not sure if it would work for your case, but have you considered using DataKinds instead of phantom types? At least, it seems like it would be cheap to try out. http://www.haskell.org/ghc/docs/7.4.2/html/users_guide/kind-polymorphism-and-promotion.html I do like DataKinds a lot, and I did think about them a little bit with respect to this problem, but a solution isn't obvious to me, and perhaps more importantly I'd like to be able to support older versions of GHC, probably back to 7.0 at least. The issue is that every call to init needs to return a slightly different type, and whether this is achieved via phantom types or datakinds, it seems to me some form of existential typing is required. As both Andres and MigMit pointed out, you can sort of achieve this by using a continuation-like construction and higher-ranked types (is there a name for this transform? I've seen it a number of times and it is pretty well known...), but this enforces a dynamic extent on the descriptor whereas the original interface I proposed allows an indefinite extent. I know what extensions (of predicates and the like) are, but what exactly does dynamic and indefinite mean in this context? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Extensible Type Unification
It finally occurred to me how to get most of what I want, at least from a functional perspective.Here's a sample GADT, with four categories of constructor: data Foo :: Bool - Bool - Bool - Bool - * where A :: Foo True b c d B :: Foo True b c d C :: Foo a True c d D :: Foo a b True d E :: Foo a b c True Given an Eq instance, we can easily compare two constructors for equality; we can put some constructors in a list and tell which categories appear in the list, and it plays reasonably nicely with the case coverage analyzer, which I think is important from a software engineering standpoint.For example, this function will compile without warning: fooVal :: Foo a b False False - Int fooVal x = case x of A - 0 B - 1 C - 2 The only thing this doesn't do that I wish it did is infer the type of fooVal above.Rather, GHC infers the type :: Foo a b c d - Int, and then warns me that I'm missing cases. But this is admittedly a strange form of type inference, completely alien to the Haskell landscape, which I realized only shortly after sending my original email. My original description of ranges of sets of types wasn't sufficient to fully capture my intention. That said, this solution falls rather flat in several software engineering respects. It doesn't scale with complexity; types quickly become overbearing even with modest numbers of categories.If you want to add additional categories, you have to modify every type constructor instance you've ever written down.You could mitigate some of this via the existing type-level machinery, but it seems a band-aid, not a solution. For comparison, here is the best I had when I wrote my original email: data Category = W | X | Y | Z data Foo :: [Category] - * where A :: (Member W ub) = Foo ub B :: (Member W ub) = Foo ub C :: (Member X ub) = Foo ub D :: (Member Y ub) = Foo ub E :: (Member Z ub) = Foo ub class Member (a :: x) (bs :: [x]) instance Member a (a ': bs) instance Member a bs = Member a (b ': bs) The code is closer to what I want, in terms of expression, though it mostly fails on the functional requirements.Case analysis doesn't work, I can't compare two constructors without additional type annotations, and while I can find out which categories of constructors appear in a list, it doesn't seem I can actually turn those contexts into many useful things, automatically. So while I didn't have any trouble computing set operations over unordered lists at the type level, I couldn't figure out out to put it to use in the way I wanted.Perhaps there is a clever way to emulate unification, but I really like the new features that reduce the need to be clever when it comes to type-level computation. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Extensible Type Unification
I've been toying with some type-level programming ideas I just can't quite make work, and it seems what I really want is a certain kind of type unification. Basically, I'd like to introduce two new kind operators: kind Set as -- a finite set of ground type terms of kind as kind Range as = Range (Set as) (Set as) -- a greatest lower bound and a least upper bound A type expression of kind (Set as) would either be a type variable of kind (Set as), or set of *ground* type terms of kind (as). A type expression of kind (Range as) would be a type variable of kind (Range as), or two type expressions of kind (Set as). To unify ground terms of these types, one would compute as follows: unifySets xs ys | xs == ys = Just xs | otherwise = Nothing unifyRanges (Range glb0 lub0) (Range glb1 lub1) | glb' `subset` lub' = Just (Range glb' lub') | otherwise = Nothing where glb' = glb0 `union` glb1 lub' = lub0 `isect` lub1 I say sets of ground types, because I have no idea what unification between sets of non-ground types would mean, and I really don't need it. Among applications that came to mind, one could use this to create a restricted IO abstraction that could tell you which kinds of actions might be taken (read from mouse, talk to network, etc), and be able to run only those scripts that are restricted to certain resources. Or, one could define a singular GADT that represents messages/events decorated by various categories, and then define functions that only operate on selected categories of messages.E.G. for something vaguely IRC-ish, one could write something like: data EventCategory = Channel | Presence | Information | Conversation data Event :: Range EventCategory - * where ChanCreate :: ... - Event (Range (Set '[Channel]) a) ChanJoin :: ... - Event (Range (Set '[Channel]) a) ChanLeave :: ... - Event (Range (Set '[Channel]) a) PresAvailable :: ... - Event (Range (Set '[Presence]) a) PresAway :: ... - Event (Range (Set '[Presence]) a) WhoisQuery :: ... - Event (Range (Set '[Information]) a) WhoisResponse :: ... - Event (Range (Set '[Information]) a) Message:: ... - Event (Range (Set '[Conversation]) a) And then be able to write functions such as dispatch :: Event (Range a (Set '[Channel, Conversation])) - IO () dispatch e = case e of ChanCreate{..} - ... ChanJoin{..} - ... ChanLeave{..} - ... Message{..} - ... In this case, the case analysis tool would be able to warn me if I'm missing any possible events in this dispatcher, or if I have extraneous events that I can't be passed (according to my type.) Anyway, I've been trying to see if I can't come up with something similar using existing type-level functionality in 7.6, with little success. (Though I'm not very experienced with type-level programming.) If not, might it be possible to add some kind of extensible unification mechanism, in furtherance of type-level programming? Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] How can I avoid buffered reads?
On Thu, Dec 6, 2012 at 5:23 PM, Brandon Allbery allber...@gmail.com wro\ Both should be cdevs, not files, so they do not go through the normal filesystem I/O pathway in the kernel and should support select()/poll(). (ls -l, the first character should be c instead of - indicating character-mode device nodes.) If ghc is not detecting that, then *that* is indeed an I/O manager issue. The issue here is that if you look at the source of fdReadBuf, you see that it's a plain system call without any reference to GHC's (relatively new) IO manager. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] How can I avoid buffered reads?
Well, I took Bardur's suggestion and avoided all the complexities of GHC's IO stack and simply used System.Posix.IO and Foreign.This appears to work, but for better or worse, it is using blocking calls to the read system call and is not integrated with GHC's IO manager. This shouldn't be an issue for my purposes, but I suppose it's worth pointing out. {-# LANGUAGE BangPatterns, ViewPatterns #-} import Control.Applicative import Data.Bits import Data.Word(Word64) import qualified Data.ByteString as S import qualified Data.ByteString.Lazy as L import Data.ByteString.Internal (c2w) import Control.Exception import System.Posix.IO import Foreign import qualified System.IO as IO import qualified Data.Binary.Getas Get showHex :: Word64 - S.ByteString showHex n = s where (!s,_) = S.unfoldrN 16 f n f n = Just (char (n `shiftR` 60), n `shiftL` 4) char (fromIntegral - i) | i 10= (c2w '0' - 0) + i | otherwise = (c2w 'a' - 10) + i twoRandomWord64s :: IO (Word64,Word64) twoRandomWord64s = bracket openRd closeRd readRd where openRd = openFd /dev/urandom ReadOnly Nothing defaultFileFlags { noctty = True } readRd = \fd - allocaBytes 16 $ \ptr - do fdReadAll fd ptr 16 x - peek (castPtr ptr) y - peek (castPtr ptr `plusPtr` 8) return (x,y) closeRd = closeFd fdReadAll fd ptr n = do n' - fdReadBuf fd ptr n if n /= n' then fdReadAll fd (ptr `plusPtr` n') (n - n') else return () main = do (x,y) - twoRandomWord64s S.hPutStrLn IO.stdout (S.append (showHex x) (showHex y)) On Wed, Nov 28, 2012 at 6:05 PM, Leon Smith leon.p.sm...@gmail.com wrote: If you have rdrand, there is no need to build your own PRNG on top of rdrand. RdRand already incorporates one so that it can produce random numbers as fast as they can be requested, and this number is continuously re-seeded with the on-chip entropy source. It would be nice to have a little more information about /dev/urandom and how it varies by OS and hardware, but on Linux and FreeBSD at least it's supposed to be a cryptographically secure RNG that incorporates a PRNG to produce numbers in case you exhaust the entropy pool. On Wed, Nov 28, 2012 at 5:00 PM, Vincent Hanquez t...@snarc.org wrote: On 11/28/2012 09:31 PM, Leon Smith wrote: Quite possibly, entropy does seem to be a pretty lightweight dependency... Though doesn't recent kernels use rdrand to seed /dev/urandom if it's available? So /dev/urandom is the most portable source of random numbers on unix systems, though rdrand does have the advantage of avoiding system calls, so it certainly would be preferable, especially if you need large numbers of random numbers. There's no much information on this i think, but if you need large number of random numbers you should build a PRNG yourself on top of the best random seed you can get, and make sure you reseed your prng casually with more entropy bytes. Also if you don't have enough initial entropy, you should block. /dev/urandom is not the same thing on every unix system. leading to various assumptions broken when varying the unixes. It also varies with the hardware context: for example on an embedded or some virtualized platform, giving you really terrible entropy. -- Vincent ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] How can I avoid buffered reads?
I have some code that reads (infrequently) small amounts of data from /dev/urandom, and because this is pretty infrequent, I simply open the handle and close it every time I need some random bytes. The problem is that I recently discovered that, thanks to buffering within GHC, I was actually reading 8096 bytes when I only need 16 bytes, and thus wasting entropy. Moreover calling hSetBuffering handle NoBuffering did not change this behavior. I'm not sure if this behavior is a bug or a feature, but in any case it's unacceptable for dealing with /dev/urandom. Probably the simplest way to fix this is to write a little C helper function that will read from /dev/urandom for me, so that I have precise control over the system calls involved. But I'm curious if GHC can manage this use case correctly; I've just started digging into the GHC.IO code myself. Best, Leon {-# LANGUAGE BangPatterns, ViewPatterns #-} import Control.Applicativeimport Data.Bitsimport Data.Word(Word64)import qualified Data.ByteString as Simport qualified Data.ByteString.Lazy as Limport Data.ByteString.Internal (c2w)import qualified System.IOas IOimport qualified Data.Binary.Getas Get showHex :: Word64 - S.ByteStringshowHex n = s where (!s,_) = S.unfoldrN 16 f n f n = Just (char (n `shiftR` 60), n `shiftL` 4) char (fromIntegral - i) | i 10= (c2w '0' - 0) + i | otherwise = (c2w 'a' - 10) + i twoRandomWord64s :: IO (Word64,Word64)twoRandomWord64s = IO.withBinaryFile /dev/urandom IO.ReadMode $ \handle - do IO.hSetBuffering handle IO.NoBuffering Get.runGet ((,) $ Get.getWord64host * Get.getWord64host) $ L.hGet handle 16 main = do (x,y) - twoRandomWord64s S.hPutStrLn IO.stdout (S.append (showHex x) (showHex y)) {- Relevant part of strace: open(/dev/urandom, O_RDONLY|O_NOCTTY|O_NONBLOCK) = 3 fstat(3, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 9), ...}) = 0 ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7367e528) = -1 EINVAL (Invalid argument) ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7367e528) = -1 EINVAL (Invalid argument) read(3, N\304\4\367/\26c\\3218\237f\214yKg~i\310\r\262\\224H\340y\n\376V?\265\344..., 8096) = 8096 close(3)= 0 -} ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] How can I avoid buffered reads?
Quite possibly, entropy does seem to be a pretty lightweight dependency... Though doesn't recent kernels use rdrand to seed /dev/urandom if it's available? So /dev/urandom is the most portable source of random numbers on unix systems, though rdrand does have the advantage of avoiding system calls, so it certainly would be preferable, especially if you need large numbers of random numbers. Best, Leon On Wed, Nov 28, 2012 at 2:45 PM, Thomas DuBuisson thomas.dubuis...@gmail.com wrote: As an alternative, If there existed a Haskell package to give you fast cryptographically secure random numbers or use the new Intel RDRAND instruction (when available) would that interest you? Also, what you are doing is identical to the entropy package on hackage, which probably suffers from the same bug/performance issue. Cheers, Thomas On Wed, Nov 28, 2012 at 11:38 AM, Leon Smith leon.p.sm...@gmail.com wrote: I have some code that reads (infrequently) small amounts of data from /dev/urandom, and because this is pretty infrequent, I simply open the handle and close it every time I need some random bytes. The problem is that I recently discovered that, thanks to buffering within GHC, I was actually reading 8096 bytes when I only need 16 bytes, and thus wasting entropy. Moreover calling hSetBuffering handle NoBuffering did not change this behavior. I'm not sure if this behavior is a bug or a feature, but in any case it's unacceptable for dealing with /dev/urandom. Probably the simplest way to fix this is to write a little C helper function that will read from /dev/urandom for me, so that I have precise control over the system calls involved. But I'm curious if GHC can manage this use case correctly; I've just started digging into the GHC.IO code myself. Best, Leon {-# LANGUAGE BangPatterns, ViewPatterns #-} import Control.Applicative import Data.Bits import Data.Word(Word64) import qualified Data.ByteString as S import qualified Data.ByteString.Lazy as L import Data.ByteString.Internal (c2w) import qualified System.IOas IO import qualified Data.Binary.Getas Get showHex :: Word64 - S.ByteString showHex n = s where (!s,_) = S.unfoldrN 16 f n f n = Just (char (n `shiftR` 60), n `shiftL` 4) char (fromIntegral - i) | i 10= (c2w '0' - 0) + i | otherwise = (c2w 'a' - 10) + i twoRandomWord64s :: IO (Word64,Word64) twoRandomWord64s = IO.withBinaryFile /dev/urandom IO.ReadMode $ \handle - do IO.hSetBuffering handle IO.NoBuffering Get.runGet ((,) $ Get.getWord64host * Get.getWord64host) $ L.hGet handle 16 main = do (x,y) - twoRandomWord64s S.hPutStrLn IO.stdout (S.append (showHex x) (showHex y)) {- Relevant part of strace: open(/dev/urandom, O_RDONLY|O_NOCTTY|O_NONBLOCK) = 3 fstat(3, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 9), ...}) = 0 ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7367e528) = -1 EINVAL (Invalid argument) ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7367e528) = -1 EINVAL (Invalid argument) read(3, N\304\4\367/\26c\\3218\237f\214yKg~i\310\r\262\\224H\340y\n\376V?\265\344..., 8096) = 8096 close(3)= 0 -} ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] How can I avoid buffered reads?
If you have rdrand, there is no need to build your own PRNG on top of rdrand. RdRand already incorporates one so that it can produce random numbers as fast as they can be requested, and this number is continuously re-seeded with the on-chip entropy source. It would be nice to have a little more information about /dev/urandom and how it varies by OS and hardware, but on Linux and FreeBSD at least it's supposed to be a cryptographically secure RNG that incorporates a PRNG to produce numbers in case you exhaust the entropy pool. On Wed, Nov 28, 2012 at 5:00 PM, Vincent Hanquez t...@snarc.org wrote: On 11/28/2012 09:31 PM, Leon Smith wrote: Quite possibly, entropy does seem to be a pretty lightweight dependency... Though doesn't recent kernels use rdrand to seed /dev/urandom if it's available? So /dev/urandom is the most portable source of random numbers on unix systems, though rdrand does have the advantage of avoiding system calls, so it certainly would be preferable, especially if you need large numbers of random numbers. There's no much information on this i think, but if you need large number of random numbers you should build a PRNG yourself on top of the best random seed you can get, and make sure you reseed your prng casually with more entropy bytes. Also if you don't have enough initial entropy, you should block. /dev/urandom is not the same thing on every unix system. leading to various assumptions broken when varying the unixes. It also varies with the hardware context: for example on an embedded or some virtualized platform, giving you really terrible entropy. -- Vincent ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Platform Versioning Policy: upper bounds are not our friends
I think we actually agree more than we disagree; I do think distinguishing hard and soft upper bounds (no matter what they are called) would help, and I'm just trying to justify them to some of the more dismissive attitudes towards the idea The only thing I think we (might) disagree on is the relative importance of distinguishing hard and soft bounds versus being able to change bounds easily after the fact (and *without* changing the version number associated with the package.) And on that count, given the choice, I pick being able to change bounds after the fact, hands down. I believe this is more likely to significantly improve the current situation than distinguishing the two types of bound alone. However, being able to specify both (and change both) after the fact may prove to be even better. Best, Leon On Sat, Aug 18, 2012 at 11:52 PM, wren ng thornton w...@freegeek.orgwrote: On 8/17/12 11:28 AM, Leon Smith wrote: And the difference between reactionary and proactive approaches I think is a potential justification for the hard and soft upper bounds; perhaps we should instead call them reactionary and proactive upper bounds instead. I disagree. A hard constraint says this package *will* break if you violate me. A soft constraint says this package *may* break if you violate me. These are vastly different notions of boundary conditions, and they have nothing to do with a proactive vs reactionary stance towards specifying constraints (of either type). The current problems of always giving (hard) upper bounds, and the previous problems of never giving (soft) upper bounds--- both stem from a failure to distinguish hard from soft! The current/proactive approach fails because the given constraints are interpreted by Cabal as hard constraints, when in truth they are almost always soft constraints. The previous/reactionary approach fails because when the future breaks noone bothered to write down when the last time things were known to work. To evade both problems, one must distinguish these vastly different notions of boundary conditions. Hard constraints are necessary for blacklisting known-bad versions; soft constraints are necessary for whitelisting known-good versions. Having a constraint at all shows where the grey areas are, but it fails to indicate whether that grey is most likely to be black or white. -- Live well, ~wren __**_ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/**mailman/listinfo/haskell-cafehttp://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Platform Versioning Policy: upper bounds are not our friends
I see good arguments on both sides of the upper bounds debate, though at the current time I think the best solution is to omit upper bounds (and I have done so for most/all of my packages on hackage).But I cannot agree with this enough: On Thu, Aug 16, 2012 at 4:45 AM, Joachim Breitner m...@joachim-breitner.dewrote: I think what we’d need is a more relaxed policy with modifying a package’s meta data on hackage. What if hackage would allow uploading a new package with the same version number, as long as it is identical up to an extended version range? Then the first person who stumbles over an upper bound that turned out to be too tight can just fix it and upload the fixed package directly, without waiting for the author to react. I think that constraint ranges of a given package should be able to both be extended and restricted after the fact. Those in favor of the reactionary approach (as I am at the moment, or Bryan O'Sullivan) would find the ability of to restrict the version range useful, while those in favor of the proactive approach (like Joachim Breitner or Doug Beardsley) would find the ability to extend the version range useful. I suspect that attitudes towards upper bounds may well change if we can set version ranges after the fact.I know mine very well might.And the difference between reactionary and proactive approaches I think is a potential justification for the hard and soft upper bounds; perhaps we should instead call them reactionary and proactive upper bounds instead. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] What does unpacking an MVar really mean?
On Tue, Jul 31, 2012 at 7:37 AM, Bertram Felgenhauer bertram.felgenha...@googlemail.com wrote: Note that MVar# itself cannot be unpacked -- the StgMVar record will always be a separate heap object. One could imagine a couple of techniques to unpack the MVar# itself, and was curious if GHC might employ one of them. So, really, unpacking the MVar does not eliminate a layer of indirection, it just eliminates the need to check a pointer tag (and possibly execute a thunk or follow some redirects if you don't have a pointer to an MVar#). I think this is what I was ultimately after. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] What does unpacking an MVar really mean?
I admit I don't know exactly how MVars are implemented, but given that they can be aliased and have indefinite extent, I would think that they look something vaguely like a cdatatype ** var, basically a pointer to an MVar (which is itself a pointer, modulo some other things such as a thread queue.) And, I would think that unpacking such an structure would basically be eliminating one layer of indirection, so it would then look vague like a cdatatype * var.But again, given aliasing and indefinite extent, this would seem to be a difficult thing to do. Actually this isn't too difficult if an MVar only exists in a single unpacked structure: other references to the MVar can simply be pointers into the structure. But the case where an MVar is unpacked into two different structures suggests that, at least in some cases, an unpacked MVar is still a cdatatype ** var; So, is my understanding more or less correct? Does anybody have a good, succinct explanation of how MVars are implemented, and how they are unpacked? One final question, assuming that unpacking an MVar really does eliminate a layer of indirection, and that other references to that MVar are simply pointers into a larger structure, what happens to that larger structure when there are no more references to it (but still some references to the MVar?)Given the complications that must arise out of a doubly unpacked MVar, I'm going to guess that the larger structure does get garbage collected in this case, and that the MVar becomes dislodged from this structure. Would that MVar then be placed directly inside another unpacked reference, if one is available? Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] What does unpacking an MVar really mean?
Let me clarify a bit. I am familiar with the source of Control.Concurrent.MVar, and I do see {-# UNPACK #-}'ed MVars around, for example in GHC's IO manager. What I should have asked is, what does an MVar# look like? This cannot be inferred from Haskell source; though I suppose I could have tried to read the Runtime source. Now, one would hope that and (MVar# RealWorld footype) would approximately correspond to a footype * mvar; variable in C. The problem is this cannot _always_ be the case, because you can alias the (MVar# RealWorld footype) by placing a single MVar into two unpacked columns in two different data structures.So you would need to be able to still sometimes represent an MVar# by a footype ** mvar at runtime, even though one would hope that it would be represented by a footype * mvar in one particular data structure. On Tue, Jul 31, 2012 at 1:04 AM, Ryan Ingram ryani.s...@gmail.com wrote: Because of this, boxed MVars can be garbage collected without necessarily garbage-collecting the MVar# it holds, if a live reference to that MVar# still exists elsewhere. I was asking the dual question: if the MVar# exists in some data structure, can that data structure still be garbage collected when there is a reference to the MVar#, but not the data structure it is contained within. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] twitter election on favorite programming language
Out of curiousity, was this a plurality election (vote for one), or an approval election (vote for many)? On Tue, May 1, 2012 at 12:11 AM, Kazu Yamamoto k...@iij.ad.jp wrote: Hello, A twitter election on favorite programming language was held in Japan and it appeared that Heskell is No. 10 loved language in Japan. :-) http://twisen.com/election/index/654 Regards, --Kazu ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Turn GC off
I doubt it. Even if you could turn GC completely off, the vast majority of GHC Haskell programs will run out of memory very quickly. Lazy evaluation has been called evaluation by allocation; unless your program has very simple requirements and can live in the completely-strict fragment of Haskell without consing, almost everything allocates something. Also, your programs probably won't even run faster without GC, as GHC's GC is an important part of getting halfway reasonable L2 cache performance. Best, Leon On Wed, Sep 14, 2011 at 12:42 PM, Andreas Voellmy andreas.voel...@gmail.com wrote: Hi everyone, Is there a way to completely turn garbage collection off in the Haskell runtime system? I'm aware of the -A runtime option, but I'd like to completely turn it off, if possible. I'm OK with running the program until it runs out of memory, and I'm willing to recompile GHC if needed. Regards, Andreas ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] stack overflow pain
On Wed, Sep 21, 2011 at 3:39 AM, Heinrich Apfelmus apfel...@quantentunnel.de wrote: Of course, a list of 1 million items is going to take a lot of memory, unless you generate it lazily. Unfortunately mapM cannot generate its result lazily because it has to execute all IO actions before returning the list of results. That's oversimplifying a bit. The outer list cannot be generated lazily, but the inner values (in this case inner lists) can be generated lazily. On Wed, Sep 21, 2011 at 7:00 PM, Tim Docker t...@dockerz.net wrote: I believe the error is happening in the concat because there are subsequent IO actions that fail to execute. ie the code is equivalent to: vs - fmap concat $ mapM applyAction sas someOtherAction consume vs and someOtherAction seems not to be run. However, to be sure, I'll confirm with code akin to what you suggest above. The error shouldn't be happening in either concat or mapM. Are you sure that someOtherAction isn't being run? Might it be writing to a file and the result isn't getting flushed? GHC has no inherent limit on the stack size, though using extremely large amounts of stack is usually indicative of an error. You can up the stack limit with the -Ksize RTS option, and I think there is a way it can be disabled entirely.You might try upping your stack size and profiling your program to see if that's helpful: the -xt profiling option might be useful, but I haven't played with it much. I suspect the issue is that one of your applyAction is creating a thunk that blows the stack when it's evaluated, and return [] ensures that the thunk is never evaluated.Though it's not clear to me why it'd be getting evaluated in this new scenario with the information you've provided, assuming you really truly aren't running someOtherAction. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Higher-kinded Quantification
Thanks! The issue with eta-reduction had been confusing me... Best, Leon On Tue, Apr 12, 2011 at 3:35 PM, Dan Doel dan.d...@gmail.com wrote: On Tuesday 12 April 2011 11:27:31 AM Leon Smith wrote: I think impredicative polymorphism is actually needed here; if I write ... Then I get a type error ... I'm not going to worry about the type error, because that wasn't what I had in mind for the types. The type for loop I had in mind was: [i] - Iterator i o m a - Iterator i o m a Then, feedPure cracks open the first (forall m. ...), instantiates it to the m for the result, and runs loop on it. If we had explicit type application, it'd look like: feedPure l it = /\m - loop l (it@m) but as it is it's just: feedPure l it = loop l it Which cannot be eta reduced. But what I find rather befuddling is the kind error: feedPure' :: [i] - Iterator i o (forall (m :: * - *) . m) a - Iterator i o (forall (m :: * - *) . m) a feedPure' = undefined Iterator.hs:193:58: `m' is not applied to enough type arguments Expected kind `*', but `m' has kind `* - *' In the type signature for `feedPure'': feedPure' :: [i] - Iterator i o (forall m :: (* - *). m) a - Iterator i o (forall m :: (* - *). m) a Is impredicative polymorphism restricted to the kind *? The problem is that (forall (m :: * - *). m) is not a valid type, and forall is not a valid way to construct things with kind * - *. You have: m :: * - * |- T[m] :: * == |- (forall (m :: * - *). T[m]) :: * but that is the only way forall works. -- Dan ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Higher-kinded Quantification
I think impredicative polymorphism is actually needed here; if I write {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE RankNTypes #-} feedPure :: forall i o a. [i] - (forall m. Iterator i o m a) - (forall m. Iterator i o m a) feedPure = loop where loop :: [i] - (forall m. Iterator i o m a) - (forall m. Iterator i o m a) loop [] iter = iter loop (i:is) (NeedInput k)= loop is (k i) Then I get a type error: Iterator.hs:185:36: Couldn't match type `m0' with `m1' because type variable `m1' would escape its scope This (rigid, skolem) type variable is bound by a type expected by the context: Iterator i o m1 a The following variables have types that mention m0 k :: i - Iterator i o m0 a (bound at Iterator.hs:185:28) Which I think I vaguely understand, as the type of NeedInput is (i - Iterator i o m a) - Iterator i o m a, meaning the type of m is equal. So it seems the polymorphism must be carried on m. But what I find rather befuddling is the kind error: feedPure' :: [i] - Iterator i o (forall (m :: * - *) . m) a - Iterator i o (forall (m :: * - *) . m) a feedPure' = undefined Iterator.hs:193:58: `m' is not applied to enough type arguments Expected kind `*', but `m' has kind `* - *' In the type signature for `feedPure'': feedPure' :: [i] - Iterator i o (forall m :: (* - *). m) a - Iterator i o (forall m :: (* - *). m) a Is impredicative polymorphism restricted to the kind *? Best, Leon Then I get a tp On Tue, Apr 12, 2011 at 5:37 AM, Dan Doel dan.d...@gmail.com wrote: On Monday 11 April 2011 8:31:54 PM Leon Smith wrote: I have a type constructor (Iterator i o m a) of kind (* - * - (* - *) - *), which is a monad transformer, and I'd like to use the type system to express the fact that some computations must be pure, by writing the impredicative type (Iterator i o (forall m. m) a). However I've run into a bit of difficulty expressing this, due to the kind of m. I've attached a minimal-ish example. Is there a way to express this in GHC? I think the simplest way is 'Iterator i o Id a'. Then there's a function: embed :: Iterator i o Id a - Iterator i o m a with the obvious implementation. This means your NeedAction case is no longer undefined, too. You can either peel off NeedActions (since they're just delays) or leave them in place. However, another option is probably: [i] - (forall m. Iterator i o m a) - (forall m. Iterator i o m a) which will still have the 'this is impossible' case. You know that the incoming iterator can't take advantage of what m is, though, so it will be impossible. -- Dan ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Higher-kinded Quantification
I have a type constructor (Iterator i o m a) of kind (* - * - (* - *) - *), which is a monad transformer, and I'd like to use the type system to express the fact that some computations must be pure, by writing the impredicative type (Iterator i o (forall m. m) a). However I've run into a bit of difficulty expressing this, due to the kind of m. I've attached a minimal-ish example. Is there a way to express this in GHC? {-# LANGUAGE ImpredicativeTypes #-} {-# LANGUAGE KindSignatures #-} data Iterator i o m a = NeedInput (i - Iterator i o m a) | HasOutput o (Iterator i o m a) | NeedAction (m (Iterator i o m a)) | IsDone a feedPure :: [i] - Iterator i o (forall m . m) a - Iterator i o (forall m . m) a feedPure = loop where loop [] iter = iter loop (i:is) (NeedInput k)= loop is (k i) loop is (HasOutput o k0) = HasOutput o (loop is k0) loop _ (NeedAction _) = undefined -- shouldn't happen, due to type loop _ (IsDone a) = IsDone a {- impredicative.hs:10:34: Kind mis-match The third argument of `Iterator' should have kind `* - *', but `m' has kind `*' In the type signature for `feedPure': feedPure :: [i] - Iterator i o (forall m. m) a - Iterator i o (forall m. m) -} feedPure' :: [i] - Iterator i o (forall m . (m :: * - *)) a - Iterator i o (forall m . (m :: * - *)) a feedPure' = loop where loop [] iter = iter loop (i:is) (NeedInput k)= loop is (k i) loop is (HasOutput o k0) = HasOutput o (loop is k0) loop _ (NeedAction _) = undefined loop _ (IsDone a) = IsDone a {- impredicative.hs:28:46: `(m :: * - *)' is not applied to enough type arguments Expected kind `*', but `(m :: * - *)' has kind `* - *' In the type signature for `feedPure'': feedPure' :: [i] - Iterator i o (forall m. (m :: * - *)) a - Iterator i o (forall m. (m :: * - *)) a -} ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Possibility to implant Haskell GC into PostgreSQL interesting?
I'm not particularly familiar with the codebase of either PostgreSQL or GHC, but I'd be rather surprised that porting GHC's garbage collector to PostgreSQL would be an easy or worthwhile task. For example, GHC's garbage collector understands the memory layout that its data structures use, which I'm sure is rather different from the memory layout of PostgreSQL's data structures.Also, often these kinds of projects turn out to be busts; for example, the GHC team worked on integrating Hugs into GHC for a while, but after some effort decided that it would be a lot easier to write an interpreter from scratch instead, which became ghci. What would be far more likely to turn out to be realistic, would be to understand GHC's garbage collector and use (some of) it's ideas when writing a replacement garbage collector for PostgreSQL. However, there is a lot of variation in garbage collectors because different techniques are suited to different patterns of memory allocation. And based on the observation that PostgreSQL makes reasonably effective usage of virtual memory, whereas GHC's garbage collector thrashes Virtual Memory, I would bet there is a fair number of important differences in the systems. Best, Leon On Tue, Feb 22, 2011 at 7:25 PM, Nick Rudnick joerg.rudn...@t-online.de wrote: Dear all, recently, at an email conversation with pgsql hackers I had a quick shot, asking about their position to somebody replacing their palloc GC -- having roughly in mind that either here or on a Mercury mailing list (where there's a similar case with a pure declarative language and a Boehm GC), where there was a conclusion a non-pure GC would be a major hindrance to deeper interaction. Ok, I found the answer worth a discussion here; as far as I understood, they don't oppose the idea that the PostgreSQL GC might be a candidate for an update. I see three issues: (a) The most open question to me is the gain from the Haskell perspective; most critical: Would a Haskell GC inside PostgreSQL mean a significant change or rather a drop in the bucket? Once this may be answered optimistically, there comes the question about possible applications -- i.e., what can be done with such a DBMS system. Knowing about efforts like (http://groups.inf.ed.ac.uk/links/) I would like to let this open for discussion. Let me please drop here a quote that I believe their object relational efforts seem to have gotten stuck at PostgreSQL due to the conceptual clash of OO with the relational algebra underlying PostgreSQL -- which in turn seems to harmonize much better with Hindley-Milner Co. (System F??) (b) The question I personally can say least about are the expenditures to be expected for a such project. I would be very interested in some statements. I have limited knowledge about the PostgreSQL GC and would assume it is much simpler than, e.g. the GHC GC. (c) Gain from PostgreSQL perspective: This IMO should be answered easiest, hoping the Haskell GC experts to be able to answer easily how much is the overhead to be payed for pure declarativity, and the chances (e.g. parallelism, multi cores??), too. Besides it might be interesting to see inhowfar a considerable overhead problem may be alleviated by a 'plugin' architecture allowing future PostgreSQL users to switch between a set of GCs. I would be very interested about any comments, Cheers, Nick ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] HDBC's future and request for help
This seems a timely email, as I've been submitting a steady-ish trickle of patches to HDBC-postgresql lately. Honestly, I'm rather dissatisfied with HDBC in many respects, but I don't have a very good idea of what a (low-level) database access library for Haskell *should* be, and I've found HDBC to be the least worst option available at the moment. HSQL doesn't support parameterized queries, making SQL injection issues tedious and error prone.And while Takusen looks rather interesting, I find the documentation and examples lacking, I find the API a bit too esoteric for my current tastes, and I'm fairly certain that Takusen cannot possibly work for my application due to the way Takusen reclaims connections. My biggest (mostly) fixable complaint with HDBC is that it hasn't turned out to be a very complete or robust solution for accessing databases that like to use PostgreSQL-specific features. My biggest complaint that (probably) isn't easily fixed is it's reliance on Convertible, and the use of lots of unsafe pattern matching and exception-happy functions. At least for the time being, I've found it easiest and most expedient to fix up HDBC. I'm not particularly interested in taking over the maintenance of HDBC, and I am comfortable with model #1 at the time being. However if somebody else is interested in another option, I'm probably ok with that too. Best, Leon On Tue, Feb 22, 2011 at 11:50 AM, John Goerzen jgoer...@complete.org wrote: Hi folks, HDBC has been out there for quite some time now. I wrote it initially to meet some specific needs, and from that perspective, it has been done for awhile. It is clear, however, that there are some needs it doesn't meet. Most of them relate to specific backend driver items. I'd like to start some discussion in the community about what the future of HDBC and its backend drivers ought to look like. Some models might be: 1. I continue as maintainer for HDBC and HDBC-{postgresql,odbc,sqlite3} and act as patch manager/gatekeeper for patches that are discussed on some public mailing list. 2. Interested parties adopt the backend drivers while I continue to act as maintainer/patch manager/gatekeeper for HDBC itself. 3. Interested parties adopt all of HDBC and maintain it I am not expressing a particular preference for any of these options; just putting them forth. Here are some of the current issues I am aware of: 1. I have no Windows development platform. I can't test the releases on Windows. Many Windows users don't have the skill to diagnose problems. These problems do eventually get fixed when a Windows user with that skill comes along -- and I appreciate their efforts very much! -- but it takes longer than it ought to. 2. The ODBC documentation is monumentally terrible, and the API is perhaps only majestically terrible, and it is not 100% clear to me all the time that I have it right. A seasoned ODBC person would be ideal here. 3. Issues exist with transferring binary data and BLOBs to/from at least PostgreSQL databases and perhaps others. There appear to be bugs in the backend for this, but BLOB support in the API may need to be developed as well. 4. Although the API supports optimizations for inserting many rows at once and precompiled queries, most backends to not yet take advantage of these optimization. 5. I have received dueling patches for whether foreign imports should be marked safe or unsafe on various backends. There seems to be disagreement in the community about this one. 6. Many interactions with database backends take place using a String when a more native type could be used for efficiency. This project may be rather complex given varying types of column data in a database -- what it expects for bound parameters and what it returns. The API support is there for it though. 7. Various other more minor items. Thoughts? -- John ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] HANSEI in Haskell?
On Wed, Feb 23, 2011 at 10:52 AM, Chung-chieh Shan ccs...@cs.rutgers.edu wrote: Mostly we preferred (as do the domain experts we target) to write probabilistic models in direct style rather than monadic style. Haskell's laziness doesn't help -- in fact, to avoid running out of memory, we'd have to defeat that memoization by sprinkling () - throughout the types. I don't think that () - is even guaranteed to work... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] The implementation of Control.Exception.bracket
There is a common idiom used in Control.Concurrent libraries, as embodied in the implementation of bracket: http://www.haskell.org/ghc/docs/7.0-latest/html/libraries/base-4.3.0.0/src/Control-Exception-Base.html#bracket bracket before after thing = mask $ \restore - do a - before r - restore (thing a) `onException` after a _ - after a return r Is there any particular reason why bracket is not implemented as: bracket before after thing = mask $ \restore - do a - before r - restore (thing a) `finally` after a return r Is there some subtle semantic difference? Is there a performance difference? It seems like a trivial thing, but I am genuinely curious. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Installing a top-level handler on an existing thread
I've been toying with a little thread manager library, and for sanity's sake I really need a way to install another top-level exception handler on an existing thread.I don't want to replace any other handlers, just put my own handler around the thread's continuation. Of course, it would be pretty easy to do if the IO monad had a nice efficient implementation of call/cc and call/cc1, although I don't need the full power of continuations. I really don't want to create my own Thread monad (which could use ContT), I want to be able to do it on plain old IO threads. Is there any way to do this in GHC? Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] HDBC, postgresql, bytestrings and embedded NULLs
On Sat, Jan 8, 2011 at 11:55 AM, Michael Snoyman mich...@snoyman.com wrote: In general I think it would be a good thing to have solid, low-level bindings to PostgreSQL. Well, there is PostgreSQL and libpq on hackage: http://hackage.haskell.org/package/libpq http://hackage.haskell.org/package/PostgreSQL The PostgreSQL looks like it's in need of maintenance, and hasn't been updated in a few years. libpq is new, and looks promising. I haven't really used either one, so I can't really say too much about either. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] OpenSSL question
I don't believe that HsOpenSSL offers support for creating your own SSL keys programmatically from Haskell. Do you actually need to generate keys programmatically? If not, you could manually use OpenSSL's command line tools; if your needs are simple enough it shouldn't be too hard to spawn a subprocess to generate the keys. Best, Leon On Sat, Jan 1, 2011 at 5:49 PM, Charles-Pierre Astolfi c...@crans.org wrote: Hi -cafe, I'm using the OpenSSL package, and I don't see any way to forge my own keys. (http://hackage.haskell.org/packages/archive/HsOpenSSL/0.9/doc/html/OpenSSL-RSA.html) For example, in the case of RSA, I'm given the exponent and modulus and I would like to create a public key from that, of type RSAPubKey. Is it possible ? How should I do? I'm not particularly tied to OpenSSL, anything that let me use RSA and DSA would be fine. Regards, -- Cp ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] V.I.P.s and the associativity of merge'
Ok, after mulling over the issues that Will Ness has brought up in the last few days [1], I think I have a partial explanation for the apparent tension between Will's observations and Heinrich Apfelmus's Implicit Heaps article [2], which both concern the implementation of mergeAll [3]. The merge' function takes two ordered lists, with the head of the first list less than the head of the second, and merges their contents: merge' [] ys = ys merge' (x:xs) ys = x : merge xs ys The nice thing about this function is we can merge an infinite number of lists by folding right, if assume that the heads of those lists are appropriately ordered. This appears in Richard Bird's code at the end of Melissa O'Neill's Genuine Sieve of Eratosthenes, though undoubtedly this observation has been independently made by many people. Now, in an ordinary sense, merge' *is* an associative operator, in that given three fully defined ordered lists with ordered heads, merge' xs (merge' ys zs) == merge' (merge' xs ys) zs. This allows us to merge an infinite number of lists using an arbitrary tree of merge' operations, without ever worrying that we will return an incorrect result. (However, we might get stuck in a non-productive loop, for example if we fold left over an infinite list of ordered lists) However, Heinrich's article uses a stronger sense of associativity which includes laziness properties and thus captures some operational characteristics. In this sense, merge' *is not* associative.To remedy this, Heinrich uses VIPs to strengthen the associativity property of merge'. This raises the question, is there some combination of the shape of the merge' tree and some input for which using VIPs dramatically changes the efficiency of a mergeAll operation? I suspect the answer is yes, though I don't know for sure at this point in time. However, I do tacitly believe that the current tree that mergeAll uses doesn't exhibit this property for any input, and so I have simplified the implementations of mergeAll and unionAll in the latest version of data-ordlist-0.4.4 by avoiding the use of VIPs. This has the nice side benefit of modestly improving performance when the elements of the result are highly biased towards the first list. Best, Leon [1] http://permalink.gmane.org/gmane.comp.lang.haskell.cafe/84666 [2] http://apfelmus.nfshost.com/articles/implicit-heaps.html [3] http://hackage.haskell.org/packages/archive/data-ordlist/0.4.4/doc/html/Data-List-Ordered.html#v:mergeAll ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] HDBC-postgresql and safe/unsafe FFI calls
On Thu, Sep 2, 2010 at 1:00 AM, David Powell da...@drp.id.au wrote: Thanks Jason, I think I had read that - I quite enjoy Edward's posts. Re-reading, seems to confirm what I thought, most (all?) of the FFI calls in HDBC-postgresql should be changed to safe. Wouldn't that require thread safety on the part of libpq? Versions 8.4 and earlier requires the library to be configured with the option --enable-thread-safety, although version 9.0 has thread safety turned on by default. http://www.postgresql.org/docs/8.4/static/libpq-threading.html http://www.postgresql.org/docs/9.0/static/libpq-threading.html At least the Ubuntu PostgreSQL 8.4 packages are compiled with thread safety, but I can't speak for other distributions. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Projects that could use student contributions?
There is a lot of room for improvement to my NumberSieves package. The package consists of algorithms I extracted and polished up from when I was working on Project Euler problems. It makes solving a number of problems into quick five minute affairs. At some point I would probably do it myself, but I don't have a pressing reason to do it. http://hackage.haskell.org/package/NumberSieves Possible Improvements I have in mind: 1. Make things faster 2. Reduce memory requirements 3. Parallelize the sieves 4. Incrementalize the Factor and Phi sieves, so that an explicit upper bound is no longer required 5. Remove limitations so that Factor and Phi can sieve beyond 2^32 It would be a small and self-contained project, suitable for anybody who has an interest in Haskell and mathematics! If somebody is interested, by all means contact me via email, and I will share the approach I had in mind. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] PDF generation?
The fonts aren't rasterized, but PDFs that were converted from PS tend to look awful in almost any PDF viewer other than Adobe's Acrobat Reader. Fonts look especially bad. I don't know exactly what the problem is, but my experience is that you are best off generating PDF directly, and using Acrobat Reader on any document that looks bad otherwise. Not to mention that PDF has a document model and supports things such as hyperlinks, which Postscript does not, which is another reason to avoid PostScript. Best, Leon On Tue, Jun 1, 2010 at 2:45 PM, Yitzchak Gale g...@sefer.org wrote: I wrote: I have often generated PostScript from Haskell... Then you convert the PS to PDF using any of the nice utilities around for that Pierre-Etienne Meunier wrote: Isn't there a problem with non-type 1 vectorial fonts being rasterized during this conversion ? No. PDF is just a simplified, compressed encoding of PostScript. Unless there is some special reason to do so, why would a conversion utility go to the trouble of rasterizing fonts instead of just copying them in? Perhaps something like ImageMagick might do that; its internal format is a raster. Regards, Yitz ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Compressing GHC tarballs with LZMA
Out of curiousity, I downloaded a binary distribution for GHC-6.12.2, and tried compressing and recompressing it with bzip2 and lzma compression, using no command line arguments (all default parameters) file: http://haskell.org/ghc/dist/6.12.2/ghc-6.12.2-x86_64-unknown-linux-n.tar.bz2 bzip2 lzma size106M 72M compression time2:01 9:51 decompression time 0:25 0:12 the time is reported as minutes:seconds. At it default compression level, lzma was almost five times slower than bzip2, but it's approximately twice as fast when decompressing and produced a file nearly a third smaller. Given that many modern variants of the tar command support .tar.lzma files directly, and that both debian's dpkg and red hat's rpm now use lzma, maybe it's time to consider offering .tar.lzma as a download option for GHC's binary (and maybe source-code) bundles. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Break Function Using Lazy Pairs
That example doesn't particularly tie the knot, unless you count the fact that break is itself a recursive function. Usually tie the knot refers to some kind of circular programming, i.e. a self-referential data structure, or explicit use of Data.Function.fix to produce a recursive function (as is useful in e.g. dynamic programming) Also, you aren't understanding the laziness of the return product; instead you are still thinking of this example in terms of eager evaluation as almost every other programming language uses. A better approximation of what is going on could be represented textually as: -- break hi\nbye -- ( 'h' : ys, zs ) where ( ys, zs ) = break i\nbye -- ( 'h' : 'i' : ys, zs ) where ( ys, zs ) = break \nbye -- ( 'h' : 'i' : [] , bye) Of course, if you want to get more pendantic, I've glossed over steps involving the conditional and something resembling beta-reduction. Incidentally, it's the latter part I omitted which, naively implemented, creates the space leak referred to in the original post. Best, Leon On Mon, Apr 5, 2010 at 5:19 PM, aditya siram aditya.si...@gmail.com wrote: Hi all, For the past couple of weeks I've been trying to understand tying-the-knot style programming in Haskell. A couple of days ago, Heinrich Apfelmus posted the following 'break' function as part of an unrelated thread: break [] = ([],[]) break (x:xs) = if x == '\n' then ([],xs) else (x:ys,zs) where (ys,zs) = Main.break xs I've stepped through the code manually to see if I understand and I'm hoping someone will tell me if I'm on the right track: -- break hi\nbye = -- let f1 = (break i\nbye) -- = let f2 = (break \nbye) -- = ([],bye) -- ('i' : fst f2, snd f2) = ('i' : [], bye) -- ('h' : fst f1, snd f1) = ('h' : 'i' : [], bye) -- = (hi,bye) -deech ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Hughes' parallel annotations for fixing a space leak
On Wed, Mar 31, 2010 at 3:51 PM, Heinrich Apfelmus apfel...@quantentunnel.de wrote: which were introduced by John Hughes in his Phd thesis from 1983. They are intriguing! Unfortunately, I haven't been able to procure a copy of Hughes' thesis, either electronic or in paper. :( Can anyone help? Are there any other resources about this parallel approach? Aye, returning lazy pairs is one of those things that seems rather magical in several respects. Out of curiousity, have you looked at the unsafePerformIO thought experiment near the end of my Monad Reader article? It demonstrates that returning lazy pairs can introduce multiple code paths through a single function, reminiscent of (but different than) multi-mode logical relations. (Mercury, for example, optimizes relations differently depending on their mode.) I too am interested in looking at Hughes' thesis, I tried tracking it down early last year with little success. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Are there any female Haskellers?
On Sat, Mar 27, 2010 at 1:56 PM, Jason Dagit da...@codersbase.com wrote: For some reason it started out as a male dominated field. Let's assume for cultural reasons. Once it became a male dominated field, us males unknowingly made the work and learning environments somewhat hostile or unattractive to women. I bet I would feel out of place if I were the only male in a class of 100 women. Is this really true? I've heard rumors that in the early days of programming, that women were in the majority, or at least they represented a much greater proportion of programmers than they do now. I seem to recall that this started to change sometime in the 60s. Of course, I can't recall when or where I heard these stories, and I'm not sure that my source was reliable, so I might be completely off on this count. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Takusen sqlite3 insert is very slow
Using PostgreSQL on my computer, your code executes in 3.5 seconds with GHCi, while compiled it executes in 16.2 seconds! Clearly something is wrong, although I don't yet know enough about Takusen enough to be able to say what. I tried hoisting the preparation of the statement out of the loop, however, Takusen throws an error after the first insertion. However, the insertion was apparently successful, as a single row shows up in the table afterwards. Here's sample code and a sample shell session: Best, Leon import Database.PostgreSQL.Enumerator import Database.Enumerator import Control.Monad(forM_) main = do withSession (connect [ CAhost localhost , CAport 5432 , CAuser takusen , CApassword takusen , CAdbname test ]) $ do execDDL (sql create table x (y text)) let query = sql insert into x (y) values (?) stmt = prepareQuery testinsert query [bindType (0::Int)] forM_ ([1..1] :: [Int]) $ \x - do withPreparedStatement stmt $ \pstmt - do withBoundStatement pstmt [bindP x] $ \bstmt - do execDML bstmt return () $ psql test psql (8.4.2) Type help for help. test=# select * from x; ERROR: relation x does not exist LINE 1: select * from x; ^ test=# \q $ ghci Tak.lhs GHCi, version 6.12.1: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer-gmp ... linking ... done. Loading package base ... linking ... done. Loading package ffi-1.0 ... linking ... done. [1 of 1] Compiling Main ( Tak.lhs, interpreted ) Ok, modules loaded: Main. (0.09 secs, 40984568 bytes) *Main main Loading package mtl-1.1.0.2 ... linking ... done. Loading package old-locale-1.0.0.2 ... linking ... done. Loading package old-time-1.0.0.3 ... linking ... done. Loading package time-1.1.4 ... linking ... done. Loading package Takusen-0.8.5 ... linking ... done. *** Exception: DBError (,) 1 *Main :q Leaving GHCi. $ psql test psql (8.4.2) Type help for help. test=# select * from x; y --- 1 (1 row) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] developing against privately patched libraries, and cabal
As somebody who's hacked on cabal-install a bit (but don't have a worthwhile patch to contribute (yet?)), I can tell you that versions support a tag structure, at least internally, but I haven't seen a non-empty tags field and don't know how to make the tags field non-empty. For that I'd have to go source-code diving again. http://www.haskell.org/ghc/docs/latest/html/libraries/Cabal-1.8.0.2/Distribution-Version.html Best, Leon On Sun, Mar 21, 2010 at 10:58 AM, Dougal Stanton dou...@dougalstanton.net wrote: If you're making local changes against a library you don't own (with the ultimate intention of sending those changes back upstream to the library maintainer) it makes sense change the version number to avoid clashes with the canonical version of the library. Of course, it's easy to lose track and end up publishing your own program against a non-existent (outside your hard disk) version of the library. I'd like to make it very obvious, both in mypogram.cabal and library.cabal that one is a patched copy and the other has to be compiled against a patched copy. Does cabal provide any way of marking a version private? I thought initially to just mark the version field in the patched library as X.y-dougal and enforce my program to compile against that, but it doesn't seem to recognise the -dougal suffix there. Thoughts? D -- Dougal Stanton dou...@dougalstanton.net // http://www.dougalstanton.net ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A small oversight
On Sat, Feb 20, 2010 at 5:47 AM, Andrew Coppin andrewcop...@btinternet.com wrote: sortOn :: (Ord y) = (x - y) - [x] - [x] sortOn foo = sortBy (compare `on` foo) Incidentally, this function is provided as Data.List.Ordered.sortOn' in the data-ordlist package... On Sat, Feb 20, 2010 at 7:39 AM, Ben Millwood hask...@benmachine.co.uk wrote: But it would still be useful to have sortOn et al to capture the common technique when your sorting property is potentially expensive (sortOn length, for example): sortOn f = map fst . sortBy (comparing snd) . map (\x - (x, f x)) a technique which I believe is called a Schwar[t]zian transform. An older name for this technique is decorate-sort-undecorate. Data-ordlist also provides this as Data.List.Ordered.sortOn http://hackage.haskell.org/package/data-ordlist Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Implementing unionAll
On Thu, Feb 18, 2010 at 2:32 AM, Evan Laforge qdun...@gmail.com wrote: By purest coincidence I just wrote the exact same function (the simple mergeAll', not the VIP one). Well, extensionally the same... intensionally mine is 32 complicated lines and equivalent to the 3 line mergeAll'. I even thought of short solution by thinking that pulling the first element destroys the ascending lists property so it's equivalent to a normal sorted merge after that, and have no idea why I didn't just write it that way. Well, the three line version wasn't my first implementation, by any stretch of the imagination. I know I had tried to implement mergeAll at least once, if not two or three times before coming up with the foldr-based implementation. However, I can't find any of them; they may well be lost to the sands of time. Incidentally, that implementation also appears in Melissa O'Neill's Genuine Sieve of Eratosthenes, in an alternate prime sieve by Richard Bird that appears at the end. Anyway, I'm dropping mine and downloading data-ordlist. Thanks for the library *and* the learning experience :) Thanks! BTW, I notice that your merges, like mine, are left-biased. This is a useful property (my callers require it), and doesn't seem to cost anything to implement, so maybe you could commit to it in the documentation? Yes, the description of the module explicitly states that all functions are left-biased; if you have suggestions about how to improve the documentation in content or organization, I am interested in hearing them. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Implementing unionAll
On Thu, Feb 18, 2010 at 3:07 AM, Evan Laforge qdun...@gmail.com wrote: BTW, I notice that your merges, like mine, are left-biased. This is a useful property (my callers require it), and doesn't seem to cost anything to implement, so maybe you could commit to it in the documentation? Also, I did briefly consider giving up left bias. GHC has an optimization strategy that seeks to reduce pattern matching, and due to interactions with this I could have saved a few kilobytes of -O2 object code size by giving up left-bias. For example: module MergeLeft where mergeBy :: (a - a - Ordering) - [a] - [a] - [a] mergeBy cmp = loop where loop [] ys = ys loop xs [] = xs loop (x:xs) (y:ys) = case cmp x y of GT - y : loop (x:xs) ys _ - x : loop xs (y:ys) compiles ghc-6.12.1 -O2 to a 4208 byte object file for x64 ELF. By changing the very last line to: _ - x : loop (y:ys) xs I get a 3336 byte object file instead, but of course this is no longer left- (or right-) biased.Repeat this strategy across the entire module, and you can save 3 kilobytes or so. However, in today's modern computing environment, left-bias is clearly a greater benefit to more people. If you are curious why, I suggest taking a look at GHC's core output for each of these two variants. The hackage package ghc-core makes this a little bit more pleasant, as it can pretty-print it for you. It's amazing to think that this library, at 55k (Optimized -O2 for x64), would take up most of the memory of my very first computer, a Commodore 64. Of course, I'm sure there are many others on this list who's first computers had a small fraction of 64k of memory to play with. :-) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Implementing unionAll
On Wed, Feb 17, 2010 at 6:58 AM, Heinrich Apfelmus apfel...@quantentunnel.de wrote: Ah, I meant to use the union' from your previous message, but I think that doesn't work because it doesn't have the crucial property that the case union (VIP x xs) ys = ... does not pattern match on the second argument. Ahh yes, the funny thing is that I tested the code in my previous message, and it worked in the infinite case. Then I replaced the union' to pattern match on the second argument as well, and tested it on only finite cases, and then released it.Thus, unionAll in data-ordlist-0.4.1 doesn't work on an infinite number of lists. So my original unionAll in data-ordlist-0.4 appears to work ok, my revised and simplified unionAll doesn't work at all. The easiest solution is simply to define unionAll = nub . mergeAll where -- specialized definition of nub nub = map head . groupBy (==) Incidentally, data-ordlist has a (slightly different) version of nub that does exactly what you want in this particular case.Check out the documentation for nub and nubBy But you're probably concerned that filtering for duplicates afterwards will be less efficient. After all, the (implicit) tree built by mergeAll might needlessly compare a lot of equal elements. Well, yes and no. Efficiency is good, but this implementation does not match my intention.For example: unionAll [[1,1,2,2,2],[1,1,1,2]] == foldr union [] [...] == [1,1,1,2,2,2] The union function preserves strictly ascending lists, but it also works on multisets as well, returning an element as many times as the maximum number of times in either list.Thus, on an infinite number of lists, unionAll should return a particular element as many times as the maximum number of times it appears in any single list. On Wed, Feb 17, 2010 at 1:18 PM, Daniel Fischer daniel.is.fisc...@web.de wrote: Am Mittwoch 17 Februar 2010 18:59:42 schrieb Ozgur Akgun: Ooops I thought the inner lists are possibly of infinite size. Both, I think. Yes, both the inner and outer lists of an input to unionAll might be infinite.It's just that foldr union [] works fine if the inner lists are infinite, but gets stuck in an infinite non-productive list if the outer list is infinite. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Implementing unionAll
On Wed, Feb 17, 2010 at 6:58 AM, Heinrich Apfelmus apfel...@quantentunnel.de wrote: Ah, I meant to use the union' from your previous message, but I think that doesn't work because it doesn't have the crucial property that the case union (VIP x xs) ys = ... does not pattern match on the second argument. Ahh yes, my original union' has a bit that looks like this union' (VIP x xs) (VIP y ys) = case cmp x y of LT - VIP x (union' xs (VIP y ys)) EQ - VIP x (union' xs ys) GT - error Data.List.Ordered.unionAll: assumption violated! union' (VIP x xs) (Crowd ys) = VIP x (union' xs (Crowd ys)) For whatever reason, this works in the case of an infinite number of lists with my original version, but not the simplified version. By applying a standard transformation to make this lazier, we can rewrite these clauses as union' (VIP x xs) ys = VIP x $ case ys of Crowd _ - union' xs ys VIP y yt - case cmp x y of LT - union' xs ys EQ - union' xs yt GT - error msg In the original case, we have this strictness property union' (VIP x xs) ⊥ == ⊥ The revised verison is a bit lazier: union' (VIP x xs) ⊥ == VIP x ⊥ And so the simplified unionAll now works again on an infinite number of lists. I've uploaded data-ordlist-0.4.2 to fix the bug introduced with data-ordlist-0.4.1, and added a regression test to the suite. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Implementing unionAll
I see no obvious deficiencies. :) Personally, I'd probably structure it like http://www.haskell.org/haskellwiki/Prime_numbers#Implicit_Heap This variant, based on the wiki article, is cleaner, slightly simpler, appears to be just as fast, and allocates slightly less memory: import GHC.Exts(inline) import Data.List.Ordered(unionBy) union' :: People Int - People Int - People Int union' (VIP x xt) ys= VIP x (union' xt ys) union' (Crowd xs) (Crowd ys)= Crowd (inline unionBy compare xs ys) union' xs@(Crowd (x:xt)) ys@(VIP y yt) = case compare x y of LT - VIP x (union' (Crowd xt) ys) EQ - VIP x (union' (Crowd xt) yt) GT - VIP y (union' xs yt) foldTree :: (a - a - a) - [a] - a foldTree f xs = case xs of [] - [] xs - loop xs where loop [x]= x loop (x:xs) = x `f` loop (pairs xs) pairs (x:y:ys) = f x y : pairs ys pairs xs = xs unions xss = serve $ inline foldTree union' [ VIP x (Crowd xs) | (x:xs) - xss ] where serve (VIP x xs) = x:serve xs serve (Crowd xs) = xs One of the differences is that I started with a slightly different foldTree, one that was taken directly from Data.List.sort. The only problem is that it has the same problem as I mentioned: unionAll [[1,2],[1,2]] == [1,1,2] whereas unionAll is intended to be a generalization of foldr union [] to an infinite number of lists, and should thus return [1,2]. But I should be able to fix this without much difficulty. Your loop function is a strange melange of many different concerns (building a tree, union', adding and removing the VIP constructors). Note that it's currently unclear to me whether the lazy pattern match in pairs ~(x: ~(y:ys)) = f x y : pairs ys is beneficial or not; you used a strict one unionPairs (x:y:zs) = union' x y : unionPairs zs Well, as the library implementation must work on finite cases as well, the lazy pattern seems out of the question. If you're really concerned about time space usage, it might even be worth to abandon the lazy tree altogether and use a heap to achieve the same effect, similar to Melissa O'Neils prime number code. It's not as neat, but much more predictable. :) Well, it is intended as a high quality, generally useful implementation, so of course I care about time and space usage. :) Dave Bayer's original algorithm does slightly better, but was much larger in terms of both source code and object size. Omar implemented something along these lines, but it didn't perform so well. I did not dig into the reasons why, though; it might not have had anything to do with the fact an explicit heap was used. Incidentally, I tried implementing something like implicit heaps once upon a time; but it had a severe performance problem, taking a few minutes to produce 20-30 elements.I didn't have a pressing reason to figure out why though, and didn't pursue it further. Best, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Implementing unionAll
With the urging and assistance of Omar Antolín Camarena, I will be adding two functions to data-ordlist: mergeAll and unionAll, which merge (or union) a potentially infinite list of potentially infinite ordered lists, under the assumption that the heads of the non-empty lists appear in a non-decreasing sequence. Union takes two sorted lists and produces a new sorted list; an element occurs in the result as many times as the maximum number of occurrences in either list. The unionAll function generalizes this behavior to an infinite number of lists. A reasonable implementation of mergeAll is: import Data.List.Ordered(merge, union) mergeAll :: Ord a = [[a]] - [a] mergeAll = foldr (\(x:xs) ys - x : merge xs ys) [] However, for many inputs, we can do better; the library implementation of mergeAll is based on H. Apfelmus's article Implicit Heaps, which presents a simplification of Dave Bayer's venturi algorithm. The difference is that the foldr version uses a line of comparisons, whereas venturi uses a tree of comparisons. http://apfelmus.nfshost.com/articles/implicit-heaps.html http://www.mail-archive.com/haskell-cafe@haskell.org/msg27612.html However, as Omar pointed out to me, the following implementation of unionAll has a flaw: unionAll :: Ord a = [[a]] - [a] unionAll = foldr (\(x:xs) ys - x : union xs ys) [] Namely unionAll [[1,2],[1,2]] should return [1,2], whereas it actually returns [1,1,2]. After some work, I believe I have generalized H. Apfelmus's algorithm to handle this; however it seems a bit complicated. I would love feedback, especially with regard to simplifications, bugs, testing strategies, and optimizations: unionAll' :: Ord a = [[a]] - [a] unionAll' = unionAllBy compare data People a = VIP a (People a) | Crowd [a] unionAllBy :: (a - a - Ordering) - [[a]] - [a] unionAllBy cmp xss = loop [ (VIP x (Crowd xs)) | (x:xs) - xss ] where loop [] = [] loop ( VIP x xs : VIP y ys : xss ) = case cmp x y of LT - x : loop ( xs : VIP y ys : xss ) EQ - loop ( VIP x (union' xs ys) : unionPairs xss ) GT - error Data.List.Ordered.unionAll: assumption violated! loop ( VIP x xs : xss ) = x : loop (xs:xss) loop [Crowd xs] = xs loop (xs:xss) = loop (unionPairs (xs:xss)) unionPairs [] = [] unionPairs [x] = [x] unionPairs (x:y:zs) = union' x y : unionPairs zs union' (VIP x xs) (VIP y ys) = case cmp x y of LT - VIP x (union' xs (VIP y ys)) EQ - VIP x (union' xs ys) GT - error Data.List.Ordered.unionAll: assumption violated! union' (VIP x xs) (Crowd ys) = VIP x (union' xs (Crowd ys)) union' (Crowd []) ys = ys union' (Crowd xs) (Crowd ys) = Crowd (unionBy cmp xs ys) union' xs@(Crowd (x:xt)) ys@(VIP y yt) = case cmp x y of LT - VIP x (union' (Crowd xt) ys) EQ - VIP x (union' (Crowd xt) yt) GT - VIP y (union' xs yt) -- Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANN: data-ordlist-0.2
On Sun, Feb 7, 2010 at 6:43 AM, Ross Paterson r...@soi.city.ac.uk wrote: Why not wrap lists as Set and Bag abstract datatypes? An added bonus is that you could make them instances of Monoid. Well, my current thinking is that if you really want an abstract datatype for bags and sets, hackage (and the standard GHC distribution) offer a multitude of options. Previous versions of the code in data-ordlist dates back 9 years or more; I extracted and refurbished data-ordlist from a mess of miscellaneous list functions I use. Personally, I've never really used this code as an abstract datatype; my typical use cases break the set and bag abstractions to some extent. I use a few functions when I happen to know the lists I'm dealing with are ordered, and want a simple and efficient way to manipulate them. The functions I most often use are sortOn, sortOn', nubSort, and to a somewhat lesser extent, the set-like operators. I put this package on hackage as much for my own personal convenience as for others; but I do hope that other people will find it useful. I realize that others might use it in rather different ways than I do, and am open to suggestions and proposals. Cheers, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] ANN: data-ordlist-0.2
I have released data-ordlist 0.2, with a number of changes: 1. The module name is now Data.List.Ordered, instead of Data.OrdList 2. Three bugfixes: (ack!) insertSet and insertBag assumed reverse-ordered lists, nub failed to remove duplicates. Thanks to Topi Karvonen for reporting the first problem. 3. One semantic change: old_nubBy f == new_nubBy (not . f). The new version is better keeping with the spirit of the rest of the library, and makes the old nub bug much more obvious. Now nubBy is the greedy algorithm that returns a sublist such that for all binary predicates: isSortedBy pred (nubBy pred xs) == True 4. Improved documentation, I hope! Please consider taking a look and letting me know what you think. http://hackage.haskell.org/package/data-ordlist Cheers, Leon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: homework help: upn parser
Hi Max! This is a type definiton, which says that parsing takes one argument, a String, and gives you back a list of OpTree: parsing :: String-[OpTree] This is a function definition. The part before the = is called the left hand side, while the part after the = is called the right hand side. parsing (a:b:c:xs) t |isDigit b = ((parsing xs)) ((Term (read a) c (read b)):t) The (a:b:c:xs) is an example of a pattern. You are using the list constructor (:) to break down the list into individual items. According to your type definition, this argument is of type String, which is the same thing as [Char].Thus, a, b, and c are bound to single characters, while xs is the bound to the rest of the string, assuming your string has at least three characters. If your string has less than three characters, then this pattern fails and the next clause in your function definiton will be tried. Then, according to your function definition, you then bind the second argument of the function to the variable t. However, your type definition says you only have one argument. Since they disagree, this is an error. The right hand side of your function definition contains lots of errors, as you suggest. You probably want to rethink your type definition. I'd suggest the following: parsing :: String - [OpTree] - OpTree This says that parsing is a function that takes two arguments, a list of characters and a list of OpTrees, and returns a single OpTree. You are going to treat the second list argument like a stack. When you run across a number in your string, you are going to take the number off the string and push it onto the stack. When you run across an operation in your string, you are going to take two elements off the stack, combine the elements from the stack using the operation, and push the result back on the stack.Thus, your solution would contain this kind of structure: parsing ( string_with_number_in_front ) ( stack ) = parsing ( string_without_number ) ( Number (the_number) : stack ) parsing ( string_with_operation_in_front ) (last_element : next_element : rest_stack) = parsing ( string_without_operation ) ( Term last_element (the_operation) next_element : rest_stack ) Now, you must consider the boundary cases of the function parsing. What I mean is, what kinds of clauses do you need for when you are done with recursion, and what should your first call to this function look like? Of course, you need to try out your code and carefully think about what it's doing. When asking for homework help on haskell-cafe, Haskellers have been known to introduce small intentional errors into their advice. Also, it's very impressive that Paul Natorp high school is using Haskell in it's coursework! In the United States, I've never heard of a high school that takes functional programming seriously. Sadly, many American universities don't take functional programming seriously either. Americans are really behind the game in this regard. Hope this helps. best, leon ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
The Curry-Howard Isomorphism
For an explanation why runST has the type it does, see page 4 of Simon Peyton-Jones and John Launchbury's paper Lazy functional state threads http://research.microsoft.com/Users/simonpj/Papers/lazy-functional-state-threads.ps.Z On Friday 16 August 2002 23:57, Scott J. wrote: runST :: forall a ( forall s ST s a) - a ? In logic forall x forall y statement(x.y) is equivalent to: forall y forall x statement(x,y). True, but that isn't runST's type. forall a quantifies over the entire type, whereas forall s quantifies over the left hand side of the implication. In other words, you have forall a. statement(a), where statement includes a substatement forall s. substatement(s,a) Now, using a different argument, since s does not appear free on the R.H.S of the implication, forall s can be moved outside the implication to quantify over the entire type. However, this brings up an important point about the Curry-Howard Isomorphism, which basically says that types are statements in first-order logic, and the corresponding values are proofs of those statements. When we work in logic, we care mainly about the *statements*. Proofs are ways of justifying new statements. Equivalent statements are in a sense equal, and can usually be freely substituted for each other. When we have a statement such as P /\ Q, where P and Q are statements such as John is wearing a kilt and John is Scottish, it doesn't really matter which order they come in. Mathematicians generally understand that /\ is commutative, and thus we can use P /\ Q and Q /\ P interchangeably throughout our discussion. (Unless you are writing proofs in certain highly formal proof systems, such as Coq, Isabelle, or Haskell's type system) When we work in type theory, we care mainly about the *proofs*. Statements are ways of constraining our proofs to ensure they are at least somewhat sensible. When we have a type (Int, Char), where Int is the statement my value is a 32-bit integer, and Char says my value is a character, order does matter. Here, (4,'H') is a proof of the proposition (Int,Char), and ('H', 4) is a proof of the proposition (Char, Int). Although (Int, Char) = (Char, Int) are logically equivalent, (4,'H') is a different value than ('H', 4).This equivalence can be proven using the function: swap :: (a,b) - (b,a) swap (x,y) = (y, x) Which, using the Curry-Howard isomorphism, is also a proof that /\ is commutative. In short, just because a type a implies type b, and type b implies type a, doesn't mean they are interchangeable as statements are in logic. Hope this helps, leon ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Haskell Problem
What you need to do is write a function that operates on a string that does what you want it to, and then use that to write some top-level I/O code. If you have a function sortFile :: String - String, you would write something like this for main: main :: IO () main = do string - getContents "theFile" putStr (sortFile string) You can treat "string" as a variable that has type String, not IO String, which you can use anywhere you want in "main". Keep in mind, though that what is going on here is quite different than an assignment statement or "converting" a IO String to a String. This is not like the single assignment variables introduced in "where" or "let" clauses, as we cannot substitute the value "(getContents "theFile")" for the variable "string" in main. This would lead to a type error, as sortFile takes a String argument, not an IO String. Nor is is it like the assignment statement in imperative programming languages like C++ and Java for several reasons. One can represent "State Transformers" using monads, so what the IO monad is a state transformer that modifies the state of the computer. int a = 0; int dirty_inc(int a) { a++; return i + a; } int main(int argc, char ** argv) { int i = dirty_inc(1); printf("%i %i", i, i); } Unlike monads, if you "substitute" dirty_inc(1) for i in main will result in a legal program, but it isn't really a substitution, because it would modify the behavior of the program. Moreover, while we could write main = do message - return "Hello!" message - return "Goodbye!" putStr message and get "Goodbye!" as output, what really is happening is that you are introducing two variables with the same name, and we can statically determine which one we are referring to. Thus if we write main = do message - return "Hello!" do message - return "Goodbye! " putStr message putStr message we will get "Goodbye! Hello!", as output, not "Goodbye! Goodbye!". To start to understand what's really going on, do-notation is just syntactic sugar for using the (=) operator. Let's rewrite your example to something that is syntactically equivalent: main :: IO () main = getContents "theFile" = (\string - putStr (sortFile string)) Which we could in turn rewrite as: main :: IO () main = getContents "theFile" = output_sort output_sort :: String - IO () output_sort string = putStr (sortFile string) What (=) does is that it takes the String returned inside of a IO String value, and gives it to output_sort, which in turn may use that value in any way it sees fit, *as long as output_sort returns another "IO a" value for some type a.*This is why we are not simply converting a IO String to a String, because in order to use the String value in IO String, we must produce a new IO monad. This is summed up in (=)'s type, which is (=) :: IO a - (a - IO b) - IO b, which can then be generalized to any monad m, so (=) :: m a - (a - m b) - m b. best, leon ___ Haskell-cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe