[Haskell-cafe] Announcing postgresql-libpq-0.8.2.3

2013-07-08 Thread Leon Smith
I just fixed a fairly serious performance problem with postgresql-libpq's
binding to PQescapeStringConn;   in was exhibiting a non-linear slowdown
when more strings are escaped and retained.

https://github.com/lpsmith/postgresql-libpq/commit/adf32ff26cdeca0a12fa59653b49c87198acc9ae

If you are using postgresql-libpq's escapeStringConn,  or a library that
uses it (e.g.  postgresql-simple,  or persistent-postgresql),  I do
recommend upgrading.   You may or may not see a performance improvement,
 depending on your particular use case,   but if you do it can be quite
substantial.

It's not entirely clear to me what the root cause really is,  but it
certainly appears as though it's related to the (direct) use of
mallocBytes,   which was replaced with (indirect) calls to
mallocForeignPtrBytes / mallocPlainForeignPtrBytes (through the bytestring
package).   In this case,  it resulted in an asymptotic improvement in time
complexity of some algorithms.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] [database-devel] Announcing postgresql-libpq-0.8.2.3

2013-07-08 Thread Leon Smith
I'll have to benchmark withMVar on my system,  but (at least on my old
laptop) a safe foreign function call is also on the order of 100ns.   As
c_PQescapeStringConn and c_PQescapeByteaConn are currently safe calls,
that would limit the maximum time saved at ~50%.

Perhaps it would make sense to make these unsafe calls as well,  but the
justification I used at the time was that the amount of time consumed by
these functions is bounded by the length of the string being escaped,
 which is itself unbounded.Certainly postgresql-libpq is currently
overly biased towards safe calls;  though when I took the bindings over it
was overly biased towards unsafe calls.   (Though, arguably,  it's worse to
err on the side of making ffi calls unsafe.)

I've also considered integrating calls to c_PQescapeStringConn with
blaze-builder and/or bytestring-builder,  which could help a fair bit,  but
would also introduce dependencies on the internals of these libraries when
currently there is none.

There is certainly a lot of room for optimizing query generation in
postgresql-simple,  this I've been well aware of since the beginning.   And
it probably would be worthwhile to move to protocol-level parameters which
would avoid the need for escaping value parameters altogether,  and open up
the possibility of binary formats as well, which would be a huge
performance improvement for things like numerical values and timestamps.
 Although IIRC,  one downside is that this prevents multiple DML commands
from being issued in a single request,  which would subtly change the
interface postgresql-simple exports.

Best,
Leon


On Mon, Jul 8, 2013 at 10:00 PM, Joey Adams joeyadams3.14...@gmail.comwrote:

 On Mon, Jul 8, 2013 at 9:03 PM, Leon Smith leon.p.sm...@gmail.com wrote:

 I just fixed a fairly serious performance problem with postgresql-libpq's
 binding to PQescapeStringConn;   in was exhibiting a non-linear slowdown
 when more strings are escaped and retained.


  I'd like to point out a somewhat related bottleneck in postgresql-simple
 (but not postgresql-libpq).  Every PQescapeStringConn or PQescapeByteaConn
 call involves a withMVar, which is about 100ns on the threaded RTS on my
 system.  Taking the Connection lock once for the whole buildQuery call
 might be much faster, especially for multi-row inserts and updates.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] A use case for *real* existential types

2013-05-10 Thread Leon Smith
I've been working on a new Haskell interface to the linux kernel's inotify
system, which allows applications to subscribe to and be notified of
filesystem events.   An application first issues a system call that returns
a file descriptor that notification events can be read from,  and then
issues further system calls to watch particular paths for events.   These
return a watch descriptor (which is just an integer) that can be used to
unsubscribe from those events.

Now,  of course an application can open multiple inotify descriptors,  and
when removing watch descriptors,  you want to remove it from the same
inotify descriptor that contains it;  otherwise you run the risk of
deleting a completely different watch descriptor.

So the natural question here is if we can employ the type system to enforce
this correspondence.   Phantom types immediately come to mind,  as this
problem is almost the same as ensuring that STRefs are only ever used in a
single ST computation.   The twist is that the inotify interface has
nothing analogous to runST,  which does the heavy lifting of the type
magic behind the ST monad.

This twist is very simple to deal with if you have real existential types,
 with the relevant part of the interface looking approximately like

init :: exists a. IO (Inotify a)
addWatch :: Inotify a - FilePath - IO (Watch a)
rmWatch :: Inotify a - Watch a - IO ()

UHC supports this just fine,  as demonstrated by a mockup attached to this
email.  However a solution for GHC is not obvious to me.


inotify.hs
Description: Binary data
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A use case for *real* existential types

2013-05-10 Thread Leon Smith
On Fri, May 10, 2013 at 9:00 AM, Andres Löh and...@well-typed.com wrote:


  This twist is very simple to deal with if you have real existential
 types,
  with the relevant part of the interface looking approximately like
 
  init :: exists a. IO (Inotify a)
  addWatch :: Inotify a - FilePath - IO (Watch a)
  rmWatch :: Inotify a - Watch a - IO ()

 You can still do the ST-like encoding (after all, the ST typing trick
 is just an encoding of an existential), with init becoming like
 runST:

  init :: (forall a. Inotify a - IO b) - IO b
  addWatch :: Inotify a - FilePath - IO (Watch a)
  rmWatch :: Inotify a - Watch a - IO ()


Right, but my interface the Inotify descriptor has an indefinite extent,
 whereas your interface enforces a dynamic extent.   I'm not sure to what
degree this would impact use cases of this particular library,  but in
general moving a client program from the the first interface to the second
can require significant changes to the structure of the program,   whereas
moving a client program from the second interface to the first is trivial.
   So I'd say my interface is more expressive.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A use case for *real* existential types

2013-05-10 Thread Leon Smith
On Fri, May 10, 2013 at 9:04 AM, MigMit miguelim...@yandex.ru wrote:

 With that kind of interface you don't actually need existential types. Or
 phantom types. You can just keep Inotify inside the Watch, like this:


Right, that is an alternative solution,  but phantom types are a relatively
simple and well understood way of enforcing this kind of property in the
type system without incurring run-time costs.   My inotify binding is
intended to be as thin as possible,  and given my proposed interface,   you
could implement your interface in terms of mine,  making the phantom types
disappear using the restricted existentials already available in GHC,   and
such a wrapper should be just as efficient as if you had implemented your
interface directly.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A use case for *real* existential types

2013-05-10 Thread Leon Smith
On Fri, May 10, 2013 at 5:49 PM, Alexander Solla alex.so...@gmail.comwrote:

 I'm not sure if it would work for your case, but have you considered using
 DataKinds instead of phantom types?  At least, it seems like it would be
 cheap to try out.


 http://www.haskell.org/ghc/docs/7.4.2/html/users_guide/kind-polymorphism-and-promotion.html


I do like DataKinds a lot,  and I did think about them a little bit with
respect to this problem,  but a solution isn't obvious to me,  and perhaps
more importantly I'd like to be able to support older versions of GHC,
 probably back to 7.0 at least.

The issue is that every call to init needs to return a slightly different
type,  and whether this is achieved via phantom types or datakinds,  it
seems to me some form of existential typing is required.  As both Andres
and MigMit pointed out,  you can sort of achieve this by using a
continuation-like construction and higher-ranked types (is there a name for
this transform?  I've seen it a number of times and it is pretty well
known...),  but this enforces a dynamic extent on the descriptor whereas
the original interface I proposed allows an indefinite extent.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A use case for *real* existential types

2013-05-10 Thread Leon Smith
A value has an indefinite extent if it's lifetime is independent of any
block of code or related program structure,  think malloc/free or new/gc.
 A value has a dynamic extent if is lifetime is statically determined
relative to the dynamic execution of the program (e.g. a stack variable):
 in this case the type system ensures that no references to the inotify
descriptor can exist after the callback returns.

Best,
Leon


On Fri, May 10, 2013 at 6:52 PM, Alexander Solla alex.so...@gmail.comwrote:




 On Fri, May 10, 2013 at 3:31 PM, Leon Smith leon.p.sm...@gmail.comwrote:

 On Fri, May 10, 2013 at 5:49 PM, Alexander Solla alex.so...@gmail.comwrote:

 I'm not sure if it would work for your case, but have you considered
 using DataKinds instead of phantom types?  At least, it seems like it would
 be cheap to try out.


 http://www.haskell.org/ghc/docs/7.4.2/html/users_guide/kind-polymorphism-and-promotion.html


 I do like DataKinds a lot,  and I did think about them a little bit with
 respect to this problem,  but a solution isn't obvious to me,  and perhaps
 more importantly I'd like to be able to support older versions of GHC,
  probably back to 7.0 at least.

 The issue is that every call to init needs to return a slightly different
 type,  and whether this is achieved via phantom types or datakinds,  it
 seems to me some form of existential typing is required.  As both Andres
 and MigMit pointed out,  you can sort of achieve this by using a
 continuation-like construction and higher-ranked types (is there a name for
 this transform?  I've seen it a number of times and it is pretty well
 known...),  but this enforces a dynamic extent on the descriptor whereas
 the original interface I proposed allows an indefinite extent.


 I know what extensions (of predicates and the like) are, but what exactly
 does dynamic and indefinite mean in this context?

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Extensible Type Unification

2013-02-08 Thread Leon Smith
It finally occurred to me how to get most of what I want,  at least from a
functional perspective.Here's a sample GADT,  with four categories of
constructor:

data Foo :: Bool - Bool - Bool - Bool - * where
A :: Foo True b c d
B :: Foo True b c d
C :: Foo a True c d
D :: Foo a b True d
E :: Foo a b c True

Given an Eq instance,  we can easily compare two constructors for equality;
  we can put some constructors in a list and tell which categories appear
in the list,   and it plays reasonably nicely with the case coverage
analyzer,  which I think is important from a software engineering
standpoint.For example,  this function will compile without warning:

fooVal :: Foo a b False False - Int
fooVal  x =
   case x of
  A - 0
  B - 1
  C - 2

The only thing this doesn't do that I wish it did is infer the type of
fooVal above.Rather,  GHC  infers the type  :: Foo a b c d - Int,
and then warns me that I'm missing cases.   But this is admittedly a
strange form of type inference,   completely alien to the Haskell
landscape,   which I realized only shortly after sending my original email.
  My original description of ranges of sets of types wasn't sufficient to
fully capture my intention.

That said,  this solution falls rather flat in several software engineering
respects.  It doesn't scale with complexity;   types quickly become
overbearing even with modest numbers of categories.If you want to add
additional categories,   you have to modify every type constructor instance
you've ever written down.You could mitigate some of this via the
existing type-level machinery,  but it seems a band-aid,  not a solution.

For comparison,  here is the best I had when I wrote my original email:

data Category = W | X | Y | Z

data Foo :: [Category] - * where
A :: (Member W ub) = Foo ub
B :: (Member W ub) = Foo ub
C :: (Member X ub) = Foo ub
D :: (Member Y ub) = Foo ub
E :: (Member Z ub) = Foo ub

class Member (a :: x) (bs :: [x])
instance Member a (a ': bs)
instance Member a bs = Member a (b ': bs)

The code is closer to what I want,  in terms of expression,  though it
mostly fails on the functional requirements.Case analysis doesn't work,
  I can't compare two constructors without additional type annotations,
and while I can find out which categories of constructors appear in a list,
  it doesn't seem I can actually turn those contexts into many useful
things,  automatically.

So while I didn't have any trouble computing set operations over unordered
lists at the type level,   I couldn't figure out out to put it to use in
the way I wanted.Perhaps there is a clever way to emulate unification,
 but I really like the new features that reduce the need to be clever when
it comes to type-level computation.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Extensible Type Unification

2013-02-07 Thread Leon Smith
I've been toying with some type-level programming ideas I just can't quite
make work,   and it seems what I really want is a certain kind of type
unification.

Basically,  I'd like to introduce two new kind operators:

kind Set as   -- a finite set of ground type terms of kind as

kind Range as = Range (Set as) (Set as)  -- a greatest lower bound and
a least upper bound


A type expression of kind (Set as)  would either be a type variable of kind
(Set as),  or set of *ground* type terms of kind (as).  A type
expression of kind (Range as) would be a type variable
of kind (Range as),  or two type expressions of kind (Set as).   To unify
ground terms of these types,  one would compute as follows:


unifySets xs ys
   |  xs == ys  = Just xs
   |  otherwise = Nothing

unifyRanges (Range glb0 lub0) (Range glb1 lub1)
| glb' `subset` lub' = Just (Range glb' lub')
| otherwise  = Nothing
  where
glb' = glb0 `union` glb1
lub' = lub0 `isect` lub1


I say sets of ground types,  because I have no idea what unification
between sets of non-ground types would mean,  and I really don't need it.

Among applications that came to mind,   one could use this to create a
restricted IO abstraction that could tell you which kinds of actions might
be taken  (read from mouse,  talk to network,  etc),   and be able to run
only those scripts that are restricted to certain resources. Or,  one
could define a singular GADT that represents messages/events decorated by
various categories,   and then define functions that
only operate on selected categories of messages.E.G. for something
vaguely IRC-ish,  one could write something like:

data EventCategory
= Channel
| Presence
| Information
| Conversation

data Event :: Range EventCategory - *  where
ChanCreate ::  ...  - Event (Range (Set '[Channel])  a)
ChanJoin   ::  ...  - Event (Range (Set '[Channel])  a)
ChanLeave  ::  ...  - Event (Range (Set '[Channel])  a)
PresAvailable  ::  ...  - Event (Range (Set '[Presence]) a)
PresAway   ::  ...  - Event (Range (Set '[Presence]) a)
WhoisQuery ::  ...  - Event (Range (Set '[Information])  a)
WhoisResponse  ::  ...  - Event (Range (Set '[Information])  a)
Message::  ...  - Event (Range (Set '[Conversation]) a)

And then be able to write functions such as


dispatch :: Event (Range a (Set '[Channel, Conversation]))  - IO ()
dispatch e =
case e of
ChanCreate{..} - ...
ChanJoin{..} - ...
ChanLeave{..} - ...
Message{..} - ...


In this case,  the case analysis tool would be able to warn me if I'm
missing any possible events in this dispatcher,  or  if I have extraneous
events that I can't be passed (according to my type.)

Anyway,  I've been trying to see if I can't come up with something similar
using existing type-level functionality in 7.6,   with little success.
(Though I'm not very experienced with type-level programming.) If not,
 might it be possible to add some kind of extensible unification mechanism,
 in furtherance of type-level programming?

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How can I avoid buffered reads?

2012-12-09 Thread Leon Smith
On Thu, Dec 6, 2012 at 5:23 PM, Brandon Allbery allber...@gmail.com wro\

 Both should be cdevs, not files, so they do not go through the normal
 filesystem I/O pathway in the kernel and should support select()/poll().
  (ls -l, the first character should be c instead of - indicating
 character-mode device nodes.)  If ghc is not detecting that, then *that* is
 indeed an I/O manager issue.


The issue here is that if you look at the source of fdReadBuf,  you see
that it's a plain system call without any reference to GHC's (relatively
new) IO manager.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How can I avoid buffered reads?

2012-11-29 Thread Leon Smith
Well,  I took Bardur's suggestion and avoided all the complexities of GHC's
IO stack and simply used System.Posix.IO and Foreign.This appears to
work,  but for better or worse,   it is using blocking calls to the read
system call and is not integrated with GHC's IO manager.   This shouldn't
be an issue for my purposes,  but I suppose it's worth pointing out.

{-# LANGUAGE BangPatterns, ViewPatterns #-}

import   Control.Applicative
import   Data.Bits
import   Data.Word(Word64)
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import   Data.ByteString.Internal (c2w)
import   Control.Exception
import   System.Posix.IO
import   Foreign
import qualified System.IO  as IO
import qualified Data.Binary.Getas Get

showHex :: Word64 - S.ByteString
showHex n = s
  where
(!s,_) = S.unfoldrN 16 f n

f n = Just (char (n `shiftR` 60), n `shiftL` 4)

char (fromIntegral - i)
  | i  10= (c2w '0' -  0) + i
  | otherwise = (c2w 'a' - 10) + i

twoRandomWord64s :: IO (Word64,Word64)
twoRandomWord64s = bracket openRd closeRd readRd
  where
openRd = openFd /dev/urandom ReadOnly Nothing defaultFileFlags {
noctty = True }
readRd = \fd - allocaBytes 16 $ \ptr - do
fdReadAll fd ptr 16
x - peek (castPtr ptr)
y - peek (castPtr ptr `plusPtr` 8)
return (x,y)
closeRd = closeFd
fdReadAll fd ptr n = do
  n' - fdReadBuf fd ptr n
  if n /= n'
  then fdReadAll fd (ptr `plusPtr` n') (n - n')
  else return ()

main = do
   (x,y) - twoRandomWord64s
   S.hPutStrLn IO.stdout (S.append (showHex x) (showHex y))


On Wed, Nov 28, 2012 at 6:05 PM, Leon Smith leon.p.sm...@gmail.com wrote:

 If you have rdrand,  there is no need to build your own PRNG on top of
 rdrand.   RdRand already incorporates one so that it can produce random
 numbers as fast as they can be requested,  and this number is continuously
 re-seeded with the on-chip entropy source.

 It would be nice to have a little more information about /dev/urandom and
 how it varies by OS and hardware,   but on Linux and FreeBSD at least it's
 supposed to be a cryptographically secure RNG that incorporates a PRNG to
 produce numbers in case you exhaust the entropy pool.

 On Wed, Nov 28, 2012 at 5:00 PM, Vincent Hanquez t...@snarc.org wrote:

 On 11/28/2012 09:31 PM, Leon Smith wrote:

 Quite possibly,  entropy does seem to be a pretty lightweight
 dependency...

 Though doesn't recent kernels use rdrand to seed /dev/urandom if it's
 available?   So /dev/urandom is the most portable source of random numbers
 on unix systems,  though rdrand does have the advantage of avoiding system
 calls,  so it certainly would be preferable, especially if you need large
 numbers of random numbers.

 There's no much information on this i think, but if you need large number
 of random numbers you should build a PRNG yourself on top of the best
 random seed you can get, and make sure you reseed your prng casually with
 more entropy bytes. Also if
 you don't have enough initial entropy, you should block.

 /dev/urandom is not the same thing on every unix system. leading to
 various assumptions broken when varying the unixes. It also varies with the
 hardware context: for example on an embedded or some virtualized platform,
 giving you really terrible entropy.

 --
 Vincent



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] How can I avoid buffered reads?

2012-11-28 Thread Leon Smith
I have some code that reads (infrequently) small amounts of data from
/dev/urandom,  and because this is pretty infrequent,  I simply open the
handle and close it every time I need some random bytes.

The problem is that I recently discovered that,  thanks to buffering within
GHC,   I was actually reading 8096 bytes when I only need 16 bytes,  and
thus wasting entropy.   Moreover  calling hSetBuffering  handle NoBuffering
did not change this behavior.

I'm not sure if this behavior is a bug or a feature,  but in any case it's
unacceptable for dealing with /dev/urandom.   Probably the simplest way to
fix this is to write a little C helper function that will read from
/dev/urandom for me,  so that I have precise control over the system calls
involved. But I'm curious if GHC can manage this use case correctly;
I've just started digging into the GHC.IO code myself.

Best,
Leon

{-# LANGUAGE BangPatterns, ViewPatterns #-}
import   Control.Applicativeimport   Data.Bitsimport
Data.Word(Word64)import qualified Data.ByteString as Simport
qualified Data.ByteString.Lazy as Limport
Data.ByteString.Internal (c2w)import qualified System.IOas
IOimport qualified Data.Binary.Getas Get
showHex :: Word64 - S.ByteStringshowHex n = s
  where
(!s,_) = S.unfoldrN 16 f n

f n = Just (char (n `shiftR` 60), n `shiftL` 4)

char (fromIntegral - i)
  | i  10= (c2w '0' -  0) + i
  | otherwise = (c2w 'a' - 10) + i
twoRandomWord64s :: IO (Word64,Word64)twoRandomWord64s =
IO.withBinaryFile /dev/urandom IO.ReadMode $ \handle - do
   IO.hSetBuffering handle IO.NoBuffering
   Get.runGet ((,) $ Get.getWord64host * Get.getWord64host) $
L.hGet handle 16
main = do
   (x,y) - twoRandomWord64s
   S.hPutStrLn IO.stdout (S.append (showHex x) (showHex y))

{- Relevant part of strace:

open(/dev/urandom, O_RDONLY|O_NOCTTY|O_NONBLOCK) = 3
fstat(3, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 9), ...}) = 0
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7367e528) = -1 EINVAL
(Invalid argument)
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7367e528) = -1 EINVAL
(Invalid argument)
read(3, 
N\304\4\367/\26c\\3218\237f\214yKg~i\310\r\262\\224H\340y\n\376V?\265\344...,
8096) = 8096
close(3)= 0

-}
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How can I avoid buffered reads?

2012-11-28 Thread Leon Smith
Quite possibly,  entropy does seem to be a pretty lightweight dependency...

Though doesn't recent kernels use rdrand to seed /dev/urandom if it's
available?   So /dev/urandom is the most portable source of random numbers
on unix systems,  though rdrand does have the advantage of avoiding system
calls,  so it certainly would be preferable, especially if you need large
numbers of random numbers.

Best,
Leon

On Wed, Nov 28, 2012 at 2:45 PM, Thomas DuBuisson 
thomas.dubuis...@gmail.com wrote:

 As an alternative, If there existed a Haskell package to give you fast
 cryptographically secure random numbers or use the new Intel RDRAND
 instruction (when available) would that interest you?

 Also, what you are doing is identical to the entropy package on
 hackage, which probably suffers from the same bug/performance issue.

 Cheers,
 Thomas

 On Wed, Nov 28, 2012 at 11:38 AM, Leon Smith leon.p.sm...@gmail.com
 wrote:
  I have some code that reads (infrequently) small amounts of data from
  /dev/urandom,  and because this is pretty infrequent,  I simply open the
  handle and close it every time I need some random bytes.
 
  The problem is that I recently discovered that,  thanks to buffering
 within
  GHC,   I was actually reading 8096 bytes when I only need 16 bytes,  and
  thus wasting entropy.   Moreover  calling hSetBuffering  handle
 NoBuffering
  did not change this behavior.
 
  I'm not sure if this behavior is a bug or a feature,  but in any case
 it's
  unacceptable for dealing with /dev/urandom.   Probably the simplest way
 to
  fix this is to write a little C helper function that will read from
  /dev/urandom for me,  so that I have precise control over the system
 calls
  involved. But I'm curious if GHC can manage this use case correctly;
  I've just started digging into the GHC.IO code myself.
 
  Best,
  Leon
 
  {-# LANGUAGE BangPatterns, ViewPatterns #-}
 
  import   Control.Applicative
  import   Data.Bits
  import   Data.Word(Word64)
  import qualified Data.ByteString as S
  import qualified Data.ByteString.Lazy as L
  import   Data.ByteString.Internal (c2w)
  import qualified System.IOas IO
  import qualified Data.Binary.Getas Get
 
  showHex :: Word64 - S.ByteString
  showHex n = s
where
  (!s,_) = S.unfoldrN 16 f n
 
  f n = Just (char (n `shiftR` 60), n `shiftL` 4)
 
  char (fromIntegral - i)
| i  10= (c2w '0' -  0) + i
| otherwise = (c2w 'a' - 10) + i
 
  twoRandomWord64s :: IO (Word64,Word64)
  twoRandomWord64s = IO.withBinaryFile /dev/urandom IO.ReadMode $
 \handle -
  do
 IO.hSetBuffering handle IO.NoBuffering
 Get.runGet ((,) $ Get.getWord64host * Get.getWord64host) $
 L.hGet
  handle 16
 
  main = do
 (x,y) - twoRandomWord64s
 S.hPutStrLn IO.stdout (S.append (showHex x) (showHex y))
 
 
  {- Relevant part of strace:
 
  open(/dev/urandom, O_RDONLY|O_NOCTTY|O_NONBLOCK) = 3
  fstat(3, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 9), ...}) = 0
  ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7367e528) = -1 EINVAL
 (Invalid
  argument)
  ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7367e528) = -1 EINVAL
 (Invalid
  argument)
  read(3,
 
 N\304\4\367/\26c\\3218\237f\214yKg~i\310\r\262\\224H\340y\n\376V?\265\344...,
  8096) = 8096
  close(3)= 0
 
  -}
 
 
  ___
  Haskell-Cafe mailing list
  Haskell-Cafe@haskell.org
  http://www.haskell.org/mailman/listinfo/haskell-cafe
 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How can I avoid buffered reads?

2012-11-28 Thread Leon Smith
If you have rdrand,  there is no need to build your own PRNG on top of
rdrand.   RdRand already incorporates one so that it can produce random
numbers as fast as they can be requested,  and this number is continuously
re-seeded with the on-chip entropy source.

It would be nice to have a little more information about /dev/urandom and
how it varies by OS and hardware,   but on Linux and FreeBSD at least it's
supposed to be a cryptographically secure RNG that incorporates a PRNG to
produce numbers in case you exhaust the entropy pool.

On Wed, Nov 28, 2012 at 5:00 PM, Vincent Hanquez t...@snarc.org wrote:

 On 11/28/2012 09:31 PM, Leon Smith wrote:

 Quite possibly,  entropy does seem to be a pretty lightweight
 dependency...

 Though doesn't recent kernels use rdrand to seed /dev/urandom if it's
 available?   So /dev/urandom is the most portable source of random numbers
 on unix systems,  though rdrand does have the advantage of avoiding system
 calls,  so it certainly would be preferable, especially if you need large
 numbers of random numbers.

 There's no much information on this i think, but if you need large number
 of random numbers you should build a PRNG yourself on top of the best
 random seed you can get, and make sure you reseed your prng casually with
 more entropy bytes. Also if
 you don't have enough initial entropy, you should block.

 /dev/urandom is not the same thing on every unix system. leading to
 various assumptions broken when varying the unixes. It also varies with the
 hardware context: for example on an embedded or some virtualized platform,
 giving you really terrible entropy.

 --
 Vincent

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Platform Versioning Policy: upper bounds are not our friends

2012-08-22 Thread Leon Smith
I think we actually agree more than we disagree;  I do think distinguishing
hard and soft upper bounds (no matter what they are called)  would help,
 and I'm just trying to justify them to some of the more dismissive
attitudes towards the idea

The only thing I think we (might) disagree on is the relative importance of
distinguishing hard and soft bounds versus being able to change bounds
easily after the fact (and *without* changing the version number associated
with the package.)

And on that count,  given the choice,  I pick being able to change bounds
after the fact, hands down.   I believe this is more likely to
significantly improve the current situation than distinguishing the two
types of bound alone.   However,  being able to specify both (and change
both) after the fact may prove to be even better.

Best,
Leon

On Sat, Aug 18, 2012 at 11:52 PM, wren ng thornton w...@freegeek.orgwrote:

 On 8/17/12 11:28 AM, Leon Smith wrote:

 And the
 difference between reactionary and proactive approaches I think is a
 potential justification for the hard and soft upper bounds;  perhaps
 we
 should instead call them reactionary and proactive upper bounds
 instead.


 I disagree. A hard constraint says this package *will* break if you
 violate me. A soft constraint says this package *may* break if you
 violate me. These are vastly different notions of boundary conditions, and
 they have nothing to do with a proactive vs reactionary stance towards
 specifying constraints (of either type).

 The current problems of always giving (hard) upper bounds, and the
 previous problems of never giving (soft) upper bounds--- both stem from a
 failure to distinguish hard from soft! The current/proactive approach fails
 because the given constraints are interpreted by Cabal as hard constraints,
 when in truth they are almost always soft constraints. The
 previous/reactionary approach fails because when the future breaks noone
 bothered to write down when the last time things were known to work.

 To evade both problems, one must distinguish these vastly different
 notions of boundary conditions. Hard constraints are necessary for
 blacklisting known-bad versions; soft constraints are necessary for
 whitelisting known-good versions. Having a constraint at all shows where
 the grey areas are, but it fails to indicate whether that grey is most
 likely to be black or white.

 --
 Live well,
 ~wren


 __**_
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/**mailman/listinfo/haskell-cafehttp://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Platform Versioning Policy: upper bounds are not our friends

2012-08-17 Thread Leon Smith
I see good arguments on both sides of the upper bounds debate,  though at
the current time I think the best solution is to omit upper bounds (and I
have done so for most/all of my packages on hackage).But I cannot agree
with this enough:

On Thu, Aug 16, 2012 at 4:45 AM, Joachim Breitner
m...@joachim-breitner.dewrote:

 I think what we’d need is a more relaxed policy with modifying a
 package’s meta data on hackage. What if hackage would allow uploading a
 new package with the same version number, as long as it is identical up
 to an extended version range? Then the first person who stumbles over an
 upper bound that turned out to be too tight can just fix it and upload
 the fixed package directly, without waiting for the author to react.


I think that constraint ranges of a given package should be able to both be
extended and restricted after the fact.   Those in favor of the reactionary
approach (as I am at the moment, or Bryan O'Sullivan) would find the
ability of to restrict the version range useful,  while those in favor of
the proactive approach (like Joachim Breitner or Doug Beardsley) would find
the ability to extend the version range useful.

I suspect that attitudes towards upper bounds may well change if we can set
version ranges after the fact.I know mine very well might.And the
difference between reactionary and proactive approaches I think is a
potential justification for the hard and soft upper bounds;  perhaps we
should instead call them reactionary and proactive upper bounds instead.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What does unpacking an MVar really mean?

2012-07-31 Thread Leon Smith
On Tue, Jul 31, 2012 at 7:37 AM, Bertram Felgenhauer 
bertram.felgenha...@googlemail.com wrote:

 Note that MVar# itself cannot be unpacked -- the StgMVar record will
 always be a separate heap object.


One could imagine a couple of techniques to unpack the MVar# itself,  and
was curious if GHC might employ one of them.

So, really,  unpacking the MVar does not eliminate a layer of indirection,
 it just eliminates the need to check a pointer tag (and possibly execute a
thunk or follow some redirects if you don't have a pointer to an MVar#).
 I think this is what I was ultimately after.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] What does unpacking an MVar really mean?

2012-07-30 Thread Leon Smith
I admit I don't know exactly how MVars are implemented,  but given that
they can be aliased and have indefinite extent,   I would think that they
look something vaguely like a  cdatatype ** var,  basically a pointer to an
MVar (which is itself a pointer,  modulo some other things such as a thread
queue.)

And,  I would think that unpacking such an structure would basically be
eliminating one layer of indirection,  so it would then look vague like a
cdatatype * var.But again,  given aliasing and indefinite extent, this
would seem to be a difficult thing to do.

Actually this isn't too difficult if an MVar only exists in a single
unpacked structure:   other references to the MVar can simply be pointers
into the structure.   But the case where an MVar is unpacked into two
different structures suggests that,  at least in some cases,  an unpacked
MVar is still a cdatatype ** var;

So, is my understanding more or less correct?  Does anybody have a good,
succinct explanation of how MVars are implemented,  and how they are
unpacked?

One final question,   assuming that unpacking an MVar really does eliminate
a layer of indirection,  and that other references to that MVar are simply
pointers into a larger structure,   what happens to that larger structure
when there are no more references to it (but still some references to the
MVar?)Given the complications that must arise out of a doubly
unpacked MVar,  I'm going to guess that the larger structure does get
garbage collected in this case,  and that the MVar becomes dislodged from
this structure.   Would that MVar then be placed directly inside another
unpacked reference, if one is available?

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What does unpacking an MVar really mean?

2012-07-30 Thread Leon Smith
Let me clarify a bit.

I am familiar with the source of Control.Concurrent.MVar,  and I do see {-#
UNPACK #-}'ed MVars around,  for example in GHC's IO manager. What I
should have asked is,  what does an MVar# look like?  This cannot be
inferred from Haskell source;  though I suppose I could have tried to read
the Runtime source.

Now,  one would hope that and (MVar# RealWorld footype) would
 approximately correspond to a footype * mvar; variable in C.   The
problem is this cannot _always_ be the case,  because you can alias the
(MVar# RealWorld footype) by placing a single MVar into two unpacked
columns in two different data structures.So you would need to be able
to still sometimes represent an MVar# by a footype ** mvar at runtime,
 even though one would hope that it would be represented by a footype *
mvar in one particular data structure.

On Tue, Jul 31, 2012 at 1:04 AM, Ryan Ingram ryani.s...@gmail.com wrote:

 Because of this, boxed MVars can be garbage collected without necessarily
 garbage-collecting the MVar# it holds, if a live reference to that MVar#
 still exists elsewhere.


I was asking the dual question:  if the MVar# exists in some data
structure,  can that data structure still be garbage collected when there
is a reference to the MVar#,  but not the data structure it is contained
within.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] twitter election on favorite programming language

2012-05-01 Thread Leon Smith
Out of curiousity,  was this a plurality election (vote for one),  or an
approval election (vote for many)?

On Tue, May 1, 2012 at 12:11 AM, Kazu Yamamoto k...@iij.ad.jp wrote:

 Hello,

 A twitter election on favorite programming language was held in Japan
 and it appeared that Heskell is No. 10 loved language in Japan. :-)

http://twisen.com/election/index/654

 Regards,

 --Kazu

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Turn GC off

2011-09-21 Thread Leon Smith
I doubt it.  Even if you could turn GC completely off,  the vast
majority of GHC Haskell programs will run out of memory very quickly.
 Lazy evaluation has been called evaluation by allocation;   unless
your program has very simple requirements and can live in the
completely-strict fragment of Haskell without consing,  almost
everything allocates something.   Also,  your programs probably won't
even run faster without GC,  as GHC's GC is an important part of
getting halfway reasonable L2 cache performance.

Best,
Leon

On Wed, Sep 14, 2011 at 12:42 PM, Andreas Voellmy
andreas.voel...@gmail.com wrote:
 Hi everyone,
 Is there a way to completely turn garbage collection off in the Haskell
 runtime system? I'm aware of the -A runtime option, but I'd like to
 completely turn it off, if possible. I'm OK with running the program until
 it runs out of memory, and I'm willing to recompile GHC if needed.
 Regards,
 Andreas
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] stack overflow pain

2011-09-21 Thread Leon Smith
On Wed, Sep 21, 2011 at 3:39 AM, Heinrich Apfelmus
apfel...@quantentunnel.de wrote:
 Of course, a list of 1 million items is going to take a lot of memory,
 unless you generate it lazily. Unfortunately  mapM  cannot generate its
 result lazily because it has to execute all IO actions before returning the
 list of results.

That's oversimplifying a bit.  The outer list cannot be generated
lazily,  but the inner values (in this case inner lists) can be
generated lazily.

On Wed, Sep 21, 2011 at 7:00 PM, Tim Docker t...@dockerz.net wrote:
 I believe the error is happening in the concat because there are subsequent
 IO actions that fail to execute. ie the code is equivalent to:

        vs - fmap concat $ mapM applyAction sas
        someOtherAction
        consume vs

 and someOtherAction seems not to be run. However, to be sure, I'll confirm
 with code akin to what you suggest above.

The error shouldn't be happening in either concat or mapM.   Are you
sure that someOtherAction isn't being run?  Might it be writing to a
file and the result isn't getting flushed?

GHC has no inherent limit on the stack size, though using extremely
large amounts of stack is usually indicative of an error.   You can up
the stack limit with the -Ksize RTS option, and I think there is a way
it can be disabled entirely.You might try upping your stack size
and profiling your program to see if that's helpful:  the -xt
profiling option might be useful, but I haven't played with it much.

I suspect the issue is that one of your applyAction is creating a
thunk that blows the stack when it's evaluated,  and return []
ensures that the thunk is never evaluated.Though it's not clear to
me why it'd be getting evaluated in this new scenario with the
information you've provided,  assuming you really truly aren't running
someOtherAction.

Best,
Leon

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Higher-kinded Quantification

2011-04-13 Thread Leon Smith
Thanks!  The issue with eta-reduction had been confusing me...

Best,
Leon

On Tue, Apr 12, 2011 at 3:35 PM, Dan Doel dan.d...@gmail.com wrote:
 On Tuesday 12 April 2011 11:27:31 AM Leon Smith wrote:
 I think impredicative polymorphism is actually needed here;  if I write
 ...
 Then I get a type error
 ...

 I'm not going to worry about the type error, because that wasn't what I had in
 mind for the types. The type for loop I had in mind was:

  [i] - Iterator i o m a - Iterator i o m a

 Then, feedPure cracks open the first (forall m. ...), instantiates it to the m
 for the result, and runs loop on it. If we had explicit type application, it'd
 look like:

  feedPure l it = /\m - loop l (it@m)

 but as it is it's just:

  feedPure l it = loop l it

 Which cannot be eta reduced.

 But what I find rather befuddling is the kind error:
  feedPure' :: [i] - Iterator i o (forall (m :: * - *) . m) a - Iterator
  i o (forall (m :: * - *) . m) a feedPure' = undefined

 Iterator.hs:193:58:
     `m' is not applied to enough type arguments
     Expected kind `*', but `m' has kind `* - *'
     In the type signature for `feedPure'':
       feedPure' :: [i]
                    - Iterator i o (forall m :: (* - *). m) a
                       - Iterator i o (forall m :: (* - *). m) a

 Is impredicative polymorphism restricted to the kind *?

 The problem is that (forall (m :: * - *). m) is not a valid type, and forall
 is not a valid way to construct things with kind * - *. You have:

  m :: * - * |- T[m] :: * == |- (forall (m :: * - *). T[m]) :: *

 but that is the only way forall works.

 -- Dan

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Higher-kinded Quantification

2011-04-12 Thread Leon Smith
I think impredicative polymorphism is actually needed here;  if I write

 {-# LANGUAGE ScopedTypeVariables #-}
 {-# LANGUAGE RankNTypes #-}

 feedPure :: forall i o a. [i] - (forall m. Iterator i o m a) - (forall m. 
 Iterator i o m a)
 feedPure = loop
   where
 loop :: [i] - (forall m. Iterator i o m a) - (forall m. Iterator i o m 
 a)
 loop [] iter = iter
 loop (i:is) (NeedInput k)= loop is (k i)

Then I get a type error:

Iterator.hs:185:36:
Couldn't match type `m0' with `m1'
  because type variable `m1' would escape its scope
This (rigid, skolem) type variable is bound by
  a type expected by the context: Iterator i o m1 a
The following variables have types that mention m0
  k :: i - Iterator i o m0 a (bound at Iterator.hs:185:28)

Which I think I vaguely understand,  as the type of NeedInput is (i -
Iterator i o m a) - Iterator i o m a,  meaning the type of m is
equal.   So it seems the polymorphism must be carried on m.

But what I find rather befuddling is the kind error:

 feedPure' :: [i] - Iterator i o (forall (m :: * - *) . m) a - Iterator i o 
 (forall (m :: * - *) . m) a
 feedPure' = undefined

Iterator.hs:193:58:
`m' is not applied to enough type arguments
Expected kind `*', but `m' has kind `* - *'
In the type signature for `feedPure'':
  feedPure' :: [i]
   - Iterator i o (forall m :: (* - *). m) a
  - Iterator i o (forall m :: (* - *). m) a

Is impredicative polymorphism restricted to the kind *?

Best,
Leon




Then I get a tp



On Tue, Apr 12, 2011 at 5:37 AM, Dan Doel dan.d...@gmail.com wrote:
 On Monday 11 April 2011 8:31:54 PM Leon Smith wrote:
 I have a type constructor (Iterator i o m a) of kind (* - * - (* -
 *) - *),  which is a monad transformer,  and I'd like to use the type
 system to express the fact that some computations must be pure,  by
 writing the impredicative type (Iterator i o (forall m. m) a).
 However I've run into a bit of difficulty expressing this,  due to the
 kind of m.   I've attached a minimal-ish example.   Is there a way to
 express this in GHC?

 I think the simplest way is 'Iterator i o Id a'. Then there's a function:

  embed :: Iterator i o Id a - Iterator i o m a

 with the obvious implementation. This means your NeedAction case is no longer
 undefined, too. You can either peel off NeedActions (since they're just
 delays) or leave them in place.

 However, another option is probably:

    [i] - (forall m. Iterator i o m a) - (forall m. Iterator i o m a)

 which will still have the 'this is impossible' case. You know that the
 incoming iterator can't take advantage of what m is, though, so it will be
 impossible.

 -- Dan

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Higher-kinded Quantification

2011-04-11 Thread Leon Smith
I have a type constructor (Iterator i o m a) of kind (* - * - (* -
*) - *),  which is a monad transformer,  and I'd like to use the type
system to express the fact that some computations must be pure,  by
writing the impredicative type (Iterator i o (forall m. m) a).
However I've run into a bit of difficulty expressing this,  due to the
kind of m.   I've attached a minimal-ish example.   Is there a way to
express this in GHC?
{-# LANGUAGE ImpredicativeTypes #-}
{-# LANGUAGE KindSignatures #-}

data Iterator i o m a
   = NeedInput  (i - Iterator i o m a)
   | HasOutput  o (Iterator i o m a)
   | NeedAction (m (Iterator i o m a))
   | IsDone a

feedPure :: [i] - Iterator i o (forall m . m) a - Iterator i o (forall m . m) a
feedPure = loop
  where
loop [] iter = iter
loop (i:is) (NeedInput k)= loop is (k i)
loop is (HasOutput o k0) = HasOutput o (loop is k0)
loop _  (NeedAction _)   = undefined  -- shouldn't happen, due to type
loop _  (IsDone a)   = IsDone a

{-
impredicative.hs:10:34:
Kind mis-match
The third argument of `Iterator' should have kind `* - *',
but `m' has kind `*'
In the type signature for `feedPure':
  feedPure :: [i]
  - Iterator i o (forall m. m) a - Iterator i o (forall m. m) -}

feedPure' :: [i] - Iterator i o (forall m . (m :: * - *)) a - Iterator i o (forall m . (m :: * - *)) a
feedPure' = loop
  where
loop [] iter = iter
loop (i:is) (NeedInput k)= loop is (k i)
loop is (HasOutput o k0) = HasOutput o (loop is k0)
loop _  (NeedAction _)   = undefined
loop _  (IsDone a)   = IsDone a

{-
impredicative.hs:28:46:
`(m :: * - *)' is not applied to enough type arguments
Expected kind `*', but `(m :: * - *)' has kind `* - *'
In the type signature for `feedPure'':
  feedPure' :: [i]
   - Iterator i o (forall m. (m :: * - *)) a
  - Iterator i o (forall m. (m :: * - *)) a
-}
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Possibility to implant Haskell GC into PostgreSQL interesting?

2011-02-28 Thread Leon Smith
I'm not particularly familiar with the codebase of either PostgreSQL
or GHC,  but I'd be rather surprised that porting GHC's garbage
collector to PostgreSQL would be an easy or worthwhile task.   For
example,  GHC's garbage collector understands the memory layout that
its data structures use,  which I'm sure is rather different from the
memory layout of PostgreSQL's data structures.Also,  often these
kinds of projects turn out to be busts;  for example,  the GHC team
worked on integrating Hugs into GHC for a while,  but after some
effort decided that it would be a lot easier to write an interpreter
from scratch instead,  which became ghci.

What would be far more likely to turn out to be realistic,  would be
to understand GHC's garbage collector and use (some of) it's ideas
when writing a replacement garbage collector for PostgreSQL.
However,  there is a lot of variation in garbage collectors because
different techniques are suited to different patterns of memory
allocation.   And based on the observation that PostgreSQL makes
reasonably effective usage of virtual memory,  whereas GHC's garbage
collector thrashes Virtual Memory,   I would bet there is a fair
number of important differences in the systems.

Best,
Leon

On Tue, Feb 22, 2011 at 7:25 PM, Nick Rudnick joerg.rudn...@t-online.de wrote:
 Dear all,

 recently, at an email conversation with pgsql hackers I had a quick shot,
 asking about their position to somebody replacing their palloc GC -- having
 roughly in mind that either here or on a Mercury mailing list (where there's
 a similar case with a pure declarative language and a Boehm GC), where there
 was a conclusion a non-pure GC would be a major hindrance to deeper
 interaction.

 Ok, I found the answer worth a discussion here; as far as I understood, they
 don't oppose the idea that the PostgreSQL GC might be a candidate for an
 update. I see three issues:

 (a) The most open question to me is the gain from the Haskell perspective;
 most critical: Would a Haskell GC inside PostgreSQL mean a significant
 change or rather a drop in the bucket? Once this may be answered
 optimistically, there comes the question about possible applications --
 i.e., what can be done with such a DBMS system. Knowing about efforts like
 (http://groups.inf.ed.ac.uk/links/) I would like to let this open for
 discussion.

 Let me please drop here a quote that I believe their object relational
 efforts seem to have gotten stuck at PostgreSQL due to the conceptual clash
 of OO with the relational algebra underlying PostgreSQL -- which in turn
 seems to harmonize much better with Hindley-Milner  Co. (System F??)

 (b) The question I personally can say least about are the expenditures to be
 expected for a such project. I would be very interested in some statements.
 I have limited knowledge about the PostgreSQL GC and would assume it is much
 simpler than, e.g. the GHC GC.

 (c) Gain from PostgreSQL perspective: This IMO should be answered easiest,
 hoping the Haskell GC experts to be able to answer easily how much is the
 overhead to be payed for pure declarativity, and the chances (e.g.
 parallelism, multi cores??), too.

 Besides it might be interesting to see inhowfar a considerable overhead
 problem may be alleviated by a 'plugin' architecture allowing future
 PostgreSQL users to switch between a set of GCs.

 I would be very interested about any comments, Cheers, Nick

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] HDBC's future and request for help

2011-02-23 Thread Leon Smith
This seems a timely email,  as I've been submitting a steady-ish
trickle of patches to HDBC-postgresql lately.   Honestly,  I'm rather
dissatisfied with HDBC in many respects,   but I don't have a very
good idea of what a (low-level) database access library for Haskell
*should* be,  and I've found HDBC to be the least worst option
available at the moment.   HSQL doesn't support parameterized queries,
 making SQL injection issues tedious and error prone.And while
Takusen looks rather interesting,  I find the documentation and
examples lacking,   I find the API a bit too esoteric for my current
tastes,  and I'm fairly certain that Takusen cannot possibly work for
my application due to the way Takusen reclaims connections.

My biggest (mostly) fixable complaint with HDBC is that it hasn't
turned out to be a very complete or robust solution for accessing
databases that like to use PostgreSQL-specific features.   My biggest
complaint that (probably) isn't easily fixed is it's reliance on
Convertible,  and the use of lots of unsafe pattern matching and
exception-happy functions.

At least for the time being,  I've found it easiest and most expedient
to fix up HDBC.   I'm not particularly interested in taking over the
maintenance of HDBC,  and I am comfortable with model #1 at the time
being.   However if somebody else is interested in another option,
I'm probably ok with that too.

Best,
Leon


On Tue, Feb 22, 2011 at 11:50 AM, John Goerzen jgoer...@complete.org wrote:
 Hi folks,

 HDBC has been out there for quite some time now.  I wrote it initially to
 meet some specific needs, and from that perspective, it has been done for
 awhile.  It is clear, however, that there are some needs it doesn't meet.
  Most of them relate to specific backend driver items.

 I'd like to start some discussion in the community about what the future of
 HDBC and its backend drivers ought to look like.  Some models might be:

  1. I continue as maintainer for HDBC and HDBC-{postgresql,odbc,sqlite3} and
 act as patch manager/gatekeeper for patches that are discussed on some
 public mailing list.

  2. Interested parties adopt the backend drivers while I continue to act as
 maintainer/patch manager/gatekeeper for HDBC itself.

  3. Interested parties adopt all of HDBC and maintain it

 I am not expressing a particular preference for any of these options; just
 putting them forth.

 Here are some of the current issues I am aware of:

  1. I have no Windows development platform.  I can't test the releases on
 Windows.  Many Windows users don't have the skill to diagnose problems.
  These problems do eventually get fixed when a Windows user with that skill
 comes along -- and I appreciate their efforts very much! -- but it takes
 longer than it ought to.

  2. The ODBC documentation is monumentally terrible, and the API is perhaps
 only majestically terrible, and it is not 100% clear to me all the time that
 I have it right.  A seasoned ODBC person would be ideal here.

  3. Issues exist with transferring binary data and BLOBs to/from at least
 PostgreSQL databases and perhaps others.  There appear to be bugs in the
 backend for this, but BLOB support in the API may need to be developed as
 well.

  4. Although the API supports optimizations for inserting many rows at once
 and precompiled queries, most backends to not yet take advantage of these
 optimization.

  5. I have received dueling patches for whether foreign imports should be
 marked safe or unsafe on various backends.  There seems to be
 disagreement in the community about this one.

  6. Many interactions with database backends take place using a String when
 a more native type could be used for efficiency.  This project may be rather
 complex given varying types of column data in a database -- what it expects
 for bound parameters and what it returns.  The API support is there for it
 though.

  7. Various other more minor items.

 Thoughts?

 -- John

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] HANSEI in Haskell?

2011-02-23 Thread Leon Smith
On Wed, Feb 23, 2011 at 10:52 AM, Chung-chieh Shan
ccs...@cs.rutgers.edu wrote:
 Mostly we preferred (as do the domain experts we target) to write
 probabilistic models in direct style rather than monadic style.
 Haskell's laziness doesn't help -- in fact, to avoid running out of
 memory, we'd have to defeat that memoization by sprinkling () -
 throughout the types.

I don't think that () - is even guaranteed to work...

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] The implementation of Control.Exception.bracket

2011-01-31 Thread Leon Smith
There is a common idiom used in Control.Concurrent libraries,  as
embodied in the implementation of bracket:

http://www.haskell.org/ghc/docs/7.0-latest/html/libraries/base-4.3.0.0/src/Control-Exception-Base.html#bracket

bracket before after thing =
  mask $ \restore - do
a - before
r - restore (thing a) `onException` after a
_ - after a
return r


Is there any particular reason why bracket is not implemented as:

bracket before after thing =
  mask $ \restore - do
a - before
r - restore (thing a) `finally` after a
return r

Is there some subtle semantic difference?   Is there a performance
difference?   It seems like a trivial thing,  but I am genuinely
curious.

Best,
Leon

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Installing a top-level handler on an existing thread

2011-01-24 Thread Leon Smith
I've been toying with a little thread manager library,  and for
sanity's sake I really need a way to install another top-level
exception handler on an existing thread.I don't want to replace
any other handlers,  just put my own handler around the thread's
continuation.   Of course,  it would be pretty easy to do if the IO
monad had a nice efficient implementation of call/cc and call/cc1,
although I don't need the full power of continuations.

I really don't want to create my own Thread monad (which could use
ContT),  I want to be able to do it on plain old IO threads.   Is
there any way to do this in GHC?

Best,
Leon

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] HDBC, postgresql, bytestrings and embedded NULLs

2011-01-17 Thread Leon Smith
On Sat, Jan 8, 2011 at 11:55 AM, Michael Snoyman mich...@snoyman.com wrote:

 In general I think it would be a good thing to have solid, low-level bindings 
 to PostgreSQL.

Well, there is PostgreSQL and libpq on hackage:

http://hackage.haskell.org/package/libpq
http://hackage.haskell.org/package/PostgreSQL

The PostgreSQL looks like it's in need of maintenance,  and hasn't
been updated in a few years.   libpq is new,  and looks promising.   I
haven't really used either one, so I can't really say too much about
either.

Best,
Leon

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OpenSSL question

2011-01-01 Thread Leon Smith
I don't believe that HsOpenSSL offers support for creating your own
SSL keys programmatically from Haskell.   Do you actually need to
generate keys programmatically?   If not,  you could manually use
OpenSSL's command line tools;  if your needs are simple enough it
shouldn't be too hard to spawn a subprocess to generate the keys.

Best,
Leon

On Sat, Jan 1, 2011 at 5:49 PM, Charles-Pierre Astolfi c...@crans.org wrote:
 Hi -cafe,

 I'm using the OpenSSL package, and I don't see any way to forge my own keys.
 (http://hackage.haskell.org/packages/archive/HsOpenSSL/0.9/doc/html/OpenSSL-RSA.html)
 For example, in the case of RSA, I'm given the exponent and modulus
 and I would like to create a public key from that, of type RSAPubKey.
 Is it possible ? How should I do?

 I'm not particularly tied to OpenSSL, anything that let me use RSA and
 DSA would be fine.

 Regards,
 --
 Cp

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] V.I.P.s and the associativity of merge'

2010-12-29 Thread Leon Smith
Ok,  after mulling over the issues that Will Ness has brought up in
the last few days [1],  I think I have a partial explanation for the
apparent tension between Will's observations and Heinrich Apfelmus's
Implicit Heaps article [2],  which both concern the implementation of
mergeAll [3].

The merge' function takes two ordered lists,  with the head of the
first list less than the head of the second,  and merges their
contents:

   merge' [] ys = ys
   merge' (x:xs) ys = x : merge xs ys

The nice thing about this function is we can merge an infinite number
of lists by folding right,  if assume that the heads of those lists
are appropriately ordered.  This appears in Richard Bird's code at the
end of Melissa O'Neill's Genuine Sieve of Eratosthenes,  though
undoubtedly this observation has been independently made by many
people.

Now,  in an ordinary sense,   merge' *is* an associative operator,  in
that given three fully defined ordered lists with ordered heads,
merge' xs (merge' ys zs) == merge' (merge' xs ys) zs.   This allows us
to merge an infinite number of lists using an arbitrary tree of merge'
operations,  without ever worrying that we will return an incorrect
result.   (However,  we might get stuck in a non-productive loop,  for
example if we fold left over an infinite list of ordered lists)

However,  Heinrich's article uses a stronger sense of associativity
which includes laziness properties and thus captures some operational
characteristics.  In this sense,  merge' *is not* associative.To
remedy this,  Heinrich uses VIPs to strengthen the associativity
property of merge'.   This raises the question,  is there some
combination of the shape of the merge' tree and some input for which
using VIPs dramatically changes the efficiency of a mergeAll
operation?   I suspect the answer is yes,  though I don't know for
sure at this point in time.

However,  I do tacitly believe that the current tree that mergeAll
uses doesn't exhibit this property for any input,   and so I have
simplified the implementations of mergeAll and unionAll in the latest
version of data-ordlist-0.4.4 by avoiding the use of VIPs.   This has
the nice side benefit of modestly improving performance when the
elements of the result are highly biased towards the first list.

Best,
Leon

[1] http://permalink.gmane.org/gmane.comp.lang.haskell.cafe/84666
[2] http://apfelmus.nfshost.com/articles/implicit-heaps.html
[3] 
http://hackage.haskell.org/packages/archive/data-ordlist/0.4.4/doc/html/Data-List-Ordered.html#v:mergeAll

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] HDBC-postgresql and safe/unsafe FFI calls

2010-09-02 Thread Leon Smith
On Thu, Sep 2, 2010 at 1:00 AM, David Powell da...@drp.id.au wrote:
 Thanks Jason, I think I had read that - I quite enjoy Edward's posts.
 Re-reading, seems to confirm what I thought, most (all?) of the FFI calls in
 HDBC-postgresql should be changed to safe.

Wouldn't that require thread safety on the part of libpq?   Versions
8.4 and earlier requires the library to be configured with the option
--enable-thread-safety,  although version 9.0 has thread safety turned
on by default.

http://www.postgresql.org/docs/8.4/static/libpq-threading.html
http://www.postgresql.org/docs/9.0/static/libpq-threading.html

At least the Ubuntu PostgreSQL 8.4 packages are compiled with thread
safety,  but I can't speak for other distributions.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Projects that could use student contributions?

2010-09-02 Thread Leon Smith
There is a lot of room for improvement to my NumberSieves package.
The package consists of algorithms I extracted and polished up from
when I was working on Project Euler problems.   It makes solving a
number of problems into quick five minute affairs.  At some point I
would probably do it myself,  but I don't have a pressing reason to do
it.

http://hackage.haskell.org/package/NumberSieves

Possible Improvements I have in mind:
   1.  Make things faster
   2.  Reduce memory requirements
   3.  Parallelize the sieves
   4.  Incrementalize the Factor and Phi sieves,  so that an explicit
upper bound is no longer required
   5.  Remove limitations so that Factor and Phi can sieve  beyond 2^32

It would be a small and self-contained project,  suitable for anybody
who has an interest in Haskell and mathematics!   If somebody is
interested,  by all means contact me via email,  and I will share the
approach I had in mind.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] PDF generation?

2010-06-08 Thread Leon Smith
The fonts aren't rasterized,  but PDFs that were converted from PS
tend to look awful in almost any PDF viewer other than Adobe's Acrobat
Reader.  Fonts look especially bad.

I don't know exactly what the problem is, but my experience is that
you are best off generating PDF directly,  and using Acrobat Reader on
any document that looks bad otherwise.   Not to mention that PDF has a
document model and supports things such as hyperlinks,  which
Postscript does not,  which is another reason to avoid PostScript.

Best,
Leon


On Tue, Jun 1, 2010 at 2:45 PM, Yitzchak Gale g...@sefer.org wrote:
 I wrote:
 I have often generated PostScript from Haskell...
 Then you convert the PS to PDF using any of the nice
 utilities around for that

 Pierre-Etienne Meunier wrote:
 Isn't there a problem with non-type 1 vectorial fonts being
 rasterized during this conversion ?

 No.

 PDF is just a simplified, compressed encoding of PostScript.
 Unless there is some special reason to do so, why would
 a conversion utility go to the trouble of rasterizing fonts
 instead of just copying them in?

 Perhaps something like ImageMagick might do that; its
 internal format is a raster.

 Regards,
 Yitz
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Compressing GHC tarballs with LZMA

2010-04-23 Thread Leon Smith
Out of curiousity,  I downloaded a binary distribution for GHC-6.12.2,
 and tried compressing and recompressing it with bzip2 and lzma
compression,  using no command line arguments (all default parameters)

file:  
http://haskell.org/ghc/dist/6.12.2/ghc-6.12.2-x86_64-unknown-linux-n.tar.bz2

bzip2  lzma
size106M   72M
compression time2:01   9:51
decompression time  0:25   0:12

the time is reported as  minutes:seconds.   At it default
compression level,  lzma was almost five times slower than bzip2,  but
it's approximately twice as fast when decompressing and produced a
file nearly a third smaller.   Given that many modern variants of the
tar command support .tar.lzma files directly,  and that both
debian's dpkg and red hat's rpm now use lzma,   maybe it's time to
consider offering .tar.lzma as a download option for GHC's binary (and
maybe source-code) bundles.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Break Function Using Lazy Pairs

2010-04-07 Thread Leon Smith
That example doesn't particularly tie the knot,  unless you count
the fact that break is itself a recursive function.  Usually tie
the knot refers to some kind of circular programming,  i.e. a
self-referential data structure,  or explicit use of Data.Function.fix
to produce a recursive function  (as is useful in e.g. dynamic
programming)

Also,  you aren't understanding the laziness of the return product;
instead you are still thinking of this example in terms of eager
evaluation as almost every other programming language uses.   A better
approximation of what is going on could be represented textually as:

-- break hi\nbye
-- ( 'h' : ys, zs )   where (  ys, zs  ) = break i\nbye
-- ( 'h' : 'i' : ys, zs )   where (  ys, zs  ) = break \nbye
-- ( 'h' : 'i' : [] , bye)

Of course,  if you want to get more pendantic,  I've glossed over
steps involving the conditional and something resembling
beta-reduction.   Incidentally,  it's the latter part I omitted which,
 naively implemented,  creates the space leak referred to in the
original post.

Best,
Leon

On Mon, Apr 5, 2010 at 5:19 PM, aditya siram aditya.si...@gmail.com wrote:
 Hi all,
 For the past couple of weeks I've been trying to understand
 tying-the-knot style programming in Haskell.

 A couple of days ago, Heinrich Apfelmus posted the following 'break'
 function as part of an unrelated thread:
 break []     = ([],[])
 break (x:xs) = if x == '\n' then ([],xs) else (x:ys,zs)
               where (ys,zs) = Main.break xs

 I've stepped through the code manually to see if I understand and I'm
 hoping someone will tell me if I'm on the right track:
 -- break hi\nbye =
 --    let f1 = (break i\nbye)
 --           = let f2 = (break \nbye)
 --                    = ([],bye)
 --             ('i' : fst f2, snd f2) = ('i' : [], bye)
 --    ('h' : fst f1, snd f1) = ('h' : 'i' : [], bye)
 --                           = (hi,bye)

 -deech
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Hughes' parallel annotations for fixing a space leak

2010-04-04 Thread Leon Smith
On Wed, Mar 31, 2010 at 3:51 PM, Heinrich Apfelmus
apfel...@quantentunnel.de wrote:
 which were introduced by John Hughes in his Phd thesis from 1983. They
 are intriguing! Unfortunately, I haven't been able to procure a copy of
 Hughes' thesis, either electronic or in paper. :( Can anyone help? Are
 there any other resources about this parallel approach?

Aye,  returning lazy pairs is one of those things that seems rather
magical in several respects.   Out of curiousity,  have you looked at
the unsafePerformIO thought experiment near the end of my Monad Reader
article?   It demonstrates that returning lazy pairs can introduce
multiple code paths through a single function,  reminiscent of (but
different than) multi-mode logical relations.   (Mercury, for example,
 optimizes relations differently depending on their mode.)

I too am interested in looking at Hughes' thesis,  I tried tracking it
down early last year with little success.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Are there any female Haskellers?

2010-03-27 Thread Leon Smith
On Sat, Mar 27, 2010 at 1:56 PM, Jason Dagit da...@codersbase.com wrote:
 For some reason it started out as a male dominated field.  Let's assume
 for cultural reasons.  Once it became a male dominated field, us males
 unknowingly made the work and learning environments somewhat hostile
 or unattractive to women.  I bet I would feel out of place if I were the only
 male in a class of 100 women.

Is this really true?  I've heard rumors that in the early days of
programming, that women were in the majority,  or at least they
represented a much greater proportion of programmers than they do now.
  I seem to recall that this started to change sometime in the 60s.
Of course,  I can't recall when or where I heard these stories,   and
I'm not sure that my source was reliable,   so I might be completely
off on this count.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Takusen sqlite3 insert is very slow

2010-03-21 Thread Leon Smith
Using PostgreSQL on my computer,  your code executes in 3.5 seconds
with GHCi,  while compiled it executes in 16.2 seconds!   Clearly
something is wrong,  although I don't yet know enough about Takusen
enough to be able to say what.

I tried hoisting the preparation of the statement out of the loop,
however,  Takusen throws an error after the first insertion.  However,
 the insertion was apparently successful,  as a single row shows up in
the table afterwards.   Here's sample code and a sample shell session:

Best,
Leon

 import Database.PostgreSQL.Enumerator
 import Database.Enumerator
 import Control.Monad(forM_)

 main = do
   withSession
 (connect [ CAhost localhost
  , CAport 5432
  , CAuser takusen
  , CApassword takusen
  , CAdbname test  ])
 $ do
 execDDL (sql create table x (y text))
 let query = sql insert into x (y) values (?)
 stmt  = prepareQuery testinsert query [bindType (0::Int)]
 forM_ ([1..1] :: [Int]) $ \x - do
  withPreparedStatement stmt $ \pstmt - do
withBoundStatement pstmt [bindP x] $ \bstmt - do
  execDML bstmt
  return ()

$ psql test
psql (8.4.2)
Type help for help.

test=# select * from x;
ERROR:  relation x does not exist
LINE 1: select * from x;
  ^
test=# \q
$ ghci Tak.lhs
GHCi, version 6.12.1: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
[1 of 1] Compiling Main ( Tak.lhs, interpreted )
Ok, modules loaded: Main.
(0.09 secs, 40984568 bytes)
*Main main
Loading package mtl-1.1.0.2 ... linking ... done.
Loading package old-locale-1.0.0.2 ... linking ... done.
Loading package old-time-1.0.0.3 ... linking ... done.
Loading package time-1.1.4 ... linking ... done.
Loading package Takusen-0.8.5 ... linking ... done.
*** Exception: DBError (,) 1 
*Main :q
Leaving GHCi.
$ psql test
psql (8.4.2)
Type help for help.

test=# select * from x;
 y
---
 1
(1 row)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] developing against privately patched libraries, and cabal

2010-03-21 Thread Leon Smith
As somebody who's hacked on cabal-install a bit  (but don't have a
worthwhile patch to contribute (yet?)),  I can tell you that versions
support a tag structure,  at least internally,  but I haven't seen a
non-empty tags field and don't know how to make the tags field
non-empty.   For that I'd have to go source-code diving again.

http://www.haskell.org/ghc/docs/latest/html/libraries/Cabal-1.8.0.2/Distribution-Version.html

Best,
Leon

On Sun, Mar 21, 2010 at 10:58 AM, Dougal Stanton
dou...@dougalstanton.net wrote:
 If you're making local changes against a library you don't own (with
 the ultimate intention of sending those changes back upstream to the
 library maintainer) it makes sense change the version number to avoid
 clashes with the canonical version of the library.

 Of course, it's easy to lose track and end up publishing your own
 program against a non-existent (outside your hard disk) version of the
 library. I'd like to make it very obvious, both in mypogram.cabal and
 library.cabal that one is a patched copy and the other has to be
 compiled against a patched copy.

 Does cabal provide any way of marking a version private? I thought
 initially to just mark the version field in the patched library as
 X.y-dougal and enforce my program to compile against that, but it
 doesn't seem to recognise the -dougal suffix there.

 Thoughts?

 D

 --
 Dougal Stanton
 dou...@dougalstanton.net // http://www.dougalstanton.net
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A small oversight

2010-02-20 Thread Leon Smith
On Sat, Feb 20, 2010 at 5:47 AM, Andrew Coppin
andrewcop...@btinternet.com wrote:

  sortOn :: (Ord y) = (x - y) - [x] - [x]
  sortOn foo = sortBy (compare `on` foo)


Incidentally,  this function is provided as Data.List.Ordered.sortOn'
in the data-ordlist package...


On Sat, Feb 20, 2010 at 7:39 AM, Ben Millwood hask...@benmachine.co.uk wrote:
 But it would still be useful to have sortOn et al to capture the
 common technique when your sorting property is potentially expensive
 (sortOn length, for example):

 sortOn f = map fst . sortBy (comparing snd) . map (\x - (x, f x))

 a technique which I believe is called a Schwar[t]zian transform.

An older name for this technique is decorate-sort-undecorate.
Data-ordlist also provides this as Data.List.Ordered.sortOn

http://hackage.haskell.org/package/data-ordlist

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Implementing unionAll

2010-02-18 Thread Leon Smith
On Thu, Feb 18, 2010 at 2:32 AM, Evan Laforge qdun...@gmail.com wrote:
 By purest coincidence I just wrote the exact same function (the simple
 mergeAll', not the VIP one).  Well, extensionally the same...
 intensionally mine is 32 complicated lines and equivalent to the 3
 line mergeAll'.  I even thought of short solution by thinking that
 pulling the first element destroys the ascending lists property so
 it's equivalent to a normal sorted merge after that, and have no idea
 why I didn't just write it that way.

Well, the three line version wasn't my first implementation,  by any
stretch of the imagination.   I know I had tried to implement mergeAll
at least once,  if not two or three times before coming up with the
foldr-based implementation.   However,  I can't find any of them;
they may well be lost to the sands of time.

Incidentally,  that implementation also appears in Melissa O'Neill's
Genuine Sieve of Eratosthenes,   in an alternate prime sieve by
Richard Bird that appears at the end.

 Anyway, I'm dropping mine and downloading data-ordlist.  Thanks for
 the library *and* the learning experience :)

Thanks!

 BTW, I notice that your merges, like mine, are left-biased.  This is a
 useful property (my callers require it), and doesn't seem to cost
 anything to implement, so maybe you could commit to it in the
 documentation?

Yes,  the description of the module explicitly states that all
functions are left-biased; if you have suggestions about how to
improve the documentation in content or organization,  I am interested
in hearing them.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Implementing unionAll

2010-02-18 Thread Leon Smith
On Thu, Feb 18, 2010 at 3:07 AM, Evan Laforge qdun...@gmail.com wrote:
 BTW, I notice that your merges, like mine, are left-biased.  This is a
 useful property (my callers require it), and doesn't seem to cost
 anything to implement, so maybe you could commit to it in the
 documentation?

Also, I did briefly consider giving up left bias.  GHC has an
optimization strategy that seeks to reduce pattern matching,  and due
to interactions with this I could have saved a few kilobytes of -O2
object code size by giving up left-bias.

For example:

module MergeLeft where

mergeBy :: (a - a - Ordering) - [a] - [a] - [a]
mergeBy cmp = loop
  where
loop [] ys  = ys
loop xs []  = xs
loop (x:xs) (y:ys)
  = case cmp x y of
 GT - y : loop (x:xs) ys
 _  - x : loop xs (y:ys)

compiles ghc-6.12.1 -O2 to a 4208 byte object file for x64 ELF.   By
changing the very last line to:

 _  - x : loop (y:ys) xs

I get a 3336 byte object file instead,  but of course this is no
longer left- (or right-) biased.Repeat this strategy across the
entire module,  and you can save 3 kilobytes or so.   However,  in
today's modern computing environment,  left-bias is clearly a greater
benefit to more people.

If you are curious why,  I suggest taking a look at GHC's core output
for each of these two variants.   The hackage package ghc-core
makes this a little bit more pleasant,  as it can pretty-print it for
you.

It's amazing to think that this library,  at 55k (Optimized -O2 for
x64),  would take up most of the memory of my very first computer,  a
Commodore 64.   Of course,  I'm sure there are many others on this
list who's first computers had a small fraction of 64k of memory to
play with.  :-)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Implementing unionAll

2010-02-17 Thread Leon Smith
On Wed, Feb 17, 2010 at 6:58 AM, Heinrich Apfelmus
apfel...@quantentunnel.de wrote:

 Ah, I meant to use the  union'  from your previous message, but I think
 that doesn't work because it doesn't have the crucial property that the case

    union (VIP x xs) ys = ...

 does not pattern match on the second argument.

Ahh yes,  the funny thing is that I tested the code in my previous
message,  and it worked in the infinite case.   Then I replaced the
union' to pattern match on the second argument as well,  and tested it
on only finite cases,  and then released it.Thus,  unionAll in
data-ordlist-0.4.1 doesn't work on an infinite number of lists.

So my original unionAll in data-ordlist-0.4  appears to work ok,   my
revised and simplified unionAll doesn't work at all.

 The easiest solution is simply to define

    unionAll = nub . mergeAll
        where
        -- specialized definition of  nub
        nub = map head . groupBy (==)

Incidentally,  data-ordlist has a (slightly different) version of nub
that does exactly what you want in this particular case.Check out
the documentation for nub and nubBy

 But you're probably concerned that filtering for duplicates afterwards
 will be less efficient. After all, the (implicit) tree built by
 mergeAll  might needlessly compare a lot of equal elements.

Well,  yes and no.   Efficiency is good,  but this implementation does
not match my intention.For example:

unionAll [[1,1,2,2,2],[1,1,1,2]] == foldr union [] [...] == [1,1,1,2,2,2]

The union function preserves strictly ascending lists,  but it also
works on multisets as well,  returning an element as many times as the
maximum number of times in either list.Thus, on an infinite number
of lists,   unionAll should return a particular element as many times
as the maximum number of times it appears in any single list.

On Wed, Feb 17, 2010 at 1:18 PM, Daniel Fischer
daniel.is.fisc...@web.de wrote:
 Am Mittwoch 17 Februar 2010 18:59:42 schrieb Ozgur Akgun:
 Ooops I thought the inner lists are possibly of infinite size.


 Both, I think.

Yes,  both the inner and outer lists of an input to unionAll might be
infinite.It's just that

foldr union []

works fine if the inner lists are infinite,  but gets stuck in an
infinite non-productive list if the outer list is infinite.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Implementing unionAll

2010-02-17 Thread Leon Smith
On Wed, Feb 17, 2010 at 6:58 AM, Heinrich Apfelmus
apfel...@quantentunnel.de wrote:
 Ah, I meant to use the  union'  from your previous message, but I think
 that doesn't work because it doesn't have the crucial property that the case

union (VIP x xs) ys = ...

 does not pattern match on the second argument.

Ahh yes,   my original union'  has a bit that looks like this

union' (VIP x xs) (VIP y ys)
   = case cmp x y of
   LT - VIP x (union' xs (VIP y ys))
   EQ - VIP x (union' xs ys)
   GT - error Data.List.Ordered.unionAll:  assumption violated!
union' (VIP x xs) (Crowd ys) = VIP x (union' xs (Crowd ys))

For whatever reason, this works in the case of an infinite number of
lists with my original version,  but not the simplified version.  By
applying a standard transformation to make this lazier,  we can
rewrite these clauses as

   union' (VIP x xs) ys
  = VIP x $ case ys of
 Crowd _ - union' xs ys
 VIP y yt - case cmp x y of
  LT - union' xs ys
  EQ - union' xs yt
  GT - error msg

In the original case,  we have this strictness property

   union' (VIP x xs) ⊥ == ⊥

The revised verison is a bit lazier:

   union' (VIP x xs) ⊥ == VIP x ⊥

And so the simplified unionAll now works again on an infinite number
of lists.   I've uploaded data-ordlist-0.4.2 to fix the bug introduced
with data-ordlist-0.4.1,   and added a regression test to the suite.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Implementing unionAll

2010-02-16 Thread Leon Smith
 I see no obvious deficiencies. :) Personally, I'd probably structure it like

   http://www.haskell.org/haskellwiki/Prime_numbers#Implicit_Heap

This variant,  based on the wiki article,  is cleaner,  slightly
simpler,  appears to be just as fast,  and allocates slightly less
memory:

 import GHC.Exts(inline)
 import Data.List.Ordered(unionBy)

 union' :: People Int - People Int - People Int
 union' (VIP x xt) ys= VIP x (union' xt ys)
 union' (Crowd xs) (Crowd ys)= Crowd (inline unionBy compare xs ys)
 union' xs@(Crowd (x:xt)) ys@(VIP y yt)  = case compare x y of
LT - VIP x (union' (Crowd xt) ys)
EQ - VIP x (union' (Crowd xt) yt)
GT - VIP y (union' xs yt)

 foldTree :: (a - a - a) - [a] - a
 foldTree f xs = case xs of
   [] - []
   xs - loop xs
  where
loop [x]= x
loop (x:xs) = x `f` loop (pairs xs)

pairs (x:y:ys) = f x y : pairs ys
pairs xs = xs

  unions xss = serve $ inline foldTree union' [ VIP x (Crowd xs) | (x:xs) - 
 xss ]
 where
 serve (VIP x xs) = x:serve xs
 serve (Crowd xs) = xs

One of the differences is that I started with a slightly different
foldTree,  one that was taken directly from Data.List.sort.

The only problem is that it has the same problem as I mentioned:

unionAll [[1,2],[1,2]]  == [1,1,2]

whereas unionAll is intended to be a generalization of foldr union
[] to an infinite number of lists,  and should thus return [1,2].
But I should be able to fix this without much difficulty.

 Your  loop  function is a strange melange of many different concerns
 (building a tree, union', adding and removing the VIP constructors).


 Note that it's currently unclear to me whether the lazy pattern match in

   pairs ~(x: ~(y:ys)) = f x y : pairs ys

 is beneficial or not; you used a strict one

   unionPairs (x:y:zs) = union' x y : unionPairs zs


Well,  as the library implementation must work on finite cases as
well,  the lazy pattern seems out of the question.

 If you're really concerned about time  space usage, it might even be
 worth to abandon the lazy tree altogether and use a heap to achieve the
 same effect, similar to Melissa O'Neils prime number code. It's not as
 neat, but much more predictable. :)

Well, it is intended as a high quality, generally useful
implementation,  so of course I care about time and space usage.  :)
 Dave Bayer's original algorithm does slightly better,  but was much
larger in terms of both source code and object size.

Omar implemented something along these lines,  but it didn't perform
so well.   I did not dig into the reasons why, though;  it might not
have had anything to do with the fact an explicit heap was used.

Incidentally,  I tried implementing something like implicit heaps once
upon a time;   but it had a severe performance problem,  taking a few
minutes to produce 20-30 elements.I didn't have a pressing reason
to figure out why though,  and didn't pursue it further.

Best,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Implementing unionAll

2010-02-13 Thread Leon Smith
With the urging and assistance of Omar Antolín Camarena,  I will be
adding two functions to data-ordlist:  mergeAll and unionAll,  which
merge (or union)  a potentially infinite list of potentially infinite
ordered lists,   under the assumption that the heads of the non-empty
lists appear in a non-decreasing sequence.

Union takes two sorted lists and produces a new sorted list;  an
element occurs in the result as many times as the maximum number of
occurrences in either list.   The unionAll function generalizes this
behavior to an infinite number of lists.

A reasonable implementation of mergeAll is:

 import Data.List.Ordered(merge, union)

 mergeAll :: Ord a = [[a]] - [a]
 mergeAll = foldr (\(x:xs) ys - x : merge xs ys) []

However,  for many inputs,  we can do better;   the library
implementation of mergeAll is based on H. Apfelmus's article Implicit
Heaps,  which presents a simplification of Dave Bayer's venturi
algorithm.   The difference is that the foldr version uses a line of
comparisons, whereas venturi uses a tree of comparisons.

http://apfelmus.nfshost.com/articles/implicit-heaps.html
http://www.mail-archive.com/haskell-cafe@haskell.org/msg27612.html

However,  as Omar pointed out to me,  the following implementation of
unionAll has a flaw:

 unionAll :: Ord a = [[a]] - [a]
 unionAll = foldr (\(x:xs) ys - x : union xs ys) []

Namely unionAll [[1,2],[1,2]] should return [1,2],  whereas it
actually returns [1,1,2].   After some work,  I believe I have
generalized H. Apfelmus's algorithm to handle this;  however it seems
a bit complicated.   I would love feedback,  especially with regard to
simplifications,  bugs,  testing strategies,  and optimizations:

 unionAll' :: Ord a = [[a]] - [a]
 unionAll' = unionAllBy compare

 data People a = VIP a (People a) | Crowd [a]

 unionAllBy :: (a - a - Ordering) - [[a]] - [a]
 unionAllBy cmp xss = loop [ (VIP x (Crowd xs)) | (x:xs) - xss ]
   where
 loop [] = []
 loop (  VIP x xs  :  VIP y ys  :  xss  )
   = case cmp x y of
   LT - x : loop (  xs  :  VIP y ys  :  xss  )
   EQ - loop (  VIP x (union' xs ys)  :  unionPairs xss  )
   GT - error Data.List.Ordered.unionAll:  assumption violated!
 loop (  VIP x xs  :  xss  )
   =  x : loop (xs:xss)
 loop [Crowd xs] = xs
 loop (xs:xss) = loop (unionPairs (xs:xss))

 unionPairs [] = []
 unionPairs [x] = [x]
 unionPairs (x:y:zs) = union' x y : unionPairs zs

 union' (VIP x xs) (VIP y ys)
= case cmp x y of
LT - VIP x (union' xs (VIP y ys))
EQ - VIP x (union' xs ys)
GT - error Data.List.Ordered.unionAll:  assumption violated!
 union' (VIP x xs) (Crowd ys) = VIP x (union' xs (Crowd ys))
 union' (Crowd []) ys = ys
 union' (Crowd xs) (Crowd ys) = Crowd (unionBy cmp xs ys)
 union' xs@(Crowd (x:xt)) ys@(VIP y yt)
= case cmp x y of
LT - VIP x (union' (Crowd xt) ys)
EQ - VIP x (union' (Crowd xt) yt)
GT - VIP y (union' xs yt)

-- Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: data-ordlist-0.2

2010-02-07 Thread Leon Smith
On Sun, Feb 7, 2010 at 6:43 AM, Ross Paterson r...@soi.city.ac.uk wrote:

 Why not wrap lists as Set and Bag abstract datatypes?  An added bonus
 is that you could make them instances of Monoid.

Well,  my current thinking is that if you really want an abstract
datatype for bags and sets,  hackage (and the standard GHC
distribution) offer a multitude of options.   Previous versions of the
code in data-ordlist dates back 9 years or more;  I extracted and
refurbished data-ordlist from a mess of miscellaneous list functions I
use.

Personally,  I've never really used this code as an abstract datatype;
 my typical use cases break the set and bag abstractions to some
extent. I use a few functions when I happen to know the lists I'm
dealing with are ordered,  and want a simple and efficient way to
manipulate them.  The functions I most often use are sortOn,  sortOn',
 nubSort,  and to a somewhat lesser extent,  the set-like operators.

I put this package on hackage as much for my own personal convenience
as for others;  but I do hope that other people will find it useful.
I realize that others might use it in rather different ways than I do,
 and am open to suggestions and proposals.

Cheers,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANN: data-ordlist-0.2

2010-02-06 Thread Leon Smith
I have released data-ordlist 0.2,  with a number of changes:

1.  The module name is now Data.List.Ordered,  instead of Data.OrdList

2.  Three bugfixes: (ack!)  insertSet and insertBag assumed reverse-ordered
lists,   nub failed to remove duplicates.   Thanks to Topi Karvonen for
reporting the first problem.

3.  One semantic change:  old_nubBy f == new_nubBy (not . f).   The new
version is better keeping with the spirit of the rest of the library,  and
makes the old nub bug much more obvious.  Now nubBy is the greedy algorithm
that returns a sublist such that for all binary predicates:

   isSortedBy pred (nubBy pred xs) == True

4.  Improved documentation,  I hope!   Please consider taking a look and
letting me know what you think.

http://hackage.haskell.org/package/data-ordlist

Cheers,
Leon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: homework help: upn parser

2003-12-15 Thread Leon Smith
Hi Max!

This is a type definiton, which says that parsing takes one argument,  a 
String, and gives you back a list of OpTree:

parsing :: String-[OpTree]

This is a function definition.   The part before the = is called the 
left hand side, while the part after the = is called the right hand side.

parsing (a:b:c:xs) t
 |isDigit b = ((parsing xs)) ((Term (read a) c (read b)):t)
 

The (a:b:c:xs) is an example of a pattern.   You are using the list 
constructor (:)  to break down the list into  individual items.   
According to your type definition,  this argument is of type String, 
which is the same thing as [Char].Thus,  a,  b, and c are bound to 
single characters,  while xs is the bound to the rest of the string, 
assuming  your string has at least three characters.   If your string 
has less than three characters,  then this pattern fails and the next 
clause in your function definiton will be tried. 
Then, according to your function definition,  you then bind the second 
argument of the function to the variable t.   However,  your type 
definition says you only have one argument.   Since they disagree, this 
is an error. 
The right hand side of your function definition contains lots of errors, 
as you suggest.

You probably want to rethink your type definition.   I'd suggest the 
following:

parsing :: String - [OpTree] - OpTree

This says that parsing is a function that takes two arguments,   a list 
of characters and a list of OpTrees,  and returns a single OpTree.   You 
are going to treat the second list argument like a stack.   When you run 
across a number in your string,  you are going to take the number off 
the string and push it onto the stack.  When you run across an operation 
in your string,  you are going to take two elements off the stack, 
combine the elements from the stack using the operation,  and push the 
result back on the stack.Thus,  your solution would contain this 
kind of structure:

parsing  ( string_with_number_in_front )  ( stack )
   = parsing ( string_without_number )  ( Number (the_number) : stack )
parsing  ( string_with_operation_in_front ) (last_element : next_element 
: rest_stack)
   = parsing  ( string_without_operation )   ( Term last_element 
(the_operation) next_element : rest_stack )

Now,  you must consider the boundary cases of the function parsing.   
What I mean is,  what kinds of clauses do you need for when you are done 
with recursion,  and what should your first call to this function look 
like?

Of course,  you need to try out your code and carefully think about what 
it's doing.  When asking for homework help on haskell-cafe,  Haskellers 
have been known to introduce small intentional errors into their advice.

Also,  it's very impressive that Paul Natorp high school is using 
Haskell in it's coursework!   In the United States,  I've never heard of 
a high school that takes functional programming seriously.   Sadly, many 
American universities don't take functional programming seriously 
either.   Americans are really behind the game in this regard.

Hope this helps.

best,
leon
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


The Curry-Howard Isomorphism

2002-08-19 Thread Leon Smith

For an explanation why runST has the type it does, see page 4 of Simon 
Peyton-Jones and John Launchbury's paper Lazy functional state threads

http://research.microsoft.com/Users/simonpj/Papers/lazy-functional-state-threads.ps.Z

On Friday 16 August 2002 23:57, Scott J. wrote:
runST :: forall a ( forall s ST s a) - a ? 

In logic forall x forall y statement(x.y) is equivalent to: 
  forall y forall x statement(x,y).

True, but that isn't runST's type.   forall a quantifies over the entire 
type, whereas forall s quantifies over the left hand side of the 
implication.  In other words, you have forall a. statement(a), where 
statement includes a substatement forall s. substatement(s,a)

Now, using a different argument, since s does not appear free on the R.H.S 
of the implication,  forall s can be moved outside the implication to 
quantify over the entire type.

However, this brings up an important point about the Curry-Howard 
Isomorphism, which basically says that types are statements in first-order 
logic, and the corresponding values are proofs of those statements.   

When we work in logic, we care mainly about the *statements*.  Proofs are 
ways of justifying new statements.   Equivalent statements are in a sense 
equal, and can usually be freely substituted for each other.  When we have 
a statement such as P /\ Q, where P and Q are statements such as John is 
wearing a kilt and John is Scottish, it doesn't really matter which order 
they come in.  Mathematicians generally understand that /\ is commutative, 
and thus we can use P /\ Q and Q /\ P interchangeably throughout our 
discussion.  (Unless you are writing proofs in certain highly formal proof 
systems, such as Coq, Isabelle, or Haskell's type system)

When we work in type theory, we care mainly about the *proofs*.  Statements 
are ways of constraining our proofs to ensure they are at least somewhat 
sensible.  When we have a type (Int, Char), where Int is the statement my 
value is a 32-bit integer, and Char says my value is a character, order 
does matter.  Here, (4,'H') is a proof of the proposition (Int,Char), and 
('H', 4) is a proof of the proposition (Char, Int).   Although (Int, Char) 
= (Char, Int)  are logically equivalent, (4,'H') is a different value than 
('H', 4).This equivalence  can be proven using the function:  

swap :: (a,b) - (b,a)
swap (x,y) = (y, x)

Which, using the Curry-Howard isomorphism, is also a proof that /\ is 
commutative.

In short, just because a type a implies type b, and type b implies type 
a, doesn't mean they are interchangeable as statements are in logic.

Hope this helps,
leon
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe



Re: Haskell Problem

2000-10-10 Thread Leon Smith

What you need to do is write a function that operates on a string that does
what you want it to,  and then use that to write some top-level I/O code.
If you have a function  sortFile :: String - String, you would write
something like this for main:

main :: IO ()
main = do
string - getContents "theFile"
putStr (sortFile string)

You can treat "string"  as a variable that has type String, not IO String,
which you can use anywhere you want in "main".  Keep in mind, though that
what is going on here is quite different than an assignment statement or
"converting" a IO String to a String.

This is not like the single assignment variables introduced in "where" or
"let" clauses, as we cannot substitute the value "(getContents "theFile")"
for the variable "string" in main.   This would lead to a type error, as
sortFile takes a String argument, not an IO String.

Nor is is it like the assignment statement in imperative programming
languages like C++ and Java for several reasons.   One can represent  "State
Transformers"  using monads, so what the IO monad is a state transformer
that modifies the state of the computer.

int a = 0;
int dirty_inc(int a) {
a++;
return i + a;
}

int main(int argc, char ** argv) {
int i = dirty_inc(1);
printf("%i %i", i, i);
}

Unlike monads, if you "substitute" dirty_inc(1) for i in main will result in
a legal program, but it isn't really a substitution, because it would modify
the behavior of the program.  Moreover, while we could write

main = do
message - return "Hello!"
message - return "Goodbye!"
putStr message

and get "Goodbye!" as output, what really is happening is that you are
introducing two variables with the same name, and we can statically
determine which one we are referring to.  Thus if we write

main = do
message - return "Hello!"
do
message - return "Goodbye! "
putStr message
putStr message

we will get "Goodbye! Hello!",  as output, not "Goodbye! Goodbye!".

To start to understand what's really going on,  do-notation is just
syntactic sugar for using the (=) operator.  Let's rewrite your example to
something that is syntactically equivalent:

main :: IO ()
main = getContents "theFile" = (\string - putStr (sortFile string))

Which we could in turn rewrite as:

main :: IO ()
main = getContents "theFile" = output_sort

output_sort :: String - IO ()
output_sort string = putStr (sortFile string)

What (=) does is that it takes the String returned inside of a IO String
value, and gives it to output_sort, which in turn may use that value in any
way it sees fit, *as long as output_sort returns another "IO a" value for
some type a.*This is why we are not simply converting a IO String to a
String, because in order to use the String value in IO String, we must
produce a new IO monad.   This is summed up in (=)'s type, which is  (=)
:: IO a - (a - IO b) - IO b, which can then be generalized to any monad
m, so (=) :: m a - (a - m b) - m b.

best,
leon


___
Haskell-cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe