Re: [Haskell-cafe] Increasing memory use in stream computation

2013-10-10 Thread Bertram Felgenhauer
Arie Peterson wrote:
 (Sorry for the long email.)
 
 Summary: why does the attached program have non-constant memory use?

Unfortunately, I don't know. I'll intersperse some remarks and
propose an alternative to stream fusion at the end, which allows
your test program to run in constant space.

  A simple program
 
 When running the program, the resident memory quickly grows to about 3.5 MB 
 (which I am fine with); this stays constant for a long time, but after about 
 7 
 minutes, it starts to grow further. The growth is slow, but I really would 
 hope this program to run in constant memory.

A quicker way to spot the increased memory usage is to look at GC
statistics. I used

 ./Test +RTS -Sstderr 21 | grep 'Gen:  1'
   569904  8192 65488  0.00  0.000.010.0100  (Gen:  1)
   516520  9768 67080  0.00  0.004.234.2300  (Gen:  1)
   513824 14136 71448  0.00  0.008.438.4400  (Gen:  1)
   515856 16728 74040  0.00  0.00   12.70   12.7500  (Gen:  1)
   515416 19080 76392  0.00  0.00   17.01   17.1100  (Gen:  1)
   515856 22248 79560  0.00  0.00   21.33   21.4800  (Gen:  1)
   514936 25080 82392  0.00  0.00   25.65   25.8400  (Gen:  1)
   514936 28632 85944  0.00  0.00   29.94   30.1600  (Gen:  1)
   513512 32328 89640  0.00  0.00   34.24   34.4800  (Gen:  1)
   515224 37032127112  0.00  0.00   38.35   38.6200  (Gen:  1)

Note the increasing values in the third column; that's the live bytes
after each major GC.

  The code 
 
 Note that I added an instance for Monad Stream, using concatMap. This is 
 implicitly used in the definition of the big stream.
 
 The source of Data.Stream contains many alternative implementations of concat 
 and concatMap, and alludes to the difficulty of making it fuse properly. 
 Could 
 it be that the fusion did not succeed in this case?

I had a glimpse at the core code generated by ghc, but the amount of
code is overwhelming. From reading the source code, and as far as my
intuition goes, the code *should* run in constant space.

As an experiment, I rewrote the code using difference lists, and the
result ran in constant memory. I then tried to abstract this idea into
a nice data type. (I lost the low-level difference list code on the way,
but the code was quite hard to read anyway.)

I ended up with an odd mixture of a difference lists and a continuation
that should be applied to each element:

data Stream a where
Stream :: (forall r. (a - r) - [r] - [r]) - Stream a

with  Stream s  representing the list  s id []. The motivation for the
(a - r) argument is that it makes fmap trivial:

fmap f (Stream s) = Stream (\g - s (g . f))

I'll attach the full code below (it's a separate module, Stream.hs
that can be imported instead of Data.Stream for your small example.)
With that replacement, the code runs in constant space and becomes
about 3x faster. Using the 'singleton' function for 'return' results
in an additional, but very modest (about 10%) speedup.

I wonder whether that approach scales up to your real code.

Enjoy,

Bertram
{-# LANGUAGE GADTs, Rank2Types #-}

-- A difference list based implementation of a small part of the
-- Data.Stream interface from the stream-fusion package.

module Stream where

import Prelude hiding (concatMap)
import qualified Data.List as List

data Stream a where
Stream :: { unStream :: forall r. (a - r) - [r] - [r] } - Stream a

empty :: Stream a
empty = Stream (\_ - id)

singleton :: a - Stream a
singleton x = Stream (\f - (f x :))

fromList :: [a] - Stream a
fromList xs = Stream (\f zs - foldr (\x xs - f x : xs) zs xs)

toList :: Stream a - [a]
toList (Stream s) = s id []

instance Functor Stream where
fmap f (Stream s) = Stream (\g - s (g . f))

concatMap :: (a - Stream b) - Stream a - Stream b
concatMap f g = Stream $ \h zs - foldr (\x - unStream (f x) h) zs (toList g)

filter :: (a - Bool) - Stream a - Stream a
filter p = concatMap (\x - if p x then singleton x else empty)

stream :: [a] - Stream a
stream = fromList

foldl' :: (a - b - a) - a - Stream b - a
foldl' f i s = List.foldl' f i (toList s)



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is withAsync absolutely safe?

2013-07-28 Thread Bertram Felgenhauer
Roman Cheplyaka wrote:
 Can withAsync guarantee that its child will be terminated if the thread
 executing withAsync gets an exception?
 
 To remind, here's an implementation of withAsync:
 
   withAsyncUsing :: (IO () - IO ThreadId)
  - IO a - (Async a - IO b) - IO b
   -- The bracket version works, but is slow.  We can do better by
   -- hand-coding it:
   withAsyncUsing doFork = \action inner - do
 var - newEmptyTMVarIO
 mask $ \restore - do
   t - doFork $ try (restore action) = atomically . putTMVar var
   let a = Async t (readTMVar var)
   r - restore (inner a) `catchAll` \e - do cancel a; throwIO e
   cancel a
   return r
 
 I am interested in the case when an exception arrives which transfers
 control to 'cancel', and then another exception arrives to the same
 thread. Even though 'catchAll' (which is a type-restricted synonym for
 catch) masks the exception handler, 'throwTo' inside 'cancel' is
 interruptible (as stated by the documentation).
 
 Will this scenario lead to a thread leakage?

Yes. I guess that 'cancel' should use 'uninterruptibleMask_', but it's a
hard call to make (if an async action becomes unresponsive, do we want
to risk not being able to deliver any exceptions to the controlling
thread just because it wants to terminate the async action?)

Best regards,

Bertram

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] I killed performance of my code with Eval and Strategies

2012-11-17 Thread Bertram Felgenhauer
Dear Janek,

 I am reading Simon Marlow's tutorial on parallelism and I have problems
 with correctly using Eval monad and Strategies. I *thought* I understand
 them but after writing some code it turns out that  obviously I don't
 because parallelized code is about 20 times slower. Here's a short
 example  (code + criterion benchmarks):

Actually, (sin . sqrt) is simply too cheap. The overhead of constructing
chunks (which have to be constructed on the heap) and concatenating the
results far outweighs the cost of computing the list elements.

If, for example, you replace sin . sqrt by f defined by

f :: Double - Double
f x | x  10 = x*x
| otherwise = sin x * f (x-100)

the picture will change. The loss also becomes far less dramatic if
you construct the chunks outside of the benchmark:

main :: IO ()
main = defaultMain [
bench Seq $ nf (map calculateSeq) xs
  , bench Par $ nf calculatePar xs ]
where xs = chunk 2048 [1..16384]

f, f' :: Double - Double
f x = sqrt (sin x)
f' x | x  10 = x*x
| otherwise = sin x * f' (x-100)


calculateSeq :: [Double] - [Double]
calculateSeq [] = []
calculateSeq (x:xs) = f x : calculateSeq xs

calculatePar :: [[Double]] - [[Double]]
calculatePar xss = runEval $ parList (rdeepseq . calculateSeq) xss

chunk :: Int - [a] - [[a]]
chunk _ [] = []
chunk n xs = as : chunk n bs where !(as, bs) = splitAt n xs

The parallel version (with f = sqrt . sin) is still somewhat slower
than the sequential version with -N1 -- probably due to rdeepseq.

Best regards,

Bertram

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What does unpacking an MVar really mean?

2012-07-31 Thread Bertram Felgenhauer
Leon Smith wrote:
 I am familiar with the source of Control.Concurrent.MVar,  and I do see {-#
 UNPACK #-}'ed MVars around,  for example in GHC's IO manager. What I
 should have asked is,  what does an MVar# look like?  This cannot be
 inferred from Haskell source;  though I suppose I could have tried to read
 the Runtime source.

So let's have a brief look at the source. MVar# is an RTS specific
heap object which contains three pointers:

(from ghc/includes/rts/storage/Closures.h)

typedef struct {
StgHeaderheader;
struct StgMVarTSOQueue_ *head;
struct StgMVarTSOQueue_ *tail;
StgClosure*  value;
} StgMVar;

The 'value' pointer refers to the actual value held by the mutable
variable, if any. The 'head' and 'tail' pointers are used for
managing a linked list of threads blocked on the mutable variable.

An MVar (if evaluated) contains just a pointer the MVar# object.

To access the value of an MVar, one starts with a pointer to the MVar
heap object. Then,

  1. Make sure that the MVar is evaluated, using standard lazy
 evaluation (follow indirections, enter thunks, ...).

 In the best case that's a check of a tag bit in the pointer.

  2. Read the pointer to the MVar# in the MVar.

  3. access the 'value' field of the StgMVar record, which
 results in another pointer to a heap object representing
 the actual data held by the MVar.

 (In reality the code has to check whether the MVar is full
 or not, and block if necessary. This is quite involved; see
 stg_takeMVarzh  in  ghc/rts/PrimOps.cmm)

That's two dereferences and some bookkeeping work.

In loops, the compiler will often unpack the MVar, so that you can
expect the first two steps to be performed just once.

Unpacking an MVar into a bigger record means that the pointer to the
MVar# will be stored in the record directly, rather than a pointer
to an MVar object that holds a pointer to the MVar#.

Note that MVar# itself cannot be unpacked -- the StgMVar record will
always be a separate heap object.

 I was asking the dual question:  if the MVar# exists in some data
 structure,  can that data structure still be garbage collected when there
 is a reference to the MVar#,  but not the data structure it is contained
 within.

Yes, because the data structure only contains a pointer to an MVar#
(StgMVar record) that will live on.

Best regards,

Bertram

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] not enough fusion?

2012-07-02 Thread Bertram Felgenhauer
Hi,

Johannes Waldmann wrote:
 s2 :: Int - Int   
 s2 n = sum $ do 
   x - [ 0 .. n-1 ]
   y - [ 0 .. n-1 ]
   return $ gcd x y

This code shows some interesting behaviour: its runtime depends heavily
on the allocation area size.

For comparison, with  main = print $ s1 1  I see the following GC
statistics.

# ./a.out +RTS -A512k -s
  15,201,891,976 bytes allocated in the heap
   4,272,144 bytes copied during GC
  Total   time9.97s  ( 10.00s elapsed)
  %GC time   1.1%  (1.1% elapsed)

For s2, using  main = print $ s2 1,  I get

# ./s2 +RTS -A512k -s
  20,801,251,976 bytes allocated in the heap
   3,438,870,504 bytes copied during GC
  Total   time   15.90s  ( 15.95s elapsed)
  %GC time  34.3%  (34.4% elapsed)
# ./s2 +RTS -A1m -s
  20,801,251,976 bytes allocated in the heap
   2,724,903,032 bytes copied during GC
  Total   time   14.74s  ( 14.80s elapsed)
  %GC time  29.2%  (29.3% elapsed)
# ./s2 +RTS -A2m -s
  20,801,251,976 bytes allocated in the heap
  20,311,952 bytes copied during GC
  Total   time   10.74s  ( 10.78s elapsed)
  %GC time   1.2%  (1.2% elapsed)
# ./a.out +RTS -A2052k -s
  20,801,251,976 bytes allocated in the heap
   1,812,776 bytes copied during GC
  Total   time   10.35s  ( 10.38s elapsed)
  %GC time   0.8%  (0.8% elapsed)

Note that the number of bytes copied during GC drops by three orders of
magnitude when we increase the allocation area size from 1 MB to 2 MB,
and another order of magnitude when adding an additional 4kB to that.

Why does this happen? The reason is a failure of generational GC
heuristics. If we look at the core, we find that the inner loop (which
generates a list that is consumed by sum) translates to something like

nums = [0..]

go xs0 = case xs0 of
[] - []
x : xs - let
go' ys0 = case ys0 of
[] - []
y : ys - GHC.Real.gcdInt x y : go' ys
  in
case go' nums of
[] - go xs
(z : zs) - z : zs ++ go xs

At a first glance, this looks fine - there's one big chunk of fixed data
(the nums list) that will end up in the old generation. The rest of the
data is consumed as it is created. However, this is not quite true: The
thunk for the second argument to (++) (representing 'go xs') is already
allocated on the heap when the first element of the result of (++), i.e.,
the first element of zs, is consumed. While  zs  is processed, this thunk
is never touched. If it survives a whole GC-mutator cycle, then the next
garbage collection will consider the thunk mature and put it into the
old generation. But when the code starts evaluating this 'go xs' call,
it produces a long list, all of which is being kept alive by the (now
updated) thunk in generation 1, and as a result will be copied during
GC, until the next major GC.

So the observed behaviour hinges on the question whether the  go xs
thunk can survive processing the zs list. The memory allocated during
this time is constant - besides the list, some memory is allocated
for thunks in go', and for intermediate Integers[*] in gcdInt. If the
allocation area size exceeds this constant, then the program will
run as expected. Note that every time a 'go xs' thunk survives, a lot
of extra work is caused for the GC -- this explains the sharp increase
in bytes copied observed above.

Bertram

[*] gcdInt really ought to only do a single allocation for the result,
with an inlining opportunity to get rid of that as well. It is defined
in GHC.Real as

gcdInt :: Int - Int - Int
gcdInt a b = fromIntegral (gcdInteger (fromIntegral a) (fromIntegral b))

which used to optimize to a single low-level GHC.Integer.Type.gcdInt call
in ghc 7.2. With 7.4 and HEAD, integerToInt and smallInteger are no longer
inlined, resulting in worse code. See also

  http://hackage.haskell.org/trac/ghc/ticket/7041

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] mueval leaving behind tmp files

2012-04-04 Thread Bertram Felgenhauer
Johannes Waldmann wrote:
 The following program prints   Right (test,Bool,True)
 as it should, but it leaves behind in /tmp 
 two files (name is a long string of digits)
 and an empty directory (name is ghcN_N).

I have a partial solutions for this; see the attached patch for hint.
It cleans up the numbered files, which are generated for phantom
modules. There are two phantom modules generated -- one that defines
a _show function for hint's internal use, and one that provides a
bunch of pre-imported modules for mueval. The latter will not be
generated if the 'modules' mueval option is set to Nothing.

For some reason, cleaning the ghc* directory fails if the executed
code fails (for example, using 'test = undefined' in Johannes' code).
Can anybody explain why? The `finally` handler is run, and the
phantom module files are actually deleted.

 ... and it deletes the input file (/tmp/Main.hs).

That appears to be an undocumented mueval feature. It's not a good one,
I think.

 That's not nice. Ideally, I would want to read input
 from a String (instead of the file), and not write to disk at all.

I suppose that ghc's interface does not support this, but I have
not checked.

Best regards,

Bertram
1 patch for repository http://darcsden.com/jcpetruzza/hint:

Wed Apr  4 14:59:33 CEST 2012  Bertram Felgenhauer in...@gmx.de
  * clean temporary files in runInterpreterT(withArgs)

New patches:

[clean temporary files in runInterpreterT(withArgs)
Bertram Felgenhauer in...@gmx.de**20120404125933
 Ignore-this: ff9abed505645f81131a57182d371861
] hunk ./src/Hint/Context.hs 10
 
   PhantomModule(..), ModuleText,
   addPhantomModule, removePhantomModule, getPhantomModules,
+  cleanPhantomModules,
 
   allModulesInContext, onAnEmptyContext,
 
hunk ./src/Hint/Context.hs 254
--
onState (\s -s{qual_imports = quals})
 
--- | All imported modules are cleared from the context, and
---   loaded modules are unloaded. It is similar to a @:load@ in
---   GHCi, but observe that not even the Prelude will be in
---   context after a reset.
-reset :: MonadInterpreter m = m ()
-reset =
+-- | 'cleanPhantomModules' works like 'reset', but skips the
+--   loading of the support module that installs '_show'. Its purpose
+--   is to clean up all temporary files generated for phantom modules.
+cleanPhantomModules :: MonadInterpreter m = m ()
+cleanPhantomModules =
 do -- Remove all modules from context
runGhc2 Compat.setContext [] []
--
hunk ./src/Hint/Context.hs 280
 import_qual_hack_mod = Nothing,
 qual_imports = []})
liftIO $ mapM_ (removeFile . pm_file) (old_active ++ old_zombie)
-   --
-   -- Now, install a support module
-   installSupportModule
+
+-- | All imported modules are cleared from the context, and
+--   loaded modules are unloaded. It is similar to a @:load@ in
+--   GHCi, but observe that not even the Prelude will be in
+--   context after a reset.
+reset :: MonadInterpreter m = m ()
+reset = do -- clean up context
+   cleanPhantomModules
+   --
+   -- Now, install a support module
+   installSupportModule
 
 -- Load a phantom module with all the symbols from the prelude we need
 installSupportModule :: MonadInterpreter m = m ()
hunk ./src/Hint/InterpreterT.hs 164
   ifInterpreterNotRunning $
 do s - newInterpreterSession `catch` rethrowGhcException
-- SH.protectHandlers $ execute s (initialize args  action)
-   execute s (initialize args  action)
+   execute s (initialize args  action `finally` cleanSession)
 where rethrowGhcException   = throw . GhcException . showGhcEx
 #if __GLASGOW_HASKELL__  610
   newInterpreterSession =  do s - liftIO $
hunk ./src/Hint/InterpreterT.hs 170
  Compat.newSession GHC.Paths.libdir
   newSessionData s
+  cleanSession = cleanPhantomModules -- clean ghc session, too?
 #else
   -- GHC = 610
   newInterpreterSession = newSessionData ()
hunk ./src/Hint/InterpreterT.hs 174
+  cleanSession =
+   do cleanPhantomModules
+  runGhc $ do dflags - GHC.getSessionDynFlags
+  GHC.defaultCleanupHandler dflags (return ())
 #endif
 
 {-# NOINLINE uniqueToken #-}

Context:

[bump to version 0.3.3.4
jcpetru...@gmail.com**20111220224039
 Ignore-this: 23d55959cc61ebbd20f5ebd4f2a86bd9
] 
[authors file updated
jcpetru...@gmail.com**20111220224018
 Ignore-this: e4f66f8324ac599e74e665a1e2292c12
] 
[compile with ghc 7.4 snapshot
Mark Wright gie...@gentoo.org**20111220114907
 Ignore-this: cc43ccb4e716324ccfbfbb1d38f2668c
] 
[TAG 0.3.3.3
jcpetru...@gmail.com**2004192050
 Ignore-this: f2f8da08437fa759cb41f0f4e35a11a
] 
Patch bundle hash:
7ee3dacd0f2fd8713494534c000231ed2e85b783
___
Haskell-Cafe mailing list
Haskell-Cafe

[Haskell-cafe] Cabal-1.10.1.0 and bytestring-0.9.2.0 hackage problem.

2011-08-25 Thread Bertram Felgenhauer
Dear list,

Cabal-1.10.1.0 contains a bug that causes it to fail to parse the
test-suite target of bytestring-0.9.2.0. Since cabal-install parses
all package descriptions to before resolving dependencies, users
with that version of Cabal are stuck.

Now it seems somebody realised this problem and removed
bytestring-0.9.2.0 from hackage. However,

1. http://hackage.haskell.org/package/bytestring is now broken.
2. The downloadeble package index (00-index.tar) still contains
   the bytestring-0.9.2.0 cabal file, so the problem persists.

As a workaround, one can remove the cabal file from the downloaded
index manually,

tar -f ~/.cabal/packages/hackage.haskell.org/00-index.tar --delete 
bytestring/0.9.2.0

(or rebuild Cabal / cabal-install starting with
'cabal unpack Cabal; cabal unpack cabal-install')

But it really needs to be fixed on the hackage server.

Best regards,

Bertram

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] pointer equality

2011-07-20 Thread Bertram Felgenhauer
Carl Howells wrote:
 On Tue, Jul 19, 2011 at 11:14 PM, yi huang yi.codepla...@gmail.com wrote:
  2011/7/20 Eugene Kirpichov ekirpic...@gmail.com
 
  reallyUnsafePointerEq#, and it really is as unsafe as it sounds :)
 
  Why is it so unsafe? i can't find any documentation on it.
  I think always compare pointer first is a good optimization.
 
 False positives and false negatives are both possible, depending on GC
 timing.

At the moment, as implemented in ghc, false positives are not possible,
because GC only happens on allocation [*], and there is no allocation
happening in that primitive operation. I don't think this is going
to change without a total rewrite of ghc, since allowing GC (i.e.,
moving pointers) at arbitrary times would be a fundamental change to
the STG execution model.

Pretty much everything else imaginable can happen; in particular, if two
variables a and b compared equal at one point, they may later become
different pointers again. In the parallel RTS, if you're unlucky, this
may even be a permanent effect.

Best regards,

Bertram

[*] we'll have thread-local GC for the first generation soon, but a lot
of effort went into ensuring consistentcy of pointers seen by other
threads.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] pointer equality

2011-07-20 Thread Bertram Felgenhauer
David Barbour wrote:
 On Wed, Jul 20, 2011 at 10:40 AM, Chris Smith cdsm...@gmail.com wrote:
  The point, I think, is that if pointer equality testing really does what
  it says, then there shouldn't *be* any correct implementation in which
  false positives are possible.  It seems the claim is that the garbage
  collector might be moving things around, have just by chance happened to
  place the second value in the spot formerly occupied by the first, and
  have not updated the first pointer yet.  But if that's the case, and
  it's executing arbitrary user code that may refer to that memory, then
  the garbage collector contains race conditions!
 
 You assume that the GC uses the same primitive as the developer, and is
 inherently subject to its own race conditions.
 
 But Bertram has said that false positives are not possible. I can only
 assume that the pointer comparison is atomic with respect to the GC.

That's right. A lot of things that the CMM code (and eventually the
machine code) generated by ghc does is atomic with respect to GCs - from
a single worker thread's point of view, GCs only happen when it tries to
allocate some memory. (Then it does a heap check, and if that fails,
saves some state and hands control over to the garbage collector. If the
state contains pointers, the GC will know that and adjust them. Finally
the state is restored and execution resumes.)

Between these points, the code is free to access pointers on the stack
and heap and dereference them, without having to worry about GC changing
the memory under its nose.

The reallyUnsafePointerEquality# primitive is implemented at this low
level, and there are no intervening heap checks, and thus no GCs that
could interfere with the comparison.

Best regards,

Bertram

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Splitting Hackage Packages and re-exporting entire modules (with same module name)

2011-07-13 Thread Bertram Felgenhauer
Antoine Latter wrote:
 If you give the module a new name in the new package then the old
 module can re-export all of the symbols in the new module.
 
 In GHC I don't think there is a way for two packages to export the
 same module and have them be recognized as the same thing, as far as I
 know.

Right, but you don't have to rename the module if you use the
PackageImports extension. (Incidentally, this used by the haskell2010
ibrary, which is implemented in terms of base)

http://haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#package-imports

So the following should work, assuming 'original' is a package exporting
the Foo.Bar module.

  {-# LANGUAGE PackageImports #-}
  module Foo.Bar (module Original) where

  import original Foo.Bar as Original

Best regards,

Bertram

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Generating random graph

2011-04-13 Thread Bertram Felgenhauer
Hi Mitar,

 I have made this function to generate a random graph for
 Data.Graph.Inductive library:
 
 generateGraph :: Int - IO (Gr String Double)
 generateGraph graphSize = do
   when (graphSize  1) $ throwIO $ AssertionFailed $ Graph size out
 of bounds  ++ show graphSize
   let ns = map (\n - (n, show n)) [1..graphSize]
   es - fmap concat $ forM [1..graphSize] $ \node - do
 nedges - randomRIO (0, graphSize)
 others - fmap (filter (node /=) . nub) $ forM [1..nedges] $ \_ -
 randomRIO (1, graphSize)
 gen - getStdGen

Others have already remarked that you could implement this as a pure
function. However, the mistake is the use of  getStdGen  here, which
is (almost?) never what you need: two consecutive valls of getStdGen
will return the same generator. You should call newStdGen  instead.

Best regards,

Bertram

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Faster timeout but is it correct?

2011-02-21 Thread Bertram Felgenhauer
Hi Bas,

 The solution is probably to reverse the order of: unsafeUnmask $
 forkIO to forkIO $ unsafeUnmask. Or just use forkIOUnmasked. The
 reason I didn't used that in the first place was that it was much
 slower for some reason.

The reason is probably that in order for the  forkIOUnmaske-d thread
to receive an exception (which it must, to be killed), it first has
to be activated at least once, to perform the unsafeUnmask. So the
worker thread's call to killThread will block. Now if there were a
way to specify the exception mask of the newly created thread directly,
it should be just as fast as the  unsafeUnmask . forkIO  version.

I have not checked the event manager based implementation in detail,
but from your numbers it looks like the best option at this time.

Best regards,

Bertram

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Misleading MVar documentation

2011-01-05 Thread Bertram Felgenhauer
Mitar wrote:
 Hi!
 
 On Sat, Dec 25, 2010 at 11:58 AM, Edward Z. Yang ezy...@mit.edu wrote:
  I think you're right. A further comment is that you don't really need
  stringent timing conditions (which is the only thing I think of when
  I hear race) to see another thread grab the mvar underneath
  you
 
 Yes, MVars are (bounded, 1 space long) queues with predictable behavior.
 
 Maybe we should change documentation for swapMVar (and others) and
 replace notion of race condition with that it can block.

But this was not the issue this thread was about, namely: If readMVar
(to pick one example) is applied to a full MVar with a pending putMVar,
then readMVar will block *after* reading the value from the MVar,
before being able to put it back. This is surprising, as it means that
the value of the MVar can change during the execution of readMVar. (it
will be changed back before readMVar completes but then the damage may
already be done.)

I think atomicity is the right concept here: All these operations consist
of two separate steps (and can be interrupted in the middle) and this
behaviour is observable if the MVar is not used in a single token mutex
fashion, where either the MVar is full or there is exactly one thread
owning the token. (Unlike simple mutexes, the MVar token can be labeled
and re-labeled by the thread owning it.)

Best regards,

Bertram

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Are newtypes optimised and how much?

2010-11-05 Thread Bertram Felgenhauer
 | Then we can define
 | 
 | safeCoerce :: (a ~~ b) = a - b
 | safeCoerce = unsafeCoerce
 
 Yes, that's right.  When I said we have the technology I meant that we 
 (will) have something similar to ~~.  See our paper Generative Type 
 Abstraction and Type-level Computation 
 http://www.cis.upenn.edu/~sweirich/newtypes.pdf.  No unsafeCoerce required.

The idea was to put safeCoerce into a library. The syntax extension
would be light-weight because contexts, unlike expressions, still have
plenty of room for extensions. The idea is is based on the assumption
that to the compiler, 'unsafeCoerce' looks like an artificial safe
coercion, so that after inlining safeCoerce, we get exactly the effect
of a safe coercion during type checking and further compilation. Perhaps
that assumption is wrong. I'll look at the paper.

Regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Scrap your rolls/unrolls

2010-11-02 Thread Bertram Felgenhauer
Max Bolingbroke wrote:
 On 23 October 2010 15:32, Sjoerd Visscher sjo...@w3future.com wrote:
  A little prettier (the cata detour wasn't needed after all):
 
    data IdThunk a
    type instance Force (IdThunk a) = a
 
 Yes, this IdThunk is key - in my own implementation I called this Forced, 
 so:
 
 
 type instance Force (Forced a) = a
 
 
 You can generalise this trick to abstract over type functions, though
 I haven't had a need to do this yet. Let us suppose you had these
 definitions:
...
 With type functions, Haskell finally type-level lambda of a sort :-)

Indeed. I had a lot of fun with the ideas of this thread, extending
the 'Force' type family (which I call 'Eval' below) to a small EDSL
on the type level:

The EDSL supports constants

  data Con (t :: *)-- type constant
  data Con1 (t :: * - *)  -- unary type constructor
  data ConE (t :: * - *)  -- like Con1, but the argument is used with Eval

and a few operators

  data App a b -- apply a to b
  data Fix a   -- compute fixpoint of a
  data a :.: b -- compose two unary type constructors

There is a type family Eval that maps expressions from that EDSL to
actual types,

  type family Eval t :: *

The basic operations are straight-forward to implement,

  type instance Eval (Con t) = t
  type instance Eval (App (Con1 t) a) = t (Eval a)
  type instance Eval (App (ConE t) a) = t a
  type instance Eval (App (a :.: b) c) = Eval (App a (App b c))

Now we turn to fixed points. An easy definition would be

  type instance Eval (Fix f) = Eval (App f (Fix f))

Let's play with that. For that, we defined

  data TreeF a t = Node a [Eval t]
  type Tree a = Eval (Fix (ConE (TreeF a)))

  -- works fine in ghc 7.1
  -- ghc 6.12.3 chokes on it for the recursive types
  deriving instance (Show a, Show (Eval t)) = Show (TreeF a t)

And indeed,

  ghci Node 1 [Node 2 []] :: Tree Int
  Node 1 [Node 2 []]

and a lot of other things work as expected. But what if we want to
work with fixed points of the composition of several functors?

This works fine:

  type Tree2 a b = Eval (Fix (ConE (TreeF a) :.: ConE (TreeF b)))

  t0 :: Tree2 Bool Int
  t0 = Node True [Node 1 [Node False []]]

  t1 :: Tree2 Int Bool
  t1 = Node 1 [Node True [Node 1 [Node False [

but this doesn't:

  t1 :: Tree2 Int Bool
  t1 = Node 1 [t0]

We can help the type checker out by evaluating fixed points for
compositions differently: Instead of applying the whole sequence
of functors at once, apply them one by one, and keep the remaining
sequence in a nice shape so that the type checker can identify
equivalent compositions.

The implementation is somewhat verbose, but quite straight-forward
tree manipulation.

  -- easy case first
  type instance Eval (Fix (ConE f)) = f (Fix (ConE f))

  -- handle compositions, phase 1.: find last element.
  type instance Eval (Fix (a :.: (b :.: c))) = Eval (Fix ((a :.: b) :.: c))
  type instance Eval (Fix (a :.: (ConE b))) = Eval (Fix1 a (ConE b))
  type instance Eval (Fix (a :.: (Con1 b))) = Eval (Fix1 a (Con1 b))

  data Fix1 a b
  -- compositions, phase 2.: build left-associative composition chain
  type instance Eval (Fix1 (a :.: (b :.: c)) d) = Eval (Fix1 ((a :.: b) :.: c) 
d)
  type instance Eval (Fix1 (a :.: ConE b) c) = Eval (Fix1 a (ConE b :.: c))
  type instance Eval (Fix1 (a :.: Con1 b) c) = Eval (Fix1 a (Con1 b :.: c))

  -- compositions, final step: apply first element to fixpoint of shifted cycle
  type instance Eval (Fix1 (ConE a) b) = a (Fix (b :.: ConE a))
  type instance Eval (Fix1 (Con1 a) b) = a (Eval (Fix (b :.: Con1 a)))

And with that implementation, the above definition of  t1  typechecks.

Full code with more examples is available at

  http://int-e.cohomology.org/~bf3/haskell/Fix.hs

Best regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Are newtypes optimised and how much?

2010-11-02 Thread Bertram Felgenhauer
Simon Peyton-Jones wrote:
 What you really want is to say is something like this.  Suppose my_tree :: 
 Tree String.  Then you'd like to say
   my_tree ::: Tree Foo
 meaning please find a way to convert m_tree to type (Tree Foo), using 
 newtype coercions.  
 
 The exact syntax is a problem (as usual).  We have the technology now.  The 
 question is how important it is.

I think extending the syntax for contexts would be sufficient:
Write a ~~ b for a can be converted to b by wrapping / unwrapping
newtypes, which is a conservative approximation of a and b have the
same representation.

Then we can define

safeCoerce :: (a ~~ b) = a - b
safeCoerce = unsafeCoerce

and your example would become

safeCoerce my_tree :: Tree Foo

The feature would add convenience to the language when working with
newtypes, and reduce the tension between type safety and performance
(where the choice is between using a newtype and unsafeCoerce, and
just working with the plain underlying type.) So while the pressure
is quite low, I imagine it would become quite a useful feature once
we'd have it, but that's of course speculation.

As far as I can see, the feature is nontrivial: Care has to be taken
to not break abstractions (like safelycoercing IO to ST), so it's
quite possible that the engineering effort outweighs the potential
benefits.

Best regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Eta-expansion destroys memoization?

2010-10-12 Thread Bertram Felgenhauer
Simon Marlow wrote:
 Interesting.  You're absolutely right, GHC doesn't respect the
 report, on something as basic as sections!  The translation we use
 is
 
   (e op)  ==  (op) e
 
 once upon a time, when the translation in the report was originally
 written (before seq was added) this would have been exactly
 identical to \x - e op x, so the definition in the report was
 probably used for consistency with left sections.
 
 We could make GHC respect the report, but we'd have to use
 
   (e op)  ==  let z = e in \x - z op x
 
 to retain sharing without relying on full laziness.

We should keep in mind that this was changed deliberately in ghc 6.6,
in order to support postfix operators.

http://www.haskell.org/ghc/docs/6.6/html/users_guide/release-6-6.html

The motivating example was the factorial operator which can currently
be written as  (n !)  in ghc-Haskell.

Cheers,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] hClose: invalid argument (Invalid or incomplete multibyte or wide character)

2010-10-06 Thread Bertram Felgenhauer
Hi,

Daniel Fischer wrote:
 On Tuesday 05 October 2010 23:34:56, Johannes Waldmann wrote:
  main =  writeFile check.out ü
 
  that's u-umlaut, and the source file is utf-8-encoded
  and ghc-6.12.3 compiles it without problems but when running, I get
 
  hClose: invalid argument (Invalid or incomplete multibyte or wide
  character)

In order to make the behaviour independent of the locale (which is
desirable for programs storing state in text files), you can use
functions like writeFileUTF8 and readFileUTF8 here:

import System.IO

writeFileUTF8 file text = withFile file WriteMode $ \handle - do
hSetEncoding handle utf8
hPutStr handle text

readFileUTF8 file = do
handle - openFile file ReadMode
hSetEncoding handle utf8
hGetContents handle

main = do
let s = äöü
writeFileUTF8 test.out s
s' - readFileUTF8 test.out
putStrLn $ unwords [s, ==, s']

Of course using System.IO.UTF8 from utf8-string would also work.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Why isn't there a cheaper split-in-two operation for Data.Set?

2010-10-04 Thread Bertram Felgenhauer
Ryan Newton wrote:
 Would there be anything wrong with a Data.Set simply chopping off half its
 (balanced) tree and returning two approximately balanced partitions
...
 cleave :: Set a - (Set a, Set a)
 cleave Tip = (Tip, Tip)
 cleave (Bin _ x l r)
   | size l  size r = (l, insertMin x r)
   | otherwise   = (insertMax x l, r)

This function would expose some of the internal structure of Set - i.e.
there could be equal sets  s1 == s2  with  cleave s1 /= cleave s2.

Maybe a better idea than to expose such a function would be to split
Data.Set into Data.Set.Internal and Data.Set, where Data.Set.Internal
would export the actual Tip and Bin constructors. Then people who want
to break the abstraction, for example to experiment with parallel folds,
could do that easily.

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: base-3 -gt; base-4

2010-09-05 Thread Bertram Felgenhauer
Johannes Waldmann wrote:
 Ivan Lazar Miljenovic ivan.miljenovic at gmail.com writes:
 
  ... the only thing that changed of significance was the
  exception handling: Control.Exception now uses extensible exceptions

base-4 also introduced the Control.Category.Category class and
restructured Control.Arrow to use that.

 I'm pretty sure ghc-6.12.3 gives warnings this will not work with base-4
 in some more places. I was hoping there is a comprehensive list somewhere.

There's

  http://www.haskell.org/haskellwiki/Upgrading_packages

but it's not up-to-date, probably because updating to ghc-6.12
was comparatively painless.

And of course there are the ghc-6.10.1 release notes which also
cover the changes in base-4:

  http://www.haskell.org/ghc/docs/6.10.1/html/users_guide/release-6-10-1.html

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fast Integer Input

2010-08-23 Thread Bertram Felgenhauer
Serguey Zefirov wrote:
 2010/8/23  200901...@daiict.ac.in:
  This function takes 1.8 seconds to
  convert 2000 integers of length 10^13000. I need it to be smaller that
  0.5 sec. Is it possible?
 
 2000 integers of magnitude 10^13000 equals to about 26 MBytes of data
 (2000 numbers each 13000 digits long). Rounding 1.8 seconds to two
 seconds we conclude that you proceed with speed about 13MBytes per
 second. Assuming you have CPU clock frequency near 2.6GHz, you
 performing about 200 clock cycles per input digit.
 
 10^13000 roughly equal to 2^39000. Or (2^32)^1219 - 1219 32-bit words
 of representation. So you're doing some last nextN =
 (n*10)+currentDigit conversion operations in less that one clock cycle
 per word.

You can do better than calculating (n*10 + d) repeatedly, using a
divide and conquer scheme, which is in fact implemented in ByteString's
readInteger:

((a*10 + b) * 100 + (c*10 + d)) * 1 + ...

This helps because now we're multiplying numbers of roughly equal
size, which using FFT and related methods, as gmp uses, can be sped
up immensely, beating the quadratic complexity that you get with
the naive approach.

The timings seem about right.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space leak with unsafePerformIO

2010-06-27 Thread Bertram Felgenhauer
Henning Thielemann wrote:
 Attached is a program with a space leak that I do not understand. I
 have coded a simple 'map' function, once using unsafePerformIO and
 once without. UnsafePerformIO has a space leak in some circumstances.
 In the main program I demonstrate cases with and without space leak.
 Without space leak the program writes a file to the disk until it
 is full. Any idea?

The program relies on the GC doing short-cut evaluation of record
selectors to avoid a space leak. If the user of the function
splitAtLazy

| splitAtLazy :: [b] - [a] - ([a],[a])
| splitAtLazy nt xt =
|(\ ~(ys,zs) - (ys,zs)) $
|case (nt,xt) of
|   (_:ns, x:xs) -
|  let (ys,zs) = splitAtLazy ns xs
|  in  (x:ys,zs)
|   (_, xs) - ([],xs)

somehow holds on to a reference of the returned pair while processing
the first part of the list, there will be a space leak, because that
means that the whole prefix remains reachable.

splitAtLazy itself is not leaky, because the value returned by the
recursive call is scrutinized as follows,

  Main.$wsplitAtLazy =
  ...
  (# case ds_sLb of wild_B1 { (ys_agC, zs_agE) - ys_agC },
 case ds_sLb of wild_B1 { (ys_agC, zs_agE) - zs_agE } #)

and ghc turns that into record selector thunks in the code generator.

The precise rule can be found in compiler/codeGen/StgCmmBind.hs:

| Note [Selectors]
| ~~~
| We look at the body of the closure to see if it's a selector---turgid,
| but nothing deep.  We are looking for a closure of {\em exactly} the
| form:
| 
| ...  = [the_fv] \ u [] -
|  case the_fv of
|con a_1 ... a_n - a_i

Now let's look at how the result of splitAtLazy is used.

non-leaky version (case 0):

  Main.lvl1 =
case Main.$wsplitAtLazy @ () @ GHC.Types.Char Main.xs Main.xs1
of ww_sLv { (# ww1_sLx, ww2_sLy #) -
Main.go (GHC.Base.++ @ GHC.Types.Char ww1_sLx ww2_sLy)
}

The return values are passed on to (++) directly. The result pair is
actually never built at all, so no reference to it can be kept.

leaky version (case 3):

  Main.ds =
case Main.$wsplitAtLazy @ () @ GHC.Types.Char Main.xs Main.xs1
of ww_sKy { (# ww1_sKA, ww2_sKB #) -
(ww1_sKA, ww2_sKB)
}

This builds the pair returned by splitAtLazy.

  Main.lvl1 =
case Main.ds of wild_Xw { (prefix_aCf, suffix_aCh) - prefix_aCf }

Use of prefix: it's a record selector. This is fine.

  Main.lvl2 = Main.go Main.lvl1

The prefix is then passed to some worker function.

  Main.lvl3 =
case Main.ds of wild_Xw { (prefix_aCf, suffix_aCh) -
Main.go1 suffix_aCh
}

Use of suffix: Due to the call of  Main.go1  this is *not* a record
selector. It is compiled to an actual case expression, which to the
garbage collector looks just like an ordinary thunk. A reference to
Main.ds is kept around until the suffix is about to be processed
and a memory leak ensues.

If the compiler had produced

  Main.lvl3 =
case Main.ds of wild_Xw { (prefix_aCf, suffix_aCh) -
suffix_aCh
}

  Main.lvl4 = Main.go1 Main.lvl3

instead, then there would not be a leak. This whole record selector
thunk business is very fragile.

The good news is that the problem is completely unrelated to
unsafePerformIO (the presence of unsafePerformIO makes optimisations
more difficult, but any pure function of sufficient complexity would
have the same effect).

There's a simple fix for the problem, too: Change

   let (prefix, suffix) = makeTwoLists 'a'

to
  let !(prefix, suffix) = makeTwoLists 'a'

in which case the compiler produces code similar to the non-leaky case
for all alternatives.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Huffman Codes in Haskell

2010-06-27 Thread Bertram Felgenhauer
Andrew Bromage wrote:
  But honestly, it's just not that hard to do in linear time, assuming
  the symbols are sorted by frequency:
 
 Or maybe not so easy.

But not much harder.

data Tree a = Branch (Tree a) (Tree a)
| Leaf   a
deriving Show

huffmanTree :: (Ord a, Num a) = [(a, b)] - Tree b
huffmanTree [] = error huffmanTree: empty code
huffmanTree xs =
let xs' = sortBy (comparing fst) [(a, Leaf b) | (a, b) - xs]
branches ((a, l) : (b, r) : ts) = (a+b, Branch l r) : branches ts
merged = merge (-1 :: Int) xs' (branches merged)
merge n [] ys  = take n ys
merge n (x:xs) ys | n = 0 = x : merge (n+1) xs ys
merge n (x:xs) (y:ys) = case comparing fst x y of
GT - y : merge (n-1) (x:xs) ys
_  - x : merge (n+1) xs (y:ys)
in  snd $ last merged

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] C variable access via FFI

2010-04-20 Thread Bertram Felgenhauer
Tom Hawkins wrote:
 I have a bunch of global variables in C I would like to directly read
 and write from Haskell.  Is this possible with FFI,

Yes it is, as explained in section 4.1.1. in the FFI specification [1].
An import for a global variable  int bar  would look like this:

foreign import ccall bar bar :: Ptr CInt 

The difference to an import of a function  int foo()  is the extra .

HTH,

Bertram

[1] http://www.cse.unsw.edu.au/~chak/haskell/ffi/ffi/ffise4.html#x7-170004.1.1
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] GHC, odd concurrency space leak

2010-04-18 Thread Bertram Felgenhauer
Bulat Ziganshin wrote:
  This expands as
 
  always a = a  always a
   = a  a  always a
   = a  a  a  always a
  ...
  where each  application is represented by a newly allocated object
  (or several, I have not looked at it in detail) on the heap.
 
 why you think so?

At the time I wrote this, because it explains the space leak and because
the space leak disappears if I address this precise issue. But I've
since verified the theory by inspecting Core and Cmm code.

 i always thought that  in ghc just sequentially
 executes statements, the RealWorld magic exists only at compile-time

Yes, that's what happens once () gets actually executed in IO. But
this fact and the RealWorld token have nothing to do with the whole
issue, which is about accumulating a chain of IO actions that have not
yet been executed.

I'll continue to write a  b, which in IO, modulo newtypes, stands for

   \(s :: RealWorld#) - case a s of (s', _) - b s'

The fact that the state token disappears at runtime does not change
that this is a closure, represented by a (function) heap node.

So we have some IO action

let x = always a

Now we run x, but also hold onto the corresponding thunk to reuse it
later, say

let x = always a
in  x  x

In order to execute that, x is forced, and evaluated to

let x = let x' = always a in a  x'
in  x  x

or, equivalently,

let x' = always a
x  = a  x'
in  x  x

Then the first step of the IO action is performed, resulting in

let x' = always a
x  = a  x'
in  x'  x

And now the same reduction happens again for x',

let x2 = always a
x' = a  x2
x  = a  x'
in  x2  x

and then again for x2,

let x3 = always a
x2 = a  x3
x' = a  x2
x  = a  x'
in  x2  x

and so on, ad infinitum. This leaks memory because x, x', x2 etc. can't
be garbage collected - there's still a reference to x. Note that this
also explains why the space leak disappears if we remove the 'forever'
in the spawner thread in the original example.

This would not happen if the 'always a' was reused, i.e. if the code
tied a knot as

   let act = a  act in act

does, but as you can see in the Core (and even Cmm if you look closely
enough) that does not happen in those cases where the code leaks memory.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] GHC, odd concurrency space leak

2010-04-17 Thread Bertram Felgenhauer
Daniel Fischer wrote:
 Am Samstag 17 April 2010 14:41:28 schrieb Simon Peyton-Jones:
  I have not been following the details of this, I'm afraid, but I notice 
 this:
   forever' m = do _ - m
   forever' m
 
  When I define that version of forever, the space leak goes away.
 
  What was the old version of forever that led to the leak?
 
 Control.Monad.forever
 
 forever :: Monad m = m a - m b
 forever m = m  forever m
 
 However, that isn't the problem. In my tests, both variants of forever 
 exhibit the same behaviour, what makes it leak or not is the optimisation 
 level.

This definition, plus sharing, is the source of the space leak.
Consider this modification of your code:

import Control.Concurrent

always :: Monad m = m a - m b
always a = -- let act = a  act in act
do
_ - a
always a

noop :: IO ()
noop = return ()

body :: IO ()
body = always noop

spawner :: IO ()
spawner = do
forkIO $ body
putStrLn Delaying
threadDelay 100
body `seq` return ()

main :: IO ()
main = do
putStrLn Spawning
forkIO spawner
putStrLn Delaying main
threadDelay 400

Note that the 'always' in 'spawner' is gone, but it still exhibits the
space leak. The leak goes away if the final line of 'spawner' is removed,
hinting at the real problem: 'always' actually creates a long chain of
actions instead of tying the knot.

Indeed the following definition of 'always' (or 'forever') fares better
in that regard, but is more susceptible to producing unproductive loops:

always a = let act = a  act in act

(I used  noop = yield  for avoiding that problem in my tests)

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] GHC, odd concurrency space leak

2010-04-17 Thread Bertram Felgenhauer
Bulat Ziganshin wrote:
 Hello Bertram,
 
 Sunday, April 18, 2010, 12:11:05 AM, you wrote:
 
  always a = -- let act = a  act in act
  do
  _ - a
  always a
  
 
  hinting at the real problem: 'always' actually creates a long chain of
  actions instead of tying the knot.
 
 can you explain it deeper? it's what i see: always definition is
 equivalent to
 
  always a = do a
always a
 
 what's the same as
 
  always a = a  always a

This expands as

always a = a  always a
 = a  a  always a
 = a  a  a  always a
...
where each  application is represented by a newly allocated object
(or several, I have not looked at it in detail) on the heap.

With

always a = let act = a  act in act

there's only one  application being allocated.

The principle is the same as with

repeat x = x : repeat x

versus

repeat x = let xs = x : xs in xs

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] GHC, odd concurrency space leak

2010-04-17 Thread Bertram Felgenhauer
Daniel Fischer wrote:
 Except that with optimisations turned on, GHC ties the knot for you (at 
 least if always isn't exported).
 Without -fno-state-hack, the knot is tied so tightly that 
 always (return ()) is never descheduled (and there's no leak).

Yes, I was concentrating on -O2, without -fno-state-hack.

 With -fno-state-hack, I get
 
 Rec {
 Main.main_always :: GHC.Types.IO () - GHC.Types.IO ()
 GblId
 [Arity 1
  NoCafRefs
  Str: DmdType L]
 Main.main_always =
   \ (a_aeO :: GHC.Types.IO ()) -
 let {
   k_sYz :: GHC.Types.IO ()
   LclId
   [Str: DmdType]
   k_sYz = Main.main_always a_aeO } in
 (\ (eta_ann :: GHC.Prim.State# GHC.Prim.RealWorld) -
case (a_aeO
  `cast` (GHC.Types.NTCo:IO ()
  :: GHC.Types.IO ()
   ~
 (GHC.Prim.State# GHC.Prim.RealWorld
  - (# GHC.Prim.State# GHC.Prim.RealWorld, () #
   eta_ann
of _ { (# new_s_anz, _ #) -
(k_sYz
 `cast` (GHC.Types.NTCo:IO ()
 :: GHC.Types.IO ()
  ~
(GHC.Prim.State# GHC.Prim.RealWorld
 - (# GHC.Prim.State# GHC.Prim.RealWorld, () #
  new_s_anz
})
 `cast` (sym (GHC.Types.NTCo:IO ())
 :: (GHC.Prim.State# GHC.Prim.RealWorld
 - (# GHC.Prim.State# GHC.Prim.RealWorld, () #))
  ~
GHC.Types.IO ())
 end Rec }

Which is

always = \a_aeO - let k_sYz = always a_aeO
   in  a_aeO  k_sYz

specialised to IO, and with () inlined.

Where is the knot?

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Fwd: [Haskell-cafe] Re: Simple game: a monad for each player

2010-04-14 Thread Bertram Felgenhauer
Limestraël wrote:
 Okay, I just understood that 'Prompt' was just a sort of view for 'Program'.

Right.

runMyStackT :: MyStackT (Player m) a - Player m a
 
 According to what Bertram said, each strategy can pile its own custom monad
 stack ON the (Player m) monad.

Yes, and I meant what Heinrich wrote, you wrap some transformer around
the common Player m monad.

  game :: Monad m = Player m () - Player m () - m ()
 As it is written, it requires both players to run in the SAME monad.
 And if have a network player ( e.g.* Player (StateT Handle IO)* ) and an AI
 storing former opponent's moves ( e.g. *(Monad m) = Player (StateT [Move]
 m)* ), then they can't be in the same monad...

The idea is to pick m = IO, and then use

  type NetPlayer a = StateT Handle (Player IO) a

and

  type AIPlayer a = StateT [Move] (Player IO) a

or possibly

  type AIPlayer a = StateT [Move] (Player Identity) a

using the mapPlayerM (or mapMonad as suggested by Heinrich) function.

You'd then provide functions like

runAIPlayer :: AIPlayer a - Player IO a
runAIPlayer player = {- mapMonad (return . runIdentity) $ -}
evalStateT player []

This gives you most of what you want: You can add custom state and the
like to each player. You can not hope to exchange the base monad m,
because then the 'game' function would have to know how to run both
of those base monads simultaneously. A function like mapMonad is the
best device you can hope for, I think.

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Simple game: a monad for each player

2010-04-13 Thread Bertram Felgenhauer
Yves Parès wrote:
 
 I answered my own question by reading this monad-prompt example:
 http://paste.lisp.org/display/53766
 
 But one issue remains: those examples show how to make play EITHER a human
 or an AI. I don't see how to make a human player and an AI play SEQUENTIALLY
 (to a TicTacToe, for instance).

A useful idea is to turn the construction upside-down - rather than
implementing the game logic using MonadPrompt (or operational),
implement the players in such a monad.

A sketch:

{-# LANGUAGE GADTs, EmptyDataDecls #-}
import Control.Monad.Prompt hiding (Lift)

data Game -- game state
data Move -- move

data Request m a where
Board:: Request m Game
MakeMove :: Move - Request m ()
Lift :: m a - Request m a

type Player m a = Prompt (Request m) a

The core game logic would be provided by functions

initGame :: Monad m = m Game
initGame = undefined

makeMove :: Monad m = Move - Game - m Game
makeMove = undefined

To run a game we need to mediate between the two players, performing
their moves. To make this easier we turn the Player's program
into a list of actions. (This is essentially the Prompt type of the
operational package.)

data Program p a where
Return :: a - Program p a
Then   :: p b - (b - Program p a) - Program p a
  
programView :: Prompt p a - Program p a
programView = runPromptC Return Then

game :: Monad m = Player m () - Player m () - m ()
game first second = do
g - initGame
let first'  = programView first
second' = programView second
go :: Monad m
   = Game   -- current state
   - Program (Request m) () -- player 1
   - Program (Request m) () -- player 2
   - m ()
go g (Return _)   pl2 =
return ()
go g (Then (Lift l) pl1)  pl2 =
l = \a - go g (pl1 a) pl2
go g (Then Board pl1) pl2 =
go g (pl1 g) pl2
go g (Then (MakeMove mv) pl1) pl2 =
makeMove mv g = \g - go g pl2 (pl1 ())
go g first' second'

Note that MakeMove swaps the two players.

What have we achieved? Both players still can only access functions from
whatever monad m turns out to be. But now each strategy can pile its own
custom monad stack on the  Player m  monad! And of course, the use of
the m Monad is completely optional.

Mapping between various 'm' monads may also be useful:

mapPlayerM :: forall m1 m2 a . (forall a . m1 a - m2 a)
   - Player m1 a - Player m2 a
mapPlayerM m1m2 pl = runPromptC return handle pl where
handle :: Request m1 x - (x - Player m2 a) - Player m2 a
handle (Lift a)  x = prompt (Lift (m1m2 a)) = x
handle (MakeMove mv) x = prompt (MakeMove mv) = x
handle (Board)   x = prompt (Board) = x

This could be used to lock out the AI player from using IO, say.

HTH,

Bertram (aka int-e)


P.S. this is what 'game' would look like without the intermediate
'Program' type, using bare continuations instead:

newtype Pl m = Pl {
 runPl :: Pl m  -- other player
   - Game  -- current game state
   - m ()
}

gameC :: Monad m = Player m () - Player m () - m ()
gameC first second = do
g - initGame
let pl1 = runPromptC ret handle first
pl2 = runPromptC ret handle second
ret _ = Pl $ \_ _ - return ()
handle :: Monad m = Request m a - (a - Pl m) - Pl m
handle (Lift l)  pl1 =
Pl $ \pl2 g - l = \a - runPl (pl1 a) pl2 g
handle Board pl1 =
Pl $ \pl2 g - runPl (pl1 g) pl2 g
handle (MakeMove mv) pl1 =
Pl $ \pl2 g - runPl pl2 (pl1 ()) = makeMove mv g
runPl pl1 pl2 = initGame
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Asynchronous exception wormholes kill modularity

2010-04-09 Thread Bertram Felgenhauer
Simon Marlow wrote:
 but they are needlessly complicated, in my opinion.  This offers the
 same functionality:
 
 mask :: ((IO a - IO a) - IO b) - IO b
 mask io = do
   b - blocked
   if b
  then io id
  else block $ io unblock

How does forkIO fit into the picture? That's one point where reasonable
code may want to unblock all exceptions unconditionally - for example to
allow the thread to be killed later.

timeout t io = block $ do
result - newEmptyMVar
tid - forkIO $ unblock (io = putMVar result)
threadDelay t `onException` killThread tid
killThread tid
tryTakeMVar result

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Asynchronous exception wormholes kill modularity

2010-04-09 Thread Bertram Felgenhauer
Simon Marlow wrote:
 On 09/04/2010 09:40, Bertram Felgenhauer wrote:
 Simon Marlow wrote:
 mask :: ((IO a -  IO a) -  IO b) -  IO b
 
 How does forkIO fit into the picture? That's one point where reasonable
 code may want to unblock all exceptions unconditionally - for example to
 allow the thread to be killed later.
 
 Sure, and it works exactly as before in that the new thread inherits
 the masking state of its parent thread.  To unmask exceptions in the
 child thread you need to use the restore operator passed to the
 argument of mask.
 
 This does mean that if you fork a thread inside mask and don't pass
 it the restore operation, then it has no way to ever unmask
 exceptions.  At worst, this means you have to pass a restore value
 around where you didn't previously.
 
  timeout t io = block $ do
  result - newEmptyMVar
  tid - forkIO $ unblock (io = putMVar result)
  threadDelay t `onException` killThread tid
  killThread tid
  tryTakeMVar result
 
 This would be written
 
   timeout t io = mask $ \restore - do
   result - newEmptyMVar
   tid - forkIO $ restore (io = putMVar result)
   threadDelay t `onException` killThread tid
   killThread tid
   tryTakeMVar result

I'm worried about the case when this function is called with exceptions
already blocked. Then 'restore' will be the identity, and exceptions
will continue to be blocked inside the forked thread.

You could argue that this is the responsibility of the whole chain of
callers (who'd have to supply their own 'restore' functions that will
have to be incorporated into the 'io' action), but that goes against
modularity. In my opinion there's a valid demand for an escape hatch
out of the blocked exception state for newly forked threads.

It could be baked into a variant of the forkIO primitive, say

forkIOwithUnblock :: ((IO a - IO a) - IO b) - IO ThreadId

Kind regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: breadth first search one-liner?

2010-03-22 Thread Bertram Felgenhauer
Ross Paterson wrote:
 On Mon, Mar 22, 2010 at 10:30:32AM +, Johannes Waldmann wrote:
  Nice! - Where's the 'nub'?
 
 A bit longer:
 
 bfs :: Eq a = (a - [a]) - a - [a]
 bfs f s = concat $ takeWhile (not . null) $ map snd $ iterate step ([], [s])
   where step (seen, xs) = let seen' = xs++seen in (seen', nub $ [y | x - xs, 
 y - f x, notElem y seen'])

Basically the same idea:

bfs next start =
let go _  [] = []
go xs ys = let zs = nub (ys = next) \\ xs
   in  ys ++ go (zs ++ xs) zs
in  go [start] [start]

A slightly different approach is to add stage markers to the produced
streams, say

bfs next start =
let xs = nub $ Left 0 : Right s : (xs = next')
next' (Left n) = [Left (n + 1)]
next' (Right s) = map Right (next s)
stop (Left _ : Left _ : _) = []
stop (Left x : xs) = stop xs
stop (Right x : xs) = x : stop xs
in  stop xs

or
bfs next start = lefts . takeWhile (not . null)
. unfoldr (Just . span (either (const False) (const True)) . tail)
$ fix (nub . (Left 0 :) . (Right start :)
  . (= either ((:[]) . Left . succ) (map Right . next)))

This has the advantage that nub can be used directly. But it's far from
beautiful.

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: breadth first search one-liner?

2010-03-22 Thread Bertram Felgenhauer
Bertram Felgenhauer wrote:
 or
 bfs next start = lefts . takeWhile (not . null)

I copied the wrong version. This should be

bfs next start = rights . concat . takeWhile (not . null) 
-- rest unchanged
 . unfoldr (Just . span (either (const False) (const True)) . tail)
 $ fix (nub . (Left 0 :) . (Right start :)
   . (= either ((:[]) . Left . succ) (map Right . next)))

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] parallel matrix multiply (dph, par/pseq)

2010-01-18 Thread Bertram Felgenhauer
Johannes Waldmann wrote:
 Hello.
 
 How can I multiply matrices (of Doubles)
 with dph (-0.4.0)?  (ghc-6.12.1)  -  I was trying
 
 type Vector = [:Double:]
 type Matrix = [:Vector:]
 
 times :: Matrix - Matrix - Matrix
 times a b =
   mapP
   ( \ row - mapP ( \ col - sumP ( zipWithP (*) row col  ) )
   ( transposeP b )
   ) a
 
 but there is no such thing as transposeP.

It's possible to implement transposeP as follows,

{-# LANGUAGE PArr #-}
...
import qualified Data.Array.Parallel.Prelude.Int as I

transposeP :: Matrix - Matrix
transposeP a = let
h = lengthP a
w = lengthP (a !: 0)
rh = I.enumFromToP 0 (h I.- 1) -- or [: 0 .. h I.- 1 :]
rw = I.enumFromToP 0 (w I.- 1) -- or [: 0 .. w I.- 1 :]
  in
if h == 0 then [: :]
  else mapP (\y - mapP (\x - a !: x !: y) rh) rw

Maybe there is a better way?

Bertram

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Are there standard idioms for lazy, pure error handling?

2009-12-13 Thread Bertram Felgenhauer
Duncan Coutts wrote:
 Another approach that some people have advocated as a general purpose
 solution is to use:
 
 data Exceptional e a = Exceptional {
   exception :: Maybe e
   result:: a
 }
 
 However it's pretty clear from the structure of this type that it cannot
 cope with lazy error handling in sequences. If you try it you'll find
 you cannot do it without space leaks.

It's not all that clear. Consider this toy example (from a private
discussion with Henning Thielemann a while ago), which runs in
constant space:

import System.IO.Unsafe
import System.Environment
import Control.Monad
import Data.List

data Exceptional e a =
Exceptional { exception :: Maybe e, result :: a }

ok  a = Exceptional Nothing  a
fault e a = Exceptional (Just e) a

faulty :: Int - IO (Exceptional Int [Int])
faulty 0 = return (fault 0 [])
faulty 1 = return (ok [])
faulty n = unsafeInterleaveIO $ do
-- getChar
r - faulty (n-2)
return $ Exceptional (exception r) (n : result r)

main = do
n - readIO . head = getArgs
Exceptional exc res - faulty n
print $ last res
when (n `mod` 3 == 0) $ print exc

This works because ghc's garbage collector evaluates record selectors.
(There are a simpler cases where this matters, for example
last $ fst $ unzip [(a,a) | a - [1..1]]
which also runs in constant space.)

The approach is very fragile, though. For example, if we change main to

main = do
n - readIO . head = getArgs
f - faulty n
print $ last (result f)
when (n `mod` 3 == 0) $ print (exception f)

then the space leak reoccurs - doing the pattern match on the
Excpeptional constructor before using the result is essential.

Bad things also happen if ghc's optimiser turns the record selectors
into explicit pattern matches in the worker ('faulty' in the example).

Kind regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fair diagonals

2009-11-06 Thread Bertram Felgenhauer
Martijn van Steenbergen wrote:
 Bonus points for the following:
 * An infinite number of singleton axes produces [origin] (and
 finishes computing), e.g. forall (infinite) xs. diagN (map (:[]) xs)
 == map (:[]) xs

This can't be done - you can not produce any output before you have
checked that all the lists are not empty:

  diag (replicate n [0] ++ [[]]) == []

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] \Statically checked binomail heaps?

2009-10-30 Thread Bertram Felgenhauer
Maciej Kotowicz wrote:
 I'm trying to implement a binomial heaps from okaski's book [1]
 but as most it's possible to be statically checked for correctness of
 definition.

How about this encoding in Haskell 98?

data Tree a t = Tree { root :: a, children :: t }
data Nest a t = Nest { head :: Tree a t, tail :: t }

where
- Tree a () is a binomial tree of order 0
- Tree a (Nest a ()) is a binomial tree of order 1
- Tree a (Nest a (Nest a ())) is a binomial tree of order 2. and so on

data Heap' a t
= Z'
| D0' (Heap' a (Nest a t))
| D1' (Heap' a (Nest a t)) (Tree a t)

With (Tree a t) representing a binomial tree of rank n,
- Heap' a t  represents the list of binomial trees of rank = n in a heap
- Z' represents an empty list
- D0' xs represents a list /without/ a tree of rank n
- D1' xs x   represents a list /with/ a tree of rank n, namely x.

Finally, define

type Heap a = Heap' a ()

This forces Heap to be a well-shaped binomial heap, up to the equality
D0' Z' = Z'.

The main difference to standard binomial heaps is the existence of the
D0' nodes which represent skipped ranks. This makes the type checking
much easier (no comparison of natural numbers is required). It also
makes rank calculations unecessary - the rank increases implicitely as
the heap is traversed.

Heap order can be maintained by using a smart constructor:

-- combine two binomial trees of rank n into one of rank n+1
mkTree' :: Ord a = Tree a t - Tree a t - Tree a (Nest a t)
mkTree' l@(Tree a x) r@(Tree b y)
| a  b = Tree a (Nest r x)
| True  = Tree b (Nest l y)

Here is insert, as an example of a non-trivial operation:

insert' :: Ord a = Heap' a t - Tree a t - Heap' a t
insert' Z'b = D1' Z' b
insert' (D0' x)   b = D1' x b
insert' (D1' x a) b = D0' (x `insert'` mkTree' a b)

HTH,

Bertram

P.S. As a refinement, one can define
data Heap a = Z | D0 | D1 (Heap' a a) a

This improves memory efficiency at the cost of requiring more code, like

mkTree :: Ord a = a - a - Tree a a
mkTree a b | a  b = Tree a b
   | True  = Tree b a

insert :: Ord a = Heap a - a - Heap a
insert Zb = D1 Z' b
insert (D0 x)   b = D1 x b
insert (D1 x a) b = D0 (x `insert'` mkTree a b)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Haskell Platform - changing the global install dir

2009-10-06 Thread Bertram Felgenhauer
Paul Moore wrote:
 grep global -A7 D:\Documents and Settings\uk03306\Application 
 Data\cabal\config
 install-dirs global
   -- prefix: D:\\Apps\\Haskell\\Cabal
^^^
You should remove the '-- '. Lines beginning with '--' are comments.
So this line has no effect.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Quadratic complexity though use of STArrays

2009-09-27 Thread Bertram Felgenhauer
Dan Rosén wrote:
 What complexity does these functions have?
 
 I argue that the shuffleArr function should be O(n), since it only contains
 one loop of n, where each loop does actions that are O(1): generating a random
 number and swapping two elements in an array.
 
 However, they both have the same runnig time (roughly), and through looking
 at the plot it _very_ much looks quadratic.

Right. In the case of shuffleArr, it's the garbage collection time that
explodes. I think that the reason is that the garbage collector has to
scan the whole array (which lives in the old generation) on each garbage
collection, while the young generation is kept very small, making GCs
very frequent.

Indeed the program behaves much better with just one GC generation:

   ./a.out +RTS -G1
  (100, 1988698, 1760732, 1.98870,1.76073)
  (150, 2991546, 2683592, 1.99436,1.78906)
  (200, 4018388, 3611451, 2.00919,1.80573)

Output format:
  (n, time for your original shuffleArray (t1),
  time for Daniel's stricter version (t2),
  t1/n, t2/n)

While with the default settings, it looks much worse:
   ./a.out
  (100, 7575850, 7958790, 7.57585, 7.95879)
  (150,17119397,19449044,11.41293,12.96602)
  (200,30687335,35125661,15.34367,17.56283)

You can also increase the initial heap size,
   ./a.out +RTS -H250M
  (100, 1131828,  901863, 1.13183, 0.90186)
  (150, 1726737, 1427783, 1.15116, 0.95186)
  (200, 2324647, 1935705, 1.16232, 0.96785)

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to calculate de number of digits of an integer? (was: Is logBase right?)

2009-08-29 Thread Bertram Felgenhauer
Uwe Hollerbach wrote:
 Here's my version... maybe not as elegant as some, but it seems to
 work. For base 2 (or 2^k), it's probably possible to make this even
 more efficient by just walking along the integer as stored in memory,
 but that difference probably won't show up until at least tens of
 thousands of digits.
 
 Uwe
 
 ilogb :: Integer - Integer - Integer
 ilogb b n | n  0  = ilogb b (- n)
   | n  b  = 0
   | otherwise  = (up 1) - 1
   where up a = if n  (b ^ a)
   then bin (quot a 2) a
   else up (2*a)
 bin lo hi = if (hi - lo) = 1
then hi
else let av = quot (lo + hi) 2
 in if n  (b ^ av)
   then bin lo av
   else bin av hi

We can streamline this algorithm, avoiding the repeated iterated squaring
of the base that (^) does:

-- numDigits b n | n  0 = 1 + numDigits b (-n)
numDigits b n = 1 + fst (ilog b n) where
ilog b n
| n  b = (0, n)
| otherwise = let (e, r) = ilog (b*b) n
  in  if r  b then (2*e, r) else (2*e+1, r `div` b)

It's a worthwhile optimization, as timings on n = 2^100 show:

Prelude T length (show n)
301030
(0.48 secs, 17531388 bytes)
Prelude T numDigits 10 n
301030
(0.10 secs, 4233728 bytes)
Prelude T ilogb 10 n
301029
(1.00 secs, 43026552 bytes)

(Code compiled with -O2, but the interpreted version is just as fast; the
bulk of the time is spent in gmp anyway.)

Regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Improving MPTC usability when fundeps aren't appropriate?

2009-08-13 Thread Bertram Felgenhauer
Daniel Peebles wrote:
 I've been playing with multiparameter typeclasses recently and have
 written a few uncallable methods in the process. For example, in
 
 class Moo a b where
   moo :: a - a
 
 Another solution would be to artificially force moo to take
 a dummy b so that the compiler can figure out which instance you
 meant. That's what I've been doing in the mean time, but wouldn't it
 be simpler and less hackish to add a some form of instance
 annotation, like a type annotation, that would make it possible to
 specify what instance you wanted when it's ambiguous?

Syntax aside, dummy arguments have a disadvantage when it comes to
optimizing code, because the compiler doesn't know that the dummy
argument is unused in the function; indeed you could define instances
where the dummy argument is used.

For this reason it's technically better to use a newtype instead:

newtype Lambda t a = Lambda { unLambda :: a }

and, say,

class Moo a b where
moo :: Lambda b (a - a)

Note that we can convert between functions taking dummy arguments and
such lambda types easily:

lambdaToDummy :: Lambda t a - t - a
lambdaToDummy a _ = unLambda a

dummyToLambda :: (t - a) - Lambda t a
dummyToLambda f = Lambda (f undefined)

In fact, lambdaToDummy makes a great infix operator:

(@@) :: Lambda t a - t - a
(@@) = lambdaToDummy

infixl 1 @@

Now we can write

moo @@ (undefined :: x)

where with dummy arguments we would write

moo (undefined :: x) .

The compiler will inline (@@) and lambdaToDummy (they're both small)
and produce

   unLambda moo ,

that is, the plain value of type a - a that we wanted to use in the
first place. All this happens after type checking and fixing the
instance of Moo to use.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: [Haskell] ANNOUNCE: OpenGL 2.3.0.0

2009-08-01 Thread Bertram Felgenhauer
Rafael Gustavo da Cunha Pereira Pinto wrote:
 Sorry for all this annoyance, but I was starting to study those libraries
 (OpenGL, GLUT and GLFW) using Haskell and the update broke some of my code.

 Here is a patch that makes it compile, but then it breaks all code developed
 for GLFW-0.3, as all Floats need to be changed to CFloat.
 
 --- GLFW-0.3/Graphics/UI/GLFW.hs  2008-01-15 20:23:18.0 -0200
 +++ GLFW.hs   2009-07-30 21:09:55.0 -0300
 @@ -517,11 +517,11 @@
  _GLFW_INFINITY = 10.0 :: Double
  
  -- Callback function type
 -type GLFWwindowsizefun = Int32 - Int32 - IO ()
 +type GLFWwindowsizefun = CInt - CInt - IO ()

[snip]

You should use the type aliases GLint, GLfloat, etc.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] excercise - a completely lazy sorting algorithm

2009-07-12 Thread Bertram Felgenhauer
Petr Pudlak wrote:
 Would it be possible to create a lazy selection/sorting
 algorithm so that getting any element of the sorted list/array by its index
 would require just O(n) time, and getting all the elements would still be in
 O(n * log n)?

The (merge) sorting algorithm provided by Data.List has this property,
providing the first k elements in O(n + k log(n)) time. (source at [1])

As a mental model, you can think of suspended 'merge' evaluations as
internal nodes in a tree, with the two arguments as subtrees. In that
model, the algorithm turns into heap sort: It builds a balanced binary
tree with n external nodes, taking O(n) time -- this job is done by
merge_pairs -- and then repeatedly extracts the minimum element, taking
O(log(n)) time for each one.

regards,

Bertram

[1] 
http://www.haskell.org/ghc/docs/latest/html/libraries/base/src/Data-List.html#mergesort
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Type families and polymorphism

2009-07-12 Thread Bertram Felgenhauer
Jeremy Yallop wrote:
 Why does compiling the following program give an error?

 {-# LANGUAGE TypeFamilies, RankNTypes #-}

 type family TF a

 identity :: (forall a. TF a) - (forall a. TF a)
 identity x = x

 GHC 6.10.3 gives me:

 Couldn't match expected type `TF a1' against inferred type `TF a'
 In the expression: x
 In the definition of `identity': identity x = x

The error message is slightly better in GHC head:

Couldn't match expected type `TF a1' against inferred type `TF a'
  NB: `TF' is a type function, and may not be injective
In the expression: x
In the definition of `identity': identity x = x

Dan Doel already explained how the lack of injectivity leads to a type
checking error.

FWIW, the same code would work with a data family, because data families
are injective.

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] STM/Data Invariant related Segfault with GHC 6.10.3

2009-06-22 Thread Bertram Felgenhauer
Jan Schaumlöffel wrote:
 I just discovered that programs compiled with GHC 6.10.3 segfault when
 accessing a TVar created under certain conditions.

This is a known bug, but it hasn't gotten much attention:

  http://hackage.haskell.org/trac/ghc/ticket/3049

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Slightly off-topic: Lambda calculus

2009-06-21 Thread Bertram Felgenhauer
Miguel Mitrofanov wrote:
 Correction: I think that one can find an expression that causes name 
 clashes anyway, I'm just not certain that there is one that would clash 
 independent of whichever order you choose.

Yes there is.

Consider

 (\f g - f (f (f (f (f (f g)) (\l a b - l (b a)) (\x - x)

which has 6 variables in total. This reduces to the normal form

 \a b c d e f g - g (f (e (d (c (b a)

which requires 7 variables. So without alpha-conversion at least one of
the original 6 variables will clash with itself.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Performance of functional priority queues

2009-06-15 Thread Bertram Felgenhauer
Sebastian Sylvan wrote:
 On Mon, Jun 15, 2009 at 4:18 AM, Richard O'Keefe o...@cs.otago.ac.nz wrote:
  There's a current thread in the Erlang mailing list about
  priority queues.  I'm aware of, for example, the Brodal/Okasaki
  paper and the David King paper. I'm also aware of James Cook's
  priority queue package in Hackage, have my own copy of Okasaki's
  book, and have just spent an hour searching the web.
 
  One of the correspondents in that thread claims that it is
  provably impossible to have an efficient priority queue implementation
 
 A priority queue based on skewed binomial heaps is asymptotically optimal
 (O(1) for everything except deleteMin which is O(log n)), so if that's what
 he means by efficient then he's most definitely wrong.

What about decreaseKey in a purely functional setting? I suppose it's
O(log n), based on the intuition of trees with limited branching factor.
Fibonacci heaps can do it in O(1), which makes a difference for
Dijkstra's algorithm, for example.

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] nubBy seems broken in recent GHCs

2009-06-06 Thread Bertram Felgenhauer
Cale Gibbard wrote:
 According to the Report:
 
   nubBy:: (a - a - Bool) - [a] - [a]
   nubBy eq []  =  []
   nubBy eq (x:xs)  =  x : nubBy eq (filter (\y - not (eq x y)) xs)
 
 Hence, we should have that
 
 nubBy () (1:2:[])
 = 1 : nubBy () (filter (\y - not (1  y)) (2:[]))
 = 1 : nubBy () []
 = 1 : []
 
 However in ghc-6.10.3:
 
 Prelude Data.List nubBy () [1,2]
 [1,2]

Interesting. This was changed in response to

http://hackage.haskell.org/trac/ghc/ticket/2528

| Tue Sep  2 11:29:50 CEST 2008  Simon Marlow marlo...@gmail.com
|   * #2528: reverse the order of args to (==) in nubBy to match nub
|   This only makes a difference when the (==) definition is not
|   reflexive, but strictly speaking it does violate the report definition
|   of nubBy, so we should fix it.

It turns out that 'elem' differs from the report version and should
have its comparison reversed. Of course that would only ever matter
for broken Eq instances.

However, the report also states that the nubBy function may assume that
the given predicate defines an equivalence relation.

http://haskell.org/onlinereport/list.html#sect17.6

So I'm not sure there's anything to be fixed here - although backing
out the above patch probably won't hurt anybody.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Still having problems building a very simple Executable ....

2009-06-06 Thread Bertram Felgenhauer
Hi Vasili,

Vasili I. Galchin wrote:
 I picked an exceedingly case to build an Executable:
 
 Executable QNameTest
Hs-source-dirs: Swish/
Main-Is:HaskellUtils/QNameTest.hs
Other-Modules:  HaskellUtils.QName

I'm not sure what you did; the original Swish code doesn't have any
hierarchical modules. Starting with that, I could build QNameTest like
this:

executable QNameTest
hs-source-dirs: HaskellUtils HaskellRDF HaskellRDF/HUnit
main-is: QNameTest.hs
other-modules: HUnitLang HUnitBase HUnitText QName QNameTest
build-depends: base, haskell98
ghc-options: -main-is QNameTest

The trick here is to list all used subdirectories in hs-source-dirs;
the module HUnit for example will be found in HaskellRDF/HUnit/ under
the name HUnit.lhs.

If you have a module A.B.C then the source file should be in
  foo/A/B/C.hs (or lhs or some other recognized extension)
and foo should be listed in hs-source-dirs.

Also note the ghc-options line: it tells ghc to use QNameTest instead of
Main for the main module.

I also managed to produce a Swish executable after a bit of tweaking;
you can find my changes in

http://int-e.home.tlink.de/haskell/Swish.diff

The cabal file is at the end of the diff.

HTH,

Bertram

P.S. I took Swish code from here:
http://www.ninebynine.org/RDFNotes/Swish/Intro.html#SwishLinks
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] type checking that I can't figure out ....

2009-06-03 Thread Bertram Felgenhauer
Michael Snoyman wrote:
 On Wed, Jun 3, 2009 at 8:42 AM, Daniel Fischer 
 daniel.is.fisc...@web.dewrote:
  Am Mittwoch 03 Juni 2009 06:12:46 schrieb Michael Snoyman:
   2. lookup does not return any generalized Monad, just Maybe (I think that
   should be changed).
 
  Data.Map.lookup used to return a value in any monad you wanted, I believe
  until 6.8
  inclusive.
  I don't think it's going to change again soon.
 
 Is there a reason why it only returns in the Maybe monad? I often times have
 to write a liftMaybe function to deal with that.

Here's the proposal that changed it:
  http://hackage.haskell.org/trac/ghc/ticket/2309

The discussion about the proposal can be found here:
  http://www.haskell.org/pipermail/libraries/2008-May/009698.html

(There's even the suggestion of adding a function like liftMaybe to
Data.Maybe, but apparently nobody turned that into a formal proposal.)

Regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Cabal/primes

2009-06-02 Thread Bertram Felgenhauer
michael rice wrote:
 Finally got adventurous enough to get Cabal working, downloaded the
 primes package, and got the following error message when trying 
 isPrime. Am I missing something here?

The Data.Numbers.Primes module of the primes package does not implement
'isPrime'. The Numbers package is probably the one you want.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Stack overflow

2009-05-28 Thread Bertram Felgenhauer
Krzysztof Skrzętnicki wrote:
 2009/5/27 Bertram Felgenhauer bertram.felgenha...@googlemail.com:
  I wrote:
  Krzysztof Skrzętnicki wrote:
  The code for modifying the counter:
  (\ msg - atomicModifyIORef ioref (\ cnt - (cntMsg cnt msg,(
 
  atomicModifyIORef does not force the new value of the IORef.
  If the previous contents of the IORef is x, the new contents
  will be a thunk,
 
    (\ cnt - (cntMsg cnt msg,())) x
 
  Sorry, it's slightly worse than that. The contents becomes
 
     sel_0 (\ cnt - (cntMsg cnt msg, ())) x
 
  where sel_0 is basically an RTS internal version of fst.
 
  Instead of reading the new value of the IORef, you could also force the
  old one:
 
     atomicModifyIORef ioref (\ cnt - (cntMsg cnt msg, msg)) = (return $!)
 
 
 Thanks for the tip, although it seems tricky to get it right. I wonder
 why there is no strict version of atomicModifyIORef?

Something like this?

-- | Stricter version of 'atomicModifyIORef', which prevents building
--   up huge thunks in the 'IORef' due to repeated modification.
--   Unlike 'atomicModifyIORef', 'atomicModifyIORef'' may block.
atomicModifyIORef' :: IORef a - (a - (a, b)) - IO b
atomicModifyIORef' ioref f = do
res - atomicModifyIORef ioref f
new - readIORef ioref
new `seq` return res

(The step that may block is forcing the new value - if another thread is
already evaluating part of the thunk, the currently executing thread
will block, waiting for the other thread to finish.)

 Dually there might be a strict version of IORef datatype.

One interesting feature of atomicModifyIORef is that its implementation
is lock-free, and never blocks (which affects exception handling):
replacing the old value by the new value is done with compare-and-swap
operation in a tight loop. Each iteration executes very quickly because
all it does is replace the reference to the old value in the new thunk.

With a strict IORef, the time window between reading the old value and
storing the new value would become arbitrarily large, because you'd have
to force the new value before exchanging it with the old value. So a
reasonable implementation would have to use locks instead, I think,
making atomicModifyIORef more expensive, and less useful in contexts
that block exceptions.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Stack overflow

2009-05-27 Thread Bertram Felgenhauer
Krzysztof Skrzętnicki wrote:
 The code for modifying the counter:
 (\ msg - atomicModifyIORef ioref (\ cnt - (cntMsg cnt msg,(

atomicModifyIORef does not force the new value of the IORef.
If the previous contents of the IORef is x, the new contents
will be a thunk,

   (\ cnt - (cntMsg cnt msg,())) x

You can try forcing the new value, say by adding

readIORef ioref = (return $!)

after the atomicModifyIORef.

 The datatype itself is strict. So where is the thunk actually accumulating?

In the IORef.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Stack overflow

2009-05-27 Thread Bertram Felgenhauer
I wrote:
 Krzysztof Skrzętnicki wrote:
 The code for modifying the counter:
 (\ msg - atomicModifyIORef ioref (\ cnt - (cntMsg cnt msg,(

 atomicModifyIORef does not force the new value of the IORef.
 If the previous contents of the IORef is x, the new contents
 will be a thunk,

   (\ cnt - (cntMsg cnt msg,())) x

Sorry, it's slightly worse than that. The contents becomes

sel_0 (\ cnt - (cntMsg cnt msg, ())) x

where sel_0 is basically an RTS internal version of fst.

Instead of reading the new value of the IORef, you could also force the
old one:

atomicModifyIORef ioref (\ cnt - (cntMsg cnt msg, msg)) = (return $!)

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Issues with IO and FFIs

2009-04-22 Thread Bertram Felgenhauer
Jon Harrop wrote:
 
 Does anyone have any comments on the following criticism of some
 difficulties  with FFI, including IO, in Haskell:
 
 http://groups.google.com/group/comp.lang.functional/msg/6d650c086b2c8a49?hl=en

That post conflates two separate questions.

1) binding to foreign libraries that export functions that are
   semantically pure, but can't be imported as pure functions, for
   example because they use pointers.

   unsafePerformIO was added in the FFI spec for exactly this purpose,
   as you can see for yourself here:

   http://www.cse.unsw.edu.au/~chak/haskell/ffi/ffi/ffise5.html#x8-240005.1

2) global variables. This has been a topic of long and heated debates,
   with no clear winner. I don't think another debate will be fruitful.

 In particular, is it not always possible to write IO libraries safely in 
 Haskell?

Flame bait? Yes, it is not always possible, because some interfaces are
inherently unsafe, putting the burden of using them safely on the user.
unsafePerformIO itself is a prime example for this.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Link errors in Gtk2Hs are more general than I thought.

2009-04-04 Thread Bertram Felgenhauer
Jeff Heard wrote:
 I tried to get yi to run on my Mac earlier and I get the following errors:
 
 dyld: lazy symbol binding failed: Symbol not found:
 _cairo_quartz_font_face_create_for_atsu_font_id
   Referenced from: /opt/local/lib/libpangocairo-1.0.0.dylib
   Expected in: /opt/local/lib/libcairo.2.dylib
 
 dyld: Symbol not found: _cairo_quartz_font_face_create_for_atsu_font_id
   Referenced from: /opt/local/lib/libpangocairo-1.0.0.dylib
   Expected in: /opt/local/lib/libcairo.2.dylib

This looks like the quartz backend was disabled in the cairo C library,
not like a gtk2hs problem. I don't know how ports work, but

  http://trac.macports.org/browser/trunk/dports/graphics/cairo/Portfile

defines a 'quartz' variant that enables that backend. Another idea is
to reinstall pango.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trying to download leksah ....

2009-04-04 Thread Bertram Felgenhauer
Vasili I. Galchin wrote:
 vigalc...@ubuntu:~/FTP$ darcs get http://code.haskell.org/leksah
 Invalid repository:  http://code.haskell.org/leksah
 
 darcs failed:  Failed to download URL
 http://code.haskell.org/leksah/_darcs/inventory : HTTP error (404?)
 
 I did a google on HTTP 404 = not found  why?

It's a darcs 2 repository, so you'll need to install darcs 2. See

http://code.haskell.org/leksah/_darcs/format

Admittedly, the error message is horrible.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] TMVar's are great but fail under ghc 6.10.1 windows

2009-03-31 Thread Bertram Felgenhauer
Alberto G. Corona  wrote:
 however, It happens that fails in my windows box with ghc 6.10.1  , single
 core
 
 here is the code and the results:
 
 ---begin code:
 module Main where
 
 import Control.Concurrent.STM
 
 import Control.Concurrent
 import System.IO.Unsafe
 import GHC.Conc
 
 
 
 mtxs=  unsafePerformIO $ mapM newTMVarIO $ take 5 $ repeat  0
 
 proc i= atomically $ do
  unsafeIOToSTM $ putStr $ init of process ++ show i++\n

As Sterling points out, unsafeIOToSTM is really unsafe. A fundamental
restriction is that the IO action used with unsafeIOToSTM may not block.
However, putStr may block.

It's actually possible to do the logging without blocking:

 cut here 
module Main where

import Control.Concurrent
import Control.Concurrent.STM
import GHC.Conc
import Control.Concurrent.MVar
import Data.IORef
import Control.Monad

data Logger = Logger (IORef [String]) (MVar ())

newLogger :: IO Logger
newLogger = do
ref  - newIORef []
wake - newEmptyMVar
return $ Logger ref wake

logLogger :: Logger - String - IO ()
logLogger (Logger ref wake) msg = do
atomicModifyIORef ref $ \msgs - (msg:msgs, ())
tryPutMVar wake ()
return ()

dumpLogger :: Logger - IO ()
dumpLogger (Logger ref wake) = forever $ do
takeMVar wake
msgs - atomicModifyIORef ref $ \msgs - ([], msgs)
putStr $ unlines . reverse $ msgs

proc log mtxs i = do
 let logSTM = unsafeIOToSTM . log
 xs' - atomically $ do
   logSTM $ init of process  ++ show i
   xs - mapM takeTMVar mtxs
   mapM (\(mtx, x) - putTMVar mtx (x+1)) $ zip mtxs xs
   xs' - mapM readTMVar mtxs
   logSTM $ End of processs  ++ show i ++  result =  ++ show xs'
   return xs'
 log $ Final result of process  ++ show i ++  =  ++ show xs'

main = do
  log - newLogger
  forkIO $ dumpLogger log
  mtxs - replicateM 5 $ newTMVarIO 0
  mapM (forkIO . proc (logLogger log) mtxs) [1..5]
  threadDelay 100


And that gives reasonable results, for example:

init of process 1
End of processs 1 result= [1,1,1,1,1]
Final result of process 1 = [1,1,1,1,1]
init of process 2
End of processs 2 result= [2,2,2,2,2]
Final result of process 2 = [2,2,2,2,2]
init of process 3
init of process 4
End of processs 4 result= [3,3,3,3,3]
Final result of process 4 = [3,3,3,3,3]
init of process 5
End of processs 5 result= [4,4,4,4,4]
Final result of process 5 = [4,4,4,4,4]
End of processs 3 result= [3,3,3,3,3]
init of process 3
End of processs 3 result= [5,5,5,5,5]
Final result of process 3 = [5,5,5,5,5]

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Definition of tail recursive wrt Folds

2009-03-28 Thread Bertram Felgenhauer
Ben Franksen wrote:
 Mark Spezzano wrote:
  Just looking at the definitions for foldr and foldl I see that foldl is
  (apparently) tail recursive while foldr is not.
  
  Why?
  
  Is it because foldl defers calling itself until last whereas foldr
  evaluates itself as it runs?
  
  What, strictly speaking, is the definition of ”tail recursive” as opposed
  to just “recursive”?
 
 An application of some function f inside another function g is in 'tail
 position' (or a 'tail call') if the result of applying f is the result of
 g. Operationally speaking, calling f is the very last thing g does. Tail
 calls can be optimized by a compiler (or interpreter) so that the call does
 not use additional stack; that is, the call can be replaced by a jump.
 
 A function is called ”tail recursive” (as opposed to just “recursive”) if
 the recursive call to itself is in tail position. If the compiler performs
 tail call optimization, tail recursive functions can work with constant
 stack space, similar to a (imperative) loop.
 
 Looking at a definition of foldl, e.g.
 
 foldl f z0 xs0 = lgo z0 xs0 where
   lgo z [] =  z
   lgo z (x:xs) = lgo (f z x) xs
 
 you see that lgo calls itself in tail position, thus is tail recursive. In
 contrast, foldr can be defined as
 
 foldr k z xs = go xs where
   go [] = z
   go (y:ys) = y `k` go ys
 
 where you see that the result of go is not the recursive call to go.
 Instead, the result is  y `k` go ys . Thus, foldr is not tail recursive.
 
 So, if you are saying that foldl defers calling itself until last whereas
 foldr evaluates itself as it runs then you got the right idea, I think.
 The point is that foldr still needs to do something (namely to apply  (y
 `k`)) to the result of applying itself. It needs to remember to do so, and
 thus the stack grows linearly with the size of the list.

Sorry, but that's wrong. It would be right in a strict language. In Haskell,
the 'go ys' term is not evaluated straight away; it is instead turned into
a suspended evaluation (a thunk) that is typically stored on the heap.

The following discussion is implementation specific, with ghc in mind.
Haskell itself has no notion of a stack.

In fact both using foldl and using foldr can produce stack overflows:

  Prelude foldl (+) 0 [1..10^7]
  *** Exception: stack overflow
  Prelude foldr (+) 0 [1..10^7]
  *** Exception: stack overflow

Let's examine why. First, consider the foldl case. For simplicity, I'll
ignore the evaluation of [1..10^7]. The first few evaluation steps are

 foldl (+) 0 [1,2,3..100]
  - lgo 0 [1,2,3..100]
  - lgo (0+1) [2,3,4..100]
  - lgo ((0+1)+2) [3,4,5..10]

None of these steps uses the stack - foldl is indeed tail recursive. However,
the ((0+1)+2) is a thunk on the heap. Continuing,

  - lgo ((...(0+1)+...)+999) [1000]
  - lgo ((...(0+1)+...)+1000) []
  - ((...(0+1)+...)+1000)

Now we have to evaluate that huge thunk. This turns out to cause trouble
because + (for Integers) is strict. So in order to find x+1000, the
code needs the value of x first. And that's where the stack gets
involved: the information of the pending addition, (?+1000) is pushed
onto the stack, and evaluation proceeds with the first term.

Denoting the stack by [[item1, item2, ...]], evaluation continues like
this:

  - [[(?+1000)]] ((...(0+1)+...)+999)
  - [[(?+1000),(?+999)]]

The stack will keep growing, until the (0+1) is reached, or the stack
overflows.

To make things more confusing, ghc has a strictness analyzer that
sometimes manages to avoid such a thunk being built up. For an
example how strictness helps see foldl' (below).


Now for the foldr case. Evaluation in that case looks a bit different:

 foldr (+) 0 [1,2,3..1000]
  - go [1,2,3..1000]
  - 1 + go [2,3,4..1000]

No stack was used so far; the go [2,3,4..100] is a thunk. Now, as
above, we need to add two numbers. And that's where the stack gets
involved again:

  - [[(1+?)]] go [2,3,4..1000]
  - [[(1+?)]] 2 + go [3,4,5..1000]
  - [[(1+?),(2+?)]] go [3,4,5..1000]
  - [[(1+?),(2+?),(3+?)]] go [4,5,6..1000]

and so on, until we reach  go []  or the stack overflows. Note that
there is no reference to 'go' on the stack at all.

If instead of (+), we had a lazy function like (:), the stack would
not get involved in this way:

 foldr (:) [] [1,2,3..1000]
  - go [1,2,3..1000]
  - 1 : go [2,3,4..1000]

Which is in weak head normal form, so evaluation stops here. Later,
when other code examines the list, the 'go [2,3,4..1000]' thunk
will get evaluated.


As a final note, the stack overflow with  foldl  above is cured by using
foldl', which is _strict_ in the accumulator.

For reference,
foldl' f z0 xs0 = lgo z0 xs0
where lgo z [] = z
  lgo z (x:xs) = let z' = f z x in z' `seq` lgo z' xs

 foldl' (+) 0 [1,2,3..100]
  - lgo 0 [1,2,3..100]
  - let z' = 0+1 in z' `seq` lgo z' [2,3,4..100]

Now 

Re: [Haskell-cafe] Performance question

2009-02-26 Thread Bertram Felgenhauer
hask...@kudling.de wrote:
 Do you think it would be feasable to replace the GHC implementation
 of System.Random with something like System.Random.Mersenne?

There's a problem with using the Mersenne Twister: System.Random's
interface has a split method:

class RandomGen g where
   split:: g - (g, g)

The Mersenne Twister is good at producing a single stream of random
numbers - in fact it works by generating a whole block of random
numbers in one go, then consuming the block, and only then generating
the next block.

I have no idea how to implement a split method that produces
independent streams. Even if I did, using split a lot would likely
spoil the performance benefit of the generator.

(System.Random.Mersenne.Pure64 provides a RandomGen instance for
PureMT, but it cheats:)

   split = error System.Random.Mersenne.Pure: unable to split the mersenne 
twister

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Data.Binary, strict reading

2009-02-25 Thread Bertram Felgenhauer
Neil Mitchell wrote:
 Hi,
 
 I want to read a file using Data.Binary, and I want to read the file
 strictly - i.e. when I leave the read file I want to guarantee the
 handle is closed. The reason is that (possibly immediately after) I
 need to write to the file. The following is the magic I need to use -
 is it all necessary, is it guaranteed correct, should I use something
 else?
 
 src - decodeFile _make/_make
 Map.size mp `seq` performGC

With binary 0.5,

src - decodeFile _make/_make
return $! src

should close the file, assuming that all the data is read from the file,
thanks to this patch:

  Mon Aug 25 23:01:09 CEST 2008  Don Stewart d...@galois.com
* WHNF the tail of a bytestring on decodeFile, will close the resource

For older versions,

import qualified Data.Binary.Get as Get

data EOF = EOF
instance Binary EOF where
get = do
   eof - Get.isEmpty
   return (if eof then EOF else error EOF expected)
put EOF = return ()

...
(src, EOF) - decodeFile _make/_make

accomplishes the same effect.

Btw, contrary to what Duncan said, Get is a lazy monad (lazy in its
actions, that is):

instance Binary EOF where
get = do
   eof - Get.isEmpty
   when (not eof) error EOF expected
   return EOF
put EOF = return ()

does not help, because the result (EOF) does not depend on the value
returned by isEmpty.

The idea of using isEmpty for closing the file is not perfect though;
due to the lazy nature of Get, there's a stack overflow lurking below:

main = do
encodeFile w.bin [0..100 :: Int]
m - decodeFile w.bin
print $ foldl' (+) 0 (m :: [Int])

One idea to fix this is to force the read data before checking for EOF,
as follows:

   data BinaryRNF a = BinaryRNF a

   instance (NFData a, Binary a) = Binary (BinaryRNF a) where
   get = (\a - rnf a `seq` BinaryRNF a) `fmap` get
   put (BinaryRNF a) = put a

   main = do
   encodeFile w.bin [0..100 :: Int]
   (BinaryRNF m, EOF) - decode `fmap` L.readFile w.bin
   print $ foldl' (+) 0 (m :: [Int])

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Data.Binary, strict reading

2009-02-25 Thread Bertram Felgenhauer
I wrote:
 With binary 0.5,

Or binary 0.4.3 and later.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Pickling a finite map (Binary + zlib) [was: [Haskell-cafe] Data.Binary poor read performance]

2009-02-24 Thread Bertram Felgenhauer
Don Stewart wrote:
 dons:
[...]
 Just serialising straight lists of pairs,
[...]
 And reading them back in,
 
 main = do
 [f] - getArgs
 m - decode `fmap` L.readFile f
 print (length (m :: [(B.ByteString,Int)]))
 print done

Well, you don't actually read the whole list here, just its length:

instance Binary a = Binary [a] where
put l  = put (length l)  mapM_ put l
get= do n - get :: Get Int
replicateM n get

To demonstrate, this works:

main = do
L.writeFile v (encode (42 :: Int))
m - decode `fmap` L.readFile v
print (length (m :: [Int]))

So instead, we should try something like this:

import Control.Parallel.Strategies

instance NFData B.ByteString where
rnf bs = bs `seq` ()

main = do
[f] - getArgs
m - decode `fmap` L.readFile f
print (rnf m `seq` length (m :: [(B.ByteString,Int)]))

My timings:

reading list, without rnf:
0.04s
with rnf:
0.16s
reading a Data.Map:
0.52s
with rnf:
0.62s

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Pickling a finite map (Binary + zlib) [was: [Haskell-cafe] Data.Binary poor read performance]

2009-02-24 Thread Bertram Felgenhauer
Felipe Lessa wrote:
 On Tue, Feb 24, 2009 at 4:59 AM, Don Stewart d...@galois.com wrote:
  Looks like the Map reading/showing via association lists could do with
  further work.
 
  Anyone want to dig around in the Map instance? (There's also some patches 
  for
  an alternative lazy Map serialisation, if people are keen to load maps -- 
  happstack devs?).
 
 From binary-0.5:
 
 instance (Ord k, Binary k, Binary e) = Binary (Map.Map k e) where
 put m = put (Map.size m)  mapM_ put (Map.toAscList m)
 get   = liftM Map.fromDistinctAscList get
 
 instance Binary a = Binary [a] where
 put l  = put (length l)  mapM_ put l
 get= do n - get :: Get Int
 replicateM n get
 
 Can't get better, I think.

We can improve it slightly (about 20% runtime in dons example [*]):

   instance (Ord k, Binary k, Binary e) = Binary (Map.Map k e) where
   get = liftM (Map.fromDistinctAscList . map strictValue) get where
  strictValue (k,v) = (v `seq` k, v)

The point is that Data.Map.Map is strict in the keys, but not in the
values of the map. In the case of deserialisation this means the values
will be thunks that hang on to the Daya.Binary buffer.

 Now, from containers-0.2.0.0:
 
 fromDistinctAscList :: [(k,a)] - Map k a
 fromDistinctAscList xs
   = build const (length xs) xs
   where
 -- 1) use continutations so that we use heap space instead of stack space.
 -- 2) special case for n==5 to build bushier trees.
 build c 0 xs'  = c Tip xs'
 build c 5 xs'  = case xs' of
((k1,x1):(k2,x2):(k3,x3):(k4,x4):(k5,x5):xx)
 - c (bin k4 x4 (bin k2 x2 (singleton k1
 x1) (singleton k3 x3)) (singleton k5 x5)) xx
_ - error fromDistinctAscList build
 build c n xs'  = seq nr $ build (buildR nr c) nl xs'
where
  nl = n `div` 2
  nr = n - nl - 1
 
 buildR n c l ((k,x):ys) = build (buildB l k x c) n ys
 buildR _ _ _ [] = error fromDistinctAscList buildR []
 buildB l k x c r zs = c (bin k x l r) zs
 
 
 The builds seem fine, but we spot a (length xs) on the beginning.
 Maybe this is the culprit? We already know the size of the map (it was
 serialized), so it is just a matter of exporting
 
 fromDistinctAscSizedList :: Int - [(k, a)] - Map k a

Eliminating the 'length' call helps, too, improving runtime by
another about 5%.

The result is still a factor of 1.7 slower than reading the list of
key/value pairs.

Bertram

[*] Notes on timings: 
1) I used `rnf` for all timings, as in my previous mail.
2) I noticed that in my previous measurements, the GC time for the
   Data.Map tests was excessively large (70% and more), so I used
   +RTS -H32M this time. This resulted in a significant runtime
   improvement of about 30%.
3) Do your own measurements! Some code to play with is available here:
   http://int-e.home.tlink.de/haskell/MapTest.hs
   http://int-e.home.tlink.de/haskell/Map.hs
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ONeillPrimes.hs - priority queue broken?

2009-02-24 Thread Bertram Felgenhauer
Eugene Kirpichov wrote:
 Hi,
 I've recently tried to use the priority queue from the
 ONeillPrimes.hs, which is famous for being a very fast prime
 generator: actually, I translated the code to Scheme and dropped the
 values, to end up with a key-only heap implementation.
 However, the code didn't work quite well, and I decided to check the
 haskell code itself.
 
 Turns out that it is broken.
 
 module PQ where
 
 import Test.QuickCheck
 
 data PriorityQ k v = Lf
| Br {-# UNPACK #-} !k v !(PriorityQ k v) !(PriorityQ k v)
deriving (Eq, Ord, Read, Show)

Let
size Lf = 0
size (Br _ _ l r) = 1 + sizePQ l + sizePQ r

be the size of the priority queue.

To work, the code maintains heap order and the invariant that the left
subtree is at least as large as the right one, and at most one element
larger.

validSize Lf = True
validSize (Br _ _ l r) = validSize l  validSize r  0 = d  d = 1
 where d = size l - size r

This invariant justifies the assumption that Daniel Fischer pointed out.

The code is careful to maintain this invariant, but it is broken in one
place:

 leftrem :: PriorityQ k v - (k, v, PriorityQ k v)
 leftrem (Br vk vv Lf Lf) = (vk, vv, Lf)

(Why not this?)
leftrem (Br vk vv Lf _) = (vk, vv, Lf)

 leftrem (Br vk vv t1 t2) = (wk, wv, Br vk vv t t2) where
 (wk, wv, t) = leftrem t1

Here, the left subtree is replaced by one that is one element smaller.
This breaks the invariant if the two original subtrees had equal size.
The bug is easy to fix; just swap the two subtrees on the right side:

leftrem (Br vk vv t1 t2) = (wk, wv, Br vk vv t2 t) where
(wk, wv, t) = leftrem t1

 leftrem _= error Empty heap!
 *PQ s [3,1,4,1,5,9,2,6,5,3,5,8]
 [1,1,2*** Exception: Empty heap!

*PQ s [3,1,4,1,5,9,2,6,5,3,5,8]
[1,1,2,3,3,4,5,5,5,6,8,9]

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ONeillPrimes.hs - priority queue broken?

2009-02-24 Thread Bertram Felgenhauer
Eugene Kirpichov wrote:
 module PQ where
 
 import Test.QuickCheck
 
 data PriorityQ k v = Lf
| Br {-# UNPACK #-} !k v !(PriorityQ k v) !(PriorityQ k v)
deriving (Eq, Ord, Read, Show)

For the record, we can exploit the invariant that the sizes of the left
and right subtrees have difference 0 or 1 to implement 'size' in better
than O(n) time, where n is the size of the heap:

-- Return number of elements in the priority queue.
-- /O(log(n)^2)/
size :: PriorityQ k v - Int
size Lf = 0
size (Br _ _ t1 t2) = 2*n + rest n t1 t2 where
n = size t2
-- rest n p q, where n = size q, and size p - size q = 0 or 1
-- returns 1 + size p - size q.
rest :: Int - PriorityQ k v - PriorityQ k v - Int
rest 0 Lf _ = 1
rest 0 _  _ = 2
rest n (Br _ _ p1 p2) (Br _ _ q1 q2) = case r of
0 - rest d p1 q1 -- subtree sizes: (d or d+1), d; d, d
1 - rest d p2 q2 -- subtree sizes: d+1, (d or d+1); d+1, d
  where (d, r) = (n-1) `quotRem` 2

Of course we can reduce the cost to O(1) by annotating the heap with its
size, but that is less interesting, and incurs a little overhead in the
other heap operations.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: speed: ghc vs gcc

2009-02-20 Thread Bertram Felgenhauer
Don Stewart wrote:
 If we take what I usually see as the best loops GHC can do for this kind
 of thing:
 
 import Data.Array.Vector
 
 main = print (sumU (enumFromToU 1 (10^9 :: Int)))
 
 And compile it:
 
 $ ghc-core A.hs -O2 -fvia-C -optc-O3
 
 We get ideal core, all data structures fused away, and no heap allocation:
 
 $wfold_s15t :: Int# - Int# - Int#
 $wfold_s15t =
   \ (ww1_s150 :: Int#) (ww2_s154 :: Int#) -
 case # ww2_s154 ww_s14U of wild_aWm {
   False -
 $wfold_s15t
   (+# ww1_s150 ww2_s154) (+# ww2_s154 1);
   True - ww1_s150
 }; } in
 case $wfold_s15t 0 1
 
 Which produces nice assembly:
 
 s16e_info:
   cmpq6(%rbx), %rdi
   jg  .L2
   addq%rdi, %rsi
   leaq1(%rdi), %rdi
   jmp s16e_info

Note that this does the addition to the accumulator first, and then
increments the counter.

[snip]
 I wondered if we just got worse code on backwards counting loops. So
 translating into the obvious translation, counting up:
 
 main = print (sum0 0 1)
 
 sum0 :: Int - Int - Int
 sum0 acc n | n  10^9  = acc
| otherwise = sum0 (acc + n) (n + 1)
 
 We start to notice something interesting:
 
 
 $wsum0 :: Int# - Int# - Int#
 $wsum0 =
   \ (ww_sOH :: Int#) (ww1_sOL :: Int#) -
 case lvl2 of wild1_aHn { I# y_aHp -
 case # ww1_sOL y_aHp of wild_B1 {
   False -
 letrec {
 
   $wsum01_XPd :: Int# - Int# - Int#
   $wsum01_XPd =
 \ (ww2_XP4 :: Int#) (ww3_XP9 :: Int#) -
   case # ww3_XP9 y_aHp of wild11_Xs {
 False -
   $wsum01_XPd (+# ww2_XP4 ww3_XP9) (+# ww3_XP9 1);
 True - ww2_XP4
   }; } in
 $wsum01_XPd (+# ww_sOH ww1_sOL) (+# ww1_sOL 1);
 
   True - ww_sOH
 }

This is odd, but it doesn't hurt the inner loop, which only involves
$wsum01_XPd, and is identical to $wfold_s15t above.

 Checking the asm:
 $ ghc -O2 -fasm
 
 sQ3_info:
 .LcRt:
   cmpq 8(%rbp),%rsi
   jg .LcRw
   leaq 1(%rsi),%rax
   addq %rsi,%rbx
   movq %rax,%rsi
   jmp sQ3_info

So for some reason ghc ends up doing the (n + 1) addition before the
(acc + n) addition in this case - this accounts for the extra
instruction, because both n+1 and n need to be kept around for the
duration of the addq (which does the acc + n addition).

 Checking via C:
 
$ ghc -O2 -optc-O3 -fvia-C
 
 Better code, but still a bit slower:   
 
 sQ3_info:
   cmpq8(%rbp), %rsi
   jg  .L8
   addq%rsi, %rbx
   leaq1(%rsi), %rsi
   jmp sQ3_info

This code is identical (up to renaming registers and one offset that
I can't fully explain, but is probably related to a slight difference
in handling pointer tags between the two versions of the code) to the
nice assembly above.

 Running:
 
 $ time   ./B
 55
 ./B  1.01s user 0.01s system 97% cpu 1.035 total

Hmm, about 5% slower, are you sure this isn't just noise?

If not noise, it may be some alignment effect. Hard to say.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: convertible (first release)

2009-01-28 Thread Bertram Felgenhauer
wren ng thornton wrote:
 John Goerzen wrote:
 Hi folks,
 I have uploaded a new package to Haskell: convertible.  At its heart,
 it's a very simple typeclass that's designed to enable a reasonable
 default conversion between two different types without having to
 remember a bunch of functions.

 I once again point out that realToFrac is *wrong* for converting from Float 
 or Double.

Yes, realToFrac is just broken, or at least the Real instances of Float
and Double are. Of course, with the restriction to single parameter type
classes in Haskell98, it's hard to come up with anything better - we'd
end up with fooToBar for all Real / Fractional pairs.

  realToFrac (1/0::Float) ::Double
 3.402823669209385e38
 
  realToFrac (0/0::Float) ::Double
 -5.104235503814077e38

  realToFrac (0/0::Double) ::Float
 -Infinity

[snip]

GHC makes the mess just a bit messier. The following program prints
different answers when compiled with -O or without:

main = do
print (realToFrac (1/0::Float) :: Double)
print (realToFrac (0/0::Float) :: Double)
print (realToFrac (0/0::Double) :: Float)

Without -O:
3.402823669209385e38
-5.104235503814077e38
-Infinity

With -O:
Infinity
NaN
NaN

The reason for this behaviour are rules replacing realToFrac by direct
conversions from Float to Double or vice versa, where applicable. These
two evils -- realToFrac and changing the behaviour with -O -- amount to
something good: proper treatment of NaNs and Infinities.

Your RealToFrac class is a definitive improvement over this situation.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Comments from OCaml Hacker Brian Hurt

2009-01-15 Thread Bertram Felgenhauer
Andrew Wagner wrote:
 I think perhaps the correct question here is not how many instances of
 Monoid are there?, but how many functions are written that can use an
 arbitrary Monoid. E.g., the fact that there are a lot of instances of Monad
 doesn't make it useful. There are a lot of instances of Monad because it's
 useful to have instances of Monad. Why? Because of
 http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Monad.html 
 !
 Look at all the cool stuff you can automagically do with your type just
 because it's an instance of Monad! I think that's the point. What can you do
 with arbitrary Monoids? Not much, as evidenced by
 http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Monoid.html

One example where Monoids (in full generality) are useful is that of
measurements in the Data.Sequence paper (which is sadly not implemented
in the library, although it is used to maintain the length for efficient
indexing),

   http://www.soi.city.ac.uk/~ross/papers/FingerTree.html

The concept applies to any tree that represents an ordered list.

The basic idea is that given a measurement for single elements,

class Monoid v = Measured a v where
 measure :: a - v

we can annotate a tree with cached measurements of the corresponding
sequences,

data Tree a v = Empty | Leaf v a | Node v (Tree a v) (Tree a v)

measureTree :: Measured a v = Tree a v - v
measureTree Empty = mzero
measureTree (Leaf v _) = v
measureTree (Node v _ _) = v

which can be calculated easily by smart constructors:

leaf :: Measured a v = a - Tree a v
leaf a = Leaf (measure a) a

node :: Measured a v = Tree a v - Tree a v - Tree a v
node l r = Node (measureTree l `mappend` measureTree r) l r

Because v is a monoid, the construction satisfies the law

measureTree = mconcat . map measure . toList

where
toList Empty = []
toList (Leaf _ a) = [a]
toList (Node _ l r) = toList l ++ toList r

All usually efficient tree operations, insertion, deletion, splitting,
concatenation, and so on, will continue to work, if the cached values
are ignored on pattern matching and the smart constructors are used
for constructing the new trees. measure or `mappend` will be called
for each smart constructor use - if they take constant time, the
complexity of the tree operations doesn't change.

Applications include:
  - finding and maintaining the sum of any substring of the sequence.
  - maintaining minimum and maximum of the list elements
  - maintaining the maximal sum of any substring of the sequence
(this can be done by measuring four values for each subtree:
1. the sum of all elements of the sequence
2. the maximum sum of any prefix of the sequence
3. the maximum sum of any suffix of the sequence
4. the maximum sum of any substring of the sequence)

I also found the idea useful for
http://projecteuler.net/index.php?section=problemsid=220

starting out with
-- L system basis
class Monoid a = Step a where
l :: a
r :: a
f :: a

and then providing a few instances for Step, one of which was a binary
tree with two measurements.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] IORef vs TVar performance: 6 seconds versus 4 minutes

2008-12-29 Thread Bertram Felgenhauer
Evan Laforge wrote:
 On Mon, Dec 29, 2008 at 1:15 PM, Ryan Ingram ryani.s...@gmail.com wrote:
  Both readTVar and writeTVar are worse than O(1); they have to look up
  the TVar in the transaction log to see if you have made local changes
  to it.
 
  Right now it looks like that operation is O(n) where n is the number
  of TVars accessed by a transaction, so your big transaction which is
  just accessing a ton of TVars is likely O(n^2).
 
 So this actually brings up a tangential question I've wondered about
 for a while.
 
 The lock-free datastructure paper at
 http://research.microsoft.com/users/Cambridge/simonpj/papers/stm/lock-free-flops06.pdf
 shows lock vs. STM with very similar performance for single processor
 with STM quickly winning on multi-processors.

I have not verified this, but a possible cause is that
Control.Concurrent.QSem isn't efficient if there are many waiters.
It should use two lists for managing the waiters (ala Okasaki).

But why does it manually manage the waiters at all? MVars are fair, in
ghc at least. So this should work:

newtype Sem = Sem (MVar Int) (MVar Int)

newSem :: Int - IO Sem
newSem initial = liftM2 Sem (newMVar initial) newEmptyMVar

-- | Wait for a unit to become available
waitSem :: Sem - IO ()
waitSem (Sem sem wakeup) = do
   avail' - modifyMVar sem (\avail - return (avail-1, avail-1))
   when (avail'  0) $ takeMVar wakeup = putMVar sem

-- | Signal that a unit of the 'Sem' is available
signalSem :: Sem - IO ()
signalSem (Sem sem wakeup) = do
   avail - takeMVar sem
   if avail  0 then putMVar wakeup (avail+1)
else putMVar sem (avail+1)

(I should turn this into a library proposal.)

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to think about this? (profiling)

2008-12-16 Thread Bertram Felgenhauer
Magnus Therning wrote:
 This behaviour by Haskell seems to go against my intuition, I'm sure I
 just need an update of my intuition ;-)
 
 I wanted to improve on the following little example code:
 
   foo :: Int - Int
   foo 0 = 0
   foo 1 = 1
   foo 2 = 2
   foo n = foo (n - 1) + foo (n - 2) + foo (n - 3)

Two more ideas: How about

  -- loop keeping the last three elements of the sequence
  -- O(n) per call, constant memory
  foo' :: Int - Int
  foo' n = go n 0 1 2 where
  go 0 a _ _ = a
  go n a b c = go (n - 1) b c (a + b + c)

or

  -- analogue of the folklore fibonacci definition:
  -- fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
  foos :: [Int]
  foos = 0 : 1 : 2 : zipWith3 (\a b c - a + b + c)
  foos (tail foos) (tail (tail foos))

[snip]
 Then I added a convenience function and called it like this:
 
   createArray :: Int - UArray Int Int
   createArray n = array (0, n) (zip [0..n] (repeat (-1)))
 
   main = do
   (n:_)  - liftM (map read) getArgs
   print $ evalState (foo n) (createArray n)
 
[snip]
 
   main = do
   (n:_)  - liftM (map read) getArgs
   print $ evalState (mapM_ foo [0..n]  foo n) (createArray n)
 
 Then I started profiling and found out that the latter version both uses
 more memory and makes far more calls to `foo`.  That's not what I
 expected!  (I suspect there's something about laziness I'm missing.)
 
 Anyway, I ran it with `n=35` and got
 
  foo n : 202,048 bytes , foo entries 100
  mapM_ foo [0..n]  foo n : 236,312 , foo entries 135 + 1

The number of function calls is to be expected: to evaluate  foo n
for the first time, you need to call  foo (n-1), foo (n-2) and
foo (n-3), making 4 calls per evaluated value. 36*4 = 144 is pretty
close to 135.
(The missing 9 calls correspond to foo 0, foo 1 and foo 2)

The difference of 35 can be explained in the same way: the first version
makes 35 fewer explicit calls to 'foo'.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memoization-question

2008-12-12 Thread Bertram Felgenhauer
Mattias Bengtsson wrote:
 The program below computes (f 27) almost instantly but if i replace the
 definition of (f n) below with (f n = f (n - 1) * f (n -1)) then it
 takes around 12s to terminate. I realize this is because the original
 version caches results and only has to calculate, for example, (f 25)
 once instead of (i guess) four times.
 There is probably a good reason why this isn't caught by the compiler.
 But I'm interested in why. Anyone care to explain?

GHC does opportunistic CSE, when optimizations are enabled. See

http://www.haskell.org/haskellwiki/GHC:FAQ#Does_GHC_do_common_subexpression_elimination.3F
(http://tinyurl.com/33q93a)

I've found it very hard to predict whether this will happen or not, from
a given source code, because the optimizer will transform the program a
lot and the opportunistic CSE rule may apply to one of the transformed
versions.

It's best to make sharing explicit when you need it, as you did below.

  main = print (f 27)
  
  f 0 = 1
  f n = let f' = f (n-1)
in f' * f'

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] The Knight's Tour: solutions please

2008-12-02 Thread Bertram Felgenhauer
Dan Doel wrote:
 On Monday 01 December 2008 1:39:13 pm Bertram Felgenhauer wrote:
  As one of the posters there points out, for n=100 the program doesn't
  actually backtrack if the 'loneliest neighbour' heuristic is used. Do
  any of our programs finish quickly for n=99? The Python one doesn't.
 
 Nothing I tried finished. Do you have any figures on how much backtracking 
 needs to be done to find a solution for n=99 (there is a solution, right?)?

Yes, there is a solution. After changing

  successors n b = sortWith (length . succs) . succs

to

  successors n b = sortWith (length . (succs =) . succs) . succs

in the LogicT solution, it finds one with no backtracking at all.

This heuristic fails on other n though, n=8 and n=66 at least.

The obvious next step,

  successors n b = sortWith (length . (succs =) . (succs =) . succs) . succs

works without backtracking up to n=100.

These improved heuristics don't come cheap though. Here are some timings
for n = 100:

  LogicT  :  0.48 user 0.00 system 0:00.48 elapsed
  LogicT' :  2.16 user 0.00 system 0:02.16 elapsed
  LogicT'': 13.84 user 0.01 system 0:13.86 elapsed

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: The Knight's Tour: solutions please

2008-12-02 Thread Bertram Felgenhauer
ChrisK wrote:
 Hmmm... it seems that n=63 is a special case.

 [EMAIL PROTECTED] wrote:
 Yes, there is a solution for n=99 and for n=100 for that matter --
 which can be found under one second. I only had to make a trivial
 modification to the previously posted code
 tour n k s b | k  n*n   = return b
  | otherwise = do next - (foldr mplus mzero).map return $ 
 successors n b s
   tour n (k+1) next $ insrt next k b
 I replaced foldl1 mplus with foldr mplus zero.

 The old version sees no solution to n=63 quite quickly:

 time nice ./fromwiki-63 63
 fromwiki-63: Prelude.foldl1: empty list
 real 0m0.285s
 user 0m0.172s
 sys  0m0.026s

That's a bug. When the list of candidates is empty at that point, the
program should backtrack, not terminate.

In fact there are solutions for n=63. Using the first improved heuristic
from my previous mail in this thread:

 time ./tour2 63
   14  143  148  2116  229  226  2418  553  578  571   10  551  584 
 573   12  549  630  643   14  547  670  665   16  545  684  679   18  543  770 
 765   20  541  816  867   22  539  952  995   24  537 1044 1121   26  535 1208 
1231   28  533 1300 1307   30  531  494  489   32  491  404   39   34   37
...
real0m1.750s
user0m1.564s
sys 0m0.008s


 The version with the 'tour' given above does not halt after running up to 
 0.4 GB of RAM, so I killed it.

In fact, replacing  foldl1 mplus  by  foldr mplus mzero  fixed that bug.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] The Knight's Tour: solutions please

2008-12-01 Thread Bertram Felgenhauer
Don Stewart wrote:
 Lee Pike forwarded the following:
 
 Solving the Knight's Tour Puzzle In 60 Lines of Python
 
 http://developers.slashdot.org/article.pl?sid=08/11/30/1722203
 
 Seems that perhaps (someone expert in) Haskell could do even better?   
 Maybe even parallelize the problem? :)

As one of the posters there points out, for n=100 the program doesn't
actually backtrack if the 'loneliest neighbour' heuristic is used. Do any
of our programs finish quickly for n=99? The Python one doesn't.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] '#' in literate haskell

2008-11-30 Thread Bertram Felgenhauer
John MacFarlane wrote:
 Can anyone explain why ghc does not treat the following
 as a valid literate haskell program?
 
 - test.lhs 
 # This is a test
 
  foo = reverse . words
 
 

I believe this is an artifact of ghc trying to parse cpp style line
number information:

 foo.lhs 
# 123 foo.foo

 t = 


will print this error:
   foo.foo:124:6: parse error on input `'

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Permutations

2008-11-30 Thread Bertram Felgenhauer
Daniel Fischer wrote:
 Needs an Ord constraint:
 
 inserts :: [a] - [a] - [[a]]
 inserts [] ys = [ys]
 inserts xs [] = [xs]
 inserts xs@(x:xt) ys@(y:yt) = [x:zs | zs - inserts xt ys] 
   ++ [y:zs | zs - inserts xs yt]

Heh, I came up with basically the same thing.

I'd call this function 'merges' - it returns all possibilities for
merging two given lists.

 uniquePermutations :: Ord a = [a] - [[a]]
 uniquePermutations = foldr (concatMap . inserts) [[]] . group . sort

Which can also be written as

  uniquePermutations = foldM merges [] . group . sort

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Histogram creation

2008-11-10 Thread Bertram Felgenhauer
Alexey Khudyakov wrote:
 Hello!
 
 I'm tryig to write efficient code for creating histograms. I have following
 requirements for it:
 
 1. O(1) element insertion
 2. No reallocations. Thus in place updates are needed.
 
 accumArray won't go because I need to fill a lot of histograms (hundrends)
 from vely long list of data (possibly millions of elements) and it will
 traverse input data for each histogram.

That's just not true, for GHC's implementation of accumArray at least,
which goes via the ST monad. It creates a mutable array, fills it,
traversing the input list exactly once, and finally freezes the array
and returns it. This is just what you suggest below.

If you still run into performance problems, try out unboxed arrays.

If that isn't enough, unsafeAccumArray from Data.Base may help.

I'd try both before using the ST monad directly.

 It seems that I need to use mutable array and/or ST monad or something else.
 Sadly both of them are tricky and difficult to understand. So good examples
 or any other ideas greatly appreciated.

http://www.haskell.org/haskellwiki/Shootout/Nsieve_Bits

perhaps. There must be better examples out there.

I can think of two common problems with mutable arrays and ST:

1) You need to specify a type signature for the array being created,
   because the compiler can't guess the MArray instance that you want.

   For example, from the shootout entry:

   arr - newArray (0,m) False :: IO (IOUArray Int Bool)

   In ST, this is slightly trickier, because the phantom 's' type
   parameter has to be mirrord in the ST*Array type constructor. You
   can use scoped type variables, which allow you to write

   {-# LANGUAGE ScopedTypeVariables #-}
   import Control.Monad.ST
   import Data.Array.ST

   foo :: forall s . ST s ()
   foo = do
   arr - newArray (0,42) False :: ST s (STUArray s Int Bool)
   ...

   Alternatively you can define helper functions to specify just the
   part of the type signature that you care about.

   stuArray :: ST s (STUArray s i e) - ST s (STUArray s i e)
   stuArray = id

   foo :: ST s ()
   foo = do
   arr - stuArray $ newArray (0,42 :: Int) False
   ...

2) runST $ foo bar   doesn't work. You have to write  runST (foo bar)

But in the end it's just imperative array programming with a rather
verbose syntax -- you can do only one array access per statement, and
'readArray' and 'writeArray' are rather long names.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Histogram creation

2008-11-10 Thread Bertram Felgenhauer
Alexey Khudyakov wrote:
 Hello!
 
 I'm tryig to write efficient code for creating histograms. I have following
 requirements for it:
 
 1. O(1) element insertion
 2. No reallocations. Thus in place updates are needed.
 
 
 accumArray won't go because I need to fill a lot of histograms (hundrends) 
 from
 vely long list of data (possibly millions of elements) and it will traverse
 input data for each histogram.

Sorry, Duncan is right. I misread here.

My first idea would still be to use accumArray though, or rather, accum,
processing the input data in chunks of an appropriate size (which depends
on the histogram sizes.)

But actually, the ST code isn't bad (I hope this isn't homework):


import Control.Monad.ST
import Control.Monad
import Data.Array.ST
import Data.Array.Unboxed

stuArray :: ST s (STUArray s i e) - ST s (STUArray s i e)
stuArray = id

-- Create histograms.
--
-- Each histogram is described by a pair (f, (l, u)), where 'f' maps
-- a data entry to an Int index, and l and u are lower and upper bounds
-- of the indices, respectively.
--
mkHistograms :: [(a - Int, (Int, Int))] - [a] - [UArray Int Int]
mkHistograms hs ds = runST collect where
-- Why is the type signature on 'collect' required here?
collect :: ST s [UArray Int Int]
collect = do
-- create histogram arrays of appropriate sizes
histograms - forM hs $ \(_, range) - do
stuArray $ newArray range 0

-- iterate through the data
forM_ ds $ \d - do
-- iterate through the histograms
forM_ (zip hs histograms) $ \((f, _), h) - do
-- update appropriate entry
writeArray h (f d) . succ = readArray h (f d)

-- finally, freeze the histograms and return them
-- (using unsafeFreeze is ok because we're done modifying the
-- arrays)
mapM unsafeFreeze histograms

test = mkHistograms [((`mod` 3), (0,2)), ((`mod` 5), (0,4))] [0..10]

-- test returns
-- [array (0,2) [(0,4),(1,4),(2,3)],
--  array (0,4) [(0,3),(1,2),(2,2),(3,2),(4,2)]]


Bertram

P.S. Ryan is right, too - I'm not sure where I got confused there.
  runST $ foo  didn't work in ghc 6.6; I knew that it works in
  ghc 6.8.3, but I thought this was changed again.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Writing an IRC bot, problems with plugins

2008-11-06 Thread Bertram Felgenhauer
Alexander Foremny wrote:
 I am writing an single server, multi channel IRC bot with the support of
 plugins and limited plugin communication. With the plugin system I am facing
 problems I cannot really solve myself.

Here's an approach built completely around Data.Typeable. The
fundamental idea is that a Plugin encompasses a set of interfaces
of unknown types, which have Typeable instances.

All we need is an operation to extract such an interface from
a Plugin.

data Plugin = Plugin {
getInterface :: forall i. Typeable i = Maybe i
}

Then we can define the interfaces we want to use, for example:

data BaseInterface = BaseInterface {
identifier :: String,
rawMessage :: (MonadIO m) = Message - PL m ()
} deriving Typeable

Sending a message to a plugin can be implemented as

sendMessage :: Plugin - Message - PL m ()
sendMessage p msg = do
let pI :: Maybe BaseInterface
pI = getInterface p
case pI of
Nothing - error Plugin does not support BaseInterface
Just pI' - rawMessage pI' msg

A more complete example follows below.

Does that help?

Bertram



{-# LANGUAGE GADTs, Rank2Types, DeriveDataTypeable #-}

module PluginTest (main) where

import Data.Typeable
import Data.IORef
import Control.Monad.Trans
import Control.Monad.State
import qualified Data.Map as M


-- Types

-- A Plugin is just a method that returns various interfaces.
data Plugin = Plugin {
getInterface :: forall i. Typeable i = Maybe i
}

-- The basic interface.
--
-- It should be made a part of Plugin, but it's a queryable interface
-- in this example for demonstration purposes.
data BaseInterface = BaseInterface {
identifier :: String,
rawMessage :: (MonadIO m) = Message - PL m ()
} deriving Typeable

type Message = String
type PL = StateT PluginConfig
type PluginConfig = M.Map String Plugin


-- Main

-- look up a plugin by name
findPlugin :: Monad m = String - PL m (Maybe Plugin)
findPlugin k = get = return . M.lookup k

-- register a plugin
registerPlugin :: MonadIO m = Plugin - PL m ()
registerPlugin p = do
-- note: 'getInterface' can return 'Nothing' - needs error checking
let Just i = getInterface p
modify (M.insert (identifier i) p)

-- unregister, etc.

main' :: MonadIO m = PL m ()
main' = do
-- create two plugins (see below) and register them.
a - createAPlugin
registerPlugin a
b - createBPlugin
registerPlugin b
-- extract base interfaces of a and b and send some messages
-- (needs error checking)
let aI, bI :: BaseInterface
Just aI = getInterface a
Just bI = getInterface b
liftIO $ putStrLn - Sending message to A
rawMessage aI dummy
liftIO $ putStrLn - Sending message to B
rawMessage bI Hi, here's a message from B
liftIO $ putStrLn - Sending another message to A
rawMessage aI dummy

main :: IO ()
main = evalStateT main' M.empty


-- Plugin A
--
-- This plugin provides an additional Interface that allows to
-- query and change a string value in its state.

data APlugin = APlugin (IORef String)
data AInterface = AInterface {
aGet :: (MonadIO m) = PL m String,
aPut :: (MonadIO m) = String - PL m ()
} deriving Typeable

createAPlugin :: (MonadIO m) = PL m Plugin
createAPlugin = do
r - liftIO (newIORef initial state)
let a = APlugin r
return $ Plugin {
getInterface = cast (aBase a) `mplus` cast (aInterface a)
}

aBase (APlugin r) = BaseInterface {
identifier = A,
rawMessage = msg
}
  where
msg _ = liftIO $ do
   s - readIORef r
   putStrLn (A has state ( ++ s ++ )!)


aInterface :: APlugin - AInterface
aInterface (APlugin r) = AInterface {
aGet = liftIO (readIORef r),
aPut = \v - liftIO (writeIORef r v)
}


-- Plugin B
--
-- Plugin B knows about Plugin A and uses its additional interface for
-- modifying its state

createBPlugin :: (MonadIO m) = PL m Plugin
createBPlugin = return $ Plugin {
getInterface = cast bBase
}

bBase = BaseInterface {
identifier = B,
rawMessage = msg
}
  where
msg s = do
-- find A plugin
Just a - findPlugin A
-- and get its additional interface
let aI :: AInterface
Just aI = getInterface a
aPut aI s
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Anyone know why this always returns invalid texture objects?

2008-11-06 Thread Bertram Felgenhauer
[CCing gtk2hs-users]

Jefferson Heard wrote:
 import Graphics.UI.Gtk
 import Graphics.UI.Gtk.Glade
 import Graphics.UI.Gtk.OpenGL
 import qualified Graphics.Rendering.OpenGL as GL
 import Graphics.Rendering.OpenGL (($=))
 
 main = do
   initGUI
   initGL

initGL may be slightly misleading - it initialises the
gtkglext gtk+ extension. It does not create a GL context.

   GL.shadeModel $= GL.Flat
   GL.depthFunc $= Just GL.Less
   (window1,gui,dlgs) - constructGUIObject
   (sX, sY) - liftM (mapPair fromIntegral) . widgetGetSize .
 drawing_canvas $ gui -- get the canvas size for determining the part
 of the widget to repaint
   pb - pixbufNew ColorspaceRgb False 8 (round pbWidth) (round pbHeight)
   pixbufFill pb 0 0 0 255
   pxbufs - initSubpixbufs pb texRows texCols
   textures - GL.genObjectNames (texRows*texCols)
   print textures

There is no active GL context at this point. GtkGLExt creates
new GL contexts for GL enabled widgets when they're realized -
I think. I'm a bit fuzzy about the exact life time of the GL
context. [1]

After the context was created, you have to activate it before
doing any GL operations.

In Gtk2hs you can use the GLDrawingArea widget, which provides
withGLDrawingArea for easy activation of the GL context.

There's an example in the gtk2hs sources, in examples/opengl.

Enabling GL for other widgets is not supported well at the moment.
(There are low level bindings (using DrawWindow), but no generic
binding to the higher level gtk_widget_set_gl_capability() call.
Such support wouldn't be too hard to add, I think.)

HTH,

Bertram

[1] see 
http://gtkglext.sourceforge.net/reference/gtkglext/gtkglext-gtkglwidget.html#gtk-widget-get-gl-context
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Array bug?

2008-11-02 Thread Bertram Felgenhauer
Andrew Coppin wrote:
 Bertram Felgenhauer wrote:
 Yes, it's a known bug - a conscious choice really. See

 http://hackage.haskell.org/trac/ghc/ticket/2120

 It's somewhat ironic that this behaviour was introduced by a patch
 that made arrays safer to use in other respects.

 ...so it's *not* going to be fixed then?

It's not going to be fixed by itself - the first comment for the
bug report basically asks interested parties to submit a proposal
for changing this.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Array bug?

2008-11-01 Thread Bertram Felgenhauer
Andrew Coppin wrote:
 Consider the following GHCi session:

 GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
 Prelude Data.Array.IO t - newArray ((0,0),(5,4)) 0 :: IO (IOUArray 
 (Int,Int) Int)
 Prelude Data.Array.IO getBounds t
 ((0,0),(5,4))
 Prelude Data.Array.IO

 Is this a known bug? Is it likely to be fixed any time soon? (I'm guessing 
 the bug is as simple is converting indicies to integers and then checking 
 the integers are in-range, rather than the underlying index type.)

Yes, it's a known bug - a conscious choice really. See

http://hackage.haskell.org/trac/ghc/ticket/2120

It's somewhat ironic that this behaviour was introduced by a patch
that made arrays safer to use in other respects.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Why 'round' does not just round numbers ?

2008-10-30 Thread Bertram Felgenhauer
George Pollard wrote:
 There's also the ieee-utils package, which provides an IEEE monad with
 `setRound`:
 
 http://hackage.haskell.org/packages/archive/ieee-utils/0.4.0/doc/html/Numeric-IEEE-RoundMode.html

Hmm, this does not work well with the threaded RTS:

 import Numeric.IEEE.Monad
 import Numeric.IEEE.RoundMode (RoundMode (..))
 import Control.Monad
 import Control.Concurrent
 
 main = withIeeeDo $ do
 replicateM_ 2 $ forkIO $ forever $ putChar '.'
 forkIO $ do
 runIEEE $ do
 withRoundMode Downward $
 forever $ do
 IEEE . putStr . (++ \n) . show = getRound
 threadDelay 100

When run with +RTS -N2 -RTS, the output randomly alternates
between Downward and ToNearest - for me at least.

The problem is that the setRound call will only affect one worker
thread, while the RTS will sometimes migrate RTS threads from one
worker to another.

runIEEE really has to be executed in a bound thread (see forkOS
documentation). Using `par` will also cause trouble - in fact even
more.

I think that conceptually, the cleanest approach is to provide separate
data types for Double and Float in each of the rounding modes. This is
quite expensive: basically, it means setting the rounding mode on each
operation, and we would miss out on the code generator support for
floating point math. (A primop for setting the rounding mode could help
here, to some extent.)

Maybe tracking the rounding mode per RTS thread would be a useful
compromise between performance and usability for computations with
mostly uniform rounding mode - this is what the Numeric.IEEE.Monad
module seems to be aiming at. `par` would still be unusable with that
approach though.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: is there a way to pretty print a module?

2008-10-30 Thread Bertram Felgenhauer
Jason Dagit wrote:
 Could you use haskell-src from TH and then unsafePerformIO to get the
 reading to work during compile time?  I've done something like this in
 the past with Language.Haskell and TH.  I described it here:
 http://blog.codersbase.com/2006/09/01/simple-unit-testing-in-haskell/
 
 Maybe someone who has studied more TH knows a way to remove the 
 unsafePerformIO.


Replace

  tests :: [String]
  tests = unsafePerformIO $

by

  tests :: Q [String]
  tests = runIO $

and

  $(mkChecks tests)

by

  $(mkChecks = tests)

(The Q monad should really have an MonadIO instance.)

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: ghc error: requested module name differs from name found in interface file

2008-10-21 Thread Bertram Felgenhauer
Larry Evans wrote:
 On 10/20/08 12:33, Larry Evans wrote:
 With a file containing:
   module Main where
  
   import Array
   import Control.Functor.Fix
 I get:
   make
   ghc -i/root/.cabal/lib/category-extras-0.53.5/ghc-6.8.2 -c 
 catamorphism.example.hs

Yes, using -i to give paths to installed packages does not work - you
really have to tell ghc about the corresponding package.conf file,
using -package-conf file. See also

  
http://www.haskell.org/ghc/docs/latest/html/users_guide/packages.html#package-databases

   catamorphism.example.hs:19:0:
   Bad interface file: 
 /root/.cabal/lib/category-extras-0.53.5/ghc-6.8.2/Control/Functor/Fix.hi
   Something is amiss; requested module main:Control.Functor.Fix 
 differs from name found in the interface file 
 category-extras-0.53.5:Control.Functor.Fix
   make: *** [all] Error 1

The problem is that all modules found by -i are expected to be in the
current package - which is 'main' by default. (Build tools like Cabal
specify a different package name for libraries; for example the
Control.Functor.Fix is in the 'category-extras-0.53.5' package here.)

 So, I've got to figure how to tell cabal install to install in right
 place :(

Have you tried 'cabal install --global'? To make it stick, put
'user-install: False' in root's .cabal/config file.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] is 256M RAM insufficient for a 20 million element Int/Int map?

2008-10-19 Thread Bertram Felgenhauer
Bulat Ziganshin wrote:
 Hello Bertram,
 
 Sunday, October 19, 2008, 6:19:31 AM, you wrote:
 
  That's 5 words per elements
 
 ... that, like everything else, should be multiplied by 2-3 to
 account GC effect

True. You can control this factor though. Two RTS options help:

  -c  (Enable compaction for all major collections) - mostly
  avoids fragmentation in the old generation.
  -Ffactor
  (Control the amount of memory reserved in terms of the size
  of the oldest generation. The default is 2, meaning that if
  the oldest generation is 200MB in size, 400 MB of heap will
  be used)

Consider this program,

 module Main (main) where
 
 import qualified Data.IntMap as M
 import Data.List (foldl')
 
 main = do
 loop (M.fromList [(i,0) | i - [1..500]]) 1
 
 loop dict j = do
 i - readLn
 print $ dict M.! (i :: Int)
 let dict' = foldl' (\m (k, v) - M.insert k v m) dict [(,) i $! j*i | i 
 - [j`mod`10 * 50 + 1..j`mod`10 * 50 + 50]]
 loop dict' (j+1)

This program maintains an IntMap with 5 million entries, which means
200 MB of live data on a 32 bit computer. It updates the map a lot, too,
so I think this is a fairly realistic example.

Running it on a 51 line input with various RTS options [*], we get:

Options  Memory used   Time used
+RTS -c -F1.1220 MB3m22s
+RTS -c -F1.2243 MB2m12s
+RTS -c -F1.5306 MB1m58s
+RTS -c  398 MB1m57s
+RTS -F 1.1  406 MB1m43s
+RTS -F 1.2  425 MB1m15s
+RTS -F 1.5  483 MB1m6s
none 580 MB1m11s

Heap residency was around 200.5 million bytes in all runs.

As expected, saving memory this way doesn't come cheap - it can
dramatically increase the program's runtime. But if a program builds
and slowly updates a large dictionary, playing with these options can
help a lot.

Bertram

[*] time (seq 50; echo 0) | ./Main +RTS -sstderr -c -F1.2
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] package question/problem

2008-10-18 Thread Bertram Felgenhauer
Galchin, Vasili wrote:
 I am trying to cabal install HSQL. I am using ghc 6.8.2. I get the
 following error about a non-visible/hidden package (old-time-1.0.0.0):
 
 [EMAIL PROTECTED]:~$ cabal install hsql
[snip]
 Database/HSQL.hsc:66:7:
 Could not find module `System.Time':
   it is a member of package old-time-1.0.0.0, which is hidden
 cabal: Error: some packages failed to install:
 hsql-1.7 failed during the building phase. The exception was:
 exit: ExitFailure 1
 [EMAIL PROTECTED]:~$

This happens because Cabal invokes ghc with the -hide-all-packages flag.
It then adds explicit -package flags for all packages listed as
build dependencies. (The idea is to force developers to specify all
their package's dependencies in the library's .cabal file.)

Sadly, hsql looks unmaintained.

You can try unpacking the hsql-1.7 tarball and adding the required
dependencies to the  build-depends:  field of the .cabal file.
Running 'cabal install' in the source directory will attempt to
install the package.

 What is the history of old-time package?

old-time was split out of the base package in base 3.0 (which ships
with ghc 6.8.x).

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] is 256M RAM insufficient for a 20 million element Int/Int map?

2008-10-18 Thread Bertram Felgenhauer
Don Stewart wrote:
 tphyahoo:
  I'm trying to run a HAppS web site with a large amount of data: stress
  testing happstutorial.com.
  Well, 20 million records doesn't sound that large by today's
  standards, but anyway that's my goal for now.
  I have a standard Data.Map.Map as the base structure for one of my
  macid data tables (jobs), but I noticed something
  that is probably causing problems for me.
  Even a simple 20 million record with int/int key values causes an out
  of memory error for me in ghci,
 
 Int keys, Int values eh?
 
 Does using IntMap help?

Interesting. Map Int Int, IntMap Int and [(Int, Int)] use pretty much
the same amount of memory, assuming that no keys or values are shared:

Map Int Int:

  data Map k a = Tip 
   | Bin {-# UNPACK #-} !Size !k a !(Map k a) !(Map k a) 

  - all 'Tip's are shared.
  - For each (key, value) pair, there is one 'Bin', needing
tag[*] (1 word)
size (unpacked, 1 word)
key, value (pointer, tag, value = 3 words each)
two more pointers (1 word each)
  for a total of 10 words per element.

IntMap Int:

  data IntMap a = Nil
 | Tip {-# UNPACK #-} !Key a
 | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask
   !(IntMap a) !(IntMap a) 

  - one 'Tip' per element:
tag (1 word) + key (1 word) + value (3 words)
  - one 'Bin' per element (minus 1):
tag (1 word) + prefix, mask (1 word each) + 2 pointers (1 word each)
  for a total of 10 words per element.

[(Int, Int)]:
  - one '(:)' per element:
- tag (1 word) + 2 pointers (1 word each)
  - one '(Int, Int)' per element:
- tag (1 word) + key (3 words) + value (3 words)
  for a total of 10 words per element, again.  

Now if we want to save memory, we can specialise Data.Map
to Int keys and values, saving 4 words per element, and encode the
external leaves (Tips) in the constructor, saving another word:

  data IntIntMap = Tip
  | BinBB !Size !Int !Int IntIntMap IntIntMap
  | BinBT !Size !Int !Int IntIntMap
  | BinTB !Size !Int !Int IntIntMap
  | BinTT !Size !Int !Int
  -- sprinkle {-# UNPACK #-} as needed

That's 5 words per elements. Would that be worthwhile?

Bertram

[*] actually, the info pointer in the Spineless, Tagless G-machine.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] 'par' - why has it the type a - b - b ?

2008-09-29 Thread Bertram Felgenhauer
Henning Thielemann wrote:

 What is the reason for implementing parallelism with 'par :: a - b - b'? 
 Analogy to 'seq'?

I'd think it's actually easier to implement than par2 below; evaluating
par x y sparks a thread evaluating x, and then returns y. The analogy
to 'seq' is there, of course.

 I thought parallelism would be introduced most naturally 
 by a function which does two computations in parallel and puts together 
 their results after completion. Say

 par2 :: (a - b - c) - (a - b - c)

 to be used like

 par2 (+) expensiveComputationA expensiveComputationB

 I assume that par2 can be implemented this way:

 par2 f x y =
f x (par x y)

For this to work, f has to evaluate its second argument before the
first one, or the par will be useless. Try this:

 par2 f x y = x `par` y `par` f x y

(In complete analogy to using `seq` for enforcing strictness.)

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] random colors, stack space overflow, mersenne and mersenne.pure64

2008-09-12 Thread Bertram Felgenhauer
Cetin Sert wrote:
[snip]
 colorR :: RandomGen g ⇒ (RGB,RGB) → g → (RGB,g)
 colorR ((a,b,c),(x,y,z)) s0 = ((r,g,b),s3)
   where
 (r,s1) = q (a,x) s0
 (g,s2) = q (b,y) s1
 (b,s3) = q (c,z) s2
 q = randomR

Look closely at how you use the variable 'b'.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] two problems with Data.Binary and Data.ByteString

2008-08-14 Thread Bertram Felgenhauer
Tim Newsham wrote:
[snip]
 I would have expected this to fix my problems:

   binEof :: Get ()
   binEof = do
   more - not $ isEmpty
   when more $ error expected EOF

   decodeFully :: Binary b = B.ByteString - b
   decodeFully = runGet (get  binEof)
 where a  b = a = (\x - b  return x)

Remember that the Get monad is lazy; the result of binEof is never
used, so the action is not performed.

decodeFully :: Binary b = B.ByteString - b
decodeFully = runGet (get  binEof)
  where a  b = a = (\x - b = \y - y `seq` return x)
works, for example, and also
  where a  b = a = (\x - b = \y - return (y `seq` x))
and
  where () = liftM2 (flip seq)

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] -fvia-C error

2008-06-12 Thread Bertram Felgenhauer
Duncan Coutts wrote:
 Don, this does not work:
 
 includes: SFMT.h SFMT_wrap.h
 install-includes: SFMT.h

Sorry, that was my fault.

(It does work with ghc 6.9, but that's not much of an excuse)

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Mersenne Build Problem

2008-06-07 Thread Bertram Felgenhauer
Dominic Steinitz wrote:
 I'm getting errors (see below) trying to build the tests in
 
 http://hackage.haskell.org/cgi-bin/hackage-scripts/package/mersenne-random-0.1.1
 
[snip]
  Linking Unit ...
  Unit.o: In function `s4Da_info':
  (.text+0x1b21): undefined reference to `genrand_real2'
  Unit.o: In function `s4RA_info':
  (.text+0x3e75): undefined reference to `genrand_real2'
  Unit.o: In function `s4S4_info':
  (.text+0x3f61): undefined reference to `genrand_real2'
  Unit.o: In function `s5su_info':
  (.text+0x40bc): undefined reference to `genrand_real2'
  /usr/local/lib/mersenne-random-0.1.1/ghc-6.9.20080517/[...]
 
[snip]

The missing symbols are inlined functions. ghc 6.9 doesn't include the
header files anymore when compiling via C. (The solution is to create
C wrappers around those functions. I guess I'll whip up a patch.)

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Fwd: installing happy 1.17

2008-06-07 Thread Bertram Felgenhauer
Duncan Coutts wrote:
 The immediate workarounds are:
   * unregister Cabal-1.5.2

Better, hide it (that's reversible) - or does that not work with
cabal-install?

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] [ANN] hs-pgms 0.1 -- Programmer's Minesweeper in Haskell

2008-06-04 Thread Bertram Felgenhauer
Hi,

I've just uploaded hs-pgms to hackage. It is a Haskell implementation
of Programmer's Minesweeper [1], which allows programmers to implement
minesweeper strategies and run them. (Note: ghc = 6.8 is required.)

hs-pgms uses MonadPrompt to achieve a clean separation between
strategies, game logic, and presentation. There are two frontends,
one command line frontend which is mainly useful for collecting
statistics, and a GUI frontend, using gtk2hs, for watching the
strategies in action.

There's a git repo, see http://repo.or.cz/w/hs-pgms.git .

enjoy,

Bertram

[1] http://www.ccs.neu.edu/home/ramsdell/pgms/index.html
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] [ANNOUNCE] git-darcs-import 0.1

2008-06-03 Thread Bertram Felgenhauer
Darrin Thompson wrote:
 On Sun, Jun 1, 2008 at 2:44 PM, Bertram Felgenhauer
 [EMAIL PROTECTED] wrote:
  I'm pleased to announce yet another tool for importing darcs repositories
  to git. [...]

 What's the appeal of this? I personally love git, but I thought all
 the cool kids at this school used darcs and that was that.

For myself, git-darcs-import itself is an opportunity to learn more
about both darcs and git. It wasn't meant to be argument in the git
vs. darcs discussion, although it was inevitable that it would be
seen as such.

I really like darcs' concepts, but in my opinion, darcs doesn't get
enough power out of the theory of patches to really shine so far.

This is a hard problem, and I can't offer solutions. Ideally, you'd have
semantic patches which just commute with virtually all other patches
because they know what they are about. The only thing that darcs
offers in that direction - besides handling conflicts, mergers and
undos gracefully, which is quite useful in itself - is a keyword
substitution patch type.

In the meantime, I prefer git to darcs, mainly because I'm sort of
attached to seeing the development history, i.e. I prefer to think of
patches as (partially) ordered instead of being a cloud of patches
that darcs uses as a model.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] [ANNOUNCE] git-darcs-import 0.1

2008-06-02 Thread Bertram Felgenhauer
Thomas Schilling wrote:

 On 1 jun 2008, at 20.44, Bertram Felgenhauer wrote:
[git-darcs-import]

 Nice!  Do you happen to also have a darcs (or Git) repository somewhere?

I've uploaded my (git) repo to repo.or.cz, see
http://repo.or.cz/w/git-darcs-import.git

Patches are welcome.

enjoy,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] [ANNOUNCE] git-darcs-import 0.1

2008-06-01 Thread Bertram Felgenhauer
Hi,

I'm pleased to announce yet another tool for importing darcs repositories
to git. Unlike darcs2git [1] and darcs-to-git [2], it's written in
Haskell, on top of the darcs2 source code. The result is a much faster
program - it can convert the complete ghc 6.9 branch (without libraries)
in less than 15 minutes on my slightly dated machine (Athlon XP 2500+),
which is quite fast [3]. Incremental updates work, too.

The program is still rough around the edges, and there's some cosmetical
work to do, especially with respect to converting author names. The
program should recover from most errors, as long as nobody else modifies
the destination repository.

Nevertheless, it seems quite useable already. I hope somebody finds
this useful.

You can grab the source at

   http://int-e.home.tlink.de/haskell/git-darcs-import-0.1.tar.bz2

Look at the README for further information.

Credits go to:
David Roundy and all contributors for darcs2. The code base is
surprisingly pleasant to work with.

And of course, Linus Torvalds, Junio Hamano and all other git
contributors.

Enjoy,

Bertram

[1] http://repo.or.cz/w/darcs2git.git?a=shortlog
[2] http://git.sanityinc.com/?p=darcs-to-git.git
[3] http://nominolo.blogspot.com/2008/05/thing-that-should-not-be-or-how-to.html
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


  1   2   >