Konrad

That does sound frustrating.

I think your first port of call should be Manuel Chakravarty, the author of 
accelerate.  The example you give in your Stack Overflow post 
(http://stackoverflow.com/questions/27541609/difference-in-performance-of-compiled-accelerate-code-ran-from-ghci-and-shell) 
can only be some weird systems thing.  After all, you are executing precisely 
the same code (namely compiled Accelerate code); it’s just that in one case 
it’s dynamically linked and executed from GHCi and in the other it’s linked 
and executed by the shell.  I have no clue what could cause that.  I wonder if 
you are using a GPU and whether that might somehow behave differently.   Could 
it be the difference between static linking and dynamic linking (which could 
plausibly account for some startup delay)?  Is it a fixed overhead (eg takes 
100ms extra) or does it run a factor of two slower (increase the size of your 
test case to see)?
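
One way to tell the two cases apart is to time the same workload at several sizes, once from GHCi and once from the compiled executable, and then compare the differences and the ratios. The sketch below is only an illustration under stated assumptions: `workload` is a hypothetical stand-in for the real Accelerate computation, the names `timeAction`, `spread` and `classify` and the 20% tolerance are made up for this example, and `GHC.Clock` needs a newer base (4.11 or later) than the GHC 7.8.3 mentioned in this thread, so another wall-clock timer would have to be substituted there.

```haskell
module Main where

import Control.Exception (evaluate)
import Data.List (foldl')
import GHC.Clock (getMonotonicTime) -- assumption: base >= 4.11

-- | Wall-clock seconds taken to run an IO action.
timeAction :: IO a -> IO Double
timeAction act = do
  start <- getMonotonicTime
  _ <- act
  end <- getMonotonicTime
  return (end - start)

-- | Hypothetical stand-in for the real computation under test.
workload :: Int -> Int
workload n = foldl' (+) 0 [1 .. n]

data Overhead = Fixed | Proportional | Unclear
  deriving (Eq, Show)

-- | Relative spread of a list: (max - min) / mean.
spread :: [Double] -> Double
spread xs = (maximum xs - minimum xs) / (sum xs / fromIntegral (length xs))

-- | Classify (compiled, ghci) timings taken at increasing problem sizes.
-- A roughly constant difference suggests a fixed overhead; a roughly
-- constant ratio suggests a proportional slowdown.
classify :: [(Double, Double)] -> Overhead
classify samples
  | length samples < 2                         = Unclear
  | spread diffs < tol && spread ratios >= tol = Fixed
  | spread ratios < tol && spread diffs >= tol = Proportional
  | otherwise                                  = Unclear
  where
    diffs  = [g - c | (c, g) <- samples]
    ratios = [g / c | (c, g) <- samples]
    tol    = 0.2 -- arbitrary tolerance chosen for this sketch

-- | Print timings for a few sizes; run this both from GHCi and as a
-- compiled executable, then feed the paired numbers to 'classify'.
main :: IO ()
main = mapM_ report [100000, 1000000, 10000000]
  where
    report n = do
      t <- timeAction (evaluate (workload n))
      putStrLn (show n ++ " elements: " ++ show t ++ " s")
```

Running `main` in both environments gives one timing per size in each; pairing them up as (compiled, ghci) and passing the list to `classify` gives a rough indication of which kind of overhead is present.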

I’d be happy to have a Skype call with you, but I am rather unlikely to know 
anything helpful because it doesn’t sound like a core Haskell issue at all.   
You are executing the very same machine instructions!

The overheads of the GHC API to compile and run the expression “main” are 
pretty small.

I’m copying ghc-devs in case anyone else has any ideas.

Simon



From: Konrad Gądek [mailto:[email protected]]
Sent: 14 January 2015 13:59
To: Simon Peyton Jones
Cc: Piotr Młodawski; [email protected]
Subject: Request for assistance from Haskell-oriented startup: GHCi performance


Dear Mr Jones,



My name is Konrad Gądek and I'm one of the programmers at Flowbox ( 
http://flowbox.io ), a startup that aims to bring a fresh approach to image 
composition in the movie industry. We proudly use Haskell in nearly all of our 
development. I believe you may remember our CEO, Wojciech Daniło, from 
discussions such as this thread: https://phabricator.haskell.org/D69 .

What may interest you is that, to achieve our goals as a company, we 
started developing a new programming language - Luna. Long story short, we 
believe that Luna could be as beneficial for the Haskell community as Elixir is 
for Erlang.

However, we have found some major performance problems with the code that are 
as critical for us as they are cryptic. We have had difficulty pinpointing 
the actual issue, not to mention solving it. We're getting a bit desperate: 
nobody so far has been able to help us, and so we would like to ask 
you for help. We would be really grateful if you could take a look; 
maybe your fresh ideas could shed some light on the issue. Details are given 
below.

Is there any chance we could arrange, e.g., a Skype call so we could discuss 
the matter further?



Thank you in advance!



Background

Currently Luna is transcompiled to Haskell and then compiled to bytecode by 
GHC. Furthermore, we use ghci to evaluate expressions (the flow graph) 
interactively. We use the accelerate library to perform high-performance 
computations with the help of graphics cards.

The problem

Executing some of the functions from libraries compiled with -O2 (especially 
from accelerate) in ghci is much slower than calling them from a compiled 
executable (see 
http://stackoverflow.com/questions/27541609/difference-in-performance-of-compiled-accelerate-code-ran-from-ghci-and-shell 
and https://github.com/AccelerateHS/accelerate/issues/227).

Maybe there is some other way to interactively evaluate Haskell code that is 
more lightweight or more customizable, i.e. one that would not require all the 
ghc-api features which are probably slowing down the whole process? Is it 
possible to just use the ghc linker and make function calls simpler and more 
time-efficient?



Details

We feed ghci with statements and declarations through ghc-api (using runStmt 
and runDecls). We can also change imports and language extensions before each 
call. The overall process is as follows:

  *   on init:
     *   set ghcpath to one with our custom installation of ghc with 
preinstalled graphic libraries
     *   set imports to our libraries
     *   enable/disable appropriate language extensions

  *   for each run:
     *   generate haskell code (including datatype declarations, using lenses 
and TemplateHaskell) and load it to ghci using runDecls
     *   for each expression:
        *   run statements that use freshly generated code
        *   bind (lazy) results to variables
        *   evaluate values from bound variables, and get it from GhcMonad to 
runtime of our interpreter (see 
http://hackage.haskell.org/package/hint-0.4.2.1/docs/Language-Haskell-Interpreter.html#v:interpret)

This behaviour was observed when using GHC 7.8.3 (with the D69 patch) on 
Fedora 20 (x86-64), Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz.

Tried so far

  1.  Specializing nearly everything in the accelerate library, and 
specializing calls to accelerate methods (no speedup).
  2.  Loading precompiled, optimised code into ghci (no speedup).
  3.  Truth be told, we have no idea what to try next.



--
Konrad Gądek
typechecker team-leader in Flowbox
_______________________________________________
ghc-devs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/ghc-devs
