[Haskell-cafe] Haskell integration with C/C++ (GSOC)

Sutherland, Julian Wed, 04 Apr 2012 22:56:56 -0700

Hey Guys,

I'm Julian, I am reaching the end of my second year as a JMC (Joint Mathematics 
and Computer science) Student at Imperial College London
and I'd like to apply to GSOC for a project involving Haskell and I just wanted 
to run my idea past the community.


I've already talked about this on the haskell-soc IRC channel (I go by the 
pseudonym "julek").

What I would like to do is to improve the integration of C/C++ with Haskell, 
particularly in calling Haskell from C/C++.

Currently ghc is able to generate stubs to export functions whose arguments are 
simple types such as CInts into C/C++.
The stub generated is always in an extern "C" clause due to the fact that ghc 
does not as yet implement the C++
calling conventions as defined in the "The Haskell 98 Foreign Function 
Interface 1.0" (http://www.cse.unsw.edu.au/~chak/haskell/ffi/ffi.pdf)

So a first step would be to implement this calling convention to bring it up to 
speed with the above referenced report.
This shouldn't be too hard and mostly involves implementing C++ name mangling 
conventions.

Next, I would like to extend the stub generation so as to be able to deal with 
more complex types.

The type systems in C++ and Haskell have many analogous syntaxes that can be 
easily exploited to provide strong compatibility and interoperability between 
the two languages.

For example a simple type such as:

data Foo = A | B

Could be implemented as an enum in C/C++:

enum Foo {A, B};

More advanced types that take arguments such as:

data Tree = Node Left Right | Leaf

Could be converted to a struct in C/C++:

struct Tree {
    struct Tree* left;
    struct Tree* right;
};

Types that have functions that act on them such as:

data IntContainer = IntContainer Int

getInt :: IntContainer -> Int
getInt (IntContainer a) = a

could have these functions automatically converted to C/C++:

struct IntContainer {
    int a;
};

extern int getInt_hs(IntContainer a);

This also opens up the possibility of exploiting C/C++ name mangling 
conventions, to allow the _hs postfix I'm suggesting here to be eliminated.

Haskell classes:

class Arithmetic a where
    (+) :: a -> a -> a
    (*) :: a -> a -> a
    (-) :: a -> a -> a
    (/) :: a -> a -> a

could be implemented using C++ functions with virtual members:

class Monad {
    public:
       virtual Monad add(Monad a, Monad b);
       virtual Monad mult(Monad a, Monad b)
       virtual Monad neg(Monad a, Monad b);
       virtual Monad div(Monad a, Monad b);
}

All types of single/multiple instancing (i.e. either directly or through 
requirements of instances)
would be implemented using single/multiple inheritance.

Obviously, this example is rather contrived due to the conversion of the 
function names.
The fact that the rules that govern function naming in Haskell are much more 
permissive
than those of C/C++ might cause compatibility issues.

This can be worked around by implementing a similar syntax to that currently 
used for function imports by the FFI.
E.g..:
    foreign export ccall "bind" >>= :: CInt -> CInt

Similar to:
    foreign import ccall "f" func :: CInt -> CInt

The latter is the current syntax for imports.

The name given for the export would be checked for legality in the namespace of 
the target language.

Alternatively this could be done in an automated manner using some naming 
conventions as well as operator polymorphism, but this would probably sacrifice 
ease of use.

Finally polymorphic Haskell functions/types can be implemented in C++ using 
templates.

I would like to extend ghc to implement enhanced C/C++ stub generation using 
the methods described above as well as to generate Haskell stubs which describe 
the Haskell CType equivalents of the Haskell types defined, functions for 
conversion between the two and function stubs to convert the types, run the 
Haskell function and convert back as required.

On top of this I'd like to write C/C++ libraries for dealing with most of the 
standard Haskell types such as Maybe, Either, etc...

Finally, I'd like to work on ironing out any bugs that remain in the RTS when 
it is used in "multilingual" situations, as well as improving it's performance 
in this situation.

I found an example of such a bug, which I will test further before reporting it.
It seems to be the opposite of the following bug:
    http://hackage.haskell.org/trac/ghc/ticket/5594

i.e. the stdout buffer isn't always correctly flushed when calling C/C++ in a 
program whose main is written in Haskell.

For example, when running the code:

main.hs:
    module Main where

    import Foreign.C.Types
    import System.IO

    foreign import ccall "inc" c_inc :: CInt -> CInt

    main :: IO ()
    main = do
      putStr "Enter n: "
      -- hFlush stdout
      s <-getLine
      putStrLn . show . c_inc . read $ s

inc.c:

    int inc(int i) __attribute__ ((const));

    int inc(int i)
    {
      return i + 1;
    }

Built with
Makefile:
    all:
        gcc -c -o inc.o inc.c
        ghc --make -main-is Main main.hs inc.o -o test
        rm *.hi *.o

The output comes out as:
    [julek@cryptid inc]$ ./test
    2
    Enter n: 3

But when the " hFlush stdout" line is commented back in, the output is:
    [julek@cryptid inc]$ ./test
    Enter n: 2
    3

which is correct, but the extra line shouldn't be necessary.

I am currently using ghc 7.4.1 which is the newest version, so this is a 
current bug.

I have had a look for such a bug being reported and have found no such report.
I'll look into this further before reporting it, but I am fairly certain this 
is a bug in the RTS.

As part of this project, I would fix this bug (if it is still around when I 
start) as well as looking for other ones in this area.

I think that extending ghc to the level required by "The Haskell 98 Foreign 
Function Interface 1.0" specification and above would reap significant benefit 
to the Haskell community.

The improved integration into C/C++ would open the door for this to happen for 
several other languages and would make Haskell more widespread.

Many Haskell beginners are daunted by the falsely perceived complexity of 
working with Haskell IO and monads, but love using the massive advantages that 
the paradigm gives in a non monadic context. Due to this, simplifying the 
interoperability between Haskell and C/C++ would enable many of these users to 
stick around for longer and perhaps encourage them to eventually look deeper 
into the language. This would make the size of the community grow and make the 
use of Haskell more widespread, potentially reaping benefits for the community 
at large.

I believe this could be implemented within the time frame given for GSOC.

Hope you like the ideas presented here and hopefully I'll be accepted into the 
Haskell summer of code!

If anybody has any opinions on the implementability/usefulness of this and/or 
criticism of the idea, please inform me and I'll be happy to discuss it!

Looking forward to hearing from you!

Thanking you in advance.

Kind regards,

Julian Sutherland

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

[Haskell-cafe] Haskell integration with C/C++ (GSOC)

Reply via email to