Re: Announcing C::Blocks, a different way to interface Perl and C code

sisyphus1 Fri, 23 May 2014 20:35:27 -0700

Sounds pretty cool, David.
A plodder like me is probably just going to stick with Inline, but it would be 
great to see C::Blocks take off, and it has the appeal to do so.
(I like the way you can so easily just plonk the C code right in there amongst 
the perl code .... makes Inline look not-so-inline :-)

Cheers,
Rob

From: David Mertens 
Sent: Friday, May 23, 2014 10:35 PM
To: perl-xs@perl.org ; Perl Inline Mail List 
Subject: Announcing C::Blocks, a different way to interface Perl and C code

Hey everyone,

tl;dr: C::Blocks is a new TinyCC-based module, presently only available on 
GitHub. (1) It jit-compiles blocks of C code, building and inserting OPs into 
the Perl OP tree, making invocation of C code essentially free. (2) It will 
allow different blocks of C code to share function and struct declarations, 
thus removing the need to always recompile perl.h, an otherwise major cost of 
jit-compiling C code that can interface with Perl and Perl data structures.

I am seeking help and encouragement to squash the segfaults that currently 
prevent the completion of the second feature. :-)

----

I like Perl, but I like C, too. I would like to be able to write and call C 
code from Perl in about as painless a way as possible. Inline::C is nice, as 
are XS::TCC and C::TinyCompiler, but we can do better. C::Blocks is my attempt 
to do better.

Pain point 1. C code should be a first-class citizen. With XS::TCC and 
C::TinyCompiler, you pass your code to the compiler via a string. With 
Inline::C, you either place your code at the bottom of your script in a 
__DATA__ section, or you enclose it in a string. Steffen's module (XS::TCC) is 
probably the most transparent in this sense. Still, working with an interface 
that requires me to compile a string to get my product feels the same as 
compiling a regex from a string. This is Perl! We can do better!

C::Blocks does better by using a keyword parser hook. Blocks of code that you 
want executed are written like so:

    print "Before cblock\n";
    cblock {
        printf("In cblock\n");
    }
    print "After cblock\n";

If stdio.h is included, you get the output:

    Before cblock
    In cblock
    After cblock

Because it uses a parser hook, the C code really is inline with your Perl code.

Pain point 2. Calling C code should be obvious and cheap. All three modules 
discussed so far provide a mechanism for calling C functions. This means that 
for a simple, small operation, I must wrap my idea into a function in one place 
and invoke it in another. Furthermore, if I want to repeatedly call a block of 
C code in a loop, I must define that block of code somewhere outside of the 
loop, potentially very far from the call site. C::TinyCompiler suffers further 
because it uses a complicated and rather slow calling mechanism.

C::Blocks solves this by extracting and jit-compiling the C code at Perl parse 
time, generating an OP and inserting it into the Perl OP tree. This means that 
you can insert your C code exactly where you want it and not worry about 
repeated re-compiles. If you were to wrap the example given above in a for 
loop, you would see how this works.
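To picture why a loop does not trigger recompilation, here is a toy model in plain C. It is only a sketch and not the C::Blocks implementation: toy_op, run_cblock, build_cblock_op, run_ops, and demo_loop are all hypothetical names, and Perl's real OP structure is considerably richer. The point is just that the op is built once, and every later pass through the loop costs one indirect function call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a Perl OP: a function pointer plus a link to the
 * next node. This is a model only, not Perl's actual struct. */
typedef struct toy_op toy_op;
struct toy_op {
    toy_op *(*run)(toy_op *self);  /* code to execute for this op */
    toy_op *next;                  /* op to execute afterwards */
    int     calls;                 /* how many times this op has run */
};

/* Body of the hypothetical cblock; by run time it is already compiled. */
static toy_op *run_cblock(toy_op *self) {
    printf("In cblock\n");
    self->calls++;
    return self->next;
}

/* "Parse time": build the op once. Nothing is compiled after this. */
static void build_cblock_op(toy_op *op) {
    op->run = run_cblock;
    op->next = NULL;
    op->calls = 0;
}

/* "Run time": follow the chain, one indirect call per op. */
static void run_ops(toy_op *start) {
    for (toy_op *op = start; op != NULL; op = op->run(op)) {
        /* all the work happens in op->run */
    }
}

/* Build once, then execute the same op repeatedly, as a for loop would. */
static int demo_loop(int iterations) {
    toy_op op;
    assert(iterations >= 0);
    build_cblock_op(&op);
    for (int i = 0; i < iterations; i++) {
        run_ops(&op);
    }
    return op.calls;
}
```

Calling demo_loop(3) from a main function runs the block body three times while build_cblock_op runs only once.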

Pain point 3. Sharing C code should be as easy as sharing Perl code. 
C::TinyCompiler provides a fairly complex mechanism to allow modules to add 
declarations and symbols to a compiler context. Any string that uses that will 
need to recompile those declarations, however, tempting me to prematurely 
optimize by placing all of my C code in one giant string instead of 
interspersed among my Perl code. Neither XS::TCC nor Inline::C provides much (if 
any) automated machinery to share code.

C::Blocks provides a mechanism to share function declarations, struct 
definitions, and other identifiers with other cblocks in the current lexical 
scope, as well as to share them on a per-package basis. It is even more 
versatile than normal Perl function scoping, allowing you to correctly 
correlate functionality with lexical scope. (It is also somewhat buggy, as 
discussed next.)

Pain point 4. Changing C code should not cost anything. Inline::C can take 
seconds to recompile a changed set of C code. In contrast, there is no cost 
associated with changing code when using XS::TCC and C::TinyCompiler because 
they jit-compile their code. That comes at the cost, however, of always 
compiling everything each time you invoke your Perl script. If your C code 
needs the Perl C API, you will have to re-parse perl.h every time you compile a 
code block, which can happen many times with each execution of your script. 
Inline::C's caching mechanism provides a big win in that respect, unless you 
change your code. It would be nice if we could somehow cache the result of 
parsing ``#include "perl.h"''.

C::Blocks uses a fork of tcc that I've been working on for many months, aimed 
at allowing one compiler context to share its symbol table with other compiler 
contexts. This is related to the previous point: the sharing mechanism 
discussed there also applies to preprocessor includes, so once I have compiled 
a block that uses the Perl headers, I can share all of those declarations with 
later compilation units without recompiling. In future work, I plan to store 
these symbol tables to disk so that they don't even need to be re-parsed each 
time you run your script.
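One way to picture symbol-table sharing is a lookup that falls back from a block's own table to a table filled in by an earlier compilation. The sketch below is a toy model in plain C under that assumption; it does not reflect how the tcc fork actually stores or shares symbols, and symtab, symtab_add, symtab_has, and demo_shared_lookup are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 8

/* Toy symbol table: a few names plus an optional link to a table that
 * an earlier compiler context already filled in. */
typedef struct symtab {
    const struct symtab *shared;   /* borrowed table, or NULL */
    const char *names[MAX_SYMS];
    int count;
} symtab;

/* Record a name; returns 1 on success, 0 if the table is full. */
static int symtab_add(symtab *t, const char *name) {
    assert(t != NULL && name != NULL);
    if (t->count >= MAX_SYMS) return 0;
    t->names[t->count++] = name;
    return 1;
}

/* Look in the local table first, then in each shared table in turn. */
static int symtab_has(const symtab *t, const char *name) {
    for (; t != NULL; t = t->shared) {
        for (int i = 0; i < t->count; i++) {
            if (strcmp(t->names[i], name) == 0) return 1;
        }
    }
    return 0;
}

/* The "headers" table is filled once; the later block reuses it
 * instead of parsing the same declarations again. */
static int demo_shared_lookup(void) {
    symtab headers = { NULL, {0}, 0 };
    symtab_add(&headers, "SV");
    symtab_add(&headers, "newSViv");

    symtab block = { &headers, {0}, 0 };
    symtab_add(&block, "my_helper");   /* hypothetical block-local name */

    return symtab_has(&block, "newSViv")       /* found via sharing */
        && symtab_has(&block, "my_helper")     /* found locally */
        && !symtab_has(&headers, "my_helper"); /* sharing is one-way */
}
```

In this model the later block sees everything in the shared table, while the shared table never sees the block's own names.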

----

C::Blocks currently addresses, completely, pain points 1 and 2 above. It has 
taken many months to hack on tcc to reach this point. I have now encountered 
some segfault-causing issues when trying to share code, yet I have a hard time 
reproducing those segfaults with direct tests on tcc. If you think this sounds 
like a cool project, I would appreciate some camaraderie as I try to dig into 
the internals of tcc. You can find me on perl's IRC network hanging out on 
#pdl, #xs, and #tinycc, among other channels. You can find my work at 
https://github.com/run4flat/C-Blocks

If you would like to help out, let me know and I will give you a tour through 
the codebase. :-)

Any help or encouragement would be much appreciated! Thanks!
David

-- 
"Debugging is twice as hard as writing the code in the first place.
  Therefore, if you write the code as cleverly as possible, you are,
  by definition, not smart enough to debug it." -- Brian Kernighan
