I for one welcome our new C::Blocks overlords! :-)
On Fri, May 23, 2014 at 5:35 AM, David Mertens <dcmertens.p...@gmail.com> wrote:

> Hey everyone,
>
> tl;dr: C::Blocks is a new TinyCC-based module, presently only available on
> github. (1) It jit-compiles blocks of C code, building and inserting OPs
> into the Perl OP tree, making invocation of C code essentially free. (2) It
> will allow different blocks of C code to share function and struct
> declarations, thus removing the need to always recompile perl.h, an
> otherwise major cost of jit-compiling C code that can interface with Perl
> and Perl data structures.
>
> I am currently seeking help and encouragement to squash the segfaults that
> currently prevent the completion of the second feature. :-)
>
> ----
>
> I like Perl, but I like C, too. I would like to be able to write and call
> C code from Perl in about as painless a way as possible. Inline::C is nice,
> as are XS::TCC and C::TinyCompiler, but we can do better. C::Blocks is my
> attempt to do better.
>
> *Pain point 1*. C code should be a first class citizen. With XS::TCC and
> C::TinyCompiler, you pass your code to the compiler via a string. With
> Inline::C, you either place your code at the bottom of your script in a
> __DATA__ section, or you enclose it in a string. Steffen's module is
> probably the most transparent in this sense. Still, working with an
> interface that requires me to compile a string to get my product feels the
> same as compiling a regex from a string. This is Perl! We can do better!
>
> C::Blocks does better by using a keyword parser hook. Blocks of code that
> you want executed are called like so:
>
>     print "Before cblock\n";
>     cblock {
>         printf("In cblock\n");
>     }
>     print "After cblock\n";
>
> If stdio.h is included, you get the output
>
>     Before cblock
>     In cblock
>     After cblock
>
> Because it uses a parser hook, the C code really is inline with your Perl
> code.
>
> *Pain point 2*. Calling C code should be obvious and cheap.
> All three modules discussed so far provide a mechanism for calling C
> functions. This means that for a simple, small operation, I must wrap my
> idea into a function one place and invoke it in another. Furthermore, if I
> want to repeatedly call a block of C code in a loop, I must define that
> block of code somewhere outside of the loop, potentially very far from the
> call site. C::TinyCompiler suffers further because it uses a complicated
> and rather slow calling mechanism.
>
> C::Blocks solves this by extracting and jit-compiling the C code at Perl
> parse time, generating an OP and inserting it into the Perl OP tree. This
> means that you can insert your C code exactly where you want it and not
> worry about repeated re-compiles. If you were to wrap the example given
> above in a for loop, you would see how this works.
>
> *Pain point 3*. Sharing C code should be as easy as sharing Perl code.
> C::TinyCompiler provides a fairly complex mechanism to allow modules to add
> declarations and symbols to a compiler context. Any string that uses that
> will need to recompile those declarations, however, tempting me to
> prematurely optimize by placing all of my C code in one giant string
> instead of interspersed among my Perl code. Neither XS::TCC nor Inline::C
> provide much (if any) automated machinery to share code.
>
> C::Blocks provides a mechanism to share function declarations, struct
> definitions, and other identifiers with other cblocks in the current
> lexical scope, as well as to share them on a per-package basis. It is even
> more versatile than normal Perl function scoping, allowing you to correctly
> correlate functionality with lexical scope. (It is also somewhat buggy, as
> discussed next.)
>
> *Pain point 4*. Changing C code should not cost anything. Inline::C can
> take seconds to recompile a changed set of C code.
> In contrast, there is no cost associated with changing code when using
> XS::TCC and C::TinyCompiler because they jit-compile their code. That comes
> at the cost, however, of always compiling everything each time you invoke
> your Perl script. If your C code needs the Perl C API, you will have to
> re-parse perl.h every time you *compile* a code block, which can happen
> many times with each execution of your script. Inline::C's caching
> mechanism provides a big win in that respect, unless you change your code.
> It would be nice if we could somehow cache the result of parsing
> ``#include "perl.h"''.
>
> C::Blocks uses a fork of tcc that I've been working on for many months
> aimed at allowing one compiler context to share its symbol table with other
> compiler contexts. This is related to the previous point. The sharing
> mechanism discussed in the previous point applies to preprocessor includes,
> so once I have compiled a block that uses the Perl headers, I can share all
> of those declarations with later compilation units, without recompiling. In
> future work, I plan to store these symbol tables to disk so that they don't
> even need to be re-parsed each time you run your script.
>
> ----
>
> C::Blocks currently addresses, completely, pain points 1 and 2 above. It
> has taken many months to hack on tcc to reach this point. I have now
> encountered some segfault-causing issues when trying to share code, yet I
> have a hard time reproducing those segfaults with direct tests on tcc. If
> you think this sounds like a cool project, I would appreciate some
> camaraderie as I try to dig into the internals of tcc. You can find me on
> perl's IRC network hanging out on #pdl, #xs, and #tinycc, among other
> channels. You can find my work at https://github.com/run4flat/C-Blocks
>
> If you would like to help out, let me know and I will give you a tour
> through the codebase. :-)
>
> Any help or encouragement would be much appreciated! Thanks!
> David
>
> --
> "Debugging is twice as hard as writing the code in the first place.
> Therefore, if you write the code as cleverly as possible, you are,
> by definition, not smart enough to debug it." -- Brian Kernighan