Re: Announcing C::Blocks, a different way to interface Perl and C code

2014-05-24 Thread Ingy dot Net
I for one welcome our new C::Blocks overlords! :-)



Announcing C::Blocks, a different way to interface Perl and C code

2014-05-23 Thread David Mertens
Hey everyone,

tl;dr: C::Blocks is a new TinyCC-based module, presently only available on
GitHub. (1) It jit-compiles blocks of C code, building and inserting OPs
into the Perl OP tree, making invocation of C code essentially free. (2) It
will allow different blocks of C code to share function and struct
declarations, thus removing the need to always recompile perl.h, an
otherwise major cost of jit-compiling C code that can interface with Perl
and Perl data structures.

I am seeking help and encouragement to squash the segfaults that
currently prevent the completion of the second feature. :-)



I like Perl, but I like C, too. I would like to be able to write and call C
code from Perl in about as painless a way as possible. Inline::C is nice,
as are XS::TCC and C::TinyCompiler, but we can do better. C::Blocks is my
attempt to do better.

*Pain point 1*. C code should be a first-class citizen. With XS::TCC and
C::TinyCompiler, you pass your code to the compiler via a string. With
Inline::C, you either place your code at the bottom of your script in a
__DATA__ section, or you enclose it in a string. Steffen's module is
probably the most transparent in this sense. Still, working with an
interface that requires me to compile a string to get my product feels the
same as compiling a regex from a string. This is Perl! We can do better!

C::Blocks does better by using a keyword parser hook. Blocks of code that
you want executed are called like so:

    print "Before cblock\n";
    cblock {
        printf("In cblock\n");
    }
    print "After cblock\n";

If stdio.h is included, you get the output

Before cblock
In cblock
After cblock

Because it uses a parser hook, the C code really is inline with your Perl
code.

*Pain point 2*. Calling C code should be obvious and cheap. All three
modules discussed so far provide a mechanism for calling C functions. This
means that for a simple, small operation, I must wrap my idea into a
function in one place and invoke it in another. Furthermore, if I want to
repeatedly call a block of C code in a loop, I must define that block of
code somewhere outside of the loop, potentially very far from the call
site. C::TinyCompiler suffers further because it uses a complicated and
rather slow calling mechanism.

C::Blocks solves this by extracting and jit-compiling the C code at Perl
parse time, generating an OP and inserting it into the Perl OP tree. This
means that you can insert your C code exactly where you want it and not
worry about repeated re-compiles. If you were to wrap the example given
above in a for loop, you would see how this works.

*Pain point 3*. Sharing C code should be as easy as sharing Perl code.
C::TinyCompiler provides a fairly complex mechanism to allow modules to add
declarations and symbols to a compiler context. Any string of C code that
uses them must recompile those declarations, however, which tempts me to
prematurely optimize by placing all of my C code in one giant string
instead of interspersing it among my Perl code. Neither XS::TCC nor
Inline::C provides much (if any) automated machinery to share code.

C::Blocks provides a mechanism to share function declarations, struct
definitions, and other identifiers with other cblocks in the current
lexical scope, as well as to share them on a per-package basis. It is even
more versatile than normal Perl function scoping, allowing you to correctly
correlate functionality with lexical scope. (It is also somewhat buggy, as
discussed next.)

*Pain point 4*. Changing C code should not cost anything. Inline::C can
take seconds to recompile a changed set of C code. In contrast, there is no
cost associated with changing code when using XS::TCC and C::TinyCompiler
because they jit-compile their code. That comes at the cost, however, of
always compiling everything each time you invoke your Perl script. If your
C code needs the Perl C API, you will have to re-parse perl.h every time
you *compile* a code block, which can happen many times with each execution
of your script. Inline::C's caching mechanism provides a big win in that
respect, unless you change your code. It would be nice if we could somehow
cache the result of parsing ``#include "perl.h"''.

C::Blocks uses a fork of tcc that I've been working on for many months,
aimed at allowing one compiler context to share its symbol table with other
compiler contexts. This ties in with pain point 3: the sharing mechanism
described there also applies to preprocessor includes, so once I have
compiled a block that uses the Perl headers, I can share all of those
declarations with later compilation units, without recompiling. In future
work, I plan to store these symbol tables to disk so that they don't even
need to be re-parsed each time you run your script.



C::Blocks currently addresses, completely, pain points 1 and 2 above. It
has taken many months to hack on tcc to reach this point. I have now
encountered some segfault-causing issues when 

Re: Announcing C::Blocks, a different way to interface Perl and C code

2014-05-23 Thread sisyphus1
Sounds pretty cool, David.
A plodder like me is probably just going to stick with Inline, but it would be 
great to see C::Blocks take off. And it has the attractiveness to do so.
(I like the way you can so easily just plonk the C code right in there amongst 
the perl code ... makes Inline look not-so-inline :-)

Cheers,
Rob

From: David Mertens 
Sent: Friday, May 23, 2014 10:35 PM
To: perl-xs@perl.org ; Perl Inline Mail List 
Subject: Announcing C::Blocks, a different way to interface Perl and C code