Hello everyone,

While this is not strictly a PDL announcement, it is something that will
undoubtedly be of interest to PDL users, and may even be of interest to PDL
developers.

Yesterday I pushed the first "alpha" release of C::Blocks
<https://metacpan.org/pod/C::Blocks>, a module designed to make it much
easier to write and share C code and data structures among your Perl code.

How does this work? C::Blocks uses Perl's keyword API to add the "cblock"
keyword <https://metacpan.org/pod/C::Blocks#Procedural-Blocks> (among a
couple <https://metacpan.org/pod/C::Blocks#Private-C-Declarations> of
<https://metacpan.org/pod/C::Blocks#Shared-C-Declarations> others
<https://metacpan.org/pod/C::Blocks#csub-name-code>). This keyword expects
a curly-bracket-delimited block of C code that gets executed in place. It
achieves this by JIT-compiling the block of C code at Perl parse time and
inserting an OP in place of the block.

There are two technical problems that I had to solve to get C::Blocks to be
computationally effective. First, C::Blocks is built on my own fork of the
Tiny C Compiler <https://github.com/run4flat/tinycc> that makes it possible
to make a copy of a compilation unit's symbol table. This makes it possible
to declare a struct definition, write a function, or #include a set of
header files in one (clex or cshare) block, and have other blocks use that
information without having to recompile any of it. Second, this extended
symbol table information can be cached and quickly loaded from disk,
something that I use to rapidly load Perl's C API
<https://metacpan.org/pod/C::Blocks::PerlAPI>.

In this case, "alpha" means that it passes its test suite on at least one
of the operating systems I have access to (Linux), and it compiles and
passes at least some tests on the others that I have access to
(Mac and Windows). In other words, it looks like it'll fly, but it's still
quite fragile. The goal for the alpha release series is to iron out the
bugs and get it to compile and pass all of its tests on all major platforms.

I have worked on a few benchmarks
<https://metacpan.org/source/DCMERTENS/C-Blocks-0.01/bench> for C::Blocks
and have been quite surprised by the results. So surprised, in fact, that I
had to double-check that my Perl had been compiled with decent optimization
(it had). Benchmarks on my Linux laptop indicate that for simple operations
(i.e. sum) on small arrays (N < 3000), PDL's method setup and teardown are
so costly that they dominate the calculation, and C::Blocks code wins
handily. PDL gets the upper hand near N = 10,000, but only beats C::Blocks
by a factor of 2. A more interesting benchmark was for the operation
"sqrt(sum($pdl_data*$pdl_data))", an N-dimensional Euclidean distance. For
this, the C::Blocks equivalent code was comparable except in the vicinity
of N = 100,000, where PDL beat it by a factor of 2. The method setup and
teardown dominated PDL performance for small N, and for large N the memory
allocations and re-allocations degraded PDL's performance to the point that
PDL and C::Blocks performed the calculation at essentially the same rate.

You can learn more about C::Blocks at the CPAN page:
https://metacpan.org/pod/C::Blocks.

David

-- 
 "Debugging is twice as hard as writing the code in the first place.
  Therefore, if you write the code as cleverly as possible, you are,
  by definition, not smart enough to debug it." -- Brian Kernighan
------------------------------------------------------------------------------
_______________________________________________
pdl-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pdl-general
