Hey Chris,

I wish I could say something more precise and causal about why PDL is
slowing down, but it's hard without measurements. I think the JIT has
promise, both for producing context-optimized code and, more importantly,
for playing around with the internals.

The keyword API was introduced in v5.12. Also, I just found a way to get
lexical variable insertion working for Perls earlier than 5.18, although I
have not yet pushed it to CPAN.

David

On Tue, Aug 4, 2015 at 4:33 PM, Chris Marshall <[email protected]>
wrote:

> Neat, David.  I'm looking forward to trying...
>
> What version of perl is required to support the keyword API?
>
> Regarding the latency comparison, the C::Blocks implementation
> is arguably close to optimal latency to issue a perl operation.
> The motivation for JIT compiling for PDL3 is precisely to address
> the problems of latency and extra memory motion (cache thrash).
> Your timings indicate that this approach for JIT has promise for
> some significant performance improvements.
>
> --Chris
>
>
>
> On Tue, Aug 4, 2015 at 1:35 PM, David Mertens <[email protected]>
> wrote:
>
>> Hello everyone,
>>
>> While this is not strictly a PDL announcement, it is something that will
>> undoubtedly be of interest to PDL users, and may even be of interest to PDL
>> developers.
>>
>> Yesterday I pushed the first "alpha" release of C::Blocks
>> <https://metacpan.org/pod/C::Blocks>, a module designed to make it much
>> easier to write and share C code and data structures among your Perl code.
>>
>> How does this work? C::Blocks uses Perl's keyword API to add the "cblock"
>> keyword <https://metacpan.org/pod/C::Blocks#Procedural-Blocks> (among a
>> couple <https://metacpan.org/pod/C::Blocks#Private-C-Declarations> of
>> <https://metacpan.org/pod/C::Blocks#Shared-C-Declarations> others
>> <https://metacpan.org/pod/C::Blocks#csub-name-code>). This keyword
>> expects a curly-bracket-delimited block of C code that gets executed
>> in-place. It achieves this by JIT-compiling the block of C code and
>> inserting an OP in place of the block of code during Perl parse time.
>>
>> There are two technical problems that I had to solve to get C::Blocks to
>> be computationally effective. First, C::Blocks is built on my own fork
>> of the Tiny C Compiler <https://github.com/run4flat/tinycc> that makes
>> it possible to make a copy of a compilation unit's symbol table. This makes
>> it possible to declare a struct definition, write a function, or #include a
>> set of header files in one (clex or cshare) block, and have other blocks
>> use that information without having to recompile any of it. Second, this
>> extended symbol table information can be cached and quickly loaded from
>> disk, something that I use to rapidly load Perl's C API
>> <https://metacpan.org/pod/C::Blocks::PerlAPI>.
>>
>> In this case, "alpha" means that it passes its test suite on at least one
>> of the operating systems I have access to (Linux), and it compiles and
>> passes at least some tests on all operating systems that I have access to
>> (Mac and Windows). In other words, it looks like it'll fly, but it's still
>> quite fragile. The goal for the alpha release series is to iron out the
>> bugs and get it to compile and pass all of its tests on all major platforms.
>>
>> I have worked on a few benchmarks
>> <https://metacpan.org/source/DCMERTENS/C-Blocks-0.01/bench> for
>> C::Blocks and have been quite surprised by the results. So surprised, in
>> fact, that I had to double-check that my Perl had been compiled with decent
>> optimization (it had). Benchmarks on my Linux laptop indicate that for
>> simple operations (e.g. sum) on small arrays (N < 3000), PDL's method setup
>> and teardown are so costly that they dominate the calculation, and
>> C::Blocks code wins handily. PDL gets the upper hand near N = 10,000, but
>> only beats C::Blocks by a factor of 2. A more interesting benchmark was for
>> the operation "sqrt(sum($pdl_data*$pdl_data))", an N-dimensional Euclidean
>> distance. For this, the C::Blocks equivalent code was comparable except in
>> the vicinity of N = 100,000, where PDL beat it by a factor of 2. The method
>> setup and teardown dominated PDL performance for small N, and for large N
>> the memory allocations and re-allocations degraded PDL's performance to the
>> point that PDL and C::Blocks performed the calculation at essentially the
>> same rate.
>>
>> You can learn more about C::Blocks at the CPAN page:
>> https://metacpan.org/pod/C::Blocks.
>>
>> David
>>
>> --
>>  "Debugging is twice as hard as writing the code in the first place.
>>   Therefore, if you write the code as cleverly as possible, you are,
>>   by definition, not smart enough to debug it." -- Brian Kernighan
>>
>>
>> ------------------------------------------------------------------------------
>>
>> _______________________________________________
>> pdl-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/pdl-general
>>
>>
>


