Funny you should mention that. This idea (of dataflow) has been around for a
long time, and PDL internals actually have additional infrastructure for more
sophisticated data flow that is intended to do just that kind of thing. But
IMHO the reason generalized dataflow never caught on is that it is hard to
implement, and also (as envisioned originally) a bit limited. A general
cascading callback infrastructure is probably a better and simpler answer.
Fully implemented, it could make use of the parent/daughter links in the PDL
structure itself -- which are currently only used by selection operators.
After a lot of reflection, I think computed dataflow (as a formalized language
mechanism) is misguided; the lighter weight explicit callback mechanism you
mention might be a nice alternate mechanism that fills some of the same niche.
Here's why I don't like formalized computed dataflow:
(1) computed dataflow can be expensive. I often work with PDLs that
take 100 seconds or more of CPU time to calculate. If I change a prior, I
don't want to wait 100 seconds to execute the next statement.
(2) computed dataflow is dangerous. In particular, any circular
dependencies will lead to infinite update loops.
(3) computed dataflow is surprising. It's already surprising to
newcomers to find that selection operators preserve dataflow; nobody expects
(e.g.) sin(x) or multiplication to do that.
(4) computed dataflow can be wasteful. Currently there is no way to
mark and carry a locus of touched elements in a PDL. That means computed
dataflow operations (notably, indices, dices, and ranges) have to recompute the
connected values for an entire PDL whenever even one element gets touched,
which can be ludicrously wasteful of CPU time. (Affine slices have this problem
too but sidestep it by not actually performing any computing -- an affine slice
operation just makes a new PDL header with different dimincs counters, pointing
to the same underlying data.)
(5) there are many non-obvious API pitfalls with computed dataflow.
For example it is probably not possible to implement a general-purpose two-way
flow even for simple operators, but one-way flow poses its own problems (e.g.
what happens if you say "$b = sin($a); $b .= 0.5;"? Does $a get stomped on?
Does $b get transient values that disappear the next time $a is touched? Also,
many multivariate calculations are non-commutative; this problem faces
spreadsheet program designers, but they at least have a spatial structure to
guide the programmer, rather than a not-easily-visualizable directed graph).
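To make point (2) concrete, here is a minimal sketch in plain Perl (no PDL involved) of two values wired to update each other through callbacks. The Cell package, its watch/set methods, and the in_update guard are all hypothetical names invented for this illustration, not anything in PDL. Without the guard, the two callbacks would keep calling each other until Perl gave up.

```perl
use strict;
use warnings;

# Hypothetical illustration only: 'Cell' is not a PDL class.
package Cell;

sub new {
    my ($class, $value) = @_;
    return bless {
        value     => $value,
        watchers  => [],
        in_update => 0,
        fired     => 0,
    }, $class;
}

sub value { $_[0]{value} }
sub fired { $_[0]{fired} }

# Register a callback to run whenever this cell is set.
sub watch {
    my ($self, $code) = @_;
    push @{ $self->{watchers} }, $code;
    return $self;
}

sub set {
    my ($self, $value) = @_;
    # Re-entrancy guard: a set() that arrives while this cell's own
    # watchers are still running means the dependency graph has a cycle,
    # so the nested update is dropped instead of recursing forever.
    return if $self->{in_update};
    $self->{value} = $value;
    local $self->{in_update} = 1;
    for my $code (@{ $self->{watchers} }) {
        $self->{fired}++;
        $code->($self);
    }
    return;
}

package main;

my $celsius    = Cell->new(0);
my $fahrenheit = Cell->new(32);

# Each cell updates the other: a circular dependency.
$celsius->watch(sub    { $fahrenheit->set($_[0]->value * 9 / 5 + 32) });
$fahrenheit->watch(sub { $celsius->set(($_[0]->value - 32) * 5 / 9) });

$celsius->set(100);
printf "C=%g F=%g\n", $celsius->value, $fahrenheit->value;
```

Silently dropping the nested update is only one possible policy; dying with a "cycle detected" message would be just as defensible. Either way, with explicit callbacks the guard is a few visible lines, whereas a formal dataflow mechanism would need to enforce it for every operator.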
Callbacks, on the other hand, are pretty well known in the programming
community and come with pretty well understood pitfalls, as well as actually
being more useful than a full computed dataflow infrastructure. So, er, I like
it.
Hmmm...
It appears to me that you are keeping a whole copy of the entire original PDL
just to compare and see if it has changed. I think there is a "changed" flag
that you can use in the underlying PDL structure, which should avoid having to
page through all the data. Failing that, you could use a hash algorithm like,
say, MD5 instead and keep only the digest; it would probably run faster, since
each check reads the current data alone rather than the data plus a saved copy.
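
A rough sketch of the hashing idea, in plain Perl with the core Digest::MD5 module. The make_change_detector helper is a hypothetical name, and the packed string of doubles is only a stand-in for a PDL's data buffer. I believe the real bytes are reachable through get_dataref, but that is an assumption not exercised here.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: returns a closure that reports whether the bytes
# behind $$bufref differ from what they were the last time it was called.
sub make_change_detector {
    my ($bufref) = @_;
    my $last = md5_hex($$bufref);    # keep a 32-character digest, not a copy
    return sub {
        my $now     = md5_hex($$bufref);
        my $changed = $now ne $last ? 1 : 0;
        $last = $now;
        return $changed;
    };
}

# Stand-in for a data buffer: 20 native doubles packed into a byte string.
my $data        = pack 'd*', 0 .. 19;
my $double_size = length pack('d', 0);
my $has_changed = make_change_detector(\$data);

print "untouched:  ", $has_changed->(), "\n";

# Overwrite element 0 in place, then ask again.
substr($data, 0, $double_size, pack('d', -10));
print "after edit: ", $has_changed->(), "\n";
print "settled:    ", $has_changed->(), "\n";
```

Each check still reads every byte once, so this saves the memory and the second read of a shadow copy, not the scan itself. It also reports only that something changed, not which elements.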
Cheers,
Craig
On Sep 15, 2011, at 6:09 AM, David Mertens wrote:
> Hello all -
>
> A friend of mine once speculated that it would be cool to be able to
> update a piddle in the pdl shell and have the corresponding plot
> immediately update itself. I thought about it for a second and said,
> "I'll bet you could do that with some fancy slice operation." Well, I
> am happy to say that I have a first cut at such a slice operation. The
> code below implements it using Inline::Pdlpp so you can try it out.
> Basically, it allows you to carry around a piddle that 'knows' when
> you've updated it, and which calls a customized callback. The callback
> is a typical Perl sub ref, which means it knows about its surrounding
> variables. However, it does not get any arguments, for reasons that
> you'll see if you look closely at the first wart.
>
> I would like to clean this up (checks for nan, bad value handling,
> etc) and have this added to PDL::Slices. Thoughts?
>
> @Porters and wizards: in order to get at the code ref in the BackCode
> section, I had to go explicitly through __privtrans. That seems
> hackish. Is there a better way to do this?
>
> David
>
>
>
> use strict;
> use warnings;
> use PDL;
> use Inline 'Pdlpp';
> use PDL::NiceSlice;
>
> ###############
> # Basic Usage #
> ###############
>
> # Create a 'ghost' piddle called _x:
> my $_x = sequence(20);
> # Monitor _x by wrapping it with a callback piddle called x:
> my $update_counter = 0;
> my $x = $_x->callback(sub {
>     $update_counter++;
>     print "Detected update number $update_counter: $_x\n";
> });
> # Any changes in x will lead to a print statement:
> $x++;
> $x->set(0, -10);
> # Updates to *slices* of x also call the callback:
> $x(5:10) -= 30;
> # Simply using x does not lead to the callback (this will not print a line):
> my $y = $x + 10;
> print "Done with basic usage\n";
>
> ############################################################
> # Wart - Stringing together the construction does not work #
> ############################################################
>
> # You would think you could string together such a declaration like this,
> # but it doesn't work:
> my $z = sequence(20)->callback( sub { print "Changed z!\n" });
> print "z is $z; incrementing\n";
> # Notice, this never calls the callback!
> $z++;
> print "z is now $z\n";
>
> ######################################################
> # Partial Wart - cannot print out parents of parents #
> ######################################################
>
> my $a = sequence(20);
> # A non-affine slice:
> my $b = $a->where($a < 10);
> # create a callback piddle that monitors changes to itself:
> my $c = $b->callback( sub {
>     print "Changed c\n";
>     print "During the callback, b is $b\n";
>     print "During the callback, a is $a\n";
> });
> $c++;
> print "After the increment, a is $a\n";
>
> no PDL::NiceSlice;
>
> __END__
>
> __Pdlpp__
>
> pp_def('callback',
>     Pars => 'p();[o] c(); ',
>     OtherPars => 'SV * code_ref',
>     DefaultFlow => 1,
>     Reversible => 1,
>     # Main code is easy -- just copy the data
>     Code => q{
>         $c() = $p();
>     },
>     BackCode => q{
>         /* We'll only call back if there is a change */
>         int data_changed = 0;
>         threadloop %{
>             /* Check if the child changed */
>             if ($c() != $p()) {
>                 $p() = $c();
>                 data_changed = 1;
>             }
>         %}
>
>         /* If the data changed, call the callback */
>         if (data_changed) {
>             dSP;            /* Get at Perl's argument stack */
>             ENTER;          /* ENTER and SAVETMPS together are the */
>             SAVETMPS;       /* equivalent of a left curly bracket */
>             PUSHMARK(SP);   /* Mark the start of the (empty) argument list */
>             PUTBACK;
>
>             call_sv(__privtrans->code_ref, G_VOID);
>
>             SPAGAIN;
>             FREETMPS;       /* FREETMPS and LEAVE close the scope again */
>             LEAVE;
>         }
>     },
> );
>
> _______________________________________________
> Perldl mailing list
> [email protected]
> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
>