I just need to explain better perhaps. So look at this:
typedef vec = array[int,3];
fun + (x:vec,y:vec)=>x.0 + y.0, x.1 + y.1, x.2 + y.2;
var x = 1,2,3;
var y = x + x + x + x;
println$ y;
Here's the generated code:
PTF x.data[0] = 1; //assign
PTF x.data[1] = 2; //assign
PTF x.data[2] = 3; //assign
PTF y.data[0] = PTF x.data[0] + PTF x.data[0] + PTF x.data[0] + PTF x.data[0] ; //assign
PTF y.data[1] = PTF x.data[1] + PTF x.data[1] + PTF x.data[1] + PTF x.data[1] ; //assign
PTF y.data[2] = PTF x.data[2] + PTF x.data[2] + PTF x.data[2] + PTF x.data[2] ; //assign
Do you understand how awesome that is?
It looks obvious that this should be what is generated, but that is NOT
what C++ would generate for a struct. Let me illustrate the awesomeness better:
var z = (x+x+x+x) . 0;
What will C++ do? Why, it will calculate
tmp = x + x;
tmp2 = tmp + x;
tmp3 = tmp2 + x;
z = tmp3.0;
Here's what Felix calculates:
PTF z = PTF x.data[0] + PTF x.data[0] + PTF x.data[0] + PTF x.data[0] ; //assign
The projection "slices" into the whole formula. The technical (and confusing)
term is that it commutes with parallel addition.
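To make "commutes" concrete, here is a rough C++ sketch of the two
evaluation orders. The names (Vec, add, project_after_add,
add_after_project, check_commutes) are made up for illustration and are not
anything Felix emits. The first function follows the tmp/tmp2/tmp3 recipe
above, the second pushes the projection inside the sum. An optimising C++
compiler may well remove the temporaries after inlining, so this shows the
shape of the computation as written, not the machine code you will get.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the Felix vec type: three ints.
using Vec = std::array<int, 3>;

// Whole-vector addition: every call builds a complete temporary Vec.
Vec add(const Vec& x, const Vec& y) {
    return {x[0] + y[0], x[1] + y[1], x[2] + y[2]};
}

// "Add, then project": three full temporaries, only component 0 is kept.
int project_after_add(const Vec& x) {
    Vec tmp  = add(x, x);
    Vec tmp2 = add(tmp, x);
    Vec tmp3 = add(tmp2, x);
    return tmp3[0];
}

// "Project, then add": the projection is pushed inside the formula,
// so components 1 and 2 are never computed.
int add_after_project(const Vec& x) {
    return x[0] + x[0] + x[0] + x[0];
}

// The projection commutes with the addition: both orders agree.
void check_commutes(const Vec& x) {
    assert(project_after_add(x) == add_after_project(x));
}
```

Both functions return the same value for any input, which is exactly why
the rewrite from the first form to the second is legal.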
Now let me rewrite that code:
struct vec { a:int; b:int; c:int; };
fun + (x:vec,y:vec)=>vec (x.a + y.a, x.b + y.b, x.c + y.c);
val x = vec (1,2,3);
var y = x + x + x + x;
var z = (x+x+x+x) . a;
println$ (y.a,y.b,y.c),z;
That's semantically identical, right?
Here's the generated code:
PTF x = reinterpret<_s42810t_57875>(_at57877(1, 2, 3))/* apply struct */; //assign
PTF y =
reinterpret<_s42810t_57875>(_at57877(reinterpret<_s42810t_57875>(_at57877(reinterpret<_s42810t_57875>(_at57877(PTF
x.a + PTF x.a , PTF x.b + PTF x.b , PTF x.c + PTF x.c ))/* apply struct */.a +
PTF x.a , reinterpret<_s42810t_57875>(_at57877(PTF x.a + PTF x.a , PTF x.b +
PTF x.b , PTF x.c + PTF x.c ))/* apply struct */.b + PTF x.b ,
reinterpret<_s42810t_57875>(_at57877(PTF x.a + PTF x.a , PTF x.b + PTF x.b ,
PTF x.c + PTF x.c ))/* apply struct */.c + PTF x.c ))/* apply struct */.a + PTF
x.a ,
reinterpret<_s42810t_57875>(_at57877(reinterpret<_s42810t_57875>(_at57877(PTF
x.a + PTF x.a , PTF x.b + PTF x.b , PTF x.c + PTF x.c ))/* apply struct */.a +
PTF x.a , reinterpret<_s42810t_57875>(_at57877(PTF x.a + PTF x.a , PTF x.b +
PTF x.b , PTF x.c + PTF x.c ))/* apply struct */.b + PTF x.b ,
reinterpret<_s42810t_57875>(_at57877(PTF x.a + PTF x.a , PTF x.b + PTF x.b ,
PTF x.c + PTF x.c ))/* apply struct */.c + PTF x.c ))/* apply struct */.b + PTF
x.b ,
reinterpret<_s42810t_57875>(_at57877(reinterpret<_s42810t_57875>(_at57877(PTF
x.a + PTF x.a , PTF x.b + PTF x.b , PTF x.c + PTF x.c ))/* apply struct */.a +
PTF x.a , reinterpret<_s42810t_57875>(_at57877(PTF x.a + PTF x.a , PTF x.b +
PTF x.b , PTF x.c + PTF x.c ))/* apply struct */.b + PTF x.b ,
reinterpret<_s42810t_57875>(_at57877(PTF x.a + PTF x.a , PTF x.b + PTF x.b ,
PTF x.c + PTF x.c ))/* apply struct */.c + PTF x.c ))/* apply struct */.c + PTF
x.c ))/* apply struct */; //assign
PTF z =
reinterpret<_s42810t_57875>(_at57877(reinterpret<_s42810t_57875>(_at57877(reinterpret<_s42810t_57875>(_at57877(PTF
x.a + PTF x.a , PTF x.b + PTF x.b , PTF x.c + PTF x.c ))/* apply struct */.a +
PTF x.a , reinterpret<_s42810t_57875>(_at57877(PTF x.a + PTF x.a , PTF x.b +
PTF x.b , PTF x.c + PTF x.c ))/* apply struct */.b + PTF x.b ,
reinterpret<_s42810t_57875>(_at57877(PTF x.a + PTF x.a , PTF x.b + PTF x.b ,
PTF x.c + PTF x.c ))/* apply struct */.c + PTF x.c ))/* apply struct */.a + PTF
x.a ,
reinterpret<_s42810t_57875>(_at57877(reinterpret<_s42810t_57875>(_at57877(PTF
x.a + PTF x.a , PTF x.b + PTF x.b , PTF x.c + PTF x.c ))/* apply struct */.a +
PTF x.a , reinterpret<_s42810t_57875>(_at57877(PTF x.a + PTF x.a , PTF x.b +
PTF x.b , PTF x.c + PTF x.c ))/* apply struct */.b + PTF x.b ,
reinterpret<_s42810t_57875>(_at57877(PTF x.a + PTF x.a , PTF x.b + PTF x.b ,
PTF x.c + PTF x.c ))/* apply struct */.c + PTF x.c ))/* apply struct */.b + PTF
x.b ,
reinterpret<_s42810t_57875>(_at57877(reinterpret<_s42810t_57875>(_at57877(PTF
x.a + PTF x.a , PTF x.b + PTF x.b , PTF x.c + PTF x.c ))/* apply struct */.a +
PTF x.a , reinterpret<_s42810t_57875>(_at57877(PTF x.a + PTF x.a , PTF x.b +
PTF x.b , PTF x.c + PTF x.c ))/* apply struct */.b + PTF x.b ,
reinterpret<_s42810t_57875>(_at57877(PTF x.a + PTF x.a , PTF x.b + PTF x.b ,
PTF x.c + PTF x.c ))/* apply struct */.c + PTF x.c ))/* apply struct */.c + PTF
x.c ))/* apply struct */.a; //assign
Now, if you look at nbody, you can see that the C code CHEATS. It hand optimises
the calculation by doing each x,y,z component separately.
The Felix code does NOT cheat. It defines array addition, subtraction, and
scalar multiplication, and uses those definitions.
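For readers who haven't looked at nbody, the two styles look roughly like
this in C++. This is NOT the benchmark source, just a sketch with invented
names (Vec3, step_with_operators, step_by_component, check_styles_agree) of
a single "advance the position" step written both ways.

```cpp
#include <cassert>

// Hypothetical 3-component vector; not taken from the nbody sources.
struct Vec3 { double x, y, z; };

// The operations are defined once, on whole vectors.
Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(double k, Vec3 a) { return {k * a.x, k * a.y, k * a.z}; }

// The "no cheating" style: the formula is written on vectors.
Vec3 step_with_operators(Vec3 pos, Vec3 vel, double dt) {
    return pos + dt * vel;
}

// The hand-optimised style: each component is spelled out separately.
Vec3 step_by_component(Vec3 pos, Vec3 vel, double dt) {
    Vec3 out;
    out.x = pos.x + dt * vel.x;
    out.y = pos.y + dt * vel.y;
    out.z = pos.z + dt * vel.z;
    return out;
}

// Both styles compute the same per-component expressions.
void check_styles_agree(Vec3 pos, Vec3 vel, double dt) {
    Vec3 a = step_with_operators(pos, vel, dt);
    Vec3 b = step_by_component(pos, vel, dt);
    assert(a.x == b.x && a.y == b.y && a.z == b.z);
}
```

The point of the comparison is that the first style only runs as fast as
the second if the compiler can see through the vector operations.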
The PROBLEM in the Felix code is that a planet is a struct, and value
projections like:
planets . orbitno . velocity
actually grab the whole planet structure for the given orbitno, then grab the
velocity of that. C++ would use a const ref instead, which is just a pointer.
In this case, pass by value is not a good thing.
Generally pass by value is still better, though, because the compiler can
"delve into" the structure of a computation and optimise it, whereas
passing a pointer does not allow that.
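The planet case can be sketched in C++ as well. The layout below (Vec3,
Planet, Planets, five bodies, a mass field) is an assumption for
illustration, not the real nbody data structure. The first function mimics
the value projection described above, the second is the const ref version.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical layout; field names and the body count are assumptions.
struct Vec3 { double x, y, z; };
struct Planet { Vec3 position; Vec3 velocity; double mass; };
using Planets = std::array<Planet, 5>;

// Value projection as described: copy the whole planet, then take a field.
Vec3 velocity_by_value(const Planets& planets, std::size_t orbitno) {
    assert(orbitno < planets.size());
    Planet whole = planets[orbitno];  // copies position, velocity and mass
    return whole.velocity;            // only this part was wanted
}

// Const ref projection: refers to the field in place, nothing is copied.
const Vec3& velocity_by_ref(const Planets& planets, std::size_t orbitno) {
    assert(orbitno < planets.size());
    return planets[orbitno].velocity;
}
```

The by-value version is the one the optimiser can look inside, as in the
vec example earlier; the by-ref version avoids the copy but hands the
optimiser an address instead of a formula.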
--
john skaller
[email protected]
http://felix-lang.org