Felix currently optimised parallel assignments in just one case: tail recursion. Simple cases like:
///////////////////////////////////// #import <flx.flxh> def var a,var b, var c = 1,2,3; println (a,b,c); a,b,c=c,a,b; // parallel assignment println (a,b,c); var x = a,b,c; // parallel assignment x = c,b,a; println x; ////////////////////////////////////// are not optimised. Felix uses the horrible routine: ////////////////////////////////////////////////////// | `AST_assign (sr,fid,l,r) -> let rec aux (l,t) r = match l with | `Expr (sr,e) -> begin match e with | `AST_tuple (_,ls) -> let n = seq() in let vn = "_" ^ si n in let sts = ref [] in let count = ref 0 in iter (fun l -> let r' = `AST_get_n (sr,(!count,`AST_name (sr,vn,[]))) in let l' = `Expr (sr,l),None in let asg = aux l' r' in sts := !sts @ asg; incr count ) ls ; `AST_val_decl (sr,vn,dfltvs,t,Some r) :: !sts | _ -> if fid = "_init" then match e with | `AST_coercion (_,(`AST_name (_,n,[]),t')) -> let t = match t with | None -> Some t' | t -> t in [`AST_val_decl (sr,n,dfltvs,t,Some r)] | `AST_name (_,n,[]) -> [`AST_val_decl (sr,n,dfltvs,t,Some r)] | x -> clierr sr ("identifier required in val init, got " ^ string_of_expr x) else [assign sr fid e r] end | `Val (sr,n) -> [`AST_val_decl (sr,n,dfltvs,t,Some r)] | `Var (sr,n) -> [`AST_var_decl (sr,n,dfltvs,t,Some r)] | `Skip (sr) -> [] | `Name (sr,n) -> let n = `AST_name(sr,n,[]) in [assign sr fid n r] | `List ls -> let n = seq() in let vn = "_" ^ si n in let sts = ref [] in let count = ref 0 in iter (fun l -> let r' = `AST_get_n (sr,(!count,`AST_name (sr,vn,[]))) in let asg = aux l r' in sts := !sts @ asg; incr count ) ls ; `AST_val_decl (sr,vn,dfltvs,t,Some r) :: !sts in let sts = aux l r in rsts name parent_vs access sts ////////////////////////////////////////////////////////////// to implement assignments. The weird encoding of the LHS here is to account for l-expressions (left expressions) aka ties, like this: def var a,var b, var c = 1,2,3; a,b,c=c,a,b; Here, the LHS is NOT an expression of tuple type, but an l-expression. It is implemented by (in the second case): tmp := c,a,b; a = tmp.(0); b = tmp.(1); c = tmp.(2); in the desugaring phase by the above Ocaml mess. This also takes care of, for example: a,b,c = x; the same way: first make a temporary of tuple type, then set the LHS variables using tuple projections. OTOH, an assignment like: x = a,b,c; is done by: tmp := a,b,c; // initialisation x = tmp; // assignment The point of these temporaries is twofold: the decoupled assignments are easy to code as an temporary initialisation and assignment, where the LHS and RHS visibility as a list of entries or whole tuple can be treated separately, AND, it is necessary, in cases like: a,b,c=c,a,b; NOT to write this as a = c; b = a; c = b; because the first a=c clobbers the old value of a required for the second b = a. The POINT of the parallel assignment routine is to minimise the number of temporaries needed, which also reduces the number of assignments + initialisations. Instead of the 6 required at present, we COULD do: a' = c; c = b; b = a; a = a'; which only requires 1 temporary and 4 assignments. One reason the Felix version of nbody is slow is that the 3-tuple used as a vector is unravelled into component wise operations for the functional calculations .. but the results are gathered into a tuple for the assignments, instead of setting the individual LHS tuple components individually: that's safe, but it isn't necessary for these vector ops. The parallel assignment algorithm is designed to calculate that this is actually safe. However at present, it only works when the LHS consists of separate variables: that's the simplest case, and enough for the tail recursion optimisation. So there are two problems. 1. How to preserve the parallel assignment structure until the optimiser can look at it, without destroying it the way the desugaring algorithm does at present 2. How to generalise the parallel assignment algorithm so it can handle tuple components (and perhaps record and struct components, but they're all equivalent). Step 2 is non-trivial. At present, the algorithm is present with a list of the form: lhs1 = rhs1 ... where lhs terms are just variable indicies -- they can't be expressions, including tuple components. The rhs is analysed to get all the symbols used, which is not entirely trivial, since it can call a child function which uses the variable. [Note: the RHS is a function .. it cannot MODIFY anything. If you cheat Felix, and put side effects in functions, Felix can do unexpected things.. this is an example where the algorithm assumes you do not do this and the optimiser can change the calculated result unexpectedly] Now, we have to generalise all this so we accept a list var,proj, proj, ... with a variable index, and sequence of tuple indexes (they're always constants) which identify a sub-structure of a tuple. So changing x,mem0,mem2 // var x.(0).(2) affects x, and x.mem0, and x.mem0.mem2. So the dependency calculation is no longer a simple setwise elimination process, because you can refer to a product as a whole, or just a component. Problem 1 has a nice and horrible solution. The nice solution is simply to add a new combinator 'parallel_assign' and eliminate it in the optimiser as above, then fix it to do a better job in some cases. The nasty solution is to leave the above code as is, and do enough trivial data flow analysis to detect code like: tmp := c,a,b; a = tmp.(0); b = tmp.(1); c = tmp.(2); The latter is somewhat appealing because it involves no changes to the rest of the compiler, however the detector is not so easy to write. In particular, Felix has 3 distinct ways to set variables: BEXE_assign, BEXE_init, BEXE_iinit and i have trouble remembering which is used when .. :) The actual bound code generated for the above (with the prints removed) is: _1044<4603> := (1, 2, 3); a<4604> := (_1044<4603>).mem_0; b<4605> := (_1044<4603>).mem_1; c<4606> := (_1044<4603>).mem_2; _1045<4607> := (c<4606>, a<4604>, b<4605>); a<4604> = (_1045<4607>).mem_0; b<4605> = (_1045<4607>).mem_1; c<4606> = (_1045<4607>).mem_2; x<4608> := (a<4604>, b<4605>, c<4606>); x<4608> = (c<4606>, b<4605>, a<4604>); where = is assignment, and := is initialisation. It isn't clear why to bother with initialisation: initialisation was intended for val x = 1; where the LHS is necessarily local to the current function, and also as a 'val' it cannot be assigned (but, the compiler can break that rule of course .. just the user can't). It was also supposedly a one-off .. but tail rec optimisations and other loops plus switches mean that actually, initialisation is an assignment anyhow. -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ Felix-language mailing list Felix-language@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/felix-language