My comments are mostly advisory for optimizers in general ;)

On 2/11/12 3:11 PM, Stefan Manegold wrote:
> On Sat, Feb 11, 2012 at 02:06:17PM +0100, Martin Kersten wrote:
>>
>>
>> On 2/11/12 11:03 AM, Stefan Manegold wrote:
>>> On Wed, Feb 08, 2012 at 10:27:11AM +0100, Martin Kersten wrote:
>>>> Changeset: 67c12a700166 for MonetDB
>>>> URL: http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=67c12a700166
>>>> Modified Files:
>>>>     monetdb5/extras/mal_optimizer_template/opt_sql_append.mx
>>>> Branch: default
>>>> Log Message:
>>>>
>>>> More advice on the optimizer template.
>>>>
>>>>
>>>> diffs (140 lines):
>>>>
>>>> diff --git a/monetdb5/extras/mal_optimizer_template/opt_sql_append.mx b/monetdb5/extras/mal_optimizer_template/opt_sql_append.mx
>>>> --- a/monetdb5/extras/mal_optimizer_template/opt_sql_append.mx
>>>> +++ b/monetdb5/extras/mal_optimizer_template/opt_sql_append.mx
>>> [...]
>>>> @@ -39,6 +39,8 @@ All Rights Reserved.
>>>>   * i.e., an sql.append() statement that is eventually followed by some other
>>>>   * statement later on in the MAL program that uses the same v0 BAT as
>>>>   * argument as the sql.append() statement does,
>>>> + * Do you assume a single re-use of the variable v0?
>>>
>>> No. Why?
>> Use assign-once and use-many-times policy. It can improve parallel processing
>> and simplifies scope analysis.
>
> v0 is (as far as I know) created (assigned) once (by Niels, or preceding
> optimizers).
true, on purpose
> If it is used only once (only by sql_append), my optimizer does not (have
> to) do anything. Otherwise, it replaces one use of v0 (by sql_append) by a
> view of v0.
> That's the very purpose of this optimizer.
>
>>>> + * Do you assume a non-nested MAL block ?
>>>
>>> Not necessarily.
>>>
>> Analysis may become complex if you have something like
>>
>> V0:= expr
>> barrier E1:=expr
>>     V0:= expr2
>> exit E1
>> now V0 depends on runtime use
>>
>>
>> same holds for
>> barrier E1:= expr
>>     V0:=expr
>> exit E1
>> z:= f(V0)
>>
>> will be flagged as an error because V0 may be uninitialized
>>
>>> I must admit that I do not know how the optimizer framework handles nested
>>> MAL blocks, and what an optimizer needs to do to be aware of nested MAL
>>> blocks and to handle them correctly.
>> Preferably the MAL blocks are linear programs (until you reach the
>> dataflow optimizer).
>
> How do I know / see that in my optimizer?
While looping through the plan you check if p->barrier is set.
You can always safely exit an optimizer.
> Do I have to check for barrier / exit statements / constructs myself?
in principle, yes.
Optimizers in the pipeline preceding yours could introduce them.
>
>>>
>>> In the sample optimizer, for now, I'd be fine if there are no
>>> false-positives, i.e., the optimizer triggers in case it should not trigger
>>> or in cases it cannot handle correctly.
>>> I can accept false-negatives, i.e., not triggering in all cases it could
>>> handle correctly.
>>>
>>>>   *
>>>>   * and transform them into
>>>>   *
>>>> @@ -52,6 +54,7 @@ All Rights Reserved.
>>>>   *
>>>>   * i.e., handing a BAT view v2 of BAT v0 as argument to the sql.append()
>>>>   * statement, rather than the original BAT v0.
>>>> + * My advice, always use new variable names, it may capture some easy to make errors.
>>>
>>> I/my optimizer does use new variables for all new statements/results.
>>> I/my optimizer re-use variable names only for identical results.
>>>
>>>>   *
>>>>   * As a refinement, patterns like
>>>>   *
>>> [...]
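
A minimal sketch of the barrier check discussed above: scan the plan once and leave it untouched as soon as a barrier is seen. MockInstr and both function names are hypothetical stand-ins for illustration, not the real MonetDB structures or API; only the idea of testing a per-instruction barrier flag comes from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a MAL instruction record (NOT MonetDB's type). */
typedef struct {
    int barrier;    /* non-zero if the instruction opens or closes a guarded block */
    int is_append;  /* non-zero if this is an sql.append-like statement */
} MockInstr;

/* Returns 1 if no instruction in the plan carries a barrier flag, else 0. */
static int plan_is_linear(const MockInstr *plan, size_t n)
{
    assert(plan != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (plan[i].barrier)
            return 0;
    return 1;
}

/* Hypothetical optimizer skeleton: returns the number of statements it
 * would rewrite, or 0 (plan untouched) when the plan is not linear. */
static int mock_optimize(const MockInstr *plan, size_t n)
{
    int actions = 0;

    if (!plan_is_linear(plan, n))
        return 0;               /* safe exit: accept a false negative */
    for (size_t i = 0; i < n; i++)
        if (plan[i].is_append)
            actions++;          /* the actual rewrite would happen here */
    return actions;
}
```

Bailing out with zero actions matches the preference stated above: a missed opportunity is acceptable, a wrong rewrite inside a nested block is not.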
>>>> @@ -181,13 +195,17 @@ OPTsql_appendImplementation(Client cntxt
>>>>                  pushInstruction(mb, q);
>>>>                  q1 = q;
>>>>                  i++;
>>>> -                actions++;
>>>> +                actions++; /* to keep track if anything has been done */
>>>>              }
>>>>          }
>>>>
>>>> -        /* look for
>>>> +        /* look for
>>>>           * v5 := ... v0 ...;
>>>>           */
>>>> +        /* an expensive loop, better would be to remember that v0 has a different role.
>>>> +         * A typical method is to keep a map from variable -> instruction where it was
>>>> +         * detected. The you can check each assignment for use of v0
>>>> +         */
>>>
>>> This is general support functionality.
>>> Is this already available in the optimizer framework?
>> I try to use single pass algorithms in the optimizers.
>> Even in the case of commonterms optimizer, we may have to
>> traverse the history. This can become an n^2 process
>>
>>> If so, where is it and how can I use it?
>> Mimic how it is done in other optimizers (e.g. opt_reorder).
>> Typically, a buffer is maintained per variable to keep
>> optimization properties around.
>>
>>> If not, where/how could we add it?
>>>
>>>>          for (j = i+1; !found && j < limit; j++)
>>>>              for (k = old[j]->retc; !found && k < old[j]->argc; k++)
>>>>                  found = (getArg(old[j], k) == getArg(p, 5));
>>>> @@ -202,6 +220,8 @@ OPTsql_appendImplementation(Client cntxt
>>>>
>>>>          /* push new v1 := aggr.count( v0 ); unless already available */
>>>>          if (q1 == NULL) {
>>>> +            /* use mal_buil.mx primitives q1 = newStmt(mb, aggrRef,countRef); setArgType(mb,q1,TYPE_wrd) */
>>>> +            /* it will be added to the block and even my re-use MAL instructions */
>>>
>>> Is this (supposed to be) documentation of the existing code below,
>>> or rather advice how to implement the below functionality differently?
>> Use the mal_builder to simplify your code base.
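
One possible reading of the "map from variable -> instruction" advice above, as a sketch only: fill a per-variable table in a single pass, so that "is v0 used later?" becomes a lookup instead of a nested scan. This variant records the last use of each variable, which is an assumption about what the map would hold. MockInstr, record_last_use and used_after are hypothetical names, not MonetDB's; the retc/argc layout mirrors the loop quoted in the diff.

```c
#include <assert.h>

#define MAX_ARGS 8
#define NO_USE (-1)

/* Hypothetical stand-in for a MAL instruction record (NOT MonetDB's type).
 * As in the quoted loop, argv[0..retc-1] are results and
 * argv[retc..argc-1] are input arguments, all stored as variable ids. */
typedef struct {
    int retc;
    int argc;
    int argv[MAX_ARGS];
} MockInstr;

/* Single pass over the plan: last_use[v] becomes the index of the last
 * instruction that reads variable v, or NO_USE if v is never read. */
static void record_last_use(const MockInstr *plan, int n,
                            int *last_use, int nvars)
{
    for (int v = 0; v < nvars; v++)
        last_use[v] = NO_USE;
    for (int i = 0; i < n; i++)
        for (int k = plan[i].retc; k < plan[i].argc; k++) {
            int v = plan[i].argv[k];
            assert(v >= 0 && v < nvars);
            last_use[v] = i;    /* later reads overwrite earlier ones */
        }
}

/* Constant-time replacement for the inner search loop:
 * is variable var read by any instruction after position pc? */
static int used_after(const int *last_use, int var, int pc)
{
    return last_use[var] > pc;
}
```

With such a table built once up front, the check per sql.append() statement costs one comparison, so the whole optimizer stays linear in the plan size.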
>>
>>>
>>>>              q1 = newInstruction(mb,ASSIGNsymbol);
>>>>              getArg(q1,0) = newTmpVariable(mb, TYPE_wrd);
>>>>              setModuleId(q1, aggrRef);
>>>> @@ -211,6 +231,7 @@ OPTsql_appendImplementation(Client cntxt
>>>>          }
>>>>
>>>>          /* push new v2 := algebra.slice( v0, 0, v1 ); */
>>>> +        /* use mal_buil.mx primitives q1 = newStmt(mb, algebraRef,sliceRef); */
>>>
>>> Is this (supposed to be) documentation of the existing code below,
>>> or rather advice how to implement the below functionality differently?
>>>
>>>>          q2 = newInstruction(mb,ASSIGNsymbol);
>>>>          getArg(q2,0) = newTmpVariable(mb, TYPE_any);
>>>>          setModuleId(q2, algebraRef);
>>>> @@ -240,6 +261,7 @@ OPTsql_appendImplementation(Client cntxt
>>>>      for(i++; i<limit; i++)
>>>>          if (old[i])
>>>>              pushInstruction(mb, old[i]);
>>>> +    /* any remaining MAL instruction records are removed */
>>>>      for(; i<slimit; i++)
>>>>          if (old[i])
>>>>              freeInstruction(old[i]);
>>>> @@ -253,6 +275,9 @@ OPTsql_appendImplementation(Client cntxt
>>>>      return actions;
>>>>  }
>>>>
>>>> +/* optimizers have to be registered in the optcatalog in opt_support.c.
>>>
>>> Why?
>> SQL needs a place to pick up all optimizers known. You may also have
>> to extend the optimizer pipeline validity code.
>>
>>> If at all possible, I'd prefer to be able to add a new optimizer without the
>>> need to change existing code ...
>> yes understood, but you have to patch Makefile.ag, youroptimizer.mx, and
>> opt_support. Possibly, you may have to extend opt_prelude as well
>>
>>>
>>>> + * you have to path the file accordingly.
>> "path"
>>>                  ^^^^
>>>                  parse?
>>>
>>> What does this mean? What am I supposed to do in detail?
>>>
>>>> + */
>>>>  @include ../../optimizer/optimizerWrapper.mx
>>>>  @c
>>>>  #include "opt_statistics.h"
>>>> _______________________________________________
>>>> Checkin-list mailing list
>>>> checkin-l...@monetdb.org
>>>> http://mail.monetdb.org/mailman/listinfo/checkin-list
>>>>
>>>
>>> Thanks!
>>>
>>> Stefan
>>>
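
Returning to the assign-once, use-many-times policy mentioned earlier in this thread: a small sketch of how an optimizer could verify that no variable is assigned twice before relying on that property. MockInstr and first_reassigned are hypothetical illustrations, not part of the MonetDB code base.

```c
#include <assert.h>

#define MAX_ARGS 8

/* Hypothetical stand-in for a MAL instruction record (NOT MonetDB's type);
 * argv[0..retc-1] hold the ids of the variables the instruction assigns. */
typedef struct {
    int retc;
    int argc;
    int argv[MAX_ARGS];
} MockInstr;

/* Returns the id of the first variable that receives a second assignment,
 * or -1 if every variable in the plan is assigned at most once.
 * The caller supplies 'assigned' as scratch space of nvars entries. */
static int first_reassigned(const MockInstr *plan, int n,
                            int *assigned, int nvars)
{
    for (int v = 0; v < nvars; v++)
        assigned[v] = 0;
    for (int i = 0; i < n; i++)
        for (int k = 0; k < plan[i].retc; k++) {
            int v = plan[i].argv[k];
            assert(v >= 0 && v < nvars);
            if (assigned[v]++)
                return v;       /* seen as a result before */
        }
    return -1;
}
```

A plan like the barrier example above, where V0 is assigned both before and inside the guarded block, would be reported by this check.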
_______________________________________________
Monetdb-developers mailing list
Monetdb-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-developers