Changeset: e0ceb1b0ee5a for MonetDB
URL: http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=e0ceb1b0ee5a
Modified Files:
Branch: default
Log Message:
merge

diffs (truncated from 975 to 300 lines):

diff --git a/MonetDB.spec b/MonetDB.spec
--- a/MonetDB.spec
+++ b/MonetDB.spec
@@ -27,7 +27,7 @@
 Group: Applications/Databases
 License: MPL - http://monetdb.cwi.nl/Legal/MonetDBLicense-1.1.html
 URL: http://monetdb.cwi.nl/
-Source: http://dev.monetdb.org/downloads/sources/Mar2011/%{name}-%{version}.tar.gz
+Source: http://dev.monetdb.org/downloads/sources/Mar2011/%{name}-%{version}.tar.bz2
 
 BuildRequires: bison
 BuildRequires: bzip2-devel
diff --git a/configure.ag b/configure.ag
--- a/configure.ag
+++ b/configure.ag
@@ -27,7 +27,7 @@
 AC_CANONICAL_HOST
 AC_CANONICAL_TARGET
 dnl use tar-ustar since we have long (longer than 99 characters) file names
-AM_INIT_AUTOMAKE([tar-ustar])
+AM_INIT_AUTOMAKE([tar-ustar no-dist-gzip dist-bzip2])
 AC_CONFIG_SRCDIR([gdk/gdk.mx])
 AM_CONFIG_HEADER([monetdb_config.h])
 AC_SUBST([CONFIG_H], [monetdb_config.h])
diff --git a/gdk/gdk_posix.mx b/gdk/gdk_posix.mx
--- a/gdk/gdk_posix.mx
+++ b/gdk/gdk_posix.mx
@@ -318,7 +318,7 @@
 #ifdef WIN32
 int GDK_mem_pagebits = 16; /* on windows, the mmap addresses can be set by the 64KB */
 #else
-int GDK_mem_pagebits = 14; /* on linux, 4KB pages can be addressed */
+int GDK_mem_pagebits = 14; /* on linux, 4KB pages can be addressed (but we use 16KB) */
 #endif
 
 #ifndef MAP_NORESERVE
diff --git a/monetdb5/mal/mal_instruction.mx b/monetdb5/mal/mal_instruction.mx
--- a/monetdb5/mal/mal_instruction.mx
+++ b/monetdb5/mal/mal_instruction.mx
@@ -20,274 +20,7 @@
 @a M. Kersten
 @v 1.0
 @* MonetDB Assembly Language (MAL)
-The primary textual interface to the Monetdb kernel
-is a simple, assembly-like language, called MAL.
-The language reflects the virtual machine architecture around the
-kernel libraries and has been designed for speed of parsing,
-ease of analysis, and ease of target compilation by query compilers.
-The language is not meant as a primary programming language,
-or scripting language. Such use is even discouraged.
-Furthermore, a MAL program is considered a specification
-of intended computation and data flow behavior. It should be
-understood that its actual evaluation depends on the execution
-paradigm choosen in a scenario. The program blocks can both
-be interpreted as ordered sequences of assembler instructions,
-or as a representation of a data-flow graph that should be resolved
-in a dataflow driven manner.
-The language syntax uses a functional style definition of actions and
-mark those that affect the flow explicitly.
-Flow of control keywords identify a point to chance
-the interpretation paradigm and denote a synchronization point.
-
-MAL is the target language for query compilers, such as the
-SQL and XQuery front-ends.
-Even simple SQL queries generate a long sequence of MAL instructions.
-They represent both the administrative actions to ensure binding and transaction
-control, the flow dependencies to produce the query result,
-and the steps needed to prepare the result set for delivery to
-the front-end.
-
-Only when the algebraic structure is too limited (e.g. updates),
-or the database back-end lacks feasible builtin bulk operators,
-one has to rely on more detailed flow of control primitives.
-But even in that case, the basic blocks to be processed
-by a MAL back-end are considered large, e.g. tens of simple
-bulk assignment instructions.
-
-The remainder of this chapter provide a concise overview of the
-language features and illustrative examples.
-@menu
-* MAL Literals::
-* MAL Variables::
-* MAL Instructions::
-* MAL Flow-of-control::
-* MAL Functions::
-* MAL Factories::
-* MAL Type System::
-* Boxed variables::
-* Property Management::
-@end menu
-
-
-@node MAL Literals, MAL Variables, MAL Reference,MAL Reference
-@+ MAL Literals
-Literals in MAL follow the lexical conventions of the programming
-language C.
-A default type is attached, e.g. the literal 1 is typed
-as an @sc{int} value.
-Likewise, the literal 3.14 is typed @sc{flt} rather than @sc{dbl}.
-
-A literal can be coerced to another type by tagging it with a type
-classifier, provided a coercion operation is defined.
-For example, @sc{1:lng} marks the literal as of type @sc{lng}.
-and @sc{"1999-12-10":date} creates a @sc{date} literal.
-
-MonetDB comes with the hardwired types @sc{bit, bte, chr, wrd, sht, int, lng, oid, flt,
-dbl, str} and @sc{bat}, the bat identifier.
-The kernel code has been optimized to deal with these types efficiently,
-i.e. without unnecessary function call overheads.
-In addition, the system supports
-temporal types @sc{date, daytime, time, timestamp, timezone},
-extensions to deal with IPv4 addresses and URLs using @sc{inet, url},
-and several types to interact more closely with the
-kernel @sc{lock, semphore}.
-This list can be extended with user defined types.
-
-@node MAL Variables, MAL Instructions, MAL Literals, MAL Reference
-@+ MAL Variables
-Variables are denoted by identifers and
-implicitly defined upon first use. They take
-on a type through a type classifier or inherit it from
-the context in which they are first used, see @ref{MAL Type System}.
-
-Variables are organized into two classes, starting with and without
-an underscore. The latter are reserved as MAL parser tempoaries,
-whose name aligns with an entry in the symbol table.
-In general they can not be used in MAL programs, but they may become
-visible in MAL program listings or during debugging.
-
-@node MAL Instructions, MAL Flow-of-control, MAL Variables, MAL Reference
-@+ Instructions
-A MAL instruction has purposely a simple format.
-It is syntactically represented by an assignment, where
-an expression (function call) delivers results to multiple target variables.
-The assignment patterns recognized are illustrated below.
-@example
-(t1,..,t32) := module.fcn(a1,..,a32);
-t1 := module.fcn(a1,..,a32);
-t1 := v1 operator v2;
-t1 := literal;
-(t1,..,tn) := (a1,..,an);
-@end example
-
-Operators are grouped into user defined modules.
-Ommission of the module name is interpreter as the @sc{user} module.
-
-Simple binary arithmetic operations are merely provided as a short-hand,
-e.g. the expression @sc{t:=2+2} is converted directly
-into @sc{t:= calc.+(2,2)}.
-
-Target variables are optional. The compiler introduces temporary
-variables to hold the result of the expression upon need.
-They won't show up when you list the MAL program unless it
-is used elsewhere.
-
-For parsing simplicity, each instruction fits on a single line.
-Comments start with a sharp '#' and continues to the end of the line.
-They are retained in the internal code representation to ease
-debugging of compiler generated MAL programs.
-
-The data structure to represent a MAL block is kept simple.
-It contains a sequence of MAL statements and a symbol table.
-The MAL instruction record is a code byte string overlaid with the
-instruction pattern, which contains references into the symbol tables
-and administrative data for the interpreter.
-
-This method leads to a large allocated block, which can be easily freed.
-Variable- and statement- block together describe the
-static part of a MAL procedure. It carries enough
-information to produce a listing and to aid symbolic debugging.
-
-@node MAL Flow-of-control, MAL Functions, MAL Instructions, MAL Reference
-@+ MAL Flow-of-control
-The flow of control within a MAL program block can be changed by
-tagging a statement with either @sc{return}, @sc{yield},
-@sc{barrier}, @sc{catch}, @sc{leave}, @sc{redo}, or @sc{exit}.
-
-The flow modifiers @sc{return} and @sc{yield} mark the end
-of a call and return one or more results to the calling environment.
-The @sc{return} and @sc{yield} are followed by a target list
-or an assignment, which is executed first.
-
-The @sc{barrier} (@sc{catch}) and @sc{exit} pair mark a
-guarded statement block. They may be nested to form a proper
-hierarchy identified by their primary target variable,
-also called the control variable.
-
-The @sc{leave} and @sc{redo} are conditional flow modifiers.
-The control variable is used after the assignment statement has
-been evaluated to decide on the flow-of-control action to be taken.
-Built-in controls exists for booleans and numeric values.
-The barrier block is opened when the control variable holds
-true, when its numeric value >= 0, or when it is a non-empty
-string. The @sc{nil} value blocks entry in all cases.
-
-Once inside the barrier you have an option to prematurely
-@sc{leave} it at the exit statement
-or to @sc{redo} interpretation just after the corresponding
-barrier statement. Much like 'break' and 'continue' statements
-in the programming language C.
-The action is taken when the condition is met.
-
-The @sc{exit} marks the exit for a block. Its optional assignment
-can be used to re-initialize the barrier control variables
-or wrap-up any related administration.
-
-The barrier blocks can be properly nested to form
-a hierarchy of basic blocks.
-The control flow within and between blocks is
-simple enough to deal with during an optimizer stage.
-The @sc{redo} and @sc{leave} statements mark
-the partial end of a block. Statements within these
-blocks can be re-arranged according to the data-flow
-dependencies. The order of partial blocks can not
-be changed that easily. It depends on the mutual
-exclusion of the data flows within each partial block.
-
-Common guarded blocks in imperative languages are
-the for-loop and if-then-else constructs.
-They can be simulated as follows.
-
-Consider the statement @sc{for(i=1;i<10;i++) print(i)}.
-The (optimized) MAL block to implement this becomes:
-@example
- i:= 1;
-barrier B:= i<10;
- io.print(i);
- i:= i+1;
-redo B:= i<10;
-exit B;
-@end example
-
-Translation of the statement @sc{if(i<1) print("ok"); else print("wrong");}
-becomes:
-@example
- i:=1;
-barrier ifpart:= i<1;
- io.print("ok");
-exit ifpart;
-barrier elsepart:= i>=1;
- io.print("wrong");
-exit elsepart;
-@end example
-
-Note that both guarded blocks can be interchanged without
-affecting the outcome. Moreover, neither block would
-have been entered if the variable happens to be assigned @sc{nil}.
-
-The primitives are sufficient to model a wide variety of iterators,
-whose pattern look like:
-@example
-barrier i:= M.newIterator(T);
- elm:= M.getElement(T,i);
- ...
- leave i:= M.noMoreElements(T);
- ...
- redo i:= M.hasMoreElements(T);
-exit i:= M.exitIterator(T);
-@end example
-The semantics obeyed by the iterator implementations is as follows.
-The redo expression updates the target variable @emph{ i} and control
-proceeds at the first statement after the barrier when the
-barrier is opened by @emph{ i}. If the barrier could not be
-re-opened, execution proceeds with the first statement after
-the redo.
-Likewise, the leave control statement skips to the exit
-when the control variable @emph{ i} shows a closed barrier block.
-Otherwise, it continues with the next instruction.
-Note, in both failed cases the control variable is
-possibly changed.
-
-A recurring situation is to iterate over the elements in
-a BAT. This is supported by an iterator implementation for BATs
-as follows:
-@example
-barrier (idx,hd,tl):= bat.newIterator(B);
- ...
- redo (idx,hd,tl):= bat.hasMoreElements(B);
-exit (ids,hd,tl);
-@end example
-Where idx is an integer to denote the row in the BAT,
-hd and tl denote values of the current element.
-@{
-@+ The MAL instruction records
-The data structure to represent a MAL block is kept simple.
-It carries a sequence of MAL statements and a variable table.
-Each instruction contains references to elements in the
-symbol table.
-
-The MAL instruction is a code byte string overlaid with the
-instruction pattern. This method leads to a large
-allocated block, which can be easily freed, and
-pattern makes it possible to accommodate a variable argument list.
-
-Variable- and stmt- block together describe the
-static part of a MAL procedure. It carries carry enough
-information to produce
-a listing and to aid symbolic debugging.
-
_______________________________________________
Checkin-list mailing list
Checkin-list@monetdb.org
http://mail.monetdb.org/mailman/listinfo/checkin-list