Changeset: 02fef945a1e5 for MonetDB
URL: http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=02fef945a1e5
Modified Files:
        monetdb5/modules/mal/tablet.c
        monetdb5/modules/mal/tablet.h
        monetdb5/modules/mal/tablet.mal
        monetdb5/modules/mal/tablet_sql.c
Branch: default
Log Message:

Cleanout of tablet code
The multipaging system and formatting for tables was not used.
It is also more a task for any frontend.
Moving the code to the attic.


diffs (truncated from 2932 to 300 lines):

diff --git a/monetdb5/modules/mal/tablet.c b/monetdb5/modules/mal/tablet.c
--- a/monetdb5/modules/mal/tablet.c
+++ b/monetdb5/modules/mal/tablet.c
@@ -18,223 +18,34 @@
  */
 
 /*
- * @a Niels Nes, Martin Kersten
- * @d 29/07/2003
- * @+ The table interface
+ *  Niels Nes, Martin Kersten
  *
- * A database cannot live without ASCII tabular print/dump/load operations.
- * It is needed to produce reasonable listings, to exchange answers
- * with a client, and to keep a database version for backup.
- * This is precisely where the tablet module comes in handy.
- * [This module should replace all other table dump/load functions]
+ * Parallel bulk load for SQL
+ * The COPY INTO command for SQL is heavily CPU bound, which means
+ * that ideally we would like to exploit the multi-cores to do that
+ * work in parallel.
+ * Complicating factors are the initial record offset, the
+ * possible variable length of the input, and the original sort order
+ * that should preferable be maintained.
  *
- * We start with a simple example to illustrate the plain ASCII
- * representation and the features provided. Consider the
- * relational table answer(name:str, age:int, sex:chr, address:str, dob:date)
- * obtained by calling the routine tablet.page(B1,...,Bn) where the Bi represent
- * BATS.
- * @verbatim
- * [ "John Doe",               25,     'M',    "Parklane 5",   "25-12-1978" ]
- * [ "Maril Streep",   23,     'F',    "Church 5",     "12-07-1980" ]
- * [ "Mr. Smith",              53,     'M',    "Church 1",     "03-01-1950" ]
- * @end verbatim
+ * The code below consists of a file reader, which breaks up the
+ * file into chunks of distinct lines. Then multiple parallel threads
+ * grab them, and break them on the field boundaries.
+ * After all fields are identified this way, the columns are converted
+ * and stored in the BATs.
  *
- * The lines contain the representation of a list in Monet tuple format.
- * This format has been chosen to ease parsing by any front-end. The scalar values
- * are represented according to their type. For visual display, the columns
- * are aligned by placing enough tabs between columns based on sampling the
- * underlying bat to determine a maximal column width.
- * (Note,actual commas are superfluous).
+ * The threads get a reference to a private copy of the READERtask.
+ * It includes a list of columns they should handle. This is a basis
+ * to distributed cheap and expensive columns over threads.
  *
- * The arguments to the command can be any sequence of BATs, but which are
- * assumed to be aligned. That is, they all should have the same number of
- * tuples and the j-th tuple tail of Bi is printed along-side the j-th tuple
- * tail of Bi+1.
+ * The file reader overlaps IO with updates of the BAT.
+ * Also the buffer size of the block stream might be a little small for
+ * this task (1MB). It has been increased to 8MB, which indeed improved.
  *
- * Printing both columns of a single bat is handled by tablet as a
- * print of two columns. This slight inconvenience is catch-ed by
- * the io.print(b) command, which resolves most back-ward compatibility issues.
- *
- * In many cases, this output would suffice for communication with a front-end.
- * However, for visual inspection the user should be provided also some meta
- * information derived from the database schema. Likewise, when reading a
- * table this information is needed to prepare a first approximation of
- * the schema namings. This information is produced by the command
- * tablet.header(B1,...,Bn), which lists the column role name.
- * If no role name is give, a default is generated based on the
- * BAT name, e.g. B1_tail.
- *
- * @verbatim
- * #------------------------------------------------------#
- * # name,           age, sex, address,       dob         #
- * #------------------------------------------------------#
- * [ "John Doe",      25, 'M', "Parklane 5", "25-12-1978" ]
- * [ "Maril Streep",  23, 'F', "Church 5",   "12-07-1980" ]
- * [ "Mr. Smith",     53, 'M', "Church 1",   "03-01-1950" ]
- * @end verbatim
- *
- * The command tablet.display(B1,...,Bn) is a contraction of tablet.header();
- * tablet.page().
- *
- * In many cases, the @code{tablet} produced may be too long to consume completely
- * by the front end. In that case, the user needs page size control, much
- * like the more/less utilities under Linux. However, no guarantee
- * is given for arbitrarily going back and forth.
- * [but works as long as we materialize results first ].
- * A portion of the tablet can be printed by identifying the rows of interest as
- * the first parameter(s) in the page command, e.g.
- *
- *
- * @verbatim
- * tablet.page(nil,10,B1,...,Bn);      #prints first 10 rows
- * tablet.page(10,20,B1,...,Bn);       #prints next 10 rows
- * tablet.page(100,nil,B1,...,Bn);     #starts printing at tuple 100 until end
- * @end verbatim
- *
- * A paging system also provides the commands tablet.firstPage(),
- * tablet.nextPage(), tablet.prevPage(), and tablet.lastPage() using
- * a user controlled tablet size tablet.setPagesize(L).
- *
- * The tablet display operations use a client (thread) specific formatting
- * structure. This structure is initialized using either
- * tablet.setFormat(B1,...,Bn) or tablet.setFormat(S1,...,Sn) (Bi is a BAT, Si a scalar).
- * Subsequently, some additional properties can be set/modified,
- * column width and brackets.
- * After printing/paging the BAT resources should be freed using
- * the command tablet.finish().
- *
- * Any access outside the page-range leads to removal of the report structure.
- * Subsequent access will generate an error.
- * To illustrate, the following code fragment would be generated by
- * the SQL compiler
- *
- * @verbatim
- *     tablet.setFormat(B1,B2);
- *     tablet.setDelimiters("|","\t","|\n");
- *     tablet.setName(0, "Name");
- *     tablet.setNull(0, "?");
- *     tablet.setWidth(0, 15);
- *     tablet.setBracket(0, " ", ",");
- *     tablet.setName(1, "Age");
- *     tablet.setNull(1, "-");
- *     tablet.setDecimal(1, 9,2);
- *     tablet.SQLtitle("Query: select * from tables");
- *     tablet.page();
- *     tablet.SQLfooter(count(B1),cpuTicks);
- * @end verbatim
- *
- * This table is printed with tab separator(s) between elements
- * and the bar (|) to mark begin and end of the string.
- * The column parameters give a new title,
- * a null replacement value, and the preferred column width.
- * Each column value is optionally surrounded by brackets.
- * Note, scale and precision can be applied to integer values only.
- * A negative scale leads to a right adjusted value.
- *
- * The title and footer operations are SQL specific routines to
- * decorate the output.
- *
- * Another example involves printing a two column table in XML format.
- * [Alternative, tablet.XMLformat(B1,B2) is a shorthand for the following:]
- *
- * @verbatim
- *     tablet.setFormat(B1,B2);
- *     tablet.setTableBracket("<rowset>","</rowset>");
- *     tablet.setRowBracket("<row>","</row>");
- *     tablet.setBracket(0, "<name>", "</name>");
- *     tablet.setBracket(1, "<age>", "</age>");
- *     tablet.page();
- * @end verbatim
- * @- Tablet properties
- * More detailed header information can be obtained with the command
- * tablet.setProperties(S), where S
- * is a comma separated list of properties of interest,
- * followed by the tablet.header().
- * The properties to choose from are: bat, name, type, width,
- * sorted, dense, key, base, min, max, card,....
- *
- * @verbatim
- * #--------------------------------------#
- * # B1,   B2,     B3,     B4,     B5     # BAT
- * # str,  int,    chr,    str,    date   # type
- * # true, false,  false,  false,  false  # sorted
- * # true, true,   false,  false,  false  # key
- * # ,     23,     'F',    ,              # min
- * # ,     53,     'M',        ,              # max
- * # 4,     4,     4,      4,      4      # count
- * # 4,i    3,     2,      2,      3      # card
- * # name,     age,    sex,   address, dob    # name
- * #--------------------------------------#
- * @end verbatim
- *
- * @- Scalar tablets
- * In line with the 10-year experience of Monet, printing scalar values
- * follow the tuple layout structure. This means that the header() command
- * is also applicable.
- * For example, the sequence "i:=0.2;v:=sin(i); tablet.display(i,v);"
- * produces the answer:
- * @verbatim
- * #----------------#
- * # i,        v        #
- * #----------------#
- * [ 0.2,      0.198669 ]
- * #----------------#
- * @end verbatim
- *
- * All other formatted printing should be done with the printf() operations
- * contained in the module @sc{io}.
- *
- * @- Tablet dump/restore
- *
- * Dump and restore operations are abstractions over sequence of tablet commands.
- * The command tablet.dump(stream,B1,...,Bn) is a contraction of the sequence
- * tablet.setStream(stream);
- * tablet.setProperties("name,type,dense,sorted,key,min,max");
- * tablet.header(B1,..,Bn); tablet.page(B1,..,Bn).
- * The result can be read by tablet.load(stream,B1,..,Bn) command.
- * If loading is successful, e.g. no parsing
- * errors occurred, the tuples are appended to the corresponding BATs.
- *
- * @- Front-end extension
- * A general bulk loading of foreign tables, e.g. CSV-files and fixed position
- * records, is not provided. Instead, we extend the list upon need.
- * Currently, the routines tablet.SQLload(stream,delim1,delim2, B1,..,Bn)
- * reads the files using the Oracle(?) storage. The counterpart for
- * dumping is tablet.SQLdump(stream,delim1,delim2);
- *
- * @- The commands
- *
- * The load operation is for bulk loading a table, each column will be loaded
- * into its own bat. The arguments are void-aligned bats describing the
- * input, ie the name of the column, the tuple separator and the type.
- * The nr argument can be -1 (The input (datafile) is read until the end)
- * or a maximum.
- *
- * The dump operation is for dumping a set of bats, which are aligned.
- * Again with void-aligned arguments, with name (currently not used),
- * tuple separator (the last is the record separator) and bat to be dumped.
- * With the nr argument the dump can be limited (-1 for unlimited).
- *
- * The output operation is for ordered output. A bat (possibly form the collection)
- * gives the order. For each element in the order bat the values in the bats are
- * searched, if all are found they are output in the datafile, with the given
- * separators.
- *
- * The scripts from the tablet.mil file are all there too for backward
- * compatibility with the old Mload format files.
- *
- * The load_format loads the format file, since the old format file was
- * in a table format it can be loaded with the load command.
- *
- * The result from load_format can be used with load_data to load the data
- * into a set of new bats.
- *
- * These bats can be made persistent with the make_persistent script or
- * merge with existing bats with the merge_data script.
- *
- * The dump_format scripts dump a format file for a given set of
- * to be dumped bats. These bats can be dumped with dump_data.
+ * The work divider allocates subtasks to threads based on the
+ * observed time spending so far.
  */
+
 #include "monetdb_config.h"
 #include "tablet.h"
 #include "algebra.h"
@@ -248,52 +59,8 @@
 #define getcwd _getcwd
 #endif
 
-tablet_export str TABsetFormat(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABheader(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABdisplayTable(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABdisplayRow(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABpage(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetProperties(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABdump(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABfinishReport(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetStream(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetPivot(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetComplaints(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetDelimiter(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetColumn(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetColumnName(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetTableBracket(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetRowBracket(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetColumnBracket(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetColumnNull(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetColumnWidth(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetColumnPosition(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
+tablet_export str CMDtablet_input(int *ret, int *nameid, int *sepid, int *typeid, stream *s, int *nr);
 
-tablet_export str TABsetColumnDecimal(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetColumnDecimal(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABsetTryAll(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci);
-tablet_export str TABfirstPage(int *ret);
-tablet_export str TABlastPage(int *ret);
-tablet_export str TABnextPage(int *ret);
-tablet_export str TABprevPage(int *ret);
-tablet_export str TABgetPage(int *ret, int *pnr);
-tablet_export str TABgetPageCnt(int *ret);
-tablet_export str CMDtablet_load(int *ret, int *nameid, int *sepid, int *typeid, str *filename, int *nr);
-tablet_export str CMDtablet_dump(int *ret, int *nameid, int *sepid, int *bids, str *filename, int *nr);
-tablet_export str CMDtablet_input(int *ret, int *nameid, int *sepid, int *typeid, stream *s, int *nr);
-tablet_export str CMDtablet_output(int *ret, int *nameid, int *sepid, int *bids, void **s);
-tablet_export void TABshowHeader(Tablet *t);
-tablet_export void TABshowRow(Tablet *t);
-tablet_export void TABshowRange(Tablet *t, lng first, lng last);
-
-static void makeTableSpace(int rnr, unsigned int acnt);
-static str bindVariable(Tablet *t, unsigned int anr, str nme, int tpe, ptr val, int *k);
-static void clearTable(Tablet *t);
-static int isScalarVector(Tablet *t);
-static int isBATVector(Tablet *t);
-
-static void TABshowPage(Tablet *t);
-static int setTabwidth(Column *c);
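
Note on the new header comment: the two-stage scheme it describes (a reader that breaks the input into chunks of whole lines, then parallel threads that break each line on the field boundaries) can be illustrated with the minimal sketch below. This is not the code from tablet.c. ChunkTask, split_lines, split_fields and parallel_split are hypothetical names, the input is an in-memory buffer rather than a stream, and the fixed limits are assumptions chosen to keep the example short. Lines are handed out in contiguous ranges so the original order can be recovered by visiting the tasks in sequence.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS  16   /* assumed limits, for illustration only */
#define MAX_LINES   64
#define MAX_THREADS 8

/* Hypothetical per-thread task; NOT the READERtask from tablet.c. */
typedef struct {
    char  *lines[MAX_LINES];              /* lines assigned to this thread */
    size_t nlines;
    char   fsep;                          /* field separator */
    char  *fields[MAX_LINES][MAX_FIELDS]; /* field starts per line */
    size_t nfields[MAX_LINES];
} ChunkTask;

/* Stage 1: cut buf in place at each record separator. */
static size_t split_lines(char *buf, char rsep, char **lines, size_t max)
{
    size_t n = 0;
    char *p = buf;
    while (*p != '\0' && n < max) {
        lines[n++] = p;
        char *e = strchr(p, rsep);
        if (e == NULL)
            break;
        *e = '\0';
        p = e + 1;
    }
    return n;
}

/* Stage 2 (thread body): cut every assigned line at the field separator. */
static void *split_fields(void *arg)
{
    ChunkTask *t = arg;
    for (size_t i = 0; i < t->nlines; i++) {
        char *p = t->lines[i];
        size_t f = 0;
        while (f < MAX_FIELDS) {
            t->fields[i][f++] = p;
            char *e = strchr(p, t->fsep);
            if (e == NULL)
                break;
            *e = '\0';
            p = e + 1;
        }
        t->nfields[i] = f;
    }
    return NULL;
}

/* Hand contiguous line ranges to nthreads workers; returns the line count. */
static size_t parallel_split(char *buf, char rsep, char fsep,
                             ChunkTask *tasks, size_t nthreads)
{
    char *lines[MAX_LINES];
    pthread_t tid[MAX_THREADS];
    int started[MAX_THREADS];

    assert(fsep != '\0' && nthreads > 0 && nthreads <= MAX_THREADS);
    size_t n = split_lines(buf, rsep, lines, MAX_LINES);
    size_t per = (n + nthreads - 1) / nthreads;   /* lines per thread */

    for (size_t t = 0; t < nthreads; t++) {
        size_t lo = t * per;
        size_t hi = lo + per < n ? lo + per : n;
        tasks[t].nlines = lo < hi ? hi - lo : 0;
        tasks[t].fsep = fsep;
        for (size_t i = 0; i < tasks[t].nlines; i++)
            tasks[t].lines[i] = lines[lo + i];
        started[t] = pthread_create(&tid[t], NULL, split_fields, &tasks[t]) == 0;
        if (!started[t])
            split_fields(&tasks[t]);   /* fall back to doing the work inline */
    }
    for (size_t t = 0; t < nthreads; t++)
        if (started[t])
            pthread_join(tid[t], NULL);
    return n;
}
```

A caller would pass a writable buffer such as "a|1\nb|2\nc|3\n" together with '\n' and '|' as separators, then walk tasks[0], tasks[1], ... in order to see the fields in their original sequence. The sketch leaves out what the comment lists as the harder parts: the initial record offset, overlapping IO with BAT updates, and dividing work by observed time.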
_______________________________________________
Checkin-list mailing list
[email protected]
http://mail.monetdb.org/mailman/listinfo/checkin-list
