Update of /cvsroot/monetdb/MonetDB4/src/modules/contrib
In directory sc8-pr-cvs16.sourceforge.net:/tmp/cvs-serv15343/src/modules/contrib
Modified Files:
bat_mmath.mx bitset.mx ddbench.mx iterator.mx malalgebra.mx
mprof.mx salgebra.mx txtsim.mx uchr.mx
Log Message:
propagated changes of Sunday Feb 03 2008 - Friday Feb 08 2008
from the MonetDB_4-22 branch to the development trunk
Index: salgebra.mx
===================================================================
RCS file: /cvsroot/monetdb/MonetDB4/src/modules/contrib/salgebra.mx,v
retrieving revision 1.7
retrieving revision 1.8
diff -u -d -r1.7 -r1.8
--- salgebra.mx 11 Jan 2008 10:38:53 -0000 1.7
+++ salgebra.mx 8 Feb 2008 22:36:05 -0000 1.8
@@ -19,7 +19,6 @@
@v 1.0
@t Some (experimental) sorted-order algebraic operators
@* Introduction
[EMAIL PROTECTED]
Warning: experimental code!
@@ -164,7 +163,6 @@
\end{verbatim}
@+ Running example
[EMAIL PROTECTED]
In the following, we will assume that we work with three [void,oid] BATs
{\tt dj}, {\tt ti}, {\tt tfij} -- like the {\tt CONTREP} structure described
@@ -181,7 +179,6 @@
\end{verbatim}
@+ Building the cluster-hash
[EMAIL PROTECTED]
We first need to obtain `a' clustering of the data; potential sources are a
normal hash-table, an enum, a radix-ed BAT, as well as a
@@ -251,7 +248,6 @@
#include "gdk.h"
@+ Type definition
[EMAIL PROTECTED]
Explain fields.
@h
@@ -284,7 +280,6 @@
@c
@+ Utility functions
[EMAIL PROTECTED]
Not extremely interesting utility stuff.
We maintain a cache of known clusterhashes, protected by a lock,
@@ -333,7 +328,6 @@
BAT *salgebra_refcnt;
@+ MEL required functions
[EMAIL PROTECTED]
\begin{itemize}
\item size() and align()
\item ToStr
@@ -502,7 +496,6 @@
}
@+ Dictionary
[EMAIL PROTECTED]
The dictionary stores for each cluster the start and end bucket in the
clusterhash itself.
@@ -542,7 +535,6 @@
}
@+ Hashtable
[EMAIL PROTECTED]
...
@@ -650,7 +642,7 @@
@:deltaRefCnt(inc,++)@
@:deltaRefCnt(dec,--)@
[EMAIL PROTECTED]
+@
Build a clusterhash for bat b, deriving the clustering from a hash
defined for b.
@c
Index: bitset.mx
===================================================================
RCS file: /cvsroot/monetdb/MonetDB4/src/modules/contrib/bitset.mx,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -d -r1.3 -r1.4
--- bitset.mx 11 Jan 2008 10:38:53 -0000 1.3
+++ bitset.mx 8 Feb 2008 22:36:05 -0000 1.4
@@ -19,10 +19,8 @@
@a Menzo Windhouwer
@v 1.0
@* Introduction
[EMAIL PROTECTED]
@* Implementation
[EMAIL PROTECTED]
@+ MEL definition
@m
@@ -121,7 +119,7 @@
@- Atom commands
@c
[EMAIL PROTECTED]
+@
{\bf{\tt fromstr(str):bitset}}
Create a BitSet from a string
@@ -151,7 +149,7 @@
}
[EMAIL PROTECTED]
+@
{\bf{\tt tostr(bitset):str}}
Create a string from a BitSet
@@ -211,7 +209,7 @@
@- Constructors
@c
[EMAIL PROTECTED]
+@
{\bf{\tt newBitset(int):bitset}}
Create a new BitSet from an existing integer.
@@ -232,7 +230,7 @@
}
[EMAIL PROTECTED]
+@
{\bf{\tt newBitset():bitset}}
Create a new empty BitSet.
@@ -256,7 +254,7 @@
@- Accessors
@c
[EMAIL PROTECTED]
+@
{\bf{\tt bitset\_clearBit(bitset,int):bitset}}
Clear the specified bit in this BitSet.
@@ -279,7 +277,7 @@
}
[EMAIL PROTECTED]
+@
{\bf{\tt bitset\_flipBit(bitset,int)}}
Flip the specified bit in this BitSet.
@@ -301,7 +299,7 @@
}
[EMAIL PROTECTED]
+@
{\bf{\tt bitset\_getBit(bitset,int):bit}}
Get the specified bit from this BitSet.
@@ -325,7 +323,7 @@
}
[EMAIL PROTECTED]
+@
{\bf{\tt bitset\_setBit(bitset,int)}}
Set the specified bit from this BitSet.
@@ -347,7 +345,7 @@
}
[EMAIL PROTECTED]
+@
{\bf{\tt bitset\_and(bitset,bitset):bitset}}
AND this BitSet with an other BitSet.
@@ -364,7 +362,7 @@
}
[EMAIL PROTECTED]
+@
{\bf{\tt bitset\_or(bitset,bitset):bitset}}
OR this BitSet with an other BitSet.
@@ -381,7 +379,7 @@
}
[EMAIL PROTECTED]
+@
{\bf{\tt bitset\_xor(bitset,bitset):bitset}}
XOR this BitSet with an other BitSet.
@@ -398,7 +396,7 @@
}
[EMAIL PROTECTED]
+@
{\bf{\tt bitset\_not(bitset):bitset}}
NOT this BitSet.
@@ -415,7 +413,7 @@
}
[EMAIL PROTECTED]
+@
{\bf{\tt bitset\_toInt(bitset):int}}
Convert this BitSet to an integer.
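
Note on the bitset.mx commands listed in the hunks above: the sketch below shows how such single-bit and bitwise operations are commonly written in C. It is an illustration only. The `bitset32` type, the `bs_*` names, and the one-word 32-bit representation are assumptions; the actual atom layout in bitset.mx is not visible in this diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical representation: a bitset stored in one 32-bit word. */
typedef uint32_t bitset32;

/* newBitset(int): build a bitset from an existing integer. */
static inline bitset32 bs_new(uint32_t v) { return v; }

/* setBit / clearBit / flipBit: change bit i (0 = least significant). */
static inline bitset32 bs_set(bitset32 b, unsigned i)
{
    assert(i < 32);
    return b | (UINT32_C(1) << i);
}

static inline bitset32 bs_clear(bitset32 b, unsigned i)
{
    assert(i < 32);
    return b & (bitset32)~(UINT32_C(1) << i);
}

static inline bitset32 bs_flip(bitset32 b, unsigned i)
{
    assert(i < 32);
    return b ^ (UINT32_C(1) << i);
}

/* getBit: return 1 if bit i is set, 0 otherwise. */
static inline int bs_get(bitset32 b, unsigned i)
{
    assert(i < 32);
    return (int)((b >> i) & 1u);
}

/* and / or / xor / not: word-wise logical operations. */
static inline bitset32 bs_and(bitset32 a, bitset32 b) { return a & b; }
static inline bitset32 bs_or(bitset32 a, bitset32 b)  { return a | b; }
static inline bitset32 bs_xor(bitset32 a, bitset32 b) { return a ^ b; }
static inline bitset32 bs_not(bitset32 a)             { return (bitset32)~a; }

/* toInt: expose the underlying integer value. */
static inline uint32_t bs_to_int(bitset32 b) { return b; }
```

Each function returns the new value instead of modifying its argument, which mirrors the `:bitset` result type shown for most of the commands above.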
Index: mprof.mx
===================================================================
RCS file: /cvsroot/monetdb/MonetDB4/src/modules/contrib/mprof.mx,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -d -r1.4 -r1.5
--- mprof.mx 11 Jan 2008 10:38:53 -0000 1.4
+++ mprof.mx 8 Feb 2008 22:36:05 -0000 1.5
@@ -20,7 +20,6 @@
@v 1.0
@* Introduction
[EMAIL PROTECTED]
Performance analysis is a must to understand the and improve the
database applications developed for Monet.
To simplify this job, a profiling module is provided, which
@@ -89,7 +88,6 @@
The properties collected are summarized below. The items marked '*' are
collected for both the beginning and the end of the intervals.
[EMAIL PROTECTED]
\begin{tabular}{l l}
name & label associate with a oid \\
intervals & oid associate with each interval\\
@@ -1344,7 +1342,6 @@
The first example does not rely on general aggregate processing.
@-
[EMAIL PROTECTED]
\begin{verbatim}
# some simple atom expressions
a := Tms("12,13");
@@ -1385,7 +1382,7 @@
pmE("pmPrint");
printf("#~BeginVariableOutput~#\n"); pmSummary();
printf("#~EndVariableOutput~#\n");
}
[EMAIL PROTECTED]
+@
A sample output of the pmSummary routine is shown below
(for the example above)
Index: ddbench.mx
===================================================================
RCS file: /cvsroot/monetdb/MonetDB4/src/modules/contrib/ddbench.mx,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -d -r1.5 -r1.6
--- ddbench.mx 11 Jan 2008 10:38:53 -0000 1.5
+++ ddbench.mx 8 Feb 2008 22:36:05 -0000 1.6
@@ -19,12 +19,10 @@
@a Peter Boncz
@t DDbench Optimization Module
@* Introduction
[EMAIL PROTECTED]
This module is intended to attack the times presented on the DD benchmark
by the infocharger product.
@+ Enumeration Grouping/Counting
[EMAIL PROTECTED]
It enables experimentation with various optimizations:
\begin{itemize}
@@ -56,7 +54,6 @@
are incorporated using procs into the generic {\small\tt group()} commands.
@+ Selections and Intersecting them
[EMAIL PROTECTED]
Our initial benchmark result was obtained by
creating selection-BATs with {\small\tt uselect()} and intersecting those for AND
predicates.
@@ -120,7 +117,6 @@
an enumeration type.
@+ Sub-Histograms
[EMAIL PROTECTED]
For the computation of the sub-histograms, the following
optimizations have been done:
@@ -143,7 +139,6 @@
the histogram counting buckets.
@+ Other Possible Optimizations
[EMAIL PROTECTED]
We discuss possible avenues to further improve the performance of the DD
benchmark.
One such is idea is to further enhance the speed of the basic primitives on
@@ -160,7 +155,6 @@
should find its way into this document (TODO).
@- MMX parallellism
[EMAIL PROTECTED]
Extra speed may be gained by implementing in assembly directly.
The incompatibilities suffered by this might be recompensated by
the opportunity to achieve implementation code that cannot possibly
@@ -183,7 +177,6 @@
\end{description}
@- Query Optimization
[EMAIL PROTECTED]
An additional way to speed up the DD benchmark is to use an intelligent
cube-caching system. The basic idea is to first invest in computing a
complex cubes that are cached instead of out current binary \[X,reliable\]
@@ -982,7 +975,6 @@
}
@- optimized implementation
[EMAIL PROTECTED]
The below grouping algorithm exploits a limited number of grouping
combinations by creating an 'map' of index numbers for each possible group
that is initialized on zero. The index to the map is computed for each
@@ -1431,7 +1423,6 @@
@- optimized bit32 select/refine implementations
[EMAIL PROTECTED]
The creation of the destination submask, of which each integer
contains 32 answers (yes or no selected) is sped up by loop
unrolling these 32 tests. If the attribute type being
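
Note on the bit32 select described in the last hunk above: the sketch below illustrates the general idea of packing 32 yes/no selection answers into one integer with the 32 tests unrolled. It is not the ddbench.mx code, which this diff does not show. The function name `select_eq_bit32`, the `TEST8` macro, the equality predicate, and the `int` column type are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Eight unrolled equality tests on v[k..k+7]; answer j lands in bit j.
 * Expects `v` and `key` to be in scope at the point of use. */
#define TEST8(k)                                   \
    ((uint32_t)(v[(k) + 0] == key) << ((k) + 0) |  \
     (uint32_t)(v[(k) + 1] == key) << ((k) + 1) |  \
     (uint32_t)(v[(k) + 2] == key) << ((k) + 2) |  \
     (uint32_t)(v[(k) + 3] == key) << ((k) + 3) |  \
     (uint32_t)(v[(k) + 4] == key) << ((k) + 4) |  \
     (uint32_t)(v[(k) + 5] == key) << ((k) + 5) |  \
     (uint32_t)(v[(k) + 6] == key) << ((k) + 6) |  \
     (uint32_t)(v[(k) + 7] == key) << ((k) + 7))

/* Writes one answer bit per input value; mask must hold (n + 31) / 32 words. */
static void select_eq_bit32(const int *col, size_t n, int key, uint32_t *mask)
{
    size_t full = n / 32, w, i;

    assert(col != NULL && mask != NULL);

    /* Full words: all 32 tests are unrolled, no inner loop. */
    for (w = 0; w < full; w++) {
        const int *v = col + w * 32;
        mask[w] = TEST8(0) | TEST8(8) | TEST8(16) | TEST8(24);
    }

    /* Remaining values (fewer than 32) use a plain loop. */
    if (n % 32 != 0) {
        uint32_t m = 0;
        for (i = full * 32; i < n; i++)
            m |= (uint32_t)(col[i] == key) << (i % 32);
        mask[full] = m;
    }
}
```

With this layout, value `i` of the column is selected exactly when bit `i % 32` of `mask[i / 32]` is set.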
Index: iterator.mx
===================================================================
RCS file: /cvsroot/monetdb/MonetDB4/src/modules/contrib/iterator.mx,v
retrieving revision 1.6
retrieving revision 1.7
diff -u -d -r1.6 -r1.7
--- iterator.mx 11 Jan 2008 10:38:53 -0000 1.6
+++ iterator.mx 8 Feb 2008 22:36:05 -0000 1.7
@@ -18,7 +18,6 @@
@a Peter Boncz
@t Iterator-Based BAT Engine
@* Introduction
[EMAIL PROTECTED]
In the Monet system it was decided that operators use full materialization.
This is a simple approach that allows for efficient implementation of the
individual operators. In MIL, one operator is executed without interruption
@@ -148,7 +147,6 @@
size_t chunksize = 1;
@- BUN insert
[EMAIL PROTECTED]
Moving tuples is optimized to moving only the fixed-size part. The operators presented
here do not introduce new values; they only pass already existing values. For
the result tuple buffers we use BATs, but let their heaps for variable-size atoms
@@ -174,7 +172,6 @@
@- Data Structure
[EMAIL PROTECTED]
A stream is an object that has a next() method that returns more tuples. We also gave
it a guess() for estimating how many tuples more will come, and a free() for deallocation.
This would have been nicely implemented as a hierarchy of streams in C++, but I hate
@@ -335,7 +332,6 @@
@+ Scan Iterator
[EMAIL PROTECTED]
Input is simpy an already loaded bat. We provide slices of it.
@h
typedef struct {
@@ -378,7 +374,6 @@
}
@+ Mirror, Reverse and Mark Iterators
[EMAIL PROTECTED]
This passes through the mirrored results of its input. No buns need to be copied as
their BAT buffers are views of each other; hence share the same data structures.
@c
@@ -410,7 +405,6 @@
}
@- reverse
[EMAIL PROTECTED]
This passes through the reversed results of its input. We reuse the mirror methods.
@c
int
@@ -436,7 +430,6 @@
}
@+ Scan-Select Iterator
[EMAIL PROTECTED]
This scans the input stream (whatever it may be) and only lets pass the qualifying
elements. Qualification is determined using some bit-function that is executed with
three parameters: current tail-value and two constant parameter values.
@@ -594,11 +587,9 @@
}
@+ Join Iterator
[EMAIL PROTECTED]
Falls down into two cases: hash-join and positional join.
@- hash-join
[EMAIL PROTECTED]
This is equi-join using a hash table on the inner operand. The inner
operand is materialied (it is a BAT). The iterator then iteratively
returns matches on tuples from the outer stream with the hash-table.
@@ -756,7 +747,6 @@
@+ GroupBy-Aggregation
[EMAIL PROTECTED]
Aggregation breaks the tuple stream, as it needs to see all tuples before the
aggregate totals are known. Therefore, these operators return a BAT rather than
an ITER. We provide two implementations: a generic hash-based grouping and
@@ -977,7 +967,6 @@
@+ next
[EMAIL PROTECTED]
Replace the contents of the result bat with the next chunk (if any) and return TRUE.
Otherwise, deallocate the iterator and return FALSE.
@c
@@ -999,7 +988,6 @@
}
@+ collect
[EMAIL PROTECTED]
Materialize a tuple stream into a BAT. This function intelligently uses
the guess() method for estimating the memory resources needed.
@c
@@ -1026,7 +1014,6 @@
}
@+ chunksize
[EMAIL PROTECTED]
Get and set the chunksize parameter. This determines the granularity in
which tuples flow through an iterator tree.
@c
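
Note on the iterator.mx hunks above: the documentation describes a stream as an object with next(), guess() and free() methods, a scan iterator that hands out slices of a loaded BAT, and a collect operation that uses guess() to size its result. The sketch below shows one way such an interface can be written in plain C with function pointers. It is an illustration, not the iterator.mx code. All names (`stream`, `scan_stream`, `next_fn`, `guess_fn`, `free_fn`, `scan_open`, `collect`) are hypothetical, and an `int` array stands in for a BAT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct stream stream;
struct stream {
    /* Copy up to cap values into out; return the count, 0 when exhausted. */
    size_t (*next_fn)(stream *s, int *out, size_t cap);
    /* Estimate how many values are still to come. */
    size_t (*guess_fn)(const stream *s);
    /* Deallocate the stream. */
    void (*free_fn)(stream *s);
};

/* Scan iterator: hands out slices of an already loaded array. */
typedef struct {
    stream base;      /* first member, so a scan_stream* is also a stream* */
    const int *data;
    size_t len;
    size_t pos;
} scan_stream;

static size_t scan_next(stream *s, int *out, size_t cap)
{
    scan_stream *sc = (scan_stream *)s;
    size_t n = sc->len - sc->pos;
    if (n > cap)
        n = cap;
    if (n > 0)
        memcpy(out, sc->data + sc->pos, n * sizeof *out);
    sc->pos += n;
    return n;
}

static size_t scan_guess(const stream *s)
{
    const scan_stream *sc = (const scan_stream *)s;
    return sc->len - sc->pos;
}

static void scan_free(stream *s)
{
    free(s);
}

static stream *scan_open(const int *data, size_t len)
{
    scan_stream *sc = malloc(sizeof *sc);
    if (sc == NULL)
        return NULL;
    sc->base.next_fn = scan_next;
    sc->base.guess_fn = scan_guess;
    sc->base.free_fn = scan_free;
    sc->data = data;
    sc->len = len;
    sc->pos = 0;
    return &sc->base;
}

/* Materialize a stream into a malloc'd array, pulling chunksize values at a
 * time. guess() sizes the first allocation; the buffer doubles if the guess
 * was too low. The stream is freed in all cases. */
static int *collect(stream *s, size_t chunksize, size_t *count)
{
    size_t cap, n = 0, want, got;
    int *buf, *tmp;

    assert(s != NULL && count != NULL && chunksize > 0);
    cap = s->guess_fn(s);
    if (cap < chunksize)
        cap = chunksize;
    buf = malloc(cap * sizeof *buf);
    while (buf != NULL) {
        if (n == cap) {                       /* buffer full: grow it */
            tmp = realloc(buf, 2 * cap * sizeof *buf);
            if (tmp == NULL) {
                free(buf);
                buf = NULL;
                break;
            }
            buf = tmp;
            cap *= 2;
        }
        want = cap - n < chunksize ? cap - n : chunksize;
        got = s->next_fn(s, buf + n, want);
        if (got == 0)                         /* stream exhausted */
            break;
        n += got;
    }
    s->free_fn(s);
    *count = buf != NULL ? n : 0;
    return buf;
}
```

A caller would open a stream with `scan_open(data, len)`, pass it to `collect(s, chunksize, &n)`, and release the returned array with `free()`. Other iterators (select, join) would fill in the same three function pointers while wrapping an input stream.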
Index: uchr.mx
===================================================================
RCS file: /cvsroot/monetdb/MonetDB4/src/modules/contrib/uchr.mx,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -d -r1.3 -r1.4
--- uchr.mx 11 Jan 2008 10:38:53 -0000 1.3
+++ uchr.mx 8 Feb 2008 22:36:05 -0000 1.4
@@ -19,7 +19,6 @@
@a Niels Nes
@v 1.0
@* Introduction
[EMAIL PROTECTED]
The uchr atom is just a chr but uses numerical string representations.
Index: malalgebra.mx
===================================================================
RCS file: /cvsroot/monetdb/MonetDB4/src/modules/contrib/malalgebra.mx,v
retrieving revision 1.6
retrieving revision 1.7
diff -u -d -r1.6 -r1.7
--- malalgebra.mx 11 Jan 2008 10:38:53 -0000 1.6
+++ malalgebra.mx 8 Feb 2008 22:36:05 -0000 1.7
@@ -19,7 +19,6 @@
@v 1.0
@t Work around internal select and join algorithm selection
@* Introduction
[EMAIL PROTECTED]
Too many bugs.
@m
Index: bat_mmath.mx
===================================================================
RCS file: /cvsroot/monetdb/MonetDB4/src/modules/contrib/bat_mmath.mx,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -d -r1.5 -r1.6
--- bat_mmath.mx 11 Jan 2008 10:38:53 -0000 1.5
+++ bat_mmath.mx 8 Feb 2008 22:36:05 -0000 1.6
@@ -30,7 +30,7 @@
@m
.MODULE bat_mmath;
[EMAIL PROTECTED]
+@
\begin{verbatim}
signatures
@1: dbl arithmetic type
Index: txtsim.mx
===================================================================
RCS file: /cvsroot/monetdb/MonetDB4/src/modules/contrib/txtsim.mx,v
retrieving revision 1.6
retrieving revision 1.7
diff -u -d -r1.6 -r1.7
--- txtsim.mx 11 Jan 2008 10:38:53 -0000 1.6
+++ txtsim.mx 8 Feb 2008 22:36:05 -0000 1.7
@@ -20,7 +20,6 @@
@d 15/04/2005
@v 0.3
[EMAIL PROTECTED]
@* String metrics
Provides basic similarity metrics for strings.