Bugs item #2952191, was opened at 2010-02-15 17:57
Message generated for change (Settings changed) made by stmane
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2952191&group_id=56967
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Core
Group: MonetDB5 "stable"
Status: Open
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Wouter Alink (vzzzbx)
Assigned to: Stefan Manegold (stmane)
Summary: GDK: merge-join produces invalid BAT
Initial Comment:
With Nov2009, the bug can be reproduced with the attached script.
The result of a merge-join operation does not pass all the checks.
Running the query (the last statement of the attached script)
SELECT dd.docid as docid
FROM y as sd, x as dd
WHERE sd.docid = dd.docoid;
returns:
ERROR = !MALException:!ERROR: BATpropcheck: BAT tmpr_6767(-3575) with 26584
tuples seqbase of dense oid bat is wrong! 0 != 4196
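For readers unfamiliar with the message: a "dense" oid column is expected to hold consecutive values starting at its recorded seqbase. The following is only a minimal sketch of such an invariant, not GDK's BATpropcheck; the type name oid_t and the function dense_property_holds are hypothetical and introduced purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for GDK's oid type. */
typedef uint64_t oid_t;

/* Returns true if vals[0..n) is exactly seqbase, seqbase+1, ...,
 * i.e. if the "dense" property with this seqbase really holds. */
bool dense_property_holds(const oid_t *vals, size_t n, oid_t seqbase)
{
    assert(vals != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (vals[i] != seqbase + (oid_t) i)
            return false;   /* value i contradicts the recorded seqbase */
    }
    return true;
}
```

Under this reading, a column whose values start at 4196 but whose recorded seqbase is 0 (or the other way round) would fail the check, which matches the shape of the "0 != 4196" message.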
----------------------------------------------------------------------
Comment By: Stefan Manegold (stmane)
Date: 2010-02-18 02:38
Message:
fixed in CVS (Nov2009 branch) by fixing the property settings in
mergejoin()
propagation to Feb2010 branch and development trunk pending
a proper test still needs to be added before closing this report
----------------------------------------------------------------------
Comment By: Stefan Manegold (stmane)
Date: 2010-02-17 20:50
Message:
In a debugging session with Roberto, we located a bug in mergejoin where
properties are set incorrectly.
I'll check in a fix once testing has confirmed its correctness.
----------------------------------------------------------------------
Comment By: Stefan Manegold (stmane)
Date: 2010-02-15 20:01
Message:
(a)
it seems to work fine in the Feb2010 branch
(need to check what has changed since Nov2009 ...)
(b)
in the Nov2009 branch, mergejoin --- in fact a binary lookup join between
BATmirror(r) and BATmirror(l) --- is chosen here:
Breakpoint 1, mergejoin (l=0x7fffd00008c8, r=0x7fffc00008f0, bn=0x0,
nil_on_miss=0x0, estimate=9223372036854775807, limit=0x0) at
/ufs/manegold/_/scratch0/Monet/Testing/Stable/source/MonetDB/src/gdk/gdk_relop.mx:264
264 ptr nil = ATOMnilptr(r->htype);
Missing separate debuginfos, use: debuginfo-install
bzip2-libs-1.0.5-6.fc12.x86_64 cyrus-sasl-lib-2.1.23-4.fc12.x86_64
glibc-2.11.1-1.x86_64 keyutils-libs-1.2-6.fc12.x86_64
krb5-libs-1.7-18.fc12.x86_64 libcom_err-1.41.9-5.fc12.x86_64
libcurl-7.19.7-6.fc12.x86_64 libidn-1.9-5.x86_64
libselinux-2.0.90-3.fc12.x86_64 libssh2-1.2.2-5.fc12.x86_64
libxml2-2.7.6-1.fc12.x86_64 libxslt-1.1.26-1.fc12.x86_64
ncurses-libs-5.7-3.20090207.fc12.x86_64 nspr-4.8.2-1.fc12.x86_64
nss-3.12.5-8.fc12.x86_64 nss-softokn-3.12.4-10.fc12.x86_64
nss-softokn-freebl-3.12.4-10.fc12.x86_64 nss-util-3.12.5-1.fc12.1.x86_64
openldap-2.4.19-1.fc12.x86_64 openssl-1.0.0-0.13.beta4.fc12.x86_64
pcre-7.8-3.fc12.x86_64 raptor-1.4.18-5.fc12.x86_64
readline-6.0-3.fc12.x86_64 sqlite-3.6.20-1.fc12.x86_64
zlib-1.2.3-23.fc12.x86_64
(gdb) up
#1 0x00007ffff762e704 in batmergejoin (l=0x7fffc00008c8,
r=0x7fffd00008f0, estimate=9223372036854775807, swap=1 '\001', limit=0x0)
at
/ufs/manegold/_/scratch0/Monet/Testing/Stable/source/MonetDB/src/gdk/gdk_relop.mx:410
410 BAT *bn = mergejoin(BATmirror(r), BATmirror(l), NULL,
NULL,
estimate, limit);
(gdb)
#2 0x00007ffff766b6ee in batjoin (l=0x7fffc00008c8, r=0x7fffd00008f0,
estimate=9223372036854775807, swap=1 '\001') at
/ufs/manegold/_/scratch0/Monet/Testing/Stable/source/MonetDB/src/gdk/gdk_relop.mx:1230
1230 return batmergejoin(l, r, estimate, swap, NULL);
(gdb) li
1225 loop binary search (both implemented by BATmergejoin).
1226 @c
1227 if ((BATtordered(l) & BAThordered(r) & 1) || (must_hash &&
(((BATtordered(l) & 1) && ((lng) lcount > logl * (lng) rcount) && swap) ||
((BAThordered(r) & 1) && ((lng) rcount > logr * (lng) lcount))))) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1228 ALGODEBUG THRprintf(GDKout, "#BATjoin: BATmergejoin(l,r," BUNFMT ");\n", estimate);
1229
1230 return batmergejoin(l, r, estimate, swap, NULL);
1231 }
1232 @-
1233 hash join: the bread&butter join of monet
1234 @c
(gdb) print l->T->sorted &1
$1 = 1
(gdb) print r->H->sorted &1
$2 = 0
(gdb) print must_hash
$3 = 1
(gdb) print l->U->count
$4 = 26584
(gdb) print r->U->count
$5 = 999
(gdb) print logl
$6 = 19
(gdb) print logl * r->U->count
$7 = 18981
(gdb) print swap
$8 = 1 '\001'
(gdb) print logr
$9 = 14
(gdb)
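To make the outcome of the condition at line 1227 explicit, here is a small sketch that evaluates its boolean structure with the values printed above. The struct join_inputs and both function names are hypothetical; only the shape of the condition and the numbers come from the gdb session.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the inputs of the condition at
 * gdk_relop.mx:1227; the field names are illustrative, not GDK's. */
struct join_inputs {
    bool l_tail_sorted;   /* BATtordered(l) & 1 */
    bool r_head_sorted;   /* BAThordered(r) & 1 */
    bool must_hash;
    bool swap;
    int64_t lcount, rcount;
    int64_t logl, logr;
};

/* Mirrors the boolean structure of the quoted condition. */
bool picks_batmergejoin(const struct join_inputs *in)
{
    assert(in->lcount >= 0 && in->rcount >= 0);
    /* Both join columns sorted: a real merge join is possible. */
    bool both_sorted = in->l_tail_sorted && in->r_head_sorted;
    /* Only l's tail sorted: binary search in l for each tuple of r. */
    bool lookup_in_l = in->l_tail_sorted
                       && in->lcount > in->logl * in->rcount
                       && in->swap;
    /* Only r's head sorted: binary search in r for each tuple of l. */
    bool lookup_in_r = in->r_head_sorted
                       && in->rcount > in->logr * in->lcount;
    return both_sorted || (in->must_hash && (lookup_in_l || lookup_in_r));
}

/* The values printed in the gdb session. */
struct join_inputs observed_inputs(void)
{
    struct join_inputs in = {
        .l_tail_sorted = true,  .r_head_sorted = false,
        .must_hash = true,      .swap = true,
        .lcount = 26584,        .rcount = 999,
        .logl = 19,             .logr = 14,
    };
    return in;
}
```

With these inputs the first clause is false (the head of r is not sorted), but the second clause holds because must_hash and swap are set, the tail of l is sorted, and 26584 > 19 * 999 = 18981. So the branch is taken without requiring the head of r to be sorted, which fits the "binary lookup join" reading in (b).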
----------------------------------------------------------------------
Comment By: Roberto Cornacchia (cornuz)
Date: 2010-02-15 18:29
Message:
Notice that there is probably nothing wrong with mergejoin here.
The whole point is that mergejoin should not have been selected in the
first place, as dd.docoid is NOT sorted (it is until almost the end, but
not till the end) and the attached bat has properties correctly set (not
sorted).
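As an illustration of "sorted until almost the end", the sketch below scans a column for the first position that breaks ascending order. It is a hypothetical helper, not GDK code; the name first_unsorted_pos is made up here. Reading the "hnosorted" value of 990 in the dump of _19 (out of 999 tuples) as such a witness position is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the first position i (1 <= i < n) with vals[i] < vals[i-1],
 * or 0 if the whole array is in ascending order. */
size_t first_unsorted_pos(const int64_t *vals, size_t n)
{
    assert(vals != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        if (vals[i] < vals[i - 1])
            return i;   /* witness that the column is not sorted */
    }
    return 0;
}
```

A single late violation is enough to make the column unsorted as a whole, so a sorted prefix of any length does not justify a join that relies on this column being ordered.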
sql debug session:
mdb># _20 :=
algebra.join(_3=<LHM_sec_doc_docID>:bat[:oid,:lng][26584],_19=<tmpr_3130>[999]);
#### L is [void,lng], tail is sorted
mdb>info _3
Show info for 291
#-------------------------------------------------#
# h t # name
# str str # type
#-------------------------------------------------#
[ "batId", "LHM_sec_doc_docID" ]
[ "batCacheid", "291" ]
[ "hparentid", "0" ]
[ "tparentid", "0" ]
[ "batSharecnt", "0" ]
[ "batCount", "26584" ]
[ "batCapacity", "32768" ]
[ "head", "void" ]
[ "tail", "lng" ]
[ "batPersistence", "persistent" ]
[ "batRestricted", "read-only" ]
[ "batRefcnt", "1" ]
[ "batLRefcnt", "2" ]
[ "batDirty", "clean" ]
[ "batSet", "0" ]
[ "hsorted", "65" ]
[ "hident", "h" ]
[ "hdense", "1" ]
[ "hseqbase", "0...@0" ]
[ "hkey", "1" ]
[ "hvarsized", "1" ]
[ "halign", "1018390" ]
[ "hnosorted", "0" ]
[ "hnosorted_rev", "0" ]
[ "hnodense", "0" ]
[ "hnokey[0]", "0" ]
[ "hnokey[1]", "0" ]
[ "hnonil", "1" ]
[ "hnil", "0" ]
[ "tident", "t" ]
[ "tdense", "0" ]
[ "tseqbase", "0...@0" ]
[ "tsorted", "65" ]
[ "tkey", "0" ]
[ "tvarsized", "0" ]
[ "talign", "1018391" ]
[ "tnosorted", "0" ]
[ "tnosorted_rev", "0" ]
[ "tnodense", "0" ]
[ "tnokey[0]", "0" ]
[ "tnokey[1]", "1" ]
[ "tnonil", "1" ]
[ "tnil", "0" ]
[ "batInserted", "26584" ]
[ "batDeleted", "0" ]
[ "batFirst", "0" ]
[ "htop", "0" ]
[ "ttop", "212672" ]
[ "batStamp", "0" ]
[ "lastUsed", "485486" ]
[ "curStamp", "210" ]
[ "batCopiedtodisk", "1" ]
[ "batDirtydesc", "clean" ]
[ "H->heap.dirty", "clean" ]
[ "T->heap.dirty", "clean" ]
[ "head.free", "0" ]
[ "head.size", "0" ]
[ "head.maxsize", "0" ]
[ "head.storage", "absent" ]
[ "head.newstorage", "memory mapped" ]
[ "head.filename", "no file" ]
[ "tail.free", "212672" ]
[ "tail.size", "262144" ]
[ "tail.maxsize", "262144" ]
[ "tail.storage", "memory mapped" ]
[ "tail.newstorage", "memory mapped" ]
[ "tail.filename", "04/443.tail" ]
[ "H->vheap->dirty", "clean" ]
[ "T->vheap->dirty", "clean" ]
mdb># _20 :=
algebra.join(_3=<LHM_sec_doc_docID>:bat[:oid,:lng][26584],_19=<tmpr_3130>[999]);
#### R is [lng,oid], head is NOT sorted
mdb>info _19
Show info for -1624
#-----------------------------------------#
# h t # name
# str str # type
#-----------------------------------------#
[ "batId", "tmpr_3130" ]
[ "batCacheid", "-1624" ]
[ "hparentid", "-998" ]
[ "tparentid", "0" ]
[ "batSharecnt", "0" ]
[ "batCount", "999" ]
[ "batCapacity", "1024" ]
[ "head", "lng" ]
[ "tail", "void" ]
[ "batPersistence", "transient" ]
[ "batRestricted", "read-only" ]
[ "batRefcnt", "1" ]
[ "batLRefcnt", "1" ]
[ "batDirty", "dirty" ]
[ "batSet", "0" ]
[ "hsorted", "0" ]
[ "hident", "t" ]
[ "hdense", "0" ]
[ "hseqbase", "0...@0" ]
[ "hkey", "1" ]
[ "hvarsized", "0" ]
[ "halign", "1027928" ]
[ "hnosorted", "990" ]
[ "hnosorted_rev", "0" ]
[ "hnodense", "0" ]
[ "hnokey[0]", "0" ]
[ "hnokey[1]", "0" ]
[ "hnonil", "1" ]
[ "hnil", "0" ]
[ "tident", "h" ]
[ "tdense", "1" ]
[ "tseqbase", "0...@0" ]
[ "tsorted", "65" ]
[ "tkey", "1" ]
[ "tvarsized", "1" ]
[ "talign", "1029656" ]
[ "tnosorted", "0" ]
[ "tnosorted_rev", "0" ]
[ "tnodense", "0" ]
[ "tnokey[0]", "0" ]
[ "tnokey[1]", "0" ]
[ "tnonil", "1" ]
[ "tnil", "0" ]
[ "batInserted", "999" ]
[ "batDeleted", "0" ]
[ "batFirst", "0" ]
[ "htop", "7992" ]
[ "ttop", "0" ]
[ "batStamp", "208" ]
[ "lastUsed", "485645" ]
[ "curStamp", "209" ]
[ "batCopiedtodisk", "0" ]
[ "batDirtydesc", "dirty" ]
[ "H->heap.dirty", "clean" ]
[ "T->heap.dirty", "clean" ]
[ "head.free", "7992" ]
[ "head.size", "8192" ]
[ "head.maxsize", "8192" ]
[ "head.storage", "malloced" ]
[ "head.newstorage", "malloced" ]
[ "head.filename", "17/1746.tail" ]
[ "tail.free", "0" ]
[ "tail.size", "0" ]
[ "tail.maxsize", "0" ]
[ "tail.storage", "absent" ]
[ "tail.newstorage", "malloced" ]
[ "tail.filename", "no file" ]
[ "H->vheap->dirty", "clean" ]
[ "T->vheap->dirty", "clean" ]
mdb># _20 :=
algebra.join(_3=<LHM_sec_doc_docID>:bat[:oid,:lng][26584],_19=<tmpr_3130>[999]);
mdb>debug 2097152
Set debug mask to 2097152
### with the head of R not sorted, a mergejoin is selected!
mdb># _20 :=
algebra.join(_3=<LHM_sec_doc_docID>:bat[:oid,:lng][26584],_19=<tmpr_3130>[999]);
mdb>
#BATjoin: BATmergejoin(l,r,9223372036854775807);
ERROR: GDKerror:!ERROR: BATpropcheck: BAT tmpr_3132(-1626) with 26584
tuples seqbase of dense oid bat is wrong! 0 != 4196
mdb>
----------------------------------------------------------------------