Bugs item #2783279, was opened at 2009-04-29 00:33
Message generated for change (Comment added) made by stmane
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2783279&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Core
Group: MonetDB Common CVS Head
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Stefan de Konink (skinkie)
Assigned to: Niels Nes (nielsnes)
Summary: SEGV upon bat too large

Initial Comment:
CREATE TABLE "sys"."way_nds" (
        "way" int,
        "idx" int,
        "to_node" int
);

create table segments (way integer, node1 integer, node2 integer);
insert into segments select t1.way, t1.to_node, t2.to_node from way_nds as t1, 
way_nds as t2 where t1.way = t2.way and t1.idx = t2.idx-1;

way_nds consists of 370896230 rows

The error on the server:
!ERROR: bunfastins: bat too large
!ERROR: bunfastins: bat too large
!ERROR: bunfastins: bat too large
!ERROR: bunfastins: bat too large
!ERROR: bunfastins: bat too large
!ERROR: bunfastins: bat too large
!ERROR: bunfastins: bat too large
!ERROR: bunfastins: bat too large
!ERROR: bunfastins: bat too large
!ERROR: bunfastins: bat too large
Segmentation fault

(will try to run GDB, but currently have enabled optimize)

I am using 64bit/oid32.

----------------------------------------------------------------------

>Comment By: Stefan Manegold (stmane)
Date: 2009-04-30 22:23

Message:
avoiding the multi-column join (with arithmetic term) seems to work fine:

sql>select min(idx), max(idx), min(way), max(way) from way_nds;
+-----+-------+-----+----------+
| L11 | L12   | L13 | L14      |
+=====+=======+=====+==========+
|   0 | 36168 |  35 | 33244337 |
+-----+-------+-----+----------+
1 tuple
sql>select count(*) from way_nds as t1, way_nds as t2 where ((t1.way *
100000) + t1.idx) = ((t2.way * 100000) + t2.idx);
+-----------+
| L7        |
+===========+
| 370896230 |
+-----------+
1 tuple
sql>select count(*) from way_nds as t1, way_nds as t2 where ((t1.way *
100000) + t1.idx) = ((t2.way * 100000) + (t2.idx-1));
+-----------+
| L10       |
+===========+
| 343983657 |
+-----------+
1 tuple


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-04-30 22:11

Message:
Without my patch,
select count(*) from way_nds as t1, way_nds as t2 where t1.way = t2.way
and t1.idx = t2.idx-1;
seems to trigger a segfault.


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-04-30 20:06

Message:
With today's CVS code base (Feb2009 branch) plus the attached mkey.mx
patch, I get the following on 64-bit Fedora 10, using 32-bit OIDs,
configured with --enable-strict --disable-debug --enable-optimize
--disable-assert --enable-oid32, compiled with gcc 4.3.2, server started
with default `mserver5 --dbinit='include sql;'`:

sql>CREATE TABLE "sys"."way_nds" ( "way" int, "idx" int, "to_node" int );
0 tuples
sql>copy 370896230 records into way_nds from
'/export/data1/konink/way_nds.csv' delimiters ',';
Rows affected 370896230
sql>select count(*) from way_nds;
+-----------+
| L1        |
+===========+
| 370896230 |
+-----------+
1 tuple
sql>select * from way_nds limit 9;
+-----+-----+-----------+
| way | idx | to_node   |
+=====+=====+===========+
|  35 |   0 |    200542 |
|  35 |   1 | 274057218 |
|  35 |   2 |    200550 |
|  35 |   3 |    200551 |
|  35 |   4 |    200553 |
|  37 |   0 |    200511 |
|  37 |   1 | 177231081 |
|  37 |   2 |    200513 |
|  37 |   3 | 177081428 |
+-----+-----+-----------+
9 tuples
sql>select count(*) from way_nds as t1, way_nds as t2 where t1.way =
t2.way and t1.idx = t2.idx;
+-----------+
| L2        |
+===========+
| 370896230 |
+-----------+
1 tuple
sql>select count(*) from way_nds as t1, way_nds as t2 where t1.way =
t2.way and t1.idx = t2.idx-1;
+----+
| L3 |
+====+
|  0 |
+----+
1 tuple
sql>select min(idx) from way_nds;
+----+
| L6 |
+====+
|  0 |
+----+
1 tuple
sql>select max(idx) from way_nds;
+-------+
| L4    |
+=======+
| 36168 |
+-------+
1 tuple
sql>select count(*) from way_nds as t1, way_nds as t2 where t2.idx > 0 and
t1.way = t2.way and t1.idx = t2.idx-1;
+-----------+
| L5        |
+===========+
| 343983657 |
+-----------+
1 tuple


... I suspect that there are (still?) (despite or due to my patch??) some
problems with the multi-column join, possibly releated to or caused by
potential bugs in bulk_rotate_xor_hash() of module mkey, in particular when
hashing negaitve numbers and using a 32-bit left shift as done by the SQL
code generation ...


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-04-30 19:50

Message:
With the latest (April 30) CVS code base (Feb2009 branch or HEAD):
- does the "bat too large" error still occur?
- does the segfault still occur?


----------------------------------------------------------------------

Comment By: Martin Kersten (mlkersten)
Date: 2009-04-29 09:45

Message:
The sporadic segfault has been catched in the HEAD

----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-04-29 08:46

Message:
My question about the primary key was not about to add one, but rather
about which attribute (or combination of attributes) has unique values /
value combinations (and hence would qualify as a primary key) --- this
information would help to (roughly) estimate the join result size ...

... apart from that, ALGjoinPath() should of course not segfault, even if
(a previous) inter mediate result "overflows", and hence triggers an
exception/error ...


----------------------------------------------------------------------

Comment By: Stefan de Konink (skinkie)
Date: 2009-04-29 08:10

Message:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe2aa3950 (LWP 7736)]
0x00007fffe544a9c9 in ALGjoinPath ()
   from
/export/scratch1/konink/monetdb-head/lib/MonetDB5/lib/lib_optimizer.so
(gdb) bt
#0  0x00007fffe544a9c9 in ALGjoinPath ()
   from
/export/scratch1/konink/monetdb-head/lib/MonetDB5/lib/lib_optimizer.so
#1  0x00007ffff7d12975 in DFLOWstep ()
   from /export/scratch1/konink/monetdb-head/lib/libmonetdb5.so.5
#2  0x00007ffff7d14894 in runDFLOWworker ()
   from /export/scratch1/konink/monetdb-head/lib/libmonetdb5.so.5
#3  0x0000003c1a4073da in start_thread () from /lib64/libpthread.so.0
#4  0x0000003c19ce62bd in clone () from /lib64/libc.so.6

[I'll recompile with debugging enabled again]


Currently no primary key has been set. But if I set it, it would be (way,
idx). But since my query makes an explicit relation between way and idx,
would a primary key make any difference?

----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-04-29 05:09

Message:
"bat too large" occurs only if there are BUN_MAX BUNs; with 32-bit OIDs,
BUN_MAX is INT_MAX-1, i.e., 2^31-1-1 == 2147483646 >> 370896230.

I suspect that an intermediate result while calculating the (self-)join
"explodes".
What is the primary key of table way_nds? (way), (idx), (to_node),
(way,idx), ..., (way,idx,to_node)?


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2783279&group_id=56967

------------------------------------------------------------------------------
Register Now & Save for Velocity, the Web Performance & Operations 
Conference from O'Reilly Media. Velocity features a full day of 
expert-led, hands-on workshops and two days of sessions from industry 
leaders in dedicated Performance & Operations tracks. Use code vel09scf 
and Save an extra 15% before 5/3. http://p.sf.net/sfu/velocityconf
_______________________________________________
Monetdb-bugs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs

Reply via email to