Bugs item #2879117 was opened at 2009-10-14 13:04
Message generated for change (Comment added) made by smokinn
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2879117&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: SQL/Core
Group: MonetDB5 "stable"
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Guillaume Theoret (smokinn)
Assigned to: Niels Nes (nielsnes)
Summary: Consistent segmentation fault

Initial Comment:
It seems I can consistently crash MonetDB by running inserts in a while loop.

I have a script that reads 7.5MM rows from MySQL and runs the inserts on 
MonetDB. After about 24 hours the database segfaults.

The messages my script output before crashing:
Run 1:
6380000 done - 84.2207% in 86083.58 (0.0109 avg)
Segmentation fault

Run 2:
6380000 done - 84.2207% in 85957.17 (0.0107 avg)
Segmentation fault (core dumped)

I also log all get_last_error messages if there are any and got these two 
messages just before the crash:

MALException:!ERROR: HEAPextend: failed to extend to 244108529 for 07/744theap
MALException:!ERROR: HEAPextend: failed to extend to 244108529 for 07/744theap

On the second run I installed Apport to get a dump that I've attached.

I'm running Ubuntu 9.04 Jaunty on a 32-bit machine with 2GB of RAM. I compiled 
from the bz2 tarball linked here: 
http://monetdb.cwi.nl/projects/monetdb//Download/ 
(http://monetdb.cwi.nl/downloads/sources/Aug2009-SP1/MonetDB-Aug2009-SuperBall-SP1.tar.bz2), 
and the crash occurs both when compiling with and without optimizations and 
with and without debug.

guilla...@dev3:~$ gcc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.3.3-5ubuntu4' 
--with-bugurl=file:///usr/share/doc/gcc-4.3/README.Bugs 
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared 
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext 
--enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.3 
--program-suffix=-4.3 --enable-clocale=gnu --enable-libstdcxx-debug 
--enable-objc-gc --enable-mpfr --enable-targets=all --with-tune=generic 
--enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu 
--target=i486-linux-gnu
Thread model: posix
gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4)

guilla...@dev3:~$ bison -V
bison (GNU Bison) 2.3

guilla...@dev3:~$ yacc -V
bison (GNU Bison) 2.3

guilla...@dev3:~$ autoconf -V
autoconf (GNU Autoconf) 2.63

guilla...@dev3:~$ automake --version
automake (GNU automake) 1.10.2

guilla...@dev3:~$ libtool --version
ltmain.sh (GNU libtool) 2.2.6

The attached tarball also contains the script I was running.

Thanks,
Guillaume

Edit: I've tried attaching the file but sourceforge keeps telling me I haven't 
chosen a group when I have. I'm going to submit without the file and try to add 
the file later.

----------------------------------------------------------------------

>Comment By: Guillaume Theoret (smokinn)
Date: 2009-10-15 09:46

Message:
The production server is 64 bit but my test machine isn't.

I tried creating individual tables but it segfaulted again after inserting
~9000 rows.

The program creates tablename_day_num and tablename_week_num as needed and
this is my output:
Inserting
Creating _day_89 for Tue, 31 Mar 2009 00:00:00 -0400
Creating _week_14 for Tue, 31 Mar 2009 00:00:00 -0400
Creating _day_90 for Wed, 01 Apr 2009 00:00:00 -0400
Creating _day_91 for Thu, 02 Apr 2009 00:00:00 -0400
Creating _day_92 for Fri, 03 Apr 2009 00:00:00 -0400
[...]
Creating _day_158 for Mon, 08 Jun 2009 00:00:00 -0400
Creating _week_24 for Mon, 08 Jun 2009 00:00:00 -0400
Creating _day_159 for Tue, 09 Jun 2009 00:00:00 -0400
Creating _day_160 for Wed, 10 Jun 2009 00:00:00 -0400
3000 done - 0.0396% in 31.06 (0.014 avg)
4000 done - 0.0528% in 39.28 (0.0082 avg)
5000 done - 0.066% in 48.28 (0.009 avg)
6000 done - 0.0792% in 54.48 (0.0062 avg)
7000 done - 0.0924% in 60.62 (0.0061 avg)
8000 done - 0.1056% in 67.05 (0.0064 avg)
9000 done - 0.1188% in 73.51 (0.0065 avg)
Segmentation fault (core dumped)

This time mserver5 wasn't using a lot of memory when it segfaulted
(according to top).

I can upload another dump if it's useful.

----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-10-15 02:17

Message:
Well, there is no proof yet, but there is a high chance that you simply
ran out of address space ...

In MonetDB, each column that is accessed is entirely loaded/mapped into
the address space. Hence, all columns that are required during query
processing need to fit into the address space. For inserts and bulk loads,
this means that the whole table (all columns) needs to fit into the address
space. Of course, MonetDB also needs extra space, e.g., for temporary data
and intermediate results.

Thus, splitting your large table into multiple smaller ones should allow
you to load each smaller table individually, as well as to query each table
individually. Also querying multiple/all tables concurrently (e.g., an
on-the-fly union of them) might work, provided you only use a small subset
of all columns per query.

Alternatively, you might want to consider moving to a 64-bit system ...


----------------------------------------------------------------------

Comment By: Guillaume Theoret (smokinn)
Date: 2009-10-14 15:40

Message:
So I ran out of addressable memory space, thanks.

Does MonetDB need the addressable memory space globally or per table?

Basically, if instead of a single table with 7.5MM rows I had 30 tables
of 250k rows, or 4 tables of 1,875,000 rows, would I still have the problem?
The 7.5MM rows are one month's worth of data, so I'd split it by day or week.

Thanks

----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-10-14 15:14

Message:
These are 71 columns, most of them int or decimal, i.e., 4 bytes wide, one
being a char(32); (70*4B+32B)*7.5M = 2340MB ~= 2.3GB.
This comes close to the practical limit on 32-bit systems, especially when
considering address space fragmentation and the fact that 32-bit systems
usually do not allow processes to use the full (theoretical) address space
of 4GB ...
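[Editor's note: the back-of-the-envelope estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the comment; the per-column widths come from the estimate itself, not from MonetDB's actual storage layout.]

```python
# Per-row footprint, as estimated in the comment above:
# 70 columns of 4 bytes (int/decimal) plus one char(32) column.
bytes_per_row = 70 * 4 + 32            # 312 bytes per row
rows = 7_500_000                       # 7.5M rows in the table

total_bytes = bytes_per_row * rows     # whole table, all columns
total_mb = total_bytes / 1_000_000     # decimal megabytes

print(bytes_per_row, total_bytes, total_mb)  # 312 2340000000 2340.0
```

With only 2-3GB of usable per-process address space on a 32-bit system, a 2340MB table plus intermediates leaves little headroom, which is consistent with the HEAPextend failures reported above.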


----------------------------------------------------------------------

Comment By: Guillaume Theoret (smokinn)
Date: 2009-10-14 14:29

Message:
I will run it again and check memory usage before it crashes.

My use case is ~10-15k inserts/updates by cron job every 15 minutes with a
few (< 250-500 / 15 min) sporadic inserts at any time. I could probably
rewrite the cron job to do a batch insert rather than sequential inserts.

As for how wide the table is, this is the table definition:

CREATE TABLE test.aggregated_detail_data_2009 (
  date_added int NOT NULL,
  hash_key char(32) NOT NULL,
  landing_page int NOT NULL,
  domain int NOT NULL,
  ad_domain int NOT NULL,
  ad_unit int NOT NULL,
  ad_path int NOT NULL,
  referer_domain int NOT NULL,
  referer_path int NOT NULL,
  campaign_id int DEFAULT NULL,
  program_id int DEFAULT NULL,
  affiliate_id int NOT NULL DEFAULT '0',
  sub_campaign int NOT NULL,
  os int NOT NULL,
  os_version int NOT NULL,
  browser int NOT NULL,
  browser_version int NOT NULL,
  search_engine int NOT NULL,
  search_keyword int NOT NULL,
  traffic_type int NOT NULL,
  country_code char(2) NOT NULL,
  flash_version_major int NOT NULL,
  flash_version int NOT NULL,
  is_member int DEFAULT NULL,
  member_id int NOT NULL,
  payout_type int NOT NULL,
  payout decimal(14,8) NOT NULL,
  payout_units int NOT NULL,
  num_visits int NOT NULL,
  num_unique_visits int NOT NULL,
  num_impressions int NOT NULL,
  num_unique_impressions int NOT NULL,
  num_clicks int NOT NULL,
  num_unique_clicks int NOT NULL,
  sum_time_on_site int NOT NULL,
  sum_page_views int NOT NULL,
  num_bounce_visits int NOT NULL,
  num_joins int NOT NULL,
  num_attempts_step1 int NOT NULL,
  num_attempts_step2 int NOT NULL,
  num_premium_upgrades int NOT NULL,
  membership_id int NOT NULL,
  movie_id int NOT NULL,
  stop_reason int NOT NULL,
  stop_note int NOT NULL,
  num_premium_stops int NOT NULL,
  num_premium_stops_for_percent int NOT NULL,
  num_premium_starts int NOT NULL,
  num_running_premium int NOT NULL,
  num_running_free int NOT NULL,
  num_running_overdue int NOT NULL,
  num_vod int NOT NULL,
  sum_vod_revenues decimal(10,2) NOT NULL,
  sum_vod_royalties decimal(10,2) NOT NULL,
  num_vod_refunds int NOT NULL,
  sum_vod_refunds decimal(10,2) NOT NULL,
  num_dto int NOT NULL,
  sum_dto_revenues decimal(10,2) NOT NULL,
  sum_dto_royalties decimal(10,2) NOT NULL,
  num_dto_refunds int NOT NULL,
  sum_dto_refunds decimal(10,2) NOT NULL,
  num_svod int NOT NULL,
  sum_svod_revenues decimal(10,2) NOT NULL,
  sum_svod_royalties decimal(10,2) NOT NULL,
  num_svod_refunds int NOT NULL,
  sum_svod_refunds decimal(10,2) NOT NULL,
  num_affiliate_sales int NOT NULL,
  sum_affiliate_payouts decimal(10,2) DEFAULT NULL,
  sum_cpc_payouts decimal(16,8) NOT NULL,
  sum_cpm_payouts decimal(16,8) NOT NULL,
  signup_flow int NOT NULL,
  CONSTRAINT hash_key_detailed UNIQUE (hash_key,date_added)
);


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-10-14 13:33

Message:
While indeed MonetDB should not segfault (will have to investigate and fix
that), I have two comments / questions:

1)
How wide is your 7.5M row table (number and type of columns), i.e., how
much data do you try to load in total?
MonetDB might just run out of address space on your 32-bit system.
How big was mserver5 just before crashing?

2)
Would it be an option for you to have MySQL dump the table into a CSV
file, and then bulk-load the data from there into MonetDB using COPY INTO?
In MonetDB, bulk-loading should be (much) faster than individual insert
statements ...
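[Editor's note: a minimal sketch of the bulk-load route suggested above. The table name, file path, and delimiters below are hypothetical examples; the helper only assembles the COPY INTO statement text and does not talk to a MonetDB server.]

```python
def copy_into_stmt(table, csv_path, field_sep=",", record_sep="\\n", quote='"'):
    """Build a MonetDB COPY INTO statement for loading a CSV dump.

    Delimiter order follows MonetDB's USING DELIMITERS clause:
    field separator, record separator, quote character.
    """
    return (
        f"COPY INTO {table} FROM '{csv_path}' "
        f"USING DELIMITERS '{field_sep}', '{record_sep}', '{quote}';"
    )

# Hypothetical example: load a MySQL CSV dump of the big table.
print(copy_into_stmt("aggregated_detail_data_2009", "/tmp/dump.csv"))
```

The resulting statement would be run once per file (e.g., via mclient), replacing millions of individual INSERTs with a single bulk load.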


----------------------------------------------------------------------

Comment By: Guillaume Theoret (smokinn)
Date: 2009-10-14 13:12

Message:
Sourceforge refuses to let me upload a file. It says success but nothing
happens. I've uploaded the file here:
http://smokinn.com/files/monetdb_crash.tar.gz

----------------------------------------------------------------------


_______________________________________________
Monetdb-bugs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs
