Bugs item #2879117 was opened at 2009-10-14 13:04
Message generated for change (Comment added) made by smokinn
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2879117&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: SQL/Core
Group: MonetDB5 "stable"
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Guillaume Theoret (smokinn)
Assigned to: Niels Nes (nielsnes)
Summary: Consistent segmentation fault

Initial Comment:
It seems I can consistently crash MonetDB by running inserts in a while loop.

I have a script that reads 7.5MM rows from MySQL and runs the inserts on 
MonetDB. After about 24 hours the database segfaults.

The messages my script output before crashing:
Run 1:
6380000 done - 84.2207% in 86083.58 (0.0109 avg)
Segmentation fault

Run 2:
6380000 done - 84.2207% in 85957.17 (0.0107 avg)
Segmentation fault (core dumped)

I also log all get_last_error messages, if there are any, and got these two 
messages just before the crash:

MALException:!ERROR: HEAPextend: failed to extend to 244108529 for 07/744theap
MALException:!ERROR: HEAPextend: failed to extend to 244108529 for 07/744theap

On the second run I installed Apport to get a dump that I've attached.

I'm running Ubuntu 9.04 Jaunty on a 32-bit machine with 2GB of RAM. I compiled 
from the bz2 tarball linked here: 
http://monetdb.cwi.nl/projects/monetdb//Download/ ( 
http://monetdb.cwi.nl/downloads/sources/Aug2009-SP1/MonetDB-Aug2009-SuperBall-SP1.tar.bz2
 ). The crash occurs both when compiling with and without optimizations, and 
with and without debug.

guilla...@dev3:~$ gcc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.3.3-5ubuntu4' 
--with-bugurl=file:///usr/share/doc/gcc-4.3/README.Bugs 
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared 
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext 
--enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.3 
--program-suffix=-4.3 --enable-clocale=gnu --enable-libstdcxx-debug 
--enable-objc-gc --enable-mpfr --enable-targets=all --with-tune=generic 
--enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu 
--target=i486-linux-gnu
Thread model: posix
gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4)

guilla...@dev3:~$ bison -V
bison (GNU Bison) 2.3

guilla...@dev3:~$ yacc -V
bison (GNU Bison) 2.3

guilla...@dev3:~$ autoconf -V
autoconf (GNU Autoconf) 2.63

guilla...@dev3:~$ automake --version
automake (GNU automake) 1.10.2

guilla...@dev3:~$ libtool --version
ltmain.sh (GNU libtool) 2.2.6

The attached tarball also contains the script I was running.

Thanks,
Guillaume

Edit: I've tried attaching the file but sourceforge keeps telling me I haven't 
chosen a group when I have. I'm going to submit without the file and try to add 
the file later.

----------------------------------------------------------------------

>Comment By: Guillaume Theoret (smokinn)
Date: 2009-10-14 15:40

Message:
So I ran out of addressable memory space, thanks.

Does MonetDB need the addressable memory space globally or per table?

Basically, if instead of a single table with 7.5MM rows I had 30 tables
of 250k rows, or 4 tables of 1,875,000 rows, would I still have the problem?
The 7.5MM rows are 1 month's worth of data, so I'd split it by day or by week.

Thanks

----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-10-14 15:14

Message:
These are 71 columns, most of them int or decimal, i.e., 4 bytes wide, and one
being a char(32); (70*4B+32B)*7.5M = 2340MB ~= 2.3GB.
This comes close to the practical limit on 32-bit systems, especially when
considering address-space fragmentation and the fact that 32-bit systems
usually do not allow processes to use the full (theoretical) address space
of 4GB ...
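The back-of-the-envelope estimate above can be sketched in a few lines (the per-column widths are the same assumptions as in the estimate, not MonetDB's exact storage layout):

```python
# Rough footprint of the 7.5M-row table: 70 columns assumed 4 bytes wide
# (int/decimal) plus one char(32) column, following the estimate above.
ROWS = 7_500_000
INT_COLS = 70          # int/decimal columns, assumed 4 bytes each
CHAR32_BYTES = 32      # the hash_key char(32) column

bytes_per_row = INT_COLS * 4 + CHAR32_BYTES   # 312 bytes
total_bytes = ROWS * bytes_per_row            # 2,340,000,000 bytes
total_mb = total_bytes / 1_000_000            # 2340.0 MB, i.e. ~2.3 GB
```

On a 32-bit process this leaves little headroom out of the (at most) 4GB of address space, which matches the HEAPextend failures in the log.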


----------------------------------------------------------------------

Comment By: Guillaume Theoret (smokinn)
Date: 2009-10-14 14:29

Message:
I will run it again and check memory usage before it crashes.

My use case is ~10-15k inserts/updates via a cron job every 15 minutes, plus a
few (< 250-500 / 15 min) sporadic inserts at any time. I could probably
rewrite the cron job to do a batch insert rather than sequential inserts.
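Batching could be as simple as collapsing the cron job's rows into one multi-row INSERT statement. A minimal sketch (the helper, column list, and row values are hypothetical; real code should use a client library's parameter binding instead of string building to avoid SQL injection):

```python
def build_batch_insert(table, columns, rows):
    """Build a single multi-row INSERT statement (hypothetical helper).

    Assumes the server accepts the multi-row VALUES form; strings are
    quoted naively, which is fine for a sketch but not for production.
    """
    cols = ", ".join(columns)

    def literal(v):
        if isinstance(v, str):
            return "'" + v.replace("'", "''") + "'"
        return str(v)

    values = ", ".join(
        "(" + ", ".join(literal(v) for v in row) + ")" for row in rows
    )
    return f"INSERT INTO {table} ({cols}) VALUES {values};"

# Example with a few of the table's columns (placeholder data):
sql = build_batch_insert(
    "test.aggregated_detail_data_2009",
    ["date_added", "hash_key", "num_visits"],
    [(1255500000, "abc", 3), (1255500060, "def", 5)],
)
```

Running the whole batch in a single statement (or at least a single transaction) avoids paying per-statement overhead 10-15k times per cron run.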

As for how wide the table is, this is the table definition:

CREATE TABLE test.aggregated_detail_data_2009 (
  date_added int NOT NULL,
  hash_key char(32) NOT NULL,
  landing_page int NOT NULL,
  domain int NOT NULL,
  ad_domain int NOT NULL,
  ad_unit int NOT NULL,
  ad_path int NOT NULL,
  referer_domain int NOT NULL,
  referer_path int NOT NULL,
  campaign_id int DEFAULT NULL,
  program_id int DEFAULT NULL,
  affiliate_id int NOT NULL DEFAULT '0',
  sub_campaign int NOT NULL,
  os int NOT NULL,
  os_version int NOT NULL,
  browser int NOT NULL,
  browser_version int NOT NULL,
  search_engine int NOT NULL,
  search_keyword int NOT NULL,
  traffic_type int NOT NULL,
  country_code char(2) NOT NULL,
  flash_version_major int NOT NULL,
  flash_version int NOT NULL,
  is_member int DEFAULT NULL,
  member_id int NOT NULL,
  payout_type int NOT NULL,
  payout decimal(14,8) NOT NULL,
  payout_units int NOT NULL,
  num_visits int NOT NULL,
  num_unique_visits int NOT NULL,
  num_impressions int NOT NULL,
  num_unique_impressions int NOT NULL,
  num_clicks int NOT NULL,
  num_unique_clicks int NOT NULL,
  sum_time_on_site int NOT NULL,
  sum_page_views int NOT NULL,
  num_bounce_visits int NOT NULL,
  num_joins int NOT NULL,
  num_attempts_step1 int NOT NULL,
  num_attempts_step2 int NOT NULL,
  num_premium_upgrades int NOT NULL,
  membership_id int NOT NULL,
  movie_id int NOT NULL,
  stop_reason int NOT NULL,
  stop_note int NOT NULL,
  num_premium_stops int NOT NULL,
  num_premium_stops_for_percent int NOT NULL,
  num_premium_starts int NOT NULL,
  num_running_premium int NOT NULL,
  num_running_free int NOT NULL,
  num_running_overdue int NOT NULL,
  num_vod int NOT NULL,
  sum_vod_revenues decimal(10,2) NOT NULL,
  sum_vod_royalties decimal(10,2) NOT NULL,
  num_vod_refunds int NOT NULL,
  sum_vod_refunds decimal(10,2) NOT NULL,
  num_dto int NOT NULL,
  sum_dto_revenues decimal(10,2) NOT NULL,
  sum_dto_royalties decimal(10,2) NOT NULL,
  num_dto_refunds int NOT NULL,
  sum_dto_refunds decimal(10,2) NOT NULL,
  num_svod int NOT NULL,
  sum_svod_revenues decimal(10,2) NOT NULL,
  sum_svod_royalties decimal(10,2) NOT NULL,
  num_svod_refunds int NOT NULL,
  sum_svod_refunds decimal(10,2) NOT NULL,
  num_affiliate_sales int NOT NULL,
  sum_affiliate_payouts decimal(10,2) DEFAULT NULL,
  sum_cpc_payouts decimal(16,8) NOT NULL,
  sum_cpm_payouts decimal(16,8) NOT NULL,
  signup_flow int NOT NULL,
  CONSTRAINT hash_key_detailed UNIQUE (hash_key,date_added)
);


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-10-14 13:33

Message:
While indeed MonetDB should not segfault (will have to investigate and fix
that), I have two comments / questions:

1)
How wide is your 7.5M row table (number and type of columns), i.e., how
much data do you try to load in total?
MonetDB might just run out of address space on your 32-bit system.
How big was mserver5 just before crashing?

2)
Would it be an option for you to have MySQL dump the table into a CSV
file, and then bulk-load the data from there into MonetDB using COPY INTO?
In MonetDB, bulk-loading should be (much) faster than individual insert
statements ...
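The dump-and-load path suggested here could look roughly like this (a sketch: the rows, file path, and record count are placeholders, and the COPY INTO statement is meant to be run from MonetDB's SQL client, e.g. mclient):

```python
import csv
import os
import tempfile

# Rows as fetched from MySQL (placeholder data standing in for the 7.5M rows).
rows = [(1255500000, "abc", 3), (1255500060, "def", 5)]

# Step 1: dump to a CSV file.
path = os.path.join(tempfile.gettempdir(), "aggregated_detail_data_2009.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Step 2: bulk-load from the CSV in one statement via the MonetDB SQL client.
# Giving the record count up front lets MonetDB preallocate the columns.
copy_stmt = (
    "COPY 2 RECORDS INTO test.aggregated_detail_data_2009 "
    f"FROM '{path}' USING DELIMITERS ',', '\\n';"
)
```

Compared to millions of individual INSERTs, a single COPY INTO also sidesteps per-statement parsing and transaction overhead.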


----------------------------------------------------------------------

Comment By: Guillaume Theoret (smokinn)
Date: 2009-10-14 13:12

Message:
Sourceforge refuses to let me upload a file. It says success but nothing
happens. I've uploaded the file here:
http://smokinn.com/files/monetdb_crash.tar.gz

----------------------------------------------------------------------


_______________________________________________
Monetdb-bugs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs
