Bugs item #1830366, was opened at 2007-11-12 11:38
Message generated for change (Comment added) made by nielsnes
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=1830366&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Core
Group: MonetDB Common CVS Head
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Stefan Manegold (stmane)
>Assigned to: Niels Nes (nielsnes)
Summary: gcc 4.1/4.2 produces wrong code on FC6, F8 & RHEL4WS

Initial Comment:
(Triggered by Arjen d.V.'s problems loading module pathfinder after a 
default compilation of the development trunk from CVS on his 64-bit Fedora Core 
6 desktop at CWI; see below.)


Recently (I cannot tell exactly since when; surely only after the latest 
MonetDB_1-20 release branch was created, possibly since the GDK-2 branch was 
merged into the development trunk), gcc seems to generate faulty code for 
gdk_scanselect in a few specific situations:

1)
With gcc 4.1.2 on 64-bit Fedora Core 6, only with 64-bit OIDs and only when 
compiled with "--disable-optimize --disable-debug" (the default in the 
development trunk) resulting in CFLAGS="-g -O2"; using either "--enable-debug" 
(CFLAGS="-g") or "--enable-optimize" (CFLAGS="-O6 ...") does seem to generate 
correct code.

2)
With gcc 4.2.0 on 64-bit Red Hat Enterprise Linux WS release 4 (Nahant Update 
4) on Itanium 2 with 64-bit OIDs (32-bit OIDs not tested), both with 
"--disable-optimize" and with "--enable-optimize" ("--enable-debug" not tested).

(The problem only occurs in the development trunk, but not in the MonetDB_1-20 
release branch.)


The code that triggers the problem is in lines 468-476 of 
MonetDB/src/gdk/gdk_scanselect.mx:
========
 BATloop(b, p, q) {
         ptr v = [EMAIL PROTECTED](bi, p);
         @4 {
                 @[EMAIL PROTECTED](bn, _p, (void *) (@8), @5);
                 _p++;
         }
         @9
 }
========
or once Mx-expanded
========
 BATloop(b, p, q) {
         ptr v = BUNtloc(bi, p);
         if ( simple_EQ(tl ,v,int) ) {
                 lngvoid_bunfastins_nocheck_noinc(bn, _p, (void *) (&oid_ctr), v);
                 _p++;
         }
         oid_ctr++; 
 }
========
and with (some) C macros expanded:
========
 BATloop(b, p, q) {
         ptr v = BUNtloc(bi, p);
         if ( simple_EQ(tl ,v,int) ) {
                 ((lng*)((bn)->H->heap.base))[_p] = *(lng*) (ptr) ((void *) (&oid_ctr));
                 _p++;
         }
         oid_ctr++;
 }
========

The faulty code appears to perform the increment "oid_ctr++;" before the 
if-body has dereferenced "&oid_ctr" and read the value.
See these tests for details:
http://monetdb.cwi.nl/testing/projects/monetdb/Current/MonetDB4/.mTests103/GNU.64.64.d-Fedora6/src_gdk/scanselect.out.00.html
http://monetdb.cwi.nl/testing/projects/monetdb/Current/MonetDB5/.mTests103/GNU.64.64.d-Fedora6/tests_gdkTests/scanselect.out.00.html
(Last night, I had the TestWeb compiled with "--disable-optimize" instead of 
the usual "--enable-optimize".)
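
For reference, the behaviour the loop is expected to have can be sketched in
plain C. This is only an illustration under stated assumptions, not the GDK
code: the names sketch_oid and sketch_scanselect_eq are hypothetical, the OID
is assumed to be an unsigned 64-bit integer, and the BAT is reduced to a plain
int array. The point is that each hit must store the counter value as it was
before the increment at the end of the iteration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit OID; not the actual GDK typedef. */
typedef uint64_t sketch_oid;

/*
 * Scan tail[0..n) for values equal to key. For every hit, store the
 * current position counter in head_out and advance the output cursor.
 * Returns the number of hits written.
 */
size_t sketch_scanselect_eq(const int *tail, size_t n, int key,
                            sketch_oid first_oid, sketch_oid *head_out)
{
    sketch_oid oid_ctr = first_oid; /* OID of the current input row */
    size_t p_out = 0;               /* plays the role of _p in the report */

    for (size_t i = 0; i < n; i++) {
        if (tail[i] == key) {
            /* Must observe oid_ctr before the increment below. */
            head_out[p_out] = oid_ctr;
            p_out++;
        }
        oid_ctr++;
    }
    return p_out;
}
```

With input {7, 3, 7, 7, 1}, key 7 and a first OID of 100, a correct build
stores 100, 102 and 103. If the increment were moved ahead of the read, as
described above, the stored values would each be one too high.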


Additionally, this problem makes loading the pathfinder module fail with:
========
MonetDB>module(pathfinder); 
# PF/Tijah module v0.3.0 loaded. http://dbappl.cs.utwente.nl/pftijah
!ERROR: BATfetchjoin(tmp_221,ws_nme) does not hit always (2) (|bn|=57 != 58=|l|) => can't use fetchjoin.
!ERROR: CMDleftfetchjoin: operation failed.
!ERROR: interpret_params: rename(param 1): evaluation error.
========


Explicitly assigning "oid oid_val = oid_ctr;" in the if-body and then replacing 
"&oid_ctr" by "&oid_val" seems to "solve" (prevent) the problem.

However, I'm hesitant to apply this patch since (1) the original code seems 
correct to me and (2) the problem occurs only in selected cases --- although 
these include the default configuration on Fedora Core 6.
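
The workaround described above can be sketched as follows. This is a minimal
illustration, not the actual patch: wk_oid, wk_lng, wk_store and
wk_scanselect_eq are hypothetical names, and wk_store stands in for the
bunfastins macro. It copies the bytes with memcpy rather than with the macro's
cast-and-dereference, which is a substitution made here so that the sketch is
well-defined C on its own.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the GDK oid and lng types. */
typedef uint64_t wk_oid;
typedef int64_t  wk_lng;

/* Stand-in for the insert macro: copy one 8-byte value from src
 * into slot idx of the output column. */
static void wk_store(wk_lng *base, size_t idx, const void *src)
{
    memcpy(&base[idx], src, sizeof(wk_lng));
}

size_t wk_scanselect_eq(const int *tail, size_t n, int key,
                        wk_oid first_oid, wk_lng *head_out)
{
    wk_oid oid_ctr = first_oid;
    size_t p_out = 0;

    for (size_t i = 0; i < n; i++) {
        if (tail[i] == key) {
            /* Snapshot the counter in a local and pass the address of
             * the copy, so the later oid_ctr++ cannot affect the store. */
            wk_oid oid_val = oid_ctr;
            wk_store(head_out, p_out, &oid_val);
            p_out++;
        }
        oid_ctr++;
    }
    return p_out;
}
```

The local copy is never modified after it is taken, so the value written for
a hit does not depend on when the compiler schedules the increment of oid_ctr.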

Any comments and/or help is more than welcome!


----------------------------------------------------------------------

>Comment By: Niels Nes (nielsnes)
Date: 2008-01-11 09:32

Message:
Logged In: YES 
user_id=43556
Originator: NO

We worked around this gcc bug. It's hard to reduce this problem to a small,
simple example, so we won't send a bug report to the gcc community.

----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2008-01-07 16:31

Message:
Logged In: YES 
user_id=572415
Originator: YES

With the default gcc 4.1.2 on Fedora 8 (basically the same as on Fedora
Core 6) the problem still persists.


----------------------------------------------------------------------

Comment By: Sjoerd Mullender (sjoerd)
Date: 2007-12-21 12:46

Message:
Logged In: YES 
user_id=43607
Originator: NO

Since Fedora Core 6 has reached EOL and our systems are slowly but surely
being upgraded to Fedora 8, and since the bug seems to be in the compiler,
it is unlikely that we're going to fix this.

----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2007-12-18 10:31

Message:
Logged In: YES 
user_id=572415
Originator: YES

This (or a similar gcc 4.1.* related) problem might affect Pathfinder
testing on 64-bit Fedora Core 6 with 64-bit OIDs also with
--enable-optimize;
see
http://monetdb.cwi.nl/testing/projects/monetdb/Current/pathfinder/.mTests103/index_short.html
and (e.g.)
http://monetdb.cwi.nl/testing/projects/monetdb/Current/pathfinder/.mTests103/GNU.64.64.d-Fedora6/benchmarks_MBench/qr02.out.00.html


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2007-11-13 10:26

Message:
Logged In: YES 
user_id=572415
Originator: YES

Just for the record, the following combinations seem to work fine:
gcc 4.1.2 on 32-bit Debian 4.0
gcc 4.1.1 on 32-bit Fedora Core 6
gcc 4.2.2 on 32-bit Gentoo 1.12.9
gcc 4.1.2 on 64-bit Fedora Core 6 using 32-bit OIDs


----------------------------------------------------------------------

Comment By: Fabian (mr-meltdown)
Date: 2007-11-12 13:41

Message:
Logged In: YES 
user_id=963970
Originator: NO

Using
CFLAGS="-march=athlon64 -pipe -g -W -Wall"
./configure --disable-strict --disable-optimize
does indeed remedy the problem on 4.2.2 as well.

----------------------------------------------------------------------

Comment By: Fabian (mr-meltdown)
Date: 2007-11-12 13:33

Message:
Logged In: YES 
user_id=963970
Originator: NO

I tested with gcc 4.1.2 and gcc 4.2.2 and get exactly the same behaviour,
i.e. significantly differing output.

----------------------------------------------------------------------

Comment By: Fabian (mr-meltdown)
Date: 2007-11-12 12:13

Message:
Logged In: YES 
user_id=963970
Originator: NO

Maybe related:

64      =sys-devel/gcc-4.2.0*
65      =sys-devel/gcc-4.2.1*

are "too broken"; GCC upstream fixed many issues in 4.2.2, so it is worth
trying that version on Itanium.

Gentoo has 4.2.2 installed (iirc); at least I use 4.2.2 on my desktop.  I
will try to compile with 4.1.2 on (still) FC6 with --disable-optimize
--disable-debug and see if I get the same issue.

----------------------------------------------------------------------

_______________________________________________
Monetdb-bugs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs