Update: it seems to work with JOIN_OPERATOR_IMPLEMENTATION = IMPROVED. This is the plan:
table col or idx strategy pagecount TA2 TABLE SCAN 36154 TA1 ABORTCODEID JOIN VIA KEY COLUMN 1 TABLE HASHED NO TEMPORARY RESULTS CREATED RESULT IS COPIED , COSTVALUE IS 32918 Kind regards robert 2005/11/24, Robert Klemme <[EMAIL PROTECTED]>: > 2005/11/24, Becker, Holger <[EMAIL PROTECTED]>: > > Hi, > > > > what does the explain plan looks like for the group by statement? > > JOIN_OPERATOR_IMPLEMENTATION = YES > > table col or idx strategy > pagecount > TA1 EQUAL CONDITION FOR KEY 1 > ABORTCODEID (USED KEY COLUMN) > TA2 IDX_HOURLY_APPID JOIN VIA INDEXED COLUMN 36154 > NO TEMPORARY RESULTS CREATED > RESULT IS COPIED , COSTVALUE IS 1783 > > > JOIN_OPERATOR_IMPLEMENTATION = NO > > table col or idx strategy pagecount > TA2 TABLE SCAN 36154 > TA1 ABORTCODEID JOIN VIA KEY COLUMN 1 > RESULT IS COPIED , COSTVALUE IS 45460 > > (also attached as txt for easier reading) > > > Additionally I would like to ask you to switch back to old join > > implementation by setting JOIN_OPERATOR_IMPLEMENTATION to NO. > > Perhaps we can encircle the reason for the problem. > > When doing the group by query with this option set to NO I get 261185 > as result which is an order of magnitude off. If I remove just one > group by column I get the expected 4502345. Now this is getting > stranger and stranger.. > > Kind regards > > robert > > > Kind regards > > Holger > > > > > -----Original Message----- > > > From: Robert Klemme [mailto:[EMAIL PROTECTED] > > > Sent: Donnerstag, 24. November 2005 10:26 > > > To: Becker, Holger > > > Cc: maxdb > > > Subject: Re: Strange results with join and group by > > > > > > 2005/11/24, Becker, Holger <[EMAIL PROTECTED]>: > > > > > From: Robert Klemme wrote: > > > > > > > > > Hi, > > > > > > > > > > we discovered a strange phenomenon with MaxDB 7.5.00.30 > > > on Linux (RHEE > > > > > 4). Consider these two queries: > > > > > > > > > > SELECT TA1.abortcode AS COL1, TA1.description AS COL2, > > > > > SUM(TA2.refcount) AS COL3 > > > > > FROM ag_abortcode TA1, v_ag_hourlymaster TA2 > > > > > WHERE TA1.abortcodeid=TA2.abortcodeid AND ( TA2.applianceid=1 ) > > > > > GROUP BY TA1.abortcode, TA1.description > > > > > > > > > > SELECT count(*) AS COL3, SUM(TA2.refcount) > > > > > FROM ag_abortcode TA1, v_ag_hourlymaster TA2 > > > > > WHERE TA1.abortcodeid=TA2.abortcodeid AND ( TA2.applianceid=1 ) > > > > > > > > > > Same tables, same join condition, same filter criteria > > > and since the > > > > > grouping just returns one row in the first case comparing > > > of results > > > > > is easy. Now here's the strange thing: the second query > > > always returns > > > > > the same (correct) result for SUM(): 4502345. The first > > > query returns > > > > > 4436366 as well as 2921189 and probably other values, > > > too, if we try > > > > > longer - without any change to the data! Earlier 7.5 versions on > > > > > Windows do not exhibit this behavior. Our tests have shown that as > > > > > soon as we add a group by clause with two (!) columns the > > > results are, > > > > > um, unpredictable. > > > > > > > > > > Did anybody see something like this before? I didn't > > > find anything > > > > > similar in the bug database... We'll try to come up with > > > a smaller > > > > > test case (several million rows are involved) but I > > > thought I might > > > > > ask in the meantime whether this is a known issue. > > > > > > > > Hi, > > > > > > > > you could check the setting of the database parameter > > > HASHED_RESULTSET. > > > > This parameter enables a new feature for set functions in joins. > > > > Perhaps there is a problem with this new code. > > > > So if the parameter is set to 'YES' in your database you should > > > > disable the feature by setting the parameter to 'NO' and test > > > > your group by query again. > > > > Please let me kown if this was the reason. > > > > > > No luck here: it is already set to 'NO': > > > > > > fox:CR> param_getvalue HASHED_RESULTSET > > > OK > > > NO > > > fox:CR> param_getfull HASHED_RESULTSET > > > OK > > > string > > > NO > > > NO > > > CHANGE YES > > > INTERN NO > > > MANDATORY YES > > > CLEAR NO > > > DYNAMIC NO > > > CASESENSITIVE NO > > > DEVSPACE NO > > > MODIFY YES > > > GROUP EXTENDED > > > DISPLAYNAME > > > VALUESET > > > MAX > > > MIN > > > INSTANCES > > > LASTKNOWNGOOD NO > > > HELP > > > Enables the use of hashed resultsets (YES/NO) > > > EXPLAIN > > > HASHED_RESULTSET 'YES' or 'NO' > > > > > > 'YES': Hashed resultsets are used for aggregates > > > > > > 'NO' : Hashed resultsets will not be used > > > > > > Hm, what now? This seems like a serious issue to me as one cannot rely > > > on query results being correct... > > > > > > Kind regards > > > > > > robert > > > > > > > > -- MaxDB Discussion Mailing List For list archives: http://lists.mysql.com/maxdb To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]