2005/11/24, Becker, Holger <[EMAIL PROTECTED]>:
> Hi,
>
> what does the explain plan looks like for the group by statement?

JOIN_OPERATOR_IMPLEMENTATION = YES

table   col or idx              strategy                              
 pagecount
TA1                             EQUAL CONDITION FOR KEY                 1
        ABORTCODEID                  (USED KEY COLUMN)
TA2     IDX_HOURLY_APPID        JOIN VIA INDEXED COLUMN                 36154
                                NO TEMPORARY RESULTS CREATED
                                RESULT IS COPIED   , COSTVALUE IS       1783


JOIN_OPERATOR_IMPLEMENTATION = NO

table   col or idx      strategy                                pagecount
TA2                     TABLE SCAN                              36154
TA1     ABORTCODEID     JOIN VIA KEY COLUMN                     1
                        RESULT IS COPIED   , COSTVALUE IS       45460

(also attached as txt for easier reading)

> Additionally I would like to ask you to switch back to old join
> implementation by setting JOIN_OPERATOR_IMPLEMENTATION to NO.
> Perhaps we can encircle the reason for the problem.

When doing the group by query with this option set to NO  I get 261185
as result which is an order of magnitude off.  If I remove just one
group by column I get the expected 4502345.  Now this is getting
stranger and stranger..

Kind regards

robert

> Kind regards
> Holger
>
> > -----Original Message-----
> > From: Robert Klemme [mailto:[EMAIL PROTECTED]
> > Sent: Donnerstag, 24. November 2005 10:26
> > To: Becker, Holger
> > Cc: maxdb
> > Subject: Re: Strange results with join and group by
> >
> > 2005/11/24, Becker, Holger <[EMAIL PROTECTED]>:
> > > > From: Robert Klemme wrote:
> > >
> > > > Hi,
> > > >
> > > > we discovered a strange phenomenon with MaxDB 7.5.00.30
> > on Linux (RHEE
> > > > 4).  Consider these two queries:
> > > >
> > > > SELECT TA1.abortcode AS COL1, TA1.description AS COL2,
> > > > SUM(TA2.refcount) AS COL3
> > > > FROM ag_abortcode TA1, v_ag_hourlymaster TA2
> > > > WHERE TA1.abortcodeid=TA2.abortcodeid AND ( TA2.applianceid=1 )
> > > > GROUP BY TA1.abortcode, TA1.description
> > > >
> > > > SELECT count(*) AS COL3, SUM(TA2.refcount)
> > > > FROM ag_abortcode TA1, v_ag_hourlymaster TA2
> > > > WHERE TA1.abortcodeid=TA2.abortcodeid AND ( TA2.applianceid=1 )
> > > >
> > > > Same tables, same join condition, same filter criteria
> > and since the
> > > > grouping just returns one row in the first case comparing
> > of results
> > > > is easy. Now here's the strange thing: the second query
> > always returns
> > > > the same (correct) result for SUM(): 4502345. The first
> > query returns
> > > > 4436366 as well as 2921189 and probably other values,
> > too, if we try
> > > > longer - without any change to the data!  Earlier 7.5 versions on
> > > > Windows do not exhibit this behavior. Our tests have shown that as
> > > > soon as we add a group by clause with two (!) columns the
> > results are,
> > > > um, unpredictable.
> > > >
> > > > Did anybody see something like this before?  I didn't
> > find anything
> > > > similar in the bug database...  We'll try to come up with
> > a smaller
> > > > test case (several million rows are involved) but I
> > thought I might
> > > > ask in the meantime whether this is a known issue.
> > >
> > > Hi,
> > >
> > > you could check the setting of the database parameter
> > HASHED_RESULTSET.
> > > This parameter enables a new feature for set functions in joins.
> > > Perhaps there is a problem with this new code.
> > > So if the parameter is set to 'YES' in your database you should
> > > disable the feature by setting the parameter to 'NO' and test
> > > your group by query again.
> > > Please let me kown if this was the reason.
> >
> > No luck here: it is already set to 'NO':
> >
> > fox:CR> param_getvalue HASHED_RESULTSET
> > OK
> > NO
> > fox:CR> param_getfull HASHED_RESULTSET
> > OK
> > string
> > NO
> > NO
> > CHANGE        YES
> > INTERN        NO
> > MANDATORY     YES
> > CLEAR NO
> > DYNAMIC       NO
> > CASESENSITIVE NO
> > DEVSPACE      NO
> > MODIFY        YES
> > GROUP EXTENDED
> > DISPLAYNAME
> > VALUESET
> > MAX
> > MIN
> > INSTANCES
> > LASTKNOWNGOOD NO
> > HELP
> > Enables the use of hashed resultsets (YES/NO)
> > EXPLAIN
> >  HASHED_RESULTSET 'YES' or 'NO'
> >
> >  'YES': Hashed resultsets are used for aggregates
> >
> >  'NO' : Hashed resultsets will not be used
> >
> > Hm, what now? This seems like a serious issue to me as one cannot rely
> > on query results being correct...
> >
> > Kind regards
> >
> > robert
> >
>
JOIN_OPERATOR_IMPLEMENTATION = YES

table   col or idx              strategy                                
pagecount
TA1                             EQUAL CONDITION FOR KEY                 1
        ABORTCODEID                  (USED KEY COLUMN)
TA2     IDX_HOURLY_APPID        JOIN VIA INDEXED COLUMN                 36154
                                NO TEMPORARY RESULTS CREATED
                                RESULT IS COPIED   , COSTVALUE IS       1783

JOIN_OPERATOR_IMPLEMENTATION = NO

table   col or idx      strategy                                pagecount
TA2                     TABLE SCAN                              36154
TA1     ABORTCODEID     JOIN VIA KEY COLUMN                     1
                        RESULT IS COPIED   , COSTVALUE IS       45460
-- 
MaxDB Discussion Mailing List
For list archives: http://lists.mysql.com/maxdb
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to