Bugs item #2686008, was opened at 2009-03-13 09:43
Message generated for change (Settings changed) made by nielsnes
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2686008&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: SQL/Core
Group: SQL "stable"
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Roberto Cornacchia (cornuz)
Assigned to: Niels Nes (nielsnes)
Summary: SQL: multi-attribute GROUP BY may ignore sortedness

Initial Comment:
-- A table with unsorted values in a and b
create table unsorted (a int,b int);
insert into unsorted values (2, 3);
insert into unsorted values (1, 2);
insert into unsorted values (4, 1);
insert into unsorted values (3, 2);
insert into unsorted values (2, 3);
insert into unsorted values (3, 3);
insert into unsorted values (3, 1);
insert into unsorted values (4, 3);

-- Store it in a new table with tuples sorted on a,b
create table sorted (a int, b int);
insert into sorted
select * from unsorted
order by a,b;

-- these tho are semantically equivalent (the group by attributes are swapped)
trace select a,b from sorted group by a,b;
trace select a,b from sorted group by b,a;

The first version is much faster, as it exploits the sortedness on a,b when 
grouping on a,b.
The second version identifies the same groups. Therefore, the sortedness on a,b 
could be exploited when grouping on b,a, but it isn't.

Even on such a small example, the trace shows already a performance difference, 
which becomes dramatic on larger data.

In general, when a table is sorted on a1,a2,...,aN ,
grouping on any permutation of a1,a2,...,aK with K<=N should exploit sortedness.



----------------------------------------------------------------------

>Comment By: Niels Nes (nielsnes)
Date: 2009-11-20 13:05

Message:
added a test to
src/test/BugTracker-2009/Tests/use_order_column_first.SF-2686008.sql

fixed by an optimizer which checks for sorted columns in the group by
(only works when we can easily derive the base table)

----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-03-13 09:51

Message:
Good point --- while performance can be considered bugs, a solution might
only become available as new feature in the CVS HEAD and hence in the next
feature release --- Martin's recently added "derivepath" MAL optimizer
provides a framework that can be exploited to do the proposed (and other,
e.g., exploit also uniqueness) optimization for multi-attribute grouping,
very similar to what the "joinpath" optimizer does for (binary- / mapping-
/ pivot-) join paths.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2686008&group_id=56967

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
_______________________________________________
Monetdb-bugs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs

Reply via email to