[jira] [Created] (DERBY-6967) Query Optimizer picks Table Scan based on WHERE predicate, ignores expensive ORDER BY.

Alex Vasiliev (JIRA) Wed, 04 Oct 2017 20:41:03 -0700

Alex Vasiliev created DERBY-6967:
------------------------------------

             Summary: Query Optimizer picks Table Scan based on WHERE 
predicate, ignores expensive ORDER BY. 
                 Key: DERBY-6967
                 URL: https://issues.apache.org/jira/browse/DERBY-6967
             Project: Derby
          Issue Type: Bug
    Affects Versions: 10.13.1.1
            Reporter: Alex Vasiliev



h5. Problem:

It is appeared that query optimizer mises dramatically for a fairly simple 
query:
{code:sql}
SELECT * FROM Customer 
WHERE ID >= 0 
ORDER BY ID 
-- optionally: FETCH NEXT 2 ROWS ONLY;
{code}
The "ID >= 0" predicate is expected to return almost entire table (true), so 
optimizer decides to utilize Table Scan to avoid having to resolve index row 
references, but it should take into account that then it need to order the 
result set (sort), which is a way more expensive operation. Especially if it's 
followed by, let's say, "FETCH NEXT 2 ROWS ONLY" which would explicitly 
indicate that there is no need to iterate through the entire table.

h5. Steps to recreate:
{code:sql}
CREATE TABLE Customer (
  ID    BIGINT NOT NULL GENERATED ALWAYS AS IDENTITY,
  Name  VARCHAR(1024) NOT NULL
);

INSERT INTO Customer (Name) VALUES ('Alex'), ('Peter'), ('Bob'), ('Fred');

CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1);

SELECT * FROM Customer WHERE ID >= 0 ORDER BY ID;

VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();
{code}

h5. Result:
{code}
Statement Name: 
        null
Statement Text: 
        SELECT * FROM Customer WHERE ID >= 0 ORDER BY ID
Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 0
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : null
End Execution Timestamp : null
Statement Execution Plan Text: 
Sort ResultSet:
Number of opens = 1
Rows input = 4
Rows returned = 4
Eliminate duplicates = false
In sorted order = false
Sort information: 
        Number of rows input=4
        Number of rows output=4
        Sort type=internal
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
        optimizer estimated row count: 2.97
        optimizer estimated cost: 39.32
Source result set:
        Table Scan ResultSet for CUSTOMER at read committed isolation level 
using instantaneous share row locking chosen by the optimizer
        Number of opens = 1
        Rows seen = 4
        Rows filtered = 0
        Fetch Size = 16
                constructor time (milliseconds) = 0
                open time (milliseconds) = 0
                next time (milliseconds) = 0
                close time (milliseconds) = 0
                next time in milliseconds/row = 0

        scan information:
                Bit set of columns fetched=All
                Number of columns fetched=2
                Number of pages visited=1
                Number of rows qualified=4
                Number of rows visited=4
                Scan type=heap
                start position:
                        null
                stop position:
                        null
                qualifiers:
                        Column[0][0] Id: 0
                        Operator: <
                        Ordered nulls: false
                        Unknown return value: true
                        Negate comparison result: true
                optimizer estimated row count: 2.97
                optimizer estimated cost: 39.32
{code}

^ Note that Table Scan is picked and not an index over ID.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Created] (DERBY-6967) Query Optimizer picks Table Scan based on WHERE predicate, ignores expensive ORDER BY.

Reply via email to