[ 
https://issues.apache.org/jira/browse/DERBY-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harshvardhan Gupta updated DERBY-6940:
--------------------------------------
    Attachment: DERBY-6940_2.diff

I am attaching an updated diff that now collects extra statistics whenever an 
index is created or altered. Statistics can be updated in three ways:

1) Creating a new index.
2) Altering an index.
3) Explicitly calling the SYSCS_UPDATE_STATISTICS system procedure.

The current patch collects the extra statistics in all of the above cases. I am 
currently working on the upgrade logic, after which we can look at ways to use 
the extra statistics in selectivity estimates.
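
For reference, the third path (the explicit system procedure call) can be 
driven over JDBC roughly as follows. This is only a minimal sketch, not part 
of the attached patch: the class and method names are made up for 
illustration, and the schema and table names in the usage note are 
placeholders. It assumes the procedure is invoked as 
SYSCS_UTIL.SYSCS_UPDATE_STATISTICS(schema, table, index), where a NULL index 
name refreshes every index on the table.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical helper class, for illustration only.
final class UpdateStatisticsSketch {
    // Three IN parameters: schema name, table name, index name.
    static final String CALL_SQL =
        "CALL SYSCS_UTIL.SYSCS_UPDATE_STATISTICS(?, ?, ?)";

    // Refreshes statistics for one index, or for all indexes on the
    // table when indexOrNull is null.
    static void updateStatistics(Connection conn, String schema,
                                 String table, String indexOrNull)
            throws SQLException {
        try (CallableStatement cs = conn.prepareCall(CALL_SQL)) {
            cs.setString(1, schema);
            cs.setString(2, table);
            if (indexOrNull == null) {
                cs.setNull(3, Types.VARCHAR);
            } else {
                cs.setString(3, indexOrNull);
            }
            cs.execute();
        }
    }
}
```

With an open Derby connection, a call such as 
updateStatistics(conn, "APP", "ORDERS", null) would refresh statistics for 
all indexes on a (placeholder) table APP.ORDERS.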

A soft upgrade can be handled by turning off extra statistics collection and 
by not reading them from or writing them to the store. For a hard upgrade, 
however, there is more than one possible strategy.

Options:
1) We update all the existing indexes and add the extra statistics at upgrade 
time.
2) We wait for the user to explicitly call UPDATE_STATISTICS or alter an index 
before collecting the extra statistics and writing them to the store.
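
To make the trade-off concrete, the two hard-upgrade options can be 
contrasted in a small sketch. Every name below is hypothetical and none of it 
comes from Derby or from the attached patch; it only models which indexes 
would be refreshed at upgrade time under each option, and assumes that a 
soft-upgraded database keeps extra statistics switched off.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the upgrade choices, for discussion only.
final class ExtraStatsUpgradeSketch {
    enum HardUpgradePolicy {
        EAGER_AT_UPGRADE,      // option 1: refresh every index during upgrade
        LAZY_ON_USER_REQUEST   // option 2: wait for an explicit user action
    }

    // Soft upgrade keeps the old store format, so extra statistics are
    // neither collected nor read or written; only a hard upgrade enables them.
    static boolean extraStatsEnabled(boolean hardUpgraded) {
        return hardUpgraded;
    }

    // Indexes whose extra statistics would be built during the upgrade itself.
    static List<String> indexesToRefreshAtUpgrade(HardUpgradePolicy policy,
                                                  List<String> existingIndexes) {
        if (policy == HardUpgradePolicy.EAGER_AT_UPGRADE) {
            return new ArrayList<>(existingIndexes);
        }
        // Lazy policy: nothing happens at upgrade time.
        return new ArrayList<>();
    }
}
```

Under option 1 the upgrade cost grows with the number of existing indexes, 
while under option 2 an index has no extra statistics until the user 
refreshes it.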

I would like to know which approach the community thinks is best. Please also 
share any other strategy you have in mind.

> Enhance derby statistics for more accurate selectivity estimates.
> -----------------------------------------------------------------
>
>                 Key: DERBY-6940
>                 URL: https://issues.apache.org/jira/browse/DERBY-6940
>             Project: Derby
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Harshvardhan Gupta
>            Assignee: Harshvardhan Gupta
>            Priority: Minor
>         Attachments: DERBY-6940_2.diff, derby-6940.diff
>
>
> Derby should collect extra statistics at index build time and at statistics 
> refresh time, which will help the optimizer make more precise selectivity 
> estimates and choose better execution paths.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
