[
https://issues.apache.org/jira/browse/DERBY-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kristian Waagan updated DERBY-4938:
-----------------------------------
Attachment: derby-4938-1a-istat_scheduling.diff
derby-4938-1a-istat_scheduling.stat
Attaching patch 1a, which adds the initial scheduling logic.
Creation or update of the index cardinality statistics will only happen for
prepared statements, and only when the query involves an access path using an
index. In addition, there are thresholds that have to be reached/exceeded before
an update is scheduled. These thresholds may have to be tweaked after a period
of testing.
Note that DERBY-4939 has to be committed before the autostats are enabled, but
here are some comments from DERBY-4771 about the available debug knobs for this
feature:
-----
a) derby.storage.indexStats.debug.createThreshold (100)
b) derby.storage.indexStats.debug.absdiffThreshold (1000)
c) derby.storage.indexStats.debug.lndiffThreshold (1.0)
d) derby.storage.indexStats.debug.queueSize (5)
(a) determines how big a table must be before statistics are automatically
created. (b) determines how big the discrepancy between the row estimates for
the table and the index must be before the statistics are updated. (c)
determines how big the logarithmic difference (natural logarithm) between
those estimates must be before the statistics are updated. The values of these
properties are printed if tracing is turned on. Now:
Q: I don't understand these properties!
A: Read the code ;)
These properties are made available for experimentation and debugging
only. (a)-(c) affect when statistics are created or updated, and are used in
TableDescriptor. (d) is only used in IndexStatisticsDaemonImpl.
Q: Why have both (a) and (b)?
A: Purely for debugging and experimentation. If these properties are included
in production code, I expect they can be folded into one.
Q: Why have both (b) and (c)?
A: In general (c) will decide if the statistics are updated. However, for
small tables (c) will cause frequent updates of the statistics. For small
tables accurate statistics are not needed for good performance [1], so
there is no reason to frequently update the stats. This is where (b) comes
into play.
[1] One exception might be if the rows are huge.
-----
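To make the interplay of (a), (b) and (c) concrete, here is a minimal sketch of
how such a check could look. It is not the code in TableDescriptor: the class
and method names are hypothetical, the comparison operators are assumptions,
and it assumes that both (b) and (c) must be exceeded before an update is
scheduled, which is how I read the Q&A above. Only the property names and
default values are taken from the list above.

```java
/** Hypothetical sketch of the threshold checks; not Derby's actual code. */
final class IstatThresholdSketch {
    // Property names and defaults as listed in (a)-(c) above.
    static final long CREATE_THRESHOLD =
            Long.getLong("derby.storage.indexStats.debug.createThreshold", 100L);
    static final long ABSDIFF_THRESHOLD =
            Long.getLong("derby.storage.indexStats.debug.absdiffThreshold", 1000L);
    static final double LNDIFF_THRESHOLD = Double.parseDouble(
            System.getProperty("derby.storage.indexStats.debug.lndiffThreshold", "1.0"));

    /** (a): no statistics exist yet; create them once the table is big enough. */
    static boolean shouldCreate(long tableRowEstimate) {
        // Assumption: reaching the threshold is enough.
        return tableRowEstimate >= CREATE_THRESHOLD;
    }

    /** (b) and (c): statistics exist; assume both thresholds must be exceeded. */
    static boolean shouldUpdate(long tableRowEstimate, long indexRowEstimate) {
        // (b) keeps small tables from being refreshed over and over.
        long absDiff = Math.abs(tableRowEstimate - indexRowEstimate);
        if (absDiff <= ABSDIFF_THRESHOLD) {
            return false;
        }
        // (c) compares the estimates on a natural-log scale; clamp to 1 to avoid ln(0).
        double lnDiff = Math.abs(Math.log(Math.max(1L, tableRowEstimate))
                - Math.log(Math.max(1L, indexRowEstimate)));
        return lnDiff > LNDIFF_THRESHOLD;
    }
}
```

With the defaults, estimates of 50 and 10 rows differ by ln(5) on the log scale,
which is above (c), but the absolute difference of 40 is below (b), so no update
would be scheduled in this sketch. That is the small-table case described in the
last answer above.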
Committed to trunk with revision 1069160.
> Implement istat scheduling/triggering
> -------------------------------------
>
> Key: DERBY-4938
> URL: https://issues.apache.org/jira/browse/DERBY-4938
> Project: Derby
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 10.8.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Attachments: derby-4938-1a-istat_scheduling.diff,
> derby-4938-1a-istat_scheduling.stat
>
>
> The istat daemon has to get its orders from somewhere (it is not operating
> purely on its own), and this issue tracks the addition of code that will
> schedule units of work with the daemon.
> The current approach is based on statement compilation, i.e. prepared
> statements, triggering the addition of units of work.