[ 
https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965879#action_12965879
 ] 

Dag H. Wanvik commented on DERBY-4771:
--------------------------------------

Hi Kristian, some comments on the patch beyond the tests. I still need to look 
at the meat in IndexStatisticsDaemonImpl, that will follow in a next review 
installment. To the extent my (preliminary) questions will be answered when I 
have read IndexStatisticsDaemonImpl fully, please ignore them ;-)

Generally the structure of the patch is clear and seems to do the right things. 
There are some amount of TODOs left that need weork as you are no doubt aware, 
and some of the naming could use some work, more below.

M       java/engine/org/apache/derby/impl/sql/compile/CursorNode.java

                if (fromTable instanceof FromBaseTable) {
                    TableDescriptor td = fromTable.getTableDescriptor();
                    :
                    // TODO: This was done because I didn't find another way to
                    //       access the base tables in the query at the time I
                    //       needed to. There may be a better way to do this!

* The privilege collection phase seems like a good place to do it to me? 
Obviates need for an extra pass, and it's always performed I think. It would be 
nice to format the block in "if (fromTable.isPrivilegeCollectionRequired())" 
with braces and correct indentation while you are at it, too.

                    if (checkIndexStats &&
                            td.getTableType() == 
TableDescriptor.BASE_TABLE_TYPE) {

* Isn't td's table type always BASE_TABLE_TYPE here? Cf. "if (fromTable 
instanceof FromBaseTable)" above.. No, it seems it can also be a view, cf 
comment in FromBaseTable's class Javadoc. Confusing.. maybe add a comment. 

                        if (statsToUpdate == null) {
                            statsToUpdate = new ArrayList(2);
                        }
                        statsToUpdate.add(td);
                    }
                }

* Btw, why use "2" as initialCapacity to "new ArrayList"? Is it statistically 
more likely we have 2 than any other number (1, 3)? If so, comment, if not, I'd 
suggest use no-arg constructor.

* // Look for missing and stale statistics.

  The loop seems to be doing the opposite: removing those that are up-to-date? 
I am not a big fan of include type (here "flag" in the name of methods & 
variables); I'd prefer a name that indicated the semantics of the flag instead, 
e.g. "getAndClearIsUpToDate". That reversed meaning removes the need for the 
"!" in the test and makes it easier to read.

* // Assume a low number of base tables.
  baseTables.remove();

I presume this comment pertains to performance only.

M       java/engine/org/apache/derby/impl/sql/compile/FromBaseTable.java

Parameterize 100 properly with a constant or property? Fix two TODOs in 
estimateCost.


M       java/engine/org/apache/derby/impl/sql/catalog/DataDictionaryImpl.java

Remove TODO in setIndexStatsRefresher (machinery is temporary, code says).

A       
java/engine/org/apache/derby/impl/services/daemon/IndexStatisticsDaemonImpl.java
M       java/engine/org/apache/derby/impl/db/BasicDatabase.java
M       java/engine/org/apache/derby/iapi/sql/dictionary/TableDescriptor.java

* markForIndexStatsUpdate: sdl may see nullpointerexception if 
tableRowCountEstimate >= 0, cf.
  test: "if (sdl.isEmpty() && tableRowCountEstimate < 0)" doesn't hold. If 
known invariant, use ASSERT instead. Javadoc should explain the use of negative 
tableRowCountEstimate which seems expected.

The diff must be one order of magnitude as far as i can read the code, for an 
update to be triggered. Would be nice to comment this logic a bit. Not sure why 
you need the first comparison (against 1000 - has a TODO attached though) when 
you have the log comparison anyway just beneath? Only saves som math 
operations; worth it?  The generated reason string

  "indexStatsUpdateReason = "t-est=" + tableRowCountEstimate +
  ", i-est=" + indexRowCountEstimate + " => cmp=" + cmp;

seems to be comparing apples to oranges (number vs log number)?

* markForIndexStatsUpdate: change comment to Javadoc. Remaining TODO: The 
statement "the more accurate table row count estimate may be lost" isn't clear 
to me.


M       java/engine/org/apache/derby/iapi/sql/dictionary/DataDictionary.java

getIndexStatsRefresher: "if disabled" how can it be disabled? A: See 
disableIndexStatsRefresher.
When is this one used (scenario) ? Can it be reeenabled, if so, how? No, only 
on fatal error. Maybe note in Javadoc what it's for..

doSetIndexStatsRefresher, setIndexStatsRefresher: I'd prefer 
doCreateIndexStatsRefresher, createIndexStatsRefresher I think. It just seemed 
vague to me what the verb "set" meant here. Since a refresher isn't provided in 
"setIndexStatsRefresher", I mean, as in a normal getter/setter pattern.

doSetIndexStatsRefresher -> !indexRefreshedExists

A       
java/engine/org/apache/derby/iapi/services/daemon/IndexStatisticsDaemon.java

 *  The background mode will try to
 * affect other operations as little as possible, and errors won't be reported
 * unless they are severe. 

How is this safe? (errors not reported?) explain! What happens if scheduled 
work throws? Under which condition can work queue be full? Then what happens? I 
guess I'll know when I read the Impl class :)

Javadoc typo for schedule: schedulig


> Continue investigation of automatic creation/update of index statistics
> -----------------------------------------------------------------------
>
>                 Key: DERBY-4771
>                 URL: https://issues.apache.org/jira/browse/DERBY-4771
>             Project: Derby
>          Issue Type: Task
>          Components: SQL, Store
>    Affects Versions: 10.8.0.0
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>         Attachments: autoindexstats.html, 
> derby-4771-1a-prototype_code_dump.diff, 
> derby-4771-1a-prototype_code_dump.stat, 
> derby-4771-1b-prototype_code_dump.diff, 
> derby-4771-2a-prototype_lcc_code_dump.diff, 
> derby-4771-2b-prototype_lcc_code_dump.diff, 
> derby-4771-2c-prototype_lcc_code_dump.diff, 
> derby-4771-2d-prototype_lcc_code_dump.diff, DERBY-4771-2e-prototype.rar, 
> derby-4771-2e-prototype_lcc_code_dump.diff, 
> Derby-4771-2f-AutomaticIndexStatisticsTest_wondows7.rar, 
> derby-4771-2f-prototype_lcc_code_dump-WORK-IN-PROGRESS.diff, derby.log, 
> error-stacktrace.out, rjall.out, rjall.out, rjall.out, rjall.rar, rjone.out
>
>
> Work was started to improve Derby's handling of index statistics. This issue 
> tracks further discussion and work for this task.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to