Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/7999 )

Change subject: [DOCS] Tighten up advice about first COMPUTE INCREMENTAL STATS
......................................................................


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/7999/2/docs/shared/impala_common.xml
File docs/shared/impala_common.xml:

http://gerrit.cloudera.org:8080/#/c/7999/2/docs/shared/impala_common.xml@1224
PS2, Line 1224:         For a particular table, use either <codeph>COMPUTE STATS</codeph> or
Yes!


http://gerrit.cloudera.org:8080/#/c/7999/2/docs/shared/impala_common.xml@1233
PS2, Line 1233:         When you run <codeph>COMPUTE INCREMENTAL STATS</codeph> on a table for the first time,
I suggest some minor rephrasing to drive home the "don't switch" mantra a 
little more; see comments.


http://gerrit.cloudera.org:8080/#/c/7999/2/docs/shared/impala_common.xml@1234
PS2, Line 1234:         the statistics are computed again from scratch regardless of whether you previously ran
regardless of whether the table has existing stats.


http://gerrit.cloudera.org:8080/#/c/7999/2/docs/shared/impala_common.xml@1236
PS2, Line 1236:         for scanning the entire table when switching from <codeph>COMPUTE STATS</codeph> to
when running COMPUTE INCREMENTAL STATS for the first time on a given table.

(Do not mention switching; users are not supposed to do that.)


http://gerrit.cloudera.org:8080/#/c/7999/2/docs/shared/impala_common.xml@1244
PS2, Line 1244:         2 GB, a serious error can occur. If only a limited number of partitions are actively being
If the aggregate metadata of all tables exceeds 2 GB, you may experience 
service downtime (daemon crashes).

("serious error" really isn't clear to me)


http://gerrit.cloudera.org:8080/#/c/7999/2/docs/shared/impala_common.xml@1245
PS2, Line 1245:         added or inserted into, you can run <codeph>COMPUTE INCREMENTAL STATS</codeph> for the active
Sorry, my phrasing might have been misleading. By "active" partitions I meant 
those partitions that are being queried (i.e., read). If you query some 
partitions very infrequently, then there is no point in keeping incremental 
stats for them.


http://gerrit.cloudera.org:8080/#/c/7999/2/docs/shared/impala_common.xml@1248
PS2, Line 1248:         optimizations such as partition pruning.
such as partition pruning or join ordering.


http://gerrit.cloudera.org:8080/#/c/7999/2/docs/topics/impala_partitioning.xml
File docs/topics/impala_partitioning.xml:

http://gerrit.cloudera.org:8080/#/c/7999/2/docs/topics/impala_partitioning.xml@624
PS2, Line 624:         subset of partitions rather than the entire table. The incremental nature makes it suitable for large tables
Need to be careful here because "large tables" could be misinterpreted to mean 
"tables with many partitions".

I'd prefer to avoid the word "suitable" and instead use a phrasing that states 
it enables updating the stats as partitions are added. Whether incremental 
stats is "suitable" for anything is questionable because of the huge memory 
downside.

I'd agree that incremental stats could be suitable in situations where you have 
a huge partitioned table with a small rolling window of "active" partitions, so 
you only ever need to keep incremental stats on, let's say, fewer than 100 
partitions.
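
To make the memory downside concrete, here is a rough back-of-the-envelope 
sketch in Python. It is only an illustration: the per-column, per-partition 
cost of 400 bytes is an assumed figure that does not come from this review, 
and the function names and table shapes are hypothetical. It compares a small 
rolling window of partitions against keeping incremental stats on every 
partition, relative to the 2 GB aggregate metadata limit.

```python
# Rough estimate of the metadata held for incremental stats.
# ASSUMPTION: about 400 bytes per column per partition. This figure is not
# from this review; replace it with a number measured on your own cluster.
ASSUMED_BYTES_PER_COLUMN_PER_PARTITION = 400

# The 2 GB aggregate metadata limit discussed in the comment on line 1244.
METADATA_LIMIT_BYTES = 2 * 1024 ** 3


def incremental_stats_bytes(num_partitions, num_columns,
                            bytes_per_cell=ASSUMED_BYTES_PER_COLUMN_PER_PARTITION):
    """Estimate metadata bytes for partitions that carry incremental stats."""
    return num_partitions * num_columns * bytes_per_cell


def fraction_of_limit(num_partitions, num_columns):
    """Return the estimate as a fraction of the 2 GB limit."""
    return incremental_stats_bytes(num_partitions, num_columns) / METADATA_LIMIT_BYTES


if __name__ == "__main__":
    # Hypothetical table shapes, chosen only for illustration.
    for label, partitions, columns in [
        ("rolling window", 100, 50),
        ("all partitions", 100_000, 500),
    ]:
        size = incremental_stats_bytes(partitions, columns)
        share = fraction_of_limit(partitions, columns)
        print(f"{label}: {size} bytes, {share:.2%} of limit")
```

Under these assumptions the rolling window stays far below the limit, while 
stats on every partition of a wide table would exceed it several times over. 
The exact numbers matter less than the fact that the cost grows with 
partitions times columns.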


http://gerrit.cloudera.org:8080/#/c/7999/2/docs/topics/impala_perf_stats.xml
File docs/topics/impala_perf_stats.xml:

http://gerrit.cloudera.org:8080/#/c/7999/2/docs/topics/impala_perf_stats.xml@361
PS2, Line 361:           <codeph>COMPUTE STATS</codeph> statement might take hours, or even days. That situation is where you switch
Rephrase to avoid "switch", since switching is bad.



--
To view, visit http://gerrit.cloudera.org:8080/7999
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia53a6518ce5541e5c9a2cd896856ce042a599b03
Gerrit-Change-Number: 7999
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Greg Rahn <[email protected]>
Gerrit-Reviewer: John Russell <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Silvius Rus <[email protected]>
Gerrit-Comment-Date: Fri, 06 Oct 2017 06:43:06 +0000
Gerrit-HasComments: Yes
