John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8003 )
Change subject: IMPALA-3200: [DOCS] Document user-facing aspects of new buffer pool ...................................................................... Patch Set 4: (21 comments) http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_buffer_pool_limit.xml File docs/topics/impala_buffer_pool_limit.xml: http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_buffer_pool_limit.xml@39 PS4, Line 39: Defines the size in bytes of an internal buffer for allocating memory during queries. > First sentence isn't accurate. Something like this might be better: "Define Done http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_buffer_pool_limit.xml@53 PS4, Line 53: MiB MB. I may have mixed up MB / KB and MiB / KiB in a few places throughout. MB / KB = powers of 2, MiB / KiB = powers of 10. http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_buffer_pool_limit.xml@53 PS4, Line 53: his default is represented by a value : of 0. > The default is actually that the query option is unset. 0 means a limit of Done http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_buffer_pool_limit.xml@69 PS4, Line 69: set buffer_pool_limit=8GB; > It can also be expressed as a percentage of mem_limit (e.g. 80%), which may Done http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_default_spillable_buffer_size.xml File docs/topics/impala_default_spillable_buffer_size.xml: http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_default_spillable_buffer_size.xml@44 PS4, Line 44: <p conref="../shared/impala_common.xml#common/type_integer"/> > Do we document somewhere the supported byte suffixes (e.g. B, KB, MB, GB pl Done. I'll reuse wording to that effect from the MEM_LIMIT page. I trust that of these new query options, only BUFFER_LIMIT accepts a percentage, the others are purely sizes in bytes, correct? http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_default_spillable_buffer_size.xml@49 PS4, Line 49: <p> > I think it's confusing to say that the default varies, since the query opti Done http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_default_spillable_buffer_size.xml@50 PS4, Line 50: <codeph>65536</codeph> to <codeph>2097152</codeph>, depending on the > Feel free to ignore, but might be easier to read if they were human-readabl Done http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_default_spillable_buffer_size.xml@59 PS4, Line 59: consider increasing the <codeph>DEFAULT_SPILLABLE_BUFFER_SIZE</codeph> setting. > Would it be helpful to elaborate here? E.g. "Larger buffer sizes will resul Done http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_default_spillable_buffer_size.xml@61 PS4, Line 61: </p> > It's implied but would it make sense to state the inverse. I.e. decreasing Done http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml File docs/topics/impala_max_row_size.xml: http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml@47 PS4, Line 47: 524288 > 512K to be human-readable? Done http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml@56 PS4, Line 56: accomodate > sp: accommodate That and 'separate' vs. 'seperate' are my kryptonite. http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml@57 PS4, Line 57: the total bytes stored in the largest row. > It's left ambiguous whether queries processing rows larger than MAX_ROW_SIZ Good point. In my testing I tried to come close to the boundary with a 530 K row size and was surprised that the query succeeded. Oops, I see this ties in to the initial description of the option purpose, which I left blank by mistake. I'll fill that in above. http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml@82 PS4, Line 82: 500 KB string and a 30 KB These are actually KiB. I didn't mean to actually cut it so close (530,000 vs. 524,288 limit). I couldn't make this slightly-too-large row cause a failure. http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml@95 PS4, Line 95: KiB Agh, KiB again where it should be KB. http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml@109 PS4, Line 109: Increase the max_row_size query option (currently 512.00 KB) to process larger rows. Wrap these very long error message lines. http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_min_spillable_buffer_size.xml File docs/topics/impala_min_spillable_buffer_size.xml: http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_min_spillable_buffer_size.xml@38 PS4, Line 38: <indexterm audience="hidden">MIN_SPILLABLE_BUFFER_SIZE query option</indexterm> > I think most of the comments I made on DEFAULT_SPILLABLE_BUFFER_SIZE have a Done. Please doublecheck whether the wording I lifted from the DEFAULT_ page also makes sense here. E.g. is it even possible to lower the MIN_ setting to less than 64 KB, or could you only ever increase it? http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_scalability.xml File docs/topics/impala_scalability.xml: http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_scalability.xml@425 PS4, Line 425: allocated > it's reserved at the beginning of the query, not allocated. The distinction Done http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_scalability.xml@560 PS4, Line 560: The infrastructure of the spilling feature affects the way the affected SQL operators, such as > A lot of the discussion in this paragraph and the one below is describing t OK, I will summarize the new situation and then comment out the old discussion (in case we need to rescue any details from it later.) http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_scalability.xml@615 PS4, Line 615: >BlockMgr.BytesWritten > This still needs updating - IMPALA-5656 - should I keep an eye out for a di Nah, there's only a single reference to this name so I'll fix IMPALA-5656 as part of this gerrit. http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_scalability.xml@701 PS4, Line 701: <b>Testing performance implications of spilling to disk:</b> > This section is stale too. The scenario still seems valid but the profile o OK, I'll skip over the output piece and see if users ask for more details to be added back. http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_scalability.xml@745 PS4, Line 745: Do not specify a memory limit lower than about 300 MB, because with such a : low limit, queries could fail to start for other reasons. Now try the memory-intensive query again. > This is one of the things that was fixed :) Done -- To view, visit http://gerrit.cloudera.org:8080/8003 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I49323f8ffbff3e195058e88762eedbb1fcb1bc0e Gerrit-Change-Number: 8003 Gerrit-PatchSet: 4 Gerrit-Owner: John Russell <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: John Russell <[email protected]> Gerrit-Reviewer: Mostafa Mokhtar <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Thu, 05 Oct 2017 00:27:39 +0000 Gerrit-HasComments: Yes
