[
https://issues.apache.org/jira/browse/IMPALA-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on IMPALA-5740 started by Alex Rodoni.
-------------------------------------------
> Clarify STRING size limit
> -------------------------
>
> Key: IMPALA-5740
> URL: https://issues.apache.org/jira/browse/IMPALA-5740
> Project: IMPALA
> Issue Type: Documentation
> Components: Docs
> Affects Versions: Impala 2.10.0
> Reporter: Tim Armstrong
> Assignee: Alex Rodoni
> Priority: Critical
>
> The Impala docs currently state that strings have a maximum size of 32kb.
> http://impala.apache.org/docs/build/html/topics/impala_string.html
> This is misleading and causes confusion. We should clarify this in a way that
> makes it clear that 32kb is not the actual limit, but also makes it clear
> that performance and memory consumption will degrade with larger strings.
> Here's what the behaviour is. Unsure the best way to succinctly characterise
> it.
> * We expect that queries operating on 32KB strings will work reliably and not
> hit significant performance or memory problems (unless you have very complex
> queries, very many columns, etc).
> * There is an absolute hard limit of ~ 2GB on strings.
> * Memory consumption of queries will grow as string sizes increase and the
> probability of hitting "memory limit exceeded" will increase.
> * Performance of queries will decrease as strings get larger.
> * The row size (i.e. total size of all string and other columns) is limited
> in various places by the implementation of various operators.
> ** Rows coming from the right side of any hash join
> ** Rows coming from either side of a spilling hash join
> ** Rows being sorted
> ** Rows in a grouping aggregation.
> In Impala <= 2.9 the default limit in those places is 8MB.
> With the IMPALA-3200 changes that default number may decrease significantly,
> to 2MB or less, but there will be a new query option, something like
> MAX_ROW_SIZE, that will make this row size limit configurable on a per-query
> basis.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]