[
https://issues.apache.org/jira/browse/IMPALA-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17561691#comment-17561691
]
ASF subversion and git services commented on IMPALA-9615:
---------------------------------------------------------
Commit a625a95dbd347d5a5e64566c77bcb27e991ce352 in impala's branch
refs/heads/dependabot/pip/infra/python/deps/urllib3-1.26.5 from Omid Shahidi
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a625a95db ]
IMPALA-9615: re2's max_mem opt configurable via an Impala startup flag
Some regex patterns need more memory to compile and match when used with
the string functions and the LIKE predicate that Impala provides.
For more memory-consuming patterns, this can cause the following error:
"re2/re2.cc:667: DFA out of memory:
size xxxxx, bytemap range xx, list count xxxxx".
To avoid such errors in impalad's ERROR log, a global flag can be set at
Impala cluster startup. The re2_mem_limit flag accepts a memory
specification string and uses it to set re2's max_mem parameter, the
amount of memory, in bytes, that re2 may use to store regexps.
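
As an illustration of what "a memory specification string" resolved to
bytes might look like, here is a minimal sketch. ParseMemSpecBytes is a
hypothetical helper written for this example, not Impala's own parser, and
the accepted grammar (an integer with an optional b/k/m/g suffix) is an
assumption; the real flag may accept other forms.

```cpp
#include <cctype>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper, not Impala's parser: converts a memory specification
// such as "8388608", "512KB", "64m" or "1G" into a byte count.
// Returns std::nullopt for malformed input or on overflow.
std::optional<int64_t> ParseMemSpecBytes(const std::string& spec) {
  size_t i = 0;
  int64_t value = 0;
  // Read the leading run of decimal digits.
  while (i < spec.size() && std::isdigit(static_cast<unsigned char>(spec[i]))) {
    int digit = spec[i] - '0';
    if (value > (INT64_MAX - digit) / 10) return std::nullopt;  // overflow
    value = value * 10 + digit;
    ++i;
  }
  if (i == 0) return std::nullopt;  // no digits at all

  // Optional unit: b, k, m or g; k/m/g may be followed by 'b' (e.g. "MB").
  int shift = 0;
  if (i < spec.size()) {
    switch (std::tolower(static_cast<unsigned char>(spec[i]))) {
      case 'b': shift = 0; break;
      case 'k': shift = 10; break;
      case 'm': shift = 20; break;
      case 'g': shift = 30; break;
      default: return std::nullopt;  // unknown unit
    }
    const bool scaled_unit = shift != 0;
    ++i;
    if (scaled_unit && i < spec.size() &&
        std::tolower(static_cast<unsigned char>(spec[i])) == 'b') {
      ++i;
    }
  }
  if (i != spec.size()) return std::nullopt;  // trailing characters
  if (value > (INT64_MAX >> shift)) return std::nullopt;  // overflow
  return value << shift;
}
```

A byte count obtained this way is the kind of value re2 takes through
RE2::Options::set_max_mem() before a pattern is compiled. How Impala wires
re2_mem_limit into that call is not shown here.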
Testing:
- Use a long regex pattern that exhausts the memory available to re2 when
re2_mem_limit is less than or equal to the re2 default. With a larger
value for the re2_mem_limit flag, the regexp is processed with no error.
Change-Id: Idf28d2f7217b1322ab8fdfb2c02fff0608078571
Reviewed-on: http://gerrit.cloudera.org:8080/18602
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Make re2's max_mem option configurable via an Impala startup flag.
> ------------------------------------------------------------------
>
> Key: IMPALA-9615
> URL: https://issues.apache.org/jira/browse/IMPALA-9615
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 3.4.0
> Reporter: Attila Jeges
> Assignee: Omid Shahidi
> Priority: Major
> Labels: backend, ramp-up
>
> Right now Impala always uses the default max_mem value for re2 regexp pattern
> matching.
> For more memory-consuming patterns, this can cause the following error:
> "re2/re2.cc:667: DFA out of memory: size xxxxx, bytemap range xx, list count
> xxxxx".
> It would be nice if re2's max_mem option were configurable via an Impala
> startup flag.