Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18602 )

Change subject: IMPALA-9615: re2's max_mem opt configurable via an Impala 
startup flag
......................................................................


Patch Set 9:

(7 comments)

Overall looks good to me. I just have some minor comments and suggestions for 
follow on tasks.

http://gerrit.cloudera.org:8080/#/c/18602/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18602/9//COMMIT_MSG@9
PS9, Line 9: Some regex patterns require more memory to be compiled and pattern 
matched using
Would be good to wrap the commit message lines within 72 chars as mentioned 
here: https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala


http://gerrit.cloudera.org:8080/#/c/18602/9//COMMIT_MSG@19
PS9, Line 19: The testcase uses a long regex pattern to use up all the memory 
in the
Maybe, a good idea to include testing details in a 'Testing' section. Here is a 
recent commit which is a good example to follow: 
https://github.com/apache/impala/commit/d12d4c9b077392f58701af2e343dd927fe516846


http://gerrit.cloudera.org:8080/#/c/18602/9/be/src/exprs/like-predicate.cc
File be/src/exprs/like-predicate.cc:

http://gerrit.cloudera.org:8080/#/c/18602/9/be/src/exprs/like-predicate.cc@104
PS9, Line 104:       StringFunctions::SetRE2MemOpt(&opts);
Would be interesting to run some tests to measure the perf impact of this patch.


http://gerrit.cloudera.org:8080/#/c/18602/9/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/18602/9/be/src/exprs/string-functions-ir.cc@46
PS9, Line 46: DECLARE_string(re2_mem_limit);
Is this being used somewhere?


http://gerrit.cloudera.org:8080/#/c/18602/9/be/src/exprs/string-functions-ir.cc@934
PS9, Line 934: void StringFunctions::SetRE2MemLimit(int64_t re2_mem_limit) {
Unfortunately, this is another example of untracked memory usage at runtime. We 
could have multiple instances of re2 (at least one per fragment) and each could 
use the configured amount of memory. Unfortunately, the planner doesn't know of 
any such memory requirement from the LIKE and other predicates using 
StringFunctions and so doesn't quite predict the real memory requirement. Would 
be good to include the memory requirement for re2 in the planner estimates. 
This probably should be a follow on JIRA.


http://gerrit.cloudera.org:8080/#/c/18602/9/tests/custom_cluster/test_re2_max_mem.py
File tests/custom_cluster/test_re2_max_mem.py:

http://gerrit.cloudera.org:8080/#/c/18602/9/tests/custom_cluster/test_re2_max_mem.py@30
PS9, Line 30:         """"test to see given an amount of memory (in Bytes) does 
the re2 print an error for
nit:  "for for DFA" -> "for DFA"


http://gerrit.cloudera.org:8080/#/c/18602/9/tests/custom_cluster/test_re2_max_mem.py@43
PS9, Line 43:         if expect_fail:
If 'expect_fail' is set then we seem to be ignoring 'dfa_out_of_mem' flag. This 
is probably okay, but maybe something to add in the test comments.



--
To view, visit http://gerrit.cloudera.org:8080/18602
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idf28d2f7217b1322ab8fdfb2c02fff0608078571
Gerrit-Change-Number: 18602
Gerrit-PatchSet: 9
Gerrit-Owner: Omid Shahidi <omid.shahidi.2...@gmail.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Omid Shahidi <omid.shahidi.2...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Comment-Date: Mon, 20 Jun 2022 22:55:29 +0000
Gerrit-HasComments: Yes

Reply via email to