Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/18602 )
Change subject: IMPALA-9615: re2's max_mem opt configurable via an Impala startup flag ...................................................................... Patch Set 9: (7 comments) Overall looks good to me. I just have some minor comments and suggestions for follow on tasks. http://gerrit.cloudera.org:8080/#/c/18602/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18602/9//COMMIT_MSG@9 PS9, Line 9: Some regex patterns require more memory to be compiled and pattern matched using Would be good to wrap the commit message lines within 72 chars as mentioned here: https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala http://gerrit.cloudera.org:8080/#/c/18602/9//COMMIT_MSG@19 PS9, Line 19: The testcase uses a long regex pattern to use up all the memory in the Maybe, a good idea to include testing details in a 'Testing' section. Here is a recent commit which is a good example to follow: https://github.com/apache/impala/commit/d12d4c9b077392f58701af2e343dd927fe516846 http://gerrit.cloudera.org:8080/#/c/18602/9/be/src/exprs/like-predicate.cc File be/src/exprs/like-predicate.cc: http://gerrit.cloudera.org:8080/#/c/18602/9/be/src/exprs/like-predicate.cc@104 PS9, Line 104: StringFunctions::SetRE2MemOpt(&opts); Would be interesting to run some tests to measure the perf impact of this patch. http://gerrit.cloudera.org:8080/#/c/18602/9/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/18602/9/be/src/exprs/string-functions-ir.cc@46 PS9, Line 46: DECLARE_string(re2_mem_limit); Is this being used somewhere? http://gerrit.cloudera.org:8080/#/c/18602/9/be/src/exprs/string-functions-ir.cc@934 PS9, Line 934: void StringFunctions::SetRE2MemLimit(int64_t re2_mem_limit) { Unfortunately, this is another example of untracked memory usage at runtime. We could have multiple instances of re2 (at least one per fragment) and each could use the configured amount of memory. Unfortunately, the planner doesn't know of any such memory requirement from the LIKE and other predicates using StringFunctions and so doesn't quite predict the real memory requirement. Would be good to include the memory requirement for re2 in the planner estimates. This probably should be a follow on JIRA. http://gerrit.cloudera.org:8080/#/c/18602/9/tests/custom_cluster/test_re2_max_mem.py File tests/custom_cluster/test_re2_max_mem.py: http://gerrit.cloudera.org:8080/#/c/18602/9/tests/custom_cluster/test_re2_max_mem.py@30 PS9, Line 30: """"test to see given an amount of memory (in Bytes) does the re2 print an error for nit: "for for DFA" -> "for DFA" http://gerrit.cloudera.org:8080/#/c/18602/9/tests/custom_cluster/test_re2_max_mem.py@43 PS9, Line 43: if expect_fail: If 'expect_fail' is set then we seem to be ignoring 'dfa_out_of_mem' flag. This is probably okay, but maybe something to add in the test comments. -- To view, visit http://gerrit.cloudera.org:8080/18602 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idf28d2f7217b1322ab8fdfb2c02fff0608078571 Gerrit-Change-Number: 18602 Gerrit-PatchSet: 9 Gerrit-Owner: Omid Shahidi <omid.shahidi.2...@gmail.com> Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Omid Shahidi <omid.shahidi.2...@gmail.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Comment-Date: Mon, 20 Jun 2022 22:55:29 +0000 Gerrit-HasComments: Yes