[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 10: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Sat, 15 Jun 2024 00:28:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..

IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

ExprValuesCache uses BATCH_SIZE as a deciding factor to set its
capacity. It bounds the capacity such that expr_values_array_ memory
usage stays below 256KB. This patch tightens that limit to include all
memory usage from ExprValuesCache::MemUsage() instead of
expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will
not push the total memory usage of ExprValuesCache beyond 256KB.
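
A minimal Python sketch of the difference between bounding expr_values_array_ alone and bounding the total. The breakdown inside mem_usage() (one null flag per expression, a 4-byte hash per row, a bitmap of 64-bit words) is an assumption made for illustration and is not taken from this thread; all function names are hypothetical.

```python
MAX_BYTES = 256 * 1024  # the 256KB cap named in the commit message


def mem_usage(capacity, value_bytes_per_row, num_exprs):
    """Hypothetical stand-in for ExprValuesCache::MemUsage()."""
    # Assumed per-row cost: expression values, one null flag per
    # expression, and a 4-byte hash.
    per_row = value_bytes_per_row + num_exprs + 4
    # Assumed bitmap cost: one 8-byte word per 64 rows, rounded up.
    bitmap = ((capacity + 63) // 64) * 8
    return capacity * per_row + bitmap


def capacity_values_only(batch_size, value_bytes_per_row):
    """Bound that only keeps the values array under MAX_BYTES."""
    return max(1, min(batch_size, MAX_BYTES // value_bytes_per_row))


def capacity_total(batch_size, value_bytes_per_row, num_exprs):
    """Bound that keeps the whole mem_usage() under MAX_BYTES."""
    capacity = capacity_values_only(batch_size, value_bytes_per_row)
    while capacity > 1 and mem_usage(capacity, value_bytes_per_row, num_exprs) > MAX_BYTES:
        capacity -= 1
    return capacity


if __name__ == "__main__":
    loose = capacity_values_only(65536, 4096)
    tight = capacity_total(65536, 4096, 2)
    print(loose, mem_usage(loose, 4096, 2) > MAX_BYTES)
    print(tight, mem_usage(tight, 4096, 2) <= MAX_BYTES)
```

Under these assumptions, the values-only bound admits a capacity whose total footprint exceeds the cap, while the total-based bound does not, regardless of how large BATCH_SIZE is.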

Simplify table dimension creation methods and fix a few flake8 warnings
in test_dimensions.py.

Testing:
- Add test_join_queries.py::TestExprValueCache.
- Pass core tests.

Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Reviewed-on: http://gerrit.cloudera.org:8080/21455
Reviewed-by: Riza Suminto 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M bin/rat_exclude_files.txt
A testdata/workloads/tpcds_partitioned/queries
M tests/common/test_dimensions.py
M tests/query_test/test_join_queries.py
6 files changed, 70 insertions(+), 22 deletions(-)

Approvals:
  Riza Suminto: Looks good to me, approved
  Impala Public Jenkins: Verified

--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 11
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16355/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 14 Jun 2024 19:40:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10717/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 14 Jun 2024 19:21:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 10: Code-Review+2

(1 comment)

Carry +2 for trivial changes.

http://gerrit.cloudera.org:8080/#/c/21455/9/tests/common/test_dimensions.py
File tests/common/test_dimensions.py:

http://gerrit.cloudera.org:8080/#/c/21455/9/tests/common/test_dimensions.py@127
PS9, Line 127: return create_table_format_dimension(workload, 'text
> Forgot to return here and below.
Done



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 14 Jun 2024 19:20:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Riza Suminto (Code Review)
Hello Yida Wu, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21455

to look at the new patch set (#10).

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..

IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

ExprValuesCache uses BATCH_SIZE as a deciding factor to set its
capacity. It bounds the capacity such that expr_values_array_ memory
usage stays below 256KB. This patch tightens that limit to include all
memory usage from ExprValuesCache::MemUsage() instead of
expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will
not push the total memory usage of ExprValuesCache beyond 256KB.

Simplify table dimension creation methods and fix a few flake8 warnings
in test_dimensions.py.

Testing:
- Add test_join_queries.py::TestExprValueCache.
- Pass core tests.

Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M bin/rat_exclude_files.txt
A testdata/workloads/tpcds_partitioned/queries
M tests/common/test_dimensions.py
M tests/query_test/test_join_queries.py
6 files changed, 70 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/10
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21455/9/tests/common/test_dimensions.py
File tests/common/test_dimensions.py:

http://gerrit.cloudera.org:8080/#/c/21455/9/tests/common/test_dimensions.py@127
PS9, Line 127: create_table_format_dimension(workload, 'text/none')
Forgot to return here and below.
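
A minimal Python sketch of why the missing return matters: a wrapper that calls a helper without returning its result hands None to the caller. The helper body below is a stand-in invented for illustration (the real create_table_format_dimension in test_dimensions.py is not shown in this thread), and the two wrapper names are hypothetical.

```python
def create_table_format_dimension(workload, table_format):
    """Stand-in helper: the real one is not shown in this thread."""
    return (workload, table_format)


def create_text_dimension_buggy(workload):
    # The helper's result is computed and then discarded.
    create_table_format_dimension(workload, 'text/none')


def create_text_dimension_fixed(workload):
    # The helper's result is passed back to the caller.
    return create_table_format_dimension(workload, 'text/none')


if __name__ == "__main__":
    print(create_text_dimension_buggy('tpcds'))
    print(create_text_dimension_fixed('tpcds'))
```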



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 14 Jun 2024 19:14:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 9: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10714/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 14 Jun 2024 19:06:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16350/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 14 Jun 2024 14:53:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Riza Suminto (Code Review)
Hello Yida Wu, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21455

to look at the new patch set (#9).

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..

IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

ExprValuesCache uses BATCH_SIZE as a deciding factor to set its
capacity. It bounds the capacity such that expr_values_array_ memory
usage stays below 256KB. This patch tightens that limit to include all
memory usage from ExprValuesCache::MemUsage() instead of
expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will
not push the total memory usage of ExprValuesCache beyond 256KB.

Simplify table dimension creation methods and fix a few flake8 warnings
in test_dimensions.py.

Testing:
- Add test_join_queries.py::TestExprValueCache.
- Pass core tests.

Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M bin/rat_exclude_files.txt
A testdata/workloads/tpcds_partitioned/queries
M tests/common/test_dimensions.py
M tests/query_test/test_join_queries.py
6 files changed, 70 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/9
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 9: Code-Review+2

Carry +2 from Csaba after rebase.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 14 Jun 2024 14:28:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10714/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 14 Jun 2024 14:29:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10713/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 14 Jun 2024 14:14:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-14 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 8: Code-Review+2

Thanks for the changes, lgtm!


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 14 Jun 2024 11:29:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16336/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 13 Jun 2024 19:12:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-13 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21455/7/tests/query_test/test_join_queries.py
File tests/query_test/test_join_queries.py:

http://gerrit.cloudera.org:8080/#/c/21455/7/tests/query_test/test_join_queries.py@218
PS7, Line 218: parquet/snap/bloc
> Is there a reason for using this instead of default parquet/none?
With parquet_none, the catalog_sales table is not partitioned, and the minimum 
MEM_LIMIT to pass admission control is 183mb. That is high enough for this test 
to pass with and without the improvement code.

Meanwhile, in tpcds_partitioned_snap_block, both store_sales and catalog_sales 
are partitioned, allowing the query to pass admission control with a lower 
149mb MEM_LIMIT. However, without the improvement code, it will fail execution 
in the backend with a memory limit exceeded error.
https://issues.apache.org/jira/browse/IMPALA-13075?focusedCommentId=17848720&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17848720

I will mention this in the test comment.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 13 Jun 2024 18:16:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-13 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21455/7/tests/common/test_dimensions.py
File tests/common/test_dimensions.py:

http://gerrit.cloudera.org:8080/#/c/21455/7/tests/common/test_dimensions.py@119
PS7, Line 119:
> the create...dimension functions below could be shorter by calling this fun
Done


http://gerrit.cloudera.org:8080/#/c/21455/7/tests/query_test/test_join_queries.py
File tests/query_test/test_join_queries.py:

http://gerrit.cloudera.org:8080/#/c/21455/7/tests/query_test/test_join_queries.py@218
PS7, Line 218:
> With parquet_none, the catalog_sales table is not partitioned, and the mini
Done



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 13 Jun 2024 18:48:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-13 Thread Riza Suminto (Code Review)
Hello Yida Wu, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21455

to look at the new patch set (#8).

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..

IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

ExprValuesCache uses BATCH_SIZE as a deciding factor to set its
capacity. It bounds the capacity such that expr_values_array_ memory
usage stays below 256KB. This patch tightens that limit to include all
memory usage from ExprValuesCache::MemUsage() instead of
expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will
not push the total memory usage of ExprValuesCache beyond 256KB.

Simplify table dimension creation methods and fix a few flake8 warnings
in test_dimensions.py.

Testing:
- Add test_join_queries.py::TestExprValueCache.
- Pass core tests.

Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M bin/rat_exclude_files.txt
A testdata/workloads/tpcds_partitioned/queries
M tests/common/test_dimensions.py
M tests/query_test/test_join_queries.py
6 files changed, 70 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/8
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-13 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 7: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21455/7/tests/common/test_dimensions.py
File tests/common/test_dimensions.py:

http://gerrit.cloudera.org:8080/#/c/21455/7/tests/common/test_dimensions.py@119
PS7, Line 119: create_table_format_dimension
The create...dimension functions below could be shorter by calling this 
function.


http://gerrit.cloudera.org:8080/#/c/21455/7/tests/query_test/test_join_queries.py
File tests/query_test/test_join_queries.py:

http://gerrit.cloudera.org:8080/#/c/21455/7/tests/query_test/test_join_queries.py@218
PS7, Line 218: parquet/snap/bloc
Is there a reason for using this instead of default parquet/none?



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 13 Jun 2024 12:31:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16309/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 10 Jun 2024 21:58:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-10 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21455/5/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/21455/5/be/src/exec/hash-table.cc@376
PS5, Line 376: _exprs_) / sam
> If I understand correctly, then capacity_ can be 1 or N*64. Isn't it possibl
Changed the formula to use double in patch set 7.

The -8 in the max_capacity calculation accounts for 1 extra BitMap overhead. 
Therefore, if mem_per_row = 4KB, max_capacity will be equal to 63.998046875. 
Thus, expect capacity_ = 63.
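
The arithmetic above can be checked with a short Python sketch. It assumes the 256KB limit is 256 * 1024 bytes and that the -8 is a byte count; the variable names are chosen for illustration and only mirror the ones mentioned in this reply.

```python
import math

LIMIT_BYTES = 256 * 1024     # 256KB cap
BITMAP_OVERHEAD = 8          # the "-8" from the reply, assumed to be bytes
mem_per_row = 4 * 1024       # 4KB per row, as in the example

# Floating-point division, matching the switch to double in patch set 7.
max_capacity = (LIMIT_BYTES - BITMAP_OVERHEAD) / mem_per_row

# Truncate toward zero to get a whole number of rows.
capacity = math.floor(max_capacity)

if __name__ == "__main__":
    print(max_capacity)  # 63.998046875
    print(capacity)      # 63
```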



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 10 Jun 2024 21:35:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-10 Thread Riza Suminto (Code Review)
Hello Yida Wu, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21455

to look at the new patch set (#7).

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..

IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

ExprValuesCache uses BATCH_SIZE as a deciding factor to set its
capacity. It bounds the capacity such that expr_values_array_ memory
usage stays below 256KB. This patch tightens that limit to include all
memory usage from ExprValuesCache::MemUsage() instead of
expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will
not push the total memory usage of ExprValuesCache beyond 256KB.

Testing:
- Add test_join_queries.py::TestExprValueCache.
- Pass core tests.

Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M bin/rat_exclude_files.txt
A testdata/workloads/tpcds_partitioned/queries
M tests/common/test_dimensions.py
M tests/query_test/test_join_queries.py
6 files changed, 54 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/7
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-06-10 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21455/5/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/21455/5/be/src/exec/hash-table.cc@376
PS5, Line 376: * sample_size;
If I understand correctly, then capacity_ can be 1 or N*64. Isn't it possible 
that the first jump (1-64) is too high in some cases, e.g. if the capacity 
would be 16 with the original calculation (16KB expr_values_bytes_per_row_)?
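
The scenario in this question can be worked through with a small Python sketch. The patch set 5 code itself is not shown in this thread, so the two rounding helpers below are hypothetical; they only illustrate what happens to a capacity of 16 if the allowed values are restricted to 1 or multiples of 64.

```python
LIMIT_BYTES = 256 * 1024      # 256KB cap
bytes_per_row = 16 * 1024     # 16KB expr_values_bytes_per_row_, as in the question

# Original calculation: as many rows as fit under the limit.
original_capacity = LIMIT_BYTES // bytes_per_row


def round_down_to_64(n):
    """Hypothetical helper: largest multiple of 64 that is <= n."""
    return (n // 64) * 64


def round_up_to_64(n):
    """Hypothetical helper: smallest multiple of 64 that is >= n."""
    return ((n + 63) // 64) * 64


if __name__ == "__main__":
    print(original_capacity)                     # 16
    # Rounding down leaves no multiple of 64, so only capacity 1 remains.
    print(round_down_to_64(original_capacity))   # 0
    # Rounding up jumps to 64 rows, which is 4x the limit at 16KB per row.
    print(round_up_to_64(original_capacity) * bytes_per_row // LIMIT_BYTES)  # 4
```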



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 10 Jun 2024 15:26:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-31 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21455/4/testdata/workloads/tpcds_partitioned/queries
File testdata/workloads/tpcds_partitioned/queries:

http://gerrit.cloudera.org:8080/#/c/21455/4/testdata/workloads/tpcds_partitioned/queries@1
PS4, Line 1: ../tpcds/queries
> get_workload() along with table format dimension dictate which database the
Thanks, that makes sense to me.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 31 May 2024 20:03:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16256/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 31 May 2024 19:54:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-31 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc@373
PS2, Line 373: rocessor dependent
> Ack. Maybe also add this to the code comment to explain why 64 is picked?
Done


http://gerrit.cloudera.org:8080/#/c/21455/4/testdata/workloads/tpcds_partitioned/queries
File testdata/workloads/tpcds_partitioned/queries:

http://gerrit.cloudera.org:8080/#/c/21455/4/testdata/workloads/tpcds_partitioned/queries@1
PS4, Line 1: ../tpcds/queries
> I don't quite understand the difference between using testdata/workloads/tp
get_workload(), along with the table format dimension, dictates which database
the query runs against. For example,

tpcds_partitioned + 'parquet/snap/block' = tpcds_partitioned_parquet_snap
database.
tpcds + 'parquet/none' = tpcds_parquet database.

My intention is to run the same TPC-DS Q97 that exists in the tpcds workload
dir against tpcds_partitioned_parquet_snap instead of tpcds_parquet, because
all fact tables in tpcds_partitioned_parquet_snap are partitioned. In
tpcds_parquet, only store_sales is partitioned.

All test files under testdata/workloads/tpcds/queries/ should be compatible
with tpcds_partitioned_parquet_snap. So I just created a symlink instead of
copying all test files into a separate
testdata/workloads/tpcds_partitioned/queries/. Eventually, the tpcds dataset
should be changed to follow the schema from tpcds_partitioned, and we can keep
just one of them.
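
The naming rule in the two examples above can be sketched as follows. This is only an illustration that reproduces those two examples; database_name is a hypothetical helper, not a function from the Impala test framework, and the real mapping may handle other formats differently.

```python
def database_name(workload, table_format):
    """Hypothetical helper: derive a database name from workload and table format.

    table_format is 'file_format/codec[/compression_type]', e.g.
    'parquet/snap/block'. Only the two cases from the comment are covered.
    """
    file_format, codec = table_format.split("/")[:2]
    # Assumption: an uncompressed format ('none') adds no codec suffix.
    suffix = file_format if codec == "none" else "{}_{}".format(file_format, codec)
    return "{}_{}".format(workload, suffix)


print(database_name("tpcds_partitioned", "parquet/snap/block"))
print(database_name("tpcds", "parquet/none"))
```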



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 31 May 2024 19:30:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-31 Thread Riza Suminto (Code Review)
Hello Yida Wu, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21455

to look at the new patch set (#5).

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..

IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

ExprValuesCache uses BATCH_SIZE as a deciding factor to set its
capacity. It bounds the capacity such that expr_values_array_ memory
usage stays below 256KB. This patch tightens that limit to include all
memory usage from ExprValuesCache::MemUsage() instead of
expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will
not push the total memory usage of ExprValuesCache beyond 256KB.

Testing:
- Add test_join_queries.py::TestExprValueCache.
- Pass core tests.

Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M bin/rat_exclude_files.txt
A testdata/workloads/tpcds_partitioned/queries
M tests/common/test_dimensions.py
M tests/query_test/test_join_queries.py
6 files changed, 53 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/5
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-31 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc@373
PS2, Line 373: t int sample_size
> Done. Keeping it local for this Init method.
Ack. Maybe also add this to the code comment to explain why 64 is picked?


http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc@378
PS2, Line 378: // TODO: Add 'mem_
> capacity_ is capped to 1 at minimum. It is unlikely, but we want to know if
Done


http://gerrit.cloudera.org:8080/#/c/21455/4/testdata/workloads/tpcds_partitioned/queries
File testdata/workloads/tpcds_partitioned/queries:

http://gerrit.cloudera.org:8080/#/c/21455/4/testdata/workloads/tpcds_partitioned/queries@1
PS4, Line 1: ../tpcds/queries
I don't quite understand the difference between using
testdata/workloads/tpcds/queries and the new workload tpcds_partitioned in
test_join_queries.py. Is there a specific reason why test_join_queries.py uses
tpcds_partitioned, which seems to be the same as tpcds?



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 31 May 2024 18:57:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16240/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 29 May 2024 00:27:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-28 Thread Riza Suminto (Code Review)
Hello Yida Wu, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21455

to look at the new patch set (#4).

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..

IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

ExprValuesCache uses BATCH_SIZE as a deciding factor to set its
capacity. It bounds the capacity such that expr_values_array_ memory
usage stays below 256KB. This patch tightens that limit to include all
memory usage from ExprValuesCache::MemUsage() instead of
expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will
not push the total memory usage of ExprValuesCache beyond 256KB.

Testing:
- Add test_join_queries.py::TestExprValueCache.
- Pass core tests.

Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M bin/rat_exclude_files.txt
A testdata/workloads/tpcds_partitioned/queries
M tests/common/test_dimensions.py
M tests/query_test/test_join_queries.py
6 files changed, 52 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/4
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 3:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/16238/ : Initial code 
review checks failed. See linked job for details on the failure.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 28 May 2024 23:36:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-28 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc@373
PS2, Line 373: t int sample_size
> Looks like the 64 is the sampling number, can we give the 64 a const, and r
Done. Keeping it local for this Init method.
Picked 64 because MemUsage accounts for Bitmap::MemUsage as well, which costs 8
bytes per 64 entries.
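
As a rough illustration of the "8 bytes per 64 entries" figure, the sketch below assumes the bitmap stores its bits in 64-bit words and rounds up to whole words. bitmap_mem_usage is a hypothetical stand-in, not the actual Bitmap::MemUsage implementation.

```python
def bitmap_mem_usage(num_bits):
    """Bytes used by a bitmap of 64-bit words: 8 bytes per 64 entries, rounded up."""
    words = (num_bits + 63) // 64
    return words * 8


# Capacities that are multiples of 64 fill whole words; one extra entry
# costs a full additional 8-byte word.
for n in (1, 64, 65, 128):
    print(n, bitmap_mem_usage(n))
```

Under that assumption, sampling the memory usage at 64 entries captures exactly one bitmap word, so scaling the sample by N stays consistent with the bitmap's cost.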


http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc@378
PS2, Line 378: // TODO: Add 'mem_
> Should "if (capacity_ > 0)" also hold the DCHECK? Or is it possible that th
capacity_ is capped to 1 at minimum. It is unlikely, but we want to know if
this indeed happens.
Turned this into LOG(WARNING) instead.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 28 May 2024 23:16:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-28 Thread Riza Suminto (Code Review)
Hello Yida Wu, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21455

to look at the new patch set (#3).

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..

IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

ExprValuesCache uses BATCH_SIZE as a deciding factor to set its
capacity. It bounds the capacity such that expr_values_array_ memory
usage stays below 256KB. This patch tightens that limit to include all
memory usage from ExprValuesCache::MemUsage() instead of
expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will
not push the total memory usage of ExprValuesCache beyond 256KB.

Testing:
- Add test_join_queries.py::TestExprValueCache.
- Pass core tests.

Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
A testdata/workloads/tpcds_partitioned/queries
M tests/common/test_dimensions.py
M tests/query_test/test_join_queries.py
5 files changed, 49 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/3
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-25 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc@373
PS2, Line 373: mem_for_64_entries
It looks like 64 is the sampling number. Can we make 64 a named constant and
rename mem_for_64_entries to mem_for_sampling_entries?


http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc@378
PS2, Line 378: if (capacity_ > 1)
Should "if (capacity_ > 0)" also hold the DCHECK? Or is it possible that
mem_usage can exceed MAX_EXPR_VALUES_CACHE_BYTES in that case? If that is
possible but rare, maybe we can log something when mem_usage exceeds
MAX_EXPR_VALUES_CACHE_BYTES.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Sat, 25 May 2024 16:53:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 2:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/16218/ : Initial code 
review checks failed. See linked job for details on the failure.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 24 May 2024 15:02:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-24 Thread Riza Suminto (Code Review)
Hello Yida Wu, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21455

to look at the new patch set (#2).

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..

IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

ExprValuesCache uses BATCH_SIZE as a deciding factor to set its
capacity. It bounds the capacity such that expr_values_array_ memory
usage stays below 256KB. This patch tightens that limit to include all
memory usage from ExprValuesCache::MemUsage() instead of
expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will
not push the total memory usage of ExprValuesCache beyond 256KB.

Testing:
- Add test_join_queries.py::TestExprValueCache.
- Pass core tests.

Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
A testdata/workloads/tpcds_partitioned/queries
M tests/common/test_dimensions.py
M tests/query_test/test_join_queries.py
5 files changed, 44 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/2
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/16217/ : Initial code 
review checks failed. See linked job for details on the failure.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 24 May 2024 07:12:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21455 )


Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..

IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

ExprValuesCache uses BATCH_SIZE as a deciding factor to set its
capacity. It bounds the capacity such that expr_values_array_ memory
usage stays below 256KB. This patch tightens that limit to include all
memory usage from ExprValuesCache::MemUsage() instead of
expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will
not push the total memory usage of ExprValuesCache beyond 256KB.

Testing:
- Add test_join_queries.py::TestExprValueCache.
- Pass core tests.

Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
A testdata/workloads/tpcds_partitioned/queries
M tests/common/test_dimensions.py
M tests/query_test/test_join_queries.py
5 files changed, 45 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/1
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21455/1/tests/query_test/test_join_queries.py
File tests/query_test/test_join_queries.py:

http://gerrit.cloudera.org:8080/#/c/21455/1/tests/query_test/test_join_queries.py@27
PS1, Line 27: from tests.common.test_dimensions import (
flake8: F401 'tests.common.test_dimensions.add_exec_option_dimension' imported 
but unused



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 May 2024 06:50:03 +
Gerrit-HasComments: Yes