Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/8971 )
Change subject: IMPALA-5519: Allocate fragment's runtime filter memory from Buffer pool ...................................................................... Patch Set 5: (16 comments) http://gerrit.cloudera.org:8080/#/c/8971/5/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/8971/5/common/thrift/PlanNodes.thrift@151 PS5, Line 151: filter_byte_size > "filter_size_bytes" Done http://gerrit.cloudera.org:8080/#/c/8971/5/common/thrift/Planner.thrift File common/thrift/Planner.thrift: http://gerrit.cloudera.org:8080/#/c/8971/5/common/thrift/Planner.thrift@76 PS5, Line 76: runtime_filters_mem_requirement > I would rename to "runtime_filters_mem_requirement_bytes" to be consistent Done, renamed to "runtime_filters_reservation_bytes" http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/PlanFragment.java File fe/src/main/java/org/apache/impala/planner/PlanFragment.java: http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@120 PS5, Line 120: runtimeFiltersReservation_ > nit: I would rename it to runtimeFilterMemReservation_. Also, you may want how about "runtimeFiltersReservationBytes_", to make it consistent with its corresponding thrift field, which i just renamed to "runtime_filters_reservation_bytes" to be consistent with "min_reservation_bytes" http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@234 PS5, Line 234: new HashSet<RuntimeFilterId>(); > nit: we typically use guava's static methods, e.g. = Sets.newHashSet(); Done http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@432 PS5, Line 432: runtime-filters > maybe "runtime-filters-memory" Done http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@71 PS5, Line 71: * original predicates. > I think this is a good place to also comment on the resources used by runti Done http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@89 PS5, Line 89: MIN_BLOOM_FILTER_SIZE > I don't want to be pedantic but maybe use "RUNTIME" instead of "BLOOM". Sam These size restrictions are only for bloom filters, since min-max filters are already of a fixed size. Would you recommend I rename other variables like maxFilterSize, minFilterSize and defaultFilterSize to maxBloomFilterSize,minBloomFilterSize and defaultBLoomFilterSize respectively, just to be explicit. http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@91 PS5, Line 91: // Map of base table tuple ids to a list of runtime filters that > nit: be a good citizen, leave an empty line :) Done http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@106 PS5, Line 106: // Precomputed default BloomFilter size, rounded up to a power of two. > nit: "in bytes" Done http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@116 PS5, Line 116: BitUtil.roundUpToPowerOf2(Math.min(maxFilterSize, : MAX_BLOOM_FILTER_SIZE)); > Although they are equivalent, I think it's a bit more explicit if you say: Done http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@145 PS5, Line 145: BLOOM_FILTER_BUCKET_WORDS > Here it is ok to keep "BLOOM" Yes, this applies only to bloom filters http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@466 PS5, Line 466: the > remove Done http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@480 PS5, Line 480: ones > "values" Done http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@482 PS5, Line 482: public > Do we need these two functions to be public? Done http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@600 PS5, Line 600: boundFilterSize > Why calculate* and bound* happen in two different places? I had three options here: 1. do both in the RuntimeFilter constructor. => this would mean adding these three params to the constructor which already has an inflated list of params being passed in. 2. do both here => that would mean having a representation of an uninitialized bloom filter size like -1 after construction, then doing all the calculation here. 3. Doing calculate in the constructor and bound here => 'calculate' leaves the newly created RuntimeFilter in a meaningful state with a valid value for filterSize, and the three limits (defaultFilterSize, maxFilterSize, minFilterSize) are stored at the RuntimeFilterGenerator level, so having the "bound" method being called here made sense. I dont really have a strong preference for either. So, I am definitely open to suggestions. http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/service/BackendConfig.java File fe/src/main/java/org/apache/impala/service/BackendConfig.java: http://gerrit.cloudera.org:8080/#/c/8971/5/fe/src/main/java/org/apache/impala/service/BackendConfig.java@75 PS5, Line 75: public double getMaxFilterErrorRate() { : return backendCfg_.max_filter_error_rate; : } : : public long getMinBufferSize() { : return backendCfg_.min_buffer_size; : } > single lines Done -- To view, visit http://gerrit.cloudera.org:8080/8971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iea2759665fb2e8bef9433014a8d42a7ebf99ce1f Gerrit-Change-Number: 8971 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Mon, 29 Jan 2018 23:04:05 +0000 Gerrit-HasComments: Yes