Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9757 )
Change subject: IMPALA-6679,IMPALA-6678: reduce scan reservation ...................................................................... Patch Set 8: (11 comments) http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/hdfs-scan-node-base.h@502 PS8, Line 502: compute computed? http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/hdfs-scan-node-base.cc@519 PS8, Line 519: small footer range only so we won't request an increase. is that always true? i.e. is the footer size really bounded? http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/hdfs-scan-node-base.cc@537 PS8, Line 537: reservation_increase_success_counter_ will this counter be useful? Or does the ideal and actual reservation tell us what we need to know? http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/hdfs-scan-node-base.cc@522 PS8, Line 522: // Check if we could use at least one more max-sized I/O buffer for this range. Don't : // increase in smaller increments since we may not be able to use additional smaller : // buffers. : while (*reservation < ideal_scan_range_reservation) { : // Increase to the next I/O buffer multiple or to the ideal reservation. : int64_t target = min(ideal_scan_range_reservation, : BitUtil::RoundUpToPowerOf2(*reservation + 1, io_mgr->max_buffer_size())); : DCHECK_LT(*reservation, target); : COUNTER_ADD(reservation_increase_request_counter_, 1); : bool increased = buffer_pool_client_.IncreaseReservation(target - *reservation); : VLOG_FILE << "Increasing reservation from " : << PrettyPrinter::PrintBytes(*reservation) << " to " : << PrettyPrinter::PrintBytes(target) << " " : << (increased ? "succeeded" : "failed"); : if (!increased) break; : COUNTER_ADD(reservation_increase_success_counter_, 1); : *reservation = target; : } is that the same loop as in ScannerContext? If so, maybe factor it into a static utility routine? http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/hdfs-scan-node.h File be/src/exec/hdfs-scan-node.h: http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/hdfs-scan-node.h@189 PS8, Line 189: /// Return true if there is enough reservation to start an extra scanner thread. : /// Tries to increase reservation if enough is not already available in : /// 'spare_reservation_'. 'lock_' must be held via 'lock' : bool EnoughReservationForExtraThread(const boost::unique_lock<boost::mutex>& lock); now that ReturnReservationFromScannerThread() eagerly decreases the reservation, this will always need to to an increase, right? If that's true, how about renaming this to IncreaseReservationForExtraThread() or AcquireReservationForExtraThread() (to match Release)? Hmm, but i guess that's not true for the first thread case. but see my question later about spare_reservation_ accounting. http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/hdfs-scan-node.h@203 PS8, Line 203: return or all ? http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/hdfs-scan-node.cc@310 PS8, Line 310: // must be available. now that we return reservation eagerly and we hold the lock, will spare_reservation_ always be equal to the minimum reservation at this point? do we still need spare_reservation_ accounting then? http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/scanner-context.h File be/src/exec/scanner-context.h: http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/scanner-context.h@376 PS8, Line 376: reservation' is the amount of reservation that : /// is given to this stream for allocating I/O buffers how does this reservation relate to total_reservation_? I assume this is a portion of total_reservation_ (as opposed to additional reservation) given to this particular stream? http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/scanner-context.cc File be/src/exec/scanner-context.cc: http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/exec/scanner-context.cc@78 PS8, Line 78: } where does the reservation get released? who owns the reservation? after reading the rest of the code again, I see that we effectively transfer the ownership while the context is in scope, and then the scanner thread in the scan node code takes back ownership as this thing falls out of scope. I think explaining this somewhere would be helpful (basically that the context can acquire more reservation but it's the context's owner that has to subsume ownership of that reservation). http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/util/runtime-profile.h File be/src/util/runtime-profile.h: http://gerrit.cloudera.org:8080/#/c/9757/8/be/src/util/runtime-profile.h@204 PS8, Line 204: Returns space before http://gerrit.cloudera.org:8080/#/c/9757/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/9757/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1394 PS6, Line 1394: for (SlotDescriptor slot: desc_.getSlots()) { > This is the case where the value is logically inside a nested collection bu That helps. -- To view, visit http://gerrit.cloudera.org:8080/9757 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc80e05118a9eef72cac8e2308418122e3ee0842 Gerrit-Change-Number: 9757 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Wed, 18 Apr 2018 22:50:53 +0000 Gerrit-HasComments: Yes
