Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14039 )
Change subject: IMPALA-8818: Replace deque with spillable queue in BufferedPRS ...................................................................... Patch Set 7: (9 comments) Nice! Had some relatively minor comments. I think a few of them come from a philosophical thing about error handling - I think I'd prefer if SpillableRowBatchQueue's invariants were stricter about when it was valid to call different methods in the lifecycle - currently there are a few runtime checks that would only be triggered if there was a bug in the calling code. There's a couple of reasons I prefer it this way (and why we've generally done it this way in Impala). First, if the error-handling code isn't actually reachable, then we can't really test that it works correctly. Second, too many defensive checks add complexity and runtime overhead if the pattern is repeated - and it does actually get confusing whether something is an invariant of the system or a runtime error. http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/exec/exec-node.h File be/src/exec/exec-node.h: http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/exec/exec-node.h@196 PS7, Line 196: const std::string label() const; I think the const is unnecessary for a return by value (if you're returning by value, the caller gets its own copy anyway, so can choose whether it wants to mutate it). http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/buffered-tuple-stream.h File be/src/runtime/buffered-tuple-stream.h: http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/buffered-tuple-stream.h@370 PS7, Line 370: int64_t BytesUnpinned() const { I'd probably just call it bytes_unpinned() cause it's a plain accessor (there's a bit of a grey area in naming conventions here, but plain accessors usually have are lower case with underscores). http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.h File be/src/runtime/spillable-row-batch-queue.h: http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.h@71 PS7, Line 71: /// successfully added, returns an error Status if the queue is full, has already been Why not just make it invalid to append to the queue when it's full? We could then make it a DCHECK instead of a runtime check http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.h@79 PS7, Line 79: /// Returns and removes the RowBatch at the head of the queue. Returns Status::OK() if Same thing - if it's not valid to call GetBatch() on an empty queue, just make it part of the contract and use a DCHECK instead of the runtime check. http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.cc File be/src/runtime/spillable-row-batch-queue.cc: http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.cc@67 PS7, Line 67: if (UNLIKELY(closed_)) return Status("SpillableRowBatchQueue is already closed."); If it's a bug to call this when it's closed, let's just make it a DCHECK. http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.cc@112 PS7, Line 112: if (UNLIKELY(closed_)) return false; I think it would be a bug to call this when it's closed, so we could document it as a precondition and make it a DCHECK http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.cc@130 PS7, Line 130: if (batch_queue_ != nullptr) Missing braces for multi-line if http://gerrit.cloudera.org:8080/#/c/14039/7/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/14039/7/common/thrift/ImpalaService.thrift@415 PS7, Line 415: // The maximum amount of pinned memory used when spooling query results. If this value It's probably better to avoid renumbering things in thrift files. I think it's ok in this case since it's not really exposed in public APIs, but it's probably a good habit to avoid doing it. http://gerrit.cloudera.org:8080/#/c/14039/7/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java: http://gerrit.cloudera.org:8080/#/c/14039/7/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@88 PS7, Line 88: maxMemReservationBytes = Math.max(maxMemReservationBytes, minMemReservationBytes); Move this up to l74 so that the calculation is all in one place? -- To view, visit http://gerrit.cloudera.org:8080/14039 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I10f9e72374cdf9501c0e5e2c5b39c13688ae65a9 Gerrit-Change-Number: 14039 Gerrit-PatchSet: 7 Gerrit-Owner: Sahil Takiar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Mon, 19 Aug 2019 21:26:51 +0000 Gerrit-HasComments: Yes
