[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Dan Hecht (Code Review) Mon, 06 Mar 2017 16:40:04 -0800

Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
......................................................................



Patch Set 8:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

PS8, Line 108: if
iff, just to be super clear.


Line 116:   }
fine with me to keep, but isn't this the same as line 109?


Line 141:     ss << &*read_page_;
will it be possible to identify which page in the list printed below this is?


Line 166:   DCHECK(read_page_ == pages_.end());
DCHECK_EQ x2


PS8, Line 219: If the stream is pinned, the previous write page will remain 
pinned
             :   // with pin count 1 as a read page. If a read/write stream is 
pinned, the write
             :   // page has pin count 2, so we need to unpin once
these two sentences seem to both be talking about pinned streams, but i'm not 
sure what they are trying to conclude. i also don't understand this if-stmt 
that follows.

Why isn't it simply:

if (write_page_ != nullptr) UnpinPage(write_page_); ?

i.e. we need to unpin *for write* the page regardless of the state of the 
read-iterator, no?  i.e. can't we separate out reasoning of the read and write 
iterators?


PS8, Line 232: read_page_ != pages_.end()
what does this condition mean? why isn't it just: if the stream is pinned for 
reading, need to take a pin count for both read and write, otherwise need a pin 
count for only write?


Line 241:     RETURN_IF_ERROR(buffer_pool_->Pin(buffer_pool_client_, 
&new_page.handle));
does this purposely not go through PinPage()?


PS8, Line 250: (double_pin ? 2 : 1)
i guess because of this. maybe move this to be after the CreatePage() call and 
just add 1 in there and then let PinPage() do its job?


Line 348:     *pinned = true;
why was this needed?


PS8, Line 356: We need to double-pin the current write page if we also have a 
read iterator.
why? or maybe you mean just that we need to take a pin-count for reading even 
if it's already pinned for write? if that's the case, I think we should just 
talk about "read pinning" and "write pinning" separately (and this routine 
shouldn't have to muck with write pinning, right?)


PS8, Line 358: write_page_ == &page && read_page_ == pages_.end()
why? i thought we now do take a separate pin count for read and for write?


Line 387:       if (pinned_) UnpinPage(&page);
do we allow you to call UnpinStream() on an already unpinned stream? why?


PS8, Line 388: continue
i think this would be clearer as:

if (is_write_page || is_read_page) {

} else {

}

rather than the "continue" flow.


http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS8, Line 68: .
maybe add: "for reading"


PS8, Line 69: access
read access?


Line 71: /// unpinned and therefore do not use the client's reservation and can 
be spilled to disk.
comments in this paragraph are meant to further clarify that pinned vs. 
unpinned stream really only applies to the read-iterator.


PS8, Line 75: event
even


PS8, Line 319: bytes_pinned
i guess should be BytesPinned() since not a simple accessor


PS8, Line 385: including any pages already deleted in 'delete_on_read'
             :   /// mode.
that's not intuitive. is there a good reason why we do that rather than make it 
always be the size of the stream? and if so, seems like it should be called 
highwater_byte_size or something.


PS8, Line 392: num_rows_returned_
rows_returned_? (or rename that member, which might be better).


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Reply via email to