Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
......................................................................


Patch Set 20:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/disk-io-mgr.h@289
PS20, Line 289: AddBuffersToRange
> nit: AllocateBuffersForRange
Done


http://gerrit.cloudera.org:8080/#/c/8707/14/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8707/14/be/src/runtime/io/disk-io-mgr.h@259
PS14, Line 259: do
> nit: does
Done


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/disk-io-mgr.cc@826
PS20, Line 826: ScheduleMode::IMMEDIATELY
> please see comment on request-context.cc::179
Done


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/request-context.cc
File be/src/runtime/io/request-context.cc:

http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/request-context.cc@174
PS20, Line 174: disk_state->ScheduleContext(lock, this, range->disk_id());
> shouldn't this be called only for ScheduleMode::IMMEDIATELY?
Right, the range won't be scheduled, because it needs to move to 
'next_scan_range_to_start', then be picked up by GetNextRange(). We may need to 
schedule the context to update next_scan_range_to_start though. The above 
comment was meant to explain this, but it's a bit confusing. Tried to improve 
it.


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/request-context.cc@179
PS20, Line 179: DCHECK(schedule_mode == ScheduleMode::IMMEDIATELY) << 
static_cast<int>(schedule_mode);
> can you document this in the method comment, that schedule_mode should alwa
Updated the comment.

I didn't change the behaviour of writes, but changed the enum value to better 
fit the behaviour. Writes never have to wait for a client to do anything, 
because they're picked up off the unstarted_write_ranges() queue automatically, 
so I felt like "IMMEDIATELY" fit better, since GETNEXT doesn't need to be 
involved.


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/request-ranges.h
File be/src/runtime/io/request-ranges.h:

http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/request-ranges.h@437
PS20, Line 437: unused_buffers_
> should we call this unused_internal_buffers? to make the distinction clear
We call the "IoMgr-allocated buffers" in the header, so used 
'unused_iomgr_buffer' as the term here.


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/request-ranges.h@444
PS20, Line 444: unused_buffer_bytes_returned_
> this implicitly signifies the amount of bytes that have been read or are be
I moved it to GetNextBufferforRange(). That seems to make it a little clearer.


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/scan-range.cc
File be/src/runtime/io/scan-range.cc:

http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/scan-range.cc@139
PS20, Line 139: unused_buffer_bytes_ += buffer->buffer_len();
> should we do a check for (unused_buffer_bytes_ >= len_ - unused_buffer_byte
Yeah we should. I'm just going to move the whole "if" inside the loop instead 
of having an additional layer of nesting.


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/scan-range.cc@141
PS20, Line 141: if (blocked_on_buffer_) {
              :           // This scan range was blocked and is no longer, 
schedule it again.
              :           blocked_on_buffer_ = false;
              :           schedule_range = true;
              :         }
> maybe do this outside the for loop?
It seems a little easier to follow to do this at the point when we add the 
buffer, since the unblocking is connected to the buffer being added. Especially 
with moving the if statement inside the loop, it seems easier to just check 
every time we add a buffer, instead of trying to have a separate condition 
outside the loop.



--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 20
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Thu, 01 Feb 2018 22:38:08 +0000
Gerrit-HasComments: Yes

Reply via email to