Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 )
Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation ...................................................................... Patch Set 11: (19 comments) http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.h@a222 PS11, Line 222: : : did you mean to just delete that? http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.h@457 PS11, Line 457: queue queued? http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.cc File be/src/runtime/io/disk-io-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.cc@333 PS11, Line 333: CANCELLED seems like the right thing to do but are you sure we weren't depending on propagating the error status through this path? how does the potential error get propagated now and are we careful to not overwrite it with this CANCELLED further up the stack / across threads? http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.cc@581 PS11, Line 581: DCHECK(!buffer->is_cached()) << "HDFS cache reads don't go through this code path."; maybe put that dcheck upstream too (like ReadRange() or even earlier) to help clarify the paths)? http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.cc@665 PS11, Line 665: DCHECK nit: _EQ http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.h File be/src/runtime/io/request-context.h: http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.h@448 PS11, Line 448: threads disk threads? http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-context.h File be/src/runtime/io/request-context.h: http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-context.h@27 PS9, Line 27: For most I/O manager clients it is an : /// opaque pointer, but some clients may need to include this heade is that still true? http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-context.h@93 PS9, Line 93: void Cancel(); how did you decide to make this a first class thing in the RequestContext, as opposed to all the other DiskIoMgr methods that take a RequestContext* as their first parameter? Should those also (eventually) just becomes methods on context? (And maybe context is really "DiskIoMgrClient" - though still unclear exactly what this abstraction should be)? e.g. Read(), GetNextRange(), AddScanRanges(), AddWriteRange() kinda seem like methods on the context (or client). http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc File be/src/runtime/io/request-context.cc: http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@42 PS11, Line 42: << static_cast<int>(range->external_buffer_tag_); oh, I guess DCHECK_EQ doesn't work with enum or something? http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@75 PS11, Line 75: reader what about the writing case? is that relevant? http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@76 PS11, Line 76: GetNext GetNext() http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@79 PS11, Line 79: it does this referring to the context or something else? http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@90 PS11, Line 90: cancelled Status does that mean the CANCELLED status or the status that lead to cancellation (if cancelled for an error)? http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@91 PS11, Line 91: the cancelled state. what does that mean? http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@95 PS11, Line 95: decrements the number of disk queues the context : // is on. which code is this talking about? Oh, I guess that's now in DecrementDiskThread()? http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-ranges.h File be/src/runtime/io/request-ranges.h: http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-ranges.h@a391 PS11, Line 391: great to get rid of this. but what is it that allows us to no longer need it? http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-ranges.h@400 PS11, Line 400: to the I/O manager should that be reworded given where ReturnBuffer() now lives? One option is to just delete that. http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-ranges.h File be/src/runtime/io/request-ranges.h: http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-ranges.h@226 PS9, Line 226: Read() I guess that's talking about DiskIoMgr::Read()? I wonder if we should get rid of Read() so that the lifecycle is even more explicit (so we don't have cases where you get the buffer via DiskIoMgr but return it back to the scan range)? http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-ranges.h@236 PS9, Line 236: const Status& status why is that needed? it seems like a circuitous (and maybe redundant) way to propagate the status. I'm not saying we need to clean it up now but just wondering whether this makes sense or if it's something that can be simplified at some point. -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Tianyi Wang <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Sat, 18 Nov 2017 01:14:43 +0000 Gerrit-HasComments: Yes
