Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9447 )
Change subject: IMPALA-6512: Maintenace thread period should respect FLAGS_datastream_sender_timeout_ms ...................................................................... Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/9447/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9447/1//COMMIT_MSG@7 PS1, Line 7: IMPALA-6512: Maintenace thread period should respect FLAGS_datastream_sender_timeout_ms > is that the right JIRA? Wasn't that one addressed with Lars' error message Yes, I originally thought that it was just the error message not matching. However, for some reasons, we weren't catching this other issue. May be the custom cluster test didn't properly enable KRPC before Lars' recent changes to add a flag to enable KRPC during testing. http://gerrit.cloudera.org:8080/#/c/9447/1/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9447/1/be/src/runtime/krpc-data-stream-mgr.cc@367 PS1, Line 367: DCHECK_GE(STREAM_EXPIRATION_TIME_MS, 10000); > what does this mean? what's the reason for not using STREAM_EXPIRATION_TIME Not sure if it's entirely desirable to use the min(FLAGS_datastream_sender_timeout_ms, STREAM_EXPIRATION_TIME_MS) / 2 as the period of the maintenance thread. While this provides the guarantee that the thread is late by at most half of the timeout period, this also means the amount of lateness is directly proportional to the timeout period. This seems quite inappropriate if the sender sets a timeout to a large value (e.g. 10 mins) and it only gets notified 14~15 mins later. Having the upper bound of 10 seconds seem to make the behavior more deterministic. I removed this DCHECK as STREAM_EXPIRATION_TIME_MS is an internal timeout and there is no serious need to have time guarantee on how late we remove expired entries as long as garbage collection happens periodically. http://gerrit.cloudera.org:8080/#/c/9447/1/be/src/runtime/krpc-data-stream-mgr.cc@369 PS1, Line 369: (1 > is 1ms a good lower bound, or should we make it 10? Though, I guess it only FLAGS_datastream_sender_timeout_ms has millisecond granularity so waking up once every millisecond should work even for the min possible value it's set to. Yes, it only matters if FLAGS_datastream_sender_timeout_ms is set to very small value which I believe is uncommon. http://gerrit.cloudera.org:8080/#/c/9447/1/be/src/runtime/krpc-data-stream-mgr.cc@375 PS1, Line 375: int64_t now = MonotonicMillis(); > maybe hoist that up, and use the same 'now' value for the entire iteration. Done -- To view, visit http://gerrit.cloudera.org:8080/9447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I804cef7cc991007ec44375f8eac804aa2df46bd7 Gerrit-Change-Number: 9447 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Comment-Date: Tue, 27 Feb 2018 07:58:35 +0000 Gerrit-HasComments: Yes
