xintongsong commented on issue #8704: [FLINK-12812][runtime] Set resource profiles for task slots URL: https://github.com/apache/flink/pull/8704#issuecomment-508334308

Thanks for the review, @StephanEwen.

I would like to address your concern about the assumption that the RM and TMs have the same configuration.

We need to calculate a TM's slot resource profiles on the RM side because the resource profile of a `PendingTaskManagerSlot` must be set before the corresponding TM is started. Currently, Flink can assign a pending slot to a slot request before the TM is started and registered. This way, subsequent slot requests first consume the remaining slots on the pending TM (for multi-slot TMs) before a new TM is requested and launched. When the TM registers, the SlotManager matches each newly registered slot to a `PendingTaskManagerSlot` with the same resource profile and assigns the registered slot to the slot request that the pending slot was assigned to (if any).

Before this PR, both the pending slot on the RM side and the actual slot on the TM side had the same resource profile `ANY`, so they could be matched with the `equals` method. Since this PR sets the slot resource profile on the TM side to the actual resources of the slot, the resource profile of the pending slots on the RM side has to be set in the same way. This is why I introduced the RM-side calculation of the TM's slot resource profiles, together with the approximate matching.

Assignment to pending slots and the RM-side calculation of slot resources only happen on Yarn/Mesos. In these scenarios the TMs do have the same configuration as the RM, because the configuration is transmitted from the RM side. For a standalone cluster there should be no pending slots, because the RM cannot actively start any TM.
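
To make the pending-slot matching described above more concrete, here is a minimal Java sketch. It is a simplified model under stated assumptions, not Flink's actual SlotManager code: the class names `PendingSlotMatchingSketch`, `Profile` and `PendingSlot`, the method `onSlotRegistered`, and the request id strings are all hypothetical, and matching uses exact `equals` only (the approximate matching introduced in this PR is not modeled).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

// Hypothetical, simplified model of pending-slot matching; not Flink's real classes.
class PendingSlotMatchingSketch {

    /** Stand-in for a slot resource profile: CPU cores and memory in MB. */
    static final class Profile {
        final double cpuCores;
        final int memoryMb;

        Profile(double cpuCores, int memoryMb) {
            this.cpuCores = cpuCores;
            this.memoryMb = memoryMb;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Profile)) {
                return false;
            }
            Profile other = (Profile) o;
            return Double.compare(cpuCores, other.cpuCores) == 0 && memoryMb == other.memoryMb;
        }

        @Override
        public int hashCode() {
            return Objects.hash(cpuCores, memoryMb);
        }
    }

    /** A slot expected from a TM that was requested but has not registered yet. */
    static final class PendingSlot {
        final Profile profile;
        String assignedRequestId; // null while no slot request is assigned

        PendingSlot(Profile profile) {
            this.profile = profile;
        }
    }

    private final List<PendingSlot> pendingSlots = new ArrayList<>();

    void addPendingSlot(PendingSlot slot) {
        pendingSlots.add(slot);
    }

    int pendingCount() {
        return pendingSlots.size();
    }

    /**
     * Called when a TM registers a slot. Removes the first pending slot whose
     * profile equals the registered one and returns the id of the request that
     * pending slot was assigned to; returns null if it was unassigned or if no
     * pending slot matches.
     */
    String onSlotRegistered(Profile registered) {
        Iterator<PendingSlot> it = pendingSlots.iterator();
        while (it.hasNext()) {
            PendingSlot pending = it.next();
            if (pending.profile.equals(registered)) {
                it.remove();
                return pending.assignedRequestId;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        PendingSlotMatchingSketch rm = new PendingSlotMatchingSketch();

        // RM side: profile computed from configuration before the TM exists.
        PendingSlot pending = new PendingSlot(new Profile(1.0, 1024));
        pending.assignedRequestId = "request-1";
        rm.addPendingSlot(pending);

        // TM side: the registered slot reports its actual resources.
        String inherited = rm.onSlotRegistered(new Profile(1.0, 1024));
        System.out.println("registered slot inherits: " + inherited);
    }
}
```

The sketch shows why the RM-side profile must be computed the same way as on the TM side: if the two profiles differ, `onSlotRegistered` finds no pending slot and the request assigned to it is not carried over to the registered slot.
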

Apart from `PendingTaskManagerSlot`, the RM does use the slot resource profiles reported by the TMs, both for matching slot requests against registered slots and for converting a requested `UNKNOWN` resource profile to a default value (as shown in the follow-up PR #8846 for dynamic managed memory). Therefore, this should not cause problems on a standalone cluster whose TMs have different configurations.

It's my fault for not making this clear in the code and comments.

I'll address the rest of your comments ASAP. I especially appreciate your suggestions on encapsulation and on simplifying the tests. It's a good lesson for me. Thank you again.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
