OK, so one difference here is that you're adding a new DAG SLA concept, which is useful. One subtle difference from what I think is the existing "concept" of SLA is that you evaluate it against when the dag run actually started, as opposed to when it should have started, and you evaluate it only while the run is in progress.
Let's suppose for a moment that everyone is on board with this and thinks it's a necessary tradeoff. Now let's look at individual task instance SLAs. With that change in the concept of what an SLA is, do we still *need* to move to a "soft timeout" for individual tasks? I think maybe not. Why could we not, at the same time as we evaluate the dag run SLA, also evaluate each task's SLA, measured against the same "start time" that the overall DAG SLA is evaluated against? This would seem to be more *like* the existing SLA concept for individual tasks, the difference being that it requires the dag to be running (which is already a requirement of your new task SLA concept). The other difference, again, is the start time vs. should-have-started time distinction. But this would also seem to remove the "doesn't work for deferrables" problem.
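
To make the suggestion concrete, here is a rough sketch of a single check that evaluates the dag run SLA and each task's SLA against the same actual start time. Everything in it is hypothetical: `DagRunState`, `TaskState`, and `find_sla_misses` are illustrative names, not Airflow classes or APIs, and it assumes a task's SLA means "must be finished within this long after the dag run started".

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class TaskState:
    # Hypothetical stand-in for a task instance; not an Airflow class.
    task_id: str
    sla: Optional[timedelta]  # allowed time, measured from the dag run's actual start
    finished_at: Optional[datetime] = None


@dataclass
class DagRunState:
    # Hypothetical stand-in for a dag run; not an Airflow class.
    started_at: datetime  # actual start time, not "should have started"
    sla: Optional[timedelta]
    tasks: List[TaskState] = field(default_factory=list)
    finished_at: Optional[datetime] = None


def find_sla_misses(run: DagRunState, now: datetime) -> List[str]:
    """Return ids of everything that has missed its SLA as of `now`.

    The dag run SLA and every task SLA share one reference point:
    run.started_at.
    """
    misses = []
    if run.sla is not None:
        end = run.finished_at or now
        if end - run.started_at > run.sla:
            misses.append("<dag_run>")
    for task in run.tasks:
        if task.sla is None:
            continue
        # An unfinished task is measured up to `now`.
        end = task.finished_at or now
        if end - run.started_at > task.sla:
            misses.append(task.task_id)
    return misses


start = datetime(2024, 1, 1, 0, 0)
run = DagRunState(
    started_at=start,
    sla=timedelta(hours=2),
    tasks=[
        TaskState("extract", timedelta(minutes=30), finished_at=start + timedelta(minutes=20)),
        TaskState("load", timedelta(hours=1)),  # still running (or deferred)
    ],
)
misses_at_0130 = find_sla_misses(run, start + timedelta(minutes=90))
misses_at_0230 = find_sla_misses(run, start + timedelta(minutes=150))
```

Note that the check only looks at timestamps, so in this sketch it doesn't matter whether a task is occupying a worker or is deferred at the time of evaluation, which is the reason I'd expect the deferrables problem to go away.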