On Wed, 14 Dec 2016 02:44:46 +0100 Frederic Weisbecker <[email protected]> wrote:
> On Tue, Dec 13, 2016 at 02:21:21PM +0100, Martin Schwidefsky wrote: > > On Tue, 13 Dec 2016 12:13:22 +0100 > > Martin Schwidefsky <[email protected]> wrote: > > > > > On Mon, 12 Dec 2016 16:02:30 +0100 > > > Frederic Weisbecker <[email protected]> wrote: > > > > > > > On Mon, Dec 12, 2016 at 11:27:54AM +0100, Martin Schwidefsky wrote: > > > > > 3) The call to vtime_flush in account_process_tick is done in irq > > > > > context from > > > > > update_process_times. hardirq_offset==1 is also correct. > > > > > > > > Let's see this for example: > > > > > > > > + if ((tsk->flags & PF_VCPU) && (irq_count() - hardirq_offset == > > > > 0)) > > > > + S390_lowcore.guest_timer += timer; > > > > > > > > If the tick is interrupting guest, we have accounted the guest time on > > > > tick IRQ entry. > > > > Now we are in the middle of the tick interrupt and since hardirq_offset > > > > is 1, we > > > > are taking the above path by accounting half of the tick-IRQ time as > > > > guest, which is wrong, > > > > it's actually IRQ time. > > > > > > Hmm, you got me there. The system time from irq_enter until > > > account_process_tick > > > is reached is indeed IRQ time. It is not much but it is incorrect. The > > > best fix > > > would be to rip out the accounting of the system time from > > > account_process_tick > > > as irq_enter / irq_exit will do system time accounting anyway. To do that > > > do_account_vtime needs to be split, because for the task switch we need to > > > account the system time of the previous task. > > > > New patch for the delayed cputime account. I can not just rip out system > > time > > accounting from account_process_tick after all, I need a sync point for the > > steal time calculation. It basically is the same patch as before but with a > > new > > helper update_tsk_timer, the removal of hardirq_offset and a simplification > > for do_account_vtime: the last accounting delta is either hardirq time for > > the tick or system time for the task switch. > > > > Keeping my finger crossed.. > > The patch looks good. But you might want to remove the hardirq_offset in a > separate patch to queue for this merge window (I'm not sure if it needs a > stable tag, the argument may be there since the beginning). > > Because the rest depends on the series that is unlikely to be queued in this > merge window at this stage. I just pushed two fixes to the linux-s390:fixes tree which will be merged eventually after the first -rc candidate for 4.10 is released. This includes "s390/vtime: correct system time accounting" which fixes the hardirq_offset bug for s390. The link to the fixes-tree in case you want to look at the patch: git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git fixes -- blue skies, Martin. "Reality continues to ruin my life." - Calvin.

