On Thu, 11 Aug 2016 17:57:27 +0300, Dmitry V. Levin wrote:
> Date: Thu, 11 Aug 2016 17:57:27 +0300
> From: "Dmitry V. Levin" <l...@altlinux.org>
> To: strace-devel@lists.sourceforge.net
> Subject: Re: [PATCH 3/4] Introduce syscall fault injection feature
> Message-ID: <20160811145727.gb30...@altlinux.org>
> In-Reply-To: <20160811144031.2fptrdylyhobfogl@Bane>
> List-Id: strace development list <strace-devel.lists.sourceforge.net>
>
> On Thu, Aug 11, 2016 at 04:40:31PM +0200, Nahim El Atmani wrote:
> [...]
> > > > The thing is this time I need a copy of the global sparse array by
> > > > tcb.
> > > > I was
> > >
> > > I don't see why one may need a sparse array by tcb.
> > >
> > > There has to be a global sparse array that fully describes fault
> > > injection settings. As the decision whether/how each particular
> > > syscall is going to be fault-injected is made on entering syscall,
> > > the only fault injection related state that has to be stored in each
> > > tcb is the information whether this particular syscall is being
> > > fault-injected, and the error code that has to be injected on
> > > exiting syscall.
> >
> > Yes, plus the accounting variable 'cnt' in the struct fault_opts to
> > know whether we have to discard the syscall this time or not. If we
> > don't bring this one we get the race condition I just mentioned in my
> > previous email. So I can either create a subset of the struct
> > fault_opts or simply shadow it since the « memory cost is
> > negligible ». What do you think?
>
> There is no need to keep fault injection accounting in struct tcb.
>
> The decision whether this particular syscall has to be fault-injected
> this time or not is made on entering syscall, right? Every time the
> decision is made, global accounting is updated if needed.

OK, let's take a small example to see why neither keeping the accounting
information in global scope nor keeping the flag as is in the tcb works.

If we have two tracees, and we want to cancel the second write *each* of
them may issue, we will end up doing something like:

  $ cat & cat &
  $ ./strace -p PID1 -p PID2 -e write -e fault=write:2:EAGAIN

If the first tracee writes twice, this does trigger the fault injection
mechanism, which discards the second call. Each occurrence updates the
global accounting information 'cnt' and sets the tcb flag to FAULT_ENTER
or FAULT_DONE. However, if the second tracee then writes twice, we never
reach the fault injection mechanism, because 'cnt' is global and already
equal to two (three at the time we test it in fault_syscall_enter()). So
it's clear that we want the counter per thread, hence in the tcb.

Now for the flag dropped as is into the tcb: currently FAULT_DONE is set
when the FAULT_AT mode is used to fault the nth occurrence of a
particular syscall. It's used mainly to exit early from
fault_syscall_enter() if the fault has already been done for the target
syscall. So it's something we want *per* syscall. Storing this
information in the tcb flags makes the state per tracee instead, so we
can't inject faults into more than one syscall at a time, which is
pretty bad.

I hope this small explanation makes things clearer.

-- 
Nahim El Atmani <nahim+...@naam.me>
http://naam.me/
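
The 'cnt' argument above can be illustrated with a minimal sketch in C.
All names here (struct fault_rule, struct tracee, should_fault_global(),
should_fault_per_tracee()) are hypothetical stand-ins and are not taken
from the patch; the sketch only models "fault the nth occurrence" with a
shared counter versus a counter kept per tracee.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one fault rule such as write:2:EAGAIN. */
struct fault_rule {
	int nth;        /* fault the nth matching syscall */
	int global_cnt; /* one counter shared by every tracee */
};

/* Hypothetical stand-in for the per-tracee state (what would live in the tcb). */
struct tracee {
	int cnt;        /* matching syscalls seen by this tracee only */
};

/* Global accounting: whichever tracee enters the syscall bumps the same counter. */
bool
should_fault_global(struct fault_rule *rule)
{
	assert(rule->nth > 0);
	return ++rule->global_cnt == rule->nth;
}

/* Per-tracee accounting: each tracee counts its own matching syscalls. */
bool
should_fault_per_tracee(const struct fault_rule *rule, struct tracee *t)
{
	assert(rule->nth > 0);
	return ++t->cnt == rule->nth;
}
```

With nth set to 2 and two tracees each entering the syscall twice, one
after the other, should_fault_global() returns true only for the second
call overall, so the second tracee is never faulted.
should_fault_per_tracee() returns true on the second call of each
tracee, which is the behaviour wanted in the example above.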