On 14/06/2016 10:39, Stefan Hajnoczi wrote: > On Mon, Jun 13, 2016 at 06:39:46PM +0200, Lluís Vilanova wrote: >> Paolo Bonzini writes: >> >>> On 13/06/2016 14:15, Lluís Vilanova wrote: >>>>> That said, I am skeptical about the benefit of the interfaces you are >>>>> adding. They add a lot of complication and overhead (especially >>>>> regarding the memory/cache overhead of the dstate array) without a clear >>>>> use case, in my opinion; all the processing you do at run-time is just >>>>> as well suited for later filtering. >>>> >>>> This should make tracing faster on the future with multi-threaded TCG, as >>>> well >>>> as trace files much smaller if you're tracing something like memory >>>> accesses. Also, bear in mind this series was split from a much larger one >>>> for >>>> simplicity. The follow-up one provides much larger performance benefits by >>>> avoiding the generation of TCG code to call the tracing backend when a >>>> vCPU is >>>> not traced. >> >>> This still assumes that tracing only some VCPUs is a common use case. >>> Is it?... >> >> I use it for code profiling by sampling across vCPUs, or only on vCPUs I know >> run processes of my interest. The profiles can then be used for analyzing the >> application/system behaviour. >> >> Also, with the fast-path checks already in place >> ('trace_events_enabled_count'), >> performance when not tracing should never be worse with this series. >> >> If this feature does not look useful to overall QEMU I will fold it into my >> other out-of-tree patches. > > I think the per-vcpu tracing feature is reasonable for qemu.git as long > as it does not introduce performance regressions for existing users.
I'm okay with it if the dstate array is changed to uint16_t. Thanks, Paolo