Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-08-20 Thread Rusty Russell
On Thu, 9 Aug 2012 15:46:20 +0530, Amit Shah amit.s...@redhat.com wrote:
 Hi,
 
 On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:
  Hi All,
  
  The following patch set provides a low-overhead system for collecting kernel
  tracing data of guests by a host in a virtualization environment.
 
 So I just have one minor comment, please post a non-RFC version of the
 patch.
 
 Since you have an ACK from Steven for the ftrace patch, I guess Rusty
 can push this in via his virtio tree?
 
 I'll ack the virtio-console bits in the next series you send.

You didn't Ack, BTW.  At least, AFAICT.

Cheers,
Rusty.



Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-08-20 Thread Amit Shah
On (Tue) 21 Aug 2012 [11:47:16], Rusty Russell wrote:
 On Thu, 9 Aug 2012 15:46:20 +0530, Amit Shah amit.s...@redhat.com wrote:
  Hi,
  
  On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:
   Hi All,
   
   The following patch set provides a low-overhead system for collecting 
   kernel
   tracing data of guests by a host in a virtualization environment.
  
  So I just have one minor comment, please post a non-RFC version of the
  patch.
  
  Since you have an ACK from Steven for the ftrace patch, I guess Rusty
  can push this in via his virtio tree?
  
  I'll ack the virtio-console bits in the next series you send.
 
 You didn't Ack, BTW.  At least, AFAICT.

Ah, sorry.  Will do that now.

Amit



Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-08-09 Thread Amit Shah
Hi,

On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:
 Hi All,
 
 The following patch set provides a low-overhead system for collecting kernel
 tracing data of guests by a host in a virtualization environment.

So I just have one minor comment: please post a non-RFC version of the
patch.

Since you have an ACK from Steven for the ftrace patch, I guess Rusty
can push this in via his virtio tree?

I'll ack the virtio-console bits in the next series you send.

Thanks,

Amit



Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-30 Thread Yoshihiro YUNOMAE

Hi Amit,

Sorry for the late reply.

(2012/07/27 18:43), Amit Shah wrote:

On (Fri) 27 Jul 2012 [17:55:11], Yoshihiro YUNOMAE wrote:

Hi Amit,

Thank you for commenting on our work.

(2012/07/26 20:35), Amit Shah wrote:

On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:




[...]



***Just enhancement ideas***
  - Support for trace-cmd
  - Support for 9pfs protocol
  - Support for non-blocking mode in QEMU


There were patches long back (by me) to make chardevs non-blocking but
they didn't make it upstream.  Fedora carries them, if you want to try
out.  Though we want to converge on a reasonable solution that's
acceptable upstream as well.  Just that no one's working on it
currently.  Any help here will be appreciated.


Thanks! In this case, since a guest will stop running while the host
reads the guest's trace data, the char device needs a non-blocking
mode. I'll read your patch series. Is the latest version v8?
http://lists.gnu.org/archive/html/qemu-devel/2010-12/msg00035.html


I suppose the latest version on-list is what you quote above.  The
objections to the patch series are mentioned in Anthony's mails.


I'll check the mails.


Hans maintains a rebased version of the patches in his tree at

http://cgit.freedesktop.org/~jwrdegoede/qemu/

those patches are included in Fedora's qemu-kvm, so you can try them
out to see if they improve performance for you.


Thanks. I'll check those patches.


  - Make vhost-serial


I need to understand a) why it's perf-critical, and b) why should the
host be involved at all, to comment on these.


a) To reduce the collection overhead for applications running in a guest.
(see above)
b) Host kernel trace data is not involved even if we introduce this
patch set.


I see, so you suggested vhost-serial only because you saw the guest
stopping problem due to the absence of non-blocking code?  If so, it
now makes sense.  I don't think we need vhost-serial in any way yet.


Understood. We suggested vhost-serial as one idea for improving
performance. The other features (trace-cmd, 9pfs, and a non-blocking
chardev) should be supported first, I think.


BTW where do you parse the trace data obtained from guests?  On a
remote host?


Ideally, we would parse the data on a remote host in this tracing
system. The existing trace-cmd can already parse it on a remote site.
If we add a feature that collects event-format data (which a guest's
debugfs provides) from guests, we can parse the tracing data on a
remote host as well as on the host running the guests.

Thank you,

--
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com





Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-27 Thread Yoshihiro YUNOMAE

Hi Amit,

Thank you for commenting on our work.

(2012/07/26 20:35), Amit Shah wrote:

On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:


[...]



Therefore, we propose a new system virtio-trace, which uses enhanced
virtio-serial and existing ring-buffer of ftrace, for collecting guest kernel
tracing data. In this system, there are 5 main components:
  (1) Ring-buffer of ftrace in a guest
  - When trace agent reads ring-buffer, a page is removed from ring-buffer.
  (2) Trace agent in the guest
  - Splice the page of ring-buffer to read_pipe using splice() without
memory copying. Then, the page is spliced from write_pipe to virtio
without memory copying.


I really like the splicing idea.


Thanks. We will improve this patch set.


  (3) Virtio-console driver in the guest
  - Pass the page to virtio-ring
  (4) Virtio-serial bus in QEMU
  - Copy the page to kernel pipe
  (5) Reader in the host
  - Read guest tracing data via FIFO(named pipe)


So will this be useful only if guest and host run the same kernel?

I'd like to see the host kernel not being used at all -- collect all
relevant info from the guest and send it out to qemu, where it can be
consumed directly by apps driving the tracing.


No, this patch set is used only for guest kernels, so guest and host
don't need to run the same kernel.


***Evaluation***
When a host collects tracing data of a guest, the performance of using
virtio-trace is compared with that of using native(just running ftrace),
IVRing, and virtio-serial(normal method of read/write).


Why is tracing performance-sensitive?  i.e. why try to optimise this
at all?


To minimize the effect on applications running in the guests when the
host collects the guests' tracing data.
For example, assume guests A and B are running on a host and sharing
an I/O device. An I/O delay problem occurs in guest A, but guest B
still meets its requirements. In this case, we need to collect tracing
data from both guests A and B, but the usual network-based method puts
a high load on guest B's applications even though guest B is running
normally. Therefore, we try to decrease the load on the guests.
We also use this feature for performance analysis on production
virtualization systems.

[...]



***Just enhancement ideas***
  - Support for trace-cmd
  - Support for 9pfs protocol
  - Support for non-blocking mode in QEMU


There were patches long back (by me) to make chardevs non-blocking but
they didn't make it upstream.  Fedora carries them, if you want to try
out.  Though we want to converge on a reasonable solution that's
acceptable upstream as well.  Just that no one's working on it
currently.  Any help here will be appreciated.


Thanks! In this case, since a guest will stop running while the host
reads the guest's trace data, the char device needs a non-blocking
mode. I'll read your patch series. Is the latest version v8?
http://lists.gnu.org/archive/html/qemu-devel/2010-12/msg00035.html


  - Make vhost-serial


I need to understand a) why it's perf-critical, and b) why should the
host be involved at all, to comment on these.


a) To reduce the collection overhead for applications running in a guest.
   (see above)
b) Host kernel trace data is not involved even if we introduce this
   patch set.

Thank you,

--
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com





Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-27 Thread Amit Shah
On (Fri) 27 Jul 2012 [17:55:11], Yoshihiro YUNOMAE wrote:
 Hi Amit,
 
 Thank you for commenting on our work.
 
 (2012/07/26 20:35), Amit Shah wrote:
 On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:
 
 [...]
 
 
 Therefore, we propose a new system virtio-trace, which uses enhanced
 virtio-serial and existing ring-buffer of ftrace, for collecting guest 
 kernel
 tracing data. In this system, there are 5 main components:
   (1) Ring-buffer of ftrace in a guest
   - When trace agent reads ring-buffer, a page is removed from 
  ring-buffer.
   (2) Trace agent in the guest
   - Splice the page of ring-buffer to read_pipe using splice() without
 memory copying. Then, the page is spliced from write_pipe to virtio
 without memory copying.
 
 I really like the splicing idea.
 
 Thanks. We will improve this patch set.
 
   (3) Virtio-console driver in the guest
   - Pass the page to virtio-ring
   (4) Virtio-serial bus in QEMU
   - Copy the page to kernel pipe
   (5) Reader in the host
   - Read guest tracing data via FIFO(named pipe)
 
 So will this be useful only if guest and host run the same kernel?
 
 I'd like to see the host kernel not being used at all -- collect all
 relevant info from the guest and send it out to qemu, where it can be
 consumed directly by apps driving the tracing.
 
 No, this patch set is used only for guest kernels, so guest and host
 don't need to run the same kernel.

OK - that's good to know.

 ***Evaluation***
 When a host collects tracing data of a guest, the performance of using
 virtio-trace is compared with that of using native(just running ftrace),
 IVRing, and virtio-serial(normal method of read/write).
 
 Why is tracing performance-sensitive?  i.e. why try to optimise this
 at all?
 
 To minimize the effect on applications running in the guests when the
 host collects the guests' tracing data.
 For example, assume guests A and B are running on a host and sharing
 an I/O device. An I/O delay problem occurs in guest A, but guest B
 still meets its requirements. In this case, we need to collect tracing
 data from both guests A and B, but the usual network-based method puts
 a high load on guest B's applications even though guest B is running
 normally. Therefore, we try to decrease the load on the guests.
 We also use this feature for performance analysis on production
 virtualization systems.

OK, got it.

 
 [...]
 
 
 ***Just enhancement ideas***
   - Support for trace-cmd
   - Support for 9pfs protocol
   - Support for non-blocking mode in QEMU
 
 There were patches long back (by me) to make chardevs non-blocking but
 they didn't make it upstream.  Fedora carries them, if you want to try
 out.  Though we want to converge on a reasonable solution that's
 acceptable upstream as well.  Just that no one's working on it
 currently.  Any help here will be appreciated.
 
 Thanks! In this case, since a guest will stop running while the host
 reads the guest's trace data, the char device needs a non-blocking
 mode. I'll read your patch series. Is the latest version v8?
 http://lists.gnu.org/archive/html/qemu-devel/2010-12/msg00035.html

I suppose the latest version on-list is what you quote above.  The
objections to the patch series are mentioned in Anthony's mails.

Hans maintains a rebased version of the patches in his tree at

http://cgit.freedesktop.org/~jwrdegoede/qemu/

those patches are included in Fedora's qemu-kvm, so you can try them
out to see if they improve performance for you.

   - Make vhost-serial
 
 I need to understand a) why it's perf-critical, and b) why should the
 host be involved at all, to comment on these.
 
 a) To reduce the collection overhead for applications running in a guest.
    (see above)
 b) Host kernel trace data is not involved even if we introduce this
    patch set.

I see, so you suggested vhost-serial only because you saw the guest
stopping problem due to the absence of non-blocking code?  If so, it
now makes sense.  I don't think we need vhost-serial in any way yet.

BTW where do you parse the trace data obtained from guests?  On a
remote host?

Thanks,
Amit



Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-27 Thread Blue Swirl
On Wed, Jul 25, 2012 at 8:15 AM, Masami Hiramatsu
masami.hiramatsu...@hitachi.com wrote:
 (2012/07/25 5:26), Blue Swirl wrote:
 The following patch set provides a low-overhead system for collecting kernel
 tracing data of guests by a host in a virtualization environment.

 A guest OS generally shares some devices with other guests or with the
 host, so the root cause of a problem observed in one guest may lie in
 another guest or in the host. Therefore, when a problem occurs in a
 virtualization environment, we need to collect tracing data from several
 guests and from the host. One way to do this is to collect the guests'
 tracing data on the host. Networking is generally used for this, but
 sending trace data over the network puts a high load on the applications
 running in the guests because the data passes through many network stack
 layers. A communication method that collects the data without using the
 network is therefore needed.

 I implemented something similar earlier by passing trace data from
 OpenBIOS to QEMU using the firmware configuration device. The data
 format was the same as QEMU used for simpletrace event structure
 instead of ftrace. I didn't commit it because of a few problems.

 Sounds interesting :)
 I guess you traced BIOS events, right?

Yes, I converted a few DPRINTFs to tracepoints as a proof of concept.


 I'm not familiar with ftrace, is it possible to trace two guest
 applications (BIOS and kernel) at the same time?

 Since ftrace is a tracing feature of the Linux kernel itself, it
 can trace two or more applications (processes) as long as they run on
 the Linux kernel. However, I think OpenBIOS runs *under* the guest
 kernel. If so, ftrace currently can't trace OpenBIOS from the guest side.

No, OpenBIOS boots the machine and then passes control to the boot
loader, which passes it to the kernel. The kernel makes a few calls to
OpenBIOS at start but not later. OpenBIOS is used by QEMU as the Sparc
and PowerPC BIOS.


 I think it would need further enhancements to both OpenBIOS and the
 Linux kernel to trace BIOS events from the Linux kernel.


Ideally both OpenBIOS and Linux should be able to feed trace events
back to QEMU independently.

 Or could this be
 handled by opening two different virtio-serial pipes, one for BIOS and
 the other for the kernel?

 Of course, virtio-serial itself can open multiple channels, thus, if
 OpenBIOS can handle virtio, it can pass trace data via another
 channel.

Currently OpenBIOS probes the PCI bus and identifies virtio devices
but ignores them; adding virtio-serial support shouldn't be too hard.
There's a time window between CPU boot and the PCI probe when the
device will not be available, though.


 In my version, the tracepoint ID would have been used to demultiplex
 QEMU tracepoints from BIOS tracepoints, but something like separate ID
 spaces would have been better.

 I guess your feature notifies events to QEMU and QEMU records that in
 their own buffer. Therefore it must have different tracepoint IDs.
 On the other hand, with this feature, QEMU just passes trace-data to
 host-side pipe. Since outer tracing tool separately collects trace
 data, we don't need to demultiplex the data.

 Perhaps, in the analyzing phase (after tracing), we have to mix events
 again. At that time, we'll add some guest-ID for each event-ID, but
 it can be done offline.

Yes, the multiplexing/demultiplexing is only needed in my version
because the feeds are not independent.


 Best Regards,

 --
 Masami HIRAMATSU
 Software Platform Research Dept. Linux Technology Center
 Hitachi, Ltd., Yokohama Research Laboratory
 E-mail: masami.hiramatsu...@hitachi.com



Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-26 Thread Stefan Hajnoczi
On Wed, Jul 25, 2012 at 10:13 AM, Yoshihiro YUNOMAE
yoshihiro.yunomae...@hitachi.com wrote:
 Hi Stefan,


 (2012/07/24 22:41), Stefan Hajnoczi wrote:

 On Tue, Jul 24, 2012 at 12:19 PM, Yoshihiro YUNOMAE
 yoshihiro.yunomae...@hitachi.com wrote:

 Are you using text formatted ftrace?

 No, currently using raw format, but we'd like to reformat it in text.


 Capturing the info necessary to translate numbers into symbols is one
 of the problems of host-guest tracing so I'm curious how you handle
 this :).


 Right, your concern is valid.


 Apologies for my lack of ftrace knowledge but how useful is the raw
 tracing data on the host?  How do you pretty-print it in
 human-readable form?


 perf and trace-cmd can actually translate raw-formatted trace data
 into text by using kernel information and the event format files
 under the tracing/events directory in debugfs. In the same way, if
 that information from a guest is exported to the host, we can
 translate the guest's raw trace data into text on the host. We will
 use 9pfs to export it.

Thanks, it's clear now :).

Stefan



Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-26 Thread Amit Shah
On (Tue) 24 Jul 2012 [11:36:57], Yoshihiro YUNOMAE wrote:
 Hi All,
 
 The following patch set provides a low-overhead system for collecting kernel
 tracing data of guests by a host in a virtualization environment.
 
 A guest OS generally shares some devices with other guests or with the
 host, so the root cause of a problem observed in one guest may lie in
 another guest or in the host. Therefore, when a problem occurs in a
 virtualization environment, we need to collect tracing data from several
 guests and from the host. One way to do this is to collect the guests'
 tracing data on the host. Networking is generally used for this, but
 sending trace data over the network puts a high load on the applications
 running in the guests because the data passes through many network stack
 layers. A communication method that collects the data without using the
 network is therefore needed.
 
 We submitted a patch set of IVRing, a ring-buffer driver constructed on
 Inter-VM shared memory (IVShmem), to LKML http://lwn.net/Articles/500304/ in
 this June. IVRing and the IVRing reader use POSIX shared memory each other
 without using network, so a low-overhead system for collecting guest tracing
 data is realized. However, this patch set has some problems as follows:
  - use IVShmem instead of virtio
  - create a new ring-buffer without using existing ring-buffer in kernel
  - scalability
-- not support SMP environment
-- buffer size limitation
-- not support live migration (maybe difficult for realize this)
 
 Therefore, we propose a new system virtio-trace, which uses enhanced
 virtio-serial and existing ring-buffer of ftrace, for collecting guest kernel
 tracing data. In this system, there are 5 main components:
  (1) Ring-buffer of ftrace in a guest
  - When trace agent reads ring-buffer, a page is removed from ring-buffer.
  (2) Trace agent in the guest
  - Splice the page of ring-buffer to read_pipe using splice() without
memory copying. Then, the page is spliced from write_pipe to virtio
without memory copying.

I really like the splicing idea.

  (3) Virtio-console driver in the guest
  - Pass the page to virtio-ring
  (4) Virtio-serial bus in QEMU
  - Copy the page to kernel pipe
  (5) Reader in the host
  - Read guest tracing data via FIFO(named pipe) 

So will this be useful only if guest and host run the same kernel?

I'd like to see the host kernel not being used at all -- collect all
relevant info from the guest and send it out to qemu, where it can be
consumed directly by apps driving the tracing.

 ***Evaluation***
 When a host collects tracing data of a guest, the performance of using
 virtio-trace is compared with that of using native(just running ftrace),
 IVRing, and virtio-serial(normal method of read/write).

Why is tracing performance-sensitive?  i.e. why try to optimise this
at all?

 environment
 The overview of this evaluation is as follows:
  (a) A guest is prepared on KVM.
  - One physical CPU is dedicated to the guest as a virtual CPU (VCPU).
 
  (b) The guest starts writing tracing data to the ftrace ring-buffer.
  - The probe points are all tracepoints of sched, timer, and kmem.
 
  (c) While trace data is being written, Dhrystone 2 from UnixBench is run
      as a benchmark tool in the guest.
  - Dhrystone 2 measures system performance as a score obtained by
    repeating integer arithmetic.
  - Since a higher score means better system performance, a score that
    drops relative to the bare environment indicates that some operation
    is disturbing the integer arithmetic. We therefore define the overhead
    of transporting trace data as:
      OVERHEAD = (1 - SCORE_OF_A_METHOD/NATIVE_SCORE) * 100.
 
 The performance of each method is compared as follows:
  [1] Native
  - only recording trace data to ring-buffer on a guest
  [2] Virtio-trace
  - running a trace agent on a guest
  - a reader on a host opens FIFO using cat command
  [3] IVRing
  - A SystemTap script in a guest records trace data to IVRing.
-- probe points are same as ftrace.
  [4] Virtio-serial(normal)
  - A reader(using cat) on a guest output trace data to a host using
standard output via virtio-serial.
 
 Other information is as follows:
  - host
kernel: 3.3.7-1 (Fedora16)
CPU: Intel Xeon x5660@2.80GHz(12core)
Memory: 48GB
 
  - guest(only booting one guest)
kernel: 3.5.0-rc4+ (Fedora16)
CPU: 1VCPU(dedicated)
Memory: 1GB
 
 result
 3 patterns based on the bare environment were indicated as follows:
  Scores  overhead against [0] Native
 [0] Native:  28807569.5   -
 [1] Virtio-trace:28685049.5 0.43%
 [2] IVRing:  28418595.5 1.35%
 [3] Virtio-serial:   13262258.753.96%
 
 
 ***Just enhancement ideas***
  - Support for trace-cmd
  - Support for 9pfs protocol
  - Support for non-blocking mode in QEMU

There were 

Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-25 Thread Masami Hiramatsu
(2012/07/25 5:26), Blue Swirl wrote:
 The following patch set provides a low-overhead system for collecting kernel
 tracing data of guests by a host in a virtualization environment.

 A guest OS generally shares some devices with other guests or with the
 host, so the root cause of a problem observed in one guest may lie in
 another guest or in the host. Therefore, when a problem occurs in a
 virtualization environment, we need to collect tracing data from several
 guests and from the host. One way to do this is to collect the guests'
 tracing data on the host. Networking is generally used for this, but
 sending trace data over the network puts a high load on the applications
 running in the guests because the data passes through many network stack
 layers. A communication method that collects the data without using the
 network is therefore needed.

 I implemented something similar earlier by passing trace data from
 OpenBIOS to QEMU using the firmware configuration device. The data
 format was the same as QEMU used for simpletrace event structure
 instead of ftrace. I didn't commit it because of a few problems.

Sounds interesting :)
I guess you traced BIOS events, right?

 I'm not familiar with ftrace, is it possible to trace two guest
 applications (BIOS and kernel) at the same time?

Since ftrace is a tracing feature of the Linux kernel itself, it
can trace two or more applications (processes) as long as they run on
the Linux kernel. However, I think OpenBIOS runs *under* the guest
kernel. If so, ftrace currently can't trace OpenBIOS from the guest side.

I think it would need further enhancements to both OpenBIOS and the
Linux kernel to trace BIOS events from the Linux kernel.

 Or could this be
 handled by opening two different virtio-serial pipes, one for BIOS and
 the other for the kernel?

Of course, virtio-serial itself can open multiple channels, so if
OpenBIOS can handle virtio, it can pass trace data via another
channel.

 In my version, the tracepoint ID would have been used to demultiplex
 QEMU tracepoints from BIOS tracepoints, but something like separate ID
 spaces would have been better.

I guess your feature notifies QEMU of events and QEMU records them in
its own buffer, so it must use distinct tracepoint IDs.
With this feature, on the other hand, QEMU just passes the trace data
to a host-side pipe. Since an external tracing tool collects the trace
data separately, we don't need to demultiplex it.

Perhaps, in the analysis phase (after tracing), we have to merge the
events again. At that point we'll add a guest ID to each event ID, but
that can be done offline.

Best Regards,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-25 Thread Yoshihiro YUNOMAE

Hi Stefan,

(2012/07/24 22:41), Stefan Hajnoczi wrote:

On Tue, Jul 24, 2012 at 12:19 PM, Yoshihiro YUNOMAE
yoshihiro.yunomae...@hitachi.com wrote:

Are you using text formatted ftrace?

No, currently using raw format, but we'd like to reformat it in text.


Capturing the info necessary to translate numbers into symbols is one
of the problems of host-guest tracing so I'm curious how you handle
this :).


Right, your concern is valid.


Apologies for my lack of ftrace knowledge but how useful is the raw
tracing data on the host?  How do you pretty-print it in
human-readable form?


perf and trace-cmd can actually translate raw-formatted trace data
into text by using kernel information and the event format files
under the tracing/events directory in debugfs. In the same way, if
that information from a guest is exported to the host, we can
translate the guest's raw trace data into text on the host. We will
use 9pfs to export it.
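
To make the translation step concrete: each event has a format file under
debugfs that describes the binary layout of its raw records, and
trace-cmd/perf read these files to pretty-print the raw data. Below is a
minimal sketch of dumping one such description; the sched_switch event and
the /sys/kernel/debug mount point are just assumptions for the example.

/*
 * Dump the layout description of one ftrace event.  Files like this
 * (field names, offsets, sizes, print fmt) are what would be exported
 * from the guest, e.g. via 9pfs, so the host can decode raw records.
 */
#include <stdio.h>

int main(void)
{
        const char *path =
                "/sys/kernel/debug/tracing/events/sched/sched_switch/format";
        char line[256];
        FILE *f = fopen(path, "r");

        if (!f) {
                perror(path);
                return 1;
        }
        while (fgets(line, sizeof(line), f))
                fputs(line, stdout);
        fclose(f);
        return 0;
}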

Thank you,

--
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com





Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-24 Thread Blue Swirl
On Tue, Jul 24, 2012 at 2:36 AM, Yoshihiro YUNOMAE
yoshihiro.yunomae...@hitachi.com wrote:
 Hi All,

 The following patch set provides a low-overhead system for collecting kernel
 tracing data of guests by a host in a virtualization environment.

 A guest OS generally shares some devices with other guests or with the
 host, so the root cause of a problem observed in one guest may lie in
 another guest or in the host. Therefore, when a problem occurs in a
 virtualization environment, we need to collect tracing data from several
 guests and from the host. One way to do this is to collect the guests'
 tracing data on the host. Networking is generally used for this, but
 sending trace data over the network puts a high load on the applications
 running in the guests because the data passes through many network stack
 layers. A communication method that collects the data without using the
 network is therefore needed.

I implemented something similar earlier by passing trace data from
OpenBIOS to QEMU using the firmware configuration device. The data
format was QEMU's simpletrace event structure rather than ftrace's.
I didn't commit it because of a few problems.

I'm not familiar with ftrace; is it possible to trace two guest
applications (BIOS and kernel) at the same time? Or could this be
handled by opening two different virtio-serial pipes, one for the BIOS
and the other for the kernel?

In my version, the tracepoint ID would have been used to demultiplex
QEMU tracepoints from BIOS tracepoints, but something like separate ID
spaces would have been better.


 We submitted a patch set of IVRing, a ring-buffer driver constructed on
 Inter-VM shared memory (IVShmem), to LKML http://lwn.net/Articles/500304/ in
 this June. IVRing and the IVRing reader use POSIX shared memory each other
 without using network, so a low-overhead system for collecting guest tracing
 data is realized. However, this patch set has some problems as follows:
  - use IVShmem instead of virtio
  - create a new ring-buffer without using existing ring-buffer in kernel
  - scalability
-- not support SMP environment
-- buffer size limitation
-- not support live migration (maybe difficult for realize this)

 Therefore, we propose a new system virtio-trace, which uses enhanced
 virtio-serial and existing ring-buffer of ftrace, for collecting guest kernel
 tracing data. In this system, there are 5 main components:
  (1) Ring-buffer of ftrace in a guest
  - When trace agent reads ring-buffer, a page is removed from ring-buffer.
  (2) Trace agent in the guest
  - Splice the page of ring-buffer to read_pipe using splice() without
memory copying. Then, the page is spliced from write_pipe to virtio
without memory copying.
  (3) Virtio-console driver in the guest
  - Pass the page to virtio-ring
  (4) Virtio-serial bus in QEMU
  - Copy the page to kernel pipe
  (5) Reader in the host
  - Read guest tracing data via FIFO(named pipe)

 ***Evaluation***
 When a host collects tracing data of a guest, the performance of using
 virtio-trace is compared with that of using native(just running ftrace),
 IVRing, and virtio-serial(normal method of read/write).

 environment
 The overview of this evaluation is as follows:
  (a) A guest is prepared on KVM.
  - One physical CPU is dedicated to the guest as a virtual CPU (VCPU).
 
  (b) The guest starts writing tracing data to the ftrace ring-buffer.
  - The probe points are all tracepoints of sched, timer, and kmem.
 
  (c) While trace data is being written, Dhrystone 2 from UnixBench is run
      as a benchmark tool in the guest.
  - Dhrystone 2 measures system performance as a score obtained by
    repeating integer arithmetic.
  - Since a higher score means better system performance, a score that
    drops relative to the bare environment indicates that some operation
    is disturbing the integer arithmetic. We therefore define the overhead
    of transporting trace data as:
      OVERHEAD = (1 - SCORE_OF_A_METHOD/NATIVE_SCORE) * 100.

 The performance of each method is compared as follows:
  [1] Native
  - only recording trace data to ring-buffer on a guest
  [2] Virtio-trace
  - running a trace agent on a guest
  - a reader on a host opens FIFO using cat command
  [3] IVRing
  - A SystemTap script in a guest records trace data to IVRing.
-- probe points are same as ftrace.
  [4] Virtio-serial(normal)
  - A reader(using cat) on a guest output trace data to a host using
standard output via virtio-serial.

 Other information is as follows:
  - host
kernel: 3.3.7-1 (Fedora16)
CPU: Intel Xeon x5660@2.80GHz(12core)
Memory: 48GB

  - guest(only booting one guest)
kernel: 3.5.0-rc4+ (Fedora16)
CPU: 1VCPU(dedicated)
Memory: 1GB

 result
 3 patterns based on the bare environment were indicated as follows:
Scores  overhead against [0] Native
 [0] Native:  28807569.5   

Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-24 Thread Stefan Hajnoczi
On Tue, Jul 24, 2012 at 3:36 AM, Yoshihiro YUNOMAE
yoshihiro.yunomae...@hitachi.com wrote:
 The performance of each method is compared as follows:
  [1] Native
  - only recording trace data to ring-buffer on a guest
  [2] Virtio-trace
  - running a trace agent on a guest
  - a reader on a host opens FIFO using cat command
  [3] IVRing
  - A SystemTap script in a guest records trace data to IVRing.
-- probe points are same as ftrace.
  [4] Virtio-serial(normal)
  - A reader(using cat) on a guest output trace data to a host using
standard output via virtio-serial.

The first time I read this I thought you were adding a new virtio-trace
device.  But it looks like this series really adds splice support to
virtio-console, and that yields a big performance improvement when
sending trace_pipe_raw.

Guest ftrace is useful and I like this.  Have you thought about
controlling ftrace from the host?  Perhaps a command could be added to
the QEMU guest agent which basically invokes trace-cmd/perf.

Are you using text formatted ftrace?

Stefan



Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-24 Thread Masami Hiramatsu
(2012/07/24 19:02), Stefan Hajnoczi wrote:
 On Tue, Jul 24, 2012 at 3:36 AM, Yoshihiro YUNOMAE
 yoshihiro.yunomae...@hitachi.com wrote:
 The performance of each method is compared as follows:
  [1] Native
  - only recording trace data to ring-buffer on a guest
  [2] Virtio-trace
  - running a trace agent on a guest
  - a reader on a host opens FIFO using cat command
  [3] IVRing
  - A SystemTap script in a guest records trace data to IVRing.
-- probe points are same as ftrace.
  [4] Virtio-serial(normal)
  - A reader(using cat) on a guest output trace data to a host using
standard output via virtio-serial.
 
 The first time I read this I thought you are adding a new virtio-trace
 device.  But it looks like this series really add splice support to
 virtio-console and that yields a big performance improvement when
 sending trace_pipe_raw.

Yes, sorry for the confusion. Actually this is an enhancement of
virtio-serial. I'm working with Yoshihiro on this feature.

 Guest ftrace is useful and I like this.  Have you thought about
 controlling ftrace from the host?  Perhaps a command could be added to
 the QEMU guest agent which basically invokes trace-cmd/perf.

As you can see, the guest trace-agent can be controlled via a
control channel. In our scenario, host tools can control it
instead of guest-side tools.
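
As a rough host-side illustration only (the chardev type, the socket path,
and the "1"/"0" start/stop convention are assumptions for this sketch, not
details of the series): if the agent's control port were exposed on the
host as a UNIX socket chardev, a host tool could start tracing like this:

/*
 * Hypothetical host-side control: connect to the chardev backing the
 * agent's control port and send a start command.  The path and the
 * command bytes are assumptions for illustration only.
 */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void)
{
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);

        strncpy(addr.sun_path, "/tmp/trace-agent-ctl.sock",
                sizeof(addr.sun_path) - 1);
        if (fd < 0 || connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
                perror("connect");
                return 1;
        }
        write(fd, "1", 1);      /* assumed: "1" starts, "0" stops tracing */
        close(fd);
        return 0;
}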

We are considering exporting the tracing part of the guest's
debugfs to the host over another virtio-serial channel using
9pfs, so that host tools can refer to it.

(In this scenario, the guest trace-agent will also provide a 9pfs
server. Since that means the agent can handle writes to a special
file, the trace-agent can be controlled via that special file on
the exported debugfs.)

Of course, this also requires modifying trace-cmd/perf to accept
some options such as the guest-debugfs mount point, the guest's
serial channel pipe (or unix socket?), etc. However, it should be
a small change.

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-24 Thread Yoshihiro YUNOMAE

Hi Stefan,

Thank you for commenting on our patch set.

(2012/07/24 20:03), Masami Hiramatsu wrote:

(2012/07/24 19:02), Stefan Hajnoczi wrote:

On Tue, Jul 24, 2012 at 3:36 AM, Yoshihiro YUNOMAE
yoshihiro.yunomae...@hitachi.com wrote:

The performance of each method is compared as follows:
  [1] Native
  - only recording trace data to ring-buffer on a guest
  [2] Virtio-trace
  - running a trace agent on a guest
  - a reader on a host opens FIFO using cat command
  [3] IVRing
  - A SystemTap script in a guest records trace data to IVRing.
-- probe points are same as ftrace.
  [4] Virtio-serial(normal)
  - A reader(using cat) on a guest output trace data to a host using
standard output via virtio-serial.


The first time I read this I thought you are adding a new virtio-trace
device.  But it looks like this series really add splice support to
virtio-console and that yields a big performance improvement when
sending trace_pipe_raw.


Yes, sorry for the confusion. Actually this is an enhancement of
virtio-serial. I'm working with Yoshihiro on this feature.


Guest ftrace is useful and I like this.  Have you thought about
controlling ftrace from the host?  Perhaps a command could be added to
the QEMU guest agent which basically invokes trace-cmd/perf.


As you can see, guest trace-agent can be controlled via a
control channel. In our scenario, host tools can control that
instead of guest one.

We are considering that exporting the tracing part of guest's
debugfs to host via another virtio-serial channel by using
9pfs, so that the host tools can refer that.

(In this scenario, guest trace-agent will also provide 9pfs server.
Since it means that the agent can handle writing a special file,
trace-agent can be controlled via the special file on exported
debugfs.)

Of course, this also requires modifying trace-cmd/perf to accept
some options like guest-debugfs mount point, guest's serial
channel pipe (or unix socket?), etc. However, it will be a small
change.

Thank you,



 Are you using text formatted ftrace?
No, we are currently using the raw format, but we'd like to reformat it as text.

Thank you,

--
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com





Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-24 Thread Stefan Hajnoczi
On Tue, Jul 24, 2012 at 12:19 PM, Yoshihiro YUNOMAE
yoshihiro.yunomae...@hitachi.com wrote:
 Are you using text formatted ftrace?
 No, currently using raw format, but we'd like to reformat it in text.

Capturing the info necessary to translate numbers into symbols is one
of the problems of host-guest tracing so I'm curious how you handle
this :).

Apologies for my lack of ftrace knowledge but how useful is the raw
tracing data on the host?  How do you pretty-print it in
human-readable form?

Stefan



Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-24 Thread Stefan Hajnoczi
On Tue, Jul 24, 2012 at 12:03 PM, Masami Hiramatsu
masami.hiramatsu...@hitachi.com wrote:
 (2012/07/24 19:02), Stefan Hajnoczi wrote:
 On Tue, Jul 24, 2012 at 3:36 AM, Yoshihiro YUNOMAE
 yoshihiro.yunomae...@hitachi.com wrote:
 The performance of each method is compared as follows:
  [1] Native
  - only recording trace data to ring-buffer on a guest
  [2] Virtio-trace
  - running a trace agent on a guest
  - a reader on a host opens FIFO using cat command
  [3] IVRing
  - A SystemTap script in a guest records trace data to IVRing.
-- probe points are same as ftrace.
  [4] Virtio-serial(normal)
  - A reader(using cat) on a guest output trace data to a host using
standard output via virtio-serial.

 The first time I read this I thought you are adding a new virtio-trace
 device.  But it looks like this series really add splice support to
 virtio-console and that yields a big performance improvement when
 sending trace_pipe_raw.

 Yes, sorry for the confusion. Actually this is an enhancement of
 virtio-serial. I'm working with Yoshihiro on this feature.

 Guest ftrace is useful and I like this.  Have you thought about
 controlling ftrace from the host?  Perhaps a command could be added to
 the QEMU guest agent which basically invokes trace-cmd/perf.

 As you can see, guest trace-agent can be controlled via a
 control channel. In our scenario, host tools can control that
 instead of guest one.

 We are considering that exporting the tracing part of guest's
 debugfs to host via another virtio-serial channel by using
 9pfs, so that the host tools can refer that.

 (In this scenario, guest trace-agent will also provide 9pfs server.
 Since it means that the agent can handle writing a special file,
 trace-agent can be controlled via the special file on exported
 debugfs.)

 Of course, this also requires modifying trace-cmd/perf to accept
 some options like guest-debugfs mount point, guest's serial
 channel pipe (or unix socket?), etc. However, it will be a small
 change.

Okay, thanks for explaining some of the ideas you have.  I won't ask
more because it's out of scope for this patch series :).

Stefan



[Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-23 Thread Yoshihiro YUNOMAE
Hi All,

The following patch set provides a low-overhead system for collecting kernel
tracing data of guests by a host in a virtualization environment.

A guest OS generally shares some devices with other guests or with the host,
so the root cause of a problem observed in one guest may lie in another guest
or in the host. Therefore, when a problem occurs in a virtualization
environment, we need to collect tracing data from several guests and from the
host. One way to do this is to collect the guests' tracing data on the host.
Networking is generally used for this, but sending trace data over the network
puts a high load on the applications running in the guests because the data
passes through many network stack layers. A communication method that collects
the data without using the network is therefore needed.

This June we submitted a patch set for IVRing, a ring-buffer driver built on
Inter-VM shared memory (IVShmem), to LKML (http://lwn.net/Articles/500304/).
IVRing and the IVRing reader communicate through POSIX shared memory rather
than the network, so they already provide a low-overhead system for collecting
guest tracing data. However, that patch set has the following problems:
 - it uses IVShmem instead of virtio
 - it creates a new ring-buffer instead of using the existing ring-buffer in
   the kernel
 - scalability
   -- SMP environments are not supported
   -- the buffer size is limited
   -- live migration is not supported (and may be difficult to realize)

Therefore, we propose a new system, virtio-trace, which uses an enhanced
virtio-serial and the existing ftrace ring-buffer to collect guest kernel
tracing data. The system has 5 main components:
 (1) Ring-buffer of ftrace in a guest
 - When the trace agent reads the ring-buffer, a page is removed from it.
 (2) Trace agent in the guest
 - Splices a page of the ring-buffer to read_pipe using splice() without
   copying memory; the page is then spliced from write_pipe to virtio,
   again without copying (see the sketch after this list).
 (3) Virtio-console driver in the guest
 - Passes the page to the virtio-ring.
 (4) Virtio-serial bus in QEMU
 - Copies the page into a kernel pipe.
 (5) Reader in the host
 - Reads guest tracing data via a FIFO (named pipe).
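
As a concrete illustration of component (2), here is a minimal per-CPU
sketch of that double splice. It is only a sketch: the trace_pipe_raw path
is the standard ftrace location, while the virtio-serial port name and the
lack of error recovery are assumptions made for illustration, not details
taken from this patch set.

/*
 * Minimal sketch of the per-CPU trace agent loop in (2): splice
 * trace_pipe_raw -> pipe -> virtio-serial port, so trace pages move
 * without being copied through user-space buffers.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        int trace_fd, port_fd, pipefd[2];
        ssize_t n;

        trace_fd = open("/sys/kernel/debug/tracing/per_cpu/cpu0/trace_pipe_raw",
                        O_RDONLY);
        port_fd = open("/dev/virtio-ports/trace-port-cpu0", O_WRONLY); /* assumed name */
        if (trace_fd < 0 || port_fd < 0 || pipe(pipefd) < 0) {
                perror("setup");
                return 1;
        }

        for (;;) {
                /* Move one page of ring-buffer data into the pipe (no copy). */
                n = splice(trace_fd, NULL, pipefd[1], NULL,
                           getpagesize(), SPLICE_F_MOVE);
                if (n <= 0)
                        break;
                /* Move the same page from the pipe out to the virtio-serial port. */
                if (splice(pipefd[0], NULL, port_fd, NULL, n, SPLICE_F_MOVE) < 0)
                        break;
        }
        return 0;
}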

***Evaluation***
While the host is collecting a guest's tracing data, the performance of
virtio-trace is compared with that of native (just running ftrace),
IVRing, and virtio-serial (the normal read/write method).

environment
The overview of this evaluation is as follows:
 (a) A guest is prepared on KVM.
 - One physical CPU is dedicated to the guest as a virtual CPU (VCPU).

 (b) The guest starts writing tracing data to the ftrace ring-buffer.
 - The probe points are all tracepoints of sched, timer, and kmem.

 (c) While trace data is being written, Dhrystone 2 from UnixBench is run
 as a benchmark tool in the guest.
 - Dhrystone 2 measures system performance as a score obtained by
   repeating integer arithmetic.
 - Since a higher score means better system performance, a score that
   drops relative to the bare environment indicates that some operation
   is disturbing the integer arithmetic. We therefore define the overhead
   of transporting trace data as:
     OVERHEAD = (1 - SCORE_OF_A_METHOD/NATIVE_SCORE) * 100.

The performance of each method is compared as follows:
 [1] Native
 - only records trace data to the ring-buffer in the guest
 [2] Virtio-trace
 - a trace agent runs in the guest
 - a reader on the host opens the FIFO with the cat command
 [3] IVRing
 - a SystemTap script in the guest records trace data to IVRing
   -- the probe points are the same as for ftrace
 [4] Virtio-serial (normal)
 - a reader (using cat) in the guest writes trace data to the host on
   standard output via virtio-serial

Other information is as follows:
 - host
   kernel: 3.3.7-1 (Fedora16)
   CPU: Intel Xeon x5660@2.80GHz(12core)
   Memory: 48GB

 - guest(only booting one guest)
   kernel: 3.5.0-rc4+ (Fedora16)
   CPU: 1VCPU(dedicated)
   Memory: 1GB

result
The scores of the 3 methods relative to the bare environment were as follows:
                       Scores        Overhead vs. [0] Native
 [0] Native:           28807569.5    -
 [1] Virtio-trace:     28685049.5    0.43%
 [2] IVRing:           28418595.5    1.35%
 [3] Virtio-serial:    13262258.7    53.96%
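
For reference, the overhead column follows directly from the formula defined
above; a quick sanity check in C (the scores below are simply copied from the
table):

/* Re-derive the overhead column from the scores using the formula above. */
#include <stdio.h>

int main(void)
{
        const double native = 28807569.5;
        const double scores[] = { 28685049.5, 28418595.5, 13262258.7 };
        const char *names[] = { "Virtio-trace", "IVRing", "Virtio-serial" };
        int i;

        for (i = 0; i < 3; i++)
                printf("%-14s %6.2f%%\n", names[i],
                       (1.0 - scores[i] / native) * 100.0);
        /* Prints about 0.43%, 1.35%, and 53.96%, matching the table. */
        return 0;
}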


***Just enhancement ideas***
 - Support for trace-cmd
 - Support for 9pfs protocol
 - Support for non-blocking mode in QEMU
 - Make vhost-serial

Thank you,

---

Masami Hiramatsu (5):
  virtio/console: Allocate scatterlist according to the current pipe size
  ftrace: Allow stealing pages from pipe buffer
  virtio/console: Wait until the port is ready on splice
  virtio/console: Add a failback for unstealable pipe buffer
  virtio/console: Add splice_write support

Yoshihiro YUNOMAE (1):
  tools: Add guest trace agent as a user tool


 drivers/char/virtio_console.c   |  198 ++--
 kernel/trace/trace.c   

Re: [Qemu-devel] [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-23 Thread Masami Hiramatsu
(2012/07/24 11:36), Yoshihiro YUNOMAE wrote:
 Therefore, we propose a new system virtio-trace, which uses enhanced
 virtio-serial and existing ring-buffer of ftrace, for collecting guest kernel
 tracing data. In this system, there are 5 main components:
  (1) Ring-buffer of ftrace in a guest
  - When trace agent reads ring-buffer, a page is removed from ring-buffer.
  (2) Trace agent in the guest
  - Splice the page of ring-buffer to read_pipe using splice() without
memory copying. Then, the page is spliced from write_pipe to virtio
without memory copying.
  (3) Virtio-console driver in the guest
  - Pass the page to virtio-ring
  (4) Virtio-serial bus in QEMU
  - Copy the page to kernel pipe
  (5) Reader in the host
  - Read guest tracing data via FIFO(named pipe)

So, this is our answer to the points argued in the previous thread.
These virtio-serial and ftrace enhancements don't introduce a new
ring-buffer in the kernel; they just use virtio's ring-buffer.
Also, using splice gives us a great performance advantage because of
the copy-less trace-data transfer.

Actually, one copy still has to occur on the host (to write the data
into the pipe), because removing the guest's physical pages would be
hard to track and may involve a TLB flush per page, even if done in
the background.

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com