Re: [PATCH v4 4/5] perf stat: Enable iostat mode for x86 platforms

2021-03-10 Thread liuqi (BA)




On 2021/3/11 0:19, Alexander Antonov wrote:


On 3/9/2021 10:51 AM, liuqi (BA) wrote:

Hi Alexander,

On 2021/2/3 21:58, Alexander Antonov wrote:

This functionality is based on recently introduced sysfs attributes
for Intel® Xeon® Scalable processor family (code name Skylake-SP):
Commit bb42b3d39781 ("perf/x86/intel/uncore: Expose an Uncore unit to
IIO PMON mapping")

Mode is intended to provide four I/O performance metrics in MB per each
PCIe root port:
  - Inbound Read: I/O devices below root port read from the host memory
  - Inbound Write: I/O devices below root port write to the host memory
  - Outbound Read: CPU reads from I/O devices below root port
  - Outbound Write: CPU writes to I/O devices below root port

Each metric requiries only one uncore event which increments at every 4B
transfer in corresponding direction. The formulas to compute metrics
are generic:
 #EventCount * 4B / (1024 * 1024)

Signed-off-by: Alexander Antonov
---
  tools/perf/Documentation/perf-iostat.txt |  88 ++
  tools/perf/Makefile.perf |   5 +-
  tools/perf/arch/x86/util/Build   |   1 +
  tools/perf/arch/x86/util/iostat.c    | 345 +++
  tools/perf/command-list.txt  |   1 +
  tools/perf/perf-iostat.sh    |  12 +
  6 files changed, 451 insertions(+), 1 deletion(-)
  create mode 100644 tools/perf/Documentation/perf-iostat.txt
  create mode 100644 tools/perf/perf-iostat.sh

diff --git a/tools/perf/Documentation/perf-iostat.txt 
b/tools/perf/Documentation/perf-iostat.txt

new file mode 100644
index ..165176944031
--- /dev/null
+++ b/tools/perf/Documentation/perf-iostat.txt
@@ -0,0 +1,88 @@
+perf-iostat(1)
+===
+
+NAME
+
+perf-iostat - Show I/O performance metrics
+
+SYNOPSIS
+
+[verse]
+'perf iostat' list
+'perf iostat'  --  []
+
+DESCRIPTION
+---
+Mode is intended to provide four I/O performance metrics per each 
PCIe root port:

+
+- Inbound Read   - I/O devices below root port read from the host 
memory, in MB

+
+- Inbound Write  - I/O devices below root port write to the host 
memory, in MB

+
+- Outbound Read  - CPU reads from I/O devices below root port, in MB
+
+- Outbound Write - CPU writes to I/O devices below root port, in MB
+
+OPTIONS
+---
+...::
+    Any command you can specify in a shell.
+
+list::
+    List all PCIe root ports.


I noticed that "iostat" commond and cmd_iostat() callback function is 
not registered in cmd_struct in perf.c. So I think "perf iostat list" 
perhaps can not work properly.


I also test this patchset on x86 platform, and here is the log:

root@ubuntu:/home/lq# ./perf iostat list
perf: 'iostat' is not a perf-command. See 'perf --help'.
root@ubuntu:/home/lq# ./perf stat --iostat
^C
 Performance counter stats for 'system wide':

   port Inbound Read(MB)    Inbound Write(MB) Outbound 
Read(MB)   Outbound Write(MB)

:00    0 0    0  0
:80    0 0    0  0
:17    0 0    0  0
:85    0 0    0  0
:3a    0 0    0  0
:ae    0 0    0  0
:5d    0 0    0  0
:d7    0 0    0  0

   0.611303832 seconds time elapsed


root@ubuntu:/home/lq# ./perf stat --iostat=:17
^C
 Performance counter stats for 'system wide':

   port Inbound Read(MB)    Inbound Write(MB) Outbound 
Read(MB)   Outbound Write(MB)

:17    0 0    0  0

   0.521317572 seconds time elapsed

So how does following perf iostat list work, did I miss something?

Thanks,
Qi


Hello,

The 'iostat' mode uses aliases mechanism in perf same as 'perf archive' and
in this case you don't need to add function callback into cmd_struct.
For example, the command 'perf iostat list' will be converted to
'perf stat --iostat=list'.

After building the perf tool you should have two shell scripts in 
tools/perf

directory and one of them is executable, for example:
# make -C tools/perf
# ls -l tools/perf/perf-iostat*
-rwxr-xr-x 1 root root 290 Mar 10 18:17 perf-iostat
-rw-r--r-- 1 root root 290 Feb  3 15:14 perf-iostat.sh

It should be possible to run 'perf iostat' from build directory:
# cd tools/perf
# ./perf iostat list
S0-uncore_iio_0<:00>
S1-uncore_iio_0<:80>
S0-uncore_iio_1<:17>
S1-uncore_iio_1<:85>
S0-uncore_iio_2<:3a>
S1-uncore_iio_2<:ae>
S0-uncore_iio_3<:5d>
S1-uncore_iio_3<:d7>

Also you can copy 'perf-iostat' to ~/libexec/perf-core/ or just run 
'make install'


# make install
# cp perf /usr/bin/
# ls -lh ~/libexec/perf-core/
total 24K
-rwxr-xr-x 1 root root 1.4K Mar 10 18:17 perf-archive
-rwxr-xr-

Re: [PATCH v4 4/5] perf stat: Enable iostat mode for x86 platforms

2021-03-10 Thread Alexander Antonov



On 3/9/2021 10:51 AM, liuqi (BA) wrote:

Hi Alexander,

On 2021/2/3 21:58, Alexander Antonov wrote:

This functionality is based on recently introduced sysfs attributes
for Intel® Xeon® Scalable processor family (code name Skylake-SP):
Commit bb42b3d39781 ("perf/x86/intel/uncore: Expose an Uncore unit to
IIO PMON mapping")

Mode is intended to provide four I/O performance metrics in MB per each
PCIe root port:
  - Inbound Read: I/O devices below root port read from the host memory
  - Inbound Write: I/O devices below root port write to the host memory
  - Outbound Read: CPU reads from I/O devices below root port
  - Outbound Write: CPU writes to I/O devices below root port

Each metric requiries only one uncore event which increments at every 4B
transfer in corresponding direction. The formulas to compute metrics
are generic:
 #EventCount * 4B / (1024 * 1024)

Signed-off-by: Alexander Antonov
---
  tools/perf/Documentation/perf-iostat.txt |  88 ++
  tools/perf/Makefile.perf |   5 +-
  tools/perf/arch/x86/util/Build   |   1 +
  tools/perf/arch/x86/util/iostat.c    | 345 +++
  tools/perf/command-list.txt  |   1 +
  tools/perf/perf-iostat.sh    |  12 +
  6 files changed, 451 insertions(+), 1 deletion(-)
  create mode 100644 tools/perf/Documentation/perf-iostat.txt
  create mode 100644 tools/perf/perf-iostat.sh

diff --git a/tools/perf/Documentation/perf-iostat.txt 
b/tools/perf/Documentation/perf-iostat.txt

new file mode 100644
index ..165176944031
--- /dev/null
+++ b/tools/perf/Documentation/perf-iostat.txt
@@ -0,0 +1,88 @@
+perf-iostat(1)
+===
+
+NAME
+
+perf-iostat - Show I/O performance metrics
+
+SYNOPSIS
+
+[verse]
+'perf iostat' list
+'perf iostat'  --  []
+
+DESCRIPTION
+---
+Mode is intended to provide four I/O performance metrics per each 
PCIe root port:

+
+- Inbound Read   - I/O devices below root port read from the host 
memory, in MB

+
+- Inbound Write  - I/O devices below root port write to the host 
memory, in MB

+
+- Outbound Read  - CPU reads from I/O devices below root port, in MB
+
+- Outbound Write - CPU writes to I/O devices below root port, in MB
+
+OPTIONS
+---
+...::
+    Any command you can specify in a shell.
+
+list::
+    List all PCIe root ports.


I noticed that "iostat" commond and cmd_iostat() callback function is 
not registered in cmd_struct in perf.c. So I think "perf iostat list" 
perhaps can not work properly.


I also test this patchset on x86 platform, and here is the log:

root@ubuntu:/home/lq# ./perf iostat list
perf: 'iostat' is not a perf-command. See 'perf --help'.
root@ubuntu:/home/lq# ./perf stat --iostat
^C
 Performance counter stats for 'system wide':

   port Inbound Read(MB)    Inbound Write(MB) Outbound 
Read(MB)   Outbound Write(MB)

:00    0 0    0  0
:80    0 0    0  0
:17    0 0    0  0
:85    0 0    0  0
:3a    0 0    0  0
:ae    0 0    0  0
:5d    0 0    0  0
:d7    0 0    0  0

   0.611303832 seconds time elapsed


root@ubuntu:/home/lq# ./perf stat --iostat=:17
^C
 Performance counter stats for 'system wide':

   port Inbound Read(MB)    Inbound Write(MB) Outbound 
Read(MB)   Outbound Write(MB)

:17    0 0    0  0

   0.521317572 seconds time elapsed

So how does following perf iostat list work, did I miss something?

Thanks,
Qi


Hello,

The 'iostat' mode uses aliases mechanism in perf same as 'perf archive' and
in this case you don't need to add function callback into cmd_struct.
For example, the command 'perf iostat list' will be converted to
'perf stat --iostat=list'.

After building the perf tool you should have two shell scripts in tools/perf
directory and one of them is executable, for example:
# make -C tools/perf
# ls -l tools/perf/perf-iostat*
-rwxr-xr-x 1 root root 290 Mar 10 18:17 perf-iostat
-rw-r--r-- 1 root root 290 Feb  3 15:14 perf-iostat.sh

It should be possible to run 'perf iostat' from build directory:
# cd tools/perf
# ./perf iostat list
S0-uncore_iio_0<:00>
S1-uncore_iio_0<:80>
S0-uncore_iio_1<:17>
S1-uncore_iio_1<:85>
S0-uncore_iio_2<:3a>
S1-uncore_iio_2<:ae>
S0-uncore_iio_3<:5d>
S1-uncore_iio_3<:d7>

Also you can copy 'perf-iostat' to ~/libexec/perf-core/ or just run 
'make install'


# make install
# cp perf /usr/bin/
# ls -lh ~/libexec/perf-core/
total 24K
-rwxr-xr-x 1 root root 1.4K Mar 10 18:17 perf-archive
-rwxr-xr-x 1 root root  290 Mar 10 18:17 perf-iostat
-rwxr

Re: [PATCH v4 4/5] perf stat: Enable iostat mode for x86 platforms

2021-03-08 Thread liuqi (BA)

Hi Alexander,

On 2021/2/3 21:58, Alexander Antonov wrote:

This functionality is based on recently introduced sysfs attributes
for Intel® Xeon® Scalable processor family (code name Skylake-SP):
Commit bb42b3d39781 ("perf/x86/intel/uncore: Expose an Uncore unit to
IIO PMON mapping")

Mode is intended to provide four I/O performance metrics in MB per each
PCIe root port:
  - Inbound Read: I/O devices below root port read from the host memory
  - Inbound Write: I/O devices below root port write to the host memory
  - Outbound Read: CPU reads from I/O devices below root port
  - Outbound Write: CPU writes to I/O devices below root port

Each metric requiries only one uncore event which increments at every 4B
transfer in corresponding direction. The formulas to compute metrics
are generic:
 #EventCount * 4B / (1024 * 1024)

Signed-off-by: Alexander Antonov
---
  tools/perf/Documentation/perf-iostat.txt |  88 ++
  tools/perf/Makefile.perf |   5 +-
  tools/perf/arch/x86/util/Build   |   1 +
  tools/perf/arch/x86/util/iostat.c| 345 +++
  tools/perf/command-list.txt  |   1 +
  tools/perf/perf-iostat.sh|  12 +
  6 files changed, 451 insertions(+), 1 deletion(-)
  create mode 100644 tools/perf/Documentation/perf-iostat.txt
  create mode 100644 tools/perf/perf-iostat.sh

diff --git a/tools/perf/Documentation/perf-iostat.txt 
b/tools/perf/Documentation/perf-iostat.txt
new file mode 100644
index ..165176944031
--- /dev/null
+++ b/tools/perf/Documentation/perf-iostat.txt
@@ -0,0 +1,88 @@
+perf-iostat(1)
+===
+
+NAME
+
+perf-iostat - Show I/O performance metrics
+
+SYNOPSIS
+
+[verse]
+'perf iostat' list
+'perf iostat'  --  []
+
+DESCRIPTION
+---
+Mode is intended to provide four I/O performance metrics per each PCIe root 
port:
+
+- Inbound Read   - I/O devices below root port read from the host memory, in MB
+
+- Inbound Write  - I/O devices below root port write to the host memory, in MB
+
+- Outbound Read  - CPU reads from I/O devices below root port, in MB
+
+- Outbound Write - CPU writes to I/O devices below root port, in MB
+
+OPTIONS
+---
+...::
+   Any command you can specify in a shell.
+
+list::
+   List all PCIe root ports.


I noticed that "iostat" commond and cmd_iostat() callback function is 
not registered in cmd_struct in perf.c. So I think "perf iostat list" 
perhaps can not work properly.


I also test this patchset on x86 platform, and here is the log:

root@ubuntu:/home/lq# ./perf iostat list
perf: 'iostat' is not a perf-command. See 'perf --help'.
root@ubuntu:/home/lq# ./perf stat --iostat
^C
 Performance counter stats for 'system wide':

   port Inbound Read(MB)Inbound Write(MB)Outbound 
Read(MB)   Outbound Write(MB)
:00000 
 0
:80000 
 0
:17000 
 0
:85000 
 0
:3a000 
 0
:ae000 
 0
:5d000 
 0
:d7000 
 0


   0.611303832 seconds time elapsed


root@ubuntu:/home/lq# ./perf stat --iostat=:17
^C
 Performance counter stats for 'system wide':

   port Inbound Read(MB)Inbound Write(MB)Outbound 
Read(MB)   Outbound Write(MB)
:17000 
 0


   0.521317572 seconds time elapsed

So how does following perf iostat list work, did I miss something?

Thanks,
Qi


+
+::
+   Select the root ports for monitoring. Comma-separated list is supported.
+
+EXAMPLES
+
+
+1. List all PCIe root ports (example for 2-S platform):
+
+   $ perf iostat list
+   S0-uncore_iio_0<:00>
+   S1-uncore_iio_0<:80>
+   S0-uncore_iio_1<:17>
+   S1-uncore_iio_1<:85>
+   S0-uncore_iio_2<:3a>
+   S1-uncore_iio_2<:ae>
+   S0-uncore_iio_3<:5d>
+   S1-uncore_iio_3<:d7>




[PATCH v4 4/5] perf stat: Enable iostat mode for x86 platforms

2021-02-03 Thread Alexander Antonov
This functionality is based on recently introduced sysfs attributes
for Intel® Xeon® Scalable processor family (code name Skylake-SP):
Commit bb42b3d39781 ("perf/x86/intel/uncore: Expose an Uncore unit to
IIO PMON mapping")

Mode is intended to provide four I/O performance metrics in MB per each
PCIe root port:
 - Inbound Read: I/O devices below root port read from the host memory
 - Inbound Write: I/O devices below root port write to the host memory
 - Outbound Read: CPU reads from I/O devices below root port
 - Outbound Write: CPU writes to I/O devices below root port

Each metric requiries only one uncore event which increments at every 4B
transfer in corresponding direction. The formulas to compute metrics
are generic:
#EventCount * 4B / (1024 * 1024)

Signed-off-by: Alexander Antonov 
---
 tools/perf/Documentation/perf-iostat.txt |  88 ++
 tools/perf/Makefile.perf |   5 +-
 tools/perf/arch/x86/util/Build   |   1 +
 tools/perf/arch/x86/util/iostat.c| 345 +++
 tools/perf/command-list.txt  |   1 +
 tools/perf/perf-iostat.sh|  12 +
 6 files changed, 451 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/Documentation/perf-iostat.txt
 create mode 100644 tools/perf/perf-iostat.sh

diff --git a/tools/perf/Documentation/perf-iostat.txt 
b/tools/perf/Documentation/perf-iostat.txt
new file mode 100644
index ..165176944031
--- /dev/null
+++ b/tools/perf/Documentation/perf-iostat.txt
@@ -0,0 +1,88 @@
+perf-iostat(1)
+===
+
+NAME
+
+perf-iostat - Show I/O performance metrics
+
+SYNOPSIS
+
+[verse]
+'perf iostat' list
+'perf iostat'  --  []
+
+DESCRIPTION
+---
+Mode is intended to provide four I/O performance metrics per each PCIe root 
port:
+
+- Inbound Read   - I/O devices below root port read from the host memory, in MB
+
+- Inbound Write  - I/O devices below root port write to the host memory, in MB
+
+- Outbound Read  - CPU reads from I/O devices below root port, in MB
+
+- Outbound Write - CPU writes to I/O devices below root port, in MB
+
+OPTIONS
+---
+...::
+   Any command you can specify in a shell.
+
+list::
+   List all PCIe root ports.
+
+::
+   Select the root ports for monitoring. Comma-separated list is supported.
+
+EXAMPLES
+
+
+1. List all PCIe root ports (example for 2-S platform):
+
+   $ perf iostat list
+   S0-uncore_iio_0<:00>
+   S1-uncore_iio_0<:80>
+   S0-uncore_iio_1<:17>
+   S1-uncore_iio_1<:85>
+   S0-uncore_iio_2<:3a>
+   S1-uncore_iio_2<:ae>
+   S0-uncore_iio_3<:5d>
+   S1-uncore_iio_3<:d7>
+
+2. Collect metrics for all PCIe root ports:
+
+   $ perf iostat -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
+   357708+0 records in
+   357707+0 records out
+   375083606016 bytes (375 GB, 349 GiB) copied, 215.974 s, 1.7 GB/s
+
+Performance counter stats for 'system wide':
+
+  port Inbound Read(MB)Inbound Write(MB)Outbound 
Read(MB)   Outbound Write(MB)
+   :00102  
  3
+   :80000  
  0
+   :17   352552   430  
 21
+   :85000  
  0
+   :3a300  
  0
+   :ae000  
  0
+   :5d000  
  0
+   :d7000  
  0
+
+3. Collect metrics for comma-separated list of PCIe root ports:
+
+   $ perf iostat :17,0:3a -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M 
oflag=direct
+   357708+0 records in
+   357707+0 records out
+   375083606016 bytes (375 GB, 349 GiB) copied, 197.08 s, 1.9 GB/s
+
+Performance counter stats for 'system wide':
+
+  port Inbound Read(MB)Inbound Write(MB)Outbound 
Read(MB)   Outbound Write(MB)
+   :17   358559   440  
 22
+   :3a320  
  0
+
+197.081983474 seconds time elapsed
+
+SEE ALSO
+
+linkperf:perf-stat[1]
\ No newline at end of file
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 902c792f326a..b4ab48cc01e3 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -267,6 +267,7 @@ SCRIPT_SH =
 
 SCRIPT_SH += perf-archive.sh
 SCRIPT_SH += perf-with-kcore.sh
+SCRIPT_SH += perf-iostat.sh
 
 grep-libs = $(filter -l%,$(1))
 strip-libs = $(filter-out -l%,$(1))
@@ -890,6 +891,8 @@ endif
$(INSTALL) $(OUTPUT)