Re: [RESEND PATCH v5 0/4] perf stat: Introduce iostat mode to provide I/O performance metrics
On Mon, Apr 19, 2021 at 12:41:43PM +0300, alexander.anto...@linux.intel.com wrote:
> From: Alexander Antonov
>
> Resending V5 with added Acked-by: Namhyung Kim tag.

Thanks, applied.

- Arnaldo

> Thanks,
> Alexander
>
> [...]

[RESEND PATCH v5 0/4] perf stat: Introduce iostat mode to provide I/O performance metrics
From: Alexander Antonov

Resending V5 with added Acked-by: Namhyung Kim tag.

Thanks,
Alexander

The previous version can be found at:
v4: https://lkml.kernel.org/r/20210203135830.38568-1-alexander.anto...@linux.intel.com/

Changes in this revision are:
v4 -> v5:
- Addressed comments from Namhyung Kim:
  1. Removed AGGR_PCIE_PORT aggregation mode
  2. Added iostat_prepare() function
  3. Moved implementation specific fprintf() calls to separate x86-related
     function
  4. Fixed code-related issues
- Moved __weak iostat's functions to separate util/iostat.c file

The previous version can be found at:
v3: https://lkml.kernel.org/r/20210126080619.30275-1-alexander.anto...@linux.intel.com/

Changes in this revision are:
v3 -> v4:
- Addressed comment from Namhyung Kim:
  1. Removed NULL-termination of root ports list

The previous version can be found at:
v2: https://lkml.kernel.org/r/20201223130320.3930-1-alexander.anto...@linux.intel.com

Changes in this revision are:
v2 -> v3:
- Addressed comments from Namhyung Kim:
  1. Removed perf_device pointer from evsel structure. Use priv field instead
  2. Renamed 'iiostat' to 'iostat'
  3. Renamed 'show' mode to 'list' mode
  4. Renamed iiostat_delete_root_ports() to iiostat_release() and
     iostat_show_root_ports() to iostat_list()

The previous version can be found at:
v1: https://lkml.kernel.org/r/20201210090340.14358-1-alexander.anto...@linux.intel.com

Changes in this revision are:
v1 -> v2:
- Addressed comment from Arnaldo Carvalho de Melo:
  1. Using 'perf iiostat' subcommand instead of 'perf stat --iiostat':
     - Added perf-iiostat.sh script to use short command
     - Updated manual pages to get help for 'perf iiostat'
     - Added 'perf-iiostat' to perf's gitignore file

The mode is intended to provide four I/O performance metrics, in MB, for each
root port:
 - Inbound Read:   I/O devices below root port read from the host memory
 - Inbound Write:  I/O devices below root port write to the host memory
 - Outbound Read:  CPU reads from I/O devices below root port
 - Outbound Write: CPU writes to I/O devices below root port

Each metric requires only one uncore event, which increments at every 4B
transfer in the corresponding direction. The formulas to compute metrics
are generic:
    #EventCount * 4B / (1024 * 1024)

Note: iostat introduces a new perf data aggregation mode, per PCIe root port,
hence the -e and -M options are not supported.

Usage examples:

1. List all PCIe root ports (example for 2-S platform):
   $ perf iostat list
   S0-uncore_iio_0<:00>
   S1-uncore_iio_0<:80>
   S0-uncore_iio_1<:17>
   S1-uncore_iio_1<:85>
   S0-uncore_iio_2<:3a>
   S1-uncore_iio_2<:ae>
   S0-uncore_iio_3<:5d>
   S1-uncore_iio_3<:d7>

2. Collect metrics for all PCIe root ports:
   $ perf iostat -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
   357708+0 records in
   357707+0 records out
   375083606016 bytes (375 GB, 349 GiB) copied, 215.974 s, 1.7 GB/s

    Performance counter stats for 'system wide':

    port  Inbound Read(MB)  Inbound Write(MB)  Outbound Read(MB)  Outbound Write(MB)
    :00                  1                  0                  2                   3
    :80                  0                  0                  0                   0
    :17             352552                 43                  0                  21
    :85                  0                  0                  0                   0
    :3a                  3                  0                  0                   0
    :ae                  0                  0                  0                   0
    :5d                  0                  0                  0                   0
    :d7                  0                  0                  0                   0

3. Collect metrics for comma separated list of PCIe root ports:
   $ perf iostat :17,0:3a -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
   357708+0 records in
   357707+0 records out
   375083606016 bytes (375 GB, 349 GiB) copied, 197.08 s, 1.9 GB/s

    Performance counter stats for 'system wide':

    port  Inbound Read(MB)  Inbound Write(MB)  Outbound Read(MB)  Outbound Write(MB)
    :17             358559                 44                  0                  22
    :3a                  3                  2                  0                   0

        197.081983474 seconds time elapsed

Alexander Antonov (4):
  perf stat: Basic support for iostat in perf
  perf stat: Helper functions for PCIe root ports list in iostat mode
  perf stat: Enable iostat mode for x86 platforms
  perf: Update .gitignore file

 tools/perf/.gitignore                    |  1 +
 tools/perf/Documentation/perf-iostat.txt | 88 +
 tools/perf/Makefile.perf                 |  5 +-
 tools/perf/arch/x86/util/Build           |  1 +
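
For readers who want to check the numbers by hand, the generic formula from the cover letter, #EventCount * 4B / (1024 * 1024), can be sketched as below. This is only an illustration and not code from the patch set: the function name, the constants, and the sample event count are all hypothetical.

```python
# Illustrative sketch only; none of these names exist in perf itself.
BYTES_PER_EVENT = 4          # per the cover letter, one increment per 4B transfer
BYTES_PER_MB = 1024 * 1024   # the cover letter divides by 1024 * 1024


def event_count_to_mb(event_count: int) -> float:
    """Apply #EventCount * 4B / (1024 * 1024) to a raw uncore event count."""
    if event_count < 0:
        raise ValueError("event count must be non-negative")
    return event_count * BYTES_PER_EVENT / BYTES_PER_MB


if __name__ == "__main__":
    # Hypothetical count: 262144 increments * 4 bytes is exactly 1 MB.
    print(event_count_to_mb(262144))  # prints 1.0
```

The same conversion applies to all four metrics, since each one is backed by a single event counted in 4B units; only the event being counted differs per direction.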