[PATCH] papr_vpd.c: calling devfd before get_system_loc_code
Calling get_system_loc_code before checking devfd and errno - fails the test when the device is not available, expected a SKIP. Change the order of 'SKIP_IF_MSG' correctly SKIP when the /dev/papr-vpd device is not available. with out patch: Test FAILED on line 271 with patch: [SKIP] Test skipped on line 266: /dev/papr-vpd not present Signed-off-by: R Nageswara Sastry --- tools/testing/selftests/powerpc/papr_vpd/papr_vpd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/powerpc/papr_vpd/papr_vpd.c b/tools/testing/selftests/powerpc/papr_vpd/papr_vpd.c index 98cbb9109ee6..505294da1b9f 100644 --- a/tools/testing/selftests/powerpc/papr_vpd/papr_vpd.c +++ b/tools/testing/selftests/powerpc/papr_vpd/papr_vpd.c @@ -263,10 +263,10 @@ static int papr_vpd_system_loc_code(void) off_t size; int fd; - SKIP_IF_MSG(get_system_loc_code(), - "Cannot determine system location code"); SKIP_IF_MSG(devfd < 0 && errno == ENOENT, DEVPATH " not present"); + SKIP_IF_MSG(get_system_loc_code(), + "Cannot determine system location code"); FAIL_IF(devfd < 0); -- 2.37.1 (Apple Git-137.1)
Re: [PATCH v4 5/6] integrity: PowerVM machine keyring enablement
On 15/08/23 4:57 pm, Nayna Jain wrote: Update Kconfig to enable machine keyring and limit to CA certificates on PowerVM. Only key signing CA keys are allowed. Signed-off-by: Nayna Jain Reviewed-and-tested-by: Mimi Zohar Reviewed-by: Jarkko Sakkinen Tested with trustedcadb, moduledb scenarios Tested-by: Nageswara R Sastry --- security/integrity/Kconfig | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/security/integrity/Kconfig b/security/integrity/Kconfig index ec6e0d789da1..232191ee09e3 100644 --- a/security/integrity/Kconfig +++ b/security/integrity/Kconfig @@ -67,7 +67,9 @@ config INTEGRITY_MACHINE_KEYRING depends on SECONDARY_TRUSTED_KEYRING depends on INTEGRITY_ASYMMETRIC_KEYS depends on SYSTEM_BLACKLIST_KEYRING - depends on LOAD_UEFI_KEYS + depends on LOAD_UEFI_KEYS || LOAD_PPC_KEYS + select INTEGRITY_CA_MACHINE_KEYRING if LOAD_PPC_KEYS + select INTEGRITY_CA_MACHINE_KEYRING_MAX if LOAD_PPC_KEYS help If set, provide a keyring to which Machine Owner Keys (MOK) may be added. This keyring shall contain just MOK keys. Unlike keys -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH v4 6/6] integrity: PowerVM support for loading third party code signing keys
On 15/08/23 4:57 pm, Nayna Jain wrote: On secure boot enabled PowerVM LPAR, third party code signing keys are needed during early boot to verify signed third party modules. These third party keys are stored in moduledb object in the Platform KeyStore (PKS). Load third party code signing keys onto .secondary_trusted_keys keyring. Signed-off-by: Nayna Jain Tested with trustedcadb, moduledb scenarios Tested-by: Nageswara R Sastry --- certs/system_keyring.c| 30 +++ include/keys/system_keyring.h | 4 +++ .../platform_certs/keyring_handler.c | 8 + .../platform_certs/keyring_handler.h | 5 .../integrity/platform_certs/load_powerpc.c | 17 +++ 5 files changed, 64 insertions(+) diff --git a/certs/system_keyring.c b/certs/system_keyring.c index b348e0898d34..33841c91f12c 100644 --- a/certs/system_keyring.c +++ b/certs/system_keyring.c @@ -152,6 +152,36 @@ static __init struct key_restriction *get_builtin_and_secondary_restriction(void return restriction; } + +/** + * add_to_secondary_keyring - Add to secondary keyring. + * @source: Source of key + * @data: The blob holding the key + * @len: The length of the data blob + * + * Add a key to the secondary keyring. The key must be vouched for by a key in the builtin, + * machine or secondary keyring itself. + */ +void __init add_to_secondary_keyring(const char *source, const void *data, size_t len) +{ + key_ref_t key; + key_perm_t perm; + + perm = (KEY_POS_ALL & ~KEY_POS_SETATTR) | KEY_USR_VIEW; + + key = key_create_or_update(make_key_ref(secondary_trusted_keys, 1), + "asymmetric", + NULL, data, len, perm, + KEY_ALLOC_NOT_IN_QUOTA); + if (IS_ERR(key)) { + pr_err("Problem loading X.509 certificate from %s to secondary keyring %ld\n", + source, PTR_ERR(key)); + return; + } + + pr_notice("Loaded X.509 cert '%s'\n", key_ref_to_ptr(key)->description); + key_ref_put(key); +} #endif #ifdef CONFIG_INTEGRITY_MACHINE_KEYRING void __init set_machine_trusted_keys(struct key *keyring) diff --git a/include/keys/system_keyring.h b/include/keys/system_keyring.h index 7e2583208820..8365adf842ef 100644 --- a/include/keys/system_keyring.h +++ b/include/keys/system_keyring.h @@ -50,9 +50,13 @@ int restrict_link_by_digsig_builtin_and_secondary(struct key *keyring, const struct key_type *type, const union key_payload *payload, struct key *restriction_key); +void __init add_to_secondary_keyring(const char *source, const void *data, size_t len); #else #define restrict_link_by_builtin_and_secondary_trusted restrict_link_by_builtin_trusted #define restrict_link_by_digsig_builtin_and_secondary restrict_link_by_digsig_builtin +static inline void __init add_to_secondary_keyring(const char *source, const void *data, size_t len) +{ +} #endif #ifdef CONFIG_INTEGRITY_MACHINE_KEYRING diff --git a/security/integrity/platform_certs/keyring_handler.c b/security/integrity/platform_certs/keyring_handler.c index 586027b9a3f5..13ea17207902 100644 --- a/security/integrity/platform_certs/keyring_handler.c +++ b/security/integrity/platform_certs/keyring_handler.c @@ -78,6 +78,14 @@ __init efi_element_handler_t get_handler_for_ca_keys(const efi_guid_t *sig_type) return NULL; } +__init efi_element_handler_t get_handler_for_code_signing_keys(const efi_guid_t *sig_type) +{ + if (efi_guidcmp(*sig_type, efi_cert_x509_guid) == 0) + return add_to_secondary_keyring; + + return NULL; +} + /* * Return the appropriate handler for particular signature list types found in * the UEFI dbx and MokListXRT tables. diff --git a/security/integrity/platform_certs/keyring_handler.h b/security/integrity/platform_certs/keyring_handler.h index 6f15bb4cc8dc..f92895cc50f6 100644 --- a/security/integrity/platform_certs/keyring_handler.h +++ b/security/integrity/platform_certs/keyring_handler.h @@ -34,6 +34,11 @@ efi_element_handler_t get_handler_for_mok(const efi_guid_t *sig_type); */ efi_element_handler_t get_handler_for_ca_keys(const efi_guid_t *sig_type); +/* + * Return the handler for particular signature list types for code signing keys. + */ +efi_element_handler_t get_handler_for_code_signing_keys(const efi_guid_t *sig_type); + /* * Return the handler for particular signature list types found in the dbx. */ diff --git a/security/integrity/platform_certs/load_powerpc.c b/security/integrity/platform_certs/load_powerpc.c index 339053d9726d..c85febca3343 100644 --- a/security/integrity/platform_certs/load_powerpc.c +++ b/security/integrity/platform_certs/load_powerpc.c @@
Re: [PATCH v4 4/6] integrity: check whether imputed trust is enabled
On 15/08/23 4:57 pm, Nayna Jain wrote: trust_moklist() is specific to UEFI enabled systems. Other platforms rely only on the Kconfig. Define a generic wrapper named imputed_trust_enabled(). Signed-off-by: Nayna Jain Reviewed-off-by: Mimi Zohar Tested with trustedcadb, moduledb scenarios Tested-by: Nageswara R Sastry --- security/integrity/digsig.c| 2 +- security/integrity/integrity.h | 5 +++-- .../integrity/platform_certs/keyring_handler.c | 3 ++- .../integrity/platform_certs/machine_keyring.c | 18 -- 4 files changed, 22 insertions(+), 6 deletions(-) diff --git a/security/integrity/digsig.c b/security/integrity/digsig.c index d0704b1597d4..df387de29bfa 100644 --- a/security/integrity/digsig.c +++ b/security/integrity/digsig.c @@ -113,7 +113,7 @@ static int __init __integrity_init_keyring(const unsigned int id, } else { if (id == INTEGRITY_KEYRING_PLATFORM) set_platform_trusted_keys(keyring[id]); - if (id == INTEGRITY_KEYRING_MACHINE && trust_moklist()) + if (id == INTEGRITY_KEYRING_MACHINE && imputed_trust_enabled()) set_machine_trusted_keys(keyring[id]); if (id == INTEGRITY_KEYRING_IMA) load_module_cert(keyring[id]); diff --git a/security/integrity/integrity.h b/security/integrity/integrity.h index 7167a6e99bdc..d7553c93f5c0 100644 --- a/security/integrity/integrity.h +++ b/security/integrity/integrity.h @@ -320,13 +320,14 @@ static inline void __init add_to_platform_keyring(const char *source, #ifdef CONFIG_INTEGRITY_MACHINE_KEYRING void __init add_to_machine_keyring(const char *source, const void *data, size_t len); -bool __init trust_moklist(void); +bool __init imputed_trust_enabled(void); #else static inline void __init add_to_machine_keyring(const char *source, const void *data, size_t len) { } -static inline bool __init trust_moklist(void) + +static inline bool __init imputed_trust_enabled(void) { return false; } diff --git a/security/integrity/platform_certs/keyring_handler.c b/security/integrity/platform_certs/keyring_handler.c index 1649d047e3b8..586027b9a3f5 100644 --- a/security/integrity/platform_certs/keyring_handler.c +++ b/security/integrity/platform_certs/keyring_handler.c @@ -61,7 +61,8 @@ __init efi_element_handler_t get_handler_for_db(const efi_guid_t *sig_type) __init efi_element_handler_t get_handler_for_mok(const efi_guid_t *sig_type) { if (efi_guidcmp(*sig_type, efi_cert_x509_guid) == 0) { - if (IS_ENABLED(CONFIG_INTEGRITY_MACHINE_KEYRING) && trust_moklist()) + if (IS_ENABLED(CONFIG_INTEGRITY_MACHINE_KEYRING) && + imputed_trust_enabled()) return add_to_machine_keyring; else return add_to_platform_keyring; diff --git a/security/integrity/platform_certs/machine_keyring.c b/security/integrity/platform_certs/machine_keyring.c index 9482e16cb2ca..a401640a63cd 100644 --- a/security/integrity/platform_certs/machine_keyring.c +++ b/security/integrity/platform_certs/machine_keyring.c @@ -34,7 +34,8 @@ void __init add_to_machine_keyring(const char *source, const void *data, size_t * If the restriction check does not pass and the platform keyring * is configured, try to add it into that keyring instead. */ - if (rc && efi_enabled(EFI_BOOT) && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING)) + if (rc && efi_enabled(EFI_BOOT) && + IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING)) rc = integrity_load_cert(INTEGRITY_KEYRING_PLATFORM, source, data, len, perm); @@ -60,7 +61,7 @@ static __init bool uefi_check_trust_mok_keys(void) return false; } -bool __init trust_moklist(void) +static bool __init trust_moklist(void) { static bool initialized; static bool trust_mok; @@ -75,3 +76,16 @@ bool __init trust_moklist(void) return trust_mok; } + +/* + * Provides platform specific check for trusting imputed keys before loading + * on .machine keyring. UEFI systems enable this trust based on a variable, + * and for other platforms, it is always enabled. + */ +bool __init imputed_trust_enabled(void) +{ + if (efi_enabled(EFI_BOOT)) + return trust_moklist(); + + return true; +} -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH v4 3/6] integrity: remove global variable from machine_keyring.c
On 15/08/23 4:57 pm, Nayna Jain wrote: trust_mok variable is accessed within a single function locally. Change trust_mok from global to local static variable. Signed-off-by: Nayna Jain Reviewed-and-tested-by: Mimi Zohar Reviewed-by: Jarkko Sakkinen Tested with trustedcadb, moduledb scenarios Tested-by: Nageswara R Sastry --- security/integrity/platform_certs/machine_keyring.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/security/integrity/platform_certs/machine_keyring.c b/security/integrity/platform_certs/machine_keyring.c index 389a6e7c9245..9482e16cb2ca 100644 --- a/security/integrity/platform_certs/machine_keyring.c +++ b/security/integrity/platform_certs/machine_keyring.c @@ -8,8 +8,6 @@ #include #include "../integrity.h" -static bool trust_mok; - static __init int machine_keyring_init(void) { int rc; @@ -65,9 +63,11 @@ static __init bool uefi_check_trust_mok_keys(void) bool __init trust_moklist(void) { static bool initialized; + static bool trust_mok; if (!initialized) { initialized = true; + trust_mok = false; if (uefi_check_trust_mok_keys()) trust_mok = true; -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH v4 2/6] integrity: ignore keys failing CA restrictions on non-UEFI platform
On 15/08/23 4:57 pm, Nayna Jain wrote: On non-UEFI platforms, handle restrict_link_by_ca failures differently. Certificates which do not satisfy CA restrictions on non-UEFI platforms are ignored. Signed-off-by: Nayna Jain Reviewed-and-tested-by: Mimi Zohar Acked-by: Jarkko Sakkinen Tested with trustedcadb, moduledb scenarios Tested-by: Nageswara R Sastry --- security/integrity/platform_certs/machine_keyring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/security/integrity/platform_certs/machine_keyring.c b/security/integrity/platform_certs/machine_keyring.c index 7aaed7950b6e..389a6e7c9245 100644 --- a/security/integrity/platform_certs/machine_keyring.c +++ b/security/integrity/platform_certs/machine_keyring.c @@ -36,7 +36,7 @@ void __init add_to_machine_keyring(const char *source, const void *data, size_t * If the restriction check does not pass and the platform keyring * is configured, try to add it into that keyring instead. */ - if (rc && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING)) + if (rc && efi_enabled(EFI_BOOT) && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING)) rc = integrity_load_cert(INTEGRITY_KEYRING_PLATFORM, source, data, len, perm); -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH v4 1/6] integrity: PowerVM support for loading CA keys on machine keyring
On 15/08/23 4:57 pm, Nayna Jain wrote: Keys that derive their trust from an entity such as a security officer, administrator, system owner, or machine owner are said to have "imputed trust". CA keys with imputed trust can be loaded onto the machine keyring. The mechanism for loading these keys onto the machine keyring is platform dependent. Load keys stored in the variable trustedcadb onto the .machine keyring on PowerVM platform. Signed-off-by: Nayna Jain Reviewed-and-tested-by: Mimi Zohar Tested with trustedcadb, moduledb scenarios Tested-by: Nageswara R Sastry --- .../integrity/platform_certs/keyring_handler.c | 8 .../integrity/platform_certs/keyring_handler.h | 5 + .../integrity/platform_certs/load_powerpc.c | 17 + 3 files changed, 30 insertions(+) diff --git a/security/integrity/platform_certs/keyring_handler.c b/security/integrity/platform_certs/keyring_handler.c index 8a1124e4d769..1649d047e3b8 100644 --- a/security/integrity/platform_certs/keyring_handler.c +++ b/security/integrity/platform_certs/keyring_handler.c @@ -69,6 +69,14 @@ __init efi_element_handler_t get_handler_for_mok(const efi_guid_t *sig_type) return NULL; } +__init efi_element_handler_t get_handler_for_ca_keys(const efi_guid_t *sig_type) +{ + if (efi_guidcmp(*sig_type, efi_cert_x509_guid) == 0) + return add_to_machine_keyring; + + return NULL; +} + /* * Return the appropriate handler for particular signature list types found in * the UEFI dbx and MokListXRT tables. diff --git a/security/integrity/platform_certs/keyring_handler.h b/security/integrity/platform_certs/keyring_handler.h index 212d894a8c0c..6f15bb4cc8dc 100644 --- a/security/integrity/platform_certs/keyring_handler.h +++ b/security/integrity/platform_certs/keyring_handler.h @@ -29,6 +29,11 @@ efi_element_handler_t get_handler_for_db(const efi_guid_t *sig_type); */ efi_element_handler_t get_handler_for_mok(const efi_guid_t *sig_type); +/* + * Return the handler for particular signature list types for CA keys. + */ +efi_element_handler_t get_handler_for_ca_keys(const efi_guid_t *sig_type); + /* * Return the handler for particular signature list types found in the dbx. */ diff --git a/security/integrity/platform_certs/load_powerpc.c b/security/integrity/platform_certs/load_powerpc.c index 170789dc63d2..339053d9726d 100644 --- a/security/integrity/platform_certs/load_powerpc.c +++ b/security/integrity/platform_certs/load_powerpc.c @@ -59,6 +59,7 @@ static __init void *get_cert_list(u8 *key, unsigned long keylen, u64 *size) static int __init load_powerpc_certs(void) { void *db = NULL, *dbx = NULL, *data = NULL; + void *trustedca; u64 dsize = 0; u64 offset = 0; int rc = 0; @@ -120,6 +121,22 @@ static int __init load_powerpc_certs(void) kfree(data); } + data = get_cert_list("trustedcadb", 12, ); + if (!data) { + pr_info("Couldn't get trustedcadb list from firmware\n"); + } else if (IS_ERR(data)) { + rc = PTR_ERR(data); + pr_err("Error reading trustedcadb from firmware: %d\n", rc); + } else { + extract_esl(trustedca, data, dsize, offset); + + rc = parse_efi_signature_list("powerpc:trustedca", trustedca, dsize, + get_handler_for_ca_keys); + if (rc) + pr_err("Couldn't parse trustedcadb signatures: %d\n", rc); + kfree(data); + } + return rc; } late_initcall(load_powerpc_certs); -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH v2] security/integrity: fix pointer to ESL data and its size on pseries
On 08/06/23 5:34 pm, Nayna Jain wrote: On PowerVM guest, variable data is prefixed with 8 bytes of timestamp. Extract ESL by stripping off the timestamp before passing to ESL parser. Fixes: 4b3e71e9a34c ("integrity/powerpc: Support loading keys from PLPKS") Cc: sta...@vger.kenrnel.org # v6.3 Signed-off-by: Nayna Jain --- Tested-by: Nageswara R Sastry Seeing the keyring difference with and with out patch: With out patch: [root@ltcrain80-lp2 ~]# grep keyring /proc/keys | grep -v _ses 0131ebb4 I--Q--- 1 perm 0c03 0 65534 keyring .user_reg: 2 039dfd93 I-- 1 perm 1f0f 0 0 keyring .ima: 1 03a160b5 I-- 1 perm 1f0b 0 0 keyring .builtin_trusted_keys: 2 05377b73 I-- 1 perm 1f0b 0 0 keyring .platform: empty 0d7ea730 I-- 1 perm 0f0b 0 0 keyring .blacklist: empty 16235f2d I--Q--- 6 perm 1f3f 0 65534 keyring _uid.0: empty 1721f130 I-- 1 perm 1f0f 0 0 keyring .evm: empty With patch: [root@ltcrain80-lp2 ~]# grep keyring /proc/keys | grep -v _ses 04820159 I-- 1 perm 0f0b 0 0 keyring .blacklist: 1 16d05827 I--Q--- 1 perm 0c03 0 65534 keyring .user_reg: 2 17648d6a I-- 1 perm 1f0b 0 0 keyring .builtin_trusted_keys: 2 2158b34f I--Q--- 6 perm 1f3f 0 65534 keyring _uid.0: empty 2237eff6 I-- 1 perm 1f0f 0 0 keyring .evm: empty 26d0330c I-- 1 perm 1f0b 0 0 keyring .platform: 1 2daa48ab I-- 1 perm 1f0f 0 0 keyring .ima: 1 Thank you. Changelog: v2: Fixed feedback from Jarkko * added CC to stable * moved *data declaration to same line as *db,*dbx Renamed extract_data() macro to extract_esl() for clarity .../integrity/platform_certs/load_powerpc.c | 40 --- 1 file changed, 26 insertions(+), 14 deletions(-) diff --git a/security/integrity/platform_certs/load_powerpc.c b/security/integrity/platform_certs/load_powerpc.c index b9de70b90826..170789dc63d2 100644 --- a/security/integrity/platform_certs/load_powerpc.c +++ b/security/integrity/platform_certs/load_powerpc.c @@ -15,6 +15,9 @@ #include "keyring_handler.h" #include "../integrity.h" +#define extract_esl(db, data, size, offset) \ + do { db = data + offset; size = size - offset; } while (0) + /* * Get a certificate list blob from the named secure variable. * @@ -55,8 +58,9 @@ static __init void *get_cert_list(u8 *key, unsigned long keylen, u64 *size) */ static int __init load_powerpc_certs(void) { - void *db = NULL, *dbx = NULL; - u64 dbsize = 0, dbxsize = 0; + void *db = NULL, *dbx = NULL, *data = NULL; + u64 dsize = 0; + u64 offset = 0; int rc = 0; ssize_t len; char buf[32]; @@ -74,38 +78,46 @@ static int __init load_powerpc_certs(void) return -ENODEV; } + if (strcmp("ibm,plpks-sb-v1", buf) == 0) + /* PLPKS authenticated variables ESL data is prefixed with 8 bytes of timestamp */ + offset = 8; + /* * Get db, and dbx. They might not exist, so it isn't an error if we * can't get them. */ - db = get_cert_list("db", 3, ); - if (!db) { + data = get_cert_list("db", 3, ); + if (!data) { pr_info("Couldn't get db list from firmware\n"); - } else if (IS_ERR(db)) { - rc = PTR_ERR(db); + } else if (IS_ERR(data)) { + rc = PTR_ERR(data); pr_err("Error reading db from firmware: %d\n", rc); return rc; } else { - rc = parse_efi_signature_list("powerpc:db", db, dbsize, + extract_esl(db, data, dsize, offset); + + rc = parse_efi_signature_list("powerpc:db", db, dsize, get_handler_for_db); if (rc) pr_err("Couldn't parse db signatures: %d\n", rc); - kfree(db); + kfree(data); } - dbx = get_cert_list("dbx", 4, ); - if (!dbx) { + data = get_cert_list("dbx", 4, ); + if (!data) { pr_info("Couldn't get dbx list from firmware\n"); - } else if (IS_ERR(dbx)) { - rc = PTR_ERR(dbx); + } else if (IS_ERR(data)) { + rc = PTR_ERR(data); pr_err("Error reading dbx from firmware: %d\n", rc); return rc; } else { - rc = parse_efi_signature_list("powerpc:dbx", dbx, dbxsize, + extract_esl(dbx, data, dsize, offset); + + rc = parse_efi_signature_list("powerpc:dbx", dbx, dsize, get_handler_for_dbx); if (rc) pr_err("Couldn't parse dbx signatures: %d\n", rc); - kfree(dbx); +
Re: [PATCH] powerpc/security: Fix Speculation_Store_Bypass reporting on Power10
On 17/05/23 1:19 pm, Michael Ellerman wrote: Nageswara reported that /proc/self/status was showing "vulnerable" for the Speculation_Store_Bypass feature on Power10, eg: $ grep Speculation_Store_Bypass: /proc/self/status Speculation_Store_Bypass: vulnerable But at the same time the sysfs files, and lscpu, were showing "Not affected". This turns out to simply be a bug in the reporting of the Speculation_Store_Bypass, aka. PR_SPEC_STORE_BYPASS, case. When SEC_FTR_STF_BARRIER was added, so that firmware could communicate the vulnerability was not present, the code in ssb_prctl_get() was not updated to check the new flag. So add the check for SEC_FTR_STF_BARRIER being disabled. Rather than adding the new check to the existing if block and expanding the comment to cover both cases, rewrite the three cases to be separate so they can be commented separately for clarity. Fixes: 84ed26fd00c5 ("powerpc/security: Add a security feature for STF barrier") Cc: sta...@vger.kernel.org # v5.14+ Reported-by: Nageswara R Sastry Signed-off-by: Michael Ellerman Thanks for the patch. Adding tested-by tag along with test results. With out patch: # grep Speculation_Store_Bypass: /proc/self/status Speculation_Store_Bypass: vulnerable # uname -r 6.4.0-rc2 With patch: # grep Speculation_Store_Bypass: /proc/self/status Speculation_Store_Bypass: not vulnerable # uname -r 6.4.0-rc2 Tested-by: Nageswara R Sastry --- arch/powerpc/kernel/security.c | 37 +- 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c index 206475e3e0b4..4856e1a5161c 100644 --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c @@ -364,26 +364,27 @@ ssize_t cpu_show_spec_store_bypass(struct device *dev, struct device_attribute * static int ssb_prctl_get(struct task_struct *task) { + /* +* The STF_BARRIER feature is on by default, so if it's off that means +* firmware has explicitly said the CPU is not vulnerable via either +* the hypercall or device tree. +*/ + if (!security_ftr_enabled(SEC_FTR_STF_BARRIER)) + return PR_SPEC_NOT_AFFECTED; + + /* +* If the system's CPU has no known barrier (see setup_stf_barrier()) +* then assume that the CPU is not vulnerable. +*/ if (stf_enabled_flush_types == STF_BARRIER_NONE) - /* -* We don't have an explicit signal from firmware that we're -* vulnerable or not, we only have certain CPU revisions that -* are known to be vulnerable. -* -* We assume that if we're on another CPU, where the barrier is -* NONE, then we are not vulnerable. -*/ return PR_SPEC_NOT_AFFECTED; - else - /* -* If we do have a barrier type then we are vulnerable. The -* barrier is not a global or per-process mitigation, so the -* only value we can report here is PR_SPEC_ENABLE, which -* appears as "vulnerable" in /proc. -*/ - return PR_SPEC_ENABLE; - - return -EINVAL; + + /* +* Otherwise the CPU is vulnerable. The barrier is not a global or +* per-process mitigation, so the only value that can be reported here +* is PR_SPEC_ENABLE, which appears as "vulnerable" in /proc. +*/ + return PR_SPEC_ENABLE; } int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which) -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH] powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
On 22/03/23 9:23 am, Russell Currey wrote: fail_iommu_setup() registers the fail_iommu_bus_notifier struct to both PCI and VIO buses. struct notifier_block is a linked list node, so this causes any notifiers later registered to either bus type to also be registered to the other since they share the same node. This causes issues in (at least) the vgaarb code, which registers a notifier for PCI buses. pci_notify() ends up being called on a vio device, converted with to_pci_dev() even though it's not a PCI device, and finally makes a bad access in vga_arbiter_add_pci_device() as discovered with KASAN: BUG: KASAN: slab-out-of-bounds in vga_arbiter_add_pci_device+0x60/0xe00 Read of size 4 at addr c00264c26fdc by task swapper/0/1 Call Trace: [c00263607520] [c00010f7023c] dump_stack_lvl+0x1bc/0x2b8 (unreliable) [c00263607560] [cf142a64] print_report+0x3f4/0xc60 [c00263607640] [cf142144] kasan_report+0x244/0x698 [c00263607740] [cf1460e8] __asan_load4+0xe8/0x250 [c00263607760] [cff4b850] vga_arbiter_add_pci_device+0x60/0xe00 [c00263607850] [cff4c678] pci_notify+0x88/0x444 [c002636078b0] [ce94dfc4] notifier_call_chain+0x104/0x320 [c00263607950] [ce94f050] blocking_notifier_call_chain+0xa0/0x140 [c00263607990] [c000100cb3b8] device_add+0xac8/0x1d30 [c00263607aa0] [c000100ccd98] device_register+0x58/0x80 [c00263607ad0] [ce84247c] vio_register_device_node+0x9ac/0xce0 [c00263607ba0] [c000126c95d8] vio_bus_scan_register_devices+0xc4/0x13c [c00263607bd0] [c000126c96e4] __machine_initcall_pseries_vio_device_init+0x94/0xf0 [c00263607c00] [ce69467c] do_one_initcall+0x12c/0xaa8 [c00263607cf0] [c0001268b8a8] kernel_init_freeable+0xa48/0xba8 [c00263607dd0] [ce695f24] kernel_init+0x64/0x400 [c00263607e50] [ce68e0e4] ret_from_kernel_thread+0x5c/0x64 Fix this by creating separate notifier_block structs for each bus type. Fixes: d6b9a81b2a45 ("powerpc: IOMMU fault injection") Reported-by: Nageswara R Sastry Signed-off-by: Russell Currey Tested-by: Nageswara R Sastry --- arch/powerpc/kernel/iommu.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c index ee95937bdaf1..6f1117fe3870 100644 --- a/arch/powerpc/kernel/iommu.c +++ b/arch/powerpc/kernel/iommu.c @@ -171,17 +171,26 @@ static int fail_iommu_bus_notify(struct notifier_block *nb, return 0; } -static struct notifier_block fail_iommu_bus_notifier = { +/* + * PCI and VIO buses need separate notifier_block structs, since they're linked + * list nodes. Sharing a notifier_block would mean that any notifiers later + * registered for PCI buses would also get called by VIO buses and vice versa. + */ +static struct notifier_block fail_iommu_pci_bus_notifier = { + .notifier_call = fail_iommu_bus_notify +}; + +static struct notifier_block fail_iommu_vio_bus_notifier = { .notifier_call = fail_iommu_bus_notify }; static int __init fail_iommu_setup(void) { #ifdef CONFIG_PCI - bus_register_notifier(_bus_type, _iommu_bus_notifier); + bus_register_notifier(_bus_type, _iommu_pci_bus_notifier); #endif #ifdef CONFIG_IBMVIO - bus_register_notifier(_bus_type, _iommu_bus_notifier); + bus_register_notifier(_bus_type, _iommu_vio_bus_notifier); #endif return 0; -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH 2/2] tools/perf: Fix out of bound access to cpu mask array
On 05/09/22 10:24 am, Athira Rajeev wrote: The cpu mask init code in "record__mmap_cpu_mask_init" function access "bits" array part of "struct mmap_cpu_mask". The size of this array is the value from cpu__max_cpu().cpu. This array is used to contain the cpumask value for each cpu. While setting bit for each cpu, it calls "set_bit" function which access index in "bits" array. If we provide a command line option to -C which is greater than the number of CPU's present in the system, the set_bit could access an array member which is out-of the array size. This is because currently, there is no boundary check for the CPU. This will result in seg fault: <<>> ./perf record -C 12341234 ls Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS Segmentation fault (core dumped) <<>> Debugging with gdb, points to function flow as below: <<>> set_bit record__mmap_cpu_mask_init record__init_thread_default_masks record__init_thread_masks cmd_record <<>> Fix this by adding boundary check for the array. After the patch: <<>> ./perf record -C 12341234 ls Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS Failed to initialize parallel data streaming masks <<>> With this fix, if -C is given a non-exsiting CPU, perf record will fail with: <<>> ./perf record -C 50 ls Failed to initialize parallel data streaming masks <<>> Reported-by: Nageswara Sastry Tested-by: Nageswara Sastry Signed-off-by: Athira Rajeev --- tools/perf/builtin-record.c | 26 -- 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 4713f0f3a6cf..09b68d76bbdc 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -3358,16 +3358,22 @@ static struct option __record_options[] = { struct option *record_options = __record_options; -static void record__mmap_cpu_mask_init(struct mmap_cpu_mask *mask, struct perf_cpu_map *cpus) +static int record__mmap_cpu_mask_init(struct mmap_cpu_mask *mask, struct perf_cpu_map *cpus) { struct perf_cpu cpu; int idx; if (cpu_map__is_dummy(cpus)) - return; + return 0; - perf_cpu_map__for_each_cpu(cpu, idx, cpus) + perf_cpu_map__for_each_cpu(cpu, idx, cpus) { + /* Return ENODEV is input cpu is greater than max cpu */ + if ((unsigned long)cpu.cpu > mask->nbits) + return -ENODEV; set_bit(cpu.cpu, mask->bits); + } + + return 0; } static int record__mmap_cpu_mask_init_spec(struct mmap_cpu_mask *mask, const char *mask_spec) @@ -3379,7 +3385,9 @@ static int record__mmap_cpu_mask_init_spec(struct mmap_cpu_mask *mask, const cha return -ENOMEM; bitmap_zero(mask->bits, mask->nbits); - record__mmap_cpu_mask_init(mask, cpus); + if (record__mmap_cpu_mask_init(mask, cpus)) + return -ENODEV; + perf_cpu_map__put(cpus); return 0; @@ -3461,7 +3469,12 @@ static int record__init_thread_masks_spec(struct record *rec, struct perf_cpu_ma pr_err("Failed to allocate CPUs mask\n"); return ret; } - record__mmap_cpu_mask_init(_mask, cpus); + + ret = record__mmap_cpu_mask_init(_mask, cpus); + if (ret) { + pr_err("Failed to init cpu mask\n"); + goto out_free_cpu_mask; + } ret = record__thread_mask_alloc(_mask, cpu__max_cpu().cpu); if (ret) { @@ -3702,7 +3715,8 @@ static int record__init_thread_default_masks(struct record *rec, struct perf_cpu if (ret) return ret; - record__mmap_cpu_mask_init(>thread_masks->maps, cpus); + if (record__mmap_cpu_mask_init(>thread_masks->maps, cpus)) + return -ENODEV; rec->nr_threads = 1; -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH 1/2] tools/perf: Fix out of bound access to affinity "sched_cpus"
On 05/09/22 10:24 am, Athira Rajeev wrote: The affinity code in "affinity_set" function access array named "sched_cpus". The size for this array is allocated in affinity_setup function which is nothing but value from get_cpu_set_size. This is used to contain the cpumask value for each cpu. While setting bit for each cpu, it calls "set_bit" function which access index in sched_cpus array. If we provide a command-line option to -C which is more than the number of CPU's present in the system, the set_bit could access an array member which is out-of the array size. This is because currently, there is no boundary check for the CPU. This will result in seg fault: <<>> ./perf stat -C 12323431 ls Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS Segmentation fault (core dumped) <<>> Fix this by adding boundary check for the array. After the fix from powerpc system: <<>> ./perf stat -C 12323431 ls 1>out Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS Performance counter stats for 'CPU(s) 12323431': msec cpu-clock context-switches cpu-migrations page-faults cycles instructions branches branch-misses 0.001192373 seconds time elapsed <<>> Reported-by: Nageswara Sastry Tested-by: Nageswara Sastry Signed-off-by: Athira Rajeev --- tools/perf/util/affinity.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/affinity.c b/tools/perf/util/affinity.c index 4d216c0dc425..a1dd37347abc 100644 --- a/tools/perf/util/affinity.c +++ b/tools/perf/util/affinity.c @@ -49,8 +49,14 @@ void affinity__set(struct affinity *a, int cpu) { int cpu_set_size = get_cpu_set_size(); - if (cpu == -1) + /* +* Return: +* - if cpu is -1 +* - restrict out of bound access to sched_cpus +*/ + if (cpu == -1 || ((cpu / __BITS_PER_LONG) >= (cpu_set_size / 8))) return; + a->changed = true; set_bit(cpu, a->sched_cpus); /* -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH v6 0/4] Add perf interface to expose nvdimm
On 25/02/22 12:08 pm, kajoljain wrote: On 2/25/22 11:25, Nageswara Sastry wrote: On 17/02/22 10:03 pm, Kajol Jain wrote: Patchset adds performance stats reporting support for nvdimm. Added interface includes support for pmu register/unregister functions. A structure is added called nvdimm_pmu to be used for adding arch/platform specific data such as cpumask, nvdimm device pointer and pmu event functions like event_init/add/read/del. User could use the standard perf tool to access perf events exposed via pmu. Interface also defines supported event list, config fields for the event attributes and their corresponding bit values which are exported via sysfs. Patch 3 exposes IBM pseries platform nmem* device performance stats using this interface. Result from power9 pseries lpar with 2 nvdimm device: Ex: List all event by perf list command:# perf list nmem nmem0/cache_rh_cnt/ [Kernel PMU event] nmem0/cache_wh_cnt/ [Kernel PMU event] nmem0/cri_res_util/ [Kernel PMU event] nmem0/ctl_res_cnt/ [Kernel PMU event] nmem0/ctl_res_tm/ [Kernel PMU event] nmem0/fast_w_cnt/ [Kernel PMU event] nmem0/host_l_cnt/ [Kernel PMU event] nmem0/host_l_dur/ [Kernel PMU event] nmem0/host_s_cnt/ [Kernel PMU event] nmem0/host_s_dur/ [Kernel PMU event] nmem0/med_r_cnt/ [Kernel PMU event] nmem0/med_r_dur/ [Kernel PMU event] nmem0/med_w_cnt/ [Kernel PMU event] nmem0/med_w_dur/ [Kernel PMU event] nmem0/mem_life/ [Kernel PMU event] nmem0/poweron_secs/ [Kernel PMU event] ... nmem1/mem_life/ [Kernel PMU event] nmem1/poweron_secs/ [Kernel PMU event] Patch1: Introduces the nvdimm_pmu structure Patch2: Adds common interface to add arch/platform specific data includes nvdimm device pointer, pmu data along with pmu event functions. It also defines supported event list and adds attribute groups for format, events and cpumask. It also adds code for cpu hotplug support. Patch3: Add code in arch/powerpc/platform/pseries/papr_scm.c to expose nmem* pmu. It fills in the nvdimm_pmu structure with pmu name, capabilities, cpumask and event functions and then registers the pmu by adding callbacks to register_nvdimm_pmu. Patch4: Sysfs documentation patch Changelog Tested these patches with the automated tests at avocado-misc-tests/perf/perf_nmem.py URL: https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/perf/perf_nmem.py 1. On the system where target id and online id were different then not seeing value in 'cpumask' and those tests failed. Example: Log from dmesg ... papr_scm ibm,persistent-memory:ibm,pmemory@4413: Region registered with target node 1 and online node 0 ... Hi Nageswara Sastry, Thanks for testing the patch set. Yes you right, incase target node id and online node id is different, it can happen when target node is not online and hence can cause this issue, thanks for pointing it. Function dev_to_node will return node id for a given nvdimm device which can be offline in some scenarios. We should use numa node id return by numa_map_to_online_node function in that scenario. This function incase given node is offline, it will lookup for next closest online node and return that nodeid. Can you try with below change and see, if you are still getting this issue. Please let me know. diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c index bdf2620db461..4dd513d7c029 100644 --- a/arch/powerpc/platforms/pseries/papr_scm.c +++ b/arch/powerpc/platforms/pseries/papr_scm.c @@ -536,7 +536,7 @@ static void papr_scm_pmu_register(struct papr_scm_priv *p) PERF_PMU_CAP_NO_EXCLUDE; /*updating the cpumask variable */ - nodeid = dev_to_node(>pdev->dev); + nodeid = numa_map_to_online_node(dev_to_node(>pdev->dev)); nd_pmu->arch_cpumask = *cpumask_of_node(nodeid); Thanks, Kajol Jain With the above patch all the tests are passing on the system where target id and online id were different. Here is the the result: (1/9) perf_nmem.py:perfNMEM.test_pmu_register_dmesg: PASS (3.47 s) (2/9) perf_nmem.py:perfNMEM.test_sysfs: PASS (1.15 s) (3/9) perf_nmem.py:perfNMEM.test_pmu_count: PASS (1.08 s) (4/9) perf_nmem.py:perfNMEM.
Re: [PATCH v6 0/4] Add perf interface to expose nvdimm
On 17/02/22 10:03 pm, Kajol Jain wrote: Patchset adds performance stats reporting support for nvdimm. Added interface includes support for pmu register/unregister functions. A structure is added called nvdimm_pmu to be used for adding arch/platform specific data such as cpumask, nvdimm device pointer and pmu event functions like event_init/add/read/del. User could use the standard perf tool to access perf events exposed via pmu. Interface also defines supported event list, config fields for the event attributes and their corresponding bit values which are exported via sysfs. Patch 3 exposes IBM pseries platform nmem* device performance stats using this interface. Result from power9 pseries lpar with 2 nvdimm device: Ex: List all event by perf list command:# perf list nmem nmem0/cache_rh_cnt/[Kernel PMU event] nmem0/cache_wh_cnt/[Kernel PMU event] nmem0/cri_res_util/[Kernel PMU event] nmem0/ctl_res_cnt/ [Kernel PMU event] nmem0/ctl_res_tm/ [Kernel PMU event] nmem0/fast_w_cnt/ [Kernel PMU event] nmem0/host_l_cnt/ [Kernel PMU event] nmem0/host_l_dur/ [Kernel PMU event] nmem0/host_s_cnt/ [Kernel PMU event] nmem0/host_s_dur/ [Kernel PMU event] nmem0/med_r_cnt/ [Kernel PMU event] nmem0/med_r_dur/ [Kernel PMU event] nmem0/med_w_cnt/ [Kernel PMU event] nmem0/med_w_dur/ [Kernel PMU event] nmem0/mem_life/[Kernel PMU event] nmem0/poweron_secs/[Kernel PMU event] ... nmem1/mem_life/[Kernel PMU event] nmem1/poweron_secs/[Kernel PMU event] Patch1: Introduces the nvdimm_pmu structure Patch2: Adds common interface to add arch/platform specific data includes nvdimm device pointer, pmu data along with pmu event functions. It also defines supported event list and adds attribute groups for format, events and cpumask. It also adds code for cpu hotplug support. Patch3: Add code in arch/powerpc/platform/pseries/papr_scm.c to expose nmem* pmu. It fills in the nvdimm_pmu structure with pmu name, capabilities, cpumask and event functions and then registers the pmu by adding callbacks to register_nvdimm_pmu. Patch4: Sysfs documentation patch Changelog Tested these patches with the automated tests at avocado-misc-tests/perf/perf_nmem.py URL: https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/perf/perf_nmem.py 1. On the system where target id and online id were different then not seeing value in 'cpumask' and those tests failed. Example: Log from dmesg ... papr_scm ibm,persistent-memory:ibm,pmemory@4413: Region registered with target node 1 and online node 0 ... tests log: (1/9) perf_nmem.py:perfNMEM.test_pmu_register_dmesg: PASS (1.13 s) (2/9) perf_nmem.py:perfNMEM.test_sysfs: PASS (1.10 s) (3/9) perf_nmem.py:perfNMEM.test_pmu_count: PASS (1.07 s) (4/9) perf_nmem.py:perfNMEM.test_all_events: PASS (18.14 s) (5/9) perf_nmem.py:perfNMEM.test_all_group_events: PASS (2.18 s) (6/9) perf_nmem.py:perfNMEM.test_mixed_events: CANCEL: With single PMU mixed events test is not possible. (1.10 s) (7/9) perf_nmem.py:perfNMEM.test_pmu_cpumask: ERROR: invalid literal for int() with base 10: '' (1.10 s) (8/9) perf_nmem.py:perfNMEM.test_cpumask: ERROR: invalid literal for int() with base 10: '' (1.10 s) (9/9) perf_nmem.py:perfNMEM.test_cpumask_cpu_off: ERROR: invalid literal for int() with base 10: '' (1.07 s) 2. On the system where target id and online id were same then seeing value in 'cpumask' and those tests pass. tests log: (1/9) perf_nmem.py:perfNMEM.test_pmu_register_dmesg: PASS (1.16 s) (2/9) perf_nmem.py:perfNMEM.test_sysfs: PASS (1.10 s) (3/9) perf_nmem.py:perfNMEM.test_pmu_count: PASS (1.12 s) (4/9) perf_nmem.py:perfNMEM.test_all_events: PASS (18.10 s) (5/9) perf_nmem.py:perfNMEM.test_all_group_events: PASS (2.23 s) (6/9) perf_nmem.py:perfNMEM.test_mixed_events: CANCEL: With single PMU mixed events test is not possible. (1.13 s) (7/9) perf_nmem.py:perfNMEM.test_pmu_cpumask: PASS (1.08 s) (8/9) perf_nmem.py:perfNMEM.test_cpumask: PASS (1.09 s) (9/9) perf_nmem.py:perfNMEM.test_cpumask_cpu_off: PASS (1.62 s) --- Resend v5 -> v6 - No logic change, just a rebase to latest upstream and tested the patchset. - Link to the patchset Resend v5: https://lkml.org/lkml/2021/11/15/3979 v5 -> Resend v5 - Resend the patchset - Link
Re: [PATCH]selftests/powerpc: skip tests for unavailable mitigations.
On 13/12/21 10:12 pm, Sachin Sant wrote: Mitigation patching test iterates over a set of mitigations irrespective of whether a certain mitigation is supported/available in the kernel. This causes following messages on a kernel where some mitigations are unavailable: Spawned threads enabling/disabling mitigations ... cat: entry_flush: No such file or directory cat: uaccess_flush: No such file or directory Waiting for timeout ... OK This patch adds a check for available mitigations in the kernel. Reported-by: Nageswara R Sastry Signed-off-by: Sachin Sant Tested-by: Nageswara R Sastry --- diff -Naurp aa/tools/testing/selftests/powerpc/security/mitigation-patching.sh bb/tools/testing/selftests/powerpc/security/mitigation-patching.sh --- aa/tools/testing/selftests/powerpc/security/mitigation-patching.sh 2021-12-13 10:17:05.714127154 -0500 +++ bb/tools/testing/selftests/powerpc/security/mitigation-patching.sh 2021-12-13 10:19:32.575315913 -0500 @@ -44,7 +44,10 @@ mitigations="barrier_nospec stf_barrier for m in $mitigations do -do_one "$m" & +if [[ -f /sys/kernel/debug/powerpc/$m ]] +then +do_one "$m" & +fi done echo "Spawned threads enabling/disabling mitigations ..." -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH V2 1/2] tools/perf: Include global and local variants for p_stage_cyc sort key
On 07/12/21 8:22 pm, Arnaldo Carvalho de Melo wrote: Em Fri, Dec 03, 2021 at 07:50:37AM +0530, Athira Rajeev escreveu: Sort key p_stage_cyc is used to present the latency cycles spend in pipeline stages. perf tool has local p_stage_cyc sort key to display this info. There is no global variant available for this sort key. local variant shows latency in a sinlge sample, whereas, global value will be useful to present the total latency (sum of latencies) in the hist entry. It represents latency number multiplied by the number of samples. Add global (p_stage_cyc) and local variant (local_p_stage_cyc) for this sort key. Use the local_p_stage_cyc as default option for "mem" sort mode. Also add this to list of dynamic sort keys and made the "dynamic_headers" and "arch_specific_sort_keys" as static. Signed-off-by: Athira Rajeev Reported-by: Namhyung Kim I got this for v1, does it stand for v2? Tested-by: Nageswara R Sastry Tested with v2 also. Tested-by: Nageswara R Sastry --- Changelog: v1 -> v2: Addressed review comments from Jiri by making the "dynamic_headers" and "arch_specific_sort_keys" as static. tools/perf/util/hist.c | 4 +++- tools/perf/util/hist.h | 3 ++- tools/perf/util/sort.c | 34 +- tools/perf/util/sort.h | 3 ++- 4 files changed, 32 insertions(+), 12 deletions(-) diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index b776465e04ef..0a8033b09e28 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -211,7 +211,9 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h) hists__new_col_len(hists, HISTC_MEM_BLOCKED, 10); hists__new_col_len(hists, HISTC_LOCAL_INS_LAT, 13); hists__new_col_len(hists, HISTC_GLOBAL_INS_LAT, 13); - hists__new_col_len(hists, HISTC_P_STAGE_CYC, 13); + hists__new_col_len(hists, HISTC_LOCAL_P_STAGE_CYC, 13); + hists__new_col_len(hists, HISTC_GLOBAL_P_STAGE_CYC, 13); + if (symbol_conf.nanosecs) hists__new_col_len(hists, HISTC_TIME, 16); else diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h index 621f35ae1efa..2a15e22fb89c 100644 --- a/tools/perf/util/hist.h +++ b/tools/perf/util/hist.h @@ -75,7 +75,8 @@ enum hist_column { HISTC_MEM_BLOCKED, HISTC_LOCAL_INS_LAT, HISTC_GLOBAL_INS_LAT, - HISTC_P_STAGE_CYC, + HISTC_LOCAL_P_STAGE_CYC, + HISTC_GLOBAL_P_STAGE_CYC, HISTC_NR_COLS, /* Last entry */ }; diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index a111065b484e..e417e47f51b9 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -37,7 +37,7 @@ const chardefault_parent_pattern[] = "^sys_|^do_page_fault"; const char*parent_pattern = default_parent_pattern; const char*default_sort_order = "comm,dso,symbol"; const chardefault_branch_sort_order[] = "comm,dso_from,symbol_from,symbol_to,cycles"; -const char default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,p_stage_cyc"; +const char default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc"; const chardefault_top_sort_order[] = "dso,symbol"; const chardefault_diff_sort_order[] = "dso,symbol"; const chardefault_tracepoint_sort_order[] = "trace"; @@ -46,8 +46,8 @@ const char*field_order; regex_t ignore_callees_regex; int have_ignore_callees = 0; enum sort_modesort__mode = SORT_MODE__NORMAL; -const char *dynamic_headers[] = {"local_ins_lat", "p_stage_cyc"}; -const char *arch_specific_sort_keys[] = {"p_stage_cyc"}; +static const char *const dynamic_headers[] = {"local_ins_lat", "ins_lat", "local_p_stage_cyc", "p_stage_cyc"}; +static const char *const arch_specific_sort_keys[] = {"local_p_stage_cyc", "p_stage_cyc"}; /* * Replaces all occurrences of a char used with the: @@ -1392,22 +1392,37 @@ struct sort_entry sort_global_ins_lat = { }; static int64_t -sort__global_p_stage_cyc_cmp(struct hist_entry *left, struct hist_entry *right) +sort__p_stage_cyc_cmp(struct hist_entry *left, struct hist_entry *right) { return left->p_stage_cyc - right->p_stage_cyc; } +static int hist_entry__global_p_stage_cyc_snprintf(struct hist_entry *he, char *bf, + size_t size, unsigned int width) +{ + return repsep_snprintf(bf, size, "%-*u", width, + he->p_stage_cyc * he->stat.nr_events); +} + + static int hist_entry__p_stage_cyc_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width) { return repsep_snprintf(bf, size, "%-*u", width, he->p_stage_cyc); } -struct sort_entry sort_p_stage_cyc = { - .se_header = "Pipeline Stage Cycle", - .se_cmp =
Re: [PATCH 1/2] tools/perf: Include global and local variants for p_stage_cyc sort key
On 25/11/21 8:18 am, Athira Rajeev wrote: Sort key p_stage_cyc is used to present the latency cycles spend in pipeline stages. perf tool has local p_stage_cyc sort key to display this info. There is no global variant available for this sort key. local variant shows latency in a sinlge sample, whereas, global value will be useful to present the total latency (sum of latencies) in the hist entry. It represents latency number multiplied by the number of samples. Add global (p_stage_cyc) and local variant (local_p_stage_cyc) for this sort key. Use the local_p_stage_cyc as default option for "mem" sort mode. Also add this to list of dynamic sort keys. Signed-off-by: Athira Rajeev Reported-by: Namhyung Kim Tested the patch on Power10 LPAR and could see the required data. # Overhead Samples Command Shared Object Symbol Dispatch Cyc # ... .. . # 9.41% 156 dd [kernel.vmlinux][k] system_call_common1 4.91%82 dd [kernel.vmlinux][k] __fget_light 1 ... # Overhead Samples Command Shared Object Symbol Dispatch Cyc Global Dispatch_cyc # ... .. . ... # 9.41% 156 dd [kernel.vmlinux][k] system_call_common1 156 4.91%82 dd [kernel.vmlinux][k] __fget_light 1 82 ... Tested-by: Nageswara R Sastry --- tools/perf/util/hist.c | 4 +++- tools/perf/util/hist.h | 3 ++- tools/perf/util/sort.c | 34 +- tools/perf/util/sort.h | 3 ++- 4 files changed, 32 insertions(+), 12 deletions(-) diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index b776465e04ef..0a8033b09e28 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -211,7 +211,9 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h) hists__new_col_len(hists, HISTC_MEM_BLOCKED, 10); hists__new_col_len(hists, HISTC_LOCAL_INS_LAT, 13); hists__new_col_len(hists, HISTC_GLOBAL_INS_LAT, 13); - hists__new_col_len(hists, HISTC_P_STAGE_CYC, 13); + hists__new_col_len(hists, HISTC_LOCAL_P_STAGE_CYC, 13); + hists__new_col_len(hists, HISTC_GLOBAL_P_STAGE_CYC, 13); + if (symbol_conf.nanosecs) hists__new_col_len(hists, HISTC_TIME, 16); else diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h index 5343b62476e6..2752ce681108 100644 --- a/tools/perf/util/hist.h +++ b/tools/perf/util/hist.h @@ -75,7 +75,8 @@ enum hist_column { HISTC_MEM_BLOCKED, HISTC_LOCAL_INS_LAT, HISTC_GLOBAL_INS_LAT, - HISTC_P_STAGE_CYC, + HISTC_LOCAL_P_STAGE_CYC, + HISTC_GLOBAL_P_STAGE_CYC, HISTC_NR_COLS, /* Last entry */ }; diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index e9216a292a04..e978f7883e07 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -37,7 +37,7 @@ const chardefault_parent_pattern[] = "^sys_|^do_page_fault"; const char*parent_pattern = default_parent_pattern; const char*default_sort_order = "comm,dso,symbol"; const chardefault_branch_sort_order[] = "comm,dso_from,symbol_from,symbol_to,cycles"; -const char default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,p_stage_cyc"; +const char default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc"; const chardefault_top_sort_order[] = "dso,symbol"; const chardefault_diff_sort_order[] = "dso,symbol"; const chardefault_tracepoint_sort_order[] = "trace"; @@ -46,8 +46,8 @@ const char*field_order; regex_t ignore_callees_regex; int have_ignore_callees = 0; enum sort_modesort__mode = SORT_MODE__NORMAL; -const char *dynamic_headers[] = {"local_ins_lat", "p_stage_cyc"}; -const char *arch_specific_sort_keys[] = {"p_stage_cyc"}; +const char *dynamic_headers[] = {"local_ins_lat", "ins_lat", "local_p_stage_cyc", "p_stage_cyc"}; +const char *arch_specific_sort_keys[] = {"local_p_stage_cyc", "p_stage_cyc"}; /* * Replaces all occurrences of a char used with the: @@ -1392,22 +1392,37 @@ struct sort_entry sort_global_ins_lat = { }; static int64_t -sort__global_p_stage_cyc_cmp(struct hist_entry *left, struct hist_entry *right) +sort__p_stage_cyc_cmp(struct hist_entry *left, struct hist_entry *right) { return left->p_stage_cyc - right->p_stage_cyc; } +static int
Re: [PATCH] powerpc/perf/hv-gpci: Fix the logic to compute counter value from the hcall result buffer.
On 13/08/21 1:51 pm, Kajol Jain wrote: H_GetPerformanceCounterInfo (0xF080) hcall returns the counter data in the result buffer. Result buffer has specific format defined in the PAPR specification. One of the field is counter offset and width of the counter data returned. Counter data are returned in a unsigned char array. To get the final counter data, these values should be left shifted byte at a time. But commit 220a0c609ad17 ("powerpc/perf: Add support for the hv gpci (get performance counter info) interface") made the shifting bitwise. Because of this, hcall counters values could end up in lower side, which messes the counter prev vs now calculation. This lead to huge counter value reporting [command]#: perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ -C 0 -I 1000 time counts unit events 1.78854 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 2.000213293 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 3.000320107 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 4.000428392 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 5.000537864 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 6.000649087 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 7.000760312 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 8.000865218 16,448 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 9.000978985 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 10.001088891 16,384 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 11.001201435 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 12.001307937 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ Patch here fixes the shifting logic to make is byte-wise with which no more the issue seen. Fixes: e4f226b1580b3 ("powerpc/perf/hv-gpci: Increase request buffer size") Reported-by: Nageswara R Sastry Signed-off-by: Kajol Jain Tested-by: Nageswara R Sastry Now not seeing huge numbers. # perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ -C 0 -I 1000 # time counts unit events 1.001023931 26,624 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 2.002176767 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 3.003296382 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 4.004385311 33,280 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ --- arch/powerpc/perf/hv-gpci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c index d48413e28c39..c756228a081f 100644 --- a/arch/powerpc/perf/hv-gpci.c +++ b/arch/powerpc/perf/hv-gpci.c @@ -175,7 +175,7 @@ static unsigned long single_gpci_request(u32 req, u32 starting_index, */ count = 0; for (i = offset; i < offset + length; i++) - count |= arg->bytes[i] << (i - offset); + count |= (u64)(arg->bytes[i]) << ((length - 1 - (i - offset)) * 8); *value = count; out: -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH v4 1/3] powerpc/perf: Use stack siar instead of mfspr
On 18/08/21 10:45 pm, Kajol Jain wrote: Minor optimization in the 'perf_instruction_pointer' function code by making use of stack siar instead of mfspr. Fixes: 75382aa72f06 ("powerpc/perf: Move code to select SIAR or pt_regs into perf_read_regs") Signed-off-by: Kajol Jain Tested this patch series, not seeing any '0' values. Tested-by: Nageswara R Sastry example output: # perf report -D | grep addr 0 26236879714 0x3dcc8 [0x38]: PERF_RECORD_SAMPLE(IP, 0x1): 1446/1446: 0xc0113584 period: 1 addr: 0 0 26236882500 0x3dd00 [0x38]: PERF_RECORD_SAMPLE(IP, 0x1): 1446/1446: 0xc0113584 period: 1 addr: 0 0 26236883436 0x3dd38 [0x38]: PERF_RECORD_SAMPLE(IP, 0x1): 1446/1446: 0xc0113584 period: 10 addr: 0 ... --- arch/powerpc/perf/core-book3s.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c index bb0ee716de91..1b464aad29c4 100644 --- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -2260,7 +2260,7 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs) else return regs->nip; } else if (use_siar && siar_valid(regs)) - return mfspr(SPRN_SIAR) + perf_ip_adjust(regs); + return siar + perf_ip_adjust(regs); else if (use_siar) return 0; // no valid instruction pointer else -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH V4 1/1] powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC
On 21/07/21 11:18 am, Athira Rajeev wrote: Running perf fuzzer showed below in dmesg logs: "Can't find PMC that caused IRQ" This means a PMU exception happened, but none of the PMC's (Performance Monitor Counter) were found to be overflown. There are some corner cases that clears the PMCs after PMI gets masked. In such cases, the perf interrupt handler will not find the active PMC values that had caused the overflow and thus leads to this message while replaying. Case 1: PMU Interrupt happens during replay of other interrupts and counter values gets cleared by PMU callbacks before replay: During replay of interrupts like timer, __do_irq and doorbell exception, we conditionally enable interrupts via may_hard_irq_enable(). This could potentially create a window to generate a PMI. Since irq soft mask is set to ALL_DISABLED, the PMI will get masked here. We could get IPIs run before perf interrupt is replayed and the PMU events could deleted or stopped. This will change the PMU SPR values and resets the counters. Snippet of ftrace log showing PMU callbacks invoked in "__do_irq": -0 [051] dns. 132025441306354: __do_irq <-call_do_irq -0 [051] dns. 132025441306430: irq_enter <-__do_irq -0 [051] dns. 132025441306503: irq_enter_rcu <-__do_irq -0 [051] dnH. 132025441306599: xive_get_irq <-__do_irq <<>> -0 [051] dnH. 132025441307770: generic_smp_call_function_single_interrupt <-smp_ipi_demux_relaxed -0 [051] dnH. 132025441307839: flush_smp_call_function_queue <-smp_ipi_demux_relaxed -0 [051] dnH. 132025441308057: _raw_spin_lock <-event_function -0 [051] dnH. 132025441308206: power_pmu_disable <-perf_pmu_disable -0 [051] dnH. 132025441308337: power_pmu_del <-event_sched_out -0 [051] dnH. 132025441308407: power_pmu_read <-power_pmu_del -0 [051] dnH. 132025441308477: read_pmc <-power_pmu_read -0 [051] dnH. 132025441308590: isa207_disable_pmc <-power_pmu_del -0 [051] dnH. 132025441308663: write_pmc <-power_pmu_del -0 [051] dnH. 132025441308787: power_pmu_event_idx <-perf_event_update_userpage -0 [051] dnH. 132025441308859: rcu_read_unlock_strict <-perf_event_update_userpage -0 [051] dnH. 132025441308975: power_pmu_enable <-perf_pmu_enable <<>> -0 [051] dnH. 132025441311108: irq_exit <-__do_irq -0 [051] dns. 132025441311319: performance_monitor_exception <-replay_soft_interrupts Case 2: PMI's masked during local_* operations, example local_add. If the local_add operation happens within a local_irq_save, replay of PMI will be during local_irq_restore. Similar to case 1, this could also create a window before replay where PMU events gets deleted or stopped. Patch adds a fix to update the PMU callback function 'power_pmu_disable' to check for pending perf interrupt. If there is an overflown PMC and pending perf interrupt indicated in Paca, clear the PMI bit in paca to drop that sample. Clearing of PMI bit is done in 'power_pmu_disable' since disable is invoked before any event gets deleted/stopped. With this fix, if there are more than one event running in the PMU, there is a chance that we clear the PMI bit for the event which is not getting deleted/stopped. The other events may still remain active. Hence to make sure we don't drop valid sample in such cases, another check is added in power_pmu_enable. This checks if there is an overflown PMC found among the active events and if so enable back the PMI bit. Two new helper functions are introduced to clear/set the PMI, ie 'clear_pmi_irq_pending' and 'set_pmi_irq_pending'. Helper function 'pmi_irq_pending' is introduced to give a warning if there is pending PMI bit in paca, but no PMC is overflown. Also there are corner cases which results in performance monitor interrupts getting triggered during power_pmu_disable. This happens since PMXE bit is not cleared along with disabling of other MMCR0 bits in the pmu_disable. Such PMI's could leave the PMU running and could trigger PMI again which will set MMCR0 PMAO bit. This could lead to spurious interrupts in some corner cases. Example, a timer after power_pmu_del which will re-enable interrupts and triggers a PMI again since PMAO bit is still set. But fails to find valid overflow since PMC get cleared in power_pmu_del. Patch fixes this by disabling PMXE along with disabling of other MMCR0 bits in power_pmu_disable. We can't just replay PMI any time. Hence this approach is preferred rather than replaying PMI before resetting overflown PMC. Patch also documents core-book3s on a race condition which can trigger these PMC messages during idle path in PowerNV. Fixes: f442d004806e ("powerpc/64s: Add support to mask perf interrupts and replay them") Reported-by: Nageswara R Sastry Suggested-by: Nicholas Piggin Suggested-by: Madhavan Srinivasan Signed-off-by: Athira Rajeev Reviewed-by: Nicholas Piggin Tested-by: Nageswara R Sastry --- arch/powerpc/include/asm/hw_irq.h | 38 + arch/powerpc/perf/core-book3s.c | 59 ++- 2 files changed, 96
Re: [PATCH] perf script python: Fix buffer size to report iregs in perf script
Tested by creating perf-script.py using perf script and priting the iregs. Seen more values with this patch. Tested-by: Nageswara R Sastry On 28/06/21 11:53 am, Kajol Jain wrote: Commit 48a1f565261d ("perf script python: Add more PMU fields to event handler dict") added functionality to report fields like weight, iregs, uregs etc via perf report. That commit predefined buffer size to 512 bytes to print those fields. But incase of powerpc, since we added extended regs support in commits: Commit 068aeea3773a ("perf powerpc: Support exposing Performance Monitor Counter SPRs as part of extended regs") Commit d735599a069f ("powerpc/perf: Add extended regs support for power10 platform") Now iregs can carry more bytes of data and this predefined buffer size can result to data loss in perf script output. Patch resolve this issue by making buffer size dynamic based on number of registers needed to print. It also changed return type for function "regs_map" from int to void, as the return value is not being used by the caller function "set_regs_in_dict". Fixes: 068aeea3773a ("perf powerpc: Support exposing Performance Monitor Counter SPRs as part of extended regs") Signed-off-by: Kajol Jain --- .../util/scripting-engines/trace-event-python.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c index 4e4aa4c97ac5..c8c9706b4643 100644 --- a/tools/perf/util/scripting-engines/trace-event-python.c +++ b/tools/perf/util/scripting-engines/trace-event-python.c @@ -687,7 +687,7 @@ static void set_sample_datasrc_in_dict(PyObject *dict, _PyUnicode_FromString(decode)); } -static int regs_map(struct regs_dump *regs, uint64_t mask, char *bf, int size) +static void regs_map(struct regs_dump *regs, uint64_t mask, char *bf, int size) { unsigned int i = 0, r; int printed = 0; @@ -695,7 +695,7 @@ static int regs_map(struct regs_dump *regs, uint64_t mask, char *bf, int size) bf[0] = 0; if (!regs || !regs->regs) - return 0; + return; for_each_set_bit(r, (unsigned long *) , sizeof(mask) * 8) { u64 val = regs->regs[i++]; @@ -704,8 +704,6 @@ static int regs_map(struct regs_dump *regs, uint64_t mask, char *bf, int size) "%5s:0x%" PRIx64 " ", perf_reg_name(r), val); } - - return printed; } static void set_regs_in_dict(PyObject *dict, @@ -713,7 +711,16 @@ static void set_regs_in_dict(PyObject *dict, struct evsel *evsel) { struct perf_event_attr *attr = >core.attr; - char bf[512]; + + /* +* Here value 28 is a constant size which can be used to print +* one register value and its corresponds to: +* 16 chars is to specify 64 bit register in hexadecimal. +* 2 chars is for appending "0x" to the hexadecimal value and +* 10 chars is for register name. +*/ + int size = __sw_hweight64(attr->sample_regs_intr) * 28; + char bf[size]; regs_map(>intr_regs, attr->sample_regs_intr, bf, sizeof(bf)); -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH] perf vendor events power10: Adds 24x7 nest metric events for power10 platform
Test scenarios: 1. With 'perf list metric' and 'perf list metricgroup' - can see metrics 2. Run all the metrics with perf stat with -M option and --metric-only -M option The above test scenarios automated with avocado framework, pull request title: perf_metric.py: Add perf metric test case Output from automated test script run: (1/2) perf_metric.py:perf_metric.test_all_metric_events_with_M: PASS (89.83 s) (2/2) perf_metric.py:perf_metric.test_all_metric_events_with_metric: PASS (89.34 s) Tested-by: Nageswara R Sastry On 25/06/21 5:29 pm, Kajol Jain wrote: Patch adds 24x7 nest metric events for POWER10. Signed-off-by: Kajol Jain --- .../arch/powerpc/power10/nest_metrics.json| 491 ++ 1 file changed, 491 insertions(+) create mode 100644 tools/perf/pmu-events/arch/powerpc/power10/nest_metrics.json diff --git a/tools/perf/pmu-events/arch/powerpc/power10/nest_metrics.json b/tools/perf/pmu-events/arch/powerpc/power10/nest_metrics.json new file mode 100644 index ..b79046cd8b09 --- /dev/null +++ b/tools/perf/pmu-events/arch/powerpc/power10/nest_metrics.json @@ -0,0 +1,491 @@ +[ +{ + "MetricName": "VEC_GROUP_PUMP_RETRY_RATIO_P01", + "BriefDescription": "VEC_GROUP_PUMP_RETRY_RATIO_P01", + "MetricExpr": "(hv_24x7@PM_PB_RTY_VG_PUMP01\\,chip\\=?@ / hv_24x7@PM_PB_VG_PUMP01\\,chip\\=?@) * 100", + "ScaleUnit": "1%", + "AggregationMode": "PerChip" +}, +{ + "MetricName": "VEC_GROUP_PUMP_RETRY_RATIO_P23", + "BriefDescription": "VEC_GROUP_PUMP_RETRY_RATIO_P23", + "MetricExpr": "(hv_24x7@PM_PB_RTY_VG_PUMP23\\,chip\\=?@ / hv_24x7@PM_PB_VG_PUMP23\\,chip\\=?@) * 100", + "ScaleUnit": "1%", + "AggregationMode": "PerChip" +}, +{ + "MetricName": "LOCAL_NODE_PUMP_RETRY_RATIO_P01", + "BriefDescription": "LOCAL_NODE_PUMP_RETRY_RATIO_P01", + "MetricExpr": "(hv_24x7@PM_PB_RTY_LNS_PUMP01\\,chip\\=?@ / hv_24x7@PM_PB_LNS_PUMP01\\,chip\\=?@) * 100", + "ScaleUnit": "1%", + "AggregationMode": "PerChip" +}, +{ + "MetricName": "LOCAL_NODE_PUMP_RETRY_RATIO_P23", + "BriefDescription": "LOCAL_NODE_PUMP_RETRY_RATIO_P23", + "MetricExpr": "(hv_24x7@PM_PB_RTY_LNS_PUMP23\\,chip\\=?@ / hv_24x7@PM_PB_LNS_PUMP23\\,chip\\=?@) * 100", + "ScaleUnit": "1%", + "AggregationMode": "PerChip" +}, +{ + "MetricName": "GROUP_PUMP_RETRY_RATIO_P01", + "BriefDescription": "GROUP_PUMP_RETRY_RATIO_P01", + "MetricExpr": "(hv_24x7@PM_PB_RTY_GROUP_PUMP01\\,chip\\=?@ / hv_24x7@PM_PB_GROUP_PUMP01\\,chip\\=?@) * 100", + "ScaleUnit": "1%", + "AggregationMode": "PerChip" +}, +{ + "MetricName": "GROUP_PUMP_RETRY_RATIO_P23", + "BriefDescription": "GROUP_PUMP_RETRY_RATIO_P23", + "MetricExpr": "(hv_24x7@PM_PB_RTY_GROUP_PUMP23\\,chip\\=?@ / hv_24x7@PM_PB_GROUP_PUMP23\\,chip\\=?@) * 100", + "ScaleUnit": "1%", + "AggregationMode": "PerChip" +}, +{ + "MetricName": "TOTAL_GROUP_PUMPS_P01", + "BriefDescription": "TOTAL_GROUP_PUMPS_P01(PER-CYC)", + "MetricExpr": "(hv_24x7@PM_PB_GROUP_PUMP01\\,chip\\=?@ / hv_24x7@PM_PAU_CYC\\,chip\\=?@)", + "ScaleUnit": "4", + "AggregationMode": "PerChip" +}, +{ + "MetricName": "TOTAL_GROUP_PUMPS_P23", + "BriefDescription": "TOTAL_GROUP_PUMPS_P23(PER-CYC)", + "MetricExpr": "(hv_24x7@PM_PB_GROUP_PUMP23\\,chip\\=?@ / hv_24x7@PM_PAU_CYC\\,chip\\=?@)", + "ScaleUnit": "4", + "AggregationMode": "PerChip" +}, +{ + "MetricName": "TOTAL_GROUP_PUMPS_RETRIES_P01", + "BriefDescription": "TOTAL_GROUP_PUMPS_RETRIES_P01(PER-CYC)", + "MetricExpr": "(hv_24x7@PM_PB_RTY_GROUP_PUMP01\\,chip\\=?@ / hv_24x7@PM_PAU_CYC\\,chip\\=?@)", + "ScaleUnit": "4", + "AggregationMode": "PerChip" +}, +{ + "MetricName": "TOTAL_GROUP_PUMPS_RETRIES_P23", + "BriefDescription": "TOTAL_GROUP_PUMPS_RETRIES_P23(PER-CYC)", + "MetricExpr": "(hv_24x7@PM_PB_RTY_GROUP_PUMP23\\,chip\\=?@ / hv_24x7@PM_PAU_CYC\\,chip\\=?@)", + "ScaleUnit": "4", + "AggregationMode": "PerChip" +}, +{ + "MetricName": "REMOTE_NODE_PUMPS_RETRIES_RATIO_P01", + "BriefDescription": "REMOTE_NODE_PUMPS_RETRIES_RATIO_P01", + "MetricExpr": "(hv_24x7@PM_PB_RTY_RNS_PUMP01\\,chip\\=?@ / hv_24x7@PM_PB_RNS_PUMP01\\,chip\\=?@) * 100", + "ScaleUnit": "1%", + "AggregationMode": "PerChip" +}, +{ + "MetricName": "REMOTE_NODE_PUMPS_RETRIES_RATIO_P23", + "BriefDescription": "REMOTE_NODE_PUMPS_RETRIES_RATIO_P23", + "MetricExpr": "(hv_24x7@PM_PB_RTY_RNS_PUMP23\\,chip\\=?@ / hv_24x7@PM_PB_RNS_PUMP23\\,chip\\=?@) * 100", + "ScaleUnit": "1%", + "AggregationMode": "PerChip" +}, +{ + "MetricName": "TOTAL_VECTOR_GROUP_PUMPS_P01", + "BriefDescription": "TOTAL_VECTOR_GROUP_PUMPS_P01(PER-CYC)", + "MetricExpr": "(hv_24x7@PM_PB_VG_PUMP01\\,chip\\=?@ /
Re: [PATCH 0/2] powerpc/perf: Add instruction and data address registers to extended regs
On 20/06/21 8:15 pm, Athira Rajeev wrote: Patch set adds PMU registers namely Sampled Instruction Address Register (SIAR) and Sampled Data Address Register (SDAR) as part of extended regs in PowerPC. These registers provides the instruction/data address and adding these to extended regs helps in debug purposes. Patch 1/2 adds SIAR and SDAR as part of the extended regs mask. Patch 2/2 includes perf tools side changes to add the SPRs to sample_reg_mask to use with -I? option. Athira Rajeev (2): powerpc/perf: Expose instruction and data address registers as part of extended regs tools/perf: Add perf tools support to expose instruction and data address registers as part of extended regs Tested with the following scenarios on P9, P10 - PowerVM environment 1. perf record -I? - shows added - sdar, siar 2. perf record -I and perf report -D - shows added - sdar, siar with and with out counts. Tested-by: Nageswara R Sastry arch/powerpc/include/uapi/asm/perf_regs.h | 12 +++- arch/powerpc/perf/perf_regs.c | 4 tools/arch/powerpc/include/uapi/asm/perf_regs.h | 12 +++- tools/perf/arch/powerpc/include/perf_regs.h | 2 ++ tools/perf/arch/powerpc/util/perf_regs.c| 2 ++ 5 files changed, 22 insertions(+), 10 deletions(-) -- Thanks and Regards R.Nageswara Sastry
Re: [PATCH v2 0/4] Add perf interface to expose nvdimm
> On 14-Jun-2021, at 10:53 AM, Kajol Jain wrote: > > Patchset adds performance stats reporting support for nvdimm. > Added interface includes support for pmu register/unregister > functions. A structure is added called nvdimm_pmu to be used for > adding arch/platform specific data such as supported events, cpumask > pmu event functions like event_init/add/read/del. > User could use the standard perf tool to access perf > events exposed via pmu. > > Added implementation to expose IBM pseries platform nmem* > device performance stats using this interface. > ... > > Patch1: >Introduces the nvdimm_pmu structure > Patch2: > Adds common interface to add arch/platform specific data > includes supported events, pmu event functions. It also > adds code for cpu hotplug support. > Patch3: >Add code in arch/powerpc/platform/pseries/papr_scm.c to expose >nmem* pmu. It fills in the nvdimm_pmu structure with event attrs >cpumask andevent functions and then registers the pmu by adding >callbacks to register_nvdimm_pmu. > Patch4: >Sysfs documentation patch Tested with the following scenarios: 1. Check dmesg for nmem PMU registered messages. 2. Listed nmem events using 'perf list and perf list nmem' 3. Ran 'perf stat' with single event, grouping events, events from same pmu, different pmu and invalid events 4. Read from sysfs files, Writing in to sysfs files 5. While running nmem events with perf stat, offline cpu from the nmem?/cpumask While running the above functionality worked as expected, no error messages seen in dmesg. Tested-by: Nageswara R Sastry > > Changelog > --- > PATCH v1 -> PATCH v2 > - Fix hotplug code by adding pmu migration call > incase current designated cpu got offline. As > pointed by Peter Zijlstra. > > - Removed the retun -1 part from cpu hotplug offline > function. > > - Link to the previous patchset : https://lkml.org/lkml/2021/6/8/500 > --- > Kajol Jain (4): > drivers/nvdimm: Add nvdimm pmu structure > drivers/nvdimm: Add perf interface to expose nvdimm performance stats > powerpc/papr_scm: Add perf interface support > powerpc/papr_scm: Document papr_scm sysfs event format entries > > Documentation/ABI/testing/sysfs-bus-papr-pmem | 31 ++ > arch/powerpc/include/asm/device.h | 5 + > arch/powerpc/platforms/pseries/papr_scm.c | 365 ++ > drivers/nvdimm/Makefile | 1 + > drivers/nvdimm/nd_perf.c | 230 +++ > include/linux/nd.h| 46 +++ > 6 files changed, 678 insertions(+) > create mode 100644 drivers/nvdimm/nd_perf.c > Thanks and Regards, R.Nageswara Sastry >
Re: [PATCH V3 0/2] selftests/powerpc: Updates to EBB selftest for ISA v3.1
> On 25-May-2021, at 7:21 PM, Athira Rajeev wrote: > > The "no_handler_test" in ebb selftests attempts to read the PMU > registers after closing of the event via helper function > "dump_ebb_state". With the MMCR0 control bit (PMCCEXT) in ISA v3.1, > read access to group B registers is restricted when MMCR0 PMCC=0b00. > Hence the call to dump_ebb_state after closing of event will generate > a SIGILL, which is expected. > > Test has below in logs: > <<>> > !! child died by signal 4 > failure: no_handler_test > <<>> > Tested patches on Power9 and Power10 - PowerVM environment. Power9 - no_handler_test - with and without patch - success Power10 - no_handler_test - without patch - failure Power10 - no_handler_test - with patch - success Power9 - regs_access_pmccext_test - with patch - success Power10 - regs_access_pmccext_test - with patch - success Tested-by: Nageswara R Sastry mailto:rnsas...@linux.ibm.com>> Thank you.
Re: [PATCH] perf vendor events: Fix eventcode of power10 json events
> On 25-May-2021, at 8:57 PM, Paul A. Clarke wrote: >> > I lost the original message, but Nageswara Sastry said: >> 1. Extracted all the 244 events from the patch. >> 2. Check them in 'perf list' - all 244 events found >> 3. Ran all the events with 'perf stat -e "event name" sleep 1', all ran fine. >>No errors were seen in 'dmesg' > > I count 255 events. > > PC Seems while extracting I filtered out newly added ones, so I got 244(255-11). Now checked with all 255 events. Thanks for pointing out. Thanks!! R.Nageswara Sastry
Re: [PATCH] perf vendor events: Fix eventcode of power10 json events
Tested patch with the following steps: 1. Extracted all the 244 events from the patch. 2. Check them in 'perf list' - all 244 events found 3. Ran all the events with 'perf stat -e "event name" sleep 1', all ran fine. No errors were seen in 'dmesg' Tested-by: Nageswara R Sastry > On 25-May-2021, at 12:07 PM, Kajol Jain wrote: > > Fixed the eventcode values in the power10 json event files to append > "0x" since these are hexadecimal values. > Patch also changes event description of PM_EXEC_STALL_LOAD_FINISH and > PM_EXEC_STALL_NTC_FLUSH event and move some events to correct files. > > Fixes: 32daa5d7899e ("perf vendor events: Initial JSON/events list for > power10 platform") > Signed-off-by: Kajol Jain > --- > .../arch/powerpc/power10/cache.json | 30 ++-- > .../arch/powerpc/power10/floating_point.json | 2 +- > .../arch/powerpc/power10/frontend.json| 124 ++-- > .../arch/powerpc/power10/locks.json | 4 +- > .../arch/powerpc/power10/marked.json | 61 > .../arch/powerpc/power10/memory.json | 79 +- > .../arch/powerpc/power10/others.json | 133 +++-- > .../arch/powerpc/power10/pipeline.json| 135 +- > .../pmu-events/arch/powerpc/power10/pmc.json | 8 +- > .../arch/powerpc/power10/translation.json | 22 +-- > 10 files changed, 299 insertions(+), 299 deletions(-) > > diff --git a/tools/perf/pmu-events/arch/powerpc/power10/cache.json > b/tools/perf/pmu-events/arch/powerpc/power10/cache.json > index 616f29098c71..605be14f441c 100644 > --- a/tools/perf/pmu-events/arch/powerpc/power10/cache.json > +++ b/tools/perf/pmu-events/arch/powerpc/power10/cache.json > @@ -1,46 +1,56 @@ > [ > { > -"EventCode": "1003C", > +"EventCode": "0x1003C", > "EventName": "PM_EXEC_STALL_DMISS_L2L3", > "BriefDescription": "Cycles in which the oldest instruction in the > pipeline was waiting for a load miss to resolve from either the local L2 or > local L3." > }, > { > -"EventCode": "34056", > +"EventCode": "0x1E054", > +"EventName": "PM_EXEC_STALL_DMISS_L21_L31", > +"BriefDescription": "Cycles in which the oldest instruction in the > pipeline was waiting for a load miss to resolve from another core's L2 or L3 > on the same chip." > + }, > + { > +"EventCode": "0x34054", > +"EventName": "PM_EXEC_STALL_DMISS_L2L3_NOCONFLICT", > +"BriefDescription": "Cycles in which the oldest instruction in the > pipeline was waiting for a load miss to resolve from the local L2 or local > L3, without a dispatch conflict." > + }, > + { > +"EventCode": "0x34056", > "EventName": "PM_EXEC_STALL_LOAD_FINISH", > -"BriefDescription": "Cycles in which the oldest instruction in the > pipeline was finishing a load after its data was reloaded from a data source > beyond the local L1; cycles in which the LSU was processing an L1-hit; cycles > in which the NTF instruction merged with another load in the LMQ." > +"BriefDescription": "Cycles in which the oldest instruction in the > pipeline was finishing a load after its data was reloaded from a data source > beyond the local L1; cycles in which the LSU was processing an L1-hit; cycles > in which the NTF instruction merged with another load in the LMQ; cycles in > which the NTF instruction is waiting for a data reload for a load miss, but > the data comes back with a non-NTF instruction." > }, > { > -"EventCode": "3006C", > +"EventCode": "0x3006C", > "EventName": "PM_RUN_CYC_SMT2_MODE", > "BriefDescription": "Cycles when this thread's run latch is set and the > core is in SMT2 mode." > }, > { > -"EventCode": "300F4", > +"EventCode": "0x300F4", > "EventName": "PM_RUN_INST_CMPL_CONC", > "BriefDescription": "PowerPC instructions completed by this thread when > all threads in the core had the run-latch set." > }, > { > -"EventCode": "4C016", > +"EventCode": "0x4C016", > "EventName": "PM_EXEC_STALL_DMISS_L2L3_CONFLICT", > "BriefDescription": "Cycles in which the oldest instruction in the > pipeline was waiting for a load miss to resolve from the local L2 or local > L3, with a dispatch conflict." > }, > { > -"EventCode": "4D014", > +"EventCode": "0x4D014", > "EventName": "PM_EXEC_STALL_LOAD", > "BriefDescription": "Cycles in which the oldest instruction in the > pipeline was a load instruction executing in the Load Store Unit." > }, > { > -"EventCode": "4D016", > +"EventCode": "0x4D016", > "EventName": "PM_EXEC_STALL_PTESYNC", > "BriefDescription": "Cycles in which the oldest instruction in the > pipeline was a PTESYNC instruction executing in the Load Store Unit." > }, > { > -"EventCode": "401EA", > +"EventCode": "0x401EA", > "EventName": "PM_THRESH_EXC_128", > "BriefDescription": "Threshold counter exceeded a value of 128." > }, > { > -"EventCode": "400F6", > +