Re: [PATCH v9 1/4] KEYS: trusted: Add generic trusted keys framework
On Mon, 2021-03-01 at 18:41 +0530, Sumit Garg wrote: > Current trusted keys framework is tightly coupled to use TPM device > as an underlying implementation which makes it difficult for > implementations like Trusted Execution Environment (TEE) etc. to > provide trusted keys support in case platform doesn't posses a TPM > device. > > Add a generic trusted keys framework where underlying implementations > can be easily plugged in. Create struct trusted_key_ops to achieve > this, which contains necessary functions of a backend. > > Also, define a module parameter in order to select a particular trust > source in case a platform support multiple trust sources. In case its > not specified then implementation itetrates through trust sources > list starting with TPM and assign the first trust source as a backend > which has initiazed successfully during iteration. > > Note that current implementation only supports a single trust source > at runtime which is either selectable at compile time or during boot > via aforementioned module parameter. You never actually tested this, did you? I'm now getting EINVAL from all the trusted TPM key operations because of this patch. The reason is quite simple: this function: > index ..0db86b44605d > --- /dev/null > +++ b/security/keys/trusted-keys/trusted_core.c [...] > +static int datablob_parse(char *datablob, struct trusted_key_payload > *p) > +{ > + substring_t args[MAX_OPT_ARGS]; > + long keylen; > + int ret = -EINVAL; > + int key_cmd; > + char *c; > + > + /* main command */ > + c = strsep(&datablob, " \t"); Modifies its argument to consume tokens and separates them with NULL. so the arguments for keyctl add trusted kmk "new 34 keyhandle=0x8101" Go into this function as datablob="new 34 keyhandle=0x8101" After we leave it, it looks like datablob="new\034\0keyhandle=0x8101" However here: > +static int trusted_instantiate(struct key *key, > +struct key_preparsed_payload *prep) > +{ > + struct trusted_key_payload *payload = NULL; > + size_t datalen = prep->datalen; > + char *datablob; > + int ret = 0; > + int key_cmd; > + size_t key_len; > + > + if (datalen <= 0 || datalen > 32767 || !prep->data) > + return -EINVAL; > + > + datablob = kmalloc(datalen + 1, GFP_KERNEL); > + if (!datablob) > + return -ENOMEM; > + memcpy(datablob, prep->data, datalen); > + datablob[datalen] = '\0'; > + > + payload = trusted_payload_alloc(key); > + if (!payload) { > + ret = -ENOMEM; > + goto out; > + } > + > + key_cmd = datablob_parse(datablob, payload); > + if (key_cmd < 0) { > + ret = key_cmd; > + goto out; > + } > + > + dump_payload(payload); > + > + switch (key_cmd) { > + case Opt_load: > + ret = static_call(trusted_key_unseal)(payload, > datablob); We're passing the unmodified datablob="new\034\0keyhandle=0x8101" Into the tpm trusted_key_unseal function. However, it only sees "new" and promply gives EINVAL because you've removed the ability to process the new option from it. What should have happened is you should have moved data blob up to passed the consumed tokens, so it actually reads datablob="keyhandle=0x8101" However, to do that you'd have to have the updated pointer passed out of your datablob_parse() above. There's also a lost !tpm2 in the check for options->keyhandle, but I suspect Jarkko lost that merging the two patches. I think what's below fixes all of this, so if you can test it for trusted_tee, I'll package it up as two separate patches fixing all of this. James --- diff --git a/security/keys/trusted-keys/trusted_core.c b/security/keys/trusted-keys/trusted_core.c index ec3a066a4b42..7c636212429b 100644 --- a/security/keys/trusted-keys/trusted_core.c +++ b/security/keys/trusted-keys/trusted_core.c @@ -62,7 +62,7 @@ static const match_table_t key_tokens = { * * On success returns 0, otherwise -EINVAL. */ -static int datablob_parse(char *datablob, struct trusted_key_payload *p) +static int datablob_parse(char **datablob, struct trusted_key_payload *p) { substring_t args[MAX_OPT_ARGS]; long keylen; @@ -71,14 +71,14 @@ static int datablob_parse(char *datablob, struct trusted_key_payload *p) char *c; /* main command */ - c = strsep(&datablob, " \t"); + c = strsep(datablob, " \t"); if (!c) return -EINVAL; key_cmd = match_token(c, key_tokens, args); switch (key_cmd) { case Opt_new: /* first argument is key size */ - c = strsep(&datablob, " \t"); + c = strsep(datablob, " \t"); if (!c) return -EINVAL; ret = kstrtol(c, 10, &keylen); @@ -89,7 +89,7 @@ static int datablob_parse(char *datablob, struct trusted_key_
[GIT PULL] SCSI fixes for 5.12-rc6
This libsas fix is for a problem that occurs when trying to change the cache type of an ATA device and the libiscsi one is a regression fix from this merge window. git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Jolly Shah (1): scsi: libsas: Reset num_scatter if libata marks qc as NODATA Mike Christie (1): scsi: iscsi: Fix iSCSI cls conn state And the diffstat: drivers/scsi/libiscsi.c | 26 +++--- drivers/scsi/libsas/sas_ata.c | 9 - drivers/scsi/scsi_transport_iscsi.c | 18 +++--- 3 files changed, 22 insertions(+), 31 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index 7ad11e42306d..bfd2aaa9b66b 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -3179,9 +3179,10 @@ fail_mgmt_tasks(struct iscsi_session *session, struct iscsi_conn *conn) } } -static void iscsi_start_session_recovery(struct iscsi_session *session, -struct iscsi_conn *conn, int flag) +void iscsi_conn_stop(struct iscsi_cls_conn *cls_conn, int flag) { + struct iscsi_conn *conn = cls_conn->dd_data; + struct iscsi_session *session = conn->session; int old_stop_stage; mutex_lock(&session->eh_mutex); @@ -3239,27 +3240,6 @@ static void iscsi_start_session_recovery(struct iscsi_session *session, spin_unlock_bh(&session->frwd_lock); mutex_unlock(&session->eh_mutex); } - -void iscsi_conn_stop(struct iscsi_cls_conn *cls_conn, int flag) -{ - struct iscsi_conn *conn = cls_conn->dd_data; - struct iscsi_session *session = conn->session; - - switch (flag) { - case STOP_CONN_RECOVER: - cls_conn->state = ISCSI_CONN_FAILED; - break; - case STOP_CONN_TERM: - cls_conn->state = ISCSI_CONN_DOWN; - break; - default: - iscsi_conn_printk(KERN_ERR, conn, - "invalid stop flag %d\n", flag); - return; - } - - iscsi_start_session_recovery(session, conn, flag); -} EXPORT_SYMBOL_GPL(iscsi_conn_stop); int iscsi_conn_bind(struct iscsi_cls_session *cls_session, diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c index 024e5a550759..8b9a39077dba 100644 --- a/drivers/scsi/libsas/sas_ata.c +++ b/drivers/scsi/libsas/sas_ata.c @@ -201,18 +201,17 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc) memcpy(task->ata_task.atapi_packet, qc->cdb, qc->dev->cdb_len); task->total_xfer_len = qc->nbytes; task->num_scatter = qc->n_elem; + task->data_dir = qc->dma_dir; + } else if (qc->tf.protocol == ATA_PROT_NODATA) { + task->data_dir = DMA_NONE; } else { for_each_sg(qc->sg, sg, qc->n_elem, si) xfer += sg_dma_len(sg); task->total_xfer_len = xfer; task->num_scatter = si; - } - - if (qc->tf.protocol == ATA_PROT_NODATA) - task->data_dir = DMA_NONE; - else task->data_dir = qc->dma_dir; + } task->scatter = qc->sg; task->ata_task.retry_count = 1; task->task_state_flags = SAS_TASK_STATE_PENDING; diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index bebfb355abdf..21a2d997a72e 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -2470,10 +2470,22 @@ static void iscsi_if_stop_conn(struct iscsi_cls_conn *conn, int flag) * it works. */ mutex_lock(&conn_mutex); + switch (flag) { + case STOP_CONN_RECOVER: + conn->state = ISCSI_CONN_FAILED; + break; + case STOP_CONN_TERM: + conn->state = ISCSI_CONN_DOWN; + break; + default: + iscsi_cls_conn_printk(KERN_ERR, conn, + "invalid stop flag %d\n", flag); + goto unlock; + } + conn->transport->stop_conn(conn, flag); - conn->state = ISCSI_CONN_DOWN; +unlock: mutex_unlock(&conn_mutex); - } static void stop_conn_work_fn(struct work_struct *work) @@ -2961,7 +2973,7 @@ static int iscsi_if_ep_disconnect(struct iscsi_transport *transport, mutex_lock(&conn->ep_mutex); conn->ep = NULL; mutex_unlock(&conn->ep_mutex); - conn->state = ISCSI_CONN_DOWN; + conn->state = ISCSI_CONN_FAILED; } transport->ep_disconnect(ep);
Re: [PATCH][next] scsi: aacraid: Replace one-element array with flexible-array member
On Tue, 2021-04-13 at 00:45 -0500, Gustavo A. R. Silva wrote: > Hi Martin, > > On 4/12/21 23:52, Martin K. Petersen wrote: > > > Silencing analyzer warnings shouldn't be done at the expense of > > human > > readers. If it is imperative to switch to flex_array_size() to > > quiesce > > checker warnings, please add a comment in the code explaining that > > the > > size evaluates to nseg_new-1 sge_ieee1212 structs. > > Done: > > https://lore.kernel.org/lkml/20210413054032.GA276102@embeddedor/ I think the reason everyone gets confused is that they think the first argument should do something. If flex_array_size had been defined #define flex_array_size(p, count) \ array_size(count, \ sizeof(*(p)) + __must_be_array(p)) Then we could have used flex_array_size(sge, nseg_new - 1) or flex_array_size(rio->sge, nseg_new - 1) and everyone would have understood either expression. This would also have been useful, as the first example demonstrates, when we have a pointer rather than a flexible member ... although that means the macro likely needs a new name. However, perhaps just do array_size(nseg_new - 1, sizeof(*sge)); And lose the comment? James
Re: [PATCH][next] KEYS: trusted: Fix missing null return from kzalloc call
On Mon, 2021-04-12 at 17:01 +0100, Colin King wrote: > From: Colin Ian King > > The kzalloc call can return null with the GFP_KERNEL flag so > add a null check and exit via a new error exit label. Use the > same exit error label for another error path too. > > Addresses-Coverity: ("Dereference null return value") > Fixes: 830027e2cb55 ("KEYS: trusted: Add generic trusted keys > framework") > Signed-off-by: Colin Ian King > --- > security/keys/trusted-keys/trusted_core.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/security/keys/trusted-keys/trusted_core.c > b/security/keys/trusted-keys/trusted_core.c > index ec3a066a4b42..90774793f0b1 100644 > --- a/security/keys/trusted-keys/trusted_core.c > +++ b/security/keys/trusted-keys/trusted_core.c > @@ -116,11 +116,13 @@ static struct trusted_key_payload > *trusted_payload_alloc(struct key *key) > > ret = key_payload_reserve(key, sizeof(*p)); > if (ret < 0) > - return p; > + goto err; > p = kzalloc(sizeof(*p), GFP_KERNEL); > + if (!p) > + goto err; > > p->migratable = migratable; > - > +err: > return p; This is clearly a code migration bug in commit 251c85bd106099e6f388a89e88e12d14de2c9cda Author: Sumit Garg Date: Mon Mar 1 18:41:24 2021 +0530 KEYS: trusted: Add generic trusted keys framework Which has for addition to trusted_core.c: +static struct trusted_key_payload *trusted_payload_alloc(struct key *key) +{ + struct trusted_key_payload *p = NULL; + int ret; + + ret = key_payload_reserve(key, sizeof(*p)); + if (ret < 0) + return p; + p = kzalloc(sizeof(*p), GFP_KERNEL); + + p->migratable = migratable; + + return p; +} And for trusted_tpm1.c: -static struct trusted_key_payload *trusted_payload_alloc(struct key *key) -{ - struct trusted_key_payload *p = NULL; - int ret; - - ret = key_payload_reserve(key, sizeof *p); - if (ret < 0) - return p; - p = kzalloc(sizeof *p, GFP_KERNEL); - if (p) - p->migratable = 1; /* migratable by default */ - return p; -} The trusted_tpm1.c code was correct and we got this bug introduced by what should have been a simple cut and paste ... how did that happen? And therefore, how safe is the rest of the extraction into trusted_core.c? James
[GIT PULL] SCSI fixes for 5.12-rc6
Seven fixes all in drivers. The hpsa three are the most extensive and the most problematic: it's a packed structure misalignment that oopses on ia64 but looks like it would also oops on quite a few non-x86 architectures. The pm80xx is a regression and the rest are bug fixes for patches in the misc tree. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Can Guo (2): scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUs scsi: ufs: core: Fix task management request completion timeout Martin Wilck (1): scsi: scsi_transport_srp: Don't block target in SRP_PORT_LOST state Roman Bolshakov (1): scsi: target: iscsi: Fix zero tag inside a trace event Sergei Trofimovich (3): scsi: hpsa: Add an assert to prevent __packed reintroduction scsi: hpsa: Fix boot on ia64 (atomic_t alignment) scsi: hpsa: Use __packed on individual structs, not header-wide Viswas G (1): scsi: pm80xx: Fix chip initialization failure And the diffstat: drivers/scsi/hpsa_cmd.h | 78 + drivers/scsi/pm8001/pm8001_hwi.c| 8 ++-- drivers/scsi/scsi_transport_srp.c | 2 +- drivers/scsi/ufs/ufshcd.c | 31 +++ drivers/target/iscsi/iscsi_target.c | 3 +- 5 files changed, 65 insertions(+), 57 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h index d126bb877250..ba6a3aa8d954 100644 --- a/drivers/scsi/hpsa_cmd.h +++ b/drivers/scsi/hpsa_cmd.h @@ -20,6 +20,11 @@ #ifndef HPSA_CMD_H #define HPSA_CMD_H +#include + +#include /* static_assert */ +#include /* offsetof */ + /* general boundary defintions */ #define SENSEINFOBYTES 32 /* may vary between hbas */ #define SG_ENTRIES_IN_CMD 32 /* Max SG entries excluding chain blocks */ @@ -200,12 +205,10 @@ union u64bit { MAX_EXT_TARGETS + 1) /* + 1 is for the controller itself */ /* SCSI-3 Commands */ -#pragma pack(1) - #define HPSA_INQUIRY 0x12 struct InquiryData { u8 data_byte[36]; -}; +} __packed; #define HPSA_REPORT_LOG 0xc2/* Report Logical LUNs */ #define HPSA_REPORT_PHYS 0xc3 /* Report Physical LUNs */ @@ -221,7 +224,7 @@ struct raid_map_disk_data { u8xor_mult[2];/**< XOR multipliers for this position, * valid for data disks only */ u8reserved[2]; -}; +} __packed; struct raid_map_data { __le32 structure_size;/* Size of entire structure in bytes */ @@ -247,14 +250,14 @@ struct raid_map_data { __le16 dekindex; /* Data encryption key index. */ u8reserved[16]; struct raid_map_disk_data data[RAID_MAP_MAX_ENTRIES]; -}; +} __packed; struct ReportLUNdata { u8 LUNListLength[4]; u8 extended_response_flag; u8 reserved[3]; u8 LUN[HPSA_MAX_LUN][8]; -}; +} __packed; struct ext_report_lun_entry { u8 lunid[8]; @@ -269,20 +272,20 @@ struct ext_report_lun_entry { u8 lun_count; /* multi-lun device, how many luns */ u8 redundant_paths; u32 ioaccel_handle; /* ioaccel1 only uses lower 16 bits */ -}; +} __packed; struct ReportExtendedLUNdata { u8 LUNListLength[4]; u8 extended_response_flag; u8 reserved[3]; struct ext_report_lun_entry LUN[HPSA_MAX_PHYS_LUN]; -}; +} __packed; struct SenseSubsystem_info { u8 reserved[36]; u8 portname[8]; u8 reserved1[1108]; -}; +} __packed; /* BMIC commands */ #define BMIC_READ 0x26 @@ -317,7 +320,7 @@ union SCSI3Addr { u8 Targ:6; u8 Mode:2;/* b10 */ } LogUnit; -}; +} __packed; struct PhysDevAddr { u32 TargetId:24; @@ -325,20 +328,20 @@ struct PhysDevAddr { u32 Mode:2; /* 2 level target device addr */ union SCSI3Addr Target[2]; -}; +} __packed; struct LogDevAddr { u32VolId:30; u32Mode:2; u8 reserved[4]; -}; +} __packed; union LUNAddr { u8 LunAddrBytes[8]; union SCSI3AddrSCSI3Lun[4]; struct PhysDevAddr PhysDev; struct LogDevAddr LogDev; -}; +} __packed; struct CommandListHeader { u8 ReplyQueue; @@ -346,7 +349,7 @@ struct CommandListHeader { __le16 SGTotal; __le64 tag; union LUNAddr LUN; -}; +} __packed; struct RequestBlock { u8 CDBLen; @@ -365,18 +368,18 @@ struct RequestBlock { #define GET_DIR(tad) (((tad) >> 6) & 0x03) u16 Timeout; u8 CDB[16]; -}; +} __packed; struct ErrDescriptor { __le64 Addr; __le32 Len; -}; +} __packed; struct SGDescriptor { __le64 Addr; __le32 Len; __le32 Ext; -}; +} __packed; union MoreErrInfo {
Re: [RFC v2] KVM: x86: Support KVM VMs sharing SEV context
On Thu, 2021-04-08 at 17:41 -0700, Steve Rutherford wrote: > On Thu, Apr 8, 2021 at 2:15 PM James Bottomley > wrote: > > On Thu, 2021-04-08 at 12:48 -0700, Steve Rutherford wrote: > > > On Thu, Apr 8, 2021 at 10:43 AM James Bottomley < > > > j...@linux.ibm.com> > > > wrote: > > > > On Fri, 2021-04-02 at 16:20 +0200, Paolo Bonzini wrote: [...] > > > > > However, it would be nice to collaborate on the low-level > > > > > (SEC/PEI) firmware patches to detect whether a CPU is part of > > > > > the primary VM or the mirror. If Google has any OVMF patches > > > > > already done for that, it would be great to combine it with > > > > > IBM's SEV migration code and merge it into upstream OVMF. > > > > > > > > We've reached the stage with our prototyping where not having > > > > the OVMF support is blocking us from working on QEMU. If we're > > > > going to have to reinvent the wheel in OVMF because Google is > > > > unwilling to publish the patches, can you at least give some > > > > hints about how you did it? > > > > > > > > Thanks, > > > > > > > > James > > > > > > Hey James, > > > It's not strictly necessary to modify OVMF to make SEV VMs live > > > migrate. If we were to modify OVMF, we would contribute those > > > changes > > > upstream. > > > > Well, no, we already published an OVMF RFC to this list that does > > migration. However, the mirror approach requires a different boot > > mechanism for the extra vCPU in the mirror. I assume you're doing > > this bootstrap through OVMF so the hypervisor can interrogate it to > > get the correct entry point? That's the code we're asking to see > > because that's what replaces our use of the MP service in the RFC. > > > > James > > Hey James, > The intention would be to have a separate, stand-alone firmware-like > binary run by the mirror. Since the VMM is in control of where it > places that binary in the guest physical address space and the > initial configuration of the vCPUs, it can point the vCPUs at an > entry point contained within that binary, rather than at the standard > x86 reset vector. If you want to share ASIDs you have to share the firmware that the running VM has been attested to. Once the VM moves from LAUNCH to RUNNING, the PSP won't allow the VMM to inject any more firmware or do any more attestations. What you mirror after this point can thus only contain what has already been measured or what the guest added. This is why we think there has to be a new entry path into the VM for the mirror vCPU. So assuming you're thinking you'll inject two pieces of firmware at start of day: the OVFM and this separate binary and attest to both, then you can do that, but then you have two problems: 1. Preventing OVMF from trampling all over your separate binary while it's booting 2. Launching the vCPU up into this separate binary in a way it can execute (needs stack and heap) I think you can likely solve 1. by making the separate binary look like a ROM, but then you have the problem of where you steal the RAM you need for a heap and stack and it still brings us back to how to launch the vCPU which was the original question. With ES we can set the registers at launch, so a vCPU that's never launched can still be pre-programmed with the separate binary entry point but solving the stack and heap looks like it requires co- operation from OVMF. That's why we were thinking the easiest straight line approach is to have a runtime DXE which has a declared initialization routine that allocates memory for the stack and a heap and a separate declared entry point for the vCPU which picks up the already allocated and mapped stack and heap. James
Re: [RFC v2] KVM: x86: Support KVM VMs sharing SEV context
On Thu, 2021-04-08 at 12:48 -0700, Steve Rutherford wrote: > On Thu, Apr 8, 2021 at 10:43 AM James Bottomley > wrote: > > On Fri, 2021-04-02 at 16:20 +0200, Paolo Bonzini wrote: > > > On 02/04/21 13:58, Ashish Kalra wrote: > > > > Hi Nathan, > > > > > > > > Will you be posting a corresponding Qemu patch for this ? > > > > > > Hi Ashish, > > > > > > as far as I know IBM is working on QEMU patches for guest-based > > > migration helpers. > > > > Yes, that's right, we'll take on this part. > > > > > However, it would be nice to collaborate on the low-level > > > (SEC/PEI) firmware patches to detect whether a CPU is part of the > > > primary VM or the mirror. If Google has any OVMF patches already > > > done for that, it would be great to combine it with IBM's SEV > > > migration code and merge it into upstream OVMF. > > > > We've reached the stage with our prototyping where not having the > > OVMF support is blocking us from working on QEMU. If we're going > > to have to reinvent the wheel in OVMF because Google is unwilling > > to publish the patches, can you at least give some hints about how > > you did it? > > > > Thanks, > > > > James > > Hey James, > It's not strictly necessary to modify OVMF to make SEV VMs live > migrate. If we were to modify OVMF, we would contribute those changes > upstream. Well, no, we already published an OVMF RFC to this list that does migration. However, the mirror approach requires a different boot mechanism for the extra vCPU in the mirror. I assume you're doing this bootstrap through OVMF so the hypervisor can interrogate it to get the correct entry point? That's the code we're asking to see because that's what replaces our use of the MP service in the RFC. James
Re: [RFC v2] KVM: x86: Support KVM VMs sharing SEV context
On Fri, 2021-04-02 at 16:20 +0200, Paolo Bonzini wrote: > On 02/04/21 13:58, Ashish Kalra wrote: > > Hi Nathan, > > > > Will you be posting a corresponding Qemu patch for this ? > > Hi Ashish, > > as far as I know IBM is working on QEMU patches for guest-based > migration helpers. Yes, that's right, we'll take on this part. > However, it would be nice to collaborate on the low-level (SEC/PEI) > firmware patches to detect whether a CPU is part of the primary VM > or the mirror. If Google has any OVMF patches already done for that, > it would be great to combine it with IBM's SEV migration code and > merge it into upstream OVMF. We've reached the stage with our prototyping where not having the OVMF support is blocking us from working on QEMU. If we're going to have to reinvent the wheel in OVMF because Google is unwilling to publish the patches, can you at least give some hints about how you did it? Thanks, James
[GIT PULL] SCSI fixes for 5.12-rc
Single fix to iscsi for a rare race condition which can cause a kernel panic. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Gulam Mohamed (1): scsi: iscsi: Fix race condition between login and sync thread And the diffstat: drivers/scsi/scsi_transport_iscsi.c | 14 +- include/scsi/scsi_transport_iscsi.h | 1 + 2 files changed, 14 insertions(+), 1 deletion(-) With full diff below. James --- diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 969d24d580e2..bebfb355abdf 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -2471,6 +2471,7 @@ static void iscsi_if_stop_conn(struct iscsi_cls_conn *conn, int flag) */ mutex_lock(&conn_mutex); conn->transport->stop_conn(conn, flag); + conn->state = ISCSI_CONN_DOWN; mutex_unlock(&conn_mutex); } @@ -2894,6 +2895,13 @@ iscsi_set_param(struct iscsi_transport *transport, struct iscsi_uevent *ev) default: err = transport->set_param(conn, ev->u.set_param.param, data, ev->u.set_param.len); + if ((conn->state == ISCSI_CONN_BOUND) || + (conn->state == ISCSI_CONN_UP)) { + err = transport->set_param(conn, ev->u.set_param.param, + data, ev->u.set_param.len); + } else { + return -ENOTCONN; + } } return err; @@ -2953,6 +2961,7 @@ static int iscsi_if_ep_disconnect(struct iscsi_transport *transport, mutex_lock(&conn->ep_mutex); conn->ep = NULL; mutex_unlock(&conn->ep_mutex); + conn->state = ISCSI_CONN_DOWN; } transport->ep_disconnect(ep); @@ -3713,6 +3722,8 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group) ev->r.retcode = transport->bind_conn(session, conn, ev->u.b_conn.transport_eph, ev->u.b_conn.is_leading); + if (!ev->r.retcode) + conn->state = ISCSI_CONN_BOUND; mutex_unlock(&conn_mutex); if (ev->r.retcode || !transport->ep_connect) @@ -3944,7 +3955,8 @@ iscsi_conn_attr(local_ipaddr, ISCSI_PARAM_LOCAL_IPADDR); static const char *const connection_state_names[] = { [ISCSI_CONN_UP] = "up", [ISCSI_CONN_DOWN] = "down", - [ISCSI_CONN_FAILED] = "failed" + [ISCSI_CONN_FAILED] = "failed", + [ISCSI_CONN_BOUND] = "bound" }; static ssize_t show_conn_state(struct device *dev, diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h index 8a26a2ffa952..fc5a39839b4b 100644 --- a/include/scsi/scsi_transport_iscsi.h +++ b/include/scsi/scsi_transport_iscsi.h @@ -193,6 +193,7 @@ enum iscsi_connection_state { ISCSI_CONN_UP = 0, ISCSI_CONN_DOWN, ISCSI_CONN_FAILED, + ISCSI_CONN_BOUND, }; struct iscsi_cls_conn {
Re: [PATCH v1 0/3] KEYS: trusted: Introduce support for NXP CAAM-based trusted keys
On Thu, 2021-04-01 at 18:50 +0530, Sumit Garg wrote: > On Thu, 1 Apr 2021 at 15:36, Ahmad Fatoum > wrote: > > Hello Richard, > > > > On 31.03.21 21:36, Richard Weinberger wrote: > > > James, > > > > > > - Ursprüngliche Mail - > > > > Von: "James Bottomley" > > > > Well, yes. For the TPM, there's a defined ASN.1 format for the > > > > keys: > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/jejb/openssl_tpm2_engine.git/tree/tpm2-asn.h > > > > > > > > and part of the design of the file is that it's distinguishable > > > > either > > > > in DER or PEM (by the guards) format so any crypto application > > > > can know > > > > it's dealing with a TPM key simply by inspecting the file. I > > > > think you > > > > need the same thing for CAAM and any other format. > > > > > > > > We're encouraging new ASN.1 formats to be of the form > > > > > > > > SEQUENCE { > > > >type OBJECT IDENTIFIER > > > >... key specific fields ... > > > > } > > > > > > > > Where you choose a defined OID to represent the key and that > > > > means > > > > every key even in DER form begins with a unique binary > > > > signature. > > > > > > I like this idea. > > > Ahmad, what do you think? > > > > > > That way we could also get rid off the kernel parameter and all > > > the fall back logic, > > > given that we find a way to reliable detect TEE blobs too... > > > > Sounds good to me. Sumit, your thoughts on doing this for TEE as > > well? > > > > AFAIU, ASN.1 formating should be independent of trusted keys backends > which could be abstracted to trusted keys core layer so that every > backend could be plugged in seamlessly. > > James, > > Would it be possible to achieve this? I'm not quite sure what you're asking. The saved format of the keys is particular to the underlying hardware. The ASN.1 wraps this so we have an identifier, some useful information to help load the key (like the policy systemements) and then the blobs the hardware expects. This makes the ASN.1 specific to the backend but having a distinguishing OID that allows anyone to tell which backend it needs from the file. So you can call the ASN.1 format that begins with the type OID, the "universal" format, but if you mean there should be a standard ASN.1 format for all keys that defines all the fields then that's not possible because the fields after type are very specific to the underlying hardware. James
Re: [PATCH v1 0/3] KEYS: trusted: Introduce support for NXP CAAM-based trusted keys
On Wed, 2021-03-31 at 20:36 +0200, Richard Weinberger wrote: > James, > > - Ursprüngliche Mail - > > Von: "James Bottomley" > > > On Wed, Mar 17, 2021 at 3:08 PM Ahmad Fatoum < > > > a.fat...@pengutronix.de wrote: > > > > keyctl add trusted $KEYNAME "load $(cat ~/kmk.blob)" @s > > > > > > Is there a reason why we can't pass the desired backend name in > > > the > > > trusted key parameters? > > > e.g. > > > keyctl add trusted $KEYNAME "backendtype caam load $(cat > > > ~/kmk.blob)" > > > @s > > > > Why would you want to in the load? The blob should be type > > specific, so a TPM key shouldn't load as a CAAM key and vice versa > > ... and if they're not they need to be made so before the patches > > go upstream. > > I fear right now there is no good way to detect whether a blob is > desired for CAAM or TPM. At least for the TPM the old format is two TPM2B structures, and the new one is ASN.1 so either should be completely distinguishable over what CAAM does. > > I could possibly see that you might want to be type specific in the > > create, but once you're simply loading an already created key, the > > trusted key subsystem should be able to figure what to do on its > > own. > > So you have some kind of container format in mind which denotes the > type of the blob? Well, yes. For the TPM, there's a defined ASN.1 format for the keys: https://git.kernel.org/pub/scm/linux/kernel/git/jejb/openssl_tpm2_engine.git/tree/tpm2-asn.h and part of the design of the file is that it's distinguishable either in DER or PEM (by the guards) format so any crypto application can know it's dealing with a TPM key simply by inspecting the file. I think you need the same thing for CAAM and any other format. We're encouraging new ASN.1 formats to be of the form SEQUENCE { type OBJECT IDENTIFIER ... key specific fields ... } Where you choose a defined OID to represent the key and that means every key even in DER form begins with a unique binary signature. James
Re: [PATCH v1 0/3] KEYS: trusted: Introduce support for NXP CAAM-based trusted keys
On Wed, 2021-03-31 at 00:04 +0200, Richard Weinberger wrote: > Ahmad, > > On Wed, Mar 17, 2021 at 3:08 PM Ahmad Fatoum > wrote: > > keyctl add trusted $KEYNAME "load $(cat ~/kmk.blob)" @s > > Is there a reason why we can't pass the desired backend name in the > trusted key parameters? > e.g. > keyctl add trusted $KEYNAME "backendtype caam load $(cat ~/kmk.blob)" > @s Why would you want to in the load? The blob should be type specific, so a TPM key shouldn't load as a CAAM key and vice versa ... and if they're not they need to be made so before the patches go upstream. I could possibly see that you might want to be type specific in the create, but once you're simply loading an already created key, the trusted key subsystem should be able to figure what to do on its own. James
[GIT PULL] SCSI fixes for 5.12-rc4
Seven fixes, all in drivers (qla2xxx, mkt3sas, qedi, target, ibmvscsi). The most serious are the target pscsi oom and the qla2xxx revert which can otherwise cause a use after free. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Bart Van Assche (1): scsi: Revert "qla2xxx: Make sure that aborted commands are freed" Jia-Ju Bai (2): scsi: mpt3sas: Fix error return code of mpt3sas_base_attach() scsi: qedi: Fix error return code of qedi_alloc_global_queues() Martin Wilck (2): scsi: target: pscsi: Clean up after failure in pscsi_map_sg() scsi: target: pscsi: Avoid OOM in pscsi_map_sg() Tyrel Datwyler (2): scsi: ibmvfc: Make ibmvfc_wait_for_ops() MQ aware scsi: ibmvfc: Fix potential race in ibmvfc_wait_for_ops() And the diffstat: drivers/scsi/ibmvscsi/ibmvfc.c | 67 ++--- drivers/scsi/mpt3sas/mpt3sas_base.c | 8 +++-- drivers/scsi/qedi/qedi_main.c | 1 + drivers/scsi/qla2xxx/qla_target.c | 13 +++ drivers/scsi/qla2xxx/tcm_qla2xxx.c | 4 --- drivers/target/target_core_pscsi.c | 9 - 6 files changed, 74 insertions(+), 28 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index 6a92891ac488..bb64e3247a6c 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -2371,6 +2371,24 @@ static int ibmvfc_match_lun(struct ibmvfc_event *evt, void *device) return 0; } +/** + * ibmvfc_event_is_free - Check if event is free or not + * @evt: ibmvfc event struct + * + * Returns: + * true / false + **/ +static bool ibmvfc_event_is_free(struct ibmvfc_event *evt) +{ + struct ibmvfc_event *loop_evt; + + list_for_each_entry(loop_evt, &evt->queue->free, queue_list) + if (loop_evt == evt) + return true; + + return false; +} + /** * ibmvfc_wait_for_ops - Wait for ops to complete * @vhost: ibmvfc host struct @@ -2385,35 +2403,58 @@ static int ibmvfc_wait_for_ops(struct ibmvfc_host *vhost, void *device, { struct ibmvfc_event *evt; DECLARE_COMPLETION_ONSTACK(comp); - int wait; + int wait, i, q_index, q_size; unsigned long flags; signed long timeout = IBMVFC_ABORT_WAIT_TIMEOUT * HZ; + struct ibmvfc_queue *queues; ENTER; + if (vhost->mq_enabled && vhost->using_channels) { + queues = vhost->scsi_scrqs.scrqs; + q_size = vhost->scsi_scrqs.active_queues; + } else { + queues = &vhost->crq; + q_size = 1; + } + do { wait = 0; - spin_lock_irqsave(&vhost->crq.l_lock, flags); - list_for_each_entry(evt, &vhost->crq.sent, queue_list) { - if (match(evt, device)) { - evt->eh_comp = ∁ - wait++; + spin_lock_irqsave(vhost->host->host_lock, flags); + for (q_index = 0; q_index < q_size; q_index++) { + spin_lock(&queues[q_index].l_lock); + for (i = 0; i < queues[q_index].evt_pool.size; i++) { + evt = &queues[q_index].evt_pool.events[i]; + if (!ibmvfc_event_is_free(evt)) { + if (match(evt, device)) { + evt->eh_comp = ∁ + wait++; + } + } } + spin_unlock(&queues[q_index].l_lock); } - spin_unlock_irqrestore(&vhost->crq.l_lock, flags); + spin_unlock_irqrestore(vhost->host->host_lock, flags); if (wait) { timeout = wait_for_completion_timeout(&comp, timeout); if (!timeout) { wait = 0; - spin_lock_irqsave(&vhost->crq.l_lock, flags); - list_for_each_entry(evt, &vhost->crq.sent, queue_list) { - if (match(evt, device)) { - evt->eh_comp = NULL; - wait++; + spin_lock_irqsave(vhost->host->host_lock, flags); + for (q_index = 0; q_index < q_size; q_index++) { + spin_lock(&queues[q_index].l_lock); + for (i = 0; i < queues[q_index].evt_pool.size; i++) { + evt = &queues[q_index].evt_pool.events[i]; + if (!ibmvfc_event_is_free
Re: [PATCH v1 3/3] KEYS: trusted: Introduce support for NXP CAAM-based trusted keys
On Wed, 2021-03-24 at 16:49 -0400, Mimi Zohar wrote: > On Wed, 2021-03-24 at 09:14 -0700, James Bottomley wrote: > > On Tue, 2021-03-23 at 14:07 -0400, Mimi Zohar wrote: > > > On Tue, 2021-03-23 at 17:35 +0100, Ahmad Fatoum wrote: > > > > Hello Horia, > > > > > > > > On 21.03.21 21:48, Horia Geantă wrote: > > > > > On 3/16/2021 7:02 PM, Ahmad Fatoum wrote: > > > > > [...] > > > > > > +struct trusted_key_ops caam_trusted_key_ops = { > > > > > > + .migratable = 0, /* non-migratable */ > > > > > > + .init = trusted_caam_init, > > > > > > + .seal = trusted_caam_seal, > > > > > > + .unseal = trusted_caam_unseal, > > > > > > + .exit = trusted_caam_exit, > > > > > > +}; > > > > > caam has random number generation capabilities, so it's worth > > > > > using that > > > > > by implementing .get_random. > > > > > > > > If the CAAM HWRNG is already seeding the kernel RNG, why not > > > > use > > > > the kernel's? > > > > > > > > Makes for less code duplication IMO. > > > > > > Using kernel RNG, in general, for trusted keys has been discussed > > > before. Please refer to Dave Safford's detailed explanation for > > > not > > > using it [1]. > > > > > > [1] > > > https://lore.kernel.org/linux-integrity/bca04d5d9a3b764c9b7405bba4d4a3c035f2a...@alpmbapa12.e2k.ad.ge.com/ > > > > I still don't think relying on one source of randomness to be > > cryptographically secure is a good idea. The fear of bugs in the > > kernel entropy pool is reasonable, but since it's widely used > > they're > > unlikely to persist very long. Studies have shown that some TPMs > > (notably the chinese manufactured ones) have suspicious failures in > > their RNGs: > > > > https://www.researchgate.net/publication/45934562_Benchmarking_the_True_Random_Number_Generator_of_TPM_Chips > > > > And most cryptograhpers recommend using a TPM for entropy mixing > > rather > > than directly: > > > > https://blog.cryptographyengineering.com/category/rngs/ > > > > The TPMFail paper also shows that in spite of NIST certification > > things can go wrong with a TPM: > > > > https://tpm.fail/ > > We already had a lengthy discussion on replacing the TPM RNG with the > kernel RNG for trusted keys, when TEE was being introduced > [2,3]. I'm not interested in re-hashing that discussion here. The > only difference now is that CAAM is a new trust source. I suspect > the same concerns/issues persist, but at least in this case using the > kernel RNG would not be a regression. Upstreaming the ASN.1 parser gives us a way to create trusted keys outside the kernel and so choose any RNG that suits the user, so I don't think there's any need to rehash for TPM based keys either. However CaaM doesn't have the ability to create keys outside the kernel yet, so they do need to consider the problem. James > [2] Pascal Van Leeuwen on mixing different sources of entropy and > certification - > > https://lore.kernel.org/linux-integrity/mn2pr20mb29732a856a40131a671f949fca...@mn2pr20mb2973.namprd20.prod.outlook.com/ > [3] Jarrko on "regression" and tpm_asym.c - > https://lore.kernel.org/linux-integrity/20191014190033.ga15...@linux.intel.com/ > > > Mimi >
Re: [PATCH v1 3/3] KEYS: trusted: Introduce support for NXP CAAM-based trusted keys
On Tue, 2021-03-23 at 14:07 -0400, Mimi Zohar wrote: > On Tue, 2021-03-23 at 17:35 +0100, Ahmad Fatoum wrote: > > Hello Horia, > > > > On 21.03.21 21:48, Horia Geantă wrote: > > > On 3/16/2021 7:02 PM, Ahmad Fatoum wrote: > > > [...] > > > > +struct trusted_key_ops caam_trusted_key_ops = { > > > > + .migratable = 0, /* non-migratable */ > > > > + .init = trusted_caam_init, > > > > + .seal = trusted_caam_seal, > > > > + .unseal = trusted_caam_unseal, > > > > + .exit = trusted_caam_exit, > > > > +}; > > > caam has random number generation capabilities, so it's worth > > > using that > > > by implementing .get_random. > > > > If the CAAM HWRNG is already seeding the kernel RNG, why not use > > the kernel's? > > > > Makes for less code duplication IMO. > > Using kernel RNG, in general, for trusted keys has been discussed > before. Please refer to Dave Safford's detailed explanation for not > using it [1]. > > thanks, > > Mimi > > [1] > https://lore.kernel.org/linux-integrity/bca04d5d9a3b764c9b7405bba4d4a3c035f2a...@alpmbapa12.e2k.ad.ge.com/ I still don't think relying on one source of randomness to be cryptographically secure is a good idea. The fear of bugs in the kernel entropy pool is reasonable, but since it's widely used they're unlikely to persist very long. Studies have shown that some TPMs (notably the chinese manufactured ones) have suspicious failures in their RNGs: https://www.researchgate.net/publication/45934562_Benchmarking_the_True_Random_Number_Generator_of_TPM_Chips And most cryptograhpers recommend using a TPM for entropy mixing rather than directly: https://blog.cryptographyengineering.com/category/rngs/ The TPMFail paper also shows that in spite of NIST certification things can go wrong with a TPM: https://tpm.fail/ James
Re: [Ksummit-discuss] RFC: create mailing list "linux-issues" focussed on issues/bugs and regressions
On Tue, 2021-03-23 at 12:20 -0400, Steven Rostedt wrote: > On Mon, 22 Mar 2021 20:25:15 +0100 > Thorsten Leemhuis wrote: > > > I agree to the last point and yeah, maybe regressions are the more > > important problem we should work on – at least from the perspective > > of kernel development. But from the users perspective (and > > reporting-issues.rst is written for that perspective) it feel a bit > > unsatisfying to not have a solution to query for existing report, > > regressions or not. H... > > I think the bulk of user issues are going to be regressions. Although > you may be in a better position to know for sure, but at least for > me, wearing my "user" hat, the thing that gets me the most is > upgrading to a new kernel and suddenly something that use to work no > longer does. And that is the definition of a regression. My test > boxes still run old distros (one is running fedora 13). These are the > boxes that catch the most issues, and if they do, they are pretty > much guaranteed to be a regression. > > I like the "linux-regressions" mailing list idea. Can't we use the fancy features of public inbox to get the best of both worlds? Have the bug list (or even a collection of lists) but make the linux-regressions one a virtual list keying off an imap flag which a group of people control. That way anything that is flagged as a regression appears in that public inbox. I assume the search can be quite wide so we could flag a regression on any list indexed by lore? James
Re: [Ksummit-discuss] RFC: create mailing list "linux-issues" focussed on issues/bugs and regressions
On Mon, 2021-03-22 at 13:16 -0400, Konstantin Ryabitsev wrote: > On Mon, Mar 22, 2021 at 04:18:14PM +0100, Thorsten Leemhuis wrote: > > Note, there is a second reason why ksummit-discuss is CCed: another > > reason why I want to create this new list is making it easier to > > find and track regressions reported to our various mailing lists > > (often without LKML in CC, as for some subsystems it's seems to be > > custom to not CC it). > > FYI, there will soon be a unified "search all of lore.kernel.org > regardless of the list/feed source" capability that may make it > unnecessary to create a separate list for this purpose. There's > active ongoing work in the public-inbox project to provide parallel > ways to follow aggregate topics, including query-based subscriptions > (i.e. "put a thread into my inbox whenever someone mentions my > favourite file/function/device name"). This work is not complete yet, > but I have great hopes that it will become available in the next > little while. > > Once we have this ability, we should be able to plug in multiple > sources beyond just mailing lists, including a feed of all > bugzilla.kernel.org changes. This should allow someone an easy way to > query specific bugs and may not require the creation of a separate > list. > > I'm not opposed to the creation of a new list, of course -- just want > to make sure it's aligned with the improvements we are working to > make available. I suspect the problem is that there's no known useful search string to find a bug report even given a searchable set of lists, so the main purpose of this list would be "if it's on here, it's a bug report" and the triage team can cc additional lists as appropriate. Then we simply tell everyone to send kernel bugs to this list and ask maintainers to cc it if a bug report shows up on their list? James
[GIT PULL] SCSI fixes for 5.12-rc3
Eight fixes, all in drivers, all fairly minor either being fixes in error legs, memory leaks on teardown, context errors or semantic problems. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Alexey Dobriyan (1): scsi: qla2xxx: Fix broken #endif placement Christophe JAILLET (1): scsi: mpt3sas: Do not use GFP_KERNEL in atomic context Dan Carpenter (1): scsi: lpfc: Fix some error codes in debugfs Johannes Thumshirn (1): scsi: sd_zbc: Update write pointer offset cache Lv Yunlong (2): scsi: st: Fix a use after free in st_open() scsi: myrs: Fix a double free in myrs_cleanup() Tyrel Datwyler (1): scsi: ibmvfc: Free channel_setup_buf during device tear down dongjian (1): scsi: ufs: ufs-mediatek: Correct operator & -> && And the diffstat: drivers/scsi/ibmvscsi/ibmvfc.c | 2 ++ drivers/scsi/lpfc/lpfc_debugfs.c | 4 ++-- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +- drivers/scsi/myrs.c | 2 +- drivers/scsi/qla2xxx/qla_target.h| 2 +- drivers/scsi/sd_zbc.c| 19 +++ drivers/scsi/st.c| 2 +- drivers/scsi/ufs/ufs-mediatek.c | 2 +- 8 files changed, 20 insertions(+), 15 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index 338e29d3e025..6a92891ac488 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -5784,6 +5784,8 @@ static void ibmvfc_free_mem(struct ibmvfc_host *vhost) vhost->disc_buf_dma); dma_free_coherent(vhost->dev, sizeof(*vhost->login_buf), vhost->login_buf, vhost->login_buf_dma); + dma_free_coherent(vhost->dev, sizeof(*vhost->channel_setup_buf), + vhost->channel_setup_buf, vhost->channel_setup_dma); dma_pool_destroy(vhost->sg_pool); ibmvfc_free_queue(vhost, async_q); LEAVE; diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c index bc79a017e1a2..46a8f2d1d2b8 100644 --- a/drivers/scsi/lpfc/lpfc_debugfs.c +++ b/drivers/scsi/lpfc/lpfc_debugfs.c @@ -2421,7 +2421,7 @@ lpfc_debugfs_dif_err_write(struct file *file, const char __user *buf, memset(dstbuf, 0, 33); size = (nbytes < 32) ? nbytes : 32; if (copy_from_user(dstbuf, buf, size)) - return 0; + return -EFAULT; if (dent == phba->debug_InjErrLBA) { if ((dstbuf[0] == 'o') && (dstbuf[1] == 'f') && @@ -2430,7 +2430,7 @@ lpfc_debugfs_dif_err_write(struct file *file, const char __user *buf, } if ((tmp == 0) && (kstrtoull(dstbuf, 0, &tmp))) - return 0; + return -EINVAL; if (dent == phba->debug_writeGuard) phba->lpfc_injerr_wgrd_cnt = (uint32_t)tmp; diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index ffca03064797..6aa6de729187 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -413,7 +413,7 @@ mpt3sas_get_port_by_id(struct MPT3SAS_ADAPTER *ioc, * And add this object to port_table_list. */ if (!ioc->multipath_on_hba) { - port = kzalloc(sizeof(struct hba_port), GFP_KERNEL); + port = kzalloc(sizeof(struct hba_port), GFP_ATOMIC); if (!port) return NULL; diff --git a/drivers/scsi/myrs.c b/drivers/scsi/myrs.c index 4adf9ded296a..329fd025c718 100644 --- a/drivers/scsi/myrs.c +++ b/drivers/scsi/myrs.c @@ -2273,12 +2273,12 @@ static void myrs_cleanup(struct myrs_hba *cs) if (cs->mmio_base) { cs->disable_intr(cs); iounmap(cs->mmio_base); + cs->mmio_base = NULL; } if (cs->irq) free_irq(cs->irq, cs); if (cs->io_addr) release_region(cs->io_addr, 0x80); - iounmap(cs->mmio_base); pci_set_drvdata(pdev, NULL); pci_disable_device(pdev); scsi_host_put(cs->host); diff --git a/drivers/scsi/qla2xxx/qla_target.h b/drivers/scsi/qla2xxx/qla_target.h index 10e5e6c8087d..01620f3eab39 100644 --- a/drivers/scsi/qla2xxx/qla_target.h +++ b/drivers/scsi/qla2xxx/qla_target.h @@ -116,7 +116,6 @@ (min(1270, ((ql) > 0) ? (QLA_TGT_DATASEGS_PER_CMD_24XX + \ QLA_TGT_DATASEGS_PER_CONT_24XX*((ql) - 1)) : 0)) #endif -#endif #define GET_TARGET_ID(ha, iocb) ((HAS_EXTENDED_IDS(ha)) \ ? le16_to_cpu((iocb)->u.isp2x.target.extended) \ @@ -244,6 +243,7 @@ struct ctio_to_2xxx { #ifndef CTIO_RET_TYPE #define CTIO_RET_TYPE 0x17/* CTIO return entry */ #define ATIO_TYPE7 0x06 /* Accept target I/O entry for 24xx */ +#endif struct fcp_hdr { uint8_t r_ctl; diff --git a/drivers/scsi/s
[GIT PULL] SCSI fixes for 5.12-rc2
10 updates, a non code maintainer update for vmw_pvscsi, five code updates for ibmvfc and four for UFS. All updates are either trivial patches or bug fixes. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Can Guo (2): scsi: ufs: Remove redundant checks of !hba in suspend/resume callbacks scsi: ufs: Minor adjustments to error handling Jiapeng Chong (1): scsi: ufs: Convert sysfs sprintf/snprintf family to sysfs_emit Nitin Rawat (1): scsi: ufs: ufs-qcom: Disable interrupt in reset path Tyrel Datwyler (5): scsi: ibmvfc: Reinitialize sub-CRQs and perform channel enquiry after LPM scsi: ibmvfc: Store return code of H_FREE_SUB_CRQ during cleanup scsi: ibmvfc: Treat H_CLOSED as success during sub-CRQ registration scsi: ibmvfc: Fix invalid sub-CRQ handles after hard reset scsi: ibmvfc: Simplify handling of sub-CRQ initialization Vishal Bhakta (1): scsi: vmw_pvscsi: MAINTAINERS: Update maintainer And the diffstat: MAINTAINERS| 2 +- drivers/scsi/ibmvscsi/ibmvfc.c | 62 -- drivers/scsi/ufs/ufs-qcom.c| 10 +++ drivers/scsi/ufs/ufshcd.c | 41 +--- drivers/scsi/vmw_pvscsi.c | 2 -- drivers/scsi/vmw_pvscsi.h | 2 -- 6 files changed, 60 insertions(+), 59 deletions(-) With full diff below. James --- diff --git a/MAINTAINERS b/MAINTAINERS index d92f85ca831d..dede3b947834 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -19165,7 +19165,7 @@ S: Maintained F: drivers/infiniband/hw/vmw_pvrdma/ VMware PVSCSI driver -M: Jim Gill +M: Vishal Bhakta M: VMware PV-Drivers L: linux-s...@vger.kernel.org S: Maintained diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index 755313b766b9..338e29d3e025 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -21,6 +21,7 @@ #include #include #include +#include #include #include #include @@ -158,6 +159,9 @@ static void ibmvfc_npiv_logout(struct ibmvfc_host *); static void ibmvfc_tgt_implicit_logout_and_del(struct ibmvfc_target *); static void ibmvfc_tgt_move_login(struct ibmvfc_target *); +static void ibmvfc_release_sub_crqs(struct ibmvfc_host *); +static void ibmvfc_init_sub_crqs(struct ibmvfc_host *); + static const char *unknown_error = "unknown error"; static long h_reg_sub_crq(unsigned long unit_address, unsigned long ioba, @@ -899,6 +903,9 @@ static int ibmvfc_reenable_crq_queue(struct ibmvfc_host *vhost) { int rc = 0; struct vio_dev *vdev = to_vio_dev(vhost->dev); + unsigned long flags; + + ibmvfc_release_sub_crqs(vhost); /* Re-enable the CRQ */ do { @@ -910,6 +917,15 @@ static int ibmvfc_reenable_crq_queue(struct ibmvfc_host *vhost) if (rc) dev_err(vhost->dev, "Error enabling adapter (rc=%d)\n", rc); + spin_lock_irqsave(vhost->host->host_lock, flags); + spin_lock(vhost->crq.q_lock); + vhost->do_enquiry = 1; + vhost->using_channels = 0; + spin_unlock(vhost->crq.q_lock); + spin_unlock_irqrestore(vhost->host->host_lock, flags); + + ibmvfc_init_sub_crqs(vhost); + return rc; } @@ -926,8 +942,8 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost) unsigned long flags; struct vio_dev *vdev = to_vio_dev(vhost->dev); struct ibmvfc_queue *crq = &vhost->crq; - struct ibmvfc_queue *scrq; - int i; + + ibmvfc_release_sub_crqs(vhost); /* Close the CRQ */ do { @@ -947,16 +963,6 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost) memset(crq->msgs.crq, 0, PAGE_SIZE); crq->cur = 0; - if (vhost->scsi_scrqs.scrqs) { - for (i = 0; i < nr_scsi_hw_queues; i++) { - scrq = &vhost->scsi_scrqs.scrqs[i]; - spin_lock(scrq->q_lock); - memset(scrq->msgs.scrq, 0, PAGE_SIZE); - scrq->cur = 0; - spin_unlock(scrq->q_lock); - } - } - /* And re-open it again */ rc = plpar_hcall_norets(H_REG_CRQ, vdev->unit_address, crq->msg_token, PAGE_SIZE); @@ -966,9 +972,12 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost) dev_warn(vhost->dev, "Partner adapter not ready\n"); else if (rc != 0) dev_warn(vhost->dev, "Couldn't register crq (rc=%d)\n", rc); + spin_unlock(vhost->crq.q_lock); spin_unlock_irqrestore(vhost->host->host_lock, flags); + ibmvfc_init_sub_crqs(vhost); + return rc; } @@ -5642,7 +5651,8 @@ static int ibmvfc_register_scsi_channel(struct ibmvfc_host *vhost, rc = h_reg_sub_crq(vdev->unit_address, scrq->msg_token, PAGE_SIZE, &scrq->co
Re: [PATCH v9 0/4] Introduce TEE based Trusted Keys support
On Fri, 2021-03-12 at 18:26 +0200, Jarkko Sakkinen wrote: > On Wed, Mar 10, 2021 at 02:26:27PM -0800, James Bottomley wrote: > > On Wed, 2021-03-10 at 21:56 +0200, Jarkko Sakkinen wrote: > > [...] > > > I also need to apply > > > > > > https://lore.kernel.org/linux-integrity/20210127190617.17564-1-james.bottom...@hansenpartnership.com/ > > > > > > and I would like to do both while I'm at it. > > > > > > James, there was one patch that needed fixing but I cannot find > > > lore.kernel.org link. Can you point me to that so that we > > > can proceed? > > > > I think you mean this one observing a missing space in the commit > > message: > > > > https://lore.kernel.org/keyrings/1327393.1612972...@warthog.procyon.org.uk/ > > > > James > > Makefile needed fixing (separate lines), and spaces where missing > between > commas in one file (checkpatch complained). > > I digged a version from my reflog but as I noted privately it's > missing one > file. > > Either provide that file or send a new version of the full patch set. This is the file that got lost James --- --- ASN.1 for TPM 2.0 keys --- TPMKey ::= SEQUENCE { typeOBJECT IDENTIFIER ({tpm2_key_type}), emptyAuth [0] EXPLICIT BOOLEAN OPTIONAL, parent INTEGER ({tpm2_key_parent}), pubkey OCTET STRING ({tpm2_key_pub}), privkey OCTET STRING ({tpm2_key_priv}) }
Re: [RFC PATCH 1/5] rpmb: add Replay Protected Memory Block (RPMB) subsystem
On Thu, 2021-03-11 at 01:49 +0100, Linus Walleij wrote: > The use case for TPM on laptops is similar: it can be used by a > provider to lock down a machine, but it can also be used by the > random user to store keys. Very few users beside James > Bottomley are capable of doing that (I am not) Yes, that's the problem with the TPM: pretty much no-one other than someone prepared to become an expert in the subject can use it. This means that enabling RPMB is unlikely to be useful ... you have to develop easy use cases for it as well. > but they exist. > https://blog.hansenpartnership.com/using-your-tpm-as-a-secure-key-store/ It's the difficulty of actually *using* the thing as a keystore which causes the problem. The trick to expanding use it to make it simple. Not to derail the thread, but this should hopefully become a whole lot easier soon. Gnupg-2.3 will release with easy to use TPM support for all your gpg keys: https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=log;h=6720f1343aef9342127380b155c19e12c92d65ac It's not the end of the road by any means, but hopefully it will become a beach head of sorts for more uses. James
Re: [PATCH v9 0/4] Introduce TEE based Trusted Keys support
On Wed, 2021-03-10 at 21:56 +0200, Jarkko Sakkinen wrote: [...] > I also need to apply > > https://lore.kernel.org/linux-integrity/20210127190617.17564-1-james.bottom...@hansenpartnership.com/ > > and I would like to do both while I'm at it. > > James, there was one patch that needed fixing but I cannot find > lore.kernel.org link. Can you point me to that so that we > can proceed? I think you mean this one observing a missing space in the commit message: https://lore.kernel.org/keyrings/1327393.1612972...@warthog.procyon.org.uk/ James
Re: [PATCH v2][next] scsi: mpt3sas: Replace one-element array with flexible-array in struct _MPI2_CONFIG_PAGE_IO_UNIT_3
On Mon, 2021-03-08 at 13:32 -0600, Gustavo A. R. Silva wrote: > Hi all, > > Friendly ping: who can review/take this, please? Well, before embarking on a huge dynamic update, let's ask Broadcom the simpler question of why isn't MPI2_IO_UNIT_PAGE_3_GPIO_VAL_MAX simply set to 36? There's no dynamic allocation of anything in the current code ... it's all hard coded to allocate 36 entries. If there's no need for anything dynamic then the kzalloc could become io_unit_pg3 = kzalloc(sizeof(*io_unit_pg3), GFP_KERNEL); James
Re: [PATCH] scsi: iscsi: Switch to using the new API kobj_to_dev()
On Mon, 2021-03-08 at 11:34 +0800, Jiapeng Chong wrote: > Fix the following coccicheck warnings: > > ./drivers/scsi/scsi_transport_iscsi.c:930:60-61: WARNING opportunity > for kobj_to_dev(). I have to ask, what is the point of this? container_of is usually pretty safe ... as in it will detect when you screw up the usage. The only real misuse you can get is when the input type has an object of the same name and return type and you got confused between two objects with this property, but misuses like this resulting in bugs are very, very rare. Usually we wrap container_of because the wrapping is a bit shorter as you can see: kobj_to_dev is about half the size of the container_of form ... but is there any other reason to do it? The problem is that container_of is a standard way of doing cast outs in the kernel and we have hundreds of them. To be precise, in scsi alone: jejb@jarvis:~/git/linux/drivers/scsi> git grep container_of|wc -l 496 So we really don't want to encourage wrapping them all because the churn would be unbelievable and the gain minute. So why should this one case especially be wrapped when we don't want to wrap the others? James
Re: [PATCH v3] selinux: measure state and policy capabilities
On Fri, 2021-03-05 at 12:52 -0500, Paul Moore wrote: [...] > This draft seems fine to me, but there is a small logistical blocker > at the moment which means I can't merge this until -rc2 is released, > which likely means this coming Monday. The problem is that this > patch relies on code that went upstream via in the last merge window > via the IMA tree, not the SELinux tree; normally that wouldn't be a > problem as I typically rebase the selinux/next to Linus' -rc1 tag > once the merge window is closed, but in this particular case the -rc1 > tag is dangerously broken for some system configurations (the tag has > since been renamed) so I'm not rebasing onto -rc1 this time around. > > Assuming that -rc2 fixes the swapfile/fs-corruption problem, early > next week I'll rebase selinux/next to -rc2 and merge this patch. > However, if the swapfile bug continues past -rc2 we can consider > merging this via the IMA tree, but I'd assume not do that if possible > due to merge conflict and testing reasons. If it helps, we rebased the SCSI tree on top of the merge for the swapfile fix which is this one, without waiting for -rc2: commit f69d02e37a85645aa90d18cacfff36dba370f797 Merge: 7a7fd0de4a98 caf6912f3f4a Author: Linus Torvalds Date: Tue Mar 2 18:18:17 2021 -0800 Merge tag 'misc-5.12-2021-03-02' of git://git.kernel.dk/linux-block James
Re: linux-next: rebase of the scsi-mkp tree
On Fri, 2021-03-05 at 11:04 +1100, Stephen Rothwell wrote: > Hi Martin, > > I notice that you have rebased the scsi-mkp tree. Unfotunately James > has already merged part of the old version of the scsi-mkp tree int > the scsi tree so that commits f69d02e37a85..39ae3edda325 in the scsi- > mkp tree are the same patches as commits fe07bfda2fb9..100d21c4ff29 > in the scsi tree. It's just the flux from Linus announcing he's screwed up -rc1 and we shouldn't base on it ... it should all be fixed soon. James signature.asc Description: This is a digitally signed message part
[GIT PULL] final round of SCSI updates for the 5.11+ merge window
This is a few driver updates (iscsi, mpt3sas) that were still in the staging queue when the merge window opened (all committed on or before 8 Feb) and some small bug fixes which came in during the merge window (all committed on 22 Feb). The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc The short changelog of the bug fixes is: Aleksandr Miloserdov (2): scsi: target: core: Add cmd length set before cmd complete scsi: target: core: Prevent underflow for service actions Avri Altman (1): scsi: ufs: Fix a duplicate dev quirk number Bart Van Assche (1): scsi: sd: Fix Opal support Bhaskar Chowdhury (1): scsi: aic79xx: Fix spelling of version Bodo Stroesser (2): scsi: target: tcmu: Move some functions without code change scsi: target: tcmu: Fix memory leak caused by wrong uio usage Chen Lin (1): scsi: aic7xxx: Remove unused function pointer typedef ahc_bus_suspend/resume_t Don Brace (1): scsi: hpsa: Correct dev cmds outstanding for retried cmds Johannes Thumshirn (1): scsi: sd: sd_zbc: Don't pass GFP_NOIO to kvcalloc Randy Dunlap (1): scsi: bnx2fc: Fix Kconfig warning & CNIC build errors The short changelog of the staging updates is: Arnd Bergmann (1): scsi: pmcraid: Fix 'ioarcb' alignment warning Damien Le Moal (1): scsi: sd: Warn if unsupported ZBC device is probed DooHyun Hwang (1): scsi: ufs: Print the counter of each event history Jiapeng Chong (1): scsi: qla2xxx: Simplify if statement Mike Christie (9): scsi: iscsi: Drop session lock in iscsi_session_chkready() scsi: qla4xxx: Use iscsi_is_session_online() scsi: libiscsi: Reset max/exp cmdsn during recovery scsi: iscsi_tcp: Fix shost can_queue initialization scsi: libiscsi: Add helper to calculate max SCSI cmds per session scsi: libiscsi: Fix iSCSI host workq destruction scsi: libiscsi: Fix iscsi_task use after free() scsi: libiscsi: Drop taskqueuelock scsi: libiscsi: Fix iscsi_prep_scsi_cmd_pdu() error handling Sreekanth Reddy (2): scsi: mpt3sas: Add support for shared host tagset for CPU hotplug scsi: mpt3sas: Fix ReplyPostFree pool allocation Suganath Prabu S (2): scsi: mpt3sas: Update driver version to 37.100.00.00 scsi: mpt3sas: Additional diagnostic buffer query interface Yang Li (2): scsi: isci: Remove redundant initialization of variable 'status' scsi: target: sbp: Remove unneeded semicolon And the diffstat: drivers/scsi/aic7xxx/aic79xx.h | 2 +- drivers/scsi/aic7xxx/aic7xxx.h | 2 - drivers/scsi/bnx2fc/Kconfig | 1 + drivers/scsi/bnx2i/bnx2i_iscsi.c| 2 - drivers/scsi/hpsa.c | 51 +++- drivers/scsi/hpsa_cmd.h | 2 +- drivers/scsi/isci/request.c | 8 +- drivers/scsi/iscsi_tcp.c| 9 +- drivers/scsi/libiscsi.c | 348 ++-- drivers/scsi/libiscsi_tcp.c | 86 --- drivers/scsi/mpt3sas/mpt3sas_base.c | 58 +++-- drivers/scsi/mpt3sas/mpt3sas_base.h | 52 - drivers/scsi/mpt3sas/mpt3sas_ctl.c | 67 +- drivers/scsi/mpt3sas/mpt3sas_ctl.h | 22 ++ drivers/scsi/mpt3sas/mpt3sas_scsih.c| 44 +++- drivers/scsi/mpt3sas/mpt3sas_trigger_diag.c | 38 ++- drivers/scsi/pmcraid.h | 6 +- drivers/scsi/qla2xxx/qla_target.c | 3 +- drivers/scsi/qla4xxx/ql4_os.c | 2 +- drivers/scsi/scsi_transport_iscsi.c | 3 - drivers/scsi/sd.c | 14 +- drivers/scsi/sd_zbc.c | 6 +- drivers/scsi/ufs/ufshcd.c | 6 +- drivers/scsi/ufs/ufshcd.h | 2 +- drivers/target/sbp/sbp_target.c | 2 +- drivers/target/target_core_pr.c | 15 +- drivers/target/target_core_transport.c | 15 +- drivers/target/target_core_user.c | 189 --- include/scsi/libiscsi.h | 6 +- include/target/target_core_backend.h| 1 + 30 files changed, 746 insertions(+), 316 deletions(-) James
Re: [RFC] KVM: x86: Support KVM VMs sharing SEV context
> Add a capability for userspace to mirror SEV encryption context from > one vm to another. On our side, this is intended to support a > Migration Helper vCPU, but it can also be used generically to support > other in-guest workloads scheduled by the host. The intention is for > the primary guest and the mirror to have nearly identical memslots. So this causes a cloned VM that you can boot up another CPU into but the boot path must have been already present? In essence we've already been thinking about something like this to get migration running inside OVMF: https://lore.kernel.org/qemu-devel/8b824c44-6a51-c3a7-6596-921dc47fe...@linux.ibm.com/ It sounds like this mechanism can be used to boot a vCPU through a mirror VM after the fact, which is very compatible with the above whose mechanism is simply to steal a VCPU to hold in reset until it's activated. However, you haven't published how you activate the entity inside the VM ... do you have patches for this so we can see the internal capture mechanism and mirror VM boot path? > The primary benefits of this are that: > 1) The VMs do not share KVM contexts (think APIC/MSRs/etc), so they > can't accidentally clobber each other. > 2) The VMs can have different memory-views, which is necessary for > post-copy migration (the migration vCPUs on the target need to read > and write to pages, when the primary guest would VMEXIT). > > This does not change the threat model for AMD SEV. Any memory > involved is still owned by the primary guest and its initial state is > still attested to through the normal SEV_LAUNCH_* flows. If userspace > wanted to circumvent SEV, they could achieve the same effect by > simply attaching a vCPU to the primary VM. > This patch deliberately leaves userspace in charge of the memslots > for the mirror, as it already has the power to mess with them in the > primary guest. Well it does alter the threat model in that previously the configuration, including the CPU configuration, was fixed after launch and attestation. Now the CSP can alter the configuration via a mirror. I'm not sure I have a threat for this, but it definitely alters the model. > This patch does not support SEV-ES (much less SNP), as it does not > handle handing off attested VMSAs to the mirror. One of the reasons for doing the sequestered vcpu is that -ES and -SNP require the initial CPU state to be part of the attestation, so with them you can't add CPU state after the fact. I think you could use this model if you declare the vCPU in the mirror in the initial attested VMSA, but that's conjecture at this stage. > For additional context, we need a Migration Helper because SEV PSP > migration is far too slow for our live migration on its own. Using an > in-guest migrator lets us speed this up significantly. We have the same problem here at IBM, hence the RFC referred to above. James
Re: [PATCH 2/9] tpm: Allow PCR 23 to be restricted to kernel-only use
On Sat, 2021-02-20 at 01:32 +, Matthew Garrett wrote: > Under certain circumstances it might be desirable to enable the > creation of TPM-backed secrets that are only accessible to the > kernel. In an ideal world this could be achieved by using TPM > localities, but these don't appear to be available on consumer > systems. I don't understand this ... the localities seem to work fine on all the systems I have ... is this some embedded thing? > An alternative is to simply block userland from modifying one of the > resettable PCRs, leaving it available to the kernel. If the kernel > ensures that no userland can access the TPM while it is carrying out > work, it can reset PCR 23, extend it to an arbitrary value, create or > load a secret, and then reset the PCR again. Even if userland somehow > obtains the sealed material, it will be unable to unseal it since PCR > 23 will never be in the appropriate state. This seems a bit arbitrary: You're removing this PCR from user space accessibility, but PCR 23 is defined as "Application Support" how can we be sure no application will actually want to use it (and then fail)? Since PCRs are very scarce, why not use a NV index instead. They're still a bounded resource, but most TPMs have far more of them than they do PCRs, and the address space is much bigger so picking a nice arbitrary 24 bit value reduces the chance of collisions. James
Re: [PATCH v17 08/10] PM: hibernate: disable when there are active secretmem users
On Mon, 2021-02-22 at 11:17 -0800, Dan Williams wrote: > On Mon, Feb 22, 2021 at 2:24 AM Mike Rapoport > wrote: > > On Mon, Feb 22, 2021 at 07:34:52AM +, Matthew Garrett wrote: > > > On Mon, Feb 08, 2021 at 10:49:18AM +0200, Mike Rapoport wrote: > > > > > > > It is unsafe to allow saving of secretmem areas to the > > > > hibernation snapshot as they would be visible after the resume > > > > and this essentially will defeat the purpose of secret memory > > > > mappings. > > > > > > Sorry for being a bit late to this - from the point of view of > > > running processes (and even the kernel once resume is complete), > > > hibernation is effectively equivalent to suspend to RAM. Why do > > > they need to be handled differently here? > > > > Hibernation leaves a copy of the data on the disk which we want to > > prevent. > > Why not document that users should use data at rest protection > mechanisms for their hibernation device? Just because secretmem can't > assert its disclosure guarantee does not mean the hibernation device > is untrustworthy. It's not just data at rest. Part of the running security guarantees are that the kernel never gets access to the data. To support hibernate and swap we have to break that, so it reduces the runtime security posture as well as the data at rest one. This argues we could do it with a per region flags (something like less secure or more secure mappings), but when you give users that choice most of them rarely choose less secure. James
[GIT PULL] first round of SCSI updates for the 5.11+ merge window
This series consists of the usual driver updates (ufs, ibmvfc, qla2xxx, hisi_sas, pm80xx) plus the removal of the gdth driver (which is bound to cause conflicts with a trivial change somewhere). The only big major rework of note is the one from Hannes trying to clean up our result handling code in the drivers to make it consistent. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc The short changelog is: Adrian Hunter (2): scsi: ufs: ufs-debugfs: Add error counters scsi: docs: ABI: sysfs-driver-ufs: Add DeepSleep power mode Ahmed S. Darwish (19): scsi: libsas: Remove temporarily-added _gfp() API variants scsi: mvsas: Switch back to original libsas event notifiers scsi: isci: Switch back to original libsas event notifiers scsi: libsas: Switch back to original event notifiers API scsi: pm80xx: Switch back to original libsas event notifiers scsi: aic94xx: Switch back to original libsas event notifiers scsi: hisi_sas: Switch back to original libsas event notifiers scsi: libsas: Add gfp_t flags parameter to event notifications scsi: hisi_sas: Pass gfp_t flags to libsas event notifiers scsi: aic94xx: Pass gfp_t flags to libsas event notifiers scsi: pm80xx: Pass gfp_t flags to libsas event notifiers scsi: libsas: Pass gfp_t flags to event notifiers scsi: isci: Pass gfp_t flags in isci_port_bc_change_received() scsi: isci: Pass gfp_t flags in isci_port_link_up() scsi: isci: Pass gfp_t flags in isci_port_link_down() scsi: mvsas: Pass gfp_t flags to libsas event notifiers scsi: libsas: Introduce a _gfp() variant of event notifiers scsi: libsas: docs: Remove notify_ha_event() scsi: target: core: Remove in_interrupt() check in transport_handle_cdb_direct() Anastasia Kovaleva (2): scsi: target: core: Change ASCQ for residual write scsi: target: core: Signal WRITE residuals Andrea Parri (Microsoft) (3): scsi: storvsc: Validate length of incoming packet in storvsc_on_channel_callback() scsi: storvsc: Resolve data race in storvsc_probe() scsi: storvsc: Fix max_outstanding_req_per_channel for Win8 and newer Arnd Bergmann (1): scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression Avri Altman (1): scsi: ufs: A tad optimization in query upiu trace Bean Huo (15): scsi: ufs: Cleanup WB buffer flush toggle implementation scsi: ufs: Group UFS WB related flags in struct ufs_dev_info scsi: ufs: Remove two WB related fields from struct ufs_dev_info scsi: ufs: Update comment in the function ufshcd_wb_probe() scsi: ufs: docs: ABI: Add wb_on documentation for new entry wb_on scsi: ufs: Add "wb_on" sysfs node to control WB on/off scsi: ufs: Delete redundant if statement in ufshcd_intr() scsi: ufs: Remove unnecessary devm_kfree() scsi: ufs: Replace sprintf and snprintf with sysfs_emit scsi: ufs: Make UPIU trace easier differentiate among CDB, OSF, and TM scsi: ufs: Distinguish between TM request UPIU and response UPIU in TM UPIU trace scsi: ufs: Distinguish between query REQ and query RSP in query trace scsi: ufs: Don't call trace_ufshcd_upiu() in case trace poit is disabled scsi: ufs: Use __print_symbolic() for UFS trace string print scsi: ufs: Remove stringize operator '#' restriction Bhavesh Jashnani (1): scsi: pm80xx: Simultaneous poll for all FW readiness Bikash Hazarika (1): scsi: qla2xxx: Wait for ABTS response on I/O timeouts for NVMe Bjorn Helgaas (2): scsi: lpfc: Fix 'physical' typos scsi: message: fusion: Fix 'physical' typos Brian King (1): scsi: ibmvfc: Set default timeout to avoid crash during migration Can Guo (7): scsi: ufs: Give clk scaling min gear a value Revert "Make sure clk scaling happens only when HBA is runtime ACTIVE" scsi: ufs: Refactor ufshcd_init/exit_clk_scaling/gating() scsi: ufs: Protect some contexts from unexpected clock scaling scsi: ufs: Protect PM ops and err_handler from user access through sysfs scsi: ufs: Fix a possible NULL pointer issue scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback Christophe JAILLET (1): scsi: pm80xx: Switch from 'pci_' to 'dma_' API Colin Ian King (3): scsi: ibmvfc: Fix spelling mistake "succeded" -> "succeeded" scsi: pm80xx: Clean up indentation of a code block scsi: mpt3sas: Fix spelling mistake in Kconfig "compatiblity" -> "compatibility" Dan Carpenter (3): scsi: lpfc: Fix ancient double free scsi: qla2xxx: Fix some memory corruption scsi: qla2xxx: Remove unnecessary NULL check Dinghao Liu (2): scsi: fnic: Fix memleak in vnic_dev_init_devcmd2 scsi: scsi_debug: Fix memleak in scsi_debug_init() Enzo Matsumiya (1): scsi: qla2xxx: Fix description for parameter ql2xenforce_iocb_limit Eric Curtin (1
Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas
On Tue, 2021-02-16 at 18:16 +0100, David Hildenbrand wrote: [...] > > > The discussion regarding migratability only really popped up > > > because this is a user-visible thing and not being able to > > > migrate can be a real problem (fragmentation, ZONE_MOVABLE, ...). > > > > I think the biggest use will potentially come from hardware > > acceleration. If it becomes simple to add say encryption to a > > secret page with no cost, then no flag needed. However, if we only > > have a limited number of keys so once we run out no more encrypted > > memory then it becomes a costly resource and users might want a > > choice of being backed by encryption or not. > > Right. But wouldn't HW support with configurable keys etc. need more > syscall parameters (meaning, even memefd_secret() as it is would not > be sufficient?). I suspect the simplistic flag approach might not > be sufficient. I might be wrong because I have no clue about MKTME > and friends. The theory I was operating under is key management is automatic and hidden, but key scarcity can't be, so if you flag requesting hardware backing then you either get success (the kernel found a key) or failure (the kernel is out of keys). If we actually want to specify the key then we need an extra argument and we *must* have a new system call. > Anyhow, I still think extending memfd_create() might just be good > enough - at least for now. I really think this is the wrong approach for a user space ABI. If we think we'll ever need to move to a separate syscall, we should begin with one. The pain of trying to shift userspace from memfd_create to a new syscall would be enormous. It's not impossible (see clone3) but it's a pain we should avoid if we know it's coming. > Things like HW support might have requirements we don't even know > yet and that we cannot even model in memfd_secret() right now. This is the annoying problem with our Linux unbreakable ABI policy: we get to plan when the ABI is introduced for stuff we don't yet even know about. James
Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas
On Tue, 2021-02-16 at 17:34 +0100, David Hildenbrand wrote: > On 16.02.21 17:25, James Bottomley wrote: > > On Mon, 2021-02-15 at 20:20 +0100, Michal Hocko wrote: > > [...] > > > > >What kind of flags are we talking about and why would that > > > > > be a problem with memfd_create interface? Could you be more > > > > > specific please? > > > > > > > > You mean what were the ioctl flags in the patch series linked > > > > above? They were SECRETMEM_EXCLUSIVE and SECRETMEM_UNCACHED in > > > > patch 3/5. > > > > > > OK I see. How many potential modes are we talking about? A few or > > > potentially many? > > > > Well I initially thought there were two (uncached or not) until you > > came up with the migratable or non-migratable, which affects the > > security properties. But now there's also potential for hardware > > backing, like mktme, described by flags as well. I suppose you > > could also use RDT to restrict which cache the data goes into: say > > L1 but not L2 on to lessen the impact of fully uncached (although > > the big thrust of uncached was to blunt hyperthread side > > channels). So there is potential for quite a large expansion even > > though I'd be willing to bet that a lot of the modes people have > > thought about turn out not to be very effective in the field. > > Thanks for the insight. I remember that even the "uncached" parts > was effectively nacked by x86 maintainers (I might be wrong). It wasn't liked by x86 maintainers, no. Plus there's no architecturally standard mechanism for making a page uncached and, as the arm people pointed out, sometimes no way of ensuring it's never cached. > For the other parts, the question is what we actually want to let > user space configure. > > Being able to specify "Very secure" "maximum secure" "average > secure" all doesn't really make sense to me. Well, it doesn't to me either unless the user feels a cost/benefit, so if max cost $100 per invocation and average cost nothing, most people would chose average unless they had a very good reason not to. In your migratable model, if we had separate limits for non-migratable and migratable, with non-migratable being set low to prevent exhaustion, max secure becomes a highly scarce resource, whereas average secure is abundant then having the choice might make sense. > The discussion regarding migratability only really popped up because > this is a user-visible thing and not being able to migrate can be a > real problem (fragmentation, ZONE_MOVABLE, ...). I think the biggest use will potentially come from hardware acceleration. If it becomes simple to add say encryption to a secret page with no cost, then no flag needed. However, if we only have a limited number of keys so once we run out no more encrypted memory then it becomes a costly resource and users might want a choice of being backed by encryption or not. James
Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas
On Mon, 2021-02-15 at 20:20 +0100, Michal Hocko wrote: [...] > > > What kind of flags are we talking about and why would that be a > > > problem with memfd_create interface? Could you be more specific > > > please? > > > > You mean what were the ioctl flags in the patch series linked > > above? They were SECRETMEM_EXCLUSIVE and SECRETMEM_UNCACHED in > > patch 3/5. > > OK I see. How many potential modes are we talking about? A few or > potentially many? Well I initially thought there were two (uncached or not) until you came up with the migratable or non-migratable, which affects the security properties. But now there's also potential for hardware backing, like mktme, described by flags as well. I suppose you could also use RDT to restrict which cache the data goes into: say L1 but not L2 on to lessen the impact of fully uncached (although the big thrust of uncached was to blunt hyperthread side channels). So there is potential for quite a large expansion even though I'd be willing to bet that a lot of the modes people have thought about turn out not to be very effective in the field. James
Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas
On Mon, 2021-02-15 at 10:13 +0100, Michal Hocko wrote: > On Sun 14-02-21 11:21:02, James Bottomley wrote: > > On Sun, 2021-02-14 at 10:58 +0100, David Hildenbrand wrote: > > [...] > > > > And here we come to the question "what are the differences that > > > > justify a new system call?" and the answer to this is very > > > > subjective. And as such we can continue bikeshedding forever. > > > > > > I think this fits into the existing memfd_create() syscall just > > > fine, and I heard no compelling argument why it shouldn‘t. That‘s > > > all I can say. > > > > OK, so let's review history. In the first two incarnations of the > > patch, it was an extension of memfd_create(). The specific > > objection by Kirill Shutemov was that it doesn't share any code in > > common with memfd and so should be a separate system call: > > > > https://lore.kernel.org/linux-api/20200713105812.dnwtdhsuyj3xbh4f@box/ > > Thanks for the pointer. But this argument hasn't been challenged at > all. It hasn't been brought up that the overlap would be considerable > higher by the hugetlb/sealing support. And so far nobody has claimed > those combinations as unviable. Kirill is actually interested in the sealing path for his KVM code so we took a look. There might be a two line overlap in memfd_create for the seal case, but there's no real overlap in memfd_add_seals which is the bulk of the code. So the best way would seem to lift the inode ... -> seals helpers to be non-static so they can be reused and roll our own add_seals. I can't see a use case at all for hugetlb support, so it seems to be a bit of an angels on pin head discussion. However, if one were to come along handling it in the same way seems reasonable. > > The other objection raised offlist is that if we do use > > memfd_create, then we have to add all the secret memory flags as an > > additional ioctl, whereas they can be specified on open if we do a > > separate system call. The container people violently objected to > > the ioctl because it can't be properly analysed by seccomp and much > > preferred the syscall version. > > > > Since we're dumping the uncached variant, the ioctl problem > > disappears but so does the possibility of ever adding it back if we > > take on the container peoples' objection. This argues for a > > separate syscall because we can add additional features and extend > > the API with flags without causing anti-ioctl riots. > > I am sorry but I do not understand this argument. You don't understand why container guarding technology doesn't like ioctls? The problem is each ioctl is the multiplexor is specific to the particular fd implementation, so unlike fcntl you don't have global ioctl numbers (although we do try to separate the space somewhat with the _IO macros). This makes analysis and blocking a hard problem for container seccomp. > What kind of flags are we talking about and why would that be a > problem with memfd_create interface? Could you be more specific > please? You mean what were the ioctl flags in the patch series linked above? They were SECRETMEM_EXCLUSIVE and SECRETMEM_UNCACHED in patch 3/5. They were eventually dropped after v10, because of problems with architectural semantics, with the idea that it could be added back again if a compelling need arose: https://lore.kernel.org/linux-api/20201123095432.5860-1-r...@kernel.org/ In theory the extra flags could be multiplexed into the memfd_create flags like hugetlbfs is but with 32 flags and a lot already taken it gets messy for expansion. When we run out of flags the first question people will ask is "why didn't you do separate system calls?". James
Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas
On Sun, 2021-02-14 at 10:58 +0100, David Hildenbrand wrote: [...] > > And here we come to the question "what are the differences that > > justify a new system call?" and the answer to this is very > > subjective. And as such we can continue bikeshedding forever. > > I think this fits into the existing memfd_create() syscall just fine, > and I heard no compelling argument why it shouldn‘t. That‘s all I can > say. OK, so let's review history. In the first two incarnations of the patch, it was an extension of memfd_create(). The specific objection by Kirill Shutemov was that it doesn't share any code in common with memfd and so should be a separate system call: https://lore.kernel.org/linux-api/20200713105812.dnwtdhsuyj3xbh4f@box/ The other objection raised offlist is that if we do use memfd_create, then we have to add all the secret memory flags as an additional ioctl, whereas they can be specified on open if we do a separate system call. The container people violently objected to the ioctl because it can't be properly analysed by seccomp and much preferred the syscall version. Since we're dumping the uncached variant, the ioctl problem disappears but so does the possibility of ever adding it back if we take on the container peoples' objection. This argues for a separate syscall because we can add additional features and extend the API with flags without causing anti-ioctl riots. James
[GIT PULL] SCSI fixes for 5.11-rc6
One fix for scsi_debug that fixes a memory leak on module removal. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Maurizio Lombardi (1): scsi: scsi_debug: Fix a memory leak And the diffstat: drivers/scsi/scsi_debug.c | 1 + 1 file changed, 1 insertion(+) With full diff below. James --- diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 4a08c450b756..b6540b92f566 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -6881,6 +6881,7 @@ static void __exit scsi_debug_exit(void) sdebug_erase_all_stores(false); xa_destroy(per_store_ap); + kfree(sdebug_q_arr); } device_initcall(scsi_debug_init);
Re: [PATCH] sign-file: add openssl engine support
On Wed, 2021-02-10 at 08:01 +, David Woodhouse wrote: > > On 10 February 2021 07:45:54 GMT, Yang Song < > songy...@linux.alibaba.com> wrote: > > Use a customized signature service supported by openssl engine > > to sign the kernel module. > > Add command line parameters that support engine for sign-file > > to use the customized openssl engine service to sign kernel > > modules. > > > > Signed-off-by: Yang Song > > Aren't engines already obsolete in the latest versions of OpenSSL, as > well as being an implementation detail of one particular crypto > library? Um, no, they're getting renamed providers with some annoying API changes that require a bit of a rewrite but the concept of a crypto "engine" plug in to the code base isn't going away. > They aren't really a concept we should be exposing in *our* user > interface. We already do ... grep ENGINE in scripts/sign-file.c Just by the way in case anyone is interested in history: https://lore.kernel.org/keyrings/1518452963.3114.6.ca...@hansenpartnership.com/ > Better to make sign-file automatically recognise RFC7512 PKCS#11 URIs > and handle them by automatically loading the PKCS#11 engine. PKCS11 can't cover everyting engines can. Engines are mostly used for accelerators, which are not in the PKCS11 API and even for external keys, PKCS11 can't cope if the key isn't inside what PKCS11 thinks of as a token. James
[GIT PULL] SCSI fixes for 5.11-rc6
One fix in drivers (lpfc) that stops an oops on resource exhaustion. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: James Smart (1): scsi: lpfc: Fix EEH encountering oops with NVMe traffic And the diffstat: drivers/scsi/lpfc/lpfc_nvme.c | 3 +++ 1 file changed, 3 insertions(+) With full diff below James --- diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c index 1cb82fa6a60e..39d147e251bf 100644 --- a/drivers/scsi/lpfc/lpfc_nvme.c +++ b/drivers/scsi/lpfc/lpfc_nvme.c @@ -559,6 +559,9 @@ __lpfc_nvme_ls_req(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp, return -ENODEV; } + if (!vport->phba->sli4_hba.nvmels_wq) + return -ENOMEM; + /* * there are two dma buf in the request, actually there is one and * the second one is just the start address + cmd size.
Re: [PATCH v3 2/2] tpm: in tpm2_del_space check if ops pointer is still valid
On Fri, 2021-02-05 at 13:25 -0400, Jason Gunthorpe wrote: > On Fri, Feb 05, 2021 at 08:48:11AM -0800, James Bottomley wrote: [...] > > The practical consequence of this model is that if you allocate a > > chip structure with tpm_chip_alloc() you have to release it again > > by doing a put of *both* devices. > > The final put of the devs should be directly after the > cdev_device_del(), not in a devm. This became all confused because > the devs was created during alloc, not register. Having a device that > is initialized but will never be added is weird. > > See sketch below. > > > Stefan noticed the latter, so we got the bogus patch 8979b02aaf1d > > ("tpm: Fix reference count to main device") applied which simply > > breaks the master/slave model by not taking a reference on the > > master for the slave. I'm not sure why I didn't notice the problem > > with this fix at the time, but attention must have been elsewhere. > > Well, this is sort of OK because we never use the devs in TPM1, so we > end up freeing the chip with a positive refcount on the devs, which > is weird but not a functional bug. > > Jason > > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm- > chip.c > index ddaeceb7e10910..e07193a0dd4438 100644 > --- a/drivers/char/tpm/tpm-chip.c > +++ b/drivers/char/tpm/tpm-chip.c > @@ -344,7 +344,6 @@ struct tpm_chip *tpm_chip_alloc(struct device > *pdev, > chip->dev_num = rc; > > device_initialize(&chip->dev); > - device_initialize(&chip->devs); > > chip->dev.class = tpm_class; > chip->dev.class->shutdown_pre = tpm_class_shutdown; > @@ -352,29 +351,12 @@ struct tpm_chip *tpm_chip_alloc(struct device > *pdev, > chip->dev.parent = pdev; > chip->dev.groups = chip->groups; > > - chip->devs.parent = pdev; > - chip->devs.class = tpmrm_class; > - chip->devs.release = tpm_devs_release; > - /* get extra reference on main device to hold on > - * behalf of devs. This holds the chip structure > - * while cdevs is in use. The corresponding put > - * is in the tpm_devs_release (TPM2 only) > - */ > - if (chip->flags & TPM_CHIP_FLAG_TPM2) > - get_device(&chip->dev); > - > if (chip->dev_num == 0) > chip->dev.devt = MKDEV(MISC_MAJOR, TPM_MINOR); > else > chip->dev.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num); > > - chip->devs.devt = > - MKDEV(MAJOR(tpm_devt), chip->dev_num + > TPM_NUM_DEVICES); > - > rc = dev_set_name(&chip->dev, "tpm%d", chip->dev_num); > - if (rc) > - goto out; > - rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num); > if (rc) > goto out; > > @@ -382,9 +364,7 @@ struct tpm_chip *tpm_chip_alloc(struct device > *pdev, > chip->flags |= TPM_CHIP_FLAG_VIRTUAL; > > cdev_init(&chip->cdev, &tpm_fops); > - cdev_init(&chip->cdevs, &tpmrm_fops); > chip->cdev.owner = THIS_MODULE; > - chip->cdevs.owner = THIS_MODULE; > > rc = tpm2_init_space(&chip->work_space, > TPM2_SPACE_BUFFER_SIZE); > if (rc) { > @@ -396,7 +376,6 @@ struct tpm_chip *tpm_chip_alloc(struct device > *pdev, > return chip; > > out: > - put_device(&chip->devs); > put_device(&chip->dev); > return ERR_PTR(rc); > } > @@ -445,13 +424,33 @@ static int tpm_add_char_device(struct tpm_chip > *chip) > } > > if (chip->flags & TPM_CHIP_FLAG_TPM2) { > + device_initialize(&chip->devs); > + chip->devs.parent = pdev; > + chip->devs.class = tpmrm_class; > + rc = dev_set_name(&chip->devs, "tpmrm%d", chip- > >dev_num); > + if (rc) > + goto out_put_devs; > + > + /* > + * get extra reference on main device to hold on > behalf of devs. > + * This holds the chip structure while cdevs is in > use. The > + * corresponding put is in the tpm_devs_release. > + */ > + get_device(&chip->dev); > + chip->devs.release = tpm_devs_release; > + > + chip->devs.devt = > + MKDEV(MAJOR(tpm_devt), chip->dev_num + > TPM_NUM_DEVICES); > + cdev_init(&chip->cdevs, &tpmrm_fops); > + chip->cdevs.owner = THIS_MODUL
Re: [PATCH v3 2/2] tpm: in tpm2_del_space check if ops pointer is still valid
On Fri, 2021-02-05 at 04:18 +0200, Jarkko Sakkinen wrote: > On Thu, Feb 04, 2021 at 04:34:11PM -0800, James Bottomley wrote: > > On Fri, 2021-02-05 at 00:50 +0100, Lino Sanfilippo wrote: > > > From: Lino Sanfilippo > > > > > > In tpm2_del_space() chip->ops is used for flushing the sessions. > > > However > > > this function may be called after tpm_chip_unregister() which > > > sets > > > the chip->ops pointer to NULL. > > > Avoid a possible NULL pointer dereference by checking if chip- > > > >ops is > > > still > > > valid before accessing it. > > > > > > Fixes: a3fbfae82b4c ("tpm: take TPM chip power gating out of > > > tpm_transmit()") > > > Signed-off-by: Lino Sanfilippo > > > --- > > > drivers/char/tpm/tpm2-space.c | 15 ++- > > > 1 file changed, 10 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/char/tpm/tpm2-space.c > > > b/drivers/char/tpm/tpm2- > > > space.c > > > index 784b8b3..9a29a40 100644 > > > --- a/drivers/char/tpm/tpm2-space.c > > > +++ b/drivers/char/tpm/tpm2-space.c > > > @@ -58,12 +58,17 @@ int tpm2_init_space(struct tpm_space *space, > > > unsigned int buf_size) > > > > > > void tpm2_del_space(struct tpm_chip *chip, struct tpm_space > > > *space) > > > { > > > - mutex_lock(&chip->tpm_mutex); > > > - if (!tpm_chip_start(chip)) { > > > - tpm2_flush_sessions(chip, space); > > > - tpm_chip_stop(chip); > > > + down_read(&chip->ops_sem); > > > + if (chip->ops) { > > > + mutex_lock(&chip->tpm_mutex); > > > + if (!tpm_chip_start(chip)) { > > > + tpm2_flush_sessions(chip, space); > > > + tpm_chip_stop(chip); > > > + } > > > + mutex_unlock(&chip->tpm_mutex); > > > } > > > - mutex_unlock(&chip->tpm_mutex); > > > + up_read(&chip->ops_sem); > > > + > > > kfree(space->context_buf); > > > kfree(space->session_buf); > > > } > > > > Actually, this still isn't right. As I said to the last person who > > reported this, we should be doing a get/put on the ops, not rolling > > our > > own here: > > > > https://lore.kernel.org/linux-integrity/e7566e1e48f5be9dca034b4bfb67683b5d3cb88f.ca...@hansenpartnership.com/ > > > > The reporter went silent before we could get this tested, but could > > you > > try, please, because your patch is still hand rolling the ops > > get/put, > > just slightly better than it had been done previously. > > > > James > > Thanks for pointing this out. I'd strongly support Jason's proposal: > > https://lore.kernel.org/linux-integrity/20201215175624.gg5...@ziepe.ca/ > > It's the best long-term way to fix this. Really, no it's not. It introduces extra mechanism we don't need. To recap the issue: character devices already have an automatic mechanism which holds a reference to the struct device while the character node is open so the default is to release resources on final put of the struct device. tpm 2 is special because we have two character device nodes: /dev/tpm0 and /dev/tpmrm0. The way we make this work is that tpm0 is the master and tpmrm0 the slave, so the slave holds an extra reference on the master which is put when the slave final put happens. This means that our resources aren't freed until the final puts of both devices, which is the model we're using. The practical consequence of this model is that if you allocate a chip structure with tpm_chip_alloc() you have to release it again by doing a put of *both* devices. However, patch fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm") contains two bugs: firstly it didn't add a devm action release for devs and secondly it didn't update the only non-devm user ibm vtpm to do the double put. Stefan noticed the latter, so we got the bogus patch 8979b02aaf1d ("tpm: Fix reference count to main device") applied which simply breaks the master/slave model by not taking a reference on the master for the slave. I'm not sure why I didn't notice the problem with this fix at the time, but attention must have been elsewhere. Subsequently we got ftpm added which copied the ibm vtpm put bug. So I think 1/2 is the correct fix for all three bugs. I just need to find a way to verify it. James
Re: [PATCH v3 2/2] tpm: in tpm2_del_space check if ops pointer is still valid
On Fri, 2021-02-05 at 13:25 -0400, Jason Gunthorpe wrote: > On Fri, Feb 05, 2021 at 08:48:11AM -0800, James Bottomley wrote: > > > Thanks for pointing this out. I'd strongly support Jason's > > > proposal: > > > > > > https://lore.kernel.org/linux-integrity/20201215175624.gg5...@ziepe.ca/ > > > > > > It's the best long-term way to fix this. > > > > Really, no it's not. It introduces extra mechanism we don't need. > > To recap the issue: character devices already have an automatic > > mechanism which holds a reference to the struct device while the > > character node is open so the default is to release resources on > > final > > put of the struct device. > > The refcount on the struct device only keeps the memory alive, it > doesn't say anything about the ops. We still need to lock and check > the ops each and every time they are used. I think this is the crux of our disagreement: I think the ops doesn't matter because to call try_get_ops you have to have a chip structure and the only way you get a chip structure is if you hold a device containing it, in which case the device hold guarantees the chip can't be freed. Or if you pass in TPM_ANY_NUM to an operation which calls tpm_chip_find_get() which iterates the idr to find a chip under the idr lock. If you find a chip device at the idr, you're guaranteed it exists, because elimination of it is the first thing the release does and if you find a dying dev (i.e. the release routine blocks on the idr mutex trying to kill the chip attachment), try_get_ops() fails because the ops are already NULL. In either case, I think you get returned a device to which you hold a reference. Is there any other case where you can get a chip without also getting a device reference? I'll answer the other point in a separate email, but I think the principle sounds OK: we could do the final put right after we del the char devices because that's called in the module release routine and thus not have to rely on the devm actions which, as you say, are an annoying complication. James
Re: [PATCH v3 1/2] tpm: fix reference counting for struct tpm_chip
On Thu, 2021-02-04 at 20:44 -0500, Stefan Berger wrote: > To clarify: When I tested this I had *both* patches applied. Without > the patches I got the null pointer exception in tpm2_del_space(). The > 2nd patch alone solves that issue when using the steps above. Yes, I can't confirm the bug either. I only have lpc tis devices, so it could be something to do with spi, but when I do python3 in one shell >>> fd = open("/dev/tpmrm0", "wb") do rmmod tpm_tis in another shell >>> buf = bytearray(20) >>> fd.write(buf) 20 so I don't see the oops you see on write. However >>> fd.close() And it oopses here in tpm2_del_space James
Re: [PATCH v3 2/2] tpm: in tpm2_del_space check if ops pointer is still valid
On Fri, 2021-02-05 at 00:50 +0100, Lino Sanfilippo wrote: > From: Lino Sanfilippo > > In tpm2_del_space() chip->ops is used for flushing the sessions. > However > this function may be called after tpm_chip_unregister() which sets > the chip->ops pointer to NULL. > Avoid a possible NULL pointer dereference by checking if chip->ops is > still > valid before accessing it. > > Fixes: a3fbfae82b4c ("tpm: take TPM chip power gating out of > tpm_transmit()") > Signed-off-by: Lino Sanfilippo > --- > drivers/char/tpm/tpm2-space.c | 15 ++- > 1 file changed, 10 insertions(+), 5 deletions(-) > > diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2- > space.c > index 784b8b3..9a29a40 100644 > --- a/drivers/char/tpm/tpm2-space.c > +++ b/drivers/char/tpm/tpm2-space.c > @@ -58,12 +58,17 @@ int tpm2_init_space(struct tpm_space *space, > unsigned int buf_size) > > void tpm2_del_space(struct tpm_chip *chip, struct tpm_space *space) > { > - mutex_lock(&chip->tpm_mutex); > - if (!tpm_chip_start(chip)) { > - tpm2_flush_sessions(chip, space); > - tpm_chip_stop(chip); > + down_read(&chip->ops_sem); > + if (chip->ops) { > + mutex_lock(&chip->tpm_mutex); > + if (!tpm_chip_start(chip)) { > + tpm2_flush_sessions(chip, space); > + tpm_chip_stop(chip); > + } > + mutex_unlock(&chip->tpm_mutex); > } > - mutex_unlock(&chip->tpm_mutex); > + up_read(&chip->ops_sem); > + > kfree(space->context_buf); > kfree(space->session_buf); > } Actually, this still isn't right. As I said to the last person who reported this, we should be doing a get/put on the ops, not rolling our own here: https://lore.kernel.org/linux-integrity/e7566e1e48f5be9dca034b4bfb67683b5d3cb88f.ca...@hansenpartnership.com/ The reporter went silent before we could get this tested, but could you try, please, because your patch is still hand rolling the ops get/put, just slightly better than it had been done previously. James
Re: [PATCH] scsi: isci: convert sysfs sprintf/snprintf family to sysfs_emit
On Wed, 2021-02-03 at 16:43 +0800, Jiapeng Chong wrote: > Fix the following coccicheck warning: > > ./drivers/scsi/isci/init.c:140:8-16: WARNING: use scnprintf or > sprintf. > > Reported-by: Abaci Robot > Signed-off-by: Jiapeng Chong > --- > drivers/scsi/isci/init.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c > index c452849..741a98e 100644 > --- a/drivers/scsi/isci/init.c > +++ b/drivers/scsi/isci/init.c > @@ -137,7 +137,7 @@ static ssize_t isci_show_id(struct device *dev, > struct device_attribute *attr, c > struct sas_ha_struct *sas_ha = SHOST_TO_SAS_HA(shost); > struct isci_host *ihost = container_of(sas_ha, typeof(*ihost), > sas_ha); > > - return snprintf(buf, PAGE_SIZE, "%d\n", ihost->id); > + return sysfs_emit(buf, "%d\n", ihost->id); What's the point of doing this change? We'd have to have 13,600 bit integer types before this could ever possibly overflow and the difference between snprintf and scnprintf actually matter from a practical point of view. Perhaps the coccinelle check should be updated to account for these common impossible to overflow situations. James
Re: [PATCH v16 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation
On Tue, 2021-02-02 at 20:15 +0200, Mike Rapoport wrote: > On Tue, Feb 02, 2021 at 03:34:29PM +0100, David Hildenbrand wrote: > > On 02.02.21 15:32, Michal Hocko wrote: > > > On Tue 02-02-21 15:26:20, David Hildenbrand wrote: > > > > On 02.02.21 15:22, Michal Hocko wrote: > > > > > On Tue 02-02-21 15:12:21, David Hildenbrand wrote: > > > > > [...] > > > > > > I think secretmem behaves much more like longterm GUP right > > > > > > now > > > > > > ("unmigratable", "lifetime controlled by user space", > > > > > > "cannot go on > > > > > > CMA/ZONE_MOVABLE"). I'd either want to reasonably well > > > > > > control/limit it or > > > > > > make it behave more like mlocked pages. > > > > > > > > > > I thought I have already asked but I must have forgotten. Is > > > > > there any > > > > > actual reason why the memory is not movable? Timing attacks? > > > > > > > > I think the reason is simple: no direct map, no copying of > > > > memory. > > > > > > This is an implementation detail though and not something > > > terribly hard > > > to add on top later on. I was more worried there would be really > > > fundamental reason why this is not possible. E.g. security > > > implications. > > > > I don't remember all the details. Let's see what Mike thinks > > regarding > > migration (e.g., security concerns). > > Thanks for considering me a security expert :-) > > Yet, I cannot estimate how dangerous is the temporal exposure of > this data to the kernel via the direct map in the simple > map/copy/unmap > sequence. Well the safest security statement is that we never expose the data to the kernel because it's a very clean security statement and easy to enforce. It's also the easiest threat model to analyse. Once we do start exposing the secret to the kernel it alters the threat profile and the analysis and obviously potentially provides the ROP gadget to an attacker to do the same. Instinct tells me that the loss of security doesn't really make up for the ability to swap or migrate but if there were a case for doing the latter, it would have to be a security policy of the user (i.e. a user should be able to decide their data is too sensitive to expose to the kernel). > More secure way would be to map source and destination in a different > page table rather than in the direct map, similarly to the way > text_poke() on x86 does. I think doing this would have much less of an impact on the security posture because it's already theoretically possible to have kmap restore access to the kernel. James > I've left the migration callback empty for now because it can be > added on top and its implementation would depend on the way we do (or > do not do) pooling. >
Re: [PATCH v16 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation
On Fri, 2021-01-29 at 09:23 +0100, Michal Hocko wrote: > On Thu 28-01-21 13:05:02, James Bottomley wrote: > > Obviously the API choice could be revisited > > but do you have anything to add over the previous discussion, or is > > this just to get your access control? > > Well, access control is certainly one thing which I still believe is > missing. But if there is a general agreement that the direct map > manipulation is not that critical then this will become much less of > a problem of course. The secret memory is a scarce resource but it's not a facility that should only be available to some users. > It all boils down whether secret memory is a scarce resource. With > the existing implementation it really is. It is effectivelly > repeating same design errors as hugetlb did. And look now, we have a > subtle and convoluted reservation code to track mmap requests and we > have a cgroup controller to, guess what, have at least some control > over distribution if the preallocated pool. See where am I coming > from? I'm fairly sure rlimit is the correct way to control this. The subtlety in both rlimit and memcg tracking comes from deciding to account under an existing category rather than having our own new one. People don't like new stuff in accounting because it requires modifications to everything in userspace. Accounting under and existing limit keeps userspace the same but leads to endless arguments about which limit it should be under. It took us several patch set iterations to get to a fragile consensus on this which you're now disrupting for reasons you're not making clear. > If the secret memory is more in line with mlock without any imposed > limit (other than available memory) in the end then, sure, using the > same access control as mlock sounds reasonable. Btw. if this is > really just a more restrictive mlock then is there any reason to not > hook this into the existing mlock infrastructure (e.g. > MCL_EXCLUSIVE)? Implications would be that direct map would be > handled on instantiation/tear down paths, migration would deal with > the same (if possible). Other than that it would be mlock like. In the very first patch set we proposed a mmap flag to do this. Under detailed probing it emerged that this suffers from several design problems: the KVM people want VMM to be able to remove the secret memory range from the process; there may be situations where sharing is useful and some people want to be able to seal the operations. All of this ended up convincing everyone that a file descriptor based approach was better than a mmap one. James
Re: Migration to trusted keys: sealing user-provided key?
On Sun, 2021-01-31 at 15:14 +0100, Jan Lübbe wrote: > On Sun, 2021-01-31 at 07:09 -0500, Mimi Zohar wrote: > > On Sat, 2021-01-30 at 19:53 +0200, Jarkko Sakkinen wrote: > > > On Thu, 2021-01-28 at 18:31 +0100, Ahmad Fatoum wrote: > > > > Hello, > > > > > > > > I've been looking into how a migration to using > > > > trusted/encrypted keys would look like (particularly with dm- > > > > crypt). > > > > > > > > Currently, it seems the the only way is to re-encrypt the > > > > partitions because trusted/encrypted keys always generate their > > > > payloads from RNG. > > > > > > > > If instead there was a key command to initialize a new > > > > trusted/encrypted key with a user provided value, users could > > > > use whatever mechanism they used beforehand to get a plaintext > > > > key and use that to initialize a new trusted/encrypted key. > > > > From there on, the key will be like any other trusted/encrypted > > > > key and not be disclosed again to userspace. > > > > > > > > What are your thoughts on this? Would an API like > > > > > > > > keyctl add trusted dmcrypt-key 'set ' # user- > > > > supplied content > > > > > > > > be acceptable? > > > > > > Maybe it's the lack of knowledge with dm-crypt, but why this > > > would be useful? Just want to understand the bottleneck, that's > > > all. > > Our goal in this case is to move away from having the dm-crypt key > material accessible to user-space on embedded devices. For an > existing dm-crypt volume, this key is fixed. A key can be loaded into > user key type and used by dm-crypt (cryptsetup can already do it this > way). But at this point, you can still do 'keyctl read' on that key, > exposing the key material to user space. > > Currently, with both encrypted and trusted keys, you can only > generate new random keys, not import existing key material. > > James Bottomley mentioned in the other reply that the key format will > become compatible with the openssl_tpm2_engine, which would provide a > workaround. This wouldn't work with OP-TEE-based trusted keys (see > Sumit Garg's series), though. Assuming OP-TEE has the same use model as the TPM, someone will eventually realise the need for interoperable key formats between key consumers and then it will work in the same way once the kernel gets updated to speak whatever format they come up with. > > We upstreamed "trusted" & "encrypted" keys together in order to > > address this sort of problem. Instead of directly using a > > "trusted" key for persistent file signatures being stored as > > xattrs, the "encrypted" key provides one level of > > indirection. The "encrypted" key may be encrypted/decrypted with > > either a TPM based "trusted" key or with a "user" type symmetric > > key[1]. > > > > Instead of modifying "trusted" keys, use a "user" type "encrypted" > > key. > > I don't see how this would help. When using dm-crypt with an > encrypted key, I can't use my existing key material. > > Except for the migration aspect, trusted keys seem ideal. Only a > single exported blob needs to be stored and can only be loaded/used > again on the same (trusted) system. Userspace cannot extract the key > material. Yes, that's what I was thinking ... especially when you can add policy to the keys, which includes PCR locking. Part of the problem is that changing policy, which you have to do if something happens to update the PCR values, is technically a migration, so your trusted keys for dm-crypt are really going to have to be migrateable. > To get to this point on systems in the field without re-encryption of > the whole storage, only the initial trusted/encrypted key creation > would need to allow passing in existing key material. What about a third option: why not make dm-crypt store the master key it uses as an encrypted key (if a parent trusted key is available)? That way you'd be able to extract the encrypted form of the key as root, but wouldn't be able to extract the actual master key. James
Re: [PATCH] tpm_tis: Add missing start/stop_tpm_chip calls
On Sat, 2021-01-30 at 19:36 -0800, Guenter Roeck wrote: > On 1/30/21 4:41 PM, James Bottomley wrote: > > On Sat, 2021-01-30 at 15:49 -0800, Guenter Roeck wrote: > > > On 1/29/21 2:59 PM, Jarkko Sakkinen wrote: > > > > On Tue, Jan 26, 2021 at 04:46:07PM +0100, Łukasz Majczak wrote: > > > > > Hi Jarkko, Guenter > > > > > > > > > > Yes, here are the logs when failure occurs - > > > > > https://gist.github.com/semihalf-majczak-lukasz/1575461f585f1e7fb1e9366b8eceaab9 > > > > > Look for a phrase "TPM returned invalid status" > > > > > > > > > > Guenter - good suggestion - I will try to keep it as tight as > > > > > possible. > > > > > > > > > > Best regards, > > > > > Lukasz > > > > > > > > Is it possible for you try out with linux-next? Thanks. It's a > > > > known issue, which ought to be fixed by now. > > > > > > > > The log message is harmless, it'a warning not panic, and does > > > > not endanger system stability. WARN()'s always dump stack > > > > trace. No oops is happening. > > > > > > > > > > There is a note in the kernel documentation which states: > > > > > > Note that the WARN()-family should only be used for "expected to > > > be unreachable" situations. If you want to warn about "reachable > > > but undesirable" situations, please use the pr_warn()-family of > > > functions. > > > > It fits the definition. The warning only triggers if the access is > > in the wrong locality, which should be impossible, so the warning > > should be unreachable. > > > Thanks a lot for the clarification. So a warning traceback in the > kernel doesn't necessarily suggest that there is a serious problem > that should be fixed; it only means that some code is executed which > should not be reachable (but is otherwise harmless). > > That makes me wonder, though, if it would make sense to mark such > harmless tracebacks differently. The terms "warning" and "harmless" > sound like a bit of a contradiction to me (especially for systems > where panic_on_warn is set). Well, it's not harmless; because it occurs at start of day, it means we clear the ineffective command and use default values and those happen to work fine for the TPM in question, so the problem is pretty much covered up. If it had occurred anywhere else it would result in a loss of the command data with unknown ramifications to user space, possibly leading to a TPM failure. Hopefully this means this is the only place we screwed up, but you can see why a scary warning and stack trace is appropriate: if it triggers, something in the kernel violated the TPM command model. James
Re: [PATCH] tpm_tis: Add missing start/stop_tpm_chip calls
On Sat, 2021-01-30 at 15:49 -0800, Guenter Roeck wrote: > On 1/29/21 2:59 PM, Jarkko Sakkinen wrote: > > On Tue, Jan 26, 2021 at 04:46:07PM +0100, Łukasz Majczak wrote: > > > Hi Jarkko, Guenter > > > > > > Yes, here are the logs when failure occurs - > > > https://gist.github.com/semihalf-majczak-lukasz/1575461f585f1e7fb1e9366b8eceaab9 > > > Look for a phrase "TPM returned invalid status" > > > > > > Guenter - good suggestion - I will try to keep it as tight as > > > possible. > > > > > > Best regards, > > > Lukasz > > > > Is it possible for you try out with linux-next? Thanks. It's a > > known issue, which ought to be fixed by now. > > > > The log message is harmless, it'a warning not panic, and does not > > endanger system stability. WARN()'s always dump stack trace. No > > oops is happening. > > > > There is a note in the kernel documentation which states: > > Note that the WARN()-family should only be used for "expected to > be unreachable" situations. If you want to warn about "reachable > but undesirable" situations, please use the pr_warn()-family of > functions. It fits the definition. The warning only triggers if the access is in the wrong locality, which should be impossible, so the warning should be unreachable. James > It seems to me that "harmless" doesn't really fit the expected > use of WARN(). Should it possibly be converted to pr_warn() ?
[GIT PULL] SCSI fixes for 5.11-rc5
Two minor fixes in drivers. Both changing strings (one in a comment, one in a module help text) with no code impact. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Enzo Matsumiya (1): scsi: qla2xxx: Fix description for parameter ql2xenforce_iocb_limit Valdis Kletnieks (1): scsi: target: iscsi: Fix typo in comment And the diffstat: drivers/scsi/qla2xxx/qla_os.c | 2 +- drivers/target/iscsi/iscsi_target_login.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index f80abe28f35a..0e0fe5b09496 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -42,7 +42,7 @@ MODULE_PARM_DESC(ql2xfulldump_on_mpifail, int ql2xenforce_iocb_limit = 1; module_param(ql2xenforce_iocb_limit, int, S_IRUGO | S_IWUSR); MODULE_PARM_DESC(ql2xenforce_iocb_limit, -"Enforce IOCB throttling, to avoid FW congestion. (default: 0)"); +"Enforce IOCB throttling, to avoid FW congestion. (default: 1)"); /* * CT6 CTX allocation cache diff --git a/drivers/target/iscsi/iscsi_target_login.c b/drivers/target/iscsi/iscsi_target_login.c index 893d1b406c29..1a9c50401bdb 100644 --- a/drivers/target/iscsi/iscsi_target_login.c +++ b/drivers/target/iscsi/iscsi_target_login.c @@ -896,7 +896,7 @@ int iscsit_setup_np( else len = sizeof(struct sockaddr_in); /* -* Set SO_REUSEADDR, and disable Nagel Algorithm with TCP_NODELAY. +* Set SO_REUSEADDR, and disable Nagle Algorithm with TCP_NODELAY. */ if (np->np_network_transport == ISCSI_TCP) tcp_sock_set_nodelay(sock->sk);
Re: Migration to trusted keys: sealing user-provided key?
On Sat, 2021-01-30 at 19:53 +0200, Jarkko Sakkinen wrote: > On Thu, 2021-01-28 at 18:31 +0100, Ahmad Fatoum wrote: > > Hello, > > > > I've been looking into how a migration to using trusted/encrypted > > keys would look like (particularly with dm-crypt). > > > > Currently, it seems the the only way is to re-encrypt the > > partitions because trusted/encrypted keys always generate their > > payloads from RNG. > > > > If instead there was a key command to initialize a new > > trusted/encrypted key with a user provided value, users could use > > whatever mechanism they used beforehand to get a plaintext key and > > use that to initialize a new trusted/encrypted key. From there on, > > the key will be like any other trusted/encrypted key and not be > > disclosed again to userspace. > > > > What are your thoughts on this? Would an API like > > > > keyctl add trusted dmcrypt-key 'set ' # user-supplied > > content > > > > be acceptable? > > Maybe it's the lack of knowledge with dm-crypt, but why this would be > useful? Just want to understand the bottleneck, that's all. There was a recent patch to dm-crypt to add encrypted key support: 27f5411a718c ("dm crypt: support using encrypted keys"). The implementation requires the actual disk encryption master key to be in the payload. Most people don't want to change that key because it involves re-encrypting the whole disk (usually what people mean when they say "key" for dm-crypt is a passphrase that decrypts this master key from a keyslot in the metadata, which is why you can change your passphrase without changing the underlying encryption). However, once we get the trusted key rework upstream, we do have a solution: The key format becomes interoperable with the openssl_tpm2_engine and we can now do seal_tpm2_data on any payload and the kernel will accept it. James
Re: [GIT PULL] tpmdd updates for v5.12-rc1
On Sat, 2021-01-30 at 19:15 +0200, Jarkko Sakkinen wrote: > On Thu, Jan 28, 2021 at 07:38:21PM -0800, Linus Torvalds wrote: > > On Thu, Jan 28, 2021 at 4:54 PM Jarkko Sakkinen > > wrote: > > > This contains bug fixes for tpm_tis driver, which had a racy wait > > > for hardware state change to be ready to send a command to the > > > TPM chip. The bug has existed already since 2006, but has only > > > made itself known in recent past. > > > > Hmm. Is this for the next merge window? The subject line implies > > that, as does the addition of the cr50 driver. > > > > But the commentary about fixes implies that at least part of it > > should be in 5.11? > > This was meant for 5.12 but the timing was *way* too early. I'll take > this one back. Just to unambiguity reasons I'll use tpmdd-next-v5.12- > rc1-v2 tag for my final v5.12 PR, once I send it. > > I considered a bit, and I really think that it would make a lot of > sense to do a late 5.11 just containing the two commits from James, > namely: > > 1. tpm_tis: Fix check_locality for correct locality acquisition > 2. tpm_tis: Clean up locality release > > James: Does this make sense to you? Yes, that's fine with me. It will quiet the warning we've had several bug reports about, so it's definitely a bug fix. James
Re: [PATCH v16 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation
On Thu, 2021-01-28 at 14:01 +0100, Michal Hocko wrote: > On Thu 28-01-21 11:22:59, Mike Rapoport wrote: [...] > > I like the idea to have a pool as an optimization rather than a > > hard requirement but I don't see why would it need a careful access > > control. As the direct map fragmentation is not necessarily > > degrades the performance (and even sometimes it actually improves > > it) and even then the degradation is small, trying a PMD_ORDER > > allocation for a pool and then falling back to 4K page may be just > > fine. > > Well, as soon as this is a scarce resource then an access control > seems like a first thing to think of. Maybe it is not really > necessary but then this should be really justified. The control for the resource is effectively the rlimit today. I don't think dividing the world into people who can and can't use secret memory would be useful since the design is to be usable for anyone who might have a secret to keep; it would become like the kvm group permissions: something which is theoretically an access control but which in practise is given to everyone on the system. > I am also still not sure why this whole thing is not just a > ramdisk/ramfs which happens to unmap its pages from the direct > map. Wouldn't that be a much more easier model to work with? You > would get an access control for free as well. The original API was a memfd which does have this access control as well. However, the decision was made after much discussion to go with a new system call instead. Obviously the API choice could be revisited but do you have anything to add over the previous discussion, or is this just to get your access control? James
Re: [PATCH v16 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation
On Thu, 2021-01-28 at 14:01 +0100, Michal Hocko wrote: > On Thu 28-01-21 11:22:59, Mike Rapoport wrote: [...] > > One of the major pushbacks on the first RFC [1] of the concept was > > about the direct map fragmentation. I tried really hard to find > > data that shows what is the performance difference with different > > page sizes in the direct map and I didn't find anything. > > > > So presuming that large pages do provide advantage the first > > implementation of secretmem used PMD_ORDER allocations to amortise > > the effect of the direct map fragmentation and then handed out 4k > > pages at each fault. In addition there was an option to reserve a > > finite pool at boot time and limit secretmem allocations only to > > that pool. > > > > At some point David suggested to use CMA to improve overall > > flexibility [3], so I switched secretmem to use CMA. > > > > Now, with the data we have at hand (my benchmarks and Intel's > > report David mentioned) I'm even not sure this whole pooling even > > required. > > I would still like to understand whether that data is actually > representative. With some underlying reasoning rather than I have run > these XYZ benchmarks and numbers do not look terrible. My theory, and the reason I made Mike run the benchmarks, is that our fear of TLB miss has been alleviated by CPU speculation advances over the years. You can appreciate this if you think that both Intel and AMD have increased the number of levels in the page table to accommodate larger virtual memory size 5 instead of 3. That increases the length of the page walk nearly 2x in a physical system and even more in a virtual system. Unless this were massively optimized, systems would have slowed down significantly. Using 2M pages only eliminates one level and 2G pages eliminates 2, so I theorized that actually fragmentation wouldn't be the significant problem we once thought it was and asked Mike to benchmark it. The benchmarks show that indeed, it isn't a huge change in the data TLB miss time, I suspect because data is nicely continuous nowadays and the prediction that goes into the CPU optimizations quite easy. ITLB fragmentation actually seems to be quite a bit worse, likely because we still don't have branch prediction down to an exact science. James
Re: [PATCH] tpm_tis: Add missing start/stop_tpm_chip calls
On Tue, 2021-01-26 at 16:46 +0100, Łukasz Majczak wrote: > Hi Jarkko, Guenter > > Yes, here are the logs when failure occurs - > https://gist.github.com/semihalf-majczak-lukasz/1575461f585f1e7fb1e9366b8eceaab9 > Look for a phrase "TPM returned invalid status" We've had other reports of this: https://lore.kernel.org/linux-integrity/ghsgagsnag@gouders.net/ https://lore.kernel.org/linux-integrity/374e918c-f167-9308-2bea-ae6bc6a3d2e3@elloe.vision/ The problem is some TIS TPMs don't begin in the correct locality so we have to set it. When I proposed the check, I also proposed a fix for this problem: https://lore.kernel.org/linux-integrity/20201001180925.13808-5-james.bottom...@hansenpartnership.com/ But it's part of a series that never went upstream. Part of the reason was Jarkko proposed the get/put patch to fix this instead, but that never went upstream either. We need to decide an approach and apply one or other fixes. James
[GIT PULL] SCSI fixes for 5.11-rc4
Twelve minor fixes, all in drivers or doc. Most of the fixes are pretty obvious (although we have 2 goes to get the UFS sysfs doc right) and the biggest change is in the ufs driver which they've extensively tested. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Adrian Hunter (1): scsi: docs: ABI: sysfs-driver-ufs: Add DeepSleep power mode Arnd Bergmann (1): scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression Brian King (1): scsi: ibmvfc: Set default timeout to avoid crash during migration Dinghao Liu (1): scsi: fnic: Fix memleak in vnic_dev_init_devcmd2 Jaegeuk Kim (2): scsi: ufs: Fix tm request when non-fatal error happens scsi: ufs: Fix livelock of ufshcd_clear_ua_wluns() Javed Hasan (1): scsi: libfc: Avoid invoking response handler twice if ep is already completed Lukas Bulwahn (1): scsi: docs: ABI: sysfs-driver-ufs: Rectify table formatting Martin Wilck (1): scsi: scsi_transport_srp: Don't block target in failfast state Randy Dunlap (1): scsi: ufs: ufshcd-pltfrm depends on HAS_IOMEM Shin'ichiro Kawasaki (1): scsi: target: tcmu: Fix use-after-free of se_cmd->priv Tyrel Datwyler (1): scsi: ibmvfc: Fix missing cast of ibmvfc_event pointer to u64 handle And the diffstat: Documentation/ABI/testing/sysfs-driver-ufs | 36 ++--- drivers/scsi/fnic/vnic_dev.c | 8 --- drivers/scsi/ibmvscsi/ibmvfc.c | 8 --- drivers/scsi/libfc/fc_exch.c | 16 +++-- drivers/scsi/megaraid/megaraid_sas_base.c | 6 ++--- drivers/scsi/scsi_transport_srp.c | 9 +++- drivers/scsi/ufs/Kconfig | 1 + drivers/scsi/ufs/ufshcd.c | 37 -- drivers/target/target_core_user.c | 11 ++--- 9 files changed, 90 insertions(+), 42 deletions(-) With full diff below. James --- diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs index adc0d0e91607..75ccc5c62b3c 100644 --- a/Documentation/ABI/testing/sysfs-driver-ufs +++ b/Documentation/ABI/testing/sysfs-driver-ufs @@ -916,21 +916,25 @@ Date: September 2014 Contact: Subhash Jadavani Description: This entry could be used to set or show the UFS device runtime power management level. The current driver - implementation supports 6 levels with next target states: + implementation supports 7 levels with next target states: == - 0 an UFS device will stay active, an UIC link will + 0 UFS device will stay active, UIC link will stay active - 1 an UFS device will stay active, an UIC link will + 1 UFS device will stay active, UIC link will hibernate - 2 an UFS device will moved to sleep, an UIC link will + 2 UFS device will be moved to sleep, UIC link will stay active - 3 an UFS device will moved to sleep, an UIC link will + 3 UFS device will be moved to sleep, UIC link will hibernate - 4 an UFS device will be powered off, an UIC link will + 4 UFS device will be powered off, UIC link will hibernate - 5 an UFS device will be powered off, an UIC link will + 5 UFS device will be powered off, UIC link will be powered off + 6 UFS device will be moved to deep sleep, UIC link + will be powered off. Note, deep sleep might not be + supported in which case this value will not be + accepted == What: /sys/bus/platform/drivers/ufshcd/*/rpm_target_dev_state @@ -954,21 +958,25 @@ Date: September 2014 Contact: Subhash Jadavani Description: This entry could be used to set or show the UFS device system power management level. The current driver - implementation supports 6 levels with next target states: + implementation supports 7 levels with next target states: == - 0 an UFS device will stay active, an UIC link will + 0 UFS device will stay active, UIC link will stay active - 1 an UFS device will stay active, an UIC link will + 1 UFS device will stay active, UIC link will hibernate - 2 an UFS device will moved to sleep, an UIC link will + 2 UFS device will be moved to sleep, UIC l
Re: [PATCH] drivers/scsi/qla4xxx: use scnprintf() instead of snprintf()
On Wed, 2021-01-20 at 15:22 +0800, Jiapeng Zhong wrote: > Fix the following coccicheck warning: > > ./drivers/scsi/qla4xxx/ql4_attr.c: WARNING: use scnprintf or > sprintf > > The snprintf() function returns the number of characters which would > have been printed if there were enough space, but the scnprintf() > returns the number of characters which were actually printed. If > the buffer is not large enough, then using snprintf() would result > in a read overflow and an information leak. > > Reported-by: Abaci Robot > Signed-off-by: Jiapeng Zhong > --- > drivers/scsi/qla4xxx/ql4_attr.c | 22 +++--- > 1 file changed, 11 insertions(+), 11 deletions(-) > > diff --git a/drivers/scsi/qla4xxx/ql4_attr.c > b/drivers/scsi/qla4xxx/ql4_attr.c > index ec43528..1a16017 100644 > --- a/drivers/scsi/qla4xxx/ql4_attr.c > +++ b/drivers/scsi/qla4xxx/ql4_attr.c > @@ -170,7 +170,7 @@ void qla4_8xxx_free_sysfs_attr(struct > scsi_qla_host *ha) > char *buf) > { > struct scsi_qla_host *ha = to_qla_host(class_to_shost(dev)); > - return snprintf(buf, PAGE_SIZE, "%s\n", ha->serial_number); > + return scnprintf(buf, PAGE_SIZE, "%s\n", ha->serial_number); This is the wrong ABI to be replacing anything sysfs with, it should be sysfs_emit() James
[GIT PULL] SCSI fixes for 5.11-rc3
Nine minor fixes, 7 in drivers and 2 in the core SCSI disk driver (sd) which should be harmless involving removing an unused variable and quietening a spurious warning. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Can Guo (1): scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback Colin Ian King (1): scsi: mpt3sas: Fix spelling mistake in Kconfig "compatiblity" -> "compatibility" Dinghao Liu (1): scsi: scsi_debug: Fix memleak in scsi_debug_init() Ewan D. Milne (1): scsi: sd: Suppress spurious errors when WRITE SAME is being disabled Kiwoong Kim (1): scsi: ufs: Relocate flush of exceptional event Lukas Bulwahn (1): scsi: sd: Remove obsolete variable in sd_remove() Nilesh Javali (1): scsi: qedi: Correct max length of CHAP secret Stanley Chu (2): scsi: ufs: Relax the condition of UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL scsi: ufs: Fix possible power drain during system suspend And the diffstat: drivers/scsi/mpt3sas/Kconfig | 2 +- drivers/scsi/qedi/qedi_main.c | 4 ++-- drivers/scsi/scsi_debug.c | 5 +++-- drivers/scsi/sd.c | 6 +++--- drivers/scsi/ufs/ufshcd.c | 24 ++-- 5 files changed, 19 insertions(+), 22 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/mpt3sas/Kconfig b/drivers/scsi/mpt3sas/Kconfig index 86209455172d..c299f7e078fb 100644 --- a/drivers/scsi/mpt3sas/Kconfig +++ b/drivers/scsi/mpt3sas/Kconfig @@ -79,5 +79,5 @@ config SCSI_MPT2SAS select SCSI_MPT3SAS depends on PCI && SCSI help - Dummy config option for backwards compatiblity: configure the MPT3SAS + Dummy config option for backwards compatibility: configure the MPT3SAS driver instead. diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c index f5fc7f518f8a..47ad64b06623 100644 --- a/drivers/scsi/qedi/qedi_main.c +++ b/drivers/scsi/qedi/qedi_main.c @@ -2245,7 +2245,7 @@ qedi_show_boot_tgt_info(struct qedi_ctx *qedi, int type, chap_name); break; case ISCSI_BOOT_TGT_CHAP_SECRET: - rc = sprintf(buf, "%.*s\n", NVM_ISCSI_CFG_CHAP_NAME_MAX_LEN, + rc = sprintf(buf, "%.*s\n", NVM_ISCSI_CFG_CHAP_PWD_MAX_LEN, chap_secret); break; case ISCSI_BOOT_TGT_REV_CHAP_NAME: @@ -2253,7 +2253,7 @@ qedi_show_boot_tgt_info(struct qedi_ctx *qedi, int type, mchap_name); break; case ISCSI_BOOT_TGT_REV_CHAP_SECRET: - rc = sprintf(buf, "%.*s\n", NVM_ISCSI_CFG_CHAP_NAME_MAX_LEN, + rc = sprintf(buf, "%.*s\n", NVM_ISCSI_CFG_CHAP_PWD_MAX_LEN, mchap_secret); break; case ISCSI_BOOT_TGT_FLAGS: diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 24c0f7ec0351..4a08c450b756 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -6740,7 +6740,7 @@ static int __init scsi_debug_init(void) k = sdeb_zbc_model_str(sdeb_zbc_model_s); if (k < 0) { ret = k; - goto free_vm; + goto free_q_arr; } sdeb_zbc_model = k; switch (sdeb_zbc_model) { @@ -6753,7 +6753,8 @@ static int __init scsi_debug_init(void) break; default: pr_err("Invalid ZBC model\n"); - return -EINVAL; + ret = -EINVAL; + goto free_q_arr; } } if (sdeb_zbc_model != BLK_ZONED_NONE) { diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 679c2c025047..a3d2d4bc4a3d 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -984,8 +984,10 @@ static blk_status_t sd_setup_write_zeroes_cmnd(struct scsi_cmnd *cmd) } } - if (sdp->no_write_same) + if (sdp->no_write_same) { + rq->rq_flags |= RQF_QUIET; return BLK_STS_TARGET; + } if (sdkp->ws16 || lba > 0x || nr_blocks > 0x) return sd_setup_write_same16_cmnd(cmd, false); @@ -3510,10 +3512,8 @@ static int sd_probe(struct device *dev) static int sd_remove(struct device *dev) { struct scsi_disk *sdkp; - dev_t devt; sdkp = dev_get_drvdata(dev); - devt = disk_devt(sdkp->disk); scsi_autopm_get_device(sdkp->device); async_synchronize_full_domain(&scsi_sd_pm_domain); diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 82ad31781bc9..e31d2c5c7b23 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -289,7 +289,8 @@ static inline void ufshcd_wb_config(struct ufs_hba *hba) if (ret)
Re: [PATCH] RDMA: usnic: Fix misuse of sysfs_emit_at
On Fri, 2021-01-15 at 13:23 -0800, Joe Perches wrote: > In commit e28bf1f03b01 ("RDMA: Convert various random sprintf sysfs > _show > uses to sysfs_emit") I mistakenly used len = sysfs_emit_at to > overwrite > the last trailing space of potentially multiple entry output. > > The length of the last sysfs_emit_at call is 1 and it should instead > be > ignored. Do so. > > Fixes: e28bf1f03b01 ("RDMA: Convert various random sprintf sysfs > _show uses to sysfs_emit") > > Reported-by: James Bottomley > Signed-off-by: Joe Perches > --- > drivers/infiniband/hw/usnic/usnic_ib_sysfs.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/infiniband/hw/usnic/usnic_ib_sysfs.c > b/drivers/infiniband/hw/usnic/usnic_ib_sysfs.c > index e59615a4c9d9..fc077855b46c 100644 > --- a/drivers/infiniband/hw/usnic/usnic_ib_sysfs.c > +++ b/drivers/infiniband/hw/usnic/usnic_ib_sysfs.c > @@ -231,7 +231,7 @@ static ssize_t summary_show(struct > usnic_ib_qp_grp *qp_grp, char *buf) > } > } > > - len = sysfs_emit_at(buf, len, "\n"); > + sysfs_emit_at(buf, len, "\n"); /* Overwrite the last > trailing space */ len is the offset of where the next character gets written, isn't it? so if you're overwriting the last character emitted into buf, shouldn't the offset point at that character rather than one beyond it? So sysfs_emit_at(buf, len - 1, "\n"); /* Overwrite the last trailing space */ ? James
Re: [PATCH v4] certs: Add EFI_CERT_X509_GUID support for dbx entries
On Tue, 2020-09-15 at 20:49 -0400, Eric Snowberg wrote: > The Secure Boot Forbidden Signature Database, dbx, contains a list of > now revoked signatures and keys previously approved to boot with UEFI > Secure Boot enabled. The dbx is capable of containing any number of > EFI_CERT_X509_SHA256_GUID, EFI_CERT_SHA256_GUID, and > EFI_CERT_X509_GUID entries. > > Currently when EFI_CERT_X509_GUID are contained in the dbx, the > entries are skipped. > > Add support for EFI_CERT_X509_GUID dbx entries. When a > EFI_CERT_X509_GUID is found, it is added as an asymmetrical key to > the .blacklist keyring. Anytime the .platform keyring is used, the > keys in the .blacklist keyring are referenced, if a matching key is > found, the key will be rejected. > > Signed-off-by: Eric Snowberg If you're using shim, as most of our users are, you have no access to dbx to blacklist certificates. Plus our security envelope includes the Mok variables, so you should also be paying attestion to MokListX (or it's RT equivalent: MokListXRT). If you add this to the patch, we get something that is mechanistically complete and which also allows users to add certs to their Mok blacklist. James
Re: [PATCH] certs: Add EFI_CERT_X509_GUID support for dbx entries
On Wed, 2021-01-13 at 13:40 +, David Howells wrote: > Hi Linus, > > Are you willing to take this between merge windows - or does it need > to wait for the next merge window? It's not technically a bug fix to > the kernel, but it does have a CVE attached to it. > > Note that I've also updated Jarkko's address in his Reviewed-by since > his Intel address no longer works. Sorry, late to the party. I suppose I lost the argument that we shouldn't really be trusting any certs from db when shim is in operation because they're all EFI binary signing ones and will usually simply be the microsoft certificate and possibly an OEM platform one and we're usually pivoting the root of trust to the certificates in the MokList. However, if we are going to do this, we should also be blacklisting the certificates in MokListX which the OS sees through MokListXRT. Since MokListX is an essential piece of our revocation infrastructure it should have been mentioned in the CVE but wasn't for some reason. James
[GIT PULL] SCSI fixes for 5.11-rc2
This is two driver fixes (megaraid_sas and hisi_sas). The megaraid one is a revert of a previous revert of a cpu hotplug fix which exposed a bug in the block layer which has been fixed in this merge window and the hisi_sas performance enhancement comes from switching to interrupt managed completion queues, which depended on the addition of devm_platform_get_irqs_affinity() which is now upstream via the irq tree in the last merge window. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: scsi: hisi_sas: Expose HW queues for v2 hw Martin K. Petersen (1): Revert "Revert "scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug"" And the diffstat: drivers/scsi/hisi_sas/hisi_sas.h| 4 ++ drivers/scsi/hisi_sas/hisi_sas_main.c | 11 + drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 66 +++-- drivers/scsi/megaraid/megaraid_sas_base.c | 39 + drivers/scsi/megaraid/megaraid_sas_fusion.c | 29 +++-- 5 files changed, 123 insertions(+), 26 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h index 2b28dd405600..e821dd32dd28 100644 --- a/drivers/scsi/hisi_sas/hisi_sas.h +++ b/drivers/scsi/hisi_sas/hisi_sas.h @@ -14,6 +14,7 @@ #include #include #include +#include #include #include #include @@ -294,6 +295,7 @@ enum { struct hisi_sas_hw { int (*hw_init)(struct hisi_hba *hisi_hba); + int (*interrupt_preinit)(struct hisi_hba *hisi_hba); void (*setup_itct)(struct hisi_hba *hisi_hba, struct hisi_sas_device *device); int (*slot_index_alloc)(struct hisi_hba *hisi_hba, @@ -393,6 +395,8 @@ struct hisi_hba { u32 refclk_frequency_mhz; u8 sas_addr[SAS_ADDR_SIZE]; + int *irq_map; /* v2 hw */ + int n_phy; spinlock_t lock; struct semaphore sem; diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index b6d4419c32f2..cf0bfac920a8 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -2614,6 +2614,13 @@ static struct Scsi_Host *hisi_sas_shost_alloc(struct platform_device *pdev, return NULL; } +static int hisi_sas_interrupt_preinit(struct hisi_hba *hisi_hba) +{ + if (hisi_hba->hw->interrupt_preinit) + return hisi_hba->hw->interrupt_preinit(hisi_hba); + return 0; +} + int hisi_sas_probe(struct platform_device *pdev, const struct hisi_sas_hw *hw) { @@ -2671,6 +2678,10 @@ int hisi_sas_probe(struct platform_device *pdev, sha->sas_port[i] = &hisi_hba->port[i].sas_port; } + rc = hisi_sas_interrupt_preinit(hisi_hba); + if (rc) + goto err_out_ha; + rc = scsi_add_host(shost, &pdev->dev); if (rc) goto err_out_ha; diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c index b57177b52fac..9adfdefef9ca 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c @@ -3302,6 +3302,28 @@ static irq_handler_t fatal_interrupts[HISI_SAS_FATAL_INT_NR] = { fatal_axi_int_v2_hw }; +#define CQ0_IRQ_INDEX (96) + +static int hisi_sas_v2_interrupt_preinit(struct hisi_hba *hisi_hba) +{ + struct platform_device *pdev = hisi_hba->platform_dev; + struct Scsi_Host *shost = hisi_hba->shost; + struct irq_affinity desc = { + .pre_vectors = CQ0_IRQ_INDEX, + .post_vectors = 16, + }; + int resv = desc.pre_vectors + desc.post_vectors, minvec = resv + 1, nvec; + + nvec = devm_platform_get_irqs_affinity(pdev, &desc, minvec, 128, + &hisi_hba->irq_map); + if (nvec < 0) + return nvec; + + shost->nr_hw_queues = hisi_hba->cq_nvecs = nvec - resv; + + return 0; +} + /* * There is a limitation in the hip06 chipset that we need * to map in all mbigen interrupts, even if they are not used. @@ -3310,14 +3332,11 @@ static int interrupt_init_v2_hw(struct hisi_hba *hisi_hba) { struct platform_device *pdev = hisi_hba->platform_dev; struct device *dev = &pdev->dev; - int irq, rc = 0, irq_map[128]; + int irq, rc = 0; int i, phy_no, fatal_no, queue_no; - for (i = 0; i < 128; i++) - irq_map[i] = platform_get_irq(pdev, i); - for (i = 0; i < HISI_SAS_PHY_INT_NR; i++) { - irq = irq_map[i + 1]; /* Phy up/down is irq1 */ + irq = hisi_hba->irq_map[i + 1]; /* Phy up/down is irq1 */ rc = devm_request_irq(dev, irq, phy_interrupts[i], 0, DRV_NAME " phy", hisi_hba); if (rc) { @@ -3331,7 +3350,7 @@ static int interrupt_init_
Re: scsi: Add diagnostic log for scsi device reset
On Thu, 2021-01-07 at 19:43 +0800, lijinlin wrote: [...] > - SCSI_LOG_ERROR_RECOVERY(3, > + if (bdr_scmd->request && bdr_scmd->request->rq_disk) > sdev_printk(KERN_INFO, sdev, > - "%s: Sending BDR\n", current- > >comm)); > + "[%s] Sending device reset\n", > + bdr_scmd->request->rq_disk- > >disk_name); Not everything is a SCSI disk. If we apply this patch, we lose traces for any non-disk device that get reset ... for tapes this can be really important to know. James
Re: [PATCH 2/3] scsi: megaraid_sas: check user-provided offsets
On Sun, 2021-01-03 at 19:49 +0100, Arnd Bergmann wrote: > On Sun, Jan 3, 2021 at 6:00 PM James Bottomley > wrote: > > On Sun, 2021-01-03 at 17:26 +0100, Arnd Bergmann wrote: > > [...] > > > @@ -8209,7 +8208,7 @@ megasas_mgmt_fw_ioctl(struct > > > megasas_instance > > > *instance, > > > if (instance->consistent_mask_64bit) > > > put_unaligned_le64(sense_handle, > > > sense_ptr); > > > else > > > - put_unaligned_le32(sense_handle, > > > sense_ptr); > > > + put_unaligned_le64(sense_handle, > > > sense_ptr); > > > } > > > > This hunk can't be right. It effectively means removing the if. > > I'm just trying to restore the state before the regression introduced > in my 381d34e376e3 ("scsi: megaraid_sas: Check user-provided > offsets"). > > The old code always stored 'sizeof(long)' bytes into sense_ptr, > regardless of instance->consistent_mask_64bit, but it would truncate > the address to 32 bit if that was cleared. This was clearly bogus > and I tried to make it do something more meaningful, only storing > 8 bytes into the structure if it was configured for 64-bit DMA, > regardless of the capabilities of the kernel. Heh, well, all this depends on how the firmware interprets the pointer, for which we don't seem to have a manual. Instinct tells me the flag MFI_FRAME_SENSE64 is what does this and that's conditioned on the same if clause 100 lines above this, so the fix your proposing would still seem to be wrong, because I think when that flag is not set, the device expects the sense pointer to be 32 bit. > > However, the if is needed because sense_handle is a dma_addr_t > > which can be either 32 or 64 bit. What about changing the if to > > > > if (sizeof(dma_addr_t) == 8) > > > > instead? > > That would not be useful either, the device surely does not care > if the kernel supports 64-bit DMA. What we'd really need here is > someone with access to the interface specifications to see how > many bytes should be stored in the structure. I suspect always > storing 64 bits (as my patch does) is correct, and would send a > proper patch to remove the if() if Phil confirms that my test > patch fixes the regression. Well, as I said above, I'm speculating the device does what we tell it, and whether to use 32 or 64 bits for the sense pointer definitely seems to be a flag the driver controls ... we really need someone with access to the programming manual to tell us if this speculation is accurate, though. James
Re: [PATCH 2/3] scsi: megaraid_sas: check user-provided offsets
On Sun, 2021-01-03 at 17:26 +0100, Arnd Bergmann wrote: [...] > @@ -8209,7 +8208,7 @@ megasas_mgmt_fw_ioctl(struct megasas_instance > *instance, > if (instance->consistent_mask_64bit) > put_unaligned_le64(sense_handle, sense_ptr); > else > - put_unaligned_le32(sense_handle, sense_ptr); > + put_unaligned_le64(sense_handle, sense_ptr); > } This hunk can't be right. It effectively means removing the if. However, the if is needed because sense_handle is a dma_addr_t which can be either 32 or 64 bit. What about changing the if to if (sizeof(dma_addr_t) == 8) instead? James
Re: [PATCH] cxgb4: fix TLS dependencies again
On Sun, 2021-01-03 at 15:01 +0100, Arnd Bergmann wrote: > From: Arnd Bergmann > > A previous patch tried to avoid a build failure, but missed one case > that Kconfig warns about: > > WARNING: unmet direct dependencies detected for CHELSIO_T4 > Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] && > NET_VENDOR_CHELSIO [=y] && PCI [=y] && (IPV6 [=y] || IPV6 [=y]=n) && > (TLS [=m] || TLS [=m]=n) > Selected by [y]: > - SCSI_CXGB4_ISCSI [=y] && SCSI_LOWLEVEL [=y] && SCSI [=y] && PCI > [=y] && INET [=y] && (IPV6 [=y] || IPV6 [=y]=n) && ETHERNET [=y] > x86_64-linux-ld: drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.o: in > function `cxgb_select_queue': > cxgb4_main.c:(.text+0xf5df): undefined reference to > `tls_validate_xmit_skb' > > When any of the dependencies of CHELSIO_T4 are not met, then > SCSI_CXGB4_ISCSI must not 'select' it either. > > Fix it by mirroring the network driver dependencies on the iscsi > driver. A more invasive but also more reliable alternative would > be to use 'depends on CHELSIO_T4' instead. > > Fixes: 659fbdcf2f14 ("cxgb4: Fix build failure when CONFIG_TLS=m") > Signed-off-by: Arnd Bergmann > --- > drivers/scsi/cxgbi/cxgb4i/Kconfig | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/scsi/cxgbi/cxgb4i/Kconfig > b/drivers/scsi/cxgbi/cxgb4i/Kconfig > index 8b0deece9758..2af88a55fbca 100644 > --- a/drivers/scsi/cxgbi/cxgb4i/Kconfig > +++ b/drivers/scsi/cxgbi/cxgb4i/Kconfig > @@ -1,7 +1,7 @@ > # SPDX-License-Identifier: GPL-2.0-only > config SCSI_CXGB4_ISCSI > tristate "Chelsio T4 iSCSI support" > - depends on PCI && INET && (IPV6 || IPV6=n) > + depends on PCI && INET && (IPV6 || IPV6=n) && (TLS || TLS=n) > depends on THERMAL || !THERMAL > depends on ETHERNET > depends on TLS || TLS=n I thought all separated depends statements were the equivalent of && in a single statement. If so, how does repeating TLS || TLS=n twice fix the problem? This sounds more like we have some sort of bug in our Kconfig apparatus. James
Re: [GIT PULL] SCSI fixes for 5.11-rc1
On Fri, 2021-01-01 at 13:21 -0800, Linus Torvalds wrote: > On Fri, Jan 1, 2021 at 12:19 PM James Bottomley > wrote: > > Originally this change was slated for the merge window but a late > > arriving build problem with CONFIG_PM=n derailed that. > > So I've pulled this, Thanks! > but we need to have a policy for reverting this > quickly if it turns out to cause problems. Sure, we'd have to revert the six patches plus the dependent PM fix. > I'm not worried about any remaining build issues - but I'm simply > worried about some missed case where code depended on the block layer > passing commands through even while suspended. > > The block bits would seem affect non-SCSI stuff too, how extensively > have any random odd special case been tested? The block bits have been in -next since 7 December, so it has had some testing. That said, REQ_PREEMPT doesn't affect much outside of SCSI and IDE because we're where the original concept of at head insertion for "special" error handling commands like request sense came from. I'd expect the biggest field of potential problems to be in USB which does both block/SCSI and PM. That should have been well tested by the PM people (as far as they can, as you know there are tons of non spec devices out there in the space). > So I'm not so much with you on the "the scary case is the spi domain > validation case". I'm more about "what about all the other random > cases for random special drivers" I picked on that because it's the least likely to get any testing for a while. However, you're right, there are other potential problem consumers but hopefully any problem (if there is one) with the rest will show up quickly. James
[GIT PULL] SCSI fixes for 5.11-rc1
This is a load of driver fixes (12 ufs, 1 mpt3sas, 1 cxgbi). The big core two fixes are for power management ("block: Do not accept any requests while suspended" and "block: Fix a race in the runtime power management code") which finally sorts out the resume problems we've occasionally been having. To make the resume fix, there are seven necessary precursors which effectively renames REQ_PREEMPT to REQ_PM, so every "special" request in block is automatically a power management exempt one. All of the non-PM preempt cases are removed except for the one in the SCSI Parallel Interface (spi) domain validation which is a genuine case where we have to run requests at high priority to validate the bus so this becomes an autopm get/put protected request. Originally this change was slated for the merge window but a late arriving build problem with CONFIG_PM=n derailed that. However, we still need the change to fix the PM suspend bugs. You can validate the REQ_PREEMPT -> REQ_PM nature of the six transformational patches by inspecting the output of git diff fa4d0f1992a9..a4d34da715e3 The biggest risk in this code is the spi domain validation change, so I've tested it extensively on my several parallel systems here without incident and it's now run for over a week in -next on either side of the merge window. Our backport for stable plan is currently to see if we can get the lion's share of the power management problems with the first pm fix since the second one depends on the REQ_PM rename. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Adrian Hunter (4): scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel controllers scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff() scsi: ufs-pci: Fix restore from S4 for Intel controllers Alan Stern (1): scsi: block: Do not accept any requests while suspended Bart Van Assche (7): scsi: block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE scsi: scsi_transport_spi: Set RQF_PM for domain validation commands scsi: ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT scsi: ide: Do not set the RQF_PREEMPT flag for sense requests scsi: block: Introduce BLK_MQ_REQ_PM scsi: block: Fix a race in the runtime power management code Bean Huo (2): scsi: ufs: Fix wrong print message in dev_err() scsi: ufs: Remove unused macro definition POWER_DESC_MAX_SIZE Dan Carpenter (1): scsi: mpt3sas: Signedness bug in _base_get_diag_triggers() Randall Huang (1): scsi: ufs: Clear UAC for RPMB after ufshcd resets Randy Dunlap (1): scsi: cxgb4i: Fix TLS dependency Stanley Chu (4): scsi: ufs: Un-inline ufshcd_vops_device_reset function scsi: ufs: Re-enable WriteBooster after device reset scsi: ufs-mediatek: Keep VCC always-on for specific devices scsi: ufs: Allow regulators being always-on Zhen Lei (1): scsi: ufs-mediatek: Use correct path to fix compile error And the diffstat: block/blk-core.c | 13 --- block/blk-mq-debugfs.c| 1 - block/blk-mq.c| 4 +- block/blk-pm.c| 15 --- block/blk-pm.h| 14 --- drivers/ide/ide-atapi.c | 1 - drivers/ide/ide-io.c | 7 +--- drivers/ide/ide-pm.c | 2 +- drivers/scsi/cxgbi/cxgb4i/Kconfig | 1 + drivers/scsi/mpt3sas/mpt3sas_base.c | 2 +- drivers/scsi/scsi_lib.c | 27 ++--- drivers/scsi/scsi_transport_spi.c | 27 + drivers/scsi/ufs/ufs-mediatek-trace.h | 2 +- drivers/scsi/ufs/ufs-mediatek.c | 21 ++ drivers/scsi/ufs/ufs-mediatek.h | 1 + drivers/scsi/ufs/ufs.h| 2 +- drivers/scsi/ufs/ufshcd-pci.c | 73 ++- drivers/scsi/ufs/ufshcd.c | 45 + drivers/scsi/ufs/ufshcd.h | 14 +++ include/linux/blk-mq.h| 4 +- include/linux/blkdev.h| 18 ++--- 21 files changed, 208 insertions(+), 86 deletions(-) With full diff below James --- diff --git a/block/blk-core.c b/block/blk-core.c index 2db8bda43b6e..2d53e2ff48ff 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include #include @@ -424,11 +425,11 @@ EXPORT_SYMBOL(blk_cleanup_queue); /** * blk_queue_enter() - try to increase q->q_usage_counter * @q: request queue pointer - * @flags: BLK_MQ_REQ_NOWAIT and/or BLK_MQ_REQ_PREEMPT + * @flags: BLK_MQ_REQ_NOWAIT and/or BLK_MQ_REQ_PM */ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flag
Re: [PATCH -next] tpm: Use kzalloc for allocating only one thing
On Tue, 2020-12-29 at 21:51 +0800, Zheng Yongjun wrote: > Use kzalloc rather than kcalloc(1,...) > > The semantic patch that makes this change is as follows: > (http://coccinelle.lip6.fr/) What's the reason for wanting to do this transformation? > drivers/char/tpm/tpm1-cmd.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/char/tpm/tpm1-cmd.c b/drivers/char/tpm/tpm1- > cmd.c > index ca7158fa6e6c..4d8415e3b778 100644 > --- a/drivers/char/tpm/tpm1-cmd.c > +++ b/drivers/char/tpm/tpm1-cmd.c > @@ -794,7 +794,7 @@ int tpm1_pm_suspend(struct tpm_chip *chip, u32 > tpm_suspend_pcr) > */ > int tpm1_get_pcr_allocation(struct tpm_chip *chip) > { > - chip->allocated_banks = kcalloc(1, sizeof(*chip- > >allocated_banks), > + chip->allocated_banks = kzalloc(sizeof(*chip->allocated_banks), > GFP_KERNEL); > if (!chip->allocated_banks) > return -ENOMEM; The reason tpm1 has this is because it mirrors the allocation in tpm2 so we retain code consistency. It's a fairly minor advantage, so it could be changed if you have a better rationale ... but what is it? James
Re: [PATCH v1 0/6] no-copy bvec
On Wed, 2020-12-23 at 15:23 -0500, Douglas Gilbert wrote: > On 2020-12-23 11:04 a.m., James Bottomley wrote: > > On Wed, 2020-12-23 at 15:51 +, Christoph Hellwig wrote: > > > On Wed, Dec 23, 2020 at 12:52:59PM +, Pavel Begunkov wrote: > > > > Can scatterlist have 0-len entries? Those are directly > > > > translated into bvecs, e.g. in nvme/target/io-cmd-file.c and > > > > target/target_core_file.c. I've audited most of others by this > > > > moment, they're fine. > > > > > > For block layer SGLs we should never see them, and for nvme > > > neither. I think the same is true for the SCSI target code, but > > > please double check. > > > > Right, no-one ever wants to see a 0-len scatter list entry. The > > reason is that every driver uses the sgl to program the device DMA > > engine in the way NVME does. a 0 length sgl would be a dangerous > > corner case: some DMA engines would ignore it and others would go > > haywire, so if we ever let a 0 length list down into the driver, > > they'd have to understand the corner case behaviour of their DMA > > engine and filter it accordingly, which is why we disallow them in > > the upper levels, since they're effective nops anyway. > > When using scatter gather lists at the far end (i.e. on the storage > device) the T10 examples (WRITE SCATTERED and POPULATE TOKEN in SBC- > 4) explicitly allow the "number of logical blocks" in their sgl_s to > be zero and state that it is _not_ to be considered an error. But that's pretty irrelevant. The scatterlists that block has been constructing to drive DMA engines pre-date SCSI's addition of SGLs by decades (all SCSI commands before the object commands use a linear buffer which is implemented in the HBA engine as a scatterlist but not described by the SCSI standard as one). So the answer to the question should the block layer emit zero length sgl elements is "no" because they can confuse some DMA engines. If there's a more theoretical question of whether the target driver in adding commands it doesn't yet support should inject zero length SGL elements into block because SCSI allows it, the answer is still "no" because we don't want block to have SGLs that may confuse other DMA engines. There's lots of daft corner cases in the SCSI standard we don't implement and a nop for SGL elements seems to be one of the more hare brained because it adds no useful feature and merely causes compatibility issues. James
Re: [PATCH v1 0/6] no-copy bvec
On Wed, 2020-12-23 at 15:51 +, Christoph Hellwig wrote: > On Wed, Dec 23, 2020 at 12:52:59PM +, Pavel Begunkov wrote: > > Can scatterlist have 0-len entries? Those are directly translated > > into bvecs, e.g. in nvme/target/io-cmd-file.c and > > target/target_core_file.c. I've audited most of others by this > > moment, they're fine. > > For block layer SGLs we should never see them, and for nvme neither. > I think the same is true for the SCSI target code, but please double > check. Right, no-one ever wants to see a 0-len scatter list entry. The reason is that every driver uses the sgl to program the device DMA engine in the way NVME does. a 0 length sgl would be a dangerous corner case: some DMA engines would ignore it and others would go haywire, so if we ever let a 0 length list down into the driver, they'd have to understand the corner case behaviour of their DMA engine and filter it accordingly, which is why we disallow them in the upper levels, since they're effective nops anyway. James
Re: [PATCH v2] tpm: Rework open/close/shutdown to avoid races
On Tue, 2020-12-15 at 10:51 -0800, James Bottomley wrote: > On Tue, 2020-12-15 at 16:38 +0300, Sergey Temerkhanov wrote: > > Avoid race condition at shutdown by shutting downn the TPM 2.0 > > devices synchronously. This eliminates the condition when the > > shutdown sequence sets chip->ops to NULL leading to the following: > > > > [ 1586.593561][ T8669] tpm2_del_space+0x28/0x73 > > [ 1586.598718][ T8669] tpmrm_release+0x27/0x33wq > > [ 1586.603774][ T8669] __fput+0x109/0x1d > > [ 1586.608380][ T8669] task_work_run+0x7c/0x90 > > [ 1586.613414][ T8669] prepare_exit_to_usermode+0xb8/0x128 > > [ 1586.619522][ T8669] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [ 1586.626068][ T8669] RIP: 0033:0x4cb4bb > > An actual bug report would have been helpful. However, from this > trace it's easy to deduce that tpm2_del_space() didn't get converted > to the get/put of the chip ops ... it's still trying to do its own > half arsed thing with tpm_chip_start() and the mutex. So isn't a > much simpler fix simply to convert it as below? compile tested only, > but if you can test it out I'll send a proper patch. I got this booted and running here, so I know it works. What I still need to know is does it fix your problem? James
[GIT PULL] first round of SCSI updates for the 5.10+ merge window
This series consists of the usual driver updates (ufs, qla2xxx, smartpqi, target, zfcp, fnic, mpt3sas, ibmvfc) plus a load of cleanups, a major power management rework and a load of assorted minor updates. There are a few core updates (formatting fixes being the big one) but nothing major this cycle. We've got one obvious conflict in qla_nvme.c which is due to us having the same patch (with different commit ids) in upstream and our pull request ("scsi: qla2xxx: Return EBUSY on fcport deletion"). The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc The short changelog is: Adrian Hunter (2): scsi: ufs: Allow an error return value from ->device_reset() scsi: ufs: Add DeepSleep feature Ahmed S. Darwish (12): scsi: NCR5380: Remove context check scsi: myrs: Remove WARN_ON(in_interrupt()) scsi: myrb: Remove WARN_ON(in_interrupt()) scsi: mpt3sas: Remove in_interrupt() scsi: qla4xxx: Remove in_interrupt() from qla4_82xx_rom_lock() scsi: qla4xxx: Remove in_interrupt() from qla4_82xx_idc_lock() scsi: qla2xxx: Remove in_interrupt() from qla83xx-specific code scsi: target: tcm_qla2xxx: Remove BUG_ON(in_interrupt()) scsi: qla2xxx: Remove in_interrupt() from qla82xx-specific code scsi: qla4xxx: Remove in_interrupt() scsi: hisi_sas: Remove preemptible() scsi: pm80xx: Do not sleep in atomic context Arnd Bergmann (7): scsi: ufs: Fix -Wsometimes-uninitialized warning scsi: megaraid_sas: Simplify compat_ioctl handling scsi: megaraid_sas: Check user-provided offsets scsi: aacraid: Improve compat_ioctl handlers scsi: libfc: Fix enum-conversion warning scsi: libfc: Work around -Warray-bounds warning scsi: libfc: Move scsi/fc_encode.h to libfc Arun Easi (5): scsi: qla2xxx: Fix device loss on 4G and older HBAs scsi: qla2xxx: Fix flash update in 28XX adapters on big endian machines scsi: qla2xxx: Fix FW initialization error on big endian machines scsi: qla2xxx: Fix crash during driver load on big endian machines scsi: qla2xxx: Fix compilation issue in PPC systems Asutosh Das (1): scsi: ufs: qcom: Enable aggressive power collapse for ufs HBA Bean Huo (1): scsi: ufs: Remove unnecessary if condition in ufshcd_suspend() Bjorn Andersson (1): scsi: ufs: Adjust logic in common ADAPT helper Bodo Stroesser (1): scsi: target: tcmu: scatter_/gather_data_area() rework Can Guo (7): scsi: ufs: Print host regs in IRQ handler when AH8 error happens scsi: ufs: Fix a race condition between ufshcd_abort() and eh_work() scsi: ufs: Serialize eh_work with system PM events and async scan scsi: ufs: Stop hardcoding the scale down gear scsi: ufs-qcom: Keep core_clk_unipro on while link is active scsi: ufs: Refactor ufshcd_setup_clocks() to remove skip_ref_clk scsi: ufs: Put HBA into LPM during clk gating Colin Ian King (5): scsi: qla4xxx: Remove redundant assignment to variable rval scsi: pm8001: Remove space in a debug message scsi: lpfc: Fix memory leak on lcb_context scsi: lpfc: Remove dead code on second !ndlp check scsi: lpfc: Fix pointer defereference before it is null checked issue Daniel Wagner (1): scsi: qla2xxx: Return EBUSY on fcport deletion David Disseldorp (4): scsi: target: Return COMPARE AND WRITE miscompare offsets scsi: target: Split out COMPARE AND WRITE memcmp into helper scsi: target: Rename cmd.bad_sector to cmd.sense_info scsi: target: Rename struct sense_info to sense_detail Don Brace (3): scsi: smartpqi: Update version to 1.2.16-012 scsi: smartpqi: Correct pqi_sas_smp_handler busy condition scsi: smartpqi: Correct driver removal with HBA disks Eric Biggers (1): scsi: ufs-qcom: Only select QCOM_SCM if SCSI_UFS_CRYPTO Finn Thain (2): scsi: NCR5380: Reduce NCR5380_maybe_release_dma_irq() call sites scsi: atari_scsi: Fix race condition between .queuecommand and EH Gustavo A. R. Silva (9): scsi: target: core: Fix fall-through warnings for Clang scsi: stex: Fix fall-through warnings for Clang scsi: lpfc: Fix fall-through warnings for Clang scsi: csiostor: Fix fall-through warnings for Clang scsi: aha1740: Fix fall-through warnings for Clang scsi: aacraid: Fix fall-through warnings for Clang scsi: bfa: Fix fall-through warnings for Clang scsi: aic94xx: Fix fall-through warnings for Clang scsi: aic7xxx: Fix fall-through warnings for Clang Hannes Reinecke (4): scsi: core: Return BLK_STS_AGAIN for ALUA transitioning scsi: scsi_dh_alua: Set 'transitioning' state on Unit Attention scsi: scsi_dh_alua: Return BLK_STS_AGAIN for ALUA transitioning state scsi: block: Return status code in blk_mq_end_request() Jaegeuk Kim (6): scsi: ufs: Fix clkgating on/off scsi: ufs: Add more con
Re: [PATCH v2] tpm: Rework open/close/shutdown to avoid races
On Tue, 2020-12-15 at 16:38 +0300, Sergey Temerkhanov wrote: > Avoid race condition at shutdown by shutting downn the TPM 2.0 > devices synchronously. This eliminates the condition when the > shutdown sequence sets chip->ops to NULL leading to the following: > > [ 1586.593561][ T8669] tpm2_del_space+0x28/0x73 > [ 1586.598718][ T8669] tpmrm_release+0x27/0x33wq > [ 1586.603774][ T8669] __fput+0x109/0x1d > [ 1586.608380][ T8669] task_work_run+0x7c/0x90 > [ 1586.613414][ T8669] prepare_exit_to_usermode+0xb8/0x128 > [ 1586.619522][ T8669] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 1586.626068][ T8669] RIP: 0033:0x4cb4bb An actual bug report would have been helpful. However, from this trace it's easy to deduce that tpm2_del_space() didn't get converted to the get/put of the chip ops ... it's still trying to do its own half arsed thing with tpm_chip_start() and the mutex. So isn't a much simpler fix simply to convert it as below? compile tested only, but if you can test it out I'll send a proper patch. James --- diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2-space.c index 784b8b3cb903..0c0cd225046f 100644 --- a/drivers/char/tpm/tpm2-space.c +++ b/drivers/char/tpm/tpm2-space.c @@ -58,12 +58,12 @@ int tpm2_init_space(struct tpm_space *space, unsigned int buf_size) void tpm2_del_space(struct tpm_chip *chip, struct tpm_space *space) { - mutex_lock(&chip->tpm_mutex); - if (!tpm_chip_start(chip)) { + + if (tpm_try_get_ops(chip) == 0) { tpm2_flush_sessions(chip, space); - tpm_chip_stop(chip); + tpm_put_ops(chip); } - mutex_unlock(&chip->tpm_mutex); + kfree(space->context_buf); kfree(space->session_buf); }
Re: [PATCH] KVM/SVM: add support for SEV attestation command
On Wed, 2020-12-09 at 21:25 -0600, Brijesh Singh wrote: > Noted, I will send v2 with these fixed. I ran a test on this. It turns out for rome systems you need firmware md_sev_fam17h_model3xh_0.24b0A (or later) installed to get this and the QEMU patch with the base64 decoding fixed, but with that Tested-by: James Bottomley Attached is the test programme I used. James --- #!/usr/bin/python3 ## # Python script get an attestation and verify it with the PEK # # This assumes you've already exported the pek.cert with sev-tool # from https://github.com/AMDESE/sev-tool.git # # sev-tool --export_cert_chain # # creates several files, the only one this script needs is pek.cert # # Tables and chapters refer to the amd 55766.pdf document # # https://www.amd.com/system/files/TechDocs/55766_SEV-KM_API_Specification.pdf ## import sys import os import base64 import hashlib from argparse import ArgumentParser from Crypto.PublicKey import ECC from Crypto.Math.Numbers import Integer from git.qemu.python.qemu import qmp if __name__ == "__main__": parser = ArgumentParser(description='Inject secret into SEV') parser.add_argument('--pek-cert', help='The Platform DH certificate in binary form', default='pek.cert') parser.add_argument('--socket', help='Socket to connect to QMP on, defaults to localhost:6550', default='localhost:6550') args = parser.parse_args() if (args.socket[0] == '/'): socket = args.socket elif (':' in args.socket): s = args.socket.split(':') socket = (s[0], int(s[1])) else: parse.error('--socket must be : or /path/to/unix') fh = open(args.pek_cert, 'rb') pek = bytearray(fh.read()) curve = int.from_bytes(pek[16:20], byteorder='little') curves = { 1: 'p256', 2: 'p384' } Qx = int.from_bytes(bytes(pek[20:92]), byteorder='little') Qy = int.from_bytes(bytes(pek[92:164]), byteorder='little') pubkey = ECC.construct(point_x=Qx, point_y=Qy, curve=curves[curve]) Qmp = qmp.QEMUMonitorProtocol(address=socket); Qmp.connect() caps = Qmp.command('query-sev') print('SEV query found API={api-major}.{api-minor} build={build-id} policy={policy}\n'.format(**caps)) nonce=os.urandom(16) report = Qmp.command('query-sev-attestation-report', mnonce=base64.b64encode(nonce).decode()) a = base64.b64decode(report['data']) ## # returned data is formulated as Table 60. Attestation Report Buffer ## rnonce = a[0:16] rmeas = a[16:48] if (nonce != rnonce): sys.exit('returned nonce doesn\'t match input nonce') policy = int.from_bytes(a[48:52], byteorder='little') usage = int.from_bytes(a[52:56], byteorder='little') algo = int.from_bytes(a[56:60], byteorder='little') if (policy != caps['policy']): sys.exit('Policy mismatch:', policy, '!=', caps['policy']) if (usage != 0x1002): sys.exit('error PEK is not specified in usage: ', usage) if (algo == 0x2): h = hashlib.sha256() elif (algo == 0x102): ## # The spec (6.8) says the signature must be ECDSA-SHA256 so this # should be impossible, but it turns out to be the way our # current test hardware produces its signature ## h = hashlib.sha384() else: sys.exit('unrecognized signing algorithm: ', algo) h.update(a[0:52]) sig = a[64:208] r = int.from_bytes(sig[0:72],byteorder='little') s = int.from_bytes(sig[72:144],byteorder='little') ## # subtlety: r and s are little (AMD defined) z is big (crypto requirement) ## z = int.from_bytes(h.digest(), byteorder='big') ## # python crypto doesn't have a way of passing in r and s as # integers and I'm not inclined to wrap them up as a big endian # binary signature to have Signature.DSS unwrap them again, so # call the _verify() private interface that does take integers ## if (not pubkey._verify(Integer(z), (Integer(r), Integer(s: sys.exit('returned signature did not verify') print('usage={usage}, algorithm={algo}'.format(usage=hex(usage), algo=hex(algo))) print('ovmf-hash: ', rmeas.hex())
[GIT PULL] SCSI fixes for 5.10-rc7
Five small fixes: four in drivers: hisi_sas: fix internal queue timeout, be2iscsi: revert a prior fix causing problems, bnx2i: add missing dependency, storvsc: late arriving revert of a problem fix, and one in the core. The core one is a minor change to stop paying attention to the busy count when returning out of resources because there's a race window where the queue might not restart due to missing returning I/O. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Andrea Parri (Microsoft) (1): Revert "scsi: storvsc: Validate length of incoming packet in storvsc_on_channel_callback()" Dan Carpenter (1): scsi: be2iscsi: Revert "Fix a theoretical leak in beiscsi_create_eqs()" Ming Lei (1): scsi: core: Fix race between handling STS_RESOURCE and completion Randy Dunlap (1): scsi: bnx2i: Requires MMU Xiang Chen (1): scsi: hisi_sas: Select a suitable queue for internal I/Os And the diffstat: drivers/scsi/be2iscsi/be_main.c| 4 ++-- drivers/scsi/bnx2i/Kconfig | 1 + drivers/scsi/hisi_sas/hisi_sas_main.c | 6 ++ drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 5 + drivers/scsi/scsi_lib.c| 3 +-- drivers/scsi/storvsc_drv.c | 5 - 6 files changed, 15 insertions(+), 9 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c index 202ba925c494..5c3513a4b450 100644 --- a/drivers/scsi/be2iscsi/be_main.c +++ b/drivers/scsi/be2iscsi/be_main.c @@ -3020,7 +3020,6 @@ static int beiscsi_create_eqs(struct beiscsi_hba *phba, goto create_eq_error; } - mem->dma = paddr; mem->va = eq_vaddress; ret = be_fill_queue(eq, phba->params.num_eq_entries, sizeof(struct be_eq_entry), eq_vaddress); @@ -3030,6 +3029,7 @@ static int beiscsi_create_eqs(struct beiscsi_hba *phba, goto create_eq_error; } + mem->dma = paddr; ret = beiscsi_cmd_eq_create(&phba->ctrl, eq, BEISCSI_EQ_DELAY_DEF); if (ret) { @@ -3086,7 +3086,6 @@ static int beiscsi_create_cqs(struct beiscsi_hba *phba, goto create_cq_error; } - mem->dma = paddr; ret = be_fill_queue(cq, phba->params.num_cq_entries, sizeof(struct sol_cqe), cq_vaddress); if (ret) { @@ -3096,6 +3095,7 @@ static int beiscsi_create_cqs(struct beiscsi_hba *phba, goto create_cq_error; } + mem->dma = paddr; ret = beiscsi_cmd_cq_create(&phba->ctrl, cq, eq, false, false, 0); if (ret) { diff --git a/drivers/scsi/bnx2i/Kconfig b/drivers/scsi/bnx2i/Kconfig index 75ace2302fed..0cc06c2ce0b8 100644 --- a/drivers/scsi/bnx2i/Kconfig +++ b/drivers/scsi/bnx2i/Kconfig @@ -4,6 +4,7 @@ config SCSI_BNX2_ISCSI depends on NET depends on PCI depends on (IPV6 || IPV6=n) + depends on MMU select SCSI_ISCSI_ATTRS select NETDEVICES select ETHERNET diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index c8dd8588f800..274ccf18ce2d 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -452,6 +452,12 @@ static int hisi_sas_task_prep(struct sas_task *task, blk_tag = blk_mq_unique_tag(scmd->request); dq_index = blk_mq_unique_tag_to_hwq(blk_tag); *dq_pointer = dq = &hisi_hba->dq[dq_index]; + } else if (hisi_hba->shost->nr_hw_queues) { + struct Scsi_Host *shost = hisi_hba->shost; + struct blk_mq_queue_map *qmap = &shost->tag_set.map[HCTX_TYPE_DEFAULT]; + int queue = qmap->mq_map[raw_smp_processor_id()]; + + *dq_pointer = dq = &hisi_hba->dq[queue]; } else { *dq_pointer = dq = sas_dev->dq; } diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index 7133ca859b5e..960de375ce69 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -2452,6 +2452,11 @@ static int interrupt_init_v3_hw(struct hisi_hba *hisi_hba) rc = -ENOENT; goto free_irq_vectors; } + cq->irq_mask = pci_irq_get_affinity(pdev, i + BASE_VECTORS_V3_HW); + if (!cq->irq_mask) { + dev_err(dev, "could not get cq%d irq affinity!\n", i); + return -ENOENT; + } } return 0; diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scs
Re: [PATCH AUTOSEL 5.7 03/30] ima: extend boot_aggregate with kernel measurements
On Fri, 2020-12-11 at 06:01 -0500, Mimi Zohar wrote: > On Thu, 2020-12-10 at 21:10 -0600, Tyler Hicks wrote: > > On 2020-11-29 08:17:38, Mimi Zohar wrote: > > > Hi Sasha, > > > > > > On Wed, 2020-07-08 at 21:27 -0400, Sasha Levin wrote: > > > > On Wed, Jul 08, 2020 at 12:13:13PM -0400, Mimi Zohar wrote: > > > > > Hi Sasha, > > > > > > > > > > On Wed, 2020-07-08 at 11:40 -0400, Sasha Levin wrote: > > > > > > From: Maurizio Drocco > > > > > > > > > > > > [ Upstream commit 20c59ce010f84300f6c655d32db2610d3433f85c > > > > > > ] > > > > > > > > > > > > Registers 8-9 are used to store measurements of the kernel > > > > > > and its command line (e.g., grub2 bootloader with tpm > > > > > > module enabled). IMA should include them in the boot > > > > > > aggregate. Registers 8-9 should be only included in non- > > > > > > SHA1 digests to avoid ambiguity. > > > > > > > > > > Prior to Linux 5.8, the SHA1 template data hashes were padded > > > > > before being extended into the TPM. Support for calculating > > > > > and extending the per TPM bank template data digests is only > > > > > being upstreamed in Linux 5.8. > > > > > > > > > > How will attestation servers know whether to include PCRs 8 & > > > > > 9 in the the boot_aggregate calculation? Now, there is a > > > > > direct relationship between the template data SHA1 padded > > > > > digest not including PCRs 8 & 9, and the new per TPM bank > > > > > template data digest including them. > > > > > > > > Got it, I'll drop it then, thank you! > > > > > > After re-thinking this over, I realized that the attestation > > > server can verify the "boot_aggregate" based on the quoted PCRs > > > without knowing whether padded SHA1 hashes or per TPM bank hash > > > values were extended into the TPM[1], but non-SHA1 boot aggregate > > > values [2] should always include PCRs 8 & 9. > > > > I'm still not clear on how an attestation server would know to > > include PCRs 8 and 9 after this change came through a stable kernel > > update. It doesn't seem like something appropriate for stable since > > it requires code changes to attestation servers to handle the > > change. > > > > I know this has already been released in some stable releases, so > > I'm too late, but perhaps I'm missing something. > > The point of adding PCRs 8 & 9 only to non-SHA1 boot_aggregate values > was to avoid affecting existing attestation servers. The intention > was when attestation servers added support for the non-sha1 > boot_aggregate values, they'd also include PCRs 8 & 9. The existing > SHA1 boot_aggregate value remains PCRs 0 - 7. > > To prevent this or something similar from happening again, what > should have been the proper way of including PCRs 8 & 9? Just to be pragmatic: this is going to happen again. Shim is already measuring the Mok variables through PCR 14, so if we want an accurate boot aggregate, we're going to have to include PCR 14 as well (or persuade shim to measure through a PCR we're already including, which isn't impossible since I think shim should be measuring the Mok variables using the EV_EFI_VARIABLE_DRIVER_CONFIG event and, since it affects secure boot policy, that does argue it should be measured through PCR 7). James
Re: [PATCH v3 3/4] tpm_tis: Disable interrupts if interrupt storm detected
On Mon, 2020-12-07 at 15:28 -0400, Jason Gunthorpe wrote: > On Sun, Dec 06, 2020 at 08:26:16PM +0100, Thomas Gleixner wrote: > > Just as a side note. I was looking at tpm_tis_probe_irq_single() > > and that function is leaking the interrupt request if any of the > > checks afterwards fails, except for the final interrupt probe check > > which does a cleanup. That means on fail before that the interrupt > > handler stays requested up to the point where the module is > > removed. If that's a shared interrupt and some other device is > > active on the same line, then each interrupt from that device will > > call into the TPM code. Something like the below is needed. > > > > Also the X86 autoprobe mechanism is interesting: > > > > if (IS_ENABLED(CONFIG_X86)) > > for (i = 3; i <= 15; i++) > > if (!tpm_tis_probe_irq_single(chip, intmask, 0, > > i)) > > return; > > > > The third argument is 'flags' which is handed to request_irq(). So > > that won't ever be able to probe a shared interrupt. But if an > > interrupt number > 0 is handed to tpm_tis_core_init() the interrupt > > is requested with IRQF_SHARED. Same issue when the chip has an > > interrupt number in the register. It's also requested exclusive > > which is pretty likely to fail on ancient x86 machines. > > It is very likely none of this works any more, it has been repeatedly > reworked over the years and just left behind out of fear someone > needs it. I've thought it should be deleted for a while now. > > I suppose the original logic was to try and probe without SHARED > because a probe would need exclusive access to the interrupt to tell > if the TPM was actually the source, not some other device. > > It is all very old and very out of step with current thinking, IMHO. > I skeptical that TPM interrupts were ever valuable enough to deserve > this in the first place. For what it's worth, I agree. Trying to probe all 15 ISA interrupts is last millennium thinking we should completely avoid. If it's not described in ACPI then you don't get an interrupt full stop. James
Re: [PATCH v13 0/3] scsi: ufs: Add Host Performance Booster Support
On Mon, 2020-12-07 at 19:35 +0100, Greg KH wrote: > On Mon, Dec 07, 2020 at 06:26:03PM +, Christoph Hellwig wrote: > > On Mon, Dec 07, 2020 at 07:23:12PM +0100, Greg KH wrote: > > > What "real workload" test can be run on this to help show if it > > > is useful or not? These vendors seem to think it helps for some > > > reason, otherwise they wouldn't have added it to their silicon :) > > > > > > Should they run fio? If so, any hints on a config that would be > > > good to show any performance increases? > > > > A real actual workload that matters. Then again that was Martins > > request to even justify it. I don't think the broken addressing > > that breaks a whole in the SCSI addressing has absolutely not > > business being supported in Linux ever. The vendors should have > > thought about the design before committing transistors to something > > that fundamentally does not make sense. Actually, that's not the way it works: vendors add commands because standards mandate. That's why people who want weird commands go and join standard committees. Unfortunately this means that a lot of the commands the standard mandates end up not being very useful in practice. For instance in SCSI we really only implement a fraction of the commands in the standard. In this case, the industry already tried a very similar approach with GEN 1 hybrid drives and it turned into a complete disaster, which is why the mode became optional in shingle drives and much better modes, which didn't have the huge shared state problem, superseded it. Plus truncating the LBA of a READ 16 to 4 bytes is asking for capacity problems down the line, so even the actual implementation seems to be problematic. All in all, this looks like a short term fix which will go away when the drive capacity improves and thus all the effort changing the driver will eventually be wasted. > So "time to boot an android system with this enabled and disabled" > would be a valid workload, right? I'm guessing that's what the > vendors here actually care about, otherwise there is no real stress- > test on a UFS system that I know of. Um, does it? I don't believe even the UFS people have claimed this. The problem is that HPB creates a shared state between the driver and the device. That shared state has to be populated, which has to happen at start of day, so it's entirely unclear if this is a win or a slow down for boot. James
Re: [PATCH v3 1/4] irq: export kstat_irqs
On Sun, 2020-12-06 at 17:40 +0100, Thomas Gleixner wrote: > On Sat, Dec 05 2020 at 12:39, Jarkko Sakkinen wrote: > > On Fri, Dec 04, 2020 at 06:43:37PM -0700, Jerry Snitselaar wrote: > > > To try and detect potential interrupt storms that > > > have been occurring with tpm_tis devices it was suggested > > > to use kstat_irqs() to get the number of interrupts. > > > Since tpm_tis can be built as a module it needs kstat_irqs > > > exported. > > > > I think you should also have a paragraph explicitly stating that > > i915_pmu.c contains a duplicate of kstat_irqs() because it is not > > exported as of today. It adds a lot more weight to this given that > > there is already existing mainline usage (kind of). > > It's abusage and just the fact that it exists is not an argument by > itself. What we want is a count of the interrupts to see if we're having an interrupt storm from the TPM device (some seem to be wired to fire the interrupt even when there's no event to warrant it). Since kstat_irqs_user() does the correct RCU locking, should we be using that instead? James
Re: [RFC PATCH v1 07/12] efi: Replace strstarts() by str_has_prefix().
On Sat, 2020-12-05 at 22:20 +0100, Ard Biesheuvel wrote: > On Sat, 5 Dec 2020 at 22:15, James Bottomley > wrote: > > [Rostedt added because this is all his fault] > > On Sat, 2020-12-05 at 21:57 +0100, Ard Biesheuvel wrote: > > > On Sat, 5 Dec 2020 at 21:24, James Bottomley > > > wrote: > > [...] > > > > > So I don't object to using str_has_prefix() in new code in > > > > > this way, but I really don't see the point of touching > > > > > existing code. > > > > > > > > That's your prerogative as a Maintainer ... I was just > > > > explaining what the original author had in mind when > > > > str_has_prefix() was created. > > > > > > > > > > Sure, I fully understand you are not the one proposing these > > > changes. > > > > > > But if the pattern in question is so common, couldn't we go one > > > step further and define something like > > > > > > static inline const u8 *skip_prefix_or_null(const u8 *str, const > > > u8 *prefix) > > > { > > > } > > > > > > which returns a pointer into the original string, or NULL if the > > > prefix is not present. > > > > > > The current patch as proposed has no benefit whatsoever, but even > > > the meaningful alternative you are proposing is not actually an > > > improvement, given that it is not self-explanatory from the name > > > 'str_has_prefix' what it returns, and so the code becomes more > > > difficult to understand. > > > > Ah, so this is the kernel maintainer's syndrome: you see an API > > which isn't quite right for your use case, so you update or change > > it. Then you see other use cases for it and suddenly to you it > > becomes the best thing since sliced bread and with a one ring to > > rule them all mentality you exhort everyone to use this new API > > everywhere. See this comment in the merge commit (495d714ad1400) > > which comes from the merge cover letter: > > > > > - Addition of str_has_prefix() and a few use cases. There > > > currently is a similar function strstart() that is used in > > > a > > > few places, but only returns a bool and not a length. These > > > instances will be removed in the future to use > > > str_has_prefix() instead. > > > > Then you forget about it until someone else acts on your somewhat > > ill considered instruction and actually tries the > > replacement. Once someone takes up your cause, the API shows up in > > dozens of emails and the actual debate about whether or not this is > > such a good API really begins, with the poor person who picked it > > up caught in the crossfire. > > > > As maintainers we really should learn to put the cart before the s/to put/not to put/ > > horse. > > > > I am not disagreeing with any of this, but I simply don't see a point > in merging patches that apparently result in the exact same machine > code to be generated, and don't substantially make the code itself > any better. Well, I think the pattern if (strstarts(option, )) { ... option += strlen(); is a bad one because one day may get updated but not . And if is too far away in the code it might not even show up in the diff, leading to reviewers not noticing either. So I think eliminating the pattern is a definite improvement. Now whether the improvement is enough that we should churn the code base to fix it is another question. James
Re: [RFC PATCH v1 07/12] efi: Replace strstarts() by str_has_prefix().
[Rostedt added because this is all his fault] On Sat, 2020-12-05 at 21:57 +0100, Ard Biesheuvel wrote: > On Sat, 5 Dec 2020 at 21:24, James Bottomley > wrote: [...] > > > So I don't object to using str_has_prefix() in new code in this > > > way, but I really don't see the point of touching existing code. > > > > That's your prerogative as a Maintainer ... I was just explaining > > what the original author had in mind when str_has_prefix() was > > created. > > > > Sure, I fully understand you are not the one proposing these changes. > > But if the pattern in question is so common, couldn't we go one step > further and define something like > > static inline const u8 *skip_prefix_or_null(const u8 *str, const u8 > *prefix) > { > } > > which returns a pointer into the original string, or NULL if the > prefix is not present. > > The current patch as proposed has no benefit whatsoever, but even the > meaningful alternative you are proposing is not actually an > improvement, given that it is not self-explanatory from the name > 'str_has_prefix' what it returns, and so the code becomes more > difficult to understand. Ah, so this is the kernel maintainer's syndrome: you see an API which isn't quite right for your use case, so you update or change it. Then you see other use cases for it and suddenly to you it becomes the best thing since sliced bread and with a one ring to rule them all mentality you exhort everyone to use this new API everywhere. See this comment in the merge commit (495d714ad1400) which comes from the merge cover letter: > - Addition of str_has_prefix() and a few use cases. There > currently is a similar function strstart() that is used in a > few places, but only returns a bool and not a length. These > instances will be removed in the future to use > str_has_prefix() instead. Then you forget about it until someone else acts on your somewhat ill considered instruction and actually tries the replacement. Once someone takes up your cause, the API shows up in dozens of emails and the actual debate about whether or not this is such a good API really begins, with the poor person who picked it up caught in the crossfire. As maintainers we really should learn to put the cart before the horse. James
Re: [RFC PATCH v1 07/12] efi: Replace strstarts() by str_has_prefix().
On Sat, 2020-12-05 at 20:36 +0100, Ard Biesheuvel wrote: > On Fri, 4 Dec 2020 at 19:02, James Bottomley > wrote: > > On Fri, 2020-12-04 at 18:07 +0100, Ard Biesheuvel wrote: > > > On Fri, 4 Dec 2020 at 18:06, > > > wrote: > > > > From: Francis Laniel > > > > > > > > The two functions indicates if a string begins with a given > > > > prefix. The only difference is that strstarts() returns a bool > > > > while str_has_prefix() returns the length of the prefix if the > > > > string begins with it or 0 otherwise. > > > > > > > > > > Why? > > > > I think I can answer that. If the conversion were done properly > > (which it's not) you could get rid of the double strings in the > > code which are error prone if you update one and forget > > another. This gives a good example: 3d739c1f6156 ("tracing: Use > > the return of str_has_prefix() to remove open coded numbers"). so > > in your code you'd replace things like > > > > if (strstarts(option, "rgb")) { > > option += strlen("rgb"); > > ... > > > > with > > > > len = str_has_prefix(option, "rgb"); > > if (len) { > > option += len > > ... > > > > Obviously you also have cases where strstart is used as a boolean > > with no need to know the length ... I think there's no value to > > converting those. > > > > This will lead to worse code being generated. strlen() is evaluated > at build time by the compiler if the argument is a string literal, so > your 'before' version gets turned into 'option += 3', whereas the > latter needs to use a runtime variable. str_has_prefix() is an always_inline function so it should be build time evaluated as well. I think most compilers see len as being a constant and unchanged, so elide the variable. This means the code generated should be the same. > So I don't object to using str_has_prefix() in new code in this way, > but I really don't see the point of touching existing code. That's your prerogative as a Maintainer ... I was just explaining what the original author had in mind when str_has_prefix() was created. James
[GIT PULL] SCSI fixes for 5.10-rc6
Four small fixes in two drivers. The mpt3sas fixes are all timeout under unusual conditions problems and the storvsc is a missed incoming packet validation and a missed error return. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Andrea Parri (Microsoft) (1): scsi: storvsc: Validate length of incoming packet in storvsc_on_channel_callback() Jing Xiangfeng (1): scsi: storvsc: Fix error return in storvsc_probe() Sreekanth Reddy (1): scsi: mpt3sas: Increase IOCInit request timeout to 30s Suganath Prabu S (1): scsi: mpt3sas: Fix ioctl timeout And the diffstat: drivers/scsi/mpt3sas/mpt3sas_base.c | 2 +- drivers/scsi/mpt3sas/mpt3sas_ctl.c | 2 +- drivers/scsi/storvsc_drv.c | 9 - 3 files changed, 10 insertions(+), 3 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index e4cc92bc4d94..bb940cbcbb5d 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -6459,7 +6459,7 @@ _base_send_ioc_init(struct MPT3SAS_ADAPTER *ioc) r = _base_handshake_req_reply_wait(ioc, sizeof(Mpi2IOCInitRequest_t), (u32 *)&mpi_request, - sizeof(Mpi2IOCInitReply_t), (u16 *)&mpi_reply, 10); + sizeof(Mpi2IOCInitReply_t), (u16 *)&mpi_reply, 30); if (r != 0) { ioc_err(ioc, "%s: handshake failed (r=%d)\n", __func__, r); diff --git a/drivers/scsi/mpt3sas/mpt3sas_ctl.c b/drivers/scsi/mpt3sas/mpt3sas_ctl.c index 0f2b681449e6..edd26a2570fa 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_ctl.c +++ b/drivers/scsi/mpt3sas/mpt3sas_ctl.c @@ -664,7 +664,7 @@ _ctl_do_mpt_command(struct MPT3SAS_ADAPTER *ioc, struct mpt3_ioctl_command karg, Mpi26NVMeEncapsulatedRequest_t *nvme_encap_request = NULL; struct _pcie_device *pcie_device = NULL; u16 smid; - u8 timeout; + unsigned long timeout; u8 issue_reset; u32 sz, sz_arg; void *psge; diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 0c65fbd41035..99c8ff81de74 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1246,6 +1246,11 @@ static void storvsc_on_channel_callback(void *context) request = (struct storvsc_cmd_request *) ((unsigned long)desc->trans_id); + if (hv_pkt_datalen(desc) < sizeof(struct vstor_packet) - vmscsi_size_delta) { + dev_err(&device->device, "Invalid packet len\n"); + continue; + } + if (request == &stor_device->init_request || request == &stor_device->reset_request) { memcpy(&request->vstor_packet, packet, @@ -1994,8 +1999,10 @@ static int storvsc_probe(struct hv_device *device, alloc_ordered_workqueue("storvsc_error_wq_%d", WQ_MEM_RECLAIM, host->host_no); - if (!host_dev->handle_error_wq) + if (!host_dev->handle_error_wq) { + ret = -ENOMEM; goto err_out2; + } INIT_WORK(&host_dev->host_scan_work, storvsc_host_scan); /* Register the HBA and start the scsi bus scan */ ret = scsi_add_host(host, &device->device);
Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
On Fri, 2020-12-04 at 14:51 -0700, Jerry Snitselaar wrote: > James Bottomley @ 2020-12-03 14:05 MST: > > > On Thu, 2020-12-03 at 13:14 -0700, Jerry Snitselaar wrote: > > > Jerry Snitselaar @ 2020-12-02 23:11 MST: > > [...] > > > > The interrupt storm detection code works on the T490s. I'm not > > > > sure what is going on with the L490. I will see if I can get > > > > access to one. > > > > > > > > Jerry > > > > > > Lenovo verified that the L490 hangs. > > > > Just to confirm, that's this system: > > > > https://www.lenovo.com/us/en/laptops/thinkpad/thinkpad-l/ThinkPad-L490/p/22TP2TBL490 > > > > We could ask if lenovo will give us one, but if not we could pull a > > Jens. [the backstory is that when Jens was doing queueing in the > > block layer, there were lots of SATA devices that didn't work quite > > right but you couldn't tell unless you actually tried them > > out. Getting manufacturers to send samples is rather arduous, so > > he took to ordering them online, testing them out, and then > > returning them for a full refund within the allowed window] > > > > It looks like Lenovo has a nice christmas returns policy: > > > > https://www.lenovo.com/us/en/shopping-faq/#returns > > > > James > > Yes, that is the one. I'm seeing if we have any located somewhere, or > if Lenovo will loan me one. OK, that would be best since it's your patch. We've got 12 days left to get the free returns policy, so if you haven't managed to find one by the end of next week, I'll order one and play with it ... you could always put your manager's credit card down for one as well ... especially as it should get refunded ... > I think for the time being the patch that disabled interrupts for > the T490s could be changed to it for the L490 instead. I'll post a v3 > of my current patchset. It would probably make sense for it to go in > with your patches when they land. Absolutely ... we can't turn interrupts on and then have a load of hung laptops we already knew about ... James
Re: [RFC PATCH v1 07/12] efi: Replace strstarts() by str_has_prefix().
On Fri, 2020-12-04 at 18:07 +0100, Ard Biesheuvel wrote: > On Fri, 4 Dec 2020 at 18:06, > wrote: > > From: Francis Laniel > > > > The two functions indicates if a string begins with a given prefix. > > The only difference is that strstarts() returns a bool while > > str_has_prefix() > > returns the length of the prefix if the string begins with it or 0 > > otherwise. > > > > Why? I think I can answer that. If the conversion were done properly (which it's not) you could get rid of the double strings in the code which are error prone if you update one and forget another. This gives a good example: 3d739c1f6156 ("tracing: Use the return of str_has_prefix() to remove open coded numbers"). so in your code you'd replace things like if (strstarts(option, "rgb")) { option += strlen("rgb"); ... with len = str_has_prefix(option, "rgb"); if (len) { option += len ... Obviously you also have cases where strstart is used as a boolean with no need to know the length ... I think there's no value to converting those. James
Re: [RFC PATCH v1 00/12] Replace strstarts() by str_has_prefix()
On Fri, 2020-12-04 at 18:03 +0100, laniel_fran...@privacyrequired.com wrote: > In this patch set, I replaced all calls to strstarts() by calls to > str_has_prefix(). Indeed, the kernel has two functions to test if a > string begins with an other: > 1. strstarts() which returns a bool, so 1 if the string begins with > the prefix,0 otherwise. > 2. str_has_prefix() which returns the length of the prefix or 0. > > str_has_prefix() was introduced later than strstarts(), in commit > 495d714ad140 which also stated that str_has_prefix() should replace > strstarts(). This is what this patch set does. What's the reason why? If you look at the use cases for the replacement of strstart() they're all cases where we need to know the length we're skipping and this is hard coded, leading to potential errors later. This is a classic example: 3d739c1f6156 ("tracing: Use the return of str_has_prefix() to remove open coded numbers"). However you're not doing this transformation in the conversion, so the conversion is pretty useless. I also see no case for replacing strstart() where we're using it simply as a boolean without needing to know the length of the prefix. James
Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch
On Thu, 2020-12-03 at 14:17 -0500, Konstantin Ryabitsev wrote: > On Thu, Dec 03, 2020 at 08:55:54AM -0800, Joe Perches wrote: > > Perhaps automate a mechanism to capture that information as > > git notes for the patches when applied. > > Git notes have a limited usefulness for this -- they are indeed part > of the repository, but they aren't replicated unless someone does a > --mirror clone (or specifically fetches refs/notes/*). If the goal is > to improve visibility for contributors, then putting this info into a > git note will hardly make more difference than providing a Link: that > someone has to follow to a list archival service. > > I can offer the following proposal: > > - kernel.org already monitors all mailing lists that are archived on > lore.kernel.org for the purposes of pull request tracking > (pr-tracker-bot). > - in the near future, we will add a separate process that will > auto-explode all pull requests into individual patches and add them > to a separate public-inbox archive (think of it as another > transparency log, since pull requests are transient and opaque). > > We can additionally: > > - identify all Link: and Message-Id: entries in commit messages, > retrieve the threads they refer to, and archive them as part of > the same (or adjacent) transparency log. > > This offers an improvement over the status quo, because if > lore.kernel.org becomes unavailable, someone would have to have > access to all backend archive repositories it is currently tracking > in order to be able to reconstitute relevant conversations -- whereas > with this change, it should be sufficient to just have the copy of > the transparency log to have a fully self-contained high-relevancy > archive of both individual commits and conversations that happened > around them. I don't think this is strictly necessary because there's more than lore that archive's our lists, but the people at the internet history project would remind me not to look a gift horse in the mouth, so I think this would certainly be a useful addition. The thing which Link: doesn't necessarily track is iterations, so if you replied to v2 and your feedback got incorporated, there's a v3 iteration which has a different msgid. Is there a way of getting this full history, not just the current thread? > I'm just not sure if this will help with the subject of the > conversation, or if it does not serve the goal of recognizing > developer contributions by making them more visible. I added Jon to the cc since a lot of managers (community or otherwise) do use the lwn.net stats as a performance guide. The real question is could we get something measurable out of the data? say number of replies to an accepted patch counting in the reviewer stats or something? James
Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
On Thu, 2020-12-03 at 13:14 -0700, Jerry Snitselaar wrote: > Jerry Snitselaar @ 2020-12-02 23:11 MST: [...] > > The interrupt storm detection code works on the T490s. I'm not sure > > what is going on with the L490. I will see if I can get access to > > one. > > > > Jerry > > Lenovo verified that the L490 hangs. Just to confirm, that's this system: https://www.lenovo.com/us/en/laptops/thinkpad/thinkpad-l/ThinkPad-L490/p/22TP2TBL490 We could ask if lenovo will give us one, but if not we could pull a Jens. [the backstory is that when Jens was doing queueing in the block layer, there were lots of SATA devices that didn't work quite right but you couldn't tell unless you actually tried them out. Getting manufacturers to send samples is rather arduous, so he took to ordering them online, testing them out, and then returning them for a full refund within the allowed window] It looks like Lenovo has a nice christmas returns policy: https://www.lenovo.com/us/en/shopping-faq/#returns James
Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch
On Thu, 2020-12-03 at 13:52 -0500, Matthew Wilcox wrote: > It's not so much "clean history" that's the desire. It's "don't leave > landmines for git bisect". ... top posting? Well functional git bisect and show the evolution of the patch aren't mutually exclusive. Plus our current clean history approach doesn't always detect them ... That said, I agree that if a review comes in that shows a patch would break the build or runtime in a way that would damage bisection, it should be folded. I suppose this argues that only less trivial changes can be shown as part of the unfolded history, and since they're obviously less important than the review driven folded change it would add more bias to track them. I fall back to my link is enough comment. James
Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch
On Thu, 2020-12-03 at 00:43 +0100, Vlastimil Babka wrote: > Hi, > > there was a bit of debate on Twitter about this, so I thought I would > bring it here. Imagine a scenario where patch sits as a commit in > -next and there's a bug report or fix, possibly by a bot or with some > static analysis. The maintainer decides to fold it into the original > patch, which makes sense for e.g. bisectability. But there seem to be > no clear rules about attribution in this case, which looks like there > should be, probably in > Documentation/maintainer/modifying-patches.rst > > The original bug fix might include a From: $author, a Reported-by: > (e.g. syzbot), Fixes: $next-commit, some tag such as Addresses- > Coverity: to credit the static analysis tool, and an SoB. After > folding, all that's left might be a line as "include fix from > $author" in the SoB area. This is a loss of metadata/attribution just > due to folding, and might make contributors unhappy. Had they sent > the fix after the original commit was mainline and immutable, all > the info above would "survive" in the form of new commit. It has been the case since forever that discussion which improves an uncommitted patch is only captured in email (which now may be preserved in a link tag). Patch updates that come in after the patch is committed get their own commit. We've tried to move people away from counting commits as an indicator of upstream eminence, but it's still a fact of life that this is what matters to a lot of open source community managers. The tension we have is between liking a clean commit in the tree as opposed to a sequence of commits tracking the evolution of the patch and this community manager desire to track patches. So there are two embedded questions here: firstly, should we be as wedded to clean history as we are, because showing the evolution would simply solve this? Secondly, if we are agreed on clean history, how can we make engagement via email as important as engagement via commit for the community managers so the Link tag is enough? I've got to say I think trying to add tags to recognize patch evolution is a mistake and we instead investigate one of the two proposals above. James
Re: [PATCH] scsi: ses: Fix crash caused by kfree an invalid pointer
On Mon, 2020-11-30 at 10:26 +0800, Ding Hui wrote: [...] > sg_ses -e > Diagnostic pages, followed by abbreviation(s) then page code: > Supported Diagnostic Pages [sdp] [0x0] > Configuration (SES) [cf] [0x1] > Enclosure Status/Control (SES) [ec,es] [0x2] > Help Text (SES) [ht] [0x3] > > # sg_ses -p cf /dev/sdu >DELL MD32xxi 0784 > disk device has EncServ bit set > Configuration diagnostic page: >number of secondary subenclosures: 0 >generation code: 0x12c >enclosure descriptor list > Subenclosure identifier: 0 (primary) >relative ES process id: 0, number of ES processes: 0 >number of type descriptor headers: 5 >enclosure logical identifier (hex): >enclosure vendor: DELL product: MD32xxi rev: > 0784 >vendor-specific data: > 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 > 00 00 00 00 >type descriptor header/text list > Element type: Device slot, subenclosure id: 0 >number of possible elements: 12 > Element type: Power supply, subenclosure id: 0 >number of possible elements: 2 > Element type: Cooling, subenclosure id: 0 >number of possible elements: 4 > Element type: Temperature sensor, subenclosure id: 0 >number of possible elements: 4 > Element type: SCC controller electronics, subenclosure id: 0 >number of possible elements: 1 > > I'm not goot at ses, but it seems that ses does not get the right > component count Actually there is a possible explanation. Your kernel log has this in the middle: > 2020-11-30 09:31:41.360334 notice [kernel:] [425843.704480] sd > 19:0:0:0: Embedded Enclosure Device > 2020-11-30 09:31:41.360335 warning [kernel:] [425843.704732] sd > 19:0:0:0: Mode parameters changed That "Mode parameters changed" implies that what the kernel read first time around may not be the actual configuration of the enclosure. In particular, the generation code being 0x12c is also an indicator things may have changed.My theory is when the kernel boots the device is returning 0 for most of the possible elements, but it changes this at a later stage. One way to verify would be to compile ses as modular but blacklist it so it can't be inserted, then do sg_ses -p to get the enclosure and then insert the ses module to see if the two agree on the components. Unfortunately, even if that's successful, figuring out what we have to do to the enclosure to get it to finish its internal scanning may not be so easy. James
Re: [PATCH] locks: remove trailing semicolon in macro definition
On Sun, 2020-11-29 at 09:52 -0800, Randy Dunlap wrote: > On 11/29/20 9:47 AM, Tom Rix wrote: > > On 11/27/20 11:53 AM, Matthew Wilcox wrote: > > > On Fri, Nov 27, 2020 at 11:07:07AM -0800, t...@redhat.com wrote: > > > > +++ b/fs/fcntl.c > > > > @@ -526,7 +526,7 @@ SYSCALL_DEFINE3(fcntl64, unsigned int, fd, > > > > unsigned int, cmd, > > > > (dst)->l_whence = (src)->l_whence; \ > > > > (dst)->l_start = (src)->l_start;\ > > > > (dst)->l_len = (src)->l_len;\ > > > > - (dst)->l_pid = (src)->l_pid; > > > > + (dst)->l_pid = (src)->l_pid > > > This should be wrapped in a do { } while (0). > > > > > > Look, this warning is clearly great at finding smelly code, but > > > the > > > fixes being generated to shut up the warning are low quality. > > > > > Multiline macros not following the do {} while (0) pattern are > > likely a larger problem. > > > > I'll take a look. > > Could it become a static inline function instead? > or that might expand its scope too much? I think nowadays we should always use static inlines for argument checking unless we're capturing debug information like __FILE__ or __LINE__ or something that a static inline can't. Even in the latter case the pattern should probably be single line #define that captures the information and passes it to a static inline. There was a time when we had problems with compiler expansion of static inlines, so we shouldn't go back and churn the code base to change it because there's thousands of these and possibly some old compiler used for an obscure architecture that still needs the define. James
Re: [PATCH] scsi: ses: Fix crash caused by kfree an invalid pointer
On Sat, 2020-11-28 at 20:23 +0800, Ding Hui wrote: > We can get a crash when disconnecting the iSCSI session, > the call trace like this: > > [2a00fb70] kfree at 0830e224 > [2a00fba0] ses_intf_remove at 01f200e4 > [2a00fbd0] device_del at 086b6a98 > [2a00fc50] device_unregister at 086b6d58 > [2a00fc70] __scsi_remove_device at 0870608c > [2a00fca0] scsi_remove_device at 08706134 > [2a00fcc0] __scsi_remove_target at 087062e4 > [2a00fd10] scsi_remove_target at 087064c0 > [2a00fd70] __iscsi_unbind_session at 01c872c4 > [2a00fdb0] process_one_work at 0810f35c > [2a00fe00] worker_thread at 0810f648 > [2a00fe70] kthread at 08116e98 > > In ses_intf_add, components count could be 0, and kcalloc 0 size > scomp, > but not saved in edev->component[i].scratch > > In this situation, edev->component[0].scratch is an invalid pointer, > when kfree it in ses_intf_remove_enclosure, a crash like above would > happen > The call trace also could be other random cases when kfree cannot > catch > the invalid pointer > > We should not use edev->component[] array when the components count > is 0 > We also need check index when use edev->component[] array in > ses_enclosure_data_process > > Tested-by: Zeng Zhicong > Cc: stable # 2.6.25+ > Signed-off-by: Ding Hui This doesn't really look to be the right thing to do: an enclosure which has no component can't usefully be controlled by the driver since there's nothing for it to do, so what we should do in this situation is refuse to attach like the proposed patch below. It does seem a bit odd that someone would build an enclosure that doesn't enclose anything, so would you mind running sg_ses -e on it and reporting back what it shows? It's possible there's another type that the enclosure device should be tracking. Regards, James ---8>8>8><8<8<8 From: James Bottomley Subject: [PATCH] scsi: ses: don't attach if enclosure has no components An enclosure with no components can't usefully be operated by the driver (since effectively it has nothing to manage), so report the problem and don't attach. Not attaching also fixes an oops which could occur if the driver tries to manage a zero component enclosure. Reported-by: Ding Hui Cc: sta...@vger.kernel.org Signed-off-by: James Bottomley --- drivers/scsi/ses.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c index c2afba2a5414..9624298b9c89 100644 --- a/drivers/scsi/ses.c +++ b/drivers/scsi/ses.c @@ -690,6 +690,11 @@ static int ses_intf_add(struct device *cdev, type_ptr[0] == ENCLOSURE_COMPONENT_ARRAY_DEVICE) components += type_ptr[1]; } + if (components == 0) { + sdev_printk(KERN_ERR, sdev, "enclosure has no enumerated components\n"); + goto err_free; + } + ses_dev->page1 = buf; ses_dev->page1_len = len; buf = NULL; -- 2.26.2
[GIT PULL] SCSI fixes for 5.10-rc5
Three small fixes in the UFS driver: two are for power management issues and the third is to fix a slew of problem in the sysfs code. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Can Guo (2): scsi: ufs: Make sure clk scaling happens only when HBA is runtime ACTIVE scsi: ufs: Fix unexpected values from ufshcd_read_desc_param() Stanley Chu (1): scsi: ufs: Fix race between shutdown and runtime resume flow And the diffstat: drivers/scsi/ufs/ufshcd.c | 37 +++-- 1 file changed, 23 insertions(+), 14 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 7a160b86adc6..0c148fcd24de 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -1294,8 +1294,15 @@ static int ufshcd_devfreq_target(struct device *dev, } spin_unlock_irqrestore(hba->host->host_lock, irq_flags); + pm_runtime_get_noresume(hba->dev); + if (!pm_runtime_active(hba->dev)) { + pm_runtime_put_noidle(hba->dev); + ret = -EAGAIN; + goto out; + } start = ktime_get(); ret = ufshcd_devfreq_scale(hba, scale_up); + pm_runtime_put(hba->dev); trace_ufshcd_profile_clk_scaling(dev_name(hba->dev), (scale_up ? "up" : "down"), @@ -3192,13 +3199,19 @@ int ufshcd_read_desc_param(struct ufs_hba *hba, /* Get the length of descriptor */ ufshcd_map_desc_id_to_length(hba, desc_id, &buff_len); if (!buff_len) { - dev_err(hba->dev, "%s: Failed to get desc length", __func__); + dev_err(hba->dev, "%s: Failed to get desc length\n", __func__); + return -EINVAL; + } + + if (param_offset >= buff_len) { + dev_err(hba->dev, "%s: Invalid offset 0x%x in descriptor IDN 0x%x, length 0x%x\n", + __func__, param_offset, desc_id, buff_len); return -EINVAL; } /* Check whether we need temp memory */ if (param_offset != 0 || param_size < buff_len) { - desc_buf = kmalloc(buff_len, GFP_KERNEL); + desc_buf = kzalloc(buff_len, GFP_KERNEL); if (!desc_buf) return -ENOMEM; } else { @@ -3212,14 +3225,14 @@ int ufshcd_read_desc_param(struct ufs_hba *hba, desc_buf, &buff_len); if (ret) { - dev_err(hba->dev, "%s: Failed reading descriptor. desc_id %d, desc_index %d, param_offset %d, ret %d", + dev_err(hba->dev, "%s: Failed reading descriptor. desc_id %d, desc_index %d, param_offset %d, ret %d\n", __func__, desc_id, desc_index, param_offset, ret); goto out; } /* Sanity check */ if (desc_buf[QUERY_DESC_DESC_TYPE_OFFSET] != desc_id) { - dev_err(hba->dev, "%s: invalid desc_id %d in descriptor header", + dev_err(hba->dev, "%s: invalid desc_id %d in descriptor header\n", __func__, desc_buf[QUERY_DESC_DESC_TYPE_OFFSET]); ret = -EINVAL; goto out; @@ -3229,12 +3242,12 @@ int ufshcd_read_desc_param(struct ufs_hba *hba, buff_len = desc_buf[QUERY_DESC_LENGTH_OFFSET]; ufshcd_update_desc_length(hba, desc_id, desc_index, buff_len); - /* Check wherher we will not copy more data, than available */ - if (is_kmalloc && (param_offset + param_size) > buff_len) - param_size = buff_len - param_offset; - - if (is_kmalloc) + if (is_kmalloc) { + /* Make sure we don't copy more data than available */ + if (param_offset + param_size > buff_len) + param_size = buff_len - param_offset; memcpy(param_read_buf, &desc_buf[param_offset], param_size); + } out: if (is_kmalloc) kfree(desc_buf); @@ -8900,11 +8913,7 @@ int ufshcd_shutdown(struct ufs_hba *hba) if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba)) goto out; - if (pm_runtime_suspended(hba->dev)) { - ret = ufshcd_runtime_resume(hba); - if (ret) - goto out; - } + pm_runtime_get_sync(hba->dev); ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM); out:
Re: [PATCH] [SCSI] sym53c8xx: remove trailing semicolon in macro definition
On Fri, 2020-11-27 at 10:29 -0800, t...@redhat.com wrote: > From: Tom Rix > > The macro use will already have a semicolon. > > Signed-off-by: Tom Rix > --- > drivers/scsi/sym53c8xx_2/sym_glue.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/sym53c8xx_2/sym_glue.c > b/drivers/scsi/sym53c8xx_2/sym_glue.c > index d9a045f9858c..f3b3345c1766 100644 > --- a/drivers/scsi/sym53c8xx_2/sym_glue.c > +++ b/drivers/scsi/sym53c8xx_2/sym_glue.c > @@ -1001,12 +1001,12 @@ static int is_keyword(char *ptr, int len, > char *verb) > #define SKIP_SPACES(ptr, len) > \ > if ((arg_len = sym_skip_spaces(ptr, len)) < 1) > \ > return -EINVAL; > \ > - ptr += arg_len; len -= arg_len; > + ptr += arg_len; len -= arg_len > > #define GET_INT_ARG(ptr, len, v) > \ > if (!(arg_len = get_int_arg(ptr, len, &(v > \ > return -EINVAL; > \ > - ptr += arg_len; len -= arg_len; > + ptr += arg_len; len -= arg_len Those macros have the wrong form which can lead to actual bugs and this cosmetic change does nothing to fix it. If the goal here is to get the code base into better shape, please fix the bugs. James
Re: [Intel-wired-lan] [PATCH 000/141] Fix fall-through warnings for Clang
On Tue, 2020-11-24 at 13:32 -0800, Kees Cook wrote: > On Mon, Nov 23, 2020 at 08:31:30AM -0800, James Bottomley wrote: > > Really, no ... something which produces no improvement has no value > > at all ... we really shouldn't be wasting maintainer time with it > > because it has a cost to merge. I'm not sure we understand where > > the balance lies in value vs cost to merge but I am confident in > > the zero value case. > > What? We can't measure how many future bugs aren't introduced because > the kernel requires explicit case flow-control statements for all new > code. No but we can measure how vulnerable our current coding habits are to the mistake this warning would potentially prevent. I don't think it's wrong to extrapolate that if we had no instances at all of prior coding problems we likely wouldn't have any in future either making adopting the changes needed to enable the warning valueless ... that's the zero value case I was referring to above. Now, what we have seems to be about 6 cases (at least what's been shown in this thread) where a missing break would cause potentially user visible issues. That means the value of this isn't zero, but it's not a no-brainer massive win either. That's why I think asking what we've invested vs the return isn't a useless exercise. > We already enable -Wimplicit-fallthrough globally, so that's not the > discussion. The issue is that Clang is (correctly) even more strict > than GCC for this, so these are the remaining ones to fix for full > Clang coverage too. > > People have spent more time debating this already than it would have > taken to apply the patches. :) You mean we've already spent 90% of the effort to come this far so we might as well go the remaining 10% because then at least we get some return? It's certainly a clinching argument in defence procurement ... > This is about robustness and language wrangling. It's a big code- > base, and this is the price of our managing technical debt for > permanent robustness improvements. (The numbers I ran from Gustavo's > earlier patches were that about 10% of the places adjusted were > identified as legitimate bugs being fixed. This final series may be > lower, but there are still bugs being found from it -- we need to > finish this and shut the door on it for good.) I got my six patches by analyzing the lwn.net report of the fixes that was cited which had 21 of which 50% didn't actually change the emitted code, and 25% didn't have a user visible effect. But the broader point I'm making is just because the compiler people come up with a shiny new warning doesn't necessarily mean the problem it's detecting is one that causes us actual problems in the code base. I'd really be happier if we had a theory about what classes of CVE or bug we could eliminate before we embrace the next new warning. James
Re: [PATCH] tpm_tis: Disable interrupts on ThinkPad T490s
On Tue, 2020-11-24 at 10:52 -0700, Jerry Snitselaar wrote: > Before diving further into that though, does anyone else have an > opinion on ripping out the irq code, and just using polling? We've > been only polling since 2015 anyways. Well only a biased one, obviously: polling causes large amounts of busy waiting, which is a waste of CPU resources and does increase the time it takes us to do TPM operations ... not a concern if you're doing long computation ones, like signatures, but it is a problem for short operations like bulk updates of PCRs. The other potential issue, as we saw with atmel is that if you prod the chip too often (which you have to do with polling) you risk upsetting it. We've spent ages trying to tune the polling parameters to balance reduction of busy wait with chip upset and still, apparently, not quite got it right. If the TPM has a functioning IRQ then it gets us out of the whole polling mess entirely. The big question is how many chips that report an IRQ actually have a malfunctioning one? James
Re: [PATCH 000/141] Fix fall-through warnings for Clang
On Mon, 2020-11-23 at 19:56 +0100, Miguel Ojeda wrote: > On Mon, Nov 23, 2020 at 4:58 PM James Bottomley > wrote: > > Well, I used git. It says that as of today in Linus' tree we have > > 889 patches related to fall throughs and the first series went in > > in october 2017 ... ignoring a couple of outliers back to February. > > I can see ~10k insertions over ~1k commits and 15 years that mention > a fallthrough in the entire repo. That is including some commits > (like the biggest one, 960 insertions) that have nothing to do with C > fallthrough. A single kernel release has an order of magnitude more > changes than this... > > But if we do the math, for an author, at even 1 minute per line > change and assuming nothing can be automated at all, it would take 1 > month of work. For maintainers, a couple of trivial lines is noise > compared to many other patches. So you think a one line patch should take one minute to produce ... I really don't think that's grounded in reality. I suppose a one line patch only takes a minute to merge with b4 if no-one reviews or tests it, but that's not really desirable. > In fact, this discussion probably took more time than the time it > would take to review the 200 lines. :-) I'm framing the discussion in terms of the whole series of changes we have done for fall through, both what's in the tree currently (889 patches) both in terms of the produce and the consumer. That's what I used for my figures for cost. > > We're also complaining about the inability to recruit maintainers: > > > > https://www.theregister.com/2020/06/30/hard_to_find_linux_maintainers_says_torvalds/ > > > > And burn out: > > > > http://antirez.com/news/129 > > Accepting trivial and useful 1-line patches Part of what I'm trying to measure is the "and useful" bit because that's not a given. > is not what makes a voluntary maintainer quit... so the proverb "straw which broke the camel's back" uniquely doesn't apply to maintainers > Thankless work with demanding deadlines is. That's another potential reason, but it doesn't may other reasons less valid. > > The whole crux of your argument seems to be maintainers' time isn't > > important so we should accept all trivial patches > > I have not said that, at all. In fact, I am a voluntary one and I > welcome patches like this. It takes very little effort on my side to > review and it helps the kernel overall. Well, you know, subsystems are very different in terms of the amount of patches a maintainer has to process per release cycle of the kernel. If a maintainer is close to capacity, additional patches, however trivial, become a problem. If a maintainer has spare cycles, trivial patches may look easy. > Paid maintainers are the ones that can take care of big > features/reviews. > > > What I'm actually trying to articulate is a way of measuring value > > of the patch vs cost ... it has nothing really to do with who foots > > the actual bill. > > I understand your point, but you were the one putting it in terms of > a junior FTE. No, I evaluated the producer side in terms of an FTE. What we're mostly arguing about here is the consumer side: the maintainers and people who have to rework their patch sets. I estimated that at 100h. > In my view, 1 month-work (worst case) is very much worth > removing a class of errors from a critical codebase. > > > One thesis I'm actually starting to formulate is that this > > continual devaluing of maintainers is why we have so much > > difficulty keeping and recruiting them. > > That may very well be true, but I don't feel anybody has devalued > maintainers in this discussion. You seem to be saying that because you find it easy to merge trivial patches, everyone should. I'm reminded of a friend long ago who thought being a Tees River Pilot was a sinecure because he could navigate the Tees blindfold. What he forgot, of course, is that just because it's easy with a trawler doesn't mean it's easy with an oil tanker. In fact it takes longer to qualify as a Tees River Pilot than it does to get a PhD. James