from:"James Bottomley"

Re: [PATCH v9 1/4] KEYS: trusted: Add generic trusted keys framework

2021-04-20 Thread James Bottomley

On Mon, 2021-03-01 at 18:41 +0530, Sumit Garg wrote:
> Current trusted keys framework is tightly coupled to use TPM device
> as an underlying implementation which makes it difficult for
> implementations like Trusted Execution Environment (TEE) etc. to
> provide trusted keys support in case platform doesn't posses a TPM
> device.
> 
> Add a generic trusted keys framework where underlying implementations
> can be easily plugged in. Create struct trusted_key_ops to achieve
> this, which contains necessary functions of a backend.
> 
> Also, define a module parameter in order to select a particular trust
> source in case a platform support multiple trust sources. In case its
> not specified then implementation itetrates through trust sources
> list starting with TPM and assign the first trust source as a backend
> which has initiazed successfully during iteration.
> 
> Note that current implementation only supports a single trust source
> at runtime which is either selectable at compile time or during boot
> via aforementioned module parameter.

You never actually tested this, did you?  I'm now getting EINVAL from
all the trusted TPM key operations because of this patch.

The reason is quite simple:  this function:

> index ..0db86b44605d
> --- /dev/null
> +++ b/security/keys/trusted-keys/trusted_core.c
[...]
> +static int datablob_parse(char *datablob, struct trusted_key_payload
> *p)
> +{
> + substring_t args[MAX_OPT_ARGS];
> + long keylen;
> + int ret = -EINVAL;
> + int key_cmd;
> + char *c;
> +
> + /* main command */
> + c = strsep(, " \t");

Modifies its argument to consume tokens and separates them with NULL.

so the arguments for

keyctl add trusted kmk "new 34 keyhandle=0x8101"

Go into this function as

datablob="new 34 keyhandle=0x8101"

After we leave it, it looks like

datablob="new\034\0keyhandle=0x8101"

However here:

> +static int trusted_instantiate(struct key *key,
> +struct key_preparsed_payload *prep)
> +{
> + struct trusted_key_payload *payload = NULL;
> + size_t datalen = prep->datalen;
> + char *datablob;
> + int ret = 0;
> + int key_cmd;
> + size_t key_len;
> +
> + if (datalen <= 0 || datalen > 32767 || !prep->data)
> + return -EINVAL;
> +
> + datablob = kmalloc(datalen + 1, GFP_KERNEL);
> + if (!datablob)
> + return -ENOMEM;
> + memcpy(datablob, prep->data, datalen);
> + datablob[datalen] = '\0';
> +
> + payload = trusted_payload_alloc(key);
> + if (!payload) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + key_cmd = datablob_parse(datablob, payload);
> + if (key_cmd < 0) {
> + ret = key_cmd;
> + goto out;
> + }
> +
> + dump_payload(payload);
> +
> + switch (key_cmd) {
> + case Opt_load:
> + ret = static_call(trusted_key_unseal)(payload,
> datablob);

We're passing the unmodified

datablob="new\034\0keyhandle=0x8101"

Into the tpm trusted_key_unseal function.  However, it only sees "new"
and promply gives EINVAL because you've removed the ability to process
the new option from it.  What should have happened is you should have
moved data blob up to passed the consumed tokens, so it actually reads

datablob="keyhandle=0x8101"

However, to do that you'd have to have the updated pointer passed out
of your datablob_parse() above.

There's also a lost !tpm2 in the check for options->keyhandle, but I
suspect Jarkko lost that merging the two patches.  I think what's below
fixes all of this, so if you can test it for trusted_tee, I'll package
it up as two separate patches fixing all of this.

James

---

diff --git a/security/keys/trusted-keys/trusted_core.c 
b/security/keys/trusted-keys/trusted_core.c
index ec3a066a4b42..7c636212429b 100644
--- a/security/keys/trusted-keys/trusted_core.c
+++ b/security/keys/trusted-keys/trusted_core.c
@@ -62,7 +62,7 @@ static const match_table_t key_tokens = {
  *
  * On success returns 0, otherwise -EINVAL.
  */
-static int datablob_parse(char *datablob, struct trusted_key_payload *p)
+static int datablob_parse(char **datablob, struct trusted_key_payload *p)
 {
substring_t args[MAX_OPT_ARGS];
long keylen;
@@ -71,14 +71,14 @@ static int datablob_parse(char *datablob, struct 
trusted_key_payload *p)
char *c;
 
/* main command */
-   c = strsep(, " \t");
+   c = strsep(datablob, " \t");
if (!c)
return -EINVAL;
key_cmd = match_token(c, key_tokens, args);
switch (key_cmd) {
case Opt_new:
/* first argument is key size */
-   c = strsep(, " \t");
+   c = strsep(datablob, " \t");
if (!c)
return -EINVAL;
ret = kstrtol(c, 10, );
@@ -89,7 +89,7 @@ static int datablob_parse(char *datablob, struct 
trusted_key_payload *p)

[GIT PULL] SCSI fixes for 5.12-rc6

2021-04-17 Thread James Bottomley

This libsas fix is for a problem that occurs when trying to change the
cache type of an ATA device and the libiscsi one is a regression fix
from this merge window.

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Jolly Shah (1):
  scsi: libsas: Reset num_scatter if libata marks qc as NODATA

Mike Christie (1):
  scsi: iscsi: Fix iSCSI cls conn state

And the diffstat:

 drivers/scsi/libiscsi.c | 26 +++---
 drivers/scsi/libsas/sas_ata.c   |  9 -
 drivers/scsi/scsi_transport_iscsi.c | 18 +++---
 3 files changed, 22 insertions(+), 31 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 7ad11e42306d..bfd2aaa9b66b 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -3179,9 +3179,10 @@ fail_mgmt_tasks(struct iscsi_session *session, struct 
iscsi_conn *conn)
}
 }
 
-static void iscsi_start_session_recovery(struct iscsi_session *session,
-struct iscsi_conn *conn, int flag)
+void iscsi_conn_stop(struct iscsi_cls_conn *cls_conn, int flag)
 {
+   struct iscsi_conn *conn = cls_conn->dd_data;
+   struct iscsi_session *session = conn->session;
int old_stop_stage;
 
mutex_lock(>eh_mutex);
@@ -3239,27 +3240,6 @@ static void iscsi_start_session_recovery(struct 
iscsi_session *session,
spin_unlock_bh(>frwd_lock);
mutex_unlock(>eh_mutex);
 }
-
-void iscsi_conn_stop(struct iscsi_cls_conn *cls_conn, int flag)
-{
-   struct iscsi_conn *conn = cls_conn->dd_data;
-   struct iscsi_session *session = conn->session;
-
-   switch (flag) {
-   case STOP_CONN_RECOVER:
-   cls_conn->state = ISCSI_CONN_FAILED;
-   break;
-   case STOP_CONN_TERM:
-   cls_conn->state = ISCSI_CONN_DOWN;
-   break;
-   default:
-   iscsi_conn_printk(KERN_ERR, conn,
- "invalid stop flag %d\n", flag);
-   return;
-   }
-
-   iscsi_start_session_recovery(session, conn, flag);
-}
 EXPORT_SYMBOL_GPL(iscsi_conn_stop);
 
 int iscsi_conn_bind(struct iscsi_cls_session *cls_session,
diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 024e5a550759..8b9a39077dba 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -201,18 +201,17 @@ static unsigned int sas_ata_qc_issue(struct 
ata_queued_cmd *qc)
memcpy(task->ata_task.atapi_packet, qc->cdb, qc->dev->cdb_len);
task->total_xfer_len = qc->nbytes;
task->num_scatter = qc->n_elem;
+   task->data_dir = qc->dma_dir;
+   } else if (qc->tf.protocol == ATA_PROT_NODATA) {
+   task->data_dir = DMA_NONE;
} else {
for_each_sg(qc->sg, sg, qc->n_elem, si)
xfer += sg_dma_len(sg);
 
task->total_xfer_len = xfer;
task->num_scatter = si;
-   }
-
-   if (qc->tf.protocol == ATA_PROT_NODATA)
-   task->data_dir = DMA_NONE;
-   else
task->data_dir = qc->dma_dir;
+   }
task->scatter = qc->sg;
task->ata_task.retry_count = 1;
task->task_state_flags = SAS_TASK_STATE_PENDING;
diff --git a/drivers/scsi/scsi_transport_iscsi.c 
b/drivers/scsi/scsi_transport_iscsi.c
index bebfb355abdf..21a2d997a72e 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2470,10 +2470,22 @@ static void iscsi_if_stop_conn(struct iscsi_cls_conn 
*conn, int flag)
 * it works.
 */
mutex_lock(_mutex);
+   switch (flag) {
+   case STOP_CONN_RECOVER:
+   conn->state = ISCSI_CONN_FAILED;
+   break;
+   case STOP_CONN_TERM:
+   conn->state = ISCSI_CONN_DOWN;
+   break;
+   default:
+   iscsi_cls_conn_printk(KERN_ERR, conn,
+ "invalid stop flag %d\n", flag);
+   goto unlock;
+   }
+
conn->transport->stop_conn(conn, flag);
-   conn->state = ISCSI_CONN_DOWN;
+unlock:
mutex_unlock(_mutex);
-
 }
 
 static void stop_conn_work_fn(struct work_struct *work)
@@ -2961,7 +2973,7 @@ static int iscsi_if_ep_disconnect(struct iscsi_transport 
*transport,
mutex_lock(>ep_mutex);
conn->ep = NULL;
mutex_unlock(>ep_mutex);
-   conn->state = ISCSI_CONN_DOWN;
+   conn->state = ISCSI_CONN_FAILED;
}
 
transport->ep_disconnect(ep);

Re: [PATCH][next] scsi: aacraid: Replace one-element array with flexible-array member

2021-04-13 Thread James Bottomley

On Tue, 2021-04-13 at 00:45 -0500, Gustavo A. R. Silva wrote:
> Hi Martin,
> 
> On 4/12/21 23:52, Martin K. Petersen wrote:
> 
> > Silencing analyzer warnings shouldn't be done at the expense of
> > human
> > readers. If it is imperative to switch to flex_array_size() to
> > quiesce
> > checker warnings, please add a comment in the code explaining that
> > the
> > size evaluates to nseg_new-1 sge_ieee1212 structs.
> 
> Done:
>   
> https://lore.kernel.org/lkml/20210413054032.GA276102@embeddedor/

I think the reason everyone gets confused is that they think the first
argument should do something.  If flex_array_size had been defined

#define flex_array_size(p, count)   \
array_size(count,   \
sizeof(*(p)) + __must_be_array(p))

Then we could have used

flex_array_size(sge, nseg_new - 1)

or

flex_array_size(rio->sge, nseg_new - 1)

and everyone would have understood either expression.  This would also
have been useful, as the first example demonstrates, when we have a
pointer rather than a flexible member ... although that means the macro
likely needs a new name.

However, perhaps just do

array_size(nseg_new - 1, sizeof(*sge));

And lose the comment?

James

Re: [PATCH][next] KEYS: trusted: Fix missing null return from kzalloc call

2021-04-12 Thread James Bottomley

On Mon, 2021-04-12 at 17:01 +0100, Colin King wrote:
> From: Colin Ian King 
> 
> The kzalloc call can return null with the GFP_KERNEL flag so
> add a null check and exit via a new error exit label. Use the
> same exit error label for another error path too.
> 
> Addresses-Coverity: ("Dereference null return value")
> Fixes: 830027e2cb55 ("KEYS: trusted: Add generic trusted keys
> framework")
> Signed-off-by: Colin Ian King 
> ---
>  security/keys/trusted-keys/trusted_core.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/security/keys/trusted-keys/trusted_core.c
> b/security/keys/trusted-keys/trusted_core.c
> index ec3a066a4b42..90774793f0b1 100644
> --- a/security/keys/trusted-keys/trusted_core.c
> +++ b/security/keys/trusted-keys/trusted_core.c
> @@ -116,11 +116,13 @@ static struct trusted_key_payload
> *trusted_payload_alloc(struct key *key)
>  
>   ret = key_payload_reserve(key, sizeof(*p));
>   if (ret < 0)
> - return p;
> + goto err;
>   p = kzalloc(sizeof(*p), GFP_KERNEL);
> + if (!p)
> + goto err;
>  
>   p->migratable = migratable;
> -
> +err:
>   return p;

This is clearly a code migration bug in 

commit 251c85bd106099e6f388a89e88e12d14de2c9cda
Author: Sumit Garg 
Date:   Mon Mar 1 18:41:24 2021 +0530

KEYS: trusted: Add generic trusted keys framework

Which has for addition to trusted_core.c:

+static struct trusted_key_payload *trusted_payload_alloc(struct key
*key)
+{
+   struct trusted_key_payload *p = NULL;
+   int ret;
+
+   ret = key_payload_reserve(key, sizeof(*p));
+   if (ret < 0)
+   return p;
+   p = kzalloc(sizeof(*p), GFP_KERNEL);
+
+   p->migratable = migratable;
+
+   return p;
+}

And for trusted_tpm1.c:

-static struct trusted_key_payload *trusted_payload_alloc(struct key
*key)
-{
-   struct trusted_key_payload *p = NULL;
-   int ret;
-
-   ret = key_payload_reserve(key, sizeof *p);
-   if (ret < 0)
-   return p;
-   p = kzalloc(sizeof *p, GFP_KERNEL);
-   if (p)
-   p->migratable = 1; /* migratable by default */
-   return p;
-}

The trusted_tpm1.c code was correct and we got this bug introduced by
what should have been a simple cut and paste ... how did that happen? 
And therefore, how safe is the rest of the extraction into
trusted_core.c?

James

[GIT PULL] SCSI fixes for 5.12-rc6

2021-04-10 Thread James Bottomley

Seven fixes all in drivers.  The hpsa three are the most extensive and
the most problematic: it's a packed structure misalignment that oopses
on ia64 but looks like it would also oops on quite a few non-x86
architectures.  The pm80xx is a regression and the rest are bug fixes
for patches in the misc tree.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Can Guo (2):
  scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUs
  scsi: ufs: core: Fix task management request completion timeout

Martin Wilck (1):
  scsi: scsi_transport_srp: Don't block target in SRP_PORT_LOST state

Roman Bolshakov (1):
  scsi: target: iscsi: Fix zero tag inside a trace event

Sergei Trofimovich (3):
  scsi: hpsa: Add an assert to prevent __packed reintroduction
  scsi: hpsa: Fix boot on ia64 (atomic_t alignment)
  scsi: hpsa: Use __packed on individual structs, not header-wide

Viswas G (1):
  scsi: pm80xx: Fix chip initialization failure

And the diffstat:

 drivers/scsi/hpsa_cmd.h | 78 +
 drivers/scsi/pm8001/pm8001_hwi.c|  8 ++--
 drivers/scsi/scsi_transport_srp.c   |  2 +-
 drivers/scsi/ufs/ufshcd.c   | 31 +++
 drivers/target/iscsi/iscsi_target.c |  3 +-
 5 files changed, 65 insertions(+), 57 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index d126bb877250..ba6a3aa8d954 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -20,6 +20,11 @@
 #ifndef HPSA_CMD_H
 #define HPSA_CMD_H
 
+#include 
+
+#include  /* static_assert */
+#include  /* offsetof */
+
 /* general boundary defintions */
 #define SENSEINFOBYTES  32 /* may vary between hbas */
 #define SG_ENTRIES_IN_CMD  32 /* Max SG entries excluding chain blocks */
@@ -200,12 +205,10 @@ union u64bit {
MAX_EXT_TARGETS + 1) /* + 1 is for the controller itself */
 
 /* SCSI-3 Commands */
-#pragma pack(1)
-
 #define HPSA_INQUIRY 0x12
 struct InquiryData {
u8 data_byte[36];
-};
+} __packed;
 
 #define HPSA_REPORT_LOG 0xc2/* Report Logical LUNs */
 #define HPSA_REPORT_PHYS 0xc3   /* Report Physical LUNs */
@@ -221,7 +224,7 @@ struct raid_map_disk_data {
u8xor_mult[2];/**< XOR multipliers for this position,
*  valid for data disks only */
u8reserved[2];
-};
+} __packed;
 
 struct raid_map_data {
__le32   structure_size;/* Size of entire structure in bytes */
@@ -247,14 +250,14 @@ struct raid_map_data {
__le16   dekindex;  /* Data encryption key index. */
u8reserved[16];
struct raid_map_disk_data data[RAID_MAP_MAX_ENTRIES];
-};
+} __packed;
 
 struct ReportLUNdata {
u8 LUNListLength[4];
u8 extended_response_flag;
u8 reserved[3];
u8 LUN[HPSA_MAX_LUN][8];
-};
+} __packed;
 
 struct ext_report_lun_entry {
u8 lunid[8];
@@ -269,20 +272,20 @@ struct ext_report_lun_entry {
u8 lun_count; /* multi-lun device, how many luns */
u8 redundant_paths;
u32 ioaccel_handle; /* ioaccel1 only uses lower 16 bits */
-};
+} __packed;
 
 struct ReportExtendedLUNdata {
u8 LUNListLength[4];
u8 extended_response_flag;
u8 reserved[3];
struct ext_report_lun_entry LUN[HPSA_MAX_PHYS_LUN];
-};
+} __packed;
 
 struct SenseSubsystem_info {
u8 reserved[36];
u8 portname[8];
u8 reserved1[1108];
-};
+} __packed;
 
 /* BMIC commands */
 #define BMIC_READ 0x26
@@ -317,7 +320,7 @@ union SCSI3Addr {
u8 Targ:6;
u8 Mode:2;/* b10 */
} LogUnit;
-};
+} __packed;
 
 struct PhysDevAddr {
u32 TargetId:24;
@@ -325,20 +328,20 @@ struct PhysDevAddr {
u32 Mode:2;
/* 2 level target device addr */
union SCSI3Addr  Target[2];
-};
+} __packed;
 
 struct LogDevAddr {
u32VolId:30;
u32Mode:2;
u8 reserved[4];
-};
+} __packed;
 
 union LUNAddr {
u8   LunAddrBytes[8];
union SCSI3AddrSCSI3Lun[4];
struct PhysDevAddr PhysDev;
struct LogDevAddr  LogDev;
-};
+} __packed;
 
 struct CommandListHeader {
u8  ReplyQueue;
@@ -346,7 +349,7 @@ struct CommandListHeader {
__le16  SGTotal;
__le64  tag;
union LUNAddr LUN;
-};
+} __packed;
 
 struct RequestBlock {
u8   CDBLen;
@@ -365,18 +368,18 @@ struct RequestBlock {
 #define GET_DIR(tad) (((tad) >> 6) & 0x03)
u16  Timeout;
u8   CDB[16];
-};
+} __packed;
 
 struct ErrDescriptor {
__le64 Addr;
__le32 Len;
-};
+} __packed;
 
 struct SGDescriptor {
__le64 Addr;
__le32 Len;
__le32 Ext;
-};
+} __packed;
 
 union MoreErrInfo {

Re: [RFC v2] KVM: x86: Support KVM VMs sharing SEV context

2021-04-08 Thread James Bottomley

On Thu, 2021-04-08 at 17:41 -0700, Steve Rutherford wrote:
> On Thu, Apr 8, 2021 at 2:15 PM James Bottomley 
> wrote:
> > On Thu, 2021-04-08 at 12:48 -0700, Steve Rutherford wrote:
> > > On Thu, Apr 8, 2021 at 10:43 AM James Bottomley <
> > > j...@linux.ibm.com>
> > > wrote:
> > > > On Fri, 2021-04-02 at 16:20 +0200, Paolo Bonzini wrote:
[...]
> > > > > However, it would be nice to collaborate on the low-level
> > > > > (SEC/PEI) firmware patches to detect whether a CPU is part of
> > > > > the primary VM or the mirror.  If Google has any OVMF patches
> > > > > already done for that, it would be great to combine it with
> > > > > IBM's SEV migration code and merge it into upstream OVMF.
> > > > 
> > > > We've reached the stage with our prototyping where not having
> > > > the OVMF support is blocking us from working on QEMU.  If we're
> > > > going to have to reinvent the wheel in OVMF because Google is
> > > > unwilling to publish the patches, can you at least give some
> > > > hints about how you did it?
> > > > 
> > > > Thanks,
> > > > 
> > > > James
> > > 
> > > Hey James,
> > > It's not strictly necessary to modify OVMF to make SEV VMs live
> > > migrate. If we were to modify OVMF, we would contribute those
> > > changes
> > > upstream.
> > 
> > Well, no, we already published an OVMF RFC to this list that does
> > migration.  However, the mirror approach requires a different boot
> > mechanism for the extra vCPU in the mirror.  I assume you're doing
> > this bootstrap through OVMF so the hypervisor can interrogate it to
> > get the correct entry point?  That's the code we're asking to see
> > because that's what replaces our use of the MP service in the RFC.
> > 
> > James
> 
> Hey James,
> The intention would be to have a separate, stand-alone firmware-like
> binary run by the mirror. Since the VMM is in control of where it
> places that binary in the guest physical address space and the
> initial configuration of the vCPUs, it can point the vCPUs at an
> entry point contained within that binary, rather than at the standard
> x86 reset vector.

If you want to share ASIDs you have to share the firmware that the
running VM has been attested to.  Once the VM moves from LAUNCH to
RUNNING, the PSP won't allow the VMM to inject any more firmware or do
any more attestations.  What you mirror after this point can thus only
contain what has already been measured or what the guest added.  This
is why we think there has to be a new entry path into the VM for the
mirror vCPU.

So assuming you're thinking you'll inject two pieces of firmware at
start of day: the OVFM and this separate binary and attest to both,
then you can do that, but then you have two problems:

   1. Preventing OVMF from trampling all over your separate binary while
  it's booting
   2. Launching the vCPU up into this separate binary in a way it can
  execute (needs stack and heap)

I think you can likely solve 1. by making the separate binary look like
a ROM, but then you have the problem of where you steal the RAM you
need for a heap and stack and it still brings us back to how to launch
the vCPU which was the original question.

With ES we can set the registers at launch, so a vCPU that's never
launched can still be pre-programmed with the separate binary entry
point but solving the stack and heap looks like it requires co-
operation from OVMF.

That's why we were thinking the easiest straight line approach is to
have a runtime DXE which has a declared initialization routine that
allocates memory for the stack and a heap and a separate declared entry
point for the vCPU which picks up the already allocated and mapped
stack and heap.

James

Re: [RFC v2] KVM: x86: Support KVM VMs sharing SEV context

2021-04-08 Thread James Bottomley

On Thu, 2021-04-08 at 12:48 -0700, Steve Rutherford wrote:
> On Thu, Apr 8, 2021 at 10:43 AM James Bottomley 
> wrote:
> > On Fri, 2021-04-02 at 16:20 +0200, Paolo Bonzini wrote:
> > > On 02/04/21 13:58, Ashish Kalra wrote:
> > > > Hi Nathan,
> > > > 
> > > > Will you be posting a corresponding Qemu patch for this ?
> > > 
> > > Hi Ashish,
> > > 
> > > as far as I know IBM is working on QEMU patches for guest-based
> > > migration helpers.
> > 
> > Yes, that's right, we'll take on this part.
> > 
> > > However, it would be nice to collaborate on the low-level
> > > (SEC/PEI) firmware patches to detect whether a CPU is part of the
> > > primary VM or the mirror.  If Google has any OVMF patches already
> > > done for that, it would be great to combine it with IBM's SEV
> > > migration code and merge it into upstream OVMF.
> > 
> > We've reached the stage with our prototyping where not having the
> > OVMF support is blocking us from working on QEMU.  If we're going
> > to have to reinvent the wheel in OVMF because Google is unwilling
> > to publish the patches, can you at least give some hints about how
> > you did it?
> > 
> > Thanks,
> > 
> > James
> 
> Hey James,
> It's not strictly necessary to modify OVMF to make SEV VMs live
> migrate. If we were to modify OVMF, we would contribute those changes
> upstream.

Well, no, we already published an OVMF RFC to this list that does
migration.  However, the mirror approach requires a different boot
mechanism for the extra vCPU in the mirror.  I assume you're doing this
bootstrap through OVMF so the hypervisor can interrogate it to get the
correct entry point?  That's the code we're asking to see because
that's what replaces our use of the MP service in the RFC.

James

Re: [RFC v2] KVM: x86: Support KVM VMs sharing SEV context

2021-04-08 Thread James Bottomley

On Fri, 2021-04-02 at 16:20 +0200, Paolo Bonzini wrote:
> On 02/04/21 13:58, Ashish Kalra wrote:
> > Hi Nathan,
> > 
> > Will you be posting a corresponding Qemu patch for this ?
> 
> Hi Ashish,
> 
> as far as I know IBM is working on QEMU patches for guest-based 
> migration helpers.

Yes, that's right, we'll take on this part.

> However, it would be nice to collaborate on the low-level (SEC/PEI) 
> firmware patches to detect whether a CPU is part of the primary VM
> or the mirror.  If Google has any OVMF patches already done for that,
> it would be great to combine it with IBM's SEV migration code and
> merge it into upstream OVMF.

We've reached the stage with our prototyping where not having the OVMF
support is blocking us from working on QEMU.  If we're going to have to
reinvent the wheel in OVMF because Google is unwilling to publish the
patches, can you at least give some hints about how you did it?

Thanks,

James

[GIT PULL] SCSI fixes for 5.12-rc

2021-04-02 Thread James Bottomley

Single fix to iscsi for a rare race condition which can cause a kernel
panic.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Gulam Mohamed (1):
  scsi: iscsi: Fix race condition between login and sync thread

And the diffstat:

 drivers/scsi/scsi_transport_iscsi.c | 14 +-
 include/scsi/scsi_transport_iscsi.h |  1 +
 2 files changed, 14 insertions(+), 1 deletion(-)

With full diff below.

James

---

diff --git a/drivers/scsi/scsi_transport_iscsi.c 
b/drivers/scsi/scsi_transport_iscsi.c
index 969d24d580e2..bebfb355abdf 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2471,6 +2471,7 @@ static void iscsi_if_stop_conn(struct iscsi_cls_conn 
*conn, int flag)
 */
mutex_lock(_mutex);
conn->transport->stop_conn(conn, flag);
+   conn->state = ISCSI_CONN_DOWN;
mutex_unlock(_mutex);
 
 }
@@ -2894,6 +2895,13 @@ iscsi_set_param(struct iscsi_transport *transport, 
struct iscsi_uevent *ev)
default:
err = transport->set_param(conn, ev->u.set_param.param,
   data, ev->u.set_param.len);
+   if ((conn->state == ISCSI_CONN_BOUND) ||
+   (conn->state == ISCSI_CONN_UP)) {
+   err = transport->set_param(conn, ev->u.set_param.param,
+   data, ev->u.set_param.len);
+   } else {
+   return -ENOTCONN;
+   }
}
 
return err;
@@ -2953,6 +2961,7 @@ static int iscsi_if_ep_disconnect(struct iscsi_transport 
*transport,
mutex_lock(>ep_mutex);
conn->ep = NULL;
mutex_unlock(>ep_mutex);
+   conn->state = ISCSI_CONN_DOWN;
}
 
transport->ep_disconnect(ep);
@@ -3713,6 +3722,8 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr 
*nlh, uint32_t *group)
ev->r.retcode = transport->bind_conn(session, conn,
ev->u.b_conn.transport_eph,
ev->u.b_conn.is_leading);
+   if (!ev->r.retcode)
+   conn->state = ISCSI_CONN_BOUND;
mutex_unlock(_mutex);
 
if (ev->r.retcode || !transport->ep_connect)
@@ -3944,7 +3955,8 @@ iscsi_conn_attr(local_ipaddr, ISCSI_PARAM_LOCAL_IPADDR);
 static const char *const connection_state_names[] = {
[ISCSI_CONN_UP] = "up",
[ISCSI_CONN_DOWN] = "down",
-   [ISCSI_CONN_FAILED] = "failed"
+   [ISCSI_CONN_FAILED] = "failed",
+   [ISCSI_CONN_BOUND] = "bound"
 };
 
 static ssize_t show_conn_state(struct device *dev,
diff --git a/include/scsi/scsi_transport_iscsi.h 
b/include/scsi/scsi_transport_iscsi.h
index 8a26a2ffa952..fc5a39839b4b 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -193,6 +193,7 @@ enum iscsi_connection_state {
ISCSI_CONN_UP = 0,
ISCSI_CONN_DOWN,
ISCSI_CONN_FAILED,
+   ISCSI_CONN_BOUND,
 };
 
 struct iscsi_cls_conn {

Re: [PATCH v1 0/3] KEYS: trusted: Introduce support for NXP CAAM-based trusted keys

2021-04-01 Thread James Bottomley

On Thu, 2021-04-01 at 18:50 +0530, Sumit Garg wrote:
> On Thu, 1 Apr 2021 at 15:36, Ahmad Fatoum 
> wrote:
> > Hello Richard,
> > 
> > On 31.03.21 21:36, Richard Weinberger wrote:
> > > James,
> > > 
> > > - Ursprüngliche Mail -
> > > > Von: "James Bottomley" 
> > > > Well, yes.  For the TPM, there's a defined ASN.1 format for the
> > > > keys:
> > > > 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/jejb/openssl_tpm2_engine.git/tree/tpm2-asn.h
> > > > 
> > > > and part of the design of the file is that it's distinguishable
> > > > either
> > > > in DER or PEM (by the guards) format so any crypto application
> > > > can know
> > > > it's dealing with a TPM key simply by inspecting the file.  I
> > > > think you
> > > > need the same thing for CAAM and any other format.
> > > > 
> > > > We're encouraging new ASN.1 formats to be of the form
> > > > 
> > > > SEQUENCE {
> > > >type   OBJECT IDENTIFIER
> > > >... key specific fields ...
> > > > }
> > > > 
> > > > Where you choose a defined OID to represent the key and that
> > > > means
> > > > every key even in DER form begins with a unique binary
> > > > signature.
> > > 
> > > I like this idea.
> > > Ahmad, what do you think?
> > > 
> > > That way we could also get rid off the kernel parameter and all
> > > the fall back logic,
> > > given that we find a way to reliable detect TEE blobs too...
> > 
> > Sounds good to me. Sumit, your thoughts on doing this for TEE as
> > well?
> > 
> 
> AFAIU, ASN.1 formating should be independent of trusted keys backends
> which could be abstracted to trusted keys core layer so that every
> backend could be plugged in seamlessly.
> 
> James,
> 
> Would it be possible to achieve this?

I'm not quite sure what you're asking.  The saved format of the keys is
particular to the underlying hardware.  The ASN.1 wraps this so we have
an identifier, some useful information to help load the key (like the
policy systemements) and then the blobs the hardware expects.

This makes the ASN.1 specific to the backend but having a
distinguishing OID that allows anyone to tell which backend it needs
from the file.

So you can call the ASN.1 format that begins with the type OID, the
"universal" format, but if you mean there should be a standard ASN.1
format for all keys that defines all the fields then that's not
possible because the fields after type are very specific to the
underlying hardware.

James

Re: [PATCH v1 0/3] KEYS: trusted: Introduce support for NXP CAAM-based trusted keys

2021-03-31 Thread James Bottomley

On Wed, 2021-03-31 at 20:36 +0200, Richard Weinberger wrote:
> James,
> 
> - Ursprüngliche Mail -
> > Von: "James Bottomley" 
> > > On Wed, Mar 17, 2021 at 3:08 PM Ahmad Fatoum <
> > > a.fat...@pengutronix.de wrote:
> > > > keyctl add trusted $KEYNAME "load $(cat ~/kmk.blob)" @s
> > > 
> > > Is there a reason why we can't pass the desired backend name in
> > > the
> > > trusted key parameters?
> > > e.g.
> > > keyctl add trusted $KEYNAME "backendtype caam load $(cat
> > > ~/kmk.blob)"
> > > @s
> > 
> > Why would you want to in the load?  The blob should be type
> > specific, so a TPM key shouldn't load as a CAAM key and vice versa
> > ... and if they're not they need to be made so before the patches
> > go upstream.
> 
> I fear right now there is no good way to detect whether a blob is
> desired for CAAM or TPM.

At least for the TPM the old format is two TPM2B structures, and the
new one is ASN.1 so either should be completely distinguishable over
what CAAM does.

> > I could possibly see that you might want to be type specific in the
> > create, but once you're simply loading an already created key, the
> > trusted key subsystem should be able to figure what to do on its
> > own.
> 
> So you have some kind of container format in mind which denotes the
> type of the blob?

Well, yes.  For the TPM, there's a defined ASN.1 format for the keys:

https://git.kernel.org/pub/scm/linux/kernel/git/jejb/openssl_tpm2_engine.git/tree/tpm2-asn.h

and part of the design of the file is that it's distinguishable either
in DER or PEM (by the guards) format so any crypto application can know
it's dealing with a TPM key simply by inspecting the file.  I think you
need the same thing for CAAM and any other format.

We're encouraging new ASN.1 formats to be of the form

SEQUENCE {
type   OBJECT IDENTIFIER
... key specific fields ...
}

Where you choose a defined OID to represent the key and that means
every key even in DER form begins with a unique binary signature.

James

Re: [PATCH v1 0/3] KEYS: trusted: Introduce support for NXP CAAM-based trusted keys

2021-03-30 Thread James Bottomley

On Wed, 2021-03-31 at 00:04 +0200, Richard Weinberger wrote:
> Ahmad,
> 
> On Wed, Mar 17, 2021 at 3:08 PM Ahmad Fatoum  > wrote:
> > keyctl add trusted $KEYNAME "load $(cat ~/kmk.blob)" @s
> 
> Is there a reason why we can't pass the desired backend name in the
> trusted key parameters?
> e.g.
> keyctl add trusted $KEYNAME "backendtype caam load $(cat ~/kmk.blob)"
> @s

Why would you want to in the load?  The blob should be type specific,
so a TPM key shouldn't load as a CAAM key and vice versa ... and if
they're not they need to be made so before the patches go upstream.

I could possibly see that you might want to be type specific in the
create, but once you're simply loading an already created key, the
trusted key subsystem should be able to figure what to do on its own.

James

[GIT PULL] SCSI fixes for 5.12-rc4

2021-03-27 Thread James Bottomley

Seven fixes, all in drivers (qla2xxx, mkt3sas, qedi, target, ibmvscsi).
The most serious are the target pscsi oom and the qla2xxx revert which
can otherwise cause a use after free.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Bart Van Assche (1):
  scsi: Revert "qla2xxx: Make sure that aborted commands are freed"

Jia-Ju Bai (2):
  scsi: mpt3sas: Fix error return code of mpt3sas_base_attach()
  scsi: qedi: Fix error return code of qedi_alloc_global_queues()

Martin Wilck (2):
  scsi: target: pscsi: Clean up after failure in pscsi_map_sg()
  scsi: target: pscsi: Avoid OOM in pscsi_map_sg()

Tyrel Datwyler (2):
  scsi: ibmvfc: Make ibmvfc_wait_for_ops() MQ aware
  scsi: ibmvfc: Fix potential race in ibmvfc_wait_for_ops()

And the diffstat:

 drivers/scsi/ibmvscsi/ibmvfc.c  | 67 ++---
 drivers/scsi/mpt3sas/mpt3sas_base.c |  8 +++--
 drivers/scsi/qedi/qedi_main.c   |  1 +
 drivers/scsi/qla2xxx/qla_target.c   | 13 +++
 drivers/scsi/qla2xxx/tcm_qla2xxx.c  |  4 ---
 drivers/target/target_core_pscsi.c  |  9 -
 6 files changed, 74 insertions(+), 28 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 6a92891ac488..bb64e3247a6c 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -2371,6 +2371,24 @@ static int ibmvfc_match_lun(struct ibmvfc_event *evt, 
void *device)
return 0;
 }
 
+/**
+ * ibmvfc_event_is_free - Check if event is free or not
+ * @evt:   ibmvfc event struct
+ *
+ * Returns:
+ * true / false
+ **/
+static bool ibmvfc_event_is_free(struct ibmvfc_event *evt)
+{
+   struct ibmvfc_event *loop_evt;
+
+   list_for_each_entry(loop_evt, >queue->free, queue_list)
+   if (loop_evt == evt)
+   return true;
+
+   return false;
+}
+
 /**
  * ibmvfc_wait_for_ops - Wait for ops to complete
  * @vhost: ibmvfc host struct
@@ -2385,35 +2403,58 @@ static int ibmvfc_wait_for_ops(struct ibmvfc_host 
*vhost, void *device,
 {
struct ibmvfc_event *evt;
DECLARE_COMPLETION_ONSTACK(comp);
-   int wait;
+   int wait, i, q_index, q_size;
unsigned long flags;
signed long timeout = IBMVFC_ABORT_WAIT_TIMEOUT * HZ;
+   struct ibmvfc_queue *queues;
 
ENTER;
+   if (vhost->mq_enabled && vhost->using_channels) {
+   queues = vhost->scsi_scrqs.scrqs;
+   q_size = vhost->scsi_scrqs.active_queues;
+   } else {
+   queues = >crq;
+   q_size = 1;
+   }
+
do {
wait = 0;
-   spin_lock_irqsave(>crq.l_lock, flags);
-   list_for_each_entry(evt, >crq.sent, queue_list) {
-   if (match(evt, device)) {
-   evt->eh_comp = 
-   wait++;
+   spin_lock_irqsave(vhost->host->host_lock, flags);
+   for (q_index = 0; q_index < q_size; q_index++) {
+   spin_lock([q_index].l_lock);
+   for (i = 0; i < queues[q_index].evt_pool.size; i++) {
+   evt = [q_index].evt_pool.events[i];
+   if (!ibmvfc_event_is_free(evt)) {
+   if (match(evt, device)) {
+   evt->eh_comp = 
+   wait++;
+   }
+   }
}
+   spin_unlock([q_index].l_lock);
}
-   spin_unlock_irqrestore(>crq.l_lock, flags);
+   spin_unlock_irqrestore(vhost->host->host_lock, flags);
 
if (wait) {
timeout = wait_for_completion_timeout(, timeout);
 
if (!timeout) {
wait = 0;
-   spin_lock_irqsave(>crq.l_lock, flags);
-   list_for_each_entry(evt, >crq.sent, 
queue_list) {
-   if (match(evt, device)) {
-   evt->eh_comp = NULL;
-   wait++;
+   spin_lock_irqsave(vhost->host->host_lock, 
flags);
+   for (q_index = 0; q_index < q_size; q_index++) {
+   spin_lock([q_index].l_lock);
+   for (i = 0; i < 
queues[q_index].evt_pool.size; i++) {
+   evt = 
[q_index].evt_pool.events[i];
+   if (!ibmvfc_event_is_free(evt)) 
{
+   if (match(evt,

Re: [PATCH v1 3/3] KEYS: trusted: Introduce support for NXP CAAM-based trusted keys

2021-03-24 Thread James Bottomley

On Wed, 2021-03-24 at 16:49 -0400, Mimi Zohar wrote:
> On Wed, 2021-03-24 at 09:14 -0700, James Bottomley wrote:
> > On Tue, 2021-03-23 at 14:07 -0400, Mimi Zohar wrote:
> > > On Tue, 2021-03-23 at 17:35 +0100, Ahmad Fatoum wrote:
> > > > Hello Horia,
> > > > 
> > > > On 21.03.21 21:48, Horia Geantă wrote:
> > > > > On 3/16/2021 7:02 PM, Ahmad Fatoum wrote:
> > > > > [...]
> > > > > > +struct trusted_key_ops caam_trusted_key_ops = {
> > > > > > +   .migratable = 0, /* non-migratable */
> > > > > > +   .init = trusted_caam_init,
> > > > > > +   .seal = trusted_caam_seal,
> > > > > > +   .unseal = trusted_caam_unseal,
> > > > > > +   .exit = trusted_caam_exit,
> > > > > > +};
> > > > > caam has random number generation capabilities, so it's worth
> > > > > using that
> > > > > by implementing .get_random.
> > > > 
> > > > If the CAAM HWRNG is already seeding the kernel RNG, why not
> > > > use
> > > > the kernel's?
> > > > 
> > > > Makes for less code duplication IMO.
> > > 
> > > Using kernel RNG, in general, for trusted keys has been discussed
> > > before.   Please refer to Dave Safford's detailed explanation for
> > > not
> > > using it [1].
> > > 
> > > [1] 
> > > https://lore.kernel.org/linux-integrity/bca04d5d9a3b764c9b7405bba4d4a3c035f2a...@alpmbapa12.e2k.ad.ge.com/
> > 
> > I still don't think relying on one source of randomness to be
> > cryptographically secure is a good idea.  The fear of bugs in the
> > kernel entropy pool is reasonable, but since it's widely used
> > they're
> > unlikely to persist very long.  Studies have shown that some TPMs
> > (notably the chinese manufactured ones) have suspicious failures in
> > their RNGs:
> > 
> > https://www.researchgate.net/publication/45934562_Benchmarking_the_True_Random_Number_Generator_of_TPM_Chips
> > 
> > And most cryptograhpers recommend using a TPM for entropy mixing
> > rather
> > than directly:
> > 
> > https://blog.cryptographyengineering.com/category/rngs/
> > 
> > The TPMFail paper also shows that in spite of NIST certification
> > things can go wrong with a TPM:
> > 
> > https://tpm.fail/
> 
> We already had a lengthy discussion on replacing the TPM RNG with the
> kernel RNG for trusted keys, when TEE was being introduced
> [2,3].  I'm not interested in re-hashing that discussion here.   The
> only difference now is that CAAM is a new trust source.  I suspect
> the same concerns/issues persist, but at least in this case using the
> kernel RNG would not be a regression.

Upstreaming the ASN.1 parser gives us a way to create trusted keys
outside the kernel and so choose any RNG that suits the user, so I
don't think there's any need to rehash for TPM based keys either.

However CaaM doesn't have the ability to create keys outside the kernel
yet, so they do need to consider the problem.

James


> [2] Pascal Van Leeuwen on mixing different sources of entropy and
> certification -
>  
> https://lore.kernel.org/linux-integrity/mn2pr20mb29732a856a40131a671f949fca...@mn2pr20mb2973.namprd20.prod.outlook.com/
> [3] Jarrko on "regression" and tpm_asym.c - 
> https://lore.kernel.org/linux-integrity/20191014190033.ga15...@linux.intel.com/
>  
> 
> Mimi
>

Re: [PATCH v1 3/3] KEYS: trusted: Introduce support for NXP CAAM-based trusted keys

2021-03-24 Thread James Bottomley

On Tue, 2021-03-23 at 14:07 -0400, Mimi Zohar wrote:
> On Tue, 2021-03-23 at 17:35 +0100, Ahmad Fatoum wrote:
> > Hello Horia,
> > 
> > On 21.03.21 21:48, Horia Geantă wrote:
> > > On 3/16/2021 7:02 PM, Ahmad Fatoum wrote:
> > > [...]
> > > > +struct trusted_key_ops caam_trusted_key_ops = {
> > > > +   .migratable = 0, /* non-migratable */
> > > > +   .init = trusted_caam_init,
> > > > +   .seal = trusted_caam_seal,
> > > > +   .unseal = trusted_caam_unseal,
> > > > +   .exit = trusted_caam_exit,
> > > > +};
> > > caam has random number generation capabilities, so it's worth
> > > using that
> > > by implementing .get_random.
> > 
> > If the CAAM HWRNG is already seeding the kernel RNG, why not use
> > the kernel's?
> > 
> > Makes for less code duplication IMO.
> 
> Using kernel RNG, in general, for trusted keys has been discussed
> before.   Please refer to Dave Safford's detailed explanation for not
> using it [1].
> 
> thanks,
> 
> Mimi
> 
> [1] 
> https://lore.kernel.org/linux-integrity/bca04d5d9a3b764c9b7405bba4d4a3c035f2a...@alpmbapa12.e2k.ad.ge.com/

I still don't think relying on one source of randomness to be
cryptographically secure is a good idea.  The fear of bugs in the
kernel entropy pool is reasonable, but since it's widely used they're
unlikely to persist very long.  Studies have shown that some TPMs
(notably the chinese manufactured ones) have suspicious failures in
their RNGs:

https://www.researchgate.net/publication/45934562_Benchmarking_the_True_Random_Number_Generator_of_TPM_Chips

And most cryptograhpers recommend using a TPM for entropy mixing rather
than directly:

https://blog.cryptographyengineering.com/category/rngs/

The TPMFail paper also shows that in spite of NIST certification
things can go wrong with a TPM:

https://tpm.fail/

James

Re: [Ksummit-discuss] RFC: create mailing list "linux-issues" focussed on issues/bugs and regressions

2021-03-23 Thread James Bottomley

On Tue, 2021-03-23 at 12:20 -0400, Steven Rostedt wrote:
> On Mon, 22 Mar 2021 20:25:15 +0100
> Thorsten Leemhuis  wrote:
> 
> > I agree to the last point and yeah, maybe regressions are the more
> > important problem we should work on – at least from the perspective
> > of kernel development.  But from the users perspective (and
> > reporting-issues.rst is written for that perspective) it feel a bit
> > unsatisfying to not have a solution to query for existing report,
> > regressions or not. H...
> 
> I think the bulk of user issues are going to be regressions. Although
> you may be in a better position to know for sure, but at least for
> me, wearing my "user" hat, the thing that gets me the most is
> upgrading to a new kernel and suddenly something that use to work no
> longer does. And that is the definition of a regression. My test
> boxes still run old distros (one is running fedora 13). These are the
> boxes that catch the most issues, and if they do, they are pretty
> much guaranteed to be a regression.
> 
> I like the "linux-regressions" mailing list idea.

Can't we use the fancy features of public inbox to get the best of both
worlds?  Have the bug list (or even a collection of lists) but make the
linux-regressions one a virtual list keying off an imap flag which a
group of people control.  That way anything that is flagged as a
regression appears in that public inbox.  I assume the search can be
quite wide so we could flag a regression on any list indexed by lore?

James

Re: [Ksummit-discuss] RFC: create mailing list "linux-issues" focussed on issues/bugs and regressions

2021-03-22 Thread James Bottomley

On Mon, 2021-03-22 at 13:16 -0400, Konstantin Ryabitsev wrote:
> On Mon, Mar 22, 2021 at 04:18:14PM +0100, Thorsten Leemhuis wrote:
> > Note, there is a second reason why ksummit-discuss is CCed: another
> > reason why I want to create this new list is making it easier to
> > find and track regressions reported to our various mailing lists
> > (often without LKML in CC, as for some subsystems it's seems to be
> > custom to not CC it). 
> 
> FYI, there will soon be a unified "search all of lore.kernel.org
> regardless of the list/feed source" capability that may make it
> unnecessary to create a separate list for this purpose. There's
> active ongoing work in the public-inbox project to provide parallel
> ways to follow aggregate topics, including query-based subscriptions
> (i.e. "put a thread into my inbox whenever someone mentions my
> favourite file/function/device name"). This work is not complete yet,
> but I have great hopes that it will become available in the next
> little while.
> 
> Once we have this ability, we should be able to plug in multiple
> sources beyond just mailing lists, including a feed of all
> bugzilla.kernel.org changes. This should allow someone an easy way to
> query specific bugs and may not require the creation of a separate
> list.
> 
> I'm not opposed to the creation of a new list, of course -- just want
> to make sure it's aligned with the improvements we are working to
> make available.

I suspect the problem is that there's no known useful search string to
find a bug report even given a searchable set of lists, so the main
purpose of this list would be "if it's on here, it's a bug report" and
the triage team can cc additional lists as appropriate.  Then we simply
tell everyone to send kernel bugs to this list and ask maintainers to
cc it if a bug report shows up on their list?

James

[GIT PULL] SCSI fixes for 5.12-rc3

2021-03-20 Thread James Bottomley

Eight fixes, all in drivers, all fairly minor either being fixes in
error legs, memory leaks on teardown, context errors or semantic
problems.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Alexey Dobriyan (1):
  scsi: qla2xxx: Fix broken #endif placement

Christophe JAILLET (1):
  scsi: mpt3sas: Do not use GFP_KERNEL in atomic context

Dan Carpenter (1):
  scsi: lpfc: Fix some error codes in debugfs

Johannes Thumshirn (1):
  scsi: sd_zbc: Update write pointer offset cache

Lv Yunlong (2):
  scsi: st: Fix a use after free in st_open()
  scsi: myrs: Fix a double free in myrs_cleanup()

Tyrel Datwyler (1):
  scsi: ibmvfc: Free channel_setup_buf during device tear down

dongjian (1):
  scsi: ufs: ufs-mediatek: Correct operator & -> &&

And the diffstat:

 drivers/scsi/ibmvscsi/ibmvfc.c   |  2 ++
 drivers/scsi/lpfc/lpfc_debugfs.c |  4 ++--
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |  2 +-
 drivers/scsi/myrs.c  |  2 +-
 drivers/scsi/qla2xxx/qla_target.h|  2 +-
 drivers/scsi/sd_zbc.c| 19 +++
 drivers/scsi/st.c|  2 +-
 drivers/scsi/ufs/ufs-mediatek.c  |  2 +-
 8 files changed, 20 insertions(+), 15 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 338e29d3e025..6a92891ac488 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -5784,6 +5784,8 @@ static void ibmvfc_free_mem(struct ibmvfc_host *vhost)
  vhost->disc_buf_dma);
dma_free_coherent(vhost->dev, sizeof(*vhost->login_buf),
  vhost->login_buf, vhost->login_buf_dma);
+   dma_free_coherent(vhost->dev, sizeof(*vhost->channel_setup_buf),
+ vhost->channel_setup_buf, vhost->channel_setup_dma);
dma_pool_destroy(vhost->sg_pool);
ibmvfc_free_queue(vhost, async_q);
LEAVE;
diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c
index bc79a017e1a2..46a8f2d1d2b8 100644
--- a/drivers/scsi/lpfc/lpfc_debugfs.c
+++ b/drivers/scsi/lpfc/lpfc_debugfs.c
@@ -2421,7 +2421,7 @@ lpfc_debugfs_dif_err_write(struct file *file, const char 
__user *buf,
memset(dstbuf, 0, 33);
size = (nbytes < 32) ? nbytes : 32;
if (copy_from_user(dstbuf, buf, size))
-   return 0;
+   return -EFAULT;
 
if (dent == phba->debug_InjErrLBA) {
if ((dstbuf[0] == 'o') && (dstbuf[1] == 'f') &&
@@ -2430,7 +2430,7 @@ lpfc_debugfs_dif_err_write(struct file *file, const char 
__user *buf,
}
 
if ((tmp == 0) && (kstrtoull(dstbuf, 0, )))
-   return 0;
+   return -EINVAL;
 
if (dent == phba->debug_writeGuard)
phba->lpfc_injerr_wgrd_cnt = (uint32_t)tmp;
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index ffca03064797..6aa6de729187 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -413,7 +413,7 @@ mpt3sas_get_port_by_id(struct MPT3SAS_ADAPTER *ioc,
 * And add this object to port_table_list.
 */
if (!ioc->multipath_on_hba) {
-   port = kzalloc(sizeof(struct hba_port), GFP_KERNEL);
+   port = kzalloc(sizeof(struct hba_port), GFP_ATOMIC);
if (!port)
return NULL;
 
diff --git a/drivers/scsi/myrs.c b/drivers/scsi/myrs.c
index 4adf9ded296a..329fd025c718 100644
--- a/drivers/scsi/myrs.c
+++ b/drivers/scsi/myrs.c
@@ -2273,12 +2273,12 @@ static void myrs_cleanup(struct myrs_hba *cs)
if (cs->mmio_base) {
cs->disable_intr(cs);
iounmap(cs->mmio_base);
+   cs->mmio_base = NULL;
}
if (cs->irq)
free_irq(cs->irq, cs);
if (cs->io_addr)
release_region(cs->io_addr, 0x80);
-   iounmap(cs->mmio_base);
pci_set_drvdata(pdev, NULL);
pci_disable_device(pdev);
scsi_host_put(cs->host);
diff --git a/drivers/scsi/qla2xxx/qla_target.h 
b/drivers/scsi/qla2xxx/qla_target.h
index 10e5e6c8087d..01620f3eab39 100644
--- a/drivers/scsi/qla2xxx/qla_target.h
+++ b/drivers/scsi/qla2xxx/qla_target.h
@@ -116,7 +116,6 @@
(min(1270, ((ql) > 0) ? (QLA_TGT_DATASEGS_PER_CMD_24XX + \
QLA_TGT_DATASEGS_PER_CONT_24XX*((ql) - 1)) : 0))
 #endif
-#endif
 
 #define GET_TARGET_ID(ha, iocb) ((HAS_EXTENDED_IDS(ha))
\
 ? le16_to_cpu((iocb)->u.isp2x.target.extended) \
@@ -244,6 +243,7 @@ struct ctio_to_2xxx {
 #ifndef CTIO_RET_TYPE
 #define CTIO_RET_TYPE  0x17/* CTIO return entry */
 #define ATIO_TYPE7 0x06 /* Accept target I/O entry for 24xx */
+#endif
 
 struct fcp_hdr {
uint8_t  r_ctl;
diff --git

[GIT PULL] SCSI fixes for 5.12-rc2

2021-03-12 Thread James Bottomley

10 updates, a non code maintainer update for vmw_pvscsi, five code
updates for ibmvfc and four for UFS.  All updates are either trivial
patches or bug fixes.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Can Guo (2):
  scsi: ufs: Remove redundant checks of !hba in suspend/resume callbacks
  scsi: ufs: Minor adjustments to error handling

Jiapeng Chong (1):
  scsi: ufs: Convert sysfs sprintf/snprintf family to sysfs_emit

Nitin Rawat (1):
  scsi: ufs: ufs-qcom: Disable interrupt in reset path

Tyrel Datwyler (5):
  scsi: ibmvfc: Reinitialize sub-CRQs and perform channel enquiry after LPM
  scsi: ibmvfc: Store return code of H_FREE_SUB_CRQ during cleanup
  scsi: ibmvfc: Treat H_CLOSED as success during sub-CRQ registration
  scsi: ibmvfc: Fix invalid sub-CRQ handles after hard reset
  scsi: ibmvfc: Simplify handling of sub-CRQ initialization

Vishal Bhakta (1):
  scsi: vmw_pvscsi: MAINTAINERS: Update maintainer

And the diffstat:

 MAINTAINERS|  2 +-
 drivers/scsi/ibmvscsi/ibmvfc.c | 62 --
 drivers/scsi/ufs/ufs-qcom.c| 10 +++
 drivers/scsi/ufs/ufshcd.c  | 41 +---
 drivers/scsi/vmw_pvscsi.c  |  2 --
 drivers/scsi/vmw_pvscsi.h  |  2 --
 6 files changed, 60 insertions(+), 59 deletions(-)

With full diff below.

James

---

diff --git a/MAINTAINERS b/MAINTAINERS
index d92f85ca831d..dede3b947834 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -19165,7 +19165,7 @@ S:  Maintained
 F: drivers/infiniband/hw/vmw_pvrdma/
 
 VMware PVSCSI driver
-M: Jim Gill 
+M: Vishal Bhakta 
 M: VMware PV-Drivers 
 L: linux-s...@vger.kernel.org
 S: Maintained
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 755313b766b9..338e29d3e025 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -158,6 +159,9 @@ static void ibmvfc_npiv_logout(struct ibmvfc_host *);
 static void ibmvfc_tgt_implicit_logout_and_del(struct ibmvfc_target *);
 static void ibmvfc_tgt_move_login(struct ibmvfc_target *);
 
+static void ibmvfc_release_sub_crqs(struct ibmvfc_host *);
+static void ibmvfc_init_sub_crqs(struct ibmvfc_host *);
+
 static const char *unknown_error = "unknown error";
 
 static long h_reg_sub_crq(unsigned long unit_address, unsigned long ioba,
@@ -899,6 +903,9 @@ static int ibmvfc_reenable_crq_queue(struct ibmvfc_host 
*vhost)
 {
int rc = 0;
struct vio_dev *vdev = to_vio_dev(vhost->dev);
+   unsigned long flags;
+
+   ibmvfc_release_sub_crqs(vhost);
 
/* Re-enable the CRQ */
do {
@@ -910,6 +917,15 @@ static int ibmvfc_reenable_crq_queue(struct ibmvfc_host 
*vhost)
if (rc)
dev_err(vhost->dev, "Error enabling adapter (rc=%d)\n", rc);
 
+   spin_lock_irqsave(vhost->host->host_lock, flags);
+   spin_lock(vhost->crq.q_lock);
+   vhost->do_enquiry = 1;
+   vhost->using_channels = 0;
+   spin_unlock(vhost->crq.q_lock);
+   spin_unlock_irqrestore(vhost->host->host_lock, flags);
+
+   ibmvfc_init_sub_crqs(vhost);
+
return rc;
 }
 
@@ -926,8 +942,8 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
unsigned long flags;
struct vio_dev *vdev = to_vio_dev(vhost->dev);
struct ibmvfc_queue *crq = >crq;
-   struct ibmvfc_queue *scrq;
-   int i;
+
+   ibmvfc_release_sub_crqs(vhost);
 
/* Close the CRQ */
do {
@@ -947,16 +963,6 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
memset(crq->msgs.crq, 0, PAGE_SIZE);
crq->cur = 0;
 
-   if (vhost->scsi_scrqs.scrqs) {
-   for (i = 0; i < nr_scsi_hw_queues; i++) {
-   scrq = >scsi_scrqs.scrqs[i];
-   spin_lock(scrq->q_lock);
-   memset(scrq->msgs.scrq, 0, PAGE_SIZE);
-   scrq->cur = 0;
-   spin_unlock(scrq->q_lock);
-   }
-   }
-
/* And re-open it again */
rc = plpar_hcall_norets(H_REG_CRQ, vdev->unit_address,
crq->msg_token, PAGE_SIZE);
@@ -966,9 +972,12 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
dev_warn(vhost->dev, "Partner adapter not ready\n");
else if (rc != 0)
dev_warn(vhost->dev, "Couldn't register crq (rc=%d)\n", rc);
+
spin_unlock(vhost->crq.q_lock);
spin_unlock_irqrestore(vhost->host->host_lock, flags);
 
+   ibmvfc_init_sub_crqs(vhost);
+
return rc;
 }
 
@@ -5642,7 +5651,8 @@ static int ibmvfc_register_scsi_channel(struct 
ibmvfc_host *vhost,
rc = h_reg_sub_crq(vdev->unit_address, scrq->msg_token, PAGE_SIZE,
   >cookie, >hw_irq);
 
-

Re: [PATCH v9 0/4] Introduce TEE based Trusted Keys support

2021-03-12 Thread James Bottomley

On Fri, 2021-03-12 at 18:26 +0200, Jarkko Sakkinen wrote:
> On Wed, Mar 10, 2021 at 02:26:27PM -0800, James Bottomley wrote:
> > On Wed, 2021-03-10 at 21:56 +0200, Jarkko Sakkinen wrote:
> > [...]
> > > I also need to apply 
> > > 
> > > https://lore.kernel.org/linux-integrity/20210127190617.17564-1-james.bottom...@hansenpartnership.com/
> > > 
> > > and I would like to do both while I'm at it.
> > > 
> > > James, there was one patch that needed fixing but I cannot find
> > > lore.kernel.org link. Can you point me to that so that we
> > > can proceed?
> > 
> > I think you mean this one observing a missing space in the commit
> > message:
> > 
> > https://lore.kernel.org/keyrings/1327393.1612972...@warthog.procyon.org.uk/
> > 
> > James
> 
> Makefile needed fixing (separate lines), and spaces where missing
> between
> commas in one file (checkpatch complained).
> 
> I digged a version from my reflog but as I noted privately it's
> missing one
> file.
> 
> Either provide that file or send a new version of the full patch set.

This is the file that got lost

James

---
--- ASN.1 for TPM 2.0 keys
---

TPMKey ::= SEQUENCE {
typeOBJECT IDENTIFIER ({tpm2_key_type}),
emptyAuth   [0] EXPLICIT BOOLEAN OPTIONAL,
parent  INTEGER ({tpm2_key_parent}),
pubkey  OCTET STRING ({tpm2_key_pub}),
privkey OCTET STRING ({tpm2_key_priv})
}

Re: [RFC PATCH 1/5] rpmb: add Replay Protected Memory Block (RPMB) subsystem

2021-03-10 Thread James Bottomley

On Thu, 2021-03-11 at 01:49 +0100, Linus Walleij wrote:
> The use case for TPM on laptops is similar: it can be used by a
> provider to lock down a machine, but it can also be used by the
> random user to store keys. Very few users beside James
> Bottomley are capable of doing that (I am not)

Yes, that's the problem with the TPM: pretty much no-one other than
someone prepared to become an expert in the subject can use it.  This
means that enabling RPMB is unlikely to be useful ... you have to
develop easy use cases for it as well.

>  but they exist.
> https://blog.hansenpartnership.com/using-your-tpm-as-a-secure-key-store/

It's the difficulty of actually *using* the thing as a keystore which
causes the problem.   The trick to expanding use it to make it simple.
Not to derail the thread, but this should hopefully become a whole lot
easier soon.  Gnupg-2.3 will release with easy to use TPM support for
all your gpg keys:

https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=log;h=6720f1343aef9342127380b155c19e12c92d65ac

It's not the end of the road by any means, but hopefully it will become
a beach head of sorts for more uses.

James

Re: [PATCH v9 0/4] Introduce TEE based Trusted Keys support

2021-03-10 Thread James Bottomley

On Wed, 2021-03-10 at 21:56 +0200, Jarkko Sakkinen wrote:
[...]
> I also need to apply 
> 
> https://lore.kernel.org/linux-integrity/20210127190617.17564-1-james.bottom...@hansenpartnership.com/
> 
> and I would like to do both while I'm at it.
> 
> James, there was one patch that needed fixing but I cannot find
> lore.kernel.org link. Can you point me to that so that we
> can proceed?

I think you mean this one observing a missing space in the commit
message:

https://lore.kernel.org/keyrings/1327393.1612972...@warthog.procyon.org.uk/

James

Re: [PATCH v2][next] scsi: mpt3sas: Replace one-element array with flexible-array in struct _MPI2_CONFIG_PAGE_IO_UNIT_3

2021-03-08 Thread James Bottomley

On Mon, 2021-03-08 at 13:32 -0600, Gustavo A. R. Silva wrote:
> Hi all,
> 
> Friendly ping: who can review/take this, please?

Well, before embarking on a huge dynamic update, let's ask Broadcom the
simpler question of why isn't MPI2_IO_UNIT_PAGE_3_GPIO_VAL_MAX simply
set to 36?  There's no dynamic allocation of anything in the current
code ... it's all hard coded to allocate 36 entries.  If there's no
need for anything dynamic then the kzalloc could become 

io_unit_pg3 = kzalloc(sizeof(*io_unit_pg3), GFP_KERNEL);

James

Re: [PATCH] scsi: iscsi: Switch to using the new API kobj_to_dev()

2021-03-07 Thread James Bottomley

On Mon, 2021-03-08 at 11:34 +0800, Jiapeng Chong wrote:
> Fix the following coccicheck warnings:
> 
> ./drivers/scsi/scsi_transport_iscsi.c:930:60-61: WARNING opportunity
> for kobj_to_dev().

I have to ask, what is the point of this?  container_of is usually
pretty safe ... as in it will detect when you screw up the usage.  The
only real misuse you can get is when the input type has an object of
the same name and return type and you got confused between two objects
with this property, but misuses like this resulting in bugs are very,
very rare.

Usually we wrap container_of because the wrapping is a bit shorter as
you can see: kobj_to_dev is about half the size of the container_of
form ... but is there any other reason to do it?

The problem is that container_of is a standard way of doing cast outs
in the kernel and we have hundreds of them.  To be precise, in scsi
alone:

jejb@jarvis:~/git/linux/drivers/scsi> git grep container_of|wc -l
496

So we really don't want to encourage wrapping them all because the
churn would be unbelievable and the gain minute.  So why should this
one case especially be wrapped when we don't want to wrap the others?

James

Re: [PATCH v3] selinux: measure state and policy capabilities

2021-03-05 Thread James Bottomley

On Fri, 2021-03-05 at 12:52 -0500, Paul Moore wrote:
[...]
> This draft seems fine to me, but there is a small logistical blocker
> at the moment which means I can't merge this until -rc2 is released,
> which likely means this coming Monday.  The problem is that this
> patch relies on code that went upstream via in the last merge window
> via the IMA tree, not the SELinux tree; normally that wouldn't be a
> problem as I typically rebase the selinux/next to Linus' -rc1 tag
> once the merge window is closed, but in this particular case the -rc1
> tag is dangerously broken for some system configurations (the tag has
> since been renamed) so I'm not rebasing onto -rc1 this time around.
> 
> Assuming that -rc2 fixes the swapfile/fs-corruption problem, early
> next week I'll rebase selinux/next to -rc2 and merge this patch.
> However, if the swapfile bug continues past -rc2 we can consider
> merging this via the IMA tree, but I'd assume not do that if possible
> due to merge conflict and testing reasons.

If it helps, we rebased the SCSI tree on top of the merge for the
swapfile fix which is this one, without waiting for -rc2:

commit f69d02e37a85645aa90d18cacfff36dba370f797
Merge: 7a7fd0de4a98 caf6912f3f4a
Author: Linus Torvalds 
Date:   Tue Mar 2 18:18:17 2021 -0800

Merge tag 'misc-5.12-2021-03-02' of git://git.kernel.dk/linux-block
 
James

Re: linux-next: rebase of the scsi-mkp tree

2021-03-04 Thread James Bottomley

On Fri, 2021-03-05 at 11:04 +1100, Stephen Rothwell wrote:
> Hi Martin,
> 
> I notice that you have rebased the scsi-mkp tree.  Unfotunately James
> has already merged part of the old version of the scsi-mkp tree int
> the scsi tree so that commits f69d02e37a85..39ae3edda325 in the scsi-
> mkp tree are the same patches as commits fe07bfda2fb9..100d21c4ff29
> in the scsi tree.

It's just the flux from Linus announcing he's screwed up -rc1 and we
shouldn't base on it ... it should all be fixed soon.

James



signature.asc
Description: This is a digitally signed message part

[GIT PULL] final round of SCSI updates for the 5.11+ merge window

2021-02-28 Thread James Bottomley

This is a few driver updates (iscsi, mpt3sas) that were still in the
staging queue when the merge window opened (all committed on or before
8 Feb) and some small bug fixes which came in during the merge window
(all committed on 22 Feb).

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc


The short changelog of the bug fixes is:

Aleksandr Miloserdov (2):
  scsi: target: core: Add cmd length set before cmd complete
  scsi: target: core: Prevent underflow for service actions

Avri Altman (1):
  scsi: ufs: Fix a duplicate dev quirk number

Bart Van Assche (1):
  scsi: sd: Fix Opal support

Bhaskar Chowdhury (1):
  scsi: aic79xx: Fix spelling of version

Bodo Stroesser (2):
  scsi: target: tcmu: Move some functions without code change
  scsi: target: tcmu: Fix memory leak caused by wrong uio usage

Chen Lin (1):
  scsi: aic7xxx: Remove unused function pointer typedef 
ahc_bus_suspend/resume_t

Don Brace (1):
  scsi: hpsa: Correct dev cmds outstanding for retried cmds

Johannes Thumshirn (1):
  scsi: sd: sd_zbc: Don't pass GFP_NOIO to kvcalloc

Randy Dunlap (1):
  scsi: bnx2fc: Fix Kconfig warning & CNIC build errors



The short changelog of the staging updates is:

Arnd Bergmann (1):
  scsi: pmcraid: Fix 'ioarcb' alignment warning

Damien Le Moal (1):
  scsi: sd: Warn if unsupported ZBC device is probed

DooHyun Hwang (1):
  scsi: ufs: Print the counter of each event history

Jiapeng Chong (1):
  scsi: qla2xxx: Simplify if statement

Mike Christie (9):
  scsi: iscsi: Drop session lock in iscsi_session_chkready()
  scsi: qla4xxx: Use iscsi_is_session_online()
  scsi: libiscsi: Reset max/exp cmdsn during recovery
  scsi: iscsi_tcp: Fix shost can_queue initialization
  scsi: libiscsi: Add helper to calculate max SCSI cmds per session
  scsi: libiscsi: Fix iSCSI host workq destruction
  scsi: libiscsi: Fix iscsi_task use after free()
  scsi: libiscsi: Drop taskqueuelock
  scsi: libiscsi: Fix iscsi_prep_scsi_cmd_pdu() error handling

Sreekanth Reddy (2):
  scsi: mpt3sas: Add support for shared host tagset for CPU hotplug
  scsi: mpt3sas: Fix ReplyPostFree pool allocation

Suganath Prabu S (2):
  scsi: mpt3sas: Update driver version to 37.100.00.00
  scsi: mpt3sas: Additional diagnostic buffer query interface

Yang Li (2):
  scsi: isci: Remove redundant initialization of variable 'status'
  scsi: target: sbp: Remove unneeded semicolon



And the diffstat:

 drivers/scsi/aic7xxx/aic79xx.h  |   2 +-
 drivers/scsi/aic7xxx/aic7xxx.h  |   2 -
 drivers/scsi/bnx2fc/Kconfig |   1 +
 drivers/scsi/bnx2i/bnx2i_iscsi.c|   2 -
 drivers/scsi/hpsa.c |  51 +++-
 drivers/scsi/hpsa_cmd.h |   2 +-
 drivers/scsi/isci/request.c |   8 +-
 drivers/scsi/iscsi_tcp.c|   9 +-
 drivers/scsi/libiscsi.c | 348 ++--
 drivers/scsi/libiscsi_tcp.c |  86 ---
 drivers/scsi/mpt3sas/mpt3sas_base.c |  58 +++--
 drivers/scsi/mpt3sas/mpt3sas_base.h |  52 -
 drivers/scsi/mpt3sas/mpt3sas_ctl.c  |  67 +-
 drivers/scsi/mpt3sas/mpt3sas_ctl.h  |  22 ++
 drivers/scsi/mpt3sas/mpt3sas_scsih.c|  44 +++-
 drivers/scsi/mpt3sas/mpt3sas_trigger_diag.c |  38 ++-
 drivers/scsi/pmcraid.h  |   6 +-
 drivers/scsi/qla2xxx/qla_target.c   |   3 +-
 drivers/scsi/qla4xxx/ql4_os.c   |   2 +-
 drivers/scsi/scsi_transport_iscsi.c |   3 -
 drivers/scsi/sd.c   |  14 +-
 drivers/scsi/sd_zbc.c   |   6 +-
 drivers/scsi/ufs/ufshcd.c   |   6 +-
 drivers/scsi/ufs/ufshcd.h   |   2 +-
 drivers/target/sbp/sbp_target.c |   2 +-
 drivers/target/target_core_pr.c |  15 +-
 drivers/target/target_core_transport.c  |  15 +-
 drivers/target/target_core_user.c   | 189 ---
 include/scsi/libiscsi.h |   6 +-
 include/target/target_core_backend.h|   1 +
 30 files changed, 746 insertions(+), 316 deletions(-)

James

Re: [RFC] KVM: x86: Support KVM VMs sharing SEV context

2021-02-25 Thread James Bottomley

> Add a capability for userspace to mirror SEV encryption context from
> one vm to another. On our side, this is intended to support a
> Migration Helper vCPU, but it can also be used generically to support
> other in-guest workloads scheduled by the host. The intention is for
> the primary guest and the mirror to have nearly identical memslots.

So this causes a cloned VM that you can boot up another CPU into but
the boot path must have been already present?  In essence we've already
been thinking about something like this to get migration running inside
OVMF:

https://lore.kernel.org/qemu-devel/8b824c44-6a51-c3a7-6596-921dc47fe...@linux.ibm.com/

It sounds like this mechanism can be used to boot a vCPU through a
mirror VM after the fact, which is very compatible with the above whose
mechanism is  simply to steal a VCPU to hold in reset until it's
activated.  However, you haven't published how you activate the entity
inside the VM ... do you have patches for this so we can see the
internal capture mechanism and mirror VM boot path?

> The primary benefits of this are that:
> 1) The VMs do not share KVM contexts (think APIC/MSRs/etc), so they
> can't accidentally clobber each other.
> 2) The VMs can have different memory-views, which is necessary for
> post-copy migration (the migration vCPUs on the target need to read
> and write to pages, when the primary guest would VMEXIT).
> 
> This does not change the threat model for AMD SEV. Any memory
> involved is still owned by the primary guest and its initial state is
> still attested to through the normal SEV_LAUNCH_* flows. If userspace
> wanted to circumvent SEV, they could achieve the same effect by
> simply attaching a vCPU to the primary VM.
> This patch deliberately leaves userspace in charge of the memslots
> for the mirror, as it already has the power to mess with them in the
> primary guest.

Well it does alter the threat model in that previously the
configuration, including the CPU configuration, was fixed after launch
and attestation.  Now the CSP can alter the configuration via a mirror.
I'm not sure I have a threat for this, but it definitely alters the
model.

> This patch does not support SEV-ES (much less SNP), as it does not
> handle handing off attested VMSAs to the mirror.

One of the reasons for doing the sequestered vcpu is that -ES and -SNP
require the initial CPU state to be part of the attestation, so with
them you can't add CPU state after the fact.  I think you could use
this model if you declare the vCPU in the mirror in the initial
attested VMSA, but that's conjecture at this stage.

> For additional context, we need a Migration Helper because SEV PSP
> migration is far too slow for our live migration on its own. Using an
> in-guest migrator lets us speed this up significantly.

We have the same problem here at IBM, hence the RFC referred to above.

James

Re: [PATCH 2/9] tpm: Allow PCR 23 to be restricted to kernel-only use

2021-02-24 Thread James Bottomley

On Sat, 2021-02-20 at 01:32 +, Matthew Garrett wrote:
> Under certain circumstances it might be desirable to enable the
> creation of TPM-backed secrets that are only accessible to the
> kernel. In an ideal world this could be achieved by using TPM
> localities, but these don't appear to be available on consumer
> systems.

I don't understand this ... the localities seem to work fine on all the
systems I have ... is this some embedded thing?

>  An alternative is to simply block userland from modifying one of the
> resettable PCRs, leaving it available to the kernel. If the kernel
> ensures that no userland can access the TPM while it is carrying out
> work, it can reset PCR 23, extend it to an arbitrary value, create or
> load a secret, and then reset the PCR again. Even if userland somehow
> obtains the sealed material, it will be unable to unseal it since PCR
> 23 will never be in the appropriate state.

This seems a bit arbitrary: You're removing this PCR from user space
accessibility, but PCR 23 is defined as "Application Support" how can
we be sure no application will actually want to use it (and then fail)?

Since PCRs are very scarce, why not use a NV index instead.  They're
still a bounded resource, but most TPMs have far more of them than they
do PCRs, and the address space is much bigger so picking a nice
arbitrary 24 bit value reduces the chance of collisions.

James

Re: [PATCH v17 08/10] PM: hibernate: disable when there are active secretmem users

2021-02-22 Thread James Bottomley

On Mon, 2021-02-22 at 11:17 -0800, Dan Williams wrote:
> On Mon, Feb 22, 2021 at 2:24 AM Mike Rapoport 
> wrote:
> > On Mon, Feb 22, 2021 at 07:34:52AM +, Matthew Garrett wrote:
> > > On Mon, Feb 08, 2021 at 10:49:18AM +0200, Mike Rapoport wrote:
> > > 
> > > > It is unsafe to allow saving of secretmem areas to the
> > > > hibernation snapshot as they would be visible after the resume
> > > > and this essentially will defeat the purpose of secret memory
> > > > mappings.
> > > 
> > > Sorry for being a bit late to this - from the point of view of
> > > running processes (and even the kernel once resume is complete),
> > > hibernation is effectively equivalent to suspend to RAM. Why do
> > > they need to be handled differently here?
> > 
> > Hibernation leaves a copy of the data on the disk which we want to
> > prevent.
> 
> Why not document that users should use data at rest protection
> mechanisms for their hibernation device? Just because secretmem can't
> assert its disclosure guarantee does not mean the hibernation device
> is untrustworthy.

It's not just data at rest.  Part of the running security guarantees
are that the kernel never gets access to the data.  To support
hibernate and swap we have to break that, so it reduces the runtime
security posture as well as the data at rest one.

This argues we could do it with a per region flags (something like less
secure or more secure mappings), but when you give users that choice
most of them rarely choose less secure.

James

[GIT PULL] first round of SCSI updates for the 5.11+ merge window

2021-02-19 Thread James Bottomley

This series consists of the usual driver updates (ufs, ibmvfc, qla2xxx,
hisi_sas, pm80xx) plus the removal of the gdth driver (which is bound
to cause conflicts with a trivial change somewhere).  The only big
major rework of note is the one from Hannes trying to clean up our
result handling code in the drivers to make it consistent.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc

The short changelog is:

Adrian Hunter (2):
  scsi: ufs: ufs-debugfs: Add error counters
  scsi: docs: ABI: sysfs-driver-ufs: Add DeepSleep power mode

Ahmed S. Darwish (19):
  scsi: libsas: Remove temporarily-added _gfp() API variants
  scsi: mvsas: Switch back to original libsas event notifiers
  scsi: isci: Switch back to original libsas event notifiers
  scsi: libsas: Switch back to original event notifiers API
  scsi: pm80xx: Switch back to original libsas event notifiers
  scsi: aic94xx: Switch back to original libsas event notifiers
  scsi: hisi_sas: Switch back to original libsas event notifiers
  scsi: libsas: Add gfp_t flags parameter to event notifications
  scsi: hisi_sas: Pass gfp_t flags to libsas event notifiers
  scsi: aic94xx: Pass gfp_t flags to libsas event notifiers
  scsi: pm80xx: Pass gfp_t flags to libsas event notifiers
  scsi: libsas: Pass gfp_t flags to event notifiers
  scsi: isci: Pass gfp_t flags in isci_port_bc_change_received()
  scsi: isci: Pass gfp_t flags in isci_port_link_up()
  scsi: isci: Pass gfp_t flags in isci_port_link_down()
  scsi: mvsas: Pass gfp_t flags to libsas event notifiers
  scsi: libsas: Introduce a _gfp() variant of event notifiers
  scsi: libsas: docs: Remove notify_ha_event()
  scsi: target: core: Remove in_interrupt() check in 
transport_handle_cdb_direct()

Anastasia Kovaleva (2):
  scsi: target: core: Change ASCQ for residual write
  scsi: target: core: Signal WRITE residuals

Andrea Parri (Microsoft) (3):
  scsi: storvsc: Validate length of incoming packet in 
storvsc_on_channel_callback()
  scsi: storvsc: Resolve data race in storvsc_probe()
  scsi: storvsc: Fix max_outstanding_req_per_channel for Win8 and newer

Arnd Bergmann (1):
  scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression

Avri Altman (1):
  scsi: ufs: A tad optimization in query upiu trace

Bean Huo (15):
  scsi: ufs: Cleanup WB buffer flush toggle implementation
  scsi: ufs: Group UFS WB related flags in struct ufs_dev_info
  scsi: ufs: Remove two WB related fields from struct ufs_dev_info
  scsi: ufs: Update comment in the function ufshcd_wb_probe()
  scsi: ufs: docs: ABI: Add wb_on documentation for new entry wb_on
  scsi: ufs: Add "wb_on" sysfs node to control WB on/off
  scsi: ufs: Delete redundant if statement in ufshcd_intr()
  scsi: ufs: Remove unnecessary devm_kfree()
  scsi: ufs: Replace sprintf and snprintf with sysfs_emit
  scsi: ufs: Make UPIU trace easier differentiate among CDB, OSF, and TM
  scsi: ufs: Distinguish between TM request UPIU and response UPIU in TM 
UPIU trace
  scsi: ufs: Distinguish between query REQ and query RSP in query trace
  scsi: ufs: Don't call trace_ufshcd_upiu() in case trace poit is disabled
  scsi: ufs: Use __print_symbolic() for UFS trace string print
  scsi: ufs: Remove stringize operator '#' restriction

Bhavesh Jashnani (1):
  scsi: pm80xx: Simultaneous poll for all FW readiness

Bikash Hazarika (1):
  scsi: qla2xxx: Wait for ABTS response on I/O timeouts for NVMe

Bjorn Helgaas (2):
  scsi: lpfc: Fix 'physical' typos
  scsi: message: fusion: Fix 'physical' typos

Brian King (1):
  scsi: ibmvfc: Set default timeout to avoid crash during migration

Can Guo (7):
  scsi: ufs: Give clk scaling min gear a value
  Revert "Make sure clk scaling happens only when HBA is runtime ACTIVE"
  scsi: ufs: Refactor ufshcd_init/exit_clk_scaling/gating()
  scsi: ufs: Protect some contexts from unexpected clock scaling
  scsi: ufs: Protect PM ops and err_handler from user access through sysfs
  scsi: ufs: Fix a possible NULL pointer issue
  scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback

Christophe JAILLET (1):
  scsi: pm80xx: Switch from 'pci_' to 'dma_' API

Colin Ian King (3):
  scsi: ibmvfc: Fix spelling mistake "succeded" -> "succeeded"
  scsi: pm80xx: Clean up indentation of a code block
  scsi: mpt3sas: Fix spelling mistake in Kconfig "compatiblity" -> 
"compatibility"

Dan Carpenter (3):
  scsi: lpfc: Fix ancient double free
  scsi: qla2xxx: Fix some memory corruption
  scsi: qla2xxx: Remove unnecessary NULL check

Dinghao Liu (2):
  scsi: fnic: Fix memleak in vnic_dev_init_devcmd2
  scsi: scsi_debug: Fix memleak in scsi_debug_init()

Enzo Matsumiya (1):
  scsi: qla2xxx: Fix description for parameter ql2xenforce_iocb_limit

Eric Curtin

Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas

2021-02-17 Thread James Bottomley

On Tue, 2021-02-16 at 18:16 +0100, David Hildenbrand wrote:
[...]
> > >   The discussion regarding migratability only really popped up
> > > because this is a user-visible thing and not being able to
> > > migrate can be a real problem (fragmentation, ZONE_MOVABLE, ...).
> > 
> > I think the biggest use will potentially come from hardware
> > acceleration.  If it becomes simple to add say encryption to a
> > secret page with no cost, then no flag needed.  However, if we only
> > have a limited number of keys so once we run out no more encrypted
> > memory then it becomes a costly resource and users might want a
> > choice of being backed by encryption or not.
> 
> Right. But wouldn't HW support with configurable keys etc. need more 
> syscall parameters (meaning, even memefd_secret() as it is would not
> be sufficient?). I suspect the simplistic flag approach might not
> be sufficient. I might be wrong because I have no clue about MKTME
> and friends.

The theory I was operating under is key management is automatic and
hidden, but key scarcity can't be, so if you flag requesting hardware
backing then you either get success (the kernel found a key) or failure
(the kernel is out of keys).  If we actually want to specify the key
then we need an extra argument and we *must* have a new system call.

> Anyhow, I still think extending memfd_create() might just be good
> enough - at least for now.

I really think this is the wrong approach for a user space ABI.  If we
think we'll ever need to move to a separate syscall, we should begin
with one.  The pain of trying to shift userspace from memfd_create to a
new syscall would be enormous.  It's not impossible (see clone3) but
it's a pain we should avoid if we know it's coming.

>  Things like HW support might have requirements we don't even know
> yet and that we cannot even model in memfd_secret() right now.

This is the annoying problem with our Linux unbreakable ABI policy: we
get to plan when the ABI is introduced for stuff we don't yet even know
about.

James

Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas

2021-02-16 Thread James Bottomley

On Tue, 2021-02-16 at 17:34 +0100, David Hildenbrand wrote:
> On 16.02.21 17:25, James Bottomley wrote:
> > On Mon, 2021-02-15 at 20:20 +0100, Michal Hocko wrote:
> > [...]
> > > > >What kind of flags are we talking about and why would that
> > > > > be a problem with memfd_create interface? Could you be more
> > > > > specific please?
> > > > 
> > > > You mean what were the ioctl flags in the patch series linked
> > > > above? They were SECRETMEM_EXCLUSIVE and SECRETMEM_UNCACHED in
> > > > patch 3/5.
> > > 
> > > OK I see. How many potential modes are we talking about? A few or
> > > potentially many?
> >   
> > Well I initially thought there were two (uncached or not) until you
> > came up with the migratable or non-migratable, which affects the
> > security properties.  But now there's also potential for hardware
> > backing, like mktme,  described by flags as well.  I suppose you
> > could also use RDT to restrict which cache the data goes into: say
> > L1 but not L2 on to lessen the impact of fully uncached (although
> > the big thrust of uncached was to blunt hyperthread side
> > channels).  So there is potential for quite a large expansion even
> > though I'd be willing to bet that a lot of the modes people have
> > thought about turn out not to be very effective in the field.
> 
> Thanks for the insight. I remember that even the "uncached" parts
> was effectively nacked by x86 maintainers (I might be wrong).

It wasn't liked by x86 maintainers, no.  Plus there's no
architecturally standard mechanism for making a page uncached and, as
the arm people pointed out, sometimes no way of ensuring it's never
cached.

>  For the other parts, the question is what we actually want to let
> user space configure.
> 
> Being able to specify "Very secure" "maximum secure" "average
> secure"  all doesn't really make sense to me.

Well, it doesn't to me either unless the user feels a cost/benefit, so
if max cost $100 per invocation and average cost nothing, most people
would chose average unless they had a very good reason not to.  In your
migratable model, if we had separate limits for non-migratable and
migratable, with non-migratable being set low to prevent exhaustion,
max secure becomes a highly scarce resource, whereas average secure is
abundant then having the choice might make sense.

>  The discussion regarding migratability only really popped up because
> this is a user-visible thing and not being able to migrate can be a
> real problem (fragmentation, ZONE_MOVABLE, ...).

I think the biggest use will potentially come from hardware
acceleration.  If it becomes simple to add say encryption to a secret
page with no cost, then no flag needed.  However, if we only have a
limited number of keys so once we run out no more encrypted memory then
it becomes a costly resource and users might want a choice of being
backed by encryption or not.

James

Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas

2021-02-16 Thread James Bottomley

On Mon, 2021-02-15 at 20:20 +0100, Michal Hocko wrote:
[...]
> > >   What kind of flags are we talking about and why would that be a
> > > problem with memfd_create interface? Could you be more specific
> > > please?
> > 
> > You mean what were the ioctl flags in the patch series linked
> > above? They were SECRETMEM_EXCLUSIVE and SECRETMEM_UNCACHED in
> > patch 3/5. 
> 
> OK I see. How many potential modes are we talking about? A few or
> potentially many?

Well I initially thought there were two (uncached or not) until you
came up with the migratable or non-migratable, which affects the
security properties.  But now there's also potential for hardware
backing, like mktme,  described by flags as well.  I suppose you could
also use RDT to restrict which cache the data goes into: say L1 but not
L2 on to lessen the impact of fully uncached (although the big thrust
of uncached was to blunt hyperthread side channels).  So there is
potential for quite a large expansion even though I'd be willing to bet
that a lot of the modes people have thought about turn out not to be
very effective in the field.

James

Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas

2021-02-15 Thread James Bottomley

On Mon, 2021-02-15 at 10:13 +0100, Michal Hocko wrote:
> On Sun 14-02-21 11:21:02, James Bottomley wrote:
> > On Sun, 2021-02-14 at 10:58 +0100, David Hildenbrand wrote:
> > [...]
> > > > And here we come to the question "what are the differences that
> > > > justify a new system call?" and the answer to this is very
> > > > subjective. And as such we can continue bikeshedding forever.
> > > 
> > > I think this fits into the existing memfd_create() syscall just
> > > fine, and I heard no compelling argument why it shouldn‘t. That‘s
> > > all I can say.
> > 
> > OK, so let's review history.  In the first two incarnations of the
> > patch, it was an extension of memfd_create().  The specific
> > objection by Kirill Shutemov was that it doesn't share any code in
> > common with memfd and so should be a separate system call:
> > 
> > https://lore.kernel.org/linux-api/20200713105812.dnwtdhsuyj3xbh4f@box/
> 
> Thanks for the pointer. But this argument hasn't been challenged at
> all. It hasn't been brought up that the overlap would be considerable
> higher by the hugetlb/sealing support. And so far nobody has claimed
> those combinations as unviable.

Kirill is actually interested in the sealing path for his KVM code so
we took a look.  There might be a two line overlap in memfd_create for
the seal case, but there's no real overlap in memfd_add_seals which is
the bulk of the code.  So the best way would seem to lift the inode ...
-> seals helpers to be non-static so they can be reused and roll our
own add_seals.

I can't see a use case at all for hugetlb support, so it seems to be a
bit of an angels on pin head discussion.  However, if one were to come
along handling it in the same way seems reasonable.

> > The other objection raised offlist is that if we do use
> > memfd_create, then we have to add all the secret memory flags as an
> > additional ioctl, whereas they can be specified on open if we do a
> > separate system call.  The container people violently objected to
> > the ioctl because it can't be properly analysed by seccomp and much
> > preferred the syscall version.
> > 
> > Since we're dumping the uncached variant, the ioctl problem
> > disappears but so does the possibility of ever adding it back if we
> > take on the container peoples' objection.  This argues for a
> > separate syscall because we can add additional features and extend
> > the API with flags without causing anti-ioctl riots.
> 
> I am sorry but I do not understand this argument.

You don't understand why container guarding technology doesn't like
ioctls?  The problem is each ioctl is the multiplexor is specific to
the particular fd implementation, so unlike fcntl you don't have global
ioctl numbers (although we do try to separate the space somewhat with
the _IO macros).  This makes analysis and blocking a hard problem for
container seccomp.

>  What kind of flags are we talking about and why would that be a
> problem with memfd_create interface? Could you be more specific
> please?

You mean what were the ioctl flags in the patch series linked above? 
They were SECRETMEM_EXCLUSIVE and SECRETMEM_UNCACHED in patch 3/5. 
They were eventually dropped after v10, because of problems with
architectural semantics, with the idea that it could be added back
again if a compelling need arose:

https://lore.kernel.org/linux-api/20201123095432.5860-1-r...@kernel.org/

In theory the extra flags could be multiplexed into the memfd_create
flags like hugetlbfs is but with 32 flags and a lot already taken it
gets messy for expansion.  When we run out of flags the first question
people will ask is "why didn't you do separate system calls?".

James

Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas

2021-02-14 Thread James Bottomley

On Sun, 2021-02-14 at 10:58 +0100, David Hildenbrand wrote:
[...]
> > And here we come to the question "what are the differences that
> > justify a new system call?" and the answer to this is very
> > subjective. And as such we can continue bikeshedding forever.
> 
> I think this fits into the existing memfd_create() syscall just fine,
> and I heard no compelling argument why it shouldn‘t. That‘s all I can
> say.

OK, so let's review history.  In the first two incarnations of the
patch, it was an extension of memfd_create().  The specific objection
by Kirill Shutemov was that it doesn't share any code in common with
memfd and so should be a separate system call:

https://lore.kernel.org/linux-api/20200713105812.dnwtdhsuyj3xbh4f@box/

The other objection raised offlist is that if we do use memfd_create,
then we have to add all the secret memory flags as an additional ioctl,
whereas they can be specified on open if we do a separate system call. 
The container people violently objected to the ioctl because it can't
be properly analysed by seccomp and much preferred the syscall version.

Since we're dumping the uncached variant, the ioctl problem disappears
but so does the possibility of ever adding it back if we take on the
container peoples' objection.  This argues for a separate syscall
because we can add additional features and extend the API with flags
without causing anti-ioctl riots.

James

[GIT PULL] SCSI fixes for 5.11-rc6

2021-02-13 Thread James Bottomley

One fix for scsi_debug that fixes a memory leak on module removal.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Maurizio Lombardi (1):
  scsi: scsi_debug: Fix a memory leak

And the diffstat:

 drivers/scsi/scsi_debug.c | 1 +
 1 file changed, 1 insertion(+)

With full diff below.

James

---

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 4a08c450b756..b6540b92f566 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -6881,6 +6881,7 @@ static void __exit scsi_debug_exit(void)
 
sdebug_erase_all_stores(false);
xa_destroy(per_store_ap);
+   kfree(sdebug_q_arr);
 }
 
 device_initcall(scsi_debug_init);

Re: [PATCH] sign-file: add openssl engine support

2021-02-10 Thread James Bottomley

On Wed, 2021-02-10 at 08:01 +, David Woodhouse wrote:
> 
> On 10 February 2021 07:45:54 GMT, Yang Song <
> songy...@linux.alibaba.com> wrote:
> > Use a customized signature service supported by openssl engine
> > to sign the kernel module.
> > Add command line parameters that support engine for sign-file
> > to use the customized openssl engine service to sign kernel
> > modules.
> > 
> > Signed-off-by: Yang Song 
> 
> Aren't engines already obsolete in the latest versions of OpenSSL, as
> well as being an implementation detail of one particular crypto
> library?

Um, no, they're getting renamed providers with some annoying API
changes that require a bit of a rewrite but the concept of a crypto
"engine" plug in to the code base isn't going away.

>  They aren't really a concept we should be exposing in *our* user
> interface.

We already do ... grep ENGINE in scripts/sign-file.c

Just by the way in case anyone is interested in history:

https://lore.kernel.org/keyrings/1518452963.3114.6.ca...@hansenpartnership.com/

> Better to make sign-file automatically recognise RFC7512 PKCS#11 URIs
> and handle them by automatically loading the PKCS#11 engine.

PKCS11 can't cover everyting engines can.  Engines are mostly used for
accelerators, which are not in the PKCS11 API and even for external
keys, PKCS11 can't cope if the key isn't inside what PKCS11 thinks of
as a token.

James

[GIT PULL] SCSI fixes for 5.11-rc6

2021-02-06 Thread James Bottomley

One fix in drivers (lpfc) that stops an oops on resource exhaustion.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

James Smart (1):
  scsi: lpfc: Fix EEH encountering oops with NVMe traffic

And the diffstat:

 drivers/scsi/lpfc/lpfc_nvme.c | 3 +++
 1 file changed, 3 insertions(+)

With full diff below

James

---

diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index 1cb82fa6a60e..39d147e251bf 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -559,6 +559,9 @@ __lpfc_nvme_ls_req(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp,
return -ENODEV;
}
 
+   if (!vport->phba->sli4_hba.nvmels_wq)
+   return -ENOMEM;
+
/*
 * there are two dma buf in the request, actually there is one and
 * the second one is just the start address + cmd size.

Re: [PATCH v3 2/2] tpm: in tpm2_del_space check if ops pointer is still valid

2021-02-05 Thread James Bottomley

On Fri, 2021-02-05 at 13:25 -0400, Jason Gunthorpe wrote:
> On Fri, Feb 05, 2021 at 08:48:11AM -0800, James Bottomley wrote:
[...]
> > The practical consequence of this model is that if you allocate a
> > chip structure with tpm_chip_alloc() you have to release it again
> > by doing a put of *both* devices.
> 
> The final put of the devs should be directly after the
> cdev_device_del(), not in a devm. This became all confused because
> the devs was created during alloc, not register. Having a device that
> is initialized but will never be added is weird.
> 
> See sketch below.
> 
> > Stefan noticed the latter, so we got the bogus patch 8979b02aaf1d
> > ("tpm: Fix reference count to main device") applied which simply
> > breaks the master/slave model by not taking a reference on the
> > master for the slave.  I'm not sure why I didn't notice the problem
> > with this fix at the time, but attention must have been elsewhere.
> 
> Well, this is sort of OK because we never use the devs in TPM1, so we
> end up freeing the chip with a positive refcount on the devs, which
> is weird but not a functional bug.
> 
> Jason
> 
> diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-
> chip.c
> index ddaeceb7e10910..e07193a0dd4438 100644
> --- a/drivers/char/tpm/tpm-chip.c
> +++ b/drivers/char/tpm/tpm-chip.c
> @@ -344,7 +344,6 @@ struct tpm_chip *tpm_chip_alloc(struct device
> *pdev,
>   chip->dev_num = rc;
>  
>   device_initialize(>dev);
> - device_initialize(>devs);
>  
>   chip->dev.class = tpm_class;
>   chip->dev.class->shutdown_pre = tpm_class_shutdown;
> @@ -352,29 +351,12 @@ struct tpm_chip *tpm_chip_alloc(struct device
> *pdev,
>   chip->dev.parent = pdev;
>   chip->dev.groups = chip->groups;
>  
> - chip->devs.parent = pdev;
> - chip->devs.class = tpmrm_class;
> - chip->devs.release = tpm_devs_release;
> - /* get extra reference on main device to hold on
> -  * behalf of devs.  This holds the chip structure
> -  * while cdevs is in use.  The corresponding put
> -  * is in the tpm_devs_release (TPM2 only)
> -  */
> - if (chip->flags & TPM_CHIP_FLAG_TPM2)
> - get_device(>dev);
> -
>   if (chip->dev_num == 0)
>   chip->dev.devt = MKDEV(MISC_MAJOR, TPM_MINOR);
>   else
>   chip->dev.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num);
>  
> - chip->devs.devt =
> - MKDEV(MAJOR(tpm_devt), chip->dev_num +
> TPM_NUM_DEVICES);
> -
>   rc = dev_set_name(>dev, "tpm%d", chip->dev_num);
> - if (rc)
> - goto out;
> - rc = dev_set_name(>devs, "tpmrm%d", chip->dev_num);
>   if (rc)
>   goto out;
>  
> @@ -382,9 +364,7 @@ struct tpm_chip *tpm_chip_alloc(struct device
> *pdev,
>   chip->flags |= TPM_CHIP_FLAG_VIRTUAL;
>  
>   cdev_init(>cdev, _fops);
> - cdev_init(>cdevs, _fops);
>   chip->cdev.owner = THIS_MODULE;
> - chip->cdevs.owner = THIS_MODULE;
>  
>   rc = tpm2_init_space(>work_space,
> TPM2_SPACE_BUFFER_SIZE);
>   if (rc) {
> @@ -396,7 +376,6 @@ struct tpm_chip *tpm_chip_alloc(struct device
> *pdev,
>   return chip;
>  
>  out:
> - put_device(>devs);
>   put_device(>dev);
>   return ERR_PTR(rc);
>  }
> @@ -445,13 +424,33 @@ static int tpm_add_char_device(struct tpm_chip
> *chip)
>   }
>  
>   if (chip->flags & TPM_CHIP_FLAG_TPM2) {
> + device_initialize(>devs);
> + chip->devs.parent = pdev;
> + chip->devs.class = tpmrm_class;
> + rc = dev_set_name(>devs, "tpmrm%d", chip-
> >dev_num);
> + if (rc)
> + goto out_put_devs;
> +
> + /*
> + * get extra reference on main device to hold on
> behalf of devs.
> + * This holds the chip structure while cdevs is in
> use. The
> +  * corresponding put is in the tpm_devs_release.
> +  */
> + get_device(>dev);
> + chip->devs.release = tpm_devs_release;
> +
> + chip->devs.devt =
> + MKDEV(MAJOR(tpm_devt), chip->dev_num +
> TPM_NUM_DEVICES);
> + cdev_init(>cdevs, _fops);
> + chip->cdevs.owner = THIS_MODULE;
> +

Effectively all of this shuffles the tpmrm device allocation from
chip_alloc to chip_add ... I'm not averse to this but it does mean we
can suffer allocat

Re: [PATCH v3 2/2] tpm: in tpm2_del_space check if ops pointer is still valid

2021-02-05 Thread James Bottomley

On Fri, 2021-02-05 at 04:18 +0200, Jarkko Sakkinen wrote:
> On Thu, Feb 04, 2021 at 04:34:11PM -0800, James Bottomley wrote:
> > On Fri, 2021-02-05 at 00:50 +0100, Lino Sanfilippo wrote:
> > > From: Lino Sanfilippo 
> > > 
> > > In tpm2_del_space() chip->ops is used for flushing the sessions.
> > > However
> > > this function may be called after tpm_chip_unregister() which
> > > sets
> > > the chip->ops pointer to NULL.
> > > Avoid a possible NULL pointer dereference by checking if chip-
> > > >ops is
> > > still
> > > valid before accessing it.
> > > 
> > > Fixes: a3fbfae82b4c ("tpm: take TPM chip power gating out of
> > > tpm_transmit()")
> > > Signed-off-by: Lino Sanfilippo 
> > > ---
> > >  drivers/char/tpm/tpm2-space.c | 15 ++-
> > >  1 file changed, 10 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/char/tpm/tpm2-space.c
> > > b/drivers/char/tpm/tpm2-
> > > space.c
> > > index 784b8b3..9a29a40 100644
> > > --- a/drivers/char/tpm/tpm2-space.c
> > > +++ b/drivers/char/tpm/tpm2-space.c
> > > @@ -58,12 +58,17 @@ int tpm2_init_space(struct tpm_space *space,
> > > unsigned int buf_size)
> > >  
> > >  void tpm2_del_space(struct tpm_chip *chip, struct tpm_space
> > > *space)
> > >  {
> > > - mutex_lock(>tpm_mutex);
> > > - if (!tpm_chip_start(chip)) {
> > > - tpm2_flush_sessions(chip, space);
> > > - tpm_chip_stop(chip);
> > > + down_read(>ops_sem);
> > > + if (chip->ops) {
> > > + mutex_lock(>tpm_mutex);
> > > + if (!tpm_chip_start(chip)) {
> > > + tpm2_flush_sessions(chip, space);
> > > + tpm_chip_stop(chip);
> > > + }
> > > + mutex_unlock(>tpm_mutex);
> > >   }
> > > - mutex_unlock(>tpm_mutex);
> > > + up_read(>ops_sem);
> > > +
> > >   kfree(space->context_buf);
> > >   kfree(space->session_buf);
> > >  }
> > 
> > Actually, this still isn't right.  As I said to the last person who
> > reported this, we should be doing a get/put on the ops, not rolling
> > our
> > own here:
> > 
> > https://lore.kernel.org/linux-integrity/e7566e1e48f5be9dca034b4bfb67683b5d3cb88f.ca...@hansenpartnership.com/
> > 
> > The reporter went silent before we could get this tested, but could
> > you
> > try, please, because your patch is still hand rolling the ops
> > get/put,
> > just slightly better than it had been done previously.
> > 
> > James
> 
> Thanks for pointing this out. I'd strongly support Jason's proposal:
> 
> https://lore.kernel.org/linux-integrity/20201215175624.gg5...@ziepe.ca/
> 
> It's the best long-term way to fix this.

Really, no it's not.  It introduces extra mechanism we don't need.

To recap the issue: character devices already have an automatic
mechanism which holds a reference to the struct device while the
character node is open so the default is to release resources on final
put of the struct device.

tpm 2 is special because we have two character device nodes: /dev/tpm0
and /dev/tpmrm0.  The way we make this work is that tpm0 is the master
and tpmrm0 the slave, so the slave holds an extra reference on the
master which is put when the slave final put happens.  This means that
our resources aren't freed until the final puts of both devices, which
is the model we're using.

The practical consequence of this model is that if you allocate a chip
structure with tpm_chip_alloc() you have to release it again by doing a
put of *both* devices.

However, patch fdc915f7f719 ("tpm: expose spaces via a device link
/dev/tpmrm") contains two bugs: firstly it didn't add a devm action
release for devs and secondly it didn't update the only non-devm user
ibm vtpm to do the double put.

Stefan noticed the latter, so we got the bogus patch 8979b02aaf1d
("tpm: Fix reference count to main device") applied which simply breaks
the master/slave model by not taking a reference on the master for the
slave.  I'm not sure why I didn't notice the problem with this fix at
the time, but attention must have been elsewhere.

Subsequently we got ftpm added which copied the ibm vtpm put bug.

So I think 1/2 is the correct fix for all three bugs.  I just need to
find a way to verify it.

James

Re: [PATCH v3 2/2] tpm: in tpm2_del_space check if ops pointer is still valid

2021-02-05 Thread James Bottomley

On Fri, 2021-02-05 at 13:25 -0400, Jason Gunthorpe wrote:
> On Fri, Feb 05, 2021 at 08:48:11AM -0800, James Bottomley wrote:
> > > Thanks for pointing this out. I'd strongly support Jason's
> > > proposal:
> > > 
> > > https://lore.kernel.org/linux-integrity/20201215175624.gg5...@ziepe.ca/
> > > 
> > > It's the best long-term way to fix this.
> > 
> > Really, no it's not.  It introduces extra mechanism we don't need.
> > To recap the issue: character devices already have an automatic
> > mechanism which holds a reference to the struct device while the
> > character node is open so the default is to release resources on
> > final
> > put of the struct device.
> 
> The refcount on the struct device only keeps the memory alive, it
> doesn't say anything about the ops. We still need to lock and check
> the ops each and every time they are used.

I think this is the crux of our disagreement: I think the ops doesn't
matter because to call try_get_ops you have to have a chip structure
and the only way you get a chip structure is if you hold a device
containing it, in which case the device hold guarantees the chip can't
be freed.  Or if you pass in TPM_ANY_NUM to an operation which calls 
tpm_chip_find_get() which iterates the idr to find a chip under the idr
lock.  If you find a chip device at the idr, you're guaranteed it
exists, because elimination of it is the first thing the release does
and if you find a dying dev (i.e. the release routine blocks on the idr
mutex trying to kill the chip attachment), try_get_ops() fails because
the ops are already NULL.

In either case, I think you get returned a device to which you hold a
reference.  Is there any other case where you can get a chip without
also getting a device reference?

I'll answer the other point in a separate email, but I think the
principle sounds OK: we could do the final put right after we del the
char devices because that's called in the module release routine and
thus not have to rely on the devm actions which, as you say, are an
annoying complication.

James

Re: [PATCH v3 1/2] tpm: fix reference counting for struct tpm_chip

2021-02-04 Thread James Bottomley

On Thu, 2021-02-04 at 20:44 -0500, Stefan Berger wrote:
> To clarify: When I tested this I had *both* patches applied. Without
> the patches I got the null pointer exception in tpm2_del_space(). The
> 2nd patch alone solves that issue when using the steps above.


Yes, I can't confirm the bug either.  I only have lpc tis devices, so
it could be something to do with spi, but when I do

python3 in one shell

>>> fd = open("/dev/tpmrm0", "wb")

do rmmod tpm_tis in another shell

>>> buf = bytearray(20)
>>> fd.write(buf)
20

so I don't see the oops you see on write.  However

>>> fd.close()

And it oopses here in tpm2_del_space

James

Re: [PATCH v3 2/2] tpm: in tpm2_del_space check if ops pointer is still valid

2021-02-04 Thread James Bottomley

On Fri, 2021-02-05 at 00:50 +0100, Lino Sanfilippo wrote:
> From: Lino Sanfilippo 
> 
> In tpm2_del_space() chip->ops is used for flushing the sessions.
> However
> this function may be called after tpm_chip_unregister() which sets
> the chip->ops pointer to NULL.
> Avoid a possible NULL pointer dereference by checking if chip->ops is
> still
> valid before accessing it.
> 
> Fixes: a3fbfae82b4c ("tpm: take TPM chip power gating out of
> tpm_transmit()")
> Signed-off-by: Lino Sanfilippo 
> ---
>  drivers/char/tpm/tpm2-space.c | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2-
> space.c
> index 784b8b3..9a29a40 100644
> --- a/drivers/char/tpm/tpm2-space.c
> +++ b/drivers/char/tpm/tpm2-space.c
> @@ -58,12 +58,17 @@ int tpm2_init_space(struct tpm_space *space,
> unsigned int buf_size)
>  
>  void tpm2_del_space(struct tpm_chip *chip, struct tpm_space *space)
>  {
> - mutex_lock(>tpm_mutex);
> - if (!tpm_chip_start(chip)) {
> - tpm2_flush_sessions(chip, space);
> - tpm_chip_stop(chip);
> + down_read(>ops_sem);
> + if (chip->ops) {
> + mutex_lock(>tpm_mutex);
> + if (!tpm_chip_start(chip)) {
> + tpm2_flush_sessions(chip, space);
> + tpm_chip_stop(chip);
> + }
> + mutex_unlock(>tpm_mutex);
>   }
> - mutex_unlock(>tpm_mutex);
> + up_read(>ops_sem);
> +
>   kfree(space->context_buf);
>   kfree(space->session_buf);
>  }


Actually, this still isn't right.  As I said to the last person who
reported this, we should be doing a get/put on the ops, not rolling our
own here:

https://lore.kernel.org/linux-integrity/e7566e1e48f5be9dca034b4bfb67683b5d3cb88f.ca...@hansenpartnership.com/

The reporter went silent before we could get this tested, but could you
try, please, because your patch is still hand rolling the ops get/put,
just slightly better than it had been done previously.

James

Re: [PATCH] scsi: isci: convert sysfs sprintf/snprintf family to sysfs_emit

2021-02-03 Thread James Bottomley

On Wed, 2021-02-03 at 16:43 +0800, Jiapeng Chong wrote:
> Fix the following coccicheck warning:
> 
>  ./drivers/scsi/isci/init.c:140:8-16: WARNING: use scnprintf or
> sprintf.
> 
> Reported-by: Abaci Robot
> Signed-off-by: Jiapeng Chong 
> ---
>  drivers/scsi/isci/init.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c
> index c452849..741a98e 100644
> --- a/drivers/scsi/isci/init.c
> +++ b/drivers/scsi/isci/init.c
> @@ -137,7 +137,7 @@ static ssize_t isci_show_id(struct device *dev,
> struct device_attribute *attr, c
>   struct sas_ha_struct *sas_ha = SHOST_TO_SAS_HA(shost);
>   struct isci_host *ihost = container_of(sas_ha, typeof(*ihost),
> sas_ha);
>  
> - return snprintf(buf, PAGE_SIZE, "%d\n", ihost->id);
> + return sysfs_emit(buf, "%d\n", ihost->id);

What's the point of doing this change?  We'd have to have 13,600 bit
integer types before this could ever possibly overflow and the
difference between snprintf and scnprintf actually matter from a
practical point of view.  Perhaps the coccinelle check should be
updated to account for these common impossible to overflow situations.

James

Re: [PATCH v16 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation

2021-02-02 Thread James Bottomley

On Tue, 2021-02-02 at 20:15 +0200, Mike Rapoport wrote:
> On Tue, Feb 02, 2021 at 03:34:29PM +0100, David Hildenbrand wrote:
> > On 02.02.21 15:32, Michal Hocko wrote:
> > > On Tue 02-02-21 15:26:20, David Hildenbrand wrote:
> > > > On 02.02.21 15:22, Michal Hocko wrote:
> > > > > On Tue 02-02-21 15:12:21, David Hildenbrand wrote:
> > > > > [...]
> > > > > > I think secretmem behaves much more like longterm GUP right
> > > > > > now
> > > > > > ("unmigratable", "lifetime controlled by user space",
> > > > > > "cannot go on
> > > > > > CMA/ZONE_MOVABLE"). I'd either want to reasonably well
> > > > > > control/limit it or
> > > > > > make it behave more like mlocked pages.
> > > > > 
> > > > > I thought I have already asked but I must have forgotten. Is
> > > > > there any
> > > > > actual reason why the memory is not movable? Timing attacks?
> > > > 
> > > > I think the reason is simple: no direct map, no copying of
> > > > memory.
> > > 
> > > This is an implementation detail though and not something
> > > terribly hard
> > > to add on top later on. I was more worried there would be really
> > > fundamental reason why this is not possible. E.g. security
> > > implications.
> > 
> > I don't remember all the details. Let's see what Mike thinks
> > regarding
> > migration (e.g., security concerns).
> 
> Thanks for considering me a security expert :-)
> 
> Yet, I cannot estimate how dangerous is the temporal exposure of
> this data to the kernel via the direct map in the simple
> map/copy/unmap
> sequence.

Well the safest security statement is that we never expose the data to
the kernel because it's a very clean security statement and easy to
enforce.  It's also the easiest threat model to analyse.   Once we do
start exposing the secret to the kernel it alters the threat profile
and the analysis and obviously potentially provides the ROP gadget to
an attacker to do the same.  Instinct tells me that the loss of
security doesn't really make up for the ability to swap or migrate but
if there were a case for doing the latter, it would have to be a
security policy of the user (i.e. a user should be able to decide their
data is too sensitive to expose to the kernel).

> More secure way would be to map source and destination in a different
> page table rather than in the direct map, similarly to the way
> text_poke() on x86 does.

I think doing this would have much less of an impact on the security
posture because it's already theoretically possible to have kmap
restore access to the kernel.

James

> I've left the migration callback empty for now because it can be
> added on top and its implementation would depend on the way we do (or
> do not do) pooling.
>

Re: [PATCH v16 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation

2021-02-01 Thread James Bottomley

On Fri, 2021-01-29 at 09:23 +0100, Michal Hocko wrote:
> On Thu 28-01-21 13:05:02, James Bottomley wrote:
> > Obviously the API choice could be revisited
> > but do you have anything to add over the previous discussion, or is
> > this just to get your access control?
> 
> Well, access control is certainly one thing which I still believe is
> missing. But if there is a general agreement that the direct map
> manipulation is not that critical then this will become much less of
> a problem of course.

The secret memory is a scarce resource but it's not a facility that
should only be available to some users.

> It all boils down whether secret memory is a scarce resource. With
> the existing implementation it really is. It is effectivelly
> repeating same design errors as hugetlb did. And look now, we have a
> subtle and convoluted reservation code to track mmap requests and we
> have a cgroup controller to, guess what, have at least some control
> over distribution if the preallocated pool. See where am I coming
> from?

I'm fairly sure rlimit is the correct way to control this.  The
subtlety in both rlimit and memcg tracking comes from deciding to
account under an existing category rather than having our own new one. 
People don't like new stuff in accounting because it requires
modifications to everything in userspace.  Accounting under and
existing limit keeps userspace the same but leads to endless arguments
about which limit it should be under.  It took us several patch set
iterations to get to a fragile consensus on this which you're now
disrupting for reasons you're not making clear.

> If the secret memory is more in line with mlock without any imposed
> limit (other than available memory) in the end then, sure, using the
> same access control as mlock sounds reasonable. Btw. if this is
> really just a more restrictive mlock then is there any reason to not
> hook this into the existing mlock infrastructure (e.g.
> MCL_EXCLUSIVE)? Implications would be that direct map would be
> handled on instantiation/tear down paths, migration would deal with
> the same (if possible). Other than that it would be mlock like.

In the very first patch set we proposed a mmap flag to do this.  Under
detailed probing it emerged that this suffers from several design
problems: the KVM people want VMM to be able to remove the secret
memory range from the process; there may be situations where sharing is
useful and some people want to be able to seal the operations.  All of
this ended up convincing everyone that a file descriptor based approach
was better than a mmap one.

James

Re: Migration to trusted keys: sealing user-provided key?

2021-01-31 Thread James Bottomley

On Sun, 2021-01-31 at 15:14 +0100, Jan Lübbe wrote:
> On Sun, 2021-01-31 at 07:09 -0500, Mimi Zohar wrote:
> > On Sat, 2021-01-30 at 19:53 +0200, Jarkko Sakkinen wrote:
> > > On Thu, 2021-01-28 at 18:31 +0100, Ahmad Fatoum wrote:
> > > > Hello,
> > > > 
> > > > I've been looking into how a migration to using
> > > > trusted/encrypted keys would look like (particularly with dm-
> > > > crypt).
> > > > 
> > > > Currently, it seems the the only way is to re-encrypt the
> > > > partitions because trusted/encrypted keys always generate their
> > > > payloads from RNG.
> > > > 
> > > > If instead there was a key command to initialize a new
> > > > trusted/encrypted key with a user provided value, users could
> > > > use whatever mechanism they used beforehand to get a plaintext
> > > > key and use that to initialize a new trusted/encrypted key.
> > > > From there on, the key will be like any other trusted/encrypted
> > > > key and not be disclosed again to userspace.
> > > > 
> > > > What are your thoughts on this? Would an API like
> > > > 
> > > >   keyctl add trusted dmcrypt-key 'set ' # user-
> > > > supplied content
> > > > 
> > > > be acceptable?
> > > 
> > > Maybe it's the lack of knowledge with dm-crypt, but why this
> > > would be useful? Just want to understand the bottleneck, that's
> > > all.
> 
> Our goal in this case is to move away from having the dm-crypt key
> material accessible to user-space on embedded devices. For an
> existing dm-crypt volume, this key is fixed. A key can be loaded into
> user key type and used by dm-crypt (cryptsetup can already do it this
> way). But at this point, you can still do 'keyctl read' on that key,
> exposing the key material to user space.
> 
> Currently, with both encrypted and trusted keys, you can only
> generate new random keys, not import existing key material. 
> 
> James Bottomley mentioned in the other reply that the key format will
> become compatible with the openssl_tpm2_engine, which would provide a
> workaround. This wouldn't work with OP-TEE-based trusted keys (see
> Sumit Garg's series), though.

Assuming OP-TEE has the same use model as the TPM, someone will
eventually realise the need for interoperable key formats between key
consumers and then it will work in the same way once the kernel gets
updated to speak whatever format they come up with.

> > We upstreamed "trusted" & "encrypted" keys together in order to
> > address this sort of problem.   Instead of directly using a
> > "trusted" key for persistent file signatures being stored as
> > xattrs, the "encrypted" key provides one level of
> > indirection.   The "encrypted" key may be encrypted/decrypted with
> > either a TPM based "trusted" key or with a "user" type symmetric
> > key[1].
> > 
> > Instead of modifying "trusted" keys, use a "user" type "encrypted"
> > key.
> 
> I don't see how this would help. When using dm-crypt with an
> encrypted key, I can't use my existing key material.
> 
> Except for the migration aspect, trusted keys seem ideal. Only a
> single exported blob needs to be stored and can only be loaded/used
> again on the same (trusted) system. Userspace cannot extract the key
> material. 

Yes, that's what I was thinking ... especially when you can add policy
to the keys, which includes PCR locking.  Part of the problem is that
changing policy, which you have to do if something happens to update
the PCR values, is technically a migration, so your trusted keys for
dm-crypt are really going to have to be migrateable.

> To get to this point on systems in the field without re-encryption of
> the whole storage, only the initial trusted/encrypted key creation
> would need to allow passing in existing key material.

What about a third option: why not make dm-crypt store the master key
it uses as an encrypted key (if a parent trusted key is available)? 
That way you'd be able to extract the encrypted form of the key as
root, but wouldn't be able to extract the actual master key.

James

Re: [PATCH] tpm_tis: Add missing start/stop_tpm_chip calls

2021-01-30 Thread James Bottomley

On Sat, 2021-01-30 at 19:36 -0800, Guenter Roeck wrote:
> On 1/30/21 4:41 PM, James Bottomley wrote:
> > On Sat, 2021-01-30 at 15:49 -0800, Guenter Roeck wrote:
> > > On 1/29/21 2:59 PM, Jarkko Sakkinen wrote:
> > > > On Tue, Jan 26, 2021 at 04:46:07PM +0100, Łukasz Majczak wrote:
> > > > > Hi Jarkko, Guenter
> > > > > 
> > > > > Yes, here are the logs when failure occurs -
> > > > > https://gist.github.com/semihalf-majczak-lukasz/1575461f585f1e7fb1e9366b8eceaab9
> > > > > Look for a phrase "TPM returned invalid status"
> > > > > 
> > > > > Guenter - good suggestion - I will try to keep it as tight as
> > > > > possible.
> > > > > 
> > > > > Best regards,
> > > > > Lukasz
> > > > 
> > > > Is it possible for you try out with linux-next? Thanks. It's a
> > > > known issue, which ought to be fixed by now.
> > > > 
> > > > The log message is harmless, it'a warning not panic, and does
> > > > not endanger system stability. WARN()'s always dump stack
> > > > trace. No oops is happening.
> > > > 
> > > 
> > > There is a note in the kernel documentation which states:
> > > 
> > > Note that the WARN()-family should only be used for "expected to
> > > be unreachable" situations. If you want to warn about "reachable
> > > but undesirable" situations, please use the pr_warn()-family of
> > > functions.
> > 
> > It fits the definition.  The warning only triggers if the access is
> > in the wrong locality, which should be impossible, so the warning
> > should be unreachable.
> > 
> Thanks a lot for the clarification. So a warning traceback in the
> kernel doesn't necessarily suggest that there is a serious problem
> that should be fixed; it only means that some code is executed which
> should not be reachable (but is otherwise harmless).
> 
> That makes me wonder, though, if it would make sense to mark such
> harmless tracebacks differently. The terms "warning" and "harmless"
> sound like a bit of a contradiction to me (especially for systems
> where panic_on_warn is set).

Well, it's not harmless; because it occurs at start of day, it means we
clear the ineffective command and use default values and those happen
to work fine for the TPM in question, so the problem is pretty much
covered up.  If it had occurred anywhere else it would result in a loss
of the command data with unknown ramifications to user space, possibly
leading to a TPM failure.

Hopefully this means this is the only place we screwed up, but you can
see why a scary warning and stack trace is appropriate: if it triggers,
something in the kernel violated the TPM command model.

James

Re: [PATCH] tpm_tis: Add missing start/stop_tpm_chip calls

2021-01-30 Thread James Bottomley

On Sat, 2021-01-30 at 15:49 -0800, Guenter Roeck wrote:
> On 1/29/21 2:59 PM, Jarkko Sakkinen wrote:
> > On Tue, Jan 26, 2021 at 04:46:07PM +0100, Łukasz Majczak wrote:
> > > Hi Jarkko, Guenter
> > > 
> > > Yes, here are the logs when failure occurs -
> > > https://gist.github.com/semihalf-majczak-lukasz/1575461f585f1e7fb1e9366b8eceaab9
> > > Look for a phrase "TPM returned invalid status"
> > > 
> > > Guenter - good suggestion - I will try to keep it as tight as
> > > possible.
> > > 
> > > Best regards,
> > > Lukasz
> > 
> > Is it possible for you try out with linux-next? Thanks. It's a
> > known issue, which ought to be fixed by now.
> > 
> > The log message is harmless, it'a warning not panic, and does not
> > endanger system stability. WARN()'s always dump stack trace. No
> > oops is happening.
> > 
> 
> There is a note in the kernel documentation which states:
> 
> Note that the WARN()-family should only be used for "expected to
> be unreachable" situations. If you want to warn about "reachable
> but undesirable" situations, please use the pr_warn()-family of
> functions.

It fits the definition.  The warning only triggers if the access is in
the wrong locality, which should be impossible, so the warning should
be unreachable.

James

> It seems to me that "harmless" doesn't really fit the expected
> use of WARN(). Should it possibly be converted to pr_warn() ?

[GIT PULL] SCSI fixes for 5.11-rc5

2021-01-30 Thread James Bottomley

Two minor fixes in drivers.  Both changing strings (one in a comment,
one in a module help text) with no code impact.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Enzo Matsumiya (1):
  scsi: qla2xxx: Fix description for parameter ql2xenforce_iocb_limit

Valdis Kletnieks (1):
  scsi: target: iscsi: Fix typo in comment

And the diffstat:

 drivers/scsi/qla2xxx/qla_os.c | 2 +-
 drivers/target/iscsi/iscsi_target_login.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index f80abe28f35a..0e0fe5b09496 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -42,7 +42,7 @@ MODULE_PARM_DESC(ql2xfulldump_on_mpifail,
 int ql2xenforce_iocb_limit = 1;
 module_param(ql2xenforce_iocb_limit, int, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(ql2xenforce_iocb_limit,
-"Enforce IOCB throttling, to avoid FW congestion. (default: 
0)");
+"Enforce IOCB throttling, to avoid FW congestion. (default: 
1)");
 
 /*
  * CT6 CTX allocation cache
diff --git a/drivers/target/iscsi/iscsi_target_login.c 
b/drivers/target/iscsi/iscsi_target_login.c
index 893d1b406c29..1a9c50401bdb 100644
--- a/drivers/target/iscsi/iscsi_target_login.c
+++ b/drivers/target/iscsi/iscsi_target_login.c
@@ -896,7 +896,7 @@ int iscsit_setup_np(
else
len = sizeof(struct sockaddr_in);
/*
-* Set SO_REUSEADDR, and disable Nagel Algorithm with TCP_NODELAY.
+* Set SO_REUSEADDR, and disable Nagle Algorithm with TCP_NODELAY.
 */
if (np->np_network_transport == ISCSI_TCP)
tcp_sock_set_nodelay(sock->sk);

Re: Migration to trusted keys: sealing user-provided key?

2021-01-30 Thread James Bottomley

On Sat, 2021-01-30 at 19:53 +0200, Jarkko Sakkinen wrote:
> On Thu, 2021-01-28 at 18:31 +0100, Ahmad Fatoum wrote:
> > Hello,
> > 
> > I've been looking into how a migration to using trusted/encrypted
> > keys would look like (particularly with dm-crypt).
> > 
> > Currently, it seems the the only way is to re-encrypt the
> > partitions because trusted/encrypted keys always generate their
> > payloads from RNG.
> > 
> > If instead there was a key command to initialize a new
> > trusted/encrypted key with a user provided value, users could use
> > whatever mechanism they used beforehand to get a plaintext key and
> > use that to initialize a new trusted/encrypted key. From there on,
> > the key will be like any other trusted/encrypted key and not be
> > disclosed again to userspace.
> > 
> > What are your thoughts on this? Would an API like
> > 
> >   keyctl add trusted dmcrypt-key 'set ' # user-supplied
> > content
> > 
> > be acceptable?
> 
> Maybe it's the lack of knowledge with dm-crypt, but why this would be
> useful? Just want to understand the bottleneck, that's all.

There was a recent patch to dm-crypt to add encrypted key support:
27f5411a718c ("dm crypt: support using encrypted keys").  The
implementation requires the actual disk encryption master key to be in
the payload.  Most people don't want to change that key because it
involves re-encrypting the whole disk (usually what people mean when
they say "key" for dm-crypt is a passphrase that decrypts this master
key from a keyslot in the metadata, which is why you can change your
passphrase without changing the underlying encryption).

However, once we get the trusted key rework upstream, we do have a
solution:  The key format becomes interoperable with the
openssl_tpm2_engine and we can now do seal_tpm2_data on any payload and
the kernel will accept it.

James

Re: [GIT PULL] tpmdd updates for v5.12-rc1

2021-01-30 Thread James Bottomley

On Sat, 2021-01-30 at 19:15 +0200, Jarkko Sakkinen wrote:
> On Thu, Jan 28, 2021 at 07:38:21PM -0800, Linus Torvalds wrote:
> > On Thu, Jan 28, 2021 at 4:54 PM Jarkko Sakkinen 
> > wrote:
> > > This contains bug fixes for tpm_tis driver, which had a racy wait
> > > for hardware state change to be ready to send a command to the
> > > TPM chip. The bug has existed already since 2006, but has only
> > > made itself known in recent past.
> > 
> > Hmm. Is this for the next merge window? The subject line implies
> > that, as does the addition of the cr50 driver.
> > 
> > But the commentary about fixes implies that at least part of it
> > should be in 5.11?
> 
> This was meant for 5.12 but the timing was *way* too early. I'll take
> this one back. Just to unambiguity reasons I'll use tpmdd-next-v5.12-
> rc1-v2 tag for my final v5.12 PR, once I send it.
> 
> I considered a bit, and I really think that it would make a lot of
> sense to do a late 5.11 just containing the two commits from James,
> namely:
> 
> 1. tpm_tis: Fix check_locality for correct locality acquisition
> 2. tpm_tis: Clean up locality release 
> 
> James: Does this make sense to you? 

Yes, that's fine with me.  It will quiet the warning we've had several
bug reports about, so it's definitely a bug fix.

James

Re: [PATCH v16 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation

2021-01-28 Thread James Bottomley

On Thu, 2021-01-28 at 14:01 +0100, Michal Hocko wrote:
> On Thu 28-01-21 11:22:59, Mike Rapoport wrote:
[...]
> > I like the idea to have a pool as an optimization rather than a
> > hard requirement but I don't see why would it need a careful access
> > control. As the direct map fragmentation is not necessarily
> > degrades the performance (and even sometimes it actually improves
> > it) and even then the degradation is small, trying a PMD_ORDER
> > allocation for a pool and then falling back to 4K page may be just
> > fine.
> 
> Well, as soon as this is a scarce resource then an access control
> seems like a first thing to think of. Maybe it is not really
> necessary but then this should be really justified.

The control for the resource is effectively the rlimit today.  I don't
think dividing the world into people who can and can't use secret
memory would be useful since the design is to be usable for anyone who
might have a secret to keep; it would become like the kvm group
permissions: something which is theoretically an access control but
which in practise is given to everyone on the system.

> I am also still not sure why this whole thing is not just a
> ramdisk/ramfs which happens to unmap its pages from the direct
> map. Wouldn't that be a much more easier model to work with? You
> would get an access control for free as well.

The original API was a memfd which does have this access control as
well.  However, the decision was made after much discussion to go with
a new system call instead.  Obviously the API choice could be revisited
but do you have anything to add over the previous discussion, or is
this just to get your access control?

James

Re: [PATCH v16 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation

2021-01-28 Thread James Bottomley

On Thu, 2021-01-28 at 14:01 +0100, Michal Hocko wrote:
> On Thu 28-01-21 11:22:59, Mike Rapoport wrote:
[...]
> > One of the major pushbacks on the first RFC [1] of the concept was
> > about the direct map fragmentation. I tried really hard to find
> > data that shows what is the performance difference with different
> > page sizes in the direct map and I didn't find anything.
> > 
> > So presuming that large pages do provide advantage the first
> > implementation of secretmem used PMD_ORDER allocations to amortise
> > the effect of the direct map fragmentation and then handed out 4k
> > pages at each fault. In addition there was an option to reserve a
> > finite pool at boot time and limit secretmem allocations only to
> > that pool.
> > 
> > At some point David suggested to use CMA to improve overall
> > flexibility [3], so I switched secretmem to use CMA.
> > 
> > Now, with the data we have at hand (my benchmarks and Intel's
> > report David mentioned) I'm even not sure this whole pooling even
> > required.
> 
> I would still like to understand whether that data is actually
> representative. With some underlying reasoning rather than I have run
> these XYZ benchmarks and numbers do not look terrible.

My theory, and the reason I made Mike run the benchmarks, is that our
fear of TLB miss has been alleviated by CPU speculation advances over
the years.  You can appreciate this if you think that both Intel and
AMD have increased the number of levels in the page table to
accommodate larger virtual memory size 5 instead of 3.  That increases
the length of the page walk nearly 2x in a physical system and even
more in a virtual system.  Unless this were massively optimized,
systems would have slowed down significantly.  Using 2M pages only
eliminates one level and 2G pages eliminates 2, so I theorized that
actually fragmentation wouldn't be the significant problem we once
thought it was and asked Mike to benchmark it.

The benchmarks show that indeed, it isn't a huge change in the data TLB
miss time, I suspect because data is nicely continuous nowadays and the
prediction that goes into the CPU optimizations quite easy.  ITLB
fragmentation actually seems to be quite a bit worse, likely because we
still don't have branch prediction down to an exact science.

James

Re: [PATCH] tpm_tis: Add missing start/stop_tpm_chip calls

2021-01-26 Thread James Bottomley

On Tue, 2021-01-26 at 16:46 +0100, Łukasz Majczak wrote:
> Hi Jarkko, Guenter
> 
> Yes, here are the logs when failure occurs -
> https://gist.github.com/semihalf-majczak-lukasz/1575461f585f1e7fb1e9366b8eceaab9
> Look for a phrase "TPM returned invalid status"

We've had other reports of this:

https://lore.kernel.org/linux-integrity/ghsgagsnag@gouders.net/
https://lore.kernel.org/linux-integrity/374e918c-f167-9308-2bea-ae6bc6a3d2e3@elloe.vision/

The problem is some TIS TPMs don't begin in the correct locality so we
have to set it.  When I proposed the check, I also proposed a fix for
this problem:

https://lore.kernel.org/linux-integrity/20201001180925.13808-5-james.bottom...@hansenpartnership.com/

But it's part of a series that never went upstream.  Part of the reason
was Jarkko proposed the get/put patch to fix this instead, but that
never went upstream either.  We need to decide an approach and apply
one or other fixes.

James

[GIT PULL] SCSI fixes for 5.11-rc4

2021-01-22 Thread James Bottomley

Twelve minor fixes, all in drivers or doc.  Most of the fixes are
pretty obvious (although we have 2 goes to get the UFS sysfs doc right)
and the biggest change is in the ufs driver which they've extensively
tested.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Adrian Hunter (1):
  scsi: docs: ABI: sysfs-driver-ufs: Add DeepSleep power mode

Arnd Bergmann (1):
  scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression

Brian King (1):
  scsi: ibmvfc: Set default timeout to avoid crash during migration

Dinghao Liu (1):
  scsi: fnic: Fix memleak in vnic_dev_init_devcmd2

Jaegeuk Kim (2):
  scsi: ufs: Fix tm request when non-fatal error happens
  scsi: ufs: Fix livelock of ufshcd_clear_ua_wluns()

Javed Hasan (1):
  scsi: libfc: Avoid invoking response handler twice if ep is already 
completed

Lukas Bulwahn (1):
  scsi: docs: ABI: sysfs-driver-ufs: Rectify table formatting

Martin Wilck (1):
  scsi: scsi_transport_srp: Don't block target in failfast state

Randy Dunlap (1):
  scsi: ufs: ufshcd-pltfrm depends on HAS_IOMEM

Shin'ichiro Kawasaki (1):
  scsi: target: tcmu: Fix use-after-free of se_cmd->priv

Tyrel Datwyler (1):
  scsi: ibmvfc: Fix missing cast of ibmvfc_event pointer to u64 handle

And the diffstat:

 Documentation/ABI/testing/sysfs-driver-ufs | 36 ++---
 drivers/scsi/fnic/vnic_dev.c   |  8 ---
 drivers/scsi/ibmvscsi/ibmvfc.c |  8 ---
 drivers/scsi/libfc/fc_exch.c   | 16 +++--
 drivers/scsi/megaraid/megaraid_sas_base.c  |  6 ++---
 drivers/scsi/scsi_transport_srp.c  |  9 +++-
 drivers/scsi/ufs/Kconfig   |  1 +
 drivers/scsi/ufs/ufshcd.c  | 37 --
 drivers/target/target_core_user.c  | 11 ++---
 9 files changed, 90 insertions(+), 42 deletions(-)

With full diff below.

James

---

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs 
b/Documentation/ABI/testing/sysfs-driver-ufs
index adc0d0e91607..75ccc5c62b3c 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -916,21 +916,25 @@ Date: September 2014
 Contact:   Subhash Jadavani 
 Description:   This entry could be used to set or show the UFS device
runtime power management level. The current driver
-   implementation supports 6 levels with next target states:
+   implementation supports 7 levels with next target states:
 
==  
-   0   an UFS device will stay active, an UIC link will
+   0   UFS device will stay active, UIC link will
stay active
-   1   an UFS device will stay active, an UIC link will
+   1   UFS device will stay active, UIC link will
hibernate
-   2   an UFS device will moved to sleep, an UIC link will
+   2   UFS device will be moved to sleep, UIC link will
stay active
-   3   an UFS device will moved to sleep, an UIC link will
+   3   UFS device will be moved to sleep, UIC link will
hibernate
-   4   an UFS device will be powered off, an UIC link will
+   4   UFS device will be powered off, UIC link will
hibernate
-   5   an UFS device will be powered off, an UIC link will
+   5   UFS device will be powered off, UIC link will
be powered off
+   6   UFS device will be moved to deep sleep, UIC link
+   will be powered off. Note, deep sleep might not be
+   supported in which case this value will not be
+   accepted
==  
 
 What:  /sys/bus/platform/drivers/ufshcd/*/rpm_target_dev_state
@@ -954,21 +958,25 @@ Date: September 2014
 Contact:   Subhash Jadavani 
 Description:   This entry could be used to set or show the UFS device
system power management level. The current driver
-   implementation supports 6 levels with next target states:
+   implementation supports 7 levels with next target states:
 
==  
-   0   an UFS device will stay active, an UIC link will
+   0   UFS device will stay active, UIC link will
stay active
-   1   an UFS device will stay active, an UIC link will
+   1   UFS device will stay active, UIC link will
hibernate
-   2   an UFS device will moved to sleep, an UIC link will
+   2   UFS device will be moved to sleep, UIC

Re: [PATCH] drivers/scsi/qla4xxx: use scnprintf() instead of snprintf()

2021-01-20 Thread James Bottomley

On Wed, 2021-01-20 at 15:22 +0800, Jiapeng Zhong wrote:
> Fix the following coccicheck warning:
> 
> ./drivers/scsi/qla4xxx/ql4_attr.c: WARNING: use scnprintf or
> sprintf
> 
> The snprintf() function returns the number of characters which would
> have been printed if there were enough space, but the scnprintf()
> returns the number of characters which were actually printed.  If
> the buffer is not large enough, then using snprintf() would result
> in a read overflow and an information leak.
> 
> Reported-by: Abaci Robot
> Signed-off-by: Jiapeng Zhong 
> ---
>  drivers/scsi/qla4xxx/ql4_attr.c | 22 +++---
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/scsi/qla4xxx/ql4_attr.c
> b/drivers/scsi/qla4xxx/ql4_attr.c
> index ec43528..1a16017 100644
> --- a/drivers/scsi/qla4xxx/ql4_attr.c
> +++ b/drivers/scsi/qla4xxx/ql4_attr.c
> @@ -170,7 +170,7 @@ void qla4_8xxx_free_sysfs_attr(struct
> scsi_qla_host *ha)
>   char *buf)
>  {
>   struct scsi_qla_host *ha = to_qla_host(class_to_shost(dev));
> - return snprintf(buf, PAGE_SIZE, "%s\n", ha->serial_number);
> + return scnprintf(buf, PAGE_SIZE, "%s\n", ha->serial_number);

This is the wrong ABI to be replacing anything sysfs with, it should be
sysfs_emit()

James

[GIT PULL] SCSI fixes for 5.11-rc3

2021-01-16 Thread James Bottomley

Nine minor fixes, 7 in drivers and 2 in the core SCSI disk driver (sd)
which should be harmless involving removing an unused variable and
quietening a spurious warning.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Can Guo (1):
  scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback

Colin Ian King (1):
  scsi: mpt3sas: Fix spelling mistake in Kconfig "compatiblity" -> 
"compatibility"

Dinghao Liu (1):
  scsi: scsi_debug: Fix memleak in scsi_debug_init()

Ewan D. Milne (1):
  scsi: sd: Suppress spurious errors when WRITE SAME is being disabled

Kiwoong Kim (1):
  scsi: ufs: Relocate flush of exceptional event

Lukas Bulwahn (1):
  scsi: sd: Remove obsolete variable in sd_remove()

Nilesh Javali (1):
  scsi: qedi: Correct max length of CHAP secret

Stanley Chu (2):
  scsi: ufs: Relax the condition of UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL
  scsi: ufs: Fix possible power drain during system suspend


And the diffstat:

 drivers/scsi/mpt3sas/Kconfig  |  2 +-
 drivers/scsi/qedi/qedi_main.c |  4 ++--
 drivers/scsi/scsi_debug.c |  5 +++--
 drivers/scsi/sd.c |  6 +++---
 drivers/scsi/ufs/ufshcd.c | 24 ++--
 5 files changed, 19 insertions(+), 22 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/mpt3sas/Kconfig b/drivers/scsi/mpt3sas/Kconfig
index 86209455172d..c299f7e078fb 100644
--- a/drivers/scsi/mpt3sas/Kconfig
+++ b/drivers/scsi/mpt3sas/Kconfig
@@ -79,5 +79,5 @@ config SCSI_MPT2SAS
select SCSI_MPT3SAS
depends on PCI && SCSI
help
-   Dummy config option for backwards compatiblity: configure the MPT3SAS
+   Dummy config option for backwards compatibility: configure the MPT3SAS
driver instead.
diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
index f5fc7f518f8a..47ad64b06623 100644
--- a/drivers/scsi/qedi/qedi_main.c
+++ b/drivers/scsi/qedi/qedi_main.c
@@ -2245,7 +2245,7 @@ qedi_show_boot_tgt_info(struct qedi_ctx *qedi, int type,
 chap_name);
break;
case ISCSI_BOOT_TGT_CHAP_SECRET:
-   rc = sprintf(buf, "%.*s\n", NVM_ISCSI_CFG_CHAP_NAME_MAX_LEN,
+   rc = sprintf(buf, "%.*s\n", NVM_ISCSI_CFG_CHAP_PWD_MAX_LEN,
 chap_secret);
break;
case ISCSI_BOOT_TGT_REV_CHAP_NAME:
@@ -2253,7 +2253,7 @@ qedi_show_boot_tgt_info(struct qedi_ctx *qedi, int type,
 mchap_name);
break;
case ISCSI_BOOT_TGT_REV_CHAP_SECRET:
-   rc = sprintf(buf, "%.*s\n", NVM_ISCSI_CFG_CHAP_NAME_MAX_LEN,
+   rc = sprintf(buf, "%.*s\n", NVM_ISCSI_CFG_CHAP_PWD_MAX_LEN,
 mchap_secret);
break;
case ISCSI_BOOT_TGT_FLAGS:
diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 24c0f7ec0351..4a08c450b756 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -6740,7 +6740,7 @@ static int __init scsi_debug_init(void)
k = sdeb_zbc_model_str(sdeb_zbc_model_s);
if (k < 0) {
ret = k;
-   goto free_vm;
+   goto free_q_arr;
}
sdeb_zbc_model = k;
switch (sdeb_zbc_model) {
@@ -6753,7 +6753,8 @@ static int __init scsi_debug_init(void)
break;
default:
pr_err("Invalid ZBC model\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto free_q_arr;
}
}
if (sdeb_zbc_model != BLK_ZONED_NONE) {
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 679c2c025047..a3d2d4bc4a3d 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -984,8 +984,10 @@ static blk_status_t sd_setup_write_zeroes_cmnd(struct 
scsi_cmnd *cmd)
}
}
 
-   if (sdp->no_write_same)
+   if (sdp->no_write_same) {
+   rq->rq_flags |= RQF_QUIET;
return BLK_STS_TARGET;
+   }
 
if (sdkp->ws16 || lba > 0x || nr_blocks > 0x)
return sd_setup_write_same16_cmnd(cmd, false);
@@ -3510,10 +3512,8 @@ static int sd_probe(struct device *dev)
 static int sd_remove(struct device *dev)
 {
struct scsi_disk *sdkp;
-   dev_t devt;
 
sdkp = dev_get_drvdata(dev);
-   devt = disk_devt(sdkp->disk);
scsi_autopm_get_device(sdkp->device);
 
async_synchronize_full_domain(_sd_pm_domain);
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 82ad31781bc9..e31d2c5c7b23 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -289,7 +289,8 @@ static inline void ufshcd_wb_config(struct ufs_hba *hba)
if (ret)

Re: [PATCH] RDMA: usnic: Fix misuse of sysfs_emit_at

2021-01-15 Thread James Bottomley

On Fri, 2021-01-15 at 13:23 -0800, Joe Perches wrote:
> In commit e28bf1f03b01 ("RDMA: Convert various random sprintf sysfs
> _show
> uses to sysfs_emit") I mistakenly used len = sysfs_emit_at to
> overwrite
> the last trailing space of potentially multiple entry output.
> 
> The length of the last sysfs_emit_at call is 1 and it should instead
> be
> ignored.  Do so.
> 
> Fixes: e28bf1f03b01 ("RDMA: Convert various random sprintf sysfs
> _show uses to sysfs_emit")
> 
> Reported-by: James Bottomley 
> Signed-off-by: Joe Perches 
> ---
>  drivers/infiniband/hw/usnic/usnic_ib_sysfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/hw/usnic/usnic_ib_sysfs.c
> b/drivers/infiniband/hw/usnic/usnic_ib_sysfs.c
> index e59615a4c9d9..fc077855b46c 100644
> --- a/drivers/infiniband/hw/usnic/usnic_ib_sysfs.c
> +++ b/drivers/infiniband/hw/usnic/usnic_ib_sysfs.c
> @@ -231,7 +231,7 @@ static ssize_t summary_show(struct
> usnic_ib_qp_grp *qp_grp, char *buf)
>   }
>   }
>  
> - len = sysfs_emit_at(buf, len, "\n");
> + sysfs_emit_at(buf, len, "\n");  /* Overwrite the last
> trailing space */

len is the offset of where the next character gets written, isn't it?
so if you're overwriting the last character emitted into buf, shouldn't
the offset point at that character rather than one beyond it?  So

sysfs_emit_at(buf, len - 1, "\n");  /* Overwrite the last trailing
space */

?

James

Re: [PATCH v4] certs: Add EFI_CERT_X509_GUID support for dbx entries

2021-01-15 Thread James Bottomley

On Tue, 2020-09-15 at 20:49 -0400, Eric Snowberg wrote:
> The Secure Boot Forbidden Signature Database, dbx, contains a list of
> now revoked signatures and keys previously approved to boot with UEFI
> Secure Boot enabled.  The dbx is capable of containing any number of
> EFI_CERT_X509_SHA256_GUID, EFI_CERT_SHA256_GUID, and
> EFI_CERT_X509_GUID entries.
> 
> Currently when EFI_CERT_X509_GUID are contained in the dbx, the
> entries are skipped.
> 
> Add support for EFI_CERT_X509_GUID dbx entries. When a
> EFI_CERT_X509_GUID is found, it is added as an asymmetrical key to
> the .blacklist keyring. Anytime the .platform keyring is used, the
> keys in the .blacklist keyring are referenced, if a matching key is
> found, the key will be rejected.
> 
> Signed-off-by: Eric Snowberg 

If you're using shim, as most of our users are, you have no access to
dbx to blacklist certificates.  Plus our security envelope includes the
Mok variables, so you should also be paying attestion to MokListX (or
it's RT equivalent: MokListXRT).

If you add this to the patch, we get something that is mechanistically
complete and which also allows users to add certs to their Mok
blacklist.

James

Re: [PATCH] certs: Add EFI_CERT_X509_GUID support for dbx entries

2021-01-13 Thread James Bottomley

On Wed, 2021-01-13 at 13:40 +, David Howells wrote:
> Hi Linus,
> 
> Are you willing to take this between merge windows - or does it need
> to wait for the next merge window?  It's not technically a bug fix to
> the kernel, but it does have a CVE attached to it.
> 
> Note that I've also updated Jarkko's address in his Reviewed-by since
> his Intel address no longer works.

Sorry, late to the party.

I suppose I lost the argument that we shouldn't really be trusting any
certs from db when shim is in operation because they're all EFI binary
signing ones and will usually simply be the microsoft certificate and
possibly an OEM platform one and we're usually pivoting the root of
trust to the certificates in the MokList.

However, if we are going to do this, we should also be blacklisting the
certificates in MokListX which the OS sees through MokListXRT.  Since
MokListX is an essential piece of our revocation infrastructure it
should have been mentioned in the CVE but wasn't for some reason.

James

[GIT PULL] SCSI fixes for 5.11-rc2

2021-01-10 Thread James Bottomley

This is two driver fixes (megaraid_sas and hisi_sas).  The megaraid one
is a revert of a previous revert of a cpu hotplug fix which exposed a
bug in the block layer which has been fixed in this merge window and
the hisi_sas performance enhancement comes from switching to interrupt
managed completion queues, which depended on the addition of
devm_platform_get_irqs_affinity() which is now upstream via the irq
tree in the last merge window.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

  scsi: hisi_sas: Expose HW queues for v2 hw

Martin K. Petersen (1):
  Revert "Revert "scsi: megaraid_sas: Added support for shared host tagset 
for cpuhotplug""

And the diffstat:

 drivers/scsi/hisi_sas/hisi_sas.h|  4 ++
 drivers/scsi/hisi_sas/hisi_sas_main.c   | 11 +
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c  | 66 +++--
 drivers/scsi/megaraid/megaraid_sas_base.c   | 39 +
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 29 +++--
 5 files changed, 123 insertions(+), 26 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index 2b28dd405600..e821dd32dd28 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -294,6 +295,7 @@ enum {
 
 struct hisi_sas_hw {
int (*hw_init)(struct hisi_hba *hisi_hba);
+   int (*interrupt_preinit)(struct hisi_hba *hisi_hba);
void (*setup_itct)(struct hisi_hba *hisi_hba,
   struct hisi_sas_device *device);
int (*slot_index_alloc)(struct hisi_hba *hisi_hba,
@@ -393,6 +395,8 @@ struct hisi_hba {
u32 refclk_frequency_mhz;
u8 sas_addr[SAS_ADDR_SIZE];
 
+   int *irq_map; /* v2 hw */
+
int n_phy;
spinlock_t lock;
struct semaphore sem;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index b6d4419c32f2..cf0bfac920a8 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -2614,6 +2614,13 @@ static struct Scsi_Host *hisi_sas_shost_alloc(struct 
platform_device *pdev,
return NULL;
 }
 
+static int hisi_sas_interrupt_preinit(struct hisi_hba *hisi_hba)
+{
+   if (hisi_hba->hw->interrupt_preinit)
+   return hisi_hba->hw->interrupt_preinit(hisi_hba);
+   return 0;
+}
+
 int hisi_sas_probe(struct platform_device *pdev,
   const struct hisi_sas_hw *hw)
 {
@@ -2671,6 +2678,10 @@ int hisi_sas_probe(struct platform_device *pdev,
sha->sas_port[i] = _hba->port[i].sas_port;
}
 
+   rc = hisi_sas_interrupt_preinit(hisi_hba);
+   if (rc)
+   goto err_out_ha;
+
rc = scsi_add_host(shost, >dev);
if (rc)
goto err_out_ha;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index b57177b52fac..9adfdefef9ca 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -3302,6 +3302,28 @@ static irq_handler_t 
fatal_interrupts[HISI_SAS_FATAL_INT_NR] = {
fatal_axi_int_v2_hw
 };
 
+#define CQ0_IRQ_INDEX (96)
+
+static int hisi_sas_v2_interrupt_preinit(struct hisi_hba *hisi_hba)
+{
+   struct platform_device *pdev = hisi_hba->platform_dev;
+   struct Scsi_Host *shost = hisi_hba->shost;
+   struct irq_affinity desc = {
+   .pre_vectors = CQ0_IRQ_INDEX,
+   .post_vectors = 16,
+   };
+   int resv = desc.pre_vectors + desc.post_vectors, minvec = resv + 1, 
nvec;
+
+   nvec = devm_platform_get_irqs_affinity(pdev, , minvec, 128,
+  _hba->irq_map);
+   if (nvec < 0)
+   return nvec;
+
+   shost->nr_hw_queues = hisi_hba->cq_nvecs = nvec - resv;
+
+   return 0;
+}
+
 /*
  * There is a limitation in the hip06 chipset that we need
  * to map in all mbigen interrupts, even if they are not used.
@@ -3310,14 +3332,11 @@ static int interrupt_init_v2_hw(struct hisi_hba 
*hisi_hba)
 {
struct platform_device *pdev = hisi_hba->platform_dev;
struct device *dev = >dev;
-   int irq, rc = 0, irq_map[128];
+   int irq, rc = 0;
int i, phy_no, fatal_no, queue_no;
 
-   for (i = 0; i < 128; i++)
-   irq_map[i] = platform_get_irq(pdev, i);
-
for (i = 0; i < HISI_SAS_PHY_INT_NR; i++) {
-   irq = irq_map[i + 1]; /* Phy up/down is irq1 */
+   irq = hisi_hba->irq_map[i + 1]; /* Phy up/down is irq1 */
rc = devm_request_irq(dev, irq, phy_interrupts[i], 0,
  DRV_NAME " phy", hisi_hba);
if (rc) {
@@ -3331,7 +3350,7 @@ static int interrupt_init_v2_hw(struct hisi_hba

Re: scsi: Add diagnostic log for scsi device reset

2021-01-07 Thread James Bottomley

On Thu, 2021-01-07 at 19:43 +0800, lijinlin wrote:
[...]
> - SCSI_LOG_ERROR_RECOVERY(3,
> + if (bdr_scmd->request && bdr_scmd->request->rq_disk)
>   sdev_printk(KERN_INFO, sdev,
> -  "%s: Sending BDR\n", current-
> >comm));
> +  "[%s] Sending device reset\n",
> +  bdr_scmd->request->rq_disk-
> >disk_name);

Not everything is a SCSI disk.  If we apply this patch, we lose traces
for any non-disk device that get reset ... for tapes this can be really
important to know.

James

Re: [PATCH 2/3] scsi: megaraid_sas: check user-provided offsets

2021-01-03 Thread James Bottomley

On Sun, 2021-01-03 at 19:49 +0100, Arnd Bergmann wrote:
> On Sun, Jan 3, 2021 at 6:00 PM James Bottomley 
> wrote:
> > On Sun, 2021-01-03 at 17:26 +0100, Arnd Bergmann wrote:
> > [...]
> > > @@ -8209,7 +8208,7 @@ megasas_mgmt_fw_ioctl(struct
> > > megasas_instance
> > > *instance,
> > > if (instance->consistent_mask_64bit)
> > > put_unaligned_le64(sense_handle,
> > > sense_ptr);
> > > else
> > > -   put_unaligned_le32(sense_handle,
> > > sense_ptr);
> > > +   put_unaligned_le64(sense_handle,
> > > sense_ptr);
> > > }
> > 
> > This hunk can't be right.  It effectively means removing the if.
> 
> I'm just trying to restore the state before the regression introduced
> in my 381d34e376e3 ("scsi: megaraid_sas: Check user-provided
> offsets").
> 
> The old code always stored 'sizeof(long)' bytes into sense_ptr,
> regardless of instance->consistent_mask_64bit, but it would truncate
> the address to 32 bit if that was cleared. This was clearly bogus
> and I tried to make it do something more meaningful, only storing
> 8 bytes into the structure if it was configured for 64-bit DMA,
> regardless of the capabilities of the kernel.

Heh, well, all this depends on how the firmware interprets the pointer,
for which we don't seem to have a manual.  Instinct tells me the flag
MFI_FRAME_SENSE64 is what does this and that's conditioned on the same
if clause 100 lines above this, so the fix your proposing would still
seem to be wrong, because I think when that flag is not set, the device
expects the sense pointer to be 32 bit.

> > However, the if is needed because sense_handle is a dma_addr_t
> > which can be either 32 or 64 bit.  What about changing the if to
> > 
> > if (sizeof(dma_addr_t) == 8)
> > 
> > instead?
> 
> That would not be useful either, the device surely does not care
> if the kernel supports 64-bit DMA. What we'd really need here is
> someone with access to the interface specifications to see how
> many bytes should be stored in the structure. I suspect always
> storing 64 bits (as my patch does) is correct, and would send a
> proper patch to remove the if() if Phil confirms that my test
> patch fixes the regression.

Well, as I said above, I'm speculating the device does what we tell it,
and whether to use 32 or 64 bits for the sense pointer definitely seems
to be a flag the driver controls ... we really need someone with access
to the programming manual to tell us if this speculation is accurate,
though.

James

Re: [PATCH 2/3] scsi: megaraid_sas: check user-provided offsets

2021-01-03 Thread James Bottomley

On Sun, 2021-01-03 at 17:26 +0100, Arnd Bergmann wrote:
[...]
> @@ -8209,7 +8208,7 @@ megasas_mgmt_fw_ioctl(struct megasas_instance
> *instance,
> if (instance->consistent_mask_64bit)
> put_unaligned_le64(sense_handle, sense_ptr);
> else
> -   put_unaligned_le32(sense_handle, sense_ptr);
> +   put_unaligned_le64(sense_handle, sense_ptr);
> }

This hunk can't be right.  It effectively means removing the if.
However, the if is needed because sense_handle is a dma_addr_t which
can be either 32 or 64 bit.  What about changing the if to 

if (sizeof(dma_addr_t) == 8)

instead?

James

Re: [PATCH] cxgb4: fix TLS dependencies again

2021-01-03 Thread James Bottomley

On Sun, 2021-01-03 at 15:01 +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> A previous patch tried to avoid a build failure, but missed one case
> that Kconfig warns about:
> 
> WARNING: unmet direct dependencies detected for CHELSIO_T4
>   Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] &&
> NET_VENDOR_CHELSIO [=y] && PCI [=y] && (IPV6 [=y] || IPV6 [=y]=n) &&
> (TLS [=m] || TLS [=m]=n)
>   Selected by [y]:
>   - SCSI_CXGB4_ISCSI [=y] && SCSI_LOWLEVEL [=y] && SCSI [=y] && PCI
> [=y] && INET [=y] && (IPV6 [=y] || IPV6 [=y]=n) && ETHERNET [=y]
> x86_64-linux-ld: drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.o: in
> function `cxgb_select_queue':
> cxgb4_main.c:(.text+0xf5df): undefined reference to
> `tls_validate_xmit_skb'
> 
> When any of the dependencies of CHELSIO_T4 are not met, then
> SCSI_CXGB4_ISCSI must not 'select' it either.
> 
> Fix it by mirroring the network driver dependencies on the iscsi
> driver. A more invasive but also more reliable alternative would
> be to use 'depends on CHELSIO_T4' instead.
> 
> Fixes: 659fbdcf2f14 ("cxgb4: Fix build failure when CONFIG_TLS=m")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/scsi/cxgbi/cxgb4i/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/cxgbi/cxgb4i/Kconfig
> b/drivers/scsi/cxgbi/cxgb4i/Kconfig
> index 8b0deece9758..2af88a55fbca 100644
> --- a/drivers/scsi/cxgbi/cxgb4i/Kconfig
> +++ b/drivers/scsi/cxgbi/cxgb4i/Kconfig
> @@ -1,7 +1,7 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  config SCSI_CXGB4_ISCSI
>   tristate "Chelsio T4 iSCSI support"
> - depends on PCI && INET && (IPV6 || IPV6=n)
> + depends on PCI && INET && (IPV6 || IPV6=n) && (TLS || TLS=n)
>   depends on THERMAL || !THERMAL
>   depends on ETHERNET
>   depends on TLS || TLS=n

I thought all separated depends statements were the equivalent of && in
a single statement.  If so, how does repeating TLS || TLS=n twice fix
the problem?  This sounds more like we have some sort of bug in our
Kconfig apparatus.

James

Re: [GIT PULL] SCSI fixes for 5.11-rc1

2021-01-01 Thread James Bottomley

On Fri, 2021-01-01 at 13:21 -0800, Linus Torvalds wrote:
> On Fri, Jan 1, 2021 at 12:19 PM James Bottomley
>  wrote:
> > Originally this change was slated for the merge window but a late
> > arriving build problem with CONFIG_PM=n derailed that.
> 
> So I've pulled this,

Thanks!

>  but we need to have a policy for reverting this
> quickly if it turns out to cause problems.

Sure, we'd have to revert the six patches plus the dependent PM fix.

> I'm not worried about any remaining build issues - but I'm simply
> worried about some missed case where code depended on the block layer
> passing commands through even while suspended.
> 
> The block bits would seem affect non-SCSI stuff too, how extensively
> have any random odd special case been tested?

The block bits have been in -next since 7 December, so it has had some
testing.  That said, REQ_PREEMPT doesn't affect much outside of SCSI
and IDE because we're where the original concept of at head insertion
for "special" error handling commands like request sense came from.

I'd expect the biggest field of potential problems to be in USB which
does both block/SCSI and PM.  That should have been well tested by the
PM people (as far as they can, as you know there are tons of non spec
devices out there in the space).

> So I'm not so much with you on the "the scary case is the spi domain
> validation case".  I'm more about "what about all the other random
> cases for random special drivers"

I picked on that because it's the least likely to get any testing for a
while.  However, you're right, there are other potential problem
consumers but hopefully any problem (if there is one) with the rest
will show up quickly.

James

[GIT PULL] SCSI fixes for 5.11-rc1

2021-01-01 Thread James Bottomley

This is a load of driver fixes (12 ufs, 1 mpt3sas, 1 cxgbi).  The big
core two fixes are for power management ("block: Do not accept any
requests while suspended" and "block: Fix a race in the runtime power
management code") which finally sorts out the resume problems we've
occasionally been having.  To make the resume fix, there are seven
necessary precursors which effectively renames REQ_PREEMPT to REQ_PM,
so every "special" request in block is automatically a power management
exempt one.  All of the non-PM preempt cases are removed except for the
one in the SCSI Parallel Interface (spi) domain validation which is a
genuine case where we have to run requests at high priority to validate
the bus so this becomes an autopm get/put protected request.

Originally this change was slated for the merge window but a late
arriving build problem with CONFIG_PM=n derailed that.  However, we
still need the change to fix the PM suspend bugs.  You can validate the
REQ_PREEMPT -> REQ_PM nature of the six transformational patches by
inspecting the output of

git diff fa4d0f1992a9..a4d34da715e3

The biggest risk in this code is the spi domain validation change, so
I've tested it extensively on my several parallel systems here without
incident and it's now run for over a week in -next on either side of
the merge window.

Our backport for stable plan is currently to see if we can get the
lion's share of the power management problems with the first pm fix
since the second one depends on the REQ_PM rename.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Adrian Hunter (4):
  scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers
  scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel 
controllers
  scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk 
->poweroff()
  scsi: ufs-pci: Fix restore from S4 for Intel controllers

Alan Stern (1):
  scsi: block: Do not accept any requests while suspended

Bart Van Assche (7):
  scsi: block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT
  scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE
  scsi: scsi_transport_spi: Set RQF_PM for domain validation commands
  scsi: ide: Mark power management requests with RQF_PM instead of 
RQF_PREEMPT
  scsi: ide: Do not set the RQF_PREEMPT flag for sense requests
  scsi: block: Introduce BLK_MQ_REQ_PM
  scsi: block: Fix a race in the runtime power management code

Bean Huo (2):
  scsi: ufs: Fix wrong print message in dev_err()
  scsi: ufs: Remove unused macro definition POWER_DESC_MAX_SIZE

Dan Carpenter (1):
  scsi: mpt3sas: Signedness bug in _base_get_diag_triggers()

Randall Huang (1):
  scsi: ufs: Clear UAC for RPMB after ufshcd resets

Randy Dunlap (1):
  scsi: cxgb4i: Fix TLS dependency

Stanley Chu (4):
  scsi: ufs: Un-inline ufshcd_vops_device_reset function
  scsi: ufs: Re-enable WriteBooster after device reset
  scsi: ufs-mediatek: Keep VCC always-on for specific devices
  scsi: ufs: Allow regulators being always-on

Zhen Lei (1):
  scsi: ufs-mediatek: Use correct path to fix compile error

And the diffstat:

 block/blk-core.c  | 13 ---
 block/blk-mq-debugfs.c|  1 -
 block/blk-mq.c|  4 +-
 block/blk-pm.c| 15 ---
 block/blk-pm.h| 14 ---
 drivers/ide/ide-atapi.c   |  1 -
 drivers/ide/ide-io.c  |  7 +---
 drivers/ide/ide-pm.c  |  2 +-
 drivers/scsi/cxgbi/cxgb4i/Kconfig |  1 +
 drivers/scsi/mpt3sas/mpt3sas_base.c   |  2 +-
 drivers/scsi/scsi_lib.c   | 27 ++---
 drivers/scsi/scsi_transport_spi.c | 27 +
 drivers/scsi/ufs/ufs-mediatek-trace.h |  2 +-
 drivers/scsi/ufs/ufs-mediatek.c   | 21 ++
 drivers/scsi/ufs/ufs-mediatek.h   |  1 +
 drivers/scsi/ufs/ufs.h|  2 +-
 drivers/scsi/ufs/ufshcd-pci.c | 73 ++-
 drivers/scsi/ufs/ufshcd.c | 45 +
 drivers/scsi/ufs/ufshcd.h | 14 +++
 include/linux/blk-mq.h|  4 +-
 include/linux/blkdev.h| 18 ++---
 21 files changed, 208 insertions(+), 86 deletions(-)

With full diff below

James

---

diff --git a/block/blk-core.c b/block/blk-core.c
index 2db8bda43b6e..2d53e2ff48ff 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -424,11 +425,11 @@ EXPORT_SYMBOL(blk_cleanup_queue);
 /**
  * blk_queue_enter() - try to increase q->q_usage_counter
  * @q: request queue pointer
- * @flags: BLK_MQ_REQ_NOWAIT and/or BLK_MQ_REQ_PREEMPT
+ * @flags: BLK_MQ_REQ_NOWAIT and/or BLK_MQ_REQ_PM
  */
 int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t

Re: [PATCH -next] tpm: Use kzalloc for allocating only one thing

2020-12-29 Thread James Bottomley

On Tue, 2020-12-29 at 21:51 +0800, Zheng Yongjun wrote:
> Use kzalloc rather than kcalloc(1,...)
> 
> The semantic patch that makes this change is as follows:
> (http://coccinelle.lip6.fr/)

What's the reason for wanting to do this transformation?

>  drivers/char/tpm/tpm1-cmd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/char/tpm/tpm1-cmd.c b/drivers/char/tpm/tpm1-
> cmd.c
> index ca7158fa6e6c..4d8415e3b778 100644
> --- a/drivers/char/tpm/tpm1-cmd.c
> +++ b/drivers/char/tpm/tpm1-cmd.c
> @@ -794,7 +794,7 @@ int tpm1_pm_suspend(struct tpm_chip *chip, u32
> tpm_suspend_pcr)
>   */
>  int tpm1_get_pcr_allocation(struct tpm_chip *chip)
>  {
> - chip->allocated_banks = kcalloc(1, sizeof(*chip-
> >allocated_banks),
> + chip->allocated_banks = kzalloc(sizeof(*chip->allocated_banks),
>   GFP_KERNEL);
>   if (!chip->allocated_banks)
>   return -ENOMEM;

The reason tpm1 has this is because it mirrors the allocation in tpm2
so we retain code consistency.  It's a fairly minor advantage, so it
could be changed if you have a better rationale ... but what is it?

James

Re: [PATCH v1 0/6] no-copy bvec

2020-12-24 Thread James Bottomley

On Wed, 2020-12-23 at 15:23 -0500, Douglas Gilbert wrote:
> On 2020-12-23 11:04 a.m., James Bottomley wrote:
> > On Wed, 2020-12-23 at 15:51 +, Christoph Hellwig wrote:
> > > On Wed, Dec 23, 2020 at 12:52:59PM +, Pavel Begunkov wrote:
> > > > Can scatterlist have 0-len entries? Those are directly
> > > > translated into bvecs, e.g. in nvme/target/io-cmd-file.c and
> > > > target/target_core_file.c. I've audited most of others by this
> > > > moment, they're fine.
> > > 
> > > For block layer SGLs we should never see them, and for nvme
> > > neither. I think the same is true for the SCSI target code, but
> > > please double check.
> > 
> > Right, no-one ever wants to see a 0-len scatter list entry.  The
> > reason is that every driver uses the sgl to program the device DMA
> > engine in the way NVME does.  a 0 length sgl would be a dangerous
> > corner case: some DMA engines would ignore it and others would go
> > haywire, so if we ever let a 0 length list down into the driver,
> > they'd have to understand the corner case behaviour of their DMA
> > engine and filter it accordingly, which is why we disallow them in
> > the upper levels, since they're effective nops anyway.
> 
> When using scatter gather lists at the far end (i.e. on the storage
> device) the T10 examples (WRITE SCATTERED and POPULATE TOKEN in SBC-
> 4) explicitly allow the "number of logical blocks" in their sgl_s to
> be zero and state that it is _not_ to be considered an error.

But that's pretty irrelevant.  The scatterlists that block has been
constructing to drive DMA engines pre-date SCSI's addition of SGLs by
decades (all SCSI commands before the object commands use a linear
buffer which is implemented in the HBA engine as a scatterlist but not
described by the SCSI standard as one).

So the answer to the question should the block layer emit zero length
sgl elements is "no" because they can confuse some DMA engines.

If there's a more theoretical question of whether the target driver in
adding commands it doesn't yet support should inject zero length SGL
elements into block because SCSI allows it, the answer is still "no"
because we don't want block to have SGLs that may confuse other DMA
engines.  There's lots of daft corner cases in the SCSI standard we
don't implement and a nop for SGL elements seems to be one of the more
hare brained because it adds no useful feature and merely causes
compatibility issues.

James

Re: [PATCH v1 0/6] no-copy bvec

2020-12-23 Thread James Bottomley

On Wed, 2020-12-23 at 15:51 +, Christoph Hellwig wrote:
> On Wed, Dec 23, 2020 at 12:52:59PM +, Pavel Begunkov wrote:
> > Can scatterlist have 0-len entries? Those are directly translated
> > into bvecs, e.g. in nvme/target/io-cmd-file.c and
> > target/target_core_file.c. I've audited most of others by this
> > moment, they're fine.
> 
> For block layer SGLs we should never see them, and for nvme neither.
> I think the same is true for the SCSI target code, but please double
> check.

Right, no-one ever wants to see a 0-len scatter list entry.  The reason
is that every driver uses the sgl to program the device DMA engine in
the way NVME does.  a 0 length sgl would be a dangerous corner case:
some DMA engines would ignore it and others would go haywire, so if we
ever let a 0 length list down into the driver, they'd have to
understand the corner case behaviour of their DMA engine and filter it
accordingly, which is why we disallow them in the upper levels, since
they're effective nops anyway.

James

Re: [PATCH v2] tpm: Rework open/close/shutdown to avoid races

2020-12-17 Thread James Bottomley

On Tue, 2020-12-15 at 10:51 -0800, James Bottomley wrote:
> On Tue, 2020-12-15 at 16:38 +0300, Sergey Temerkhanov wrote:
> > Avoid race condition at shutdown by shutting downn the TPM 2.0
> > devices synchronously. This eliminates the condition when the
> > shutdown sequence sets chip->ops to NULL leading to the following:
> > 
> > [ 1586.593561][ T8669] tpm2_del_space+0x28/0x73
> > [ 1586.598718][ T8669] tpmrm_release+0x27/0x33wq
> > [ 1586.603774][ T8669] __fput+0x109/0x1d
> > [ 1586.608380][ T8669] task_work_run+0x7c/0x90
> > [ 1586.613414][ T8669] prepare_exit_to_usermode+0xb8/0x128
> > [ 1586.619522][ T8669] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > [ 1586.626068][ T8669] RIP: 0033:0x4cb4bb
> 
> An actual bug report would have been helpful.  However, from this
> trace it's easy to deduce that tpm2_del_space() didn't get converted
> to the get/put of the chip ops ... it's still trying to do its own
> half arsed thing with tpm_chip_start() and the mutex.  So isn't a
> much simpler fix simply to convert it as below?  compile tested only,
> but if you can test it out I'll send a proper patch.

I got this booted and running here, so I know it works.  What I still
need to know is does it fix your problem?

James

[GIT PULL] first round of SCSI updates for the 5.10+ merge window

2020-12-16 Thread James Bottomley

This series consists of the usual driver updates (ufs, qla2xxx,
smartpqi, target, zfcp, fnic, mpt3sas, ibmvfc) plus a load of cleanups,
a major power management rework and a load of assorted minor updates. 
There are a few core updates (formatting fixes being the big one) but
nothing major this cycle.

We've got one obvious conflict in qla_nvme.c which is due to us having
the same patch (with different commit ids) in upstream and our pull
request ("scsi: qla2xxx: Return EBUSY on fcport deletion").

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc

The short changelog is:

Adrian Hunter (2):
  scsi: ufs: Allow an error return value from ->device_reset()
  scsi: ufs: Add DeepSleep feature

Ahmed S. Darwish (12):
  scsi: NCR5380: Remove context check
  scsi: myrs: Remove WARN_ON(in_interrupt())
  scsi: myrb: Remove WARN_ON(in_interrupt())
  scsi: mpt3sas: Remove in_interrupt()
  scsi: qla4xxx: Remove in_interrupt() from qla4_82xx_rom_lock()
  scsi: qla4xxx: Remove in_interrupt() from qla4_82xx_idc_lock()
  scsi: qla2xxx: Remove in_interrupt() from qla83xx-specific code
  scsi: target: tcm_qla2xxx: Remove BUG_ON(in_interrupt())
  scsi: qla2xxx: Remove in_interrupt() from qla82xx-specific code
  scsi: qla4xxx: Remove in_interrupt()
  scsi: hisi_sas: Remove preemptible()
  scsi: pm80xx: Do not sleep in atomic context

Arnd Bergmann (7):
  scsi: ufs: Fix -Wsometimes-uninitialized warning
  scsi: megaraid_sas: Simplify compat_ioctl handling
  scsi: megaraid_sas: Check user-provided offsets
  scsi: aacraid: Improve compat_ioctl handlers
  scsi: libfc: Fix enum-conversion warning
  scsi: libfc: Work around -Warray-bounds warning
  scsi: libfc: Move scsi/fc_encode.h to libfc

Arun Easi (5):
  scsi: qla2xxx: Fix device loss on 4G and older HBAs
  scsi: qla2xxx: Fix flash update in 28XX adapters on big endian machines
  scsi: qla2xxx: Fix FW initialization error on big endian machines
  scsi: qla2xxx: Fix crash during driver load on big endian machines
  scsi: qla2xxx: Fix compilation issue in PPC systems

Asutosh Das (1):
  scsi: ufs: qcom: Enable aggressive power collapse for ufs HBA

Bean Huo (1):
  scsi: ufs: Remove unnecessary if condition in ufshcd_suspend()

Bjorn Andersson (1):
  scsi: ufs: Adjust logic in common ADAPT helper

Bodo Stroesser (1):
  scsi: target: tcmu: scatter_/gather_data_area() rework

Can Guo (7):
  scsi: ufs: Print host regs in IRQ handler when AH8 error happens
  scsi: ufs: Fix a race condition between ufshcd_abort() and eh_work()
  scsi: ufs: Serialize eh_work with system PM events and async scan
  scsi: ufs: Stop hardcoding the scale down gear
  scsi: ufs-qcom: Keep core_clk_unipro on while link is active
  scsi: ufs: Refactor ufshcd_setup_clocks() to remove skip_ref_clk
  scsi: ufs: Put HBA into LPM during clk gating

Colin Ian King (5):
  scsi: qla4xxx: Remove redundant assignment to variable rval
  scsi: pm8001: Remove space in a debug message
  scsi: lpfc: Fix memory leak on lcb_context
  scsi: lpfc: Remove dead code on second !ndlp check
  scsi: lpfc: Fix pointer defereference before it is null checked issue

Daniel Wagner (1):
  scsi: qla2xxx: Return EBUSY on fcport deletion

David Disseldorp (4):
  scsi: target: Return COMPARE AND WRITE miscompare offsets
  scsi: target: Split out COMPARE AND WRITE memcmp into helper
  scsi: target: Rename cmd.bad_sector to cmd.sense_info
  scsi: target: Rename struct sense_info to sense_detail

Don Brace (3):
  scsi: smartpqi: Update version to 1.2.16-012
  scsi: smartpqi: Correct pqi_sas_smp_handler busy condition
  scsi: smartpqi: Correct driver removal with HBA disks

Eric Biggers (1):
  scsi: ufs-qcom: Only select QCOM_SCM if SCSI_UFS_CRYPTO

Finn Thain (2):
  scsi: NCR5380: Reduce NCR5380_maybe_release_dma_irq() call sites
  scsi: atari_scsi: Fix race condition between .queuecommand and EH

Gustavo A. R. Silva (9):
  scsi: target: core: Fix fall-through warnings for Clang
  scsi: stex: Fix fall-through warnings for Clang
  scsi: lpfc: Fix fall-through warnings for Clang
  scsi: csiostor: Fix fall-through warnings for Clang
  scsi: aha1740: Fix fall-through warnings for Clang
  scsi: aacraid: Fix fall-through warnings for Clang
  scsi: bfa: Fix fall-through warnings for Clang
  scsi: aic94xx: Fix fall-through warnings for Clang
  scsi: aic7xxx: Fix fall-through warnings for Clang

Hannes Reinecke (4):
  scsi: core: Return BLK_STS_AGAIN for ALUA transitioning
  scsi: scsi_dh_alua: Set 'transitioning' state on Unit Attention
  scsi: scsi_dh_alua: Return BLK_STS_AGAIN for ALUA transitioning state
  scsi: block: Return status code in blk_mq_end_request()

Jaegeuk Kim (6):
  scsi: ufs: Fix clkgating on/off
  scsi: ufs: Add more

Re: [PATCH v2] tpm: Rework open/close/shutdown to avoid races

2020-12-15 Thread James Bottomley

On Tue, 2020-12-15 at 16:38 +0300, Sergey Temerkhanov wrote:
> Avoid race condition at shutdown by shutting downn the TPM 2.0
> devices synchronously. This eliminates the condition when the
> shutdown sequence sets chip->ops to NULL leading to the following:
> 
> [ 1586.593561][ T8669] tpm2_del_space+0x28/0x73
> [ 1586.598718][ T8669] tpmrm_release+0x27/0x33wq
> [ 1586.603774][ T8669] __fput+0x109/0x1d
> [ 1586.608380][ T8669] task_work_run+0x7c/0x90
> [ 1586.613414][ T8669] prepare_exit_to_usermode+0xb8/0x128
> [ 1586.619522][ T8669] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 1586.626068][ T8669] RIP: 0033:0x4cb4bb

An actual bug report would have been helpful.  However, from this trace
it's easy to deduce that tpm2_del_space() didn't get converted to the
get/put of the chip ops ... it's still trying to do its own half arsed
thing with tpm_chip_start() and the mutex.  So isn't a much simpler fix
simply to convert it as below?  compile tested only, but if you can
test it out I'll send a proper patch.

James

---

diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2-space.c
index 784b8b3cb903..0c0cd225046f 100644
--- a/drivers/char/tpm/tpm2-space.c
+++ b/drivers/char/tpm/tpm2-space.c
@@ -58,12 +58,12 @@ int tpm2_init_space(struct tpm_space *space, unsigned int 
buf_size)
 
 void tpm2_del_space(struct tpm_chip *chip, struct tpm_space *space)
 {
-   mutex_lock(>tpm_mutex);
-   if (!tpm_chip_start(chip)) {
+
+   if (tpm_try_get_ops(chip) == 0) {
tpm2_flush_sessions(chip, space);
-   tpm_chip_stop(chip);
+   tpm_put_ops(chip);
}
-   mutex_unlock(>tpm_mutex);
+
kfree(space->context_buf);
kfree(space->session_buf);
 }

Re: [PATCH] KVM/SVM: add support for SEV attestation command

2020-12-13 Thread James Bottomley

On Wed, 2020-12-09 at 21:25 -0600, Brijesh Singh wrote:
> Noted, I will send v2 with these fixed.

I ran a test on this.  It turns out for rome systems you need firmware
md_sev_fam17h_model3xh_0.24b0A (or later) installed to get this and the
QEMU patch with the base64 decoding fixed, but with that

Tested-by: James Bottomley 

Attached is the test programme I used.

James

---

#!/usr/bin/python3
##
# Python script get an attestation and verify it with the PEK
#
# This assumes you've already exported the pek.cert with sev-tool
# from https://github.com/AMDESE/sev-tool.git
#
# sev-tool --export_cert_chain
#
# creates several files, the only one this script needs is pek.cert
#
# Tables and chapters refer to the amd 55766.pdf document
#
# https://www.amd.com/system/files/TechDocs/55766_SEV-KM_API_Specification.pdf
##
import sys
import os 
import base64
import hashlib
from argparse import ArgumentParser
from Crypto.PublicKey import ECC
from Crypto.Math.Numbers import Integer
from git.qemu.python.qemu import qmp

if __name__ == "__main__":
parser = ArgumentParser(description='Inject secret into SEV')
parser.add_argument('--pek-cert',
help='The Platform DH certificate in binary form',
default='pek.cert')
parser.add_argument('--socket',
help='Socket to connect to QMP on, defaults to 
localhost:6550',
default='localhost:6550')
args = parser.parse_args()

if (args.socket[0] == '/'):
socket = args.socket
elif (':' in args.socket):
s = args.socket.split(':')
socket = (s[0], int(s[1]))
else:
parse.error('--socket must be : or /path/to/unix')

fh = open(args.pek_cert, 'rb')
pek = bytearray(fh.read())
curve = int.from_bytes(pek[16:20], byteorder='little')
curves = {
1: 'p256',
2: 'p384'
}
Qx = int.from_bytes(bytes(pek[20:92]), byteorder='little')
Qy = int.from_bytes(bytes(pek[92:164]), byteorder='little')

pubkey = ECC.construct(point_x=Qx, point_y=Qy, curve=curves[curve])

Qmp = qmp.QEMUMonitorProtocol(address=socket);
Qmp.connect()
caps = Qmp.command('query-sev')
print('SEV query found API={api-major}.{api-minor} build={build-id} 
policy={policy}\n'.format(**caps))

nonce=os.urandom(16)

report = Qmp.command('query-sev-attestation-report',
 mnonce=base64.b64encode(nonce).decode())

a = base64.b64decode(report['data'])

##
# returned data is formulated as Table 60. Attestation Report Buffer
##
rnonce = a[0:16]
rmeas = a[16:48]

if (nonce != rnonce):
sys.exit('returned nonce doesn\'t match input nonce')

policy = int.from_bytes(a[48:52], byteorder='little')
usage = int.from_bytes(a[52:56], byteorder='little')
algo = int.from_bytes(a[56:60], byteorder='little')

if (policy != caps['policy']):
sys.exit('Policy mismatch:', policy, '!=', caps['policy'])

if (usage != 0x1002):
sys.exit('error PEK is not specified in usage: ', usage)

if (algo == 0x2):
h = hashlib.sha256()
elif (algo == 0x102):
##
# The spec (6.8) says the signature must be ECDSA-SHA256 so this
# should be impossible, but it turns out to be the way our
# current test hardware produces its signature
##
h = hashlib.sha384()
else:
sys.exit('unrecognized signing algorithm: ', algo)

h.update(a[0:52])

sig = a[64:208]
r = int.from_bytes(sig[0:72],byteorder='little')
s = int.from_bytes(sig[72:144],byteorder='little')
##
# subtlety: r and s are little (AMD defined) z is big (crypto requirement)
##
z = int.from_bytes(h.digest(), byteorder='big')

##
# python crypto doesn't have a way of passing in r and s as
# integers and I'm not inclined to wrap them up as a big endian
# binary signature to have Signature.DSS unwrap them again, so
# call the _verify() private interface that does take integers
##
if (not pubkey._verify(Integer(z), (Integer(r), Integer(s:
sys.exit('returned signature did not verify')

print('usage={usage}, algorithm={algo}'.format(usage=hex(usage),
   algo=hex(algo)))
print('ovmf-hash: ', rmeas.hex())

[GIT PULL] SCSI fixes for 5.10-rc7

2020-12-12 Thread James Bottomley

Five small fixes: four in drivers: hisi_sas: fix internal queue
timeout, be2iscsi: revert a prior fix causing problems, bnx2i: add
missing dependency, storvsc: late arriving revert of a problem fix, and
one in the core.  The core one is a minor change to stop paying
attention to the busy count when returning out of resources because
there's a race window where the queue might not restart due to missing
returning I/O.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Andrea Parri (Microsoft) (1):
  Revert "scsi: storvsc: Validate length of incoming packet in 
storvsc_on_channel_callback()"

Dan Carpenter (1):
  scsi: be2iscsi: Revert "Fix a theoretical leak in beiscsi_create_eqs()"

Ming Lei (1):
  scsi: core: Fix race between handling STS_RESOURCE and completion

Randy Dunlap (1):
  scsi: bnx2i: Requires MMU

Xiang Chen (1):
  scsi: hisi_sas: Select a suitable queue for internal I/Os

And the diffstat:

 drivers/scsi/be2iscsi/be_main.c| 4 ++--
 drivers/scsi/bnx2i/Kconfig | 1 +
 drivers/scsi/hisi_sas/hisi_sas_main.c  | 6 ++
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 5 +
 drivers/scsi/scsi_lib.c| 3 +--
 drivers/scsi/storvsc_drv.c | 5 -
 6 files changed, 15 insertions(+), 9 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c
index 202ba925c494..5c3513a4b450 100644
--- a/drivers/scsi/be2iscsi/be_main.c
+++ b/drivers/scsi/be2iscsi/be_main.c
@@ -3020,7 +3020,6 @@ static int beiscsi_create_eqs(struct beiscsi_hba *phba,
goto create_eq_error;
}
 
-   mem->dma = paddr;
mem->va = eq_vaddress;
ret = be_fill_queue(eq, phba->params.num_eq_entries,
sizeof(struct be_eq_entry), eq_vaddress);
@@ -3030,6 +3029,7 @@ static int beiscsi_create_eqs(struct beiscsi_hba *phba,
goto create_eq_error;
}
 
+   mem->dma = paddr;
ret = beiscsi_cmd_eq_create(>ctrl, eq,
BEISCSI_EQ_DELAY_DEF);
if (ret) {
@@ -3086,7 +3086,6 @@ static int beiscsi_create_cqs(struct beiscsi_hba *phba,
goto create_cq_error;
}
 
-   mem->dma = paddr;
ret = be_fill_queue(cq, phba->params.num_cq_entries,
sizeof(struct sol_cqe), cq_vaddress);
if (ret) {
@@ -3096,6 +3095,7 @@ static int beiscsi_create_cqs(struct beiscsi_hba *phba,
goto create_cq_error;
}
 
+   mem->dma = paddr;
ret = beiscsi_cmd_cq_create(>ctrl, cq, eq, false,
false, 0);
if (ret) {
diff --git a/drivers/scsi/bnx2i/Kconfig b/drivers/scsi/bnx2i/Kconfig
index 75ace2302fed..0cc06c2ce0b8 100644
--- a/drivers/scsi/bnx2i/Kconfig
+++ b/drivers/scsi/bnx2i/Kconfig
@@ -4,6 +4,7 @@ config SCSI_BNX2_ISCSI
depends on NET
depends on PCI
depends on (IPV6 || IPV6=n)
+   depends on MMU
select SCSI_ISCSI_ATTRS
select NETDEVICES
select ETHERNET
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index c8dd8588f800..274ccf18ce2d 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -452,6 +452,12 @@ static int hisi_sas_task_prep(struct sas_task *task,
blk_tag = blk_mq_unique_tag(scmd->request);
dq_index = blk_mq_unique_tag_to_hwq(blk_tag);
*dq_pointer = dq = _hba->dq[dq_index];
+   } else if (hisi_hba->shost->nr_hw_queues)  {
+   struct Scsi_Host *shost = hisi_hba->shost;
+   struct blk_mq_queue_map *qmap = 
>tag_set.map[HCTX_TYPE_DEFAULT];
+   int queue = qmap->mq_map[raw_smp_processor_id()];
+
+   *dq_pointer = dq = _hba->dq[queue];
} else {
*dq_pointer = dq = sas_dev->dq;
}
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 7133ca859b5e..960de375ce69 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -2452,6 +2452,11 @@ static int interrupt_init_v3_hw(struct hisi_hba 
*hisi_hba)
rc = -ENOENT;
goto free_irq_vectors;
}
+   cq->irq_mask = pci_irq_get_affinity(pdev, i + 
BASE_VECTORS_V3_HW);
+   if (!cq->irq_mask) {
+   dev_err(dev, "could not get cq%d irq affinity!\n", i);
+   return -ENOENT;
+   }
}
 
return 0;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index

Re: [PATCH AUTOSEL 5.7 03/30] ima: extend boot_aggregate with kernel measurements

2020-12-11 Thread James Bottomley

On Fri, 2020-12-11 at 06:01 -0500, Mimi Zohar wrote:
> On Thu, 2020-12-10 at 21:10 -0600, Tyler Hicks wrote:
> > On 2020-11-29 08:17:38, Mimi Zohar wrote:
> > > Hi Sasha,
> > > 
> > > On Wed, 2020-07-08 at 21:27 -0400, Sasha Levin wrote:
> > > > On Wed, Jul 08, 2020 at 12:13:13PM -0400, Mimi Zohar wrote:
> > > > > Hi Sasha,
> > > > > 
> > > > > On Wed, 2020-07-08 at 11:40 -0400, Sasha Levin wrote:
> > > > > > From: Maurizio Drocco 
> > > > > > 
> > > > > > [ Upstream commit 20c59ce010f84300f6c655d32db2610d3433f85c
> > > > > > ]
> > > > > > 
> > > > > > Registers 8-9 are used to store measurements of the kernel
> > > > > > and its command line (e.g., grub2 bootloader with tpm
> > > > > > module enabled). IMA should include them in the boot
> > > > > > aggregate. Registers 8-9 should be only included in non-
> > > > > > SHA1 digests to avoid ambiguity.
> > > > > 
> > > > > Prior to Linux 5.8, the SHA1 template data hashes were padded
> > > > > before being extended into the TPM.  Support for calculating
> > > > > and extending the per TPM bank template data digests is only
> > > > > being upstreamed in Linux 5.8.
> > > > > 
> > > > > How will attestation servers know whether to include PCRs 8 &
> > > > > 9 in the the boot_aggregate calculation?  Now, there is a
> > > > > direct relationship between the template data SHA1 padded
> > > > > digest not including PCRs 8 & 9, and the new per TPM bank
> > > > > template data digest including them.
> > > > 
> > > > Got it, I'll drop it then, thank you!
> > > 
> > > After re-thinking this over, I realized that the attestation
> > > server can verify the "boot_aggregate" based on the quoted PCRs
> > > without knowing whether padded SHA1 hashes or per TPM bank hash
> > > values were extended into the TPM[1], but non-SHA1 boot aggregate
> > > values [2] should always include PCRs 8 & 9.
> > 
> > I'm still not clear on how an attestation server would know to
> > include PCRs 8 and 9 after this change came through a stable kernel
> > update. It doesn't seem like something appropriate for stable since
> > it requires code changes to attestation servers to handle the
> > change.
> > 
> > I know this has already been released in some stable releases, so
> > I'm too late, but perhaps I'm missing something.
> 
> The point of adding PCRs 8 & 9 only to non-SHA1 boot_aggregate values
> was to avoid affecting existing attestation servers.  The intention
> was when attestation servers added support for the non-sha1
> boot_aggregate values, they'd also include PCRs 8 & 9.  The existing
> SHA1 boot_aggregate value remains PCRs 0 - 7.
> 
> To prevent this or something similar from happening again, what
> should have been the proper way of including PCRs 8 & 9?

Just to be pragmatic: this is going to happen again.  Shim is already
measuring the Mok variables through PCR 14, so if we want an accurate
boot aggregate, we're going to have to include PCR 14 as well (or
persuade shim to measure through a PCR we're already including, which
isn't impossible since I think shim should be measuring the Mok
variables using the EV_EFI_VARIABLE_DRIVER_CONFIG event and, since it
affects secure boot policy, that does argue it should be measured
through PCR 7).

James

Re: [PATCH v3 3/4] tpm_tis: Disable interrupts if interrupt storm detected

2020-12-07 Thread James Bottomley

On Mon, 2020-12-07 at 15:28 -0400, Jason Gunthorpe wrote:
> On Sun, Dec 06, 2020 at 08:26:16PM +0100, Thomas Gleixner wrote:
> > Just as a side note. I was looking at tpm_tis_probe_irq_single()
> > and that function is leaking the interrupt request if any of the
> > checks afterwards fails, except for the final interrupt probe check
> > which does a cleanup. That means on fail before that the interrupt
> > handler stays requested up to the point where the module is
> > removed. If that's a shared interrupt and some other device is
> > active on the same line, then each interrupt from that device will
> > call into the TPM code. Something like the below is needed.
> > 
> > Also the X86 autoprobe mechanism is interesting:
> > 
> > if (IS_ENABLED(CONFIG_X86))
> > for (i = 3; i <= 15; i++)
> > if (!tpm_tis_probe_irq_single(chip, intmask, 0,
> > i))
> > return;
> > 
> > The third argument is 'flags' which is handed to request_irq(). So
> > that won't ever be able to probe a shared interrupt. But if an
> > interrupt number > 0 is handed to tpm_tis_core_init() the interrupt
> > is requested with IRQF_SHARED. Same issue when the chip has an
> > interrupt number in the register. It's also requested exclusive
> > which is pretty likely to fail on ancient x86 machines.
> 
> It is very likely none of this works any more, it has been repeatedly
> reworked over the years and just left behind out of fear someone
> needs it. I've thought it should be deleted for a while now.
> 
> I suppose the original logic was to try and probe without SHARED
> because a probe would need exclusive access to the interrupt to tell
> if the TPM was actually the source, not some other device.
> 
> It is all very old and very out of step with current thinking, IMHO.
> I skeptical that TPM interrupts were ever valuable enough to deserve
> this in the first place.

For what it's worth, I agree.  Trying to probe all 15 ISA interrupts is
last millennium thinking we should completely avoid.  If it's not
described in ACPI then you don't get an interrupt full stop.

James

Re: [PATCH v13 0/3] scsi: ufs: Add Host Performance Booster Support

2020-12-07 Thread James Bottomley

On Mon, 2020-12-07 at 19:35 +0100, Greg KH wrote:
> On Mon, Dec 07, 2020 at 06:26:03PM +, Christoph Hellwig wrote:
> > On Mon, Dec 07, 2020 at 07:23:12PM +0100, Greg KH wrote:
> > > What "real workload" test can be run on this to help show if it
> > > is useful or not?  These vendors seem to think it helps for some
> > > reason, otherwise they wouldn't have added it to their silicon :)
> > > 
> > > Should they run fio?  If so, any hints on a config that would be
> > > good to show any performance increases?
> > 
> > A real actual workload that matters.  Then again that was Martins
> > request to even justify it.  I don't think the broken addressing
> > that breaks a whole in the SCSI addressing has absolutely not
> > business being supported in Linux ever.  The vendors should have
> > thought about the design before committing transistors to something
> > that fundamentally does not make sense.

Actually, that's not the way it works: vendors add commands because
standards mandate.  That's why people who want weird commands go and
join standard committees.  Unfortunately this means that a lot of the
commands the standard mandates end up not being very useful in
practice.  For instance in SCSI we really only implement a fraction of
the commands in the standard.

In this case, the industry already tried a very similar approach with
GEN 1 hybrid drives and it turned into a complete disaster, which is
why the mode became optional in shingle drives and much better modes,
which didn't have the huge shared state problem, superseded it.  Plus
truncating the LBA of a READ 16 to 4 bytes is asking for capacity
problems down the line, so even the actual implementation seems to be
problematic.

All in all, this looks like a short term fix which will go away when
the drive capacity improves and thus all the effort changing the driver
will eventually be wasted.

> So "time to boot an android system with this enabled and disabled"
> would be a valid workload, right?  I'm guessing that's what the
> vendors here actually care about, otherwise there is no real stress-
> test on a UFS system that I know of.

Um, does it?  I don't believe even the UFS people have claimed this. 
The problem is that HPB creates a shared state between the driver and
the device.  That shared state has to be populated, which has to happen
at start of day, so it's entirely unclear if this is a win or a slow
down for boot.

James

Re: [PATCH v3 1/4] irq: export kstat_irqs

2020-12-06 Thread James Bottomley

On Sun, 2020-12-06 at 17:40 +0100, Thomas Gleixner wrote:
> On Sat, Dec 05 2020 at 12:39, Jarkko Sakkinen wrote:
> > On Fri, Dec 04, 2020 at 06:43:37PM -0700, Jerry Snitselaar wrote:
> > > To try and detect potential interrupt storms that
> > > have been occurring with tpm_tis devices it was suggested
> > > to use kstat_irqs() to get the number of interrupts.
> > > Since tpm_tis can be built as a module it needs kstat_irqs
> > > exported.
> > 
> > I think you should also have a paragraph explicitly stating that
> > i915_pmu.c contains a duplicate of kstat_irqs() because it is not
> > exported as of today. It adds a lot more weight to this given that
> > there is already existing mainline usage (kind of).
> 
> It's abusage and just the fact that it exists is not an argument by
> itself.

What we want is a count of the interrupts to see if we're having an
interrupt storm from the TPM device (some seem to be wired to fire the
interrupt even when there's no event to warrant it).  Since
kstat_irqs_user() does the correct RCU locking, should we be using that
instead?

James

Re: [RFC PATCH v1 07/12] efi: Replace strstarts() by str_has_prefix().

2020-12-05 Thread James Bottomley

On Sat, 2020-12-05 at 22:20 +0100, Ard Biesheuvel wrote:
> On Sat, 5 Dec 2020 at 22:15, James Bottomley
>  wrote:
> > [Rostedt added because this is all his fault]
> > On Sat, 2020-12-05 at 21:57 +0100, Ard Biesheuvel wrote:
> > > On Sat, 5 Dec 2020 at 21:24, James Bottomley
> > >  wrote:
> > [...]
> > > > > So I don't object to using str_has_prefix() in new code in
> > > > > this way, but I really don't see the point of touching
> > > > > existing code.
> > > > 
> > > > That's your prerogative as a Maintainer ... I was just
> > > > explaining what the original author had in mind when
> > > > str_has_prefix() was created.
> > > > 
> > > 
> > > Sure, I fully understand you are not the one proposing these
> > > changes.
> > > 
> > > But if the pattern in question is so common, couldn't we go one
> > > step further and define something like
> > > 
> > > static inline const u8 *skip_prefix_or_null(const u8 *str, const
> > > u8 *prefix)
> > > {
> > > }
> > > 
> > > which returns a pointer into the original string, or NULL if the
> > > prefix is not present.
> > > 
> > > The current patch as proposed has no benefit whatsoever, but even
> > > the meaningful alternative you are proposing is not actually an
> > > improvement, given that it is not self-explanatory from the name
> > > 'str_has_prefix' what it returns, and so the code becomes more
> > > difficult to understand.
> > 
> > Ah, so this is the kernel maintainer's syndrome: you see an API
> > which isn't quite right for your use case, so you update or change
> > it.  Then you see other use cases for it and suddenly to you it
> > becomes the best thing since sliced bread and with a one ring to
> > rule them all mentality you exhort everyone to use this new API
> > everywhere.  See this comment in the merge commit (495d714ad1400)
> > which comes from the merge cover letter:
> > 
> > > - Addition of str_has_prefix() and a few use cases. There
> > >   currently is a similar function strstart() that is used in
> > > a
> > >   few places, but only returns a bool and not a length. These
> > >   instances will be removed in the future to use
> > >   str_has_prefix() instead.
> > 
> > Then you forget about it until someone else acts on your somewhat
> > ill considered instruction and actually tries the
> > replacement.  Once someone takes up your cause, the API shows up in
> > dozens of emails and the actual debate about whether or not this is
> > such a good API really begins, with the poor person who picked it
> > up caught in the crossfire.
> > 
> > As maintainers we really should learn to put the cart before the 

s/to put/not to put/

> > horse.
> > 
> 
> I am not disagreeing with any of this, but I simply don't see a point
> in merging patches that apparently result in the exact same machine
> code to be generated, and don't substantially make the code itself
> any better.


Well, I think the pattern

if (strstarts(option, )) {
   ...
   option += strlen();

is a bad one because one day  may get updated but not .  And if  is too far away in the code it might not
even show up in the diff, leading to reviewers not noticing either.  So
I think eliminating the pattern is a definite improvement.

Now whether the improvement is enough that we should churn the code
base to fix it is another question.

James

Re: [RFC PATCH v1 07/12] efi: Replace strstarts() by str_has_prefix().

2020-12-05 Thread James Bottomley

[Rostedt added because this is all his fault]
On Sat, 2020-12-05 at 21:57 +0100, Ard Biesheuvel wrote:
> On Sat, 5 Dec 2020 at 21:24, James Bottomley
>  wrote:
[...]
> > > So I don't object to using str_has_prefix() in new code in this
> > > way, but I really don't see the point of touching existing code.
> > 
> > That's your prerogative as a Maintainer ... I was just explaining
> > what the original author had in mind when str_has_prefix() was
> > created.
> > 
> 
> Sure, I fully understand you are not the one proposing these changes.
> 
> But if the pattern in question is so common, couldn't we go one step
> further and define something like
> 
> static inline const u8 *skip_prefix_or_null(const u8 *str, const u8
> *prefix)
> {
> }
> 
> which returns a pointer into the original string, or NULL if the
> prefix is not present.
> 
> The current patch as proposed has no benefit whatsoever, but even the
> meaningful alternative you are proposing is not actually an
> improvement, given that it is not self-explanatory from the name
> 'str_has_prefix' what it returns, and so the code becomes more
> difficult to understand.

Ah, so this is the kernel maintainer's syndrome: you see an API which
isn't quite right for your use case, so you update or change it.  Then
you see other use cases for it and suddenly to you it becomes the best
thing since sliced bread and with a one ring to rule them all mentality
you exhort everyone to use this new API everywhere.  See this comment
in the merge commit (495d714ad1400) which comes from the merge cover
letter:

> - Addition of str_has_prefix() and a few use cases. There
>   currently is a similar function strstart() that is used in a 
>   few places, but only returns a bool and not a length. These 
>   instances will be removed in the future to use 
>   str_has_prefix() instead.

Then you forget about it until someone else acts on your somewhat ill
considered instruction and actually tries the replacement.  Once
someone takes up your cause, the API shows up in dozens of emails and
the actual debate about whether or not this is such a good API really
begins, with the poor person who picked it up caught in the crossfire.

As maintainers we really should learn to put the cart before the horse.

James

Re: [RFC PATCH v1 07/12] efi: Replace strstarts() by str_has_prefix().

2020-12-05 Thread James Bottomley

On Sat, 2020-12-05 at 20:36 +0100, Ard Biesheuvel wrote:
> On Fri, 4 Dec 2020 at 19:02, James Bottomley
>  wrote:
> > On Fri, 2020-12-04 at 18:07 +0100, Ard Biesheuvel wrote:
> > > On Fri, 4 Dec 2020 at 18:06, 
> > > wrote:
> > > > From: Francis Laniel 
> > > > 
> > > > The two functions indicates if a string begins with a given
> > > > prefix. The only difference is that strstarts() returns a bool
> > > > while str_has_prefix() returns the length of the prefix if the
> > > > string begins with it or 0 otherwise.
> > > > 
> > > 
> > > Why?
> > 
> > I think I can answer that.  If the conversion were done properly
> > (which it's not) you could get rid of the double strings in the
> > code which are error prone if you update one and forget
> > another.  This gives a good example: 3d739c1f6156 ("tracing: Use
> > the return of str_has_prefix() to remove open coded numbers"). so
> > in your code you'd replace things like
> > 
> > if (strstarts(option, "rgb")) {
> > option += strlen("rgb");
> > ...
> > 
> > with
> > 
> > len = str_has_prefix(option, "rgb");
> > if (len) {
> > option += len
> > ...
> > 
> > Obviously you also have cases where strstart is used as a boolean
> > with no need to know the length ... I think there's no value to
> > converting those.
> > 
> 
> This will lead to worse code being generated. strlen() is evaluated
> at build time by the compiler if the argument is a string literal, so
> your 'before' version gets turned into 'option += 3', whereas the
> latter needs to use a runtime variable.

str_has_prefix() is an always_inline function so it should be build
time evaluated as well.  I think most compilers see len as being a
constant and unchanged, so elide the variable.  This means the code
generated should be the same.

> So I don't object to using str_has_prefix() in new code in this way,
> but I really don't see the point of touching existing code.

That's your prerogative as a Maintainer ... I was just explaining what
the original author had in mind when str_has_prefix() was created.

James

[GIT PULL] SCSI fixes for 5.10-rc6

2020-12-05 Thread James Bottomley

Four small fixes in two drivers.  The mpt3sas fixes are all timeout
under unusual conditions problems and the storvsc is a missed incoming
packet validation and a missed error return.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Andrea Parri (Microsoft) (1):
  scsi: storvsc: Validate length of incoming packet in 
storvsc_on_channel_callback()

Jing Xiangfeng (1):
  scsi: storvsc: Fix error return in storvsc_probe()

Sreekanth Reddy (1):
  scsi: mpt3sas: Increase IOCInit request timeout to 30s

Suganath Prabu S (1):
  scsi: mpt3sas: Fix ioctl timeout

And the diffstat:

 drivers/scsi/mpt3sas/mpt3sas_base.c | 2 +-
 drivers/scsi/mpt3sas/mpt3sas_ctl.c  | 2 +-
 drivers/scsi/storvsc_drv.c  | 9 -
 3 files changed, 10 insertions(+), 3 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index e4cc92bc4d94..bb940cbcbb5d 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -6459,7 +6459,7 @@ _base_send_ioc_init(struct MPT3SAS_ADAPTER *ioc)
 
r = _base_handshake_req_reply_wait(ioc,
sizeof(Mpi2IOCInitRequest_t), (u32 *)_request,
-   sizeof(Mpi2IOCInitReply_t), (u16 *)_reply, 10);
+   sizeof(Mpi2IOCInitReply_t), (u16 *)_reply, 30);
 
if (r != 0) {
ioc_err(ioc, "%s: handshake failed (r=%d)\n", __func__, r);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_ctl.c 
b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
index 0f2b681449e6..edd26a2570fa 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_ctl.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
@@ -664,7 +664,7 @@ _ctl_do_mpt_command(struct MPT3SAS_ADAPTER *ioc, struct 
mpt3_ioctl_command karg,
Mpi26NVMeEncapsulatedRequest_t *nvme_encap_request = NULL;
struct _pcie_device *pcie_device = NULL;
u16 smid;
-   u8 timeout;
+   unsigned long timeout;
u8 issue_reset;
u32 sz, sz_arg;
void *psge;
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 0c65fbd41035..99c8ff81de74 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1246,6 +1246,11 @@ static void storvsc_on_channel_callback(void *context)
request = (struct storvsc_cmd_request *)
((unsigned long)desc->trans_id);
 
+   if (hv_pkt_datalen(desc) < sizeof(struct vstor_packet) - 
vmscsi_size_delta) {
+   dev_err(>device, "Invalid packet len\n");
+   continue;
+   }
+
if (request == _device->init_request ||
request == _device->reset_request) {
memcpy(>vstor_packet, packet,
@@ -1994,8 +1999,10 @@ static int storvsc_probe(struct hv_device *device,
alloc_ordered_workqueue("storvsc_error_wq_%d",
WQ_MEM_RECLAIM,
host->host_no);
-   if (!host_dev->handle_error_wq)
+   if (!host_dev->handle_error_wq) {
+   ret = -ENOMEM;
goto err_out2;
+   }
INIT_WORK(_dev->host_scan_work, storvsc_host_scan);
/* Register the HBA and start the scsi bus scan */
ret = scsi_add_host(host, >device);

Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected

2020-12-04 Thread James Bottomley

On Fri, 2020-12-04 at 14:51 -0700, Jerry Snitselaar wrote:
> James Bottomley @ 2020-12-03 14:05 MST:
> 
> > On Thu, 2020-12-03 at 13:14 -0700, Jerry Snitselaar wrote:
> > > Jerry Snitselaar @ 2020-12-02 23:11 MST:
> > [...]
> > > > The interrupt storm detection code works on the T490s. I'm not
> > > > sure what is going on with the L490. I will see if I can get
> > > > access to one.
> > > > 
> > > > Jerry
> > > 
> > > Lenovo verified that the L490 hangs.
> > 
> > Just to confirm, that's this system:
> > 
> > https://www.lenovo.com/us/en/laptops/thinkpad/thinkpad-l/ThinkPad-L490/p/22TP2TBL490
> > 
> > We could ask if lenovo will give us one, but if not we could pull a
> > Jens.  [the backstory is that when Jens was doing queueing in the
> > block layer, there were lots of SATA devices that didn't work quite
> > right but you couldn't tell unless you actually tried them
> > out.  Getting manufacturers to send samples is rather arduous, so
> > he took to ordering them online, testing them out, and then
> > returning them for a full refund within the allowed window]
> > 
> > It looks like Lenovo has a nice christmas returns policy:
> > 
> > https://www.lenovo.com/us/en/shopping-faq/#returns
> > 
> > James
> 
> Yes, that is the one. I'm seeing if we have any located somewhere, or
> if Lenovo will loan me one.

OK, that would be best since it's your patch.  We've got 12 days left
to get the free returns policy, so if you haven't managed to find one
by the end of next week, I'll order one and play with it ... you could
always put your manager's credit card down for one as well ...
especially as it should get refunded ...

>  I think for the time being the patch that disabled interrupts for
> the T490s could be changed to it for the L490 instead. I'll post a v3
> of my current patchset. It would probably make sense for it to go in
> with your patches when they land.

Absolutely ... we can't turn interrupts on and then have a load of hung
laptops we already knew about ...

James

Re: [RFC PATCH v1 07/12] efi: Replace strstarts() by str_has_prefix().

2020-12-04 Thread James Bottomley

On Fri, 2020-12-04 at 18:07 +0100, Ard Biesheuvel wrote:
> On Fri, 4 Dec 2020 at 18:06, 
> wrote:
> > From: Francis Laniel 
> > 
> > The two functions indicates if a string begins with a given prefix.
> > The only difference is that strstarts() returns a bool while
> > str_has_prefix()
> > returns the length of the prefix if the string begins with it or 0
> > otherwise.
> > 
> 
> Why? 

I think I can answer that.  If the conversion were done properly (which
it's not) you could get rid of the double strings in the code which are
error prone if you update one and forget another.  This gives a good
example: 3d739c1f6156 ("tracing: Use the return of str_has_prefix() to
remove open coded numbers"). so in your code you'd replace things like

if (strstarts(option, "rgb")) {
option += strlen("rgb");
...

with 

len = str_has_prefix(option, "rgb");
if (len) {
option += len
...

Obviously you also have cases where strstart is used as a boolean with
no need to know the length ... I think there's no value to converting
those.

James

Re: [RFC PATCH v1 00/12] Replace strstarts() by str_has_prefix()

2020-12-04 Thread James Bottomley

On Fri, 2020-12-04 at 18:03 +0100, laniel_fran...@privacyrequired.com
wrote:
> In this patch set, I replaced all calls to strstarts() by calls to
> str_has_prefix(). Indeed, the kernel has two functions to test if a
> string begins with an other:
> 1. strstarts() which returns a bool, so 1 if the string begins with
> the prefix,0 otherwise.
> 2. str_has_prefix() which returns the length of the prefix or 0.
> 
> str_has_prefix() was introduced later than strstarts(), in commit
> 495d714ad140 which also stated that str_has_prefix() should replace
> strstarts(). This is what this patch set does.

What's the reason why?  If you look at the use cases for the
replacement of strstart()  they're all cases where we need to know the
length we're skipping and this is hard coded, leading to potential
errors later.  This is a classic example:  3d739c1f6156 ("tracing: Use
the return of str_has_prefix() to remove open coded numbers").  However
you're not doing this transformation in the conversion, so the
conversion is pretty useless.  I also see no case for replacing
strstart() where we're using it simply as a boolean without needing to
know the length of the prefix.

James

Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch

2020-12-03 Thread James Bottomley

On Thu, 2020-12-03 at 14:17 -0500, Konstantin Ryabitsev wrote:
> On Thu, Dec 03, 2020 at 08:55:54AM -0800, Joe Perches wrote:
> > Perhaps automate a mechanism to capture that information as
> > git notes for the patches when applied.
> 
> Git notes have a limited usefulness for this -- they are indeed part
> of the repository, but they aren't replicated unless someone does a 
> --mirror clone (or specifically fetches refs/notes/*). If the goal is
> to improve visibility for contributors, then putting this info into a
> git note will hardly make more difference than providing a Link: that
> someone has to follow to a list archival service. 
> 
> I can offer the following proposal:
> 
> - kernel.org already monitors all mailing lists that are archived on 
>   lore.kernel.org for the purposes of pull request tracking 
>   (pr-tracker-bot).
> - in the near future, we will add a separate process that will 
>   auto-explode all pull requests into individual patches and add them
>   to a separate public-inbox archive (think of it as another 
>   transparency log, since pull requests are transient and opaque).
> 
> We can additionally:
> 
> - identify all Link: and Message-Id: entries in commit messages, 
>   retrieve the threads they refer to, and archive them as part of

>   the same (or adjacent) transparency log.
> 
> This offers an improvement over the status quo, because if 
> lore.kernel.org becomes unavailable, someone would have to have
> access to all backend archive repositories it is currently tracking
> in order to be able to reconstitute relevant conversations -- whereas
> with this change, it should be sufficient to just have the copy of
> the  transparency log to have a fully self-contained high-relevancy
> archive of both individual commits and conversations that happened
> around them.

I don't think this is strictly necessary because there's more than lore
that archive's our lists, but the people at the internet history
project would remind me not to look a gift horse in the mouth, so I
think this would certainly be a useful addition.

The thing which Link: doesn't necessarily track is iterations, so if
you replied to v2 and your feedback got incorporated, there's a v3
iteration which has a different msgid.  Is there a way of getting this
full history, not just the current thread?

> I'm just not sure if this will help with the subject of the 
> conversation, or if it does not serve the goal of recognizing
> developer contributions by making them more visible.

I added Jon to the cc since a lot of managers (community or otherwise)
do use the lwn.net stats as a performance guide.  The real question is
could we get something measurable out of the data?  say number of
replies to an accepted patch counting in the reviewer stats or
something?

James

Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected

2020-12-03 Thread James Bottomley

On Thu, 2020-12-03 at 13:14 -0700, Jerry Snitselaar wrote:
> Jerry Snitselaar @ 2020-12-02 23:11 MST:
[...]
> > The interrupt storm detection code works on the T490s. I'm not sure
> > what is going on with the L490. I will see if I can get access to
> > one.
> > 
> > Jerry
> 
> Lenovo verified that the L490 hangs.

Just to confirm, that's this system:

https://www.lenovo.com/us/en/laptops/thinkpad/thinkpad-l/ThinkPad-L490/p/22TP2TBL490

We could ask if lenovo will give us one, but if not we could pull a
Jens.  [the backstory is that when Jens was doing queueing in the block
layer, there were lots of SATA devices that didn't work quite right but
you couldn't tell unless you actually tried them out.  Getting
manufacturers to send samples is rather arduous, so he took to ordering
them online, testing them out, and then returning them for a full
refund within the allowed window]

It looks like Lenovo has a nice christmas returns policy:

https://www.lenovo.com/us/en/shopping-faq/#returns

James

Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch

2020-12-03 Thread James Bottomley

On Thu, 2020-12-03 at 13:52 -0500, Matthew Wilcox wrote:
> It's not so much "clean history" that's the desire. It's "don't leave
> landmines for git bisect".

... top posting?

Well functional git bisect and show the evolution of the patch aren't
mutually exclusive.  Plus our current clean history approach doesn't
always detect them ...

That said, I agree that if a review comes in that shows a patch would
break the build or runtime in a way that would damage bisection, it
should be folded.  I suppose this argues that only less trivial changes
can be shown as part of the unfolded history, and since they're
obviously less important than the review driven folded change it would
add more bias to track them.

I fall back to my link is enough comment.

James

Re: [Ksummit-discuss] crediting bug reports and fixes folded into original patch

2020-12-03 Thread James Bottomley

On Thu, 2020-12-03 at 00:43 +0100, Vlastimil Babka wrote:
> Hi,
> 
> there was a bit of debate on Twitter about this, so I thought I would
> bring it here. Imagine a scenario where patch sits as a commit in
> -next and there's a bug report or fix, possibly by a bot or with some
> static analysis. The maintainer decides to fold it into the original
> patch, which makes sense for e.g. bisectability. But there seem to be
> no clear rules about attribution in this case, which looks like there
> should be, probably in
> Documentation/maintainer/modifying-patches.rst
> 
> The original bug fix might include a From: $author, a Reported-by:
> (e.g. syzbot), Fixes: $next-commit, some tag such as Addresses-
> Coverity: to credit the static analysis tool, and an SoB. After
> folding, all that's left might be a line as "include fix from
> $author" in the SoB area. This is a loss of metadata/attribution just
> due to folding, and might make contributors unhappy. Had they sent
> the fix after the original commit was mainline and immutable, all
> the info above would "survive" in the form of new commit.

It has been the case since forever that discussion which improves an
uncommitted patch is only captured in email (which now may be preserved
in a link tag).  Patch updates that come in after the patch is
committed get their own commit.  We've tried to move people away from
counting commits as an indicator of upstream eminence, but it's still a
fact of life that this is what matters to a lot of open source
community managers.  The tension we have is between liking a clean
commit in the tree as opposed to a sequence of commits tracking the
evolution of the patch and this community manager desire to track
patches.

So there are two embedded questions here: firstly, should we be as
wedded to clean history as we are, because showing the evolution would
simply solve this?  Secondly, if we are agreed on clean history, how
can we make engagement via email as important as engagement via commit
for the community managers so the Link tag is enough?  I've got to say
I think trying to add tags to recognize patch evolution is a mistake
and we instead investigate one of the two proposals above.

James

Re: [PATCH] scsi: ses: Fix crash caused by kfree an invalid pointer

2020-11-30 Thread James Bottomley

On Mon, 2020-11-30 at 10:26 +0800, Ding Hui wrote:
[...]
> sg_ses -e
> Diagnostic pages, followed by abbreviation(s) then page code:
>  Supported Diagnostic Pages  [sdp] [0x0]
>  Configuration (SES)  [cf] [0x1]
>  Enclosure Status/Control (SES)  [ec,es] [0x2]
>  Help Text (SES)  [ht] [0x3]
>   
> # sg_ses -p cf /dev/sdu
>DELL  MD32xxi   0784
>  disk device has EncServ bit set
> Configuration diagnostic page:
>number of secondary subenclosures: 0
>generation code: 0x12c
>enclosure descriptor list
>  Subenclosure identifier: 0 (primary)
>relative ES process id: 0, number of ES processes: 0
>number of type descriptor headers: 5
>enclosure logical identifier (hex): 
>enclosure vendor: DELL  product: MD32xxi   rev:
> 0784
>vendor-specific data:
>  30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
>  00 00 00 00
>type descriptor header/text list
>  Element type: Device slot, subenclosure id: 0
>number of possible elements: 12
>  Element type: Power supply, subenclosure id: 0
>number of possible elements: 2
>  Element type: Cooling, subenclosure id: 0
>number of possible elements: 4
>  Element type: Temperature sensor, subenclosure id: 0
>number of possible elements: 4
>  Element type: SCC controller electronics, subenclosure id: 0
>number of possible elements: 1
> 
> I'm not goot at ses, but it seems that ses does not get the right 
> component count

Actually there is a possible explanation.  Your kernel log has this in
the middle:

> 2020-11-30 09:31:41.360334 notice [kernel:] [425843.704480] sd
> 19:0:0:0: Embedded Enclosure Device
> 2020-11-30 09:31:41.360335 warning [kernel:] [425843.704732] sd 
> 19:0:0:0: Mode parameters changed

That "Mode parameters changed" implies that what the kernel read first
time around may not be the actual configuration of the enclosure.  In
particular, the generation code being 0x12c is also an indicator things
may have changed.My theory is when the kernel boots the device is
returning 0 for most of the possible elements, but it changes this at a
later stage.  One way to verify would be to compile ses as modular but
blacklist it so it can't be inserted, then do sg_ses -p to get the
enclosure and then insert the ses module to see if the two agree on the
components.  Unfortunately, even if that's successful, figuring out
what we have to do to the enclosure to get it to finish its internal
scanning may not be so easy.

James

Re: [PATCH] locks: remove trailing semicolon in macro definition

2020-11-29 Thread James Bottomley

On Sun, 2020-11-29 at 09:52 -0800, Randy Dunlap wrote:
> On 11/29/20 9:47 AM, Tom Rix wrote:
> > On 11/27/20 11:53 AM, Matthew Wilcox wrote:
> > > On Fri, Nov 27, 2020 at 11:07:07AM -0800, t...@redhat.com wrote:
> > > > +++ b/fs/fcntl.c
> > > > @@ -526,7 +526,7 @@ SYSCALL_DEFINE3(fcntl64, unsigned int, fd,
> > > > unsigned int, cmd,
> > > > (dst)->l_whence = (src)->l_whence;  \
> > > > (dst)->l_start = (src)->l_start;\
> > > > (dst)->l_len = (src)->l_len;\
> > > > -   (dst)->l_pid = (src)->l_pid;
> > > > +   (dst)->l_pid = (src)->l_pid
> > > This should be wrapped in a do { } while (0).
> > > 
> > > Look, this warning is clearly great at finding smelly code, but
> > > the
> > > fixes being generated to shut up the warning are low quality.
> > > 
> > Multiline macros not following the do {} while (0) pattern are
> > likely a larger problem.
> > 
> > I'll take a look.
> 
> Could it become a static inline function instead?
> or that might expand its scope too much?

I think nowadays we should always use static inlines for argument
checking unless we're capturing debug information like __FILE__ or
__LINE__ or something that a static inline can't.  Even in the latter
case the pattern should probably be single line #define that captures
the information and passes it to a static inline.

There was a time when we had problems with compiler expansion of static
inlines, so we shouldn't go back and churn the code base to change it
because there's thousands of these and possibly some old compiler used
for an obscure architecture that still needs the define.

James

Re: [PATCH] scsi: ses: Fix crash caused by kfree an invalid pointer

2020-11-28 Thread James Bottomley

On Sat, 2020-11-28 at 20:23 +0800, Ding Hui wrote:
> We can get a crash when disconnecting the iSCSI session,
> the call trace like this:
> 
>   [2a00fb70] kfree at 0830e224
>   [2a00fba0] ses_intf_remove at 01f200e4
>   [2a00fbd0] device_del at 086b6a98
>   [2a00fc50] device_unregister at 086b6d58
>   [2a00fc70] __scsi_remove_device at 0870608c
>   [2a00fca0] scsi_remove_device at 08706134
>   [2a00fcc0] __scsi_remove_target at 087062e4
>   [2a00fd10] scsi_remove_target at 087064c0
>   [2a00fd70] __iscsi_unbind_session at 01c872c4
>   [2a00fdb0] process_one_work at 0810f35c
>   [2a00fe00] worker_thread at 0810f648
>   [2a00fe70] kthread at 08116e98
> 
> In ses_intf_add, components count could be 0, and kcalloc 0 size
> scomp,
> but not saved in edev->component[i].scratch
> 
> In this situation, edev->component[0].scratch is an invalid pointer,
> when kfree it in ses_intf_remove_enclosure, a crash like above would
> happen
> The call trace also could be other random cases when kfree cannot
> catch
> the invalid pointer
> 
> We should not use edev->component[] array when the components count
> is 0
> We also need check index when use edev->component[] array in
> ses_enclosure_data_process
> 
> Tested-by: Zeng Zhicong 
> Cc: stable  # 2.6.25+
> Signed-off-by: Ding Hui 

This doesn't really look to be the right thing to do: an enclosure
which has no component can't usefully be controlled by the driver since
there's nothing for it to do, so what we should do in this situation is
refuse to attach like the proposed patch below.

It does seem a bit odd that someone would build an enclosure that
doesn't enclose anything, so would you mind running

sg_ses -e 

on it and reporting back what it shows?  It's possible there's another
type that the enclosure device should be tracking.

Regards,

James

---8>8>8><8<8<8
From: James Bottomley 
Subject: [PATCH] scsi: ses: don't attach if enclosure has no components

An enclosure with no components can't usefully be operated by the
driver (since effectively it has nothing to manage), so report the
problem and don't attach.  Not attaching also fixes an oops which
could occur if the driver tries to manage a zero component enclosure.

Reported-by: Ding Hui 
Cc: sta...@vger.kernel.org
Signed-off-by: James Bottomley 
---
 drivers/scsi/ses.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
index c2afba2a5414..9624298b9c89 100644
--- a/drivers/scsi/ses.c
+++ b/drivers/scsi/ses.c
@@ -690,6 +690,11 @@ static int ses_intf_add(struct device *cdev,
type_ptr[0] == ENCLOSURE_COMPONENT_ARRAY_DEVICE)
components += type_ptr[1];
}
+   if (components == 0) {
+   sdev_printk(KERN_ERR, sdev, "enclosure has no enumerated 
components\n");
+   goto err_free;
+   }
+
ses_dev->page1 = buf;
ses_dev->page1_len = len;
buf = NULL;
-- 
2.26.2

[GIT PULL] SCSI fixes for 5.10-rc5

2020-11-27 Thread James Bottomley

Three small fixes in the UFS driver: two are for power management
issues and the third is to fix a slew of problem in the sysfs code.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Can Guo (2):
  scsi: ufs: Make sure clk scaling happens only when HBA is runtime ACTIVE
  scsi: ufs: Fix unexpected values from ufshcd_read_desc_param()

Stanley Chu (1):
  scsi: ufs: Fix race between shutdown and runtime resume flow

And the diffstat:

 drivers/scsi/ufs/ufshcd.c | 37 +++--
 1 file changed, 23 insertions(+), 14 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 7a160b86adc6..0c148fcd24de 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1294,8 +1294,15 @@ static int ufshcd_devfreq_target(struct device *dev,
}
spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
 
+   pm_runtime_get_noresume(hba->dev);
+   if (!pm_runtime_active(hba->dev)) {
+   pm_runtime_put_noidle(hba->dev);
+   ret = -EAGAIN;
+   goto out;
+   }
start = ktime_get();
ret = ufshcd_devfreq_scale(hba, scale_up);
+   pm_runtime_put(hba->dev);
 
trace_ufshcd_profile_clk_scaling(dev_name(hba->dev),
(scale_up ? "up" : "down"),
@@ -3192,13 +3199,19 @@ int ufshcd_read_desc_param(struct ufs_hba *hba,
/* Get the length of descriptor */
ufshcd_map_desc_id_to_length(hba, desc_id, _len);
if (!buff_len) {
-   dev_err(hba->dev, "%s: Failed to get desc length", __func__);
+   dev_err(hba->dev, "%s: Failed to get desc length\n", __func__);
+   return -EINVAL;
+   }
+
+   if (param_offset >= buff_len) {
+   dev_err(hba->dev, "%s: Invalid offset 0x%x in descriptor IDN 
0x%x, length 0x%x\n",
+   __func__, param_offset, desc_id, buff_len);
return -EINVAL;
}
 
/* Check whether we need temp memory */
if (param_offset != 0 || param_size < buff_len) {
-   desc_buf = kmalloc(buff_len, GFP_KERNEL);
+   desc_buf = kzalloc(buff_len, GFP_KERNEL);
if (!desc_buf)
return -ENOMEM;
} else {
@@ -3212,14 +3225,14 @@ int ufshcd_read_desc_param(struct ufs_hba *hba,
desc_buf, _len);
 
if (ret) {
-   dev_err(hba->dev, "%s: Failed reading descriptor. desc_id %d, 
desc_index %d, param_offset %d, ret %d",
+   dev_err(hba->dev, "%s: Failed reading descriptor. desc_id %d, 
desc_index %d, param_offset %d, ret %d\n",
__func__, desc_id, desc_index, param_offset, ret);
goto out;
}
 
/* Sanity check */
if (desc_buf[QUERY_DESC_DESC_TYPE_OFFSET] != desc_id) {
-   dev_err(hba->dev, "%s: invalid desc_id %d in descriptor header",
+   dev_err(hba->dev, "%s: invalid desc_id %d in descriptor 
header\n",
__func__, desc_buf[QUERY_DESC_DESC_TYPE_OFFSET]);
ret = -EINVAL;
goto out;
@@ -3229,12 +3242,12 @@ int ufshcd_read_desc_param(struct ufs_hba *hba,
buff_len = desc_buf[QUERY_DESC_LENGTH_OFFSET];
ufshcd_update_desc_length(hba, desc_id, desc_index, buff_len);
 
-   /* Check wherher we will not copy more data, than available */
-   if (is_kmalloc && (param_offset + param_size) > buff_len)
-   param_size = buff_len - param_offset;
-
-   if (is_kmalloc)
+   if (is_kmalloc) {
+   /* Make sure we don't copy more data than available */
+   if (param_offset + param_size > buff_len)
+   param_size = buff_len - param_offset;
memcpy(param_read_buf, _buf[param_offset], param_size);
+   }
 out:
if (is_kmalloc)
kfree(desc_buf);
@@ -8900,11 +8913,7 @@ int ufshcd_shutdown(struct ufs_hba *hba)
if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
goto out;
 
-   if (pm_runtime_suspended(hba->dev)) {
-   ret = ufshcd_runtime_resume(hba);
-   if (ret)
-   goto out;
-   }
+   pm_runtime_get_sync(hba->dev);
 
ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
 out:

Re: [PATCH] [SCSI] sym53c8xx: remove trailing semicolon in macro definition

2020-11-27 Thread James Bottomley

On Fri, 2020-11-27 at 10:29 -0800, t...@redhat.com wrote:
> From: Tom Rix 
> 
> The macro use will already have a semicolon.
> 
> Signed-off-by: Tom Rix 
> ---
>  drivers/scsi/sym53c8xx_2/sym_glue.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/sym53c8xx_2/sym_glue.c
> b/drivers/scsi/sym53c8xx_2/sym_glue.c
> index d9a045f9858c..f3b3345c1766 100644
> --- a/drivers/scsi/sym53c8xx_2/sym_glue.c
> +++ b/drivers/scsi/sym53c8xx_2/sym_glue.c
> @@ -1001,12 +1001,12 @@ static int is_keyword(char *ptr, int len,
> char *verb)
>  #define SKIP_SPACES(ptr, len)
>   \
>   if ((arg_len = sym_skip_spaces(ptr, len)) < 1)  
> \
>   return -EINVAL; 
> \
> - ptr += arg_len; len -= arg_len;
> + ptr += arg_len; len -= arg_len
>  
>  #define GET_INT_ARG(ptr, len, v) 
> \
>   if (!(arg_len = get_int_arg(ptr, len, &(v   
> \
>   return -EINVAL; 
> \
> - ptr += arg_len; len -= arg_len;
> + ptr += arg_len; len -= arg_len

Those macros have the wrong form which can lead to actual bugs and this
cosmetic change does nothing to fix it.  If the goal here is to get the
code base into better shape, please fix the bugs.

James

Re: [Intel-wired-lan] [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-24 Thread James Bottomley

On Tue, 2020-11-24 at 13:32 -0800, Kees Cook wrote:
> On Mon, Nov 23, 2020 at 08:31:30AM -0800, James Bottomley wrote:
> > Really, no ... something which produces no improvement has no value
> > at all ... we really shouldn't be wasting maintainer time with it
> > because it has a cost to merge.  I'm not sure we understand where
> > the balance lies in value vs cost to merge but I am confident in
> > the zero value case.
> 
> What? We can't measure how many future bugs aren't introduced because
> the kernel requires explicit case flow-control statements for all new
> code.

No but we can measure how vulnerable our current coding habits are to
the mistake this warning would potentially prevent.  I don't think it's
wrong to extrapolate that if we had no instances at all of prior coding
problems we likely wouldn't have any in future either making adopting
the changes needed to enable the warning valueless ... that's the zero
value case I was referring to above.

Now, what we have seems to be about 6 cases (at least what's been shown
in this thread) where a missing break would cause potentially user
visible issues.  That means the value of this isn't zero, but it's not
a no-brainer massive win either.  That's why I think asking what we've
invested vs the return isn't a useless exercise.

> We already enable -Wimplicit-fallthrough globally, so that's not the
> discussion. The issue is that Clang is (correctly) even more strict
> than GCC for this, so these are the remaining ones to fix for full
> Clang coverage too.
> 
> People have spent more time debating this already than it would have
> taken to apply the patches. :)

You mean we've already spent 90% of the effort to come this far so we
might as well go the remaining 10% because then at least we get some
return? It's certainly a clinching argument in defence procurement ...

> This is about robustness and language wrangling. It's a big code-
> base, and this is the price of our managing technical debt for
> permanent robustness improvements. (The numbers I ran from Gustavo's
> earlier patches were that about 10% of the places adjusted were
> identified as legitimate bugs being fixed. This final series may be
> lower, but there are still bugs being found from it -- we need to
> finish this and shut the door on it for good.)

I got my six patches by analyzing the lwn.net report of the fixes that
was cited which had 21 of which 50% didn't actually change the emitted
code, and 25% didn't have a user visible effect.

But the broader point I'm making is just because the compiler people
come up with a shiny new warning doesn't necessarily mean the problem
it's detecting is one that causes us actual problems in the code base. 
I'd really be happier if we had a theory about what classes of CVE or
bug we could eliminate before we embrace the next new warning.

James

Re: [PATCH] tpm_tis: Disable interrupts on ThinkPad T490s

2020-11-24 Thread James Bottomley

On Tue, 2020-11-24 at 10:52 -0700, Jerry Snitselaar wrote:
> Before diving further into that though, does anyone else have an
> opinion on ripping out the irq code, and just using polling? We've
> been only polling since 2015 anyways.

Well only a biased one, obviously: polling causes large amounts of busy
waiting, which is a waste of CPU resources and does increase the time
it takes us to do TPM operations ... not a concern if you're doing long
computation ones, like signatures, but it is a problem for short
operations like bulk updates of PCRs.  The other potential issue, as we
saw with atmel is that if you prod the chip too often (which you have
to do with polling) you risk upsetting it.  We've spent ages trying to
tune the polling parameters to balance reduction of busy wait with chip
upset and still, apparently, not quite got it right.  If the TPM has a
functioning IRQ then it gets us out of the whole polling mess entirely.
The big question is how many chips that report an IRQ actually have a
malfunctioning one?

James

Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-23 Thread James Bottomley

On Mon, 2020-11-23 at 19:56 +0100, Miguel Ojeda wrote:
> On Mon, Nov 23, 2020 at 4:58 PM James Bottomley
>  wrote:
> > Well, I used git.  It says that as of today in Linus' tree we have
> > 889 patches related to fall throughs and the first series went in
> > in october 2017 ... ignoring a couple of outliers back to February.
> 
> I can see ~10k insertions over ~1k commits and 15 years that mention
> a fallthrough in the entire repo. That is including some commits
> (like the biggest one, 960 insertions) that have nothing to do with C
> fallthrough. A single kernel release has an order of magnitude more
> changes than this...
> 
> But if we do the math, for an author, at even 1 minute per line
> change and assuming nothing can be automated at all, it would take 1
> month of work. For maintainers, a couple of trivial lines is noise
> compared to many other patches.

So you think a one line patch should take one minute to produce ... I
really don't think that's grounded in reality.  I suppose a one line
patch only takes a minute to merge with b4 if no-one reviews or tests
it, but that's not really desirable.

> In fact, this discussion probably took more time than the time it
> would take to review the 200 lines. :-)

I'm framing the discussion in terms of the whole series of changes we
have done for fall through, both what's in the tree currently (889
patches) both in terms of the produce and the consumer.  That's what I
used for my figures for cost.

> > We're also complaining about the inability to recruit maintainers:
> > 
> > https://www.theregister.com/2020/06/30/hard_to_find_linux_maintainers_says_torvalds/
> > 
> > And burn out:
> > 
> > http://antirez.com/news/129
> 
> Accepting trivial and useful 1-line patches

Part of what I'm trying to measure is the "and useful" bit because
that's not a given.

> is not what makes a voluntary maintainer quit...

so the proverb "straw which broke the camel's back" uniquely doesn't
apply to maintainers

>  Thankless work with demanding deadlines is.

That's another potential reason, but it doesn't may other reasons less
valid.

> > The whole crux of your argument seems to be maintainers' time isn't
> > important so we should accept all trivial patches
> 
> I have not said that, at all. In fact, I am a voluntary one and I
> welcome patches like this. It takes very little effort on my side to
> review and it helps the kernel overall.

Well, you know, subsystems are very different in terms of the amount of
patches a maintainer has to process per release cycle of the kernel. 
If a maintainer is close to capacity, additional patches, however
trivial, become a problem.  If a maintainer has spare cycles, trivial
patches may look easy.

> Paid maintainers are the ones that can take care of big
> features/reviews.
> 
> > What I'm actually trying to articulate is a way of measuring value
> > of the patch vs cost ... it has nothing really to do with who foots
> > the actual bill.
> 
> I understand your point, but you were the one putting it in terms of
> a junior FTE.

No, I evaluated the producer side in terms of an FTE.  What we're
mostly arguing about here is the consumer side: the maintainers and
people who have to rework their patch sets. I estimated that at 100h.

>  In my view, 1 month-work (worst case) is very much worth
> removing a class of errors from a critical codebase.
> 
> > One thesis I'm actually starting to formulate is that this
> > continual devaluing of maintainers is why we have so much
> > difficulty keeping and recruiting them.
> 
> That may very well be true, but I don't feel anybody has devalued
> maintainers in this discussion.

You seem to be saying that because you find it easy to merge trivial
patches, everyone should.  I'm reminded of a friend long ago who
thought being a Tees River Pilot was a sinecure because he could
navigate the Tees blindfold.  What he forgot, of course, is that just
because it's easy with a trawler doesn't mean it's easy with an oil
tanker.  In fact it takes longer to qualify as a Tees River Pilot than
it does to get a PhD.

James

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 4683 matches

Mail list logo