Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook

2019-06-10 Thread Benjamin Herrenschmidt
On Mon, 2019-06-10 at 13:44 -0500, Larry Finger wrote:
> On 6/7/19 11:21 PM, Benjamin Herrenschmidt wrote:
> > 
> > > Please try the attached patch. I'm not really pleased with it and I will
> > > continue to determine why the fallback to a 30-bit mask fails, but at
> > > least this one works for me.
> > 
> > Your patch only makes sense if the device is indeed capable of
> > addressing 31-bits.
> > 
> > So either the driver is buggy and asks for a too small mask in which
> > case your patch is ok, or it's not and you're just going to cause all
> > sort of interesting random problems including possible memory
> > corruption.
> 
> Of course the driver may be buggy, but it asks for the correct mask.
> 
> This particular device is not capable of handling 32-bit DMA. The driver
> detects the 32-bit failure and falls back to 30 bits. It works on x86, and
> did on PPC32 until 5.1. As Christoph said, it should always be possible to
> use fewer bits than the maximum.

No, I don't think it *worked* on ppc32 before Christoph's patch. I think
it "mostly sort-of worked" :-)

The reason I'm saying that is if your system has more than 1GB of RAM,
then you'll have chunks of memory that the device simply cannot
address.

Before Christoph's patches, we had no ZONE_DMA or ZONE_DMA32 covering the
30-bit limited space, so any memory allocation could in theory land
above 30 bits, causing all sorts of horrible things to happen with that
driver.

The reason I think it sort-of-mostly-worked is that to get more than
1GB of RAM, those machines use CONFIG_HIGHMEM. And *most* network
buffers aren't allocated in Highmem so you got lucky.

That said, there is such a thing as no-copy send on the network, so I
wouldn't be surprised if some things still failed, just not
frequently enough for you to notice.
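
To make the 1GB boundary concrete, here is a minimal userspace sketch (not
driver code). It assumes the mask is built with the same arithmetic as the
kernel's DMA_BIT_MASK(); the helper names dma_bit_mask() and dma_range_ok()
are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build an n-bit mask, same arithmetic as the kernel's DMA_BIT_MASK(n). */
static uint64_t dma_bit_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/*
 * Hypothetical helper: true if the whole buffer [addr, addr + size)
 * can be reached by a device limited to the given number of address bits.
 */
static bool dma_range_ok(uint64_t addr, uint64_t size, unsigned int bits)
{
	assert(size > 0);
	return addr + size - 1 <= dma_bit_mask(bits);
}
```

With a 30-bit mask the last reachable byte is 0x3fffffff, so a buffer that
starts at the 1GB mark (0x40000000) is out of reach, while the same buffer
is fine for a 32-bit capable device.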

> Similar devices that are new enough to use b43 rather than b43legacy work
> with new kernels; however, they have and use 32-bit DMA.

Cheers,
Ben.




Re: [RFC PATCH 2/2] imx: mailbox: Introduce TX doorbell with ACK

2019-06-10 Thread Oleksij Rempel
Hi Daniel,

On Mon, Jun 10, 2019 at 10:16:09PM +0800, daniel.bal...@nxp.com wrote:
> From: Daniel Baluta 
> 
> TX doorbell with ACK will allow us to push the doorbell ring button
> (trigger GIR) and also will allow us to handle the response from DSP.
> 
> DSP firmware found on i.MX8 boards implements a duplex
> communication protocol over MU channels.
> 
> On the host side (Linux) we need to plug into the Sound Open Firmware IPC
> communication infrastructure which handles all the details (e.g. message
> queuing, tx/rx logic) [1] and the users are only required to provide the
> following callbacks:
> 
>   - send_msg (for Tx)
>   - irq_handler (Ack of Tx, request from DSP)
> 
> In order to implement send_msg and irq_handler we will use two MU
> channels:
>   * channel #0, TX doorbell with ACK
>   * channel #1, RX doorbell
> 
> Sending a request Host -> DSP (channel #0)
>   - send_msg callback
>     - write data into SHMEM
>     - push doorbell ring button (trigger GIR)
>   - irq handler
>     - handle DSP request (channel #1)
>       - read SHMEM and trigger SOF IPC state machine
>       - send ACK (push doorbell ring button for channel #1)
>     - handle DSP response (ACK) (channel #0)
>       - read SHMEM and trigger IPC state machine
> 
> The easiest way to implement this is to directly access the MU
> registers but since the MU is abstracted using the mailbox interface
> we need to use that instead.
> 
> [1] https://elixir.bootlin.com/linux/v5.2-rc4/source/sound/soc/sof/ipc.c
> 
> Signed-off-by: Daniel Baluta 
> ---
>  drivers/mailbox/imx-mailbox.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/mailbox/imx-mailbox.c b/drivers/mailbox/imx-mailbox.c
> index 9f74dee1a58c..3a91611e17d2 100644
> --- a/drivers/mailbox/imx-mailbox.c
> +++ b/drivers/mailbox/imx-mailbox.c
> @@ -42,6 +42,7 @@ enum imx_mu_chan_type {
>   IMX_MU_TYPE_RX, /* Rx */
>   IMX_MU_TYPE_TXDB,   /* Tx doorbell */
>   IMX_MU_TYPE_RXDB,   /* Rx doorbell */
> + IMX_MU_TYPE_TXDB_ACK/* Tx doorbell with Ack */
>  };
>  
>  struct imx_mu_con_priv {
> @@ -124,6 +125,7 @@ static irqreturn_t imx_mu_isr(int irq, void *p)
>   (ctrl & IMX_MU_xCR_RIEn(cp->idx));
>   break;
>   case IMX_MU_TYPE_RXDB:
> + case IMX_MU_TYPE_TXDB_ACK:
>   val &= IMX_MU_xSR_GIPn(cp->idx) &
>   (ctrl & IMX_MU_xCR_GIEn(cp->idx));
>   break;
> @@ -200,6 +202,7 @@ static int imx_mu_startup(struct mbox_chan *chan)
>   imx_mu_xcr_rmw(priv, IMX_MU_xCR_RIEn(cp->idx), 0);
>   break;
>   case IMX_MU_TYPE_RXDB:
> + case IMX_MU_TYPE_TXDB_ACK:
>   imx_mu_xcr_rmw(priv, IMX_MU_xCR_GIEn(cp->idx), 0);
>   break;
>   default:
> -- 
> 2.17.1

If I see it correctly, with your implementation the mbox client
communication on channel 0 will look as follows:
mbox_client -> send_msg()
/* scheduling of mbox_chan_txdone tasklet is avoided */
mbox_client <- cl->rx_callback()
mbox_client -> mbox_client_txdone()
mbox_client -> send_msg()

Without your patch you will need to register tx and rx doorbell
channels and the communication will look like this:
mbox_client -> send_msg()
mbox_client <- mbox_chan_txdone() /* dummy notification, can be ignored */
mbox_client <- cl->rx_callback()
mbox_client -> send_msg()

I assume you are trying to optimize it and avoid the dummy
mbox_chan_txdone() notification. Correct?

The problem is that the current mailbox framework sets txdone_method
inside mbox_controller_register() for all channels, even if
imx-mailbox has different types of channels.

The problem with your patch is that it silently merges two channels
(TXDB and RXDB) without the controller signalling the actual ACK via
mbox_chan_txdone(). I'm not sure why we need to merge them in this case.

So, with the current imx-mailbox implementation your firmware should work as
is. You will need to register two separate channels for TXDB and
RXDB. It will run with some overhead from triggering the txdone tasklet in
the imx-mailbox driver.

If this overhead is a problem, then it should be fixed.
Merging two doorbell channels into one with ACK support is nice,
but it will introduce more issues if we need other doorbell channels
without ACK support on the same controller.

I personally would prefer to extend the mailbox framework to support
controllers with mixed channel types and remove the dummy txdone tasklet
from imx-mailbox.

Since we already initialize part of >chans[i] in the imx-mailbox driver,
we can set the proper chan->txdone_method as well. So we only need to
prevent mbox_controller_register() from overwriting it.
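
A rough userspace model of that idea, only to show the intended behaviour.
All toy_* names are invented for this sketch and are not the mailbox
framework API; the irq/poll/ack fallback order is my assumption of how the
controller-wide default is picked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the per-channel completion methods. */
enum toy_txdone {
	TOY_TXDONE_UNSET = 0,	/* driver expressed no preference */
	TOY_TXDONE_BY_IRQ,
	TOY_TXDONE_BY_POLL,
	TOY_TXDONE_BY_ACK,
};

struct toy_chan {
	enum toy_txdone txdone_method;
};

struct toy_controller {
	struct toy_chan *chans;
	size_t num_chans;
	bool txdone_irq;
	bool txdone_poll;
};

/* Controller-wide default (assumed order: irq, then poll, then ack). */
static enum toy_txdone toy_default_method(const struct toy_controller *c)
{
	if (c->txdone_irq)
		return TOY_TXDONE_BY_IRQ;
	if (c->txdone_poll)
		return TOY_TXDONE_BY_POLL;
	return TOY_TXDONE_BY_ACK;
}

/* Proposed behaviour: keep whatever the driver already set per channel. */
static void toy_controller_register(struct toy_controller *c)
{
	assert(c && c->chans);
	for (size_t i = 0; i < c->num_chans; i++) {
		if (c->chans[i].txdone_method == TOY_TXDONE_UNSET)
			c->chans[i].txdone_method = toy_default_method(c);
	}
}
```

A driver with mixed channel types would pre-set the method on the doorbell
channels that need it and leave the rest unset, so they still pick up the
controller default at registration time.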

Regards,
Oleksij.

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   

Re: [PATCH 1/1] irqchip/gic: Add support for Amazon Graviton variant of GICv3+GICv2m

2019-06-10 Thread Benjamin Herrenschmidt
On Mon, 2019-06-10 at 09:20 +0100, Marc Zyngier wrote:
> Hi Zeev,
> 
> On 07/06/2019 00:17, Zeev Zilberman wrote:
> > The patch adds support for Amazon Graviton custom variant of GICv2m, where
> > hw irq is encoded using the MSI message address, as opposed to standard
> > GICv2m, where hw irq is encoded in the MSI message data.
> > In addition, the Graviton flavor of GICv2m is used along GICv3 (and not
> > GICv2).
> > 
> > Signed-off-by: Zeev Zilberman 
> > Signed-off-by: Benjamin Herrenschmidt 
> 
> There seems to be some confusion about who is the author of this patch.
> As you're the one posting the patch, your SoB tag should be the last
> one. And assuming the patch has been developed together with Ben, it
> should read:
> 
> Co-developed-by: Benjamin Herrenschmidt 
> Signed-off-by: Benjamin Herrenschmidt 
> Signed-off-by: Zeev Zilberman 

It was his patch originally. I shuffled a few things around to make it
less intrusive, then Zeev picked it back up and addressed your previous
comments. I'm happy for him to take full ownership.
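
For anyone skimming the quoted patch, the difference between the two
encodings can be sketched in plain userspace C as below. This is only an
illustration: the toy_* names are invented, the 0x040 doorbell offset is an
assumed value for V2M_MSI_SETSPI_NS, and the GICV2M_NEEDS_SPI_OFFSET
adjustment is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_V2M_MSI_SETSPI_NS 0x040	/* doorbell register offset (assumed) */

struct toy_msi_msg {
	uint32_t address_hi;
	uint32_t address_lo;
	uint32_t data;
};

/*
 * Standard GICv2m: fixed doorbell address, hwirq carried in the data.
 * Address-only flavour: hwirq folded into the address, data left at 0.
 */
static struct toy_msi_msg toy_compose_msi(uint64_t base, uint32_t hwirq,
					  bool address_only)
{
	struct toy_msi_msg msg;
	uint64_t addr;

	assert(hwirq >= 32);	/* SPIs start at interrupt ID 32 */

	if (address_only)
		addr = base | ((uint64_t)(hwirq - 32) << 3);
	else
		addr = base + TOY_V2M_MSI_SETSPI_NS;

	msg.address_hi = (uint32_t)(addr >> 32);
	msg.address_lo = (uint32_t)addr;
	msg.data = address_only ? 0 : hwirq;
	return msg;
}
```

Since the address now differs per interrupt, anything that prepares the
doorbell address ahead of time needs the per-hwirq value, which is why the
patch routes both users through gicv2m_get_msi_addr().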

> > ---
> > diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
> > index 3c77ab6..eeed19f 100644
> > --- a/drivers/irqchip/irq-gic-v2m.c
> > +++ b/drivers/irqchip/irq-gic-v2m.c
> > @@ -56,6 +56,7 @@
> >  
> >  /* List of flags for specific v2m implementation */
> >  #define GICV2M_NEEDS_SPI_OFFSET0x0001
> > +#define GICV2M_GRAVITON_ADDRESS_ONLY   0x0002
> >  
> >  static LIST_HEAD(v2m_nodes);
> >  static DEFINE_SPINLOCK(v2m_lock);
> > @@ -98,15 +99,26 @@ static struct msi_domain_info gicv2m_msi_domain_info = {
> > .chip   = _msi_irq_chip,
> >  };
> >  
> > +static phys_addr_t gicv2m_get_msi_addr(struct v2m_data *v2m, int hwirq)
> > +{
> > +   if (v2m->flags & GICV2M_GRAVITON_ADDRESS_ONLY)
> > +   return v2m->res.start | ((hwirq - 32) << 3);
> > +   else
> > +   return v2m->res.start + V2M_MSI_SETSPI_NS;
> > +}
> > +
> >  static void gicv2m_compose_msi_msg(struct irq_data *data, struct msi_msg 
> > *msg)
> >  {
> > struct v2m_data *v2m = irq_data_get_irq_chip_data(data);
> > -   phys_addr_t addr = v2m->res.start + V2M_MSI_SETSPI_NS;
> > +   phys_addr_t addr = gicv2m_get_msi_addr(v2m, data->hwirq);
> >  
> > msg->address_hi = upper_32_bits(addr);
> > msg->address_lo = lower_32_bits(addr);
> > -   msg->data = data->hwirq;
> >  
> > +   if (v2m->flags & GICV2M_GRAVITON_ADDRESS_ONLY)
> > +   msg->data = 0;
> > +   else
> > +   msg->data = data->hwirq;
> > if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
> > msg->data -= v2m->spi_offset;
> >  
> > @@ -188,7 +200,7 @@ static int gicv2m_irq_domain_alloc(struct irq_domain 
> > *domain, unsigned int virq,
> > hwirq = v2m->spi_start + offset;
> >  
> > err = iommu_dma_prepare_msi(info->desc,
> > -   v2m->res.start + V2M_MSI_SETSPI_NS);
> > +   gicv2m_get_msi_addr(v2m, hwirq));
> > if (err)
> > return err;
> >  
> > @@ -307,7 +319,7 @@ static int gicv2m_allocate_domains(struct irq_domain 
> > *parent)
> >  
> >  static int __init gicv2m_init_one(struct fwnode_handle *fwnode,
> >   u32 spi_start, u32 nr_spis,
> > - struct resource *res)
> > + struct resource *res, u32 flags)
> >  {
> > int ret;
> > struct v2m_data *v2m;
> > @@ -320,6 +332,7 @@ static int __init gicv2m_init_one(struct fwnode_handle 
> > *fwnode,
> >  
> > INIT_LIST_HEAD(>entry);
> > v2m->fwnode = fwnode;
> > +   v2m->flags = flags;
> >  
> > memcpy(>res, res, sizeof(struct resource));
> >  
> > @@ -334,7 +347,14 @@ static int __init gicv2m_init_one(struct fwnode_handle 
> > *fwnode,
> > v2m->spi_start = spi_start;
> > v2m->nr_spis = nr_spis;
> > } else {
> > -   u32 typer = readl_relaxed(v2m->base + V2M_MSI_TYPER);
> > +   u32 typer;
> > +
> > +   /* Graviton should always have explicit spi_start/nr_spis */
> > +   if (v2m->flags & GICV2M_GRAVITON_ADDRESS_ONLY) {
> > +   ret = -EINVAL;
> > +   goto err_iounmap;
> > +   }
> > +   typer = readl_relaxed(v2m->base + V2M_MSI_TYPER);
> >  
> > v2m->spi_start = V2M_MSI_TYPER_BASE_SPI(typer);
> > v2m->nr_spis = V2M_MSI_TYPER_NUM_SPI(typer);
> > @@ -355,18 +375,21 @@ static int __init gicv2m_init_one(struct 
> > fwnode_handle *fwnode,
> >  *
> >  * Broadom NS2 GICv2m implementation has an erratum where the MSI data
> >  * is 'spi_number - 32'
> > +*
> > +* Reading that register fails on the Graviton implementation
> >  */
> > -   switch (readl_relaxed(v2m->base + V2M_MSI_IIDR)) {
> > -   case XGENE_GICV2M_MSI_IIDR:
> > -   v2m->flags |= GICV2M_NEEDS_SPI_OFFSET;
> > -   v2m->spi_offset = v2m->spi_start;
> > -   break;
> > -   case BCM_NS2_GICV2M_MSI_IIDR:
> > -   

[PATCH 3/8] habanalabs: initialize MMU context for driver

2019-06-10 Thread Oded Gabbay
This patch initializes the MMU structures for the kernel context. This is
needed before we can configure mappings for the kernel context.

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/context.c |  7 +++
 drivers/misc/habanalabs/mmu.c | 10 ++
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/misc/habanalabs/context.c b/drivers/misc/habanalabs/context.c
index 280f4625e313..8682590e3f6e 100644
--- a/drivers/misc/habanalabs/context.c
+++ b/drivers/misc/habanalabs/context.c
@@ -36,6 +36,8 @@ static void hl_ctx_fini(struct hl_ctx *ctx)
 
hl_vm_ctx_fini(ctx);
hl_asid_free(hdev, ctx->asid);
+   } else {
+   hl_mmu_ctx_fini(ctx);
}
 }
 
@@ -119,6 +121,11 @@ int hl_ctx_init(struct hl_device *hdev, struct hl_ctx *ctx, bool is_kernel_ctx)
 
if (is_kernel_ctx) {
ctx->asid = HL_KERNEL_ASID_ID; /* KMD gets ASID 0 */
+   rc = hl_mmu_ctx_init(ctx);
+   if (rc) {
+   dev_err(hdev->dev, "Failed to init mmu ctx module\n");
+   goto mem_ctx_err;
+   }
} else {
ctx->asid = hl_asid_alloc(hdev);
if (!ctx->asid) {
diff --git a/drivers/misc/habanalabs/mmu.c b/drivers/misc/habanalabs/mmu.c
index 87968f32e718..a80162c5c373 100644
--- a/drivers/misc/habanalabs/mmu.c
+++ b/drivers/misc/habanalabs/mmu.c
@@ -241,8 +241,9 @@ static int dram_default_mapping_init(struct hl_ctx *ctx)
hop2_pte_addr, hop3_pte_addr, pte_val;
int rc, i, j, hop3_allocated = 0;
 
-   if (!hdev->dram_supports_virtual_memory ||
-   !hdev->dram_default_page_mapping)
+   if ((!hdev->dram_supports_virtual_memory) ||
+   (!hdev->dram_default_page_mapping) ||
+   (ctx->asid == HL_KERNEL_ASID_ID))
return 0;
 
num_of_hop3 = prop->dram_size_for_default_page_mapping;
@@ -340,8 +341,9 @@ static void dram_default_mapping_fini(struct hl_ctx *ctx)
hop2_pte_addr, hop3_pte_addr;
int i, j;
 
-   if (!hdev->dram_supports_virtual_memory ||
-   !hdev->dram_default_page_mapping)
+   if ((!hdev->dram_supports_virtual_memory) ||
+   (!hdev->dram_default_page_mapping) ||
+   (ctx->asid == HL_KERNEL_ASID_ID))
return;
 
num_of_hop3 = prop->dram_size_for_default_page_mapping;
-- 
2.17.1



[PATCH 2/8] habanalabs: de-couple MMU and VM module initialization

2019-06-10 Thread Oded Gabbay
This patch initializes the MMU S/W structures before the VM S/W
structures, instead of doing that as part of the VM S/W initialization.

This is done because we need to configure some MMU mappings for the kernel
context, before the VM is initialized. The VM initialization can't be
moved earlier because it depends on the size of the DRAM, which is
retrieved from the device CPU. Communication with the device CPU will
require the MMU mappings to be configured and hence the de-coupling.
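
The ordering constraint and the matching error unwind can be sketched in
userspace as follows. This is a toy model under my own assumptions, not the
driver code: the toy_* names and the three stages are placeholders for the
event queue, MMU S/W and kernel context steps touched by this patch.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_stage { TOY_EQ, TOY_MMU, TOY_KERNEL_CTX, TOY_NR_STAGES };

struct toy_device {
	bool up[TOY_NR_STAGES];
	enum toy_stage fail_at;	/* TOY_NR_STAGES means "never fail" */
};

static int toy_stage_init(struct toy_device *d, enum toy_stage s)
{
	if (d->fail_at == s)
		return -1;
	/* every stage requires the previous one to be up already */
	assert(s == TOY_EQ || d->up[s - 1]);
	d->up[s] = true;
	return 0;
}

static void toy_stage_fini(struct toy_device *d, enum toy_stage s)
{
	/* teardown runs in reverse: the later stage must already be down */
	assert(s == TOY_NR_STAGES - 1 || !d->up[s + 1]);
	d->up[s] = false;
}

/* MMU S/W comes up before the kernel context; unwind mirrors the order. */
static int toy_device_init(struct toy_device *d)
{
	int rc;

	rc = toy_stage_init(d, TOY_EQ);
	if (rc)
		return rc;

	rc = toy_stage_init(d, TOY_MMU);
	if (rc)
		goto eq_fini;

	rc = toy_stage_init(d, TOY_KERNEL_CTX);
	if (rc)
		goto mmu_fini;

	return 0;

mmu_fini:
	toy_stage_fini(d, TOY_MMU);
eq_fini:
	toy_stage_fini(d, TOY_EQ);
	return rc;
}
```

If the kernel context step fails, the MMU stage is torn down before the
event queue, which is the same label order the patch adds to
hl_device_init().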

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/device.c | 23 ---
 drivers/misc/habanalabs/memory.c | 13 +
 drivers/misc/habanalabs/mmu.c|  6 +-
 3 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index 4df8ef88ce2d..0c4894dd9c02 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -745,6 +745,7 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset,
 
if (hard_reset) {
hl_vm_fini(hdev);
+   hl_mmu_fini(hdev);
hl_eq_reset(hdev, >event_queue);
}
 
@@ -772,6 +773,13 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset,
goto out_err;
}
 
+   rc = hl_mmu_init(hdev);
+   if (rc) {
+   dev_err(hdev->dev,
+   "Failed to initialize MMU S/W after hard reset\n");
+   goto out_err;
+   }
+
/* Allocate the kernel context */
hdev->kernel_ctx = kzalloc(sizeof(*hdev->kernel_ctx),
GFP_KERNEL);
@@ -943,11 +951,18 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass)
goto cq_fini;
}
 
+   /* MMU S/W must be initialized before kernel context is created */
+   rc = hl_mmu_init(hdev);
+   if (rc) {
+   dev_err(hdev->dev, "Failed to initialize MMU S/W structures\n");
+   goto eq_fini;
+   }
+
/* Allocate the kernel context */
hdev->kernel_ctx = kzalloc(sizeof(*hdev->kernel_ctx), GFP_KERNEL);
if (!hdev->kernel_ctx) {
rc = -ENOMEM;
-   goto eq_fini;
+   goto mmu_fini;
}
 
hdev->user_ctx = NULL;
@@ -995,8 +1010,6 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass)
goto out_disabled;
}
 
-   /* After test_queues, KMD can start sending messages to device CPU */
-
rc = device_late_init(hdev);
if (rc) {
dev_err(hdev->dev, "Failed late initialization\n");
@@ -1042,6 +1055,8 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass)
"kernel ctx is still alive on initialization failure\n");
 free_ctx:
kfree(hdev->kernel_ctx);
+mmu_fini:
+   hl_mmu_fini(hdev);
 eq_fini:
hl_eq_fini(hdev, >event_queue);
 cq_fini:
@@ -1146,6 +1161,8 @@ void hl_device_fini(struct hl_device *hdev)
 
hl_vm_fini(hdev);
 
+   hl_mmu_fini(hdev);
+
hl_eq_fini(hdev, >event_queue);
 
for (i = 0 ; i < hdev->asic_prop.completion_queues_count ; i++)
diff --git a/drivers/misc/habanalabs/memory.c b/drivers/misc/habanalabs/memory.c
index 693877e37fd8..42d237cae1dc 100644
--- a/drivers/misc/habanalabs/memory.c
+++ b/drivers/misc/habanalabs/memory.c
@@ -1657,17 +1657,10 @@ int hl_vm_init(struct hl_device *hdev)
struct hl_vm *vm = >vm;
int rc;
 
-   rc = hl_mmu_init(hdev);
-   if (rc) {
-   dev_err(hdev->dev, "Failed to init MMU\n");
-   return rc;
-   }
-
vm->dram_pg_pool = gen_pool_create(__ffs(prop->dram_page_size), -1);
if (!vm->dram_pg_pool) {
dev_err(hdev->dev, "Failed to create dram page pool\n");
-   rc = -ENOMEM;
-   goto pool_create_err;
+   return -ENOMEM;
}
 
kref_init(>dram_pg_pool_refcount);
@@ -1693,8 +1686,6 @@ int hl_vm_init(struct hl_device *hdev)
 
 pool_add_err:
gen_pool_destroy(vm->dram_pg_pool);
-pool_create_err:
-   hl_mmu_fini(hdev);
 
return rc;
 }
@@ -1724,7 +1715,5 @@ void hl_vm_fini(struct hl_device *hdev)
dev_warn(hdev->dev, "dram_pg_pool was not destroyed on %s\n",
__func__);
 
-   hl_mmu_fini(hdev);
-
vm->init_done = false;
 }
diff --git a/drivers/misc/habanalabs/mmu.c b/drivers/misc/habanalabs/mmu.c
index 10aee3141444..87968f32e718 100644
--- a/drivers/misc/habanalabs/mmu.c
+++ b/drivers/misc/habanalabs/mmu.c
@@ -385,12 +385,8 @@ static void dram_default_mapping_fini(struct hl_ctx *ctx)
  * @hdev: habanalabs device structure.
  *
  * This function does the following:
- * - Allocate max_asid zeroed hop0 pgts so no mapping is available.
- * - Enable MMU in H/W.
- * - Invalidate 

[PATCH 1/8] habanalabs: initialize device CPU queues after MMU init

2019-06-10 Thread Oded Gabbay
This patch changes the order of H/W IP initializations. The MMU needs to
be initialized before the device CPU queues, because the CPU will go
through the ASIC MMU in order to reach the host memory (where the queues
are located).

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/asid.c  |  2 +-
 drivers/misc/habanalabs/device.c| 22 +-
 drivers/misc/habanalabs/goya/goya.c | 64 +
 3 files changed, 40 insertions(+), 48 deletions(-)

diff --git a/drivers/misc/habanalabs/asid.c b/drivers/misc/habanalabs/asid.c
index f54e7971a762..2c01461701a3 100644
--- a/drivers/misc/habanalabs/asid.c
+++ b/drivers/misc/habanalabs/asid.c
@@ -18,7 +18,7 @@ int hl_asid_init(struct hl_device *hdev)
 
mutex_init(>asid_mutex);
 
-   /* ASID 0 is reserved for KMD */
+   /* ASID 0 is reserved for KMD and device CPU */
set_bit(0, hdev->asid_bitmap);
 
return 0;
diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index cca4af29daf7..4df8ef88ce2d 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -326,7 +326,15 @@ static int device_late_init(struct hl_device *hdev)
 {
int rc;
 
-   INIT_DELAYED_WORK(>work_freq, set_freq_to_low_job);
+   if (hdev->asic_funcs->late_init) {
+   rc = hdev->asic_funcs->late_init(hdev);
+   if (rc) {
+   dev_err(hdev->dev,
+   "failed late initialization for the H/W\n");
+   return rc;
+   }
+   }
+
hdev->high_pll = hdev->asic_prop.high_pll;
 
/* force setting to low frequency */
@@ -337,17 +345,9 @@ static int device_late_init(struct hl_device *hdev)
else
hdev->asic_funcs->set_pll_profile(hdev, PLL_LAST);
 
-   if (hdev->asic_funcs->late_init) {
-   rc = hdev->asic_funcs->late_init(hdev);
-   if (rc) {
-   dev_err(hdev->dev,
-   "failed late initialization for the H/W\n");
-   return rc;
-   }
-   }
-
+   INIT_DELAYED_WORK(>work_freq, set_freq_to_low_job);
schedule_delayed_work(>work_freq,
-   usecs_to_jiffies(HL_PLL_LOW_JOB_FREQ_USEC));
+   usecs_to_jiffies(HL_PLL_LOW_JOB_FREQ_USEC));
 
if (hdev->heartbeat) {
INIT_DELAYED_WORK(>work_heartbeat, hl_device_heartbeat);
diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index 81c1d576783f..106074466dca 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -539,9 +539,32 @@ int goya_late_init(struct hl_device *hdev)
struct asic_fixed_properties *prop = >asic_prop;
int rc;
 
+   goya_fetch_psoc_frequency(hdev);
+
+   rc = goya_mmu_clear_pgt_range(hdev);
+   if (rc) {
+   dev_err(hdev->dev,
+   "Failed to clear MMU page tables range %d\n", rc);
+   return rc;
+   }
+
+   rc = goya_mmu_set_dram_default_page(hdev);
+   if (rc) {
+   dev_err(hdev->dev, "Failed to set DRAM default page %d\n", rc);
+   return rc;
+   }
+
+   rc = goya_init_cpu_queues(hdev);
+   if (rc)
+   return rc;
+
+   rc = goya_test_cpu_queue(hdev);
+   if (rc)
+   return rc;
+
rc = goya_armcp_info_get(hdev);
if (rc) {
-   dev_err(hdev->dev, "Failed to get armcp info\n");
+   dev_err(hdev->dev, "Failed to get armcp info %d\n", rc);
return rc;
}
 
@@ -553,33 +576,15 @@ int goya_late_init(struct hl_device *hdev)
 
rc = hl_fw_send_pci_access_msg(hdev, ARMCP_PACKET_ENABLE_PCI_ACCESS);
if (rc) {
-   dev_err(hdev->dev, "Failed to enable PCI access from CPU\n");
+   dev_err(hdev->dev,
+   "Failed to enable PCI access from CPU %d\n", rc);
return rc;
}
 
WREG32(mmGIC_DISTRIBUTOR__5_GICD_SETSPI_NSR,
GOYA_ASYNC_EVENT_ID_INTS_REGISTER);
 
-   goya_fetch_psoc_frequency(hdev);
-
-   rc = goya_mmu_clear_pgt_range(hdev);
-   if (rc) {
-   dev_err(hdev->dev, "Failed to clear MMU page tables range\n");
-   goto disable_pci_access;
-   }
-
-   rc = goya_mmu_set_dram_default_page(hdev);
-   if (rc) {
-   dev_err(hdev->dev, "Failed to set DRAM default page\n");
-   goto disable_pci_access;
-   }
-
return 0;
-
-disable_pci_access:
-   hl_fw_send_pci_access_msg(hdev, ARMCP_PACKET_DISABLE_PCI_ACCESS);
-
-   return rc;
 }
 
 /*
@@ -1000,7 +1005,7 @@ int goya_init_cpu_queues(struct hl_device *hdev)
 
if (err) {
dev_err(hdev->dev,
-   "Failed to communicate with ARM CPU (ArmCP timeout)\n");
+   

Re: [PATCH 2/2] edac: add support for Amazon's Annapurna Labs EDAC

2019-06-10 Thread Benjamin Herrenschmidt
On Sat, 2019-06-08 at 11:05 +0200, Borislav Petkov wrote:
> On Sat, Jun 08, 2019 at 10:16:11AM +1000, Benjamin Herrenschmidt wrote:
> > Those IP blocks don't need any SW coordination at runtime. The drivers
> > don't share data nor communicate with each other. There is absolutely
> > no reason to go down that path.
> 
> Let me set one thing straight: the EDAC "subsystem" if you will - or
> that pile of code which does error counting and reporting - has its
> limitations in supporting one EDAC driver per platform. And whenever we
> have two drivers loadable on a platform, we have to do dirty hacks like
> 
>   301375e76432 ("EDAC: Add owner check to the x86 platform drivers")
> 
> What that means is, that if you need to call EDAC logging routines or
> whatnot from two different drivers, there's no locking, no nothing. So
> it might work or it might set your cat on fire.

Should we fix that instead, then? What are the big issues with adding
some basic locking? Being called from NMIs?

If the separate drivers operate on distinct counters I don't see a big
problem there.

> IOW, having multiple separate "drivers" or representations of RAS
> functionality using EDAC facilities is something that hasn't been
> done. Well, almost. highbank_mc_edac.c and highbank_l2_edac.c is one
> example but they make sure they don't step on each other's toes by using
> different EDAC pieces - a device vs a memory controller abstraction.

That sounds like a reasonable requirement.

> And now the moment all of a sudden you decide you want for those
> separate "drivers" to synchronize on something, you need to do something
> hacky like the amd_register_ecc_decoder() thing, for example, because we
> need to call into the EDAC memory controller driver to decode a DRAM ECC
> error properly, while the rest of the error types get decoded somewhere
> else...
> 
> Then there comes the issue with code reuse - wouldn't it be great if a
> memory controller driver can be shared between platform drivers instead of
> copying it in both?
> 
> We already do that - see fsl_ddr_edac.c which gets shared between PPC
> *and* ARM. drivers/edac/skx_common.c is another example for Intel chips.
> 
> Now, if you have a platform with 10 IP blocks which each have RAS
> functionality, are you saying you'll do 10 different pieces called
> 
> __edac.c
> 
> ?
> 
> And if  has an old IP block with the old RAS
> functionality, you load __edac.c on the new
> platform too?

I'm not sure why  ...

Anyway, let's get back to the specific case of our Amazon platform here
since it's a concrete example.

Hanna, can you give us a reasonably exhaustive list of how many such
"drivers" we'll want in the EDAC subsystem and whether you envision any
coordination requirement between them or not ?

Cheers,
Ben.





[PATCH v3] arm64: dts: ls1028a: Add temperature sensor node

2019-06-10 Thread Yuantian Tang
Add the NXP SA56004 chip node for temperature monitoring.

Signed-off-by: Yuantian Tang 
---
v3:
- sort the node by i2c address
v2:
- change the node name and add vcc-supply
 arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts |   15 +++
 arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts |   15 +++
 2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts b/arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts
index b359068..960daf2 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts
@@ -47,6 +47,15 @@
regulator-always-on;
};
 
+   sb_3v3: regulator-sb3v3 {
+   compatible = "regulator-fixed";
+   regulator-name = "3v3_vbus";
+   regulator-min-microvolt = <330>;
+   regulator-max-microvolt = <330>;
+   regulator-boot-on;
+   regulator-always-on;
+   };
+
sound {
compatible = "simple-audio-card";
simple-audio-card,format = "i2s";
@@ -117,6 +126,12 @@
#size-cells = <0>;
reg = <0x3>;
 
+   temperature-sensor@4c {
+   compatible = "nxp,sa56004";
+   reg = <0x4c>;
+   vcc-supply = <_3v3>;
+   };
+
rtc@51 {
compatible = "nxp,pcf2129";
reg = <0x51>;
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts b/arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts
index f9c272f..6a22423 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts
@@ -43,6 +43,15 @@
regulator-always-on;
};
 
+   sb_3v3: regulator-sb3v3 {
+   compatible = "regulator-fixed";
+   regulator-name = "3v3_vbus";
+   regulator-min-microvolt = <330>;
+   regulator-max-microvolt = <330>;
+   regulator-boot-on;
+   regulator-always-on;
+   };
+
sound {
compatible = "simple-audio-card";
simple-audio-card,format = "i2s";
@@ -115,6 +124,12 @@
#size-cells = <0>;
reg = <0x3>;
 
+   temperature-sensor@4c {
+   compatible = "nxp,sa56004";
+   reg = <0x4c>;
+   vcc-supply = <_3v3>;
+   };
+
rtc@51 {
compatible = "nxp,pcf2129";
reg = <0x51>;
-- 
1.7.1



[PATCH 5/8] habanalabs: set Goya CPU to use ASIC MMU

2019-06-10 Thread Oded Gabbay
This patch configures the Goya CPU to actually go through the MMU for
translation. The configuration is done after the configuration of the
relevant MMU mappings.
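
As a rough illustration of the address rebasing done in the allocator
change of this patch: a host DMA handle inside the CPU-accessible pool is
converted into the device-side virtual address by keeping its offset in the
pool. The sketch is userspace C; the toy_* names are invented and the
numeric value of the VA base is a placeholder, not the real
VA_CPU_ACCESSIBLE_MEM_ADDR.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder value, NOT the driver's VA_CPU_ACCESSIBLE_MEM_ADDR. */
#define TOY_VA_CPU_ACCESSIBLE_MEM_ADDR 0x8000000000ULL

/* Keep the offset inside the pool, swap the host base for the VA base. */
static uint64_t toy_host_dma_to_device_va(uint64_t dma_handle,
					  uint64_t pool_dma_base)
{
	assert(dma_handle >= pool_dma_base);
	return dma_handle - pool_dma_base + TOY_VA_CPU_ACCESSIBLE_MEM_ADDR;
}

/* Same split the scratchpad writes use for a 64-bit address. */
static uint32_t toy_lower_32_bits(uint64_t v)
{
	return (uint32_t)v;
}

static uint32_t toy_upper_32_bits(uint64_t v)
{
	return (uint32_t)(v >> 32);
}
```

The device CPU then only ever sees addresses relative to the VA base, and
the MMU mapping takes care of reaching the host memory behind them.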

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/goya/goya.c | 23 ---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index 4e41f2669e6d..9f1f47770afa 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -986,9 +986,9 @@ int goya_init_cpu_queues(struct hl_device *hdev)
WREG32(mmPSOC_GLOBAL_CONF_SCRATCHPAD_3, upper_32_bits(eq->bus_address));
 
WREG32(mmPSOC_GLOBAL_CONF_SCRATCHPAD_8,
-   lower_32_bits(hdev->cpu_accessible_dma_address));
+   lower_32_bits(VA_CPU_ACCESSIBLE_MEM_ADDR));
WREG32(mmPSOC_GLOBAL_CONF_SCRATCHPAD_9,
-   upper_32_bits(hdev->cpu_accessible_dma_address));
+   upper_32_bits(VA_CPU_ACCESSIBLE_MEM_ADDR));
 
WREG32(mmPSOC_GLOBAL_CONF_SCRATCHPAD_5, HL_QUEUE_SIZE_IN_BYTES);
WREG32(mmPSOC_GLOBAL_CONF_SCRATCHPAD_4, HL_EQ_SIZE_IN_BYTES);
@@ -3011,7 +3011,13 @@ static void goya_dma_pool_free(struct hl_device *hdev, void *vaddr,
 void *goya_cpu_accessible_dma_pool_alloc(struct hl_device *hdev, size_t size,
dma_addr_t *dma_handle)
 {
-   return hl_fw_cpu_accessible_dma_pool_alloc(hdev, size, dma_handle);
+   void *vaddr;
+
+   vaddr = hl_fw_cpu_accessible_dma_pool_alloc(hdev, size, dma_handle);
+   *dma_handle = (*dma_handle) - hdev->cpu_accessible_dma_address +
+   VA_CPU_ACCESSIBLE_MEM_ADDR;
+
+   return vaddr;
 }
 
 void goya_cpu_accessible_dma_pool_free(struct hl_device *hdev, size_t size,
@@ -4667,6 +4673,14 @@ static int goya_mmu_add_mappings_for_device_cpu(struct hl_device *hdev)
}
}
 
+   goya_mmu_prepare_reg(hdev, mmCPU_IF_ARUSER_OVR, HL_KERNEL_ASID_ID);
+   goya_mmu_prepare_reg(hdev, mmCPU_IF_AWUSER_OVR, HL_KERNEL_ASID_ID);
+   WREG32(mmCPU_IF_ARUSER_OVR_EN, 0x7FF);
+   WREG32(mmCPU_IF_AWUSER_OVR_EN, 0x7FF);
+
+   /* Make sure configuration is flushed to device */
+   RREG32(mmCPU_IF_AWUSER_OVR_EN);
+
goya->device_cpu_mmu_mappings_done = true;
 
return 0;
@@ -4702,6 +4716,9 @@ void goya_mmu_remove_device_cpu_mappings(struct hl_device *hdev)
if (!goya->device_cpu_mmu_mappings_done)
return;
 
+   WREG32(mmCPU_IF_ARUSER_OVR_EN, 0);
+   WREG32(mmCPU_IF_AWUSER_OVR_EN, 0);
+
if (!(hdev->cpu_accessible_dma_address & (PAGE_SIZE_2MB - 1))) {
if (hl_mmu_unmap(hdev->kernel_ctx, VA_CPU_ACCESSIBLE_MEM_ADDR,
PAGE_SIZE_2MB))
-- 
2.17.1



[PATCH 4/8] habanalabs: add MMU mappings for Goya CPU

2019-06-10 Thread Oded Gabbay
This patch adds the necessary MMU mappings for the Goya CPU to access the
device DRAM and the host memory.

The first 256MB of the device DRAM is being mapped. That's where the F/W
is running.

The 2MB area located on the host memory for the purpose of communication
between the driver and the device CPU is also being mapped.
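
A small userspace sketch of the arithmetic behind these mappings, assuming
the F/W region is exactly the 256MB mentioned above and that it is walked
in 2MB steps as in the loop added by this patch. The toy_* names are
invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE_2MB	(2ULL << 20)
#define TOY_FW_IMAGE_SIZE	(256ULL << 20)	/* assumed: first 256MB of DRAM */

/* Same test the patch applies to the host area before using a 2MB page. */
static bool toy_is_2mb_aligned(uint64_t addr)
{
	return !(addr & (TOY_PAGE_SIZE_2MB - 1));
}

/* Count the identity mappings (VA == PA) needed to cover the F/W region. */
static unsigned int toy_count_fw_mappings(uint64_t dram_base)
{
	unsigned int n = 0;

	assert(toy_is_2mb_aligned(dram_base));
	for (uint64_t off = 0; off < TOY_FW_IMAGE_SIZE; off += TOY_PAGE_SIZE_2MB)
		n++;	/* one 2MB page at dram_base + off */

	return n;
}
```

Covering 256MB with 2MB pages takes 128 map calls, and the 2MB host area
can only use a single 2MB page when its DMA address happens to be 2MB
aligned.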

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/debugfs.c|   7 +-
 drivers/misc/habanalabs/goya/goya.c  | 126 +--
 drivers/misc/habanalabs/goya/goyaP.h |  12 ++-
 drivers/misc/habanalabs/habanalabs.h |   6 +-
 4 files changed, 137 insertions(+), 14 deletions(-)

diff --git a/drivers/misc/habanalabs/debugfs.c b/drivers/misc/habanalabs/debugfs.c
index ba418aaa404c..886f8ea82499 100644
--- a/drivers/misc/habanalabs/debugfs.c
+++ b/drivers/misc/habanalabs/debugfs.c
@@ -355,7 +355,7 @@ static int mmu_show(struct seq_file *s, void *data)
struct hl_debugfs_entry *entry = s->private;
struct hl_dbg_device_entry *dev_entry = entry->dev_entry;
struct hl_device *hdev = dev_entry->hdev;
-   struct hl_ctx *ctx = hdev->user_ctx;
+   struct hl_ctx *ctx;
 
u64 hop0_addr = 0, hop0_pte_addr = 0, hop0_pte = 0,
hop1_addr = 0, hop1_pte_addr = 0, hop1_pte = 0,
@@ -367,6 +367,11 @@ static int mmu_show(struct seq_file *s, void *data)
if (!hdev->mmu_enable)
return 0;
 
+   if (dev_entry->mmu_asid == HL_KERNEL_ASID_ID)
+   ctx = hdev->kernel_ctx;
+   else
+   ctx = hdev->user_ctx;
+
if (!ctx) {
dev_err(hdev->dev, "no ctx available\n");
return 0;
diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index 106074466dca..4e41f2669e6d 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -297,6 +297,11 @@ static u32 goya_all_events[] = {
GOYA_ASYNC_EVENT_ID_DMA_BM_CH4
 };
 
+static int goya_mmu_clear_pgt_range(struct hl_device *hdev);
+static int goya_mmu_set_dram_default_page(struct hl_device *hdev);
+static int goya_mmu_add_mappings_for_device_cpu(struct hl_device *hdev);
+static void goya_mmu_prepare(struct hl_device *hdev, u32 asid);
+
 void goya_get_fixed_properties(struct hl_device *hdev)
 {
struct asic_fixed_properties *prop = >asic_prop;
@@ -554,6 +559,10 @@ int goya_late_init(struct hl_device *hdev)
return rc;
}
 
+   rc = goya_mmu_add_mappings_for_device_cpu(hdev);
+   if (rc)
+   return rc;
+
rc = goya_init_cpu_queues(hdev);
if (rc)
return rc;
@@ -2065,10 +2074,12 @@ static void goya_halt_engines(struct hl_device *hdev, bool hard_reset)
goya_disable_external_queues(hdev);
goya_disable_internal_queues(hdev);
 
-   if (hard_reset)
+   if (hard_reset) {
goya_disable_msix(hdev);
-   else
+   goya_mmu_remove_device_cpu_mappings(hdev);
+   } else {
goya_sync_irqs(hdev);
+   }
 }
 
 /*
@@ -4584,7 +4595,7 @@ int goya_context_switch(struct hl_device *hdev, u32 asid)
return 0;
 }
 
-int goya_mmu_clear_pgt_range(struct hl_device *hdev)
+static int goya_mmu_clear_pgt_range(struct hl_device *hdev)
 {
struct asic_fixed_properties *prop = &hdev->asic_prop;
struct goya_device *goya = hdev->asic_specific;
@@ -4598,7 +4609,7 @@ int goya_mmu_clear_pgt_range(struct hl_device *hdev)
return goya_memset_device_memory(hdev, addr, size, 0, true);
 }
 
-int goya_mmu_set_dram_default_page(struct hl_device *hdev)
+static int goya_mmu_set_dram_default_page(struct hl_device *hdev)
 {
struct goya_device *goya = hdev->asic_specific;
u64 addr = hdev->asic_prop.mmu_dram_default_page_addr;
@@ -4611,7 +4622,112 @@ int goya_mmu_set_dram_default_page(struct hl_device *hdev)
return goya_memset_device_memory(hdev, addr, size, val, true);
 }
 
-void goya_mmu_prepare(struct hl_device *hdev, u32 asid)
+static int goya_mmu_add_mappings_for_device_cpu(struct hl_device *hdev)
+{
+   struct asic_fixed_properties *prop = &hdev->asic_prop;
+   struct goya_device *goya = hdev->asic_specific;
+   s64 off, cpu_off;
+   int rc;
+
+   if (!(goya->hw_cap_initialized & HW_CAP_MMU))
+   return 0;
+
+   for (off = 0 ; off < CPU_FW_IMAGE_SIZE ; off += PAGE_SIZE_2MB) {
+   rc = hl_mmu_map(hdev->kernel_ctx, prop->dram_base_address + off,
+   prop->dram_base_address + off, PAGE_SIZE_2MB);
+   if (rc) {
+   dev_err(hdev->dev, "Map failed for address 0x%llx\n",
+   prop->dram_base_address + off);
+   goto unmap;
+   }
+   }
+
+   if (!(hdev->cpu_accessible_dma_address & (PAGE_SIZE_2MB - 1))) {
+   rc = hl_mmu_map(hdev->kernel_ctx, VA_CPU_ACCESSIBLE_MEM_ADDR,
+   

[PATCH 6/8] habanalabs: remove DMA mask hack for Goya

2019-06-10 Thread Oded Gabbay
This patch removes the non-standard DMA mask setting for Goya. Now that
the device CPU goes through the MMU, we are not limited to allocating the
CPU accessible memory area in the address space of under 39 bits.
Therefore, we don't need to set the DMA mask twice during
initialization, a practice that does not work on the POWER architecture.

The patch sets the DMA mask to 48 bits once during the initialization. The
address of the CPU accessible memory area is configured to the MMU and the
matching VA is given to the device CPU.

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/goya/goya.c | 19 ---
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index 9f1f47770afa..e8b3a31d211f 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -472,7 +472,7 @@ static int goya_early_init(struct hl_device *hdev)
 
prop->dram_pci_bar_size = pci_resource_len(pdev, DDR_BAR_ID);
 
-   rc = hl_pci_init(hdev, 39);
+   rc = hl_pci_init(hdev, 48);
if (rc)
return rc;
 
@@ -669,6 +669,9 @@ static int goya_sw_init(struct hl_device *hdev)
goto free_dma_pool;
}
 
+   dev_dbg(hdev->dev, "cpu accessible memory at bus address 0x%llx\n",
+   hdev->cpu_accessible_dma_address);
+
hdev->cpu_accessible_dma_pool = gen_pool_create(ilog2(32), -1);
if (!hdev->cpu_accessible_dma_pool) {
dev_err(hdev->dev,
@@ -2481,25 +2484,11 @@ static int goya_hw_init(struct hl_device *hdev)
if (rc)
goto disable_queues;
 
-   /*
-* Check if we managed to set the DMA mask to more then 32 bits. If so,
-* let's try to increase it again because in Goya we set the initial
-* dma mask to less then 39 bits so that the allocation of the memory
-* area for the device's cpu will be under 39 bits
-*/
-   if (hdev->dma_mask > 32) {
-   rc = hl_pci_set_dma_mask(hdev, 48);
-   if (rc)
-   goto disable_msix;
-   }
-
/* Perform read from the device to flush all MSI-X configuration */
val = RREG32(mmPCIE_DBI_DEVICE_ID_VENDOR_ID_REG);
 
return 0;
 
-disable_msix:
-   goya_disable_msix(hdev);
 disable_queues:
goya_disable_internal_queues(hdev);
goya_disable_external_queues(hdev);
-- 
2.17.1



[PATCH 7/8] habanalabs: add WARN in case of bad MMU mapping

2019-06-10 Thread Oded Gabbay
This patch checks whether an MMU mapping is erroneous, i.e. whether the
physical address that is being mapped is NOT divisible by the page size.

If that happens, the H/W will issue a transaction that is translated to a
wrong address, because part of the address (the remainder of
address / page size) will not be taken into account.

Because the physical address is being handled by the driver, a WARN is
suitable here as it implies a bug in the driver code itself and not a user
bug.
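
For illustration only, the divisibility test described above can be sketched
in plain user-space C. The helper names are hypothetical and not part of the
driver, and the sketch assumes the page size is a non-zero power of two, so
the remainder can be taken with a bit mask.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when phys_addr is a multiple of page_size.
 * Assumption: page_size is a non-zero power of two.
 */
static bool phys_addr_is_page_aligned(uint64_t phys_addr, uint32_t page_size)
{
	return (phys_addr & ((uint64_t)page_size - 1)) == 0;
}

/*
 * Returns the low bits (address modulo page size) that would be lost
 * if a misaligned address were mapped anyway.
 */
static uint64_t dropped_offset(uint64_t phys_addr, uint32_t page_size)
{
	return phys_addr & ((uint64_t)page_size - 1);
}
```

With a 2MB page (0x200000), address 0x201000 is misaligned and the offset
0x1000 is the part that would be dropped.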

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/mmu.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/misc/habanalabs/mmu.c b/drivers/misc/habanalabs/mmu.c
index a80162c5c373..176c315836f1 100644
--- a/drivers/misc/habanalabs/mmu.c
+++ b/drivers/misc/habanalabs/mmu.c
@@ -913,6 +913,10 @@ int hl_mmu_map(struct hl_ctx *ctx, u64 virt_addr, u64 phys_addr, u32 page_size)
return -EFAULT;
}
 
+   WARN_ONCE((phys_addr & (real_page_size - 1)),
+   "Mapping 0x%llx with page size of 0x%x is erroneous! Address must be divisible by page size",
+   phys_addr, real_page_size);
+
npages = page_size / real_page_size;
real_virt_addr = virt_addr;
real_phys_addr = phys_addr;
-- 
2.17.1



[PATCH 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-10 Thread Oded Gabbay
This patch enables support in the driver for a 64-bit DMA mask when running
on a POWER9 machine.

POWER9 supports either 32-bit or 64-bit DMA mask. However, our ASICs
support 48-bit DMA mask. To support 64-bit, the driver needs to add a
special configuration to the ASIC's PCIe controller.

The activation of this special configuration is done via kernel module
parameter because:

1. It should affect all the habanalabs ASICs in the machine.

2. The pci_set_dma_mask() is a generic Linux kernel call, so the driver
   can't tell why it got an error when it tried to set the DMA mask to 48
   bits. Upon such a failure, the driver must fall back to setting the
   mask to 32 bits.

3. There is no standard way to differentiate at runtime between POWER9 and
   other architectures.
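
As a rough sketch of point 2 (not the driver's code), the fallback can be
modelled in user-space C. The callback type and the two platform models below
are assumptions made up for the example; the POWER9-like model accepts only
32-bit and 64-bit masks, as stated above.

```c
/* Hypothetical platform hook: returns 0 if a mask of 'bits' is accepted. */
typedef int (*set_mask_fn)(unsigned int bits);

/*
 * Try the preferred mask width first; on failure fall back to 32 bits.
 * Returns the width that was accepted, or 0 if both attempts fail.
 * The caller cannot learn *why* the preferred width was rejected.
 */
static unsigned int choose_dma_mask(set_mask_fn set_mask, unsigned int preferred)
{
	if (set_mask(preferred) == 0)
		return preferred;
	if (set_mask(32) == 0)
		return 32;
	return 0;
}

/* Model of a platform that accepts only 32-bit or 64-bit masks. */
static int power9_like_set_mask(unsigned int bits)
{
	return (bits == 32 || bits == 64) ? 0 : -1;
}

/* Model of a platform that accepts any mask width from 1 to 64 bits. */
static int permissive_set_mask(unsigned int bits)
{
	return (bits >= 1 && bits <= 64) ? 0 : -1;
}
```

In this model a request for 48 bits silently degrades to 32 bits on the
POWER9-like platform, which is why an explicit opt-in to 64 bits is needed.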

Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/goya/goya.c  | 6 +-
 drivers/misc/habanalabs/habanalabs.h | 3 +++
 drivers/misc/habanalabs/habanalabs_drv.c | 7 +++
 drivers/misc/habanalabs/pci.c| 7 ++-
 4 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index e8b3a31d211f..eb6cd1ee06f2 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -472,7 +472,11 @@ static int goya_early_init(struct hl_device *hdev)
 
prop->dram_pci_bar_size = pci_resource_len(pdev, DDR_BAR_ID);
 
-   rc = hl_pci_init(hdev, 48);
+   if (hdev->power9_64bit_dma_enable)
+   rc = hl_pci_init(hdev, 64);
+   else
+   rc = hl_pci_init(hdev, 48);
+
if (rc)
return rc;
 
diff --git a/drivers/misc/habanalabs/habanalabs.h b/drivers/misc/habanalabs/habanalabs.h
index 5e4a631b3d88..b6fa2df0b2d6 100644
--- a/drivers/misc/habanalabs/habanalabs.h
+++ b/drivers/misc/habanalabs/habanalabs.h
@@ -1208,6 +1208,8 @@ struct hl_device_reset_work {
  * @dma_mask: the dma mask that was set for this device
  * @in_debug: is device under debug. This, together with fd_open_cnt, enforces
  *that only a single user is configuring the debug infrastructure.
+ * @power9_64bit_dma_enable: true to enable 64-bit DMA mask support. Relevant
+ *   only to POWER9 machines.
  */
 struct hl_device {
struct pci_dev  *pdev;
@@ -1281,6 +1283,7 @@ struct hl_device {
u8  device_cpu_disabled;
u8  dma_mask;
u8  in_debug;
+   u8  power9_64bit_dma_enable;
 
/* Parameters for bring-up */
u8  mmu_enable;
diff --git a/drivers/misc/habanalabs/habanalabs_drv.c b/drivers/misc/habanalabs/habanalabs_drv.c
index 6f6dbe93f1df..9ca2d9d4f3fe 100644
--- a/drivers/misc/habanalabs/habanalabs_drv.c
+++ b/drivers/misc/habanalabs/habanalabs_drv.c
@@ -28,6 +28,7 @@ static DEFINE_MUTEX(hl_devs_idr_lock);
 
 static int timeout_locked = 5;
 static int reset_on_lockup = 1;
+static int power9_64bit_dma_enable;
 
 module_param(timeout_locked, int, 0444);
 MODULE_PARM_DESC(timeout_locked,
@@ -37,6 +38,10 @@ module_param(reset_on_lockup, int, 0444);
 MODULE_PARM_DESC(reset_on_lockup,
"Do device reset on lockup (0 = no, 1 = yes, default yes)");
 
+module_param(power9_64bit_dma_enable, int, 0444);
+MODULE_PARM_DESC(power9_64bit_dma_enable,
+   "Enable 64-bit DMA mask. Should be set only in POWER9 machine (0 = no, 1 = yes, default no)");
+
 #define PCI_VENDOR_ID_HABANALABS   0x1da3
 
 #define PCI_IDS_GOYA   0x0001
@@ -223,6 +228,8 @@ int create_hdev(struct hl_device **dev, struct pci_dev *pdev,
 
hdev->major = hl_major;
hdev->reset_on_lockup = reset_on_lockup;
+   hdev->power9_64bit_dma_enable = power9_64bit_dma_enable;
+
hdev->pldm = 0;
 
set_driver_behavior_per_device(hdev);
diff --git a/drivers/misc/habanalabs/pci.c b/drivers/misc/habanalabs/pci.c
index c98d88c7a5c6..15954bf419fa 100644
--- a/drivers/misc/habanalabs/pci.c
+++ b/drivers/misc/habanalabs/pci.c
@@ -283,7 +283,12 @@ int hl_pci_init_iatu(struct hl_device *hdev, u64 sram_base_address,
upper_32_bits(host_phys_base_address));
rc |= hl_pci_iatu_write(hdev, 0x010, lower_32_bits(host_phys_end_addr));
rc |= hl_pci_iatu_write(hdev, 0x014, 0);
-   rc |= hl_pci_iatu_write(hdev, 0x018, 0);
+
+   if ((hdev->power9_64bit_dma_enable) && (hdev->dma_mask == 64))
+   rc |= hl_pci_iatu_write(hdev, 0x018, 0x0800);
+   else
+   rc |= hl_pci_iatu_write(hdev, 0x018, 0);
+
rc |= hl_pci_iatu_write(hdev, 0x020, upper_32_bits(host_phys_end_addr));
/* Increase region size */
rc |= hl_pci_iatu_write(hdev, 0x000, 0x2000);
-- 
2.17.1



[PATCH 0/8] Fixing DMA mask issues in habanalabs driver

2019-06-10 Thread Oded Gabbay
This patch-set changes the way the Goya internal CPU accesses memory on the
Host machine. This is needed to stop the driver from using the PCI DMA
set mask kernel API in a non-standard way, as it has done so far.

The DMA set mask should be called only once at the start of the driver.
This is because changing the DMA mask to a new value after allocations
were made using a previous mask value might cause the previous allocations
to become inaccessible (usually if there is an IOMMU present).

The driver did that because of a limitation in Goya's internal CPU. The
limitation was that the internal CPU can only access 40-bit addresses,
while the entire ASIC can access 50-bit addresses. Therefore, the driver
set the DMA mask to 39-bits, allocated memory for the internal CPU on the
host and then changed the DMA mask to 48-bits.

This patch-set eliminates the double DMA set by using Goya's MMU to
overcome the limitation. The driver now sets the DMA mask only once to
48-bits and allocates a single DMA region of 2MB for the internal CPU. It
then maps that region in Goya's MMU to a device virtual address under 40-bits.
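
To make the address-width argument concrete, here is a small user-space
sketch (hypothetical helpers, not driver code) that checks whether a buffer
lies entirely below a given mask width. It shows why a buffer allocated under
a 48-bit mask is not necessarily reachable by a component limited to fewer
address bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* All-ones mask of n bits, for 1 <= n <= 64. */
static uint64_t dma_bit_mask(unsigned int n)
{
	return (n >= 64) ? ~UINT64_C(0) : ((UINT64_C(1) << n) - 1);
}

/*
 * True when the whole range [addr, addr + len) is addressable with
 * 'bits' address bits. Assumes len > 0 and that addr + len does not wrap.
 */
static bool addr_fits_mask(uint64_t addr, uint64_t len, unsigned int bits)
{
	return (addr + len - 1) <= dma_bit_mask(bits);
}
```

For example, a 2MB buffer at bus address 1 << 39 fits a 48-bit mask but not
a 39-bit one, so it would have to be reached through a translation such as
the MMU mapping described above.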

In addition, this patch-set enables the use of 64-bit mask on POWER9
systems. POWER9 DMA mask can be set ONLY to 32-bit or 64-bit. To use
64-bit, the device must set bit 59 to 1 in all its outbound transactions. 
This is achieved by setting a special configuration in Goya's PCIe
controller. The configuration must be done only in POWER9 machines, as it
will make the device non-functional on other architectures 
(e.g. x86-64, ARM).
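
The effect of "setting bit 59" on an outbound address can be pictured with
the following sketch. It only models the arithmetic; the function name is
made up for the example, and how the PCIe controller is actually programmed
is not shown here.

```c
#include <stdint.h>

/* Bit position taken from the description above. */
#define OUTBOUND_TAG_BIT 59

/* Returns the bus address with bit 59 forced to 1. */
static uint64_t tag_outbound_addr(uint64_t bus_addr)
{
	return bus_addr | (UINT64_C(1) << OUTBOUND_TAG_BIT);
}
```

Applying the function twice gives the same result as applying it once, so
tagging an already tagged address is harmless in this model.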

Thanks,
Oded

Oded Gabbay (8):
  habanalabs: initialize device CPU queues after MMU init
  habanalabs: de-couple MMU and VM module initialization
  habanalabs: initialize MMU context for driver
  habanalabs: add MMU mappings for Goya CPU
  habanalabs: set Goya CPU to use ASIC MMU
  habanalabs: remove DMA mask hack for Goya
  habanalabs: add WARN in case of bad MMU mapping
  habanalabs: enable 64-bit DMA mask in POWER9

 drivers/misc/habanalabs/asid.c   |   2 +-
 drivers/misc/habanalabs/context.c|   7 +
 drivers/misc/habanalabs/debugfs.c|   7 +-
 drivers/misc/habanalabs/device.c |  45 +++--
 drivers/misc/habanalabs/goya/goya.c  | 234 +--
 drivers/misc/habanalabs/goya/goyaP.h |  12 +-
 drivers/misc/habanalabs/habanalabs.h |   9 +-
 drivers/misc/habanalabs/habanalabs_drv.c |   7 +
 drivers/misc/habanalabs/memory.c |  13 +-
 drivers/misc/habanalabs/mmu.c|  20 +-
 drivers/misc/habanalabs/pci.c|   7 +-
 11 files changed, 259 insertions(+), 104 deletions(-)

-- 
2.17.1



Re: [PATCH v5 0/3] Add restrictions for kexec/kdump jumping between 5-level and 4-level kernel

2019-06-10 Thread Baoquan He
Hi,

On 05/24/19 at 03:38pm, Baoquan He wrote:
> 

Ping.

Can anyone help do further reviewing on this patchset? Or consider
merging since people have ack-ed?

Thanks
Baoquan

> The v4 cover letter gives the background for this addition; the link is
> pasted here for reference:
> http://lkml.kernel.org/r/20190509013644.1246-1-...@redhat.com
> 
> Changelog:
> v4->v5:
>   Tune code and log per Dave's comments, no functional change.
>   - In patch 2, change the printed error message;
>   - In patch 3, add macro SZ_64T and use it in code, and remove the
> obsolete code comment.
> v3->v4:
>   No functional change.
>   - Rewrite log of patch 1/3 to tell who is going to use the newly added
> bits.
>   - Rewrite log of patch 2/3 per tglx's words.
>   - Add Kirill's Acked-by.
> 
> v2->v3:
>   Change the constant to match the notation for the rest of defines as
>   Kirill suggested;
> v1->v2:
>   Correct the subject of patch 1 according to tglx's comment;
>   Add more information to cover-letter to address reviewers' concerns;
> 
> Baoquan He (3):
>   x86/boot: Add xloadflags bits for 5-level kernel checking
>   x86/kexec/64: Error out if try to jump to old 4-level kernel from
> 5-level kernel
>   x86/kdump/64: Change the upper limit of crashkernel reservation
> 
>  arch/x86/boot/header.S| 12 +++-
>  arch/x86/include/uapi/asm/bootparam.h |  2 ++
>  arch/x86/kernel/kexec-bzimage64.c |  5 +
>  arch/x86/kernel/setup.c   | 15 ---
>  include/linux/sizes.h |  1 +
>  5 files changed, 31 insertions(+), 4 deletions(-)
> 
> -- 
> 2.17.2
> 


Re: [PATCH v6 09/10] usb: roles: add USB Type-B GPIO connector driver

2019-06-10 Thread Chunfeng Yun
On Thu, 2019-06-06 at 09:31 +0300, Andy Shevchenko wrote:
> On Thu, Jun 6, 2019 at 5:53 AM Chunfeng Yun  wrote:
> >
> > On Wed, 2019-06-05 at 11:45 +0300, Andy Shevchenko wrote:
> > > On Wed, May 29, 2019 at 10:44 AM Chunfeng Yun  wrote:
> > > >
> > > > Due to the requirement of usb-connector.txt binding, the old way
> > > > using extcon to support USB Dual-Role switch is now deprecated
> > > > when use Type-B connector.
> > > > This patch introduces a driver of Type-B connector which typically
> > > > uses an input GPIO to detect USB ID pin, and try to replace the
> > > > function provided by extcon-usb-gpio driver
> > >
> > > > +static SIMPLE_DEV_PM_OPS(usb_conn_pm_ops,
> > > > +usb_conn_suspend, usb_conn_resume);
> > > > +
> > > > +#define DEV_PMS_OPS (IS_ENABLED(CONFIG_PM_SLEEP) ? &usb_conn_pm_ops : NULL)
> > >
> > > Why this macro is needed?
> > Want to set .pm as NULL when CONFIG_PM_SLEEP is not enabled.
> 
> Doesn't SIMPLE_DEV_PM_OPS do this for you?
Yes, you are right, it provides an empty dev_pm_ops struct. I'll remove
DEV_PMS_OPS, thanks a lot.
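
A simplified user-space model of the behaviour discussed here is sketched
below. All names prefixed with MODEL_ or model_ are invented for the example
and are not the kernel macros; the point is only that, with sleep support
compiled out, the ops structure still exists but carries no callbacks, so a
separate NULL-selecting macro is unnecessary.

```c
#include <stddef.h>

/* Lets the callbacks go unreferenced without a compiler warning. */
#define MODEL_MAYBE_UNUSED __attribute__((unused))

struct dev_pm_ops_model {
	int (*suspend)(void *dev);
	int (*resume)(void *dev);
};

/* Define MODEL_PM_SLEEP to model a build with sleep support enabled. */
#ifdef MODEL_PM_SLEEP
#define MODEL_SIMPLE_PM_OPS(name, s, r) \
	static const struct dev_pm_ops_model name = { .suspend = (s), .resume = (r) }
#else
#define MODEL_SIMPLE_PM_OPS(name, s, r) \
	static const struct dev_pm_ops_model name = { NULL, NULL }
#endif

static MODEL_MAYBE_UNUSED int model_suspend(void *dev)
{
	(void)dev;
	return 0;
}

static MODEL_MAYBE_UNUSED int model_resume(void *dev)
{
	(void)dev;
	return 0;
}

/* Always defined; empty when MODEL_PM_SLEEP is not set. */
MODEL_SIMPLE_PM_OPS(model_pm_ops, model_suspend, model_resume);
```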

> 




Re: [PATCH] bpf/core.c - silence warning messages

2019-06-10 Thread Andrii Nakryiko
On Thu, Jun 6, 2019 at 8:08 PM Valdis Klētnieks  wrote:
>
> Compiling kernel/bpf/core.c with W=1 causes a flood of warnings:
>
> kernel/bpf/core.c:1198:65: warning: initialized field overwritten [-Woverride-init]
>  1198 | #define BPF_INSN_3_TBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = true
>   | ^~~~
> kernel/bpf/core.c:1087:2: note: in expansion of macro 'BPF_INSN_3_TBL'
>  1087 |  INSN_3(ALU, ADD,  X),   \
>   |  ^~
> kernel/bpf/core.c:1202:3: note: in expansion of macro 'BPF_INSN_MAP'
>  1202 |   BPF_INSN_MAP(BPF_INSN_2_TBL, BPF_INSN_3_TBL),
>   |   ^~~~
> kernel/bpf/core.c:1198:65: note: (near initialization for 'public_insntable[12]')
>  1198 | #define BPF_INSN_3_TBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = true
>   | ^~~~
> kernel/bpf/core.c:1087:2: note: in expansion of macro 'BPF_INSN_3_TBL'
>  1087 |  INSN_3(ALU, ADD,  X),   \
>   |  ^~
> kernel/bpf/core.c:1202:3: note: in expansion of macro 'BPF_INSN_MAP'
>  1202 |   BPF_INSN_MAP(BPF_INSN_2_TBL, BPF_INSN_3_TBL),
>   |   ^~~~
>
> 98 copies of the above.
>
> The attached patch silences the warnings, because we *know* we're overwriting
> the default initializer. That leaves bpf/core.c with only 6 other warnings,
> which become more visible in comparison.
>
> Signed-off-by: Valdis Kletnieks 

Thanks! Please include bpf-next in [PATCH] prefix in the future. I've
also CC'ed b...@vger.kernel.org list.

Acked-by: Andrii Nakryiko 

>
> diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
> index 4c2fa3ac56f6..2606665f2cb5 100644
> --- a/kernel/bpf/Makefile
> +++ b/kernel/bpf/Makefile
> @@ -21,3 +21,4 @@ obj-$(CONFIG_CGROUP_BPF) += cgroup.o
>  ifeq ($(CONFIG_INET),y)
>  obj-$(CONFIG_BPF_SYSCALL) += reuseport_array.o
>  endif
> +CFLAGS_core.o  += $(call cc-disable-warning, override-init)
>
>
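
For readers unfamiliar with the warning, the pattern that triggers it can be
reproduced in a few lines. This is a minimal sketch with made-up opcode
values, not the table from kernel/bpf/core.c; it relies on the GNU range
designator extension, which GCC and Clang both accept.

```c
#include <stdbool.h>

/* Hypothetical opcode values, chosen only for the example. */
enum { OP_ADD = 0x04, OP_SUB = 0x14, OP_COUNT = 256 };

static const bool op_table[OP_COUNT] = {
	/* Default for every slot (GNU range designator). */
	[0 ... OP_COUNT - 1] = false,
	/* Deliberate overrides of the default: these are the lines that
	 * -Woverride-init reports as "initialized field overwritten". */
	[OP_ADD] = true,
	[OP_SUB] = true,
};

static bool op_is_valid(unsigned char op)
{
	return op_table[op];
}
```

The later initializer wins, so the table behaves as intended; the warning
only flags that a slot was initialized twice.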


[PATCH v13 2/2] pwm: sifive: Add a driver for SiFive SoC PWM

2019-06-10 Thread Yash Shah
Adds a PWM driver for PWM chip present in SiFive's HiFive Unleashed SoC.

Signed-off-by: Wesley W. Terpstra 
[Atish: Various fixes and code cleanup]
Signed-off-by: Atish Patra 
Signed-off-by: Yash Shah 
---
 drivers/pwm/Kconfig  |  11 ++
 drivers/pwm/Makefile |   1 +
 drivers/pwm/pwm-sifive.c | 339 +++
 3 files changed, 351 insertions(+)
 create mode 100644 drivers/pwm/pwm-sifive.c

diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
index 1311b540..f7eacac 100644
--- a/drivers/pwm/Kconfig
+++ b/drivers/pwm/Kconfig
@@ -400,6 +400,17 @@ config PWM_SAMSUNG
  To compile this driver as a module, choose M here: the module
  will be called pwm-samsung.
 
+config PWM_SIFIVE
+   tristate "SiFive PWM support"
+   depends on OF
+   depends on COMMON_CLK
+   depends on RISCV || COMPILE_TEST
+   help
+ Generic PWM framework driver for SiFive SoCs.
+
+ To compile this driver as a module, choose M here: the module
+ will be called pwm-sifive.
+
 config PWM_SPEAR
tristate "STMicroelectronics SPEAr PWM support"
depends on PLAT_SPEAR
diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
index c368599..76b555b 100644
--- a/drivers/pwm/Makefile
+++ b/drivers/pwm/Makefile
@@ -39,6 +39,7 @@ obj-$(CONFIG_PWM_RCAR)+= pwm-rcar.o
 obj-$(CONFIG_PWM_RENESAS_TPU)  += pwm-renesas-tpu.o
 obj-$(CONFIG_PWM_ROCKCHIP) += pwm-rockchip.o
 obj-$(CONFIG_PWM_SAMSUNG)  += pwm-samsung.o
+obj-$(CONFIG_PWM_SIFIVE)   += pwm-sifive.o
 obj-$(CONFIG_PWM_SPEAR)+= pwm-spear.o
 obj-$(CONFIG_PWM_STI)  += pwm-sti.o
 obj-$(CONFIG_PWM_STM32)+= pwm-stm32.o
diff --git a/drivers/pwm/pwm-sifive.c b/drivers/pwm/pwm-sifive.c
new file mode 100644
index 000..a7c107f
--- /dev/null
+++ b/drivers/pwm/pwm-sifive.c
@@ -0,0 +1,339 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2017-2018 SiFive
+ * For SiFive's PWM IP block documentation please refer Chapter 14 of
+ * Reference Manual : https://static.dev.sifive.com/FU540-C000-v1.0.pdf
+ *
+ * Limitations:
+ * - When changing both duty cycle and period, we cannot prevent in
+ *   software that the output might produce a period with mixed
+ *   settings (new period length and old duty cycle).
+ * - The hardware cannot generate a 100% duty cycle.
+ * - The hardware generates only inverted output.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* Register offsets */
+#define PWM_SIFIVE_PWMCFG              0x0
+#define PWM_SIFIVE_PWMCOUNT            0x8
+#define PWM_SIFIVE_PWMS                0x10
+#define PWM_SIFIVE_PWMCMP0             0x20
+
+/* PWMCFG fields */
+#define PWM_SIFIVE_PWMCFG_SCALE        GENMASK(3, 0)
+#define PWM_SIFIVE_PWMCFG_STICKY       BIT(8)
+#define PWM_SIFIVE_PWMCFG_ZERO_CMP     BIT(9)
+#define PWM_SIFIVE_PWMCFG_DEGLITCH     BIT(10)
+#define PWM_SIFIVE_PWMCFG_EN_ALWAYS    BIT(12)
+#define PWM_SIFIVE_PWMCFG_EN_ONCE      BIT(13)
+#define PWM_SIFIVE_PWMCFG_CENTER       BIT(16)
+#define PWM_SIFIVE_PWMCFG_GANG         BIT(24)
+#define PWM_SIFIVE_PWMCFG_IP           BIT(28)
+
+/* PWM_SIFIVE_SIZE_PWMCMP is used to calculate offset for pwmcmpX registers */
+#define PWM_SIFIVE_SIZE_PWMCMP         4
+#define PWM_SIFIVE_CMPWIDTH            16
+#define PWM_SIFIVE_DEFAULT_PERIOD      1000
+
+struct pwm_sifive_ddata {
+   struct pwm_chip chip;
+   struct mutex lock; /* lock to protect user_count */
+   struct notifier_block notifier;
+   struct clk *clk;
+   void __iomem *regs;
+   unsigned int real_period;
+   unsigned int approx_period;
+   int user_count;
+};
+
+static inline
+struct pwm_sifive_ddata *pwm_sifive_chip_to_ddata(struct pwm_chip *c)
+{
+   return container_of(c, struct pwm_sifive_ddata, chip);
+}
+
+static int pwm_sifive_request(struct pwm_chip *chip, struct pwm_device *pwm)
+{
+   struct pwm_sifive_ddata *ddata = pwm_sifive_chip_to_ddata(chip);
+
+   mutex_lock(&ddata->lock);
+   ddata->user_count++;
+   mutex_unlock(&ddata->lock);
+
+   return 0;
+}
+
+static void pwm_sifive_free(struct pwm_chip *chip, struct pwm_device *pwm)
+{
+   struct pwm_sifive_ddata *ddata = pwm_sifive_chip_to_ddata(chip);
+
+   mutex_lock(&ddata->lock);
+   ddata->user_count--;
+   mutex_unlock(&ddata->lock);
+}
+
+static void pwm_sifive_update_clock(struct pwm_sifive_ddata *ddata,
+   unsigned long rate)
+{
+   unsigned long long num;
+   unsigned long scale_pow;
+   int scale;
+   u32 val;
+   /*
+* The PWM unit is used with pwmzerocmp=0, so the only way to modify the
+* period length is using pwmscale which provides the number of bits the
+* counter is shifted before being feed to the comparators. A period
+* lasts (1 << (PWM_SIFIVE_CMPWIDTH + pwmscale)) clock ticks.
+* (1 << (PWM_SIFIVE_CMPWIDTH + scale)) * 
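
The period relation stated in the comment above can be worked through with a
small user-space sketch. The helper name and the nanosecond conversion are
assumptions for illustration; only the "(1 << (CMPWIDTH + pwmscale)) clock
ticks" relation comes from the text.

```c
#include <stdint.h>

#define MODEL_CMPWIDTH     16
#define MODEL_NSEC_PER_SEC UINT64_C(1000000000)

/*
 * Period in nanoseconds for an input clock of rate_hz and a pwmscale
 * value in the range 0..15. Integer division truncates the result.
 */
static uint64_t pwm_period_ns(uint64_t rate_hz, unsigned int scale)
{
	uint64_t ticks = UINT64_C(1) << (MODEL_CMPWIDTH + scale);

	return ticks * MODEL_NSEC_PER_SEC / rate_hz;
}
```

Each increment of the scale doubles the period, which is why only a coarse
set of periods is achievable for a given clock rate.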

[PATCH v13 1/2] pwm: sifive: Add DT documentation for SiFive PWM Controller

2019-06-10 Thread Yash Shah
DT documentation for PWM controller added.

Signed-off-by: Wesley W. Terpstra 
[Atish: Compatible string update]
Signed-off-by: Atish Patra 
Signed-off-by: Yash Shah 
Reviewed-by: Rob Herring 
---
 .../devicetree/bindings/pwm/pwm-sifive.txt | 33 ++
 1 file changed, 33 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pwm/pwm-sifive.txt

diff --git a/Documentation/devicetree/bindings/pwm/pwm-sifive.txt b/Documentation/devicetree/bindings/pwm/pwm-sifive.txt
new file mode 100644
index 000..36447e3
--- /dev/null
+++ b/Documentation/devicetree/bindings/pwm/pwm-sifive.txt
@@ -0,0 +1,33 @@
+SiFive PWM controller
+
+Unlike most other PWM controllers, the SiFive PWM controller currently only
+supports one period for all channels in the PWM. All PWMs need to run at
+the same period. The period also has significant restrictions on the values
+it can achieve, which the driver rounds to the nearest achievable period.
+PWM RTL that corresponds to the IP block version numbers can be found
+here:
+
+https://github.com/sifive/sifive-blocks/tree/master/src/main/scala/devices/pwm
+
+Required properties:
+- compatible: Should be "sifive,<chip>-pwm" and "sifive,pwm<version>".
+  Supported compatible strings are: "sifive,fu540-c000-pwm" for the SiFive
+  PWM v0 as integrated onto the SiFive FU540 chip, and "sifive,pwm0" for the
+  SiFive PWM v0 IP block with no chip integration tweaks.
+  Please refer to sifive-blocks-ip-versioning.txt for details.
+- reg: physical base address and length of the controller's registers
+- clocks: Should contain a clock identifier for the PWM's parent clock.
+- #pwm-cells: Should be 3. See pwm.txt in this directory
+  for a description of the cell format.
+- interrupts: one interrupt per PWM channel
+
+Examples:
+
+pwm:  pwm@1002 {
+   compatible = "sifive,fu540-c000-pwm", "sifive,pwm0";
+   reg = <0x0 0x1002 0x0 0x1000>;
+   clocks = <>;
+   interrupt-parent = <>;
+   interrupts = <42 43 44 45>;
+   #pwm-cells = <3>;
+};
-- 
1.9.1



[PATCH v13 0/2] PWM support for HiFive Unleashed

2019-06-10 Thread Yash Shah
This patch series adds a PWM driver and DT documentation
for HiFive Unleashed board. The patches are mostly based on
Wesley's patch.

This patchset is based on Linux v5.2-rc1 and tested on the HiFive Unleashed
board. The additional board-related patches needed for testing can be
found on the dev/yashs/pwm_v13 branch of:
https://github.com/yashshah7/riscv-linux.git

v13
- Rebased onto Mainline v5.2-rc1
- Correct the order of pwmchip_remove() after clk_disable() in .remove()

v12
- Rebased onto Mainline v5.1

v11
- Change naming convention for pwm_device and pwm_sifive_ddata pointers
- Assign of_pwm_xlate_with_flag() to of_xlate func ptr since this driver
  uses three pwm-cells (Issue reported by Andreas Schwab)
- Other minor fixes

v10
- Use DIV_ROUND_CLOSEST_ULL instead of div_u64_round
- Change 'num' definition to u64 (in pwm_sifive_apply).
- Remove the usage of pwm_get_state()

v9
- Use appropriate bitfield macros
- Add approx_period in pwm_sifive_ddata struct and related changes
- Correct the eqn for calculation of frac (in pwm_sifive_apply)
- Other minor fixes

v8
- Typo corrections
- Remove active_user and related code
- Do not clear PWM_SIFIVE_PWMCFG_EN_ALWAYS
- Other minor fixes

v7
- Modify description of compatible property in DT documentation
- Use mutex locks at appropriate places
- Fix all bad line breaks
- Allow enabling/disabling PWM only when the user is the only active user
- Remove Deglitch logic
- Other minor fixes

v6
- Remove the global property 'sifive,period-ns'
- Implement free and request callbacks to maintain user counts.
- Add user_count member to struct pwm_sifive_ddata
- Allow period change only if user_count is one
- Add pwm_sifive_enable function to enable/disable PWM
- Change calculation logic of frac (in pwm_sifive_apply)
- Remove state correction
- Remove pwm_sifive_xlate function
- Clock to be enabled only when PWM is enabled
- Other minor fixes

v5
- Correct the order of compatible string properties
- PWM state correction to be done always
- Other minor fixes based upon feedback on v4

v4
- Rename macros with appropriate names
- Remove unused macros
- Rename struct sifive_pwm_device to struct pwm_sifive_ddata
- Rename function prefix as per driver name
- Other minor fixes based upon feedback on v3

v3
- Add a link to the reference manual
- Use appropriate apis for division operation
- Add check for polarity
- Enable clk before calling clk_get_rate
- Other minor fixes based upon feedback on v2

V2 changed from V1:
- Remove inclusion of dt-bindings/pwm/pwm.h
- Remove artificial alignments
- Replace ioread32/iowrite32 with readl/writel
- Remove camelcase
- Change dev_info to dev_dbg for unnecessary log
- Correct typo in driver name
- Remove use of of_match_ptr macro
- Update the DT compatible strings and Add reference to a common
  versioning document

Yash Shah (2):
  pwm: sifive: Add DT documentation for SiFive PWM Controller
  pwm: sifive: Add a driver for SiFive SoC PWM

 .../devicetree/bindings/pwm/pwm-sifive.txt |  33 ++
 drivers/pwm/Kconfig|  11 +
 drivers/pwm/Makefile   |   1 +
 drivers/pwm/pwm-sifive.c   | 339 +
 4 files changed, 384 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pwm/pwm-sifive.txt
 create mode 100644 drivers/pwm/pwm-sifive.c

-- 
1.9.1



Re: [PATCH 00/25] clk: sunxi-ng: clk parent rewrite part 1

2019-06-10 Thread Chen-Yu Tsai
On Sat, Jun 8, 2019 at 2:46 AM Stephen Boyd  wrote:
>
> Quoting Chen-Yu Tsai (2019-06-03 09:38:22)
> > Hi Stephen,
> >
> > On Mon, May 20, 2019 at 5:03 PM Maxime Ripard  wrote:
> > >
> > > On Mon, May 20, 2019 at 04:03:56PM +0800, Chen-Yu Tsai wrote:
> > > > From: Chen-Yu Tsai 
> > > >
> > > > Hi everyone,
> > > >
> > > > This series is the first part of a large series (I haven't done the
> > > > rest) of patches to rewrite the clk parent relationship handling within
> > > > the sunxi-ng clk driver. This is based on Stephen's recent work allowing
> > > > clk drivers to specify clk parents using struct clk_hw * or parsing DT
> > > > phandles in the clk node.
> > > >
> > > > This series can be split into a few major parts:
> > > >
> > > > 1) The first patch is a small fix for clk debugfs representation. This
> > > >was done before commit 1a079560b145 ("clk: Cache core in
> > > >clk_fetch_parent_index() without names") was posted, so it might or
> > > >might not be needed. Found this when checking my work using
> > > >clk_possible_parents.
> > > >
> > > > 2) A bunch of CLK_HW_INIT_* helper macros are added. These cover the
> > > >situations I encountered, or assume I will encounter, such as single
> > > >internal (struct clk_hw *) parent, single DT (struct clk_parent_data
> > > >.fw_name), multiple internal parents, and multiple mixed (internal +
> > > >DT) parents. A special variant for just an internal single parent is
> > > >added, CLK_HW_INIT_HWS, which lets the driver share the singular
> > > >list, instead of having the compiler create a compound literal every
> > > >time. It might even make sense to only keep this variant.
> > > >
> > > > 3) A bunch of CLK_FIXED_FACTOR_* helper macros are added. The rationale
> > > >is the same as the single parent CLK_HW_INIT_* helpers.
> > > >
> > > > 4) Bulk conversion of CLK_FIXED_FACTOR to use local parent references,
> > > >either struct clk_hw * or DT .fw_name types, whichever the hardware
> > > >requires.
> > > >
> > > > 5) The beginning of SUNXI_CCU_GATE conversion to local parent
> > > >references. This part is not done. They are included as justification
> > > >and examples for the shared list of clk parents case.
> > >
> > > That series is pretty neat. As far as sunxi is concerned, you can add my
> > > Acked-by: Maxime Ripard 
> > >
> > > > I realize this is going to be many patches every time I convert a clock
> > > > type. Going forward would the people involved prefer I send out
> > > > individual patches like this series, or squash them all together?
> > >
> > > For bisection, I guess it would be good to keep the approach you've
> > > had in this series. If this is really too much, I guess we can always
> > > change our mind later on.
> >
> > Any thoughts on this series and how to proceed?
> >
>
> I have a few minor nitpicks but otherwise the series looks good to me.
> I'm perfectly happy to see the individual patches unless you want to
> squash them into one big patch. I can review the conversions either way.

OK. Based on your and Maxime's response, I'll send them individually.

> Did you need me to apply any patches here? Or can I assume you'll resend
> with a pull request so it can be merged into clk-next?

I can send them as part of our normal pull request. Or did you want this
as a separate topic?

I'll still send out a v2 to cover your review comments.

> BTW, did you have to update any DT bindings or documentation? I didn't
> see anything, so I'm a little surprised that all that stuff was already
> in place.

The bindings had the clocks / clock-names all defined, and the DT all had
the properties, because we had already gone through one rewrite. It's just
the driver didn't follow them properly, because the parents were cross
node / driver, and we had these statically initialized parent name arrays.

I had started work on having the driver rewrite the parents lists based on
fetching clock names via DT, but it was far from elegant. Then this came
up. :)


Regards
ChenYu


linux-next: manual merge of the clockevents tree with Linus' tree

2019-06-10 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the clockevents tree got a conflict in:

  drivers/clocksource/timer-tegra.c

between commit:

  9c92ab619141 ("treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282")

from Linus' tree and commit:

  75e9f7c6dca8 ("clocksource/drivers/tegra: Use SPDX identifier")

from the clockevents tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/clocksource/timer-tegra.c
index 1e7ece279730,9406855781ff..
--- a/drivers/clocksource/timer-tegra.c
+++ b/drivers/clocksource/timer-tegra.c
@@@ -1,11 -1,11 +1,11 @@@
 -// SPDX-License-Identifier: GPL-2.0
 +// SPDX-License-Identifier: GPL-2.0-only
  /*
   * Copyright (C) 2010 Google, Inc.
-  *
-  * Author:
-  *Colin Cross 
+  * Author: Colin Cross 
   */
  
+ #define pr_fmt(fmt)   "tegra-timer: " fmt
+ 
  #include 
  #include 
  #include 




Re: [PATCH v2 4/4] watchdog: jz4740: Make probe function __init_or_module

2019-06-10 Thread Christophe Leroy

Hi Paul,

On 08/06/2019 at 11:57, Paul Cercueil wrote:

Hi Christophe,

On Sat, 8 June 2019 at 9:51, Christophe Leroy wrote:

Hi Paul,

On 07/06/2019 at 18:24, Paul Cercueil wrote:

This allows the probe function to be dropped after the kernel finished
its initialization, in the case where the driver was not compiled as a
module.


I'm not sure that's what  __init_or_module flag does.

As far as I understand, this flag causes the function to be dropped 
only when the kernel is built without module support, i.e. without 
CONFIG_MODULES. See 
https://elixir.bootlin.com/linux/latest/source/include/linux/module.h#L145 



So it doesn't depend on the driver being built-in or compiled as a module?


No it doesn't. This flag is for built-in functions that are needed by 
init and modules init. If the kernel doesn't support modules, it can 
drop the function after init. If the kernel supports modules, it has to 
keep the function. That's what this flag is made for.


If your need is to mark a function so that it gets discarded after init 
or module init, just mark it __init. If it is built in, it will be 
dropped after init. If it is in a module, it will be dropped after the 
module is initialised.
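
As a rough illustration of the distinction described above, here is a userspace sketch of how the annotation is selected. All SKETCH_ names and the string-valued macros are assumptions made for illustration only; the real kernel macros expand to section attributes, not strings, and the real switch is CONFIG_MODULES.

```c
/* Toggle to model a kernel built with (1) or without (0) module support. */
#ifndef SKETCH_CONFIG_MODULES
#define SKETCH_CONFIG_MODULES 1
#endif

/* Hypothetical stand-ins for "lives in a discardable init section" or not. */
#define SKETCH_INIT "discarded after init"
#define SKETCH_KEEP "kept for the lifetime of the kernel"

#if SKETCH_CONFIG_MODULES
/* A module loaded later may still need the function, so it must stay. */
#define SKETCH_INIT_OR_MODULE SKETCH_KEEP
#else
/* No module can ever be loaded, so the function behaves like plain __init. */
#define SKETCH_INIT_OR_MODULE SKETCH_INIT
#endif

/* Reports what happens to a function tagged with the modelled annotation. */
static const char *sketch_init_or_module_fate(void)
{
	return SKETCH_INIT_OR_MODULE;
}
```

Note that the outcome depends only on whether the kernel supports modules at all, not on whether the driver itself is built in or modular, which is the point made above.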




In addition, I'm not sure you can simply define a probe function as 
__init. What if someone tries to unbind and rebind the device through 
sysfs for instance ?


Ouch. I feel stupid now.

It seems there is a special function called __platform_driver_probe() 
for registering devices when the probe function is to be in __init, 
see 
https://elixir.bootlin.com/linux/latest/source/drivers/base/platform.c#L684 



Yes, but only usable by drivers that won't defer probe, and it removes 
the bind/unbind attributes from sysfs,

so it shouldn't be used for non-critical drivers, I think.


I guess it would make sense for watchdog drivers, we don't expect this 
kind of driver to be unbound, do we?


Christophe




Christophe



Signed-off-by: Paul Cercueil 
---

Notes:
 v2: New patch

  drivers/watchdog/jz4740_wdt.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/watchdog/jz4740_wdt.c 
b/drivers/watchdog/jz4740_wdt.c

index 7519d80c5d05..2061788c1939 100644
--- a/drivers/watchdog/jz4740_wdt.c
+++ b/drivers/watchdog/jz4740_wdt.c
@@ -157,7 +157,7 @@ static const struct of_device_id 
jz4740_wdt_of_matches[] = {

  MODULE_DEVICE_TABLE(of, jz4740_wdt_of_matches);
  #endif
  -static int jz4740_wdt_probe(struct platform_device *pdev)
+static int __init_or_module jz4740_wdt_probe(struct platform_device 
*pdev)

  {
  struct device *dev = &pdev->dev;
  struct jz4740_wdt_drvdata *drvdata;





Re: [RFC V3] mm: Generalize and rename notify_page_fault() as kprobe_page_fault()

2019-06-10 Thread Anshuman Khandual



On 06/11/2019 10:16 AM, Christophe Leroy wrote:
> 
> 
> On 10/06/2019 at 04:39, Anshuman Khandual wrote:
>>
>>
>> On 06/07/2019 09:01 PM, Christophe Leroy wrote:
>>>
>>>
>>> On 07/06/2019 at 12:34, Anshuman Khandual wrote:
 Very similar definitions for notify_page_fault() are being used by multiple
 architectures duplicating much of the same code. This attempts to unify all
 of them into a generic implementation, rename it as kprobe_page_fault() and
 then move it to a common header.

 kprobes_built_in() can detect CONFIG_KPROBES, hence new kprobe_page_fault()
 need not be wrapped again within CONFIG_KPROBES. Trap number argument can
 now contain up to an 'unsigned int' accommodating all possible platforms.

 kprobe_page_fault() goes the x86 way while dealing with preemption context.
 As explained in these following commits the invoking context in itself must
 be non-preemptible for kprobes processing context irrespective of whether
 kprobe_running() or perhaps smp_processor_id() is safe or not. It does not
 make much sense to continue when original context is preemptible. Instead
 just bail out earlier.

 commit a980c0ef9f6d
 ("x86/kprobes: Refactor kprobes_fault() like kprobe_exceptions_notify()")

 commit b506a9d08bae ("x86: code clarification patch to Kprobes arch code")

 Cc: linux-arm-ker...@lists.infradead.org
 Cc: linux-i...@vger.kernel.org
 Cc: linuxppc-...@lists.ozlabs.org
 Cc: linux-s...@vger.kernel.org
 Cc: linux...@vger.kernel.org
 Cc: sparcli...@vger.kernel.org
 Cc: x...@kernel.org
 Cc: Andrew Morton 
 Cc: Michal Hocko 
 Cc: Matthew Wilcox 
 Cc: Mark Rutland 
 Cc: Christophe Leroy 
 Cc: Stephen Rothwell 
 Cc: Andrey Konovalov 
 Cc: Michael Ellerman 
 Cc: Paul Mackerras 
 Cc: Russell King 
 Cc: Catalin Marinas 
 Cc: Will Deacon 
 Cc: Tony Luck 
 Cc: Fenghua Yu 
 Cc: Martin Schwidefsky 
 Cc: Heiko Carstens 
 Cc: Yoshinori Sato 
 Cc: "David S. Miller" 
 Cc: Thomas Gleixner 
 Cc: Peter Zijlstra 
 Cc: Ingo Molnar 
 Cc: Andy Lutomirski 
 Cc: Dave Hansen 

 Signed-off-by: Anshuman Khandual 
 ---
 Testing:

 - Build and boot tested on arm64 and x86
 - Build tested on some other archs (arm, sparc64, alpha, powerpc etc)

 Changes in RFC V3:

 - Updated the commit message with an explanation for new preemption 
 behaviour
 - Moved notify_page_fault() to kprobes.h with 'static nokprobe_inline' per 
 Matthew
 - Changed notify_page_fault() return type from int to bool per Michael 
 Ellerman
 - Renamed notify_page_fault() as kprobe_page_fault() per Peterz

 Changes in RFC V2: (https://patchwork.kernel.org/patch/10974221/)

 - Changed generic notify_page_fault() per Matthew Wilcox
 - Changed x86 to use new generic notify_page_fault()
 - s/must not/need not/ in commit message per Matthew Wilcox

 Changes in RFC V1: (https://patchwork.kernel.org/patch/10968273/)

    arch/arm/mm/fault.c  | 24 +---
    arch/arm64/mm/fault.c    | 24 +---
    arch/ia64/mm/fault.c | 24 +---
    arch/powerpc/mm/fault.c  | 23 ++-
    arch/s390/mm/fault.c | 16 +---
    arch/sh/mm/fault.c   | 18 ++
    arch/sparc/mm/fault_64.c | 16 +---
    arch/x86/mm/fault.c  | 21 ++---
    include/linux/kprobes.h  | 16 
    9 files changed, 27 insertions(+), 155 deletions(-)

>>>
>>> [...]
>>>
 diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
 index 443d980..064dd15 100644
 --- a/include/linux/kprobes.h
 +++ b/include/linux/kprobes.h
 @@ -458,4 +458,20 @@ static inline bool is_kprobe_optinsn_slot(unsigned 
 long addr)
    }
    #endif
    +static nokprobe_inline bool kprobe_page_fault(struct pt_regs *regs,
 +  unsigned int trap)
 +{
 +    int ret = 0;
>>>
>>> ret is pointless.
>>>
 +
 +    /*
 + * To be potentially processing a kprobe fault and to be allowed
 + * to call kprobe_running(), we have to be non-preemptible.
 + */
 +    if (kprobes_built_in() && !preemptible() && !user_mode(regs)) {
 +    if (kprobe_running() && kprobe_fault_handler(regs, trap))
>>>
>>> don't need an 'if A if B', can do 'if A && B'
>>
>> Which will make it a very lengthy condition check.
> 
> Yes. But is that a problem at all ?

Probably not.

> 
> For me the following would be easier to read.
> 
> if (kprobes_built_in() && !preemptible() && !user_mode(regs) &&
>     kprobe_running() && kprobe_fault_handler(regs, trap))
> ret = 1;

As mentioned before will stick with current x86 implementation. 


Re: [RFC V3] mm: Generalize and rename notify_page_fault() as kprobe_page_fault()

2019-06-10 Thread Anshuman Khandual



On 06/10/2019 08:57 PM, Leonardo Bras wrote:
> On Mon, 2019-06-10 at 08:09 +0530, Anshuman Khandual wrote:
 +/*
 + * To be potentially processing a kprobe fault and to be allowed
 + * to call kprobe_running(), we have to be non-preemptible.
 + */
 +if (kprobes_built_in() && !preemptible() && !user_mode(regs)) {
 +if (kprobe_running() && kprobe_fault_handler(regs, trap))
>>>
>>> don't need an 'if A if B', can do 'if A && B'
>>
>> Which will make it a very lengthy condition check.
> 
> Well, is there any problem line-breaking the if condition?
> 
> if (A && B && C &&
> D && E )
> 
> Also, if it's used only to decide the return value, maybe it would be fine
> to do something like that:
> 
> return (A && B && C &&
> D && E ); 

Got it. But as Dave and Matthew had pointed out earlier, the current x86
implementation has better readability. Hence will probably stick with it.
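
For reference, the two shapes being debated can be compared in a small userspace sketch. The struct and its fields are hypothetical stand-ins for kprobes_built_in(), preemptible(), user_mode(regs), kprobe_running() and kprobe_fault_handler(regs, trap); nothing here is kernel API. Because && short-circuits left to right, both forms evaluate the predicates in the same order and give the same result.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the five predicates discussed in the thread. */
struct fault_ctx {
	bool built_in;     /* models kprobes_built_in() */
	bool preemptible;  /* models preemptible() */
	bool user_mode;    /* models user_mode(regs) */
	bool running;      /* models kprobe_running() */
	bool handled;      /* models kprobe_fault_handler(regs, trap) */
};

/* Nested form, shaped like the x86-style code in the patch. */
static bool fault_nested(const struct fault_ctx *c)
{
	bool ret = false;

	if (c->built_in && !c->preemptible && !c->user_mode) {
		if (c->running && c->handled)
			ret = true;
	}
	return ret;
}

/* Flat form, shaped like the single line-broken condition suggested in review. */
static bool fault_flat(const struct fault_ctx *c)
{
	return c->built_in && !c->preemptible && !c->user_mode &&
	       c->running && c->handled;
}

/* Counts how many of the 32 possible inputs make the two forms disagree. */
static int count_mismatches(void)
{
	int bad = 0;

	for (int bits = 0; bits < 32; bits++) {
		struct fault_ctx c = {
			.built_in    = (bits & 1) != 0,
			.preemptible = (bits & 2) != 0,
			.user_mode   = (bits & 4) != 0,
			.running     = (bits & 8) != 0,
			.handled     = (bits & 16) != 0,
		};

		if (fault_nested(&c) != fault_flat(&c))
			bad++;
	}
	return bad;
}
```

So the choice between the two is purely one of readability, which matches how the thread treats it.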


Re: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

2019-06-10 Thread Borislav Petkov
On Fri, Jun 07, 2019 at 06:37:23PM +0200, Borislav Petkov wrote:
> On Fri, Jun 07, 2019 at 02:49:42PM +, Ghannam, Yazen wrote:
> > Would you mind if the function name stayed the same? The reason is
> > that MCA_CTL is written here, which is the "init" part, and MCA_STATUS
> > is cleared.
> >
> > I can use another name for the check, e.g. __mcheck_cpu_check_banks()
> > or __mcheck_cpu_banks_check_init().
> 
> Nevermind, leave it as is. I'll fix it up ontop. I don't like that
> "__mcheck_cpu_init" prefixing there which is a mouthful and should
> simply be "mce_cpu_" to denote that it is a function which is
> run on a CPU to setup stuff.

So I'm staring at this and I can't say that I'm getting any good ideas:

I wanna get rid of that ugly "__mcheck_cpu_" prefix but the replacements
I can think of right now, are crap:

* I can call them all "cpu_" but then they look like generic
cpu-setup functions which come from kernel/cpu.c or so.

* I can prefix them with "mce_cpu" but when you do them all, it becomes
a block of "mce_cpu_" stuff which ain't more readable either. And
besides, those are static functions so they shouldn't need the prefix.
But I'd like the naming to denote that they're doing per-CPU setup
stuff. Which brings me to the previous point.

So no, don't have a good idea yet...

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH v2 2/7] scsi: NCR5380: Always re-enable reselection interrupt

2019-06-10 Thread Michael Schmitz

Hi Finn,

IIRC I'd tested that change as well - didn't change broken target 
behaviour but no regressions in other respects. Add my tested-by if needed.


Cheers,

Michael


On 09.06.2019 at 13:19, Finn Thain wrote:

The reselection interrupt gets disabled during selection and must be
re-enabled when hostdata->connected becomes NULL. If it isn't re-enabled
a disconnected command may time-out or the target may wedge the bus while
trying to reselect the host. This can happen after a command is aborted.

Fix this by enabling the reselection interrupt in NCR5380_main() after
calls to NCR5380_select() and NCR5380_information_transfer() return.

Cc: Michael Schmitz 
Cc: sta...@vger.kernel.org # v4.9+
Fixes: 8b00c3d5d40d ("ncr5380: Implement new eh_abort_handler")
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/scsi/NCR5380.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index fe0535affc14..08e3ea8159b3 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -709,6 +709,8 @@ static void NCR5380_main(struct work_struct *work)
NCR5380_information_transfer(instance);
done = 0;
}
+   if (!hostdata->connected)
+   NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);
spin_unlock_irq(&hostdata->lock);
if (!done)
cond_resched();
@@ -1110,8 +1112,6 @@ static bool NCR5380_select(struct Scsi_Host *instance, 
struct scsi_cmnd *cmd)
spin_lock_irq(&hostdata->lock);
NCR5380_write(INITIATOR_COMMAND_REG, ICR_BASE);
NCR5380_reselect(instance);
-   if (!hostdata->connected)
-   NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);
shost_printk(KERN_ERR, instance, "reselection after won 
arbitration?\n");
goto out;
}
@@ -1119,7 +1119,6 @@ static bool NCR5380_select(struct Scsi_Host *instance, 
struct scsi_cmnd *cmd)
if (err < 0) {
spin_lock_irq(&hostdata->lock);
NCR5380_write(INITIATOR_COMMAND_REG, ICR_BASE);
-   NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);

/* Can't touch cmd if it has been reclaimed by the scsi ML */
if (!hostdata->selecting)
@@ -1157,7 +1156,6 @@ static bool NCR5380_select(struct Scsi_Host *instance, 
struct scsi_cmnd *cmd)
if (err < 0) {
shost_printk(KERN_ERR, instance, "select: REQ timeout\n");
NCR5380_write(INITIATOR_COMMAND_REG, ICR_BASE);
-   NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);
goto out;
}
if (!hostdata->selecting) {
@@ -1826,9 +1824,6 @@ static void NCR5380_information_transfer(struct Scsi_Host 
*instance)
 */
NCR5380_write(TARGET_COMMAND_REG, 0);

-   /* Enable reselect interrupts */
-   NCR5380_write(SELECT_ENABLE_REG, 
hostdata->id_mask);
-
maybe_release_dma_irq(instance);
return;
case MESSAGE_REJECT:
@@ -1860,8 +1855,6 @@ static void NCR5380_information_transfer(struct Scsi_Host 
*instance)
 */
NCR5380_write(TARGET_COMMAND_REG, 0);

-   /* Enable reselect interrupts */
-   NCR5380_write(SELECT_ENABLE_REG, 
hostdata->id_mask);
 #ifdef SUN3_SCSI_VME
dregs->csr |= CSR_DMA_ENABLE;
 #endif
@@ -1964,7 +1957,6 @@ static void NCR5380_information_transfer(struct Scsi_Host 
*instance)
cmd->result = DID_ERROR << 16;
complete_cmd(instance, cmd);
maybe_release_dma_irq(instance);
-   NCR5380_write(SELECT_ENABLE_REG, 
hostdata->id_mask);
return;
}
msgout = NOP;



Re: [PATCH] NCR5380: Support chained sg lists

2019-06-10 Thread Michael Schmitz

Hi Finn,

Thanks - can't test this on my hardware but looks good to me.

Cheers,

Michael

On 11.06.2019 at 15:25, Finn Thain wrote:

My understanding is that support for chained scatterlists is to
become mandatory for LLDs.

Cc: Michael Schmitz 
Signed-off-by: Finn Thain 
---
 drivers/scsi/NCR5380.c | 41 ++---
 1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index d9fa9cf2fd8b..536426f25e86 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -149,12 +149,10 @@ static inline void initialize_SCp(struct scsi_cmnd *cmd)

if (scsi_bufflen(cmd)) {
cmd->SCp.buffer = scsi_sglist(cmd);
-   cmd->SCp.buffers_residual = scsi_sg_count(cmd) - 1;
cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
cmd->SCp.this_residual = cmd->SCp.buffer->length;
} else {
cmd->SCp.buffer = NULL;
-   cmd->SCp.buffers_residual = 0;
cmd->SCp.ptr = NULL;
cmd->SCp.this_residual = 0;
}
@@ -163,6 +161,17 @@ static inline void initialize_SCp(struct scsi_cmnd *cmd)
cmd->SCp.Message = 0;
 }

+static inline void advance_sg_buffer(struct scsi_cmnd *cmd)
+{
+   struct scatterlist *s = cmd->SCp.buffer;
+
+   if (!cmd->SCp.this_residual && s && !sg_is_last(s)) {
+   cmd->SCp.buffer = sg_next(s);
+   cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
+   cmd->SCp.this_residual = cmd->SCp.buffer->length;
+   }
+}
+
 /**
  * NCR5380_poll_politely2 - wait for two chip register values
  * @hostdata: host private data
@@ -1670,12 +1679,7 @@ static void NCR5380_information_transfer(struct 
Scsi_Host *instance)
sun3_dma_setup_done != cmd) {
int count;

-   if (!cmd->SCp.this_residual && 
cmd->SCp.buffers_residual) {
-   ++cmd->SCp.buffer;
-   --cmd->SCp.buffers_residual;
-   cmd->SCp.this_residual = 
cmd->SCp.buffer->length;
-   cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
-   }
+   advance_sg_buffer(cmd);

count = sun3scsi_dma_xfer_len(hostdata, cmd);

@@ -1725,15 +1729,11 @@ static void NCR5380_information_transfer(struct 
Scsi_Host *instance)
 * scatter-gather list, move onto the next one.
 */

-   if (!cmd->SCp.this_residual && 
cmd->SCp.buffers_residual) {
-   ++cmd->SCp.buffer;
-   --cmd->SCp.buffers_residual;
-   cmd->SCp.this_residual = 
cmd->SCp.buffer->length;
-   cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
-   dsprintk(NDEBUG_INFORMATION, instance, "%d 
bytes and %d buffers left\n",
-cmd->SCp.this_residual,
-cmd->SCp.buffers_residual);
-   }
+   advance_sg_buffer(cmd);
+   dsprintk(NDEBUG_INFORMATION, instance,
+   "this residual %d, sg ents %d\n",
+   cmd->SCp.this_residual,
+   sg_nents(cmd->SCp.buffer));

/*
 * The preferred transfer method is going to be
@@ -2126,12 +2126,7 @@ static void NCR5380_reselect(struct Scsi_Host *instance)
if (sun3_dma_setup_done != tmp) {
int count;

-   if (!tmp->SCp.this_residual && tmp->SCp.buffers_residual) {
-   ++tmp->SCp.buffer;
-   --tmp->SCp.buffers_residual;
-   tmp->SCp.this_residual = tmp->SCp.buffer->length;
-   tmp->SCp.ptr = sg_virt(tmp->SCp.buffer);
-   }
+   advance_sg_buffer(tmp);

count = sun3scsi_dma_xfer_len(hostdata, tmp);
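
To see why the patch above replaces ++cmd->SCp.buffer with sg_next(), here is a userspace toy model of a chained scatterlist. The toy_ types and functions are hypothetical and only imitate the general shape of the kernel helpers: the last slot of an array may be a link to another array rather than a payload entry, so plain pointer increment would treat the link as data and then run off the end of the array.

```c
#include <stddef.h>

/* Toy entry: either carries a payload length or links to the next array. */
struct toy_sg {
	unsigned int length;   /* payload length; unused for a chain slot */
	struct toy_sg *chain;  /* non-NULL: this slot only points at the next array */
	int is_last;           /* marks the final payload entry of the whole list */
};

/* Step to the next payload entry, following a chain slot if one is found. */
static struct toy_sg *toy_sg_next(struct toy_sg *sg)
{
	if (sg->is_last)
		return NULL;
	sg++;
	if (sg->chain)
		sg = sg->chain;
	return sg;
}

/* Sum the payload lengths by walking the list with toy_sg_next(). */
static unsigned int toy_total_length(struct toy_sg *sg)
{
	unsigned int total = 0;

	for (; sg; sg = toy_sg_next(sg))
		total += sg->length;
	return total;
}

/* Two arrays chained together: 10 + 20 in the first, 30 + 40 in the second. */
static unsigned int toy_demo(void)
{
	struct toy_sg second[2] = {
		{ 30, NULL, 0 },
		{ 40, NULL, 1 },
	};
	struct toy_sg first[3] = {
		{ 10, NULL, 0 },
		{ 20, NULL, 0 },
		{ 0, second, 0 },  /* chain slot, not payload */
	};

	return toy_total_length(first);
}
```

A walker that tracked a remaining-entries count and incremented the pointer would work for a single flat array but not for this layout, which is presumably why the buffers_residual bookkeeping is dropped along with it.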




linux-next: build warning after merge of the tpmdd tree

2019-06-10 Thread Stephen Rothwell
Hi all,

After merging the tpmdd tree, today's linux-next build (arm
multi_v7_defconfig) produced this warning:

drivers/firmware/efi/tpm.c: In function 'efi_tpm_eventlog_init':
drivers/firmware/efi/tpm.c:80:10: warning: passing argument 1 of 
'tpm2_calc_event_log_size' makes pointer from integer without a cast 
[-Wint-conversion]
  tbl_size = tpm2_calc_event_log_size(efi.tpm_final_log
  ~
  + sizeof(final_tbl->version)
  
  + sizeof(final_tbl->nr_events),
  ^~
drivers/firmware/efi/tpm.c:19:43: note: expected 'void *' but argument is of 
type 'long unsigned int'
 static int tpm2_calc_event_log_size(void *data, int count, void *size_info)
 ~~^~~~

Introduced by commit

  a537b15c54a3 ("tpm: Reserve the TPM final events table")

-- 
Cheers,
Stephen Rothwell




Re: bcachefs status update (it's done cooking; let's get this sucker merged)

2019-06-10 Thread Linus Torvalds
On Mon, Jun 10, 2019 at 3:17 PM Kent Overstreet
 wrote:
>
>
> > Why does the regular page lock (at a finer granularity) not suffice?
>
> Because the lock needs to prevent pages from being _added_ to the page cache -
> to do it with a page granularity lock it'd have to be part of the radix tree,

No, I understand that part, but I still think we should be able to do
the locking per-page rather than over the whole mapping.

When doing dio, you need to iterate over old existing pages anyway in
that range (otherwise the "no _new_ pages" part is kind of pointless
when there are old pages there), so my gut feel is that you might as
well at that point also "poison" the range you are doing dio on. With
the xarray changes, we might be better at handling ranges. That was
one of the arguments for the xarrays over the old radix tree model,
after all.

And I think the dio code would ideally want to have a range-based lock
anyway, rather than one global one. No?

Anyway, don't get me wrong. I'm not entirely against a "stop adding
pages" model per-mapping if it's just fundamentally simpler and nobody
wants anything fancier. So I'm certainly open to it, assuming it
doesn't add any real overhead to the normal case.

But I *am* against it when it has ad-hoc locking and random
anti-recursion things.

So I'm with Dave on the "I hope we can avoid the recursive hacks" by
making better rules. Even if I disagree with him on the locking thing
- I'd rather put _more_ stress on the standard locking and make sure it
really works, over having multiple similar locking models because they
don't trust each other.

   Linus


[PATCH 2/4] vfs: create a generic checking function for FS_IOC_FSSETXATTR

2019-06-10 Thread Darrick J. Wong
From: Darrick J. Wong 

Create a generic checking function for the incoming FS_IOC_FSSETXATTR
fsxattr values so that we can standardize some of the implementation
behaviors.
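
As a hedged illustration of what such a shared check might centralize, the sketch below models the capability test that the btrfs hunk of this patch removes from the filesystem. The function name, the SKETCH_ flag values and the boolean capability argument are assumptions for illustration; the body of the real generic helper is not shown in this excerpt.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag values only; the real FS_XFLAG_* constants are defined by the kernel. */
#define SKETCH_XFLAG_IMMUTABLE 0x1u
#define SKETCH_XFLAG_APPEND    0x2u

/*
 * Models the open-coded btrfs test: if either the current or the requested
 * flags involve append-only or immutable, the caller needs the capability
 * (CAP_LINUX_IMMUTABLE in the kernel, a plain bool here).
 */
static int sketch_fssetxattr_check(unsigned int old_xflags,
				   unsigned int new_xflags,
				   bool has_cap_immutable)
{
	unsigned int guarded = SKETCH_XFLAG_IMMUTABLE | SKETCH_XFLAG_APPEND;

	if (((old_xflags | new_xflags) & guarded) && !has_cap_immutable)
		return -EPERM;
	return 0;
}
```

Each filesystem would then only need to translate its on-disk flags into the common form before calling the shared check, which is the role the per-filesystem getter helpers play in the hunks below.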

Signed-off-by: Darrick J. Wong 
---
 fs/btrfs/ioctl.c   |   21 +---
 fs/ext4/ioctl.c|   27 ++--
 fs/f2fs/file.c |   26 ++-
 fs/inode.c |   17 +
 fs/xfs/xfs_ioctl.c |   70 ++--
 include/linux/fs.h |3 ++
 6 files changed, 111 insertions(+), 53 deletions(-)


diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index f408aa93b0cf..7ddda5b4b6a6 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -366,6 +366,13 @@ static int check_xflags(unsigned int flags)
return 0;
 }
 
+static void __btrfs_ioctl_fsgetxattr(struct btrfs_inode *binode,
+struct fsxattr *fa)
+{
+   memset(fa, 0, sizeof(*fa));
+   fa->fsx_xflags = btrfs_inode_flags_to_xflags(binode->flags);
+}
+
 /*
  * Set the xflags from the internal inode flags. The remaining items of fsxattr
  * are zeroed.
@@ -375,8 +382,7 @@ static int btrfs_ioctl_fsgetxattr(struct file *file, void 
__user *arg)
struct btrfs_inode *binode = BTRFS_I(file_inode(file));
struct fsxattr fa;
 
-   memset(&fa, 0, sizeof(fa));
-   fa.fsx_xflags = btrfs_inode_flags_to_xflags(binode->flags);
+   __btrfs_ioctl_fsgetxattr(binode, &fa);
 
if (copy_to_user(arg, &fa, sizeof(fa)))
return -EFAULT;
@@ -390,7 +396,7 @@ static int btrfs_ioctl_fssetxattr(struct file *file, void 
__user *arg)
struct btrfs_inode *binode = BTRFS_I(inode);
struct btrfs_root *root = binode->root;
struct btrfs_trans_handle *trans;
-   struct fsxattr fa;
+   struct fsxattr fa, old_fa;
unsigned old_flags;
unsigned old_i_flags;
int ret = 0;
@@ -421,13 +427,10 @@ static int btrfs_ioctl_fssetxattr(struct file *file, void 
__user *arg)
old_flags = binode->flags;
old_i_flags = inode->i_flags;
 
-   /* We need the capabilities to change append-only or immutable inode */
-   if (((old_flags & (BTRFS_INODE_APPEND | BTRFS_INODE_IMMUTABLE)) ||
-(fa.fsx_xflags & (FS_XFLAG_APPEND | FS_XFLAG_IMMUTABLE))) &&
-   !capable(CAP_LINUX_IMMUTABLE)) {
-   ret = -EPERM;
+   __btrfs_ioctl_fsgetxattr(binode, &old_fa);
+   ret = vfs_ioc_fssetxattr_check(inode, &old_fa, &fa);
+   if (ret)
goto out_unlock;
-   }
 
if (fa.fsx_xflags & FS_XFLAG_SYNC)
binode->flags |= BTRFS_INODE_SYNC;
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 5126ee351a84..c2f48c90ca45 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -721,6 +721,19 @@ static int ext4_ioctl_check_project(struct inode *inode, 
struct fsxattr *fa)
return 0;
 }
 
+static void ext4_fsgetxattr(struct inode *inode, struct fsxattr *fa)
+{
+   struct ext4_inode_info *ei = EXT4_I(inode);
+
+   memset(fa, 0, sizeof(struct fsxattr));
+   fa->fsx_xflags = ext4_iflags_to_xflags(ei->i_flags & 
EXT4_FL_USER_VISIBLE);
+
+   if (ext4_has_feature_project(inode->i_sb)) {
+   fa->fsx_projid = (__u32)from_kprojid(&init_user_ns,
+   ei->i_projid);
+   }
+}
+
 long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 {
struct inode *inode = file_inode(filp);
@@ -1089,13 +1102,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, 
unsigned long arg)
{
struct fsxattr fa;
 
-   memset(&fa, 0, sizeof(struct fsxattr));
-   fa.fsx_xflags = ext4_iflags_to_xflags(ei->i_flags & 
EXT4_FL_USER_VISIBLE);
-
-   if (ext4_has_feature_project(inode->i_sb)) {
-   fa.fsx_projid = (__u32)from_kprojid(&init_user_ns,
-   EXT4_I(inode)->i_projid);
-   }
+   ext4_fsgetxattr(inode, &fa);
 
if (copy_to_user((struct fsxattr __user *)arg,
 &fa, sizeof(fa)))
@@ -1104,7 +1111,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, 
unsigned long arg)
}
case EXT4_IOC_FSSETXATTR:
{
-   struct fsxattr fa;
+   struct fsxattr fa, old_fa;
int err;
 
if (copy_from_user(&fa, (struct fsxattr __user *)arg,
@@ -1127,7 +1134,11 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, 
unsigned long arg)
return err;
 
inode_lock(inode);
+   ext4_fsgetxattr(inode, &old_fa);
err = ext4_ioctl_check_project(inode, &fa);
+   if (err)
+   goto out;
+   err = vfs_ioc_fssetxattr_check(inode, &old_fa, &fa);
if (err)
goto out;
flags = (ei->i_flags & ~EXT4_FL_XFLAG_VISIBLE) |
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 

[PATCH v3 0/6] vfs: make immutable files actually immutable

2019-06-10 Thread Darrick J. Wong
Hi all,

The chattr(1) manpage has this to say about the immutable bit that
system administrators can set on files:

"A file with the 'i' attribute cannot be modified: it cannot be deleted
or renamed, no link can be created to this file, most of the file's
metadata can not be modified, and the file can not be opened in write
mode."

Given the clause about how the file 'cannot be modified', it is
surprising that programs holding writable file descriptors can continue
to write to and truncate files after the immutable flag has been set,
but they cannot call other things such as utimes, fallocate, unlink,
link, setxattr, or reflink.

Since the immutable flag is only settable by administrators, resolve
this inconsistent behavior in favor of the documented behavior -- once
the flag is set, the file cannot be modified, period.  We presume that
administrators must be trusted to know what they're doing, and that
cutting off programs with writable fds will probably break them.

Therefore, add immutability checks to the relevant VFS functions, then
refactor the SETFLAGS and FSSETXATTR implementations to use common
argument checking functions so that we can then force pagefaults on all
the file data when setting immutability.

Note that various distro manpages point out the inconsistent behavior
of the various Linux filesystems w.r.t. immutable.  This fixes all that.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This has been lightly tested with fstests.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=immutable-files

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=immutable-files


Re: [RFC V3] mm: Generalize and rename notify_page_fault() as kprobe_page_fault()

2019-06-10 Thread Christophe Leroy




On 10/06/2019 at 04:39, Anshuman Khandual wrote:



On 06/07/2019 09:01 PM, Christophe Leroy wrote:



On 07/06/2019 at 12:34, Anshuman Khandual wrote:

Very similar definitions for notify_page_fault() are being used by multiple
architectures duplicating much of the same code. This attempts to unify all
of them into a generic implementation, rename it as kprobe_page_fault() and
then move it to a common header.

kprobes_built_in() can detect CONFIG_KPROBES, hence new kprobe_page_fault()
need not be wrapped again within CONFIG_KPROBES. Trap number argument can
now contain up to an 'unsigned int' accommodating all possible platforms.

kprobe_page_fault() goes the x86 way while dealing with preemption context.
As explained in these following commits the invoking context in itself must
be non-preemptible for kprobes processing context irrespective of whether
kprobe_running() or perhaps smp_processor_id() is safe or not. It does not
make much sense to continue when original context is preemptible. Instead
just bail out earlier.

commit a980c0ef9f6d
("x86/kprobes: Refactor kprobes_fault() like kprobe_exceptions_notify()")

commit b506a9d08bae ("x86: code clarification patch to Kprobes arch code")

Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-i...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: x...@kernel.org
Cc: Andrew Morton 
Cc: Michal Hocko 
Cc: Matthew Wilcox 
Cc: Mark Rutland 
Cc: Christophe Leroy 
Cc: Stephen Rothwell 
Cc: Andrey Konovalov 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Tony Luck 
Cc: Fenghua Yu 
Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Cc: Yoshinori Sato 
Cc: "David S. Miller" 
Cc: Thomas Gleixner 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Andy Lutomirski 
Cc: Dave Hansen 

Signed-off-by: Anshuman Khandual 
---
Testing:

- Build and boot tested on arm64 and x86
- Build tested on some other archs (arm, sparc64, alpha, powerpc etc)

Changes in RFC V3:

- Updated the commit message with an explanation for new preemption behaviour
- Moved notify_page_fault() to kprobes.h with 'static nokprobe_inline' per 
Matthew
- Changed notify_page_fault() return type from int to bool per Michael Ellerman
- Renamed notify_page_fault() as kprobe_page_fault() per Peterz

Changes in RFC V2: (https://patchwork.kernel.org/patch/10974221/)

- Changed generic notify_page_fault() per Matthew Wilcox
- Changed x86 to use new generic notify_page_fault()
- s/must not/need not/ in commit message per Matthew Wilcox

Changes in RFC V1: (https://patchwork.kernel.org/patch/10968273/)

   arch/arm/mm/fault.c  | 24 +---
   arch/arm64/mm/fault.c    | 24 +---
   arch/ia64/mm/fault.c | 24 +---
   arch/powerpc/mm/fault.c  | 23 ++-
   arch/s390/mm/fault.c | 16 +---
   arch/sh/mm/fault.c   | 18 ++
   arch/sparc/mm/fault_64.c | 16 +---
   arch/x86/mm/fault.c  | 21 ++---
   include/linux/kprobes.h  | 16 
   9 files changed, 27 insertions(+), 155 deletions(-)



[...]


diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 443d980..064dd15 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -458,4 +458,20 @@ static inline bool is_kprobe_optinsn_slot(unsigned long 
addr)
   }
   #endif
   +static nokprobe_inline bool kprobe_page_fault(struct pt_regs *regs,
+  unsigned int trap)
+{
+    int ret = 0;


ret is pointless.


+
+    /*
+ * To be potentially processing a kprobe fault and to be allowed
+ * to call kprobe_running(), we have to be non-preemptible.
+ */
+    if (kprobes_built_in() && !preemptible() && !user_mode(regs)) {
+    if (kprobe_running() && kprobe_fault_handler(regs, trap))


don't need an 'if A if B', can do 'if A && B'


Which will make it a very lengthy condition check.


Yes. But is that a problem at all ?

For me the following would be easier to read.

if (kprobes_built_in() && !preemptible() && !user_mode(regs) &&
kprobe_running() && kprobe_fault_handler(regs, trap))
ret = 1;

Christophe






+    ret = 1;


can do 'return true;' directly here


+    }
+    return ret;


And 'return false' here.


Makes sense, will drop ret.



Re: [PATCH v5 15/15] dmaengine: imx-sdma: add uart rom script

2019-06-10 Thread Vinod Koul
On 11-06-19, 03:04, Robin Gong wrote:
> On 2019-06-10 at 12:55 +, Vinod Koul wrote:
> > On 10-06-19, 16:17, yibin.g...@nxp.com wrote:
> > > 
> > > From: Robin Gong 
> > > 
> > > For the compatibility of NXP internal legacy kernel before 4.19
> > > which
> > > is based on uart ram script and upstreaming kernel based on uart
> > > rom
> > > script, add both uart ram/rom script in latest sdma firmware. By
> > > default
> > > uart rom script used.
> > > Besides, add two multi-fifo scripts for SAI/PDM on i.mx8m/8mm and
> > > add
> > > back qspi script miss for v4(i.mx7d/8m/8mm family, but v3 is for
> > > i.mx6).
> > > 
> > > rom script:
> > >   uart_2_mcu_addr
> > >   uartsh_2_mcu_addr /* through spba bus */
> > > ram script:
> > >   uart_2_mcu_ram_addr
> > >   uartsh_2_mcu_ram_addr /* through spba bus */
> > > 
> > > Please get latest sdma firmware from the below and put them into
> > > the path
> > > (/lib/firmware/imx/sdma/):
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/imx/sdma
> > > How does this work for folks who have older firmware?
> The older SDMA RAM script (firmware) has broken the uart driver of the
> upstream kernel for years; this is why Lucas raised the uart driver
> patch (commit 905c0decad28) to use the ROM script instead. There are two
> ways to fix the uart issue: one is checking 'Idle Condition
> Detection'/'Aging timer' in the RAM script and enabling 'IDLE' in the uart
> driver; the other is checking only 'Aging timer' in the ROM script and
> making the RX FIFO burst length one word shorter, so that at least one
> word is always left in the RX FIFO, which is the trigger requirement of
> the 'Aging timer' (so 'IDLE' is not needed, 'Aging timer' is enough). The
> FSL/NXP internal kernel goes with the first option, while the upstream
> kernel goes with the second. Since Lucas's patch assumes the ROM script is
> used in the kernel and disables 'IDLE', the upstream kernel's uart driver
> has been broken with older firmware for years. So this patch just fixes
> this compatibility issue, together with the RAM script (older firmware)
> update in linux-firmware (already done), so that both the RAM script and
> the ROM script work in the kernel. Besides, a kernel with the latest RAM
> firmware and this patch set can work around the ecspi issue without any
> functional breakage, which was Lucas's concern.

Acked-by: Vinod Koul 

-- 
~Vinod


Re: [RFC PATCH 23/30] of/platform: Export of_platform_device_create_pdata()

2019-06-10 Thread Kishon Vijay Abraham I
Hi Rob,

On 10/06/19 11:13 PM, Rob Herring wrote:
> On Tue, Jun 4, 2019 at 7:19 AM Kishon Vijay Abraham I  wrote:
>>
>> Export of_platform_device_create_pdata() to be used by drivers to
>> create child devices with the given platform data. This can be used
>> by a platform-specific driver to send platform data to the core driver.
>> For example, this will be used by TI's J721E SoC-specific PCIe driver to send
>> ->start_link() ops and ->is_link_up() ops to the Cadence core PCIe driver.
> 
> NAK
> 
> of_platform_device_create_pdata() is purely for legacy handling of
> auxdata which is something I hope to get rid of someday. Or to put it
> another way, auxdata use is a sign of platforms not fully converted to
> DT.

All right. Thanks for letting me know your thoughts.

Lorenzo,

We've modeled the Cadence PCIe core as a separate driver, and for some of its
functionality (for example starting LTSSM or checking link status) it has to
invoke the wrapper driver's functions (the registers for these are present in
the wrapper and not in the Cadence core). In the case of Designware, we modeled
the DWC core as a library which provides APIs to be used by the wrapper driver.
Since Rob is not in favor of passing platform data from one driver to another
(in this case from the TI-specific J721E driver to the Cadence PCIe driver),
should we model the Cadence core as a library as well? If you agree, I can
prepare patches to turn the Cadence PCIe core into a library. Please let me
know your thoughts.

Thanks
Kishon



Re: bcachefs status update (it's done cooking; let's get this sucker merged)

2019-06-10 Thread Linus Torvalds
On Mon, Jun 10, 2019 at 6:11 PM Dave Chinner  wrote:
>
> Please, no, let's not make the rwsems even more fragile than they
> already are. I'm tired of the ongoing XFS customer escalations that
> end up being root caused to yet another rwsem memory barrier bug.
>
> > Have you talked to Waiman Long about that?
>
> Unfortunately, Waiman has been unable to find/debug multiple rwsem
> exclusion violations we've seen in XFS bug reports over the past 2-3
> years.

Inside xfs you can do whatever you want.

But in generic code, no, we're not saying "we don't trust the generic
locking, so we cook our own random locking".

If there really are exclusion issues, they should be fairly easy to
try to find with a generic test-suite. Have a bunch of readers that
assert that some shared variable has a particular value, and a bunch of
writers that then modify the value and set it back. Add some random
timing and "yield" to them all, and show that the serialization is
wrong.
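
A rough userspace analogue of the test described above is sketched below. It uses a POSIX pthread_rwlock_t in place of the kernel rwsem, so it only illustrates the structure of such a test and says nothing about the kernel implementation; the names run_exclusion_test(), MAGIC and the thread counts are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAGIC     42L
#define NREADERS  4
#define NWRITERS  2
#define ITERS     10000

static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
static volatile long shared_value = MAGIC;  /* only touched under 'lock' */
static atomic_long violations;

/* Tiny per-thread LCG, used only to add some random timing. */
static unsigned int next_rand(unsigned int *seed)
{
	*seed = *seed * 1103515245u + 12345u;
	return (*seed >> 16) & 0x7fffu;
}

static void maybe_yield(unsigned int *seed)
{
	if (next_rand(seed) % 8 == 0)
		sched_yield();
}

static void *reader(void *arg)
{
	unsigned int seed = (unsigned int)(size_t)arg;

	for (int i = 0; i < ITERS; i++) {
		pthread_rwlock_rdlock(&lock);
		if (shared_value != MAGIC)      /* a writer was inside with us */
			atomic_fetch_add(&violations, 1);
		maybe_yield(&seed);
		pthread_rwlock_unlock(&lock);
	}
	return NULL;
}

static void *writer(void *arg)
{
	unsigned int seed = (unsigned int)(size_t)arg;

	for (int i = 0; i < ITERS; i++) {
		pthread_rwlock_wrlock(&lock);
		shared_value = MAGIC + 1;       /* break the invariant ... */
		maybe_yield(&seed);
		shared_value = MAGIC;           /* ... and restore it before unlocking */
		pthread_rwlock_unlock(&lock);
		maybe_yield(&seed);
	}
	return NULL;
}

/* Returns how many times a reader saw the invariant broken. */
long run_exclusion_test(void)
{
	pthread_t threads[NREADERS + NWRITERS];
	int rc = 0;

	atomic_store(&violations, 0);
	for (size_t i = 0; i < NREADERS + NWRITERS; i++) {
		rc = pthread_create(&threads[i], NULL,
				    i < NREADERS ? reader : writer,
				    (void *)(i + 1));
		assert(rc == 0);
	}
	for (size_t i = 0; i < NREADERS + NWRITERS; i++)
		pthread_join(threads[i], NULL);
	(void)rc;
	return atomic_load(&violations);
}
```

With a correct lock the function returns 0; any non-zero count would be direct evidence of an exclusion violation rather than an indirect "load Y shows problems" report.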

Some kind of "XFS load Y shows problems" is undebuggable, and not
necessarily due to locking.

Because if the locking issues are real (and we did fix one bug
recently in a9e9bcb45b15: "locking/rwsem: Prevent decrement of reader
count before increment") it needs to be fixed. Some kind of "let's do
something else entirely" is simply not acceptable.

  Linus


Re: [PATCH] block: fix a crash in do_task_dead()

2019-06-10 Thread Gaurav Kohli




+


Hi Peter, Jens,

As we are not taking pi_lock here, is it possible that a task-dead call for the
current thread comes in at this point in time? That is what caused the issue we
saw earlier, after commit 0619317ff8ba:
[T114538]  do_task_dead+0xf0/0xf8
[T114538]  do_exit+0xd5c/0x10fc
[T114538]  do_group_exit+0xf4/0x110
[T114538]  get_signal+0x280/0xdd8
[T114538]  do_notify_resume+0x720/0x968
[T114538]  work_pending+0x8/0x10

Is there a chance of TASK_DEAD set at this point of time?


In this case try_to_wake_up(current, TASK_NORMAL) will do nothing, see the
if (!(p->state & state)) above.

See also the comment about set_special_state() above. It disables irqs and
this is enough to ensure that try_to_wake_up(current) from irq can't race
with set_special_state(TASK_DEAD).
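
The state filter mentioned above can be modelled in a few lines. This is only a sketch of that single check, not of try_to_wake_up() as a whole; the struct and the function name wake_up_model() are hypothetical, and the state values are assumed to match include/linux/sched.h of that era.

```c
#include <assert.h>
#include <stdbool.h>

/* Task state bits, assumed to match include/linux/sched.h around v5.2. */
#define TASK_RUNNING          0x0000u
#define TASK_INTERRUPTIBLE    0x0001u
#define TASK_UNINTERRUPTIBLE  0x0002u
#define TASK_DEAD             0x0080u
#define TASK_NORMAL           (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)

/* Hypothetical stand-in for struct task_struct. */
struct task { unsigned int state; };

/* Models only the early state filter; returns true if the task was woken. */
bool wake_up_model(struct task *p, unsigned int state)
{
	if (!(p->state & state))
		return false;           /* state not in the wake mask: do nothing */
	p->state = TASK_RUNNING;
	return true;
}

/* Usage: a TASK_DEAD task is left alone by a TASK_NORMAL wakeup. */
void wake_up_model_demo(void)
{
	struct task dying = { TASK_DEAD };

	assert(!wake_up_model(&dying, TASK_NORMAL));
	assert(dying.state == TASK_DEAD);
}
```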


Thanks Oleg,

I missed that part (both the thread and the interrupt are on the same core),
so that situation can never arise.


Oleg.



--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


Re: bcachefs status update (it's done cooking; let's get this sucker merged)

2019-06-10 Thread Dave Chinner
On Mon, Jun 10, 2019 at 09:17:37PM -0400, Kent Overstreet wrote:
> On Mon, Jun 10, 2019 at 10:46:35AM -1000, Linus Torvalds wrote:
> > On Mon, Jun 10, 2019 at 9:14 AM Kent Overstreet
> >  wrote:
> > That lock is somewhat questionable in the first place, and no, we
> > don't do those hacky recursive things anyway. A recursive lock is
> > almost always a buggy and mis-designed one.
> 
> You're preaching to the choir there, I still feel dirty about that code and I'd
> love nothing more than for someone else to come along and point out how stupid
> I've been with a much better way of doing it.
> 
> > Why does the regular page lock (at a finer granularity) not suffice?
> 
> Because the lock needs to prevent pages from being _added_ to the page cache -
> to do it with a page granularity lock it'd have to be part of the radix tree, 
> 
> > And no, nobody has ever cared. The dio people just don't care about
> > page cache anyway. They have their own thing going.
> 
> It's not just dio, it's even worse with the various fallocate operations. And
> the xfs people care, but IIRC even they don't have locking for pages being
> faulted in. This is an issue I've talked to other filesystem people quite a bit
> about - especially Dave Chinner, maybe we can get him to weigh in here.
> 
> And this inconsistency does result in _real_ bugs. It goes something like this:
>  - dio write shoots down the range of the page cache for the file it's writing
>    to, using invalidate_inode_pages2_range
>  - After the page cache shoot down, but before the write actually happens,
>    another process pulls those pages back in to the page cache
>  - Now the write happens: if that write was e.g. an allocating write, you're
>    going to have page cache state (buffer heads) that say that page doesn't have
>    anything on disk backing it, but it actually does because of the dio write.
> 
> xfs has additional locking (that the vfs does _not_ do) around both the buffered
> and dio IO paths to prevent this happening because of a buffered read pulling
> the pages back in, but no one has a solution for pages getting _faulted_ back in
> - either because of mmap or gup().
> 
> And there are some filesystem people who do know about this race, because at
> some point the dio code has been changed to shoot down the page cache _again_
> after the write completes. But that doesn't eliminate the race, it just makes it
> harder to trigger.
> 
> And dio writes actually aren't the worst of it, it's even worse with fallocate
> FALLOC_FL_INSERT_RANGE/COLLAPSE_RANGE. Last time I looked at the ext4 fallocate
> code, it looked _completely_ broken to me - the code seemed to think it was
> using the same mechanism truncate uses for shooting down the page cache and
> keeping pages from being readded - but that only works for truncate because it's
> changing i_size and shooting down pages above i_size. Fallocate needs to shoot
> down pages that are still within i_size, so... yeah...

Yes, that ext4 code is broken, and Jan Kara is trying to work out
how to fix it. His recent patchset fell foul of taking the same lock
either side of the mmap_sem in this path:

> The recursiveness is needed because otherwise, if you mmap a file, then do a dio
> write where you pass the address you mmapped to pwrite(), gup() from the dio
> write path will be trying to fault in the exact pages it's blocking from being
> added.
> 
> A better solution would be for gup() to detect that and return an error, so we
> can just fall back to buffered writes. Or just return an error to userspace
> because fuck anyone who would actually do that.

I just recently said this with reference to the range lock stuff I'm
working on in the background:

FWIW, it's to avoid problems with stupid userspace stuff
that nobody really should be doing that I want range locks
for the XFS inode locks.  If userspace overlaps the ranges
and deadlocks in that case, then they get to keep all the
broken bits because, IMO, they are doing something
monumentally stupid. I'd probably be making it return
EDEADLOCK back out to userspace in the case rather than
deadlocking but, fundamentally, I think it's broken
behaviour that we should be rejecting with an error rather
than adding complexity trying to handle it.

So I think this recursive locking across a page fault case should
just fail, not add yet more complexity to try to handle a rare
corner case that exists more in theory than in reality. i.e. put the
lock context in the current task, then if the page fault requires a
conflicting lock context to be taken, we terminate the page fault,
back out of the IO and return EDEADLOCK out to userspace. This works
for all types of lock contexts - only the filesystem itself needs to
know what the lock context pointer contains.
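
As a very rough model of that idea, the sketch below stores an opaque lock context in a task structure and fails the fault when it would need the same context again. Everything here is hypothetical (struct task, fault_take_lock(), dio_write_model()), and "conflicting" is reduced to pointer equality, whereas a real filesystem would apply its own conflict rules.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EDEADLOCK
#define EDEADLOCK EDEADLK       /* the two are aliases on most Linux arches */
#endif

/* Hypothetical task with an opaque lock context; only the fs interprets it. */
struct task { const void *lock_ctx; };

/* Fault path: refuse to take a context the task already holds. */
int fault_take_lock(const struct task *cur, const void *needed_ctx)
{
	if (cur->lock_ctx != NULL && cur->lock_ctx == needed_ctx)
		return -EDEADLOCK;      /* terminate the fault instead of recursing */
	return 0;
}

/*
 * Model of a dio write to 'file_ctx' whose user buffer is backed by
 * 'buf_ctx' (for example an mmap of some file).
 */
int dio_write_model(struct task *cur, const void *file_ctx, const void *buf_ctx)
{
	int ret;

	cur->lock_ctx = file_ctx;                /* record what we hold */
	ret = fault_take_lock(cur, buf_ctx);     /* faulting in the buffer */
	cur->lock_ctx = NULL;                    /* back out of the IO */
	return ret;
}

/* Usage: writing a file from a mapping of that same file is rejected. */
void dio_write_model_demo(void)
{
	struct task cur = { NULL };
	int file_a, file_b;

	assert(dio_write_model(&cur, &file_a, &file_a) == -EDEADLOCK);
	assert(dio_write_model(&cur, &file_a, &file_b) == 0);
}
```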

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [RFC PATCH 1/2] mailbox: imx: Clear GIEn bit at shutdown

2019-06-10 Thread Oleksij Rempel
On Mon, Jun 10, 2019 at 10:16:08PM +0800, daniel.bal...@nxp.com wrote:
> From: Daniel Baluta 
> 
> GIEn is enabled at startup for RX doorbell mailboxes so
> we need to clear the bit at shutdown in order to avoid
> leaving the interrupt line enabled.
> 
> Signed-off-by: Daniel Baluta 

Please send bug fixes separately from RFC patches.

You can add my
Reviewed-by: Oleksij Rempel 

> ---
>  drivers/mailbox/imx-mailbox.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mailbox/imx-mailbox.c b/drivers/mailbox/imx-mailbox.c
> index 25be8bb5e371..9f74dee1a58c 100644
> --- a/drivers/mailbox/imx-mailbox.c
> +++ b/drivers/mailbox/imx-mailbox.c
> @@ -217,8 +217,8 @@ static void imx_mu_shutdown(struct mbox_chan *chan)
>   if (cp->type == IMX_MU_TYPE_TXDB)
>   tasklet_kill(&cp->txdb_tasklet);
>  
> - imx_mu_xcr_rmw(priv, 0,
> -IMX_MU_xCR_TIEn(cp->idx) | IMX_MU_xCR_RIEn(cp->idx));
> + imx_mu_xcr_rmw(priv, 0, IMX_MU_xCR_TIEn(cp->idx) |
> +IMX_MU_xCR_RIEn(cp->idx) | IMX_MU_xCR_GIEn(cp->idx));
>  
>   free_irq(priv->irq, chan);
>  }
> -- 
> 2.17.1
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |


Re: RE: RE: RE: [PATCH] input: alps-fix the issue alps cs19 trackstick do not work.

2019-06-10 Thread Hui Wang

On 2019/6/11 11:23 AM, Hui Wang wrote:

On 2019/6/11 11:05 AM, Xiaoxiao Liu wrote:

Hi Pali,

I discussed this problem with our FW team.
We think the V8 method is a touchpad feature and does not fit the
CS19 trackpoint device.
The CS19 TrackPoint does not need any Absolute (Raw) mode and usually
uses standard mouse data.
The CS19 TrackPoint is a completely different device from the
DualPoint devices of Dell/HP.
The CS19 TrackPoint is independent of the touchpad. (The touchpad is
connected via I2C; the TrackPoint is connected directly to the PS/2 port.)

And it has completely different FW.

So we think it is better to use the mouse mode for the CS19 trackpoint.


Maybe there is some misunderstanding here: the mouse mode doesn't
mean we use psmouse-base.c for cs19 (bare ps/2 mouse); we plan to use
trackpoint.c to drive this HW, so this trackpoint has all the features a
trackpoint should have.


And I sent a patch one month ago to let trackpoint.c drive
this HW: https://www.spinics.net/lists/linux-input/msg61341.html, maybe
that patch can serve as a reference.

Regards,

Hui.



Best Regards
Shona
-----Original Message-----
From: Pali Rohár 
Sent: Monday, June 10, 2019 6:43 PM
To: 劉 曉曉 Xiaoxiao Liu 
Cc: XiaoXiao Liu ; 
dmitry.torok...@gmail.com; peter.hutte...@who-t.net; 
hui.w...@canonical.com; linux-in...@vger.kernel.org; 
linux-kernel@vger.kernel.org; 曹 曉建 Xiaojian Cao 
; zhang...@lenovo.com; 斉藤 直樹 Naoki Saito 
; 川瀬 英夫 Hideo Kawase 

Subject: Re: RE: RE: [PATCH] input: alps-fix the issue alps cs19 
trackstick do not work.


On Monday 10 June 2019 10:03:51 Xiaoxiao Liu wrote:

Hi Pali,

Hi!


We register our CS19 device as an ALPS_ONLY_TRACKSTICK device,
and let the V8 protocol function handle the
ALPS_ONLY_TRACKSTICK device.


I want to confirm: is this solution OK?

Yes, it is fine. Just make sure that the touchpad input device is not
registered when this ALPS_ONLY_TRACKSTICK flag is set. So userspace
would not see any fake/unavailable touchpad input device.



Xiaoxiao.Liu

--
Pali Rohár
pali.ro...@gmail.com


linux-next: manual merge of the mfd tree with Linus' tree

2019-06-10 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the mfd tree got a conflict in:

  include/linux/mfd/cros_ec_commands.h

between commit:

  9c92ab619141 ("treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282")

from Linus' tree and commit:

  2769bd79a915 ("mfd: cros_ec: Update license term")

from the mfd tree.

I fixed it up (I use the SPDX tag from the former and the later change
to the comment from the latter) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell




Re: [PATCH v2 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-06-10 Thread liwei (GF)
Hi Alex,

On 2019/3/29 23:20, Alex Kogan wrote:
> In CNA, spinning threads are organized in two queues, a main queue for
> threads running on the same node as the current lock holder, and a
> secondary queue for threads running on other nodes. At the unlock time,
> the lock holder scans the main queue looking for a thread running on
> the same node. If found (call it thread T), all threads in the main queue
> between the current lock holder and T are moved to the end of the
> secondary queue, and the lock is passed to T. If such T is not found, the
> lock is passed to the first node in the secondary queue. Finally, if the
> secondary queue is empty, the lock is passed to the next thread in the
> main queue. For more details, see https://arxiv.org/abs/1810.05600.
> 
> Note that this variant of CNA may introduce starvation by continuously
> passing the lock to threads running on the same node. This issue
> will be addressed later in the series.
> 
> Enabling CNA is controlled via a new configuration option
> (NUMA_AWARE_SPINLOCKS), which is enabled by default if NUMA is enabled.
> 
> Signed-off-by: Alex Kogan 
> Reviewed-by: Steve Sistare 
> ---
>  arch/x86/Kconfig  |  14 +++
>  include/asm-generic/qspinlock_types.h |  13 +++
>  kernel/locking/mcs_spinlock.h |  10 ++
>  kernel/locking/qspinlock.c|  29 +-
>  kernel/locking/qspinlock_cna.h| 173 ++
>  5 files changed, 236 insertions(+), 3 deletions(-)
>  create mode 100644 kernel/locking/qspinlock_cna.h
> 
(SNIP)
> +
> +static __always_inline int get_node_index(struct mcs_spinlock *node)
> +{
> + return decode_count(node->node_and_count++);
When the nesting level is > 4, this won't return an index >= 4, and the NUMA
node number is changed by mistake. The code then goes down the wrong path
instead of taking the following branch.


	/*
	 * 4 nodes are allocated based on the assumption that there will
	 * not be nested NMIs taking spinlocks. That may not be true in
	 * some architectures even though the chance of needing more than
	 * 4 nodes will still be extremely unlikely. When that happens,
	 * we fall back to spinning on the lock directly without using
	 * any MCS node. This is not the most elegant solution, but is
	 * simple enough.
	 */
	if (unlikely(idx >= MAX_NODES)) {
		while (!queued_spin_trylock(lock))
			cpu_relax();
		goto release;
	}
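
To make the concern concrete, here is a small sketch of how a post-increment on a packed word can wrap the count and spill into the node bits. The bit layout (2 low bits for the nesting count, the rest for the NUMA node) and the helpers encode() and decode_node() are assumptions for illustration; the patch's real encoding is not shown in this mail.

```c
#include <assert.h>

#define MAX_NODES   4
#define COUNT_BITS  2                           /* hypothetical layout */
#define COUNT_MASK  ((1u << COUNT_BITS) - 1u)

static unsigned int decode_count(unsigned int v) { return v & COUNT_MASK; }
static unsigned int decode_node(unsigned int v) { return v >> COUNT_BITS; }
static unsigned int encode(unsigned int node, unsigned int count)
{
	return (node << COUNT_BITS) | count;
}

/* Mirrors the quoted get_node_index(): post-increment on the packed word. */
unsigned int get_node_index(unsigned int *node_and_count)
{
	return decode_count((*node_and_count)++);
}

/* Usage: the fifth nested acquisition wraps instead of reaching MAX_NODES. */
void get_node_index_demo(void)
{
	unsigned int word = encode(1, 0);       /* NUMA node 1, no nesting yet */

	for (unsigned int i = 0; i < MAX_NODES; i++)
		assert(get_node_index(&word) == i);

	assert(decode_node(word) == 2);         /* node field was bumped */
	assert(get_node_index(&word) == 0);     /* idx wrapped, never >= 4 */
}
```

Under this assumed layout the `idx >= MAX_NODES` fallback can never trigger, which matches the behaviour described above.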

> +}
> +
> +static __always_inline void release_mcs_node(struct mcs_spinlock *node)
> +{
> + __this_cpu_dec(node->node_and_count);
> +}
> +
> +static __always_inline void cna_init_node(struct mcs_spinlock *node, int cpuid,
> +   u32 tail)
> +{

Thanks,
Wei



Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

2019-06-10 Thread Leo Yan
On Mon, Jun 10, 2019 at 03:47:54PM -0300, Arnaldo Carvalho de Melo wrote:

[...]

> > > I tested with the latest perf/core branch which contains the patch:
> > > 'perf augmented_raw_syscalls: Tell which args are filenames and how
> > > many bytes to copy' and got the error as below:
> > > 
> > > # perf trace -e string -e /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c
> > > Error:  Invalid syscall access, chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod, newfstatat, open, readlink, rename,
> > > rmdir, stat, statfs, symlink, truncate, unlink
> 
> Humm, I think that we can just make the code that parses the
> tools/perf/trace/strace/groups/string file to ignore syscalls it can't
> find in the syscall_tbl, i.e. trace those if they exist in the arch.

Agree.

> > > Hint:   try 'perf list syscalls:sys_enter_*'
> > > Hint:   and: 'man syscalls'
> > > 
> > > So it seems mksyscalltbl does not cover all syscalls; I used the
> > > command below to generate the syscalltbl_arm64[] array and it doesn't
> > > include the related entries for access, chmod, chown, etc ...
> 
> So, we need to investigate why is that these are missing, good thing we
> have this 'strings' group :-)
> 
> > > You could refer the generated syscalltbl_arm64 in:
> > > http://paste.ubuntu.com/p/8Bj7Jkm2mP/
> > 
> > After digging into this issue on Arm64, below is summary info:
> > 
> > - arm64 uses the header include/uapi/linux/unistd.h to define system
> >   call numbers, in this header some system calls are not defined (I
> >   think the reason is these system calls are obsolete at the end) so the
> >   corresponding strings are missed in the array syscalltbl_native,
> >   for arm64 the array is defined in the file:
> >   tools/perf/arch/arm64/include/generated/asm/syscalls.c.
> 
> Yeah, I looked at the 'access' case and indeed it is not present in
> include/uapi/asm-generic/unistd.h, which is the place
> include/uapi/linux/unistd.h ends up.
> 
> Ok please take a look at the patch at the end of this message, should be ok?
> 
> I tested it by changing the strace/groups/string file to have a few
> unknown syscalls, running it with -v we see:
> 
> [root@quaco perf]# perf trace -v -e string ls
> Skipping unknown syscalls: access99, acct99, add_key99
> 
> normal operation not considering those unknown syscalls.

I tested the patch, but it failed after I added an eBPF event with the
command below; I even saw a segmentation fault. Please see the inline
comments below.

  perf --debug verbose=10 trace -e string -e \
    /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c

[...]

> commit e0b34a78c4ed0a6422f5b2dafa0c8936e537ee41
> Author: Arnaldo Carvalho de Melo 
> Date:   Mon Jun 10 15:37:45 2019 -0300
> 
> perf trace: Skip unknown syscalls when expanding strace like syscall groups
> 
> We have $INSTALL_DIR/share/perf-core/strace/groups/string files with
> syscalls that should be selected when 'string' is used, meaning, in this
> case, syscalls that receive as one of its arguments a string, like a
> pathname.
> 
> But those were first selected and tested on x86_64, and end up failing
> in architectures where some of those syscalls are not available, like
> the 'access' syscall on arm64, which makes using 'perf trace -e string'
> in such archs to fail.
> 
> Since the routine doing the validation is used only when reading
> such files, do not fail when some syscall is not found in the
> syscalltbl, instead just use pr_debug() to register that in case people
> are suspicious of problems.
> 
> Now using 'perf trace -e string' should work on arm64, selecting only
> the syscalls that have a string and are available on that architecture.
> 
> Reported-by: Leo Yan 
> Cc: Adrian Hunter 
> Cc: Alexander Shishkin 
> Cc: Alexei Starovoitov 
> Cc: Daniel Borkmann 
> Cc: Jiri Olsa 
> Cc: Martin KaFai Lau 
> Cc: Mathieu Poirier 
> Cc: Mike Leach 
> Cc: Namhyung Kim 
> Cc: Song Liu 
> Cc: Suzuki K Poulose 
> Cc: Yonghong Song 
> Link: 
> https://lkml.kernel.org/n/tip-oa4c2x8p3587jme0g89fy...@git.kernel.org
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 1a2a605cf068..eb70a4b71755 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -1529,6 +1529,7 @@ static int trace__read_syscall_info(struct trace *trace, int id)
>  static int trace__validate_ev_qualifier(struct trace *trace)
>  {
>   int err = 0, i;
> + bool printed_invalid_prefix = false;
>   size_t nr_allocated;
>   struct str_node *pos;
>  
> @@ -1555,14 +1556,15 @@ static int trace__validate_ev_qualifier(struct trace *trace)
>   if (id >= 0)
>   goto matches;
>  
> - if (err == 0) {
> -

Re: bcachefs status update (it's done cooking; let's get this sucker merged)

2019-06-10 Thread Dave Chinner
On Mon, Jun 10, 2019 at 10:46:35AM -1000, Linus Torvalds wrote:
> I also get the feeling that the "intent" part of the six-locks could
> just be done as a slight extension of the rwsem, where an "intent" is
> the same as a write-lock, but without waiting for existing readers,
> and then the write-lock part is just the "wait for readers to be
> done".

Please, no, let's not make the rwsems even more fragile than they
already are. I'm tired of the ongoing XFS customer escalations that
end up being root caused to yet another rwsem memory barrier bug.

> Have you talked to Waiman Long about that?

Unfortunately, Waiman has been unable to find/debug multiple rwsem
exclusion violations we've seen in XFS bug reports over the past 2-3
years. Those memory barrier bugs have all been fixed by other people
long after Waiman has said "I can't reproduce any problems in my
testing" and essentially walked away from the problem. We've been
left multiple times wondering how the hell we even prove it's a
rwsem bug because there's no way to reproduce the inconsistent rwsem
state we see in the kernel crash dumps.

Hence, as a downstream rwsem user, I have relatively little
confidence in upstream's ability to integrate new functionality into
rwsems without introducing yet more subtle regressions that are only
exposed by heavy rwsem users like XFS. As such, I consider rwsems to
be extremely fragile and are now a prime suspect whenever I see some
one-off memory corruption in a structure protected by a rwsem.

As such, please keep SIX locks separate to rwsems to minimise the
merge risk of bcachefs.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


KASAN: use-after-free Read in mntput

2019-06-10 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:    d1fdb6d8 Linux 5.2-rc4
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12b30acaa0
kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
dashboard link: https://syzkaller.appspot.com/bug?extid=99de05d099a170867f22
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
userspace arch: i386
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1114dc46a0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17eade6aa0

The bug was bisected to:

commit 9c8ad7a2ff0bfe58f019ec0abc1fb965114dde7d
Author: David Howells 
Date:   Thu May 16 11:52:27 2019 +

uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2]

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=15c9f91ea0
final crash:    https://syzkaller.appspot.com/x/report.txt?x=17c9f91ea0
console output: https://syzkaller.appspot.com/x/log.txt?x=13c9f91ea0

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+99de05d099a170867...@syzkaller.appspotmail.com
Fixes: 9c8ad7a2ff0b ("uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2]")


==
BUG: KASAN: use-after-free in mntput+0x91/0xa0 fs/namespace.c:1207
Read of size 4 at addr 88808f661124 by task syz-executor817/8955

CPU: 1 PID: 8955 Comm: syz-executor817 Not tainted 5.2.0-rc4 #18
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188
 __kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
 kasan_report+0x12/0x20 mm/kasan/common.c:614
 __asan_report_load4_noabort+0x14/0x20 mm/kasan/generic_report.c:131
 mntput+0x91/0xa0 fs/namespace.c:1207
 path_put+0x50/0x70 fs/namei.c:483
 free_fs_struct+0x25/0x70 fs/fs_struct.c:91
 exit_fs+0xf0/0x130 fs/fs_struct.c:108
 do_exit+0x8e0/0x2fa0 kernel/exit.c:873
 do_group_exit+0x135/0x370 kernel/exit.c:981
 __do_sys_exit_group kernel/exit.c:992 [inline]
 __se_sys_exit_group kernel/exit.c:990 [inline]
 __ia32_sys_exit_group+0x44/0x50 kernel/exit.c:990
 do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
 do_fast_syscall_32+0x27b/0xd7d arch/x86/entry/common.c:408
 entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7f16849
Code: 85 d2 74 02 89 0a 5b 5d c3 8b 04 24 c3 8b 14 24 c3 8b 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90

RSP: 002b:ffe4f85c EFLAGS: 0296 ORIG_RAX: 00fc
RAX: ffda RBX:  RCX: 080ed2b8
RDX:  RSI: 080d71fc RDI: 080ed2c0
RBP: 0001 R08:  R09: 
R10:  R11:  R12: 
R13:  R14:  R15: 

Allocated by task 8955:
 save_stack+0x23/0x90 mm/kasan/common.c:71
 set_track mm/kasan/common.c:79 [inline]
 __kasan_kmalloc mm/kasan/common.c:489 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:462
 kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:497
 slab_post_alloc_hook mm/slab.h:437 [inline]
 slab_alloc mm/slab.c:3326 [inline]
 kmem_cache_alloc+0x11a/0x6f0 mm/slab.c:3488
 kmem_cache_zalloc include/linux/slab.h:732 [inline]
 alloc_vfsmnt+0x28/0x780 fs/namespace.c:182
 vfs_create_mount+0x96/0x500 fs/namespace.c:961
 __do_sys_fsmount fs/namespace.c:3423 [inline]
 __se_sys_fsmount fs/namespace.c:3340 [inline]
 __ia32_sys_fsmount+0x584/0xc80 fs/namespace.c:3340
 do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
 do_fast_syscall_32+0x27b/0xd7d arch/x86/entry/common.c:408
 entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139

Freed by task 16:
 save_stack+0x23/0x90 mm/kasan/common.c:71
 set_track mm/kasan/common.c:79 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:451
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:459
 __cache_free mm/slab.c:3432 [inline]
 kmem_cache_free+0x86/0x260 mm/slab.c:3698
 free_vfsmnt+0x6f/0x90 fs/namespace.c:559
 delayed_free_vfsmnt+0x16/0x20 fs/namespace.c:564
 __rcu_reclaim kernel/rcu/rcu.h:222 [inline]
 rcu_do_batch kernel/rcu/tree.c:2092 [inline]
 invoke_rcu_callbacks kernel/rcu/tree.c:2310 [inline]
 rcu_core+0xba5/0x1500 kernel/rcu/tree.c:2291
 __do_softirq+0x25c/0x94c kernel/softirq.c:292

The buggy address belongs to the object at 88808f661000
 which belongs to the cache mnt_cache of size 432
The buggy address is located 292 bytes inside of
 432-byte region [88808f661000, 88808f6611b0)
The buggy address belongs to the page:
page:ea00023d9840 refcount:1 mapcount:0 mapping:8880aa594940 index:0x0

flags: 0x1fffc000200(slab)

memory leak in nfs_get_client

2019-06-10 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:    d1fdb6d8 Linux 5.2-rc4
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=117e0f71a0
kernel config:  https://syzkaller.appspot.com/x/.config?x=cb38d33cd06d8d48
dashboard link: https://syzkaller.appspot.com/bug?extid=7fe11b49c1cc30e3fce2
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=15a46001a0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=174b24d1a0

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+7fe11b49c1cc30e3f...@syzkaller.appspotmail.com

 fl=212 nc=0 na=0]
BUG: memory leak
unreferenced object 0x888121b91400 (size 1024):
  comm "syz-executor400", pid 6969, jiffies 4294941900 (age 18.210s)
  hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  backtrace:
[<9c69e9c0>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[<9c69e9c0>] slab_post_alloc_hook mm/slab.h:439 [inline]
[<9c69e9c0>] slab_alloc mm/slab.c:3326 [inline]
[<9c69e9c0>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
[<7d1011ce>] kmalloc include/linux/slab.h:547 [inline]
[<7d1011ce>] kzalloc include/linux/slab.h:742 [inline]
[<7d1011ce>] nfs_alloc_client+0x2e/0x170 fs/nfs/client.c:152
[<7f1bdfa5>] nfs_get_client+0x1cb/0x500 fs/nfs/client.c:425
[<4dc18603>] nfs_init_server+0xc6/0x450 fs/nfs/client.c:671
[<72615bbf>] nfs_create_server+0x83/0x1f0 fs/nfs/client.c:958
[] nfs_try_mount+0x5a/0x350 fs/nfs/super.c:1883
[] nfs_fs_mount+0x448/0xc52 fs/nfs/super.c:2719
[<0b19c7d0>] legacy_get_tree+0x27/0x80 fs/fs_context.c:661
[] vfs_get_tree+0x2e/0x120 fs/super.c:1476
[<8eec78b0>] do_new_mount fs/namespace.c:2790 [inline]
[<8eec78b0>] do_mount+0x932/0xc50 fs/namespace.c:3110
[] ksys_mount+0xab/0x120 fs/namespace.c:3319
[<82fa14d6>] __do_sys_mount fs/namespace.c: [inline]
[<82fa14d6>] __se_sys_mount fs/namespace.c:3330 [inline]
[<82fa14d6>] __x64_sys_mount+0x26/0x30 fs/namespace.c:3330
[] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
[<70865558>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

BUG: memory leak
unreferenced object 0x88811e758400 (size 1024):
  comm "syz-executor400", pid 6973, jiffies 4294941906 (age 18.150s)
  hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  backtrace:
[<9c69e9c0>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[<9c69e9c0>] slab_post_alloc_hook mm/slab.h:439 [inline]
[<9c69e9c0>] slab_alloc mm/slab.c:3326 [inline]
[<9c69e9c0>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
[<7d1011ce>] kmalloc include/linux/slab.h:547 [inline]
[<7d1011ce>] kzalloc include/linux/slab.h:742 [inline]
[<7d1011ce>] nfs_alloc_client+0x2e/0x170 fs/nfs/client.c:152
[<7f1bdfa5>] nfs_get_client+0x1cb/0x500 fs/nfs/client.c:425
[<4dc18603>] nfs_init_server+0xc6/0x450 fs/nfs/client.c:671
[<72615bbf>] nfs_create_server+0x83/0x1f0 fs/nfs/client.c:958
[] nfs_try_mount+0x5a/0x350 fs/nfs/super.c:1883
[] nfs_fs_mount+0x448/0xc52 fs/nfs/super.c:2719
[<0b19c7d0>] legacy_get_tree+0x27/0x80 fs/fs_context.c:661
[] vfs_get_tree+0x2e/0x120 fs/super.c:1476
[<8eec78b0>] do_new_mount fs/namespace.c:2790 [inline]
[<8eec78b0>] do_mount+0x932/0xc50 fs/namespace.c:3110
[] ksys_mount+0xab/0x120 fs/namespace.c:3319
[<82fa14d6>] __do_sys_mount fs/namespace.c: [inline]
[<82fa14d6>] __se_sys_mount fs/namespace.c:3330 [inline]
[<82fa14d6>] __x64_sys_mount+0x26/0x30 fs/namespace.c:3330
[] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
[<70865558>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

BUG: memory leak
unreferenced object 0x888118ef9360 (size 32):
  comm "syz-executor400", pid 6973, jiffies 4294941906 (age 18.150s)
  hex dump (first 32 bytes):
00 71 54 04 00 ea ff ff c0 6e 9a 04 00 ea ff ff  .qT..n..
c0 0b 81 04 00 ea ff ff c0 05 86 04 00 ea ff ff  
  backtrace:
[<3e75bb46>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[<3e75bb46>] slab_post_alloc_hook mm/slab.h:439 [inline]
[<3e75bb46>] slab_alloc mm/slab.c:3326 [inline]

[PATCH] iio: humidity: Replace older GPIO APIs with GPIO Consumer APIs for the dht11 sensor

2019-06-10 Thread Shobhit Kukreti
The dht11 driver uses a single GPIO to make measurements. It was
using the older global GPIO number space. This patch replaces the
old GPIO API with the newer descriptor-based GPIO consumer API.

Remove the header files "linux/gpio.h" and "linux/of_gpio.h".

Signed-off-by: Shobhit Kukreti 
---
 drivers/iio/humidity/dht11.c | 28 ++--
 1 file changed, 10 insertions(+), 18 deletions(-)

diff --git a/drivers/iio/humidity/dht11.c b/drivers/iio/humidity/dht11.c
index c815920..f5128d8 100644
--- a/drivers/iio/humidity/dht11.c
+++ b/drivers/iio/humidity/dht11.c
@@ -22,8 +22,7 @@
 #include 
 #include 
 #include 
-#include 
-#include 
+#include 
 #include 
 
 #include 
@@ -72,7 +71,7 @@
 struct dht11 {
struct device   *dev;
 
-   int gpio;
+   struct gpio_desc*gpiod;
int irq;
 
struct completion   completion;
@@ -179,7 +178,7 @@ static irqreturn_t dht11_handle_irq(int irq, void *data)
if (dht11->num_edges < DHT11_EDGES_PER_READ && dht11->num_edges >= 0) {
dht11->edges[dht11->num_edges].ts = ktime_get_boot_ns();
dht11->edges[dht11->num_edges++].value =
-   gpio_get_value(dht11->gpio);
+   gpiod_get_value(dht11->gpiod);
 
if (dht11->num_edges >= DHT11_EDGES_PER_READ)
complete(&dht11->completion);
@@ -217,12 +216,12 @@ static int dht11_read_raw(struct iio_dev *iio_dev,
reinit_completion(&dht11->completion);
 
dht11->num_edges = 0;
-   ret = gpio_direction_output(dht11->gpio, 0);
+   ret = gpiod_direction_output(dht11->gpiod, 0);
if (ret)
goto err;
usleep_range(DHT11_START_TRANSMISSION_MIN,
 DHT11_START_TRANSMISSION_MAX);
-   ret = gpio_direction_input(dht11->gpio);
+   ret = gpiod_direction_input(dht11->gpiod);
if (ret)
goto err;
 
@@ -294,10 +293,8 @@ MODULE_DEVICE_TABLE(of, dht11_dt_ids);
 static int dht11_probe(struct platform_device *pdev)
 {
struct device *dev = &pdev->dev;
-   struct device_node *node = dev->of_node;
struct dht11 *dht11;
struct iio_dev *iio;
-   int ret;
 
iio = devm_iio_device_alloc(dev, sizeof(*dht11));
if (!iio) {
@@ -307,18 +304,13 @@ static int dht11_probe(struct platform_device *pdev)
 
dht11 = iio_priv(iio);
dht11->dev = dev;
+   dht11->gpiod = devm_gpiod_get(dev, NULL, GPIOD_IN);
+   if (IS_ERR(dht11->gpiod))
+   return PTR_ERR(dht11->gpiod);
 
-   ret = of_get_gpio(node, 0);
-   if (ret < 0)
-   return ret;
-   dht11->gpio = ret;
-   ret = devm_gpio_request_one(dev, dht11->gpio, GPIOF_IN, pdev->name);
-   if (ret)
-   return ret;
-
-   dht11->irq = gpio_to_irq(dht11->gpio);
+   dht11->irq = gpiod_to_irq(dht11->gpiod);
if (dht11->irq < 0) {
-   dev_err(dev, "GPIO %d has no interrupt\n", dht11->gpio);
+   dev_err(dev, "GPIO %d has no interrupt\n", desc_to_gpio(dht11->gpiod));
return -EINVAL;
}
 
-- 
2.7.4



[PATCH RESEND] Powerpc/Watchpoint: Restore nvgprs while returning from exception

2019-06-10 Thread Ravi Bangoria
Powerpc hardware triggers a watchpoint before executing the instruction.
To provide trigger-after-execute behavior, the kernel emulates the
instruction. If the instruction is 'load something into a non-volatile
register', the exception handler must restore the emulated register
state on return; otherwise the register state is corrupted.
For example, adding a watchpoint on a list can corrupt the list:

  # cat /proc/kallsyms | grep kthread_create_list
  c121c8b8 d kthread_create_list

Add watchpoint on kthread_create_list->prev:

  # perf record -e mem:0xc121c8c0

Run some workload such that a new kthread gets created. For example, I
just logged out from the console:

  list_add corruption. next->prev should be prev (c1214e00), \
but was c121c8b8. (next=c121c8b8).
  WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0
  CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69
  ...
  NIP __list_add_valid+0xb4/0xc0
  LR __list_add_valid+0xb0/0xc0
  Call Trace:
  __list_add_valid+0xb0/0xc0 (unreliable)
  __kthread_create_on_node+0xe0/0x260
  kthread_create_on_node+0x34/0x50
  create_worker+0xe8/0x260
  worker_thread+0x444/0x560
  kthread+0x160/0x1a0
  ret_from_kernel_thread+0x5c/0x70

List corruption happened because it uses 'load into non-volatile
register' instruction:

Snippet from __kthread_create_on_node:

  c0136be8: addis   r29,r2,-19
  c0136bec: ld  r29,31424(r29)
if (!__list_add_valid(new, prev, next))
  c0136bf0: mr  r3,r30
  c0136bf4: mr  r5,r28
  c0136bf8: mr  r4,r29
  c0136bfc: bl  c059a2f8 <__list_add_valid+0x8>

Register state from WARN_ON():

  GPR00: c059a3a0 c07ff23afb50 c1344e00 0075
  GPR04:   001852af8bc1 
  GPR08: 0001 0007 0006 04aa
  GPR12:  c07eb080 c0137038 c05ff62aaa00
  GPR16:   c07fffbe7600 c07fffbe7370
  GPR20: c07fffbe7320 c07fffbe7300 c1373a00 
  GPR24: fef7 c012e320 c07ff23afcb0 c0cb8628
  GPR28: c121c8b8 c1214e00 c07fef5b17e8 c07fef5b17c0

Watchpoint hit at 0xc0136bec.

  addis   r29,r2,-19
   => r29 = 0xc1344e00 + (-19 << 16)
   => r29 = 0xc1214e00

  ld  r29,31424(r29)
   => r29 = *(0xc1214e00 + 31424)
   => r29 = *(0xc121c8c0)

0xc121c8c0 is where we placed the watchpoint, so this instruction was
emulated by emulate_step. But because handle_dabr_fault did not restore
the emulated register state, r29 still contains the stale value
(c1214e00, see GPR29 in the register dump above).
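
The address arithmetic above can be re-checked with a small userspace
sketch. It assumes the truncated 32-bit values exactly as they appear in
this report; the helper names are made up for illustration and are not
kernel functions:

```c
#include <assert.h>
#include <stdint.h>

/* addis rD,rA,SI computes rA + (SI << 16); helper name is made up. */
static uint32_t emu_addis(uint32_t ra, int16_t si)
{
	return ra + (uint32_t)((int32_t)si * 65536);
}

/* ld rD,disp(rA) reads from rA + disp; this returns that address. */
static uint32_t emu_ld_addr(uint32_t ra, int16_t disp)
{
	return ra + (uint32_t)(int32_t)disp;
}

/* Re-derive the values quoted in the report, starting from r2 = c1344e00. */
static void check_report_arithmetic(void)
{
	uint32_t r29 = emu_addis(0xc1344e00u, -19);

	assert(r29 == 0xc1214e00u);			/* stale value seen in GPR29 */
	assert(emu_ld_addr(r29, 31424) == 0xc121c8c0u);	/* watched address */
}
```

The intermediate value produced by addis is the one still visible in
GPR29, which is consistent with the result of the emulated ld never
being written back.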

Fixes: 5aae8a5370802 ("powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors")
Signed-off-by: Ravi Bangoria 
Cc: sta...@vger.kernel.org # 2.6.36+
Reviewed-by: Naveen N. Rao 
---
 arch/powerpc/kernel/exceptions-64s.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 6b86055e5251..0e649d980ec3 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1761,7 +1761,7 @@ handle_dabr_fault:
ld  r5,_DSISR(r1)
addi    r3,r1,STACK_FRAME_OVERHEAD
bl  do_break
-12:b   ret_from_except_lite
+12:b   ret_from_except
 
 
 #ifdef CONFIG_PPC_BOOK3S_64
-- 
2.20.1



[PATCH] NCR5380: Support chained sg lists

2019-06-10 Thread Finn Thain
My understanding is that support for chained scatterlists is to
become mandatory for LLDs.

Cc: Michael Schmitz 
Signed-off-by: Finn Thain 
---
 drivers/scsi/NCR5380.c | 41 ++---
 1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index d9fa9cf2fd8b..536426f25e86 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -149,12 +149,10 @@ static inline void initialize_SCp(struct scsi_cmnd *cmd)
 
if (scsi_bufflen(cmd)) {
cmd->SCp.buffer = scsi_sglist(cmd);
-   cmd->SCp.buffers_residual = scsi_sg_count(cmd) - 1;
cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
cmd->SCp.this_residual = cmd->SCp.buffer->length;
} else {
cmd->SCp.buffer = NULL;
-   cmd->SCp.buffers_residual = 0;
cmd->SCp.ptr = NULL;
cmd->SCp.this_residual = 0;
}
@@ -163,6 +161,17 @@ static inline void initialize_SCp(struct scsi_cmnd *cmd)
cmd->SCp.Message = 0;
 }
 
+static inline void advance_sg_buffer(struct scsi_cmnd *cmd)
+{
+   struct scatterlist *s = cmd->SCp.buffer;
+
+   if (!cmd->SCp.this_residual && s && !sg_is_last(s)) {
+   cmd->SCp.buffer = sg_next(s);
+   cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
+   cmd->SCp.this_residual = cmd->SCp.buffer->length;
+   }
+}
+
 /**
  * NCR5380_poll_politely2 - wait for two chip register values
  * @hostdata: host private data
@@ -1670,12 +1679,7 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
sun3_dma_setup_done != cmd) {
int count;
 
-   if (!cmd->SCp.this_residual && cmd->SCp.buffers_residual) {
-   ++cmd->SCp.buffer;
-   --cmd->SCp.buffers_residual;
-   cmd->SCp.this_residual = cmd->SCp.buffer->length;
-   cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
-   }
+   advance_sg_buffer(cmd);
 
count = sun3scsi_dma_xfer_len(hostdata, cmd);
 
@@ -1725,15 +1729,11 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
 * scatter-gather list, move onto the next one.
 */
 
-   if (!cmd->SCp.this_residual && cmd->SCp.buffers_residual) {
-   ++cmd->SCp.buffer;
-   --cmd->SCp.buffers_residual;
-   cmd->SCp.this_residual = cmd->SCp.buffer->length;
-   cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
-   dsprintk(NDEBUG_INFORMATION, instance, "%d bytes and %d buffers left\n",
-cmd->SCp.this_residual,
-cmd->SCp.buffers_residual);
-   }
+   advance_sg_buffer(cmd);
+   dsprintk(NDEBUG_INFORMATION, instance,
+   "this residual %d, sg ents %d\n",
+   cmd->SCp.this_residual,
+   sg_nents(cmd->SCp.buffer));
 
/*
 * The preferred transfer method is going to be
@@ -2126,12 +2126,7 @@ static void NCR5380_reselect(struct Scsi_Host *instance)
if (sun3_dma_setup_done != tmp) {
int count;
 
-   if (!tmp->SCp.this_residual && tmp->SCp.buffers_residual) {
-   ++tmp->SCp.buffer;
-   --tmp->SCp.buffers_residual;
-   tmp->SCp.this_residual = tmp->SCp.buffer->length;
-   tmp->SCp.ptr = sg_virt(tmp->SCp.buffer);
-   }
+   advance_sg_buffer(tmp);
 
count = sun3scsi_dma_xfer_len(hostdata, tmp);
 
-- 
2.21.0
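
Why the patch replaces "++cmd->SCp.buffer" with sg_next() can be shown
with a simplified userspace model. This is only a sketch: struct toy_sg
and the toy_* helpers are hypothetical stand-ins, not the kernel's
struct scatterlist API. In a chained list, one array slot is a link to
the next array, so plain pointer increment would land on the link
instead of a data segment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scatterlist; not the kernel type. */
struct toy_sg {
	unsigned int length;	/* bytes in this segment (0 for a link slot) */
	struct toy_sg *chain;	/* non-NULL: this slot links to another array */
	int last;		/* non-zero: final data segment of the list */
};

/* Models the idea behind sg_next(): step forward and follow a chain link. */
static struct toy_sg *toy_sg_next(struct toy_sg *sg)
{
	assert(sg != NULL);
	if (sg->last)
		return NULL;
	sg++;
	if (sg->chain)
		sg = sg->chain;
	return sg;
}

/* Sum all segment lengths by walking the list with toy_sg_next(). */
static unsigned int toy_total_len(struct toy_sg *sg)
{
	unsigned int total = 0;

	for (; sg; sg = toy_sg_next(sg))
		total += sg->length;
	return total;
}
```

With a first array holding one 512-byte segment plus a link slot, and a
second array holding 256-byte and 128-byte segments, the walk visits
three data segments and never treats the link slot as data.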



Re: RE: RE: RE: [PATCH] input: alps-fix the issue alps cs19 trackstick do not work.

2019-06-10 Thread Hui Wang

On 2019/6/11 11:05 AM, Xiaoxiao Liu wrote:

Hi Pali,

I discussed this problem with our FW team.
We think the V8 method is a touchpad feature and does not fit the CS19
trackpoint device.
The CS19 TrackPoint does not need any Absolute (Raw) mode and normally uses
standard mouse data.
The CS19 TrackPoint is a completely different device from the DualPoint
devices of Dell/HP.
The CS19 TrackPoint is independent of the touchpad. (The touchpad is connected
via I2C; the TrackPoint is connected directly to the PS/2 port.)
It also has completely different FW.

So we think it is better to use the mouse mode for the CS19 trackpoint.


Maybe there is some misunderstanding here: the mouse mode doesn't mean
we use psmouse-base.c for cs19 (bare ps/2 mouse). We plan to use
trackpoint.c to drive this HW, so this trackpoint has all the features a
trackpoint should have.


Regards,

Hui.



Best Regards
Shona
-Original Message-
From: Pali Rohár 
Sent: Monday, June 10, 2019 6:43 PM
To: 劉 曉曉 Xiaoxiao Liu 
Cc: XiaoXiao Liu ; dmitry.torok...@gmail.com; peter.hutte...@who-t.net; 
hui.w...@canonical.com; linux-in...@vger.kernel.org; linux-kernel@vger.kernel.org; 曹 曉建 Xiaojian Cao 
; zhang...@lenovo.com; 斉藤 直樹 Naoki Saito 
; 川瀬 英夫 Hideo Kawase 
Subject: Re: RE: RE: [PATCH] input: alps-fix the issue alps cs19 trackstick do not work.

On Monday 10 June 2019 10:03:51 Xiaoxiao Liu wrote:

Hi Pali,

Hi!


We register our CS19 device as an ALPS_ONLY_TRACKSTICK device,
and let the V8 protocol function handle the ALPS_ONLY_TRACKSTICK
device.

I want to confirm: is this solution OK?

Yes, it is fine. Just make sure that the touchpad input device is not registered 
when this ALPS_ONLY_TRACKSTICK flag is set, so that userspace does not see any 
fake/unavailable touchpad input device.


Xiaoxiao.Liu

--
Pali Rohár
pali.ro...@gmail.com


RE: [PATCHv6 3/3] vfio/mdev: Synchronize device create/remove with parent removal

2019-06-10 Thread Parav Pandit
Hi Alex,

> -Original Message-
> From: Cornelia Huck 
> Sent: Tuesday, June 4, 2019 11:18 AM
> To: Parav Pandit 
> Cc: k...@vger.kernel.org; linux-kernel@vger.kernel.org;
> kwankh...@nvidia.com; alex.william...@redhat.com; c...@nvidia.com
> Subject: Re: [PATCHv6 3/3] vfio/mdev: Synchronize device create/remove
> with parent removal
> 
> On Mon,  3 Jun 2019 13:56:58 -0500
> Parav Pandit  wrote:
> 
> > In following sequences, child devices created while removing mdev
> > parent device can be left out, or it may lead to race of removing half
> > initialized child mdev devices.
> >
> > issue-1:
> > 
> >        cpu-0                         cpu-1
> >        -----                         -----
> >                                   mdev_unregister_device()
> >                                     device_for_each_child()
> >                                       mdev_device_remove_cb()
> >                                         mdev_device_remove()
> > create_store()
> >   mdev_device_create()                   [...]
> >     device_add()
> >                                   parent_remove_sysfs_files()
> >
> > /* BUG: device added by cpu-0
> >  * whose parent is getting removed
> >  * and it won't process this mdev.
> >  */
> >
> > issue-2:
> > 
> > Below crash is observed when user initiated remove is in progress and
> > mdev_unregister_driver() completes parent unregistration.
> >
> >        cpu-0                         cpu-1
> >        -----                         -----
> > remove_store()
> >    mdev_device_remove()
> >    active = false;
> >                                   mdev_unregister_device()
> >                                   parent device removed.
> >    [...]
> >    parents->ops->remove()
> >  /*
> >   * BUG: Accessing invalid parent.
> >   */
> >
> > This is similar race like create() racing with mdev_unregister_device().
> >
> > BUG: unable to handle kernel paging request at c0585668
> > PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0
> > Oops:  [#1] SMP PTI
> > CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6
> > Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016
> > RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev]
> > Call Trace:
> >  remove_store+0x71/0x90 [mdev]
> >  kernfs_fop_write+0x113/0x1a0
> >  vfs_write+0xad/0x1b0
> >  ksys_write+0x5a/0xe0
> >  do_syscall_64+0x5a/0x210
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > Therefore, mdev core is improved as below to overcome above issues.
> >
> > Wait for any ongoing mdev create() and remove() to finish before
> > unregistering parent device.
> > This continues to allow multiple create and remove to progress in
> > parallel for different mdev devices as most common case.
> > At the same time guard parent removal while parent is being accessed by
> > create() and remove() callbacks.
> > create()/remove() and unregister_device() are synchronized by the rwsem.
> >
> > Refactor device removal code to mdev_device_remove_common() to avoid
> > acquiring unreg_sem of the parent.
> >
> > Fixes: 7b96953bc640 ("vfio: Mediated device Core driver")
> > Signed-off-by: Parav Pandit 
> > ---
> >  drivers/vfio/mdev/mdev_core.c| 71 
> >  drivers/vfio/mdev/mdev_private.h |  2 +
> >  2 files changed, 55 insertions(+), 18 deletions(-)
> >
> 
> > @@ -265,6 +299,12 @@ int mdev_device_create(struct kobject *kobj,
> >
> > mdev->parent = parent;
> >
> 
> Adding
> 
> /* Check if parent unregistration has started */
> 
> here as well might be nice, but no need to resend the patch for that.
> 
> > +   if (!down_read_trylock(&parent->unreg_sem)) {
> > +   mdev_device_free(mdev);
> > +   ret = -ENODEV;
> > +   goto mdev_fail;
> > +   }
> > +
> > device_initialize(&mdev->dev);
> > mdev->dev.parent  = dev;
> > mdev->dev.bus = &mdev_bus_type;
> 
> Reviewed-by: Cornelia Huck 

Now that we have all 3 patches reviewed and comments addressed, if there are no 
more comments, can you please take it forward?


RE: RE: RE: [PATCH] input: alps-fix the issue alps cs19 trackstick do not work.

2019-06-10 Thread Xiaoxiao Liu
Hi Pali,

I discussed this problem with our FW team.
We think the V8 method is a touchpad feature and does not fit the CS19
trackpoint device.
The CS19 TrackPoint does not need any Absolute (Raw) mode and normally uses
standard mouse data.
The CS19 TrackPoint is a completely different device from the DualPoint
devices of Dell/HP.
The CS19 TrackPoint is independent of the touchpad. (The touchpad is connected
via I2C; the TrackPoint is connected directly to the PS/2 port.)
It also has completely different FW.

So we think it is better to use the mouse mode for the CS19 trackpoint.

Best Regards
Shona
-Original Message-
From: Pali Rohár  
Sent: Monday, June 10, 2019 6:43 PM
To: 劉 曉曉 Xiaoxiao Liu 
Cc: XiaoXiao Liu ; dmitry.torok...@gmail.com; 
peter.hutte...@who-t.net; hui.w...@canonical.com; linux-in...@vger.kernel.org; 
linux-kernel@vger.kernel.org; 曹 曉建 Xiaojian Cao ; 
zhang...@lenovo.com; 斉藤 直樹 Naoki Saito ; 川瀬 英夫 
Hideo Kawase 
Subject: Re: RE: RE: [PATCH] input: alps-fix the issue alps cs19 trackstick do not work.

On Monday 10 June 2019 10:03:51 Xiaoxiao Liu wrote:
> Hi Pali,

Hi!

> We register our CS19 device as an ALPS_ONLY_TRACKSTICK device,
> and let the V8 protocol function handle the ALPS_ONLY_TRACKSTICK
> device.
> 
> I want to confirm: is this solution OK?

Yes, it is fine. Just make sure that the touchpad input device is not registered 
when this ALPS_ONLY_TRACKSTICK flag is set, so that userspace does not see any 
fake/unavailable touchpad input device.

> Xiaoxiao.Liu

--
Pali Rohár
pali.ro...@gmail.com


Re: [PATCH v4] selinux: lsm: fix a missing-check bug in selinux_sb_eat_lsm_opts()

2019-06-10 Thread Gen Zhang
On Mon, Jun 10, 2019 at 04:20:28PM -0400, Paul Moore wrote:
> On Fri, Jun 7, 2019 at 4:41 AM Ondrej Mosnacek  wrote:
> >
> > On Thu, Jun 6, 2019 at 10:55 AM Gen Zhang  wrote:
> > > In selinux_sb_eat_lsm_opts(), 'arg' is allocated by kmemdup_nul(). It
> > > returns NULL when fails. So 'arg' should be checked. And 'mnt_opts'
> > > should be freed when error.
> > >
> > > Signed-off-by: Gen Zhang 
> > > Fixes: 99dbbb593fe6 ("selinux: rewrite selinux_sb_eat_lsm_opts()")
> >
> > My comments about the subject and an empty line before label apply
> > here as well, but Paul can fix both easily when applying ...
> 
> Since we've been discussing general best practices for submitting
> patches in this thread (and the other related thread), I wanted to
> (re)clarify my thoughts around maintainers fixing patches when merging
> them upstream.
> 
> When in doubt, do not ever rely on the upstream maintainer fixing your
> patch while merging it, and if problems do arise during review, it is
> best to not ask the maintainer to fix them for you, but for you to fix
> them instead (you are the patch author after all!).  Similarly, making
> comments along the lines of "X can fix both easily when applying", is
> also a bad thing to say when reviewing patches.  It's the patch
> author's responsibility to fix the patch by addressing review comments,
> not the maintainer's.  I'll typically let you know if you don't need to
> rework a patch(set).
> 
> That said, there are times when the maintainer will change the patch
> during merging, most of which are due to resolving merge
> conflicts/fuzz with changes already in the tree (that *is* the
> maintainer's responsibility).  Speaking for myself, sometimes I will
> also make some minor changes if the patch author is away, or
> unreliable, or if there is a hard deadline near and I'm worried that
> the updated patch might not be ready in time.  I'll also sometimes
> make the changes directly if the patch is holding up a larger, more
> important patch(set), but that is really rare.  I'm sure I've made
> changes for other reasons in the past, and I'm sure I'll make changes
> for other reasons in the future, but hopefully this will give you a
> better idea of how the process works :)
> 
> -- 
> paul moore
> www.paul-moore.com
Thanks for your comments. I will resend a patch after revising.

Thanks
Gen


Re: [PATCH v3] selinux: lsm: fix a missing-check bug in selinux_add_mnt_opt()

2019-06-10 Thread Gen Zhang
On Mon, Jun 10, 2019 at 03:31:50PM -0400, Paul Moore wrote:
> On Fri, Jun 7, 2019 at 8:11 AM Gen Zhang  wrote:
> >
> > On Fri, Jun 07, 2019 at 10:39:05AM +0200, Ondrej Mosnacek wrote:
> > > > On Thu, Jun 6, 2019 at 11:23 AM Gen Zhang  wrote:
> > > > > In selinux_add_mnt_opt(), 'val' is allocated by kmemdup_nul(). It returns
> > > > > NULL when fails. So 'val' should be checked. And 'mnt_opts' should be
> > > > freed when error.
> > > >
> > > > Signed-off-by: Gen Zhang 
> > > > Fixes: 757cbe597fe8 ("LSM: new method: ->sb_add_mnt_opt()")
> > > > ---
> > > > diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> > > > index 3ec702c..4e4c1c6 100644
> > > > --- a/security/selinux/hooks.c
> > > > +++ b/security/selinux/hooks.c
> > > > @@ -1052,15 +1052,23 @@ static int selinux_add_mnt_opt(const char 
> > > > *option, const char *val, int len,
> > > > if (token == Opt_error)
> > > > return -EINVAL;
> > > >
> > > > -   if (token != Opt_seclabel)
> > > > -   val = kmemdup_nul(val, len, GFP_KERNEL);
> > > > +   if (token != Opt_seclabel) {
> > > > +   val = kmemdup_nul(val, len, GFP_KERNEL);
> > > > +   if (!val) {
> > > > +   rc = -ENOMEM;
> > > > +   goto free_opt;
> > > > +   }
> > > > +   }
> > > > rc = selinux_add_opt(token, val, mnt_opts);
> > > > if (unlikely(rc)) {
> > > > kfree(val);
> > > > -   if (*mnt_opts) {
> > > > -   selinux_free_mnt_opts(*mnt_opts);
> > > > -   *mnt_opts = NULL;
> > > > -   }
> > > > +   goto free_opt;
> > > > +   }
> > > > +   return rc;
> > >
> > > At this point rc is guaranteed to be 0, so you can just 'return 0' for
> > > clarity. Also, I visually prefer an empty line between a return
> > > statement and a goto label, but I'm not sure what is the
> > > general/maintainer's preference.
> >
> > Am I supposed to revise and send a patch v4 for this, or let the
> > maintainer do this? :-)
> 
> First a few things from my perspective: I don't really care too much
> about the difference between returning "0" and "rc" here, one could
> argue that "0" is cleaner and that "rc" is "safer".  To me it isn't a
> big deal and generally isn't something I would even comment on unless
> there was something else in the patch that needed addressing.  I care
> more about the style choice of having an empty line between the
> return and the start of the goto targets (vertical whitespace before
> the jump targets is good, please include it), but once again, I'm not
> sure I would comment on that.  The patch subject line is a bit
> confusing in that we already discussed when to use "selinux" and when
> to use "lsm", but I imagine there might be some confusion about using
> both so let me try and clear that up now: don't do it unless you have
> a *really* good reason to do so :)  In this case it is all SELinux
> code so there is no reason why you should be including the "lsm"
> prefix.
Thanks for your comments. I was uncertain of the meaning of "lsm". So I
used "selinux: lsm:". I am aware of that now.

Thanks
Gen
> 
> You've been pretty responsive, so if you don't mind submitting a v4
> with the changes mentioned above, that would be far more preferable to
> me making the changes.  I have some other comments about maintainer
> fixes to patches, but I'll save that for the other thread :)
> 
> -- 
> paul moore
> www.paul-moore.com
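
The error-handling shape discussed in this thread (check the duplicated
string, and release the options on every failure path through a single
label) can be sketched in userspace as follows. This is not the SELinux
code: struct toy_opts, toy_memdup_nul() and toy_add_opt() are
hypothetical, and the fail_alloc parameter exists only to simulate an
allocation failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the mount-options blob. */
struct toy_opts {
	char *context;
};

/* Userspace analogue of kmemdup_nul(): copy len bytes, NUL-terminate. */
static char *toy_memdup_nul(const char *s, size_t len, int fail)
{
	char *p = fail ? NULL : malloc(len + 1);

	if (!p)
		return NULL;
	memcpy(p, s, len);
	p[len] = '\0';
	return p;
}

/* Returns 0 on success; on failure *opts is freed and set to NULL. */
static int toy_add_opt(const char *val, size_t len, struct toy_opts **opts,
		       int fail_alloc)
{
	char *copy;
	int rc;

	assert(opts != NULL);
	copy = toy_memdup_nul(val, len, fail_alloc);
	if (!copy) {
		rc = -ENOMEM;
		goto free_opt;
	}
	if (!*opts) {
		*opts = calloc(1, sizeof(**opts));
		if (!*opts) {
			free(copy);
			rc = -ENOMEM;
			goto free_opt;
		}
	}
	free((*opts)->context);
	(*opts)->context = copy;
	return 0;

free_opt:
	if (*opts) {
		free((*opts)->context);
		free(*opts);
		*opts = NULL;
	}
	return rc;
}
```

The success path returns a literal 0 and is separated from the label by
an empty line, which matches the two style points raised in the review.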


Re: [PATCH v5 15/15] dmaengine: imx-sdma: add uart rom script

2019-06-10 Thread Robin Gong
On 2019-06-10 at 12:55 +, Vinod Koul wrote:
> On 10-06-19, 16:17, yibin.g...@nxp.com wrote:
> > 
> > From: Robin Gong 
> > 
> > For the compatibility of NXP internal legacy kernel before 4.19 which
> > is based on uart ram script and upstreaming kernel based on uart rom
> > script, add both uart ram/rom script in latest sdma firmware. By default
> > uart rom script used.
> > Besides, add two multi-fifo scripts for SAI/PDM on i.mx8m/8mm and add
> > back qspi script miss for v4(i.mx7d/8m/8mm family, but v3 is for
> > i.mx6).
> > 
> > rom script:
> > uart_2_mcu_addr
> > uartsh_2_mcu_addr /* through spba bus */
> > ram script:
> > uart_2_mcu_ram_addr
> > uartsh_2_mcu_ram_addr /* through spba bus */
> > 
> > Please get latest sdma firmware from the below and put them into the path
> > (/lib/firmware/imx/sdma/):
> > https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/imx/sdma
> How does this work for folks who have older firmware?
The older SDMA RAM script (firmware) has broken the UART driver of the
upstream kernel for years, which is why Lucas raised the UART driver
patch (commit 905c0decad28) to use the ROM script instead. There are two
ways to fix the UART issue: one is to check 'Idle Condition
Detection'/'Aging timer' in the RAM script and enable 'IDLE' in the UART
driver; the other is to check only 'Aging timer' in the ROM script and
set the RX FIFO burst length one word shorter, so that at least one word
always remains in the RX FIFO, which is the trigger requirement of the
'Aging timer' (so 'IDLE' is not needed; 'Aging timer' is enough). The
FSL/NXP internal kernel goes with the first option, while the upstream
kernel goes with the second. Since Lucas's patch assumes the ROM script
is used and disables 'IDLE', the upstream UART driver has been broken
with older firmware ever since. So this patch just fixes that
compatibility issue, together with the RAM script (older firmware)
already updated in linux-firmware, so that both the RAM script and the
ROM script work in the kernel. Besides, a kernel with the latest RAM
firmware and this patch set can work around the ecspi issue without the
functional breakage that Lucas was concerned about.
> 
> > 
> > 
> > Signed-off-by: Robin Gong 
> > ---
> >  drivers/dma/imx-sdma.c |  4 ++--
> >  include/linux/platform_data/dma-imx-sdma.h | 10 --
> >  2 files changed, 10 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
> > index f7c150d..deea9aa 100644
> > --- a/drivers/dma/imx-sdma.c
> > +++ b/drivers/dma/imx-sdma.c
> > @@ -1733,8 +1733,8 @@ static void sdma_issue_pending(struct dma_chan *chan)
> >  
> >  #define SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V1 34
> >  #define SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V2 38
> > -#define SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V3 41
> > -#define SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V4 42
> > +#define SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V3 45
> > +#define SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V4 46
> >  
> >  static void sdma_add_scripts(struct sdma_engine *sdma,
> >     const struct sdma_script_start_addrs *addr)
> > diff --git a/include/linux/platform_data/dma-imx-sdma.h b/include/linux/platform_data/dma-imx-sdma.h
> > index f794fee..e12d2e8 100644
> > --- a/include/linux/platform_data/dma-imx-sdma.h
> > +++ b/include/linux/platform_data/dma-imx-sdma.h
> > @@ -20,12 +20,12 @@ struct sdma_script_start_addrs {
> >     s32 per_2_firi_addr;
> >     s32 mcu_2_firi_addr;
> >     s32 uart_2_per_addr;
> > -   s32 uart_2_mcu_addr;
> > +   s32 uart_2_mcu_ram_addr;
> >     s32 per_2_app_addr;
> >     s32 mcu_2_app_addr;
> >     s32 per_2_per_addr;
> >     s32 uartsh_2_per_addr;
> > -   s32 uartsh_2_mcu_addr;
> > +   s32 uartsh_2_mcu_ram_addr;
> >     s32 per_2_shp_addr;
> >     s32 mcu_2_shp_addr;
> >     s32 ata_2_mcu_addr;
> > @@ -52,7 +52,13 @@ struct sdma_script_start_addrs {
> >     s32 zcanfd_2_mcu_addr;
> >     s32 zqspi_2_mcu_addr;
> >     s32 mcu_2_ecspi_addr;
> > +   s32 mcu_2_sai_addr;
> > +   s32 sai_2_mcu_addr;
> > +   s32 uart_2_mcu_addr;
> > +   s32 uartsh_2_mcu_addr;
> >     /* End of v3 array */
> > +   s32 mcu_2_zqspi_addr;
> > +   /* End of v4 array */
> >  };
> >  
> >  /**
> > -- 
> > 2.7.4

[PATCH] perf version: Fix segfault

2019-06-10 Thread Ravi Bangoria
'perf version' on powerpc segfaults when used with non-supported
option:
  # perf version -a
  Segmentation fault (core dumped)

Fix this by terminating version_options[] with OPT_END().

Signed-off-by: Ravi Bangoria 
Reviewed-by: Kamalesh babulal 
Tested-by: Mamatha Inamdar 
---
 tools/perf/builtin-version.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-version.c b/tools/perf/builtin-version.c
index f470144d1a70..bf114ca9ca87 100644
--- a/tools/perf/builtin-version.c
+++ b/tools/perf/builtin-version.c
@@ -19,6 +19,7 @@ static struct version version;
 static struct option version_options[] = {
OPT_BOOLEAN(0, "build-options", &version.build_options,
"display the build options"),
+   OPT_END(),
 };
 
 static const char * const version_usage[] = {
-- 
2.20.1
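
A plausible reading of the crash is that the option table is walked
until an end marker is found, so a table without one is read past its
last element when no entry matches. The sketch below models that with a
sentinel-terminated array; struct toy_option, toy_find() and the
TOY_OPT_* names are hypothetical and are not perf's parse-options API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_opt_type { TOY_OPT_END = 0, TOY_OPT_BOOLEAN };

/* Hypothetical, reduced option descriptor. */
struct toy_option {
	enum toy_opt_type type;
	const char *long_name;
};

/* Look up a long option; relies on the table ending with TOY_OPT_END. */
static const struct toy_option *toy_find(const struct toy_option *opts,
					 const char *name)
{
	assert(opts != NULL && name != NULL);
	for (; opts->type != TOY_OPT_END; opts++)
		if (strcmp(opts->long_name, name) == 0)
			return opts;
	return NULL;	/* unknown option: the sentinel stopped the walk */
}

static const struct toy_option toy_version_options[] = {
	{ TOY_OPT_BOOLEAN, "build-options" },
	{ TOY_OPT_END, NULL },	/* without this the loop walks off the array */
};
```

A known option is found before the sentinel, and an unknown one ends at
the sentinel and yields NULL rather than reading beyond the array.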



Re: [PATCH v2 1/2] staging: erofs: add requirements field in superblock

2019-06-10 Thread Chao Yu
On 2019/6/11 10:42, Gao Xiang wrote:
> There are some backward incompatible features pending
> for months, mainly due to on-disk format extensions.
> 
> However, we should ensure that it cannot be mounted with
> old kernels. Otherwise, it will cause unexpected behaviors.
> 
> Fixes: ba2b77a82022 ("staging: erofs: add super block operations")
> Cc:  # 4.19+
> Signed-off-by: Gao Xiang 

Reviewed-by: Chao Yu 

Thanks,


Re: [PATCH -next] HID: logitech-dj: fix return value of logi_dj_recv_query_hidpp_devices

2019-06-10 Thread Yuehaibing
Hi all,

Friendly ping...

On 2019/5/25 22:09, YueHaibing wrote:
> We should return 'retval' as the correct return value
> instead of always zero.
> 
> Fixes: 74808f9115ce ("HID: logitech-dj: add support for non unifying receivers")
> Signed-off-by: YueHaibing 
> ---
>  drivers/hid/hid-logitech-dj.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/hid/hid-logitech-dj.c b/drivers/hid/hid-logitech-dj.c
> index 41baa4dbbfcc..7f8db602eec0 100644
> --- a/drivers/hid/hid-logitech-dj.c
> +++ b/drivers/hid/hid-logitech-dj.c
> @@ -1133,7 +1133,7 @@ static int logi_dj_recv_query_hidpp_devices(struct dj_receiver_dev *djrcv_dev)
>   HID_REQ_SET_REPORT);
>  
>   kfree(hidpp_report);
> - return 0;
> + return retval;
>  }
>  
>  static int logi_dj_recv_query_paired_devices(struct dj_receiver_dev *djrcv_dev)
> 



[PATCH 2/2] rtl8723bs: os_dep: fix spaces preferred around unary operator

2019-06-10 Thread Hariprasad Kelam
CHECK: spaces preferred around that '|' (ctx:VxV)
CHECK: spaces preferred around that '|' (ctx:VxV)
CHECK: spaces preferred around that '+' (ctx:VxV)

Signed-off-by: Hariprasad Kelam 
---
 drivers/staging/rtl8723bs/os_dep/rtw_proc.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/rtl8723bs/os_dep/rtw_proc.c b/drivers/staging/rtl8723bs/os_dep/rtw_proc.c
index d6862e8..16ada19 100644
--- a/drivers/staging/rtl8723bs/os_dep/rtw_proc.c
+++ b/drivers/staging/rtl8723bs/os_dep/rtw_proc.c
@@ -21,7 +21,7 @@ inline struct proc_dir_entry *rtw_proc_create_dir(const char *name, struct proc_
 {
struct proc_dir_entry *entry;
 
-   entry = proc_mkdir_data(name, S_IRUGO|S_IXUGO, parent, data);
+   entry = proc_mkdir_data(name, S_IRUGO | S_IXUGO, parent, data);
 
return entry;
 }
@@ -31,7 +31,7 @@ inline struct proc_dir_entry *rtw_proc_create_entry(const char *name, struct pro
 {
struct proc_dir_entry *entry;
 
-   entry = proc_create_data(name,  S_IFREG|S_IRUGO, parent, fops, data);
+   entry = proc_create_data(name,  S_IFREG | S_IRUGO, parent, fops, data);
 
return entry;
 }
@@ -90,7 +90,7 @@ static int rtw_drv_proc_open(struct inode *inode, struct file *file)
 {
/* struct net_device *dev = proc_get_parent_data(inode); */
ssize_t index = (ssize_t)PDE_DATA(inode);
-   const struct rtw_proc_hdl *hdl = drv_proc_hdls+index;
+   const struct rtw_proc_hdl *hdl = drv_proc_hdls + index;
 
return single_open(file, hdl->show, NULL);
 }
@@ -98,7 +98,7 @@ static int rtw_drv_proc_open(struct inode *inode, struct file *file)
 static ssize_t rtw_drv_proc_write(struct file *file, const char __user *buffer, size_t count, loff_t *pos)
 {
ssize_t index = (ssize_t)PDE_DATA(file_inode(file));
-   const struct rtw_proc_hdl *hdl = drv_proc_hdls+index;
+   const struct rtw_proc_hdl *hdl = drv_proc_hdls + index;
ssize_t (*write)(struct file *, const char __user *, size_t, loff_t *, 
void *) = hdl->write;
 
if (write)
@@ -207,7 +207,7 @@ static int proc_get_linked_info_dump(struct seq_file *m, 
void *v)
struct adapter *padapter = (struct adapter *)rtw_netdev_priv(dev);
 
if (padapter)
-   DBG_871X_SEL_NL(m, "linked_info_dump :%s\n", 
(padapter->bLinkInfoDump)?"enable":"disable");
+   DBG_871X_SEL_NL(m, "linked_info_dump :%s\n", 
(padapter->bLinkInfoDump) ? "enable" : "disable");
 
return 0;
 }
@@ -245,7 +245,7 @@ static int proc_get_rx_info(struct seq_file *m, void *v)
struct debug_priv *pdbgpriv = >drv_dbg;
 
/* Counts of packets whose seq_num is less than 
preorder_ctrl->indicate_seq, Ex delay, retransmission, redundant packets and so 
on */
-   DBG_871X_SEL_NL(m,"Counts of Packets Whose Seq_Num Less Than Reorder 
Control Seq_Num: %llu\n", (unsigned long 
long)pdbgpriv->dbg_rx_ampdu_drop_count);
+   DBG_871X_SEL_NL(m, "Counts of Packets Whose Seq_Num Less Than Reorder 
Control Seq_Num: %llu\n", (unsigned long 
long)pdbgpriv->dbg_rx_ampdu_drop_count);
/* How many times the Rx Reorder Timer is triggered. */
DBG_871X_SEL_NL(m,"Rx Reorder Time-out Trigger Counts: %llu\n", 
(unsigned long long)pdbgpriv->dbg_rx_ampdu_forced_indicate_count);
/* Total counts of packets loss */
@@ -341,8 +341,8 @@ static int proc_get_cam_cache(struct seq_file *m, void *v)
, dvobj->cam_cache[i].ctrl
, MAC_ARG(dvobj->cam_cache[i].mac)
, KEY_ARG(dvobj->cam_cache[i].key)
-   , (dvobj->cam_cache[i].ctrl)&0x03
-   , 
security_type_str(((dvobj->cam_cache[i].ctrl)>>2)&0x07)
+   , (dvobj->cam_cache[i].ctrl) & 0x03
+   , security_type_str(((dvobj->cam_cache[i].ctrl) 
>> 2) & 0x07)
/*  ((dvobj->cam_cache[i].ctrl)>>5)&0x01 */
/*  ((dvobj->cam_cache[i].ctrl)>>6)&0x01 */
/*  ((dvobj->cam_cache[i].ctrl)>>8)&0x7f */
@@ -421,7 +421,7 @@ static const int adapter_proc_hdls_num = 
sizeof(adapter_proc_hdls) / sizeof(stru
 static int rtw_adapter_proc_open(struct inode *inode, struct file *file)
 {
ssize_t index = (ssize_t)PDE_DATA(inode);
-   const struct rtw_proc_hdl *hdl = adapter_proc_hdls+index;
+   const struct rtw_proc_hdl *hdl = adapter_proc_hdls + index;
 
return single_open(file, hdl->show, proc_get_parent_data(inode));
 }
@@ -429,7 +429,7 @@ static int rtw_adapter_proc_open(struct inode *inode, 
struct file *file)
 static ssize_t rtw_adapter_proc_write(struct file *file, const char __user 
*buffer, size_t count, loff_t *pos)
 {
ssize_t index = (ssize_t)PDE_DATA(file_inode(file));
-   const struct rtw_proc_hdl *hdl = adapter_proc_hdls+index;
+   const struct rtw_proc_hdl 

[PATCH 1/2] staging: rtl8723bs: fix issue Comparison to NULL

2019-06-10 Thread Hariprasad Kelam
This patch fixes the following issues reported by checkpatch:

CHECK: Comparison to NULL could be written "rtw_proc"
CHECK: Comparison to NULL could be written "!rtw_proc"
CHECK: Comparison to NULL could be written "!rtw_proc"

Signed-off-by: Hariprasad Kelam 
---
 drivers/staging/rtl8723bs/os_dep/rtw_proc.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/rtl8723bs/os_dep/rtw_proc.c 
b/drivers/staging/rtl8723bs/os_dep/rtw_proc.c
index d8e7ad1..d6862e8 100644
--- a/drivers/staging/rtl8723bs/os_dep/rtw_proc.c
+++ b/drivers/staging/rtl8723bs/os_dep/rtw_proc.c
@@ -122,14 +122,14 @@ int rtw_drv_proc_init(void)
ssize_t i;
struct proc_dir_entry *entry = NULL;
 
-   if (rtw_proc != NULL) {
+   if (rtw_proc) {
rtw_warn_on(1);
goto exit;
}
 
rtw_proc = rtw_proc_create_dir(RTW_PROC_NAME, get_proc_net, NULL);
 
-   if (rtw_proc == NULL) {
+   if (!rtw_proc) {
rtw_warn_on(1);
goto exit;
}
@@ -152,7 +152,7 @@ void rtw_drv_proc_deinit(void)
 {
int i;
 
-   if (rtw_proc == NULL)
+   if (!rtw_proc)
return;
 
for (i = 0; i < drv_proc_hdls_num; i++)
@@ -637,18 +637,18 @@ static struct proc_dir_entry *rtw_odm_proc_init(struct 
net_device *dev)
struct adapter  *adapter = rtw_netdev_priv(dev);
ssize_t i;
 
-   if (adapter->dir_dev == NULL) {
+   if (!adapter->dir_dev) {
rtw_warn_on(1);
goto exit;
}
 
-   if (adapter->dir_odm != NULL) {
+   if (adapter->dir_odm) {
rtw_warn_on(1);
goto exit;
}
 
dir_odm = rtw_proc_create_dir("odm", adapter->dir_dev, dev);
-   if (dir_odm == NULL) {
+   if (!dir_odm) {
rtw_warn_on(1);
goto exit;
}
@@ -674,7 +674,7 @@ static void rtw_odm_proc_deinit(struct adapter  
*adapter)
 
dir_odm = adapter->dir_odm;
 
-   if (dir_odm == NULL) {
+   if (!dir_odm) {
rtw_warn_on(1);
return;
}
@@ -695,18 +695,18 @@ struct proc_dir_entry *rtw_adapter_proc_init(struct 
net_device *dev)
struct adapter *adapter = rtw_netdev_priv(dev);
ssize_t i;
 
-   if (drv_proc == NULL) {
+   if (!drv_proc) {
rtw_warn_on(1);
goto exit;
}
 
-   if (adapter->dir_dev != NULL) {
+   if (adapter->dir_dev) {
rtw_warn_on(1);
goto exit;
}
 
dir_dev = rtw_proc_create_dir(dev->name, drv_proc, dev);
-   if (dir_dev == NULL) {
+   if (!dir_dev) {
rtw_warn_on(1);
goto exit;
}
@@ -736,7 +736,7 @@ void rtw_adapter_proc_deinit(struct net_device *dev)
 
dir_dev = adapter->dir_dev;
 
-   if (dir_dev == NULL) {
+   if (!dir_dev) {
rtw_warn_on(1);
return;
}
@@ -760,7 +760,7 @@ void rtw_adapter_proc_replace(struct net_device *dev)
 
dir_dev = adapter->dir_dev;
 
-   if (dir_dev == NULL) {
+   if (!dir_dev) {
rtw_warn_on(1);
return;
}
-- 
2.7.4



Re: [PATCH] cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending

2019-06-10 Thread Pavan Kondeti
Hi Rafael/Thomas,

On Mon, Jun 3, 2019 at 10:03 AM Pavankumar Kondeti
 wrote:
>
> When "deep" suspend is enabled, all CPUs except the primary CPU
> are hotplugged out. Since CPU hotplug is a costly operation,
> check if we have to abort the suspend in between each CPU
> hotplug. This would improve the system suspend abort latency
> upon detecting a wakeup condition.
>

Please let me know if you have any comments on this patch.

Thanks,
Pavan

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project
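
The idea in the quoted patch description can be sketched in user space as follows. This is only an illustration, not the kernel implementation: `disable_secondary_cpus`, `wakeup_check_fn` and `wakeup_after_n_checks` are hypothetical names, and the actual hot-unplug step is reduced to a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a "wakeup event is pending" query. */
typedef bool (*wakeup_check_fn)(void *ctx);

/*
 * Offline CPUs 1..nr_cpus-1 one by one (CPU 0 is the primary and stays up).
 * Before each offline step, ask whether a wakeup is pending; if so, stop
 * and return -EBUSY so the caller can abort the suspend early.
 * *nr_offlined receives the number of CPUs actually taken down.
 */
static int disable_secondary_cpus(int nr_cpus, wakeup_check_fn wakeup_pending,
                                  void *ctx, int *nr_offlined)
{
    assert(nr_offlined != NULL);
    *nr_offlined = 0;

    for (int cpu = 1; cpu < nr_cpus; cpu++) {
        if (wakeup_pending(ctx))
            return -EBUSY;      /* abort: remaining CPUs stay online */
        /* A real implementation would hot-unplug `cpu` here. */
        (*nr_offlined)++;
    }
    return 0;
}

/* Example check: report a wakeup once the budget stored in ctx is used up. */
static bool wakeup_after_n_checks(void *ctx)
{
    int *remaining = ctx;

    return (*remaining)-- <= 0;
}
```

With a budget of 2 and eight CPUs, the loop takes two CPUs down and then returns -EBUSY instead of paying for the remaining five hot-unplug operations.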


Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol

2019-06-10 Thread Jin, Yao




On 6/8/2019 7:41 PM, Jin, Yao wrote:



On 6/5/2019 7:44 PM, Jiri Olsa wrote:

On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:

SNIP


diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 43623fa..d1641da 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -79,6 +79,9 @@ struct hist_entry_diff {
  /* HISTC_WEIGHTED_DIFF */
  s64    wdiff;
+
+    /* PERF_HPP_DIFF__CYCLES */
+    s64    cycles;
  };
  };
@@ -143,6 +146,9 @@ struct hist_entry {
  struct branch_info    *branch_info;
  long    time;
  struct hists    *hists;
+    void    *block_hists;
+    int    block_idx;
+    int    block_num;
  struct mem_info    *mem_info;
  struct block_info    *block_info;


could you please not add the new block* stuff in here,
and instead use the "c2c model" and use your own struct
on top of hist_entry? we are trying to librarize this
stuff and keep only necessary things in here..

you're already using hist_entry_ops, so should be easy

something like:

struct block_hist_entry {
    void    *block_hists;
    int    block_idx;
    int    block_num;
    struct block_info    *block_info;

    struct hist_entry    he;
};



jirka



Hi Jiri,

After more consideration, I don't think I can move these fields from
hist_entry to block_hist_entry.

We actually use two kinds of hist_entry in this patch series. One kind of
hist_entry is for a symbol/function. The other kind is for a basic block.


@@ -143,6 +146,9 @@ struct hist_entry {
   struct branch_info    *branch_info;
   long    time;
   struct hists    *hists;
+    void    *block_hists;
+    int    block_idx;
+    int    block_num;
   struct mem_info    *mem_info;
   struct block_info    *block_info;

The above hist_entry is actually for symbol/function. This patch series 
collects all basic blocks in a symbol/function, so it needs a hists in 
struct hist_entry (block_hists) to point to the hists of basic blocks.


Correct me if I'm wrong.

Thanks
Jin Yao



Hi Jiri,

Alternatively, how about adding a new pointer 'priv' to 'struct map_symbol'?

struct map_symbol {
    struct map      *map;
    struct symbol   *sym;
+   void            *priv;
};

We create a struct outside and assign the pointer to priv. Logically it 
should make sense since the symbol/function may have private data for 
processing.


Any idea?

Thanks
Jin Yao
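
For reference, the "struct on top of hist_entry" approach suggested earlier in the thread relies on embedding the base struct and recovering the wrapper from a pointer to it. The following is a minimal sketch under stated assumptions: `struct hist_entry` is reduced to a single field, and `to_block_he()` is a hypothetical helper that does what the kernel's container_of() would do for this one case.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for perf's struct hist_entry; the real one has many more fields. */
struct hist_entry {
    long time;
};

/* Wrapper as proposed in the thread: block-specific state lives outside hist_entry. */
struct block_hist_entry {
    int block_idx;
    int block_num;
    struct hist_entry he;   /* embedded, not pointed to */
};

/* Recover the wrapper from a pointer to its embedded hist_entry. */
static struct block_hist_entry *to_block_he(struct hist_entry *he)
{
    assert(he != NULL);
    return (struct block_hist_entry *)((char *)he -
                                       offsetof(struct block_hist_entry, he));
}
```

Generic code keeps passing `struct hist_entry *` around, and only the block-aware callbacks convert back to the wrapper. This works only for entries that were allocated as `struct block_hist_entry` in the first place.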






Re: [PATCH v2] staging: kpc2000: kpc_i2c: remove the macros inb_p and outb_p

2019-06-10 Thread Geordan Neukum
On Mon, Jun 10, 2019 at 03:48:24PM +0800, Hao Xu wrote:
> remove inb_p and outb_p to call readq/writeq directly.
> 
> Signed-off-by: Hao Xu 
> ---
> Changes in v2:
> - remove the macros inb_p/outb_p and use readq/writeq directly, per 
> https://lkml.kernel.org/lkml/20190608134505.ga...@arch-01.home/
> ---
>  drivers/staging/kpc2000/kpc2000_i2c.c | 112 
> --
>  1 file changed, 53 insertions(+), 59 deletions(-)
> 
> diff --git a/drivers/staging/kpc2000/kpc2000_i2c.c 
> b/drivers/staging/kpc2000/kpc2000_i2c.c
> index 69e8773..246d5b3 100644
> --- a/drivers/staging/kpc2000/kpc2000_i2c.c
> +++ b/drivers/staging/kpc2000/kpc2000_i2c.c

> @@ -307,28 +301,28 @@ static int i801_block_transaction_byte_by_byte(struct 
> i2c_device *priv, union i2
>   else
>   smbcmd = I801_BLOCK_DATA;
>   }
> - outb_p(smbcmd | ENABLE_INT9, SMBHSTCNT(priv));
> + writeq(smbcmd | ENABLE_INT9, (void *)SMBHSTCNT(priv));
>  
>   if (i == 1)
> - outb_p(inb(SMBHSTCNT(priv)) | I801_START, 
> SMBHSTCNT(priv));
> + writeq(inb(SMBHSTCNT(priv)) | I801_START, (void 
> *)SMBHSTCNT(priv));

This inb() call looks like a bug. We perform a 64-bit operation when
talking to this hardware register everywhere else in this driver. Anyone
have more insight into the hardware with which this driver interacts
such that they could shed some light on the subject?

Probably a separate issue, but I did notice it as a result of this patch.

Thanks,
Geordan


Re: [PATCH RESEND 1/2] tools/perf: Add arch neutral function to choose event for perf kvm record

2019-06-10 Thread Ravi Bangoria



On 6/10/19 8:46 PM, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jun 10, 2019 at 12:15:17PM +0530, Anju T Sudhakar escreveu:
>> 'perf kvm record' uses 'cycles'(if the user did not specify any event) as
>> the default event to profile the guest.
>> This will not provide any proper samples from the guest incase of
>> powerpc architecture, since in powerpc the PMUs are controlled by
>> the guest rather than the host.
>>
>> Patch adds a function to pick an arch specific event for 'perf kvm record',
>> instead of selecting 'cycles' as a default event for all architectures.
>>
>> For powerpc this function checks for any user specified event, and if there
>> isn't any it returns invalid instead of proceeding with 'cycles' event.
> 
> Michael, Ravi, Maddy, could you please provide an Acked-by, Reviewed-by
> or Tested-by?

Code looks fine to me but cross-build fails for aarch64:

  builtin-kvm.c:1513:12: error: no previous prototype for 
'kvm_add_default_arch_event' [-Werror=missing-prototypes]
   int __weak kvm_add_default_arch_event(int *argc __maybe_unused,
  ^~
  cc1: all warnings being treated as errors
  mv: cannot stat './.builtin-kvm.o.tmp': No such file or directory

With the build fix:
Acked-by: Ravi Bangoria 
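
The usual way to satisfy -Wmissing-prototypes for a weak default is to declare the function before defining it, normally in a shared header. The sketch below assumes GCC or Clang; the `WEAK` and `MAYBE_UNUSED` macros stand in for the kernel-style `__weak` and `__maybe_unused`, and the second parameter is an assumption, since the error output above shows only the first one.

```c
#define WEAK         __attribute__((weak))
#define MAYBE_UNUSED __attribute__((unused))

/*
 * Prototype first. In a real tree this would live in a header that both
 * the generic file and the arch-specific override include.
 */
int kvm_add_default_arch_event(int *argc, const char **argv);

/* Weak default: does nothing; an architecture may provide a strong definition. */
int WEAK kvm_add_default_arch_event(int *argc MAYBE_UNUSED,
                                    const char **argv MAYBE_UNUSED)
{
    return 0;
}
```

Because the declaration is visible before the definition, the warning (and therefore the -Werror build failure) should not trigger for this function.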



Re: [PATCH v2 02/17] dt-bindings: soc: qcom: add IPA bindings

2019-06-10 Thread Alex Elder
On 6/10/19 5:08 PM, Rob Herring wrote:
> On Thu, May 30, 2019 at 9:53 PM Alex Elder  wrote:
>>
>> Add the binding definitions for the "qcom,ipa" device tree node.
>>
>> Signed-off-by: Alex Elder 
>> ---
>>  .../devicetree/bindings/net/qcom,ipa.yaml | 180 ++
>>  1 file changed, 180 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/net/qcom,ipa.yaml
>>
>> diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml 
>> b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
>> new file mode 100644
>> index 000000000000..0037fc278a61
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
>> @@ -0,0 +1,180 @@
>> +# SPDX-License-Identifier: GPL-2.0
> 
> New bindings are preferred to be dual GPL-2.0 and BSD-2-Clause. But
> that's really a decision for the submitter.

Thanks Rob.  I'll ask Qualcomm if there's any problem
with doing that; I presume not.  If I re-submit this
with dual licensing, I will include your Reviewed-by
despite the change, OK?

-Alex

> 
> Reviewed-by: Rob Herring 
> 



linux-next: manual merge of the crypto tree with Linus' tree

2019-06-10 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the crypto tree got conflicts in:

  drivers/crypto/vmx/aes.c
  drivers/crypto/vmx/aes_cbc.c
  drivers/crypto/vmx/aes_ctr.c
  drivers/crypto/vmx/aes_xts.c
  drivers/crypto/vmx/vmx.c

between commits:

  64d85cc99980 ("treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 
299")
  27ba4deb4e26 ("treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 
442")

from Linus' tree and commit:

  1fa0a7dcf759 ("crypto: vmx - convert to SPDX license identifiers")

from the crypto tree.

I fixed it up (I just used the SPDX tags from Linus' tree) and can
carry the fix as necessary. This is now fixed as far as linux-next is
concerned, but any non trivial conflicts should be mentioned to your
upstream maintainer when your tree is submitted for merging.  You may
also want to consider cooperating with the maintainer of the
conflicting tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell




RE: [PATCH V3 3/4] clk: imx: Add support for i.MX8MN clock driver

2019-06-10 Thread Anson Huang
Hi, Stephen

> -Original Message-
> From: Stephen Boyd 
> Sent: Monday, June 10, 2019 11:14 PM
> Subject: RE: [PATCH V3 3/4] clk: imx: Add support for i.MX8MN clock driver
> 
> Quoting Anson Huang (2019-06-08 02:58:18)
> > Hi, Stephen
> >
> > > -Original Message-
> > > From: Stephen Boyd 
> > > Sent: Saturday, June 8, 2019 2:01 AM
> > > Subject: RE: [PATCH V3 3/4] clk: imx: Add support for i.MX8MN clock
> > > driver
> > >
> > > Quoting Anson Huang (2019-06-06 17:50:28)
> > > >
> > > > I will use devm_platform_ioremap_resource() instead of ioremap(),
> > > > and can you be more specific about devmified clk registration?
> > > >
> > >
> > > I mean using things like devm_clk_hw_register().
> >
> > Sorry, I am still a little confused, all the clock
> > register(clk_register()) are via each different clock types like
> > imx_clk_gate4/imx_clk_pll14xx, if using clk_hw_register, means we need
> > to re-write the clock driver using different clk register method, that
> > will make the driver completely different from i.mx8mq/i.mx8mm, they
> > are actually same series of SoC as i.mx8mn, it will introduce many
> > confusion, is my understanding correct? And is it OK to just keep what it is
> > and make them all aligned?
> >
> 
> Ok, the problem I'm trying to point out is that clk registrations need to be
> undone, i.e. clk_unregister() needs to be called, when the driver fails to
> probe. devm_*() is one way to do this, but if you have other ways of
> removing all the registered clks then that works too. Makes sense?

Yes, it makes sense. Do you think it is OK to add an imx_unregister_clocks()
API and call it on every failure path in the .probe function? If yes, I will
add it and also fix the i.MX8MQ driver, which uses the platform driver model
but does NOT handle this case.

base = devm_platform_ioremap_resource(pdev, 0);
-   if (WARN_ON(IS_ERR(base)))
-   return PTR_ERR(base);
+   if (WARN_ON(IS_ERR(base))) {
+   ret = PTR_ERR(base);
+   goto unregister_clks;
+   }

pr_err("failed to register clks for i.MX8MN\n");
-   return -EINVAL;
+   goto unregister_clks;
}

return 0;
+
+unregister_clks:
+   imx_unregister_clocks(clks, ARRAY_SIZE(clks));
+
+   return ret;

+void imx_unregister_clocks(struct clk *clks[], unsigned int count)
+{
+   unsigned int i;
+
+   for (i = 0; i < count; i++)
+   clk_unregister(clks[i]);
+}
+

Thanks,
Anson.
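
The single-unwind-label pattern discussed above can be tried out in user space with a mock. Everything here is hypothetical (`struct fake_clk`, `fake_probe`, `fake_unregister_clocks`); it only mirrors the shape of the proposed imx_unregister_clocks() helper and does not use the clk framework.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct clk; it only tracks registration state. */
struct fake_clk {
    int registered;
};

static void fake_clk_unregister(struct fake_clk *clk)
{
    if (clk)
        clk->registered = 0;
}

/* Mirrors the proposed helper: undo every slot in the table. */
static void fake_unregister_clocks(struct fake_clk *clks[], unsigned int count)
{
    unsigned int i;

    for (i = 0; i < count; i++)
        fake_clk_unregister(clks[i]);
}

/*
 * Probe-like function: register clocks in order; if step `fail_at` fails,
 * jump to one unwind label so that nothing stays registered.
 */
static int fake_probe(struct fake_clk *clks[], unsigned int count,
                      unsigned int fail_at)
{
    unsigned int i;
    int ret = 0;

    assert(clks != NULL);
    for (i = 0; i < count; i++) {
        if (i == fail_at) {
            ret = -EINVAL;
            goto unregister_clks;
        }
        clks[i]->registered = 1;
    }
    return 0;

unregister_clks:
    fake_unregister_clocks(clks, count);
    return ret;
}
```

Note that every failure path sets `ret` before jumping to the label; a path that jumps without setting it would return a stale value.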


[PATCH -mm RESEND] mm: fix race between swapoff and mincore

2019-06-10 Thread Huang, Ying
From: Huang Ying 

Since commit 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB trunks"),
the address_space associated with a swap device is freed after swapoff.
So swap_address_space() users which touch the address_space need some
kind of mechanism to prevent the address_space from being freed while
it is being accessed.

When mincore processes an unmapped range for swapped-out shmem pages, it
doesn't hold a lock that prevents the swap device from being swapped off.
So the following race is possible:

CPU1                                    CPU2
do_mincore()                            swapoff()
  walk_page_range()
    mincore_unmapped_range()
      __mincore_unmapped_range
        mincore_page
          as = swap_address_space()
          ...                             exit_swap_address_space()
          ...                               kvfree(spaces)
          find_get_page(as)

The address space may be accessed after being freed.

To fix the race, find_get_page() is enclosed in
get_swap_device()/put_swap_device(), which checks whether the swap entry
is valid and prevents the swap device from being swapped off during the
access.

Fixes: 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB trunks")
Signed-off-by: "Huang, Ying" 
Reviewed-by: Andrew Morton 
Acked-by: Michal Hocko 
Cc: Hugh Dickins 
Cc: Paul E. McKenney 
Cc: Minchan Kim 
Cc: Johannes Weiner 
Cc: Tim Chen 
Cc: Mel Gorman 
Cc: Jérôme Glisse 
Cc: Andrea Arcangeli 
Cc: Yang Shi 
Cc: David Rientjes 
Cc: Rik van Riel 
Cc: Jan Kara 
Cc: Dave Jiang 
Cc: Daniel Jordan 
Cc: Andrea Parri 
---
 mm/mincore.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/mincore.c b/mm/mincore.c
index c3f058bd0faf..4fe91d497436 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -68,8 +68,16 @@ static unsigned char mincore_page(struct address_space 
*mapping, pgoff_t pgoff)
 */
if (xa_is_value(page)) {
swp_entry_t swp = radix_to_swp_entry(page);
-   page = find_get_page(swap_address_space(swp),
-swp_offset(swp));
+   struct swap_info_struct *si;
+
+   /* Prevent swap device to being swapoff under us */
+   si = get_swap_device(swp);
+   if (si) {
+   page = find_get_page(swap_address_space(swp),
+swp_offset(swp));
+   put_swap_device(si);
+   } else
+   page = NULL;
}
} else
page = find_get_page(mapping, pgoff);
-- 
2.20.1
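
The get/put guard used by the patch above can be illustrated with a single-threaded mock. This is a sketch under heavy assumptions: `struct fake_swap_dev` and the `fake_*` functions are hypothetical, and the synchronization the real get_swap_device()/put_swap_device() pair provides against a concurrent swapoff is not modelled at all. Only the control flow is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a swap device and its per-device lookup table. */
struct fake_swap_dev {
    bool valid;        /* cleared by "swapoff" */
    int users;         /* readers currently inside a get/put section */
    const int *cache;  /* stands in for the address_space; NULL once freed */
    size_t nr;
};

/* Returns the device, pinned, if it is still valid; NULL otherwise. */
static struct fake_swap_dev *fake_get_swap_device(struct fake_swap_dev *si)
{
    if (!si || !si->valid)
        return NULL;
    si->users++;
    return si;
}

static void fake_put_swap_device(struct fake_swap_dev *si)
{
    assert(si->users > 0);
    si->users--;
}

/* Guarded lookup shaped like the patched mincore_page(): -1 means "no page". */
static int fake_lookup(struct fake_swap_dev *si, size_t off)
{
    int val = -1;

    if (fake_get_swap_device(si)) {
        if (off < si->nr)
            val = si->cache[off];
        fake_put_swap_device(si);
    }
    return val;
}
```

The table is dereferenced only between a successful get and the matching put; when the device has already been swapped off, the lookup falls back to "no page", like the `page = NULL` branch in the diff.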



[PATCH V4 1/3] ocfs2: add last unlock times in locking_state

2019-06-10 Thread Gang He
The ocfs2 file system uses the locking_state file under debugfs to dump
each ocfs2 file system's dlm lock resources, but the number of dlm lock
resources in memory keeps growing as the user touches more files. This
makes it difficult for upper-level scripts to analyze the dlm lock
resource records in the locking_state file, because many of the records
belong to files that were accessed long ago and are no longer active.
So I'd like to add the last pr/ex unlock times to the locking_state file
for each dlm lock resource record; the upper-level scripts can then use
the last unlock time to filter out inactive dlm lock resource records.

Compared with v1, the main change is to use wall time in
microseconds for the last pr/ex unlock time.

Signed-off-by: Gang He 
Reviewed-by: Joseph Qi 
---
 fs/ocfs2/dlmglue.c | 18 +++---
 fs/ocfs2/ocfs2.h   |  1 +
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index af405586c5b1..3b0e7d399df2 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -474,6 +474,8 @@ static void ocfs2_update_lock_stats(struct ocfs2_lock_res 
*res, int level,
 
if (ret)
stats->ls_fail++;
+
+   stats->ls_last = ktime_to_us(ktime_get_real());
 }
 
 static inline void ocfs2_track_lock_refresh(struct ocfs2_lock_res *lockres)
@@ -3093,8 +3095,10 @@ static void *ocfs2_dlm_seq_next(struct seq_file *m, void 
*v, loff_t *pos)
  * - Lock stats printed
  * New in version 3
  * - Max time in lock stats is in usecs (instead of nsecs)
+ * New in version 4
+ * - Add last pr/ex unlock times in usecs
  */
-#define OCFS2_DLM_DEBUG_STR_VERSION 3
+#define OCFS2_DLM_DEBUG_STR_VERSION 4
 static int ocfs2_dlm_seq_show(struct seq_file *m, void *v)
 {
int i;
@@ -3145,6 +3149,8 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v)
 # define lock_max_prmode(_l)   ((_l)->l_lock_prmode.ls_max)
 # define lock_max_exmode(_l)   ((_l)->l_lock_exmode.ls_max)
 # define lock_refresh(_l)  ((_l)->l_lock_refresh)
+# define lock_last_prmode(_l)  ((_l)->l_lock_prmode.ls_last)
+# define lock_last_exmode(_l)  ((_l)->l_lock_exmode.ls_last)
 #else
 # define lock_num_prmode(_l)   (0)
 # define lock_num_exmode(_l)   (0)
@@ -3155,6 +3161,8 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v)
 # define lock_max_prmode(_l)   (0)
 # define lock_max_exmode(_l)   (0)
 # define lock_refresh(_l)  (0)
+# define lock_last_prmode(_l)  (0ULL)
+# define lock_last_exmode(_l)  (0ULL)
 #endif
/* The following seq_print was added in version 2 of this output */
seq_printf(m, "%u\t"
@@ -3165,7 +3173,9 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v)
   "%llu\t"
   "%u\t"
   "%u\t"
-  "%u\t",
+  "%u\t"
+  "%llu\t"
+  "%llu\t",
   lock_num_prmode(lockres),
   lock_num_exmode(lockres),
   lock_num_prmode_failed(lockres),
@@ -3174,7 +3184,9 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v)
   lock_total_exmode(lockres),
   lock_max_prmode(lockres),
   lock_max_exmode(lockres),
-  lock_refresh(lockres));
+  lock_refresh(lockres),
+  lock_last_prmode(lockres),
+  lock_last_exmode(lockres));
 
/* End the line */
seq_printf(m, "\n");
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 1f029fbe8b8d..6f43651f01b3 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -164,6 +164,7 @@ struct ocfs2_lock_stats {
 
/* Storing max wait in usecs saves 24 bytes per inode */
u32 ls_max; /* Max wait in USEC */
+   u64 ls_last;/* Last unlock time in USEC */
 };
 #endif
 
-- 
2.21.0



[PATCH V4 2/3] ocfs2: add locking filter debugfs file

2019-06-10 Thread Gang He
Add a locking filter debugfs file, which is used to filter the lock
resources dumped from the locking_state debugfs file.
The d_filter_secs field controls the filtering: the default value (0)
filters nothing; otherwise, only lock resources that were active in the
last N seconds are dumped.
This enhancement avoids dumping lots of old records.
The d_filter_secs value can be changed via the locking_filter file.

Compared with v3, the filter is adjusted because the last lock/unlock
time now uses wall time in microseconds; the CONFIG_OCFS2_FS_STATS
macro positions are also adjusted.
Compared with v2, ocfs2_dlm_init_debug() returns directly with an
error when creating the locking filter debugfs file fails, since
ocfs2_dlm_shutdown_debug() will handle this failure.
Compared with v1, the main change is to add a CONFIG_OCFS2_FS_STATS
macro check.

Signed-off-by: Gang He 
Reviewed-by: Joseph Qi 
---
 fs/ocfs2/dlmglue.c | 38 ++
 fs/ocfs2/ocfs2.h   |  2 ++
 2 files changed, 40 insertions(+)

diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 3b0e7d399df2..d4caa6d117c6 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -3005,6 +3005,8 @@ struct ocfs2_dlm_debug *ocfs2_new_dlm_debug(void)
kref_init(&dlm_debug->d_refcnt);
INIT_LIST_HEAD(&dlm_debug->d_lockres_tracking);
dlm_debug->d_locking_state = NULL;
+   dlm_debug->d_locking_filter = NULL;
+   dlm_debug->d_filter_secs = 0;
 out:
return dlm_debug;
 }
@@ -3104,10 +3106,34 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void 
*v)
int i;
char *lvb;
struct ocfs2_lock_res *lockres = v;
+#ifdef CONFIG_OCFS2_FS_STATS
+   u64 now, last;
+   struct ocfs2_dlm_debug *dlm_debug =
+   ((struct ocfs2_dlm_seq_priv *)m->private)->p_dlm_debug;
+#endif
 
if (!lockres)
return -EINVAL;
 
+#ifdef CONFIG_OCFS2_FS_STATS
+   if (dlm_debug->d_filter_secs) {
+   now = ktime_to_us(ktime_get_real());
+   if (lockres->l_lock_prmode.ls_last >
+   lockres->l_lock_exmode.ls_last)
+   last = lockres->l_lock_prmode.ls_last;
+   else
+   last = lockres->l_lock_exmode.ls_last;
+   /*
+* Use d_filter_secs field to filter lock resources dump,
+* the default d_filter_secs(0) value filters nothing,
+* otherwise, only dump the last N seconds active lock
+* resources.
+*/
+   if ((now - last) / 1000000 > dlm_debug->d_filter_secs)
+   return 0;
+   }
+#endif
+
seq_printf(m, "0x%x\t", OCFS2_DLM_DEBUG_STR_VERSION);
 
if (lockres->l_type == OCFS2_LOCK_TYPE_DENTRY)
@@ -3257,6 +3283,17 @@ static int ocfs2_dlm_init_debug(struct ocfs2_super *osb)
goto out;
}
 
+   dlm_debug->d_locking_filter = debugfs_create_u32("locking_filter",
+   0600,
+   osb->osb_debug_root,
+   &dlm_debug->d_filter_secs);
+   if (!dlm_debug->d_locking_filter) {
+   ret = -EINVAL;
+   mlog(ML_ERROR,
+"Unable to create locking filter debugfs file.\n");
+   goto out;
+   }
+
ocfs2_get_dlm_debug(dlm_debug);
 out:
return ret;
@@ -3268,6 +3305,7 @@ static void ocfs2_dlm_shutdown_debug(struct ocfs2_super 
*osb)
 
if (dlm_debug) {
debugfs_remove(dlm_debug->d_locking_state);
+   debugfs_remove(dlm_debug->d_locking_filter);
ocfs2_put_dlm_debug(dlm_debug);
}
 }
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 6f43651f01b3..6d0a77703d0e 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -237,6 +237,8 @@ struct ocfs2_orphan_scan {
 struct ocfs2_dlm_debug {
struct kref d_refcnt;
struct dentry *d_locking_state;
+   struct dentry *d_locking_filter;
+   u32 d_filter_secs;
struct list_head d_lockres_tracking;
 };
 
-- 
2.21.0
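
The filter decision described in the commit message can be written as a small pure function, which makes the unit handling explicit: timestamps are in microseconds while the filter value is in seconds. This is a user-space sketch, not the ocfs2 code; `should_dump` is a hypothetical name, and the guard for a clock that has stepped backwards is an addition that the patch does not contain.

```c
#include <stdbool.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/*
 * Decide whether a lock resource record should be dumped.
 * All times are wall-clock microseconds; filter_secs == 0 disables filtering.
 */
static bool should_dump(uint64_t now_us, uint64_t last_pr_us,
                        uint64_t last_ex_us, uint32_t filter_secs)
{
    /* The most recent of the two unlock times decides. */
    uint64_t last_us = last_pr_us > last_ex_us ? last_pr_us : last_ex_us;

    if (filter_secs == 0)
        return true;
    /* Assumption: if wall time went backwards, dump rather than underflow. */
    if (now_us < last_us)
        return true;
    return (now_us - last_us) / USEC_PER_SEC <= filter_secs;
}
```

A record whose last unlock was exactly `filter_secs` seconds ago is still dumped, matching the `>` comparison in the diff.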



[PATCH V4 3/3] ocfs2: add first lock wait time in locking_state

2019-06-10 Thread Gang He
The ocfs2 file system uses the locking_state file under debugfs to dump
each ocfs2 file system's dlm lock resources, but users have
encountered hang (deadlock) problems in the ocfs2 file system.
I'd like to add the first lock wait time to the locking_state file, which
can help upper-level scripts detect these deadlock problems by
comparing the first lock wait time with the current time.

Signed-off-by: Gang He 
---
 fs/ocfs2/dlmglue.c | 32 +---
 fs/ocfs2/ocfs2.h   |  1 +
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index d4caa6d117c6..8ce4b76f81ee 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -440,6 +440,7 @@ static void ocfs2_remove_lockres_tracking(struct 
ocfs2_lock_res *res)
 static void ocfs2_init_lock_stats(struct ocfs2_lock_res *res)
 {
res->l_lock_refresh = 0;
+   res->l_lock_wait = 0;
memset(&res->l_lock_prmode, 0, sizeof(struct ocfs2_lock_stats));
memset(&res->l_lock_exmode, 0, sizeof(struct ocfs2_lock_stats));
 }
@@ -483,6 +484,21 @@ static inline void ocfs2_track_lock_refresh(struct 
ocfs2_lock_res *lockres)
lockres->l_lock_refresh++;
 }
 
+static inline void ocfs2_track_lock_wait(struct ocfs2_lock_res *lockres)
+{
+   struct ocfs2_mask_waiter *mw;
+
+   if (list_empty(&lockres->l_mask_waiters)) {
+   lockres->l_lock_wait = 0;
+   return;
+   }
+
+   mw = list_first_entry(&lockres->l_mask_waiters,
+   struct ocfs2_mask_waiter, mw_item);
+   lockres->l_lock_wait =
+   ktime_to_us(ktime_mono_to_real(mw->mw_lock_start));
+}
+
 static inline void ocfs2_init_start_time(struct ocfs2_mask_waiter *mw)
 {
mw->mw_lock_start = ktime_get();
@@ -498,6 +514,9 @@ static inline void ocfs2_update_lock_stats(struct 
ocfs2_lock_res *res,
 static inline void ocfs2_track_lock_refresh(struct ocfs2_lock_res *lockres)
 {
 }
+static inline void ocfs2_track_lock_wait(struct ocfs2_lock_res *lockres)
+{
+}
 static inline void ocfs2_init_start_time(struct ocfs2_mask_waiter *mw)
 {
 }
@@ -891,6 +910,7 @@ static void lockres_set_flags(struct ocfs2_lock_res 
*lockres,
list_del_init(&mw->mw_item);
mw->mw_status = 0;
complete(&mw->mw_complete);
+   ocfs2_track_lock_wait(lockres);
}
 }
 static void lockres_or_flags(struct ocfs2_lock_res *lockres, unsigned long or)
@@ -1402,6 +1422,7 @@ static void lockres_add_mask_waiter(struct ocfs2_lock_res 
*lockres,
list_add_tail(&mw->mw_item, &lockres->l_mask_waiters);
mw->mw_mask = mask;
mw->mw_goal = goal;
+   ocfs2_track_lock_wait(lockres);
 }
 
 /* returns 0 if the mw that was removed was already satisfied, -EBUSY
@@ -1418,6 +1439,7 @@ static int __lockres_remove_mask_waiter(struct 
ocfs2_lock_res *lockres,
 
list_del_init(&mw->mw_item);
init_completion(&mw->mw_complete);
+   ocfs2_track_lock_wait(lockres);
}
 
return ret;
@@ -3098,7 +3120,7 @@ static void *ocfs2_dlm_seq_next(struct seq_file *m, void 
*v, loff_t *pos)
  * New in version 3
  * - Max time in lock stats is in usecs (instead of nsecs)
  * New in version 4
- * - Add last pr/ex unlock times in usecs
+ * - Add last pr/ex unlock times and first lock wait time in usecs
  */
 #define OCFS2_DLM_DEBUG_STR_VERSION 4
 static int ocfs2_dlm_seq_show(struct seq_file *m, void *v)
@@ -3116,7 +3138,7 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v)
return -EINVAL;
 
 #ifdef CONFIG_OCFS2_FS_STATS
-   if (dlm_debug->d_filter_secs) {
+   if (!lockres->l_lock_wait && dlm_debug->d_filter_secs) {
now = ktime_to_us(ktime_get_real());
if (lockres->l_lock_prmode.ls_last >
lockres->l_lock_exmode.ls_last)
@@ -3177,6 +3199,7 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v)
 # define lock_refresh(_l)  ((_l)->l_lock_refresh)
 # define lock_last_prmode(_l)  ((_l)->l_lock_prmode.ls_last)
 # define lock_last_exmode(_l)  ((_l)->l_lock_exmode.ls_last)
+# define lock_wait(_l) ((_l)->l_lock_wait)
 #else
 # define lock_num_prmode(_l)   (0)
 # define lock_num_exmode(_l)   (0)
@@ -3189,6 +3212,7 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v)
 # define lock_refresh(_l)  (0)
 # define lock_last_prmode(_l)  (0ULL)
 # define lock_last_exmode(_l)  (0ULL)
+# define lock_wait(_l) (0ULL)
 #endif
/* The following seq_print was added in version 2 of this output */
seq_printf(m, "%u\t"
@@ -3201,6 +3225,7 @@ static int ocfs2_dlm_seq_show(struct seq_file *m, void *v)
   "%u\t"
   "%u\t"
   "%llu\t"
+  "%llu\t"
   "%llu\t",
   lock_num_prmode(lockres),
   lock_num_exmode(lockres),
@@ -3212,7 +3237,8 @@ 

Re: [PATCH v3 0/3] KVM: Yield to IPI target if necessary

2019-06-10 Thread Nadav Amit
> On Jun 10, 2019, at 6:45 PM, Wanpeng Li  wrote:
> 
> On Tue, 11 Jun 2019 at 09:11, Sean Christopherson
>  wrote:
>> On Mon, Jun 10, 2019 at 04:34:20PM +0200, Radim Krčmář wrote:
>>> 2019-05-30 09:05+0800, Wanpeng Li:
>>>> The idea is from Xen, when sending a call-function IPI-many to vCPUs,
>>>> yield if any of the IPI target vCPUs was preempted. 17% performance
>>>> increasement of ebizzy benchmark can be observed in an over-subscribe
>>>> environment. (w/ kvm-pv-tlb disabled, testing TLB flush call-function
>>>> IPI-many since call-function is not easy to be trigged by userspace
>>>> workload).
>>> 
>>> Have you checked if we could gain performance by having the yield as an
>>> extension to our PV IPI call?
>>> 
>>> It would allow us to skip the VM entry/exit overhead on the caller.
>>> (The benefit of that might be negligible and it also poses a
>>> complication when splitting the target mask into several PV IPI
>>> hypercalls.)
>> 
>> Tangentially related to splitting PV IPI hypercalls, are there any major
>> hurdles to supporting shorthand?  Not having to generate the mask for
>> ->send_IPI_allbutself and ->kvm_send_ipi_all seems like an easy way to
>> shave cycles for affected flows.
> 
> Not sure why shorthand is not used for native x2apic mode.

Why do you say so? native_send_call_func_ipi() checks if allbutself
shorthand should be used and does so (even though the check can be more
efficient - I’m looking at that code right now…)
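
For illustration only, the kind of check being discussed can be sketched in
userspace C. This is a toy model under stated assumptions: `toy_cpumask_t` and
`can_use_allbutself()` are hypothetical names, CPUs are bits in a 64-bit word,
and the real kernel function works on struct cpumask and may apply further
conditions that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy CPU mask (hypothetical): bit n set means CPU n is in the mask. */
typedef uint64_t toy_cpumask_t;

/*
 * Return true when an IPI aimed at 'target' could use the "all but self"
 * shorthand: the target must be exactly the online CPUs minus the sender.
 */
static bool can_use_allbutself(toy_cpumask_t target, toy_cpumask_t online,
			       unsigned int self)
{
	toy_cpumask_t allbutself;

	assert(self < 64);	/* the toy mask only has 64 bits */
	allbutself = online & ~(UINT64_C(1) << self);

	return target == allbutself;
}
```

When the check fails, the sender has to fall back to building and walking an
explicit destination mask, which is the cost the shorthand avoids.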

[RESEND PATCH V1] can: sja1000: f81601: add Fintek F81601 support

2019-06-10 Thread Ji-Ze Hong (Peter Hong)
This patch adds support for the Fintek F81601 PCIE to 2 CAN controller.

Signed-off-by: Ji-Ze Hong (Peter Hong) 
---
 drivers/net/can/sja1000/Kconfig  |   8 ++
 drivers/net/can/sja1000/Makefile |   1 +
 drivers/net/can/sja1000/f81601.c | 223 +++
 3 files changed, 232 insertions(+)
 create mode 100644 drivers/net/can/sja1000/f81601.c

diff --git a/drivers/net/can/sja1000/Kconfig b/drivers/net/can/sja1000/Kconfig
index f6dc89927ece..8588323c5138 100644
--- a/drivers/net/can/sja1000/Kconfig
+++ b/drivers/net/can/sja1000/Kconfig
@@ -101,4 +101,12 @@ config CAN_TSCAN1
  IRQ numbers are read from jumpers JP4 and JP5,
  SJA1000 IO base addresses are chosen heuristically (first that works).
 
+config CAN_F81601
+   tristate "Fintek F81601 PCIE to 2 CAN Controller"
+   depends on PCI
+   help
+ This driver adds support for Fintek F81601 PCIE to 2 CAN Controller.
+ It has an internal 24MHz clock source, but this can be changed by the
+ manufacturer. Use modinfo to see the available module parameters.
+ Visit http://www.fintek.com.tw for more information.
 endif
diff --git a/drivers/net/can/sja1000/Makefile b/drivers/net/can/sja1000/Makefile
index 9253aaf9e739..6f6268543bd9 100644
--- a/drivers/net/can/sja1000/Makefile
+++ b/drivers/net/can/sja1000/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_CAN_PEAK_PCMCIA) += peak_pcmcia.o
 obj-$(CONFIG_CAN_PEAK_PCI) += peak_pci.o
 obj-$(CONFIG_CAN_PLX_PCI) += plx_pci.o
 obj-$(CONFIG_CAN_TSCAN1) += tscan1.o
+obj-$(CONFIG_CAN_F81601) += f81601.o
diff --git a/drivers/net/can/sja1000/f81601.c b/drivers/net/can/sja1000/f81601.c
new file mode 100644
index 000000000000..1578bb837aaf
--- /dev/null
+++ b/drivers/net/can/sja1000/f81601.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Fintek F81601 PCIE to 2 CAN controller driver
+ *
+ * Copyright (C) 2019 Peter Hong 
+ * Copyright (C) 2019 Linux Foundation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "sja1000.h"
+
+#define F81601_PCI_MAX_CHAN        2
+
+#define F81601_DECODE_REG          0x209
+#define F81601_IO_MODE             BIT(7)
+#define F81601_MEM_MODE            BIT(6)
+#define F81601_CFG_MODE            BIT(5)
+#define F81601_CAN2_INTERNAL_CLK   BIT(3)
+#define F81601_CAN1_INTERNAL_CLK   BIT(2)
+#define F81601_CAN2_EN             BIT(1)
+#define F81601_CAN1_EN             BIT(0)
+
+#define F81601_TRAP_REG            0x20a
+#define F81601_CAN2_HAS_EN         BIT(4)
+
+struct f81601_pci_card {
+   int channels;   /* detected channels count */
+   void __iomem *addr;
+   spinlock_t lock;    /* for access mem io */
+   struct pci_dev *dev;
+   struct net_device *net_dev[F81601_PCI_MAX_CHAN];
+};
+
+static const struct pci_device_id f81601_pci_tbl[] = {
+   { PCI_DEVICE(0x1c29, 0x1703) },
+   {},
+};
+
+MODULE_DEVICE_TABLE(pci, f81601_pci_tbl);
+
+static bool internal_clk = 1;
+module_param(internal_clk, bool, 0444);
+MODULE_PARM_DESC(internal_clk, "Use internal clock, default 1 (24MHz)");
+
+static unsigned int external_clk;
+module_param(external_clk, uint, 0444);
+MODULE_PARM_DESC(external_clk, "External Clock, must spec when internal_clk = 0");
+
+static u8 f81601_pci_read_reg(const struct sja1000_priv *priv, int port)
+{
+   return readb(priv->reg_base + port);
+}
+
+static void f81601_pci_write_reg(const struct sja1000_priv *priv, int port,
+u8 val)
+{
+   struct f81601_pci_card *card = priv->priv;
+   unsigned long flags;
+
+   spin_lock_irqsave(&card->lock, flags);
+   writeb(val, priv->reg_base + port);
+   readb(priv->reg_base);
+   spin_unlock_irqrestore(&card->lock, flags);
+}
+
+static void f81601_pci_del_card(struct pci_dev *pdev)
+{
+   struct f81601_pci_card *card = pci_get_drvdata(pdev);
+   struct net_device *dev;
+   int i = 0;
+
+   for (i = 0; i < F81601_PCI_MAX_CHAN; i++) {
+   dev = card->net_dev[i];
+   if (!dev)
+   continue;
+
+   dev_info(&pdev->dev, "%s: Removing %s\n", __func__, dev->name);
+
+   unregister_sja1000dev(dev);
+   free_sja1000dev(dev);
+   }
+
+   pcim_iounmap(pdev, card->addr);
+}
+
+/* Probe F81601 based device for the SJA1000 chips and register each
+ * available CAN channel to SJA1000 Socket-CAN subsystem.
+ */
+static int f81601_pci_add_card(struct pci_dev *pdev,
+  const struct pci_device_id *ent)
+{
+   struct sja1000_priv *priv;
+   struct net_device *dev;
+   struct f81601_pci_card *card;
+   int err, i;
+   u8 tmp;
+
+   if (pcim_enable_device(pdev) < 0) {
+   dev_err(&pdev->dev, "Failed to enable PCI device\n");
+   return -ENODEV;
+   }
+
+   dev_info(&pdev->dev, "Detected 

Re: [PATCH v3 0/3] KVM: Yield to IPI target if necessary

2019-06-10 Thread Wanpeng Li
On Tue, 11 Jun 2019 at 09:11, Sean Christopherson
 wrote:
>
> On Mon, Jun 10, 2019 at 04:34:20PM +0200, Radim Krčmář wrote:
> > 2019-05-30 09:05+0800, Wanpeng Li:
> > > The idea is from Xen: when sending a call-function IPI-many to vCPUs,
> > > yield if any of the IPI target vCPUs was preempted. A 17% performance
> > > improvement on the ebizzy benchmark can be observed in an over-subscribed
> > > environment (w/ kvm-pv-tlb disabled, testing TLB flush call-function
> > > IPI-many, since call-function is not easy to trigger from a userspace
> > > workload).
> >
> > Have you checked if we could gain performance by having the yield as an
> > extension to our PV IPI call?
> >
> > It would allow us to skip the VM entry/exit overhead on the caller.
> > (The benefit of that might be negligible and it also poses a
> >  complication when splitting the target mask into several PV IPI
> >  hypercalls.)
>
> Tangentially related to splitting PV IPI hypercalls, are there any major
> hurdles to supporting shorthand?  Not having to generate the mask for
> ->send_IPI_allbutself and ->kvm_send_ipi_all seems like an easy way to
> shave cycles for affected flows.

Not sure why shorthand is not used for native x2apic mode.

Regards,
Wanpeng Li


Re: [PATCH v2 1/2] KVM: LAPIC: Optimize timer latency consider world switch time

2019-06-10 Thread Wanpeng Li
On Tue, 11 Jun 2019 at 09:21, Sean Christopherson
 wrote:
>
> On Fri, May 31, 2019 at 02:40:13PM +0800, Wanpeng Li wrote:
> > From: Wanpeng Li 
> >
> > The advance lapic timer tries to hide the hypervisor overhead between the
> > moment the host-emulated timer fires and the moment the guest becomes aware
> > that it has fired. However, even after sustained optimizations,
> > kvm-unit-tests/tscdeadline_latency still observes ~1000 cycles of latency,
> > because we lose the time between the end of wait_lapic_expire() and the
> > guest noticing that the timer has fired. There is code between the end of
> > wait_lapic_expire() and the world switch, and the world switch itself also
> > has overhead. Waiting in wait_lapic_expire() until guest_tsc equals the
> > target deadline is therefore too late: the guest will see the latency
> > between the end of wait_lapic_expire() and the completion of vmentry.
> > This patch takes this time into consideration.
> >
> > The vmentry_lapic_timer_advance_ns module parameter should be tuned by the
> > host admin; setting bit 0 to 1 caches the parameter in KVM. This patch
> > can reduce average cyclictest latency from 3us to 2us on a Skylake server
> > (guest w/ nohz=off, idle=poll, host w/ preemption_timer=N; in my testing
> > the cyclictest latency is not very sensitive to this optimization when
> > preemption_timer=Y), and kvm-unit-tests/tscdeadline_latency can reach 0.
> >
> > Cc: Paolo Bonzini 
> > Cc: Radim Krčmář 
> > Cc: Sean Christopherson 
> > Signed-off-by: Wanpeng Li 
> > ---
> > NOTE: rebase on https://lkml.org/lkml/2019/5/20/449
> > v1 -> v2:
> >  * rename get_vmentry_advance_delta to get_vmentry_advance_cycles
> >  * cache vmentry_advance_cycles by setting param bit 0
> >  * add param max limit
> >
> >  arch/x86/kvm/lapic.c   | 38 +++---
> >  arch/x86/kvm/lapic.h   |  3 +++
> >  arch/x86/kvm/vmx/vmx.c |  2 +-
> >  arch/x86/kvm/x86.c |  9 +
> >  arch/x86/kvm/x86.h |  2 ++
> >  5 files changed, 50 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index fcf42a3..60587b5 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -1531,6 +1531,38 @@ static inline void adjust_lapic_timer_advance(struct kvm_vcpu *vcpu,
> >   apic->lapic_timer.timer_advance_ns = timer_advance_ns;
> >  }
> >
> > +#define MAX_VMENTRY_ADVANCE_NS 1000
> > +
> > +u64 compute_vmentry_advance_cycles(struct kvm_vcpu *vcpu)
>
> This can be static, unless get_vmentry_advance_cycles() is moved to
> lapic.h, in which case compute_vmentry_advance_cycles() would need to be
> exported.

Thanks for the review, Sean. I think Paolo has already dropped this one.
https://lkml.org/lkml/2019/5/31/210

Regards,
Wanpeng Li


linux-next: manual merge of the net-next tree with the net tree

2019-06-10 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  kernel/bpf/verifier.c

between commit:

  983695fa6765 ("bpf: fix unconnected udp hooks")

from the net tree and commit:

  5cf1e9145630 ("bpf: cgroup inet skb programs can return 0 to 3")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This is
now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your
tree is submitted for merging.  You may also want to consider
cooperating with the maintainer of the conflicting tree to minimise any
particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc kernel/bpf/verifier.c
index a5c369e60343,5c2cb5bd84ce..000000000000
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@@ -5353,12 -5513,13 +5505,16 @@@ static int check_return_code(struct bpf
struct tnum range = tnum_range(0, 1);
  
switch (env->prog->type) {
 +  case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
 +  if (env->prog->expected_attach_type == BPF_CGROUP_UDP4_RECVMSG ||
 +  env->prog->expected_attach_type == BPF_CGROUP_UDP6_RECVMSG)
 +  range = tnum_range(1, 1);
case BPF_PROG_TYPE_CGROUP_SKB:
+   if (env->prog->expected_attach_type == BPF_CGROUP_INET_EGRESS) {
+   range = tnum_range(0, 3);
+   enforce_attach_type_range = tnum_range(2, 3);
+   }
case BPF_PROG_TYPE_CGROUP_SOCK:
 -  case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
case BPF_PROG_TYPE_SOCK_OPS:
case BPF_PROG_TYPE_CGROUP_DEVICE:
case BPF_PROG_TYPE_CGROUP_SYSCTL:
@@@ -5385,9 -5546,13 +5541,13 @@@
verbose(env, "has unknown scalar value");
}
tnum_strn(tn_buf, sizeof(tn_buf), range);
 -  verbose(env, " should have been %s\n", tn_buf);
 +  verbose(env, " should have been in %s\n", tn_buf);
return -EINVAL;
}
+ 
+   if (!tnum_is_unknown(enforce_attach_type_range) &&
+   tnum_in(enforce_attach_type_range, reg->var_off))
+   env->prog->enforce_expected_attach_type = 1;
return 0;
  }
  




Re: [PATCH net-next] hinic: fix a bug in set rx mode

2019-06-10 Thread xuechaojing

Yes, this patch fixes the Oops.

xue

On 2019/6/11 8:45, dann frazier wrote:

On Mon, May 27, 2019 at 10:10:05PM +0000, Xue Chaojing wrote:

In set_rx_mode, both __dev_mc_sync and the netdev_for_each_mc_addr loop
set the multicast MAC addresses, so each address is set repeatedly. Delete
the redundant loop.

fyi, I'm told this fixes the following Oops (in case it makes sense to
queue it for stable):

[ 642.914581] Internal error: Oops: 9605 [#1] SMP
[ 642.919444] Modules linked in: hinic(-) 8021q garp mrp stp llc ses enclosure
sg nls_utf8 isofs vfat fat loop ipmi_ssif crc32_ce crct10dif_ce ghash_ce
sha2_ce sha256_arm64 sha1_ce sbsa_gwdt hns_roce_hw_v2 hns_roce ib_core ipmi_si
ipmi_devintf ipmi_msghandler xfs libcrc32c marvell hibmc_drm drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops ttm qla2xxx ixgbe drm mpt3sas
nvme_fc hns3 hisi_sas_v3_hw igb nvme_fabrics hisi_sas_main hclge mdio
scsi_transport_fc nvme libsas hnae3 raid_class nvme_core scsi_transport_sas
i2c_algo_bit gpio_dwapb gpio_generic dm_mirror dm_region_hash dm_log dm_mod
[last unloaded: hinic]
[ 642.974177] CPU: 4 PID: 5339 Comm: kworker/u256:1 Kdump: loaded Not tainted 4.18.0-74.el8.aarch64 #1
[ 642.983293] Hardware name: Huawei TaiShan 2280 V2/BC82AMDA, BIOS TA BIOS 2280-A CS V2.16.01 03/16/2019
[ 642.992591] Workqueue: hinic_dev set_rx_mode [hinic]
[ 642.997542] pstate: 00c9 (nzcv daif +PAN +UAO)
[ 643.002320] pc : add_mac_addr+0xa4/0x100 [hinic]
[ 643.006924] lr : set_rx_mode+0x88/0xc0 [hinic]
[ 643.011353] sp : 3228fd40
[ 643.014653] x29: 3228fd40 x28: 
[ 643.019952] x27: b955c362ff38 x26: 27ccd2cc3110
[ 643.025250] x25:  x24: b955025c6b08
[ 643.030547] x23: b955025c6000 x22: 0010
[ 643.035845] x21: 27cc56040488 x20: b955025c6ac0
[ 643.041142] x19:  x18: 0010
[ 643.046440] x17: b7135830 x16: 27ccd2259bb8
[ 643.051737] x15:  x14: 2030302031302039
[ 643.057035] x13: 33203d2072646461 x12: 2063616d20746573
[ 643.062332] x11: 203a296465726574 x10: 0d10
[ 643.067630] x9 : 3228f9f0 x8 : b95501756170
[ 643.072927] x7 : 198c000940300814 x6 : 3228fd08
[ 643.078225] x5 :  x4 : 
[ 643.083523] x3 :  x2 : 0001
[ 643.088820] x1 : 0010 x0 : 00e3
[ 643.094118] Process kworker/u256:1 (pid: 5339, stack limit = 0x23b4f182)
[ 643.101498] Call trace:
[ 643.103932] add_mac_addr+0xa4/0x100 [hinic]
[ 643.108189] set_rx_mode+0x88/0xc0 [hinic]
[ 643.112272] process_one_work+0x1ac/0x3e0
[ 643.116268] worker_thread+0x44/0x448
[ 643.119916] kthread+0x130/0x138
[ 643.123130] ret_from_fork+0x10/0x18
[ 643.126692] Code: a9425bf5 a94363f7 a8c47bfd d65f03c0 (394016c7)
[ 643.132828] SMP: stopping secondary CPUs
[ 643.139859] Starting crashdump kernel...
[ 643.143771] Bye!

   -dann


Signed-off-by: Xue Chaojing 
---
  drivers/net/ethernet/huawei/hinic/hinic_main.c | 4 
  1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/huawei/hinic/hinic_main.c 
b/drivers/net/ethernet/huawei/hinic/hinic_main.c
index e64bc664f687..cfd3f4232cac 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_main.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_main.c
@@ -724,7 +724,6 @@ static void set_rx_mode(struct work_struct *work)
  {
struct hinic_rx_mode_work *rx_mode_work = work_to_rx_mode_work(work);
struct hinic_dev *nic_dev = rx_mode_work_to_nic_dev(rx_mode_work);
-   struct netdev_hw_addr *ha;
  
  	netif_info(nic_dev, drv, nic_dev->netdev, "set rx mode work\n");
  
@@ -732,9 +731,6 @@ static void set_rx_mode(struct work_struct *work)
  
  	__dev_uc_sync(nic_dev->netdev, add_mac_addr, remove_mac_addr);

__dev_mc_sync(nic_dev->netdev, add_mac_addr, remove_mac_addr);
-
-   netdev_for_each_mc_addr(ha, nic_dev->netdev)
-   add_mac_addr(nic_dev->netdev, ha->addr);
  }
  
  static void hinic_set_rx_mode(struct net_device *netdev)




[PATCH] soc: imx: Add i.MX8MN SoC driver support

2019-06-10 Thread Anson . Huang
From: Anson Huang 

This patch adds i.MX8MN SoC driver support:

root@imx8mnevk:~# cat /sys/devices/soc0/family
Freescale i.MX

root@imx8mnevk:~# cat /sys/devices/soc0/machine
NXP i.MX8MNano DDR4 EVK board

root@imx8mnevk:~# cat /sys/devices/soc0/soc_id
i.MX8MN

root@imx8mnevk:~# cat /sys/devices/soc0/revision
1.0

Signed-off-by: Anson Huang 
---
 drivers/soc/imx/soc-imx8.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/soc/imx/soc-imx8.c b/drivers/soc/imx/soc-imx8.c
index 3842d09..02309a2 100644
--- a/drivers/soc/imx/soc-imx8.c
+++ b/drivers/soc/imx/soc-imx8.c
@@ -55,7 +55,12 @@ static u32 __init imx8mm_soc_revision(void)
void __iomem *anatop_base;
u32 rev;
 
-   np = of_find_compatible_node(NULL, NULL, "fsl,imx8mm-anatop");
+   if (of_machine_is_compatible("fsl,imx8mm"))
+   np = of_find_compatible_node(NULL, NULL, "fsl,imx8mm-anatop");
+   else if (of_machine_is_compatible("fsl,imx8mn"))
+   np = of_find_compatible_node(NULL, NULL, "fsl,imx8mn-anatop");
+   else
+   np = NULL;
if (!np)
return 0;
 
@@ -79,9 +84,15 @@ static const struct imx8_soc_data imx8mm_soc_data = {
.soc_revision = imx8mm_soc_revision,
 };
 
+static const struct imx8_soc_data imx8mn_soc_data = {
+   .name = "i.MX8MN",
+   .soc_revision = imx8mm_soc_revision,
+};
+
 static const struct of_device_id imx8_soc_match[] = {
{ .compatible = "fsl,imx8mq", .data = &imx8mq_soc_data, },
{ .compatible = "fsl,imx8mm", .data = &imx8mm_soc_data, },
+   { .compatible = "fsl,imx8mn", .data = &imx8mn_soc_data, },
{ }
 };
 
-- 
2.7.4



linux-next: manual merge of the net-next tree with the net tree

2019-06-10 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  drivers/net/ethernet/mellanox/mlx5/core/cmd.c

between commit:

  6a6fabbfa3e8 ("net/mlx5: Update pci error handler entries and command translation")

from the net tree and commit:

  cd56f929e6a5 ("net/mlx5: E-Switch, Replace host_params event with functions_changed event")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index e94686c42000,30f7dffb5b1b..000000000000
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@@ -632,11 -628,7 +632,11 @@@ const char *mlx5_command_str(int comman
MLX5_COMMAND_STR_CASE(QUERY_MODIFY_HEADER_CONTEXT);
MLX5_COMMAND_STR_CASE(ALLOC_MEMIC);
MLX5_COMMAND_STR_CASE(DEALLOC_MEMIC);
-   MLX5_COMMAND_STR_CASE(QUERY_HOST_PARAMS);
+   MLX5_COMMAND_STR_CASE(QUERY_ESW_FUNCTIONS);
 +  MLX5_COMMAND_STR_CASE(CREATE_UCTX);
 +  MLX5_COMMAND_STR_CASE(DESTROY_UCTX);
 +  MLX5_COMMAND_STR_CASE(CREATE_UMEM);
 +  MLX5_COMMAND_STR_CASE(DESTROY_UMEM);
default: return "unknown command opcode";
}
  }




[PATCH -next v2] packet: remove unused variable 'status' in __packet_lookup_frame_in_block

2019-06-10 Thread Mao Wenan
The variable 'status' in __packet_lookup_frame_in_block() has never been used
since its introduction in commit f6fb8f100b80 ("af-packet: TPACKET_V3 flexible
buffer implementation."), so we can remove it.

Signed-off-by: Mao Wenan 
---
 v2: don't change the parameter from 0 to TP_STATUS_KERNEL when calling
 prb_retire_current_block().
---
 net/packet/af_packet.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index a29d66da7394..7fa847dcea30 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1003,7 +1003,6 @@ static void prb_fill_curr_block(char *curr,
 /* Assumes caller has the sk->rx_queue.lock */
 static void *__packet_lookup_frame_in_block(struct packet_sock *po,
struct sk_buff *skb,
-   int status,
unsigned int len
)
 {
@@ -1075,7 +1074,7 @@ static void *packet_current_rx_frame(struct packet_sock *po,
po->rx_ring.head, status);
return curr;
case TPACKET_V3:
-   return __packet_lookup_frame_in_block(po, skb, status, len);
+   return __packet_lookup_frame_in_block(po, skb, len);
default:
WARN(1, "TPACKET version not supported\n");
BUG();
-- 
2.20.1



Re: [RFC PATCH] powerpc/book3e: KASAN Full support for 64bit

2019-06-10 Thread Daniel Axtens
Christophe Leroy  writes:

> On 06/03/2019 11:50 PM, Daniel Axtens wrote:
>> Christophe Leroy  writes:
>> 
>>> Hi,
>>>
>>> Ok, can you share your .config ?
>> 
>> Sure! This one is with kasan off as the last build I did was testing to
>> see if the code reorganisation was the cause of the issues. (it was not)
>> 
>> 
>> 
>> 
>> This was the kasan-enabled config that failed to boot:
>> 
>> 
>
> Same issue with your .config under QEMU:
>
> A go with gdb shows:
>
> Breakpoint 3, 0xc0027b6c in exc_0x700_common ()
> => 0xc0027b6c :   f8 01 00 70 std r0,112(r1)
> (gdb) bt
> #0  0xc0027b6c in exc_0x700_common ()
> #1  0xc136f80c in .udbg_init_memcons ()
>

Thanks for debugging this!

> Without CONFIG_PPC_EARLY_DEBUG, it boots fine for me. Can you check on 
> your side ?

Yes, that works on my side.

> Deactivating KASAN for arch/powerpc/kernel/udbg.o and 
> arch/powerpc/sysdev/udbg_memcons.o is not enough, we hit a call to 
> strstr() in register_early_udbg_console(), and once we get rid of it (in 
> the same way as in prom_init.c) the next issue is register_console() and 
> I don't know what to do about that one.

Disabling early debug seems like a reasonable restriction to add.

I'll have a look at modules across this and book3s next.

Regards,
Daniel

>
> Christophe
>
>> 
>> 
>> Regards,
>> Daniel
>> 
>>>
>>> Christophe
>>>
>>> Le 31/05/2019 à 03:29, Daniel Axtens a écrit :
>>>> Hi Christophe,
>>>>
>>>> I tried this on the t4240rdb and it fails to boot if KASAN is
>>>> enabled. It does boot with the patch applied but KASAN disabled, so that
>>>> narrows it down a little bit.
>>>>
>>>> I need to focus on 3s first so I'll just drop 3e from my patch set for
>>>> now.
>>>>
>>>> Regards,
>>>> Daniel

> The KASAN shadow area is mapped into vmemmap space:
> 0x8000 0400   to 0x8000 0600  .
> For this vmemmap has to be disabled.
>
> Cc: Daniel Axtens 
> Signed-off-by: Christophe Leroy 
> ---
>arch/powerpc/Kconfig  |   1 +
>arch/powerpc/Kconfig.debug|   3 +-
>arch/powerpc/include/asm/kasan.h  |  11 +++
>arch/powerpc/kernel/Makefile  |   2 +
>arch/powerpc/kernel/head_64.S |   3 +
>arch/powerpc/kernel/setup_64.c|  20 +++---
>arch/powerpc/mm/kasan/Makefile|   1 +
>arch/powerpc/mm/kasan/kasan_init_64.c | 129 
> ++
>8 files changed, 159 insertions(+), 11 deletions(-)
>create mode 100644 arch/powerpc/mm/kasan/kasan_init_64.c
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 1a2fb50126b2..e0b7c45e4dc7 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -174,6 +174,7 @@ config PPC
>   select HAVE_ARCH_AUDITSYSCALL
>   select HAVE_ARCH_JUMP_LABEL
>   select HAVE_ARCH_KASAN  if PPC32
> + select HAVE_ARCH_KASAN  if PPC_BOOK3E_64 && !SPARSEMEM_VMEMMAP
>   select HAVE_ARCH_KGDB
>   select HAVE_ARCH_MMAP_RND_BITS
>   select HAVE_ARCH_MMAP_RND_COMPAT_BITS   if COMPAT
> diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
> index 61febbbdd02b..b4140dd6b4e4 100644
> --- a/arch/powerpc/Kconfig.debug
> +++ b/arch/powerpc/Kconfig.debug
> @@ -370,4 +370,5 @@ config PPC_FAST_ENDIAN_SWITCH
>config KASAN_SHADOW_OFFSET
>   hex
>   depends on KASAN
> - default 0xe000
> + default 0xe000 if PPC32
> + default 0x68000400 if PPC64
> diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/kasan.h
> index 296e51c2f066..756b3d58f921 100644
> --- a/arch/powerpc/include/asm/kasan.h
> +++ b/arch/powerpc/include/asm/kasan.h
> @@ -23,10 +23,21 @@
>
>#define KASAN_SHADOW_OFFSETASM_CONST(CONFIG_KASAN_SHADOW_OFFSET)
>
> +#ifdef CONFIG_PPC32
>#define KASAN_SHADOW_END   0UL
>
>#define KASAN_SHADOW_SIZE  (KASAN_SHADOW_END - KASAN_SHADOW_START)
>
> +#else
> +
> +#include 
> +
> +#define KASAN_SHADOW_SIZE (KERN_VIRT_SIZE >> KASAN_SHADOW_SCALE_SHIFT)
> +
> +#define KASAN_SHADOW_END (KASAN_SHADOW_START + KASAN_SHADOW_SIZE)
> +
> +#endif /* CONFIG_PPC32 */
> +
>#ifdef CONFIG_KASAN
>void kasan_early_init(void);
>void kasan_mmu_init(void);
> diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
> index 0ea6c4aa3a20..7f232c06f11d 100644
> --- a/arch/powerpc/kernel/Makefile
> +++ b/arch/powerpc/kernel/Makefile
> @@ -35,6 +35,8 @@ KASAN_SANITIZE_early_32.o := n
>KASAN_SANITIZE_cputable.o := n
>KASAN_SANITIZE_prom_init.o := n
>KASAN_SANITIZE_btext.o := n
> +KASAN_SANITIZE_paca.o := n

Re: [PATCH v2 1/2] KVM: LAPIC: Optimize timer latency consider world switch time

2019-06-10 Thread Sean Christopherson
On Fri, May 31, 2019 at 02:40:13PM +0800, Wanpeng Li wrote:
> From: Wanpeng Li 
> 
> The advance lapic timer tries to hide the hypervisor overhead between the
> moment the host-emulated timer fires and the moment the guest becomes aware
> that it has fired. However, even after sustained optimizations,
> kvm-unit-tests/tscdeadline_latency still observes ~1000 cycles of latency,
> because we lose the time between the end of wait_lapic_expire() and the
> guest noticing that the timer has fired. There is code between the end of
> wait_lapic_expire() and the world switch, and the world switch itself also
> has overhead. Waiting in wait_lapic_expire() until guest_tsc equals the
> target deadline is therefore too late: the guest will see the latency
> between the end of wait_lapic_expire() and the completion of vmentry.
> This patch takes this time into consideration.
>
> The vmentry_lapic_timer_advance_ns module parameter should be tuned by the
> host admin; setting bit 0 to 1 caches the parameter in KVM. This patch
> can reduce average cyclictest latency from 3us to 2us on a Skylake server
> (guest w/ nohz=off, idle=poll, host w/ preemption_timer=N; in my testing
> the cyclictest latency is not very sensitive to this optimization when
> preemption_timer=Y), and kvm-unit-tests/tscdeadline_latency can reach 0.
> 
> Cc: Paolo Bonzini 
> Cc: Radim Krčmář 
> Cc: Sean Christopherson 
> Signed-off-by: Wanpeng Li 
> ---
> NOTE: rebase on https://lkml.org/lkml/2019/5/20/449
> v1 -> v2:
>  * rename get_vmentry_advance_delta to get_vmentry_advance_cycles
>  * cache vmentry_advance_cycles by setting param bit 0 
>  * add param max limit 
> 
>  arch/x86/kvm/lapic.c   | 38 +++---
>  arch/x86/kvm/lapic.h   |  3 +++
>  arch/x86/kvm/vmx/vmx.c |  2 +-
>  arch/x86/kvm/x86.c |  9 +
>  arch/x86/kvm/x86.h |  2 ++
>  5 files changed, 50 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index fcf42a3..60587b5 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1531,6 +1531,38 @@ static inline void adjust_lapic_timer_advance(struct kvm_vcpu *vcpu,
>   apic->lapic_timer.timer_advance_ns = timer_advance_ns;
>  }
>  
> +#define MAX_VMENTRY_ADVANCE_NS 1000
> +
> +u64 compute_vmentry_advance_cycles(struct kvm_vcpu *vcpu)

This can be static, unless get_vmentry_advance_cycles() is moved to
lapic.h, in which case compute_vmentry_advance_cycles() would need to be
exported.

> +{
> + u64 cycles;
> + struct kvm_lapic *apic = vcpu->arch.apic;
> + u64 val = min_t(u32, vmentry_lapic_timer_advance_ns, MAX_VMENTRY_ADVANCE_NS);
> +
> + cycles = (val & ~1ULL) * vcpu->arch.virtual_tsc_khz;
> + do_div(cycles, 1000000);
> +
> + /* setting bit 0 locks the value, it is cached */
> + if (val & 1)
> + apic->lapic_timer.vmentry_advance_cycles = cycles;
> +
> + return cycles;
> +}
> +
> +inline u64 get_vmentry_advance_cycles(struct kvm_vcpu *vcpu)

This shouldn't be 'inline' since it's exported from a C file.  That being
said, I think it's short enough to define as a 'static inline' in lapic.h.
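
As a rough userspace sketch of that suggestion (not the patch itself), a
header-style 'static inline' accessor could look like the block below. All
`toy_` names are hypothetical stand-ins for the kernel structures. The
conversion relies on ns * kHz / 10^6 = cycles, and bit 0 of the parameter is
treated as the "cache the result" flag, following the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vCPU lapic timer state. */
struct toy_lapic_timer {
	uint64_t vmentry_advance_cycles;	/* 0 means "nothing cached yet" */
};

/* Convert an advance in nanoseconds into TSC cycles at tsc_khz. */
static inline uint64_t toy_ns_to_cycles(uint32_t advance_ns, uint32_t tsc_khz)
{
	assert(tsc_khz != 0);	/* a zero frequency would make the result meaningless */
	return (uint64_t)advance_ns * tsc_khz / 1000000ULL;
}

/* Short enough to live in a header as 'static inline'. */
static inline uint64_t toy_get_advance_cycles(struct toy_lapic_timer *t,
					      uint32_t param_ns,
					      uint32_t tsc_khz)
{
	uint64_t cycles;

	if (!param_ns)
		return 0;
	if (t->vmentry_advance_cycles)
		return t->vmentry_advance_cycles;

	/* Bit 0 is a flag, not part of the value, so mask it off. */
	cycles = toy_ns_to_cycles(param_ns & ~1u, tsc_khz);
	if (param_ns & 1)
		t->vmentry_advance_cycles = cycles;
	return cycles;
}
```

With a 2 GHz TSC (2000000 kHz) an advance of 1000 ns comes out as 2000 cycles.
Keeping both helpers 'static inline' in one header avoids exporting a symbol
for what is a few lines of arithmetic.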

> +{
> + struct kvm_lapic *apic = vcpu->arch.apic;
> +
> + if (!vmentry_lapic_timer_advance_ns)
> + return 0;
> +
> + if (likely(apic->lapic_timer.vmentry_advance_cycles))
> + return apic->lapic_timer.vmentry_advance_cycles;
> +
> + return compute_vmentry_advance_cycles(vcpu);
> +}
> +EXPORT_SYMBOL_GPL(get_vmentry_advance_cycles);
> +
>  void kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
>  {
>   struct kvm_lapic *apic = vcpu->arch.apic;
> @@ -1544,7 +1576,7 @@ void kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
>  
>   tsc_deadline = apic->lapic_timer.expired_tscdeadline;
>   apic->lapic_timer.expired_tscdeadline = 0;
> - guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
> + guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc()) + get_vmentry_advance_cycles(vcpu);
>   apic->lapic_timer.advance_expire_delta = guest_tsc - tsc_deadline;
>  
>   if (guest_tsc < tsc_deadline)
> @@ -1572,7 +1604,7 @@ static void start_sw_tscdeadline(struct kvm_lapic *apic)
>   local_irq_save(flags);
>  
>   now = ktime_get();
> - guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
> + guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc()) + get_vmentry_advance_cycles(vcpu);
>  
>   ns = (tscdeadline - guest_tsc) * 1000000ULL;
>   do_div(ns, this_tsc_khz);
> @@ -2329,7 +2361,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
>   apic->lapic_timer.timer_advance_ns = timer_advance_ns;
>   apic->lapic_timer.timer_advance_adjust_done = true;
>   }
> -
> + apic->lapic_timer.vmentry_advance_cycles = 0;
>  
>   /*
>* APIC is created enabled. This will prevent kvm_lapic_set_base from
> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> index f974a3d..70854a9 100644
> --- a/arch/x86/kvm/lapic.h
> +++ b/arch/x86/kvm/lapic.h

Re: bcachefs status update (it's done cooking; let's get this sucker merged)

2019-06-10 Thread Kent Overstreet
On Mon, Jun 10, 2019 at 10:46:35AM -1000, Linus Torvalds wrote:
> On Mon, Jun 10, 2019 at 9:14 AM Kent Overstreet
>  wrote:
> >
> > So. Here's my bcachefs-for-review branch - this has the minimal set of
> > patches outside of fs/bcachefs/. My master branch has some performance
> > optimizations for the core buffered IO paths, but those are fairly tricky
> > and invasive so I want to hold off on those for now - this branch is
> > intended to be more or less suitable for merging as is.
> 
> Honestly, it really isn't.

Heh, I suppose that's what review is for :)

> There are obvious things wrong with it - like the fact that you've
> rebased it so that the original history is gone, yet you've not
> actually *fixed* the history, so you find things like reverts of
> commits that should simply have been removed, and fixes for things
> that should just have been fixed in the original commit the fix is
> for.

Yeah, I suppose I have dropped the ball on that lately. 
 
> But note that the cleanup should go further than just fix those kinds
> of technical issues. If you rebase, and you have fixes in your tree
> for things you rebase, just fix things as you rewrite history anyway
> (there are cases where the fix may be informative in itself and it's
> worth leaving around, but that's rare).

Yeah that has historically been my practice, I've just been moving away from
that kind of history editing as bcachefs has been getting more users. Hence the
in-between, worst of both workflows state of the current tree.

But, I can certainly go through and clean things up like that one last time and
make everything bisectable again - I'll go through and write proper commit
messages too. Unless you'd be ok with just squashing most of the history down to
one commit - which would you prefer?

> Anyway, aside from that, I only looked at the non-bcachefs parts. Some
> of those are not acceptable either, like
> 
> struct pagecache_lock add_lock
> cacheline_aligned_in_smp; /* protects adding new pages */
> 
> in 'struct address_space', which is completely bogus, since that
> forces not only a potentially huge amount of padding, it also requires
> alignment that that struct simply fundamentally does not have, and
> _will_ not have.

Oh, good point.

> You can only use cacheline_aligned_in_smp for top-level objects,
> and honestly, it's almost never a win. That lock shouldn't be so hot.
> 
> That lock is somewhat questionable in the first place, and no, we
> don't do those hacky recursive things anyway. A recursive lock is
> almost always a buggy and mis-designed one.

You're preaching to the choir there, I still feel dirty about that code and I'd
love nothing more than for someone else to come along and point out how stupid
I've been with a much better way of doing it. 

> Why does the regular page lock (at a finer granularity) not suffice?

Because the lock needs to prevent pages from being _added_ to the page cache -
to do it with a page granularity lock it'd have to be part of the radix tree, 

> And no, nobody has ever cared. The dio people just don't care about
> page cache anyway. They have their own thing going.

It's not just dio, it's even worse with the various fallocate operations. And
the xfs people care, but IIRC even they don't have locking for pages being
faulted in. This is an issue I've talked to other filesystem people quite a bit
about - especially Dave Chinner, maybe we can get him to weigh in here.

And this inconsistency does result in _real_ bugs. It goes something like this:
 - dio write shoots down the range of the page cache for the file it's writing
   to, using invalidate_inode_pages2_range
 - After the page cache shoot down, but before the write actually happens,
   another process pulls those pages back in to the page cache
 - Now the write happens: if that write was e.g. an allocating write, you're
   going to have page cache state (buffer heads) that say that page doesn't have
   anything on disk backing it, but it actually does because of the dio write.
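
The three steps above can be walked through with a toy userspace model. This is
only a sketch: struct block_model and every function below are hypothetical
stand-ins invented for illustration, not kernel code, and the "fault" is just a
function call rather than a real concurrent thread.

```c
#include <stdbool.h>

/* One file block: what is on disk and what a cached page believes. */
struct block_model {
	bool on_disk;           /* block has been allocated on disk */
	bool cached;            /* a page for this block is in the page cache */
	bool cache_says_mapped; /* the cached page's view of on_disk */
};

/* Step 1: the page cache shoot down for the range being written. */
static void invalidate(struct block_model *b)
{
	b->cached = false;
}

/* Step 2: a buffered read or page fault caches the current disk state. */
static void fault_in(struct block_model *b)
{
	b->cached = true;
	b->cache_says_mapped = b->on_disk;
}

/* Step 3: the allocating dio write reaches the disk. */
static void dio_write_to_disk(struct block_model *b)
{
	b->on_disk = true;
}

/* True if a cached page disagrees with the disk about the block's mapping. */
bool cache_is_stale(const struct block_model *b)
{
	return b->cached && b->cache_says_mapped != b->on_disk;
}

/* Runs the three steps; fault_races selects whether step 2 happens. */
bool dio_write_leaves_stale_cache(bool fault_races)
{
	struct block_model b = { .on_disk = false };

	invalidate(&b);
	if (fault_races)
		fault_in(&b);
	dio_write_to_disk(&b);
	return cache_is_stale(&b);
}
```

In this model the cached state is only wrong when the fault lands between the
shoot down and the write, which is the window described above.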

xfs has additional locking (that the vfs does _not_ do) around both the buffered
and dio IO paths to prevent this happening because of a buffered read pulling
the pages back in, but no one has a solution for pages getting _faulted_ back in
- either because of mmap or gup().

And there are some filesystem people who do know about this race, because at
some point the dio code has been changed to shoot down the page cache _again_
after the write completes. But that doesn't eliminate the race, it just makes it
harder to trigger.

And dio writes actually aren't the worst of it, it's even worse with fallocate
FALLOC_FL_INSERT_RANGE/COLLAPSE_RANGE. Last time I looked at the ext4 fallocate
code, it looked _completely_ broken to me - the code seemed to think it was
using the same mechanism truncate uses for shooting down the page cache and
keeping pages from being readded - but that only works for truncate because it's
changing 

Re: [PATCH 0/4] trace: introduce trace event injection

2019-06-10 Thread Steven Rostedt
On Mon, 10 Jun 2019 14:11:57 -0700
Cong Wang wrote:

> On Sat, May 25, 2019 at 3:37 PM Steven Rostedt  wrote:
> > Hi Cong,
> >
> > Thanks for sending these patches, but I just want to let you know that
> > it's currently a US holiday, and then afterward I'll be doing quite a
> > bit of traveling for the next two weeks. If you don't hear from me in
> > after two weeks, please send me a reminder.  
> 
> This is a reminder after two weeks. :) Please review my patches
> when you have a chance.
>

Thanks for the reminder. I'll try to get to it this week.

-- Steve


Re: [PATCH v3 0/3] KVM: Yield to IPI target if necessary

2019-06-10 Thread Sean Christopherson
On Mon, Jun 10, 2019 at 04:34:20PM +0200, Radim Krčmář wrote:
> 2019-05-30 09:05+0800, Wanpeng Li:
> > The idea is from Xen, when sending a call-function IPI-many to vCPUs, 
> > yield if any of the IPI target vCPUs was preempted. 17% performance 
> > increasement of ebizzy benchmark can be observed in an over-subscribe 
> > environment. (w/ kvm-pv-tlb disabled, testing TLB flush call-function 
> > IPI-many since call-function is not easy to be trigged by userspace 
> > workload).
> 
> Have you checked if we could gain performance by having the yield as an
> extension to our PV IPI call?
> 
> It would allow us to skip the VM entry/exit overhead on the caller.
> (The benefit of that might be negligible and it also poses a
>  complication when splitting the target mask into several PV IPI
>  hypercalls.)

Tangentially related to splitting PV IPI hypercalls, are there any major
hurdles to supporting shorthand?  Not having to generate the mask for
->send_IPI_allbutself and ->kvm_send_ipi_all seems like an easy way to
shave cycles for affected flows.


Re: [PATCH] usercopy: Remove HARDENED_USERCOPY_PAGESPAN

2019-06-10 Thread Kees Cook
On Mon, Jun 10, 2019 at 03:30:55PM -0700, Eric Biggers wrote:
> Any progress on this patch?

I have not had time yet; sorry. If anyone else would like to take a stab
at it, I'd appreciate it. :)

-- 
Kees Cook


Re: memory leak in start_sync_thread

2019-06-10 Thread Eric Biggers
On Tue, May 28, 2019 at 11:28:05AM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    cd6c84d8 Linux 5.2-rc2
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=132bd44aa0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=64479170dcaf0e11
> dashboard link: https://syzkaller.appspot.com/bug?extid=7e2e50c8adfccd2e5041
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=114b1354a0
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14b7ad26a0
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+7e2e50c8adfccd2e5...@syzkaller.appspotmail.com
> 
> d started: state = MASTER, mcast_ifn = syz_tun, syncid = 0, id = 0
> BUG: memory leak
> unreferenced object 0x8881206bf700 (size 32):
>   comm "syz-executor761", pid 7268, jiffies 4294943441 (age 20.470s)
>   hex dump (first 32 bytes):
> 00 40 7c 09 81 88 ff ff 80 45 b8 21 81 88 ff ff  .@|..E.!
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>   backtrace:
> [<57619e23>] kmemleak_alloc_recursive
> include/linux/kmemleak.h:55 [inline]
> [<57619e23>] slab_post_alloc_hook mm/slab.h:439 [inline]
> [<57619e23>] slab_alloc mm/slab.c:3326 [inline]
> [<57619e23>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
> [<86ce5479>] kmalloc include/linux/slab.h:547 [inline]
> [<86ce5479>] start_sync_thread+0x5d2/0xe10
> net/netfilter/ipvs/ip_vs_sync.c:1862
> [<1a9229cc>] do_ip_vs_set_ctl+0x4c5/0x780
> net/netfilter/ipvs/ip_vs_ctl.c:2402
> [] nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
> [] nf_setsockopt+0x4c/0x80
> net/netfilter/nf_sockopt.c:115
> [<942f62d4>] ip_setsockopt net/ipv4/ip_sockglue.c:1258 [inline]
> [<942f62d4>] ip_setsockopt+0x9b/0xb0 net/ipv4/ip_sockglue.c:1238
> [] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
> [] sock_common_setsockopt+0x38/0x50
> net/core/sock.c:3130
> [<95eef4cf>] __sys_setsockopt+0x98/0x120 net/socket.c:2078
> [<9747cf88>] __do_sys_setsockopt net/socket.c:2089 [inline]
> [<9747cf88>] __se_sys_setsockopt net/socket.c:2086 [inline]
> [<9747cf88>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
> [] do_syscall_64+0x76/0x1a0
> arch/x86/entry/common.c:301
> [<893b4ac8>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 

The bug is that ownership of some memory is passed to a kthread started by
kthread_run(), but the kthread can be stopped before it actually executes the
threadfn.  See the code in kernel/kthread.c:

	ret = -EINTR;
	if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
		cgroup_kthread_ready();
		__kthread_parkme(self);
		ret = threadfn(data);
	}

So, apparently the thread parameters must always be owned by the owner of the
kthread, not by the kthread itself.  It seems like this would be a common
mistake in kernel code; I'm surprised this doesn't come up more...
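
To make the ownership rule concrete, here is a minimal userspace sketch. It
assumes a hypothetical model_kthread() that mirrors only the check quoted above;
struct sync_params, sync_threadfn() and start_and_stop() are likewise made up
for illustration and are not the ipvs code or the kthread API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical thread parameters; stands in for the leaked allocation. */
struct sync_params {
	int id;
	bool consumed; /* set by the thread function if it ever runs */
};

/* Thread function: in the buggy pattern this would also free its data. */
static int sync_threadfn(void *data)
{
	struct sync_params *p = data;

	p->consumed = true;
	return 0;
}

/*
 * Models only the quoted check: if a stop was requested before the thread
 * got to run, threadfn is never called and -EINTR is returned.
 */
static int model_kthread(bool should_stop, int (*threadfn)(void *), void *data)
{
	int ret = -EINTR;

	if (!should_stop)
		ret = threadfn(data);
	return ret;
}

/*
 * Creator-owned parameters: the creator frees them on every path, so an
 * early stop cannot leak them. Returns the modelled thread's exit code.
 */
int start_and_stop(bool stop_early, bool *ran)
{
	struct sync_params *p = calloc(1, sizeof(*p));
	int ret;

	if (!p)
		return -ENOMEM;
	ret = model_kthread(stop_early, sync_threadfn, p);
	*ran = p->consumed;
	free(p); /* safe whether or not threadfn ran */
	return ret;
}
```

If sync_threadfn() were the one calling free(), the stop_early case would never
release the allocation, which matches the leak in the report.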

- Eric


Re: [PATCH v2 1/2] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails

2019-06-10 Thread Naoya Horiguchi
On Mon, Jun 10, 2019 at 05:19:45PM -0700, Mike Kravetz wrote:
> On 6/10/19 1:18 AM, Naoya Horiguchi wrote:
> > The pass/fail of soft offline should be judged by checking whether the
> > raw error page was finally contained or not (i.e. the result of
> > set_hwpoison_free_buddy_page()), but current code do not work like that.
> > So this patch is suggesting to fix it.
> > 
> > Signed-off-by: Naoya Horiguchi 
> > Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
> > Cc:  # v4.19+
> 
> Reviewed-by: Mike Kravetz 

Thank you, Mike.

> 
> To follow-up on Andrew's comment/question about user visible effects.  Without
> this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may not offline the
> original page and will not return an error.

Yes, that's right.

>  Are there any other visible
> effects?

I can't think of other ones.

- Naoya


Re: [LKP] [btrfs] c8eaeac7b7: aim7.jobs-per-min -11.7% regression

2019-06-10 Thread Huang, Ying
"Huang, Ying"  writes:

> Hi, Josef,
>
> kernel test robot  writes:
>
>> Greeting,
>>
>> FYI, we noticed a -11.7% regression of aim7.jobs-per-min due to commit:
>>
>>
>> commit: c8eaeac7b734347c3afba7008b7af62f37b9c140 ("btrfs: reserve
>> delalloc metadata differently")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>> in testcase: aim7
>> on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 
>> 384G memory
>> with following parameters:
>>
>>  disk: 4BRD_12G
>>  md: RAID0
>>  fs: btrfs
>>  test: disk_rr
>>  load: 1500
>>  cpufreq_governor: performance
>>
>> test-description: AIM7 is a traditional UNIX system level benchmark
>> suite which is used to test and measure the performance of multiuser
>> system.
>> test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/
>
> Here's another regression, do you have time to take a look at this?

Ping

Best Regards,
Huang, Ying


Re: [PATCH net-next] hinic: fix a bug in set rx mode

2019-06-10 Thread dann frazier
On Mon, May 27, 2019 at 10:10:05PM +, Xue Chaojing wrote:
> in set_rx_mode, __dev_mc_sync and netdev_for_each_mc_addr will
> repeatedly set the multicast mac address. so we delete this loop.

fyi, I'm told this fixes the following Oops (in case it makes sense to
queue it for stable):

[ 642.914581] Internal error: Oops: 9605 [#1] SMP
[ 642.919444] Modules linked in: hinic(-) 8021q garp mrp stp llc ses enclosure
sg nls_utf8 isofs vfat fat loop ipmi_ssif crc32_ce crct10dif_ce ghash_ce
sha2_ce sha256_arm64 sha1_ce sbsa_gwdt hns_roce_hw_v2 hns_roce ib_core ipmi_si
ipmi_devintf ipmi_msghandler xfs libcrc32c marvell hibmc_drm drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops ttm qla2xxx ixgbe drm mpt3sas
nvme_fc hns3 hisi_sas_v3_hw igb nvme_fabrics hisi_sas_main hclge mdio
scsi_transport_fc nvme libsas hnae3 raid_class nvme_core scsi_transport_sas
i2c_algo_bit gpio_dwapb gpio_generic dm_mirror dm_region_hash dm_log dm_mod
[last unloaded: hinic]
[ 642.974177] CPU: 4 PID: 5339 Comm: kworker/u256:1 Kdump: loaded Not tainted 
4.18.0-74.el8.aarch64 #1
[ 642.983293] Hardware name: Huawei TaiShan 2280 V2/BC82AMDA, BIOS TA BIOS 
2280-A CS V2.16.01 03/16/2019
[ 642.992591] Workqueue: hinic_dev set_rx_mode [hinic]
[ 642.997542] pstate: 00c9 (nzcv daif +PAN +UAO)
[ 643.002320] pc : add_mac_addr+0xa4/0x100 [hinic]
[ 643.006924] lr : set_rx_mode+0x88/0xc0 [hinic]
[ 643.011353] sp : 3228fd40
[ 643.014653] x29: 3228fd40 x28: 
[ 643.019952] x27: b955c362ff38 x26: 27ccd2cc3110
[ 643.025250] x25:  x24: b955025c6b08
[ 643.030547] x23: b955025c6000 x22: 0010
[ 643.035845] x21: 27cc56040488 x20: b955025c6ac0
[ 643.041142] x19:  x18: 0010
[ 643.046440] x17: b7135830 x16: 27ccd2259bb8
[ 643.051737] x15:  x14: 2030302031302039
[ 643.057035] x13: 33203d2072646461 x12: 2063616d20746573
[ 643.062332] x11: 203a296465726574 x10: 0d10
[ 643.067630] x9 : 3228f9f0 x8 : b95501756170
[ 643.072927] x7 : 198c000940300814 x6 : 3228fd08
[ 643.078225] x5 :  x4 : 
[ 643.083523] x3 :  x2 : 0001
[ 643.088820] x1 : 0010 x0 : 00e3
[ 643.094118] Process kworker/u256:1 (pid: 5339, stack limit = 
0x23b4f182)
[ 643.101498] Call trace:
[ 643.103932] add_mac_addr+0xa4/0x100 [hinic]
[ 643.108189] set_rx_mode+0x88/0xc0 [hinic]
[ 643.112272] process_one_work+0x1ac/0x3e0
[ 643.116268] worker_thread+0x44/0x448
[ 643.119916] kthread+0x130/0x138
[ 643.123130] ret_from_fork+0x10/0x18
[ 643.126692] Code: a9425bf5 a94363f7 a8c47bfd d65f03c0 (394016c7)
[ 643.132828] SMP: stopping secondary CPUs
[ 643.139859] Starting crashdump kernel...
[ 643.143771] Bye!

  -dann

> Signed-off-by: Xue Chaojing 
> ---
>  drivers/net/ethernet/huawei/hinic/hinic_main.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/huawei/hinic/hinic_main.c b/drivers/net/ethernet/huawei/hinic/hinic_main.c
> index e64bc664f687..cfd3f4232cac 100644
> --- a/drivers/net/ethernet/huawei/hinic/hinic_main.c
> +++ b/drivers/net/ethernet/huawei/hinic/hinic_main.c
> @@ -724,7 +724,6 @@ static void set_rx_mode(struct work_struct *work)
>  {
>   struct hinic_rx_mode_work *rx_mode_work = work_to_rx_mode_work(work);
>   struct hinic_dev *nic_dev = rx_mode_work_to_nic_dev(rx_mode_work);
> - struct netdev_hw_addr *ha;
>  
>   netif_info(nic_dev, drv, nic_dev->netdev, "set rx mode work\n");
>  
> @@ -732,9 +731,6 @@ static void set_rx_mode(struct work_struct *work)
>  
>   __dev_uc_sync(nic_dev->netdev, add_mac_addr, remove_mac_addr);
>   __dev_mc_sync(nic_dev->netdev, add_mac_addr, remove_mac_addr);
> -
> - netdev_for_each_mc_addr(ha, nic_dev->netdev)
> - add_mac_addr(nic_dev->netdev, ha->addr);
>  }
>  
>  static void hinic_set_rx_mode(struct net_device *netdev)


linux-next: build warning after merge of the fbdev tree

2019-06-10 Thread Stephen Rothwell
Hi Bartlomiej,

After merging the fbdev tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

drivers/video/fbdev/pvr2fb.c:726:12: warning: 'pvr2_get_param_val' defined but not used [-Wunused-function]
 static int pvr2_get_param_val(const struct pvr2_params *p, const char *s,
^~

Introduced by commit

  0f5a5712ad1e ("video: fbdev: pvr2fb: add COMPILE_TEST support")

The uses are protected by #ifndef MODULE.

-- 
Cheers,
Stephen Rothwell


linux-next: manual merge of the fbdev tree with Linus' tree

2019-06-10 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the fbdev tree got a conflict in:

  drivers/video/fbdev/mxsfb.c

between commit:

  c942fddf8793 ("treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157")

from Linus' tree and commit:

  f225f1393f03 ("video: fbdev: mxsfb: Remove driver")

from the fbdev tree.

I fixed it up (I removed the file) and can carry the fix as necessary.
This is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH v3 1/3] thermal: sun8i: add thermal driver for h6

2019-06-10 Thread Vasily Khoruzhick
On Mon, Jun 10, 2019 at 5:31 PM Frank Lee  wrote:
>
> On Tue, Jun 11, 2019 at 7:29 AM Vasily Khoruzhick  wrote:
> >
> > On Sat, May 25, 2019 at 11:17 AM Yangtao Li  wrote:
> > >
> > > This patch adds the support for allwinner thermal sensor, within
> > > allwinner SoC. It will register sensors for thermal framework
> > > and use device tree to bind cooling device.
> >
> > Hi Yangtao,
> >
> > Any plans on v4 of this series?
> >
>
> I am waiting for comment from Maxime.
>
> I’ll support both h3 and h6 in v4.

If you have a git tree I'll be happy to contribute A64 support. IIRC
it was quite similar to H3.

> Yangtao


Re: [PATCH v3 1/3] thermal: sun8i: add thermal driver for h6

2019-06-10 Thread Frank Lee
On Tue, Jun 11, 2019 at 7:29 AM Vasily Khoruzhick  wrote:
>
> On Sat, May 25, 2019 at 11:17 AM Yangtao Li  wrote:
> >
> > This patch adds the support for allwinner thermal sensor, within
> > allwinner SoC. It will register sensors for thermal framework
> > and use device tree to bind cooling device.
>
> Hi Yangtao,
>
> Any plans on v4 of this series?
>

I am waiting for comment from Maxime.

I’ll support both h3 and h6 in v4.

Yangtao


linux-next: build warning after merge of the i2c tree

2019-06-10 Thread Stephen Rothwell
Hi Wolfram,

After merging the i2c tree, today's linux-next build (x86_64 allmodconfig)
produced this warning:

drivers/media/dvb-frontends/tua6100.c: In function 'tua6100_set_params':
drivers/media/dvb-frontends/tua6100.c:71: warning: "_P" redefined
 #define _P 32
 
In file included from include/acpi/platform/aclinux.h:54,
 from include/acpi/platform/acenv.h:152,
 from include/acpi/acpi.h:22,
 from include/linux/acpi.h:21,
 from include/linux/i2c.h:17,
 from drivers/media/dvb-frontends/tua6100.h:22,
 from drivers/media/dvb-frontends/tua6100.c:24:
include/linux/ctype.h:14: note: this is the location of the previous definition
 #define _P 0x10 /* punct */

Exposed by commit

  5213d7efc8ec ("i2c: acpi: export i2c_acpi_find_adapter_by_handle")

Since that included <linux/acpi.h> from <linux/i2c.h>

Originally introduced by commit

  00be2e7c6415 ("V4L/DVB (4606): Add driver for TUA6100")

The _P in <linux/ctype.h> has existed since before git.
-- 
Cheers,
Stephen Rothwell


Re: [PATCH v2 1/2] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails

2019-06-10 Thread Mike Kravetz
On 6/10/19 1:18 AM, Naoya Horiguchi wrote:
> The pass/fail of soft offline should be judged by checking whether the
> raw error page was finally contained or not (i.e. the result of
> set_hwpoison_free_buddy_page()), but current code do not work like that.
> So this patch is suggesting to fix it.
> 
> Signed-off-by: Naoya Horiguchi 
> Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
> Cc:  # v4.19+

Reviewed-by: Mike Kravetz 

To follow-up on Andrew's comment/question about user visible effects.  Without
this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may not offline the
original page and will not return an error.  Are there any other visible
effects?

-- 
Mike Kravetz


[RESEND PATCH v3 1/3] dma-buf: give each buffer a full-fledged inode

2019-06-10 Thread Chenbo Feng
From: Greg Hackmann 

By traversing /proc/*/fd and /proc/*/map_files, processes with CAP_ADMIN
can get a lot of fine-grained data about how shmem buffers are shared
among processes.  stat(2) on each entry gives the caller a unique
ID (st_ino), the buffer's size (st_size), and even the number of pages
currently charged to the buffer (st_blocks / 512).

In contrast, all dma-bufs share the same anonymous inode.  So while we
can count how many dma-buf fds or mappings a process has, we can't get
the size of the backing buffers or tell if two entries point to the same
dma-buf.  On systems with debugfs, we can get a per-buffer breakdown of
size and reference count, but can't tell which processes are actually
holding the references to each buffer.

Replace the singleton inode with full-fledged inodes allocated by
alloc_anon_inode().  This involves creating and mounting a
mini-pseudo-filesystem for dma-buf, following the example in fs/aio.c.
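
As a rough illustration of what per-buffer inodes allow from userspace, the
sketch below compares two file descriptors by device and inode number using
fstat(2). The helper name same_inode() is hypothetical and not part of this
patch. With the shared anonymous inode, every pair of dma-buf fds would compare
equal; the comparison only becomes meaningful once each buffer has its own
inode.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Returns true if both descriptors refer to the same inode on the same
 * device, false if they differ or if fstat() fails on either one.
 */
bool same_inode(int fd_a, int fd_b)
{
	struct stat a, b;

	if (fstat(fd_a, &a) != 0 || fstat(fd_b, &b) != 0)
		return false;
	return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

The same struct stat also carries st_size and st_blocks, so a tool walking
/proc/*/fd could report sizes alongside the identity check.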

Signed-off-by: Greg Hackmann 
Signed-off-by: Chenbo Feng 
---
 drivers/dma-buf/dma-buf.c  | 63 ++
 include/uapi/linux/magic.h |  1 +
 2 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 7c858020d14b..ffd5a2ad7d6f 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -34,8 +34,10 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
+#include 
 
 static inline int is_dma_buf_file(struct file *);
 
@@ -46,6 +48,25 @@ struct dma_buf_list {
 
 static struct dma_buf_list db_list;
 
+static const struct dentry_operations dma_buf_dentry_ops = {
+   .d_dname = simple_dname,
+};
+
+static struct vfsmount *dma_buf_mnt;
+
+static struct dentry *dma_buf_fs_mount(struct file_system_type *fs_type,
+   int flags, const char *name, void *data)
+{
+   return mount_pseudo(fs_type, "dmabuf:", NULL, &dma_buf_dentry_ops,
+   DMA_BUF_MAGIC);
+}
+
+static struct file_system_type dma_buf_fs_type = {
+   .name = "dmabuf",
+   .mount = dma_buf_fs_mount,
+   .kill_sb = kill_anon_super,
+};
+
 static int dma_buf_release(struct inode *inode, struct file *file)
 {
struct dma_buf *dmabuf;
@@ -338,6 +359,31 @@ static inline int is_dma_buf_file(struct file *file)
return file->f_op == &dma_buf_fops;
 }
 
+static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags)
+{
+   struct file *file;
+   struct inode *inode = alloc_anon_inode(dma_buf_mnt->mnt_sb);
+
+   if (IS_ERR(inode))
+   return ERR_CAST(inode);
+
+   inode->i_size = dmabuf->size;
+   inode_set_bytes(inode, dmabuf->size);
+
+   file = alloc_file_pseudo(inode, dma_buf_mnt, "dmabuf",
+   flags, &dma_buf_fops);
+   if (IS_ERR(file))
+   goto err_alloc_file;
+   file->f_flags = flags & (O_ACCMODE | O_NONBLOCK);
+   file->private_data = dmabuf;
+
+   return file;
+
+err_alloc_file:
+   iput(inode);
+   return file;
+}
+
 /**
  * DOC: dma buf device access
  *
@@ -433,8 +479,7 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
}
dmabuf->resv = resv;
 
-   file = anon_inode_getfile("dmabuf", &dma_buf_fops, dmabuf,
-   exp_info->flags);
+   file = dma_buf_getfile(dmabuf, exp_info->flags);
if (IS_ERR(file)) {
ret = PTR_ERR(file);
goto err_dmabuf;
@@ -1025,8 +1070,8 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
return ret;
 
seq_puts(s, "\nDma-buf Objects:\n");
-   seq_printf(s, "%-8s\t%-8s\t%-8s\t%-8s\texp_name\n",
-  "size", "flags", "mode", "count");
+   seq_printf(s, "%-8s\t%-8s\t%-8s\t%-8s\texp_name\t%-8s\n",
+  "size", "flags", "mode", "count", "ino");
 
list_for_each_entry(buf_obj, &db_list.head, list_node) {
ret = mutex_lock_interruptible(&buf_obj->lock);
@@ -1037,11 +1082,12 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
continue;
}
 
-   seq_printf(s, "%08zu\t%08x\t%08x\t%08ld\t%s\n",
+   seq_printf(s, "%08zu\t%08x\t%08x\t%08ld\t%s\t%08lu\n",
buf_obj->size,
buf_obj->file->f_flags, buf_obj->file->f_mode,
file_count(buf_obj->file),
-   buf_obj->exp_name);
+   buf_obj->exp_name,
+   file_inode(buf_obj->file)->i_ino);
 
robj = buf_obj->resv;
while (true) {
@@ -1136,6 +1182,10 @@ static inline void dma_buf_uninit_debugfs(void)
 
 static int __init dma_buf_init(void)
 {
+   dma_buf_mnt = kern_mount(&dma_buf_fs_type);
+   if (IS_ERR(dma_buf_mnt))
+   return PTR_ERR(dma_buf_mnt);
+
mutex_init(&db_list.lock);

Re: KASAN: use-after-free Read in kfree_skb (3)

2019-06-10 Thread syzbot

syzbot has bisected this bug to:

commit 5ec8c48a6235175f7ff59ed1acbe91d4d0398026
Author: Thierry Reding 
Date:   Thu Jul 6 15:16:47 2017 +

Merge branch 'for-4.13/drivers' into for-next

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=14ef22daa0
start commit:   d1fdb6d8 Linux 5.2-rc4
git tree:       upstream
final crash:    https://syzkaller.appspot.com/x/report.txt?x=16ef22daa0
console output: https://syzkaller.appspot.com/x/log.txt?x=12ef22daa0
kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
dashboard link: https://syzkaller.appspot.com/bug?extid=dcb1305dd05699c40640
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=13c787f2a0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10e32801a0

Reported-by: syzbot+dcb1305dd05699c40...@syzkaller.appspotmail.com
Fixes: 5ec8c48a6235 ("Merge branch 'for-4.13/drivers' into for-next")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


Re: [PATCH v3 0/8] memory: tegra: Introduce Tegra30 EMC driver

2019-06-10 Thread Dmitry Osipenko
24.05.2019 20:23, Dmitry Osipenko wrote:
> Hello,
> 
> This series introduces driver for the External Memory Controller (EMC)
> found on Tegra30 chips, it controls the external DRAM on the board. The
> purpose of this driver is to program memory timing for external memory on
> the EMC clock rate change. The driver was tested using the ACTMON devfreq
> driver that performs memory frequency scaling based on memory-usage load.
> 
> Changelog:
> 
> v3: - Addressed review comments that were made by Stephen Boyd to v2 by
>   adding explicit typing for the callback variable, by including
>   "clk-provider.h" directly in the code and by dropping __clk_lookup
>   usage where possible.
> 
>   Added more patches into this series:
> 
> memory: tegra20-emc: Drop setting EMC rate to max on probe
> memory: tegra20-emc: Adapt for clock driver changes
> memory: tegra20-emc: Include io.h instead of iopoll.h
> memory: tegra20-emc: Replace clk_get_sys with devm_clk_get
> 
>   Initially I was going to include these patches into other patchset,
>   but changed my mind after rearranging things a tad. The "Adapt for
>   clock driver changes" patch is directly related to the clock changes
>   done in the first patch of this series, the rest are minor cleanups
>   that are fine to include here as well.
> 
>   Added some more words to the commit message of "Add binding for NVIDIA
>   Tegra30 External Memory Controller" patch, clarifying why common DDR
>   timing device-tree form isn't suitable for Tegra30.
> 
>   The Tegra30 EMC driver now explicitly selects the registers access
>   mode (EMC_DBG mux), not relying on the setting left from bootloader.
> 
> v2: - Added support for changing MC clock divider configuration based on
>   Memory Controller (MC) configuration which is part of the memory
>   timing.
> 
> - Merged the "Add custom EMC clock implementation" patch into this
>   series because the "Introduce Tegra30 EMC driver" patch directly
>   depends on it. Please note that Tegra20 EMC driver will need to be
>   adapted for the clock changes as well, I'll send out the Tegra20
>   patches after this series will be applied because of some other
>   dependencies (devfreq) and because the temporary breakage won't
>   be critical (driver will just error out on probe).
> 
> - EMC driver now performs MC configuration validation by checking
>   that the number of MC / EMC timings matches and that the timings
>   rate is the same.
> 
> - EMC driver now supports timings that want to change the MC clock
>   configuration.
> 
> - Other minor prettifying changes of the code.
> 
> Dmitry Osipenko (8):
>   clk: tegra20/30: Add custom EMC clock implementation
>   memory: tegra20-emc: Drop setting EMC rate to max on probe
>   memory: tegra20-emc: Adapt for clock driver changes
>   memory: tegra20-emc: Include io.h instead of iopoll.h
>   memory: tegra20-emc: Replace clk_get_sys with devm_clk_get
>   dt-bindings: memory: Add binding for NVIDIA Tegra30 External Memory
> Controller
>   memory: tegra: Introduce Tegra30 EMC driver
>   ARM: dts: tegra30: Add External Memory Controller node
> 
>  .../memory-controllers/nvidia,tegra30-emc.txt |  249 
>  arch/arm/boot/dts/tegra30.dtsi|   11 +
>  drivers/clk/tegra/Makefile|2 +
>  drivers/clk/tegra/clk-tegra20-emc.c   |  299 +
>  drivers/clk/tegra/clk-tegra20.c   |   55 +-
>  drivers/clk/tegra/clk-tegra30.c   |   38 +-
>  drivers/clk/tegra/clk.h   |6 +
>  drivers/memory/tegra/Kconfig  |   10 +
>  drivers/memory/tegra/Makefile |1 +
>  drivers/memory/tegra/mc.c |3 -
>  drivers/memory/tegra/mc.h |   30 +-
>  drivers/memory/tegra/tegra20-emc.c|   94 +-
>  drivers/memory/tegra/tegra30-emc.c| 1165 +
>  drivers/memory/tegra/tegra30.c|   44 +
>  include/linux/clk/tegra.h |   14 +
>  15 files changed, 1903 insertions(+), 118 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/memory-controllers/nvidia,tegra30-emc.txt
>  create mode 100644 drivers/clk/tegra/clk-tegra20-emc.c
>  create mode 100644 drivers/memory/tegra/tegra30-emc.c
> 

Hello Peter,

Do you have any comments on the clk/emc bits? It looks to me that this
series basically needs yours, Stephen's and Rob's acks, after which
Thierry could pick it up once everything is arranged. Stephen and Rob
already made some comments on the previous versions of the series, which
are hopefully addressed now. Maybe you also have something to say?
Otherwise, just an ack would also be much appreciated. Thanks in advance!

Actually, I just noticed that I accidentally forgot to CC Stephen directly
on this series, but hopefully it's not a problem since he is 
