date:20170704

Re: [PATCH v2] ext4: have ext4_xattr_set_handle() allocate journal credits

2017-07-04 Thread Tahsin Erdogan

> Are you aware of other cases where we're likely to run into problems
> besides ext4_new_inode()?

Nope. If we can get ext4_new_inode() case covered we should be fine.

I will abandon this patch and will work on a patch that adds extra
credits in __ext4_new_inode().

thanks

Re: [PATCH v6 01/18] xen: introduce the pvcalls interface header

2017-07-04 Thread Juergen Gross

On 03/07/17 23:08, Stefano Stabellini wrote:
> Introduce the C header file which defines the PV Calls interface. It is
> imported from xen/include/public/io/pvcalls.h.
> 
> Signed-off-by: Stefano Stabellini 
> Reviewed-by: Boris Ostrovsky 

Reviewed-by: Juergen Gross 


Thanks,

Juergen

Re: [GIT PULL] s390 patches for 4.13 merge window

2017-07-04 Thread Martin Schwidefsky

On Tue, 4 Jul 2017 17:58:18 +1000
Stephen Rothwell  wrote:

> Hi Linus,
> 
> On Mon, 3 Jul 2017 15:46:00 -0700 Linus Torvalds 
>  wrote:
> >
> > On Mon, Jul 3, 2017 at 2:01 AM, Martin Schwidefsky
> >  wrote:  
> > >
> > > please pull from the 'for-linus' branch of
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git 
> > > for-linus
> > 
> > So my conflict resolution looks different from the one Stephen posted,
> > which may be due to various reasons, ranging from "linux-next has
> > other things that conflict" to just "I didn't notice some semantic
> > conflict since unlike linux-next I don't build for s390".
> > 
> > Regardless, you should check my current -git tree just to verify, and
> > send me a patch if I screwed something up.  
> 
> At least part of the difference is the following merge fix patch I have
> been carrying.  It is needed due to a build failure.
> 
> From: Stephen Rothwell 
> Date: Tue, 13 Jun 2017 20:51:32 +1000
> Subject: [PATCH] s390: fix up for "blk-mq: switch ->queue_rq return value to
>  blk_status_t"
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  drivers/s390/block/scm_blk.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/s390/block/scm_blk.c b/drivers/s390/block/scm_blk.c
> index 42018a20f2b7..0071febac9e6 100644
> --- a/drivers/s390/block/scm_blk.c
> +++ b/drivers/s390/block/scm_blk.c
> @@ -278,7 +278,7 @@ struct scm_queue {
>   spinlock_t lock;
>  };
> 
> -static int scm_blk_request(struct blk_mq_hw_ctx *hctx,
> +static blk_status_t scm_blk_request(struct blk_mq_hw_ctx *hctx,
>  const struct blk_mq_queue_data *qd)
>  {
>   struct scm_device *scmdev = hctx->queue->queuedata;
> @@ -290,7 +290,7 @@ static int scm_blk_request(struct blk_mq_hw_ctx *hctx,
>   spin_lock(>lock);
>   if (!scm_permit_request(bdev, req)) {
>   spin_unlock(>lock);
> - return BLK_MQ_RQ_QUEUE_BUSY;
> + return BLK_STS_RESOURCE;
>   }
> 
>   scmrq = sq->scmrq;
> @@ -299,7 +299,7 @@ static int scm_blk_request(struct blk_mq_hw_ctx *hctx,
>   if (!scmrq) {
>   SCM_LOG(5, "no request");
>   spin_unlock(>lock);
> - return BLK_MQ_RQ_QUEUE_BUSY;
> + return BLK_STS_RESOURCE;
>   }
>   scm_request_init(bdev, scmrq);
>   sq->scmrq = scmrq;
> @@ -315,7 +315,7 @@ static int scm_blk_request(struct blk_mq_hw_ctx *hctx,
> 
>   sq->scmrq = NULL;
>   spin_unlock(>lock);
> - return BLK_MQ_RQ_QUEUE_BUSY;
> + return BLK_STS_RESOURCE;
>   }
>   blk_mq_start_request(req);
> 
> @@ -324,7 +324,7 @@ static int scm_blk_request(struct blk_mq_hw_ctx *hctx,
>   sq->scmrq = NULL;
>   }
>   spin_unlock(>lock);
> - return BLK_MQ_RQ_QUEUE_OK;
> + return BLK_STS_OK;
>  }
> 
>  static int scm_blk_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,

This is the same patch I came up with to get it to compile. I asked
Sebastian to verify that the driver actually works with these changes.


-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

Re: [GIT PULL] USB/PHY patches for 4.13-rc1

2017-07-04 Thread Greg KH

On Tue, Jul 04, 2017 at 09:15:55AM +0200, Geert Uytterhoeven wrote:
> Hi Greg, Heikki,
> 
> On Mon, Jul 3, 2017 at 4:58 PM, Greg KH  wrote:
> > The following changes since commit 41f1830f5a7af77cf5c86359aba3cbd706687e52:
> >
> >   Linux 4.12-rc6 (2017-06-19 22:19:37 +0800)
> >
> > are available in the git repository at:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/ 
> > tags/usb-4.13-rc1
> >
> > for you to fetch changes up to 6836796de4019944f4ba4c99a360e8250fd2e735:
> >
> >   Add USB quirk for HVR-950q to avoid intermittent device resets 
> > (2017-06-29 14:49:06 +0200)
> >
> > 
> > USB/PHY patches for 4.13-rc1
> >
> > Here is the big patchset of USB and PHY driver updates for 4.13-rc1.
> >
> > On the PHY side, they decided to move files around to "make things
> > easier" in their tree.  Hopefully that wasn't a mistake, but in
> > linux-next testing, we haven't had any reported problems.
> >
> > There's the usual set of gadget and xhci and musb updates in here as
> > well, along with a number of smaller updates for a raft of different USB
> > drivers.  Full details in the shortlog, nothing really major.
> >
> > All of these have been in linux-next for a while with no reported
> > issues.
> >
> > Signed-off-by: Greg Kroah-Hartman 
> >
> > 
> 
> > Heikki Krogerus (3):
> >   usb: typec: Add support for UCSI interface
> 
> Commit c1b0bc2dabfa884d ("usb: typec: Add support for UCSI interface"):
> 
> > --- /dev/null
> > +++ b/drivers/usb/typec/ucsi/Kconfig
> > @@ -0,0 +1,23 @@
> > +config TYPEC_UCSI
> > +   tristate "USB Type-C Connector System Software Interface driver"
> > +   depends on !CPU_BIG_ENDIAN
> 
> To work as expected, and prevent this driver from being enabled on big endian
> systems, this depends on "[PATCH v3 0/3] Define CPU_BIG_ENDIAN or warn for
> inconsistencies".
> https://lkml.org/lkml/2017/6/12/1068

Is this a problem?  I thought that series was slated to be merged soon,
is that not going to happen?

thanks,

greg k-h

Re: [PATCH 1/6] ARM: dts: rockchip: add regulator nodes for rk3229-evb

2017-07-04 Thread Heiko Stübner

Hi Frank,

Am Dienstag, 4. Juli 2017, 16:12:42 CEST schrieb Frank Wang:
> This patch adds vcc_io, vdd_arm and vdd_log regulator nodes
> for rk3229-evb board.
> 
> Signed-off-by: Frank Wang 
> ---
>  arch/arm/boot/dts/rk3229-evb.dts | 54
>  1 file changed, 54 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/rk3229-evb.dts
> b/arch/arm/boot/dts/rk3229-evb.dts index 82e8a53..8b10c64 100644
> --- a/arch/arm/boot/dts/rk3229-evb.dts
> +++ b/arch/arm/boot/dts/rk3229-evb.dts
> @@ -78,6 +78,52 @@
>   regulator-always-on;
>   regulator-boot-on;
>   };
> +
> + vdd_arm: vdd-arm-regulator {
> + compatible = "pwm-regulator";
> + rockchip,pwm_id = <1>;
> + rockchip,pwm_voltage = <110>;

This doesn't look like a part of the mainline pwm-regulator binding
(both rockchip,* properties)

> + pwms = < 0 25000 1>;
> + regulator-name = "vdd_arm";
> + regulator-min-microvolt = <95>;
> + regulator-max-microvolt = <140>;
> + regulator-always-on;
> + regulator-boot-on;
> + };

please also add supplies for regulators. Information on supplies
should be easily extractable from the board schematics.

This not only results in a nice tree in debugfs (regulator/regulator_summary)
but also makes sure supplying regulators are not accidentially turned off.
(pwm-supply for pwm-regulators, vin-supply for fixed regulators)


> +
> + vdd_log: vdd-log-regulator {
> + compatible = "pwm-regulator";
> + rockchip,pwm_id = <2>;
> + rockchip,pwm_voltage = <120>;
> + pwms = < 0 25000 1>;
> + regulator-name = "vdd_log";
> + regulator-min-microvolt = <100>;
> + regulator-max-microvolt = <130>;
> + regulator-always-on;
> + regulator-boot-on;
> + };
> +
> + regulators {
> + compatible = "simple-bus";

don't create a subnode/bus for regulators. Just add them
regularly. Take a look at all the other boards (like rk3399-firefly,
gru, veyron) for reference.

> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + vccio_1v8_reg: regulator@0 {

Same here, no regulator@0 please, just name this one
vccio_1v8: vccio-1v8-regulator {

(removed _reg from phandle and changed node name)

> + compatible = "regulator-fixed";
> + regulator-name = "vccio_1v8";
> + regulator-min-microvolt = <180>;
> + regulator-max-microvolt = <180>;
> + regulator-always-on;
> + };
> +
> + vccio_3v3_reg: regulator@1 {
> + compatible = "regulator-fixed";
> + regulator-name = "vccio_3v3";
> + regulator-min-microvolt = <330>;
> + regulator-max-microvolt = <330>;
> + regulator-always-on;
> + };
> + };
>  };


Thanks
Heiko

Re: [PATCH v2] sched/pelt: fix false running accounting

2017-07-04 Thread Peter Zijlstra

On Tue, Jul 04, 2017 at 09:27:07AM +0200, Peter Zijlstra wrote:
> On Sat, Jul 01, 2017 at 07:06:13AM +0200, Vincent Guittot wrote:
> > The running state is a subset of runnable state which means that running
> > can't be set if runnable (weight) is cleared. There are corner cases
> > where the current sched_entity has been already dequeued but cfs_rq->curr
> > has not been updated yet and still points to the dequeued sched_entity.
> > If ___update_load_avg is called at that time, weight will be 0 and running
> > will be set which is not possible.
> > 
> > This case happens during pick_next_task_fair() when a cfs_rq becomes idles.
> > The current sched_entity has been dequeued so se->on_rq is cleared and
> > cfs_rq->weight is null. But cfs_rq->curr still points to se (it will be
> > cleared when picking the idle thread). Because the cfs_rq becomes idle,
> > idle_balance() is called and ends up to call update_blocked_averages()
> > with these wrong running and runnable states.
> > 
> > Add a test in ___update_load_avg to correct the running state in this case.
> 
> Cute, however did you find that ?

Hmm,.. could you give a little more detail?

Because if ->on_rq=0, we'll have done dequeue_task() which will have
done update_curr() with ->on_rq, weight and ->running consistently.

Then the above, inconsistent update should not happen, because delta=0.

Re: [GIT PULL] USB/PHY patches for 4.13-rc1

2017-07-04 Thread Geert Uytterhoeven

Hi Greg,

On Tue, Jul 4, 2017 at 10:04 AM, Greg KH <gre...@linuxfoundation.org> wrote:
> On Tue, Jul 04, 2017 at 09:15:55AM +0200, Geert Uytterhoeven wrote:
>> On Mon, Jul 3, 2017 at 4:58 PM, Greg KH <gre...@linuxfoundation.org> wrote:
>> > USB/PHY patches for 4.13-rc1

>> > Heikki Krogerus (3):
>> >   usb: typec: Add support for UCSI interface
>>
>> Commit c1b0bc2dabfa884d ("usb: typec: Add support for UCSI interface"):
>>
>> > --- /dev/null
>> > +++ b/drivers/usb/typec/ucsi/Kconfig
>> > @@ -0,0 +1,23 @@
>> > +config TYPEC_UCSI
>> > +   tristate "USB Type-C Connector System Software Interface driver"
>> > +   depends on !CPU_BIG_ENDIAN
>>
>> To work as expected, and prevent this driver from being enabled on big endian
>> systems, this depends on "[PATCH v3 0/3] Define CPU_BIG_ENDIAN or warn for
>> inconsistencies".
>> https://lkml.org/lkml/2017/6/12/1068
>
> Is this a problem?

I have no idea what happens if you enable the driver on big endian.

> I thought that series was slated to be merged soon,
> is that not going to happen?

Me too. But it's not in next-20170704.

Babu, what's the plan?

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH 10/14] qcom: mtd: nand: support for QPIC Page read/write

2017-07-04 Thread Archit Taneja




On 06/29/2017 12:46 PM, Abhishek Sahu wrote:

1. Add the function for command descriptor preparation which
will be used only by BAM DMA and it will form the DMA descriptors
containing command elements.

2. Add the data descriptor preparation function which will be used
only by BAM DMA for forming the data SGL’s.

3. Add clear BAM transaction and call it before every new request

4. Check DMA mode for ADM or BAM and call the appropriate
descriptor formation function.

5. Enable the BAM in NAND_CTRL.

Signed-off-by: Abhishek Sahu 
---
  drivers/mtd/nand/qcom_nandc.c | 190 +++---
  1 file changed, 180 insertions(+), 10 deletions(-)

diff --git a/drivers/mtd/nand/qcom_nandc.c b/drivers/mtd/nand/qcom_nandc.c
index 17766af..4c6e594 100644
--- a/drivers/mtd/nand/qcom_nandc.c
+++ b/drivers/mtd/nand/qcom_nandc.c
@@ -156,6 +156,8 @@
  #define   FETCH_ID0xb
  #define   RESET_DEVICE0xd
  
+/* NAND_CTRL bits */

+#defineBAM_MODE_EN BIT(0)
  /*
   * the NAND controller performs reads/writes with ECC in 516 byte chunks.
   * the driver calls the chunks 'step' or 'codeword' interchangeably
@@ -190,6 +192,14 @@
   */
  #define NAND_ERASED_CW_SET(0x0008)
  
+/* Returns the dma address for reg read buffer */

+#define REG_BUF_DMA_ADDR(chip, vaddr) \
+   ((chip)->reg_read_buf_phys + \
+   ((uint8_t *)(vaddr) - (uint8_t *)(chip)->reg_read_buf))
+
+/* Returns the NAND register physical address */
+#define NAND_REG_PHYS(chip, offset) ((chip)->base_phys + (offset))
+
  #define QPIC_PER_CW_MAX_CMD_ELEMENTS  (32)
  #define QPIC_PER_CW_MAX_CMD_SGL   (32)
  #define QPIC_PER_CW_MAX_DATA_SGL  (8)
@@ -287,7 +297,8 @@ struct nandc_regs {
   *controller
   * @dev:  parent device
   * @base: MMIO base
- * @base_dma:  physical base address of controller registers
+ * @base_phys: physical base address of controller registers
+ * @base_dma:  dma base address of controller registers
   * @core_clk: controller clock
   * @aon_clk:  another controller clock
   *
@@ -323,6 +334,7 @@ struct qcom_nand_controller {
struct device *dev;
  
  	void __iomem *base;

+   phys_addr_t base_phys;
dma_addr_t base_dma;
  
  	struct clk *core_clk;

@@ -467,6 +479,29 @@ static void free_bam_transaction(struct 
qcom_nand_controller *nandc)
return bam_txn;
  }
  
+/* Clears the BAM transaction indexes */

+static void clear_bam_transaction(struct qcom_nand_controller *nandc)
+{
+   struct bam_transaction *bam_txn = nandc->bam_txn;
+
+   if (!nandc->dma_bam_enabled)
+   return;
+
+   bam_txn->bam_ce_pos = 0;
+   bam_txn->bam_ce_start = 0;
+   bam_txn->cmd_sgl_pos = 0;
+   bam_txn->cmd_sgl_start = 0;
+   bam_txn->tx_sgl_pos = 0;
+   bam_txn->tx_sgl_start = 0;
+   bam_txn->rx_sgl_pos = 0;
+   bam_txn->rx_sgl_start = 0;
+
+   sg_init_table(bam_txn->cmd_sgl, nandc->max_cwperpage *
+ QPIC_PER_CW_MAX_CMD_SGL);
+   sg_init_table(bam_txn->data_sg, nandc->max_cwperpage *
+ QPIC_PER_CW_MAX_DATA_SGL);
+}
+
  static inline struct qcom_nand_host *to_qcom_nand_host(struct nand_chip *chip)
  {
return container_of(chip, struct qcom_nand_host, chip);
@@ -682,6 +717,102 @@ static int prepare_bam_async_desc(struct 
qcom_nand_controller *nandc,
return 0;
  }
  
+/*

+ * Prepares the command descriptor for BAM DMA which will be used for NAND
+ * register reads and writes. The command descriptor requires the command
+ * to be formed in command element type so this function uses the command
+ * element from bam transaction ce array and fills the same with required
+ * data. A single SGL can contain multiple command elements so
+ * NAND_BAM_NEXT_SGL will be used for starting the separate SGL
+ * after the current command element.
+ */
+static int prep_dma_desc_command(struct qcom_nand_controller *nandc, bool read,
+int reg_off, const void *vaddr,
+int size, unsigned int flags)
+{
+   int bam_ce_size;
+   int i, ret;
+   struct bam_cmd_element *bam_ce_buffer;
+   struct bam_transaction *bam_txn = nandc->bam_txn;
+
+   bam_ce_buffer = _txn->bam_ce[bam_txn->bam_ce_pos];
+
+   /* fill the command desc */
+   for (i = 0; i < size; i++) {
+   if (read)
+   bam_prep_ce(_ce_buffer[i],
+   NAND_REG_PHYS(nandc, reg_off + 4 * i),
+   BAM_READ_COMMAND,
+   REG_BUF_DMA_ADDR(nandc,
+(__le32 *)vaddr + i));
+   else
+

Re: [PATCH v2] sched/pelt: fix false running accounting

2017-07-04 Thread Peter Zijlstra

On Tue, Jul 04, 2017 at 11:12:34AM +0200, Vincent Guittot wrote:
> On 4 July 2017 at 10:34, Peter Zijlstra  wrote:
> > On Tue, Jul 04, 2017 at 09:27:07AM +0200, Peter Zijlstra wrote:
> >> On Sat, Jul 01, 2017 at 07:06:13AM +0200, Vincent Guittot wrote:
> >> > The running state is a subset of runnable state which means that running
> >> > can't be set if runnable (weight) is cleared. There are corner cases
> >> > where the current sched_entity has been already dequeued but cfs_rq->curr
> >> > has not been updated yet and still points to the dequeued sched_entity.
> >> > If ___update_load_avg is called at that time, weight will be 0 and 
> >> > running
> >> > will be set which is not possible.
> >> >
> >> > This case happens during pick_next_task_fair() when a cfs_rq becomes 
> >> > idles.
> >> > The current sched_entity has been dequeued so se->on_rq is cleared and
> >> > cfs_rq->weight is null. But cfs_rq->curr still points to se (it will be
> >> > cleared when picking the idle thread). Because the cfs_rq becomes idle,
> >> > idle_balance() is called and ends up to call update_blocked_averages()
> >> > with these wrong running and runnable states.
> >> >
> >> > Add a test in ___update_load_avg to correct the running state in this 
> >> > case.
> >>
> >> Cute, however did you find that ?
> >
> > Hmm,.. could you give a little more detail?
> >
> > Because if ->on_rq=0, we'll have done dequeue_task() which will have
> > done update_curr() with ->on_rq, weight and ->running consistently.
> >
> > Then the above, inconsistent update should not happen, because delta=0.
> 
> In fact, the delta between dequeue_entity_load_avg() and
> update_blocked_averages() is not 0 on my platform (hikey) but can be
> longer than 60us (at lowest frequency with only 1 task group level)

But but but, how can that happen? Should it not all be under the same
rq->lock and thus have only a single update_rq_clock() and thus be at
the same 'instant' ?

Re: [PATCH] coccinelle: api: detect unnecessary le16_to_cpu

2017-07-04 Thread Julia Lawall



On Tue, 4 Jul 2017, Andy Shevchenko wrote:

> On Tue, Jul 4, 2017 at 12:11 PM, Julia Lawall  wrote:
> > Here is a revised version (not a patch because it doesn't support all of
> > the various modes) and the results.  It doesn't return anything beyond
> > what was mentioned in previous mails.
> >
> > For the following code:
> >
> > ret = i2c_smbus_read_word_data(chip->client, reg << 1);
> > val[0] = (u16)ret & 0xFF;
> > val[1] = (u16)ret >> 8;
> >
> > do we want to see:
> >
> > put_unaligned(val,i2c_smbus_read_word_data(chip->client, reg << 1));
>
> If and only if the type of the pointer is a byte type (u8 *, char *,
> or alike) _and_ we try to use it as a provider for 16-bit value (2
> bytes).

OK, the provider part seems to add more complexity than is worth putting
in the rule, so I will let the developer figure out what should be done.

thanks,
julia

Re: [RFC][PATCH] sched: attach extra runtime to the right avg

2017-07-04 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> > An intermediate approach to improve that skew would be something like 
> > below. 
> > It doesn't track the remainder like your patch does, but doesn't lose 
> > precision either, just rounds down 'now' to the nearest 1024 boundary.
> 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 008c514dc241..b03703cd7989 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -2965,7 +2965,7 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg 
> > *sa,
> > if (!delta)
> > return 0;
> >  
> > -   sa->last_update_time += delta << 10;
> > +   sa->last_update_time = now & ~1023ULL;
> >  
> 
> So if we have a task that always runs <1024ns it should still get blocks
> of runtime because the difference between now and now&~1023 can be !0
> and spill.

Agreed - in the first approximation I was trying to figure out why Josef was 
seeing an effect from the patch.

> I'm just not immediately seeing how its different from the 0-sum we had.
> It should be identical since delta*1024 would equally land us on those
> same edges (there's an offset in the differential form between the two,
> but since we start with last_update_time=0, the resulting edges are the
> same afaict).

So I think the difference is that this:

sa->last_update_time = now & ~1023ULL;

is tracking the absolute value of 'now' (i.e. rq->clock in most cases) by and 
large, with a 1024 ns imprecision.

This code on the other hand:

sa->last_update_time += delta << 10;

... in essence creates a whole new absolute clock value that slowly but surely 
is 
drifting away from the real rq->clock, because 'delta' is always rounded down 
to 
the nearest 1024 ns boundary, so we accumulate the 'remainder' losses.

That is because:

delta >>= 10;
...
sa->last_update_time += delta << 10;

Given enough time, ->last_update_time can drift a long way, and this delta:

delta = now - sa->last_update_time;

... becomes meaningless AFAICS, because it's essentially two different clocks 
that 
get compared.

But I might be super confused about this myself ...

Thanks,

Ingo

Re: [PATCH 0/3] irqchip: Miscellaneous fixes for GIC/GICv3

2017-07-04 Thread Marc Zyngier

On 04/07/17 10:56, Suzuki K Poulose wrote:
> This series contains some fixes for GIC/GIC-v3 to behave as expected
> by the generic management layer.
> 
> Suzuki K Poulose (3):
>   irqchip: gic-v3: Report failures in gic_irq_domain_alloc
>   irqchip: gic-v2: Report failures in gic_irq_domain_alloc
>   irq: gic-v3: Honor forced affinity setting
> 
>  drivers/irqchip/irq-gic-v3.c | 14 +++---
>  drivers/irqchip/irq-gic.c|  7 +--
>  2 files changed, 16 insertions(+), 5 deletions(-)
> 

All 3 patches queued for post -rc1.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...

Re: [PATCH 01/11] S.A.R.A. Documentation

2017-07-04 Thread Salvatore Mesoraca

2017-06-28 0:51 GMT+02:00 Kees Cook :
> On Mon, Jun 12, 2017 at 9:56 AM, Salvatore Mesoraca
>  wrote:
>> Adding documentation for S.A.R.A. LSM.
>>
>> Signed-off-by: Salvatore Mesoraca 
>> ---
>>  Documentation/admin-guide/kernel-parameters.txt |  40 +
>>  Documentation/security/00-INDEX |   2 +
>>  Documentation/security/SARA.rst | 192 
>> 
>>  3 files changed, 234 insertions(+)
>>  create mode 100644 Documentation/security/SARA.rst
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
>> b/Documentation/admin-guide/kernel-parameters.txt
>> index 0f5c3b4..f3ee12d 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -3702,6 +3702,46 @@
>> 1 -- enable.
>> Default value is set via kernel config option.
>>
>> +   sara=   [SARA] Disable or enable S.A.R.A. at boot time.
>> +   If disabled this way S.A.R.A. can't be enabled
>> +   again.
>> +   Format: { "0" | "1" }
>> +   See security/sara/Kconfig help text
>> +   0 -- disable.
>> +   1 -- enable.
>> +   Default value is set via kernel config option.
>> +
>> +   sara_usb_filtering= [SARA]
>> +   Disable or enable S.A.R.A. USB Filtering at boot
>> +   time.
>> +   Format: { "0" | "1" }
>> +   See security/sara/Kconfig help text
>> +   0 -- disable.
>> +   1 -- enable.
>> +   Default value is 1.
>> +
>> +   sara_usb_filtering_default= [SARA]
>> +   Set S.A.R.A. USB Filtering default action.
>> +   Format: { "a" | "d" }
>> +   See security/sara/Kconfig help text
>> +   a -- allow.
>> +   d -- deny.
>> +   Default value is set via kernel config option.
>> +
>> +   sara_wxprot=[SARA] Disable or enable S.A.R.A. WX Protection
>> +   at boot time.
>> +   Format: { "0" | "1" }
>> +   See security/sara/Kconfig help text
>> +   0 -- disable.
>> +   1 -- enable.
>> +   Default value is 1.
>> +
>> +   sara_wxprot_default_flags= [SARA]
>> +   Set S.A.R.A. WX Protection default flags.
>> +   Format: 
>> +   See S.A.R.A. documentation.
>> +   Default value is set via kernel config option.
>> +
>
> As an organizational note, I would suggest making these all regular
> "module parameters", which would let them be automatically namespaced
> under "sara". For example "sara.enabled", "sara.wxprot", etc. For
> example, this is how LoadPin does it for "loadpin.enabled":
>
> /* Should not be mutable after boot, so not listed in sysfs (perm == 0). */
> module_param(enabled, int, 0);
> MODULE_PARM_DESC(enabled, "Pin module/firmware loading (default: true)");

I apologize to be so late to answer you.
I completely missed this email.
I'll follow your suggestion in v3, thank you.

Re: [RFC 0/5] drivers: Add boot constraints core

2017-07-04 Thread Viresh Kumar

On 03-07-17, 16:07, Mark Brown wrote:
> On Mon, Jul 03, 2017 at 11:45:52AM +0530, Viresh Kumar wrote:
> > The above regulator-min/max-microvolt values I mentioned were for the 
> > regulator
> > device and not what the consumers would request. Yes, DMA will request 
> > something
> 
> If you're putting the maximum possible range that the physical regulator
> can supply into machine constraints then you really haven't understood
> what machine constraints are at all.

I wasn't referring to the limits of the physical regulators but the min/max that
the consumers can set on a particular platform.

> No, it really shouldn't.  Please read what I wrote.

Sorry about that. Understood it now.

-- 
viresh

Re: centos 7.2，I got some oops form my production line

2017-07-04 Thread Xishi Qiu

On 2017/6/29 16:22, Xishi Qiu wrote:

> centos 7.2，I got some oops form my production line,
> Anybody has seen these errors before?
> 

Here is another one

[  703.025737] BUG: unable to handle kernel NULL pointer dereference at 
0d68
[  703.026008] IP: [] mlx4_en_QUERY_PORT+0xa2/0x190 [mlx4_en]
[  703.026008] PGD 377f2a067 PUD 379df4067 PMD 0 
[  703.026008] Oops: 0002 [#1] SMP 
[  703.033019] Modules linked in: sch_htb haek(OVE) squashfs loop binfmt_misc 
phram mtdblock mtd_blkdevs mtd zlib_deflate nf_log_ipv4 nf_log_common xt_LOG 
ipmi_watchdog ipmi_devintf ipmi_si ipmi_msghandler vfat fat bonding tipc 
kboxdriver(O) kbox(O) ipt_REJECT iptable_filter signo_catch(O) mlx4_ib(OVE) 
ib_sa(OVE) ib_mad(OVE) ib_core(OVE) mlx4_en(OVE) ib_addr(OVE) ib_netlink(OVE) 
vxlan ip6_udp_tunnel udp_tunnel ptp pps_core mlx4_core(OVE) compat(OVE) isofs 
crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper 
ablk_helper cryptd ppdev dm_mod parport_pc sg parport pcspkr i2c_piix4 i2c_core 
ip_tables ext3 mbcache jbd sr_mod cdrom ata_generic pata_acpi virtio_blk(OVE) 
virtio_console(OVE) kvm_ivshmem(OVE) crct10dif_pclmul crct10dif_common ata_piix 
crc32c_intel serio_raw libata pv_channel(OVE)
[  703.055064] mlx4_core :00:07.0: mlx4_dec_port_macs removed mac, port: 1, 
now: 0
[  703.033019]  virtio_pci(OVE) virtio_ring(OVE) virtio(OVE) floppy 
monitor_netdev(OE)
[  703.033019] CPU: 3 PID: 3038 Comm: kworker/3:2 Tainted: GW  OE  
V---   3.10.0-327.49.58.52.x86_64 #1
[  703.033019] Hardware name: OpenStack Foundation OpenStack Nova, BIOS 
rel-1.8.1-0-g4adadbd-2016_105425-HGH108200 04/01/2014
[  703.033019] Workqueue: events linkwatch_event
[  703.033019] task: 88041a9bf300 ti: 880412134000 task.ti: 
880412134000
[  703.066565] RIP: 0010:[]  [] 
mlx4_en_QUERY_PORT+0xa2/0x190 [mlx4_en]
[  703.066565] RSP: 0018:880412137bd0  EFLAGS: 00010a03
[  703.066565] RAX: 8800ba5bc000 RBX: 880410cd1000 RCX: 0038
[  703.066565] RDX: 0001 RSI: 0246 RDI: 88041472046c
[  703.074752] RBP: 880412137c10 R08: 81668be0 R09: 81dc63c0
[  703.074752] R10: 0400 R11: 0017 R12: 8803773eea20
[  703.074752] R13:  R14:  R15: 88041b5eb000
[  703.074752] FS:  () GS:880434ac() 
knlGS:
[  703.074752] CS:  0010 DS:  ES:  CR0: 80050033
[  703.074752] CR2: 0d68 CR3: 00036ff69000 CR4: 001407e0
[  703.086770] DR0:  DR1:  DR2: 
[  703.086770] DR3:  DR6: 0ff0 DR7: 0400
[  703.086770] Stack:
[  703.086770]  88040043 ea60 8804 
ba5bc000
[  703.086770]  880410ce 880412137cac 88041b5eb8c0 
880410ce08c0
[  703.086770]  880412137c78 a02a29a4 dead00200200 
4ff29e7e
[  703.086770] Call Trace:
[  703.086770]  [] mlx4_en_get_settings+0x34/0x540 [mlx4_en]
[  703.086770]  [] __ethtool_get_settings+0x86/0x140
[  703.104149]  [] bond_update_speed_duplex+0x3d/0x90 
[bonding]
[  703.104149]  [] bond_netdev_event+0x137/0x360 [bonding]
[  703.104149]  [] notifier_call_chain+0x4c/0x70
[  703.104149]  [] raw_notifier_call_chain+0x16/0x20
[  703.104149]  [] call_netdevice_notifiers+0x2d/0x60
[  703.104149]  [] netdev_state_change+0x23/0x40
[  703.104149]  [] linkwatch_do_dev+0x40/0x60
[  703.104149]  [] __linkwatch_run_queue+0xef/0x200
[  703.104149]  [] linkwatch_event+0x25/0x30
[  703.104149]  [] process_one_work+0x17b/0x470
[  703.104149]  [] worker_thread+0x11b/0x400
[  703.104149]  [] ? rescuer_thread+0x400/0x400
[  703.104149]  [] kthread+0xcf/0xe0
[  703.104149]  [] ? kthread_create_on_node+0x140/0x140
[  703.104149]  [] ret_from_fork+0x58/0x90
[  703.104149]  [] ? kthread_create_on_node+0x140/0x140
[  703.134274] Code: 48 8b 3b 4c 89 e6 e8 7e 7e f4 ff 48 83 c4 20 44 89 f0 5b 
41 5c 41 5d 41 5e 5d c3 66 0f 1f 44 00 00 49 8b 04 24 0f be 10 c1 ea 1f <41> 89 
95 68 0d 00 00 0f b6 50 05 83 e2 6f 80 fa 40 0f 87 b7 00 
[  703.134274] RIP  [] mlx4_en_QUERY_PORT+0xa2/0x190 [mlx4_en]
[  703.134274]  RSP 
[  703.134274] CR2: 0d68
[  703.134274] ---[ end trace 76a7da47a517c30b ]---
[  703.134274] Kernel panic - not syncing: Fatal exception


> 
> 1)
> 2017-06-28T02:18:16.461384+08:00[880983.488036] do nothing after die!
> 2017-06-28T02:18:16.462068+08:00[880983.488723] Modules linked in: fuse 
> iptable_filter sha512_generic icp_qa_al_vf(OVE) vfat fat isofs ext4 jbd2 xfs 
> libcrc32c kboxdriver(O) ipmi_devintf ipmi_si ipmi_msghandler kbox(O) 
> signo_catch(O) mlx4_core(OVE) compat(OVE) ppdev crc32_pclmul 
> ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd 
> pcspkr parport_pc i2c_piix4 parport i2c_core ip_tables ext3 mbcache jbd 
> ata_generic pata_acpi virtio_console(OVE) virtio_balloon(OVE)

Re: [PATCH 3/5] pwm: rockchip: Move the configuration of polarity from rockchip_pwm_set_enable() to rockchip_pwm_config()

2017-07-04 Thread David.Wu


Hi Boris,

在 2017/7/4 2:36, Boris Brezillon 写道:

Hm, maybe it's time to drop these custom hooks and implement
pwm_apply_v1 and pwm_apply_v2 instead.


Okay, drop the enable and config hooks, only use the apply hook to 
instead them.

Re: [GIT PULL] USB/PHY patches for 4.13-rc1

2017-07-04 Thread Geert Uytterhoeven

Hi Greg, Heikki,

On Mon, Jul 3, 2017 at 4:58 PM, Greg KH  wrote:
> The following changes since commit 41f1830f5a7af77cf5c86359aba3cbd706687e52:
>
>   Linux 4.12-rc6 (2017-06-19 22:19:37 +0800)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/ 
> tags/usb-4.13-rc1
>
> for you to fetch changes up to 6836796de4019944f4ba4c99a360e8250fd2e735:
>
>   Add USB quirk for HVR-950q to avoid intermittent device resets (2017-06-29 
> 14:49:06 +0200)
>
> 
> USB/PHY patches for 4.13-rc1
>
> Here is the big patchset of USB and PHY driver updates for 4.13-rc1.
>
> On the PHY side, they decided to move files around to "make things
> easier" in their tree.  Hopefully that wasn't a mistake, but in
> linux-next testing, we haven't had any reported problems.
>
> There's the usual set of gadget and xhci and musb updates in here as
> well, along with a number of smaller updates for a raft of different USB
> drivers.  Full details in the shortlog, nothing really major.
>
> All of these have been in linux-next for a while with no reported
> issues.
>
> Signed-off-by: Greg Kroah-Hartman 
>
> 

> Heikki Krogerus (3):
>   usb: typec: Add support for UCSI interface

Commit c1b0bc2dabfa884d ("usb: typec: Add support for UCSI interface"):

> --- /dev/null
> +++ b/drivers/usb/typec/ucsi/Kconfig
> @@ -0,0 +1,23 @@
> +config TYPEC_UCSI
> +   tristate "USB Type-C Connector System Software Interface driver"
> +   depends on !CPU_BIG_ENDIAN

To work as expected, and prevent this driver from being enabled on big endian
systems, this depends on "[PATCH v3 0/3] Define CPU_BIG_ENDIAN or warn for
inconsistencies".
https://lkml.org/lkml/2017/6/12/1068

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH] x86/platform/uv/BAU: minor cleanup, make some local functions static

2017-07-04 Thread Dou Liyang


Hi Thomas,

At 07/04/2017 03:19 PM, Thomas Gleixner wrote:

On Tue, 4 Jul 2017, Dou Liyang wrote:

At 07/03/2017 10:22 PM, Colin King wrote:

-int normal_busy(struct bau_control *bcp)
+static int normal_busy(struct bau_control *bcp)


In my opinion, there is no need to mark *normal_busy* static, remove it
directly.

the commit c5d35d399e68(x86/UV2: Work around BAU bug) add it to
handle_uv2_busy(), but the handle_uv2_busy() is rewritten now. the
normal_busy is unused, can be remove.


Correct.


By the way, there are also an other function named
uv_bau_message_interrupt() can be remove.


Not so much.

# git grep uv_bau_message_interrupt arch/x86/
arch/x86/entry/entry_64.S:apicinterrupt3 UV_BAU_MESSAGE 
uv_bau_message_intr1uv_bau_message_interrupt
arch/x86/platform/uv/tlb_uv.c:void uv_bau_message_interrupt(struct pt_regs 
*regs)



Oops, Indeed! you are right. ;)

Thanks,

dou.


Thanks,

tglx

Re: [PATCH v6 17/18] xen/pvcalls: implement write

2017-07-04 Thread Juergen Gross

On 03/07/17 23:08, Stefano Stabellini wrote:
> When the other end notifies us that there is data to be written
> (pvcalls_back_conn_event), increment the io and write counters, and
> schedule the ioworker.
> 
> Implement the write function called by ioworker by reading the data from
> the data ring, writing it to the socket by calling inet_sendmsg.
> 
> Set out_error on error.
> 
> Signed-off-by: Stefano Stabellini 

Reviewed-by: Juergen Gross 


Thanks,

Juergen

Re: [PATCH v6 18/18] xen: introduce a Kconfig option to enable the pvcalls backend

2017-07-04 Thread Juergen Gross

On 03/07/17 23:08, Stefano Stabellini wrote:
> Also add pvcalls-back to the Makefile.
> 
> Signed-off-by: Stefano Stabellini 

Reviewed-by: Juergen Gross 


Thanks,

Juergen

[PATCH v2 1/2] x86/boot/KASLR: Adapt process_e820_entry for any type of memory entry

2017-07-04 Thread Baoquan He

Now function process_e820_entry is only used to process e820 memory
entries. Adapt it for any type of memory entry, not just for e820.
Later we will use it to process efi mirror regions.

So rename the old process_e820_entry to process_mem_region, and
extract and wrap the e820 specific processing code into process_e820_entry.

Signed-off-by: Baoquan He 
---
 arch/x86/boot/compressed/kaslr.c | 60 ++--
 1 file changed, 33 insertions(+), 27 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 91f27ab970ef..85c360eec4a6 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -479,35 +479,31 @@ static unsigned long slots_fetch_random(void)
return 0;
 }
 
-static void process_e820_entry(struct boot_e820_entry *entry,
+static void process_mem_region(struct mem_vector *entry,
   unsigned long minimum,
   unsigned long image_size)
 {
struct mem_vector region, overlap;
struct slot_area slot_area;
unsigned long start_orig, end;
-   struct boot_e820_entry cur_entry;
-
-   /* Skip non-RAM entries. */
-   if (entry->type != E820_TYPE_RAM)
-   return;
+   struct mem_vector cur_entry;
 
/* On 32-bit, ignore entries entirely above our maximum. */
-   if (IS_ENABLED(CONFIG_X86_32) && entry->addr >= KERNEL_IMAGE_SIZE)
+   if (IS_ENABLED(CONFIG_X86_32) && entry->start >= KERNEL_IMAGE_SIZE)
return;
 
/* Ignore entries entirely below our minimum. */
-   if (entry->addr + entry->size < minimum)
+   if (entry->start + entry->size < minimum)
return;
 
/* Ignore entries above memory limit */
-   end = min(entry->size + entry->addr, mem_limit);
-   if (entry->addr >= end)
+   end = min(entry->size + entry->start, mem_limit);
+   if (entry->start >= end)
return;
-   cur_entry.addr = entry->addr;
-   cur_entry.size = end - entry->addr;
+   cur_entry.start = entry->start;
+   cur_entry.size = end - entry->start;
 
-   region.start = cur_entry.addr;
+   region.start = cur_entry.start;
region.size = cur_entry.size;
 
/* Give up if slot area array is full. */
@@ -522,7 +518,7 @@ static void process_e820_entry(struct boot_e820_entry 
*entry,
region.start = ALIGN(region.start, CONFIG_PHYSICAL_ALIGN);
 
/* Did we raise the address above this e820 region? */
-   if (region.start > cur_entry.addr + cur_entry.size)
+   if (region.start > cur_entry.start + cur_entry.size)
return;
 
/* Reduce size by any delta from the original address. */
@@ -562,12 +558,31 @@ static void process_e820_entry(struct boot_e820_entry 
*entry,
}
 }
 
-static unsigned long find_random_phys_addr(unsigned long minimum,
-  unsigned long image_size)
+static void process_e820_entry(unsigned long minimum, unsigned long image_size)
 {
int i;
-   unsigned long addr;
+   struct mem_vector region;
+   struct boot_e820_entry *entry;
+
+   /* Verify potential e820 positions, appending to slots list. */
+   for (i = 0; i < boot_params->e820_entries; i++) {
+   entry = _params->e820_table[i];
+   /* Skip non-RAM entries. */
+   if (entry->type != E820_TYPE_RAM)
+   continue;
+   region.start = entry->addr;
+   region.size = entry->size;
+   process_mem_region(, minimum, image_size);
+   if (slot_area_index == MAX_SLOT_AREA) {
+   debug_putstr("Aborted e820 scan (slot_areas full)!\n");
+   break;
+   }
+   }
+}
 
+static unsigned long find_random_phys_addr(unsigned long minimum,
+  unsigned long image_size)
+{
/* Check if we had too many memmaps. */
if (memmap_too_large) {
debug_putstr("Aborted e820 scan (more than 4 memmap= args)!\n");
@@ -577,16 +592,7 @@ static unsigned long find_random_phys_addr(unsigned long 
minimum,
/* Make sure minimum is aligned. */
minimum = ALIGN(minimum, CONFIG_PHYSICAL_ALIGN);
 
-   /* Verify potential e820 positions, appending to slots list. */
-   for (i = 0; i < boot_params->e820_entries; i++) {
-   process_e820_entry(_params->e820_table[i], minimum,
-  image_size);
-   if (slot_area_index == MAX_SLOT_AREA) {
-   debug_putstr("Aborted e820 scan (slot_areas full)!\n");
-   break;
-   }
-   }
-
+   process_e820_entry(minimum, image_size);
return slots_fetch_random();
 }
 
-- 
2.5.5

[PATCH v2 2/2] x86/boot/KASLR: Restrict kernel to be randomized in mirror regions

2017-07-04 Thread Baoquan He

Kernel text may be located in non-mirror regions (movable zone) when both
address range mirroring feature and KASLR are enabled.

The address range mirroring feature arranges such mirror region into
normal zone and other region into movable zone in order to locate
kernel code and data in mirror region. The physical memory region
whose descriptors in EFI memory map has EFI_MEMORY_MORE_RELIABLE
attribute (bit: 16) are mirrored.

If efi is detected, iterate efi memory map and pick the mirror region to
process for adding candidate of randomization slot. If efi is disabled
or no mirror region found, still process e820 memory map.

Signed-off-by: Baoquan He 
---
 arch/x86/boot/compressed/kaslr.c | 52 +++-
 1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 85c360eec4a6..5f10e0b10ef4 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -37,7 +37,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 
 /* Macros used by the included decompressor code below. */
 #define STATIC
@@ -558,6 +560,50 @@ static void process_mem_region(struct mem_vector *entry,
}
 }
 
+/* Marks if efi mirror regions have been found and handled. */
+static bool efi_mirror_found;
+
+static void process_efi_entry(unsigned long minimum, unsigned long image_size)
+{
+   struct efi_info *e = _params->efi_info;
+   struct mem_vector region;
+   efi_memory_desc_t *md;
+   unsigned long pmap;
+   char *signature;
+   u32 nr_desc;
+   int i;
+
+
+#ifdef CONFIG_EFI
+   signature = (char *)_params->efi_info.efi_loader_signature;
+#endif
+   if (strncmp(signature, EFI32_LOADER_SIGNATURE, 4) &&
+   strncmp(signature, EFI64_LOADER_SIGNATURE, 4))
+   return;
+
+#ifdef CONFIG_X86_32
+   /* Can't handle data above 4GB at this time */
+   if (e->efi_memmap_hi) {
+   warn("Memory map is above 4GB, EFI should be disabled.\n");
+   return;
+   }
+   pmap =  e->efi_memmap;
+#else
+   pmap = (e->efi_memmap | ((__u64)e->efi_memmap_hi << 32));
+#endif
+
+   nr_desc = e->efi_memmap_size / e->efi_memdesc_size;
+   for (i = 0; i < nr_desc; i++) {
+   md = (efi_memory_desc_t *)(pmap + (i * e->efi_memdesc_size));
+   if (md->attribute & EFI_MEMORY_MORE_RELIABLE) {
+   region.start = md->phys_addr;
+   region.size = md->num_pages << EFI_PAGE_SHIFT;
+   process_mem_region(, minimum, image_size);
+   efi_mirror_found = true;
+   }
+   }
+}
+
 static void process_e820_entry(unsigned long minimum, unsigned long image_size)
 {
int i;
@@ -592,6 +638,10 @@ static unsigned long find_random_phys_addr(unsigned long 
minimum,
/* Make sure minimum is aligned. */
minimum = ALIGN(minimum, CONFIG_PHYSICAL_ALIGN);
 
+   process_efi_entry(minimum, image_size);
+   if (efi_mirror_found)
+   return slots_fetch_random();
+
process_e820_entry(minimum, image_size);
return slots_fetch_random();
 }
@@ -651,7 +701,7 @@ void choose_random_location(unsigned long input,
 */
min_addr = min(*output, 512UL << 20);
 
-   /* Walk e820 and find a random address. */
+   /* Walk available memory entries to find a random address. */
random_addr = find_random_phys_addr(min_addr, output_size);
if (!random_addr) {
warn("Physical KASLR disabled: no suitable memory region!");
-- 
2.5.5

Re: [PATCH v6 03/18] xen/pvcalls: initialize the module and register the xenbus backend

2017-07-04 Thread Juergen Gross

On 03/07/17 23:08, Stefano Stabellini wrote:
> Keep a list of connected frontends. Use a semaphore to protect list
> accesses.
> 
> Signed-off-by: Stefano Stabellini 
> Reviewed-by: Boris Ostrovsky 

Reviewed-by: Juergen Gross 


Thanks,

Juergen

Re: [GIT PULL] s390 patches for 4.13 merge window

2017-07-04 Thread Martin Schwidefsky

On Tue, 4 Jul 2017 10:05:30 +0200
Martin Schwidefsky  wrote:

> On Tue, 4 Jul 2017 17:58:18 +1000
> Stephen Rothwell  wrote:
> 
> > Hi Linus,
> > 
> > On Mon, 3 Jul 2017 15:46:00 -0700 Linus Torvalds 
> >  wrote:  
> > >
> > > On Mon, Jul 3, 2017 at 2:01 AM, Martin Schwidefsky
> > >  wrote:
> > > >
> > > > please pull from the 'for-linus' branch of
> > > >
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git 
> > > > for-linus  
> > > 
> > > So my conflict resolution looks different from the one Stephen posted,
> > > which may be due to various reasons, ranging from "linux-next has
> > > other things that conflict" to just "I didn't notice some semantic
> > > conflict since unlike linux-next I don't build for s390".
> > > 
> > > Regardless, you should check my current -git tree just to verify, and
> > > send me a patch if I screwed something up.
> > 
> > At least part of the difference is the following merge fix patch I have
> > been carrying.  It is needed due to a build failure.
> > 
> > From: Stephen Rothwell 
> > Date: Tue, 13 Jun 2017 20:51:32 +1000
> > Subject: [PATCH] s390: fix up for "blk-mq: switch ->queue_rq return value to
> >  blk_status_t"
> > 
> > Signed-off-by: Stephen Rothwell 
> > ---
> >  drivers/s390/block/scm_blk.c | 10 +-
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/s390/block/scm_blk.c b/drivers/s390/block/scm_blk.c
> > index 42018a20f2b7..0071febac9e6 100644
> > --- a/drivers/s390/block/scm_blk.c
> > +++ b/drivers/s390/block/scm_blk.c
> > @@ -278,7 +278,7 @@ struct scm_queue {
> > spinlock_t lock;
> >  };
> > 
> > -static int scm_blk_request(struct blk_mq_hw_ctx *hctx,
> > +static blk_status_t scm_blk_request(struct blk_mq_hw_ctx *hctx,
> >const struct blk_mq_queue_data *qd)
> >  {
> > struct scm_device *scmdev = hctx->queue->queuedata;
> > @@ -290,7 +290,7 @@ static int scm_blk_request(struct blk_mq_hw_ctx *hctx,
> > spin_lock(>lock);
> > if (!scm_permit_request(bdev, req)) {
> > spin_unlock(>lock);
> > -   return BLK_MQ_RQ_QUEUE_BUSY;
> > +   return BLK_STS_RESOURCE;
> > }
> > 
> > scmrq = sq->scmrq;
> > @@ -299,7 +299,7 @@ static int scm_blk_request(struct blk_mq_hw_ctx *hctx,
> > if (!scmrq) {
> > SCM_LOG(5, "no request");
> > spin_unlock(>lock);
> > -   return BLK_MQ_RQ_QUEUE_BUSY;
> > +   return BLK_STS_RESOURCE;
> > }
> > scm_request_init(bdev, scmrq);
> > sq->scmrq = scmrq;
> > @@ -315,7 +315,7 @@ static int scm_blk_request(struct blk_mq_hw_ctx *hctx,
> > 
> > sq->scmrq = NULL;
> > spin_unlock(>lock);
> > -   return BLK_MQ_RQ_QUEUE_BUSY;
> > +   return BLK_STS_RESOURCE;
> > }
> > blk_mq_start_request(req);
> > 
> > @@ -324,7 +324,7 @@ static int scm_blk_request(struct blk_mq_hw_ctx *hctx,
> > sq->scmrq = NULL;
> > }
> > spin_unlock(>lock);
> > -   return BLK_MQ_RQ_QUEUE_OK;
> > +   return BLK_STS_OK;
> >  }
> > 
> >  static int scm_blk_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,  
> 
> This is the same patch I came up with to get it to compile. I asked
> Sebastian to verify that the driver actually works with these changes.

Looks good. Sebastian confirmed that the scm driver will be fine with the
add-on patch from Stephen.

@Linus:
I can add this to the s390 tree and sent the patch with the next please-pull.
Or you can apply the patch directly, whatever you prefer.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

Re: [PATCH][RFC] x86: Fix the irq affinity in fixup_cpus

2017-07-04 Thread Thomas Gleixner

On Mon, 3 Jul 2017, Chen Yu wrote:
> On Sun, Jun 04, 2017 at 10:04:53PM +0200, Thomas Gleixner wrote:
> > After looking at the callsites, it's safe to change
> > irq_set_affinity_locked() so that it uses the direct affinity setter
> > function when force == true.
> > 
> Sorry it took me sometime to understand this point(this is why I did not reply
> to you at the first time :-)
> I thought the defination of the word 'safe' here means, we should
> not adjust the irq affinity in the process context if the ISR is
> still running, otherwise there might be a race condition.
> 
> Currently, there are four drivers would set the force flag to true(AKA,
> invoking irq_force_affinity()).
> 
> 1. exynos4_mct_starting_cpu()
>The irq affinity is set before the clockevent is registered,
>so there would be no interrupt triggered when adjusting
>the irq affinity in the process context. Safe.
> 
> 2. sirfsoc_local_timer_starting_cpu()
>The same as above. Safe.
> 
> 3. arm_perf_starting_cpu()
>During cpu offline, the pmu interrupt(non percpu pmu interrupt)
>might be migrated to other online cpus. Then once the same cpu
>is put online, the interrupt will be set back to this cpu again
>by invoking irq_force_affinity(), but currently the pmu interrupt
>might be still running on other cpus, so it would be unsafe to adjust
>its irq affinity in the process context?

No, that's not an issue. The ability to move interrupts in process context,
or better said in any context, has nothing to do with a concurrent
interrupt. The normal mechanics for most architectures/interrupt
controllers is just to program the new affinity setting which will take
effect with the next delivered interrupt.

We just have these ill designed hardware implementations which do not allow
that. They require to change the interrupt affinity at the point when the
interrupt is handled on the original target CPU. But that's hard to achieve
when the CPU is about to go offline, because we might wait forever for an
interrupt to be raised. So in that case we need to forcefully move them
away and take the risk of colliding with an actual interrupt being raised
in hardware concurrently which has the potential to confuse the interrupt
chip.

> 4. sunhv_migrate_hvcons_irq()
>The cpu who encountered a panic needs to migrate the hvcons irq to the
>current alive cpu, and send ipi to stop other cpus. So at the time to
>adjust the irq affinity for the hvcons, the interrupt of the latter might
>be running and it might be unsafe to adjust the irq affinity in the
>process context?

None of these are related to that problem. All of these architectures can
move interrupts in process context unconditionally. It's also not relevant
which callsites invoke irq_set_affinity_locked() with force=true.

The point is whether we can change the semantics of irq_set_affinity_locked()
without breaking something.

But please answer my other mail in that thread [1] first before we start
about changing anything in that area. The affinity related changes are in
Linus tree now, so please retest against that as well.

Thanks,

tglx

[1] http://lkml.kernel.org/r/alpine.DEB.2.20.1706282036330.1890@nanos

Re: [PATCH stable-only] mm: fix classzone_idx underflow in shrink_zones()

2017-07-04 Thread Michal Hocko

On Tue 04-07-17 10:45:43, Vlastimil Babka wrote:
> Hi,
> 
> I realize this is against the standard stable policy, but I see no other
> way, because the mainline accidental fix is part of 34+ patch reclaim
> rework, that would be absurd to try to backport into stable. The fix is
> a one-liner though.
> 
> The bug affects at least 4.4.y, and likely also older stable trees that
> backported commit 7bf52fb891b6, which itself was a fix for 3.19 commit
> 6b4f7799c6a5. You could revert the 7bf52fb891b6 backport, but then 32bit
> with highmem might suffer from OOM or thrashing.
> 
> More details in the changelog itself.
> 
> 8<
> >From a1a1e459276298ac98827520e07923aa1219dbe1 Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka 
> Date: Thu, 22 Jun 2017 16:23:13 +0200
> Subject: [PATCH] mm: fix classzone_idx underflow in shrink_zones()
> 
> We've got reported a BUG in do_try_to_free_pages():
> 
> BUG: unable to handle kernel paging request at 8ff28990
> IP: [] do_try_to_free_pages+0x140/0x490
> PGD 0
> Oops:  [#1] SMP
> megaraid_sas sg scsi_mod efivarfs autofs4
> Supported: No, Unsupported modules are loaded
> Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> task: 88ffd0d4c540 ti: 88ffd0e48000 task.ti: 88ffd0e48000
> RIP: 0010:[]  [] 
> do_try_to_free_pages+0x140/0x490
> RSP: 0018:88ffd0e4ba60  EFLAGS: 00010206
> RAX: 06fff900 RBX:  RCX: 88f29000
> RDX: 0000 RSI: 0003 RDI: 024200c8
> RBP: 01320122 R08:  R09: 88ffd0e4bbac
> R10:  R11:  R12: 88ffd0e4bae0
> R13: 0e00 R14: 88f2a500 R15: 88f2b300
> FS:  () GS:88ffe644() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 8ff28990 CR3: 01c0a000 CR4: 003406e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Stack:
>  0002db570a80 024200c8001e 88f2b300 
>  88fd5700 88ffd0d4c540 88ffd0d4c540 000c
>   0040 024200c8 88ffd0e4bae0
> Call Trace:
>  [] try_to_free_pages+0xba/0x170
>  [] __alloc_pages_nodemask+0x53f/0xb20
>  [] alloc_pages_current+0x7f/0x100
>  [] migrate_pages+0x202/0x710
>  [] __offline_pages.constprop.23+0x4ba/0x790
>  [] memory_subsys_offline+0x43/0x70
>  [] device_offline+0x7d/0xa0
>  [] acpi_bus_offline+0xa5/0xef
>  [] acpi_device_hotplug+0x21b/0x41f
>  [] acpi_hotplug_work_fn+0x1a/0x23
>  [] process_one_work+0x14e/0x410
>  [] worker_thread+0x116/0x490
>  [] kthread+0xbd/0xe0
>  [] ret_from_fork+0x3f/0x70
> 
> This translates to the loop in shrink_zone():
> 
> classzone_idx = requested_highidx;
> while (!populated_zone(zone->zone_pgdat->node_zones +
>   classzone_idx))
>   classzone_idx--;
> 
> where no zone is populated, so classzone_idx becomes -1 (in RBX).
> 
> Added debugging output reveals that we enter the function with
> sc->gfp_mask == GFP_NOFS|__GFP_NOFAIL|__GFP_HARDWALL|__GFP_MOVABLE
> requested_highidx = gfp_zone(sc->gfp_mask) == 2 (ZONE_NORMAL)
> 
> Inside the for loop, however:
> gfp_zone(sc->gfp_mask) == 3 (ZONE_MOVABLE)
> 
> This means we have gone through this branch:
> 
> if (buffer_heads_over_limit)
> sc->gfp_mask |= __GFP_HIGHMEM;
> 
> This changes the gfp_zone() result, but requested_highidx remains unchanged.
> On nodes where the only populated zone is movable, the inner while loop will
> check only lower zones, which are not populated, and underflow classzone_idx.
> 
> To sum up, the bug occurs in configurations with ZONE_MOVABLE (such as when
> booted with the movable_node parameter) and only in situations when
> buffer_heads_over_limit is true, and there's an allocation with __GFP_MOVABLE
> and without __GFP_HIGHMEM performing direct reclaim.
> 
> This patch makes sure that classzone_idx starts with the correct zone.
> 
> Mainline has been affected in versions 4.6 and 4.7, but the culprit commit has
> been also included in stable trees.
> In mainline, this has been fixed accidentally as part of 34-patch series (plus
> follow-up fixes) "Move LRU page reclaim from zones to nodes", which makes the
> mainline commit unsuitable for stable backport, unfortunately.
> 
> Fixes: 7bf52fb891b6 ("mm: vmscan: reclaim highmem zone if buffer_heads is 
> over limit")
> Obsoleted-by: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node 
> basis")
> Debugged-by: Michal Hocko 
> Signed-off-by: Vlastimil Babka 
> Cc: Minchan Kim 
> Cc: Johannes Weiner 
> Cc: Mel Gorman 
> Cc: 

Acked-by: Michal Hocko 

> 
> ---
>  mm/vmscan.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
>

Re: [PATCH v2] mux: remove the Kconfig question for the subsystem

2017-07-04 Thread Greg KH

On Tue, Jul 04, 2017 at 10:22:44AM +0200, Peter Rosin wrote:
> The MULTIPLEXER question in the Kconfig might be confusing and is
> of dubious value. Remove it. This makes consumers responsible for
> selecting MULTIPLEXER, which they already do.
> 
> Signed-off-by: Peter Rosin 
> ---
>  drivers/mux/Kconfig | 19 +--
>  1 file changed, 5 insertions(+), 14 deletions(-)

Looks good to me, I'll queue it up for after 4.13-rc1, unless Linus
wants to take this now directly.

greg k-h

Re: [PATCH] perf/core: generate overflow signal when samples are dropped (WAS: Re: [REGRESSION] perf/core: PMU interrupts dropped if we entered the kernel in the "skid" region)

2017-07-04 Thread Peter Zijlstra

On Thu, Jun 29, 2017 at 10:12:33AM +0200, Ingo Molnar wrote:
> 
> * Mark Rutland  wrote:
> 
> > It still seems wrong to make up data, though.
> 
> So what we have here is a hardware quirk: we asked for user-space samples, 
> but 
> didn't get them and we cannot expose the kernel-internal address.
> 
> The question is, how do we handle the hardware quirk. Since we cannot fix the 
> hardware on existing systems there's really just two choices:
> 
>  - Lose the sample (and signal it as a lost sample)
> 
>  - Keep the sample but change the sensitive kernel-internal address to 
> something 
>that is not sensitive: 0 or -1 works, but we could perhaps also return a 
>well-known user-space address such as the vDSO syscall trampoline or such?
> 
> there's no other option really.
> 
> I'd lean towards Vince's take: losing samples is more surprising than getting 
> the 
> occasional sample with some sanitized data in it.
> 
> If we make the artificial data still a meaningful user-space address, related 
> to 
> kernel entries, then it might even be a bonus, as users would learn to 
> recognize 
> it as: 'oh, skid artifact, I know about that'.

So while we could easily fake SAMPLE_IP to do as you suggest, other
entries might be much harder to fake. That said, I have no problems with
just 0 stuffing them.

The only real problem is determining how much to stuff I suppose.

Re: [PATCH v9 2/3] PCI: Add tango PCIe host bridge support

2017-07-04 Thread Ard Biesheuvel

On 4 July 2017 at 09:19, Jisheng Zhang  wrote:
> On Tue, 4 Jul 2017 09:02:06 +0100 Ard Biesheuvel wrote:
>
>> On 4 July 2017 at 07:58, Jisheng Zhang  wrote:
>> > On Mon, 3 Jul 2017 08:27:04 -0500 wrote:
>> >
>> >> [+cc Jingoo, Joao]
>> >>
>> >> On Mon, Jul 03, 2017 at 10:35:50AM +0100, Ard Biesheuvel wrote:
>> >> > On 3 July 2017 at 00:18, Bjorn Helgaas  wrote:
>> >> > > On Tue, Jun 20, 2017 at 10:17:40AM +0200, Marc Gonzalez wrote:
>> >> > >> This driver is required to work around several hardware bugs
>> >> > >> in the PCIe controller.
>> >> > >>
>> >> > >> NB: Revision 1 does not support legacy interrupts, or IO space.
>> >> > >
>> >> > > I had to apply these manually because of conflicts in Kconfig and
>> >> > > Makefile.  What are these based on?  Easiest for me is if you base
>> >> > > them on the current -rc1 tag.
>> >> > >
>> >> > >> Signed-off-by: Marc Gonzalez 
>> >> > >> ---
>> >> > >>  drivers/pci/host/Kconfig  |   8 +++
>> >> > >>  drivers/pci/host/Makefile |   1 +
>> >> > >>  drivers/pci/host/pcie-tango.c | 164 
>> >> > >> ++
>> >> > >>  include/linux/pci_ids.h   |   2 +
>> >> > >>  4 files changed, 175 insertions(+)
>> >> > >>  create mode 100644 drivers/pci/host/pcie-tango.c
>> >> > >>
>> >> > [..]
>> >> > >> + /*
>> >> > >> +  * QUIRK #2
>> >> > >> +  * Unfortunately, config and mem spaces are muxed.
>> >> > >> +  * Linux does not support such a setting, since drivers are 
>> >> > >> free
>> >> > >> +  * to access mem space directly, at any time.
>> >> > >> +  * Therefore, we can only PRAY that config and mem space 
>> >> > >> accesses
>> >> > >> +  * NEVER occur concurrently.
>> >> > >> +  */
>> >> > >> + writel_relaxed(1, pcie->mux);
>> >> > >> + ret = pci_generic_config_read(bus, devfn, where, size, val);
>> >> > >> + writel_relaxed(0, pcie->mux);
>> >> > >
>> >> > > I'm very hesitant about this.  When people stress this, we're going to
>> >> > > get reports of data corruption.  Even with the disclaimer below, I
>> >> > > don't feel good about this.  Adding the driver is an implicit claim
>> >> > > that we support the device, but we know it can't be made reliable.
>> >> >
>> >> > I noticed that the Synopsys driver suffers from a similar issue: in
>> >> > dw_pcie_rd_other_conf(), it happily reprograms the outbound I/O window
>> >> > to perform a config space access, and switches it back to I/O space
>> >> > afterwards (unless it has more than 2 viewports, in which case it uses
>> >> > dedicated windows for I/O space and config space)
>> >>
>> >> That doesn't sound good.  Jingoo, Joao?  I remember some discussion
>> >> about this, but not the details.
>> >>
>> >> I/O accesses use wrappers (inb(), etc), so there's at least the
>> >> possibility of a mutex to serialize them with respect to config
>> >> accesses.
>> >>
>> >
>> > IIRC, for 2 viewports, we don't need to worry about the config space
>> > access, because config space access is serialized by pci_lock; We
>> > do have race between config space and io space. But the accessing config
>> > space and io space at the same time is rare.
>>
>> Being 'rare' is not sufficient, unfortunately. In the current
>> situation, I/O space accesses may occur when the outbound window is
>> directed to the config space of a potentially completely unrelated
>> device. This is bad.
>
> Yep, I agree with you.
>
>>
>> > And the PCIe EPs which
>> > has io space are rare too, supporting these EPs are not the potential
>> > target of those platforms with 2 viewports.
>> >
>>
>> I am not sure EP mode is relevant here. What I do know is that boards
>
> I mean those PCIe EP cards which have IO space, but that doesn't matter.
>

Ah, ok. But if such EP cards are so rare, why expose I/O space at all
if we cannot do it safely?

>> like the Marvell 8040 based MacchiatoBin uses this IP in RC mode, and
>> exposes config, MMIO and IO space windows using only 2 viewports. Note
>> that this is essentially a bug in the DT description, given that its
>> version of the IP supports 8 viewports. But the driver needs to be
>> fixed as well.
>
> To fix for 2 viewports situation, we need to serialize access of the io
> and config space. In internal repo, we can achieve it by modifying the
> io access helper functions such as inl/outl, but this won't be accepted
> by the mainline IMHO. Except fixing the HW, any elegant solution?
>
> Suggestions are appreciated.
>

I think the safe and upstreamable approach is to disable the I/O
window completely if num-viewports <= 2.

Re: [tip:sched/core] sched/cputime: Refactor the cputime_adjust() code

2017-07-04 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> Argh, no... That code was perfectly fine. The new code otoh is
> convoluted crap.
> 
> It had the form:
> 
>   if (exception1)
> deal with exception1
> 
>   if (execption2)
> deal with exception2
> 
>   do normal stuff
> 
> Which is as simple and straight forward as it gets.
> 
> The new code otoh reads like:
> 
>   if (!exception1) {
>   if (exception2)
> deal with exception 2
>   else
> do normal stuff
>   }
> 
> which is absolute shit.
> 
> So NAK on this.

Agreed - I've queued up a revert.

Note that I fixed the old comment, which was arguably wrong:

/*
 * If either stime or both stime and utime are 0, assume all runtime is
 * userspace. Once a task gets some ticks, the monotonicy code at
 * 'update' will ensure things converge to the observed ratio.
 */

The correct comment is something like:

/*
 * If either stime or utime are 0, assume all runtime is userspace.
 * Once a task gets some ticks, the monotonicy code at 'update:'
 * will ensure things converge to the observed ratio.
 */

Thanks,

Ingo

Re: [PATCH mm] introduce reverse buddy concept to reduce buddy fragment

2017-07-04 Thread Michal Hocko

On Tue 04-07-17 09:21:00, zhouxianrong wrote:
> the test was done as follows:
> 
> 1. the environment is android 7.0 and kernel is 4.1 and managed memory is 
> 3.5GB

There have been many changes in the compaction proper since than. Do you
see the same problem with the current upstream kernel?

> 2. every 4s startup one apk, total 100 more apks need to startup
> 3. after finishing step 2, sample buddyinfo once and get the result

How stable are those results?
-- 
Michal Hocko
SUSE Labs

Re: [PATCH v6 05/18] xen/pvcalls: connect to a frontend

2017-07-04 Thread Juergen Gross

On 03/07/17 23:08, Stefano Stabellini wrote:
> Introduce a per-frontend data structure named pvcalls_fedata. It
> contains pointers to the command ring, its event channel, a list of
> active sockets and a tree of passive sockets (passing sockets need to be
> looked up from the id on listen, accept and poll commands, while active
> sockets only on release).
> 
> It also has an unbound workqueue to schedule the work of parsing and
> executing commands on the command ring. socket_lock protects the two
> lists. In pvcalls_back_global, keep a list of connected frontends.
> 
> Signed-off-by: Stefano Stabellini 
> Reviewed-by: Boris Ostrovsky 
> CC: boris.ostrov...@oracle.com
> CC: jgr...@suse.com
> ---
>  drivers/xen/pvcalls-back.c | 92 
> ++
>  1 file changed, 92 insertions(+)
> 
> diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c
> index 7bce750..e4c2e46 100644
> --- a/drivers/xen/pvcalls-back.c
> +++ b/drivers/xen/pvcalls-back.c
> @@ -33,9 +33,101 @@ struct pvcalls_back_global {
>   struct semaphore frontends_lock;
>  } pvcalls_back_global;
>  
> +/*
> + * Per-frontend data structure. It contains pointers to the command
> + * ring, its event channel, a list of active sockets and a tree of
> + * passive sockets.
> + */
> +struct pvcalls_fedata {
> + struct list_head list;
> + struct xenbus_device *dev;
> + struct xen_pvcalls_sring *sring;
> + struct xen_pvcalls_back_ring ring;
> + int irq;
> + struct list_head socket_mappings;
> + struct radix_tree_root socketpass_mappings;
> + struct semaphore socket_lock;
> + struct workqueue_struct *wq;
> + struct work_struct register_work;
> +};
> +
> +static void pvcalls_back_work(struct work_struct *work)
> +{
> +}
> +
> +static irqreturn_t pvcalls_back_event(int irq, void *dev_id)
> +{
> + return IRQ_HANDLED;
> +}
> +
>  static int backend_connect(struct xenbus_device *dev)
>  {
> + int err, evtchn;
> + grant_ref_t ring_ref;
> + struct pvcalls_fedata *fedata = NULL;
> +
> + fedata = kzalloc(sizeof(struct pvcalls_fedata), GFP_KERNEL);
> + if (!fedata)
> + return -ENOMEM;
> +
> + err = xenbus_scanf(XBT_NIL, dev->otherend, "port", "%u",
> +);
> + if (err != 1) {
> + err = -EINVAL;
> + xenbus_dev_fatal(dev, err, "reading %s/event-channel",
> +  dev->otherend);
> + goto error;
> + }
> +
> + err = xenbus_scanf(XBT_NIL, dev->otherend, "ring-ref", "%u", _ref);
> + if (err != 1) {
> + err = -EINVAL;
> + xenbus_dev_fatal(dev, err, "reading %s/ring-ref",
> +  dev->otherend);
> + goto error;
> + }
> +
> + err = bind_interdomain_evtchn_to_irqhandler(dev->otherend_id, evtchn,
> + pvcalls_back_event, 0,
> + "pvcalls-backend", dev);
> + if (err < 0)
> + goto error;
> + fedata->irq = err;
> +
> + fedata->wq = alloc_workqueue("pvcalls_back_wq", WQ_UNBOUND, 1);
> + if (!fedata->wq) {
> + err = -ENOMEM;
> + goto error;
> + }
> +
> + err = xenbus_map_ring_valloc(dev, _ref, 1, (void**)>sring);
> + if (err < 0)
> + goto error;
> +
> + BACK_RING_INIT(>ring, fedata->sring, XEN_PAGE_SIZE * 1);
> + fedata->dev = dev;
> +
> + INIT_WORK(>register_work, pvcalls_back_work);
> + INIT_LIST_HEAD(>socket_mappings);
> + INIT_RADIX_TREE(>socketpass_mappings, GFP_KERNEL);
> + sema_init(>socket_lock, 1);
> + dev_set_drvdata(>dev, fedata);
> +
> + down(_back_global.frontends_lock);
> + list_add_tail(>list, _back_global.frontends);
> + up(_back_global.frontends_lock);
> + queue_work(fedata->wq, >register_work);
> +
>   return 0;
> +
> + error:
> + if (fedata->sring != NULL)
> + xenbus_unmap_ring_vfree(dev, fedata->sring);
> + if (fedata->wq)
> + destroy_workqueue(fedata->wq);
> + unbind_from_irqhandler(fedata->irq, dev);

fedata->irq might have not been set and can be zero here. irq 0 is
a valid irq, I think.


Juergen

Re: [PATCH BUGFIX] block, bfq: fix bug causing crashes

2017-07-04 Thread Paolo Valente


> Il giorno 04 lug 2017, alle ore 00:49, Jens Axboe  ha 
> scritto:
> 
> On 07/03/2017 02:00 AM, Paolo Valente wrote:
>> Hi Jens,
>> I'm writing this short cover letter to hopefully help you decide what
>> to do with this patch, in this late phase of the development
>> cycle. This patch fixes a bug causing kernel crashes for at least
>> one year. Crashes apparently affect only a minority of users, but are
>> systematic for them (a crash every few tens of minutes for some).
> 
> By the time you wrote that email, 4.12 was already released for hours.

Oops ...

> So there's really no choice this time but to queue it up for 4.13.
> 

At least, no decision to make :)

Thanks,
Paolo

> -- 
> Jens Axboe
>

Re: [PATCH v9 2/3] PCI: Add tango PCIe host bridge support

2017-07-04 Thread Jisheng Zhang

On Tue, 4 Jul 2017 14:58:40 +0800 Jisheng Zhang wrote:

> On Mon, 3 Jul 2017 08:27:04 -0500 wrote:
> 
> > [+cc Jingoo, Joao]
> > 
> > On Mon, Jul 03, 2017 at 10:35:50AM +0100, Ard Biesheuvel wrote:  
> > > On 3 July 2017 at 00:18, Bjorn Helgaas  wrote:
> > > > On Tue, Jun 20, 2017 at 10:17:40AM +0200, Marc Gonzalez wrote:
> > > >> This driver is required to work around several hardware bugs
> > > >> in the PCIe controller.
> > > >>
> > > >> NB: Revision 1 does not support legacy interrupts, or IO space.
> > > >
> > > > I had to apply these manually because of conflicts in Kconfig and
> > > > Makefile.  What are these based on?  Easiest for me is if you base
> > > > them on the current -rc1 tag.
> > > >
> > > >> Signed-off-by: Marc Gonzalez 
> > > >> ---
> > > >>  drivers/pci/host/Kconfig  |   8 +++
> > > >>  drivers/pci/host/Makefile |   1 +
> > > >>  drivers/pci/host/pcie-tango.c | 164 
> > > >> ++
> > > >>  include/linux/pci_ids.h   |   2 +
> > > >>  4 files changed, 175 insertions(+)
> > > >>  create mode 100644 drivers/pci/host/pcie-tango.c
> > > >>
> > > [..]
> > > >> + /*
> > > >> +  * QUIRK #2
> > > >> +  * Unfortunately, config and mem spaces are muxed.
> > > >> +  * Linux does not support such a setting, since drivers are free
> > > >> +  * to access mem space directly, at any time.
> > > >> +  * Therefore, we can only PRAY that config and mem space accesses
> > > >> +  * NEVER occur concurrently.
> > > >> +  */
> > > >> + writel_relaxed(1, pcie->mux);
> > > >> + ret = pci_generic_config_read(bus, devfn, where, size, val);
> > > >> + writel_relaxed(0, pcie->mux);
> > > >
> > > > I'm very hesitant about this.  When people stress this, we're going to
> > > > get reports of data corruption.  Even with the disclaimer below, I
> > > > don't feel good about this.  Adding the driver is an implicit claim
> > > > that we support the device, but we know it can't be made reliable.
> > > 
> > > I noticed that the Synopsys driver suffers from a similar issue: in
> > > dw_pcie_rd_other_conf(), it happily reprograms the outbound I/O window
> > > to perform a config space access, and switches it back to I/O space
> > > afterwards (unless it has more than 2 viewports, in which case it uses
> > > dedicated windows for I/O space and config space)
> > 
> > That doesn't sound good.  Jingoo, Joao?  I remember some discussion
> > about this, but not the details.
> > 
> > I/O accesses use wrappers (inb(), etc), so there's at least the
> > possibility of a mutex to serialize them with respect to config
> > accesses.
> >   
> 
> IIRC, for 2 viewports, we don't need to worry about the config space
> access, because config space access is serialized by pci_lock; We
> do have race between config space and io space. But the accessing config
> space and io space at the same time is rare. And the PCIe EPs which
> has io space are rare too, supporting these EPs are not the potential
> target of those platforms with 2 viewports.
> 

PS: I think most platforms choose 2 pcie designware viewports just because
it's the default setting. And I have send a feature request to ASIC people
to increase the viewports to 3 for future marvell berlin SoCs.

Re: [PATCH v6 13/18] xen/pvcalls: implement release command

2017-07-04 Thread Juergen Gross

On 03/07/17 23:08, Stefano Stabellini wrote:
> Release both active and passive sockets. For active sockets, make sure
> to avoid possible conflicts with the ioworker reading/writing to those
> sockets concurrently. Set map->release to let the ioworker know
> atomically that the socket will be released soon, then wait until the
> ioworker finishes (flush_work).
> 
> Unmap indexes pages and data rings.
> 
> Signed-off-by: Stefano Stabellini 

Reviewed-by: Juergen Gross 


Thanks,

Juergen

Re: [PATCH v6 14/18] xen/pvcalls: disconnect and module_exit

2017-07-04 Thread Juergen Gross

On 03/07/17 23:08, Stefano Stabellini wrote:
> Implement backend_disconnect. Call pvcalls_back_release_active on active
> sockets and pvcalls_back_release_passive on passive sockets.
> 
> Implement module_exit by calling backend_disconnect on frontend
> connections.
> 
> Signed-off-by: Stefano Stabellini 
> CC: boris.ostrov...@oracle.com
> CC: jgr...@suse.com
> ---
>  drivers/xen/pvcalls-back.c | 52 
> ++
>  1 file changed, 52 insertions(+)
> 
> diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c
> index 9f4247f..71a42fc 100644
> --- a/drivers/xen/pvcalls-back.c
> +++ b/drivers/xen/pvcalls-back.c
> @@ -807,6 +807,42 @@ static int backend_connect(struct xenbus_device *dev)
>  
>  static int backend_disconnect(struct xenbus_device *dev)
>  {
> + struct pvcalls_fedata *fedata;
> + struct sock_mapping *map, *n;
> + struct sockpass_mapping *mappass;
> + struct radix_tree_iter iter;
> + void **slot;
> +
> +
> + fedata = dev_get_drvdata(>dev);
> +
> + down(>socket_lock);
> + list_for_each_entry_safe(map, n, >socket_mappings, list) {
> + list_del(>list);
> + pvcalls_back_release_active(dev, fedata, map);
> + }
> +
> + radix_tree_for_each_slot(slot, >socketpass_mappings, , 0) {
> + mappass = radix_tree_deref_slot(slot);
> + if (!mappass)
> + continue;
> + if (radix_tree_exception(mappass)) {
> + if (radix_tree_deref_retry(mappass))
> + slot = radix_tree_iter_retry();
> + } else {
> + radix_tree_delete(>socketpass_mappings, 
> mappass->id);
> + pvcalls_back_release_passive(dev, fedata, mappass);
> + }
> + }
> + up(>socket_lock);
> +
> + xenbus_unmap_ring_vfree(dev, fedata->sring);
> + unbind_from_irqhandler(fedata->irq, dev);

Swap above two lines to avoid irq being handled after releasing
ring?


Juergen

Re: [PATCH 4.9 000/172] 4.9.36-stable review

2017-07-04 Thread Greg Kroah-Hartman

On Mon, Jul 03, 2017 at 12:25:48PM -0700, kernelci.org bot wrote:
> stable-rc/linux-4.9.y boot: 130 boots: 5 failed, 112 passed with 13 offline 
> (v4.9.35-173-g45949a8fd1df)
> 
> Full Boot Summary: 
> https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.9.y/kernel/v4.9.35-173-g45949a8fd1df/
> Full Build Summary: 
> https://kernelci.org/build/stable-rc/branch/linux-4.9.y/kernel/v4.9.35-173-g45949a8fd1df/
> 
> Tree: stable-rc
> Branch: linux-4.9.y
> Git Describe: v4.9.35-173-g45949a8fd1df
> Git Commit: 45949a8fd1dfe62289359ae6e71bbb3fc45afeeb
> Git URL: 
> http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> Tested: 31 unique boards, 13 SoC families, 28 builds out of 203
> 
> Boot Regressions Detected:
> 
> arm:
> 
> exynos_defconfig:
> exynos5800-peach-pi_rootfs:nfs:
> lab-collabora: new failure (last pass: v4.9.34-45-g92905e331aea)
> 
> multi_v7_defconfig:
> imx6q-sabrelite_rootfs:nfs:
> lab-collabora: new failure (last pass: v4.9.34-44-g8041763f609c)
> rk3288-rock2-square_rootfs:nfs:
> lab-collabora: new failure (last pass: v4.9.34)
> 
> mvebu_v5_defconfig:
> kirkwood-openblocks_a7_rootfs:nfs:
> lab-free-electrons: new failure (last pass: v4.9.34)

Any hint as to why these new failures are happening?

thanks,

greg k-h

Re: [PATCH v9 2/3] PCI: Add tango PCIe host bridge support

2017-07-04 Thread Ard Biesheuvel

On 4 July 2017 at 07:58, Jisheng Zhang  wrote:
> On Mon, 3 Jul 2017 08:27:04 -0500 wrote:
>
>> [+cc Jingoo, Joao]
>>
>> On Mon, Jul 03, 2017 at 10:35:50AM +0100, Ard Biesheuvel wrote:
>> > On 3 July 2017 at 00:18, Bjorn Helgaas  wrote:
>> > > On Tue, Jun 20, 2017 at 10:17:40AM +0200, Marc Gonzalez wrote:
>> > >> This driver is required to work around several hardware bugs
>> > >> in the PCIe controller.
>> > >>
>> > >> NB: Revision 1 does not support legacy interrupts, or IO space.
>> > >
>> > > I had to apply these manually because of conflicts in Kconfig and
>> > > Makefile.  What are these based on?  Easiest for me is if you base
>> > > them on the current -rc1 tag.
>> > >
>> > >> Signed-off-by: Marc Gonzalez 
>> > >> ---
>> > >>  drivers/pci/host/Kconfig  |   8 +++
>> > >>  drivers/pci/host/Makefile |   1 +
>> > >>  drivers/pci/host/pcie-tango.c | 164 
>> > >> ++
>> > >>  include/linux/pci_ids.h   |   2 +
>> > >>  4 files changed, 175 insertions(+)
>> > >>  create mode 100644 drivers/pci/host/pcie-tango.c
>> > >>
>> > [..]
>> > >> + /*
>> > >> +  * QUIRK #2
>> > >> +  * Unfortunately, config and mem spaces are muxed.
>> > >> +  * Linux does not support such a setting, since drivers are free
>> > >> +  * to access mem space directly, at any time.
>> > >> +  * Therefore, we can only PRAY that config and mem space accesses
>> > >> +  * NEVER occur concurrently.
>> > >> +  */
>> > >> + writel_relaxed(1, pcie->mux);
>> > >> + ret = pci_generic_config_read(bus, devfn, where, size, val);
>> > >> + writel_relaxed(0, pcie->mux);
>> > >
>> > > I'm very hesitant about this.  When people stress this, we're going to
>> > > get reports of data corruption.  Even with the disclaimer below, I
>> > > don't feel good about this.  Adding the driver is an implicit claim
>> > > that we support the device, but we know it can't be made reliable.
>> >
>> > I noticed that the Synopsys driver suffers from a similar issue: in
>> > dw_pcie_rd_other_conf(), it happily reprograms the outbound I/O window
>> > to perform a config space access, and switches it back to I/O space
>> > afterwards (unless it has more than 2 viewports, in which case it uses
>> > dedicated windows for I/O space and config space)
>>
>> That doesn't sound good.  Jingoo, Joao?  I remember some discussion
>> about this, but not the details.
>>
>> I/O accesses use wrappers (inb(), etc), so there's at least the
>> possibility of a mutex to serialize them with respect to config
>> accesses.
>>
>
> IIRC, for 2 viewports, we don't need to worry about the config space
> access, because config space access is serialized by pci_lock; We
> do have race between config space and io space. But the accessing config
> space and io space at the same time is rare.

Being 'rare' is not sufficient, unfortunately. In the current
situation, I/O space accesses may occur when the outbound window is
directed to the config space of a potentially completely unrelated
device. This is bad.

> And the PCIe EPs which
> has io space are rare too, supporting these EPs are not the potential
> target of those platforms with 2 viewports.
>

I am not sure EP mode is relevant here. What I do know is that boards
like the Marvell 8040 based MacchiatoBin uses this IP in RC mode, and
exposes config, MMIO and IO space windows using only 2 viewports. Note
that this is essentially a bug in the DT description, given that its
version of the IP supports 8 viewports. But the driver needs to be
fixed as well.

Re: [PATCH 4.4 000/101] 4.4.76-stable review

2017-07-04 Thread Greg Kroah-Hartman

On Mon, Jul 03, 2017 at 01:25:47PM -0700, kernelci.org bot wrote:
> stable-rc/linux-4.4.y boot: 149 boots: 6 failed, 94 passed with 49 offline 
> (v4.4.75-102-g77af3cab5b0c)
> 
> Full Boot Summary: 
> https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.4.y/kernel/v4.4.75-102-g77af3cab5b0c/
> Full Build Summary: 
> https://kernelci.org/build/stable-rc/branch/linux-4.4.y/kernel/v4.4.75-102-g77af3cab5b0c/
> 
> Tree: stable-rc
> Branch: linux-4.4.y
> Git Describe: v4.4.75-102-g77af3cab5b0c
> Git Commit: 77af3cab5b0cbf0e0bf63c752b90b7f18af14512
> Git URL: 
> http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> Tested: 36 unique boards, 13 SoC families, 20 builds out of 199
> 
> Boot Regressions Detected:
> 
> arm:
> 
> exynos_defconfig:
> exynos5422-odroidxu3_rootfs:nfs:
> lab-collabora: new failure (last pass: v4.4.73-60-gb24780a48ef0)
> exynos5800-peach-pi_rootfs:nfs:
> lab-collabora: new failure (last pass: v4.4.73-60-g6ee496d7218a)
> 
> multi_v7_defconfig:
> imx6q-sabrelite_rootfs:nfs:
> lab-collabora: new failure (last pass: v4.4.73-54-g519a0cef2eeb)
> rk3288-rock2-square_rootfs:nfs:
> lab-collabora: new failure (last pass: v4.4.73-60-g6ee496d7218a)
> 
> mvebu_v5_defconfig:
> kirkwood-db-88f6282_rootfs:nfs:
> lab-free-electrons: new failure (last pass: 
> v4.4.73-60-gb24780a48ef0)
> 
> mvebu_v7_defconfig:
> armada-xp-db_rootfs:nfs:
> lab-free-electrons: new failure (last pass: 
> v4.4.73-33-g9ea962186ff3)

I've pushed out an update to the tree, these failures should now be
fixed.  If not, please let me know.

thanks,

greg k-h

Re: [PATCH v9 2/3] PCI: Add tango PCIe host bridge support

2017-07-04 Thread Jisheng Zhang

On Tue, 4 Jul 2017 09:02:06 +0100 Ard Biesheuvel wrote:

> On 4 July 2017 at 07:58, Jisheng Zhang  wrote:
> > On Mon, 3 Jul 2017 08:27:04 -0500 wrote:
> >  
> >> [+cc Jingoo, Joao]
> >>
> >> On Mon, Jul 03, 2017 at 10:35:50AM +0100, Ard Biesheuvel wrote:  
> >> > On 3 July 2017 at 00:18, Bjorn Helgaas  wrote:  
> >> > > On Tue, Jun 20, 2017 at 10:17:40AM +0200, Marc Gonzalez wrote:  
> >> > >> This driver is required to work around several hardware bugs
> >> > >> in the PCIe controller.
> >> > >>
> >> > >> NB: Revision 1 does not support legacy interrupts, or IO space.  
> >> > >
> >> > > I had to apply these manually because of conflicts in Kconfig and
> >> > > Makefile.  What are these based on?  Easiest for me is if you base
> >> > > them on the current -rc1 tag.
> >> > >  
> >> > >> Signed-off-by: Marc Gonzalez 
> >> > >> ---
> >> > >>  drivers/pci/host/Kconfig  |   8 +++
> >> > >>  drivers/pci/host/Makefile |   1 +
> >> > >>  drivers/pci/host/pcie-tango.c | 164 
> >> > >> ++
> >> > >>  include/linux/pci_ids.h   |   2 +
> >> > >>  4 files changed, 175 insertions(+)
> >> > >>  create mode 100644 drivers/pci/host/pcie-tango.c
> >> > >>  
> >> > [..]  
> >> > >> + /*
> >> > >> +  * QUIRK #2
> >> > >> +  * Unfortunately, config and mem spaces are muxed.
> >> > >> +  * Linux does not support such a setting, since drivers are free
> >> > >> +  * to access mem space directly, at any time.
> >> > >> +  * Therefore, we can only PRAY that config and mem space 
> >> > >> accesses
> >> > >> +  * NEVER occur concurrently.
> >> > >> +  */
> >> > >> + writel_relaxed(1, pcie->mux);
> >> > >> + ret = pci_generic_config_read(bus, devfn, where, size, val);
> >> > >> + writel_relaxed(0, pcie->mux);  
> >> > >
> >> > > I'm very hesitant about this.  When people stress this, we're going to
> >> > > get reports of data corruption.  Even with the disclaimer below, I
> >> > > don't feel good about this.  Adding the driver is an implicit claim
> >> > > that we support the device, but we know it can't be made reliable.  
> >> >
> >> > I noticed that the Synopsys driver suffers from a similar issue: in
> >> > dw_pcie_rd_other_conf(), it happily reprograms the outbound I/O window
> >> > to perform a config space access, and switches it back to I/O space
> >> > afterwards (unless it has more than 2 viewports, in which case it uses
> >> > dedicated windows for I/O space and config space)  
> >>
> >> That doesn't sound good.  Jingoo, Joao?  I remember some discussion
> >> about this, but not the details.
> >>
> >> I/O accesses use wrappers (inb(), etc), so there's at least the
> >> possibility of a mutex to serialize them with respect to config
> >> accesses.
> >>  
> >
> > IIRC, for 2 viewports, we don't need to worry about the config space
> > access, because config space access is serialized by pci_lock; We
> > do have race between config space and io space. But the accessing config
> > space and io space at the same time is rare.  
> 
> Being 'rare' is not sufficient, unfortunately. In the current
> situation, I/O space accesses may occur when the outbound window is
> directed to the config space of a potentially completely unrelated
> device. This is bad.

Yep, I agree with you.

> 
> > And the PCIe EPs which
> > has io space are rare too, supporting these EPs are not the potential
> > target of those platforms with 2 viewports.
> >  
> 
> I am not sure EP mode is relevant here. What I do know is that boards

I mean those PCIe EP cards which have IO space, but that doesn't matter.

> like the Marvell 8040 based MacchiatoBin uses this IP in RC mode, and
> exposes config, MMIO and IO space windows using only 2 viewports. Note
> that this is essentially a bug in the DT description, given that its
> version of the IP supports 8 viewports. But the driver needs to be
> fixed as well.

To fix for 2 viewports situation, we need to serialize access of the io
and config space. In internal repo, we can achieve it by modifying the
io access helper functions such as inl/outl, but this won't be accepted
by the mainline IMHO. Except fixing the HW, any elegant solution?

Suggestions are appreciated.

Thanks,
Jisheng

Re: [PATCH 4/6] ARM: dts: rockchip: enable eMMC for rk3229-evb

2017-07-04 Thread Heiko Stübner

Am Dienstag, 4. Juli 2017, 16:12:45 CEST schrieb Frank Wang:
> This patch enables eMMC support for rk3229-evb board.
> 
> Signed-off-by: Frank Wang 
> ---
>  arch/arm/boot/dts/rk3229-evb.dts | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/rk3229-evb.dts
> b/arch/arm/boot/dts/rk3229-evb.dts index b64f86c..bae0dbf 100644
> --- a/arch/arm/boot/dts/rk3229-evb.dts
> +++ b/arch/arm/boot/dts/rk3229-evb.dts
> @@ -130,6 +130,17 @@
>   cpu-supply = <_arm>;
>  };
> 
> + {
> + broken-cd;
> + bus-width = <8>;
> + cap-mmc-highspeed;
> + supports-emmc;
> + disable-wp;
> + non-removable;

non-removable should be enough, so you shouldn't need the broken-cd above?

> + num-slots = <1>;
> + status = "okay";
> +};
> +
>   {
>   assigned-clocks = < SCLK_MAC_EXTCLK>, < SCLK_MAC>;
>   assigned-clock-parents = <_gmac>, < SCLK_MAC_EXTCLK>;

Re: [PATCH v2 00/11] patches for fpga

2017-07-04 Thread Anatolij Gustschin

On Tue, 4 Jul 2017 11:12:17 +0200
Greg Kroah-Hartman gre...@linuxfoundation.org wrote:
...
>> Is this series queued for merging in 4.13-rc1 ?  
>
>Nope, it missed my merge window for that, sorry.  I'll queue them up to
>my tree after 4.13-rc1 is out.  If there are any specific bugfixes in
>here that should go into 4.13-final, please let me know.

Okay. No, there are no bugfixes in this series.

thanks,

Anatolij

Re: "mm: use early_pfn_to_nid in page_ext_init" broken on some configurations?

2017-07-04 Thread Vlastimil Babka

On 07/04/2017 07:23 AM, Joonsoo Kim wrote:
> On Mon, Jul 03, 2017 at 04:18:01PM +0200, Vlastimil Babka wrote:
>> allocated" looks much more sane there. But there's a warning nevertheless.
> 
> Warning would comes from the fact that drain_all_pages() is called
> before mm_percpu_wq is initialised. We could remove WARN_ON_ONCE() and add
> drain_local_page(zone) to fix the problem.

Wouldn't that still leave some period during boot where kernel already
runs on multiple CPU's, but mm_percpu_wq is not yet initialized and
somebody tries to use it? We want to catch such cases, right?

> Thanks.
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 
>

Re: [PATCH v5 2/4] [media] platform: Add Synopsys Designware HDMI RX Controller Driver

2017-07-04 Thread Hans Verkuil

On 07/04/17 11:28, Jose Abreu wrote:
> Hi Hans,
> 
> 
> On 03-07-2017 11:33, Hans Verkuil wrote:
>> On 07/03/2017 11:53 AM, Jose Abreu wrote:
>>> Hi Hans,
>>>
>>>
>>> On 03-07-2017 10:27, Hans Verkuil wrote:
 On 06/29/2017 12:46 PM, Jose Abreu wrote:
> This is an initial submission for the Synopsys Designware
> HDMI RX
> Controller Driver. This driver interacts with a phy driver so
> that
> a communication between them is created and a video pipeline is
> configured.
>
> The controller + phy pipeline can then be integrated into a
> fully
> featured system that can be able to receive video up to 4k@60Hz
> with deep color 48bit RGB, depending on the platform. Although,
> this initial version does not yet handle deep color modes.
>
> This driver was implemented as a standard V4L2 subdevice and
> its
> main features are:
>  - Internal state machine that reconfigures phy until the
>  video is not stable
>  - JTAG communication with phy
>  - Inter-module communication with phy driver
>  - Debug write/read ioctls
>
> Some notes:
>  - RX sense controller (cable connection/disconnection)
> must
>  be handled by the platform wrapper as this is not
> integrated
>  into the controller RTL
>  - The same goes for EDID ROM's
>  - ZCAL calibration is needed only in FPGA platforms, in
> ASIC
>  this is not needed
>  - The state machine is not an ideal solution as it
> creates a
>  kthread but it is needed because some sources might not be
>  very stable at sending the video (i.e. we must react
>  accordingly).
>
> Signed-off-by: Jose Abreu 
> Cc: Carlos Palminha 
> Cc: Mauro Carvalho Chehab 
> Cc: Hans Verkuil 
> Cc: Sylwester Nawrocki 
>
> Changes from v4:
>  - Add flag V4L2_SUBDEV_FL_HAS_DEVNODE (Sylwester)
>  - Remove some comments and change some messages to dev_dbg
> (Sylwester)
>  - Use v4l2_async_subnotifier_register() (Sylwester)
> Changes from v3:
>  - Use v4l2 async API (Sylwester)
>  - Do not block waiting for phy
>  - Do not use busy waiting delays (Sylwester)
>  - Simplify dw_hdmi_power_on (Sylwester)
>  - Use clock API (Sylwester)
>  - Use compatible string (Sylwester)
>  - Minor fixes (Sylwester)
> Changes from v2:
>  - Address review comments from Hans regarding CEC
>  - Use CEC notifier
>  - Enable SCDC
> Changes from v1:
>  - Add support for CEC
>  - Correct typo errors
>  - Correctly detect interlaced video modes
>  - Correct VIC parsing
> Changes from RFC:
>  - Add support for HDCP 1.4
>  - Fixup HDMI_VIC not being parsed (Hans)
>  - Send source change signal when powering off (Hans)
>  - Add a "wait stable delay"
>  - Detect interlaced video modes (Hans)
>  - Restrain g/s_register from reading/writing to HDCP regs
> (Hans)
> ---
>drivers/media/platform/dwc/Kconfig  |   15 +
>drivers/media/platform/dwc/Makefile |1 +
>drivers/media/platform/dwc/dw-hdmi-rx.c | 1824
> +++
>drivers/media/platform/dwc/dw-hdmi-rx.h |  441 
>include/media/dwc/dw-hdmi-rx-pdata.h|   97 ++
>5 files changed, 2378 insertions(+)
>create mode 100644 drivers/media/platform/dwc/dw-hdmi-rx.c
>create mode 100644 drivers/media/platform/dwc/dw-hdmi-rx.h
>create mode 100644 include/media/dwc/dw-hdmi-rx-pdata.h
>
> diff --git a/drivers/media/platform/dwc/Kconfig
> b/drivers/media/platform/dwc/Kconfig
> index 361d38d..3ddccde 100644
> --- a/drivers/media/platform/dwc/Kconfig
> +++ b/drivers/media/platform/dwc/Kconfig
> @@ -6,3 +6,18 @@ config VIDEO_DWC_HDMI_PHY_E405
>To compile this driver as a module, choose M here.
> The module
>  will be called dw-hdmi-phy-e405.
> +
> +config VIDEO_DWC_HDMI_RX
> +tristate "Synopsys Designware HDMI Receiver driver"
> +depends on VIDEO_V4L2 && VIDEO_V4L2_SUBDEV_API
> +help
> +  Support for Synopsys Designware HDMI RX controller.
> +
> +  To compile this driver as a module, choose M here. The
> module
> +  will be called dw-hdmi-rx.
> +
> +config VIDEO_DWC_HDMI_RX_CEC
> +bool
> +depends on VIDEO_DWC_HDMI_RX
> +select CEC_CORE
> +select CEC_NOTIFIER
> diff --git a/drivers/media/platform/dwc/Makefile
> b/drivers/media/platform/dwc/Makefile
> index fc3b62c..cd04ca9 100644
> --- a/drivers/media/platform/dwc/Makefile
> +++

Re: [PATCH 09/14] qcom: mtd: nand: BAM support for read page

2017-07-04 Thread Archit Taneja




On 06/29/2017 12:46 PM, Abhishek Sahu wrote:

1. The BAM mode requires few registers configuration before each
NAND page read and codeword read which is different from ADM
so add the helper functions which will be called in BAM mode
only.

2. The NAND page read handling of BAM is different from ADM so
call the appropriate helper functions

Signed-off-by: Abhishek Sahu 
---
  drivers/mtd/nand/qcom_nandc.c | 63 ++-
  1 file changed, 62 insertions(+), 1 deletion(-)

diff --git a/drivers/mtd/nand/qcom_nandc.c b/drivers/mtd/nand/qcom_nandc.c
index 8e7dc9e..17766af 100644
--- a/drivers/mtd/nand/qcom_nandc.c
+++ b/drivers/mtd/nand/qcom_nandc.c
@@ -870,6 +870,35 @@ static void config_cw_read(struct qcom_nand_controller 
*nandc)
  }
  
  /*

+ * Helpers to prepare DMA descriptors for configuring registers
+ * before reading a NAND page with BAM.
+ */
+static void config_bam_page_read(struct qcom_nand_controller *nandc)
+{
+   write_reg_dma(nandc, NAND_FLASH_CMD, 3, 0);
+   write_reg_dma(nandc, NAND_DEV0_CFG0, 3, 0);
+   write_reg_dma(nandc, NAND_EBI2_ECC_BUF_CFG, 1, 0);
+   write_reg_dma(nandc, NAND_ERASED_CW_DETECT_CFG, 1, 0);
+   write_reg_dma(nandc, NAND_ERASED_CW_DETECT_CFG, 1,
+ NAND_ERASED_CW_SET | NAND_BAM_NEXT_SGL);
+}
+
+/*
+ * Helpers to prepare DMA descriptors for configuring registers
+ * before reading each codeword in NAND page with BAM.
+ */


If I understood right, EBI2 nand required us to load all the registers
configured in config_cw_read() for every codeword, and for BAM, the
registers configured in config_bam_page_read() just needs to be done once,
and the registers in config_bam_cw_read()  need to be reloaded for every
codeword?

Could you please clarify this better in the commit message and comments? Also,
I still see config_cw_read() being used for QPIC nand in nandc_param() and
copy_last_cw()?

Also, I think these should be called config_qpic_page_read() and
config_qpic_cw_read() since it seems more of a property of the NAND controller
rather than the underlying DMA engine. If so, config_cw_read() can be called
config_cw_ebi2_read(). Please correct me if I'm wrong somewhere.


+static void config_bam_cw_read(struct qcom_nand_controller *nandc)
+{
+   write_reg_dma(nandc, NAND_READ_LOCATION_0, 2, 0);
+   write_reg_dma(nandc, NAND_FLASH_CMD, 1, NAND_BAM_NEXT_SGL);
+   write_reg_dma(nandc, NAND_EXEC_CMD, 1, NAND_BAM_NEXT_SGL);
+
+   read_reg_dma(nandc, NAND_FLASH_STATUS, 2, 0);
+   read_reg_dma(nandc, NAND_ERASED_CW_DETECT_STATUS, 1,
+NAND_BAM_NEXT_SGL);
+}
+
+/*
   * helpers to prepare dma descriptors used to configure registers needed for
   * writing a codeword/step in a page
   */
@@ -1398,6 +1427,9 @@ static int read_page_ecc(struct qcom_nand_host *host, u8 
*data_buf,
struct nand_ecc_ctrl *ecc = >ecc;
int i, ret;
  
+	if (nandc->dma_bam_enabled)

+   config_bam_page_read(nandc);
+
/* queue cmd descs for each codeword */
for (i = 0; i < ecc->steps; i++) {
int data_size, oob_size;
@@ -1411,7 +1443,36 @@ static int read_page_ecc(struct qcom_nand_host *host, u8 
*data_buf,
oob_size = host->ecc_bytes_hw + host->spare_bytes;
}
  
-		config_cw_read(nandc);

+   if (nandc->dma_bam_enabled) {
+   if (data_buf && oob_buf) {
+   nandc_set_reg(nandc, NAND_READ_LOCATION_0,
+ (0 << READ_LOCATION_OFFSET) |
+ (data_size <<
+ READ_LOCATION_SIZE) |
+ (0 << READ_LOCATION_LAST));
+   nandc_set_reg(nandc, NAND_READ_LOCATION_1,
+ (data_size <<
+ READ_LOCATION_OFFSET) |
+ (oob_size << READ_LOCATION_SIZE) |
+ (1 << READ_LOCATION_LAST));
+   } else if (data_buf) {
+   nandc_set_reg(nandc, NAND_READ_LOCATION_0,
+ (0 << READ_LOCATION_OFFSET) |
+ (data_size <<
+ READ_LOCATION_SIZE) |
+ (1 << READ_LOCATION_LAST));
+   } else {
+   nandc_set_reg(nandc, NAND_READ_LOCATION_0,
+ (data_size <<
+ READ_LOCATION_OFFSET) |
+ (oob_size << READ_LOCATION_SIZE) |
+ (1 << READ_LOCATION_LAST));
+

Re: [PATCH] perf/core: generate overflow signal when samples are dropped (WAS: Re: [REGRESSION] perf/core: PMU interrupts dropped if we entered the kernel in the "skid" region)

2017-07-04 Thread Ingo Molnar


* Peter Zijlstra  wrote:

> On Thu, Jun 29, 2017 at 10:12:33AM +0200, Ingo Molnar wrote:
> > 
> > * Mark Rutland  wrote:
> > 
> > > It still seems wrong to make up data, though.
> > 
> > So what we have here is a hardware quirk: we asked for user-space samples, 
> > but 
> > didn't get them and we cannot expose the kernel-internal address.
> > 
> > The question is, how do we handle the hardware quirk. Since we cannot fix 
> > the 
> > hardware on existing systems there's really just two choices:
> > 
> >  - Lose the sample (and signal it as a lost sample)
> > 
> >  - Keep the sample but change the sensitive kernel-internal address to 
> > something 
> >that is not sensitive: 0 or -1 works, but we could perhaps also return a 
> >well-known user-space address such as the vDSO syscall trampoline or 
> > such?
> > 
> > there's no other option really.
> > 
> > I'd lean towards Vince's take: losing samples is more surprising than 
> > getting the 
> > occasional sample with some sanitized data in it.
> > 
> > If we make the artificial data still a meaningful user-space address, 
> > related to 
> > kernel entries, then it might even be a bonus, as users would learn to 
> > recognize 
> > it as: 'oh, skid artifact, I know about that'.
> 
> So while we could easily fake SAMPLE_IP to do as you suggest, other
> entries might be much harder to fake. That said, I have no problems with
> just 0 stuffing them.
> 
> The only real problem is determining how much to stuff I suppose.

I think the RIP is the most important one to fix up in an informative fashion 
(instead of just zeroing it out), so that mainstream users of 'perf top' or
'perf report' have a chance to see that certain entries have this skid artifact.

The other registers should be zeroed out once we stop trusting a sample.

Thanks,

Ingo

Re: [PATCH v1 0/4] media: rc: add support for IR receiver on MT7622 SoC

2017-07-04 Thread Andi Shyti

Hi Sean,

> This patchset introduces Consumer IR (CIR) support for MT7622 SoC
> implements raw mode for more compatibility with different protocols
> as previously SoC did. Before adding support to MT7622 SoC, extra
> code refactor is done since there're major differences in register and
> field definition from the previous SoC.
> 
> Sean Wang (4):
>   dt-bindings: media: mtk-cir: Add support for MT7622 SoC
>   media: rc: mtk-cir: add platform data to adapt into various hardware
>   media: rc: mtk-cir: add support for MediaTek MT7622 SoC
>   MAINTAINERS: add entry for MediaTek CIR driver

for the whole patchset:

Reviewed-by: Andi Shyti 

Andi

Re: [PATCH v4 3/3] arm64: kvm: inject SError with user space specified syndrome

2017-07-04 Thread James Morse

Hi gengdongjiu,

Can you give us a specific example of an error you are trying to handle?
How would a non-KVM user space process handle the error?

KVM-users should be regular user space processes, we should not have a KVM-way
and everyone-else-way of handling errors.

On 04/07/17 05:46, gengdongjiu wrote:
> On 2017/7/3 16:39, Christoffer Dall wrote:
>> On Mon, Jun 26, 2017 at 08:46:39PM +0800, Dongjiu Geng wrote:
>>> when SError happen, kvm notifies user space to record the CPER,
>>> user space specifies and passes the contents of ESR_EL1 on taking
>>> a virtual SError interrupt to KVM, KVM enables virtual system
>>> error or asynchronous abort with this specifies syndrome. This
>>> patch modify the world-switch to restore VSESR_EL2, VSESR_EL2
>>> saves the virtual SError syndrome, it becomes the ESR_EL1 value when
>>> HCR_EL2.VSE injects an SError. This register is added by the
>>> RAS Extensions.
>>
>> This commit message is confusing and doesn't help me understand the
>> patch.
> (1) what is the rationale for the guest OS SError interrupt(SEI) handling in 
> the RAS solution?

>   a). In the firmware-first RAS solution, when guest OS happen a SError 
> interrupt (SEI), it will firstly trap to EL3(SCR_EL3.EA = 1);
>   b). The firmware logs, triages, and delegates the error exception to the 
> hypervisor. As the error came from guest OS  EL1, firmware
>   does by faking an SError interrupt exception entry to EL2.
>   c). Control transfers to the hypervisor's delegated error recovery 
> agent.Because HCR_EL2.AMO is set to 1, the hypervisor can use a
>   Virtual SError interrupt to delegate an asynchronous abort to EL1, by 
> setting HCR_EL2.VSE to 1 and using VESR_EL2 to pass syndrome.

So (a): a physical-CPU hardware error occurs, and then (c) we tell Qemu/kvmtool
via a KVM-specific API.

Don't do this, it doesn't work for non-KVM users. You are exposing host-specific
implementation details to user space. What if I discover the same error via a
Polling GHES, or one of the IRQ flavours?

User space should not have to know, or care, how linux is notified about APEI
RAS errors.

> (2) what is this patch mainly do?
>   As mentioned above, the hypervisor needs to enable virtual SError and pass 
> the virtual syndrome to the guest OS.
> 
>   a). when Control transfers to the hypervisor from firmware by faking an 
> SError interrupt, the hypervisor delivered the syndrome_info(esr_el2) and
>   host VA address( Qemu translate this VA address to the virtual machine 
> physical address(IPA)) using below new added "serror_intr" struct.
>   /* KVM_EXIT_SERROR_INTR */
>   struct {
>   __u32 syndrome_info;
>   __u64 address;
>   } serror_intr;

This is for a guest exit to host user-space. Here you are telling Qemu that a
physical CPU hardware error occurred. Qemu/kvmtool should not be expected to
parse the ESR, this is the job of the operating system.

When you're using ACPI firmware-first, SError/SEI is just a notification, the
important data is in the CPER records, which Qemu can't access, (and should be
processed by Linux APEI code).

It looks like you've calculated an address from FAR_EL2/HPFAR_EL2. For an
SError, these are meaningless.

(These registers hold real values for Synchronous External Abort, but for
 firmware-first we should prefer the CPER records.)

>   b). Qemu gets the address(host VA) delivered by KVM, translate this host VA 
> address to virtual machine physical address(IPA), and runtime record this 
> virtual
>  machine physical address(IPA) to the guest OS's APEI table.

I agree with this step, but you're acting on the wrong data. (You're converting
fault_ipa -> virtual address -> fault_ipa, something isn't right ...)

Qemu should react to a signal like BUS_MCEERR_A{R,O} from memory_failure(). This
mechanism serves all user space processes, not just kvm users. This is where the
user-space virtual address should come from. Qemu/kvmtool have to generate the
guest IPA once they discover the affected memory was presented to the guest
through KVM.

Your KVM-specific mechanism exposes too much raw information (raw ESR values to
user space), and only serves applications using KVM.

If there is another type of CPER record where we should notify userspace, please
do it from mm/memory-failure.c, drivers/acpi/apei/ghes.c or
drivers/firmware/efi/cper.c. These should consider all user-space applications,
not just users of KVM, and not just on arm64.

>   c). Qemu gets the syndrome_info delivered by KVM, it refers to this 
> syndrome value(but can be different from it) to specify the virtual SError 
> interrupt's syndrome through setting VESR_EL2.

'but can be different from it' is because a classification step is required, the
operating system should do this. We should only signal Qemu/kvmtool for errors
that can actually be handled. Some APEI notifications may be for corrected
errors, (I would hope these always

[PATCH net 1/3] net: hns: Add TX CSUM check when fill TX description

2017-07-04 Thread Lin Yun Sheng

From: Yunsheng Lin 

If driver support checksum offload, should check netdev feature
before fill TX description and get CSUM err bit from RX
description. HNS driver do the check in RX derction but it doesn't
do the check in TX direction.

Signed-off-by: lipeng 
Reviewed-by: Daode Huang 
Reviewed-by: Yunsheng Lin 
---
 drivers/net/ethernet/hisilicon/hns/hns_enet.c | 36 +++
 drivers/net/ethernet/hisilicon/hns/hns_enet.h |  2 +-
 2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c 
b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index c6700b9..b1e7224 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -40,12 +40,14 @@
 #define SKB_TMP_LEN(SKB) \
(((SKB)->transport_header - (SKB)->mac_header) + tcp_hdrlen(SKB))
 
-static void fill_v2_desc(struct hnae_ring *ring, void *priv,
+static void fill_v2_desc(struct hns_nic_ring_data *ring_data, void *priv,
 int size, dma_addr_t dma, int frag_end,
 int buf_num, enum hns_desc_type type, int mtu)
 {
+   struct hnae_ring *ring = ring_data->ring;
struct hnae_desc *desc = >desc[ring->next_to_use];
struct hnae_desc_cb *desc_cb = >desc_cb[ring->next_to_use];
+   struct net_device *ndev = ring_data->napi.dev;
struct iphdr *iphdr;
struct ipv6hdr *ipv6hdr;
struct sk_buff *skb;
@@ -90,8 +92,13 @@ static void fill_v2_desc(struct hnae_ring *ring, void *priv,
 
if (skb->protocol == htons(ETH_P_IP)) {
iphdr = ip_hdr(skb);
-   hnae_set_bit(rrcfv, HNSV2_TXD_L3CS_B, 1);
-   hnae_set_bit(rrcfv, HNSV2_TXD_L4CS_B, 1);
+
+   if (ndev->features & NETIF_F_IP_CSUM) {
+   hnae_set_bit(rrcfv, HNSV2_TXD_L3CS_B,
+1);
+   hnae_set_bit(rrcfv, HNSV2_TXD_L4CS_B,
+1);
+   }
 
/* check for tcp/udp header */
if (iphdr->protocol == IPPROTO_TCP &&
@@ -105,7 +112,10 @@ static void fill_v2_desc(struct hnae_ring *ring, void 
*priv,
} else if (skb->protocol == htons(ETH_P_IPV6)) {
hnae_set_bit(tvsvsn, HNSV2_TXD_IPV6_B, 1);
ipv6hdr = ipv6_hdr(skb);
-   hnae_set_bit(rrcfv, HNSV2_TXD_L4CS_B, 1);
+
+   if (ndev->features & NETIF_F_IPV6_CSUM)
+   hnae_set_bit(rrcfv, HNSV2_TXD_L4CS_B,
+1);
 
/* check for tcp/udp header */
if (ipv6hdr->nexthdr == IPPROTO_TCP &&
@@ -140,12 +150,14 @@ static void fill_v2_desc(struct hnae_ring *ring, void 
*priv,
 };
 MODULE_DEVICE_TABLE(acpi, hns_enet_acpi_match);
 
-static void fill_desc(struct hnae_ring *ring, void *priv,
+static void fill_desc(struct hns_nic_ring_data *ring_data, void *priv,
  int size, dma_addr_t dma, int frag_end,
  int buf_num, enum hns_desc_type type, int mtu)
 {
+   struct hnae_ring *ring = ring_data->ring;
struct hnae_desc *desc = >desc[ring->next_to_use];
struct hnae_desc_cb *desc_cb = >desc_cb[ring->next_to_use];
+   struct net_device *ndev = ring_data->napi.dev;
struct sk_buff *skb;
__be16 protocol;
u32 ip_offset;
@@ -179,12 +191,14 @@ static void fill_desc(struct hnae_ring *ring, void *priv,
skb->protocol = protocol;
}
 
-   if (skb->protocol == htons(ETH_P_IP)) {
+   if (skb->protocol == htons(ETH_P_IP) &&
+   (ndev->features & NETIF_F_IP_CSUM)) {
flag_ipoffset |= 1 << HNS_TXD_L3CS_B;
/* check for tcp/udp header */
flag_ipoffset |= 1 << HNS_TXD_L4CS_B;
 
-   } else if (skb->protocol == htons(ETH_P_IPV6)) {
+   } else if (skb->protocol == htons(ETH_P_IPV6) &&
+  (ndev->features & NETIF_F_IPV6_CSUM)) {
/* ipv6 has not l3 cs, check for L4 header */
flag_ipoffset |= 1 << HNS_TXD_L4CS_B;
}
@@ -275,7 +289,7 @@ static int hns_nic_maybe_stop_tso(
return 0;
 }
 
-static void fill_tso_desc(struct hnae_ring *ring, void *priv,
+static void fill_tso_desc(struct

Re: [PATCH] perf/core: generate overflow signal when samples are dropped (WAS: Re: [REGRESSION] perf/core: PMU interrupts dropped if we entered the kernel in the "skid" region)

2017-07-04 Thread Mark Rutland

On Tue, Jul 04, 2017 at 10:33:45AM +0100, Mark Rutland wrote:
> On Tue, Jul 04, 2017 at 11:03:13AM +0200, Peter Zijlstra wrote:
> > Faking data gets a wee bit tricky in how much data we need to clear
> > through, its not only IP, pretty much everything we get from the
> > interrupt context, like the branch stack and registers is also suspect.
> 
> Indeed. I'll take a run through __perf_event_output() and callees, and
> see what we need to drop.

Looking at perf_event_sample_format in uapi/linux/perf_event.h, there
are samples that are obviously sensitive, and should be dropped:

* PERF_SAMPLE_IP
* PERF_SAMPLE_CALLCHAIN
* PERF_SAMPLE_BRANCH_STACK
* PERF_SAMPLE_REGS_INTR

... samples that look benign:

* PERF_SAMPLE_TID
* PERF_SAMPLE_TIME
* PERF_SAMPLE_CPU
* PERF_SAMPLE_PERIOD
* PERF_SAMPLE_REGS_USER
* PERF_SAMPLE_STACK_USER
* PERF_SAMPLE_READ
* PERF_SAMPLE_ID
* PERF_SAMPLE_STREAM_ID
* PERF_SAMPLE_IDENTIFIER

.. and samples that I have no idea about:

* PERF_SAMPLE_ADDR
* PERF_SAMPLE_RAW
* PERF_SAMPLE_WEIGHT
* PERF_SAMPLE_DATA_SRC
* PERF_SAMPLE_TRANSACTION

Should any of those be moved into the "should be dropped" pile?

Thanks,
Mark.

[PATCH v3 05/16] drm/fb-helper: do a generic fb_setcmap helper in terms of crtc .gamma_set

2017-07-04 Thread Peter Rosin

This makes the redundant fb helpers .load_lut, .gamma_set and .gamma_get
completely obsolete.

Signed-off-by: Peter Rosin 
---
 drivers/gpu/drm/drm_fb_helper.c | 165 +++-
 1 file changed, 94 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index b75b1f2..7f8199a 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -1257,27 +1257,6 @@ void drm_fb_helper_set_suspend_unlocked(struct 
drm_fb_helper *fb_helper,
 }
 EXPORT_SYMBOL(drm_fb_helper_set_suspend_unlocked);
 
-static int setcolreg(struct drm_crtc *crtc, u16 red, u16 green,
-u16 blue, u16 regno, struct fb_info *info)
-{
-   struct drm_fb_helper *fb_helper = info->par;
-   struct drm_framebuffer *fb = fb_helper->fb;
-
-   /*
-* The driver really shouldn't advertise pseudo/directcolor
-* visuals if it can't deal with the palette.
-*/
-   if (WARN_ON(!fb_helper->funcs->gamma_set ||
-   !fb_helper->funcs->gamma_get))
-   return -EINVAL;
-
-   WARN_ON(fb->format->cpp[0] != 1);
-
-   fb_helper->funcs->gamma_set(crtc, red, green, blue, regno);
-
-   return 0;
-}
-
 static int setcmap_pseudo_palette(struct fb_cmap *cmap, struct fb_info *info)
 {
u32 *palette = (u32 *)info->pseudo_palette;
@@ -1310,54 +1289,68 @@ static int setcmap_pseudo_palette(struct fb_cmap *cmap, 
struct fb_info *info)
return 0;
 }
 
-/**
- * drm_fb_helper_setcmap - implementation for _ops.fb_setcmap
- * @cmap: cmap to set
- * @info: fbdev registered by the helper
- */
-int drm_fb_helper_setcmap(struct fb_cmap *cmap, struct fb_info *info)
+static int setcmap_legacy(struct fb_cmap *cmap, struct fb_info *info)
 {
struct drm_fb_helper *fb_helper = info->par;
-   struct drm_device *dev = fb_helper->dev;
-   const struct drm_crtc_helper_funcs *crtc_funcs;
-   u16 *red, *green, *blue, *transp;
struct drm_crtc *crtc;
u16 *r, *g, *b;
-   int i, j, rc = 0;
-   int start;
+   int i, ret = 0;
 
-   if (oops_in_progress)
-   return -EBUSY;
+   for (i = 0; i < fb_helper->crtc_count; i++) {
+   crtc = fb_helper->crtc_info[i].mode_set.crtc;
+   if (!crtc->funcs->gamma_set || !crtc->gamma_size)
+   return -EINVAL;
 
-   mutex_lock(_helper->lock);
-   if (!drm_fb_helper_is_bound(fb_helper)) {
-   mutex_unlock(_helper->lock);
-   return -EBUSY;
-   }
+   if (cmap->start + cmap->len > crtc->gamma_size)
+   return -EINVAL;
 
-   drm_modeset_lock_all(dev);
-   if (info->fix.visual == FB_VISUAL_TRUECOLOR) {
-   rc = setcmap_pseudo_palette(cmap, info);
-   goto out;
+   r = crtc->gamma_store;
+   g = r + crtc->gamma_size;
+   b = g + crtc->gamma_size;
+
+   memcpy(r + cmap->start, cmap->red, cmap->len * sizeof(*r));
+   memcpy(g + cmap->start, cmap->green, cmap->len * sizeof(*g));
+   memcpy(b + cmap->start, cmap->blue, cmap->len * sizeof(*b));
+
+   ret = crtc->funcs->gamma_set(crtc, r, g, b,
+crtc->gamma_size, NULL);
+   if (ret)
+   return ret;
}
 
-   for (i = 0; i < fb_helper->crtc_count; i++) {
-   crtc = fb_helper->crtc_info[i].mode_set.crtc;
-   crtc_funcs = crtc->helper_private;
+   return ret;
+}
 
-   red = cmap->red;
-   green = cmap->green;
-   blue = cmap->blue;
-   transp = cmap->transp;
-   start = cmap->start;
+static int setcmap_atomic(struct fb_cmap *cmap, struct fb_info *info)
+{
+   struct drm_fb_helper *fb_helper = info->par;
+   struct drm_device *dev = fb_helper->dev;
+   struct drm_modeset_acquire_ctx ctx;
+   struct drm_crtc_state *crtc_state;
+   struct drm_atomic_state *state;
+   struct drm_crtc *crtc;
+   u16 *r, *g, *b;
+   int i, ret = 0;
 
-   if (!crtc->gamma_size) {
-   rc = -EINVAL;
+   state = drm_atomic_state_alloc(dev);
+   if (!state)
+   return -ENOMEM;
+   drm_modeset_acquire_init(, 0);
+retry:
+   ret = drm_modeset_lock_all_ctx(dev, );
+   if (ret)
+   goto fini;
+   state->acquire_ctx = 
+   for (i = 0; i < fb_helper->crtc_count; i++) {
+   crtc = fb_helper->crtc_info[i].mode_set.crtc;
+   if (!crtc->funcs->gamma_set) {
+   ret = -EINVAL;
goto out;
}
 
-   if (cmap->start + cmap->len > crtc->gamma_size) {
-   rc = -EINVAL;
+   crtc_state = drm_atomic_get_crtc_state(state, crtc);
+   if

Re: [PATCH] gpio: drop unnecessary includes from include/linux/gpio/driver.h

2017-07-04 Thread Masahiro Yamada

2017-07-04 19:06 GMT+09:00 Andy Shevchenko :
> On Tue, 2017-07-04 at 12:53 +0900, Masahiro Yamada wrote:
>> Some of include directives in include/linux/gpio/driver.h are
>> unneeded because the header does not need to know the content of
>> struct device, irq_chip, etc.  Just declare they are structures.
>>
>> On the other hand,  and 
>> turned out to be necessary for irq_flow_handler_t and spinlock_t,
>> respectively.
>>
>> Each driver should include what it needs without relying on what is
>> implicitly included from .  This will cut down
>> unnecessary header parsing.
>
> If Linus is okay with the following proposal I would rather go with it,
> i.e. logical split the series to
>
> 1. Fix IRQ related headers inclusion
> 2. Fix pinconf-generic.h inclusion
> 3. Fix OF headers inclusion (btw, of_gpio.h is not enough there?)

Maybe
  4.  Fix (platform_)device inclusion


But, I do not see much sense to touch headers multiple times.



-- 
Best Regards
Masahiro Yamada

Re: [PATCH RFC] iio: pressure: zpa2326: report interrupted case as failure

2017-07-04 Thread Geert Uytterhoeven

Hi Nicholas,

On Sun, May 14, 2017 at 10:43 AM, Nicholas Mc Guire  wrote:
> If the timeout-case prints a warning message then probably the interrupted
> case should also. Further, wait_for_completion_interruptible_timeout()
> returns long not int.
>
> Fixes: commit 03b262f2bbf4 ("iio:pressure: initial zpa2326 barometer support")
> Signed-off-by: Nicholas Mc Guire 
> ---
>
> The original control-flow was technically not wrong just confusing and a bit
> complicated. Not clear if reporting the interrupted case actually is useful,
> but given that the timeout is relatively long (200ms) it is not that unlikely
> so differentiating the cases seems helpful.
>
> Patch was compile-tested with: x86_64_defconfig + CONFIG_IIO=m, 
> CONFIG_ZPA2326=m
>
> Patch is against v4.11 (localversion-next is next-20170512)
>
>  drivers/iio/pressure/zpa2326.c | 17 ++---
>  1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iio/pressure/zpa2326.c b/drivers/iio/pressure/zpa2326.c
> index e58a0ad..617926f 100644
> --- a/drivers/iio/pressure/zpa2326.c
> +++ b/drivers/iio/pressure/zpa2326.c
> @@ -867,12 +867,13 @@ static int zpa2326_wait_oneshot_completion(const struct 
> iio_dev   *indio_dev,
>  {
> int  ret;
> unsigned int val;
> +   long timeout;
>
> zpa2326_dbg(indio_dev, "waiting for one shot completion interrupt");
>
> -   ret = wait_for_completion_interruptible_timeout(
> +   timeout = wait_for_completion_interruptible_timeout(
> >data_ready, ZPA2326_CONVERSION_JIFFIES);
> -   if (ret > 0)
> +   if (timeout > 0)

Check for strict positive timeout.

> /*
>  * Interrupt handler completed before timeout: return 
> operation
>  * status.
> @@ -882,13 +883,15 @@ static int zpa2326_wait_oneshot_completion(const struct 
> iio_dev   *indio_dev,
> /* Clear all interrupts just to be sure. */
> regmap_read(private->regmap, ZPA2326_INT_SOURCE_REG, );
>
> -   if (!ret)
> +   if (!timeout) {

Check for zero timeout.

> /* Timed out. */
> +   zpa2326_warn(indio_dev, "no one shot interrupt occurred 
> (%ld)",
> +timeout);
> ret = -ETIME;
> -
> -   if (ret != -ERESTARTSYS)
> -   zpa2326_warn(indio_dev, "no one shot interrupt occurred (%d)",
> -ret);
> +   } else if (timeout < 0) {

So if we get here, timeout is always strict negative, so the check can
be removed.

> +   zpa2326_warn(indio_dev, "wait for one shot interrupt 
> canceled");
> +   ret = -ERESTARTSYS;
> +   }
>
> return ret;

But gcc-4.1.2 is not smart enough:

drivers/iio/pressure/zpa2326.c:868: warning: ‘ret’ may be used
uninitialized in this function

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

[PATCH v3 00/16] improve the fb_setcmap helper

2017-07-04 Thread Peter Rosin

Hi!

While trying to get CLUT support for the atmel_hlcdc driver, and
specifically for the emulated fbdev interface, I received some
push-back that my feeble in-driver attempts should be solved
by the core. This is my attempt to do it right.

I have obviously not tested all of this with more than a compile,
but patches 1 through 5 are enough to make the atmel-hlcdc driver
do what I need. The rest is just lots of removals and cleanup made
possible by the improved core.

Please test, I would not be surprised if I have fouled up some
bit-manipulation somewhere, or if I have misunderstood something
about atomics...

Changes since v2:
- Added patch 1/16 which factors out pseudo-palette handling.
- Removed the if (cmap->start + cmap->len < cmap->start)
  sanity check on the assumption that the fbdev core handles it.
- Added patch 4/16 which factors out atomic state and commit
  handling from drm_atomic_helper_legacy_gamma_set to
  drm_mode_gamma_set_ioctl.
- Do one atomic commit for all affected crtc.
- Removed a now obsolete note in include/drm/drm_crtc.h (ammended
  the last patch).
- Cc list is getting long, so I have redused the list for the
  individual patches. If you would like to get the full series
  (or nothing at all) for the next round (if that is needed) just
  say so.

Changes since v1:

- Rebased to next-20170621
- Split 1/11 into a preparatory patch, a cleanup patch and then
  the meat in 3/14.
- Handle pseudo-palette for FB_VISUAL_TRUECOLOR.
- Removed the empty .gamma_get/.gamma_set fb helpers from the
  armada driver that I had somehow managed to ignore but which
  0day found real quick.
- Be less judgemental on drivers only providing .gamma_get and
  .gamma_set, but no .load_lut. That's actually a valid thing
  to do if you only need pseudo-palette for FB_VISUAL_TRUECOLOR.
- Add a comment about colliding bitfields in the nouveau driver.
- Remove gamma_set/gamma_get declarations from the radeon driver
  (the definitions were removed in v1).

Cheers,
peda

Peter Rosin (16):
  drm/fb-helper: factor out pseudo-palette
  drm/fb-helper: keep the .gamma_store updated in drm_fb_helper_setcmap
  drm/fb-helper: remove drm_fb_helper_save_lut_atomic
  drm/color-mgmt: move atomic state/commit out from .gamma_set
  drm/fb-helper: do a generic fb_setcmap helper in terms of crtc
.gamma_set
  drm: amd: remove dead code and pointless local lut storage
  drm: armada: remove dead empty functions
  drm: ast: remove dead code and pointless local lut storage
  drm: cirrus: remove dead code and pointless local lut storage
  drm: gma500: remove dead code and pointless local lut storage
  drm: i915: remove dead code and pointless local lut storage
  drm: mgag200: remove dead code and pointless local lut storage
  drm: nouveau: remove dead code and pointless local lut storage
  drm: radeon: remove dead code and pointless local lut storage
  drm: stm: remove dead code and pointless local lut storage
  drm: remove unused and redundant callbacks

 drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c  |  24 
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h|   1 -
 drivers/gpu/drm/amd/amdgpu/dce_v10_0.c  |  29 ++---
 drivers/gpu/drm/amd/amdgpu/dce_v11_0.c  |  29 ++---
 drivers/gpu/drm/amd/amdgpu/dce_v6_0.c   |  29 ++---
 drivers/gpu/drm/amd/amdgpu/dce_v8_0.c   |  29 ++---
 drivers/gpu/drm/amd/amdgpu/dce_virtual.c|  25 +---
 drivers/gpu/drm/armada/armada_crtc.c|  10 --
 drivers/gpu/drm/armada/armada_crtc.h|   2 -
 drivers/gpu/drm/armada/armada_fbdev.c   |   2 -
 drivers/gpu/drm/ast/ast_drv.h   |   1 -
 drivers/gpu/drm/ast/ast_fb.c|  20 ---
 drivers/gpu/drm/ast/ast_mode.c  |  28 +---
 drivers/gpu/drm/cirrus/cirrus_drv.h |   8 --
 drivers/gpu/drm/cirrus/cirrus_fbdev.c   |   2 -
 drivers/gpu/drm/cirrus/cirrus_mode.c|  73 +++
 drivers/gpu/drm/drm_atomic_helper.c |  37 ++
 drivers/gpu/drm/drm_color_mgmt.c|  27 +++-
 drivers/gpu/drm/drm_fb_helper.c | 195 +---
 drivers/gpu/drm/gma500/framebuffer.c|  22 
 drivers/gpu/drm/gma500/gma_display.c|  34 ++---
 drivers/gpu/drm/gma500/gma_display.h|   2 +-
 drivers/gpu/drm/gma500/psb_intel_display.c  |   7 +-
 drivers/gpu/drm/gma500/psb_intel_drv.h  |   1 -
 drivers/gpu/drm/i915/intel_drv.h|   1 -
 drivers/gpu/drm/i915/intel_fbdev.c  |  31 -
 drivers/gpu/drm/mgag200/mgag200_drv.h   |   5 -
 drivers/gpu/drm/mgag200/mgag200_fb.c|   2 -
 drivers/gpu/drm/mgag200/mgag200_mode.c  |  64 +++--
 drivers/gpu/drm/nouveau/dispnv04/crtc.c |  28 ++--
 drivers/gpu/drm/nouveau/nouveau_crtc.h  |   3 -
 drivers/gpu/drm/nouveau/nouveau_fbcon.c |  22 
 drivers/gpu/drm/nouveau/nv50_display.c  |  42 ++
 drivers/gpu/drm/radeon/atombios_crtc.c  |   1 -
 drivers/gpu/drm/radeon/radeon_connectors.c  |   7 +-
 drivers/gpu/drm/radeon/radeon_display.c |

Re: [PATCH 2/3] mtd: spi-nor: core code for the Altera Quadspi Flash Controller v2

2017-07-04 Thread Michal Suchanek

On 4 July 2017 at 02:00, Cyrille Pitchen  wrote:
> Hi Matthew,
>
>
> Le 26/06/2017 à 18:13, matthew.gerl...@linux.intel.com a écrit :
>> From: Matthew Gerlach 

>> +static int altera_quadspi_setup_banks(struct device *dev,
>> +   u32 bank, struct device_node *np)
>> +{
>> + struct altera_quadspi *q = dev_get_drvdata(dev);
>> + struct altera_quadspi_flash *flash;
>> + struct spi_nor *nor;
>> + int ret = 0;
>> + char modalias[40] = {0};
>> + struct spi_nor_hwcaps hwcaps = {
>> + .mask = SNOR_HWCAPS_READ |
>> + SNOR_HWCAPS_READ_FAST |
>> + SNOR_HWCAPS_READ_1_1_2 |
>> + SNOR_HWCAPS_READ_1_1_4 |
>> + SNOR_HWCAPS_PP,
>> + };
>
> since aletera_quadspi_{read|erase} just don't care about
> nor->read_opcode, nor->program_opcode and so on and anyway override all
> settings chosen by spi-nor.c, it means they will use Dual or Quad SPI
> controllers as they want, whether SNOR_HWCAPS_READ_1_1_{2|4} are set or not.
> Then I think it's risky to declare the READ_1_1_2 and READ_1_1_4 hwcaps
> because it may trigger additionnal calls of nor->read_reg() /
> nor->write_reg() from spi_nor_scan() with op codes not supported by
> altera_quadspi_{read|write}_reg().
>
>> +
>> + if (bank > q->num_flashes - 1)
>> + return -EINVAL;
>> +
>> + altera_quadspi_chip_select(q, bank);
>> +
>> + flash = devm_kzalloc(q->dev, sizeof(*flash), GFP_KERNEL);
>> + if (!flash)
>> + return -ENOMEM;
>> +
>> + q->flash[bank] = flash;
>> + nor = >nor;
>> + nor->dev = dev;
>> + nor->priv = flash;
>> + nor->mtd.priv = nor;
>> + flash->q = q;
>> + flash->bank = bank;
>> + spi_nor_set_flash_node(nor, np);
>> +
>> + /* spi nor framework*/
>> + nor->read_reg = altera_quadspi_read_reg;
>> + nor->write_reg = altera_quadspi_write_reg;
>> + nor->read = altera_quadspi_read;
>> + nor->write = altera_quadspi_write;
>> + nor->erase = altera_quadspi_erase;
>> + nor->flash_lock = altera_quadspi_lock;
>> + nor->flash_unlock = altera_quadspi_unlock;
>
> nor->flash_lock and nor->flash_unlock are described as "FLASH SPECIFIC"
> in include/linux/mtd/spi-nor.h as opposed to "DRIVER SPECIFIC" functions
> like nor->read, nor->read_reg, ...
>
> It means the actual implementations should be provided by the spi-nor
> sub-system but not by each SPI controller driver.
>
>
>
> For me, it really sounds like a bad idea that this driver tries so much
> to mystify the spi-nor sub-system.
>
> I can understand that you have to cope with the hardware design and its
> limitations but clearly it looks the spi-nor API is not suited to this
> hardware. This driver ignores and by-passes any settings selected by
> spi_nor_scan().
> Duplicating code is generally a bad idea but in this case, I don't know
> if trying to reuse spi_nor_read() / spi_nor_write() and spi_nor_erase()
> from spi-nor.c is that helpful.
>
> Why not directly plug your driver into the above mtd layer implementing
> you own version of mtd->_read(), mtd->_write() and mtd->_erase() then
> registering the mtd device? It may be not the way to go but at least we
> should study this alternative.

AFAICT fsl-quadspi does just that preventing the use of the SPI
controller for non-flash devices.

There is at least one accelerated driver that is passed the opcodes to
program in the controller for read acceleration in spi_flash_read so
reusing that should be viable. If the opcodes can be programmed or
match what is hardcoded in the controller use the acceleration and
fallback to plain spi transfer if there is mismatch between what
m25p80_read requests and what the controller can do.

If this works and you can still use the plain SPI trnsfers the
controller will be much morer useful than fsl-quadspi.

Thanks

Michal

[PATCH v3 03/16] drm/fb-helper: remove drm_fb_helper_save_lut_atomic

2017-07-04 Thread Peter Rosin

drm_fb_helper_save_lut_atomic is redundant since the .gamma_store is
now always kept up to date by drm_fb_helper_setcmap.

Signed-off-by: Peter Rosin 
---
 drivers/gpu/drm/drm_fb_helper.c | 17 -
 1 file changed, 17 deletions(-)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 41fd9e0..b75b1f2 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -253,22 +253,6 @@ int drm_fb_helper_remove_one_connector(struct 
drm_fb_helper *fb_helper,
 }
 EXPORT_SYMBOL(drm_fb_helper_remove_one_connector);
 
-static void drm_fb_helper_save_lut_atomic(struct drm_crtc *crtc, struct 
drm_fb_helper *helper)
-{
-   uint16_t *r_base, *g_base, *b_base;
-   int i;
-
-   if (helper->funcs->gamma_get == NULL)
-   return;
-
-   r_base = crtc->gamma_store;
-   g_base = r_base + crtc->gamma_size;
-   b_base = g_base + crtc->gamma_size;
-
-   for (i = 0; i < crtc->gamma_size; i++)
-   helper->funcs->gamma_get(crtc, _base[i], _base[i], 
_base[i], i);
-}
-
 static void drm_fb_helper_restore_lut_atomic(struct drm_crtc *crtc)
 {
uint16_t *r_base, *g_base, *b_base;
@@ -309,7 +293,6 @@ int drm_fb_helper_debug_enter(struct fb_info *info)
if (drm_drv_uses_atomic_modeset(mode_set->crtc->dev))
continue;
 
-   drm_fb_helper_save_lut_atomic(mode_set->crtc, helper);
funcs->mode_set_base_atomic(mode_set->crtc,
mode_set->fb,
mode_set->x,
-- 
2.1.4

[tip:irq/urgent] genirq/debugfs: Fix build for !CONFIG_IRQ_DOMAIN

2017-07-04 Thread tip-bot for Sebastian Ott

Commit-ID:  e5682b4eecb2b73282853d0ef314d3164b986997
Gitweb: http://git.kernel.org/tip/e5682b4eecb2b73282853d0ef314d3164b986997
Author: Sebastian Ott 
AuthorDate: Tue, 4 Jul 2017 11:25:15 +0200
Committer:  Thomas Gleixner 
CommitDate: Tue, 4 Jul 2017 12:36:43 +0200

genirq/debugfs: Fix build for !CONFIG_IRQ_DOMAIN

Fix this build error:

kernel/irq/internals.h:440:20: error: inlining failed in call to always_inline
  'irq_domain_debugfs_init': function body not available
kernel/irq/debugfs.c:202:2: note: called from here
  irq_domain_debugfs_init(root_dir);
  ^

Signed-off-by: Sebastian Ott 
Signed-off-by: Thomas Gleixner 
Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1707041124000.1712@schleppi

---
 kernel/irq/internals.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index 9da14d1..dbfba99 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -437,7 +437,9 @@ static inline void irq_remove_debugfs_entry(struct irq_desc 
*desc)
 # ifdef CONFIG_IRQ_DOMAIN
 void irq_domain_debugfs_init(struct dentry *root);
 # else
-static inline void irq_domain_debugfs_init(struct dentry *root);
+static inline void irq_domain_debugfs_init(struct dentry *root)
+{
+}
 # endif
 #else /* CONFIG_GENERIC_IRQ_DEBUGFS */
 static inline void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *d)

[tip:irq/urgent] genirq/timings: Move free timings out of spinlocked region

2017-07-04 Thread tip-bot for Thomas Gleixner

Commit-ID:  2343877fbda701599653e63f8dcc318aa1bf15ee
Gitweb: http://git.kernel.org/tip/2343877fbda701599653e63f8dcc318aa1bf15ee
Author: Thomas Gleixner 
AuthorDate: Thu, 29 Jun 2017 23:33:39 +0200
Committer:  Thomas Gleixner 
CommitDate: Tue, 4 Jul 2017 12:46:16 +0200

genirq/timings: Move free timings out of spinlocked region

No point to do memory management from a interrupt disabled spin locked
region.

Signed-off-by: Thomas Gleixner 
Reviewed-by: Marc Zyngier 
Cc: Daniel Lezcano 
Cc: Heiko Stuebner 
Cc: Julia Cartwright 
Cc: Linus Walleij 
Cc: Brian Norris 
Cc: Doug Anderson 
Cc: linux-rockc...@lists.infradead.org
Cc: John Keeping 
Cc: linux-g...@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.196130...@linutronix.de

---
 kernel/irq/manage.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 3e69343..91e1f23 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1489,7 +1489,6 @@ static struct irqaction *__free_irq(unsigned int irq, 
void *dev_id)
if (!desc->action) {
irq_settings_clr_disable_unlazy(desc);
irq_shutdown(desc);
-   irq_remove_timings(desc);
}
 
 #ifdef CONFIG_SMP
@@ -1531,8 +1530,10 @@ static struct irqaction *__free_irq(unsigned int irq, 
void *dev_id)
}
}
 
-   if (!desc->action)
+   if (!desc->action) {
irq_release_resources(desc);
+   irq_remove_timings(desc);
+   }
 
mutex_unlock(>request_mutex);

Re: [PATCH 2/2] misc: added Spreadtrum's radio driver

2017-07-04 Thread Arnd Bergmann

On Tue, Jul 4, 2017 at 12:15 PM, Chunyan Zhang
 wrote:
> This patch added FM radio driver for Spreadtrum's SC2342, which's
> a WCN SoC, also added a new directory for Spreadtrum's WCN SoCs.
>
> Signed-off-by: Songhe Wei 
> Signed-off-by: Chunyan Zhang 

(adding linux-media folks to Cc)

Hi Chunyan,

Thanks for posting this for inclusion as Greg asked for. I'm not sure what
the policy is for new radio drivers, but I assume this would have to go
to drivers/staging/media/ as it is a driver for hardware that fits into
drivers/media/radio but doesn't use the respective APIs.

Arnd
---
end of message, full patch quoted for reference below

> ---
>  drivers/misc/Kconfig   |1 +
>  drivers/misc/Makefile  |1 +
>  drivers/misc/sprd-wcn/Kconfig  |   14 +
>  drivers/misc/sprd-wcn/Makefile |1 +
>  drivers/misc/sprd-wcn/radio/Kconfig|8 +
>  drivers/misc/sprd-wcn/radio/Makefile   |2 +
>  drivers/misc/sprd-wcn/radio/fmdrv.h|  595 +++
>  drivers/misc/sprd-wcn/radio/fmdrv_main.c   | 1245 
> 
>  drivers/misc/sprd-wcn/radio/fmdrv_main.h   |  117 +++
>  drivers/misc/sprd-wcn/radio/fmdrv_ops.c|  447 +
>  drivers/misc/sprd-wcn/radio/fmdrv_ops.h|   17 +
>  drivers/misc/sprd-wcn/radio/fmdrv_rds_parser.c |  753 ++
>  drivers/misc/sprd-wcn/radio/fmdrv_rds_parser.h |  103 ++
>  13 files changed, 3304 insertions(+)
>  create mode 100644 drivers/misc/sprd-wcn/Kconfig
>  create mode 100644 drivers/misc/sprd-wcn/Makefile
>  create mode 100644 drivers/misc/sprd-wcn/radio/Kconfig
>  create mode 100644 drivers/misc/sprd-wcn/radio/Makefile
>  create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv.h
>  create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv_main.c
>  create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv_main.h
>  create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv_ops.c
>  create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv_ops.h
>  create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv_rds_parser.c
>  create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv_rds_parser.h
>
> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> index 07bbd4c..5e295b3 100644
> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -510,4 +510,5 @@ source "drivers/misc/mic/Kconfig"
>  source "drivers/misc/genwqe/Kconfig"
>  source "drivers/misc/echo/Kconfig"
>  source "drivers/misc/cxl/Kconfig"
> +source "drivers/misc/sprd-wcn/Kconfig"
>  endmenu
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index ad13677..df75ea7 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -55,6 +55,7 @@ obj-$(CONFIG_VEXPRESS_SYSCFG) += vexpress-syscfg.o
>  obj-$(CONFIG_CXL_BASE) += cxl/
>  obj-$(CONFIG_ASPEED_LPC_CTRL)  += aspeed-lpc-ctrl.o
>  obj-$(CONFIG_PCI_ENDPOINT_TEST)+= pci_endpoint_test.o
> +obj-$(CONFIG_SPRD_WCN) += sprd-wcn/
>
>  lkdtm-$(CONFIG_LKDTM)  += lkdtm_core.o
>  lkdtm-$(CONFIG_LKDTM)  += lkdtm_bugs.o
> diff --git a/drivers/misc/sprd-wcn/Kconfig b/drivers/misc/sprd-wcn/Kconfig
> new file mode 100644
> index 000..d2e7428
> --- /dev/null
> +++ b/drivers/misc/sprd-wcn/Kconfig
> @@ -0,0 +1,14 @@
> +config SPRD_WCN
> +   tristate "Support for Spreadtrum's WCN SoCs"
> +   depends on ARCH_SPRD
> +   default n
> +   help
> + This enables Spreadtrum's WCN (wireless connectivity network)
> + SoCs. In general, Spreadtrum's WCN SoCs consisted of some
> + modules, such as FM, bluetooth, wifi, GPS, etc.
> +
> +if SPRD_WCN
> +
> +source "drivers/misc/sprd-wcn/radio/Kconfig"
> +
> +endif
> diff --git a/drivers/misc/sprd-wcn/Makefile b/drivers/misc/sprd-wcn/Makefile
> new file mode 100644
> index 000..3ad5dad
> --- /dev/null
> +++ b/drivers/misc/sprd-wcn/Makefile
> @@ -0,0 +1 @@
> +obj-y  += radio/
> diff --git a/drivers/misc/sprd-wcn/radio/Kconfig 
> b/drivers/misc/sprd-wcn/radio/Kconfig
> new file mode 100644
> index 000..3cc0f7e
> --- /dev/null
> +++ b/drivers/misc/sprd-wcn/radio/Kconfig
> @@ -0,0 +1,8 @@
> +## Spreadtrum SC2332 FM drivers
> +
> +config SPRD_RADIO_SC2332
> +   tristate "Support for the Spreadtrum Radio SC2332"
> +   default n
> +   ---help---
> + Say Y to enable built-in FM radio controller for the
> + Spreadtrum SC2332 SoC.
> diff --git a/drivers/misc/sprd-wcn/radio/Makefile 
> b/drivers/misc/sprd-wcn/radio/Makefile
> new file mode 100644
> index 000..16f1582
> --- /dev/null
> +++ b/drivers/misc/sprd-wcn/radio/Makefile
> @@ -0,0 +1,2 @@
> +obj-$(CONFIG_SPRD_RADIO_SC2332) := marlin2_fm.o
> +marlin2_fm-objs := fmdrv_main.o fmdrv_ops.o fmdrv_rds_parser.o
> diff --git a/drivers/misc/sprd-wcn/radio/fmdrv.h 
> b/drivers/misc/sprd-wcn/radio/fmdrv.h
> new file mode 100644
> index

[tip:irq/urgent] genirq: Add mutex to irq desc to serialize request/free_irq()

2017-07-04 Thread tip-bot for Thomas Gleixner

Commit-ID:  9114014cf4e6df0b22d764380ae1fc54f1a7a8b2
Gitweb: http://git.kernel.org/tip/9114014cf4e6df0b22d764380ae1fc54f1a7a8b2
Author: Thomas Gleixner 
AuthorDate: Thu, 29 Jun 2017 23:33:37 +0200
Committer:  Thomas Gleixner 
CommitDate: Tue, 4 Jul 2017 12:46:16 +0200

genirq: Add mutex to irq desc to serialize request/free_irq()

The irq_request/release_resources() callbacks ar currently invoked under
desc->lock with interrupts disabled. This is a source of problems on RT and
conceptually not required.

Add a seperate mutex to struct irq_desc which allows to serialize
request/free_irq(), which can be used to move the resource functions out of
the desc->lock held region.

Signed-off-by: Thomas Gleixner 
Reviewed-by: Marc Zyngier 
Cc: Heiko Stuebner 
Cc: Julia Cartwright 
Cc: Linus Walleij 
Cc: Brian Norris 
Cc: Doug Anderson 
Cc: linux-rockc...@lists.infradead.org
Cc: John Keeping 
Cc: linux-g...@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.039220...@linutronix.de

---
 include/linux/irqdesc.h | 3 +++
 kernel/irq/irqdesc.c| 1 +
 kernel/irq/manage.c | 8 
 3 files changed, 12 insertions(+)

diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
index d425a3a..3e90a09 100644
--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 /*
  * Core internal functions to deal with irq descriptors
@@ -45,6 +46,7 @@ struct pt_regs;
  * IRQF_FORCE_RESUME set
  * @rcu:   rcu head for delayed free
  * @kobj:  kobject used to represent this struct in sysfs
+ * @request_mutex: mutex to protect request/free before locking desc->lock
  * @dir:   /proc/irq/ procfs entry
  * @debugfs_file:  dentry for the debugfs file
  * @name:  flow handler name for /proc/interrupts output
@@ -96,6 +98,7 @@ struct irq_desc {
struct rcu_head rcu;
struct kobject  kobj;
 #endif
+   struct mutexrequest_mutex;
int parent_irq;
struct module   *owner;
const char  *name;
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 948b50e..906a67e 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -373,6 +373,7 @@ static struct irq_desc *alloc_desc(int irq, int node, 
unsigned int flags,
 
raw_spin_lock_init(>lock);
lockdep_set_class(>lock, _desc_lock_class);
+   mutex_init(>request_mutex);
init_rcu_head(>rcu);
 
desc_set_defaults(irq, desc, node, affinity, owner);
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 0934e02..0139908 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1167,6 +1167,8 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, 
struct irqaction *new)
if (desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)
new->flags &= ~IRQF_ONESHOT;
 
+   mutex_lock(>request_mutex);
+
chip_bus_lock(desc);
 
/*
@@ -1350,6 +1352,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, 
struct irqaction *new)
 
raw_spin_unlock_irqrestore(>lock, flags);
chip_bus_sync_unlock(desc);
+   mutex_unlock(>request_mutex);
 
irq_setup_timings(desc, new);
 
@@ -1383,6 +1386,8 @@ out_unlock:
 
chip_bus_sync_unlock(desc);
 
+   mutex_unlock(>request_mutex);
+
 out_thread:
if (new->thread) {
struct task_struct *t = new->thread;
@@ -1446,6 +1451,7 @@ static struct irqaction *__free_irq(unsigned int irq, 
void *dev_id)
if (!desc)
return NULL;
 
+   mutex_lock(>request_mutex);
chip_bus_lock(desc);
raw_spin_lock_irqsave(>lock, flags);
 
@@ -1521,6 +1527,8 @@ static struct irqaction *__free_irq(unsigned int irq, 
void *dev_id)
}
}
 
+   mutex_unlock(>request_mutex);
+
irq_chip_pm_put(>irq_data);
module_put(desc->owner);
kfree(action->secondary);

[tip:irq/urgent] genirq: Move irq resource handling out of spinlocked region

2017-07-04 Thread tip-bot for Thomas Gleixner

Commit-ID:  46e48e257360f0845fe17089713cbad4db611e70
Gitweb: http://git.kernel.org/tip/46e48e257360f0845fe17089713cbad4db611e70
Author: Thomas Gleixner 
AuthorDate: Thu, 29 Jun 2017 23:33:38 +0200
Committer:  Thomas Gleixner 
CommitDate: Tue, 4 Jul 2017 12:46:16 +0200

genirq: Move irq resource handling out of spinlocked region

Aside of being conceptually wrong, there is also an actual (hard to
trigger and mostly theoretical) problem.

CPU0CPU1
free_irq(X) interrupt X
spin_lock(desc->lock)
wake irq thread()
spin_unlock(desc->lock)
spin_lock(desc->lock)
remove action()
shutdown_irq()  
release_resources() thread_handler()
spin_unlock(desc->lock)   access released resources.

synchronize_irq()

Move the release resources invocation after synchronize_irq() so it's
guaranteed that the threaded handler has finished.

Move the resource request call out of the desc->lock held region as well,
so the invocation context is the same for both request and release.

This solves the problems with those functions on RT as well.
 
Signed-off-by: Thomas Gleixner 
Reviewed-by: Marc Zyngier 
Cc: Heiko Stuebner 
Cc: Julia Cartwright 
Cc: Linus Walleij 
Cc: Brian Norris 
Cc: Doug Anderson 
Cc: linux-rockc...@lists.infradead.org
Cc: John Keeping 
Cc: linux-g...@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.117028...@linutronix.de

---
 kernel/irq/manage.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 0139908..3e69343 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1168,6 +1168,14 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, 
struct irqaction *new)
new->flags &= ~IRQF_ONESHOT;
 
mutex_lock(>request_mutex);
+   if (!desc->action) {
+   ret = irq_request_resources(desc);
+   if (ret) {
+   pr_err("Failed to request resources for %s (irq %d) on 
irqchip %s\n",
+  new->name, irq, desc->irq_data.chip->name);
+   goto out_mutex;
+   }
+   }
 
chip_bus_lock(desc);
 
@@ -1271,13 +1279,6 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, 
struct irqaction *new)
}
 
if (!shared) {
-   ret = irq_request_resources(desc);
-   if (ret) {
-   pr_err("Failed to request resources for %s (irq %d) on 
irqchip %s\n",
-  new->name, irq, desc->irq_data.chip->name);
-   goto out_unlock;
-   }
-
init_waitqueue_head(>wait_for_threads);
 
/* Setup the type (level, edge polarity) if configured: */
@@ -1386,6 +1387,10 @@ out_unlock:
 
chip_bus_sync_unlock(desc);
 
+   if (!desc->action)
+   irq_release_resources(desc);
+
+out_mutex:
mutex_unlock(>request_mutex);
 
 out_thread:
@@ -1484,7 +1489,6 @@ static struct irqaction *__free_irq(unsigned int irq, 
void *dev_id)
if (!desc->action) {
irq_settings_clr_disable_unlazy(desc);
irq_shutdown(desc);
-   irq_release_resources(desc);
irq_remove_timings(desc);
}
 
@@ -1527,6 +1531,9 @@ static struct irqaction *__free_irq(unsigned int irq, 
void *dev_id)
}
}
 
+   if (!desc->action)
+   irq_release_resources(desc);
+
mutex_unlock(>request_mutex);
 
irq_chip_pm_put(>irq_data);

[tip:irq/urgent] genirq: Move bus locking into __setup_irq()

2017-07-04 Thread tip-bot for Thomas Gleixner

Commit-ID:  3a90795e1e885167209056a1a90be965add30e25
Gitweb: http://git.kernel.org/tip/3a90795e1e885167209056a1a90be965add30e25
Author: Thomas Gleixner 
AuthorDate: Thu, 29 Jun 2017 23:33:36 +0200
Committer:  Thomas Gleixner 
CommitDate: Tue, 4 Jul 2017 12:46:15 +0200

genirq: Move bus locking into __setup_irq()

There is no point in having the irq_bus_lock() protection around all
callers to __setup_irq().

Move it into __setup_irq(). This is also a preparatory patch for addressing
the issues with the irq resource callbacks.

Signed-off-by: Thomas Gleixner 
Reviewed-by: Marc Zyngier 
Cc: Heiko Stuebner 
Cc: Julia Cartwright 
Cc: Linus Walleij 
Cc: Brian Norris 
Cc: Doug Anderson 
Cc: linux-rockc...@lists.infradead.org
Cc: John Keeping 
Cc: linux-g...@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214343.960949...@linutronix.de

---
 kernel/irq/manage.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 5c11c17..0934e02 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1167,6 +1167,8 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, 
struct irqaction *new)
if (desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)
new->flags &= ~IRQF_ONESHOT;
 
+   chip_bus_lock(desc);
+
/*
 * The following block of code has to be executed atomically
 */
@@ -1347,6 +1349,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, 
struct irqaction *new)
}
 
raw_spin_unlock_irqrestore(>lock, flags);
+   chip_bus_sync_unlock(desc);
 
irq_setup_timings(desc, new);
 
@@ -1378,6 +1381,8 @@ mismatch:
 out_unlock:
raw_spin_unlock_irqrestore(>lock, flags);
 
+   chip_bus_sync_unlock(desc);
+
 out_thread:
if (new->thread) {
struct task_struct *t = new->thread;
@@ -1417,9 +1422,7 @@ int setup_irq(unsigned int irq, struct irqaction *act)
if (retval < 0)
return retval;
 
-   chip_bus_lock(desc);
retval = __setup_irq(irq, desc, act);
-   chip_bus_sync_unlock(desc);
 
if (retval)
irq_chip_pm_put(>irq_data);
@@ -1674,9 +1677,7 @@ int request_threaded_irq(unsigned int irq, irq_handler_t 
handler,
return retval;
}
 
-   chip_bus_lock(desc);
retval = __setup_irq(irq, desc, action);
-   chip_bus_sync_unlock(desc);
 
if (retval) {
irq_chip_pm_put(>irq_data);
@@ -1924,9 +1925,7 @@ int setup_percpu_irq(unsigned int irq, struct irqaction 
*act)
if (retval < 0)
return retval;
 
-   chip_bus_lock(desc);
retval = __setup_irq(irq, desc, act);
-   chip_bus_sync_unlock(desc);
 
if (retval)
irq_chip_pm_put(>irq_data);
@@ -1980,9 +1979,7 @@ int request_percpu_irq(unsigned int irq, irq_handler_t 
handler,
return retval;
}
 
-   chip_bus_lock(desc);
retval = __setup_irq(irq, desc, action);
-   chip_bus_sync_unlock(desc);
 
if (retval) {
irq_chip_pm_put(>irq_data);

答复: [Patch v2 3/3] arm64: dts: register Hi3660's thermal sensor

2017-07-04 Thread Wangtao (Kevin, Kirin)


On 2017/7/1 11:06, "Eduardo Valentin"  wrote:> 
> On Thu, Jun 22, 2017 at 11:42:03AM +0800, Tao Wang wrote:
> > Bind thermal sensor driver for Hi3660.
> >
> > Signed-off-by: Tao Wang 
> > Signed-off-by: Leo Yan 
> > ---
> > Changes in v2:
> > - rebase changes on linux next
> >
> >  arch/arm64/boot/dts/hisilicon/hi3660.dtsi |6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> > index c6a1961..a6a1e01 100644
> > --- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> > +++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> > @@ -848,5 +848,11 @@
> >  _cfg_func>;
> > status = "disabled";
> > };
> > +
> > +   tsensor: tsensor {
> > +   compatible = "hisilicon,hi3660-thermal";
> > +   reg = <0x0 0xfff3 0x0 0x1000>;
> > +   #thermal-sensor-cells = <1>;
> > +   };
> 
> Are you planning to also add thermal zone entries?
Yes, after cpufreq enabled.
> 
> > };
> >  };
> > --
> > 1.7.9.5
> >

Re: [PATCH] bus: arm-ccn: constify attribute_group structures.

2017-07-04 Thread Pawel Moll

On Mon, 2017-07-03 at 13:01 +0530, Arvind Yadav wrote:
> attribute_groups are not supposed to change at runtime. All functions
> working with attribute_groups provided by  work with const
> attribute_group. So mark the non-const structs as const.
> 
> File size before:
>    text      data bss dec hex filename
>    9074      5592 416   15082    3aea drivers/bus/arm-ccn.o
> 
> File size After adding 'const':
>    text      data bss dec hex filename
>    9327      5336 416   15079    3ae7 drivers/bus/arm-ccn.o
> 
> Signed-off-by: Arvind Yadav 

Looks fine to me. I'll queue it for the next time I push out CCN driver
fixes (no dates promised, though...)

Thanks!

Paweł

Re: [PATCH] mm: larger stack guard gap, between vmas

2017-07-04 Thread Ben Hutchings

On Tue, 2017-07-04 at 12:42 +0200, Michal Hocko wrote:
> On Tue 04-07-17 11:47:28, Willy Tarreau wrote:
> > On Tue, Jul 04, 2017 at 11:35:38AM +0200, Michal Hocko wrote:
[...]
> > But wouldn't this completely disable the check in case such a guard page
> > is installed, and possibly continue to allow the collision when the stack
> > allocation is large enough to skip this guard page ?
> 
> Yes and but a PROT_NONE would fault and as the changelog says, we _hope_
> that userspace does the right thing.

It may well not be large enough, because of the same wrong assumptions
that resulted in the kernel's guard page not being large enough.  We
should count it as part of the guard gap but not a substitute.

> > Shouldn't we instead
> > "skip" such a vma and look for the next one ?
> 
> Yeah, that would be possible, I am not sure it is worth it though. The
> gap as it is implemented now prevents regular mappings to get close to
> the stack. So we only care about those with MAP_FIXED and those can
> screw things already so we really have to rely on userspace doing some
> semi reasonable.
> 
> > I was thinking about something more like :
> > 
> > prev = vma->vm_prev;
> > +   /* Don't consider a possible user-space stack guard page */
> > +   if (prev && !(prev->vm_flags & VM_GROWSDOWN) &&
> > +   !(prev->vm_flags & (VM_WRITE|VM_READ|VM_EXEC)))
> > +   prev = prev->vm_prev;
> > +
> 
> If anywhing this would require to have a loop over all PROT_NONE
> mappings to not hit into other weird usecases.

That's what I was thinking of.  Tried the following patch:

Subject: mmap: Ignore VM_NONE mappings when checking for space to
 expand the stack

Some user-space run-times (in particular, Java and Rust) allocate
their own guard pages in the main stack.  This didn't work well
before, but it can now block stack expansion where it is safe and would
previously have been allowed.  Ignore such mappings when checking the
size of the gap before expanding.

Reported-by: Ximin Luo 
References: https://bugs.debian.org/865416
Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas")
Cc: sta...@vger.kernel.org
Signed-off-by: Ben Hutchings 
---
 mm/mmap.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index a5e3dcd75e79..19f3ce04f24f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2243,7 +2243,14 @@ int expand_upwards(struct vm_area_struct *vma, unsigned 
long address)
if (gap_addr < address || gap_addr > TASK_SIZE)
gap_addr = TASK_SIZE;
 
-   next = vma->vm_next;
+   /*
+* Allow VM_NONE mappings in the gap as some applications try
+* to make their own stack guards
+*/
+   for (next = vma->vm_next;
+next && !(next->vm_flags & (VM_READ | VM_WRITE | VM_EXEC));
+next = next->vm_next)
+   ;
if (next && next->vm_start < gap_addr) {
if (!(next->vm_flags & VM_GROWSUP))
return -ENOMEM;
@@ -2323,11 +2330,17 @@ int expand_downwards(struct vm_area_struct *vma,
if (error)
return error;
 
-   /* Enforce stack_guard_gap */
+   /*
+* Enforce stack_guard_gap, but allow VM_NONE mappings in the gap
+* as some applications try to make their own stack guards
+*/
gap_addr = address - stack_guard_gap;
if (gap_addr > address)
return -ENOMEM;
-   prev = vma->vm_prev;
+   for (prev = vma->vm_prev;
+prev && !(prev->vm_flags & (VM_READ | VM_WRITE | VM_EXEC));
+prev = prev->vm_prev)
+   ;
if (prev && prev->vm_end > gap_addr) {
if (!(prev->vm_flags & VM_GROWSDOWN))
return -ENOMEM;
--- END ---

I don't have a ppc64el machine where I can change the kernel, but I
tried this on x86_64 with the stack limit reduced to 1 MiB and Rust
is able to expand its stack where previously it would crash.

This *doesn't* fix the LibreOffice regression on i386.

Ben.

-- 
Ben Hutchings
The world is coming to an end.  Please log off.



signature.asc
Description: Digital signature

Re: [PATCH mm] introduce reverse buddy concept to reduce buddy fragment

2017-07-04 Thread Mel Gorman

On Tue, Jul 04, 2017 at 01:24:14PM +0200, Michal Hocko wrote:
> On Tue 04-07-17 16:04:52, zhouxianrong wrote:
> > every 2s i sample /proc/buddyinfo in the whole test process.
> > 
> > the last about 90 samples were sampled after the test was done.
> 
> I've tried to explain to you that numbers without a proper testing
> metodology and highlevel metrics you are interested in and comparision
> to the base kernel are meaningless. I cannot draw any conclusion from
> looking at numbers you have posted. Are high order allocations cheaper
> to do with this patch? What about an averge order-0 allocation request?
> 

I have to agree. The patch is extremely complex for what it does which
is working around a limitation of the buddy allocator in general
(buddy's must be naturally aligned). There would have to be *strong*
justification that allocations fail even with compaction or a reclaim
cycle or that the latency is severely reduced -- neither which is
evident from the data presented. It would also have to be proven that
there is no overhead added in the general case to justify this so
without extensive justification for the complexity;

Naked-by: Mel Gorman 

> You are touching memory allocator hot paths and those are really
> sensitive to changes. It takes a lot of testing with different workloads
> to prove that no new regressions are introduced. That being said, I
> completely agree that reducing the memory fragmentation is an important
> objective but touching the page allocator and adding new branches there
> sounds like a problematic approach which would have to show _huge_
> benefits to be mergeable. Is it possible to improve khugepaged to
> accomplish the same thing?

Or if this is CMA related, a justification why alloc_contig_range cannot do
the same thing with a linear walk when the initial allocation attempt fails.

-- 
Mel Gorman
SUSE Labs

Re: [PATCH v3] acpi: configfs: Unload SSDT on configfs entry removal

2017-07-04 Thread Rafael J. Wysocki

On Tue, Jul 4, 2017 at 8:14 AM, Jan Kiszka  wrote:
> On 2017-06-09 20:36, Jan Kiszka wrote:
>> Call directly into acpica to load a table to obtain its index on return.
>> We choose the direct call of acpica internal functions to avoid having
>> to modify its API which is used outside of Linux as well.
>>
>> Use that index to unload the table again when the corresponding
>> directory in configfs gets removed. This allows to change SSDTs without
>> rebooting the system. It also allows to destroy devices again that a
>> dynamically loaded SSDT created.
>>
>> This is widely similar to the DT overlay behavior.
>>
>> Signed-off-by: Jan Kiszka 
>> ---
>>
>> Change in v3:
>>  - fix breakage if acpi_configfs is modular
>>
>>  drivers/acpi/acpi_configfs.c | 20 +++-
>>  drivers/acpi/acpica/tbdata.c |  4 
>>  2 files changed, 23 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/acpi/acpi_configfs.c b/drivers/acpi/acpi_configfs.c
>> index 146a77fb762d..853bc7fc673f 100644
>> --- a/drivers/acpi/acpi_configfs.c
>> +++ b/drivers/acpi/acpi_configfs.c
>> @@ -15,11 +15,15 @@
>>  #include 
>>  #include 
>>
>> +#include "acpica/accommon.h"
>> +#include "acpica/actables.h"
>> +
>>  static struct config_group *acpi_table_group;
>>
>>  struct acpi_table {
>>   struct config_item cfg;
>>   struct acpi_table_header *header;
>> + u32 index;
>>  };
>>
>>  static ssize_t acpi_table_aml_write(struct config_item *cfg,
>> @@ -52,7 +56,11 @@ static ssize_t acpi_table_aml_write(struct config_item 
>> *cfg,
>>   if (!table->header)
>>   return -ENOMEM;
>>
>> - ret = acpi_load_table(table->header);
>> + ACPI_INFO(("Host-directed Dynamic ACPI Table Load:"));
>> + ret = acpi_tb_install_and_load_table(
>> + ACPI_PTR_TO_PHYSADDR(table->header),
>> + ACPI_TABLE_ORIGIN_EXTERNAL_VIRTUAL, FALSE,
>> + >index);
>>   if (ret) {
>>   kfree(table->header);
>>   table->header = NULL;
>> @@ -215,8 +223,18 @@ static struct config_item *acpi_table_make_item(struct 
>> config_group *group,
>>   return >cfg;
>>  }
>>
>> +static void acpi_table_drop_item(struct config_group *group,
>> +  struct config_item *cfg)
>> +{
>> + struct acpi_table *table = container_of(cfg, struct acpi_table, cfg);
>> +
>> + ACPI_INFO(("Host-directed Dynamic ACPI Table Unload"));
>> + acpi_tb_unload_table(table->index);
>> +}
>> +
>>  struct configfs_group_operations acpi_table_group_ops = {
>>   .make_item = acpi_table_make_item,
>> + .drop_item = acpi_table_drop_item,
>>  };
>>
>>  static struct config_item_type acpi_tables_type = {
>> diff --git a/drivers/acpi/acpica/tbdata.c b/drivers/acpi/acpica/tbdata.c
>> index 27c5c27d4818..c9d6fa6d7cc6 100644
>> --- a/drivers/acpi/acpica/tbdata.c
>> +++ b/drivers/acpi/acpica/tbdata.c
>> @@ -867,6 +867,8 @@ acpi_tb_install_and_load_table(acpi_physical_address 
>> address,
>>   return_ACPI_STATUS(status);
>>  }
>>
>> +ACPI_EXPORT_SYMBOL(acpi_tb_install_and_load_table)
>> +
>>  
>> /***
>>   *
>>   * FUNCTION:acpi_tb_unload_table
>> @@ -914,3 +916,5 @@ acpi_status acpi_tb_unload_table(u32 table_index)
>>   acpi_tb_set_table_loaded_flag(table_index, FALSE);
>>   return_ACPI_STATUS(status);
>>  }
>> +
>> +ACPI_EXPORT_SYMBOL(acpi_tb_unload_table)
>>
>
> Ping for this patch.

Pushed to Linus and is waiting for merging (along with the other ACPI
changes for 4.13).

Thanks,
Rafael

Re: [patch V2 1/2] mm: swap: Provide lru_add_drain_all_cpuslocked()

2017-07-04 Thread Vlastimil Babka

On 07/04/2017 11:32 AM, Thomas Gleixner wrote:
> The rework of the cpu hotplug locking unearthed potential deadlocks with
> the memory hotplug locking code.
> 
> The solution for these is to rework the memory hotplug locking code as well
> and take the cpu hotplug lock before the memory hotplug lock in
> mem_hotplug_begin(), but this will cause a recursive locking of the cpu
> hotplug lock when the memory hotplug code calls lru_add_drain_all().
> 
> Split out the inner workings of lru_add_drain_all() into
> lru_add_drain_all_cpuslocked() so this function can be invoked from the
> memory hotplug code with the cpu hotplug lock held.
> 
> Reported-by: Andrey Ryabinin 
> Signed-off-by: Thomas Gleixner 
> Cc: Michal Hocko 
> Cc: linux...@kvack.org
> Cc: Andrew Morton 
> Cc: Vlastimil Babka 
> Cc: Vladimir Davydov 

Acked-by: Vlastimil Babka 

A question below.

> ---
>  include/linux/swap.h |1 +
>  mm/swap.c|   11 ---
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -277,6 +277,7 @@ extern void mark_page_accessed(struct pa
>  extern void lru_add_drain(void);
>  extern void lru_add_drain_cpu(int cpu);
>  extern void lru_add_drain_all(void);
> +extern void lru_add_drain_all_cpuslocked(void);
>  extern void rotate_reclaimable_page(struct page *page);
>  extern void deactivate_file_page(struct page *page);
>  extern void mark_page_lazyfree(struct page *page);
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -687,7 +687,7 @@ static void lru_add_drain_per_cpu(struct
>  
>  static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
>  
> -void lru_add_drain_all(void)
> +void lru_add_drain_all_cpuslocked(void)
>  {
>   static DEFINE_MUTEX(lock);
>   static struct cpumask has_work;
> @@ -701,7 +701,6 @@ void lru_add_drain_all(void)
>   return;
>  
>   mutex_lock();
> - get_online_cpus();

Is there a an assertion check that we are locked, that could be put in
e.g. VM_WARN_ON_ONCE()?

>   cpumask_clear(_work);
>  
>   for_each_online_cpu(cpu) {
> @@ -721,10 +720,16 @@ void lru_add_drain_all(void)
>   for_each_cpu(cpu, _work)
>   flush_work(_cpu(lru_add_drain_work, cpu));
>  
> - put_online_cpus();
>   mutex_unlock();
>  }
>  
> +void lru_add_drain_all(void)
> +{
> + get_online_cpus();
> + lru_add_drain_all_cpuslocked();
> + put_online_cpus();
> +}
> +
>  /**
>   * release_pages - batched put_page()
>   * @pages: array of pages to release
> 
>

Re: [PATCH v2 5/8] KVM: arm/arm64: vgic: Handle mapped level sensitive SPIs

2017-07-04 Thread Marc Zyngier

Hi Eric,

On 15/06/17 13:52, Eric Auger wrote:
> Currently, the line level of unmapped level sensitive SPIs is
> toggled down by the maintenance IRQ handler/resamplefd mechanism.
> 
> As mapped SPI completion is not trapped, we cannot rely on this
> mechanism and the line level needs to be observed at distributor
> level instead.
> 
> This patch handles the physical IRQ case in vgic_validate_injection
> and get the line level of a mapped SPI at distributor level.
> 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> v1 -> v2:
> - renamed is_unshared_mapped into is_mapped_spi
> - changes to kvm_vgic_map_phys_irq moved in the previous patch
> - make vgic_validate_injection more readable
> - reword the commit message
> ---
>  virt/kvm/arm/vgic/vgic.c | 16 ++--
>  virt/kvm/arm/vgic/vgic.h |  7 ++-
>  2 files changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index 075f073..2e35ac7 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -139,6 +139,17 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>   kfree(irq);
>  }
>  
> +bool irq_line_level(struct vgic_irq *irq)
> +{
> + bool line_level = irq->line_level;
> +
> + if (unlikely(is_mapped_spi(irq)))
> + WARN_ON(irq_get_irqchip_state(irq->host_irq,
> +   IRQCHIP_STATE_PENDING,
> +   _level));
> + return line_level;
> +}
> +
>  /**
>   * kvm_vgic_target_oracle - compute the target vcpu for an irq
>   *
> @@ -236,13 +247,14 @@ static void vgic_sort_ap_list(struct kvm_vcpu *vcpu)
>  
>  /*
>   * Only valid injection if changing level for level-triggered IRQs or for a
> - * rising edge.
> + * rising edge. Injection of virtual interrupts associated to physical
> + * interrupts always is valid.
>   */
>  static bool vgic_validate_injection(struct vgic_irq *irq, bool level)
>  {
>   switch (irq->config) {
>   case VGIC_CONFIG_LEVEL:
> - return irq->line_level != level;
> + return (irq->line_level != level || 
> unlikely(is_mapped_spi(irq)));
>   case VGIC_CONFIG_EDGE:
>   return level;
>   }
> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
> index bba7fa2..da254ae 100644
> --- a/virt/kvm/arm/vgic/vgic.h
> +++ b/virt/kvm/arm/vgic/vgic.h
> @@ -96,14 +96,19 @@
>  /* we only support 64 kB translation table page size */
>  #define KVM_ITS_L1E_ADDR_MASKGENMASK_ULL(51, 16)
>  
> +bool irq_line_level(struct vgic_irq *irq);
> +
>  static inline bool irq_is_pending(struct vgic_irq *irq)
>  {
>   if (irq->config == VGIC_CONFIG_EDGE)
>   return irq->pending_latch;
>   else
> - return irq->pending_latch || irq->line_level;
> + return irq->pending_latch || irq_line_level(irq);

I'm a bit concerned that an edge interrupt doesn't take the distributor
state into account here. Why is that so? Once an SPI is forwarded to a
guest, a large part of the edge vs level differences move into the HW,
and are not that different anymore from a SW PoV.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...

Re: [vs-plain] Re: [PATCH] mm: larger stack guard gap, between vmas

2017-07-04 Thread John Haxby

On 04/07/17 00:55, Ben Hutchings wrote:
> Unfortunately these regressions have not been completely fixed by
> switching to Hugh's fix.
> 
> Firstly, some Rust programs are crashing on ppc64el with 64 KiB pages. 
> Apparently Rust maps its own guard page at the lower limit of the stack
> (determined using pthread_getattr_np() and pthread_attr_getstack()).  I
> don't think this ever actually worked for the main thread stack, but it
> now also blocks expansion as the default stack size of 8 MiB is smaller
> than the stack gap of 16 MiB.  Would it make sense to skip over
> PROT_NONE mappings when checking whether it's safe to expand?
> 
> Secondly, LibreOffice is crashing on i386 when running components
> implemented in Java.  I don't have a diagnosis for this yet.

We found that we needed f4cb767d76cf ("mm: fix new crash in
unmapped_area_topdown()")   Apologies if you've already covered that.

This may be needed in addition to the other patch you proposed.

jch

Re: [PATCH v5 2/4] [media] platform: Add Synopsys Designware HDMI RX Controller Driver

2017-07-04 Thread Jose Abreu



On 04-07-2017 10:39, Hans Verkuil wrote:
>
>> +static const struct v4l2_subdev_video_ops
>> dw_hdmi_sd_video_ops = {
>> +.s_routing = dw_hdmi_s_routing,
>> +.g_input_status = dw_hdmi_g_input_status,
>> +.g_parm = dw_hdmi_g_parm,
>> +.g_dv_timings = dw_hdmi_g_dv_timings,
>> +.query_dv_timings = dw_hdmi_query_dv_timings,
> No s_dv_timings???
 Hmm, yeah, I didn't implement it because the callchain and the
 player I use just use {get/set}_fmt. s_dv_timings can just
 populate the fields and replace them with the detected dv_timings
 ? Just like set_fmt does? Because the controller has no scaler.
>>> No, s_dv_timings is the function that actually sets
>>> dw_dev->timings.
>>> After you check that it is valid of course (call
>>> v4l2_valid_dv_timings).
>>>
>>> set_fmt calls get_fmt which returns the information from
>>> dw_dev->timings.
>>>
>>> But it is s_dv_timings that has to set dw_dev->timings.
>>>
>>> With the current code you can only capture 640x480 (the default
>>> timings).
>>> Have you ever tested this with any other timings? I don't quite
>>> understand
>>> how you test.
>> I use mpv to test with a wrapper driver that just calls the
>> subdev ops and sets up a video dma.
>>
>> Ah, I see now. I failed to port the correct callbacks and in the
>> upstream version I'm using I only tested with 640x480 ...
>>
>> But apart from that this is a capture device without scaling so I
>> can not set timings, I can only return them so that applications
>> know which format I'm receiving, right? So my s_dv_timings will
>> return the same as query_dv_timings ...
> Well, to be precise: s_dv_timings just accepts what the application
> gives it (as long as it is within the dv_timings capabilities). But
> those timings come in practice from a query_dv_timings call from the
> application.
>
> The core rule is that receivers cannot randomly change timings since
> timings are related to buffer sizes. You do not want the application
> to allocate buffers for 640x480 and when the source changes to 1920x1080
> have those buffers suddenly overflow.
>
> Instead the app queries the timings, allocates the buffers, start
> streaming and when the timings change it will get an event so it can
> stop streaming, reallocate buffers, and start the process again.
>
> In other words, the application is in control here.
>

... But this is not true for mpv/mplayer. They first try to set a
default format (by using s_fmt) and then query the format again
(by using g_fmt) ... So dv_timings are never used. Are these apps
broken? Im only using them because of performance, do you
recommend others?

Best regards,
Jose Miguel Abreu

[PATCH] clk: qcom: clk-smd-rpm: Fix the initial rate of branches

2017-07-04 Thread Georgi Djakov

As there is no way to actually query the hardware for the current clock
rate, now racalc_rate() just returns the last rate that was previously
set. But if the rate was not set yet, we return the bogus rate of 1KHz.

Knowing what the rate of XO is and that some clocks are just branches of
it, we can do better and return that rate instead of a bogus one.

Reported-by: Archit Taneja 
Signed-off-by: Georgi Djakov 
---
 drivers/clk/qcom/clk-smd-rpm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
index d990fe44aef3..7350a43b0573 100644
--- a/drivers/clk/qcom/clk-smd-rpm.c
+++ b/drivers/clk/qcom/clk-smd-rpm.c
@@ -116,12 +116,12 @@
 
 #define DEFINE_CLK_SMD_RPM_XO_BUFFER(_platform, _name, _active, r_id)\
__DEFINE_CLK_SMD_RPM_BRANCH(_platform, _name, _active,\
-   QCOM_SMD_RPM_CLK_BUF_A, r_id, 0, 1000,\
+   QCOM_SMD_RPM_CLK_BUF_A, r_id, 0, 1920,\
QCOM_RPM_KEY_SOFTWARE_ENABLE)
 
 #define DEFINE_CLK_SMD_RPM_XO_BUFFER_PINCTRL(_platform, _name, _active, r_id) \
__DEFINE_CLK_SMD_RPM_BRANCH(_platform, _name, _active,\
-   QCOM_SMD_RPM_CLK_BUF_A, r_id, 0, 1000,\
+   QCOM_SMD_RPM_CLK_BUF_A, r_id, 0, 1920,\
QCOM_RPM_KEY_PIN_CTRL_CLK_BUFFER_ENABLE_KEY)
 
 #define to_clk_smd_rpm(_hw) container_of(_hw, struct clk_smd_rpm, hw)

Re: [PATCH] mm: larger stack guard gap, between vmas

2017-07-04 Thread Michal Hocko

On Tue 04-07-17 13:21:02, Ben Hutchings wrote:
> On Tue, 2017-07-04 at 14:00 +0200, Michal Hocko wrote:
> > On Tue 04-07-17 12:36:11, Ben Hutchings wrote:
> > > On Tue, 2017-07-04 at 12:42 +0200, Michal Hocko wrote:
> > > > On Tue 04-07-17 11:47:28, Willy Tarreau wrote:
> > > > > On Tue, Jul 04, 2017 at 11:35:38AM +0200, Michal Hocko wrote:
> > > 
> > > [...]
> > > > > But wouldn't this completely disable the check in case such a guard 
> > > > > page
> > > > > is installed, and possibly continue to allow the collision when the 
> > > > > stack
> > > > > allocation is large enough to skip this guard page ?
> > > > 
> > > > Yes and but a PROT_NONE would fault and as the changelog says, we _hope_
> > > > that userspace does the right thing.
> > > 
> > > It may well not be large enough, because of the same wrong assumptions
> > > that resulted in the kernel's guard page not being large enough.  We
> > > should count it as part of the guard gap but not a substitute.
> > 
> > yes, you are right of course. But isn't this a bug on their side
> > considering they are managing their _own_ stack gap?
> 
> Yes it's their bug, but you know the rule - don't break user-space.

Absolutely, that is why I belive we should consider the prev VMA but
doing anything more just risks for new regressions. Or why do you think
that not-checking them would cause a regression?

> > Our stack gap
> > management is a best effort thing and two such approaches competing will
> > always lead to weird cornercases. That was my assumption when saying
> > that I am not sure this is really _worth_ it. We should definitely try
> > to workaround clashes but that's about it. If others think that we
> > should do everything to prevent even those issues I will not oppose
> > of course. It just adds more cycles to something that is a weird case
> > already.
> 
> I don't want odd behaviour to weaken the stack guard.
> 
> > [...]
> > 
> > > This *doesn't* fix the LibreOffice regression on i386.
> > 
> > Are there any details about this regression?
> 
> Here:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=865303#170
> 
> I haven't reproduced it in Writer, but if I use Base to create a new
> HSQLDB database it reliably crashes (HSQLDB is implemented in Java).

I haven't read through previous 169 comments but I do not see any stack
trace. Ideally with info proc mapping that would tell us the memory
layout.

-- 
Michal Hocko
SUSE Labs

Re: [RFC][PATCH] sched: attach extra runtime to the right avg

2017-07-04 Thread Peter Zijlstra

On Tue, Jul 04, 2017 at 02:21:50PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 04, 2017 at 12:13:09PM +0200, Ingo Molnar wrote:
> > 
> > This code on the other hand:
> > 
> > sa->last_update_time += delta << 10;
> > 
> > ... in essence creates a whole new absolute clock value that slowly but 
> > surely is 
> > drifting away from the real rq->clock, because 'delta' is always rounded 
> > down to 
> > the nearest 1024 ns boundary, so we accumulate the 'remainder' losses.
> > 
> > That is because:
> > 
> > delta >>= 10;
> > ...
> > sa->last_update_time += delta << 10;
> > 
> > Given enough time, ->last_update_time can drift a long way, and this delta:
> > 
> > delta = now - sa->last_update_time;
> > 
> > ... becomes meaningless AFAICS, because it's essentially two different 
> > clocks that 
> > get compared.
> 
> Thing is, once you drift over 1023 (ns) your delta increases and you
> catch up again.
> 
> 
> 
>  A  B C   D  E  F
>  |  | |   |  |  |
>  ++++++++++++
> 
> 
> A: now = 0
>sa->last_update_time = 0
>delta := (now - sa->last_update_time) >> 10 = 0
> 
> B: now = 614  (+614)
>delta = (614 - 0) >> 10 = 0
>sa->last_update_time += 0  (0)
>sa->last_update_time = now & ~1023 (0)
> 
> C: now = 1843 (+1229)
>delta = (1843 - 0) >> 10 = 1
>sa->last_update_time += 1024   (1024)
>sa->last_update_time = now & ~1023 (1024)
> 
> 
> D: now = 3481 (+1638)
>delta = (3481 - 1024) >> 10 = 2
>sa->last_update_time += 2048   (3072)
>sa->last_update_time = now & ~1023 (3072)
> 
> E: now = 5734 (+2253)
>delta = (5734 - 3072) = 2
>sa->last_update_time += 2048   (5120)
>sa->last_update_time = now & ~1023 (5120)
> 
> F: now = 6348 (+614)
>delta = (6348 - 5120) >> 10 = 1
>sa->last_update_time += 1024   (6144)
>sa->last_update_time = now & ~1023 (6144)
> 
> 
> 
> And you'll see that both are identical, and that both D and F have
> gotten a spill from sub-chunk accounting.


Where the two approaches differ is when we have different modifications
to sa->last_update_time (and we do).

The differential (+=) one does not mandate initial value of
->last_update_time has the bottom 9 bits cleared. It will simply
continue from wherever.

The absolute (&) one however mandates that ->last_update_time always has
the bottom few bits 0, otherwise we can 'gain' time. The first iteration
will clear those bits and we'll then double account them.

It so happens that we have an explicit assign in migrate
(attach_entity_load_avg / set_task_rq_fair). And on negative delta. In
all those cases we use the immediate 'now' value, no clearing of bottom
bits.

The differential should work fine with that, the absolute one has double
accounting issues in that case.

So it would be very good to find what exactly causes Josef's workload to
get 'fixed'.

Re: [PATCH v3] phy: allwinner: phy-sun4i-usb: Add log when probing

2017-07-04 Thread Mylene Josserand


Hi,

On 04/07/2017 14:37, Quentin Schulz wrote:

When phy-sun4i-usb's probing fails, it does not print the reason in
kernel log, forcing the developer to edit this driver to add info logs.
This commit makes the kernel print the reason of phy-sun4i-usb's probing
failure or a success message.

Signed-off-by: Quentin Schulz 


Tested-by: Mylène Josserand 

Thanks!

--
Mylène Josserand, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

Re: [patch V2 1/2] mm: swap: Provide lru_add_drain_all_cpuslocked()

2017-07-04 Thread Michal Hocko

On Tue 04-07-17 14:48:56, Thomas Gleixner wrote:
> On Tue, 4 Jul 2017, Michal Hocko wrote:
> > On Tue 04-07-17 11:32:33, Thomas Gleixner wrote:
> > > The rework of the cpu hotplug locking unearthed potential deadlocks with
> > > the memory hotplug locking code.
> > > 
> > > The solution for these is to rework the memory hotplug locking code as 
> > > well
> > > and take the cpu hotplug lock before the memory hotplug lock in
> > > mem_hotplug_begin(), but this will cause a recursive locking of the cpu
> > > hotplug lock when the memory hotplug code calls lru_add_drain_all().
> > > 
> > > Split out the inner workings of lru_add_drain_all() into
> > > lru_add_drain_all_cpuslocked() so this function can be invoked from the
> > > memory hotplug code with the cpu hotplug lock held.
> > 
> > You have added callers in the later patch in the series AFAICS which
> > is OK but I think it would be better to have them in this patch
> > already. Nothing earth shattering (maybe a rebase artifact).
> 
> The requirement for changing that comes with the extra hotplug locking in
> mem_hotplug_begin(). That is required to establish the proper lock order
> and then causes the recursive locking in the next patch. Adding the caller
> here would be wrong, because then lru_add_drain_all_cpuslocked() would be
> called unprotected. Hens and eggs as usual :)

Yeah, you are right. My bad I should have noticed that.
-- 
Michal Hocko
SUSE Labs

[PATCH 24/36] net, x25: convert x25_route.refcnt from atomic_t to refcount_t

2017-07-04 Thread Elena Reshetova

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/x25.h   | 7 ---
 net/x25/x25_route.c | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/net/x25.h b/include/net/x25.h
index 6d30a01..1ac1400 100644
--- a/include/net/x25.h
+++ b/include/net/x25.h
@@ -11,6 +11,7 @@
 #define _X25_H 
 #include 
 #include 
+#include 
 #include 
 
 #defineX25_ADDR_LEN16
@@ -129,7 +130,7 @@ struct x25_route {
struct x25_address  address;
unsigned intsigdigits;
struct net_device   *dev;
-   atomic_trefcnt;
+   refcount_t  refcnt;
 };
 
 struct x25_neigh {
@@ -265,12 +266,12 @@ void x25_route_free(void);
 
 static __inline__ void x25_route_hold(struct x25_route *rt)
 {
-   atomic_inc(>refcnt);
+   refcount_inc(>refcnt);
 }
 
 static __inline__ void x25_route_put(struct x25_route *rt)
 {
-   if (atomic_dec_and_test(>refcnt))
+   if (refcount_dec_and_test(>refcnt))
kfree(rt);
 }
 
diff --git a/net/x25/x25_route.c b/net/x25/x25_route.c
index 277c8d2..b85b889 100644
--- a/net/x25/x25_route.c
+++ b/net/x25/x25_route.c
@@ -55,7 +55,7 @@ static int x25_add_route(struct x25_address *address, 
unsigned int sigdigits,
 
rt->sigdigits = sigdigits;
rt->dev   = dev;
-   atomic_set(>refcnt, 1);
+   refcount_set(>refcnt, 1);
 
list_add(>node, _route_list);
rc = 0;
-- 
2.7.4

[PATCH 25/36] net, x25: convert x25_neigh.refcnt from atomic_t to refcount_t

2017-07-04 Thread Elena Reshetova

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/x25.h  | 6 +++---
 net/x25/x25_link.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/net/x25.h b/include/net/x25.h
index 1ac1400..2609b57 100644
--- a/include/net/x25.h
+++ b/include/net/x25.h
@@ -142,7 +142,7 @@ struct x25_neigh {
unsigned long   t20;
struct timer_list   t20timer;
unsigned long   global_facil_mask;
-   atomic_trefcnt;
+   refcount_t  refcnt;
 };
 
 struct x25_sock {
@@ -243,12 +243,12 @@ void x25_link_free(void);
 /* x25_neigh.c */
 static __inline__ void x25_neigh_hold(struct x25_neigh *nb)
 {
-   atomic_inc(>refcnt);
+   refcount_inc(>refcnt);
 }
 
 static __inline__ void x25_neigh_put(struct x25_neigh *nb)
 {
-   if (atomic_dec_and_test(>refcnt))
+   if (refcount_dec_and_test(>refcnt))
kfree(nb);
 }
 
diff --git a/net/x25/x25_link.c b/net/x25/x25_link.c
index bcaa180..e0cd04d 100644
--- a/net/x25/x25_link.c
+++ b/net/x25/x25_link.c
@@ -266,7 +266,7 @@ void x25_link_device_up(struct net_device *dev)
   X25_MASK_PACKET_SIZE |
   X25_MASK_WINDOW_SIZE;
nb->t20  = sysctl_x25_restart_request_timeout;
-   atomic_set(>refcnt, 1);
+   refcount_set(>refcnt, 1);
 
write_lock_bh(_neigh_list_lock);
list_add(>node, _neigh_list);
-- 
2.7.4

[PATCH 29/36] net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t

2017-07-04 Thread Elena Reshetova

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/sctp/auth.h | 5 +++--
 net/sctp/auth.c | 4 ++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/net/sctp/auth.h b/include/net/sctp/auth.h
index 171244b..e5c57d0 100644
--- a/include/net/sctp/auth.h
+++ b/include/net/sctp/auth.h
@@ -31,6 +31,7 @@
 #define __sctp_auth_h__
 
 #include 
+#include 
 
 struct sctp_endpoint;
 struct sctp_association;
@@ -53,7 +54,7 @@ struct sctp_hmac {
  * over SCTP-AUTH
  */
 struct sctp_auth_bytes {
-   atomic_t refcnt;
+   refcount_t refcnt;
__u32 len;
__u8  data[];
 };
@@ -76,7 +77,7 @@ static inline void sctp_auth_key_hold(struct sctp_auth_bytes 
*key)
if (!key)
return;
 
-   atomic_inc(>refcnt);
+   refcount_inc(>refcnt);
 }
 
 void sctp_auth_key_put(struct sctp_auth_bytes *key);
diff --git a/net/sctp/auth.c b/net/sctp/auth.c
index 8ffa598..e001b01 100644
--- a/net/sctp/auth.c
+++ b/net/sctp/auth.c
@@ -63,7 +63,7 @@ void sctp_auth_key_put(struct sctp_auth_bytes *key)
if (!key)
return;
 
-   if (atomic_dec_and_test(>refcnt)) {
+   if (refcount_dec_and_test(>refcnt)) {
kzfree(key);
SCTP_DBG_OBJCNT_DEC(keys);
}
@@ -84,7 +84,7 @@ static struct sctp_auth_bytes *sctp_auth_create_key(__u32 
key_len, gfp_t gfp)
return NULL;
 
key->len = key_len;
-   atomic_set(>refcnt, 1);
+   refcount_set(>refcnt, 1);
SCTP_DBG_OBJCNT_INC(keys);
 
return key;
-- 
2.7.4

[PATCH 28/36] net, xfrm: convert sec_path.refcnt from atomic_t to refcount_t

2017-07-04 Thread Elena Reshetova

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/xfrm.h| 6 +++---
 net/xfrm/xfrm_input.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index e1bd1de..c0916ab 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1030,7 +1030,7 @@ struct xfrm_offload {
 };
 
 struct sec_path {
-   atomic_trefcnt;
+   refcount_t  refcnt;
int len;
int olen;
 
@@ -1051,7 +1051,7 @@ static inline struct sec_path *
 secpath_get(struct sec_path *sp)
 {
if (sp)
-   atomic_inc(>refcnt);
+   refcount_inc(>refcnt);
return sp;
 }
 
@@ -1060,7 +1060,7 @@ void __secpath_destroy(struct sec_path *sp);
 static inline void
 secpath_put(struct sec_path *sp)
 {
-   if (sp && atomic_dec_and_test(>refcnt))
+   if (sp && refcount_dec_and_test(>refcnt))
__secpath_destroy(sp);
 }
 
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 9de4b1d..923205e 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -116,7 +116,7 @@ struct sec_path *secpath_dup(struct sec_path *src)
for (i = 0; i < sp->len; i++)
xfrm_state_hold(sp->xvec[i]);
}
-   atomic_set(>refcnt, 1);
+   refcount_set(>refcnt, 1);
return sp;
 }
 EXPORT_SYMBOL(secpath_dup);
@@ -126,7 +126,7 @@ int secpath_set(struct sk_buff *skb)
struct sec_path *sp;
 
/* Allocate new secpath or COW existing one. */
-   if (!skb->sp || atomic_read(>sp->refcnt) != 1) {
+   if (!skb->sp || refcount_read(>sp->refcnt) != 1) {
sp = secpath_dup(skb->sp);
if (!sp)
return -ENOMEM;
-- 
2.7.4

[PATCH 30/36] net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t

2017-07-04 Thread Elena Reshetova

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/sctp/structs.h | 2 +-
 net/sctp/chunk.c   | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 07c11fe..4d7c855 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -496,7 +496,7 @@ struct sctp_datamsg {
/* Chunks waiting to be submitted to lower layer. */
struct list_head chunks;
/* Reference counting. */
-   atomic_t refcnt;
+   refcount_t refcnt;
/* When is this message no longer interesting to the peer? */
unsigned long expires_at;
/* Did the messenge fail to send? */
diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
index 81466f6..1323d41 100644
--- a/net/sctp/chunk.c
+++ b/net/sctp/chunk.c
@@ -49,7 +49,7 @@
 /* Initialize datamsg from memory. */
 static void sctp_datamsg_init(struct sctp_datamsg *msg)
 {
-   atomic_set(>refcnt, 1);
+   refcount_set(>refcnt, 1);
msg->send_failed = 0;
msg->send_error = 0;
msg->can_delay = 1;
@@ -136,13 +136,13 @@ static void sctp_datamsg_destroy(struct sctp_datamsg *msg)
 /* Hold a reference. */
 static void sctp_datamsg_hold(struct sctp_datamsg *msg)
 {
-   atomic_inc(>refcnt);
+   refcount_inc(>refcnt);
 }
 
 /* Release a reference. */
 void sctp_datamsg_put(struct sctp_datamsg *msg)
 {
-   if (atomic_dec_and_test(>refcnt))
+   if (refcount_dec_and_test(>refcnt))
sctp_datamsg_destroy(msg);
 }
 
-- 
2.7.4

[PATCH 33/36] net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t

2017-07-04 Thread Elena Reshetova

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/sctp/structs.h | 2 +-
 net/sctp/associola.c   | 6 +++---
 net/sctp/endpointola.c | 6 +++---
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 6a0d372..5ab29af 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -1174,7 +1174,7 @@ struct sctp_ep_common {
 *   refcnt   - Reference count access to this object.
 *   dead - Do not attempt to use this object.
 */
-   atomic_trefcnt;
+   refcount_trefcnt;
booldead;
 
/* What socket does this endpoint belong to?  */
diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index fa4f530..40ec836 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -88,7 +88,7 @@ static struct sctp_association *sctp_association_init(struct 
sctp_association *a
asoc->base.type = SCTP_EP_TYPE_ASSOCIATION;
 
/* Initialize the object handling fields.  */
-   atomic_set(>base.refcnt, 1);
+   refcount_set(>base.refcnt, 1);
 
/* Initialize the bind addr area.  */
sctp_bind_addr_init(>base.bind_addr, ep->base.bind_addr.port);
@@ -873,7 +873,7 @@ void sctp_assoc_control_transport(struct sctp_association 
*asoc,
 /* Hold a reference to an association. */
 void sctp_association_hold(struct sctp_association *asoc)
 {
-   atomic_inc(>base.refcnt);
+   refcount_inc(>base.refcnt);
 }
 
 /* Release a reference to an association and cleanup
@@ -881,7 +881,7 @@ void sctp_association_hold(struct sctp_association *asoc)
  */
 void sctp_association_put(struct sctp_association *asoc)
 {
-   if (atomic_dec_and_test(>base.refcnt))
+   if (refcount_dec_and_test(>base.refcnt))
sctp_association_destroy(asoc);
 }
 
diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
index efbc318..0e86f98 100644
--- a/net/sctp/endpointola.c
+++ b/net/sctp/endpointola.c
@@ -114,7 +114,7 @@ static struct sctp_endpoint *sctp_endpoint_init(struct 
sctp_endpoint *ep,
ep->base.type = SCTP_EP_TYPE_SOCKET;
 
/* Initialize the basic object fields. */
-   atomic_set(>base.refcnt, 1);
+   refcount_set(>base.refcnt, 1);
ep->base.dead = false;
 
/* Create an input queue.  */
@@ -285,7 +285,7 @@ static void sctp_endpoint_destroy(struct sctp_endpoint *ep)
 /* Hold a reference to an endpoint. */
 void sctp_endpoint_hold(struct sctp_endpoint *ep)
 {
-   atomic_inc(>base.refcnt);
+   refcount_inc(>base.refcnt);
 }
 
 /* Release a reference to an endpoint and clean up if there are
@@ -293,7 +293,7 @@ void sctp_endpoint_hold(struct sctp_endpoint *ep)
  */
 void sctp_endpoint_put(struct sctp_endpoint *ep)
 {
-   if (atomic_dec_and_test(>base.refcnt))
+   if (refcount_dec_and_test(>base.refcnt))
sctp_endpoint_destroy(ep);
 }
 
-- 
2.7.4

[PATCH 27/36] net, xfrm: convert xfrm_policy.refcnt from atomic_t to refcount_t

2017-07-04 Thread Elena Reshetova

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/xfrm.h | 6 +++---
 net/key/af_key.c   | 2 +-
 net/xfrm/xfrm_policy.c | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index f5272a2..e1bd1de 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -560,7 +560,7 @@ struct xfrm_policy {
 
/* This lock only affects elements except for entry. */
rwlock_tlock;
-   atomic_trefcnt;
+   refcount_t  refcnt;
struct timer_list   timer;
 
struct flow_cache_object flo;
@@ -816,14 +816,14 @@ static inline void xfrm_audit_state_icvfail(struct 
xfrm_state *x,
 static inline void xfrm_pol_hold(struct xfrm_policy *policy)
 {
if (likely(policy != NULL))
-   atomic_inc(>refcnt);
+   refcount_inc(>refcnt);
 }
 
 void xfrm_policy_destroy(struct xfrm_policy *policy);
 
 static inline void xfrm_pol_put(struct xfrm_policy *policy)
 {
-   if (atomic_dec_and_test(>refcnt))
+   if (refcount_dec_and_test(>refcnt))
xfrm_policy_destroy(policy);
 }
 
diff --git a/net/key/af_key.c b/net/key/af_key.c
index edcf1d0..ca9d3ae 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -2177,7 +2177,7 @@ static int pfkey_xfrm_policy2msg(struct sk_buff *skb, 
const struct xfrm_policy *
}
 
hdr->sadb_msg_len = size / sizeof(uint64_t);
-   hdr->sadb_msg_reserved = atomic_read(>refcnt);
+   hdr->sadb_msg_reserved = refcount_read(>refcnt);
 
return 0;
 }
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 4706df6..ff61d85 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -62,7 +62,7 @@ static struct xfrm_policy *__xfrm_policy_unlink(struct 
xfrm_policy *pol,
 
 static inline bool xfrm_pol_hold_rcu(struct xfrm_policy *policy)
 {
-   return atomic_inc_not_zero(>refcnt);
+   return refcount_inc_not_zero(>refcnt);
 }
 
 static inline bool
@@ -292,7 +292,7 @@ struct xfrm_policy *xfrm_policy_alloc(struct net *net, 
gfp_t gfp)
INIT_HLIST_NODE(>bydst);
INIT_HLIST_NODE(>byidx);
rwlock_init(>lock);
-   atomic_set(>refcnt, 1);
+   refcount_set(>refcnt, 1);
skb_queue_head_init(>polq.hold_queue);
setup_timer(>timer, xfrm_policy_timer,
(unsigned long)policy);
-- 
2.7.4

[PATCH 36/36] net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t

2017-07-04 Thread Elena Reshetova

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/ax25.h | 6 +++---
 net/ax25/af_ax25.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/net/ax25.h b/include/net/ax25.h
index e3467ba..c4a0cf6 100644
--- a/include/net/ax25.h
+++ b/include/net/ax25.h
@@ -244,7 +244,7 @@ typedef struct ax25_cb {
unsigned char   window;
struct timer_list   timer, dtimer;
struct sock *sk;/* Backlink to socket */
-   atomic_trefcount;
+   refcount_t  refcount;
 } ax25_cb;
 
 struct ax25_sock {
@@ -266,11 +266,11 @@ static inline struct ax25_cb *sk_to_ax25(const struct 
sock *sk)
hlist_for_each_entry(__ax25, list, ax25_node)
 
 #define ax25_cb_hold(__ax25) \
-   atomic_inc(&((__ax25)->refcount))
+   refcount_inc(&((__ax25)->refcount))
 
 static __inline__ void ax25_cb_put(ax25_cb *ax25)
 {
-   if (atomic_dec_and_test(>refcount)) {
+   if (refcount_dec_and_test(>refcount)) {
kfree(ax25->digipeat);
kfree(ax25);
}
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index 0c92ba0..f3f9d18 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -510,7 +510,7 @@ ax25_cb *ax25_create_cb(void)
if ((ax25 = kzalloc(sizeof(*ax25), GFP_ATOMIC)) == NULL)
return NULL;
 
-   atomic_set(>refcount, 1);
+   refcount_set(>refcount, 1);
 
skb_queue_head_init(>write_queue);
skb_queue_head_init(>frag_queue);
-- 
2.7.4

[PATCH 01/36] net, llc: convert llc_sap.refcnt from atomic_t to refcount_t

2017-07-04 Thread Elena Reshetova

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/llc.h  | 6 +++---
 net/llc/llc_core.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/net/llc.h b/include/net/llc.h
index e8e61d4..dc35f25 100644
--- a/include/net/llc.h
+++ b/include/net/llc.h
@@ -55,7 +55,7 @@ struct llc_sap {
unsigned charstate;
unsigned charp_bit;
unsigned charf_bit;
-   atomic_t refcnt;
+   refcount_t   refcnt;
int  (*rcv_func)(struct sk_buff *skb,
 struct net_device *dev,
 struct packet_type *pt,
@@ -113,14 +113,14 @@ struct llc_sap *llc_sap_open(unsigned char lsap,
struct net_device *orig_dev));
 static inline void llc_sap_hold(struct llc_sap *sap)
 {
-   atomic_inc(>refcnt);
+   refcount_inc(>refcnt);
 }
 
 void llc_sap_close(struct llc_sap *sap);
 
 static inline void llc_sap_put(struct llc_sap *sap)
 {
-   if (atomic_dec_and_test(>refcnt))
+   if (refcount_dec_and_test(>refcnt))
llc_sap_close(sap);
 }
 
diff --git a/net/llc/llc_core.c b/net/llc/llc_core.c
index 842851c..8904126 100644
--- a/net/llc/llc_core.c
+++ b/net/llc/llc_core.c
@@ -41,7 +41,7 @@ static struct llc_sap *llc_sap_alloc(void)
spin_lock_init(>sk_lock);
for (i = 0; i < LLC_SK_LADDR_HASH_ENTRIES; i++)
INIT_HLIST_NULLS_HEAD(>sk_laddr_hash[i], i);
-   atomic_set(>refcnt, 1);
+   refcount_set(>refcnt, 1);
}
return sap;
 }
-- 
2.7.4

[PATCH 04/36] net, vxlan: convert vxlan_sock.refcnt from atomic_t to refcount_t

2017-07-04 Thread Elena Reshetova

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 drivers/net/vxlan.c | 10 +-
 include/net/vxlan.h |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index b04e103..96aa7e6 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1034,11 +1034,11 @@ static bool vxlan_group_used(struct vxlan_net *vn, 
struct vxlan_dev *dev)
/* The vxlan_sock is only used by dev, leaving group has
 * no effect on other vxlan devices.
 */
-   if (family == AF_INET && sock4 && atomic_read(>refcnt) == 1)
+   if (family == AF_INET && sock4 && refcount_read(>refcnt) == 1)
return false;
 #if IS_ENABLED(CONFIG_IPV6)
sock6 = rtnl_dereference(dev->vn6_sock);
-   if (family == AF_INET6 && sock6 && atomic_read(>refcnt) == 1)
+   if (family == AF_INET6 && sock6 && refcount_read(>refcnt) == 1)
return false;
 #endif
 
@@ -1075,7 +1075,7 @@ static bool __vxlan_sock_release_prep(struct vxlan_sock 
*vs)
 
if (!vs)
return false;
-   if (!atomic_dec_and_test(>refcnt))
+   if (!refcount_dec_and_test(>refcnt))
return false;
 
vn = net_generic(sock_net(vs->sock->sk), vxlan_net_id);
@@ -2825,7 +2825,7 @@ static struct vxlan_sock *vxlan_socket_create(struct net 
*net, bool ipv6,
}
 
vs->sock = sock;
-   atomic_set(>refcnt, 1);
+   refcount_set(>refcnt, 1);
vs->flags = (flags & VXLAN_F_RCV_FLAGS);
 
spin_lock(>sock_lock);
@@ -2860,7 +2860,7 @@ static int __vxlan_sock_add(struct vxlan_dev *vxlan, bool 
ipv6)
spin_lock(>sock_lock);
vs = vxlan_find_sock(vxlan->net, ipv6 ? AF_INET6 : AF_INET,
 vxlan->cfg.dst_port, vxlan->cfg.flags);
-   if (vs && !atomic_add_unless(>refcnt, 1, 0)) {
+   if (vs && !refcount_inc_not_zero(>refcnt)) {
spin_unlock(>sock_lock);
return -EBUSY;
}
diff --git a/include/net/vxlan.h b/include/net/vxlan.h
index 326e849..3f430e3 100644
--- a/include/net/vxlan.h
+++ b/include/net/vxlan.h
@@ -183,7 +183,7 @@ struct vxlan_sock {
struct hlist_node hlist;
struct socket*sock;
struct hlist_head vni_list[VNI_HASH_SIZE];
-   atomic_t  refcnt;
+   refcount_trefcnt;
u32   flags;
 };
 
-- 
2.7.4

[PATCH 0/2] add support for Spreadtrum's FM driver

2017-07-04 Thread Chunyan Zhang

According to GregKH's suggestion [1], we tried to simply sort out the
FM driver source code which has been using in the internal projects.

Hopes it can help for fixing the problem raised in [1].

[1] https://lkml.org/lkml/2017/6/28/222

Chunyan Zhang (2):
  arm64: dts: add Spreadtrum's fm support
  misc: added Spreadtrum's radio driver

 arch/arm64/boot/dts/sprd/sp9860g-1h10.dts  |4 +
 drivers/misc/Kconfig   |1 +
 drivers/misc/Makefile  |1 +
 drivers/misc/sprd-wcn/Kconfig  |   14 +
 drivers/misc/sprd-wcn/Makefile |1 +
 drivers/misc/sprd-wcn/radio/Kconfig|8 +
 drivers/misc/sprd-wcn/radio/Makefile   |2 +
 drivers/misc/sprd-wcn/radio/fmdrv.h|  595 +++
 drivers/misc/sprd-wcn/radio/fmdrv_main.c   | 1245 
 drivers/misc/sprd-wcn/radio/fmdrv_main.h   |  117 +++
 drivers/misc/sprd-wcn/radio/fmdrv_ops.c|  447 +
 drivers/misc/sprd-wcn/radio/fmdrv_ops.h|   17 +
 drivers/misc/sprd-wcn/radio/fmdrv_rds_parser.c |  753 ++
 drivers/misc/sprd-wcn/radio/fmdrv_rds_parser.h |  103 ++
 14 files changed, 3308 insertions(+)
 create mode 100644 drivers/misc/sprd-wcn/Kconfig
 create mode 100644 drivers/misc/sprd-wcn/Makefile
 create mode 100644 drivers/misc/sprd-wcn/radio/Kconfig
 create mode 100644 drivers/misc/sprd-wcn/radio/Makefile
 create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv.h
 create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv_main.c
 create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv_main.h
 create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv_ops.c
 create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv_ops.h
 create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv_rds_parser.c
 create mode 100644 drivers/misc/sprd-wcn/radio/fmdrv_rds_parser.h

-- 
2.7.4

[PATCH 1/2] arm64: dts: add Spreadtrum's fm support

2017-07-04 Thread Chunyan Zhang

Added FM support for Spreadtrum's SP9860 board.

Signed-off-by: Songhe Wei 
Signed-off-by: Chunyan Zhang 
---
 arch/arm64/boot/dts/sprd/sp9860g-1h10.dts | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/boot/dts/sprd/sp9860g-1h10.dts 
b/arch/arm64/boot/dts/sprd/sp9860g-1h10.dts
index 0362ecd..6fe052d 100644
--- a/arch/arm64/boot/dts/sprd/sp9860g-1h10.dts
+++ b/arch/arm64/boot/dts/sprd/sp9860g-1h10.dts
@@ -39,6 +39,10 @@
#size-cells = <2>;
ranges;
};
+
+   sprd-fm {
+   compatible  = "sprd,marlin2-fm";
+   };
 };
 
  {
-- 
2.7.4

Re: [PATCH] bus: omap-ocp2scp: Fix error handling in omap_ocp2scp_probe

2017-07-04 Thread Kishon Vijay Abraham I

+Tony, Arnd,

Hi,

On Friday 19 May 2017 02:16 PM, Kishon Vijay Abraham I wrote:
> The error handling code in omap_ocp2scp_probe fails to invoke
> pm_runtime_disable and fails to initialize return value in
> certain cases. Fix it here.

Can this patch be picked into arm-soc tree?

Thanks
Kishon
> 
> Signed-off-by: Kishon Vijay Abraham I 
> Signed-off-by: Sekhar Nori 
> ---
>  drivers/bus/omap-ocp2scp.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/bus/omap-ocp2scp.c b/drivers/bus/omap-ocp2scp.c
> index bf500e0e7362..77791f3dcfc6 100644
> --- a/drivers/bus/omap-ocp2scp.c
> +++ b/drivers/bus/omap-ocp2scp.c
> @@ -70,8 +70,10 @@ static int omap_ocp2scp_probe(struct platform_device *pdev)
>   if (!of_device_is_compatible(np, "ti,am437x-ocp2scp")) {
>   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>   regs = devm_ioremap_resource(>dev, res);
> - if (IS_ERR(regs))
> - goto err0;
> + if (IS_ERR(regs)) {
> + ret = PTR_ERR(regs);
> + goto err1;
> + }
>  
>   pm_runtime_get_sync(>dev);
>   reg = readl_relaxed(regs + OCP2SCP_TIMING);
> @@ -83,6 +85,9 @@ static int omap_ocp2scp_probe(struct platform_device *pdev)
>  
>   return 0;
>  
> +err1:
> + pm_runtime_disable(>dev);
> +
>  err0:
>   device_for_each_child(>dev, NULL, ocp2scp_remove_devices);
>  
>

Re: [PATCH] ideapad-laptop: Add several models to no_hw_rfkill

2017-07-04 Thread Andy Shevchenko

On Tue, Jul 4, 2017 at 8:26 AM, Yang Jiaxun  wrote:
> From 8db74a4eef334f614bf727232e5b88f67f824862 Mon Sep 17 00:00:00 2001
> From: Yang Jiaxun 
> Date: Tue, 4 Jul 2017 11:28:41 +0800
> Subject: [PATCH] ideapad-laptop: Add several models to no_hw_rfkill
>
> Some Lenovo ideapad models do not have hardware rfkill switches, but trying
> to read the rfkill switches through the ideapad-laptop module. It caused to
> always reported blocking breaking wifi.
>
> Fix it by adding those models to no_hw_rfkill list.

Thanks for the patch, though...

Please, fix your email client that our patchwork can track.

Moreover, fix the indentation in the way it's already done in the
driver (your patch should not have "-" (minus) lines AFAIU).

>
> Signed-off-by: Yang Jiaxun 
> ---
>  drivers/platform/x86/ideapad-laptop.c | 78
> +--
>  1 file changed, 74 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/platform/x86/ideapad-laptop.c
> b/drivers/platform/x86/ideapad-laptop.c
> index 527e5d9..3ebc5c7 100644
> --- a/drivers/platform/x86/ideapad-laptop.c
> +++ b/drivers/platform/x86/ideapad-laptop.c
> @@ -909,17 +909,87 @@ static const struct dmi_system_id no_hw_rfkill_list[]
> = {
>  },
>  },
>  {
> +.ident = "Lenovo V310-14IKB",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo V310-14IKB"),
> +},
> +},
> +{
> +.ident = "Lenovo V310-14ISK",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo V310-14ISK"),
> +},
> +},
> +{
> +.ident = "Lenovo V310-15IKB",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo V310-15IKB"),
> +},
> +},
> +{
>  .ident = "Lenovo V310-15ISK",
>  .matches = {
> -DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> -DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo V310-15ISK"),
> +DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo V310-15ISK"),
> +},
> +},
> +{
> +.ident = "Lenovo ideapad 300-15IBR",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo ideapad 300-15IBR"),
> +},
> +},
> +{
> +.ident = "Lenovo ideapad 300-15IKB",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo ideapad 300-15IKB"),
> +},
> +},
> +{
> +.ident = "Lenovo ideapad 300S-11IBR",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo ideapad 300S-11BR"),
> +},
> +},
> +{
> +.ident = "Lenovo ideapad 310-15ABR",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo ideapad 310-15ABR"),
> +},
> +},
> +{
> +.ident = "Lenovo ideapad 310-15IAP",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo ideapad 310-15IAP"),
> +},
> +},
> +{
> +.ident = "Lenovo ideapad 310-15ISK",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo ideapad 310-15ISK"),
>  },
>  },
>  {
>  .ident = "Lenovo ideapad 310-15IKB",
>  .matches = {
> -DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> -DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo ideapad 310-15IKB"),
> +DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo ideapad 310-15IKB"),
> +},
> +},
> +{
> +.ident = "Lenovo ideapad Y700-14ISK",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo ideapad Y700-14ISK"),
>  },
>  },
>  {
> --
> 2.7.4
>



-- 
With Best Regards,
Andy Shevchenko

[PATCH] usb: renesas_usbhs: make array type_array static const

2017-07-04 Thread Colin King

From: Colin Ian King 

Array type_array can be made static const rather than being
populated on the stack. Makes the object code smaller:

Before:
   textdata bss dec hex filename
   80871496   09583256f drivers/usb/renesas_usbhs/pipe.o

After:
   textdata bss dec hex filename
   78831584   0946724fb drivers/usb/renesas_usbhs/pipe.o

Signed-off-by: Colin Ian King 
---
 drivers/usb/renesas_usbhs/pipe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/renesas_usbhs/pipe.c b/drivers/usb/renesas_usbhs/pipe.c
index 9396a8c14af8..d811f0550c04 100644
--- a/drivers/usb/renesas_usbhs/pipe.c
+++ b/drivers/usb/renesas_usbhs/pipe.c
@@ -401,7 +401,7 @@ static int usbhsp_setup_pipecfg(struct usbhs_pipe *pipe, 
int is_host,
u16 dir = 0;
u16 epnum = 0;
u16 shtnak = 0;
-   u16 type_array[] = {
+   static const u16 type_array[] = {
[USB_ENDPOINT_XFER_BULK] = TYPE_BULK,
[USB_ENDPOINT_XFER_INT]  = TYPE_INT,
[USB_ENDPOINT_XFER_ISOC] = TYPE_ISO,
-- 
2.11.0

Re: [PATCH v3 01/16] drm/fb-helper: factor out pseudo-palette

2017-07-04 Thread Peter Rosin

On 2017-07-04 12:36, Peter Rosin wrote:
> The pseudo-palette has nothing to do with the crtc, so move it
> out of the crtc loop and update the palette once, then break out
> early.
> 
> Signed-off-by: Peter Rosin 

Should of course be p...@axentia.se

I wonder when I managed to Ctrl-T that one?

Cheers,
peda

Re: [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem

2017-07-04 Thread Michal Hocko

On Tue 04-07-17 11:32:34, Thomas Gleixner wrote:
> Andrey reported a potential deadlock with the memory hotplug lock and the
> cpu hotplug lock.
> 
> The reason is that memory hotplug takes the memory hotplug lock and then
> calls stop_machine() which calls get_online_cpus(). That's the reverse lock
> order to get_online_cpus(); get_online_mems(); in mm/slub_common.c
> 
> The problem has been there forever. The reason why this was never reported
> is that the cpu hotplug locking had this homebrewn recursive reader writer
> semaphore construct which due to the recursion evaded the full lock dep
> coverage. The memory hotplug code copied that construct verbatim and
> therefor has similar issues.
> 
> Three steps to fix this:
> 
> 1) Convert the memory hotplug locking to a per cpu rwsem so the potential
>issues get reported proper by lockdep.
> 
> 2) Lock the online cpus in mem_hotplug_begin() before taking the memory
>hotplug rwsem and use stop_machine_cpuslocked() in the page_alloc code
>and use to avoid recursive locking.
> 
> 3) The cpu hotpluck locking in #2 causes a recursive locking of the cpu
>hotplug lock via __offline_pages() -> lru_add_drain_all(). Solve this by
>invoking lru_add_drain_all_cpuslocked() instead.
> 
> Reported-by: Andrey Ryabinin 
> Signed-off-by: Thomas Gleixner 
> Cc: Michal Hocko 
> Cc: linux...@kvack.org
> Cc: Andrew Morton 
> Cc: Vlastimil Babka 
> Cc: Vladimir Davydov 

Acked-by: Michal Hocko 

> ---
>  mm/memory_hotplug.c |   89 
> 
>  mm/page_alloc.c |2 -
>  2 files changed, 16 insertions(+), 75 deletions(-)
> 
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -52,32 +52,17 @@ static void generic_online_page(struct p
>  static online_page_callback_t online_page_callback = generic_online_page;
>  static DEFINE_MUTEX(online_page_callback_lock);
>  
> -/* The same as the cpu_hotplug lock, but for memory hotplug. */
> -static struct {
> - struct task_struct *active_writer;
> - struct mutex lock; /* Synchronizes accesses to refcount, */
> - /*
> -  * Also blocks the new readers during
> -  * an ongoing mem hotplug operation.
> -  */
> - int refcount;
> +DEFINE_STATIC_PERCPU_RWSEM(mem_hotplug_lock);
>  
> -#ifdef CONFIG_DEBUG_LOCK_ALLOC
> - struct lockdep_map dep_map;
> -#endif
> -} mem_hotplug = {
> - .active_writer = NULL,
> - .lock = __MUTEX_INITIALIZER(mem_hotplug.lock),
> - .refcount = 0,
> -#ifdef CONFIG_DEBUG_LOCK_ALLOC
> - .dep_map = {.name = "mem_hotplug.lock" },
> -#endif
> -};
> +void get_online_mems(void)
> +{
> + percpu_down_read(_hotplug_lock);
> +}
>  
> -/* Lockdep annotations for get/put_online_mems() and mem_hotplug_begin/end() 
> */
> -#define memhp_lock_acquire_read() lock_map_acquire_read(_hotplug.dep_map)
> -#define memhp_lock_acquire()  lock_map_acquire(_hotplug.dep_map)
> -#define memhp_lock_release()  lock_map_release(_hotplug.dep_map)
> +void put_online_mems(void)
> +{
> + percpu_up_read(_hotplug_lock);
> +}
>  
>  #ifndef CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
>  bool memhp_auto_online;
> @@ -97,60 +82,16 @@ static int __init setup_memhp_default_st
>  }
>  __setup("memhp_default_state=", setup_memhp_default_state);
>  
> -void get_online_mems(void)
> -{
> - might_sleep();
> - if (mem_hotplug.active_writer == current)
> - return;
> - memhp_lock_acquire_read();
> - mutex_lock(_hotplug.lock);
> - mem_hotplug.refcount++;
> - mutex_unlock(_hotplug.lock);
> -
> -}
> -
> -void put_online_mems(void)
> -{
> - if (mem_hotplug.active_writer == current)
> - return;
> - mutex_lock(_hotplug.lock);
> -
> - if (WARN_ON(!mem_hotplug.refcount))
> - mem_hotplug.refcount++; /* try to fix things up */
> -
> - if (!--mem_hotplug.refcount && unlikely(mem_hotplug.active_writer))
> - wake_up_process(mem_hotplug.active_writer);
> - mutex_unlock(_hotplug.lock);
> - memhp_lock_release();
> -
> -}
> -
> -/* Serializes write accesses to mem_hotplug.active_writer. */
> -static DEFINE_MUTEX(memory_add_remove_lock);
> -
>  void mem_hotplug_begin(void)
>  {
> - mutex_lock(_add_remove_lock);
> -
> - mem_hotplug.active_writer = current;
> -
> - memhp_lock_acquire();
> - for (;;) {
> - mutex_lock(_hotplug.lock);
> - if (likely(!mem_hotplug.refcount))
> - break;
> - __set_current_state(TASK_UNINTERRUPTIBLE);
> - mutex_unlock(_hotplug.lock);
> - schedule();
> - }
> + cpus_read_lock();
> + percpu_down_write(_hotplug_lock);
>  }
>  
>  void mem_hotplug_done(void)
>  {
> - mem_hotplug.active_writer = NULL;
> - mutex_unlock(_hotplug.lock);
> - memhp_lock_release();
> -

Re: Where to update regulator register with initial voltage set by HW

2017-07-04 Thread Waldemar Rymarkiewicz

Hi Mark,

On 3 July 2017 at 17:36, Mark Brown  wrote:
> On Mon, Jul 03, 2017 at 05:33:03PM +0200, Waldemar Rymarkiewicz wrote:
>
>> I've asked also on TI forum if this is typical to the regulator not to
>> determine the startup voltage but still waiting for feedback. Anyway,
>> if this is the case I guess a driver is a good place to update
>> register before we register to the regulator framework.
>
> It's really unusual to have a device that has the voltage changable by
> register write at runtime where the current state can't be read back.

or you did not realise that this is initialised by bootloader for example.

After investigating this issue a bit more I've found that this is
rather typical for power regulators not to update a register with
startup voltage set by a feedback resistor divider as it would cost
extra circuit.  So, I assume that most likely a bootloader normally
initializes power regulator in case it's needed eg. if it's supplying
CPU which is DVS-enabled.

Anyway, it's more clear now to me how this should be done.

/Waldek

Re: [patch V2 1/2] mm: swap: Provide lru_add_drain_all_cpuslocked()

2017-07-04 Thread Michal Hocko

On Tue 04-07-17 11:32:33, Thomas Gleixner wrote:
> The rework of the cpu hotplug locking unearthed potential deadlocks with
> the memory hotplug locking code.
> 
> The solution for these is to rework the memory hotplug locking code as well
> and take the cpu hotplug lock before the memory hotplug lock in
> mem_hotplug_begin(), but this will cause a recursive locking of the cpu
> hotplug lock when the memory hotplug code calls lru_add_drain_all().
> 
> Split out the inner workings of lru_add_drain_all() into
> lru_add_drain_all_cpuslocked() so this function can be invoked from the
> memory hotplug code with the cpu hotplug lock held.

You have added callers in the later patch in the series AFAICS which
is OK but I think it would be better to have them in this patch
already. Nothing earth shattering (maybe a rebase artifact).

> Reported-by: Andrey Ryabinin 
> Signed-off-by: Thomas Gleixner 
> Cc: Michal Hocko 
> Cc: linux...@kvack.org
> Cc: Andrew Morton 
> Cc: Vlastimil Babka 
> Cc: Vladimir Davydov 

Acked-by: Michal Hocko 

> ---
>  include/linux/swap.h |1 +
>  mm/swap.c|   11 ---
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -277,6 +277,7 @@ extern void mark_page_accessed(struct pa
>  extern void lru_add_drain(void);
>  extern void lru_add_drain_cpu(int cpu);
>  extern void lru_add_drain_all(void);
> +extern void lru_add_drain_all_cpuslocked(void);
>  extern void rotate_reclaimable_page(struct page *page);
>  extern void deactivate_file_page(struct page *page);
>  extern void mark_page_lazyfree(struct page *page);
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -687,7 +687,7 @@ static void lru_add_drain_per_cpu(struct
>  
>  static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
>  
> -void lru_add_drain_all(void)
> +void lru_add_drain_all_cpuslocked(void)
>  {
>   static DEFINE_MUTEX(lock);
>   static struct cpumask has_work;
> @@ -701,7 +701,6 @@ void lru_add_drain_all(void)
>   return;
>  
>   mutex_lock();
> - get_online_cpus();
>   cpumask_clear(_work);
>  
>   for_each_online_cpu(cpu) {
> @@ -721,10 +720,16 @@ void lru_add_drain_all(void)
>   for_each_cpu(cpu, _work)
>   flush_work(_cpu(lru_add_drain_work, cpu));
>  
> - put_online_cpus();
>   mutex_unlock();
>  }
>  
> +void lru_add_drain_all(void)
> +{
> + get_online_cpus();
> + lru_add_drain_all_cpuslocked();
> + put_online_cpus();
> +}
> +
>  /**
>   * release_pages - batched put_page()
>   * @pages: array of pages to release
> 

-- 
Michal Hocko
SUSE Labs

Re: Question regarding MAX_ARG_STRLEN with execve()

2017-07-04 Thread Anshuman Khandual

On 07/03/2017 02:51 PM, Michal Hocko wrote:
> On Mon 03-07-17 13:58:59, Anshuman Khandual wrote:
>> On 06/30/2017 07:52 PM, Michal Hocko wrote:
>>> On Fri 30-06-17 11:59:37, Anshuman Khandual wrote:
 Hello,

 execve() system call should support argument length of
 MAX_ARG_STRLEN (PAGE_SIZE * 32). On 64K page size systems, we
 are not able to pass 32 * PAGE_SIZE arguments into the execve()
 system call because of the following reasons.

 * struct linux_binprm's vma starts with a size of PAGE_SIZE

vma->vm_end = STACK_TOP_MAX;
vma->vm_start = vma->vm_end - PAGE_SIZE;

 * The VMA expands as much depending upon the argument size. So
   for 32 * PAGE_SIZE argument, it becomes 33 * PAGE_SIZE.

 * 33 * PAGE_SIZE with 64K pages fails the following test in
   get_arg_page() function. 33 * PAGE_SIZE is more than 2MB
   (8 MB /4) with 64K page size.

if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4)

 * Right now RLIMIT_STACK is hard coded 8MB which does not take
   PAGE_SIZE into account. 

 Wondering what should be the solution for this problem ?

 * Change the default stack size from 8MB ?
>>> just increase the ulimit if you want to use such a large arguments.
>>>
>>
>> Yeah that is possible but it does not still offset the fact that
>> the calculation is broken on the page size of 64K. I mean, yeah
>> its not practical to have such a large argument. But the point
>> is whether we would want to support the MAX_ARG_STRLEN semantic
>> for execve system call or not. At present its broken for 64K
>> and I am asking whether we will be willing to revisit the
>> '1/4th of the stack' condition.
> 
> I dunno. We have this 1/4 of RLIMIT semantic for years and it doesn't
> seem there were any bug reports. Yes, MAX_ARG_STRLEN being PAGE_SIZE
> dependent is unfortunate because it makes an arch independent default
> ulimit hard to get right but I am not sure we actually have to lose
> sleep over this.

I understand your point.

> 
> Or do you have any specific proposal how to "fix" this limitation which
> wouldn't break other userspace?

There are three variables here MAX_ARG_STRLEN, RLIMIT_STACK and the 25%
condition. Execve() is supporting MAX_ARG_STRLEN for a long time, hence
it cannot be changed now. That leaves us to change either the default
RLIMIT_STACK value or the 25% condition. Both are kernel internal
implementation. But I am not sure how changing them might affect any
other userspace behavior, hence asking for suggestions. I just wanted
to explore the possibilities of a fix here.

Re: [PATCH v2] sched/pelt: fix false running accounting

2017-07-04 Thread Peter Zijlstra

On Tue, Jul 04, 2017 at 11:57:12AM +0200, Vincent Guittot wrote:
> On 4 July 2017 at 11:44, Peter Zijlstra  wrote:

> > But but but, how can that happen? Should it not all be under the same
> > rq->lock and thus have only a single update_rq_clock() and thus be at
> > the same 'instant' ?
> 
> idle_balance() unlock rq->lock before calling  update_blocked_averages
> And update_blocked_averages() starts by calling update_rq_clock()

Ah indeed. Might want to clarify that point.

RE: [Patch v2 2/3] thermal: hisilicon: add thermal sensor driver for Hi3660

2017-07-04 Thread Wangtao (Kevin, Kirin)


On 2017/7/1 11:06, "Eduardo Valentin"  wrote:> 
> Hey Tao,
> 
> On Thu, Jun 22, 2017 at 11:42:02AM +0800, Tao Wang wrote:
> > This patch adds the support for thermal sensor of Hi3660 SoC.
> > this will register sensors for thermal framework and use device
> > tree to bind cooling device.
> >
> > Signed-off-by: Tao Wang 
> > Signed-off-by: Leo Yan 
> > ---
> > Changes in v2:
> > - correct alphabet order
> > - correct compatible name
> >
> >  drivers/thermal/Kconfig  |   10 ++
> >  drivers/thermal/Makefile |1 +
> >  drivers/thermal/hi3660_thermal.c |  198
> ++
> >  3 files changed, 209 insertions(+)
> >  create mode 100644 drivers/thermal/hi3660_thermal.c
> >
> > diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> > index b5b5fac..ed22a90 100644
> > --- a/drivers/thermal/Kconfig
> > +++ b/drivers/thermal/Kconfig
> > @@ -192,6 +192,16 @@ config THERMAL_EMULATION
> >   because userland can easily disable the thermal policy by simply
> >   flooding this sysfs node with low temperature values.
> >
> > +config HI3660_THERMAL
> > +   tristate "Hi3660 thermal driver"
> > +   depends on ARCH_HISI || COMPILE_TEST
> > +   depends on HAS_IOMEM
> > +   depends on OF
> > +   default y
> > +   help
> > + Enable this to plug Hi3660 thermal driver into the Linux thermal
> > + framework.
> > +
> >  config HISI_THERMAL
> > tristate "Hisilicon thermal driver"
> > depends on ARCH_HISI || COMPILE_TEST
> > diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> > index 094d703..f29d0a5 100644
> > --- a/drivers/thermal/Makefile
> > +++ b/drivers/thermal/Makefile
> > @@ -55,6 +55,7 @@ obj-$(CONFIG_INTEL_PCH_THERMAL)   +=
> intel_pch_thermal.o
> >  obj-$(CONFIG_ST_THERMAL)   += st/
> >  obj-$(CONFIG_QCOM_TSENS)   += qcom/
> >  obj-$(CONFIG_TEGRA_SOCTHERM)   += tegra/
> > +obj-$(CONFIG_HI3660_THERMAL)   += hi3660_thermal.o
> >  obj-$(CONFIG_HISI_THERMAL) += hisi_thermal.o
> >  obj-$(CONFIG_MTK_THERMAL)  += mtk_thermal.o
> >  obj-$(CONFIG_GENERIC_ADC_THERMAL)  += thermal-generic-adc.o
> > diff --git a/drivers/thermal/hi3660_thermal.c
> b/drivers/thermal/hi3660_thermal.c
> > new file mode 100644
> > index 000..68fa9018
> > --- /dev/null
> > +++ b/drivers/thermal/hi3660_thermal.c
> > @@ -0,0 +1,198 @@
> > +/*
> > + *  linux/drivers/thermal/hi3660_thermal.c
> > + *
> > + *  Copyright (c) 2017 Hisilicon Limited.
> > + *  Copyright (c) 2017 Linaro Limited.
> > + *
> > + *  Author: Tao Wang 
> > + *  Author: Leo Yan 
> > + *
> > + *  This program is free software; you can redistribute it and/or modify
> > + *  it under the terms of the GNU General Public License as published by
> > + *  the Free Software Foundation; version 2 of the License.
> > + *
> > + *  This program is distributed in the hope that it will be useful,
> > + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + *  GNU General Public License for more details.
> > + *
> > + *  You should have received a copy of the GNU General Public License
> > + *  along with this program.  If not, see .
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "thermal_core.h"
> > +
> > +#define HW_MAX_SENSORS 4
> > +#define HISI_MAX_SENSORS   6
> > +#define SENSOR_MAX 4
> > +#define SENSOR_AVG 5
> > +
> > +#define ADC_MIN116
> > +#define ADC_MAX922
> > +
> > +/* hi3660 Thermal Sensor Dev Structure */
> > +struct hi3660_thermal_sensor {
> > +   struct hi3660_thermal_data *thermal;
> > +   struct thermal_zone_device *tzd;
> > +
> > +   uint32_t id;
> > +};
> > +
> > +struct hi3660_thermal_data {
> > +   struct platform_device *pdev;
> > +   struct hi3660_thermal_sensor sensors[HISI_MAX_SENSORS];
> > +   void __iomem *thermal_base;
> > +};
> > +
> > +unsigned int sensor_reg_offset[HW_MAX_SENSORS] = { 0x1c, 0x5c, 0x9c,
> 0xdc };
> > +
> > +
> > +static int hi3660_thermal_get_temp(void *_sensor, int *temp)
> > +{
> > +   struct hi3660_thermal_sensor *sensor = _sensor;
> > +   struct hi3660_thermal_data *data = sensor->thermal;
> > +   unsigned int idx;
> > +   int val, average = 0, max = 0;
> > +
> > +   if (sensor->id < HW_MAX_SENSORS) {
> > +   val = readl(data->thermal_base + sensor_reg_offset[sensor-
> >id]);
> > +   val = clamp_val(val, ADC_MIN, ADC_MAX);
> > +   } else {
> > +   for (idx = 0; idx < HW_MAX_SENSORS; idx++) {
> > +   val = readl(data->thermal_base
> > +   + sensor_reg_offset[idx]);
> > +   val = clamp_val(val, ADC_MIN, ADC_MAX);
> > +

Re: [PATCH mm] introduce reverse buddy concept to reduce buddy fragment

2017-07-04 Thread Michal Hocko

On Tue 04-07-17 16:04:52, zhouxianrong wrote:
> every 2s i sample /proc/buddyinfo in the whole test process.
> 
> the last about 90 samples were sampled after the test was done.

I've tried to explain to you that numbers without a proper testing
metodology and highlevel metrics you are interested in and comparision
to the base kernel are meaningless. I cannot draw any conclusion from
looking at numbers you have posted. Are high order allocations cheaper
to do with this patch? What about an averge order-0 allocation request?

You are touching memory allocator hot paths and those are really
sensitive to changes. It takes a lot of testing with different workloads
to prove that no new regressions are introduced. That being said, I
completely agree that reducing the memory fragmentation is an important
objective but touching the page allocator and adding new branches there
sounds like a problematic approach which would have to show _huge_
benefits to be mergeable. Is it possible to improve khugepaged to
accomplish the same thing?
-- 
Michal Hocko
SUSE Labs

[PATCH] btrfs: resume qgroup rescan on rw remount

2017-07-04 Thread Aleksa Sarai

Several distributions mount the "proper root" as ro during initrd and
then remount it as rw before pivot_root(2). Thus, if a rescan had been
aborted by a previous shutdown, the rescan would never be resumed.

This issue would manifest itself as several btrfs ioctl(2)s causing the
entire machine to hang when btrfs_qgroup_wait_for_completion was hit
(due to the fs_info->qgroup_rescan_running flag being set but the rescan
itself not being resumed). Notably, Docker's btrfs storage driver makes
regular use of BTRFS_QUOTA_CTL_DISABLE and BTRFS_IOC_QUOTA_RESCAN_WAIT
(causing this problem to be manifested on boot for some machines).

Cc:  # v3.11+
Cc: Jeff Mahoney 
Fixes: b382a324b60f ("Btrfs: fix qgroup rescan resume on mount")
Signed-off-by: Aleksa Sarai 
---
 fs/btrfs/super.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 6346876c97ea..ff6690389343 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1821,6 +1821,8 @@ static int btrfs_remount(struct super_block *sb, int 
*flags, char *data)
goto restore;
}
 
+   btrfs_qgroup_rescan_resume(fs_info);
+
if (!fs_info->uuid_root) {
btrfs_info(fs_info, "creating UUID tree");
ret = btrfs_create_uuid_tree(fs_info);
-- 
2.13.2

< 1 2 3 4 5 6 7 8 9 10 >

101 - 200 of 1354 matches

Mail list logo