Re: [PATCHv7 00/29] THP-enabled tmpfs/shmem using compound pages

2016-04-23 Thread Wincy Van
On Wed, Apr 20, 2016 at 1:07 AM, Andres Lagar-Cavilla
 wrote:
> Andrea, we provide the, ahem, adjustments to
> transparent_hugepage_adjust. Rest assured we aggressively use mmu
> notifiers with no further changes required.
>
> As in: zero changes have been required in the lifetime (years) of
> kvm+huge tmpfs at Google, other than mod'ing
> transparent_hugepage_adjust.

We are using kvm + tmpfs to do QEMU live upgrades. How does Google
use this memory model?
I think our purpose in using tmpfs may be the same.

And huge tmpfs is a really good improvement for that.

>
> As noted by Paolo, the additions to transparent_hugepage_adjust could
> be lifted outside of kvm (into shmem.c? maybe) for any consumer of
> huge tmpfs with mmu notifiers.
>

Thanks,
Wincy


Re: [PATCH v8 1/4] gadget: Introduce the usb charger framework

2016-04-23 Thread Baolin Wang
Hi Pavel,

On 24 April 2016 at 03:53, Pavel Machek  wrote:
> Hi!
>
>> +/*
>> + * Sysfs attributes:
>> + *
>> + * These sysfs attributes are used for showing and setting different type
>> + * (SDP/DCP/CDP/ACA) chargers' current limitation.
>> + */
>> +static ssize_t sdp_limit_show(struct device *dev,
>> +   struct device_attribute *attr,
>> +   char *buf)
>> +{
>> + struct usb_charger *uchger = dev_to_uchger(dev);
>> +
>> + return sprintf(buf, "%d\n", uchger->cur_limit.sdp_cur_limit);
>> +}
>> +
>
> Sysfs attributes... can we get appropriate documentation for them?

I've sent out the v10 patchset, which fixes this issue. Thanks.

>
> Thanks,
> Pavel
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) 
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



-- 
Baolin.wang
Best Regards


Re: [PATCH RFC 4/8] mtd: spi-nor: fix support of Dual (x-y-2) and Quad (x-y-4) SPI protocols

2016-04-23 Thread R, Vignesh
Hi Cyrille,

On 4/13/2016 10:53 PM, Cyrille Pitchen wrote:
[...]

> +
> +static int spi_nor_setup(struct spi_nor *nor, const struct flash_info *info,
> +  const struct spi_nor_basic_flash_parameter *params,
> +  const struct spi_nor_modes *modes)
> +{
> + bool enable_quad_io, enable_4_4_4, enable_2_2_2;
> + u32 rd_modes, wr_modes, cmd_modes, mask;
> + const struct spi_nor_erase_type *erase_type;
> + const struct spi_nor_read *read;
> + int rd_midx, wr_midx, err = 0;
> +
> + /* 2-2-2 or 4-4-4 modes must be supported by BOTH read and write */
> + mask = (SNOR_MODE_2_2_2 | SNOR_MODE_4_4_4);
> + cmd_modes = (modes->rd_modes & modes->wr_modes) & mask;
> + rd_modes = (modes->rd_modes & ~mask) | cmd_modes;
> + wr_modes = (modes->wr_modes & ~mask) | cmd_modes;
> +
> + /* Setup read operation. */
> + rd_midx = fls(params->rd_modes & rd_modes) - 1;
> + if (spi_nor_midx2proto(rd_midx, &nor->read_proto)) {
> + dev_err(nor->dev, "invalid (fast) read\n");
> + return -EINVAL;
> + }
> + read = &params->reads[rd_midx];
> + nor->read_opcode = read->opcode;
> + nor->read_dummy = read->num_mode_clocks + read->num_wait_states;
> +
> + /* Set page program op code and protocol. */
> + wr_midx = fls(params->wr_modes & wr_modes) - 1;
> + if (spi_nor_midx2proto(wr_midx, &nor->write_proto)) {
> + dev_err(nor->dev, "invalid page program\n");
> + return -EINVAL;
> + }
> + nor->program_opcode = params->page_programs[wr_midx];
> +
> + /* Set sector erase op code and size. */
> + erase_type = &params->erase_types[0];
> +#ifdef CONFIG_MTD_SPI_NOR_USE_4K_SECTORS
> + for (i = 1; i < SNOR_MAX_ERASE_TYPES; ++i)

^^^ 'i' is undefined here.

> + if (params->erase_types[i].size == 0x0c)
> + erase_type = &params->erase_types[i];
> +#endif
> + nor->erase_opcode = erase_type->opcode;
> + nor->mtd.erasesize = (1 << erase_type->size);
> +
> +
> + enable_quad_io = (SNOR_PROTO_DATA_FROM_PROTO(nor->read_proto) == 4 ||
> +   SNOR_PROTO_DATA_FROM_PROTO(nor->write_proto) == 4);
> + enable_4_4_4 = (nor->read_proto == SNOR_PROTO_4_4_4);
> + enable_2_2_2 = (nor->read_proto == SNOR_PROTO_2_2_2);
> +
> + /* Enable Quad I/O if needed. */
> + if ((enable_quad_io || enable_4_4_4) &&
> + params->enable_quad_io &&
> + nor->reg_proto != SNOR_PROTO_4_4_4) {
> + err = params->enable_quad_io(nor, true);
> + if (err) {
> + dev_err(nor->dev,
> + "failed to enable the Quad I/O mode\n");
> + return err;
> + }
> + }
> +
> + /* Enter/Leave 2-2-2 or 4-4-4 if needed. */
> + if (enable_2_2_2 && params->enable_2_2_2 &&
> + nor->reg_proto != SNOR_PROTO_2_2_2)
> + err = params->enable_2_2_2(nor, true);
> + else if (enable_4_4_4 && params->enable_4_4_4 &&
> +  nor->reg_proto != SNOR_PROTO_4_4_4)
> + err = params->enable_4_4_4(nor, true);
> + else if (!enable_2_2_2 && params->enable_2_2_2 &&
> +  nor->reg_proto == SNOR_PROTO_2_2_2)
> + err = params->enable_2_2_2(nor, false);
> + else if (!enable_4_4_4 && params->enable_4_4_4 &&
> +  nor->reg_proto == SNOR_PROTO_4_4_4)
> + err = params->enable_4_4_4(nor, false);
> + if (err)
> + return err;
> +
> + /*
> +  * Fix erase protocol if needed, read and write protocols should
> +  * already be valid.
> +  */
> + switch (nor->reg_proto) {
> + case SNOR_PROTO_4_4_4:
> + nor->erase_proto = SNOR_PROTO_4_4_4;
> + break;
> +
> + case SNOR_PROTO_2_2_2:
> + nor->erase_proto = SNOR_PROTO_2_2_2;
> + break;
> +
> + default:
> + nor->erase_proto = SNOR_PROTO_1_1_1;
> + break;
> + }
> +
> + dev_dbg(nor->dev,
> + "(Fast) Read:  opcode=%02Xh, protocol=%03x, mode=%u, wait=%u\n",
> + nor->read_opcode, nor->read_proto,
> + read->num_mode_clocks, read->num_wait_states);
> + dev_dbg(nor->dev,
> + "Page Program: opcode=%02Xh, protocol=%03x\n",
> + nor->program_opcode, nor->write_proto);
> + dev_dbg(nor->dev,
> + "Sector Erase: opcode=%02Xh, protocol=%03x, sector size=%zu\n",
> + nor->erase_opcode, nor->erase_proto, nor->mtd.erasesize);
> +
> + return 0;
> +}
> +
> +int spi_nor_scan(struct spi_nor *nor, const char *name,
> +  struct spi_nor_modes *modes)
> +{
> + const struct spi_nor_basic_flash_parameter *params = NULL;
>   const struct flash_info *info = NULL;
>   struct device *dev = nor->dev;
>   struct mtd_info *mtd = &nor->mtd;
> @@ -1342,11 +1483,17 @@ int spi_nor_scan(struct spi_nor *nor, const char *name, enum read_mode mode)
>   if (ret)
>
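
For the undefined 'i' flagged in the review above, here is a minimal user-space sketch of the erase-type selection with the loop index declared locally. It is not the driver code: struct erase_type, pick_erase_type(), erase_size_bytes(), the want_4k flag (standing in for CONFIG_MTD_SPI_NOR_USE_4K_SECTORS) and the value of SNOR_MAX_ERASE_TYPES are all assumptions made for illustration. The only things taken from the quoted hunk are the loop shape, the 0x0c comparison, and the (1 << size) erase size computation.

```c
#include <stddef.h>
#include <stdint.h>

#define SNOR_MAX_ERASE_TYPES 4	/* assumed value, for illustration only */

/* Hypothetical stand-in for struct spi_nor_erase_type from the patch. */
struct erase_type {
	uint8_t size;	/* log2 of the erase size: 0x0c means 1 << 12 bytes */
	uint8_t opcode;
};

/*
 * Pick the 4 KiB erase type if the flash offers one, otherwise fall back
 * to entry 0. The loop index is declared here, which is what the quoted
 * hunk is missing.
 */
static const struct erase_type *
pick_erase_type(const struct erase_type types[SNOR_MAX_ERASE_TYPES],
		int want_4k)
{
	const struct erase_type *chosen = &types[0];
	int i;

	if (want_4k) {
		for (i = 1; i < SNOR_MAX_ERASE_TYPES; ++i)
			if (types[i].size == 0x0c)
				chosen = &types[i];
	}
	return chosen;
}

/* Same computation as nor->mtd.erasesize = (1 << erase_type->size). */
static uint32_t erase_size_bytes(const struct erase_type *t)
{
	return (uint32_t)1 << t->size;
}
```

With a table whose entry 0 has size 0x10 and entry 1 has size 0x0c, the sketch selects entry 1 when want_4k is set and entry 0 otherwise.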

Re: [RFC][PATCH v6 0/2] printk: Make printk() completely async

2016-04-23 Thread Sergey Senozhatsky
On (04/23/16 21:40), Pavel Machek wrote:
[..]
> > > The patch set is against next-20160321
> > > 
> > > the series in total has 3 patches:
> > > - printk: Make printk() completely async
> > > - printk: Make wake_up_klogd_work_func() async
> > > - printk: make console_unlock() async
> > > 
> > > per discussion, "printk: make console_unlock() async" will be posted
> > > later on.
> > 
> > Patches look good to me. I don't think you need to mention the
> > console_unlock() async patch when it is not part of the series.  BTW, you
> > seemed to have dropped my patch to skip if there are too many buffered
> > messages when oops is in progress. Any reason for that?
> 
> So... from basically linux 0.0, cli() printk("") could be used for
> debugging. ... and that's now gone. Right?
> 
> Can you explain why that is good idea?

It's not gone. You need to explicitly enable async printk mode. The case
you mentioned -- cli() printk("")->console_unlock() -- apart from being
useful in some scenarios, can cause problems in others, simply because
under some circumstances it can run forever, as long as there are printk()
calls coming from other CPUs (which can happen during, e.g., debugging).
Did you mean UP systems? Well, async printk is sort of useless on UP systems
anyway.

-ss
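
To illustrate the "can run forever" point in the reply above, here is a toy user-space model, not kernel code. struct toy_log, toy_printk() and toy_console_unlock() are hypothetical names, and the burst/busy_rounds parameters are an assumption used to simulate other CPUs logging while one caller is flushing. The model only tracks sequence numbers: the flusher keeps printing until it has caught up, so the amount of work done by that single caller grows with whatever the other CPUs append in the meantime.

```c
#include <stdint.h>

/* Toy model of the log buffer: only sequence numbers, no text. */
struct toy_log {
	uint64_t next_seq;	/* next record a producer will write */
	uint64_t console_seq;	/* next record the console will print */
};

/* Producer side: what a printk() call on any CPU does in this model. */
static void toy_printk(struct toy_log *log)
{
	log->next_seq++;
}

/*
 * Flusher side: keep printing until the console has caught up. After
 * each printed record, 'burst' new records arrive from other CPUs, for
 * the first 'busy_rounds' iterations. Returns how many records this one
 * caller ended up printing.
 */
static uint64_t toy_console_unlock(struct toy_log *log,
				   unsigned int burst,
				   unsigned int busy_rounds)
{
	uint64_t printed = 0;

	while (log->console_seq != log->next_seq) {
		log->console_seq++;		/* "print" one record */
		printed++;
		if (busy_rounds > 0) {		/* other CPUs keep logging */
			unsigned int i;

			for (i = 0; i < burst; i++)
				toy_printk(log);
			busy_rounds--;
		}
	}
	return printed;
}
```

In this model a caller that logged a single record prints only that record when the system is quiet, but it ends up printing every record produced by the other CPUs for as long as they stay busy. With interrupts disabled around the call, that whole time is spent with interrupts off.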


[PATCH v2] mm: fix incorrect pfn passed to untrack_pfn() in remap_pfn_range()

2016-04-23 Thread Yongji Xie
We use generic hooks in remap_pfn_range() to help archs to
track pfnmap regions. The code is something like:

int remap_pfn_range()
{
...
track_pfn_remap(vma, &prot, pfn, addr, PAGE_ALIGN(size));
...
pfn -= addr >> PAGE_SHIFT;
...
untrack_pfn(vma, pfn, PAGE_ALIGN(size));
...
}

Here the pfn is changed after track_pfn_remap() but is not restored
before untrack_pfn() is called, so untrack_pfn() gets the wrong pfn.

Signed-off-by: Yongji Xie 
---
 mm/memory.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 098f00d..eee75ed 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1711,6 +1711,7 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
unsigned long next;
unsigned long end = addr + PAGE_ALIGN(size);
struct mm_struct *mm = vma->vm_mm;
+   unsigned long remap_pfn = pfn;
int err;
 
/*
@@ -1737,7 +1738,7 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
vma->vm_pgoff = pfn;
}
 
-   err = track_pfn_remap(vma, &prot, pfn, addr, PAGE_ALIGN(size));
+   err = track_pfn_remap(vma, &prot, remap_pfn, addr, PAGE_ALIGN(size));
if (err)
return -EINVAL;
 
@@ -1756,7 +1757,7 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
} while (pgd++, addr = next, addr != end);
 
if (err)
-   untrack_pfn(vma, pfn, PAGE_ALIGN(size));
+   untrack_pfn(vma, remap_pfn, PAGE_ALIGN(size));
 
return err;
 }
-- 
1.7.9.5
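
The problem fixed by the patch above can be reproduced in a small user-space model. This is only a sketch: toy_track(), toy_untrack(), toy_remap_buggy(), toy_remap_fixed() and TOY_PAGE_SHIFT are hypothetical stand-ins, the page-table walk is replaced by an unconditional failure, and only the order of operations on pfn mirrors remap_pfn_range().

```c
#define TOY_PAGE_SHIFT 12	/* assumed 4 KiB pages, for illustration */

static unsigned long tracked_pfn;	/* pfn seen by the track hook */
static unsigned long untracked_pfn;	/* pfn seen by the untrack hook */

static void toy_track(unsigned long pfn)
{
	tracked_pfn = pfn;
}

static void toy_untrack(unsigned long pfn)
{
	untracked_pfn = pfn;
}

/* Before the patch: pfn is adjusted in place, then reused on the error path. */
static int toy_remap_buggy(unsigned long addr, unsigned long pfn)
{
	toy_track(pfn);
	pfn -= addr >> TOY_PAGE_SHIFT;
	/* pretend the page-table walk failed */
	toy_untrack(pfn);		/* wrong: no longer the tracked pfn */
	return -1;
}

/* After the patch: the original pfn is kept in a separate local variable. */
static int toy_remap_fixed(unsigned long addr, unsigned long pfn)
{
	unsigned long remap_pfn = pfn;

	toy_track(remap_pfn);
	pfn -= addr >> TOY_PAGE_SHIFT;
	(void)pfn;			/* the walk would use the adjusted pfn */
	/* pretend the page-table walk failed */
	toy_untrack(remap_pfn);		/* same value that was tracked */
	return -1;
}
```

In the buggy variant the two hooks see values that differ by addr >> TOY_PAGE_SHIFT; in the fixed variant they always match, whatever the walk does to pfn.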



Re: [RFC][PATCH v4 0/2] printk: Make printk() completely async

2016-04-23 Thread Sergey Senozhatsky
Hi,

On (04/23/16 21:36), Pavel Machek wrote:
> >  The patch set is based on slightly updated Jan Kara's patches.
> > 
> > This patch set makes printk() completely asynchronous: new messages
> > are getting appended to the kernel printk buffer, but instead of 'direct'
> > printing the actual print job is performed by a dedicated kthread.
> > This has the advantage that printing always happens from a schedulable
> > context and thus we don't lockup any particular CPU or even
> > interrupts.
> 
> And that means that printk() will become mostly useless for debugging,
> right?

What do you mean? By default, printk operates in the 'old' sync mode
and can be switched to async by those who experience problems
with the sync one (whilst actually debugging). I'm not sure I
got your point.

-ss
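
As a rough illustration of the two modes described in the reply above, here is a toy user-space model. It is not the patch set's implementation: struct toy_printk_state, toy_printk() and toy_printk_thread() are hypothetical names, and the dedicated kthread is modeled as a function the caller invokes later. The only behavior taken from the thread is that sync is the default, and that in async mode the message is just appended and printing is left to a separate thread.

```c
#include <stdbool.h>

struct toy_printk_state {
	bool async;			/* false by default: the old sync mode */
	bool thread_woken;		/* printing thread has been asked to run */
	unsigned int queued;		/* records not yet on the console */
	unsigned int printed_by_caller;
	unsigned int printed_by_thread;
};

/* Append one record; in sync mode the caller also prints the backlog. */
static void toy_printk(struct toy_printk_state *s)
{
	s->queued++;
	if (s->async) {
		s->thread_woken = true;	/* defer printing to the thread */
		return;
	}
	s->printed_by_caller += s->queued;
	s->queued = 0;
}

/* Stand-in for the dedicated printing thread, run from a schedulable context. */
static void toy_printk_thread(struct toy_printk_state *s)
{
	if (!s->thread_woken)
		return;
	s->printed_by_thread += s->queued;
	s->queued = 0;
	s->thread_woken = false;
}
```

With a zero-initialized state, each toy_printk() call prints immediately, which is the behavior relied on for debugging. Only after the async flag is set explicitly does the output wait for toy_printk_thread().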


Re: [PATCH] mm: Fix incorrect pfn passed to untrack_pfn in remap_pfn_range

2016-04-23 Thread Yongji Xie

On 2016/4/23 2:38, Andrew Morton wrote:
> On Fri, 22 Apr 2016 18:31:28 +0800 Yongji Xie  wrote:
>
>> We used generic hooks in remap_pfn_range to help archs to
>> track pfnmap regions. The code is something like:
>>
>> int remap_pfn_range()
>> {
>> ...
>> track_pfn_remap(vma, &prot, pfn, addr, PAGE_ALIGN(size));
>> ...
>> pfn -= addr >> PAGE_SHIFT;
>> ...
>> untrack_pfn(vma, pfn, PAGE_ALIGN(size));
>> ...
>> }
>>
>> Here we can easily find the pfn is changed but not recovered
>> before untrack_pfn() is called. That's incorrect.
>
> What are the runtime effects of this bug?

None observed; this is just a fix in theory :-)

>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -1755,6 +1755,7 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
>> break;
>> } while (pgd++, addr = next, addr != end);
>>
>> +	pfn += (end - PAGE_ALIGN(size)) >> PAGE_SHIFT;
>>
>> if (err)
>> untrack_pfn(vma, pfn, PAGE_ALIGN(size));
>
> I'm having trouble understanding this.  Wouldn't it be better to simply
> save the track_pfn_remap() call's `pfn' arg in a new local variable?

Yes, it's a little difficult to understand this. I will send a v2 soon.

Thanks,
Yongji
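
For readers puzzled by the v1 expression discussed above, here is a small worked check in user-space C. It assumes, as in the v2 patch context, that end is computed as addr + PAGE_ALIGN(size) at function entry, so end - PAGE_ALIGN(size) is the original addr even after the walk has advanced addr. toy_v1_restore() and the TOY_* macros are hypothetical names with 4 KiB pages assumed.

```c
#define TOY_PAGE_SHIFT	12UL
#define TOY_PAGE_SIZE	(1UL << TOY_PAGE_SHIFT)
#define TOY_PAGE_ALIGN(x) \
	(((x) + TOY_PAGE_SIZE - 1) & ~(TOY_PAGE_SIZE - 1))

/*
 * Apply the in-place adjustment from remap_pfn_range() and then the v1
 * restore step; returns the pfn that untrack_pfn() would receive.
 */
static unsigned long toy_v1_restore(unsigned long addr, unsigned long size,
				    unsigned long pfn)
{
	unsigned long end = addr + TOY_PAGE_ALIGN(size);

	pfn -= addr >> TOY_PAGE_SHIFT;	/* adjustment made before the walk */
	addr = end;			/* the walk advances addr */
	(void)addr;
	/* v1: end - PAGE_ALIGN(size) is the addr we started with */
	pfn += (end - TOY_PAGE_ALIGN(size)) >> TOY_PAGE_SHIFT;
	return pfn;
}
```

For addr = 0x7f0000, size = 5000 and pfn = 0x1234, the adjustment subtracts 0x7f0 and the restore adds 0x7f0 back, giving 0x1234 again. The arithmetic is correct but indirect, which is why keeping the original pfn in a local variable is easier to follow.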



Re: [PATCH v6 10/10] clocksource: arm_arch_timer: Remove arch_timer_get_timecounter

2016-04-23 Thread Shanker Donthineni


On 04/11/2016 10:33 AM, Julien Grall wrote:
> The only call of arch_timer_get_timecounter (in KVM) has been removed.
>
> Signed-off-by: Julien Grall 
> Acked-by: Christoffer Dall 

Tested-by: Shanker Donthineni 

Using the Qualcomm Technologies QDF2XXX server.

-- 
Shanker Donthineni
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project



Re: [PATCH v6 09/10] KVM: arm/arm64: vgic: Rely on the GIC driver to parse the firmware tables

2016-04-23 Thread Shanker Donthineni
Hi Julien,


On 04/11/2016 10:32 AM, Julien Grall wrote:
> Currently, the firmware tables are parsed 2 times: once in the GIC
> drivers, the other time when initializing the vGIC. It means code
> duplication and makes it more tedious to add support for another
> firmware table (like ACPI).
>
> Use the recently introduced helper gic_get_kvm_info() to get
> information about the virtual GIC.
>
> With this change, the virtual GIC becomes agnostic to the firmware
> table and KVM will be able to initialize the vGIC on ACPI.
>
> Signed-off-by: Julien Grall 
> Reviewed-by: Christoffer Dall 
>
Tested-by: Shanker Donthineni 

Using the Qualcomm Technologies QDF2XXX server with PAGE_SIZE=4K.
> ---
> Cc: Marc Zyngier 
> Cc: Gleb Natapov 
> Cc: Paolo Bonzini 
>
> Changes in v6:
> - Add Christoffer's reviewed-by
>
> Changes in v4:
> - Remove validation check as they are already done during parsing.
> - Move the alignment check from the parsing to the vGIC code.
> - Fix typo in the commit message
>
> Changes in v2:
> - Use 0 rather than a negative value to know when the maintenance IRQ
>   is not present.
> - Use resource for vcpu and vctrl.
> ---
>  include/kvm/arm_vgic.h |  7 +++---
>  virt/kvm/arm/vgic-v2.c | 61 +-
>  virt/kvm/arm/vgic-v3.c | 47 +-
>  virt/kvm/arm/vgic.c    | 50 ++---
>  4 files changed, 73 insertions(+), 92 deletions(-)
>
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 281caf8..be6037a 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define VGIC_NR_IRQS_LEGACY  256
>  #define VGIC_NR_SGIS 16
> @@ -353,15 +354,15 @@ bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu,
> struct irq_phys_map *map);
>  #define vgic_initialized(k)  (!!((k)->arch.vgic.nr_cpus))
>  #define vgic_ready(k)((k)->arch.vgic.ready)
>  
> -int vgic_v2_probe(struct device_node *vgic_node,
> +int vgic_v2_probe(const struct gic_kvm_info *gic_kvm_info,
> const struct vgic_ops **ops,
> const struct vgic_params **params);
>  #ifdef CONFIG_KVM_ARM_VGIC_V3
> -int vgic_v3_probe(struct device_node *vgic_node,
> +int vgic_v3_probe(const struct gic_kvm_info *gic_kvm_info,
> const struct vgic_ops **ops,
> const struct vgic_params **params);
>  #else
> -static inline int vgic_v3_probe(struct device_node *vgic_node,
> +static inline int vgic_v3_probe(const struct gic_kvm_info *gic_kvm_info,
>   const struct vgic_ops **ops,
>   const struct vgic_params **params)
>  {
> diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
> index 67ec334..7e826c9 100644
> --- a/virt/kvm/arm/vgic-v2.c
> +++ b/virt/kvm/arm/vgic-v2.c
> @@ -20,9 +20,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
> -#include 
> -#include 
>  
>  #include 
>  
> @@ -186,38 +183,39 @@ static void vgic_cpu_init_lrs(void *params)
>  }
>  
>  /**
> - * vgic_v2_probe - probe for a GICv2 compatible interrupt controller in
> DT
> - * @node:pointer to the DT node
> - * @ops: address of a pointer to the GICv2 operations
> - * @params:  address of a pointer to HW-specific parameters
> + * vgic_v2_probe - probe for a GICv2 compatible interrupt controller
> + * @gic_kvm_info:pointer to the GIC description
> + * @ops: address of a pointer to the GICv2 operations
> + * @params:  address of a pointer to HW-specific parameters
>   *
>   * Returns 0 if a GICv2 has been found, with the low level operations
>   * in *ops and the HW parameters in *params. Returns an error code
>   * otherwise.
>   */
> -int vgic_v2_probe(struct device_node *vgic_node,
> -   const struct vgic_ops **ops,
> -   const struct vgic_params **params)
> +int vgic_v2_probe(const struct gic_kvm_info *gic_kvm_info,
> +const struct vgic_ops **ops,
> +const struct vgic_params **params)
>  {
>   int ret;
> - struct resource vctrl_res;
> - struct resource vcpu_res;
>   struct vgic_params *vgic = &vgic_v2_params;
> + const struct resource *vctrl_res = &gic_kvm_info->vctrl;
> + const struct resource *vcpu_res = &gic_kvm_info->vcpu;
>  
> - vgic->maint_irq = irq_of_parse_and_map(vgic_node, 0);
> - if (!vgic->maint_irq) {
> - kvm_err("error getting vgic maintenance irq from DT\n");
> + if (!gic_kvm_info->maint_irq) {
> + kvm_err("error getting vgic maintenance irq\n");
>   ret = -ENXIO;
>   goto out;
>   }
> + vgic->maint_irq = gic_kvm_info->maint_irq;
>  
> - ret = of_address_to_resource(vgic_node, 2, &vctrl_res);
> - if (ret) {
> - kvm_err("Cannot obtain GICH resource\n");
> + if 

Re: [PATCH v2] lib: make sg_pool tristate instead of bool

2016-04-23 Thread Ming Lin
On Sat, Apr 23, 2016 at 7:44 PM, Paul Gortmaker
 wrote:
> The recently added Kconfig controlling compilation of this code is:
>
> lib/Kconfig:config SG_POOL
> lib/Kconfig:def_bool n
>
> ...meaning that it currently is not being built as a module by anyone,
> and that tripped my audit looking for modular code that is essentially
> orphaned (i.e. module_exit, and .remove fcns in non-modular drivers.)
>
> In the following discussion, Ming Lin indicated that the original
> intention was to have it tristate, so here we convert it accordingly.
>
> Also fix up a couple spelling issues that appear in the surrounding
> patch context.
>
> Cc: Christoph Hellwig 
> Cc: Ming Lin 
> Cc: Sagi Grimberg 
> Cc: Martin K. Petersen 
> Signed-off-by: Paul Gortmaker 
> ---
>
> [v2: drop modular code removal patch in favour of supporting a modular
>  build via a one line Kconfig patch as per Ming's comments.  Build tested
>  for allmodconfig on ARM and x86-64 on linux-next.   ]
>
>  lib/Kconfig | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/lib/Kconfig b/lib/Kconfig
> index e04f168f8f42..8de5868804b5 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -528,13 +528,13 @@ config SG_SPLIT
> help
>  Provides a helper to split scatterlists into chunks, each chunk being
>  a scatterlist. This should be selected by a driver or an API which
> -whishes to split a scatterlist amongst multiple DMA channels.
> +wishes to split a scatterlist amongst multiple DMA channels.
>
>  config SG_POOL
> -   def_bool n
> +   def_tristate n
> help
>  Provides a helper to allocate chained scatterlists. This should be
> -selected by a driver or an API which whishes to allocate chained
> +selected by a driver or an API which wishes to allocate chained
>  scatterlist.
>
>  #

Looks good.

Acked-by: Ming Lin 

Thanks.


Re: [PATCH v6 08/10] KVM: arm/arm64: arch_timer: Rely on the arch timer to parse the firmware tables

2016-04-23 Thread Shanker Donthineni


On 04/11/2016 10:32 AM, Julien Grall wrote:
> The firmware table is currently parsed by the virtual timer code in
> order to retrieve the virtual timer interrupt. However, this is already
> done by the arch timer driver.
>
> To avoid code duplication, use the new function
> arch_timer_get_kvm_info(),
> which returns all the information required by the virtual timer code.
>
> Signed-off-by: Julien Grall 
> Acked-by: Christoffer Dall 
>
Tested-by: Shanker Donthineni 

Using the Qualcomm Technologies QDF2XXX server platform.

-- 
Shanker Donthineni
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project



Re: [PATCH v6 07/10] irqchip/gic-v3: Parse and export virtual GIC information

2016-04-23 Thread Shanker Donthineni


On 04/11/2016 10:32 AM, Julien Grall wrote:
> Fill up the recently introduced gic_kvm_info with the hardware
> information used for virtualization.
>
> Signed-off-by: Julien Grall 
> Cc: Thomas Gleixner 
> Cc: Jason Cooper 
> Cc: Marc Zyngier 
>
Tested-by: Shanker Donthineni 

Using the Qualcomm Technologies QDF2XXX server platform.

-- 
Shanker Donthineni
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project



Re: [PATCH v6 06/10] irqchip/gic-v3: Gather all ACPI specific data in a single structure

2016-04-23 Thread Shanker Donthineni


On 04/11/2016 10:32 AM, Julien Grall wrote:
> The ACPI code needs to use global variables in order to collect
> information from the tables.
>
> To make clear those variables are ACPI specific, gather all of them in a
> single structure.
>
> Furthermore, even if some of the variables are not marked with
> __initdata, they are all only used during the initialization. Therefore,
> the new variable, which holds the structure, can be marked with
> __initdata.
>
> Signed-off-by: Julien Grall 
> Acked-by: Christoffer Dall 
> Reviewed-by: Hanjun Guo 
>
Tested-by: Shanker Donthineni 

Using the Qualcomm Technologies QDF2XXX server platform.

-- 
Shanker Donthineni
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project



Re: [PATCH v6 04/10] irqchip/gic-v2: Parse and export virtual GIC information

2016-04-23 Thread Shanker Donthineni


On 04/11/2016 10:32 AM, Julien Grall wrote:
> For now, the firmware tables are parsed 2 times: once in the GIC
> drivers, the other time when initializing the vGIC. It means code
> duplication and makes it more tedious to add support for another
> firmware table (like ACPI).
>
> Introduce a new structure and set of helpers to get/set the virtual GIC
> information. Also fill up the structure for GICv2.
>
> Signed-off-by: Julien Grall 
>
Tested-by: Shanker Donthineni 

Using the Qualcomm Technologies QDF2XXX server platform.
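
For anyone reading along without the series at hand, the set/get idea described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions: every name prefixed with demo_ is made up, struct demo_resource is a simplified stand-in for the kernel's struct resource, and the field names are borrowed from the quoted vgic probe code rather than from the actual header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct resource (assumption). */
struct demo_resource {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical mirror of the information the GIC driver hands to KVM. */
struct demo_gic_kvm_info {
	struct demo_resource vcpu;   /* virtual CPU interface region */
	struct demo_resource vctrl;  /* virtual interface control region */
	unsigned int maint_irq;      /* maintenance interrupt, 0 if absent */
};

static const struct demo_gic_kvm_info *demo_info;

/* Producer side: called by the "driver" once the firmware tables are parsed. */
void demo_gic_set_kvm_info(const struct demo_gic_kvm_info *info)
{
	assert(info != NULL);
	demo_info = info;
}

/* Consumer side: returns NULL until the driver has published something. */
const struct demo_gic_kvm_info *demo_gic_get_kvm_info(void)
{
	return demo_info;
}

/* A probe routine that never parses DT or ACPI itself. */
int demo_vgic_probe(unsigned int *maint_irq)
{
	const struct demo_gic_kvm_info *info = demo_gic_get_kvm_info();

	if (!info)
		return -ENODEV;  /* nothing published yet */
	if (!info->maint_irq)
		return -ENXIO;   /* no maintenance interrupt available */
	*maint_irq = info->maint_irq;
	return 0;
}
```

With this split, supporting another firmware table only touches the producer; the probe routine stays the same.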

-- 
Shanker Donthineni
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project



Re: [PATCH v6 03/10] irqchip/gic-v2: Gather ACPI specific data in a single structure

2016-04-23 Thread Shanker Donthineni


On 04/11/2016 10:32 AM, Julien Grall wrote:
> The ACPI code needs to use global variables in order to collect
> information from the tables.
>
> For now, a single global variable is used, but more will be added in a
> subsequent patch. To make clear they are ACPI specific, gather all the
> information in a single structure.
>
> Signed-off-by: Julien Grall 
> Acked-by: Christofer Dall 
> Acked-by: Hanjun Guo 
>
Tested-by: Shanker Donthineni 

Using the Qualcomm Technologies QDF2XXX server platform.

-- 
Shanker Donthineni
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project



Re: [PATCH v6 02/10] clocksource: arm_arch_timer: Extend arch_timer_kvm_info to get the virtual IRQ

2016-04-23 Thread Shanker Donthineni


On 04/11/2016 10:32 AM, Julien Grall wrote:
> Currently, the firmware table is parsed by the virtual timer code in
> order to retrieve the virtual timer interrupt. However, this is already
> done by the arch timer driver.
>
> To avoid code duplication, extend arch_timer_kvm_info to get the virtual
> IRQ.
>
> Note that the KVM code will be modified in a subsequent patch.
>
> Signed-off-by: Julien Grall 
> Acked-by: Christoffer Dall 
>
Tested-by: Shanker Donthineni 

Using the Qualcomm Technologies QDF2XXX server platform.

-- 
Shanker Donthineni
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project



Re: [PATCH v6 01/10] clocksource: arm_arch_timer: Gather KVM specific information in a structure

2016-04-23 Thread Shanker Donthineni


On 04/11/2016 10:32 AM, Julien Grall wrote:
> Introduce a structure which is filled up by the arch timer driver and
> used by the virtual timer in KVM.
>
> The first member of this structure will be the timecounter. More members
> will be added later.
>
> A stub for the new helper isn't introduced because KVM requires the arch
> timer for both ARM64 and ARM32.
>
> The function arch_timer_get_timecounter is kept for the time being and
> will be dropped in a subsequent patch.
>
> Signed-off-by: Julien Grall 
> Acked-by: Christoffer Dall 
>
Tested-by: Shanker Donthineni 

Using the Qualcomm Technologies QDF2XXX server platform.
> ---
> Cc: Daniel Lezcano 
> Cc: Thomas Gleixner 
> Cc: Marc Zyngier 
>
> Changes in v6:
> - Add Christoffer's acked-by
>
> Changes in v3:
> - Rename the patch
> - Move the KVM changes and removal of arch_timer_get_timecounter
> in separate patches.
> ---
>  drivers/clocksource/arm_arch_timer.c | 12 +++++++++---
>  include/clocksource/arm_arch_timer.h |  5 +++++
>  2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/clocksource/arm_arch_timer.c
> b/drivers/clocksource/arm_arch_timer.c
> index 5152b38..62bdfe7 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -468,11 +468,16 @@ static struct cyclecounter cyclecounter = {
>   .mask   = CLOCKSOURCE_MASK(56),
>  };
>  
> -static struct timecounter timecounter;
> +static struct arch_timer_kvm_info arch_timer_kvm_info;
> +
> +struct arch_timer_kvm_info *arch_timer_get_kvm_info(void)
> +{
> + return &arch_timer_kvm_info;
> +}
>  
>  struct timecounter *arch_timer_get_timecounter(void)
>  {
> - return &timecounter;
> + return &arch_timer_kvm_info.timecounter;
>  }
>  
>  static void __init arch_counter_register(unsigned type)
> @@ -500,7 +505,8 @@ static void __init arch_counter_register(unsigned
> type)
>   clocksource_register_hz(&clocksource_counter, arch_timer_rate);
>   cyclecounter.mult = clocksource_counter.mult;
>   cyclecounter.shift = clocksource_counter.shift;
> - timecounter_init(&timecounter, &cyclecounter, start_count);
> + timecounter_init(&arch_timer_kvm_info.timecounter,
> +  &cyclecounter, start_count);
>  
>   /* 56 bits minimum, so we assume worst case rollover */
>   sched_clock_register(arch_timer_read_counter, 56,
> arch_timer_rate);
> diff --git a/include/clocksource/arm_arch_timer.h
> b/include/clocksource/arm_arch_timer.h
> index 25d0914..9101ed6b 100644
> --- a/include/clocksource/arm_arch_timer.h
> +++ b/include/clocksource/arm_arch_timer.h
> @@ -49,11 +49,16 @@ enum arch_timer_reg {
>  
>  #define ARCH_TIMER_EVT_STREAM_FREQ   10000   /* 100us */
>  
> +struct arch_timer_kvm_info {
> + struct timecounter timecounter;
> +};
> +
>  #ifdef CONFIG_ARM_ARCH_TIMER
>  
>  extern u32 arch_timer_get_rate(void);
>  extern u64 (*arch_timer_read_counter)(void);
>  extern struct timecounter *arch_timer_get_timecounter(void);
> +extern struct arch_timer_kvm_info *arch_timer_get_kvm_info(void);
>  
>  #else
>  
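
The transition step in this patch (new accessor added, old one kept for now) can be sketched outside the kernel as follows. This is a minimal model, not the driver code: the demo_ names are invented, and struct demo_timecounter is a cut-down stand-in for the kernel's struct timecounter.

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down stand-in for the kernel's struct timecounter (assumption). */
struct demo_timecounter {
	uint64_t cycle_last;
	uint64_t nsec;
};

/* Wrapper structure: one member today, room for more later. */
struct demo_arch_timer_kvm_info {
	struct demo_timecounter timecounter;
};

static struct demo_arch_timer_kvm_info demo_kvm_info;

/* New accessor: hands out the whole structure. */
struct demo_arch_timer_kvm_info *demo_get_kvm_info(void)
{
	return &demo_kvm_info;
}

/* Old accessor, kept as a thin wrapper so existing callers still build. */
struct demo_timecounter *demo_get_timecounter(void)
{
	return &demo_kvm_info.timecounter;
}

/* Initialisation now targets the member inside the wrapper. */
void demo_counter_register(uint64_t start_count)
{
	struct demo_timecounter *tc = &demo_kvm_info.timecounter;

	tc->cycle_last = start_count;
	tc->nsec = 0;
	/* Both accessors must observe the same object. */
	assert(demo_get_timecounter() == &demo_get_kvm_info()->timecounter);
}
```

Because both accessors resolve to the same storage, callers can be converted one at a time before the old accessor is dropped.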

-- 
Shanker Donthineni
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project



[PATCH v2] lib: make sg_pool tristate instead of bool

2016-04-23 Thread Paul Gortmaker
The recently added Kconfig controlling compilation of this code is:

lib/Kconfig:config SG_POOL
lib/Kconfig:def_bool n

...meaning that it currently is not being built as a module by anyone,
and that tripped my audit looking for modular code that is essentially
orphaned (i.e. module_exit, and .remove fcns in non-modular drivers.)

In the following discussion, Ming Lin indicated that the original
intention was to have it tristate, so here we convert it accordingly.

Also fix up a couple spelling issues that appear in the surrounding
patch context.

Cc: Christoph Hellwig 
Cc: Ming Lin 
Cc: Sagi Grimberg 
Cc: Martin K. Petersen 
Signed-off-by: Paul Gortmaker 
---

[v2: drop modular code removal patch in favour of supporting a modular
 build via a one line Kconfig patch as per Ming's comments.  Build tested
 for allmodconfig on ARM and x86-64 on linux-next.   ]

 lib/Kconfig | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/Kconfig b/lib/Kconfig
index e04f168f8f42..8de5868804b5 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -528,13 +528,13 @@ config SG_SPLIT
help
 Provides a helper to split scatterlists into chunks, each chunk being
 a scatterlist. This should be selected by a driver or an API which
-whishes to split a scatterlist amongst multiple DMA channels.
+wishes to split a scatterlist amongst multiple DMA channels.
 
 config SG_POOL
-   def_bool n
+   def_tristate n
help
 Provides a helper to allocate chained scatterlists. This should be
-selected by a driver or an API which whishes to allocate chained
+selected by a driver or an API which wishes to allocate chained
 scatterlist.
 
 #
-- 
2.8.0



Re: random(4) changes

2016-04-23 Thread Theodore Ts'o
On Fri, Apr 22, 2016 at 06:27:48PM -0400, Sandy Harris wrote:
> 
> I really like Stephan's idea of simplifying the interrupt handling,
> replacing the multiple entropy-gathering calls in the current driver
> with one routine called for all interrupts. See section 1.2 of his
> doc. That seems to me a much cleaner design, easier both to analyse
> and to optimise as a fast interrupt handler.

The current /dev/random driver *already* has a fast interrupt handler,
and it was designed specifically to be very fast and very lightweight.

It's a fair argument that getting rid of add_disk_randomness()
probably makes sense.  However, add_input_randomness() is useful
because it is also mixing in the HID input (e.g., the characters typed
or the mouse movements), and that is extremely valuable and I wouldn't
want to get rid of this.

> In the current driver -- and I think in Stephan's, though I have not
> looked at his code in any detail, only his paper -- heavy use of
> /dev/urandom or the kernel get_random_bytes() call can deplete the
> entropy available to /dev/random. That can be a serious problem in
> some circumstances, but I think I have a fix.

So /dev/urandom, or preferably the getrandom(2) system call,
which will block until the entropy pool is initialized, is designed to
be a CRNG.  We use the entropy accounting for the urandom pool as a
heuristic to know how aggressively to pull the random pool and/or
things like hwrandom (since pulling entropy from the TPM does have
costs, for example power utilization for battery-powered devices).

We already throttle back how much we pull from the input pool if it is
being used heavily, specifically to avoid this problem.
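
For userspace readers, a minimal sketch of calling getrandom(2) with its default blocking behaviour might look like the following. It assumes a libc that ships <sys/random.h> (glibc 2.25 or later); on older systems the same call is reachable through syscall(2). The helper name fill_random() is made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Fill buf with len random bytes using getrandom(2).
 * With flags == 0 the call blocks until the kernel's pool has been
 * initialized and then draws from the same source as /dev/urandom.
 * Returns 0 on success, -1 on error (errno is left as set by getrandom).
 */
int fill_random(void *buf, size_t len)
{
	unsigned char *p = buf;

	assert(buf != NULL);
	while (len > 0) {
		ssize_t n = getrandom(p, len, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;  /* interrupted by a signal, retry */
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

The loop is there because a large request may be returned in pieces or interrupted by a signal.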

Cheers,

- Ted


[PATCH] ALSA: intel_hda - Add dock support for ThinkPad X260

2016-04-23 Thread Conrad Kostecki

Hi!
My shiny new ThinkPad X260 is affected in the same way as the older
ThinkPad generations (X*40, X*50).
That means I am unable to get sound output when using the Lenovo
CES 2013 docking station series (basic, pro, ultra).

It can be fixed the same way as was already done for the X240 and X250,
as the X260 uses the same docking connector.

I am attaching my patch, which works for me.
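
For context, the quirk mechanism boils down to a table keyed on PCI subsystem vendor/device IDs that maps a machine to a fixup. A simplified userspace model is sketched below; the demo_ types and the lookup helper are invented for illustration and are not the ALSA implementation. Only the two ID pairs are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixup IDs; the real driver has many more. */
enum demo_fixup {
	DEMO_FIXUP_NONE = 0,
	DEMO_FIXUP_TPT440_DOCK,
};

struct demo_quirk {
	uint16_t subvendor;
	uint16_t subdevice;
	const char *name;
	enum demo_fixup fixup;
};

/* Two entries from the patch context, terminated by an entry with no name. */
static const struct demo_quirk demo_quirks[] = {
	{ 0x17aa, 0x2226, "ThinkPad X250", DEMO_FIXUP_TPT440_DOCK },
	{ 0x17aa, 0x504a, "ThinkPad X260", DEMO_FIXUP_TPT440_DOCK },
	{ 0, 0, NULL, DEMO_FIXUP_NONE },
};

/* Linear scan; returns the matching entry or NULL if the machine is unknown. */
const struct demo_quirk *demo_quirk_lookup(const struct demo_quirk *table,
					   uint16_t vendor, uint16_t device)
{
	const struct demo_quirk *q;

	assert(table != NULL);
	for (q = table; q->name; q++) {
		if (q->subvendor == vendor && q->subdevice == device)
			return q;
	}
	return NULL;
}
```

In this model, adding a machine is a one-line table change, which is why the X260 fix is a single added entry.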

Cheers
Conrad

--

Fix audio output on a ThinkPad X260 when using the Lenovo CES 2013
docking station series (basic, pro, ultra).


Signed-off-by: Conrad Kostecki 

diff -uprN -X linux-4.6-rc4-vanilla/Documentation/dontdiff linux-4.6-rc4-vanilla/sound/pci/hda/patch_realtek.c linux-4.6-rc4/sound/pci/hda/patch_realtek.c
--- linux-4.6-rc4-vanilla/sound/pci/hda/patch_realtek.c 2016-04-24 03:26:36.330983586 +0200
+++ linux-4.6-rc4/sound/pci/hda/patch_realtek.c 2016-04-24 03:27:13.737981843 +0200
@@ -5570,6 +5570,7 @@ static const struct snd_pci_quirk alc269
 	SND_PCI_QUIRK(0x17aa, 0x2218, "Thinkpad X1 Carbon 2nd", ALC292_FIXUP_TPT440_DOCK),
 	SND_PCI_QUIRK(0x17aa, 0x2223, "ThinkPad T550", ALC292_FIXUP_TPT440_DOCK),
 	SND_PCI_QUIRK(0x17aa, 0x2226, "ThinkPad X250", ALC292_FIXUP_TPT440_DOCK),
+	SND_PCI_QUIRK(0x17aa, 0x504a, "ThinkPad X260", ALC292_FIXUP_TPT440_DOCK),
 	SND_PCI_QUIRK(0x17aa, 0x2233, "Thinkpad", ALC292_FIXUP_TPT460),
 	SND_PCI_QUIRK(0x17aa, 0x30bb, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY),
 	SND_PCI_QUIRK(0x17aa, 0x30e2, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY),

--



Re: [RFC] The Linux Scheduler: a Decade of Wasted Cores Report

2016-04-23 Thread Brendan Gregg
On Sat, Apr 23, 2016 at 11:20 AM, Jeff Merkey  wrote:
>
> Interesting read.
>
> http://www.ece.ubc.ca/~sasha/papers/eurosys16-final29.pdf
>
> "... The Linux kernel scheduler has deficiencies that prevent a
> multicore system from making proper use of all cores for heavily
> multithreaded loads, according to a lecture and paper delivered
> earlier this month at the EuroSys '16 conference in London, ..."
>
> Any plans to incorporate these fixes?

While this paper analyzes and proposes fixes for four bugs, it has
been getting a lot of attention for broader claims about Linux being
fundamentally broken:

"As a central part of resource management, the OS thread scheduler
must maintain the following, simple, invariant: make sure that ready
threads are scheduled on available cores. As simple as it may seem, we
found that this invariant is often broken in Linux. Cores may stay
idle for seconds while ready threads are waiting in runqueues."

It then states that the problems they found in the Linux scheduler
cause degradations of "13-24% for typical Linux workloads".

Their proof of concept patches are online[1]. I tested them and saw 0%
improvements on the systems I tested, for some simple workloads[2]. I
tested 1 and 2 node NUMA, as that is typical for my employer (Netflix,
and our tens of thousands of Linux instances in the AWS/EC2 cloud),
even though I wasn't expecting any difference on 1 node. I've used
synthetic workloads so far.

I should note I do check run queue latency having hit scheduler bugs
in the past (especially on other kernels) and haven't noticed the
issues they describe, on our systems, for various workloads. I've also
written a new tool for this (runqlat using bcc/BPF[3]) to print run
queue latency as a histogram.
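
As a rough illustration of what "run queue latency as a histogram" means, here is a minimal user-space sketch in C that sorts latency samples into power-of-two buckets. It is not the runqlat code; the names log2_bucket, hist_add and NBUCKETS are hypothetical, and microseconds are assumed as the unit.

```c
#include <stdint.h>

#define NBUCKETS 32	/* hypothetical: number of power-of-two buckets */

/*
 * Map a latency (assumed to be in microseconds) to a log2 bucket:
 * bucket 0 holds 0..1, bucket 1 holds 2..3, bucket 2 holds 4..7, ...
 * Values beyond the last bucket are clamped into it.
 */
unsigned int log2_bucket(uint64_t usecs)
{
	unsigned int b = 0;

	while (usecs > 1 && b < NBUCKETS - 1) {
		usecs >>= 1;
		b++;
	}
	return b;
}

/* Count one latency sample in the histogram. */
void hist_add(uint64_t hist[NBUCKETS], uint64_t usecs)
{
	hist[log2_bucket(usecs)]++;
}
```

With buckets like these, a scheduler problem of the kind the paper describes would be expected to show up as samples in the high buckets (long waits) rather than as a shift in the average.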

The bugs they found seem real, and their analysis is great (although
using visualizations to find and fix scheduler bugs isn't new), and it
would be good to see these fixed. However, it would also be useful to
double check how widespread these issues really are. I suspect many on
this list can test these patches in different environments.

Have we really had a decade of wasted cores, losing 13-24% for typical
Linux workloads? I don't think it's that widespread, but I'm only
testing one environment.

Brendan

[1] https://github.com/jplozi/wastedcores
[2] https://gist.github.com/brendangregg/588b1d29bcb952141d50ccc0e005fcf8
[3] https://github.com/iovisor/bcc/blob/master/tools/runqlat_example.txt


Re: [PATCH net-next 0/9] netlink: align attributes when needed (patchset #1)

2016-04-23 Thread David Miller
From: Nicolas Dichtel 
Date: Fri, 22 Apr 2016 17:31:15 +0200

> This is the continuation of the work done to align netlink attributes
> when these attributes contain some 64-bit fields.
> 
> David, if the third patch is too big (or maybe the series), I can split it.
> Just tell me what you prefer.

This looks excellent, series applied, thanks!


Re: [PATCH] unicore32: mm: Add missing parameter to arch_vma_access_permitted

2016-04-23 Thread Guenter Roeck

ping ... still not upstream.

On 03/21/2016 04:20 AM, Guenter Roeck wrote:

unicore32 fails to compile with the following errors.

mm/memory.c: In function ‘__handle_mm_fault’:
mm/memory.c:3381: error:
too many arguments to function ‘arch_vma_access_permitted’
mm/gup.c: In function ‘check_vma_flags’:
mm/gup.c:456: error:
too many arguments to function ‘arch_vma_access_permitted’
mm/gup.c: In function ‘vma_permits_fault’:
mm/gup.c:640: error:
too many arguments to function ‘arch_vma_access_permitted’

Fixes: d61172b4b695b ("mm/core, x86/mm/pkeys: Differentiate instruction 
fetches")
Cc: Dave Hansen 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Signed-off-by: Guenter Roeck 
---
  arch/unicore32/include/asm/mmu_context.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/unicore32/include/asm/mmu_context.h 
b/arch/unicore32/include/asm/mmu_context.h
index e35632ef23c7..62dfc644c908 100644
--- a/arch/unicore32/include/asm/mmu_context.h
+++ b/arch/unicore32/include/asm/mmu_context.h
@@ -98,7 +98,7 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
  }

  static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
-   bool write, bool foreign)
+   bool write, bool execute, bool foreign)
  {
/* by default, allow everything */
return true;
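
The build failure above comes down to a prototype mismatch: generic mm code passes four arguments, while the unicore32 stub accepted three. A minimal stand-alone sketch of the fixed shape follows; struct vm_area_struct is left opaque and check_access() is a hypothetical caller, not a kernel function.

```c
#include <stdbool.h>

struct vm_area_struct;	/* opaque stand-in for the kernel type */

/* Stub with the four-parameter signature that the callers expect. */
static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
		bool write, bool execute, bool foreign)
{
	(void)vma;
	(void)write;
	(void)execute;
	(void)foreign;
	/* by default, allow everything */
	return true;
}

/* Hypothetical caller that passes four arguments, as in the error log. */
bool check_access(struct vm_area_struct *vma, bool write, bool execute,
		  bool foreign)
{
	return arch_vma_access_permitted(vma, write, execute, foreign);
}
```

Dropping the `bool execute` parameter from the stub while keeping the caller unchanged reproduces the "too many arguments to function" diagnostic.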





Re: Major KVM issues with kernel 4.5 on the host

2016-04-23 Thread Borislav Petkov
On Sat, Apr 23, 2016 at 08:43:41PM +0200, Marc Haber wrote:
> Uncorrectable errors would still be identified by the ECC hardware,

Not if the hardware decides to syncflood so that we don't even get to
run the #MC handler...

> and the box wouldn't be perfectly fine with an "old" kernel.

Maybe the "old" kernel is not causing all the required ingredients to
come together for the uncorrectable error to happen. But yeah, I agree,
the fact that 4.4 is fine kinda doesn't fit with the uncorrectable error
theory.

> Yes, that would be in the logs.

Presumably. And see above.

> But we still postulate that the issue does only show on older AMD
> CPUs. Otherwise, I wouldn't be the only one making this experience.

It actually shows only on this one system. At least I'm not aware of any
other report of the same issue. My system with a F10h, rev E is just
fine.

> Do you want me to memtest for 24 hours?

Yeah, that memtest crap never triggers any ECCs. But if you're bored,
why not...

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


Re: [PATCH v5 00/10] Support for Cortex-M Prototyping System

2016-04-23 Thread Arnd Bergmann
On Friday 01 April 2016, Vladimir Murzin wrote:
> Hi,
> 
> This patch series provides the basic support for running ucLinux on V2M-MPS2
> platform.
> 
> With these patches applied ucLinux can be run on both HW and FVP models
> with Cortex-M3/M4/M7 configurations.
> 
> Board description:
> 
> http://infocenter.arm.com/help/topic/com.arm.doc.100112_0100_03_en/arm_versatile_express_cortex_m_prototyping_system_(v2m_mps2)_technical_reference_manual_100112_0100_03_en.pdf
> 
> Application notes (cover Cortex-M3/M4/M7):
> 
> http://infocenter.arm.com/help/topic/com.arm.doc.dai0385a/DAI0385A_cortex_m3_on_v2m_mps2.pdf
> http://infocenter.arm.com/help/topic/com.arm.doc.dai0386a/DAI0386A_cortex_m4_on_v2m_mps2.pdf
> http://infocenter.arm.com/help/topic/com.arm.doc.dai0399a/DAI0399A_cortex_m7_on_v2m_mps2.pdf
> http://infocenter.arm.com/help/topic/com.arm.doc.dai0400a/DAI0400A_cortex_m7_on_v2m_mps2.pdf
> 
> Cortex-M System Design Kit (referenced as CMDK from documents above):
> 
> http://infocenter.arm.com/help/topic/com.arm.doc.ddi0479c/DDI0479C_cortex_m_system_design_kit_r1p0_trm.pdf
> 
> I'd be happy to hear any feedback/comments on this series!
> 

The whole series looks good to me, please submit the patches to the
appropriate maintainers for inclusion separately:

* patches 1-2 to the clocksource maintainers
* patches 3-5 to the serial driver maintainers
* patches 6-10 to the vexpress maintainers

I think everyone is already on Cc here, but they don't want to pick the
patches out of the series separately. I expect to get the patches for
arm-soc as part of the vexpress pull requests (dt, defconfig, and core).

Arnd


[RFC PATCH 01/11] thermal: prevent zones with no types to be registered

2016-04-23 Thread Eduardo Valentin
There are APIs that rely on tz->type. This patch
prevents thermal zones without a type from being registered.

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index d4b5465..650e5fa 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1810,6 +1810,9 @@ struct thermal_zone_device 
*thermal_zone_device_register(const char *type,
int passive = 0;
struct thermal_governor *governor;
 
+   if (!type || strlen(type) == 0)
+   return ERR_PTR(-EINVAL);
+
if (type && strlen(type) >= THERMAL_NAME_LENGTH)
return ERR_PTR(-EINVAL);
 
@@ -1835,7 +1838,7 @@ struct thermal_zone_device 
*thermal_zone_device_register(const char *type,
return ERR_PTR(result);
}
 
-   strlcpy(tz->type, type ? : "", sizeof(tz->type));
+   strlcpy(tz->type, type, sizeof(tz->type));
tz->ops = ops;
tz->tzp = tzp;
tz->device.class = &thermal_class;
@@ -1855,11 +1858,9 @@ struct thermal_zone_device 
*thermal_zone_device_register(const char *type,
}
 
/* sys I/F */
-   if (type) {
-   result = device_create_file(&tz->device, &dev_attr_type);
-   if (result)
-   goto unregister;
-   }
+   result = device_create_file(&tz->device, &dev_attr_type);
+   if (result)
+   goto unregister;
 
result = device_create_file(&tz->device, &dev_attr_temp);
if (result)
@@ -2008,8 +2009,7 @@ void thermal_zone_device_unregister(struct 
thermal_zone_device *tz)
 
thermal_zone_device_set_polling(tz, 0);
 
-   if (tz->type[0])
-   device_remove_file(&tz->device, &dev_attr_type);
+   device_remove_file(&tz->device, &dev_attr_type);
device_remove_file(&tz->device, &dev_attr_temp);
if (tz->ops->get_mode)
device_remove_file(&tz->device, &dev_attr_mode);
-- 
2.1.4



[RFC PATCH 03/11] thermal: group device_create_file() calls that are always created

2016-04-23 Thread Eduardo Valentin
Simple code reorganization to group files that are always created
when registering a thermal zone.

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index e28d547..2227264 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1858,14 +1858,6 @@ struct thermal_zone_device 
*thermal_zone_device_register(const char *type,
}
 
/* sys I/F */
-   result = device_create_file(&tz->device, &dev_attr_type);
-   if (result)
-   goto unregister;
-
-   result = device_create_file(&tz->device, &dev_attr_temp);
-   if (result)
-   goto unregister;
-
if (ops->get_mode) {
result = device_create_file(&tz->device, &dev_attr_mode);
if (result)
@@ -1900,13 +1892,16 @@ struct thermal_zone_device 
*thermal_zone_device_register(const char *type,
goto unregister;
}
 
-   /* Create policy attribute */
-   result = device_create_file(&tz->device, &dev_attr_policy);
+   result = device_create_file(&tz->device, &dev_attr_type);
if (result)
goto unregister;
 
-   /* Add thermal zone params */
-   result = create_tzp_attrs(&tz->device);
+   result = device_create_file(&tz->device, &dev_attr_temp);
+   if (result)
+   goto unregister;
+
+   /* Create policy attribute */
+   result = device_create_file(&tz->device, &dev_attr_policy);
if (result)
goto unregister;
 
@@ -1915,6 +1910,11 @@ struct thermal_zone_device 
*thermal_zone_device_register(const char *type,
if (result)
goto unregister;
 
+   /* Add thermal zone params */
+   result = create_tzp_attrs(&tz->device);
+   if (result)
+   goto unregister;
+
/* Update 'this' zone's governor information */
mutex_lock(&thermal_governor_lock);
 
-- 
2.1.4



[RFC PATCH 06/11] thermal: move mode attribute to tz->device.groups

2016-04-23 Thread Eduardo Valentin
Moving the mode attribute to tz->device.groups requires implementing
a .is_visible() callback. The condition checked by .is_visible() for
the mode attribute group is unchanged: the attribute is visible only
if ops->get_mode() is set by the thermal driver.

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 35 +++
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index e32f851..6e44038 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1006,6 +1006,7 @@ static DEVICE_ATTR(sustainable_power, S_IWUSR | S_IRUGO, 
sustainable_power_show,
 static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
 static DEVICE_ATTR(passive, S_IRUGO | S_IWUSR, passive_show, passive_store);
 
+/* These attributes are unconditionally added to a thermal zone */
 static struct attribute *thermal_zone_dev_attrs[] = {
&dev_attr_type.attr,
&dev_attr_temp.attr,
@@ -1028,8 +1029,34 @@ static struct attribute_group 
thermal_zone_attribute_group = {
.attrs = thermal_zone_dev_attrs,
 };
 
+/* We expose mode only if .get_mode is present */
+static struct attribute *thermal_zone_mode_attrs[] = {
+   &dev_attr_mode.attr,
+};
+
+static umode_t thermal_zone_mode_is_visible(struct kobject *kobj,
+   struct attribute *attr,
+   int attrno)
+{
+   struct device *dev = container_of(kobj, struct device, kobj);
+   struct thermal_zone_device *tz;
+
+   tz = container_of(dev, struct thermal_zone_device, device);
+
+   if (tz->ops->get_mode)
+   return attr->mode;
+
+   return 0;
+}
+
+static struct attribute_group thermal_zone_mode_attribute_group = {
+   .attrs = thermal_zone_mode_attrs,
+   .is_visible = thermal_zone_mode_is_visible,
+};
+
 static const struct attribute_group *thermal_zone_attribute_groups[] = {
&thermal_zone_attribute_group,
+   &thermal_zone_mode_attribute_group,
NULL
 };
 
@@ -1868,12 +1895,6 @@ struct thermal_zone_device 
*thermal_zone_device_register(const char *type,
}
 
/* sys I/F */
-   if (ops->get_mode) {
-   result = device_create_file(&tz->device, &dev_attr_mode);
-   if (result)
-   goto unregister;
-   }
-
result = create_trip_attrs(tz, mask);
if (result)
goto unregister;
@@ -1990,8 +2011,6 @@ void thermal_zone_device_unregister(struct 
thermal_zone_device *tz)
 
thermal_zone_device_set_polling(tz, 0);
 
-   if (tz->ops->get_mode)
-   device_remove_file(&tz->device, &dev_attr_mode);
remove_trip_attrs(tz);
thermal_set_governor(tz, NULL);
 
-- 
2.1.4
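
The .is_visible() pattern used here can be tried outside the kernel with a small mock. Everything below is a hypothetical stand-in (struct zone, struct zone_ops, mode_is_visible, count_visible, example_get_mode); only the idea is taken from the patch: the callback returns the attribute's mode when ops->get_mode is set and 0 otherwise. The sketch NULL-terminates its attribute array, since such arrays are walked until a NULL entry.

```c
#include <stddef.h>

typedef unsigned short umode_t;		/* stand-in for the kernel typedef */

struct attribute { const char *name; umode_t mode; };
struct zone_ops { int (*get_mode)(void); };	/* mock of the driver ops */
struct zone { const struct zone_ops *ops; };	/* mock thermal zone */

/* Return the file mode if the attribute should appear, 0 to hide it. */
umode_t mode_is_visible(const struct zone *tz, const struct attribute *attr)
{
	return tz->ops->get_mode ? attr->mode : 0;
}

/* Hypothetical driver callback, used only to make get_mode non-NULL. */
int example_get_mode(void)
{
	return 0;
}

static struct attribute mode_attr = { "mode", 0644 };

/* The array ends with NULL so that a walker knows where to stop. */
static struct attribute *mode_attrs[] = { &mode_attr, NULL };

/* Walk a NULL-terminated array and count the attributes that are shown. */
int count_visible(const struct zone *tz, struct attribute **attrs)
{
	int n = 0;

	for (; *attrs; attrs++)
		if (mode_is_visible(tz, *attrs))
			n++;
	return n;
}
```

A zone whose ops leave get_mode unset ends up with no visible "mode" file, which matches the behaviour of the removed `if (ops->get_mode)` block.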



[RFC PATCH 05/11] thermal: move emul_temp creation to tz->device.groups

2016-04-23 Thread Eduardo Valentin
Creation of emul_temp depends on a compile-time
condition. Move it to tz->device.groups.

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 0a7d918..e32f851 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -996,6 +996,7 @@ create_s32_tzp_attr(offset);
  */
 static DEVICE_ATTR(type, 0444, type_show, NULL);
 static DEVICE_ATTR(temp, 0444, temp_show, NULL);
+static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
 static DEVICE_ATTR(policy, S_IRUGO | S_IWUSR, policy_show, policy_store);
 static DEVICE_ATTR(available_policies, S_IRUGO, available_policies_show, NULL);
 static DEVICE_ATTR(sustainable_power, S_IWUSR | S_IRUGO, 
sustainable_power_show,
@@ -1004,11 +1005,13 @@ static DEVICE_ATTR(sustainable_power, S_IWUSR | 
S_IRUGO, sustainable_power_show,
 /* These thermal zone device attributes are created based on conditions */
 static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
 static DEVICE_ATTR(passive, S_IRUGO | S_IWUSR, passive_show, passive_store);
-static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
 
 static struct attribute *thermal_zone_dev_attrs[] = {
&dev_attr_type.attr,
&dev_attr_temp.attr,
+#if (IS_ENABLED(CONFIG_THERMAL_EMULATION))
+   &dev_attr_emul_temp.attr,
+#endif
&dev_attr_policy.attr,
&dev_attr_available_policies.attr,
&dev_attr_sustainable_power.attr,
@@ -1893,12 +1896,6 @@ struct thermal_zone_device 
*thermal_zone_device_register(const char *type,
goto unregister;
}
 
-   if (IS_ENABLED(CONFIG_THERMAL_EMULATION)) {
-   result = device_create_file(&tz->device, &dev_attr_emul_temp);
-   if (result)
-   goto unregister;
-   }
-
/* Update 'this' zone's governor information */
mutex_lock(&thermal_governor_lock);
 
-- 
2.1.4



[RFC PATCH 08/11] thermal: move power actor code out of sysfs I/F section

2016-04-23 Thread Eduardo Valentin
Reorganize the code so that the sysfs I/F section contains only the
sysfs interface functions of the thermal zone device. To that end, move
the power actor code out of the sysfs I/F section.

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 160 -
 1 file changed, 80 insertions(+), 80 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index e48c720..7c95978 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -582,6 +582,86 @@ static void thermal_zone_device_check(struct work_struct 
*work)
thermal_zone_device_update(tz);
 }
 
+/**
+ * power_actor_get_max_power() - get the maximum power that a cdev can consume
+ * @cdev:  pointer to &thermal_cooling_device
+ * @tz:a valid thermal zone device pointer
+ * @max_power: pointer in which to store the maximum power
+ *
+ * Calculate the maximum power consumption in milliwats that the
+ * cooling device can currently consume and store it in @max_power.
+ *
+ * Return: 0 on success, -EINVAL if @cdev doesn't support the
+ * power_actor API or -E* on other error.
+ */
+int power_actor_get_max_power(struct thermal_cooling_device *cdev,
+ struct thermal_zone_device *tz, u32 *max_power)
+{
+   if (!cdev_is_power_actor(cdev))
+   return -EINVAL;
+
+   return cdev->ops->state2power(cdev, tz, 0, max_power);
+}
+
+/**
+ * power_actor_get_min_power() - get the mainimum power that a cdev can consume
+ * @cdev:  pointer to &thermal_cooling_device
+ * @tz:a valid thermal zone device pointer
+ * @min_power: pointer in which to store the minimum power
+ *
+ * Calculate the minimum power consumption in milliwatts that the
+ * cooling device can currently consume and store it in @min_power.
+ *
+ * Return: 0 on success, -EINVAL if @cdev doesn't support the
+ * power_actor API or -E* on other error.
+ */
+int power_actor_get_min_power(struct thermal_cooling_device *cdev,
+ struct thermal_zone_device *tz, u32 *min_power)
+{
+   unsigned long max_state;
+   int ret;
+
+   if (!cdev_is_power_actor(cdev))
+   return -EINVAL;
+
+   ret = cdev->ops->get_max_state(cdev, &max_state);
+   if (ret)
+   return ret;
+
+   return cdev->ops->state2power(cdev, tz, max_state, min_power);
+}
+
+/**
+ * power_actor_set_power() - limit the maximum power that a cooling device can 
consume
+ * @cdev:  pointer to &thermal_cooling_device
+ * @instance:  thermal instance to update
+ * @power: the power in milliwatts
+ *
+ * Set the cooling device to consume at most @power milliwatts.
+ *
+ * Return: 0 on success, -EINVAL if the cooling device does not
+ * implement the power actor API or -E* for other failures.
+ */
+int power_actor_set_power(struct thermal_cooling_device *cdev,
+ struct thermal_instance *instance, u32 power)
+{
+   unsigned long state;
+   int ret;
+
+   if (!cdev_is_power_actor(cdev))
+   return -EINVAL;
+
+   ret = cdev->ops->power2state(cdev, instance->tz, power, &state);
+   if (ret)
+   return ret;
+
+   instance->target = state;
+   cdev->updated = false;
+   thermal_cdev_update(cdev);
+
+   return 0;
+}
+
 /* sys I/F for thermal zone */
 
 #define to_thermal_zone(_dev) \
@@ -1092,86 +1172,6 @@ static const struct attribute_group 
*thermal_zone_attribute_groups[] = {
NULL
 };
 
-/**
- * power_actor_get_max_power() - get the maximum power that a cdev can consume
- * @cdev:  pointer to &thermal_cooling_device
- * @tz:a valid thermal zone device pointer
- * @max_power: pointer in which to store the maximum power
- *
- * Calculate the maximum power consumption in milliwats that the
- * cooling device can currently consume and store it in @max_power.
- *
- * Return: 0 on success, -EINVAL if @cdev doesn't support the
- * power_actor API or -E* on other error.
- */
-int power_actor_get_max_power(struct thermal_cooling_device *cdev,
- struct thermal_zone_device *tz, u32 *max_power)
-{
-   if (!cdev_is_power_actor(cdev))
-   return -EINVAL;
-
-   return cdev->ops->state2power(cdev, tz, 0, max_power);
-}
-
-/**
- * power_actor_get_min_power() - get the mainimum power that a cdev can consume
- * @cdev:  pointer to _cooling_device
- * @tz:a valid thermal zone device pointer
- * @min_power: pointer in which to store the minimum power
- *
- * Calculate the minimum power consumption in milliwatts that the
- * cooling device can currently consume and store it in @min_power.
- *
- * Return: 0 on success, -EINVAL if @cdev doesn't support the
- * power_actor API or -E* on other error.
- */
-int power_actor_get_min_power(struct thermal_cooling_device *cdev,
-   

[RFC PATCH 07/11] thermal: move passive attr to tz->device.groups

2016-04-23 Thread Eduardo Valentin
This patch moves the passive attribute to tz->device.groups. Moving the
passive attribute also requires a .is_visible() callback implementation
for its attribute group.

The logic behind the visibility of passive attribute is kept the same.
We only expose the passive attribute if the thermal driver has exposed
at least one passive trip point.

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 41 -
 1 file changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 6e44038..e48c720 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1054,9 +1054,41 @@ static struct attribute_group 
thermal_zone_mode_attribute_group = {
.is_visible = thermal_zone_mode_is_visible,
 };
 
+/* We expose passive only if passive trips are present */
+static struct attribute *thermal_zone_passive_attrs[] = {
+   &dev_attr_passive.attr,
+};
+
+static umode_t thermal_zone_passive_is_visible(struct kobject *kobj,
+  struct attribute *attr,
+  int attrno)
+{
+   struct device *dev = container_of(kobj, struct device, kobj);
+   struct thermal_zone_device *tz;
+   enum thermal_trip_type trip_type;
+   int count;
+
+   tz = container_of(dev, struct thermal_zone_device, device);
+
+   for (count = 0; count < tz->trips; count++) {
+   tz->ops->get_trip_type(tz, count, &trip_type);
+
+   if (trip_type == THERMAL_TRIP_PASSIVE)
+   return attr->mode;
+   }
+
+   return 0;
+}
+
+static struct attribute_group thermal_zone_passive_attribute_group = {
+   .attrs = thermal_zone_passive_attrs,
+   .is_visible = thermal_zone_passive_is_visible,
+};
+
 static const struct attribute_group *thermal_zone_attribute_groups[] = {
&thermal_zone_attribute_group,
&thermal_zone_mode_attribute_group,
+   &thermal_zone_passive_attribute_group,
NULL
 };
 
@@ -1841,7 +1873,6 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
int trip_temp;
int result;
int count;
-   int passive = 0;
struct thermal_governor *governor;
 
if (!type || strlen(type) == 0)
@@ -1902,8 +1933,6 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
for (count = 0; count < trips; count++) {
if (tz->ops->get_trip_type(tz, count, &trip_type))
set_bit(count, &tz->trips_disabled);
-   if (trip_type == THERMAL_TRIP_PASSIVE)
-   passive = 1;
if (tz->ops->get_trip_temp(tz, count, &trip_temp))
set_bit(count, &tz->trips_disabled);
/* Check for bogus trip points */
@@ -1911,12 +1940,6 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
set_bit(count, &tz->trips_disabled);
}
 
-   if (!passive) {
-   result = device_create_file(&tz->device, &dev_attr_passive);
-   if (result)
-   goto unregister;
-   }
-
/* Update 'this' zone's governor information */
mutex_lock(&thermal_governor_lock);
 
-- 
2.1.4



[RFC PATCH 06/11] thermal: move mode attribute to tz->device.groups

2016-04-23 Thread Eduardo Valentin
Moving the mode attribute to tz->device.groups requires implementing
a .is_visible() callback. The condition checked by .is_visible() for
the mode attribute group is kept the same: the attribute is visible
only if ops->get_mode() is set by the thermal driver.
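
As a rough userspace sketch of how a group pairs a NULL-terminated attribute array with a visibility callback, consider the following. All names here (`struct attr`, `struct attr_group`, `count_visible()`, `example_get_mode()`) are hypothetical simplifications for illustration and do not reproduce the kernel structures.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the sysfs structures. */
struct attr {
	const char *name;
	unsigned short mode;
};

struct zone {
	int (*get_mode)(void); /* optional op; NULL when unsupported */
};

struct attr_group {
	struct attr **attrs; /* must be NULL-terminated */
	unsigned short (*is_visible)(const struct zone *, const struct attr *);
};

/* Expose the attribute only when the driver provides get_mode. */
unsigned short mode_is_visible(const struct zone *z, const struct attr *a)
{
	return z->get_mode ? a->mode : 0;
}

/* Count how many attributes of a group would be created for a zone. */
int count_visible(const struct attr_group *g, const struct zone *z)
{
	struct attr **a;
	int n = 0;

	for (a = g->attrs; *a; a++) {
		unsigned short mode;

		mode = g->is_visible ? g->is_visible(z, *a) : (*a)->mode;
		if (mode)
			n++;
	}

	return n;
}

/* Placeholder op so that a zone can advertise get_mode support. */
int example_get_mode(void)
{
	return 0;
}

struct attr mode_attr = { "mode", 0644 };
struct attr *mode_attrs[] = { &mode_attr, NULL };
struct attr_group mode_group = { mode_attrs, mode_is_visible };
```

The walk in `count_visible()` stops at the NULL entry, which is why the attribute array in this sketch carries an explicit terminator.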

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 35 +++
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index e32f851..6e44038 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1006,6 +1006,7 @@ static DEVICE_ATTR(sustainable_power, S_IWUSR | S_IRUGO, sustainable_power_show,
 static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
 static DEVICE_ATTR(passive, S_IRUGO | S_IWUSR, passive_show, passive_store);
 
+/* These attributes are unconditionally added to a thermal zone */
 static struct attribute *thermal_zone_dev_attrs[] = {
&dev_attr_type.attr,
&dev_attr_temp.attr,
@@ -1028,8 +1029,34 @@ static struct attribute_group thermal_zone_attribute_group = {
.attrs = thermal_zone_dev_attrs,
 };
 
+/* We expose mode only if .get_mode is present */
+static struct attribute *thermal_zone_mode_attrs[] = {
+   &dev_attr_mode.attr,
+};
+
+static umode_t thermal_zone_mode_is_visible(struct kobject *kobj,
+   struct attribute *attr,
+   int attrno)
+{
+   struct device *dev = container_of(kobj, struct device, kobj);
+   struct thermal_zone_device *tz;
+
+   tz = container_of(dev, struct thermal_zone_device, device);
+
+   if (tz->ops->get_mode)
+   return attr->mode;
+
+   return 0;
+}
+
+static struct attribute_group thermal_zone_mode_attribute_group = {
+   .attrs = thermal_zone_mode_attrs,
+   .is_visible = thermal_zone_mode_is_visible,
+};
+
 static const struct attribute_group *thermal_zone_attribute_groups[] = {
&thermal_zone_attribute_group,
+   &thermal_zone_mode_attribute_group,
NULL
 };
 
@@ -1868,12 +1895,6 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
}
 
/* sys I/F */
-   if (ops->get_mode) {
-   result = device_create_file(&tz->device, &dev_attr_mode);
-   if (result)
-   goto unregister;
-   }
-
result = create_trip_attrs(tz, mask);
if (result)
goto unregister;
@@ -1990,8 +2011,6 @@ void thermal_zone_device_unregister(struct thermal_zone_device *tz)
 
thermal_zone_device_set_polling(tz, 0);
 
-   if (tz->ops->get_mode)
-   device_remove_file(&tz->device, &dev_attr_mode);
remove_trip_attrs(tz);
thermal_set_governor(tz, NULL);
 
-- 
2.1.4



[RFC PATCH 05/11] thermal: move emul_temp creation to tz->device.groups

2016-04-23 Thread Eduardo Valentin
The creation of emul_temp depends on a compile-time condition
(CONFIG_THERMAL_EMULATION). Move it to tz->device.groups.
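
A minimal sketch of the idea, assuming a hypothetical `EXAMPLE_EMULATION` macro in place of the kernel config option and plain strings in place of attribute pointers: the preprocessor decides at build time whether the entry is part of the static array, so no runtime check is needed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical build switch standing in for a kernel config option. */
#ifndef EXAMPLE_EMULATION
#define EXAMPLE_EMULATION 1
#endif

/* The conditional entry is compiled in or out of the static array. */
const char *const zone_attr_names[] = {
	"type",
	"temp",
#if EXAMPLE_EMULATION
	"emul_temp",
#endif
	"policy",
	NULL, /* terminator */
};

/* Return 1 if the named attribute is part of the array, 0 otherwise. */
int has_attr(const char *name)
{
	const char *const *p;

	for (p = zone_attr_names; *p; p++)
		if (strcmp(*p, name) == 0)
			return 1;

	return 0;
}
```

Building with `-DEXAMPLE_EMULATION=0` drops the entry entirely, which corresponds to the attribute never being registered.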

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 0a7d918..e32f851 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -996,6 +996,7 @@ create_s32_tzp_attr(offset);
  */
 static DEVICE_ATTR(type, 0444, type_show, NULL);
 static DEVICE_ATTR(temp, 0444, temp_show, NULL);
+static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
 static DEVICE_ATTR(policy, S_IRUGO | S_IWUSR, policy_show, policy_store);
 static DEVICE_ATTR(available_policies, S_IRUGO, available_policies_show, NULL);
 static DEVICE_ATTR(sustainable_power, S_IWUSR | S_IRUGO, sustainable_power_show,
@@ -1004,11 +1005,13 @@ static DEVICE_ATTR(sustainable_power, S_IWUSR | S_IRUGO, sustainable_power_show,
 /* These thermal zone device attributes are created based on conditions */
 static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
 static DEVICE_ATTR(passive, S_IRUGO | S_IWUSR, passive_show, passive_store);
-static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
 
 static struct attribute *thermal_zone_dev_attrs[] = {
&dev_attr_type.attr,
&dev_attr_temp.attr,
+#if (IS_ENABLED(CONFIG_THERMAL_EMULATION))
+   &dev_attr_emul_temp.attr,
+#endif
&dev_attr_policy.attr,
&dev_attr_available_policies.attr,
&dev_attr_sustainable_power.attr,
@@ -1893,12 +1896,6 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
goto unregister;
}
 
-   if (IS_ENABLED(CONFIG_THERMAL_EMULATION)) {
-   result = device_create_file(&tz->device, &dev_attr_emul_temp);
-   if (result)
-   goto unregister;
-   }
-
/* Update 'this' zone's governor information */
mutex_lock(&thermal_governor_lock);
 
-- 
2.1.4



[RFC PATCH 08/11] thermal: move power actor code out of sysfs I/F section

2016-04-23 Thread Eduardo Valentin
Reorganize the code so that the thermal zone sysfs interface functions
are kept together: move the power actor code out of the sysfs I/F
section.

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 160 -
 1 file changed, 80 insertions(+), 80 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index e48c720..7c95978 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -582,6 +582,86 @@ static void thermal_zone_device_check(struct work_struct *work)
thermal_zone_device_update(tz);
 }
 
+/**
+ * power_actor_get_max_power() - get the maximum power that a cdev can consume
+ * @cdev:  pointer to &thermal_cooling_device
+ * @tz:a valid thermal zone device pointer
+ * @max_power: pointer in which to store the maximum power
+ *
+ * Calculate the maximum power consumption in milliwats that the
+ * cooling device can currently consume and store it in @max_power.
+ *
+ * Return: 0 on success, -EINVAL if @cdev doesn't support the
+ * power_actor API or -E* on other error.
+ */
+int power_actor_get_max_power(struct thermal_cooling_device *cdev,
+ struct thermal_zone_device *tz, u32 *max_power)
+{
+   if (!cdev_is_power_actor(cdev))
+   return -EINVAL;
+
+   return cdev->ops->state2power(cdev, tz, 0, max_power);
+}
+
+/**
+ * power_actor_get_min_power() - get the mainimum power that a cdev can consume
+ * @cdev:  pointer to &thermal_cooling_device
+ * @tz:a valid thermal zone device pointer
+ * @min_power: pointer in which to store the minimum power
+ *
+ * Calculate the minimum power consumption in milliwatts that the
+ * cooling device can currently consume and store it in @min_power.
+ *
+ * Return: 0 on success, -EINVAL if @cdev doesn't support the
+ * power_actor API or -E* on other error.
+ */
+int power_actor_get_min_power(struct thermal_cooling_device *cdev,
+ struct thermal_zone_device *tz, u32 *min_power)
+{
+   unsigned long max_state;
+   int ret;
+
+   if (!cdev_is_power_actor(cdev))
+   return -EINVAL;
+
+   ret = cdev->ops->get_max_state(cdev, &max_state);
+   if (ret)
+   return ret;
+
+   return cdev->ops->state2power(cdev, tz, max_state, min_power);
+}
+
+/**
+ * power_actor_set_power() - limit the maximum power that a cooling device can consume
+ * @cdev:  pointer to &thermal_cooling_device
+ * @instance:  thermal instance to update
+ * @power: the power in milliwatts
+ *
+ * Set the cooling device to consume at most @power milliwatts.
+ *
+ * Return: 0 on success, -EINVAL if the cooling device does not
+ * implement the power actor API or -E* for other failures.
+ */
+int power_actor_set_power(struct thermal_cooling_device *cdev,
+ struct thermal_instance *instance, u32 power)
+{
+   unsigned long state;
+   int ret;
+
+   if (!cdev_is_power_actor(cdev))
+   return -EINVAL;
+
+   ret = cdev->ops->power2state(cdev, instance->tz, power, &state);
+   if (ret)
+   return ret;
+
+   instance->target = state;
+   cdev->updated = false;
+   thermal_cdev_update(cdev);
+
+   return 0;
+}
+
 /* sys I/F for thermal zone */
 
 #define to_thermal_zone(_dev) \
@@ -1092,86 +1172,6 @@ static const struct attribute_group *thermal_zone_attribute_groups[] = {
NULL
 };
 
-/**
- * power_actor_get_max_power() - get the maximum power that a cdev can consume
- * @cdev:  pointer to &thermal_cooling_device
- * @tz:a valid thermal zone device pointer
- * @max_power: pointer in which to store the maximum power
- *
- * Calculate the maximum power consumption in milliwats that the
- * cooling device can currently consume and store it in @max_power.
- *
- * Return: 0 on success, -EINVAL if @cdev doesn't support the
- * power_actor API or -E* on other error.
- */
-int power_actor_get_max_power(struct thermal_cooling_device *cdev,
- struct thermal_zone_device *tz, u32 *max_power)
-{
-   if (!cdev_is_power_actor(cdev))
-   return -EINVAL;
-
-   return cdev->ops->state2power(cdev, tz, 0, max_power);
-}
-
-/**
- * power_actor_get_min_power() - get the mainimum power that a cdev can consume
- * @cdev:  pointer to &thermal_cooling_device
- * @tz:a valid thermal zone device pointer
- * @min_power: pointer in which to store the minimum power
- *
- * Calculate the minimum power consumption in milliwatts that the
- * cooling device can currently consume and store it in @min_power.
- *
- * Return: 0 on success, -EINVAL if @cdev doesn't support the
- * power_actor API or -E* on other error.
- */
-int power_actor_get_min_power(struct thermal_cooling_device *cdev,
- struct 

[RFC PATCH 10/11] thermal: create tz->device.groups dynamically

2016-04-23 Thread Eduardo Valentin
This patch allows groups to be created dynamically. For now we
create only the existing groups. However, this prepares for creating
trip groups, which can be determined only once the number of trips is
known at runtime.
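
The approach can be sketched in userspace C as follows. This is an illustration only: `struct group`, `base_groups` and `create_groups()` are hypothetical names, and `calloc()` stands in for a zeroing kernel allocator. The static array carries no terminator, and the zeroed extra slot of the dynamic copy provides it.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an attribute group. */
struct group {
	const char *name;
};

struct group base_a = { "zone" };
struct group base_b = { "mode" };
struct group base_c = { "passive" };

/* Static list without a terminator; its length comes from sizeof. */
const struct group *const base_groups[] = { &base_a, &base_b, &base_c };
#define BASE_COUNT (sizeof(base_groups) / sizeof(base_groups[0]))

/*
 * Copy the static groups into a zeroed array with one spare slot.
 * The spare slot stays NULL and terminates the array. The caller
 * owns the result and must free() it.
 */
const struct group **create_groups(void)
{
	size_t size = BASE_COUNT + 1;
	const struct group **groups;
	size_t i;

	groups = calloc(size, sizeof(*groups));
	if (!groups)
		return NULL;

	for (i = 0; i < size - 1; i++)
		groups[i] = base_groups[i];

	return groups;
}
```

Because the copy is allocated per device, its lifetime has to be managed explicitly, and an allocation failure has to be reported to whoever registers the device.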

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index b1b2945..13a85d7 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1169,7 +1169,7 @@ static const struct attribute_group *thermal_zone_attribute_groups[] = {
&thermal_zone_attribute_group,
&thermal_zone_mode_attribute_group,
&thermal_zone_passive_attribute_group,
-   NULL
+   /* This is not NULL terminated as we create the group dynamically */
 };
 
 /**
@@ -1281,6 +1281,25 @@ static void remove_trip_attrs(struct thermal_zone_device *tz)
kfree(tz->trip_hyst_attrs);
 }
 
+static int thermal_zone_create_device_groups(struct thermal_zone_device *tz)
+{
+   const struct attribute_group **groups;
+   int i, size;
+
+   size = ARRAY_SIZE(thermal_zone_attribute_groups) + 1;
+   /* This also takes care of API requirement to be NULL terminated */
+   groups = kzalloc(size * sizeof(*groups), GFP_KERNEL);
+   if (!groups)
+   return -ENOMEM;
+
+   for (i = 0; i < size - 1; i++)
+   groups[i] = thermal_zone_attribute_groups[i];
+
+   tz->device.groups = groups;
+
+   return 0;
+}
+
 /* sys I/F for cooling device */
 #define to_cooling_device(_dev)\
container_of(_dev, struct thermal_cooling_device, device)
@@ -1913,7 +1932,7 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
tz->polling_delay = polling_delay;
 
/* Add nodes that are always present via .groups */
-   tz->device.groups = thermal_zone_attribute_groups;
+   thermal_zone_create_device_groups(tz);
/* A new thermal zone needs to be updated anyway. */
atomic_set(&tz->need_update, 1);
 
@@ -2042,7 +2061,7 @@ void thermal_zone_device_unregister(struct thermal_zone_device *tz)
idr_destroy(&tz->idr);
mutex_destroy(&tz->lock);
device_unregister(&tz->device);
-   return;
+   kfree(tz->device.groups);
 }
 EXPORT_SYMBOL_GPL(thermal_zone_device_unregister);
 
-- 
2.1.4



[RFC PATCH 09/11] thermal: move the trip attrs to the tz sysfs I/F section

2016-04-23 Thread Eduardo Valentin
Code reorganization to keep all the sysfs I/F of a thermal zone in the
same section.

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 218 -
 1 file changed, 109 insertions(+), 109 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 7c95978..b1b2945 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1172,6 +1172,115 @@ static const struct attribute_group *thermal_zone_attribute_groups[] = {
NULL
 };
 
+/**
+ * create_trip_attrs() - create attributes for trip points
+ * @tz:the thermal zone device
+ * @mask:  Writeable trip point bitmap.
+ *
+ * helper function to instantiate sysfs entries for every trip
+ * point and its properties of a struct thermal_zone_device.
+ *
+ * Return: 0 on success, the proper error value otherwise.
+ */
+static int create_trip_attrs(struct thermal_zone_device *tz, int mask)
+{
+   int indx;
+   int size = sizeof(struct thermal_attr) * tz->trips;
+
+   tz->trip_type_attrs = kzalloc(size, GFP_KERNEL);
+   if (!tz->trip_type_attrs)
+   return -ENOMEM;
+
+   tz->trip_temp_attrs = kzalloc(size, GFP_KERNEL);
+   if (!tz->trip_temp_attrs) {
+   kfree(tz->trip_type_attrs);
+   return -ENOMEM;
+   }
+
+   if (tz->ops->get_trip_hyst) {
+   tz->trip_hyst_attrs = kzalloc(size, GFP_KERNEL);
+   if (!tz->trip_hyst_attrs) {
+   kfree(tz->trip_type_attrs);
+   kfree(tz->trip_temp_attrs);
+   return -ENOMEM;
+   }
+   }
+
+
+   for (indx = 0; indx < tz->trips; indx++) {
+   /* create trip type attribute */
+   snprintf(tz->trip_type_attrs[indx].name, THERMAL_NAME_LENGTH,
+"trip_point_%d_type", indx);
+
+   sysfs_attr_init(&tz->trip_type_attrs[indx].attr.attr);
+   tz->trip_type_attrs[indx].attr.attr.name =
+   tz->trip_type_attrs[indx].name;
+   tz->trip_type_attrs[indx].attr.attr.mode = S_IRUGO;
+   tz->trip_type_attrs[indx].attr.show = trip_point_type_show;
+
+   device_create_file(&tz->device,
+  &tz->trip_type_attrs[indx].attr);
+
+   /* create trip temp attribute */
+   snprintf(tz->trip_temp_attrs[indx].name, THERMAL_NAME_LENGTH,
+"trip_point_%d_temp", indx);
+
+   sysfs_attr_init(&tz->trip_temp_attrs[indx].attr.attr);
+   tz->trip_temp_attrs[indx].attr.attr.name =
+   tz->trip_temp_attrs[indx].name;
+   tz->trip_temp_attrs[indx].attr.attr.mode = S_IRUGO;
+   tz->trip_temp_attrs[indx].attr.show = trip_point_temp_show;
+   if (IS_ENABLED(CONFIG_THERMAL_WRITABLE_TRIPS) &&
+   mask & (1 << indx)) {
+   tz->trip_temp_attrs[indx].attr.attr.mode |= S_IWUSR;
+   tz->trip_temp_attrs[indx].attr.store =
+   trip_point_temp_store;
+   }
+
+   device_create_file(&tz->device,
+  &tz->trip_temp_attrs[indx].attr);
+
+   /* create Optional trip hyst attribute */
+   if (!tz->ops->get_trip_hyst)
+   continue;
+   snprintf(tz->trip_hyst_attrs[indx].name, THERMAL_NAME_LENGTH,
+"trip_point_%d_hyst", indx);
+
+   sysfs_attr_init(&tz->trip_hyst_attrs[indx].attr.attr);
+   tz->trip_hyst_attrs[indx].attr.attr.name =
+   tz->trip_hyst_attrs[indx].name;
+   tz->trip_hyst_attrs[indx].attr.attr.mode = S_IRUGO;
+   tz->trip_hyst_attrs[indx].attr.show = trip_point_hyst_show;
+   if (tz->ops->set_trip_hyst) {
+   tz->trip_hyst_attrs[indx].attr.attr.mode |= S_IWUSR;
+   tz->trip_hyst_attrs[indx].attr.store =
+   trip_point_hyst_store;
+   }
+
+   device_create_file(&tz->device,
+  &tz->trip_hyst_attrs[indx].attr);
+   }
+   return 0;
+}
+
+static void remove_trip_attrs(struct thermal_zone_device *tz)
+{
+   int indx;
+
+   for (indx = 0; indx < tz->trips; indx++) {
+   device_remove_file(&tz->device,
+  &tz->trip_type_attrs[indx].attr);
+   device_remove_file(&tz->device,
+  &tz->trip_temp_attrs[indx].attr);
+   if (tz->ops->get_trip_hyst)
+   device_remove_file(&tz->device,
+ &tz->trip_hyst_attrs[indx].attr);

[RFC PATCH 11/11] thermal: move trips attributes to tz->device.groups

2016-04-23 Thread Eduardo Valentin
Finally, move the last thermal zone sysfs attributes to
tz->device.groups: the trips attributes. This requires adding an
attribute_group to thermal_zone_device, creating it dynamically, and
then setting all trips attributes in it. The trips attribute group is
then added to tz->device.groups.

As the removal of all attributes is handled by the device core, the
device remove calls are no longer needed.
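
One way to picture the dynamically built attribute array is as three consecutive blocks of `trips` entries (type, temp, hyst), indexed by `indx`, `indx + trips` and `indx + trips * 2`. The userspace sketch below uses hypothetical names (`struct attr`, `build_trip_attrs()`); the extra terminating slot is an assumption of this sketch, added because attribute arrays are walked until a NULL entry.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a per-trip sysfs attribute. */
struct attr {
	const char *kind;
	int trip;
};

/*
 * Build one flat pointer array laid out as three blocks of `trips`
 * entries each: [type...][temp...][hyst...]. The array is zeroed and
 * one slot longer than needed so that it always ends with NULL.
 * `hyst` may be NULL when hysteresis is unsupported; its block then
 * stays empty. The caller must free() the result.
 */
struct attr **build_trip_attrs(struct attr *type, struct attr *temp,
			       struct attr *hyst, int trips)
{
	struct attr **attrs;
	int i;

	attrs = calloc((size_t)trips * 3 + 1, sizeof(*attrs));
	if (!attrs)
		return NULL;

	for (i = 0; i < trips; i++) {
		attrs[i] = &type[i];
		attrs[i + trips] = &temp[i];
		if (hyst)
			attrs[i + trips * 2] = &hyst[i];
	}

	return attrs;
}
```

Each block must point into its own backing array; pointing two blocks at the same backing array would register the same attribute twice.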

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 75 +-
 include/linux/thermal.h|  2 ++
 2 files changed, 39 insertions(+), 38 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 13a85d7..e844a04 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1184,8 +1184,9 @@ static const struct attribute_group *thermal_zone_attribute_groups[] = {
  */
 static int create_trip_attrs(struct thermal_zone_device *tz, int mask)
 {
-   int indx;
int size = sizeof(struct thermal_attr) * tz->trips;
+   struct attribute **attrs;
+   int indx;
 
tz->trip_type_attrs = kzalloc(size, GFP_KERNEL);
if (!tz->trip_type_attrs)
@@ -1206,6 +1207,14 @@ static int create_trip_attrs(struct thermal_zone_device *tz, int mask)
}
}
 
+   attrs = kzalloc(sizeof(*attrs) * tz->trips * 3, GFP_KERNEL);
+   if (!attrs) {
+   kfree(tz->trip_type_attrs);
+   kfree(tz->trip_temp_attrs);
+   if (tz->ops->get_trip_hyst)
+   kfree(tz->trip_hyst_attrs);
+   return -ENOMEM;
+   }
 
for (indx = 0; indx < tz->trips; indx++) {
/* create trip type attribute */
@@ -1217,9 +1226,7 @@ static int create_trip_attrs(struct thermal_zone_device 
*tz, int mask)
tz->trip_type_attrs[indx].name;
tz->trip_type_attrs[indx].attr.attr.mode = S_IRUGO;
tz->trip_type_attrs[indx].attr.show = trip_point_type_show;
-
-   device_create_file(&tz->device,
-  &tz->trip_type_attrs[indx].attr);
+   attrs[indx] = &tz->trip_type_attrs[indx].attr.attr;
 
/* create trip temp attribute */
snprintf(tz->trip_temp_attrs[indx].name, THERMAL_NAME_LENGTH,
@@ -1236,9 +1243,7 @@ static int create_trip_attrs(struct thermal_zone_device 
*tz, int mask)
tz->trip_temp_attrs[indx].attr.store =
trip_point_temp_store;
}
-
-   device_create_file(&tz->device,
-  &tz->trip_temp_attrs[indx].attr);
+   attrs[indx + tz->trips] = &tz->trip_type_attrs[indx].attr.attr;
 
/* create Optional trip hyst attribute */
if (!tz->ops->get_trip_hyst)
@@ -1256,45 +1261,37 @@ static int create_trip_attrs(struct thermal_zone_device 
*tz, int mask)
tz->trip_hyst_attrs[indx].attr.store =
trip_point_hyst_store;
}
-
-   device_create_file(&tz->device,
-  &tz->trip_hyst_attrs[indx].attr);
+   attrs[indx + tz->trips * 2] =
+   &tz->trip_type_attrs[indx].attr.attr;
}
-   return 0;
-}
 
-static void remove_trip_attrs(struct thermal_zone_device *tz)
-{
-   int indx;
+   tz->trips_attribute_group.attrs = attrs;
 
-   for (indx = 0; indx < tz->trips; indx++) {
-   device_remove_file(&tz->device,
-  &tz->trip_type_attrs[indx].attr);
-   device_remove_file(&tz->device,
-  &tz->trip_temp_attrs[indx].attr);
-   if (tz->ops->get_trip_hyst)
-   device_remove_file(&tz->device,
- &tz->trip_hyst_attrs[indx].attr);
-   }
-   kfree(tz->trip_type_attrs);
-   kfree(tz->trip_temp_attrs);
-   kfree(tz->trip_hyst_attrs);
+   return 0;
 }
 
-static int thermal_zone_create_device_groups(struct thermal_zone_device *tz)
+static int thermal_zone_create_device_groups(struct thermal_zone_device *tz,
+int mask)
 {
const struct attribute_group **groups;
-   int i, size;
+   int i, size, result;
+
+   result = create_trip_attrs(tz, mask);
+   if (result)
+   return result;
 
-   size = ARRAY_SIZE(thermal_zone_attribute_groups) + 1;
+   /* we need one extra for trips and the NULL to terminate the array */
+   size = ARRAY_SIZE(thermal_zone_attribute_groups) + 2;
/* This also takes care of API requirement to be NULL terminated */
groups = kzalloc(size * sizeof(*groups), GFP_KERNEL);
if (!groups)
return -ENOMEM;
 
-  

[RFC PATCH 00/11] thermal: sysfs rework

2016-04-23 Thread Eduardo Valentin
Hello Linux PM, Rui,

This is a series of patches for review. In this series I am proposing
to rework how we do sysfs, mainly for thermal zone attributes.

Currently, as many features have been added recently, there is more
than one way of handling attributes. This series is an attempt to
standardize the sysfs handling. Essentially, this will move all
attributes to the dev.groups field, so that the sysfs core code handles
the attributes properly. Apart from the obvious code organization
benefit, this change should also properly take care of attribute
destruction when thermal zones are removed.

The cooling device attributes are more or less handled in this manner
already, but they still require some rework. In this series, I am not
touching them yet.

I don't expect any impact on userspace.

The only change in behavior is that thermal zones with an empty
.type will no longer be allowed to register.

Please share your feedback.

BR,

Eduardo Valentin
--
Eduardo Valentin (11):
  thermal: prevent zones with no types to be registered
  thermal: group thermal_zone DEVICE_ATTR's declarations
  thermal: group device_create_file() calls that are always created
  thermal: use dev.groups to manage always present tz attributes
  thermal: move emul_temp creation to tz->device.groups
  thermal: move mode attribute to tz->device.groups
  thermal: move passive attr to tz->device.groups
  thermal: move power actor code out of sysfs I/F section
  thermal: move the trip attrs to the tz sysfs I/F section
  thermal: create tz->device.groups dynamically
  thermal: move trips attributes to tz->device.groups

 drivers/thermal/thermal_core.c | 549 ++---
 include/linux/thermal.h|   2 +
 2 files changed, 295 insertions(+), 256 deletions(-)

-- 
2.1.4



[RFC PATCH 02/11] thermal: group thermal_zone DEVICE_ATTR's declarations

2016-04-23 Thread Eduardo Valentin
Simply reorganize the code so that all DEVICE_ATTR declarations are
in one place in the file.

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 650e5fa..e28d547 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -917,7 +917,6 @@ emul_temp_store(struct device *dev, struct device_attribute 
*attr,
 
return ret ? ret : count;
 }
-static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
 
 static ssize_t
 sustainable_power_show(struct device *dev, struct device_attribute *devattr,
@@ -948,8 +947,6 @@ sustainable_power_store(struct device *dev, struct 
device_attribute *devattr,
 
return count;
 }
-static DEVICE_ATTR(sustainable_power, S_IWUSR | S_IRUGO, 
sustainable_power_show,
-   sustainable_power_store);
 
 #define create_s32_tzp_attr(name)  \
static ssize_t  \
@@ -992,6 +989,16 @@ create_s32_tzp_attr(slope);
 create_s32_tzp_attr(offset);
 #undef create_s32_tzp_attr
 
+static DEVICE_ATTR(type, 0444, type_show, NULL);
+static DEVICE_ATTR(temp, 0444, temp_show, NULL);
+static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
+static DEVICE_ATTR(passive, S_IRUGO | S_IWUSR, passive_show, passive_store);
+static DEVICE_ATTR(policy, S_IRUGO | S_IWUSR, policy_show, policy_store);
+static DEVICE_ATTR(available_policies, S_IRUGO, available_policies_show, NULL);
+static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
+static DEVICE_ATTR(sustainable_power, S_IWUSR | S_IRUGO, 
sustainable_power_show,
+  sustainable_power_store);
+
 static struct device_attribute *dev_tzp_attrs[] = {
&dev_attr_sustainable_power,
&dev_attr_k_po,
@@ -1099,13 +1106,6 @@ int power_actor_set_power(struct thermal_cooling_device 
*cdev,
return 0;
 }
 
-static DEVICE_ATTR(type, 0444, type_show, NULL);
-static DEVICE_ATTR(temp, 0444, temp_show, NULL);
-static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
-static DEVICE_ATTR(passive, S_IRUGO | S_IWUSR, passive_show, passive_store);
-static DEVICE_ATTR(policy, S_IRUGO | S_IWUSR, policy_show, policy_store);
-static DEVICE_ATTR(available_policies, S_IRUGO, available_policies_show, NULL);
-
 /* sys I/F for cooling device */
 #define to_cooling_device(_dev)\
container_of(_dev, struct thermal_cooling_device, device)
-- 
2.1.4



[RFC PATCH 04/11] thermal: use dev.groups to manage always present tz attributes

2016-04-23 Thread Eduardo Valentin
Thermal zone attributes are all being created using
device_create_file(). This has the disadvantage of making the code
complicated, and sometimes we may miss cleaning them up.

This patch starts to move the thermal zone sysfs attributes to
dev.groups, so that the Linux device core manages them for us. For now,
this patch only moves the attributes that are always present regardless
of thermal zone condition.

This change also has the advantage of cleaning up the thermal zone
parameter sysfs entries, which are currently left uncleaned after
device registration.
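
As a rough illustration of the pattern (a user-space sketch with
simplified stand-in types and a reduced attribute list, not the thermal
core code), the attributes are collected in NULL-terminated arrays that
a core can walk on its own, so the driver does not create or remove
each file by hand. count_attrs() is a hypothetical helper standing in
for that walk.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumption, not the real ones). */
struct attribute { const char *name; };
struct attribute_group { struct attribute **attrs; };

static struct attribute attr_type = { "type" };
static struct attribute attr_temp = { "temp" };
static struct attribute attr_policy = { "policy" };

/* The attribute list ends with NULL so a walker knows where to stop. */
static struct attribute *zone_attrs[] = {
	&attr_type,
	&attr_temp,
	&attr_policy,
	NULL,
};

static struct attribute_group zone_group = {
	.attrs = zone_attrs,
};

/* The list of groups is NULL-terminated as well. */
static const struct attribute_group *zone_groups[] = {
	&zone_group,
	NULL,
};

/* Visit every attribute of every group and return how many were seen. */
static size_t count_attrs(const struct attribute_group **groups)
{
	size_t n = 0;

	assert(groups != NULL);
	for (; *groups; groups++) {
		struct attribute **a;

		for (a = (*groups)->attrs; *a; a++)
			n++;
	}
	return n;
}
```

Creation and removal can both be driven by the same walk, which is why
a single assignment of the groups pointer is enough on the driver side.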

Cc: Zhang Rui 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 drivers/thermal/thermal_core.c | 86 --
 1 file changed, 33 insertions(+), 53 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 2227264..0a7d918 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -989,42 +989,46 @@ create_s32_tzp_attr(slope);
 create_s32_tzp_attr(offset);
 #undef create_s32_tzp_attr
 
+/*
+ * These are thermal zone device attributes that will always be present.
+ * All the attributes created for tzp (create_s32_tzp_attr) also are always
+ * present on the sysfs interface.
+ */
 static DEVICE_ATTR(type, 0444, type_show, NULL);
 static DEVICE_ATTR(temp, 0444, temp_show, NULL);
-static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
-static DEVICE_ATTR(passive, S_IRUGO | S_IWUSR, passive_show, passive_store);
 static DEVICE_ATTR(policy, S_IRUGO | S_IWUSR, policy_show, policy_store);
 static DEVICE_ATTR(available_policies, S_IRUGO, available_policies_show, NULL);
-static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
 static DEVICE_ATTR(sustainable_power, S_IWUSR | S_IRUGO, 
sustainable_power_show,
   sustainable_power_store);
 
-static struct device_attribute *dev_tzp_attrs[] = {
-   &dev_attr_sustainable_power,
-   &dev_attr_k_po,
-   &dev_attr_k_pu,
-   &dev_attr_k_i,
-   &dev_attr_k_d,
-   &dev_attr_integral_cutoff,
-   &dev_attr_slope,
-   &dev_attr_offset,
-};
-
-static int create_tzp_attrs(struct device *dev)
-{
-   int i;
+/* These thermal zone device attributes are created based on conditions */
+static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
+static DEVICE_ATTR(passive, S_IRUGO | S_IWUSR, passive_show, passive_store);
+static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
 
-   for (i = 0; i < ARRAY_SIZE(dev_tzp_attrs); i++) {
-   int ret;
-   struct device_attribute *dev_attr = dev_tzp_attrs[i];
+static struct attribute *thermal_zone_dev_attrs[] = {
+   &dev_attr_type.attr,
+   &dev_attr_temp.attr,
+   &dev_attr_policy.attr,
+   &dev_attr_available_policies.attr,
+   &dev_attr_sustainable_power.attr,
+   &dev_attr_k_po.attr,
+   &dev_attr_k_pu.attr,
+   &dev_attr_k_i.attr,
+   &dev_attr_k_d.attr,
+   &dev_attr_integral_cutoff.attr,
+   &dev_attr_slope.attr,
+   &dev_attr_offset.attr,
+};
 
-   ret = device_create_file(dev, dev_attr);
-   if (ret)
-   return ret;
-   }
+static struct attribute_group thermal_zone_attribute_group = {
+   .attrs = thermal_zone_dev_attrs,
+};
 
-   return 0;
-}
+static const struct attribute_group *thermal_zone_attribute_groups[] = {
+   &thermal_zone_attribute_group,
+   NULL
+};
 
 /**
  * power_actor_get_max_power() - get the maximum power that a cdev can consume
@@ -1846,6 +1850,9 @@ struct thermal_zone_device 
*thermal_zone_device_register(const char *type,
tz->trips = trips;
tz->passive_delay = passive_delay;
tz->polling_delay = polling_delay;
+
+   /* Add nodes that are always present via .groups */
+   tz->device.groups = thermal_zone_attribute_groups;
/* A new thermal zone needs to be updated anyway. */
atomic_set(&tz->need_update, 1);
 
@@ -1892,29 +1899,6 @@ struct thermal_zone_device 
*thermal_zone_device_register(const char *type,
goto unregister;
}
 
-   result = device_create_file(&tz->device, &dev_attr_type);
-   if (result)
-   goto unregister;
-
-   result = device_create_file(&tz->device, &dev_attr_temp);
-   if (result)
-   goto unregister;
-
-   /* Create policy attribute */
-   result = device_create_file(&tz->device, &dev_attr_policy);
-   if (result)
-   goto unregister;
-
-   /* Create available_policies attribute */
-   result = device_create_file(&tz->device, &dev_attr_available_policies);
-   if (result)
-   goto unregister;
-
-   /* Add thermal zone params */
-   result = create_tzp_attrs(&tz->device);
-   if (result)
-   goto unregister;
-
/* Update 'this' zone's governor information */
mutex_lock(&thermal_governor_lock);
 
@@ -2009,12 +1993,8 @@ void thermal_zone_device_unregister(struct 
thermal_zone_device *tz)
 
thermal_zone_device_set_polling(tz, 0);
 
-   

Re: [PATCH v2] ARM: dts: artpec: update clock bindings in artpec6.dtsi

2016-04-23 Thread Arnd Bergmann
On Monday 14 March 2016, Lars Persson wrote:
> The clock binding for the main clock controller was changed to an
> indexed controller style binding on request of the clk
> maintainers. This updates the dtsi to use the new bindings.
> 
> Signed-off-by: Lars Persson 
> ---
> v2: Use numerical clock indexes to enable merge before the clock driver 
> bindings
> are in the tree.
> 
>  arch/arm/boot/dts/artpec6.dtsi | 99 
> +-
>  1 file changed, 20 insertions(+), 79 deletions(-)

I found this patch while going through stuff that had not been applied yet.
Is this still the latest version that we should apply for v4.7?

Arnd


[PATCH] RAID Cleanup for bio-split

2016-04-23 Thread Shaun Tancheff
It looks like a few minor changes were missed in the RAID code.

A couple of checks for the REQ_PREFLUSH flag should also check for
bi_op matching REQ_OP_FLUSH.

Wrappers for sync_page_io() are passed READ/WRITE but need to
be passed REQ_OP_READ and REQ_OP_WRITE.
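
A minimal user-space sketch of the two checks, assuming the operation
and the modifier flags live in separate fields as the bi_op/bi_rw split
suggests. struct sketch_bio, the numeric values of the constants and
both helper functions are hypothetical and exist only for this
illustration; they are not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder values; only the op/flag separation matters here. */
enum sketch_req_op {
	REQ_OP_READ,
	REQ_OP_WRITE,
	REQ_OP_DISCARD,
	REQ_OP_FLUSH,
};

#define REQ_PREFLUSH (1u << 0)

/* Hypothetical stand-in for the two struct bio fields used below. */
struct sketch_bio {
	enum sketch_req_op bi_op;	/* what the request does */
	unsigned int bi_rw;		/* modifier flags only */
};

/* A request needs flush handling if either the flag or the op says so. */
static bool bio_needs_flush(const struct sketch_bio *bio)
{
	assert(bio != NULL);
	return (bio->bi_rw & REQ_PREFLUSH) || bio->bi_op == REQ_OP_FLUSH;
}

/* A failed I/O counts as a write error only when the op was a write. */
static bool failed_io_is_write_error(enum sketch_req_op op, bool io_ok)
{
	return !io_ok && op == REQ_OP_WRITE;
}
```

Comparing against the operation constant, rather than a READ/WRITE
direction value, keeps the check tied to the same namespace the caller
passes in.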

Signed-off-by: Shaun Tancheff 
---
 drivers/md/raid0.c  |  3 ++-
 drivers/md/raid1.c  | 13 +++--
 drivers/md/raid10.c | 21 ++---
 drivers/md/raid5.c  |  3 ++-
 4 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index f95463d..46e9ba8 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -458,7 +458,8 @@ static void raid0_make_request(struct mddev *mddev, struct 
bio *bio)
struct md_rdev *tmp_dev;
struct bio *split;
 
-   if (unlikely(bio->bi_rw & REQ_PREFLUSH)) {
+   if (unlikely(bio->bi_rw & REQ_PREFLUSH) ||
+   unlikely(bio->bi_op == REQ_OP_FLUSH)) {
md_flush_request(mddev, bio);
return;
}
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 2a2c177..f7c0577 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1771,12 +1771,12 @@ static void end_sync_write(struct bio *bio)
 }
 
 static int r1_sync_page_io(struct md_rdev *rdev, sector_t sector,
-   int sectors, struct page *page, int rw)
+   int sectors, struct page *page, int op)
 {
-   if (sync_page_io(rdev, sector, sectors << 9, page, rw, 0, false))
+   if (sync_page_io(rdev, sector, sectors << 9, page, op, 0, false))
/* success */
return 1;
-   if (rw == WRITE) {
+   if (op == REQ_OP_WRITE) {
set_bit(WriteErrorSeen, &rdev->flags);
if (!test_and_set_bit(WantReplacement,
  &rdev->flags))
@@ -1883,7 +1883,7 @@ static int fix_sync_read_error(struct r1bio *r1_bio)
rdev = conf->mirrors[d].rdev;
if (r1_sync_page_io(rdev, sect, s,
bio->bi_io_vec[idx].bv_page,
-   WRITE) == 0) {
+   REQ_OP_WRITE) == 0) {
r1_bio->bios[d]->bi_end_io = NULL;
rdev_dec_pending(rdev, mddev);
}
@@ -2118,7 +2118,7 @@ static void fix_read_error(struct r1conf *conf, int 
read_disk,
if (rdev &&
!test_bit(Faulty, &rdev->flags))
r1_sync_page_io(rdev, sect, s,
-   conf->tmppage, WRITE);
+   conf->tmppage, REQ_OP_WRITE);
}
d = start;
while (d != read_disk) {
@@ -2130,7 +2130,7 @@ static void fix_read_error(struct r1conf *conf, int 
read_disk,
if (rdev &&
!test_bit(Faulty, &rdev->flags)) {
if (r1_sync_page_io(rdev, sect, s,
-   conf->tmppage, READ)) {
+   conf->tmppage, REQ_OP_READ)) {
atomic_add(s, &rdev->corrected_errors);
printk(KERN_INFO
   "md/raid1:%s: read error corrected "
@@ -2204,6 +2204,7 @@ static int narrow_write_error(struct r1bio *r1_bio, int i)
}
 
wbio->bi_op = REQ_OP_WRITE;
+   wbio->bi_rw = 0;
wbio->bi_iter.bi_sector = r1_bio->sector;
wbio->bi_iter.bi_size = r1_bio->sectors << 9;
 
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index c5dc4e4..44e87c2 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1364,8 +1364,7 @@ retry_write:
mbio->bi_bdev = rdev->bdev;
mbio->bi_end_io = raid10_end_write_request;
mbio->bi_op = op;
-   mbio->bi_rw =
-   do_sync | do_fua | do_sec;
+   mbio->bi_rw = do_sync | do_fua | do_sec;
mbio->bi_private = r10_bio;
 
atomic_inc(&r10_bio->remaining);
@@ -1408,8 +1407,7 @@ retry_write:
mbio->bi_bdev = rdev->bdev;
mbio->bi_end_io = raid10_end_write_request;
mbio->bi_op = op;
-   mbio->bi_rw =
-   do_sync | do_fua | do_sec;
+   mbio->bi_rw = do_sync | do_fua | do_sec;
mbio->bi_private = r10_bio;
 
atomic_inc(&r10_bio->remaining);
@@ -1452,7 +1450,8 @@ static void raid10_make_request(struct mddev *mddev, 
struct bio *bio)
 
struct bio *split;
 
-   if 

Re: [PATCHv5 2/3] x86/vdso: add mremap hook to vm_special_mapping

2016-04-23 Thread kbuild test robot
Hi,

[auto build test ERROR on v4.6-rc4]
[cannot apply to tip/x86/core tip/x86/vdso next-20160422]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improving the system]

url:
https://github.com/0day-ci/linux/commits/Dmitry-Safonov/x86-rename-is_-ia32-x32-_task-to-in_-ia32-x32-_syscall/20160418-214656
config: x86_64-randconfig-s0-04240623 (attached as .config)
compiler: gcc-5 (Debian 5.3.1-14) 5.3.1 20160409
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

Note: the 
linux-review/Dmitry-Safonov/x86-rename-is_-ia32-x32-_task-to-in_-ia32-x32-_syscall/20160418-214656
 HEAD aed83f1dd951908724a5ba564e2b03d68a9fc7b8 builds fine.
  It only hurts bisectibility.

All errors (new ones prefixed by >>):

   arch/x86/entry/vdso/vma.c: In function 'vdso_mremap':
>> arch/x86/entry/vdso/vma.c:114:37: error: 'vdso_image_32' undeclared (first use in this function)
     if (in_ia32_syscall() && image == &vdso_image_32) {
                                        ^
   arch/x86/entry/vdso/vma.c:114:37: note: each undeclared identifier is reported only once for each function it appears in

vim +/vdso_image_32 +114 arch/x86/entry/vdso/vma.c

   108  if (image->size != new_size)
   109  return -EINVAL;
   110  
   111  if (current->mm != new_vma->vm_mm)
   112  return -EFAULT;
   113  
 > 114  if (in_ia32_syscall() && image == &vdso_image_32) {
   115  struct pt_regs *regs = current_pt_regs();
   116  unsigned long vdso_land = image->sym_int80_landing_pad;
   117  unsigned long old_land_addr = vdso_land +

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: Binary data


[GIT PULL] Thermal management updates for v4.6-rc5

2016-04-23 Thread Eduardo Valentin
Hello Linus,

Here is a set of fixes for the thermal subsystem.

Please consider pulling from

  git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal fixes

to receive Thermal Management updates for v4.6-rc5 with top-most

a6f4850dbca66e46a73b8774e85aaf9fc0caf265:

  thermal: fix Mediatek thermal controller build (2016-04-20 21:13:21 -0700)

on top of commit 55f058e7574c3615dea4615573a19bdb258696c6:

  Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 (2016-04-20 12:00:07 -0700)

Specifics in this pull request:
- Fixes in mediatek and OF thermal drivers
- Fixes in power_allocator governor
- More fixes for the unsigned-to-int type change in thermal_core.c.
- These changes have been CI tested using the KernelCI bot [1,2]. \o/

[1] - https://kernelci.org/boot/all/job/evalenti/kernel/v4.6-rc4-63-g58b76ffa037f/
[2] - https://kernelci.org/build/evalenti/kernel/v4.6-rc4-63-g58b76ffa037f/

BR,

Eduardo Valentin


Javi Merino (1):
  thermal: power_allocator: req_range multiplication should be a 64 bit type

Johannes Berg (1):
  thermal: fix Mediatek thermal controller build

Julia Lawall (1):
  thermal: of: add __init attribute

Randy Dunlap (2):
  thermal: minor mtk_thermal.c cleanups
  thermal: fix mtk_thermal build dependency

Wei Ni (1):
  thermal: consistently use int for trip temp

 drivers/thermal/Kconfig           | 2 ++
 drivers/thermal/mtk_thermal.c     | 3 +--
 drivers/thermal/of-thermal.c      | 4 ++--
 drivers/thermal/power_allocator.c | 2 +-
 drivers/thermal/thermal_core.c    | 8 ++++----
 include/linux/thermal.h           | 4 ++--
 6 files changed, 12 insertions(+), 11 deletions(-)


Re: [PATCH] kvm: x86: make lapic hrtimer pinned

2016-04-23 Thread Wanpeng Li
2016-04-22 21:12 GMT+08:00 Luiz Capitulino :
> On Fri, 22 Apr 2016 07:12:51 +0800
> Wanpeng Li  wrote:
>
>> 2016-04-05 20:40 GMT+08:00 Luiz Capitulino :
>> > On Tue, 5 Apr 2016 14:18:01 +0800
>> > Yang Zhang  wrote:
>> >
>> >> On 2016/4/5 5:00, Rik van Riel wrote:
>> >> > On Mon, 2016-04-04 at 16:46 -0400, Luiz Capitulino wrote:
>> >> >> When a vCPU runs on a nohz_full core, the hrtimer used by
>> >> >> the lapic emulation code can be migrated to another core.
>> >> >> When this happens, it's possible to observe millisecond
>> >> >> latency when delivering timer IRQs to KVM guests.
>> >> >>
>> >> >> The huge latency is mainly due to the fact that
>> >> >> apic_timer_fn() expects to run during a kvm exit. It
>> >> >> sets KVM_REQ_PENDING_TIMER and lets it be handled on kvm
>> >> >> entry. However, if the timer fires on a different core,
>> >> >> we have to wait until the next kvm exit for the guest
>> >> >> to see KVM_REQ_PENDING_TIMER set.
>> >> >>
>> >> >> This problem became visible after commit 9642d18ee. This
>> >> >> commit changed the timer migration code to always attempt
>> >> >> to migrate timers away from nohz_full cores. While it's
>> >> >> discussable if this is correct/desirable (I don't think
>> >> >> it is), it's clear that the lapic emulation code has
>> >> >> a requirement on firing the hrtimer in the same core
>> >> >> where it was started. This is achieved by making the
>> >> >> hrtimer pinned.
>> >> >
>> >> > Given that delivering a timer to a guest seems to
>> >> > involve trapping from the guest to the host, anyway,
>> >> > I don't see a downside to your patch.
>> >> >
>> >> > If that is ever changed (eg. allowing delivery of
>> >> > a timer interrupt to a VCPU without trapping to the
>> >> > host), we may want to revisit this.
>> >>
>> >>
>> >> Posted interrupts help in this case. Currently, KVM doesn't use PI for
>> >> the lapic timer because the lapic timer and the VCPU have the same
>> >> affinity. Now, we can change to use PI for the lapic timer. The only
>> >> concern is: how frequent is timer migration in upstream Linux? If it
>> >> is frequent, will it bring additional cost?
>> >
>> > I can't answer this question.
>> >
>> >> BTW, in what case will the migration of timers during VCPU scheduling
>> >> fail?
>> >
>> > For hrtimers (which is the lapic emulation case), it only succeeds if
>> > the destination core has a hrtimer expiring before the hrtimer being
>> > migrated.
>>
>> Interesting, did you figure out why this happens? Actually the clock
>> event device will be reprogrammed if the expiry time of the newly
>> enqueued hrtimer is earlier than that of the leftmost (earliest expiry
>> time) hrtimer in the hrtimer rb tree.
>
> Unless the code has changed very recently, what you describe is
> what happens when queueing a hrtimer in the same core. Migrating a
> hrtimer to a different core is a different case.

You are right!

Regards,
Wanpeng Li
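
[Editor's sketch] The migration rule Luiz describes above can be modelled in a few lines of C. This is a toy model only, assuming the rule exactly as stated in the thread (a hrtimer moves only if the destination core already has a timer expiring before it). None of the toy_* names exist in the kernel, and the real decision in the hrtimer code involves more state than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_cpu_base {
    uint64_t next_expiry_ns; /* earliest hrtimer already queued on this core */
};

/*
 * Rule as stated in the thread: migration succeeds only when the destination
 * core already has a timer expiring before the one being migrated, so the
 * destination's clock event device does not need to be reprogrammed remotely.
 */
static bool toy_can_migrate(uint64_t timer_expiry_ns,
                            const struct toy_cpu_base *dest)
{
    assert(dest);
    return dest->next_expiry_ns < timer_expiry_ns;
}

/* Returns the core the timer ends up on. A pinned timer never moves. */
static int toy_place_timer(uint64_t timer_expiry_ns, bool pinned,
                           int cur_cpu, int dest_cpu,
                           const struct toy_cpu_base *dest)
{
    if (pinned || !toy_can_migrate(timer_expiry_ns, dest))
        return cur_cpu;
    return dest_cpu;
}
```

In this model, pinning makes the placement independent of the destination core's queue, which is the property the patch under discussion wants for the lapic timer.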


4.5.x drm/i915/ + drm/drm_irq + drm/radeon & ACPI problems doing vga_switcheroo switching & getting EDID modes for laptop hybrid graphics with Intel IGC & Radeon Neptune 8970M

2016-04-23 Thread Jason Vas Dias
I have not so far been able to get my Radeon 8970M discrete graphics card
with GPU to go into graphics mode under Linux 4.4.0+
(tried 4.4.0, 4.5.0, 4.5.1, ...) on my Clevo KAPOK laptop x86_64 LFS system,
which has:
CPU: Intel(R) Core(TM) i7-4910MQ CPU @ 2.90GHz
RAM: 16GB; Disk: 1TB SATA + 256MB SSD

$ lspci -nn | grep VGA
00:02.0 VGA compatible controller [0300]: Intel Corporation 4th Gen
Core Processor Integrated Graphics Controller [8086:0416] (rev 06)
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc.
[AMD/ATI] Neptune XT [Radeon HD 8970M] [1002:6801]

So far, the Neptune card will only go into graphics mode when driven
by the closed source FGLRX driver under a Linux 3.10 / RHEL-7 clone.
I'm trying to get it working under Linux 4.4.0+, whose
'drivers/gpu/drm/radeon' driver claims to support the card.

Persistently, the Xorg server v1.18.3 with Xorg Radeon Driver v7.7.0
(latest stable GIT versions) reports "No modes" and is unable to discover
any probed EDID display modes for the card, as shown by the Xorg.0.log
excerpt:
[  1503.772] (II) Loading /usr/lib64/xorg/modules/drivers/radeon_drv.so
[  1503.773] (II) Module radeon: vendor="X.Org Foundation"
[  1503.773] 	compiled for 1.18.3, module version = 7.7.0
[  1503.775] 	Module class: X.Org Video Driver
[  1503.775] 	ABI class: X.Org Video Driver, version 20.0
[  1503.775] (II) LoadModule: "intel"
[  1503.777] (II) Loading /usr/lib64/xorg/modules/drivers/intel_drv.so
[  1503.778] (II) Module intel: vendor="X.Org Foundation"
[  1503.778] 	compiled for 1.18.3, module version = 2.99.917
[  1503.779] 	Module class: X.Org Video Driver
[  1503.780] 	ABI class: X.Org Video Driver, version 20.0
...
[  1503.788] (II) RADEON: Driver for ATI Radeon chipsets:
...

[  1503.957] (II) [KMS] Kernel modesetting enabled.
[  1503.957] (II) intel(1): Using Kernel Mode Setting driver: i915, version 1.6.0 20151218
[  1503.957] (EE) Screen 1 deleted because of no matching config section.
[  1503.957] (II) UnloadModule: "intel"
[  1503.957] (II) RADEON(0): RADEONPreInit_KMS
[  1503.957] (==) RADEON(0): Depth 24, (--) framebuffer bpp 32
[  1503.957] (II) RADEON(0): Pixel depth = 24 bits stored in 4 bytes (32 bpp pixmaps)
[  1503.957] (==) RADEON(0): Default visual is TrueColor
[  1503.957] (**) RADEON(0): Option "DRI" "3"
[  1503.957] (==) RADEON(0): RGB weight 888
[  1503.957] (II) RADEON(0): Using 8 bits per RGB (8 bit DAC)
[  1503.957] (--) RADEON(0): Chipset: "PITCAIRN" (ChipID = 0x6801)
[  1503.957] (II) Loading sub module "fb"
[  1503.957] (II) LoadModule: "fb"
[  1503.957] (II) Loading /usr/lib64/xorg/modules/libfb.so
[  1503.958] (II) Module fb: vendor="X.Org Foundation"
[  1503.958] 	compiled for 1.18.3, module version = 1.0.0
[  1503.958] 	ABI class: X.Org ANSI C Emulation, version 0.4
[  1503.958] (II) Loading sub module "dri2"
[  1503.958] (II) LoadModule: "dri2"
[  1503.958] (II) Module "dri2" already built-in
[  1503.958] (II) Loading sub module "glamoregl"
[  1503.958] (II) LoadModule: "glamoregl"
[  1503.958] (II) Loading /usr/lib64/xorg/modules/libglamoregl.so
[  1503.958] (II) Module glamoregl: vendor="X.Org Foundation"
[  1503.958] 	compiled for 1.18.3, module version = 0.6.0
[  1503.958] 	ABI class: X.Org ANSI C Emulation, version 0.4
[  1503.958] (II) glamor: OpenGL accelerated X.org driver based.
[  1504.023] (II) glamor: EGL version 1.4 (DRI2):
[  1504.023] (II) RADEON(0): glamor detected, initialising EGL layer.
[  1504.023] (II) RADEON(0): KMS Color Tiling: enabled
[  1504.023] (II) RADEON(0): KMS Color Tiling 2D: enabled
[  1504.024] (II) RADEON(0): KMS Pageflipping: enabled
[  1504.024] (II) RADEON(0): SwapBuffers wait for vsync: enabled
[  1504.024] (II) RADEON(0): Initializing outputs ...
[  1504.024] (II) RADEON(0): 0 crtcs needed for screen.
[  1504.024] (II) RADEON(0): Allocated crtc nr. 0 to this screen.
[  1504.024] (II) RADEON(0): Allocated crtc nr. 1 to this screen.
[  1504.024] (II) RADEON(0): Allocated crtc nr. 2 to this screen.
[  1504.024] (II) RADEON(0): Allocated crtc nr. 3 to this screen.
[  1504.024] (II) RADEON(0): Allocated crtc nr. 4 to this screen.
[  1504.024] (II) RADEON(0): Allocated crtc nr. 5 to this screen.
[  1504.024] (WW) RADEON(0): No outputs definitely connected, trying again...
[  1504.024] (WW) RADEON(0): Unable to find connected outputs - setting 1024x768 initial framebuffer
[  1504.024] (II) RADEON(0): Using default gamma of (1.0, 1.0, 1.0) unless otherwise stated.
[  1504.024] (II) RADEON(0): mem size init: gart size :7fbcc000 vram size: s:1 visible:ff916000
[  1504.024] (==) RADEON(0): DPI set to (96, 96)
[  1504.024] (II) Loading sub module "ramdac"
[  1504.024] (II) LoadModule: "ramdac"
[  1504.024] (II) Module "ramdac" already built-in
[  1504.024] (EE) RADEON(0): No modes.
[  1504.024] (II) RADEON(0): RADEONFreeScreen
[  1504.024] (II) UnloadModule: "radeon"
[  1504.024] (II) UnloadSubModule: "glamoregl"
[  1504.024] (II) Unloading glamoregl
[  


Re: [PATCH v3] KVM: remove buggy vcpu id check on vcpu creation

2016-04-23 Thread Wanpeng Li
2016-04-22 21:07 GMT+08:00 Radim Krčmář :
> 2016-04-22 09:40+0800, Wanpeng Li:
>> 2016-04-21 23:29 GMT+08:00 Radim Krčmář :
>>> x86 vcpu_id encodes APIC ID and APIC ID encodes CPU topology by
>>> reserving blocks of bits for socket/core/thread, so if core or thread
>>> count isn't a power of two, then the set of valid APIC IDs is sparse,
>>
>>  ^^^
>>  ^^^
>> Is this the root reason why the recommended max vCPUs per VM is 160 and
>> KVM_MAX_VCPUS is 255, rather than a performance concern?
>
> No, the recommended amount of VCPUs is 160 because I didn't bump it after
> PLE stopped killing big guests. :/
>
> You can get full 255 VCPU guest with a proper configuration, e.g.
> "-smp 255" or "-smp 255,cores=8" and the only problem is scalability,
> but I don't know of anything that doesn't scale to that point.
>
> (Scaling up to 2^32 is harder, because you don't want O(N) search, nor
>  full allocation on smaller guests.  Neither is a big problem now.)

I see, thanks Radim.

Regards,
Wanpeng Li
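
[Editor's sketch] Radim's point about sparse APIC IDs can be illustrated with a small C model. It assumes the simplest possible layout, where each topology level gets just enough bits for its count; the helper names are hypothetical and this is not the encoding code of QEMU or KVM.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to encode values 0..count-1 (0 bits when count == 1). */
static unsigned bits_for(unsigned count)
{
    unsigned bits = 0;

    assert(count >= 1);
    while ((1u << bits) < count)
        bits++;
    return bits;
}

/*
 * Hypothetical helper: pack socket/core/thread into an APIC-ID-like value by
 * reserving a block of bits per level, as described in the thread.
 */
static uint32_t toy_apic_id(unsigned socket, unsigned core, unsigned thread,
                            unsigned cores_per_socket,
                            unsigned threads_per_core)
{
    unsigned thread_bits = bits_for(threads_per_core);
    unsigned core_bits = bits_for(cores_per_socket);

    assert(core < cores_per_socket && thread < threads_per_core);
    return (socket << (core_bits + thread_bits)) | (core << thread_bits) | thread;
}
```

With 3 cores per socket and 1 thread per core, the core field needs 2 bits, so socket 0 uses IDs 0 to 2 and socket 1 starts at 4; ID 3 is never used. A guest with 6 such vCPUs therefore has a largest ID of 6, which is not smaller than the vCPU count. With 8 cores per socket the IDs are dense.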


Re: WARNING: CPU: 1 PID: 1 at kernel/events/core.c:7825 perf_pmu_register+0x385/0x390

2016-04-23 Thread Peter Zijlstra
On Sat, Apr 23, 2016 at 03:03:22PM +0200, Borislav Petkov wrote:
> Yo,
> 
> did the fix for this go anywhere? I'm still seeing it on rc4+tip/master:
> 
> [0.760493] AMD Power PMU detected
> [0.760689] LVT offset 0 assigned for vector 0x400
> [0.761072] perf: AMD IBS detected (0x07ff)
> [0.761340] ------------[ cut here ]------------
> [0.761571] WARNING: CPU: 1 PID: 1 at kernel/events/core.c:7825 perf_pmu_register+0x385/0x390
> [0.761909] Modules linked in:
> [0.762093] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc4+ #1
> [0.762331] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08 01/28/2016
> [0.762674]   812dafa9  
> 
> [0.763185]  810589ac 81a0fbe0 000a 
> 8172ee85
> [0.763672]   88042bdf9130 0008b000 
> 8112ff55
> [0.764101] Call Trace:
> [0.764240]  [] ? dump_stack+0x5c/0x83
> [0.764454]  [] ? __warn+0xec/0x110
> [0.764680]  [] ? perf_pmu_register+0x385/0x390
> [0.764956]  [] ? msr_init+0xbe/0xbe
> [0.765202]  [] ? amd_iommu_pc_init+0xd4/0x141
> [0.765475]  [] ? do_one_initcall+0xaf/0x200
> [0.765718]  [] ? parse_args+0x2ab/0x4c0
> [0.765935]  [] ? kernel_init_freeable+0x111/0x190
> [0.766175]  [] ? kernel_init+0xa/0x100
> [0.766389]  [] ? ret_from_fork+0x22/0x40
> [0.766620]  [] ? rest_init+0x90/0x90
> [0.766885] ---[ end trace 9285cdb6cf96a9b2 ]---
> [0.767120] perf: amd_iommu: Detected. (0 banks, 0 counters/bank)
> 

Oh, cute there's two different ones.

31d50c551e30 ("perf/x86/amd/uncore: Do not register a task ctx for uncore PMUs")

Doth the below fixeth thingies?

---
Subject: perf/amd/iommu: Do not register a task ctx for uncore like PMUs

The new sanity check introduced by:

  26657848502b ("perf/core: Verify we have a single perf_hw_context PMU")

... triggered on the AMD IOMMU driver.

IOMMUs are not per logical CPU, they cannot have per-task counters. Fix it.

Cc: Suravee Suthikulpanit 
Cc: Joerg Roedel 
Reported-by: Borislav Petkov 
Signed-off-by: Peter Zijlstra (Intel) 
---
 arch/x86/events/amd/iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c
index 40625ca7a190..6011a573dd64 100644
--- a/arch/x86/events/amd/iommu.c
+++ b/arch/x86/events/amd/iommu.c
@@ -474,6 +474,7 @@ static __init int _init_perf_amd_iommu(
 
 static struct perf_amd_iommu __perf_iommu = {
 	.pmu = {
+		.task_ctx_nr	= perf_invalid_context,
 		.event_init	= perf_iommu_event_init,
 		.add		= perf_iommu_add,
 		.del		= perf_iommu_del,
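
[Editor's sketch] Why a one-line initializer silences the warning can be shown with a toy model of the sanity check. The assumption here is that a zero-initialised task_ctx_nr means "hardware context" and that only one PMU may claim it; the toy_* names are hypothetical and the real check lives in perf_pmu_register().

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the kernel's enum ordering as understood here; an assumption. */
enum toy_task_ctx {
    toy_invalid_context = -1,
    toy_hw_context = 0,
    toy_sw_context,
};

struct toy_pmu {
    const char *name;
    enum toy_task_ctx task_ctx_nr; /* left out of an initializer => toy_hw_context */
};

static bool hw_context_taken;

/* Returns true if registering this PMU would trip the single-hw-context check. */
static bool toy_pmu_register(const struct toy_pmu *pmu)
{
    assert(pmu);
    if (pmu->task_ctx_nr != toy_hw_context)
        return false;          /* uncore-like PMU: no per-task context wanted */
    if (hw_context_taken)
        return true;           /* the kernel would WARN at this point */
    hw_context_taken = true;
    return false;
}
```

In this model a PMU that omits task_ctx_nr silently competes with the CPU PMU for the hardware context, while one that sets the invalid context opts out, which is what the patch does for the IOMMU PMU.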


Re: [PATCH] kvm: x86: do not leak guest xcr0 into host interrupt handlers

2016-04-23 Thread Wanpeng Li
2016-04-23 1:21 GMT+08:00 David Matlack :
> On Fri, Apr 22, 2016 at 12:30 AM, Wanpeng Li  wrote:
>> Hi Paolo and David,
>> 2016-03-31 3:24 GMT+08:00 David Matlack :
>>>
>>> kernel_fpu_begin() saves the current fpu context. If this uses
>>> XSAVE[OPT], it may leave the xsave area in an undesirable state.
>>> According to the SDM, during XSAVE bit i of XSTATE_BV is not modified
>>> if bit i is 0 in xcr0. So it's possible that XSTATE_BV[i] == 1 and
>>> xcr0[i] == 0 following an XSAVE.
>>
>> How does XSAVE save bit i, given that the SDM says "XSAVE saves state
>> component i if and only if RFBM[i] = 1."?  RFBM[i] will be 0 if
>> XSTATE_BV[i] == 1 && guest xcr0[i] == 0.
>
> You are correct, RFBM[i] will be 0 and XSAVE does not save state
> component i in this case. However, XSTATE_BV[i] is left untouched by
> XSAVE (left as 1). On XRSTOR, the CPU checks if XSTATE_BV[i] == 1 &&
> xcr0[i] == 0, and if so delivers a #GP.

However, the SDM also says that "If RFBM[i] = 0, XRSTOR does not
update state component i." So we get a #GP for a bit i that does not even
need to be restored, if XSTATE_BV[i] == 1 && xcr0[i] == 0. That is what I
missed, I think. Thanks for your explanation.

Regards,
Wanpeng Li

>
> If you are wondering how XSTATE_BV[i] could be 1 in the first place, I
> suspect it is left over from a previous XSAVE (which sets XSTATE_BV[i]
> to the value in XINUSE[i]).


LOAN

2016-04-23 Thread Metro Cash Loan

Welcome to Metro Cash Loan
The company would like to inform you that we offer all types of loans
at an interest rate of 2.5%, and this company has helped many people
with loans. So if you know someone who needs financial help, or you need
it yourself, send us an email at accessfinanc...@financier.com

Loan application.
Full name:
Skype ID:
Address:
Occupation:
Monthly income:
Telephone number:
Loan amount needed:
Duration:
Purpose of loan:
Next of kin:

You must fill in the above so that we can proceed with the loan.
Awaiting your prompt reply.
Regards,
Manager.

Address: Metro Cash Loan.
P.O. Box 492148
Los Angeles, CA 90049. UNITED STATES
Tel + 1 872. 400. 8629


Re: [PATCH] generic syscalls: wire up preadv2 and pwritev2 syscalls

2016-04-23 Thread Arnd Bergmann
On Monday 11 April 2016 10:17:46 Andre Przywara wrote:
> These new syscalls are implemented as generic code, so enable them for
> architectures like arm64 which use the generic syscall table.
> 
> Signed-off-by: Andre Przywara 
> 

I've forwarded it now as a pull request. Generally speaking, I'd much prefer
that anyone who adds a syscall also update asm-generic/unistd.h
(as documented in Documentation/adding-syscalls.txt); there is no
need for me to put those patches into the asm-generic git tree first.

On a related topic, I've been thinking (for years) about coming up with a
way to have all future syscalls just get added to a single file in the
kernel to have them appended to the tables for all architectures. There
are two basic methods that seem appropriate here for avoiding the split
between unistd.h and syscalls.S:

a) The current asm-generic method of interleaving the __NR_* macro definitions
   and the entry in a .c file array, including the header multiple times
   to get all the tables
b) generating both files from an input like x86 does with
   arch/x86/entry/syscalls/ infrastructure

I think we need something similar to b) but with some extensions to allow
extending the architecture-specific tables in a nice way, rather than
having to have one file per architecture.
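
For anyone unfamiliar with method a), the idea can be sketched in a single
file roughly like this. It is only an illustration: every name below is
hypothetical, the handlers are stubs, and it uses one list macro expanded
several times instead of the real approach of including unistd.h multiple
times with different __SYSCALL definitions.

```c
/* One list, expanded several times; all entries are made up. */
#define DEMO_SYSCALLS(ENTRY)	\
	ENTRY(0, demo_read)	\
	ENTRY(1, demo_write)	\
	ENTRY(2, demo_preadv2)

typedef long (*demo_syscall_fn)(void);

/* Stub handlers standing in for the real sys_* functions. */
static long demo_read(void)    { return 100; }
static long demo_write(void)   { return 101; }
static long demo_preadv2(void) { return 102; }

/* Expansion 1: the numbers (the role of the __NR_* macros). */
#define DEMO_AS_ENUM(nr, name) DEMO_NR_##name = nr,
enum { DEMO_SYSCALLS(DEMO_AS_ENUM) };

/* Expansion 2: the entry count (assumes dense numbering from 0). */
#define DEMO_COUNT_ONE(nr, name) + 1
enum { DEMO_NR_COUNT = 0 DEMO_SYSCALLS(DEMO_COUNT_ONE) };

/* Expansion 3: the dispatch table, indexed by syscall number. */
#define DEMO_AS_ENTRY(nr, name) [nr] = name,
static const demo_syscall_fn demo_table[DEMO_NR_COUNT] = {
	DEMO_SYSCALLS(DEMO_AS_ENTRY)
};

/* Look up and call a handler; -1 for an out-of-range or empty slot. */
static long demo_dispatch(unsigned int nr)
{
	if (nr >= DEMO_NR_COUNT || !demo_table[nr])
		return -1;
	return demo_table[nr]();
}
```

Adding a syscall then means adding one line to the list, and the number and
the table entry cannot drift apart. What this does not solve is the
per-architecture extension problem mentioned above.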

Arnd


Re: [PATCH] iio: st_gyro: Add lsm9ds0-gyro support

2016-04-23 Thread Jonathan Cameron
On 19/04/16 13:02, Crestez Dan Leonard wrote:
> This device has an identical interface to other supported sensors and the
> patch only adds IDs.
> 
> Signed-off-by: Crestez Dan Leonard 
Applied to the togreg branch of iio.git - initially pushed out as testing
for the autobuilders to play with it.

Thanks,

Jonathan
> ---
>  Documentation/devicetree/bindings/iio/st-sensors.txt | 1 +
>  drivers/iio/gyro/Kconfig | 2 +-
>  drivers/iio/gyro/st_gyro.h   | 1 +
>  drivers/iio/gyro/st_gyro_core.c  | 1 +
>  drivers/iio/gyro/st_gyro_i2c.c   | 5 +
>  drivers/iio/gyro/st_gyro_spi.c   | 1 +
>  6 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/iio/st-sensors.txt b/Documentation/devicetree/bindings/iio/st-sensors.txt
> index 637e283..5844cf7 100644
> --- a/Documentation/devicetree/bindings/iio/st-sensors.txt
> +++ b/Documentation/devicetree/bindings/iio/st-sensors.txt
> @@ -51,6 +51,7 @@ Gyroscopes:
>  - st,l3gd20-gyro
>  - st,l3g4is-gyro
>  - st,lsm330-gyro
> +- st,lsm9ds0-gyro
>  
>  Magnetometers:
>  - st,lsm303agr-magn
> diff --git a/drivers/iio/gyro/Kconfig b/drivers/iio/gyro/Kconfig
> index e816d29..205a844 100644
> --- a/drivers/iio/gyro/Kconfig
> +++ b/drivers/iio/gyro/Kconfig
> @@ -93,7 +93,7 @@ config IIO_ST_GYRO_3AXIS
>   select IIO_TRIGGERED_BUFFER if (IIO_BUFFER)
>   help
> Say yes here to build support for STMicroelectronics gyroscopes:
> -   L3G4200D, LSM330DL, L3GD20, LSM330DLC, L3G4IS, LSM330.
> +   L3G4200D, LSM330DL, L3GD20, LSM330DLC, L3G4IS, LSM330, LSM9DS0.
>  
> This driver can also be built as a module. If so, these modules
> will be created:
> diff --git a/drivers/iio/gyro/st_gyro.h b/drivers/iio/gyro/st_gyro.h
> index 5353d63..a5c5c4e 100644
> --- a/drivers/iio/gyro/st_gyro.h
> +++ b/drivers/iio/gyro/st_gyro.h
> @@ -21,6 +21,7 @@
>  #define L3GD20_GYRO_DEV_NAME "l3gd20"
>  #define L3G4IS_GYRO_DEV_NAME "l3g4is_ui"
>  #define LSM330_GYRO_DEV_NAME "lsm330_gyro"
> +#define LSM9DS0_GYRO_DEV_NAME "lsm9ds0_gyro"
>  
>  /**
>   * struct st_sensors_platform_data - gyro platform data
> diff --git a/drivers/iio/gyro/st_gyro_core.c b/drivers/iio/gyro/st_gyro_core.c
> index be9057e..52a3c87 100644
> --- a/drivers/iio/gyro/st_gyro_core.c
> +++ b/drivers/iio/gyro/st_gyro_core.c
> @@ -204,6 +204,7 @@ static const struct st_sensor_settings st_gyro_sensors_settings[] = {
>   [2] = LSM330DLC_GYRO_DEV_NAME,
>   [3] = L3G4IS_GYRO_DEV_NAME,
>   [4] = LSM330_GYRO_DEV_NAME,
> + [5] = LSM9DS0_GYRO_DEV_NAME,
>   },
>   .ch = (struct iio_chan_spec *)st_gyro_16bit_channels,
>   .odr = {
> diff --git a/drivers/iio/gyro/st_gyro_i2c.c b/drivers/iio/gyro/st_gyro_i2c.c
> index 6848451..40056b8 100644
> --- a/drivers/iio/gyro/st_gyro_i2c.c
> +++ b/drivers/iio/gyro/st_gyro_i2c.c
> @@ -48,6 +48,10 @@ static const struct of_device_id st_gyro_of_match[] = {
>   .compatible = "st,lsm330-gyro",
>   .data = LSM330_GYRO_DEV_NAME,
>   },
> + {
> + .compatible = "st,lsm9ds0-gyro",
> + .data = LSM9DS0_GYRO_DEV_NAME,
> + },
>   {},
>  };
>  MODULE_DEVICE_TABLE(of, st_gyro_of_match);
> @@ -93,6 +97,7 @@ static const struct i2c_device_id st_gyro_id_table[] = {
>   { L3GD20_GYRO_DEV_NAME },
>   { L3G4IS_GYRO_DEV_NAME },
>   { LSM330_GYRO_DEV_NAME },
> + { LSM9DS0_GYRO_DEV_NAME },
>   {},
>  };
>  MODULE_DEVICE_TABLE(i2c, st_gyro_id_table);
> diff --git a/drivers/iio/gyro/st_gyro_spi.c b/drivers/iio/gyro/st_gyro_spi.c
> index d2b7a5f..fbf2fae 100644
> --- a/drivers/iio/gyro/st_gyro_spi.c
> +++ b/drivers/iio/gyro/st_gyro_spi.c
> @@ -54,6 +54,7 @@ static const struct spi_device_id st_gyro_id_table[] = {
>   { L3GD20_GYRO_DEV_NAME },
>   { L3G4IS_GYRO_DEV_NAME },
>   { LSM330_GYRO_DEV_NAME },
> + { LSM9DS0_GYRO_DEV_NAME },
>   {},
>  };
>  MODULE_DEVICE_TABLE(spi, st_gyro_id_table);
> 



Re: [RFC PATCH 3/3] Documentation: iio: Add IIO software devices docs

2016-04-23 Thread Jonathan Cameron
On 20/04/16 16:51, Daniel Baluta wrote:
> On Wed, Apr 20, 2016 at 6:44 PM, Lars-Peter Clausen  wrote:
>> On 04/20/2016 05:45 PM, Daniel Baluta wrote:
>>
>>> +
>>> +What:/config/iio/triggers/dummy
>>
>> s/triggers/devices
> 
> :D, will fix in v2. Obviously, one problem of this RFC is that we
> have a lot of duplicate code between the triggers and devices
> in configfs support.
> 
We could take this as is, and do the refactor as a follow up series
- kind of up to you really.

The series as a whole looks pretty good to me though a bit of tidying
up is needed in one or two corners.  Will take a closer look

