date:20170418

Re: linux-next: build failure after merge of the rcu tree

2017-04-18 Thread Stephen Rothwell

Hi Paul,

On Tue, 18 Apr 2017 21:06:20 -0700 "Paul E. McKenney" wrote:
>
> Or at least broken in a more subtle and creative way.  ;-)

What I live for :-)

-- 
Cheers,
Stephen Rothwell

Re: Re: "mm: move pcp and lru-pcp draining into single wq" broke resume from s2ram

2017-04-18 Thread Tetsuo Handa

Geert Uytterhoeven wrote:
> 8 locks held by s2ram/1899:
>  #0:  (sb_writers#7){.+.+.+}, at: [] vfs_write+0xa8/0x15c
>  #1:  (&of->mutex){+.+.+.}, at: [] kernfs_fop_write+0xf0/0x194
>  #2:  (s_active#48){.+.+.+}, at: [] kernfs_fop_write+0xf8/0x194
>  #3:  (pm_mutex){+.+.+.}, at: [] pm_suspend+0x16c/0xabc
>  #4:  (&dev->mutex){..}, at: [] device_resume+0x58/0x190
>  #5:  (cma_mutex){+.+...}, at: [] cma_alloc+0x150/0x374
>  #6:  (lock){+.+...}, at: [] lru_add_drain_all+0x4c/0x1b4
>  #7:  (cpu_hotplug.dep_map){++}, at: [] get_online_cpus+0x3c/0x9c

I think this situation suggests that

int pm_suspend(suspend_state_t state) {
  error = enter_state(state) {
    if (!mutex_trylock(&pm_mutex)) /* #3 */
      return -EBUSY;
    error = suspend_devices_and_enter(state) {
      error = suspend_enter(state, &wakeup) {
        enable_nonboot_cpus() {
          cpu_maps_update_begin() {
            mutex_lock(&cpu_add_remove_lock);
          }
          pr_info("Enabling non-boot CPUs ...\n");
          for_each_cpu(cpu, frozen_cpus) {
            error = _cpu_up(cpu, 1, CPUHP_ONLINE) {
              cpu_hotplug_begin() {
                mutex_lock(&cpu_hotplug.lock);
              }

              cpu_hotplug_done() {
                mutex_unlock(&cpu_hotplug.lock);
              }
            }
            if (!error) {
              pr_info("CPU%d is up\n", cpu);
              continue;
            }
          }
          cpu_maps_update_done() {
            mutex_unlock(&cpu_add_remove_lock);
          }
        }
      }
      dpm_resume_end(PMSG_RESUME) {
        dpm_resume(state) {
          mutex_lock(&dpm_list_mtx);
          while (!list_empty(&dpm_suspended_list)) {
            mutex_unlock(&dpm_list_mtx);
            error = device_resume(dev, state, false) {
              dpm_wait_for_superior(dev, async);
              dpm_watchdog_set(&wd, dev);
              device_lock(dev) {
                mutex_lock(&dev->mutex); /* #4 */
              }
              error = dpm_run_callback(callback, dev, state, info) {
                cma_alloc() {
                  mutex_lock(&cma_mutex); /* #5 */
                  alloc_contig_range() {
                    lru_add_drain_all() {
                      mutex_lock(&lock); /* #6 */
                      get_online_cpus() {
                        mutex_lock(&cpu_hotplug.lock); /* #7 hang? */
                        mutex_unlock(&cpu_hotplug.lock);
                      }
                      put_online_cpus();
                      mutex_unlock(&lock); /* #6 */
                    }
                  }
                  mutex_unlock(&cma_mutex); /* #5 */
                }
              }
              device_unlock(dev) {
                mutex_unlock(&dev->mutex); /* #4 */
              }
            }
            mutex_lock(&dpm_list_mtx);
          }
          mutex_unlock(&dpm_list_mtx);
        }
        dpm_complete(state) {
          mutex_lock(&dpm_list_mtx);
          while (!list_empty(&dpm_prepared_list)) {
            mutex_unlock(&dpm_list_mtx);
            device_complete(dev, state) {
            }
            mutex_lock(&dpm_list_mtx);
          }
          mutex_unlock(&dpm_list_mtx);
        }
      }
    }
    mutex_unlock(&pm_mutex); /* #3 */
  }
}

Somebody is waiting forever with cpu_hotplug.lock held?
A full dmesg with SysRq-t output would be appreciated.

Re: export pcie_flr and remove copies of it in drivers V2

2017-04-18 Thread Leon Romanovsky

On Tue, Apr 18, 2017 at 01:36:12PM -0500, Bjorn Helgaas wrote:
> On Fri, Apr 14, 2017 at 09:11:24PM +0200, Christoph Hellwig wrote:
> > Hi all,
> >
> > this exports the PCI layer pcie_flr helper, and removes various opencoded
> > copies of it.
> >
> > Changes since V1:
> >  - rebase on top of the pci/virtualization branch
> >  - fixed the probe case in __pci_dev_reset
> >  - added ACKs from Bjorn
>
> Applied the first three patches:
>
>   bc13871ef35a PCI: Export pcie_flr()
>   e641c375d414 PCI: Call pcie_flr() from reset_intel_82599_sfp_virtfn()
>   40e0901ea4bf PCI: Call pcie_flr() from reset_chelsio_generic_dev()
>

Bjorn,

How do you suggest we proceed with the other patches? They should be
applied to your tree as well, because they depend on "bc13871ef35a PCI:
Export pcie_flr()".

Thanks


> to pci/virtualization for v4.12, thanks!



[PATCH V3 02/17] thermal: cpu_cooling: rearrange globals

2017-04-18 Thread Viresh Kumar

Just to make it look better.

Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index be29489dd247..ce94aafed25d 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -105,8 +105,8 @@ struct cpufreq_cooling_device {
struct device *cpu_dev;
get_static_t plat_get_static_power;
 };
-static DEFINE_IDA(cpufreq_ida);
 
+static DEFINE_IDA(cpufreq_ida);
 static DEFINE_MUTEX(cooling_list_lock);
 static LIST_HEAD(cpufreq_dev_list);
 
-- 
2.12.0.432.g71c3a4f4ba37

[PATCH V3 01/17] thermal: cpu_cooling: Avoid accessing potentially freed structures

2017-04-18 Thread Viresh Kumar

After the lock is dropped, it is possible that the cpufreq_dev gets
freed before we call get_level(), and that can cause the kernel to crash.

Drop the lock after we are done using the structure.

Cc: 4.2+  # 4.2+
Fixes: 02373d7c69b4 ("thermal: cpu_cooling: fix lockdep problems in cpu_cooling")
Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 69d0f430b2d1..be29489dd247 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -153,8 +153,10 @@ unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq)
mutex_lock(&cooling_list_lock);
list_for_each_entry(cpufreq_dev, &cpufreq_dev_list, node) {
if (cpumask_test_cpu(cpu, &cpufreq_dev->allowed_cpus)) {
+   unsigned long level = get_level(cpufreq_dev, freq);
+
mutex_unlock(&cooling_list_lock);
-   return get_level(cpufreq_dev, freq);
+   return level;
}
}
mutex_unlock(&cooling_list_lock);
-- 
2.12.0.432.g71c3a4f4ba37
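
The pattern this patch applies can be sketched in user space as follows. This is only an illustration, not the driver code: `struct cooling_dev`, `dev_add()` and `lookup_level()` are hypothetical names, and a pthread mutex stands in for the kernel mutex. The point is that the value is copied into a local while the lock is still held, so the list entry is never touched after the unlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a cooling device on a shared list. */
struct cooling_dev {
    unsigned int cpu;
    unsigned long level;
    struct cooling_dev *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct cooling_dev *dev_list;    /* protected by list_lock */

/* Add a caller-owned device to the head of the list. */
static void dev_add(struct cooling_dev *dev)
{
    assert(dev != NULL);

    pthread_mutex_lock(&list_lock);
    dev->next = dev_list;
    dev_list = dev;
    pthread_mutex_unlock(&list_lock);
}

/*
 * Look up the level for @cpu. The level is read while list_lock is held,
 * so the entry is never dereferenced after the lock has been dropped.
 * Returns (unsigned long)-1 if no device matches.
 */
static unsigned long lookup_level(unsigned int cpu)
{
    unsigned long level = (unsigned long)-1;
    struct cooling_dev *dev;

    pthread_mutex_lock(&list_lock);
    for (dev = dev_list; dev; dev = dev->next) {
        if (dev->cpu == cpu) {
            level = dev->level;    /* read under the lock */
            break;
        }
    }
    pthread_mutex_unlock(&list_lock);

    return level;
}
```

A single unlock path after the loop is a design choice of this sketch; the patch itself keeps the early return and simply moves the get_level() call in front of the unlock.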

[PATCH V3 09/17] thermal: cpu_cooling: store cpufreq policy

2017-04-18 Thread Viresh Kumar

The cpufreq policy can be used by the cpu_cooling driver, so let's store
it in the cpufreq_cooling_device structure.

Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 55ff45c1e917..7dddc7443f5d 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -67,6 +67,7 @@ struct power_table {
  * registered.
  * @cdev: thermal_cooling_device pointer to keep track of the
  * registered cooling device.
+ * @policy: cpufreq policy.
  * @cpufreq_state: integer value representing the current state of cpufreq
  * cooling devices.
  * @clipped_freq: integer value representing the absolute value of the clipped
@@ -91,6 +92,7 @@ struct power_table {
 struct cpufreq_cooling_device {
int id;
struct thermal_cooling_device *cdev;
+   struct cpufreq_policy *policy;
unsigned int cpufreq_state;
unsigned int clipped_freq;
unsigned int max_level;
@@ -760,6 +762,7 @@ __cpufreq_cooling_register(struct device_node *np,
if (!cpufreq_cdev)
return ERR_PTR(-ENOMEM);
 
+   cpufreq_cdev->policy = policy;
num_cpus = cpumask_weight(policy->related_cpus);
cpufreq_cdev->time_in_idle = kcalloc(num_cpus,
sizeof(*cpufreq_cdev->time_in_idle),
-- 
2.12.0.432.g71c3a4f4ba37

Re: [PATCH V3 02/16] block, bfq: add full hierarchical scheduling and cgroups support

2017-04-18 Thread Paolo Valente

> On 18 Apr 2017, at 09:04, Tejun Heo wrote:
> 
> Hello, Paolo.
> 
> On Wed, Apr 12, 2017 at 07:22:03AM +0200, Paolo Valente wrote:
>> could you elaborate a bit more on this?  I mean, cgroups support has
>> been in BFQ (and CFQ) for almost ten years, perfectly working as far
>> as I know.  Of course it is perfectly working in terms of I/O and not
>> of CPU bandwidth distribution; and, for the moment, it is effective
>> only for devices below 30-50KIOPS.  What's the point in throwing
>> (momentarily?) away such a fundamental feature?  What am I missing?
> 
> I've been trying to track down latency issues with the CPU controller
> which basically takes the same approach and I'm not sure nesting
> scheduler timelines is a good approach.  It intuitively feels elegant
> but seems to have some fundamental issues.  IIUC, bfq isn't quite the
> same in that it doesn't need load balancer across multiple queues and
> it could be that bfq is close enough to the basic model that the
> nested behavior maps to the correct scheduling behavior.
> 
> However, for example, in the CPU controller, the nested timelines
> break sleeper boost.  The boost is implemented by considering the
> thread to have woken up up to some duration prior to the current time;
> however, it only affects the timeline inside the cgroup and there's no
> good way to propagate it upwards.  The final result is two threads in
> a cgroup with the double weight can behave significantly worse in
> terms of latency compared to two threads with the weight of 1 in the
> root.
> 

Hi Tejun,
I don't know the specific multiple-queue issues you report in detail,
but bfq implements the upward propagation you mention: if a process in
a group is to be privileged, i.e., if the process basically has to be
given a higher weight (in addition to other important forms of help),
then this weight boost is propagated upward along the path from the
process to the root node of the group hierarchy.

> Given that the nested scheduling ends up pretty expensive, I'm not
> sure how good a model this nesting approach is.  Especially if there
> can be multiple queues, the weight distribution across cgroup
> instances across multiple queues has to be coordinated globally
> anyway,

To get perfect global service guarantees, yes.  But you can settle
for tradeoffs that, in my experience with storage and packet I/O, are
so good as to be probably indistinguishable from an ideal but too
costly solution.  I mean, with a well-designed approximate scheduling
solution, the deviation from an ideal service can be of the same order
as the noise caused by the unavoidable latencies of software and
hardware components other than the scheduler.

> so the weight / cost adjustment part can't happen
> automatically anyway as in single queue case.  If we're going there,
> we might as well implement cgroup support by actively modulating the
> combined weights, which will make individual scheduling operations
> cheaper and it easier to think about and guarantee latency behaviors.
> 

Yes.  Anyway, I didn't quite understand what the alternative to
hierarchical scheduling is, or could be, for guaranteeing bandwidth
distribution of shared resources in a complex setting.  If you think I
could be of any help on this, just keep me in the loop.

> If you think that bfq will stay single queue and won't need timeline
> modifying heuristics (for responsiveness or whatever), the current
> approach could be fine, but I'm a bit wary about committing to the
> current approach if we're gonna encounter the same problems.
> 

As of now, bfq is targeted at devices that are not too fast
(< 30-50KIOPS), which happen to be single queue.  In particular, bfq is
currently agnostic w.r.t. the number of downstream queues.

Thanks,
Paolo

> Thanks.
> 
> -- 
> tejun

[PATCH V3 08/17] cpufreq: create cpufreq_table_count_valid_entries()

2017-04-18 Thread Viresh Kumar

We already need such a routine in two places, so let's create one.

Signed-off-by: Viresh Kumar 
---
 drivers/cpufreq/cpufreq_stats.c | 13 -
 drivers/thermal/cpu_cooling.c   | 22 +-
 include/linux/cpufreq.h | 14 ++
 3 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
index f570ead62454..9c3d319dc129 100644
--- a/drivers/cpufreq/cpufreq_stats.c
+++ b/drivers/cpufreq/cpufreq_stats.c
@@ -170,11 +170,10 @@ void cpufreq_stats_create_table(struct cpufreq_policy *policy)
unsigned int i = 0, count = 0, ret = -ENOMEM;
struct cpufreq_stats *stats;
unsigned int alloc_size;
-   struct cpufreq_frequency_table *pos, *table;
+   struct cpufreq_frequency_table *pos;
 
-   /* We need cpufreq table for creating stats table */
-   table = policy->freq_table;
-   if (unlikely(!table))
+   count = cpufreq_table_count_valid_entries(policy);
+   if (!count)
return;
 
/* stats already initialized */
@@ -185,10 +184,6 @@ void cpufreq_stats_create_table(struct cpufreq_policy *policy)
if (!stats)
return;
 
-   /* Find total allocation size */
-   cpufreq_for_each_valid_entry(pos, table)
-   count++;
-
alloc_size = count * sizeof(int) + count * sizeof(u64);
 
alloc_size += count * count * sizeof(int);
@@ -205,7 +200,7 @@ void cpufreq_stats_create_table(struct cpufreq_policy *policy)
stats->max_state = count;
 
/* Find valid-unique entries */
-   cpufreq_for_each_valid_entry(pos, table)
+   cpufreq_for_each_valid_entry(pos, policy->freq_table)
if (freq_table_get_index(stats, pos->frequency) == -1)
stats->freq_table[i++] = pos->frequency;
 
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 58e58065b650..55ff45c1e917 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -739,7 +739,6 @@ __cpufreq_cooling_register(struct device_node *np,
struct thermal_cooling_device *cdev;
struct cpufreq_cooling_device *cpufreq_cdev;
char dev_name[THERMAL_NAME_LENGTH];
-   struct cpufreq_frequency_table *pos, *table;
unsigned int freq, i, num_cpus;
int ret;
struct thermal_cooling_device_ops *cooling_ops;
@@ -750,9 +749,10 @@ __cpufreq_cooling_register(struct device_node *np,
return ERR_PTR(-EINVAL);
}
 
-   table = policy->freq_table;
-   if (!table) {
-   pr_debug("%s: CPUFreq table not found\n", __func__);
+   i = cpufreq_table_count_valid_entries(policy);
+   if (!i) {
+   pr_debug("%s: CPUFreq table not found or has no valid entries\n",
+__func__);
return ERR_PTR(-ENODEV);
}
 
@@ -777,20 +777,16 @@ __cpufreq_cooling_register(struct device_node *np,
goto free_time_in_idle;
}
 
-   /* Find max levels */
-   cpufreq_for_each_valid_entry(pos, table)
-   cpufreq_cdev->max_level++;
+   /* max_level is an index, not a counter */
+   cpufreq_cdev->max_level = i - 1;
 
-   cpufreq_cdev->freq_table = kmalloc(sizeof(*cpufreq_cdev->freq_table) *
- cpufreq_cdev->max_level, GFP_KERNEL);
+   cpufreq_cdev->freq_table = kmalloc(sizeof(*cpufreq_cdev->freq_table) * i,
+ GFP_KERNEL);
if (!cpufreq_cdev->freq_table) {
cdev = ERR_PTR(-ENOMEM);
goto free_time_in_idle_timestamp;
}
 
-   /* max_level is an index, not a counter */
-   cpufreq_cdev->max_level--;
-
cpumask_copy(&cpufreq_cdev->allowed_cpus, policy->related_cpus);
 
if (capacitance) {
@@ -816,7 +812,7 @@ __cpufreq_cooling_register(struct device_node *np,
 
/* Fill freq-table in descending order of frequencies */
for (i = 0, freq = -1; i <= cpufreq_cdev->max_level; i++) {
-   freq = find_next_max(table, freq);
+   freq = find_next_max(policy->freq_table, freq);
cpufreq_cdev->freq_table[i] = freq;
 
/* Warn for duplicate entries */
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 87165f06a307..affc13568af6 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -855,6 +855,20 @@ static inline int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
return -EINVAL;
}
 }
+
+static inline int cpufreq_table_count_valid_entries(const struct cpufreq_policy *policy)
+{
+   struct cpufreq_frequency_table *pos;
+   int count = 0;
+
+   if (unlikely(!policy->freq_table))
+   return 0;
+
+   cpufreq_for_each_valid_entry(pos, policy->freq_table)
+   count++;
+
+   return count;
+}
 #else
 static
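
As a rough illustration of what the new helper computes, here is a minimal user-space sketch. It is not the kernel code: `struct freq_entry`, `ENTRY_INVALID`, `TABLE_END` and `count_valid_entries()` are hypothetical names, modelled on the kernel's sentinel-terminated frequency table, and the explicit loop stands in for cpufreq_for_each_valid_entry().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinels, modelled on the kernel's frequency-table markers. */
#define ENTRY_INVALID (~0u)    /* entry to be skipped */
#define TABLE_END     (~1u)    /* terminates the table */

struct freq_entry {
    unsigned int frequency;    /* kHz, or one of the sentinels above */
};

/*
 * Count the usable entries in a sentinel-terminated table.
 * A missing table counts as zero entries, so callers need one check only.
 */
static int count_valid_entries(const struct freq_entry *table)
{
    int count = 0;

    if (!table)
        return 0;

    for (; table->frequency != TABLE_END; table++)
        if (table->frequency != ENTRY_INVALID)
            count++;

    return count;
}
```

Because a NULL table and a table without valid entries both yield 0, a caller can replace its separate "table not found" check with a single test of the count, which is how both call sites in the patch use the helper. The cooling driver then takes count - 1 as max_level, since max_level is an index rather than a counter.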

[PATCH V3 14/17] thermal: cpu_cooling: get_level() can't fail

2017-04-18 Thread Viresh Kumar

The frequency passed to get_level() is returned by cpu_power_to_freq()
and it is guaranteed that get_level() can't fail.

Get rid of error code.

Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 20 +---
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 71d15448a293..762ddfc4e654 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -119,22 +119,19 @@ static LIST_HEAD(cpufreq_cdev_list);
  * @cpufreq_cdev: cpufreq_cdev for which the property is required
  * @freq: Frequency
  *
- * Return: level on success, THERMAL_CSTATE_INVALID on error.
+ * Return: level corresponding to the frequency.
  */
 static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
 			       unsigned int freq)
 {
+	struct freq_table *freq_table = cpufreq_cdev->freq_table;
 	unsigned long level;
 
-	for (level = 0; level <= cpufreq_cdev->max_level; level++) {
-		if (freq == cpufreq_cdev->freq_table[level].frequency)
-			return level;
-
-		if (freq > cpufreq_cdev->freq_table[level].frequency)
+	for (level = 1; level < cpufreq_cdev->max_level; level++)
+		if (freq > freq_table[level].frequency)
 			break;
-	}
 
-	return THERMAL_CSTATE_INVALID;
+	return level - 1;
 }
 
 /**
@@ -623,13 +620,6 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
 	target_freq = cpu_power_to_freq(cpufreq_cdev, normalised_power);
 
 	*state = get_level(cpufreq_cdev, target_freq);
-	if (*state == THERMAL_CSTATE_INVALID) {
-		dev_err_ratelimited(&cdev->device,
-				    "Failed to convert %dKHz for cpu %d into a cdev state\n",
-				    target_freq, policy->cpu);
-		return -EINVAL;
-	}
-
 	trace_thermal_power_cpu_limit(policy->related_cpus, target_freq, *state,
 				      power);
 	return 0;
-- 
2.12.0.432.g71c3a4f4ba37
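
For readers following the lookup outside the kernel tree, here is a minimal user-space sketch of mapping a frequency to a cooling level in a table sorted by descending frequency. The names struct freq_entry and sketch_get_level() are made up for illustration and are not part of the patch. The sketch iterates up to and including max_level so that the lowest table frequency maps to the last level; that bound is this sketch's own choice rather than a copy of the loop bound in the hunk above.

```c
/* Hypothetical stand-in for the driver's struct freq_table. */
struct freq_entry {
	unsigned int frequency;	/* kHz */
	unsigned int power;	/* mW, unused in this sketch */
};

/*
 * Map a frequency to a level in a table sorted by descending frequency.
 * max_level is the index of the last entry. A frequency that falls
 * between two entries maps to the level of the next higher entry, and
 * anything at or below the lowest entry maps to max_level.
 */
static unsigned long sketch_get_level(const struct freq_entry *table,
				      unsigned long max_level,
				      unsigned int freq)
{
	unsigned long level;

	for (level = 1; level <= max_level; level++)
		if (freq > table[level].frequency)
			break;

	return level - 1;
}
```

Because the loop starts at 1 and always returns level - 1, every input yields an index between 0 and max_level, which is why no error value is needed in this form.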

[PATCH V3 15/17] thermal: cpu_cooling: don't store cpu_dev in cpufreq_cdev

2017-04-18 Thread Viresh Kumar

'cpu_dev' is used by only one function, get_static_power(), and looking
up the cpu device structure within it is not expensive. This allows
cpu_dev to be removed from struct cpufreq_cooling_device.

Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 762ddfc4e654..c85b217d16c8 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -87,7 +87,6 @@ struct time_in_idle {
  * @node: list_head to link all cpufreq_cooling_device together.
  * @last_load: load measured by the latest call to 
cpufreq_get_requested_power()
  * @idle_time: idle time stats
- * @cpu_dev: the cpu_device of policy->cpu.
  * @plat_get_static_power: callback to calculate the static power
  *
  * This structure is required for keeping information of each registered
@@ -104,7 +103,6 @@ struct cpufreq_cooling_device {
struct list_head node;
u32 last_load;
struct time_in_idle *idle_time;
-   struct device *cpu_dev;
get_static_t plat_get_static_power;
 };
 
@@ -255,8 +253,6 @@ static int update_freq_table(struct cpufreq_cooling_device 
*cpufreq_cdev,
freq_table[i].power = power;
}
 
-   cpufreq_cdev->cpu_dev = dev;
-
return 0;
 }
 
@@ -338,19 +334,22 @@ static int get_static_power(struct cpufreq_cooling_device *cpufreq_cdev,
 {
 	struct dev_pm_opp *opp;
 	unsigned long voltage;
-	struct cpumask *cpumask = cpufreq_cdev->policy->related_cpus;
+	struct cpufreq_policy *policy = cpufreq_cdev->policy;
+	struct cpumask *cpumask = policy->related_cpus;
 	unsigned long freq_hz = freq * 1000;
+	struct device *dev;
 
-	if (!cpufreq_cdev->plat_get_static_power || !cpufreq_cdev->cpu_dev) {
+	if (!cpufreq_cdev->plat_get_static_power) {
 		*power = 0;
 		return 0;
 	}
 
-	opp = dev_pm_opp_find_freq_exact(cpufreq_cdev->cpu_dev, freq_hz,
-					 true);
+	dev = get_cpu_device(policy->cpu);
+	WARN_ON(!dev);
+
+	opp = dev_pm_opp_find_freq_exact(dev, freq_hz, true);
 	if (IS_ERR(opp)) {
-		dev_warn_ratelimited(cpufreq_cdev->cpu_dev,
-				     "Failed to find OPP for frequency %lu: %ld\n",
+		dev_warn_ratelimited(dev, "Failed to find OPP for frequency %lu: %ld\n",
 				     freq_hz, PTR_ERR(opp));
 		return -EINVAL;
 	}
@@ -359,8 +358,7 @@ static int get_static_power(struct cpufreq_cooling_device *cpufreq_cdev,
 	dev_pm_opp_put(opp);
 
 	if (voltage == 0) {
-		dev_err_ratelimited(cpufreq_cdev->cpu_dev,
-				    "Failed to get voltage for frequency %lu\n",
+		dev_err_ratelimited(dev, "Failed to get voltage for frequency %lu\n",
 				    freq_hz);
 		return -EINVAL;
 	}
-- 
2.12.0.432.g71c3a4f4ba37

Re: [PATCH v2] mm: add VM_STATIC flag to vmalloc and prevent from removing the areas

2017-04-18 Thread Hoeun Ryu


> On Apr 18, 2017, at 3:59 PM, Michal Hocko  wrote:
> 
>> On Tue 18-04-17 14:48:39, Hoeun Ryu wrote:
>> vm_area_add_early/vm_area_register_early() are used to reserve vmalloc area
>> during boot process and those virtually mapped areas are never unmapped.
>> So `OR` VM_STATIC flag to the areas in vmalloc_init() when importing
>> existing vmlist entries and prevent those areas from being removed from the
>> rbtree by accident.
> 
> Has this been a problem in the past or currently so that it is worth
> handling?
> 
>> This flags can be also used by other vmalloc APIs to
>> specify that the area will never go away.
> 
> Do we have a user for that?
> 
>> This makes remove_vm_area() more robust against other kind of errors (eg.
>> programming errors).
> 
> Well, yes it will help to prevent from vfree(early_mem) but we have 4
> users of vm_area_register_early so I am really wondering whether this is
> worth additional code. It would really help to understand your
> motivation for the patch if we were explicit about the problem you are
> trying to solve.

I just think it would be good to make it robust against various kinds of
errors.
You might think that's not reason enough to do so, though.

> 
> Thanks
> 
> -- 
> Michal Hocko
> SUSE Labs

[PATCH V3 10/17] thermal: cpu_cooling: OPPs are registered for all CPUs

2017-04-18 Thread Viresh Kumar

The OPPs are registered for all CPUs of a cpufreq policy now and we
don't need to run the loop in build_dyn_power_table(). Just check for
the policy->cpu and we should be fine.

Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 26 +++---
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 7dddc7443f5d..ce387f62c93e 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -83,7 +83,7 @@ struct power_table {
  * @dyn_power_table: array of struct power_table for frequency to power
  * conversion, sorted in ascending order.
  * @dyn_power_table_entries: number of entries in the @dyn_power_table array
- * @cpu_dev: the first cpu_device from @allowed_cpus that has OPPs registered
+ * @cpu_dev: the cpu_device of policy->cpu.
  * @plat_get_static_power: callback to calculate the static power
  *
  * This structure is required for keeping information of each registered
@@ -207,24 +207,20 @@ static int build_dyn_power_table(struct 
cpufreq_cooling_device *cpufreq_cdev,
struct power_table *power_table;
struct dev_pm_opp *opp;
struct device *dev = NULL;
-   int num_opps = 0, cpu, i, ret = 0;
+   int num_opps = 0, cpu = cpufreq_cdev->policy->cpu, i, ret = 0;
unsigned long freq;
 
-   for_each_cpu(cpu, _cdev->allowed_cpus) {
-   dev = get_cpu_device(cpu);
-   if (!dev) {
-   dev_warn(_cdev->cdev->device,
-"No cpu device for cpu %d\n", cpu);
-   continue;
-   }
-
-   num_opps = dev_pm_opp_get_opp_count(dev);
-   if (num_opps > 0)
-   break;
-   else if (num_opps < 0)
-   return num_opps;
+   dev = get_cpu_device(cpu);
+   if (unlikely(!dev)) {
+   dev_warn(_cdev->cdev->device,
+"No cpu device for cpu %d\n", cpu);
+   return -ENODEV;
}
 
+   num_opps = dev_pm_opp_get_opp_count(dev);
+   if (num_opps < 0)
+   return num_opps;
+
if (num_opps == 0)
return -EINVAL;
 
-- 
2.12.0.432.g71c3a4f4ba37

[PATCH V3 12/17] thermal: cpu_cooling: merge frequency and power tables

2017-04-18 Thread Viresh Kumar

The cpu_cooling driver keeps two tables:

- freq_table: table of frequencies in descending order, built from
  policy->freq_table.

- power_table: table of frequencies and power in ascending order, built
  from OPP table.

If the OPPs are used for the CPU device then both these tables are
actually built using the OPP core and should have the same frequency
entries. And there is no need to keep separate tables for this.

Let's merge them.

Note that the new table is in descending order of frequencies, so the
'for' loops had to be adjusted in a few places to make it work.

Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 153 ++
 1 file changed, 67 insertions(+), 86 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 1097162f7f8a..17d6d4635936 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -49,14 +49,14 @@
  */
 
 /**
- * struct power_table - frequency to power conversion
+ * struct freq_table - frequency table along with power entries
  * @frequency: frequency in KHz
  * @power: power in mW
  *
  * This structure is built when the cooling device registers and helps
- * in translating frequency to power and viceversa.
+ * in translating frequency to power and vice versa.
  */
-struct power_table {
+struct freq_table {
u32 frequency;
u32 power;
 };
@@ -79,9 +79,6 @@ struct power_table {
  * @time_in_idle: previous reading of the absolute time that this cpu was idle
  * @time_in_idle_timestamp: wall time of the last invocation of
  * get_cpu_idle_time_us()
- * @dyn_power_table: array of struct power_table for frequency to power
- * conversion, sorted in ascending order.
- * @dyn_power_table_entries: number of entries in the @dyn_power_table array
  * @cpu_dev: the cpu_device of policy->cpu.
  * @plat_get_static_power: callback to calculate the static power
  *
@@ -95,13 +92,11 @@ struct cpufreq_cooling_device {
unsigned int cpufreq_state;
unsigned int clipped_freq;
unsigned int max_level;
-   unsigned int *freq_table;   /* In descending order */
+   struct freq_table *freq_table;  /* In descending order */
struct list_head node;
u32 last_load;
u64 *time_in_idle;
u64 *time_in_idle_timestamp;
-   struct power_table *dyn_power_table;
-   int dyn_power_table_entries;
struct device *cpu_dev;
get_static_t plat_get_static_power;
 };
@@ -125,10 +120,10 @@ static unsigned long get_level(struct 
cpufreq_cooling_device *cpufreq_cdev,
unsigned long level;
 
for (level = 0; level <= cpufreq_cdev->max_level; level++) {
-   if (freq == cpufreq_cdev->freq_table[level])
+   if (freq == cpufreq_cdev->freq_table[level].frequency)
return level;
 
-   if (freq > cpufreq_cdev->freq_table[level])
+   if (freq > cpufreq_cdev->freq_table[level].frequency)
break;
}
 
@@ -185,28 +180,25 @@ static int cpufreq_thermal_notifier(struct notifier_block 
*nb,
 }
 
 /**
- * build_dyn_power_table() - create a dynamic power to frequency table
- * @cpufreq_cdev:  the cpufreq cooling device in which to store the table
+ * update_freq_table() - Update the freq table with power numbers
+ * @cpufreq_cdev:  the cpufreq cooling device in which to update the table
  * @capacitance: dynamic power coefficient for these cpus
  *
- * Build a dynamic power to frequency table for this cpu and store it
- * in @cpufreq_cdev.  This table will be used in cpu_power_to_freq() and
- * cpu_freq_to_power() to convert between power and frequency
- * efficiently.  Power is stored in mW, frequency in KHz.  The
- * resulting table is in ascending order.
+ * Update the freq table with power numbers.  This table will be used in
+ * cpu_power_to_freq() and cpu_freq_to_power() to convert between power and
+ * frequency efficiently.  Power is stored in mW, frequency in KHz.  The
+ * resulting table is in descending order.
  *
  * Return: 0 on success, -EINVAL if there are no OPPs for any CPUs,
- * -ENOMEM if we run out of memory or -EAGAIN if an OPP was
- * added/enabled while the function was executing.
+ * or -ENOMEM if we run out of memory.
  */
-static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_cdev,
-u32 capacitance)
+static int update_freq_table(struct cpufreq_cooling_device *cpufreq_cdev,
+u32 capacitance)
 {
-   struct power_table *power_table;
+   struct freq_table *freq_table = cpufreq_cdev->freq_table;
struct dev_pm_opp *opp;
struct device *dev = NULL;
-   int num_opps = 0, cpu = cpufreq_cdev->policy->cpu, i, ret = 0;
-   unsigned long freq;
+   int num_opps = 0, cpu = cpufreq_cdev->policy->cpu, i;
 
dev = get_cpu_device(cpu);
if (unlikely(!dev)) {
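
As a rough illustration of what a single merged table allows, the following user-space sketch converts between frequency and power using one array sorted by descending frequency. Everything here is an assumption made for the example: the names struct freq_entry, sketch_freq_to_power() and sketch_power_to_freq(), and the rounding rules (a frequency between two entries is charged the power of the next higher entry; a power budget selects the highest frequency whose power does not exceed it, falling back to the lowest entry). It is not the code from the patch.

```c
/* Hypothetical stand-in for the merged struct freq_table. */
struct freq_entry {
	unsigned int frequency;	/* kHz */
	unsigned int power;	/* mW */
};

/*
 * Power for a frequency. The table is sorted by descending frequency and
 * max_level is the index of the last entry. A frequency between two
 * entries is rounded up to the next higher entry.
 */
static unsigned int sketch_freq_to_power(const struct freq_entry *table,
					 unsigned int max_level,
					 unsigned int freq)
{
	unsigned int i;

	for (i = 1; i <= max_level; i++)
		if (freq > table[i].frequency)
			break;

	return table[i - 1].power;
}

/*
 * Highest frequency whose power fits within the budget. If even the
 * lowest entry exceeds the budget, the lowest frequency is returned.
 */
static unsigned int sketch_power_to_freq(const struct freq_entry *table,
					 unsigned int max_level,
					 unsigned int power)
{
	unsigned int i;

	for (i = 0; i < max_level; i++)
		if (power >= table[i].power)
			break;

	return table[i].frequency;
}
```

Both helpers walk the same array from the highest frequency downwards, so no second table in ascending order has to be kept in sync.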

[PATCH V3 13/17] thermal: cpu_cooling: create structure for idle time stats

2017-04-18 Thread Viresh Kumar

We keep two arrays for idle time stats and allocate memory for them
separately. It would be much easier to follow if we create an array of
idle stats structure instead and allocate it once.

Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 53 ---
 1 file changed, 25 insertions(+), 28 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 17d6d4635936..71d15448a293 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -62,6 +62,16 @@ struct freq_table {
 };
 
 /**
+ * struct time_in_idle - Idle time stats
+ * @time: previous reading of the absolute time that this cpu was idle
+ * @timestamp: wall time of the last invocation of get_cpu_idle_time_us()
+ */
+struct time_in_idle {
+   u64 time;
+   u64 timestamp;
+};
+
+/**
  * struct cpufreq_cooling_device - data for cooling device with cpufreq
  * @id: unique integer value corresponding to each cpufreq_cooling_device
  * registered.
@@ -76,9 +86,7 @@ struct freq_table {
  * cpufreq frequencies.
  * @node: list_head to link all cpufreq_cooling_device together.
  * @last_load: load measured by the latest call to 
cpufreq_get_requested_power()
- * @time_in_idle: previous reading of the absolute time that this cpu was idle
- * @time_in_idle_timestamp: wall time of the last invocation of
- * get_cpu_idle_time_us()
+ * @idle_time: idle time stats
  * @cpu_dev: the cpu_device of policy->cpu.
  * @plat_get_static_power: callback to calculate the static power
  *
@@ -95,8 +103,7 @@ struct cpufreq_cooling_device {
struct freq_table *freq_table;  /* In descending order */
struct list_head node;
u32 last_load;
-   u64 *time_in_idle;
-   u64 *time_in_idle_timestamp;
+   struct time_in_idle *idle_time;
struct device *cpu_dev;
get_static_t plat_get_static_power;
 };
@@ -296,18 +303,19 @@ static u32 get_load(struct cpufreq_cooling_device *cpufreq_cdev, int cpu,
 {
 	u32 load;
 	u64 now, now_idle, delta_time, delta_idle;
+	struct time_in_idle *idle_time = &cpufreq_cdev->idle_time[cpu_idx];
 
 	now_idle = get_cpu_idle_time(cpu, &now, 0);
-	delta_idle = now_idle - cpufreq_cdev->time_in_idle[cpu_idx];
-	delta_time = now - cpufreq_cdev->time_in_idle_timestamp[cpu_idx];
+	delta_idle = now_idle - idle_time->time;
+	delta_time = now - idle_time->timestamp;
 
 	if (delta_time <= delta_idle)
 		load = 0;
 	else
 		load = div64_u64(100 * (delta_time - delta_idle), delta_time);
 
-	cpufreq_cdev->time_in_idle[cpu_idx] = now_idle;
-	cpufreq_cdev->time_in_idle_timestamp[cpu_idx] = now;
+	idle_time->time = now_idle;
+	idle_time->timestamp = now;
 
 	return load;
 }
@@ -711,22 +719,14 @@ __cpufreq_cooling_register(struct device_node *np,
 
cpufreq_cdev->policy = policy;
num_cpus = cpumask_weight(policy->related_cpus);
-   cpufreq_cdev->time_in_idle = kcalloc(num_cpus,
-   sizeof(*cpufreq_cdev->time_in_idle),
-   GFP_KERNEL);
-   if (!cpufreq_cdev->time_in_idle) {
+   cpufreq_cdev->idle_time = kcalloc(num_cpus,
+sizeof(*cpufreq_cdev->idle_time),
+GFP_KERNEL);
+   if (!cpufreq_cdev->idle_time) {
cdev = ERR_PTR(-ENOMEM);
goto free_cdev;
}
 
-   cpufreq_cdev->time_in_idle_timestamp =
-   kcalloc(num_cpus, sizeof(*cpufreq_cdev->time_in_idle_timestamp),
-   GFP_KERNEL);
-   if (!cpufreq_cdev->time_in_idle_timestamp) {
-   cdev = ERR_PTR(-ENOMEM);
-   goto free_time_in_idle;
-   }
-
/* max_level is an index, not a counter */
cpufreq_cdev->max_level = i - 1;
 
@@ -734,7 +734,7 @@ __cpufreq_cooling_register(struct device_node *np,
  GFP_KERNEL);
if (!cpufreq_cdev->freq_table) {
cdev = ERR_PTR(-ENOMEM);
-   goto free_time_in_idle_timestamp;
+   goto free_idle_time;
}
 
ret = ida_simple_get(_ida, 0, 0, GFP_KERNEL);
@@ -797,10 +797,8 @@ __cpufreq_cooling_register(struct device_node *np,
ida_simple_remove(_ida, cpufreq_cdev->id);
 free_table:
kfree(cpufreq_cdev->freq_table);
-free_time_in_idle_timestamp:
-   kfree(cpufreq_cdev->time_in_idle_timestamp);
-free_time_in_idle:
-   kfree(cpufreq_cdev->time_in_idle);
+free_idle_time:
+   kfree(cpufreq_cdev->idle_time);
 free_cdev:
kfree(cpufreq_cdev);
return cdev;
@@ -943,8 +941,7 @@ void cpufreq_cooling_unregister(struct 
thermal_cooling_device *cdev)
 
thermal_cooling_device_unregister(cpufreq_cdev->cdev);
ida_simple_remove(_ida, cpufreq_cdev->id);
-
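
The load calculation in the get_load() hunk above can be tried in user space with a small sketch. It keeps the two readings in one structure, as the patch does, but the names struct idle_sample and sketch_get_load() are invented for this example, and the idle time and timestamp are passed in as plain arguments instead of being read with get_cpu_idle_time().

```c
#include <stdint.h>

/* Hypothetical stand-in for struct time_in_idle. */
struct idle_sample {
	uint64_t time;		/* previous cumulative idle time */
	uint64_t timestamp;	/* wall time of the previous reading */
};

/*
 * Return the load in percent since the previous sample and store the
 * new readings for the next call. A window with no elapsed time, or one
 * spent entirely idle, yields a load of 0.
 */
static uint32_t sketch_get_load(struct idle_sample *prev,
				uint64_t now_idle, uint64_t now)
{
	uint64_t delta_idle = now_idle - prev->time;
	uint64_t delta_time = now - prev->timestamp;
	uint32_t load;

	if (delta_time <= delta_idle)
		load = 0;
	else
		load = (uint32_t)(100 * (delta_time - delta_idle) / delta_time);

	prev->time = now_idle;
	prev->timestamp = now;

	return load;
}
```

With one array of such structures, a per-CPU sample needs a single allocation and a single error path, which matches the cleanup described in the commit message.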

[PATCH V3 17/17] thermal: cpu_cooling: Rearrange struct cpufreq_cooling_device

2017-04-18 Thread Viresh Kumar

This shrinks the size of the structure on arm64 by 8 bytes by avoiding
padding of 4 bytes at two places.

Also add the missing doc comment for freq_table.

Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index dc73405b04f2..05073c33ba20 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -75,17 +75,18 @@ struct time_in_idle {
  * struct cpufreq_cooling_device - data for cooling device with cpufreq
  * @id: unique integer value corresponding to each cpufreq_cooling_device
  * registered.
- * @cdev: thermal_cooling_device pointer to keep track of the
- * registered cooling device.
- * @policy: cpufreq policy.
+ * @last_load: load measured by the latest call to 
cpufreq_get_requested_power()
  * @cpufreq_state: integer value representing the current state of cpufreq
  * cooling devices.
  * @clipped_freq: integer value representing the absolute value of the clipped
  * frequency.
  * @max_level: maximum cooling level. One less than total number of valid
  * cpufreq frequencies.
+ * @freq_table: Freq table in descending order of frequencies
+ * @cdev: thermal_cooling_device pointer to keep track of the
+ * registered cooling device.
+ * @policy: cpufreq policy.
  * @node: list_head to link all cpufreq_cooling_device together.
- * @last_load: load measured by the latest call to 
cpufreq_get_requested_power()
  * @idle_time: idle time stats
  * @plat_get_static_power: callback to calculate the static power
  *
@@ -94,14 +95,14 @@ struct time_in_idle {
  */
 struct cpufreq_cooling_device {
int id;
-   struct thermal_cooling_device *cdev;
-   struct cpufreq_policy *policy;
+   u32 last_load;
unsigned int cpufreq_state;
unsigned int clipped_freq;
unsigned int max_level;
struct freq_table *freq_table;  /* In descending order */
+   struct thermal_cooling_device *cdev;
+   struct cpufreq_policy *policy;
struct list_head node;
-   u32 last_load;
struct time_in_idle *idle_time;
get_static_t plat_get_static_power;
 };
-- 
2.12.0.432.g71c3a4f4ba37
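
The size difference can be checked with a user-space sketch that mirrors the two member orders from the hunk above. The pointer members are replaced by void pointers and struct list_head by a two-pointer stand-in; these substitutions, and the names layout_before, layout_after and sketch_bytes_saved(), are assumptions of the sketch. The 8-byte saving is expected only on ABIs with 4-byte int and 8-byte, 8-byte-aligned pointers, such as arm64.

```c
#include <stddef.h>

/* Stand-in for struct list_head: two pointers. */
struct list_node {
	struct list_node *next, *prev;
};

/* Member order before the patch. */
struct layout_before {
	int id;
	void *cdev;
	void *policy;
	unsigned int cpufreq_state;
	unsigned int clipped_freq;
	unsigned int max_level;
	void *freq_table;
	struct list_node node;
	unsigned int last_load;
	void *idle_time;
	void (*plat_get_static_power)(void);
};

/* Member order after the patch: the 4-byte members are grouped first. */
struct layout_after {
	int id;
	unsigned int last_load;
	unsigned int cpufreq_state;
	unsigned int clipped_freq;
	unsigned int max_level;
	void *freq_table;
	void *cdev;
	void *policy;
	struct list_node node;
	void *idle_time;
	void (*plat_get_static_power)(void);
};

/* Bytes saved by the reordering on the ABI this is compiled for. */
static size_t sketch_bytes_saved(void)
{
	return sizeof(struct layout_before) - sizeof(struct layout_after);
}
```

In the first layout each lone 4-byte member that precedes a pointer forces 4 bytes of padding; grouping the five 4-byte members leaves a single padded slot.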

[PATCH V3 07/17] thermal: cpu_cooling: use cpufreq_policy to register cooling device

2017-04-18 Thread Viresh Kumar

The CPU cooling driver uses the cpufreq policy to get clip_cpus, the
frequency table, etc. Most callers of the CPU cooling driver's
registration routines already have the cpufreq policy, but they only
pass the policy->related_cpus cpumask. The __cpufreq_cooling_register()
routine then gets the policy by itself and uses it.

It would be much better if the callers passed the policy directly
instead. This also fixes a basic design flaw, where the policy can be
freed while the CPU cooling driver is still active.

Signed-off-by: Viresh Kumar 
---
 drivers/cpufreq/arm_big_little.c   |  2 +-
 drivers/cpufreq/cpufreq-dt.c   |  2 +-
 drivers/cpufreq/dbx500-cpufreq.c   |  2 +-
 drivers/cpufreq/mt8173-cpufreq.c   |  4 +-
 drivers/cpufreq/qoriq-cpufreq.c|  3 +-
 drivers/thermal/cpu_cooling.c  | 61 --
 drivers/thermal/imx_thermal.c  | 22 ++--
 drivers/thermal/ti-soc-thermal/ti-thermal-common.c | 22 +---
 include/linux/cpu_cooling.h| 26 -
 9 files changed, 74 insertions(+), 70 deletions(-)

diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c
index 418042201e6d..ea6d62547b10 100644
--- a/drivers/cpufreq/arm_big_little.c
+++ b/drivers/cpufreq/arm_big_little.c
@@ -540,7 +540,7 @@ static void bL_cpufreq_ready(struct cpufreq_policy *policy)
 &power_coefficient);
 
cdev[cur_cluster] = of_cpufreq_power_cooling_register(np,
-   policy->related_cpus, power_coefficient, NULL);
+   policy, power_coefficient, NULL);
if (IS_ERR(cdev[cur_cluster])) {
dev_err(cpu_dev,
"running cpufreq without cooling device: %ld\n",
diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index c943787d761e..fef3c2160691 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -326,7 +326,7 @@ static void cpufreq_ready(struct cpufreq_policy *policy)
 &power_coefficient);
 
priv->cdev = of_cpufreq_power_cooling_register(np,
-   policy->related_cpus, power_coefficient, NULL);
+   policy, power_coefficient, NULL);
if (IS_ERR(priv->cdev)) {
dev_err(priv->cpu_dev,
"running cpufreq without cooling device: %ld\n",
diff --git a/drivers/cpufreq/dbx500-cpufreq.c b/drivers/cpufreq/dbx500-cpufreq.c
index 3575b82210ba..4ee0431579c1 100644
--- a/drivers/cpufreq/dbx500-cpufreq.c
+++ b/drivers/cpufreq/dbx500-cpufreq.c
@@ -43,7 +43,7 @@ static int dbx500_cpufreq_exit(struct cpufreq_policy *policy)
 
 static void dbx500_cpufreq_ready(struct cpufreq_policy *policy)
 {
-   cdev = cpufreq_cooling_register(policy->cpus);
+   cdev = cpufreq_cooling_register(policy);
if (IS_ERR(cdev))
pr_err("Failed to register cooling device %ld\n", PTR_ERR(cdev));
else
diff --git a/drivers/cpufreq/mt8173-cpufreq.c b/drivers/cpufreq/mt8173-cpufreq.c
index fd1886faf33a..f9f00fb4bc3a 100644
--- a/drivers/cpufreq/mt8173-cpufreq.c
+++ b/drivers/cpufreq/mt8173-cpufreq.c
@@ -320,9 +320,7 @@ static void mtk_cpufreq_ready(struct cpufreq_policy *policy)
of_property_read_u32(np, DYNAMIC_POWER, &capacitance);
 
info->cdev = of_cpufreq_power_cooling_register(np,
-   policy->related_cpus,
-   capacitance,
-   NULL);
+   policy, capacitance, NULL);
 
if (IS_ERR(info->cdev)) {
dev_err(info->cpu_dev,
diff --git a/drivers/cpufreq/qoriq-cpufreq.c b/drivers/cpufreq/qoriq-cpufreq.c
index e2ea433a5f9c..4ada55b8856e 100644
--- a/drivers/cpufreq/qoriq-cpufreq.c
+++ b/drivers/cpufreq/qoriq-cpufreq.c
@@ -278,8 +278,7 @@ static void qoriq_cpufreq_ready(struct cpufreq_policy *policy)
struct device_node *np = of_get_cpu_node(policy->cpu, NULL);
 
if (of_find_property(np, "#cooling-cells", NULL)) {
-   cpud->cdev = of_cpufreq_cooling_register(np,
-policy->related_cpus);
+   cpud->cdev = of_cpufreq_cooling_register(np, policy);
 
if (IS_ERR(cpud->cdev) && PTR_ERR(cpud->cdev) != -ENOSYS) {
pr_err("cpu%d is not running as cooling device: %ld\n",
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 002b48dc6bea..58e58065b650 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -717,7 +717,7 @@ static unsigned int find_next_max(struct cpufreq_frequency_table

[PATCH V3 13/17] thermal: cpu_cooling: create structure for idle time stats

2017-04-18 Thread Viresh Kumar

We keep two arrays for idle time stats and allocate memory for them
separately. It would be much easier to follow if we created an array of
idle stats structures instead and allocated it once.

Signed-off-by: Viresh Kumar 
---
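To make the data-layout change concrete, here is a small userspace sketch of
the same idea: one array of per-CPU structures, allocated once, instead of
two parallel u64 arrays. It is only an illustration. idle_stats_alloc() and
load_percent() are hypothetical helpers, calloc() stands in for kcalloc(),
and the load arithmetic follows the get_load() hunk below.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Mirrors struct time_in_idle from the patch: one entry per CPU. */
struct time_in_idle {
	uint64_t time;      /* previous reading of the absolute idle time */
	uint64_t timestamp; /* wall time at which that reading was taken */
};

/* Hypothetical helper: a single zeroed allocation covers every CPU. */
struct time_in_idle *idle_stats_alloc(size_t num_cpus)
{
	return calloc(num_cpus, sizeof(struct time_in_idle));
}

/*
 * Hypothetical helper following get_load(): returns the busy percentage
 * since the previous call and stores the new readings in the entry.
 */
uint32_t load_percent(struct time_in_idle *st, uint64_t now_idle, uint64_t now)
{
	uint64_t delta_idle, delta_time;
	uint32_t load;

	assert(st != NULL);

	delta_idle = now_idle - st->time;
	delta_time = now - st->timestamp;

	if (delta_time <= delta_idle)
		load = 0;
	else
		load = (uint32_t)(100 * (delta_time - delta_idle) / delta_time);

	st->time = now_idle;
	st->timestamp = now;

	return load;
}
```

With one array there is a single allocation to check and a single pointer to
free on the error path, which is what lets the patch drop one of the goto
labels.
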
 drivers/thermal/cpu_cooling.c | 53 ---
 1 file changed, 25 insertions(+), 28 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 17d6d4635936..71d15448a293 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -62,6 +62,16 @@ struct freq_table {
 };
 
 /**
+ * struct time_in_idle - Idle time stats
+ * @time: previous reading of the absolute time that this cpu was idle
+ * @timestamp: wall time of the last invocation of get_cpu_idle_time_us()
+ */
+struct time_in_idle {
+   u64 time;
+   u64 timestamp;
+};
+
+/**
  * struct cpufreq_cooling_device - data for cooling device with cpufreq
  * @id: unique integer value corresponding to each cpufreq_cooling_device
  * registered.
@@ -76,9 +86,7 @@ struct freq_table {
  * cpufreq frequencies.
  * @node: list_head to link all cpufreq_cooling_device together.
  * @last_load: load measured by the latest call to cpufreq_get_requested_power()
- * @time_in_idle: previous reading of the absolute time that this cpu was idle
- * @time_in_idle_timestamp: wall time of the last invocation of
- * get_cpu_idle_time_us()
+ * @idle_time: idle time stats
  * @cpu_dev: the cpu_device of policy->cpu.
  * @plat_get_static_power: callback to calculate the static power
  *
@@ -95,8 +103,7 @@ struct cpufreq_cooling_device {
struct freq_table *freq_table;  /* In descending order */
struct list_head node;
u32 last_load;
-   u64 *time_in_idle;
-   u64 *time_in_idle_timestamp;
+   struct time_in_idle *idle_time;
struct device *cpu_dev;
get_static_t plat_get_static_power;
 };
@@ -296,18 +303,19 @@ static u32 get_load(struct cpufreq_cooling_device *cpufreq_cdev, int cpu,
 {
u32 load;
u64 now, now_idle, delta_time, delta_idle;
+   struct time_in_idle *idle_time = &cpufreq_cdev->idle_time[cpu_idx];
 
now_idle = get_cpu_idle_time(cpu, &now, 0);
-   delta_idle = now_idle - cpufreq_cdev->time_in_idle[cpu_idx];
-   delta_time = now - cpufreq_cdev->time_in_idle_timestamp[cpu_idx];
+   delta_idle = now_idle - idle_time->time;
+   delta_time = now - idle_time->timestamp;
 
if (delta_time <= delta_idle)
load = 0;
else
load = div64_u64(100 * (delta_time - delta_idle), delta_time);
 
-   cpufreq_cdev->time_in_idle[cpu_idx] = now_idle;
-   cpufreq_cdev->time_in_idle_timestamp[cpu_idx] = now;
+   idle_time->time = now_idle;
+   idle_time->timestamp = now;
 
return load;
 }
@@ -711,22 +719,14 @@ __cpufreq_cooling_register(struct device_node *np,
 
cpufreq_cdev->policy = policy;
num_cpus = cpumask_weight(policy->related_cpus);
-   cpufreq_cdev->time_in_idle = kcalloc(num_cpus,
-   sizeof(*cpufreq_cdev->time_in_idle),
-   GFP_KERNEL);
-   if (!cpufreq_cdev->time_in_idle) {
+   cpufreq_cdev->idle_time = kcalloc(num_cpus,
+sizeof(*cpufreq_cdev->idle_time),
+GFP_KERNEL);
+   if (!cpufreq_cdev->idle_time) {
cdev = ERR_PTR(-ENOMEM);
goto free_cdev;
}
 
-   cpufreq_cdev->time_in_idle_timestamp =
-   kcalloc(num_cpus, sizeof(*cpufreq_cdev->time_in_idle_timestamp),
-   GFP_KERNEL);
-   if (!cpufreq_cdev->time_in_idle_timestamp) {
-   cdev = ERR_PTR(-ENOMEM);
-   goto free_time_in_idle;
-   }
-
/* max_level is an index, not a counter */
cpufreq_cdev->max_level = i - 1;
 
@@ -734,7 +734,7 @@ __cpufreq_cooling_register(struct device_node *np,
  GFP_KERNEL);
if (!cpufreq_cdev->freq_table) {
cdev = ERR_PTR(-ENOMEM);
-   goto free_time_in_idle_timestamp;
+   goto free_idle_time;
}
 
ret = ida_simple_get(&cpufreq_ida, 0, 0, GFP_KERNEL);
@@ -797,10 +797,8 @@ __cpufreq_cooling_register(struct device_node *np,
ida_simple_remove(&cpufreq_ida, cpufreq_cdev->id);
 free_table:
kfree(cpufreq_cdev->freq_table);
-free_time_in_idle_timestamp:
-   kfree(cpufreq_cdev->time_in_idle_timestamp);
-free_time_in_idle:
-   kfree(cpufreq_cdev->time_in_idle);
+free_idle_time:
+   kfree(cpufreq_cdev->idle_time);
 free_cdev:
kfree(cpufreq_cdev);
return cdev;
@@ -943,8 +941,7 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
 
thermal_cooling_device_unregister(cpufreq_cdev->cdev);
ida_simple_remove(&cpufreq_ida, cpufreq_cdev->id);
-

[PATCH V3 16/17] thermal: cpu_cooling: 'freq' can't be zero in cpufreq_state2power()

2017-04-18 Thread Viresh Kumar

The frequency table shouldn't have any zero-frequency entries, so such a
check isn't required. It would be better, though, to make sure 'state' is
within limits.

Signed-off-by: Viresh Kumar 
---
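As a rough userspace illustration of the check being swapped in: the index is
validated before the table is read, rather than inspecting the value that
was read. struct cooling_table and state_to_freq() are hypothetical names
for this sketch, and the table contents in the note below are made up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified cooling device: frequencies in descending order. */
struct cooling_table {
	const unsigned int *freq_khz;
	unsigned long max_level; /* index of the last valid entry, not a count */
};

/*
 * Store the frequency for a cooling state in *freq and return 0, or return
 * -EINVAL when the state lies beyond the end of the table.
 */
int state_to_freq(const struct cooling_table *t, unsigned long state,
		  unsigned int *freq)
{
	assert(t != NULL && freq != NULL);

	/* Reject out-of-range states before touching the table. */
	if (state > t->max_level)
		return -EINVAL;

	*freq = t->freq_khz[state];
	return 0;
}
```

For a three-entry table with max_level == 2, states 0 to 2 are accepted and
state 3 is rejected with -EINVAL without reading past the array.
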
 drivers/thermal/cpu_cooling.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index c85b217d16c8..dc73405b04f2 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -560,12 +560,13 @@ static int cpufreq_state2power(struct thermal_cooling_device *cdev,
int ret;
struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
 
+   /* Request state should be less than max_level */
+   if (WARN_ON(state > cpufreq_cdev->max_level))
+   return -EINVAL;
+
num_cpus = cpumask_weight(cpufreq_cdev->policy->cpus);
 
freq = cpufreq_cdev->freq_table[state].frequency;
-   if (!freq)
-   return -EINVAL;
-
dynamic_power = cpu_freq_to_power(cpufreq_cdev, freq) * num_cpus;
ret = get_static_power(cpufreq_cdev, tz, freq, &static_power);
if (ret)
-- 
2.12.0.432.g71c3a4f4ba37

[PATCH V3 03/17] thermal: cpu_cooling: Name cpufreq cooling devices as cpufreq_cdev

2017-04-18 Thread Viresh Kumar

Objects of "struct cpufreq_cooling_device" are named a bit
inconsistently. Let's use cpufreq_cdev everywhere. Also note that the
list containing such devices is renamed similarly.

Signed-off-by: Viresh Kumar 
---
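The get_level() hunk below is only touched by the rename, but the lookup it
performs is easy to misread in diff form, so here is a standalone sketch of
it. It assumes the table is sorted in descending order, as the driver
comments state. find_level() and LEVEL_INVALID are hypothetical names, the
latter standing in for THERMAL_CSTATE_INVALID.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for THERMAL_CSTATE_INVALID. */
#define LEVEL_INVALID ((unsigned long)-1)

/*
 * Find the cooling level for freq in a table sorted in descending order.
 * max_level is the index of the last entry, not the number of entries.
 */
unsigned long find_level(const unsigned int *freq_table,
			 unsigned long max_level, unsigned int freq)
{
	unsigned long level;

	assert(freq_table != NULL);

	for (level = 0; level <= max_level; level++) {
		if (freq == freq_table[level])
			return level;

		/* Descending order: freq cannot appear further down. */
		if (freq > freq_table[level])
			break;
	}

	return LEVEL_INVALID;
}
```

An exact match returns its index. A frequency that falls between two entries,
or below the last one, yields LEVEL_INVALID.
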
 drivers/thermal/cpu_cooling.c | 248 +-
 1 file changed, 124 insertions(+), 124 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index ce94aafed25d..80a46a80817b 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -108,27 +108,27 @@ struct cpufreq_cooling_device {
 
 static DEFINE_IDA(cpufreq_ida);
 static DEFINE_MUTEX(cooling_list_lock);
-static LIST_HEAD(cpufreq_dev_list);
+static LIST_HEAD(cpufreq_cdev_list);
 
 /* Below code defines functions to be used for cpufreq as cooling device */
 
 /**
  * get_level: Find the level for a particular frequency
- * @cpufreq_dev: cpufreq_dev for which the property is required
+ * @cpufreq_cdev: cpufreq_cdev for which the property is required
  * @freq: Frequency
  *
  * Return: level on success, THERMAL_CSTATE_INVALID on error.
  */
-static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_dev,
+static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
   unsigned int freq)
 {
unsigned long level;
 
-   for (level = 0; level <= cpufreq_dev->max_level; level++) {
-   if (freq == cpufreq_dev->freq_table[level])
+   for (level = 0; level <= cpufreq_cdev->max_level; level++) {
+   if (freq == cpufreq_cdev->freq_table[level])
return level;
 
-   if (freq > cpufreq_dev->freq_table[level])
+   if (freq > cpufreq_cdev->freq_table[level])
break;
}
 
@@ -148,12 +148,12 @@ static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_dev,
  */
 unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq)
 {
-   struct cpufreq_cooling_device *cpufreq_dev;
+   struct cpufreq_cooling_device *cpufreq_cdev;
 
mutex_lock(&cooling_list_lock);
-   list_for_each_entry(cpufreq_dev, &cpufreq_dev_list, node) {
-   if (cpumask_test_cpu(cpu, &cpufreq_dev->allowed_cpus)) {
-   unsigned long level = get_level(cpufreq_dev, freq);
+   list_for_each_entry(cpufreq_cdev, &cpufreq_cdev_list, node) {
+   if (cpumask_test_cpu(cpu, &cpufreq_cdev->allowed_cpus)) {
+   unsigned long level = get_level(cpufreq_cdev, freq);
 
mutex_unlock(&cooling_list_lock);
return level;
@@ -183,14 +183,14 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
 {
struct cpufreq_policy *policy = data;
unsigned long clipped_freq;
-   struct cpufreq_cooling_device *cpufreq_dev;
+   struct cpufreq_cooling_device *cpufreq_cdev;
 
if (event != CPUFREQ_ADJUST)
return NOTIFY_DONE;
 
mutex_lock(&cooling_list_lock);
-   list_for_each_entry(cpufreq_dev, &cpufreq_dev_list, node) {
-   if (!cpumask_test_cpu(policy->cpu, &cpufreq_dev->allowed_cpus))
+   list_for_each_entry(cpufreq_cdev, &cpufreq_cdev_list, node) {
+   if (!cpumask_test_cpu(policy->cpu, &cpufreq_cdev->allowed_cpus))
continue;
 
/*
@@ -204,7 +204,7 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
 * But, if clipped_freq is greater than policy->max, we don't
 * need to do anything.
 */
-   clipped_freq = cpufreq_dev->clipped_freq;
+   clipped_freq = cpufreq_cdev->clipped_freq;
 
if (policy->max > clipped_freq)
cpufreq_verify_within_limits(policy, 0, clipped_freq);
@@ -217,11 +217,11 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
 
 /**
  * build_dyn_power_table() - create a dynamic power to frequency table
- * @cpufreq_device:the cpufreq cooling device in which to store the table
+ * @cpufreq_cdev:  the cpufreq cooling device in which to store the table
  * @capacitance: dynamic power coefficient for these cpus
  *
  * Build a dynamic power to frequency table for this cpu and store it
- * in @cpufreq_device.  This table will be used in cpu_power_to_freq() and
+ * in @cpufreq_cdev.  This table will be used in cpu_power_to_freq() and
  * cpu_freq_to_power() to convert between power and frequency
  * efficiently.  Power is stored in mW, frequency in KHz.  The
  * resulting table is in ascending order.
@@ -230,7 +230,7 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
  * -ENOMEM if we run out of memory or -EAGAIN if an OPP was
  * added/enabled while the function was executing.
  */
-static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
+static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_cdev,
 u32 capacitance)
 {

[PATCH V3 06/17] thermal: cpu_cooling: get rid of a variable in cpufreq_set_cur_state()

2017-04-18 Thread Viresh Kumar

'cpu' is used in only one place and there is no need to keep a separate
variable for it.

Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 1f4b6a719d05..002b48dc6bea 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -456,7 +456,6 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
 unsigned long state)
 {
struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-   unsigned int cpu = cpumask_any(&cpufreq_cdev->allowed_cpus);
unsigned int clip_freq;
 
/* Request state should be less than max_level */
@@ -471,7 +470,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
cpufreq_cdev->cpufreq_state = state;
cpufreq_cdev->clipped_freq = clip_freq;
 
-   cpufreq_update_policy(cpu);
+   cpufreq_update_policy(cpumask_any(&cpufreq_cdev->allowed_cpus));
 
return 0;
 }
-- 
2.12.0.432.g71c3a4f4ba37

[PATCH V3 04/17] thermal: cpu_cooling: replace cool_dev with cdev

2017-04-18 Thread Viresh Kumar

Objects of "struct thermal_cooling_device" are named a bit
inconsistently. Let's use cdev everywhere.

Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 37 ++---
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 80a46a80817b..f1e784c22c5a 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -65,7 +65,7 @@ struct power_table {
  * struct cpufreq_cooling_device - data for cooling device with cpufreq
  * @id: unique integer value corresponding to each cpufreq_cooling_device
  * registered.
- * @cool_dev: thermal_cooling_device pointer to keep track of the
+ * @cdev: thermal_cooling_device pointer to keep track of the
  * registered cooling device.
  * @cpufreq_state: integer value representing the current state of cpufreq
  * cooling devices.
@@ -90,7 +90,7 @@ struct power_table {
  */
 struct cpufreq_cooling_device {
int id;
-   struct thermal_cooling_device *cool_dev;
+   struct thermal_cooling_device *cdev;
unsigned int cpufreq_state;
unsigned int clipped_freq;
unsigned int max_level;
@@ -242,7 +242,7 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_cdev,
for_each_cpu(cpu, &cpufreq_cdev->allowed_cpus) {
dev = get_cpu_device(cpu);
if (!dev) {
-   dev_warn(&cpufreq_cdev->cool_dev->device,
+   dev_warn(&cpufreq_cdev->cdev->device,
 "No cpu device for cpu %d\n", cpu);
continue;
}
@@ -769,7 +769,7 @@ __cpufreq_cooling_register(struct device_node *np,
get_static_t plat_static_func)
 {
struct cpufreq_policy *policy;
-   struct thermal_cooling_device *cool_dev;
+   struct thermal_cooling_device *cdev;
struct cpufreq_cooling_device *cpufreq_cdev;
char dev_name[THERMAL_NAME_LENGTH];
struct cpufreq_frequency_table *pos, *table;
@@ -786,20 +786,20 @@ __cpufreq_cooling_register(struct device_node *np,
policy = cpufreq_cpu_get(cpumask_first(temp_mask));
if (!policy) {
pr_debug("%s: CPUFreq policy not found\n", __func__);
-   cool_dev = ERR_PTR(-EPROBE_DEFER);
+   cdev = ERR_PTR(-EPROBE_DEFER);
goto free_cpumask;
}
 
table = policy->freq_table;
if (!table) {
pr_debug("%s: CPUFreq table not found\n", __func__);
-   cool_dev = ERR_PTR(-ENODEV);
+   cdev = ERR_PTR(-ENODEV);
goto put_policy;
}
 
cpufreq_cdev = kzalloc(sizeof(*cpufreq_cdev), GFP_KERNEL);
if (!cpufreq_cdev) {
-   cool_dev = ERR_PTR(-ENOMEM);
+   cdev = ERR_PTR(-ENOMEM);
goto put_policy;
}
 
@@ -808,7 +808,7 @@ __cpufreq_cooling_register(struct device_node *np,
sizeof(*cpufreq_cdev->time_in_idle),
GFP_KERNEL);
if (!cpufreq_cdev->time_in_idle) {
-   cool_dev = ERR_PTR(-ENOMEM);
+   cdev = ERR_PTR(-ENOMEM);
goto free_cdev;
}
 
@@ -816,7 +816,7 @@ __cpufreq_cooling_register(struct device_node *np,
kcalloc(num_cpus, sizeof(*cpufreq_cdev->time_in_idle_timestamp),
GFP_KERNEL);
if (!cpufreq_cdev->time_in_idle_timestamp) {
-   cool_dev = ERR_PTR(-ENOMEM);
+   cdev = ERR_PTR(-ENOMEM);
goto free_time_in_idle;
}
 
@@ -827,7 +827,7 @@ __cpufreq_cooling_register(struct device_node *np,
cpufreq_cdev->freq_table = kmalloc(sizeof(*cpufreq_cdev->freq_table) *
  cpufreq_cdev->max_level, GFP_KERNEL);
if (!cpufreq_cdev->freq_table) {
-   cool_dev = ERR_PTR(-ENOMEM);
+   cdev = ERR_PTR(-ENOMEM);
goto free_time_in_idle_timestamp;
}
 
@@ -841,7 +841,7 @@ __cpufreq_cooling_register(struct device_node *np,
 
ret = build_dyn_power_table(cpufreq_cdev, capacitance);
if (ret) {
-   cool_dev = ERR_PTR(ret);
+   cdev = ERR_PTR(ret);
goto free_table;
}
 
@@ -852,7 +852,7 @@ __cpufreq_cooling_register(struct device_node *np,
 
ret = ida_simple_get(&cpufreq_ida, 0, 0, GFP_KERNEL);
if (ret < 0) {
-   cool_dev = ERR_PTR(ret);
+   cdev = ERR_PTR(ret);
goto free_power_table;
}
cpufreq_cdev->id = ret;
@@ -872,14 +872,13 @@ __cpufreq_cooling_register(struct device_node *np,
snprintf(dev_name, sizeof(dev_name), "thermal-cpufreq-%d",
 cpufreq_cdev->id);
 
-   cool_dev =

kcalloc(num_cpus, sizeof(*cpufreq_cdev->time_in_idle_timestamp),
GFP_KERNEL);
if (!cpufreq_cdev->time_in_idle_timestamp) {
-   cool_dev = ERR_PTR(-ENOMEM);
+   cdev = ERR_PTR(-ENOMEM);
goto free_time_in_idle;
}
 
@@ -827,7 +827,7 @@ __cpufreq_cooling_register(struct device_node *np,
cpufreq_cdev->freq_table = kmalloc(sizeof(*cpufreq_cdev->freq_table) *
  cpufreq_cdev->max_level, GFP_KERNEL);
if (!cpufreq_cdev->freq_table) {
-   cool_dev = ERR_PTR(-ENOMEM);
+   cdev = ERR_PTR(-ENOMEM);
goto free_time_in_idle_timestamp;
}
 
@@ -841,7 +841,7 @@ __cpufreq_cooling_register(struct device_node *np,
 
ret = build_dyn_power_table(cpufreq_cdev, capacitance);
if (ret) {
-   cool_dev = ERR_PTR(ret);
+   cdev = ERR_PTR(ret);
goto free_table;
}
 
@@ -852,7 +852,7 @@ __cpufreq_cooling_register(struct device_node *np,
 
ret = ida_simple_get(_ida, 0, 0, GFP_KERNEL);
if (ret < 0) {
-   cool_dev = ERR_PTR(ret);
+   cdev = ERR_PTR(ret);
goto free_power_table;
}
cpufreq_cdev->id = ret;
@@ -872,14 +872,13 @@ __cpufreq_cooling_register(struct device_node *np,
snprintf(dev_name, sizeof(dev_name), "thermal-cpufreq-%d",
 cpufreq_cdev->id);
 
-   cool_dev = thermal_of_cooling_device_register(np, dev_name,
-

[PATCH V3 11/17] thermal: cpu_cooling: get rid of 'allowed_cpus'

2017-04-18 Thread Viresh Kumar

'allowed_cpus' is a copy of policy->related_cpus and can be replaced by
it directly. In some places we are only concerned about online CPUs, and
policy->cpus can be used there.

Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 77 ---
 1 file changed, 21 insertions(+), 56 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index ce387f62c93e..1097162f7f8a 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -74,7 +74,6 @@ struct power_table {
  * frequency.
  * @max_level: maximum cooling level. One less than total number of valid
  * cpufreq frequencies.
- * @allowed_cpus: all the cpus involved for this cpufreq_cooling_device.
  * @node: list_head to link all cpufreq_cooling_device together.
  * @last_load: load measured by the latest call to 
cpufreq_get_requested_power()
  * @time_in_idle: previous reading of the absolute time that this cpu was idle
@@ -97,7 +96,6 @@ struct cpufreq_cooling_device {
unsigned int clipped_freq;
unsigned int max_level;
unsigned int *freq_table;   /* In descending order */
-   struct cpumask allowed_cpus;
struct list_head node;
u32 last_load;
u64 *time_in_idle;
@@ -161,7 +159,7 @@ static int cpufreq_thermal_notifier(struct notifier_block 
*nb,
 
mutex_lock(_list_lock);
list_for_each_entry(cpufreq_cdev, _cdev_list, node) {
-   if (!cpumask_test_cpu(policy->cpu, _cdev->allowed_cpus))
+   if (policy != cpufreq_cdev->policy)
continue;
 
/*
@@ -304,7 +302,7 @@ static u32 cpu_power_to_freq(struct cpufreq_cooling_device 
*cpufreq_cdev,
  * get_load() - get load for a cpu since last updated
  * @cpufreq_cdev:   cpufreq_cooling_device for this cpu
  * @cpu:   cpu number
- * @cpu_idx:   index of the cpu in cpufreq_cdev->allowed_cpus
+ * @cpu_idx:   index of the cpu in time_in_idle*
  *
  * Return: The average load of cpu @cpu in percentage since this
  * function was last called.
@@ -351,7 +349,7 @@ static int get_static_power(struct cpufreq_cooling_device 
*cpufreq_cdev,
 {
struct dev_pm_opp *opp;
unsigned long voltage;
-   struct cpumask *cpumask = _cdev->allowed_cpus;
+   struct cpumask *cpumask = cpufreq_cdev->policy->related_cpus;
unsigned long freq_hz = freq * 1000;
 
if (!cpufreq_cdev->plat_get_static_power || !cpufreq_cdev->cpu_dev) {
@@ -468,7 +466,7 @@ static int cpufreq_set_cur_state(struct 
thermal_cooling_device *cdev,
cpufreq_cdev->cpufreq_state = state;
cpufreq_cdev->clipped_freq = clip_freq;
 
-   cpufreq_update_policy(cpumask_any(_cdev->allowed_cpus));
+   cpufreq_update_policy(cpufreq_cdev->policy->cpu);
 
return 0;
 }
@@ -504,28 +502,18 @@ static int cpufreq_get_requested_power(struct 
thermal_cooling_device *cdev,
int i = 0, cpu, ret;
u32 static_power, dynamic_power, total_load = 0;
struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+   struct cpufreq_policy *policy = cpufreq_cdev->policy;
u32 *load_cpu = NULL;
 
-   cpu = cpumask_any_and(_cdev->allowed_cpus, cpu_online_mask);
-
-   /*
-* All the CPUs are offline, thus the requested power by
-* the cdev is 0
-*/
-   if (cpu >= nr_cpu_ids) {
-   *power = 0;
-   return 0;
-   }
-
-   freq = cpufreq_quick_get(cpu);
+   freq = cpufreq_quick_get(policy->cpu);
 
if (trace_thermal_power_cpu_get_power_enabled()) {
-   u32 ncpus = cpumask_weight(_cdev->allowed_cpus);
+   u32 ncpus = cpumask_weight(policy->related_cpus);
 
load_cpu = kcalloc(ncpus, sizeof(*load_cpu), GFP_KERNEL);
}
 
-   for_each_cpu(cpu, _cdev->allowed_cpus) {
+   for_each_cpu(cpu, policy->related_cpus) {
u32 load;
 
if (cpu_online(cpu))
@@ -550,9 +538,9 @@ static int cpufreq_get_requested_power(struct 
thermal_cooling_device *cdev,
}
 
if (load_cpu) {
-   trace_thermal_power_cpu_get_power(
-   _cdev->allowed_cpus,
-   freq, load_cpu, i, dynamic_power, static_power);
+   trace_thermal_power_cpu_get_power(policy->related_cpus, freq,
+ load_cpu, i, dynamic_power,
+ static_power);
 
kfree(load_cpu);
}
@@ -581,38 +569,22 @@ static int cpufreq_state2power(struct 
thermal_cooling_device *cdev,
   unsigned long state, u32 *power)
 {
unsigned int freq, num_cpus;
-   cpumask_var_t cpumask;
u32 static_power, dynamic_power;
int ret;
struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
 
-   if (!alloc_cpumask_var(, GFP_KERNEL))
-
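
Note for readers of the archive: the per-CPU loop in
cpufreq_get_requested_power() that the hunks above switch over to
policy->related_cpus can be pictured with the userspace sketch below.
It is only a sketch under assumptions, not the driver's code. The masks
are plain 32-bit words, every sketch_* name is hypothetical, and the
per-CPU loads are passed in rather than measured. Counting offline CPUs
as zero load is also an assumption; the visible diff only shows that the
load is read when cpu_online(cpu) is true.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 8

/* Hypothetical stand-in for struct cpufreq_policy: one bit per CPU. */
struct sketch_policy {
	uint32_t cpus;         /* CPUs currently online */
	uint32_t related_cpus; /* online and offline CPUs */
};

/* Return non-zero if @cpu is set in @mask. */
static int sketch_cpu_in(uint32_t mask, unsigned int cpu)
{
	return (mask >> cpu) & 1u;
}

/*
 * Sum the per-CPU load (in percent) over every related CPU.
 * Offline CPUs are walked too, but contribute nothing (assumption).
 */
static uint32_t sketch_total_load(const struct sketch_policy *policy,
				  const uint32_t *load_pct)
{
	uint32_t total = 0;
	unsigned int cpu;

	assert(policy != NULL && load_pct != NULL);

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (!sketch_cpu_in(policy->related_cpus, cpu))
			continue;
		if (sketch_cpu_in(policy->cpus, cpu))
			total += load_pct[cpu];
	}

	return total;
}
```

With four related CPUs of which only CPU0 and CPU2 are online, only
those two loads end up in the total.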

[PATCH V3 05/17] thermal: cpu_cooling: remove cpufreq_cooling_get_level()

2017-04-18 Thread Viresh Kumar

There is only one user of cpufreq_cooling_get_level() and that already
has a pointer to the cpufreq_cdev structure. It can directly call
get_level() instead and we can get rid of cpufreq_cooling_get_level().

Signed-off-by: Viresh Kumar 
---
 drivers/thermal/cpu_cooling.c | 33 +
 include/linux/cpu_cooling.h   |  6 --
 2 files changed, 1 insertion(+), 38 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index f1e784c22c5a..1f4b6a719d05 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -136,37 +136,6 @@ static unsigned long get_level(struct 
cpufreq_cooling_device *cpufreq_cdev,
 }
 
 /**
- * cpufreq_cooling_get_level - for a given cpu, return the cooling level.
- * @cpu: cpu for which the level is required
- * @freq: the frequency of interest
- *
- * This function will match the cooling level corresponding to the
- * requested @freq and return it.
- *
- * Return: The matched cooling level on success or THERMAL_CSTATE_INVALID
- * otherwise.
- */
-unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq)
-{
-   struct cpufreq_cooling_device *cpufreq_cdev;
-
-   mutex_lock(_list_lock);
-   list_for_each_entry(cpufreq_cdev, _cdev_list, node) {
-   if (cpumask_test_cpu(cpu, _cdev->allowed_cpus)) {
-   unsigned long level = get_level(cpufreq_cdev, freq);
-
-   mutex_unlock(_list_lock);
-   return level;
-   }
-   }
-   mutex_unlock(_list_lock);
-
-   pr_err("%s: cpu:%d not part of any cooling device\n", __func__, cpu);
-   return THERMAL_CSTATE_INVALID;
-}
-EXPORT_SYMBOL_GPL(cpufreq_cooling_get_level);
-
-/**
  * cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
  * @nb:struct notifier_block * with callback info.
  * @event: value showing cpufreq event for which this function invoked.
@@ -697,7 +666,7 @@ static int cpufreq_power2state(struct 
thermal_cooling_device *cdev,
normalised_power = (dyn_power * 100) / last_load;
target_freq = cpu_power_to_freq(cpufreq_cdev, normalised_power);
 
-   *state = cpufreq_cooling_get_level(cpu, target_freq);
+   *state = get_level(cpufreq_cdev, target_freq);
if (*state == THERMAL_CSTATE_INVALID) {
dev_err_ratelimited(>device,
"Failed to convert %dKHz for cpu %d into a 
cdev state\n",
diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h
index c156f5082758..96c5e4c2f9c8 100644
--- a/include/linux/cpu_cooling.h
+++ b/include/linux/cpu_cooling.h
@@ -82,7 +82,6 @@ of_cpufreq_power_cooling_register(struct device_node *np,
  */
 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev);
 
-unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq);
 #else /* !CONFIG_CPU_THERMAL */
 static inline struct thermal_cooling_device *
 cpufreq_cooling_register(const struct cpumask *clip_cpus)
@@ -117,11 +116,6 @@ void cpufreq_cooling_unregister(struct 
thermal_cooling_device *cdev)
 {
return;
 }
-static inline
-unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq)
-{
-   return THERMAL_CSTATE_INVALID;
-}
 #endif /* CONFIG_CPU_THERMAL */
 
 #endif /* __CPU_COOLING_H__ */
-- 
2.12.0.432.g71c3a4f4ba37
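
Note for readers of the archive: the get_level() lookup that
cpufreq_power2state() calls directly after this patch can be sketched
in userspace roughly as below. This is a sketch under assumptions, not
the driver's code. struct sketch_cooling_device, sketch_get_level() and
SKETCH_CSTATE_INVALID are hypothetical stand-ins. The only facts taken
from the series are that the frequency table is kept in descending
order and that max_level is one less than the number of valid
frequencies. Accepting only exact frequency matches is an assumption of
the sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stand-in for THERMAL_CSTATE_INVALID; the value is an assumption. */
#define SKETCH_CSTATE_INVALID ULONG_MAX

/* Hypothetical, trimmed-down stand-in for struct cpufreq_cooling_device. */
struct sketch_cooling_device {
	unsigned int max_level;         /* number of valid frequencies - 1 */
	const unsigned int *freq_table; /* kHz, in descending order */
};

/*
 * Map a frequency to its cooling level. Level 0 is the highest
 * frequency; a frequency that is not in the table is invalid.
 */
static unsigned long sketch_get_level(const struct sketch_cooling_device *cdev,
				      unsigned int freq)
{
	unsigned long level;

	assert(cdev != NULL && cdev->freq_table != NULL);

	for (level = 0; level <= cdev->max_level; level++) {
		if (freq == cdev->freq_table[level])
			return level;

		/* Descending table: a larger freq cannot appear later. */
		if (freq > cdev->freq_table[level])
			break;
	}

	return SKETCH_CSTATE_INVALID;
}
```

Because the caller already holds the cooling device, no list walk or
lock is needed to find it, which is the point of dropping the exported
wrapper.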

[PATCH V3 00/17] thermal: cpu_cooling: improve interaction with cpufreq core

2017-04-18 Thread Viresh Kumar

Hi Guys,

The cpu_cooling driver is designed to use CPU frequency scaling to avoid
high thermal states for a platform. But it wasn't glued really well with
the cpufreq core. For example, clipped-cpus is copied from the policy
structure, and it's much better to use the policy->cpus (or related_cpus)
fields directly as they may have been updated. Not that things were
broken before this series, but they can be optimized a bit more.

This series tries to improve interactions between the cpufreq core and
the cpu_cooling driver and does some fixes/cleanups to the cpu_cooling
driver.

I have tested it on ARM 32 (exynos) and 64 bit (hikey) boards and have
pushed them for 0-day build bot and kernel CI testing as well. We should
know if something is broken with these.

@Lukasz: It would be good if you can give them a test, especially because
of your work on the "power" specific bits in the driver. This series
already has the improvements you suggested.

Pushed here as well:

git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git thermal/cooling

V2->V3:
- Additional check to guarantee that policy is valid.
- Initialize freq-table and cpufreq_cdev->policy fields before they are
  used by the power-cooling functionality.
- Thanks Lukasz for testing out and suggesting these changes.

V1->V2:
- Name cpufreq cooling dev as cpufreq_cdev everywhere (Eduardo).

--
viresh

Viresh Kumar (17):
  thermal: cpu_cooling: Avoid accessing potentially freed structures
  thermal: cpu_cooling: rearrange globals
  thermal: cpu_cooling: Name cpufreq cooling devices as cpufreq_cdev
  thermal: cpu_cooling: replace cool_dev with cdev
  thermal: cpu_cooling: remove cpufreq_cooling_get_level()
  thermal: cpu_cooling: get rid of a variable in cpufreq_set_cur_state()
  thermal: cpu_cooling: use cpufreq_policy to register cooling device
  cpufreq: create cpufreq_table_count_valid_entries()
  thermal: cpu_cooling: store cpufreq policy
  thermal: cpu_cooling: OPPs are registered for all CPUs
  thermal: cpu_cooling: get rid of 'allowed_cpus'
  thermal: cpu_cooling: merge frequency and power tables
  thermal: cpu_cooling: create structure for idle time stats
  thermal: cpu_cooling: get_level() can't fail
  thermal: cpu_cooling: don't store cpu_dev in cpufreq_cdev
  thermal: cpu_cooling: 'freq' can't be zero in cpufreq_state2power()
  thermal: cpu_cooling: Rearrange struct cpufreq_cooling_device

 drivers/cpufreq/arm_big_little.c   |   2 +-
 drivers/cpufreq/cpufreq-dt.c   |   2 +-
 drivers/cpufreq/cpufreq_stats.c|  13 +-
 drivers/cpufreq/dbx500-cpufreq.c   |   2 +-
 drivers/cpufreq/mt8173-cpufreq.c   |   4 +-
 drivers/cpufreq/qoriq-cpufreq.c|   3 +-
 drivers/thermal/cpu_cooling.c  | 602 +
 drivers/thermal/imx_thermal.c  |  22 +-
 drivers/thermal/ti-soc-thermal/ti-thermal-common.c |  22 +-
 include/linux/cpu_cooling.h|  32 +-
 include/linux/cpufreq.h|  14 +
 11 files changed, 311 insertions(+), 407 deletions(-)

-- 
2.12.0.432.g71c3a4f4ba37
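
Note for readers of the archive: the cover letter's point about using
the policy's masks directly, instead of a copy taken at registration
time, can be illustrated with the userspace sketch below. It is a
sketch under assumptions and not code from the series. CPU masks are
modelled as 32-bit words, and all sketch_* types and helpers are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpufreq_policy: one bit per CPU. */
struct sketch_policy {
	uint32_t cpus;         /* CPUs currently online */
	uint32_t related_cpus; /* online and offline CPUs */
};

/* Before: the cooling device keeps a private snapshot of the mask. */
struct sketch_cdev_copy {
	uint32_t allowed_cpus;
};

/* After: the cooling device keeps a pointer to the policy itself. */
struct sketch_cdev_ref {
	const struct sketch_policy *policy;
};

/* Count the set bits in a mask (stand-in for cpumask_weight()). */
static unsigned int sketch_weight(uint32_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Number of CPUs as seen through the snapshot. */
static unsigned int sketch_ncpus_copy(const struct sketch_cdev_copy *c)
{
	assert(c != NULL);
	return sketch_weight(c->allowed_cpus);
}

/* Number of online CPUs as seen through the live policy. */
static unsigned int sketch_ncpus_ref(const struct sketch_cdev_ref *c)
{
	assert(c != NULL && c->policy != NULL);
	return sketch_weight(c->policy->cpus);
}
```

If the policy's mask changes after registration, the snapshot keeps
reporting the old set of CPUs while the pointer-based variant follows
the policy.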

Re: [PATCH] acpi: fix typo

2017-04-18 Thread Cao jin

Hi

On 04/19/2017 08:20 AM, Rafael J. Wysocki wrote:
> On Fri, Mar 31, 2017 at 11:46 AM, Cao jin  wrote:
>> Signed-off-by: Cao jin 
>> ---
>>  Documentation/acpi/linuxized-acpica.txt | 10 +-
> 
> Please send changes to this file separately.
> 
>>  include/acpi/actypes.h  |  4 ++--
> 
> This one belongs to ACPICA and there is a special process for
> modifying ACPICA files.
> 

I have read the process. So does that mean we never send ACPICA patches
to the kernel mailing list, and they can only be sent to the ACPICA project?

-- 
Sincerely,
Cao jin

Re: [PATCH V2 00/17] thermal: cpu_cooling: improve interaction with cpufreq core

2017-04-18 Thread Viresh Kumar

On 18-04-17, 15:40, Lukasz Luba wrote:
> Hi Viresh,
> 
> I have checked out your branch at the newest commit:
> 908063832c268f8add94
> I have built it and run it on my Juno r2.
> I have some python tests for IPA and I ran one of them.
> 
> I have seen a few issues, so I have created a patch just
> to be able to run IPA.
> My next email will have the patch so you can see the changes.
> 
> IPA does not work with this patch set.
> I have tested two source trees from your repo:
> 1. your change 908063832c268f8add94
> 2. your base 8f506e0faf4e2a4a0bde9f9b1
> 
> In case 1, IPA does not work: the temperature rises to 83 degC.
> In case 2, it works: the temperature is limited to 65 degC.

Yeah, there were some power-specific cases that weren't covered in my
tests, and thanks a lot for testing it out. I have pushed my branch again and it
has all your fixes (a bit refined) in it.

> On Monday I can allocate more time for it.

My branch should just work now. Please see if you can allocate 10-15 min today
to give it a try, so that we can get it in earlier. I will send a V3 today and
your Tested-by would be very much appreciated.

Thanks Lukasz.

-- 
viresh

Re: [PATCH] make TIOCSTI ioctl require CAP_SYS_ADMIN

2017-04-18 Thread Kees Cook

On Tue, Apr 18, 2017 at 9:58 PM, Serge E. Hallyn  wrote:
> On Tue, Apr 18, 2017 at 11:45:26PM -0400, Matt Brown wrote:
>> This patch reproduces GRKERNSEC_HARDEN_TTY functionality from the grsecurity
>> project in-kernel.
>>
>> This will create the Kconfig SECURITY_TIOCSTI_RESTRICT and the corresponding
>> sysctl kernel.tiocsti_restrict that, when activated, restrict all TIOCSTI
>> ioctl calls from non CAP_SYS_ADMIN users.
>>
>> Possible effects on userland:
>>
>> There could be a few user programs that would be effected by this
>> change.
>> See: 
>> notable programs are: agetty, csh, xemacs and tcsh
>>
>> However, I still believe that this change is worth it given that the
>> Kconfig defaults to n. This will be a feature that is turned on for the
>
> It's not worthless, but note that for instance before this was fixed
> in lxc, this patch would not have helped with escapes from privileged
> containers.
>
>> same reason that people activate it when using grsecurity. Users of this
>> opt-in feature will realize that they are choosing security over some OS
>> features like unprivileged TIOCSTI ioctls, as should be clear in the
>> Kconfig help message.
>>
>> Threat Model/Patch Rational:
>>
>> >From grsecurity's config for GRKERNSEC_HARDEN_TTY.
>>
>>  | There are very few legitimate uses for this functionality and it
>>  | has made vulnerabilities in several 'su'-like programs possible in
>>  | the past.  Even without these vulnerabilities, it provides an
>>  | attacker with an easy mechanism to move laterally among other
>>  | processes within the same user's compromised session.
>>
>> So if one process within a tty session becomes compromised it can follow
>> that additional processes, that are thought to be in different security
>> boundaries, can be compromised as a result. When using a program like su
>> or sudo, these additional processes could be in a tty session where TTY file
>> descriptors are indeed shared over privilege boundaries.
>>
>> This is also an excellent writeup about the issue:
>> 
>>
>> Signed-off-by: Matt Brown 

Thanks for working on this! I think it'll be nice to have available.

>> ---
>>  drivers/tty/tty_io.c |  4 
>>  include/linux/tty.h  |  2 ++
>>  kernel/sysctl.c  | 12 
>>  security/Kconfig | 13 +
>>  4 files changed, 31 insertions(+)
>>
>> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
>> index e6d1a65..31894e8 100644
>> --- a/drivers/tty/tty_io.c
>> +++ b/drivers/tty/tty_io.c
>> @@ -2296,11 +2296,15 @@ static int tty_fasync(int fd, struct file *filp, int 
>> on)
>>   *   FIXME: may race normal receive processing
>>   */
>>
>> +int tiocsti_restrict = IS_ENABLED(CONFIG_SECURITY_TIOCSTI_RESTRICT);
>> +
>>  static int tiocsti(struct tty_struct *tty, char __user *p)
>>  {
>>   char ch, mbz = 0;
>>   struct tty_ldisc *ld;
>>
>> + if (tiocsti_restrict && !capable(CAP_SYS_ADMIN))
>> + return -EPERM;

I wonder if it might be worth adding a pr_warn_ratelimited() here to
help people identify either programs that want to use this feature or
actual attacks?

>>   if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN))
>>   return -EPERM;
>>   if (get_user(ch, p))
>> diff --git a/include/linux/tty.h b/include/linux/tty.h
>> index 1017e904..7011102 100644
>> --- a/include/linux/tty.h
>> +++ b/include/linux/tty.h
>> @@ -342,6 +342,8 @@ struct tty_file_private {
>>   struct list_head list;
>>  };
>>
>> +extern int tiocsti_restrict;
>> +
>>  /* tty magic number */
>>  #define TTY_MAGIC0x5401
>>
>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
>> index acf0a5a..68d1363 100644
>> --- a/kernel/sysctl.c
>> +++ b/kernel/sysctl.c
>> @@ -67,6 +67,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>
>>  #include 
>>  #include 
>> @@ -833,6 +834,17 @@ static struct ctl_table kern_table[] = {
>>   .extra2 = ,
>>   },
>>  #endif
>> +#if defined CONFIG_TTY
>> + {
>> + .procname   = "tiocsti_restrict",
>> + .data   = _restrict,

Since this is a new sysctl, it'll need to get documented in
Documentation/sysctl/kernel.txt as part of this patch.
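
For whoever writes that documentation entry, a rough user-space model of
how a write to this sysctl would behave with the proc_handler named in this
hunk: proc_dointvec_minmax_sysadmin refuses writers without CAP_SYS_ADMIN,
then applies the usual min/max range check. The helper name
sysctl_write_model is hypothetical, and the bounds are passed in by the
caller because the extra1/extra2 values are not legible in the quoted
patch; 0 and 1 are only an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Models a write to a sysctl whose handler is
 * proc_dointvec_minmax_sysadmin. The caller supplies the bounds; the
 * patch as quoted does not show them.
 */
static int sysctl_write_model(int *data, int val, int min, int max,
			      bool cap_sys_admin)
{
	if (!cap_sys_admin)
		return -EPERM;      /* capability check runs first */
	if (val < min || val > max)
		return -EINVAL;     /* out-of-range values are rejected */
	*data = val;
	return 0;
}
```

Reads are not modelled; with mode 0644 any user could read the current
value.
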

>> + .maxlen = sizeof(int),
>> + .mode   = 0644,
>> + .proc_handler   = proc_dointvec_minmax_sysadmin,
>> + .extra1 = ,
>> + .extra2 = ,
>> + },
>> +#endif
>>   {
>>   .procname   = "ngroups_max",
>>   .data   = _max,
>> diff --git a/security/Kconfig b/security/Kconfig
>> index 3ff1bf9..7d13331 100644
>> --- a/security/Kconfig
>> +++ b/security/Kconfig
>> @@ -18,6 +18,19 @@ config SECURITY_DMESG_RESTRICT
>>
>> If you are unsure how to answer this question, answer N.
>>
>> +config

Re: [PATCH v2] usb: dwc3: add disable u2mac linestate check quirk

2017-04-18 Thread Guenter Roeck

On Tue, Apr 18, 2017 at 8:59 PM, wlf  wrote:
> Dear Guenter,
>
>
>
> On 2017-04-18 21:18, Guenter Roeck wrote:
>>
>> On Mon, Apr 17, 2017 at 10:17 PM, William Wu 
>> wrote:
>>>
>>> This patch adds a quirk to disable the USB 2.0 MAC linestate check
>>> during HS transmit. According to the dwc3 databook, it can be used on
>>> some special platforms where the linestate does not reflect the
>>> expected line state (J) during transmission.
>>>
>>> When this quirk is used, the controller implements a fixed 40-bit
>>> TxEndDelay after the packet is given on UTMI and ignores the
>>> linestate during the transmit of a token (during token-to-token
>>> and token-to-data IPGAP).
>>>
>>> Some rockchip platforms (e.g. rk3399) need to disable
>>> the u2mac linestate check to decrease the SSPLIT token to SETUP
>>> token inter-packet delay from 566ns to 466ns, which fixes the issue
>>> of FS/LS devices not being recognized when inserted through a USB 3.0 hub.
>>>
>>> Signed-off-by: William Wu 
>>> ---
>>> Changes in v2:
>>> - fix coding style
>>>
>>>   Documentation/devicetree/bindings/usb/dwc3.txt |  2 ++
>>>   drivers/usb/dwc3/core.c| 14 ++
>>>   drivers/usb/dwc3/core.h|  4 
>>>   3 files changed, 16 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt
>>> b/Documentation/devicetree/bindings/usb/dwc3.txt
>>> index f658f39..6a89f0c 100644
>>> --- a/Documentation/devicetree/bindings/usb/dwc3.txt
>>> +++ b/Documentation/devicetree/bindings/usb/dwc3.txt
>>> @@ -45,6 +45,8 @@ Optional properties:
>>>  a free-running PHY clock.
>>>- snps,dis-del-phy-power-chg-quirk: when set core will change PHY
>>> power
>>>  from P0 to P1/P2/P3 without delay.
>>> + - snps,tx-ipgap-linecheck-dis-quirk: when set, disable u2mac linestate
>>> check
>>> +   during HS transmit.
>>
>> All other disable-something quirks are named
>> "snps,dis-something-quirk". Maybe use the same naming convention ?
>
> Yes, good idea! I will fix it with "snps,dis-tx-ipgap-linecheck-quirk" in
> the next patch version.
> Thanks :-)
>>
>>
>>>- snps,is-utmi-l1-suspend: true when DWC3 asserts output signal
>>>  utmi_l1_suspend_n, false when asserts
>>> utmi_sleep_n
>>>- snps,hird-threshold: HIRD threshold
>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>>> index 455d89a..03429c5 100644
>>> --- a/drivers/usb/dwc3/core.c
>>> +++ b/drivers/usb/dwc3/core.c
>>> @@ -796,15 +796,19 @@ static int dwc3_core_init(struct dwc3 *dwc)
>>>  dwc3_writel(dwc->regs, DWC3_GUCTL2, reg);
>>>  }
>>>
>>> +   reg = dwc3_readl(dwc->regs, DWC3_GUCTL1);
>>> +
>>
>> My understanding is that the register was only introduced with dwc3
>> revision 2.50a. Is it ok to read and write it unconditionally ?
>
> Yes, according to the dwc3 databook, DWC3_GUCTL1 was introduced in 2.50a.
> Maybe it's better
> to read and write it only when we know our controller version.
>
> Would it be good to fix it like the following patch?
> But this patch has the problem that we need to read and write the register
> twice if our controller version is >= 2.90a and it needs this quirk.
>
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -806,6 +806,12 @@ static int dwc3_core_init(struct dwc3 *dwc)
> dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
> }
>
> +   if (dwc->dis_tx_ipgap_linecheck_quirk) {
> +   reg = dwc3_readl(dwc->regs, DWC3_GUCTL1);
> +   reg |= DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS;
> +   dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
> +   }
> +
>

How about this ?

if (dwc->revision >= DWC3_REVISION_250A) {
reg = dwc3_readl(dwc->regs, DWC3_GUCTL1);
if (dwc->revision >= DWC3_REVISION_290A)
reg |= DWC3_GUCTL1_DEV_L1_EXIT_BY_HW;
if (dwc->dis_tx_ipgap_linecheck_quirk)
   reg |= DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS;
dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
}
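
The same single read-modify-write can be modelled outside the kernel to
check which bits end up set for a given revision. This is only a sketch:
guctl1_update is a hypothetical helper, and the revision and bit values
below are assumed from drivers/usb/dwc3/core.h rather than taken from this
thread, so they should be verified against the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values assumed from drivers/usb/dwc3/core.h; verify before relying on them. */
#define DWC3_REVISION_250A                 0x5533250aU
#define DWC3_REVISION_290A                 0x5533290aU
#define DWC3_GUCTL1_DEV_L1_EXIT_BY_HW      (1U << 24)
#define DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS (1U << 28)

/*
 * Hypothetical helper: returns the value to write back to GUCTL1, given
 * the value just read. Sets *touch to false when the controller revision
 * predates the register, in which case it must not be accessed at all.
 */
static uint32_t guctl1_update(uint32_t revision, bool dis_linecheck_quirk,
			      uint32_t reg, bool *touch)
{
	*touch = revision >= DWC3_REVISION_250A;
	if (!*touch)
		return reg;
	if (revision >= DWC3_REVISION_290A)
		reg |= DWC3_GUCTL1_DEV_L1_EXIT_BY_HW;
	if (dis_linecheck_quirk)
		reg |= DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS;
	return reg;
}
```

Both bits are folded into one register value, so a 2.90a or later
controller that needs the quirk is read and written once rather than twice.
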

Thanks,
Guenter

> Hi John & Felipe,
> Could you give me some suggestions?
> Thank you!
>
>>>  /*
>>>   * Enable hardware control of sending remote wakeup in HS when
>>>   * the device is in the L1 state.
>>>   */
>>> -   if (dwc->revision >= DWC3_REVISION_290A) {
>>> -   reg = dwc3_readl(dwc->regs, DWC3_GUCTL1);
>>> +   if (dwc->revision >= DWC3_REVISION_290A)
>>>  reg |= DWC3_GUCTL1_DEV_L1_EXIT_BY_HW;
>>> -   dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
>>> -   }
>>> +
>>> +   if (dwc->tx_ipgap_linecheck_dis_quirk)
>>> +   reg |= DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS;
>>> +
>>> +   dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
>>>
>>>  return 0;
>>>
>>> @@ -1023,6 +1027,8 @@ static void dwc3_get_properties(struct dwc3 *dwc)
>>>

[PATCH 4/4] ARM: sun8i: h3: bananapi-m2-plus: Enable USB OTG

2017-04-18 Thread Chen-Yu Tsai

The Bananapi M2 Plus has a USB OTG port that can be used in both
powered host mode and peripheral mode. When in peripheral mode,
the port does not power the board. There is no VBUS sensing on
the port.

This patch adds the regulator controlling VBUS on the OTG port,
the GPIO for the ID detect pin, and enables the USB OTG and host
controllers.

Signed-off-by: Chen-Yu Tsai 
---
 arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts 
b/arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts
index 52acbe111cad..17c7c088cdea 100644
--- a/arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts
+++ b/arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts
@@ -92,6 +92,10 @@
};
 };
 
+ {
+   status = "okay";
+};
+
  {
status = "okay";
 };
@@ -145,6 +149,10 @@
status = "okay";
 };
 
+ {
+   status = "okay";
+};
+
  {
status = "okay";
 };
@@ -170,6 +178,11 @@
};
 };
 
+_usb0_vbus {
+   gpio = < 3 11 GPIO_ACTIVE_HIGH>; /* PD11 */
+   status = "okay";
+};
+
  {
pinctrl-names = "default";
pinctrl-0 = <_pins_a>;
@@ -182,7 +195,14 @@
status = "okay";
 };
 
+_otg {
+   dr_mode = "otg";
+   status = "okay";
+};
+
  {
-   /* USB VBUS is on as long as VCC-IO is on */
+   usb0_id_det-gpios = <_pio 0 6 GPIO_ACTIVE_HIGH>; /* PL6 */
+   usb0_vbus-supply = <_usb0_vbus>;
+   /* USB host VBUS is on as long as VCC-IO is on */
status = "okay";
 };
-- 
2.11.0
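
The GPIO specifiers in the patch above pair a bank index and a pin number
with a pin-name comment (3 11 for PD11, 0 6 for PL6). A small sketch of
that mapping, under the assumption that banks A..K count from zero on the
main pin controller while L and M count from zero on the separate r_pio
controller; the function and struct names are hypothetical and this is not
code from the kernel.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical result type: which controller, and the two leading cells. */
struct sunxi_gpio_cells {
	bool r_pio;   /* true for the PL/PM banks, assumed to sit on r_pio */
	int bank;     /* first cell after the phandle */
	int pin;      /* second cell */
};

/*
 * Parses a pin name such as "PD11". Returns 0 on success, -1 if the
 * name is malformed. The A..K / L..M split is an assumption about the
 * sunxi layout, not something stated in the patch.
 */
static int sunxi_pin_to_cells(const char *name, struct sunxi_gpio_cells *out)
{
	char *end;
	long pin;

	if (!name || name[0] != 'P' || name[1] < 'A' || name[1] > 'M')
		return -1;
	if (name[2] < '0' || name[2] > '9')
		return -1;
	pin = strtol(name + 2, &end, 10);
	if (*end != '\0' || pin > 31)
		return -1;

	out->r_pio = name[1] >= 'L';
	out->bank = out->r_pio ? name[1] - 'L' : name[1] - 'A';
	out->pin = (int)pin;
	return 0;
}
```

Run on the names in the comments, this gives bank 3 pin 11 for PD11 and
bank 0 pin 6 on r_pio for PL6, matching the cells in the diff.
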

[PATCH 0/4] ARM: sunxi: device tree pinctrl clean up and H3 OTG

2017-04-18 Thread Chen-Yu Tsai

Hi Maxime,

This series has 2 parts. The parts are largely unrelated, though the
second part should be applied after the first part, so we don't
accidentally mux pins that we shouldn't. Hence I'm sending them
together.

The first 2 patches clean up the sunxi device tree files, removing
pinmux settings for common GPIO pins. These include the enable pins
for the common regulators, and the mmc0 card detect pin from the
reference designs.

The second part, the latter 2 patches, enables USB OTG on the Orangepi
PC, PC Plus, Plus 2E, and the Bananapi M2+. The first 3 boards are
bunched together, due to how the PC Plus and Plus 2E device trees include
the device tree of the Opi PC.

Regards
ChenYu

Chen-Yu Tsai (4):
  ARM: sunxi: common-regulators: Drop pinmux settings for GPIO pins
  ARM: sunxi: Drop mmc0_cd_pin_reference_design pinmux setting
  ARM: sun8i: h3: orangepi-pc: Enable USB OTG
  ARM: sun8i: h3: bananapi-m2-plus: Enable USB OTG

 arch/arm/boot/dts/sun4i-a10-a1000.dts  |  2 +-
 arch/arm/boot/dts/sun4i-a10-ba10-tvbox.dts |  2 +-
 arch/arm/boot/dts/sun4i-a10-chuwi-v7-cw0825.dts|  2 +-
 arch/arm/boot/dts/sun4i-a10-cubieboard.dts |  2 +-
 arch/arm/boot/dts/sun4i-a10-dserve-dsrv9703c.dts   |  2 +-
 arch/arm/boot/dts/sun4i-a10-gemei-g9.dts   |  2 +-
 arch/arm/boot/dts/sun4i-a10-hackberry.dts  |  2 +-
 arch/arm/boot/dts/sun4i-a10-hyundai-a7hd.dts   |  6 +
 arch/arm/boot/dts/sun4i-a10-inet1.dts  |  2 +-
 arch/arm/boot/dts/sun4i-a10-inet97fv2.dts  |  2 +-
 arch/arm/boot/dts/sun4i-a10-inet9f-rev03.dts   |  2 +-
 .../boot/dts/sun4i-a10-itead-iteaduino-plus.dts|  2 +-
 arch/arm/boot/dts/sun4i-a10-jesurun-q5.dts |  2 +-
 arch/arm/boot/dts/sun4i-a10-marsboard.dts  |  2 +-
 arch/arm/boot/dts/sun4i-a10-mini-xplus.dts |  2 +-
 arch/arm/boot/dts/sun4i-a10-mk802.dts  |  2 +-
 arch/arm/boot/dts/sun4i-a10-mk802ii.dts|  2 +-
 arch/arm/boot/dts/sun4i-a10-olinuxino-lime.dts |  2 +-
 arch/arm/boot/dts/sun4i-a10-pcduino.dts|  2 +-
 arch/arm/boot/dts/sun4i-a10-pov-protab2-ips9.dts   |  2 +-
 arch/arm/boot/dts/sun4i-a10.dtsi   |  6 -
 arch/arm/boot/dts/sun5i-a10s-auxtek-t003.dts   |  8 --
 arch/arm/boot/dts/sun5i-a10s-auxtek-t004.dts   |  4 ---
 arch/arm/boot/dts/sun5i-a10s-olinuxino-micro.dts   |  4 ---
 arch/arm/boot/dts/sun5i-a10s-wobo-i5.dts   |  4 ---
 .../boot/dts/sun5i-a13-empire-electronix-d709.dts  |  4 ---
 arch/arm/boot/dts/sun5i-a13-hsg-h702.dts   |  5 
 arch/arm/boot/dts/sun5i-a13-olinuxino.dts  |  4 ---
 arch/arm/boot/dts/sun6i-a31-hummingbird.dts|  5 
 arch/arm/boot/dts/sun7i-a20-cubieboard2.dts|  2 +-
 arch/arm/boot/dts/sun7i-a20-cubietruck.dts |  2 +-
 arch/arm/boot/dts/sun7i-a20-hummingbird.dts|  2 +-
 arch/arm/boot/dts/sun7i-a20-i12-tvbox.dts  |  2 +-
 arch/arm/boot/dts/sun7i-a20-icnova-swac.dts|  2 +-
 arch/arm/boot/dts/sun7i-a20-itead-ibox.dts |  2 +-
 arch/arm/boot/dts/sun7i-a20-lamobo-r1.dts  |  8 --
 arch/arm/boot/dts/sun7i-a20-m3.dts |  2 +-
 arch/arm/boot/dts/sun7i-a20-mk808c.dts |  2 +-
 arch/arm/boot/dts/sun7i-a20-olimex-som-evb.dts |  2 +-
 arch/arm/boot/dts/sun7i-a20-olinuxino-lime.dts |  2 +-
 arch/arm/boot/dts/sun7i-a20-olinuxino-lime2.dts|  2 +-
 arch/arm/boot/dts/sun7i-a20-olinuxino-micro.dts|  2 +-
 arch/arm/boot/dts/sun7i-a20-pcduino3-nano.dts  |  2 +-
 arch/arm/boot/dts/sun7i-a20-pcduino3.dts   |  6 +
 arch/arm/boot/dts/sun7i-a20-wexler-tab7200.dts |  2 +-
 arch/arm/boot/dts/sun7i-a20-wits-pro-a20-dkt.dts   |  2 +-
 arch/arm/boot/dts/sun7i-a20.dtsi   |  6 -
 arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts| 22 +++-
 arch/arm/boot/dts/sun8i-h3-orangepi-2.dts  |  4 ---
 arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts | 22 +++-
 arch/arm/boot/dts/sunxi-common-regulators.dtsi | 30 --
 51 files changed, 78 insertions(+), 138 deletions(-)

-- 
2.11.0

[PATCH 3/4] ARM: sun8i: h3: orangepi-pc: Enable USB OTG

2017-04-18 Thread Chen-Yu Tsai

The Orange Pi PC, PC Plus, and Plus 2E all have a USB OTG port
that can be used in both powered host mode and peripheral mode.
When in peripheral mode, the port does not power the board.
There is no VBUS sensing on the port. All three boards have all
related pins routed the same way.

The device tree file for the Orange Pi Plus 2E is based on the
Orange Pi PC Plus, which itself is based on the Orange Pi PC.
Changes to the base Orange Pi PC device tree file affect all 3
boards.

This patch adds the regulator controlling VBUS on the OTG port,
the GPIO for the ID detect pin, and enables the USB OTG and host
controllers.

Signed-off-by: Chen-Yu Tsai 
---
 arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts 
b/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts
index f148111c326d..1a044b17d6c6 100644
--- a/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts
+++ b/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts
@@ -97,6 +97,10 @@
status = "okay";
 };
 
+ {
+   status = "okay";
+};
+
  {
status = "okay";
 };
@@ -125,6 +129,10 @@
status = "okay";
 };
 
+ {
+   status = "okay";
+};
+
  {
status = "okay";
 };
@@ -156,6 +164,11 @@
};
 };
 
+_usb0_vbus {
+   gpio = <_pio 0 2 GPIO_ACTIVE_HIGH>; /* PL2 */
+   status = "okay";
+};
+
  {
pinctrl-names = "default";
pinctrl-0 = <_pins_a>;
@@ -180,7 +193,14 @@
status = "disabled";
 };
 
+_otg {
+   dr_mode = "otg";
+   status = "okay";
+};
+
  {
-   /* USB VBUS is always on */
+   usb0_id_det-gpios = < 6 12 GPIO_ACTIVE_HIGH>; /* PG12 */
+   usb0_vbus-supply = <_usb0_vbus>;
+   /* VBUS on USB host ports are always on */
status = "okay";
 };
-- 
2.11.0

[PATCH 1/4] ARM: sunxi: common-regulators: Drop pinmux settings for GPIO pins

2017-04-18 Thread Chen-Yu Tsai

As part of our effort to move pinctrl/GPIO interlocking into the
driver where it belongs, this patch drops the definition and usage
of the pinmux settings for the common regulators defined in
sunxi-common-regulators.dtsi.

Signed-off-by: Chen-Yu Tsai 
---
 arch/arm/boot/dts/sun4i-a10-hyundai-a7hd.dts   |  4 ---
 arch/arm/boot/dts/sun5i-a10s-auxtek-t003.dts   |  8 --
 arch/arm/boot/dts/sun5i-a10s-auxtek-t004.dts   |  4 ---
 arch/arm/boot/dts/sun5i-a10s-olinuxino-micro.dts   |  4 ---
 arch/arm/boot/dts/sun5i-a10s-wobo-i5.dts   |  4 ---
 .../boot/dts/sun5i-a13-empire-electronix-d709.dts  |  4 ---
 arch/arm/boot/dts/sun5i-a13-hsg-h702.dts   |  5 
 arch/arm/boot/dts/sun5i-a13-olinuxino.dts  |  4 ---
 arch/arm/boot/dts/sun6i-a31-hummingbird.dts|  5 
 arch/arm/boot/dts/sun7i-a20-lamobo-r1.dts  |  8 --
 arch/arm/boot/dts/sun7i-a20-pcduino3.dts   |  4 ---
 arch/arm/boot/dts/sun8i-h3-orangepi-2.dts  |  4 ---
 arch/arm/boot/dts/sunxi-common-regulators.dtsi | 30 --
 13 files changed, 88 deletions(-)

diff --git a/arch/arm/boot/dts/sun4i-a10-hyundai-a7hd.dts 
b/arch/arm/boot/dts/sun4i-a10-hyundai-a7hd.dts
index 85dcf81ab64e..bc4351bb851f 100644
--- a/arch/arm/boot/dts/sun4i-a10-hyundai-a7hd.dts
+++ b/arch/arm/boot/dts/sun4i-a10-hyundai-a7hd.dts
@@ -120,10 +120,6 @@
status = "okay";
 };
 
-_vbus_pin_a {
-   pins = "PH6";
-};
-
 _otg {
dr_mode = "otg";
status = "okay";
diff --git a/arch/arm/boot/dts/sun5i-a10s-auxtek-t003.dts 
b/arch/arm/boot/dts/sun5i-a10s-auxtek-t003.dts
index c6f742a7e69f..d2dee8d434bf 100644
--- a/arch/arm/boot/dts/sun5i-a10s-auxtek-t003.dts
+++ b/arch/arm/boot/dts/sun5i-a10s-auxtek-t003.dts
@@ -136,14 +136,6 @@
status = "okay";
 };
 
-_vbus_pin_a {
-   pins = "PG13";
-};
-
-_vbus_pin_a {
-   pins = "PB10";
-};
-
 _otg {
dr_mode = "host";
status = "okay";
diff --git a/arch/arm/boot/dts/sun5i-a10s-auxtek-t004.dts 
b/arch/arm/boot/dts/sun5i-a10s-auxtek-t004.dts
index a27c3fa58736..16f839df4227 100644
--- a/arch/arm/boot/dts/sun5i-a10s-auxtek-t004.dts
+++ b/arch/arm/boot/dts/sun5i-a10s-auxtek-t004.dts
@@ -168,10 +168,6 @@
status = "okay";
 };
 
-_vbus_pin_a {
-   pins = "PG13";
-};
-
  {
pinctrl-names = "default";
pinctrl-0 = <_id_detect_pin>;
diff --git a/arch/arm/boot/dts/sun5i-a10s-olinuxino-micro.dts 
b/arch/arm/boot/dts/sun5i-a10s-olinuxino-micro.dts
index 894f874a5beb..eff36fe1aaa3 100644
--- a/arch/arm/boot/dts/sun5i-a10s-olinuxino-micro.dts
+++ b/arch/arm/boot/dts/sun5i-a10s-olinuxino-micro.dts
@@ -271,10 +271,6 @@
status = "okay";
 };
 
-_vbus_pin_a {
-   pins = "PG11";
-};
-
  {
pinctrl-names = "default";
pinctrl-0 = <_id_detect_pin>;
diff --git a/arch/arm/boot/dts/sun5i-a10s-wobo-i5.dts 
b/arch/arm/boot/dts/sun5i-a10s-wobo-i5.dts
index ea3e5655a61b..5482be174e12 100644
--- a/arch/arm/boot/dts/sun5i-a10s-wobo-i5.dts
+++ b/arch/arm/boot/dts/sun5i-a10s-wobo-i5.dts
@@ -216,10 +216,6 @@
status = "okay";
 };
 
-_vbus_pin_a {
-   pins = "PG12";
-};
-
  {
usb1_vbus-supply = <_usb1_vbus>;
status = "okay";
diff --git a/arch/arm/boot/dts/sun5i-a13-empire-electronix-d709.dts b/arch/arm/boot/dts/sun5i-a13-empire-electronix-d709.dts
index 34411d27aadf..3dbb0d7c2f8c 100644
--- a/arch/arm/boot/dts/sun5i-a13-empire-electronix-d709.dts
+++ b/arch/arm/boot/dts/sun5i-a13-empire-electronix-d709.dts
@@ -207,10 +207,6 @@
status = "okay";
 };
 
-_vbus_pin_a {
-   pins = "PG12";
-};
-
  {
pinctrl-names = "default";
pinctrl-0 = <_id_detect_pin>, <_vbus_detect_pin>;
diff --git a/arch/arm/boot/dts/sun5i-a13-hsg-h702.dts b/arch/arm/boot/dts/sun5i-a13-hsg-h702.dts
index 2489c16f7efa..584fa579ded2 100644
--- a/arch/arm/boot/dts/sun5i-a13-hsg-h702.dts
+++ b/arch/arm/boot/dts/sun5i-a13-hsg-h702.dts
@@ -186,7 +186,6 @@
 };
 
 _usb0_vbus {
-   pinctrl-0 = <_vbus_pin_a>;
gpio = < 6 12 GPIO_ACTIVE_HIGH>; /* PG12 */
status = "okay";
 };
@@ -202,10 +201,6 @@
status = "okay";
 };
 
-_vbus_pin_a {
-   pins = "PG12";
-};
-
  {
pinctrl-names = "default";
pinctrl-0 = <_id_detect_pin>, <_vbus_detect_pin>;
diff --git a/arch/arm/boot/dts/sun5i-a13-olinuxino.dts b/arch/arm/boot/dts/sun5i-a13-olinuxino.dts
index 95f591bb8ced..38072c7e10e2 100644
--- a/arch/arm/boot/dts/sun5i-a13-olinuxino.dts
+++ b/arch/arm/boot/dts/sun5i-a13-olinuxino.dts
@@ -269,10 +269,6 @@
status = "okay";
 };
 
-_vbus_pin_a {
-   pins = "PG12";
-};
-
  {
pinctrl-names = "default";
pinctrl-0 = <_id_detect_pin>, <_vbus_detect_pin>;
diff --git a/arch/arm/boot/dts/sun6i-a31-hummingbird.dts b/arch/arm/boot/dts/sun6i-a31-hummingbird.dts
index d4f74f476f25..b4c87a23e3f8 100644
--- a/arch/arm/boot/dts/sun6i-a31-hummingbird.dts
+++ b/arch/arm/boot/dts/sun6i-a31-hummingbird.dts
@@ -344,11 +344,6 @@

[PATCH 2/4] ARM: sunxi: Drop mmc0_cd_pin_reference_design pinmux setting

2017-04-18 Thread Chen-Yu Tsai

As part of our effort to move pinctrl/GPIO interlocking into the
driver where it belongs, this patch drops the definition and usage
of the mmc0_cd_pin_reference_design pinmux setting for the default
mmc0 card detect GPIO pin.

Signed-off-by: Chen-Yu Tsai 
---
 arch/arm/boot/dts/sun4i-a10-a1000.dts| 2 +-
 arch/arm/boot/dts/sun4i-a10-ba10-tvbox.dts   | 2 +-
 arch/arm/boot/dts/sun4i-a10-chuwi-v7-cw0825.dts  | 2 +-
 arch/arm/boot/dts/sun4i-a10-cubieboard.dts   | 2 +-
 arch/arm/boot/dts/sun4i-a10-dserve-dsrv9703c.dts | 2 +-
 arch/arm/boot/dts/sun4i-a10-gemei-g9.dts | 2 +-
 arch/arm/boot/dts/sun4i-a10-hackberry.dts| 2 +-
 arch/arm/boot/dts/sun4i-a10-hyundai-a7hd.dts | 2 +-
 arch/arm/boot/dts/sun4i-a10-inet1.dts| 2 +-
 arch/arm/boot/dts/sun4i-a10-inet97fv2.dts| 2 +-
 arch/arm/boot/dts/sun4i-a10-inet9f-rev03.dts | 2 +-
 arch/arm/boot/dts/sun4i-a10-itead-iteaduino-plus.dts | 2 +-
 arch/arm/boot/dts/sun4i-a10-jesurun-q5.dts   | 2 +-
 arch/arm/boot/dts/sun4i-a10-marsboard.dts| 2 +-
 arch/arm/boot/dts/sun4i-a10-mini-xplus.dts   | 2 +-
 arch/arm/boot/dts/sun4i-a10-mk802.dts| 2 +-
 arch/arm/boot/dts/sun4i-a10-mk802ii.dts  | 2 +-
 arch/arm/boot/dts/sun4i-a10-olinuxino-lime.dts   | 2 +-
 arch/arm/boot/dts/sun4i-a10-pcduino.dts  | 2 +-
 arch/arm/boot/dts/sun4i-a10-pov-protab2-ips9.dts | 2 +-
 arch/arm/boot/dts/sun4i-a10.dtsi | 6 --
 arch/arm/boot/dts/sun7i-a20-cubieboard2.dts  | 2 +-
 arch/arm/boot/dts/sun7i-a20-cubietruck.dts   | 2 +-
 arch/arm/boot/dts/sun7i-a20-hummingbird.dts  | 2 +-
 arch/arm/boot/dts/sun7i-a20-i12-tvbox.dts| 2 +-
 arch/arm/boot/dts/sun7i-a20-icnova-swac.dts  | 2 +-
 arch/arm/boot/dts/sun7i-a20-itead-ibox.dts   | 2 +-
 arch/arm/boot/dts/sun7i-a20-m3.dts   | 2 +-
 arch/arm/boot/dts/sun7i-a20-mk808c.dts   | 2 +-
 arch/arm/boot/dts/sun7i-a20-olimex-som-evb.dts   | 2 +-
 arch/arm/boot/dts/sun7i-a20-olinuxino-lime.dts   | 2 +-
 arch/arm/boot/dts/sun7i-a20-olinuxino-lime2.dts  | 2 +-
 arch/arm/boot/dts/sun7i-a20-olinuxino-micro.dts  | 2 +-
 arch/arm/boot/dts/sun7i-a20-pcduino3-nano.dts| 2 +-
 arch/arm/boot/dts/sun7i-a20-pcduino3.dts | 2 +-
 arch/arm/boot/dts/sun7i-a20-wexler-tab7200.dts   | 2 +-
 arch/arm/boot/dts/sun7i-a20-wits-pro-a20-dkt.dts | 2 +-
 arch/arm/boot/dts/sun7i-a20.dtsi | 6 --
 38 files changed, 36 insertions(+), 48 deletions(-)

diff --git a/arch/arm/boot/dts/sun4i-a10-a1000.dts b/arch/arm/boot/dts/sun4i-a10-a1000.dts
index f2a01fe2bebc..f80d37ddc4c6 100644
--- a/arch/arm/boot/dts/sun4i-a10-a1000.dts
+++ b/arch/arm/boot/dts/sun4i-a10-a1000.dts
@@ -171,7 +171,7 @@
 
  {
pinctrl-names = "default";
-   pinctrl-0 = <_pins_a>, <_cd_pin_reference_design>;
+   pinctrl-0 = <_pins_a>;
vmmc-supply = <_vcc3v3>;
bus-width = <4>;
cd-gpios = < 7 1 GPIO_ACTIVE_HIGH>; /* PH1 */
diff --git a/arch/arm/boot/dts/sun4i-a10-ba10-tvbox.dts b/arch/arm/boot/dts/sun4i-a10-ba10-tvbox.dts
index 942d739a4384..6b02de592a02 100644
--- a/arch/arm/boot/dts/sun4i-a10-ba10-tvbox.dts
+++ b/arch/arm/boot/dts/sun4i-a10-ba10-tvbox.dts
@@ -109,7 +109,7 @@
 
  {
pinctrl-names = "default";
-   pinctrl-0 = <_pins_a>, <_cd_pin_reference_design>;
+   pinctrl-0 = <_pins_a>;
vmmc-supply = <_vcc3v3>;
bus-width = <4>;
cd-gpios = < 7 1 GPIO_ACTIVE_HIGH>; /* PH1 */
diff --git a/arch/arm/boot/dts/sun4i-a10-chuwi-v7-cw0825.dts b/arch/arm/boot/dts/sun4i-a10-chuwi-v7-cw0825.dts
index 17f8c5ec011c..a7d61994b8fd 100644
--- a/arch/arm/boot/dts/sun4i-a10-chuwi-v7-cw0825.dts
+++ b/arch/arm/boot/dts/sun4i-a10-chuwi-v7-cw0825.dts
@@ -128,7 +128,7 @@
 
  {
pinctrl-names = "default";
-   pinctrl-0 = <_pins_a>, <_cd_pin_reference_design>;
+   pinctrl-0 = <_pins_a>;
vmmc-supply = <_vcc3v3>;
bus-width = <4>;
cd-gpios = < 7 1 GPIO_ACTIVE_HIGH>; /* PH1 */
diff --git a/arch/arm/boot/dts/sun4i-a10-cubieboard.dts b/arch/arm/boot/dts/sun4i-a10-cubieboard.dts
index d844938e2aa7..a698a994e5ff 100644
--- a/arch/arm/boot/dts/sun4i-a10-cubieboard.dts
+++ b/arch/arm/boot/dts/sun4i-a10-cubieboard.dts
@@ -142,7 +142,7 @@
 
  {
pinctrl-names = "default";
-   pinctrl-0 = <_pins_a>, <_cd_pin_reference_design>;
+   pinctrl-0 = <_pins_a>;
vmmc-supply = <_vcc3v3>;
bus-width = <4>;
cd-gpios = < 7 1 GPIO_ACTIVE_HIGH>; /* PH1 */
diff --git a/arch/arm/boot/dts/sun4i-a10-dserve-dsrv9703c.dts b/arch/arm/boot/dts/sun4i-a10-dserve-dsrv9703c.dts
index aad3bec1cb39..e0777ae808c7 100644
--- a/arch/arm/boot/dts/sun4i-a10-dserve-dsrv9703c.dts
+++ b/arch/arm/boot/dts/sun4i-a10-dserve-dsrv9703c.dts
@@ -163,7 +163,7 @@
 
  {
pinctrl-names = "default";
-

Re: bfq-mq performance comparison to cfq

2017-04-18 Thread Bart Van Assche

On 04/11/17 00:29, Paolo Valente wrote:
>
>> On 10 Apr 2017, at 17:15, Bart Van Assche wrote:
>>
>> On Mon, 2017-04-10 at 11:55 +0200, Paolo Valente wrote:
>>> That said, if you do always want maximum throughput, even at the
>>> expense of latency, then just switch off low-latency heuristics, i.e.,
>>> set low_latency to 0.  Depending on the device, setting slice_idle to
>>> 0 may help a lot too (as well as with CFQ).  If the throughput is
>>> still low also after forcing BFQ to an only-throughput mode, then you
>>> hit some bug, and I'll have a little more work to do ...
>>
>> Has it been considered to make applications tell the I/O scheduler
>> whether to optimize for latency or for throughput? It shouldn't be that
>> hard for window managers and shells to figure out whether or not a new
>> application that is being started is interactive or not. This would
>> require a mechanism that allows applications to provide such information
>> to the I/O scheduler. Wouldn't that be a better approach than the I/O
>> scheduler trying to guess whether or not an application is an interactive
>> application?
>
> IMO that would be an (or maybe the) optimal solution, in terms of both
> throughput and latency.  We have even developed a prototype doing what
> you propose, for Android.  Unfortunately, I have not yet succeeded in
> getting support, to turn it into candidate production code, or to make
> a similar solution for lsb-compliant systems.

Hello Paolo,

What API was used by the Android application to tell the I/O scheduler 
to optimize for latency? Do you think that it would be sufficient if the 
application uses the ioprio_set() system call to set the I/O priority to 
IOPRIO_CLASS_RT?
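
For reference, a minimal sketch of what such an ioprio_set() call could
look like from userspace. glibc provides no wrapper, so the raw syscall is
used. The EX_-prefixed constants are local copies of the kernel's ioprio
values and the helper names are made up for this illustration; whether
IOPRIO_CLASS_RT would be enough of a hint for BFQ is exactly the open
question above, not something this sketch settles.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Local copies of the kernel's ioprio constants (illustrative names). */
#define EX_IOPRIO_CLASS_SHIFT 13
#define EX_IOPRIO_WHO_PROCESS 1
#define EX_IOPRIO_CLASS_RT    1
#define EX_IOPRIO_CLASS_BE    2
#define EX_IOPRIO_CLASS_IDLE  3

/* Pack a scheduling class and a level (0 = highest, 7 = lowest). */
int ex_ioprio_value(int io_class, int level)
{
    assert(level >= 0 && level <= 7);
    return (io_class << EX_IOPRIO_CLASS_SHIFT) | level;
}

/*
 * Set the I/O priority of the calling thread (who == 0).
 * Returns 0 on success, -1 with errno set on failure.
 */
int ex_set_own_ioprio(int io_class, int level)
{
    return (int)syscall(SYS_ioprio_set, EX_IOPRIO_WHO_PROCESS, 0,
                        ex_ioprio_value(io_class, level));
}
```

An application would call ex_set_own_ioprio(EX_IOPRIO_CLASS_RT, 4) early
during startup. Note that the realtime class normally needs privilege
(see ioprio_set(2)), so an unprivileged caller should expect EPERM.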

Thanks,

Bart.

Re: Doubt on first access for PCIe device

2017-04-18 Thread Jon Masters

On 04/11/2017 10:15 AM, abhijit wrote:

> Here I am assuming the completer ID will be the device number and function 
> number that will eventually be programmed into the device. In that case, my 
> question is: without a first write, how is the read request (VENDOR ID read) 
> serviced/routed?

You'll want to read about PCIe enumeration at boot time and how the BIOS
walks the topology to assign these (which an OS may later re-number). In
particular, read about ECAM for configuration. This uses memory mapped
config space read/write accessors that target a memory address space
under the control of the root complex.
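
As a rough illustration of the ECAM layout: the bus, device and function
numbers are encoded in the address itself, so a read of the Vendor ID is
simply a load from the right offset. The sketch below assumes the ECAM
window has already been mapped at ecam_base (a hypothetical pointer; on
real systems the base typically comes from the ACPI MCFG table or the
device tree), and the function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset of a config register inside the ECAM window:
 * bus in bits 27..20, device in 19..15, function in 14..12,
 * register byte offset in 11..0 (4 KiB of config space per function).
 */
uint32_t ecam_offset(unsigned bus, unsigned dev, unsigned fn, unsigned reg)
{
    assert(bus < 256 && dev < 32 && fn < 8 && reg < 4096);
    return ((uint32_t)bus << 20) | ((uint32_t)dev << 15) |
           ((uint32_t)fn << 12) | (uint32_t)reg;
}

/*
 * Read the 16-bit Vendor ID (config offset 0) of bus/dev/fn.
 * A value of 0xFFFF conventionally means no function responded.
 */
uint16_t ecam_read_vendor_id(const volatile uint8_t *ecam_base,
                             unsigned bus, unsigned dev, unsigned fn)
{
    const volatile uint16_t *p =
        (const volatile uint16_t *)(ecam_base + ecam_offset(bus, dev, fn, 0));
    return *p;
}
```

Enumeration code walks bus/device/function combinations with reads like
this one and treats 0xFFFF as "nothing here".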

Jon.

Re: [PATCH 2/3] drm/vc4: Don't try to initialize FBDEV if we're only bound to V3D.

2017-04-18 Thread Daniel Vetter

On Tue, Apr 18, 2017 at 9:11 PM, Eric Anholt  wrote:
> The FBDEV initialization would throw an error in dmesg, when we just
> want to silently not initialize fbdev on a V3D-only VC4 instance.
>
> Signed-off-by: Eric Anholt 

Hm, this shouldn't be an error really, you might want to hotplug more
connectors later on. What exactly complains?
-Daniel

> ---
>  drivers/gpu/drm/vc4/vc4_kms.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
> index ad7925a9e0ea..237a504f11f0 100644
> --- a/drivers/gpu/drm/vc4/vc4_kms.c
> +++ b/drivers/gpu/drm/vc4/vc4_kms.c
> @@ -230,10 +230,12 @@ int vc4_kms_load(struct drm_device *dev)
>
> drm_mode_config_reset(dev);
>
> -   vc4->fbdev = drm_fbdev_cma_init(dev, 32,
> -   dev->mode_config.num_connector);
> -   if (IS_ERR(vc4->fbdev))
> -   vc4->fbdev = NULL;
> +   if (dev->mode_config.num_connector) {
> +   vc4->fbdev = drm_fbdev_cma_init(dev, 32,
> +   dev->mode_config.num_connector);
> +   if (IS_ERR(vc4->fbdev))
> +   vc4->fbdev = NULL;
> +   }
>
> drm_kms_helper_poll_init(dev);
>
> --
> 2.11.0
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

Re: [PATCH] make TIOCSTI ioctl require CAP_SYS_ADMIN

2017-04-18 Thread Serge E. Hallyn

On Tue, Apr 18, 2017 at 11:45:26PM -0400, Matt Brown wrote:
> This patch reproduces GRKERNSEC_HARDEN_TTY functionality from the grsecurity
> project in-kernel.
> 
> This will create the Kconfig SECURITY_TIOCSTI_RESTRICT and the corresponding
> sysctl kernel.tiocsti_restrict that, when activated, restrict all TIOCSTI
> ioctl calls from non CAP_SYS_ADMIN users.
> 
> Possible effects on userland:
> 
> There could be a few user programs that would be affected by this
> change.
> See: 
> notable programs are: agetty, csh, xemacs and tcsh
> 
> However, I still believe that this change is worth it given that the
> Kconfig defaults to n. This will be a feature that is turned on for the

It's not worthless, but note that for instance before this was fixed
in lxc, this patch would not have helped with escapes from privileged
containers.

> same reason that people activate it when using grsecurity. Users of this
> opt-in feature will realize that they are choosing security over some OS
> features like unprivileged TIOCSTI ioctls, as should be clear in the
> Kconfig help message.
> 
> Threat Model/Patch Rationale:
> 
> From grsecurity's config for GRKERNSEC_HARDEN_TTY.
> 
>  | There are very few legitimate uses for this functionality and it
>  | has made vulnerabilities in several 'su'-like programs possible in
>  | the past.  Even without these vulnerabilities, it provides an
>  | attacker with an easy mechanism to move laterally among other
>  | processes within the same user's compromised session.
> 
> So if one process within a tty session becomes compromised it can follow
> that additional processes, that are thought to be in different security
> boundaries, can be compromised as a result. When using a program like su
> or sudo, these additional processes could be in a tty session where TTY file
> descriptors are indeed shared over privilege boundaries.
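
For readers unfamiliar with the ioctl under discussion, here is a minimal
userspace sketch of what TIOCSTI does: each call pushes one byte into the
input queue of a terminal, as if it had been typed there. The helper
names are invented for this illustration. With the restriction proposed
in this patch enabled, the ioctl would be expected to fail with EPERM for
callers lacking CAP_SYS_ADMIN; newer kernels may reject it for other
reasons as well.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Push every byte of s into the input queue of the terminal behind fd.
 * Returns the number of bytes injected, or -1 on the first failure
 * (errno is left as set by ioctl()).
 */
int inject_string(int fd, const char *s)
{
    int n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++, n++) {
        char ch = *s;
        if (ioctl(fd, TIOCSTI, &ch) < 0)
            return -1;
    }
    return n;
}

/* Convenience wrapper: open path, inject s, close. -1 on any error. */
int inject_into_path(const char *path, const char *s)
{
    int fd = open(path, O_RDWR);
    int n;

    if (fd < 0)
        return -1;
    n = inject_string(fd, s);
    close(fd);
    return n;
}
```

A less privileged process holding a descriptor for the shared tty could
call inject_string(fd, "some command\n") and have the shell that later
reads from that tty execute it, which is the lateral movement described
above.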
> 
> This is also an excellent writeup about the issue:
> 
> 
> Signed-off-by: Matt Brown 
> ---
>  drivers/tty/tty_io.c |  4 
>  include/linux/tty.h  |  2 ++
>  kernel/sysctl.c  | 12 
>  security/Kconfig | 13 +
>  4 files changed, 31 insertions(+)
> 
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index e6d1a65..31894e8 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -2296,11 +2296,15 @@ static int tty_fasync(int fd, struct file *filp, int on)
>   *   FIXME: may race normal receive processing
>   */
>  
> +int tiocsti_restrict = IS_ENABLED(CONFIG_SECURITY_TIOCSTI_RESTRICT);
> +
>  static int tiocsti(struct tty_struct *tty, char __user *p)
>  {
>   char ch, mbz = 0;
>   struct tty_ldisc *ld;
>  
> + if (tiocsti_restrict && !capable(CAP_SYS_ADMIN))
> + return -EPERM;
>   if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN))
>   return -EPERM;
>   if (get_user(ch, p))
> diff --git a/include/linux/tty.h b/include/linux/tty.h
> index 1017e904..7011102 100644
> --- a/include/linux/tty.h
> +++ b/include/linux/tty.h
> @@ -342,6 +342,8 @@ struct tty_file_private {
>   struct list_head list;
>  };
>  
> +extern int tiocsti_restrict;
> +
>  /* tty magic number */
>  #define TTY_MAGIC0x5401
>  
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index acf0a5a..68d1363 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -67,6 +67,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -833,6 +834,17 @@ static struct ctl_table kern_table[] = {
>   .extra2 = ,
>   },
>  #endif
> +#if defined CONFIG_TTY
> + {
> + .procname   = "tiocsti_restrict",
> + .data   = _restrict,
> + .maxlen = sizeof(int),
> + .mode   = 0644,
> + .proc_handler   = proc_dointvec_minmax_sysadmin,
> + .extra1 = ,
> + .extra2 = ,
> + },
> +#endif
>   {
>   .procname   = "ngroups_max",
>   .data   = _max,
> diff --git a/security/Kconfig b/security/Kconfig
> index 3ff1bf9..7d13331 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -18,6 +18,19 @@ config SECURITY_DMESG_RESTRICT
>  
> If you are unsure how to answer this question, answer N.
>  
> +config SECURITY_TIOCSTI_RESTRICT

This is an odd way to name this.  Shouldn't the name reflect that it
is setting the default, rather than enabling the feature?

Besides that, I'm ok with the patch.

> + bool "Restrict unprivileged use of tiocsti command injection"
> + default n
> + help
> +   This enforces restrictions on unprivileged users injecting commands
> +   into other processes which share a tty session using the TIOCSTI
> +   ioctl. This option makes TIOCSTI use require CAP_SYS_ADMIN.
> +
> +

Re: [PATCH] make TIOCSTI ioctl require CAP_SYS_ADMIN

2017-04-18 Thread Serge E. Hallyn

On Tue, Apr 18, 2017 at 11:45:26PM -0400, Matt Brown wrote:
> This patch reproduces GRKERNSEC_HARDEN_TTY functionality from the grsecurity
> project in-kernel.
> 
> This will create the Kconfig SECURITY_TIOCSTI_RESTRICT and the corresponding
> sysctl kernel.tiocsti_restrict that, when activated, restrict all TIOCSTI
> ioctl calls from non CAP_SYS_ADMIN users.
> 
> Possible effects on userland:
> 
> There could be a few user programs that would be effected by this
> change.
> See: 
> notable programs are: agetty, csh, xemacs and tcsh
> 
> However, I still believe that this change is worth it given that the
> Kconfig defaults to n. This will be a feature that is turned on for the

It's not worthless, but note that for instance before this was fixed
in lxc, this patch would not have helped with escapes from privileged
containers.

> same reason that people activate it when using grsecurity. Users of this
> opt-in feature will realize that they are choosing security over some OS
> features like unprivileged TIOCSTI ioctls, as should be clear in the
> Kconfig help message.
> 
> Threat Model/Patch Rational:
> 
> >From grsecurity's config for GRKERNSEC_HARDEN_TTY.
> 
>  | There are very few legitimate uses for this functionality and it
>  | has made vulnerabilities in several 'su'-like programs possible in
>  | the past.  Even without these vulnerabilities, it provides an
>  | attacker with an easy mechanism to move laterally among other
>  | processes within the same user's compromised session.
> 
> So if one process within a tty session becomes compromised it can follow
> that additional processes, that are thought to be in different security
> boundaries, can be compromised as a result. When using a program like su
> or sudo, these additional processes could be in a tty session where TTY file
> descriptors are indeed shared over privilege boundaries.
> 
> This is also an excellent writeup about the issue:
> 
> 
> Signed-off-by: Matt Brown 
> ---
>  drivers/tty/tty_io.c |  4 
>  include/linux/tty.h  |  2 ++
>  kernel/sysctl.c  | 12 
>  security/Kconfig | 13 +
>  4 files changed, 31 insertions(+)
> 
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index e6d1a65..31894e8 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -2296,11 +2296,15 @@ static int tty_fasync(int fd, struct file *filp, int 
> on)
>   *   FIXME: may race normal receive processing
>   */
>  
> +int tiocsti_restrict = IS_ENABLED(CONFIG_SECURITY_TIOCSTI_RESTRICT);
> +
>  static int tiocsti(struct tty_struct *tty, char __user *p)
>  {
>   char ch, mbz = 0;
>   struct tty_ldisc *ld;
>  
> + if (tiocsti_restrict && !capable(CAP_SYS_ADMIN))
> + return -EPERM;
>   if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN))
>   return -EPERM;
>   if (get_user(ch, p))
> diff --git a/include/linux/tty.h b/include/linux/tty.h
> index 1017e904..7011102 100644
> --- a/include/linux/tty.h
> +++ b/include/linux/tty.h
> @@ -342,6 +342,8 @@ struct tty_file_private {
>   struct list_head list;
>  };
>  
> +extern int tiocsti_restrict;
> +
>  /* tty magic number */
>  #define TTY_MAGIC0x5401
>  
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index acf0a5a..68d1363 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -67,6 +67,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -833,6 +834,17 @@ static struct ctl_table kern_table[] = {
>   .extra2 = ,
>   },
>  #endif
> +#if defined CONFIG_TTY
> + {
> + .procname   = "tiocsti_restrict",
> + .data   = _restrict,
> + .maxlen = sizeof(int),
> + .mode   = 0644,
> + .proc_handler   = proc_dointvec_minmax_sysadmin,
> + .extra1 = ,
> + .extra2 = ,
> + },
> +#endif
>   {
>   .procname   = "ngroups_max",
>   .data   = _max,
> diff --git a/security/Kconfig b/security/Kconfig
> index 3ff1bf9..7d13331 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -18,6 +18,19 @@ config SECURITY_DMESG_RESTRICT
>  
> If you are unsure how to answer this question, answer N.
>  
> +config SECURITY_TIOCSTI_RESTRICT

This is an odd way to name this.  Shouldn't the name reflect that it
is setting the default, rather than enabling the feature?

Besides that, I'm ok with the patch.
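
For reference, the order of the two checks at the top of tiocsti() after this patch can be modelled in user space roughly as below. This is only a sketch: demo_tiocsti_allowed() and its boolean parameters are made-up stand-ins for tiocsti_restrict, capable(CAP_SYS_ADMIN) and the current->signal->tty comparison, not kernel code.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * User-space model of the permission checks in tiocsti() after the patch.
 * All names are hypothetical stand-ins:
 *   restrict_on - value of the tiocsti_restrict sysctl
 *   has_admin   - result of capable(CAP_SYS_ADMIN)
 *   own_tty     - whether current->signal->tty == tty
 * Returns 0 if the ioctl may proceed, -EPERM otherwise.
 */
int demo_tiocsti_allowed(bool restrict_on, bool has_admin, bool own_tty)
{
    /* New check: with the sysctl set, only CAP_SYS_ADMIN may inject. */
    if (restrict_on && !has_admin)
        return -EPERM;

    /* Existing check: without the capability, only the caller's own tty. */
    if (!own_tty && !has_admin)
        return -EPERM;

    return 0;
}
```

With the sysctl at 0 the behaviour is unchanged, which is why the Kconfig symbol really only selects the default value.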

> + bool "Restrict unprivileged use of tiocsti command injection"
> + default n
> + help
> +   This enforces restrictions on unprivileged users injecting commands
> +   into other processes which share a tty session using the TIOCSTI
> +   ioctl. This option makes TIOCSTI use require CAP_SYS_ADMIN.
> +
> +   If this

Re: [PATCH v2 2/2] drm: dw-hdmi: gate audio clock from the I2S enablement callbacks

2017-04-18 Thread Archit Taneja




On 04/14/2017 02:01 PM, Romain Perier wrote:

Currently, the audio sampler clock is enabled from dw_hdmi_setup() at
step E. and is kept enabled for later use. This clock should be enabled
and disabled along with the actual audio stream and not always on (that
is bad for PM). Futhermore, as described by the datasheet, the I2S


s/Futhermore/Furthermore


variant need to gate/ungate the clock when the stream is


s/need/needs


enabled/disabled.

This commit adds a parameter to hdmi_audio_enable_clk() that controls
when the audio sample clock must be enabled or disabled. Then, it adds
the call to this function from dw_hdmi_i2s_audio_enable() and
dw_hdmi_i2s_audio_disable().

Signed-off-by: Romain Perier 
---
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c 
b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
index 5b328c0..a6da634 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
@@ -544,6 +544,12 @@ void dw_hdmi_set_sample_rate(struct dw_hdmi *hdmi, 
unsigned int rate)
 }
 EXPORT_SYMBOL_GPL(dw_hdmi_set_sample_rate);

+static void hdmi_enable_audio_clk(struct dw_hdmi *hdmi, bool enable)
+{
+   hdmi_modb(hdmi, enable ? 0 : HDMI_MC_CLKDIS_AUDCLK_DISABLE,
+ HDMI_MC_CLKDIS_AUDCLK_DISABLE, HDMI_MC_CLKDIS);
+}
+
 void dw_hdmi_ahb_audio_enable(struct dw_hdmi *hdmi)
 {
hdmi_set_cts_n(hdmi, hdmi->audio_cts, hdmi->audio_n);
@@ -557,6 +563,12 @@ void dw_hdmi_ahb_audio_disable(struct dw_hdmi *hdmi)
 void dw_hdmi_i2s_audio_enable(struct dw_hdmi *hdmi)
 {
hdmi_set_cts_n(hdmi, hdmi->audio_cts, hdmi->audio_n);
+   hdmi_enable_audio_clk(hdmi, true);
+}
+
+void dw_hdmi_i2s_audio_disable(struct dw_hdmi *hdmi)
+{
+   hdmi_enable_audio_clk(hdmi, false);
 }


This should be static too.

If you're okay with the suggestions, I can fix these myself and push. Let
me know if that's okay.
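
As a side note on the new helper, the masked update it relies on can be modelled in user space as follows. This is a sketch under assumptions: demo_modb() stands in for hdmi_modb() without the register access, and DEMO_AUDCLK_DISABLE is an illustrative bit value, not necessarily the one in dw-hdmi.h.

```c
#include <stdint.h>

/* Illustrative value only; the real definition lives in the driver header. */
#define DEMO_AUDCLK_DISABLE 0x08u

/* Masked read-modify-write: only the bits set in mask take data's value. */
uint8_t demo_modb(uint8_t reg, uint8_t data, uint8_t mask)
{
    return (uint8_t)((reg & ~mask) | (data & mask));
}

/* Mirrors hdmi_enable_audio_clk(): enabling clears the "disable" bit. */
uint8_t demo_enable_audio_clk(uint8_t clkdis, int enable)
{
    return demo_modb(clkdis, enable ? 0 : DEMO_AUDCLK_DISABLE,
                     DEMO_AUDCLK_DISABLE);
}
```

The other clock-disable bits in the register are left untouched in both directions, which is the point of passing the same constant as the mask.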

Thanks,
Archit



 void dw_hdmi_audio_enable(struct dw_hdmi *hdmi)
@@ -1592,11 +1604,6 @@ static void dw_hdmi_enable_video_path(struct dw_hdmi 
*hdmi)
HDMI_MC_FLOWCTRL);
 }

-static void hdmi_enable_audio_clk(struct dw_hdmi *hdmi)
-{
-   hdmi_modb(hdmi, 0, HDMI_MC_CLKDIS_AUDCLK_DISABLE, HDMI_MC_CLKDIS);
-}
-
 /* Workaround to clear the overflow condition */
 static void dw_hdmi_clear_overflow(struct dw_hdmi *hdmi)
 {
@@ -1710,7 +1717,7 @@ static int dw_hdmi_setup(struct dw_hdmi *hdmi, struct 
drm_display_mode *mode)

/* HDMI Initialization Step E - Configure audio */
hdmi_clk_regenerator_update_pixel_clock(hdmi);
-   hdmi_enable_audio_clk(hdmi);
+   hdmi_enable_audio_clk(hdmi, true);
}

/* not for DVI mode */
@@ -2438,6 +2445,7 @@ __dw_hdmi_probe(struct platform_device *pdev,
audio.write = hdmi_writeb;
audio.read  = hdmi_readb;
hdmi->enable_audio = dw_hdmi_i2s_audio_enable;
+   hdmi->disable_audio = dw_hdmi_i2s_audio_disable;

pdevinfo.name = "dw-hdmi-i2s-audio";
pdevinfo.data = 



--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

Re: [PATCH v3 1/6] powerpc/perf: Define big-endian version of perf_mem_data_src

2017-04-18 Thread Michael Ellerman

Peter Zijlstra  writes:

> On Tue, Apr 11, 2017 at 07:21:05AM +0530, Madhavan Srinivasan wrote:
>> From: Sukadev Bhattiprolu 
>> 
>> perf_mem_data_src is an union that is initialized via the ->val field
>> and accessed via the bitmap fields. For this to work on big endian
>> platforms (Which is broken now), we also need a big-endian represenation
>> of perf_mem_data_src. i.e, in a big endian system, if user request
>> PERF_SAMPLE_DATA_SRC (perf report -d), will get the default value from
>> perf_sample_data_init(), which is PERF_MEM_NA. Value for PERF_MEM_NA
>> is constructed using shifts:
>> 
>>   /* TLB access */
>>   #define PERF_MEM_TLB_NA0x01 /* not available */
>>   ...
>>   #define PERF_MEM_TLB_SHIFT 26
>> 
>>   #define PERF_MEM_S(a, s) \
>>  (((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
>> 
>>   #define PERF_MEM_NA (PERF_MEM_S(OP, NA)   |\
>>  PERF_MEM_S(LVL, NA)   |\
>>  PERF_MEM_S(SNOOP, NA) |\
>>  PERF_MEM_S(LOCK, NA)  |\
>>  PERF_MEM_S(TLB, NA))
>> 
>> Which works out as:
>> 
>>   ((0x01 << 0) | (0x01 << 5) | (0x01 << 19) | (0x01 << 24) | (0x01 << 26))
>> 
>> Which means the PERF_MEM_NA value comes out of the kernel as 0x5080021
>> in CPU endian.
>> 
>> But then in the perf tool, the code uses the bitfields to inspect the
>> value, and currently the bitfields are defined using little endian
>> ordering.
>> 
>> So eg. in perf_mem__tlb_scnprintf() we see:
>>   data_src->val = 0x5080021
>>  op = 0x0
>> lvl = 0x0
>>   snoop = 0x0
>>lock = 0x0
>>dtlb = 0x0
>>rsvd = 0x5080021
>> 
>> Patch does a minimal fix of adding big endian definition of the bitfields
>> to match the values that are already exported by the kernel on big endian.
>> And it makes no change on little endian.
>
> I think it is important to note that there are no current big-endian
> users. So 'fixing' this will not break anybody and will ensure future
> users (next patch) will work correctly.

Actually that's only partly true. As I describe above the PERF_MEM_NA
value is currently exported on BE platforms when a user requests it.

So I added this text after the output from perf_mem__tlb_scnprintf():

  Because of the way the perf tool code is written this is still displayed
  to the user as "N/A", so there is no bug visible at the UI level.

  Currently there are no big endian architectures which export a meaningful
  value (ie. other than PERF_MEM_NA), so the extent of the bug on big endian
  platforms is that the PERF_MEM_NA value is exported incorrectly as described
  above. Subsequent patches will add support on big endian powerpc for
  populating the data source value.


Hope that is clear.
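
To make the arithmetic in the changelog concrete, here is a small user-space sketch. The DEMO_ names and helper functions are invented for illustration; the shifts and widths are the ones quoted above. It builds the value with shifts and then extracts fields with shift and mask, which does not depend on how the compiler orders bitfields.

```c
#include <stdint.h>

/* Field positions as quoted from the PERF_MEM_*_SHIFT definitions. */
#define DEMO_OP_SHIFT     0
#define DEMO_LVL_SHIFT    5
#define DEMO_SNOOP_SHIFT  19
#define DEMO_LOCK_SHIFT   24
#define DEMO_TLB_SHIFT    26

/* Every *_NA constant is 0x01, so PERF_MEM_NA sets one bit per field. */
uint64_t demo_perf_mem_na(void)
{
    return (UINT64_C(0x01) << DEMO_OP_SHIFT)    |
           (UINT64_C(0x01) << DEMO_LVL_SHIFT)   |
           (UINT64_C(0x01) << DEMO_SNOOP_SHIFT) |
           (UINT64_C(0x01) << DEMO_LOCK_SHIFT)  |
           (UINT64_C(0x01) << DEMO_TLB_SHIFT);
}

/* Extract 'width' bits starting at 'shift'; width must be below 64. */
uint64_t demo_field(uint64_t val, unsigned int shift, unsigned int width)
{
    return (val >> shift) & ((UINT64_C(1) << width) - 1);
}
```

Reading the five fields at their documented shifts gives 1 for each. Reading the top 5 bits and the low 31 bits instead, which is what the little-endian declaration order amounts to on the big endian layout described above, gives 0 for op and 0x5080021 for rsvd, matching the output shown earlier.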

It also occurred to me that we don't actually have to redefine the whole
union, it's only the bitfields that matter, so we could reduce the diff
to:

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index c66a485a24ac..97152c79df6b 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -894,12 +894,23 @@ enum perf_callchain_context {
 union perf_mem_data_src {
__u64 val;
struct {
+#if defined(__LITTLE_ENDIAN_BITFIELD)
__u64   mem_op:5,   /* type of opcode */
mem_lvl:14, /* memory hierarchy level */
mem_snoop:5,/* snoop mode */
mem_lock:2, /* lock instr */
mem_dtlb:7, /* tlb access */
mem_rsvd:31;
+#elif defined(__BIG_ENDIAN_BITFIELD)
+   __u64   mem_rsvd:31,
+   mem_dtlb:7, /* tlb access */
+   mem_lock:2, /* lock instr */
+   mem_snoop:5,/* snoop mode */
+   mem_lvl:14, /* memory hierarchy level */
+   mem_op:5;   /* type of opcode */
+#else
+#error "Unknown endianness"
+#endif
};
 };
 

That looks better to me, thoughts?

cheers

Re: [PATCH v2 1/2] drm: dw-hdmi: add specific I2S and AHB functions for stream handling

2017-04-18 Thread Archit Taneja




On 04/14/2017 02:01 PM, Romain Perier wrote:

Currently, CTS+N is forced to zero as a workaround of the IP block for
i.MX platforms. This is requested in the datasheet of the corresponding
IP for AHB mode only. However, we have seen that it introduces glitches
or delays when playing a sound on HDMI for I2S mode. This proves that we
cannot keep the current functions for handling audio stream as-is if
these contain workaround that are specific to a mode.

This commit introduces two callbacks, one for each variant.
dw_hdmi_setup defines the right function depending on the detected
variant. Then, the exported functions dw_hdmi_audio_enable and
dw_hdmi_audio_disable calls the corresponding callbacks

Signed-off-by: Romain Perier 
---
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 26 --
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c 
b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
index 4b6f216..5b328c0 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
@@ -173,6 +173,8 @@ struct dw_hdmi {

unsigned int reg_shift;
struct regmap *regm;
+   void (*enable_audio)(struct dw_hdmi *hdmi);
+   void (*disable_audio)(struct dw_hdmi *hdmi);
 };

 #define HDMI_IH_PHY_STAT0_RX_SENSE \
@@ -542,13 +544,29 @@ void dw_hdmi_set_sample_rate(struct dw_hdmi *hdmi, 
unsigned int rate)
 }
 EXPORT_SYMBOL_GPL(dw_hdmi_set_sample_rate);

+void dw_hdmi_ahb_audio_enable(struct dw_hdmi *hdmi)
+{
+   hdmi_set_cts_n(hdmi, hdmi->audio_cts, hdmi->audio_n);
+}
+
+void dw_hdmi_ahb_audio_disable(struct dw_hdmi *hdmi)
+{
+   hdmi_set_cts_n(hdmi, hdmi->audio_cts, 0);
+}
+
+void dw_hdmi_i2s_audio_enable(struct dw_hdmi *hdmi)
+{
+   hdmi_set_cts_n(hdmi, hdmi->audio_cts, hdmi->audio_n);
+}
+


I get some sparse warnings asking for the above 3 to be static.
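
Something along these lines is what I have in mind. It is a stripped-down user-space sketch, not the driver: struct demo_hdmi and all demo_ names are hypothetical, and hw_n merely models the N value that hdmi_set_cts_n() would program.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct dw_hdmi with only what the sketch needs. */
struct demo_hdmi {
    unsigned int audio_n;   /* N value to program while audio is running */
    unsigned int hw_n;      /* models the N value currently in hardware */
    void (*enable_audio)(struct demo_hdmi *hdmi);
    void (*disable_audio)(struct demo_hdmi *hdmi);
};

/*
 * The per-variant helpers are only reached through the function pointers,
 * so they can be file-local. Marking them static is what sparse asks for.
 */
static void demo_ahb_audio_enable(struct demo_hdmi *hdmi)
{
    hdmi->hw_n = hdmi->audio_n;
}

static void demo_ahb_audio_disable(struct demo_hdmi *hdmi)
{
    hdmi->hw_n = 0;     /* AHB-only workaround: force N to zero */
}

static void demo_i2s_audio_enable(struct demo_hdmi *hdmi)
{
    hdmi->hw_n = hdmi->audio_n;
}

/* Probe-time selection of the callbacks for each variant. */
void demo_setup_ahb(struct demo_hdmi *hdmi)
{
    hdmi->enable_audio = demo_ahb_audio_enable;
    hdmi->disable_audio = demo_ahb_audio_disable;
}

void demo_setup_i2s(struct demo_hdmi *hdmi)
{
    hdmi->enable_audio = demo_i2s_audio_enable;
    hdmi->disable_audio = NULL;     /* no disable hook for I2S in this patch */
}

/* Exported entry points: dispatch only if a callback was installed. */
void demo_audio_enable(struct demo_hdmi *hdmi)
{
    if (hdmi->enable_audio)
        hdmi->enable_audio(hdmi);
}

void demo_audio_disable(struct demo_hdmi *hdmi)
{
    if (hdmi->disable_audio)
        hdmi->disable_audio(hdmi);
}
```

Only the exported entry points keep external linkage; locking is left out of the sketch.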

Thanks,
Archit


 void dw_hdmi_audio_enable(struct dw_hdmi *hdmi)
 {
unsigned long flags;

spin_lock_irqsave(&hdmi->audio_lock, flags);
hdmi->audio_enable = true;
-   hdmi_set_cts_n(hdmi, hdmi->audio_cts, hdmi->audio_n);
+   if (hdmi->enable_audio)
+   hdmi->enable_audio(hdmi);
spin_unlock_irqrestore(&hdmi->audio_lock, flags);
 }
 EXPORT_SYMBOL_GPL(dw_hdmi_audio_enable);
@@ -559,7 +577,8 @@ void dw_hdmi_audio_disable(struct dw_hdmi *hdmi)

spin_lock_irqsave(&hdmi->audio_lock, flags);
hdmi->audio_enable = false;
-   hdmi_set_cts_n(hdmi, hdmi->audio_cts, 0);
+   if (hdmi->disable_audio)
+   hdmi->disable_audio(hdmi);
spin_unlock_irqrestore(&hdmi->audio_lock, flags);
 }
 EXPORT_SYMBOL_GPL(dw_hdmi_audio_disable);
@@ -2404,6 +2423,8 @@ __dw_hdmi_probe(struct platform_device *pdev,
audio.irq = irq;
audio.hdmi = hdmi;
audio.eld = hdmi->connector.eld;
+   hdmi->enable_audio = dw_hdmi_ahb_audio_enable;
+   hdmi->disable_audio = dw_hdmi_ahb_audio_disable;

pdevinfo.name = "dw-hdmi-ahb-audio";
pdevinfo.data = 
@@ -2416,6 +2437,7 @@ __dw_hdmi_probe(struct platform_device *pdev,
audio.hdmi  = hdmi;
audio.write = hdmi_writeb;
audio.read  = hdmi_readb;
+   hdmi->enable_audio = dw_hdmi_i2s_audio_enable;

pdevinfo.name = "dw-hdmi-i2s-audio";
pdevinfo.data = 



--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

Re: [PATCH v4 4/5] perf report: Show branch type statistics for stdio mode

2017-04-18 Thread Jin, Yao




On 4/19/2017 8:53 AM, Jin, Yao wrote:



On 4/19/2017 2:53 AM, Jiri Olsa wrote:

On Wed, Apr 12, 2017 at 06:21:05AM +0800, Jin Yao wrote:

SNIP


+const char *branch_type_name(int type)
+{
+const char *branch_names[PERF_BR_MAX] = {
+"N/A",
+"JCC",
+"JMP",
+"IND_JMP",
+"CALL",
+"IND_CALL",
+"RET",
+"SYSCALL",
+"SYSRET",
+"IRQ",
+"INT",
+"IRET",
+"FAR_BRANCH",
+};
+
+if ((type >= 0) && (type < PERF_BR_MAX))
+return branch_names[type];
+
+return NULL;

looks like we should add util/branch.c with above functions
and merge it with util/parse-branch-options.c

we create new file even for less code ;-)

thanks,
jirka


Could we add branch_type_name() directly to util/parse-branch-options.c?

I just feel it would be a bit of a waste to create a new file for so little code. :)

Thanks
Jin Yao


On reflection, yes, creating util/branch.c would be better. I will do that.
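
As a rough idea of the lookup once it moves, here is a standalone sketch. The demo_ names are placeholders and the final util/branch.c may look different; the strings are the ones from the patch above. The table is a file-scope static const array, so it is not rebuilt on every call, and the bound is derived from the table itself.

```c
#include <stddef.h>

/* Name table indexed by branch type; order follows the patch above. */
static const char * const demo_branch_names[] = {
    "N/A",
    "JCC",
    "JMP",
    "IND_JMP",
    "CALL",
    "IND_CALL",
    "RET",
    "SYSCALL",
    "SYSRET",
    "IRQ",
    "INT",
    "IRET",
    "FAR_BRANCH",
};

/* Number of entries, playing the role of PERF_BR_MAX in this sketch. */
#define DEMO_BR_MAX (sizeof(demo_branch_names) / sizeof(demo_branch_names[0]))

/* Returns the name for a valid type, or NULL when type is out of range. */
const char *demo_branch_type_name(int type)
{
    if (type >= 0 && (size_t)type < DEMO_BR_MAX)
        return demo_branch_names[type];

    return NULL;
}
```

Callers still have to handle the NULL return for out-of-range values.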


Thanks
Jin Yao

Re: linux-next: build failure after merge of the rcu tree

2017-04-18 Thread Paul E. McKenney

On Wed, Apr 19, 2017 at 01:50:16PM +1000, Stephen Rothwell wrote:
> Hi Paul,
> 
> After merging the rcu tree, today's linux-next build (x86_64 allmodconfig)
> failed like this:
> 
> kernel/rcu/rcutorture.c: In function 'rcu_torture_stats_print':
> kernel/rcu/rcutorture.c:1369:3: error: implicit declaration of function 
> 'srcutorture_get_gp_data' [-Werror=implicit-function-declaration]
>srcutorture_get_gp_data(cur_ops->ttype, srcu_ctlp,
>^
> 
> Caused by commit
> 
>   b4d55cac0a93 ("srcu: Make rcutorture writer stalls print SRCU GP state")
> 
> This config has CONFIG_CLASSIC_SRCU=y and CONFIG_RCU_TORTURE_TEST=m, so
> CONFIG_RCU_TORTURE_TEST is not defined - CONFIG_RCU_TORTURE_TEST_MODULE
> is defined.  You probably want to protect srcutorture_get_gp_data() with
> IS_ENABLED(CONFIG_RCU_TORTURE_TEST) instead.
> 
> I have used the rcu tree from next-20170418 for today.

Please accept my apologies!  I forgot about the state of -rcu while
chasing another bug, and only a few minutes ago made the transition
from "Why doesn't this code work?" to "Why didn't my brain work?".  :-/

Will be fixed for tomorrow's -next.  Or at least broken in a more subtle
and creative way.  ;-)
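
For anyone following along, the =y versus =m difference Stephen describes can be reproduced in user space. The sketch below is modelled on the kernel's IS_ENABLED() helper, but the DEMO_ macros and the CONFIG_DEMO_TORTURE_TEST symbol are invented for illustration and are not the kernel's definitions.

```c
/* Simulate an =m configuration: only the _MODULE symbol is defined. */
#define CONFIG_DEMO_TORTURE_TEST_MODULE 1

/*
 * If 'val' expands to 1, the placeholder expands to "0," and shifts the
 * literal 1 into the second argument slot; otherwise the slot holds 0.
 */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED_2(x)
#define DEMO_IS_DEFINED_2(val) DEMO_IS_DEFINED_3(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED_3(arg1_or_junk) DEMO_TAKE_SECOND(arg1_or_junk 1, 0, 0)

#define DEMO_IS_BUILTIN(option) DEMO_IS_DEFINED(option)
#define DEMO_IS_MODULE(option) DEMO_IS_DEFINED(option##_MODULE)
#define DEMO_IS_ENABLED(option) \
    (DEMO_IS_BUILTIN(option) || DEMO_IS_MODULE(option))

/* What a plain #ifdef on the option sees in an =m build. */
int demo_seen_by_ifdef(void)
{
#ifdef CONFIG_DEMO_TORTURE_TEST
    return 1;
#else
    return 0;
#endif
}

/* What the IS_ENABLED()-style test sees in the same build. */
int demo_seen_by_is_enabled(void)
{
    return DEMO_IS_ENABLED(CONFIG_DEMO_TORTURE_TEST);
}
```

The #ifdef variant returns 0 and the IS_ENABLED-style variant returns 1, which is the mismatch that left srcutorture_get_gp_data() undeclared in the allmodconfig build.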

Thanx, Paul

Re: [PATCH v2] usb: dwc3: add disable u2mac linestate check quirk

2017-04-18 Thread wlf


Dear Guenter,


On 2017-04-18 21:18, Guenter Roeck wrote:

On Mon, Apr 17, 2017 at 10:17 PM, William Wu  wrote:

This patch adds a quirk to disable USB 2.0 MAC linestate check
during HS transmit. Refer the dwc3 databook, we can use it for
some special platforms if the linestate not reflect the expected
line state(J) during transmission.

When use this quirk, the controller implements a fixed 40-bit
TxEndDelay after the packet is given on UTMI and ignores the
linestate during the transmit of a token (during token-to-token
and token-to-data IPGAP).

On some rockchip platforms (e.g. rk3399), it requires to disable
the u2mac linestate check to decrease the SSPLIT token to SETUP
token inter-packet delay from 566ns to 466ns, and fix the issue
that FS/LS devices not recognized if inserted through USB 3.0 HUB.

Signed-off-by: William Wu 
---
Changes in v2:
- fix coding style

  Documentation/devicetree/bindings/usb/dwc3.txt |  2 ++
  drivers/usb/dwc3/core.c| 14 ++
  drivers/usb/dwc3/core.h|  4 
  3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt 
b/Documentation/devicetree/bindings/usb/dwc3.txt
index f658f39..6a89f0c 100644
--- a/Documentation/devicetree/bindings/usb/dwc3.txt
+++ b/Documentation/devicetree/bindings/usb/dwc3.txt
@@ -45,6 +45,8 @@ Optional properties:
 a free-running PHY clock.
   - snps,dis-del-phy-power-chg-quirk: when set core will change PHY power
 from P0 to P1/P2/P3 without delay.
+ - snps,tx-ipgap-linecheck-dis-quirk: when set, disable u2mac linestate check
+   during HS transmit.

All other disable-something quirks are named
"snps,dis-something-quirk". Maybe use the same naming convention ?
Yes, good idea! I will rename it to "snps,dis-tx-ipgap-linecheck-quirk"
in the next patch version.

Thanks:-)



   - snps,is-utmi-l1-suspend: true when DWC3 asserts output signal
 utmi_l1_suspend_n, false when asserts utmi_sleep_n
   - snps,hird-threshold: HIRD threshold
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 455d89a..03429c5 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -796,15 +796,19 @@ static int dwc3_core_init(struct dwc3 *dwc)
 dwc3_writel(dwc->regs, DWC3_GUCTL2, reg);
 }

+   reg = dwc3_readl(dwc->regs, DWC3_GUCTL1);
+

My understanding is that the register was only introduced with dwc3
revision 2.50a. Is it ok to read and write it unconditionally ?
Yes, according to the dwc3 databook, DWC3_GUCTL1 was introduced in 2.50a.
Maybe it's better to read and write it only when we know our controller
version.

Would it be good to fix it like the following patch? One problem with this
patch is that we need to read and write the register twice if our controller
version is >= 2.90a and this quirk is needed.

--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -806,6 +806,12 @@ static int dwc3_core_init(struct dwc3 *dwc)
dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
}

+   if (dwc->dis_tx_ipgap_linecheck_quirk) {
+   reg = dwc3_readl(dwc->regs, DWC3_GUCTL1);
+   reg |= DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS;
+   dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
+   }
+
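
An alternative that avoids the double access might be a single read-modify-write guarded by the revision. The following is only a user-space model of that idea: struct demo_dwc3, the DEMO_ constants (including the revision encodings and the bit position of the L1 exit flag) and the access counter are assumptions for illustration, not driver code.

```c
#include <stdint.h>

/* Illustrative constants; names and encodings are assumptions. */
#define DEMO_REVISION_250A 0x250a
#define DEMO_REVISION_290A 0x290a
#define DEMO_GUCTL1_DEV_L1_EXIT_BY_HW      (UINT32_C(1) << 24)
#define DEMO_GUCTL1_TX_IPGAP_LINECHECK_DIS (UINT32_C(1) << 28)

struct demo_dwc3 {
    uint32_t revision;
    int dis_tx_ipgap_linecheck_quirk;
    uint32_t guctl1;        /* models the GUCTL1 register contents */
    unsigned int accesses;  /* counts simulated register reads and writes */
};

void demo_guctl1_init(struct demo_dwc3 *dwc)
{
    uint32_t reg;

    /* Cores older than 2.50a do not have GUCTL1: leave it alone. */
    if (dwc->revision < DEMO_REVISION_250A)
        return;

    reg = dwc->guctl1;      /* one read */
    dwc->accesses++;

    if (dwc->revision >= DEMO_REVISION_290A)
        reg |= DEMO_GUCTL1_DEV_L1_EXIT_BY_HW;

    if (dwc->dis_tx_ipgap_linecheck_quirk)
        reg |= DEMO_GUCTL1_TX_IPGAP_LINECHECK_DIS;

    dwc->guctl1 = reg;      /* one write */
    dwc->accesses++;
}
```

The cost is one unconditional write on cores between 2.50a and 2.90a that do not set the quirk; whether that is acceptable is exactly what I would like to ask the maintainers.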

Hi John & Felipe,
   Could you give me some suggestions?
   Thank you!

 /*
  * Enable hardware control of sending remote wakeup in HS when
  * the device is in the L1 state.
  */
-   if (dwc->revision >= DWC3_REVISION_290A) {
-   reg = dwc3_readl(dwc->regs, DWC3_GUCTL1);
+   if (dwc->revision >= DWC3_REVISION_290A)
 reg |= DWC3_GUCTL1_DEV_L1_EXIT_BY_HW;
-   dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
-   }
+
+   if (dwc->tx_ipgap_linecheck_dis_quirk)
+   reg |= DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS;
+
+   dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);

 return 0;

@@ -1023,6 +1027,8 @@ static void dwc3_get_properties(struct dwc3 *dwc)
 "snps,dis-u2-freeclk-exists-quirk");
 dwc->dis_del_phy_power_chg_quirk = device_property_read_bool(dev,
 "snps,dis-del-phy-power-chg-quirk");
+   dwc->tx_ipgap_linecheck_dis_quirk = device_property_read_bool(dev,
+   "snps,tx-ipgap-linecheck-dis-quirk");

 dwc->tx_de_emphasis_quirk = device_property_read_bool(dev,
 "snps,tx_de_emphasis_quirk");
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 981c77f..3c2537b 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -204,6 +204,7 @@
  #define DWC3_GCTL_DSBLCLKGTNG  BIT(0)

  /* Global User Control 1 Register */
+#define DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS BIT(28)
  #define

Re: [PATCH v2] usb: dwc3: add disable u2mac linestate check quirk

2017-04-18 Thread wlf


Dear Guenter,


On 2017-04-18 21:18, Guenter Roeck wrote:

On Mon, Apr 17, 2017 at 10:17 PM, William Wu  wrote:

This patch adds a quirk to disable USB 2.0 MAC linestate check
during HS transmit. Refer the dwc3 databook, we can use it for
some special platforms if the linestate not reflect the expected
line state(J) during transmission.

When use this quirk, the controller implements a fixed 40-bit
TxEndDelay after the packet is given on UTMI and ignores the
linestate during the transmit of a token (during token-to-token
and token-to-data IPGAP).

On some rockchip platforms (e.g. rk3399), it requires to disable
the u2mac linestate check to decrease the SSPLIT token to SETUP
token inter-packet delay from 566ns to 466ns, and fix the issue
that FS/LS devices not recognized if inserted through USB 3.0 HUB.

Signed-off-by: William Wu 
---
Changes in v2:
- fix coding style

  Documentation/devicetree/bindings/usb/dwc3.txt |  2 ++
  drivers/usb/dwc3/core.c| 14 ++
  drivers/usb/dwc3/core.h|  4 
  3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt 
b/Documentation/devicetree/bindings/usb/dwc3.txt
index f658f39..6a89f0c 100644
--- a/Documentation/devicetree/bindings/usb/dwc3.txt
+++ b/Documentation/devicetree/bindings/usb/dwc3.txt
@@ -45,6 +45,8 @@ Optional properties:
 a free-running PHY clock.
   - snps,dis-del-phy-power-chg-quirk: when set core will change PHY power
 from P0 to P1/P2/P3 without delay.
+ - snps,tx-ipgap-linecheck-dis-quirk: when set, disable u2mac linestate check
+   during HS transmit.

All other disable-something quirks are named
"snps,dis-something-quirk". Maybe use the same naming convention ?
Yes, good idea! I will rename it to "snps,dis-tx-ipgap-linecheck-quirk"
in the next patch version.

Thanks:-)



   - snps,is-utmi-l1-suspend: true when DWC3 asserts output signal
 utmi_l1_suspend_n, false when asserts utmi_sleep_n
   - snps,hird-threshold: HIRD threshold
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 455d89a..03429c5 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -796,15 +796,19 @@ static int dwc3_core_init(struct dwc3 *dwc)
 dwc3_writel(dwc->regs, DWC3_GUCTL2, reg);
 }

+   reg = dwc3_readl(dwc->regs, DWC3_GUCTL1);
+

My understanding is that the register was only introduced with dwc3
revision 2.50a. Is it ok to read and write it unconditionally ?
Yes, according to the dwc3 databook, DWC3_GUCTL1 was introduced in 2.50a.
Maybe it's better to read and write it only when we know our controller
version.

Would it be OK to fix it like the following patch?
The drawback of this patch is that we need to read and write the register
twice if our controller version is >= 2.90a and this quirk is needed.

--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -806,6 +806,12 @@ static int dwc3_core_init(struct dwc3 *dwc)
dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
}

+   if (dwc->dis_tx_ipgap_linecheck_quirk) {
+   reg = dwc3_readl(dwc->regs, DWC3_GUCTL1);
+   reg |= DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS;
+   dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
+   }
+

Hi John & Felipe,
   Could you give me some suggestions?
   Thank you!
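
For illustration only, one way to avoid both the unconditional register
access and the double read-modify-write discussed above is to collect the
bits to set first and touch DWC3_GUCTL1 once. The sketch below is a
user-space model, not driver code: struct sketch_dwc3, the read/write
helpers and the numeric revision codes are hypothetical, and the 2.50a and
2.90a thresholds are simply the ones mentioned in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))
#define DWC3_GUCTL1_DEV_L1_EXIT_BY_HW      BIT(24)
#define DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS BIT(28)

/* Hypothetical revision codes; only their ordering matters here. */
enum { SKETCH_REV_250A = 250, SKETCH_REV_290A = 290 };

/* Hypothetical stand-in for the controller state used by this sketch. */
struct sketch_dwc3 {
	uint32_t guctl1;                    /* models the memory-mapped register */
	unsigned int revision;
	bool dis_tx_ipgap_linecheck_quirk;
	unsigned int reads;                 /* access counters, for demonstration */
	unsigned int writes;
};

static uint32_t sketch_readl(struct sketch_dwc3 *dwc)
{
	dwc->reads++;
	return dwc->guctl1;
}

static void sketch_writel(struct sketch_dwc3 *dwc, uint32_t val)
{
	dwc->writes++;
	dwc->guctl1 = val;
}

/* Decide which bits are wanted, then do at most one read-modify-write. */
static void sketch_setup_guctl1(struct sketch_dwc3 *dwc)
{
	uint32_t set = 0;
	uint32_t reg;

	if (dwc->revision >= SKETCH_REV_290A)
		set |= DWC3_GUCTL1_DEV_L1_EXIT_BY_HW;

	/* Assumption from the thread: the register exists from 2.50a on. */
	if (dwc->dis_tx_ipgap_linecheck_quirk &&
	    dwc->revision >= SKETCH_REV_250A)
		set |= DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS;

	if (!set)
		return;	/* nothing to change: never touch the register */

	reg = sketch_readl(dwc);
	reg |= set;
	sketch_writel(dwc, reg);
}
```

With this shape a pre-2.50a controller never accesses the register, and a
2.90a controller that also needs the quirk does a single read and a single
write.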

 /*
  * Enable hardware control of sending remote wakeup in HS when
  * the device is in the L1 state.
  */
-   if (dwc->revision >= DWC3_REVISION_290A) {
-   reg = dwc3_readl(dwc->regs, DWC3_GUCTL1);
+   if (dwc->revision >= DWC3_REVISION_290A)
 reg |= DWC3_GUCTL1_DEV_L1_EXIT_BY_HW;
-   dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
-   }
+
+   if (dwc->tx_ipgap_linecheck_dis_quirk)
+   reg |= DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS;
+
+   dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);

 return 0;

@@ -1023,6 +1027,8 @@ static void dwc3_get_properties(struct dwc3 *dwc)
 "snps,dis-u2-freeclk-exists-quirk");
 dwc->dis_del_phy_power_chg_quirk = device_property_read_bool(dev,
 "snps,dis-del-phy-power-chg-quirk");
+   dwc->tx_ipgap_linecheck_dis_quirk = device_property_read_bool(dev,
+   "snps,tx-ipgap-linecheck-dis-quirk");

 dwc->tx_de_emphasis_quirk = device_property_read_bool(dev,
 "snps,tx_de_emphasis_quirk");
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 981c77f..3c2537b 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -204,6 +204,7 @@
  #define DWC3_GCTL_DSBLCLKGTNG  BIT(0)

  /* Global User Control 1 Register */
+#define DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS BIT(28)
  #define DWC3_GUCTL1_DEV_L1_EXIT_BY_HW  BIT(24)

  /* Global USB2

linux-next: build failure after merge of the rcu tree

2017-04-18 Thread Stephen Rothwell

Hi Paul,

After merging the rcu tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

kernel/rcu/rcutorture.c: In function 'rcu_torture_stats_print':
kernel/rcu/rcutorture.c:1369:3: error: implicit declaration of function 
'srcutorture_get_gp_data' [-Werror=implicit-function-declaration]
   srcutorture_get_gp_data(cur_ops->ttype, srcu_ctlp,
   ^

Caused by commit

  b4d55cac0a93 ("srcu: Make rcutorture writer stalls print SRCU GP state")

This config has CONFIG_CLASSIC_SRCU=y and CONFIG_RCU_TORTURE_TEST=m, so
CONFIG_RCU_TORTURE_TEST is not defined - CONFIG_RCU_TORTURE_TEST_MODULE
is defined.  You probably want to protect srcutorture_get_gp_data() with
IS_ENABLED(CONFIG_RCU_TORTURE_TEST) instead.
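
To illustrate the difference described above, here is a small user-space
sketch. The helper macros are simplified stand-ins modeled on the ones in
include/linux/kconfig.h (the names ARG_PLACEHOLDER_1, TAKE_SECOND_ARG and
IS_DEFINED_n are made up for this example), and the single #define at the
top is an assumption that mimics what Kconfig generates for
RCU_TORTURE_TEST=m.

```c
/* Assumed config: RCU_TORTURE_TEST=m, so only the _MODULE symbol exists. */
#define CONFIG_RCU_TORTURE_TEST_MODULE 1

/*
 * Simplified stand-ins for the kernel helpers.  If the option is
 * defined to 1, ARG_PLACEHOLDER_##val becomes "0," and shifts a 1 into
 * the second argument slot; otherwise the second argument stays 0.
 */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED_1(x) IS_DEFINED_2(x)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)

#define IS_BUILTIN(option) IS_DEFINED_1(option)
#define IS_MODULE(option)  IS_DEFINED_1(option##_MODULE)
#define IS_ENABLED(option) (IS_BUILTIN(option) || IS_MODULE(option))

/* A plain #ifdef only sees the =y case. */
static int visible_to_ifdef(void)
{
#ifdef CONFIG_RCU_TORTURE_TEST
	return 1;
#else
	return 0;
#endif
}

/* IS_ENABLED() covers both =y and =m. */
static int visible_to_is_enabled(void)
{
	return IS_ENABLED(CONFIG_RCU_TORTURE_TEST);
}
```

With the assumed =m config, the #ifdef variant compiles the guarded code
out while the IS_ENABLED() variant keeps it, which matches the build
failure reported here.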

I have used the rcu tree from next-20170418 for today.

-- 
Cheers,
Stephen Rothwell

[PATCH] make TIOCSTI ioctl require CAP_SYS_ADMIN

2017-04-18 Thread Matt Brown

This patch reproduces GRKERNSEC_HARDEN_TTY functionality from the grsecurity
project in-kernel.

This will create the Kconfig SECURITY_TIOCSTI_RESTRICT and the corresponding
sysctl kernel.tiocsti_restrict that, when activated, restrict all TIOCSTI
ioctl calls from non CAP_SYS_ADMIN users.
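
As a rough illustration of the resulting policy, the two checks in
tiocsti() can be modeled as a pure function. This is a user-space sketch,
not kernel code: struct sketch_caller and sketch_tiocsti_check() are
hypothetical names, and the booleans stand in for capable(CAP_SYS_ADMIN)
and the current->signal->tty comparison.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical summary of the calling task, for illustration only. */
struct sketch_caller {
	bool has_cap_sys_admin;   /* models capable(CAP_SYS_ADMIN) */
	bool owns_tty;            /* models current->signal->tty == tty */
};

/* Returns 0 if the injection would be allowed, -EPERM otherwise. */
static int sketch_tiocsti_check(bool restrict_enabled,
				const struct sketch_caller *c)
{
	/* New check: with the sysctl set, only privileged callers pass. */
	if (restrict_enabled && !c->has_cap_sys_admin)
		return -EPERM;

	/* Existing check: unprivileged callers need their own tty. */
	if (!c->owns_tty && !c->has_cap_sys_admin)
		return -EPERM;

	return 0;
}
```

In this model the only behaviour change is for an unprivileged caller
acting on its own controlling tty: allowed with the restriction off,
refused with it on.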

Possible effects on userland:

There could be a few user programs that would be affected by this
change.
See: 
notable programs are: agetty, csh, xemacs and tcsh

However, I still believe that this change is worth it given that the
Kconfig defaults to n. This will be a feature that is turned on for the
same reason that people activate it when using grsecurity. Users of this
opt-in feature will realize that they are choosing security over some OS
features like unprivileged TIOCSTI ioctls, as should be clear in the
Kconfig help message.

Threat Model/Patch Rationale:

From grsecurity's config for GRKERNSEC_HARDEN_TTY:

 | There are very few legitimate uses for this functionality and it
 | has made vulnerabilities in several 'su'-like programs possible in
 | the past.  Even without these vulnerabilities, it provides an
 | attacker with an easy mechanism to move laterally among other
 | processes within the same user's compromised session.

So if one process within a tty session becomes compromised, it can follow
that additional processes that are thought to be in different security
boundaries can be compromised as a result. When using a program like su
or sudo, these additional processes could be in a tty session where TTY file
descriptors are indeed shared over privilege boundaries.
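
For readers unfamiliar with the ioctl itself, the mechanism being
restricted looks roughly like this from user space. This is a minimal
sketch; push_input() is a hypothetical helper name. TIOCSTI queues one
byte per call into the terminal's input, as if it had been typed.

```c
#include <errno.h>
#include <sys/ioctl.h>

/*
 * Queue each byte of s into the input of the terminal behind fd.
 * Returns the number of bytes queued, or -1 with errno set by ioctl().
 */
static int push_input(int fd, const char *s)
{
	int n = 0;

	for (; *s; s++, n++) {
		char ch = *s;

		if (ioctl(fd, TIOCSTI, &ch) < 0)
			return -1;
	}
	return n;
}
```

With the proposed restriction active, a call such as push_input(0, "ls\n")
from a process without CAP_SYS_ADMIN would be expected to fail with EPERM.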

This is also an excellent writeup about the issue:


Signed-off-by: Matt Brown 
---
 drivers/tty/tty_io.c |  4 
 include/linux/tty.h  |  2 ++
 kernel/sysctl.c  | 12 
 security/Kconfig | 13 +
 4 files changed, 31 insertions(+)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index e6d1a65..31894e8 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -2296,11 +2296,15 @@ static int tty_fasync(int fd, struct file *filp, int on)
  * FIXME: may race normal receive processing
  */
 
+int tiocsti_restrict = IS_ENABLED(CONFIG_SECURITY_TIOCSTI_RESTRICT);
+
 static int tiocsti(struct tty_struct *tty, char __user *p)
 {
char ch, mbz = 0;
struct tty_ldisc *ld;
 
+   if (tiocsti_restrict && !capable(CAP_SYS_ADMIN))
+   return -EPERM;
if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN))
return -EPERM;
if (get_user(ch, p))
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 1017e904..7011102 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -342,6 +342,8 @@ struct tty_file_private {
struct list_head list;
 };
 
+extern int tiocsti_restrict;
+
 /* tty magic number */
 #define TTY_MAGIC  0x5401
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index acf0a5a..68d1363 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -67,6 +67,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -833,6 +834,17 @@ static struct ctl_table kern_table[] = {
.extra2 = ,
},
 #endif
+#if defined CONFIG_TTY
+   {
+   .procname   = "tiocsti_restrict",
+   .data   = &tiocsti_restrict,
+   .maxlen = sizeof(int),
+   .mode   = 0644,
+   .proc_handler   = proc_dointvec_minmax_sysadmin,
+   .extra1 = ,
+   .extra2 = ,
+   },
+#endif
{
.procname   = "ngroups_max",
.data   = _max,
diff --git a/security/Kconfig b/security/Kconfig
index 3ff1bf9..7d13331 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -18,6 +18,19 @@ config SECURITY_DMESG_RESTRICT
 
  If you are unsure how to answer this question, answer N.
 
+config SECURITY_TIOCSTI_RESTRICT
+   bool "Restrict unprivileged use of tiocsti command injection"
+   default n
+   help
+ This enforces restrictions on unprivileged users injecting commands
+ into other processes which share a tty session using the TIOCSTI
+ ioctl. This option makes TIOCSTI use require CAP_SYS_ADMIN.
+
+ If this option is not selected, no restrictions will be enforced
+ unless the tiocsti_restrict sysctl is explicitly set to (1).
+
+ If you are unsure how to answer this question, answer N.
+
 config SECURITY
bool "Enable different security models"
depends on SYSFS
-- 
2.10.2

Re: [PATCH v2 8/9] staging: fsl-dpaa2/eth: Add TODO file

2017-04-18 Thread Stuart Yoder

On Wed, Apr 12, 2017 at 11:25 AM, Ioana Radulescu
 wrote:
> Add a list of TODO items for the Ethernet driver
>
> Signed-off-by: Ioana Radulescu 
> ---
> v2: Add note
>
>  drivers/staging/fsl-dpaa2/ethernet/TODO | 14 ++
>  1 file changed, 14 insertions(+)
>  create mode 100644 drivers/staging/fsl-dpaa2/ethernet/TODO
>
> diff --git a/drivers/staging/fsl-dpaa2/ethernet/TODO 
> b/drivers/staging/fsl-dpaa2/ethernet/TODO
> new file mode 100644
> index ..110e66d44b42
> --- /dev/null
> +++ b/drivers/staging/fsl-dpaa2/ethernet/TODO
> @@ -0,0 +1,14 @@
> +* Add a DPAA2 MAC kernel driver in order to allow PHY management; currently
> +  the DPMAC objects and their link to DPNIs are handled by MC internally
> +  and all PHYs are seen as fixed-link
> +* add more debug support: decide how to expose detailed debug statistics,
> +  add ingress error queue support
> +* MC firmware uprev; the DPAA2 objects used by the Ethernet driver need to
> +  be kept in sync with binary interface changes in MC
> +* refine README file
> +* cleanup
> +
> +NOTE: None of the above is must-have before getting the DPAA2 Ethernet driver
> +out of staging. The main requirement for that is to have the drivers it
> +depends on, fsl-mc bus and DPIO driver, moved to drivers/bus and drivers/soc
> +respectively.

The TODO file should have contact info (I think)...look at other
drivers/staging TODO
for examples.

Stuart

WARNING: kernel stack frame pointer has bad value

2017-04-18 Thread Steven Rostedt

Josh,

I'm starting to get a bunch of these warnings, and I'm thinking they
are false positives. The stack frame error is recorded at a call from
entry_SYSCALL_64_fastpath, where I would expect the bp to not be valid.

To trigger this, I only need to go into /sys/kernel/debug/tracing and
echo function > current_tracer then cat trace. Maybe function tracer
stack frames is messing it up some how, but it always fails at the
entry call.

Here's the dump:

 WARNING: kernel stack frame pointer at 8800bda0ff30 in sshd:1090 has bad 
value 55b32abf1fa8
 unwind stack type:0 next_sp:  (null) mask:6 graph_idx:0
 8800bda0fd28: 81cf502a (entry_SYSCALL_64_fastpath+0x18/0xad)
 8800bda0fd30: 810dc940 (sigprocmask+0x150/0x150)
 8800bda0fd38: 81cf502a (entry_SYSCALL_64_fastpath+0x18/0xad)
 8800bda0fd40: 8800c7e60040 (0x8800c7e60040)
 8800bda0fd48: 8800bda0fe08 (0x8800bda0fe08)
 8800bda0fd50: 825393c0 (ftrace_trace_arrays+0x40/0x40)
 8800bda0fd58: 8800c7e60040 (0x8800c7e60040)
 8800bda0fd60: 0008 (0x8)
 8800bda0fd68: 001a0800 (0x1a0800)
 8800bda0fd70:  ...
 8800bda0fd78: fbfff04a727c (0xfbfff04a727c)
 8800bda0fd80: 8122c8bb (trace_function+0x2b/0x120)
 8800bda0fd88: dc00 (0xdc00)
 8800bda0fd90: 810dc940 (sigprocmask+0x150/0x150)
 8800bda0fd98: 825393e0 (global_trace+0x20/0x1680)
 8800bda0fda0: ff7d (0xff7d)
 8800bda0fda8: 8122c8bb (trace_function+0x2b/0x120)
 8800bda0fdb0: 0010 (0x10)
 8800bda0fdb8: 0246 (0x246)
 8800bda0fdc0: 8800bda0fdd0 (0x8800bda0fdd0)
 8800bda0fdc8: 0018 (0x18)
 8800bda0fdd0: a02e0077 (0xa02e0077)
 8800bda0fdd8: 0246 (0x246)
 8800bda0fde0: 8800c7e60040 (0x8800c7e60040)
 8800bda0fde8: 8800c7e60040 (0x8800c7e60040)
 8800bda0fdf0: 0007 (0x7)
 8800bda0fdf8: 810dc940 (sigprocmask+0x150/0x150)
 8800bda0fe00: 81cf502a (entry_SYSCALL_64_fastpath+0x18/0xad)
 8800bda0fe08: 8800bda0fe68 (0x8800bda0fe68)
 8800bda0fe10: 81238168 (function_trace_call+0x208/0x260)
 8800bda0fe18: 00026f10 (0x26f10)
 8800bda0fe20: 8800c7e621f0 (0x8800c7e621f0)
 8800bda0fe28: 00026f10 (0x26f10)
 8800bda0fe30: 8800d3ea6f10 (0x8800d3ea6f10)
 8800bda0fe38: 8010 (0x8010)
 8800bda0fe40: 7d1f4e80 (0x7d1f4e80)
 8800bda0fe48: 7d1f4e00 (0x7d1f4e00)
 8800bda0fe50:  ...
 8800bda0fe58: 7d1f4f8f (0x7d1f4f8f)
 8800bda0fe60: 55b32a9a2a51 (0x55b32a9a2a51)
 8800bda0fe68: 8800bda0ff20 (0x8800bda0ff20)
 8800bda0fe70: a02e0077 (0xa02e0077)
 8800bda0fe78: 55b32bdc57c0 (0x55b32bdc57c0)
 8800bda0fe80: 41b58ab3 (0x41b58ab3)
 8800bda0fe88: 8233e3f0 (ONEf+0x16e40/0x5840d)
 8800bda0fe90: 8800bda0fed0 (0x8800bda0fed0)
 8800bda0fe98: 55b32abf1fa8 (0x55b32abf1fa8)
 8800bda0fea0: 8800bda0fee0 (0x8800bda0fee0)
 8800bda0fea8: 8800c7e60040 (0x8800c7e60040)
 8800bda0feb0: 81cf5017 (entry_SYSCALL_64_fastpath+0x5/0xad)
 8800bda0feb8: 001a0800 (0x1a0800)
 8800bda0fec0:  ...
 8800bda0fec8: 000e (0xe)
 8800bda0fed0: 0008 (0x8)
 8800bda0fed8: 7d1f4e00 (0x7d1f4e00)
 8800bda0fee0: 7d1f4e80 (0x7d1f4e80)
 8800bda0fee8:  ...
 8800bda0fef0: 8800bda0ff48 (0x8800bda0ff48)
 8800bda0fef8: 810dc945 (SyS_rt_sigprocmask+0x5/0x1a0)
 8800bda0ff00: 8800c7e60040 (0x8800c7e60040)
 8800bda0ff08: 0008 (0x8)
 8800bda0ff10: 001a0800 (0x1a0800)
 8800bda0ff18:  ...
 8800bda0ff20: 8800bda0ff30 (0x8800bda0ff30)
 8800bda0ff28: 810dc945 (SyS_rt_sigprocmask+0x5/0x1a0)
 8800bda0ff30: 55b32abf1fa8 (0x55b32abf1fa8)
 8800bda0ff38: 81cf502a (entry_SYSCALL_64_fastpath+0x18/0xad)
 8800bda0ff40: 55b32abf1fa8 (0x55b32abf1fa8)
 8800bda0ff48: 810dc945 (SyS_rt_sigprocmask+0x5/0x1a0)
 8800bda0ff50: 81cf502a (entry_SYSCALL_64_fastpath+0x18/0xad)
 8800bda0ff58: 258c9a9a (0x258c9a9a)
 8800bda0ff60: 9a954c2d (0x9a954c2d)
 8800bda0ff68: fc397de1 (0xfc397de1)
 8800bda0ff70: 2badc874 (0x2badc874)
 8800bda0ff78: 8800bda0ff98 (0x8800bda0ff98)
 8800bda0ff80: 81149040 (trace_hardirqs_off_caller+0xc0/0x110)
 8800bda0ff88: 0246 (0x246)
 8800bda0ff90: 0008 (0x8)
 8800bda0ff98: 001a0800 (0x1a0800)
 8800bda0ffa0:  ...
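
For context, the kind of check that produces a warning like the one above
can be sketched as a walk over the saved frame-pointer chain. The code
below is a simplified user-space model, not the kernel unwinder: struct
sketch_stack and both functions are hypothetical, and the stack is just an
array of 64-bit words in which each frame stores the caller's frame pointer
followed by a return address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one task stack. */
struct sketch_stack {
	uint64_t base;            /* address of words[0] */
	size_t nwords;
	const uint64_t *words;
};

/* A frame pointer must be aligned and leave room for two words. */
static bool sketch_on_stack(const struct sketch_stack *s, uint64_t addr)
{
	uint64_t end = s->base + (uint64_t)s->nwords * 8;

	return addr % 8 == 0 && addr >= s->base && addr < end &&
	       end - addr >= 16;
}

/*
 * Follow saved frame pointers starting at bp.  Returns the number of
 * frames visited.  *bad_value is 0 if the chain ended cleanly at a
 * NULL frame pointer, otherwise the value that failed the check.
 */
static int sketch_walk(const struct sketch_stack *s, uint64_t bp,
		       uint64_t *bad_value)
{
	int frames = 0;

	*bad_value = 0;
	while (bp != 0) {
		uint64_t next;

		if (!sketch_on_stack(s, bp)) {
			*bad_value = bp;
			break;
		}
		frames++;

		next = s->words[(bp - s->base) / 8];
		/* The chain must move toward higher addresses. */
		if (next != 0 && next <= bp) {
			*bad_value = next;
			break;
		}
		bp = next;
	}
	return frames;
}
```

In this model, a frame whose saved pointer holds a value that is not a
stack address, such as the 55b32abf1fa8 in the dump, is reported as bad
even though nothing is actually corrupted.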

[RFC] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level

2017-04-18 Thread Anshuman Khandual

Though migrating gigantic HugeTLB pages does not sound much like a real
world use case, they can be affected by memory errors. Hence migration
of PGD level HugeTLB pages should be supported just to enable soft
and hard offline use cases.

While allocating the new gigantic HugeTLB page, it should not matter
whether the new page comes from the same node or not. There would be very
few gigantic pages on the system after all, so we should not be bothered
about node locality when trying to save a big page from crashing.

This introduces a new HugeTLB allocator called alloc_gigantic_page()
which will scan over all online nodes on the system and allocate a
single HugeTLB page.

Signed-off-by: Anshuman Khandual 
---
Tested on a POWER8 machine with 16GB pages along with Aneesh's
recent HugeTLB enablement patch series on powerpc which can
be found here.

https://lkml.org/lkml/2017/4/17/225

Here, we directly call alloc_gigantic_page() which ignores node
locality. But we can also first call normal alloc_huge_page()
with the node number and if that fails to allocate then call
alloc_gigantic_page() as a fallback option.
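
The fallback variant mentioned above could look roughly like the following
user-space model. It is only a sketch under stated assumptions: struct
sketch_pool and its helpers are hypothetical and track plain per-node
counters rather than real hstate free lists, and the reserve check is
written as a comparison instead of a subtraction.

```c
#include <stdbool.h>

#define SKETCH_MAX_NODES 4

/* Hypothetical per-node pool of free gigantic pages. */
struct sketch_pool {
	bool online[SKETCH_MAX_NODES];
	unsigned int free_pages[SKETCH_MAX_NODES];
	unsigned int reserved;   /* pages already promised, pool-wide */
};

static unsigned int sketch_total_free(const struct sketch_pool *p)
{
	unsigned int total = 0;
	int nid;

	for (nid = 0; nid < SKETCH_MAX_NODES; nid++)
		if (p->online[nid])
			total += p->free_pages[nid];
	return total;
}

/* Take one page from node nid; returns true on success. */
static bool sketch_dequeue(struct sketch_pool *p, int nid)
{
	if (nid < 0 || nid >= SKETCH_MAX_NODES)
		return false;
	if (!p->online[nid] || p->free_pages[nid] == 0)
		return false;
	p->free_pages[nid]--;
	return true;
}

/*
 * Try the preferred node first, then fall back to any online node.
 * Returns the node the page came from, or -1 if none is available.
 */
static int sketch_alloc_prefer_local(struct sketch_pool *p, int preferred)
{
	int nid;

	/* Never dip into pages that are already reserved. */
	if (sketch_total_free(p) <= p->reserved)
		return -1;

	if (sketch_dequeue(p, preferred))
		return preferred;

	for (nid = 0; nid < SKETCH_MAX_NODES; nid++)
		if (sketch_dequeue(p, nid))
			return nid;

	return -1;
}
```

Compared with scanning all nodes unconditionally, this keeps the
replacement page on the original node whenever that node still has a free
page.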

 include/linux/hugetlb.h |  8 +++-
 mm/hugetlb.c| 17 +
 mm/memory-failure.c |  8 ++--
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 04b73a9c8b4b..ee75197e6ed8 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -347,6 +347,7 @@ struct huge_bootmem_page {
 
 struct page *alloc_huge_page(struct vm_area_struct *vma,
unsigned long addr, int avoid_reserve);
+struct page *alloc_gigantic_page(struct hstate *h);
 struct page *alloc_huge_page_node(struct hstate *h, int nid);
 struct page *alloc_huge_page_noerr(struct vm_area_struct *vma,
unsigned long addr, int avoid_reserve);
@@ -473,7 +474,11 @@ extern int dissolve_free_huge_pages(unsigned long 
start_pfn,
 static inline bool hugepage_migration_supported(struct hstate *h)
 {
 #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
-   return huge_page_shift(h) == PMD_SHIFT;
+   if ((huge_page_shift(h) == PMD_SHIFT) ||
+   (huge_page_shift(h) == PGDIR_SHIFT))
+   return true;
+   else
+   return false;
 #else
return false;
 #endif
@@ -511,6 +516,7 @@ static inline void hugetlb_count_sub(long l, struct 
mm_struct *mm)
 #else  /* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 #define alloc_huge_page(v, a, r) NULL
+#define alloc_gigantic_page(h) NULL
 #define alloc_huge_page_node(h, nid) NULL
 #define alloc_huge_page_noerr(v, a, r) NULL
 #define alloc_bootmem_huge_page(h) NULL
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 97a44db06850..f2b31dddb1bc 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1669,6 +1669,23 @@ struct page *__alloc_buddy_huge_page_with_mpol(struct 
hstate *h,
return __alloc_buddy_huge_page(h, vma, addr, NUMA_NO_NODE);
 }
 
+struct page *alloc_gigantic_page(struct hstate *h)
+{
+   struct page *page = NULL;
+   int nid = 0;
+
+   spin_lock(&hugetlb_lock);
+   if (h->free_huge_pages - h->resv_huge_pages > 0) {
+   for_each_online_node(nid) {
+   page = dequeue_huge_page_node(h, nid);
+   if (page)
+   break;
+   }
+   }
+   spin_unlock(&hugetlb_lock);
+   return page;
+}
+
 /*
  * This allocation function is useful in the context where vma is irrelevant.
  * E.g. soft-offlining uses this function because it only cares physical
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index fe64d7729a8e..619650969fe5 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1481,11 +1481,15 @@ EXPORT_SYMBOL(unpoison_memory);
 static struct page *new_page(struct page *p, unsigned long private, int **x)
 {
int nid = page_to_nid(p);
-   if (PageHuge(p))
+   if (PageHuge(p)) {
+   if (hstate_is_gigantic(page_hstate(compound_head(p))))
+   return alloc_gigantic_page(page_hstate(compound_head(p)));
+
return alloc_huge_page_node(page_hstate(compound_head(p)),
   nid);
-   else
+   } else {
return __alloc_pages_node(nid, GFP_HIGHUSER_MOVABLE, 0);
+   }
 }
 
 /*
-- 
2.12.0

[RFC] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level

2017-04-18 Thread Anshuman Khandual

Though migrating gigantic HugeTLB pages does not sound much like real
world use case, they can be affected by memory errors. Hence migration
at the PGD level HugeTLB pages should be supported just to enable soft
and hard offline use cases.

While allocating the new gigantic HugeTLB page, it should not matter
whether new page comes from the same node or not. There would be very
few gigantic pages on the system afterall, we should not be bothered
about node locality when trying to save a big page from crashing.

This introduces a new HugeTLB allocator called alloc_gigantic_page()
which will scan over all online nodes on the system and allocate a
single HugeTLB page.

Signed-off-by: Anshuman Khandual 
---
Tested on a POWER8 machine with 16GB pages along with Aneesh's
recent HugeTLB enablement patch series on powerpc which can
be found here.

https://lkml.org/lkml/2017/4/17/225

Here, we directly call alloc_gigantic_page() which ignores node
locality. But we can also first call normal alloc_huge_page()
with the node number and if that fails to allocate then call
alloc_gigantic_page() as a fallback option.

 include/linux/hugetlb.h |  8 +++-
 mm/hugetlb.c| 17 +
 mm/memory-failure.c |  8 ++--
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 04b73a9c8b4b..ee75197e6ed8 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -347,6 +347,7 @@ struct huge_bootmem_page {
 
 struct page *alloc_huge_page(struct vm_area_struct *vma,
unsigned long addr, int avoid_reserve);
+struct page *alloc_gigantic_page(struct hstate *h);
 struct page *alloc_huge_page_node(struct hstate *h, int nid);
 struct page *alloc_huge_page_noerr(struct vm_area_struct *vma,
unsigned long addr, int avoid_reserve);
@@ -473,7 +474,11 @@ extern int dissolve_free_huge_pages(unsigned long 
start_pfn,
 static inline bool hugepage_migration_supported(struct hstate *h)
 {
 #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
-   return huge_page_shift(h) == PMD_SHIFT;
+   if ((huge_page_shift(h) == PMD_SHIFT) ||
+   (huge_page_shift(h) == PGDIR_SHIFT))
+   return true;
+   else
+   return false;
 #else
return false;
 #endif
@@ -511,6 +516,7 @@ static inline void hugetlb_count_sub(long l, struct 
mm_struct *mm)
 #else  /* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 #define alloc_huge_page(v, a, r) NULL
+#define alloc_gigantic_page(h) NULL
 #define alloc_huge_page_node(h, nid) NULL
 #define alloc_huge_page_noerr(v, a, r) NULL
 #define alloc_bootmem_huge_page(h) NULL
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 97a44db06850..f2b31dddb1bc 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1669,6 +1669,23 @@ struct page *__alloc_buddy_huge_page_with_mpol(struct 
hstate *h,
return __alloc_buddy_huge_page(h, vma, addr, NUMA_NO_NODE);
 }
 
+struct page *alloc_gigantic_page(struct hstate *h)
+{
+   struct page *page = NULL;
+   int nid = 0;
+
+   spin_lock(_lock);
+   if (h->free_huge_pages - h->resv_huge_pages > 0) {
+   for_each_online_node(nid) {
+   page = dequeue_huge_page_node(h, nid);
+   if (page)
+   break;
+   }
+   }
+   spin_unlock(_lock);
+   return page;
+}
+
 /*
  * This allocation function is useful in the context where vma is irrelevant.
  * E.g. soft-offlining uses this function because it only cares physical
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index fe64d7729a8e..619650969fe5 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1481,11 +1481,15 @@ EXPORT_SYMBOL(unpoison_memory);
 static struct page *new_page(struct page *p, unsigned long private, int **x)
 {
 	int nid = page_to_nid(p);
-	if (PageHuge(p))
+	if (PageHuge(p)) {
+		if (hstate_is_gigantic(page_hstate(compound_head(p))))
+			return alloc_gigantic_page(page_hstate(compound_head(p)));
+
 		return alloc_huge_page_node(page_hstate(compound_head(p)),
 						   nid);
-	else
+	} else {
 		return __alloc_pages_node(nid, GFP_HIGHUSER_MOVABLE, 0);
+	}
 }
 
 /*
-- 
2.12.0
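
As a reading aid for the new alloc_gigantic_page() helper in the patch above, here is a minimal user-space sketch of the same shape of logic: give up when every free page is already reserved, otherwise take a page from the first node that has one. It is not kernel code. `struct pool_model`, `MODEL_NODES` and `pool_take_first()` are hypothetical names invented for the illustration, and the sketch assumes the counters are unsigned, which is why it compares them instead of subtracting.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NODES 4  /* hypothetical number of modelled NUMA nodes */

/* Hypothetical stand-in for the per-hstate counters; not struct hstate. */
struct pool_model {
	unsigned long free_pages;                 /* pages on the free lists */
	unsigned long resv_pages;                 /* pages promised to reservations */
	unsigned long free_per_node[MODEL_NODES]; /* free pages on each node */
};

/*
 * Return the id of the first node that gave up a free page, or -1 if
 * nothing can be handed out without eating into the reservations.
 */
static int pool_take_first(struct pool_model *p)
{
	size_t nid;

	assert(p != NULL);

	/*
	 * Compare rather than subtract: with unsigned counters,
	 * "free - resv > 0" would wrap and read as true if resv were
	 * ever larger than free.
	 */
	if (p->free_pages <= p->resv_pages)
		return -1;

	for (nid = 0; nid < MODEL_NODES; nid++) {
		if (p->free_per_node[nid] > 0) {
			p->free_per_node[nid]--;
			p->free_pages--;
			return (int)nid;
		}
	}
	return -1;
}
```

With two free pages on node 2 and one reservation, the first call returns 2 and the second returns -1, because the one remaining free page is reserved.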

Re: [RfC PATCH] drm: fourcc byteorder: brings header file comments in line with reality.

2017-04-18 Thread Ilia Mirkin

On Tue, Apr 18, 2017 at 11:19 PM, Ilia Mirkin wrote:
> On Tue, Apr 18, 2017 at 9:01 PM, Michel Dänzer wrote:
>> On 18/04/17 07:14 PM, Gerd Hoffmann wrote:
>>>   Hi,
>>>
>>>>> Quite true that this proves nothing. However one should note that
>>>>> fbcon -> fbdev works,
>>>>
>>>> BTW, this supports Gerd's patch, since the KMS fbdev emulation code uses
>>>> e.g. DRM_FORMAT_XRGB8888 for depth/bpp 24/32, and the fbdev API uses
>>>> native endian packed colour values.
>>>
>>> Same is true for DRM_IOCTL_MODE_ADDFB, with depth/bpp 24/32 you'll get
>>> DRM_FORMAT_XRGB8888 (only DRM_IOCTL_MODE_ADDFB2 allows userspace specify
>>> fourcc formats directly).
>>
>> Right, and since all major Xorg drivers use DRM_IOCTL_MODE_ADDFB,
>> they're effectively using DRM_FORMAT_XRGB8888 as native endianness as well.
>
> In the meanwhile, it has been pointed out to me that pre-nv50 display
> code actually doesn't use DRM_FORMAT_* at all -- it uses some helpers
> which end up advertising XR24 / AR24. However from what I can tell,
> that's not a well-reasoned selection. Either way, I'm going to test
> Gerd's patch, hopefully during the week, or weekend at the latest. My
> current suspicion is that it will have no effect on nouveau either
> way. We'll find out.

(And as Michel points out, the patch doesn't actually touch anything,
just comments. I originally thought it changed format -> fourcc
mapping.)
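
For anyone following the "format -> fourcc mapping" point, a small sketch of what such a mapping looks like may help. The `FOURCC()` macro below uses the same packing scheme as fourcc_code() in the DRM uapi header (first character in the least significant byte), so "XR24" and "AR24" are the codes for XRGB8888 and ARGB8888. `legacy_to_fourcc()` and `fourcc_name()` are hypothetical helpers written for this illustration; they only cover the 32 bpp cases mentioned in the thread and are not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four characters into a 32-bit code, first character lowest. */
#define FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define FMT_XRGB8888 FOURCC('X', 'R', '2', '4')  /* "XR24" */
#define FMT_ARGB8888 FOURCC('A', 'R', '2', '4')  /* "AR24" */

/*
 * Hypothetical helper: map a legacy (bpp, depth) pair, as passed to the
 * old ADDFB interface, to a fourcc code. Returns 0 for pairs this
 * sketch does not cover.
 */
static uint32_t legacy_to_fourcc(unsigned int bpp, unsigned int depth)
{
	if (bpp == 32 && depth == 24)
		return FMT_XRGB8888;
	if (bpp == 32 && depth == 32)
		return FMT_ARGB8888;
	return 0;
}

/* Hypothetical helper: decode a fourcc code into a printable string. */
static void fourcc_name(uint32_t code, char out[5])
{
	assert(out != NULL);
	out[0] = (char)(code & 0xff);
	out[1] = (char)((code >> 8) & 0xff);
	out[2] = (char)((code >> 16) & 0xff);
	out[3] = (char)((code >> 24) & 0xff);
	out[4] = '\0';
}
```

Calling `legacy_to_fourcc(32, 24)` and passing the result to `fourcc_name()` yields the string "XR24". Note that the code itself only names the format; it says nothing about which byte order a driver actually uses for the pixels.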

Re: [RfC PATCH] drm: fourcc byteorder: brings header file comments in line with reality.

2017-04-18 Thread Ilia Mirkin

On Tue, Apr 18, 2017 at 9:01 PM, Michel Dänzer wrote:
> On 18/04/17 07:14 PM, Gerd Hoffmann wrote:
>>   Hi,
>>
>>>> Quite true that this proves nothing. However one should note that
>>>> fbcon -> fbdev works,
>>>
>>> BTW, this supports Gerd's patch, since the KMS fbdev emulation code uses
>>> e.g. DRM_FORMAT_XRGB8888 for depth/bpp 24/32, and the fbdev API uses
>>> native endian packed colour values.
>>
>> Same is true for DRM_IOCTL_MODE_ADDFB, with depth/bpp 24/32 you'll get
>> DRM_FORMAT_XRGB8888 (only DRM_IOCTL_MODE_ADDFB2 allows userspace specify
>> fourcc formats directly).
>
> Right, and since all major Xorg drivers use DRM_IOCTL_MODE_ADDFB,
> they're effectively using DRM_FORMAT_XRGB8888 as native endianness as well.

In the meanwhile, it has been pointed out to me that pre-nv50 display
code actually doesn't use DRM_FORMAT_* at all -- it uses some helpers
which end up advertising XR24 / AR24. However from what I can tell,
that's not a well-reasoned selection. Either way, I'm going to test
Gerd's patch, hopefully during the week, or weekend at the latest. My
current suspicion is that it will have no effect on nouveau either
way. We'll find out.

  -ilia
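
The distinction the thread keeps returning to, "native endian packed" versus a fixed little-endian byte layout, can be shown with a short sketch. It packs an XRGB8888 pixel into a 32-bit value, then stores it once with an explicit little-endian layout and once as the host CPU would. All function names here are hypothetical and chosen for the illustration; nothing below is taken from the DRM code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack 8-bit channels as x:R:G:B, with the unused X byte left at zero. */
static uint32_t pack_xrgb8888(uint8_t r, uint8_t g, uint8_t b)
{
	return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* Store with a fixed little-endian layout: memory order is B, G, R, X. */
static void store_le32(uint32_t v, uint8_t out[4])
{
	assert(out != NULL);
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Store the way the host CPU lays out a 32-bit word ("native endian"). */
static void store_native32(uint32_t v, uint8_t out[4])
{
	assert(out != NULL);
	memcpy(out, &v, sizeof(v));
}

/* Return 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}
```

On a little-endian host the two stores produce identical bytes, so the question never comes up. On a big-endian host `store_native32()` produces X, R, G, B in memory while `store_le32()` still produces B, G, R, X, which is the case where it matters whether a format is treated as little-endian or as native-endian.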

Re: [PATCH] of: introduce event tracepoints for dynamic device_node lifecyle

2017-04-18 Thread Frank Rowand

On 04/18/17 19:46, Steven Rostedt wrote:
> On Tue, 18 Apr 2017 17:07:17 -0700
> Frank Rowand wrote:
> 
> 
>> As far as I know, there is no easy way to combine trace data and printk()
>> style data to create a single chronology of events.  If some of the
>> information needed to debug an issue is trace data and some is printk()
>> style data then it becomes more difficult to understand the overall
>> situation.
> 
> You mean like:
> 
>  # echo 1 > /sys/kernel/debug/tracing/events/printk/console/enable
> 
> Makes all printks also go into the ftrace ring buffer.

Thanks!  I was hoping there was going to be an easy answer like this.
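
For completeness, the same switch can be flipped from a program instead of the shell. The sketch below is only an illustration: `set_event_enabled()` is a hypothetical helper that writes "1" or "0" to whatever enable file it is given. Pointing it at the path from Steve's command assumes debugfs is mounted at /sys/kernel/debug and that the caller has permission to write there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write "1" (on != 0) or "0" (on == 0) to an event's enable file.
 * Returns 0 on success and -1 if the file cannot be opened or written.
 */
static int set_event_enabled(const char *enable_path, int on)
{
	FILE *f;

	assert(enable_path != NULL);

	f = fopen(enable_path, "w");
	if (f == NULL)
		return -1;

	if (fputs(on ? "1" : "0", f) == EOF) {
		fclose(f);
		return -1;
	}

	/* fclose() flushes the buffer, so a failed write shows up here. */
	return fclose(f) == 0 ? 0 : -1;
}
```

Usage would be `set_event_enabled("/sys/kernel/debug/tracing/events/printk/console/enable", 1)`, with a second call passing 0 to turn the event off again.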


> -- Steve
> 
>>
>> If Rob wants to convert printk() style data to trace data (and I can't
>> convince him otherwise) then I will have further comments on this specific
>> patch.
>>
> .
>
