Re: [STABLE 4.9.y PATCH 0/9] Backport of KVM Speculation Control support

2018-02-08 Thread Greg KH
On Thu, Feb 08, 2018 at 06:57:38PM +0100, Greg KH wrote:
> On Thu, Feb 08, 2018 at 06:42:03PM +0100, Paolo Bonzini wrote:
> > On 08/02/2018 18:14, Greg KH wrote:
> > > On Thu, Feb 08, 2018 at 03:49:59AM +0100, Greg KH wrote:
> > >> On Tue, Feb 06, 2018 at 09:05:46PM +, Woodhouse, David wrote:
> > >>>
> > >>>
> > >>> On Tue, 2018-02-06 at 19:01 +0100, Paolo Bonzini wrote:
> >  On 06/02/2018 18:29, David Woodhouse wrote:
> > > I've put together a linux-4.9.y branch at 
> > > http://git.infradead.org/retpoline-stable.git/shortlog/refs/heads/linux-4.9.y
> > >  
> > > Most of it is fairly straightforward, apart from the IBPB on context 
> > > switch for which Tim has already posted a candidate. I wanted some 
> > > more
> > > review on my backports of the KVM bits though, including some extra
> > > historical patches I pulled in.
> > 
> >  Looks good!  Thanks for the work,
> > 
> >  Paolo
> > >>>
> > >>> Thanks. In that case, Greg, the full set is lined up in
> > >>> http://git.infradead.org/retpoline-stable.git/shortlog/refs/heads/linux-4.9.y
> > >>> or git://git.infradead.org/retpoline-stable linux-4.9.y
> > >>
> > >> Many thanks for all of this work.  I've now queued up all of these.
> > > 
> > > There's a problem with the backport of 6342c50ad12e ("KVM: nVMX:
> > > vmx_complete_nested_posted_interrupt() can't fail") as there is still a
> > > check in the function that can fail:
> > > 
> > >   vapic_page = kmap(vmx->nested.virtual_apic_page);
> > >   if (!vapic_page) {
> > >   WARN_ON(1);
> > >   return -ENOMEM;
> > >   }
> > > 
> > > Do we need something else before this patch in order to fix this?  I
> > > guess kmap really can't fail, should I just drop the whole (!vapic_page)
> > > check?
> > 
> > Yes, that would be commit 42cf014d38d8822cce63703a467e00f65d000952.
> > Should David or I respin?
> 
> No need, I can sneak it into the middle of the series :)  I'll do it
> later tonight and let you know if I have any problems, thanks for
> pointing out the needed commit.

Now queued up.


[kselftests] compaction_test is blocked

2018-02-08 Thread Li Zhijian

Hi

kselftests is integrated Intel 0Day project.
Sometimes we found compaction_test is blocked for more than 1 hours until i 
kill it.

Try to figure out where it is running, i added some log to this case.

the test log is like:
---
 [  111.750543] main: 248
 [  111.750544]-
 [ 111.750821] check_compaction: 98
 [  111.750822]-
 [  111.751102] check_compaction: 105
 [  111.751103]-
 [  111.751362] check_compaction: 111
 [  111.751363]-
 [  111.751621] check_compaction: 118
 [  111.751622]-
 [  111.751879] check_compaction: 123
 [  111.751880]-
---
118 fprintf(stderr, "%s: %d\n", __func__, __LINE__);
119 lseek(fd, 0, SEEK_SET);
120
121 /* Request a large number of huge pages. The Kernel will allocate
122as much as it can */
123 fprintf(stderr, "%s: %d\n", __func__, __LINE__); 
<<< the last line we can catch.
124 if (write(fd, "10", (6*sizeof(char))) != (6*sizeof(char))) {
 blocking position
125 perror("Failed to write 10 to 
/proc/sys/vm/nr_hugepages\n");
126 goto close_fd;
127 }
128
129 lseek(fd, 0, SEEK_SET);
130
131 fprintf(stderr, "%s: %d\n", __func__, __LINE__);
132 if (read(fd, nr_hugepages, sizeof(nr_hugepages)) <= 0) {
133 perror("Failed to re-read from 
/proc/sys/vm/nr_hugepages\n");
134 goto close_fd;
135 }
---

According to above log and code, it most likely it is blocking at the writing 
operation.

my environment is like:
OS: debian
kernel: v4.15
model: Ivytown Ivy Bridge-EP
nr_cpu: 48
memory: 64G


NOTE: 0Day can reproduce this issue in 20% on 0Day.

Anybody can help have a look?

Thanks
Zhjian





Re: [PATCH 2/3] clk: exynos5433: Allow audio subsystem clock rate propagation

2018-02-08 Thread Chanwoo Choi
Hi Sylwester,

On 2018년 02월 08일 00:18, Sylwester Nawrocki wrote:
> Hi Chanwoo,
> 
> On 02/06/2018 05:06 AM, Chanwoo Choi wrote:
>>>  drivers/clk/samsung/clk-exynos5433.c | 22 +++---
>>>  1 file changed, 11 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/clk/samsung/clk-exynos5433.c 
>>> b/drivers/clk/samsung/clk-exynos5433.c
>>> index 74b70ddab4d6..d74361736e64 100644
>>> --- a/drivers/clk/samsung/clk-exynos5433.c
>>> +++ b/drivers/clk/samsung/clk-exynos5433.c
>>> @@ -246,14 +246,14 @@ static const struct samsung_fixed_rate_clock 
>>> top_fixed_clks[] __initconst = {
>>>  
>>>  static const struct samsung_mux_clock top_mux_clks[] __initconst = {
>>> /* MUX_SEL_TOP0 */
>>> -   MUX(CLK_MOUT_AUD_PLL, "mout_aud_pll", mout_aud_pll_p, MUX_SEL_TOP0,
>>> -   4, 1),
>>> +   MUX_F(CLK_MOUT_AUD_PLL, "mout_aud_pll", mout_aud_pll_p, MUX_SEL_TOP0,
>>> + 4, 1, CLK_SET_RATE_PARENT, 0),
>>
>> If you add CLK_SET_RATE_PARENT to 'mout_aud_pll' and mout_aud_pll changes 
>> the rate,
>> fout_aud_pll's rate will be changed. But, fout_aud_pll is also the parent
>> of 'mout_aud_pll_user'. It might change the rate of children of 
>> mout_aud_pll_user.
>> mout_aud_pll_user would not want to change the parent's clock.
>>
>> fout_aud_pll  22   196608009 
>>  0 0  
>>mout_aud_pll_user  11   196608009 
>>  0 0  
>>mout_aud_pll   00   196608009 
>>  0 0  
> 
> I'd say the range of changes is such that the consumers of the affected child 
> clocks can cope and could adjust to the changed frequencies. Those consumer 
> devices are all components/peripherals of the audio subsystem (LPASS) and, 

The mout_aud_pll_user has the child clock of serial_3.
serial_3 was used for bluetooth on TM2. If you change the aud_pll
with CLK_SET_RATE_PARENT, it might affect the bluetooth operation.
The bluetooth is only used for transfering the data.

Actually, I'm not sure that this patch might affect bluetooth operation or not.

> for example, in case of TM2 there is no issues at all with varying the AUD PLL
> frequency depending on the HDMI audio sample rate. The other audio path uses
> the audio CODEC's internal PLL as the root clock source. The AUD PLL frequency
> will need to be adjusted somehow anyway, we could also get the PLL clock 
> directly and set it's rate, instead of relying on that rate propagation 
> algorithm.  I think we could also export a function from the exynos-lpass mfd 
> driver for setting the PLL's rate directly, after listing the AUD PLL clock 
> in the lpass DT node. That would be more flexible API, easier to adopt for 
> various use cases/boards, now we have only TM2. I can't list the PLL clock 
> in the sound node, that would not have passed the DT maintainters' review. 
> 

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


[tip:x86/pti] objtool: Fix switch-table detection

2018-02-08 Thread tip-bot for Peter Zijlstra
Commit-ID:  99ce7962d52d1948ad6f2785e308d48e76e0a6ef
Gitweb: https://git.kernel.org/tip/99ce7962d52d1948ad6f2785e308d48e76e0a6ef
Author: Peter Zijlstra 
AuthorDate: Thu, 8 Feb 2018 14:02:32 +0100
Committer:  Ingo Molnar 
CommitDate: Fri, 9 Feb 2018 07:20:23 +0100

objtool: Fix switch-table detection

Linus reported that GCC-7.3 generated a switch-table construct that
confused objtool. It turns out that, in particular due to KASAN, it is
possible to have unrelated .rodata usage in between the .rodata setup
for the switch-table and the following indirect jump.

The simple linear reverse search from the indirect jump would hit upon
the KASAN .rodata usage first and fail to find a switch_table,
resulting in a spurious 'sibling call with modified stack frame'
warning.

Fix this by creating a 'jump-stack' which we can 'unwind' during
reversal, thereby skipping over much of the in-between code.

This is not fool proof by any means, but is sufficient to make the
known cases work. Future work would be to construct more comprehensive
flow analysis code.

Reported-and-tested-by: Linus Torvalds 
Signed-off-by: Peter Zijlstra (Intel) 
Acked-by: Josh Poimboeuf 
Cc: Borislav Petkov 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: 
http://lkml.kernel.org/r/20180208130232.gf25...@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar 
---
 tools/objtool/check.c | 41 +++--
 tools/objtool/check.h |  1 +
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 9cd028a..2e458eb 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -851,8 +851,14 @@ static int add_switch_table(struct objtool_file *file, 
struct symbol *func,
  *This is a fairly uncommon pattern which is new for GCC 6.  As of this
  *writing, there are 11 occurrences of it in the allmodconfig kernel.
  *
+ *As of GCC 7 there are quite a few more of these and the 'in between' code
+ *is significant. Esp. with KASAN enabled some of the code between the mov
+ *and jmpq uses .rodata itself, which can confuse things.
+ *
  *TODO: Once we have DWARF CFI and smarter instruction decoding logic,
  *ensure the same register is used in the mov and jump instructions.
+ *
+ *NOTE: RETPOLINE made it harder still to decode dynamic jumps.
  */
 static struct rela *find_switch_table(struct objtool_file *file,
  struct symbol *func,
@@ -874,12 +880,25 @@ static struct rela *find_switch_table(struct objtool_file 
*file,
text_rela->addend + 4);
if (!rodata_rela)
return NULL;
+
file->ignore_unreachables = true;
return rodata_rela;
}
 
/* case 3 */
-   func_for_each_insn_continue_reverse(file, func, insn) {
+   /*
+* Backward search using the @first_jump_src links, these help avoid
+* much of the 'in between' code. Which avoids us getting confused by
+* it.
+*/
+   for (insn = list_prev_entry(insn, list);
+
+&insn->list != &file->insn_list &&
+insn->sec == func->sec &&
+insn->offset >= func->offset;
+
+insn = insn->first_jump_src ?: list_prev_entry(insn, list)) {
+
if (insn->type == INSN_JUMP_DYNAMIC)
break;
 
@@ -909,14 +928,32 @@ static struct rela *find_switch_table(struct objtool_file 
*file,
return NULL;
 }
 
+
 static int add_func_switch_tables(struct objtool_file *file,
  struct symbol *func)
 {
-   struct instruction *insn, *prev_jump = NULL;
+   struct instruction *insn, *last = NULL, *prev_jump = NULL;
struct rela *rela, *prev_rela = NULL;
int ret;
 
func_for_each_insn(file, func, insn) {
+   if (!last)
+   last = insn;
+
+   /*
+* Store back-pointers for unconditional forward jumps such
+* that find_switch_table() can back-track using those and
+* avoid some potentially confusing code.
+*/
+   if (insn->type == INSN_JUMP_UNCONDITIONAL && insn->jump_dest &&
+   insn->offset > last->offset &&
+   insn->jump_dest->offset > insn->offset &&
+   !insn->jump_dest->first_jump_src) {
+
+   insn->jump_dest->first_jump_src = insn;
+   last = insn->jump_dest;
+   }
+
if (insn->type != INSN_JUMP_DYNAMIC)
continue;
 
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index dbadb30..23a1d06 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -47,6 +47,7 @@ struct instruction {
bool alt_group, visited, dead_end, ignore, hint, save, restore, 
ignore_alts;
struct symbol *call_dest;

Re: [PATCH 1/3] clk: exynos5433: Extend list of available AUD_PLL output frequencies

2018-02-08 Thread Chanwoo Choi
On 2018년 02월 07일 22:04, Sylwester Nawrocki wrote:
> On 02/07/2018 12:24 PM, Chanwoo Choi wrote:
>> Could you share your equation?
>> because your result is a little bit different of my result.
>> - my equation : ((mdiv + kdiv/65535) x 24MHz) / (pdiv x POWER(2,sdiv))
> 
> It resembles the code from samsung_pll36xx_recalc_rate():
> 
> (24 * 10^6 * (M * 2^16 + K)) / (P * 2^S) / 2^16
> 
> and a more accurate one
> 
> ROUNDDOWN(ROUNDDOWN(24 * 10^6 * (M * 2^16 + K), 0) / ROUNDDOWN(P * 2^S, 0) / 
> 2^16, 0)
> 
> Shouldn't you substitute 65535 with 65536?

65536 is right. It is my mistake using 65535.
Thanks for your share.

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


Re: [V9fs-developer] [RFC] we should solve create-unlink-getattr idiom

2018-02-08 Thread Veaceslav Falico
Hi Yiwen, all,

On 2/9/2018 8:10 AM, jiangyiwen wrote:
> Hi Eric and Greg,
> 
> I encountered the similar problem with create-unlink-getattr idiom.
> I use the testcase that create-unlink-setattr idiom, and I see the
> bug is reported at https://bugs.launchpad.net/qemu/+bug/1336794.
> Then I also see you already fix the issue and push the patch to upstream.
> https://github.com/ericvh/linux/commit/eaf70223eac094291169f5a6de580351890162a2
> http://patchwork.ozlabs.org/patch/626194/
> 
> Unfortunately, the two patches are not merged into master, I don't know
> the reason, so I suggest if the patche can be merged into master, and
> it will solve the create-unlink-getattr idiom.

As a follow up - the create-unlink-setattr (mainly ftruncate and anything
else which works on fd instead of path) isn't fixed by these patches, but
I'm currently working on a new patch, obviously on top of those two, to
make the setattr work too.

It's based on the same logic as the above patches though - use FIDs with
open fd's guest side and use open fd's host side if possible with f*
functions, otherwise path with l* functions.

It's bigger than the QEMU getattr patch, as there are no f* functions
available for ftruncate case, for example.

So if those two patches could be merged it'd be a lot easier to then
go forward with the setattr fix.

Thank you!

> 
> Thanks,
> Yiwen
> 
> .
> 




Re: [PATCH 4/6] sched/isolation: Residual 1Hz scheduler tick offload

2018-02-08 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> When a CPU runs in full dynticks mode, a 1Hz tick remains in order to
> keep the scheduler stats alive. However this residual tick is a burden
> for bare metal tasks that can't stand any interruption at all, or want
> to minimize them.
> 
> The usual boot parameters "nohz_full=" or "isolcpus=nohz" will now
> outsource these scheduler ticks to the global workqueue so that a
> housekeeping CPU handles those remotely. The sched_class::task_tick()
> implementations have been audited and look safe to be called remotely
> as the target runqueue and its current task are passed in parameter
> and don't seem to be accessed locally.
> 
> Note that in the case of using isolcpus, it's still up to the user to
> affine the global workqueues to the housekeeping CPUs through
> /sys/devices/virtual/workqueue/cpumask or domains isolation
> "isolcpus=nohz,domain".
> 
> Signed-off-by: Frederic Weisbecker 
> Cc: Chris Metcalf 
> Cc: Christoph Lameter 
> Cc: Luiz Capitulino 
> Cc: Mike Galbraith 
> Cc: Paul E. McKenney 
> Cc: Peter Zijlstra 
> Cc: Rik van Riel 
> Cc: Thomas Gleixner 
> Cc: Wanpeng Li 
> Cc: Ingo Molnar 
> ---
>  kernel/sched/core.c  | 91 
> +++-
>  kernel/sched/isolation.c |  4 +++
>  kernel/sched/sched.h |  2 ++
>  3 files changed, 96 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index fc9fa25..5c0e8b6 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3120,7 +3120,94 @@ u64 scheduler_tick_max_deferment(void)
>  
>   return jiffies_to_nsecs(next - now);
>  }
> -#endif
> +
> +struct tick_work {
> + int cpu;
> + struct delayed_work work;
> +};
> +
> +static struct tick_work __percpu *tick_work_cpu;
> +
> +static void sched_tick_remote(struct work_struct *work)
> +{
> + struct delayed_work *dwork = to_delayed_work(work);
> + struct tick_work *twork = container_of(dwork, struct tick_work, work);
> + int cpu = twork->cpu;
> + struct rq *rq = cpu_rq(cpu);
> + struct rq_flags rf;
> +
> + /*
> +  * Handle the tick only if it appears the remote CPU is running
> +  * in full dynticks mode. The check is racy by nature, but
> +  * missing a tick or having one too much is no big deal.

I'd suggest pointing out why it's no big deal:

 * missing a tick or having one too much is no big deal,
 * because the scheduler tick updates statistics and checks
 * timeslices in a time-independent way, regardless of when
 * exactly it is running.

> +  */
> + if (!idle_cpu(cpu) && tick_nohz_tick_stopped_cpu(cpu)) {
> + struct task_struct *curr;
> + u64 delta;
> +
> + rq_lock_irq(rq, &rf);
> + update_rq_clock(rq);
> + curr = rq->curr;
> + delta = rq_clock_task(rq) - curr->se.exec_start;
> + /* Make sure we tick in a reasonable amount of time */
> + WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);


Please add a newline before the comment, and I'd also suggest this wording:

/* Make sure the next tick runs within a reasonable amount of 
time: */

> + /*
> +  * Perform remote tick every second. The arbitrary frequence is
> +  * large enough to avoid overload and short enough to keep sched
> +  * internal stats alive.
> +  */
> + queue_delayed_work(system_unbound_wq, dwork, HZ);
> +}

Typo. I'd also suggest somewhat clearer wording:

/*
 * Run the remote tick once per second (1Hz). This arbitrary
 * frequency is large enough to avoid overload but short enough
 * to keep scheduler internal stats reasonably up to date.
 */

> +#ifdef CONFIG_HOTPLUG_CPU
> +static void sched_tick_stop(int cpu)
> +{
> + struct tick_work *twork;
> +
> + if (housekeeping_cpu(cpu, HK_FLAG_TICK))
> + return;
> +
> + WARN_ON_ONCE(!tick_work_cpu);
> +
> + twork = per_cpu_ptr(tick_work_cpu, cpu);
> + cancel_delayed_work_sync(&twork->work);
> +}
> +#endif /* CONFIG_HOTPLUG_CPU */
> +
> +int __init sched_tick_offload_init(void)
> +{
> + tick_work_cpu = alloc_percpu(struct tick_work);
> + if (!tick_work_cpu) {
> + pr_err("Can't allocate remote tick struct\n");
> + return -ENOMEM;

Printing a warning is not enough. If tick_work_cpu ends up being NULL, then the 
tick will crash AFAICS, due to:

  > +   twork = per_cpu_ptr(tick_work_cpu, cpu);
  > +   cancel_delayed_work_sync(&twork->work);

... it's much better to crash straight away - i.e. we should use panic().

> +#else
> +static void sched_tick_start(int cpu) { }
> +static void sched_tick_stop(int cpu) { }
> +#endif /* CONFIG_NO_HZ_FULL */

So if we are using #if/else/endif markers, please use them in the #else branch 
when it's so short, where they are actually useful:

> +#else /* !CONFIG_NO_HZ_FULL: */
> +static void sched_tick_st

Re: [PATCH] ath9k: turn on btcoex_enable as default

2018-02-08 Thread Kalle Valo
Kai Heng Feng  writes:

> Hi Felix,
>
>> On Feb 8, 2018, at 7:02 PM, Felix Fietkau  wrote:
>>
>> On 2018-02-08 06:28, Kai-Heng Feng wrote:
>>> Without btcoex_enable, WiFi activies make both WiFi and Bluetooth
>>> unstable if there's a bluetooth connection.
>>>
>>> Enable this option when bt_ant_diversity is disabled.
>>>
>>> BugLink: https://bugs.launchpad.net/bugs/1746164
>>> Signed-off-by: Kai-Heng Feng 
>> I think this might cause regressions on devices that don't have
>> bluetooth. This probably either needs more EEPROM checks, or something
>> to selectively enable it only on affected platforms.
>
> I think it’s better not to use dmi_match. This issue should affect
> more ath9k. And bluetooth peripherals are more than ever now, so it
> would be great to use BT out of the box.

Sure, but we have to make sure that we don't create regressions on
existing systems. For example, did you test this with any system which
don't support btcoex? (just asking, haven't tested this myself)

-- 
Kalle Valo


[V9fs-developer] [RFC] we should solve create-unlink-getattr idiom

2018-02-08 Thread jiangyiwen
Hi Eric and Greg,

I encountered the similar problem with create-unlink-getattr idiom.
I use the testcase that create-unlink-setattr idiom, and I see the
bug is reported at https://bugs.launchpad.net/qemu/+bug/1336794.
Then I also see you already fix the issue and push the patch to upstream.
https://github.com/ericvh/linux/commit/eaf70223eac094291169f5a6de580351890162a2
http://patchwork.ozlabs.org/patch/626194/

Unfortunately, the two patches are not merged into master, I don't know
the reason, so I suggest if the patche can be merged into master, and
it will solve the create-unlink-getattr idiom.

Thanks,
Yiwen



Re: [PATCH 6/6] sched/isolation: Tick offload documentation

2018-02-08 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> Update the documentation to reflect the 1Hz tick offload changes.
> 
> Signed-off-by: Frederic Weisbecker 
> Cc: Chris Metcalf 
> Cc: Christoph Lameter 
> Cc: Luiz Capitulino 
> Cc: Mike Galbraith 
> Cc: Paul E. McKenney 
> Cc: Peter Zijlstra 
> Cc: Rik van Riel 
> Cc: Thomas Gleixner 
> Cc: Wanpeng Li 
> Cc: Ingo Molnar 
> ---
>  Documentation/admin-guide/kernel-parameters.txt | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> b/Documentation/admin-guide/kernel-parameters.txt
> index 39ac9d4..c851e41 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1762,7 +1762,11 @@
>   specified in the flag list (default: domain):
>  
>   nohz
> -   Disable the tick when a single task runs.
> +   Disable the tick when a single task runs. A residual 
> 1Hz
> +   tick is offloaded to workqueues that you need to 
> affine
> +   to housekeeping through the sysfs file
> +   /sys/devices/virtual/workqueue/cpumask or using the 
> below
> +   domain flag.

This is pretty ambiguous and somewhat confusing, I'd suggest something like:

nohz
  Disable the tick when a single task runs.

  A residual 1Hz tick is offloaded to workqueues, which 
you 
  need to affine to housekeeping through the global 
  workqueue's affinity configured via the 
  /sys/devices/virtual/workqueue/cpumask sysfs file, or 
  by using the 'domain' flag described below.

  NOTE: by default the global workqueue runs on all 
CPUs, 
  so to protect individual CPUs the 'cpumask' file has 
to 
  be configured manually after bootup.

Assuming what I wrote is correct - the CPU isolation config space is pretty 
confusing all around and should be made a lot more human friendly ...

Thanks,

Ingo


Re: [Qemu-devel] [RFC PATCH] vfio/pci: Add ioeventfd support

2018-02-08 Thread Peter Xu
On Tue, Feb 06, 2018 at 05:08:14PM -0700, Alex Williamson wrote:

[...]

> +long vfio_pci_ioeventfd(struct vfio_pci_device *vdev, loff_t offset,
> + uint64_t data, int count, int fd)
> +{
> + struct pci_dev *pdev = vdev->pdev;
> + loff_t pos = offset & VFIO_PCI_OFFSET_MASK;
> + int ret, bar = VFIO_PCI_OFFSET_TO_INDEX(offset);
> + struct vfio_pci_ioeventfd *ioeventfd;
> + int (*handler)(void *, void *);
> + unsigned long val;
> +
> + /* Only support ioeventfds into BARs */
> + if (bar > VFIO_PCI_BAR5_REGION_INDEX)
> + return -EINVAL;
> +
> + if (pos + count > pci_resource_len(pdev, bar))
> + return -EINVAL;
> +
> + /* Disallow ioeventfds working around MSI-X table writes */
> + if (bar == vdev->msix_bar &&
> + !(pos + count <= vdev->msix_offset ||
> +   pos >= vdev->msix_offset + vdev->msix_size))
> + return -EINVAL;
> +
> + switch (count) {
> + case 1:
> + handler = &vfio_pci_ioeventfd_handler8;
> + val = data;
> + break;
> + case 2:
> + handler = &vfio_pci_ioeventfd_handler16;
> + val = le16_to_cpu(data);
> + break;
> + case 4:
> + handler = &vfio_pci_ioeventfd_handler32;
> + val = le32_to_cpu(data);
> + break;
> +#ifdef iowrite64
> + case 8:
> + handler = &vfio_pci_ioeventfd_handler64;
> + val = le64_to_cpu(data);
> + break;
> +#endif
> + default:
> + return -EINVAL;
> + }
> +
> + ret = vfio_pci_setup_barmap(vdev, bar);
> + if (ret)
> + return ret;
> +
> + mutex_lock(&vdev->ioeventfds_lock);
> +
> + list_for_each_entry(ioeventfd, &vdev->ioeventfds_list, next) {
> + if (ioeventfd->pos == pos && ioeventfd->bar == bar &&
> + ioeventfd->data == data && ioeventfd->count == count) {
> + if (fd == -1) {
> + vfio_virqfd_disable(&ioeventfd->virqfd);
> + list_del(&ioeventfd->next);
> + kfree(ioeventfd);
> + ret = 0;
> + } else
> + ret = -EEXIST;
> +
> + goto out_unlock;
> + }
> + }
> +
> + if (fd < 0) {
> + ret = -ENODEV;
> + goto out_unlock;
> + }
> +
> + ioeventfd = kzalloc(sizeof(*ioeventfd), GFP_KERNEL);
> + if (!ioeventfd) {
> + ret = -ENOMEM;
> + goto out_unlock;
> + }
> +
> + ioeventfd->pos = pos;
> + ioeventfd->bar = bar;
> + ioeventfd->data = data;
> + ioeventfd->count = count;
> +
> + ret = vfio_virqfd_enable(vdev->barmap[ioeventfd->bar] + ioeventfd->pos,
> +  handler, NULL, (void *)val,
> +  &ioeventfd->virqfd, fd);
> + if (ret) {
> + kfree(ioeventfd);
> + goto out_unlock;
> + }
> +
> + list_add(&ioeventfd->next, &vdev->ioeventfds_list);

Is there a limit on how many ioeventfds that can be created?

IIUC we'll create this eventfd "automatically" if a MMIO addr/data
triggered continuously for N=10 times, then would it be safer we have
a limitation on maximum eventfds?  Or not sure whether a malicious
guest can consume the host memory by sending:

- addr1/data1, 10 times
- addr2/data2, 10 times
- ...

To create unlimited ioeventfds?  Thanks,

-- 
Peter Xu


[PATCH 1/2] dt-bindings: clock: reset: Add AXG AO Clock and Reset Bindings

2018-02-08 Thread Yixun Lan
Add dt-bindings headers for the Meson-AXG's AO clock and
reset controller.

CC: 
Signed-off-by: Yixun Lan 
---
 include/dt-bindings/clock/axg-aoclkc.h | 26 ++
 include/dt-bindings/reset/axg-aoclkc.h | 20 
 2 files changed, 46 insertions(+)
 create mode 100644 include/dt-bindings/clock/axg-aoclkc.h
 create mode 100644 include/dt-bindings/reset/axg-aoclkc.h

diff --git a/include/dt-bindings/clock/axg-aoclkc.h 
b/include/dt-bindings/clock/axg-aoclkc.h
new file mode 100644
index ..78683abb4247
--- /dev/null
+++ b/include/dt-bindings/clock/axg-aoclkc.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: (GPL-2.0+ OR BSD) */
+/*
+ * Copyright (c) 2016 BayLibre, SAS
+ * Author: Neil Armstrong 
+ *
+ * Copyright (c) 2018 Amlogic, inc.
+ * Author: Qiufang Dai 
+ */
+
+#ifndef DT_BINDINGS_CLOCK_AMLOGIC_MESON_AXG_AOCLK
+#define DT_BINDINGS_CLOCK_AMLOGIC_MESON_AXG_AOCLK
+
+#define CLKID_AO_REMOTE0
+#define CLKID_AO_I2C_MASTER1
+#define CLKID_AO_I2C_SLAVE 2
+#define CLKID_AO_UART1 3
+#define CLKID_AO_UART2 4
+#define CLKID_AO_IR_BLASTER5
+#define CLKID_AO_SAR_ADC   6
+#define CLKID_AO_CLK81 7
+#define CLKID_AO_SAR_ADC_SEL   8
+#define CLKID_AO_SAR_ADC_DIV   9
+#define CLKID_AO_SAR_ADC_CLK   10
+#define CLKID_AO_ALT_XTAL  11
+
+#endif
diff --git a/include/dt-bindings/reset/axg-aoclkc.h 
b/include/dt-bindings/reset/axg-aoclkc.h
new file mode 100644
index ..307f58161bbb
--- /dev/null
+++ b/include/dt-bindings/reset/axg-aoclkc.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: (GPL-2.0+ OR BSD) */
+/*
+ * Copyright (c) 2016 BayLibre, SAS
+ * Author: Neil Armstrong 
+ *
+ * Copyright (c) 2018 Amlogic, inc.
+ * Author: Qiufang Dai 
+ */
+
+#ifndef DT_BINDINGS_RESET_AMLOGIC_MESON_AXG_AOCLK
+#define DT_BINDINGS_RESET_AMLOGIC_MESON_AXG_AOCLK
+
+#define RESET_AO_REMOTE0
+#define RESET_AO_I2C_MASTER1
+#define RESET_AO_I2C_SLAVE 2
+#define RESET_AO_UART1 3
+#define RESET_AO_UART2 4
+#define RESET_AO_IR_BLASTER5
+
+#endif
-- 
2.15.1



[PATCH] RDMA/nldev: Fix multiple potential NULL pointer dereferences

2018-02-08 Thread Gustavo A. R. Silva
In case the message header and payload cannot be stored, function
nlmsg_put returns null.

Fix this by adding multiple sanity checks and avoid a potential
null dereference on _nlh_ when calling nlmsg_end.

Addresses-Coverity-ID: 1454215 ("Dereference null return value")
Addresses-Coverity-ID: 1454223 ("Dereference null return value")
Addresses-Coverity-ID: 1454224 ("Dereference null return value")
Addresses-Coverity-ID: 1464669 ("Dereference null return value")
Addresses-Coverity-ID: 1464670 ("Dereference null return value")
Addresses-Coverity-ID: 1464672 ("Dereference null return value")
Fixes: e5c9469efcb1 ("RDMA/netlink: Add nldev device doit implementation")
Fixes: c3f66f7b0052 ("RDMA/netlink: Implement nldev port doit callback")
Fixes: 7d02f605f0dc ("RDMA/netlink: Add nldev port dumpit implementation")
Fixes: b5fa635aab8f ("RDMA/nldev: Provide detailed QP information")
Fixes: bf3c5a93c523 ("RDMA/nldev: Provide global resource utilization")
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/infiniband/core/nldev.c | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index 5326a68..dc8f6eb 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -313,6 +313,11 @@ static int nldev_get_doit(struct sk_buff *skb, struct 
nlmsghdr *nlh,
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
0, 0);
+   if (!nlh) {
+   err = -EMSGSIZE;
+   goto err_free;
+
+   }
 
err = fill_dev_info(msg, device);
if (err)
@@ -344,6 +349,8 @@ static int _nldev_get_dumpit(struct ib_device *device,
nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
0, NLM_F_MULTI);
+   if (!nlh)
+   goto out;
 
if (fill_dev_info(skb, device)) {
nlmsg_cancel(skb, nlh);
@@ -354,7 +361,8 @@ static int _nldev_get_dumpit(struct ib_device *device,
 
idx++;
 
-out:   cb->args[0] = idx;
+out:
+   cb->args[0] = idx;
return skb->len;
 }
 
@@ -404,6 +412,10 @@ static int nldev_port_get_doit(struct sk_buff *skb, struct 
nlmsghdr *nlh,
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
0, 0);
+   if (!nlh) {
+   err = -EMSGSIZE;
+   goto err_free;
+   }
 
err = fill_port_info(msg, device, port);
if (err)
@@ -464,6 +476,8 @@ static int nldev_port_get_dumpit(struct sk_buff *skb,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
 RDMA_NLDEV_CMD_PORT_GET),
0, NLM_F_MULTI);
+   if (!nlh)
+   goto out;
 
if (fill_port_info(skb, device, p)) {
nlmsg_cancel(skb, nlh);
@@ -507,6 +521,10 @@ static int nldev_res_get_doit(struct sk_buff *skb, struct 
nlmsghdr *nlh,
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_RES_GET),
0, 0);
+   if (!nlh) {
+   ret = -EMSGSIZE;
+   goto err_free;
+   }
 
ret = fill_res_info(msg, device);
if (ret)
@@ -537,6 +555,8 @@ static int _nldev_res_get_dumpit(struct ib_device *device,
nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_RES_GET),
0, NLM_F_MULTI);
+   if (!nlh)
+   goto out;
 
if (fill_res_info(skb, device)) {
nlmsg_cancel(skb, nlh);
@@ -603,6 +623,10 @@ static int nldev_res_get_qp_dumpit(struct sk_buff *skb,
nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, 
RDMA_NLDEV_CMD_RES_QP_GET),
0, NLM_F_MULTI);
+   if (!nlh) {
+   ret = -EMSGSIZE;
+   goto err_index;
+   }
 
if (fill_nldev_handle(skb, device)) {
ret = -EMSGSIZE;
-- 
2.7.4



[PATCH 0/2] clk: meson-axg: Add AO Cloclk and Reset driver

2018-02-08 Thread Yixun Lan
  This patch try to add AO clock and Reset driver in Amlogic's
Meson-AXG SoC.

  Please note this patchset actually depend on the clock regmap
conversion series [1].

[1] clk: meson: use regmap in clock controllers
 https://lkml.kernel.org/r/20180131180945.18025-1-jbru...@baylibre.com


Yixun Lan (2):
  dt-bindings: clock: reset: Add AXG AO Clock and Reset Bindings
  clk: meson-axg: Add AO Clock and Reset controller driver

 drivers/clk/meson/Makefile |   2 +-
 drivers/clk/meson/axg-aoclk.c  | 236 +
 drivers/clk/meson/axg-aoclk.h  |  25 
 include/dt-bindings/clock/axg-aoclkc.h |  26 
 include/dt-bindings/reset/axg-aoclkc.h |  20 +++
 5 files changed, 308 insertions(+), 1 deletion(-)
 create mode 100644 drivers/clk/meson/axg-aoclk.c
 create mode 100644 drivers/clk/meson/axg-aoclk.h
 create mode 100644 include/dt-bindings/clock/axg-aoclkc.h
 create mode 100644 include/dt-bindings/reset/axg-aoclkc.h

-- 
2.15.1



Re: [PATCH 0/6] isolation: 1Hz residual tick offloading v5

2018-02-08 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

>   sched/isolation: Residual 1Hz scheduler tick offload
>   sched/isolation: Tick offload documentation

Please try to start each title with a verb.

[ ... and preferably not by prepending 'do' ;-) ]

Beyond making changelogs more consistent, this will actually also add real 
information to the title, because, for example, any of these possible variants:

   sched/isolation: Fix tick offload documentation
   sched/isolation: Update tick offload documentation
   sched/isolation: Add tick offload documentation
   sched/isolation: Remove tick offload documentation

   sched/isolation: Fix residual 1Hz scheduler tick offload
   sched/isolation: Update residual 1Hz scheduler tick offload
   sched/isolation: Introduce residual 1Hz scheduler tick offload
   sched/isolation: Remove residual 1Hz scheduler tick offload

will tell us _a lot more_ about the nature of the changes from the shortlog 
alone!

Thanks,

Ingo


[PATCH 2/2] clk: meson-axg: Add AO Clock and Reset controller driver

2018-02-08 Thread Yixun Lan
Adds a Clock and Reset controller driver for the Always-On part
of the Amlogic Meson-AXG SoC.

Signed-off-by: Qiufang Dai 
Signed-off-by: Yixun Lan 
---
 drivers/clk/meson/Makefile|   2 +-
 drivers/clk/meson/axg-aoclk.c | 236 ++
 drivers/clk/meson/axg-aoclk.h |  25 +
 3 files changed, 262 insertions(+), 1 deletion(-)
 create mode 100644 drivers/clk/meson/axg-aoclk.c
 create mode 100644 drivers/clk/meson/axg-aoclk.h

diff --git a/drivers/clk/meson/Makefile b/drivers/clk/meson/Makefile
index 11f99139b844..c7510744406a 100644
--- a/drivers/clk/meson/Makefile
+++ b/drivers/clk/meson/Makefile
@@ -6,6 +6,6 @@ obj-$(CONFIG_COMMON_CLK_AMLOGIC) += clk-pll.o clk-mpll.o 
clk-audio-divider.o
 obj-$(CONFIG_COMMON_CLK_AMLOGIC) += clk-phase.o
 obj-$(CONFIG_COMMON_CLK_MESON8B) += meson8b.o
 obj-$(CONFIG_COMMON_CLK_GXBB)   += gxbb.o gxbb-aoclk.o gxbb-aoclk-32k.o
-obj-$(CONFIG_COMMON_CLK_AXG)+= axg.o
+obj-$(CONFIG_COMMON_CLK_AXG)+= axg.o axg-aoclk.o
 obj-$(CONFIG_COMMON_CLK_AXG_AUDIO) += axg-audio.o
 obj-$(CONFIG_COMMON_CLK_REGMAP_MESON)  += clk-regmap.o
diff --git a/drivers/clk/meson/axg-aoclk.c b/drivers/clk/meson/axg-aoclk.c
new file mode 100644
index ..832aa19dd76c
--- /dev/null
+++ b/drivers/clk/meson/axg-aoclk.c
@@ -0,0 +1,236 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * AmLogic Meson-AXG Clock Controller Driver
+ *
+ * Copyright (c) 2016 Baylibre SAS.
+ * Author: Michael Turquette 
+ *
+ * Copyright (c) 2018 Amlogic, inc.
+ * Author: Qiufang Dai 
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "clkc.h"
+#include "axg-aoclk.h"
+
+struct axg_aoclk_reset_controller {
+   struct reset_controller_dev reset;
+   unsigned int *data;
+   struct regmap *regmap;
+};
+
+static int axg_aoclk_do_reset(struct reset_controller_dev *rcdev,
+  unsigned long id)
+{
+   struct axg_aoclk_reset_controller *reset =
+   container_of(rcdev, struct axg_aoclk_reset_controller, reset);
+
+   return regmap_write(reset->regmap, AO_RTI_GEN_CNTL_REG0,
+   BIT(reset->data[id]));
+}
+
+static const struct reset_control_ops axg_aoclk_reset_ops = {
+   .reset = axg_aoclk_do_reset,
+};
+
+#define AXG_AO_GATE(_name, _bit)   \
+static struct clk_regmap _name##_ao = {
\
+   .data = &(struct clk_regmap_gate_data) {\
+   .offset = (AO_RTI_GEN_CNTL_REG0),   \
+   .bit_idx = (_bit),  \
+   },  \
+   .hw.init = &(struct clk_init_data) {\
+   .name = #_name "_ao",   \
+   .ops = &clk_regmap_gate_ops,\
+   .parent_names = (const char *[]){ "clk81" },\
+   .num_parents = 1,   \
+   .flags = (CLK_SET_RATE_PARENT | CLK_IGNORE_UNUSED), \
+   },  \
+}
+
+AXG_AO_GATE(remote, 0);
+AXG_AO_GATE(i2c_master, 1);
+AXG_AO_GATE(i2c_slave, 2);
+AXG_AO_GATE(uart1, 3);
+AXG_AO_GATE(uart2, 5);
+AXG_AO_GATE(ir_blaster, 6);
+AXG_AO_GATE(saradc, 7);
+
+static struct clk_fixed_rate ao_alt_xtal = {
+   .fixed_rate = 32000,
+   .hw.init = &(struct clk_init_data){
+   .name = "ao_alt_xtal",
+   .num_parents = 0,
+   .ops = &clk_fixed_rate_ops,
+   },
+};
+
+static struct clk_regmap ao_clk81 = {
+   .data = &(struct clk_regmap_mux_data) {
+   .offset = AO_RTI_PWR_CNTL_REG0,
+   .mask = 0x1,
+   .shift = 8,
+   },
+   .hw.init = &(struct clk_init_data){
+   .name = "ao_clk81",
+   .ops = &clk_regmap_mux_ro_ops,
+   .parent_names = (const char *[]){ "clk81", "ao_alt_xtal"},
+   .num_parents = 2,
+   },
+};
+
+static struct clk_regmap axg_saradc_mux = {
+   .data = &(struct clk_regmap_mux_data) {
+   .offset = AO_SAR_CLK,
+   .mask = 0x3,
+   .shift = 9,
+   },
+   .hw.init = &(struct clk_init_data){
+   .name = "axg_saradc_mux",
+   .ops = &clk_regmap_mux_ops,
+   .parent_names = (const char *[]){ "xtal", "ao_clk81" },
+   .num_parents = 2,
+   },
+};
+
+static struct clk_regmap axg_saradc_div = {
+   .data = &(struct clk_regmap_div_data) {
+   .offset = AO_SAR_CLK,
+   .shift = 0,
+   .width = 8,
+   },
+   .hw.init = &(struct clk_init_data){
+   .name = "axg_saradc_div",
+   .ops = &clk_regmap_divider_ops,
+   .parent_n

[PATCH] ASoC: use DEFINE_SHOW_ATTRIBUTE() to decrease code duplication

2018-02-08 Thread Donglin Peng
There is some duplicate code in soc-core.c,and the kernel provides
DEFINE_SHOW_ATTRIBUTE() helper macro to decrease it in seq_file.h.

Signed-off-by: Peng Donglin 
---
 sound/soc/soc-core.c | 51 +--
 1 file changed, 9 insertions(+), 42 deletions(-)

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index 96c44f6576c9..cb52d1e8e0b9 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -349,33 +349,22 @@ static void soc_init_codec_debugfs(struct
snd_soc_component *component)
 "ASoC: Failed to create codec register debugfs file\n");
 }

-static int codec_list_seq_show(struct seq_file *m, void *v)
+static int codec_list_show(struct seq_file *s, void *v)
 {
 struct snd_soc_codec *codec;

 mutex_lock(&client_mutex);

 list_for_each_entry(codec, &codec_list, list)
-seq_printf(m, "%s\n", codec->component.name);
+seq_printf(s, "%s\n", codec->component.name);

 mutex_unlock(&client_mutex);

 return 0;
 }
+DEFINE_SHOW_ATTRIBUTE(codec_list);

-static int codec_list_seq_open(struct inode *inode, struct file *file)
-{
-return single_open(file, codec_list_seq_show, NULL);
-}
-
-static const struct file_operations codec_list_fops = {
-.open = codec_list_seq_open,
-.read = seq_read,
-.llseek = seq_lseek,
-.release = single_release,
-};
-
-static int dai_list_seq_show(struct seq_file *m, void *v)
+static int dai_list_show(struct seq_file *s, void *v)
 {
 struct snd_soc_component *component;
 struct snd_soc_dai *dai;
@@ -384,50 +373,28 @@ static int dai_list_seq_show(struct seq_file *m, void *v)

 list_for_each_entry(component, &component_list, list)
 list_for_each_entry(dai, &component->dai_list, list)
-seq_printf(m, "%s\n", dai->name);
+seq_printf(s, "%s\n", dai->name);

 mutex_unlock(&client_mutex);

 return 0;
 }
+DEFINE_SHOW_ATTRIBUTE(dai_list);

-static int dai_list_seq_open(struct inode *inode, struct file *file)
-{
-return single_open(file, dai_list_seq_show, NULL);
-}
-
-static const struct file_operations dai_list_fops = {
-.open = dai_list_seq_open,
-.read = seq_read,
-.llseek = seq_lseek,
-.release = single_release,
-};
-
-static int platform_list_seq_show(struct seq_file *m, void *v)
+static int platform_list_show(struct seq_file *s, void *v)
 {
 struct snd_soc_platform *platform;

 mutex_lock(&client_mutex);

 list_for_each_entry(platform, &platform_list, list)
-seq_printf(m, "%s\n", platform->component.name);
+seq_printf(s, "%s\n", platform->component.name);

 mutex_unlock(&client_mutex);

 return 0;
 }
-
-static int platform_list_seq_open(struct inode *inode, struct file *file)
-{
-return single_open(file, platform_list_seq_show, NULL);
-}
-
-static const struct file_operations platform_list_fops = {
-.open = platform_list_seq_open,
-.read = seq_read,
-.llseek = seq_lseek,
-.release = single_release,
-};
+DEFINE_SHOW_ATTRIBUTE(platform_list);

 static void soc_init_card_debugfs(struct snd_soc_card *card)
 {
-- 
2.16.1


Re: [PATCH 3/6] sched/isolation: Isolate workqueues when "nohz_full=" is set

2018-02-08 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> - flags = HK_FLAG_TICK | HK_FLAG_TIMER | HK_FLAG_RCU | HK_FLAG_MISC;
> + flags = HK_FLAG_TICK | HK_FLAG_WQ | HK_FLAG_TIMER |
> + HK_FLAG_RCU | HK_FLAG_MISC;

> - cpumask_copy(wq_unbound_cpumask, housekeeping_cpumask(HK_FLAG_DOMAIN));
> + cpumask_copy(wq_unbound_cpumask,
> +  housekeeping_cpumask(HK_FLAG_DOMAIN | HK_FLAG_WQ));

LGTM, but _please_ don't do these ugly line-breaks, just keep it slightly over 
col80.

Thanks,

Ingo


[mainline][ppc - bare-metal ] memory hotunplug operation results in kernel Oops

2018-02-08 Thread Abdul Haleem
Greetings,

Todays mainline kernel has Oops messages for memory hot-unplug
operation.

Machine: Power 8 bare-metal 
Kernel: 4.15.0
Config: attached
gcc: 4.8.5
Test: Memory hot-unplug
echo offline > /sys/devices/system/memory/memory/state

the above command triggered 2 kernel Oops messages and the bad address
from first Oops maps to:

# gdb -batch vmlinux -ex 'list *(0xc0a15a18)'
0xc0a15a18 is in _raw_spin_lock
(./arch/powerpc/include/asm/spinlock.h:82).
77   */
78  static inline unsigned long __arch_spin_trylock(arch_spinlock_t
*lock)
79  {
80  unsigned long tmp, token;
81
82  token = LOCK_TOKEN;
83  __asm__ __volatile__(
84  "1: " PPC_LWARX(%0,0,%2,1) "\n\
85  cmpwi   0,%0,0\n\
86  bne-2f\n\

and the second Oops with bad address maps to:

# gdb -batch vmlinux -ex 'list *(0xc029f7c8)'
0xc029f7c8 is in page_vma_mapped_walk
(./arch/powerpc/include/asm/book3s/64/pgtable.h:571).
566 }
567 #endif /* CONFIG_NUMA_BALANCING */
568 
569 static inline int pte_present(pte_t pte)
570 {
571 return !!(pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT));
572 }
573 
574 #ifdef CONFIG_PPC_MEM_KEYS
575 extern bool arch_pte_access_permitted(u64 pte, bool write, bool
execute);


traces logs:
Offlined Pages 4096
Offlined Pages 4096
Offlined Pages 4096
Unable to handle kernel paging request for data at address 0xf0004030
Faulting instruction address: 0xc0a15a18
Oops: Kernel access of bad area, sig: 11 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge
stp llc kvm_hv kvm iptable_filter vmx_crypto ipmi_powernv ipmi_devintf
ipmi_msghandler powernv_rng leds_powernv led_class powernv_op_panel
rng_core nfsd binfmt_misc ip_tables x_tables autofs4
CPU: 2 PID: 50585 Comm: stress Not tainted 4.15.0-11704-ga2e5790-dirty #1
NIP:  c0a15a18 LR: c028ea14 CTR: c0280ca0
REGS: c007f704f8a0 TRAP: 0300   Not tainted  (4.15.0-11704-ga2e5790-dirty)
MSR:  9280b033   CR: 28824828  XER: 

CFAR: c000884c DAR: f0004030 DSISR: 4000 SOFTE: 0
GPR00: c028ea14 c007f704fb20 c10e3300 f0004030
GPR04: 0020bc02 02bc2000  04b0
GPR08: c100 00080040 8002 c000
GPR12: 4400 cfd00c00 00011096 
GPR16: c007f704c000  c1281b70 c007a11ccf00
GPR20:   c10004b0 fe7fefff
GPR24: c007e861a500 00011096 f0004000 
GPR28: c007f1518880 c007f0720880 00011097 f0004030
NIP [c0a15a18] _raw_spin_lock+0x28/0xc0
LR [c028ea14] copy_page_range+0x604/0x1390
Call Trace:
[c007f704fb50] [c028ea14] copy_page_range+0x604/0x1390
[c007f704fce0] [c00ea84c] copy_process.isra.40.part.41+0xbdc/0x18b0
[c007f704fdc0] [c00eb704] _do_fork+0xd4/0x4a0
[c007f704fe30] [c000bbc8] ppc_clone+0x8/0xc
Instruction dump:
990d028c 4bc8 3c4c006d 3842d910 7c0802a6 fbe1fff8 7c7f1b78 f8010010
f821ffd1 3940 994d028c 814d0008 <7d201829> 2c09 40c20010 7d40192d
---[ end trace b21abd323ba17f9c ]---
Unable to handle kernel paging request for data at address 0xc10004a8
Faulting instruction address: 0xc029f7c8
Oops: Kernel access of bad area, sig: 11 [#2]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge
stp llc kvm_hv kvm iptable_filter vmx_crypto ipmi_powernv ipmi_devintf
ipmi_msghandler powernv_rng leds_powernv led_class powernv_op_panel
rng_core nfsd binfmt_misc ip_tables x_tables autofs4
CPU: 14 PID: 1025 Comm: kswapd0 Tainted: G  D  
4.15.0-11704-ga2e5790-dirty #1
NIP:  c029f7c8 LR: c029f39c CTR: c02a1170
REGS: c007f1d0f580 TRAP: 0380   Tainted: G  D   
(4.15.0-11704-ga2e5790-dirty)
MSR:  90009033   CR: 28002084  XER: 
CFAR: c029f624 SOFTE: 0
GPR00: 0007e8fad800 c007f1d0f800 c10e3300 000a
GPR04:  c007f0720880  
GPR08: c10004a8 04a8 c100 
GPR12: c02a1170 cfd05400 f1f61ca0 c007fe00
GPR16: f1f61c80 c0076300 c007fc055808 0003
GPR20: c0086300 c007f1d0fa30 0001 
GPR24: c007a11ccf00  0001 f1f61c80
GPR28:  f072 c128

Re: [PATCH 1/6] sched: Rename init_rq_hrtick to hrtick_rq_init

2018-02-08 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> Do that rename in order to normalize the hrtick namespace.
> 
> Signed-off-by: Frederic Weisbecker 
> Cc: Chris Metcalf 
> Cc: Christoph Lameter 
> Cc: Luiz Capitulino 
> Cc: Mike Galbraith 
> Cc: Paul E. McKenney 
> Cc: Peter Zijlstra 
> Cc: Rik van Riel 
> Cc: Thomas Gleixner 
> Cc: Wanpeng Li 
> Cc: Ingo Molnar 
> ---
>  kernel/sched/core.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 36f113a..fc9fa25 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -333,7 +333,7 @@ void hrtick_start(struct rq *rq, u64 delay)
>  }
>  #endif /* CONFIG_SMP */
>  
> -static void init_rq_hrtick(struct rq *rq)
> +static void hrtick_rq_init(struct rq *rq)

On a related note, I think we should also do:

s/start_hrtick_dl
 /sched_dl_hrtick_start

or such. (In a separate patch)

Thanks,

Ingo


[PATCH] ASoC: use DEFINE_SHOW_ATTRIBUTE() to decrease code duplication

2018-02-08 Thread Peng Donglin
There is some duplicate code in soc-core.c,and the kernel provides
DEFINE_SHOW_ATTRIBUTE() helper macro to decrease it in seq_file.h.

Signed-off-by: Peng Donglin 
---
 sound/soc/soc-core.c | 51 +--
 1 file changed, 9 insertions(+), 42 deletions(-)

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index 96c44f6576c9..cb52d1e8e0b9 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -349,33 +349,22 @@ static void soc_init_codec_debugfs(struct 
snd_soc_component *component)
"ASoC: Failed to create codec register debugfs file\n");
 }
 
-static int codec_list_seq_show(struct seq_file *m, void *v)
+static int codec_list_show(struct seq_file *s, void *v)
 {
struct snd_soc_codec *codec;
 
mutex_lock(&client_mutex);
 
list_for_each_entry(codec, &codec_list, list)
-   seq_printf(m, "%s\n", codec->component.name);
+   seq_printf(s, "%s\n", codec->component.name);
 
mutex_unlock(&client_mutex);
 
return 0;
 }
+DEFINE_SHOW_ATTRIBUTE(codec_list);
 
-static int codec_list_seq_open(struct inode *inode, struct file *file)
-{
-   return single_open(file, codec_list_seq_show, NULL);
-}
-
-static const struct file_operations codec_list_fops = {
-   .open = codec_list_seq_open,
-   .read = seq_read,
-   .llseek = seq_lseek,
-   .release = single_release,
-};
-
-static int dai_list_seq_show(struct seq_file *m, void *v)
+static int dai_list_show(struct seq_file *s, void *v)
 {
struct snd_soc_component *component;
struct snd_soc_dai *dai;
@@ -384,50 +373,28 @@ static int dai_list_seq_show(struct seq_file *m, void *v)
 
list_for_each_entry(component, &component_list, list)
list_for_each_entry(dai, &component->dai_list, list)
-   seq_printf(m, "%s\n", dai->name);
+   seq_printf(s, "%s\n", dai->name);
 
mutex_unlock(&client_mutex);
 
return 0;
 }
+DEFINE_SHOW_ATTRIBUTE(dai_list);
 
-static int dai_list_seq_open(struct inode *inode, struct file *file)
-{
-   return single_open(file, dai_list_seq_show, NULL);
-}
-
-static const struct file_operations dai_list_fops = {
-   .open = dai_list_seq_open,
-   .read = seq_read,
-   .llseek = seq_lseek,
-   .release = single_release,
-};
-
-static int platform_list_seq_show(struct seq_file *m, void *v)
+static int platform_list_show(struct seq_file *s, void *v)
 {
struct snd_soc_platform *platform;
 
mutex_lock(&client_mutex);
 
list_for_each_entry(platform, &platform_list, list)
-   seq_printf(m, "%s\n", platform->component.name);
+   seq_printf(s, "%s\n", platform->component.name);
 
mutex_unlock(&client_mutex);
 
return 0;
 }
-
-static int platform_list_seq_open(struct inode *inode, struct file *file)
-{
-   return single_open(file, platform_list_seq_show, NULL);
-}
-
-static const struct file_operations platform_list_fops = {
-   .open = platform_list_seq_open,
-   .read = seq_read,
-   .llseek = seq_lseek,
-   .release = single_release,
-};
+DEFINE_SHOW_ATTRIBUTE(platform_list);
 
 static void soc_init_card_debugfs(struct snd_soc_card *card)
 {
-- 
2.16.1



Re: [PATCH] Input: gpio_keys: Add level trigger support for GPIO keys

2018-02-08 Thread Dmitry Torokhov
On Thu, Feb 8, 2018 at 10:08 PM, Baolin Wang  wrote:
> On some platforms (such as Spreadtrum platform), the GPIO keys can only
> be triggered by level type.

How do you stop the interrupt from re-triggering as long as the key
stays pressed?

Thanks.

-- 
Dmitry


[PATCH v10 1/8] v4l2-dv-timings: add v4l2_hdmi_colorimetry()

2018-02-08 Thread Tim Harvey
From: Hans Verkuil 

Add the v4l2_hdmi_colorimetry() function so we have a single function
that determines the colorspace, YCbCr encoding, quantization range and
transfer function from the InfoFrame data.

Cc: Randy Dunlap 
Signed-off-by: Hans Verkuil 
---
v9:
 - fix kernel-doc format (Randy)

 drivers/media/v4l2-core/v4l2-dv-timings.c | 141 ++
 include/media/v4l2-dv-timings.h   |  21 +
 2 files changed, 162 insertions(+)

diff --git a/drivers/media/v4l2-core/v4l2-dv-timings.c 
b/drivers/media/v4l2-core/v4l2-dv-timings.c
index 930f9c5..5663d86 100644
--- a/drivers/media/v4l2-core/v4l2-dv-timings.c
+++ b/drivers/media/v4l2-core/v4l2-dv-timings.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 MODULE_AUTHOR("Hans Verkuil");
 MODULE_DESCRIPTION("V4L2 DV Timings Helper Functions");
@@ -814,3 +815,143 @@ struct v4l2_fract v4l2_calc_aspect_ratio(u8 
hor_landscape, u8 vert_portrait)
return aspect;
 }
 EXPORT_SYMBOL_GPL(v4l2_calc_aspect_ratio);
+
+/** v4l2_hdmi_rx_colorimetry - determine HDMI colorimetry information
+ * based on various InfoFrames.
+ * @avi: the AVI InfoFrame
+ * @hdmi: the HDMI Vendor InfoFrame, may be NULL
+ * @height: the frame height
+ *
+ * Determines the HDMI colorimetry information, i.e. how the HDMI
+ * pixel color data should be interpreted.
+ *
+ * Note that some of the newer features (DCI-P3, HDR) are not yet
+ * implemented: the hdmi.h header needs to be updated to the HDMI 2.0
+ * and CTA-861-G standards.
+ */
+struct v4l2_hdmi_colorimetry
+v4l2_hdmi_rx_colorimetry(const struct hdmi_avi_infoframe *avi,
+const struct hdmi_vendor_infoframe *hdmi,
+unsigned int height)
+{
+   struct v4l2_hdmi_colorimetry c = {
+   V4L2_COLORSPACE_SRGB,
+   V4L2_YCBCR_ENC_DEFAULT,
+   V4L2_QUANTIZATION_FULL_RANGE,
+   V4L2_XFER_FUNC_SRGB
+   };
+   bool is_ce = avi->video_code || (hdmi && hdmi->vic);
+   bool is_sdtv = height <= 576;
+   bool default_is_lim_range_rgb = avi->video_code > 1;
+
+   switch (avi->colorspace) {
+   case HDMI_COLORSPACE_RGB:
+   /* RGB pixel encoding */
+   switch (avi->colorimetry) {
+   case HDMI_COLORIMETRY_EXTENDED:
+   switch (avi->extended_colorimetry) {
+   case HDMI_EXTENDED_COLORIMETRY_ADOBE_RGB:
+   c.colorspace = V4L2_COLORSPACE_ADOBERGB;
+   c.xfer_func = V4L2_XFER_FUNC_ADOBERGB;
+   break;
+   case HDMI_EXTENDED_COLORIMETRY_BT2020:
+   c.colorspace = V4L2_COLORSPACE_BT2020;
+   c.xfer_func = V4L2_XFER_FUNC_709;
+   break;
+   default:
+   break;
+   }
+   break;
+   default:
+   break;
+   }
+   switch (avi->quantization_range) {
+   case HDMI_QUANTIZATION_RANGE_LIMITED:
+   c.quantization = V4L2_QUANTIZATION_LIM_RANGE;
+   break;
+   case HDMI_QUANTIZATION_RANGE_FULL:
+   break;
+   default:
+   if (default_is_lim_range_rgb)
+   c.quantization = V4L2_QUANTIZATION_LIM_RANGE;
+   break;
+   }
+   break;
+
+   default:
+   /* YCbCr pixel encoding */
+   c.quantization = V4L2_QUANTIZATION_LIM_RANGE;
+   switch (avi->colorimetry) {
+   case HDMI_COLORIMETRY_NONE:
+   if (!is_ce)
+   break;
+   if (is_sdtv) {
+   c.colorspace = V4L2_COLORSPACE_SMPTE170M;
+   c.ycbcr_enc = V4L2_YCBCR_ENC_601;
+   } else {
+   c.colorspace = V4L2_COLORSPACE_REC709;
+   c.ycbcr_enc = V4L2_YCBCR_ENC_709;
+   }
+   c.xfer_func = V4L2_XFER_FUNC_709;
+   break;
+   case HDMI_COLORIMETRY_ITU_601:
+   c.colorspace = V4L2_COLORSPACE_SMPTE170M;
+   c.ycbcr_enc = V4L2_YCBCR_ENC_601;
+   c.xfer_func = V4L2_XFER_FUNC_709;
+   break;
+   case HDMI_COLORIMETRY_ITU_709:
+   c.colorspace = V4L2_COLORSPACE_REC709;
+   c.ycbcr_enc = V4L2_YCBCR_ENC_709;
+   c.xfer_func = V4L2_XFER_FUNC_709;
+   break;
+   case HDMI_COLORIMETRY_EXTENDED:
+   switch (avi->extended_colorimetry) {
+   case HDMI_EXTEND

[PATCH v10 3/8] media: add digital video decoder entity functions

2018-02-08 Thread Tim Harvey
Add a new media entity function definition for digital TV decoders:
MEDIA_ENT_F_DTV_DECODER

Signed-off-by: Tim Harvey 
---
 Documentation/media/uapi/mediactl/media-types.rst | 11 +++
 include/uapi/linux/media.h|  5 +
 2 files changed, 16 insertions(+)

diff --git a/Documentation/media/uapi/mediactl/media-types.rst 
b/Documentation/media/uapi/mediactl/media-types.rst
index 8d64b0c..195400e 100644
--- a/Documentation/media/uapi/mediactl/media-types.rst
+++ b/Documentation/media/uapi/mediactl/media-types.rst
@@ -321,6 +321,17 @@ Types and flags used to represent the media graph elements
  MIPI CSI-2, ...), and outputs them on its source pad to an output
  video bus of another type (eDP, MIPI CSI-2, parallel, ...).
 
+-  ..  row 31
+
+   ..  _MEDIA-ENT-F-DTV-DECODER:
+
+   -  ``MEDIA_ENT_F_DTV_DECODER``
+
+   -  Digital video decoder. The basic function of the video decoder is
+ to accept digital video from a wide variety of sources
+ and output it in some digital video standard, with appropriate
+ timing signals.
+
 ..  tabularcolumns:: |p{5.5cm}|p{12.0cm}|
 
 .. _media-entity-flag:
diff --git a/include/uapi/linux/media.h b/include/uapi/linux/media.h
index b9b9446..2f12328 100644
--- a/include/uapi/linux/media.h
+++ b/include/uapi/linux/media.h
@@ -110,6 +110,11 @@ struct media_device_info {
 #define MEDIA_ENT_F_VID_IF_BRIDGE  (MEDIA_ENT_F_BASE + 0x5002)
 
 /*
+ * Digital video decoder entities
+ */
+#define MEDIA_ENT_F_DTV_DECODER(MEDIA_ENT_F_BASE + 
0x6001)
+
+/*
  * Connectors
  */
 /* It is a responsibility of the entity drivers to add connectors and links */
-- 
2.7.4



[PATCH v10 5/8] media: dt-bindings: Add bindings for TDA1997X

2018-02-08 Thread Tim Harvey
Acked-by: Rob Herring 
Acked-by: Sakari Ailus 
Signed-off-by: Tim Harvey 
---
v6:
 - replace copyright with SPDX tag
 - added Rob's ack

v5:
 - added Sakari's ack

v4:
 - move include/dt-bindings/media/tda1997x.h to bindings patch
 - clarify port node details

v3:
 - fix typo

v2:
 - add vendor prefix and remove _ from vidout-portcfg
 - remove _ from labels
 - remove max-pixel-rate property
 - describe and provide example for single output port
 - update to new audio port bindings

 .../devicetree/bindings/media/i2c/tda1997x.txt | 179 +
 include/dt-bindings/media/tda1997x.h   |  74 +
 2 files changed, 253 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/i2c/tda1997x.txt
 create mode 100644 include/dt-bindings/media/tda1997x.h

diff --git a/Documentation/devicetree/bindings/media/i2c/tda1997x.txt 
b/Documentation/devicetree/bindings/media/i2c/tda1997x.txt
new file mode 100644
index 000..9ab53c3
--- /dev/null
+++ b/Documentation/devicetree/bindings/media/i2c/tda1997x.txt
@@ -0,0 +1,179 @@
+Device-Tree bindings for the NXP TDA1997x HDMI receiver
+
+The TDA19971/73 are HDMI video receivers.
+
+The TDA19971 Video port output pins can be used as follows:
+ - RGB 8bit per color (24 bits total): R[11:4] B[11:4] G[11:4]
+ - YUV444 8bit per color (24 bits total): Y[11:4] Cr[11:4] Cb[11:4]
+ - YUV422 semi-planar 8bit per component (16 bits total): Y[11:4] CbCr[11:4]
+ - YUV422 semi-planar 10bit per component (20 bits total): Y[11:2] CbCr[11:2]
+ - YUV422 semi-planar 12bit per component (24 bits total): - Y[11:0] CbCr[11:0]
+ - YUV422 BT656 8bit per component (8 bits total): YCbCr[11:4] (2-cycles)
+ - YUV422 BT656 10bit per component (10 bits total): YCbCr[11:2] (2-cycles)
+ - YUV422 BT656 12bit per component (12 bits total): YCbCr[11:0] (2-cycles)
+
+The TDA19973 Video port output pins can be used as follows:
+ - RGB 12bit per color (36 bits total): R[11:0] B[11:0] G[11:0]
+ - YUV444 12bit per color (36 bits total): Y[11:0] Cb[11:0] Cr[11:0]
+ - YUV422 semi-planar 12bit per component (24 bits total): Y[11:0] CbCr[11:0]
+ - YUV422 BT656 12bit per component (12 bits total): YCbCr[11:0] (2-cycles)
+
+The Video port output pins are mapped via 4-bit 'pin groups' allowing
+for a variety of connection possibilities including swapping pin order within
+pin groups. The video_portcfg device-tree property consists of register mapping
+pairs which map a chip-specific VP output register to a 4-bit pin group. If
+the pin group needs to be bit-swapped you can use the *_S pin-group defines.
+
+Required Properties:
+ - compatible  :
+  - "nxp,tda19971" for the TDA19971
+  - "nxp,tda19973" for the TDA19973
+ - reg : I2C slave address
+ - interrupts  : The interrupt number
+ - DOVDD-supply: Digital I/O supply
+ - DVDD-supply : Digital Core supply
+ - AVDD-supply : Analog supply
+ - nxp,vidout-portcfg  : array of pairs mapping VP output pins to pin groups.
+
+Optional Properties:
+ - nxp,audout-format   : DAI bus format: "i2s" or "spdif".
+ - nxp,audout-width: width of audio output data bus (1-4).
+ - nxp,audout-layout   : data layout (0=AP0 used, 1=AP0/AP1/AP2/AP3 used).
+ - nxp,audout-mclk-fs  : Multiplication factor between stream rate and codec
+ mclk.
+
+The port node shall contain one endpoint child node for its digital
+output video port, in accordance with the video interface bindings defined in
+Documentation/devicetree/bindings/media/video-interfaces.txt.
+
+Optional Endpoint Properties:
+  The following three properties are defined in video-interfaces.txt and
+  are valid for the output parallel bus endpoint:
+  - hsync-active: Horizontal synchronization polarity. Defaults to active high.
+  - vsync-active: Vertical synchronization polarity. Defaults to active high.
+  - data-active: Data polarity. Defaults to active high.
+
+Examples:
+ - VP[15:0] connected to IMX6 CSI_DATA[19:4] for 16bit YUV422
+   16bit I2S layout0 with a 128*fs clock (A_WS, AP0, A_CLK pins)
+   hdmi-receiver@48 {
+   compatible = "nxp,tda19971";
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_tda1997x>;
+   reg = <0x48>;
+   interrupt-parent = <&gpio1>;
+   interrupts = <7 IRQ_TYPE_LEVEL_LOW>;
+   DOVDD-supply = <®_3p3v>;
+   AVDD-supply = <®_1p8v>;
+   DVDD-supply = <®_1p8v>;
+   /* audio */
+   #sound-dai-cells = <0>;
+   nxp,audout-format = "i2s";
+   nxp,audout-layout = <0>;
+   nxp,audout-width = <16>;
+   nxp,audout-mclk-fs = <128>;
+   /*
+* The 8bpp YUV422 semi-planar mode outputs CbCr[11:4]
+* and Y[11:4] across 16bits in the same pixclk cycle.
+*/
+   nxp,vidout-portcfg =
+   /* Y[11:8]<->VP[15:12]<->CSI_DATA[1

[PATCH v10 2/8] media: v4l-ioctl: fix clearing pad for VIDIOC_DV_TIMIGNS_CAP

2018-02-08 Thread Tim Harvey
Signed-off-by: Tim Harvey 
---
 drivers/media/v4l2-core/v4l2-ioctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c 
b/drivers/media/v4l2-core/v4l2-ioctl.c
index 7961499..5f3670d 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -2638,7 +2638,7 @@ static struct v4l2_ioctl_info v4l2_ioctls[] = {
IOCTL_INFO_FNC(VIDIOC_PREPARE_BUF, v4l_prepare_buf, v4l_print_buffer, 
INFO_FL_QUEUE),
IOCTL_INFO_STD(VIDIOC_ENUM_DV_TIMINGS, vidioc_enum_dv_timings, 
v4l_print_enum_dv_timings, INFO_FL_CLEAR(v4l2_enum_dv_timings, pad)),
IOCTL_INFO_STD(VIDIOC_QUERY_DV_TIMINGS, vidioc_query_dv_timings, 
v4l_print_dv_timings, INFO_FL_ALWAYS_COPY),
-   IOCTL_INFO_STD(VIDIOC_DV_TIMINGS_CAP, vidioc_dv_timings_cap, 
v4l_print_dv_timings_cap, INFO_FL_CLEAR(v4l2_dv_timings_cap, type)),
+   IOCTL_INFO_STD(VIDIOC_DV_TIMINGS_CAP, vidioc_dv_timings_cap, 
v4l_print_dv_timings_cap, INFO_FL_CLEAR(v4l2_dv_timings_cap, pad)),
IOCTL_INFO_FNC(VIDIOC_ENUM_FREQ_BANDS, v4l_enum_freq_bands, 
v4l_print_freq_band, 0),
IOCTL_INFO_FNC(VIDIOC_DBG_G_CHIP_INFO, v4l_dbg_g_chip_info, 
v4l_print_dbg_chip_info, INFO_FL_CLEAR(v4l2_dbg_chip_info, match)),
IOCTL_INFO_FNC(VIDIOC_QUERY_EXT_CTRL, v4l_query_ext_ctrl, 
v4l_print_query_ext_ctrl, INFO_FL_CTRL | INFO_FL_CLEAR(v4l2_query_ext_ctrl, 
id)),
-- 
2.7.4



[PATCH v10 7/8] ARM: dts: imx: Add TDA19971 HDMI Receiver to GW54xx

2018-02-08 Thread Tim Harvey
The GW54xx has a front-panel microHDMI connector routed to a TDA19971
which is connected the the IPU CSI when using IMX6Q.

Signed-off-by: Tim Harvey 
---
v5:
 - remove leading 0 from unit address
 - add newline between property list and child node

v4: no changes
v3: no changes

v2:
 - add HDMI audio input support

 arch/arm/boot/dts/imx6q-gw54xx.dts| 105 ++
 arch/arm/boot/dts/imx6qdl-gw54xx.dtsi |  29 +-
 2 files changed, 131 insertions(+), 3 deletions(-)

diff --git a/arch/arm/boot/dts/imx6q-gw54xx.dts 
b/arch/arm/boot/dts/imx6q-gw54xx.dts
index 56e5b50..0477120 100644
--- a/arch/arm/boot/dts/imx6q-gw54xx.dts
+++ b/arch/arm/boot/dts/imx6q-gw54xx.dts
@@ -12,10 +12,30 @@
 /dts-v1/;
 #include "imx6q.dtsi"
 #include "imx6qdl-gw54xx.dtsi"
+#include 
 
 / {
model = "Gateworks Ventana i.MX6 Dual/Quad GW54XX";
compatible = "gw,imx6q-gw54xx", "gw,ventana", "fsl,imx6q";
+
+   sound-digital {
+   compatible = "simple-audio-card";
+   simple-audio-card,name = "tda1997x-audio";
+
+   simple-audio-card,dai-link@0 {
+   format = "i2s";
+
+   cpu {
+   sound-dai = <&ssi2>;
+   };
+
+   codec {
+   bitclock-master;
+   frame-master;
+   sound-dai = <&tda1997x>;
+   };
+   };
+   };
 };
 
 &i2c3 {
@@ -35,6 +55,61 @@
};
};
};
+
+   tda1997x: codec@48 {
+   compatible = "nxp,tda19971";
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_tda1997x>;
+   reg = <0x48>;
+   interrupt-parent = <&gpio1>;
+   interrupts = <7 IRQ_TYPE_LEVEL_LOW>;
+   DOVDD-supply = <®_3p3v>;
+   AVDD-supply = <&sw4_reg>;
+   DVDD-supply = <&sw4_reg>;
+   #sound-dai-cells = <0>;
+   nxp,audout-format = "i2s";
+   nxp,audout-layout = <0>;
+   nxp,audout-width = <16>;
+   nxp,audout-mclk-fs = <128>;
+   /*
+* The 8bpp YUV422 semi-planar mode outputs CbCr[11:4]
+* and Y[11:4] across 16bits in the same cycle
+* which we map to VP[15:08]<->CSI_DATA[19:12]
+*/
+   nxp,vidout-portcfg =
+   /*G_Y_11_8<->VP[15:12]<->CSI_DATA[19:16]*/
+   < TDA1997X_VP24_V15_12 TDA1997X_G_Y_11_8 >,
+   /*G_Y_7_4<->VP[11:08]<->CSI_DATA[15:12]*/
+   < TDA1997X_VP24_V11_08 TDA1997X_G_Y_7_4 >,
+   /*R_CR_CBCR_11_8<->VP[07:04]<->CSI_DATA[11:08]*/
+   < TDA1997X_VP24_V07_04 TDA1997X_R_CR_CBCR_11_8 >,
+   /*R_CR_CBCR_7_4<->VP[03:00]<->CSI_DATA[07:04]*/
+   < TDA1997X_VP24_V03_00 TDA1997X_R_CR_CBCR_7_4 >;
+
+   port {
+   tda1997x_to_ipu1_csi0_mux: endpoint {
+   remote-endpoint = 
<&ipu1_csi0_mux_from_parallel_sensor>;
+   bus-width = <16>;
+   hsync-active = <1>;
+   vsync-active = <1>;
+   data-active = <1>;
+   };
+   };
+   };
+};
+
+&ipu1_csi0_from_ipu1_csi0_mux {
+   bus-width = <16>;
+};
+
+&ipu1_csi0_mux_from_parallel_sensor {
+   remote-endpoint = <&tda1997x_to_ipu1_csi0_mux>;
+   bus-width = <16>;
+};
+
+&ipu1_csi0 {
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_ipu1_csi0>;
 };
 
 &ipu2_csi1_from_ipu2_csi1_mux {
@@ -63,6 +138,30 @@
>;
};
 
+   pinctrl_ipu1_csi0: ipu1_csi0grp {
+   fsl,pins = <
+   MX6QDL_PAD_CSI0_DAT4__IPU1_CSI0_DATA04  0x1b0b0
+   MX6QDL_PAD_CSI0_DAT5__IPU1_CSI0_DATA05  0x1b0b0
+   MX6QDL_PAD_CSI0_DAT6__IPU1_CSI0_DATA06  0x1b0b0
+   MX6QDL_PAD_CSI0_DAT7__IPU1_CSI0_DATA07  0x1b0b0
+   MX6QDL_PAD_CSI0_DAT8__IPU1_CSI0_DATA08  0x1b0b0
+   MX6QDL_PAD_CSI0_DAT9__IPU1_CSI0_DATA09  0x1b0b0
+   MX6QDL_PAD_CSI0_DAT10__IPU1_CSI0_DATA10 0x1b0b0
+   MX6QDL_PAD_CSI0_DAT11__IPU1_CSI0_DATA11 0x1b0b0
+   MX6QDL_PAD_CSI0_DAT12__IPU1_CSI0_DATA12 0x1b0b0
+   MX6QDL_PAD_CSI0_DAT13__IPU1_CSI0_DATA13 0x1b0b0
+   MX6QDL_PAD_CSI0_DAT14__IPU1_CSI0_DATA14 0x1b0b0
+   MX6QDL_PAD_CSI0_DAT15__IPU1_CSI0_DATA15 0x1b0b0
+   MX6QDL_PAD_CSI0_DAT16__IPU1_CSI0_DATA16 0x1b0b0

[PATCH v10 6/8] media: i2c: Add TDA1997x HDMI receiver driver

2018-02-08 Thread Tim Harvey
Add support for the TDA1997x HDMI receivers.

Cc: Hans Verkuil 
Signed-off-by: Tim Harvey 
---
v10:
 - removed unnecessary check for !timings in get/set/query dv timings (Hans)
 - dropped pointless s_stream handler (Hans)
 - remove need for detected_timings and always use set timings (Hans)

v9:
 - remove redundant pad bounds check already in v4l2-subdev.c
 - assign entity function (Hans)
 - properly assign/check/free ctrl_handler (Hans)
 - fixed typo 'Rull Range' -> 'Full Range'
 - update csc after quant range change

v8:
 - fix available formats for tda19971 bt656 bus width >12
 - support full range of input modes based on timings_cap
 - fix set_format (compliance)
 - fixed get/set edid (compliance)
 - add init_cfg to setup default pad config (compliance)
 - added missing pad checks to get_dv_timings_cap/enum_dv_timings (compliance)
 - fix alignment of if statement and whitespace in comment (Hans)
 - move regs to tda1997x_regs.h to clean up (Hans)
 - add define and sanity check for num of mbus_codes (Hans)

v7:
 - fix interlaced mode
 - support no AVI infoframe (ie DVI) (Hans)
 - add support for multiple output formats (Hans)

v6:
 - fix return on regulator enablei in tda1997x_set_power()
 - replace copyright with SPDX tag
 - fix colorspace handling

v5:
 - uppercase string constants
 - use v4l2_hdmi_rx_coloriemtry to fill format
 - fix V4L2_CID_DV_RX_RGB_RANGE
 - fix interlaced mode format

v4:
 - move include/dt-bindings/media/tda1997x.h to bindings patch
 - fix typos
 - fix default quant range for VGA
 - fix quant range handling and conv matrix
 - add additional standards and capabilities to timings_cap

v3:
 - use V4L2_DV_BT_FRAME_WIDTH/HEIGHT macros
 - fixed missing break
 - use only hdmi_infoframe_log for infoframe logging
 - simplify tda1997x_s_stream error handling
 - add delayed work proc to handle hotplug enable/disable
 - fix set_edid (disable HPD before writing, enable after)
 - remove enabling edid by default
 - initialize timings
 - take quant range into account in colorspace conversion
 - remove vendor/product tracking (we provide this in log_status via infoframes)
 - add v4l_controls
 - add more detail to log_status
 - calculate vhref generator timings
 - timing detection fixes (rounding errors, hswidth errors)
 - rename configure_input/configure_conv functions

v2:
 - implement dv timings enum/cap
 - remove deprecated g_mbus_config op
 - fix dv_query_timings
 - add EDID get/set handling
 - remove max-pixel-rate support
 - add audio codec DAI support
 - change audio bindings

 drivers/media/i2c/Kconfig |9 +
 drivers/media/i2c/Makefile|1 +
 drivers/media/i2c/tda1997x.c  | 2807 +
 drivers/media/i2c/tda1997x_regs.h |  641 +
 include/media/i2c/tda1997x.h  |   42 +
 5 files changed, 3500 insertions(+)
 create mode 100644 drivers/media/i2c/tda1997x.c
 create mode 100644 drivers/media/i2c/tda1997x_regs.h
 create mode 100644 include/media/i2c/tda1997x.h

diff --git a/drivers/media/i2c/Kconfig b/drivers/media/i2c/Kconfig
index cb5d7ff..3522641 100644
--- a/drivers/media/i2c/Kconfig
+++ b/drivers/media/i2c/Kconfig
@@ -56,6 +56,15 @@ config VIDEO_TDA9840
  To compile this driver as a module, choose M here: the
  module will be called tda9840.
 
+config VIDEO_TDA1997X
+   tristate "NXP TDA1997x HDMI receiver"
+   depends on VIDEO_V4L2 && I2C && VIDEO_V4L2_SUBDEV_API
+   ---help---
+ V4L2 subdevice driver for the NXP TDA1997x HDMI receivers.
+
+ To compile this driver as a module, choose M here: the
+ module will be called tda1997x.
+
 config VIDEO_TEA6415C
tristate "Philips TEA6415C audio processor"
depends on I2C
diff --git a/drivers/media/i2c/Makefile b/drivers/media/i2c/Makefile
index 548a9ef..adfcae9 100644
--- a/drivers/media/i2c/Makefile
+++ b/drivers/media/i2c/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_VIDEO_TVAUDIO) += tvaudio.o
 obj-$(CONFIG_VIDEO_TDA7432) += tda7432.o
 obj-$(CONFIG_VIDEO_SAA6588) += saa6588.o
 obj-$(CONFIG_VIDEO_TDA9840) += tda9840.o
+obj-$(CONFIG_VIDEO_TDA1997X) += tda1997x.o
 obj-$(CONFIG_VIDEO_TEA6415C) += tea6415c.o
 obj-$(CONFIG_VIDEO_TEA6420) += tea6420.o
 obj-$(CONFIG_VIDEO_SAA7110) += saa7110.o
diff --git a/drivers/media/i2c/tda1997x.c b/drivers/media/i2c/tda1997x.c
new file mode 100644
index 000..0a4673b
--- /dev/null
+++ b/drivers/media/i2c/tda1997x.c
@@ -0,0 +1,2807 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018 Gateworks Corporation
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "tda1997x_regs.h"
+
+#define TDA1997X_MBUS_CODES5
+
+/* debug level */
+static int debug;
+module_param(debug, int, 0644);
+MODULE_PARM_DESC(debug, "debug level (0-2)");
+
+/* Audio for

[PATCH v10 8/8] ARM: dts: imx: Add TDA19971 HDMI Receiver to GW551x

2018-02-08 Thread Tim Harvey
Signed-off-by: Tim Harvey 
---
v5:
 - add missing audmux config

 arch/arm/boot/dts/imx6qdl-gw551x.dtsi | 138 ++
 1 file changed, 138 insertions(+)

diff --git a/arch/arm/boot/dts/imx6qdl-gw551x.dtsi 
b/arch/arm/boot/dts/imx6qdl-gw551x.dtsi
index 30d4662..749548a 100644
--- a/arch/arm/boot/dts/imx6qdl-gw551x.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-gw551x.dtsi
@@ -46,6 +46,8 @@
  */
 
 #include 
+#include 
+#include 
 
 / {
/* these are used by bootloader for disabling nodes */
@@ -98,6 +100,50 @@
regulator-min-microvolt = <500>;
regulator-max-microvolt = <500>;
};
+
+   sound-digital {
+   compatible = "simple-audio-card";
+   simple-audio-card,name = "tda1997x-audio";
+
+   simple-audio-card,dai-link@0 {
+   format = "i2s";
+
+   cpu {
+   sound-dai = <&ssi2>;
+   };
+
+   codec {
+   bitclock-master;
+   frame-master;
+   sound-dai = <&tda1997x>;
+   };
+   };
+   };
+};
+
+&audmux {
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_audmux>; /* AUD5<->tda1997x */
+   status = "okay";
+
+   ssi1 {
+   fsl,audmux-port = <0>;
+   fsl,port-config = <
+   (IMX_AUDMUX_V2_PTCR_TFSDIR |
+   IMX_AUDMUX_V2_PTCR_TFSEL(4+8) | /* RXFS */
+   IMX_AUDMUX_V2_PTCR_TCLKDIR |
+   IMX_AUDMUX_V2_PTCR_TCSEL(4+8) | /* RXC */
+   IMX_AUDMUX_V2_PTCR_SYN)
+   IMX_AUDMUX_V2_PDCR_RXDSEL(4)
+   >;
+   };
+
+   aud5 {
+   fsl,audmux-port = <4>;
+   fsl,port-config = <
+   IMX_AUDMUX_V2_PTCR_SYN
+   IMX_AUDMUX_V2_PDCR_RXDSEL(0)>;
+   };
 };
 
 &can1 {
@@ -263,6 +309,60 @@
#gpio-cells = <2>;
};
 
+   tda1997x: tda1997x@48 {
+   compatible = "nxp,tda19971";
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_tda1997x>;
+   reg = <0x48>;
+   interrupt-parent = <&gpio1>;
+   interrupts = <7 IRQ_TYPE_LEVEL_LOW>;
+   DOVDD-supply = <®_3p3>;
+   AVDD-supply = <®_1p8b>;
+   DVDD-supply = <®_1p8a>;
+   #sound-dai-cells = <0>;
+   nxp,audout-format = "i2s";
+   nxp,audout-layout = <0>;
+   nxp,audout-width = <16>;
+   nxp,audout-mclk-fs = <128>;
+   /*
+* The 8bpp YUV422 semi-planar mode outputs CbCr[11:4]
+* and Y[11:4] across 16bits in the same cycle
+* which we map to VP[15:08]<->CSI_DATA[19:12]
+*/
+   nxp,vidout-portcfg =
+   /*G_Y_11_8<->VP[15:12]<->CSI_DATA[19:16]*/
+   < TDA1997X_VP24_V15_12 TDA1997X_G_Y_11_8 >,
+   /*G_Y_7_4<->VP[11:08]<->CSI_DATA[15:12]*/
+   < TDA1997X_VP24_V11_08 TDA1997X_G_Y_7_4 >,
+   /*R_CR_CBCR_11_8<->VP[07:04]<->CSI_DATA[11:08]*/
+   < TDA1997X_VP24_V07_04 TDA1997X_R_CR_CBCR_11_8 >,
+   /*R_CR_CBCR_7_4<->VP[03:00]<->CSI_DATA[07:04]*/
+   < TDA1997X_VP24_V03_00 TDA1997X_R_CR_CBCR_7_4 >;
+
+   port {
+   tda1997x_to_ipu1_csi0_mux: endpoint {
+   remote-endpoint = 
<&ipu1_csi0_mux_from_parallel_sensor>;
+   bus-width = <16>;
+   hsync-active = <1>;
+   vsync-active = <1>;
+   data-active = <1>;
+   };
+   };
+   };
+};
+
+&ipu1_csi0_from_ipu1_csi0_mux {
+   bus-width = <16>;
+};
+
+&ipu1_csi0_mux_from_parallel_sensor {
+   remote-endpoint = <&tda1997x_to_ipu1_csi0_mux>;
+   bus-width = <16>;
+};
+
+&ipu1_csi0 {
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_ipu1_csi0>;
 };
 
 &pcie {
@@ -320,6 +420,14 @@
 };
 
 &iomuxc {
+   pinctrl_audmux: audmuxgrp {
+   fsl,pins = <
+   MX6QDL_PAD_DISP0_DAT19__AUD5_RXD0x130b0
+   MX6QDL_PAD_DISP0_DAT14__AUD5_RXC0x130b0
+   MX6QDL_PAD_DISP0_DAT13__AUD5_RXFS   0x130b0
+   >;
+   };
+
pinctrl_flexcan1: flexcan1grp {
fsl,pins = <
MX6QDL_PAD_KEY_ROW2__FLEXCAN1_RX0x1b0b1
@@ -375,6 +483,30 @@
>;
};
 
+   pinctrl_ipu1_csi0: ipu1_csi0grp {
+   fsl,pins = <
+   MX6QDL_PAD_CSI0_DAT4__IPU1_CSI0_D

[PATCH v10 0/8] TDA1997x HDMI video reciver

2018-02-08 Thread Tim Harvey
This is a v4l2 subdev driver supporting the TDA1997x HDMI video receiver.

I've tested this on a Gateworks GW54xx/GW551x with an IMX6Q/IMX6DL which
uses the TDA19971 with 16bits connected to the IMX6 CSI and single-lane
I2S audio providing 2-channel audio.

For this configuration I've tested both 16bit YUV422 and 8bit
BT656 parallel video bus modes.

While the driver should support the TDA1993 I do not have one for testing.

Further potential development efforts include:
 - CEC support
 - HDCP support
 - TDA19972 support (2 inputs)

Media graphs can be found at http://dev.gateworks.com/docs/linux/media

See details below for configuration and compliance tests

History:
v10:
 - removed unnecessary check for !timings in get/set/query dv timings (Hans)
 - dropped pointless s_stream handler (Hans)
 - remove need for detected_timings and always use set timings (Hans)

v9:
 - add digital video decoder video interface entity function

v8:
 - fix clearing pad for VIDIOC_DV_TIMIGNS_CAP
 - support full range of input modes based on timings_cap
 - add patch to fix clearing pad for VIDIOC_DV_TIMIGINGS
 - fix available formats for tda19971 bt656 bus width >12
 - fix set_format (compliance)
 - fixed get/set edid (compliance)
 - add init_cfg to setup default pad config (compliance)
 - added missing pad checks to get_dv_timings_cap/enum_dv_timings (compliance)
 - fix alignment of if statement and whitespace in comment (Hans)
 - move regs to tda1997x_regs.h to clean up (Hans)
 - add define and sanity check for num of mbus_codes (Hans)

v7:
 - fix interlaced mode
 - support no AVI infoframe (ie DVI) (Hans)
 - add support for multiple output formats (Hans)

v6:
 - tda1997x: fix return on regulator enablei in tda1997x_set_power() (Fabio)
 - tda1997x: fix colorspace handling (Hans)
 - bindings: added Robs's ack (Rob)
 - replace copyright with SPDX tag (Philippe)

v5:
 - added v4l2_hdmi_colorimetry() patch from Hans to series
 - bindings: added Sakari's ack
 - tda1997x: uppercase string constants
 - tda1997x: use v4l2_hdmi_rx_coloriemtry to fill format
 - tda1997x: fix V4L2_CID_DV_RX_RGB_RANGE
 - tda1997x: fix interlaced mode format
 - dts: remove leading 0 from unit address
 - dts: add newline between property list and child node
 - dts: added missing audmux in GW551x dts

v4:
 - move include/dt-bindings/media/tda1997x.h to bindings patch
 - clarify port node details in bindings
 - fix typos
 - fix default quant range for VGA
 - fix quant range handling and conv matrix
 - add additional standards and capabilities to timings_cap

v3:
 - fix typo in dt bindings
 - added dt bindings for GW551x
 - use V4L2_DV_BT_FRAME_WIDTH/HEIGHT macros
 - fixed missing break
 - use only hdmi_infoframe_log for infoframe logging
 - simplify tda1997x_s_stream error handling
 - add delayed work proc to handle hotplug enable/disable
 - fix set_edid (disable HPD before writing, enable after)
 - remove enabling edid by default
 - initialize timings
 - take quant range into account in colorspace conversion
 - remove vendor/product tracking (we provide this in log_status via
   infoframes)
 - add v4l_controls
 - add more detail to log_status
 - calculate vhref generator timings
 - timing detection fixes (rounding errors, hswidth errors)
 - rename configure_input/configure_conv functions

v2:
 - encorporate feedback into dt bindings
 - change audio dt bindings
 - implement dv timings enum/cap
 - remove deprecated g_mbus_config op
 - fix dv_query_timings
 - add EDID get/set handling
 - remove max-pixel-rate support
 - add audio codec DAI support
 - added media-ctl and v4l2-compliance details

v1:
 - initial RFC

Pipeline configuration:
$ media-ctl -e 'tda19971 2-0048'
/dev/v4l-subdev1
$ v4l2-ctl -d /dev/v4l-subdev1 --set-dv-bt-timings=query
BT timings set
$ media-ctl --get-v4l2 '"tda19971 2-0048":0'
[fmt:UYVY8_2X8/1280x720 field:none colorspace:srgb]

$ media-ctl --link "tda19971 2-0048":0 -> "ipu1_csi0_mux":1[1]
$ media-ctl --link "ipu1_csi0_mux":2 -> "ipu1_csi0":0[1]
$ media-ctl --link "ipu1_csi0":2 -> "ipu1_csi0 capture":0[1]
$ media-ctl --set-v4l2 'tda19971 2-0048':0[fmt:UYVY8_2X8/1280x720]
$ media-ctl --set-v4l2 'ipu1_csi0_mux':2[fmt:UYVY8_2X8/1280x720]
$ media-ctl --set-v4l2 'ipu1_csi0':0[fmt:UYVY8_2X8/1280x720]
$ gst-launch-1.0 v4l2src device=/dev/video4 ! 
video/x-raw,width=1280,height=720,format=UYVY ! jpegenc ! rtpjpegpay ! udpsink 
host=172.24.40.6 port=5000

$ media-ctl -d /dev/media0 -p
Media controller API version 4.15.0

Media device information

driver  imx-media
model   imx-media
serial  
bus info
hw revision 0x0
driver version  4.15.0

Device topology
- entity 1: adv7180 2-0020 (1 pad, 1 link)
type V4L2 subdev subtype Unknown flags 20004
device node name /dev/v4l-subdev0
pad0: Source
[fmt:UYVY8_2X8/720x480 field:interlaced colorspace:smpte170m]
-> "ipu2_csi1_mux":1 []

- entity 3: tda19971 2-0048 (1 p

[PATCH v10 4/8] MAINTAINERS: add entry for NXP TDA1997x driver

2018-02-08 Thread Tim Harvey
Signed-off-by: Tim Harvey 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 845fc25..439b500 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13262,6 +13262,14 @@ T: git git://linuxtv.org/mkrufky/tuners.git
 S: Maintained
 F: drivers/media/tuners/tda18271*
 
+TDA1997x MEDIA DRIVER
+M: Tim Harvey 
+L: linux-me...@vger.kernel.org
+W: https://linuxtv.org
+Q: http://patchwork.linuxtv.org/project/linux-media/list/
+S: Maintained
+F: drivers/media/i2c/tda1997x.*
+
 TDA827x MEDIA DRIVER
 M: Michael Krufky 
 L: linux-me...@vger.kernel.org
-- 
2.7.4



Re: [PATCH] power: reset: Add Spreadtrum SC27xx PMIC power off support

2018-02-08 Thread Baolin Wang
Hi Sebastian,

On 9 February 2018 at 05:50, Sebastian Reichel  wrote:
> Hi Baolin,
>
> On Mon, Jan 15, 2018 at 03:58:57PM +0800, Baolin Wang wrote:
>> On Spreadtrum platform, we need power off system through external SC27xx
>> series PMICs including the SC2720, SC2721, SC2723, SC2730 and SC2731 chips.
>> Thus this patch adds SC27xx series PMICs power-off support.
>>
>> Signed-off-by: Baolin Wang 
>> ---
>>  drivers/power/reset/Kconfig   |9 +
>>  drivers/power/reset/Makefile  |1 +
>>  drivers/power/reset/sc27xx-poweroff.c |   65 
>> +
>>  3 files changed, 75 insertions(+)
>>  create mode 100644 drivers/power/reset/sc27xx-poweroff.c
>>
>> diff --git a/drivers/power/reset/Kconfig b/drivers/power/reset/Kconfig
>> index ca0de1a..611ae56 100644
>> --- a/drivers/power/reset/Kconfig
>> +++ b/drivers/power/reset/Kconfig
>> @@ -227,5 +227,14 @@ config SYSCON_REBOOT_MODE
>> register, then the bootloader can read it to take different
>> action according to the mode.
>>
>> +config POWER_RESET_SC27XX
>> + tristate "Spreadtrum SC27xx PMIC power-off driver"
>> + depends on MFD_SC27XX_PMIC || COMPILE_TEST
>> + help
>> +   This driver supports powering off a system through
>> +   Spreadtrum SC27xx series PMICs. The SC27xx series
>> +   PMICs includes the SC2720, SC2721, SC2723, SC2730
>> +   and SC2731 chips.
>> +
>>  endif
>>
>> diff --git a/drivers/power/reset/Makefile b/drivers/power/reset/Makefile
>> index aeb65ed..225d645 100644
>> --- a/drivers/power/reset/Makefile
>> +++ b/drivers/power/reset/Makefile
>> @@ -27,3 +27,4 @@ obj-$(CONFIG_POWER_RESET_RMOBILE) += rmobile-reset.o
>>  obj-$(CONFIG_POWER_RESET_ZX) += zx-reboot.o
>>  obj-$(CONFIG_REBOOT_MODE) += reboot-mode.o
>>  obj-$(CONFIG_SYSCON_REBOOT_MODE) += syscon-reboot-mode.o
>> +obj-$(CONFIG_POWER_RESET_SC27XX) += sc27xx-poweroff.o
>> diff --git a/drivers/power/reset/sc27xx-poweroff.c 
>> b/drivers/power/reset/sc27xx-poweroff.c
>> new file mode 100644
>> index 000..8e4b6a0
>> --- /dev/null
>> +++ b/drivers/power/reset/sc27xx-poweroff.c
>> @@ -0,0 +1,65 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (C) 2017 Spreadtrum Communications Inc.
>> + * Copyright (c) 2017 Linaro Ltd.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#define SC27XX_PWR_PD_HW 0xc2c
>> +#define SC27XX_PWR_OFF_ENBIT(0)
>> +
>> +static struct regmap *regmap;
>> +
>> +/*
>> + * On Spreadtrum platform, we need power off system through external SC27xx
>> + * series PMICs, and it is one similar SPI bus mapped by regmap to access 
>> PMIC,
>> + * which is not fast io access.
>> + *
>> + * So before stopping other cores, we need release other cores' resource by
>> + * taking cpus down to avoid racing regmap or spi mutex lock when poweroff
>> + * system through PMIC.
>> + */
>> +void sc27xx_poweroff_shutdown(void)
>> +{
>> + int cpu = smp_processor_id();
>> +
>> + freeze_secondary_cpus(cpu);
>> +}
>> +
>> +static struct syscore_ops poweroff_syscore_ops = {
>> + .shutdown = sc27xx_poweroff_shutdown,
>> +};
>> +
>> +static void sc27xx_poweroff_do_poweroff(void)
>> +{
>> + regmap_write(regmap, SC27XX_PWR_PD_HW, SC27XX_PWR_OFF_EN);
>> +}
>> +
>> +static int sc27xx_poweroff_probe(struct platform_device *pdev)
>> +{
>
> if (regmap)
> return -EINVAL;

OK.

>
>> + regmap = dev_get_regmap(pdev->dev.parent, NULL);
>> + if (!regmap)
>> + return -ENODEV;
>> +
>> + pm_power_off = sc27xx_poweroff_do_poweroff;
>> + register_syscore_ops(&poweroff_syscore_ops);
>> + return 0;
>> +}
>
> static void sc27xx_poweroff_remove(struct platform_device *pdev) {
> if (pm_power_off == sc27xx_poweroff_do_poweroff)
> pm_power_off = NULL;
> regmap = NULL;
> }

OK. I will add it in next version.

>
>> +static struct platform_driver sc27xx_poweroff_driver = {
>> + .probe = sc27xx_poweroff_probe,
>
> .remove = sc27xx_poweroff_remove,
>
>> + .driver = {
>> + .name = "sc27xx-poweroff",
>> + },
>> +};
>> +module_platform_driver(sc27xx_poweroff_driver);
>> +
>> +MODULE_DESCRIPTION("Spreadtrum SC27xx PMIC Poweroff Driver");
>> +MODULE_LICENSE("GPL v2");
>
> MODULE_ALIAS("platform:sc27xx-poweroff");

OK. Thanks for your comments.

-- 
Baolin.wang
Best Regards


Assalamu alaikum

2018-02-08 Thread Mr Ibrahim Zaki


Assalamu alaikum


My name is Mr. Ibrahim Zaki, I am a staff member working with BCEAO BANK here 
in Ouagadougou, Burkina Faso.


I want you to help me receive the sum of twenty-seven million two hundred 
dollars ($ 27,200,000) in your bank account. This fund was deposited in the 
bank here by a foreign customer who accidentally died next to his family 
members several years ago. Nobody has asked for this fund until now. Contact me 
for more information. PLEASE REPLY ME WITH MY PRIVATE EMAIL 
(mribrahimzak...@gmail.com)


Best regard,
Mr. Ibrahim Zaki


[V9fs-developer] [PATCH] fs/9p: don't set SB_NOATIME by default

2018-02-08 Thread jiangyiwen
User use some syscall, for example mmap(v9fs_file_mmap), it will not
update atime even if user's mnt_flags have MNT_NOATIME, because
v9fs default set SB_NOATIME in v9fs_set_super.

For supporting access time is updated when user mount with relatime,
we should clear SB_NOATIME by default.

Signed-off-by: Yiwen Jiang 
---
 fs/9p/vfs_super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c
index af03c2a..48ce504 100644
--- a/fs/9p/vfs_super.c
+++ b/fs/9p/vfs_super.c
@@ -94,7 +94,7 @@ static int v9fs_set_super(struct super_block *s, void *data)
if (v9ses->cache)
sb->s_bdi->ra_pages = (VM_MAX_READAHEAD * 1024)/PAGE_SIZE;

-   sb->s_flags |= SB_ACTIVE | SB_DIRSYNC | SB_NOATIME;
+   sb->s_flags |= SB_ACTIVE | SB_DIRSYNC;
if (!v9ses->cache)
sb->s_flags |= SB_SYNCHRONOUS;

-- 
1.8.3.1



[PATCH] Input: gpio_keys: Add level trigger support for GPIO keys

2018-02-08 Thread Baolin Wang
On some platforms (such as Spreadtrum platform), the GPIO keys can only
be triggered by level type. So this patch introduces one property to
indicate if the GPIO trigger type is level trigger or edge trigger.

Signed-off-by: Baolin Wang 
---
 .../devicetree/bindings/input/gpio-keys.txt|2 ++
 drivers/input/keyboard/gpio_keys.c |   22 +++-
 include/linux/gpio_keys.h  |1 +
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/input/gpio-keys.txt 
b/Documentation/devicetree/bindings/input/gpio-keys.txt
index a949404..e3104bd 100644
--- a/Documentation/devicetree/bindings/input/gpio-keys.txt
+++ b/Documentation/devicetree/bindings/input/gpio-keys.txt
@@ -29,6 +29,8 @@ Optional subnode-properties:
- linux,can-disable: Boolean, indicates that button is connected
  to dedicated (not shared) interrupt which can be disabled to
  suppress events from the button.
+   - gpio-key,level-trigger: Boolean, indicates that button's interrupt
+ type is level trigger. Otherwise it is edge trigger as default.
 
 Example nodes:
 
diff --git a/drivers/input/keyboard/gpio_keys.c 
b/drivers/input/keyboard/gpio_keys.c
index 87e613d..d3b4bb6 100644
--- a/drivers/input/keyboard/gpio_keys.c
+++ b/drivers/input/keyboard/gpio_keys.c
@@ -385,6 +385,19 @@ static void gpio_keys_gpio_work_func(struct work_struct 
*work)
struct gpio_button_data *bdata =
container_of(work, struct gpio_button_data, work.work);
 
+   if (bdata->button->level_trigger) {
+   unsigned int trigger =
+   irq_get_trigger_type(bdata->irq) & ~IRQF_TRIGGER_MASK;
+   int state = gpiod_get_raw_value_cansleep(bdata->gpiod);
+
+   if (state)
+   trigger |= IRQF_TRIGGER_LOW;
+   else
+   trigger |= IRQF_TRIGGER_HIGH;
+
+   irq_set_irq_type(bdata->irq, trigger);
+   }
+
gpio_keys_gpio_report_event(bdata);
 
if (bdata->button->wakeup)
@@ -566,7 +579,11 @@ static int gpio_keys_setup_key(struct platform_device 
*pdev,
INIT_DELAYED_WORK(&bdata->work, gpio_keys_gpio_work_func);
 
isr = gpio_keys_gpio_isr;
-   irqflags = IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING;
+   if (button->level_trigger)
+   irqflags = gpiod_is_active_low(bdata->gpiod) ?
+   IRQF_TRIGGER_LOW : IRQF_TRIGGER_HIGH;
+   else
+   irqflags = IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING;
 
} else {
if (!button->irq) {
@@ -721,6 +738,9 @@ static void gpio_keys_close(struct input_dev *input)
button->can_disable =
fwnode_property_read_bool(child, "linux,can-disable");
 
+   button->level_trigger =
+   fwnode_property_read_bool(child, 
"gpio-key,level-trigger");
+
if (fwnode_property_read_u32(child, "debounce-interval",
 &button->debounce_interval))
button->debounce_interval = 5;
diff --git a/include/linux/gpio_keys.h b/include/linux/gpio_keys.h
index d06bf77..5095645 100644
--- a/include/linux/gpio_keys.h
+++ b/include/linux/gpio_keys.h
@@ -28,6 +28,7 @@ struct gpio_keys_button {
int wakeup;
int debounce_interval;
bool can_disable;
+   bool level_trigger;
int value;
unsigned int irq;
 };
-- 
1.7.9.5



Re: [PATCH v9 6/8] media: i2c: Add TDA1997x HDMI receiver driver

2018-02-08 Thread Tim Harvey
On Thu, Feb 8, 2018 at 7:06 AM, Hans Verkuil  wrote:
> Hi Tim,
>
> I was so hoping I could make a pull request for this, but I still found
> problems with g/s/query_dv_timings.
>
> I strongly suspect that v4l2-compliance would fail if you boot up the system
> *without* a source connected.
>
> And I discovered that I was missing additional checks in the timings tests
> for v4l2-compliance that would have found the same issue if run with a source
> connected. I've fixed and committed those tests now. I'll also try to test
> that a S_DV_TIMINGS calls updates the format.
>
> Details below:
>
> On 02/07/18 23:42, Tim Harvey wrote:
>> +struct tda1997x_state {
>
> ...
>
>> + struct v4l2_dv_timings timings;
>> + const struct v4l2_dv_timings *detected_timings;
>
> ...
>
>> +/* Configure frame detection window and VHREF timing generator */
>> +static int
>> +tda1997x_configure_vhref(struct v4l2_subdev *sd)
>> +{
>> + struct tda1997x_state *state = to_state(sd);
>> + const struct v4l2_bt_timings *bt;
>> + int width, lines;
>> + u16 href_start, href_end;
>> + u16 vref_f1_start, vref_f2_start;
>> + u8 vref_f1_width, vref_f2_width;
>> + u8 field_polarity;
>> + u16 fieldref_f1_start, fieldref_f2_start;
>> + u8 reg;
>> +
>> + if (!state->detected_timings)
>> + return -EINVAL;
>
> Why this test? Who cares if there are no detected timings? It's certainly
> not a failure. S_DV_TIMINGS should succeed regardless of whether there is
> a signal or not and regardless of the current detected timings.
>

good point. Both tda1997x_configure_vhref() and
tda1997x_configure_csc() should never return an error - I'll change
that.

>> + bt = &state->detected_timings->bt;
>
> Ouch. The timings passed in with S_DV_TIMINGS should be used.
>
> Just use state->timings here, not detected_timings.

Ok. I was thinking the VHREF generator responsible for output timings
to the SoC should always match the input source but changing it async
like that could mess with userspace buffers and the like so even
though the output will be 'wrong' after a resolution change I get that
I need to wait for userspace to come along and query then set the new
resolution.

>
>> + href_start = bt->hbackporch + bt->hsync + 1;
>> + href_end = href_start + bt->width;
>> + vref_f1_start = bt->height + bt->vbackporch + bt->vsync +
>> + bt->il_vbackporch + bt->il_vsync +
>> + bt->il_vfrontporch;
>> + vref_f1_width = bt->vbackporch + bt->vsync + bt->vfrontporch;
>> + vref_f2_start = 0;
>> + vref_f2_width = 0;
>> + fieldref_f1_start = 0;
>> + fieldref_f2_start = 0;
>> + if (bt->interlaced) {
>> + vref_f2_start = (bt->height / 2) +
>> + (bt->il_vbackporch + bt->il_vsync - 1);
>> + vref_f2_width = bt->il_vbackporch + bt->il_vsync +
>> + bt->il_vfrontporch;
>> + fieldref_f2_start = vref_f2_start + bt->il_vfrontporch +
>> + fieldref_f1_start;
>> + }
>> + field_polarity = 0;
>> +
>> + width = V4L2_DV_BT_FRAME_WIDTH(bt);
>> + lines = V4L2_DV_BT_FRAME_HEIGHT(bt);
>> +
>> + /*
>> +  * Configure Frame Detection Window:
>> +  *  horiz area where the VHREF module consider a VSYNC a new frame
>> +  */
>> + io_write16(sd, REG_FDW_S, 0x2ef); /* start position */
>> + io_write16(sd, REG_FDW_E, 0x141); /* end position */
>> +
>> + /* Set Pixel And Line Counters */
>> + if (state->chip_revision == 0)
>> + io_write16(sd, REG_PXCNT_PR, 4);
>> + else
>> + io_write16(sd, REG_PXCNT_PR, 1);
>> + io_write16(sd, REG_PXCNT_NPIX, width & MASK_VHREF);
>> + io_write16(sd, REG_LCNT_PR, 1);
>> + io_write16(sd, REG_LCNT_NLIN, lines & MASK_VHREF);
>> +
>> + /*
>> +  * Configure the VHRef timing generator responsible for rebuilding all
>> +  * horiz and vert synch and ref signals from its input allowing auto
>> +  * detection algorithms and forcing predefined modes (480i & 576i)
>> +  */
>> + reg = VHREF_STD_DET_OFF << VHREF_STD_DET_SHIFT;
>> + io_write(sd, REG_VHREF_CTRL, reg);
>> +
>> + /*
>> +  * Configure the VHRef timing values. In case the VHREF generator has
>> +  * been configured in manual mode, this will allow to manually set all
>> +  * horiz and vert ref values (non-active pixel areas) of the generator
>> +  * and allows setting the frame reference params.
>> +  */
>> + /* horizontal reference start/end */
>> + io_write16(sd, REG_HREF_S, href_start & MASK_VHREF);
>> + io_write16(sd, REG_HREF_E, href_end & MASK_VHREF);
>> + /* vertical reference f1 start/end */
>> + io_write16(sd, REG_VREF_F1_S, vref_f1_start & MASK_VHREF);
>> + io_write(sd, REG_VREF_F1_WIDTH, vref_f1_width);
>> + /* vertical reference f2 start/end */
>> + io_write16(sd, REG_VREF_F2_S, vref_f2_start & MAS

Re: [PATCH v6] checkpatch.pl: Add SPDX license tag check

2018-02-08 Thread Philippe Ombredanne
On Fri, Feb 9, 2018 at 1:35 AM, Joe Perches  wrote:
> On Fri, 2018-02-02 at 13:18 -0800, Joe Perches wrote:
>> On Fri, 2018-02-02 at 09:40 -0600, Rob Herring wrote:
>> > Add SPDX license tag check based on the rules defined in
>> > Documentation/process/license-rules.rst. To summarize, SPDX license tags
>> > should be on the 1st line (or 2nd line in scripts) using the appropriate
>> > comment style for the file type.
>> >
>> > Cc: Andy Whitcroft 
>> > Cc: Joe Perches 
>> > Cc: Thomas Gleixner 
>> > Cc: Philippe Ombredanne 
>> > Acked-by: Greg Kroah-Hartman 
>> > Signed-off-by: Rob Herring 
>>
>> Signed-off-by: Joe Perches 
>
> Andrew, would you pick this up please?
>
>> > ---
>> > v6:
>> > - Dropped script extension check and only look for #!/... on 1st line. A
>> >   text executable file was not reliable either.
>> > - Support .awk and .tc which may or may not have a #!/.
>> > - Fixed a typo in script "#!" regex and also match on first /.
>> > - Add Greg's ack.
>> >
>> >  scripts/checkpatch.pl | 27 +++
>> >  1 file changed, 27 insertions(+)
>> >
>> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
>> > index ba03f17ff662..6db245e5f93b 100755
>> > --- a/scripts/checkpatch.pl
>> > +++ b/scripts/checkpatch.pl
>> > @@ -2225,6 +2225,8 @@ sub process {
>> >
>> > my $camelcase_file_seeded = 0;
>> >
>> > +   my $checklicenseline = 1;
>> > +
>> > sanitise_line_reset();
>> > my $line;
>> > foreach my $rawline (@rawlines) {
>> > @@ -2416,6 +2418,7 @@ sub process {
>> > } else {
>> > $check = $check_orig;
>> > }
>> > +   $checklicenseline = 1;
>> > next;
>> > }
>> >
>> > @@ -2866,6 +2869,30 @@ sub process {
>> > }
>> > }
>> >
>> > +# check for using SPDX license tag at beginning of files
>> > +   if ($realline == $checklicenseline) {
>> > +   if ($rawline =~ /^[ \+]\s*\#\!\s*\//) {
>> > +   $checklicenseline = 2;
>> > +   } elsif ($rawline =~ /^\+/) {
>> > +   my $comment = "";
>> > +   if ($realfile =~ /\.(h|s|S)$/) {
>> > +   $comment = '/*';
>> > +   } elsif ($realfile =~ /\.(c|dts|dtsi)$/) {
>> > +   $comment = '//';
>> > +   } elsif (($checklicenseline == 2) || $realfile 
>> > =~ /\.(sh|pl|py|awk|tc)$/) {
>> > +   $comment = '#';
>> > +   } elsif ($realfile =~ /\.rst$/) {
>> > +   $comment = '..';
>> > +   }
>> > +
>> > +   if ($comment !~ /^$/ &&
>> > +   $rawline !~ /^\+\Q$comment\E 
>> > SPDX-License-Identifier: /) {
>> > +   WARN("SPDX_LICENSE_TAG",
>> > +"Missing or malformed 
>> > SPDX-License-Identifier tag in line $checklicenseline\n" . $herecurr);
>> > +   }
>> > +   }
>> > +   }
>> > +
>> >  # check we are in a valid source file if not then ignore this hunk
>> > next if ($realfile !~ /\.(h|c|s|S|sh|dtsi|dts)$/);
>> >

BTW I forgot this if you like to add it:

Acked-by: Philippe Ombredanne 


Re: [PATCH 1/2] zsmalloc: introduce zs_huge_object() function

2018-02-08 Thread Sergey Senozhatsky
On (02/09/18 14:36), Sergey Senozhatsky wrote:
> +/**
> + * zs_huge_object() - Test if a compressed object's size is too big for 
> normal
> + *zspool classes and it will be stored in a huge class.
> + * @sz: Size in bytes of the compressed object.
> + *
> + * The functions checks if the object's size falls into huge_class area.
> + * We must take ZS_HANDLE_SIZE into account and test the actual size we
> + * are going to use up, because zs_malloc() unconditionally adds the
> + * handle size before it performs size_class lookup.
> + *
> + * Context: Any context.
> + *
> + * Return:
> + * * true  - The object's size is too big, it will be stored in a huge class.
> + * * false - The object will be store in normal zspool classes.
> + */
> ---
> 
> looks OK?

Modulo silly typos... and broken English.

-ss


Re: [PATCH 1/2] zsmalloc: introduce zs_huge_object() function

2018-02-08 Thread Sergey Senozhatsky
On (02/08/18 20:10), Matthew Wilcox wrote:
[..]
> Examples::
> 
>   * Context: Any context.
>   * Context: Any context. Takes and releases the RCU lock.
>   * Context: Any context. Expects  to be held by caller.
>   * Context: Process context. May sleep if @gfp flags permit.
>   * Context: Process context. Takes and releases .
>   * Context: Softirq or process context. Takes and releases , BH-safe.
>   * Context: Interrupt context.

I assume thatspelling serves as a placeholder and should be
replaced with a lock name in a real comment. E.g.

Takes and releases audit_cmd_mutex.

or should it actually be

Takes and releases .




So below is zs_huge_object() documentation I came up with:

---

+/**
+ * zs_huge_object() - Test if a compressed object's size is too big for normal
+ *zspool classes and it will be stored in a huge class.
+ * @sz: Size in bytes of the compressed object.
+ *
+ * The functions checks if the object's size falls into huge_class area.
+ * We must take ZS_HANDLE_SIZE into account and test the actual size we
+ * are going to use up, because zs_malloc() unconditionally adds the
+ * handle size before it performs size_class lookup.
+ *
+ * Context: Any context.
+ *
+ * Return:
+ * * true  - The object's size is too big, it will be stored in a huge class.
+ * * false - The object will be store in normal zspool classes.
+ */
---

looks OK?

-ss


Re: [PATCH v6 15/15] dt-bindings: cpufreq: Document operating-points-v2-krait-cpu

2018-02-08 Thread Sricharan R
Hi Rob,

On 2/9/2018 8:24 AM, Rob Herring wrote:
> On Tue, Feb 06, 2018 at 09:38:28AM +0530, Sricharan R wrote:
>> In Certain QCOM SoCs like ipq8064, apq8064, msm8960, msm8974
>> that has KRAIT processors the voltage/current value of each OPP
>> varies based on the silicon variant in use.
>> operating-points-v2-krait-cpu specifies the phandle to nvmem efuse cells
>> and the operating-points-v2 table for each opp. The qcom-cpufreq driver
>> reads the efuse value from the SoC to provide the required information
>> that is used to determine the voltage and current value for each OPP of
>> operating-points-v2 table when it is parsed by the OPP framework.
>>
>> Signed-off-by: Sricharan R 
>> ---
>>  .../devicetree/bindings/cpufreq/krait-cpufreq.txt  | 363 
>> +
>>  1 file changed, 363 insertions(+)
>>  create mode 100644 
>> Documentation/devicetree/bindings/cpufreq/krait-cpufreq.txt
>>
>> diff --git a/Documentation/devicetree/bindings/cpufreq/krait-cpufreq.txt 
>> b/Documentation/devicetree/bindings/cpufreq/krait-cpufreq.txt
>> new file mode 100644
>> index 000..e7351f7
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/cpufreq/krait-cpufreq.txt
>> @@ -0,0 +1,363 @@
>> +QCOM KRAIT CPUFreq and OPP bindings
>> +===
>> +
>> +In Certain QCOM SoCs like ipq8064, apq8064, msm8960, msm8974
>> +that has KRAIT processors the voltage value of each OPP varies
>> +based on the silicon variant in use. Qualcomm Process Voltage Scaling Tables
>> +defines the voltage and current value based on the speed/pvs/version
>> +combination blown in the efuse. The qcom-cpufreq driver reads the efuse
>> +value from the SoC to provide the OPP framework with required information.
>> +This is used to determine the voltage and current value for each OPP of
>> +operating-points-v2 table when it is parsed by the OPP framework.
>> +
>> +Required properties:
>> +
>> +In 'cpus' nodes:
>> +- operating-points-v2: Phandle to the operating-points-v2 table to use.
>> +
>> +In 'operating-points-v2' table:
>> +- compatible: Should be
>> +- 'operating-points-v2-krait-cpu' for ipq8064, apq8064, msm8960,
>> +  msm8974.
>> +- nvmem-cells: A phandle pointing to a nvmem-cells node representing the
>> +efuse registers that has information about the
>> +speedbin/pvs/version that is used to select the right
>> +voltage/current value pair. Note that the length field of the
>> +nvmem-cell is used to differentiate between format 'A' or 'B'
>> +efuse settings. len of '4' bytes is for format 'A' and '8'
>> +bytes for format 'B'. Please refer the for nvmem-cells
>> +bindings Documentation/devicetree/bindings/nvmem/nvmem.txt
>> +and also examples below for both the cases.
>> +Example 1:
>> +-
>> +
>> +/* For arch/arm/boot/dts/apq8064.dtsi --> format 'A' */
>> +cpus {
>> +#address-cells = <1>;
>> +#size-cells = <0>;
>> +
>> +CPU0: cpu@0 {
>> +compatible = "qcom,krait";
>> +enable-method = "qcom,kpss-acc-v1";
>> +device_type = "cpu";
>> +reg = <0>;
>> +next-level-cache = <&L2>;
>> +qcom,acc = <&acc0>;
>> +qcom,saw = <&saw0>;
>> +cpu-idle-states = <&CPU_SPC>;
>> +operating-points-v2 = <&cpu_opp_table>;
>> +};
>> +};
>> +
>> +qfprom: qfprom@70 {
>> +  compatible  = "qcom,qfprom";
>> +  reg = <0x0070 0x1000>;
>> +  #address-cells  = <1>;
>> +  #size-cells = <1>;
>> +  ranges;
>> +  pvs_efuse: pvs {
>> +reg = <0xc0 0x4>;
>> +};
>> +};
>> +
>> +cpu_opp_table: opp-table {
>> +compatible = "operating-points-v2-krait-cpu";
>> +nvmem-cells = <&pvs_efuse>;
>> +
>> +/*
>> + * Missing opp-shared property means CPUs switch DVFS states
>> + * independently.
>> + */
>> +
>> +   opp-91800 {
>> +opp-hz = /bits/ 64 <91800>;
>> +opp-microvolt-speed0-pvs0-v0 = <110>;
> 
> Where is this property defined? I'm not that happy with it, but don't 
> have a better suggestion. Maybe make pvsN be an array of values with 0 
> for any skipped indexes? The '-v0' seems pointless. 
> 

 'opp-microvolt' is the property that comes from OPP-V2 bindings and rest
  of the string "speed%s-pvs%s-v%s" gets concatenated to the string
  by the cpufreq driver using dev_pm_opp_set_prop_name api. So all
  the three speed,pvs,v (version) come from efuse and can vary.
  Just that in the data so far, v is always '0'.

>> +opp-microvolt-speed0-pvs1-v0 = <105>;
>> +opp-microvolt-speed0-pvs3-v0 = <100>;
>> +opp-microvolt-speed0-pvs4-v0 = <975000>;
>> +opp-microvolt-speed1-pvs0-v0 = <1025000>;
>> + 

Re: [RFC PATCH 4/7] kconfig: support new special property shell=

2018-02-08 Thread Ulf Magnusson
On Fri, Feb 09, 2018 at 01:19:09AM +0900, Masahiro Yamada wrote:
> This works with bool, int, hex, string types.
> 
> For bool, the symbol is set to 'y' or 'n' depending on the exit value
> of the command.
> 
> For int, hex, string, the symbol is set to the value to the stdout
> of the command. (only the first line of the stdout)
> 
> The following shows how to write this and how it works.
> 
> (example Kconfig)--
> config srctree
> string
> option env="srctree"
> 
> config CC
> string
> option env="CC"
> 
> config CC_HAS_STACKPROTECTOR
> bool
> option shell="$CC -Werror -fstack-protector -c -x c /dev/null"
> 
> config CC_HAS_STACKPROTECTOR_STRONG
> bool
> option shell="$CC -Werror -fstack-protector-strong -c -x c /dev/null"
> 
> config CC_VERSION
> int
> option shell="$srctree/scripts/gcc-version.sh $CC | sed 's/^0*//'"
> help
>   gcc-version.sh returns 4 digits number. Unfortunately, the preceding
>   zero would cause 'number is invalid'.  Cut it off.
> 
> config CC_IS_CLANG
> bool
> option shell="$CC --version | grep -q clang"
> 
> config CC_IS_GCC
> bool
> option shell="$CC --version | grep -q gcc"
> -
> 
>   $ make alldefconfig
>   scripts/kconfig/conf  --alldefconfig Kconfig
>   #
>   # configuration written to .config
>   #
>   $ cat .config
>   #
>   # Automatically generated file; DO NOT EDIT.
>   # Linux Kernel Configuration
>   #
>   CONFIG_CC_HAS_STACKPROTECTOR=y
>   CONFIG_CC_HAS_STACKPROTECTOR_STRONG=y
>   CONFIG_CC_VERSION=504
>   # CONFIG_CC_IS_CLANG is not set
>   CONFIG_CC_IS_GCC=y
> 
> Suggested-by: Linus Torvalds 
> Signed-off-by: Masahiro Yamada 

I know this is just an RFC/incomplete, but in case it's helpful:

> ---
> 
>  scripts/kconfig/expr.h |  1 +
>  scripts/kconfig/kconf_id.c |  1 +
>  scripts/kconfig/lkc.h  |  1 +
>  scripts/kconfig/menu.c |  3 ++
>  scripts/kconfig/symbol.c   | 74 
> ++
>  5 files changed, 80 insertions(+)
> 
> diff --git a/scripts/kconfig/expr.h b/scripts/kconfig/expr.h
> index c16e82e..83029f92 100644
> --- a/scripts/kconfig/expr.h
> +++ b/scripts/kconfig/expr.h
> @@ -183,6 +183,7 @@ enum prop_type {
>   P_IMPLY,/* imply BAR */
>   P_RANGE,/* range 7..100 (for a symbol) */
>   P_ENV,  /* value from environment variable */
> + P_SHELL,/* shell command */
>   P_SYMBOL,   /* where a symbol is defined */
>  };
>  
> diff --git a/scripts/kconfig/kconf_id.c b/scripts/kconfig/kconf_id.c
> index 3ea9c5f..0db9d1c 100644
> --- a/scripts/kconfig/kconf_id.c
> +++ b/scripts/kconfig/kconf_id.c
> @@ -34,6 +34,7 @@ static struct kconf_id kconf_id_array[] = {
>   { "defconfig_list", T_OPT_DEFCONFIG_LIST,   TF_OPTION },
>   { "env",T_OPT_ENV,  TF_OPTION },
>   { "allnoconfig_y",  T_OPT_ALLNOCONFIG_Y,TF_OPTION },
> + { "shell",  T_OPT_SHELL,TF_OPTION },
>  };
>  
>  #define KCONF_ID_ARRAY_SIZE (sizeof(kconf_id_array)/sizeof(struct kconf_id))
> diff --git a/scripts/kconfig/lkc.h b/scripts/kconfig/lkc.h
> index 4e23feb..8d05042 100644
> --- a/scripts/kconfig/lkc.h
> +++ b/scripts/kconfig/lkc.h
> @@ -60,6 +60,7 @@ enum conf_def_mode {
>  #define T_OPT_DEFCONFIG_LIST 2
>  #define T_OPT_ENV3
>  #define T_OPT_ALLNOCONFIG_Y  4
> +#define T_OPT_SHELL  5
>  
>  struct kconf_id {
>   const char *name;
> diff --git a/scripts/kconfig/menu.c b/scripts/kconfig/menu.c
> index 9922285..6254dfb 100644
> --- a/scripts/kconfig/menu.c
> +++ b/scripts/kconfig/menu.c
> @@ -216,6 +216,9 @@ void menu_add_option(int token, char *arg)
>   case T_OPT_ENV:
>   prop_add_env(arg);
>   break;
> + case T_OPT_SHELL:
> + prop_add_shell(arg);
> + break;
>   case T_OPT_ALLNOCONFIG_Y:
>   current_entry->sym->flags |= SYMBOL_ALLNOCONFIG_Y;
>   break;
> diff --git a/scripts/kconfig/symbol.c b/scripts/kconfig/symbol.c
> index 893eae6..02ac4f4 100644
> --- a/scripts/kconfig/symbol.c
> +++ b/scripts/kconfig/symbol.c
> @@ -4,6 +4,7 @@
>   */
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1370,6 +1371,8 @@ const char *prop_get_type_name(enum prop_type type)
>   return "prompt";
>   case P_ENV:
>   return "env";
> + case P_SHELL:
> + return "shell";
>   case P_COMMENT:
>   return "comment";
>   case P_MENU:
> @@ -1420,3 +1423,74 @@ static void prop_add_env(const char *env)
>   else
>   menu_warn(current_entry, "environment variable %s undefined", 
> env);
>  }
> +
> +static void prop_add_shell(const char *cmd)
> +{
> + struct symbol *sym, *sym2;
> + struct property *prop;
> + char *expanded_cmd;
> + FILE 

Re: [PATCH 17/18] tracing: Add indirect to indirect access for function based events

2018-02-08 Thread Namhyung Kim
On Fri, Feb 02, 2018 at 06:05:15PM -0500, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" 
> 
> Allow the function based events to retrieve not only the parameters offsets,
> but also get data from a pointer within a parameter structure. Something
> like:
> 
>  # echo 'ip_rcv(string skdev+16[0][0] | x8[6] skperm+16[0]+558)' > 
> function_events
> 
>  # echo 1 > events/functions/ip_rcv/enable
>  # cat trace
> -0 [003] ..s3   310.626391: 
> __netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
> -0 [003] ..s3   310.626400: 
> __netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
> -0 [003] ..s3   312.183775: 
> __netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
> -0 [003] ..s3   312.184329: 
> __netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
> -0 [003] ..s3   312.303895: 
> __netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
> -0 [003] ..s3   312.304610: 
> __netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
> -0 [003] ..s3   312.471980: 
> __netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
> -0 [003] ..s3   312.472908: 
> __netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
> -0 [003] ..s3   313.135804: 
> __netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
> 
> That is, we retrieved the net_device of the sk_buff and displayed its name
> and perm_addr info.
> 
>   sk->dev->name, sk->dev->perm_addr
> 
> Signed-off-by: Steven Rostedt (VMware) 
> ---

[SNIP]
> +static unsigned long process_redirects(struct func_arg *arg, unsigned long 
> val,
> +char *buf)
> +{
> + struct func_arg_redirect *redirect;
> + int ret;
> +
> + if (arg->indirect) {
> + ret = probe_kernel_read(buf, (void *)val, sizeof(long));
> + if (ret)
> + return 0;
> + val = *(unsigned long *)buf;
> + }
> +
> + list_for_each_entry(redirect, &arg->redirects, list) {
> + val += redirect->index;
> + if (redirect->indirect) {
> + val += (redirect->indirect ^ INDIRECT_FLAG);
> + ret = probe_kernel_read(buf, (void *)val, sizeof(long));
> + if (ret)
> + return 0;
> + }
> + }
> + return val;
> +}
> +
> +static long long __get_arg(struct func_arg *arg, unsigned long long val)
>  {
>   char buf[8];
>   int ret;
>  
>   val += arg->index;
>  
> - if (!arg->indirect)
> - return val;
> + if (arg->indirect)
> + val += (arg->indirect ^ INDIRECT_FLAG);
>  
> - val = val + (arg->indirect ^ INDIRECT_FLAG);
> + if (!list_empty(&arg->redirects))
> + val = process_redirects(arg, val, buf);
> +
> + if (!val)
> + return 0;
>  
>   /* Arrays and strings do their own indirect reads */
> - if (arg->array || arg->func_type == FUNC_TYPE_string)
> + if (!arg->indirect || arg->array || arg->func_type == FUNC_TYPE_string)
>   return val;

It seems the indirect is processed twice with redirects.  Consider
"x64 foo[0]+4", the process_redirects() will call probe_kernel_read()
and then here again.

Thanks,
Namhyung


>  
>   ret = probe_kernel_read(buf, (void *)val, arg->size);
> @@ -1162,6 +1246,7 @@ static void func_event_seq_stop(struct seq_file *m, 
> void *v)
>  static int func_event_seq_show(struct seq_file *m, void *v)
>  {
>   struct func_event *func_event = v;
> + struct func_arg_redirect *redirect;
>   struct func_arg *arg;
>   bool comma = false;
>   int last_arg = 0;
> @@ -1190,6 +1275,13 @@ static int func_event_seq_show(struct seq_file *m, 
> void *v)
>   seq_printf(m, "[%ld]",
>  (arg->indirect ^ INDIRECT_FLAG) / 
> arg->size);
>   }
> + list_for_each_entry(redirect, &arg->redirects, list) {
> + if (redirect->index)
> + seq_printf(m, "+%ld", redirect->index);
> + if (redirect->indirect)
> + seq_printf(m, "[%d]",
> +(redirect->indirect ^ INDIRECT_FLAG) 
> / arg->size);
> + }
>   }
>   seq_puts(m, ")\n");
>  
> -- 
> 2.15.1
> 
> 


Re: [alsa-devel] [PATCH] ASoC: Intel: Skylake: make function skl_clk_round_rate static

2018-02-08 Thread Vinod Koul
On Thu, Feb 08, 2018 at 02:35:30PM +, Colin King wrote:
> From: Colin Ian King 
> 
> The function skl_clk_round_rate is local to the source and does not
> need to be in global scope, so make it static.
> 
> Cleans up sparse warning:
> sound/soc/intel/skylake/skl-ssp-clk.c:250:6: warning: symbol
> 'skl_clk_round_rate' was not declared. Should it be static?

Acked-By: Vinod Koul 

-- 
~Vinod


tg3 crashes under high load, when using 100Mbits

2018-02-08 Thread Kai Heng Feng

Hi Broadcom folks,

We are now enabling a new platform with tg3 nic, unfortunately we observed  
the bug [1] that dated back to 2015.
I tried commit 4419bb1cedcd ("tg3: Add workaround to restrict 5762 MRRS to  
2048”) but it does’t work.


Do you have any idea how to solve the issue?

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1447664

Kai-Heng



[PATCH v2] x86/kvm/vmx: Don't halt vcpu when L1 is injecting events to L2

2018-02-08 Thread Chao Gao
Although L2 is in halt state, it will be in the active state after
VM entry if the VM entry is vectoring according to SDM 26.6.2 Activity
State. Halting the vcpu here means the event won't be injected to L2
and this decision isn't reported to L1. Thus L0 drops an event that
should be injected to L2.

Cc: Liran Alon 
Signed-off-by: Chao Gao 
---
Changes in v2:
 - Remove VID stuff. Only handle event injection in this patch.
---
 arch/x86/kvm/vmx.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index bb5b488..42f39d9 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -10985,7 +10985,12 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool 
launch)
if (ret)
return ret;
 
-   if (vmcs12->guest_activity_state == GUEST_ACTIVITY_HLT)
+   /*
+* If we're entering a halted L2 vcpu and the L2 vcpu won't be woken
+* by event injection, halt vcpu for optimization.
+*/
+   if ((vmcs12->guest_activity_state == GUEST_ACTIVITY_HLT) &&
+   !(vmcs12->vm_entry_intr_info_field & VECTORING_INFO_VALID_MASK))
return kvm_vcpu_halt(vcpu);
 
vmx->nested.nested_run_pending = 1;
-- 
1.9.1



Re: [PATCH v5 1/3] arm64/ras: support sea error recovery

2018-02-08 Thread gengdongjiu


On 2018/2/8 3:03, James Morse wrote:
> Hi Xie XiuQi,
> 
> On 30/01/18 19:19, James Morse wrote:
>> On 26/01/18 12:31, Xie XiuQi wrote:
>>> With ARM v8.2 RAS Extension, SEA are usually triggered when memory errors
>>> are consumed. According to the existing process, errors occurred in the
>>> kernel, leading to direct panic, if it occurred the user-space, we should
>>> just kill process.
>>>
>>> But there is a class of error, in fact, is not necessary to kill
>>> process, you can recover and continue to run the process. Such as
>>> the instruction data corrupted, where the memory page might be
>>> read-only, which is has not been modified, the disk might have the
>>> correct data, so you can directly drop the page, ant reload it when
>>> necessary.
>>
>> With firmware-first support, we do all this...
>>
>>
>>> So this patchset is just try to solve such problem: if the error is
>>> consumed in user-space and the error occurs on a clean page, you can
>>> directly drop the memory page without killing process.
>>>
>>> If the corrupted page is clean, just dropped it and return to user-space
>>> without side effects. And if corrupted page is dirty, memory_failure()
>>> will send SIGBUS with code=BUS_MCEERR_AR. While without this patchset,
>>> do_sea() will just send SIGBUS, so the process was killed in the same place.
>>
>> ... but this happens too. I agree its something we should fix, but I don't 
>> think
>> this is the best way to do it.
>>
>> This series is pulling the memory-failure-queue details back into the 
>> arch-code
>> to build a second list, that gets processed as extra work when we return to
>> user-space.
>>
>>
>> The root of the issue is ghes_notify_sea() claims the notification as 
>> something
>> APEI has dealt with, ... but it hasn't done it yet. The signals will be
>> generated by something currently stuck in a queue. (Evidently x86 doesn't 
>> handle
>> synchronous errors like this using firmware-first).
>>
>> I think a smaller fix is to give the queues that may be holding the
>> memory_failure() work a kick as part of the code that calls 
>> ghes_notify_sea().
>> This means that by the time we return to do_sea() ghes_notify_sea()'s claim 
>> that
>> APEI has dealt with it is true as any generated signals are pending. We can 
>> then
>> skip the existing SIGBUS generation code.
>>
>>
>>> Because memory_failure() may sleep, we can not call it directly in SEA
>>
>> (this one is more serious, I've attempted to fix it by moving all NMI-like
>> GHES-notifications to use the estatus queue).
>>
>>
>>> exception context. So we saved faulting physical address associated with
>>> a process in the ghes handler and set __TIF_SEA_NOTIFY. When we return
>>> from SEA exception context and get into do_notify_resume() before the
>>> process running, we could check it and call memory_failure() to do
>>> recovery.
>>
>>> It's safe, because we are in process context.
>>
>> I think this is the trick. When we take a Synchronous-external-abort out of
>> userspace, we're in process context too. We can add helpers to drain the
>> memory_failure_queue which can be called when do_sea() when we know we're
>> preemptible and interrupts-et-al are unmasked.
> 
> Something like... base on [0], in arch/arm64/kernel/acpi.c:
> -%<-
> int apei_claim_sea(struct pt_regs *regs)
> {
> int cpu;
> int err = -ENOENT;
> unsigned long current_flags = arch_local_save_flags();
> unsigned long interrupted_flags = current_flags;
> 
> if (!IS_ENABLED(CONFIG_ACPI_APEI_SEA))
> return err;
> 
> if (regs)
> interrupted_flags = regs->pstate;
> 
> /*
>  * APEI expects an NMI-like notification to always be called
>  * in NMI context.
>  */
> local_daif_restore(DAIF_ERRCTX);
> nmi_enter();
> err = ghes_notify_sea();
> cpu = smp_processor_id();
> nmi_exit();
> 
> /*
>  * APEI NMI-like notifications are deferred to irq_work. Unless
>  * we interrupted irqs-masked code, we can do that now.
>  */
> if (!err) {
> if (!arch_irqs_disabled_flags(interrupted_flags)) {
> local_daif_restore(DAIF_PROCCTX_NOIRQ);
> irq_work_run();
> } else {
> err = -EINPROGRESS;
> }
> }
> 
> local_daif_restore(current_flags);
> 
> if (IS_ENABLED(CONFIG_ACPI_APEI_MEMORY_FAILURE) && !err) {
> /*
>  * Memory failure work is scheduled on the local CPU.
>  * If we interrupted userspace, or are in process context
>  * we can do that now.
>  */
> if ((regs && !user_mode(regs)) || !preemptible())
> err = -EINPROGRESS;
> else
> memory_failure_queue_kick(cpu);
> 

Re: [PATCH 1/2] zsmalloc: introduce zs_huge_object() function

2018-02-08 Thread Sergey Senozhatsky
On (02/08/18 20:10), Matthew Wilcox wrote:
> > > > +/*
> > > > + * Check if the object's size falls into huge_class area. We must take
> > > > + * ZS_HANDLE_SIZE into account and test the actual size we are going to
> > > > + * use up. zs_malloc() unconditionally adds handle size before it 
> > > > performs
> > > > + * size_class lookup, so we may endup in a huge class yet 
> > > > zs_huge_object()
> > > > + * returned 'false'.
> > > > + */
> > > 
> > > Can you please reformat this comment as kernel-doc?
> > 
> > Is this - Documentation/doc-guide/kernel-doc.rst - the right thing
> > to use as a reference?
> 
> Yes.  I just sent a revision to it that makes it (I think) a little
> easier to read.  Try this version:

That's helpful, thanks! Will take a look and re-spin the patch.

-ss


YOUR FUND TRANSFER..

2018-02-08 Thread Mr Femi
This mail is been writing to you because we have come to understand that
you have lost a lot of money all because you want to receive your fund
well note that all that have been put to a stop as the federal government of
Nigeria has promised to assist you with the sum of $5million in other to
compensate you and all you have to do is fill the below information s.

1 full name

2 home phone and cell phone number

3 occupation

4 amount that was lost by you

Send this and get back at once.

Warm regards

Femi


Re: [PATCH] mm: thp: fix potential clearing to referenced flag in page_idle_clear_pte_refs_one()

2018-02-08 Thread Yang Shi



On 2/8/18 8:33 PM, Kirill A. Shutemov wrote:

On Thu, Feb 08, 2018 at 02:39:26PM -0800, Andrew Morton wrote:

On Tue,  6 Feb 2018 08:06:36 +0800 Yang Shi  wrote:


For PTE-mapped THP, the compound THP has not been split to normal 4K
pages yet, the whole THP is considered referenced if any one of sub
page is referenced.

When walking PTE-mapped THP by pvmw, all relevant PTEs will be checked
to retrieve referenced bit. But, the current code just returns the
result of the last PTE. If the last PTE has not referenced, the
referenced flag will be cleared.

So, here just break pvmw walk once referenced PTE is found if the page
is a part of THP.

...

--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -67,6 +67,14 @@ static bool page_idle_clear_pte_refs_one(struct page *page,
if (pvmw.pte) {
referenced = ptep_clear_young_notify(vma, addr,
pvmw.pte);
+   /*
+* For PTE-mapped THP, one sub page is referenced,
+* the whole THP is referenced.
+*/
+   if (referenced && PageTransCompound(pvmw.page)) {
+   page_vma_mapped_walk_done(&pvmw);
+   break;
+   }

This means that the function will no longer clear the referenced bits
in all the ptes.  What effect does this have and should we document
this in some fashion?

Yeah, the patch is wrong. We need to get all ptes for THP cleared.

What about something like this instead (untested):


Thanks, Kirill. It looks correct. All ptes should be cleared.

I'm going to prepare v2 patch.

Regards,
Yang



diff --git a/mm/page_idle.c b/mm/page_idle.c
index 0a49374e6931..6876522c9dce 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -65,10 +65,10 @@ static bool page_idle_clear_pte_refs_one(struct page *page,
 while (page_vma_mapped_walk(&pvmw)) {
 addr = pvmw.address;
 if (pvmw.pte) {
-   referenced = ptep_clear_young_notify(vma, addr,
+   referenced |= ptep_clear_young_notify(vma, addr,
 pvmw.pte);
 } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
-   referenced = pmdp_clear_young_notify(vma, addr,
+   referenced |= pmdp_clear_young_notify(vma, addr,
 pvmw.pmd);
 } else {
 /* unexpected pmd-mapped page? */




Re: [PATCH] mm: thp: fix potential clearing to referenced flag in page_idle_clear_pte_refs_one()

2018-02-08 Thread Kirill A. Shutemov
On Thu, Feb 08, 2018 at 02:39:26PM -0800, Andrew Morton wrote:
> On Tue,  6 Feb 2018 08:06:36 +0800 Yang Shi  
> wrote:
> 
> > For PTE-mapped THP, the compound THP has not been split to normal 4K
> > pages yet, the whole THP is considered referenced if any one of sub
> > page is referenced.
> > 
> > When walking PTE-mapped THP by pvmw, all relevant PTEs will be checked
> > to retrieve referenced bit. But, the current code just returns the
> > result of the last PTE. If the last PTE has not referenced, the
> > referenced flag will be cleared.
> > 
> > So, here just break pvmw walk once referenced PTE is found if the page
> > is a part of THP.
> > 
> > ...
> >
> > --- a/mm/page_idle.c
> > +++ b/mm/page_idle.c
> > @@ -67,6 +67,14 @@ static bool page_idle_clear_pte_refs_one(struct page 
> > *page,
> > if (pvmw.pte) {
> > referenced = ptep_clear_young_notify(vma, addr,
> > pvmw.pte);
> > +   /*
> > +* For PTE-mapped THP, one sub page is referenced,
> > +* the whole THP is referenced.
> > +*/
> > +   if (referenced && PageTransCompound(pvmw.page)) {
> > +   page_vma_mapped_walk_done(&pvmw);
> > +   break;
> > +   }
> 
> This means that the function will no longer clear the referenced bits
> in all the ptes.  What effect does this have and should we document
> this in some fashion?

Yeah, the patch is wrong. We need to get all ptes for THP cleared.

What about something like this instead (untested):

diff --git a/mm/page_idle.c b/mm/page_idle.c
index 0a49374e6931..6876522c9dce 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -65,10 +65,10 @@ static bool page_idle_clear_pte_refs_one(struct page *page,
while (page_vma_mapped_walk(&pvmw)) {
addr = pvmw.address;
if (pvmw.pte) {
-   referenced = ptep_clear_young_notify(vma, addr,
+   referenced |= ptep_clear_young_notify(vma, addr,
pvmw.pte);
} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
-   referenced = pmdp_clear_young_notify(vma, addr,
+   referenced |= pmdp_clear_young_notify(vma, addr,
pvmw.pmd);
} else {
/* unexpected pmd-mapped page? */
-- 
 Kirill A. Shutemov


Re: net: thunder: change q_len's type to handle max ring size

2018-02-08 Thread Sunil Kovvuri
On Fri, Feb 9, 2018 at 3:27 AM, Dean Nelson  wrote:
> On 02/08/2018 02:34 PM, David Miller wrote:
>>
>> From: Dean Nelson 
>> Date:
>>
>>> The Cavium thunder nicvf driver supports rx/tx rings of up to 65536
>>> entries per.
>>> The number of entires are stored in the q_len member of struct
>>> q_desc_mem. The
>>> problem is that q_len being a u16, results in 65536 becoming 0.
>>>
>>> In getting pointers to descriptors in the rings, the driver uses q_len
>>> minus 1
>>> as a mask after incrementing the pointer, in order to go back to the
>>> beginning
>>> and not go past the end of the ring.
>>>
>>> With the q_len set to 0 the mask is no longer correct and the driver does
>>> go
>>> beyond the end of the ring, causing various ills. Usually the first thing
>>> that
>>> shows up is a "NETDEV WATCHDOG: enP2p1s0f1 (nicvf): transmit queue 7
>>> timed out"
>>> warning.
>>>
>>> This patch remedies the problem by changing q_len to a u32.
>>>
>>> Signed-off-by: Dean Nelson 
>>
>>
>> Applied, thanks.
>
>
> Thank you!
>
>>
>> Another way to solve this could have been to encode that length
>> as "length - 1"
>
>
> True. I had pondered that, but felt that since changing q_len's type
> didn't add any length to the structure and that it was less impactful
> from a number-of-lines of code changed perspective, I'd opt for this
> route.
>
> Cavium, if you'd prefer this goes the route that Dave just mentioned,
> please let me know and I can make a new patch against what's been
> applied?

Thanks for fixing this and i think the current patch is fine.

Thanks,
Sunil.

>
> Thanks,
> Dean
>
>
>
>
>
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


Re: [RFC] Limit mappings to ten per page per process

2018-02-08 Thread Kirill A. Shutemov
On Thu, Feb 08, 2018 at 01:37:43PM -0800, Matthew Wilcox wrote:
> On Thu, Feb 08, 2018 at 12:21:00PM -0800, Matthew Wilcox wrote:
> > Now that I think about it, though, perhaps the simplest solution is not
> > to worry about checking whether _mapcount has saturated, and instead when
> > adding a new mmap, check whether this task already has it mapped 10 times.
> > If so, refuse the mapping.
> 
> That turns out to be quite easy.  Comments on this approach?

This *may* break some remap_file_pages() users.

And it may be rather costly for popular binaries. Consider libc.so.

-- 
 Kirill A. Shutemov


Re: [PATCH] ath9k: turn on btcoex_enable as default

2018-02-08 Thread Kai Heng Feng

Hi Felix,


On Feb 8, 2018, at 7:02 PM, Felix Fietkau  wrote:

On 2018-02-08 06:28, Kai-Heng Feng wrote:

Without btcoex_enable, WiFi activies make both WiFi and Bluetooth
unstable if there's a bluetooth connection.

Enable this option when bt_ant_diversity is disabled.

BugLink: https://bugs.launchpad.net/bugs/1746164
Signed-off-by: Kai-Heng Feng 

I think this might cause regressions on devices that don't have
bluetooth. This probably either needs more EEPROM checks, or something
to selectively enable it only on affected platforms.



I think it’s better not to use dmi_match. This issue should affect more  
ath9k.
And bluetooth peripherals are more than ever now, so it would be great to  
use BT out of the box.


Can you take a look at the bug link, maybe there are other things caused  
the erratic behavior that I didn’t notice?


Kai-Heng


- Felix


Re: [RFC PATCH 7/7] Test stackprotector options in Kconfig to kill CC_STACKPROTECTOR_AUTO

2018-02-08 Thread Masahiro Yamada
2018-02-09 3:30 GMT+09:00 Kees Cook :
> On Fri, Feb 9, 2018 at 3:19 AM, Masahiro Yamada
>  wrote:
>> Add CC_HAS_STACKPROTECTOR(_STRONG) and proper dependency.
>>
>> I re-arranged the choice values, _STRONG, _REGULAR, _NONE in this order
>> because the default of choice is the first visible symbol.
>> [...]
>> +# is this necessary?
>> +#ifeq ($(CONFIG_CC_STACKPROTECTOR_NONE),y)
>> +#KBUILD_CFLAGS += -fno-stack-protector
>> +#endif
>
> Yes, and also in the case of a broken stack protector, because some
> compilers enable stack protector by default, so if we've selected it
> to be NONE or detected it as broken, we need to force it off in the
> compiler.
>
>> +# TODO: run scripts/gcc-$(SRCARCH)_$(BITS)-has-stack-protector.sh from 
>> Kconfig
>
> FWIW, this is the part that I got stuck on.
> gcc-$(SRCARCH)_$(BITS)-has-stack-protector.sh depends on the KBUILD
> flags that got built up and detected up to this point in the Makefile,
> so I couldn't find a way to run it out of Kconfig since it didn't know
> what the KBUILD flags were yet.


SRCARCH is fixed when loading Kconfig files.

BITS is derived from CONFIG_64BIT.

config 64BIT
bool "64-bit kernel" if ARCH = "x86"
default ARCH != "i386"
---help---
  Say yes to build a 64-bit kernel - formerly known as x86_64
  Say no to build a 32-bit kernel - formerly known as i386


This is a more difficult part because users can toggle this option
from menuconfig, etc.

If this option is changed, the compiler options must be re-computed,
i.e. system() must be called again.

This is missing in my first draft.

I have not checked how slow it is.




>> +
>>  ifeq ($(cc-name),clang)
>>  KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,)
>>  KBUILD_CFLAGS += $(call cc-disable-warning, unused-variable)
>> diff --git a/arch/Kconfig b/arch/Kconfig
>> index 76c0b54..50723d8 100644
>> --- a/arch/Kconfig
>> +++ b/arch/Kconfig
>> @@ -538,10 +538,20 @@ config HAVE_CC_STACKPROTECTOR
>>   - its compiler supports the -fstack-protector option
>>   - it has implemented a stack canary (e.g. __stack_chk_guard)
>>
>> +config CC_HAS_STACKPROTECTOR
>> +   bool
>> +   option shell="$CC -Werror -fstack-protector -c -x c /dev/null"
>> +
>> +config CC_HAS_STACKPROTECTOR_STRONG
>> +   bool
>> +   option shell="$CC -Werror -fstack-protector-strong -c -x c /dev/null"
>
> I'm nervous we'll get tripped up here, since $CC may not include the
> right $(KBUILD_CPPFLAGS) and $(CC_OPTION_CFLAGS) as in cc-option, both
> of which are calculated during the Makefile run. But maybe it won't be
> a problem in actual use.


Right, I had noticed this is a problem, but not implemented yet.

At least, some basic compiler options must be imported into Kconfig.

Especially this is a problem for clang.
One clang executable is built with lots of
architecture back-ends.
So,
CLANG_TARGET:= --target=$(notdir $(CROSS_COMPILE:%-=%))
etc. is mandatory.


If I remember correctly, there existed some options
that depend on others.

I am not sure about the stackprotector case.



>> +
>> +config CC_STACKPROTECTOR
>> +   bool
>> +
>>  choice
>> prompt "Stack Protector buffer overflow detection"
>> depends on HAVE_CC_STACKPROTECTOR
>> -   default CC_STACKPROTECTOR_AUTO
>> help
>>   This option turns on the "stack-protector" GCC feature. This
>>   feature puts, at the beginning of functions, a canary value on
>> @@ -551,26 +561,10 @@ choice
>>   overwrite the canary, which gets detected and the attack is then
>>   neutralized via a kernel panic.
>>
>> -config CC_STACKPROTECTOR_NONE
>> -   bool "None"
>> -   help
>> - Disable "stack-protector" GCC feature.
>> -
>> -config CC_STACKPROTECTOR_REGULAR
>> -   bool "Regular"
>> -   help
>> - Functions will have the stack-protector canary logic added if they
>> - have an 8-byte or larger character array on the stack.
>> -
>> - This feature requires gcc version 4.2 or above, or a distribution
>> - gcc with the feature backported ("-fstack-protector").
>> -
>> - On an x86 "defconfig" build, this feature adds canary checks to
>> - about 3% of all kernel functions, which increases kernel code size
>> - by about 0.3%.
>> -
>>  config CC_STACKPROTECTOR_STRONG
>> bool "Strong"
>> +   depends on CC_HAS_STACKPROTECTOR_STRONG
>> +   select CC_STACKPROTECTOR
>> help
>>   Functions will have the stack-protector canary logic added in any
>>   of the following conditions:
>> @@ -588,11 +582,25 @@ config CC_STACKPROTECTOR_STRONG
>>   about 20% of all kernel functions, which increases the kernel code
>>   size by about 2%.
>>
>> -config CC_STACKPROTECTOR_AUTO
>> -   bool "Automatic"
>> +config CC_STACKPROTECTOR_REGULAR
>> +   bool "Regular"
>> +   depends on CC_HAS_STACKPROTECTOR
>> +   select CC_STACKPROTECTOR
>> +

Re: [RFC PATCH 2/4] softirq: Per vector deferment to workqueue

2018-02-08 Thread Mike Galbraith
On Thu, 2018-02-08 at 20:30 +, Dmitry Safonov wrote:
> On Thu, 2018-02-08 at 15:22 -0500, David Miller wrote:
> > From: Dmitry Safonov 
> > Date: Thu, 08 Feb 2018 20:14:55 +
> > 
> > > On Thu, 2018-02-08 at 13:45 -0500, David Miller wrote:
> > >> From: Sebastian Andrzej Siewior 
> > >> Date: Thu, 8 Feb 2018 18:44:52 +0100
> > >> 
> > >> > May I instead suggest to stick to ksoftirqd? So you run in
> > softirq
> > >> > context (after return from IRQ) and if takes too long, you
> > offload
> > >> the
> > >> > vector to ksoftirqd instead. You may want to play with the
> > metric
> > >> on
> > >> > which you decide when you want switch to ksoftirqd / account how
> > >> long a
> > >> > vector runs.
> > >> 
> > >> Having read over this stuff for the past few weeks this is how I
> > feel
> > >> as well.  Just make ksofbitrq do what we want (only execute the
> > >> overloaded softirq vectors).
> > >> 
> > >> The more I look at the workqueue stuff, the more complications and
> > >> weird behavioral artifacts we are getting for questionable gain.
> > > 
> > > What about creating several ksoftirqd threads per-cpu?
> > > Like I did with boot parameter to specify how many threads and
> > which
> > > softirqs to serve.
> > 
> > Why do we need more than one per cpu?
> 
> Ugh, yeah, I remember why I did it - I tried to reuse scheduler for
> each ksoftirqd thread to decide if it need to run now or later.
> That would give an admin a way to prioritise softirqs with nice.
> Not sure if it's a nice idea at all..

For RT that can be handy, but for the general case it's a waste of
cycles, so would want to be opt-in.

-Mike


Re: [PATCH 1/2] zsmalloc: introduce zs_huge_object() function

2018-02-08 Thread Matthew Wilcox
On Fri, Feb 09, 2018 at 11:55:20AM +0900, Sergey Senozhatsky wrote:
> On (02/08/18 18:30), Mike Rapoport wrote:
> [..]
> > > 
> > > +/*
> > > + * Check if the object's size falls into huge_class area. We must take
> > > + * ZS_HANDLE_SIZE into account and test the actual size we are going to
> > > + * use up. zs_malloc() unconditionally adds handle size before it 
> > > performs
> > > + * size_class lookup, so we may endup in a huge class yet 
> > > zs_huge_object()
> > > + * returned 'false'.
> > > + */
> > 
> > Can you please reformat this comment as kernel-doc?
> 
> Is this - Documentation/doc-guide/kernel-doc.rst - the right thing
> to use as a reference?

Yes.  I just sent a revision to it that makes it (I think) a little
easier to read.  Try this version:


Writing kernel-doc comments
===

The Linux kernel source files may contain structured documentation
comments in the kernel-doc format to describe the functions, types
and design of the code. It is easier to keep documentation up-to-date
when it is embedded in source files.

.. note:: The kernel-doc format is deceptively similar to javadoc,
   gtk-doc or Doxygen, yet distinctively different, for historical
   reasons. The kernel source contains tens of thousands of kernel-doc
   comments. Please stick to the style described here.

The kernel-doc structure is extracted from the comments, and proper
`Sphinx C Domain`_ function and type descriptions with anchors are
generated from them. The descriptions are filtered for special kernel-doc
highlights and cross-references. See below for details.

.. _Sphinx C Domain: http://www.sphinx-doc.org/en/stable/domains.html

Every function that is exported to loadable modules using
``EXPORT_SYMBOL`` or ``EXPORT_SYMBOL_GPL`` should have a kernel-doc
comment. Functions and data structures in header files which are intended
to be used by modules should also have kernel-doc comments.

It is good practice to also provide kernel-doc formatted documentation
for functions externally visible to other kernel files (not marked
``static``). We also recommend providing kernel-doc formatted
documentation for private (file ``static``) routines, for consistency of
kernel source code layout. This is lower priority and at the discretion
of the maintainer of that kernel source file.

How to format kernel-doc comments
-

The opening comment mark ``/**`` is used for kernel-doc comments. The
``kernel-doc`` tool will extract comments marked this way. The rest of
the comment is formatted like a normal multi-line comment with a column
of asterisks on the left side, closing with ``*/`` on a line by itself.

The function and type kernel-doc comments should be placed just before
the function or type being described in order to maximise the chance
that somebody changing the code will also change the documentation. The
overview kernel-doc comments may be placed anywhere at the top indentation
level.

Function documentation
--

The general format of a function and function-like macro kernel-doc comment is::

  /**
   * function_name() - Brief description of function.
   * @arg1: Describe the first argument.
   * @arg2: Describe the second argument.
   *One can provide multiple line descriptions
   *for arguments.
   *
   * A longer description, with more discussion of the function function_name()
   * that might be useful to those using or modifying it. Begins with an
   * empty comment line, and may include additional embedded empty
   * comment lines.
   *
   * The longer description may have multiple paragraphs.
   *
   * Context: Describes whether the function can sleep, what locks it takes,
   *  releases, or expects to be held. It can extend over multiple
   *  lines.
   * Return: Describe the return value of foobar.
   *
   * The return value description can also have multiple paragraphs, and should
   * be placed at the end of the comment block.
   */

The brief description following the function name may span multiple lines, and
ends with an argument description, a blank comment line, or the end of the
comment block.

Function parameters
~~~

Each function argument should be described in order, immediately following
the short function description.  Do not leave a blank line between the
function description and the arguments, nor between the arguments.

Each ``@argument:`` description may span multiple lines.

.. note::

   If the ``@argument`` description has multiple lines, the continuation
   of the description should start at the same column as the previous line::

  * @argument: some long description
  *that continues on next lines

   or::

  * @argument:
  * some long description
  * that continues on next lines

If a function has a variable number of arguments, its description should
be written in kernel-doc notation as::

  * @...: description

Function

Re: Regression after commit 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly")

2018-02-08 Thread Matthew Wilcox
On Thu, Feb 08, 2018 at 03:20:04PM -0800, Matthew Wilcox wrote:
> So ... we could enable ZONE_DMA32 on 32-bit architectures.  I don't know
> what side-effects that might have; it's clearly only been tested on 64-bit
> architectures so far.
> 
> It might be best to just revert 19809c2da28a and the follow-on 704b862f9efd.

Alternatively, try this.  It passes in GFP_DMA32 from vmalloc_32,
regardless of whether ZONE_DMA32 exists or not.  If ZONE_DMA32 doesn't
exist, then we clear it in __vmalloc_area_node(), after using it to
determine that we shouldn't set __GFP_HIGHMEM.

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 673942094328..91e8a95123c4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1669,10 +1669,11 @@ static void *__vmalloc_area_node(struct vm_struct 
*area, gfp_t gfp_mask,
struct page **pages;
unsigned int nr_pages, array_size, i;
const gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
-   const gfp_t alloc_mask = gfp_mask | __GFP_NOWARN;
-   const gfp_t highmem_mask = (gfp_mask & (GFP_DMA | GFP_DMA32)) ?
-   0 :
-   __GFP_HIGHMEM;
+   gfp_t alloc_mask = gfp_mask | __GFP_NOWARN;
+   if (!(alloc_mask & GFP_ZONEMASK))
+   alloc_mask |= __GFP_HIGHMEM;
+   if (!IS_ENABLED(CONFIG_ZONE_DMA32) && (alloc_mask & __GFP_DMA32))
+   alloc_mask &= ~__GFP_DMA32;
 
nr_pages = get_vm_area_size(area) >> PAGE_SHIFT;
array_size = (nr_pages * sizeof(struct page *));
@@ -1680,7 +1681,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, 
gfp_t gfp_mask,
area->nr_pages = nr_pages;
/* Please note that the recursion is strictly bounded. */
if (array_size > PAGE_SIZE) {
-   pages = __vmalloc_node(array_size, 1, nested_gfp|highmem_mask,
+   pages = __vmalloc_node(array_size, 1, nested_gfp|__GFP_HIGHMEM,
PAGE_KERNEL, node, area->caller);
} else {
pages = kmalloc_node(array_size, nested_gfp, node);
@@ -1696,9 +1697,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, 
gfp_t gfp_mask,
struct page *page;
 
if (node == NUMA_NO_NODE)
-   page = alloc_page(alloc_mask|highmem_mask);
+   page = alloc_page(alloc_mask);
else
-   page = alloc_pages_node(node, alloc_mask|highmem_mask, 
0);
+   page = alloc_pages_node(node, alloc_mask, 0);
 
if (unlikely(!page)) {
/* Successfully allocated i pages, free them in 
__vunmap() */
@@ -1706,7 +1707,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, 
gfp_t gfp_mask,
goto fail;
}
area->pages[i] = page;
-   if (gfpflags_allow_blocking(gfp_mask|highmem_mask))
+   if (gfpflags_allow_blocking(gfp_mask))
cond_resched();
}
 
@@ -1942,12 +1943,10 @@ void *vmalloc_exec(unsigned long size)
  NUMA_NO_NODE, __builtin_return_address(0));
 }
 
-#if defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA32)
-#define GFP_VMALLOC32 GFP_DMA32 | GFP_KERNEL
-#elif defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA)
+#if defined(CONFIG_64BIT) && !defined(CONFIG_ZONE_DMA32)
 #define GFP_VMALLOC32 GFP_DMA | GFP_KERNEL
 #else
-#define GFP_VMALLOC32 GFP_KERNEL
+#define GFP_VMALLOC32 GFP_DMA32 | GFP_KERNEL
 #endif
 
 /**


Re: ocxl: fix signed comparison with less than zero

2018-02-08 Thread Michael Ellerman
On Tue, 2018-01-30 at 15:11:44 UTC, Colin King wrote:
> From: Colin Ian King 
> 
> Currently the comparison of used < 0 is always false because
> uses is a size_t. Fix this by making used a ssize_t type.
> 
> Detected by Coccinelle:
> drivers/misc/ocxl/file.c:320:6-10: WARNING: Unsigned expression
> compared with zero: used < 0
> 
> Fixes: 5ef3166e8a32 ("ocxl: Driver code for 'generic' opencapi devices")
> Signed-off-by: Colin Ian King 
> Acked-by: Andrew Donnellan 
> Acked-by: Frederic Barrat 

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/dedab7f0d3137441a97fe7cf9b9ca5

cheers


HOPE TO HEAR FROM YOU

2018-02-08 Thread Miss Nadege
Hello dear how are you?

Nice to meet you,my name is Miss Nadege Yann, can we become friends? hope to 
hear from you so that we can know each other very well, love matters mostly in 
life,i will also send you my pictures and tell you more about myself, my email 
address is(missnade...@gmail.com)

waiting to hear from you soon.
Miss.Nadege Yann


Re: [PATCH] arm: dts: mt7623: enable all four available UARTs on bananapi-r2

2018-02-08 Thread Sean Wang
On Wed, 2018-02-07 at 17:01 +0100, Matthias Brugger wrote:
> 
> On 01/23/2018 09:51 AM, Sean Wang wrote:
> > On Sat, 2017-12-23 at 23:35 +0800, Sean Wang wrote:
> >> On Sat, 2017-12-23 at 08:52 +0100, Matthias Brugger wrote:
> >>>
> >>> On 12/22/2017 07:06 AM, sean.w...@mediatek.com wrote:
>  From: Sean Wang 
> 
>  On bpi-r2 board, totally there're four uarts which we usually called
>  uart[0-3] helpful to extend slow I/O devices. Among those ones, uart2 has
>  dedicated pin slot which is used to conolse log. uart[0-1] appear at the
>  40-pins connector and uart3 has no pinout, but just has test points (TP47
>  for TX and TP48 for RX, respectively) nearby uart2. Also, some missing
>  pinctrl is being complemented for those devices.
> 
>  Signed-off-by: Sean Wang 
>  ---
>   arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts | 26 
>  --
>   1 file changed, 24 insertions(+), 2 deletions(-)
> 
>  diff --git a/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts 
>  b/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts
>  index 7bf5aa2..64bf5db 100644
>  --- a/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts
>  +++ b/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts
>  @@ -409,6 +409,20 @@
>    ;
>   };do you like it or quite want me to remove the uart3 
>  node?
>   };
>  +
>  +uart2_pins_a: uart@2 {
>  +pins_dat {
>  +pinmux = ,
>  + ;
>  +};
>  +};
>  +
>  +uart3_pins_a: uart@3 {
>  +pins_dat {
>  +pinmux = ,
>  + ;
>  +};
>  +};
>   };
>   
>   &pwm {
>  @@ -454,16 +468,24 @@
>   &uart0 {
>   pinctrl-names = "default";
>   pinctrl-0 = <&uart0_pins_a>;
>  -status = "disabled";
>  +status = "okay";
>   };
>   
>   &uart1 {
>   pinctrl-names = "default";
>   pinctrl-0 = <&uart1_pins_a>;
>  -status = "disabled";
>  +status = "okay";
>   };
>   
>   &uart2 {
>  +pinctrl-names = "default";
>  +pinctrl-0 = <&uart2_pins_a>;
>  +status = "okay";
>  +};
>  +
>  +&uart3 {
>  +pinctrl-names = "default";
>  +pinctrl-0 = <&uart3_pins_a>;
>   status = "okay";
>   };
>   
> >>>
> >>> Why do we want to enable uart3 when there are only test points?
> >>> It is not very useful, or do I oversee something?
> >>>
> > 
> >> I have been listening to the sound from potential users of bpi-r2 to
> >> understand what assistance I have to provide to them. Something could
> >> be seen through [1] in the forum to know they had been trying hard to
> >> explore all available UARTs from the SoC in the last weeks. The patch
> >> should be really useful for these people and for the extra soldering
> >> it shouldn't become a problem for these makers.
> >>
> >> [1] http://forum.banana-pi.org/t/gpio-uart-not-the-debug-port/3748
> >>
> >>Sean 
> >>
> > 
> > Hi, Matthias
> > 
> > do you like it or quite want me to remove the uart3 node?
> > 
> > I can take it into account along with other pending dts changes in my
> > queue.
> > 
> 
> Sorry for the late answer.
> Do I understand correctly that uart3 is routed to TP47 and TP48, and these 
> test
> points are accessible through the SATA connector? Doesn't they break SATA 
> then?
> 

TP47 and TP48 are directly pins out from SoC, not through the SATA
connector.

> I think as they are only available through a non-documented test point, we
> shouldn't enable it.
> 

Okay, let's drop uart 3 setting here.

> Regards,
> Matthias




Re: [RFC 2/2] Introduce sysctl(s) for the migration costs

2018-02-08 Thread Mike Galbraith
On Thu, 2018-02-08 at 14:19 -0800, Rohit Jain wrote:
> This patch introduces the sysctl for sched_domain based migration costs.
> These in turn can be used for performance tuning of workloads.

With this patch, we trade 1 completely bogus constant (cost is really
highly variable) for 3, twiddling of which has zero effect unless you
trigger a domain rebuild afterward, which is neither mentioned in the
changelog, nor documented.

bogo-numbers++ is kinda hard to love.

-Mike


Re: [PATCH v2 14/16] arm64: dts: mt7622: add thermal and related nodes

2018-02-08 Thread Sean Wang
On Wed, 2018-02-07 at 12:43 +0100, Matthias Brugger wrote:
> 
> On 02/06/2018 10:53 AM, sean.w...@mediatek.com wrote:
> > From: Sean Wang 
> > 
> > add nodes for the thermal controller and associated thermal zone using
> > CPU as the cooling device for each trip point. In addition, add a fixup
> > for thermal_calibration on nvmem should be 12 bytes as the minimal
> > requirement.
> > 
> > Signed-off-by: Sean Wang 
> > ---
> >  arch/arm64/boot/dts/mediatek/mt7622.dtsi | 72 
> > +++-
> >  1 file changed, 71 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/boot/dts/mediatek/mt7622.dtsi 
> > b/arch/arm64/boot/dts/mediatek/mt7622.dtsi
> > index e6dd4f6..6cf67dd 100644
> > --- a/arch/arm64/boot/dts/mediatek/mt7622.dtsi
> > +++ b/arch/arm64/boot/dts/mediatek/mt7622.dtsi
> > @@ -12,6 +12,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  / {
> > compatible = "mediatek,mt7622";
> > @@ -75,6 +76,7 @@
> >  <&apmixedsys CLK_APMIXED_MAIN_CORE_EN>;
> > clock-names = "cpu", "intermediate";
> > operating-points-v2 = <&cpu_opp_table>;
> > +   #cooling-cells = <2>;
> > enable-method = "psci";
> > clock-frequency = <13>;
> > };
> > @@ -119,6 +121,58 @@
> > };
> > };
> >  
> > +   thermal-zones {
> > +   cpu_thermal: cpu-thermal {
> > +   polling-delay-passive = <1000>;
> > +   polling-delay = <1000>;
> > +
> > +   thermal-sensors = <&thermal 0>;
> > +
> > +   trips {
> > +   cpu_passive: cpu-passive {
> > +   temperature = <47000>;
> > +   hysteresis = <2000>;
> > +   type = "passive";
> > +   };
> > +
> > +   cpu_active: cpu-active {
> > +   temperature = <67000>;
> > +   hysteresis = <2000>;
> > +   type = "active";
> > +   };
> > +
> > +   cpu_hot: cpu-hot {
> > +   temperature = <87000>;
> > +   hysteresis = <2000>;
> > +   type = "hot";
> > +   };
> > +
> > +   cpu-crit {
> > +   temperature = <107000>;
> > +   hysteresis = <2000>;
> > +   type = "critical";
> > +   };
> > +   };
> > +
> > +   cooling-maps {
> > +   map0 {
> > +   trip = <&cpu_passive>;
> > +   cooling-device = <&cpu0 
> > THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> > +   };
> > +
> > +   map1 {
> > +   trip = <&cpu_active>;
> > +   cooling-device = <&cpu0 
> > THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> > +   };
> > +
> > +   map2 {
> > +   trip = <&cpu_hot>;
> > +   cooling-device = <&cpu0 
> > THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> > +   };
> > +   };
> > +   };
> > +   };
> > +
> > timer {
> > compatible = "arm,armv8-timer";
> > interrupt-parent = <&gic>;
> > @@ -201,7 +255,7 @@
> > #size-cells = <1>;
> >  
> > thermal_calibration: calib@198 {
> > -   reg = <0x198 0x8>;
> > +   reg = <0x198 0xc>;
> 
> Any reason why this is not part of patch 8/16?
> 

There's no strong reason wanting me to do that. patch 8 has contained a
lot of nodes and patch 16 is present just in v2. So, I felt it should be
a little bit easy that people reviews those patches if they are put into
separate patches. But, It's still fine to make them into one in the next
version.

> Regards,
> Matthias
> 




Re: [PATCH] cpufreq: schedutil: rate limits for SCHED_DEADLINE

2018-02-08 Thread Viresh Kumar
On 08-02-18, 18:01, Claudio Scordino wrote:
> When the SCHED_DEADLINE scheduling class increases the CPU utilization,
> we should not wait for the rate limit, otherwise we may miss some deadline.
> 
> Tests using rt-app on Exynos5422 have shown reductions of about 10% of 
> deadline
> misses for tasks with low RT periods.
> 
> The patch applies on top of the one recently proposed by Peter to drop the
> SCHED_CPUFREQ_* flags.
> 
> Signed-off-by: Claudio Scordino 
> CC: Rafael J . Wysocki 
> CC: Patrick Bellasi 
> CC: Dietmar Eggemann 
> CC: Morten Rasmussen 
> CC: Juri Lelli 
> CC: Viresh Kumar 
> CC: Vincent Guittot 
> CC: Todd Kjos 
> CC: Joel Fernandes 
> CC: linux...@vger.kernel.org
> CC: linux-kernel@vger.kernel.org
> ---
>  kernel/sched/cpufreq_schedutil.c | 15 ---
>  1 file changed, 12 insertions(+), 3 deletions(-)

So the previous commit was surely incorrect as it relied on comparing
frequencies instead of dl-util, and freq requirements could have even
changed due to CFS.

> diff --git a/kernel/sched/cpufreq_schedutil.c 
> b/kernel/sched/cpufreq_schedutil.c
> index b0bd77d..d8dcba2 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -74,7 +74,10 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
>  
>  / Governor internals ***/
>  
> -static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 
> time)
> +static bool sugov_should_update_freq(struct sugov_policy *sg_policy,
> +  u64 time,
> +  struct sugov_cpu *sg_cpu_old,
> +  struct sugov_cpu *sg_cpu_new)
>  {
>   s64 delta_ns;
>  
> @@ -111,6 +114,10 @@ static bool sugov_should_update_freq(struct sugov_policy 
> *sg_policy, u64 time)
>   return true;
>   }
>  
> + /* Ignore rate limit when DL increased utilization. */
> + if (sg_cpu_new->util_dl > sg_cpu_old->util_dl)
> + return true;
> +

Changing the frequency has a penalty, specially in the ARM world (and
that's where you are testing your stuff). I am worried that we will
have (corner) cases where we will waste a lot of time changing the
frequencies. For example (I may be wrong here), what if 10 small DL
tasks are queued one after the other? The util will keep on changing
and so will the frequency ? There may be more similar cases ?

Is it possible to (somehow) check here if the DL tasks will miss
deadline if we continue to run at current frequency? And only ignore
rate-limit if that is the case ?

>   delta_ns = time - sg_policy->last_freq_update_time;
>   return delta_ns >= sg_policy->freq_update_delay_ns;
>  }
> @@ -271,6 +278,7 @@ static void sugov_update_single(struct update_util_data 
> *hook, u64 time,
>   unsigned int flags)
>  {
>   struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, 
> update_util);
> + struct sugov_cpu sg_cpu_old = *sg_cpu;

Not really a big deal, but this structure is 80 bytes on ARM64, why
copy everything when what we need is just 8 bytes ?

-- 
viresh


Re: [lustre-devel] [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-08 Thread Oleg Drokin

> On Feb 8, 2018, at 10:10 PM, NeilBrown  wrote:
> 
> On Thu, Feb 08 2018, Oleg Drokin wrote:
> 
>>> On Feb 8, 2018, at 8:39 PM, NeilBrown  wrote:
>>> 
>>> On Tue, Aug 16 2016, James Simmons wrote:
>> 
>> my that’s an old patch
>> 
>>> 
> ...
>>> 
>>> Whoever converted it to "!strcmp()" inverted the condition.  This is a
>>> perfect example of why I absolutely *loathe* the "!strcmp()" construct!!
>>> 
>>> This causes many tests in the 'sanity' test suite to return
>>> -ENOMEM (that had me puzzled for a while!!).
>> 
>> huh? I am not seeing anything of the sort and I was running sanity
>> all the time until a recent pause (but going to resume).
> 
> That does surprised me - I reproduce it every time.
> I have two VMs running a SLE12-SP2 kernel with patches from
> lustre-release applied.  These are servers. They have 2 3G virtual disks
> each.
> I have two over VMs running current mainline.  These are clients.
> 
> I guess your 'recent pause' included between v4.15-rc1 (8e55b6fd0660)
> and v4.15-rc6 (a93639090a27) - a full month when lustre wouldn't work at
> all :-(

More than that, but I am pretty sure James Simmons is running tests all the 
time too
(he has a different config, I only have tcp).

>>> This seems to suggest that no-one has been testing the mainline linux
>>> lustre.
>>> It also seems to suggest that there is a good chance that there
>>> are other bugs that have crept in while no-one has really been caring.
>>> Given that the sanity test suite doesn't complete for me, but just
>>> hangs (in test_27z I think), that seems particularly likely.
>> 
>> Works for me, here’s a run from earlier today on 4.15.0:
> 
> Well that's encouraging .. I haven't looked into this one yet - I'm not
> even sure where to start.

m… debug logs for example (greatly neutered in staging tree, but still useful)?
try lctl dk and see what’s in there.

>> Instead the plan was to clean up the staging client into acceptable state,
>> move it out of staging, bring in all the missing features and then
>> drop the client (more or less) from the lustre-release.
> 
> That sounds like a great plan.  Any idea why it didn't happen?

Because meeting open-ended demands is hard and certain demands sound like
“throw away your X and rewrite it from scratch" (e.g. everything IB-related).

Certain things that sound useless (like the debug subsystem in Lustre)
is very useful when you have a 10k nodes in a cluster and need to selectively
pull stuff from a run to debug a complicated cross-node interaction.
I asked NFS people how do they do it and they don’t have anything that scales
and usually involves reducing the problem to a much smaller set of nodes first.

> It seems there is a lot of upstream work mixed in with the clean up, and
> I don't think that really helps anyone.

I don’t understand what you mean here.

> Is it at all realistic that the client might be removed from
> lustre-release?  That might be a good goal to work towards.

Assuming we can bring the whole functionality over - sure.

Of course there’d still be some separate development place and we would
need to create patches (new features?) for like SuSE and other distros
and for testing of server features, I guess, but that could just that -
a side branch somewhere I hope.

It’s not that we are super glad to chase every kernel vendors put out,
of course it would be much easier if the kernels already included
a very functional Lustre client.

>>> Might it make sense to instead start cleaning up the code in
>>> lustre-release so as to make it meet the upstream kernel standards.
>>> Then when the time is right, the kernel code can be moved *out* of
>>> lustre-release and *in* to linux.  Then development can continue in
>>> Linux (just like it does with other Linux filesystems).
>> 
>> While we can be cleaning lustre in lustre-release, there are some things
>> we cannot do as easily, e.g. decoupling Lustre client from the server.
>> Also it would not attract any reviews from all the janitor or
>> (more importantly) Al Viro and other people with a sharp eyes.
>> 
>>> An added bonus of this is that there is an obvious path to getting
>>> server support in mainline Linux.  The current situation of client-only
>>> support seems weird given how interdependent the two are.
>> 
>> Given the pushback Lustre client was given I have no hope Lustre server
>> will get into mainline in my lifetime.
> 
> Even if it is horrible it would be nice to have it in staging... I guess
> the changes required to ext4 prohibit that... I don't suppose it can be
> made to work with mainline ext4 in a reduced-functionality-and-performance
> way??

We support unpatched ZFS as a server too! ;)
(and if somebody invests the time into it, there was some half-baked btrfs
backend too I think).
That said nobody here believes in any success of pushing Lustre server into
mainline.
It would just be easier to push the whole server into userspace (And there
was a project like this in the past, now abandoned because i

Re: [RFC 1/2] sched: reduce migration cost between faster caches for idle_balance

2018-02-08 Thread Mike Galbraith
On Thu, 2018-02-08 at 14:19 -0800, Rohit Jain wrote:
> This patch makes idle_balance more dynamic as the sched_migration_cost
> is now accounted on a sched_domain level. This in turn is done in
> sd_init when we know what the topology relationships are.
> 
> For introduction sakes cost of migration within the same core is set as
> 0, across cores is 50 usec and across sockets is 500 usec. sysctl for
> these variables are introduced in patch 2.
> 
> Signed-off-by: Rohit Jain 
> ---
>  include/linux/sched/topology.h | 1 +
>  kernel/sched/fair.c| 6 +++---
>  kernel/sched/topology.c| 5 +
>  3 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> index cf257c2..bcb4db2 100644
> --- a/include/linux/sched/topology.h
> +++ b/include/linux/sched/topology.h
> @@ -104,6 +104,7 @@ struct sched_domain {
>   u64 max_newidle_lb_cost;
>   unsigned long next_decay_max_lb_cost;
>  
> + u64 sched_migration_cost;
>   u64 avg_scan_cost;  /* select_idle_sibling */
>  
>  #ifdef CONFIG_SCHEDSTATS
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2fe3aa8..61d3508 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8782,8 +8782,7 @@ static int idle_balance(struct rq *this_rq, struct 
> rq_flags *rf)
>*/
>   rq_unpin_lock(this_rq, rf);
>  
> - if (this_rq->avg_idle < sysctl_sched_migration_cost ||
> - !this_rq->rd->overload) {
> + if (!this_rq->rd->overload) {
>   rcu_read_lock();
>   sd = rcu_dereference_check_sched_domain(this_rq->sd);
>   if (sd)

Unexplained/unrelated change.

> @@ -8804,7 +8803,8 @@ static int idle_balance(struct rq *this_rq, struct 
> rq_flags *rf)
>   if (!(sd->flags & SD_LOAD_BALANCE))
>   continue;
>  
> - if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost) {
> + if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost +
> + sd->sched_migration_cost) {
>   update_next_balance(sd, &next_balance);
>   break;
>   }

Ditto.

> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 034cbed..bcd8c64 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1148,12 +1148,14 @@ sd_init(struct sched_domain_topology_level *tl,
>   sd->flags |= SD_PREFER_SIBLING;
>   sd->imbalance_pct = 110;
>   sd->smt_gain = 1178; /* ~15% */
> + sd->sched_migration_cost = 0;
>  
>   } else if (sd->flags & SD_SHARE_PKG_RESOURCES) {
>   sd->flags |= SD_PREFER_SIBLING;
>   sd->imbalance_pct = 117;
>   sd->cache_nice_tries = 1;
>   sd->busy_idx = 2;
> + sd->sched_migration_cost = 50UL;
>  
>  #ifdef CONFIG_NUMA
>   } else if (sd->flags & SD_NUMA) {
> @@ -1162,6 +1164,7 @@ sd_init(struct sched_domain_topology_level *tl,
>   sd->idle_idx = 2;
>  
>   sd->flags |= SD_SERIALIZE;
> + sd->sched_migration_cost = 500UL;

That's not 500us.

-Mike


Re: [PATCH v2 04/16] arm64: dts: mt7622: add pinctrl related device nodes

2018-02-08 Thread Sean Wang
On Wed, 2018-02-07 at 12:31 +0100, Matthias Brugger wrote:
> 
> On 02/06/2018 10:52 AM, sean.w...@mediatek.com wrote:
> > From: Sean Wang 
> > 
> > add pinctrl device nodes and rfb1 board, additionally include all pin
> > groups possible being used on rfb1 board and available gpio keys.
> > 
> > Signed-off-by: Sean Wang 
> > ---
> >  arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts | 200 
> > +++
> >  arch/arm64/boot/dts/mediatek/mt7622.dtsi |   7 +
> >  2 files changed, 207 insertions(+)
> > 
> > diff --git a/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts 
> > b/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts
> > index c08309d..bd1093a 100644
> > --- a/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts
> > +++ b/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts
> > @@ -7,6 +7,8 @@
> >   */
> >  
> >  /dts-v1/;
> > +#include 
> > +
> >  #include "mt7622.dtsi"
> >  
> >  / {
> > @@ -17,11 +19,209 @@
> > bootargs = "console=ttyS0,115200n1";
> > };
> >  
> > +   gpio-keys {
> > +   compatible = "gpio-keys-polled";
> > +   poll-interval = <100>;
> > +
> > +   factory {
> > +   label = "factory";
> > +   linux,code = ;
> > +   gpios = <&pio 0 0>;
> > +   };
> > +
> > +   wps {
> > +   label = "wps";
> > +   linux,code = ;
> > +   gpios = <&pio 102 0>;
> > +   };
> > +   };
> > +
> > memory {
> > reg = <0 0x4000 0 0x3F00>;
> > };
> >  };
> >  
> > +&pio {
> > +   /* eMMC is shared pin with parallel NAND */
> > +   emmc_pins_default: emmc-pins-default {
> > +   mux {
> > +   function = "emmc", "emmc_rst";
> > +   groups = "emmc";
> > +   };
> > +   };
> > +
> > +   emmc_pins_uhs: emmc-pins-uhs {
> > +   mux {
> > +   function = "emmc";
> > +   groups = "emmc";
> > +   };
> > +   };
> > +
> > +   eth_pins: eth-pins {
> > +   mux {
> > +   function = "eth";
> > +   groups = "mdc_mdio", "rgmii_via_gmac2";
> > +   };
> > +   };
> > +
> > +   i2c1_pins: i2c1-pins {
> > +   mux {
> > +   function = "i2c";
> > +   groups =  "i2c1_0";
> > +   };
> > +   };
> > +
> > +   i2c2_pins: i2c2-pins {
> > +   mux {
> > +   function = "i2c";
> > +   groups =  "i2c2_0";
> > +   };
> > +   };
> > +
> > +   i2s1_pins: i2s1-pins {
> > +   mux {
> > +   function = "i2s";
> > +   groups =  "i2s_out_bclk_ws_mclk",
> > + "i2s1_in_data",
> > + "i2s1_out_data";
> > +   };
> > +   };
> > +
> > +   irrx_pins: irrx-pins {
> > +   mux {
> > +   function = "ir";
> > +   groups =  "ir_1_rx";
> > +   };
> > +   };
> > +
> > +   irtx_pins: irtx-pins {
> > +   mux {
> > +   function = "ir";
> > +   groups =  "ir_1_tx";
> > +   };
> > +   };
> > +
> > +   /* Parallel nand is shared pin with eMMC */
> > +   parallel_nand_pins: parallel-nand-pins {
> > +   mux {
> > +   function = "flash";
> > +   groups = "par_nand";
> > +   };
> > +   };
> > +
> > +   pcie0_pins: pcie0-pins {
> > +   mux {
> > +   groups = "pcie0_pad_perst",
> > +"pcie0_1_waken",
> > +"pcie0_1_clkreq";
> > +   function = "pcie";
> > +   };
> > +   };
> > +
> > +   pcie1_pins: pcie1-pins {
> > +   mux {
> > +   groups = "pcie1_pad_perst",
> > +"pcie1_0_waken",
> > +"pcie1_0_clkreq";
> > +   function = "pcie";
> > +   };
> > +   };
> > +
> > +   pmic_bus_pins: pmic-bus-pins {
> > +   mux {
> > +   groups = "pmic_bus";
> > +   function = "pmic";
> > +   };
> > +   };
> 
> Some bikeshedding here. Can you please add function before groups, so that it 
> is
> uniform through out the file?
> 
> Thanks,
> Matthias
> 

okay, will make them all aligned

> > +
> > +   pwm7_pins: pwm1-2-pins {
> > +   mux {
> > +   function = "pwm";
> > +   groups = "pwm_ch7_2";
> > +   };
> > +   };
> > +
> > +   wled_pins: wled-pins {
> > +   mux {
> > +   function = "led";
> > +   groups = "wled";
> > +   };
> > +   };
> > +
> > +   sd0_pins_default: sd0-pins-default {
> > +   mux {
> > +   function = "sd";
> > +   groups = "sd_0";
> > +   };
> > +   };
> > +
> > +   sd0_pins_uhs: sd0-pins-uhs {
> > +   mux {
> > +   funct

Re: [PATCH v2 01/16] dt-bindings: clock: mediatek: add missing required #reset-cells

2018-02-08 Thread Sean Wang
On Wed, 2018-02-07 at 11:45 +0100, Matthias Brugger wrote:
> 
> On 02/06/2018 10:52 AM, sean.w...@mediatek.com wrote:
> > From: Sean Wang 
> > 
> > All ethsys, pciesys and ssusbsys internally include reset controller, so
> > explicitly add back these missing cell definitions to related bindings
> > and examples.
> > 
> > Signed-off-by: Sean Wang 
> > Cc: Rob Herring 
> > Cc: Stephen Boyd 
> > Reviewed-by: Rob Herring 
> > ---
> >  Documentation/devicetree/bindings/arm/mediatek/mediatek,ethsys.txt   | 2 ++
> >  Documentation/devicetree/bindings/arm/mediatek/mediatek,pciesys.txt  | 2 ++
> >  Documentation/devicetree/bindings/arm/mediatek/mediatek,ssusbsys.txt | 2 ++
> >  3 files changed, 6 insertions(+)
> > 
> > diff --git 
> > a/Documentation/devicetree/bindings/arm/mediatek/mediatek,ethsys.txt 
> > b/Documentation/devicetree/bindings/arm/mediatek/mediatek,ethsys.txt
> > index 7aa3fa1..8f5335b 100644
> > --- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,ethsys.txt
> > +++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,ethsys.txt
> > @@ -9,6 +9,7 @@ Required Properties:
> > - "mediatek,mt2701-ethsys", "syscon"
> > - "mediatek,mt7622-ethsys", "syscon"
> >  - #clock-cells: Must be 1
> > +- #reset-cells: Must be 1
> >  
> >  The ethsys controller uses the common clk binding from
> >  Documentation/devicetree/bindings/clock/clock-bindings.txt
> > @@ -20,4 +21,5 @@ ethsys: clock-controller@1b00 {
> > compatible = "mediatek,mt2701-ethsys", "syscon";
> > reg = <0 0x1b00 0 0x1000>;
> > #clock-cells = <1>;
> > +   #reset-cells = <1>;
> 
> The example is already fixed upstream, but I forgot the binding description,
> please rebase this patch.
> 
> And please don't forget to add all clock maintainers.
> 

okay, i will do it.

> Regards,
> Matthias
> 




Re: [PATCH 15/18] tracing: Add string type for dynamic strings in function based events

2018-02-08 Thread Steven Rostedt
On Fri, 9 Feb 2018 12:15:47 +0900
Namhyung Kim  wrote:

> > @@ -124,6 +128,16 @@ enum {
> > FUNC_TYPE_MAX
> >  };
> >  
> > +#define MAX_STR512
> > +
> > +/* Two contexts, normal and NMI, hence the " * 2" */
> > +struct func_string {
> > +   charbuf[MAX_STR * 2];
> > +};
> > +
> > +static struct func_string __percpu *str_buffer;
> > +static int nr_strings;  
> 
> What protects it?

Grumble, I was thinking that the entire create_function_event was under
the func_event_mutex, which it is not. So nr_strings is not fully
protected. I'll fix that thanks.

As for str_buffer, I should comment this as it is rather subtle.


+static int read_string(char *str, unsigned long addr)
+{
+   unsigned long flags;
+   struct func_string *strbuf;
+   char *ptr = (void *)addr;
+   char *buf;
+   int ret;
+
+   if (!str_buffer)
+   return 0;
+
+   strbuf = this_cpu_ptr(str_buffer);
+   buf = &strbuf->buf[0];
+
+   if (in_nmi())
+   buf += MAX_STR;
+
+   local_irq_save(flags);

Like I said, this is really subtle, and desperately needs a comment.

The str_buffer is per cpu and can only be access under irqs disabled.
If we are in NMI, then we move the starting position forward by MAX_STR.

I'll add comments and protect create_function_event with the mutex.

Thanks for pointing this out!

-- Steve


+   ret = strncpy_from_unsafe(buf, ptr, MAX_STR);
+   if (ret < 0)
+   ret = 0;
+   if (ret > 0 && str)
+   memcpy(str, buf, ret);
+   local_irq_restore(flags);
+
+   return ret;
+}


Re: [PATCH v2] printk: Relocate wake_klogd check close to the end of console_unlock()

2018-02-08 Thread Sergey Senozhatsky
On (02/08/18 17:48), Petr Mladek wrote:
[..]
> > 
> > I need to do more "research" on this. I though about it some time ago,
> > and I think that waking up klogd _only_ when we don't have any pending
> > logbuf messages still can be pretty late. Can't it? We can spin in
> > console_unlock() printing loop for a long time, probably passing
> > console_sem ownership between CPUs, without waking up the log_wait waiter.
> > May be we can wake it up from the printing loop, outside of logbuf_lock,
> > and let klogd to compete for logbuf_lock with the printing CPU. Why do
> > we wake it up only when we are done pushing messages to a potentially
> > slow serial console?
> 
> I thought about this as well but I was lazy. You made me to do some
> archaeology. It seems that it worked this way basically from the beginning.
> I have a git tree with pre-git commits. The oldest printk changes are
> there from 2.1.113.
> 
> In 2.1.113, logd was weaken directly from printk():

Thanks!

Was going to do the same today. Will take a look.

[..]
> My opinion:
> 
> IMHO, it would make perfect sense to wake klogd earlier and it should
> be safe these days.
> 
> I am just slightly afraid of a potential contention on printk_lock.
> Consoles and klogd might delay each other. Another question is
> how to do so when console_unlock() is called with interrupts
> disabled (irq_work is queued on the same CPU). This is why
> I would suggest to do this change separately and not for 4.16.

By postponing klogd wakeup we don't really address logbuf_lock
contention. We have no guarantees that no new printk will come
while klogd is active. Besides, consoles don't really delay
klogd - I tend to ignore the impact of msg_print_text(), it should
be fast. We call console drivers outside of logbuf_lock scope, so
everything should fine (tm).


Another question - do we need to wake it up from console_unlock()?

Basically,
- if consoles are suspended, we also "suspend" user space klogd.
  Does it really make sense?

- if console_lock is acquired by a preemptible task and that task
  is getting scheduled out for a long time (OOM, etc) then we postpone
  user space logging for unknown period of time. First the console_lock
  will have to flush pending messages and only afterwards it will wakeup
  klogd. Does it really make sense?

- If current console_lock owner jumps to retry (new pending messages
  were appended to the logbuf) label, user space klogd wakeup is getting
  postponed even further.

So, the final question is - since there in only one legitimate way
(modulo user space writes to kmsg) to append new messages to the
logbuf, shall we put klogd wakeup there? IOW, to vprintk_emit().

Something like this:

---

 kernel/printk/printk.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index db4b9b8929eb..2c8992d54a59 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -423,6 +423,9 @@ static u32 log_next_idx;
 static u64 console_seq;
 static u32 console_idx;
 
+/* The last message seq klogd have seen */
+static u64 klogd_seen_seq;
+
 /* the next printk record to read after the last 'clear' command */
 static u64 clear_seq;
 static u32 clear_idx;
@@ -1888,6 +1891,10 @@ asmlinkage int vprintk_emit(int facility, int level,
 
printed_len = log_output(facility, level, lflags, dict, dictlen, text, 
text_len);
 
+   if (klogd_seen_seq != log_next_seq) {
+   klogd_seen_seq = log_next_seq;
+   wake_up_klogd();
+   }
logbuf_unlock_irqrestore(flags);
 
/* If called from the scheduler, we can not call up(). */
@@ -2289,9 +2296,7 @@ void console_unlock(void)
 {
static char ext_text[CONSOLE_EXT_LOG_MAX];
static char text[LOG_LINE_MAX + PREFIX_MAX];
-   static u64 seen_seq;
unsigned long flags;
-   bool wake_klogd = false;
bool do_cond_resched, retry;
 
if (console_suspended) {
@@ -2335,11 +2340,6 @@ void console_unlock(void)
 
printk_safe_enter_irqsave(flags);
raw_spin_lock(&logbuf_lock);
-   if (seen_seq != log_next_seq) {
-   wake_klogd = true;
-   seen_seq = log_next_seq;
-   }
-
if (console_seq < log_first_seq) {
len = sprintf(text, "** %u printk messages dropped 
**\n",
  (unsigned)(log_first_seq - console_seq));
@@ -2429,9 +2429,6 @@ void console_unlock(void)
 
if (retry && console_trylock())
goto again;
-
-   if (wake_klogd)
-   wake_up_klogd();
 }
 EXPORT_SYMBOL(console_unlock);
 
---

So, essentially, instead of:

- OK, there is a new kernel message. Let's first print it to all of the
  consoles (if !suspended and if can use console) and only afterwards let
  user space to read it and to, probably, persist it in syslog journal file.

and now we have

Re: [PATCH] KVM: X86: Fix SMRAM accessing even if VM is shutdown

2018-02-08 Thread Xiao Guangrong



On 02/08/2018 06:31 PM, Paolo Bonzini wrote:

On 08/02/2018 09:57, Xiao Guangrong wrote:

Maybe it should return RET_PF_EMULATE, which would cause an emulation
failure and then an exit with KVM_EXIT_INTERNAL_ERROR.


So the root cause is that a running vCPU accessing the memory whose memslot
is being updated (met the condition KVM_MEMSLOT_INVALID is set on the its
memslot).

The normal #PF handler breaks KVM_RUN and returns -EFAULT to userspace,
we'd better to make ept-misconfig's handler follow this style as well.


Why return -EFAULT and not attempt emulation (which will fail)?



That is a good question... :)

This case (with KVM_MEMSLOT_INVALID is set) can be easily constructed,
userspace should avoid this case by itself (avoiding vCPU accessing the
memslot which is being updated). If it happens, it's a operation issue
rather than INTERNAL ERROR.

Maybe treat it as MMIO accessing and return to userspace with MMIO_EXIT
is a better solution...


Re: [PATCH] ftrace: fix the file mode of graph tracer and stacktracer

2018-02-08 Thread Steven Rostedt
On Fri, 9 Feb 2018 10:33:31 +0800
"Zhengyuan Liu"  wrote:

> It doesn't affect root writing to those files as root is a super user and
> can access to any write-only files.  I just want to make those writable

Ah, I know when I vim a file that's read only, even for root vim will
complain. But there's no complaining with redirection.

I just tried vim on set_graph_function, and yeah, it does complain.

> file to look consistent with others, seeing bellow:
> 
> -rw-r--r--  1 root root  set_event_pid
> -rw-r--r--  1 root root  set_ftrace_filter
> -rw-r--r--  1 root root  set_ftrace_notrace
> -rw-r--r--  1 root root  set_ftrace_pid
> -r--r--r--  1 root root   set_graph_function
> -r--r--r--  1 root root  set_graph_notrace

Yeah, I understand.

> 
> If this patch makes no sense, just ignore it!

No no, it makes perfect sense. I plan on adding it. But it's not that
important to be included for stable. Expect to see this in 4.17.

Thanks!

-- Steve


Re: [PATCH 15/18] tracing: Add string type for dynamic strings in function based events

2018-02-08 Thread Namhyung Kim
On Fri, Feb 02, 2018 at 06:05:13PM -0500, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" 
> 
> Add a "string" type that will create a dynamic length string for the
> event, this is the same as the __string() field in normal TRACE_EVENTS.
> 
> [ missing 'static' found by Fengguang Wu's kbuild test robot ]
> Signed-off-by: Steven Rostedt (VMware) 
> ---
>  Documentation/trace/function-based-events.rst |  19 ++-
>  kernel/trace/trace_event_ftrace.c | 183 
> +++---
>  2 files changed, 181 insertions(+), 21 deletions(-)
> 
> diff --git a/Documentation/trace/function-based-events.rst 
> b/Documentation/trace/function-based-events.rst
> index 99ae77cd59e6..6c643ea749e7 100644
> --- a/Documentation/trace/function-based-events.rst
> +++ b/Documentation/trace/function-based-events.rst
> @@ -99,7 +99,7 @@ as follows:
>   's8' | 's16' | 's32' | 's64' |
>   'x8' | 'x16' | 'x32' | 'x64' |
>   'char' | 'short' | 'int' | 'long' | 'size_t' |
> -  'symbol'
> +  'symbol' | 'string'
>  
>   FIELD :=  |  INDEX |  OFFSET |  OFFSET INDEX
>  
> @@ -342,3 +342,20 @@ the format "%s". If a nul is found, the output will 
> stop. Use another type
>bash-1470  [003] ...2   980.678715: 
> path_openat->link_path_walk(name=/lib64/ld-linux-x86-64.so.2)
>bash-1470  [003] ...2   980.678721: 
> path_openat->link_path_walk(name=ld-2.24.so)
>bash-1470  [003] ...2   980.678978: 
> path_lookupat->link_path_walk(name=/etc/ld.so.preload)
> +
> +
> +Dynamic strings
> +===
> +
> +Static strings are fine, but they can waste a lot of memory in the ring 
> buffer.
> +The above allocated 64 bytes for a character array, but most of the output 
> was
> +less than 20 characters. Not wanting to truncate strings or waste space on
> +the ring buffer, the dynamic string can help.
> +
> +Use the "string" type for strings that have a large range in size. The max
> +size that will be recorded is 512 bytes. If a string is larger than that, 
> then
> +it will be truncated.
> +
> + # echo 'link_path_walk(string name)' > function_events
> +
> +Gives the same result as above, but does not waste buffer space.
> diff --git a/kernel/trace/trace_event_ftrace.c 
> b/kernel/trace/trace_event_ftrace.c
> index dd24b840329d..273c5838a8e2 100644
> --- a/kernel/trace/trace_event_ftrace.c
> +++ b/kernel/trace/trace_event_ftrace.c
> @@ -39,6 +39,7 @@ struct func_event {
>   struct func_arg *last_arg;
>   int arg_cnt;
>   int arg_offset;
> + int has_strings;
>  };
>  
>  struct func_file {
> @@ -83,6 +84,8 @@ typedef u32 x32;
>  typedef u16 x16;
>  typedef u8 x8;
>  typedef void * symbol;
> +/* 2 byte offset, 2 byte length */
> +typedef u32 string;
>  
>  #define TYPE_TUPLE(type) \
>   { #type, sizeof(type), is_signed_type(type) }
> @@ -105,7 +108,8 @@ typedef void * symbol;
>   TYPE_TUPLE(u8), \
>   TYPE_TUPLE(s8), \
>   TYPE_TUPLE(x8), \
> - TYPE_TUPLE(symbol)
> + TYPE_TUPLE(symbol), \
> + TYPE_TUPLE(string)
>  
>  static struct func_type {
>   char*name;
> @@ -124,6 +128,16 @@ enum {
>   FUNC_TYPE_MAX
>  };
>  
> +#define MAX_STR  512
> +
> +/* Two contexts, normal and NMI, hence the " * 2" */
> +struct func_string {
> + charbuf[MAX_STR * 2];
> +};
> +
> +static struct func_string __percpu *str_buffer;
> +static int nr_strings;

What protects it?

Thanks,
Namhyung


> +
>  /**
>   * arch_get_func_args - retrieve function arguments via pt_regs
>   * @regs: The registers at the moment the function is called
> @@ -163,6 +177,23 @@ int __weak arch_get_func_args(struct pt_regs *regs,
>   return 0;
>  }
>  
> +static void free_arg(struct func_arg *arg)
> +{
> + list_del(&arg->list);
> + if (arg->func_type == FUNC_TYPE_string) {
> + nr_strings--;
> + if (WARN_ON(nr_strings < 0))
> + nr_strings = 0;
> + if (!nr_strings) {
> + free_percpu(str_buffer);
> + str_buffer = NULL;
> + }
> + }
> + kfree(arg->name);
> + kfree(arg->type);
> + kfree(arg);
> +}
> +
>  static void free_func_event(struct func_event *func_event)
>  {
>   struct func_arg *arg, *n;
> @@ -171,10 +202,7 @@ static void free_func_event(struct func_event 
> *func_event)
>   return;
>  
>   list_for_each_entry_safe(arg, n, &func_event->args, list) {
> - list_del(&arg->list);
> - kfree(arg->name);
> - kfree(arg->type);
> - kfree(arg);
> + free_arg(arg);
>   }
>   ftrace_free_filter(&func_event->ops);
>   kfree(func_event->call.print_fmt);
> @@ -255,6 +283,17 @@ static int add_arg(struct func_event *fe

Re: [PATCH v28 0/4] Virtio-balloon: support free page reporting

2018-02-08 Thread Michael S. Tsirkin
On Fri, Feb 09, 2018 at 11:11:39AM +0800, Wei Wang wrote:
> On 02/09/2018 03:55 AM, Michael S. Tsirkin wrote:
> > On Thu, Feb 08, 2018 at 05:50:16PM +0800, Wei Wang wrote:
> > 
> > > Details:
> > > Set up a Ping-Pong local live migration, where the guest ceaselessy
> > > migrates between the source and destination. Linux compilation,
> > > i.e. make bzImage -j4, is performed during the Ping-Pong migration. The
> > > legacy case takes 5min14s to finish the compilation. With this
> > > optimization patched, it takes 5min12s.
> > How is migration time affected in this case?
> 
> 
> When the linux compilation workload runs, the migration time (both the
> legacy and this optimization case) varies as the compilation goes on. It
> seems not easy to give a static speedup number, some times the migration
> time is reduced to 33%, sometimes to 50%, it varies, and depends on how much
> free memory the system has at that moment. For example, at the later stage
> of the compilation, I can observe 5GB memory being used as page cache. But
> overall, I can observe obvious improvement of the migration time.
> 
> 
> Best,
> Wei

You can run multiple tests and give a best, worst and median numbers.

-- 
MST


linux-next: Signed-off-by missing for commit in the btrfs-kdave tree

2018-02-08 Thread Stephen Rothwell
Hi David,

Commit

  7f53a77da963 ("Btrfs: fix btrfs_evict_inode to handle abnormal inodes 
correctly")

is missing a Signed-off-by from its committer.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-08 Thread NeilBrown
On Thu, Feb 08 2018, Oleg Drokin wrote:

>> On Feb 8, 2018, at 8:39 PM, NeilBrown  wrote:
>> 
>> On Tue, Aug 16 2016, James Simmons wrote:
>
> my that’s an old patch
>
>> 
...
>> 
>> Whoever converted it to "!strcmp()" inverted the condition.  This is a
>> perfect example of why I absolutely *loathe* the "!strcmp()" construct!!
>> 
>> This causes many tests in the 'sanity' test suite to return
>> -ENOMEM (that had me puzzled for a while!!).
>
> huh? I am not seeing anything of the sort and I was running sanity
> all the time until a recent pause (but going to resume).

That does surprised me - I reproduce it every time.
I have two VMs running a SLE12-SP2 kernel with patches from
lustre-release applied.  These are servers. They have 2 3G virtual disks
each.
I have two over VMs running current mainline.  These are clients.

I guess your 'recent pause' included between v4.15-rc1 (8e55b6fd0660)
and v4.15-rc6 (a93639090a27) - a full month when lustre wouldn't work at
all :-(


>
>> This seems to suggest that no-one has been testing the mainline linux
>> lustre.
>> It also seems to suggest that there is a good chance that there
>> are other bugs that have crept in while no-one has really been caring.
>> Given that the sanity test suite doesn't complete for me, but just
>> hangs (in test_27z I think), that seems particularly likely.
>
> Works for me, here’s a run from earlier today on 4.15.0:

Well that's encouraging .. I haven't looked into this one yet - I'm not
even sure where to start.

>
>> So my real question - to anyone interested in lustre for mainline linux
>> - is: can we actually trust this code at all?
>
> Absolutely. Seems that you just stumbled upon a corner case that was not
> being hit by people that do the testing, so you have something unique about
> your setup, I guess.
>
>> I'm seriously tempted to suggest that we just
>>  rm -r drivers/staging/lustre
>> 
>> drivers/staging is great for letting the community work on code that has
>> been "thrown over the wall" and is not openly developed elsewhere, but
>> that is not the case for lustre.  lustre has (or seems to have) an open
>> development process.  Having on-going development happen both there and
>> in drivers/staging seems a waste of resources.
>
> It is a bit of a waste of resources, but there are some other things here.
> E.g. we cannot have any APIs with no users in the kernel.
> Also some people like to have in-kernel modules coming with their distros
> (there were some users that used staging client on ubuntu as their
> setup).
>
> Instead the plan was to clean up the staging client into acceptable state,
> move it out of staging, bring in all the missing features and then
> drop the client (more or less) from the lustre-release.

That sounds like a great plan.  Any idea why it didn't happen?
It seems there is a lot of upstream work mixed in with the clean up, and
I don't think that really helps anyone.

Is it at all realistic that the client might be removed from
lustre-release?  That might be a good goal to work towards.

>
>> Might it make sense to instead start cleaning up the code in
>> lustre-release so as to make it meet the upstream kernel standards.
>> Then when the time is right, the kernel code can be moved *out* of
>> lustre-release and *in* to linux.  Then development can continue in
>> Linux (just like it does with other Linux filesystems).
>
> While we can be cleaning lustre in lustre-release, there are some things
> we cannot do as easily, e.g. decoupling Lustre client from the server.
> Also it would not attract any reviews from all the janitor or
> (more importantly) Al Viro and other people with a sharp eyes.
>
>> An added bonus of this is that there is an obvious path to getting
>> server support in mainline Linux.  The current situation of client-only
>> support seems weird given how interdependent the two are.
>
> Given the pushback Lustre client was given I have no hope Lustre server
> will get into mainline in my lifetime.

Even if it is horrible it would be nice to have it in staging... I guess
the changes required to ext4 prohibit that... I don't suppose it can be
made to work with mainline ext4 in a reduced-functionality-and-performance
way??

I think it would be a lot easier to motivate forward progress if there
were a credible end goal of everything being in mainline.

>
>> What do others think?  Is there any chance that the current lustre in
>> Linux will ever be more than a poor second-cousin to the external
>> lustre-release.  If there isn't, should we just discard it now and move
>> on?
>
>
> I think many useful cleanups and fixes came from the staging tree at
> the very least.
> The biggest problem with it all is that we are in staging tree so
> we cannot bring it to parity much. And we are in staging tree because
> there’s a whole bunch of “cleanups” requested that take a lot of effort
> (in both implementing them and then in finding other ways of achieving
> things that were done in old ways before)

Re: [PATCH v28 0/4] Virtio-balloon: support free page reporting

2018-02-08 Thread Wei Wang

On 02/09/2018 03:55 AM, Michael S. Tsirkin wrote:

On Thu, Feb 08, 2018 at 05:50:16PM +0800, Wei Wang wrote:


Details:
Set up a Ping-Pong local live migration, where the guest ceaselessy
migrates between the source and destination. Linux compilation,
i.e. make bzImage -j4, is performed during the Ping-Pong migration. The
legacy case takes 5min14s to finish the compilation. With this
optimization patched, it takes 5min12s.

How is migration time affected in this case?



When the linux compilation workload runs, the migration time (both the 
legacy and this optimization case) varies as the compilation goes on. It 
seems not easy to give a static speedup number, some times the migration 
time is reduced to 33%, sometimes to 50%, it varies, and depends on how 
much free memory the system has at that moment. For example, at the 
later stage of the compilation, I can observe 5GB memory being used as 
page cache. But overall, I can observe obvious improvement of the 
migration time.



Best,
Wei


Re: ipmi_si fails to get BMC ID

2018-02-08 Thread Chris Chiu
On Thu, Feb 8, 2018 at 11:53 PM, Corey Minyard  wrote:
> On 02/07/2018 09:01 PM, Chris Chiu wrote:
>>
>> Hi,
>>  We are working with a new desktop Acer Veriton Z4640G and get
>> stumbled on failing to enter S3 suspend with kernel version 4.14 even
>> the latest 4.15+. Here's the kernel log
>> https://gist.github.com/mschiu77/76888f1fd4eb56aa8959d76759a912bb.
>
>
> This is a little strange, nobody had reported this before.  Can you
> reproduce this
> at will, or was it a one-time thing?

It can be reproduced on each reboot.
>
> Does the IPMI driver always take this long to issue that error, even if you
> are not
> entering sleep state?
>
Yep, it will always print "ipmi_si :02:00.3: There appears to be
no BMC at this
location" few minutes after boot.

> And it started with 4.14, and didn't occur before then, right?
>

I haven't try pre-4.14 kernel. Will do that and update here.

> There's a bug in the PCI utils database, I submitted a report a while ago.
> This is
> a KCS, not a SMIC interface.
>
> It looks like the driver is trying to detect that there is a device out
> there and
> there is something that kind of works, but doesn't work completely. The
> interface
> specific code was all split out into separate files in 4.14.  It is possible
> the
> detection code got messed up in the process.  Nothing jumps out looking at
> the code differences, and I know it works on some PCI machines.
>
> Assuming this is reproducible, can you send the the output of a pre-4.14
> kernel?  If that doesn't make it obvious I may have to have access to the
> machine itself.
>
> -corey
>
>
It's an All-in-One machine so I think it would be difficult for
shipment. I'll see what
I can do. Thanks for help.

Chris

>>  As you see, it is due to "ipmi_probe+0x430/0x430 [ipmi_si]". After
>> the message "ipmi_si :02:00.3: There appears to be no BMC at this
>> location" shows up, then it can really go to suspend w/o problem.
>> Although it took around 3 mins. The IPMI device is probed from PCI and
>> here's the output of lspci
>> https://gist.github.com/mschiu77/33f0372be41670d8a69c97e64f833087. The
>> IPMI device is "02:00.3 IPMI SMIC interface [0c07]". We get stuck here
>> because we don't really know why it took so long in try_get_dev_id() /
>> ipmi_si_intf.c. Any suggestion about this to help us moving forward?
>> Thanks
>>
>>
>> Chris
>
>
>


Re: [v3] ARM: dts: imx: Add support for Advantech DMS-BA16

2018-02-08 Thread Shawn Guo
On Thu, Feb 08, 2018 at 11:24:53AM -0800, Yung-Ching LIN wrote:
> Will correct the USB_OTG_ID pinmux setting in the v4.
> Do you mind if I redefine reg_usb_otg_vbus node since we just started
> using OTG host/device mode on this board ?

Not at all.  Feel free to change it.

Shawn


Re: [PATCH v2] ARM: dts: imx6ull: add Toradex Colibri iMX6ULL support

2018-02-08 Thread Shawn Guo
On Thu, Feb 08, 2018 at 10:25:47AM +0100, ste...@agner.ch wrote:
> On 08.02.2018 08:47, Shawn Guo wrote:
> > On Tue, Feb 06, 2018 at 05:49:03PM +0100, Stefan Agner wrote:
> >> Add support for the Computer on Module Colibri iMX6ULL and its
> >> Bluetooth/Wifi variant along with the development/evaluation carrier
> >> board device trees. Follow the usual hierarchic include model,
> >> maintaining shared configuration in imx6ull-colibri.dtsi and
> >> imx6ull-colibri-eval-v3.dtsi respectively.
> >>
> >> Signed-off-by: Stefan Agner 
> >> ---
> >> This depends on the following patchsets work:
> >> - https://lkml.org/lkml/2018/1/6/129 (applied)
> >> - https://lkml.org/lkml/2018/1/10/998 (applied)
> >> - https://www.spinics.net/lists/arm-kernel/msg632671.html (pending, 
> >> required)
> >> - https://lkml.org/lkml/2018/1/18/850 (only for highest CPU frequency)
> > 
> > So the only dependency is the cpufreq change now.  So we have two
> > options:
> > 
> > 1. Hold the patch until the cpufreq change appear on my tree.  That
> >will require us wait for another release cycle.
> > 
> > 2. Drop the highest CPU frequency, so that we can apply the patch right
> >away, and add that setpoint after dependant cpufreq change lands
> >mainline.
> 
> The way cpufreq currently works is that for everything higher than
> 396MHz it just will set the CPU parent to pll2_bus_clk which can go up
> to 528MHz. Also voltage should be within operation range even for
> 528MHz.
> 
> So I think we can safely merge the current device tree.

Okay, thanks for the info.  Patch applied.

Shawn


Re: [PATCH v6 15/15] dt-bindings: cpufreq: Document operating-points-v2-krait-cpu

2018-02-08 Thread Rob Herring
On Tue, Feb 06, 2018 at 09:38:28AM +0530, Sricharan R wrote:
> In Certain QCOM SoCs like ipq8064, apq8064, msm8960, msm8974
> that has KRAIT processors the voltage/current value of each OPP
> varies based on the silicon variant in use.
> operating-points-v2-krait-cpu specifies the phandle to nvmem efuse cells
> and the operating-points-v2 table for each opp. The qcom-cpufreq driver
> reads the efuse value from the SoC to provide the required information
> that is used to determine the voltage and current value for each OPP of
> operating-points-v2 table when it is parsed by the OPP framework.
> 
> Signed-off-by: Sricharan R 
> ---
>  .../devicetree/bindings/cpufreq/krait-cpufreq.txt  | 363 
> +
>  1 file changed, 363 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/cpufreq/krait-cpufreq.txt
> 
> diff --git a/Documentation/devicetree/bindings/cpufreq/krait-cpufreq.txt 
> b/Documentation/devicetree/bindings/cpufreq/krait-cpufreq.txt
> new file mode 100644
> index 000..e7351f7
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/cpufreq/krait-cpufreq.txt
> @@ -0,0 +1,363 @@
> +QCOM KRAIT CPUFreq and OPP bindings
> +===
> +
> +In Certain QCOM SoCs like ipq8064, apq8064, msm8960, msm8974
> +that has KRAIT processors the voltage value of each OPP varies
> +based on the silicon variant in use. Qualcomm Process Voltage Scaling Tables
> +defines the voltage and current value based on the speed/pvs/version
> +combination blown in the efuse. The qcom-cpufreq driver reads the efuse
> +value from the SoC to provide the OPP framework with required information.
> +This is used to determine the voltage and current value for each OPP of
> +operating-points-v2 table when it is parsed by the OPP framework.
> +
> +Required properties:
> +
> +In 'cpus' nodes:
> +- operating-points-v2: Phandle to the operating-points-v2 table to use.
> +
> +In 'operating-points-v2' table:
> +- compatible: Should be
> + - 'operating-points-v2-krait-cpu' for ipq8064, apq8064, msm8960,
> +   msm8974.
> +- nvmem-cells: A phandle pointing to a nvmem-cells node representing the
> + efuse registers that has information about the
> + speedbin/pvs/version that is used to select the right
> + voltage/current value pair. Note that the length field of the
> + nvmem-cell is used to differentiate between format 'A' or 'B'
> + efuse settings. len of '4' bytes is for format 'A' and '8'
> + bytes for format 'B'. Please refer the for nvmem-cells
> + bindings Documentation/devicetree/bindings/nvmem/nvmem.txt
> + and also examples below for both the cases.
> +Example 1:
> +-
> +
> +/* For arch/arm/boot/dts/apq8064.dtsi --> format 'A' */
> +cpus {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + CPU0: cpu@0 {
> + compatible = "qcom,krait";
> + enable-method = "qcom,kpss-acc-v1";
> + device_type = "cpu";
> + reg = <0>;
> + next-level-cache = <&L2>;
> + qcom,acc = <&acc0>;
> + qcom,saw = <&saw0>;
> + cpu-idle-states = <&CPU_SPC>;
> + operating-points-v2 = <&cpu_opp_table>;
> + };
> +};
> +
> +qfprom: qfprom@70 {
> +   compatible  = "qcom,qfprom";
> +   reg = <0x0070 0x1000>;
> +   #address-cells  = <1>;
> +   #size-cells = <1>;
> +   ranges;
> +   pvs_efuse: pvs {
> + reg = <0xc0 0x4>;
> + };
> +};
> +
> +cpu_opp_table: opp-table {
> + compatible = "operating-points-v2-krait-cpu";
> + nvmem-cells = <&pvs_efuse>;
> +
> + /*
> +  * Missing opp-shared property means CPUs switch DVFS states
> +  * independently.
> +  */
> +
> +opp-91800 {
> + opp-hz = /bits/ 64 <91800>;
> + opp-microvolt-speed0-pvs0-v0 = <110>;

Where is this property defined? I'm not that happy with it, but don't 
have a better suggestion. Maybe make pvsN be an array of values with 0 
for any skipped indexes? The '-v0' seems pointless. 

> + opp-microvolt-speed0-pvs1-v0 = <105>;
> + opp-microvolt-speed0-pvs3-v0 = <100>;
> + opp-microvolt-speed0-pvs4-v0 = <975000>;
> + opp-microvolt-speed1-pvs0-v0 = <1025000>;
> + opp-microvolt-speed1-pvs1-v0 = <100>;
> + opp-microvolt-speed1-pvs2-v0 = <95>;
> + opp-microvolt-speed1-pvs3-v0 = <925000>;
> + opp-microvolt-speed1-pvs4-v0 = <90>;
> + opp-microvolt-speed1-pvs5-v0 = <90>;
> + opp-microvolt-speed1-pvs6-v0 = <90>;
> + opp-microvolt-speed2-pvs0-v0 = <975000>;
> 

Re: [PATCH 1/2] zsmalloc: introduce zs_huge_object() function

2018-02-08 Thread Sergey Senozhatsky
On (02/08/18 18:30), Mike Rapoport wrote:
[..]
> > 
> > +/*
> > + * Check if the object's size falls into huge_class area. We must take
> > + * ZS_HANDLE_SIZE into account and test the actual size we are going to
> > + * use up. zs_malloc() unconditionally adds handle size before it performs
> > + * size_class lookup, so we may endup in a huge class yet zs_huge_object()
> > + * returned 'false'.
> > + */
> 
> Can you please reformat this comment as kernel-doc?

Is this - Documentation/doc-guide/kernel-doc.rst - the right thing
to use as a reference?

-ss


Re: [PATCH] ARM: dts: ls1021a: add quadspi node

2018-02-08 Thread Shawn Guo
On Thu, Feb 08, 2018 at 03:24:40PM +0100, Rasmus Villemoes wrote:
> On 2018-02-05 09:03, Shawn Guo wrote:
> > On Fri, Jan 26, 2018 at 03:20:14PM +0100, Rasmus Villemoes wrote:
> >> Add a node to device tree repesenting the QuadSPI controller present on
> >> LS1021a. Driver support has been present since e8c034b2fbe5 (mtd:
> >> spi-nor: fsl-quadspi: add support for ls1021a).
> >>
> >> Signed-off-by: Rasmus Villemoes 
> > 
> > Applied, thanks.
> > 
> 
> Sorry, I didn't know you already had a similar patch queued for 4.16
> (85f8ee78ab ARM: dts: ls1021a: Add support for QSPI with ls1021a SoC),
> so I have to ask you to unapply (or do you want me to send a revert?) -
> the ls1021a.dtsi in your imx/dt branch now has two almost-identical
> qspi: nodes.

Ah, I did not notice that either.  Patch dropped.


Shawn


linux-next: Tree for Feb 9

2018-02-08 Thread Stephen Rothwell
Hi all,

Please do not add any v4.17 material to your linux-next included branches
until after v4.16-rc1 has been released.

Changes since 20180208:

The btrfs-kdave tree still had its build failure so I used the version
from next-20180206.

The vhost tree lost its build failure.

The akpm tree lost a patch that turned up elsewhere.

Non-merge commits (relative to Linus' tree): 723
 985 files changed, 24209 insertions(+), 5943 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 256 trees (counting Linus' and 44 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (a0f79386a496 Merge tag 'for-linus-4.16' of 
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux)
Merging fixes/master (b46dc8ae17a4 media: videobuf2: fix up for "media: 
annotate ->poll() instances")
Merging kbuild-current/fixes (36c1681678b5 genksyms: drop *.hash.c from 
.gitignore)
Merging arc-current/for-curr (053823335956 arc: dts: use 'atmel' as 
manufacturer for at24 in axs10x_mb)
Merging arm-current/fixes (091f02483df7 ARM: net: bpf: clarify tail_call index)
Merging m68k-current/for-linus (2334b1ac1235 MAINTAINERS: Add NuBus subsystem 
entry)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (1b689a95ce74 powerpc/pseries: include 
linux/types.h in asm/hvcall.h)
Merging sparc/master (aebb48f5e465 sparc64: fix typo in 
CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (08f513851218 net: phy: fix phy_start to consider 
PHY_IGNORE_INTERRUPT)
Merging bpf/master (69fe98edee48 Merge branch 'bpf-misc-nfp-bpftool-doc-fixes')
Merging ipsec/master (545d8ae7afff xfrm: fix boolean assignment in 
xfrm_get_type_offload)
Merging netfilter/master (fd2c19b2a28b netfilter: x_tables: remove size check)
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook 
mask only if set)
Merging wireless-drivers/master (a9e6d44ddecc ssb: Do not disable PCI host on 
non-Mips)
Merging mac80211/master (c4de37ee2b55 mac80211: mesh: fix wrong mesh TTL offset 
calculation)
Merging rdma-fixes/for-rc (ae59c3f0b6cf RDMA/mlx5: Fix out-of-bound access 
while querying AH)
Merging sound-current/for-linus (61fcf8ece9b6 ALSA: hda/realtek - Enable 
Thinkpad Dock device for ALC298 platform)
Merging pci-current/for-linus (838cda369707 x86/PCI: Enable AMD 64-bit window 
on resume)
Merging driver-core.current/driver-core-linus (35277995e179 Merge branch 
'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip)
Merging tty.current/tty-linus (4bf772b14675 Merge tag 'drm-for-v4.16' of 
git://people.freedesktop.org/~airlied/linux)
Merging usb.current/usb-linus (4bf772b14675 Merge tag 'drm-for-v4.16' of 
git://people.freedesktop.org/~airlied/linux)
Merging usb-gadget-fixes/fixes (b2cd1df66037 Linux 4.15-rc7)
Merging usb-serial-fixes/usb-linus (d14ac576d10f USB: serial: cp210x: add new 
device ID ELV ALC 8xxx)
Merging usb-chipidea-fixes/ci-for-usb-stable (964728f9f407 USB: chipidea: msm: 
fix ulpi-node lookup)
Merging phy/fixes (2b88212c4cc6 phy: rcar-gen3-usb2: select USB_COMMON)
Merging staging.cur

Re: [PATCH v3 2/2] ASoC: ak5558: Add bindings for AK5558 ADC

2018-02-08 Thread Rob Herring
On Mon, Feb 05, 2018 at 07:01:55PM +0200, Daniel Baluta wrote:
> Document the bindings for AK5558 ADC.
> 
> Signed-off-by: Daniel Baluta 
> ---
>  Documentation/devicetree/bindings/sound/ak5558.txt | 22 
> ++
>  1 file changed, 22 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/sound/ak5558.txt

Reviewed-by: Rob Herring 



Re: [PATCH 8/8] ASoC: samsung,tm2-audio DT binding documentation update

2018-02-08 Thread Rob Herring
On Mon, Feb 05, 2018 at 04:44:03PM +0100, Sylwester Nawrocki wrote:
> This patch documents additional entries of the audio-codec and
> i2s-controller properties required for the HDMI audio support.
> 
> Signed-off-by: Sylwester Nawrocki 
> ---
>  .../devicetree/bindings/sound/samsung,tm2-audio.txt| 14 
> +-
>  1 file changed, 9 insertions(+), 5 deletions(-)

Reviewed-by: Rob Herring 



Re: [PATCH 6/8] ASoC: samsung: i2s: Update clock-output-names property documentation

2018-02-08 Thread Rob Herring
On Mon, Feb 05, 2018 at 04:44:01PM +0100, Sylwester Nawrocki wrote:
> The clock-output-names property is marked as deprecated. While at it,
> #clock-cells property's value is corrected in the example snippet
> and few typos are fixed.
> 
> Signed-off-by: Sylwester Nawrocki 
> ---
>  .../devicetree/bindings/sound/samsung-i2s.txt  | 18 
> +-
>  1 file changed, 9 insertions(+), 9 deletions(-)

Reviewed-by: Rob Herring 



Re: [PATCH] ftrace: fix the file mode of graph tracer and stacktracer

2018-02-08 Thread Zhengyuan Liu
2018-02-08 23:21 GMT+08:00 Steven Rostedt :
> On Thu,  8 Feb 2018 09:41:53 +0800
> Zhengyuan Liu  wrote:
>
>> It's something looks weird that those files could be written by root
>> but shows with no write permission by ll command.
>> Chen LinX  has sent a similar patch to fix
>> graph function file mode in 2000,  I didn't get the reason why that
>> patch wasn't applied so I resend it.
>>
>> Signed-off-by: Zhengyuan Liu 
>> ---
>>  kernel/trace/ftrace.c  | 4 ++--
>>  kernel/trace/trace_stack.c | 2 +-
>>  2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
>> index ccdf366..fe903dc8 100644
>> --- a/kernel/trace/ftrace.c
>> +++ b/kernel/trace/ftrace.c
>> @@ -5513,10 +5513,10 @@ static __init int ftrace_init_dyn_tracefs(struct 
>> dentry *d_tracer)
>>   ftrace_create_filter_files(&global_ops, d_tracer);
>>
>>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>> - trace_create_file("set_graph_function", 0444, d_tracer,
>> + trace_create_file("set_graph_function", 0644, d_tracer,
>
> Thanks for resending the patch.
>
> What's interesting is that this doesn't seem to affect whether or not
> root can write to the file. I wonder if that's a bug itself.
>
> -- Steve

Hi, Steve, Thanks for reply.

It doesn't affect root writing to those files as root is a super user and
can access to any write-only files.  I just want to make those writable
file to look consistent with others, seeing bellow:

-rw-r--r--  1 root root  set_event_pid
-rw-r--r--  1 root root  set_ftrace_filter
-rw-r--r--  1 root root  set_ftrace_notrace
-rw-r--r--  1 root root  set_ftrace_pid
-r--r--r--  1 root root   set_graph_function
-r--r--r--  1 root root  set_graph_notrace

If this patch makes no sense, just ignore it!

>>   NULL,
>>   &ftrace_graph_fops);
>> - trace_create_file("set_graph_notrace", 0444, d_tracer,
>> + trace_create_file("set_graph_notrace", 0644, d_tracer,
>>   NULL,
>>   &ftrace_graph_notrace_fops);
>>  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
>> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
>> index 734accc..4356f14 100644
>> --- a/kernel/trace/trace_stack.c
>> +++ b/kernel/trace/trace_stack.c
>> @@ -468,7 +468,7 @@ static __init int stack_trace_init(void)
>>   NULL, &stack_trace_fops);
>>
>>  #ifdef CONFIG_DYNAMIC_FTRACE
>> - trace_create_file("stack_trace_filter", 0444, d_tracer,
>> + trace_create_file("stack_trace_filter", 0644, d_tracer,
>> &trace_ops, &stack_trace_filter_fops);
>>  #endif
>>
>

Re: [PATCH] scsi: ufs-qcom: add number of lanes per direction

2018-02-08 Thread Rob Herring
On Mon, Feb 05, 2018 at 08:02:07PM +0800, Can Guo wrote:
> From: Gilad Broner 
> 
> Different platforms may have different number of lanes for the UFS link.
> Add parameter to device tree specifying how many lanes should be
> configured for the UFS link. And don't print err message for clocks
> that are optional, this leads to unnecessary confusion about failure.
> 
> Signed-off-by: Gilad Broner 
> Signed-off-by: Subhash Jadavani 
> Signed-off-by: Can Guo 
> 
> diff --git a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt 
> b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
> index 5357919..4cee3f9 100644
> --- a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
> +++ b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
> @@ -31,6 +31,9 @@ Optional properties:
> defined or a value in the array is "0" then it is 
> assumed
> that the frequency is set by the parent clock or a
> fixed rate clock source.
> +- lanes-per-direction:   number of lanes available per direction - 
> either 1 or 2.
> + Note that it is assume same number of lanes is used 
> both directions at once.

Seems reasonable until someone does not make things symmetrical. We 
should design for that case.

> + If not specified, default is 2 lanes per direction.
>  
>  Note: If above properties are not defined it can be assumed that the supply
>  regulators or clocks are always on.


Re: [PATCH v1 1/2] dt-bindings/display/panel: otm8009a: Add optional power-supply property

2018-02-08 Thread Rob Herring
On Mon, Feb 05, 2018 at 10:45:31AM +0100, Philippe Cornu wrote:
> Some boards use a dedicated voltage regulator for this panel.
> Add & document this related optional power-supply property.
> 
> Signed-off-by: Philippe Cornu 
> ---
>  Documentation/devicetree/bindings/display/panel/orisetech,otm8009a.txt | 2 ++
>  1 file changed, 2 insertions(+)

Reviewed-by: Rob Herring 


Re: [PATCHv3] tlv320dac33: Add device tree bindings

2018-02-08 Thread Rob Herring
On Mon, Feb 05, 2018 at 09:24:45AM +0100, Pavel Machek wrote:
> 
> This adds device tree bindings for tlv320dac33.c.
> 
> Acked-by: Peter Ujfalusi 
> Signed-off-by: Pavel Machek 

Reviewed-by: Rob Herring 


Re: [PATCH v1 1/2] dt-binding: clock: document NPCM7xx clock DT bindings

2018-02-08 Thread Rob Herring
On Mon, Feb 05, 2018 at 10:22:54AM +0200, Tomer Maimon wrote:
> Added device tree binding documentation for Nuvoton NPCM7xx clocks.
> 
> Signed-off-by: Tomer Maimon 
> ---
>  .../bindings/clock/nuvoton,npcm7xx-clk.txt | 84 
> ++
>  1 file changed, 84 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/clock/nuvoton,npcm7xx-clk.txt
> 
> diff --git a/Documentation/devicetree/bindings/clock/nuvoton,npcm7xx-clk.txt 
> b/Documentation/devicetree/bindings/clock/nuvoton,npcm7xx-clk.txt
> new file mode 100644
> index ..1ba1945d3616
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/clock/nuvoton,npcm7xx-clk.txt
> @@ -0,0 +1,84 @@
> +* Nuvoton NPCM7XX Clock Controller
> +
> +Nuvoton Poleg BMC NPCM7XX contain integrated clock

s/contain/contains/

And your line break is strange.

> +controller, which generates and supplies clock to all modules within the BMC.

s/clock/clocks/

> +
> +Required Properties:
> +
> +- compatible: should be one of following:
> + - "nuvoton,npcm750-clk" : for clock controller of Nuvoton
> +   Poleg BMC NPCM750
> +
> +- reg: physical base address of the controller and length of memory mapped
> +  region.
> +
> +
> +- #clock-cells: should be 1.
> +
> +All available clocks are defined as preprocessor macros in
> +dt-bindings/clock/nuvoton,npcm7xx-clock.h header and can beused in device 
> tree

This file should be part of this patch.

> +sources.
> +
> +External clocks:
> +
> +There are several clocks that are generated outside the BMC. All clocks are 
> of
> +a known fixed value that cannot be chagned. Therefor these values are hard 
> coded

s/chagned/changed/

s/Therefor/Therefore/

> +inside the driver and registered on init.
> +
> +The clock modules contains 4 PLL, 20 dividers and 11 muxes. All these 
> settings
> +are set before Linux boot and are not to be altered by the Linux. This 
> driver is
> +used only to read the values clocks, not to set them.
> +
> +In addition to the clock driver, there are 3 external clocks suppling the

The binding describes h/w, not a driver.

> +network, which are of fixed values, set on on the device tree, but not used 
> by
> + the clock module. Example can be found below.
   ^
extra space

All this description belongs at the top of this doc.

> +
> +Example: Clock controller node:
> +
> + clk: clock-controller@f0801000 {
> + compatible = "nuvoton,npcm750-clk";
> + #clock-cells = <1>;
> + clock-controller;
> + reg = <0xf0801000 0x1000>;
> + status = "okay";
> + };
> +
> +Example: Required external clocks for network:
> +
> + /* external clock signal rg1refck, supplied by the phy */
> + clk_rg1refck: clk_rg1refck {

Use '-' rather than '_' in node names.

> + compatible = "fixed-clock";
> + #clock-cells = <0>;
> + clock-frequency = <12500>;
> + clock-output-names = "clk_rg1refck";
> + };
> +
> + /* external clock signal rg2refck, supplied by the phy */
> + clk_rg2refck: clk_rg2refck {
> + compatible = "fixed-clock";
> + #clock-cells = <0>;
> + clock-frequency = <12500>;
> + clock-output-names = "clk_rg2refck";
> + };
> +
> + clk_xin: clk_xin {
> + compatible = "fixed-clock";
> + #clock-cells = <0>;
> + clock-frequency = <5000>;
> + clock-output-names = "clk_xin";
> + };
> +
> +Example: UART controller node that consumes the clock generated by the clock
> +  controller (refer to the standard clock bindings for information about
> +  "clocks" and "clock-names" properties):
> +
> + uart0: serial@e290 {
> + compatible = "Nuvoton,s5pv210-uart";

s/Nuvoton/nuvoton/

> + reg = <0xe290 0x400>;
> + interrupt-parent = <&vic1>;
> + interrupts = <10>;
> + clock-names = "uart", "clk_uart_baud0",
> + "clk_uart_baud1";
> + clocks = <&clocks UART0>, <&clocks UART0>,
> + <&clocks SCLK_UART0>;
> + };
> -- 
> 2.14.1
> 


Re: [patch v1 0/4] mlx-platform: Add support for new Mellanox systems, code improvement, fixes for msn21xx system

2018-02-08 Thread Darren Hart
On Fri, Feb 02, 2018 at 08:45:44AM +, Vadim Pasternak wrote:
> The patchset:
> - adds defines for bus numbers, used for system topology description;
> - fixes definition for power cables for system family msn21xx;
> - introduces support for new Mellanox systems;
> 
> Vadim Pasternak (4):
>   platform/x86: mlx-platform: Use defines for bus assignment
>   platform/x86: mlx-platform: Add define for the negative bus
>   platform/x86: mlx-platform: Fix power cable setting for systems from
> msn21xx family

I've queued 1,2,3 and will try to include these in one final PR to Linus during
the Merge Window. (Andy, FYI)

>   platform/x86: mlx-platform: Add support for new Mellanox systems
> 

This one needs to be broken up into smaller patches.

>  drivers/platform/x86/mlx-platform.c | 345 
> ++--
>  1 file changed, 335 insertions(+), 10 deletions(-)
> 
> -- 
> 2.1.4
> 
> 

-- 
Darren Hart
VMware Open Source Technology Center


Re: [patch v1 4/4] platform/x86: mlx-platform: Add support for new Mellanox systems

2018-02-08 Thread Darren Hart
On Fri, Feb 02, 2018 at 08:45:48AM +, Vadim Pasternak wrote:
> Add support for the next new Mellanox system types: msn274x, msn201x,
> qmb7, sn34, sn37. The current members of these types are:

Please break this up into one patch per system type, or a similar logical
breakdown.

-- 
Darren Hart
VMware Open Source Technology Center


Re: [PATCH 03/12] i2c: qup: remove redundant variables for BAM SG count

2018-02-08 Thread Sricharan R
Hi Abhishek,

On 2/3/2018 1:28 PM, Abhishek Sahu wrote:
> The rx_nents and tx_nents are redundant. rx_buf and tx_buf can
> be used for total number of SG entries.
> 
> Signed-off-by: Abhishek Sahu 
> ---
>  drivers/i2c/busses/i2c-qup.c | 26 ++
>  1 file changed, 10 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-qup.c b/drivers/i2c/busses/i2c-qup.c
> index c68f433..bb83a2967 100644
> --- a/drivers/i2c/busses/i2c-qup.c
> +++ b/drivers/i2c/busses/i2c-qup.c
> @@ -692,7 +692,7 @@ static int qup_i2c_bam_do_xfer(struct qup_i2c_dev *qup, 
> struct i2c_msg *msg,
>   struct dma_async_tx_descriptor *txd, *rxd = NULL;
>   int ret = 0, idx = 0, limit = QUP_READ_LIMIT;
>   dma_cookie_t cookie_rx, cookie_tx;
> - u32 rx_nents = 0, tx_nents = 0, len, blocks, rem;
> + u32 len, blocks, rem;
>   u32 i, tlen, tx_len, tx_buf = 0, rx_buf = 0, off = 0;
>   u8 *tags;
>  

 This is correct. Just a nit, may be rx/tx_buf can be changed to
 rx/tx_count to make it more clear.

Regards,
 Sricharan

-- 
"QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
Code Aurora Forum, hosted by The Linux Foundation


[PATCH] irqchip: Use %px to print pointer value

2018-02-08 Thread Jaedon Shin
Since commit ad67b74d2469 ("printk: hash addresses printed with %p")
pointers printed with %p are hashed. Use %px instead of %p to print
pointer value.

Signed-off-by: Jaedon Shin 
---
 drivers/irqchip/irq-bcm7038-l1.c | 2 +-
 drivers/irqchip/irq-bcm7120-l2.c | 2 +-
 drivers/irqchip/irq-brcmstb-l2.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-bcm7038-l1.c b/drivers/irqchip/irq-bcm7038-l1.c
index 55cfb986225b..f604c1d89b3b 100644
--- a/drivers/irqchip/irq-bcm7038-l1.c
+++ b/drivers/irqchip/irq-bcm7038-l1.c
@@ -339,7 +339,7 @@ int __init bcm7038_l1_of_init(struct device_node *dn,
goto out_unmap;
}
 
-   pr_info("registered BCM7038 L1 intc (mem: 0x%p, IRQs: %d)\n",
+   pr_info("registered BCM7038 L1 intc (mem: 0x%px, IRQs: %d)\n",
intc->cpus[0]->map_base, IRQS_PER_WORD * intc->n_words);
 
return 0;
diff --git a/drivers/irqchip/irq-bcm7120-l2.c b/drivers/irqchip/irq-bcm7120-l2.c
index 983640eba418..1cc4dd1d584a 100644
--- a/drivers/irqchip/irq-bcm7120-l2.c
+++ b/drivers/irqchip/irq-bcm7120-l2.c
@@ -318,7 +318,7 @@ static int __init bcm7120_l2_intc_probe(struct device_node 
*dn,
}
}
 
-   pr_info("registered %s intc (mem: 0x%p, parent IRQ(s): %d)\n",
+   pr_info("registered %s intc (mem: 0x%px, parent IRQ(s): %d)\n",
intc_name, data->map_base[0], data->num_parent_irqs);
 
return 0;
diff --git a/drivers/irqchip/irq-brcmstb-l2.c b/drivers/irqchip/irq-brcmstb-l2.c
index 691d20eb0bec..6760edeeb666 100644
--- a/drivers/irqchip/irq-brcmstb-l2.c
+++ b/drivers/irqchip/irq-brcmstb-l2.c
@@ -262,7 +262,7 @@ static int __init brcmstb_l2_intc_of_init(struct 
device_node *np,
ct->chip.irq_set_wake = irq_gc_set_wake;
}
 
-   pr_info("registered L2 intc (mem: 0x%p, parent irq: %d)\n",
+   pr_info("registered L2 intc (mem: 0x%px, parent irq: %d)\n",
base, parent_irq);
 
return 0;
-- 
2.16.1



Re: [PATCH] ASoC: Intel: Skylake: make function skl_clk_round_rate static

2018-02-08 Thread Takashi Sakamoto

Hi,

On Feb 8 2018 23:35, Colin King wrote:

From: Colin Ian King 

The function skl_clk_round_rate is local to the source and does not
need to be in global scope, so make it static.

Cleans up sparse warning:
sound/soc/intel/skylake/skl-ssp-clk.c:250:6: warning: symbol
'skl_clk_round_rate' was not declared. Should it be static?

Signed-off-by: Colin Ian King 
---
  sound/soc/intel/skylake/skl-ssp-clk.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Takashi Sakamoto 


diff --git a/sound/soc/intel/skylake/skl-ssp-clk.c 
b/sound/soc/intel/skylake/skl-ssp-clk.c
index 7fbddf5e3b00..cda1b5fa7436 100644
--- a/sound/soc/intel/skylake/skl-ssp-clk.c
+++ b/sound/soc/intel/skylake/skl-ssp-clk.c
@@ -247,8 +247,8 @@ static unsigned long skl_clk_recalc_rate(struct clk_hw *hw,
  }
  
  /* Not supported by clk driver. Implemented to satisfy clk fw */

-long skl_clk_round_rate(struct clk_hw *hw, unsigned long rate,
-   unsigned long *parent_rate)
+static long skl_clk_round_rate(struct clk_hw *hw, unsigned long rate,
+  unsigned long *parent_rate)
  {
return rate;
  }



Thanks

Takashi Sakamoto


Re: [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-08 Thread Oleg Drokin

> On Feb 8, 2018, at 8:39 PM, NeilBrown  wrote:
> 
> On Tue, Aug 16 2016, James Simmons wrote:

my that’s an old patch

> 
>> 
>> +static inline bool
>> +lsm_md_eq(const struct lmv_stripe_md *lsm1, const struct lmv_stripe_md 
>> *lsm2)
>> +{
>> +int idx;
>> +
>> +if (lsm1->lsm_md_magic != lsm2->lsm_md_magic ||
>> +lsm1->lsm_md_stripe_count != lsm2->lsm_md_stripe_count ||
>> +lsm1->lsm_md_master_mdt_index != lsm2->lsm_md_master_mdt_index ||
>> +lsm1->lsm_md_hash_type != lsm2->lsm_md_hash_type ||
>> +lsm1->lsm_md_layout_version != lsm2->lsm_md_layout_version ||
>> +!strcmp(lsm1->lsm_md_pool_name, lsm2->lsm_md_pool_name))
>> +return false;
> 
> Hi James and all,
> This patch (8f18c8a48b736c2f in linux) is different from the
> corresponding patch in lustre-release (60e07b972114df).
> 
> In that patch, the last clause in the 'if' condition is
> 
> +   strcmp(lsm1->lsm_md_pool_name,
> + lsm2->lsm_md_pool_name) != 0)
> 
> Whoever converted it to "!strcmp()" inverted the condition.  This is a
> perfect example of why I absolutely *loathe* the "!strcmp()" construct!!
> 
> This causes many tests in the 'sanity' test suite to return
> -ENOMEM (that had me puzzled for a while!!).

huh? I am not seeing anything of the sort and I was running sanity
all the time until a recent pause (but going to resume).

> This seems to suggest that no-one has been testing the mainline linux
> lustre.
> It also seems to suggest that there is a good chance that there
> are other bugs that have crept in while no-one has really been caring.
> Given that the sanity test suite doesn't complete for me, but just
> hangs (in test_27z I think), that seems particularly likely.

Works for me, here’s a run from earlier today on 4.15.0:
== sanity test 27z: check SEQ/OID on the MDT and OST filesystems 
= 16:43:58 (1518126238)
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0169548 s, 61.8 MB/s
2+0 records in
2+0 records out
2097152 bytes (2.1 MB, 2.0 MiB) copied, 0.02782 s, 75.4 MB/s
check file /mnt/lustre/d27z.sanity/f27z.sanity-1
FID seq 0x20401, oid 0x4640 ver 0x0
LOV seq 0x20401, oid 0x4640, count: 1
want: stripe:0 ost:0 oid:314/0x13a seq:0
Stopping /mnt/lustre-ost1 (opts:) on centos6-17
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Starting ost1:   -o loop /tmp/lustre-ost1 /mnt/lustre-ost1
Failed to initialize ZFS library: 256
h2tcp: deprecated, use h2nettype instead
centos6-17.localnet: executing set_default_debug vfstrace rpctrace dlmtrace 
neterror ha config ioctl super all -lnet -lnd -pinger 16
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Started lustre-OST
/mnt/lustre-ost1/O/0/d26/314: parent=[0x20401:0x4640:0x0] stripe=0 
stripe_size=0 stripe_count=0
check file /mnt/lustre/d27z.sanity/f27z.sanity-2
FID seq 0x20401, oid 0x4642 ver 0x0
LOV seq 0x20401, oid 0x4642, count: 2
want: stripe:0 ost:1 oid:1187/0x4a3 seq:0
Stopping /mnt/lustre-ost2 (opts:) on centos6-17
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Starting ost2:   -o loop /tmp/lustre-ost2 /mnt/lustre-ost2
Failed to initialize ZFS library: 256
h2tcp: deprecated, use h2nettype instead
centos6-17.localnet: executing set_default_debug vfstrace rpctrace dlmtrace 
neterror ha config ioctl super all -lnet -lnd -pinger 16
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Started lustre-OST0001
/mnt/lustre-ost2/O/0/d3/1187: parent=[0x20401:0x4642:0x0] stripe=0 
stripe_size=0 stripe_count=0
want: stripe:1 ost:0 oid:315/0x13b seq:0
got: objid=0 seq=0 parent=[0x20401:0x4642:0x0] stripe=1
Resetting fail_loc on all nodes...done.
16:44:32 (1518126272) waiting for centos6-16 network 5 secs ...
16:44:32 (1518126272) network interface is UP
16:44:33 (1518126273) waiting for centos6-17 network 5 secs ...
16:44:33 (1518126273) network interface is UP


> So my real question - to anyone interested in lustre for mainline linux
> - is: can we actually trust this code at all?

Absolutely. Seems that you just stumbled upon a corner case that was not
being hit by people that do the testing, so you have something unique about
your setup, I guess.

> I'm seriously tempted to suggest that we just
>  rm -r drivers/staging/lustre
> 
> drivers/staging is great for letting the community work on code that has
> been "thrown over the wall" and is not openly developed elsewhere, but
> that is not the case for lustre.  lustre has (or seems to have) an open
> development process.  Having on-going development happen both there and
> in drivers/staging seems a waste of resources.

It is a bit of 

Re: [PATCH 13/18] tracing: Add array type to function based events

2018-02-08 Thread Steven Rostedt
On Fri, 9 Feb 2018 10:17:45 +0900
Namhyung Kim  wrote:

> > + # echo 1 > events/functions/ip_rcv/enable
> > + # cat trace
> > +-0 [003] ..s3   219.813582: 
> > __netif_receive_skb_core->ip_rcv(skb=880118195e00, 
> > perm_addr=b4,b5,2f,ce,18,65)
> > +-0 [003] ..s3   219.813595: 
> > __netif_receive_skb_core->ip_rcv(skb=880118195e00, 
> > perm_addr=b4,b5,2f,ce,18,65)
> > +-0 [003] ..s3   220.115053: 
> > __netif_receive_skb_core->ip_rcv(skb=880118195c00, 
> > perm_addr=b4,b5,2f,ce,18,65)
> > +-0 [003] ..s3   220.115293: 
> > __netif_receive_skb_core->ip_rcv(skb=880118195c00, 
> > perm_addr=b4,b5,2f,ce,18,65)  
> 
> What about adding braces to indicate array type like below?
> 
> ... ip_rcv(skb=880118195c00, perm_addr={b4,b5,2f,ce,18,65})
> 

That's a nice idea, I'll add it.

> > +   case FUNC_STATE_ARRAY:
> > case FUNC_STATE_BRACKET:
> > -   WARN_ON(!fevent->last_arg);
> > +   if (WARN_ON(!fevent->last_arg))
> > +   break;
> > ret = kstrtoul(token, 0, &val);
> > if (ret)
> > break;
> > -   val *= fevent->last_arg->size;
> > -   fevent->last_arg->indirect = val ^ INDIRECT_FLAG;
> > -   return FUNC_STATE_INDIRECT;
> > +   if (state == FUNC_STATE_BRACKET) {
> > +   val *= fevent->last_arg->size;
> > +   fevent->last_arg->indirect = val ^ INDIRECT_FLAG;
> > +   return FUNC_STATE_INDIRECT;
> > +   }
> > +   if (val <= 0)
> > +   break;  
> 
> The val is unsigned long type.

I probably should make it a cap it for the array, as arrays that are
too big will simply fail to allocate on the ring buffer.

But it should only check for zero.

> 
> 
> > +   fevent->last_arg->array = val;
> > +   type = kasprintf(GFP_KERNEL, "%s[%d]", fevent->last_arg->type, 
> > (unsigned)val);  
> 
> s/%d/%lu/  and no need to cast it.

Sure.

> 
> 
> > +   if (!type)
> > +   break;
> > +   kfree(fevent->last_arg->type);
> > +   fevent->last_arg->type = type;
> > +   /*
> > +* arg_offset has already been updated once by size.
> > +* This update needs to account for that (hence the "- 1").
> > +*/
> > +   fevent->arg_offset += fevent->last_arg->size * 
> > (fevent->last_arg->array - 1);
> > +   return FUNC_STATE_ARRAY_SIZE;
> > +
> > +   case FUNC_STATE_ARRAY_SIZE:
> > +   if (token[0] != ']')
> > +   break;
> > +   return FUNC_STATE_ARRAY_END;
> >  
> > case FUNC_STATE_INDIRECT:
> > if (token[0] != ']')
> > @@ -453,6 +485,10 @@ static long long get_arg(struct func_arg *arg, 
> > unsigned long val)
> >  
> > val = val + (arg->indirect ^ INDIRECT_FLAG);
> >  
> > +   /* Arrays do their own indirect reads */
> > +   if (arg->array)
> > +   return val;
> > +  
> 
> Not sure about this.  After this change it would make 'x64[1] foo' and
> 'x64[1] foo[0]' equivalent, right?

Yeah, I may need to re-think this. I originally had the "array"
use the "indirect" code, but I'm thinking that isn't necessary.

Thanks for the input.

-- Steve


Re: [PATCH RESEND v4] perf/core: Fix installing cgroup event into cpu

2018-02-08 Thread Lin Xiulei
2018-02-08 23:36 GMT+08:00 Jiri Olsa :
>
> On Thu, Feb 08, 2018 at 11:33:44AM +0800, linxiu...@gmail.com wrote:
> > From: "leilei.lin" 
> >
> > Do not install cgroup event into the CPU context and schedule it
> > if the cgroup is not running on this CPU
> >
> > While there is no task of cgroup running specified CPU, current
> > kernel still install cgroup event into CPU context that causes
> > another cgroup event can't be installed into this CPU.
> >
> > This patch prevent scheduling events at __perf_install_in_context()
> > and installing events at list_update_cgroup_event() if cgroup isn't
> > running on specified CPU.
> >
> > Signed-off-by: leilei.lin 
> > ---
> >  v2: Set cpuctx->cgrp only if the same cgroup is running on this
> >CPU otherwise following events couldn't be activated immediately
> >  v3: Enhance the comments and commit message
> >  v4: Adjust to config
> >
> >  kernel/events/core.c | 50 
> > +-
> >  1 file changed, 37 insertions(+), 13 deletions(-)
> >
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 4df5b69..fd28d61 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -933,31 +933,41 @@ list_update_cgroup_event(struct perf_event *event,
> >  {
> >   struct perf_cpu_context *cpuctx;
> >   struct list_head *cpuctx_entry;
> > + struct perf_cgroup *cgrp;
> >
> >   if (!is_cgroup_event(event))
> >   return;
> >
> > - if (add && ctx->nr_cgroups++)
> > - return;
> > - else if (!add && --ctx->nr_cgroups)
> > - return;
>
> I might be missing something, but should this check stay on
> the top regardles of the cgroup_is_descendant check below?
>

I don't think so,  if event A on cgroup A is opened and immediately
followed by a event B opened on cgroup B, then
"if (add && ctx->nr_cgroups++)" would __return__ with
 cpuctx->cgrp = cgroup A, that is incorrect.

And previous thread is here https://lkml.org/lkml/2018/1/24/79

>
> you could put NULL into cpuctx->cgrp on context with cgroup
> event in the list
>

what's the harm? It's invoked by perf_remove_from_context() when
an event is ready to be released. And whenever process/cgroup is
scheduled, cpuctx->cgrp will be set

In case that the patch was not sorted well, I put the patched code

```
static inline void
list_update_cgroup_event(struct perf_event *event,
 struct perf_event_context *ctx, bool add)
{
struct perf_cpu_context *cpuctx;
struct list_head *cpuctx_entry;
struct perf_cgroup *cgrp;

if (!is_cgroup_event(event))
return;

/*
 * Because cgroup events are always per-cpu events,
 * this will always be called from the right CPU.
 */
cpuctx = __get_cpu_context(ctx);
cgrp = perf_cgroup_from_task(current, ctx);

/*
 * if only the cgroup is running on this cpu,
 * we put/remove this cgroup into cpu context.
 * Or it would case mismatch in following cgroup
 * events at event_filter_match()
 */
if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) {
if (add)
cpuctx->cgrp = cgrp;
else
cpuctx->cgrp = NULL;
}

if (add && ctx->nr_cgroups++)
return;
else if (!add && --ctx->nr_cgroups)
return;

cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
if (add)
list_add(cpuctx_entry, this_cpu_ptr(&cgrp_cpuctx_list));
else
list_del(cpuctx_entry);
}
```

thanks

>
> thanks,
> jirka
>
> >   /*
> >* Because cgroup events are always per-cpu events,
> >* this will always be called from the right CPU.
> >*/
> >   cpuctx = __get_cpu_context(ctx);
> > - cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
> > - /* cpuctx->cgrp is NULL unless a cgroup event is active in this CPU 
> > .*/
> > - if (add) {
> > - struct perf_cgroup *cgrp = perf_cgroup_from_task(current, 
> > ctx);
> > + cgrp = perf_cgroup_from_task(current, ctx);
> >
> > - list_add(cpuctx_entry, this_cpu_ptr(&cgrp_cpuctx_list));
> > - if (cgroup_is_descendant(cgrp->css.cgroup, 
> > event->cgrp->css.cgroup))
> > + /*
> > +  * if only the cgroup is running on this cpu,
> > +  * we put/remove this cgroup into cpu context.
> > +  * Or it would case mismatch in following cgroup
> > +  * events at event_filter_match()
> > +  */
> > + if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) {
> > + if (add)
> >   cpuctx->cgrp = cgrp;
> > - } else {
> > - list_del(cpuctx_entry);
> > - cpuctx->cgrp = NULL;
> > + else
> > + cpuctx->cgrp = NULL;
>
> SNIP
>


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-08 Thread jianchao.wang
Hi Keith and Sagi

Many thanks for your kindly response.
That's really appreciated.

On 02/09/2018 01:56 AM, Keith Busch wrote:
> On Thu, Feb 08, 2018 at 05:56:49PM +0200, Sagi Grimberg wrote:
>> Given the discussion on this set, you plan to respin again
>> for 4.16?
> 
> With the exception of maybe patch 1, this needs more consideration than
> I'd feel okay with for the 4.16 release.
> 
Currently, one of the block is the nvme_wait_freeze in nvme_reset_work.
This cause some issues when I test this patchset yesterday.
As I posted on the V1 patchset mail thread:

if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING case,
nvme_reset_work will hang forever, because no one could complete the entered 
requests.

if we invoke nvme_reset_ctrl after modify the state machine to be able to 
change to RESETTING
to RECONNECTING and queue reset_work, we still cannot move things forward, 
because the reset_work
is being executed.

if we use nvme_wait_freeze_timeout in nvme_reset_work, unfreeze and return if 
expires. But the 
timeout value is tricky..


And actually, one of the possible solution to fix this cleanly is 
blk_set_preempt_only.
It is a lightweight way to gate the new bios out of generic_make_request.

Looking forward your advice on this.
And many thanks for your precious time on this.

Sincerely
Thanks
Jianchao


  1   2   3   4   5   6   7   8   >