Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in gfp_kmemleak_mask

2018-04-24 Thread Dmitry Vyukov
On Tue, Apr 24, 2018 at 7:02 PM, Michal Hocko  wrote:
> On Tue 24-04-18 12:48:50, Chunyu Hu wrote:
>>
>>
>> - Original Message -
>> > From: "Michal Hocko" 
>> > To: "Chunyu Hu" 
>> > Cc: "Dmitry Vyukov" , "Catalin Marinas" 
>> > , "Chunyu Hu"
>> > , "LKML" , "Linux-MM" 
>> > 
>> > Sent: Tuesday, April 24, 2018 9:20:57 PM
>> > Subject: Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in 
>> > gfp_kmemleak_mask
>> >
>> > On Mon 23-04-18 12:17:32, Chunyu Hu wrote:
>> > [...]
>> > > So if there is a new flag, it would be the 25th bits.
>> >
>> > No new flags please. Can you simply store a simple bool into 
>> > fail_page_alloc
>> > and have save/restore api for that?
>>
>> Hi Michal,
>>
>> I still don't get your point. The original NOFAIL added in kmemleak was
>> for skipping fault injection in page/slab  allocation for kmemleak object,
>> since kmemleak will disable itself until next reboot, whenever it hit an
>> allocation failure, in that case, it will lose effect to check kmemleak
>> in errer path rose by fault injection. But NOFAULT's effect is more than
>> skipping fault injection, it's also for hard allocation. So a dedicated flag
>> for skipping fault injection in specified slab/page allocation was mentioned.
>
> I am not familiar with the kmemleak all that much, but fiddling with the
> gfp_mask is a wrong way to achieve kmemleak specific action. I might be

I would say this is more like slab fault injection-specific action. It
can be used in other debugging facilities. Slab fault injection is a
part of slab. Slab behavior is generally controlled with gfp_mask.

> easilly wrong but I do not see any code that would restore the original
> gfp_mask down the kmem_cache_alloc path.
>
>> d9570ee3bd1d ("kmemleak: allow to coexist with fault injection")
>>
>> Do you mean something like below, with the save/store api? But looks like
>> to make it possible to skip a specified allocation, not global disabling,
>> a bool is not enough, and a gfp_flag is also needed. Maybe I missed 
>> something?
>
> Yes, this is essentially what I meant. It is still a global thing which
> is not all that great and if it matters then you can make it per
> task_struct. That really depends on the code flow here.

If we go this route, it definitely needs to be per task and also needs
to work with interrupts: switch on interrupts and not corrupt on
interrupts. A gfp flag is free of these problems.


Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in gfp_kmemleak_mask

2018-04-24 Thread Dmitry Vyukov
On Tue, Apr 24, 2018 at 7:02 PM, Michal Hocko  wrote:
> On Tue 24-04-18 12:48:50, Chunyu Hu wrote:
>>
>>
>> - Original Message -
>> > From: "Michal Hocko" 
>> > To: "Chunyu Hu" 
>> > Cc: "Dmitry Vyukov" , "Catalin Marinas" 
>> > , "Chunyu Hu"
>> > , "LKML" , "Linux-MM" 
>> > 
>> > Sent: Tuesday, April 24, 2018 9:20:57 PM
>> > Subject: Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in 
>> > gfp_kmemleak_mask
>> >
>> > On Mon 23-04-18 12:17:32, Chunyu Hu wrote:
>> > [...]
>> > > So if there is a new flag, it would be the 25th bits.
>> >
>> > No new flags please. Can you simply store a simple bool into 
>> > fail_page_alloc
>> > and have save/restore api for that?
>>
>> Hi Michal,
>>
>> I still don't get your point. The original NOFAIL added in kmemleak was
>> for skipping fault injection in page/slab  allocation for kmemleak object,
>> since kmemleak will disable itself until next reboot, whenever it hit an
>> allocation failure, in that case, it will lose effect to check kmemleak
>> in errer path rose by fault injection. But NOFAULT's effect is more than
>> skipping fault injection, it's also for hard allocation. So a dedicated flag
>> for skipping fault injection in specified slab/page allocation was mentioned.
>
> I am not familiar with the kmemleak all that much, but fiddling with the
> gfp_mask is a wrong way to achieve kmemleak specific action. I might be

I would say this is more like slab fault injection-specific action. It
can be used in other debugging facilities. Slab fault injection is a
part of slab. Slab behavior is generally controlled with gfp_mask.

> easilly wrong but I do not see any code that would restore the original
> gfp_mask down the kmem_cache_alloc path.
>
>> d9570ee3bd1d ("kmemleak: allow to coexist with fault injection")
>>
>> Do you mean something like below, with the save/store api? But looks like
>> to make it possible to skip a specified allocation, not global disabling,
>> a bool is not enough, and a gfp_flag is also needed. Maybe I missed 
>> something?
>
> Yes, this is essentially what I meant. It is still a global thing which
> is not all that great and if it matters then you can make it per
> task_struct. That really depends on the code flow here.

If we go this route, it definitely needs to be per task and also needs
to work with interrupts: switch on interrupts and not corrupt on
interrupts. A gfp flag is free of these problems.


Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-04-24 Thread Eric W. Biederman
Andrey Grodzovsky  writes:

> On 04/24/2018 12:23 PM, Eric W. Biederman wrote:
>> Andrey Grodzovsky  writes:
>>
>>> Avoid calling wait_event_killable when you are possibly being called
>>> from get_signal routine since in that case you end up in a deadlock
>>> where you are alreay blocked in singla processing any trying to wait
>>> on a new signal.
>> I am curious what the call path that is problematic here.
>
> Here is the problematic call stack
>
> [<0>] drm_sched_entity_fini+0x10a/0x3a0 [gpu_sched]
> [<0>] amdgpu_ctx_do_release+0x129/0x170 [amdgpu]
> [<0>] amdgpu_ctx_mgr_fini+0xd5/0xe0 [amdgpu]
> [<0>] amdgpu_driver_postclose_kms+0xcd/0x440 [amdgpu]
> [<0>] drm_release+0x414/0x5b0 [drm]
> [<0>] __fput+0x176/0x350
> [<0>] task_work_run+0xa1/0xc0
> [<0>] do_exit+0x48f/0x1280
> [<0>] do_group_exit+0x89/0x140
> [<0>] get_signal+0x375/0x8f0
> [<0>] do_signal+0x79/0xaa0
> [<0>] exit_to_usermode_loop+0x83/0xd0
> [<0>] do_syscall_64+0x244/0x270
> [<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> [<0>] 0x
>
> On exit from system call you process all the signals you received and
> encounter a fatal signal which triggers process termination.
>
>>
>> In general waiting seems wrong when the process has already been
>> fatally killed as indicated by PF_SIGNALED.
>
> So indeed this patch avoids wait in this case.
>
>>
>> Returning -ERESTARTSYS seems wrong as nothing should make it back even
>> to the edge of userspace here.
>
> Can you clarify please - what should be returned here instead ?

__fput does not have a return code.  I don't see the return code of
release being used anywhere.  So any return code is going to be lost.
So maybe something that talks to the drm/kernel layer but don't expect
your system call to be restarted, which is what -ERESTARTSYS asks for.

Hmm.  When looking at the code that is merged versus whatever your
patch is against it gets even clearer.  The -ERESTARTSYS
return code doesn't even get out of drm_sched_entity_fini.

Caring at all about process state at that point is wrong, as except for
being in ``process'' context where you can sleep nothing is connected to
a process.

Let me respectfully suggest that the wait_event_killable on that code
path is wrong.  Possibly you want a wait_event_timeout if you are very
nice.  But the code already has the logic necessary to handle what
happens if it can't sleep.

So I think the justification needs to be why you are trying to sleep
there at all.

The progress guarantee needs to come from the gpu layer or the AMD
driver not from someone getting impatient and sending SIGKILL to
a dead process.


Eric


>>
>> Given that this is the only use of PF_SIGNALED outside of bsd process
>> accounting I find this code very suspicious.
>>
>> It looks the code path that gets called during exit is buggy and needs
>> to be sorted out.
>>
>> Eric
>>
>>
>>> Signed-off-by: Andrey Grodzovsky 
>>> ---
>>>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c 
>>> b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> index 088ff2b..09fd258 100644
>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct 
>>> drm_gpu_scheduler *sched,
>>> return;
>>> /**
>>>  * The client will not queue more IBs during this fini, consume existing
>>> -* queued IBs or discard them on SIGKILL
>>> +* queued IBs or discard them when in death signal state since
>>> +* wait_event_killable can't receive signals in that state.
>>> */
>>> -   if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>> +   if (current->flags & PF_SIGNALED)
>>> entity->fini_status = -ERESTARTSYS;
>>> else
>>> entity->fini_status = wait_event_killable(sched->job_scheduled,


Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-04-24 Thread Eric W. Biederman
Andrey Grodzovsky  writes:

> On 04/24/2018 12:23 PM, Eric W. Biederman wrote:
>> Andrey Grodzovsky  writes:
>>
>>> Avoid calling wait_event_killable when you are possibly being called
>>> from get_signal routine since in that case you end up in a deadlock
>>> where you are alreay blocked in singla processing any trying to wait
>>> on a new signal.
>> I am curious what the call path that is problematic here.
>
> Here is the problematic call stack
>
> [<0>] drm_sched_entity_fini+0x10a/0x3a0 [gpu_sched]
> [<0>] amdgpu_ctx_do_release+0x129/0x170 [amdgpu]
> [<0>] amdgpu_ctx_mgr_fini+0xd5/0xe0 [amdgpu]
> [<0>] amdgpu_driver_postclose_kms+0xcd/0x440 [amdgpu]
> [<0>] drm_release+0x414/0x5b0 [drm]
> [<0>] __fput+0x176/0x350
> [<0>] task_work_run+0xa1/0xc0
> [<0>] do_exit+0x48f/0x1280
> [<0>] do_group_exit+0x89/0x140
> [<0>] get_signal+0x375/0x8f0
> [<0>] do_signal+0x79/0xaa0
> [<0>] exit_to_usermode_loop+0x83/0xd0
> [<0>] do_syscall_64+0x244/0x270
> [<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> [<0>] 0x
>
> On exit from system call you process all the signals you received and
> encounter a fatal signal which triggers process termination.
>
>>
>> In general waiting seems wrong when the process has already been
>> fatally killed as indicated by PF_SIGNALED.
>
> So indeed this patch avoids wait in this case.
>
>>
>> Returning -ERESTARTSYS seems wrong as nothing should make it back even
>> to the edge of userspace here.
>
> Can you clarify please - what should be returned here instead ?

__fput does not have a return code.  I don't see the return code of
release being used anywhere.  So any return code is going to be lost.
So maybe something that talks to the drm/kernel layer but don't expect
your system call to be restarted, which is what -ERESTARTSYS asks for.

Hmm.  When looking at the code that is merged versus whatever your
patch is against it gets even clearer.  The -ERESTARTSYS
return code doesn't even get out of drm_sched_entity_fini.

Caring at all about process state at that point is wrong, as except for
being in ``process'' context where you can sleep nothing is connected to
a process.

Let me respectfully suggest that the wait_event_killable on that code
path is wrong.  Possibly you want a wait_event_timeout if you are very
nice.  But the code already has the logic necessary to handle what
happens if it can't sleep.

So I think the justification needs to be why you are trying to sleep
there at all.

The progress guarantee needs to come from the gpu layer or the AMD
driver not from someone getting impatient and sending SIGKILL to
a dead process.


Eric


>>
>> Given that this is the only use of PF_SIGNALED outside of bsd process
>> accounting I find this code very suspicious.
>>
>> It looks the code path that gets called during exit is buggy and needs
>> to be sorted out.
>>
>> Eric
>>
>>
>>> Signed-off-by: Andrey Grodzovsky 
>>> ---
>>>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c 
>>> b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> index 088ff2b..09fd258 100644
>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct 
>>> drm_gpu_scheduler *sched,
>>> return;
>>> /**
>>>  * The client will not queue more IBs during this fini, consume existing
>>> -* queued IBs or discard them on SIGKILL
>>> +* queued IBs or discard them when in death signal state since
>>> +* wait_event_killable can't receive signals in that state.
>>> */
>>> -   if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>> +   if (current->flags & PF_SIGNALED)
>>> entity->fini_status = -ERESTARTSYS;
>>> else
>>> entity->fini_status = wait_event_killable(sched->job_scheduled,


Re: [PATCH v3 2/3] regulator: add support for SY8106A regulator

2018-04-24 Thread Mark Brown
On Mon, Apr 23, 2018 at 10:46:56PM +0800, Icenowy Zheng wrote:

> --- /dev/null
> +++ b/drivers/regulator/sy8106a-regulator.c
> @@ -0,0 +1,176 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * sy8106a-regulator.c - Regulator device driver for SY8106A

Just make the entire thing a C++ comment so it looks consistent and
joined up.

> +static int sy8106a_set_voltage_sel(struct regulator_dev *rdev, unsigned int 
> sel)
> +{
> + /* We use our set_voltage_sel in order to avoid unnecessary I2C
> +  * chatter, because the regulator_get_voltage_sel_regmap using
> +  * apply_bit would perform 4 unnecessary transfers instead of one,
> +  * increasing the chance of error.
> +  */
> + return regmap_write(rdev->regmap, rdev->desc->vsel_reg,
> + sel | SY8106A_GO_BIT);

Why would it do these extra transfers?  Is this just the fact that you
didn't set up a register cache (though the r/m/w cycle should only add
the read)?  We could put some logic in the core regmap code to detect
that an _update_bits() call is going to write to the whole register,
though it'd be even easier to just let this register be cached.

Generally if we can usefully optimize things we should do it at the
framework level.

> + if (reg & SY8106A_GO_BIT)
> + return reg & rdev->desc->vsel_mask;
> + else
> + return (chip->fixed_voltage - rdev->desc->min_uV) /
> +rdev->desc->uV_step;

You could use the standard get_voltage_sel() if you provide a mapping
operation that set everything with _GO_BIT set to return the fixed
voltage.  Though looking at this it seems that the fixed voltage will
always be one that could be set via the register anyway so I'm wondering
if the easiest thing here isn't to just have the driver turn off _GO_BIT
during probe() and not worry about the special case at runtime.


signature.asc
Description: PGP signature


Re: [PATCH v3 2/3] regulator: add support for SY8106A regulator

2018-04-24 Thread Mark Brown
On Mon, Apr 23, 2018 at 10:46:56PM +0800, Icenowy Zheng wrote:

> --- /dev/null
> +++ b/drivers/regulator/sy8106a-regulator.c
> @@ -0,0 +1,176 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * sy8106a-regulator.c - Regulator device driver for SY8106A

Just make the entire thing a C++ comment so it looks consistent and
joined up.

> +static int sy8106a_set_voltage_sel(struct regulator_dev *rdev, unsigned int 
> sel)
> +{
> + /* We use our set_voltage_sel in order to avoid unnecessary I2C
> +  * chatter, because the regulator_get_voltage_sel_regmap using
> +  * apply_bit would perform 4 unnecessary transfers instead of one,
> +  * increasing the chance of error.
> +  */
> + return regmap_write(rdev->regmap, rdev->desc->vsel_reg,
> + sel | SY8106A_GO_BIT);

Why would it do these extra transfers?  Is this just the fact that you
didn't set up a register cache (though the r/m/w cycle should only add
the read)?  We could put some logic in the core regmap code to detect
that an _update_bits() call is going to write to the whole register,
though it'd be even easier to just let this register be cached.

Generally if we can usefully optimize things we should do it at the
framework level.

> + if (reg & SY8106A_GO_BIT)
> + return reg & rdev->desc->vsel_mask;
> + else
> + return (chip->fixed_voltage - rdev->desc->min_uV) /
> +rdev->desc->uV_step;

You could use the standard get_voltage_sel() if you provide a mapping
operation that set everything with _GO_BIT set to return the fixed
voltage.  Though looking at this it seems that the fixed voltage will
always be one that could be set via the register anyway so I'm wondering
if the easiest thing here isn't to just have the driver turn off _GO_BIT
during probe() and not worry about the special case at runtime.


signature.asc
Description: PGP signature


Re: [PATCH v4 7/8] drm/i2c: tda998x: register as a drm bridge

2018-04-24 Thread Russell King - ARM Linux
On Tue, Apr 24, 2018 at 07:04:16PM +0300, Jyri Sarha wrote:
> On 24/04/18 13:14, Peter Rosin wrote:
> > On 2018-04-24 10:08, Russell King - ARM Linux wrote:
> >> On Tue, Apr 24, 2018 at 08:58:42AM +0200, Peter Rosin wrote:
> >>> On 2018-04-23 18:08, Russell King - ARM Linux wrote:
>  On Mon, Apr 23, 2018 at 09:23:00AM +0200, Peter Rosin wrote:
> >  static int tda998x_remove(struct i2c_client *client)
> >  {
> > -   component_del(>dev, _ops);
> > +   struct device *dev = >dev;
> > +   struct tda998x_bridge *bridge = dev_get_drvdata(dev);
> > +
> > +   drm_bridge_remove(>bridge);
> > +   component_del(dev, _ops);
> > +
> 
>  I'd like to ask a rather fundamental question about DRM bridge support,
>  because I suspect that there's a major fsckup here.
> 
>  The above is the function that deals with the TDA998x device being
>  unbound from the driver.  With the component API, this results in the
>  DRM device correctly being torn down, because one of the hardware
>  devices has gone.
> 
>  With DRM bridge, the bridge is merely removed from the list of
>  bridges:
> 
>  void drm_bridge_remove(struct drm_bridge *bridge)
>  {
>  mutex_lock(_lock);
>  list_del_init(>list);
>  mutex_unlock(_lock);
>  }
>  EXPORT_SYMBOL(drm_bridge_remove);
> 
>  and the memory backing the "struct tda998x_bridge" (which contains
>  the struct drm_bridge) will be freed by the devm subsystem.
> 
>  However, there is no notification into the rest of the DRM subsystem
>  that the device has gone away.  Worse, the memory that is still in
>  use by DRM has now been freed, so further use of the DRM device
>  results in a use-after-free bug.
> 
>  This is really not good, and to me looks like a fundamental problem
>  with the DRM bridge code.  I see nothing in the DRM bridge code that
>  deals with the lifetime of a "DRM bridge" or indeed the lifetime of
>  the actual device itself.
> 
>  So, from what I can see, there seems to be a fundamental lifetime
>  issue with the design of the DRM bridge code.  This needs to be
>  fixed.
> >>>
> >>> Oh crap. A gigantic can of worms...
> >>
> >> Yes, it's especially annoying for me, having put the effort in to
> >> the component helper to cover all these cases.
> >>
> >>> Would a patch (completely untested btw) along this line of thinking make
> >>> any difference whatsoever?
> >>
> >> It looks interesting - from what I can see of the device links code,
> >> it would have the effect of unbinding the DRM device just before
> >> TDA998x is unbound, so that's an improvement.
> >>
> >> However, from what I can see, the link vanishes at that point (as
> >> DL_FLAG_AUTOREMOVE is set), and re-binding the TDA998x device results
> >> in nothing further happening - the link will be recreated, but there
> >> appears to be nothing that triggers the "consumer" to rebind at that
> >> point.  Maybe I've missed something?
> > 
> > Right, auto-remove is a no-go. So, improving on the previous...
> > 
> > (I think drm_panel might suffer from this issue too?)
> 
> Yes it does and I took a shot at trying to fix it at the end of the
> previous merge window, but gave up as I run out of time. I re-spun the
> work now after reading this thread. I add you and Russell to cc.

Right, and these exact problems are what the component helper is
there to sort out, in a subsystem independent way.

What is the problem with the component helper that people seem to
be soo loathed to use it, instead preferring to come up with sub-
standard and broken alternatives?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: [PATCH v4 7/8] drm/i2c: tda998x: register as a drm bridge

2018-04-24 Thread Russell King - ARM Linux
On Tue, Apr 24, 2018 at 07:04:16PM +0300, Jyri Sarha wrote:
> On 24/04/18 13:14, Peter Rosin wrote:
> > On 2018-04-24 10:08, Russell King - ARM Linux wrote:
> >> On Tue, Apr 24, 2018 at 08:58:42AM +0200, Peter Rosin wrote:
> >>> On 2018-04-23 18:08, Russell King - ARM Linux wrote:
>  On Mon, Apr 23, 2018 at 09:23:00AM +0200, Peter Rosin wrote:
> >  static int tda998x_remove(struct i2c_client *client)
> >  {
> > -   component_del(>dev, _ops);
> > +   struct device *dev = >dev;
> > +   struct tda998x_bridge *bridge = dev_get_drvdata(dev);
> > +
> > +   drm_bridge_remove(>bridge);
> > +   component_del(dev, _ops);
> > +
> 
>  I'd like to ask a rather fundamental question about DRM bridge support,
>  because I suspect that there's a major fsckup here.
> 
>  The above is the function that deals with the TDA998x device being
>  unbound from the driver.  With the component API, this results in the
>  DRM device correctly being torn down, because one of the hardware
>  devices has gone.
> 
>  With DRM bridge, the bridge is merely removed from the list of
>  bridges:
> 
>  void drm_bridge_remove(struct drm_bridge *bridge)
>  {
>  mutex_lock(_lock);
>  list_del_init(>list);
>  mutex_unlock(_lock);
>  }
>  EXPORT_SYMBOL(drm_bridge_remove);
> 
>  and the memory backing the "struct tda998x_bridge" (which contains
>  the struct drm_bridge) will be freed by the devm subsystem.
> 
>  However, there is no notification into the rest of the DRM subsystem
>  that the device has gone away.  Worse, the memory that is still in
>  use by DRM has now been freed, so further use of the DRM device
>  results in a use-after-free bug.
> 
>  This is really not good, and to me looks like a fundamental problem
>  with the DRM bridge code.  I see nothing in the DRM bridge code that
>  deals with the lifetime of a "DRM bridge" or indeed the lifetime of
>  the actual device itself.
> 
>  So, from what I can see, there seems to be a fundamental lifetime
>  issue with the design of the DRM bridge code.  This needs to be
>  fixed.
> >>>
> >>> Oh crap. A gigantic can of worms...
> >>
> >> Yes, it's especially annoying for me, having put the effort in to
> >> the component helper to cover all these cases.
> >>
> >>> Would a patch (completely untested btw) along this line of thinking make
> >>> any difference whatsoever?
> >>
> >> It looks interesting - from what I can see of the device links code,
> >> it would have the effect of unbinding the DRM device just before
> >> TDA998x is unbound, so that's an improvement.
> >>
> >> However, from what I can see, the link vanishes at that point (as
> >> DL_FLAG_AUTOREMOVE is set), and re-binding the TDA998x device results
> >> in nothing further happening - the link will be recreated, but there
> >> appears to be nothing that triggers the "consumer" to rebind at that
> >> point.  Maybe I've missed something?
> > 
> > Right, auto-remove is a no-go. So, improving on the previous...
> > 
> > (I think drm_panel might suffer from this issue too?)
> 
> Yes it does and I took a shot at trying to fix it at the end of the
> previous merge window, but gave up as I run out of time. I re-spun the
> work now after reading this thread. I add you and Russell to cc.

Right, and these exact problems are what the component helper is
there to sort out, in a subsystem independent way.

What is the problem with the component helper that people seem to
be soo loathed to use it, instead preferring to come up with sub-
standard and broken alternatives?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: vmalloc with GFP_NOFS

2018-04-24 Thread Mikulas Patocka


On Tue, 24 Apr 2018, Michal Hocko wrote:

> On Tue 24-04-18 12:46:55, Mikulas Patocka wrote:
> > 
> > 
> > On Tue, 24 Apr 2018, Michal Hocko wrote:
> > 
> > > Hi,
> > > it seems that we still have few vmalloc users who perform GFP_NOFS
> > > allocation:
> > > drivers/mtd/ubi/io.c
> > > fs/ext4/xattr.c
> > > fs/gfs2/dir.c
> > > fs/gfs2/quota.c
> > > fs/nfs/blocklayout/extent_tree.c
> > > fs/ubifs/debug.c
> > > fs/ubifs/lprops.c
> > > fs/ubifs/lpt_commit.c
> > > fs/ubifs/orphan.c
> > > 
> > > Unfortunatelly vmalloc doesn't suppoer GFP_NOFS semantinc properly
> > > because we do have hardocded GFP_KERNEL allocations deep inside the
> > > vmalloc layers. That means that if GFP_NOFS really protects from
> > > recursion into the fs deadlocks then the vmalloc call is broken.
> > > 
> > > What to do about this? Well, there are two things. Firstly, it would be
> > > really great to double check whether the GFP_NOFS is really needed. I
> > > cannot judge that because I am not familiar with the code. It would be
> > > great if the respective maintainers (hopefully get_maintainer.sh pointed
> > > me to all relevant ones). If there is not reclaim recursion issue then
> > > simply use the standard vmalloc (aka GFP_KERNEL request).
> > > 
> > > If the use is really valid then we have a way to do the vmalloc
> > > allocation properly. We have memalloc_nofs_{save,restore} scope api. How
> > > does that work? You simply call memalloc_nofs_save when the reclaim
> > > recursion critical section starts (e.g. when you take a lock which is
> > > then used in the reclaim path - e.g. shrinker) and memalloc_nofs_restore
> > > when the critical section ends. _All_ allocations within that scope
> > > will get GFP_NOFS semantic automagically. If you are not sure about the
> > > scope itself then the easiest workaround is to wrap the vmalloc itself
> > > with a big fat comment that this should be revisited.
> > > 
> > > Does that sound like something that can be done in a reasonable time?
> > > I have tried to bring this up in the past but our speed is glacial and
> > > there are attempts to do hacks like checking for abusers inside the
> > > vmalloc which is just too ugly to live.
> > > 
> > > Please do not hesitate to get back to me if something is not clear.
> > > 
> > > Thanks!
> > > -- 
> > > Michal Hocko
> > > SUSE Labs
> > 
> > I made a patch that adds memalloc_noio/fs_save around these calls a year 
> > ago: http://lkml.iu.edu/hypermail/linux/kernel/1707.0/01376.html
> 
> Yeah, and that is the wrong approach.

It is crude, but it fixes the deadlock possibility. Then, the maintainers 
will have a lot of time to refactor the code and move these 
memalloc_noio_save calls to the proper scope.

> Let's try to fix this properly
> this time. As the above outlines, the worst case we can end up mid-term
> would be to wrap vmalloc calls with the scope api with a TODO. But I am
> pretty sure the respective maintainers can come up with a better
> solution. I am definitely willing to help here.
> -- 
> Michal Hocko
> SUSE Labs

Mikulas


Re: vmalloc with GFP_NOFS

2018-04-24 Thread Mikulas Patocka


On Tue, 24 Apr 2018, Michal Hocko wrote:

> On Tue 24-04-18 12:46:55, Mikulas Patocka wrote:
> > 
> > 
> > On Tue, 24 Apr 2018, Michal Hocko wrote:
> > 
> > > Hi,
> > > it seems that we still have few vmalloc users who perform GFP_NOFS
> > > allocation:
> > > drivers/mtd/ubi/io.c
> > > fs/ext4/xattr.c
> > > fs/gfs2/dir.c
> > > fs/gfs2/quota.c
> > > fs/nfs/blocklayout/extent_tree.c
> > > fs/ubifs/debug.c
> > > fs/ubifs/lprops.c
> > > fs/ubifs/lpt_commit.c
> > > fs/ubifs/orphan.c
> > > 
> > > Unfortunatelly vmalloc doesn't suppoer GFP_NOFS semantinc properly
> > > because we do have hardocded GFP_KERNEL allocations deep inside the
> > > vmalloc layers. That means that if GFP_NOFS really protects from
> > > recursion into the fs deadlocks then the vmalloc call is broken.
> > > 
> > > What to do about this? Well, there are two things. Firstly, it would be
> > > really great to double check whether the GFP_NOFS is really needed. I
> > > cannot judge that because I am not familiar with the code. It would be
> > > great if the respective maintainers (hopefully get_maintainer.sh pointed
> > > me to all relevant ones). If there is not reclaim recursion issue then
> > > simply use the standard vmalloc (aka GFP_KERNEL request).
> > > 
> > > If the use is really valid then we have a way to do the vmalloc
> > > allocation properly. We have memalloc_nofs_{save,restore} scope api. How
> > > does that work? You simply call memalloc_nofs_save when the reclaim
> > > recursion critical section starts (e.g. when you take a lock which is
> > > then used in the reclaim path - e.g. shrinker) and memalloc_nofs_restore
> > > when the critical section ends. _All_ allocations within that scope
> > > will get GFP_NOFS semantic automagically. If you are not sure about the
> > > scope itself then the easiest workaround is to wrap the vmalloc itself
> > > with a big fat comment that this should be revisited.
> > > 
> > > Does that sound like something that can be done in a reasonable time?
> > > I have tried to bring this up in the past but our speed is glacial and
> > > there are attempts to do hacks like checking for abusers inside the
> > > vmalloc which is just too ugly to live.
> > > 
> > > Please do not hesitate to get back to me if something is not clear.
> > > 
> > > Thanks!
> > > -- 
> > > Michal Hocko
> > > SUSE Labs
> > 
> > I made a patch that adds memalloc_noio/fs_save around these calls a year 
> > ago: http://lkml.iu.edu/hypermail/linux/kernel/1707.0/01376.html
> 
> Yeah, and that is the wrong approach.

It is crude, but it fixes the deadlock possibility. Then, the maintainers 
will have a lot of time to refactor the code and move these 
memalloc_noio_save calls to the proper scope.

> Let's try to fix this properly
> this time. As the above outlines, the worst case we can end up mid-term
> would be to wrap vmalloc calls with the scope api with a TODO. But I am
> pretty sure the respective maintainers can come up with a better
> solution. I am definitely willing to help here.
> -- 
> Michal Hocko
> SUSE Labs

Mikulas


Re: [PATCH 7/9] Pmalloc Rare Write: modify selected pools

2018-04-24 Thread Igor Stoppa

On 24/04/18 16:33, Igor Stoppa wrote:



On 24/04/18 15:50, Matthew Wilcox wrote:

On Mon, Apr 23, 2018 at 04:54:56PM +0400, Igor Stoppa wrote:

While the vanilla version of pmalloc provides support for permanently
transitioning between writable and read-only of a memory pool, this
patch seeks to support a separate class of data, which would still
benefit from write protection, most of the time, but it still needs to
be modifiable. Maybe very seldom, but still cannot be permanently marked
as read-only.


This seems like a horrible idea that basically makes this feature 
useless.

I would say the right way to do this is to have:

struct modifiable_data {
struct immutable_data *d;
...
};

Then allocate a new pool, change d and destroy the old pool.


I'm not sure I understand.


A few cups of coffee later ...

This seems like a regression from my case.

My case (see the example with the initialized state) is:

static void *pointer_to_pmalloc_memory __ro_after_init;

then, during init:

pointer_to_pmalloc_memory = pmalloc(pool, size);

then init happens

*pointer_to_pmalloc_memory = some_value;

pmalloc_protect_pool(pool9;

and to change the value:

support_variable = some_other_value;

pmalloc_rare_write(pool, pointer_to_pmalloc_memory,
   _variable, size)

But in this case the pmalloc allocation would be assigned to a writable 
variable.


This seems like a regression to me: at this point who cares anymore 
about the pmalloc memory?


Just rewrite the pointer to point to somewhere else that is writable and 
has the desired (from the attacker) value.


It doesn't even require gadgets. pmalloc becomes useless.

Do I still need more coffee?

--
igor


Re: [PATCH 7/9] Pmalloc Rare Write: modify selected pools

2018-04-24 Thread Igor Stoppa

On 24/04/18 16:33, Igor Stoppa wrote:



On 24/04/18 15:50, Matthew Wilcox wrote:

On Mon, Apr 23, 2018 at 04:54:56PM +0400, Igor Stoppa wrote:

While the vanilla version of pmalloc provides support for permanently
transitioning between writable and read-only of a memory pool, this
patch seeks to support a separate class of data, which would still
benefit from write protection, most of the time, but it still needs to
be modifiable. Maybe very seldom, but still cannot be permanently marked
as read-only.


This seems like a horrible idea that basically makes this feature 
useless.

I would say the right way to do this is to have:

struct modifiable_data {
struct immutable_data *d;
...
};

Then allocate a new pool, change d and destroy the old pool.


I'm not sure I understand.


A few cups of coffee later ...

This seems like a regression from my case.

My case (see the example with the initialized state) is:

static void *pointer_to_pmalloc_memory __ro_after_init;

then, during init:

pointer_to_pmalloc_memory = pmalloc(pool, size);

then init happens

*pointer_to_pmalloc_memory = some_value;

pmalloc_protect_pool(pool9;

and to change the value:

support_variable = some_other_value;

pmalloc_rare_write(pool, pointer_to_pmalloc_memory,
   _variable, size)

But in this case the pmalloc allocation would be assigned to a writable 
variable.


This seems like a regression to me: at this point who cares anymore 
about the pmalloc memory?


Just rewrite the pointer to point to somewhere else that is writable and 
has the desired (from the attacker) value.


It doesn't even require gadgets. pmalloc becomes useless.

Do I still need more coffee?

--
igor


Re: [PATCH v3] kvmalloc: always use vmalloc if CONFIG_DEBUG_SG

2018-04-24 Thread Michal Hocko
On Tue 24-04-18 13:00:11, Mikulas Patocka wrote:
> 
> 
> On Tue, 24 Apr 2018, Michal Hocko wrote:
> 
> > On Tue 24-04-18 11:50:30, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Tue, 24 Apr 2018, Michal Hocko wrote:
> > > 
> > > > On Mon 23-04-18 20:06:16, Mikulas Patocka wrote:
> > > > [...]
> > > > > @@ -404,6 +405,12 @@ void *kvmalloc_node(size_t size, gfp_t f
> > > > >*/
> > > > >   WARN_ON_ONCE((flags & GFP_KERNEL) != GFP_KERNEL);
> > > > >  
> > > > > +#ifdef CONFIG_DEBUG_SG
> > > > > + /* Catch bugs when the caller uses DMA API on the result of 
> > > > > kvmalloc. */
> > > > > + if (!(prandom_u32_max(2) & 1))
> > > > > + goto do_vmalloc;
> > > > > +#endif
> > > > 
> > > > I really do not think there is anything DEBUG_SG specific here. Why you
> > > > simply do not follow should_failslab path or even reuse the function?
> > > 
> > > CONFIG_DEBUG_SG is enabled by default in RHEL and Fedora debug kernel (if 
> > > you don't like CONFIG_DEBUG_SG, pick any other option that is enabled 
> > > there).
> > 
> > Are you telling me that you are shaping a debugging functionality basing
> > on what RHEL has enabled? And you call me evil. This is just rediculous.
> > 
> > > Fail-injection framework is if off by default and it must be explicitly 
> > > enabled and configured by the user - and most users won't enable it.
> > 
> > It can be enabled easily. And if you care enough for your debugging
> > kernel then just make it enabled unconditionally.
> 
> So, should we add a new option CONFIG_KVMALLOC_FALLBACK_DEFAULT? I'm not 
> quite sure if 3 lines of debugging code need an extra option, but if you 
> don't want to reuse any existing debug option, it may be possible. Adding 
> it to the RHEL debug kernel would be trivial.

Wouldn't it be equally trivial to simply enable the fault injection? You
would get additional failure paths testing as a bonus.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH v3] kvmalloc: always use vmalloc if CONFIG_DEBUG_SG

2018-04-24 Thread Michal Hocko
On Tue 24-04-18 13:00:11, Mikulas Patocka wrote:
> 
> 
> On Tue, 24 Apr 2018, Michal Hocko wrote:
> 
> > On Tue 24-04-18 11:50:30, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Tue, 24 Apr 2018, Michal Hocko wrote:
> > > 
> > > > On Mon 23-04-18 20:06:16, Mikulas Patocka wrote:
> > > > [...]
> > > > > @@ -404,6 +405,12 @@ void *kvmalloc_node(size_t size, gfp_t f
> > > > >*/
> > > > >   WARN_ON_ONCE((flags & GFP_KERNEL) != GFP_KERNEL);
> > > > >  
> > > > > +#ifdef CONFIG_DEBUG_SG
> > > > > + /* Catch bugs when the caller uses DMA API on the result of 
> > > > > kvmalloc. */
> > > > > + if (!(prandom_u32_max(2) & 1))
> > > > > + goto do_vmalloc;
> > > > > +#endif
> > > > 
> > > > I really do not think there is anything DEBUG_SG specific here. Why you
> > > > simply do not follow should_failslab path or even reuse the function?
> > > 
> > > CONFIG_DEBUG_SG is enabled by default in RHEL and Fedora debug kernel (if 
> > > you don't like CONFIG_DEBUG_SG, pick any other option that is enabled 
> > > there).
> > 
> > Are you telling me that you are shaping a debugging functionality basing
> > on what RHEL has enabled? And you call me evil. This is just rediculous.
> > 
> > > Fail-injection framework is if off by default and it must be explicitly 
> > > enabled and configured by the user - and most users won't enable it.
> > 
> > It can be enabled easily. And if you care enough for your debugging
> > kernel then just make it enabled unconditionally.
> 
> So, should we add a new option CONFIG_KVMALLOC_FALLBACK_DEFAULT? I'm not 
> quite sure if 3 lines of debugging code need an extra option, but if you 
> don't want to reuse any existing debug option, it may be possible. Adding 
> it to the RHEL debug kernel would be trivial.

Wouldn't it be equally trivial to simply enable the fault injection? You
would get additional failure paths testing as a bonus.
-- 
Michal Hocko
SUSE Labs


Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in gfp_kmemleak_mask

2018-04-24 Thread Michal Hocko
On Tue 24-04-18 12:48:50, Chunyu Hu wrote:
> 
> 
> - Original Message -
> > From: "Michal Hocko" 
> > To: "Chunyu Hu" 
> > Cc: "Dmitry Vyukov" , "Catalin Marinas" 
> > , "Chunyu Hu"
> > , "LKML" , "Linux-MM" 
> > 
> > Sent: Tuesday, April 24, 2018 9:20:57 PM
> > Subject: Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in 
> > gfp_kmemleak_mask
> > 
> > On Mon 23-04-18 12:17:32, Chunyu Hu wrote:
> > [...]
> > > So if there is a new flag, it would be the 25th bits.
> > 
> > No new flags please. Can you simply store a simple bool into fail_page_alloc
> > and have save/restore api for that?
> 
> Hi Michal,
> 
> I still don't get your point. The original NOFAIL added in kmemleak was 
> for skipping fault injection in page/slab  allocation for kmemleak object, 
> since kmemleak will disable itself until next reboot, whenever it hit an 
> allocation failure, in that case, it will lose effect to check kmemleak 
> in errer path rose by fault injection. But NOFAULT's effect is more than 
> skipping fault injection, it's also for hard allocation. So a dedicated flag
> for skipping fault injection in specified slab/page allocation was mentioned.

I am not familiar with the kmemleak all that much, but fiddling with the
gfp_mask is a wrong way to achieve kmemleak specific action. I might be
easilly wrong but I do not see any code that would restore the original
gfp_mask down the kmem_cache_alloc path.

> d9570ee3bd1d ("kmemleak: allow to coexist with fault injection") 
>   
> Do you mean something like below, with the save/store api? But looks like
> to make it possible to skip a specified allocation, not global disabling,
> a bool is not enough, and a gfp_flag is also needed. Maybe I missed something?

Yes, this is essentially what I meant. It is still a global thing which
is not all that great and if it matters then you can make it per
task_struct. That really depends on the code flow here.

-- 
Michal Hocko
SUSE Labs


Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in gfp_kmemleak_mask

2018-04-24 Thread Michal Hocko
On Tue 24-04-18 12:48:50, Chunyu Hu wrote:
> 
> 
> - Original Message -
> > From: "Michal Hocko" 
> > To: "Chunyu Hu" 
> > Cc: "Dmitry Vyukov" , "Catalin Marinas" 
> > , "Chunyu Hu"
> > , "LKML" , "Linux-MM" 
> > 
> > Sent: Tuesday, April 24, 2018 9:20:57 PM
> > Subject: Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in 
> > gfp_kmemleak_mask
> > 
> > On Mon 23-04-18 12:17:32, Chunyu Hu wrote:
> > [...]
> > > So if there is a new flag, it would be the 25th bits.
> > 
> > No new flags please. Can you simply store a simple bool into fail_page_alloc
> > and have save/restore api for that?
> 
> Hi Michal,
> 
> I still don't get your point. The original NOFAIL added in kmemleak was 
> for skipping fault injection in page/slab  allocation for kmemleak object, 
> since kmemleak will disable itself until next reboot, whenever it hit an 
> allocation failure, in that case, it will lose effect to check kmemleak 
> in errer path rose by fault injection. But NOFAULT's effect is more than 
> skipping fault injection, it's also for hard allocation. So a dedicated flag
> for skipping fault injection in specified slab/page allocation was mentioned.

I am not familiar with the kmemleak all that much, but fiddling with the
gfp_mask is a wrong way to achieve kmemleak specific action. I might be
easilly wrong but I do not see any code that would restore the original
gfp_mask down the kmem_cache_alloc path.

> d9570ee3bd1d ("kmemleak: allow to coexist with fault injection") 
>   
> Do you mean something like below, with the save/store api? But looks like
> to make it possible to skip a specified allocation, not global disabling,
> a bool is not enough, and a gfp_flag is also needed. Maybe I missed something?

Yes, this is essentially what I meant. It is still a global thing which
is not all that great and if it matters then you can make it per
task_struct. That really depends on the code flow here.

-- 
Michal Hocko
SUSE Labs


Re: [PATCH] platform/x86: acer-wmi: add another KEY_POWER keycode

2018-04-24 Thread Andy Shevchenko
On Tue, 2018-04-24 at 19:15 +0300, Andy Shevchenko wrote:
> On Mon, Apr 23, 2018 at 5:48 PM, Gianfranco Costamagna
>  wrote:
> > Hello,
> > 
> > > Can you give your corresponding tag as well?
> > 
> > 
> > we are implementing the fix right now, please follow bug
> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1766054
> 
> OK, I postpone this patch until new version.

Or I missed the point and it's working? Tell me what to do with it.

For now I pushed to my review and testing queue, thanks!

-- 
Andy Shevchenko 
Intel Finland Oy


Re: [PATCH] platform/x86: acer-wmi: add another KEY_POWER keycode

2018-04-24 Thread Andy Shevchenko
On Tue, 2018-04-24 at 19:15 +0300, Andy Shevchenko wrote:
> On Mon, Apr 23, 2018 at 5:48 PM, Gianfranco Costamagna
>  wrote:
> > Hello,
> > 
> > > Can you give your corresponding tag as well?
> > 
> > 
> > we are implementing the fix right now, please follow bug
> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1766054
> 
> OK, I postpone this patch until new version.

Or I missed the point and it's working? Tell me what to do with it.

For now I pushed to my review and testing queue, thanks!

-- 
Andy Shevchenko 
Intel Finland Oy


Re: [PATCH v3] kvmalloc: always use vmalloc if CONFIG_DEBUG_SG

2018-04-24 Thread Mikulas Patocka


On Tue, 24 Apr 2018, Michal Hocko wrote:

> On Tue 24-04-18 11:50:30, Mikulas Patocka wrote:
> > 
> > 
> > On Tue, 24 Apr 2018, Michal Hocko wrote:
> > 
> > > On Mon 23-04-18 20:06:16, Mikulas Patocka wrote:
> > > [...]
> > > > @@ -404,6 +405,12 @@ void *kvmalloc_node(size_t size, gfp_t f
> > > >  */
> > > > WARN_ON_ONCE((flags & GFP_KERNEL) != GFP_KERNEL);
> > > >  
> > > > +#ifdef CONFIG_DEBUG_SG
> > > > +   /* Catch bugs when the caller uses DMA API on the result of 
> > > > kvmalloc. */
> > > > +   if (!(prandom_u32_max(2) & 1))
> > > > +   goto do_vmalloc;
> > > > +#endif
> > > 
> > > I really do not think there is anything DEBUG_SG specific here. Why you
> > > simply do not follow should_failslab path or even reuse the function?
> > 
> > CONFIG_DEBUG_SG is enabled by default in RHEL and Fedora debug kernel (if 
> > you don't like CONFIG_DEBUG_SG, pick any other option that is enabled 
> > there).
> 
> Are you telling me that you are shaping a debugging functionality basing
> on what RHEL has enabled? And you call me evil. This is just rediculous.
> 
> > Fail-injection framework is if off by default and it must be explicitly 
> > enabled and configured by the user - and most users won't enable it.
> 
> It can be enabled easily. And if you care enough for your debugging
> kernel then just make it enabled unconditionally.

So, should we add a new option CONFIG_KVMALLOC_FALLBACK_DEFAULT? I'm not 
quite sure if 3 lines of debugging code need an extra option, but if you 
don't want to reuse any existing debug option, it may be possible. Adding 
it to the RHEL debug kernel would be trivial.

Mikulas


Re: [PATCH v3] kvmalloc: always use vmalloc if CONFIG_DEBUG_SG

2018-04-24 Thread Mikulas Patocka


On Tue, 24 Apr 2018, Michal Hocko wrote:

> On Tue 24-04-18 11:50:30, Mikulas Patocka wrote:
> > 
> > 
> > On Tue, 24 Apr 2018, Michal Hocko wrote:
> > 
> > > On Mon 23-04-18 20:06:16, Mikulas Patocka wrote:
> > > [...]
> > > > @@ -404,6 +405,12 @@ void *kvmalloc_node(size_t size, gfp_t f
> > > >  */
> > > > WARN_ON_ONCE((flags & GFP_KERNEL) != GFP_KERNEL);
> > > >  
> > > > +#ifdef CONFIG_DEBUG_SG
> > > > +   /* Catch bugs when the caller uses DMA API on the result of 
> > > > kvmalloc. */
> > > > +   if (!(prandom_u32_max(2) & 1))
> > > > +   goto do_vmalloc;
> > > > +#endif
> > > 
> > > I really do not think there is anything DEBUG_SG specific here. Why you
> > > simply do not follow should_failslab path or even reuse the function?
> > 
> > CONFIG_DEBUG_SG is enabled by default in RHEL and Fedora debug kernel (if 
> > you don't like CONFIG_DEBUG_SG, pick any other option that is enabled 
> > there).
> 
> Are you telling me that you are shaping a debugging functionality basing
> on what RHEL has enabled? And you call me evil. This is just rediculous.
> 
> > Fail-injection framework is if off by default and it must be explicitly 
> > enabled and configured by the user - and most users won't enable it.
> 
> It can be enabled easily. And if you care enough for your debugging
> kernel then just make it enabled unconditionally.

So, should we add a new option CONFIG_KVMALLOC_FALLBACK_DEFAULT? I'm not 
quite sure if 3 lines of debugging code need an extra option, but if you 
don't want to reuse any existing debug option, it may be possible. Adding 
it to the RHEL debug kernel would be trivial.

Mikulas


Re: vmalloc with GFP_NOFS

2018-04-24 Thread Michal Hocko
On Tue 24-04-18 12:46:55, Mikulas Patocka wrote:
> 
> 
> On Tue, 24 Apr 2018, Michal Hocko wrote:
> 
> > Hi,
> > it seems that we still have few vmalloc users who perform GFP_NOFS
> > allocation:
> > drivers/mtd/ubi/io.c
> > fs/ext4/xattr.c
> > fs/gfs2/dir.c
> > fs/gfs2/quota.c
> > fs/nfs/blocklayout/extent_tree.c
> > fs/ubifs/debug.c
> > fs/ubifs/lprops.c
> > fs/ubifs/lpt_commit.c
> > fs/ubifs/orphan.c
> > 
> > Unfortunatelly vmalloc doesn't suppoer GFP_NOFS semantinc properly
> > because we do have hardocded GFP_KERNEL allocations deep inside the
> > vmalloc layers. That means that if GFP_NOFS really protects from
> > recursion into the fs deadlocks then the vmalloc call is broken.
> > 
> > What to do about this? Well, there are two things. Firstly, it would be
> > really great to double check whether the GFP_NOFS is really needed. I
> > cannot judge that because I am not familiar with the code. It would be
> > great if the respective maintainers (hopefully get_maintainer.sh pointed
> > me to all relevant ones). If there is not reclaim recursion issue then
> > simply use the standard vmalloc (aka GFP_KERNEL request).
> > 
> > If the use is really valid then we have a way to do the vmalloc
> > allocation properly. We have memalloc_nofs_{save,restore} scope api. How
> > does that work? You simply call memalloc_nofs_save when the reclaim
> > recursion critical section starts (e.g. when you take a lock which is
> > then used in the reclaim path - e.g. shrinker) and memalloc_nofs_restore
> > when the critical section ends. _All_ allocations within that scope
> > will get GFP_NOFS semantic automagically. If you are not sure about the
> > scope itself then the easiest workaround is to wrap the vmalloc itself
> > with a big fat comment that this should be revisited.
> > 
> > Does that sound like something that can be done in a reasonable time?
> > I have tried to bring this up in the past but our speed is glacial and
> > there are attempts to do hacks like checking for abusers inside the
> > vmalloc which is just too ugly to live.
> > 
> > Please do not hesitate to get back to me if something is not clear.
> > 
> > Thanks!
> > -- 
> > Michal Hocko
> > SUSE Labs
> 
> I made a patch that adds memalloc_noio/fs_save around these calls a year 
> ago: http://lkml.iu.edu/hypermail/linux/kernel/1707.0/01376.html

Yeah, and that is the wrong approach. Let's try to fix this properly
this time. As the above outlines, the worst case we can end up mid-term
would be to wrap vmalloc calls with the scope api with a TODO. But I am
pretty sure the respective maintainers can come up with a better
solution. I am definitely willing to help here.
-- 
Michal Hocko
SUSE Labs


Re: vmalloc with GFP_NOFS

2018-04-24 Thread Michal Hocko
On Tue 24-04-18 12:46:55, Mikulas Patocka wrote:
> 
> 
> On Tue, 24 Apr 2018, Michal Hocko wrote:
> 
> > Hi,
> > it seems that we still have few vmalloc users who perform GFP_NOFS
> > allocation:
> > drivers/mtd/ubi/io.c
> > fs/ext4/xattr.c
> > fs/gfs2/dir.c
> > fs/gfs2/quota.c
> > fs/nfs/blocklayout/extent_tree.c
> > fs/ubifs/debug.c
> > fs/ubifs/lprops.c
> > fs/ubifs/lpt_commit.c
> > fs/ubifs/orphan.c
> > 
> > Unfortunatelly vmalloc doesn't suppoer GFP_NOFS semantinc properly
> > because we do have hardocded GFP_KERNEL allocations deep inside the
> > vmalloc layers. That means that if GFP_NOFS really protects from
> > recursion into the fs deadlocks then the vmalloc call is broken.
> > 
> > What to do about this? Well, there are two things. Firstly, it would be
> > really great to double check whether the GFP_NOFS is really needed. I
> > cannot judge that because I am not familiar with the code. It would be
> > great if the respective maintainers (hopefully get_maintainer.sh pointed
> > me to all relevant ones). If there is not reclaim recursion issue then
> > simply use the standard vmalloc (aka GFP_KERNEL request).
> > 
> > If the use is really valid then we have a way to do the vmalloc
> > allocation properly. We have memalloc_nofs_{save,restore} scope api. How
> > does that work? You simply call memalloc_nofs_save when the reclaim
> > recursion critical section starts (e.g. when you take a lock which is
> > then used in the reclaim path - e.g. shrinker) and memalloc_nofs_restore
> > when the critical section ends. _All_ allocations within that scope
> > will get GFP_NOFS semantic automagically. If you are not sure about the
> > scope itself then the easiest workaround is to wrap the vmalloc itself
> > with a big fat comment that this should be revisited.
> > 
> > Does that sound like something that can be done in a reasonable time?
> > I have tried to bring this up in the past but our speed is glacial and
> > there are attempts to do hacks like checking for abusers inside the
> > vmalloc which is just too ugly to live.
> > 
> > Please do not hesitate to get back to me if something is not clear.
> > 
> > Thanks!
> > -- 
> > Michal Hocko
> > SUSE Labs
> 
> I made a patch that adds memalloc_noio/fs_save around these calls a year 
> ago: http://lkml.iu.edu/hypermail/linux/kernel/1707.0/01376.html

Yeah, and that is the wrong approach. Let's try to fix this properly
this time. As the above outlines, the worst case we can end up mid-term
would be to wrap vmalloc calls with the scope api with a TODO. But I am
pretty sure the respective maintainers can come up with a better
solution. I am definitely willing to help here.
-- 
Michal Hocko
SUSE Labs


[PATCH 2/7] dt-bindings: add generic gnss binding

2018-04-24 Thread Johan Hovold
Describe generic properties for GNSS receivers.

Signed-off-by: Johan Hovold 
---
 .../devicetree/bindings/gnss/gnss.txt | 36 +++
 MAINTAINERS   |  1 +
 2 files changed, 37 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/gnss/gnss.txt

diff --git a/Documentation/devicetree/bindings/gnss/gnss.txt 
b/Documentation/devicetree/bindings/gnss/gnss.txt
new file mode 100644
index ..bcdaca043eaa
--- /dev/null
+++ b/Documentation/devicetree/bindings/gnss/gnss.txt
@@ -0,0 +1,36 @@
+GNSS Receiver DT binding
+
+This documents the binding structure and common properties for GNSS receiver
+devices.
+
+A GNSS receiver node is a node named "gnss" and typically resides on a serial
+bus (e.g. UART, I2C or SPI).
+
+Please refer to the following documents for generic properties:
+
+   Documentation/devicetree/bindings/serial/slave-device.txt
+   Documentation/devicetree/bindings/spi/spi-bus.txt
+
+Required Properties:
+
+- compatible   : A string reflecting the vendor and specific device the node
+ represents
+
+Optional Properties:
+- enable-gpios : GPIO used to power on (or off) the device
+- timepulse-gpios  : Timepulse (e.g. 1PPS) GPIO
+
+Example:
+
+serial@1234 {
+   compatible = "ns16550a";
+
+   gnss {
+   compatible = "u-blox,neo-8";
+
+   vcc-supply = <_reg>;
+   timepulse-gpios = < 16 GPIO_ACTIVE_HIGH>;
+
+   current-speed = <4800>;
+   };
+};
diff --git a/MAINTAINERS b/MAINTAINERS
index dc3df211c1a7..fa219e80a1f8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5965,6 +5965,7 @@ F:include/uapi/linux/gigaset_dev.h
 GNSS SUBSYSTEM
 M: Johan Hovold 
 S: Maintained
+F: Documentation/devicetree/bindings/gnss/
 F: drivers/gnss/
 F: include/linux/gnss.h
 
-- 
2.17.0



[PATCH 2/7] dt-bindings: add generic gnss binding

2018-04-24 Thread Johan Hovold
Describe generic properties for GNSS receivers.

Signed-off-by: Johan Hovold 
---
 .../devicetree/bindings/gnss/gnss.txt | 36 +++
 MAINTAINERS   |  1 +
 2 files changed, 37 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/gnss/gnss.txt

diff --git a/Documentation/devicetree/bindings/gnss/gnss.txt 
b/Documentation/devicetree/bindings/gnss/gnss.txt
new file mode 100644
index ..bcdaca043eaa
--- /dev/null
+++ b/Documentation/devicetree/bindings/gnss/gnss.txt
@@ -0,0 +1,36 @@
+GNSS Receiver DT binding
+
+This documents the binding structure and common properties for GNSS receiver
+devices.
+
+A GNSS receiver node is a node named "gnss" and typically resides on a serial
+bus (e.g. UART, I2C or SPI).
+
+Please refer to the following documents for generic properties:
+
+   Documentation/devicetree/bindings/serial/slave-device.txt
+   Documentation/devicetree/bindings/spi/spi-bus.txt
+
+Required Properties:
+
+- compatible   : A string reflecting the vendor and specific device the node
+ represents
+
+Optional Properties:
+- enable-gpios : GPIO used to power on (or off) the device
+- timepulse-gpios  : Timepulse (e.g. 1PPS) GPIO
+
+Example:
+
+serial@1234 {
+   compatible = "ns16550a";
+
+   gnss {
+   compatible = "u-blox,neo-8";
+
+   vcc-supply = <_reg>;
+   timepulse-gpios = < 16 GPIO_ACTIVE_HIGH>;
+
+   current-speed = <4800>;
+   };
+};
diff --git a/MAINTAINERS b/MAINTAINERS
index dc3df211c1a7..fa219e80a1f8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5965,6 +5965,7 @@ F:include/uapi/linux/gigaset_dev.h
 GNSS SUBSYSTEM
 M: Johan Hovold 
 S: Maintained
+F: Documentation/devicetree/bindings/gnss/
 F: drivers/gnss/
 F: include/linux/gnss.h
 
-- 
2.17.0



Re: [PATCH] net: phy: TLK10X initial driver submission

2018-04-24 Thread Florian Fainelli


On 04/19/2018 01:28 AM, Måns Andersson wrote:
> From: Mans Andersson 
> 
> Add suport for the TI TLK105 and TLK106 10/100Mbit ethernet phys.
> 
> In addition the TLK10X needs to be removed from DP83848 driver as the
> power back off support is added here for this device.

I would not think this is a compelling enough reason, you could very
well just adjust the dp83848.c driver just to account for these
properties that you are introducing. More comments below.

[snip]

> +#define TLK10X_INT_EN_MASK   \
> + (TLK10X_MISR_ANC_INT_EN |   \
> +  TLK10X_MISR_DUP_INT_EN |   \
> +  TLK10X_MISR_SPD_INT_EN |   \
> +  TLK10X_MISR_LINK_INT_EN)
> +
> +struct tlk10x_private {
> + int pwrbo_level;

unsigned int

> +};
> +
> +static int tlk10x_read(struct phy_device *phydev, int reg)
> +{
> + if (reg & ~0x1f) {
> + /* Extended register */
> + phy_write(phydev, TLK10X_REGCR, 0x001F);
> + phy_write(phydev, TLK10X_ADDAR, reg);
> + phy_write(phydev, TLK10X_REGCR, 0x401F);
> + reg = TLK10X_ADDAR;
> + }

Humm, this looks a bit fragile, you would likely want to create separate
helper functions for these extended registers and make sure you handle
write failures as well. Also consider making use of the page helpers
from include/linux/phy.h.

> +
> + return phy_read(phydev, reg);
> +}
> +
> +static int tlk10x_write(struct phy_device *phydev, int reg, int val)
> +{
> + if (reg & ~0x1f) {
> + /* Extended register */
> + phy_write(phydev, TLK10X_REGCR, 0x001F);
> + phy_write(phydev, TLK10X_ADDAR, reg);
> + phy_write(phydev, TLK10X_REGCR, 0x401F);
> + reg = TLK10X_ADDAR;
> + }

Same here.

> +
> + return phy_write(phydev, reg, val);
> +}
> +
> +#ifdef CONFIG_OF_MDIO
> +static int tlk10x_of_init(struct phy_device *phydev)
> +{
> + struct tlk10x_private *tlk10x = phydev->priv;
> + struct device *dev = >mdio.dev;
> + struct device_node *of_node = dev->of_node;
> + int ret;
> +
> + if (!of_node)
> + return 0;
> +
> + ret = of_property_read_u32(of_node, "ti,power-back-off",
> +>pwrbo_level);
> + if (ret) {
> + dev_err(dev, "missing ti,power-back-off property");
> + tlk10x->pwrbo_level = 0;

This should not be necessary, that should be the default with a zero
initialized private data structure.

> + }
> +
> + return 0;
> +}
> +#else
> +static int tlk10x_of_init(struct phy_device *phydev)
> +{
> + return 0;
> +}
> +#endif /* CONFIG_OF_MDIO */
> +
> +static int tlk10x_config_init(struct phy_device *phydev)
> +{
> + int ret, reg;
> + struct tlk10x_private *tlk10x;
> +
> + ret = genphy_config_init(phydev);
> + if (ret < 0)
> + return ret;
> +
> + if (!phydev->priv) {
> + tlk10x = devm_kzalloc(>mdio.dev, sizeof(*tlk10x),
> +   GFP_KERNEL);
> + if (!tlk10x)
> + return -ENOMEM;
> +
> + phydev->priv = tlk10x;
> + ret = tlk10x_of_init(phydev);
> + if (ret)
> + return ret;
> + } else {
> + tlk10x = (struct tlk10x_private *)phydev->priv;
> + }

You need to implement a probe() function that is responsible for
allocation private memory instead of doing this check.

> +
> + // Power back off
> + if (tlk10x->pwrbo_level < 0 || tlk10x->pwrbo_level > 3)
> + tlk10x->pwrbo_level = 0;

How can you have pwrb_level < 0 when you use of_read_property_u32()?

> + reg = tlk10x_read(phydev, TLK10X_PWRBOCR);
> + reg = ((reg & ~TLK10X_PWRBOCR_MASK)
> + | (tlk10x->pwrbo_level << 6));

One too many levels of parenthesis, the outer ones should not be necessary.

> + ret = tlk10x_write(phydev, TLK10X_PWRBOCR, reg);
> + if (ret < 0) {
> + dev_err(>mdio.dev,
> + "unable to set power back-off (err=%d)\n", ret);
> + return ret;
> + }
> + dev_info(>mdio.dev, "power back-off set to level %d\n",
> +  tlk10x->pwrbo_level);

config_init() is called often, consider making this a debugging statement.

-- 
Florian


Re: [PATCH] net: phy: TLK10X initial driver submission

2018-04-24 Thread Florian Fainelli


On 04/19/2018 01:28 AM, Måns Andersson wrote:
> From: Mans Andersson 
> 
> Add suport for the TI TLK105 and TLK106 10/100Mbit ethernet phys.
> 
> In addition the TLK10X needs to be removed from DP83848 driver as the
> power back off support is added here for this device.

I would not think this is a compelling enough reason, you could very
well just adjust the dp83848.c driver just to account for these
properties that you are introducing. More comments below.

[snip]

> +#define TLK10X_INT_EN_MASK   \
> + (TLK10X_MISR_ANC_INT_EN |   \
> +  TLK10X_MISR_DUP_INT_EN |   \
> +  TLK10X_MISR_SPD_INT_EN |   \
> +  TLK10X_MISR_LINK_INT_EN)
> +
> +struct tlk10x_private {
> + int pwrbo_level;

unsigned int

> +};
> +
> +static int tlk10x_read(struct phy_device *phydev, int reg)
> +{
> + if (reg & ~0x1f) {
> + /* Extended register */
> + phy_write(phydev, TLK10X_REGCR, 0x001F);
> + phy_write(phydev, TLK10X_ADDAR, reg);
> + phy_write(phydev, TLK10X_REGCR, 0x401F);
> + reg = TLK10X_ADDAR;
> + }

Humm, this looks a bit fragile, you would likely want to create separate
helper functions for these extended registers and make sure you handle
write failures as well. Also consider making use of the page helpers
from include/linux/phy.h.

> +
> + return phy_read(phydev, reg);
> +}
> +
> +static int tlk10x_write(struct phy_device *phydev, int reg, int val)
> +{
> + if (reg & ~0x1f) {
> + /* Extended register */
> + phy_write(phydev, TLK10X_REGCR, 0x001F);
> + phy_write(phydev, TLK10X_ADDAR, reg);
> + phy_write(phydev, TLK10X_REGCR, 0x401F);
> + reg = TLK10X_ADDAR;
> + }

Same here.

> +
> + return phy_write(phydev, reg, val);
> +}
> +
> +#ifdef CONFIG_OF_MDIO
> +static int tlk10x_of_init(struct phy_device *phydev)
> +{
> + struct tlk10x_private *tlk10x = phydev->priv;
> + struct device *dev = >mdio.dev;
> + struct device_node *of_node = dev->of_node;
> + int ret;
> +
> + if (!of_node)
> + return 0;
> +
> + ret = of_property_read_u32(of_node, "ti,power-back-off",
> +>pwrbo_level);
> + if (ret) {
> + dev_err(dev, "missing ti,power-back-off property");
> + tlk10x->pwrbo_level = 0;

This should not be necessary, that should be the default with a zero
initialized private data structure.

> + }
> +
> + return 0;
> +}
> +#else
> +static int tlk10x_of_init(struct phy_device *phydev)
> +{
> + return 0;
> +}
> +#endif /* CONFIG_OF_MDIO */
> +
> +static int tlk10x_config_init(struct phy_device *phydev)
> +{
> + int ret, reg;
> + struct tlk10x_private *tlk10x;
> +
> + ret = genphy_config_init(phydev);
> + if (ret < 0)
> + return ret;
> +
> + if (!phydev->priv) {
> + tlk10x = devm_kzalloc(>mdio.dev, sizeof(*tlk10x),
> +   GFP_KERNEL);
> + if (!tlk10x)
> + return -ENOMEM;
> +
> + phydev->priv = tlk10x;
> + ret = tlk10x_of_init(phydev);
> + if (ret)
> + return ret;
> + } else {
> + tlk10x = (struct tlk10x_private *)phydev->priv;
> + }

You need to implement a probe() function that is responsible for
allocation private memory instead of doing this check.

> +
> + // Power back off
> + if (tlk10x->pwrbo_level < 0 || tlk10x->pwrbo_level > 3)
> + tlk10x->pwrbo_level = 0;

How can you have pwrb_level < 0 when you use of_read_property_u32()?

> + reg = tlk10x_read(phydev, TLK10X_PWRBOCR);
> + reg = ((reg & ~TLK10X_PWRBOCR_MASK)
> + | (tlk10x->pwrbo_level << 6));

One too many levels of parenthesis, the outer ones should not be necessary.

> + ret = tlk10x_write(phydev, TLK10X_PWRBOCR, reg);
> + if (ret < 0) {
> + dev_err(>mdio.dev,
> + "unable to set power back-off (err=%d)\n", ret);
> + return ret;
> + }
> + dev_info(>mdio.dev, "power back-off set to level %d\n",
> +  tlk10x->pwrbo_level);

config_init() is called often, consider making this a debugging statement.

-- 
Florian


[PATCH 3/7] gnss: add generic serial driver

2018-04-24 Thread Johan Hovold
Add a generic serial GNSS driver (library) which provides a common
implementation for the gnss interface and power management (runtime and
system suspend). This allows GNSS drivers for specific chip to be
implemented by simply providing a set_power() callback to handle three
states: ACTIVE, STANDBY and OFF.

Signed-off-by: Johan Hovold 
---
 drivers/gnss/Kconfig  |   7 +
 drivers/gnss/Makefile |   3 +
 drivers/gnss/serial.c | 288 ++
 drivers/gnss/serial.h |  47 +++
 4 files changed, 345 insertions(+)
 create mode 100644 drivers/gnss/serial.c
 create mode 100644 drivers/gnss/serial.h

diff --git a/drivers/gnss/Kconfig b/drivers/gnss/Kconfig
index 103fcc70992e..f8ee54f99a8d 100644
--- a/drivers/gnss/Kconfig
+++ b/drivers/gnss/Kconfig
@@ -9,3 +9,10 @@ menuconfig GNSS
 
  To compile this driver as a module, choose M here: the module will
  be called gnss.
+
+if GNSS
+
+config GNSS_SERIAL
+   tristate
+
+endif # GNSS
diff --git a/drivers/gnss/Makefile b/drivers/gnss/Makefile
index 1f7a7baab1d9..171aba71684d 100644
--- a/drivers/gnss/Makefile
+++ b/drivers/gnss/Makefile
@@ -5,3 +5,6 @@
 
 obj-$(CONFIG_GNSS) += gnss.o
 gnss-y := core.o
+
+obj-$(CONFIG_GNSS_SERIAL)  += gnss-serial.o
+gnss-serial-y := serial.o
diff --git a/drivers/gnss/serial.c b/drivers/gnss/serial.c
new file mode 100644
index ..06a4b511f380
--- /dev/null
+++ b/drivers/gnss/serial.c
@@ -0,0 +1,288 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Generic serial GNSS receiver driver
+ *
+ * Copyright (C) 2018 Johan Hovold 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "serial.h"
+
+static int gnss_serial_open(struct gnss_device *gdev)
+{
+   struct gnss_serial *gserial = gnss_get_drvdata(gdev);
+   struct serdev_device *serdev = gserial->serdev;
+   int ret;
+
+   ret = serdev_device_open(serdev);
+   if (ret)
+   return ret;
+
+   serdev_device_set_baudrate(serdev, gserial->speed);
+   serdev_device_set_flow_control(serdev, false);
+
+   ret = pm_runtime_get_sync(>dev);
+   if (ret < 0) {
+   pm_runtime_put_noidle(>dev);
+   goto err_close;
+   }
+
+   return 0;
+
+err_close:
+   serdev_device_close(serdev);
+
+   return ret;
+}
+
+static void gnss_serial_close(struct gnss_device *gdev)
+{
+   struct gnss_serial *gserial = gnss_get_drvdata(gdev);
+   struct serdev_device *serdev = gserial->serdev;
+
+   serdev_device_close(serdev);
+
+   pm_runtime_put(>dev);
+}
+
+static int gnss_serial_write_raw(struct gnss_device *gdev,
+   const unsigned char *buf, size_t count)
+{
+   struct gnss_serial *gserial = gnss_get_drvdata(gdev);
+   struct serdev_device *serdev = gserial->serdev;
+   int ret;
+
+   /* write is only buffered synchronously */
+   ret = serdev_device_write(serdev, buf, count, 0);
+   if (ret < 0)
+   return ret;
+
+   /* FIXME: determine if interrupted? */
+   serdev_device_wait_until_sent(serdev, 0);
+
+   return count;
+}
+
+static const struct gnss_operations gnss_serial_gnss_ops = {
+   .open   = gnss_serial_open,
+   .close  = gnss_serial_close,
+   .write_raw  = gnss_serial_write_raw,
+};
+
+static int gnss_serial_receive_buf(struct serdev_device *serdev,
+   const unsigned char *buf, size_t count)
+{
+   struct gnss_serial *gserial = serdev_device_get_drvdata(serdev);
+   struct gnss_device *gdev = gserial->gdev;
+
+   return gnss_insert_raw(gdev, buf, count);
+}
+
+static const struct serdev_device_ops gnss_serial_serdev_ops = {
+   .receive_buf= gnss_serial_receive_buf,
+   .write_wakeup   = serdev_device_write_wakeup,
+};
+
+static int gnss_serial_set_power(struct gnss_serial *gserial,
+   enum gnss_serial_pm_state state)
+{
+   if (!gserial->ops || !gserial->ops->set_power)
+   return 0;
+
+   return gserial->ops->set_power(gserial, state);
+}
+
+/*
+ * FIXME: need to provide subdriver defaults or separate dt parsing from
+ * allocation.
+ */
+static int gnss_serial_parse_dt(struct serdev_device *serdev)
+{
+   struct gnss_serial *gserial = serdev_device_get_drvdata(serdev);
+   struct device_node *node = serdev->dev.of_node;
+   u32 speed = 4800;
+
+   of_property_read_u32(node, "current-speed", );
+
+   gserial->speed = speed;
+
+   return 0;
+}
+
+struct gnss_serial *
+gnss_serial_allocate(struct serdev_device *serdev, size_t data_size)
+{
+   struct gnss_serial *gserial;
+   struct gnss_device *gdev;
+   int ret;
+
+   gserial = kzalloc(sizeof(*gserial) + data_size, GFP_KERNEL);
+   if (!gserial)
+   return ERR_PTR(-ENOMEM);
+
+   gdev = gnss_allocate_device(>dev);
+   if 

[PATCH 3/7] gnss: add generic serial driver

2018-04-24 Thread Johan Hovold
Add a generic serial GNSS driver (library) which provides a common
implementation for the gnss interface and power management (runtime and
system suspend). This allows GNSS drivers for specific chip to be
implemented by simply providing a set_power() callback to handle three
states: ACTIVE, STANDBY and OFF.

Signed-off-by: Johan Hovold 
---
 drivers/gnss/Kconfig  |   7 +
 drivers/gnss/Makefile |   3 +
 drivers/gnss/serial.c | 288 ++
 drivers/gnss/serial.h |  47 +++
 4 files changed, 345 insertions(+)
 create mode 100644 drivers/gnss/serial.c
 create mode 100644 drivers/gnss/serial.h

diff --git a/drivers/gnss/Kconfig b/drivers/gnss/Kconfig
index 103fcc70992e..f8ee54f99a8d 100644
--- a/drivers/gnss/Kconfig
+++ b/drivers/gnss/Kconfig
@@ -9,3 +9,10 @@ menuconfig GNSS
 
  To compile this driver as a module, choose M here: the module will
  be called gnss.
+
+if GNSS
+
+config GNSS_SERIAL
+   tristate
+
+endif # GNSS
diff --git a/drivers/gnss/Makefile b/drivers/gnss/Makefile
index 1f7a7baab1d9..171aba71684d 100644
--- a/drivers/gnss/Makefile
+++ b/drivers/gnss/Makefile
@@ -5,3 +5,6 @@
 
 obj-$(CONFIG_GNSS) += gnss.o
 gnss-y := core.o
+
+obj-$(CONFIG_GNSS_SERIAL)  += gnss-serial.o
+gnss-serial-y := serial.o
diff --git a/drivers/gnss/serial.c b/drivers/gnss/serial.c
new file mode 100644
index ..06a4b511f380
--- /dev/null
+++ b/drivers/gnss/serial.c
@@ -0,0 +1,288 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Generic serial GNSS receiver driver
+ *
+ * Copyright (C) 2018 Johan Hovold 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "serial.h"
+
+static int gnss_serial_open(struct gnss_device *gdev)
+{
+   struct gnss_serial *gserial = gnss_get_drvdata(gdev);
+   struct serdev_device *serdev = gserial->serdev;
+   int ret;
+
+   ret = serdev_device_open(serdev);
+   if (ret)
+   return ret;
+
+   serdev_device_set_baudrate(serdev, gserial->speed);
+   serdev_device_set_flow_control(serdev, false);
+
+   ret = pm_runtime_get_sync(>dev);
+   if (ret < 0) {
+   pm_runtime_put_noidle(>dev);
+   goto err_close;
+   }
+
+   return 0;
+
+err_close:
+   serdev_device_close(serdev);
+
+   return ret;
+}
+
+static void gnss_serial_close(struct gnss_device *gdev)
+{
+   struct gnss_serial *gserial = gnss_get_drvdata(gdev);
+   struct serdev_device *serdev = gserial->serdev;
+
+   serdev_device_close(serdev);
+
+   pm_runtime_put(>dev);
+}
+
+static int gnss_serial_write_raw(struct gnss_device *gdev,
+   const unsigned char *buf, size_t count)
+{
+   struct gnss_serial *gserial = gnss_get_drvdata(gdev);
+   struct serdev_device *serdev = gserial->serdev;
+   int ret;
+
+   /* write is only buffered synchronously */
+   ret = serdev_device_write(serdev, buf, count, 0);
+   if (ret < 0)
+   return ret;
+
+   /* FIXME: determine if interrupted? */
+   serdev_device_wait_until_sent(serdev, 0);
+
+   return count;
+}
+
+static const struct gnss_operations gnss_serial_gnss_ops = {
+   .open   = gnss_serial_open,
+   .close  = gnss_serial_close,
+   .write_raw  = gnss_serial_write_raw,
+};
+
+static int gnss_serial_receive_buf(struct serdev_device *serdev,
+   const unsigned char *buf, size_t count)
+{
+   struct gnss_serial *gserial = serdev_device_get_drvdata(serdev);
+   struct gnss_device *gdev = gserial->gdev;
+
+   return gnss_insert_raw(gdev, buf, count);
+}
+
+static const struct serdev_device_ops gnss_serial_serdev_ops = {
+   .receive_buf= gnss_serial_receive_buf,
+   .write_wakeup   = serdev_device_write_wakeup,
+};
+
+static int gnss_serial_set_power(struct gnss_serial *gserial,
+   enum gnss_serial_pm_state state)
+{
+   if (!gserial->ops || !gserial->ops->set_power)
+   return 0;
+
+   return gserial->ops->set_power(gserial, state);
+}
+
+/*
+ * FIXME: need to provide subdriver defaults or separate dt parsing from
+ * allocation.
+ */
+static int gnss_serial_parse_dt(struct serdev_device *serdev)
+{
+   struct gnss_serial *gserial = serdev_device_get_drvdata(serdev);
+   struct device_node *node = serdev->dev.of_node;
+   u32 speed = 4800;
+
+   of_property_read_u32(node, "current-speed", );
+
+   gserial->speed = speed;
+
+   return 0;
+}
+
+struct gnss_serial *
+gnss_serial_allocate(struct serdev_device *serdev, size_t data_size)
+{
+   struct gnss_serial *gserial;
+   struct gnss_device *gdev;
+   int ret;
+
+   gserial = kzalloc(sizeof(*gserial) + data_size, GFP_KERNEL);
+   if (!gserial)
+   return ERR_PTR(-ENOMEM);
+
+   gdev = gnss_allocate_device(>dev);
+   if (!gdev) {
+   ret = -ENOMEM;
+ 

[PATCH 4/7] dt-bindings: gnss: add u-blox binding

2018-04-24 Thread Johan Hovold
Add binding for u-blox GNSS receivers.

Note that the u-blox product names encodes form factor (e.g. "neo"),
chipset (e.g. "8") and variant (e.g. "q"), but that only formfactor and
chipset is used for the compatible strings (for now).

Signed-off-by: Johan Hovold 
---
 .../devicetree/bindings/gnss/u-blox.txt   | 31 +++
 .../devicetree/bindings/vendor-prefixes.txt   |  1 +
 2 files changed, 32 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/gnss/u-blox.txt

diff --git a/Documentation/devicetree/bindings/gnss/u-blox.txt 
b/Documentation/devicetree/bindings/gnss/u-blox.txt
new file mode 100644
index ..bb54b83a177f
--- /dev/null
+++ b/Documentation/devicetree/bindings/gnss/u-blox.txt
@@ -0,0 +1,31 @@
+u-blox GNSS Receiver DT binding
+
+The u-blox GNSS receivers can use UART, DDC (I2C), SPI and USB interfaces.
+
+Please see Documentation/devicetree/bindings/gnss/gnss.txt for generic
+properties.
+
+Required Properties:
+
+- compatible   : Must be one of
+
+   "u-blox,neo-8"
+   "u-blox,neo-m8"
+
+- vcc-supply   : Main voltage regulator (VCC)
+
+Optional Properties:
+
+- timepulse-gpios  : Timepulse (e.g. 1PPS) GPIO (TIMEPULSE)
+
+Example:
+
+serial@1234 {
+   compatible = "ns16550a";
+
+   gnss {
+   compatible = "u-blox,neo-8";
+
+   vcc-supply = <_reg>;
+   };
+};
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt 
b/Documentation/devicetree/bindings/vendor-prefixes.txt
index b5f978a4cac6..2128dfdf73f1 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -374,6 +374,7 @@ tronsmart   Tronsmart
 truly  Truly Semiconductors Limited
 tsdTheobroma Systems Design und Consulting GmbH
 tyan   Tyan Computer Corporation
+u-blox u-blox
 ucrobotics uCRobotics
 ubnt   Ubiquiti Networks
 udoo   Udoo
-- 
2.17.0



[PATCH 4/7] dt-bindings: gnss: add u-blox binding

2018-04-24 Thread Johan Hovold
Add binding for u-blox GNSS receivers.

Note that the u-blox product names encodes form factor (e.g. "neo"),
chipset (e.g. "8") and variant (e.g. "q"), but that only formfactor and
chipset is used for the compatible strings (for now).

Signed-off-by: Johan Hovold 
---
 .../devicetree/bindings/gnss/u-blox.txt   | 31 +++
 .../devicetree/bindings/vendor-prefixes.txt   |  1 +
 2 files changed, 32 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/gnss/u-blox.txt

diff --git a/Documentation/devicetree/bindings/gnss/u-blox.txt 
b/Documentation/devicetree/bindings/gnss/u-blox.txt
new file mode 100644
index ..bb54b83a177f
--- /dev/null
+++ b/Documentation/devicetree/bindings/gnss/u-blox.txt
@@ -0,0 +1,31 @@
+u-blox GNSS Receiver DT binding
+
+The u-blox GNSS receivers can use UART, DDC (I2C), SPI and USB interfaces.
+
+Please see Documentation/devicetree/bindings/gnss/gnss.txt for generic
+properties.
+
+Required Properties:
+
+- compatible   : Must be one of
+
+   "u-blox,neo-8"
+   "u-blox,neo-m8"
+
+- vcc-supply   : Main voltage regulator (VCC)
+
+Optional Properties:
+
+- timepulse-gpios  : Timepulse (e.g. 1PPS) GPIO (TIMEPULSE)
+
+Example:
+
+serial@1234 {
+   compatible = "ns16550a";
+
+   gnss {
+   compatible = "u-blox,neo-8";
+
+   vcc-supply = <_reg>;
+   };
+};
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt 
b/Documentation/devicetree/bindings/vendor-prefixes.txt
index b5f978a4cac6..2128dfdf73f1 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -374,6 +374,7 @@ tronsmart   Tronsmart
 truly  Truly Semiconductors Limited
 tsdTheobroma Systems Design und Consulting GmbH
 tyan   Tyan Computer Corporation
+u-blox u-blox
 ucrobotics uCRobotics
 ubnt   Ubiquiti Networks
 udoo   Udoo
-- 
2.17.0



[PATCH 5/7] gnss: add driver for u-blox receivers

2018-04-24 Thread Johan Hovold
Add driver for serial-connected u-blox GNSS receivers.

Note that the driver uses the generic GNSS serial implementation and
therefore essentially only manages power abstracted into three power
states: ACTIVE, STANDBY, and OFF.

For u-blox receivers with a single supply and no enable-gpios, this
simply means that the main supply is disabled in STANDBY and OFF.

Note that timepulse-support is not yet implemented.

Signed-off-by: Johan Hovold 
---
 drivers/gnss/Kconfig  |  13 +
 drivers/gnss/Makefile |   3 +
 drivers/gnss/ubx.c| 133 ++
 3 files changed, 149 insertions(+)
 create mode 100644 drivers/gnss/ubx.c

diff --git a/drivers/gnss/Kconfig b/drivers/gnss/Kconfig
index f8ee54f99a8d..784b8c0367d9 100644
--- a/drivers/gnss/Kconfig
+++ b/drivers/gnss/Kconfig
@@ -15,4 +15,17 @@ if GNSS
 config GNSS_SERIAL
tristate
 
+config GNSS_UBX_SERIAL
+   tristate "u-blox GNSS receiver support"
+   depends on SERIAL_DEV_BUS
+   select GNSS_SERIAL
+   ---help---
+ Say Y here if you have a u-blox GNSS receiver which uses a serial
+ interface.
+
+ To compile this driver as a module, choose M here: the module will
+ be called gnss-ubx.
+
+ If unsure, say N.
+
 endif # GNSS
diff --git a/drivers/gnss/Makefile b/drivers/gnss/Makefile
index 171aba71684d..d9295b20b7bc 100644
--- a/drivers/gnss/Makefile
+++ b/drivers/gnss/Makefile
@@ -8,3 +8,6 @@ gnss-y := core.o
 
 obj-$(CONFIG_GNSS_SERIAL)  += gnss-serial.o
 gnss-serial-y := serial.o
+
+obj-$(CONFIG_GNSS_UBX_SERIAL)  += gnss-ubx.o
+gnss-ubx-y := ubx.o
diff --git a/drivers/gnss/ubx.c b/drivers/gnss/ubx.c
new file mode 100644
index ..2f2f00202b5b
--- /dev/null
+++ b/drivers/gnss/ubx.c
@@ -0,0 +1,133 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * u-blox GNSS receiver driver
+ *
+ * Copyright (C) 2018 Johan Hovold 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "serial.h"
+
+struct ubx_data {
+   struct regulator *vcc;
+};
+
+static int ubx_set_active(struct gnss_serial *gserial)
+{
+   struct ubx_data *data = gnss_serial_get_drvdata(gserial);
+   int ret;
+
+   dev_dbg(>serdev->dev, "%s\n", __func__);
+
+   ret = regulator_enable(data->vcc);
+   if (ret)
+   return ret;
+
+   return 0;
+}
+
+static int ubx_set_standby(struct gnss_serial *gserial)
+{
+   struct ubx_data *data = gnss_serial_get_drvdata(gserial);
+   int ret;
+
+   dev_dbg(>serdev->dev, "%s\n", __func__);
+
+   ret = regulator_disable(data->vcc);
+   if (ret)
+   return ret;
+
+   return 0;
+}
+
+static int
+ubx_set_power(struct gnss_serial *gserial, enum gnss_serial_pm_state state)
+{
+   switch (state) {
+   case GNSS_SERIAL_ACTIVE:
+   return ubx_set_active(gserial);
+   case GNSS_SERIAL_OFF:
+   case GNSS_SERIAL_STANDBY:
+   return ubx_set_standby(gserial);
+   }
+
+   return -EINVAL;
+}
+
+const struct gnss_serial_ops ubx_gserial_ops = {
+   .set_power = ubx_set_power,
+};
+
+static int ubx_probe(struct serdev_device *serdev)
+{
+   struct gnss_serial *gserial;
+   struct ubx_data *data;
+   int ret;
+
+   gserial = gnss_serial_allocate(serdev, sizeof(*data));
+   if (IS_ERR(gserial)) {
+   ret = PTR_ERR(gserial);
+   return ret;
+   }
+
+   gserial->ops = _gserial_ops;
+
+   data = gnss_serial_get_drvdata(gserial);
+
+   data->vcc = devm_regulator_get(>dev, "vcc");
+   if (IS_ERR(data->vcc)) {
+   ret = PTR_ERR(data->vcc);
+   goto err_free_gserial;
+   }
+
+   ret = gnss_serial_register(gserial);
+   if (ret)
+   goto err_free_gserial;
+
+   return 0;
+
+err_free_gserial:
+   gnss_serial_free(gserial);
+
+   return ret;
+}
+
+static void ubx_remove(struct serdev_device *serdev)
+{
+   struct gnss_serial *gserial = serdev_device_get_drvdata(serdev);
+
+   gnss_serial_deregister(gserial);
+   gnss_serial_free(gserial);
+};
+
+#ifdef CONFIG_OF
+static const struct of_device_id ubx_of_match[] = {
+   { .compatible = "u-blox,neo-8" },
+   { .compatible = "u-blox,neo-m8" },
+   {},
+};
+MODULE_DEVICE_TABLE(of, ubx_of_match);
+#endif
+
+static struct serdev_device_driver ubx_driver = {
+   .driver = {
+   .name   = "gnss-ubx",
+   .of_match_table = of_match_ptr(ubx_of_match),
+   .pm = _serial_pm_ops,
+   },
+   .probe  = ubx_probe,
+   .remove = ubx_remove,
+};
+module_serdev_device_driver(ubx_driver);
+
+MODULE_AUTHOR("Johan Hovold ");
+MODULE_DESCRIPTION("u-blox GNSS receiver driver");
+MODULE_LICENSE("GPL v2");
-- 
2.17.0



[PATCH 0/7] gnss: add new GNSS subsystem

2018-04-24 Thread Johan Hovold
This series adds a new subsystem for GNSS receivers (e.g. GPS
receivers).

While GNSS receivers are typically accessed using a UART interface they
often also support other I/O interfaces such as I2C, SPI and USB, while
yet other devices use iomem or even some form of remote-processor
messaging (rpmsg).

The new GNSS subsystem abstracts the underlying interface and provides a
new "gnss" class type, which exposes a character-device interface (e.g.
/dev/gnss0) to user space. This allows GNSS receivers to have a
representation in the Linux device model, something which is important
not least for power management purposes and which also allows for easy
detection and (eventually) identification of GNSS devices.

Note that the character-device interface provides raw access to whatever
protocol the receiver is (currently) using, such as NMEA 0183, UBX or
SiRF Binary. These protocols are expected to be continued to be handled
by user space for the time being, even if some hybrid solutions are also
conceivable (e.g. to have kernel drivers issue management commands).

This will still allow for better platform integration by allowing GNSS
devices and their resources (e.g. regulators and enable-gpios) to be
described by firmware and managed by kernel drivers rather than
platform-specific scripts and services.

While the current interface is kept minimal, it could be extended using
IOCTLs, sysfs or uevents as needs and proper abstraction levels are
identified and determined (e.g. for device and feature identification).

Another possible extension is to add generic 1PPS support.

I decided to go with a custom character-device interface rather than
pretend that these abstract GNSS devices are still TTY devices (e.g.
/dev/ttyGNSS0). Obviously, modifying line settings or reading modem
control signals does not make any sense for a device using, say, a
USB (not USB-serial) or iomem interface. This also means, however, that
user space would no longer be able to set the line speed to match a new
port configuration that can be set using the various GNSS protocols when
the underlying interface is indeed a UART; instead this would need to be
taken care of by the driver.

Also note that writes are always synchronous instead of requiring user
space to call tcdrain() after every command.

This all seems to work well-enough (e.g. with gpsd), but please let me
know if I've overlooked something which would indeed require a TTY
interface instead.

As proof-of-concept I have implemented drivers for receivers based on
two common GNSS chipsets (SiRFstar and u-blox), but due to lack of
hardware these have so far only been tested using mockup devices and a
USB-serial-based GPS device (using out-of-tree code). [ Let me know if
you've got any evalutation kits to spare. ]

Finally, note that documentation (including kerneldoc) remains to be
written, but hopefully this will not hinder review given that the
current interfaces are fairly self-describing.

Johan


Johan Hovold (7):
  gnss: add GNSS receiver subsystem
  dt-bindings: add generic gnss binding
  gnss: add generic serial driver
  dt-bindings: gnss: add u-blox binding
  gnss: add driver for u-blox receivers
  dt-bindings: gnss: add sirfstar binding
  gnss: add driver for sirfstar-based receivers

 .../devicetree/bindings/gnss/gnss.txt |  36 ++
 .../devicetree/bindings/gnss/sirfstar.txt |  38 ++
 .../devicetree/bindings/gnss/u-blox.txt   |  31 ++
 .../devicetree/bindings/vendor-prefixes.txt   |   4 +
 MAINTAINERS   |   7 +
 drivers/Kconfig   |   2 +
 drivers/Makefile  |   1 +
 drivers/gnss/Kconfig  |  43 ++
 drivers/gnss/Makefile |  16 +
 drivers/gnss/core.c   | 385 
 drivers/gnss/serial.c | 288 
 drivers/gnss/serial.h |  47 ++
 drivers/gnss/sirf.c   | 415 ++
 drivers/gnss/ubx.c| 133 ++
 include/linux/gnss.h  |  64 +++
 15 files changed, 1510 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/gnss/gnss.txt
 create mode 100644 Documentation/devicetree/bindings/gnss/sirfstar.txt
 create mode 100644 Documentation/devicetree/bindings/gnss/u-blox.txt
 create mode 100644 drivers/gnss/Kconfig
 create mode 100644 drivers/gnss/Makefile
 create mode 100644 drivers/gnss/core.c
 create mode 100644 drivers/gnss/serial.c
 create mode 100644 drivers/gnss/serial.h
 create mode 100644 drivers/gnss/sirf.c
 create mode 100644 drivers/gnss/ubx.c
 create mode 100644 include/linux/gnss.h

-- 
2.17.0



[PATCH 5/7] gnss: add driver for u-blox receivers

2018-04-24 Thread Johan Hovold
Add driver for serial-connected u-blox GNSS receivers.

Note that the driver uses the generic GNSS serial implementation and
therefore essentially only manages power abstracted into three power
states: ACTIVE, STANDBY, and OFF.

For u-blox receivers with a single supply and no enable-gpios, this
simply means that the main supply is disabled in STANDBY and OFF.

Note that timepulse-support is not yet implemented.

Signed-off-by: Johan Hovold 
---
 drivers/gnss/Kconfig  |  13 +
 drivers/gnss/Makefile |   3 +
 drivers/gnss/ubx.c| 133 ++
 3 files changed, 149 insertions(+)
 create mode 100644 drivers/gnss/ubx.c

diff --git a/drivers/gnss/Kconfig b/drivers/gnss/Kconfig
index f8ee54f99a8d..784b8c0367d9 100644
--- a/drivers/gnss/Kconfig
+++ b/drivers/gnss/Kconfig
@@ -15,4 +15,17 @@ if GNSS
 config GNSS_SERIAL
tristate
 
+config GNSS_UBX_SERIAL
+   tristate "u-blox GNSS receiver support"
+   depends on SERIAL_DEV_BUS
+   select GNSS_SERIAL
+   ---help---
+ Say Y here if you have a u-blox GNSS receiver which uses a serial
+ interface.
+
+ To compile this driver as a module, choose M here: the module will
+ be called gnss-ubx.
+
+ If unsure, say N.
+
 endif # GNSS
diff --git a/drivers/gnss/Makefile b/drivers/gnss/Makefile
index 171aba71684d..d9295b20b7bc 100644
--- a/drivers/gnss/Makefile
+++ b/drivers/gnss/Makefile
@@ -8,3 +8,6 @@ gnss-y := core.o
 
 obj-$(CONFIG_GNSS_SERIAL)  += gnss-serial.o
 gnss-serial-y := serial.o
+
+obj-$(CONFIG_GNSS_UBX_SERIAL)  += gnss-ubx.o
+gnss-ubx-y := ubx.o
diff --git a/drivers/gnss/ubx.c b/drivers/gnss/ubx.c
new file mode 100644
index ..2f2f00202b5b
--- /dev/null
+++ b/drivers/gnss/ubx.c
@@ -0,0 +1,133 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * u-blox GNSS receiver driver
+ *
+ * Copyright (C) 2018 Johan Hovold 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "serial.h"
+
+struct ubx_data {
+   struct regulator *vcc;
+};
+
+static int ubx_set_active(struct gnss_serial *gserial)
+{
+   struct ubx_data *data = gnss_serial_get_drvdata(gserial);
+   int ret;
+
+   dev_dbg(>serdev->dev, "%s\n", __func__);
+
+   ret = regulator_enable(data->vcc);
+   if (ret)
+   return ret;
+
+   return 0;
+}
+
+static int ubx_set_standby(struct gnss_serial *gserial)
+{
+   struct ubx_data *data = gnss_serial_get_drvdata(gserial);
+   int ret;
+
+   dev_dbg(>serdev->dev, "%s\n", __func__);
+
+   ret = regulator_disable(data->vcc);
+   if (ret)
+   return ret;
+
+   return 0;
+}
+
+static int
+ubx_set_power(struct gnss_serial *gserial, enum gnss_serial_pm_state state)
+{
+   switch (state) {
+   case GNSS_SERIAL_ACTIVE:
+   return ubx_set_active(gserial);
+   case GNSS_SERIAL_OFF:
+   case GNSS_SERIAL_STANDBY:
+   return ubx_set_standby(gserial);
+   }
+
+   return -EINVAL;
+}
+
+const struct gnss_serial_ops ubx_gserial_ops = {
+   .set_power = ubx_set_power,
+};
+
+static int ubx_probe(struct serdev_device *serdev)
+{
+   struct gnss_serial *gserial;
+   struct ubx_data *data;
+   int ret;
+
+   gserial = gnss_serial_allocate(serdev, sizeof(*data));
+   if (IS_ERR(gserial)) {
+   ret = PTR_ERR(gserial);
+   return ret;
+   }
+
+   gserial->ops = _gserial_ops;
+
+   data = gnss_serial_get_drvdata(gserial);
+
+   data->vcc = devm_regulator_get(>dev, "vcc");
+   if (IS_ERR(data->vcc)) {
+   ret = PTR_ERR(data->vcc);
+   goto err_free_gserial;
+   }
+
+   ret = gnss_serial_register(gserial);
+   if (ret)
+   goto err_free_gserial;
+
+   return 0;
+
+err_free_gserial:
+   gnss_serial_free(gserial);
+
+   return ret;
+}
+
+static void ubx_remove(struct serdev_device *serdev)
+{
+   struct gnss_serial *gserial = serdev_device_get_drvdata(serdev);
+
+   gnss_serial_deregister(gserial);
+   gnss_serial_free(gserial);
+};
+
+#ifdef CONFIG_OF
+static const struct of_device_id ubx_of_match[] = {
+   { .compatible = "u-blox,neo-8" },
+   { .compatible = "u-blox,neo-m8" },
+   {},
+};
+MODULE_DEVICE_TABLE(of, ubx_of_match);
+#endif
+
+static struct serdev_device_driver ubx_driver = {
+   .driver = {
+   .name   = "gnss-ubx",
+   .of_match_table = of_match_ptr(ubx_of_match),
+   .pm = _serial_pm_ops,
+   },
+   .probe  = ubx_probe,
+   .remove = ubx_remove,
+};
+module_serdev_device_driver(ubx_driver);
+
+MODULE_AUTHOR("Johan Hovold ");
+MODULE_DESCRIPTION("u-blox GNSS receiver driver");
+MODULE_LICENSE("GPL v2");
-- 
2.17.0



[PATCH 0/7] gnss: add new GNSS subsystem

2018-04-24 Thread Johan Hovold
This series adds a new subsystem for GNSS receivers (e.g. GPS
receivers).

While GNSS receivers are typically accessed using a UART interface they
often also support other I/O interfaces such as I2C, SPI and USB, while
yet other devices use iomem or even some form of remote-processor
messaging (rpmsg).

The new GNSS subsystem abstracts the underlying interface and provides a
new "gnss" class type, which exposes a character-device interface (e.g.
/dev/gnss0) to user space. This allows GNSS receivers to have a
representation in the Linux device model, something which is important
not least for power management purposes and which also allows for easy
detection and (eventually) identification of GNSS devices.

Note that the character-device interface provides raw access to whatever
protocol the receiver is (currently) using, such as NMEA 0183, UBX or
SiRF Binary. These protocols are expected to be continued to be handled
by user space for the time being, even if some hybrid solutions are also
conceivable (e.g. to have kernel drivers issue management commands).

This will still allow for better platform integration by allowing GNSS
devices and their resources (e.g. regulators and enable-gpios) to be
described by firmware and managed by kernel drivers rather than
platform-specific scripts and services.

While the current interface is kept minimal, it could be extended using
IOCTLs, sysfs or uevents as needs and proper abstraction levels are
identified and determined (e.g. for device and feature identification).

Another possible extension is to add generic 1PPS support.

I decided to go with a custom character-device interface rather than
pretend that these abstract GNSS devices are still TTY devices (e.g.
/dev/ttyGNSS0). Obviously, modifying line settings or reading modem
control signals does not make any sense for a device using, say, a
USB (not USB-serial) or iomem interface. This also means, however, that
user space would no longer be able to set the line speed to match a new
port configuration that can be set using the various GNSS protocols when
the underlying interface is indeed a UART; instead this would need to be
taken care of by the driver.

Also note that writes are always synchronous instead of requiring user
space to call tcdrain() after every command.

This all seems to work well-enough (e.g. with gpsd), but please let me
know if I've overlooked something which would indeed require a TTY
interface instead.

As proof-of-concept I have implemented drivers for receivers based on
two common GNSS chipsets (SiRFstar and u-blox), but due to lack of
hardware these have so far only been tested using mockup devices and a
USB-serial-based GPS device (using out-of-tree code). [ Let me know if
you've got any evalutation kits to spare. ]

Finally, note that documentation (including kerneldoc) remains to be
written, but hopefully this will not hinder review given that the
current interfaces are fairly self-describing.

Johan


Johan Hovold (7):
  gnss: add GNSS receiver subsystem
  dt-bindings: add generic gnss binding
  gnss: add generic serial driver
  dt-bindings: gnss: add u-blox binding
  gnss: add driver for u-blox receivers
  dt-bindings: gnss: add sirfstar binding
  gnss: add driver for sirfstar-based receivers

 .../devicetree/bindings/gnss/gnss.txt |  36 ++
 .../devicetree/bindings/gnss/sirfstar.txt |  38 ++
 .../devicetree/bindings/gnss/u-blox.txt   |  31 ++
 .../devicetree/bindings/vendor-prefixes.txt   |   4 +
 MAINTAINERS   |   7 +
 drivers/Kconfig   |   2 +
 drivers/Makefile  |   1 +
 drivers/gnss/Kconfig  |  43 ++
 drivers/gnss/Makefile |  16 +
 drivers/gnss/core.c   | 385 
 drivers/gnss/serial.c | 288 
 drivers/gnss/serial.h |  47 ++
 drivers/gnss/sirf.c   | 415 ++
 drivers/gnss/ubx.c| 133 ++
 include/linux/gnss.h  |  64 +++
 15 files changed, 1510 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/gnss/gnss.txt
 create mode 100644 Documentation/devicetree/bindings/gnss/sirfstar.txt
 create mode 100644 Documentation/devicetree/bindings/gnss/u-blox.txt
 create mode 100644 drivers/gnss/Kconfig
 create mode 100644 drivers/gnss/Makefile
 create mode 100644 drivers/gnss/core.c
 create mode 100644 drivers/gnss/serial.c
 create mode 100644 drivers/gnss/serial.h
 create mode 100644 drivers/gnss/sirf.c
 create mode 100644 drivers/gnss/ubx.c
 create mode 100644 include/linux/gnss.h

-- 
2.17.0



Re: [tip:x86/urgent] x86/io: Define readq()/writeq() to use 64-bit type

2018-04-24 Thread Andy Shevchenko
On Sat, 2018-03-31 at 20:45 +0200, Ingo Molnar wrote:
> * Andy Shevchenko  wrote:
> 
> > On Sat, Mar 31, 2018 at 3:06 PM, Andy Shevchenko
> >  wrote:
> > > On Sat, Mar 31, 2018 at 1:22 PM, Ingo Molnar 
> > > wrote:
> > > > * Ingo Molnar  wrote:
> > > >   [tip:x86/urgent 14/14]
> > > > drivers/infiniband/hw/hns/hns_roce_hw_v1.c:1690:22: sparse:
> > > > incorrect type in argument 1 (different base types)
> > > > 
> > > > Since you were on Cc: of that report I assumed you'd take care
> > > > of it.
> > > 
> > > Ah, yes. This one I fixed.
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit
> > /?h=for-next=71591d1280e5ef02c2af2ffb9801d0c842973be9
> 
> Yeah, so this fix in the RDMA tree needs to go upstream first, before
> we can apply 
> the changes to the interface. Could you resend at around v4.17-rc1 or
> so?

Btw, should I recend or you can still pick it up from linux-next or
mailing list?

I can resend tomorrow if needed.

-- 
Andy Shevchenko 
Intel Finland Oy


Re: [PATCH v3 02/12] KVM: arm/arm64: Document KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION

2018-04-24 Thread Peter Maydell
On 24 April 2018 at 17:46, Christoffer Dall  wrote:
> On Fri, Apr 13, 2018 at 10:20:48AM +0200, Eric Auger wrote:
>> --- a/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
>> +++ b/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
>> @@ -27,9 +27,32 @@ Groups:
>>VCPU and all of the redistributor pages are contiguous.
>>Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
>>This address needs to be 64K aligned.
>> +
>> +KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION (rw, 64-bit)
>> +  The attr field of kvm_device_attr encodes 3 values:
>> +  bits: | 63     52  |  51      16 | 15 - 12  |11 - 0
>> +  values:   | count  |   base  |  flags   | index
>> +  - index encodes the unique redistributor region index
>
> I'm not entirely sure I understand the purpose of the index field.
> Isn't a redistributor region identified uniquely by its base address?

You need a way to tell the difference beween:
 (1) redistributors for CPUs 0..63 at 0x4000, redistributors
 for 64..127 at 0x8000
 (2) redistributors for CPUs 0..63 at 0x8000, redistributors
 for 64..127 at 0x4000

The index field tells you which order the redistributor
regions go in.

thanks
-- PMM


Re: [tip:x86/urgent] x86/io: Define readq()/writeq() to use 64-bit type

2018-04-24 Thread Andy Shevchenko
On Sat, 2018-03-31 at 20:45 +0200, Ingo Molnar wrote:
> * Andy Shevchenko  wrote:
> 
> > On Sat, Mar 31, 2018 at 3:06 PM, Andy Shevchenko
> >  wrote:
> > > On Sat, Mar 31, 2018 at 1:22 PM, Ingo Molnar 
> > > wrote:
> > > > * Ingo Molnar  wrote:
> > > >   [tip:x86/urgent 14/14]
> > > > drivers/infiniband/hw/hns/hns_roce_hw_v1.c:1690:22: sparse:
> > > > incorrect type in argument 1 (different base types)
> > > > 
> > > > Since you were on Cc: of that report I assumed you'd take care
> > > > of it.
> > > 
> > > Ah, yes. This one I fixed.
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit
> > /?h=for-next=71591d1280e5ef02c2af2ffb9801d0c842973be9
> 
> Yeah, so this fix in the RDMA tree needs to go upstream first, before
> we can apply 
> the changes to the interface. Could you resend at around v4.17-rc1 or
> so?

Btw, should I recend or you can still pick it up from linux-next or
mailing list?

I can resend tomorrow if needed.

-- 
Andy Shevchenko 
Intel Finland Oy


Re: [PATCH v3 02/12] KVM: arm/arm64: Document KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION

2018-04-24 Thread Peter Maydell
On 24 April 2018 at 17:46, Christoffer Dall  wrote:
> On Fri, Apr 13, 2018 at 10:20:48AM +0200, Eric Auger wrote:
>> --- a/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
>> +++ b/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
>> @@ -27,9 +27,32 @@ Groups:
>>VCPU and all of the redistributor pages are contiguous.
>>Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
>>This address needs to be 64K aligned.
>> +
>> +KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION (rw, 64-bit)
>> +  The attr field of kvm_device_attr encodes 3 values:
>> +  bits: | 63     52  |  51      16 | 15 - 12  |11 - 0
>> +  values:   | count  |   base  |  flags   | index
>> +  - index encodes the unique redistributor region index
>
> I'm not entirely sure I understand the purpose of the index field.
> Isn't a redistributor region identified uniquely by its base address?

You need a way to tell the difference beween:
 (1) redistributors for CPUs 0..63 at 0x4000, redistributors
 for 64..127 at 0x8000
 (2) redistributors for CPUs 0..63 at 0x8000, redistributors
 for 64..127 at 0x4000

The index field tells you which order the redistributor
regions go in.

thanks
-- PMM


[PATCH 6/7] dt-bindings: gnss: add sirfstar binding

2018-04-24 Thread Johan Hovold
Add binding for SiRFstar-based GNSS receivers.

Note that while four compatible-strings are initially added representing
devices which differ in which I/O interfaces they support, they
otherwise essentially share the same feature set.

Pin and supply names (and some recommended timings) vary slightly, but
the binding recommends using a common set of names.

Note that the wakeup gpio is not intended to be as a wakeup source, but
rather to detect the current power state of the device (active or
hibernate).

Signed-off-by: Johan Hovold 
---
 .../devicetree/bindings/gnss/sirfstar.txt | 38 +++
 .../devicetree/bindings/vendor-prefixes.txt   |  3 ++
 2 files changed, 41 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/gnss/sirfstar.txt

diff --git a/Documentation/devicetree/bindings/gnss/sirfstar.txt 
b/Documentation/devicetree/bindings/gnss/sirfstar.txt
new file mode 100644
index ..5e6a02aec49a
--- /dev/null
+++ b/Documentation/devicetree/bindings/gnss/sirfstar.txt
@@ -0,0 +1,38 @@
+SiRFstar-based GNSS Receiver DT binding
+
+SiRFstar chipsets are used in GNSS-receiver modules produced by several
+vendors and can use UART, SPI or I2C interfaces.
+
+Please see Documentation/devicetree/bindings/gnss/gnss.txt for generic
+properties.
+
+Required Properties:
+
+- compatible   : Must be one of
+
+   "fastrax,uc430"
+   "linx,r4"
+   "wi2wi,w2sg0008i"
+   "wi2wi,w2sg0084i"
+
+- vcc-supply   : Main voltage regulator (3V3_IN, VDD, VCC)
+
+Optional Properties:
+
+- enable-gpios : GPIO used to power on and off device (ON_OFF)
+- wakeup-gpios : GPIO used to determine device power state (WAKEUP, RFPWRUP)
+- timepulse-gpios  : Timepulse (e.g 1PPS) GPIO (1PPS, TM)
+
+Example:
+
+serial@1234 {
+   compatible = "ns16550a";
+
+   gnss {
+   compatible = "wi2wi,w2sg0084i";
+
+   vcc-supply = <_reg>;
+   enable-gpios = < 16 GPIO_ACTIVE_HIGH>;
+   wakeup-gpios = < 17 GPIO_ACTIVE_HIGH>;
+   };
+};
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt 
b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 2128dfdf73f1..ddd81c82082d 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -120,6 +120,7 @@ excito  Excito
 ezchip EZchip Semiconductor
 fairphone  Fairphone B.V.
 faradayFaraday Technology Corporation
+fastraxFastrax Oy
 fcsFairchild Semiconductor
 fireflyFirefly
 focaltech  FocalTech Systems Co.,Ltd
@@ -197,6 +198,7 @@ licheepiLichee Pi
 linaro Linaro Limited
 linksysBelkin International, Inc. (Linksys)
 linux  Linux-specific binding
+linx   Linx Technologies
 lltc   Linear Technology Corporation
 lsiLSI Corp. (LSI Logic)
 lwnLiebherr-Werk Nenzing GmbH
@@ -390,6 +392,7 @@ vivante Vivante Corporation
 vocore VoCore Studio
 voipac Voipac Technologies s.r.o.
 votVision Optical Technology Co., Ltd.
+wi2wi  Wi2Wi
 wd Western Digital Corp.
 wetek  WeTek Electronics, limited.
 wexler Wexler
-- 
2.17.0



[PATCH 6/7] dt-bindings: gnss: add sirfstar binding

2018-04-24 Thread Johan Hovold
Add binding for SiRFstar-based GNSS receivers.

Note that while four compatible-strings are initially added representing
devices which differ in which I/O interfaces they support, they
otherwise essentially share the same feature set.

Pin and supply names (and some recommended timings) vary slightly, but
the binding recommends using a common set of names.

Note that the wakeup gpio is not intended to be as a wakeup source, but
rather to detect the current power state of the device (active or
hibernate).

Signed-off-by: Johan Hovold 
---
 .../devicetree/bindings/gnss/sirfstar.txt | 38 +++
 .../devicetree/bindings/vendor-prefixes.txt   |  3 ++
 2 files changed, 41 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/gnss/sirfstar.txt

diff --git a/Documentation/devicetree/bindings/gnss/sirfstar.txt 
b/Documentation/devicetree/bindings/gnss/sirfstar.txt
new file mode 100644
index ..5e6a02aec49a
--- /dev/null
+++ b/Documentation/devicetree/bindings/gnss/sirfstar.txt
@@ -0,0 +1,38 @@
+SiRFstar-based GNSS Receiver DT binding
+
+SiRFstar chipsets are used in GNSS-receiver modules produced by several
+vendors and can use UART, SPI or I2C interfaces.
+
+Please see Documentation/devicetree/bindings/gnss/gnss.txt for generic
+properties.
+
+Required Properties:
+
+- compatible   : Must be one of
+
+   "fastrax,uc430"
+   "linx,r4"
+   "wi2wi,w2sg0008i"
+   "wi2wi,w2sg0084i"
+
+- vcc-supply   : Main voltage regulator (3V3_IN, VDD, VCC)
+
+Optional Properties:
+
+- enable-gpios : GPIO used to power on and off device (ON_OFF)
+- wakeup-gpios : GPIO used to determine device power state (WAKEUP, RFPWRUP)
+- timepulse-gpios  : Timepulse (e.g 1PPS) GPIO (1PPS, TM)
+
+Example:
+
+serial@1234 {
+   compatible = "ns16550a";
+
+   gnss {
+   compatible = "wi2wi,w2sg0084i";
+
+   vcc-supply = <_reg>;
+   enable-gpios = < 16 GPIO_ACTIVE_HIGH>;
+   wakeup-gpios = < 17 GPIO_ACTIVE_HIGH>;
+   };
+};
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt 
b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 2128dfdf73f1..ddd81c82082d 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -120,6 +120,7 @@ excito  Excito
 ezchip EZchip Semiconductor
 fairphone  Fairphone B.V.
 faradayFaraday Technology Corporation
+fastraxFastrax Oy
 fcsFairchild Semiconductor
 fireflyFirefly
 focaltech  FocalTech Systems Co.,Ltd
@@ -197,6 +198,7 @@ licheepiLichee Pi
 linaro Linaro Limited
 linksysBelkin International, Inc. (Linksys)
 linux  Linux-specific binding
+linx   Linx Technologies
 lltc   Linear Technology Corporation
 lsiLSI Corp. (LSI Logic)
 lwnLiebherr-Werk Nenzing GmbH
@@ -390,6 +392,7 @@ vivante Vivante Corporation
 vocore VoCore Studio
 voipac Voipac Technologies s.r.o.
 votVision Optical Technology Co., Ltd.
+wi2wi  Wi2Wi
 wd Western Digital Corp.
 wetek  WeTek Electronics, limited.
 wexler Wexler
-- 
2.17.0



[PATCH 7/7] gnss: add driver for sirfstar-based receivers

2018-04-24 Thread Johan Hovold
Add driver for serial-connected SiRFstar-based GNSS receivers.

These devices typically boot into hibernate mode from which they can be
woken using a pulse on the ON_OFF (enable) input pin. Once active, a
pulse on the same ON_OFF pin is used to put the device back into
hibernate mode. The current state can be determined by sampling the
WAKEUP output.

Hardware configurations where WAKEUP has been connected to ON_OFF (and
where an initial WAKEUP pulse during boot is sufficient to have the
device boot into active mode) are also supported. In this case, device
power is managed using the main-supply regulator only.

Note that configurations where WAKEUP is left not connected, so that the
device power state can only indirectly be determined using the I/O
interface, is currently not supported. It should be fairly
straight-forward to extend the current implementation with such support
however (and this this is the main reason for not using the generic
serial implementation for this driver).

Note that timepulse-support is left unimplemented.

Signed-off-by: Johan Hovold 
---
 drivers/gnss/Kconfig  |  12 ++
 drivers/gnss/Makefile |   3 +
 drivers/gnss/sirf.c   | 415 ++
 3 files changed, 430 insertions(+)
 create mode 100644 drivers/gnss/sirf.c

diff --git a/drivers/gnss/Kconfig b/drivers/gnss/Kconfig
index 784b8c0367d9..6abc88514512 100644
--- a/drivers/gnss/Kconfig
+++ b/drivers/gnss/Kconfig
@@ -15,6 +15,18 @@ if GNSS
 config GNSS_SERIAL
tristate
 
+config GNSS_SIRF_SERIAL
+   tristate "SiRFstar GNSS receiver support"
+   depends on SERIAL_DEV_BUS
+   ---help---
+ Say Y here if you have a SiRFstar-based GNSS receiver which uses a
+ serial interface.
+
+ To compile this driver as a module, choose M here: the module will
+ be called gnss-sirf.
+
+ If unsure, say N.
+
 config GNSS_UBX_SERIAL
tristate "u-blox GNSS receiver support"
depends on SERIAL_DEV_BUS
diff --git a/drivers/gnss/Makefile b/drivers/gnss/Makefile
index d9295b20b7bc..5cf0ebe0330a 100644
--- a/drivers/gnss/Makefile
+++ b/drivers/gnss/Makefile
@@ -9,5 +9,8 @@ gnss-y := core.o
 obj-$(CONFIG_GNSS_SERIAL)  += gnss-serial.o
 gnss-serial-y := serial.o
 
+obj-$(CONFIG_GNSS_SIRF_SERIAL) += gnss-sirf.o
+gnss-sirf-y := sirf.o
+
 obj-$(CONFIG_GNSS_UBX_SERIAL)  += gnss-ubx.o
 gnss-ubx-y := ubx.o
diff --git a/drivers/gnss/sirf.c b/drivers/gnss/sirf.c
new file mode 100644
index ..9d46da1530ca
--- /dev/null
+++ b/drivers/gnss/sirf.c
@@ -0,0 +1,415 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SiRFstar GNSS receiver driver
+ *
+ * Copyright (C) 2018 Johan Hovold 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define SIRF_BOOT_DELAY500
+#define SIRF_ON_OFF_PULSE_TIME 100
+#define SIRF_ACTIVATE_TIMEOUT  200
+#define SIRF_HIBERNATE_TIMEOUT 200
+
+struct sirf_data {
+   struct gnss_device *gdev;
+   struct serdev_device *serdev;
+   speed_t speed;
+   struct regulator *vcc;
+   struct gpio_desc *on_off;
+   struct gpio_desc *wakeup;
+   int irq;
+   bool active;
+   wait_queue_head_t power_wait;
+};
+
+static int sirf_open(struct gnss_device *gdev)
+{
+   struct sirf_data *data = gnss_get_drvdata(gdev);
+   struct serdev_device *serdev = data->serdev;
+   int ret;
+
+   ret = serdev_device_open(serdev);
+   if (ret)
+   return ret;
+
+   serdev_device_set_baudrate(serdev, data->speed);
+   serdev_device_set_flow_control(serdev, false);
+
+   ret = pm_runtime_get_sync(>dev);
+   if (ret < 0) {
+   dev_err(>dev, "failed to runtime resume: %d\n", ret);
+   pm_runtime_put_noidle(>dev);
+   goto err_close;
+   }
+
+   return 0;
+
+err_close:
+   serdev_device_close(serdev);
+
+   return ret;
+}
+
+static void sirf_close(struct gnss_device *gdev)
+{
+   struct sirf_data *data = gnss_get_drvdata(gdev);
+   struct serdev_device *serdev = data->serdev;
+
+   serdev_device_close(serdev);
+
+   pm_runtime_put(>dev);
+}
+
+static int sirf_write_raw(struct gnss_device *gdev,
+   const unsigned char *buf, size_t count)
+{
+   struct sirf_data *data = gnss_get_drvdata(gdev);
+   struct serdev_device *serdev = data->serdev;
+   int ret;
+
+   /* write is only buffered synchronously */
+   ret = serdev_device_write(serdev, buf, count, 0);
+   if (ret < 0)
+   return ret;
+
+   /* FIXME: determine if interrupted? */
+   serdev_device_wait_until_sent(serdev, 0);
+
+   return count;
+}
+
+static const struct gnss_operations sirf_gnss_ops = {
+   .open   = sirf_open,
+   .close  = sirf_close,
+   

[PATCH 7/7] gnss: add driver for sirfstar-based receivers

2018-04-24 Thread Johan Hovold
Add driver for serial-connected SiRFstar-based GNSS receivers.

These devices typically boot into hibernate mode from which they can be
woken using a pulse on the ON_OFF (enable) input pin. Once active, a
pulse on the same ON_OFF pin is used to put the device back into
hibernate mode. The current state can be determined by sampling the
WAKEUP output.

Hardware configurations where WAKEUP has been connected to ON_OFF (and
where an initial WAKEUP pulse during boot is sufficient to have the
device boot into active mode) are also supported. In this case, device
power is managed using the main-supply regulator only.

Note that configurations where WAKEUP is left not connected, so that the
device power state can only indirectly be determined using the I/O
interface, is currently not supported. It should be fairly
straight-forward to extend the current implementation with such support
however (and this this is the main reason for not using the generic
serial implementation for this driver).

Note that timepulse-support is left unimplemented.

Signed-off-by: Johan Hovold 
---
 drivers/gnss/Kconfig  |  12 ++
 drivers/gnss/Makefile |   3 +
 drivers/gnss/sirf.c   | 415 ++
 3 files changed, 430 insertions(+)
 create mode 100644 drivers/gnss/sirf.c

diff --git a/drivers/gnss/Kconfig b/drivers/gnss/Kconfig
index 784b8c0367d9..6abc88514512 100644
--- a/drivers/gnss/Kconfig
+++ b/drivers/gnss/Kconfig
@@ -15,6 +15,18 @@ if GNSS
 config GNSS_SERIAL
tristate
 
+config GNSS_SIRF_SERIAL
+   tristate "SiRFstar GNSS receiver support"
+   depends on SERIAL_DEV_BUS
+   ---help---
+ Say Y here if you have a SiRFstar-based GNSS receiver which uses a
+ serial interface.
+
+ To compile this driver as a module, choose M here: the module will
+ be called gnss-sirf.
+
+ If unsure, say N.
+
 config GNSS_UBX_SERIAL
tristate "u-blox GNSS receiver support"
depends on SERIAL_DEV_BUS
diff --git a/drivers/gnss/Makefile b/drivers/gnss/Makefile
index d9295b20b7bc..5cf0ebe0330a 100644
--- a/drivers/gnss/Makefile
+++ b/drivers/gnss/Makefile
@@ -9,5 +9,8 @@ gnss-y := core.o
 obj-$(CONFIG_GNSS_SERIAL)  += gnss-serial.o
 gnss-serial-y := serial.o
 
+obj-$(CONFIG_GNSS_SIRF_SERIAL) += gnss-sirf.o
+gnss-sirf-y := sirf.o
+
 obj-$(CONFIG_GNSS_UBX_SERIAL)  += gnss-ubx.o
 gnss-ubx-y := ubx.o
diff --git a/drivers/gnss/sirf.c b/drivers/gnss/sirf.c
new file mode 100644
index ..9d46da1530ca
--- /dev/null
+++ b/drivers/gnss/sirf.c
@@ -0,0 +1,415 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SiRFstar GNSS receiver driver
+ *
+ * Copyright (C) 2018 Johan Hovold 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define SIRF_BOOT_DELAY500
+#define SIRF_ON_OFF_PULSE_TIME 100
+#define SIRF_ACTIVATE_TIMEOUT  200
+#define SIRF_HIBERNATE_TIMEOUT 200
+
+struct sirf_data {
+   struct gnss_device *gdev;
+   struct serdev_device *serdev;
+   speed_t speed;
+   struct regulator *vcc;
+   struct gpio_desc *on_off;
+   struct gpio_desc *wakeup;
+   int irq;
+   bool active;
+   wait_queue_head_t power_wait;
+};
+
+static int sirf_open(struct gnss_device *gdev)
+{
+   struct sirf_data *data = gnss_get_drvdata(gdev);
+   struct serdev_device *serdev = data->serdev;
+   int ret;
+
+   ret = serdev_device_open(serdev);
+   if (ret)
+   return ret;
+
+   serdev_device_set_baudrate(serdev, data->speed);
+   serdev_device_set_flow_control(serdev, false);
+
+   ret = pm_runtime_get_sync(>dev);
+   if (ret < 0) {
+   dev_err(>dev, "failed to runtime resume: %d\n", ret);
+   pm_runtime_put_noidle(>dev);
+   goto err_close;
+   }
+
+   return 0;
+
+err_close:
+   serdev_device_close(serdev);
+
+   return ret;
+}
+
+static void sirf_close(struct gnss_device *gdev)
+{
+   struct sirf_data *data = gnss_get_drvdata(gdev);
+   struct serdev_device *serdev = data->serdev;
+
+   serdev_device_close(serdev);
+
+   pm_runtime_put(>dev);
+}
+
+static int sirf_write_raw(struct gnss_device *gdev,
+   const unsigned char *buf, size_t count)
+{
+   struct sirf_data *data = gnss_get_drvdata(gdev);
+   struct serdev_device *serdev = data->serdev;
+   int ret;
+
+   /* write is only buffered synchronously */
+   ret = serdev_device_write(serdev, buf, count, 0);
+   if (ret < 0)
+   return ret;
+
+   /* FIXME: determine if interrupted? */
+   serdev_device_wait_until_sent(serdev, 0);
+
+   return count;
+}
+
+static const struct gnss_operations sirf_gnss_ops = {
+   .open   = sirf_open,
+   .close  = sirf_close,
+   .write_raw  = sirf_write_raw,
+};
+

Re: [PATCH 1/3] signals: Allow generation of SIGKILL to exiting task.

2018-04-24 Thread Andrey Grodzovsky



On 04/24/2018 12:42 PM, Eric W. Biederman wrote:

Andrey Grodzovsky  writes:


Currently calling wait_event_killable as part of exiting process
will stall forever since SIGKILL generation is suppresed by PF_EXITING.

In our partilaur case AMDGPU driver wants to flush all GPU jobs in
flight before shutting down. But if some job hangs the pipe we still want to
be able to kill it and avoid a process in D state.

I should clarify.  This absolutely can not be done.
PF_EXITING is set just before a task starts tearing down it's signal
handling.

So delivering any signal, or otherwise depending on signal handling
after PF_EXITING is set can not be done.  That abstraction is gone.


I see, so you suggest it's the driver responsibility to avoid creating 
such code path that ends up

calling wait_event_killable from exit call stack (PF_EXITING == 1) ?

Andrey



Eric


Signed-off-by: Andrey Grodzovsky 
---
  kernel/signal.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index c6e4c83..c49c706 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -886,10 +886,10 @@ static inline int wants_signal(int sig, struct 
task_struct *p)
  {
if (sigismember(>blocked, sig))
return 0;
-   if (p->flags & PF_EXITING)
-   return 0;
if (sig == SIGKILL)
return 1;
+   if (p->flags & PF_EXITING)
+   return 0;
if (task_is_stopped_or_traced(p))
return 0;
return task_curr(p) || !signal_pending(p);




Re: [PATCH 1/3] signals: Allow generation of SIGKILL to exiting task.

2018-04-24 Thread Andrey Grodzovsky



On 04/24/2018 12:42 PM, Eric W. Biederman wrote:

Andrey Grodzovsky  writes:


Currently calling wait_event_killable as part of exiting process
will stall forever since SIGKILL generation is suppresed by PF_EXITING.

In our partilaur case AMDGPU driver wants to flush all GPU jobs in
flight before shutting down. But if some job hangs the pipe we still want to
be able to kill it and avoid a process in D state.

I should clarify.  This absolutely can not be done.
PF_EXITING is set just before a task starts tearing down it's signal
handling.

So delivering any signal, or otherwise depending on signal handling
after PF_EXITING is set can not be done.  That abstraction is gone.


I see, so you suggest it's the driver responsibility to avoid creating 
such code path that ends up

calling wait_event_killable from exit call stack (PF_EXITING == 1) ?

Andrey



Eric


Signed-off-by: Andrey Grodzovsky 
---
  kernel/signal.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index c6e4c83..c49c706 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -886,10 +886,10 @@ static inline int wants_signal(int sig, struct 
task_struct *p)
  {
if (sigismember(>blocked, sig))
return 0;
-   if (p->flags & PF_EXITING)
-   return 0;
if (sig == SIGKILL)
return 1;
+   if (p->flags & PF_EXITING)
+   return 0;
if (task_is_stopped_or_traced(p))
return 0;
return task_curr(p) || !signal_pending(p);




[PATCH 1/7] gnss: add GNSS receiver subsystem

2018-04-24 Thread Johan Hovold
Add a new subsystem for GNSS (e.g. GPS) receivers.

While GNSS receivers are typically accessed using a UART interface they
often also support other I/O interfaces such as I2C, SPI and USB, while
yet other devices use iomem or even some form of remote-processor
messaging (rpmsg).

The new GNSS subsystem abstracts the underlying interface and provides a
new "gnss" class type, which exposes a character-device interface (e.g.
/dev/gnss0) to user space. This allows GNSS receivers to have a
representation in the Linux device model, something which is important
not least for power management purposes.

Note that the character-device interface provides raw access to whatever
protocol the receiver is (currently) using, such as NMEA 0183, UBX or
SiRF Binary. These protocols are expected to be continued to be handled
by user space for the time being, even if some hybrid solutions are also
conceivable (e.g. to have kernel drivers issue management commands).

This will still allow for better platform integration by allowing GNSS
devices and their resources (e.g. regulators and enable-gpios) to be
described by firmware and managed by kernel drivers rather than
platform-specific scripts and services.

While the current interface is kept minimal, it could be extended using
IOCTLs, sysfs or uevents as needs and proper abstraction levels are
identified and determined (e.g. for device and feature identification).

Signed-off-by: Johan Hovold 
---
 MAINTAINERS   |   6 +
 drivers/Kconfig   |   2 +
 drivers/Makefile  |   1 +
 drivers/gnss/Kconfig  |  11 ++
 drivers/gnss/Makefile |   7 +
 drivers/gnss/core.c   | 385 ++
 include/linux/gnss.h  |  64 +++
 7 files changed, 476 insertions(+)
 create mode 100644 drivers/gnss/Kconfig
 create mode 100644 drivers/gnss/Makefile
 create mode 100644 drivers/gnss/core.c
 create mode 100644 include/linux/gnss.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 0a1410d5a621..dc3df211c1a7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5962,6 +5962,12 @@ F:   Documentation/isdn/README.gigaset
 F: drivers/isdn/gigaset/
 F: include/uapi/linux/gigaset_dev.h
 
+GNSS SUBSYSTEM
+M: Johan Hovold 
+S: Maintained
+F: drivers/gnss/
+F: include/linux/gnss.h
+
 GO7007 MPEG CODEC
 M: Hans Verkuil 
 L: linux-me...@vger.kernel.org
diff --git a/drivers/Kconfig b/drivers/Kconfig
index 95b9ccc08165..ab4d43923c4d 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -9,6 +9,8 @@ source "drivers/bus/Kconfig"
 
 source "drivers/connector/Kconfig"
 
+source "drivers/gnss/Kconfig"
+
 source "drivers/mtd/Kconfig"
 
 source "drivers/of/Kconfig"
diff --git a/drivers/Makefile b/drivers/Makefile
index 24cd47014657..cc9a7c5f7d2c 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -185,3 +185,4 @@ obj-$(CONFIG_TEE)   += tee/
 obj-$(CONFIG_MULTIPLEXER)  += mux/
 obj-$(CONFIG_UNISYS_VISORBUS)  += visorbus/
 obj-$(CONFIG_SIOX) += siox/
+obj-$(CONFIG_GNSS) += gnss/
diff --git a/drivers/gnss/Kconfig b/drivers/gnss/Kconfig
new file mode 100644
index ..103fcc70992e
--- /dev/null
+++ b/drivers/gnss/Kconfig
@@ -0,0 +1,11 @@
+#
+# GNSS receiver configuration
+#
+
+menuconfig GNSS
+   tristate "GNSS receiver support"
+   ---help---
+ Say Y here if you have a GNSS receiver (e.g. a GPS receiver).
+
+ To compile this driver as a module, choose M here: the module will
+ be called gnss.
diff --git a/drivers/gnss/Makefile b/drivers/gnss/Makefile
new file mode 100644
index ..1f7a7baab1d9
--- /dev/null
+++ b/drivers/gnss/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the GNSS subsystem.
+#
+
+obj-$(CONFIG_GNSS) += gnss.o
+gnss-y := core.o
diff --git a/drivers/gnss/core.c b/drivers/gnss/core.c
new file mode 100644
index ..9b88c7e72233
--- /dev/null
+++ b/drivers/gnss/core.c
@@ -0,0 +1,385 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * GNSS receiver core
+ *
+ * Copyright (C) 2018 Johan Hovold 
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define GNSS_MINORS16
+
+static DEFINE_IDA(gnss_minors);
+static dev_t gnss_first;
+
+/* FIFO size must be a power of two */
+#define GNSS_READ_FIFO_SIZE4096
+#define GNSS_WRITE_BUF_SIZE1024
+
+#define to_gnss_device(d) container_of((d), struct gnss_device, dev)
+
+static int gnss_open(struct inode *inode, struct file *file)
+{
+   struct gnss_device *gdev;
+   int ret = 0;
+
+   gdev = container_of(inode->i_cdev, struct gnss_device, cdev);
+
+   get_device(>dev);
+
+   nonseekable_open(inode, file);
+   file->private_data = gdev;
+
+   down_write(>rwsem);
+   if (gdev->disconnected) {
+ 

[PATCH 1/7] gnss: add GNSS receiver subsystem

2018-04-24 Thread Johan Hovold
Add a new subsystem for GNSS (e.g. GPS) receivers.

While GNSS receivers are typically accessed using a UART interface they
often also support other I/O interfaces such as I2C, SPI and USB, while
yet other devices use iomem or even some form of remote-processor
messaging (rpmsg).

The new GNSS subsystem abstracts the underlying interface and provides a
new "gnss" class type, which exposes a character-device interface (e.g.
/dev/gnss0) to user space. This allows GNSS receivers to have a
representation in the Linux device model, something which is important
not least for power management purposes.

Note that the character-device interface provides raw access to whatever
protocol the receiver is (currently) using, such as NMEA 0183, UBX or
SiRF Binary. These protocols are expected to be continued to be handled
by user space for the time being, even if some hybrid solutions are also
conceivable (e.g. to have kernel drivers issue management commands).

This will still allow for better platform integration by allowing GNSS
devices and their resources (e.g. regulators and enable-gpios) to be
described by firmware and managed by kernel drivers rather than
platform-specific scripts and services.

While the current interface is kept minimal, it could be extended using
IOCTLs, sysfs or uevents as needs and proper abstraction levels are
identified and determined (e.g. for device and feature identification).

Signed-off-by: Johan Hovold 
---
 MAINTAINERS   |   6 +
 drivers/Kconfig   |   2 +
 drivers/Makefile  |   1 +
 drivers/gnss/Kconfig  |  11 ++
 drivers/gnss/Makefile |   7 +
 drivers/gnss/core.c   | 385 ++
 include/linux/gnss.h  |  64 +++
 7 files changed, 476 insertions(+)
 create mode 100644 drivers/gnss/Kconfig
 create mode 100644 drivers/gnss/Makefile
 create mode 100644 drivers/gnss/core.c
 create mode 100644 include/linux/gnss.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 0a1410d5a621..dc3df211c1a7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5962,6 +5962,12 @@ F:   Documentation/isdn/README.gigaset
 F: drivers/isdn/gigaset/
 F: include/uapi/linux/gigaset_dev.h
 
+GNSS SUBSYSTEM
+M: Johan Hovold 
+S: Maintained
+F: drivers/gnss/
+F: include/linux/gnss.h
+
 GO7007 MPEG CODEC
 M: Hans Verkuil 
 L: linux-me...@vger.kernel.org
diff --git a/drivers/Kconfig b/drivers/Kconfig
index 95b9ccc08165..ab4d43923c4d 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -9,6 +9,8 @@ source "drivers/bus/Kconfig"
 
 source "drivers/connector/Kconfig"
 
+source "drivers/gnss/Kconfig"
+
 source "drivers/mtd/Kconfig"
 
 source "drivers/of/Kconfig"
diff --git a/drivers/Makefile b/drivers/Makefile
index 24cd47014657..cc9a7c5f7d2c 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -185,3 +185,4 @@ obj-$(CONFIG_TEE)   += tee/
 obj-$(CONFIG_MULTIPLEXER)  += mux/
 obj-$(CONFIG_UNISYS_VISORBUS)  += visorbus/
 obj-$(CONFIG_SIOX) += siox/
+obj-$(CONFIG_GNSS) += gnss/
diff --git a/drivers/gnss/Kconfig b/drivers/gnss/Kconfig
new file mode 100644
index ..103fcc70992e
--- /dev/null
+++ b/drivers/gnss/Kconfig
@@ -0,0 +1,11 @@
+#
+# GNSS receiver configuration
+#
+
+menuconfig GNSS
+   tristate "GNSS receiver support"
+   ---help---
+ Say Y here if you have a GNSS receiver (e.g. a GPS receiver).
+
+ To compile this driver as a module, choose M here: the module will
+ be called gnss.
diff --git a/drivers/gnss/Makefile b/drivers/gnss/Makefile
new file mode 100644
index ..1f7a7baab1d9
--- /dev/null
+++ b/drivers/gnss/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the GNSS subsystem.
+#
+
+obj-$(CONFIG_GNSS) += gnss.o
+gnss-y := core.o
diff --git a/drivers/gnss/core.c b/drivers/gnss/core.c
new file mode 100644
index ..9b88c7e72233
--- /dev/null
+++ b/drivers/gnss/core.c
@@ -0,0 +1,385 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * GNSS receiver core
+ *
+ * Copyright (C) 2018 Johan Hovold 
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define GNSS_MINORS16
+
+static DEFINE_IDA(gnss_minors);
+static dev_t gnss_first;
+
+/* FIFO size must be a power of two */
+#define GNSS_READ_FIFO_SIZE4096
+#define GNSS_WRITE_BUF_SIZE1024
+
+#define to_gnss_device(d) container_of((d), struct gnss_device, dev)
+
+static int gnss_open(struct inode *inode, struct file *file)
+{
+   struct gnss_device *gdev;
+   int ret = 0;
+
+   gdev = container_of(inode->i_cdev, struct gnss_device, cdev);
+
+   get_device(>dev);
+
+   nonseekable_open(inode, file);
+   file->private_data = gdev;
+
+   down_write(>rwsem);
+   if (gdev->disconnected) {
+   ret = -ENODEV;
+   goto unlock;
+   }
+
+   if 

Re: [PATCH v5 21/23] ASoC: qdsp6: dt-bindings: Add apq8096 machine bindings

2018-04-24 Thread Srinivas Kandagatla

Thanks for the review.

On 24/04/18 17:25, Rob Herring wrote:

On Wed, Apr 18, 2018 at 04:31:55PM +0100, srinivas.kandaga...@linaro.org wrote:

From: Srinivas Kandagatla 

Add devicetree bindings documentation file for Qualcomm apq8096 sound card.

Signed-off-by: Srinivas Kandagatla 
---
  .../devicetree/bindings/sound/qcom,apq8096.txt | 76 ++
  1 file changed, 76 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/sound/qcom,apq8096.txt

diff --git a/Documentation/devicetree/bindings/sound/qcom,apq8096.txt 
b/Documentation/devicetree/bindings/sound/qcom,apq8096.txt
new file mode 100644
index ..37e23d926b95
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/qcom,apq8096.txt
@@ -0,0 +1,76 @@
+* Qualcomm Technologies APQ8096 ASoC sound card driver
+
+This binding describes the APQ8096 sound card, which uses qdsp for audio.
+
+- compatible:
+   Usage: required
+   Value type: 
+   Definition: must be "qcom,apq8096-sndcard"
+
+- qcom,audio-routing:
+   Usage: Optional
+   Value type: 
+   Definition:  A list of the connections between audio components.
+ Each entry is a pair of strings, the first being the
+ connection's sink, the second being the connection's
+ source. Valid names could be power supplies, MicBias
+ of codec and the jacks on the board:


Please list out valid values here.


I can list the values for the HDMI playback use-case, but the list would 
grow as we start adding wcd9335 codec support.





+
+= dailinks
+Each subnode of sndcard represents either a dailink, and subnodes of each
+dailinks would be cpu/codec/platform dais.
+
+- link-name:


Not a standard property, but I guess that sneaked in with the 8016
binding...

Yes, I followed 8016 bindings.

Am happy to prefix this with qcom if that makes more sense.



+   Usage: required
+   Value type: 
+   Definition: User friendly name for dai link
+
+= CPU, PLATFORM, CODEC dais subnodes
+- cpu:
+   Usage: required
+   Value type: 
+   Definition: cpu dai sub-node
+
+- codec:
+   Usage: Optional
+   Value type: 
+   Definition: codec dai sub-node
+
+- platform:
+   Usage: Optional
+   Value type: 
+   Definition: platform dai sub-node
+
+- sound-dai:
+   Usage: required
+   Value type: 


phandle with args.

Yep.

--srini


Re: [PATCH v5 21/23] ASoC: qdsp6: dt-bindings: Add apq8096 machine bindings

2018-04-24 Thread Srinivas Kandagatla

Thanks for the review.

On 24/04/18 17:25, Rob Herring wrote:

On Wed, Apr 18, 2018 at 04:31:55PM +0100, srinivas.kandaga...@linaro.org wrote:

From: Srinivas Kandagatla 

Add devicetree bindings documentation file for Qualcomm apq8096 sound card.

Signed-off-by: Srinivas Kandagatla 
---
  .../devicetree/bindings/sound/qcom,apq8096.txt | 76 ++
  1 file changed, 76 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/sound/qcom,apq8096.txt

diff --git a/Documentation/devicetree/bindings/sound/qcom,apq8096.txt 
b/Documentation/devicetree/bindings/sound/qcom,apq8096.txt
new file mode 100644
index ..37e23d926b95
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/qcom,apq8096.txt
@@ -0,0 +1,76 @@
+* Qualcomm Technologies APQ8096 ASoC sound card driver
+
+This binding describes the APQ8096 sound card, which uses qdsp for audio.
+
+- compatible:
+   Usage: required
+   Value type: 
+   Definition: must be "qcom,apq8096-sndcard"
+
+- qcom,audio-routing:
+   Usage: Optional
+   Value type: 
+   Definition:  A list of the connections between audio components.
+ Each entry is a pair of strings, the first being the
+ connection's sink, the second being the connection's
+ source. Valid names could be power supplies, MicBias
+ of codec and the jacks on the board:


Please list out valid values here.


I can list the values for the HDMI playback use-case, but the list would 
grow as we start adding wcd9335 codec support.





+
+= dailinks
+Each subnode of sndcard represents either a dailink, and subnodes of each
+dailinks would be cpu/codec/platform dais.
+
+- link-name:


Not a standard property, but I guess that sneaked in with the 8016
binding...

Yes, I followed 8016 bindings.

Am happy to prefix this with qcom if that makes more sense.



+   Usage: required
+   Value type: 
+   Definition: User friendly name for dai link
+
+= CPU, PLATFORM, CODEC dais subnodes
+- cpu:
+   Usage: required
+   Value type: 
+   Definition: cpu dai sub-node
+
+- codec:
+   Usage: Optional
+   Value type: 
+   Definition: codec dai sub-node
+
+- platform:
+   Usage: Optional
+   Value type: 
+   Definition: platform dai sub-node
+
+- sound-dai:
+   Usage: required
+   Value type: 


phandle with args.

Yep.

--srini


Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in gfp_kmemleak_mask

2018-04-24 Thread Chunyu Hu


- Original Message -
> From: "Michal Hocko" 
> To: "Chunyu Hu" 
> Cc: "Dmitry Vyukov" , "Catalin Marinas" 
> , "Chunyu Hu"
> , "LKML" , "Linux-MM" 
> 
> Sent: Tuesday, April 24, 2018 9:20:57 PM
> Subject: Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in 
> gfp_kmemleak_mask
> 
> On Mon 23-04-18 12:17:32, Chunyu Hu wrote:
> [...]
> > So if there is a new flag, it would be the 25th bits.
> 
> No new flags please. Can you simply store a simple bool into fail_page_alloc
> and have save/restore api for that?

Hi Michal,

I still don't get your point. The original NOFAIL added in kmemleak was 
for skipping fault injection in page/slab  allocation for kmemleak object, 
since kmemleak will disable itself until next reboot, whenever it hit an 
allocation failure, in that case, it will lose effect to check kmemleak 
in errer path rose by fault injection. But NOFAULT's effect is more than 
skipping fault injection, it's also for hard allocation. So a dedicated flag
for skipping fault injection in specified slab/page allocation was mentioned.
 
d9570ee3bd1d ("kmemleak: allow to coexist with fault injection") 
  
Do you mean something like below, with the save/store api? But looks like
to make it possible to skip a specified allocation, not global disabling,
a bool is not enough, and a gfp_flag is also needed. Maybe I missed something?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 905db9d..ca6f609 100644   
  
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3053,14 +3053,28 @@ struct page *rmqueue(struct zone *preferred_zone,
 
bool ignore_gfp_highmem;
bool ignore_gfp_reclaim;
+   bool ignore_kmemleak_fault;
u32 min_order;
 } fail_page_alloc = { 
.attr = FAULT_ATTR_INITIALIZER,
.ignore_gfp_reclaim = true,
.ignore_gfp_highmem = true,
+   .ignore_kmemleak_fault = true,
.min_order = 1,
 };
 
+bool saved_fail_page_alloc_ignore;
+void fail_page_alloc_ignore_save(void)
+{
+   saved_fail_page_alloc_ignore = fail_page_alloc.ignore_kmemleak_fault;
+   fail_page_alloc.ignore_kmemleak_fault = true;
+}
+
+void fail_page_alloc_ignore_restore(void)
+{
+   fail_page_alloc.ignore_kmemleak_fault = saved_fail_page_alloc_ignore;
+}
+
 static int __init setup_fail_page_alloc(char *str)
 {
return setup_fault_attr(_page_alloc.attr, str);
@@ -3075,6 +3089,9 @@ static bool should_fail_alloc_page(gfp_t gfp_mask, 
unsigned int order)
return false;
if (fail_page_alloc.ignore_gfp_highmem && (gfp_mask & __GFP_HIGHMEM))
return false;
+   /* looks like here we still need a new GFP_KMEMLKEAK flag ... */
+   if (fail_page_alloc.ignore_kmemleak_fault && (gfp_mask & __GFP_KMEMLEAK))
+   return false;
if (fail_page_alloc.ignore_gfp_reclaim &&
(gfp_mask & __GFP_DIRECT_RECLAIM))
return false;   

> 
> --
> Michal Hocko
> SUSE Labs
> 

-- 
Regards,
Chunyu Hu



Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in gfp_kmemleak_mask

2018-04-24 Thread Chunyu Hu


- Original Message -
> From: "Michal Hocko" 
> To: "Chunyu Hu" 
> Cc: "Dmitry Vyukov" , "Catalin Marinas" 
> , "Chunyu Hu"
> , "LKML" , "Linux-MM" 
> 
> Sent: Tuesday, April 24, 2018 9:20:57 PM
> Subject: Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in 
> gfp_kmemleak_mask
> 
> On Mon 23-04-18 12:17:32, Chunyu Hu wrote:
> [...]
> > So if there is a new flag, it would be the 25th bits.
> 
> No new flags please. Can you simply store a simple bool into fail_page_alloc
> and have save/restore api for that?

Hi Michal,

I still don't get your point. The original NOFAIL added in kmemleak was 
for skipping fault injection in page/slab  allocation for kmemleak object, 
since kmemleak will disable itself until next reboot, whenever it hit an 
allocation failure, in that case, it will lose effect to check kmemleak 
in errer path rose by fault injection. But NOFAULT's effect is more than 
skipping fault injection, it's also for hard allocation. So a dedicated flag
for skipping fault injection in specified slab/page allocation was mentioned.
 
d9570ee3bd1d ("kmemleak: allow to coexist with fault injection") 
  
Do you mean something like below, with the save/store api? But looks like
to make it possible to skip a specified allocation, not global disabling,
a bool is not enough, and a gfp_flag is also needed. Maybe I missed something?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 905db9d..ca6f609 100644   
  
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3053,14 +3053,28 @@ struct page *rmqueue(struct zone *preferred_zone,
 
bool ignore_gfp_highmem;
bool ignore_gfp_reclaim;
+   bool ignore_kmemleak_fault;
u32 min_order;
 } fail_page_alloc = { 
.attr = FAULT_ATTR_INITIALIZER,
.ignore_gfp_reclaim = true,
.ignore_gfp_highmem = true,
+   .ignore_kmemleak_fault = true,
.min_order = 1,
 };
 
+bool saved_fail_page_alloc_ignore;
+void fail_page_alloc_ignore_save(void)
+{
+   saved_fail_page_alloc_ignore = fail_page_alloc.ignore_kmemleak_fault;
+   fail_page_alloc.ignore_kmemleak_fault = true;
+}
+
+void fail_page_alloc_ignore_restore(void)
+{
+   fail_page_alloc.ignore_kmemleak_fault = saved_fail_page_alloc_ignore;
+}
+
 static int __init setup_fail_page_alloc(char *str)
 {
return setup_fault_attr(_page_alloc.attr, str);
@@ -3075,6 +3089,9 @@ static bool should_fail_alloc_page(gfp_t gfp_mask, 
unsigned int order)
return false;
if (fail_page_alloc.ignore_gfp_highmem && (gfp_mask & __GFP_HIGHMEM))
return false;
+   /* looks like here we still need a new GFP_KMEMLKEAK flag ... */
+   if (fail_page_alloc.ignore_kmemleak_fault && (gfp_mask & __GFP_KMEMLEAK))
+   return false;
if (fail_page_alloc.ignore_gfp_reclaim &&
(gfp_mask & __GFP_DIRECT_RECLAIM))
return false;   

> 
> --
> Michal Hocko
> SUSE Labs
> 

-- 
Regards,
Chunyu Hu



Re: hrtimer (rdmavt RNR timer) was lost

2018-04-24 Thread Frederic Weisbecker
On Tue, Apr 24, 2018 at 04:54:58PM +0200, Thomas Gleixner wrote:
> On Mon, 23 Apr 2018, Thomas Gleixner wrote:
> > On Mon, 23 Apr 2018, Wan, Kaike wrote:
> > > > Can you apply the following debug patch and enable the hrtimer_start 
> > > > trace
> > > > point and send me the full trace or upload it somewhere?
> > > 
> > > The original trace was about 29GB and I filtered it with
> > > "66dda1ea" (the offending base) to generate a 1.4GB file that I
> > > could open and investigate.  I am not sure how I can send them to you. Do
> > > you have somewhere I can upload to?
> > >
> > > I can try your debug patch and again I am anticipating a big trace file.
> > 
> > Well, you can find the spot where the fail happens and then extract the
> > full thing from 2s before that point to 1s after. That should be reasonably
> > small and good enough. Let me know when you have it and how big it is
> > (compressed) and we'll figure something out how to transport it.
> 
> Thanks for the more complete data which actually let me decode the wreckage.
> 
> So you have NO_HZ enabled and looking at the trace then this is related:
> 
>   -0 [003] dN.. 3598605307236: hrtimer_start: 
> hrtimer=04346740 function=tick_sched_timer expires=71217100 
> softexpires=71217100mode=ABS|PINNED base=66dda1ea 
> next=708914d7
>   -0 [003] dN.. 3598605307601: hrtimer_post_add: 
> hrtimer=04346740 function=tick_sched_timer base=66dda1ea 
> next=04346740
> 
>   -0 [002] d.h. 3598605313255: hrtimer_start: 
> hrtimer=662d2dd8 function=rvt_rc_rnr_retry [rdmavt] 
> expires=712172915662 softexpires=712172915662mode=REL base=66dda1ea 
> next=04346740
>   -0 [002] d.h. 3599538740786: hrtimer_post_add: 
> hrtimer=662d2dd8 function=rvt_rc_rnr_retry [rdmavt] 
> base=66dda1ea next=662d2dd8
> 
>   -0 [003] dn.. 3599538742242: hrtimer_pre_remove: 
> hrtimer=04346740 function=tick_sched_timer base=66dda1ea 
> next=662d2dd8
>   -0 [003] dn.. 3599538743084: hrtimer_post_remove: 
> hrtimer=04346740 function=tick_sched_timer base=66dda1ea 
> next=662d2dd8
>   
>   -0 [003] dn.. 3599538743830: hrtimer_start: 
> hrtimer=04346740 function=tick_sched_timer expires=71676700 
> softexpires=71676700mode=ABS|PINNED base=66dda1ea 
> next=662d2dd8
>   -0 [003] dn.. 3599538744560: hrtimer_post_add: 
> hrtimer=04346740 function=tick_sched_timer base=66dda1ea 
> next=662d2dd8
> 
> And staring at the NOHZ code I'm pretty sure, I know what happens.
> 
> CPU 3 CPU 2
> 
> idle
> start tick_sched_timer 71217100
>   start rdmavt timer 712172915662
>   lock(base)
> tick_nohz_stop_tick()
> tick = 71676700   timerqueue_add(tmr)
> 
> hrtimer_set_expires(>sched_timer, tick); < FAIL
> 
>   if (tmr->exp < 
> queue->next->exp)
> hrtimer_start(sched_timer)queue->next = tmr;
> lock(base)
>   unlock(base)
> timerqueue_remove()
> timerqueue_add()
> 
> 
> So ts->sched_timer is still queued and queue->next is pointing to it, but
> then expires is modified. So the other side sees the new expiry time and
> makes the rdmavt timer the new queue->next. All f*cked from there.
> 
> The problem was introduced with:
> 
>   d4af6d933ccf ("nohz: Fix spurious warning when hrtimer and clockevent get 
> out of sync")
> 
> The changelog is not really helpful, but I can't see why the expiry time of
> ts->sched_timer should be updated before the timer is [re]started in case
> of HIGHRES + NOHZ. hrtimer_start() sets the expiry time, so that should be
> sufficient. Of course the tick needs to be stored in ts->sched_timer for
> the !HIGHRES + HOHZ case, but that's trivial enough to do. Patch against
> Linus tree below. It's easy to backport in case you want to run it against
> the kernel which made the observation simpler.
> 
> I need to come up with integration of hrtimer_set_expires() into debug
> objects to catch this kind of horrors. Groan

Sorry for all that, that must have been hairy enough to debug :-(

I thought that I could fiddle with the sched_timer because nothing else
is supposed to touch it concurrently, but that forgot about the whole
queue locked under cpu_base. My fault.

The fix looks all good to me, thanks!

ACK.


Re: hrtimer (rdmavt RNR timer) was lost

2018-04-24 Thread Frederic Weisbecker
On Tue, Apr 24, 2018 at 04:54:58PM +0200, Thomas Gleixner wrote:
> On Mon, 23 Apr 2018, Thomas Gleixner wrote:
> > On Mon, 23 Apr 2018, Wan, Kaike wrote:
> > > > Can you apply the following debug patch and enable the hrtimer_start 
> > > > trace
> > > > point and send me the full trace or upload it somewhere?
> > > 
> > > The original trace was about 29GB and I filtered it with
> > > "66dda1ea" (the offending base) to generate a 1.4GB file that I
> > > could open and investigate.  I am not sure how I can send them to you. Do
> > > you have somewhere I can upload to?
> > >
> > > I can try your debug patch and again I am anticipating a big trace file.
> > 
> > Well, you can find the spot where the fail happens and then extract the
> > full thing from 2s before that point to 1s after. That should be reasonably
> > small and good enough. Let me know when you have it and how big it is
> > (compressed) and we'll figure something out how to transport it.
> 
> Thanks for the more complete data which actually let me decode the wreckage.
> 
> So you have NO_HZ enabled and looking at the trace then this is related:
> 
>   -0 [003] dN.. 3598605307236: hrtimer_start: 
> hrtimer=04346740 function=tick_sched_timer expires=71217100 
> softexpires=71217100mode=ABS|PINNED base=66dda1ea 
> next=708914d7
>   -0 [003] dN.. 3598605307601: hrtimer_post_add: 
> hrtimer=04346740 function=tick_sched_timer base=66dda1ea 
> next=04346740
> 
>   -0 [002] d.h. 3598605313255: hrtimer_start: 
> hrtimer=662d2dd8 function=rvt_rc_rnr_retry [rdmavt] 
> expires=712172915662 softexpires=712172915662mode=REL base=66dda1ea 
> next=04346740
>   -0 [002] d.h. 3599538740786: hrtimer_post_add: 
> hrtimer=662d2dd8 function=rvt_rc_rnr_retry [rdmavt] 
> base=66dda1ea next=662d2dd8
> 
>   -0 [003] dn.. 3599538742242: hrtimer_pre_remove: 
> hrtimer=04346740 function=tick_sched_timer base=66dda1ea 
> next=662d2dd8
>   -0 [003] dn.. 3599538743084: hrtimer_post_remove: 
> hrtimer=04346740 function=tick_sched_timer base=66dda1ea 
> next=662d2dd8
>   
>   -0 [003] dn.. 3599538743830: hrtimer_start: 
> hrtimer=04346740 function=tick_sched_timer expires=71676700 
> softexpires=71676700mode=ABS|PINNED base=66dda1ea 
> next=662d2dd8
>   -0 [003] dn.. 3599538744560: hrtimer_post_add: 
> hrtimer=04346740 function=tick_sched_timer base=66dda1ea 
> next=662d2dd8
> 
> And staring at the NOHZ code I'm pretty sure, I know what happens.
> 
> CPU 3 CPU 2
> 
> idle
> start tick_sched_timer 71217100
>   start rdmavt timer 712172915662
>   lock(base)
> tick_nohz_stop_tick()
> tick = 71676700   timerqueue_add(tmr)
> 
> hrtimer_set_expires(>sched_timer, tick); < FAIL
> 
>   if (tmr->exp < 
> queue->next->exp)
> hrtimer_start(sched_timer)queue->next = tmr;
> lock(base)
>   unlock(base)
> timerqueue_remove()
> timerqueue_add()
> 
> 
> So ts->sched_timer is still queued and queue->next is pointing to it, but
> then expires is modified. So the other side sees the new expiry time and
> makes the rdmavt timer the new queue->next. All f*cked from there.
> 
> The problem was introduced with:
> 
>   d4af6d933ccf ("nohz: Fix spurious warning when hrtimer and clockevent get 
> out of sync")
> 
> The changelog is not really helpful, but I can't see why the expiry time of
> ts->sched_timer should be updated before the timer is [re]started in case
> of HIGHRES + NOHZ. hrtimer_start() sets the expiry time, so that should be
> sufficient. Of course the tick needs to be stored in ts->sched_timer for
> the !HIGHRES + HOHZ case, but that's trivial enough to do. Patch against
> Linus tree below. It's easy to backport in case you want to run it against
> the kernel which made the observation simpler.
> 
> I need to come up with integration of hrtimer_set_expires() into debug
> objects to catch this kind of horrors. Groan

Sorry for all that, that must have been hairy enough to debug :-(

I thought that I could fiddle with the sched_timer because nothing else
is supposed to touch it concurrently, but that forgot about the whole
queue locked under cpu_base. My fault.

The fix looks all good to me, thanks!

ACK.


Re: [PATCH v5 1/2] char: sparc64: Add privileged ADI driver

2018-04-24 Thread Tom Hromatka



On 04/23/2018 11:52 AM, Greg KH wrote:

On Mon, Apr 23, 2018 at 11:33:31AM -0600, Tom Hromatka wrote:

SPARC M7 and newer processors utilize ADI to version and
protect memory.  This driver is capable of reading/writing
ADI/MCD versions from privileged user space processes.
Addresses in the adi file are mapped linearly to physical
memory at a ratio of 1:adi_blksz.  Thus, a read (or write)
of offset K in the file operates upon the ADI version at
physical address K * adi_blksz.  The version information
is encoded as one version per byte.  Intended consumers
are makedumpfile and crash.

What do you mean by "crash"?  Should this tie into the pstore
infrastructure, or is this just a userspace thing?  Just curious.


My apologies.  I was referring to the crash utility:
https://github.com/crash-utility/crash

A future commit to store the ADI versions to the pstore would be
really cool.  I am fearful the amount of ADI data could overwhelm
the pstore, though.  The latest sparc machines support 4 TB of RAM
which could mean several GBs of ADI versions.  But storing the ADI
versions pertaining to the failing code should be possible.  I need
to do more research...



Minor code comments below now that the license stuff is correct, I
decided to read the code :)


:)


+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MODULE_NAME"adi"

What's wrong with KBUILD_MODNAME?  Just use that instead of MODULE_NAME
later on in the file.


Good catch.  I'll do that in the next rev of this patch.


+#define MAX_BUF_SZ 4096

PAGE_SIZE?  Just curious.


When a user requests a large read/write in makedumpfile or the crash
utility, these tools typically make requests in 4096-sized chunks.
I believe you are correct that these operations are based upon page
size, but I have not verified.  I was hesitant to connect MAX_BUF_SZ to
PAGE_SIZE without this verification.  I'll look into it more...


+
+static int adi_open(struct inode *inode, struct file *file)
+{
+   file->f_mode |= FMODE_UNSIGNED_OFFSET;

That's odd, why?


sparc64 currently supports 4 TB of RAM (and could support much more in the
future).  Offsets into this ADI privileged driver are address / 64, but
that could change also in the future depending upon cache line sizes.  I
was afraid that future sparc systems could have very large file offsets.
Overkill?




+   return 0;
+}
+
+static int read_mcd_tag(unsigned long addr)
+{
+   long err;
+   int ver;
+
+   __asm__ __volatile__(
+   "1:ldxa [%[addr]] %[asi], %[ver]\n"
+   "  mov 0, %[err]\n"
+   "2:\n"
+   "  .section .fixup,#alloc,#execinstr\n"
+   "  .align 4\n"
+   "3:sethi %%hi(2b), %%g1\n"
+   "  jmpl  %%g1 + %%lo(2b), %%g0\n"
+   "  mov %[invalid], %[err]\n"
+   "  .previous\n"
+   "  .section __ex_table, \"a\"\n"
+   "  .align 4\n"
+   "  .word  1b, 3b\n"
+   "  .previous\n"
+   : [ver] "=r" (ver), [err] "=r" (err)
+   : [addr] "r"  (addr), [invalid] "i" (EFAULT),
+ [asi] "i" (ASI_MCD_REAL)
+   : "memory", "g1"
+   );
+
+   if (err)
+   return -EFAULT;
+   else
+   return ver;
+}
+
+static ssize_t adi_read(struct file *file, char __user *buf,
+   size_t count, loff_t *offp)
+{
+   size_t ver_buf_sz, bytes_read = 0;
+   int ver_buf_idx = 0;
+   loff_t offset;
+   u8 *ver_buf;
+   ssize_t ret;
+
+   ver_buf_sz = min_t(size_t, count, MAX_BUF_SZ);
+   ver_buf = kmalloc(ver_buf_sz, GFP_KERNEL);
+   if (!ver_buf)
+   return -ENOMEM;
+
+   offset = (*offp) * adi_blksize();
+
+   while (bytes_read < count) {
+   ret = read_mcd_tag(offset);
+   if (ret < 0)
+   goto out;
+
+   ver_buf[ver_buf_idx] = (u8)ret;

Are you sure ret fits in 8 bits here?


Yes, I believe so.  read_mcd_tag() will return a negative number
on an error - which is checked a couple lines above.  Otherwise,
the read succeeded which means a valid ADI version was returned.
Valid ADI versions are 0 through 16.


+   ver_buf_idx++;
+   offset += adi_blksize();
+
+   if (ver_buf_idx >= ver_buf_sz) {
+   if (copy_to_user(buf + bytes_read, ver_buf,
+ver_buf_sz)) {
+   ret = -EFAULT;
+   goto out;
+   }
+
+   bytes_read += ver_buf_sz;
+   ver_buf_idx = 0;
+
+   ver_buf_sz = min(count - bytes_read,
+(size_t)MAX_BUF_SZ);
+   }
+   }
+
+   (*offp) += bytes_read;
+   ret = bytes_read;
+out:
+   kfree(ver_buf);
+   

Re: [PATCH v5 1/2] char: sparc64: Add privileged ADI driver

2018-04-24 Thread Tom Hromatka



On 04/23/2018 11:52 AM, Greg KH wrote:

On Mon, Apr 23, 2018 at 11:33:31AM -0600, Tom Hromatka wrote:

SPARC M7 and newer processors utilize ADI to version and
protect memory.  This driver is capable of reading/writing
ADI/MCD versions from privileged user space processes.
Addresses in the adi file are mapped linearly to physical
memory at a ratio of 1:adi_blksz.  Thus, a read (or write)
of offset K in the file operates upon the ADI version at
physical address K * adi_blksz.  The version information
is encoded as one version per byte.  Intended consumers
are makedumpfile and crash.

What do you mean by "crash"?  Should this tie into the pstore
infrastructure, or is this just a userspace thing?  Just curious.


My apologies.  I was referring to the crash utility:
https://github.com/crash-utility/crash

A future commit to store the ADI versions to the pstore would be
really cool.  I am fearful the amount of ADI data could overwhelm
the pstore, though.  The latest sparc machines support 4 TB of RAM
which could mean several GBs of ADI versions.  But storing the ADI
versions pertaining to the failing code should be possible.  I need
to do more research...



Minor code comments below now that the license stuff is correct, I
decided to read the code :)


:)


+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MODULE_NAME"adi"

What's wrong with KBUILD_MODNAME?  Just use that instead of MODULE_NAME
later on in the file.


Good catch.  I'll do that in the next rev of this patch.


+#define MAX_BUF_SZ 4096

PAGE_SIZE?  Just curious.


When a user requests a large read/write in makedumpfile or the crash
utility, these tools typically make requests in 4096-sized chunks.
I believe you are correct that these operations are based upon page
size, but I have not verified.  I was hesitant to connect MAX_BUF_SZ to
PAGE_SIZE without this verification.  I'll look into it more...


+
+static int adi_open(struct inode *inode, struct file *file)
+{
+   file->f_mode |= FMODE_UNSIGNED_OFFSET;

That's odd, why?


sparc64 currently supports 4 TB of RAM (and could support much more in the
future).  Offsets into this ADI privileged driver are address / 64, but
that could change also in the future depending upon cache line sizes.  I
was afraid that future sparc systems could have very large file offsets.
Overkill?




+   return 0;
+}
+
+static int read_mcd_tag(unsigned long addr)
+{
+   long err;
+   int ver;
+
+   __asm__ __volatile__(
+   "1:ldxa [%[addr]] %[asi], %[ver]\n"
+   "  mov 0, %[err]\n"
+   "2:\n"
+   "  .section .fixup,#alloc,#execinstr\n"
+   "  .align 4\n"
+   "3:sethi %%hi(2b), %%g1\n"
+   "  jmpl  %%g1 + %%lo(2b), %%g0\n"
+   "  mov %[invalid], %[err]\n"
+   "  .previous\n"
+   "  .section __ex_table, \"a\"\n"
+   "  .align 4\n"
+   "  .word  1b, 3b\n"
+   "  .previous\n"
+   : [ver] "=r" (ver), [err] "=r" (err)
+   : [addr] "r"  (addr), [invalid] "i" (EFAULT),
+ [asi] "i" (ASI_MCD_REAL)
+   : "memory", "g1"
+   );
+
+   if (err)
+   return -EFAULT;
+   else
+   return ver;
+}
+
+static ssize_t adi_read(struct file *file, char __user *buf,
+   size_t count, loff_t *offp)
+{
+   size_t ver_buf_sz, bytes_read = 0;
+   int ver_buf_idx = 0;
+   loff_t offset;
+   u8 *ver_buf;
+   ssize_t ret;
+
+   ver_buf_sz = min_t(size_t, count, MAX_BUF_SZ);
+   ver_buf = kmalloc(ver_buf_sz, GFP_KERNEL);
+   if (!ver_buf)
+   return -ENOMEM;
+
+   offset = (*offp) * adi_blksize();
+
+   while (bytes_read < count) {
+   ret = read_mcd_tag(offset);
+   if (ret < 0)
+   goto out;
+
+   ver_buf[ver_buf_idx] = (u8)ret;

Are you sure ret fits in 8 bits here?


Yes, I believe so.  read_mcd_tag() will return a negative number
on an error - which is checked a couple lines above.  Otherwise,
the read succeeded which means a valid ADI version was returned.
Valid ADI versions are 0 through 16.


+   ver_buf_idx++;
+   offset += adi_blksize();
+
+   if (ver_buf_idx >= ver_buf_sz) {
+   if (copy_to_user(buf + bytes_read, ver_buf,
+ver_buf_sz)) {
+   ret = -EFAULT;
+   goto out;
+   }
+
+   bytes_read += ver_buf_sz;
+   ver_buf_idx = 0;
+
+   ver_buf_sz = min(count - bytes_read,
+(size_t)MAX_BUF_SZ);
+   }
+   }
+
+   (*offp) += bytes_read;
+   ret = bytes_read;
+out:
+   kfree(ver_buf);
+   

Re: [PATCH v4 4/9] vsprintf: Consolidate handling of unknown pointer specifiers

2018-04-24 Thread Andy Shevchenko
On Wed, 2018-04-11 at 11:52 +0200, Petr Mladek wrote:
> On Tue 2018-04-10 14:41:55, Andy Shevchenko wrote:
> > On Mon, 2018-04-09 at 15:50 +0200, Petr Mladek wrote:
> > > On Sat 2018-04-07 17:26:40, Andy Shevchenko wrote:

> > I think about 1-7 and 9 that can go as is before your changes.
> > And patch 8 postpone
> 
> Done. All the 8 patches are in printk.git, branch
> for-4.18-vsprintf-cleanup.

Thanks!

> I am sorry that we have missed 4.17. In theory, it still might be
> possible to push them there. But there does not seem to be a real
> reason to rush them.

Agree.

-- 
Andy Shevchenko 
Intel Finland Oy


Re: [PATCH v4 4/9] vsprintf: Consolidate handling of unknown pointer specifiers

2018-04-24 Thread Andy Shevchenko
On Wed, 2018-04-11 at 11:52 +0200, Petr Mladek wrote:
> On Tue 2018-04-10 14:41:55, Andy Shevchenko wrote:
> > On Mon, 2018-04-09 at 15:50 +0200, Petr Mladek wrote:
> > > On Sat 2018-04-07 17:26:40, Andy Shevchenko wrote:

> > I think about 1-7 and 9 that can go as is before your changes.
> > And patch 8 postpone
> 
> Done. All the 8 patches are in printk.git, branch
> for-4.18-vsprintf-cleanup.

Thanks!

> I am sorry that we have missed 4.17. In theory, it still might be
> possible to push them there. But there does not seem to be a real
> reason to rush them.

Agree.

-- 
Andy Shevchenko 
Intel Finland Oy


Re: [PATCH v3 06/12] KVM: arm/arm64: Helper to register a new redistributor region

2018-04-24 Thread Christoffer Dall
On Fri, Apr 13, 2018 at 10:20:52AM +0200, Eric Auger wrote:
> We introduce a new helper that creates and inserts a new redistributor
> region into the rdist region list. This helper both handles the case
> where the redistributor region size is known at registration time
> and the legacy case where it is not (eventually depending on the number
> of online vcpus). Depending on pfns, we perform all the possible checks
> that we can do:
> 
> - end of memory crossing
> - incorrect alignment of the base address
> - collision with distributor region if already defined
> - collision with already registered rdist regions
> - check of the new index
> 
> Rdist regions must be inserted by increasing order of indices. Indices
> must be contiguous.
> 
> We also introduce vgic_v3_rdist_region_from_index() which will be used
> from the vgic kvm-device, later on.
> 
> Signed-off-by: Eric Auger 
> ---
>  virt/kvm/arm/vgic/vgic-mmio-v3.c | 95 
> +---
>  virt/kvm/arm/vgic/vgic-v3.c  | 29 
>  virt/kvm/arm/vgic/vgic.h | 14 ++
>  3 files changed, 122 insertions(+), 16 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c 
> b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> index ce5c927..5273fb8 100644
> --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
> +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> @@ -680,14 +680,66 @@ static int vgic_register_all_redist_iodevs(struct kvm 
> *kvm)
>   return ret;
>  }
>  
> -int vgic_v3_set_redist_base(struct kvm *kvm, u64 addr)
> +/**
> + * vgic_v3_insert_redist_region - Insert a new redistributor region
> + *
> + * Performs various checks before inserting the rdist region in the list.
> + * Those tests depend on whether the size of the rdist region is known
> + * (ie. count != 0). The list is sorted by rdist region index.
> + *
> + * @kvm: kvm handle
> + * @index: redist region index
> + * @base: base of the new rdist region
> + * @count: number of redistributors the region is made of (of 0 in the old 
> style
> + * single region, whose size is induced from the number of vcpus)
> + *
> + * Return 0 on success, < 0 otherwise
> + */
> +static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index,
> + gpa_t base, uint32_t count)
>  {
> - struct vgic_dist *vgic = >arch.vgic;
> + struct vgic_dist *d = >arch.vgic;
>   struct vgic_redist_region *rdreg;
> + struct list_head *rd_regions = >rd_regions;
> + struct list_head *last = rd_regions->prev;
> +

nit: extra blank line?

> + gpa_t new_start, new_end;
> + size_t size = count * KVM_VGIC_V3_REDIST_SIZE;
>   int ret;
>  
> - /* vgic_check_ioaddr makes sure we don't do this twice */
> - if (!list_empty(>rd_regions))
> + /* single rdist region already set ?*/
> + if (!count && !list_empty(rd_regions))
> + return -EINVAL;
> +
> + /* cross the end of memory ? */
> + if (base + size < base)
> + return -EINVAL;

what is the size of memory?  This seems to check for a gpa_t overflow,
but not againt the IPA space of the VM...

> +
> + if (list_empty(rd_regions)) {
> + if (index != 0)
> + return -EINVAL;

note, I think this can be simplified if we can rid of the index.

> + } else {
> + rdreg = list_entry(last, struct vgic_redist_region, list);

you can use list_last_entry here and get rid of the 'last' temporary
variable above.

> + if (index != rdreg->index + 1)
> + return -EINVAL;
> +
> + /* Cannot add an explicitly sized regions after legacy region */
> + if (!rdreg->count)
> + return -EINVAL;
> + }
> +
> + /*
> +  * collision with already set dist region ?
> +  * this assumes we know the size of the new rdist region (pfns != 0)
> +  * otherwise we can only test this when all vcpus are registered
> +  */

I don't really understand this commentary... :(

> + if (!count && !IS_VGIC_ADDR_UNDEF(d->vgic_dist_base) &&
> + (!(d->vgic_dist_base + KVM_VGIC_V3_DIST_SIZE <= base)) &&
> + (!(base + size <= d->vgic_dist_base)))
> + return -EINVAL;

Can't you call vgic_v3_check_base() here instead?

> +
> + /* collision with any other rdist region? */
> + if (vgic_v3_rdist_overlap(kvm, base, size))
>   return -EINVAL;
>  
>   rdreg = kzalloc(sizeof(*rdreg), GFP_KERNEL);
> @@ -696,17 +748,32 @@ int vgic_v3_set_redist_base(struct kvm *kvm, u64 addr)
>  
>   rdreg->base = VGIC_ADDR_UNDEF;
>  
> - ret = vgic_check_ioaddr(kvm, >base, addr, SZ_64K);
> + ret = vgic_check_ioaddr(kvm, >base, base, SZ_64K);
>   if (ret)
> - goto out;
> + goto free;
>  
> - rdreg->base = addr;
> - if (!vgic_v3_check_base(kvm)) {
> - ret = -EINVAL;
> - goto out;
> - }
> + rdreg->base = base;
> + 

Re: [PATCH v3 06/12] KVM: arm/arm64: Helper to register a new redistributor region

2018-04-24 Thread Christoffer Dall
On Fri, Apr 13, 2018 at 10:20:52AM +0200, Eric Auger wrote:
> We introduce a new helper that creates and inserts a new redistributor
> region into the rdist region list. This helper both handles the case
> where the redistributor region size is known at registration time
> and the legacy case where it is not (eventually depending on the number
> of online vcpus). Depending on pfns, we perform all the possible checks
> that we can do:
> 
> - end of memory crossing
> - incorrect alignment of the base address
> - collision with distributor region if already defined
> - collision with already registered rdist regions
> - check of the new index
> 
> Rdist regions must be inserted by increasing order of indices. Indices
> must be contiguous.
> 
> We also introduce vgic_v3_rdist_region_from_index() which will be used
> from the vgic kvm-device, later on.
> 
> Signed-off-by: Eric Auger 
> ---
>  virt/kvm/arm/vgic/vgic-mmio-v3.c | 95 
> +---
>  virt/kvm/arm/vgic/vgic-v3.c  | 29 
>  virt/kvm/arm/vgic/vgic.h | 14 ++
>  3 files changed, 122 insertions(+), 16 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c 
> b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> index ce5c927..5273fb8 100644
> --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
> +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> @@ -680,14 +680,66 @@ static int vgic_register_all_redist_iodevs(struct kvm 
> *kvm)
>   return ret;
>  }
>  
> -int vgic_v3_set_redist_base(struct kvm *kvm, u64 addr)
> +/**
> + * vgic_v3_insert_redist_region - Insert a new redistributor region
> + *
> + * Performs various checks before inserting the rdist region in the list.
> + * Those tests depend on whether the size of the rdist region is known
> + * (ie. count != 0). The list is sorted by rdist region index.
> + *
> + * @kvm: kvm handle
> + * @index: redist region index
> + * @base: base of the new rdist region
> + * @count: number of redistributors the region is made of (of 0 in the old 
> style
> + * single region, whose size is induced from the number of vcpus)
> + *
> + * Return 0 on success, < 0 otherwise
> + */
> +static int vgic_v3_insert_redist_region(struct kvm *kvm, uint32_t index,
> + gpa_t base, uint32_t count)
>  {
> - struct vgic_dist *vgic = >arch.vgic;
> + struct vgic_dist *d = >arch.vgic;
>   struct vgic_redist_region *rdreg;
> + struct list_head *rd_regions = >rd_regions;
> + struct list_head *last = rd_regions->prev;
> +

nit: extra blank line?

> + gpa_t new_start, new_end;
> + size_t size = count * KVM_VGIC_V3_REDIST_SIZE;
>   int ret;
>  
> - /* vgic_check_ioaddr makes sure we don't do this twice */
> - if (!list_empty(>rd_regions))
> + /* single rdist region already set ?*/
> + if (!count && !list_empty(rd_regions))
> + return -EINVAL;
> +
> + /* cross the end of memory ? */
> + if (base + size < base)
> + return -EINVAL;

what is the size of memory?  This seems to check for a gpa_t overflow,
but not againt the IPA space of the VM...

> +
> + if (list_empty(rd_regions)) {
> + if (index != 0)
> + return -EINVAL;

note, I think this can be simplified if we can rid of the index.

> + } else {
> + rdreg = list_entry(last, struct vgic_redist_region, list);

you can use list_last_entry here and get rid of the 'last' temporary
variable above.

> + if (index != rdreg->index + 1)
> + return -EINVAL;
> +
> + /* Cannot add an explicitly sized regions after legacy region */
> + if (!rdreg->count)
> + return -EINVAL;
> + }
> +
> + /*
> +  * collision with already set dist region ?
> +  * this assumes we know the size of the new rdist region (pfns != 0)
> +  * otherwise we can only test this when all vcpus are registered
> +  */

I don't really understand this commentary... :(

> + if (!count && !IS_VGIC_ADDR_UNDEF(d->vgic_dist_base) &&
> + (!(d->vgic_dist_base + KVM_VGIC_V3_DIST_SIZE <= base)) &&
> + (!(base + size <= d->vgic_dist_base)))
> + return -EINVAL;

Can't you call vgic_v3_check_base() here instead?

> +
> + /* collision with any other rdist region? */
> + if (vgic_v3_rdist_overlap(kvm, base, size))
>   return -EINVAL;
>  
>   rdreg = kzalloc(sizeof(*rdreg), GFP_KERNEL);
> @@ -696,17 +748,32 @@ int vgic_v3_set_redist_base(struct kvm *kvm, u64 addr)
>  
>   rdreg->base = VGIC_ADDR_UNDEF;
>  
> - ret = vgic_check_ioaddr(kvm, >base, addr, SZ_64K);
> + ret = vgic_check_ioaddr(kvm, >base, base, SZ_64K);
>   if (ret)
> - goto out;
> + goto free;
>  
> - rdreg->base = addr;
> - if (!vgic_v3_check_base(kvm)) {
> - ret = -EINVAL;
> - goto out;
> - }
> + rdreg->base = base;
> + rdreg->count = count;
> + 

Re: vmalloc with GFP_NOFS

2018-04-24 Thread Mikulas Patocka


On Tue, 24 Apr 2018, Michal Hocko wrote:

> Hi,
> it seems that we still have few vmalloc users who perform GFP_NOFS
> allocation:
> drivers/mtd/ubi/io.c
> fs/ext4/xattr.c
> fs/gfs2/dir.c
> fs/gfs2/quota.c
> fs/nfs/blocklayout/extent_tree.c
> fs/ubifs/debug.c
> fs/ubifs/lprops.c
> fs/ubifs/lpt_commit.c
> fs/ubifs/orphan.c
> 
> Unfortunatelly vmalloc doesn't suppoer GFP_NOFS semantinc properly
> because we do have hardocded GFP_KERNEL allocations deep inside the
> vmalloc layers. That means that if GFP_NOFS really protects from
> recursion into the fs deadlocks then the vmalloc call is broken.
> 
> What to do about this? Well, there are two things. Firstly, it would be
> really great to double check whether the GFP_NOFS is really needed. I
> cannot judge that because I am not familiar with the code. It would be
> great if the respective maintainers (hopefully get_maintainer.sh pointed
> me to all relevant ones). If there is not reclaim recursion issue then
> simply use the standard vmalloc (aka GFP_KERNEL request).
> 
> If the use is really valid then we have a way to do the vmalloc
> allocation properly. We have memalloc_nofs_{save,restore} scope api. How
> does that work? You simply call memalloc_nofs_save when the reclaim
> recursion critical section starts (e.g. when you take a lock which is
> then used in the reclaim path - e.g. shrinker) and memalloc_nofs_restore
> when the critical section ends. _All_ allocations within that scope
> will get GFP_NOFS semantic automagically. If you are not sure about the
> scope itself then the easiest workaround is to wrap the vmalloc itself
> with a big fat comment that this should be revisited.
> 
> Does that sound like something that can be done in a reasonable time?
> I have tried to bring this up in the past but our speed is glacial and
> there are attempts to do hacks like checking for abusers inside the
> vmalloc which is just too ugly to live.
> 
> Please do not hesitate to get back to me if something is not clear.
> 
> Thanks!
> -- 
> Michal Hocko
> SUSE Labs

I made a patch that adds memalloc_noio/fs_save around these calls a year 
ago: http://lkml.iu.edu/hypermail/linux/kernel/1707.0/01376.html

Mikulas


Re: vmalloc with GFP_NOFS

2018-04-24 Thread Mikulas Patocka


On Tue, 24 Apr 2018, Michal Hocko wrote:

> Hi,
> it seems that we still have few vmalloc users who perform GFP_NOFS
> allocation:
> drivers/mtd/ubi/io.c
> fs/ext4/xattr.c
> fs/gfs2/dir.c
> fs/gfs2/quota.c
> fs/nfs/blocklayout/extent_tree.c
> fs/ubifs/debug.c
> fs/ubifs/lprops.c
> fs/ubifs/lpt_commit.c
> fs/ubifs/orphan.c
> 
> Unfortunatelly vmalloc doesn't suppoer GFP_NOFS semantinc properly
> because we do have hardocded GFP_KERNEL allocations deep inside the
> vmalloc layers. That means that if GFP_NOFS really protects from
> recursion into the fs deadlocks then the vmalloc call is broken.
> 
> What to do about this? Well, there are two things. Firstly, it would be
> really great to double check whether the GFP_NOFS is really needed. I
> cannot judge that because I am not familiar with the code. It would be
> great if the respective maintainers (hopefully get_maintainer.sh pointed
> me to all relevant ones). If there is not reclaim recursion issue then
> simply use the standard vmalloc (aka GFP_KERNEL request).
> 
> If the use is really valid then we have a way to do the vmalloc
> allocation properly. We have memalloc_nofs_{save,restore} scope api. How
> does that work? You simply call memalloc_nofs_save when the reclaim
> recursion critical section starts (e.g. when you take a lock which is
> then used in the reclaim path - e.g. shrinker) and memalloc_nofs_restore
> when the critical section ends. _All_ allocations within that scope
> will get GFP_NOFS semantic automagically. If you are not sure about the
> scope itself then the easiest workaround is to wrap the vmalloc itself
> with a big fat comment that this should be revisited.
> 
> Does that sound like something that can be done in a reasonable time?
> I have tried to bring this up in the past but our speed is glacial and
> there are attempts to do hacks like checking for abusers inside the
> vmalloc which is just too ugly to live.
> 
> Please do not hesitate to get back to me if something is not clear.
> 
> Thanks!
> -- 
> Michal Hocko
> SUSE Labs

I made a patch that adds memalloc_noio/fs_save around these calls a year 
ago: http://lkml.iu.edu/hypermail/linux/kernel/1707.0/01376.html

Mikulas


Re: [PATCH v3 02/12] KVM: arm/arm64: Document KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION

2018-04-24 Thread Christoffer Dall
On Fri, Apr 13, 2018 at 10:20:48AM +0200, Eric Auger wrote:
> We introduce a new KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION attribute in
> KVM_DEV_ARM_VGIC_GRP_ADDR group. It allows userspace to provide the
> base address and size of a redistributor region
> 
> Compared to KVM_VGIC_V3_ADDR_TYPE_REDIST, this new attribute allows
> to declare several separate redistributor regions.
> 
> So the whole redist space does not need to be contiguous anymore.
> 
> Signed-off-by: Eric Auger 
> ---
>  Documentation/virtual/kvm/devices/arm-vgic-v3.txt | 25 
> ++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/virtual/kvm/devices/arm-vgic-v3.txt 
> b/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
> index 9293b45..cbc4328 100644
> --- a/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
> +++ b/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
> @@ -27,9 +27,32 @@ Groups:
>VCPU and all of the redistributor pages are contiguous.
>Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
>This address needs to be 64K aligned.
> +
> +KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION (rw, 64-bit)
> +  The attr field of kvm_device_attr encodes 3 values:
> +  bits: | 63     52  |  51      16 | 15 - 12  |11 - 0
> +  values:   | count  |   base  |  flags   | index
> +  - index encodes the unique redistributor region index

I'm not entirely sure I understand the purpose of the index field.
Isn't a redistributor region identified uniquely by its base address?

Otherwise this looks good.

Thanks,
-Christoffer


> +  - flags: reserved for future use, currently 0
> +  - base field encodes bits [51:16] of the guest physical base address
> +of the first redistributor in the region.
> +  - count encodes the number of redistributors in the region. Must be
> +greater than 0.
> +  There are two 64K pages for each redistributor in the region and
> +  redistributors are laid out contiguously within the region. Regions
> +  are filled with redistributors in the index order. The sum of all
> +  region count fields must be greater than or equal to the number of
> +  VCPUs. Redistributor regions must be registered in the incremental
> +  index order, starting from index 0.
> +  Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
> +
> +  It is invalid to mix calls with KVM_VGIC_V3_ADDR_TYPE_REDIST and
> +  KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION attributes.
> +
>Errors:
>  -E2BIG:  Address outside of addressable IPA range
> --EINVAL: Incorrectly aligned address
> +-EINVAL: Incorrectly aligned address, bad redistributor region
> + count/index, mixed redistributor region attribute usage
>  -EEXIST: Address already configured
>  -ENXIO:  The group or attribute is unknown/unsupported for this device
>   or hardware support is missing.
> -- 
> 2.5.5
> 


Re: [PATCH v3 02/12] KVM: arm/arm64: Document KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION

2018-04-24 Thread Christoffer Dall
On Fri, Apr 13, 2018 at 10:20:48AM +0200, Eric Auger wrote:
> We introduce a new KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION attribute in
> KVM_DEV_ARM_VGIC_GRP_ADDR group. It allows userspace to provide the
> base address and size of a redistributor region
> 
> Compared to KVM_VGIC_V3_ADDR_TYPE_REDIST, this new attribute allows
> to declare several separate redistributor regions.
> 
> So the whole redist space does not need to be contiguous anymore.
> 
> Signed-off-by: Eric Auger 
> ---
>  Documentation/virtual/kvm/devices/arm-vgic-v3.txt | 25 
> ++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/virtual/kvm/devices/arm-vgic-v3.txt 
> b/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
> index 9293b45..cbc4328 100644
> --- a/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
> +++ b/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
> @@ -27,9 +27,32 @@ Groups:
>VCPU and all of the redistributor pages are contiguous.
>Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
>This address needs to be 64K aligned.
> +
> +KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION (rw, 64-bit)
> +  The attr field of kvm_device_attr encodes 3 values:
> +  bits: | 63     52  |  51      16 | 15 - 12  |11 - 0
> +  values:   | count  |   base  |  flags   | index
> +  - index encodes the unique redistributor region index

I'm not entirely sure I understand the purpose of the index field.
Isn't a redistributor region identified uniquely by its base address?

Otherwise this looks good.

Thanks,
-Christoffer


> +  - flags: reserved for future use, currently 0
> +  - base field encodes bits [51:16] of the guest physical base address
> +of the first redistributor in the region.
> +  - count encodes the number of redistributors in the region. Must be
> +greater than 0.
> +  There are two 64K pages for each redistributor in the region and
> +  redistributors are laid out contiguously within the region. Regions
> +  are filled with redistributors in the index order. The sum of all
> +  region count fields must be greater than or equal to the number of
> +  VCPUs. Redistributor regions must be registered in the incremental
> +  index order, starting from index 0.
> +  Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
> +
> +  It is invalid to mix calls with KVM_VGIC_V3_ADDR_TYPE_REDIST and
> +  KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION attributes.
> +
>Errors:
>  -E2BIG:  Address outside of addressable IPA range
> --EINVAL: Incorrectly aligned address
> +-EINVAL: Incorrectly aligned address, bad redistributor region
> + count/index, mixed redistributor region attribute usage
>  -EEXIST: Address already configured
>  -ENXIO:  The group or attribute is unknown/unsupported for this device
>   or hardware support is missing.
> -- 
> 2.5.5
> 


[PATCH v6] fs: dax: Adding new return type vm_fault_t

2018-04-24 Thread Souptick Joarder
Use new return type vm_fault_t for fault handler. For
now, this is just documenting that the function returns
a VM_FAULT value rather than an errno. Once all instances
are converted, vm_fault_t will become a distinct type.

commit 1c8f422059ae ("mm: change return type to vm_fault_t")

There was an existing bug inside dax_load_hole()
if vm_insert_mixed had failed to allocate a page table,
we'd return VM_FAULT_NOPAGE instead of VM_FAULT_OOM.
With new vmf_insert_mixed() this issue is addressed.

vm_insert_mixed_mkwrite has inefficiency when it returns
an error value, driver has to convert it to vm_fault_t
type. With new vmf_insert_mixed_mkwrite() this limitation
will be addressed.

Signed-off-by: Souptick Joarder 
Reviewed-by: Jan Kara 
Reviewed-by: Matthew Wilcox 
---
v2: vm_insert_mixed_mkwrite is replaced by new
vmf_insert_mixed_mkwrite

v3: Addressed Matthew's comment. One patch which
changes both at the same time. The history
should be bisectable so that it compiles and
works at every point.

v4: Updated the change log

v5: Updated the change log

v6: Added comment in source file

 fs/dax.c| 78 +
 include/linux/dax.h |  4 +--
 include/linux/mm.h  |  4 +--
 mm/memory.c | 21 ---
 4 files changed, 58 insertions(+), 49 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index aaec72de..821986c 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -905,12 +905,12 @@ static int dax_iomap_pfn(struct iomap *iomap, loff_t pos, 
size_t size,
  * If this page is ever written to we will re-fault and change the mapping to
  * point to real DAX storage instead.
  */
-static int dax_load_hole(struct address_space *mapping, void *entry,
+static vm_fault_t dax_load_hole(struct address_space *mapping, void *entry,
 struct vm_fault *vmf)
 {
struct inode *inode = mapping->host;
unsigned long vaddr = vmf->address;
-   int ret = VM_FAULT_NOPAGE;
+   vm_fault_t ret = VM_FAULT_NOPAGE;
struct page *zero_page;
void *entry2;
pfn_t pfn;
@@ -929,7 +929,7 @@ static int dax_load_hole(struct address_space *mapping, 
void *entry,
goto out;
}
 
-   vm_insert_mixed(vmf->vma, vaddr, pfn);
+   ret = vmf_insert_mixed(vmf->vma, vaddr, pfn);
 out:
trace_dax_load_hole(inode, vmf, ret);
return ret;
@@ -1112,7 +1112,7 @@ int __dax_zero_page_range(struct block_device *bdev,
 }
 EXPORT_SYMBOL_GPL(dax_iomap_rw);
 
-static int dax_fault_return(int error)
+static vm_fault_t dax_fault_return(int error)
 {
if (error == 0)
return VM_FAULT_NOPAGE;
@@ -1132,7 +1132,7 @@ static bool dax_fault_is_synchronous(unsigned long flags,
&& (iomap->flags & IOMAP_F_DIRTY);
 }
 
-static int dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
+static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
   int *iomap_errp, const struct iomap_ops *ops)
 {
struct vm_area_struct *vma = vmf->vma;
@@ -1145,18 +1145,18 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf, 
pfn_t *pfnp,
int error, major = 0;
bool write = vmf->flags & FAULT_FLAG_WRITE;
bool sync;
-   int vmf_ret = 0;
+   vm_fault_t ret = 0;
void *entry;
pfn_t pfn;
 
-   trace_dax_pte_fault(inode, vmf, vmf_ret);
+   trace_dax_pte_fault(inode, vmf, ret);
/*
 * Check whether offset isn't beyond end of file now. Caller is supposed
 * to hold locks serializing us with truncate / punch hole so this is
 * a reliable test.
 */
if (pos >= i_size_read(inode)) {
-   vmf_ret = VM_FAULT_SIGBUS;
+   ret = VM_FAULT_SIGBUS;
goto out;
}
 
@@ -1165,7 +1165,7 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf, 
pfn_t *pfnp,
 
entry = grab_mapping_entry(mapping, vmf->pgoff, 0);
if (IS_ERR(entry)) {
-   vmf_ret = dax_fault_return(PTR_ERR(entry));
+   ret = dax_fault_return(PTR_ERR(entry));
goto out;
}
 
@@ -1176,7 +1176,7 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf, 
pfn_t *pfnp,
 * retried.
 */
if (pmd_trans_huge(*vmf->pmd) || pmd_devmap(*vmf->pmd)) {
-   vmf_ret = VM_FAULT_NOPAGE;
+   ret = VM_FAULT_NOPAGE;
goto unlock_entry;
}
 
@@ -1189,7 +1189,7 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf, 
pfn_t *pfnp,
if (iomap_errp)
*iomap_errp = error;
if (error) {
-   vmf_ret = dax_fault_return(error);
+   ret = dax_fault_return(error);
goto unlock_entry;
}
if (WARN_ON_ONCE(iomap.offset + iomap.length < pos + PAGE_SIZE)) {
@@ -1219,9 +1219,9 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf, 

[PATCH v6] fs: dax: Adding new return type vm_fault_t

2018-04-24 Thread Souptick Joarder
Use new return type vm_fault_t for fault handler. For
now, this is just documenting that the function returns
a VM_FAULT value rather than an errno. Once all instances
are converted, vm_fault_t will become a distinct type.

commit 1c8f422059ae ("mm: change return type to vm_fault_t")

There was an existing bug inside dax_load_hole()
if vm_insert_mixed had failed to allocate a page table,
we'd return VM_FAULT_NOPAGE instead of VM_FAULT_OOM.
With new vmf_insert_mixed() this issue is addressed.

vm_insert_mixed_mkwrite has inefficiency when it returns
an error value, driver has to convert it to vm_fault_t
type. With new vmf_insert_mixed_mkwrite() this limitation
will be addressed.

Signed-off-by: Souptick Joarder 
Reviewed-by: Jan Kara 
Reviewed-by: Matthew Wilcox 
---
v2: vm_insert_mixed_mkwrite is replaced by new
vmf_insert_mixed_mkwrite

v3: Addressed Matthew's comment. One patch which
changes both at the same time. The history
should be bisectable so that it compiles and
works at every point.

v4: Updated the change log

v5: Updated the change log

v6: Added comment in source file

 fs/dax.c| 78 +
 include/linux/dax.h |  4 +--
 include/linux/mm.h  |  4 +--
 mm/memory.c | 21 ---
 4 files changed, 58 insertions(+), 49 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index aaec72de..821986c 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -905,12 +905,12 @@ static int dax_iomap_pfn(struct iomap *iomap, loff_t pos, 
size_t size,
  * If this page is ever written to we will re-fault and change the mapping to
  * point to real DAX storage instead.
  */
-static int dax_load_hole(struct address_space *mapping, void *entry,
+static vm_fault_t dax_load_hole(struct address_space *mapping, void *entry,
 struct vm_fault *vmf)
 {
struct inode *inode = mapping->host;
unsigned long vaddr = vmf->address;
-   int ret = VM_FAULT_NOPAGE;
+   vm_fault_t ret = VM_FAULT_NOPAGE;
struct page *zero_page;
void *entry2;
pfn_t pfn;
@@ -929,7 +929,7 @@ static int dax_load_hole(struct address_space *mapping, 
void *entry,
goto out;
}
 
-   vm_insert_mixed(vmf->vma, vaddr, pfn);
+   ret = vmf_insert_mixed(vmf->vma, vaddr, pfn);
 out:
trace_dax_load_hole(inode, vmf, ret);
return ret;
@@ -1112,7 +1112,7 @@ int __dax_zero_page_range(struct block_device *bdev,
 }
 EXPORT_SYMBOL_GPL(dax_iomap_rw);
 
-static int dax_fault_return(int error)
+static vm_fault_t dax_fault_return(int error)
 {
if (error == 0)
return VM_FAULT_NOPAGE;
@@ -1132,7 +1132,7 @@ static bool dax_fault_is_synchronous(unsigned long flags,
&& (iomap->flags & IOMAP_F_DIRTY);
 }
 
-static int dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
+static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
   int *iomap_errp, const struct iomap_ops *ops)
 {
struct vm_area_struct *vma = vmf->vma;
@@ -1145,18 +1145,18 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf, 
pfn_t *pfnp,
int error, major = 0;
bool write = vmf->flags & FAULT_FLAG_WRITE;
bool sync;
-   int vmf_ret = 0;
+   vm_fault_t ret = 0;
void *entry;
pfn_t pfn;
 
-   trace_dax_pte_fault(inode, vmf, vmf_ret);
+   trace_dax_pte_fault(inode, vmf, ret);
/*
 * Check whether offset isn't beyond end of file now. Caller is supposed
 * to hold locks serializing us with truncate / punch hole so this is
 * a reliable test.
 */
if (pos >= i_size_read(inode)) {
-   vmf_ret = VM_FAULT_SIGBUS;
+   ret = VM_FAULT_SIGBUS;
goto out;
}
 
@@ -1165,7 +1165,7 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf, 
pfn_t *pfnp,
 
entry = grab_mapping_entry(mapping, vmf->pgoff, 0);
if (IS_ERR(entry)) {
-   vmf_ret = dax_fault_return(PTR_ERR(entry));
+   ret = dax_fault_return(PTR_ERR(entry));
goto out;
}
 
@@ -1176,7 +1176,7 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf, 
pfn_t *pfnp,
 * retried.
 */
if (pmd_trans_huge(*vmf->pmd) || pmd_devmap(*vmf->pmd)) {
-   vmf_ret = VM_FAULT_NOPAGE;
+   ret = VM_FAULT_NOPAGE;
goto unlock_entry;
}
 
@@ -1189,7 +1189,7 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf, 
pfn_t *pfnp,
if (iomap_errp)
*iomap_errp = error;
if (error) {
-   vmf_ret = dax_fault_return(error);
+   ret = dax_fault_return(error);
goto unlock_entry;
}
if (WARN_ON_ONCE(iomap.offset + iomap.length < pos + PAGE_SIZE)) {
@@ -1219,9 +1219,9 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf, 
pfn_t *pfnp,
goto 

Re: [PATCH 1/3] signals: Allow generation of SIGKILL to exiting task.

2018-04-24 Thread Eric W. Biederman
Andrey Grodzovsky  writes:

> Currently calling wait_event_killable as part of exiting process
> will stall forever since SIGKILL generation is suppresed by PF_EXITING.
>
> In our partilaur case AMDGPU driver wants to flush all GPU jobs in
> flight before shutting down. But if some job hangs the pipe we still want to
> be able to kill it and avoid a process in D state.

I should clarify.  This absolutely can not be done.
PF_EXITING is set just before a task starts tearing down it's signal
handling.

So delivering any signal, or otherwise depending on signal handling
after PF_EXITING is set can not be done.  That abstraction is gone.

Eric

> Signed-off-by: Andrey Grodzovsky 
> ---
>  kernel/signal.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index c6e4c83..c49c706 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -886,10 +886,10 @@ static inline int wants_signal(int sig, struct 
> task_struct *p)
>  {
>   if (sigismember(>blocked, sig))
>   return 0;
> - if (p->flags & PF_EXITING)
> - return 0;
>   if (sig == SIGKILL)
>   return 1;
> + if (p->flags & PF_EXITING)
> + return 0;
>   if (task_is_stopped_or_traced(p))
>   return 0;
>   return task_curr(p) || !signal_pending(p);


Re: [PATCH 1/3] signals: Allow generation of SIGKILL to exiting task.

2018-04-24 Thread Eric W. Biederman
Andrey Grodzovsky  writes:

> Currently calling wait_event_killable as part of exiting process
> will stall forever since SIGKILL generation is suppresed by PF_EXITING.
>
> In our partilaur case AMDGPU driver wants to flush all GPU jobs in
> flight before shutting down. But if some job hangs the pipe we still want to
> be able to kill it and avoid a process in D state.

I should clarify.  This absolutely can not be done.
PF_EXITING is set just before a task starts tearing down it's signal
handling.

So delivering any signal, or otherwise depending on signal handling
after PF_EXITING is set can not be done.  That abstraction is gone.

Eric

> Signed-off-by: Andrey Grodzovsky 
> ---
>  kernel/signal.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index c6e4c83..c49c706 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -886,10 +886,10 @@ static inline int wants_signal(int sig, struct 
> task_struct *p)
>  {
>   if (sigismember(>blocked, sig))
>   return 0;
> - if (p->flags & PF_EXITING)
> - return 0;
>   if (sig == SIGKILL)
>   return 1;
> + if (p->flags & PF_EXITING)
> + return 0;
>   if (task_is_stopped_or_traced(p))
>   return 0;
>   return task_curr(p) || !signal_pending(p);


Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-04-24 Thread Andrey Grodzovsky



On 04/24/2018 12:23 PM, Eric W. Biederman wrote:

Andrey Grodzovsky  writes:


Avoid calling wait_event_killable when you are possibly being called
from get_signal routine since in that case you end up in a deadlock
where you are alreay blocked in singla processing any trying to wait
on a new signal.

I am curious what the call path that is problematic here.


Here is the problematic call stack

[<0>] drm_sched_entity_fini+0x10a/0x3a0 [gpu_sched]
[<0>] amdgpu_ctx_do_release+0x129/0x170 [amdgpu]
[<0>] amdgpu_ctx_mgr_fini+0xd5/0xe0 [amdgpu]
[<0>] amdgpu_driver_postclose_kms+0xcd/0x440 [amdgpu]
[<0>] drm_release+0x414/0x5b0 [drm]
[<0>] __fput+0x176/0x350
[<0>] task_work_run+0xa1/0xc0
[<0>] do_exit+0x48f/0x1280
[<0>] do_group_exit+0x89/0x140
[<0>] get_signal+0x375/0x8f0
[<0>] do_signal+0x79/0xaa0
[<0>] exit_to_usermode_loop+0x83/0xd0
[<0>] do_syscall_64+0x244/0x270
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<0>] 0x

On exit from system call you process all the signals you received and 
encounter a fatal signal which triggers process termination.




In general waiting seems wrong when the process has already been
fatally killed as indicated by PF_SIGNALED.


So indeed this patch avoids wait in this case.



Returning -ERESTARTSYS seems wrong as nothing should make it back even
to the edge of userspace here.


Can you clarify please - what should be returned here instead ?

Andrey



Given that this is the only use of PF_SIGNALED outside of bsd process
accounting I find this code very suspicious.

It looks the code path that gets called during exit is buggy and needs
to be sorted out.

Eric



Signed-off-by: Andrey Grodzovsky 
---
  drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c 
b/drivers/gpu/drm/scheduler/gpu_scheduler.c
index 088ff2b..09fd258 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
@@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler 
*sched,
return;
/**
 * The client will not queue more IBs during this fini, consume existing
-* queued IBs or discard them on SIGKILL
+* queued IBs or discard them when in death signal state since
+* wait_event_killable can't receive signals in that state.
*/
-   if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
+   if (current->flags & PF_SIGNALED)
entity->fini_status = -ERESTARTSYS;
else
entity->fini_status = wait_event_killable(sched->job_scheduled,




Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-04-24 Thread Andrey Grodzovsky



On 04/24/2018 12:23 PM, Eric W. Biederman wrote:

Andrey Grodzovsky  writes:


Avoid calling wait_event_killable when you are possibly being called
from get_signal routine since in that case you end up in a deadlock
where you are alreay blocked in singla processing any trying to wait
on a new signal.

I am curious what the call path that is problematic here.


Here is the problematic call stack

[<0>] drm_sched_entity_fini+0x10a/0x3a0 [gpu_sched]
[<0>] amdgpu_ctx_do_release+0x129/0x170 [amdgpu]
[<0>] amdgpu_ctx_mgr_fini+0xd5/0xe0 [amdgpu]
[<0>] amdgpu_driver_postclose_kms+0xcd/0x440 [amdgpu]
[<0>] drm_release+0x414/0x5b0 [drm]
[<0>] __fput+0x176/0x350
[<0>] task_work_run+0xa1/0xc0
[<0>] do_exit+0x48f/0x1280
[<0>] do_group_exit+0x89/0x140
[<0>] get_signal+0x375/0x8f0
[<0>] do_signal+0x79/0xaa0
[<0>] exit_to_usermode_loop+0x83/0xd0
[<0>] do_syscall_64+0x244/0x270
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<0>] 0x

On exit from system call you process all the signals you received and 
encounter a fatal signal which triggers process termination.




In general waiting seems wrong when the process has already been
fatally killed as indicated by PF_SIGNALED.


So indeed this patch avoids wait in this case.



Returning -ERESTARTSYS seems wrong as nothing should make it back even
to the edge of userspace here.


Can you clarify please - what should be returned here instead ?

Andrey



Given that this is the only use of PF_SIGNALED outside of bsd process
accounting I find this code very suspicious.

It looks the code path that gets called during exit is buggy and needs
to be sorted out.

Eric



Signed-off-by: Andrey Grodzovsky 
---
  drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c 
b/drivers/gpu/drm/scheduler/gpu_scheduler.c
index 088ff2b..09fd258 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
@@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler 
*sched,
return;
/**
 * The client will not queue more IBs during this fini, consume existing
-* queued IBs or discard them on SIGKILL
+* queued IBs or discard them when in death signal state since
+* wait_event_killable can't receive signals in that state.
*/
-   if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
+   if (current->flags & PF_SIGNALED)
entity->fini_status = -ERESTARTSYS;
else
entity->fini_status = wait_event_killable(sched->job_scheduled,




Re: [PATCH 4/6] drm/panel: simple: Add support for Banana Pi 7" S070WV20-CT16 panel

2018-04-24 Thread Rob Herring
On Thu, Apr 19, 2018 at 05:32:23PM +0800, Chen-Yu Tsai wrote:
> This panel is marketed as Banana Pi 7" LCD display. On the back is
> a sticker denoting the model name S070WV20-CT16.
> 
> This is a 7" 800x480 panel connected through a 24-bit RGB interface.
> However the panel only does 262k colors.
> 
> Signed-off-by: Chen-Yu Tsai 
> ---
>  .../display/panel/bananapi,s070wv20-ct16.txt  |  7 ++
>  drivers/gpu/drm/panel/panel-simple.c  | 25 +++
>  2 files changed, 32 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/panel/bananapi,s070wv20-ct16.txt
> 
> diff --git 
> a/Documentation/devicetree/bindings/display/panel/bananapi,s070wv20-ct16.txt 
> b/Documentation/devicetree/bindings/display/panel/bananapi,s070wv20-ct16.txt
> new file mode 100644
> index ..2ec35ce36e9a
> --- /dev/null
> +++ 
> b/Documentation/devicetree/bindings/display/panel/bananapi,s070wv20-ct16.txt
> @@ -0,0 +1,7 @@
> +Banana Pi 7" (S070WV20-CT16) TFT LCD Panel
> +
> +Required properties:
> +- compatible: should be "bananapi,s070wv20-ct16"
> +
> +This binding is compatible with the simple-panel binding, which is specified
> +in simple-panel.txt in this directory.

Does this panel have a power supply (or more than 1) or backlight? I 
can't know that by just refering to simple-panel.txt.



Re: [patch V3 00/10] rslib: Cleanup and VLA removal

2018-04-24 Thread Kees Cook
On Sun, Apr 22, 2018 at 9:23 AM, Thomas Gleixner  wrote:
> Kees tried to get rid of the Variable Length Arrays in the Reed-Solomon
> library by replacing them with fixed length arrays on stack. Though they
> are rather large and Andrew did not fall in love with that solution.
>
> This series addresses that in a different way by splitting the rs control
> structure up, so that each invocation of rs_init() returns a new instance
> in which the decoder buffers are allocated. The polynom tables which build
> the base for the RS codecs are still shareable between the instances to
> avoid large allocations and initializations of the same data over and over.
>
> The usage sites have been audited and fixed up where necessary to
> accomodate the decoder change which forbids parallel decoder invocation for
> a particular rs control instance to prevent buffer corruption.
>
> While at it the patch set tidies up the code and converts the related files
> over to use SPDX license identifiers.
>
> Changes since V2:
>
>Provide a gfp aware variant of init_rs and use it in the alloc callback
>of the dm/verity_fec rs_pool mempool allocations
>
>Picked up Reviewed/Acked-by tags and fixed the subject line for the MTD
>specific patch.

Excellent, thanks!

> Changes since V1:
>
>Simplify error path in the diskonchip code and use the proper
>function to free the decoder.
>
> As this spawns multiple subsystems it should either go through Andrews tree
> or Kees can route it with his other hardening stuff.

I'll create a VLA sub-tree to my KSPP stuff and start collecting this
and other VLA fixes that maintainers haven't taken yet. Once
build-testing finishes, I'll push it out for -next.

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH 4/6] drm/panel: simple: Add support for Banana Pi 7" S070WV20-CT16 panel

2018-04-24 Thread Rob Herring
On Thu, Apr 19, 2018 at 05:32:23PM +0800, Chen-Yu Tsai wrote:
> This panel is marketed as Banana Pi 7" LCD display. On the back is
> a sticker denoting the model name S070WV20-CT16.
> 
> This is a 7" 800x480 panel connected through a 24-bit RGB interface.
> However the panel only does 262k colors.
> 
> Signed-off-by: Chen-Yu Tsai 
> ---
>  .../display/panel/bananapi,s070wv20-ct16.txt  |  7 ++
>  drivers/gpu/drm/panel/panel-simple.c  | 25 +++
>  2 files changed, 32 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/panel/bananapi,s070wv20-ct16.txt
> 
> diff --git 
> a/Documentation/devicetree/bindings/display/panel/bananapi,s070wv20-ct16.txt 
> b/Documentation/devicetree/bindings/display/panel/bananapi,s070wv20-ct16.txt
> new file mode 100644
> index ..2ec35ce36e9a
> --- /dev/null
> +++ 
> b/Documentation/devicetree/bindings/display/panel/bananapi,s070wv20-ct16.txt
> @@ -0,0 +1,7 @@
> +Banana Pi 7" (S070WV20-CT16) TFT LCD Panel
> +
> +Required properties:
> +- compatible: should be "bananapi,s070wv20-ct16"
> +
> +This binding is compatible with the simple-panel binding, which is specified
> +in simple-panel.txt in this directory.

Does this panel have a power supply (or more than 1) or backlight? I 
can't know that by just refering to simple-panel.txt.



Re: [patch V3 00/10] rslib: Cleanup and VLA removal

2018-04-24 Thread Kees Cook
On Sun, Apr 22, 2018 at 9:23 AM, Thomas Gleixner  wrote:
> Kees tried to get rid of the Variable Length Arrays in the Reed-Solomon
> library by replacing them with fixed length arrays on stack. Though they
> are rather large and Andrew did not fall in love with that solution.
>
> This series addresses that in a different way by splitting the rs control
> structure up, so that each invocation of rs_init() returns a new instance
> in which the decoder buffers are allocated. The polynom tables which build
> the base for the RS codecs are still shareable between the instances to
> avoid large allocations and initializations of the same data over and over.
>
> The usage sites have been audited and fixed up where necessary to
> accomodate the decoder change which forbids parallel decoder invocation for
> a particular rs control instance to prevent buffer corruption.
>
> While at it the patch set tidies up the code and converts the related files
> over to use SPDX license identifiers.
>
> Changes since V2:
>
>Provide a gfp aware variant of init_rs and use it in the alloc callback
>of the dm/verity_fec rs_pool mempool allocations
>
>Picked up Reviewed/Acked-by tags and fixed the subject line for the MTD
>specific patch.

Excellent, thanks!

> Changes since V1:
>
>Simplify error path in the diskonchip code and use the proper
>function to free the decoder.
>
> As this spawns multiple subsystems it should either go through Andrews tree
> or Kees can route it with his other hardening stuff.

I'll create a VLA sub-tree to my KSPP stuff and start collecting this
and other VLA fixes that maintainers haven't taken yet. Once
build-testing finishes, I'll push it out for -next.

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH 01/10] vfio: ccw: Moving state change out of IRQ context

2018-04-24 Thread Halil Pasic



On 04/24/2018 11:59 AM, Cornelia Huck wrote:

On Tue, 24 Apr 2018 10:40:56 +0200
Pierre Morel  wrote:


On 24/04/2018 08:54, Dong Jia Shi wrote:

* Pierre Morel  [2018-04-19 16:48:04 +0200]:

[...]
  

@@ -94,9 +83,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
   static void vfio_ccw_sch_irq(struct subchannel *sch)
   {
struct vfio_ccw_private *private = dev_get_drvdata(>dev);
+   struct irb *irb = this_cpu_ptr(_irb);

inc_irq_stat(IRQIO_CIO);
-   vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
+   memcpy(>irb, irb, sizeof(*irb));
+
+   WARN_ON(work_pending(>io_work));

Hmm, why do we need this?


The current design insure that we have not two concurrent SSCH requests.
How ever I want here to track spurious interrupt.
If we implement cancel, halt or clear requests, we also may trigger (AFAIU)
a second interrupts depending on races between instructions, controller
and device.


You won't get an interrupt for a successful cancel. If you do a
halt/clear, you will make the subchannel halt/clear pending in addition
to start pending and you'll only get one interrupt (if the I/O has
progressed far enough, you won't be able to issue a hsch). The
interesting case is:
- guest does a ssch, we do a ssch on the device
- the guest does a csch before it got the interrupt for the ssch
- before we do the csch on the device, the subchannel is already status
   pending with completion of the ssch
- after we issue the csch, we get a second interrupt (for the csch)

I think we should present two interrupts to the guest in that case.
Races between issuing ssch/hsch/csch and the subchannel becoming status
pending happen on real hardware as well, we're just more likely to see
them with the vfio layer in between.



AFAIU this will be the problem of the person implementing the clear
and the halt for vfio-ccw. I.e. it's a non-problem right now.


(I'm currently trying to recall what we're doing with unsolicited
interrupts. These are fun wrt deferred cc 1; I'm not sure if there are
cases where we want to present a deferred cc to the guest.)


What are 'fun wrt deferred cc 1' again? The deferred cc I understand
but wrt does not click at all.



Also, doing a second ssch before we got final state for the first one
is perfectly valid. Linux just does not do it, so I'm not sure if we
should invest too much time there.



We do not need it strongly.

  

+   queue_work(vfio_ccw_work_q, >io_work);
+   if (private->completion)
+   complete(private->completion);
   }

   static int vfio_ccw_sch_probe(struct subchannel *sch)

[...]
  








Re: [PATCH 01/10] vfio: ccw: Moving state change out of IRQ context

2018-04-24 Thread Halil Pasic



On 04/24/2018 11:59 AM, Cornelia Huck wrote:

On Tue, 24 Apr 2018 10:40:56 +0200
Pierre Morel  wrote:


On 24/04/2018 08:54, Dong Jia Shi wrote:

* Pierre Morel  [2018-04-19 16:48:04 +0200]:

[...]
  

@@ -94,9 +83,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
   static void vfio_ccw_sch_irq(struct subchannel *sch)
   {
struct vfio_ccw_private *private = dev_get_drvdata(>dev);
+   struct irb *irb = this_cpu_ptr(_irb);

inc_irq_stat(IRQIO_CIO);
-   vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
+   memcpy(>irb, irb, sizeof(*irb));
+
+   WARN_ON(work_pending(>io_work));

Hmm, why do we need this?


The current design insure that we have not two concurrent SSCH requests.
How ever I want here to track spurious interrupt.
If we implement cancel, halt or clear requests, we also may trigger (AFAIU)
a second interrupts depending on races between instructions, controller
and device.


You won't get an interrupt for a successful cancel. If you do a
halt/clear, you will make the subchannel halt/clear pending in addition
to start pending and you'll only get one interrupt (if the I/O has
progressed far enough, you won't be able to issue a hsch). The
interesting case is:
- guest does a ssch, we do a ssch on the device
- the guest does a csch before it got the interrupt for the ssch
- before we do the csch on the device, the subchannel is already status
   pending with completion of the ssch
- after we issue the csch, we get a second interrupt (for the csch)

I think we should present two interrupts to the guest in that case.
Races between issuing ssch/hsch/csch and the subchannel becoming status
pending happen on real hardware as well, we're just more likely to see
them with the vfio layer in between.



AFAIU this will be the problem of the person implementing the clear
and the halt for vfio-ccw. I.e. it's a non-problem right now.


(I'm currently trying to recall what we're doing with unsolicited
interrupts. These are fun wrt deferred cc 1; I'm not sure if there are
cases where we want to present a deferred cc to the guest.)


What are 'fun wrt deferred cc 1' again? The deferred cc I understand
but wrt does not click at all.



Also, doing a second ssch before we got final state for the first one
is perfectly valid. Linux just does not do it, so I'm not sure if we
should invest too much time there.



We do not need it strongly.

  

+   queue_work(vfio_ccw_work_q, >io_work);
+   if (private->completion)
+   complete(private->completion);
   }

   static int vfio_ccw_sch_probe(struct subchannel *sch)

[...]
  








Re: [PATCH 1/1] tools: power/acpi, revert to LD = gcc

2018-04-24 Thread Martin Kelly
On Tue, Apr 24, 2018 at 12:43 AM, Jiri Slaby  wrote:
> Commit 7ed1c1901fe5 (tools: fix cross-compile var clobbering) removed
> setting of LD to $(CROSS_COMPILE)gcc. This broke build of acpica
> (acpidump) in power/acpi:
>  ld: unrecognized option '-D_LINUX'
>
> The tools pass CFLAGS to the linker (incl. -D_LINUX), so revert this
> particular change and let LD be $(CC) again. Note that the old behaviour
> was a bit different, it used $(CROSS_COMPILE)gcc which was eliminated by
> the commit 7ed1c1901fe5. We use $(CC) for that reason.
>
> Signed-off-by: Jiri Slaby 
> Cc: Martin Kelly 
> Cc: "Rafael J. Wysocki" 
> Cc: Len Brown 
> Cc: Robert Moore 
> Cc: Erik Schmauss 
> Cc: linux-a...@vger.kernel.org
> Cc: de...@acpica.org
> ---
>  tools/power/acpi/Makefile.config | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/tools/power/acpi/Makefile.config 
> b/tools/power/acpi/Makefile.config
> index 2cccbba64418..f304be71c278 100644
> --- a/tools/power/acpi/Makefile.config
> +++ b/tools/power/acpi/Makefile.config
> @@ -56,6 +56,7 @@ INSTALL_SCRIPT = ${INSTALL_PROGRAM}
>  # to compile vs uClibc, that can be done here as well.
>  CROSS = #/usr/i386-linux-uclibc/usr/bin/i386-uclibc-
>  CROSS_COMPILE ?= $(CROSS)
> +LD = $(CC)
>  HOSTCC = gcc
>
>  # check if compiler option is supported
> --
> 2.16.3
>

Reviewed-by: Martin Kelly 

Thanks for noticing this and sorry I missed it!


Re: [PATCH 1/1] tools: power/acpi, revert to LD = gcc

2018-04-24 Thread Martin Kelly
On Tue, Apr 24, 2018 at 12:43 AM, Jiri Slaby  wrote:
> Commit 7ed1c1901fe5 (tools: fix cross-compile var clobbering) removed
> setting of LD to $(CROSS_COMPILE)gcc. This broke build of acpica
> (acpidump) in power/acpi:
>  ld: unrecognized option '-D_LINUX'
>
> The tools pass CFLAGS to the linker (incl. -D_LINUX), so revert this
> particular change and let LD be $(CC) again. Note that the old behaviour
> was a bit different, it used $(CROSS_COMPILE)gcc which was eliminated by
> the commit 7ed1c1901fe5. We use $(CC) for that reason.
>
> Signed-off-by: Jiri Slaby 
> Cc: Martin Kelly 
> Cc: "Rafael J. Wysocki" 
> Cc: Len Brown 
> Cc: Robert Moore 
> Cc: Erik Schmauss 
> Cc: linux-a...@vger.kernel.org
> Cc: de...@acpica.org
> ---
>  tools/power/acpi/Makefile.config | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/tools/power/acpi/Makefile.config 
> b/tools/power/acpi/Makefile.config
> index 2cccbba64418..f304be71c278 100644
> --- a/tools/power/acpi/Makefile.config
> +++ b/tools/power/acpi/Makefile.config
> @@ -56,6 +56,7 @@ INSTALL_SCRIPT = ${INSTALL_PROGRAM}
>  # to compile vs uClibc, that can be done here as well.
>  CROSS = #/usr/i386-linux-uclibc/usr/bin/i386-uclibc-
>  CROSS_COMPILE ?= $(CROSS)
> +LD = $(CC)
>  HOSTCC = gcc
>
>  # check if compiler option is supported
> --
> 2.16.3
>

Reviewed-by: Martin Kelly 

Thanks for noticing this and sorry I missed it!


Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.

2018-04-24 Thread Andrey Grodzovsky



On 04/24/2018 12:14 PM, Eric W. Biederman wrote:

Andrey Grodzovsky  writes:


If the ring is hanging for some reason allow to recover the waiting
by sending fatal signal.

Originally-by: David Panariti 
Signed-off-by: Andrey Grodzovsky 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++
  1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index eb80edf..37a36af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, 
unsigned ring_id)
  
  	if (other) {

signed long r;
-   r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
-   if (r < 0) {
-   DRM_ERROR("Error (%ld) waiting for fence!\n", r);
-   return r;
+
+   while (true) {
+   if ((r = dma_fence_wait_timeout(other, true,
+   MAX_SCHEDULE_TIMEOUT)) >= 0)
+   return 0;
+
+   if (fatal_signal_pending(current)) {
+   DRM_ERROR("Error (%ld) waiting for fence!\n", 
r);
+   return r;
+   }

It looks like if you make this code say:
if (fatal_signal_pending(current) ||
(current->flags & PF_EXITING)) {
DRM_ERROR("Error (%ld) waiting for fence!\n", 
r);
return r;

}
}

Than you would not need the horrible hack want_signal to deliver signals
to processes who have passed exit_signal() and don't expect to need
their signal handling mechanisms anymore.


Let me clarify,  the change in want_signal wasn't addressing this code 
but hang in
drm_sched_entity_do_release->wait_event_killable, when you try to 
gracefully terminate by waiting

for all job completions on the GPU pipe you process is using.

Andrey



Eric





Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.

2018-04-24 Thread Andrey Grodzovsky



On 04/24/2018 12:14 PM, Eric W. Biederman wrote:

Andrey Grodzovsky  writes:


If the ring is hanging for some reason allow to recover the waiting
by sending fatal signal.

Originally-by: David Panariti 
Signed-off-by: Andrey Grodzovsky 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++
  1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index eb80edf..37a36af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, 
unsigned ring_id)
  
  	if (other) {

signed long r;
-   r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
-   if (r < 0) {
-   DRM_ERROR("Error (%ld) waiting for fence!\n", r);
-   return r;
+
+   while (true) {
+   if ((r = dma_fence_wait_timeout(other, true,
+   MAX_SCHEDULE_TIMEOUT)) >= 0)
+   return 0;
+
+   if (fatal_signal_pending(current)) {
+   DRM_ERROR("Error (%ld) waiting for fence!\n", 
r);
+   return r;
+   }

It looks like if you make this code say:
if (fatal_signal_pending(current) ||
(current->flags & PF_EXITING)) {
DRM_ERROR("Error (%ld) waiting for fence!\n", 
r);
return r;

}
}

Than you would not need the horrible hack want_signal to deliver signals
to processes who have passed exit_signal() and don't expect to need
their signal handling mechanisms anymore.


Let me clarify,  the change in want_signal wasn't addressing this code 
but hang in
drm_sched_entity_do_release->wait_event_killable, when you try to 
gracefully terminate by waiting

for all job completions on the GPU pipe you process is using.

Andrey



Eric





Re: [PATCH 2/8] dt-bindings: display: bridge: thc63lvd1024: Add lvds map property

2018-04-24 Thread Rob Herring
On Thu, Apr 19, 2018 at 11:31:03AM +0200, Jacopo Mondi wrote:
> The THC63LVD1024 LVDS to RGB bridge supports two different input mapping
> modes, selectable by means of an external pin.
> 
> Describe the LVDS mode map through a newly defined mandatory property in
> device tree bindings.
> 
> Signed-off-by: Jacopo Mondi 
> ---
>  .../devicetree/bindings/display/bridge/thine,thc63lvd1024.txt  | 3 
> +++
>  1 file changed, 3 insertions(+)

+1 for what Laurent said.

Reviewed-by: Rob Herring 


Re: [PATCH 2/8] dt-bindings: display: bridge: thc63lvd1024: Add lvds map property

2018-04-24 Thread Rob Herring
On Thu, Apr 19, 2018 at 11:31:03AM +0200, Jacopo Mondi wrote:
> The THC63LVD1024 LVDS to RGB bridge supports two different input mapping
> modes, selectable by means of an external pin.
> 
> Describe the LVDS mode map through a newly defined mandatory property in
> device tree bindings.
> 
> Signed-off-by: Jacopo Mondi 
> ---
>  .../devicetree/bindings/display/bridge/thine,thc63lvd1024.txt  | 3 
> +++
>  1 file changed, 3 insertions(+)

+1 for what Laurent said.

Reviewed-by: Rob Herring 


Re: [PATCH] net: phy: allow scanning busses with missing phys

2018-04-24 Thread Florian Fainelli


On 04/24/2018 09:09 AM, Alexandre Belloni wrote:
> Some MDIO busses will error out when trying to read a phy address with no
> phy present at that address. In that case, probing the bus will fail
> because __mdiobus_register() is scanning the bus for all possible phys
> addresses.
> 
> In case MII_PHYSID1 returns -EIO or -ENODEV, consider there is no phy at
> this address and set the phy ID to 0x which is then properly
> handled in get_phy_device().

Humm, why not have your MDIO bus implementation do the scanning itself
in a reset() callback, which happens before probing the bus, and based
on the results, set phy_mask accordingly such that only PHYs present are
populated?

My only concern with your change is that we are having a special
treatment for EIO and ENODEV, so we must make sure MDIO bus drivers are
all conforming to that.

> 
> Suggested-by: Andrew Lunn 
> Signed-off-by: Alexandre Belloni 
> ---
>  drivers/net/phy/phy_device.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index ac23322a32e1..9e4ba8e80a18 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -535,8 +535,17 @@ static int get_phy_id(struct mii_bus *bus, int addr, u32 
> *phy_id,
>  
>   /* Grab the bits from PHYIR1, and put them in the upper half */
>   phy_reg = mdiobus_read(bus, addr, MII_PHYSID1);
> - if (phy_reg < 0)
> + if (phy_reg < 0) {
> + /* if there is no device, return without an error so scanning
> +  * the bus works properly
> +  */
> + if (phy_reg == -EIO || phy_reg == -ENODEV) {
> + *phy_id = 0x;
> + return 0;
> + }
> +
>   return -EIO;
> + }
>  
>   *phy_id = (phy_reg & 0x) << 16;
>  
> 

-- 
Florian


Re: [PATCH] net: phy: allow scanning busses with missing phys

2018-04-24 Thread Florian Fainelli


On 04/24/2018 09:09 AM, Alexandre Belloni wrote:
> Some MDIO busses will error out when trying to read a phy address with no
> phy present at that address. In that case, probing the bus will fail
> because __mdiobus_register() is scanning the bus for all possible phys
> addresses.
> 
> In case MII_PHYSID1 returns -EIO or -ENODEV, consider there is no phy at
> this address and set the phy ID to 0x which is then properly
> handled in get_phy_device().

Humm, why not have your MDIO bus implementation do the scanning itself
in a reset() callback, which happens before probing the bus, and based
on the results, set phy_mask accordingly such that only PHYs present are
populated?

My only concern with your change is that we are having a special
treatment for EIO and ENODEV, so we must make sure MDIO bus drivers are
all conforming to that.

> 
> Suggested-by: Andrew Lunn 
> Signed-off-by: Alexandre Belloni 
> ---
>  drivers/net/phy/phy_device.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index ac23322a32e1..9e4ba8e80a18 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -535,8 +535,17 @@ static int get_phy_id(struct mii_bus *bus, int addr, u32 
> *phy_id,
>  
>   /* Grab the bits from PHYIR1, and put them in the upper half */
>   phy_reg = mdiobus_read(bus, addr, MII_PHYSID1);
> - if (phy_reg < 0)
> + if (phy_reg < 0) {
> + /* if there is no device, return without an error so scanning
> +  * the bus works properly
> +  */
> + if (phy_reg == -EIO || phy_reg == -ENODEV) {
> + *phy_id = 0x;
> + return 0;
> + }
> +
>   return -EIO;
> + }
>  
>   *phy_id = (phy_reg & 0x) << 16;
>  
> 

-- 
Florian


Re: [PATCH] net: sh-eth: fix sh_eth_start_xmit()'s return type

2018-04-24 Thread Sergei Shtylyov
Hello!

On 04/24/2018 04:17 PM, Luc Van Oostenryck wrote:

> The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
> which is a typedef for an enum type, but the implementation in this
> driver returns an 'int'.
> 
> Fix this by returning 'netdev_tx_t' in this driver too.
> 
> Signed-off-by: Luc Van Oostenryck 

Acked-by: Sergei Shtylyov 

[...]
> diff --git a/drivers/net/ethernet/renesas/sh_eth.c 
> b/drivers/net/ethernet/renesas/sh_eth.c
> index b6b90a631..0875a169f 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -2454,7 +2454,7 @@ static void sh_eth_tx_timeout(struct net_device *ndev)
>  }
>  
>  /* Packet transmit function */
> -static int sh_eth_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> +static netdev_tx_t sh_eth_start_xmit(struct sk_buff *skb, struct net_device 
> *ndev)

   But aren't you violating 80-column limit?

[...]

MBR, Sergei


Re: [PATCH] net: sh-eth: fix sh_eth_start_xmit()'s return type

2018-04-24 Thread Sergei Shtylyov
Hello!

On 04/24/2018 04:17 PM, Luc Van Oostenryck wrote:

> The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
> which is a typedef for an enum type, but the implementation in this
> driver returns an 'int'.
> 
> Fix this by returning 'netdev_tx_t' in this driver too.
> 
> Signed-off-by: Luc Van Oostenryck 

Acked-by: Sergei Shtylyov 

[...]
> diff --git a/drivers/net/ethernet/renesas/sh_eth.c 
> b/drivers/net/ethernet/renesas/sh_eth.c
> index b6b90a631..0875a169f 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -2454,7 +2454,7 @@ static void sh_eth_tx_timeout(struct net_device *ndev)
>  }
>  
>  /* Packet transmit function */
> -static int sh_eth_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> +static netdev_tx_t sh_eth_start_xmit(struct sk_buff *skb, struct net_device 
> *ndev)

   But aren't you violating 80-column limit?

[...]

MBR, Sergei


Re: [PATCH 2/3] ASoC: AMD: Move clk enable from hw_params/free to startup/shutdown

2018-04-24 Thread Daniel Kurtz
On Mon, Apr 23, 2018 at 9:03 PM Vijendar Mukunda 
wrote:

> From: Akshu Agrawal 

> hw_param can be called multiple times and thus we can have
> more clk enable. The clk may not get diabled due to refcounting.
> startup/shutdown ensures single clk enable/disable call.

> Signed-off-by: Akshu Agrawal 
> Signed-off-by: Vijendar Mukunda 
> ---
>   sound/soc/amd/acp-da7219-max98357a.c | 54

>   1 file changed, 37 insertions(+), 17 deletions(-)

> diff --git a/sound/soc/amd/acp-da7219-max98357a.c
b/sound/soc/amd/acp-da7219-max98357a.c
> index b205c78..0f16f6d 100644
> --- a/sound/soc/amd/acp-da7219-max98357a.c
> +++ b/sound/soc/amd/acp-da7219-max98357a.c
> @@ -38,8 +38,7 @@
>   #include "../codecs/da7219.h"
>   #include "../codecs/da7219-aad.h"

> -#define CZ_PLAT_CLK 2400
> -#define MCLK_RATE 24576000
> +#define CZ_PLAT_CLK 2500
>   #define DUAL_CHANNEL   2

>   static struct snd_soc_jack cz_jack;
> @@ -62,7 +61,7 @@ static int cz_da7219_init(struct snd_soc_pcm_runtime
*rtd)
>  }

>  ret = snd_soc_dai_set_pll(codec_dai, 0, DA7219_SYSCLK_PLL,
> - CZ_PLAT_CLK, MCLK_RATE);
> + CZ_PLAT_CLK, DA7219_PLL_FREQ_OUT_98304);

These are unrelated fixes that should be in their own patch.

>  if (ret < 0) {
>  dev_err(rtd->dev, "can't set codec pll: %d\n", ret);
>  return ret;
> @@ -85,8 +84,7 @@ static int cz_da7219_init(struct snd_soc_pcm_runtime
*rtd)
>  return 0;
>   }

> -static int cz_da7219_hw_params(struct snd_pcm_substream *substream,
> -struct snd_pcm_hw_params *params)
> +static int da7219_clk_enable(struct snd_pcm_substream *substream)
>   {
>  int ret = 0;
>  struct snd_soc_pcm_runtime *rtd = substream->private_data;
> @@ -100,11 +98,9 @@ static int cz_da7219_hw_params(struct
snd_pcm_substream *substream,
>  return ret;
>   }

> -static int cz_da7219_hw_free(struct snd_pcm_substream *substream)
> +static void da7219_clk_disable(void)
>   {
>  clk_disable_unprepare(da7219_dai_clk);
> -
> -   return 0;
>   }

>   static const unsigned int channels[] = {
> @@ -127,7 +123,7 @@ static const struct snd_pcm_hw_constraint_list
constraints_channels = {
>  .mask = 0,
>   };

> -static int cz_fe_startup(struct snd_pcm_substream *substream)
> +static int cz_da7219_startup(struct snd_pcm_substream *substream)
>   {
>  struct snd_pcm_runtime *runtime = substream->runtime;

> @@ -141,23 +137,47 @@ static int cz_fe_startup(struct snd_pcm_substream
*substream)
>  snd_pcm_hw_constraint_list(runtime, 0, SNDRV_PCM_HW_PARAM_RATE,
> _rates);

> -   return 0;
> +   return da7219_clk_enable(substream);
> +}
> +
> +static void cz_da7219_shutdown(struct snd_pcm_substream *substream)
> +{
> +   da7219_clk_disable();
> +}
> +
> +static int cz_max_startup(struct snd_pcm_substream *substream)
> +{
> +   return da7219_clk_enable(substream);
> +}
> +
> +static void cz_max_shutdown(struct snd_pcm_substream *substream)
> +{
> +   da7219_clk_disable();
> +}
> +
> +static int cz_dmic_startup(struct snd_pcm_substream *substream)
> +{
> +   return da7219_clk_enable(substream);
> +}
> +
> +static void cz_dmic_shutdown(struct snd_pcm_substream *substream)
> +{
> +   da7219_clk_disable();
>   }

This is ok, or you could combine the common cz_max_* & cz_dmic_*.


>   static struct snd_soc_ops cz_da7219_cap_ops = {

I think these should all be "static const struct snd_soc_ops"  (please fix
in a separate patch).

> -   .hw_params = cz_da7219_hw_params,
> -   .hw_free = cz_da7219_hw_free,
> -   .startup = cz_fe_startup,
> +   .startup = cz_da7219_startup,
> +   .shutdown = cz_da7219_shutdown,
>   };

>   static struct snd_soc_ops cz_max_play_ops = {
> -   .hw_params = cz_da7219_hw_params,
> -   .hw_free = cz_da7219_hw_free,
> +   .startup = cz_max_startup,
> +   .shutdown = cz_max_shutdown,
>   };

>   static struct snd_soc_ops cz_dmic_cap_ops = {
> -   .hw_params = cz_da7219_hw_params,
> -   .hw_free = cz_da7219_hw_free,
> +   .startup = cz_dmic_startup,
> +   .shutdown = cz_dmic_shutdown,
>   };

>   static struct snd_soc_dai_link cz_dai_7219_98357[] = {
> --
> 2.7.4


Re: [PATCH 2/3] ASoC: AMD: Move clk enable from hw_params/free to startup/shutdown

2018-04-24 Thread Daniel Kurtz
On Mon, Apr 23, 2018 at 9:03 PM Vijendar Mukunda 
wrote:

> From: Akshu Agrawal 

> hw_param can be called multiple times and thus we can have
> more clk enable. The clk may not get diabled due to refcounting.
> startup/shutdown ensures single clk enable/disable call.

> Signed-off-by: Akshu Agrawal 
> Signed-off-by: Vijendar Mukunda 
> ---
>   sound/soc/amd/acp-da7219-max98357a.c | 54

>   1 file changed, 37 insertions(+), 17 deletions(-)

> diff --git a/sound/soc/amd/acp-da7219-max98357a.c
b/sound/soc/amd/acp-da7219-max98357a.c
> index b205c78..0f16f6d 100644
> --- a/sound/soc/amd/acp-da7219-max98357a.c
> +++ b/sound/soc/amd/acp-da7219-max98357a.c
> @@ -38,8 +38,7 @@
>   #include "../codecs/da7219.h"
>   #include "../codecs/da7219-aad.h"

> -#define CZ_PLAT_CLK 2400
> -#define MCLK_RATE 24576000
> +#define CZ_PLAT_CLK 2500
>   #define DUAL_CHANNEL   2

>   static struct snd_soc_jack cz_jack;
> @@ -62,7 +61,7 @@ static int cz_da7219_init(struct snd_soc_pcm_runtime
*rtd)
>  }

>  ret = snd_soc_dai_set_pll(codec_dai, 0, DA7219_SYSCLK_PLL,
> - CZ_PLAT_CLK, MCLK_RATE);
> + CZ_PLAT_CLK, DA7219_PLL_FREQ_OUT_98304);

These are unrelated fixes that should be in their own patch.

>  if (ret < 0) {
>  dev_err(rtd->dev, "can't set codec pll: %d\n", ret);
>  return ret;
> @@ -85,8 +84,7 @@ static int cz_da7219_init(struct snd_soc_pcm_runtime
*rtd)
>  return 0;
>   }

> -static int cz_da7219_hw_params(struct snd_pcm_substream *substream,
> -struct snd_pcm_hw_params *params)
> +static int da7219_clk_enable(struct snd_pcm_substream *substream)
>   {
>  int ret = 0;
>  struct snd_soc_pcm_runtime *rtd = substream->private_data;
> @@ -100,11 +98,9 @@ static int cz_da7219_hw_params(struct
snd_pcm_substream *substream,
>  return ret;
>   }

> -static int cz_da7219_hw_free(struct snd_pcm_substream *substream)
> +static void da7219_clk_disable(void)
>   {
>  clk_disable_unprepare(da7219_dai_clk);
> -
> -   return 0;
>   }

>   static const unsigned int channels[] = {
> @@ -127,7 +123,7 @@ static const struct snd_pcm_hw_constraint_list
constraints_channels = {
>  .mask = 0,
>   };

> -static int cz_fe_startup(struct snd_pcm_substream *substream)
> +static int cz_da7219_startup(struct snd_pcm_substream *substream)
>   {
>  struct snd_pcm_runtime *runtime = substream->runtime;

> @@ -141,23 +137,47 @@ static int cz_fe_startup(struct snd_pcm_substream
*substream)
>  snd_pcm_hw_constraint_list(runtime, 0, SNDRV_PCM_HW_PARAM_RATE,
> _rates);

> -   return 0;
> +   return da7219_clk_enable(substream);
> +}
> +
> +static void cz_da7219_shutdown(struct snd_pcm_substream *substream)
> +{
> +   da7219_clk_disable();
> +}
> +
> +static int cz_max_startup(struct snd_pcm_substream *substream)
> +{
> +   return da7219_clk_enable(substream);
> +}
> +
> +static void cz_max_shutdown(struct snd_pcm_substream *substream)
> +{
> +   da7219_clk_disable();
> +}
> +
> +static int cz_dmic_startup(struct snd_pcm_substream *substream)
> +{
> +   return da7219_clk_enable(substream);
> +}
> +
> +static void cz_dmic_shutdown(struct snd_pcm_substream *substream)
> +{
> +   da7219_clk_disable();
>   }

This is ok, or you could combine the common cz_max_* & cz_dmic_*.


>   static struct snd_soc_ops cz_da7219_cap_ops = {

I think these should all be "static const struct snd_soc_ops"  (please fix
in a separate patch).

> -   .hw_params = cz_da7219_hw_params,
> -   .hw_free = cz_da7219_hw_free,
> -   .startup = cz_fe_startup,
> +   .startup = cz_da7219_startup,
> +   .shutdown = cz_da7219_shutdown,
>   };

>   static struct snd_soc_ops cz_max_play_ops = {
> -   .hw_params = cz_da7219_hw_params,
> -   .hw_free = cz_da7219_hw_free,
> +   .startup = cz_max_startup,
> +   .shutdown = cz_max_shutdown,
>   };

>   static struct snd_soc_ops cz_dmic_cap_ops = {
> -   .hw_params = cz_da7219_hw_params,
> -   .hw_free = cz_da7219_hw_free,
> +   .startup = cz_dmic_startup,
> +   .shutdown = cz_dmic_shutdown,
>   };

>   static struct snd_soc_dai_link cz_dai_7219_98357[] = {
> --
> 2.7.4


Re: [PATCH] net: phy: TLK10X initial driver submission

2018-04-24 Thread Rob Herring
On Thu, Apr 19, 2018 at 10:28:16AM +0200, Måns Andersson wrote:
> From: Mans Andersson 
> 
> Add suport for the TI TLK105 and TLK106 10/100Mbit ethernet phys.
> 
> In addition the TLK10X needs to be removed from DP83848 driver as the
> power back off support is added here for this device.
> 
> Datasheet:
> http://www.ti.com/lit/gpn/tlk106
> ---
>  .../devicetree/bindings/net/ti,tlk10x.txt  |  27 +++

Please split bindings to a separate patch.

>  drivers/net/phy/Kconfig|   5 +
>  drivers/net/phy/Makefile   |   1 +
>  drivers/net/phy/dp83848.c  |   3 -
>  drivers/net/phy/tlk10x.c   | 209 
> +
>  5 files changed, 242 insertions(+), 3 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/net/ti,tlk10x.txt
>  create mode 100644 drivers/net/phy/tlk10x.c
> 
> diff --git a/Documentation/devicetree/bindings/net/ti,tlk10x.txt 
> b/Documentation/devicetree/bindings/net/ti,tlk10x.txt
> new file mode 100644
> index 000..371d0d7
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/ti,tlk10x.txt
> @@ -0,0 +1,27 @@
> +* Texas Instruments - TLK105 / TLK106 ethernet PHYs
> +
> +Required properties:
> + - reg - The ID number for the phy, usually a small integer

Isn't this the MDIO bus address?

This should have a compatible string too.

> +
> +Optional properties:
> + - ti,power-back-off - Power Back Off Level
> + Please refer to data sheet chapter 8.6 and TI Application
> + Note SLLA3228
> + 0 - Normal Operation
> + 1 - Level 1 (up to 140m cable between TLK link partners)
> + 2 - Level 2 (up to 100m cable between TLK link partners)
> + 3 - Level 3 (up to 80m cable between TLK link partners)
> +
> +Default child nodes are standard Ethernet PHY device
> +nodes as described in Documentation/devicetree/bindings/net/phy.txt
> +
> +Example:
> +
> + ethernet-phy@0 {
> + reg = <0>;
> + ti,power-back-off = <2>;
> + };
> +
> +Datasheets and documentation can be found at:
> +http://www.ti.com/lit/gpn/tlk106
> +http://www.ti.com/lit/an/slla328/slla328.pdf

Move this to before the properties.

Rob



Re: [PATCH] net: phy: TLK10X initial driver submission

2018-04-24 Thread Rob Herring
On Thu, Apr 19, 2018 at 10:28:16AM +0200, Måns Andersson wrote:
> From: Mans Andersson 
> 
> Add suport for the TI TLK105 and TLK106 10/100Mbit ethernet phys.
> 
> In addition the TLK10X needs to be removed from DP83848 driver as the
> power back off support is added here for this device.
> 
> Datasheet:
> http://www.ti.com/lit/gpn/tlk106
> ---
>  .../devicetree/bindings/net/ti,tlk10x.txt  |  27 +++

Please split bindings to a separate patch.

>  drivers/net/phy/Kconfig|   5 +
>  drivers/net/phy/Makefile   |   1 +
>  drivers/net/phy/dp83848.c  |   3 -
>  drivers/net/phy/tlk10x.c   | 209 
> +
>  5 files changed, 242 insertions(+), 3 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/net/ti,tlk10x.txt
>  create mode 100644 drivers/net/phy/tlk10x.c
> 
> diff --git a/Documentation/devicetree/bindings/net/ti,tlk10x.txt 
> b/Documentation/devicetree/bindings/net/ti,tlk10x.txt
> new file mode 100644
> index 000..371d0d7
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/ti,tlk10x.txt
> @@ -0,0 +1,27 @@
> +* Texas Instruments - TLK105 / TLK106 ethernet PHYs
> +
> +Required properties:
> + - reg - The ID number for the phy, usually a small integer

Isn't this the MDIO bus address?

This should have a compatible string too.

> +
> +Optional properties:
> + - ti,power-back-off - Power Back Off Level
> + Please refer to data sheet chapter 8.6 and TI Application
> + Note SLLA3228
> + 0 - Normal Operation
> + 1 - Level 1 (up to 140m cable between TLK link partners)
> + 2 - Level 2 (up to 100m cable between TLK link partners)
> + 3 - Level 3 (up to 80m cable between TLK link partners)
> +
> +Default child nodes are standard Ethernet PHY device
> +nodes as described in Documentation/devicetree/bindings/net/phy.txt
> +
> +Example:
> +
> + ethernet-phy@0 {
> + reg = <0>;
> + ti,power-back-off = <2>;
> + };
> +
> +Datasheets and documentation can be found at:
> +http://www.ti.com/lit/gpn/tlk106
> +http://www.ti.com/lit/an/slla328/slla328.pdf

Move this to before the properties.

Rob



Re: KASAN: use-after-free Read in alloc_pid

2018-04-24 Thread Eric W. Biederman
Tetsuo Handa  writes:

> On 2018/04/10 23:33, Tetsuo Handa wrote:
>> syzbot wrote:
>>> syzbot has found reproducer for the following crash on upstream commit
>>> c18bb396d3d261ebbb4efbc05129c5d354c541e4 (Tue Apr 10 00:04:10 2018 +)
>>> Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
>>> syzbot dashboard link:  
>>> https://syzkaller.appspot.com/bug?extid=7a1cff37dbbef9e7ba4c
>>>
>> While we are waiting for
>> 
>>   rpc_pipefs: fix double-dput()
>>   rpc_pipefs: deal with early sget() failures
>>   kernfs: deal with early sget() failures
>>   procfs: deal with early sget() failures
>>   orangefs_kill_sb(): deal with allocation failures
>>   nfsd_umount(): deal with early sget() failures
>>   nfs: avoid double-free on early sget() failures
>>   jffs2_kill_sb(): deal with failed allocations
>>   hypfs_kill_super(): deal with failed allocations
>> 
>> in 
>> https://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git/log/?h=for-linus
>>  ,
>> I think the patch at
>> 
>>   WARNING in kill_block_super
>>   
>> https://syzkaller.appspot.com/bug?id=588996a25a2587be2e3a54e8646728fb9cae44e7
>> 
>> is better.
>> 
>
> OK. The patch was sent to linux.git as commit 8e04944f0ea8b838.
>
> #syz fix: mm,vmscan: Allow preallocating memory for
> register_shrinker().

Sigh no fixes tag on the commit you sent to Linus, and no
cc'ing of stable.

Can you please update the stable folks that 9ee332d99e4d ("sget():
handle failures of register_shrinker()") is fixed with the commit you
just sent to Linus?

Thank you,
Eric


Re: KASAN: use-after-free Read in alloc_pid

2018-04-24 Thread Eric W. Biederman
Tetsuo Handa  writes:

> On 2018/04/10 23:33, Tetsuo Handa wrote:
>> syzbot wrote:
>>> syzbot has found reproducer for the following crash on upstream commit
>>> c18bb396d3d261ebbb4efbc05129c5d354c541e4 (Tue Apr 10 00:04:10 2018 +)
>>> Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
>>> syzbot dashboard link:  
>>> https://syzkaller.appspot.com/bug?extid=7a1cff37dbbef9e7ba4c
>>>
>> While we are waiting for
>> 
>>   rpc_pipefs: fix double-dput()
>>   rpc_pipefs: deal with early sget() failures
>>   kernfs: deal with early sget() failures
>>   procfs: deal with early sget() failures
>>   orangefs_kill_sb(): deal with allocation failures
>>   nfsd_umount(): deal with early sget() failures
>>   nfs: avoid double-free on early sget() failures
>>   jffs2_kill_sb(): deal with failed allocations
>>   hypfs_kill_super(): deal with failed allocations
>> 
>> in 
>> https://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git/log/?h=for-linus
>>  ,
>> I think the patch at
>> 
>>   WARNING in kill_block_super
>>   
>> https://syzkaller.appspot.com/bug?id=588996a25a2587be2e3a54e8646728fb9cae44e7
>> 
>> is better.
>> 
>
> OK. The patch was sent to linux.git as commit 8e04944f0ea8b838.
>
> #syz fix: mm,vmscan: Allow preallocating memory for
> register_shrinker().

Sigh no fixes tag on the commit you sent to Linus, and no
cc'ing of stable.

Can you please update the stable folks that 9ee332d99e4d ("sget():
handle failures of register_shrinker()") is fixed with the commit you
just sent to Linus?

Thank you,
Eric


Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

2018-04-24 Thread Mikulas Patocka


On Tue, 24 Apr 2018, Michal Hocko wrote:

> On Tue 24-04-18 11:30:40, Mikulas Patocka wrote:
> > 
> > 
> > On Tue, 24 Apr 2018, Michal Hocko wrote:
> > 
> > > On Mon 23-04-18 20:25:15, Mikulas Patocka wrote:
> > > 
> > > > Fixing __vmalloc code 
> > > > is easy and it doesn't require cooperation with maintainers.
> > > 
> > > But it is a hack against the intention of the scope api.
> > 
> > It is not!
> 
> This discussion simply doesn't make much sense it seems. The scope API
> is to document the scope of the reclaim recursion critical section. That
> certainly is not a utility function like vmalloc.

That 15-line __vmalloc bugfix doesn't prevent you (or any other kernel 
developer) from converting the code to the scope API. You make nonsensical 
excuses.

Mikulas


Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

2018-04-24 Thread Mikulas Patocka


On Tue, 24 Apr 2018, Michal Hocko wrote:

> On Tue 24-04-18 11:30:40, Mikulas Patocka wrote:
> > 
> > 
> > On Tue, 24 Apr 2018, Michal Hocko wrote:
> > 
> > > On Mon 23-04-18 20:25:15, Mikulas Patocka wrote:
> > > 
> > > > Fixing __vmalloc code 
> > > > is easy and it doesn't require cooperation with maintainers.
> > > 
> > > But it is a hack against the intention of the scope api.
> > 
> > It is not!
> 
> This discussion simply doesn't make much sense it seems. The scope API
> is to document the scope of the reclaim recursion critical section. That
> certainly is not a utility function like vmalloc.

That 15-line __vmalloc bugfix doesn't prevent you (or any other kernel 
developer) from converting the code to the scope API. You make nonsensical 
excuses.

Mikulas


Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.

2018-04-24 Thread Eric W. Biederman
"Panariti, David"  writes:

> Andrey Grodzovsky  writes:
>> Kind of dma_fence_wait_killable, except that we don't have such API
>> (maybe worth adding ?)
> Depends on how many places it would be called, or think it might be called.  
> Can always factor on the 2nd time it's needed.
> Factoring, IMO, rarely hurts.  The factored function can easily be visited 
> using `M-.' ;->
>
> Also, if the wait could be very long, would a log message, something like 
> "xxx has run for Y seconds."  help?
> I personally hate hanging w/no info.

Ugh.  This loop appears susceptible to loosing wake ups.  There are
races between when a wake-up happens, when we clear the sleeping state,
and when we test the stat to see if we should stat awake.  So yes
implementing a dma_fence_wait_killable that handles of all that
correctly sounds like an very good idea.

Eric


>> If the ring is hanging for some reason allow to recover the waiting by 
>> sending fatal signal.
>>
>> Originally-by: David Panariti 
>> Signed-off-by: Andrey Grodzovsky 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++
>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> index eb80edf..37a36af 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> @@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, 
>> unsigned ring_id)
>>
>>   if (other) {
>>   signed long r;
>> - r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
>> - if (r < 0) {
>> - DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>> - return r;
>> +
>> + while (true) {
>> + if ((r = dma_fence_wait_timeout(other, true,
>> + MAX_SCHEDULE_TIMEOUT)) >= 0)
>> + return 0;
>> +
>> + if (fatal_signal_pending(current)) {
>> + DRM_ERROR("Error (%ld) waiting for fence!\n", 
>> r);
>> + return r;
>> + }
>>   }
>>   }
>>
>> --
>> 2.7.4
>>
Eric


Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.

2018-04-24 Thread Eric W. Biederman
"Panariti, David"  writes:

> Andrey Grodzovsky  writes:
>> Kind of dma_fence_wait_killable, except that we don't have such API
>> (maybe worth adding ?)
> Depends on how many places it would be called, or think it might be called.  
> Can always factor on the 2nd time it's needed.
> Factoring, IMO, rarely hurts.  The factored function can easily be visited 
> using `M-.' ;->
>
> Also, if the wait could be very long, would a log message, something like 
> "xxx has run for Y seconds."  help?
> I personally hate hanging w/no info.

Ugh.  This loop appears susceptible to loosing wake ups.  There are
races between when a wake-up happens, when we clear the sleeping state,
and when we test the stat to see if we should stat awake.  So yes
implementing a dma_fence_wait_killable that handles of all that
correctly sounds like an very good idea.

Eric


>> If the ring is hanging for some reason allow to recover the waiting by 
>> sending fatal signal.
>>
>> Originally-by: David Panariti 
>> Signed-off-by: Andrey Grodzovsky 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++
>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> index eb80edf..37a36af 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> @@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, 
>> unsigned ring_id)
>>
>>   if (other) {
>>   signed long r;
>> - r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
>> - if (r < 0) {
>> - DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>> - return r;
>> +
>> + while (true) {
>> + if ((r = dma_fence_wait_timeout(other, true,
>> + MAX_SCHEDULE_TIMEOUT)) >= 0)
>> + return 0;
>> +
>> + if (fatal_signal_pending(current)) {
>> + DRM_ERROR("Error (%ld) waiting for fence!\n", 
>> r);
>> + return r;
>> + }
>>   }
>>   }
>>
>> --
>> 2.7.4
>>
Eric


[PATCH 2/3] powerpc/nohash: remove _PAGE_BUSY

2018-04-24 Thread Christophe Leroy
_PAGE_BUSY is always 0, remove it

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/64/pgtable.h | 10 +++---
 arch/powerpc/include/asm/nohash/pte-book3e.h |  5 -
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 251d74c9013e..e8de7cb4d3fb 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -186,14 +186,12 @@ static inline unsigned long pte_update(struct mm_struct 
*mm,
 
__asm__ __volatile__(
"1: ldarx   %0,0,%3 # pte_update\n\
-   andi.   %1,%0,%6\n\
-   bne-1b \n\
andc%1,%0,%4 \n\
-   or  %1,%1,%7\n\
+   or  %1,%1,%6\n\
stdcx.  %1,0,%3 \n\
bne-1b"
: "=" (old), "=" (tmp), "=m" (*ptep)
-   : "r" (ptep), "r" (clr), "m" (*ptep), "i" (_PAGE_BUSY), "r" (set)
+   : "r" (ptep), "r" (clr), "m" (*ptep), "r" (set)
: "cc" );
 #else
unsigned long old = pte_val(*ptep);
@@ -290,13 +288,11 @@ static inline void __ptep_set_access_flags(struct 
mm_struct *mm,
 
__asm__ __volatile__(
"1: ldarx   %0,0,%4\n\
-   andi.   %1,%0,%6\n\
-   bne-1b \n\
or  %0,%3,%0\n\
stdcx.  %0,0,%4\n\
bne-1b"
:"=" (old), "=" (tmp), "=m" (*ptep)
-   :"r" (bits), "r" (ptep), "m" (*ptep), "i" (_PAGE_BUSY)
+   :"r" (bits), "r" (ptep), "m" (*ptep)
:"cc");
 #else
unsigned long old = pte_val(*ptep);
diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h 
b/arch/powerpc/include/asm/nohash/pte-book3e.h
index 9ff51b4c0cac..12730b81cd98 100644
--- a/arch/powerpc/include/asm/nohash/pte-book3e.h
+++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
@@ -57,13 +57,8 @@
 #define _PAGE_USER (_PAGE_BAP_UR | _PAGE_BAP_SR) /* Can be read */
 #define _PAGE_PRIVILEGED   (_PAGE_BAP_SR)
 
-#define _PAGE_BUSY 0
-
 #define _PAGE_SPECIAL  _PAGE_SW0
 
-/* Flags to be preserved on PTE modifications */
-#define _PAGE_HPTEFLAGS_PAGE_BUSY
-
 /* Base page size */
 #ifdef CONFIG_PPC_64K_PAGES
 #define _PAGE_PSIZE_PAGE_PSIZE_64K
-- 
2.13.3



[PATCH 2/3] powerpc/nohash: remove _PAGE_BUSY

2018-04-24 Thread Christophe Leroy
_PAGE_BUSY is always 0, remove it

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/64/pgtable.h | 10 +++---
 arch/powerpc/include/asm/nohash/pte-book3e.h |  5 -
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 251d74c9013e..e8de7cb4d3fb 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -186,14 +186,12 @@ static inline unsigned long pte_update(struct mm_struct 
*mm,
 
__asm__ __volatile__(
"1: ldarx   %0,0,%3 # pte_update\n\
-   andi.   %1,%0,%6\n\
-   bne-1b \n\
andc%1,%0,%4 \n\
-   or  %1,%1,%7\n\
+   or  %1,%1,%6\n\
stdcx.  %1,0,%3 \n\
bne-1b"
: "=" (old), "=" (tmp), "=m" (*ptep)
-   : "r" (ptep), "r" (clr), "m" (*ptep), "i" (_PAGE_BUSY), "r" (set)
+   : "r" (ptep), "r" (clr), "m" (*ptep), "r" (set)
: "cc" );
 #else
unsigned long old = pte_val(*ptep);
@@ -290,13 +288,11 @@ static inline void __ptep_set_access_flags(struct 
mm_struct *mm,
 
__asm__ __volatile__(
"1: ldarx   %0,0,%4\n\
-   andi.   %1,%0,%6\n\
-   bne-1b \n\
or  %0,%3,%0\n\
stdcx.  %0,0,%4\n\
bne-1b"
:"=" (old), "=" (tmp), "=m" (*ptep)
-   :"r" (bits), "r" (ptep), "m" (*ptep), "i" (_PAGE_BUSY)
+   :"r" (bits), "r" (ptep), "m" (*ptep)
:"cc");
 #else
unsigned long old = pte_val(*ptep);
diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h 
b/arch/powerpc/include/asm/nohash/pte-book3e.h
index 9ff51b4c0cac..12730b81cd98 100644
--- a/arch/powerpc/include/asm/nohash/pte-book3e.h
+++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
@@ -57,13 +57,8 @@
 #define _PAGE_USER (_PAGE_BAP_UR | _PAGE_BAP_SR) /* Can be read */
 #define _PAGE_PRIVILEGED   (_PAGE_BAP_SR)
 
-#define _PAGE_BUSY 0
-
 #define _PAGE_SPECIAL  _PAGE_SW0
 
-/* Flags to be preserved on PTE modifications */
-#define _PAGE_HPTEFLAGS_PAGE_BUSY
-
 /* Base page size */
 #ifdef CONFIG_PPC_64K_PAGES
 #define _PAGE_PSIZE_PAGE_PSIZE_64K
-- 
2.13.3



[PATCH 3/3] powerpc/nohash: use IS_ENABLED() to simplify __set_pte_at()

2018-04-24 Thread Christophe Leroy
By using IS_ENABLED() we can simplify __set_pte_at() by removing
redundant *ptep = pte

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/pgtable.h | 23 ---
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index f2fe3cbe90af..077472640b35 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -148,40 +148,33 @@ extern void set_pte_at(struct mm_struct *mm, unsigned 
long addr, pte_t *ptep,
 static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte, int percpu)
 {
-#if defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
/* Second case is 32-bit with 64-bit PTE.  In this case, we
 * can just store as long as we do the two halves in the right order
 * with a barrier in between.
 * In the percpu case, we also fallback to the simple update
 */
-   if (percpu) {
-   *ptep = pte;
+   if (IS_ENABLED(CONFIG_PPC32) && IS_ENABLED(CONFIG_PTE_64BIT) && 
!percpu) {
+   __asm__ __volatile__("\
+   stw%U0%X0 %2,%0\n\
+   eieio\n\
+   stw%U0%X0 %L2,%1"
+   : "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
+   : "r" (pte) : "memory");
return;
}
-   __asm__ __volatile__("\
-   stw%U0%X0 %2,%0\n\
-   eieio\n\
-   stw%U0%X0 %L2,%1"
-   : "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
-   : "r" (pte) : "memory");
-
-#else
/* Anything else just stores the PTE normally. That covers all 64-bit
 * cases, and 32-bit non-hash with 32-bit PTEs.
 */
*ptep = pte;
 
-#ifdef CONFIG_PPC_BOOK3E_64
/*
 * With hardware tablewalk, a sync is needed to ensure that
 * subsequent accesses see the PTE we just wrote.  Unlike userspace
 * mappings, we can't tolerate spurious faults, so make sure
 * the new PTE will be seen the first time.
 */
-   if (is_kernel_addr(addr))
+   if (IS_ENABLED(CONFIG_PPC_BOOK3E_64) && is_kernel_addr(addr))
mb();
-#endif
-#endif
 }
 
 
-- 
2.13.3



[PATCH 3/3] powerpc/nohash: use IS_ENABLED() to simplify __set_pte_at()

2018-04-24 Thread Christophe Leroy
By using IS_ENABLED() we can simplify __set_pte_at() by removing
redundant *ptep = pte

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/pgtable.h | 23 ---
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index f2fe3cbe90af..077472640b35 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -148,40 +148,33 @@ extern void set_pte_at(struct mm_struct *mm, unsigned 
long addr, pte_t *ptep,
 static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte, int percpu)
 {
-#if defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
/* Second case is 32-bit with 64-bit PTE.  In this case, we
 * can just store as long as we do the two halves in the right order
 * with a barrier in between.
 * In the percpu case, we also fallback to the simple update
 */
-   if (percpu) {
-   *ptep = pte;
+   if (IS_ENABLED(CONFIG_PPC32) && IS_ENABLED(CONFIG_PTE_64BIT) && 
!percpu) {
+   __asm__ __volatile__("\
+   stw%U0%X0 %2,%0\n\
+   eieio\n\
+   stw%U0%X0 %L2,%1"
+   : "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
+   : "r" (pte) : "memory");
return;
}
-   __asm__ __volatile__("\
-   stw%U0%X0 %2,%0\n\
-   eieio\n\
-   stw%U0%X0 %L2,%1"
-   : "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
-   : "r" (pte) : "memory");
-
-#else
/* Anything else just stores the PTE normally. That covers all 64-bit
 * cases, and 32-bit non-hash with 32-bit PTEs.
 */
*ptep = pte;
 
-#ifdef CONFIG_PPC_BOOK3E_64
/*
 * With hardware tablewalk, a sync is needed to ensure that
 * subsequent accesses see the PTE we just wrote.  Unlike userspace
 * mappings, we can't tolerate spurious faults, so make sure
 * the new PTE will be seen the first time.
 */
-   if (is_kernel_addr(addr))
+   if (IS_ENABLED(CONFIG_PPC_BOOK3E_64) && is_kernel_addr(addr))
mb();
-#endif
-#endif
 }
 
 
-- 
2.13.3



[PATCH 1/3] powerpc/nohash: remove hash related code from nohash headers.

2018-04-24 Thread Christophe Leroy
When nohash and book3s header were split, some hash related stuff
remained in the nohash header. This patch removes them.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 29 +++--
 arch/powerpc/include/asm/nohash/64/pgtable.h | 16 ++--
 arch/powerpc/include/asm/nohash/pgtable.h| 38 +++-
 arch/powerpc/include/asm/nohash/pte-book3e.h |  1 -
 4 files changed, 10 insertions(+), 74 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index a717b5c39b9c..b413abcd5a09 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -129,7 +129,7 @@ extern int icache_44x_need_flush;
 #ifndef __ASSEMBLY__
 
 #define pte_clear(mm, addr, ptep) \
-   do { pte_update(ptep, ~_PAGE_HASHPTE, 0); } while (0)
+   do { pte_update(ptep, ~0, 0); } while (0)
 
 #define pmd_none(pmd)  (!pmd_val(pmd))
 #definepmd_bad(pmd)(pmd_val(pmd) & _PMD_BAD)
@@ -142,21 +142,6 @@ static inline void pmd_clear(pmd_t *pmdp)
 
 
 /*
- * When flushing the tlb entry for a page, we also need to flush the hash
- * table entry.  flush_hash_pages is assembler (for speed) in hashtable.S.
- */
-extern int flush_hash_pages(unsigned context, unsigned long va,
-   unsigned long pmdval, int count);
-
-/* Add an HPTE to the hash table */
-extern void add_hash_page(unsigned context, unsigned long va,
- unsigned long pmdval);
-
-/* Flush an entry from the TLB/hash table */
-extern void flush_hash_entry(struct mm_struct *mm, pte_t *ptep,
-unsigned long address);
-
-/*
  * PTE updates. This function is called whenever an existing
  * valid PTE is updated. This does -not- include set_pte_at()
  * which nowadays only sets a new PTE.
@@ -242,12 +227,6 @@ static inline int __ptep_test_and_clear_young(unsigned int 
context, unsigned lon
 {
unsigned long old;
old = pte_update(ptep, _PAGE_ACCESSED, 0);
-#if _PAGE_HASHPTE != 0
-   if (old & _PAGE_HASHPTE) {
-   unsigned long ptephys = __pa(ptep) & PAGE_MASK;
-   flush_hash_pages(context, addr, ptephys, 1);
-   }
-#endif
return (old & _PAGE_ACCESSED) != 0;
 }
 #define ptep_test_and_clear_young(__vma, __addr, __ptep) \
@@ -257,7 +236,7 @@ static inline int __ptep_test_and_clear_young(unsigned int 
context, unsigned lon
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long 
addr,
   pte_t *ptep)
 {
-   return __pte(pte_update(ptep, ~_PAGE_HASHPTE, 0));
+   return __pte(pte_update(ptep, ~0, 0));
 }
 
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
@@ -285,7 +264,7 @@ static inline void __ptep_set_access_flags(struct mm_struct 
*mm,
 }
 
 #define __HAVE_ARCH_PTE_SAME
-#define pte_same(A,B)  (((pte_val(A) ^ pte_val(B)) & ~_PAGE_HASHPTE) == 0)
+#define pte_same(A,B)  ((pte_val(A) ^ pte_val(B)) == 0)
 
 /*
  * Note that on Book E processors, the pmd contains the kernel virtual
@@ -326,7 +305,7 @@ static inline void __ptep_set_access_flags(struct mm_struct 
*mm,
 /*
  * Encode and decode a swap entry.
  * Note that the bits we use in a PTE for representing a swap entry
- * must not include the _PAGE_PRESENT bit or the _PAGE_HASHPTE bit (if used).
+ * must not include the _PAGE_PRESENT bit.
  *   -- paulus
  */
 #define __swp_type(entry)  ((entry).val & 0x1f)
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 5c5f75d005ad..251d74c9013e 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -173,8 +173,6 @@ static inline void pgd_set(pgd_t *pgdp, unsigned long val)
 /* to find an entry in a kernel page-table-directory */
 /* This now only contains the vmalloc pages */
 #define pgd_offset_k(address) pgd_offset(_mm, address)
-extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
-   pte_t *ptep, unsigned long pte, int huge);
 
 /* Atomic PTE updates */
 static inline unsigned long pte_update(struct mm_struct *mm,
@@ -205,11 +203,6 @@ static inline unsigned long pte_update(struct mm_struct 
*mm,
if (!huge)
assert_pte_locked(mm, addr);
 
-#ifdef CONFIG_PPC_BOOK3S_64
-   if (old & _PAGE_HASHPTE)
-   hpte_need_flush(mm, addr, ptep, old, huge);
-#endif
-
return old;
 }
 
@@ -218,7 +211,7 @@ static inline int __ptep_test_and_clear_young(struct 
mm_struct *mm,
 {
unsigned long old;
 
-   if ((pte_val(*ptep) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
+   if (pte_young(*ptep))
return 0;
old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
return (old & _PAGE_ACCESSED) != 0;
@@ -312,7 +305,7 @@ static inline void 

[PATCH 1/3] powerpc/nohash: remove hash related code from nohash headers.

2018-04-24 Thread Christophe Leroy
When nohash and book3s header were split, some hash related stuff
remained in the nohash header. This patch removes them.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 29 +++--
 arch/powerpc/include/asm/nohash/64/pgtable.h | 16 ++--
 arch/powerpc/include/asm/nohash/pgtable.h| 38 +++-
 arch/powerpc/include/asm/nohash/pte-book3e.h |  1 -
 4 files changed, 10 insertions(+), 74 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index a717b5c39b9c..b413abcd5a09 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -129,7 +129,7 @@ extern int icache_44x_need_flush;
 #ifndef __ASSEMBLY__
 
 #define pte_clear(mm, addr, ptep) \
-   do { pte_update(ptep, ~_PAGE_HASHPTE, 0); } while (0)
+   do { pte_update(ptep, ~0, 0); } while (0)
 
 #define pmd_none(pmd)  (!pmd_val(pmd))
 #definepmd_bad(pmd)(pmd_val(pmd) & _PMD_BAD)
@@ -142,21 +142,6 @@ static inline void pmd_clear(pmd_t *pmdp)
 
 
 /*
- * When flushing the tlb entry for a page, we also need to flush the hash
- * table entry.  flush_hash_pages is assembler (for speed) in hashtable.S.
- */
-extern int flush_hash_pages(unsigned context, unsigned long va,
-   unsigned long pmdval, int count);
-
-/* Add an HPTE to the hash table */
-extern void add_hash_page(unsigned context, unsigned long va,
- unsigned long pmdval);
-
-/* Flush an entry from the TLB/hash table */
-extern void flush_hash_entry(struct mm_struct *mm, pte_t *ptep,
-unsigned long address);
-
-/*
  * PTE updates. This function is called whenever an existing
  * valid PTE is updated. This does -not- include set_pte_at()
  * which nowadays only sets a new PTE.
@@ -242,12 +227,6 @@ static inline int __ptep_test_and_clear_young(unsigned int 
context, unsigned lon
 {
unsigned long old;
old = pte_update(ptep, _PAGE_ACCESSED, 0);
-#if _PAGE_HASHPTE != 0
-   if (old & _PAGE_HASHPTE) {
-   unsigned long ptephys = __pa(ptep) & PAGE_MASK;
-   flush_hash_pages(context, addr, ptephys, 1);
-   }
-#endif
return (old & _PAGE_ACCESSED) != 0;
 }
 #define ptep_test_and_clear_young(__vma, __addr, __ptep) \
@@ -257,7 +236,7 @@ static inline int __ptep_test_and_clear_young(unsigned int 
context, unsigned lon
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long 
addr,
   pte_t *ptep)
 {
-   return __pte(pte_update(ptep, ~_PAGE_HASHPTE, 0));
+   return __pte(pte_update(ptep, ~0, 0));
 }
 
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
@@ -285,7 +264,7 @@ static inline void __ptep_set_access_flags(struct mm_struct 
*mm,
 }
 
 #define __HAVE_ARCH_PTE_SAME
-#define pte_same(A,B)  (((pte_val(A) ^ pte_val(B)) & ~_PAGE_HASHPTE) == 0)
+#define pte_same(A,B)  ((pte_val(A) ^ pte_val(B)) == 0)
 
 /*
  * Note that on Book E processors, the pmd contains the kernel virtual
@@ -326,7 +305,7 @@ static inline void __ptep_set_access_flags(struct mm_struct 
*mm,
 /*
  * Encode and decode a swap entry.
  * Note that the bits we use in a PTE for representing a swap entry
- * must not include the _PAGE_PRESENT bit or the _PAGE_HASHPTE bit (if used).
+ * must not include the _PAGE_PRESENT bit.
  *   -- paulus
  */
 #define __swp_type(entry)  ((entry).val & 0x1f)
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 5c5f75d005ad..251d74c9013e 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -173,8 +173,6 @@ static inline void pgd_set(pgd_t *pgdp, unsigned long val)
 /* to find an entry in a kernel page-table-directory */
 /* This now only contains the vmalloc pages */
 #define pgd_offset_k(address) pgd_offset(_mm, address)
-extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
-   pte_t *ptep, unsigned long pte, int huge);
 
 /* Atomic PTE updates */
 static inline unsigned long pte_update(struct mm_struct *mm,
@@ -205,11 +203,6 @@ static inline unsigned long pte_update(struct mm_struct 
*mm,
if (!huge)
assert_pte_locked(mm, addr);
 
-#ifdef CONFIG_PPC_BOOK3S_64
-   if (old & _PAGE_HASHPTE)
-   hpte_need_flush(mm, addr, ptep, old, huge);
-#endif
-
return old;
 }
 
@@ -218,7 +211,7 @@ static inline int __ptep_test_and_clear_young(struct 
mm_struct *mm,
 {
unsigned long old;
 
-   if ((pte_val(*ptep) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
+   if (pte_young(*ptep))
return 0;
old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
return (old & _PAGE_ACCESSED) != 0;
@@ -312,7 +305,7 @@ static inline void __ptep_set_access_flags(struct mm_struct 
*mm,

<    5   6   7   8   9   10   11   12   13   14   >