Re: [GIT PULL 0/5] perf/core improvements and fixes

2018-02-20 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 11737ca9e3b9d84448fa405a80980aa9957bcee8:
> 
>   Merge tag 'perf-core-for-mingo-4.17-20180216' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core 
> (2018-02-17 11:39:47 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git 
> tags/perf-core-for-mingo-4.17-20180220
> 
> for you to fetch changes up to 66dfdff03d196e51322c6a85c0d8db8bb2bdd655:
> 
>   perf tools: Add Python 3 support (2018-02-19 12:28:23 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> - Initial support for linking with python3, by explicitly setting
>   the PYTHON Makefile variable, python2 remains supported, more work
>   needed to test the shipped python scripts used with 'perf script'
>   (Jaroslav Škarvada)
> 
> - Make twatch.py, an example python script that sets up evlists and
>   evsels to then parse events from mmap, to work with both python2 and
>   python3 (Arnaldo Carvalho de Melo)
> 
> - Fix setting 'perf ftrace' function filter when using a non-existent
>   function, which should result in an error instead of simply not setting
>   the filter and showing all functions (Changbin Du)
> 
> - Fix paranoid check in machine__set_kernel_mmap(), problem introduced
>   in previous perf/core batch (Namhyung Kim)
> 
> - Fix reading cpuid model information in s/390, ditto, also introduced
>   in the previous perf/core batch (Thomas Richter)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Arnaldo Carvalho de Melo (1):
>   perf python: Make twatch.py work with both python2 and python3
> 
> Changbin Du (1):
>   perf ftrace: Append an EOL when write tracing files
> 
> Jaroslav Škarvada (1):
>   perf tools: Add Python 3 support
> 
> Namhyung Kim (1):
>   perf machine: Fix paranoid check in machine__set_kernel_mmap()
> 
> Thomas Richter (1):
>   perf s390: Fix reading cpuid model information
> 
>  tools/perf/Makefile.config |  23 +---
>  tools/perf/Makefile.perf   |   4 +-
>  tools/perf/arch/s390/util/header.c |   2 +-
>  tools/perf/builtin-ftrace.c|  18 ++-
>  tools/perf/python/twatch.py|   8 +-
>  .../perf/scripts/python/Perf-Trace-Util/Context.c  |  34 -
>  tools/perf/util/machine.c  |   2 +-
>  tools/perf/util/python.c   |  95 ++---
>  .../util/scripting-engines/trace-event-python.c| 147 
> +++--
>  tools/perf/util/setup.py   |   6 +-
>  10 files changed, 243 insertions(+), 96 deletions(-)

Pulled, thanks a lot Arnaldo!

Ingo


Re: A "domain invalid" cgroup *can* sometimes have member tasks

2018-02-20 Thread Michael Kerrisk (man-pages)
Hi Tejun,

On 21 February 2018 at 00:01, Tejun Heo  wrote:
> On Tue, Feb 20, 2018 at 11:35:47AM -0800, Tejun Heo wrote:
>> Hmm... nr_populated_domain_children check should have caught that
>> condition and rejected it.  Will look into what's going on.
>
> Ah, okay, I was special-casing the first level children case too
> early.  If you nest the test case by one more level, it fails as
> intended.  I'll fix the checks.

Thanks for looking into this so quickly! Please CC me on the patch.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


Re: [PATCH v2 01/21] lib/vsprintf: Print time and date in human readable format via %pt

2018-02-20 Thread Rasmus Villemoes
On 2018-02-21 00:55, Joe Perches wrote:
> On Tue, 2018-02-20 at 23:43 +0200, Andy Shevchenko wrote:
>> There are users which print time and date represented by content of
>> struct rtc_time in human readable format.
>>
>> Instead of open coding that each time introduce %ptR[dt][rv] specifier.
>>
>> Note, users have to select PRINTK_PEXT_TIMEDATE option in a Kconfig.
> 
> Not sure this is a great option.
> Not just the name, the need to select it.

Bikeshedding first: If you do keep the config option, please use PRINTF,
not PRINTK - vsprintf can be and is used by lots of code other than printk.

Well, on the one hand, I like to reduce the size of the kernel when
possible and ideally make all new functionality guarded by config
options, but OTOH, how much does compiling out the datetime formatters
really save? Also, I agree with Joe's concern about the need to select
it. Maybe if we had a gcc plugin that did %pFOO validation it could also
warn about %pBAR being used without a corresponding config option being
set. But we don't have that currently...

Rasmus


Re: [PATCH 00/23] kconfig: move compiler capability tests to Kconfig

2018-02-20 Thread Masahiro Yamada
2018-02-20 0:18 GMT+09:00 Ulf Magnusson :

>>
>> I'm not happy that we in one context can reference CONFIG variables
>> directly, but inside the $(call ...) and $(shell ...) needs the $ prefix.
>> But I could not come up with something unambiguous where this could be
>> avoided.
>
> I think we should be careful about allowing references to config
> symbols. It mixes up the parsing and evaluation phases, since $() is
> expanded during parsing (which I consider a feature and think is
> needed to retain sanity).
>
> Patch 06/23 removes the last existing instance of symbol references in
> strings by getting rid of 'option env'. That's an improvement to me.
> We shouldn't add it back.


This is a really important design decision,
so I'd like to hear a little more from experts.


For example, x86 allows users to choose sub-arch, either 'i386' or 'x86_64'.

https://github.com/torvalds/linux/blob/v4.16-rc2/arch/x86/Kconfig#L4



If the user toggles CONFIG_64BIT,
the bi-arch compiler will work in a slightly different mode
(at least, back-end parts)

So, my question is, is there a case,

$(cc-option, -m32 -foo) is y, but
$(cc-option, -m64 -foo) is n  ?
(or vice versa)


If the answer is yes, $(cc-option -foo) would have to be re-calculated
every time CONFIG_64BIT is toggled.

This is what I'd like to avoid, though.



-- 
Best Regards
Masahiro Yamada


Re: [RFCv4 16/21] v4l2: video_device: support for creating requests

2018-02-20 Thread Hans Verkuil
On 02/21/2018 07:01 AM, Alexandre Courbot wrote:
> On Wed, Feb 21, 2018 at 1:35 AM, Hans Verkuil  wrote:
>> On 02/20/2018 05:44 AM, Alexandre Courbot wrote:
>>> Add a new VIDIOC_NEW_REQUEST ioctl, which allows instantiating requests
>>> on devices that support the request API. Requests created that way can
>>> only control the device they originate from, making them suitable for
>>> simple devices, but not complex pipelines.
>>>
>>> Signed-off-by: Alexandre Courbot 
>>> ---
>>>  Documentation/ioctl/ioctl-number.txt |  1 +
>>>  drivers/media/v4l2-core/v4l2-dev.c   |  2 ++
>>>  drivers/media/v4l2-core/v4l2-ioctl.c | 25 +
>>>  include/media/v4l2-dev.h |  2 ++
>>>  include/uapi/linux/videodev2.h   |  3 +++
>>>  5 files changed, 33 insertions(+)
>>>
>>> diff --git a/Documentation/ioctl/ioctl-number.txt 
>>> b/Documentation/ioctl/ioctl-number.txt
>>> index 6501389d55b9..afdc9ed255b0 100644
>>> --- a/Documentation/ioctl/ioctl-number.txt
>>> +++ b/Documentation/ioctl/ioctl-number.txt
>>> @@ -286,6 +286,7 @@ Code  Seq#(hex)   Include FileComments
>>>   
>>>  'z'  10-4F   drivers/s390/crypto/zcrypt_api.hconflict!
>>>  '|'  00-7F   linux/media.h
>>> +'|'  80-9F   linux/media-request.h
>>>  0x80 00-1F   linux/fb.h
>>>  0x89 00-06   arch/x86/include/asm/sockios.h
>>>  0x89 0B-DF   linux/sockios.h
>>
>> This doesn't belong in this patch.
> 
> Do you mean we need a separate patch for this?

Yes.

I now also see why you started at 0x80. Let's keep that for now.

> 
>>
>>> diff --git a/drivers/media/v4l2-core/v4l2-dev.c 
>>> b/drivers/media/v4l2-core/v4l2-dev.c
>>> index 0301fe426a43..062ebee5bffc 100644
>>> --- a/drivers/media/v4l2-core/v4l2-dev.c
>>> +++ b/drivers/media/v4l2-core/v4l2-dev.c
>>> @@ -559,6 +559,8 @@ static void determine_valid_ioctls(struct video_device 
>>> *vdev)
>>>   set_bit(_IOC_NR(VIDIOC_TRY_EXT_CTRLS), valid_ioctls);
>>>   if (vdev->ctrl_handler || ops->vidioc_querymenu)
>>>   set_bit(_IOC_NR(VIDIOC_QUERYMENU), valid_ioctls);
>>> + if (vdev->req_mgr)
>>> + set_bit(_IOC_NR(VIDIOC_NEW_REQUEST), valid_ioctls);
>>>   SET_VALID_IOCTL(ops, VIDIOC_G_FREQUENCY, vidioc_g_frequency);
>>>   SET_VALID_IOCTL(ops, VIDIOC_S_FREQUENCY, vidioc_s_frequency);
>>>   SET_VALID_IOCTL(ops, VIDIOC_LOG_STATUS, vidioc_log_status);
>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c 
>>> b/drivers/media/v4l2-core/v4l2-ioctl.c
>>> index ab4968ea443f..a45fe078f8ae 100644
>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>>> @@ -21,6 +21,7 @@
>>>
>>>  #include 
>>>
>>> +#include 
>>>  #include 
>>>  #include 
>>>  #include 
>>> @@ -842,6 +843,13 @@ static void v4l_print_freq_band(const void *arg, bool 
>>> write_only)
>>>   p->rangehigh, p->modulation);
>>>  }
>>>
>>> +static void vidioc_print_new_request(const void *arg, bool write_only)
>>> +{
>>> + const struct media_request_new *new = arg;
>>> +
>>> + pr_cont("fd=0x%x\n", new->fd);
>>
>> I'd use %d since fds are typically shown as signed integers.
> 
> Right.
> 
>>
>>> +}
>>> +
>>>  static void v4l_print_edid(const void *arg, bool write_only)
>>>  {
>>>   const struct v4l2_edid *p = arg;
>>> @@ -2486,6 +2494,22 @@ static int v4l_enum_freq_bands(const struct 
>>> v4l2_ioctl_ops *ops,
>>>   return -ENOTTY;
>>>  }
>>>
>>> +static int vidioc_new_request(const struct v4l2_ioctl_ops *ops,
>>> +   struct file *file, void *fh, void *arg)
>>> +{
>>> +#if IS_ENABLED(CONFIG_MEDIA_REQUEST_API)
>>> + struct media_request_new *new = arg;
>>> + struct video_device *vfd = video_devdata(file);
>>> +
>>> + if (!vfd->req_mgr)
>>> + return -ENOTTY;
>>> +
>>> + return media_request_ioctl_new(vfd->req_mgr, new);
>>> +#else
>>> + return -ENOTTY;
>>> +#endif
>>> +}
>>
>> You don't need the #ifdefs here. media_request_ioctl_new() will be stubbed
>> if CONFIG_MEDIA_REQUEST_API isn't set.
> 
> Correct.
> 
>>
>>> +
>>>  struct v4l2_ioctl_info {
>>>   unsigned int ioctl;
>>>   u32 flags;
>>> @@ -2617,6 +2641,7 @@ static struct v4l2_ioctl_info v4l2_ioctls[] = {
>>>   IOCTL_INFO_FNC(VIDIOC_ENUM_FREQ_BANDS, v4l_enum_freq_bands, 
>>> v4l_print_freq_band, 0),
>>>   IOCTL_INFO_FNC(VIDIOC_DBG_G_CHIP_INFO, v4l_dbg_g_chip_info, 
>>> v4l_print_dbg_chip_info, INFO_FL_CLEAR(v4l2_dbg_chip_info, match)),
>>>   IOCTL_INFO_FNC(VIDIOC_QUERY_EXT_CTRL, v4l_query_ext_ctrl, 
>>> v4l_print_query_ext_ctrl, INFO_FL_CTRL | INFO_FL_CLEAR(v4l2_query_ext_ctrl, 
>>> id)),
>>> + IOCTL_INFO_FNC(VIDIOC_NEW_REQUEST, vidioc_new_request, 
>>> vidioc_print_new_request, 0),
>>>  };
>>>  #define V4L2_IOCTLS ARRAY_SIZE(v4l2_ioctls)
>>>
>>> diff --git a/include/media/v4l2-dev.h b/include/media/v4l2-dev.h
>>> index 53f32022fabe..e6c4e10889bc 100644
>>> --- a/include/media/v4l2-dev.h
>>> +++ b/include/med

Re: [PATCH v4] tpm: Trigger only missing TPM 2.0 self tests

2018-02-20 Thread Jarkko Sakkinen
On Wed, Feb 21, 2018 at 12:53:47AM +0200, Jarkko Sakkinen wrote:
> From: Alexander Steffen 
> 
> My Nuvoton 6xx in a Dell XPS-13 has been intermittently failing to work
> (necessitating a reboot). The problem seems to be that the TPM gets into a
> state where the partial self-test doesn't return TPM_RC_SUCCESS (meaning
> all tests have run to completion), but instead returns TPM_RC_TESTING
> (meaning some tests are still running in the background).  There are
> various theories that resending the self-test command actually causes the
> tests to restart and thus triggers more TPM_RC_TESTING returns until the
> timeout is exceeded.
> 
> There are several issues here: firstly being we shouldn't slow down the
> boot sequence waiting for the self test to complete once the TPM
> backgrounds them.  It will actually make available all functions that have
> passed and if it gets a failure return TPM_RC_FAILURE to every subsequent
> command.  So the fix is to kick off self tests once and if they return
> TPM_RC_TESTING log that as a backgrounded self test and continue on.  In
> order to prevent other tpm users from seeing any TPM_RC_TESTING returns
> (which it might if they send a command that needs a TPM subsystem which is
> still under test), we loop in tpm_transmit_cmd until either a timeout or we
> don't get a TPM_RC_TESTING return.
> 
> Finally, there have been observations of strange returns from a partial
> test. One Nuvoton is occasionally returning TPM_RC_COMMAND_CODE, so treat
> any unexpected return from a partial self test as an indication we need to
> run a full self test.
> 
> Signed-off-by: Alexander Steffen 
> Signed-off-by: James Bottomley 
> Signed-off-by: Jarkko Sakkinen 
> Fixes: 2482b1bba5122b1d5516c909832bdd282015b8e9

Fixes tag is in wrong format. Should be 12 characters of the hash and
short summary in parentheses:

Fixes: 2482b1bba512 ("tpm: Trigger only missing TPM 2.0 self tests")
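
As a rough illustration of the format described above (first 12 characters of the hash, then the subject in parentheses and quotes), here is a minimal userspace sketch. format_fixes_tag() is a hypothetical helper written for this example; it is not part of git or the kernel tree.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a git or kernel API): build a "Fixes:" trailer
 * from a full commit hash and the commit's subject line.
 * Returns 0 on success, -1 if the hash has fewer than 12 characters or
 * the output buffer is too small.
 */
static int format_fixes_tag(char *out, size_t outlen,
                            const char *full_hash, const char *subject)
{
    int n;

    if (strlen(full_hash) < 12)
        return -1;

    /* "%.12s" prints at most the first 12 characters of the hash. */
    n = snprintf(out, outlen, "Fixes: %.12s (\"%s\")", full_hash, subject);
    if (n < 0 || (size_t)n >= outlen)
        return -1;

    return 0;
}
```

Fed the 40-character hash from the quoted trailer and the patch subject, this produces the same line as the corrected tag shown above.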

/Jarkko


Re: [RFCv4 01/21] media: add request API core and UAPI

2018-02-20 Thread Hans Verkuil
On 02/21/2018 07:01 AM, Alexandre Courbot wrote:
> Hi Hans,
> 
> On Tue, Feb 20, 2018 at 7:36 PM, Hans Verkuil  wrote:
>> On 02/20/18 05:44, Alexandre Courbot wrote:



>>> +#define MEDIA_REQUEST_IOC(__cmd, func)   \
>>> + [_IOC_NR(MEDIA_REQUEST_IOC_##__cmd) - 0x80] = { \
>>> + .cmd = MEDIA_REQUEST_IOC_##__cmd,   \
>>> + .fn = func, \
>>> + }
>>> +
>>> +struct media_request_ioctl_info {
>>> + unsigned int cmd;
>>> + long (*fn)(struct media_request *req);
>>> +};
>>> +
>>> +static const struct media_request_ioctl_info ioctl_info[] = {
>>> + MEDIA_REQUEST_IOC(SUBMIT, media_request_ioctl_submit),
>>> + MEDIA_REQUEST_IOC(REINIT, media_request_ioctl_reinit),
>>
>> There are only two ioctls, so there is really no need for the
>> MEDIA_REQUEST_IOC define. Just keep it simple.
> 
> The number of times it is used doesn't change the fact that it helps
> with readability IMHO.

But this macro just boils down to:

static const struct media_request_ioctl_info ioctl_info[] = {
{ MEDIA_REQUEST_IOC_SUBMIT, media_request_ioctl_submit },
{ MEDIA_REQUEST_IOC_REINIT, media_request_ioctl_reinit },
};

It's absolutely identical! So it seems senseless to me.
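
The equivalence claimed here can be checked with a small userspace sketch. Everything prefixed DEMO_/demo_ is a hypothetical stand-in rather than the kernel's definitions: DEMO_IO() and DEMO_IOC_NR() are simplified versions of _IO() and _IOC_NR(), the handlers only return marker values, and -1 stands in for -ENOIOCTLCMD. Note that the two tables come out the same only because the ioctl numbers are contiguous and start at 0x80.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's _IO() and _IOC_NR() macros. */
#define DEMO_IO(type, nr)   ((((unsigned int)(type)) << 8) | (nr))
#define DEMO_IOC_NR(cmd)    ((cmd) & 0xffu)

#define DEMO_IOC_SUBMIT     DEMO_IO('|', 128)
#define DEMO_IOC_REINIT     DEMO_IO('|', 129)

#define DEMO_ARRAY_SIZE(a)  (sizeof(a) / sizeof((a)[0]))

struct demo_ioctl_info {
    unsigned int cmd;
    long (*fn)(void);
};

/* Dummy handlers that return distinct marker values. */
static long demo_submit(void) { return 1; }
static long demo_reinit(void) { return 2; }

/* Macro form: the array index is derived from the ioctl number. */
#define DEMO_IOC(name, func)                        \
    [DEMO_IOC_NR(DEMO_IOC_##name) - 0x80] = {       \
        .cmd = DEMO_IOC_##name,                     \
        .fn = func,                                 \
    }

static const struct demo_ioctl_info table_macro[] = {
    DEMO_IOC(SUBMIT, demo_submit),
    DEMO_IOC(REINIT, demo_reinit),
};

/* Plain positional form, as suggested in the review. */
static const struct demo_ioctl_info table_plain[] = {
    { DEMO_IOC_SUBMIT, demo_submit },
    { DEMO_IOC_REINIT, demo_reinit },
};

/* Bounds-checked dispatch modelled on media_request_ioctl(). */
static long demo_dispatch(const struct demo_ioctl_info *table, size_t n,
                          unsigned int cmd)
{
    unsigned int nr = DEMO_IOC_NR(cmd);

    if (nr < 0x80 || nr >= 0x80 + n || table[nr - 0x80].cmd != cmd)
        return -1; /* stand-in for -ENOIOCTLCMD */

    return table[nr - 0x80].fn();
}
```

If the numbers started at 0 instead, the `- 0x80` arithmetic in both the macro and the dispatch function would disappear.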

> 
>>
>>> +};
>>> +
>>> +static long media_request_ioctl(struct file *filp, unsigned int cmd,
>>> + unsigned long __arg)
>>> +{
>>> + struct media_request *req = filp->private_data;
>>> + const struct media_request_ioctl_info *info;
>>> +
>>> + if ((_IOC_NR(cmd) < 0x80) ||
>>
>> Why start the ioctl number at 0x80? Why not just 0?
>> It avoids all this hassle with the 0x80 offset.

There is no clash with the MC ioctls, so I really don't believe the 0x80
offset is needed.

>>
>>> +  _IOC_NR(cmd) >= 0x80 + ARRAY_SIZE(ioctl_info) ||
>>> +  ioctl_info[_IOC_NR(cmd) - 0x80].cmd != cmd)
>>> + return -ENOIOCTLCMD;
>>> +
>>> + info = &ioctl_info[_IOC_NR(cmd) - 0x80];
>>> +
>>> + return info->fn(req);
>>> +}



>>> diff --git a/include/uapi/linux/media-request.h 
>>> b/include/uapi/linux/media-request.h
>>> new file mode 100644
>>> index ..5d30f731a442
>>> --- /dev/null
>>> +++ b/include/uapi/linux/media-request.h
>>> @@ -0,0 +1,37 @@
>>> +/*
>>> + * Media requests UAPI
>>> + *
>>> + * Copyright (C) 2018, The Chromium OS Authors.  All rights reserved.
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> + * GNU General Public License for more details.
>>> + */
>>> +
>>> +#ifndef __LINUX_MEDIA_REQUEST_H
>>> +#define __LINUX_MEDIA_REQUEST_H
>>> +
>>> +#ifndef __KERNEL__
>>> +#include 
>>> +#endif
>>> +#include 
>>> +#include 
>>> +#include 
>>> +
>>> +/* Only check that requests can be used, do not allocate */
>>> +#define MEDIA_REQUEST_FLAG_TEST  0x0001
>>> +
>>> +struct media_request_new {
>>> + __u32 flags;
>>> + __s32 fd;
>>> +} __attribute__ ((packed));
>>> +
>>> +#define MEDIA_REQUEST_IOC_SUBMIT   _IO('|',  128)
>>> +#define MEDIA_REQUEST_IOC_REINIT   _IO('|',  129)
>>> +
>>> +#endif
>>>
>>
>> I need to think a bit more on this internal API, so I might come back
>> to this patch for more comments.
> 
> I think I should probably elaborate on why I think it is advantageous
> to have these ioctls handled here.

Sorry for the confusion, I was not actually referring to these ioctls.
In fact, I really like them. It was more a general comment about the
request API core.

I should have been more clear.

Regards,

Hans

> 
> One of the reasons is that it does not force user-space to keep track
> of who issued the request to operate on it. Semantically, the only
> device a request could be submitted to is the device that produced it
> anyway, so since that argument is constant we may as well get rid of
> it (and we also don't need to pass the request FD as argument
> anymore).
> 
> It also gives us more freedom when designing new request-related
> ioctls: before, all request-related operations were multiplexed under
> a single MEDIA_IOC_REQUEST_CMD ioctl, whose cmd field indicated the
> actual operation to perform. With this design, all the arguments must
> fit within the media_request_cmd structure, which may cause confusion
> as it will have to be variable-sized. I am thinking in particular
> about a future atomic-like API to set topology, controls and buffers
> related to a request all at the same time. Having it as a request
> ioctl seems perfectly fitting to me.
> 



Re: [PATCH 3/4] kernel/fork: switch vmapped stack callation to __vmalloc_area()

2018-02-20 Thread Konstantin Khlebnikov



On 21.02.2018 03:16, Andrew Morton wrote:

On Tue, 23 Jan 2018 16:57:21 +0300 Konstantin Khlebnikov 
 wrote:


# stress-ng --clone 100 -t 10s --metrics-brief
on a 32-core machine shows a boost of 35000 -> 36000 bogo ops

Patch 4/4 is a kind of RFC.
Actually, the per-cpu cache of preallocated stacks works faster than the buddy
allocator, thus the performance boost from it happens only at a completely
insane rate of clones.



I'm not really sure what to make of this patchset.  Is it useful in any
known real-world use cases?


Not yet. Feel free to ignore last patch.




+ This option neutralizes stack overflow protection but allows
+ achieving the best performance for the fork() and clone() syscalls.


That sounds problematic, but perhaps acceptable if the fallback only
happens rarely.

Can this code be folded into CONFIG_VMAP_STACK in some cleaner fashion?
We now have options for non-vmapped stacks, vmapped stacks and a mix
of both.

And what about this comment in arch/Kconfig:VMAP_STACK:

   This is presently incompatible with KASAN because KASAN expects
   the stack to map directly to the KASAN shadow map using a formula
   that is incorrect if the stack is in vmalloc space.


So VMAP_STACK_AS_FALLBACK will intermittently break KASAN?



All of this (including CONFIG_VMAP_STACK) could be turned into a boot option.
I think this would be the best solution.


Re: [PATCH v7 0/3] clocksource/drivers/atcpit100: Add andestech atcpit100 timer

2018-02-20 Thread Greentime Hu
2018-02-13 17:13 GMT+08:00 Greentime Hu :
> Hi, all:
>
> ATCPIT100 is often used on the Andes architecture.
> This timer provides 4 PIT channels. Each PIT channel is a
> multi-function timer that can be configured as a 32-, 16-, or 8-bit timer
> or as a PWM.
>
> For system timer it will set channel 1 32-bit timer0 as clock
> source and count downwards until underflow and restart again.
>
> It also set channel 0 32-bit timer0 as clock event and count
> downwards until condition match. It will generate an interrupt
> for handling periodically.
>
> Changes in v7:
>  - Fix atcpit100_clkevt_next_event(): the clock source timer must be
>    disabled before the reload register is set, and re-enabled afterwards.
>    Without this modification, the test case 'clock_nanosleep02' of
>    ltp_20170929 fails.
>
> Changes in v6:
>  - Select TIMER_OF in drivers/clocksource/Kconfig instead of
>    arch/nds32/Kconfig
>  - Refine Kconfig
>  - Update license format to SPDX-License-Identifier
>
>
> Rick Chen (3):
>   clocksource/drivers/atcpit100: Add andestech atcpit100 timer
>   clocksource/drivers/atcpit100: VDSO support
>   dt-bindings: timer: Add andestech atcpit100 timer binding doc
>
>  .../bindings/timer/andestech,atcpit100-timer.txt   |  33 +++
>  drivers/clocksource/Kconfig|   9 +
>  drivers/clocksource/Makefile   |   1 +
>  drivers/clocksource/timer-atcpit100.c  | 266 
> +
>  4 files changed, 309 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/timer/andestech,atcpit100-timer.txt
>  create mode 100644 drivers/clocksource/timer-atcpit100.c
>
Hi, Daniel:

Please merge this driver for 4.17 to go along with the nds32
architecture support.
Thank you.


[PATCH v4] iommu/amd: Add support for fast IOTLB flushing

2018-02-20 Thread Suravee Suthikulpanit
Since AMD IOMMU driver currently flushes all TLB entries
when page size is more than one, use the same interface
for both iommu_ops.flush_iotlb_all() and iommu_ops.iotlb_sync().

Cc: Joerg Roedel 
Signed-off-by: Suravee Suthikulpanit 
---
Changes from v3 (https://patchwork.kernel.org/patch/10193235)
 * Change amd_iommu_iotlb_range_add() to no-op and iotlb_sync()
   to full domain flush for now since we currently flush all entries
   when the page size is more than one.
 * Fine-grained invalidation will be introduced in subsequent
   patch series.

 drivers/iommu/amd_iommu.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index fed8059..6061a8d 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3043,9 +3043,6 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, 
unsigned long iova,
unmap_size = iommu_unmap_page(domain, iova, page_size);
mutex_unlock(&domain->api_lock);
 
-   domain_flush_tlb_pde(domain);
-   domain_flush_complete(domain);
-
return unmap_size;
 }
 
@@ -3163,6 +3160,19 @@ static bool amd_iommu_is_attach_deferred(struct 
iommu_domain *domain,
return dev_data->defer_attach;
 }
 
+static void amd_iommu_flush_iotlb_all(struct iommu_domain *domain)
+{
+   struct protection_domain *dom = to_pdomain(domain);
+
+   domain_flush_tlb_pde(dom);
+   domain_flush_complete(dom);
+}
+
+static void amd_iommu_iotlb_range_add(struct iommu_domain *domain,
+ unsigned long iova, size_t size)
+{
+}
+
 const struct iommu_ops amd_iommu_ops = {
.capable = amd_iommu_capable,
.domain_alloc = amd_iommu_domain_alloc,
@@ -3181,6 +3191,9 @@ static bool amd_iommu_is_attach_deferred(struct 
iommu_domain *domain,
.apply_resv_region = amd_iommu_apply_resv_region,
.is_attach_deferred = amd_iommu_is_attach_deferred,
.pgsize_bitmap  = AMD_IOMMU_PGSIZES,
+   .flush_iotlb_all = amd_iommu_flush_iotlb_all,
+   .iotlb_range_add = amd_iommu_iotlb_range_add,
+   .iotlb_sync = amd_iommu_flush_iotlb_all,
 };
 
 /*
-- 
1.8.3.1



Re: [PATCH 05/10] hwmon: generic-pwm-tachometer: Add generic PWM based tachometer

2018-02-20 Thread Mikko Perttunen
AIUI, the PWM framework already exposes a sysfs node with period 
information. We should just use that instead of adding a new driver for 
this.


In any case, we cannot add something like this to device tree since it's 
not a hardware device.


Mikko

On 21.02.2018 08:58, Rajkumar Rampelli wrote:

Add a generic PWM-based tachometer driver via the HWMON interface
to report the RPM of a motor. This driver gets the period/duty
cycle from the PWM IP, which captures the motor PWM output.

This driver implements a simple interface for monitoring the speed of
a fan and exposes it in rotations per minute (RPM) to the user space
by using the hwmon's sysfs interface
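
The period-to-RPM conversion done by show_rpm() in the patch below can be sketched in userspace roughly as follows. This assumes, as the patch appears to, that one captured period corresponds to one rotation; period_ns_to_rpm() and DEMO_NSEC_PER_SEC are names made up for this illustration.

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000ULL

/*
 * Convert a captured period (nanoseconds per rotation) to RPM, rounded
 * to the nearest integer. A zero period yields 0, matching the guard
 * in show_rpm().
 */
static unsigned int period_ns_to_rpm(uint64_t period_ns)
{
    uint64_t numerator = 60ULL * DEMO_NSEC_PER_SEC;

    if (period_ns == 0)
        return 0;

    /* Round-to-nearest division for unsigned operands. */
    return (unsigned int)((numerator + period_ns / 2) / period_ns);
}
```

For example, a captured period of 20 ms (20000000 ns) corresponds to 3000 RPM.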

Signed-off-by: Rajkumar Rampelli 
---
 Documentation/hwmon/generic-pwm-tachometer |  17 +
 drivers/hwmon/Kconfig  |  10 +++
 drivers/hwmon/Makefile |   1 +
 drivers/hwmon/generic-pwm-tachometer.c | 112 +
 4 files changed, 140 insertions(+)
 create mode 100644 Documentation/hwmon/generic-pwm-tachometer
 create mode 100644 drivers/hwmon/generic-pwm-tachometer.c

diff --git a/Documentation/hwmon/generic-pwm-tachometer 
b/Documentation/hwmon/generic-pwm-tachometer
new file mode 100644
index 000..e0713ee
--- /dev/null
+++ b/Documentation/hwmon/generic-pwm-tachometer
@@ -0,0 +1,17 @@
+Kernel driver generic-pwm-tachometer
+
+
+This driver enables the use of a PWM module to monitor a fan. It uses the
+generic PWM interface and can be used on SoCs as long as the SoC supports
+a Tachometer controller that monitors the Fan speed in periods.
+
+Author: Rajkumar Rampelli 
+
+Description
+---
+
+The driver implements a simple interface for monitoring the Fan speed using
+PWM module and Tachometer controller. It requests period value through PWM
+capture interface to Tachometer and measures the Rotations per minute using
+received period value. It exposes the Fan speed in RPM to the user space by
+using the hwmon's sysfs interface.
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index ef23553..8912dcb 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -1878,6 +1878,16 @@ config SENSORS_XGENE
  If you say yes here you get support for the temperature
  and power sensors for APM X-Gene SoC.

+config GENERIC_PWM_TACHOMETER
+   tristate "Generic PWM based tachometer driver"
+   depends on PWM
+   help
+ Enables a driver to use PWM signal from motor to use
+ for measuring the motor speed. The RPM is captured by
+ PWM modules which have PWM capture capability and this
+ driver reads the captured data from PWM IP to convert
+ it to speed in RPM.
+
 if ACPI

 comment "ACPI drivers"
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index f814b4a..9dcc374 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -175,6 +175,7 @@ obj-$(CONFIG_SENSORS_WM8350)+= wm8350-hwmon.o
 obj-$(CONFIG_SENSORS_XGENE)+= xgene-hwmon.o

 obj-$(CONFIG_PMBUS)+= pmbus/
+obj-$(CONFIG_GENERIC_PWM_TACHOMETER) += generic-pwm-tachometer.o

 ccflags-$(CONFIG_HWMON_DEBUG_CHIP) := -DDEBUG

diff --git a/drivers/hwmon/generic-pwm-tachometer.c 
b/drivers/hwmon/generic-pwm-tachometer.c
new file mode 100644
index 000..9354d43
--- /dev/null
+++ b/drivers/hwmon/generic-pwm-tachometer.c
@@ -0,0 +1,112 @@
+/*
+ * Copyright (c) 2017-2018, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct pwm_hwmon_tach {
+   struct device   *dev;
+   struct pwm_device   *pwm;
+   struct device   *hwmon;
+};
+
+static ssize_t show_rpm(struct device *dev, struct device_attribute *attr,
+   char *buf)
+{
+   struct pwm_hwmon_tach *ptt = dev_get_drvdata(dev);
+   struct pwm_device *pwm = ptt->pwm;
+   struct pwm_capture result;
+   int err;
+   unsigned int rpm = 0;
+
+   err = pwm_capture(pwm, &result, 0);
+   if (err < 0) {
+   dev_err(ptt->dev, "Failed to capture PWM: %d\n", err);
+   return err;
+   }
+
+   if (result.period)
+   rpm = DIV_ROUND_CLOSEST_ULL(60ULL * NSEC_PER_SEC,
+   result.period);
+
+   return sprintf(buf, "%u\n", rpm);
+}
+
+static SENSOR_DEVICE_ATTR(rpm, 0444, show_rpm, NULL, 0);
+
+static struct attribute *pwm_tach_attrs[] = {
+   &sensor_dev_attr_rpm.dev_attr.attr,

Re: [PATCH 0/6] DISCONTIGMEM support for PPC32

2018-02-20 Thread Christophe LEROY



Le 20/02/2018 à 17:14, Jonathan Neuschäfer a écrit :

This patchset adds support for DISCONTIGMEM on 32-bit PowerPC. This is
required to properly support the Nintendo Wii's memory layout, in which
there are two blocks of RAM and MMIO in the middle.

Previously, this memory layout was handled by code that joins the two
RAM blocks into one, reserves the MMIO hole, and permits allocations of
reserved memory in ioremap. This hack didn't work with resource-based
allocation (as used for example in the GPIO driver for Wii[1]), however.

After this patchset, users of the Wii can either select CONFIG_FLATMEM
to get the old behaviour, or CONFIG_DISCONTIGMEM to get the new
behaviour.


My question might be stupid, as I don't know PPC64 in depth, but when
looking at page_is_ram() in arch/powerpc/mm/mem.c, I have the feeling
that PPC64 implements RAM by blocks. Isn't that what you are trying to
achieve? Wouldn't it be feasible to map to what's done in PPC64 for PPC32?


Christophe



Some parts of this patchset are probably not ideal (I'm thinking of my
implementation of pfn_to_nid here), and will require some discussion/
changes.

[1]: https://www.spinics.net/lists/devicetree/msg213956.html

Jonathan Neuschäfer (6):
   powerpc/mm/32: Use pfn_valid to check if pointer is in RAM
   powerpc: numa: Fix overshift on PPC32
   powerpc: numa: Use the right #ifdef guards around functions
   powerpc: numa: Restrict fake NUMA emulation to CONFIG_NUMA systems
   powerpc: Implement DISCONTIGMEM and allow selection on PPC32
   powerpc: wii: Don't rely on reserved memory hack if DISCONTIGMEM is
 set

  arch/powerpc/Kconfig |  5 -
  arch/powerpc/include/asm/mmzone.h| 21 +
  arch/powerpc/mm/numa.c   | 18 +++---
  arch/powerpc/mm/pgtable_32.c |  2 +-
  arch/powerpc/platforms/embedded6xx/wii.c | 10 +++---
  5 files changed, 48 insertions(+), 8 deletions(-)



Re: [PATCH v3] iommu/amd: Add support for fast IOTLB flushing

2018-02-20 Thread Suravee Suthikulpanit

Hi Joerg,

On 2/13/18 8:29 PM, Joerg Roedel wrote:

Hi Suravee,

thanks for working on this.

On Wed, Jan 31, 2018 at 12:01:14AM -0500, Suravee Suthikulpanit wrote:

+static void amd_iommu_iotlb_range_add(struct iommu_domain *domain,
+ unsigned long iova, size_t size)
+{
+   struct amd_iommu_flush_entries *entry, *p;
+   unsigned long flags;
+   bool found = false;
+
+   spin_lock_irqsave(&amd_iommu_flush_list_lock, flags);


I am not happy with introducing or using global locks when they are not
necessary. Can this be a per-domain lock?

Besides, did you check it makes sense to actually keep track of the
ranges here? My approach would be to just make iotlb_range_add() a no-op
and do a full domain flush in iotlb_sync(). But maybe you did
measurements you can share here to show there is a benefit.



Joerg



Alright, I'll send out v4 w/ iotlb_range_add() as no-op, and iotlb_sync()
as full domain flush. This should be sufficient to get started with adopting
the fast TLB flushing interface.

I'll submit support for fine-grain TLB invalidation as a separate series.
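
The agreed approach (no-op range tracking, one full flush at sync time) can be modelled with a small userspace sketch. The demo_* types and functions are hypothetical and only count operations; they are not the kernel's iommu_ops callbacks or the AMD driver's flush routines.

```c
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's iommu API. */
struct demo_domain {
    unsigned int full_flushes;   /* full-domain flushes issued so far */
    unsigned int pending_unmaps; /* unmaps since the last flush */
};

/* Unmap only records the change; it no longer flushes on its own. */
static void demo_unmap(struct demo_domain *dom, unsigned long iova,
                       size_t size)
{
    (void)iova;
    (void)size;
    dom->pending_unmaps++;
}

/* No-op: ranges are not tracked because sync flushes everything. */
static void demo_iotlb_range_add(struct demo_domain *dom,
                                 unsigned long iova, size_t size)
{
    (void)dom;
    (void)iova;
    (void)size;
}

/* One full-domain flush covers every unmap queued so far. */
static void demo_iotlb_sync(struct demo_domain *dom)
{
    dom->full_flushes++;
    dom->pending_unmaps = 0;
}
```

In this model, a batch of N unmaps followed by a single sync costs one flush instead of N, which is the saving the fast flushing interface is after.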

Thanks,
Suravee


[PATCH 08/10] arm64: defconfig: enable Nvidia Tegra Tachometer as a module

2018-02-20 Thread Rajkumar Rampelli
Tegra Tachometer driver implements PWM capture to measure
period. Enable this driver as a module in the ARM64 defconfig.

Signed-off-by: Rajkumar Rampelli 
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 634b373..8b2bda7 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -550,6 +550,7 @@ CONFIG_PWM_MESON=m
 CONFIG_PWM_ROCKCHIP=y
 CONFIG_PWM_SAMSUNG=y
 CONFIG_PWM_TEGRA=m
+CONFIG_PWM_TEGRA_TACHOMETER=m
 CONFIG_PHY_RCAR_GEN3_USB2=y
 CONFIG_PHY_HI6220_USB=y
 CONFIG_PHY_QCOM_USB_HS=y
-- 
2.1.4



[PATCH 07/10] arm64: tegra: Add PWM based Tachometer support on Tegra186

2018-02-20 Thread Rajkumar Rampelli
Add PWM-based Tachometer support on Tegra186 to measure
the number of rotations of a Fan per minute by using the PWM
capture interface.

Signed-off-by: Rajkumar Rampelli 
---
 arch/arm64/boot/dts/nvidia/tegra186.dtsi | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index 8f2d598..37149e9 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -1042,4 +1042,10 @@
reset-names = "tachometer";
status = "disabled";
};
+
+   generic_pwm_tachometer {
+   compatible = "generic-pwm-tachometer";
+   pwms = <&tegra_tachometer 0 100>;
+   status = "disabled";
+   };
 };
-- 
2.1.4



[PATCH 10/10] arm64: tegra: Add PWM controller on Tegra186 soc

2018-02-20 Thread Rajkumar Rampelli
The NVIDIA Tegra186 SoC has a PWM controller which is
used for fan control.

Signed-off-by: Rajkumar Rampelli 
---
 arch/arm64/boot/dts/nvidia/tegra186.dtsi | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index 37149e9..c6f154e 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -1032,6 +1032,17 @@
interrupt-parent = <&gic>;
};
 
+   pwm@c34 {
+   compatible = "nvidia,tegra186-pwm";
+   reg = <0x0 0xc34 0x0 0x1>;
+   clocks = <&bpmp TEGRA186_CLK_PWM4>;
+   clock-names = "pwm";
+   #pwm-cells = <2>;
+   resets = <&bpmp TEGRA186_RESET_PWM4>;
+   reset-names = "pwm";
+   status = "disabled";
+   };
+
tegra_tachometer: tachometer@39c {
compatible = "nvidia,tegra186-pwm-tachometer";
reg = <0x0 0x039c 0x0 0x10>;
-- 
2.1.4



[PATCH 09/10] arm64: defconfig: Enable Generic PWM based Tachometer driver

2018-02-20 Thread Rajkumar Rampelli
Enable Generic PWM based Tachometer driver which implements a simple
interface for monitoring the speed of a fan in rotations per minute,
and exposes it to the user space by using the hwmon's sysfs interface.
Enable this driver as a module in the ARM64 defconfig.

Signed-off-by: Rajkumar Rampelli 
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 8b2bda7..1b29109 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -321,6 +321,7 @@ CONFIG_BATTERY_BQ27XXX=y
 CONFIG_SENSORS_ARM_SCPI=y
 CONFIG_SENSORS_LM90=m
 CONFIG_SENSORS_INA2XX=m
+CONFIG_GENERIC_PWM_TACHOMETER=m
 CONFIG_THERMAL_GOV_POWER_ALLOCATOR=y
 CONFIG_CPU_THERMAL=y
 CONFIG_THERMAL_EMULATION=y
-- 
2.1.4



[PATCH 05/10] hwmon: generic-pwm-tachometer: Add generic PWM based tachometer

2018-02-20 Thread Rajkumar Rampelli
Add a generic PWM based tachometer driver, exposed via the HWMON interface,
to report the RPM of a motor. This driver gets the period/duty
cycle from the PWM IP which captures the motor PWM output.

This driver implements a simple interface for monitoring the speed of
a fan and exposes it in rotations per minute (RPM) to user space
by using the hwmon sysfs interface.

Signed-off-by: Rajkumar Rampelli 
---
 Documentation/hwmon/generic-pwm-tachometer |  17 +
 drivers/hwmon/Kconfig  |  10 +++
 drivers/hwmon/Makefile |   1 +
 drivers/hwmon/generic-pwm-tachometer.c | 112 +
 4 files changed, 140 insertions(+)
 create mode 100644 Documentation/hwmon/generic-pwm-tachometer
 create mode 100644 drivers/hwmon/generic-pwm-tachometer.c

diff --git a/Documentation/hwmon/generic-pwm-tachometer b/Documentation/hwmon/generic-pwm-tachometer
new file mode 100644
index 000..e0713ee
--- /dev/null
+++ b/Documentation/hwmon/generic-pwm-tachometer
@@ -0,0 +1,17 @@
+Kernel driver generic-pwm-tachometer
+
+
+This driver enables the use of a PWM module to monitor a fan. It uses the
+generic PWM interface and can be used on SoCs as long as the SoC supports a
+Tachometer controller that monitors the fan speed in periods.
+
+Author: Rajkumar Rampelli 
+
+Description
+---
+
+The driver implements a simple interface for monitoring the Fan speed using
+PWM module and Tachometer controller. It requests period value through PWM
+capture interface to Tachometer and measures the Rotations per minute using
+received period value. It exposes the Fan speed in RPM to the user space by
+using the hwmon's sysfs interface.
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index ef23553..8912dcb 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -1878,6 +1878,16 @@ config SENSORS_XGENE
  If you say yes here you get support for the temperature
  and power sensors for APM X-Gene SoC.
 
+config GENERIC_PWM_TACHOMETER
+   tristate "Generic PWM based tachometer driver"
+   depends on PWM
+   help
+ Enables a driver that uses the PWM signal from a motor
+ for measuring the motor speed. The RPM is captured by
+ PWM modules which have PWM capture capability and this
+ driver reads the captured data from the PWM IP to convert
+ it to speed in RPM.
+
 if ACPI
 
 comment "ACPI drivers"
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index f814b4a..9dcc374 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -175,6 +175,7 @@ obj-$(CONFIG_SENSORS_WM8350)+= wm8350-hwmon.o
 obj-$(CONFIG_SENSORS_XGENE)+= xgene-hwmon.o
 
 obj-$(CONFIG_PMBUS)+= pmbus/
+obj-$(CONFIG_GENERIC_PWM_TACHOMETER) += generic-pwm-tachometer.o
 
 ccflags-$(CONFIG_HWMON_DEBUG_CHIP) := -DDEBUG
 
diff --git a/drivers/hwmon/generic-pwm-tachometer.c b/drivers/hwmon/generic-pwm-tachometer.c
new file mode 100644
index 000..9354d43
--- /dev/null
+++ b/drivers/hwmon/generic-pwm-tachometer.c
@@ -0,0 +1,112 @@
+/*
+ * Copyright (c) 2017-2018, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct pwm_hwmon_tach {
+   struct device   *dev;
+   struct pwm_device   *pwm;
+   struct device   *hwmon;
+};
+
+static ssize_t show_rpm(struct device *dev, struct device_attribute *attr,
+   char *buf)
+{
+   struct pwm_hwmon_tach *ptt = dev_get_drvdata(dev);
+   struct pwm_device *pwm = ptt->pwm;
+   struct pwm_capture result;
+   int err;
+   unsigned int rpm = 0;
+
+   err = pwm_capture(pwm, &result, 0);
+   if (err < 0) {
+   dev_err(ptt->dev, "Failed to capture PWM: %d\n", err);
+   return err;
+   }
+
+   if (result.period)
+   rpm = DIV_ROUND_CLOSEST_ULL(60ULL * NSEC_PER_SEC,
+   result.period);
+
+   return sprintf(buf, "%u\n", rpm);
+}
+
+static SENSOR_DEVICE_ATTR(rpm, 0444, show_rpm, NULL, 0);
+
+static struct attribute *pwm_tach_attrs[] = {
+   &sensor_dev_attr_rpm.dev_attr.attr,
+   NULL,
+};
+
+ATTRIBUTE_GROUPS(pwm_tach);
+
+static int pwm_tach_probe(struct platform_device *pdev)
+{
+   struct pwm_hwmon_tach *ptt;
+   int err;
+
+   ptt = devm_kzalloc(&pdev->dev, sizeof(*ptt), GFP_KERNEL);
+   if (!ptt)
+   return -ENOMEM;
+
+   ptt
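
The patch text is cut off at this point in the archive. The conversion done
in show_rpm() above can still be illustrated in user space. This is a sketch
under assumptions: period_ns_to_rpm() is a hypothetical helper, it returns a
64-bit value rather than the driver's unsigned int, and it treats one captured
period as one revolution, since the excerpt does not show whether the period
has already been scaled by the pulses per revolution.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Convert a captured period in nanoseconds to revolutions per minute,
 * rounding to the nearest integer. A zero period yields 0, matching the
 * "if (result.period)" guard in the patch.
 */
uint64_t period_ns_to_rpm(uint64_t period_ns)
{
	if (period_ns == 0)
		return 0;
	/* Adding half the divisor before dividing rounds to nearest. */
	return (60ULL * NSEC_PER_SEC + period_ns / 2) / period_ns;
}
```

For example, a period of 30 ms (30000000 ns) corresponds to 2000 RPM.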

[PATCH 06/10] arm64: tegra: Add Tachometer Controller on Tegra186

2018-02-20 Thread Rajkumar Rampelli
The NVIDIA Tegra186 SoC has a Tachometer Controller that analyzes the
PWM signal of a fan and reports the period value through the PWM interface.

Signed-off-by: Rajkumar Rampelli 
---
 arch/arm64/boot/dts/nvidia/tegra186-p2771-.dts |  5 +
 arch/arm64/boot/dts/nvidia/tegra186.dtsi   | 11 +++
 2 files changed, 16 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra186-p2771-.dts b/arch/arm64/boot/dts/nvidia/tegra186-p2771-.dts
index bd5305a..13c3e59 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186-p2771-.dts
+++ b/arch/arm64/boot/dts/nvidia/tegra186-p2771-.dts
@@ -172,4 +172,9 @@
vin-supply = <&vdd_5v0_sys>;
};
};
+
+   tachometer@39c {
+   nvidia,pulse-per-rev = <2>;
+   nvidia,capture-window-len = <2>;
+   };
 };
diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index b762227..8f2d598 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -1031,4 +1031,15 @@
(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>;
interrupt-parent = <&gic>;
};
+
+   tegra_tachometer: tachometer@39c {
+   compatible = "nvidia,tegra186-pwm-tachometer";
+   reg = <0x0 0x039c 0x0 0x10>;
+   #pwm-cells = <2>;
+   clocks = <&bpmp TEGRA186_CLK_TACH>;
+   clock-names = "tachometer";
+   resets = <&bpmp TEGRA186_RESET_TACH>;
+   reset-names = "tachometer";
+   status = "disabled";
+   };
 };
-- 
2.1.4



[PATCH 04/10] hwmon: generic-pwm-tachometer: Add DT binding details

2018-02-20 Thread Rajkumar Rampelli
Add DT binding details for the PWM based generic tachometer
driver, which gets the period of the PWM tach output from a fan
via a PWM IP that is capable of capturing the signal.

Signed-off-by: Rajkumar Rampelli 
---
 .../bindings/hwmon/generic-pwm-tachometer.txt  | 25 ++
 1 file changed, 25 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/hwmon/generic-pwm-tachometer.txt

diff --git a/Documentation/devicetree/bindings/hwmon/generic-pwm-tachometer.txt b/Documentation/devicetree/bindings/hwmon/generic-pwm-tachometer.txt
new file mode 100644
index 000..3541fe5
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwmon/generic-pwm-tachometer.txt
@@ -0,0 +1,25 @@
+Device tree bindings for fan tach output connected to PWM controller with
+PWM capture capability.
+
+Required properties:
+- compatible : Should be "generic-pwm-tachometer"
+- pwms   : PWM handle. Please refer to the pwm.txt DT binding for more details.
+
+Example:
+   tegra_tachometer: tachometer@39c {
+   compatible = "nvidia,tegra186-pwm-tachometer";
+   reg = <0x0 0x039c 0x0 0x10>;
+   clocks = <&bpmp_clks TEGRA194_CLK_TACH>;
+   clock-names = "tachometer";
+   resets = <&bpmp_resets TEGRA194_RESET_TACH>;
+   reset-names = "tachometer";
+   nvidia,pulse-per-rev = <2>;
+   nvidia,sampling-window = <2>;
+   status = "okay";
+   };
+
+   generic_pwm_tachometer {
+   compatible = "generic-pwm-tachometer";
+   pwms = <&tegra_tachometer 0 100>;
+   status = "okay";
+   };
-- 
2.1.4



[PATCH 03/10] pwm: tegra: Add PWM based Tachometer driver

2018-02-20 Thread Rajkumar Rampelli
The PWM Tachometer driver captures the PWM signal, which is generally the
output of a fan, and provides the period of the PWM signal, which is
converted to RPM by software.

Add the Tegra Tachometer driver, which implements pwm-capture to
measure the period.

Signed-off-by: Rajkumar Rampelli 
Signed-off-by: Laxman Dewangan 
---
 drivers/pwm/Kconfig|  10 ++
 drivers/pwm/Makefile   |   1 +
 drivers/pwm/pwm-tegra-tachometer.c | 303 +
 3 files changed, 314 insertions(+)
 create mode 100644 drivers/pwm/pwm-tegra-tachometer.c

diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
index 763ee50..29aeeeb 100644
--- a/drivers/pwm/Kconfig
+++ b/drivers/pwm/Kconfig
@@ -454,6 +454,16 @@ config PWM_TEGRA
  To compile this driver as a module, choose M here: the module
  will be called pwm-tegra.
 
+config PWM_TEGRA_TACHOMETER
+   tristate "NVIDIA Tegra Tachometer PWM driver"
+   depends on ARCH_TEGRA
+   help
+ NVIDIA Tegra Tachometer reads the PWM signal and reports the PWM
+ signal periods. This helps in measuring the fan speed where the
+ fan's speed output is a PWM signal.
+
+ This driver supports the Tachometer in the PWM framework.
+
 config  PWM_TIECAP
tristate "ECAP PWM support"
depends on ARCH_OMAP2PLUS || ARCH_DAVINCI_DA8XX || ARCH_KEYSTONE
diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
index 0258a74..14c183e 100644
--- a/drivers/pwm/Makefile
+++ b/drivers/pwm/Makefile
@@ -45,6 +45,7 @@ obj-$(CONFIG_PWM_STM32_LP)+= pwm-stm32-lp.o
 obj-$(CONFIG_PWM_STMPE)+= pwm-stmpe.o
 obj-$(CONFIG_PWM_SUN4I)+= pwm-sun4i.o
 obj-$(CONFIG_PWM_TEGRA)+= pwm-tegra.o
+obj-$(CONFIG_PWM_TEGRA_TACHOMETER) += pwm-tegra-tachometer.o
 obj-$(CONFIG_PWM_TIECAP)   += pwm-tiecap.o
 obj-$(CONFIG_PWM_TIEHRPWM) += pwm-tiehrpwm.o
 obj-$(CONFIG_PWM_TIPWMSS)  += pwm-tipwmss.o
diff --git a/drivers/pwm/pwm-tegra-tachometer.c b/drivers/pwm/pwm-tegra-tachometer.c
new file mode 100644
index 000..1304e47
--- /dev/null
+++ b/drivers/pwm/pwm-tegra-tachometer.c
@@ -0,0 +1,303 @@
+/*
+ * Tegra Tachometer Pulse-Width-Modulation driver
+ *
+ * Copyright (c) 2017-2018, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* Since oscillator clock (38.4MHz) serves as a clock source for
+ * the tach input controller, 1.0105263MHz (i.e. 38.4/38) has to be
+ * used as a clock value in the RPM calculations
+ */
+#define TACH_COUNTER_CLK   1010526
+
+#define TACH_FAN_TACH0 0x0
+#define TACH_FAN_TACH0_PERIOD_MASK 0x7
+#define TACH_FAN_TACH0_PERIOD_MAX  0x7
+#define TACH_FAN_TACH0_PERIOD_MIN  0x0
+#define TACH_FAN_TACH0_WIN_LENGTH_SHIFT25
+#define TACH_FAN_TACH0_WIN_LENGTH_MASK 0x3
+#define TACH_FAN_TACH0_OVERFLOW_MASK   BIT(24)
+
+#define TACH_FAN_TACH1 0x4
+#define TACH_FAN_TACH1_HI_MASK 0x7
+/*
+ * struct pwm_tegra_tach - Tegra tachometer object
+ * @dev: device providing the Tachometer
+ * @pulse_per_rev: Pulses per revolution of a Fan
+ * @capture_window_len: Defines the window of the FAN TACH monitor
+ * @regs: physical base addresses of the controller
+ * @clk: phandle list of tachometer clocks
+ * @rst: phandle to reset the controller
+ * @chip: PWM chip providing this PWM device
+ */
+struct pwm_tegra_tach {
+   struct device   *dev;
+   void __iomem*regs;
+   struct clk  *clk;
+   struct reset_control*rst;
+   u32 pulse_per_rev;
+   u32 capture_window_len;
+   struct pwm_chip chip;
+};
+
+static struct pwm_tegra_tach *to_tegra_pwm_chip(struct pwm_chip *chip)
+{
+   return container_of(chip, struct pwm_tegra_tach, chip);
+}
+
+static u32 tachometer_readl(struct pwm_tegra_tach *ptt, unsigned long reg)
+{
+   return readl(ptt->regs + reg);
+}
+
+static inline void tachometer_writel(struct pwm_tegra_tach *ptt, u32 val,
+unsigned long reg)
+{
+   writel(val, ptt->regs + reg);
+}
+
+static int pwm_tegra_tach_set_wlen(struct pwm_tegra_tach *ptt,
+  u32 window_length)
+{
+   u32 tach0, wlen;
+
+   /*
+* As per FAN Spec, the window length value should 
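
The patch text is cut off at this point in the archive. The arithmetic behind
TACH_COUNTER_CLK in the comment above can be checked with a short user-space
sketch. The macro and function names below are hypothetical, and
tach_ticks_to_ns() is an assumption about how a tick count might be turned
into a duration; the excerpt does not show the driver's own conversion.

```c
#include <stdint.h>

#define OSC_CLK_HZ        38400000ULL	/* 38.4 MHz oscillator, from the comment */
#define TACH_CLK_DIVIDER  38ULL		/* divider, from the comment */

/* Counter clock in Hz; integer division truncates 1010526.3 to 1010526. */
uint64_t tach_counter_clk_hz(void)
{
	return OSC_CLK_HZ / TACH_CLK_DIVIDER;
}

/* Hypothetical helper: duration of `ticks` counter ticks in nanoseconds. */
uint64_t tach_ticks_to_ns(uint64_t ticks)
{
	return ticks * 1000000000ULL / tach_counter_clk_hz();
}
```

The truncated quotient matches the value 1010526 used in the #define.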

[PATCH 02/10] dt-bindings: Tegra186 tachometer device tree bindings

2018-02-20 Thread Rajkumar Rampelli
Supply Device tree binding documentation for the NVIDIA
Tegra186 SoC's Tachometer Controller

Signed-off-by: Rajkumar Rampelli 
---
 .../bindings/pwm/pwm-tegra-tachometer.txt  | 31 ++
 1 file changed, 31 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pwm/pwm-tegra-tachometer.txt

diff --git a/Documentation/devicetree/bindings/pwm/pwm-tegra-tachometer.txt b/Documentation/devicetree/bindings/pwm/pwm-tegra-tachometer.txt
new file mode 100644
index 000..4a7ead4
--- /dev/null
+++ b/Documentation/devicetree/bindings/pwm/pwm-tegra-tachometer.txt
@@ -0,0 +1,31 @@
+Bindings for a PWM based Tachometer driver
+
+Required properties:
+- compatible: Must be "nvidia,tegra186-pwm-tachometer"
+- reg: physical base addresses of the controller and length of
+  memory mapped region.
+- #pwm-cells: should be 2. See pwm.txt in this directory for a
+  description of the cells format.
+- clocks: phandle list of tachometer clocks
+- clock-names: should be "tachometer". See clock-bindings.txt in documentations
+- resets: phandle to the reset controller for the Tachometer IP
+- reset-names: should be "tachometer". See reset.txt in documentations
+- nvidia,pulse-per-rev: Integer, pulses per revolution of a fan. This value is
+  obtained from the fan specification document.
+- nvidia,capture-window-len: Integer, window of the fan tach monitor. It
+  indicates how many periods of the input fan tach signal the FAN TACH logic
+  will monitor. Valid values are 1, 2, 4 and 8 only.
+
+Example:
+   tegra_tachometer: tachometer@39c {
+   compatible = "nvidia,tegra186-pwm-tachometer";
+   reg = <0x0 0x039c 0x0 0x10>;
+   #pwm-cells = <2>;
+   clocks = <&tegra_car TEGRA186_CLK_TACH>;
+   clock-names = "tachometer";
+   resets = <&tegra_car TEGRA186_RESET_TACH>;
+   reset-names = "tachometer";
+   nvidia,pulse-per-rev = <2>;
+   nvidia,capture-window-len = <2>;
+   status = "disabled";
+   };
-- 
2.1.4
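
The binding above restricts nvidia,capture-window-len to 1, 2, 4 and 8. A
driver would normally reject anything else at probe time; the following is a
user-space sketch of such a check. Both function names are hypothetical, and
the mapping to a 2-bit field holding log2(len) is purely an assumption for
illustration: the binding does not describe the register encoding.

```c
#include <stdbool.h>
#include <stdint.h>

/* The binding allows only 1, 2, 4 and 8 periods per capture window. */
bool tach_window_len_valid(uint32_t len)
{
	return len == 1 || len == 2 || len == 4 || len == 8;
}

/*
 * ASSUMPTION: a 2-bit field holding log2(len). The real encoding is not
 * shown in the binding. Returns -1 for an invalid length.
 */
int tach_window_len_to_field(uint32_t len)
{
	int field = 0;

	if (!tach_window_len_valid(len))
		return -1;
	while (len > 1) {
		len >>= 1;
		field++;
	}
	return field;
}
```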



[PATCH 01/10] pwm: core: Add support for PWM HW driver with pwm capture only

2018-02-20 Thread Rajkumar Rampelli
Add support for pwm HW driver which has only capture functionality.
This helps to implement the PWM based Tachometer driver which reads
the PWM output signals from electronic fans.

PWM Tachometer captures the period and duty cycle of the PWM signal

Signed-off-by: Rajkumar Rampelli 
---
 drivers/pwm/core.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/pwm/core.c b/drivers/pwm/core.c
index 1581f6a..87d14c9 100644
--- a/drivers/pwm/core.c
+++ b/drivers/pwm/core.c
@@ -246,6 +246,10 @@ static bool pwm_ops_check(const struct pwm_ops *ops)
if (ops->apply)
return true;
 
+   /* driver supports capture operation */
+   if (ops->capture)
+   return true;
+
return false;
 }
 
-- 
2.1.4
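
The effect of the hunk above can be modelled in user space. This is a sketch,
not the kernel code: struct model_pwm_ops and its simplified callback
signatures are hypothetical, and the legacy branch (config, enable and
disable) is reproduced from memory of kernels of that era, since it is not
part of the quoted context.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pwm_ops: only presence of a callback matters. */
struct model_pwm_ops {
	int (*config)(void);
	int (*enable)(void);
	int (*disable)(void);
	int (*apply)(void);
	int (*capture)(void);
};

bool model_pwm_ops_check(const struct model_pwm_ops *ops)
{
	/* legacy, non-atomic drivers provide config, enable and disable */
	if (ops->config && ops->enable && ops->disable)
		return true;

	/* atomic drivers provide apply */
	if (ops->apply)
		return true;

	/* added by the patch: capture-only drivers are accepted too */
	if (ops->capture)
		return true;

	return false;
}

/* Dummy callback used to mark an operation as implemented. */
int model_dummy_op(void)
{
	return 0;
}
```

With this change a driver that only implements capture, such as a tachometer,
passes the check, while a driver with no operations at all is still rejected.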



[PATCH 00/10] Implementation of Tegra Tachometer driver

2018-02-20 Thread Rajkumar Rampelli
The following patches add support for a PWM based Tegra Tachometer driver
which implements the PWM capture interface to analyze the PWM signal of an
electronic fan and report it in periods and duty cycles.

A generic PWM Tachometer is implemented to monitor the speed of a fan in RPM
using the PWM interface. The RPM of the fan is exposed to user space through
the HWMON sysfs interface, available at the location below:
/sys/devices/platform/generic_pwm_tachometer/hwmon/hwmon0/rpm

Steps to validate Tachometer:
A. Push the modules pwm-tegra.ko, pwm-tegra-tachometer.ko and
   generic-pwm-tachometer.ko to the Linux device using the scp command.
   scp build/tegra186/drivers/pwm/pwm-tegra.ko ubuntu@10.19.65.176:/tmp/
   scp build/tegra186/drivers/pwm/pwm-tegra-tachometer.ko ubuntu@10.19.65.176:/tmp/
   scp build/tegra186/drivers/hwmon/generic-pwm-tachometer.ko ubuntu@10.19.65.176:/tmp/
B. On the Linux device console, insert these modules using the insmod command.
   insmod /tmp/pwm-tegra.ko
   insmod /tmp/pwm-tegra-tachometer.ko
   insmod /tmp/generic-pwm-tachometer.ko
C. Read the RPM value at the sysfs node below.
   cat /sys/devices/platform/generic_pwm_tachometer/hwmon/hwmon0/rpm
D. Change the fan speed using the PWM sysfs interface. Follow the steps below:
   a. cd /sys/class/pwm/pwmchip0
   b. ls -la (make sure pwm controller is c34.pwm)
      Output should be: device -> ../../../c34.pwm
   c. echo 0 > export
   d. cd pwmchip0:0
   e. echo 8000 > period
   f. echo 1 > enable
   g. echo 7000 > duty_cycle # change duty_cycle from 0 to 7000 and observe the fan speed
   h. cat /sys/devices/platform/generic_pwm_tachometer/hwmon/hwmon0/rpm
   i. echo 4000 > duty_cycle
   j. cat /sys/devices/platform/generic_pwm_tachometer/hwmon/hwmon0/rpm
   k. echo 2000 > duty_cycle
   l. cat /sys/devices/platform/generic_pwm_tachometer/hwmon/hwmon0/rpm
   m. echo 0 > duty_cycle
   n. cat /sys/devices/platform/generic_pwm_tachometer/hwmon/hwmon0/rpm
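
The repeated "cat .../rpm" reads in the steps above could also be done from a
small C helper, for instance in a test program that sweeps the duty cycle.
This is a sketch only: parse_rpm() and read_rpm_node() are hypothetical names,
and the node is assumed to hold a single unsigned decimal followed by a
newline, as the sprintf("%u\n") in the hwmon patch suggests.

```c
#include <stdio.h>

/* Parse one unsigned decimal value from text; 0 on success, -1 on failure. */
int parse_rpm(const char *text, unsigned int *rpm)
{
	return sscanf(text, "%u", rpm) == 1 ? 0 : -1;
}

/*
 * Read the first line of a sysfs-style text file and parse it.
 * Returns 0 on success and -1 if the file cannot be opened, read or parsed.
 */
int read_rpm_node(const char *path, unsigned int *rpm)
{
	char line[32];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(line, sizeof(line), f))
		ret = parse_rpm(line, rpm);
	fclose(f);
	return ret;
}
```

A caller would pass the rpm node path given in step C and compare the value
read back after each duty_cycle change.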

Rajkumar Rampelli (10):
  pwm: core: Add support for PWM HW driver with pwm capture only
  dt-bindings: Tegra186 tachometer device tree bindings
  pwm: tegra: Add PWM based Tachometer driver
  hwmon: generic-pwm-tachometer: Add DT binding details
  hwmon: generic-pwm-tachometer: Add generic PWM based tachometer
  arm64: tegra: Add Tachometer Controller on Tegra186
  arm64: tegra: Add PWM based Tachometer support on Tegra186
  arm64: defconfig: enable Nvidia Tegra Tachometer as a module
  arm64: defconfig: Enable Generic PWM based Tachometer driver
  arm64: tegra: Add PWM controller on Tegra186 soc

 .../bindings/hwmon/generic-pwm-tachometer.txt  |  25 ++
 .../bindings/pwm/pwm-tegra-tachometer.txt  |  31 +++
 Documentation/hwmon/generic-pwm-tachometer |  17 ++
 arch/arm64/boot/dts/nvidia/tegra186-p2771-.dts |   5 +
 arch/arm64/boot/dts/nvidia/tegra186.dtsi   |  28 ++
 arch/arm64/configs/defconfig   |   2 +
 drivers/hwmon/Kconfig  |  10 +
 drivers/hwmon/Makefile |   1 +
 drivers/hwmon/generic-pwm-tachometer.c | 112 
 drivers/pwm/Kconfig|  10 +
 drivers/pwm/Makefile   |   1 +
 drivers/pwm/core.c |   4 +
 drivers/pwm/pwm-tegra-tachometer.c | 303 +
 13 files changed, 549 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/hwmon/generic-pwm-tachometer.txt
 create mode 100644 Documentation/devicetree/bindings/pwm/pwm-tegra-tachometer.txt
 create mode 100644 Documentation/hwmon/generic-pwm-tachometer
 create mode 100644 drivers/hwmon/generic-pwm-tachometer.c
 create mode 100644 drivers/pwm/pwm-tegra-tachometer.c

-- 
2.1.4



Re: [PATCH] audit: return on memory error to avoid null pointer dereference

2018-02-20 Thread Richard Guy Briggs
On 2018-02-21 01:47, Richard Guy Briggs wrote:
> If there is a memory allocation error when trying to change an audit
> kernel feature value, the ignored allocation error will trigger a NULL
> pointer dereference oops on subsequent use of that pointer.  Return
> instead.
> 
> See: https://github.com/linux-audit/audit-kernel/issues/76
> Signed-off-by: Richard Guy Briggs 

Self-NACK.  It was based on local development and won't compile on
upstream.  Fix pending.

> ---
>  kernel/audit.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/audit.c b/kernel/audit.c
> index 196d327..31cb11d 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -1063,6 +1063,8 @@ static void audit_log_feature_change(int which, u32 old_feature, u32 new_feature
>   return;
>  
>   ab = audit_log_start(context, GFP_KERNEL, AUDIT_FEATURE_CHANGE);
> + if (!ab)
> + return;
>   audit_log_task_info(ab, current);
>   audit_log_format(ab, " feature=%s old=%u new=%u old_lock=%u new_lock=%u 
> res=%d",
>audit_feature_names[which], !!old_feature, 
> !!new_feature,
> -- 
> 1.8.3.1
> 

- RGB

--
Richard Guy Briggs 
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635


[PATCH] audit: return on memory error to avoid null pointer dereference

2018-02-20 Thread Richard Guy Briggs
If there is a memory allocation error when trying to change an audit
kernel feature value, the ignored allocation error will trigger a NULL
pointer dereference oops on subsequent use of that pointer.  Return
instead.

See: https://github.com/linux-audit/audit-kernel/issues/76
Signed-off-by: Richard Guy Briggs 
---
 kernel/audit.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/audit.c b/kernel/audit.c
index 196d327..31cb11d 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1063,6 +1063,8 @@ static void audit_log_feature_change(int which, u32 old_feature, u32 new_feature
return;
 
ab = audit_log_start(context, GFP_KERNEL, AUDIT_FEATURE_CHANGE);
+   if (!ab)
+   return;
audit_log_task_info(ab, current);
audit_log_format(ab, " feature=%s old=%u new=%u old_lock=%u new_lock=%u 
res=%d",
 audit_feature_names[which], !!old_feature, 
!!new_feature,
-- 
1.8.3.1



Re: [PATCH] ocxl: Add get_metadata IOCTL to share OCXL information to userspace

2018-02-20 Thread Balbir Singh
On Wed, Feb 21, 2018 at 3:57 PM, Alastair D'Silva  wrote:
> From: Alastair D'Silva 
>
> Some required information is not exposed to userspace currently (eg. the
> PASID), pass this information back, along with other information which
> is currently communicated via sysfs, which saves some parsing effort in
> userspace.
>
> Signed-off-by: Alastair D'Silva 
> ---
>  drivers/misc/ocxl/file.c | 27 +++
>  include/uapi/misc/ocxl.h | 22 ++
>  2 files changed, 49 insertions(+)
>
> diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c
> index d9aa407db06a..11514a8444e5 100644
> --- a/drivers/misc/ocxl/file.c
> +++ b/drivers/misc/ocxl/file.c
> @@ -102,10 +102,32 @@ static long afu_ioctl_attach(struct ocxl_context *ctx,
> return rc;
>  }
>
> +static long afu_ioctl_get_metadata(struct ocxl_context *ctx,
> +   struct ocxl_ioctl_get_metadata __user *uarg)

Why do we call this metadata? Isn't this an afu_descriptor?

> +{
> +   struct ocxl_ioctl_get_metadata arg;
> +
> +   memset(&arg, 0, sizeof(arg));
> +
> +   arg.version = 0;

Does it make sense to have version 0? Even if it does, you can afford
to skip initialization due to the memset above. I prefer that versions
start with 1.

> +
> +   arg.afu_version_major = ctx->afu->config.version_major;
> +   arg.afu_version_minor = ctx->afu->config.version_minor;
> +   arg.pasid = ctx->pasid;
> +   arg.pp_mmio_size = ctx->afu->config.pp_mmio_stride;
> +   arg.global_mmio_size = ctx->afu->config.global_mmio_size;
> +
> +   if (copy_to_user(uarg, &arg, sizeof(arg)))
> +   return -EFAULT;
> +
> +   return 0;
> +}
> +
>  #define CMD_STR(x) (x == OCXL_IOCTL_ATTACH ? "ATTACH" :  
>   \
> x == OCXL_IOCTL_IRQ_ALLOC ? "IRQ_ALLOC" :   \
> x == OCXL_IOCTL_IRQ_FREE ? "IRQ_FREE" : \
> x == OCXL_IOCTL_IRQ_SET_FD ? "IRQ_SET_FD" : \
> +   x == OCXL_IOCTL_GET_METADATA ? "GET_METADATA" : \
> "UNKNOWN")
>
>  static long afu_ioctl(struct file *file, unsigned int cmd,
> @@ -157,6 +179,11 @@ static long afu_ioctl(struct file *file, unsigned int cmd,
> irq_fd.eventfd);
> break;
>
> +   case OCXL_IOCTL_GET_METADATA:
> +   rc = afu_ioctl_get_metadata(ctx,
> +   (struct ocxl_ioctl_get_metadata __user *) 
> args);
> +   break;
> +
> default:
> rc = -EINVAL;
> }
> diff --git a/include/uapi/misc/ocxl.h b/include/uapi/misc/ocxl.h
> index 4b0b0b756f3e..16e1f48ce280 100644
> --- a/include/uapi/misc/ocxl.h
> +++ b/include/uapi/misc/ocxl.h
> @@ -32,6 +32,27 @@ struct ocxl_ioctl_attach {
> __u64 reserved3;
>  };
>
> +/*
> + * Version contains the version of the struct.
> + * Versions will always be backwards compatible, that is, new versions will not
> + * alter existing fields
> + */
> +struct ocxl_ioctl_get_metadata {

This sounds more like a function name, do we need it to be _get_metadata?

> +   __u16 version;
> +
> +   // Version 0 fields
> +   __u8  afu_version_major;
> +   __u8  afu_version_minor;
> +   __u32 pasid;
> +
> +   __u64 pp_mmio_size;
> +   __u64 global_mmio_size;
> +

Should we document the fields? pp_ stands for per process, but that is not
very clear at first glance. Why do we return only the size? What
about the LPC size?

> +   // End version 0 fields
> +
> +   __u64 reserved[13]; // Total of 16*u64
> +};


Balbir Singh.
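
The "Total of 16*u64" comment in the proposed struct can be verified at
compile time. The sketch below mirrors the layout with stdint types in user
space; struct model_ocxl_metadata is a hypothetical name, and the checks
assume a typical ABI where uint64_t is 8-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space mirror of the proposed struct, with fixed-width stdint types. */
struct model_ocxl_metadata {
	uint16_t version;

	/* Version 0 fields */
	uint8_t  afu_version_major;
	uint8_t  afu_version_minor;
	uint32_t pasid;

	uint64_t pp_mmio_size;		/* per-process MMIO size */
	uint64_t global_mmio_size;
	/* End version 0 fields */

	uint64_t reserved[13];		/* pads the struct to 16 * 8 bytes */
};

/* The first four fields pack into 8 bytes with no implicit padding. */
_Static_assert(offsetof(struct model_ocxl_metadata, pasid) == 4,
	       "pasid should sit at byte offset 4");
_Static_assert(offsetof(struct model_ocxl_metadata, pp_mmio_size) == 8,
	       "pp_mmio_size should sit at byte offset 8");
_Static_assert(sizeof(struct model_ocxl_metadata) == 16 * sizeof(uint64_t),
	       "struct should total 16 u64 words");
```

Fields added in a later version would have to be carved out of reserved[] so
that the total size and the existing offsets stay fixed.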


Re: [PATCH -mm -v5 RESEND] mm, swap: Fix race between swapoff and some swap operations

2018-02-20 Thread huang ying
On Wed, Feb 21, 2018 at 7:38 AM, Andrew Morton wrote:
> On Sun, 18 Feb 2018 09:06:47 +0800 huang ying wrote:
>
>> >> >> +struct swap_info_struct *get_swap_device(swp_entry_t entry)
>> >> >> +{
>> >> >> +  struct swap_info_struct *si;
>> >> >> +  unsigned long type, offset;
>> >> >> +
>> >> >> +  if (!entry.val)
>> >> >> +  goto out;
>> >> >> +  type = swp_type(entry);
>> >> >> +  if (type >= nr_swapfiles)
>> >> >> +  goto bad_nofile;
>> >> >> +  si = swap_info[type];
>> >> >> +
>> >> >> +  preempt_disable();
>> >> >
>> >> > This preempt_disable() is later than I'd expect.  If a well-timed race
>> >> > occurs, `si' could now be pointing at a defunct entry.  If that
>> >> > well-timed race include a swapoff AND a swapon, `si' could be pointing
>> >> > at the info for a new device?
>> >>
>> >> struct swap_info_struct pointed to by swap_info[] will never be freed.
>> >> During swapoff, we only free the memory pointed to by the fields of
>> >> struct swap_info_struct.  And when swapon, we will always reuse
>> >> swap_info[type] if it's not NULL.  So it should be safe to dereference
>> >> swap_info[type] with preemption enabled.
>> >
>> > That's my point.  If there's a race window during which there is a
>> > parallel swapoff+swapon, this swap_info_struct may now be in use for a
>> > different device?
>>
>> Yes.  It's possible.  And the caller of get_swap_device() can live
>> with it if the swap_info_struct has been fully initialized.  For
>> example, for the race in the patch description,
>>
>> do_swap_page
>>   swapin_readahead
>> __read_swap_cache_async
>>   swapcache_prepare
>> __swap_duplicate
>>
>> in __swap_duplicate(), it's possible that the swap device returned by
>> get_swap_device() is different from the swap device when
>> __swap_duplicate() calls get_swap_device().  But the swap_info_struct
>> has been fully initialized, so __swap_duplicate() can reference
>> si->swap_map[] safely.  And we will check si->swap_map[] before any
>> further operation.  Even if the swap entry is swapped out again for
>> the new swap device, we will check the page table again in
>> do_swap_page().  So there is no functionality problem.
>
> That's rather revolting.  Can we tighten this up?  Or at least very
> loudly document it?

TBH, I think my original fix patch which uses a reference count in
swap_info_struct is easier to understand.  But I understand it has
its own drawbacks too.  Anyway, unless there are some better ideas to
resolve this, I will send out a new version with more documentation.

Best Regards,
Huang, Ying


/kbuild/src/stress/include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_buf_queue'

2018-02-20 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   af3e79d29555b97dd096e2f8e36a0f50213808a8
commit: b46dc8ae17a427c50c00241898832807576fd28a media: videobuf2: fix up for "media: annotate ->poll() instances"
date:   2 weeks ago
config: x86_64-randconfig-s5-02211303 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
git checkout b46dc8ae17a427c50c00241898832807576fd28a
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/media/common/videobuf2/videobuf2-core.o: In function `__read_once_size':
>> /kbuild/src/stress/include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_buf_queue'
>> /kbuild/src/stress/include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_buf_queue'
>> /kbuild/src/stress/include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_buf_done'
>> /kbuild/src/stress/include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_buf_done'
>> /kbuild/src/stress/include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_qbuf'
>> /kbuild/src/stress/include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_qbuf'
>> /kbuild/src/stress/include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_dqbuf'
>> /kbuild/src/stress/include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_dqbuf'
   drivers/media/common/videobuf2/videobuf2-core.o: In function `vb2_core_streamon':
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-core.c:1737: undefined reference to `v4l_vb2q_enable_media_source'
   drivers/media/common/videobuf2/videobuf2-core.o:(__jump_table+0x10): undefined reference to `__tracepoint_vb2_buf_queue'
   drivers/media/common/videobuf2/videobuf2-core.o:(__jump_table+0x28): undefined reference to `__tracepoint_vb2_buf_done'
   drivers/media/common/videobuf2/videobuf2-core.o:(__jump_table+0x40): undefined reference to `__tracepoint_vb2_qbuf'
   drivers/media/common/videobuf2/videobuf2-core.o:(__jump_table+0x58): undefined reference to `__tracepoint_vb2_dqbuf'
   drivers/media/common/videobuf2/videobuf2-v4l2.o: In function `vb2_poll':
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:678: undefined reference to `video_devdata'
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:685: undefined reference to `v4l2_event_pending'
   drivers/media/common/videobuf2/videobuf2-v4l2.o: In function `vb2_ioctl_reqbufs':
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:714: undefined reference to `video_devdata'
   drivers/media/common/videobuf2/videobuf2-v4l2.o: In function `vb2_ioctl_create_bufs':
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:733: undefined reference to `video_devdata'
   drivers/media/common/videobuf2/videobuf2-v4l2.o: In function `vb2_ioctl_prepare_buf':
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:759: undefined reference to `video_devdata'
   drivers/media/common/videobuf2/videobuf2-v4l2.o: In function `vb2_ioctl_querybuf':
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:769: undefined reference to `video_devdata'
   drivers/media/common/videobuf2/videobuf2-v4l2.o: In function `vb2_ioctl_qbuf':
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:778: undefined reference to `video_devdata'
   drivers/media/common/videobuf2/videobuf2-v4l2.o:/kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:788: more undefined references to `video_devdata' follow
   drivers/media/common/videobuf2/videobuf2-v4l2.o: In function `_vb2_fop_release':
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:848: undefined reference to `v4l2_fh_release'
   drivers/media/common/videobuf2/videobuf2-v4l2.o: In function `vb2_fop_release':
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:854: undefined reference to `video_devdata'
   drivers/media/common/videobuf2/videobuf2-v4l2.o: In function `vb2_fop_write':
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:864: undefined reference to `video_devdata'
   drivers/media/common/videobuf2/videobuf2-v4l2.o: In function `vb2_fop_read':
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:888: 
undefined reference to `video_devdata'
   drivers/media/common/videobuf2/videobuf2-v4l2.o: In function `vb2_fop_poll':
   /kbuild/src/stress/drivers/media/common/videobuf2/videobuf2-v4l2.c:911: 
undefined reference to `video_devdata'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip


Re: [RFC PATCH V2 13/22] x86/intel_rdt: Support schemata write - pseudo-locking core

2018-02-20 Thread Reinette Chatre
Hi Mike,

On 2/20/2018 5:58 PM, Mike Kravetz wrote:
> On 02/20/2018 03:21 PM, Thomas Gleixner wrote:
>> On Tue, 20 Feb 2018, Reinette Chatre wrote:
>>> On 2/20/2018 9:15 AM, Thomas Gleixner wrote:
>>>> On Tue, 13 Feb 2018, Reinette Chatre wrote:
>>>>
>>>> Now the remaining thing is the memory allocation and the mmap itself. I
>>>> really dislike the preallocation of memory right at setup time. Ideally
>>>> that should be an allocation of the application itself, but the horrid
>>>> wbinvd stuff kinda prevents that. With that restriction we are more or less
>>>> bound to immediate allocation and population.
>>>
>>> Acknowledged. I am not sure if the current permissions would support
>>> such a dynamic setup though. At this time the system administrator is
>>> the one that sets up the pseudo-locked region and can through
>>> permissions of the character device provide access to these regions to
>>> user space applications.
>>
>> You still would need some interface, e.g. character device which allows you
>> to hand in the pointer to the user allocated memory and do the cache
>> priming. So you could use the same permission setup for that character
>> device.
>>
>> The other problem is that we'd need to have MAP_CONTIG first so you
>> actually can allocate physically contiguous memory from user space. Mike is
>> working on that, but it's not available today. The only way to do so today
>> (with lots of waste) would be MAP_HUGETLB, which might be an acceptable
>> constraint up to the point where MAP_CONTIG is available.
> 
> Just to clarify, there is not any activity on exposing a general purpose
> MAP_CONTIG interface to user space.  When initially proposed, MAP_CONTIG
> was shot down and the suggestion was to create a new in kernel interface
> to make allocation of contiguous pages easier.  The initial use case was
> a driver which could use the new internal interface as part of its
> mmap() routine to give contiguous regions to user space.
> 
> Reinette is using this new interface, but that must be for the "immediate
> allocation" case you are trying to move away from.  Sorry, I have not been
> following development of this feature.
> 
> If you would have to create a device to accept a user buffer, could you
> perhaps use the same device to create/hand out a contiguous mapping?

Thank you very much for keeping an eye on this discussion. I do still
intend to implement the immediate allocation case by using the new
find_alloc_contig_pages()/free_contig_pages().

Reinette
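
As a rough illustration of the MAP_HUGETLB route mentioned above, a minimal
user-space sketch might look like the following. The helper names are
hypothetical, the 2 MiB huge page size is an assumption (the x86-64 default,
not a universal value), and the mapping fails unless huge pages have been
reserved on the system. Each huge page is physically contiguous only within
itself, and rounding up to a whole huge page is where the waste comes from.

```c
/* Needed on glibc for MAP_HUGETLB and MAP_ANONYMOUS under strict -std modes. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumed huge page size: 2 MiB is the x86-64 default, not universal. */
#define ASSUMED_HUGE_PAGE_SIZE (2UL * 1024 * 1024)

/* Hypothetical helper: round len up to a whole number of huge pages. */
static size_t round_up_to_huge_page(size_t len)
{
	return (len + ASSUMED_HUGE_PAGE_SIZE - 1) & ~(ASSUMED_HUGE_PAGE_SIZE - 1);
}

/*
 * Hypothetical helper: map an anonymous region backed by huge pages.
 * Returns MAP_FAILED (with errno set) if no huge pages are available.
 */
static void *map_huge_region(size_t len)
{
	return mmap(NULL, round_up_to_huge_page(len), PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
}
```

A caller would check the result against MAP_FAILED and release the region
with munmap() using the same rounded-up length.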




RE: [RFC PATCH] mmc: sdhci-of-arasan: Add auto tuning support for ZynqMP Platform

2018-02-20 Thread Manish Narani
Hi Adrian,


> -----Original Message-----
> From: Adrian Hunter [mailto:adrian.hun...@intel.com]
> Sent: Friday, February 16, 2018 7:37 PM
> To: Manish Narani ; michal.si...@xilinx.com;
> ulf.hans...@linaro.org; linux-arm-ker...@lists.infradead.org; linux-
> m...@vger.kernel.org; linux-kernel@vger.kernel.org;
> devicet...@vger.kernel.org; mark.rutl...@arm.com; robh...@kernel.org
> Cc: Anirudha Sarangi ; Srinivas Goud
> ; Manish Narani 
> Subject: Re: [RFC PATCH] mmc: sdhci-of-arasan: Add auto tuning support for
> ZynqMP Platform
> 
> On 30/01/18 20:14, Manish Narani wrote:
> > This patch adds support of SD auto tuning for ZynqMP platform. Auto
> > tuning sequence sends tuning block to card when operating in UHS-1
> > modes. This resets the DLL and sends CMD19/CMD21 as a part of the auto
> > tuning process. Once the auto tuning process gets completed, reset the
> > DLL to load the newly obtained SDHC tuned tap value.
> 
> How is this different from:
>   1. reset the dll
>   2. call sdhci_execute_tuning
>   3. reset the dll
> 
Thanks for your comments. I am looking into this and will get back to you.

Thanks,
- Manish
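
For reference, the three-step alternative Adrian describes can be sketched
with stand-in callbacks. Everything below is hypothetical user-space
scaffolding (struct tuning_ops, tune_with_dll_reset() and the trace helpers
are invented for illustration); only the ordering of reset, tune, reset comes
from the question above. Skipping the second reset when tuning fails is an
assumption, not something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the driver hooks discussed above. */
struct tuning_ops {
	void (*dll_reset)(void *ctx);
	int (*execute_tuning)(void *ctx); /* plays the role of sdhci_execute_tuning() */
};

/* Reset the DLL, run tuning, then reset again to load the tuned tap value. */
static int tune_with_dll_reset(const struct tuning_ops *ops, void *ctx)
{
	int err;

	ops->dll_reset(ctx);
	err = ops->execute_tuning(ctx);
	if (err)
		return err; /* assumption: no second reset after a failed tuning */
	ops->dll_reset(ctx);
	return 0;
}

/* Scaffolding that records the call order as a string such as "RTR". */
struct trace {
	char log[8];
	size_t n;
	int tuning_result;
};

static void trace_reset(void *ctx)
{
	struct trace *t = ctx;

	t->log[t->n++] = 'R';
}

static int trace_tune(void *ctx)
{
	struct trace *t = ctx;

	t->log[t->n++] = 'T';
	return t->tuning_result;
}
```

With the trace helpers plugged in, a successful run records "RTR" and a
failed tuning records "RT".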

> >
> > Signed-off-by: Manish Narani 
> > ---
> >  .../devicetree/bindings/mmc/arasan,sdhci.txt   |   1 +
> >  drivers/mmc/host/sdhci-of-arasan.c | 219
> -
> >  2 files changed, 219 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/devicetree/bindings/mmc/arasan,sdhci.txt
> > b/Documentation/devicetree/bindings/mmc/arasan,sdhci.txt
> > index 60481bf..7d29751 100644
> > --- a/Documentation/devicetree/bindings/mmc/arasan,sdhci.txt
> > +++ b/Documentation/devicetree/bindings/mmc/arasan,sdhci.txt
> > @@ -14,6 +14,7 @@ Required Properties:
> >  - "arasan,sdhci-4.9a": generic Arasan SDHCI 4.9a PHY
> >  - "arasan,sdhci-5.1": generic Arasan SDHCI 5.1 PHY
> >  - "rockchip,rk3399-sdhci-5.1", "arasan,sdhci-5.1": rk3399 eMMC
> > PHY
> > +- "xlnx,zynqmp-8.9a": Xilinx ZynqMP 8.9a PHY
> >For this device it is strongly suggested to include 
> > arasan,soc-ctl-syscon.
> >- reg: From mmc bindings: Register location and length.
> >- clocks: From clock bindings: Handles to clock inputs.
> > diff --git a/drivers/mmc/host/sdhci-of-arasan.c
> > b/drivers/mmc/host/sdhci-of-arasan.c
> > index 0720ea7..7673db4 100644
> > --- a/drivers/mmc/host/sdhci-of-arasan.c
> > +++ b/drivers/mmc/host/sdhci-of-arasan.c
> > @@ -24,15 +24,18 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include "sdhci-pltfm.h"
> >  #include 
> > +#include 
> >
> >  #define SDHCI_ARASAN_VENDOR_REGISTER   0x78
> >
> >  #define VENDOR_ENHANCED_STROBE BIT(0)
> >
> >  #define PHY_CLK_TOO_SLOW_HZ40
> > +#define MAX_TUNING_LOOP40
> >
> >  /*
> >   * On some SoCs the syscon area has a feature where the upper 16-bits
> > of @@ -88,6 +91,7 @@ struct sdhci_arasan_data {
> > struct sdhci_host *host;
> > struct clk  *clk_ahb;
> > struct phy  *phy;
> > +   u32 device_id;
> > boolis_phy_on;
> >
> > struct clk_hw   sdcardclk_hw;
> > @@ -157,6 +161,213 @@ static int sdhci_arasan_syscon_write(struct
> sdhci_host *host,
> > return ret;
> >  }
> >
> > +/**
> > + * arasan_zynqmp_dll_reset - Issue the DLL reset.
> > + * @deviceid:  Unique Id of device
> > + */
> > +void zynqmp_dll_reset(u8 deviceid)
> > +{
> > +   const struct zynqmp_eemi_ops *eemi_ops = get_eemi_ops();
> > +
> > +   if (!eemi_ops || !eemi_ops->ioctl)
> > +   return;
> > +
> > +   /* Issue DLL Reset */
> > +   if (deviceid == 0)
> > +   eemi_ops->ioctl(NODE_SD_0, IOCTL_SD_DLL_RESET,
> > +   PM_DLL_RESET_PULSE, 0, NULL);
> > +   else
> > +   eemi_ops->ioctl(NODE_SD_1, IOCTL_SD_DLL_RESET,
> > +   PM_DLL_RESET_PULSE, 0, NULL); }
> > +
> > +static void arasan_zynqmp_dll_reset(struct sdhci_host *host, u8
> > +deviceid) {
> > +   u16 clk;
> > +   unsigned long timeout;
> > +
> > +   clk = sdhci_readw(host, SDHCI_CLOCK_CONTROL);
> > +   clk &= ~(SDHCI_CLOCK_CARD_EN | SDHCI_CLOCK_INT_EN);
> > +   sdhci_writew(host, clk, SDHCI_CLOCK_CONTROL);
> > +
> > +   /* Issue DLL Reset */
> > +   zynqmp_dll_reset(deviceid);
> > +
> > +   clk = sdhci_readw(host, SDHCI_CLOCK_CONTROL);
> > +   clk |= SDHCI_CLOCK_INT_EN;
> > +   sdhci_writew(host, clk, SDHCI_CLOCK_CONTROL);
> > +
> > +   /* Wait max 20 ms */
> > +   timeout = 20;
> > +   while (!((clk = sdhci_readw(host, SDHCI_CLOCK_CONTROL))
> > +   & SDHCI_CLOCK_INT_STABLE)) {
> > +   if (timeout == 0) {
> > +   dev_err(mmc_dev(host->mmc),
> > +   ": Internal clock never stabilised.\n");
> > +   return;
> > + 

Re: [RFCv4 10/21] videodev2.h: Add request_fd field to v4l2_buffer

2018-02-20 Thread Alexandre Courbot
On Wed, Feb 21, 2018 at 12:20 AM, Hans Verkuil  wrote:
> On 02/20/18 05:44, Alexandre Courbot wrote:
>> From: Hans Verkuil 
>>
>> When queuing buffers allow for passing the request that should
>> be associated with this buffer.
>>
>> Signed-off-by: Hans Verkuil 
>> [acour...@chromium.org: make request ID 32-bit]
>> Signed-off-by: Alexandre Courbot 
>> ---
>>  drivers/media/common/videobuf2/videobuf2-v4l2.c | 2 +-
>>  drivers/media/usb/cpia2/cpia2_v4l.c | 2 +-
>>  drivers/media/v4l2-core/v4l2-compat-ioctl32.c   | 9 ++---
>>  drivers/media/v4l2-core/v4l2-ioctl.c| 4 ++--
>>  include/uapi/linux/videodev2.h  | 3 ++-
>>  5 files changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c 
>> b/drivers/media/common/videobuf2/videobuf2-v4l2.c
>> index 886a2d8d5c6c..6d4d184aa68e 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
>> @@ -203,7 +203,7 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, 
>> void *pb)
>>   b->timestamp = ns_to_timeval(vb->timestamp);
>>   b->timecode = vbuf->timecode;
>>   b->sequence = vbuf->sequence;
>> - b->reserved2 = 0;
>> + b->request_fd = 0;
>>   b->reserved = 0;
>>
>>   if (q->is_multiplanar) {
>> diff --git a/drivers/media/usb/cpia2/cpia2_v4l.c 
>> b/drivers/media/usb/cpia2/cpia2_v4l.c
>> index 99f106b13280..af42ce3ceb48 100644
>> --- a/drivers/media/usb/cpia2/cpia2_v4l.c
>> +++ b/drivers/media/usb/cpia2/cpia2_v4l.c
>> @@ -948,7 +948,7 @@ static int cpia2_dqbuf(struct file *file, void *fh, 
>> struct v4l2_buffer *buf)
>>   buf->sequence = cam->buffers[buf->index].seq;
>>   buf->m.offset = cam->buffers[buf->index].data - cam->frame_buffer;
>>   buf->length = cam->frame_size;
>> - buf->reserved2 = 0;
>> + buf->request_fd = 0;
>>   buf->reserved = 0;
>>   memset(&buf->timecode, 0, sizeof(buf->timecode));
>>
>> diff --git a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c 
>> b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
>> index 5198c9eeb348..32bf47489a2e 100644
>> --- a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
>> +++ b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
>> @@ -386,7 +386,7 @@ struct v4l2_buffer32 {
>>   __s32   fd;
>>   } m;
>>   __u32   length;
>> - __u32   reserved2;
>> + __s32   request_fd;
>>   __u32   reserved;
>>  };
>>
>> @@ -486,6 +486,7 @@ static int get_v4l2_buffer32(struct v4l2_buffer __user 
>> *kp,
>>  {
>>   u32 type;
>>   u32 length;
>> + s32 request_fd;
>>   enum v4l2_memory memory;
>>   struct v4l2_plane32 __user *uplane32;
>>   struct v4l2_plane __user *uplane;
>> @@ -500,7 +501,9 @@ static int get_v4l2_buffer32(struct v4l2_buffer __user 
>> *kp,
>>   get_user(memory, &up->memory) ||
>>   put_user(memory, &kp->memory) ||
>>   get_user(length, &up->length) ||
>> - put_user(length, &kp->length))
>> + put_user(length, &kp->length) ||
>> + get_user(request_fd, &up->request_fd) ||
>> + put_user(request_fd, &kp->request_fd))
>>   return -EFAULT;
>>
>>   if (V4L2_TYPE_IS_OUTPUT(type))
>> @@ -604,7 +607,7 @@ static int put_v4l2_buffer32(struct v4l2_buffer __user 
>> *kp,
>>   assign_in_user(&up->timestamp.tv_usec, &kp->timestamp.tv_usec) ||
>>   copy_in_user(&up->timecode, &kp->timecode, sizeof(kp->timecode)) ||
>>   assign_in_user(&up->sequence, &kp->sequence) ||
>> - assign_in_user(&up->reserved2, &kp->reserved2) ||
>> + assign_in_user(&up->request_fd, &kp->request_fd) ||
>>   assign_in_user(&up->reserved, &kp->reserved) ||
>>   get_user(length, &kp->length) ||
>>   put_user(length, &up->length))
>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c 
>> b/drivers/media/v4l2-core/v4l2-ioctl.c
>> index 260288ca4f55..7bfeaf233d5a 100644
>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>> @@ -437,13 +437,13 @@ static void v4l_print_buffer(const void *arg, bool 
>> write_only)
>>   const struct v4l2_plane *plane;
>>   int i;
>>
>> - pr_cont("%02ld:%02d:%02d.%08ld index=%d, type=%s, flags=0x%08x, 
>> field=%s, sequence=%d, memory=%s",
>> + pr_cont("%02ld:%02d:%02d.%08ld index=%d, type=%s, request_fd=%u, 
>> flags=0x%08x, field=%s, sequence=%d, memory=%s",
>>   p->timestamp.tv_sec / 3600,
>>   (int)(p->timestamp.tv_sec / 60) % 60,
>>   (int)(p->timestamp.tv_sec % 60),
>>   (long)p->timestamp.tv_usec,
>>   p->index,
>> - prt_names(p->type, v4l2_type_names),
>> + prt_names(p->type, v4l2_type_names), p->request_fd,
>>   p->flags, prt_names(p->field, v4l2_field_names),
>>
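
The rename from reserved2 to request_fd in the quoted hunks leaves the
structure layout unchanged, because an unsigned and a signed 32-bit field have
the same size and alignment. A reduced sketch with hypothetical struct names
(the real struct v4l2_buffer has many more members) checks this at compile
time:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical tail of the structure before the rename. */
struct buf_tail_old {
	uint32_t length;
	uint32_t reserved2;
	uint32_t reserved;
};

/* Same tail after the rename: request_fd takes the slot of reserved2. */
struct buf_tail_new {
	uint32_t length;
	int32_t  request_fd;
	uint32_t reserved;
};

/* The size and every member offset must be identical in both variants. */
_Static_assert(sizeof(struct buf_tail_old) == sizeof(struct buf_tail_new),
	       "structure size must not change");
_Static_assert(offsetof(struct buf_tail_old, reserved2) ==
	       offsetof(struct buf_tail_new, request_fd),
	       "request_fd must reuse the reserved2 slot");
_Static_assert(offsetof(struct buf_tail_old, reserved) ==
	       offsetof(struct buf_tail_new, reserved),
	       "following members must not move");
```

If any of the assertions failed, the compiler would reject the file, which is
a cheap way to guard a user-visible layout.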

Re: [RFCv4 13/21] media: videobuf2-v4l2: support for requests

2018-02-20 Thread Alexandre Courbot
On Wed, Feb 21, 2018 at 1:18 AM, Hans Verkuil  wrote:
> On 02/20/2018 05:44 AM, Alexandre Courbot wrote:
>> Add a new vb2_qbuf_request() (a request-aware version of vb2_qbuf())
>> that request-aware drivers can call to queue a buffer into a request
>> instead of directly into the vb2 queue if relevant.
>>
>> This function expects that drivers invoking it are using instances of
>> v4l2_request_entity and v4l2_request_entity_data to describe their
>> entity and entity data respectively.
>>
>> Also add the vb2_request_submit() helper function which drivers can
>> invoke in order to queue all the buffers previously queued into a
>> request into the regular vb2 queue.
>>
>> Signed-off-by: Alexandre Courbot 
>> ---
>>  .../media/common/videobuf2/videobuf2-v4l2.c   | 129 +-
>>  include/media/videobuf2-v4l2.h|  59 
>>  2 files changed, 187 insertions(+), 1 deletion(-)
>>
>
> 
>
>> @@ -776,10 +899,14 @@ EXPORT_SYMBOL_GPL(vb2_ioctl_querybuf);
>>  int vb2_ioctl_qbuf(struct file *file, void *priv, struct v4l2_buffer *p)
>>  {
>>   struct video_device *vdev = video_devdata(file);
>> + struct v4l2_fh *fh = NULL;
>> +
>> + if (test_bit(V4L2_FL_USES_V4L2_FH, &vdev->flags))
>> + fh = file->private_data;
>
> No need for this. All drivers using vb2 will also use v4l2_fh.

Fixed.

>
>>
>>   if (vb2_queue_is_busy(vdev, file))
>>   return -EBUSY;
>> - return vb2_qbuf(vdev->queue, p);
>> + return vb2_qbuf_request(vdev->queue, p, fh ? fh->entity : NULL);
>>  }
>>  EXPORT_SYMBOL_GPL(vb2_ioctl_qbuf);
>>
>> diff --git a/include/media/videobuf2-v4l2.h b/include/media/videobuf2-v4l2.h
>> index 3d5e2d739f05..d4dfa266a0da 100644
>> --- a/include/media/videobuf2-v4l2.h
>> +++ b/include/media/videobuf2-v4l2.h
>> @@ -23,6 +23,12 @@
>>  #error VB2_MAX_PLANES != VIDEO_MAX_PLANES
>>  #endif
>>
>> +struct media_entity;
>> +struct v4l2_fh;
>> +struct media_request;
>> +struct media_request_entity;
>> +struct v4l2_request_entity_data;
>> +
>>  /**
>>   * struct vb2_v4l2_buffer - video buffer information for v4l2.
>>   *
>> @@ -116,6 +122,59 @@ int vb2_prepare_buf(struct vb2_queue *q, struct 
>> v4l2_buffer *b);
>>   */
>>  int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b);
>>
>> +#if IS_ENABLED(CONFIG_MEDIA_REQUEST_API)
>> +
>> +/**
>> + * vb2_qbuf_request() - Queue a buffer, with request support
>> + * @q:   pointer to &struct vb2_queue with videobuf2 queue.
>> + * @b:   buffer structure passed from userspace to
>> + *   &v4l2_ioctl_ops->vidioc_qbuf handler in driver
>> + * @entity:  request entity to queue for if requests are used.
>> + *
>> + * Should be called from &v4l2_ioctl_ops->vidioc_qbuf handler of a driver.
>> + *
>> + * If requests are not in use, calling this is equivalent to calling 
>> vb2_qbuf().
>> + *
>> + * If the request_fd member of b is set, then the buffer represented by b is
>> + * queued in the request instead of the vb2 queue. The buffer will be passed
>> + * to the vb2 queue when the request is submitted.
>
> I would definitely also prepare the buffer at this time. That way you'll see
> any errors relating to the prepare early on.

I was wondering about that, so glad to have your opinion on this. Will
make sure buffers are prepared before queuing them to a request.
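
One way to picture that change: the buffer is validated (prepared) at the
moment it is bound to a request, so errors surface at QBUF time rather than at
submit time. The sketch below is plain user-space C with hypothetical types
and names (struct sketch_buf, sketch_request_qbuf() and so on); it does not
reflect the actual vb2 implementation. The -EINVAL on an empty submit mirrors
the behaviour documented for vb2_request_submit() in the quoted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_BUFS 4

struct sketch_buf {
	size_t length;
	int prepared;
	int queued;
};

struct sketch_request {
	struct sketch_buf *bufs[SKETCH_MAX_BUFS];
	size_t count;
};

/* Stand-in for the prepare step: reject obviously invalid buffers. */
static int sketch_prepare(struct sketch_buf *b)
{
	if (b->length == 0)
		return -EINVAL;
	b->prepared = 1;
	return 0;
}

/* Bind a buffer to a request, preparing it first so errors show up early. */
static int sketch_request_qbuf(struct sketch_request *req, struct sketch_buf *b)
{
	int err;

	if (req->count == SKETCH_MAX_BUFS)
		return -ENOMEM;
	err = sketch_prepare(b);
	if (err)
		return err;
	req->bufs[req->count++] = b;
	return 0;
}

/* Submit: hand every bound buffer to the (imaginary) queue. */
static int sketch_request_submit(struct sketch_request *req)
{
	size_t i;

	if (req->count == 0)
		return -EINVAL; /* at least one buffer is required */
	for (i = 0; i < req->count; i++)
		req->bufs[i]->queued = 1;
	req->count = 0;
	return 0;
}
```

With this ordering an invalid buffer is refused by sketch_request_qbuf() and
never reaches the request, so sketch_request_submit() only sees prepared
buffers.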

>
>> + *
>> + * The return values from this function are intended to be directly returned
>> + * from &v4l2_ioctl_ops->vidioc_qbuf handler in driver.
>> + */
>> +int vb2_qbuf_request(struct vb2_queue *q, struct v4l2_buffer *b,
>> +  struct media_request_entity *entity);
>> +
>> +/**
>> + * vb2_request_submit() - Queue all the buffers in a v4l2 request.
>> + * @data:request entity data to queue buffers of
>> + *
>> + * This function should be called from the media_request_entity_ops::submit
>> + * hook for instances of media_request_v4l2_entity_data. It will immediately
>> + * queue all the request-bound buffers to their respective vb2 queues.
>> + *
>> + * Errors from vb2_core_qbuf() are returned if something happened. Also, 
>> since
>> + * v4l2 request entities require at least one buffer for the request to 
>> trigger,
>> + * this function will return -EINVAL if no buffer have been bound at all for
>> + * this entity.
>> + */
>> +int vb2_request_submit(struct v4l2_request_entity_data *data);
>> +
>> +#else /* CONFIG_MEDIA_REQUEST_API */
>> +
>> +static inline int vb2_qbuf_request(struct vb2_queue *q, struct v4l2_buffer 
>> *b,
>> +struct media_request_entity *entity)
>> +{
>> + return vb2_qbuf(q, b);
>> +}
>> +
>> +static inline int vb2_request_submit(struct v4l2_request_entity_data *data)
>> +{
>> + return -ENOTSUPP;
>> +}
>> +
>> +#endif /* CONFIG_MEDIA_REQUEST_API */
>> +
>>  /**
>>   * vb2_expbuf() - Export a buffer as a file descriptor
>>   * @q:   pointer to &struct vb2_queue with videobuf2 queue.
>>
>
> Regards,
>
> Hans


Re: [RFCv4 11/21] media: v4l2_fh: add request entity field

2018-02-20 Thread Alexandre Courbot
On Wed, Feb 21, 2018 at 12:24 AM, Hans Verkuil  wrote:
> On 02/20/18 05:44, Alexandre Courbot wrote:
>> Allow drivers to assign a request entity to v4l2_fh. This will be useful
>> for request-aware ioctls to find out which request entity to use.
>>
>> Signed-off-by: Alexandre Courbot 
>> ---
>>  include/media/v4l2-fh.h | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/include/media/v4l2-fh.h b/include/media/v4l2-fh.h
>> index ea73fef8bdc0..f54cb319dd64 100644
>> --- a/include/media/v4l2-fh.h
>> +++ b/include/media/v4l2-fh.h
>> @@ -28,6 +28,7 @@
>>
>>  struct video_device;
>>  struct v4l2_ctrl_handler;
>> +struct media_request_entity;
>>
>>  /**
>>   * struct v4l2_fh - Describes a V4L2 file handler
>> @@ -43,6 +44,7 @@ struct v4l2_ctrl_handler;
>>   * @navailable: number of available events at @available list
>>   * @sequence: event sequence number
>>   * @m2m_ctx: pointer to &struct v4l2_m2m_ctx
>> + * @entity: the request entity this fh operates on behalf of
>>   */
>>  struct v4l2_fh {
>>   struct list_headlist;
>> @@ -60,6 +62,7 @@ struct v4l2_fh {
>>  #if IS_ENABLED(CONFIG_V4L2_MEM2MEM_DEV)
>>   struct v4l2_m2m_ctx *m2m_ctx;
>>  #endif
>> + struct media_request_entity *entity;
>
> The name 'media_request_entity' is very confusing.
>
> In the media controller API terminology an entity represents a piece
> of hardware with inputs and outputs (very rough description), but a
> request is not an entity. It may be associated with an entity, though.
>
> So calling this field 'entity' is also very misleading.

Note that this field is not a request though, it is a pointer to a
piece of hardware referenced by a request, which is closer to the MC
terminology. Or do you mean this should just be renamed
"request_entity"?

If we go all the way, the media_ prefix is also misleading - it
implies a dependency on the media controller framework, while there is
none (in this patchset at least).

However I thought that 'request' alone (instead of media_request) may
name-conflict with something else, and since 'media' is also the
umbrella term for anything under drivers/media it sounds fitting on
the other hand. Suggestions are welcome though.

>
> As with previous patches, I'll have to think about this and try and
> come up with better, less confusing names.

I will gladly take suggestions; I have been trying to come up with a better
name in reply to your comment above but could not find any. :)


Re: [RFCv4 09/21] v4l2: add request API support

2018-02-20 Thread Alexandre Courbot
On Tue, Feb 20, 2018 at 10:25 PM, Hans Verkuil  wrote:
> On 02/20/18 05:44, Alexandre Courbot wrote:
>> Add a v4l2 request entity data structure that takes care of storing the
>> request-related state of a V4L2 device; in this case, its controls.
>>
>> Signed-off-by: Alexandre Courbot 
>> ---
>>  drivers/media/v4l2-core/Makefile   |   1 +
>>  drivers/media/v4l2-core/v4l2-request.c | 178 +
>>  include/media/v4l2-request.h   | 159 ++
>>  3 files changed, 338 insertions(+)
>>  create mode 100644 drivers/media/v4l2-core/v4l2-request.c
>>  create mode 100644 include/media/v4l2-request.h
>>
>> diff --git a/drivers/media/v4l2-core/Makefile 
>> b/drivers/media/v4l2-core/Makefile
>> index 80de2cb9c476..13d0477535bd 100644
>> --- a/drivers/media/v4l2-core/Makefile
>> +++ b/drivers/media/v4l2-core/Makefile
>> @@ -16,6 +16,7 @@ ifeq ($(CONFIG_TRACEPOINTS),y)
>>videodev-objs += vb2-trace.o v4l2-trace.o
>>  endif
>>  videodev-$(CONFIG_MEDIA_CONTROLLER) += v4l2-mc.o
>> +videodev-$(CONFIG_MEDIA_REQUEST_API) += v4l2-request.o
>>
>>  obj-$(CONFIG_VIDEO_V4L2) += videodev.o
>>  obj-$(CONFIG_VIDEO_V4L2) += v4l2-common.o
>> diff --git a/drivers/media/v4l2-core/v4l2-request.c 
>> b/drivers/media/v4l2-core/v4l2-request.c
>> new file mode 100644
>> index ..e8ad10e2f525
>> --- /dev/null
>> +++ b/drivers/media/v4l2-core/v4l2-request.c
>> @@ -0,0 +1,178 @@
>> +/*
>> + * Media requests support for V4L2
>> + *
>> + * Copyright (C) 2018, The Chromium OS Authors.  All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include 
>> +
>> +#include 
>> +#include 
>> +#include 
>> +
>> +void v4l2_request_entity_init(struct v4l2_request_entity *entity,
>> +   const struct media_request_entity_ops *ops,
>> +   struct video_device *vdev)
>> +{
>> + media_request_entity_init(&entity->base, 
>> MEDIA_REQUEST_ENTITY_TYPE_V4L2, ops);
>> + entity->vdev = vdev;
>> +}
>> +EXPORT_SYMBOL_GPL(v4l2_request_entity_init);
>> +
>> +struct media_request_entity_data *
>> +v4l2_request_entity_data_alloc(struct media_request *req,
>> +struct v4l2_ctrl_handler *hdl)
>> +{
>> + struct v4l2_request_entity_data *data;
>> + int ret;
>> +
>> + data = kzalloc(sizeof(*data), GFP_KERNEL);
>> + if (!data)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + ret = v4l2_ctrl_request_init(&data->ctrls);
>> + if (ret) {
>> + kfree(data);
>> + return ERR_PTR(ret);
>> + }
>> + ret = v4l2_ctrl_request_clone(&data->ctrls, hdl, NULL);
>> + if (ret) {
>> + kfree(data);
>> + return ERR_PTR(ret);
>> + }
>> +
>> + INIT_LIST_HEAD(&data->queued_buffers);
>> +
>> + return &data->base;
>> +}
>> +EXPORT_SYMBOL_GPL(v4l2_request_entity_data_alloc);
>> +
>> +void v4l2_request_entity_data_free(struct media_request_entity_data *_data)
>> +{
>> + struct v4l2_request_entity_data *data;
>> + struct v4l2_vb2_request_buffer *qb, *n;
>> +
>> + data = to_v4l2_entity_data(_data);
>> +
>> + list_for_each_entry_safe(qb, n, &data->queued_buffers, node) {
>> + struct vb2_buffer *buf;
>> + dev_warn(_data->request->mgr->dev,
>> +  "entity data freed while buffer still queued!\n");
>> +
>> + /* give buffer back to user-space */
>> + buf = qb->queue->bufs[qb->v4l2_buf.index];
>> + buf->state = qb->pre_req_state;
>> + buf->request = NULL;
>> +
>> + kfree(qb);
>> + }
>> +
>> + v4l2_ctrl_handler_free(&data->ctrls);
>> + kfree(data);
>> +}
>> +EXPORT_SYMBOL_GPL(v4l2_request_entity_data_free);
>> +
>> +
>> +
>> +
>> +
>> +static struct media_request *v4l2_request_alloc(struct media_request_mgr 
>> *mgr)
>> +{
>> + struct media_request *req;
>> +
>> + req = kzalloc(sizeof(*req), GFP_KERNEL);
>> + if (!req)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + req->mgr = mgr;
>> + req->state = MEDIA_REQUEST_STATE_IDLE;
>> + INIT_LIST_HEAD(&req->data);
>> + init_waitqueue_head(&req->complete_wait);
>> + mutex_init(&req->lock);
>> +
>> + mutex_lock(&mgr->mutex);
>> + list_add_tail(&req->list, &mgr->requests);
>> + mutex_unlock(&mgr->mutex);
>> +
>> + return req;
>> +}
>> +
>> +static void v4l2_request_free(struct media_request *req)
>> +{
>> + struct media_request_mgr *mgr = req->mgr;
>> + struct media_request_entity_data *data, *next;
>> +

Re: [RFCv4 16/21] v4l2: video_device: support for creating requests

2018-02-20 Thread Alexandre Courbot
On Wed, Feb 21, 2018 at 1:35 AM, Hans Verkuil  wrote:
> On 02/20/2018 05:44 AM, Alexandre Courbot wrote:
>> Add a new VIDIOC_NEW_REQUEST ioctl, which allows instantiating requests
>> on devices that support the request API. Requests created that way can
>> only control the device they originate from, making them suitable for
>> simple devices, but not complex pipelines.
>>
>> Signed-off-by: Alexandre Courbot 
>> ---
>>  Documentation/ioctl/ioctl-number.txt |  1 +
>>  drivers/media/v4l2-core/v4l2-dev.c   |  2 ++
>>  drivers/media/v4l2-core/v4l2-ioctl.c | 25 +
>>  include/media/v4l2-dev.h |  2 ++
>>  include/uapi/linux/videodev2.h   |  3 +++
>>  5 files changed, 33 insertions(+)
>>
>> diff --git a/Documentation/ioctl/ioctl-number.txt 
>> b/Documentation/ioctl/ioctl-number.txt
>> index 6501389d55b9..afdc9ed255b0 100644
>> --- a/Documentation/ioctl/ioctl-number.txt
>> +++ b/Documentation/ioctl/ioctl-number.txt
>> @@ -286,6 +286,7 @@ Code  Seq#(hex)   Include FileComments
>>   
>>  'z'  10-4F   drivers/s390/crypto/zcrypt_api.hconflict!
>>  '|'  00-7F   linux/media.h
>> +'|'  80-9F   linux/media-request.h
>>  0x80 00-1F   linux/fb.h
>>  0x89 00-06   arch/x86/include/asm/sockios.h
>>  0x89 0B-DF   linux/sockios.h
>
> This  doesn't belong in this patch.

Do you mean we need a separate patch for this?

>
>> diff --git a/drivers/media/v4l2-core/v4l2-dev.c 
>> b/drivers/media/v4l2-core/v4l2-dev.c
>> index 0301fe426a43..062ebee5bffc 100644
>> --- a/drivers/media/v4l2-core/v4l2-dev.c
>> +++ b/drivers/media/v4l2-core/v4l2-dev.c
>> @@ -559,6 +559,8 @@ static void determine_valid_ioctls(struct video_device 
>> *vdev)
>>   set_bit(_IOC_NR(VIDIOC_TRY_EXT_CTRLS), valid_ioctls);
>>   if (vdev->ctrl_handler || ops->vidioc_querymenu)
>>   set_bit(_IOC_NR(VIDIOC_QUERYMENU), valid_ioctls);
>> + if (vdev->req_mgr)
>> + set_bit(_IOC_NR(VIDIOC_NEW_REQUEST), valid_ioctls);
>>   SET_VALID_IOCTL(ops, VIDIOC_G_FREQUENCY, vidioc_g_frequency);
>>   SET_VALID_IOCTL(ops, VIDIOC_S_FREQUENCY, vidioc_s_frequency);
>>   SET_VALID_IOCTL(ops, VIDIOC_LOG_STATUS, vidioc_log_status);
>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c 
>> b/drivers/media/v4l2-core/v4l2-ioctl.c
>> index ab4968ea443f..a45fe078f8ae 100644
>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>> @@ -21,6 +21,7 @@
>>
>>  #include 
>>
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -842,6 +843,13 @@ static void v4l_print_freq_band(const void *arg, bool 
>> write_only)
>>   p->rangehigh, p->modulation);
>>  }
>>
>> +static void vidioc_print_new_request(const void *arg, bool write_only)
>> +{
>> + const struct media_request_new *new = arg;
>> +
>> + pr_cont("fd=0x%x\n", new->fd);
>
> I'd use %d since fds are typically shown as signed integers.

Right.
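
To illustrate the point about the format specifier: an invalid descriptor such
as -1 is easy to recognise when printed with %d, but shows up as 0xffffffff
with %x. The helper names below are hypothetical, and snprintf() stands in for
pr_cont().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format an fd in hexadecimal, as in the quoted hunk. */
static void format_fd_hex(char *out, size_t len, int fd)
{
	snprintf(out, len, "fd=0x%x", (unsigned int)fd);
}

/* Hypothetical helper: format an fd as a signed decimal, as suggested. */
static void format_fd_dec(char *out, size_t len, int fd)
{
	snprintf(out, len, "fd=%d", fd);
}
```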

>
>> +}
>> +
>>  static void v4l_print_edid(const void *arg, bool write_only)
>>  {
>>   const struct v4l2_edid *p = arg;
>> @@ -2486,6 +2494,22 @@ static int v4l_enum_freq_bands(const struct 
>> v4l2_ioctl_ops *ops,
>>   return -ENOTTY;
>>  }
>>
>> +static int vidioc_new_request(const struct v4l2_ioctl_ops *ops,
>> +   struct file *file, void *fh, void *arg)
>> +{
>> +#if IS_ENABLED(CONFIG_MEDIA_REQUEST_API)
>> + struct media_request_new *new = arg;
>> + struct video_device *vfd = video_devdata(file);
>> +
>> + if (!vfd->req_mgr)
>> + return -ENOTTY;
>> +
>> + return media_request_ioctl_new(vfd->req_mgr, new);
>> +#else
>> + return -ENOTTY;
>> +#endif
>> +}
>
> You don't need the #ifdef's here. media_request_ioctl_new() will be stubbed if
> CONFIG_MEDIA_REQUEST_API isn't set.

Correct.
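
The stubbing referred to here usually follows the header pattern sketched
below: the real prototype is declared when the option is enabled, and a static
inline fallback is provided otherwise, so callers need no #ifdef of their own.
SKETCH_REQUEST_API, struct sketch_mgr and the function names are hypothetical,
and the -ENOTTY fallback value is an assumption borrowed from the quoted hunk,
not taken from the actual media_request_ioctl_new() stub.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_mgr {
	int next_fd;
};

#ifdef SKETCH_REQUEST_API
/* Feature enabled: the real implementation lives in another file. */
int sketch_request_new(struct sketch_mgr *mgr);
#else
/* Feature disabled: an inline stub keeps callers free of #ifdef. */
static inline int sketch_request_new(struct sketch_mgr *mgr)
{
	(void)mgr;
	return -ENOTTY;
}
#endif

/* The caller compiles unchanged in both configurations. */
static int sketch_ioctl_new_request(struct sketch_mgr *mgr)
{
	if (!mgr)
		return -ENOTTY;
	return sketch_request_new(mgr);
}
```

Because the stub already returns the error the caller wants, the conditional
compilation stays in the header and the ioctl handler reads the same either
way.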

>
>> +
>>  struct v4l2_ioctl_info {
>>   unsigned int ioctl;
>>   u32 flags;
>> @@ -2617,6 +2641,7 @@ static struct v4l2_ioctl_info v4l2_ioctls[] = {
>>   IOCTL_INFO_FNC(VIDIOC_ENUM_FREQ_BANDS, v4l_enum_freq_bands, 
>> v4l_print_freq_band, 0),
>>   IOCTL_INFO_FNC(VIDIOC_DBG_G_CHIP_INFO, v4l_dbg_g_chip_info, 
>> v4l_print_dbg_chip_info, INFO_FL_CLEAR(v4l2_dbg_chip_info, match)),
>>   IOCTL_INFO_FNC(VIDIOC_QUERY_EXT_CTRL, v4l_query_ext_ctrl, 
>> v4l_print_query_ext_ctrl, INFO_FL_CTRL | INFO_FL_CLEAR(v4l2_query_ext_ctrl, 
>> id)),
>> + IOCTL_INFO_FNC(VIDIOC_NEW_REQUEST, vidioc_new_request, 
>> vidioc_print_new_request, 0),
>>  };
>>  #define V4L2_IOCTLS ARRAY_SIZE(v4l2_ioctls)
>>
>> diff --git a/include/media/v4l2-dev.h b/include/media/v4l2-dev.h
>> index 53f32022fabe..e6c4e10889bc 100644
>> --- a/include/media/v4l2-dev.h
>> +++ b/include/media/v4l2-dev.h
>> @@ -209,6 +209,7 @@ struct v4l2_file_operations {
>>   * @entity: &struct media_entity
>>   * @intf_devnode: pointer to &struct media_intf_devnode
>>   * @pipe: &struct media_pipeline
>> + * @req_mgr: request manager to use if this device supports cre

Re: [RFCv4 01/21] media: add request API core and UAPI

2018-02-20 Thread Alexandre Courbot
Hi Hans,

On Tue, Feb 20, 2018 at 7:36 PM, Hans Verkuil  wrote:
> On 02/20/18 05:44, Alexandre Courbot wrote:
>> The request API provides a way to group buffers and device parameters
>> into units of work to be queued and executed. This patch introduces the
>> UAPI and core framework.
>>
>> This patch is based on the previous work by Laurent Pinchart. The core
>> has changed considerably, but the UAPI is mostly untouched.
>>
>> Signed-off-by: Alexandre Courbot 
>> ---
>>  drivers/media/Kconfig  |   3 +
>>  drivers/media/Makefile |   6 +
>>  drivers/media/media-request.c  | 341 
>>  include/media/media-request.h  | 349 +
>>  include/uapi/linux/media-request.h |  37 +++
>>  5 files changed, 736 insertions(+)
>>  create mode 100644 drivers/media/media-request.c
>>  create mode 100644 include/media/media-request.h
>>  create mode 100644 include/uapi/linux/media-request.h
>>
>> diff --git a/drivers/media/Kconfig b/drivers/media/Kconfig
>> index 145e12bfb819..db30fc9547d2 100644
>> --- a/drivers/media/Kconfig
>> +++ b/drivers/media/Kconfig
>> @@ -130,6 +130,9 @@ config VIDEO_V4L2_SUBDEV_API
>>
>> This API is mostly used by camera interfaces in embedded platforms.
>>
>> +config MEDIA_REQUEST_API
>> + tristate
>> +
>>  source "drivers/media/v4l2-core/Kconfig"
>>
>>  #
>> diff --git a/drivers/media/Makefile b/drivers/media/Makefile
>> index 594b462ddf0e..03c0a39ad344 100644
>> --- a/drivers/media/Makefile
>> +++ b/drivers/media/Makefile
>> @@ -5,6 +5,12 @@
>>
>>  media-objs   := media-device.o media-devnode.o media-entity.o
>>
>> +#
>> +# Request API support comes as its own module since it can be used by
>> +# both media and video devices
>> +#
>> +obj-$(CONFIG_MEDIA_REQUEST_API) += media-request.o
>> +
>>  #
>>  # I2C drivers should come before other drivers, otherwise they'll fail
>>  # when compiled as builtin drivers
>> diff --git a/drivers/media/media-request.c b/drivers/media/media-request.c
>> new file mode 100644
>> index ..b88362028561
>> --- /dev/null
>> +++ b/drivers/media/media-request.c
>> @@ -0,0 +1,341 @@
>> +/*
>> + * Request base implementation
>> + *
>> + * Copyright (C) 2018, The Chromium OS Authors.  All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include 
>> +
>> +const struct file_operations request_fops;
>> +
>> +static const char * const media_request_states[] __maybe_unused = {
>
> Why 'maybe_unused'?
>
>> + "IDLE",
>> + "SUBMITTED",
>> + "COMPLETED",
>> + "INVALID",
>
> I don't like to yell. I prefer "Idle", "Submitted", etc.

Sure.

>
>> +};
>> +
>> +struct media_request *media_request_get(struct media_request *req)
>> +{
>> + get_file(req->file);
>> + return req;
>> +}
>> +EXPORT_SYMBOL_GPL(media_request_get);
>> +
>> +struct media_request *media_request_get_from_fd(int fd)
>> +{
>> + struct file *f;
>> +
>> + f = fget(fd);
>> + if (!f)
>> + return NULL;
>> +
>> + /* Not a request FD? */
>> + if (f->f_op != &request_fops) {
>> + fput(f);
>> + return NULL;
>> + }
>> +
>> + return f->private_data;
>> +}
>> +EXPORT_SYMBOL_GPL(media_request_get_from_fd);
>> +
>> +void media_request_put(struct media_request *req)
>> +{
>> + if (WARN_ON(req == NULL))
>> + return;
>> +
>> + fput(req->file);
>> +}
>> +EXPORT_SYMBOL_GPL(media_request_put);
>> +
>> +struct media_request_entity_data *
>> +media_request_get_entity_data(struct media_request *req,
>> +   struct media_request_entity *entity)
>> +{
>> + struct media_request_entity_data *data;
>> +
>> + /* First check that this entity is valid for this request at all */
>> + if (!req->mgr->ops->entity_valid(req, entity))
>> + return ERR_PTR(-EINVAL);
>> +
>> + mutex_lock(&req->lock);
>> +
>> + /* Lookup whether we already have entity data */
>> + list_for_each_entry(data, &req->data, list) {
>> + if (data->entity == entity)
>> + goto out;
>> + }
>> +
>> + /* No entity data found, let's create it */
>> + data = entity->ops->data_alloc(req, entity);
>> + if (IS_ERR(data))
>> + goto out;
>> +
>> + data->entity = entity;
>> + list_add_tail(&data->list, &req->data);
>> +
>> +out:
>> + mutex_unlock(&req->lock);
>> +
>> + ret

nla_put_string() vs NLA_STRING

2018-02-20 Thread Kees Cook
Hi,

It seems that in at least one case[1], nla_put_string() is being used
on the data for an NLA_STRING, which lacks a NUL terminator, which leads
to silliness when nla_put_string() uses strlen() to figure out the size:

/**
 * nla_put_string - Add a string netlink attribute to a socket buffer
 * @skb: socket buffer to add attribute to
 * @attrtype: attribute type
 * @str: NUL terminated string
*/
static inline int nla_put_string(struct sk_buff *skb, int attrtype,
const char *str)
{
return nla_put(skb, attrtype, strlen(str) + 1, str);
}


This is a problem at least here:

struct regulatory_request {
...
char alpha2[2];
...

static const struct nla_policy nl80211_policy[NUM_NL80211_ATTR] = {
...
[NL80211_ATTR_REG_ALPHA2] = { .type = NLA_STRING, .len = 2 },
...

AIUI, working with NLA_STRING needs nla_strlcpy() to "extract" them,
and that takes the nla_policy size normally to bounds-check the copy.


So, this specific problem needs fixing (in at least two places calling
nla_put_string(msg, NL80211_ATTR_REG_ALPHA2, ...)). While I suspect
it's only ever written an extra byte from the following variable in
the structure which is an enum nl80211_dfs_regions, I worry there
might be a lot more of these (though I'd hope unterminated strings are
uncommon for internal representation). And more generally, it seems
like only the NLA _input_ functions actually check nla_policy details.
It seems that the output functions should do the same too, yes?
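
To make the length problem concrete, here is a minimal userspace sketch
of a bounded length calculation for a fixed-size, possibly unterminated
field. It is only an illustration under stated assumptions: the struct
and the helper name bounded_attr_len() are hypothetical and are not
part of the netlink API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the relevant part of struct regulatory_request. */
struct reg_request_sketch {
	char alpha2[2];   /* two-letter code, stored without a NUL terminator */
	int dfs_region;   /* stands in for enum nl80211_dfs_regions */
};

/*
 * Payload length for a fixed-size field: stop at the first NUL or at
 * max_len, whichever comes first. Unlike strlen(), this never reads
 * past the end of the field into whatever follows it.
 */
static size_t bounded_attr_len(const char *field, size_t max_len)
{
	const char *nul;

	assert(field != NULL);
	nul = memchr(field, '\0', max_len);
	return nul ? (size_t)(nul - field) : max_len;
}
```

With a helper like this, a caller would pass the bounded length to
nla_put() directly instead of going through nla_put_string(), so the
byte count no longer depends on what happens to follow alpha2 in memory.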

-Kees

[1] https://github.com/copperhead/linux-hardened/issues/72

-- 
Kees Cook
Pixel Security


Re: [RFC PATCH V2 13/22] x86/intel_rdt: Support schemata write - pseudo-locking core

2018-02-20 Thread Reinette Chatre
Hi Thomas,

On 2/20/2018 3:21 PM, Thomas Gleixner wrote:
> On Tue, 20 Feb 2018, Reinette Chatre wrote:
>> On 2/20/2018 9:15 AM, Thomas Gleixner wrote:
>>> On Tue, 13 Feb 2018, Reinette Chatre wrote:
>>>
>>> Are you really sure that the life time rules of plr are correct vs. an
>>> application which still has the locked memory mapped? i.e. the following
>>> operation:
>>
>> You are correct. I am not preventing an administrator from removing the
>> pseudo-locked region if it is in use. I will fix that.
> 
> The removal is fine and you cannot prevent it w/o introducing a mess, but
> you have to make sure that the PLR and the mapped memory are not
> vanishing. The refcount rules I outlined are exactly doing that.

Thank you for catching my misunderstanding. Will do.

>> Thank you so much for taking the time to do this thorough review and to
>> make these suggestions. While I am still digesting the details I do
>> intend to follow all (as well as the ones earlier I did not explicitly
>> respond to).
> 
> Make your mind up and tell me where I'm wrong before you implement the crap
> I suggested blindly, as that will just cause the next reviewer (me or
> someone else) to tell _you_ that it is crap :)

Will do. I need more time to digest your suggestions because your
thorough review provided me plenty to consider.

>> Keeping the CLOSID associated with the pseudo-locked region will surely
>> make the above simpler since CLOSIDs are associated with resource
>> groups (represented by the directories). I would like to highlight that
>> on some platforms there are only a few (for example, 4) CLOSIDs
>> available. Not releasing a CLOSID would thus reduce available CLOSIDs
>> that are already limited. These platforms do have smaller possible
>> bitmasks though (for example, 8 possible bits), which may make light of
>> this concern. I thus just add it as informational to the consequence of
>> this simplification.
> 
> Yes. If you have 4 CLOSIDs and only 8 CBM bits it really does not matter
> much.
> 
>>> Now the remaining thing is the memory allocation and the mmap itself. I
>>> really dislike the preallocation of memory right at setup time. Ideally
>>> that should be an allocation of the application itself, but the horrid
>>> wbinvd stuff kinda prevents that. With that restriction we are more or less
>>> bound to immediate allocation and population.
>>
>> Acknowledged. I am not sure if the current permissions would support
>> such a dynamic setup though. At this time the system administrator is
>> the one that sets up the pseudo-locked region and can through
>> permissions of the character device provide access to these regions to
>> user space applications.
> 
> You still would need some interface, e.g. character device which allows you
> to hand in the pointer to the user allocated memory and do the cache
> priming. So you could use the same permission setup for that character
> device.
> 
> The other problem is that we'd need to have MAP_CONTIG first so you
> actually can allocate physically contiguous memory from user space. Mike is
> working on that, but it's not available today. The only way to do so today
> (with lots of waste) would be MAP_HUGETLB, which might be an acceptable
> constraint up to the point where MAP_CONTIG is available.

I recorded this in a pseudo-locking task list as something to consider
if the wbinvd requirement goes away at some point.

> Though this all depends on the ability to remove the wbinvd
> requirement. But even if we can remove that we'd still need to be aware
> that the cache priming loop which needs to run with interrupts disabled is
> expensive as well and can introduce undesired latencies. This all needs
> some thought...

Reinette





Re: [PATCH] cpufreq: powernv: Check negative value returned by cpufreq_table_find_index_dl()

2018-02-20 Thread Viresh Kumar
On 21-02-18, 16:39, Michael Ellerman wrote:
> Viresh Kumar  writes:

> > AFAICT, you will get -1 here only if the freq table had no valid
> > frequencies (or the freq table is empty). Why would that happen ?
> 
> Bugs?

The cpufreq driver shouldn't have registered itself in that case (i.e.
if the cpufreq table is empty).

> Or if you ask for a target_freq that is higher than anything in the
> table.

You will still get a valid index in that case.

There is only one case where we return -1, when the cpufreq table
doesn't have any valid frequencies.

> Or the API changes, and we forget to update this call site.

I am not sure we can do much about that right now.

> If you're saying that cpufreq_table_find_index_dl() can NEVER fail,

Yes, if we have at least one valid frequency in the table, otherwise
the cpufreq driver shouldn't have registered itself.

> then
> write it so that it can never fail and change it to return unsigned int.

But what should we do when there is no frequency in the cpufreq table?
Just in case where a driver is buggy and tries to call this routine
for an invalid table.

> Having it potentially return -1, which is then used to index an array
> and not handling that is just asking for bugs to happen.

I understand what you are trying to say here, but I don't know what
can be done to prevent this here.

What we can do is change the return type to void and pass an int
pointer to the routine, but that wouldn't change anything at all. The
variable that pointer refers to can still have -1 in it.

-- 
viresh


Re: [PATCH] ocxl: Add get_metadata IOCTL to share OCXL information to userspace

2018-02-20 Thread Andrew Donnellan

On 21/02/18 15:57, Alastair D'Silva wrote:

From: Alastair D'Silva 

Some required information is not exposed to userspace currently (eg. the
PASID), pass this information back, along with other information which
is currently communicated via sysfs, which saves some parsing effort in
userspace.

Signed-off-by: Alastair D'Silva 


Seems fine.

Acked-by: Andrew Donnellan 

--
Andrew Donnellan  OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited



Re: [PATCH] netlink: put module reference if dump start fails

2018-02-20 Thread Eric Dumazet
On Wed, 2018-02-21 at 04:41 +0100, Jason A. Donenfeld wrote:
> Before, if cb->start() failed, the module reference would never be put,
> because cb->cb_running is intentionally false at this point. Users are
> generally annoyed by this because they can no longer unload modules that
> leak references. Also, it may be possible to tediously wrap a reference
> counter back to zero, especially since module.c still uses atomic_inc
> instead of refcount_inc.
> 
> This patch expands the error path to simply call module_put if
> cb->start() fails.
> 
> Signed-off-by: Jason A. Donenfeld 
> ---
> This probably should be queued up for stable.

When was the bug added? Knowing that would help the stable teams a lot.

Thanks.



[PATCH] kconfig: Print reverse dependencies in groups

2018-02-20 Thread Masahiro Yamada
Signed-off-by: Masahiro Yamada 
---

This patch requires the following as a pre-requisite:
https://patchwork.kernel.org/patch/10229545/

These two will work equivalently to the following three:

https://patchwork.kernel.org/patch/10226951/
https://patchwork.kernel.org/patch/10226953/
https://patchwork.kernel.org/patch/10226955/


 scripts/kconfig/expr.c | 18 --
 scripts/kconfig/expr.h |  3 ++-
 scripts/kconfig/menu.c | 10 ++
 3 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/scripts/kconfig/expr.c b/scripts/kconfig/expr.c
index cd3a8f5..49376e1 100644
--- a/scripts/kconfig/expr.c
+++ b/scripts/kconfig/expr.c
@@ -1323,19 +1323,25 @@ void expr_gstr_print(struct expr *e, struct gstr *gs)
  */
 static void expr_print_revdep(struct expr *e,
  void (*fn)(void *, struct symbol *, const char *),
- void *data)
+ void *data, tristate pr_type, const char **title)
 {
if (e->type == E_OR) {
-   expr_print_revdep(e->left.expr, fn, data);
-   expr_print_revdep(e->right.expr, fn, data);
-   } else {
+   expr_print_revdep(e->left.expr, fn, data, pr_type, title);
+   expr_print_revdep(e->right.expr, fn, data, pr_type, title);
+   } else if (expr_calc_value(e) == pr_type) {
+   if (*title) {
+   fn(data, NULL, *title);
+   *title = NULL;
+   }
+
fn(data, NULL, "  - ");
expr_print(e, fn, data, E_NONE);
fn(data, NULL, "\n");
}
 }
 
-void expr_gstr_print_revdep(struct expr *e, struct gstr *gs)
+void expr_gstr_print_revdep(struct expr *e, struct gstr *gs,
+   tristate pr_type, const char *title)
 {
-   expr_print_revdep(e, expr_print_gstr_helper, gs);
+   expr_print_revdep(e, expr_print_gstr_helper, gs, pr_type, &title);
 }
diff --git a/scripts/kconfig/expr.h b/scripts/kconfig/expr.h
index c16e82e..8dbf2a4 100644
--- a/scripts/kconfig/expr.h
+++ b/scripts/kconfig/expr.h
@@ -310,7 +310,8 @@ struct expr *expr_simplify_unmet_dep(struct expr *e1, struct expr *e2);
 void expr_fprint(struct expr *e, FILE *out);
 struct gstr; /* forward */
 void expr_gstr_print(struct expr *e, struct gstr *gs);
-void expr_gstr_print_revdep(struct expr *e, struct gstr *gs);
+void expr_gstr_print_revdep(struct expr *e, struct gstr *gs,
+   tristate pr_type, const char *title);
 
 static inline int expr_is_yes(struct expr *e)
 {
diff --git a/scripts/kconfig/menu.c b/scripts/kconfig/menu.c
index 7e70be3..3b9beca 100644
--- a/scripts/kconfig/menu.c
+++ b/scripts/kconfig/menu.c
@@ -827,14 +827,16 @@ static void get_symbol_str(struct gstr *r, struct symbol *sym,
 
get_symbol_props_str(r, sym, P_SELECT, _("  Selects: "));
if (sym->rev_dep.expr) {
-   str_append(r, _("  Selected by: \n"));
-   expr_gstr_print_revdep(sym->rev_dep.expr, r);
+   expr_gstr_print_revdep(sym->rev_dep.expr, r, yes, "  Selected by [y]\n");
+   expr_gstr_print_revdep(sym->rev_dep.expr, r, mod, "  Selected by [m]\n");
+   expr_gstr_print_revdep(sym->rev_dep.expr, r, no, "  Selected by [n]\n");
}
 
get_symbol_props_str(r, sym, P_IMPLY, _("  Implies: "));
if (sym->implied.expr) {
-   str_append(r, _("  Implied by: \n"));
-   expr_gstr_print_revdep(sym->implied.expr, r);
+   expr_gstr_print_revdep(sym->implied.expr, r, yes, "  Implied by [y]\n");
+   expr_gstr_print_revdep(sym->implied.expr, r, mod, "  Implied by [m]\n");
+   expr_gstr_print_revdep(sym->implied.expr, r, no, "  Implied by [n]\n");
}
 
str_append(r, "\n\n");
-- 
2.7.4



Re: [PATCH] mmc: dw_mmc-k3: Fix out-of-bounds access through DT alias

2018-02-20 Thread Jaehoon Chung
Hi Geert,

On 02/20/2018 07:50 PM, Geert Uytterhoeven wrote:
> On Tue, Feb 20, 2018 at 10:03 AM, Geert Uytterhoeven
>  wrote:
>> The hs_timing_cfg[] array is indexed using a value derived from the
>> "mshcN" alias in DT, which may lead to an out-of-bounds access.
>>
>> Fix this by adding a range check.
>>
>> Fixes: 7d92895208a008a2 ("mmc: dw_mmc-k3: Fix out-of-bounds access through 
>> DT alias")
> 
> Oops
> 
> Fixes: 361c7fe9b02eee7e ("mmc: dw_mmc-k3: add sd support for hi3660")

Could you resend the patch with the corrected commit message?
Then I will pick it up.

Best Regards,
Jaehoon Chung

> 
> Gr{oetje,eeting}s,
> 
> Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> 



Re: [PATCH] cpufreq: powernv: Check negative value returned by cpufreq_table_find_index_dl()

2018-02-20 Thread Michael Ellerman
Viresh Kumar  writes:

> On 12-02-18, 15:51, Shilpasri G Bhat wrote:
>> This patch fixes the below Coverity warning:
>> 
>> *** CID 182816:  Memory - illegal accesses  (NEGATIVE_RETURNS)
>> /drivers/cpufreq/powernv-cpufreq.c: 1008 in powernv_fast_switch()
>> 1002 unsigned int target_freq)
>> 1003 {
>> 1004 int index;
>> 1005 struct powernv_smp_call_data freq_data;
>> 1006
>> 1007 index = cpufreq_table_find_index_dl(policy, target_freq);
>> >>> CID 182816:  Memory - illegal accesses  (NEGATIVE_RETURNS)
>> >>> Using variable "index" as an index to array "powernv_freqs".
>> 1008 freq_data.pstate_id = powernv_freqs[index].driver_data;
>> 1009 freq_data.gpstate_id = powernv_freqs[index].driver_data;
>> 1010 set_pstate(&freq_data);
>> 1011
>> 1012 return powernv_freqs[index].frequency;
>> 1013 }
>> 
>> Signed-off-by: Shilpasri G Bhat 
>> ---
>>  drivers/cpufreq/powernv-cpufreq.c | 3 +++
>>  1 file changed, 3 insertions(+)
>> 
>> diff --git a/drivers/cpufreq/powernv-cpufreq.c 
>> b/drivers/cpufreq/powernv-cpufreq.c
>> index 29cdec1..69edfe9 100644
>> --- a/drivers/cpufreq/powernv-cpufreq.c
>> +++ b/drivers/cpufreq/powernv-cpufreq.c
>> @@ -1005,6 +1005,9 @@ static unsigned int powernv_fast_switch(struct 
>> cpufreq_policy *policy,
>>  struct powernv_smp_call_data freq_data;
>>  
>>  index = cpufreq_table_find_index_dl(policy, target_freq);
>> +if (unlikely(index < 0))
>> +index = get_nominal_index();
>> +
>
> AFAICT, you will get -1 here only if the freq table had no valid
> frequencies (or the freq table is empty). Why would that happen ?

Bugs?

Or if you ask for a target_freq that is higher than anything in the
table.

Or the API changes, and we forget to update this call site.

If you're saying that cpufreq_table_find_index_dl() can NEVER fail, then
write it so that it can never fail and change it to return unsigned int.

Having it potentially return -1, which is then used to index an array
and not handling that is just asking for bugs to happen.
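
As a rough illustration of the point, here is a userspace sketch of a
lookup that can return -1 and a caller that refuses to index with it.
Everything in it is an assumption made for the example: find_index_low(),
pick_frequency() and SK_FREQ_INVALID are hypothetical names, and the
lookup is not the real cpufreq_table_find_index_dl() implementation.

```c
#include <stddef.h>

#define SK_FREQ_INVALID 0u  /* hypothetical marker for an unusable table entry */

/*
 * Return the index of the lowest valid frequency at or above target.
 * If target is higher than everything in the table, fall back to the
 * highest valid entry. Return -1 only when the table has no valid
 * entries at all.
 */
static int find_index_low(const unsigned int *freqs, size_t n,
			  unsigned int target)
{
	int best = -1;     /* lowest valid entry >= target */
	int highest = -1;  /* highest valid entry overall */
	size_t i;

	for (i = 0; i < n; i++) {
		if (freqs[i] == SK_FREQ_INVALID)
			continue;
		if (highest < 0 || freqs[i] > freqs[highest])
			highest = (int)i;
		if (freqs[i] >= target &&
		    (best < 0 || freqs[i] < freqs[best]))
			best = (int)i;
	}

	return best >= 0 ? best : highest;
}

/* Caller that never uses a negative index; 0 means "nothing selected". */
static unsigned int pick_frequency(const unsigned int *freqs, size_t n,
				   unsigned int target)
{
	int index = find_index_low(freqs, n, target);

	if (index < 0)
		return 0;

	return freqs[index];
}
```

In this sketch a too-high target still yields a valid index, and the
only way to reach the -1 path is an empty or all-invalid table. The
check in the caller costs one branch and removes the out-of-bounds read
in that case.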

cheers


Re: [PATCH 2/3] Input: gpio-keys - allow setting wakeup interrupt trigger type in DT

2018-02-20 Thread Brian Norris
Hi,

On Sun, Feb 18, 2018 at 05:34:53PM -0600, Rob Herring wrote:
> On Fri, Feb 09, 2018 at 03:42:47PM -0800, Brian Norris wrote:
> > On Fri, Feb 09, 2018 at 07:55:09PM +0800, Jeffy Chen wrote:
> > > Allow specifying a different interrupt trigger type for wakeup when
> > > using the gpio-keys input device as a wakeup source.
> > > 
> > > Signed-off-by: Jeffy Chen 
> > > ---
> > > 
> > >  Documentation/devicetree/bindings/input/gpio-keys.txt | 9 +
> > >  1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/Documentation/devicetree/bindings/input/gpio-keys.txt 
> > > b/Documentation/devicetree/bindings/input/gpio-keys.txt
> > > index a94940481e55..61926cef708f 100644
> > > --- a/Documentation/devicetree/bindings/input/gpio-keys.txt
> > > +++ b/Documentation/devicetree/bindings/input/gpio-keys.txt
> > > @@ -26,6 +26,15 @@ Optional subnode-properties:
> > > If not specified defaults to 5.
> > >   - wakeup-source: Boolean, button can wake-up the system.
> > >(Legacy property supported: "gpio-key,wakeup")
> > > + - wakeup-trigger-type: Specifies the interrupt trigger type for wakeup.
> > > +  The value is defined in 
> > > 
> > 
> > Do you really want to codify interrupt triggers here? It seems like most
> > of the information about edge vs. level is already codified elsewhere,
> > so this becomes a little redundant. And in fact, some bindings may be
> > specifying a "gpio", not technically an interrupt (at least not
> > directly), so it feels weird to apply IRQ_* flags to them right here.
> > Anyway, I think the only piece you really want to describe here is, do we
> > wake on "event asserted", "event deasserted", or both. (The "none" case
> > would just mean you shouldn't have the "wakeup-source" property.)
> > 
> > So maybe:
> > 
> > wakeup-trigger-type: Specifies whether the key should wake the
> > system when asserted, when deasserted, or both. This property is
> > only valid for keys that wake up the system (e.g., when the
> > "wakeup-source" property is also provided). Supported values
> > are:
> >   1: asserted
> 
> As wakeup is an IRQ, that's assumed.
> 
> >   2: deasserted
> 
> Just invert the flags for the IRQ.

What? Wouldn't that change the meaning of the key?

> >   3: both asserted and deasserted
> 
> I don't see what would be the usecase. But wouldn't this be any edge 
> (because level certainly doesn't make sense)?

Well, #3 is how the driver happens to currently interpret the binding ;)
I believe the idea is that you get wakeup triggers on all events
(pressed or released). I'm not actually sure why though, since it
doesn't really make for a good use case. (We want both edges in S0, but
not really in S3.)

For some background: the case that inspired this is SW_PEN_INSERTED. We
want to receive events on both edges (PEN_INSERTED = "asserted"; and
!PEN_INSERTED = "deasserted", meaning pen ejected) in S0 (e.g., to
show/hide special UI menus when we think the pen is "in use"). But in
S3, we tend to want to wake only when the pen is ejected. We can't
invert the IRQ, because then ejection and insertion get swapped...
Also, per the above, the current wakeup condition is for both edges. We
want to override that.
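
To sketch how a driver might translate such a property into an edge
without touching the key's polarity, something like the following could
work. This is only an illustration: the enum names and wake_edge() are
hypothetical, the numeric values mirror the 1/2/3 proposed above, and
none of this is existing gpio-keys code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values for the proposed wakeup-trigger-type property. */
enum wake_on_sketch {
	WAKE_ON_ASSERTED = 1,
	WAKE_ON_DEASSERTED = 2,
	WAKE_ON_BOTH = 3,
};

/* Hypothetical edge selection, independent of any real IRQ_TYPE_* flag. */
enum edge_sketch {
	EDGE_SK_RISING,
	EDGE_SK_FALLING,
	EDGE_SK_BOTH,
};

/*
 * Pick the wakeup edge from the logical event and the line polarity.
 * The polarity only decides which electrical edge corresponds to
 * "asserted"; the meaning of the key itself is left alone.
 */
static enum edge_sketch wake_edge(enum wake_on_sketch when, bool active_low)
{
	bool rising;

	assert(when == WAKE_ON_ASSERTED || when == WAKE_ON_DEASSERTED ||
	       when == WAKE_ON_BOTH);

	if (when == WAKE_ON_BOTH)
		return EDGE_SK_BOTH;

	/* Asserted is a rising edge on active-high and falling on active-low. */
	rising = (when == WAKE_ON_ASSERTED) != active_low;
	return rising ? EDGE_SK_RISING : EDGE_SK_FALLING;
}
```

For the pen case, an active-low SW_PEN_INSERTED line with "wake on
deasserted" would come out as a rising edge for wakeup, while the
normal S0 handling keeps reporting both insertions and ejections.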

Brian

> > 
> > ? We could still make macros out of those, if we want
> > (input/linux-event-codes.h?). And then leave it up to the driver to
> > determine how to translate that into the appropriate edge or level
> > triggers.
> > 
> > Brian
> > 
> > > +  Only the following flags are supported:
> > > + IRQ_TYPE_NONE
> > > + IRQ_TYPE_EDGE_RISING
> > > + IRQ_TYPE_EDGE_FALLING
> > > + IRQ_TYPE_EDGE_BOTH
> > > + IRQ_TYPE_LEVEL_HIGH
> > > + IRQ_TYPE_LEVEL_LOW
> > >   - linux,can-disable: Boolean, indicates that button is connected
> > > to dedicated (not shared) interrupt which can be disabled to
> > > suppress events from the button.
> > > -- 
> > > 2.11.0
> > > 
> > > 


Re: [PATCH v2 1/3] HID: add driver for Valve Steam Controller

2018-02-20 Thread Cameron Gutman
On 02/20/2018 11:33 AM, Rodrigo Rivas Costa wrote:
> +static void steam_work_connect_cb(struct work_struct *work)
> +{
> + struct steam_device *steam = container_of(work, struct steam_device,
> + work_connect);
> + unsigned long flags;
> + bool connected;
> + int ret;
> +
> + spin_lock_irqsave(&steam->lock, flags);
> + connected = steam->connected;
> + spin_unlock_irqrestore(&steam->lock, flags);
> +
> + if (connected) {
> + if (steam->input) {
> + dbg_hid("%s: already connected\n", __func__);
> + return;
> + }
> + ret = steam_register(steam);
> + if (ret) {
> + hid_err(steam->hdev,
> + "%s:steam_register failed with error %d\n",
> + __func__, ret);
> + return;
> + }
> + } else {
> + steam_unregister(steam);

I think you need synchronization here. You don't want to be in the middle of
processing a HID event or power supply update and have your device freed out
from underneath you.

xpad uses RCU to avoid this race.

> + }
> +}
> +

Regards,
Cameron


Re: [PATCH ipmi/kcs_bmc v2] ipmi: kcs_bmc: make the code be more clean

2018-02-20 Thread Wang, Haiyue



On 2018-02-20 21:29, Corey Minyard wrote:

On 02/19/2018 09:55 AM, Haiyue Wang wrote:

---
When you use ---, it means everything following is not in the commit 
text,

including your signature.


Got it.

v1 -> v2:


Do you want me to fold this into the previous patch?  That's generally
not how things work, a new patch is fine for this, with a list of things
done like below.


I will submit a new patch.

One comment inline below...



Add 'SPDX-License-Identifier' style for header files modification.
---

1. Add the missing '__user' keyword for read / write.
2. Drop the 'file_' prefix from 'file_to_kcs_bmc'; it is redundant
since the parameter is already 'struct file *filp'.
3. Change the 'unsigned int' to '__poll_t' to meet the new 'poll'
definition.
4. Correct the 'SPDX-License-Identifier' style for header files.

Signed-off-by: Haiyue Wang 
---
  drivers/char/ipmi/kcs_bmc.c    | 32 
+---

  drivers/char/ipmi/kcs_bmc.h    |  6 --
  drivers/char/ipmi/kcs_bmc_aspeed.c |  4 +++-
  include/uapi/linux/ipmi_bmc.h  |  6 --
  4 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/drivers/char/ipmi/kcs_bmc.c b/drivers/char/ipmi/kcs_bmc.c
index 6476bfb..fbfc05e 100644
--- a/drivers/char/ipmi/kcs_bmc.c
+++ b/drivers/char/ipmi/kcs_bmc.c
@@ -1,5 +1,7 @@
  // SPDX-License-Identifier: GPL-2.0
-// Copyright (c) 2015-2018, Intel Corporation.
+/*
+ * Copyright (c) 2015-2018, Intel Corporation.
+ */
    #define pr_fmt(fmt) "kcs-bmc: " fmt
  @@ -242,14 +244,14 @@ int kcs_bmc_handle_event(struct kcs_bmc 
*kcs_bmc)

  }
  EXPORT_SYMBOL(kcs_bmc_handle_event);
  -static inline struct kcs_bmc *file_to_kcs_bmc(struct file *filp)
+static inline struct kcs_bmc *to_kcs_bmc(struct file *filp)
  {
  return container_of(filp->private_data, struct kcs_bmc, miscdev);
  }
    static int kcs_bmc_open(struct inode *inode, struct file *filp)
  {
-    struct kcs_bmc *kcs_bmc = file_to_kcs_bmc(filp);
+    struct kcs_bmc *kcs_bmc = to_kcs_bmc(filp);
  int ret = 0;
    spin_lock_irq(&kcs_bmc->lock);
@@ -262,25 +264,25 @@ static int kcs_bmc_open(struct inode *inode, 
struct file *filp)

  return ret;
  }
  -static unsigned int kcs_bmc_poll(struct file *filp, poll_table *wait)
+static __poll_t kcs_bmc_poll(struct file *filp, poll_table *wait)
  {
-    struct kcs_bmc *kcs_bmc = file_to_kcs_bmc(filp);
-    unsigned int mask = 0;
+    struct kcs_bmc *kcs_bmc = to_kcs_bmc(filp);
+    __poll_t mask = 0;
    poll_wait(filp, &kcs_bmc->queue, wait);
    spin_lock_irq(&kcs_bmc->lock);
  if (kcs_bmc->data_in_avail)
-    mask |= POLLIN;
+    mask |= EPOLLIN;


I get this:

  CC [M]  drivers/char/ipmi/kcs_bmc.o
../drivers/char/ipmi/kcs_bmc.c: In function ‘kcs_bmc_poll’:
../drivers/char/ipmi/kcs_bmc.c:276:11: error: ‘EPOLLIN’ undeclared 
(first use in this function)

   mask |= EPOLLIN;
   ^
../drivers/char/ipmi/kcs_bmc.c:276:11: note: each undeclared 
identifier is reported only once for each function it appears in


probably need to include linux/eventpoll.h

I forgot to tell you that you need to update your git tree: EPOLLIN was
merged in 'Linux 4.16-rc1'. bt-bmc.c got the same change. :)


git log -p drivers/char/ipmi/bt-bmc.c
commit a9a08845e9acbd224e4ee466f5c1275ed50054e8
Author: Linus Torvalds 
Date:   Sun Feb 11 14:34:03 2018 -0800

    vfs: do bulk POLL* -> EPOLL* replacement

    This is the mindless scripted replacement of kernel use of POLL*
    variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
        L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
        for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
    done

    with de-mangling cleanups yet to come.

    NOTE! On almost all architectures, the EPOLL* constants have the same
    values as the POLL* constants do.  But they keyword here is "almost".
    For various bad reasons they aren't the same, and epoll() doesn't
    actually work quite correctly in some cases due to this on Sparc et al.

    The next patch from Al will sort out the final differences, and we
    should be all done.

    Scripted-by: Al Viro 
    Signed-off-by: Linus Torvalds 

diff --git a/drivers/char/ipmi/bt-bmc.c b/drivers/char/ipmi/bt-bmc.c
index 7992c87..c95b93b 100644
--- a/drivers/char/ipmi/bt-bmc.c
+++ b/drivers/char/ipmi/bt-bmc.c
@@ -349,10 +349,10 @@ static __poll_t bt_bmc_poll(struct file *file, 
poll_table *wait)

    ctrl = bt_inb(bt_bmc, BT_CTRL);

    if (ctrl & BT_CTRL_H2B_ATN)
-   mask |= POLLIN;
+   mask |= EPOLLIN;

    if (!(ctrl & (BT_CTRL_H_BUSY | BT_CTRL_B2H_ATN)))
-   mask |= POLLOUT;
+   mask |= EPOLLOUT;

    return mask;


-corey


spin_unlock_irq(&kcs_bmc->lock);
    return mask;
  }
  -static ssize_t kcs_bmc_read(struct file *filp, char *buf,
-    size_t coun

[PATCH 8/9] scsi: ufs: fix irq return code

2018-02-20 Thread Asutosh Das
From: Venkat Gopalakrishnan 

Return IRQ_HANDLED only if the irq is really handled; this will
help in catching spurious interrupts that go unhandled.
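
The pattern can be sketched in plain C as follows. This is a simplified
model under stated assumptions: the sk_ names, status bits and handler
signatures are hypothetical and do not match the real ufshcd functions.

```c
/* Hypothetical stand-ins for irqreturn_t and the interrupt status bits. */
typedef enum { SK_IRQ_NONE = 0, SK_IRQ_HANDLED = 1 } sk_irqreturn_t;

#define SK_INT_UIC_CMD   0x1u
#define SK_INT_XFER_DONE 0x2u

/* Claims the interrupt only if the bit is set and a command is pending. */
static sk_irqreturn_t sk_uic_handler(unsigned int status, int cmd_pending)
{
	if ((status & SK_INT_UIC_CMD) && cmd_pending)
		return SK_IRQ_HANDLED;
	return SK_IRQ_NONE;
}

/* Claims the interrupt only if at least one request actually completed. */
static sk_irqreturn_t sk_xfer_handler(unsigned int completed_reqs)
{
	return completed_reqs ? SK_IRQ_HANDLED : SK_IRQ_NONE;
}

/*
 * Top-level handler: report HANDLED if any sub-handler did real work,
 * otherwise NONE so the core can account the interrupt as spurious.
 */
static sk_irqreturn_t sk_interrupt(unsigned int status, int cmd_pending,
				   unsigned int completed_reqs)
{
	sk_irqreturn_t ret = SK_IRQ_NONE;

	if (sk_uic_handler(status, cmd_pending) == SK_IRQ_HANDLED)
		ret = SK_IRQ_HANDLED;

	if ((status & SK_INT_XFER_DONE) &&
	    sk_xfer_handler(completed_reqs) == SK_IRQ_HANDLED)
		ret = SK_IRQ_HANDLED;

	return ret;
}
```

The key design point is that a set status bit alone is not enough; each
sub-handler reports whether it found work to do, and the results are
combined rather than unconditionally returning IRQ_HANDLED.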

Signed-off-by: Venkat Gopalakrishnan 
Signed-off-by: Can Guo 
Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufshcd.c | 137 ++
 1 file changed, 101 insertions(+), 36 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 4d4c7d6..6541e1d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -211,7 +211,7 @@ enum {
END_FIX
 };
 
-static void ufshcd_tmc_handler(struct ufs_hba *hba);
+static irqreturn_t ufshcd_tmc_handler(struct ufs_hba *hba);
 static void ufshcd_async_scan(void *data, async_cookie_t cookie);
 static int ufshcd_reset_and_restore(struct ufs_hba *hba);
 static int ufshcd_eh_host_reset_handler(struct scsi_cmnd *cmd);
@@ -4630,19 +4630,29 @@ static int ufshcd_task_req_compl(struct ufs_hba *hba, 
u32 index, u8 *resp)
  * ufshcd_uic_cmd_compl - handle completion of uic command
  * @hba: per adapter instance
  * @intr_status: interrupt status generated by the controller
+ *
+ * Returns
+ *  IRQ_HANDLED - If interrupt is valid
+ *  IRQ_NONE- If invalid interrupt
  */
-static void ufshcd_uic_cmd_compl(struct ufs_hba *hba, u32 intr_status)
+static irqreturn_t ufshcd_uic_cmd_compl(struct ufs_hba *hba, u32 intr_status)
 {
+   irqreturn_t retval = IRQ_NONE;
+
if ((intr_status & UIC_COMMAND_COMPL) && hba->active_uic_cmd) {
hba->active_uic_cmd->argument2 |=
ufshcd_get_uic_cmd_result(hba);
hba->active_uic_cmd->argument3 =
ufshcd_get_dme_attr_val(hba);
complete(&hba->active_uic_cmd->done);
+   retval = IRQ_HANDLED;
}
 
-   if ((intr_status & UFSHCD_UIC_PWR_MASK) && hba->uic_async_done)
+   if ((intr_status & UFSHCD_UIC_PWR_MASK) && hba->uic_async_done) {
complete(hba->uic_async_done);
+   retval = IRQ_HANDLED;
+   }
+   return retval;
 }
 
 /**
@@ -4698,8 +4708,12 @@ static void __ufshcd_transfer_req_compl(struct ufs_hba 
*hba,
 /**
  * ufshcd_transfer_req_compl - handle SCSI and query command completion
  * @hba: per adapter instance
+ *
+ * Returns
+ *  IRQ_HANDLED - If interrupt is valid
+ *  IRQ_NONE- If invalid interrupt
  */
-static void ufshcd_transfer_req_compl(struct ufs_hba *hba)
+static irqreturn_t ufshcd_transfer_req_compl(struct ufs_hba *hba)
 {
unsigned long completed_reqs;
u32 tr_doorbell;
@@ -4717,7 +4731,12 @@ static void ufshcd_transfer_req_compl(struct ufs_hba 
*hba)
tr_doorbell = ufshcd_readl(hba, REG_UTP_TRANSFER_REQ_DOOR_BELL);
completed_reqs = tr_doorbell ^ hba->outstanding_reqs;
 
-   __ufshcd_transfer_req_compl(hba, completed_reqs);
+   if (completed_reqs) {
+   __ufshcd_transfer_req_compl(hba, completed_reqs);
+   return IRQ_HANDLED;
+   } else {
+   return IRQ_NONE;
+   }
 }
 
 /**
@@ -5243,16 +5262,21 @@ static void ufshcd_update_uic_reg_hist(struct 
ufs_uic_err_reg_hist *reg_hist,
 /**
  * ufshcd_update_uic_error - check and set fatal UIC error flags.
  * @hba: per-adapter instance
+ *
+ * Returns
+ *  IRQ_HANDLED - If interrupt is valid
+ *  IRQ_NONE    - If invalid interrupt
  */
-static void ufshcd_update_uic_error(struct ufs_hba *hba)
+static irqreturn_t ufshcd_update_uic_error(struct ufs_hba *hba)
 {
u32 reg;
+   irqreturn_t retval = IRQ_NONE;
 
/* PHY layer lane error */
reg = ufshcd_readl(hba, REG_UIC_ERROR_CODE_PHY_ADAPTER_LAYER);
/* Ignore LINERESET indication, as this is not an error */
if ((reg & UIC_PHY_ADAPTER_LAYER_ERROR) &&
-   (reg & UIC_PHY_ADAPTER_LAYER_LANE_ERR_MASK)) {
+   (reg & UIC_PHY_ADAPTER_LAYER_ERROR_CODE_MASK)) {
/*
 * To know whether this error is fatal or not, DB timeout
 * must be checked but this error is handled separately.
@@ -5263,57 +5287,73 @@ static void ufshcd_update_uic_error(struct ufs_hba *hba)
 
/* PA_INIT_ERROR is fatal and needs UIC reset */
reg = ufshcd_readl(hba, REG_UIC_ERROR_CODE_DATA_LINK_LAYER);
-   if (reg)
+   if ((reg & UIC_DATA_LINK_LAYER_ERROR) &&
+   (reg & UIC_DATA_LINK_LAYER_ERROR_CODE_MASK)) {
ufshcd_update_uic_reg_hist(&hba->ufs_stats.dl_err, reg);
-
-   if (reg & UIC_DATA_LINK_LAYER_ERROR_PA_INIT)
-   hba->uic_error |= UFSHCD_UIC_DL_PA_INIT_ERROR;
-   else if (hba->dev_quirks &
-  UFS_DEVICE_QUIRK_RECOVERY_FROM_DL_NAC_ERRORS) {
-   if (reg & UIC_DATA_LINK_LAYER_ERROR_NAC_RECEIVED)
-   hba->uic_error |=
-   UFSHCD_UIC_DL_NAC_RECEIVED_ERROR;
-   else if (reg & UIC_DATA_LINK_LAYER_ERROR_TCx_REPLAY_TIMEOUT)
- 

[PATCH 1/9] scsi: ufs: Allowing power mode change

2018-02-20 Thread Asutosh Das
From: Yaniv Gardi 

Due to M-PHY issues, moving from HS to any other mode or gear, or
even to Hibern8, causes unpredictable behavior of the device.
This patch fixes these issues.

Signed-off-by: Yaniv Gardi 
Signed-off-by: Subhash Jadavani 
Signed-off-by: Can Guo 
Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufshcd.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 011c336..d74d529 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -4167,9 +4167,13 @@ static int ufshcd_link_startup(struct ufs_hba *hba)
goto out;
} while (ret && retries--);
 
-   if (ret)
+   if (ret) {
/* failed to get the link up... retire */
goto out;
+   } else {
+   ufshcd_dme_set(hba, UIC_ARG_MIB(TX_LCC_ENABLE), 0);
+   ufshcd_dme_set(hba, UIC_ARG_MIB(TX_LCC_ENABLE), 1);
+   }
 
if (link_startup_again) {
link_startup_again = false;
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.



[PATCH 3/9] scsi: ufs: fix exception event handling

2018-02-20 Thread Asutosh Das
From: Maya Erez 

The device can set the exception event bit in one of the response UPIUs,
for example to notify the host of the need for an urgent BKOPs operation.
In such a case the host driver calls ufshcd_exception_event_handler to
handle this notification.
When trying to check the exception event status (for finding the cause for
the exception event), the device may be busy with additional SCSI commands
handling and may not respond within the 100ms timeout.

To prevent that, we need to block SCSI commands during handling of
exception events and allow retransmissions of the query requests,
in case of timeout.

Signed-off-by: Subhash Jadavani 
Signed-off-by: Maya Erez 
Signed-off-by: Can Guo 
Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufshcd.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index cc7eb1e..8d3f8ce 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -4972,6 +4972,7 @@ static void ufshcd_exception_event_handler(struct work_struct *work)
hba = container_of(work, struct ufs_hba, eeh_work);
 
pm_runtime_get_sync(hba->dev);
+   scsi_block_requests(hba->host);
err = ufshcd_get_ee_status(hba, &status);
if (err) {
dev_err(hba->dev, "%s: failed to get exception status %d\n",
@@ -4985,6 +4986,7 @@ static void ufshcd_exception_event_handler(struct work_struct *work)
ufshcd_bkops_exception_event_handler(hba);
 
 out:
+   scsi_unblock_requests(hba->host);
pm_runtime_put_sync(hba->dev);
return;
 }
-- 



[PATCH 2/9] scsi: ufs: Add LCC quirk for host and device

2018-02-20 Thread Asutosh Das
From: Subhash Jadavani 

LCC (Line Control Command) is used for communication between the
UFS host and the UFS device. Some hosts have trouble issuing LCC
commands to the UFS device, and in that case LCC can be explicitly
disabled.

However, we may not always want to disable LCC on both host and
device; hence this change splits the quirk into two parts, one for
the host and one for the device.

Signed-off-by: Subhash Jadavani 
Signed-off-by: Venkat Gopalakrishnan 
Signed-off-by: Can Guo 
Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufshcd.c | 16 
 drivers/scsi/ufs/ufshcd.h | 11 +++
 2 files changed, 27 insertions(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index d74d529..cc7eb1e 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -4121,6 +4121,11 @@ static int ufshcd_disable_tx_lcc(struct ufs_hba *hba, bool peer)
return err;
 }
 
+static inline int ufshcd_disable_host_tx_lcc(struct ufs_hba *hba)
+{
+   return ufshcd_disable_tx_lcc(hba, false);
+}
+
 static inline int ufshcd_disable_device_tx_lcc(struct ufs_hba *hba)
 {
return ufshcd_disable_tx_lcc(hba, true);
@@ -4175,6 +4180,17 @@ static int ufshcd_link_startup(struct ufs_hba *hba)
ufshcd_dme_set(hba, UIC_ARG_MIB(TX_LCC_ENABLE), 1);
}
 
+   if (hba->quirks & UFSHCD_BROKEN_LCC_PROCESSING_ON_HOST) {
+   ret = ufshcd_disable_device_tx_lcc(hba);
+   if (ret)
+   goto out;
+   }
+
+   if (hba->quirks & UFSHCD_BROKEN_LCC_PROCESSING_ON_DEVICE) {
+   ret = ufshcd_disable_host_tx_lcc(hba);
+   if (ret)
+   goto out;
+   }
if (link_startup_again) {
link_startup_again = false;
retries = DME_LINKSTARTUP_RETRIES;
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index 1332e54..7a2dad3 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -591,6 +591,17 @@ struct ufs_hba {
 */
#define UFSHCD_QUIRK_PRDT_BYTE_GRAN 0x80
 
+   /*
+* If UFS device is having issue in processing LCC (Line Control
+* Command) coming from UFS host controller then enable this quirk.
+* When this quirk is enabled, host controller driver should disable
+* the LCC transmission on UFS host controller (by clearing
+* TX_LCC_ENABLE attribute of host to 0).
+*/
+   #define UFSHCD_BROKEN_LCC_PROCESSING_ON_DEVICE  0x100
+
+   #define UFSHCD_BROKEN_LCC_PROCESSING_ON_HOST    0x200
+
unsigned int quirks; /* Deviations from standard UFSHCI spec. */
 
/* Device deviations from standard UFS device spec. */
-- 



[PATCH 5/9] scsi: ufs: add reference counting for scsi block requests

2018-02-20 Thread Asutosh Das
From: Subhash Jadavani 

Currently we call scsi_block_requests()/scsi_unblock_requests()
whenever we want to block/unblock SCSI requests, but as there is no
reference counting, nesting of these calls can leave us in an undesired
state. Consider the following call sequence:
1. func1() calls scsi_block_requests() but calls func2() before
   calling scsi_unblock_requests()
2. func2() calls scsi_block_requests()
3. func2() calls scsi_unblock_requests()
4. func1() calls scsi_unblock_requests()

As there is no reference counting, SCSI requests are unblocked
after #3 instead of only after #4. Though we have not seen failures
from this so far, we might run into some in the future.
A better solution is to fix this by adding reference counting.
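
The nesting problem above can be replayed in a small userspace sketch. This is
not the driver code: struct fake_host and the fake_* helpers are hypothetical
stand-ins, and the locking the patch takes around the counter is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the SCSI host; it only tracks blocked state. */
struct fake_host {
	bool blocked;           /* what scsi_block_requests() would toggle */
	unsigned int block_cnt; /* nesting depth, in the spirit of scsi_block_reqs_cnt */
};

/* Block only on the 0 -> 1 transition of the counter. */
static void fake_block_requests(struct fake_host *h)
{
	if (h->block_cnt++ == 0)
		h->blocked = true;
}

/* Unblock only on the 1 -> 0 transition of the counter. */
static void fake_unblock_requests(struct fake_host *h)
{
	assert(h->block_cnt > 0); /* an unbalanced unblock is a caller bug */
	if (--h->block_cnt == 0)
		h->blocked = false;
}

/* Replays steps 1-4 and reports whether requests were still blocked after step 3. */
static bool blocked_after_inner_unblock(struct fake_host *h)
{
	bool still_blocked;

	fake_block_requests(h);   /* 1. func1() blocks */
	fake_block_requests(h);   /* 2. func2() blocks */
	fake_unblock_requests(h); /* 3. func2() unblocks */
	still_blocked = h->blocked;
	fake_unblock_requests(h); /* 4. func1() unblocks */
	return still_blocked;
}
```

With the counter in place, blocked_after_inner_unblock() reports that requests
stay blocked after step 3 and are released only by step 4.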

Signed-off-by: Subhash Jadavani 
Signed-off-by: Can Guo 
Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufshcd.c | 44 +---
 drivers/scsi/ufs/ufshcd.h |  5 +
 2 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 7a4df95..987b81b 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -264,6 +264,36 @@ static inline void ufshcd_disable_irq(struct ufs_hba *hba)
}
 }
 
+void ufshcd_scsi_unblock_requests(struct ufs_hba *hba)
+{
+   unsigned long flags;
+   bool unblock = false;
+
+   spin_lock_irqsave(hba->host->host_lock, flags);
+   hba->scsi_block_reqs_cnt--;
+   unblock = !hba->scsi_block_reqs_cnt;
+   spin_unlock_irqrestore(hba->host->host_lock, flags);
+   if (unblock)
+   scsi_unblock_requests(hba->host);
+}
+EXPORT_SYMBOL(ufshcd_scsi_unblock_requests);
+
+static inline void __ufshcd_scsi_block_requests(struct ufs_hba *hba)
+{
+   if (!hba->scsi_block_reqs_cnt++)
+   scsi_block_requests(hba->host);
+}
+
+void ufshcd_scsi_block_requests(struct ufs_hba *hba)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(hba->host->host_lock, flags);
+   __ufshcd_scsi_block_requests(hba);
+   spin_unlock_irqrestore(hba->host->host_lock, flags);
+}
+EXPORT_SYMBOL(ufshcd_scsi_block_requests);
+
 /* replace non-printable or non-ASCII characters with spaces */
 static inline void ufshcd_remove_non_printable(char *val)
 {
@@ -1079,12 +1109,12 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba)
 * make sure that there are no outstanding requests when
 * clock scaling is in progress
 */
-   scsi_block_requests(hba->host);
+   ufshcd_scsi_block_requests(hba);
down_write(&hba->clk_scaling_lock);
if (ufshcd_wait_for_doorbell_clr(hba, DOORBELL_CLR_TOUT_US)) {
ret = -EBUSY;
up_write(&hba->clk_scaling_lock);
-   scsi_unblock_requests(hba->host);
+   ufshcd_scsi_unblock_requests(hba);
}
 
return ret;
@@ -1093,7 +1123,7 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba)
 static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba)
 {
up_write(&hba->clk_scaling_lock);
-   scsi_unblock_requests(hba->host);
+   ufshcd_scsi_unblock_requests(hba);
 }
 
 /**
@@ -1413,7 +1443,7 @@ static void ufshcd_ungate_work(struct work_struct *work)
hba->clk_gating.is_suspended = false;
}
 unblock_reqs:
-   scsi_unblock_requests(hba->host);
+   ufshcd_scsi_unblock_requests(hba);
 }
 
 /**
@@ -1469,7 +1499,7 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)
 * work and to enable clocks.
 */
case CLKS_OFF:
-   scsi_block_requests(hba->host);
+   __ufshcd_scsi_block_requests(hba);
hba->clk_gating.state = REQ_CLKS_ON;
trace_ufshcd_clk_gating(dev_name(hba->dev),
hba->clk_gating.state);
@@ -5197,7 +5227,7 @@ static void ufshcd_err_handler(struct work_struct *work)
 
 out:
spin_unlock_irqrestore(hba->host->host_lock, flags);
-   scsi_unblock_requests(hba->host);
+   ufshcd_scsi_unblock_requests(hba);
ufshcd_release(hba);
pm_runtime_put_sync(hba->dev);
 }
@@ -5299,7 +5329,7 @@ static void ufshcd_check_errors(struct ufs_hba *hba)
/* handle fatal errors only when link is functional */
if (hba->ufshcd_state == UFSHCD_STATE_OPERATIONAL) {
/* block commands from scsi mid-layer */
-   scsi_block_requests(hba->host);
+   __ufshcd_scsi_block_requests(hba);
 
hba->ufshcd_state = UFSHCD_STATE_EH_SCHEDULED;
 
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index 7a2dad3..4385741 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -498,6 +498,7 @@ struct ufs_stats {
  * @urgent_bkops_lvl: keeps track of urgent bkops level for device
  * @is_urgent_bkops_lvl_checked

[PATCH 9/9] scsi: ufs: Add clock ungating to a separate workqueue

2018-02-20 Thread Asutosh Das
From: Vijay Viswanath 

The UFS driver can receive a request during memory reclaim by kswapd.
When the UFS driver queues the ungate work and there are no idle
workers, kthreadd is invoked to create a new kworker. Since the
kswapd task holds a mutex which kthreadd also needs, this can cause
a deadlock. So the ungate work must be done in a separate
workqueue with the WQ_MEM_RECLAIM flag enabled. Such a workqueue has
a rescuer thread which is called when the above deadlock
condition is possible.

Signed-off-by: Vijay Viswanath 
Signed-off-by: Can Guo 
Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufshcd.c | 10 +-
 drivers/scsi/ufs/ufshcd.h |  1 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 6541e1d..bb3382a 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1503,7 +1503,8 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)
hba->clk_gating.state = REQ_CLKS_ON;
trace_ufshcd_clk_gating(dev_name(hba->dev),
hba->clk_gating.state);
-   schedule_work(&hba->clk_gating.ungate_work);
+   queue_work(hba->clk_gating.clk_gating_workq,
+  &hba->clk_gating.ungate_work);
/*
 * fall through to check if we should wait for this
 * work to be done or not.
@@ -1689,6 +1690,8 @@ static ssize_t ufshcd_clkgate_enable_store(struct device *dev,
 
 static void ufshcd_init_clk_gating(struct ufs_hba *hba)
 {
+   char wq_name[sizeof("ufs_clk_gating_00")];
+
if (!ufshcd_is_clkgating_allowed(hba))
return;
 
@@ -1696,6 +1699,10 @@ static void ufshcd_init_clk_gating(struct ufs_hba *hba)
INIT_DELAYED_WORK(&hba->clk_gating.gate_work, ufshcd_gate_work);
INIT_WORK(&hba->clk_gating.ungate_work, ufshcd_ungate_work);
 
+   snprintf(wq_name, ARRAY_SIZE(wq_name), "ufs_clk_gating_%d",
+hba->host->host_no);
+   hba->clk_gating.clk_gating_workq = create_singlethread_workqueue(wq_name);
+
hba->clk_gating.is_enabled = true;
 
hba->clk_gating.delay_attr.show = ufshcd_clkgate_delay_show;
@@ -1723,6 +1730,7 @@ static void ufshcd_exit_clk_gating(struct ufs_hba *hba)
device_remove_file(hba->dev, &hba->clk_gating.enable_attr);
cancel_work_sync(&hba->clk_gating.ungate_work);
cancel_delayed_work_sync(&hba->clk_gating.gate_work);
+   destroy_workqueue(hba->clk_gating.clk_gating_workq);
 }
 
 /* Must be called with host lock acquired */
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index 4385741..570c33e 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -361,6 +361,7 @@ struct ufs_clk_gating {
struct device_attribute enable_attr;
bool is_enabled;
int active_reqs;
+   struct workqueue_struct *clk_gating_workq;
 };
 
 struct ufs_saved_pwr_info {
-- 



[PATCH 4/9] scsi: ufshcd: fix possible unclocked register access

2018-02-20 Thread Asutosh Das
From: Subhash Jadavani 

Vendor-specific setup_clocks ops may depend on clocks managed by the
ufshcd driver, so if the vendor-specific setup_clocks callback is called
when the required clocks are turned off, it results in an unclocked
register access.

This change makes sure that the required clocks are enabled before the
vendor-specific setup_clocks callback is called.
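
The intended ordering can be sketched in userspace as follows. struct fake_hba
and the fake_* functions are hypothetical names for illustration only; the
sketch models just the rule "call the vendor hook while the core clocks are
on", not the real clock framework.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the host: one core clock and a fault counter. */
struct fake_hba {
	bool core_clk_on;       /* clocks managed by the core driver */
	int unclocked_accesses; /* vendor hook calls made while clocks were off */
};

/* Hypothetical vendor hook: it touches registers that need the core clock. */
static void fake_vendor_setup_clocks(struct fake_hba *hba)
{
	if (!hba->core_clk_on)
		hba->unclocked_accesses++;
}

/* Ordering described above: hook before disabling, hook after enabling. */
static void fake_setup_clocks(struct fake_hba *hba, bool on)
{
	if (!on)
		fake_vendor_setup_clocks(hba); /* pre-change: clocks are still on */

	hba->core_clk_on = on;

	if (on)
		fake_vendor_setup_clocks(hba); /* post-change: clocks are already on */
}

/* Turns the clocks on and off once; returns the number of unclocked hook calls. */
static int fake_power_cycle(struct fake_hba *hba)
{
	fake_setup_clocks(hba, true);
	assert(hba->core_clk_on);
	fake_setup_clocks(hba, false);
	assert(!hba->core_clk_on);
	return hba->unclocked_accesses;
}
```

Starting from clocks off, one full on/off cycle in this model never runs the
vendor hook without a clock.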

Signed-off-by: Subhash Jadavani 
Signed-off-by: Venkat Gopalakrishnan 
Signed-off-by: Can Guo 
Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufshcd.c | 26 --
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 8d3f8ce..7a4df95 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6782,9 +6782,16 @@ static int __ufshcd_setup_clocks(struct ufs_hba *hba, bool on,
if (list_empty(head))
goto out;
 
-   ret = ufshcd_vops_setup_clocks(hba, on, PRE_CHANGE);
-   if (ret)
-   return ret;
+   /*
+* vendor specific setup_clocks ops may depend on clocks managed by
+* this standard driver hence call the vendor specific setup_clocks
+* before disabling the clocks managed here.
+*/
+   if (!on) {
+   ret = ufshcd_vops_setup_clocks(hba, on, PRE_CHANGE);
+   if (ret)
+   return ret;
+   }
 
list_for_each_entry(clki, head, list) {
if (!IS_ERR_OR_NULL(clki->clk)) {
@@ -6808,9 +6815,16 @@ static int __ufshcd_setup_clocks(struct ufs_hba *hba, bool on,
}
}
 
-   ret = ufshcd_vops_setup_clocks(hba, on, POST_CHANGE);
-   if (ret)
-   return ret;
+   /*
+* vendor specific setup_clocks ops may depend on clocks managed by
+* this standard driver hence call the vendor specific setup_clocks
+* after enabling the clocks managed here.
+*/
+   if (on) {
+   ret = ufshcd_vops_setup_clocks(hba, on, POST_CHANGE);
+   if (ret)
+   return ret;
+   }
 
 out:
if (ret) {
-- 



[PATCH 6/9] scsi: ufs-qcom: remove broken hci version quirk

2018-02-20 Thread Asutosh Das
From: Subhash Jadavani 

UFSHCD_QUIRK_BROKEN_UFS_HCI_VERSION is only applicable for QCOM UFS host
controller version 2.x.y and this has been fixed from version 3.x.y
onwards, hence this change removes this quirk for version 3.x.y onwards.

Signed-off-by: Subhash Jadavani 
Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufs-qcom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
index 2b38db2..221820a 100644
--- a/drivers/scsi/ufs/ufs-qcom.c
+++ b/drivers/scsi/ufs/ufs-qcom.c
@@ -1098,7 +1098,7 @@ static void ufs_qcom_advertise_quirks(struct ufs_hba *hba)
hba->quirks |= UFSHCD_QUIRK_BROKEN_LCC;
}
 
-   if (host->hw_ver.major >= 0x2) {
+   if (host->hw_ver.major == 0x2) {
hba->quirks |= UFSHCD_QUIRK_BROKEN_UFS_HCI_VERSION;
 
if (!ufs_qcom_cap_qunipro(host))
-- 



[PATCH 7/9] scsi: ufs: make sure all interrupts are processed

2018-02-20 Thread Asutosh Das
From: Venkat Gopalakrishnan 

As multiple requests are submitted to the UFS host controller in
parallel, there could be instances where the command completion
interrupt arrives later for a request that has already been processed,
because the corresponding doorbell was cleared when handling
the previous interrupt. Read the interrupt status in a loop after
processing the received interrupt to catch such interrupts and
handle them.
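
A bounded re-read loop of this kind can be modelled in userspace as below. The
fake_ctrl structure, FAKE_NUTRS and the "late" array are assumptions made for
the sketch (they simulate completions that arrive while an earlier one is being
serviced); the interrupt-enable mask used by the real handler is omitted.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_NUTRS 4 /* hypothetical number of transfer request slots */

struct fake_ctrl {
	uint32_t status;           /* pending interrupt bits, cleared on acknowledge */
	uint32_t late[FAKE_NUTRS]; /* bits raised while service pass i is running */
	int passes;                /* service passes made so far */
	uint32_t handled;          /* every bit that has been serviced */
};

/* Service one snapshot; a late completion may land while this runs. */
static void fake_service(struct fake_ctrl *c, uint32_t bits)
{
	assert(bits != 0);
	c->handled |= bits;
	if (c->passes < FAKE_NUTRS)
		c->status |= c->late[c->passes];
	c->passes++;
}

/* Returns 1 if anything was handled and 0 otherwise. */
static int fake_intr(struct fake_ctrl *c)
{
	int retries = FAKE_NUTRS;
	int handled = 0;
	uint32_t snapshot = c->status;

	do {
		c->status &= ~snapshot; /* acknowledge what is about to be serviced */
		if (snapshot) {
			fake_service(c, snapshot);
			handled = 1;
		}
		snapshot = c->status;   /* re-read: did anything arrive meanwhile? */
	} while (snapshot && --retries);

	return handled;
}
```

The retry bound keeps the handler from spinning forever if the status register
never goes quiet, at the cost of leaving work for the next interrupt.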

Signed-off-by: Venkat Gopalakrishnan 
Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufshcd.c | 27 +++
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 987b81b..4d4c7d6 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5406,19 +5406,30 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba)
u32 intr_status, enabled_intr_status;
irqreturn_t retval = IRQ_NONE;
struct ufs_hba *hba = __hba;
+   int retries = hba->nutrs;
 
spin_lock(hba->host->host_lock);
intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);
-   enabled_intr_status =
-   intr_status & ufshcd_readl(hba, REG_INTERRUPT_ENABLE);
 
-   if (intr_status)
-   ufshcd_writel(hba, intr_status, REG_INTERRUPT_STATUS);
+   /*
+* There could be max of hba->nutrs reqs in flight and in worst case
+* if the reqs get finished 1 by 1 after the interrupt status is
+* read, make sure we handle them by checking the interrupt status
+* again in a loop until we process all of the reqs before returning.
+*/
+   do {
+   enabled_intr_status =
+   intr_status & ufshcd_readl(hba, REG_INTERRUPT_ENABLE);
+   if (intr_status)
+   ufshcd_writel(hba, intr_status, REG_INTERRUPT_STATUS);
+   if (enabled_intr_status) {
+   ufshcd_sl_intr(hba, enabled_intr_status);
+   retval = IRQ_HANDLED;
+   }
+
+   intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);
+   } while (intr_status && --retries);
 
-   if (enabled_intr_status) {
-   ufshcd_sl_intr(hba, enabled_intr_status);
-   retval = IRQ_HANDLED;
-   }
spin_unlock(hba->host->host_lock);
return retval;
 }
-- 



Re: [PATCH 11/23] kconfig: add 'shell-stdout' function

2018-02-20 Thread Masahiro Yamada
2018-02-20 3:01 GMT+09:00 Linus Torvalds :
> On Mon, Feb 19, 2018 at 9:44 AM, Linus Torvalds
>  wrote:
>>
>> I do like your "success"/"stdout" more than "shell"/"shell-stdout",
>> because with that naming I don't get the feeling that one should
>> subsume the other.
>
> Hmm. Thinking about it some more, I really would prefer just "$(shell
> ...)" everywhere.
>
> But it would be nice if perhaps the error handling would match the
> context somehow.
>
> I'm wondering if this might tie into the whole quoting discussion in
> the other thread.
>
> Because the rule could be:
>
> (a) unquoted $(shell ) is a bool, and failing is ok (and turns into
> y/n depending on whether successful or failing)
>
> So
>
>   config CC_IS_GCC
>   bool
>   default $(shell $CC --version | grep -q gcc)
>
> works automatically.
>
> (b) but with quoting, $(shell ) is a string, and failing is an error
>
> So
>
>   config GCC_VERSION
>   int
>   default "$(shell-stdout $srctree/scripts/gcc-version.sh $CC
> | sed 's/^0*//')" if CC_IS_GCC
>   default 0
>
> would need those quotes, and if the shell-script returns a failure,
> we'd _abort_.


GCC_VERSION is of int type.

Setting aside the Kconfig internals, I prefer 50700 to "50700".

Common sense tells me that integers should not be quoted.




IMO, it is better to use different names for different purposes.
So, 'stdout' and 'success' look good to me.



BTW, I noticed just one built-in function is enough
because 'success' can be derived from 'stdout'.


So, my plan is, implement $(shell ...) as a built-in function.
This returns the stdout from the command.


Then, implement 'success' as a textual shorthand
using a macro, like this:

macro success $(shell ($(1) && echo y) || echo n)


Macros can be expanded recursively, so cc-option
can be implemented on top of the 'success' macro.





-- 
Best Regards
Masahiro Yamada


[PATCH] ocxl: Add get_metadata IOCTL to share OCXL information to userspace

2018-02-20 Thread Alastair D'Silva
From: Alastair D'Silva 

Some required information is not currently exposed to userspace (e.g. the
PASID). Pass this information back, along with other information which
is currently communicated via sysfs, which saves some parsing effort in
userspace.
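
As a sanity check on the proposed layout, the struct can be mirrored in
userspace and its size verified at compile time. The mirror below is a sketch:
struct metadata_mirror and metadata_reserved_bytes() are hypothetical names,
and the <stdint.h> types stand in for the kernel's __u16/__u8/__u32/__u64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the proposed struct ocxl_ioctl_get_metadata. */
struct metadata_mirror {
	uint16_t version;

	/* Version 0 fields */
	uint8_t  afu_version_major;
	uint8_t  afu_version_minor;
	uint32_t pasid;

	uint64_t pp_mmio_size;
	uint64_t global_mmio_size;

	/* Room for fields added by later versions */
	uint64_t reserved[13];
};

/* Compile-time layout checks (C11 static_assert from <assert.h>). */
static_assert(sizeof(struct metadata_mirror) == 16 * sizeof(uint64_t),
	      "struct should total 16 u64 words");
static_assert(offsetof(struct metadata_mirror, pasid) == 4,
	      "pasid should follow the three small header fields");
static_assert(offsetof(struct metadata_mirror, pp_mmio_size) == 8,
	      "first u64 field should start at byte 8 with no padding");

/* Number of bytes still reserved for future versions. */
static size_t metadata_reserved_bytes(void)
{
	return sizeof(((struct metadata_mirror *)0)->reserved);
}
```

The first 8 bytes hold the version, AFU version and PASID without padding, so
the two MMIO sizes plus 13 reserved words bring the total to 16 u64 words.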

Signed-off-by: Alastair D'Silva 
---
 drivers/misc/ocxl/file.c | 27 +++
 include/uapi/misc/ocxl.h | 22 ++
 2 files changed, 49 insertions(+)

diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c
index d9aa407db06a..11514a8444e5 100644
--- a/drivers/misc/ocxl/file.c
+++ b/drivers/misc/ocxl/file.c
@@ -102,10 +102,32 @@ static long afu_ioctl_attach(struct ocxl_context *ctx,
return rc;
 }
 
+static long afu_ioctl_get_metadata(struct ocxl_context *ctx,
+   struct ocxl_ioctl_get_metadata __user *uarg)
+{
+   struct ocxl_ioctl_get_metadata arg;
+
+   memset(&arg, 0, sizeof(arg));
+
+   arg.version = 0;
+
+   arg.afu_version_major = ctx->afu->config.version_major;
+   arg.afu_version_minor = ctx->afu->config.version_minor;
+   arg.pasid = ctx->pasid;
+   arg.pp_mmio_size = ctx->afu->config.pp_mmio_stride;
+   arg.global_mmio_size = ctx->afu->config.global_mmio_size;
+
+   if (copy_to_user(uarg, &arg, sizeof(arg)))
+   return -EFAULT;
+
+   return 0;
+}
+
 #define CMD_STR(x) (x == OCXL_IOCTL_ATTACH ? "ATTACH" :   \
x == OCXL_IOCTL_IRQ_ALLOC ? "IRQ_ALLOC" :   \
x == OCXL_IOCTL_IRQ_FREE ? "IRQ_FREE" : \
x == OCXL_IOCTL_IRQ_SET_FD ? "IRQ_SET_FD" : \
+   x == OCXL_IOCTL_GET_METADATA ? "GET_METADATA" : \
"UNKNOWN")
 
 static long afu_ioctl(struct file *file, unsigned int cmd,
@@ -157,6 +179,11 @@ static long afu_ioctl(struct file *file, unsigned int cmd,
irq_fd.eventfd);
break;
 
+   case OCXL_IOCTL_GET_METADATA:
+   rc = afu_ioctl_get_metadata(ctx,
+   (struct ocxl_ioctl_get_metadata __user *) args);
+   break;
+
default:
rc = -EINVAL;
}
diff --git a/include/uapi/misc/ocxl.h b/include/uapi/misc/ocxl.h
index 4b0b0b756f3e..16e1f48ce280 100644
--- a/include/uapi/misc/ocxl.h
+++ b/include/uapi/misc/ocxl.h
@@ -32,6 +32,27 @@ struct ocxl_ioctl_attach {
__u64 reserved3;
 };
 
+/*
+ * Version contains the version of the struct.
+ * Versions will always be backwards compatible, that is, new versions will not
+ * alter existing fields
+ */
+struct ocxl_ioctl_get_metadata {
+   __u16 version;
+
+   // Version 0 fields
+   __u8  afu_version_major;
+   __u8  afu_version_minor;
+   __u32 pasid;
+
+   __u64 pp_mmio_size;
+   __u64 global_mmio_size;
+
+   // End version 0 fields
+
+   __u64 reserved[13]; // Total of 16*u64
+};
+
 struct ocxl_ioctl_irq_fd {
__u64 irq_offset;
__s32 eventfd;
@@ -45,5 +66,6 @@ struct ocxl_ioctl_irq_fd {
 #define OCXL_IOCTL_IRQ_ALLOC   _IOR(OCXL_MAGIC, 0x11, __u64)
 #define OCXL_IOCTL_IRQ_FREE    _IOW(OCXL_MAGIC, 0x12, __u64)
 #define OCXL_IOCTL_IRQ_SET_FD  _IOW(OCXL_MAGIC, 0x13, struct ocxl_ioctl_irq_fd)
+#define OCXL_IOCTL_GET_METADATA _IOR(OCXL_MAGIC, 0x14, struct ocxl_ioctl_get_metadata)
 
 #endif /* _UAPI_MISC_OCXL_H */
-- 
2.14.3




Re: [PATCH v2 1/1] clk: npcm750: update text with fixed clocks

2018-02-20 Thread Brendan Higgins
On Mon, Feb 19, 2018 at 6:49 AM, Rob Herring  wrote:
> On Thu, Feb 15, 2018 at 02:38:12PM -0800, Brendan Higgins wrote:
>> On Thu, Feb 15, 2018 at 5:39 AM, Tali Perry  wrote:
>> >
>> > Signed-off-by: Tali Perry 
>> >
>>  
>>
>> I think this should probably be rolled into [PATCH v2 1/1] npcm750: add fixed
>> clocks (moved from drivers/clk/clk-npcm7xx.c):
>> https://www.spinics.net/lists/arm-kernel/msg634678.html
>
> No, binding docs, dts files and driver code should all be separate
> patches.

My mistake. This patch has a dt-bindings include file; should the include file
go in here, with the dtsi changes, or in its own separate patch?

Thanks


Re: [PATCH 10/23] stack-protector: test compiler capability in Kconfig and drop AUTO mode

2018-02-20 Thread Masahiro Yamada
2018-02-17 3:38 GMT+09:00 Masahiro Yamada :
> Add CC_HAS_STACKPROTECTOR(_STRONG) to test if the compiler supports
> -fstack-protector(-strong) option.
>
> X86 has additional shell scripts in case the compiler supports the
> option, but generates broken code.  I added CC_HAS_SANE_STACKPROTECTOR
> to test this.  I had to add -m32 to gcc-x86_32-has-stack-protector.sh
> to make it work correctly.
>
> If the compiler does not support the option, the menu is automatically
> hidden.  If _STRONG is not supported, it will fall back to _REGULAR.
> This means, _AUTO is implicitly supported in the dependency solver of
> Kconfig, hence removed.
>
> I also turned the 'choice' into only two boolean symbols.  The use of
> 'choice' is not a good idea here, because all of all{yes,mod,no}config
> would choose the first visible value, while we want allnoconfig to
> disable as many features as possible.
>
> I did not add CC_HAS_STACKPROTECTOR_NONE in the hope that GCC versions
> we support will recognize -fno-stack-protector.
>
> If this turns out to be a problem, it will be possible to do this:
>
> stackp-flags-$(CONFIG_CC_HAS_STACKPROTECTOR_NONE) := -fno-stack-protector
> stackp-flags-$(CONFIG_CC_STACKPROTECTOR)  := -fstack-protector
> stackp-flags-$(CONFIG_CC_STACKPROTECTOR_STRONG)   := -fstack-protector-strong
>
> Signed-off-by: Masahiro Yamada 
> ---
>
>  Makefile  | 93 ++-
>  arch/Kconfig  | 37 ++--
>  arch/x86/Kconfig  |  8 ++-
>  scripts/gcc-x86_32-has-stack-protector.sh |  7 +--
>  scripts/gcc-x86_64-has-stack-protector.sh |  5 --
>  5 files changed, 30 insertions(+), 120 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 9a8c689..e9fc7c9 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -675,55 +675,11 @@ ifneq ($(CONFIG_FRAME_WARN),0)
>  KBUILD_CFLAGS += $(call cc-option,-Wframe-larger-than=${CONFIG_FRAME_WARN})
>  endif
>
> -# This selects the stack protector compiler flag. Testing it is delayed
> -# until after .config has been reprocessed, in the prepare-compiler-check
> -# target.
> -ifdef CONFIG_CC_STACKPROTECTOR_AUTO
> -  stackp-flag := $(call cc-option,-fstack-protector-strong,$(call cc-option,-fstack-protector))
> -  stackp-name := AUTO
> -else
> -ifdef CONFIG_CC_STACKPROTECTOR_REGULAR
> -  stackp-flag := -fstack-protector
> -  stackp-name := REGULAR
> -else
> -ifdef CONFIG_CC_STACKPROTECTOR_STRONG
> -  stackp-flag := -fstack-protector-strong
> -  stackp-name := STRONG
> -else
> -  # If either there is no stack protector for this architecture or
> -  # CONFIG_CC_STACKPROTECTOR_NONE is selected, we're done, and $(stackp-name)
> -  # is empty, skipping all remaining stack protector tests.
> -  #
> -  # Force off for distro compilers that enable stack protector by default.
> -  KBUILD_CFLAGS += $(call cc-option, -fno-stack-protector)
> -endif
> -endif
> -endif
> -# Find arch-specific stack protector compiler sanity-checking script.
> -ifdef stackp-name
> -ifneq ($(stackp-flag),)
> -  stackp-path := $(srctree)/scripts/gcc-$(SRCARCH)_$(BITS)-has-stack-protector.sh
> -  stackp-check := $(wildcard $(stackp-path))
> -  # If the wildcard test matches a test script, run it to check functionality.
> -  ifdef stackp-check
> -ifneq ($(shell $(CONFIG_SHELL) $(stackp-check) $(CC) $(KBUILD_CPPFLAGS) $(biarch)),y)
> -  stackp-broken := y
> -endif
> -  endif
> -  ifndef stackp-broken
> -# If the stack protector is functional, enable code that depends on it.
> -KBUILD_CPPFLAGS += -DCONFIG_CC_STACKPROTECTOR
> -# Either we've already detected the flag (for AUTO) or we'll fail the
> -# build in the prepare-compiler-check rule (for specific flag).
> -KBUILD_CFLAGS += $(stackp-flag)
> -  else
> -# We have to make sure stack protector is unconditionally disabled if
> -# the compiler is broken (in case we're going to continue the build in
> -# AUTO mode).
> -KBUILD_CFLAGS += $(call cc-option, -fno-stack-protector)
> -  endif
> -endif
> -endif
> +stackp-flags-y := -fno-stack-protector
> +stackp-flags-$(CONFIG_CC_STACKPROTECTOR)   := -fstack-protector
> +stackp-flags-$(CONFIG_CC_STACKPROTECTOR_STRONG)   := -fstack-protector-strong
> +
> +KBUILD_CFLAGS += $(stackp-flags-y)
>
>  ifeq ($(cc-name),clang)
>  KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,)
> @@ -1079,7 +1035,7 @@ endif
>  # prepare2 creates a makefile if using a separate output directory.
>  # From this point forward, .config has been reprocessed, so any rules
>  # that need to depend on updated CONFIG_* values can be checked here.
> -prepare2: prepare3 prepare-compiler-check outputmakefile asm-generic
> +prepare2: prepare3 outputmakefile asm-generic
>
>  prepare1: prepare2 $(version_h) include/generated/utsrelease.h \
> include/config/auto.conf
> @@ -1105,43 +1061,6 @@ uapi-asm-generic:
>  PHONY += prepare-objto

[PATCH 2/2] ASoC: support ROHM BD28623 codec

2018-02-20 Thread Katsuhiro Suzuki
This patch adds support for the ROHM BD28623MUV
Class D speaker amplifier for flat-panel TVs.
This IC delivers an output power of 20W + 20W.

Signed-off-by: Katsuhiro Suzuki 
---
 sound/soc/codecs/Kconfig   |   8 ++
 sound/soc/codecs/Makefile  |   2 +
 sound/soc/codecs/bd28623.c | 258 +
 3 files changed, 268 insertions(+)
 create mode 100644 sound/soc/codecs/bd28623.c

diff --git a/sound/soc/codecs/Kconfig b/sound/soc/codecs/Kconfig
index f72a90104a58..6a53e188ead6 100644
--- a/sound/soc/codecs/Kconfig
+++ b/sound/soc/codecs/Kconfig
@@ -47,6 +47,7 @@ config SND_SOC_ALL_CODECS
select SND_SOC_ALC5623 if I2C
select SND_SOC_ALC5632 if I2C
select SND_SOC_BT_SCO
+   select SND_SOC_BD28623
select SND_SOC_CQ0093VC
select SND_SOC_CS35L32 if I2C
select SND_SOC_CS35L33 if I2C
@@ -418,6 +419,13 @@ config SND_SOC_ALC5623
 config SND_SOC_ALC5632
tristate
 
+config SND_SOC_BD28623
+   tristate "ROHM BD28623 CODEC"
+   help
+ Enable support for ROHM BD28623MUV Class D speaker amplifier.
+ This codec does not have any control buses such as I2C, it
+ detect format of I2S automatically.
+
 config SND_SOC_BT_SCO
tristate "Dummy BT SCO codec driver"
 
diff --git a/sound/soc/codecs/Makefile b/sound/soc/codecs/Makefile
index 56c3252820d2..f2c710e16557 100644
--- a/sound/soc/codecs/Makefile
+++ b/sound/soc/codecs/Makefile
@@ -37,6 +37,7 @@ snd-soc-ak4671-objs := ak4671.o
 snd-soc-ak5386-objs := ak5386.o
 snd-soc-ak5558-objs := ak5558.o
 snd-soc-arizona-objs := arizona.o
+snd-soc-bd28623-objs := bd28623.o
 snd-soc-bt-sco-objs := bt-sco.o
 snd-soc-cq93vc-objs := cq93vc.o
 snd-soc-cs35l32-objs := cs35l32.o
@@ -285,6 +286,7 @@ obj-$(CONFIG_SND_SOC_AK5558)+= snd-soc-ak5558.o
 obj-$(CONFIG_SND_SOC_ALC5623)+= snd-soc-alc5623.o
 obj-$(CONFIG_SND_SOC_ALC5632)  += snd-soc-alc5632.o
 obj-$(CONFIG_SND_SOC_ARIZONA)  += snd-soc-arizona.o
+obj-$(CONFIG_SND_SOC_BD28623)  += snd-soc-bd28623.o
 obj-$(CONFIG_SND_SOC_BT_SCO)   += snd-soc-bt-sco.o
 obj-$(CONFIG_SND_SOC_CQ0093VC) += snd-soc-cq93vc.o
 obj-$(CONFIG_SND_SOC_CS35L32)  += snd-soc-cs35l32.o
diff --git a/sound/soc/codecs/bd28623.c b/sound/soc/codecs/bd28623.c
new file mode 100644
index ..672c8e790e24
--- /dev/null
+++ b/sound/soc/codecs/bd28623.c
@@ -0,0 +1,258 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ROHM BD28623MUV class D speaker amplifier codec driver.
+ *
+ * Copyright (c) 2018 Socionext Inc.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define BD28623_NUM_SUPPLIES 3
+
+static const char *const bd28623_supply_names[BD28623_NUM_SUPPLIES] = {
+   "VCCA",
+   "VCCP1",
+   "VCCP2",
+};
+
+struct bd28623_priv {
+   struct platform_device *pdev;
+   struct regulator_bulk_data supplies[BD28623_NUM_SUPPLIES];
+   struct gpio_desc *reset_gpio;
+   struct gpio_desc *mute_gpio;
+
+   int switch_spk;
+};
+
+static const struct snd_soc_dapm_widget bd28623_widgets[] = {
+   SND_SOC_DAPM_DAC("DAC", "Playback", SND_SOC_NOPM, 0, 0),
+   SND_SOC_DAPM_OUTPUT("OUT1P"),
+   SND_SOC_DAPM_OUTPUT("OUT1N"),
+   SND_SOC_DAPM_OUTPUT("OUT2P"),
+   SND_SOC_DAPM_OUTPUT("OUT2N"),
+};
+
+static const struct snd_soc_dapm_route bd28623_routes[] = {
+   { "OUT1P", NULL, "DAC" },
+   { "OUT1N", NULL, "DAC" },
+   { "OUT2P", NULL, "DAC" },
+   { "OUT2N", NULL, "DAC" },
+};
+
+static int bd28623_power_on(struct bd28623_priv *bd)
+{
+   struct device *dev = &bd->pdev->dev;
+   int ret;
+
+   ret = regulator_bulk_enable(ARRAY_SIZE(bd->supplies), bd->supplies);
+   if (ret) {
+   dev_err(dev, "Failed to enable supplies: %d\n", ret);
+   return ret;
+   }
+
+   gpiod_set_value(bd->reset_gpio, 0);
+   usleep_range(30, 40);
+
+   return 0;
+}
+
+static void bd28623_power_off(struct bd28623_priv *bd)
+{
+   gpiod_set_value(bd->reset_gpio, 1);
+
+   regulator_bulk_disable(ARRAY_SIZE(bd->supplies), bd->supplies);
+}
+
+static int bd28623_update_switch_spk(struct bd28623_priv *bd)
+{
+   if (bd->switch_spk)
+   gpiod_set_value(bd->mute_gpio, 0);
+   else
+   gpiod_set_value(bd->mute_gpio, 1);
+
+   return 0;
+}
+
+static int bd28623_get_switch_spk(struct snd_kcontrol *kcontrol,
+ struct snd_ctl_elem_value *ucontrol)
+{
+   struct snd_soc_component *component =
+   snd_soc_kcontrol_component(kcontrol);
+   struct bd28623_priv *bd = snd_soc_component_get_drvdata(component);
+
+   ucontrol->value.integer.value[0] = bd->switch_spk;
+
+   return 0;
+}
+
+static int bd28623_set_switch_spk(struct snd_kcontrol *kcontrol,
+ struct snd_ctl_elem_value *ucontrol)
+{
+   struct snd_soc_component *component =
+   snd_soc_kcontrol_component(kcontrol);
+  

[PATCH 0/2] ASoC: add support for ROHM BD28623 codec

2018-02-20 Thread Katsuhiro Suzuki
This patch series adds a codec driver for the ROHM BD28623MUV class D
speaker amplifier.

The driver relies only on information from the HW specification document
that is available on the ROHM website.

http://www.rohm.com/web/global/products/-/product/BD28623MUV

Katsuhiro Suzuki (2):
  ASoC: add DT bindings documentation for ROHM BD28623 codec
  ASoC: support ROHM BD28623 codec

 .../devicetree/bindings/sound/rohm,bd28623.txt |  26 +++
 sound/soc/codecs/Kconfig   |   8 +
 sound/soc/codecs/Makefile  |   2 +
 sound/soc/codecs/bd28623.c | 258 +
 4 files changed, 294 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/sound/rohm,bd28623.txt
 create mode 100644 sound/soc/codecs/bd28623.c

-- 
2.16.1



[PATCH 1/2] ASoC: add DT bindings documentation for ROHM BD28623 codec

2018-02-20 Thread Katsuhiro Suzuki
This patch adds DT bindings documentation for ROHM BD28623MUV
class D speaker amplifier.

Signed-off-by: Katsuhiro Suzuki 
---
 .../devicetree/bindings/sound/rohm,bd28623.txt | 26 ++
 1 file changed, 26 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/sound/rohm,bd28623.txt

diff --git a/Documentation/devicetree/bindings/sound/rohm,bd28623.txt b/Documentation/devicetree/bindings/sound/rohm,bd28623.txt
new file mode 100644
index ..954c689b5b08
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/rohm,bd28623.txt
@@ -0,0 +1,26 @@
+ROHM BD28623MUV Class D speaker amplifier for digital input
+
+This codec does not have any control buses such as I2C, it detect format and
+rate of I2S signal automatically. It has two signals that can be connected
+to GPIOs: reset and mute.
+
+Required properties:
+- compatible  : should be "rohm,bd28623"
+- #sound-dai-cells: should be 0.
+- reset-gpios : GPIO specifier for the active low reset line
+- mute-gpios  : GPIO specifier for the active low mute line
+
+Optional properties:
+- VCCA-supply : regulator phandle for the VCCA supply
+- VCCP1-supply: regulator phandle for the VCCP1 supply
+- VCCP2-supply: regulator phandle for the VCCP2 supply
+
+Example:
+
+   codec {
+   compatible = "rohm,bd28623";
+   #sound-dai-cells = <0>;
+
+   reset-gpios = <&gpio 0 GPIO_ACTIVE_LOW>;
+   mute-gpios = <&gpio 1 GPIO_ACTIVE_LOW>;
+   };
-- 
2.16.1



[PATCH RFC] kbuild: drop superfluous GCC_PLUGINS_CFLAGS assignment

2018-02-20 Thread Cao jin
GCC_PLUGINS_CFLAGS is already in the environment, so it is superfluous
to add it to the command line of the final build of init/.

Signed-off-by: Cao jin 
---
This was only tested with the Randomizing Structure Layout plugin. The test
method is not very elegant, but I think it demonstrates the correctness of
this patch. On the other hand, if we are concerned that some flags might not
be passed during the final build, we should also consider all the other flags.

Currently, with the Randomizing plugin enabled, the crash utility can't work
with the resulting kernel; the symptom is a segmentation fault caused by
infinite function calls.

With the patch, the symptom is exactly the same, so I am sure the plugin
still works.

 scripts/link-vmlinux.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index e6818b8e7141..e07b2d251ad6 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -248,7 +248,7 @@ else
 fi;
 
 # final build of init/
-${MAKE} -f "${srctree}/scripts/Makefile.build" obj=init GCC_PLUGINS_CFLAGS="${GCC_PLUGINS_CFLAGS}"
+${MAKE} -f "${srctree}/scripts/Makefile.build" obj=init
 
 archive_builtin
 
-- 
2.14.3





Re: [PATCH 5/6] mm, hugetlb: further simplify hugetlb allocation API

2018-02-20 Thread Dan Rue
On Wed, Jan 03, 2018 at 10:32:12AM +0100, Michal Hocko wrote:
> From: Michal Hocko 
> 
> Hugetlb allocator has several layers of allocation functions depending
> on the purpose of the allocation. There are two allocators depending
> on whether the page can be allocated from the page allocator or we need
> a contiguous allocator. This is currently opencoded in alloc_fresh_huge_page
> which is the only path that might allocate giga pages which require the
> latter allocator. Create alloc_fresh_huge_page which hides this
> implementation detail and use it in all callers which hardcoded the
> buddy allocator path (__hugetlb_alloc_buddy_huge_page). This shouldn't
> introduce any functional change because both migration and surplus
> allocators exclude giga pages explicitly.
> 
> While we are at it let's do some renaming. The current scheme is not
> consistent and overly painful to read and understand. Get rid of prefix
> underscores from most functions. There is no real reason to make names
> longer.
> * alloc_fresh_huge_page is the new layer to abstract underlying
>   allocator
> * __hugetlb_alloc_buddy_huge_page becomes shorter and neater
>   alloc_buddy_huge_page.
> * Former alloc_fresh_huge_page becomes alloc_pool_huge_page because we put
>   the new page directly to the pool
> * alloc_surplus_huge_page can drop the opencoded prep_new_huge_page code
>   as it uses alloc_fresh_huge_page now
> * others lose their excessive prefix underscores to make names shorter

Hi Michal -

We (Linaro) run the libhugetlbfs test suite continuously against
mainline and recently (Feb 1), the 'counters' test started failing
with the following error:

root@localhost:~# mount_point="/mnt/hugetlb/"
root@localhost:~# echo 200 > /proc/sys/vm/nr_hugepages
root@localhost:~# mkdir -p "${mount_point}"
root@localhost:~# mount -t hugetlbfs hugetlbfs "${mount_point}"
root@localhost:~# export LD_LIBRARY_PATH=/root/libhugetlbfs/libhugetlbfs-2.20/obj64
root@localhost:~# /root/libhugetlbfs/libhugetlbfs-2.20/tests/obj64/counters
Starting testcase "/root/libhugetlbfs/libhugetlbfs-2.20/tests/obj64/counters", pid 3319
Base pool size: 0
Clean...
FAIL    Line 326: Bad HugePages_Total: expected 0, actual 1

Line 326 refers to the test source @
https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/counters.c#L326

I bisected the failure to this commit. The problem is seen on multiple
architectures (tested x86-64 and arm64).
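
For anyone trying to reproduce this outside the test suite, the counter in question can be watched directly. The following is only a minimal sketch, assuming the value of interest is the HugePages_Total line in /proc/meminfo; the helper names are hypothetical and this is not how the libhugetlbfs test itself is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the HugePages_Total value from a buffer
 * holding /proc/meminfo-style text. Returns -1 if the field is absent
 * or cannot be parsed.
 */
long parse_hugepages_total(const char *meminfo)
{
	const char *key = "HugePages_Total:";
	const char *p;
	long value;

	assert(meminfo != NULL);

	p = strstr(meminfo, key);
	if (!p)
		return -1;

	/* %ld skips the padding spaces that follow the colon. */
	if (sscanf(p + strlen(key), "%ld", &value) != 1)
		return -1;

	return value;
}

/* Read the live value from /proc/meminfo; returns -1 on any error. */
long read_hugepages_total(void)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;

	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	return parse_hugepages_total(buf);
}
```

Calling read_hugepages_total() before and after shrinking the pool to 0 should show whether a page is left behind, which is what the "expected 0, actual 1" message above suggests.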

Thanks,
Dan

> 
> Reviewed-by: Mike Kravetz 
> Reviewed-by: Naoya Horiguchi 
> Signed-off-by: Michal Hocko 
> ---
>  mm/hugetlb.c | 78 
> 
>  1 file changed, 42 insertions(+), 36 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 7dc80cbe8e89..60acd3e93a95 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1378,7 +1378,7 @@ pgoff_t __basepage_index(struct page *page)
>   return (index << compound_order(page_head)) + compound_idx;
>  }
>  
> -static struct page *__hugetlb_alloc_buddy_huge_page(struct hstate *h,
> +static struct page *alloc_buddy_huge_page(struct hstate *h,
>   gfp_t gfp_mask, int nid, nodemask_t *nmask)
>  {
>   int order = huge_page_order(h);
> @@ -1396,34 +1396,49 @@ static struct page *__hugetlb_alloc_buddy_huge_page(struct hstate *h,
>   return page;
>  }
>  
> +/*
> + * Common helper to allocate a fresh hugetlb page. All specific allocators
> + * should use this function to get new hugetlb pages
> + */
> +static struct page *alloc_fresh_huge_page(struct hstate *h,
> + gfp_t gfp_mask, int nid, nodemask_t *nmask)
> +{
> + struct page *page;
> +
> + if (hstate_is_gigantic(h))
> + page = alloc_gigantic_page(h, gfp_mask, nid, nmask);
> + else
> + page = alloc_buddy_huge_page(h, gfp_mask,
> + nid, nmask);
> + if (!page)
> + return NULL;
> +
> + if (hstate_is_gigantic(h))
> + prep_compound_gigantic_page(page, huge_page_order(h));
> + prep_new_huge_page(h, page, page_to_nid(page));
> +
> + return page;
> +}
> +
>  /*
>   * Allocates a fresh page to the hugetlb allocator pool in the node interleaved
>   * manner.
>   */
> -static int alloc_fresh_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
> +static int alloc_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
>  {
>   struct page *page;
>   int nr_nodes, node;
>   gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
>  
>   for_each_node_mask_to_alloc(h, nr_nodes, node, nodes_allowed) {
> - if (hstate_is_gigantic(h))
> - page = alloc_gigantic_page(h, gfp_mask,
> - node, nodes_allowed);
> - else
> - page = __hugetlb_alloc_buddy_huge_page(h, gfp_mask,
> - node, nodes_allowed);
> + page = alloc_fresh_huge_page(h, gfp_

[PATCH 5/7] sched/isolation: Offload residual 1Hz scheduler tick

2018-02-20 Thread Frederic Weisbecker
When a CPU runs in full dynticks mode, a 1Hz tick remains in order to
keep the scheduler stats alive. However this residual tick is a burden
for bare metal tasks that can't stand any interruption at all, or want
to minimize them.

The usual boot parameters "nohz_full=" or "isolcpus=nohz" will now
outsource these scheduler ticks to the global workqueue so that a
housekeeping CPU handles those remotely. The sched_class::task_tick()
implementations have been audited and look safe to be called remotely
as the target runqueue and its current task are passed as parameters
and don't seem to be accessed locally.

Note that in the case of using isolcpus, it's still up to the user to
affine the global workqueues to the housekeeping CPUs through
/sys/devices/virtual/workqueue/cpumask or domains isolation
"isolcpus=nohz,domain".
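
As an illustration of the manual affinity step, here is a minimal userspace sketch. It assumes the sysfs file accepts a hexadecimal CPU mask and handles only CPUs 0..63; both helper names are hypothetical and not part of this series.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a hexadecimal CPU mask string for the
 * housekeeping CPUs listed in cpus[]. Only CPUs 0..63 are handled in
 * this sketch. Returns 0 on success, -1 on bad input or a buffer that
 * is too small.
 */
int format_cpumask(const int *cpus, size_t ncpus, char *buf, size_t len)
{
	unsigned long long mask = 0;
	size_t i;
	int n;

	assert(cpus != NULL && buf != NULL);

	for (i = 0; i < ncpus; i++) {
		if (cpus[i] < 0 || cpus[i] > 63)
			return -1;
		mask |= 1ULL << cpus[i];
	}

	n = snprintf(buf, len, "%llx", mask);
	if (n < 0 || (size_t)n >= len)
		return -1;

	return 0;
}

/* Write the mask to the sysfs file named in the changelog above. */
int set_unbound_wq_cpumask(const char *mask)
{
	FILE *f = fopen("/sys/devices/virtual/workqueue/cpumask", "w");
	int ret;

	if (!f)
		return -1;

	ret = (fputs(mask, f) < 0) ? -1 : 0;
	if (fclose(f) != 0)
		ret = -1;

	return ret;
}
```

With housekeeping CPUs 0 and 1, format_cpumask() produces the string "3", which set_unbound_wq_cpumask() would then write (root is required for the write).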

Reviewed-by: Thomas Gleixner 
Signed-off-by: Frederic Weisbecker 
Cc: Chris Metcalf 
Cc: Christoph Lameter 
Cc: Luiz Capitulino 
Cc: Mike Galbraith 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Thomas Gleixner 
Cc: Wanpeng Li 
Cc: Ingo Molnar 
---
 kernel/sched/core.c  | 92 
 kernel/sched/deadline.c  |  8 +
 kernel/sched/fair.c  |  7 +++-
 kernel/sched/idle_task.c |  8 +
 kernel/sched/isolation.c |  4 +++
 kernel/sched/rt.c|  8 +
 kernel/sched/sched.h |  2 ++
 kernel/sched/stop_task.c |  8 +
 8 files changed, 136 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e72ca3c..5dfef45 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3125,6 +3125,96 @@ u64 scheduler_tick_max_deferment(void)
 
return jiffies_to_nsecs(next - now);
 }
+
+struct tick_work {
+   int cpu;
+   struct delayed_work work;
+};
+
+static struct tick_work __percpu *tick_work_cpu;
+
+static void sched_tick_remote(struct work_struct *work)
+{
+   struct delayed_work *dwork = to_delayed_work(work);
+   struct tick_work *twork = container_of(dwork, struct tick_work, work);
+   int cpu = twork->cpu;
+   struct rq *rq = cpu_rq(cpu);
+   struct rq_flags rf;
+
+   /*
+* Handle the tick only if it appears the remote CPU is running in full
+* dynticks mode. The check is racy by nature, but missing a tick or
+* having one too much is no big deal because the scheduler tick updates
+* statistics and checks timeslices in a time-independent way, regardless
+* of when exactly it is running.
+*/
+   if (!idle_cpu(cpu) && tick_nohz_tick_stopped_cpu(cpu)) {
+   struct task_struct *curr;
+   u64 delta;
+
+   rq_lock_irq(rq, &rf);
+   update_rq_clock(rq);
+   curr = rq->curr;
+   delta = rq_clock_task(rq) - curr->se.exec_start;
+
+   /*
+* Make sure the next tick runs within a reasonable
+* amount of time.
+*/
+   WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
+   curr->sched_class->task_tick(rq, curr, 0);
+   rq_unlock_irq(rq, &rf);
+   }
+
+   /*
+* Run the remote tick once per second (1Hz). This arbitrary
+* frequency is large enough to avoid overload but short enough
+* to keep scheduler internal stats reasonably up to date.
+*/
+   queue_delayed_work(system_unbound_wq, dwork, HZ);
+}
+
+static void sched_tick_start(int cpu)
+{
+   struct tick_work *twork;
+
+   if (housekeeping_cpu(cpu, HK_FLAG_TICK))
+   return;
+
+   WARN_ON_ONCE(!tick_work_cpu);
+
+   twork = per_cpu_ptr(tick_work_cpu, cpu);
+   twork->cpu = cpu;
+   INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
+   queue_delayed_work(system_unbound_wq, &twork->work, HZ);
+}
+
+#ifdef CONFIG_HOTPLUG_CPU
+static void sched_tick_stop(int cpu)
+{
+   struct tick_work *twork;
+
+   if (housekeeping_cpu(cpu, HK_FLAG_TICK))
+   return;
+
+   WARN_ON_ONCE(!tick_work_cpu);
+
+   twork = per_cpu_ptr(tick_work_cpu, cpu);
+   cancel_delayed_work_sync(&twork->work);
+}
+#endif /* CONFIG_HOTPLUG_CPU */
+
+int __init sched_tick_offload_init(void)
+{
+   tick_work_cpu = alloc_percpu(struct tick_work);
+   BUG_ON(!tick_work_cpu);
+
+   return 0;
+}
+
+#else /* !CONFIG_NO_HZ_FULL */
+static inline void sched_tick_start(int cpu) { }
+static inline void sched_tick_stop(int cpu) { }
 #endif
 
 #if defined(CONFIG_PREEMPT) && (defined(CONFIG_DEBUG_PREEMPT) || \
@@ -5786,6 +5876,7 @@ int sched_cpu_starting(unsigned int cpu)
 {
set_cpu_rq_start_time(cpu);
sched_rq_cpu_starting(cpu);
+   sched_tick_start(cpu);
return 0;
 }
 
@@ -5797,6 +5888,7 @@ int sched_cpu_dying(unsigned int cpu)
 
/* Handle pending wakeups and then migrate everything off */
sched_ttwu_pending();
+   sched_tick_stop(cpu);
 
rq_

[PATCH 6/7] sched/nohz: Remove the 1 Hz tick code

2018-02-20 Thread Frederic Weisbecker
Now that the 1Hz tick is offloaded to workqueues, we can safely remove
the residual code that used to handle it locally.

Reviewed-by: Thomas Gleixner 
Signed-off-by: Frederic Weisbecker 
Cc: Chris Metcalf 
Cc: Christoph Lameter 
Cc: Luiz Capitulino 
Cc: Mike Galbraith 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Thomas Gleixner 
Cc: Wanpeng Li 
Cc: Ingo Molnar 
---
 include/linux/sched/nohz.h |  4 
 kernel/sched/core.c| 29 -
 kernel/sched/idle_task.c   |  1 -
 kernel/sched/sched.h   | 11 +--
 kernel/time/tick-sched.c   |  6 --
 5 files changed, 1 insertion(+), 50 deletions(-)

diff --git a/include/linux/sched/nohz.h b/include/linux/sched/nohz.h
index 3d3a97d..0942172 100644
--- a/include/linux/sched/nohz.h
+++ b/include/linux/sched/nohz.h
@@ -37,8 +37,4 @@ extern void wake_up_nohz_cpu(int cpu);
 static inline void wake_up_nohz_cpu(int cpu) { }
 #endif
 
-#ifdef CONFIG_NO_HZ_FULL
-extern u64 scheduler_tick_max_deferment(void);
-#endif
-
 #endif /* _LINUX_SCHED_NOHZ_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5dfef45..8fff4f1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3096,35 +3096,9 @@ void scheduler_tick(void)
rq->idle_balance = idle_cpu(cpu);
trigger_load_balance(rq);
 #endif
-   rq_last_tick_reset(rq);
 }
 
 #ifdef CONFIG_NO_HZ_FULL
-/**
- * scheduler_tick_max_deferment
- *
- * Keep at least one tick per second when a single
- * active task is running because the scheduler doesn't
- * yet completely support full dynticks environment.
- *
- * This makes sure that uptime, CFS vruntime, load
- * balancing, etc... continue to move forward, even
- * with a very low granularity.
- *
- * Return: Maximum deferment in nanoseconds.
- */
-u64 scheduler_tick_max_deferment(void)
-{
-   struct rq *rq = this_rq();
-   unsigned long next, now = READ_ONCE(jiffies);
-
-   next = rq->last_sched_tick + HZ;
-
-   if (time_before_eq(next, now))
-   return 0;
-
-   return jiffies_to_nsecs(next - now);
-}
 
 struct tick_work {
int cpu;
@@ -6116,9 +6090,6 @@ void __init sched_init(void)
rq->last_load_update_tick = jiffies;
rq->nohz_flags = 0;
 #endif
-#ifdef CONFIG_NO_HZ_FULL
-   rq->last_sched_tick = 0;
-#endif
 #endif /* CONFIG_SMP */
hrtick_rq_init(rq);
atomic_set(&rq->nr_iowait, 0);
diff --git a/kernel/sched/idle_task.c b/kernel/sched/idle_task.c
index e1b46e0..48b8a83 100644
--- a/kernel/sched/idle_task.c
+++ b/kernel/sched/idle_task.c
@@ -48,7 +48,6 @@ dequeue_task_idle(struct rq *rq, struct task_struct *p, int flags)
 
 static void put_prev_task_idle(struct rq *rq, struct task_struct *prev)
 {
-   rq_last_tick_reset(rq);
 }
 
 /*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c1c7c78..dc6c8b5 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -727,9 +727,7 @@ struct rq {
 #endif /* CONFIG_SMP */
unsigned long nohz_flags;
 #endif /* CONFIG_NO_HZ_COMMON */
-#ifdef CONFIG_NO_HZ_FULL
-   unsigned long last_sched_tick;
-#endif
+
/* capture load from *all* tasks on this cpu: */
struct load_weight load;
unsigned long nr_load_updates;
@@ -1626,13 +1624,6 @@ static inline void sub_nr_running(struct rq *rq, unsigned count)
sched_update_tick_dependency(rq);
 }
 
-static inline void rq_last_tick_reset(struct rq *rq)
-{
-#ifdef CONFIG_NO_HZ_FULL
-   rq->last_sched_tick = jiffies;
-#endif
-}
-
 extern void update_rq_clock(struct rq *rq);
 
 extern void activate_task(struct rq *rq, struct task_struct *p, int flags);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index d479b21..f2fa2e9 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -748,12 +748,6 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
delta = KTIME_MAX;
}
 
-#ifdef CONFIG_NO_HZ_FULL
-   /* Limit the tick delta to the maximum scheduler deferment */
-   if (!ts->inidle)
-   delta = min(delta, scheduler_tick_max_deferment());
-#endif
-
/* Calculate the next expiry time */
if (delta < (KTIME_MAX - basemono))
expires = basemono + delta;
-- 
2.7.4



[PATCH 4/7] sched/isolation: Isolate workqueues when "nohz_full=" is set

2018-02-20 Thread Frederic Weisbecker
As we prepare for offloading the residual 1Hz scheduler ticks to
workqueues, let's affine those to housekeepers so that they don't
interrupt the CPUs that don't want to be disturbed.
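
The flag composition can be sketched in isolation as follows. This is a standalone illustration, not kernel code: the HK_FLAG_SCHED through HK_FLAG_WQ values are taken from the hunk below, the three lower values are assumptions, and wants_wq_isolation() is a hypothetical helper.

```c
#include <assert.h>

/*
 * Values for HK_FLAG_SCHED..HK_FLAG_WQ mirror the hunk in this patch.
 * The first three values are assumptions, since the hunk does not
 * show them.
 */
enum hk_flags {
	HK_FLAG_TIMER  = 1,        /* assumed */
	HK_FLAG_RCU    = (1 << 1), /* assumed */
	HK_FLAG_MISC   = (1 << 2), /* assumed */
	HK_FLAG_SCHED  = (1 << 3),
	HK_FLAG_TICK   = (1 << 4),
	HK_FLAG_DOMAIN = (1 << 5),
	HK_FLAG_WQ     = (1 << 6),
};

/* Flags that "nohz_full=" requests once this patch is applied. */
unsigned int nohz_full_flags(void)
{
	return HK_FLAG_TICK | HK_FLAG_WQ | HK_FLAG_TIMER |
	       HK_FLAG_RCU | HK_FLAG_MISC;
}

/* Hypothetical check: does a flag set ask for workqueue isolation? */
int wants_wq_isolation(unsigned int flags)
{
	assert((flags & ~0x7fu) == 0); /* only the seven known bits */
	return (flags & HK_FLAG_WQ) != 0;
}
```

The point of the sketch is that "nohz_full=" now carries HK_FLAG_WQ, while a set containing only HK_FLAG_DOMAIN does not; workqueue_init_early() asks for both flags when building the unbound cpumask.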

Reviewed-by: Thomas Gleixner 
Signed-off-by: Frederic Weisbecker 
Cc: Chris Metcalf 
Cc: Christoph Lameter 
Cc: Luiz Capitulino 
Cc: Mike Galbraith 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Thomas Gleixner 
Cc: Wanpeng Li 
Cc: Ingo Molnar 
---
 include/linux/sched/isolation.h | 1 +
 kernel/sched/isolation.c| 3 ++-
 kernel/workqueue.c  | 3 ++-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index d849431..4a6582c 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -12,6 +12,7 @@ enum hk_flags {
HK_FLAG_SCHED   = (1 << 3),
HK_FLAG_TICK= (1 << 4),
HK_FLAG_DOMAIN  = (1 << 5),
+   HK_FLAG_WQ  = (1 << 6),
 };
 
 #ifdef CONFIG_CPU_ISOLATION
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index b71b436..a2500c4 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -3,6 +3,7 @@
  *  any CPU: unbound workqueues, timers, kthreads and any offloadable work.
  *
  * Copyright (C) 2017 Red Hat, Inc., Frederic Weisbecker
+ * Copyright (C) 2017-2018 SUSE, Frederic Weisbecker
  *
  */
 
@@ -119,7 +120,7 @@ static int __init housekeeping_nohz_full_setup(char *str)
 {
unsigned int flags;
 
-   flags = HK_FLAG_TICK | HK_FLAG_TIMER | HK_FLAG_RCU | HK_FLAG_MISC;
+   flags = HK_FLAG_TICK | HK_FLAG_WQ | HK_FLAG_TIMER | HK_FLAG_RCU | HK_FLAG_MISC;
 
return housekeeping_setup(str, flags);
 }
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 017044c..593dbe7 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5565,12 +5565,13 @@ static void __init wq_numa_init(void)
 int __init workqueue_init_early(void)
 {
int std_nice[NR_STD_WORKER_POOLS] = { 0, HIGHPRI_NICE_LEVEL };
+   int hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
int i, cpu;
 
WARN_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));
 
BUG_ON(!alloc_cpumask_var(&wq_unbound_cpumask, GFP_KERNEL));
-   cpumask_copy(wq_unbound_cpumask, housekeeping_cpumask(HK_FLAG_DOMAIN));
+   cpumask_copy(wq_unbound_cpumask, housekeeping_cpumask(hk_flags));
 
pwq_cache = KMEM_CACHE(pool_workqueue, SLAB_PANIC);
 
-- 
2.7.4



[PATCH 7/7] sched/isolation: Update nohz documentation to explain tick offload

2018-02-20 Thread Frederic Weisbecker
Update the documentation to reflect the 1Hz tick offload changes.

Signed-off-by: Frederic Weisbecker 
Cc: Chris Metcalf 
Cc: Christoph Lameter 
Cc: Luiz Capitulino 
Cc: Mike Galbraith 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Thomas Gleixner 
Cc: Wanpeng Li 
Cc: Ingo Molnar 
---
 Documentation/admin-guide/kernel-parameters.txt | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1d1d53f..50b9837 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1766,6 +1766,17 @@
 
nohz
  Disable the tick when a single task runs.
+
+ A residual 1Hz tick is offloaded to workqueues, which you
+ need to affine to housekeeping through the global
+ workqueue's affinity configured via the
+ /sys/devices/virtual/workqueue/cpumask sysfs file, or
+ by using the 'domain' flag described below.
+
+ NOTE: by default the global workqueue runs on all CPUs,
+ so to protect individual CPUs the 'cpumask' file has to
+ be configured manually after bootup.
+
domain
  Isolate from the general SMP balancing and scheduling
  algorithms. Note that performing domain isolation this way
-- 
2.7.4



[PATCH 2/7] nohz: Convert tick_nohz_tick_stopped() to bool

2018-02-20 Thread Frederic Weisbecker
It makes this function more self-explanatory about what it does and how
to use it.

Reported-by: Thomas Gleixner 
Signed-off-by: Frederic Weisbecker 
Cc: Chris Metcalf 
Cc: Christoph Lameter 
Cc: Luiz Capitulino 
Cc: Mike Galbraith 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Thomas Gleixner 
Cc: Wanpeng Li 
Cc: Ingo Molnar 
---
 include/linux/tick.h | 2 +-
 kernel/time/tick-sched.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 7cc3592..86576d9 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -113,7 +113,7 @@ enum tick_dep_bits {
 
 #ifdef CONFIG_NO_HZ_COMMON
 extern bool tick_nohz_enabled;
-extern int tick_nohz_tick_stopped(void);
+extern bool tick_nohz_tick_stopped(void);
 extern void tick_nohz_idle_enter(void);
 extern void tick_nohz_idle_exit(void);
 extern void tick_nohz_irq_exit(void);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 29a5733..0aba041 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -481,7 +481,7 @@ static int __init setup_tick_nohz(char *str)
 
 __setup("nohz=", setup_tick_nohz);
 
-int tick_nohz_tick_stopped(void)
+bool tick_nohz_tick_stopped(void)
 {
return __this_cpu_read(tick_cpu_sched.tick_stopped);
 }
-- 
2.7.4



[PATCH 0/7] isolation: 1Hz residual tick offloading v7

2018-02-20 Thread Frederic Weisbecker
This version addresses comments from Thomas:

* Convert tick_nohz_tick_stopped[_cpu]() to bool
* Add comments to each sched_class::task_tick() to make sure that data
  is always fetched from the rq and task passed as parameters, to allow
  for remote ticks.
* Add reviewed-by tags

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
sched/0hz-v7

HEAD: b0d4913a0a39c717f354e40c1642264632e76960

Thanks,
Frederic
---

Frederic Weisbecker (7):
  sched: Rename init_rq_hrtick to hrtick_rq_init
  nohz: Convert tick_nohz_tick_stopped() to bool
  nohz: Allow to check if remote CPU tick is stopped
  sched/isolation: Isolate workqueues when "nohz_full=" is set
  sched/isolation: Offload residual 1Hz scheduler tick
  sched/nohz: Remove the 1 Hz tick code
  sched/isolation: Update nohz documentation to explain tick offload


 Documentation/admin-guide/kernel-parameters.txt |  11 +++
 include/linux/sched/isolation.h |   1 +
 include/linux/sched/nohz.h  |   4 -
 include/linux/tick.h|   4 +-
 kernel/sched/core.c | 117 ++--
 kernel/sched/deadline.c |   8 ++
 kernel/sched/fair.c |   7 +-
 kernel/sched/idle_task.c|   9 +-
 kernel/sched/isolation.c|   7 +-
 kernel/sched/rt.c   |   8 ++
 kernel/sched/sched.h|  13 +--
 kernel/sched/stop_task.c|   8 ++
 kernel/time/tick-sched.c|  15 +--
 kernel/workqueue.c  |   3 +-
 14 files changed, 162 insertions(+), 53 deletions(-)


[PATCH 1/7] sched: Rename init_rq_hrtick to hrtick_rq_init

2018-02-20 Thread Frederic Weisbecker
Do that rename in order to normalize the hrtick namespace.

Reviewed-by: Thomas Gleixner 
Signed-off-by: Frederic Weisbecker 
Cc: Chris Metcalf 
Cc: Christoph Lameter 
Cc: Luiz Capitulino 
Cc: Mike Galbraith 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Thomas Gleixner 
Cc: Wanpeng Li 
Cc: Ingo Molnar 
---
 kernel/sched/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e7c535e..e72ca3c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -333,7 +333,7 @@ void hrtick_start(struct rq *rq, u64 delay)
 }
 #endif /* CONFIG_SMP */
 
-static void init_rq_hrtick(struct rq *rq)
+static void hrtick_rq_init(struct rq *rq)
 {
 #ifdef CONFIG_SMP
rq->hrtick_csd_pending = 0;
@@ -351,7 +351,7 @@ static inline void hrtick_clear(struct rq *rq)
 {
 }
 
-static inline void init_rq_hrtick(struct rq *rq)
+static inline void hrtick_rq_init(struct rq *rq)
 {
 }
 #endif /* CONFIG_SCHED_HRTICK */
@@ -6028,7 +6028,7 @@ void __init sched_init(void)
rq->last_sched_tick = 0;
 #endif
 #endif /* CONFIG_SMP */
-   init_rq_hrtick(rq);
+   hrtick_rq_init(rq);
atomic_set(&rq->nr_iowait, 0);
}
 
-- 
2.7.4



[PATCH 3/7] nohz: Allow to check if remote CPU tick is stopped

2018-02-20 Thread Frederic Weisbecker
This check is racy but provides a good heuristic to determine whether
a CPU may need a remote tick or not.

Reviewed-by: Thomas Gleixner 
Signed-off-by: Frederic Weisbecker 
Cc: Chris Metcalf 
Cc: Christoph Lameter 
Cc: Luiz Capitulino 
Cc: Mike Galbraith 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Thomas Gleixner 
Cc: Wanpeng Li 
Cc: Ingo Molnar 
---
 include/linux/tick.h | 2 ++
 kernel/time/tick-sched.c | 7 +++
 2 files changed, 9 insertions(+)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 86576d9..7f8c9a12 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -114,6 +114,7 @@ enum tick_dep_bits {
 #ifdef CONFIG_NO_HZ_COMMON
 extern bool tick_nohz_enabled;
 extern bool tick_nohz_tick_stopped(void);
+extern bool tick_nohz_tick_stopped_cpu(int cpu);
 extern void tick_nohz_idle_enter(void);
 extern void tick_nohz_idle_exit(void);
 extern void tick_nohz_irq_exit(void);
@@ -125,6 +126,7 @@ extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
 #else /* !CONFIG_NO_HZ_COMMON */
 #define tick_nohz_enabled (0)
 static inline int tick_nohz_tick_stopped(void) { return 0; }
+static inline int tick_nohz_tick_stopped_cpu(int cpu) { return 0; }
 static inline void tick_nohz_idle_enter(void) { }
 static inline void tick_nohz_idle_exit(void) { }
 
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 0aba041..d479b21 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -486,6 +486,13 @@ bool tick_nohz_tick_stopped(void)
return __this_cpu_read(tick_cpu_sched.tick_stopped);
 }
 
+bool tick_nohz_tick_stopped_cpu(int cpu)
+{
+   struct tick_sched *ts = per_cpu_ptr(&tick_cpu_sched, cpu);
+
+   return ts->tick_stopped;
+}
+
 /**
  * tick_nohz_update_jiffies - update jiffies when idle was interrupted
  *
-- 
2.7.4



Re: [PATCH v21 2/4] mailbox: mediatek: Add Mediatek CMDQ driver

2018-02-20 Thread CK Hu
Hi, Houlong:

I've one inline comment.

On Wed, 2018-01-31 at 15:28 +0800, houlong@mediatek.com wrote:
> From: "hs.l...@mediatek.com" 
> 
> This patch is the first version of the Mediatek Command Queue (CMDQ) driver.
> The CMDQ is used to help write registers with critical time limitation,
> such as updating display configuration during the vblank. It controls
> Global Command Engine (GCE) hardware to achieve this requirement.
> Currently, CMDQ only supports display-related hardware, but we expect
> it can be extended to other hardware for future requirements.
> 
> Signed-off-by: Houlong Wei 
> Signed-off-by: HS Liao 
> Signed-off-by: CK Hu 
> ---
>  drivers/mailbox/Kconfig  |   10 +
>  drivers/mailbox/Makefile |2 +
>  drivers/mailbox/mtk-cmdq-mailbox.c   |  594 
> ++
>  include/linux/mailbox/mtk-cmdq-mailbox.h |   77 
>  4 files changed, 683 insertions(+)
>  create mode 100644 drivers/mailbox/mtk-cmdq-mailbox.c
>  create mode 100644 include/linux/mailbox/mtk-cmdq-mailbox.h
> 
> diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
> index ba2f152..43bb26f 100644
> --- a/drivers/mailbox/Kconfig
> +++ b/drivers/mailbox/Kconfig
> @@ -171,4 +171,14 @@ config BCM_FLEXRM_MBOX
> Mailbox implementation of the Broadcom FlexRM ring manager,
> which provides access to various offload engines on Broadcom
> SoCs. Say Y here if you want to use the Broadcom FlexRM.
> +
> +config MTK_CMDQ_MBOX
> + bool "MediaTek CMDQ Mailbox Support"
> + depends on ARM64 && ( ARCH_MEDIATEK || COMPILE_TEST )
> + select MTK_INFRACFG
> + help
> +   Say yes here to add support for the MediaTek Command Queue (CMDQ)
> +   mailbox driver. The CMDQ is used to help read/write registers with
> +   critical time limitation, such as updating display configuration
> +   during the vblank.
>  endif
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index 4896f8d..484d341 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -36,3 +36,5 @@ obj-$(CONFIG_BCM_FLEXRM_MBOX)   += bcm-flexrm-mailbox.o
>  obj-$(CONFIG_QCOM_APCS_IPC)  += qcom-apcs-ipc-mailbox.o
>  
>  obj-$(CONFIG_TEGRA_HSP_MBOX) += tegra-hsp.o
> +
> +obj-$(CONFIG_MTK_CMDQ_MBOX)  += mtk-cmdq-mailbox.o
> diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c 
> b/drivers/mailbox/mtk-cmdq-mailbox.c
> new file mode 100644
> index 000..394a335
> --- /dev/null
> +++ b/drivers/mailbox/mtk-cmdq-mailbox.c
> @@ -0,0 +1,594 @@
> +/*
> + * Copyright (c) 2015 MediaTek Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define CMDQ_THR_MAX_COUNT           16
> +#define CMDQ_OP_CODE_MASK            (0xff << CMDQ_OP_CODE_SHIFT)
> +#define CMDQ_TIMEOUT_MS              1000
> +#define CMDQ_IRQ_MASK                0x
> +#define CMDQ_NUM_CMD(t)              (t->cmd_buf_size / CMDQ_INST_SIZE)
> +
> +#define CMDQ_CURR_IRQ_STATUS         0x10
> +#define CMDQ_THR_SLOT_CYCLES         0x30
> +
> +#define CMDQ_THR_BASE                0x100
> +#define CMDQ_THR_SIZE                0x80
> +#define CMDQ_THR_WARM_RESET          0x00
> +#define CMDQ_THR_ENABLE_TASK         0x04
> +#define CMDQ_THR_SUSPEND_TASK        0x08
> +#define CMDQ_THR_CURR_STATUS         0x0c
> +#define CMDQ_THR_IRQ_STATUS          0x10
> +#define CMDQ_THR_IRQ_ENABLE          0x14
> +#define CMDQ_THR_CURR_ADDR           0x20
> +#define CMDQ_THR_END_ADDR            0x24
> +#define CMDQ_THR_WAIT_TOKEN          0x30
> +
> +#define CMDQ_THR_ENABLED             0x1
> +#define CMDQ_THR_DISABLED            0x0
> +#define CMDQ_THR_SUSPEND             0x1
> +#define CMDQ_THR_RESUME              0x0
> +#define CMDQ_THR_STATUS_SUSPENDED    BIT(1)
> +#define CMDQ_THR_DO_WARM_RESET       BIT(0)
> +#define CMDQ_THR_ACTIVE_SLOT_CYCLES  0x3200
> +#define CMDQ_THR_IRQ_DONE            0x1
> +#define CMDQ_THR_IRQ_ERROR           0x12
> +#define CMDQ_THR_IRQ_EN              (CMDQ_THR_IRQ_ERROR | CMDQ_THR_IRQ_DONE)
> +#define CMDQ_THR_IS_WAITING          BIT(31)
> +
> +#define CMDQ_JUMP_BY_OFFSET          0x1000
> +#define CMDQ_JUMP_BY_PA              0x1001
> +
> +struct cmdq_thread {
> + struct mbox_chan *chan;
> + void __iomem *base;
> + struct list_head task_busy_list;
> + struct timer_lis

Re: [PATCH 2/3] ARM: orion: mark orion_ge00_mvmdio_bus_name as const

2018-02-20 Thread Andrew Lunn
On Tue, Feb 20, 2018 at 05:24:51PM +0100, Arnd Bergmann wrote:
> A section type mismatch warning shows up when building with LTO,
> since orion_ge00_mvmdio_bus_name was put in __initconst but not marked
> const itself:
> 
> include/linux/of.h: In function 'spear_setup_of_timer':
> arch/arm/mach-spear/time.c:207:34: error: 'timer_of_match' causes a section 
> type conflict with 'orion_ge00_mvmdio_bus_name'
>  static const struct of_device_id timer_of_match[] __initconst = {
>   ^
> arch/arm/plat-orion/common.c:475:32: note: 'orion_ge00_mvmdio_bus_name' was 
> declared here
>  static __initconst const char *orion_ge00_mvmdio_bus_name = "orion-mii";
>
> This marks it const as well.

Hi Arnd

I'm not sure this is the correct fix. orion_ge00_mvmdio_bus_name is
assigned to orion_ge00_switch_board_info->bus_id. This is then passed
to mdiobus_register_board_info() which makes a copy of
orion_ge00_switch_board_info and adds it to its linked list. The
original orion_ge00_switch_board_info will get freed since it is
__initdata, but the copy still has a pointer to
orion_ge00_mvmdio_bus_name, which also gets freed.

I think the correct fix is to remove the __initconst from
orion_ge00_mvmdio_bus_name. Marking it const is however valid.

Andrew
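
The lifetime problem described above can be illustrated in userspace. This is a minimal sketch, not the kernel code: struct board_info, copy_board_info() and shares_bus_id_storage() are hypothetical names, and the copy is assumed to be shallow (a plain memcpy of the record), which is what the description of mdiobus_register_board_info() above implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a board-info record; not the kernel's struct. */
struct board_info {
	const char *bus_id;	/* pointer only: the string itself is not copied */
	int addr;
};

/* Shallow copy, as a plain struct assignment or memcpy() would do. */
static struct board_info copy_board_info(const struct board_info *src)
{
	struct board_info dst;

	assert(src != NULL);
	memcpy(&dst, src, sizeof(dst));
	return dst;
}

/* True when both records point at the same string storage. */
static bool shares_bus_id_storage(const struct board_info *a,
				  const struct board_info *b)
{
	return a->bus_id == b->bus_id;
}
```

If the string lives in memory that is released after init, the copy is left with a dangling pointer even though the record itself was duplicated, which is why dropping __initconst from the string looks like the safer fix.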


[PATCH] netlink: put module reference if dump start fails

2018-02-20 Thread Jason A. Donenfeld
Before, if cb->start() failed, the module reference would never be put,
because cb->cb_running is intentionally false at this point. Users are
generally annoyed by this because they can no longer unload modules that
leak references. Also, it may be possible to tediously wrap a reference
counter back to zero, especially since module.c still uses atomic_inc
instead of refcount_inc.

This patch expands the error path to simply call module_put if
cb->start() fails.

Signed-off-by: Jason A. Donenfeld 
---
This probably should be queued up for stable.

 net/netlink/af_netlink.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 2ad445c1d27c..07e8478068f0 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2308,7 +2308,7 @@ int __netlink_dump_start(struct sock *ssk, struct sk_buff 
*skb,
if (cb->start) {
ret = cb->start(cb);
if (ret)
-   goto error_unlock;
+   goto error_put;
}
 
nlk->cb_running = true;
@@ -2328,6 +2328,8 @@ int __netlink_dump_start(struct sock *ssk, struct sk_buff 
*skb,
 */
return -EINTR;
 
+error_put:
+   module_put(control->module);
 error_unlock:
sock_put(sk);
mutex_unlock(nlk->cb_mutex);
-- 
2.16.1
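
The shape of the error path in the patch above can be sketched in userspace. This is only an illustration under stated assumptions: struct fake_module, struct fake_cb and dump_start() are hypothetical stand-ins, and the reference is assumed to be taken before the start hook runs, as the commit message implies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are kernel APIs. */
struct fake_module {
	int refcnt;
};

struct fake_cb {
	struct fake_module *module;
	int (*start)(struct fake_cb *cb);
};

static int start_fails(struct fake_cb *cb) { (void)cb; return -EINVAL; }
static int start_ok(struct fake_cb *cb) { (void)cb; return 0; }

static void fake_module_get(struct fake_module *m) { m->refcnt++; }

static void fake_module_put(struct fake_module *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
}

/*
 * Takes a module reference, then runs the optional start hook.
 * On success the reference stays held (a later "done" step would drop it);
 * on failure it must be dropped here, because nothing else will.
 */
static int dump_start(struct fake_cb *cb)
{
	int ret;

	fake_module_get(cb->module);

	if (cb->start) {
		ret = cb->start(cb);
		if (ret)
			goto error_put;
	}
	return 0;

error_put:
	fake_module_put(cb->module);
	return ret;
}
```

Without the error_put label, a failing start hook would leave refcnt at 1 forever, which matches the "cannot unload the module" symptom described in the commit message.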



Re: [PATCH] cpufreq: scpi: invoke frequency-invariance setter function

2018-02-20 Thread Viresh Kumar
On 20-02-18, 11:10, Dietmar Eggemann wrote:
> Commit 343a8d17fa8d ("cpufreq: scpi: remove arm_big_little dependency")
> changed the cpufreq driver on juno from arm_big_little to scpi.
> 
> The scpi set_target function does not call the frequency-invariance
> setter function arch_set_freq_scale() like the arm_big_little set_target
> function does. As a result the task scheduler load and utilization
> signals are not frequency-invariant on this platform anymore.
> 
> Fix this by adding a call to arch_set_freq_scale() into
> scpi_cpufreq_set_target().
> 
> Fixes: 343a8d17fa8d ("cpufreq: scpi: remove arm_big_little dependency")
> Cc: Rafael J. Wysocki 
> Cc: Viresh Kumar 
> Cc: Sudeep Holla 
> Signed-off-by: Dietmar Eggemann 
> Acked-by: Sudeep Holla 
> ---
>  drivers/cpufreq/scpi-cpufreq.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
> index c32a833e1b00..3101d4e9c2de 100644
> --- a/drivers/cpufreq/scpi-cpufreq.c
> +++ b/drivers/cpufreq/scpi-cpufreq.c
> @@ -51,13 +51,19 @@ static unsigned int scpi_cpufreq_get_rate(unsigned int 
> cpu)
>  static int
>  scpi_cpufreq_set_target(struct cpufreq_policy *policy, unsigned int index)
>  {
> + unsigned long freq = policy->freq_table[index].frequency;
>   struct scpi_data *priv = policy->driver_data;
> - u64 rate = policy->freq_table[index].frequency * 1000;
> + u64 rate = freq * 1000;
>   int ret;
>  
>   ret = clk_set_rate(priv->clk, rate);
> - if (!ret && (clk_get_rate(priv->clk) != rate))
> - ret = -EIO;
> + if (!ret) {
> + if (clk_get_rate(priv->clk) != rate)
> + ret = -EIO;
> +
> + arch_set_freq_scale(policy->related_cpus, freq,
> + policy->cpuinfo.max_freq);
> + }
>  
>   return ret;
>  }

Acked-by: Viresh Kumar 

-- 
viresh
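
For readers unfamiliar with frequency invariance, the value the setter records can be sketched as a fixed-point ratio. This is a hedged illustration, not the arch code: freq_scale() is a hypothetical helper, and the 10-bit shift (1024 meaning "running at maximum frequency") is an assumption about how the scale is represented.

```c
#include <assert.h>

#define CAPACITY_SHIFT 10	/* assumed fixed-point shift; 1024 == full speed */

/*
 * Ratio of current to maximum frequency in fixed point.
 * Both frequencies are in the same unit (kHz in the cpufreq tables).
 */
static unsigned long freq_scale(unsigned long cur_freq, unsigned long max_freq)
{
	assert(max_freq != 0);
	return (cur_freq << CAPACITY_SHIFT) / max_freq;
}
```

With a scale like this, a task that keeps a CPU fully busy at half the maximum frequency contributes roughly half the utilization it would at full speed, which is the property lost when the driver stops calling the setter.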


Re: [PATCH] Carrier detect ok, don't turn off negotiation

2018-02-20 Thread Denis Du
Hi, David:

What do you think about this patch?



From b5902a4dfc709b62b704997ab64f31c9ef69a6db Mon Sep 17 00:00:00 2001
From: Denis Du  
Date: Mon, 15 Jan 2018 17:26:06 -0500 
Subject: [PATCH] netdev: carrier detect ok, don't turn off negotiation 

Sometimes the physical line carries just enough noise to make the protocol
handshake fail while carrier detect stays good. Once the noise is gone,
nothing triggers the protocol to start again, so the link never comes back.
The fix is to not terminate the protocol handshake while the carrier is
still on.

Signed-off-by: Denis Du  
--- 
drivers/net/wan/hdlc_ppp.c | 5 - 
1 file changed, 4 insertions(+), 1 deletion(-) 

diff --git a/drivers/net/wan/hdlc_ppp.c b/drivers/net/wan/hdlc_ppp.c 
index afeca6b..ab8b3cb 100644 
--- a/drivers/net/wan/hdlc_ppp.c 
+++ b/drivers/net/wan/hdlc_ppp.c 
@@ -574,7 +574,10 @@ static void ppp_timer(struct timer_list *t) 
ppp_cp_event(proto->dev, proto->pid, TO_GOOD, 0, 0, 
 0, NULL); 
proto->restart_counter--; 
-} else 
+} else if (netif_carrier_ok(proto->dev)) 
+ppp_cp_event(proto->dev, proto->pid, TO_GOOD, 0, 0, 
+ 0, NULL); 
+else 
ppp_cp_event(proto->dev, proto->pid, TO_BAD, 0, 0, 
 0, NULL); 
break; 
-- 
2.1.4 
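
The decision the patched timer branch makes can be written out as a small pure function. This is a userspace sketch under assumptions: pick_timeout_event() and its enum are hypothetical, and only the three-way choice visible in the hunk above is modeled, not the surrounding state machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the two timeout events used in the hunk above. */
enum timeout_event { EV_TO_GOOD, EV_TO_BAD };

/*
 * While restarts remain, retry and consume one restart.
 * Once they are exhausted, keep retrying only if the carrier is still up;
 * otherwise give up on the negotiation.
 */
static enum timeout_event pick_timeout_event(int *restart_counter,
					     bool carrier_ok)
{
	assert(restart_counter != NULL);

	if (*restart_counter) {
		(*restart_counter)--;
		return EV_TO_GOOD;
	}
	if (carrier_ok)
		return EV_TO_GOOD;
	return EV_TO_BAD;
}
```

The middle branch is the behavioural change: with the carrier up, the handshake keeps being retried instead of being torn down after the last restart.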





On Tuesday, February 06, 2018 12:06:43 PM EST, Denis Du wrote:







OK, I am submitting it again.


In drivers/net/wan/hdlc_ppp.c, noise on the physical line can make the
protocol fail while carrier detect is still OK. So if carrier detect is OK,
don't turn off protocol negotiation.

This patch is against kernel version Linux 4.15-rc8.





On Tuesday, February 6, 2018, 10:29:53 AM EST, David Miller 
 wrote: 





From: Denis Du 

Date: Tue, 6 Feb 2018 15:15:28 + (UTC)

> What do you think of my patch?
> 
> As you can see, Krzysztof thinks my patch is OK to be accepted.
> But if you have a better idea for fixing it, I would be glad to see it. Anyway, this 
> issue has to be fixed.


Please resubmit it and I'll think about it again, thank you.


linux-next: Tree for Feb 21

2018-02-20 Thread Stephen Rothwell
Hi all,

Changes since 20180220:

Non-merge commits (relative to Linus' tree): 2620
 3067 files changed, 109463 insertions(+), 57328 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 256 trees (counting Linus' and 44 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (af3e79d29555 Merge tag 'leds_for-4.16-rc3' of 
git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds)
Merging fixes/master (7928b2cbe55b Linux 4.16-rc1)
Merging kbuild-current/fixes (36c1681678b5 genksyms: drop *.hash.c from 
.gitignore)
Merging arc-current/for-curr (053823335956 arc: dts: use 'atmel' as 
manufacturer for at24 in axs10x_mb)
Merging arm-current/fixes (091f02483df7 ARM: net: bpf: clarify tail_call index)
Merging m68k-current/for-linus (2334b1ac1235 MAINTAINERS: Add NuBus subsystem 
entry)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (2c10636a0b9c powerpc/pseries: Check for zero 
filled ibm,dynamic-memory property)
Merging sparc/master (aebb48f5e465 sparc64: fix typo in 
CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (abe27a885d9e ibmvnic: Check for NULL skb's in NAPI poll 
routine)
Merging bpf/master (b1a2ce825737 tools/libbpf: Avoid possibly using 
uninitialized variable)
Merging ipsec/master (013cb81e89f8 xfrm: Fix infinite loop in 
xfrm_get_dst_nexthop with transport mode.)
Merging netfilter/master (cfc2c7405333 netfilter: IDLETIMER: be syzkaller 
friendly)
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook 
mask only if set)
Merging wireless-drivers/master (7ac8ff95f48c mvpp2: fix multicast address 
filter)
Merging mac80211/master (3b07029729e3 mac80211: Fix sending ADDBA response for 
an ongoing session)
Merging rdma-fixes/for-rc (2f08ee363fe0 RDMA/restrack: don't use 
uaccess_kernel())
Merging sound-current/for-linus (fdcc968a3b29 ALSA: hda/realtek: PCI quirk for 
Fujitsu U7x7)
Merging pci-current/for-linus (7928b2cbe55b Linux 4.16-rc1)
Merging driver-core.current/driver-core-linus (7928b2cbe55b Linux 4.16-rc1)
Merging tty.current/tty-linus (7928b2cbe55b Linux 4.16-rc1)
Merging usb.current/usb-linus (44eb5e12b845 Revert "usb: musb: host: don't 
start next rx urb if current one failed")
Merging usb-gadget-fixes/fixes (98112041bcca usb: dwc3: core: Fix ULPI PHYs and 
prevent phy_get/ulpi_init during suspend/resume)
Merging usb-serial-fixes/usb-linus (d14ac576d10f USB: serial: cp210x: add new 
device ID ELV ALC 8xxx)
Merging usb-chipidea-fixes/ci-for-usb-stable (964728f9f407 USB: chipidea: msm: 
fix ulpi-node lookup)
Merging phy/fixes (7928b2cbe55b Linux 4.16-rc1)
Merging staging.current/staging-linus (c6754712e053 Merge tag 
'iio-fixes-for-4.16a' of 
git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus)
Merging char-misc.current/char-misc-linus (5aaa096d844c Merge tag 
'extcon-fixes-for-4.16-rc3' of 
git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into 
char-misc-linus)
Merging input-current/for-linus (ea4f7bd2aca9 Input: matrix_keypad - fix race 
when disabling interrupts)
Merging crypto-current/ma

Re: [PATCH 5/7] ARM: fix __inflate_kernel_data stack warning for LTO

2018-02-20 Thread Nicolas Pitre
On Tue, 20 Feb 2018, Arnd Bergmann wrote:

> Commit ca8b5d97d6bf ("ARM: XIP kernel: store .data compressed in ROM")
> moved the decompressor workspace to the stack and added a compiler
> flag to avoid the stack size warning.
> 
> With LTO, that warning comes back. Moving the workspace into an initdata
> variable avoids that warning but presumably also undoes the optimization.

Not only that, but it will probably crash at run time. What this code
does is uncompress initialized data to memory, _including_ initdata.
So you'll end up overwriting your inflate_state while decompressing.

> We could also try disabling the warning locally in that file with
> _Pragma("GCC diagnostic"), but we lack a little bit of infrastructure
> to do that nicely.

Your patch #1/7 showed issues with the final part of this feature 
anyway, so my suggestion for that patch will take care of this one too 
for the time being.

> 
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/arm/kernel/Makefile| 3 ---
>  arch/arm/kernel/head-inflate-data.c | 3 ++-
>  2 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
> index b59ac4bf82b8..2e8d40d442a2 100644
> --- a/arch/arm/kernel/Makefile
> +++ b/arch/arm/kernel/Makefile
> @@ -88,9 +88,6 @@ head-y  := head$(MMUEXT).o
>  obj-$(CONFIG_DEBUG_LL)   += debug.o
>  obj-$(CONFIG_EARLY_PRINTK)   += early_printk.o
>  
> -# This is executed very early using a temporary stack when no memory 
> allocator
> -# nor global data is available. Everything has to be allocated on the stack.
> -CFLAGS_head-inflate-data.o := $(call cc-option,-Wframe-larger-than=10240)
>  obj-$(CONFIG_XIP_DEFLATED_DATA) += head-inflate-data.o
>  
>  obj-$(CONFIG_ARM_VIRT_EXT)   += hyp-stub.o
> diff --git a/arch/arm/kernel/head-inflate-data.c 
> b/arch/arm/kernel/head-inflate-data.c
> index 6dd0ce5e6058..b208c4541bd1 100644
> --- a/arch/arm/kernel/head-inflate-data.c
> +++ b/arch/arm/kernel/head-inflate-data.c
> @@ -35,10 +35,11 @@ extern char _sdata[];
>   * stack then there is no need to clean up before returning.
>   */
>  
> +static __initdata struct inflate_state state;
> +
>  int __init __inflate_kernel_data(void)
>  {
>   struct z_stream_s stream, *strm = &stream;
> - struct inflate_state state;
>   char *in = __data_loc;
>   int rc;
>  
> -- 
> 2.9.0
> 
> 


Re: [PATCH] scsi: cxlflash: Select SCSI_SCAN_ASYNC

2018-02-20 Thread Matthew R. Ochs
On Tue, Feb 20, 2018 at 07:56:35PM +1100, Michael Ellerman wrote:
> Vaibhav Jain  writes:
> 
> > The cxlflash driver uses "Asynchronous SCSI scanning" enabled by
> > CONFIG_SCSI_SCAN_ASYNC. Without this enabled the modprobe of cxlflash
> > module gets hung with following backtrace:
> >
> > Call Trace:
> >  __switch_to+0x2cc/0x470
> >  __schedule+0x288/0xab0
> >  schedule+0x40/0xc0
> >  schedule_timeout+0x254/0x4f0
> >  wait_for_common+0xdc/0x260
> >  flush_work+0x140/0x2a0
> >  work_on_cpu+0x88/0xb0
> >  pci_device_probe+0x1d0/0x220
> >  driver_probe_device+0x408/0x5b0
> >  __driver_attach+0x16c/0x1a0
> >  bus_for_each_dev+0xb8/0x110
> >  driver_attach+0x3c/0x60
> >  bus_add_driver+0x1d8/0x370
> >  driver_register+0x9c/0x180
> >  __pci_register_driver+0x74/0xa0
> >  init_cxlflash+0x158/0x1cc
> >  do_one_initcall+0x68/0x1e0
> >  do_init_module+0x90/0x254
> >  load_module+0x2f8c/0x3720
> >  SyS_finit_module+0xcc/0x140
> >  system_call+0x58/0x6c
> 
> Why does it "hang"? That's kind of bizarre, I would expect either a
> build or runtime failure if a feature the driver requires is missing.
> 

It hangs due to a bug in the driver. I briefly looked at it several
months back before getting distracted with other items. IIRC there
was an issue with the state machine.

> > diff --git a/drivers/scsi/cxlflash/Kconfig b/drivers/scsi/cxlflash/Kconfig
> > index a011c5dbf214..f054c1b0fff3 100644
> > --- a/drivers/scsi/cxlflash/Kconfig
> > +++ b/drivers/scsi/cxlflash/Kconfig
> > @@ -6,6 +6,7 @@ config CXLFLASH
> > tristate "Support for IBM CAPI Flash"
> > depends on PCI && SCSI && CXL && EEH
> > select IRQ_POLL
> > +   select SCSI_SCAN_ASYNC
> 
> It's user configurable, so it's rude to select it. It can also be
> disabled on the kernel command line, so this seems like a fragile
> solution.
>
I think Vaibhav's intention here was to avoid the hang while the bug
is still present - I believe he has encountered it several times recently.

The proper solution would be to fix the bug.



Re: why scripts/link-vmlinux.sh has a final build of init/

2018-02-20 Thread Cao jin
Sorry for the late reply.

On 02/14/2018 07:31 PM, Masahiro Yamada wrote:
> 2018-02-13 16:08 GMT+09:00 Cao jin :
> 
>> BTW, I still have 2 questions.
>>
>> 1. In final build, why need
>>
>>GCC_PLUGINS_CFLAGS="${GCC_PLUGINS_CFLAGS}"
>>
>> Doesn't GCC_PLUGINS_CFLAGS already exist in the environment?
>>
>> I also tested the Randomizing Structure Layout plugin with this patch,
>> the plugin seems works in my test.
> 
> 
> I have not tested, but GCC_PLUGINS_CFLAGS="${GCC_PLUGINS_CFLAGS}"
> is probably unnecessary.
> 
> 
> 
>> 2. scripts/link-vmlinux.sh seems just handle only one argument: clean.
>> So why shouldn't it be:
> 
> 
> To detect the change of $(LD) $(LDFLAGS) $(LDFLAGS_vmlinux)
> because link-vmlinux.sh depends on them.
> 

Understood. I had missed the existence of the .vmlinux.cmd file.
Thanks very much, Masahiro-san.

-- 
Sincerely,
Cao jin

> 
>> diff --git a/Makefile b/Makefile
>> index ccd981892ef2..21d93b545381 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -998,7 +998,7 @@ ARCH_POSTLINK := $(wildcard
>> $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
>>
>>  # Final link of vmlinux with optional arch pass after final link
>>  cmd_link-vmlinux = \
>> -   $(CONFIG_SHELL) $< $(LD) $(LDFLAGS) $(LDFLAGS_vmlinux) ;\
>> +   $(CONFIG_SHELL) $<; \
>> $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
>>
>>  vmlinux: scripts/link-vmlinux.sh vmlinux_prereq $(vmlinux-deps) FORCE
> 
> 
> 





[PATCH V2] PCI: Add ACS quirk for Ampere root ports

2018-02-20 Thread Feng Kan
The Ampere Computing PCIe root port does not support ACS at this point.
However, the hardware provides isolation and source validation through the
SMMU. The stream ID generated by the PCIe ports contains both the
bus/device/function number as well as the port ID in its 3 most significant
bits. Turn on ACS but disable all the peer-to-peer features.

Signed-off-by: Feng Kan 
---
 V2 - Correct patch summary as per Bjorn's comment

 This is a rebranding of APM to Ampere, it is a change of vendor id
 and device id, all functionality stays the same as before.

 drivers/pci/quirks.c| 9 +
 include/linux/pci_ids.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index fc73401..57748a3 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4514,6 +4514,15 @@ static const struct pci_dev_acs_enabled {
{ PCI_VENDOR_ID_CAVIUM, PCI_ANY_ID, pci_quirk_cavium_acs },
/* APM X-Gene */
{ PCI_VENDOR_ID_AMCC, 0xE004, pci_quirk_xgene_acs },
+   /* Ampere Computing */
+   { PCI_VENDOR_ID_AMPERE, 0xE005, pci_quirk_xgene_acs },
+   { PCI_VENDOR_ID_AMPERE, 0xE006, pci_quirk_xgene_acs },
+   { PCI_VENDOR_ID_AMPERE, 0xE007, pci_quirk_xgene_acs },
+   { PCI_VENDOR_ID_AMPERE, 0xE008, pci_quirk_xgene_acs },
+   { PCI_VENDOR_ID_AMPERE, 0xE009, pci_quirk_xgene_acs },
+   { PCI_VENDOR_ID_AMPERE, 0xE00A, pci_quirk_xgene_acs },
+   { PCI_VENDOR_ID_AMPERE, 0xE00B, pci_quirk_xgene_acs },
+   { PCI_VENDOR_ID_AMPERE, 0xE00C, pci_quirk_xgene_acs },
{ 0 }
 };
 
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index a6b3066..c875d42 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -1333,6 +1333,7 @@
 #define PCI_DEVICE_ID_IMS_TT3D 0x9135
 
 #define PCI_VENDOR_ID_AMCC 0x10e8
+#define PCI_VENDOR_ID_AMPERE   0x1def
 
 #define PCI_VENDOR_ID_INTERG   0x10ea
 #define PCI_DEVICE_ID_INTERG_1682  0x1682
-- 
2.7.4
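
How a table like the one extended above is typically consulted can be sketched as follows. This is an illustrative userspace model, not the code in drivers/pci/quirks.c: struct acs_quirk, ANY_ID, always_isolated() and quirk_lookup() are hypothetical, the handler's verdict is reduced to a plain int, and the -1 "no match" value is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ANY_ID 0xffffu	/* hypothetical wildcard, in the spirit of PCI_ANY_ID */

struct acs_quirk {
	uint16_t vendor;
	uint16_t device;
	int (*acs_enabled)(uint16_t acs_flags);
};

/* Hypothetical handler: reports the requested isolation as provided. */
static int always_isolated(uint16_t acs_flags)
{
	(void)acs_flags;
	return 1;
}

/* Shortened example table; terminated by an all-zero entry. */
static const struct acs_quirk quirks[] = {
	{ 0x10e8, 0xe004, always_isolated },
	{ 0x1def, 0xe005, always_isolated },
	{ 0x1def, 0xe00c, always_isolated },
	{ 0 }
};

/* Returns the handler's verdict, or -1 when no table entry matches. */
static int quirk_lookup(const struct acs_quirk *table, uint16_t vendor,
			uint16_t device, uint16_t acs_flags)
{
	const struct acs_quirk *q;

	assert(table != NULL);

	for (q = table; q->acs_enabled; q++) {
		if ((q->vendor == vendor || q->vendor == ANY_ID) &&
		    (q->device == device || q->device == ANY_ID))
			return q->acs_enabled(acs_flags);
	}
	return -1;
}
```

Each root-port device ID needs its own row unless a wildcard is used, which is why the patch lists 0xE005 through 0xE00C explicitly.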



Re: [PATCH 1/3] seccomp, ptrace: switch get_metadata types to arch independent

2018-02-20 Thread Dmitry V. Levin
On Tue, Feb 20, 2018 at 07:47:45PM -0700, Tycho Andersen wrote:
> Commit 26500475ac1b ("ptrace, seccomp: add support for retrieving seccomp
> metadata") introduced `struct seccomp_metadata`, which contained unsigned
> longs that should be arch independent. The type of the flags member was
> chosen to match the corresponding argument to seccomp(), and so we need
> something at least as big as unsigned long. My understanding is that __u64
> should fit the bill, so let's switch both types to that.
> 
> While this is userspace facing, it was only introduced in 4.16-rc2, and so
> should be safe assuming it goes in before then.
> 
> Reported-by: "Dmitry V. Levin" 
> Signed-off-by: Tycho Andersen 
> CC: Kees Cook 
> CC: Oleg Nesterov 
> ---
>  include/uapi/linux/ptrace.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
> index e46d82b91166..d5a1b8a492b9 100644
> --- a/include/uapi/linux/ptrace.h
> +++ b/include/uapi/linux/ptrace.h
> @@ -69,8 +69,8 @@ struct ptrace_peeksiginfo_args {
>  #define PTRACE_SECCOMP_GET_METADATA  0x420d
>  
>  struct seccomp_metadata {
> - unsigned long filter_off;   /* Input: which filter */
> - unsigned int flags; /* Output: filter's flags */
> + __u64 filter_off;   /* Input: which filter */
> + __u64 flags;/* Output: filter's flags */
>  };
>  
>  /* Read signals from a shared (process wide) queue */

That's much better, thanks.

Reviewed-by: "Dmitry V. Levin" 


-- 
ldv
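
The point of the type change can be checked in userspace. A minimal sketch, assuming uint64_t as a stand-in for __u64; struct seccomp_metadata_fixed and the two helpers are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the patched layout; uint64_t stands in for __u64. */
struct seccomp_metadata_fixed {
	uint64_t filter_off;	/* Input: which filter */
	uint64_t flags;		/* Output: filter's flags */
};

/* With fixed-width members the layout no longer depends on the ABI. */
static_assert(sizeof(struct seccomp_metadata_fixed) == 16,
	      "layout must be identical for 32-bit and 64-bit callers");

static size_t metadata_size(void)
{
	return sizeof(struct seccomp_metadata_fixed);
}

static size_t flags_offset(void)
{
	return offsetof(struct seccomp_metadata_fixed, flags);
}
```

With unsigned long members the same struct would be 8 bytes wide on a 32-bit ABI and wider on a 64-bit one, so a 32-bit tracer on a 64-bit kernel would disagree with the kernel about the layout.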




Re: [PATCH 3/7] [HACK] pass endianess flag to LTO linker

2018-02-20 Thread Nicolas Pitre
On Tue, 20 Feb 2018, Arnd Bergmann wrote:

> We need some way to pass -mbig-endian to the linker during the
> LTO link stage, otherwise we get a warning like
> 
> arm-linux-gnueabi/bin/ld: arch/arm/lib/clearbit.o: compiled for a big endian 
> system and target is little endian
> 
> for each file we link in.
> 
> There is probably a better method of passing that flag, I'm just
> adding it to a different hack that I added earlier for x86 LTO
> here.

Didn't the patch below fix it for you already?

- >8
Date: Fri, 1 Sep 2017 18:37:52 -0400
Subject: [PATCH] scripts/gcc-ld: LTO on ARM needs arch specific gcc flags

Otherwise the final link where code generation happens produces code
for the wrong ISA when the default CPU configured into gcc is not the
one we need.

Also display the actual command when invoked with "make V=1".

Signed-off-by: Nicolas Pitre 

diff --git a/scripts/gcc-ld b/scripts/gcc-ld
index d95dd0be38..fa53be2a34 100755
--- a/scripts/gcc-ld
+++ b/scripts/gcc-ld
@@ -27,4 +27,10 @@ while [ "$1" != "" ] ; do
shift
 done
 
-exec $CC $ARGS
+case "${KBUILD_VERBOSE}" in
+*1*)
+   set -x
+   ;;
+esac
+
+exec $CC $KBUILD_CFLAGS $ARGS



Re: [PATCH 2/7] ARM: LTO: avoid THUMB2_KERNEL+LTO

2018-02-20 Thread Nicolas Pitre
On Tue, 20 Feb 2018, Arnd Bergmann wrote:

> Trying to build an LTO-Enabled kernel with Thumb2 instructions failed
> horribly for me, with an endless output of things like
> 
> ccVnNycO.s:2665: Error: thumb conditional instruction should be in IT block 
> -- `bxne lr'
> ccVnNycO.s:7128: Error: thumb conditional instruction should be in IT block 
> -- `strexeq r5,r2,[r3]'
> ccVnNycO.s:7258: Error: thumb conditional instruction should be in IT block 
> -- `strexeq lr,r0,[r3]'
> ccVnNycO.s:17380: Error: thumb conditional instruction should be in IT block 
> -- `strexeq r1,r2,[r6]'
> ccVnNycO.s:19163: Error: thumb conditional instruction should be in IT block 
> -- `strexeq r8,r6,[r3]'
> ccVnNycO.s:22722: Error: thumb conditional instruction should be in IT block 
> -- `strexeq r7,r1,[r0]'
> ccVnNycO.s:24105: conditional infixes are deprecated in unified syntax
> ccVnNycO.s:24105: Error: thumb conditional instruction should be in IT block 
> -- `sbcccs r1,r1,r3'
> ccVnNycO.s:24105: Error: thumb conditional instruction should be in IT block 
> -- `movcc r3,#0'
> ccVnNycO.s:24210: conditional infixes are deprecated in unified syntax
> ccVnNycO.s:24210: Error: thumb conditional instruction should be in IT block 
> -- `sbcccs r2,r2,r3'
> ccVnNycO.s:24210: Error: thumb conditional instruction should be in IT block 
> -- `movcc r3,#0'
> 
> I did not investigate this too much, disabling Thumb2 support when LTO is
> set lets me build randconfig kernels.
> 
> Since ARM_SINGLE_ARMV7M is Thumb2-only, I have to disallow LTO for V7-M
> targets.

Here's the workaround I sent you on January 2rd:

- >8
#!/bin/bash

# work around https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78353

GCC_ROOT=/opt/gcc-linaro-6.3.1-2017.05-x86_64_arm-linux-gnueabihf

set -e
set -x

cd $GCC_ROOT//arm-linux-gnueabihf/bin
[ -e fat-as ] && exit 1
mv as fat-as
cat > as << EOF
#!/bin/bash
exec -a "\$0" "\$(dirname "\$0")/fat-as" -mimplicit-it=always "\$@"
EOF
chmod +x as
- >8


Nicolas


Re: [PATCH] random: always fill buffer in get_random_bytes_wait

2018-02-20 Thread Jason A. Donenfeld
Hi Ted,

Can you apply this?

Thanks,
Jason


[PATCH v2 1/3] mm: memcg: plumbing memcg for kmem cache allocations

2018-02-20 Thread Shakeel Butt
Introduce memcg variants of the kmem cache allocation functions.
Currently the kernel switches the root kmem cache with the memcg
specific kmem cache for __GFP_ACCOUNT allocations to charge those
allocations to the memcg. However, the memcg to charge is extracted from
the current task_struct. This patch introduces variants of the kmem cache
allocation functions where the memcg can be provided explicitly by the
caller instead of being deduced from the current task.

These functions are useful for use-cases where the allocations should be
charged to a memcg different from the memcg of the caller. One such
concrete use-case is the allocation of fsnotify event objects, where
the objects should be charged to the listener instead of the producer.

One requirement for calling these functions is that the caller must hold a
reference to the memcg.

Signed-off-by: Shakeel Butt 
---
Changelog since v1:
- Fixed build for SLOB

 include/linux/memcontrol.h |  3 +-
 include/linux/slab.h   | 41 
 mm/memcontrol.c| 18 +++--
 mm/slab.c  | 78 +-
 mm/slab.h  |  6 +--
 mm/slob.c  |  7 
 mm/slub.c  | 77 ++---
 7 files changed, 199 insertions(+), 31 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index c79cdf9f8138..48eaf19859e9 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1174,7 +1174,8 @@ static inline bool 
mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
 }
 #endif
 
-struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep);
+struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep,
+   struct mem_cgroup *memcg);
 void memcg_kmem_put_cache(struct kmem_cache *cachep);
 int memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
struct mem_cgroup *memcg);
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 231abc8976c5..24355bc9e655 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -353,6 +353,8 @@ static __always_inline int kmalloc_index(size_t size)
 
 void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __malloc;
 void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags) 
__assume_slab_alignment __malloc;
+void *kmem_cache_alloc_memcg(struct kmem_cache *, gfp_t flags,
+   struct mem_cgroup *memcg) __assume_slab_alignment __malloc;
 void kmem_cache_free(struct kmem_cache *, void *);
 
 /*
@@ -377,6 +379,8 @@ static __always_inline void kfree_bulk(size_t size, void 
**p)
 #ifdef CONFIG_NUMA
 void *__kmalloc_node(size_t size, gfp_t flags, int node) 
__assume_kmalloc_alignment __malloc;
 void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node) 
__assume_slab_alignment __malloc;
+void *kmem_cache_alloc_node_memcg(struct kmem_cache *, gfp_t flags, int node,
+   struct mem_cgroup *memcg) __assume_slab_alignment __malloc;
 #else
 static __always_inline void *__kmalloc_node(size_t size, gfp_t flags, int node)
 {
@@ -387,15 +391,26 @@ static __always_inline void *kmem_cache_alloc_node(struct 
kmem_cache *s, gfp_t f
 {
return kmem_cache_alloc(s, flags);
 }
+
+static __always_inline void *kmem_cache_alloc_node_memcg(struct kmem_cache *s,
+   gfp_t flags, int node, struct mem_cgroup *memcg)
+{
+   return kmem_cache_alloc_memcg(s, flags, memcg);
+}
 #endif
 
 #ifdef CONFIG_TRACING
 extern void *kmem_cache_alloc_trace(struct kmem_cache *, gfp_t, size_t) 
__assume_slab_alignment __malloc;
+extern void *kmem_cache_alloc_memcg_trace(struct kmem_cache *, gfp_t, size_t,
+   struct mem_cgroup *memcg) __assume_slab_alignment __malloc;
 
 #ifdef CONFIG_NUMA
 extern void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
   gfp_t gfpflags,
   int node, size_t size) 
__assume_slab_alignment __malloc;
+extern void *kmem_cache_alloc_node_memcg_trace(struct kmem_cache *s,
+   gfp_t gfpflags, int node, size_t size,
+   struct mem_cgroup *memcg) __assume_slab_alignment __malloc;
 #else
 static __always_inline void *
 kmem_cache_alloc_node_trace(struct kmem_cache *s,
@@ -404,6 +419,13 @@ kmem_cache_alloc_node_trace(struct kmem_cache *s,
 {
return kmem_cache_alloc_trace(s, gfpflags, size);
 }
+
+static __always_inline void *
+kmem_cache_alloc_node_memcg_trace(struct kmem_cache *s, gfp_t gfpflags,
+   int node, size_t size, struct mem_cgroup *memcg)
+{
+   return kmem_cache_alloc_memcg_trace(s, gfpflags, size, memcg);
+}
 #endif /* CONFIG_NUMA */
 
 #else /* CONFIG_TRACING */
@@ -416,6 +438,15 @@ static __always_inline void *kmem_cache_alloc_trace(struct 
kmem_cache *s,
return ret;
 }
 
+static __always_inline void *kmem_cache_alloc_m

Re: [PATCH 1/7] ARM: disallow combining XIP and LTO

2018-02-20 Thread Nicolas Pitre
On Tue, 20 Feb 2018, Arnd Bergmann wrote:

> This fails during deflate_xip_data.sh
> 
>   /home/arnd/cross-gcc/bin/arm-linux-gnueabi-objcopy -O binary -R .comment -S 
>  vmlinux arch/arm/boot/xipImage && /bin/bash -c 
> '/git/arm-soc/arch/arm/boot/deflate_xip_data.sh vmlinux 
> arch/arm/boot/xipImage || { rm -f arch/arm/boot/xipImage; false; }'
> make -f /git/arm-soc/scripts/Makefile.modpost
> + sym_val __data_loc
> + sed -n / __data_loc$/{s/ .*$//p;q}
> + /home/arnd/cross-gcc/bin/arm-linux-gnueabi-gcc-nm vmlinux
> /home/arnd/cross-gcc/lib/gcc/arm-linux-gnueabi/8.0.1/../../../../arm-linux-gnueabi/bin/nm
>  terminated with signal 13 [Broken pipe]
> + local val=ac74c0f4
> + [ ac74c0f4 ]
> + echo 2893332724
> + __data_loc=2893332724
> + sym_val _edata_loc
> + /home/arnd/cross-gcc/bin/arm-linux-gnueabi-gcc-nm vmlinux
> + sed -n / _edata_loc$/{s/ .*$//p;q}
> /home/arnd/cross-gcc/lib/gcc/arm-linux-gnueabi/8.0.1/../../../../arm-linux-gnueabi/bin/nm
>  terminated with signal 13 [Broken pipe]
> + local val=ac7b8744
> + [ ac7b8744 ]
> + echo 2893776708
> + _edata_loc=2893776708
> + sym_val _xiprom
> + sed -n / _xiprom$/{s/ .*$//p;q}
> + /home/arnd/cross-gcc/bin/arm-linux-gnueabi-gcc-nm vmlinux
> /home/arnd/cross-gcc/lib/gcc/arm-linux-gnueabi/8.0.1/../../../../arm-linux-gnueabi/bin/nm
>  terminated with signal 13 [Broken pipe]
> 
> Obviously we want to make the combination work, no idea why it doesn't.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/arm/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 823e397ee0f3..8ed0f664f86f 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1976,6 +1976,7 @@ endchoice
>  config XIP_KERNEL
>   bool "Kernel Execute-In-Place from ROM"
>   depends on !ARM_LPAE && !ARCH_MULTIPLATFORM
> + depends on !LTO

You should move this to config XIP_DEFLATED_DATA instead.


Nicolas


[PATCH v2 2/3] mm: memcg: plumbing memcg for kmalloc allocations

2018-02-20 Thread Shakeel Butt
Introduce memcg variants of the kmalloc allocation functions.
kmalloc allocations are served from the kmem caches unless the size of
the request is larger than KMALLOC_MAX_CACHE_SIZE, in which case the
kmem caches are bypassed and the request is routed directly to the page
allocator. Either way, for __GFP_ACCOUNT kmalloc allocations the memcg
of the current task is charged. This patch introduces memcg variants of
the kmalloc functions so that callers can provide the memcg to charge.
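
The size-based routing described above can be modelled in userspace. This is only a sketch under stated assumptions, not the kernel code: `struct memcg_model`, `kmalloc_memcg_model()` and the 8 KiB `MODEL_MAX_CACHE_SIZE` threshold are hypothetical stand-ins (the real KMALLOC_MAX_CACHE_SIZE is configuration dependent), and malloc() stands in for both the kmem caches and the page allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for KMALLOC_MAX_CACHE_SIZE (real value is config dependent). */
#define MODEL_MAX_CACHE_SIZE 8192u

/* Hypothetical stand-in for struct mem_cgroup: it only tracks charged bytes. */
struct memcg_model {
	size_t cache_bytes;	/* charged via the kmem-cache path */
	size_t page_bytes;	/* charged via the page-allocator path */
};

/* Which path served the request; reported for illustration only. */
enum alloc_path { PATH_KMEM_CACHE, PATH_PAGE_ALLOC };

/* Allocate @size bytes and charge them to the memcg supplied by the caller. */
static void *kmalloc_memcg_model(size_t size, struct memcg_model *memcg,
				 enum alloc_path *path)
{
	void *p;

	assert(memcg && path);

	p = malloc(size);	/* stands in for both allocators */
	if (!p)
		return NULL;

	if (size > MODEL_MAX_CACHE_SIZE) {
		/* Large request: the kmem caches are bypassed. */
		*path = PATH_PAGE_ALLOC;
		memcg->page_bytes += size;
	} else {
		*path = PATH_KMEM_CACHE;
		memcg->cache_bytes += size;
	}
	return p;
}
```

The point of the model is that the memcg argument has to reach both paths, which is presumably why the patch touches the kmalloc_order helpers as well as the slab allocators.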

Signed-off-by: Shakeel Butt 
---
Changelog since v1:
- Fixed build for SLOB

 include/linux/memcontrol.h |  3 +-
 include/linux/slab.h       | 45 +++---
 mm/memcontrol.c            |  9 --
 mm/page_alloc.c            |  2 +-
 mm/slab.c                  | 31 +-
 mm/slab_common.c           | 41 +++-
 mm/slob.c                  |  6
 mm/slub.c                  | 65 +++---
 8 files changed, 172 insertions(+), 30 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 48eaf19859e9..9dec8a5c0ca2 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1179,7 +1179,8 @@ struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache 
*cachep,
 void memcg_kmem_put_cache(struct kmem_cache *cachep);
 int memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
struct mem_cgroup *memcg);
-int memcg_kmem_charge(struct page *page, gfp_t gfp, int order);
+int memcg_kmem_charge(struct page *page, gfp_t gfp, int order,
+ struct mem_cgroup *memcg);
 void memcg_kmem_uncharge(struct page *page, int order);
 
 #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 24355bc9e655..9df5d6279b38 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -352,6 +352,8 @@ static __always_inline int kmalloc_index(size_t size)
 #endif /* !CONFIG_SLOB */
 
 void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __malloc;
+void *__kmalloc_memcg(size_t size, gfp_t flags,
+   struct mem_cgroup *memcg) __assume_kmalloc_alignment __malloc;
 void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags) 
__assume_slab_alignment __malloc;
 void *kmem_cache_alloc_memcg(struct kmem_cache *, gfp_t flags,
struct mem_cgroup *memcg) __assume_slab_alignment __malloc;
@@ -378,6 +380,8 @@ static __always_inline void kfree_bulk(size_t size, void 
**p)
 
 #ifdef CONFIG_NUMA
 void *__kmalloc_node(size_t size, gfp_t flags, int node) 
__assume_kmalloc_alignment __malloc;
+void *__kmalloc_node_memcg(size_t size, gfp_t flags, int node,
+   struct mem_cgroup *memcg) __assume_kmalloc_alignment __malloc;
 void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node) 
__assume_slab_alignment __malloc;
 void *kmem_cache_alloc_node_memcg(struct kmem_cache *, gfp_t flags, int node,
struct mem_cgroup *memcg) __assume_slab_alignment __malloc;
@@ -387,6 +391,12 @@ static __always_inline void *__kmalloc_node(size_t size, 
gfp_t flags, int node)
return __kmalloc(size, flags);
 }
 
+static __always_inline void *__kmalloc_node_memcg(size_t size, gfp_t flags,
+   int node, struct mem_cgroup *memcg)
+{
+   return __kmalloc_memcg(size, flags, memcg);
+}
+
 static __always_inline void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t 
flags, int node)
 {
return kmem_cache_alloc(s, flags);
@@ -470,15 +480,26 @@ kmem_cache_alloc_node_memcg_trace(struct kmem_cache *s, 
gfp_t gfpflags,
 #endif /* CONFIG_TRACING */
 
 extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) 
__assume_page_alignment __malloc;
+extern void *kmalloc_order_memcg(size_t size, gfp_t flags, unsigned int order,
+   struct mem_cgroup *memcg) __assume_page_alignment __malloc;
 
 #ifdef CONFIG_TRACING
 extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) 
__assume_page_alignment __malloc;
+extern void *kmalloc_order_memcg_trace(size_t size, gfp_t flags,
+   unsigned int order,
+   struct mem_cgroup *memcg) __assume_page_alignment __malloc;
 #else
 static __always_inline void *
 kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
 {
return kmalloc_order(size, flags, order);
 }
+static __always_inline void *
+kmalloc_order_memcg_trace(size_t size, gfp_t flags, unsigned int order,
+ struct mem_cgroup *memcg)
+{
+   return kmalloc_order_memcg(size, flags, order, memcg);
+}
 #endif
 
 static __always_inline void *kmalloc_large(size_t size, gfp_t flags)
@@ -487,6 +508,14 @@ static __always_inline void *kmalloc_large(size_t size, 
gfp_t flags)
return kmalloc_order_trace(size, flags, order);
 }
 
+static __always_inline void *kmalloc_large_memcg(size_t size, gfp_t flags,
+str

[PATCH v2 3/3] fs: fsnotify: account fsnotify metadata to kmemcg

2018-02-20 Thread Shakeel Butt
A lot of memory can be consumed by the events generated for huge or
unlimited queues if the listener is slow or absent. This can cause
system-level memory pressure or OOMs. So, it is better to account the
fsnotify kmem caches to the memcg of the listener.

There are seven fsnotify kmem caches. Among them, allocations from
dnotify_struct_cache, dnotify_mark_cache, fanotify_mark_cache and
inotify_inode_mark_cachep happen in the context of a syscall from the
listener, so SLAB_ACCOUNT is enough for these caches.

The objects from fsnotify_mark_connector_cachep are not accounted, as
they are small compared to the notification marks or events, and it is
unclear whom to charge for a connector since it is shared by all events
attached to the inode.

The allocations from the event caches happen in the context of the event
producer. For such caches we need to remote charge the allocations to
the listener's memcg. Thus we save the memcg reference in the
fsnotify_group structure of the listener.

This patch also rearranges the members of fsnotify_group to fill
existing holes, so that the structure size stays the same, at least for
64-bit builds, even with the additional member.
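
The hole-filling argument can be illustrated with a small standalone example. The two structs below are hypothetical and are not the real fsnotify_group layout. They only show how, on a typical LP64 ABI, reordering members lets a new pointer fit into space that was previously padding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "before" layout: on LP64 each int is followed by 4 bytes of padding. */
struct group_before {
	int priority;	/* 4 bytes, then a 4-byte hole before the pointer */
	void *ops;
	int flags;	/* 4 bytes, then 4 bytes of tail padding */
};

/* Same members reordered, plus one new pointer member. */
struct group_after {
	int priority;
	int flags;	/* now occupies the former hole */
	void *ops;
	void *memcg;	/* the additional member */
};

/* Returns 1 when adding the member did not grow the struct. */
static int size_unchanged(void)
{
	/* Two adjacent ints never need padding between them. */
	assert(offsetof(struct group_after, flags) == sizeof(int));
	return sizeof(struct group_after) == sizeof(struct group_before);
}
```

On an LP64 target both structs are expected to be 24 bytes. On a 32-bit target the "before" struct has no holes to reuse, so the new member does grow it, which matches the "at least for 64-bit builds" caveat in the description.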

Signed-off-by: Shakeel Butt 
---
Changelog since v1:
- no more charging fsnotify_mark_connector objects
- Fixed the build for SLOB

 fs/notify/dnotify/dnotify.c          |  5 +++--
 fs/notify/fanotify/fanotify.c        | 12 +++-
 fs/notify/fanotify/fanotify.h        |  3 ++-
 fs/notify/fanotify/fanotify_user.c   |  7 +--
 fs/notify/group.c                    |  4
 fs/notify/inotify/inotify_fsnotify.c |  2 +-
 fs/notify/inotify/inotify_user.c     |  5 -
 fs/notify/mark.c                     |  6 --
 include/linux/fsnotify_backend.h     | 12
 include/linux/memcontrol.h           |  7 +++
 mm/memcontrol.c                      |  2 +-
 11 files changed, 46 insertions(+), 19 deletions(-)

diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index 63a1ca4b9dee..eb5c41284649 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -384,8 +384,9 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned 
long arg)
 
 static int __init dnotify_init(void)
 {
-   dnotify_struct_cache = KMEM_CACHE(dnotify_struct, SLAB_PANIC);
-   dnotify_mark_cache = KMEM_CACHE(dnotify_mark, SLAB_PANIC);
+   dnotify_struct_cache = KMEM_CACHE(dnotify_struct,
+ SLAB_PANIC|SLAB_ACCOUNT);
+   dnotify_mark_cache = KMEM_CACHE(dnotify_mark, SLAB_PANIC|SLAB_ACCOUNT);
 
dnotify_group = fsnotify_alloc_group(&dnotify_fsnotify_ops);
if (IS_ERR(dnotify_group))
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 6702a6a0bbb5..0d9493ebc7cd 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -140,22 +140,24 @@ static bool fanotify_should_send_event(struct 
fsnotify_mark *inode_mark,
 }
 
 struct fanotify_event_info *fanotify_alloc_event(struct inode *inode, u32 mask,
-const struct path *path)
+const struct path *path,
+struct mem_cgroup *memcg)
 {
struct fanotify_event_info *event;
 
if (fanotify_is_perm_event(mask)) {
struct fanotify_perm_event_info *pevent;
 
-   pevent = kmem_cache_alloc(fanotify_perm_event_cachep,
- GFP_KERNEL);
+   pevent = kmem_cache_alloc_memcg(fanotify_perm_event_cachep,
+   GFP_KERNEL, memcg);
if (!pevent)
return NULL;
event = &pevent->fae;
pevent->response = 0;
goto init;
}
-   event = kmem_cache_alloc(fanotify_event_cachep, GFP_KERNEL);
+   event = kmem_cache_alloc_memcg(fanotify_event_cachep, GFP_KERNEL,
+  memcg);
if (!event)
return NULL;
 init: __maybe_unused
@@ -210,7 +212,7 @@ static int fanotify_handle_event(struct fsnotify_group 
*group,
return 0;
}
 
-   event = fanotify_alloc_event(inode, mask, data);
+   event = fanotify_alloc_event(inode, mask, data, group->memcg);
ret = -ENOMEM;
if (unlikely(!event))
goto finish;
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 256d9d1ddea9..51b797896c87 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -53,4 +53,5 @@ static inline struct fanotify_event_info *FANOTIFY_E(struct 
fsnotify_event *fse)
 }
 
 struct fanotify_event_info *fanotify_alloc_event(struct inode *inode, u32 mask,
-const struct path *path);
+const struct path *path,
+   

[PATCH v2 0/3] Directed kmem charging

2018-02-20 Thread Shakeel Butt
This patchset introduces memcg variants of the memory allocation
functions, so that the caller can explicitly pass the memcg to charge for
kmem allocations. Currently the kernel, for __GFP_ACCOUNT memory
allocation requests, extracts the memcg of the current task and charges
it for the kmem allocation. This patch series introduces kmem allocation
functions where the caller can pass a pointer to a remote memcg. The
remote memcg is charged for the allocation instead of the memcg of the
caller. However, the caller must hold a reference to the remote memcg.
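
The difference between charging the current task and charging a remote memcg can be sketched as follows. Everything here is a hypothetical userspace model: `struct memcg_model`, `memcg_get()`, `memcg_put()` and `charge_model()` are not kernel APIs, and treating a NULL remote memcg as "charge the caller" is an assumption of the sketch rather than something stated in this cover letter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup. */
struct memcg_model {
	int refs;	/* references currently held on this group */
	size_t charged;	/* bytes charged to this group */
};

/* Stand-in for the memcg of the task doing the allocation. */
static struct memcg_model *current_memcg;

static void memcg_get(struct memcg_model *m)
{
	m->refs++;
}

static void memcg_put(struct memcg_model *m)
{
	assert(m->refs > 0);
	m->refs--;
}

/* Charge @size to @remote when one is given, otherwise to the caller's memcg. */
static struct memcg_model *charge_model(size_t size, struct memcg_model *remote)
{
	struct memcg_model *target = remote ? remote : current_memcg;

	/* The caller must already hold a reference on a remote memcg. */
	if (remote)
		assert(remote->refs > 0);

	target->charged += size;
	return target;
}
```

In the fsnotify case from this series, the event producer would play the role of the current task and the listener's group would supply the remote memcg.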

Fixed the build for SLOB in v2.

Shakeel Butt (3):
  mm: memcg: plumbing memcg for kmem cache allocations
  mm: memcg: plumbing memcg for kmalloc allocations
  fs: fsnotify: account fsnotify metadata to kmemcg

 fs/notify/dnotify/dnotify.c          |   5 +-
 fs/notify/fanotify/fanotify.c        |  12 ++-
 fs/notify/fanotify/fanotify.h        |   3 +-
 fs/notify/fanotify/fanotify_user.c   |   7 +-
 fs/notify/group.c                    |   4 +
 fs/notify/inotify/inotify_fsnotify.c |   2 +-
 fs/notify/inotify/inotify_user.c     |   5 +-
 fs/notify/mark.c                     |   6 +-
 include/linux/fsnotify_backend.h     |  12 ++-
 include/linux/memcontrol.h           |  13 ++-
 include/linux/slab.h                 |  86 +++-
 mm/memcontrol.c                      |  29 --
 mm/page_alloc.c                      |   2 +-
 mm/slab.c                            | 107
 mm/slab.h                            |   6 +-
 mm/slab_common.c                     |  41 +++-
 mm/slob.c                            |  13 +++
 mm/slub.c                            | 140 ++-
 18 files changed, 415 insertions(+), 78 deletions(-)

-- 
2.16.1.291.g4437f3f132-goog



Re: [PATCH v4 2/2] ptrace, seccomp: add support for retrieving seccomp metadata

2018-02-20 Thread Tycho Andersen
On Tue, Feb 20, 2018 at 09:13:28PM +0100, Eugene Syromiatnikov wrote:
> On Tue, Nov 14, 2017 at 07:00:19PM -0700, Tycho Andersen wrote:
> > With the new SECCOMP_FILTER_FLAG_LOG, we need to be able to extract these
> > flags for checkpoint restore, since they describe the state of a filter.
> > 
> > So, let's add PTRACE_SECCOMP_GET_METADATA, similar to ..._GET_FILTER, which
> > returns the metadata of the nth filter (right now, just the flags).
> > Hopefully this will be future proof, and new per-filter metadata can be
> > added to this struct.
> > 
> > v3: * use GET_METADATA instead of GET_FLAGS
> > v4: * resend
> > 
> > Signed-off-by: Tycho Andersen 
> > CC: Kees Cook 
> > CC: Andy Lutomirski 
> > CC: Oleg Nesterov 
> > ---
> >  include/linux/seccomp.h     |  8
> >  include/uapi/linux/ptrace.h |  6 ++
> >  kernel/ptrace.c             |  4
> >  kernel/seccomp.c            | 34 ++
> >  4 files changed, 52 insertions(+)
> > 
> > diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
> > index 10f25f7e4304..c723a5c4e3ff 100644
> > --- a/include/linux/seccomp.h
> > +++ b/include/linux/seccomp.h
> > @@ -95,11 +95,19 @@ static inline void get_seccomp_filter(struct 
> > task_struct *tsk)
> >  #if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_CHECKPOINT_RESTORE)
> >  extern long seccomp_get_filter(struct task_struct *task,
> >unsigned long filter_off, void __user *data);
> > +extern long seccomp_get_metadata(struct task_struct *task,
> > +unsigned long filter_off, void __user *data);
> >  #else
> >  static inline long seccomp_get_filter(struct task_struct *task,
> >   unsigned long n, void __user *data)
> >  {
> > return -EINVAL;
> >  }
> > +static inline long seccomp_get_metadata(struct task_struct *task,
> > +   unsigned long filter_off,
> > +   void __user *data)
> > +{
> > +   return -EINVAL;
> > +}
> >  #endif /* CONFIG_SECCOMP_FILTER && CONFIG_CHECKPOINT_RESTORE */
> >  #endif /* _LINUX_SECCOMP_H */
> > diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
> > index e3939e00980b..e46d82b91166 100644
> > --- a/include/uapi/linux/ptrace.h
> > +++ b/include/uapi/linux/ptrace.h
> > @@ -66,6 +66,12 @@ struct ptrace_peeksiginfo_args {
> >  #define PTRACE_SETSIGMASK  0x420b
> >  
> >  #define PTRACE_SECCOMP_GET_FILTER  0x420c
> > +#define PTRACE_SECCOMP_GET_METADATA  0x420d
> > +
> > +struct seccomp_metadata {
> > +   unsigned long filter_off;   /* Input: which filter */
> > +   unsigned int flags; /* Output: filter's flags */
> > +};
> >  
> >  /* Read signals from a shared (process wide) queue */
> >  #define PTRACE_PEEKSIGINFO_SHARED  (1 << 0)
> > diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> > index 84b1367935e4..58291e9f3276 100644
> > --- a/kernel/ptrace.c
> > +++ b/kernel/ptrace.c
> > @@ -1092,6 +1092,10 @@ int ptrace_request(struct task_struct *child, long 
> > request,
> > ret = seccomp_get_filter(child, addr, datavp);
> > break;
> >  
> > +   case PTRACE_SECCOMP_GET_METADATA:
> > +   ret = seccomp_get_metadata(child, addr, datavp);
> > +   break;
> > +
> > default:
> > break;
> > }
> > diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> > index bad457862ee0..7f1f2f3ea549 100644
> > --- a/kernel/seccomp.c
> > +++ b/kernel/seccomp.c
> > @@ -1061,6 +1061,40 @@ long seccomp_get_filter(struct task_struct *task, 
> > unsigned long filter_off,
> > __put_seccomp_filter(filter);
> > return ret;
> >  }
> > +
> > +long seccomp_get_metadata(struct task_struct *task,
> > + unsigned long size, void __user *data)
> > +{
> > +   long ret;
> > +   struct seccomp_filter *filter;
> > +   struct seccomp_metadata kmd = {};
> > +
> > +   if (!capable(CAP_SYS_ADMIN) ||
> > +   current->seccomp.mode != SECCOMP_MODE_DISABLED) {
> > +   return -EACCES;
> > +   }
> > +
> 
> > +   size = min_t(unsigned long, size, sizeof(kmd));
> > +
> > +   if (copy_from_user(&kmd, data, size))
> > +   return -EFAULT;
> 
> This will work kinda funny if size is less than sizeof(kmd.filter_off).
> And I see no reason for copying the whole structure when only filter_off
> is needed.

Yes, agreed. EINVAL seems better.

> > +
> > +   filter = get_nth_filter(task, kmd.filter_off);
> > +   if (IS_ERR(filter))
> > +   return PTR_ERR(filter);
> > +
> > +   memset(&kmd, 0, sizeof(kmd));
> 
> Is rewriting the filter_off field to 0 intentional?

It "shouldn't" matter, but it's probably better not to; I've fixed it
in https://lkml.org/lkml/2018/2/20/755

Thanks!

Tycho


Re: [PATCH v4 2/2] ptrace, seccomp: add support for retrieving seccomp metadata

2018-02-20 Thread Tycho Andersen
On Tue, Feb 20, 2018 at 10:30:52PM +0300, Dmitry V. Levin wrote:
> > +struct seccomp_metadata {
> > +   unsigned long filter_off;   /* Input: which filter */
> > +   unsigned int flags; /* Output: filter's flags */
> > +};
> 
> This "unsigned long" field is unacceptable unless you are willing
> to implement a compat layer for this arch-dependent type.
> 
> Please use an arch-independent type instead.
> See e.g. struct ptrace_peeksiginfo_args or struct seccomp_data.

Whoops, sorry about that. I believe I've addressed it in
https://lkml.org/lkml/2018/2/20/755

Cheers,

Tycho


Re: [PATCH v4 0/6] Add dynamic ftrace support for RISC-V platforms

2018-02-20 Thread Alan Kao
On Tue, Feb 13, 2018 at 01:13:15PM +0800, Alan Quey-Liang Kao(高魁良) wrote:
> This patch set includes the building blocks of dynamic ftrace features
> for RISC-V machines.
> 
> Changes in v4:
>  - Organize code structure according to changes in v3
>  - Rebase onto the riscv-linux-4.15 branch at github's 
>riscv/riscv-linux repo.  Note that this set is based on the previous
>ftrace patch, which provided basic support.
> 
> Changes in v3:
>  - Replace the nops at the tracer call sites with "call ftrace_stub"
>    instructions for better understanding (1/6, 2/6 and 5/6)
> 
> Changes in v2:
>  - Fix the return value as writing to kernel text goes wrong (2/6)
>  - Replace manual comparisons by calling memcmp (2/6)
>  - Simplify the conditional assignment in the Makefile (1/6)
> 
> Alan Kao (6):
>   riscv/ftrace: Add RECORD_MCOUNT support
>   riscv/ftrace: Add dynamic function tracer support
>   riscv/ftrace: Add dynamic function graph tracer support
>   riscv/ftrace: Add ARCH_SUPPORTS_FTRACE_OPS support
>   riscv/ftrace: Add DYNAMIC_FTRACE_WITH_REGS support
>   riscv/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR support
> 
>  arch/riscv/Kconfig              |   3 +
>  arch/riscv/Makefile             |   3 +
>  arch/riscv/include/asm/ftrace.h |  56 ++
>  arch/riscv/kernel/Makefile      |   5 +-
>  arch/riscv/kernel/ftrace.c      | 175 -
>  arch/riscv/kernel/mcount-dyn.S  | 239
>  arch/riscv/kernel/mcount.S      |  22 ++--
>  arch/riscv/kernel/stacktrace.c  |   6 +
>  scripts/recordmcount.pl         |   5 +
>  create mode 100644 arch/riscv/kernel/mcount-dyn.S
> 
> -- 
> 2.15.1
>

Any comments? 


[PATCH 2/3] ptrace, seccomp: tweak get_metadata behavior slightly

2018-02-20 Thread Tycho Andersen
Previously, if users passed a size smaller than the input structure, they
would get odd behavior. It doesn't make sense to pass a structure
smaller than filter_off, so let's just return -EINVAL in this case.

This changes userspace visible behavior, but was only introduced in commit
26500475ac1b ("ptrace, seccomp: add support for retrieving seccomp
metadata") in 4.16-rc2, so should be safe to change if merged before then.
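
The behavior after this change can be modelled in userspace as below. This is a sketch only: `struct metadata_model`, `MODEL_FLAG_LOG` and `get_metadata_model()` are hypothetical names, memcpy() stands in for copy_from_user() and copy_to_user(), `filter_logs` stands in for filter->log, and returning the number of bytes written on success is an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the metadata structure, using fixed-width fields. */
struct metadata_model {
	uint64_t filter_off;	/* input: which filter */
	uint64_t flags;		/* output: the filter's flags */
};

/* Hypothetical stand-in for SECCOMP_FILTER_FLAG_LOG; the value is arbitrary. */
#define MODEL_FLAG_LOG 1u

/*
 * @data and @size describe the caller's buffer. Returns the number of
 * bytes written back, or a negative errno value.
 */
static long get_metadata_model(void *data, unsigned long size, int filter_logs)
{
	struct metadata_model kmd = {0};

	assert(data != NULL);

	if (size > sizeof(kmd))
		size = sizeof(kmd);

	/* A buffer too small to hold the input field is rejected outright. */
	if (size < sizeof(kmd.filter_off))
		return -EINVAL;

	/* Only the input field is read from the caller. */
	memcpy(&kmd.filter_off, data, sizeof(kmd.filter_off));

	if (filter_logs)
		kmd.flags |= MODEL_FLAG_LOG;

	/* filter_off is not zeroed any more, so it is echoed back unchanged. */
	memcpy(data, &kmd, size);
	return (long)size;
}
```

With a full-sized buffer the caller gets its filter_off back together with the flags, while a 4-byte buffer now fails with -EINVAL instead of being partially copied.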

Reported-by: Eugene Syromiatnikov 
Signed-off-by: Tycho Andersen 
CC: Kees Cook 
CC: Oleg Nesterov 
---
 kernel/seccomp.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 940fa408a288..dc77548167ef 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -1076,14 +1076,16 @@ long seccomp_get_metadata(struct task_struct *task,
 
size = min_t(unsigned long, size, sizeof(kmd));
 
-   if (copy_from_user(&kmd, data, size))
+   if (size < sizeof(kmd.filter_off))
+   return -EINVAL;
+
+   if (copy_from_user(&kmd.filter_off, data, sizeof(kmd.filter_off)))
return -EFAULT;
 
filter = get_nth_filter(task, kmd.filter_off);
if (IS_ERR(filter))
return PTR_ERR(filter);
 
-   memset(&kmd, 0, sizeof(kmd));
if (filter->log)
kmd.flags |= SECCOMP_FILTER_FLAG_LOG;
 
-- 
2.14.1



[PATCH 1/3] seccomp, ptrace: switch get_metadata types to arch independent

2018-02-20 Thread Tycho Andersen
Commit 26500475ac1b ("ptrace, seccomp: add support for retrieving seccomp
metadata") introduced `struct seccomp_metadata`, which contained unsigned
longs that should be arch independent. The type of the flags member was
chosen to match the corresponding argument to seccomp(), and so we need
something at least as big as unsigned long. My understanding is that __u64
should fit the bill, so let's switch both types to that.

While this is userspace facing, it was only introduced in 4.16-rc2, and so
should be safe assuming it goes in before then.
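
The reason for the type switch can be shown with a small layout check. In this sketch uint64_t stands in for the kernel's __u64, and both struct names are hypothetical. The first struct mirrors the original field types and the second mirrors the proposed ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original field types: unsigned long is 4 or 8 bytes depending on the ABI. */
struct metadata_abi_dependent {
	unsigned long filter_off;
	unsigned int flags;
};

/* Proposed field types: the layout no longer depends on the ABI. */
struct metadata_fixed {
	uint64_t filter_off;
	uint64_t flags;
};

/* Checked at compile time: two 64-bit fields always occupy 16 bytes. */
static_assert(sizeof(struct metadata_fixed) == 16,
	      "layout must not depend on the ABI");

/* Offset of the output field, identical for 32-bit and 64-bit callers. */
static size_t fixed_flags_offset(void)
{
	return offsetof(struct metadata_fixed, flags);
}
```

With the ABI-dependent struct, a 32-bit tracer and a 64-bit kernel would disagree about the size and field offsets, which is the compat problem raised in review. The fixed-width layout avoids that without a compat layer.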

Reported-by: "Dmitry V. Levin" 
Signed-off-by: Tycho Andersen 
CC: Kees Cook 
CC: Oleg Nesterov 
---
 include/uapi/linux/ptrace.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index e46d82b91166..d5a1b8a492b9 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -69,8 +69,8 @@ struct ptrace_peeksiginfo_args {
 #define PTRACE_SECCOMP_GET_METADATA  0x420d
 
 struct seccomp_metadata {
-   unsigned long filter_off;   /* Input: which filter */
-   unsigned int flags; /* Output: filter's flags */
+   __u64 filter_off;   /* Input: which filter */
+   __u64 flags;/* Output: filter's flags */
 };
 
 /* Read signals from a shared (process wide) queue */
-- 
2.14.1



[PATCH 0/3] some fixups for PTRACE_SECCOMP_GET_METADATA

2018-02-20 Thread Tycho Andersen
Hi Kees,

Here are a couple of tweaks/fixes people suggested to the get_metadata
functionality, plus a test to ensure that things work the way they're supposed
to and stay that way.

Cheers,

Tycho

Tycho Andersen (3):
  seccomp, ptrace: switch get_metadata types to arch independent
  ptrace, seccomp: tweak get_metadata behavior slightly
  seccomp: add a selftest for get_metadata

 include/uapi/linux/ptrace.h                   |  4 +-
 kernel/seccomp.c                              |  6 ++-
 tools/testing/selftests/seccomp/seccomp_bpf.c | 61 +++
 3 files changed, 67 insertions(+), 4 deletions(-)

-- 
2.14.1


