date:20190530

linux-next: Tree for May 31

2019-05-30 Thread Stephen Rothwell

Hi all,

Changes since 20190530:

The net-next tree gained a conflict against the net tree.  It also gained
a build failure for which I reverted a commit.

I applied a patch to fix an sh build problem.

The akpm-current tree lost its build failure.

Non-merge commits (relative to Linus' tree): 3342
 3601 files changed, 131273 insertions(+), 60141 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 291 trees (counting Linus' and 70 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (bec7550cca10 Merge tag 'docs-5.2-fixes2' of 
git://git.lwn.net/linux)
Merging fixes/master (2bbacd1a9278 Merge tag 'kconfig-v5.2' of 
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild)
Merging kspp-gustavo/for-next/kspp (034e673710d3 platform/x86: acer-wmi: Mark 
expected switch fall-throughs)
Merging kbuild-current/fixes (30a28f11b618 kbuild: tar-pkg: enable 
communication with jobserver)
Merging arc-current/for-curr (46e04c25e72f ARC: [plat-hsdk] Get rid of 
inappropriate PHY settings)
Merging arm-current/fixes (e17b1af96b2a ARM: 8857/1: efi: enable CP15 DMB 
instructions before cleaning the cache)
Merging arm64-fixes/for-next/fixes (1e29ab3186e3 arm64: use the correct 
function type for __arm64_sys_ni_syscall)
Merging m68k-current/for-linus (fdd20ec8786a Documentation/features/time: Mark 
m68k having modern-timekeeping)
Merging powerpc-fixes/fixes (d6e3af06d947 powerpc/pseries: Fix xive=off command 
line)
Merging s390-fixes/fixes (1c2c7029c008 s390/crypto: fix possible sleep during 
spinlock aquired)
Merging sparc/master (54dee406374c Merge tag 'arm64-fixes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (b73484b2fc0d ethtool: Check for vlan etype or vlan tci when 
parsing flow_rule)
Merging bpf/master (5fac1718e706 selftests: bpf: fix compiler warning in 
flow_dissector test)
Merging ipsec/master (7c80eb1c7e2b af_key: fix leaks in key_pol_get_resp and 
dump_sp.)
Merging netfilter/master (58e8b37069ff Merge branch 
'net-phy-dp83867-add-some-fixes')
Merging ipvs/master (e633508a9528 netfilter: nft_fib: Fix existence check 
support)
Merging wireless-drivers/master (685c9b7750bf mwifiex: Abort at too short BSS 
descriptor element)
Merging mac80211/master (180aa422ef27 nl80211: fill all policy .type entries)
Merging rdma-fixes/for-rc (4f240dfec6bc RDMA/efa: Remove MAYEXEC flag check 
from mmap flow)
Merging sound-current/for-linus (6954158a1640 ALSA: fireface: Use ULL suffixes 
for 64-bit constants)
Merging sound-asoc-fixes/for-linus (6ce64b151bdf Merge branch 'asoc-5.2' into 
asoc-linus)
Merging regmap-fixes/for-linus (38ee2a8cc70e Merge branch 'regmap-5.2' into 
regmap-linus)
Merging regulator-fixes/for-linus (ae920866d4fc Merge branch 'regulator-5.2' 
into regulator-linus)
Merging spi-fixes/for-linus (be02f18a60ed Merge branch 'spi-5.2' into spi-linus)
Merging pci-current/for-linus (a188339ca5a3 Linux 5.2-rc1)
Merging driver-core.current/driver-core-linus (cd6c84d8f0cd Linux 5.2-rc2)
Merging tty.current/tty-linus (a1ad1cc9704f vt/fbcon: deinitialize resources in 
visual_init() after failed memory allocation)
Merging usb.current/usb-linus (3ea3091f1bd8 usbip: usbip_host: fix stub_dev 
lock context

Re: [v3 PATCH] usb: create usb_debug_root for gadget only

2019-05-30 Thread Felipe Balbi



Hi,

Chunfeng Yun  writes:

> Hi Felipe,
> On Tue, 2019-05-28 at 11:11 +0300, Felipe Balbi wrote:
>> Hi,
>> 
>> Chunfeng Yun  writes:
>> > diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
>> > index 7fcb9f782931..88b3ee03a12d 100644
>> > --- a/drivers/usb/core/usb.c
>> > +++ b/drivers/usb/core/usb.c
>> > @@ -1190,7 +1190,7 @@ EXPORT_SYMBOL_GPL(usb_debug_root);
>> >  
>> >  static void usb_debugfs_init(void)
>> >  {
>> > -  usb_debug_root = debugfs_create_dir("usb", NULL);
>> > +  usb_debug_root = debugfs_create_dir(USB_DEBUG_ROOT_NAME, NULL);
>> > 	debugfs_create_file("devices", 0444, usb_debug_root, NULL,
>> > 			    &usbfs_devices_fops);
>> >  }
>> 
>> might be a better idea to move this to usb common. Then have a function
>> which can be called by both host and gadget to maybe create the
>> directory:
>> 
>> static struct dentry *usb_debug_root;
>> 
>> struct dentry *usb_debugfs_init(void)
>> {
>>  if (!usb_debug_root)
>>  usb_debug_root = debugfs_create_dir("usb", NULL);
>> 
>>  return usb_debug_root;
>> }
>> 
>> 
>> Then usb core would be updated to something like:
>> 
>> static void usb_core_debugfs_init(void)
>> {
>>  struct dentry *root = usb_debugfs_init();
>> 
>>  debugfs_create_file("devices", 0444, root, NULL, &usbfs_devices_fops);
>> }
>> 
> I find a problem when move usb_debugfs_init() and usb_debugfs_cleanup()
> into usb common, it's easy to create "usb" directory, but difficult to
> cleanup it:
>
> common/common.c
>
> struct dentry *usb_debugfs_init(void)
> {
> if (!usb_debug_root)
> usb_debug_root = debugfs_create_dir("usb", NULL);
>
> return usb_debug_root;
> }
>
> void usb_debugfs_cleanup(void)
> {
> debugfs_remove_recursive(usb_debug_root);
> usb_debug_root = NULL;
> }
>
> core/usb.c
>
> static void usb_core_debugfs_init(void)
> {
> struct dentry *root = usb_debugfs_init();
>
> debugfs_create_file("devices", 0444, root, NULL,
> &usbfs_devices_fops);
> }
>
> static int __init usb_init(void)
> {
> ...
> usb_core_debugfs_init();
> ...
> }
>
> static void __exit usb_exit(void)
> {
> ...
> usb_debugfs_cleanup();
> // will be error, gadget may use it.
> ...
> }
>
> gadget/udc/core.c
>
> static int __init usb_udc_init(void)
> {
> ...
> usb_debugfs_init();
> ...
> }
>
> static void __exit usb_udc_exit(void)
> {
> ...
> usb_debugfs_cleanup();
> // can't cleanup in fact, usb core may use it.
> }
>
> How to handle this case? introduce a reference count? do you have any
> suggestion?

I guess a simple refcount is the way to go:

struct dentry *usb_debugfs_init(void)
{
	if (!usb_debug_root)
		usb_debug_root = debugfs_create_dir("usb", NULL);

	usb_debug_root_refcnt++;
	return usb_debug_root;
}

void usb_debugfs_cleanup(void)
{
	if (!(--usb_debug_root_refcnt)) {
		debugfs_remove_recursive(usb_debug_root);
		usb_debug_root = NULL;
	}
}

Or something along those lines

-- 
balbi
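
The refcount suggestion in the message above can be tried outside the kernel.
What follows is only a minimal userspace sketch: struct fake_dir,
fake_create_dir() and fake_remove_recursive() are hypothetical stand-ins for
struct dentry, debugfs_create_dir() and debugfs_remove_recursive(), and there
is no locking, which real code would presumably need if usb core and the
gadget code can initialize concurrently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dentry; not the kernel type. */
struct fake_dir {
	const char *name;
	int alive;
};

static struct fake_dir usb_dir_storage;
static struct fake_dir *usb_debug_root;
static unsigned int usb_debug_root_refcnt;

/* Stand-in for debugfs_create_dir(): hands out the single static directory. */
static struct fake_dir *fake_create_dir(const char *name)
{
	usb_dir_storage.name = name;
	usb_dir_storage.alive = 1;
	return &usb_dir_storage;
}

/* Stand-in for debugfs_remove_recursive(): marks the directory as gone. */
static void fake_remove_recursive(struct fake_dir *dir)
{
	if (dir)
		dir->alive = 0;
}

/* Called by every user (host core, gadget core); creates the dir once. */
struct fake_dir *usb_debugfs_init(void)
{
	if (!usb_debug_root)
		usb_debug_root = fake_create_dir("usb");

	usb_debug_root_refcnt++;
	return usb_debug_root;
}

/* Only the last user to leave actually removes the directory. */
void usb_debugfs_cleanup(void)
{
	assert(usb_debug_root_refcnt > 0);

	if (--usb_debug_root_refcnt == 0) {
		fake_remove_recursive(usb_debug_root);
		usb_debug_root = NULL;
	}
}
```

With two users, the first usb_debugfs_cleanup() call leaves the directory in
place and the second one removes it, which is the behaviour asked for in the
usb_exit()/usb_udc_exit() case.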

Re: [PATCH bpf] bpf: preallocate a perf_sample_data per event fd

2019-05-30 Thread Song Liu




> On May 30, 2019, at 5:01 PM, Matt Mullins  wrote:
> 
> On Thu, 2019-05-30 at 23:28 +, Song Liu wrote:
>>> On May 30, 2019, at 3:55 PM, Matt Mullins  wrote:
>>> 
>>> It is possible that a BPF program can be called while another BPF
>>> program is executing bpf_perf_event_output.  This has been observed with
>>> I/O completion occurring as a result of an interrupt:
>>> 
>>> bpf_prog_247fd1341cddaea4_trace_req_end+0x8d7/0x1000
>>> ? trace_call_bpf+0x82/0x100
>>> ? sch_direct_xmit+0xe2/0x230
>>> ? blk_mq_end_request+0x1/0x100
>>> ? blk_mq_end_request+0x5/0x100
>>> ? kprobe_perf_func+0x19b/0x240
>>> ? __qdisc_run+0x86/0x520
>>> ? blk_mq_end_request+0x1/0x100
>>> ? blk_mq_end_request+0x5/0x100
>>> ? kprobe_ftrace_handler+0x90/0xf0
>>> ? ftrace_ops_assist_func+0x6e/0xe0
>>> ? ip6_input_finish+0xbf/0x460
>>> ? 0xa01e80bf
>>> ? nbd_dbg_flags_show+0xc0/0xc0 [nbd]
>>> ? blkdev_issue_zeroout+0x200/0x200
>>> ? blk_mq_end_request+0x1/0x100
>>> ? blk_mq_end_request+0x5/0x100
>>> ? flush_smp_call_function_queue+0x6c/0xe0
>>> ? smp_call_function_single_interrupt+0x32/0xc0
>>> ? call_function_single_interrupt+0xf/0x20
>>> ? call_function_single_interrupt+0xa/0x20
>>> ? swiotlb_map_page+0x140/0x140
>>> ? refcount_sub_and_test+0x1a/0x50
>>> ? tcp_wfree+0x20/0xf0
>>> ? skb_release_head_state+0x62/0xc0
>>> ? skb_release_all+0xe/0x30
>>> ? napi_consume_skb+0xb5/0x100
>>> ? mlx5e_poll_tx_cq+0x1df/0x4e0
>>> ? mlx5e_poll_tx_cq+0x38c/0x4e0
>>> ? mlx5e_napi_poll+0x58/0xc30
>>> ? mlx5e_napi_poll+0x232/0xc30
>>> ? net_rx_action+0x128/0x340
>>> ? __do_softirq+0xd4/0x2ad
>>> ? irq_exit+0xa5/0xb0
>>> ? do_IRQ+0x7d/0xc0
>>> ? common_interrupt+0xf/0xf
>>> 
>>> ? __rb_free_aux+0xf0/0xf0
>>> ? perf_output_sample+0x28/0x7b0
>>> ? perf_prepare_sample+0x54/0x4a0
>>> ? perf_event_output+0x43/0x60
>>> ? bpf_perf_event_output_raw_tp+0x15f/0x180
>>> ? blk_mq_start_request+0x1/0x120
>>> ? bpf_prog_411a64a706fc6044_should_trace+0xad4/0x1000
>>> ? bpf_trace_run3+0x2c/0x80
>>> ? nbd_send_cmd+0x4c2/0x690 [nbd]
>>> 
>>> This also cannot be alleviated by further splitting the per-cpu
>>> perf_sample_data structs (as in commit 283ca526a9bd ("bpf: fix
>>> corruption on concurrent perf_event_output calls")), as a raw_tp could
>>> be attached to the block:block_rq_complete tracepoint and execute during
>>> another raw_tp.  Instead, keep a pre-allocated perf_sample_data
>>> structure per perf_event_array element and fail a bpf_perf_event_output
>>> if that element is concurrently being used.
>>> 
>>> Fixes: 20b9d7ac4852 ("bpf: avoid excessive stack usage for 
>>> perf_sample_data")
>>> Signed-off-by: Matt Mullins 
>>> ---
>>> It felt a bit overkill, but I had to split bpf_event_entry into its own
>>> header file to break an include cycle from perf_event.h -> cgroup.h ->
>>> cgroup-defs.h -> bpf-cgroup.h -> bpf.h -> (potentially) perf_event.h.
>>> 
>>> include/linux/bpf.h   |  7 ---
>>> include/linux/bpf_event.h | 20 
>>> kernel/bpf/arraymap.c |  2 ++
>>> kernel/trace/bpf_trace.c  | 30 +-
>>> 4 files changed, 39 insertions(+), 20 deletions(-)
>>> create mode 100644 include/linux/bpf_event.h
>>> 
>>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>>> index 4fb3aa2dc975..13b253a36402 100644
>>> --- a/include/linux/bpf.h
>>> +++ b/include/linux/bpf.h
>>> @@ -467,13 +467,6 @@ static inline bool bpf_map_flags_access_ok(u32 
>>> access_flags)
>>>(BPF_F_RDONLY_PROG | BPF_F_WRONLY_PROG);
>>> }
>>> 
>> 
>> I think we can avoid the include cycle as:
>> 
>> +struct perf_sample_data;
>> struct bpf_event_entry {
>>  struct perf_event *event;
>>  struct file *perf_file;
>>  struct file *map_file;
>>  struct rcu_head rcu;
>> +struct perf_sample_data *sd;
>> };
> 
> Yeah, that totally works.  I was mostly doing this so we had only one
> kmalloc allocation, but I'm not too worried about having an extra
> object in kmalloc-64 if it simplifies the code a lot.

We can also do something like

   ee = kzalloc(sizeof(struct bpf_event_entry) +
                sizeof(struct perf_sample_data), GFP_KERNEL);
   ee->sd = (void *)ee + sizeof(struct bpf_event_entry);

Thanks,
Song

> 
>> 
>>> -struct bpf_event_entry {
>>> -   struct perf_event *event;
>>> -   struct file *perf_file;
>>> -   struct file *map_file;
>>> -   struct rcu_head rcu;
>>> -};
>>> -
>>> bool bpf_prog_array_compatible(struct bpf_array *array, const struct 
>>> bpf_prog *fp);
>>> int bpf_prog_calc_tag(struct bpf_prog *fp);
>>> 
>>> diff --git a/include/linux/bpf_event.h b/include/linux/bpf_event.h
>>> new file mode 100644
>>> index ..9f415990f921
>>> --- /dev/null
>>> +++ b/include/linux/bpf_event.h
>>> @@ -0,0 +1,20 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> +
>>> +#ifndef _LINUX_BPF_EVENT_H
>>> +#define _LINUX_BPF_EVENT_H
>>>
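
The two ideas in this thread (one allocation that carries both the entry and
its sample data, and failing an output call when the entry is already in use)
can be sketched in userspace C. This is not the patch's code: struct
sample_data, struct event_entry, event_entry_alloc(), event_entry_output() and
nested_probe() are hypothetical names, calloc() stands in for kzalloc(), and a
C11 atomic_flag stands in for whatever in-use marker the kernel side would use.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct perf_sample_data. */
struct sample_data {
	uint64_t addr;
	uint64_t period;
};

/* Hypothetical stand-in for struct bpf_event_entry. */
struct event_entry {
	atomic_flag busy;	/* set while an output call is using sd */
	struct sample_data *sd;	/* points just past this struct */
};

/* The trailing sample data must start at a suitably aligned offset. */
_Static_assert(sizeof(struct event_entry) % _Alignof(struct sample_data) == 0,
	       "sample data would be misaligned behind the entry");

/* One allocation holds the entry followed by its sample data. */
struct event_entry *event_entry_alloc(void)
{
	struct event_entry *ee;

	ee = calloc(1, sizeof(*ee) + sizeof(struct sample_data));
	if (!ee)
		return NULL;

	atomic_flag_clear(&ee->busy);
	ee->sd = (struct sample_data *)((char *)ee + sizeof(*ee));
	return ee;
}

/*
 * Fill the preallocated sample data, or fail with -EBUSY if another call
 * on the same entry is still in flight. 'nested' models a second program
 * firing (for example from an interrupt) in the middle of the output.
 */
int event_entry_output(struct event_entry *ee, uint64_t addr,
		       void (*nested)(struct event_entry *))
{
	if (atomic_flag_test_and_set(&ee->busy))
		return -EBUSY;

	ee->sd->addr = addr;
	if (nested)
		nested(ee);

	atomic_flag_clear(&ee->busy);
	return 0;
}

/* Records what a nested output attempt on the same entry returns. */
static int nested_result;

void nested_probe(struct event_entry *ee)
{
	nested_result = event_entry_output(ee, 2, NULL);
}
```

Calling event_entry_output(ee, 1, nested_probe) lets the outer call succeed
while the nested attempt is rejected, so the outer call's sample data is not
overwritten. A single free() releases both objects.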

Re: [PATCH] PCI: endpoint: Add DMA to Linux PCI EP Framework

2019-05-30 Thread Kishon Vijay Abraham I

Hi Vinod,

On 31/05/19 10:37 AM, Vinod Koul wrote:
> Hi Kishon,
> 
> On 30-05-19, 11:16, Kishon Vijay Abraham I wrote:
>> +Vinod Koul
>>
>> Hi,
>>
>> On 30/05/19 4:07 AM, Alan Mikhak wrote:
>>> On Mon, May 27, 2019 at 2:09 AM Gustavo Pimentel
>>>  wrote:

 On Fri, May 24, 2019 at 20:42:43, Alan Mikhak 
 wrote:

 Hi Alan,

> On Fri, May 24, 2019 at 1:59 AM Gustavo Pimentel
>  wrote:
>>
>> Hi Alan,
>>
>> This patch implementation is very HW implementation dependent and
>> requires the DMA to exposed through PCIe BARs, which aren't always the
>> case. Besides, you are defining some control bits on
>> include/linux/pci-epc.h that may not have any meaning to other types of
>> DMA.
>>
>> I don't think this was what Kishon had in mind when he developed the
>> pcitest, but let see what Kishon was to say about it.
>>
>> I've developed a DMA driver for DWC PCI using Linux Kernel DMAengine API
>> and which I submitted some days ago.
>> By having a DMA driver which implemented using DMAengine API, means the
>> pcitest can use the DMAengine client API, which will be completely
>> generic to any other DMA implementation.
>>
>> right, my initial thought process was to use only dmaengine APIs in
>> pci-epf-test so that the system DMA or DMA within the PCIe controller can be
>> used transparently. But can we register DMA within the PCIe controller to the
>> DMA subsystem? AFAIK only system DMA should register with the DMA subsystem.
>> (ADMA in SDHCI doesn't use dmaengine). Vinod Koul can confirm.
> 
> So would this DMA be dedicated for PCI and all PCI devices on the bus?

Yes, this DMA will be used only by PCI ($patch is w.r.t. PCIe device mode, so
all endpoint functions, both physical and virtual, will use the DMA in
the controller).

> If so I do not see a reason why this cannot be using dmaengine. The use

Thanks for clarifying. I was under the impression any DMA within a peripheral
controller shouldn't use DMAengine.

> case would be memcpy for DMA right or mem to device (vice versa) transfers?

The device is memory mapped, so it would be memcpy only.

> 
> Btw many driver in sdhci do use dmaengine APIs and yes we are missing
> support in framework than individual drivers

I think dmaengine APIs are used only when the platform uses system DMA, not
ADMA within the SDHCI controller. IOW, there is no dma_async_device_register()
call that registers the SDHCI ADMA with the DMA subsystem.

Thanks
Kishon

Re: [PATCH 4.19 000/276] 4.19.47-stable review

2019-05-30 Thread Naresh Kamboju

On Thu, 30 May 2019 at 08:51, Greg Kroah-Hartman
 wrote:
>
> This is the start of the stable review cycle for the 4.19.47 release.
> There are 276 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat 01 Jun 2019 03:02:08 AM UTC.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.47-rc1.gz
> or in the git tree and branch at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.19.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Summary


kernel: 4.19.47-rc1
git repo: 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.19.y
git commit: ce4f69c2c1a58809446ca1cc59521671d7974f8a
git describe: v4.19.46-277-gce4f69c2c1a5
Test details: 
https://qa-reports.linaro.org/lkft/linux-stable-rc-4.19-oe/build/v4.19.46-277-gce4f69c2c1a5


No regressions (compared to build v4.19.46)

No fixes (compared to build v4.19.46)


Ran 25004 total tests in the following environments and test suites.

Environments
------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64

Test Suites
-----------
* build
* install-android-platform-tools-r2600
* kselftest
* libgpiod
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-timers-tests
* perf
* spectre-meltdown-checker-test
* v4l2-compliance
* network-basic-tests
* ltp-open-posix-tests
* kvm-unit-tests
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none
* libhug[

-- 
Linaro LKFT
https://lkft.linaro.org

Re: [PATCH 4.14 000/193] 4.14.123-stable review

2019-05-30 Thread Naresh Kamboju

On Thu, 30 May 2019 at 08:55, Greg Kroah-Hartman
 wrote:
>
> This is the start of the stable review cycle for the 4.14.123 release.
> There are 193 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat 01 Jun 2019 03:02:04 AM UTC.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.123-rc1.gz
> or in the git tree and branch at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.14.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Summary


kernel: 4.14.123-rc1
git repo: 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.14.y
git commit: 0352fa2fdaa68f3e27866e6f6a5125aa9efcefe4
git describe: v4.14.122-194-g0352fa2fdaa6
Test details: 
https://qa-reports.linaro.org/lkft/linux-stable-rc-4.14-oe/build/v4.14.122-194-g0352fa2fdaa6


No regressions (compared to build v4.14.122)

No fixes (compared to build v4.14.122)


Ran 22398 total tests in the following environments and test suites.

Environments
------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64

Test Suites
-----------
* build
* install-android-platform-tools-r2600
* libhugetlbfs
* ltp-commands-tests
* ltp-containers-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fs-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-sched-tests
* ltp-timers-tests
* spectre-meltdown-checker-test
* kselftest
* ltp-cap_bounds-tests
* ltp-cpuhotplug-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* network-basic-tests
* perf
* v4l2-compliance
* ltp-open-posix-tests
* kvm-unit-tests
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none
* ssuite

-- 
Linaro LKFT
https://lkft.linaro.org

Re: [PATCH v4] x86/power: Fix 'nosmt' vs. hibernation triple fault during resume

2019-05-30 Thread Josh Poimboeuf

On Fri, May 31, 2019 at 01:42:02AM +0200, Jiri Kosina wrote:
> On Thu, 30 May 2019, Josh Poimboeuf wrote:
> 
> > > > Reviewed-by: Thomas Gleixner 
> > > 
> > > Yes, it is, thanks!
> > 
> > I still think changing monitor/mwait to use a fixmap address would be a
> > much cleaner way to fix this.  I can try to work up a patch tomorrow.
> 
> I disagree with that from the backwards compatibility point of view.
> 
> I personally am quite frequently using different combinations of 
> resumer/resumee kernels, and I've never been bitten by it so far. I'd guess 
> I am not the only one.
> Fixmap sort of breaks that invariant.

Right now there is no backwards compatibility because nosmt resume is
already broken.

For "future" backwards compatibility we could just define a hard-coded
reserved fixmap page address, adjacent to the vsyscall reserved address.

Something like this (not yet tested)?  Maybe we could also remove the
resume_play_dead() hack?

diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 9da8cccdf3fb..1c328624162c 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -80,6 +80,7 @@ enum fixed_addresses {
 #ifdef CONFIG_X86_VSYSCALL_EMULATION
VSYSCALL_PAGE = (FIXADDR_TOP - VSYSCALL_ADDR) >> PAGE_SHIFT,
 #endif
+   FIX_MWAIT = (FIXADDR_TOP - VSYSCALL_ADDR - 1) >> PAGE_SHIFT,
 #endif
FIX_DBGP_BASE,
FIX_EARLYCON_MEM_BASE,
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 73e69117..9804fbe25d03 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -108,6 +108,8 @@ int __read_mostly __max_smt_threads = 1;
 /* Flag to indicate if a complete sched domain rebuild is required */
 bool x86_topology_update;
 
+static char __mwait_page[PAGE_SIZE];
+
 int arch_update_cpu_topology(void)
 {
int retval = x86_topology_update;
@@ -1319,6 +1321,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
smp_quirk_init_udelay();
 
speculative_store_bypass_ht_init();
+
+   set_fixmap(FIX_MWAIT, __pa_symbol(&__mwait_page));
 }
 
 void arch_enable_nonboot_cpus_begin(void)
@@ -1631,11 +1635,12 @@ static inline void mwait_play_dead(void)
}
 
/*
-* This should be a memory location in a cache line which is
-* unlikely to be touched by other processors.  The actual
-* content is immaterial as it is not actually modified in any way.
+* This memory location is never actually written to.  It's mapped at a
+* reserved fixmap address to ensure the monitored address remains
+* valid across a hibernation resume operation.  Otherwise a triple
+* fault can occur.
 */
-   mwait_ptr = &current_thread_info()->flags;
+   mwait_ptr = (void *)fix_to_virt(FIX_MWAIT);
 
wbinvd();

Re: [PATCH 2/2] edac: add support for Amazon's Annapurna Labs EDAC

2019-05-30 Thread Borislav Petkov

On Fri, May 31, 2019 at 01:15:33AM +, Herrenschmidt, Benjamin wrote:
> This isn't terribly helpful, there's nothing telling anybody which of
> those files corresponds to an ARM SoC :-)

drivers/edac/altera_edac.c is one example.

Also, James and I have a small writeup on what an ARM driver should look
like; we just need to polish it up and post it.

James?

> That said ...
> 
> You really want a single EDAC driver that contains all the stuff for
> the caches, the memory controller, etc... ?

Yap.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply. Srsly.

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-05-30 Thread Aubrey Li

On Fri, May 31, 2019 at 11:01 AM Aaron Lu  wrote:
>
> This feels like "date" failed to schedule on some CPU
> on time.
>
> My first reaction is: when shell wakes up from sleep, it will
> fork date. If the script is untagged and those workloads are
> tagged and all available cores are already running workload
> threads, the forked date can lose to the running workload
> threads due to __prio_less() can't properly do vruntime comparison
> for tasks on different CPUs. So those idle siblings can't run
> date and are idled instead. See my previous post on this:
> https://lore.kernel.org/lkml/20190429033620.GA128241@aaronlu/
> (Now that I re-read my post, I see that I didn't make it clear
> that se_bash and se_hog are assigned different tags(e.g. hog is
> tagged and bash is untagged).

Yes, the script is untagged. This looks like exactly the problem in your
previous post. I didn't follow that thread; did the discussion lead to a
solution?

>
> Siblings being forced idle is expected due to the nature of core
> scheduling, but when two tasks belonging to two siblings are
> fighting for schedule, we should let the higher priority one win.
>
> It used to work on v2 is probably due to we mistakenly
> allow different tagged tasks to schedule on the same core at
> the same time, but that is fixed in v3.

I have 64 threads running on a 104-CPU server, that is, the system
has ~40% idle time, and "date" still fails to get onto a CPU in
time. This may be the nature of core scheduling, but it seems far
from fair.

Shouldn't we share the core between (sysbench+gemmbench)
and (date)? I mean core-level sharing instead of "date" starvation?

Thanks,
-Aubrey

Re: [PATCH] PCI: endpoint: Add DMA to Linux PCI EP Framework

2019-05-30 Thread Kishon Vijay Abraham I

Hi Alan,

On 30/05/19 11:26 PM, Alan Mikhak wrote:
> On Wed, May 29, 2019 at 10:48 PM Kishon Vijay Abraham I  wrote:
>>
>> +Vinod Koul
>>
>> Hi,
>>
> On Fri, May 24, 2019 at 1:59 AM Gustavo Pimentel
>  wrote:
>>
>> Hi Alan,
>>
>> This patch implementation is very HW implementation dependent and
>> requires the DMA to exposed through PCIe BARs, which aren't always the
>> case. Besides, you are defining some control bits on
>> include/linux/pci-epc.h that may not have any meaning to other types of
>> DMA.
>>
>> I don't think this was what Kishon had in mind when he developed the
>> pcitest, but let see what Kishon was to say about it.
>>
>> I've developed a DMA driver for DWC PCI using Linux Kernel DMAengine API
>> and which I submitted some days ago.
>> By having a DMA driver which implemented using DMAengine API, means the
>> pcitest can use the DMAengine client API, which will be completely
>> generic to any other DMA implementation.
>>
>> right, my initial thought process was to use only dmaengine APIs in
>> pci-epf-test so that the system DMA or DMA within the PCIe controller can be
>> used transparently. But can we register DMA within the PCIe controller to the
>> DMA subsystem? AFAIK only system DMA should register with the DMA subsystem.
>> (ADMA in SDHCI doesn't use dmaengine). Vinod Koul can confirm.
>>
>> If DMA within the PCIe controller cannot be registered in DMA subsystem, we
>> should use something like what Alan has done in this patch with dma_read ops.
>> The dma_read ops implementation in the EP controller can either use dmaengine
>> APIs or use the DMA within the PCIe controller.
>>
>> I'll review the patch separately.
>>
>> Thanks
>> Kishon
> 
> Hi Kishon,
> 
> I have some improvements in mind for a v2 patch in response to
> feedback from Gustavo Pimentel that the current implementation is HW
> specific. I hesitate from submitting a v2 patch because it seems best
> to seek comment on possible directions this may be taking.
> 
> One alternative is to wait for or modify test functions in
> pci-epf-test.c to call DMAengine client APIs, if possible. I imagine
> pci-epf-test.c test functions would still allocate the necessary local
> buffer on the endpoint side for the same canned tests for everyone to
> use. They would prepare the buffer in the existing manner by filling
> it with random bytes and calculate CRC in the case of a write test.
> However, they would then initiate DMA operations by using DMAengine
> client APIs in a generic way instead of calling memcpy_toio() and
> memcpy_fromio(). They would post-process the buffer in the existing

No, you can't remove the memcpy_toio()/memcpy_fromio() calls. There could be
platforms without system DMA, or with system DMA but without MEMCOPY channels,
or without DMA in their PCI controller.

> manner such as the checking for CRC in the case of a read test.
> Finally, they would release the resources and report results back to
> the user of pcitest across the PCIe bus through the existing methods.
> 
> Another alternative I have in mind for v2 is to change the struct
> pci_epc_dma that this patch added to pci-epc.h from the following:
> 
> struct pci_epc_dma {
> u32 control;
> u32 size;
> u64 sar;
> u64 dar;
> };
> 
> to something similar to the following:
> 
> struct pci_epc_dma {
> size_t  size;
> void *buffer;
> int flags;
> };
> 
> The 'flags' field can be a bit field or separate boolean values to
> specify such things as linked-list mode vs single-block, etc.
> Associated #defines would be removed from pci-epc.h to be replaced if
> needed with something generic. The 'size' field specifies the size of
> DMA transfer that can fit in the buffer.

I still have to look more closely at your DMA patch, but linked-list mode
versus single-block mode shouldn't be a user-selectable option; it should be
determined by the size of the transfer.

> 
> That way the dma test functions in pci-epf-test.c can simply kmalloc
> and prepare a local buffer on the endpoint side for the DMA transfer
> and pass its pointer down the stack using the 'buffer' field to lower
> layers. This would allow different PCIe controller drivers to
> implement DMA or not according to their needs. Each implementer can
> decide to use DMAengine client API, which would be preferable, or
> directly read or write to DMA hardware registers to suit their needs.

Yes, that would be my preferred method as well. In fact, I implemented
pci_epf_tx() in [1] as a way for pci-epf-test to pass the buffer address to
the endpoint controller driver. I also implemented helpers for platforms using
system DMA (i.e. those that use DMAengine).

Thanks
Kishon

[1] ->
http://git.ti.com/cgit/cgit.cgi/ti-linux-kernel/ti-linux-kernel.git/tree/drivers/pci/endpoint/pci-epf-core.c?h=ti-linux-4.19.y
> 
> I would appreciate feedback and comment on such choices as part of this 
> review.
> 
>
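
The direction agreed on in this message can be illustrated with a small
userspace sketch: the test code hands a buffer down, the controller side uses
DMA if it has it, the memcpy path stays as the fallback, and the block mode is
derived from the transfer size rather than chosen by the caller. Every name
below (sketch_epc_dma, sketch_epc_ops, sketch_pick_mode(), sketch_epf_tx(),
SKETCH_MAX_BLOCK) is hypothetical and is not the pci_epf_tx() implementation
referenced in [1]; memcpy() stands in for memcpy_toio(), and the 4096-byte
limit is an assumed value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request: the proposed struct without a 'flags' field. */
struct sketch_epc_dma {
	size_t size;	/* bytes to transfer */
	void *buffer;	/* local endpoint-side buffer */
};

/* Hypothetical controller ops; dma_write is NULL on DMA-less platforms. */
struct sketch_epc_ops {
	int (*dma_write)(void *dst, const struct sketch_epc_dma *req);
};

enum sketch_dma_mode {
	SKETCH_SINGLE_BLOCK,
	SKETCH_LINKED_LIST,
};

/* Assumed per-block hardware limit; a real value is controller-specific. */
#define SKETCH_MAX_BLOCK 4096u

/* The mode follows from the transfer size; the caller never selects it. */
enum sketch_dma_mode sketch_pick_mode(size_t size)
{
	return size > SKETCH_MAX_BLOCK ? SKETCH_LINKED_LIST
				       : SKETCH_SINGLE_BLOCK;
}

/* Use the controller's DMA when present, otherwise fall back to a copy. */
int sketch_epf_tx(const struct sketch_epc_ops *ops, void *dst,
		  const struct sketch_epc_dma *req)
{
	assert(dst && req && req->buffer);

	if (ops && ops->dma_write)
		return ops->dma_write(dst, req);

	memcpy(dst, req->buffer, req->size);	/* stands in for memcpy_toio() */
	return 0;
}
```

A platform without DMA passes an ops structure whose dma_write is NULL and
still completes the transfer through the copy path.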

Re: [PATCH] PCI: endpoint: Add DMA to Linux PCI EP Framework

2019-05-30 Thread Vinod Koul

Hi Kishon,

On 30-05-19, 11:16, Kishon Vijay Abraham I wrote:
> +Vinod Koul
> 
> Hi,
> 
> On 30/05/19 4:07 AM, Alan Mikhak wrote:
> > On Mon, May 27, 2019 at 2:09 AM Gustavo Pimentel
> >  wrote:
> >>
> >> On Fri, May 24, 2019 at 20:42:43, Alan Mikhak 
> >> wrote:
> >>
> >> Hi Alan,
> >>
> >>> On Fri, May 24, 2019 at 1:59 AM Gustavo Pimentel
> >>>  wrote:
> 
>  Hi Alan,
> 
>  This patch implementation is very HW implementation dependent and
>  requires the DMA to exposed through PCIe BARs, which aren't always the
>  case. Besides, you are defining some control bits on
>  include/linux/pci-epc.h that may not have any meaning to other types of
>  DMA.
> 
>  I don't think this was what Kishon had in mind when he developed the
>  pcitest, but let see what Kishon was to say about it.
> 
>  I've developed a DMA driver for DWC PCI using Linux Kernel DMAengine API
>  and which I submitted some days ago.
>  By having a DMA driver which implemented using DMAengine API, means the
>  pcitest can use the DMAengine client API, which will be completely
>  generic to any other DMA implementation.
> 
> right, my initial thought process was to use only dmaengine APIs in
> pci-epf-test so that the system DMA or DMA within the PCIe controller can be
> used transparently. But can we register DMA within the PCIe controller to the
> DMA subsystem? AFAIK only system DMA should register with the DMA subsystem.
> (ADMA in SDHCI doesn't use dmaengine). Vinod Koul can confirm.

So would this DMA be dedicated to PCI and all PCI devices on the bus?
If so, I do not see a reason why this cannot use dmaengine. Would the use
case be memcpy, or memory-to-device (and vice versa) transfers?

Btw, many sdhci drivers do use dmaengine APIs, and yes, that support is
missing in the framework and lives in individual drivers instead.

> If DMA within the PCIe controller cannot be registered in DMA subsystem, we
> should use something like what Alan has done in this patch with dma_read ops.
> The dma_read ops implementation in the EP controller can either use dmaengine
> APIs or use the DMA within the PCIe controller.
> 
> I'll review the patch separately.
> 
> Thanks
> Kishon

-- 
~Vinod

Re: [PATCH v4 00/14] Provide generic top-down mmap layout functions

2019-05-30 Thread Alex Ghiti


On 5/29/19 4:16 PM, Kees Cook wrote:

On Sun, May 26, 2019 at 09:47:32AM -0400, Alexandre Ghiti wrote:

This series introduces generic functions to make top-down mmap layout
easily accessible to architectures, in particular riscv which was
the initial goal of this series.
The generic implementation was taken from arm64 and used successively
by arm, mips and finally riscv.

As I've mentioned before, I think this is really great. Making this
common has long been on my TODO list. Thank you for the work! (I've sent
separate review emails for individual patches where my ack wasn't
already present...)



Thanks :)



   - There is no common API to determine if a process is 32b, so I came up with
 !IS_ENABLED(CONFIG_64BIT) || is_compat_task() in [PATCH v4 12/14].

Do we need a common helper for this idiom? (Note that I don't think it's
worth blocking the series for this.)



Each architecture has its own way of finding that out. If there are other
places in generic code that need this, it might be interesting to propose
something along those lines.
I will search for such places and come back with something.

Thanks Kees for your time,

Alex




-Kees

[PATCH 1/2] PCI: Code reorganization for VGA device link

2019-05-30 Thread Abhishek Sahu

This patch does minor code reorganization. It introduces a helper
function which creates device link from the non-VGA controller
(consumer) to the VGA (supplier) and uses this helper function for
creating device link from integrated HDA controller to VGA. It will
help in subsequent patches which require a similar kind of device
link from USB/Type-C UCSI controller to VGA.

Signed-off-by: Abhishek Sahu 
---
 drivers/pci/quirks.c | 44 +---
 1 file changed, 29 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index a077f67fe1da..a20f7771a323 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4916,36 +4916,50 @@ static void quirk_fsl_no_msi(struct pci_dev *pdev)
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_FREESCALE, PCI_ANY_ID, quirk_fsl_no_msi);
 
 /*
- * GPUs with integrated HDA controller for streaming audio to attached displays
- * need a device link from the HDA controller (consumer) to the GPU (supplier)
- * so that the GPU is powered up whenever the HDA controller is accessed.
- * The GPU and HDA controller are functions 0 and 1 of the same PCI device.
- * The device link stays in place until shutdown (or removal of the PCI device
- * if it's hotplugged).  Runtime PM is allowed by default on the HDA controller
- * to prevent it from permanently keeping the GPU awake.
+ * GPUs can be multi-function PCI device which can contain controllers other
+ * than VGA (like Audio, USB, etc.). Internally in the hardware, these non-VGA
+ * controllers are tightly coupled with VGA controller. Whenever these
+ * controllers are runtime active, the VGA controller should also be in active
+ * state. Normally, in these GPUs, the VGA controller is present at function 0.
+ *
+ * This is a helper function which creates device link from the non-VGA
+ * controller (consumer) to the VGA (supplier). The device link stays in place
+ * until shutdown (or removal of the PCI device if it's hotplugged).
+ * Runtime PM is allowed by default on these non-VGA controllers to prevent
+ * it from permanently keeping the GPU awake.
  */
-static void quirk_gpu_hda(struct pci_dev *hda)
+static void
+pci_create_device_link_with_vga(struct pci_dev *pdev, unsigned int devfn)
 {
struct pci_dev *gpu;
 
-   if (PCI_FUNC(hda->devfn) != 1)
+   if (PCI_FUNC(pdev->devfn) != devfn)
return;
 
-   gpu = pci_get_domain_bus_and_slot(pci_domain_nr(hda->bus),
- hda->bus->number,
- PCI_DEVFN(PCI_SLOT(hda->devfn), 0));
+   gpu = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus),
+ pdev->bus->number,
+ PCI_DEVFN(PCI_SLOT(pdev->devfn), 0));
if (!gpu || (gpu->class >> 16) != PCI_BASE_CLASS_DISPLAY) {
pci_dev_put(gpu);
return;
}
 
-   if (!device_link_add(&hda->dev, &gpu->dev,
+   if (!device_link_add(&pdev->dev, &gpu->dev,
 DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME))
-   pci_err(hda, "cannot link HDA to GPU %s\n", pci_name(gpu));
+   pci_err(pdev, "cannot link with VGA %s\n", pci_name(gpu));
 
-   pm_runtime_allow(&hda->dev);
+   pm_runtime_allow(&pdev->dev);
pci_dev_put(gpu);
 }
+
+/*
+ * Create device link for GPUs with integrated HDA controller for streaming
+ * audio to attached displays.
+ */
+static void quirk_gpu_hda(struct pci_dev *hda)
+{
+   pci_create_device_link_with_vga(hda, 1);
+}
 DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_ATI, PCI_ANY_ID,
  PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda);
 DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_AMD, PCI_ANY_ID,
-- 
2.17.1

[PATCH 2/2] PCI: Create device link for NVIDIA GPU

2019-05-30 Thread Abhishek Sahu

NVIDIA Turing GPUs include hardware support for USB Type-C and
VirtualLink. This helps deliver the power, display, and data
required by VR headsets through a single USB Type-C connector.
The Turing GPU is a multi-function PCI device which has the following
four functions:

- VGA display controller (Function 0)
- Audio controller (Function 1)
- USB xHCI Host controller (Function 2)
- USB Type-C UCSI controller (Function 3)

Function 0 is tightly coupled with the other functions in the
hardware. When function 0 enters the runtime suspended state,
it power gates most of the hardware blocks. Some of these
hardware blocks are used by the other functions, which then
fail. So if any of functions 1, 2 or 3 is active, function 0
must also be in the active state.
'commit 07f4f97d7b4b ("vga_switcheroo: Use device link for
HDA controller")' creates the device link from function 1 to
function 0. A similar kind of device link needs to be created
between function 0 and functions 2 and 3 for NVIDIA Turing GPU.

This patch creates the required device links. They keep
function 0 runtime PM active while the other functions
are still active.

Signed-off-by: Abhishek Sahu 
---
 drivers/pci/quirks.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index a20f7771a323..afdbc199efc5 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4967,6 +4967,29 @@ DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_AMD, PCI_ANY_ID,
 DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
  PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda);
 
+/* Create device link for NVIDIA GPU with integrated USB controller to VGA. */
+static void quirk_gpu_usb(struct pci_dev *usb)
+{
+   pci_create_device_link_with_vga(usb, 2);
+}
+DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
+ PCI_CLASS_SERIAL_USB, 8, quirk_gpu_usb);
+
+/*
+ * Create device link for NVIDIA GPU with integrated Type-C UCSI controller
+ * to VGA. Currently there is no class code defined for UCSI device over PCI
+ * so using UNKNOWN class for now and it will be updated when UCSI
+ * over PCI gets a class code.
+ */
+#define PCI_CLASS_SERIAL_UNKNOWN   0x0c80
+static void quirk_gpu_usb_typec_ucsi(struct pci_dev *ucsi)
+{
+   pci_create_device_link_with_vga(ucsi, 3);
+}
+DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
+ PCI_CLASS_SERIAL_UNKNOWN, 8,
+ quirk_gpu_usb_typec_ucsi);
+
 /*
  * Some IDT switches incorrectly flag an ACS Source Validation error on
  * completions for config read requests even though PCIe r4.0, sec
-- 
2.17.1

[PATCH 0/2] PCI: device link quirk for NVIDIA GPU

2019-05-30 Thread Abhishek Sahu

NVIDIA Turing GPU [1] has hardware support for USB Type-C and
VirtualLink [2]. The Turing GPU is a multi-function PCI device
which has the following four functions:

- VGA display controller (Function 0)
- Audio controller (Function 1)
- USB xHCI Host controller (Function 2)
- USB Type-C UCSI controller (Function 3)

Currently the NVIDIA and Nouveau GPU drivers only manage function 0.
The rest of the functions are managed by other drivers. These functions
are tightly coupled internally in the hardware. When function 0 enters
the runtime suspended state, it power gates most of
the hardware blocks. Some of these hardware blocks are used by
the other PCI functions, which then fail. In the
mainline kernel, a device link is present between
function 0 and function 1.  This patch series creates
a similar kind of device link between function 0 and
functions 2 and 3.
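
Editorial sketch (not part of the posted series): the helper in patch 1/2 locates function 0 with PCI_DEVFN(PCI_SLOT(pdev->devfn), 0). The userspace snippet below only illustrates that devfn arithmetic. The three macros are written the way include/uapi/linux/pci.h defines them; sibling_function0() is a hypothetical name used for illustration.

```c
#include <assert.h>

/* devfn encoding: slot number in bits 7:3, function number in bits 2:0 */
#define PCI_DEVFN(slot, func)  ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)        (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)        ((devfn) & 0x07)

/* Hypothetical helper: devfn of function 0 in the same slot as devfn. */
static inline unsigned int sibling_function0(unsigned int devfn)
{
	assert(devfn <= 0xff);	/* devfn is an 8-bit quantity */
	return PCI_DEVFN(PCI_SLOT(devfn), 0);
}
```

For a USB function at slot 3, function 2, sibling_function0(PCI_DEVFN(3, 2)) gives the devfn of slot 3, function 0, which is where the series expects the VGA controller to sit.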

[1] https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf
[2] https://en.wikipedia.org/wiki/VirtualLink

Abhishek Sahu (2):
  PCI: Code reorganization for VGA device link
  PCI: Create device link for NVIDIA GPU

 drivers/pci/quirks.c | 67 ++--
 1 file changed, 52 insertions(+), 15 deletions(-)

-- 
2.17.1

Re: [PATCH] arm64: dts: rockchip: Add missing PCIe pwr amd rst configuration

2019-05-30 Thread Anand Moon

Hi Manivannan,

On Fri, 31 May 2019 at 09:32, Manivannan Sadhasivam
 wrote:
>
> Hi,
>
> On Thu, May 30, 2019 at 12:58:37PM +, Anand Moon wrote:
> > This patch add missing PCIe gpio and pinctrl for power (#PCIE_PWR)
> > also add PCIe gpio and pinctrl for reset (#PCIE_PERST_L).
> >
> > Signed-off-by: Anand Moon 
> > ---
> > Tested on Rock960 Model A
> > ---
> >  arch/arm64/boot/dts/rockchip/rk3399-rock960.dtsi | 16 ++--
> >  1 file changed, 14 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/boot/dts/rockchip/rk3399-rock960.dtsi 
> > b/arch/arm64/boot/dts/rockchip/rk3399-rock960.dtsi
> > index c7d48d41e184..f5bef6b0fe89 100644
> > --- a/arch/arm64/boot/dts/rockchip/rk3399-rock960.dtsi
> > +++ b/arch/arm64/boot/dts/rockchip/rk3399-rock960.dtsi
> > @@ -55,9 +55,10 @@
> >
> >   vcc3v3_pcie: vcc3v3-pcie-regulator {
> >   compatible = "regulator-fixed";
> > + gpio = < RK_PA2 GPIO_ACTIVE_HIGH>;
> >   enable-active-high;
> >   pinctrl-names = "default";
> > - pinctrl-0 = <_drv>;
> > + pinctrl-0 = <_drv _pwr>;
> >   regulator-boot-on;
> >   regulator-name = "vcc3v3_pcie";
> >   regulator-min-microvolt = <330>;
> > @@ -381,9 +382,10 @@
> >  };
> >
> >   {
> > + ep-gpio = < RK_PD4 GPIO_ACTIVE_HIGH>;
> >   num-lanes = <4>;
> >   pinctrl-names = "default";
> > - pinctrl-0 = <_clkreqn_cpm>;
> > + pinctrl-0 = <_clkreqn_cpm _perst_l>;
> >   vpcie3v3-supply = <_pcie>;
> >   status = "okay";
> >  };
> > @@ -408,6 +410,16 @@
> >   };
> >   };
> >
> > + pcie {
> > + pcie_pwr: pcie-pwr {
> > + rockchip,pins = <2 RK_PA2 RK_FUNC_GPIO 
> > _pull_none>;
> > + };
> > +
> > + pcie_perst_l:pcie-perst-l {
> > + rockchip,pins = <2 RK_PD4 RK_FUNC_GPIO 
> > _pull_none>;
> > + };
>
> Which schematics did you refer? According to Rock960 v2.1 schematics [1], 
> below
> is the pin mapping for PCI-E PWR and PERST:
>
> PCIE_PERST - GPIO2_A2
> PCIE_PWR - GPIO2_A5
>

Oops, I referred to the wrong schematics, *RK3399_Rock960_V1.0.pdf*,
which may be an old version.
Thanks for pointing out the correct schematics.

> Above mapping holds true for Rock960 version 1.1, 1.2 and 1.3. Also,
> rk3399-rock960.dtsi is common for both Rock960 and Ficus boards, so the board
> specific parts should go to rk3399-rock960.dts and rk3399-ficus.dts.
>
> Thanks,
> Mani

I have a ROCK960 V1.2 (Model A) for testing, so I will send patch v2
with the relevant node updates in rk3399-rock960.dts and
rk3399-ficus.dts, if the mapping below is common to both boards.

PCIE_PERST - GPIO2_A2
PCIE_PWR - GPIO2_A5

>
> [1] https://dl.vamrs.com/products/rock960/docs/hw/rock960_sch_v12_20180314.pdf

Best Regards
-Anand

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-05-30 Thread Aubrey Li

On Thu, May 30, 2019 at 10:17 PM Julien Desfossez
 wrote:
>
> Interesting, could you detail a bit more your test setup (commands used,
> type of machine, any cgroup/pinning configuration, etc) ? I would like
> to reproduce it and investigate.

Let me see if I can simplify my test to reproduce it.

Thanks,
-Aubrey

Re: [PATCH v4 08/14] arm: Use generic mmap top-down layout and brk randomization

2019-05-30 Thread Alex Ghiti


On 5/29/19 3:26 PM, Kees Cook wrote:

On Sun, May 26, 2019 at 09:47:40AM -0400, Alexandre Ghiti wrote:

arm uses a top-down mmap layout by default that exactly fits the generic
functions, so get rid of arch specific code and use the generic version
by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
As ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT selects ARCH_HAS_ELF_RANDOMIZE,
use the generic version of arch_randomize_brk since it also fits.
Note that this commit also removes the possibility for arm to have elf
randomization and no MMU: without MMU, the security added by randomization
is worth nothing.

Signed-off-by: Alexandre Ghiti 

Acked-by: Kees Cook 

It may be worth noting that STACK_RND_MASK is safe to remove here
because it matches the default that now exists in mm/util.c.
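
Editorial sketch (not part of the thread): to show where STACK_RND_MASK matters, the snippet below is a userspace model of the gap calculation in the arm mmap_base() that this patch removes. It assumes 4 KiB pages (PAGE_SHIFT of 12). model_mmap_gap() is a hypothetical name, and the stack top, stack rlimit and guard gap are passed in as plain parameters rather than read from kernel state.

```c
#include <assert.h>

#define PAGE_SHIFT      12	/* assumption: 4 KiB pages */
#define STACK_RND_MASK  (0x7ffUL >> (PAGE_SHIFT - 12))
#define MIN_GAP         (128 * 1024 * 1024UL)

/* With 4 KiB pages the randomization pad is just under 8 MiB. */
static_assert((STACK_RND_MASK << PAGE_SHIFT) == 0x7ff000UL,
	      "pad computed from STACK_RND_MASK");

/*
 * Hypothetical userspace model of the gap between the stack top and
 * the mmap base. In the kernel the inputs come from STACK_TOP, the
 * stack rlimit, stack_guard_gap and PF_RANDOMIZE.
 */
static inline unsigned long model_mmap_gap(unsigned long stack_top,
					   unsigned long rlim_cur,
					   unsigned long guard_gap,
					   int randomize)
{
	unsigned long max_gap = stack_top / 6 * 5;
	unsigned long gap = rlim_cur;
	unsigned long pad = guard_gap;

	/* Account for stack randomization if it is enabled. */
	if (randomize)
		pad += STACK_RND_MASK << PAGE_SHIFT;

	/* Skip the pad if adding it would overflow. */
	if (gap + pad > gap)
		gap += pad;

	/* Clamp to [MIN_GAP, max_gap], as the removed code did. */
	if (gap < MIN_GAP)
		gap = MIN_GAP;
	else if (gap > max_gap)
		gap = max_gap;

	return gap;
}
```

With an 8 MiB stack limit the result is clamped up to MIN_GAP, so the mask only changes the outcome for larger stack limits.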



Yes, thanks for pointing that out.


Thanks,


Alex




-Kees


---
  arch/arm/Kconfig |  2 +-
  arch/arm/include/asm/processor.h |  2 --
  arch/arm/kernel/process.c|  5 ---
  arch/arm/mm/mmap.c   | 62 
  4 files changed, 1 insertion(+), 70 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 8869742a85df..27687a8c9fb5 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -6,7 +6,6 @@ config ARM
select ARCH_CLOCKSOURCE_DATA
select ARCH_HAS_DEBUG_VIRTUAL if MMU
select ARCH_HAS_DEVMEM_IS_ALLOWED
-   select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_KEEPINITRD
select ARCH_HAS_KCOV
@@ -29,6 +28,7 @@ config ARM
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_CMPXCHG_LOCKREF
+   select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
select ARCH_WANT_IPC_PARSE_VERSION
select BUILDTIME_EXTABLE_SORT if MMU
select CLONE_BACKWARDS
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index 5d06f75ffad4..95b7688341c5 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -143,8 +143,6 @@ static inline void prefetchw(const void *ptr)
  #endif
  #endif
  
-#define HAVE_ARCH_PICK_MMAP_LAYOUT

-
  #endif
  
  #endif /* __ASM_ARM_PROCESSOR_H */

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 72cc0862a30e..19a765db5f7f 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -322,11 +322,6 @@ unsigned long get_wchan(struct task_struct *p)
return 0;
  }
  
-unsigned long arch_randomize_brk(struct mm_struct *mm)

-{
-   return randomize_page(mm->brk, 0x0200);
-}
-
  #ifdef CONFIG_MMU
  #ifdef CONFIG_KUSER_HELPERS
  /*
diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index 0b94b674aa91..b8d912ac9e61 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -17,43 +17,6 @@
addr)+SHMLBA-1)&~(SHMLBA-1)) +  \
 (((pgoff)<  
-/* gap between mmap and stack */

-#define MIN_GAP(128*1024*1024UL)
-#define MAX_GAP((STACK_TOP)/6*5)
-#define STACK_RND_MASK (0x7ff >> (PAGE_SHIFT - 12))
-
-static int mmap_is_legacy(struct rlimit *rlim_stack)
-{
-   if (current->personality & ADDR_COMPAT_LAYOUT)
-   return 1;
-
-   if (rlim_stack->rlim_cur == RLIM_INFINITY)
-   return 1;
-
-   return sysctl_legacy_va_layout;
-}
-
-static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
-{
-   unsigned long gap = rlim_stack->rlim_cur;
-   unsigned long pad = stack_guard_gap;
-
-   /* Account for stack randomization if necessary */
-   if (current->flags & PF_RANDOMIZE)
-   pad += (STACK_RND_MASK << PAGE_SHIFT);
-
-   /* Values close to RLIM_INFINITY can overflow. */
-   if (gap + pad > gap)
-   gap += pad;
-
-   if (gap < MIN_GAP)
-   gap = MIN_GAP;
-   else if (gap > MAX_GAP)
-   gap = MAX_GAP;
-
-   return PAGE_ALIGN(STACK_TOP - gap - rnd);
-}
-
  /*
   * We need to ensure that shared mappings are correctly aligned to
   * avoid aliasing issues with VIPT caches.  We need to ensure that
@@ -181,31 +144,6 @@ arch_get_unmapped_area_topdown(struct file *filp, const 
unsigned long addr0,
return addr;
  }
  
-unsigned long arch_mmap_rnd(void)

-{
-   unsigned long rnd;
-
-   rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1);
-
-   return rnd << PAGE_SHIFT;
-}
-
-void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
-{
-   unsigned long random_factor = 0UL;
-
-   if (current->flags & PF_RANDOMIZE)
-   random_factor = arch_mmap_rnd();
-
-   if (mmap_is_legacy(rlim_stack)) {
-   mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
-   mm->get_unmapped_area = arch_get_unmapped_area;
-   } else {
-   mm->mmap_base = mmap_base(random_factor, rlim_stack);
-   mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-   }
-}

Re: [PATCH v5] mfd: cros_ec_dev: Register cros_ec_accel_legacy driver as a subdevice

2019-05-30 Thread Gwendal Grignou

On Thu, May 30, 2019 at 12:48 AM Lee Jones  wrote:
>
> On Wed, 29 May 2019, Gwendal Grignou wrote:
>
> > On Wed, May 29, 2019 at 4:44 AM Lee Jones  wrote:
> > >
> > > On Tue, 28 May 2019, Gwendal Grignou wrote:
> > >
> > > > On Mon, Apr 1, 2019 at 8:46 PM Lee Jones  wrote:
> > > > >
> > > > > On Wed, 27 Feb 2019, Gwendal Grignou wrote:
> > > > >
> > > > > > From: Enric Balletbo i Serra 
> > > > > >
> > > > > > With this patch, the cros_ec_ctl driver will register the legacy
> > > > > > accelerometer driver (named cros_ec_accel_legacy) if it fails to
> > > > > > register sensors through the usual path cros_ec_sensors_register().
> > > > > > This legacy device is present on Chromebook devices with older EC
> > > > > > firmware only supporting deprecated EC commands (Glimmer based 
> > > > > > devices).
> > > > > >
> > > > > > Tested-by: Gwendal Grignou 
> > > > > > Signed-off-by: Enric Balletbo i Serra 
> > > > > > Reviewed-by: Gwendal Grignou 
> > > > > > Reviewed-by: Andy Shevchenko 
> > > > > > ---
> > > > > > Changes in v5:
> > > > > > - Remove unnecessary white lines.
> > > > > >
> > > > > > Changes in v4:
> > > > > > - [5/8] Nit: EC -> ECs (Lee Jones)
> > > > > > - [5/8] Statically define cros_ec_accel_legacy_cells (Lee Jones)
> > > > > >
> > > > > > Changes in v3:
> > > > > > - [5/8] Add the Reviewed-by Andy Shevchenko.
> > > > > >
> > > > > > Changes in v2:
> > > > > > - [5/8] Add the Reviewed-by Gwendal.
> > > > > >
> > > > > >  drivers/mfd/cros_ec_dev.c | 66 
> > > > > > +++
> > > > > >  1 file changed, 66 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/mfd/cros_ec_dev.c b/drivers/mfd/cros_ec_dev.c
> > > > > > index d275deaecb12..64567bd0a081 100644
> > > > > > --- a/drivers/mfd/cros_ec_dev.c
> > > > > > +++ b/drivers/mfd/cros_ec_dev.c
> > > > > > @@ -376,6 +376,69 @@ static void cros_ec_sensors_register(struct 
> > > > > > cros_ec_dev *ec)
> > > > > >   kfree(msg);
> > > > > >  }
> > > > > >
> > > > > > +static struct cros_ec_sensor_platform sensor_platforms[] = {
> > > > > > + { .sensor_num = 0 },
> > > > > > + { .sensor_num = 1 }
> > > > > > +};
> > > > >
> > > > > I'm still very uncomfortable with this struct.
> > > > >
> > > > > Other than these indices, the sensors have no other distinguishing
> > > > > features, thus there should be no need to identify or distinguish
> > > > > between them in this way.
> > > > When initializing the sensors, the IIO driver expect to find in the
> > > > data  structure pointed by dev_get_platdata(dev), in field sensor_num
> > > > is stored the index assigned by the embedded controller to talk to a
> > > > given sensor.
> > > > cros_ec_sensors_register() use the same mechanism; in that function,
> > > > the sensor_num field is populated from the output of an EC command
> > > > MOTIONSENSE_CMD_INFO. In case of legacy mode, that command may not be
> > > > available and in any case we know the EC has only either 2
> > > > accelerometers present or nothing.
> > > >
> > > > For instance, let's compare a legacy device with a more recent one:
> > > >
> > > > legacy:
> > > > type  |   id  | sensor_num   | device name
> > > > accelerometer  |   0   |   0  | cros-ec-accel.0
> > > > accelerometer  |   1   |   1  | cros-ec-accel.1
> > > >
> > > > Modern:
> > > > type  |   id  | sensor_num   | device name
> > > > accelerometer  |   0   |   0  | cros-ec-accel.0
> > > > accelerometer  |   1   |   1  | cros-ec-accel.1
> > > > gyroscope|0  |2 | cros-ec-gyro.0
> > > > magnetometer |0  |   3  | cros-ec-mag.0
> > > > light  |0  |   4  | 
> > > > cros-ec-light.0
> > > > ...
> > >
> > > Why can't these numbers be assigned at runtime?
> > I assume you want to know why IIO drivers need to know "sensor_num"
> > ahead of time. It is because each IIO driver is independent from the
> > other.
> > Let assume there was 2 light sensors in the device:
> > type  |   id  | sensor_num   | device name
> >  light  |0  |   4  | 
> > cros-ec-light.0
> >  light  |1  |   5  | 
> > cros-ec-light.1
> >
> > In case of sensors of the same type without sensor_num, cros-ec-light
> > driver has no information at probe time if it should bind to sensors
> > named by the EC 4 or 5.
> >
> > We could get away with cros-ec-accel, as EC always presents
> > accelerometers with sensor_num  0 and 1, but I don't want to rely on
> > this property in the general case.
> > Only cros_ec_dev MFD driver has the global view of all sensors available.
>
> Well seeing as this implementation has already been accepted and you're
> only *using* it, rather than creating it, I think this conversation is
> moot.

Re: Re: Re: [tip:x86/urgent] x86/mce: Ensure offline CPUs don't participate in rendezvous process

2019-05-30 Thread Tony W Wang-oc

On Fri, May 31, 2019, Raj, Ashok wrote:
> On Thu, May 30, 2019 at 09:13:39AM +, Tony W Wang-oc wrote:
> > On Thu, May 30, 2019, Tony W Wang-oc wrote:
> > > Hi Ashok,
> > > I have two questions about this patch, could you help to check:
> > >
> > > 1, for broadcast #MC exceptions, this patch seems require #MC exception
> > > errors
> > > set MCG_STATUS_RIPV = 1.
> > > But for Intel CPU, some #MC exception errors set MCG_STATUS_RIPV = 0
> > > (like "Recoverable-not-continuable SRAR Type" Errors), for these errors
> > > the patch doesn't seem to work, is that okay?
> > >
> > > 2, for LMCE exceptions, this patch seems require #MC exception errors
> > > set MCG_STATUS_RIPV = 0 to make sure LMCE be handled normally even
> > > on offline CPU.
> > > For LMCE errors set MCG_STATUS_RIPV = 1, the patch prevents offline CPU
> > > handle these LMCE errors, is that okay?
> > >
> >
> > More specifically, this patch seems require #MC exceptions meet the
> condition
> > "MCG_STATUS_RIPV ^ MCG_STATUS_LMCES == 1"; But on a Xeon X5650
> machine (SMP),
> 
> The offline CPU will never get a LMCE=1, since those only happen on the CPU
> that's doing active work. Offline CPUs just sitting in idle.
> 
> The specific error here is a PCC=1, so irrespective of what happens
> We do capture the errors in the per-cpu log, and kernel would panic.
> 
> What specifically this patch tries to achieve is to leave an error
> sitting with MCG-STATUS.MCIP=1 and another recoverable error would shut
> the system down.
Yes, I agree with you on this point.

But for question 1: when a broadcast #MC exception such as a
"Recoverable-not-continuable SRAR Type" error reaches an offline CPU with
MCG_STATUS_RIPV = 0 and PCC = 0, is there also the problem
"Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast
exception handler"?

Thanks
> 
> I don't see anything wrong with what this patch does..
> 
> > "Data CACHE Level-2 Generic Error" does not meet this condition.
> >
> > I got below message from:
> https://www.centos.org/forums/viewtopic.php?p=292742
> >
> > Hardware event. This is not a software error.
> > MCE 0
> > CPU 4 BANK 6 TSC b7065eeaa18b0
> > TIME 1545643603 Mon Dec 24 10:26:43 2018
> > MCG status:MCIP
> > MCi status:
> > Uncorrected error
> > Error enabled
> > Processor context corrupt
> > MCA: Data CACHE Level-2 Generic Error
> > STATUS b2008106 MCGSTATUS 4
> > MCGCAP 1c09 APICID 4 SOCKETID 0
> > CPUID Vendor Intel Family 6 Model 44
> >
> > > Thanks
> > > Tony W Wang-oc
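
Editorial sketch (not part of the thread): the condition Tony describes, read here as "exactly one of RIPV and LMCES is set", can be checked against the MCGSTATUS value in the quoted log. The bit positions follow arch/x86/include/asm/mce.h; ripv_xor_lmces() is a hypothetical helper and only reflects the condition as worded in the mail, not the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MCG_STATUS bits */
#define MCG_STATUS_RIPV   (1ULL << 0)	/* restart IP valid */
#define MCG_STATUS_EIPV   (1ULL << 1)	/* error IP valid */
#define MCG_STATUS_MCIP   (1ULL << 2)	/* machine check in progress */
#define MCG_STATUS_LMCES  (1ULL << 3)	/* local machine check signaled */

/* "MCGSTATUS 4" in the quoted log is MCIP alone. */
static_assert(MCG_STATUS_MCIP == 4, "MCGSTATUS 4 decodes to MCIP");

/* Hypothetical helper: true when exactly one of RIPV and LMCES is set. */
static inline bool ripv_xor_lmces(uint64_t mcgstatus)
{
	bool ripv = (mcgstatus & MCG_STATUS_RIPV) != 0;
	bool lmces = (mcgstatus & MCG_STATUS_LMCES) != 0;

	return ripv != lmces;
}
```

For the logged value 4, both RIPV and LMCES are clear, so ripv_xor_lmces(4) is false, which matches the statement that this error does not meet the condition.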

Re: [PATCH v2] PCI: endpoint: Skip odd BAR when skipping 64bit BAR

2019-05-30 Thread Kishon Vijay Abraham I

Hi Alan,

On 25/05/19 12:20 AM, Alan Mikhak wrote:
> Hi Kishon,
> 
> Yes. This change is still applicable even when the platform specifies
> that it only supports 64-bit BARs by setting the bar_fixed_64bit
> member of epc_features.
> 
> The issue being fixed is this: If the 'continue' statement is executed
> within the loop, the loop index 'bar' needs to advanced by two, not
> one, when the BAR is 64-bit. Otherwise the next loop iteration will be
> on an odd BAR which doesn't exist.

IIUC you are fixing the case where the BAR is "reserved" (specified in
epc_features) and is also a 64-bit BAR?

If 2 consecutive BARs are marked as reserved in reserved_bar of epc_features,
the result should be the same right?
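
Editorial sketch (not part of the thread): a userspace model of the loop stepping Alan describes, where the index advances by two for a 64-bit BAR even when the iteration is skipped with 'continue'. visited_bars() is a hypothetical function; flags[] and reserved_bar stand in for epf->bar[].flags and epc_features->reserved_bar, and the returned bitmask marks the indices on which the loop body (pci_epc_set_bar() in the driver) would run.

```c
#include <assert.h>

#define BAR_0 0
#define BAR_5 5
#define PCI_BASE_ADDRESS_MEM_TYPE_64 0x04	/* as in pci_regs.h */

/* Hypothetical model: bitmask of BAR indices the loop body runs on. */
static inline unsigned int visited_bars(const int flags[6],
					unsigned int reserved_bar)
{
	unsigned int visited = 0;
	int bar, add;

	assert(flags);

	for (bar = BAR_0; bar <= BAR_5; bar += add) {
		/* A 64-bit BAR occupies two slots, so step over both. */
		add = (flags[bar] & PCI_BASE_ADDRESS_MEM_TYPE_64) ? 2 : 1;

		/* 'continue' still runs "bar += add" with add set above. */
		if (reserved_bar & (1u << bar))
			continue;

		visited |= 1u << bar;
	}

	return visited;
}
```

In this model, a reserved 64-bit BAR_0 and two consecutive reserved 32-bit BARs (BAR_0 and BAR_1) both leave BAR_2 to BAR_5 as the visited set.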

Thanks
Kishon

> 
> The PCI_BASE_ADDRESS_MEM_TYPE_64 flag in epf_bar->flag reflects the
> value set by the platform in the bar_fixed_64bit member of
> epc_features.
> 
> This patch moves the checking of  PCI_BASE_ADDRESS_MEM_TYPE_64 in
> epf_bar->flags to before the 'continue' statement to advance the 'bar'
> loop index accordingly. The comment you see about 'pci_epc_set_bar()'
> preceding the moved code is the original comment and was also moved
> along with the code.
> 
> Regards,
> Alan Mikhak
> 
> On Fri, May 24, 2019 at 1:51 AM Kishon Vijay Abraham I  wrote:
>>
>> Hi,
>>
>> On 24/05/19 5:25 AM, Alan Mikhak wrote:
>>> +Bjorn Helgaas, +Gustavo Pimentel, +Wen Yang, +Kangjie Lu
>>>
>>> On Thu, May 23, 2019 at 2:55 PM Alan Mikhak  wrote:

>>>> Always skip odd bar when skipping 64bit BARs in pci_epf_test_set_bar()
>>>> and pci_epf_test_alloc_space().
>>>>
>>>> Otherwise, pci_epf_test_set_bar() will call pci_epc_set_bar() on odd loop
>>>> index when skipping reserved 64bit BAR. Moreover, pci_epf_test_alloc_space()
>>>> will call pci_epf_alloc_space() on bind for odd loop index when BAR is 64bit
>>>> but leaks on subsequent unbind by not calling pci_epf_free_space().
>>>>
>>>> Signed-off-by: Alan Mikhak
>>>> Reviewed-by: Paul Walmsley
>>>> ---
>>>>  drivers/pci/endpoint/functions/pci-epf-test.c | 25
>>>> -
>>>>  1 file changed, 12 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
>>>> index 27806987e93b..96156a537922 100644
>>>> --- a/drivers/pci/endpoint/functions/pci-epf-test.c
>>>> +++ b/drivers/pci/endpoint/functions/pci-epf-test.c
>>>> @@ -389,7 +389,7 @@ static void pci_epf_test_unbind(struct pci_epf *epf)
>>>>
>>>>  static int pci_epf_test_set_bar(struct pci_epf *epf)
>>>>  {
>>>> -   int bar;
>>>> +   int bar, add;
>>>> int ret;
>>>> struct pci_epf_bar *epf_bar;
>>>> struct pci_epc *epc = epf->epc;
>>>> @@ -400,8 +400,14 @@ static int pci_epf_test_set_bar(struct pci_epf *epf)
>>>>
>>>> epc_features = epf_test->epc_features;
>>>>
>>>> -   for (bar = BAR_0; bar <= BAR_5; bar++) {
>>>> +   for (bar = BAR_0; bar <= BAR_5; bar += add) {
>>>> epf_bar = &epf->bar[bar];
>>>> +   /*
>>>> +* pci_epc_set_bar() sets PCI_BASE_ADDRESS_MEM_TYPE_64
>>>> +* if the specific implementation required a 64-bit BAR,
>>>> +* even if we only requested a 32-bit BAR.
>>>> +*/
>>
>> set_bar shouldn't set PCI_BASE_ADDRESS_MEM_TYPE_64. If a platform supports 
>> only
>> 64-bit BAR, that should be specified in epc_features bar_fixed_64bit member.
>>
>> Thanks
>> Kishon

Re: [PATCH] vmalloc: Don't use flush flag when no exec perm

2019-05-30 Thread Edgecombe, Rick P

On Thu, 2019-05-30 at 10:44 +0300, Meelis Roos wrote:
> > > The addition of VM_FLUSH_RESET_PERMS for BPF JIT allocations was
> > > bisected to prevent boot on an UltraSparc III machine. It was
> > > found
> > > that
> > > sometime shortly after the TLB flush this flag does on vfree of
> > > the
> > > BPF
> > > program, the machine hung. Further investigation showed that
> > > before
> > > any of
> > > the changes for this flag were introduced, with
> > > CONFIG_DEBUG_PAGEALLOC
> > > configured (which does a similar TLB flush of the vmalloc range
> > > on
> > > every vfree), this machine also hung shortly after the first
> > > vmalloc
> > > unmap/free.
> > > 
> > > So the evidence points to there being some existing issue with
> > > the
> > > vmalloc TLB flushes, but it's still unknown exactly why these
> > > hangs
> > > are
> > > happening on sparc. It is also unknown when someone with this
> > > hardware
> > > could resolve this, and in the meantime using this flag on it
> > > turns a
> > > lurking behavior into something that prevents boot.
> > 
> > The sparc TLB flush issue has been bisected and is being worked on
> > now,
> > so hopefully we won't need this patch:
> > https://marc.info/?l=linux-sparc=155915694304118=2
> 
> And the sparc64 patch that fixes CONFIG_DEBUG_PAGEALLOC also fixes
> booting
> of the latest git kernel on Sun V445 where my problem initially
> happened.
> 
Thanks Meelis. So the TLB flush on this platform will be fixed and we
won't need this patch.

[PATCH 1/3] dt-bindings: i2c: document bindings for i2c-slave-mqueue

2019-05-30 Thread Eduardo Valentin

Document the i2c-slave-mqueue binding by adding a
description, the required properties, and an example.

Cc: Rob Herring 
Cc: Mark Rutland 
Cc: Wolfram Sang 
Cc: linux-...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 .../bindings/i2c/i2c-slave-mqueue.txt | 34 +++
 1 file changed, 34 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/i2c/i2c-slave-mqueue.txt

diff --git a/Documentation/devicetree/bindings/i2c/i2c-slave-mqueue.txt 
b/Documentation/devicetree/bindings/i2c/i2c-slave-mqueue.txt
new file mode 100644
index ..eb1881a4fc0e
--- /dev/null
+++ b/Documentation/devicetree/bindings/i2c/i2c-slave-mqueue.txt
@@ -0,0 +1,34 @@
+===
+Device Tree for I2C slave message queue backend
+===
+
+Some protocols over I2C/SMBus are designed for bi-directional transferring
+messages by using I2C Master Write protocol. This requires that both sides
+of the communication have slave addresses.
+
+This I2C slave mqueue (message queue) is used to receive and queue
+messages from the remote i2c intelligent device; and it will add the target
+slave address (with R/W# bit is always 0) into the message at the first byte.
+
+Links
+
+`Intelligent Platform Management Bus
+Communications Protocol Specification
+`_
+
+`Management Component Transport Protocol (MCTP)
+SMBus/I2C Transport Binding Specification
+`_
+
+Required Properties:
+- compatible   : should be "i2c-slave-mqueue"
+- reg  : slave address
+
+Example:
+
+i2c {
+   slave_mqueue: i2c-slave-mqueue {
+   compatible = "i2c-slave-mqueue";
+   reg = <0x10>;
+   };
+};
-- 
2.21.0
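
Editorial sketch (not part of the patch): the binding text above says the backend stores the target slave address, with the R/W# bit always 0, as the first byte of each queued message. Assuming a 7-bit address, that byte can be computed as below; mqueue_addr_byte() is a hypothetical name chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: 7-bit slave address shifted left by one,
 * leaving the R/W# bit (bit 0) at 0.
 */
static inline uint8_t mqueue_addr_byte(uint8_t slave_addr_7bit)
{
	assert(slave_addr_7bit <= 0x7f);
	return (uint8_t)(slave_addr_7bit << 1);
}
```

For the reg value 0x10 used in the binding example, the first byte of each message would be 0x20 under this reading.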

[PATCH 2/3] i2c: slave-mqueue: add a slave backend to receive and queue messages

2019-05-30 Thread Eduardo Valentin

From: Haiyue Wang 

Some protocols over I2C are designed for bi-directional transfer of
messages using the I2C Master Write protocol. Protocols such as MCTP
(Management Component Transport Protocol) and IPMB (Intelligent Platform
Management Bus) both require that userspace can receive messages from
I2C drivers in slave mode.

This new slave mqueue backend receives and queues messages, and
exposes them to userspace through a sysfs bin file.

Note: DT interface and a couple of minor fixes here and there
by Eduardo, so I kept the original authorship here.

Cc: Rob Herring 
Cc: Mark Rutland 
Cc: Wolfram Sang 
Cc: linux-...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Haiyue Wang 
Signed-off-by: Eduardo Valentin 
---
 Documentation/i2c/slave-mqueue-backend.rst | 124 
 MAINTAINERS|   8 +
 drivers/i2c/Kconfig|  25 +++
 drivers/i2c/Makefile   |   1 +
 drivers/i2c/i2c-slave-mqueue.c | 211 +
 5 files changed, 369 insertions(+)
 create mode 100644 Documentation/i2c/slave-mqueue-backend.rst
 create mode 100644 drivers/i2c/i2c-slave-mqueue.c

diff --git a/Documentation/i2c/slave-mqueue-backend.rst 
b/Documentation/i2c/slave-mqueue-backend.rst
new file mode 100644
index ..376dff998fa3
--- /dev/null
+++ b/Documentation/i2c/slave-mqueue-backend.rst
@@ -0,0 +1,124 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================================
+Linux I2C slave message queue backend
+=====================================
+
+:Author: Haiyue Wang 
+
+Some protocols over I2C/SMBus are designed for bi-directional message
+transfer using the I2C Master Write protocol. This requires that both sides
+of the communication have slave addresses.
+
+MCTP (Management Component Transport Protocol) and IPMB (Intelligent
+Platform Management Bus), for example, both require that userspace can
+receive messages from i2c drivers in slave mode.
+
+This I2C slave mqueue (message queue) backend receives and queues
+messages from the remote intelligent i2c device. It prepends the target
+slave address (with the R/W# bit always 0) as the first byte of each message,
+so that userspace can use this byte to dispatch messages to different
+handling modules. The address byte is also part of some message formats:
+IPMB, for example, needs it to compute the checksum.
+
+Because messages are time-sensitive, this backend drops the oldest message
+to make room for the newest one.
+
+Link
+
+`Intelligent Platform Management Bus
+Communications Protocol Specification
+`_
+
+`Management Component Transport Protocol (MCTP)
+SMBus/I2C Transport Binding Specification
+`_
+
+How to use
+----------
+For example, if the I2C5 bus has slave address 0x10, the command below creates
+the related message queue interface:
+
+echo slave-mqueue 0x1010 > /sys/bus/i2c/devices/i2c-5/new_device
+
+Then you can dump the messages like this:
+
+hexdump -C /sys/bus/i2c/devices/5-1010/slave-mqueue
+
+Code Example
+
+*Note: call 'lseek' before 'read', this is a requirement from kernfs' design.*
+
+::
+
+  #include <sys/types.h>
+  #include <sys/stat.h>
+  #include <unistd.h>
+  #include <poll.h>
+  #include <time.h>
+  #include <fcntl.h>
+  #include <stdio.h>
+
+  int main(int argc, char *argv[])
+  {
+      int i, r;
+      struct pollfd pfd;
+      struct timespec ts;
+      unsigned char data[256];
+
+      pfd.fd = open(argv[1], O_RDONLY | O_NONBLOCK);
+      if (pfd.fd < 0)
+          return -1;
+
+      pfd.events = POLLPRI;
+
+      while (1) {
+          r = poll(&pfd, 1, 5000);    /* wait up to 5 s for a message */
+
+          if (r < 0)
+              break;
+
+          if (r == 0 || !(pfd.revents & POLLPRI))
+              continue;
+
+          lseek(pfd.fd, 0, SEEK_SET);    /* required before each read */
+          r = read(pfd.fd, data, sizeof(data));
+          if (r <= 0)
+              continue;
+
+          clock_gettime(CLOCK_MONOTONIC, &ts);
+          printf("[%ld.%.9ld] :", ts.tv_sec, ts.tv_nsec);
+          for (i = 0; i < r; i++)
+              printf(" %02x", data[i]);
+          printf("\n");
+      }
+
+      close(pfd.fd);
+
+      return 0;
+  }
+
+Result
+------
+*./a.out "/sys/bus/i2c/devices/5-1010/slave-mqueue"*
+
+::
+
+  [10183.232500449] : 20 18 c8 2c 78 01 5b
+  [10183.479358348] : 20 18 c8 2c 78 01 5b
+  [10183.726556812] : 20 18 c8 2c 78 01 5b
+  [10183.972605863] : 20 18 c8 2c 78 01 5b
+  [10184.220124772] : 20 18 c8 2c 78 01 5b
+  [10184.467764166] : 20 18 c8 2c 78 01 5b
+  [10193.233421784] : 20 18 c8 2c 7c 01 57
+  [10193.480273460] : 20 18 c8 2c 7c 01 57
+  [10193.726788733] : 20 18 c8 2c 7c 01 57
+

[PATCH 3/3] Documentation: ABI: Add i2c-slave-mqueue sysfs documentation

2019-05-30 Thread Eduardo Valentin

Document the slave-mqueue sysfs attribute used by
the i2c-slave-mqueue driver.

Cc: Rob Herring 
Cc: Mark Rutland 
Cc: Wolfram Sang 
Cc: linux-...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin 
---
 .../ABI/testing/sysfs-bus-i2c-devices-slave-mqueue | 10 ++
 1 file changed, 10 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-i2c-devices-slave-mqueue

diff --git a/Documentation/ABI/testing/sysfs-bus-i2c-devices-slave-mqueue 
b/Documentation/ABI/testing/sysfs-bus-i2c-devices-slave-mqueue
new file mode 100644
index ..28318108ce85
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-i2c-devices-slave-mqueue
@@ -0,0 +1,10 @@
+What:  /sys/bus/i2c/devices/*/slave-mqueue
+Date:  May 2019
+KernelVersion: 5.2
+Contact:   Eduardo Valentin 
+Description:
+   Reading from this file will return exactly one message,
+   when available, of the i2c-slave-mqueue device attached
+   to that bus. Userspace can also poll on this file to
+   get notified when new messages are available.
+Users: i2c-slave-mqueue driver
-- 
2.21.0

[PATCH 0/3] introduce i2c-slave-mqueue

2019-05-30 Thread Eduardo Valentin

Wolfram,

I am sending you the i2c-slave-mqueue driver.
Apparently Haiyue had to move on to another project and, after some
time waiting for feedback, does not have cycles to continue addressing
the comments on this driver.
That is essentially why I took over.

Here is a small changelog from V5 to V6:
- Added DT support for probing via Device Tree
- Documented DT bindings
- Documented sysfs ABI
- Small fixes on wording and Kconfig entries.

Haiyue's V5: https://lkml.org/lkml/2018/4/23/835

BR,

Eduardo Valentin (2):
  dt-bindings: i2c: document bindings for i2c-slave-mqueue
  Documentation: ABI: Add i2c-slave-mqueue sysfs documentation

Haiyue Wang (1):
  i2c: slave-mqueue: add a slave backend to receive and queue messages

 .../sysfs-bus-i2c-devices-slave-mqueue|  10 +
 .../bindings/i2c/i2c-slave-mqueue.txt |  34 +++
 Documentation/i2c/slave-mqueue-backend.rst| 124 ++
 MAINTAINERS   |   8 +
 drivers/i2c/Kconfig   |  25 +++
 drivers/i2c/Makefile  |   1 +
 drivers/i2c/i2c-slave-mqueue.c| 211 ++
 7 files changed, 413 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-i2c-devices-slave-mqueue
 create mode 100644 Documentation/devicetree/bindings/i2c/i2c-slave-mqueue.txt
 create mode 100644 Documentation/i2c/slave-mqueue-backend.rst
 create mode 100644 drivers/i2c/i2c-slave-mqueue.c

-- 
2.21.0

Re: [PATCH net-next 0/5] PTP support for the SJA1105 DSA driver

2019-05-30 Thread Richard Cochran

On Thu, May 30, 2019 at 06:23:09PM +0300, Vladimir Oltean wrote:
> On Thu, 30 May 2019 at 18:06, Richard Cochran  
> wrote:
> >
> > But are the frames received in the same order?  What happens if your MAC
> > drops a frame?
> >
> 
> If it drops a normal frame, it carries on.
> If it drops a meta frame, it prints "Expected meta frame", resets the
> state machine and carries on.
> If it drops a timestampable frame, it prints "Unexpected meta frame",
> resets the state machine and carries on.

What I meant was, consider how dropped frames in the MAC will spoil
any chance that the driver has to correctly match time stamps with
frames.

Thanks,
Richard

Re: [GIT PULL] arm64: fixes for -rc3

2019-05-30 Thread pr-tracker-bot

The pull request you sent on Thu, 30 May 2019 17:11:26 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git tags/arm64-fixes

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/adc3f554fa1e0f1c7b76007150814e1d8a5fcd2b

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT] Networking

2019-05-30 Thread pr-tracker-bot

The pull request you sent on Thu, 30 May 2019 16:05:06 -0700 (PDT):

> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net refs/heads/master

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/036e34310931e64ce4f1edead435708cd517db10

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [PATCH v2 0/6] mm/devm_memremap_pages: Fix page release race

2019-05-30 Thread Dan Williams

On Mon, May 13, 2019 at 12:22 PM Logan Gunthorpe  wrote:
>
>
>
> On 2019-05-08 11:05 a.m., Logan Gunthorpe wrote:
> >
> >
> > On 2019-05-07 5:55 p.m., Dan Williams wrote:
> >> Changes since v1 [1]:
> >> - Fix a NULL-pointer deref crash in pci_p2pdma_release() (Logan)
> >>
> >> - Refresh the p2pdma patch headers to match the format of other p2pdma
> >>patches (Bjorn)
> >>
> >> - Collect Ira's reviewed-by
> >>
> >> [1]: 
> >> https://lore.kernel.org/lkml/155387324370.2443841.574715745262628837.st...@dwillia2-desk3.amr.corp.intel.com/
> >
> > This series looks good to me:
> >
> > Reviewed-by: Logan Gunthorpe 
> >
> > However, I haven't tested it yet but I intend to later this week.
>
> I've tested libnvdimm-pending which includes this series on my setup and
> everything works great.

Hi Andrew,

With this tested-by can we move forward on this fix set? I'm not aware
of any other remaining comments. Greg had a question about
"drivers/base/devres: Introduce devm_release_action()" that I
answered, but otherwise the feedback has gone silent.

Re: [PATCH] arm64: dts: rockchip: Add missing PCIe pwr amd rst configuration

2019-05-30 Thread Manivannan Sadhasivam

Hi,

On Thu, May 30, 2019 at 12:58:37PM +, Anand Moon wrote:
> This patch add missing PCIe gpio and pinctrl for power (#PCIE_PWR)
> also add PCIe gpio and pinctrl for reset (#PCIE_PERST_L).
> 
> Signed-off-by: Anand Moon 
> ---
> Tested on Rock960 Model A
> ---
>  arch/arm64/boot/dts/rockchip/rk3399-rock960.dtsi | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/boot/dts/rockchip/rk3399-rock960.dtsi 
> b/arch/arm64/boot/dts/rockchip/rk3399-rock960.dtsi
> index c7d48d41e184..f5bef6b0fe89 100644
> --- a/arch/arm64/boot/dts/rockchip/rk3399-rock960.dtsi
> +++ b/arch/arm64/boot/dts/rockchip/rk3399-rock960.dtsi
> @@ -55,9 +55,10 @@
>  
>   vcc3v3_pcie: vcc3v3-pcie-regulator {
>   compatible = "regulator-fixed";
> + gpio = < RK_PA2 GPIO_ACTIVE_HIGH>;
>   enable-active-high;
>   pinctrl-names = "default";
> - pinctrl-0 = <_drv>;
> + pinctrl-0 = <_drv _pwr>;
>   regulator-boot-on;
>   regulator-name = "vcc3v3_pcie";
>   regulator-min-microvolt = <330>;
> @@ -381,9 +382,10 @@
>  };
>  
>   {
> + ep-gpio = < RK_PD4 GPIO_ACTIVE_HIGH>;
>   num-lanes = <4>;
>   pinctrl-names = "default";
> - pinctrl-0 = <_clkreqn_cpm>;
> + pinctrl-0 = <_clkreqn_cpm _perst_l>;
>   vpcie3v3-supply = <_pcie>;
>   status = "okay";
>  };
> @@ -408,6 +410,16 @@
>   };
>   };
>  
> + pcie {
> + pcie_pwr: pcie-pwr {
> + rockchip,pins = <2 RK_PA2 RK_FUNC_GPIO _pull_none>;
> + };
> +
> + pcie_perst_l:pcie-perst-l {
> + rockchip,pins = <2 RK_PD4 RK_FUNC_GPIO _pull_none>;
> + };

Which schematics did you refer to? According to Rock960 v2.1 schematics [1], below
is the pin mapping for PCI-E PWR and PERST:

PCIE_PERST - GPIO2_A2
PCIE_PWR - GPIO2_A5

Above mapping holds true for Rock960 version 1.1, 1.2 and 1.3. Also,
rk3399-rock960.dtsi is common for both Rock960 and Ficus boards, so the board
specific parts should go to rk3399-rock960.dts and rk3399-ficus.dts.

Thanks,
Mani

[1] https://dl.vamrs.com/products/rock960/docs/hw/rock960_sch_v12_20180314.pdf
> + };
> +
>   sdmmc {
>   sdmmc_bus1: sdmmc-bus1 {
>   rockchip,pins =
> -- 
> 2.21.0
>

Re: [GIT PULL] configfs fix for 5.2

2019-05-30 Thread pr-tracker-bot

The pull request you sent on Thu, 30 May 2019 10:53:21 +0200:

> git://git.infradead.org/users/hch/configfs.git tags/configfs-for-5.2-2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/8cb7104d03dddeb2f28e590b2d1fab7bf0eef284

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] sound fixes for 5.2-rc3

2019-05-30 Thread pr-tracker-bot

The pull request you sent on Thu, 30 May 2019 10:51:17 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git 
> tags/sound-5.2-rc3

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/c5ba1712661233ce0f4666b8c3dee5bb78d380f2

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [PATCH v3 1/3] PCI: Introduce pcibios_ignore_alignment_request

2019-05-30 Thread Alexey Kardashevskiy




On 31/05/2019 08:49, Shawn Anastasio wrote:
> On 5/29/19 10:39 PM, Alexey Kardashevskiy wrote:
>>
>>
>> On 28/05/2019 17:39, Shawn Anastasio wrote:
>>>
>>>
>>> On 5/28/19 1:27 AM, Alexey Kardashevskiy wrote:


 On 28/05/2019 15:36, Oliver wrote:
> On Tue, May 28, 2019 at 2:03 PM Shawn Anastasio 
> wrote:
>>
>> Introduce a new pcibios function pcibios_ignore_alignment_request
>> which allows the PCI core to defer to platform-specific code to
>> determine whether or not to ignore alignment requests for PCI
>> resources.
>>
>> The existing behavior is to simply ignore alignment requests when
>> PCI_PROBE_ONLY is set. This behavior is maintained by the
>> default implementation of pcibios_ignore_alignment_request.
>>
>> Signed-off-by: Shawn Anastasio 
>> ---
>>    drivers/pci/pci.c   | 9 +++--
>>    include/linux/pci.h | 1 +
>>    2 files changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 8abc843b1615..8207a09085d1 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -5882,6 +5882,11 @@ resource_size_t __weak
>> pcibios_default_alignment(void)
>>   return 0;
>>    }
>>
>> +int __weak pcibios_ignore_alignment_request(void)
>> +{
>> +   return pci_has_flag(PCI_PROBE_ONLY);
>> +}
>> +
>>    #define RESOURCE_ALIGNMENT_PARAM_SIZE COMMAND_LINE_SIZE
>>    static char
>> resource_alignment_param[RESOURCE_ALIGNMENT_PARAM_SIZE] = {0};
>>    static DEFINE_SPINLOCK(resource_alignment_lock);
>> @@ -5906,9 +5911,9 @@ static resource_size_t
>> pci_specified_resource_alignment(struct pci_dev *dev,
>>   p = resource_alignment_param;
>>   if (!*p && !align)
>>   goto out;
>> -   if (pci_has_flag(PCI_PROBE_ONLY)) {
>> +   if (pcibios_ignore_alignment_request()) {
>>   align = 0;
>> -   pr_info_once("PCI: Ignoring requested alignments
>> (PCI_PROBE_ONLY)\n");
>> +   pr_info_once("PCI: Ignoring requested alignments\n");
>>   goto out;
>>   }
>
> I think the logic here is questionable to begin with. If the user has
> explicitly requested re-aligning a resource via the command line then
> we should probably do it even if PCI_PROBE_ONLY is set. When it breaks
> they get to keep the pieces.
>
> That said, the real issue here is that PCI_PROBE_ONLY probably
> shouldn't be set under qemu/kvm. Under the other hypervisor (PowerVM)
> hotplugged devices are configured by firmware before it's passed to
> the guest and we need to keep the FW assignments otherwise things
> break. QEMU however doesn't do any BAR assignments and relies on that
> being handled by the guest. At boot time this is done by SLOF, but
> Linux only keeps SLOF around until it's extracted the device-tree.
> Once that's done SLOF gets blown away and the kernel needs to do its
> own BAR assignments. I'm guessing there's a hack in there to make it
> work today, but it's a little surprising that it works at all...


 The hack is to run a modified qemu-aware "/usr/sbin/rtas_errd" in the
 guest which receives an event from qemu (RAS_EPOW from
 /proc/interrupts), fetches device tree chunks (and as I understand it -
 they come with BARs from phyp but without from qemu) and writes "1" to
 "/sys/bus/pci/rescan" which calls pci_assign_resource() eventually:
>>>
>>> Interesting. Does this mean that the PHYP hotplug path doesn't
>>> call pci_assign_resource?
>>
>>
>> I'd expect dlpar_add_slot() to be called under phyp and eventually
>> pci_device_add() which (I think) may or may not trigger later
>> reassignment.
>>
>>
>>> If so it means the patch may not
>>> break that platform after all, though it still may not be
>>> the correct way of doing things.
>>
>>
>> We should probably stop enforcing the PCI_PROBE_ONLY flag - it seems
>> that (unless resource_alignment= is used) the pseries guest should just
>> walk through all allocated resources and leave them unchanged.
> 
> If we add a pcibios_default_alignment() implementation like was
> suggested earlier, then it will behave as if the user has
> specified resource_alignment= by default and SLOF's assignments
> won't be honored (I think).


I removed pci_add_flags(PCI_PROBE_ONLY) from pSeries_setup_arch and
tried booting with and without pci=resource_alignment= and I can see no
difference - BARs are still aligned to 64K as programmed in SLOF; if I
hack SLOF to align to 4K or 32K - BARs get packed and the guest leaves
them unchanged.


> I guess it boils down to one question - is it important that we
> observe SLOF's initial BAR assignments?

It isn't if it's SLOF but it is if it's phyp. It used to not
allow/support BAR reassignment and even if it

[PATCH v2 00/17] net: introduce Qualcomm IPA driver

2019-05-30 Thread Alex Elder

This series presents the driver for the Qualcomm IP Accelerator (IPA).

This is version 2 of the series.  This version has addressed almost
all of the feedback received in the first version:
  https://lore.kernel.org/lkml/20190512012508.10608-1-el...@linaro.org/
More detail is included in the individual patches, but here is a
high-level summary of what's changed since then:
  - Two spinlocks have been removed.
      - The code for enabling and disabling endpoint interrupts has
        been simplified considerably, and the spinlock is no longer
        required.
      - A spinlock used when updating ring buffer pointers is no
        longer needed.  Integers indexing the ring are used instead
        (and they don't even have to be atomic).
  - One spinlock remains to protect list updates, but it is always
    acquired using spin_lock_bh() (no more irqsave).
  - Information about the queueing and completion of messages is now
    supplied to the network stack in batches rather than one at a
    time.
  - I/O completion handling has been simplified, with the IRQ
    handler now consisting mainly of disabling the interrupt and
    calling napi_schedule().
  - Some comments have been updated and improved throughout.

What follows is the introduction supplied with v1 of the series.

-

The IPA is a component present in some Qualcomm SoCs that allows
network functions such as aggregation, filtering, routing, and NAT
to be performed without active involvement of the main application
processor (AP).

Initially, these advanced features are disabled; the IPA driver
simply provides a network interface that makes the modem's LTE
network available to the AP.  In addition, only support for the
IPA found in the Qualcomm SDM845 SoC is provided.

This code is derived from a driver developed internally by Qualcomm.
A version of the original source can be seen here:
  https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree
in the "drivers/platform/msm/ipa" directory.  Many were involved in
developing this, but the following individuals deserve explicit
acknowledgement for their substantial contributions:

Abhishek Choubey
Ady Abraham
Chaitanya Pratapa
David Arinzon
Ghanim Fodi
Gidon Studinski
Ravi Gummadidala
Shihuan Liu
Skylar Chang

A version of this code was posted in November 2018 as an RFC.
  https://lore.kernel.org/lkml/20181107003250.5832-1-el...@linaro.org/
All feedback received was addressed.  The code has undergone
considerable further rework since that time, and most of the
"future work" described then has now been completed.

This code is available in buildable form here, based on kernel
v5.2-rc1:
  remote: ssh://g...@git.linaro.org/people/alex.elder/linux.git
  branch: ipa-v2_kernel-v5.2-rc2
75adf2ac1266 arm64: defconfig: enable build of IPA code

The branch depends on a commit now found in net-next.  It has
been cherry-picked, and (in this branch) has this commit ID:
  13c627b5a078 net: qualcomm: rmnet: Move common struct definitions to include
by 

-Alex

Alex Elder (17):
  bitfield.h: add FIELD_MAX() and field_max()
  dt-bindings: soc: qcom: add IPA bindings
  soc: qcom: ipa: main code
  soc: qcom: ipa: configuration data
  soc: qcom: ipa: clocking, interrupts, and memory
  soc: qcom: ipa: GSI headers
  soc: qcom: ipa: the generic software interface
  soc: qcom: ipa: GSI transactions
  soc: qcom: ipa: IPA interface to GSI
  soc: qcom: ipa: IPA endpoints
  soc: qcom: ipa: immediate commands
  soc: qcom: ipa: IPA network device and microcontroller
  soc: qcom: ipa: AP/modem communications
  soc: qcom: ipa: support build of IPA code
  MAINTAINERS: add entry for the Qualcomm IPA driver
  arm64: dts: sdm845: add IPA information
  arm64: defconfig: enable build of IPA code

 .../devicetree/bindings/net/qcom,ipa.yaml |  180 ++
 MAINTAINERS   |6 +
 arch/arm64/boot/dts/qcom/sdm845.dtsi  |   51 +
 arch/arm64/configs/defconfig  |1 +
 drivers/net/Kconfig   |2 +
 drivers/net/Makefile  |1 +
 drivers/net/ipa/Kconfig   |   16 +
 drivers/net/ipa/Makefile  |7 +
 drivers/net/ipa/gsi.c | 1635 +
 drivers/net/ipa/gsi.h |  246 +++
 drivers/net/ipa/gsi_private.h |  148 ++
 drivers/net/ipa/gsi_reg.h |  376 
 drivers/net/ipa/gsi_trans.c   |  624 +++
 drivers/net/ipa/gsi_trans.h   |  116 ++
 drivers/net/ipa/ipa.h |  131 ++
 drivers/net/ipa/ipa_clock.c   |  297 +++
 drivers/net/ipa/ipa_clock.h   |   52 +
 drivers/net/ipa/ipa_cmd.c |  377 
 drivers/net/ipa/ipa_cmd.h |  116 ++
 drivers/net/ipa/ipa_data-sdm845.c |  245 +++

[PATCH v2 02/17] dt-bindings: soc: qcom: add IPA bindings

2019-05-30 Thread Alex Elder

Add the binding definitions for the "qcom,ipa" device tree node.

Signed-off-by: Alex Elder 
---
 .../devicetree/bindings/net/qcom,ipa.yaml | 180 ++
 1 file changed, 180 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/qcom,ipa.yaml

diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml 
b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
new file mode 100644
index ..0037fc278a61
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
@@ -0,0 +1,180 @@
+# SPDX-License-Identifier: GPL-2.0
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/qcom,ipa.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm IP Accelerator (IPA)
+
+maintainers:
+  - Alex Elder 
+
+description:
+  This binding describes the Qualcomm IPA.  The IPA is capable of offloading
+  certain network processing tasks (e.g. filtering, routing, and NAT) from
+  the main processor.
+
+  The IPA sits between multiple independent "execution environments,"
+  including the Application Processor (AP) and the modem.  The IPA presents
+  a Generic Software Interface (GSI) to each execution environment.
+  The GSI is an integral part of the IPA, but it is logically isolated
+  and has a distinct interrupt and a separately-defined address space.
+
+  See also soc/qcom/qcom,smp2p.txt and interconnect/interconnect.txt.
+
+  - |
+    --------             ---------
+    |      |             |       |
+    |  AP  +<---.   .----+ Modem |
+    |      +--. |   | .->+       |
+    |      |  | |   | |  |       |
+    --------  | |   | |  ---------
+              v |   v |
+            --+-+---+-+--
+            |    GSI    |
+            |-----------|
+            |           |
+            |    IPA    |
+            |           |
+            -------------
+
+properties:
+  compatible:
+  const: "qcom,sdm845-ipa"
+
+  reg:
+items:
+  - description: IPA registers
+  - description: IPA shared memory
+  - description: GSI registers
+
+  reg-names:
+items:
+  - const: ipa-reg
+  - const: ipa-shared
+  - const: gsi
+
+  clocks:
+maxItems: 1
+
+  clock-names:
+  const: core
+
+  interrupts:
+items:
+  - description: IPA interrupt (hardware IRQ)
+  - description: GSI interrupt (hardware IRQ)
+  - description: Modem clock query interrupt (smp2p interrupt)
+  - description: Modem setup ready interrupt (smp2p interrupt)
+
+  interrupt-names:
+items:
+  - const: ipa
+  - const: gsi
+  - const: ipa-clock-query
+  - const: ipa-setup-ready
+
+  interconnects:
+items:
+  - description: Interconnect path between IPA and main memory
+  - description: Interconnect path between IPA and internal memory
+  - description: Interconnect path between IPA and the AP subsystem
+
+  interconnect-names:
+items:
+  - const: memory
+  - const: imem
+  - const: config
+
+  qcom,smem-states:
+description: State bits used by the AP to signal the modem.
+items:
+- description: Whether the "ipa-clock-enabled" state bit is valid
+- description: Whether the IPA clock is enabled (if valid)
+
+  qcom,smem-state-names:
+description: The names of the state bits used for SMP2P output
+items:
+  - const: ipa-clock-enabled-valid
+  - const: ipa-clock-enabled
+
+  modem-init:
+type: boolean
+description:
+  If present, it indicates that the modem is responsible for
+  performing early IPA initialization, including loading and
+  validating firmware used by the GSI.
+
+  memory-region:
+maxItems: 1
+description:
+  If present, a phandle for a reserved memory area that holds
+  the firmware passed to Trust Zone for authentication.  Required
+  when Trust Zone (not the modem) performs early initialization.
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - interrupts
+  - interconnects
+  - qcom,smem-states
+
+oneOf:
+  - required:
+- modem-init
+  - required:
+- memory-region
+
+examples:
+  - |
+smp2p-mpss {
+compatible = "qcom,smp2p";
+ipa_smp2p_out: ipa-ap-to-modem {
+qcom,entry-name = "ipa";
+#qcom,smem-state-cells = <1>;
+};
+
+ipa_smp2p_in: ipa-modem-to-ap {
+qcom,entry-name = "ipa";
+interrupt-controller;
+#interrupt-cells = <2>;
+};
+};
+ipa@1e4 {
+compatible = "qcom,sdm845-ipa";
+
+modem-init;
+
+reg = <0 0x1e4 0 0x7000>,
+<0 0x1e47000 0 0x2000>,
+<0 0x1e04000 0 0x2c000>;
+reg-names = "ipa-reg",
+"ipa-shared",
+"gsi";
+
+interrupts-extended = < 0 311

[PATCH v2 06/17] soc: qcom: ipa: GSI headers

2019-05-30 Thread Alex Elder

The Generic Software Interface is a layer of the IPA driver that
abstracts the underlying hardware.  The next patch includes the
main code for GSI (including some additional documentation).  This
patch just includes three GSI header files.

  - "gsi.h" is the top-level GSI header file.  There is one of these
associated with the IPA structure; in fact, it is embedded within
the IPA structure.  (Were it not embedded this way, many of the
    structures defined here could be private to GSI code.)
The main abstraction implemented by the GSI code is the channel,
and this header exposes several operations that can be performed
on a GSI channel.

  - "gsi_private.h" exposes some definitions that are intended to be
private, used only by the main GSI code and the GSI transaction
code (defined in an upcoming patch).

  - Like "ipa_reg.h", "gsi_reg.h" defines the offsets of the 32-bit
registers used by the GSI layer, along with masks that define the
position and width of fields less than 32 bits located within
these registers.

Signed-off-by: Alex Elder 
---
 drivers/net/ipa/gsi.h | 246 ++
 drivers/net/ipa/gsi_private.h | 148 +
 drivers/net/ipa/gsi_reg.h | 376 ++
 3 files changed, 770 insertions(+)
 create mode 100644 drivers/net/ipa/gsi.h
 create mode 100644 drivers/net/ipa/gsi_private.h
 create mode 100644 drivers/net/ipa/gsi_reg.h

diff --git a/drivers/net/ipa/gsi.h b/drivers/net/ipa/gsi.h
new file mode 100644
index ..872ca682853a
--- /dev/null
+++ b/drivers/net/ipa/gsi.h
@@ -0,0 +1,246 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _GSI_H_
+#define _GSI_H_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define GSI_CHANNEL_MAX    14
+#define GSI_EVT_RING_MAX   10
+
+struct device;
+struct scatterlist;
+struct platform_device;
+
+struct gsi;
+struct gsi_trans;
+struct gsi_channel_data;
+struct gsi_ipa_endpoint_data;
+
+/* Execution environment IDs */
+enum gsi_ee_id {
+   GSI_EE_AP       = 0,
+   GSI_EE_MODEM    = 1,
+   GSI_EE_UC       = 2,
+   GSI_EE_TZ       = 3,
+};
+
+/* Channel operation statistics, aggregated across all channels */
+struct gsi_channel_stats {
+   u64 allocate;
+   u64 start;
+   u64 stop;
+   u64 reset;
+   u64 free;
+};
+
+struct gsi_ring {
+   void *virt;         /* ring array base address */
+   dma_addr_t addr;    /* primarily low 32 bits used */
+   u32 count;          /* number of elements in ring */
+
+   /* The ring index value indicates the next "open" entry in the ring.
+*
+* A channel ring consists of TRE entries filled by the AP and passed
+* to the hardware for processing.  For a channel ring, the ring index
+* identifies the next unused entry to be filled by the AP.
+*
+* An event ring consists of event structures filled by the hardware
+* and passed to the AP.  For event rings, the ring index identifies
+* the next ring entry that is not known to have been filled by the
+* hardware.
+*/
+   u32 index;
+};
+
+struct gsi_trans_info {
+   struct gsi_trans **map;         /* TRE -> transaction map */
+   u32 pool_count;                 /* # transactions in the pool */
+   struct gsi_trans *pool;         /* transaction allocation pool */
+   u32 pool_free;                  /* next free trans in pool (modulo) */
+   u32 sg_pool_count;              /* # SGs in the allocation pool */
+   struct scatterlist *sg_pool;    /* SG allocation pool */
+   u32 sg_pool_free;               /* next free SG pool entry */
+
+   atomic_t tre_avail;             /* # unallocated TREs in ring */
+   spinlock_t spinlock;            /* protects updates to the lists */
+   struct list_head alloc;         /* allocated, not committed */
+   struct list_head pending;       /* committed, awaiting completion */
+   struct list_head complete;      /* completed, awaiting poll */
+   struct list_head polled;        /* returned by gsi_channel_poll_one() */
+};
+
+/* Hardware values signifying the state of a channel */
+enum gsi_channel_state {
+   GSI_CHANNEL_STATE_NOT_ALLOCATED = 0x0,
+   GSI_CHANNEL_STATE_ALLOCATED = 0x1,
+   GSI_CHANNEL_STATE_STARTED   = 0x2,
+   GSI_CHANNEL_STATE_STOPPED   = 0x3,
+   GSI_CHANNEL_STATE_STOP_IN_PROC  = 0x4,
+   GSI_CHANNEL_STATE_ERROR = 0xf,
+};
+
+/* We only care about channels between IPA and AP */
+struct gsi_channel {
+   struct gsi *gsi;
+   u32 toward_ipa;                 /* 0: IPA->AP; 1: AP->IPA */
+
+   const struct gsi_channel_data *data;    /* initialization data */
+
+   struct completion

Re: linux-next: manual merge of the userns tree with the arc-current tree

2019-05-30 Thread Stephen Rothwell

Hi Vineet,

On Thu, 30 May 2019 17:11:33 + Vineet Gupta  
wrote:
>
> Thx for this. Unfortunately I had to force push my for-next due to broken #7 
> and
> #8 above. So you may have to do this once again.

Thanks for the heads up, but "git rerere" seems to have still coped, so
it's all good.

-- 
Cheers,
Stephen Rothwell



[PATCH v2 07/17] soc: qcom: ipa: the generic software interface

2019-05-30 Thread Alex Elder

This patch includes "gsi.c", which implements the generic software
interface (GSI) for IPA.  The generic software interface abstracts
channels, which provide a means of transferring data either from the
AP to the IPA, or from the IPA to the AP.  A ring buffer of "transfer
elements" (TREs) is used to describe data transfers to perform.  The
AP writes a doorbell register associated with a channel to let it know
it has added new entries (for an AP->IPA channel) or has finished
processing entries (for an IPA->AP channel).

Each channel also has an event ring buffer, used by the IPA to
communicate information about events related to a channel (for
example, the completion of TREs).  The IPA writes its own doorbell
register, which triggers an interrupt on the AP, to signal that
new event information has arrived.

Signed-off-by: Alex Elder 
---
 drivers/net/ipa/gsi.c | 1635 +
 1 file changed, 1635 insertions(+)
 create mode 100644 drivers/net/ipa/gsi.c

diff --git a/drivers/net/ipa/gsi.c b/drivers/net/ipa/gsi.c
new file mode 100644
index ..a749d3b0d792
--- /dev/null
+++ b/drivers/net/ipa/gsi.c
@@ -0,0 +1,1635 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "gsi.h"
+#include "gsi_reg.h"
+#include "gsi_private.h"
+#include "gsi_trans.h"
+#include "ipa_gsi.h"
+#include "ipa_data.h"
+
+/**
+ * DOC: The IPA Generic Software Interface
+ *
+ * The generic software interface (GSI) is an integral component of the IPA,
+ * providing a well-defined communication layer between the AP subsystem
+ * and the IPA core.  The modem uses the GSI layer as well.
+ *
+ *  -
+ * |  | |   |
+ * |  AP  +<---.   .+ Modem |
+ * |  +--. |   | .->+   |
+ * |  |  | |   | |  |   |
+ *   | |   | |  -
+ *   v |   v |
+ * --+-+---+-+--
+ * |GSI|
+ * |---|
+ * |   |
+ * |IPA|
+ * |   |
+ * -
+ *
+ * In the above diagram, the AP and Modem represent "execution environments"
+ * (EEs), which are independent operating environments that use the IPA for
+ * data transfer.
+ *
+ * Each EE uses a set of unidirectional GSI "channels," which allow transfer
+ * of data to or from the IPA.  A channel is implemented as a ring buffer,
+ * with a DRAM-resident array of "transfer elements" (TREs) available to
+ * describe transfers to or from other EEs through the IPA.  A transfer
+ * element can also contain an immediate command, requesting the IPA perform
+ * actions other than data transfer.
+ *
+ * Each TRE refers to a block of data--also located in DRAM.  After writing one
+ * or more TREs to a channel, the writer (either the IPA or an EE) writes a
+ * doorbell register to inform the receiving side how many elements have
+ * been written.  Writing to a doorbell register triggers within the GSI.
+ *
+ * Each channel has a GSI "event ring" associated with it.  An event ring
+ * is implemented very much like a channel ring, but is always directed from
+ * the IPA to an EE.  The IPA notifies an EE (such as the AP) about channel
+ * events by adding an entry to the event ring associated with the channel.
+ * The GSI then writes its doorbell for the event ring, causing the target
+ * EE to be interrupted.  Each entry in an event ring contains a pointer
+ * to the channel TRE whose completion the event represents.
+ *
+ * Each TRE in a channel ring has a set of flags.  One flag indicates whether
+ * the completion of the transfer operation generates an entry (and possibly
+ * an interrupt) in the channel's event ring.  Other flags allow transfer
+ * elements to be chained together, forming a single logical transaction.
+ * TRE flags are used to control whether and when interrupts are generated
+ * to signal completion of channel transfers.
+ *
+ * Elements in channel and event rings are completed (or consumed) strictly
+ * in order.  Completion of one entry implies the completion of all preceding
+ * entries.  A single completion interrupt can therefore communicate the
+ * completion of many transfers.
+ *
+ * Note that all GSI registers are little-endian, which is the assumed
+ * endianness of I/O space accesses.  The accessor functions perform byte
+ * swapping if needed (i.e., for a big endian CPU).
+ */
+
+/* Delay period for interrupt moderation (in 32KHz IPA internal timer ticks) */
+#define IPA_GSI_EVT_RING_INT_MODT  (32 * 1) /* 1ms under 32KHz clock */
+
+#define GSI_CMD_TIMEOUT    5   /* seconds */
+
+#define GSI_MHI_ER_START   10  /* First reserved event number */
+#define GSI_MHI_ER_END 16

[PATCH v2 03/17] soc: qcom: ipa: main code

2019-05-30 Thread Alex Elder

This patch includes three source files that represent some basic "main
program" code for the IPA driver.  They are:
  - "ipa.h" defines the top-level IPA structure which represents an IPA
 device throughout the code.
  - "ipa_main.c" contains the platform driver probe function, along with
some general code used during initialization.
  - "ipa_reg.h" defines the offsets of the 32-bit registers used for the
IPA device, along with masks that define the position and width of
fields less than 32 bits located within these registers.

Each file includes some documentation that provides a little more
overview of how the code is organized and used.

Signed-off-by: Alex Elder 
---
 drivers/net/ipa/ipa.h  | 131 ++
 drivers/net/ipa/ipa_main.c | 921 +
 drivers/net/ipa/ipa_reg.h  | 279 +++
 3 files changed, 1331 insertions(+)
 create mode 100644 drivers/net/ipa/ipa.h
 create mode 100644 drivers/net/ipa/ipa_main.c
 create mode 100644 drivers/net/ipa/ipa_reg.h

diff --git a/drivers/net/ipa/ipa.h b/drivers/net/ipa/ipa.h
new file mode 100644
index ..c580254d1e0e
--- /dev/null
+++ b/drivers/net/ipa/ipa.h
@@ -0,0 +1,131 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _IPA_H_
+#define _IPA_H_
+
+#include 
+#include 
+#include 
+#include 
+
+#include "gsi.h"
+#include "ipa_qmi.h"
+#include "ipa_endpoint.h"
+#include "ipa_interrupt.h"
+
+struct clk;
+struct icc_path;
+struct net_device;
+struct platform_device;
+
+struct ipa_clock;
+struct ipa_smp2p;
+struct ipa_interrupt;
+
+/**
+ * struct ipa - IPA information
+ * @gsi:               Embedded GSI structure
+ * @pdev:              Platform device
+ * @smp2p:             SMP2P information
+ * @clock:             IPA clocking information
+ * @suspend_ref:       Whether clock reference preventing suspend taken
+ * @route_virt:        Virtual address of routing table
+ * @route_addr:        DMA address for routing table
+ * @filter_virt:       Virtual address of filter table
+ * @filter_addr:       DMA address for filter table
+ * @interrupt:         IPA Interrupt information
+ * @uc_loaded:         Non-zero when microcontroller has reported it's ready
+ * @ipa_phys:          Physical address of IPA memory space
+ * @ipa_virt:          Virtual address for IPA memory space
+ * @reg_virt:          Virtual address used for IPA register access
+ * @shared_phys:       Physical address of memory space shared with modem
+ * @shared_virt:       Virtual address of memory space shared with modem
+ * @shared_offset:     Additional offset used for shared memory
+ * @wakeup:            Wakeup source information
+ * @filter_support:    Bit mask indicating endpoints that support filtering
+ * @initialized:       Bit mask indicating endpoints initialized
+ * @set_up:            Bit mask indicating endpoints set up
+ * @enabled:           Bit mask indicating endpoints enabled
+ * @suspended:         Bit mask indicating endpoints suspended
+ * @endpoint:          Array of endpoint information
+ * @endpoint_map:      Mapping of GSI channel to IPA endpoint information
+ * @command_endpoint:  Endpoint used for command TX
+ * @default_endpoint:  Endpoint used for default route RX
+ * @modem_netdev:      Network device structure used for modem
+ * @setup_complete:    Flag indicating whether setup stage has completed
+ * @qmi:               QMI information
+ */
+struct ipa {
+   struct gsi gsi;
+   struct platform_device *pdev;
+   struct ipa_smp2p *smp2p;
+   struct ipa_clock *clock;
+   atomic_t suspend_ref;
+
+   void *route_virt;
+   dma_addr_t route_addr;
+   void *filter_virt;
+   dma_addr_t filter_addr;
+
+   struct ipa_interrupt *interrupt;
+   u32 uc_loaded;
+
+   phys_addr_t reg_phys;
+   void __iomem *reg_virt;
+   phys_addr_t shared_phys;
+   void *shared_virt;
+   u32 shared_offset;
+
+   struct wakeup_source wakeup;
+
+   /* Bit masks indicating endpoint state */
+   u32 filter_support;
+   u32 initialized;
+   u32 set_up;
+   u32 enabled;
+   u32 suspended;
+
+   struct ipa_endpoint endpoint[IPA_ENDPOINT_MAX];
+   struct ipa_endpoint *endpoint_map[GSI_CHANNEL_MAX];
+   struct ipa_endpoint *command_endpoint;  /* TX */
+   struct ipa_endpoint *default_endpoint;  /* Default route RX */
+
+   struct net_device *modem_netdev;
+   u32 setup_complete;
+
+   struct ipa_qmi qmi;
+};
+
+/**
+ * ipa_setup() - Perform IPA setup
+ * @ipa:   IPA pointer
+ *
+ * IPA initialization is broken into stages:  init; config; setup; and
+ * sometimes enable.  (These have inverses exit, deconfig, teardown, and
+ * disable.)  Activities performed at the init stage can be done without
+ * requiring any access to hardware.  For IPA, activities performed at the
+ * config

[PATCH v2 04/17] soc: qcom: ipa: configuration data

2019-05-30 Thread Alex Elder

This patch defines configuration data that is used to specify some
of the details of IPA hardware supported by the driver.  It is built
as Device Tree match data, discovered at boot time.  Initially the
driver only supports the Qualcomm SDM845 SoC.
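
As a rough illustration of what "match data" means here, the following
user-space sketch looks up a per-SoC configuration record by compatible
string.  Everything in it is an assumption made for the example: the
structure names, the fields, the endpoint count, and the compatible string
are hypothetical and are not taken from this patch.  In the kernel this
role is normally played by a struct of_device_id table together with
of_device_get_match_data().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-SoC configuration record (fields are illustrative) */
struct soc_config {
	const char *name;
	unsigned int endpoint_count;
};

/* One row of the match table: compatible string -> configuration */
struct match_entry {
	const char *compatible;
	const struct soc_config *data;
};

static const struct soc_config sdm845_config = {
	.name = "SDM845",
	.endpoint_count = 6,	/* illustrative value only */
};

/* The table is terminated by an entry with a NULL compatible string */
static const struct match_entry match_table[] = {
	{ .compatible = "qcom,sdm845-ipa", .data = &sdm845_config },
	{ .compatible = NULL, .data = NULL },
};

/* Return the configuration for a compatible string, or NULL if unknown */
static const struct soc_config *match_lookup(const char *compatible)
{
	const struct match_entry *entry;

	assert(compatible != NULL);
	for (entry = match_table; entry->compatible; entry++)
		if (!strcmp(entry->compatible, compatible))
			return entry->data;

	return NULL;
}
```

Supporting another SoC under this scheme would amount to adding one more
configuration record and one more table row, with no change to the lookup.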

Signed-off-by: Alex Elder 
---
 drivers/net/ipa/ipa_data-sdm845.c | 245 +++
 drivers/net/ipa/ipa_data.h| 267 ++
 2 files changed, 512 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_data-sdm845.c
 create mode 100644 drivers/net/ipa/ipa_data.h

diff --git a/drivers/net/ipa/ipa_data-sdm845.c b/drivers/net/ipa/ipa_data-sdm845.c
new file mode 100644
index ..62c0f25f5161
--- /dev/null
+++ b/drivers/net/ipa/ipa_data-sdm845.c
@@ -0,0 +1,245 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include 
+
+#include "gsi.h"
+#include "ipa_data.h"
+#include "ipa_endpoint.h"
+
+/* Differentiate Boolean from numerical options */
+#define NO     0
+#define YES    1
+
+/* Endpoint configuration for the SDM845 SoC. */
+static const struct gsi_ipa_endpoint_data gsi_ipa_endpoint_data[] = {
+   {
+   .ee_id  = GSI_EE_AP,
+   .channel_id = 4,
+   .endpoint_id= IPA_ENDPOINT_AP_COMMAND_TX,
+   .toward_ipa = YES,
+   .channel = {
+   .tlv_count  = 20,
+   .wrr_priority   = YES,
+   .tre_count  = 256,
+   .event_count= 512,
+   },
+   .endpoint = {
+   .seq_type   = IPA_SEQ_DMA_ONLY,
+   .config = {
+   .dma_mode   = YES,
+   .dma_endpoint   = IPA_ENDPOINT_AP_LAN_RX,
+   },
+   },
+   },
+   {
+   .ee_id  = GSI_EE_AP,
+   .channel_id = 5,
+   .endpoint_id= IPA_ENDPOINT_AP_LAN_RX,
+   .toward_ipa = NO,
+   .channel = {
+   .tlv_count  = 8,
+   .tre_count  = 256,
+   .event_count= 256,
+   },
+   .endpoint = {
+   .seq_type   = IPA_SEQ_INVALID,
+   .config = {
+   .checksum   = YES,
+   .aggregation= YES,
+   .status_enable  = YES,
+   .rx = {
+   .pad_align  = ilog2(sizeof(u32)),
+   },
+   },
+   },
+   },
+   {
+   .ee_id  = GSI_EE_AP,
+   .channel_id = 3,
+   .endpoint_id= IPA_ENDPOINT_AP_MODEM_TX,
+   .toward_ipa = YES,
+   .channel = {
+   .tlv_count  = 16,
+   .tre_count  = 512,
+   .event_count= 512,
+   },
+   .endpoint = {
+   .support_flt= YES,
+   .seq_type   =
+   IPA_SEQ_2ND_PKT_PROCESS_PASS_NO_DEC_UCP,
+   .config = {
+   .checksum   = YES,
+   .qmap   = YES,
+   .status_enable  = YES,
+   .tx = {
+   .delay  = YES,
+   .status_endpoint =
+   IPA_ENDPOINT_MODEM_AP_RX,
+   },
+   },
+   },
+   },
+   {
+   .ee_id  = GSI_EE_AP,
+   .channel_id = 6,
+   .endpoint_id= IPA_ENDPOINT_AP_MODEM_RX,
+   .toward_ipa = NO,
+   .channel = {
+   .tlv_count  = 8,
+   .tre_count  = 256,
+   .event_count= 256,
+   },
+   .endpoint = {
+   .seq_type   = IPA_SEQ_INVALID,
+   .config = {
+   .checksum   = YES,
+   .qmap   = YES,
+   .aggregation= YES,
+   .rx = {
+   .aggr_close_eof = YES,
+   },
+   },
+   },
+   },
+   {
+   .ee_id  = GSI_EE_MODEM,
+   .channel_id = 1,
+   .endpoint_id= IPA_ENDPOINT_MODEM_COMMAND_TX,
+   .toward_ipa = YES,
+

[PATCH v2 12/17] soc: qcom: ipa: IPA network device and microcontroller

2019-05-30 Thread Alex Elder

This patch includes the code that implements a Linux network device,
using one TX and one RX IPA endpoint.  It is used to implement the
network device representing the modem and its connection to wireless
networks.  There are only a few things that are really modem-specific
though, and they aren't clearly called out here.  Such distinctions
will be made clearer if we wish to support a network device for
anything other than the modem.

Sort of unrelated, this patch also includes the code supporting the
microcontroller CPU present on the IPA.  The microcontroller can be
used to implement special handling of packets, but at this time we
don't support that.  Still, it is a component that needs to be
initialized, and in the event of a crash we need to do some
synchronization between the AP and the microcontroller.

Signed-off-by: Alex Elder 
---
 drivers/net/ipa/ipa_netdev.c | 251 +++
 drivers/net/ipa/ipa_netdev.h |  24 
 drivers/net/ipa/ipa_uc.c | 208 +
 drivers/net/ipa/ipa_uc.h |  32 +
 4 files changed, 515 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_netdev.c
 create mode 100644 drivers/net/ipa/ipa_netdev.h
 create mode 100644 drivers/net/ipa/ipa_uc.c
 create mode 100644 drivers/net/ipa/ipa_uc.h

diff --git a/drivers/net/ipa/ipa_netdev.c b/drivers/net/ipa/ipa_netdev.c
new file mode 100644
index ..19c73c4da02b
--- /dev/null
+++ b/drivers/net/ipa/ipa_netdev.c
@@ -0,0 +1,251 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2014-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+
+/* Modem Transport Network Driver. */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ipa.h"
+#include "ipa_data.h"
+#include "ipa_endpoint.h"
+#include "ipa_mem.h"
+#include "ipa_netdev.h"
+#include "ipa_qmi.h"
+
+#define IPA_NETDEV_NAME    "rmnet_ipa%d"
+
+#define TAILROOM   0   /* for padding by mux layer */
+
+#define IPA_NETDEV_TIMEOUT 10  /* seconds */
+
+/** struct ipa_priv - IPA network device private data */
+struct ipa_priv {
+   struct ipa_endpoint *tx_endpoint;
+   struct ipa_endpoint *rx_endpoint;
+};
+
+/** ipa_netdev_open() - Opens the modem network interface */
+static int ipa_netdev_open(struct net_device *netdev)
+{
+   struct ipa_priv *priv = netdev_priv(netdev);
+   int ret;
+
+   ret = ipa_endpoint_enable_one(priv->tx_endpoint);
+   if (ret)
+   return ret;
+   ret = ipa_endpoint_enable_one(priv->rx_endpoint);
+   if (ret)
+   goto err_disable_tx;
+
+   netif_start_queue(netdev);
+
+   return 0;
+
+err_disable_tx:
+   ipa_endpoint_disable_one(priv->tx_endpoint);
+
+   return ret;
+}
+
+/** ipa_netdev_stop() - Stops the modem network interface. */
+static int ipa_netdev_stop(struct net_device *netdev)
+{
+   struct ipa_priv *priv = netdev_priv(netdev);
+
+   netif_stop_queue(netdev);
+
+   ipa_endpoint_disable_one(priv->rx_endpoint);
+   ipa_endpoint_disable_one(priv->tx_endpoint);
+
+   return 0;
+}
+
+/** ipa_netdev_xmit() - Transmits an skb.
+ * @skb: skb to be transmitted
+ * @netdev: network device
+ *
+ * Return codes:
+ * NETDEV_TX_OK: Success, or the skb was dropped
+ * NETDEV_TX_BUSY: Error while transmitting the skb. Try again later
+ */
+static int ipa_netdev_xmit(struct sk_buff *skb, struct net_device *netdev)
+{
+   struct net_device_stats *stats = &netdev->stats;
+   struct ipa_priv *priv = netdev_priv(netdev);
+   struct ipa_endpoint *endpoint;
+   u32 skb_len = skb->len;
+
+   if (!skb_len)
+   goto err_drop;
+
+   endpoint = priv->tx_endpoint;
+   if (endpoint->data->config.qmap && skb->protocol != htons(ETH_P_MAP))
+   goto err_drop;
+
+   if (ipa_endpoint_skb_tx(endpoint, skb))
+   return NETDEV_TX_BUSY;
+
+   stats->tx_packets++;
+   stats->tx_bytes += skb_len;
+
+   return NETDEV_TX_OK;
+
+err_drop:
+   dev_kfree_skb_any(skb);
+   stats->tx_dropped++;
+
+   return NETDEV_TX_OK;
+}
+
+void ipa_netdev_skb_rx(struct net_device *netdev, struct sk_buff *skb)
+{
+   struct net_device_stats *stats = &netdev->stats;
+
+   if (skb) {
+   skb->dev = netdev;
+   skb->protocol = htons(ETH_P_MAP);
+   stats->rx_packets++;
+   stats->rx_bytes += skb->len;
+
+   (void)netif_receive_skb(skb);
+   } else {
+   stats->rx_dropped++;
+   }
+}
+
+static const struct net_device_ops ipa_netdev_ops = {
+   .ndo_open   = ipa_netdev_open,
+   .ndo_stop   = ipa_netdev_stop,
+   .ndo_start_xmit = ipa_netdev_xmit,
+};
+
+/** netdev_setup() - netdev setup function  */
+static void netdev_setup(struct net_device *netdev)
+{
+   netdev->netdev_ops = &ipa_netdev_ops;
+   ether_setup(netdev);
+   /* No header ops (override value set by ether_setup()) */
+

[PATCH v2 11/17] soc: qcom: ipa: immediate commands

2019-05-30 Thread Alex Elder

One TX endpoint (per EE) is used for issuing immediate commands to
the IPA.  These commands request activities beyond simple data
transfers to be done by the IPA hardware.  For example, the IPA is
able to manage routing packets among endpoints, and immediate commands
are used to configure tables used for that routing.

Immediate commands are built on top of GSI transactions.  They are
different from normal transfers (in that they use a special endpoint,
and their "payload" is interpreted differently), so separate functions
are used to issue immediate command transactions.
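
To make the difference from a normal transfer concrete, here is a small
user-space sketch of how a transfer element might be filled in for each
case.  The type values (0x2 for a transfer, 0x3 for an immediate command),
the type field occupying bits 23..16, and the IEOT flag at bit 9 follow
definitions that appear elsewhere in this series; the structure layout,
the helper names, and the reading of IEOT as "request a completion
interrupt" are assumptions made for the example.  The driver's own
documentation notes that the length field carries the opcode for an
immediate command, which is what the sketch models.

```c
#include <assert.h>
#include <stdint.h>

#define TRE_FLAGS_IEOT		(UINT32_C(1) << 9)	/* bit 9 */
#define TRE_FLAGS_TYPE_SHIFT	16
#define TRE_FLAGS_TYPE_MASK	(UINT32_C(0xff) << TRE_FLAGS_TYPE_SHIFT)

enum tre_type {
	TRE_XFER	= 0x2,	/* ordinary data transfer */
	TRE_IMMD_CMD	= 0x3,	/* immediate command */
};

/* Hypothetical in-memory layout of one transfer element */
struct tre {
	uint64_t addr;		/* DMA address of data or command payload */
	uint16_t len_opc;	/* length, or opcode for an immediate command */
	uint16_t reserved;
	uint32_t flags;
};

static struct tre tre_fill(uint64_t addr, uint16_t len_opc, enum tre_type type)
{
	struct tre t = {
		.addr = addr,
		.len_opc = len_opc,
		.flags = ((uint32_t)type << TRE_FLAGS_TYPE_SHIFT) | TRE_FLAGS_IEOT,
	};

	assert(((uint32_t)type & ~UINT32_C(0xff)) == 0);
	return t;
}

/* Normal transfer: the 16-bit field holds the byte count */
static struct tre tre_xfer(uint64_t addr, uint16_t len)
{
	return tre_fill(addr, len, TRE_XFER);
}

/* Immediate command: the same field holds the command opcode */
static struct tre tre_command(uint64_t addr, uint16_t opcode)
{
	return tre_fill(addr, opcode, TRE_IMMD_CMD);
}

static enum tre_type tre_get_type(const struct tre *t)
{
	return (enum tre_type)((t->flags & TRE_FLAGS_TYPE_MASK) >>
			       TRE_FLAGS_TYPE_SHIFT);
}
```

Because both kinds of request share one element format, the same ring and
completion machinery can carry them; only the fill-in step differs.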

Signed-off-by: Alex Elder 
---
 drivers/net/ipa/ipa_cmd.c | 377 ++
 drivers/net/ipa/ipa_cmd.h | 116 
 2 files changed, 493 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_cmd.c
 create mode 100644 drivers/net/ipa/ipa_cmd.h

diff --git a/drivers/net/ipa/ipa_cmd.c b/drivers/net/ipa/ipa_cmd.c
new file mode 100644
index ..32b11941436d
--- /dev/null
+++ b/drivers/net/ipa/ipa_cmd.c
@@ -0,0 +1,377 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "gsi.h"
+#include "gsi_trans.h"
+#include "ipa.h"
+#include "ipa_endpoint.h"
+#include "ipa_cmd.h"
+#include "ipa_mem.h"
+
+/**
+ * DOC:  IPA Immediate Commands
+ *
+ * The AP command TX endpoint is used to issue immediate commands to the IPA.
+ * An immediate command is generally used to request the IPA do something
+ * other than data transfer to another endpoint.
+ *
+ * Immediate commands are represented by GSI transactions just like other
+ * transfer requests, represented by a single GSI TRE.  Each immediate
+ * command has a well-defined format, having a payload of a known length.
+ * This allows the transfer element's length field to be used to hold an
+ * immediate command's opcode.  The payload for a command resides in DRAM
+ * and is described by a single scatterlist entry in its transaction.
+ * Commands do not require a transaction completion callback.  To commit
+ * an immediate command transaction, either gsi_trans_commit_command() or
+ * gsi_trans_commit_command_timeout() is used.
+ */
+
+#define IPA_GSI_DMA_TASK_TIMEOUT   15  /* milliseconds */
+
+/**
+ * __ipa_cmd_timeout() - Send an immediate command with timeout
+ * @ipa:       IPA structure
+ * @opcode:    Immediate command opcode (must not be IPA_CMD_NONE)
+ * @payload:   Pointer to command payload
+ * @size:      Size of payload
+ * @timeout:   Milliseconds to wait for completion (0 waits indefinitely)
+ *
+ * This common function implements ipa_cmd() and ipa_cmd_timeout().  It
+ * allocates, initializes, and commits a transaction for the immediate
+ * command.  The transaction is committed using gsi_trans_commit_command(),
+ * or if a non-zero timeout is supplied, gsi_trans_commit_command_timeout().
+ *
+ * @Return:    0 if successful, or a negative error code
+ */
+static int __ipa_cmd_timeout(struct ipa *ipa, enum ipa_cmd_opcode opcode,
+void *payload, size_t size, u32 timeout)
+{
+   struct ipa_endpoint *endpoint = ipa->command_endpoint;
+   struct gsi_trans *trans;
+   int ret;
+
+   /* assert(opcode != IPA_CMD_NONE) */
+   trans = gsi_channel_trans_alloc(&ipa->gsi, endpoint->channel_id, 1);
+   if (!trans)
+   return -EBUSY;
+
+   sg_init_one(trans->sgl, payload, size);
+
+   if (timeout)
+   ret = gsi_trans_commit_command_timeout(trans, opcode, timeout);
+   else
+   ret = gsi_trans_commit_command(trans, opcode);
+   if (ret)
+   goto err_trans_free;
+
+   return 0;
+
+err_trans_free:
+   gsi_trans_free(trans);
+
+   return ret;
+}
+
+static int
+ipa_cmd(struct ipa *ipa, enum ipa_cmd_opcode opcode, void *payload, size_t size)
+{
+   return __ipa_cmd_timeout(ipa, opcode, payload, size, 0);
+}
+
+static int ipa_cmd_timeout(struct ipa *ipa, enum ipa_cmd_opcode opcode,
+  void *payload, size_t size)
+{
+   return __ipa_cmd_timeout(ipa, opcode, payload, size,
+IPA_GSI_DMA_TASK_TIMEOUT);
+}
+
+/* Field masks for ipa_imm_cmd_hw_hdr_init_local structure fields */
+#define IPA_CMD_HDR_INIT_FLAGS_TABLE_SIZE_FMASK    GENMASK(11, 0)
+#define IPA_CMD_HDR_INIT_FLAGS_HDR_ADDR_FMASK      GENMASK(27, 12)
+#define IPA_CMD_HDR_INIT_FLAGS_RESERVED_FMASK      GENMASK(31, 28)
+
+struct ipa_imm_cmd_hw_hdr_init_local {
+   u64 hdr_table_addr;
+   u32 flags;
+   u32 reserved;
+};
+
+/* Initialize header space in IPA-local memory */
+int ipa_cmd_hdr_init_local(struct ipa *ipa, u32 offset, u32 size)
+{
+   struct ipa_imm_cmd_hw_hdr_init_local *payload;
+   struct device *dev = &ipa->pdev->dev;
+   dma_addr_t addr;
+   void *virt;
+   u32 flags;
+   u32 max;
+   int ret;
+
+   if (size >

[PATCH v2 10/17] soc: qcom: ipa: IPA endpoints

2019-05-30 Thread Alex Elder

This patch includes the code implementing an IPA endpoint.  This is
the primary abstraction implemented by the IPA.  An endpoint is one
end of a network connection between two entities physically
connected to the IPA.  Specifically, the AP and the modem implement
endpoints, and an (AP endpoint, modem endpoint) pair implements the
transfer of network data in one direction between the AP and modem.

Endpoints are built on top of GSI channels, but IPA endpoints
represent the higher-level functionality that the IPA provides.
Data can be sent through a GSI channel, but it is the IPA endpoint
that represents what is on the "other end" to receive that data.
Other functionality, including aggregation, checksum offload and
(at some future date) IP routing and filtering are all associated
with the IPA endpoint.

Signed-off-by: Alex Elder 
---
 drivers/net/ipa/ipa_endpoint.c | 1283 
 drivers/net/ipa/ipa_endpoint.h |   97 +++
 2 files changed, 1380 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_endpoint.c
 create mode 100644 drivers/net/ipa/ipa_endpoint.h

diff --git a/drivers/net/ipa/ipa_endpoint.c b/drivers/net/ipa/ipa_endpoint.c
new file mode 100644
index ..0185db35033d
--- /dev/null
+++ b/drivers/net/ipa/ipa_endpoint.c
@@ -0,0 +1,1283 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "gsi.h"
+#include "gsi_trans.h"
+#include "ipa.h"
+#include "ipa_data.h"
+#include "ipa_endpoint.h"
+#include "ipa_cmd.h"
+#include "ipa_mem.h"
+#include "ipa_netdev.h"
+
+#define atomic_dec_not_zero(v) atomic_add_unless((v), -1, 0)
+
+#define IPA_REPLENISH_BATCH    16
+
+#define IPA_RX_BUFFER_SIZE     (PAGE_SIZE << IPA_RX_BUFFER_ORDER)
+#define IPA_RX_BUFFER_ORDER    1   /* 8KB endpoint RX buffers (2 pages) */
+
+/* The amount of RX buffer space consumed by standard skb overhead */
+#define IPA_RX_BUFFER_OVERHEAD (PAGE_SIZE - SKB_MAX_ORDER(NET_SKB_PAD, 0))
+
+#define IPA_ENDPOINT_STOP_RETRY_MAX        10
+#define IPA_ENDPOINT_STOP_RX_SIZE          1   /* bytes */
+
+#define IPA_ENDPOINT_RESET_AGGR_RETRY_MAX  3
+#define IPA_AGGR_TIME_LIMIT_DEFAULT        1   /* milliseconds */
+
+/** enum ipa_status_opcode - status element opcode hardware values */
+enum ipa_status_opcode {
+   IPA_STATUS_OPCODE_PACKET            = 0x01,
+   IPA_STATUS_OPCODE_NEW_FRAG_RULE     = 0x02,
+   IPA_STATUS_OPCODE_DROPPED_PACKET    = 0x04,
+   IPA_STATUS_OPCODE_SUSPENDED_PACKET  = 0x08,
+   IPA_STATUS_OPCODE_LOG               = 0x10,
+   IPA_STATUS_OPCODE_DCMP              = 0x20,
+   IPA_STATUS_OPCODE_PACKET_2ND_PASS   = 0x40,
+};
+
+/** enum ipa_status_exception - status element exception type */
+enum ipa_status_exception {
+   IPA_STATUS_EXCEPTION_NONE,
+   IPA_STATUS_EXCEPTION_DEAGGR,
+   IPA_STATUS_EXCEPTION_IPTYPE,
+   IPA_STATUS_EXCEPTION_PACKET_LENGTH,
+   IPA_STATUS_EXCEPTION_PACKET_THRESHOLD,
+   IPA_STATUS_EXCEPTION_FRAG_RULE_MISS,
+   IPA_STATUS_EXCEPTION_SW_FILT,
+   IPA_STATUS_EXCEPTION_NAT,
+   IPA_STATUS_EXCEPTION_IPV6CT,
+   IPA_STATUS_EXCEPTION_MAX,
+};
+
+/**
+ * struct ipa_status - Abstracted IPA status element
+ * @opcode:        Status element type
+ * @exception:     The first exception that took place
+ * @pkt_len:       Payload length
+ * @dst_endpoint:  Destination endpoint
+ * @metadata:      32-bit metadata value used by packet
+ * @rt_miss:       Flag; if 1, indicates there was a routing rule miss
+ *
+ * Note that the hardware status element supplies additional information
+ * that is currently unused.
+ */
+struct ipa_status {
+   enum ipa_status_opcode opcode;
+   enum ipa_status_exception exception;
+   u32 pkt_len;
+   u32 dst_endpoint;
+   u32 metadata;
+   u32 rt_miss;
+};
+
+/* Field masks for struct ipa_status_raw structure fields */
+
+#define IPA_STATUS_SRC_IDX_FMASK               GENMASK(4, 0)
+
+#define IPA_STATUS_DST_IDX_FMASK               GENMASK(4, 0)
+
+#define IPA_STATUS_FLAGS1_FLT_LOCAL_FMASK      GENMASK(0, 0)
+#define IPA_STATUS_FLAGS1_FLT_HASH_FMASK       GENMASK(1, 1)
+#define IPA_STATUS_FLAGS1_FLT_GLOBAL_FMASK     GENMASK(2, 2)
+#define IPA_STATUS_FLAGS1_FLT_RET_HDR_FMASK    GENMASK(3, 3)
+#define IPA_STATUS_FLAGS1_FLT_RULE_ID_FMASK    GENMASK(13, 4)
+#define IPA_STATUS_FLAGS1_RT_LOCAL_FMASK       GENMASK(14, 14)
+#define IPA_STATUS_FLAGS1_RT_HASH_FMASK        GENMASK(15, 15)
+#define IPA_STATUS_FLAGS1_UCP_FMASK            GENMASK(16, 16)
+#define IPA_STATUS_FLAGS1_RT_TBL_IDX_FMASK     GENMASK(21, 17)
+#define IPA_STATUS_FLAGS1_RT_RULE_ID_FMASK     GENMASK(31, 22)
+
+#define IPA_STATUS_FLAGS2_NAT_HIT_FMASK        GENMASK_ULL(0, 0)
+#define IPA_STATUS_FLAGS2_NAT_ENTRY_IDX_FMASK  GENMASK_ULL(13, 1)

[PATCH v2 01/17] bitfield.h: add FIELD_MAX() and field_max()

2019-05-30 Thread Alex Elder

Define FIELD_MAX(), which supplies the maximum value that can be
represented by a field value.  Define field_max() as well, to go
along with the lower-case forms of the field mask functions.
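
A user-space approximation may help show what such a macro computes.  The
sketch below is not the kernel implementation: MY_GENMASK(), my_bf_shf(),
MY_FIELD_MAX() and my_field_fits() are stand-in names invented for the
example, it handles only 32-bit masks, and it relies on the GCC/Clang
builtin __builtin_ctz() to find the field's shift.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's GENMASK(): bits h..l set, with 31 >= h >= l */
#define MY_GENMASK(h, l) \
	((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

/* Position of the lowest set bit in the mask (GCC/Clang builtin) */
#define my_bf_shf(mask)		(__builtin_ctz(mask))

/* Largest value that fits in the field described by mask */
#define MY_FIELD_MAX(mask)	((mask) >> my_bf_shf(mask))

/* Range check of the kind a caller might do before encoding a field */
static int my_field_fits(uint32_t mask, uint32_t val)
{
	return val <= MY_FIELD_MAX(mask);
}
```

For a 12-bit field at bits 11..0 the maximum is 4095, and for a 16-bit
field at bits 27..12 it is 65535, regardless of where the field sits in
the register.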

Signed-off-by: Alex Elder 
---
 include/linux/bitfield.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h
index 3f1ef4450a7c..cf4f06774520 100644
--- a/include/linux/bitfield.h
+++ b/include/linux/bitfield.h
@@ -63,6 +63,19 @@
  (1ULL << __bf_shf(_mask))); \
})
 
+/**
+ * FIELD_MAX() - produce the maximum value representable by a field
+ * @_mask: shifted mask defining the field's length and position
+ *
+ * FIELD_MAX() returns the maximum value that can be held in the field
+ * specified by @_mask.
+ */
+#define FIELD_MAX(_mask)   \
+   ({  \
+   __BF_FIELD_CHECK(_mask, 0ULL, 0ULL, "FIELD_MAX: "); \
+   (typeof(_mask))((_mask) >> __bf_shf(_mask));\
+   })
+
 /**
  * FIELD_FIT() - check if value fits in the field
  * @_mask: shifted mask defining the field's length and position
@@ -118,6 +131,7 @@ static __always_inline u64 field_mask(u64 field)
 {
return field / field_multiplier(field);
 }
+#define field_max(field)   ((typeof(field))field_mask(field))
 #define MAKE_OP(type,base,to,from) \
 static __always_inline __##type type##_encode_bits(base v, base field) \
 {  \
-- 
2.20.1

[PATCH v2 08/17] soc: qcom: ipa: GSI transactions

2019-05-30 Thread Alex Elder

This patch implements GSI transactions.  A GSI transaction is a
structure that represents a single request (consisting of one or
more TREs) sent to the GSI hardware.  The last TRE in a transaction
includes a flag requesting that the GSI interrupt the AP to notify
that it has completed.

TREs are executed and completed strictly in order.  For this reason,
the completion of a single TRE implies that all previous TREs (in
particular all of those "earlier" in a transaction) have completed.

Whenever there is a need to send a request (a set of TREs) to the
IPA, a GSI transaction is allocated, specifying the number of TREs
that will be required.  Details of the request (e.g. transfer offsets
and length) are represented in a Linux scatterlist array that is
incorporated in the transaction structure.

Once "filled," the transaction is committed.  The GSI transaction
layer performs all needed mapping (and unmapping) for DMA, and
issues the request to the hardware.  When the hardware signals
that the request has completed, a callback function allows for
cleanup or followup activity to be performed before the transaction
is freed.
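
The allocation and in-order completion rules above can be modelled in a few
lines of user-space C.  This is a sketch under simplifying assumptions, not
the driver's implementation: all names and the ring size are hypothetical,
the counters are assumed never to wrap, and DMA mapping and callbacks are
left out entirely.

```c
#include <assert.h>
#include <stdint.h>

#define TRE_MAX 16u	/* hypothetical number of TREs in the channel ring */

/* Hypothetical channel state (counters assumed not to wrap) */
struct channel {
	uint32_t reserved;	/* TREs handed out to transactions so far */
	uint32_t completed;	/* TREs the hardware has finished */
};

/* Hypothetical transaction: a contiguous run of TREs in the ring */
struct trans {
	uint32_t first;		/* index of the first TRE */
	uint32_t tre_count;	/* number of TREs used */
};

/* Reserve all TREs up front so exhaustion is detected at allocation time */
static int trans_alloc(struct channel *ch, struct trans *t, uint32_t tre_count)
{
	uint32_t in_use = ch->reserved - ch->completed;

	if (tre_count == 0 || tre_count > TRE_MAX - in_use)
		return -1;
	t->first = ch->reserved;
	t->tre_count = tre_count;
	ch->reserved += tre_count;
	return 0;
}

/* Only this TRE needs to request a completion interrupt */
static uint32_t trans_last_tre(const struct trans *t)
{
	return t->first + t->tre_count - 1;
}

/* Completion is in order, so finishing one TRE finishes all earlier ones */
static void channel_complete(struct channel *ch, uint32_t tre_index)
{
	assert(tre_index < ch->reserved);
	ch->completed = tre_index + 1;
}

static int trans_is_complete(const struct channel *ch, const struct trans *t)
{
	return ch->completed > trans_last_tre(t);
}
```

Note that a single call to channel_complete() for the last TRE of a later
transaction also marks every earlier transaction complete, which is the
property that lets one interrupt cover many transfers.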

Signed-off-by: Alex Elder 
---
 drivers/net/ipa/gsi_trans.c | 624 
 drivers/net/ipa/gsi_trans.h | 116 +++
 2 files changed, 740 insertions(+)
 create mode 100644 drivers/net/ipa/gsi_trans.c
 create mode 100644 drivers/net/ipa/gsi_trans.h

diff --git a/drivers/net/ipa/gsi_trans.c b/drivers/net/ipa/gsi_trans.c
new file mode 100644
index ..267e33093554
--- /dev/null
+++ b/drivers/net/ipa/gsi_trans.c
@@ -0,0 +1,624 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "gsi.h"
+#include "gsi_private.h"
+#include "gsi_trans.h"
+#include "ipa_gsi.h"
+#include "ipa_data.h"
+#include "ipa_cmd.h"
+
+/**
+ * DOC: GSI Transactions
+ *
+ * A GSI transaction abstracts the behavior of a GSI channel by representing
+ * everything about a related group of data transfers in a single structure.
+ * Most details of interaction with the GSI hardware are managed by the GSI
+ * transaction core, allowing users to simply describe transfers to be
+ * performed.  When a transaction has completed, a callback function
+ * (dependent on the type of endpoint associated with the channel) allows
+ * cleanup of resources associated with the transaction.
+ *
+ * To perform a data transfer (or a related set of them), a user of the GSI
+ * transaction interface allocates a transaction, indicating the number of
+ * TREs required (one per data transfer).  If sufficient TREs are available,
+ * they are reserved for use in the transaction and the allocation succeeds.
+ * This way exhaustion of the available TREs in a channel ring is detected
+ * as early as possible.  All resources required to complete a transaction
+ * are allocated at transaction allocation time.
+ *
+ * Transfers performed as part of a transaction are represented in an array
+ * of Linux scatterlist structures.  This array is allocated with the
+ * transaction, and its entries must be initialized using standard
+ * scatterlist functions (such as sg_init_one() or skb_to_sgvec()).
+ *
+ * Once a transaction's scatterlist structures have been initialized, the
+ * transaction is committed.  The GSI transaction layer is responsible for
+ * DMA mapping (and unmapping) memory described in the transaction's
+ * scatterlist array.  The only way committing a transaction fails is if
+ * this DMA mapping step returns an error.  Otherwise, ownership of the
+ * entire transaction is transferred to the GSI transaction core.  The GSI
+ * transaction code formats the content of the scatterlist array into the
+ * channel ring buffer and informs the hardware that new TREs are available
+ * to process.
+ *
+ * The last TRE in each transaction is marked to interrupt the AP when the
+ * GSI hardware has completed it.  Because transfers described by TREs are
+ * performed strictly in order, signaling the completion of just the last
+ * TRE in the transaction is sufficient to indicate the full transaction
+ * is complete.
+ *
+ * When a transaction is complete, ipa_gsi_trans_complete() is called by the
+ * GSI code into the IPA layer, allowing it to perform any final cleanup
+ * required before the transaction is freed.
+ */
+
+/* gsi_tre->flags mask values (in CPU byte order) */
+#define GSI_TRE_FLAGS_CHAIN_FMASK  GENMASK(0, 0)
+#define GSI_TRE_FLAGS_IEOB_FMASK   GENMASK(8, 8)
+#define GSI_TRE_FLAGS_IEOT_FMASK   GENMASK(9, 9)
+#define GSI_TRE_FLAGS_BEI_FMASK    GENMASK(10, 10)
+#define GSI_TRE_FLAGS_TYPE_FMASK   GENMASK(23, 16)
+
+/* Hardware values representing a transfer element type */
+enum gsi_tre_type {
+   GSI_RE_XFER = 0x2,
+   GSI_RE_IMMD_CMD = 0x3,
+   GSI_RE_NOP  = 0x4,
+};
+
+/* Map a given ring entry

[PATCH v2 17/17] arm64: defconfig: enable build of IPA code

2019-05-30 Thread Alex Elder

Add CONFIG_IPA to the 64-bit Arm defconfig.

Signed-off-by: Alex Elder 
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 4d583514258c..6ed86cb6b597 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -261,6 +261,7 @@ CONFIG_SMSC911X=y
 CONFIG_SNI_AVE=y
 CONFIG_SNI_NETSEC=y
 CONFIG_STMMAC_ETH=m
+CONFIG_IPA=m
 CONFIG_MDIO_BUS_MUX_MMIOREG=y
 CONFIG_AT803X_PHY=m
 CONFIG_MARVELL_PHY=m
-- 
2.20.1

[PATCH v2 15/17] MAINTAINERS: add entry for the Qualcomm IPA driver

2019-05-30 Thread Alex Elder

Add an entry in the MAINTAINERS file for the Qualcomm IPA driver.

Signed-off-by: Alex Elder 
---
 MAINTAINERS | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 429c6c624861..a2dece647641 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12872,6 +12872,12 @@ L: alsa-de...@alsa-project.org (moderated for non-subscribers)
 S: Supported
 F: sound/soc/qcom/
 
+QCOM IPA DRIVER
+M: Alex Elder 
+L: net...@vger.kernel.org
+S: Supported
+F: drivers/net/ipa/
+
 QEMU MACHINE EMULATOR AND VIRTUALIZER SUPPORT
 M: Gabriel Somlo 
 M: "Michael S. Tsirkin" 
-- 
2.20.1

[PATCH v2 14/17] soc: qcom: ipa: support build of IPA code

2019-05-30 Thread Alex Elder

Add build and Kconfig support for the Qualcomm IPA driver.

Signed-off-by: Alex Elder 
---
 drivers/net/Kconfig  |  2 ++
 drivers/net/Makefile |  1 +
 drivers/net/ipa/Kconfig  | 16 
 drivers/net/ipa/Makefile |  7 +++
 4 files changed, 26 insertions(+)
 create mode 100644 drivers/net/ipa/Kconfig
 create mode 100644 drivers/net/ipa/Makefile

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 48e209e55843..d87fe174eb9f 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -388,6 +388,8 @@ source "drivers/net/fddi/Kconfig"
 
 source "drivers/net/hippi/Kconfig"
 
+source "drivers/net/ipa/Kconfig"
+
 config NET_SB1000
    tristate "General Instruments Surfboard 1000"
    depends on PNP
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 0d3ba056cda3..ff8918fe09b0 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -45,6 +45,7 @@ obj-$(CONFIG_ETHERNET) += ethernet/
 obj-$(CONFIG_FDDI) += fddi/
 obj-$(CONFIG_HIPPI) += hippi/
 obj-$(CONFIG_HAMRADIO) += hamradio/
+obj-$(CONFIG_IPA) += ipa/
 obj-$(CONFIG_PLIP) += plip/
 obj-$(CONFIG_PPP) += ppp/
 obj-$(CONFIG_PPP_ASYNC) += ppp/
diff --git a/drivers/net/ipa/Kconfig b/drivers/net/ipa/Kconfig
new file mode 100644
index 000000000000..b1e3f7405992
--- /dev/null
+++ b/drivers/net/ipa/Kconfig
@@ -0,0 +1,16 @@
+config IPA
+   tristate "Qualcomm IPA support"
+   depends on NET
+   select QCOM_QMI_HELPERS
+   select QCOM_MDT_LOADER
+   default n
+   help
+ Choose Y here to include support for the Qualcomm IP Accelerator
+ (IPA), a hardware block present in some Qualcomm SoCs.  The IPA
+ is a programmable protocol processor that is capable of generic
+ hardware handling of IP packets, including routing, filtering,
+ and NAT.  Currently the IPA driver supports only basic transport
+ of network traffic between the AP and modem, on the Qualcomm
+ SDM845 SoC.
+
+ If unsure, say N.
diff --git a/drivers/net/ipa/Makefile b/drivers/net/ipa/Makefile
new file mode 100644
index 000000000000..a43039c09a25
--- /dev/null
+++ b/drivers/net/ipa/Makefile
@@ -0,0 +1,7 @@
+obj-$(CONFIG_IPA)  +=  ipa.o
+
+ipa-y  :=  ipa_main.o ipa_clock.o ipa_mem.o \
+   ipa_interrupt.o gsi.o gsi_trans.o \
+   ipa_gsi.o ipa_smp2p.o ipa_uc.o \
+   ipa_endpoint.o ipa_cmd.o ipa_netdev.o \
+   ipa_qmi.o ipa_qmi_msg.o ipa_data-sdm845.o
-- 
2.20.1

[PATCH v2 13/17] soc: qcom: ipa: AP/modem communications

2019-05-30 Thread Alex Elder

This patch implements two forms of out-of-band communication between
the AP and modem.

  - QMI is a mechanism that allows clients running on the AP to
interact with services running on the modem (and vice-versa).
The AP IPA driver uses QMI to communicate with the corresponding
IPA driver resident on the modem, to agree on parameters used
with the IPA hardware and to ensure both sides are ready before
entering operational mode.

  - SMP2P is a more primitive mechanism available for the modem and
AP to communicate with each other.  It provides a means for either
the AP or modem to interrupt the other, and furthermore, to provide
32 bits worth of information.  The IPA driver uses SMP2P to tell
the modem what the state of the IPA clock was in the event of a
crash.  This allows the modem to safely access the IPA hardware
(or avoid doing so) when a crash occurs, for example, to access
information within the IPA hardware.
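
As a rough illustration of the SMP2P part, the clock state can be thought of as two bits inside the 32-bit value: one saying the state is valid and one saying the clock is enabled. The sketch below is plain userspace C under stated assumptions: the bit positions (0 for "valid", 1 for "enabled") follow the order of the "ipa-clock-enabled-valid" and "ipa-clock-enabled" state names in the sdm845 device tree patch of this series, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, taken from the order of the smem state names */
#define CLOCK_ENABLED_VALID_BIT	(UINT32_C(1) << 0)
#define CLOCK_ENABLED_BIT	(UINT32_C(1) << 1)

/* AP side: encode the current clock state for the modem */
static uint32_t clock_state_encode(bool enabled)
{
	uint32_t value = CLOCK_ENABLED_VALID_BIT;

	if (enabled)
		value |= CLOCK_ENABLED_BIT;

	return value;
}

/* Modem side: only touch the hardware if the state is valid and enabled */
static bool clock_state_safe_to_access(uint32_t value)
{
	return (value & CLOCK_ENABLED_VALID_BIT) &&
	       (value & CLOCK_ENABLED_BIT);
}
```

Keeping a separate "valid" bit lets the receiver tell "clock is off" apart from "no state has been published yet".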

Signed-off-by: Alex Elder 
---
 drivers/net/ipa/ipa_qmi.c | 402 +++
 drivers/net/ipa/ipa_qmi.h |  35 ++
 drivers/net/ipa/ipa_qmi_msg.c | 583 ++
 drivers/net/ipa/ipa_qmi_msg.h | 238 ++
 drivers/net/ipa/ipa_smp2p.c   | 304 ++
 drivers/net/ipa/ipa_smp2p.h   |  47 +++
 6 files changed, 1609 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_qmi.c
 create mode 100644 drivers/net/ipa/ipa_qmi.h
 create mode 100644 drivers/net/ipa/ipa_qmi_msg.c
 create mode 100644 drivers/net/ipa/ipa_qmi_msg.h
 create mode 100644 drivers/net/ipa/ipa_smp2p.c
 create mode 100644 drivers/net/ipa/ipa_smp2p.h

diff --git a/drivers/net/ipa/ipa_qmi.c b/drivers/net/ipa/ipa_qmi.c
new file mode 100644
index 000000000000..e94437508f6c
--- /dev/null
+++ b/drivers/net/ipa/ipa_qmi.c
@@ -0,0 +1,402 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2013-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ipa.h"
+#include "ipa_endpoint.h"
+#include "ipa_mem.h"
+#include "ipa_qmi_msg.h"
+
+#define QMI_INIT_DRIVER_TIMEOUT    60000   /* A minute in milliseconds */
+
+/**
+ * DOC: AP/Modem QMI Handshake
+ *
+ * The AP and modem perform a "handshake" at initialization time to ensure
+ * each side knows the other side is ready.  Two QMI handles (endpoints) are
+ * used for this; one provides service on the modem for AP requests, and the
+ * other is on the AP to service modem requests (and to supply an indication
+ * from the AP).
+ *
+ * The QMI service on the modem expects to receive an INIT_DRIVER request from
+ * the AP, which contains parameters used by the modem during initialization.
+ * The AP sends this request using the client handle as soon as it knows
+ * the modem side service is available.  The modem responds to this request
+ * immediately.
+ *
+ * When the modem learns the AP service is available, it is able to
+ * communicate its status to the AP.  The modem uses this to tell
+ * the AP when it is ready to receive an indication, sending an
+ * INDICATION_REGISTER request to the handle served by the AP.  This
+ * is independent of the modem's initialization of its driver.
+ *
+ * When the modem has completed the driver initialization requested by the
+ * AP, it sends a DRIVER_INIT_COMPLETE request to the AP.   This request
+ * could arrive at the AP either before or after the INDICATION_REGISTER
+ * request.
+ *
+ * The final step in the handshake occurs after the AP has received both
+ * requests from the modem.  The AP completes the handshake by sending an
+ * INIT_COMPLETE_IND indication message to the modem.
+ */
+
+#define IPA_HOST_SERVICE_SVC_ID    0x31
+#define IPA_HOST_SVC_VERS  1
+#define IPA_HOST_SERVICE_INS_ID    1
+
+#define IPA_MODEM_SERVICE_SVC_ID   0x31
+#define IPA_MODEM_SERVICE_INS_ID   2
+#define IPA_MODEM_SVC_VERS 1
+
+/* Send an INIT_COMPLETE_IND indication message to the modem */
+static int ipa_send_master_driver_init_complete_ind(struct qmi_handle *qmi,
+   struct sockaddr_qrtr *sq)
+{
+   struct ipa_init_complete_ind ind = { };
+
+   ind.status.result = QMI_RESULT_SUCCESS_V01;
+   ind.status.error = QMI_ERR_NONE_V01;
+
+   return qmi_send_indication(qmi, sq, IPA_QMI_INIT_COMPLETE_IND,
+  IPA_QMI_INIT_COMPLETE_IND_SZ,
+  ipa_init_complete_ind_ei, &ind);
+}
+
+/* This function is called to determine whether to complete the handshake by
+ * sending an INIT_COMPLETE_IND indication message to the modem.  The
+ * "init_driver" parameter is false when we've received an INDICATION_REGISTER
+ * request message from the modem, or true when we've received the response
+ * from the INIT_DRIVER request message we send.  If this function decides the
+ * message should be sent,
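
The decision described in the comment above (send the indication once, and only after both events have been seen, in either order) can be sketched as a tiny state machine. This is a userspace illustration, not the driver's implementation: struct handshake and handshake_event() are hypothetical names, and the sketch tracks only the two events named in the function comment.

```c
#include <stdbool.h>

struct handshake {
	bool indication_requested;	/* INDICATION_REGISTER request seen */
	bool init_driver_done;		/* INIT_DRIVER response seen */
	bool indication_sent;		/* INIT_COMPLETE_IND already sent */
};

/* Record one event; return true exactly once, when the indication
 * should be sent.  "init_driver" is false for an INDICATION_REGISTER
 * request and true for the INIT_DRIVER response.
 */
static bool handshake_event(struct handshake *hs, bool init_driver)
{
	if (init_driver)
		hs->init_driver_done = true;
	else
		hs->indication_requested = true;

	if (hs->indication_sent)
		return false;	/* never send the indication twice */

	if (!hs->indication_requested || !hs->init_driver_done)
		return false;	/* still waiting for the other event */

	hs->indication_sent = true;

	return true;
}
```

Because the two events can arrive in either order, the check has to run on both paths; the "sent" flag keeps a repeated event from triggering a second indication.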

[PATCH v2 16/17] arm64: dts: sdm845: add IPA information

2019-05-30 Thread Alex Elder

Add IPA-related nodes and definitions to "sdm845.dtsi".

Signed-off-by: Alex Elder 
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 51 
 1 file changed, 51 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index fcb93300ca62..985479925af8 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 / {
interrupt-parent = <>;
@@ -517,6 +518,17 @@
interrupt-controller;
#interrupt-cells = <2>;
};
+
+   ipa_smp2p_out: ipa-ap-to-modem {
+   qcom,entry-name = "ipa";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   ipa_smp2p_in: ipa-modem-to-ap {
+   qcom,entry-name = "ipa";
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
};
 
smp2p-slpi {
@@ -1268,6 +1280,45 @@
};
};
 
+   ipa@1e4 {
+   compatible = "qcom,sdm845-ipa";
+
+   modem-init;
+
+   reg = <0 0x1e4 0 0x7000>,
+ <0 0x1e47000 0 0x2000>,
+ <0 0x1e04000 0 0x2c000>;
+   reg-names = "ipa-reg",
+   "ipa-shared",
+   "gsi";
+
+   interrupts-extended =
+   < 0 311 IRQ_TYPE_EDGE_RISING>,
+   < 0 432 IRQ_TYPE_LEVEL_HIGH>,
+   <&ipa_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
+   <&ipa_smp2p_in 1 IRQ_TYPE_EDGE_RISING>;
+   interrupt-names = "ipa",
+ "gsi",
+ "ipa-clock-query",
+ "ipa-setup-ready";
+
+   clocks = < RPMH_IPA_CLK>;
+   clock-names = "core";
+
+   interconnects =
+   <_hlos MASTER_IPA _hlos SLAVE_EBI1>,
+   <_hlos MASTER_IPA _hlos SLAVE_IMEM>,
+   <_hlos MASTER_APPSS_PROC _hlos SLAVE_IPA_CFG>;
+   interconnect-names = "memory",
+"imem",
+"config";
+
+   qcom,smem-states = <&ipa_smp2p_out 0>,
+  <&ipa_smp2p_out 1>;
+   qcom,smem-state-names = "ipa-clock-enabled-valid",
+   "ipa-clock-enabled";
+   };
+
tcsr_mutex_regs: syscon@1f4 {
compatible = "syscon";
reg = <0 0x01f4 0 0x4>;
-- 
2.20.1

[PATCH v2 09/17] soc: qcom: ipa: IPA interface to GSI

2019-05-30 Thread Alex Elder

This patch provides interface functions supplied by the IPA layer
that are called from the GSI layer.  One function is called when a
GSI transaction has completed.  The others allow the GSI layer to
inform the IPA layer when the hardware has been told it has new TREs
to execute, and when the hardware has indicated transactions have
completed.

Signed-off-by: Alex Elder 
---
 drivers/net/ipa/ipa_gsi.c | 48 ++
 drivers/net/ipa/ipa_gsi.h | 49 +++
 2 files changed, 97 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_gsi.c
 create mode 100644 drivers/net/ipa/ipa_gsi.h

diff --git a/drivers/net/ipa/ipa_gsi.c b/drivers/net/ipa/ipa_gsi.c
new file mode 100644
index 000000000000..7f8d74688c1e
--- /dev/null
+++ b/drivers/net/ipa/ipa_gsi.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include 
+
+#include "gsi_trans.h"
+#include "ipa.h"
+#include "ipa_endpoint.h"
+
+void ipa_gsi_trans_complete(struct gsi_trans *trans)
+{
+   struct ipa *ipa = container_of(trans->gsi, struct ipa, gsi);
+   struct ipa_endpoint *endpoint;
+
+   endpoint = ipa->endpoint_map[trans->channel_id];
+   if (endpoint == ipa->command_endpoint)
+   return; /* Nothing to do for commands */
+
+   if (endpoint->toward_ipa)
+   ipa_endpoint_skb_tx_complete(trans);
+   else
+   ipa_endpoint_rx_complete(trans);
+}
+
+void ipa_gsi_channel_tx_queued(struct gsi *gsi, u32 channel_id, u32 count,
+  u32 byte_count)
+{
+   struct ipa *ipa = container_of(gsi, struct ipa, gsi);
+   struct ipa_endpoint *endpoint;
+
+   endpoint = ipa->endpoint_map[channel_id];
+   if (endpoint->netdev)
+   netdev_sent_queue(endpoint->netdev, byte_count);
+}
+
+void ipa_gsi_channel_tx_completed(struct gsi *gsi, u32 channel_id, u32 count,
+ u32 byte_count)
+{
+   struct ipa *ipa = container_of(gsi, struct ipa, gsi);
+   struct ipa_endpoint *endpoint;
+
+   endpoint = ipa->endpoint_map[channel_id];
+   if (endpoint->netdev)
+   netdev_completed_queue(endpoint->netdev, count, byte_count);
+}
diff --git a/drivers/net/ipa/ipa_gsi.h b/drivers/net/ipa/ipa_gsi.h
new file mode 100644
index 000000000000..72adb520da40
--- /dev/null
+++ b/drivers/net/ipa/ipa_gsi.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+#ifndef _IPA_GSI_TRANS_H_
+#define _IPA_GSI_TRANS_H_
+
+#include 
+
+struct gsi_trans;
+
+/**
+ * ipa_gsi_trans_complete() - GSI transaction completion callback
+ * @trans: Transaction that has completed
+ *
+ * This is called from the GSI layer to notify the IPA layer that a
+ * transaction has completed.
+ */
+void ipa_gsi_trans_complete(struct gsi_trans *trans);
+
+/**
+ * ipa_gsi_channel_tx_queued() - GSI queued to hardware notification
+ * @gsi:   GSI pointer
+ * @channel_id:Channel number
+ * @count: Number of transactions queued
+ * @byte_count:Number of bytes to transfer represented by transactions
+ *
+ * This is called from the GSI layer to notify the IPA layer that some
+ * number of transactions have been queued to hardware for execution.
+ */
+void ipa_gsi_channel_tx_queued(struct gsi *gsi, u32 channel_id, u32 count,
+  u32 byte_count);
+/**
+ * ipa_gsi_channel_tx_completed() - GSI transaction completion callback
+ * @gsi:   GSI pointer
+ * @channel_id:Channel number
+ * @count: Number of transactions completed since last report
+ * @byte_count:Number of bytes transferred represented by transactions
+ *
+ * This is called from the GSI layer to notify the IPA layer that the hardware
+ * has reported the completion of some number of transactions.
+ */
+void ipa_gsi_channel_tx_completed(struct gsi *gsi, u32 channel_id, u32 count,
+ u32 byte_count);
+
+#endif /* _IPA_GSI_TRANS_H_ */
-- 
2.20.1
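
All three callbacks in this patch recover the enclosing struct ipa from a pointer to its embedded gsi member with container_of(). For readers less familiar with that idiom, here is a minimal userspace sketch. The struct layouts are reduced to placeholders (the real structures are defined elsewhere in the series), the macro is a simplified version of the kernel's, and gsi_to_ipa() is a hypothetical helper.

```c
#include <stddef.h>

/* Simplified userspace version of the kernel's container_of() */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Placeholder layouts for illustration only */
struct gsi {
	int dummy;
};

struct ipa {
	int id;
	struct gsi gsi;		/* embedded, not a pointer */
};

/* Recover the struct ipa that contains the given struct gsi */
static struct ipa *gsi_to_ipa(struct gsi *gsi)
{
	return container_of(gsi, struct ipa, gsi);
}
```

The idiom only works because the gsi structure is embedded in struct ipa; if the IPA structure held a pointer to a separately allocated GSI structure, a back pointer would be needed instead.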

[PATCH v2 05/17] soc: qcom: ipa: clocking, interrupts, and memory

2019-05-30 Thread Alex Elder

This patch incorporates three source files (and their headers).  They're
grouped into one patch mainly for the purpose of making the number and
size of patches in this series somewhat reasonable.

  - "ipa_clock.c" and "ipa_clock.h" implement clocking for the IPA device.
The IPA has a single core clock managed by the common clock framework.
In addition, the IPA has three buses whose bandwidth is managed by the
Linux interconnect framework.  At this time the core clock and all
three buses are either on or off; we don't yet do any more fine-grained
management than that.  The core clock and interconnects are enabled
and disabled as a unit, using a unified clock-like abstraction,
ipa_clock_get()/ipa_clock_put().

  - "ipa_interrupt.c" and "ipa_interrupt.h" implement IPA interrupts.
There are two hardware IRQs used by the IPA driver (the other is
the GSI interrupt, described in a separate patch).  Several types
of interrupt are handled by the IPA IRQ handler; these are not part
of data/fast path.

  - The IPA has a region of local memory that is accessible by the AP
(and modem).  Within that region are areas with certain defined
purposes.  "ipa_mem.c" and "ipa_mem.h" define those regions, and
implement their initialization.
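
The get/put behaviour described for the IPA clock (enable on the 0 to 1 transition of a reference count, disable on the 1 to 0 transition) can be sketched as follows. This is a single-threaded userspace illustration with hypothetical names; the structure in the patch uses an atomic_t count plus a mutex to make the transitions safe against concurrent callers, which this sketch leaves out.

```c
#include <stdbool.h>

struct clock_ref {
	unsigned int count;	/* number of outstanding references */
	bool enabled;		/* stands in for core clock + interconnects */
};

/* Take a reference; the first one turns everything on */
static void clock_get(struct clock_ref *clock)
{
	if (clock->count++ == 0)
		clock->enabled = true;
}

/* Drop a reference; the last one turns everything off */
static void clock_put(struct clock_ref *clock)
{
	if (clock->count == 0)
		return;		/* unbalanced put, ignore in this sketch */

	if (--clock->count == 0)
		clock->enabled = false;
}
```

Treating the core clock and the three interconnects as one unit means callers only have to balance a single get/put pair around any hardware access.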

Signed-off-by: Alex Elder 
---
 drivers/net/ipa/ipa_clock.c | 297 
 drivers/net/ipa/ipa_clock.h |  52 ++
 drivers/net/ipa/ipa_interrupt.c | 279 ++
 drivers/net/ipa/ipa_interrupt.h |  53 ++
 drivers/net/ipa/ipa_mem.c   | 234 +
 drivers/net/ipa/ipa_mem.h   |  83 +
 6 files changed, 998 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_clock.c
 create mode 100644 drivers/net/ipa/ipa_clock.h
 create mode 100644 drivers/net/ipa/ipa_interrupt.c
 create mode 100644 drivers/net/ipa/ipa_interrupt.h
 create mode 100644 drivers/net/ipa/ipa_mem.c
 create mode 100644 drivers/net/ipa/ipa_mem.h

diff --git a/drivers/net/ipa/ipa_clock.c b/drivers/net/ipa/ipa_clock.c
new file mode 100644
index 000000000000..9ed12e8183ad
--- /dev/null
+++ b/drivers/net/ipa/ipa_clock.c
@@ -0,0 +1,297 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ipa.h"
+#include "ipa_clock.h"
+#include "ipa_netdev.h"
+
+/**
+ * DOC: IPA Clocking
+ *
+ * The "IPA Clock" manages both the IPA core clock and the interconnects
+ * (buses) the IPA depends on as a single logical entity.  A reference count
+ * is incremented by "get" operations and decremented by "put" operations.
+ * Transitions of that count from 0 to 1 result in the clock and interconnects
+ * being enabled, and transitions of the count from 1 to 0 cause them to be
+ * disabled.  We currently operate the core clock at a fixed clock rate, and
+ * all buses at a fixed average and peak bandwidth.  As more advanced IPA
+ * features are enabled, we can make better use of clock and bus scaling.
+ *
+ * An IPA clock reference must be held for any access to IPA hardware.
+ */
+
+#define IPA_CORE_CLOCK_RATE    (75UL * 1000 * 1000)    /* Hz */
+
+/* Interconnect path bandwidths (each times 1000 bytes per second) */
+#define IPA_MEMORY_AVG (80 * 1000) /* 80 MBps */
+#define IPA_MEMORY_PEAK    (600 * 1000)
+
+#define IPA_IMEM_AVG   (80 * 1000)
+#define IPA_IMEM_PEAK  (350 * 1000)
+
+#define IPA_CONFIG_AVG (40 * 1000)
+#define IPA_CONFIG_PEAK    (40 * 1000)
+
+/**
+ * struct ipa_clock - IPA clocking information
+ * @core:  IPA core clock
+ * @memory_path:   Memory interconnect
+ * @imem_path: Internal memory interconnect
+ * @config_path:   Configuration space interconnect
+ * @mutex: Protects clock enable/disable
+ * @count: Clocking reference count
+ */
+struct ipa_clock {
+   struct ipa *ipa;
+   atomic_t count;
+   struct mutex mutex; /* protects clock enable/disable */
+   struct clk *core;
+   struct icc_path *memory_path;
+   struct icc_path *imem_path;
+   struct icc_path *config_path;
+};
+
+/* Initialize interconnects required for IPA operation */
+static int ipa_interconnect_init(struct ipa_clock *clock, struct device *dev)
+{
+   struct icc_path *path;
+
+   path = of_icc_get(dev, "memory");
+   if (IS_ERR(path))
+   goto err_return;
+   clock->memory_path = path;
+
+   path = of_icc_get(dev, "imem");
+   if (IS_ERR(path))
+   goto err_memory_path_put;
+   clock->imem_path = path;
+
+   path = of_icc_get(dev, "config");
+   if (IS_ERR(path))
+   goto err_imem_path_put;
+   clock->config_path = path;
+
+   return 0;
+
+err_imem_path_put:
+

Re: [PATCH v3 3/8] drivers/soc: xdma: Add user interface

2019-05-30 Thread Eduardo Valentin

On Wed, May 29, 2019 at 01:10:03PM -0500, Eddie James wrote:
> This commits adds a miscdevice to provide a user interface to the XDMA
> engine. The interface provides the write operation to start DMA
> operations. The DMA parameters are passed as the data to the write call.
> The actual data to transfer is NOT passed through write. Note that both
> directions of DMA operation are accomplished through the write command;
> BMC to host and host to BMC.
> 
> The XDMA engine is restricted to only accessing the reserved memory
> space on the AST2500, typically used by the VGA. For this reason, the
> VGA memory space is pooled and allocated with genalloc. Users calling
> mmap allocate pages from this pool for their usage. The space allocated
> by a client will be the space used in the DMA operation. For an
> "upstream" (BMC to host) operation, the data in the client's area will
> be transferred to the host. For a "downstream" (host to BMC) operation,
> the host data will be placed in the client's memory area.
> 
> Poll is also provided in order to determine when the DMA operation is
> complete for non-blocking IO.
> 
> Signed-off-by: Eddie James 
> ---
>  drivers/soc/aspeed/aspeed-xdma.c | 201 
> +++
>  1 file changed, 201 insertions(+)
> 
> diff --git a/drivers/soc/aspeed/aspeed-xdma.c 
> b/drivers/soc/aspeed/aspeed-xdma.c
> index 3dc0ce4..39f6545 100644
> --- a/drivers/soc/aspeed/aspeed-xdma.c
> +++ b/drivers/soc/aspeed/aspeed-xdma.c
> @@ -129,6 +129,8 @@ struct aspeed_xdma {
>  
>   unsigned long flags;
>   unsigned int cmd_idx;
> + struct mutex list_lock;
> + struct mutex start_lock;
>   wait_queue_head_t wait;
>   struct aspeed_xdma_client *current_client;
>  
> @@ -140,6 +142,8 @@ struct aspeed_xdma {
>   dma_addr_t cmdq_vga_phys;
>   void *cmdq_vga_virt;
>   struct gen_pool *vga_pool;
> +
> + struct miscdevice misc;
>  };
>  
>  struct aspeed_xdma_client {
> @@ -331,6 +335,183 @@ static irqreturn_t aspeed_xdma_irq(int irq, void *arg)
>   return IRQ_HANDLED;
>  }
>  
> +static ssize_t aspeed_xdma_write(struct file *file, const char __user *buf,
> +  size_t len, loff_t *offset)
> +{
> + int rc;
> + struct aspeed_xdma_op op;
> + struct aspeed_xdma_client *client = file->private_data;
> + struct aspeed_xdma *ctx = client->ctx;
> + u32 offs = client->phys ? (client->phys - ctx->vga_phys) :
> + XDMA_CMDQ_SIZE;
> +
> + if (len != sizeof(struct aspeed_xdma_op))
> + return -EINVAL;
> +
> + rc = copy_from_user(&op, buf, len);
> + if (rc)
> + return rc;
> +
> + if (op.len > (ctx->vga_size - offs) || op.len < XDMA_BYTE_ALIGN)
> + return -EINVAL;
> +
> + if (file->f_flags & O_NONBLOCK) {
> + if (!mutex_trylock(&ctx->start_lock))
> + return -EAGAIN;
> +
> + if (test_bit(XDMA_IN_PRG, &ctx->flags)) {
> + mutex_unlock(&ctx->start_lock);
> + return -EAGAIN;
> + }
> + } else {
> + mutex_lock(&ctx->start_lock);
> +
> + rc = wait_event_interruptible(ctx->wait,
> +   !test_bit(XDMA_IN_PRG,
> + &ctx->flags));
> + if (rc) {
> + mutex_unlock(&ctx->start_lock);
> + return -EINTR;
> + }
> + }
> +
> + ctx->current_client = client;
> + set_bit(XDMA_IN_PRG, &ctx->flags);
> +
> + aspeed_xdma_start(ctx, &op, ctx->vga_phys + offs);
> +
> + mutex_unlock(&ctx->start_lock);
> +
> + if (!(file->f_flags & O_NONBLOCK)) {
> + rc = wait_event_interruptible(ctx->wait,
> +   !test_bit(XDMA_IN_PRG,
> + &ctx->flags));
> + if (rc)
> + return -EINTR;
> + }
> +
> + return len;
> +}
> +
> +static __poll_t aspeed_xdma_poll(struct file *file,
> +  struct poll_table_struct *wait)
> +{
> + __poll_t mask = 0;
> + __poll_t req = poll_requested_events(wait);
> + struct aspeed_xdma_client *client = file->private_data;
> + struct aspeed_xdma *ctx = client->ctx;
> +
> + if (req & (EPOLLIN | EPOLLRDNORM)) {
> + if (test_bit(XDMA_IN_PRG, &ctx->flags))
> + poll_wait(file, &ctx->wait, wait);
> +
> + if (!test_bit(XDMA_IN_PRG, &ctx->flags))
> + mask |= EPOLLIN | EPOLLRDNORM;
> + }
> +
> + if (req & (EPOLLOUT | EPOLLWRNORM)) {
> + if (test_bit(XDMA_IN_PRG, &ctx->flags))
> + poll_wait(file, &ctx->wait, wait);
> +
> + if (!test_bit(XDMA_IN_PRG, &ctx->flags))
> + mask |= EPOLLOUT | EPOLLWRNORM;
> + }
> +
> + return mask;
> +}
> +
> +static void aspeed_xdma_vma_close(struct vm_area_struct *vma)
> +{
> + struct aspeed_xdma_client

Re: mmotm 2019-05-29-20-52 uploaded

2019-05-30 Thread Stephen Rothwell

Hi all,

On Wed, 29 May 2019 21:43:36 -0700 Luigi Semenzato  wrote:
>
> My apologies but the patch
> 
> mm-smaps-split-pss-into-components.patch
> 
> has a bug (does not update private_clean and private_dirty).  Please
> do not include it.  I will resubmit a corrected version.

I have dropped that from linux-next today.

P.S. in the future please trim your replies to relevant bits, thanks.
-- 
Cheers,
Stephen Rothwell



memory leak in pppoe_sendmsg

2019-05-30 Thread syzbot


Hello,

syzbot found the following crash on:

HEAD commit:    bec7550c Merge tag 'docs-5.2-fixes2' of git://git.lwn.net/..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1280ecbaa0
kernel config:  https://syzkaller.appspot.com/x/.config?x=64479170dcaf0e11
dashboard link: https://syzkaller.appspot.com/bug?extid=6bdfd184eac7709e5cc9
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=17112572a0

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+6bdfd184eac7709e5...@syzkaller.appspotmail.com

2019/05/30 11:35:26 executed programs: 9
2019/05/30 11:35:32 executed programs: 11
2019/05/30 11:35:38 executed programs: 13
2019/05/30 11:35:44 executed programs: 15
2019/05/30 11:35:50 executed programs: 17
BUG: memory leak
unreferenced object 0x888124199500 (size 224):
  comm "syz-executor.0", pid 7122, jiffies 4295041954 (age 13.860s)
  hex dump (first 32 bytes):
00 96 19 24 81 88 ff ff d0 20 0e 2a 81 88 ff ff  ...$. .*
00 00 00 00 00 00 00 00 00 20 0e 2a 81 88 ff ff  . .*
  backtrace:
[<688f689a>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
[<688f689a>] slab_post_alloc_hook mm/slab.h:439 [inline]
[<688f689a>] slab_alloc_node mm/slab.c:3269 [inline]
[<688f689a>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579
[] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198
[<4ce3be0b>] alloc_skb include/linux/skbuff.h:1058 [inline]
[<4ce3be0b>] sock_wmalloc+0x4f/0x80 net/core/sock.c:2077
[<8110994c>] pppoe_sendmsg+0xd0/0x250 drivers/net/ppp/pppoe.c:871
[] sock_sendmsg_nosec net/socket.c:652 [inline]
[] sock_sendmsg+0x54/0x70 net/socket.c:671
[<8012ebfd>] ___sys_sendmsg+0x194/0x3c0 net/socket.c:2292
[<87b8ef6f>] __sys_sendmmsg+0xf4/0x270 net/socket.c:2387
[<2b3700b1>] __do_sys_sendmmsg net/socket.c:2416 [inline]
[<2b3700b1>] __se_sys_sendmmsg net/socket.c:2413 [inline]
[<2b3700b1>] __x64_sys_sendmmsg+0x28/0x30 net/socket.c:2413
[<28e31f9f>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
[<4279bc05>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

BUG: memory leak
unreferenced object 0x888115706600 (size 512):
  comm "syz-executor.0", pid 7122, jiffies 4295041954 (age 13.860s)
  hex dump (first 32 bytes):
23 32 aa aa aa aa aa 0a aa aa aa aa aa 0a 88 64  #2.d
11 00 04 00 00 00 70 72 6f 66 69 6c 65 3d 30 20  ..profile=0
  backtrace:
[] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
[] slab_post_alloc_hook mm/slab.h:439 [inline]
[] slab_alloc_node mm/slab.c:3269 [inline]
[] kmem_cache_alloc_node_trace+0x15b/0x2a0 mm/slab.c:3597
[] __do_kmalloc_node mm/slab.c:3619 [inline]
[] __kmalloc_node_track_caller+0x38/0x50 mm/slab.c:3634
[<8692fea3>] __kmalloc_reserve.isra.0+0x40/0xb0 net/core/skbuff.c:142
[] __alloc_skb+0xa0/0x210 net/core/skbuff.c:210
[<4ce3be0b>] alloc_skb include/linux/skbuff.h:1058 [inline]
[<4ce3be0b>] sock_wmalloc+0x4f/0x80 net/core/sock.c:2077
[<8110994c>] pppoe_sendmsg+0xd0/0x250 drivers/net/ppp/pppoe.c:871
[] sock_sendmsg_nosec net/socket.c:652 [inline]
[] sock_sendmsg+0x54/0x70 net/socket.c:671
[<8012ebfd>] ___sys_sendmsg+0x194/0x3c0 net/socket.c:2292
[<87b8ef6f>] __sys_sendmmsg+0xf4/0x270 net/socket.c:2387
[<2b3700b1>] __do_sys_sendmmsg net/socket.c:2416 [inline]
[<2b3700b1>] __se_sys_sendmmsg net/socket.c:2413 [inline]
[<2b3700b1>] __x64_sys_sendmmsg+0x28/0x30 net/socket.c:2413
[<28e31f9f>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
[<4279bc05>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

BUG: memory leak
unreferenced object 0x888124199600 (size 224):
  comm "syz-executor.0", pid 7122, jiffies 4295041954 (age 13.860s)
  hex dump (first 32 bytes):
00 97 19 24 81 88 ff ff 00 95 19 24 81 88 ff ff  ...$...$
00 00 00 00 00 00 00 00 00 20 0e 2a 81 88 ff ff  . .*
  backtrace:
[<688f689a>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
[<688f689a>] slab_post_alloc_hook mm/slab.h:439 [inline]
[<688f689a>] slab_alloc_node mm/slab.c:3269 [inline]
[<688f689a>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579
[] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198
[<4ce3be0b>] alloc_skb include/linux/skbuff.h:1058 [inline]
[<4ce3be0b>] sock_wmalloc+0x4f/0x80

Re: [PATCH v3 5/8] drivers/soc: xdma: Add PCI device configuration sysfs

2019-05-30 Thread Eduardo Valentin

On Wed, May 29, 2019 at 01:10:05PM -0500, Eddie James wrote:
> The AST2500 has two PCI devices embedded. The XDMA engine can use either
> device to perform DMA transfers. Users need the capability to choose
> which device to use. This commit therefore adds two sysfs files that
> toggle the AST2500 and XDMA engine between the two PCI devices.
> 
> Signed-off-by: Eddie James 
> ---
>  drivers/soc/aspeed/aspeed-xdma.c | 103 
> +--
>  1 file changed, 100 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/soc/aspeed/aspeed-xdma.c 
> b/drivers/soc/aspeed/aspeed-xdma.c
> index 39f6545..ddd5e1e 100644
> --- a/drivers/soc/aspeed/aspeed-xdma.c
> +++ b/drivers/soc/aspeed/aspeed-xdma.c
> @@ -143,6 +143,7 @@ struct aspeed_xdma {
>   void *cmdq_vga_virt;
>   struct gen_pool *vga_pool;
>  
> + char pcidev[4];
>   struct miscdevice misc;
>  };
>  
> @@ -165,6 +166,10 @@ struct aspeed_xdma_client {
>   SCU_PCIE_CONF_VGA_EN_IRQ | SCU_PCIE_CONF_VGA_EN_DMA |
>   SCU_PCIE_CONF_RSVD;
>  
> +static char *_pcidev = "vga";
> +module_param_named(pcidev, _pcidev, charp, 0600);
> +MODULE_PARM_DESC(pcidev, "Default PCI device used by XDMA engine for DMA ops");
> +
>  static void aspeed_scu_pcie_write(struct aspeed_xdma *ctx, u32 conf)
>  {
>   u32 v = 0;
> @@ -512,7 +517,7 @@ static int aspeed_xdma_release(struct inode *inode, 
> struct file *file)
>   .release= aspeed_xdma_release,
>  };
>  
> -static int aspeed_xdma_init_mem(struct aspeed_xdma *ctx)
> +static int aspeed_xdma_init_mem(struct aspeed_xdma *ctx, u32 conf)
>  {
>   int rc;
>   u32 scu_conf = 0;
> @@ -522,7 +527,7 @@ static int aspeed_xdma_init_mem(struct aspeed_xdma *ctx)
>   const u32 vga_sizes[4] = { 0x80, 0x100, 0x200, 0x400 };
>   void __iomem *sdmc_base = ioremap(0x1e6e, 0x100);
>  
> - aspeed_scu_pcie_write(ctx, aspeed_xdma_vga_pcie_conf);
> + aspeed_scu_pcie_write(ctx, conf);
>  
>   regmap_read(ctx->scu, SCU_STRAP, &scu_conf);
>   ctx->vga_size = vga_sizes[FIELD_GET(SCU_STRAP_VGA_MEM, scu_conf)];
> @@ -598,10 +603,91 @@ static int aspeed_xdma_init_mem(struct aspeed_xdma *ctx)
>   return rc;
>  }
>  
> +static int aspeed_xdma_change_pcie_conf(struct aspeed_xdma *ctx, u32 conf)
> +{
> + int rc;
> +
> + mutex_lock(&ctx->start_lock);
> + rc = wait_event_interruptible_timeout(ctx->wait,
> +   !test_bit(XDMA_IN_PRG,
> + &ctx->flags),
> +   msecs_to_jiffies(1000));
> + if (rc < 0) {
> + mutex_unlock(&ctx->start_lock);
> + return -EINTR;
> + }
> +
> + /* previous op didn't complete, wake up waiters anyway */
> + if (!rc)
> + wake_up_interruptible_all(&ctx->wait);
> +
> + reset_control_assert(ctx->reset);
> + msleep(10);
> +
> + aspeed_scu_pcie_write(ctx, conf);
> + msleep(10);
> +
> + reset_control_deassert(ctx->reset);
> + msleep(10);
> +
> + aspeed_xdma_init_eng(ctx);
> +
> + mutex_unlock(&ctx->start_lock);
> +
> + return 0;
> +}
> +
> +static int aspeed_xdma_pcidev_to_conf(struct aspeed_xdma *ctx,
> +   const char *pcidev, u32 *conf)
> +{
> + if (!strcasecmp(pcidev, "vga")) {
> + *conf = aspeed_xdma_vga_pcie_conf;
> + return 0;
> + }
> +
> + if (!strcasecmp(pcidev, "bmc")) {
> + *conf = aspeed_xdma_bmc_pcie_conf;
> + return 0;
> + }

strncasecmp()?

> +
> + return -EINVAL;
> +}
> +
> +static ssize_t aspeed_xdma_show_pcidev(struct device *dev,
> +struct device_attribute *attr,
> +char *buf)
> +{
> + struct aspeed_xdma *ctx = dev_get_drvdata(dev);
> +
> + return snprintf(buf, PAGE_SIZE - 1, "%s", ctx->pcidev);
> +}
> +
> +static ssize_t aspeed_xdma_store_pcidev(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + u32 conf;
> + struct aspeed_xdma *ctx = dev_get_drvdata(dev);
> + int rc = aspeed_xdma_pcidev_to_conf(ctx, buf, &conf);
> +
> + if (rc)
> + return rc;
> +
> + rc = aspeed_xdma_change_pcie_conf(ctx, conf);
> + if (rc)
> + return rc;
> +
> + strcpy(ctx->pcidev, buf);

should we use strncpy() instead?
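
To make the two review comments concrete: ctx->pcidev is a 4-byte array, and a sysfs write from "echo" normally carries a trailing newline, so an unbounded strcpy() of the raw buffer can run past the end. One possible shape for a bounded version, shown as a userspace sketch with hypothetical helper names (pcidev_match(), pcidev_store()), is to match the input against the known names with a length-limited compare and then store the canonical name rather than the user's bytes:

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Return the canonical name matching buf, or NULL if there is none */
static const char *pcidev_match(const char *buf, size_t count)
{
	static const char * const names[] = { "vga", "bmc" };
	size_t i;

	/* Ignore one trailing newline, as written by "echo" */
	if (count && buf[count - 1] == '\n')
		count--;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		size_t len = strlen(names[i]);

		/* Lengths must agree, so "vgax" does not match "vga" */
		if (count == len && !strncasecmp(buf, names[i], len))
			return names[i];
	}

	return NULL;
}

/* Store the canonical name in a 4-byte buffer; -1 if buf is not valid */
static int pcidev_store(char dst[4], const char *buf, size_t count)
{
	const char *name = pcidev_match(buf, count);

	if (!name)
		return -1;

	memcpy(dst, name, strlen(name) + 1);	/* 3 characters plus NUL */

	return 0;
}
```

Copying the canonical name sidesteps the usual strncpy() pitfall of leaving the destination without a terminating NUL when the source is too long.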

> + return count;
> +}
> +static DEVICE_ATTR(pcidev, 0644, aspeed_xdma_show_pcidev,
> +aspeed_xdma_store_pcidev);
> +
>  static int aspeed_xdma_probe(struct platform_device *pdev)
>  {
>   int irq;
>   int rc;
> + u32 conf;
>   struct resource *res;
>   struct device *dev = &pdev->dev;
>   struct aspeed_xdma *ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
> @@ -657,7 +743,14 @@ static int aspeed_xdma_probe(struct platform_device 
> *pdev)
>

Re: linux-next: boot failure after merge of the akpm tree

2019-05-30 Thread Stephen Rothwell

Hi all,

On Fri, 31 May 2019 12:27:58 +1000 Nicholas Piggin  wrote:
>
> > I have reverted
> > 
> >   c353e2997976 ("mm/vmalloc: hugepage vmalloc mappings")
> >   a826492f28d9 ("mm: move ioremap page table mapping function to mm/")
> > 
> > (and my fix up) for today and things seem to work (if only because the
> > BUG() has been removed :-)).  
> 
> Good to know, maybe I didn't test powerpc without later enabling 
> patches...
> 
> The series also has a compile bug on ARM I have to work out, so
> yeah drop those for now, I'll post a v2. The large system map patches
> that I posted in that series can stay I think.

OK, I have removed them from the akpm-current tree today.

-- 
Cheers,
Stephen Rothwell



Re: [v3 7/7] drm: mediatek: adjust dsi and mipi_tx probe sequence

2019-05-30 Thread CK Hu

Hi, Jitao:

On Sun, 2019-05-19 at 17:25 +0800, Jitao Shi wrote:
> mtk_mipi_tx is the phy of mtk_dsi.
> mtk_dsi gets the phy (mtk_mipi_tx) in probe().
> 
> So mtk_mipi_tx should be initialized ahead of mtk_dsi. Otherwise mtk_dsi
> will defer its probe until the mtk_mipi_tx probe is done.

Reviewed-by: CK Hu 

> 
> Signed-off-by: Jitao Shi 
> ---
>  drivers/gpu/drm/mediatek/mtk_drm_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c 
> b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> index cf59ea9bccfd..583d533d9574 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> @@ -633,8 +633,8 @@ static struct platform_driver * const mtk_drm_drivers[] = 
> {
>   &mtk_disp_rdma_driver,
>   &mtk_dpi_driver,
>   &mtk_drm_platform_driver,
> - &mtk_dsi_driver,
>   &mtk_mipi_tx_driver,
> + &mtk_dsi_driver,
>  };
>  
>  static int __init mtk_drm_init(void)

linux-next: build warning after merge of the scsi tree

2019-05-30 Thread Stephen Rothwell

Hi all,

After merging the scsi tree, today's linux-next build (powerpc
ppc64_defconfig) produced this warning:

drivers/scsi/ibmvscsi/ibmvscsi.c: In function 'ibmvscsi_work':
drivers/scsi/ibmvscsi/ibmvscsi.c:2151:5: warning: 'rc' may be used 
uninitialized in this function [-Wmaybe-uninitialized]
  if (rc) {
 ^
drivers/scsi/ibmvscsi/ibmvscsi.c:2121:6: note: 'rc' was declared here
  int rc;
  ^~
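
The body of ibmvscsi_work() is not quoted here, so the following is only a generic sketch of the pattern that usually triggers this warning: 'rc' is assigned in some switch cases but tested afterwards on every path. The enum values and function names are hypothetical; initialising 'rc' (or assigning it in every case) is the usual way to silence it.

```c
/* Hypothetical action states; the real driver's enum is not shown above. */
enum work_action {
	ACTION_NONE,
	ACTION_RESET,
	ACTION_REENABLE,
};

/* Stand-ins for the operations a work thread might perform. */
int do_reset(void)    { return 0; }
int do_reenable(void) { return -1; }

/*
 * Returns 1 if the action failed, 0 otherwise.  Without the "= 0"
 * initialiser, the ACTION_NONE path would reach "if (rc)" with rc
 * never assigned, which is what -Wmaybe-uninitialized reports.
 */
int handle_action(enum work_action action)
{
	int rc = 0;

	switch (action) {
	case ACTION_RESET:
		rc = do_reset();
		break;
	case ACTION_REENABLE:
		rc = do_reenable();
		break;
	case ACTION_NONE:
	default:
		break;
	}

	if (rc)
		return 1;
	return 0;
}
```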

Introduced by commit

  035a3c4046b5 ("scsi: ibmvscsi: redo driver work thread to use enum action 
states")

-- 
Cheers,
Stephen Rothwell



Re: Re: Re: [tip:x86/urgent] x86/mce: Ensure offline CPUs don't participate in rendezvous process

2019-05-30 Thread David Wang

> -Original Mail-
> Sender: Raj, Ashok 
> Time: 2019.05.31 1:11
> To : Tony W Wang-oc 
> CC: tip...@zytor.com; b...@suse.de; h...@zytor.com;
> linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-tip-comm...@vger.kernel.org; mi...@kernel.org; pet...@infradead.org;
> sta...@vger.kernel.org; t...@linutronix.de; tony.l...@intel.com;
> torva...@linux-foundation.org; David Wang ; Ashok
> Raj 
> Topic: Re: Re: Re: [tip:x86/urgent] x86/mce: Ensure offline CPUs don't
> participate in rendezvous process
> 
> On Thu, May 30, 2019 at 09:13:39AM +, Tony W Wang-oc wrote:
> > On Thu, May 30, 2019, Tony W Wang-oc wrote:
> > > Hi Ashok,
> > > I have two questions about this patch, could you help to check:
> > >
> > > 1. For broadcast #MC exceptions, this patch seems to require that
> > > #MC exception errors set MCG_STATUS_RIPV = 1.
> > > But on Intel CPUs, some #MC exception errors set MCG_STATUS_RIPV = 0
> > > (like "Recoverable-not-continuable SRAR Type" errors). For these
> > > errors the patch doesn't seem to work. Is that okay?
> > >
> > > 2. For LMCE exceptions, this patch seems to require that #MC exception
> > > errors set MCG_STATUS_RIPV = 0 to make sure the LMCE is handled normally
> > > even on an offline CPU.
> > > For LMCE errors that set MCG_STATUS_RIPV = 1, the patch prevents an
> > > offline CPU from handling these LMCE errors. Is that okay?
> > >
> >
> > More specifically, this patch seems to require that #MC exceptions meet
> > the condition "MCG_STATUS_RIPV ^ MCG_STATUS_LMCES == 1". But on a Xeon
> > X5650 machine (SMP),
> 
> The offline CPU will never get LMCE=1, since those only happen on the CPU
> that's doing active work. Offline CPUs are just sitting in idle.
So, on Intel CPUs, is LMCE only raised for thread-level (or core-level)
errors? If not, suppose two threads share a level-2 cache, thread 0 is
active, and thread 1 was offlined by software. When an MCE for this level-2
cache occurs, thread 1 will become active. When thread 1 reads
mcgstatus.lmce, will the result always be 0?

Thanks.
> 
> The specific error here is a PCC=1, so irrespective of what happens we do
> capture the errors in the per-cpu log, and the kernel would panic.
> 
> What this patch specifically tries to avoid is leaving an error sitting
> with MCG_STATUS.MCIP=1, so that another recoverable error would shut the
> system down.
> 
> I don't see anything wrong with what this patch does..
> 
> > "Data CACHE Level-2 Generic Error" does not meet this condition.
> >
> > I got below message from:
> > https://www.centos.org/forums/viewtopic.php?p=292742
> >
> > Hardware event. This is not a software error.
> > MCE 0
> > CPU 4 BANK 6 TSC b7065eeaa18b0
> > TIME 1545643603 Mon Dec 24 10:26:43 2018 MCG status:MCIP MCi status:
> > Uncorrected error
> > Error enabled
> > Processor context corrupt
> > MCA: Data CACHE Level-2 Generic Error
> > STATUS b2008106 MCGSTATUS 4
> > MCGCAP 1c09 APICID 4 SOCKETID 0
> > CPUID Vendor Intel Family 6 Model 44
> >
> > > Thanks
> > > Tony W Wang-oc
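
As a cross-check of the condition discussed above, the MCGSTATUS value from the quoted log can be decoded by hand. The bit positions below follow the IA32_MCG_STATUS layout as defined in arch/x86/include/asm/mce.h (RIPV bit 0, EIPV bit 1, MCIP bit 2, LMCE_S bit 3); the helper name is made up for illustration.

```c
#include <stdint.h>

#define MCG_STATUS_RIPV   (1ULL << 0)  /* restart IP valid */
#define MCG_STATUS_EIPV   (1ULL << 1)  /* error IP valid */
#define MCG_STATUS_MCIP   (1ULL << 2)  /* machine check in progress */
#define MCG_STATUS_LMCES  (1ULL << 3)  /* local machine check signalled */

/* Hypothetical helper: evaluates "RIPV ^ LMCES" for a raw register value. */
int ripv_xor_lmces(uint64_t mcgstatus)
{
	int ripv  = !!(mcgstatus & MCG_STATUS_RIPV);
	int lmces = !!(mcgstatus & MCG_STATUS_LMCES);

	return ripv ^ lmces;
}
```

For the logged MCGSTATUS 4, only MCIP is set, so RIPV = 0 and LMCES = 0 and the expression evaluates to 0, consistent with the statement that this error does not meet the condition.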

Re: [PATCH v3 2/8] drivers/soc: Add Aspeed XDMA Engine Driver

2019-05-30 Thread Eduardo Valentin

On Wed, May 29, 2019 at 01:10:02PM -0500, Eddie James wrote:
> The XDMA engine embedded in the AST2500 SOC performs PCI DMA operations
> between the SOC (acting as a BMC) and a host processor in a server.
> 
> This commit adds a driver to control the XDMA engine and adds functions
> to initialize the hardware and memory and start DMA operations.
> 
> Signed-off-by: Eddie James 
> ---
>  MAINTAINERS  |  10 +
>  drivers/soc/aspeed/Kconfig   |   8 +
>  drivers/soc/aspeed/Makefile  |   1 +
>  drivers/soc/aspeed/aspeed-xdma.c | 520 
> +++
>  include/uapi/linux/aspeed-xdma.h |  26 ++
>  5 files changed, 565 insertions(+)
>  create mode 100644 drivers/soc/aspeed/aspeed-xdma.c
>  create mode 100644 include/uapi/linux/aspeed-xdma.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7e09dda..84e2b62 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2584,6 +2584,16 @@ S: Maintained
>  F:   drivers/media/platform/aspeed-video.c
>  F:   Documentation/devicetree/bindings/media/aspeed-video.txt
>  
> +ASPEED XDMA ENGINE DRIVER
> +M:   Eddie James 
> +L:   linux-asp...@lists.ozlabs.org (moderated for non-subscribers)
> +L:   linux-kernel@vger.kernel.org
> +S:   Maintained
> +F:   Documentation/devicetree/bindings/misc/aspeed,xdma.txt
> +F:   Documentation/ABI/testing/sysfs-devices-platform-aspeed-xdma
> +F:   drivers/soc/aspeed/aspeed-xdma.c
> +F:   include/uapi/linux/aspeed-xdma.h
> +
>  ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
>  M:   Corentin Chary 
>  L:   acpi4asus-u...@lists.sourceforge.net
> diff --git a/drivers/soc/aspeed/Kconfig b/drivers/soc/aspeed/Kconfig
> index 323e177..8b08310 100644
> --- a/drivers/soc/aspeed/Kconfig
> +++ b/drivers/soc/aspeed/Kconfig
> @@ -29,4 +29,12 @@ config ASPEED_P2A_CTRL
> ioctl()s, the driver also provides an interface for userspace 
> mappings to
> a pre-defined region.
>  
> +config ASPEED_XDMA
> + tristate "Aspeed XDMA Engine Driver"
> + depends on SOC_ASPEED && REGMAP && MFD_SYSCON && HAS_DMA
> + help
> +   Enable support for the Aspeed XDMA Engine found on the Aspeed AST2500
> +   SOC. The XDMA engine can perform automatic PCI DMA operations between
> +   the AST2500 (acting as a BMC) and a host processor.
> +
>  endmenu
> diff --git a/drivers/soc/aspeed/Makefile b/drivers/soc/aspeed/Makefile
> index b64be47..977b046 100644
> --- a/drivers/soc/aspeed/Makefile
> +++ b/drivers/soc/aspeed/Makefile
> @@ -2,3 +2,4 @@
>  obj-$(CONFIG_ASPEED_LPC_CTRL)  += aspeed-lpc-ctrl.o
>  obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o
>  obj-$(CONFIG_ASPEED_P2A_CTRL)  += aspeed-p2a-ctrl.o
> +obj-$(CONFIG_ASPEED_XDMA)      += aspeed-xdma.o
> diff --git a/drivers/soc/aspeed/aspeed-xdma.c 
> b/drivers/soc/aspeed/aspeed-xdma.c
> new file mode 100644
> index 000..3dc0ce4
> --- /dev/null
> +++ b/drivers/soc/aspeed/aspeed-xdma.c
> @@ -0,0 +1,520 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +// Copyright IBM Corp 2019
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DEVICE_NAME  "aspeed-xdma"
> +
> +#define SCU_STRAP                   0x070
> +#define  SCU_STRAP_VGA_MEM          GENMASK(3, 2)
> +
> +#define SCU_PCIE_CONF               0x180
> +#define  SCU_PCIE_CONF_VGA_EN       BIT(0)
> +#define  SCU_PCIE_CONF_VGA_EN_MMIO  BIT(1)
> +#define  SCU_PCIE_CONF_VGA_EN_LPC   BIT(2)
> +#define  SCU_PCIE_CONF_VGA_EN_MSI   BIT(3)
> +#define  SCU_PCIE_CONF_VGA_EN_MCTP  BIT(4)
> +#define  SCU_PCIE_CONF_VGA_EN_IRQ   BIT(5)
> +#define  SCU_PCIE_CONF_VGA_EN_DMA   BIT(6)
> +#define  SCU_PCIE_CONF_BMC_EN       BIT(8)
> +#define  SCU_PCIE_CONF_BMC_EN_MMIO  BIT(9)
> +#define  SCU_PCIE_CONF_BMC_EN_MSI   BIT(11)
> +#define  SCU_PCIE_CONF_BMC_EN_MCTP  BIT(12)
> +#define  SCU_PCIE_CONF_BMC_EN_IRQ   BIT(13)
> +#define  SCU_PCIE_CONF_BMC_EN_DMA   BIT(14)
> +#define  SCU_PCIE_CONF_RSVD         GENMASK(19, 18)
> +
> +#define SDMC_CONF                   0x004
> +#define  SDMC_CONF_MEM              GENMASK(1, 0)
> +#define SDMC_REMAP                  0x008
> +#define  SDMC_REMAP_MAGIC           GENMASK(17, 16)
> +
> +#define XDMA_CMD_SIZE               4
> +#define XDMA_CMDQ_SIZE              PAGE_SIZE
> +#define XDMA_BYTE_ALIGN             16
> +#define XDMA_MAX_LINE_SIZE          BIT(10)
> +#define XDMA_NUM_CMDS               \
> + (XDMA_CMDQ_SIZE / sizeof(struct aspeed_xdma_cmd))
> +#define XDMA_NUM_DEBUGFS_REGS       6
> +
> +#define XDMA_CMD_BMC_CHECK          BIT(0)
> +#define XDMA_CMD_BMC_ADDR           GENMASK(29, 4)
> +#define XDMA_CMD_BMC_DIR_US

Re: [PATCH v5 3/3] thermal: cpu_cooling: Migrate to using the EM framework

2019-05-30 Thread Viresh Kumar

On 30-05-19, 12:27, Quentin Perret wrote:
> On Thursday 30 May 2019 at 10:20:38 (+0100), Quentin Perret wrote:
> > The newly introduced Energy Model framework manages power cost tables in
> > a generic way. Moreover, it supports several types of models since the
> > tables can come from DT or firmware (through SCMI) for example. On the
> > other hand, the cpu_cooling subsystem manages its own power cost tables
> > using only DT data.
> > 
> > In order to avoid the duplication of data in the kernel, and in order to
> > enable IPA with EMs coming from more than just DT, remove the private
> > tables from cpu_cooling.c and migrate it to using the centralized EM
> > framework. Doing so should have no visible functional impact for
> > existing users of IPA since:
> > 
> > >  - recent extensions to the PM_OPP infrastructure enable the
> > >    registration of EMs in PM_EM using the DT property used by IPA;
> > > 
> > >  - the existing upstream cpufreq drivers marked with the
> > >    'CPUFREQ_IS_COOLING_DEV' flag all use the aforementioned PM_OPP
> > >    infrastructure, which means they all support PM_EM. The only two
> > >    exceptions are qoriq-cpufreq which doesn't in fact use an EM and
> > >    scmi-cpufreq which doesn't use DT for power costs.
> > 
> > For existing users of cpu_cooling, PM_EM tables will contain the exact
> > same power values that IPA used to compute on its own until now. The
> > only new dependency for them is to compile in CONFIG_ENERGY_MODEL.
> > 
> > The case where the thermal subsystem is used without an Energy Model
> > (cpufreq_cooling_ops) is handled by looking directly at CPUFreq's
> > frequency table which is already a dependency for cpu_cooling.c anyway.
> > Since the thermal framework expects the cooling states in a particular
> > order, bail out whenever the CPUFreq table is unsorted, since that is
> > fairly uncommon in general, and there are currently no users of
> > cpu_cooling for this use-case.
> > 
> > Acked-by: Viresh Kumar 
> 
> Viresh: the patch hasn't changed much so I kept this, but please shout
> if you're not happy with the new version :-)

Yeah, it  looked fine and so I didn't complain :)

-- 
viresh

[PATCH 1/1] PCI/IOV: fix cfg_size setting for multiple vfio-devices

2019-05-30 Thread Hao Zheng

When there are multiple vfio devices, only the cfg_size of the first
vfio device is correctly set to 4096. The cfg_size of the other
vfio devices is incorrectly set to 256.

This causes an error when live migrating a virtual machine with
vfio devices to multiple destinations. Fix this by setting the correct
pcie_cap field before cfg_size is initialized.

Signed-off-by: Hao Zheng 
Signed-off-by: Quan Xu 
cc: Zou Nanhai 
---
 drivers/pci/iov.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 3aa115e..239fad1 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -133,6 +133,7 @@ static void pci_read_vf_config_common(struct pci_dev 
*virtfn)
pci_read_config_word(virtfn, PCI_SUBSYSTEM_ID,
                             &physfn->sriov->subsystem_device);
 
+   set_pcie_port_type(virtfn);
physfn->sriov->cfg_size = pci_cfg_space_size(virtfn);
 }
 
-- 
1.8.3.1

Re: [PATCH 1/2] scsi_host: add support for request batching

2019-05-30 Thread Ming Lei

On Thu, May 30, 2019 at 7:28 PM Paolo Bonzini  wrote:
>
> This allows a list of requests to be issued, with the LLD only writing
> the hardware doorbell when necessary, after the last request was prepared.
> This is more efficient if we have lists of requests to issue, particularly
> on virtualized hardware, where writing the doorbell is more expensive than
> on real hardware.
>
> The use case for this is plugged IO, where blk-mq flushes a batch of
> requests all at once.
>
> The API is the same as for blk-mq, just with blk-mq concepts tweaked to
> fit the SCSI subsystem API: the "last" flag in blk_mq_queue_data becomes
> a flag in scsi_cmnd, while the queue_num in the commit_rqs callback is
> extracted from the hctx and passed as a parameter.
>
> The only complication is that blk-mq uses different plugging heuristics
> depending on whether commit_rqs is present or not.  So we have two
> different sets of blk_mq_ops and pick one depending on whether the
> scsi_host template uses commit_rqs or not.
>
> Signed-off-by: Paolo Bonzini 
> ---
>  drivers/scsi/scsi_lib.c  | 37 ++---
>  include/scsi/scsi_cmnd.h |  1 +
>  include/scsi/scsi_host.h | 16 ++--
>  3 files changed, 49 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 601b9f1de267..eb4e67d02bfe 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1673,10 +1673,11 @@ static blk_status_t scsi_queue_rq(struct 
> blk_mq_hw_ctx *hctx,
> blk_mq_start_request(req);
> }
>
> +   cmd->flags &= SCMD_PRESERVED_FLAGS;
> if (sdev->simple_tags)
> cmd->flags |= SCMD_TAGGED;
> -   else
> -   cmd->flags &= ~SCMD_TAGGED;
> +   if (bd->last)
> +   cmd->flags |= SCMD_LAST;
>
> scsi_init_cmd_errh(cmd);
> cmd->scsi_done = scsi_mq_done;
> @@ -1807,10 +1808,37 @@ void __scsi_init_queue(struct Scsi_Host *shost, 
> struct request_queue *q)
>  }
>  EXPORT_SYMBOL_GPL(__scsi_init_queue);
>
> +static const struct blk_mq_ops scsi_mq_ops_no_commit = {
> +   .get_budget = scsi_mq_get_budget,
> +   .put_budget = scsi_mq_put_budget,
> +   .queue_rq   = scsi_queue_rq,
> +   .complete   = scsi_softirq_done,
> +   .timeout= scsi_timeout,
> +#ifdef CONFIG_BLK_DEBUG_FS
> +   .show_rq= scsi_show_rq,
> +#endif
> +   .init_request   = scsi_mq_init_request,
> +   .exit_request   = scsi_mq_exit_request,
> +   .initialize_rq_fn = scsi_initialize_rq,
> +   .busy   = scsi_mq_lld_busy,
> +   .map_queues = scsi_map_queues,
> +};
> +
> +
> +static void scsi_commit_rqs(struct blk_mq_hw_ctx *hctx)
> +{
> +   struct request_queue *q = hctx->queue;
> +   struct scsi_device *sdev = q->queuedata;
> +   struct Scsi_Host *shost = sdev->host;
> +
> +   shost->hostt->commit_rqs(shost, hctx->queue_num);
> +}

It should be fine to implement scsi_commit_rqs() as:

 if (shost->hostt->commit_rqs)
   shost->hostt->commit_rqs(shost, hctx->queue_num);

then scsi_mq_ops_no_commit can be saved.

Because .commit_rqs() is only called when BLK_STS_*_RESOURCE is
returned from scsi_queue_rq(), at that time shost->hostt->commit_rqs should
have been hit from cache given .queuecommand is called via
host->hostt->queuecommand.

Not to mention that BLK_STS_*_RESOURCE is usually only returned for devices
with a small queue depth.
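
The suggestion above amounts to a NULL check on an optional callback. A minimal user-space sketch with simplified stand-in types follows; none of these are the kernel's struct Scsi_Host or scsi_host_template, and the doorbell counter exists only for the demonstration.

```c
#include <stddef.h>

/* Simplified stand-ins for the SCSI host structures; not the kernel types. */
struct host;

struct host_template {
	/* Optional: may be NULL if the LLD does not batch requests. */
	void (*commit_rqs)(struct host *shost, unsigned int hwq);
};

struct host {
	const struct host_template *hostt;
	unsigned int doorbell_writes;	/* counts calls, for the demo only */
};

/* Example LLD callback: "ring the doorbell" once per batch. */
void demo_commit_rqs(struct host *shost, unsigned int hwq)
{
	(void)hwq;
	shost->doorbell_writes++;
}

/* One commit hook for every host: call the LLD only if it opted in. */
void commit_rqs(struct host *shost, unsigned int hwq)
{
	if (shost->hostt->commit_rqs)
		shost->hostt->commit_rqs(shost, hwq);
}
```

Note that the quoted patch description gives a reason for keeping two blk_mq_ops tables (blk-mq picks its plugging heuristics based on whether commit_rqs is set), which this sketch does not model.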

Thanks,
Ming Lei

Re: [LKP] [SUNRPC] 0472e47660: fsmark.app_overhead 16.0% regression

2019-05-30 Thread Xing Zhengjun





On 5/31/2019 3:10 AM, Trond Myklebust wrote:

On Thu, 2019-05-30 at 15:20 +0800, Xing Zhengjun wrote:


On 5/30/2019 10:00 AM, Trond Myklebust wrote:

Hi Xing,

On Thu, 2019-05-30 at 09:35 +0800, Xing Zhengjun wrote:

Hi Trond,

On 5/20/2019 1:54 PM, kernel test robot wrote:

Greeting,

FYI, we noticed a 16.0% regression of fsmark.app_overhead due to commit:


commit: 0472e476604998c127f3c80d291113e77c5676ac ("SUNRPC: Convert socket page send code to use iov_iter()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: fsmark
on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @
3.00GHz with 384G memory
with following parameters:

iterations: 1x
nr_threads: 64t
disk: 1BRD_48G
fs: xfs
fs2: nfsv4
filesize: 4M
test_size: 40G
sync_method: fsyncBeforeClose
cpufreq_governor: performance

test-description: The fsmark is a file system benchmark to test
synchronous write workloads, for example, mail servers
workload.
test-url: https://sourceforge.net/projects/fsmark/



Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

   git clone https://github.com/intel/lkp-tests.git
   cd lkp-tests
   bin/lkp install job.yaml  # job file is attached in
this
email
   bin/lkp run job.yaml

=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
  gcc-7/performance/1BRD_48G/4M/nfsv4/xfs/1x/x86_64-rhel-7.6/64t/debian-x86_64-2018-04-03.cgz/fsyncBeforeClose/lkp-ivb-ep01/40G/fsmark

commit:
  e791f8e938 ("SUNRPC: Convert xs_send_kvec() to use iov_iter_kvec()")
  0472e47660 ("SUNRPC: Convert socket page send code to use iov_iter()")

e791f8e9380d945e 0472e476604998c127f3c80d291
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :4           50%           2:4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
         %stddev     %change         %stddev
             \          |                \
  15118573 ±  2%     +16.0%   17538083        fsmark.app_overhead
    510.93           -22.7%     395.12        fsmark.files_per_sec
     24.90           +22.8%      30.57        fsmark.time.elapsed_time
     24.90           +22.8%      30.57        fsmark.time.elapsed_time.max
    288.00 ±  2%     -27.8%     208.00        fsmark.time.percent_of_cpu_this_job_got
     70.03 ±  2%     -11.3%      62.14        fsmark.time.system_time


Do you have time to take a look at this regression?


From your stats, it looks to me as if the problem is increased NUMA
overhead. Pretty much everything else appears to be the same or
actually performing better than previously. Am I interpreting that
correctly?

The real regression is that the throughput (fsmark.files_per_sec)
decreased by 22.7%.
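
For reference, the percentages in the report are plain relative changes between the two commits. A small sketch to reproduce them from the quoted numbers (the function name is arbitrary):

```c
/* Relative change from base to value, in percent. */
double percent_change(double base, double value)
{
	return (value - base) / base * 100.0;
}
```

With the quoted values, percent_change(15118573, 17538083) is about +16.0 and percent_change(510.93, 395.12) is about -22.7, which matches the report.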


Understood, but I'm trying to make sense of why. I'm not able to
reproduce this, so I have to rely on your performance stats to
understand where the 22.7% regression is coming from. As far as I can
see, the only numbers in the stats you published that are showing a
performance regression (other than the fsmark number itself), are the
NUMA numbers. Is that a correct interpretation?


We re-tested the case yesterday and the result is almost the same.
We will do more tests and also check the test case itself. If you need
more information, please let me know. Thanks.


If my interpretation above is correct, then I'm not seeing where this
patch would be introducing new NUMA regressions. It is just converting
from using one method of doing socket I/O to another. Could it perhaps
be a memory artefact due to your running the NFS client and server on
the same machine?

Apologies for pushing back a little, but I just don't have the
hardware available to test NUMA configurations, so I'm relying on
external testing for the above kind of scenario.


Thanks for looking at this.  If you need more information, please let me
know.

Thanks
Trond



--
Zhengjun Xing

Re: [PATCH v3 1/1] arm64: dts: rockchip: add core dtsi file for RK3399Pro SoCs

2019-05-30 Thread Manivannan Sadhasivam

On Thu, May 30, 2019 at 08:08:48AM +0800, Jianqun Xu wrote:
> This patch adds a core dtsi file for Rockchip RK3399Pro SoCs, which
> includes rk3399.dtsi. Also enable pcie0/pcie_phy for the AP to
> talk to the NPU part inside the SoC.
> 
> Signed-off-by: Jianqun Xu 
> ---
> changes since v2:
> - only enable pcie0 and pcie_phy nodes, thanks for Heiko and manivannan
> 
> changes since v1:
> - remove dfi and dmc
> 
>  arch/arm64/boot/dts/rockchip/rk3399pro.dtsi | 22 +
>  1 file changed, 22 insertions(+)
>  create mode 100644 arch/arm64/boot/dts/rockchip/rk3399pro.dtsi
> 
> diff --git a/arch/arm64/boot/dts/rockchip/rk3399pro.dtsi 
> b/arch/arm64/boot/dts/rockchip/rk3399pro.dtsi
> new file mode 100644
> index ..bb5ebf6608b9
> --- /dev/null
> +++ b/arch/arm64/boot/dts/rockchip/rk3399pro.dtsi
> @@ -0,0 +1,22 @@
> +// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
> +// Copyright (c) 2019 Fuzhou Rockchip Electronics Co., Ltd.
> +
> +#include "rk3399.dtsi"
> +
> +/ {
> + compatible = "rockchip,rk3399pro";
> +};
> +
> +/* Default to enabled since AP talk to NPU part over pcie */
> +&pcie_phy {
> + status = "okay";
> +};
> +
> +/* Default to enabled since AP talk to NPU part over pcie */
> +&pcie0 {
> + ep-gpios = < RK_PB4 GPIO_ACTIVE_HIGH>;
> + num-lanes = <4>;
> + pinctrl-names = "default";
> + pinctrl-0 = <&pcie_clkreqn_cpm>;

No pinctrl config for ep-gpio? Other than that, it looks good to me.

Acked-by: Manivannan Sadhasivam 

Thanks,
Mani
> + status = "okay";
> +};
> -- 
> 2.17.1
> 
> 
>

Re: [PATCH v2] checkpatch.pl: Warn on duplicate sysctl local variable

2019-05-30 Thread Joe Perches

On Fri, 2019-05-31 at 03:12 +0200, Matteo Croce wrote:
> Commit 6a33853c5773 ("proc/sysctl: add shared variables for range check")
> adds some shared const variables to be used instead of a local copy in
> each source file.
> Warn when a chunk duplicates one of these values in a ctl_table struct:
> 
> $ scripts/checkpatch.pl 0001-test-commit.patch
> WARNING: duplicated sysctl range checking value 'zero', consider using 
> the shared one in include/linux/sysctl.h
> #27: FILE: arch/arm/kernel/isa.c:48:
> +   .extra1 = &zero,
> 
> WARNING: duplicated sysctl range checking value 'int_max', consider using 
> the shared one in include/linux/sysctl.h
> #28: FILE: arch/arm/kernel/isa.c:49:
> +   .extra2 = &int_max,
> 
> total: 0 errors, 2 warnings, 14 lines checked
> 
> Signed-off-by: Matteo Croce 
> ---
>  scripts/checkpatch.pl | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 342c7c781ba5..629c31435487 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -6639,6 +6639,12 @@ sub process {
>"unknown module license " . 
> $extracted_string . "\n" . $herecurr);
>   }
>   }
> +
> +# check for sysctl duplicate constants
> + if ($line =~ /\.extra[12]\s*=\s*&(zero|one|int_max|max_int)\b/) 
> {

why max_int, there isn't a single use of it in the kernel ?

Re: [PATCH net-next] netfilter: nf_conntrack_bridge: Fix build error without IPV6

2019-05-30 Thread Yuehaibing

+cc netdev

On 2019/5/31 10:46, YueHaibing wrote:
> Fix gcc build error while CONFIG_IPV6 is not set
> 
> In file included from net/netfilter/core.c:19:0:
> ./include/linux/netfilter_ipv6.h: In function 'nf_ipv6_br_defrag':
> ./include/linux/netfilter_ipv6.h:110:9: error: implicit declaration of 
> function 'nf_ct_frag6_gather' [-Werror=implicit-function-declaration]
> 
> Reported-by: Hulk Robot 
> Fixes: 764dd163ac92 ("netfilter: nf_conntrack_bridge: add support for IPv6")
> Signed-off-by: YueHaibing 
> ---
>  include/linux/netfilter_ipv6.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/linux/netfilter_ipv6.h b/include/linux/netfilter_ipv6.h
> index a21b8c9..4ea97fd 100644
> --- a/include/linux/netfilter_ipv6.h
> +++ b/include/linux/netfilter_ipv6.h
> @@ -96,6 +96,8 @@ static inline int nf_ip6_route(struct net *net, struct 
> dst_entry **dst,
>  #endif
>  }
>  
> +int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user);
> +
>  static inline int nf_ipv6_br_defrag(struct net *net, struct sk_buff *skb,
>   u32 user)
>  {
>

Re: [PATCH] ipv6: Prevent overrun when parsing v6 header options

2019-05-30 Thread Yang Xiao

On Fri, May 31, 2019 at 1:17 AM Eric Dumazet  wrote:
>
>
>
> On 5/30/19 8:28 AM, Young Xiao wrote:
> > The fragmentation code tries to parse the header options in order
> > to figure out where to insert the fragment option.  Since nexthdr points
> > to an invalid option, the calculation of the size of the network header
> > can made to be much larger than the linear section of the skb and data
> > is read outside of it.
> >
> > This vulnerability is similar to CVE-2017-9074.
> >
> > Signed-off-by: Young Xiao <92siuy...@gmail.com>
> > ---
> >  net/ipv6/mip6.c | 24 ++--
> >  1 file changed, 14 insertions(+), 10 deletions(-)
> >
> > diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c
> > index 64f0f7b..30ed1c5 100644
> > --- a/net/ipv6/mip6.c
> > +++ b/net/ipv6/mip6.c
> > @@ -263,8 +263,6 @@ static int mip6_destopt_offset(struct xfrm_state *x, 
> > struct sk_buff *skb,
> >  u8 **nexthdr)
> >  {
> >   u16 offset = sizeof(struct ipv6hdr);
> > - struct ipv6_opt_hdr *exthdr =
> > -(struct ipv6_opt_hdr *)(ipv6_hdr(skb) + 1);
> >   const unsigned char *nh = skb_network_header(skb);
> >   unsigned int packet_len = skb_tail_pointer(skb) -
> >   skb_network_header(skb);
> > @@ -272,7 +270,8 @@ static int mip6_destopt_offset(struct xfrm_state *x, 
> > struct sk_buff *skb,
> >
> >   *nexthdr = _hdr(skb)->nexthdr;
> >
> > - while (offset + 1 <= packet_len) {
> > + while (offset <= packet_len) {
> > + struct ipv6_opt_hdr *exthdr;
> >
> >   switch (**nexthdr) {
> >   case NEXTHDR_HOP:
> > @@ -299,12 +298,15 @@ static int mip6_destopt_offset(struct xfrm_state *x, 
> > struct sk_buff *skb,
> >   return offset;
> >   }
> >
> > + if (offset + sizeof(struct ipv6_opt_hdr) > packet_len)
> > + return -EINVAL;
> > +
> > + exthdr = (struct ipv6_opt_hdr *)(nh + offset);
> >   offset += ipv6_optlen(exthdr);
> >   *nexthdr = >nexthdr;
> > - exthdr = (struct ipv6_opt_hdr *)(nh + offset);
> >   }
> >
> > - return offset;
> > + return -EINVAL;
> >  }
> >
>
>
> Ok, but have you checked that callers have been fixed ?

I've checked the callers. There are two callers:
xfrm6_transport_output() and xfrm6_ro_output(). Both functions
check the return value:

--
hdr_len = x->type->hdr_offset(x, skb, );
if (hdr_len < 0)
return hdr_len;
--
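
The bounds check that the patch adds before dereferencing each extension header can be sketched in user space as follows. This is a simplified model, not the mip6 code: it ignores the home-address and routing-header special cases, and the struct, macro and function names are stand-ins (only the NEXTHDR_* values and the "(hdrlen + 1) * 8" length rule are taken from the IPv6 definitions).

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the first two bytes of an IPv6 extension header. */
struct opt_hdr {
	uint8_t nexthdr;
	uint8_t hdrlen;		/* in 8-byte units, not counting the first 8 */
};

#define OPT_LEN(h)	(((size_t)(h)->hdrlen + 1) << 3)
#define NEXTHDR_HOP	0	/* hop-by-hop options */
#define NEXTHDR_ROUTING	43
#define NEXTHDR_DEST	60

/*
 * Walk extension headers in buf[0..len) starting at offset, and return the
 * offset of the first header that is not hop-by-hop, routing or destination
 * options.  Returns -1 if a header would extend past the buffer.
 */
long skip_ext_headers(const uint8_t *buf, size_t len, uint8_t nexthdr,
		      size_t offset)
{
	while (offset <= len) {
		const struct opt_hdr *hdr;

		switch (nexthdr) {
		case NEXTHDR_HOP:
		case NEXTHDR_ROUTING:
		case NEXTHDR_DEST:
			break;
		default:
			return (long)offset;
		}

		/* Check before dereferencing: the 2-byte header must fit. */
		if (offset + sizeof(*hdr) > len)
			return -1;

		hdr = (const struct opt_hdr *)(buf + offset);
		offset += OPT_LEN(hdr);
		nexthdr = hdr->nexthdr;
	}

	return -1;
}
```

A negative return here plays the role of the -EINVAL that the callers test for with "hdr_len < 0".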
>
> xfrm6_transport_output() seems buggy as well,
> unless the skbs are linearized before entering these functions ?
I don't understand what you mean by this comment.
Could you explain it in more detail?

>
> Thanks.
>
>
>

[PATCH] Subject:alignment:fetch pc-instr before irq_enable

2019-05-30 Thread xiaoqian

When do_alignment() reads the instruction at the PC address through
__probe_kernel_read(), the page backing that code may be reclaimed at
exactly that moment. Because __probe_kernel_read() runs with
pagefault_disable() in effect, the mapping cannot be re-established,
so reading the PC instruction fails and the alignment fixup does not
succeed.
Read the user-mode PC instruction before local_irq_enable() to avoid
this risk.
At the same time, reorder the code and streamline the processing.

Signed-off-by: xiaoqian 
Cc: sta...@vger.kernel.org
---
 arch/arm/mm/alignment.c | 81 +
 1 file changed, 55 insertions(+), 26 deletions(-)

diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index e376883ab35b..4124b9ce3c70 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -76,6 +76,11 @@
 #define IS_T32(hi16) \
(((hi16) & 0xe000) == 0xe000 && ((hi16) & 0x1800))
 
+#define INVALID_INSTR_MODE 0
+#define ARM_INSTR_MODE 1
+#define THUMB_INSTR_MODE   2
+#define THUMB2_INSTR_MODE  3
+
 static unsigned long ai_user;
 static unsigned long ai_sys;
 static void *ai_sys_last_pc;
@@ -705,6 +710,48 @@ thumb2arm(u16 tinstr)
}
 }
 
+static unsigned int
+fetch_usr_pc_instr(struct pt_regs *regs, unsigned long *pc_instrptr)
+{
+   unsigned int fault;
+   unsigned long instrptr;
+   unsigned long instr_mode = INVALID_INSTR_MODE;
+
+   instrptr = instruction_pointer(regs);
+
+   if (thumb_mode(regs)) {
+   u16 tinstr = 0;
+   u16 *ptr = (u16 *)(instrptr & ~1);
+
+   fault = probe_kernel_address(ptr, tinstr);
+   if (!fault) {
+   tinstr = __mem_to_opcode_thumb16(tinstr);
+   if (cpu_architecture() >= CPU_ARCH_ARMv7 &&
+   IS_T32(tinstr)) {
+   /* Thumb-2 32-bit */
+   u16 tinstr2 = 0;
+
+   fault = probe_kernel_address(ptr + 1, tinstr2);
+   if (!fault) {
+   tinstr2 = 
__mem_to_opcode_thumb16(tinstr2);
+   *pc_instrptr = 
__opcode_thumb32_compose(tinstr, tinstr2);
+   instr_mode = THUMB2_INSTR_MODE;
+   }
+   } else {
+   *pc_instrptr = thumb2arm(tinstr);
+   instr_mode = THUMB_INSTR_MODE;
+   }
+   }
+   } else {
+   fault = probe_kernel_address((void *)instrptr, *pc_instrptr);
+   if (!fault) {
+   *pc_instrptr = __mem_to_opcode_arm(*pc_instrptr);
+   instr_mode = ARM_INSTR_MODE;
+   }
+   }
+   return instr_mode;
+}
+
 /*
  * Convert Thumb-2 32 bit LDM, STM, LDRD, STRD to equivalent instruction
  * handlable by ARM alignment handler, also find the corresponding handler,
@@ -775,42 +822,24 @@ do_alignment(unsigned long addr, unsigned int fsr, struct 
pt_regs *regs)
unsigned long instr = 0, instrptr;
int (*handler)(unsigned long addr, unsigned long instr, struct pt_regs 
*regs);
unsigned int type;
-   unsigned int fault;
u16 tinstr = 0;
int isize = 4;
int thumb2_32b = 0;
+   unsigned long pc_instr_mode;
+
+   pc_instr_mode = fetch_usr_pc_instr(regs, &instr);
 
if (interrupts_enabled(regs))
local_irq_enable();
 
instrptr = instruction_pointer(regs);
-
-   if (thumb_mode(regs)) {
-   u16 *ptr = (u16 *)(instrptr & ~1);
-   fault = probe_kernel_address(ptr, tinstr);
-   tinstr = __mem_to_opcode_thumb16(tinstr);
-   if (!fault) {
-   if (cpu_architecture() >= CPU_ARCH_ARMv7 &&
-   IS_T32(tinstr)) {
-   /* Thumb-2 32-bit */
-   u16 tinst2 = 0;
-   fault = probe_kernel_address(ptr + 1, tinst2);
-   tinst2 = __mem_to_opcode_thumb16(tinst2);
-   instr = __opcode_thumb32_compose(tinstr, 
tinst2);
-   thumb2_32b = 1;
-   } else {
-   isize = 2;
-   instr = thumb2arm(tinstr);
-   }
-   }
-   } else {
-   fault = probe_kernel_address((void *)instrptr, instr);
-   instr = __mem_to_opcode_arm(instr);
-   }
-
-   if (fault) {
+   if (pc_instr_mode == INVALID_INSTR_MODE) {
type = TYPE_FAULT;
goto bad_or_fault;
+   } else if

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-05-30 Thread Aaron Lu

On 2019/5/30 22:04, Aubrey Li wrote:
> On Thu, May 30, 2019 at 4:36 AM Vineeth Remanan Pillai
>  wrote:
>>
>> Third iteration of the Core-Scheduling feature.
>>
>> This version fixes mostly correctness related issues in v2 and
>> addresses performance issues. Also, addressed some crashes related
>> to cgroups and cpu hotplugging.
>>
>> We have tested and verified that incompatible processes are not
>> selected during schedule. In terms of performance, the impact
>> depends on the workload:
>> - on CPU intensive applications that use all the logical CPUs with
>>   SMT enabled, enabling core scheduling performs better than nosmt.
>> - on mixed workloads with considerable io compared to cpu usage,
>>   nosmt seems to perform better than core scheduling.
> 
> My testing scripts can not be completed on this version. I figured out the
> number of cpu utilization report entry didn't reach my minimal requirement.
> Then I wrote a simple script to verify.
> 
> $ cat test.sh
> #!/bin/sh
> 
> for i in `seq 1 10`
> do
> echo `date`, $i
> sleep 1
> done
> 

Is the shell put to some cgroup and assigned some tag or simply untagged?

> 
> Normally it works as below:
> 
> Thu May 30 14:13:40 CST 2019, 1
> Thu May 30 14:13:41 CST 2019, 2
> Thu May 30 14:13:42 CST 2019, 3
> Thu May 30 14:13:43 CST 2019, 4
> Thu May 30 14:13:44 CST 2019, 5
> Thu May 30 14:13:45 CST 2019, 6
> Thu May 30 14:13:46 CST 2019, 7
> Thu May 30 14:13:47 CST 2019, 8
> Thu May 30 14:13:48 CST 2019, 9
> Thu May 30 14:13:49 CST 2019, 10
> 
> When the system was running 32 sysbench threads and
> 32 gemmbench threads, it worked as below(the system
> has ~38% idle time)

Are the two workloads assigned different tags?
And how many cores/threads do you have?

> Thu May 30 14:14:20 CST 2019, 1
> Thu May 30 14:14:21 CST 2019, 2
> Thu May 30 14:14:22 CST 2019, 3
> Thu May 30 14:14:24 CST 2019, 4 <===x=
> Thu May 30 14:14:25 CST 2019, 5
> Thu May 30 14:14:26 CST 2019, 6
> Thu May 30 14:14:28 CST 2019, 7 <===x=
> Thu May 30 14:14:29 CST 2019, 8
> Thu May 30 14:14:31 CST 2019, 9 <===x=
> Thu May 30 14:14:34 CST 2019, 10 <===x=

This feels like "date" failed to schedule on some CPU
on time.

> And it got worse when the system was running 64/64 case,
> the system still had ~3% idle time
> Thu May 30 14:26:40 CST 2019, 1
> Thu May 30 14:26:46 CST 2019, 2
> Thu May 30 14:26:53 CST 2019, 3
> Thu May 30 14:27:01 CST 2019, 4
> Thu May 30 14:27:03 CST 2019, 5
> Thu May 30 14:27:11 CST 2019, 6
> Thu May 30 14:27:31 CST 2019, 7
> Thu May 30 14:27:32 CST 2019, 8
> Thu May 30 14:27:41 CST 2019, 9
> Thu May 30 14:27:56 CST 2019, 10
> 
> Any thoughts?

My first reaction is: when the shell wakes up from sleep, it will
fork date. If the script is untagged, those workloads are tagged,
and all available cores are already running workload threads, the
forked date can lose to the running workload threads because
__prio_less() can't properly compare vruntime for tasks on
different CPUs. So the idle siblings can't run date and stay idle
instead. See my previous post on this:

https://lore.kernel.org/lkml/20190429033620.GA128241@aaronlu/
(Now that I re-read my post, I see that I didn't make it clear
that se_bash and se_hog are assigned different tags, e.g. hog is
tagged and bash is untagged.)

Siblings being forced idle is expected due to the nature of core
scheduling, but when two tasks belonging to two siblings are
competing to be scheduled, we should let the higher priority one win.

It probably used to work on v2 because we mistakenly allowed
differently tagged tasks to schedule on the same core at the same
time, but that is fixed in v3.
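
To illustrate the vruntime point above, here is a toy userspace model,
not kernel code. The names toy_rq, toy_task, raw_before() and
normalized_before() are made up for this sketch, and normalizing against
each runqueue's min_vruntime is only one assumed way to make the
comparison meaningful; it is not necessarily what __prio_less() does or
will do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified per-CPU runqueue. */
struct toy_rq {
	uint64_t min_vruntime;	/* baseline that this queue's vruntimes track */
};

/* Hypothetical task: vruntime is only meaningful relative to its own rq. */
struct toy_task {
	uint64_t vruntime;
	const struct toy_rq *rq;
};

/* Compare raw vruntime values, ignoring which runqueue they came from. */
static bool raw_before(const struct toy_task *a, const struct toy_task *b)
{
	return (int64_t)(a->vruntime - b->vruntime) < 0;
}

/* Compare each task's distance from its own runqueue baseline instead. */
static bool normalized_before(const struct toy_task *a,
			      const struct toy_task *b)
{
	int64_t da = (int64_t)(a->vruntime - a->rq->min_vruntime);
	int64_t db = (int64_t)(b->vruntime - b->rq->min_vruntime);

	return da < db;
}
```

If "date" sits on a runqueue whose baseline happens to be far ahead of
the sibling's, the raw comparison lets the workload thread win even
though date has used less CPU time relative to its own queue.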

[PATCH v8 4/4] arm64: dts: qcom: sdm845: Add Q6V5 MSS node

2019-05-30 Thread Bjorn Andersson

From: Sibi Sankar 

This patch adds Q6V5 MSS remoteproc node for SDM845 SoCs.

Reviewed-by: Douglas Anderson 
Reviewed-by: Vinod Koul 
Signed-off-by: Sibi Sankar 
Signed-off-by: Bjorn Andersson 
---

Changes since v7:
- None

 arch/arm64/boot/dts/qcom/sdm845.dtsi | 58 
 1 file changed, 58 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index b25c251b6503..978ceaec78cb 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -1671,6 +1671,64 @@
};
};
 
+   mss_pil: remoteproc@408 {
+   compatible = "qcom,sdm845-mss-pil";
+   reg = <0 0x0408 0 0x408>, <0 0x0418 0 0x48>;
+   reg-names = "qdsp6", "rmb";
+
+   interrupts-extended =
+   < GIC_SPI 266 IRQ_TYPE_EDGE_RISING>,
+   <_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
+   <_smp2p_in 1 IRQ_TYPE_EDGE_RISING>,
+   <_smp2p_in 2 IRQ_TYPE_EDGE_RISING>,
+   <_smp2p_in 3 IRQ_TYPE_EDGE_RISING>,
+   <_smp2p_in 7 IRQ_TYPE_EDGE_RISING>;
+   interrupt-names = "wdog", "fatal", "ready",
+ "handover", "stop-ack",
+ "shutdown-ack";
+
+   clocks = < GCC_MSS_CFG_AHB_CLK>,
+< GCC_MSS_Q6_MEMNOC_AXI_CLK>,
+< GCC_BOOT_ROM_AHB_CLK>,
+< GCC_MSS_GPLL0_DIV_CLK_SRC>,
+< GCC_MSS_SNOC_AXI_CLK>,
+< GCC_MSS_MFAB_AXIS_CLK>,
+< GCC_PRNG_AHB_CLK>,
+< RPMH_CXO_CLK>;
+   clock-names = "iface", "bus", "mem", "gpll0_mss",
+ "snoc_axi", "mnoc_axi", "prng", "xo";
+
+   qcom,smem-states = <_smp2p_out 0>;
+   qcom,smem-state-names = "stop";
+
+   resets = <_reset AOSS_CC_MSS_RESTART>,
+<_reset PDC_MODEM_SYNC_RESET>;
+   reset-names = "mss_restart", "pdc_reset";
+
+   qcom,halt-regs = <_mutex_regs 0x23000 0x25000 0x24000>;
+
+   power-domains = <_qmp 2>,
+   < SDM845_CX>,
+   < SDM845_MX>,
+   < SDM845_MSS>;
+   power-domain-names = "load_state", "cx", "mx", "mss";
+
+   mba {
+   memory-region = <_region>;
+   };
+
+   mpss {
+   memory-region = <_region>;
+   };
+
+   glink-edge {
+   interrupts = ;
+   label = "modem";
+   qcom,remote-pid = <1>;
+   mboxes = <_shared 12>;
+   };
+   };
+
gpucc: clock-controller@509 {
compatible = "qcom,sdm845-gpucc";
reg = <0 0x0509 0 0x9000>;
-- 
2.18.0

[PATCH v8 1/4] dt-bindings: soc: qcom: Add AOSS QMP binding

2019-05-30 Thread Bjorn Andersson

Add binding for the QMP based side-channel communication mechanism to
the AOSS, which is used to control resources not exposed through the
RPMh interface.

Reviewed-by: Rob Herring 
Reviewed-by: Vinod Koul 
Signed-off-by: Bjorn Andersson 
---

Changes since v7:
- Fix spelling of "Messaging"

 .../bindings/soc/qcom/qcom,aoss-qmp.txt   | 81 +++
 include/dt-bindings/power/qcom-aoss-qmp.h | 14 
 2 files changed, 95 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt
 create mode 100644 include/dt-bindings/power/qcom-aoss-qmp.h

diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt b/Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt
new file mode 100644
index ..954ffee0a9c4
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt
@@ -0,0 +1,81 @@
+Qualcomm Always-On Subsystem side channel binding
+
+This binding describes the hardware component responsible for side channel
+requests to the always-on subsystem (AOSS), used for certain power management
+requests that are not handled by the standard RPMh interface. Each client in the
+SoC has its own block of message RAM and IRQ for communication with the AOSS.
+The protocol used to communicate in the message RAM is known as Qualcomm
+Messaging Protocol (QMP).
+
+The AOSS side channel exposes control over a set of resources, used to control
+a set of debug related clocks and to affect the low power state of resources
+related to the secondary subsystems. These resources are exposed as a set of
+power-domains.
+
+- compatible:
+   Usage: required
+   Value type: 
+   Definition: must be "qcom,sdm845-aoss-qmp"
+
+- reg:
+   Usage: required
+   Value type: 
+   Definition: the base address and size of the message RAM for this
+   client's communication with the AOSS
+
+- interrupts:
+   Usage: required
+   Value type: 
+   Definition: should specify the AOSS message IRQ for this client
+
+- mboxes:
+   Usage: required
+   Value type: 
+   Definition: reference to the mailbox representing the outgoing doorbell
+   in APCS for this client, as described in mailbox/mailbox.txt
+
+- #clock-cells:
+   Usage: optional
+   Value type: 
+   Definition: must be 0
+   The single clock represents the QDSS clock.
+
+- #power-domain-cells:
+   Usage: optional
+   Value type: 
+   Definition: must be 1
+   The provided power-domains are:
+   CDSP state (0), LPASS state (1), modem state (2), SLPI
+   state (3), SPSS state (4) and Venus state (5).
+
+= SUBNODES
+The AOSS side channel also provides the controls for three cooling devices,
+which are expressed as subnodes of the QMP node. The name of the node is used
+to identify the resource and must therefore be "cx", "mx" or "ebi".
+
+- #cooling-cells:
+   Usage: optional
+   Value type: 
+   Definition: must be 2
+
+= EXAMPLE
+
+The following example represents the AOSS side-channel message RAM and the
+mechanism exposing the power-domains, as found in SDM845.
+
+  aoss_qmp: qmp@c30 {
+ compatible = "qcom,sdm845-aoss-qmp";
+ reg = <0x0c30 0x10>;
+ interrupts = ;
+ mboxes = <_shared 0>;
+
+ #power-domain-cells = <1>;
+
+ cx_cdev: cx {
+   #cooling-cells = <2>;
+ };
+
+ mx_cdev: mx {
+   #cooling-cells = <2>;
+ };
+  };
diff --git a/include/dt-bindings/power/qcom-aoss-qmp.h b/include/dt-bindings/power/qcom-aoss-qmp.h
new file mode 100644
index ..ec336d31dee4
--- /dev/null
+++ b/include/dt-bindings/power/qcom-aoss-qmp.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2018, Linaro Ltd. */
+
+#ifndef __DT_BINDINGS_POWER_QCOM_AOSS_QMP_H
+#define __DT_BINDINGS_POWER_QCOM_AOSS_QMP_H
+
+#define AOSS_QMP_LS_CDSP   0
+#define AOSS_QMP_LS_LPASS  1
+#define AOSS_QMP_LS_MODEM  2
+#define AOSS_QMP_LS_SLPI   3
+#define AOSS_QMP_LS_SPSS   4
+#define AOSS_QMP_LS_VENUS  5
+
+#endif
-- 
2.18.0

[PATCH v8 2/4] soc: qcom: Add AOSS QMP driver

2019-05-30 Thread Bjorn Andersson

The Always On Subsystem (AOSS) Qualcomm Messaging Protocol (QMP) driver
is used to communicate with the AOSS for certain side-channel requests
that are not available through the RPMh interface.

The communication is a very simple synchronous mechanism: a message is
written to message RAM and a doorbell in the AOSS is rung. Once the
AOSS has processed the message, the length is cleared and an interrupt
is fired by the AOSS as acknowledgment.
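
The handshake described above can be modelled in plain userspace C. This
is only a sketch: toy_msgram, toy_send(), toy_aoss_process() and
toy_acked() are hypothetical names, the doorbell and interrupt are
reduced to flags, and the layout (a length word followed by the payload)
is an assumption made for illustration, not a description of the real
message RAM.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_MSG_LEN 64	/* illustrative maximum payload size */

/* Hypothetical stand-in for the shared message RAM plus signalling lines. */
struct toy_msgram {
	uint32_t len;			/* non-zero while a message is pending */
	uint8_t data[TOY_MSG_LEN];	/* message payload */
	bool doorbell;			/* set by the sender, cleared by the receiver */
	bool irq;			/* set by the receiver as acknowledgment */
};

/* Sender side: write the payload, publish its length, ring the doorbell. */
static int toy_send(struct toy_msgram *ram, const void *msg, size_t len)
{
	if (len == 0 || len > TOY_MSG_LEN)
		return -1;	/* message does not fit */
	if (ram->len != 0)
		return -2;	/* previous message not yet acknowledged */

	memcpy(ram->data, msg, len);
	ram->len = (uint32_t)len;
	ram->doorbell = true;
	return 0;
}

/* Receiver side: consume the message, clear the length, raise the IRQ. */
static void toy_aoss_process(struct toy_msgram *ram)
{
	if (!ram->doorbell)
		return;
	ram->doorbell = false;
	ram->len = 0;
	ram->irq = true;
}

/* Sender side: the transfer is complete once the length reads back as 0. */
static bool toy_acked(const struct toy_msgram *ram)
{
	return ram->irq && ram->len == 0;
}
```

Because the sender refuses to write while the length word is non-zero,
only one message is in flight at a time, which is what makes the
mechanism synchronous in this model.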

The driver exposes the QDSS clock as a clock and the low-power state
associated with the remoteprocs in the system as a set of power-domains.

Tested-by: Sai Prakash Ranjan 
Signed-off-by: Bjorn Andersson 
---

Changes since v7:
- Fixed handling of of_clk_add_hw_provider() errors
- Reduced QMP_MSG_LEN to 64
- constify constant strings
- GENMASK() bitmasks
- Fix return value of qmp_open()

 drivers/soc/qcom/Kconfig |  12 +
 drivers/soc/qcom/Makefile|   1 +
 drivers/soc/qcom/qcom_aoss.c | 479 +++
 3 files changed, 492 insertions(+)
 create mode 100644 drivers/soc/qcom/qcom_aoss.c

diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index 1ee298f6bf17..7aa0d1f17e65 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -3,6 +3,18 @@
 #
 menu "Qualcomm SoC drivers"
 
+config QCOM_AOSS_QMP
+   tristate "Qualcomm AOSS Driver"
+   depends on ARCH_QCOM || COMPILE_TEST
+   depends on MAILBOX
+   depends on COMMON_CLK
+   select PM_GENERIC_DOMAINS
+   help
+ This driver provides the means of communicating with and controlling
+ the low-power state for resources related to the remoteproc
+ subsystems as well as controlling the debug clocks exposed by the Always On
+ Subsystem (AOSS) using Qualcomm Messaging Protocol (QMP).
+
 config QCOM_COMMAND_DB
bool "Qualcomm Command DB"
depends on ARCH_QCOM || COMPILE_TEST
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index ffe519b0cb66..eeb088beb15f 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 CFLAGS_rpmh-rsc.o := -I$(src)
+obj-$(CONFIG_QCOM_AOSS_QMP) += qcom_aoss.o
 obj-$(CONFIG_QCOM_GENI_SE) +=  qcom-geni-se.o
 obj-$(CONFIG_QCOM_COMMAND_DB) += cmd-db.o
 obj-$(CONFIG_QCOM_GLINK_SSR) +=glink_ssr.o
diff --git a/drivers/soc/qcom/qcom_aoss.c b/drivers/soc/qcom/qcom_aoss.c
new file mode 100644
index ..369d4cab96b0
--- /dev/null
+++ b/drivers/soc/qcom/qcom_aoss.c
@@ -0,0 +1,479 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2019, Linaro Ltd
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define QMP_DESC_MAGIC 0x0
+#define QMP_DESC_VERSION   0x4
+#define QMP_DESC_FEATURES  0x8
+
+/* AOP-side offsets */
+#define QMP_DESC_UCORE_LINK_STATE  0xc
+#define QMP_DESC_UCORE_LINK_STATE_ACK  0x10
+#define QMP_DESC_UCORE_CH_STATE0x14
+#define QMP_DESC_UCORE_CH_STATE_ACK0x18
+#define QMP_DESC_UCORE_MBOX_SIZE   0x1c
+#define QMP_DESC_UCORE_MBOX_OFFSET 0x20
+
+/* Linux-side offsets */
+#define QMP_DESC_MCORE_LINK_STATE  0x24
+#define QMP_DESC_MCORE_LINK_STATE_ACK  0x28
+#define QMP_DESC_MCORE_CH_STATE0x2c
+#define QMP_DESC_MCORE_CH_STATE_ACK0x30
+#define QMP_DESC_MCORE_MBOX_SIZE   0x34
+#define QMP_DESC_MCORE_MBOX_OFFSET 0x38
+
+#define QMP_STATE_UP   GENMASK(15, 0)
+#define QMP_STATE_DOWN GENMASK(31, 16)
+
+#define QMP_MAGIC  0x4d41494c /* mail */
+#define QMP_VERSION1
+
+/* 64 bytes is enough to store the requests and provides padding to 4 bytes */
+#define QMP_MSG_LEN64
+
+/**
+ * struct qmp - driver state for QMP implementation
+ * @msgram: iomem referencing the message RAM used for communication
+ * @dev: reference to QMP device
+ * @mbox_client: mailbox client used to ring the doorbell on transmit
+ * @mbox_chan: mailbox channel used to ring the doorbell on transmit
+ * @offset: offset within @msgram where messages should be written
+ * @size: maximum size of the messages to be transmitted
+ * @event: wait_queue for synchronization with the IRQ
+ * @tx_lock: provides synchronization between multiple callers of qmp_send()
+ * @qdss_clk: QDSS clock hw struct
+ * @pd_data: genpd data
+ */
+struct qmp {
+   void __iomem *msgram;
+   struct device *dev;
+
+   struct mbox_client mbox_client;
+   struct mbox_chan *mbox_chan;
+
+   size_t offset;
+   size_t size;
+
+   wait_queue_head_t event;
+
+   struct mutex tx_lock;
+
+   struct clk_hw qdss_clk;
+   struct genpd_onecell_data pd_data;
+};
+
+struct qmp_pd {
+   struct qmp *qmp;
+   struct generic_pm_domain pd;
+};
+
+#define to_qmp_pd_resource(res) container_of(res, struct qmp_pd, pd)
+
+static void qmp_kick(struct qmp *qmp)
+{
+

[PATCH v8 3/4] arm64: dts: qcom: Add AOSS QMP node

2019-05-30 Thread Bjorn Andersson

The AOSS QMP provides a number of power domains, used for QDSS and
PIL. Add the node for this.

Tested-by: Sibi Sankar 
Reviewed-by: Sibi Sankar 
Reviewed-by: Vinod Koul 
Signed-off-by: Bjorn Andersson 
---

Changes since v7:
- None

 arch/arm64/boot/dts/qcom/sdm845.dtsi | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index fcb93300ca62..b25c251b6503 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -2142,6 +2142,16 @@
#reset-cells = <1>;
};
 
+   aoss_qmp: qmp@c30 {
+   compatible = "qcom,sdm845-aoss-qmp";
+   reg = <0 0x0c30 0 0x10>;
+   interrupts = ;
+   mboxes = <_shared 0>;
+
+   #clock-cells = <0>;
+   #power-domain-cells = <1>;
+   };
+
spmi_bus: spmi@c44 {
compatible = "qcom,spmi-pmic-arb";
reg = <0 0x0c44 0 0x1100>,
-- 
2.18.0

[PATCH v8 0/4] Qualcomm AOSS QMP driver

2019-05-30 Thread Bjorn Andersson

Introduce a driver implementing Qualcomm Messaging Protocol (QMP) to
communicate with the Always On Subsystem (AOSS) and expose the low-power
states for the remoteprocs as a set of power-domains and the QDSS clock
as a clock.

Changes since v7:
- Minor code style tweaks and error handling fixes

Changes since v6:
- First couple of patches merged for v5.2
- Squashed the qmp and qmp-pd driver into one and by that moved it all
  to one file
- Expose QDSS clock as a clock instead of a power domain

Bjorn Andersson (3):
  dt-bindings: soc: qcom: Add AOSS QMP binding
  soc: qcom: Add AOSS QMP driver
  arm64: dts: qcom: Add AOSS QMP node

Sibi Sankar (1):
  arm64: dts: qcom: sdm845: Add Q6V5 MSS node

 .../bindings/soc/qcom/qcom,aoss-qmp.txt   |  81 +++
 arch/arm64/boot/dts/qcom/sdm845.dtsi  |  68 +++
 drivers/soc/qcom/Kconfig  |  12 +
 drivers/soc/qcom/Makefile |   1 +
 drivers/soc/qcom/qcom_aoss.c  | 479 ++
 include/dt-bindings/power/qcom-aoss-qmp.h |  14 +
 6 files changed, 655 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt
 create mode 100644 drivers/soc/qcom/qcom_aoss.c
 create mode 100644 include/dt-bindings/power/qcom-aoss-qmp.h

-- 
2.18.0

Re: [GIT PULL] clk fixes for v5.2-rc2

2019-05-30 Thread pr-tracker-bot

The pull request you sent on Wed, 29 May 2019 15:29:16 -0700:

> https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git 
> tags/clk-fixes-for-linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/20f944965601c59e68865d4ee12225fbabb5652b

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [PATCH] x86/mm/tlb: Do partial TLB flush when possible

2019-05-30 Thread Zhenzhong Duan




On 2019/5/30 22:15, Andy Lutomirski wrote:
> On Thu, May 30, 2019 at 12:56 AM Zhenzhong Duan
>  wrote:
>>
>> This is a small optimization for stale TLB flushes: if one new TLB
>> flush is pending, leave the choice between a partial and a full flush
>> to it; otherwise, the stale flush takes over and does a full flush.
>
> I think this is invalid because:
>
>> +   if (unlikely(f->new_tlb_gen <= local_tlb_gen &&
>> +   local_tlb_gen + 1 == mm_tlb_gen)) {
>> +   /*
>> +* For stale TLB flush request, if there will be one new TLB
>> +* flush coming, we leave the work to the new IPI as it knows
>> +* partial or full TLB flush to take, or else we do the full
>> +* flush.
>> +*/
>> +   trace_tlb_flush(reason, 0);
>> +   return;
>
> We do indeed know that the TLB will get flushed eventually, but we're
> actually providing a stronger guarantee that the TLB will be
> adequately flushed by the time we return.  Otherwise, after
> flush_tlb_mm_range(), there will be a window in which the TLB isn't
> flushed yet.


You are right. I didn't notice this point, sorry for the noise.

Zhenzhong
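
For readers following the generation argument in this thread, here is a
small userspace model of the decision. It is a simplification written
for this note: toy_flush_req, toy_decide() and the enum are hypothetical
names, the "ranged" flag stands in for whatever distinguishes a ranged
request from a full one, and the three-way rule is an assumed reading of
how the generations are compared, not a copy of the kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

enum toy_flush {
	TOY_FLUSH_NONE,		/* local TLB already at the mm's generation */
	TOY_FLUSH_PARTIAL,	/* flush only the requested range */
	TOY_FLUSH_FULL,		/* flush everything and catch up to mm_gen */
};

/* Hypothetical flush request carried by an IPI. */
struct toy_flush_req {
	uint64_t new_tlb_gen;	/* generation this request brings the TLB to */
	bool ranged;		/* true if the request covers a limited range */
};

/*
 * Decide what to do so that the local TLB is up to date with mm_gen by
 * the time the handler returns.
 */
static enum toy_flush toy_decide(uint64_t local_gen, uint64_t mm_gen,
				 const struct toy_flush_req *f)
{
	if (local_gen == mm_gen)
		return TOY_FLUSH_NONE;

	/* A ranged flush is enough only if it is exactly the next step. */
	if (f->ranged && f->new_tlb_gen == local_gen + 1 &&
	    f->new_tlb_gen == mm_gen)
		return TOY_FLUSH_PARTIAL;

	return TOY_FLUSH_FULL;
}
```

With local_gen = 5 and mm_gen = 6, a stale request (new_tlb_gen = 5)
yields a full flush in this model, so the TLB is current when the
function returns. Returning early in that case, as the patch proposed,
would leave it one generation behind until the next IPI arrives, which
is the window Andy describes.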

Re: [PATCHv5 0/2] x86/boot/KASLR: skip the specified crashkernel region

2019-05-30 Thread Pingfan Liu

Maintainers, ping?
Hi Borislav, during the review of V4 you suggested redesigning the
return value of parse_crashkernel(); the latest attempt is at
https://lore.kernel.org/patchwork/patch/1065514/. It seems hard to
make progress in that thread. On the other hand, my series "[PATCHv5 0/2]
x86/boot/KASLR: skip the specified crashkernel region" does not depend
on the redesign of the return value of parse_crashkernel().

Thanks,
  Pingfan
On Tue, May 7, 2019 at 12:32 PM Pingfan Liu  wrote:
>
> The crashkernel=x@y or =range1:size1[,range2:size2,...]@offset option may
> fail to reserve the required memory region if KASLR puts the kernel into
> that region. To avoid this uncertainty, ask KASLR to skip the required
> region.
> The parsing routine can be re-used at this early boot stage.
>
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: Borislav Petkov 
> Cc: "H. Peter Anvin" 
> Cc: Baoquan He 
> Cc: Will Deacon 
> Cc: Nicolas Pitre 
> Cc: Vivek Goyal 
> Cc: Chao Fan 
> Cc: "Kirill A. Shutemov" 
> Cc: Ard Biesheuvel 
> CC: Hari Bathini 
> Cc: linux-kernel@vger.kernel.org
> ---
> v3 -> v4:
>   reuse the parse_crashkernel_xx routines
> v4 -> v5:
>   drop unnecessary initialization of crash_base in [2/2]
>
> Pingfan Liu (2):
>   kernel/crash_core: separate the parsing routines to
> lib/parse_crashkernel.c
>   x86/boot/KASLR: skip the specified crashkernel region
>
>  arch/x86/boot/compressed/kaslr.c |  40 ++
>  kernel/crash_core.c  | 273 
>  lib/Makefile |   2 +
>  lib/parse_crashkernel.c  | 289 +++
>  4 files changed, 331 insertions(+), 273 deletions(-)
>  create mode 100644 lib/parse_crashkernel.c
>
> --
> 2.7.4
>
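
As a rough illustration of the idea in the quoted cover letter, the
sketch below parses the simple "size@offset" form and checks whether a
candidate kernel placement would collide with the reserved region. It is
a userspace toy under stated assumptions: toy_memparse(),
toy_parse_crashkernel() and toy_overlaps() are hypothetical helpers, only
the K/M/G suffixes are handled, and the range-list form of the option is
not covered.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a number with an optional K/M/G suffix; *end points past it. */
static uint64_t toy_memparse(const char *s, char **end)
{
	uint64_t v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/* Parse "size@offset", e.g. "256M@16M". Returns false on any other form. */
static bool toy_parse_crashkernel(const char *arg, uint64_t *size,
				  uint64_t *base)
{
	char *end;
	const char *p;

	*size = toy_memparse(arg, &end);
	if (end == arg || *size == 0 || *end != '@')
		return false;

	p = end + 1;
	*base = toy_memparse(p, &end);
	return end != p && *end == '\0';
}

/* True if [a_start, a_start + a_size) intersects [b_start, b_start + b_size). */
static bool toy_overlaps(uint64_t a_start, uint64_t a_size,
			 uint64_t b_start, uint64_t b_size)
{
	return a_start < b_start + b_size && b_start < a_start + a_size;
}
```

A placement routine could then reject any candidate slot for which
toy_overlaps() is true and try the next one.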

[PATCH] s390/purgatory: update .gitignore

2019-05-30 Thread Masahiro Yamada

Since commit 4c0f032d4963 ("s390/purgatory: Omit use of bin2c"),
kexec-purgatory.c is not generated.

purgatory and purgatory.lds are generated files, so should be ignored
by git.

Signed-off-by: Masahiro Yamada 
---

 arch/s390/purgatory/.gitignore | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/s390/purgatory/.gitignore b/arch/s390/purgatory/.gitignore
index e9e66f178a6d..04a03433c720 100644
--- a/arch/s390/purgatory/.gitignore
+++ b/arch/s390/purgatory/.gitignore
@@ -1,2 +1,3 @@
-kexec-purgatory.c
+purgatory
+purgatory.lds
 purgatory.ro
-- 
2.17.1

[PATCH net-next] netfilter: nf_conntrack_bridge: Fix build error without IPV6

2019-05-30 Thread YueHaibing

Fix a gcc build error when CONFIG_IPV6 is not set:

In file included from net/netfilter/core.c:19:0:
./include/linux/netfilter_ipv6.h: In function 'nf_ipv6_br_defrag':
./include/linux/netfilter_ipv6.h:110:9: error: implicit declaration of function 'nf_ct_frag6_gather' [-Werror=implicit-function-declaration]

Reported-by: Hulk Robot 
Fixes: 764dd163ac92 ("netfilter: nf_conntrack_bridge: add support for IPv6")
Signed-off-by: YueHaibing 
---
 include/linux/netfilter_ipv6.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/netfilter_ipv6.h b/include/linux/netfilter_ipv6.h
index a21b8c9..4ea97fd 100644
--- a/include/linux/netfilter_ipv6.h
+++ b/include/linux/netfilter_ipv6.h
@@ -96,6 +96,8 @@ static inline int nf_ip6_route(struct net *net, struct dst_entry **dst,
 #endif
 }
 
+int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user);
+
 static inline int nf_ipv6_br_defrag(struct net *net, struct sk_buff *skb,
u32 user)
 {
-- 
2.7.4

Re: mmotm 2019-05-29-20-52 uploaded

2019-05-30 Thread Huang, Ying

"Huang, Ying"  writes:

> Hi, Mike,
>
> Mike Kravetz  writes:
>
>> On 5/29/19 8:53 PM, a...@linux-foundation.org wrote:
>>> The mm-of-the-moment snapshot 2019-05-29-20-52 has been uploaded to
>>> 
>>>http://www.ozlabs.org/~akpm/mmotm/
>>> 
>>
>> With this kernel, I seem to get many messages such as:
>>
>> get_swap_device: Bad swap file entry 1401
>>
>> It would seem to be related to commit 3e2c19f9bef7e
>>> * mm-swap-fix-race-between-swapoff-and-some-swap-operations.patch
>
> Hi, Mike,
>
> Thanks for reporting!  I found an issue in my patch and I can reproduce
> your problem now.  The reason is that total_swapcache_pages() will call
> get_swap_device() for an invalid swap device.  So we need to find a way to
> silence the warning.  I will post a fix ASAP.

I have sent out a fix in another thread with the title

"[PATCH -mm] mm, swap: Fix bad swap file entry warning"

Can you try it?

Best Regards,
Huang, Ying

[PATCH -mm] mm, swap: Fix bad swap file entry warning

2019-05-30 Thread Huang, Ying

From: Huang Ying 

Mike reported the following warning message:

  get_swap_device: Bad swap file entry 1401

This is produced by

- total_swapcache_pages()
  - get_swap_device()

where get_swap_device() is used to check whether the swap device is
valid and, if so, to prevent it from being swapped off.  But
get_swap_device() may produce the warning message above for some
invalid swap devices.  Fix this by calling swp_swap_info() before
get_swap_device() to filter out the swap devices that may cause the
warning.
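
The "filter quietly, then take the noisy path" pattern can be shown with
a userspace toy. Everything here is hypothetical (toy_table,
toy_swap_info(), toy_get_device(), toy_total_pages()); the warning is
modelled as a counter, and no reference counting or locking is
attempted.

```c
#include <stddef.h>

#define TOY_MAX_SWAPFILES 4

/* Hypothetical per-device record; NULL slots were never set up. */
struct toy_swap_info {
	unsigned long nr_pages;
};

static struct toy_swap_info *toy_table[TOY_MAX_SWAPFILES];
static int toy_warnings;	/* stands in for printed warnings */

/* Quiet lookup: returns NULL for an unused slot without complaining. */
static struct toy_swap_info *toy_swap_info(int type)
{
	if (type < 0 || type >= TOY_MAX_SWAPFILES)
		return NULL;
	return toy_table[type];
}

/* Noisy lookup: counts a warning when asked about an unused slot. */
static struct toy_swap_info *toy_get_device(int type)
{
	struct toy_swap_info *si = toy_swap_info(type);

	if (!si)
		toy_warnings++;
	return si;
}

/* Walk all slots, filtering unused ones before the noisy lookup. */
static unsigned long toy_total_pages(void)
{
	unsigned long total = 0;

	for (int i = 0; i < TOY_MAX_SWAPFILES; i++) {
		struct toy_swap_info *si;

		if (!toy_swap_info(i))
			continue;	/* never reaches the warning path */
		si = toy_get_device(i);
		if (!si)
			continue;
		total += si->nr_pages;
	}
	return total;
}
```

The loop visits every slot but only calls the noisy lookup for slots
that are known to exist, so the walk itself produces no warnings.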

Fixes: 6a946753dbe6 ("mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device()")
Signed-off-by: "Huang, Ying" 
Cc: Mike Kravetz 
Cc: Andrea Parri 
Cc: Paul E. McKenney 
Cc: Michal Hocko 
Cc: Minchan Kim 
Cc: Hugh Dickins 
---
 mm/swap_state.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index b84c58b572ca..62da25b7f2ed 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -76,8 +76,13 @@ unsigned long total_swapcache_pages(void)
struct swap_info_struct *si;
 
for (i = 0; i < MAX_SWAPFILES; i++) {
+   swp_entry_t entry = swp_entry(i, 1);
+
+   /* Avoid get_swap_device() to warn for bad swap entry */
+   if (!swp_swap_info(entry))
+   continue;
/* Prevent swapoff to free swapper_spaces */
-   si = get_swap_device(swp_entry(i, 1));
+   si = get_swap_device(entry);
if (!si)
continue;
nr = nr_swapper_spaces[i];
-- 
2.20.1

Re: linux-next: boot failure after merge of the akpm tree

2019-05-30 Thread Nicholas Piggin

Stephen Rothwell's on May 30, 2019 4:17 pm:
> Hi all,
> 
> My qemu boot (PowerPC le guest on PowerPC le host, with and without kvm,
> using a kernel built with powerpc_pseries_le_defconfig) oopses during boot
> like this:
> 
> -
> numa: Node 0 CPUs: 0
> Using standard scheduler topology
> devtmpfs: initialized
> clocksource: jiffies: mask: 0x max_cycles: 0x, max_idle_ns: 
> 1911260446275 ns
> futex hash table entries: 256 (order: -1, 32768 bytes)
> [ cut here ]
> kernel BUG at mm/vmalloc.c:472!
> Oops: Exception in kernel mode, sig: 5 [#1]
> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc2 #2
> NIP:  c0369b18 LR: c0369c74 CTR: c0176e30
> REGS: c0007e6636e0 TRAP: 0700   Not tainted  (5.2.0-rc2)
> MSR:  82029033   CR: 24024882  XER: 2000
> CFAR: c0369c78 IRQMASK: 0 
> GPR00: c0369c74 c0007e663970 c119c100 0001 
> GPR04: 7ec2 0001f4fe19cb 0001f5398c84 c138 
> GPR08:  0001 0001 02b2 
> GPR12: 4000 c138 c0010fc0 0001 
> GPR16: 0001 818e c0df9988  
> GPR20: 0001 2dc2 0dc0 0022 
> GPR24: c0007e2204c0 0dc2 0001 c00a 
> GPR28: c008 0001  0dc0 
> NIP [c0369b18] __vmalloc_node_range+0x1f8/0x410
> LR [c0369c74] __vmalloc_node_range+0x354/0x410
> Call Trace:
> [c0007e663970] [c0369c74] __vmalloc_node_range+0x354/0x410 
> (unreliable)
> [c0007e663a70] [c0369d80] __vmalloc+0x50/0x60
> [c0007e663ae0] [c0299a98] bpf_prog_alloc_no_stats+0x58/0x120
> [c0007e663b20] [c0299b90] bpf_prog_alloc+0x30/0xe0
> [c0007e663b60] [c0a49dd8] bpf_prog_create+0x68/0x100
> [c0007e663ba0] [c0f4f2a8] ptp_classifier_init+0x4c/0x80
> [c0007e663be0] [c0f4b9e8] sock_init+0xe0/0x100
> [c0007e663c10] [c0010b60] do_one_initcall+0x60/0x2c0
> [c0007e663ce0] [c0ee45b0] kernel_init_freeable+0x37c/0x478
> [c0007e663db0] [c0010fe4] kernel_init+0x2c/0x148
> [c0007e663e20] [c000c0cc] ret_from_kernel_thread+0x5c/0x70
> Instruction dump:
> 6000 2c23 418200dc e9580020 79e91f24 7c6a492a 40920170 8138002c 
> 394f0001 794f0020 7f895040 419dffbc <0fe0> 6000 3f41 4bfffedc 
> ---[ end trace 49ed8f97d467e164 ]---
> 
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x0005
> -
> 
> The BUG is:
> 
>   BUG_ON(page_shift != PAGE_SIZE);
> 
> in the !CONFIG_HAVE_ARCH_HUGE_VMAP version of vmap_hpages_range().
> 
> I am guessing this is something to do with the vmalloc changes in Andrew's
> patches (or it could be the fixup I did to Nick's patch).
> 
> I have reverted
> 
>   c353e2997976 ("mm/vmalloc: hugepage vmalloc mappings")
>   a826492f28d9 ("mm: move ioremap page table mapping function to mm/")
> 
> (and my fix up) for today and things seem to work (if only because the
> BUG() has been removed :-)).

Good to know, maybe I didn't test powerpc without later enabling 
patches...

The series also has a compile bug on ARM I have to work out, so
yeah drop those for now, I'll post a v2. The large system map patches
that I posted in that series can stay I think.

Thanks,
Nick

[GIT PULL] gcc-plugins update for v5.2-rc3

2019-05-30 Thread Kees Cook

Hi Linus,

Please pull this gcc-plugins fix for v5.2-rc3. This has lived in
linux-next for about a week now.

Thanks!

-Kees

The following changes since commit 259799ea5a9aa099a267f3b99e1f7078bbaf5c5e:

  gcc-plugins: arm_ssp_per_task_plugin: Fix for older GCC < 6 (2019-05-10 15:35:01 -0700)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git tags/gcc-plugins-v5.2-rc3

for you to fetch changes up to 7210e060155b9cf557fb13128353c3e494fa5ed3:

  gcc-plugins: Fix build failures under Darwin host (2019-05-20 13:30:54 -0700)


gcc-plugins: Handle unusual header environment

- Fix redefined macro error under a Darwin build host


Kees Cook (1):
  gcc-plugins: Fix build failures under Darwin host

 scripts/gcc-plugins/gcc-common.h | 4 
 1 file changed, 4 insertions(+)

-- 
Kees Cook

Re: [PATCH] firmware_loader: fix build without sysctl

2019-05-30 Thread Randy Dunlap

On 5/30/19 6:26 PM, Matteo Croce wrote:
> firmware_config_table has references to the sysctl code which
> triggers a build failure when CONFIG_PROC_SYSCTL is not set:
> 
> ld: drivers/base/firmware_loader/fallback_table.o:(.data+0x30): undefined reference to `sysctl_vals'
> ld: drivers/base/firmware_loader/fallback_table.o:(.data+0x38): undefined reference to `sysctl_vals'
> ld: drivers/base/firmware_loader/fallback_table.o:(.data+0x70): undefined reference to `sysctl_vals'
> ld: drivers/base/firmware_loader/fallback_table.o:(.data+0x78): undefined reference to `sysctl_vals'
> 
> Put the firmware_config_table struct under #ifdef CONFIG_PROC_SYSCTL.
> 
> Fixes: 6a33853c5773 ("proc/sysctl: add shared variables for range check")
> Reported-by: Randy Dunlap 
> Signed-off-by: Matteo Croce 

Works for me.

Acked-by: Randy Dunlap  # build-tested

Thanks.

> ---
>  drivers/base/firmware_loader/fallback_table.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/base/firmware_loader/fallback_table.c b/drivers/base/firmware_loader/fallback_table.c
> index 58d4a1263480..18d646777fb9 100644
> --- a/drivers/base/firmware_loader/fallback_table.c
> +++ b/drivers/base/firmware_loader/fallback_table.c
> @@ -23,6 +23,8 @@ struct firmware_fallback_config fw_fallback_config = {
>  };
>  EXPORT_SYMBOL_GPL(fw_fallback_config);
>  
> +#ifdef CONFIG_PROC_SYSCTL
> +
>  struct ctl_table firmware_config_table[] = {
>   {
>   .procname   = "force_sysfs_fallback",
> @@ -45,3 +47,5 @@ struct ctl_table firmware_config_table[] = {
>   { }
>  };
>  EXPORT_SYMBOL_GPL(firmware_config_table);
> +
> +#endif
> 


-- 
~Randy

Re: [PATCH v2] checkpatch.pl: Warn on duplicate sysctl local variable

2019-05-30 Thread Kees Cook

On Fri, May 31, 2019 at 03:12:27AM +0200, Matteo Croce wrote:
> Commit 6a33853c5773 ("proc/sysctl: add shared variables for range check")
> adds some shared const variables to be used instead of a local copy in
> each source file.
> Warn when a chunk duplicates one of these values in a ctl_table struct:
> 
> $ scripts/checkpatch.pl 0001-test-commit.patch
> WARNING: duplicated sysctl range checking value 'zero', consider using the shared one in include/linux/sysctl.h
> #27: FILE: arch/arm/kernel/isa.c:48:
> +   .extra1 = &zero,
> 
> WARNING: duplicated sysctl range checking value 'int_max', consider using the shared one in include/linux/sysctl.h
> #28: FILE: arch/arm/kernel/isa.c:49:
> +   .extra2 = &int_max,
> +   .extra2 = _max,
> 
> total: 0 errors, 2 warnings, 14 lines checked
> 
> Signed-off-by: Matteo Croce 

Reviewed-by: Kees Cook 

-Kees

> ---
>  scripts/checkpatch.pl | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 342c7c781ba5..629c31435487 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -6639,6 +6639,12 @@ sub process {
>"unknown module license " . 
> $extracted_string . "\n" . $herecurr);
>   }
>   }
> +
> +# check for sysctl duplicate constants
> + if ($line =~ /\.extra[12]\s*=\s*&(zero|one|int_max|max_int)\b/) {
> + WARN("DUPLICATED_SYSCTL_CONST",
> + "duplicated sysctl range checking value '$1', consider using the shared one in include/linux/sysctl.h\n" . $herecurr);
> + }
>   }
>  
>   # If we have no input at all, then there is nothing to report on
> -- 
> 2.21.0
> 

-- 
Kees Cook
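
For anyone who wants to try the pattern from the quoted patch outside of
Perl, here is a rough C equivalent using POSIX regcomp()/regexec(). It
is a sketch, not part of checkpatch: toy_is_dup_sysctl_const() is a
hypothetical helper, and because POSIX extended regular expressions have
no portable \s or \b, whitespace is spelled [[:space:]] and the word
boundary is approximated by "a non-identifier character or end of line".

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if the line assigns one of the well-known local constants
 * (zero, one, int_max, max_int) to .extra1 or .extra2 by address.
 */
static bool toy_is_dup_sysctl_const(const char *line)
{
	static const char pattern[] =
		"\\.extra[12][[:space:]]*=[[:space:]]*"
		"&(zero|one|int_max|max_int)([^[:alnum:]_]|$)";
	regex_t re;
	bool hit;

	/* Compile on every call for simplicity; fine for a demonstration. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;

	hit = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return hit;
}
```

The trailing group is what keeps a name such as "zero_ok" from matching,
which is the job \b does in the Perl version.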

Re: [PATCH] arm: vdso: pass --be8 to linker if necessary

2019-05-30 Thread Masahiro Yamada

Hi Jason,

Thanks for catching this.

On Thu, May 30, 2019 at 3:26 AM Jason A. Donenfeld  wrote:
>
> The commit fe00e50b2db8 ("ARM: 8858/1: vdso: use $(LD) instead of $(CC)
> to link VDSO") removed the passing of CFLAGS, since ld doesn't take
> those directly. However, prior, big-endian ARM was relying on gcc to
> translate its -mbe8 option into ld's --be8 option. Lacking this, ld


'git grep -- -mbe8' has no hits.

Is it a toolchain internal flag?



> generated be32 code, making the VDSO generate SIGILL when called by
> userspace.
>
> This commit passes --be8 if CONFIG_CPU_ENDIAN_BE8 is enabled.
>
> Signed-off-by: Jason A. Donenfeld 
> Cc: Masahiro Yamada 
> Cc: Russell King 
> Cc: Arnd Bergmann 
> Cc: Ard Biesheuvel 
> ---
>  arch/arm/vdso/Makefile | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/vdso/Makefile b/arch/arm/vdso/Makefile
> index fadf554d9391..1f5ec9741e6d 100644
> --- a/arch/arm/vdso/Makefile
> +++ b/arch/arm/vdso/Makefile
> @@ -10,9 +10,10 @@ obj-vdso := $(addprefix $(obj)/, $(obj-vdso))
>  ccflags-y := -fPIC -fno-common -fno-builtin -fno-stack-protector
>  ccflags-y += -DDISABLE_BRANCH_PROFILING
>
> -ldflags-y = -Bsymbolic --no-undefined -soname=linux-vdso.so.1 \
> +ldflags-$(CONFIG_CPU_ENDIAN_BE8) := --be8
> +ldflags-y := -Bsymbolic --no-undefined -soname=linux-vdso.so.1 \
> -z max-page-size=4096 -z common-page-size=4096 \
> -   -nostdlib -shared \
> +   -nostdlib -shared $(ldflags-y) \
> $(call ld-option, --hash-style=sysv) \
> $(call ld-option, --build-id) \
> -T
> --
> 2.21.0
>


--
Best Regards
Masahiro Yamada

Re: [PATCH] kbuild: teach kselftest-merge to find nested config files

2019-05-30 Thread Masahiro Yamada

On Fri, May 31, 2019 at 4:00 AM Dan Rue  wrote:
>
> On Mon, May 20, 2019 at 07:56:41PM +0200, Greg KH wrote:
> > On Mon, May 20, 2019 at 10:16:14AM -0500, Dan Rue wrote:
> > > Current implementation of kselftest-merge only finds config files that
> > > are one level deep using `$(srctree)/tools/testing/selftests/*/config`.
> > >
> > > Often, config files are added in nested directories, and do not get
> > > picked up by kselftest-merge.
> > >
> > > Use `find` to catch all config files under
> > > `$(srctree)/tools/testing/selftests` instead.
> > >
> > > Signed-off-by: Dan Rue 
> > > ---
> > >  Makefile | 5 ++---
> > >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > To be more specific here, the binderfs test is not catching the config
> > entry, so it would be nice to get this into the stable trees as well :)
> >
> > > diff --git a/Makefile b/Makefile
> > > index a45f84a7e811..e99e7f9484af 100644
> > > --- a/Makefile
> > > +++ b/Makefile
> > > @@ -1228,9 +1228,8 @@ kselftest-clean:
> > >  PHONY += kselftest-merge
> > >  kselftest-merge:
> > > > $(if $(wildcard $(objtree)/.config),, $(error No .config exists, config your kernel first!))
> > > -   $(Q)$(CONFIG_SHELL) $(srctree)/scripts/kconfig/merge_config.sh \
> > > -   -m $(objtree)/.config \
> > > -   $(srctree)/tools/testing/selftests/*/config
> > > +   $(Q)find $(srctree)/tools/testing/selftests -name config | \
> > > > +   xargs $(srctree)/scripts/kconfig/merge_config.sh -m $(objtree)/.config
> > > +$(Q)$(MAKE) -f $(srctree)/Makefile olddefconfig
> > >
> > >  # 
> > > ---
> >
> > is find run with $(Q)?  It isn't with other instances in the Makefile.
>
> I'm not entirely sure all the ways that $(Q) is used (it looks like it
> just gets set to @), but if i run 'KBUILD_VERBOSE=1 make
> kselftest-merge' I do see the find command printed before running:
>
> find ./tools/testing/selftests -name config | \
>   xargs ./scripts/kconfig/merge_config.sh -m ./.config
>
> I noticed find used inconsistently (sometimes with @, sometimes with
> $(Q), sometimes with neither), so I picked the usage that seemed most
> correct to me.


I agree. Using $(Q) looks correct to me.
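
For readers following along, the difference between the old one-level glob
and the new `find` can be illustrated outside of make. This is only a
sketch in Python, not part of the patch: the directory layout below is
hypothetical, and `one_level_configs`, `nested_configs`, and `demo` are
names invented for the illustration.

```python
import tempfile
from pathlib import Path


def one_level_configs(selftests):
    # Mirrors the old shell glob: selftests/*/config
    return sorted(selftests.glob("*/config"))


def nested_configs(selftests):
    # Mirrors `find selftests -name config`: matches at any depth.
    return sorted(selftests.rglob("config"))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        selftests = Path(tmp) / "tools" / "testing" / "selftests"
        # Hypothetical layout: one top-level test and one nested test.
        for rel in ("net/config", "filesystems/binderfs/config"):
            path = selftests / rel
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text("CONFIG_EXAMPLE=y\n")
        return len(one_level_configs(selftests)), len(nested_configs(selftests))


if __name__ == "__main__":
    print(demo())
```

With this layout the glob sees only the top-level config file, while the
recursive search also picks up the nested one.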



> Dan
>
> >
> > thanks,
> >
> > greg k-h
>
> --
> Linaro - Kernel Validation



-- 
Best Regards
Masahiro Yamada

Re: [PATCH 3/4] net: stmmac: modify default value of tx-frames

2019-05-30 Thread biao huang

Hi Andrew,

On Thu, 2019-05-30 at 14:58 +0200, Andrew Lunn wrote:
> On Thu, May 30, 2019 at 04:54:43PM +0800, Biao Huang wrote:
> > The default value of tx-frames is 25, which passes the tx timestamp
> > to the stack too late, so ptp4l fails:
> > 
> > ptp4l -i eth0 -f gPTP.cfg -m
> > ptp4l: selected /dev/ptp0 as PTP clock
> > ptp4l: port 1: INITIALIZING to LISTENING on INITIALIZE
> > ptp4l: port 0: INITIALIZING to LISTENING on INITIALIZE
> > ptp4l: port 1: link up
> > ptp4l: timed out while polling for tx timestamp
> > ptp4l: increasing tx_timestamp_timeout may correct this issue,
> >but it is likely caused by a driver bug
> > ptp4l: port 1: send peer delay response failed
> > ptp4l: port 1: LISTENING to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
> > 
> > ptp4l tests pass when changing the tx-frames from 25 to 1 with
> > ethtool -C option.
> > It should be fine to set tx-frames default value to 1, so ptp4l will pass
> > by default.
> 
> Hi Biao
> 
> What does this do to the number of interrupts? Do we get 25 times more
> interrupts? Have you done any performance tests to see if this causes
> performance regressions?
Yes, it seems tx-frames=25 can reduce interrupts.
But the tx interrupt is now handled in NAPI, which disables tx interrupts
at the beginning of the NAPI flow and re-enables them at the end.

Here is the test result on our platform:
             tx-frames=1     tx-frames=25
irq number   478514          393750
performance  904 Mbits/sec   902 Mbits/sec

commands for test:
"cat /proc/interrupts | grep eth0"
"iperf3 -c ipaddress -w 256K -t 60"

Thanks to NAPI, the number of interrupts does not grow 25 times (it stays
at almost the same level), and there is no obvious performance degradation.
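
As a quick sanity check on the figures in the table above, the relative
change can be computed directly. This is a small Python sketch; the
variable names are made up for the illustration and the numbers are copied
from the test result.

```python
# Figures copied from the test result above, keyed by the tx-frames setting.
irq_count = {1: 478514, 25: 393750}
throughput_mbits = {1: 904, 25: 902}

# How many times more interrupts tx-frames=1 produced than tx-frames=25.
irq_ratio = irq_count[1] / irq_count[25]

# Relative throughput change of tx-frames=1 versus tx-frames=25, in percent.
throughput_delta_pct = (
    (throughput_mbits[1] - throughput_mbits[25]) / throughput_mbits[25] * 100
)

print(f"interrupt ratio (tx-frames=1 vs 25): {irq_ratio:.2f}x")
print(f"throughput change: {throughput_delta_pct:+.2f}%")
```

So the interrupt count rises by roughly a fifth rather than by a factor of
25, and the throughput difference is well under one percent.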

Can anybody double check the performance with tx-frames = 0 or
25?
> 
>   Andrew
Thanks.
Biao

[GIT PULL] Staging/IIO driver fixes for 5.2-rc3

2019-05-30 Thread Greg KH

The following changes since commit a188339ca5a396acc588e5851ed7e19f66b0ebd9:

  Linux 5.2-rc1 (2019-05-19 15:47:09 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git tags/staging-5.2-rc3

for you to fetch changes up to e61ff0fba72d981449c90b5299cebb74534b6f7c:

  staging: kpc2000: Add dependency on MFD_CORE to kconfig symbol 'KPC2000' (2019-05-24 09:41:09 +0200)


Staging/IIO driver fixes for 5.2-rc3

Here are some Staging and IIO driver fixes to resolve some reported
problems for 5.2-rc3.

Nothing major here, just some tiny changes, full details are in the
shortlog.

All have been in linux-next for a while with no reported issues.

Signed-off-by: Greg Kroah-Hartman 


Chengguang Xu (1):
  staging: erofs: set sb->s_root to NULL when failing from __getname()

Dan Carpenter (4):
  staging: kpc2000: double unlock in error handling in kpc_dma_transfer()
  Staging: vc04_services: Fix a couple error codes
  staging: vc04_services: prevent integer overflow in create_pagelist()
  staging: wilc1000: Fix some double unlock bugs in wilc_wlan_cleanup()

Geordan Neukum (1):
  staging: kpc2000: Add dependency on MFD_CORE to kconfig symbol 'KPC2000'

Greg Kroah-Hartman (1):
  Merge tag 'iio-fixes-for-5.2a' of git://git.kernel.org/.../jic23/iio into staging-linus:

Max Filippov (1):
  staging: kpc2000: fix build error on xtensa

Ruslan Babayev (1):
  iio: dac: ds4422/ds4424 fix chip verification

Sean Nyekjaer (1):
  iio: adc: ti-ads8688: fix timestamp is not updated in buffer

Steve Moskovchenko (1):
  iio: imu: mpu6050: Fix FIFO layout for ICM20602

Tim Collier (1):
  staging: wlan-ng: fix adapter initialization failure

Tomer Maimon (1):
  iio: adc: modify NPCM ADC read reference voltage

Vincent Stehlé (1):
  iio: adc: ads124: avoid buffer overflow

YueHaibing (1):
  staging: kpc2000: Fix build error without CONFIG_UIO

 drivers/iio/adc/npcm_adc.c |  2 +-
 drivers/iio/adc/ti-ads124s08.c |  2 +-
 drivers/iio/adc/ti-ads8688.c   |  2 +-
 drivers/iio/dac/ds4424.c   |  2 +-
 drivers/iio/imu/inv_mpu6050/inv_mpu_core.c | 46 --
 drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h  | 20 +-
 drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c |  3 ++
 drivers/staging/erofs/super.c  |  1 +
 drivers/staging/kpc2000/Kconfig|  2 +
 drivers/staging/kpc2000/kpc_dma/fileops.c  |  4 +-
 .../vc04_services/bcm2835-camera/controls.c|  4 +-
 .../interface/vchiq_arm/vchiq_2835_arm.c   |  9 +
 drivers/staging/wilc1000/wilc_wlan.c   |  8 +++-
 drivers/staging/wlan-ng/hfa384x_usb.c  |  3 +-
 14 files changed, 91 insertions(+), 17 deletions(-)

[GIT PULL] TTY/Serial fixes for 5.2-rc3

2019-05-30 Thread Greg KH

The following changes since commit a188339ca5a396acc588e5851ed7e19f66b0ebd9:

  Linux 5.2-rc1 (2019-05-19 15:47:09 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git tags/tty-5.2-rc3

for you to fetch changes up to a1ad1cc9704f64c169261a76e1aee1cf1ae51832:

  vt/fbcon: deinitialize resources in visual_init() after failed memory allocation (2019-05-24 17:08:18 +0200)


TTY/Serial driver fixes for 5.2-rc3

Here are some small serial and TTY driver fixes for 5.2-rc3.

Nothing major, just a number of fixes for reported issues.  The fbcon
core fix also resolves an issue, and was acked by the relevant
maintainer to go through this tree.

All of these have been in linux-next with no reported issues.

Signed-off-by: Greg Kroah-Hartman 


George G. Davis (1):
  serial: sh-sci: disable DMA for uart_console

Grzegorz Halat (1):
  vt/fbcon: deinitialize resources in visual_init() after failed memory allocation

Joe Burmeister (1):
  tty: max310x: Fix external crystal register setup

Jorge Ramirez-Ortiz (1):
  tty: serial: msm_serial: Fix XON/XOFF

Sascha Hauer (1):
  serial: imx: remove log spamming error message

 drivers/tty/serial/imx.c |  1 -
 drivers/tty/serial/max310x.c |  2 +-
 drivers/tty/serial/msm_serial.c  |  5 -
 drivers/tty/serial/sh-sci.c  |  7 +++
 drivers/tty/vt/vt.c  | 11 +--
 drivers/video/fbdev/core/fbcon.c |  2 +-
 6 files changed, 22 insertions(+), 6 deletions(-)

[GIT PULL] USB fixes for 5.2-rc3

2019-05-30 Thread Greg KH

The following changes since commit a188339ca5a396acc588e5851ed7e19f66b0ebd9:

  Linux 5.2-rc1 (2019-05-19 15:47:09 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git tags/usb-5.2-rc3

for you to fetch changes up to 3ea3091f1bd8586125848c62be295910e9802af0:

  usbip: usbip_host: fix stub_dev lock context imbalance regression (2019-05-29 13:26:32 -0700)


USB fixes for 5.2-rc3

Here are some tiny USB fixes for a number of reported issues for
5.2-rc3.

Nothing huge here, just a small collection of xhci and other driver bugs
that syzbot has been finding in some drivers.  There is also a usbip fix
and a fix for the usbip fix in here :)

All have been in linux-next with no reported issues.

Signed-off-by: Greg Kroah-Hartman 


Alan Stern (3):
  USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor
  media: usb: siano: Fix general protection fault in smsusb
  media: usb: siano: Fix false-positive "uninitialized variable" warning

Andrey Smirnov (1):
  xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic()

Carsten Schmid (1):
  usb: xhci: avoid null pointer deref when bos field is NULL

Chunfeng Yun (1):
  usb: mtu3: fix up undefined reference to usb_debug_root

Fabio Estevam (1):
  xhci: Use %zu for printing size_t type

Henry Lin (1):
  xhci: update bounce buffer with correct sg num

Jia-Ju Bai (1):
  usb: xhci: Fix a potential null pointer dereference in xhci_debugfs_create_endpoint()

Mathias Nyman (1):
  xhci: Fix immediate data transfer if buffer is already DMA mapped

Mauro Carvalho Chehab (1):
  media: smsusb: better handle optional alignment

Maximilian Luz (1):
  USB: Add LPM quirk for Surface Dock GigE adapter

Oliver Neukum (5):
  USB: sisusbvga: fix oops in error path of sisusb_probe
  USB: rio500: refuse more than one device at a time
  USB: rio500: fix memory leak in close after disconnect
  USB: rio500: simplify locking
  USB: rio500: update Documentation

Shuah Khan (2):
  usbip: usbip_host: fix BUG: sleeping function called from invalid context
  usbip: usbip_host: fix stub_dev lock context imbalance regression

 Documentation/usb/rio.txt   | 66 ++
 drivers/media/usb/siano/smsusb.c| 33 +--
 drivers/usb/core/config.c   |  4 +-
 drivers/usb/core/quirks.c   |  3 ++
 drivers/usb/host/xhci-debugfs.c |  3 ++
 drivers/usb/host/xhci-ring.c| 26 
 drivers/usb/host/xhci.c | 24 +--
 drivers/usb/host/xhci.h |  3 +-
 drivers/usb/misc/rio500.c   | 80 ++---
 drivers/usb/misc/sisusbvga/sisusb.c | 15 +++
 drivers/usb/mtu3/mtu3_debugfs.c |  3 +-
 drivers/usb/usbip/stub_dev.c| 75 --
 12 files changed, 182 insertions(+), 153 deletions(-)

Re: [PATCH v2 00/17] perf tools: Coresight: Add CPU-wide trace support

2019-05-30 Thread Leo Yan

On Fri, May 24, 2019 at 11:34:51AM -0600, Mathieu Poirier wrote:
> This patchset adds support for CoreSight CPU-wide trace scenarios.  More
> specifically it extends the work that was done for per thread scenarios to
> handle more than a single trace ID.  It also temporally correlates traces
> based on timestamps generated by the tracers so that rendering by the perf
> mechanic is ordered.
> 
> Everything is based on Arnaldo's perf/core branch (46d4c9a05285).  I will
> send another revision when it is rebased to a 5.2 rc candidate.
> 
> Before this set:
>   # root@juno:/home/linaro# perf record -e cs_etm/@2007.etr/ -C 2,3 sleep 1
>   failed to mmap with 12 (Cannot allocate memory)
> 
> After this set:
>   # root@juno:/home/linaro# perf record -e cs_etm/@2007.etr/ -C 2,3 sleep 1
>   [ perf record: Captured and wrote 1.352 MB perf.data ]

I have tested this patch set on Juno and DB410c boards, FWIW:

Tested-by: Leo Yan 

> Regards,
> Mathieu
> 
> Changes for V2:
> * Fixed error condition in function cs_etm_set_option() (Leo)
> * Fixed changelog spelling error (Leo).
> * Moved from calloc() to malloc() in cs_etm__etmq_get_traceid_queue()
> * Got rid of CS_ETM_PACKET_QUEUE_NR macro
> * Fixed indentation problem in function cs_etm__process_traceid_queue() (Leo).
> 
> Mathieu Poirier (17):
>   perf tools: Configure contextID tracing in CPU-wide mode
>   perf tools: Configure timestamp generation in CPU-wide mode
>   perf tools: Configure SWITCH_EVENTS in CPU-wide mode
>   perf tools: Add handling of itrace start events
>   perf tools: Add handling of switch-CPU-wide events
>   perf tools: Refactor error path in cs_etm_decoder__new()
>   perf tools: Move packet queue out of decoder structure
>   perf tools: Fix indentation in function
> cs_etm__process_decoder_queue()
>   perf tools: Introduce the concept of trace ID queues
>   perf tools: Get rid of unused cpu in struct cs_etm_queue
>   perf tools: Move thread to traceid_queue
>   perf tools: Move tid/pid to traceid_queue
>   perf tools: Use traceID aware memory callback API
>   perf tools: Add support for multiple traceID queues
>   perf tools: Linking PE contextID with perf thread mechanic
>   perf tools: Add notion of time to decoding code
>   perf tools: Add support for CPU-wide trace scenarios
> 
>  tools/perf/Makefile.config|3 +
>  tools/perf/arch/arm/util/cs-etm.c |  186 ++-
>  .../perf/util/cs-etm-decoder/cs-etm-decoder.c |  269 +++--
>  .../perf/util/cs-etm-decoder/cs-etm-decoder.h |   39 +-
>  tools/perf/util/cs-etm.c  | 1026 +
>  tools/perf/util/cs-etm.h  |  103 ++
>  6 files changed, 1252 insertions(+), 374 deletions(-)
> 
> -- 
> 2.17.1
>

Re: mmotm 2019-05-29-20-52 uploaded

2019-05-30 Thread Huang, Ying

Hi, Mike,

Mike Kravetz  writes:

> On 5/29/19 8:53 PM, a...@linux-foundation.org wrote:
>> The mm-of-the-moment snapshot 2019-05-29-20-52 has been uploaded to
>> 
>>http://www.ozlabs.org/~akpm/mmotm/
>> 
>
> With this kernel, I seem to get many messages such as:
>
> get_swap_device: Bad swap file entry 1401
>
> It would seem to be related to commit 3e2c19f9bef7e
>> * mm-swap-fix-race-between-swapoff-and-some-swap-operations.patch

Hi, Mike,

Thanks for reporting!  I found an issue in my patch and I can reproduce
your problem now.  The reason is that total_swapcache_pages() calls
get_swap_device() for an invalid swap device.  So we need to find a way to
silence the warning.  I will post a fix ASAP.

Best Regards,
Huang, Ying

RE: [PATCH 0/2] mailbox: arm: introduce smc triggered mailbox

2019-05-30 Thread Peng Fan



> 
> > > Subject: Re: [PATCH 0/2] mailbox: arm: introduce smc triggered
> > > mailbox
> > >
> > > Hi,
> > >
> > > On 5/22/19 10:50 PM, Peng Fan wrote:
> > > > This is a modified version from Andre Przywara's patch series
> > > >
> > > > > https://lore.kernel.org/patchwork/cover/812997/ .
> > > > [1] is a draft implementation of i.MX8MM SCMI ATF implementation
> > > > that use smc as mailbox, power/clk is included, but only part of
> > > > clk has been implemented to work with hardware, power domain only
> > > > supports get name for now.
> > > >
> > > > The traditional Linux mailbox mechanism uses some kind of
> > > > dedicated hardware IP to signal a condition to some other
> > > > processing unit, typically a dedicated management processor.
> > > > This mailbox feature is used for instance by the SCMI protocol to
> > > > > signal a request for some action to be taken by the management
> > > > > processor.
> > > > > However, some SoCs do not have a dedicated management core to
> > > > > provide those services. To serve the TEE, and to keep Linux from
> > > > > shutting down power and clocks that the TEE uses, the firmware
> > > > > needs to handle power and clocks. The firmware here is ARM
> > > > > Trusted Firmware, which can also run the SCMI service.
> > > >
> > > > The existing SCMI implementation uses a rather flexible shared
> > > > memory region to communicate commands and their parameters, it
> > > > still requires a mailbox to actually trigger the action.
> > >
> > > We have had something similar done internally with a couple of minor
> > > differences:
> > >
> > > - a SGI is used to send SCMI notifications/delayed replies to
> > > support asynchronism (patches are in the works to actually add that
> > > to the Linux SCMI framework). There is no good support for SGI in
> > > the kernel right now so we hacked up something from the existing SMP
> > > code and adding the ability to register our own IPI handlers
> > > > (SHAME!). Using a PPI should work and should allow for using
> > > > request_irq() AFAICT.
> >
> > So you are also implementing firmware inside ATF for the SCMI use case, right?
> >
> > Introducing SGI in ATF to notify Linux would add complexity, and
> > there is no good framework inside ATF for SCMI, so I use
> > synchronous calls for simplicity for now.
> 
> I think we don't disagree, but just to clarify on one thing:
> 
> I think we should avoid tying this driver to a specific protocol or software
> on the other end, be it ATF or SCMI. After all it's just a mailbox driver,
> meant to signal some event (and parameters) to some external entity. Yes,
> SCMI (or SCPI back then) was the reason to push for this, but it should be
> independent from that.

Thanks, I agree.

> I am not even sure we should mention it too much in the documentation.

I think we need a usecase here, so it should be fine.

> 
> So whether the receiving end is ATF or something else is irrelevant, I think.
> For instance we have had discussions in Xen to provide guests some virtualised
> device management support, and using an HVC mailbox seems like a neat
> solution. This could be using the SCMI (or SCPI) protocol, but that's not a
> requirement. In this case the Xen hypervisor would be the one to pick up the
> mailbox trigger, probably forwarding the request to something else (Dom0 in
> this case).

I do not follow the "forwarding the request" point: a DomU HVC traps to Xen,
so how would it be forwarded to Dom0?

Thanks,
Peng.

> Also having a generic SMC mailbox could avoid having the actual hardware
> mailbox drivers in the kernel, so EL3 firmware could forward the request to an
> external management processor, and Linux would just work, without the need
> to describe the actual hardware mailbox device in some firmware tables. This
> might help ACPI on those devices.
> 
> Cheers,
> Andre.
> 
> > >
> > > - the mailbox identifier is indicated as part of the SMC call such
> > > that we can have multiple SCMI mailboxes serving both standard
> > > protocols and non-standard (in the 0x80 and above) range, also they
> > > may have different throughput (in hindsight, these could simply be
> > > different channels)
> > >
> > > Your patch series looks both good and useful to me, I would just put
> > > a provision in the binding to support an optional interrupt such
> > > that asynchronism gets reasonably easy to plug in when it is
> > > available (and desirable).
> >
> > OK. Let me think about it and add that in the next version of the patch.
> >
> > Thanks,
> > Peng.
> >
> > >
> > > >
> > > > This patch series provides a Linux mailbox compatible service
> > > > which uses smc calls to invoke firmware code, for instance taking
> > > > > care of SCMI requests.
> > > >

Re: [PATCH] firmware_loader: fix build without sysctl

2019-05-30 Thread Stephen Rothwell

Hi all,

On Fri, 31 May 2019 03:26:49 +0200 Matteo Croce  wrote:
>
> firmware_config_table has references to the sysctl code which
> triggers a build failure when CONFIG_PROC_SYSCTL is not set:
> 
> ld: drivers/base/firmware_loader/fallback_table.o:(.data+0x30): undefined reference to `sysctl_vals'
> ld: drivers/base/firmware_loader/fallback_table.o:(.data+0x38): undefined reference to `sysctl_vals'
> ld: drivers/base/firmware_loader/fallback_table.o:(.data+0x70): undefined reference to `sysctl_vals'
> ld: drivers/base/firmware_loader/fallback_table.o:(.data+0x78): undefined reference to `sysctl_vals'
> 
> Put the firmware_config_table struct under #ifdef CONFIG_PROC_SYSCTL.
> 
> Fixes: 6a33853c5773 ("proc/sysctl: add shared variables for range check")
> Reported-by: Randy Dunlap 
> Signed-off-by: Matteo Croce 

I have added this to linux-next today.

-- 
Cheers,
Stephen Rothwell



Re: [PATCH] time/tick-broadcast: Fix tick_broadcast_offline() lockdep complaint

2019-05-30 Thread Frederic Weisbecker

On Thu, May 30, 2019 at 05:58:09AM -0700, Paul E. McKenney wrote:
> It turns out that tick_broadcast_offline() was an innocent bystander.
> After all, interrupts are supposed to be disabled throughout
> take_cpu_down(), and therefore should have been disabled upon entry to
> tick_offline_cpu() and thus to tick_broadcast_offline().  This suggests
> that one of the CPU-hotplug notifiers was incorrectly enabling interrupts,
> and leaving them enabled on return.
> 
> Some debugging code showed that the culprit was sched_cpu_dying().
> It had irqs enabled after return from sched_tick_stop().  Which in turn
> had irqs enabled after return from cancel_delayed_work_sync().  Which is a
> wrapper around __cancel_work_timer().  Which can sleep in the case where
> something else is concurrently trying to cancel the same delayed work,
> and as Thomas Gleixner pointed out on IRC, sleeping is a decidedly bad
> idea when you are invoked from take_cpu_down(), regardless of the state
> you leave interrupts in upon return.

Nice catch! Sorry for leaving that puzzle behind.

> 
> Code inspection located no reason why the delayed work absolutely
> needed to be canceled from sched_tick_stop():  The work is not
> bound to the outgoing CPU by design, given that the whole point is
> to collect statistics without disturbing the outgoing CPU.
> 
> This commit therefore simply drops the cancel_delayed_work_sync() from
> sched_tick_stop().  Instead, a new ->state field is added to the tick_work
> structure so that the delayed-work handler function sched_tick_remote()
> can avoid reposting itself.  A cpu_is_offline() check is also added to
> sched_tick_remote() to avoid mucking with the state of an offlined CPU
> (though it does appear safe to do so).

I can't guarantee that it is safe myself to call the tick of an offline
CPU. Better have that check indeed.
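
The ->state handshake described in the quoted changelog, together with the
state diagram in the patch, can be modeled as a small transition table.
The following is a Python sketch for illustration only, not the kernel
code: `TickState`, `TRANSITIONS`, `step`, and `should_requeue` are invented
names, raising an exception stands in for WARN_ON_ONCE(), and the requeue
rule is an assumption drawn from the changelog's remark that the handler
can avoid reposting itself.

```python
from enum import Enum


class TickState(Enum):
    IDLE = 0
    RUNNING = 1
    OFFLINING = 2


# (current state, caller) -> next state, following the quoted state diagram.
TRANSITIONS = {
    (TickState.IDLE, "sched_tick_start"): TickState.RUNNING,
    (TickState.RUNNING, "sched_tick_stop"): TickState.OFFLINING,
    (TickState.OFFLINING, "sched_tick_start"): TickState.RUNNING,
    (TickState.OFFLINING, "sched_tick_remote"): TickState.IDLE,
    # The patch comment says these two callers may find and leave RUNNING.
    (TickState.RUNNING, "sched_tick_remote"): TickState.RUNNING,
    (TickState.RUNNING, "sched_tick_start"): TickState.RUNNING,
}


def step(state, caller):
    """Return the next state; raise where the patch would WARN_ON_ONCE()."""
    try:
        return TRANSITIONS[(state, caller)]
    except KeyError:
        raise ValueError(f"unexpected {caller}() in state {state.name}") from None


def should_requeue(state_after_remote):
    # Assumption: the delayed work reposts itself only while still RUNNING.
    return state_after_remote is TickState.RUNNING


if __name__ == "__main__":
    state = TickState.IDLE
    for caller in ("sched_tick_start", "sched_tick_stop", "sched_tick_remote"):
        state = step(state, caller)
        print(caller, "->", state.name)
```

In this model, sched_tick_stop() only marks the state; the work item
notices OFFLINING on its next run and stops reposting, which is why no
synchronous cancel is needed on the hotplug path.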

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 102dfcf0a29a..9a10ee9afcbf 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3050,14 +3050,44 @@ void scheduler_tick(void)
>  
>  struct tick_work {
>   int cpu;
> + int state;
>   struct delayed_work work;
>  };
> +// Values for ->state, see diagram below.
> +#define TICK_SCHED_REMOTE_IDLE   0

So it took me some time to understand that the IDLE state is the
tick work that acknowledged OFFLINING and finally completes the
offline process. Therefore perhaps we can rename it to
TICK_SCHED_REMOTE_OFFLINE so that we instantly get the state
machine scenario.

> +#define TICK_SCHED_REMOTE_RUNNING1
> +#define TICK_SCHED_REMOTE_OFFLINING  2
> +
> +// State diagram for ->state:
> +//
> +//
> //  +----->IDLE----------------------------+
> //  |                                      |
> //  |                                      |
> //  |                                      | sched_tick_start()
> //  | sched_tick_remote()                  |
> //  |                                      |
> //  |                                      V
> //  |                       +---------->RUNNING
> //  |                       |              |
> //  |                       |              |
> //  |                       |              |
> //  |    sched_tick_start() |              | sched_tick_stop()
> //  |                       |              |
> //  |                       |              |
> //  |                       |              |
> //  +-------------------OFFLINING<---------+
> +//
> +//
> +// Other transitions get WARN_ON_ONCE(), except that sched_tick_remote()
> +// and sched_tick_start() are happy to leave the state in RUNNING.
>  
>  static struct tick_work __percpu *tick_work_cpu;
>  
>  static void sched_tick_remote(struct work_struct *work)
>  {
>   struct delayed_work *dwork = to_delayed_work(work);
> + int os;
>   struct tick_work *twork = container_of(dwork, struct tick_work, work);
>   int cpu = twork->cpu;
>   struct rq *rq = cpu_rq(cpu);
> @@ -3077,7 +3107,7 @@ static void sched_tick_remote(struct work_struct *work)
>  
>   rq_lock_irq(rq, &rf);
>   curr = rq->curr;
> - if (is_idle_task(curr))
> + if (is_idle_task(curr) || cpu_is_offline(cpu))

Or we could simply check rq->online, while we have rq locked.

>   goto out_unlock;
>  
>   update_rq_clock(rq);
> @@ -3097,13 +3127,22 @@ static void sched_tick_remote(struct work_struct *work)
>   /*
>* Run the remote tick once per second (1Hz). This arbitrary
>* frequency is large enough to avoid overload but short enough
> -  * to keep scheduler internal stats reasonably up to date.
> +  * to keep scheduler internal stats reasonably up to date.  But
> +  * first update state to reflect hotplug activity if

[PATCH v2] hooks: fix a missing-check bug in selinux_sb_eat_lsm_opts()

2019-05-30 Thread Gen Zhang

In selinux_sb_eat_lsm_opts(), 'arg' is allocated by kmemdup_nul(), which
returns NULL on failure. So 'arg' should be checked.

Signed-off-by: Gen Zhang 
Reviewed-by: Ondrej Mosnacek 
Fixes: 99dbbb593fe6 ("selinux: rewrite selinux_sb_eat_lsm_opts()")
---
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 3ec702c..5a9e959 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2635,6 +2635,8 @@ static int selinux_sb_eat_lsm_opts(char *options, void **mnt_opts)
*q++ = c;
}
arg = kmemdup_nul(arg, q - arg, GFP_KERNEL);
+   if (!arg)
+   return -ENOMEM;
}
rc = selinux_add_opt(token, arg, mnt_opts);
if (unlikely(rc)) {
