RE: usb device to ether is not identification in 64bit Windows OS

2016-09-18 Thread Lipengcheng
Hi,
I downloaded the new RNDIS INF template from the official Microsoft site and
added our HiSilicon USB ID to it. With this file the device is identified
correctly on 64-bit Windows, and it also works correctly on 32-bit Windows.
Microsoft page:
https://msdn.microsoft.com/zh-cn/windows/hardware/drivers/network/remote-ndis-inf-template
The new file (with the HiSilicon USB ID added):

; Remote NDIS template device setup file
; Copyright (c) Microsoft Corporation
;
; This is the template for the INF installation script  for the RNDIS-over-USB
; host driver that leverages the newer NDIS 6.x miniport (rndismp6.sys) for
; improved performance. This INF works for Windows 7, Windows Server 2008 R2,
; and later operating systems on x86, amd64 and ia64 platforms.

[Version]
Signature   = "$Windows NT$"
Class   = Net
ClassGUID   = {4d36e972-e325-11ce-bfc1-08002be10318}
Provider= %Microsoft%
DriverVer   = 07/21/2008,6.0.6000.16384
;CatalogFile= device.cat

[Manufacturer]
%Microsoft% = RndisDevices,NTx86,NTamd64,NTia64

; Decoration for x86 architecture
[RndisDevices.NTx86]
%RndisDevice%= RNDIS.NT.6.0, USB\VID_0525_a4a2, USB\VID_1d6b_0104_00, USB\VID_0525_A4A2_0318

; Decoration for x64 architecture
[RndisDevices.NTamd64]
%RndisDevice%= RNDIS.NT.6.0, USB\VID_0525_a4a2, USB\VID_1d6b_0104_00, USB\VID_0525_A4A2_0318

; Decoration for ia64 architecture
[RndisDevices.NTia64]
%RndisDevice%= RNDIS.NT.6.0, USB\VID_0525_a4a2, USB\VID_1d6b_0104_00, USB\VID_0525_A4A2_0318

;@@@ This is the common setting for setup
[ControlFlags]
ExcludeFromSelect=*

; DDInstall section
; References the in-build Netrndis.inf
[RNDIS.NT.6.0]
Characteristics = 0x84   ; NCF_PHYSICAL + NCF_HAS_UI
BusType = 15
; NEVER REMOVE THE FOLLOWING REFERENCE FOR NETRNDIS.INF
include = netrndis.inf
needs   = usbrndis6.ndi
AddReg  = Rndis_AddReg
*IfType= 6; IF_TYPE_ETHERNET_CSMACD.
*MediaType = 16   ; NdisMediumNative802_11
*PhysicalMediaType = 14   ; NdisPhysicalMedium802_3

; DDInstal.Services section
[RNDIS.NT.6.0.Services]
include = netrndis.inf
needs   = usbrndis6.ndi.Services

; Optional registry settings. You can modify as needed.
[RNDIS_AddReg] 
HKR, NDI\params\RndisProperty, ParamDesc,  0, %Rndis_Property%
HKR, NDI\params\RndisProperty, type,   0, "edit"
HKR, NDI\params\RndisProperty, LimitText,  0, "12"
HKR, NDI\params\RndisProperty, UpperCase,  0, "1"
HKR, NDI\params\RndisProperty, default,0, " "
HKR, NDI\params\RndisProperty, optional,   0, "1"

; No sys copyfiles - the sys files are already in-build 
; (part of the operating system).

; Modify these strings for your device as needed.
[Strings]
Microsoft = "Microsoft Corporation"
RndisDevice   = "Remote NDIS6 based Device"
Rndis_Property = "Optional RNDIS Property"

> -Original Message-
> From: gre...@linuxfoundation.org [mailto:gre...@linuxfoundation.org]
> Sent: Monday, September 19, 2016 12:46 AM
> To: Lipengcheng
> Cc: cor...@lwn.net; linux-...@vger.kernel.org; linux-...@vger.kernel.org; 
> linux-kernel@vger.kernel.org
> Subject: Re: usb device to ether is not identification in 64bit Windows OS
> 
> On Sun, Sep 18, 2016 at 01:16:59PM +, Lipengcheng wrote:
> > Hi,
> > kernel version:4.8.0
> > file:Documentation / usb / linux.inf
> > problem:PC windows is 32bit OS, using Ethernet gadget driver, it can
> > be correct identification. But PC windows is 64bit OS, while modifying
> > linux.inf file LinuxDevice parameters, it can not correct
> > identification, the serial port can be printed correctly:g_ether
> > gadget: high-speed config # 2: RNDIS. I suspect that the linux.inf
> > files are not mismatch 64bit windows OS.
> 
> Given that everyone here does not use Windows at all, you are going to be on 
> your own here, sorry.  If you do come up with an updated .inf
> file, we will be glad to take a patch.
> 
> good luck,
> 
> greg k-h

Best regards,
Pengcheng Li


Re: [PATCH v6 0/6] Add MT8173 MDP Driver

2016-09-18 Thread Minghsiu Tsai
On Wed, 2016-09-14 at 14:43 +0200, Hans Verkuil wrote:
> Hi Minghsiu,
> 
> v6 looks good, but I get these warnings when compiling it for i686:
> 
> linux-git-i686: WARNINGS
> 
> /home/hans/work/build/media-git/drivers/media/platform/mtk-mdp/mtk_mdp_vpu.c: 
> In function 'mtk_mdp_vpu_handle_init_ack':
> /home/hans/work/build/media-git/drivers/media/platform/mtk-mdp/mtk_mdp_vpu.c:28:28:
>  warning: cast to pointer from integer of different size 
> [-Wint-to-pointer-cast]
>   struct mtk_mdp_vpu *vpu = (struct mtk_mdp_vpu *)msg->ap_inst;
> ^
> /home/hans/work/build/media-git/drivers/media/platform/mtk-mdp/mtk_mdp_vpu.c: 
> In function 'mtk_mdp_vpu_ipi_handler':
> /home/hans/work/build/media-git/drivers/media/platform/mtk-mdp/mtk_mdp_vpu.c:40:28:
>  warning: cast to pointer from integer of different size 
> [-Wint-to-pointer-cast]
>   struct mtk_mdp_vpu *vpu = (struct mtk_mdp_vpu *)msg->ap_inst;
> ^
> /home/hans/work/build/media-git/drivers/media/platform/mtk-mdp/mtk_mdp_vpu.c: 
> In function 'mtk_mdp_vpu_send_ap_ipi':
> /home/hans/work/build/media-git/drivers/media/platform/mtk-mdp/mtk_mdp_vpu.c:111:16:
>  warning: cast from pointer to integer of different size 
> [-Wpointer-to-int-cast]
>   msg.ap_inst = (uint64_t)vpu;
> ^
> /home/hans/work/build/media-git/drivers/media/platform/mtk-mdp/mtk_mdp_vpu.c: 
> In function 'mtk_mdp_vpu_init':
> /home/hans/work/build/media-git/drivers/media/platform/mtk-mdp/mtk_mdp_vpu.c:129:16:
>  warning: cast from pointer to integer of different size 
> [-Wpointer-to-int-cast]
>   msg.ap_inst = (uint64_t)vpu;
> ^
> 
> This is not blocking, but if you can post a follow-up patch for this, then 
> that
> would be helpful.
> 

Hi Hans,

I have reproduced these warnings on x86.
I also got the report from the kbuild robot. There are build errors in
mtk_mdp_pm_suspend().
Besides, on x86 I also get build warnings for kzalloc() and kfree()
used in mtk_mdp_m2m.c. They can be fixed by including linux/slab.h.
I will upload the patches today.
Thanks


Ming Hsiu
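
For reference, the cast warnings quoted above come from converting directly
between a pointer and a 64-bit integer on a 32-bit build. A common way to
silence them is to go through a pointer-sized integer first. The sketch below
is a userspace illustration only: struct demo_vpu, struct demo_msg and both
helpers are hypothetical names, and only the ap_inst field mirrors the one in
the warnings. Kernel code would typically cast through unsigned long rather
than uintptr_t; this is not necessarily the fix that was posted.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's instance structure. */
struct demo_vpu {
	int id;
};

/* Hypothetical message; ap_inst is a fixed 64-bit handle field. */
struct demo_msg {
	uint64_t ap_inst;
};

/* Pointer -> 64-bit handle: widen via uintptr_t so sizes always match. */
static inline uint64_t demo_ptr_to_handle(struct demo_vpu *vpu)
{
	return (uint64_t)(uintptr_t)vpu;
}

/* 64-bit handle -> pointer: narrow via uintptr_t before the pointer cast. */
static inline struct demo_vpu *demo_handle_to_ptr(uint64_t handle)
{
	return (struct demo_vpu *)(uintptr_t)handle;
}
```

With these helpers, storing demo_ptr_to_handle(vpu) in msg.ap_inst and later
calling demo_handle_to_ptr(msg->ap_inst) round-trips the pointer on both
32-bit and 64-bit builds without either warning.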

> Regards,
> 
>   Hans
> 
> 
> On 09/08/2016 03:09 PM, Minghsiu Tsai wrote:
> > Changes in v6:
> > - s_selection() can't set the _DEFAULT and _BOUNDS targets
> > - Add Maintainers entry
> > 
> > Changes in v5:
> > - Add ack in the comment of dts patch
> > - Fix s/g_selection()
> > - Separate format V4L2_PIX_FMT_MT21C into new patch  
> > 
> > Changes in v4:
> > - Add "depends on HAS_DMA" in Kconfig.
> > - Fix s/g_selection()
> > - Replace struct v4l2_crop with u32 and struct v4l2_rect
> > - Remove VB2_USERPTR
> > - Move mutex lock after ctx allocation in mtk_mdp_m2m_open()
> > - Add new format V4L2_PIX_FMT_YVU420 to support software on Android 
> > platform.
> > - Only width/height of image in format V4L2_PIX_FMT_MT21 is aligned to 
> > 16/16,
> >   other ones are aligned to 2/2 by default
> > 
> > Changes in v3:
> > - Modify device ndoe as structured one.
> > - Fix conflict in dts on Linux 4.8-rc1
> > 
> > Changes in v2:
> > - Add section to describe blocks function in dts-bindings
> > - Remove the assignment of device_caps in querycap()
> > - Remove format's name assignment
> > - Copy colorspace-related parameters from OUTPUT to CAPTURE
> > - Use m2m helper functions
> > - Fix DMA allocation failure
> > - Initialize lazily vpu instance in streamon()
> > 
> > ==
> >  Introduction
> > ==
> > 
> > The purpose of this series is to add the driver for Media Data Path HW 
> > embedded in the Mediatek's MT8173 SoC.
> > MDP is used for scaling and color space conversion.
> > 
> > It could convert V4L2_PIX_FMT_MT21 to V4L2_PIX_FMT_NV12M or 
> > V4L2_PIX_FMT_YUV420M.
> > 
> > NV12M/YUV420M/MT21 -> MDP -> NV12M/YUV420M
> > 
> > This patch series rely on MTK VPU driver in patch series "Add MT8173 Video 
> > Encoder Driver and VPU Driver"[1] and "Add MT8173 Video Decoder Driver"[2].
> > MDP driver rely on VPU driver to load, communicate with VPU.
> > 
> > Internally the driver uses videobuf2 framework and MTK IOMMU and MTK SMI 
> > both have been merged in v4.6-rc1.
> > 
> > [1]https://patchwork.kernel.org/patch/9002171/
> > [2]https://patchwork.kernel.org/patch/9141245/
> > 
> > ==
> >  Device interface
> > ==
> > 
> > In principle the driver bases on v4l2 memory-to-memory framework:
> > it provides a single video node and each opened file handle gets its own 
> > private context with separate buffer queues. Each context consist of 2 
> > buffer queues: OUTPUT (for source buffers) and CAPTURE (for destination 
> > buffers).
> > OUTPUT and CAPTURE buffer could be MMAP or DMABUF memory type.
> > 
> > v4l2-compliance test output:
> > v4l2-compliance SHA   : abc1453dfe89f244dccd3460d8e1a2e3091cbadb
> > 
> > Driver Info:
> > Driver name   : mtk-mdp
> > Card type : soc:mdp
> > Bus info  : platform:mt8173
> > Driver version: 4.8.0
> > 

Re: [PATCH net-next 00/11] rxrpc: Tracepoint addition and improvement

2016-09-18 Thread David Miller
From: David Howells 
Date: Sun, 18 Sep 2016 00:21:29 +0100

> Tagged thusly:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
>   rxrpc-rewrite-20160917-2

Pulled.


Re: [PATCH v10 3/3] Documentation: kdump: add description of enable multi-cpus support

2016-09-18 Thread Baoquan He
On 09/19/16 at 12:01pm, Baoquan He wrote:
> From: Zhou Wenjian 
> 
> Multi-cpu support is useful to improve the performance of kdump in
> some cases. So add the description of enable multi-cpu support in
> dump-capture kernel.
> 
> Signed-off-by: Zhou Wenjian 
> Acked-by: Baoquan He 
> Acked-by: Xunlei Pang 
> Signed-off-by: Baoquan He 

Oh, sorry. My git config added this Signed-off-by automatically.
Will resend. Nack this patchset.

> ---
>  Documentation/kdump/kdump.txt | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
> index f7ef340..b0eb27b 100644
> --- a/Documentation/kdump/kdump.txt
> +++ b/Documentation/kdump/kdump.txt
> @@ -396,6 +396,13 @@ Notes on loading the dump-capture kernel:
>Note, though maxcpus always works, you had better replace it with
>nr_cpus to save memory if supported by the current ARCH, such as x86.
>  
> +* You should enable multi-cpu support in dump-capture kernel if you intend
> +  to use multi-thread programs with it, such as parallel dump feature of
> +  makedumpfile. Otherwise, the multi-thread program may have a great
> +  performance degradation. To enable multi-cpu support, you should bring up an
> +  SMP dump-capture kernel and specify maxcpus/nr_cpus, disable_cpu_apicid=[X]
> +  options while loading it.
> +
>  * For s390x there are two kdump modes: If a ELF header is specified with
>the elfcorehdr= kernel parameter, it is used by the kdump kernel as it
>is done on all other architectures. If no elfcorehdr= kernel parameter is
> -- 
> 2.5.5
> 


Re: [PATCH net-next 00/14] rxrpc: Fixes & miscellany

2016-09-18 Thread David Miller
From: David Howells 
Date: Sun, 18 Sep 2016 00:17:44 +0100

> Tagged thusly:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
>   rxrpc-rewrite-20160917-1

Pulled.


Re: [PATCH v2 2/3] powerpc: get hugetlbpage handling more generic

2016-09-18 Thread Aneesh Kumar K.V

Christophe Leroy  writes:
> +#else
> +static void hugepd_free(struct mmu_gather *tlb, void *hugepte)
> +{
> + BUG();
> +}
> +
>  #endif


I was expecting that BUG() to be removed in the next patch, but I don't
see that happening there. Considering

@@ -475,11 +453,10 @@ static void free_hugepd_range(struct mmu_gather *tlb, 
hugepd_t *hpdp, int pdshif
for (i = 0; i < num_hugepd; i++, hpdp++)
hpdp->pd = 0;

-#ifdef CONFIG_PPC_FSL_BOOK3E
-   hugepd_free(tlb, hugepte);
-#else
-   pgtable_free_tlb(tlb, hugepte, pdshift - shift);
-#endif
+   if (shift >= pdshift)
+   hugepd_free(tlb, hugepte);
+   else
+   pgtable_free_tlb(tlb, hugepte, pdshift - shift);
 }

What is it that I am missing?

-aneesh



[PATCH] ARC: [plat-eznps] add missing atomic_fetch_xxx operations

2016-09-18 Thread Noam Camus
From: Noam Camus 

Build breakage since the last changes to generic atomic operations.
Add a couple of missing macros which are now mandatory.

Signed-off-by: Noam Camus 
---
 arch/arc/include/asm/atomic.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/arc/include/asm/atomic.h b/arch/arc/include/asm/atomic.h
index 4e3c1b6..4f732bf 100644
--- a/arch/arc/include/asm/atomic.h
+++ b/arch/arc/include/asm/atomic.h
@@ -284,6 +284,7 @@ static inline int atomic_fetch_##op(int i, atomic_t *v) \
 ATOMIC_OPS(add, +=, CTOP_INST_AADD_DI_R2_R2_R3)
 #define atomic_sub(i, v) atomic_add(-(i), (v))
 #define atomic_sub_return(i, v) atomic_add_return(-(i), (v))
+#define atomic_fetch_sub(i, v) atomic_fetch_add(-(i), (v))
 
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op, c_op, asm_op)   \
@@ -292,6 +293,7 @@ ATOMIC_OPS(add, +=, CTOP_INST_AADD_DI_R2_R2_R3)
 
 ATOMIC_OPS(and, &=, CTOP_INST_AAND_DI_R2_R2_R3)
 #define atomic_andnot(mask, v) atomic_and(~(mask), (v))
+#define atomic_fetch_andnot(mask, v) atomic_fetch_and(~(mask), (v))
 ATOMIC_OPS(or, |=, CTOP_INST_AOR_DI_R2_R2_R3)
 ATOMIC_OPS(xor, ^=, CTOP_INST_AXOR_DI_R2_R2_R3)
 
-- 
1.7.1
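
The two macros added above rely on simple identities: subtracting i is the
same as adding -i, and and-not with a mask is an AND with the complement of
that mask. In both cases the fetch variant returns the value held before the
update. A minimal userspace sketch using C11 <stdatomic.h> rather than the
kernel's atomic_t API (demo_fetch_sub and demo_fetch_andnot are hypothetical
names, and the argument order follows C11, not the kernel macros):

```c
#include <stdatomic.h>

/* Userspace analogue of atomic_fetch_sub() built on fetch-add. */
static inline int demo_fetch_sub(atomic_int *v, int i)
{
	/* Subtracting i is adding -i; the old value is returned either way. */
	return atomic_fetch_add(v, -i);
}

/* Userspace analogue of atomic_fetch_andnot() built on fetch-and. */
static inline int demo_fetch_andnot(atomic_int *v, int mask)
{
	/* Clearing the bits set in mask is an AND with the complement. */
	return atomic_fetch_and(v, ~mask);
}
```

Starting from 10, demo_fetch_sub(&v, 3) returns 10 and leaves 7; a following
demo_fetch_andnot(&v, 0x5) returns 7 and leaves 2.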



Re: [PATCH v2 2/3] powerpc: get hugetlbpage handling more generic

2016-09-18 Thread Aneesh Kumar K.V
Christophe Leroy  writes:

> Today there are two implementations of hugetlbpages which are managed
> by exclusive #ifdefs:
> * FSL_BOOKE: several directory entries points to the same single hugepage
> * BOOK3S: one upper level directory entry points to a table of hugepages
>
> In preparation of implementation of hugepage support on the 8xx, we
> need a mix of the two above solutions, because the 8xx needs both cases
> depending on the size of pages:
> * In 4k page size mode, each PGD entry covers a 4M bytes area. It means
> that 2 PGD entries will be necessary to cover an 8M hugepage while a
> single PGD entry will cover 8x 512k hugepages.
> * In 16 page size mode, each PGD entry covers a 64M bytes area. It means
> that 8x 8M hugepages will be covered by one PGD entry and 64x 512k
> hugepages will be covers by one PGD entry.
>
> This patch:
> * removes #ifdefs in favor of if/else based on the range sizes
> * merges the two huge_pte_alloc() functions as they are pretty similar
> * merges the two hugetlbpage_init() functions as they are pretty similar
>
> Signed-off-by: Christophe Leroy 
> ---
> v2: This part is new and results from a split of last patch of v1 serie in
> two parts
>
>  arch/powerpc/mm/hugetlbpage.c | 189 
> +-
>  1 file changed, 77 insertions(+), 112 deletions(-)
>
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index 8a512b1..2119f00 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -64,14 +64,16 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t 
> *hpdp,
>  {
>   struct kmem_cache *cachep;
>   pte_t *new;
> -
> -#ifdef CONFIG_PPC_FSL_BOOK3E
>   int i;
> - int num_hugepd = 1 << (pshift - pdshift);
> - cachep = hugepte_cache;
> -#else
> - cachep = PGT_CACHE(pdshift - pshift);
> -#endif
> + int num_hugepd;
> +
> + if (pshift >= pdshift) {
> + cachep = hugepte_cache;
> + num_hugepd = 1 << (pshift - pdshift);
> + } else {
> + cachep = PGT_CACHE(pdshift - pshift);
> + num_hugepd = 1;
> + }

Is there a way to hint the likely/unlikely branch based on the page size
selected at build time?
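
For reference, the pshift >= pdshift branch quoted above can be checked
against the 4k-page numbers from the commit message, where one PGD entry
covers 4M (pdshift = 22). The helpers below are a standalone sketch with
hypothetical names, not code from the patch; the second helper is only an
illustration of the "table of hugepages" case and does not appear in the
quoted hunk.

```c
/*
 * Number of directory entries that must point at one hugepage.
 * pshift:  log2 of the hugepage size
 * pdshift: log2 of the area covered by one directory entry
 */
static inline int demo_num_hugepd(unsigned int pshift, unsigned int pdshift)
{
	if (pshift >= pdshift)
		return 1 << (pshift - pdshift);
	return 1;
}

/* Number of hugepages reachable through a single directory entry. */
static inline int demo_hugepages_per_entry(unsigned int pshift,
					   unsigned int pdshift)
{
	if (pshift >= pdshift)
		return 1;
	return 1 << (pdshift - pshift);
}
```

With pdshift = 22, an 8M hugepage (pshift = 23) needs 2 directory entries,
while a 512k hugepage (pshift = 19) needs 1 entry, which in turn covers 8
such hugepages. This matches the 4k-mode description in the commit message.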



>  
>   new = kmem_cache_zalloc(cachep, GFP_KERNEL);
>  
> @@ -89,7 +91,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t 
> *hpdp,
>   smp_wmb();
>  
>   spin_lock(&mm->page_table_lock);
> -#ifdef CONFIG_PPC_FSL_BOOK3E
> +
>   /*
>* We have multiple higher-level entries that point to the same
>* actual pte location.  Fill in each as we go and backtrack on error.
> @@ -100,8 +102,13 @@ static int __hugepte_alloc(struct mm_struct *mm, 
> hugepd_t *hpdp,
>   if (unlikely(!hugepd_none(*hpdp)))
>   break;
>   else



> -#ifdef CONFIG_PPC_FSL_BOOK3E
>  struct kmem_cache *hugepte_cache;
>  static int __init hugetlbpage_init(void)
>  {
>   int psize;
>  
> - for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
> - unsigned shift;
> -
> - if (!mmu_psize_defs[psize].shift)
> - continue;
> -
> - shift = mmu_psize_to_shift(psize);
> -
> - /* Don't treat normal page sizes as huge... */
> - if (shift != PAGE_SHIFT)
> - if (add_huge_page_size(1ULL << shift) < 0)
> - continue;
> - }
> -
> - /*
> -  * Create a kmem cache for hugeptes.  The bottom bits in the pte have
> -  * size information encoded in them, so align them to allow this
> -  */
> - hugepte_cache =  kmem_cache_create("hugepte-cache", sizeof(pte_t),
> -HUGEPD_SHIFT_MASK + 1, 0, NULL);
> - if (hugepte_cache == NULL)
> - panic("%s: Unable to create kmem cache for hugeptes\n",
> -   __func__);
> -
> - /* Default hpage size = 4M */
> - if (mmu_psize_defs[MMU_PAGE_4M].shift)
> - HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_4M].shift;
> - else
> - panic("%s: Unable to set default huge page size\n", __func__);
> -
> -
> - return 0;
> -}
> -#else
> -static int __init hugetlbpage_init(void)
> -{
> - int psize;
> -
> +#if !defined(CONFIG_PPC_FSL_BOOK3E)
>   if (!radix_enabled() && !mmu_has_feature(MMU_FTR_16M_PAGE))
>   return -ENODEV;
> -
> +#endif

Do we need that #if? radix_enabled() should evaluate to 0 there, so the
condition should be removed at compile time, shouldn't it? Or are you
seeing errors with that?


>   for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
>   unsigned shift;
>   unsigned pdshift;
> @@ -860,16 +807,31 @@ static int __init hugetlbpage_init(void)
>* if we have pdshift and shift value same, we don't
>* use pgt cache for hugepd.
>*/
> - if (pdshift != 

Re: [PATCH v2 2/3] powerpc: get hugetlbpage handling more generic

2016-09-18 Thread Aneesh Kumar K.V
Christophe Leroy  writes:

> Today there are two implementations of hugetlbpages which are managed
> by exclusive #ifdefs:
> * FSL_BOOKE: several directory entries points to the same single hugepage
> * BOOK3S: one upper level directory entry points to a table of hugepages
>
> In preparation of implementation of hugepage support on the 8xx, we
> need a mix of the two above solutions, because the 8xx needs both cases
> depending on the size of pages:
> * In 4k page size mode, each PGD entry covers a 4M bytes area. It means
> that 2 PGD entries will be necessary to cover an 8M hugepage while a
> single PGD entry will cover 8x 512k hugepages.
> * In 16 page size mode, each PGD entry covers a 64M bytes area. It means
> that 8x 8M hugepages will be covered by one PGD entry and 64x 512k
> hugepages will be covers by one PGD entry.
>
> This patch:
> * removes #ifdefs in favor of if/else based on the range sizes
> * merges the two huge_pte_alloc() functions as they are pretty similar
> * merges the two hugetlbpage_init() functions as they are pretty similar
>
> Signed-off-by: Christophe Leroy 
> ---
> v2: This part is new and results from a split of the last patch of the v1
> series in two parts
>
>  arch/powerpc/mm/hugetlbpage.c | 189 +-
>  1 file changed, 77 insertions(+), 112 deletions(-)
>
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index 8a512b1..2119f00 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -64,14 +64,16 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>  {
>   struct kmem_cache *cachep;
>   pte_t *new;
> -
> -#ifdef CONFIG_PPC_FSL_BOOK3E
>   int i;
> - int num_hugepd = 1 << (pshift - pdshift);
> - cachep = hugepte_cache;
> -#else
> - cachep = PGT_CACHE(pdshift - pshift);
> -#endif
> + int num_hugepd;
> +
> + if (pshift >= pdshift) {
> + cachep = hugepte_cache;
> + num_hugepd = 1 << (pshift - pdshift);
> + } else {
> + cachep = PGT_CACHE(pdshift - pshift);
> + num_hugepd = 1;
> + }

Is there a way to hint the likely/unlikely branch based on the page size
selected at build time?
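
One way this could look, as a minimal user-space sketch and not the kernel
code: the preprocessor picks the hint from a build-time page-size setting.
DEMO_PAGE_SHIFT, DEMO_PDSHIFT, hugepd_spans_pgd_hint() and demo_num_hugepd()
are hypothetical names. The PGD coverage values (4M with 4k pages, 64M with
16k pages) come from the commit message quoted above, and __builtin_expect
is the GCC/Clang builtin that likely()/unlikely() are built on.

```c
#include <assert.h>

/* GCC/Clang branch hints, same shape as the kernel's likely()/unlikely(). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical build-time knob: 12 for 4k pages, 14 for 16k pages. */
#ifndef DEMO_PAGE_SHIFT
#define DEMO_PAGE_SHIFT 12
#endif

#if DEMO_PAGE_SHIFT == 14
/* 16k pages: a PGD entry covers 64M, so 512k and 8M hugepages never span one. */
#define DEMO_PDSHIFT 26
#define hugepd_spans_pgd_hint(cond) unlikely(cond)
#else
/* 4k pages: a PGD entry covers 4M; 8M spans two entries, 512k does not. No hint. */
#define DEMO_PDSHIFT 22
#define hugepd_spans_pgd_hint(cond) (cond)
#endif

/* Number of PGD entries that must point at one hugepage of size 1 << pshift. */
static int demo_num_hugepd(unsigned int pshift)
{
	assert(pshift < 32);	/* keep the shift below well defined */

	if (hugepd_spans_pgd_hint(pshift >= DEMO_PDSHIFT))
		return 1 << (pshift - DEMO_PDSHIFT);
	return 1;
}
```

With the default 4k setting, demo_num_hugepd(23) gives 2 entries for an 8M
hugepage and demo_num_hugepd(19) gives 1 for a 512k hugepage. Whether a hint
pays off in the real function would need measuring.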



>  
>   new = kmem_cache_zalloc(cachep, GFP_KERNEL);
>  
> @@ -89,7 +91,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>   smp_wmb();
>  
>   spin_lock(&mm->page_table_lock);
> -#ifdef CONFIG_PPC_FSL_BOOK3E
> +
>   /*
>* We have multiple higher-level entries that point to the same
>* actual pte location.  Fill in each as we go and backtrack on error.
> @@ -100,8 +102,13 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>   if (unlikely(!hugepd_none(*hpdp)))
>   break;
>   else



> -#ifdef CONFIG_PPC_FSL_BOOK3E
>  struct kmem_cache *hugepte_cache;
>  static int __init hugetlbpage_init(void)
>  {
>   int psize;
>  
> - for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
> - unsigned shift;
> -
> - if (!mmu_psize_defs[psize].shift)
> - continue;
> -
> - shift = mmu_psize_to_shift(psize);
> -
> - /* Don't treat normal page sizes as huge... */
> - if (shift != PAGE_SHIFT)
> - if (add_huge_page_size(1ULL << shift) < 0)
> - continue;
> - }
> -
> - /*
> -  * Create a kmem cache for hugeptes.  The bottom bits in the pte have
> -  * size information encoded in them, so align them to allow this
> -  */
> - hugepte_cache =  kmem_cache_create("hugepte-cache", sizeof(pte_t),
> -HUGEPD_SHIFT_MASK + 1, 0, NULL);
> - if (hugepte_cache == NULL)
> - panic("%s: Unable to create kmem cache for hugeptes\n",
> -   __func__);
> -
> - /* Default hpage size = 4M */
> - if (mmu_psize_defs[MMU_PAGE_4M].shift)
> - HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_4M].shift;
> - else
> - panic("%s: Unable to set default huge page size\n", __func__);
> -
> -
> - return 0;
> -}
> -#else
> -static int __init hugetlbpage_init(void)
> -{
> - int psize;
> -
> +#if !defined(CONFIG_PPC_FSL_BOOK3E)
>   if (!radix_enabled() && !mmu_has_feature(MMU_FTR_16M_PAGE))
>   return -ENODEV;
> -
> +#endif

Do we need that #if? radix_enabled() should evaluate to 0 there, so the
compiler should drop that if condition at build time, shouldn't it? Or are
you seeing errors with that?
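
To illustrate the compile-time folding being discussed, here is a small
user-space sketch. DEMO_FSL_BOOK3E, demo_radix_enabled() and
demo_hugetlbpage_init() are hypothetical stand-ins and not the kernel
symbols; has_16m plays the role of mmu_has_feature(MMU_FTR_16M_PAGE).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_PPC_FSL_BOOK3E: 0 or 1, fixed at build time. */
#ifndef DEMO_FSL_BOOK3E
#define DEMO_FSL_BOOK3E 0
#endif

/* Stub: where radix is not built in, this is the constant false. */
static inline bool demo_radix_enabled(void)
{
	return false;
}

/*
 * The platform test is an ordinary C expression. With DEMO_FSL_BOOK3E == 1
 * the condition is constant false and the compiler can drop the branch,
 * yet both variants are still parsed and type-checked.
 */
static int demo_hugetlbpage_init(bool has_16m)
{
	if (!DEMO_FSL_BOOK3E && !demo_radix_enabled() && !has_16m)
		return -ENODEV;
	return 0;
}

/* Usage: with the default DEMO_FSL_BOOK3E == 0 the feature check still applies. */
static void demo_init_check(void)
{
	assert(demo_hugetlbpage_init(true) == 0);
	assert(demo_hugetlbpage_init(false) == -ENODEV);
}
```

Note that a constant-false radix_enabled() on its own only reduces the test
to the feature check; it does not remove it. Whether that is why the patch
keeps the #if is not stated in this thread.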


>   for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
>   unsigned shift;
>   unsigned pdshift;
> @@ -860,16 +807,31 @@ static int __init hugetlbpage_init(void)
>* if we have pdshift and shift value same, we don't
>* use pgt cache for hugepd.
>*/
> - if (pdshift != shift) {
> + if (pdshift > shift) {
>   

Re: [PATCH 112/124] staging: lustre: ptlrpc: remove unnecessary EXPORT_SYMBOL

2016-09-18 Thread kbuild test robot
Hi frank,

[auto build test ERROR on staging/staging-testing]
[cannot apply to v4.8-rc7 next-20160916]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]
[Suggest to use git(>=2.9.0) format-patch --base= (or --base=auto for 
convenience) to record what (public, well-known) commit your patch series was 
built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:
https://github.com/0day-ci/linux/commits/James-Simmons/missing-patches-from-Lustre-2-7-release/20160919-045619
config: s390-allyesconfig (attached as .config)
compiler: s390x-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=s390 

All errors (new ones prefixed by >>):

>> ERROR: "lustre_swab_lov_mds_md" [drivers/staging/lustre/lustre/lov/lov.ko] undefined!
>> ERROR: "lustre_swab_lov_mds_md" [drivers/staging/lustre/lustre/llite/lustre.ko] undefined!

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH resend] sctp: Remove some redundant code

2016-09-18 Thread David Miller
From: Christophe JAILLET 
Date: Fri, 16 Sep 2016 23:05:35 +0200

> In commit 311b21774f13 ("sctp: simplify sk_receive_queue locking"), a call
> to 'skb_queue_splice_tail_init()' has been made explicit. Previously it was
> hidden in 'sctp_skb_list_tail()'
> 
> Now, the code around it looks redundant. The '_init()' part of
> 'skb_queue_splice_tail_init()' should already do the same.
> 
> Signed-off-by: Christophe JAILLET 
> Acked-by: Marcelo Ricardo Leitner 
> Acked-by: Neil Horman 

Applied to net-next.


[PATCH v7 6/6] powerpc: pSeries: Add pv-qspinlock build config/make

2016-09-18 Thread Pan Xinhui
pSeries runs as a guest and might need pv-qspinlock.

Signed-off-by: Pan Xinhui 
---
 arch/powerpc/kernel/Makefile   | 1 +
 arch/powerpc/platforms/pseries/Kconfig | 8 
 2 files changed, 9 insertions(+)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index fe4c075..efd2f3d 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -49,6 +49,7 @@ obj-$(CONFIG_PPC_970_NAP) += idle_power4.o
 obj-$(CONFIG_PPC_P7_NAP)   += idle_book3s.o
 procfs-y   := proc_powerpc.o
 obj-$(CONFIG_PROC_FS)  += $(procfs-y)
+obj-$(CONFIG_PARAVIRT_SPINLOCKS)   += paravirt.o
 rtaspci-$(CONFIG_PPC64)-$(CONFIG_PCI)  := rtas_pci.o
 obj-$(CONFIG_PPC_RTAS) += rtas.o rtas-rtc.o $(rtaspci-y-y)
 obj-$(CONFIG_PPC_RTAS_DAEMON)  += rtasd.o
diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index f669323..46632e4 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -128,3 +128,11 @@ config HV_PERF_CTRS
  systems. 24x7 is available on Power 8 systems.
 
   If unsure, select Y.
+
+config PARAVIRT_SPINLOCKS
+   bool "Paravirtualization support for qspinlock"
+   depends on PPC_SPLPAR && QUEUED_SPINLOCKS
+   default y
+   help
+ If the platform supports virtualization, for example PowerVM, this option
+ can give the guest better performance.
-- 
2.4.11



[PATCH v7 5/6] powerpc/pv-qspinlock: powerpc support pv-qspinlock

2016-09-18 Thread Pan Xinhui
By default, pv-qspinlock uses the native qspinlock implementation.
pv_lock initialization should be done at boot stage with irqs disabled.
If we run as a guest in powerKVM/pHyp shared_processor mode, set the
pv_lock_ops callbacks to the paravirtualized version of pv-qspinlock,
which makes full use of virtualization.
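
The callback switch described above can be sketched in user space as
follows. This is only an illustration under assumptions: all demo_* names
are hypothetical, the "pv" callbacks merely count calls instead of talking
to a hypervisor, and the sketch is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_lock {
	int locked;
};

/* Callback table, modelled loosely on the pv_lock_ops idea in this patch. */
struct demo_lock_ops {
	void (*lock)(struct demo_lock *l);
	void (*unlock)(struct demo_lock *l);
};

static int demo_pv_calls;	/* counts calls that took the pv path */

static void demo_native_lock(struct demo_lock *l)   { l->locked = 1; }
static void demo_native_unlock(struct demo_lock *l) { l->locked = 0; }

static void demo_pv_lock(struct demo_lock *l)
{
	demo_pv_calls++;	/* a real pv path would wait/kick via the hypervisor */
	l->locked = 1;
}

static void demo_pv_unlock(struct demo_lock *l)
{
	demo_pv_calls++;
	l->locked = 0;
}

/* Default: native callbacks. */
static struct demo_lock_ops demo_lock_op = {
	.lock = demo_native_lock,
	.unlock = demo_native_unlock,
};

/* Called once at boot, before any other CPU takes a lock (sketch assumption). */
static void demo_pv_lock_init(bool shared_processor)
{
	if (!shared_processor)
		return;
	demo_lock_op.lock = demo_pv_lock;
	demo_lock_op.unlock = demo_pv_unlock;
}

static void demo_spin_lock(struct demo_lock *l)
{
	assert(!l->locked);	/* single-threaded demo: no real spinning */
	demo_lock_op.lock(l);
}

static void demo_spin_unlock(struct demo_lock *l)
{
	demo_lock_op.unlock(l);
}
```

Every lock and unlock goes through one indirect call, which is the callback
overhead the cover letter refers to.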

There is a hash table that stores the cpu number, keyed by the lock, so
pv_wait can always find out who the lock holder is by looking up the
lock. We also store the lock in a per_cpu struct and remove it once we
own the lock, so pv_wait knows which lock we are spinning on. The cpu in
the hash table might not be the correct lock holder, though: for
performance reasons, we do not handle hash conflicts.

Also introduce spin_lock_holder, which tells who owns the lock now.
Currently the only user is spin_unlock_wait.
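
A lossy lock-to-cpu table of this kind might look like the sketch below.
It is not the code from paravirt.c: the table size, the hash function and
all demo_* names are assumptions. Unlike the description above, this sketch
also stores the key, so a collision yields -1 and not a wrong cpu.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HASH_BITS 8
#define DEMO_HASH_SIZE (1u << DEMO_HASH_BITS)

struct demo_entry {
	const void *lock;	/* key */
	int cpu;		/* last cpu recorded for this slot */
};

static struct demo_entry demo_table[DEMO_HASH_SIZE];

/* Cheap pointer hash; the shift amounts are arbitrary choices for the demo. */
static unsigned int demo_hash(const void *lock)
{
	uintptr_t v = (uintptr_t)lock;

	return (unsigned int)((v >> 4) ^ (v >> 12)) & (DEMO_HASH_SIZE - 1);
}

/* Record cpu as holder of lock; a colliding entry is simply overwritten. */
static void demo_set_holder(const void *lock, int cpu)
{
	struct demo_entry *e = &demo_table[demo_hash(lock)];

	assert(lock != NULL && cpu >= 0);
	e->lock = lock;
	e->cpu = cpu;
}

/* Forget the holder, but only if the slot still belongs to this lock. */
static void demo_clear_holder(const void *lock)
{
	struct demo_entry *e = &demo_table[demo_hash(lock)];

	if (e->lock == lock)
		e->lock = NULL;
}

/* Best-effort lookup: -1 means unknown, any other value is a likely holder. */
static int demo_lock_holder(const void *lock)
{
	const struct demo_entry *e = &demo_table[demo_hash(lock)];

	if (lock == NULL || e->lock != lock)
		return -1;
	return e->cpu;
}
```

No chaining or probing is done, which keeps the lookup to one slot at the
cost of occasionally losing an entry.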

Signed-off-by: Pan Xinhui 
---
 arch/powerpc/include/asm/qspinlock.h   |  29 +++-
 arch/powerpc/include/asm/qspinlock_paravirt.h  |  36 +
 .../powerpc/include/asm/qspinlock_paravirt_types.h |  13 ++
 arch/powerpc/kernel/paravirt.c | 153 +
 arch/powerpc/lib/locks.c   |   8 +-
 arch/powerpc/platforms/pseries/setup.c |   5 +
 6 files changed, 241 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt.h
 create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt_types.h
 create mode 100644 arch/powerpc/kernel/paravirt.c

diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h
index 881a186..23459fb 100644
--- a/arch/powerpc/include/asm/qspinlock.h
+++ b/arch/powerpc/include/asm/qspinlock.h
@@ -15,7 +15,7 @@ static inline u8 * __qspinlock_lock_byte(struct qspinlock *lock)
return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN);
 }
 
-static inline void queued_spin_unlock(struct qspinlock *lock)
+static inline void native_queued_spin_unlock(struct qspinlock *lock)
 {
/* release semantics is required */
smp_store_release(__qspinlock_lock_byte(lock), 0);
@@ -27,6 +27,33 @@ static inline int queued_spin_is_locked(struct qspinlock *lock)
return atomic_read(&lock->val);
 }
 
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+#include 
+/*
+ * Try to find out who the lock holder is; the answer is not always accurate.
+ * Return:
+ * -1, the lock holder is unknown.
+ * any other value is likely the lock holder.
+ */
+extern int spin_lock_holder(void *lock);
+
+static inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
+{
+   pv_queued_spin_lock(lock, val);
+}
+
+static inline void queued_spin_unlock(struct qspinlock *lock)
+{
+   pv_queued_spin_unlock(lock);
+}
+#else
+#define spin_lock_holder(l) (-1)
+static inline void queued_spin_unlock(struct qspinlock *lock)
+{
+   native_queued_spin_unlock(lock);
+}
+#endif
+
 #include 
 
 /* we need override it as ppc has io_sync stuff */
diff --git a/arch/powerpc/include/asm/qspinlock_paravirt.h b/arch/powerpc/include/asm/qspinlock_paravirt.h
new file mode 100644
index 000..d87cda0
--- /dev/null
+++ b/arch/powerpc/include/asm/qspinlock_paravirt.h
@@ -0,0 +1,36 @@
+#ifndef CONFIG_PARAVIRT_SPINLOCKS
+#error "do not include this file"
+#endif
+
+#ifndef _ASM_QSPINLOCK_PARAVIRT_H
+#define _ASM_QSPINLOCK_PARAVIRT_H
+
+#include  
+
+extern void pv_lock_init(void);
+extern void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
+extern void __pv_init_lock_hash(void);
+extern void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
+extern void __pv_queued_spin_unlock(struct qspinlock *lock);
+
+static inline void pv_queued_spin_lock(struct qspinlock *lock, u32 val)
+{
+   pv_lock_op.lock(lock, val);
+}
+
+static inline void pv_queued_spin_unlock(struct qspinlock *lock)
+{
+   pv_lock_op.unlock(lock);
+}
+
+static inline void pv_wait(u8 *ptr, u8 val)
+{
+   pv_lock_op.wait(ptr, val);
+}
+
+static inline void pv_kick(int cpu)
+{
+   pv_lock_op.kick(cpu);
+}
+
+#endif
diff --git a/arch/powerpc/include/asm/qspinlock_paravirt_types.h b/arch/powerpc/include/asm/qspinlock_paravirt_types.h
new file mode 100644
index 000..83611ed
--- /dev/null
+++ b/arch/powerpc/include/asm/qspinlock_paravirt_types.h
@@ -0,0 +1,13 @@
+#ifndef _ASM_QSPINLOCK_PARAVIRT_TYPES_H
+#define _ASM_QSPINLOCK_PARAVIRT_TYPES_H
+
+struct pv_lock_ops {
+   void (*lock)(struct qspinlock *lock, u32 val);
+   void (*unlock)(struct qspinlock *lock);
+   void (*wait)(u8 *ptr, u8 val);
+   void (*kick)(int cpu);
+};
+
+extern struct pv_lock_ops pv_lock_op;
+
+#endif
diff --git a/arch/powerpc/kernel/paravirt.c b/arch/powerpc/kernel/paravirt.c
new file mode 100644
index 000..e697b17
--- /dev/null
+++ b/arch/powerpc/kernel/paravirt.c
@@ -0,0 +1,153 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * 

[PATCH v7 1/6] pv-qspinlock: use cmpxchg_release in __pv_queued_spin_unlock

2016-09-18 Thread Pan Xinhui
cmpxchg_release() is more lightweight than cmpxchg() on some archs (e.g.
PPC). Moreover, in __pv_queued_spin_unlock() we only need a RELEASE in
the fast path (pairing with *_try_lock() or *_lock()), and the slow path
has smp_store_release too. So it's safe to use cmpxchg_release here.
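
The same idea can be shown with C11 atomics in user space. This is a sketch
under assumptions, not the kernel primitive: demo_unlock_fast() and the
DEMO_* values are hypothetical, and the slow path is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_LOCKED_VAL 1u	/* lock held, no waiter needs a kick */
#define DEMO_SLOW_VAL   3u	/* hypothetical "waiter hashed, take slow path" value */

/*
 * Fast-path unlock: try LOCKED -> 0 with release ordering only.
 * Returns true if the fast path released the lock, false if the caller
 * must fall back to a slow path (not shown here).
 */
static bool demo_unlock_fast(_Atomic uint8_t *locked)
{
	uint8_t expected = DEMO_LOCKED_VAL;

	assert(locked != NULL);
	return atomic_compare_exchange_strong_explicit(
		locked, &expected, 0,
		memory_order_release,	/* success: earlier accesses stay before the store */
		memory_order_relaxed);	/* failure: nothing was published */
}
```

A release-only exchange is enough here because the unlocker only has to
make its critical section visible before the lock word changes; it does not
need to order later accesses the way a fully ordered cmpxchg() would.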

Suggested-by:  Boqun Feng 
Signed-off-by: Pan Xinhui 
---
 kernel/locking/qspinlock_paravirt.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index 8a99abf..ce655aa 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -544,7 +544,7 @@ __visible void __pv_queued_spin_unlock(struct qspinlock *lock)
 * unhash. Otherwise it would be possible to have multiple @lock
 * entries, which would be BAD.
 */
-   locked = cmpxchg(&l->locked, _Q_LOCKED_VAL, 0);
+   locked = cmpxchg_release(&l->locked, _Q_LOCKED_VAL, 0);
if (likely(locked == _Q_LOCKED_VAL))
return;
 
-- 
2.4.11



[PATCH v7 3/6] powerpc: pseries/Kconfig: Add qspinlock build config

2016-09-18 Thread Pan Xinhui
pseries will use qspinlock by default.

Signed-off-by: Pan Xinhui 
---
 arch/powerpc/platforms/pseries/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index bec90fb..f669323 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -21,6 +21,7 @@ config PPC_PSERIES
select HOTPLUG_CPU if SMP
select ARCH_RANDOM
select PPC_DOORBELL
+   select ARCH_USE_QUEUED_SPINLOCKS
default y
 
 config PPC_SPLPAR
-- 
2.4.11



[PATCH v7 0/6] Implement qspinlock/pv-qspinlock on ppc

2016-09-18 Thread Pan Xinhui
Hi All,
  this is the fairlock patchset. The patches apply cleanly and build
successfully. They are based on 4.8-rc4.
  qspinlock avoids the waiter starvation issue. It has about the same speed
in the single-thread case and can be much faster in high contention
situations, especially when the spinlock is embedded within the data
structure to be protected.

v6 -> v7:
rebase onto 4.8-rc4
no changelog anymore, sorry for that. I hope for a very careful review.

Todo:
we can save the overhead of one function call by using feature-fixup to
patch the binary code. Currently pv_lock_ops->lock(lock) and
->unlock(lock) are called to acquire/release the lock.

some benchmark results below

perf bench
these numbers are ops per sec, so the higher the better.
***
on pSeries with 32 vcpus, 32Gb memory, pHyp.

test case       | pv-qspinlock  | qspinlock     | current-spinlock

futex hash      | 618572        | 552332        | 553788
futex lock-pi   | 364           | 364           | 364
sched pipe      | 78984         | 76060         | 81454


unix bench:
these numbers are scores, so the higher the better.

on PowerNV with 16 cores(cpus) (smt off), 32Gb memory:

pv-qspinlock and qspinlock have very similar results because pv-qspinlock uses
the native version, which only adds the overhead of one callback

test case                              | pv-qspinlock and qspinlock | current-spinlock

Execl Throughput                       |  761.1                     |  761.4
File Copy 1024 bufsize 2000 maxblocks  | 1259.8                     | 1286.6
File Copy 256 bufsize 500 maxblocks    |  782.2                     |  790.3
File Copy 4096 bufsize 8000 maxblocks  | 2741.5                     | 2817.4
Pipe Throughput                        | 1063.2                     | 1036.7
Pipe-based Context Switching           |  284.7                     |  281.1
Process Creation                       |  679.6                     |  649.1
Shell Scripts (1 concurrent)           | 1933.2                     | 1922.9
Shell Scripts (8 concurrent)           | 5003.3                     | 4899.8
System Call Overhead                   |  900.6                     |  896.8

System Benchmarks Index Score          | 1139.3                     | 1133.0

***
on pSeries with 32 vcpus, 32Gb memory, pHyp.

test case                              | pv-qspinlock | qspinlock | current-spinlock

Execl Throughput                       |  877.1       |  891.2    |  872.8
File Copy 1024 bufsize 2000 maxblocks  | 1390.4       | 1399.2    | 1395.0
File Copy 256 bufsize 500 maxblocks    |  882.4       |  889.5    |  881.8
File Copy 4096 bufsize 8000 maxblocks  | 3112.3       | 3113.4    | 3121.7
Pipe Throughput                        | 1095.8       | 1162.6    | 1158.5
Pipe-based Context Switching           |  194.9       |  192.7    |  200.7
Process Creation                       |  518.4       |  526.4    |  509.1
Shell Scripts (1 concurrent)           | 1401.9       | 1413.9    | 1402.2
Shell Scripts (8 concurrent)           | 3215.6       | 3246.6    | 3229.1
System Call Overhead                   |  833.2       |  892.4    |  888.1

System Benchmarks Index Score          | 1033.7       | 1052.5    | 1047.8


**
on pSeries with 32 vcpus, 16Gb memory, KVM.

test case                              | pv-qspinlock | qspinlock | current-spinlock

Execl Throughput                       |  497.4       |  518.7    |  497.8
File Copy 1024 bufsize 2000 maxblocks  | 1368.8       | 1390.1    | 1343.3
File Copy 256 bufsize 500 maxblocks    |  857.7       |  859.8    |  831.4
File Copy 4096 bufsize 8000 maxblocks  | 2851.7       | 2838.1    | 2785.5
Pipe Throughput                        | 1221.9

[PATCH v7 2/6] powerpc/qspinlock: powerpc support qspinlock

2016-09-18 Thread Pan Xinhui
This patch adds basic code to enable qspinlock on powerpc. qspinlock is
one kind of fairlock implementation, and we have seen some performance
improvement under some scenarios.

queued_spin_unlock() releases the lock with a single write of 0 to the
->locked field, which sits at a different offset depending on the
endianness of the system.
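
The endianness point can be illustrated with a small user-space sketch. It
assumes, for illustration only, that the low 8 bits of a 32-bit lock word
are the locked byte; demo_lock_byte() and demo_unlock() are hypothetical
names, and __BYTE_ORDER__ and __atomic_store_n are GCC/Clang features.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch assumption: the low 8 bits of the 32-bit word are "locked". */
static inline uint8_t *demo_lock_byte(uint32_t *val)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return (uint8_t *)val + 3;	/* least significant byte is stored last */
#else
	return (uint8_t *)val;		/* least significant byte is stored first */
#endif
}

/* Release the lock by clearing only the locked byte; other bits are untouched. */
static inline void demo_unlock(uint32_t *val)
{
	assert((*val & 0xffu) != 0);	/* the lock must be held */
	__atomic_store_n(demo_lock_byte(val), 0, __ATOMIC_RELEASE);
}
```

Starting from 0x00010101, demo_unlock() leaves 0x00010100 on either byte
order, because only the byte holding the low 8 bits is written.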

We override some arch_spin_xxx as powerpc has io_sync stuff which makes
sure the io operations are protected by the lock correctly.

There is another special case, see commit
2c610022711 ("locking/qspinlock: Fix spin_unlock_wait() some more")

Signed-off-by: Pan Xinhui 
---
 arch/powerpc/include/asm/qspinlock.h  | 66 +++
 arch/powerpc/include/asm/spinlock.h   | 31 +--
 arch/powerpc/include/asm/spinlock_types.h |  4 ++
 arch/powerpc/lib/locks.c  | 59 +++
 4 files changed, 147 insertions(+), 13 deletions(-)
 create mode 100644 arch/powerpc/include/asm/qspinlock.h

diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h
new file mode 100644
index 000..881a186
--- /dev/null
+++ b/arch/powerpc/include/asm/qspinlock.h
@@ -0,0 +1,66 @@
+#ifndef _ASM_POWERPC_QSPINLOCK_H
+#define _ASM_POWERPC_QSPINLOCK_H
+
+#include 
+
+#define SPIN_THRESHOLD (1 << 15)
+#define queued_spin_unlock queued_spin_unlock
+#define queued_spin_is_locked queued_spin_is_locked
+#define queued_spin_unlock_wait queued_spin_unlock_wait
+
+extern void queued_spin_unlock_wait(struct qspinlock *lock);
+
+static inline u8 * __qspinlock_lock_byte(struct qspinlock *lock)
+{
+   return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN);
+}
+
+static inline void queued_spin_unlock(struct qspinlock *lock)
+{
+   /* release semantics is required */
+   smp_store_release(__qspinlock_lock_byte(lock), 0);
+}
+
+static inline int queued_spin_is_locked(struct qspinlock *lock)
+{
+   smp_mb();
+   return atomic_read(&lock->val);
+}
+
+#include 
+
+/* we need override it as ppc has io_sync stuff */
+#undef arch_spin_trylock
+#undef arch_spin_lock
+#undef arch_spin_lock_flags
+#undef arch_spin_unlock
+#define arch_spin_trylock arch_spin_trylock
+#define arch_spin_lock arch_spin_lock
+#define arch_spin_lock_flags arch_spin_lock_flags
+#define arch_spin_unlock arch_spin_unlock
+
+static inline int arch_spin_trylock(arch_spinlock_t *lock)
+{
+   CLEAR_IO_SYNC;
+   return queued_spin_trylock(lock);
+}
+
+static inline void arch_spin_lock(arch_spinlock_t *lock)
+{
+   CLEAR_IO_SYNC;
+   queued_spin_lock(lock);
+}
+
+static inline
+void arch_spin_lock_flags(arch_spinlock_t *lock, unsigned long flags)
+{
+   CLEAR_IO_SYNC;
+   queued_spin_lock(lock);
+}
+
+static inline void arch_spin_unlock(arch_spinlock_t *lock)
+{
+   SYNC_IO;
+   queued_spin_unlock(lock);
+}
+#endif /* _ASM_POWERPC_QSPINLOCK_H */
diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
index fa37fe9..6aef8dd 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -52,6 +52,23 @@
 #define SYNC_IO
 #endif
 
+#if defined(CONFIG_PPC_SPLPAR)
+/* We only yield to the hypervisor if we are in shared processor mode */
+#define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr))
+extern void __spin_yield(arch_spinlock_t *lock);
+extern void __rw_yield(arch_rwlock_t *lock);
+#else /* SPLPAR */
+#define __spin_yield(x)barrier()
+#define __rw_yield(x)  barrier()
+#define SHARED_PROCESSOR   0
+#endif
+
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#include 
+#else
+
+#define arch_spin_relax(lock)  __spin_yield(lock)
+
 static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
return lock.slock == 0;
@@ -106,18 +123,6 @@ static inline int arch_spin_trylock(arch_spinlock_t *lock)
  * held.  Conveniently, we have a word in the paca that holds this
  * value.
  */
-
-#if defined(CONFIG_PPC_SPLPAR)
-/* We only yield to the hypervisor if we are in shared processor mode */
-#define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr))
-extern void __spin_yield(arch_spinlock_t *lock);
-extern void __rw_yield(arch_rwlock_t *lock);
-#else /* SPLPAR */
-#define __spin_yield(x)barrier()
-#define __rw_yield(x)  barrier()
-#define SHARED_PROCESSOR   0
-#endif
-
 static inline void arch_spin_lock(arch_spinlock_t *lock)
 {
CLEAR_IO_SYNC;
@@ -195,6 +200,7 @@ out:
smp_mb();
 }
 
+#endif /* !CONFIG_QUEUED_SPINLOCKS */
 /*
  * Read-write spinlocks, allowing multiple readers
  * but only one writer.
@@ -330,7 +336,6 @@ static inline void arch_write_unlock(arch_rwlock_t *rw)
 #define arch_read_lock_flags(lock, flags) arch_read_lock(lock)
 #define arch_write_lock_flags(lock, flags) arch_write_lock(lock)
 
-#define arch_spin_relax(lock)  __spin_yield(lock)
 #define arch_read_relax(lock)  __rw_yield(lock)
 #define arch_write_relax(lock) __rw_yield(lock)
 
diff --git 

[PATCH v7 4/6] powerpc: lib/locks.c: Add cpu yield/wake helper function

2016-09-18 Thread Pan Xinhui
Add two corresponding helper functions to support pv-qspinlock.

For normal use, __spin_yield_cpu will confer the current vcpu's slices to
the target vcpu (say, a lock holder). If the target vcpu is not specified
or is already running, whether we confer our slices to the lpar instead
depends on the second parameter.

An hcall itself introduces latency and a little overhead, and we do NOT
want to suffer any latency in some cases, e.g. in an interrupt handler.
The second parameter, *confer*, indicates such a case.

__spin_wake_cpu is simpler: it wakes up one vcpu regardless of its
current vcpu state.

Signed-off-by: Pan Xinhui 
---
 arch/powerpc/include/asm/spinlock.h |  4 +++
 arch/powerpc/lib/locks.c| 59 +
 2 files changed, 63 insertions(+)

diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
index 6aef8dd..abb6b0f 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -56,9 +56,13 @@
 /* We only yield to the hypervisor if we are in shared processor mode */
 #define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr))
 extern void __spin_yield(arch_spinlock_t *lock);
+extern void __spin_yield_cpu(int cpu, int confer);
+extern void __spin_wake_cpu(int cpu);
 extern void __rw_yield(arch_rwlock_t *lock);
 #else /* SPLPAR */
 #define __spin_yield(x)barrier()
+#define __spin_yield_cpu(x,y) barrier()
+#define __spin_wake_cpu(x) barrier()
 #define __rw_yield(x)  barrier()
 #define SHARED_PROCESSOR   0
 #endif
diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c
index 6574626..892df7d 100644
--- a/arch/powerpc/lib/locks.c
+++ b/arch/powerpc/lib/locks.c
@@ -23,6 +23,65 @@
 #include 
 #include 
 
+/*
+ * Confer our slices to the specified cpu and return. If it is already running
+ * or cpu is -1, check confer: if confer is 0, just return; otherwise confer
+ * our slices to the lpar.
+ */
+void __spin_yield_cpu(int cpu, int confer)
+{
+   unsigned int holder_cpu = cpu, yield_count;
+
+   if (cpu == -1)
+   goto yield_to_lpar;
+
+   BUG_ON(holder_cpu >= nr_cpu_ids);
+   yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count);
+
+   /* if cpu is running, confer slices to lpar conditionally*/
+   if ((yield_count & 1) == 0)
+   goto yield_to_lpar;
+
+   plpar_hcall_norets(H_CONFER,
+   get_hard_smp_processor_id(holder_cpu), yield_count);
+   return;
+
+yield_to_lpar:
+   if (confer)
+   plpar_hcall_norets(H_CONFER, -1, 0);
+}
+EXPORT_SYMBOL_GPL(__spin_yield_cpu);
+
+void __spin_wake_cpu(int cpu)
+{
+   unsigned int holder_cpu = cpu;
+
+   BUG_ON(holder_cpu >= nr_cpu_ids);
+   /*
+* NOTE: we should always do this hcall regardless of
+* the yield_count of the holder_cpu.
+* as there might be a case like below:
+* CPU 1                   CPU 2
+*                         yielded = true
+* if (yielded)
+*     __spin_wake_cpu()
+*                         __spin_yield_cpu()
+*
+* So we might lose a wake if we check the yield_count and
+* return directly if the holder_cpu is running.
+* IOW, do NOT write code like below.
+*  yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count);
+*  if ((yield_count & 1) == 0)
+*  return;
+*
+* A PROD hcall marks the target_cpu proded, which invalidates the next cede
+* or confer called on the target_cpu.
+*/
+   plpar_hcall_norets(H_PROD,
+   get_hard_smp_processor_id(holder_cpu));
+}
+EXPORT_SYMBOL_GPL(__spin_wake_cpu);
+
 #ifndef CONFIG_QUEUED_SPINLOCKS
 void __spin_yield(arch_spinlock_t *lock)
 {
-- 
2.4.11



Re: [PATCH v2 1/3] powerpc: port 64 bits pgtable_cache to 32 bits

2016-09-18 Thread Aneesh Kumar K.V
Christophe Leroy  writes:

> Today powerpc64 uses a set of pgtable_caches while powerpc32 uses
> standard pages when using 4k pages and a single pgtable_cache
> if using other size pages.
>
> In preparation for implementing huge pages on the 8xx, this patch
> replaces the powerpc32-specific handling with the 64-bit approach.
>
> This is done by:
> * moving 64 bits pgtable_cache_add() and pgtable_cache_init()
> in a new file called init-common.c
> * modifying pgtable_cache_init() to also handle the case
> without PMD
> * removing the 32 bits version of pgtable_cache_add() and
> pgtable_cache_init()
> * copying related header contents from 64 bits into both the
> book3s/32 and nohash/32 header files
>
> On the 8xx, the following cache sizes will be used:
> * 4k pages mode:
> - PGT_CACHE(10) for PGD
> - PGT_CACHE(3) for 512k hugepage tables
> * 16k pages mode:
> - PGT_CACHE(6) for PGD
> - PGT_CACHE(7) for 512k hugepage tables
> - PGT_CACHE(3) for 8M hugepage tables
>
> Signed-off-by: Christophe Leroy 
> ---
> v2: in v1, hugepte_cache was wrongly replaced by PGT_CACHE(1).
> This modification has been removed from v2.
>
>  arch/powerpc/include/asm/book3s/32/pgalloc.h |  44 ++--
>  arch/powerpc/include/asm/book3s/32/pgtable.h |  43 
>  arch/powerpc/include/asm/book3s/64/pgtable.h |   3 -
>  arch/powerpc/include/asm/nohash/32/pgalloc.h |  44 ++--
>  arch/powerpc/include/asm/nohash/32/pgtable.h |  45 
>  arch/powerpc/include/asm/nohash/64/pgtable.h |   2 -
>  arch/powerpc/include/asm/pgtable.h   |   2 +
>  arch/powerpc/mm/Makefile |   3 +-
>  arch/powerpc/mm/init-common.c| 147 +++
>  arch/powerpc/mm/init_64.c|  77 --
>  arch/powerpc/mm/pgtable_32.c |  37 ---
>  11 files changed, 273 insertions(+), 174 deletions(-)
>  create mode 100644 arch/powerpc/mm/init-common.c
>
> diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h b/arch/powerpc/include/asm/book3s/32/pgalloc.h
> index 8e21bb4..d310546 100644
> --- a/arch/powerpc/include/asm/book3s/32/pgalloc.h
> +++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h
> @@ -2,14 +2,42 @@
>  #define _ASM_POWERPC_BOOK3S_32_PGALLOC_H
>  
>  #include 
> +#include 
>  
> -/* For 32-bit, all levels of page tables are just drawn from get_free_page() */
> -#define MAX_PGTABLE_INDEX_SIZE   0
> +/*
> + * Functions that deal with pagetables that could be at any level of
> + * the table need to be passed an "index_size" so they know how to
> + * handle allocation.  For PTE pages (which are linked to a struct
> + * page for now, and drawn from the main get_free_pages() pool), the
> + * allocation size will be (2^index_size * sizeof(pointer)) and
> + * allocations are drawn from the kmem_cache in PGT_CACHE(index_size).
> + *
> + * The maximum index size needs to be big enough to allow any
> + * pagetable sizes we need, but small enough to fit in the low bits of
> + * any page table pointer.  In other words all pagetables, even tiny
> + * ones, must be aligned to allow at least enough low 0 bits to
> + * contain this value.  This value is also used as a mask, so it must
> + * be one less than a power of two.
> + */
> +#define MAX_PGTABLE_INDEX_SIZE   0xf
>  
>  extern void __bad_pte(pmd_t *pmd);
>  
> -extern pgd_t *pgd_alloc(struct mm_struct *mm);
> -extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
> +extern struct kmem_cache *pgtable_cache[];
> +#define PGT_CACHE(shift) ({  \
> + BUG_ON(!(shift));   \
> + pgtable_cache[(shift) - 1]; \
> + })
> +
> +static inline pgd_t *pgd_alloc(struct mm_struct *mm)
> +{
> + return kmem_cache_alloc(PGT_CACHE(PGD_INDEX_SIZE), GFP_KERNEL);
> +}
> +
> +static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
> +{
> + kmem_cache_free(PGT_CACHE(PGD_INDEX_SIZE), pgd);
> +}
>  
>  /*
>   * We don't have any real pmd's, and this code never triggers because
> @@ -68,8 +96,12 @@ static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
>  
>  static inline void pgtable_free(void *table, unsigned index_size)
>  {
> - BUG_ON(index_size); /* 32-bit doesn't use this */
> - free_page((unsigned long)table);
> + if (!index_size) {
> + free_page((unsigned long)table);
> + } else {
> + BUG_ON(index_size > MAX_PGTABLE_INDEX_SIZE);
> + kmem_cache_free(PGT_CACHE(index_size), table);
> + }
>  }
>  
>  #define check_pgt_cache()do { } while (0)
> diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
> index 6b8b2d5..f887499 100644
> --- a/arch/powerpc/include/asm/book3s/32/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
> @@ -8,6 +8,26 @@
>  /* And here we include common definitions */
>  #include 
>  
> +#define PTE_INDEX_SIZE   PTE_SHIFT
> +#define PMD_INDEX_SIZE   

[PATCH v2 3/3] ARM: dts: imx6qdl-apalis: Use enable-gpios property for backlight

2016-09-18 Thread Sanchayan Maity
Use enable-gpios property of PWM backlight driver for backlight
control.

Signed-off-by: Sanchayan Maity 
---
Changes since v1:

Fix commit message

v1: https://lkml.org/lkml/2016/9/14/55
---
 arch/arm/boot/dts/imx6qdl-apalis.dtsi | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/imx6qdl-apalis.dtsi b/arch/arm/boot/dts/imx6qdl-apalis.dtsi
index 8c67dd8..9100bde 100644
--- a/arch/arm/boot/dts/imx6qdl-apalis.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-apalis.dtsi
@@ -49,7 +49,10 @@
 
backlight: backlight {
compatible = "pwm-backlight";
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_gpio_bl_on>;
pwms = < 0 500>;
+   enable-gpios = < 13 GPIO_ACTIVE_HIGH>;
status = "disabled";
};
 
@@ -614,6 +617,12 @@
>;
};
 
+   pinctrl_gpio_bl_on: gpioblon {
+   fsl,pins = <
+   MX6QDL_PAD_EIM_DA13__GPIO3_IO13 0x1b0b0
+   >;
+   };
+
pinctrl_gpio_keys: gpio1io04grp {
fsl,pins = <
/* Power button */
-- 
2.9.3



[PATCH v2 1/3] ARM: dts: imx6qdl-apalis: Do not rely on DDC I2C bus bitbang for HDMI

2016-09-18 Thread Sanchayan Maity
Remove the use of DDC I2C bus bitbang to support reading of EDID
and rely on support from internal HDMI I2C master controller instead.
As a result remove the device tree property ddc-i2c-bus.

Signed-off-by: Sanchayan Maity 
---
Changes since v1:

Change the ranking in i2c aliases

v1: https://lkml.org/lkml/2016/9/14/55
---
 arch/arm/boot/dts/imx6q-apalis-ixora.dts | 12 +++-
 arch/arm/boot/dts/imx6qdl-apalis.dtsi| 25 +
 2 files changed, 12 insertions(+), 25 deletions(-)

diff --git a/arch/arm/boot/dts/imx6q-apalis-ixora.dts b/arch/arm/boot/dts/imx6q-apalis-ixora.dts
index 207b85b..82b81e0 100644
--- a/arch/arm/boot/dts/imx6q-apalis-ixora.dts
+++ b/arch/arm/boot/dts/imx6q-apalis-ixora.dts
@@ -55,10 +55,9 @@
 "fsl,imx6q";
 
aliases {
-   i2c0 = 
-   i2c1 = 
-   i2c2 = 
-   i2c3 = 
+   i2c0 = 
+   i2c1 = 
+   i2c2 = 
};
 
aliases {
@@ -186,11 +185,6 @@
 };
 
  {
-   ddc-i2c-bus = <>;
-   status = "okay";
-};
-
- {
status = "okay";
 };
 
diff --git a/arch/arm/boot/dts/imx6qdl-apalis.dtsi b/arch/arm/boot/dts/imx6qdl-apalis.dtsi
index 99e323b..8c67dd8 100644
--- a/arch/arm/boot/dts/imx6qdl-apalis.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-apalis.dtsi
@@ -53,18 +53,6 @@
status = "disabled";
};
 
-   /* DDC_I2C: I2C2_SDA/SCL on MXM3 205/207 */
-   i2cddc: i2c@0 {
-   compatible = "i2c-gpio";
-   pinctrl-names = "default";
-   pinctrl-0 = <&pinctrl_i2c_ddc>;
-   gpios = < 16 GPIO_ACTIVE_HIGH /* sda */
- 30 GPIO_ACTIVE_HIGH /* scl */
-   >;
-   i2c-gpio,delay-us = <2>;/* ~100 kHz */
-   status = "disabled";
-   };
-
reg_1p8v: regulator-1p8v {
compatible = "regulator-fixed";
regulator-name = "1P8V";
@@ -209,6 +197,12 @@
};
 };
 
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_hdmi_ddc>;
+   status = "disabled";
+};
+
 /*
  * GEN1_I2C: I2C1_SDA/SCL on MXM3 209/211 (e.g. RTC on carrier
  * board)
@@ -633,11 +627,10 @@
>;
};
 
-   pinctrl_i2c_ddc: gpioi2cddcgrp {
+   pinctrl_hdmi_ddc: hdmiddcgrp {
fsl,pins = <
-   /* DDC bitbang */
-   MX6QDL_PAD_EIM_EB2__GPIO2_IO30 0x1b0b0
-   MX6QDL_PAD_EIM_D16__GPIO3_IO16 0x1b0b0
+   MX6QDL_PAD_EIM_EB2__HDMI_TX_DDC_SCL 0x4001b8b1
+   MX6QDL_PAD_EIM_D16__HDMI_TX_DDC_SDA 0x4001b8b1
>;
};
 
-- 
2.9.3



[PATCH v2 2/3] ARM: dts: imx6q-apalis-ixora: Remove use of pwm-leds

2016-09-18 Thread Sanchayan Maity
Remove use of pwm-leds and use the standard /sys/class/pwm
interface from PWM subsystem.

Signed-off-by: Sanchayan Maity 
Acked-by: Marcel Ziswiler 
---
v1: https://lkml.org/lkml/2016/9/14/55
---
 arch/arm/boot/dts/imx6q-apalis-ixora.dts | 22 --
 1 file changed, 22 deletions(-)

diff --git a/arch/arm/boot/dts/imx6q-apalis-ixora.dts b/arch/arm/boot/dts/imx6q-apalis-ixora.dts
index 82b81e0..383192c 100644
--- a/arch/arm/boot/dts/imx6q-apalis-ixora.dts
+++ b/arch/arm/boot/dts/imx6q-apalis-ixora.dts
@@ -146,28 +146,6 @@
gpios = < 2 GPIO_ACTIVE_HIGH>;
};
};
-
-   pwmleds {
-   compatible = "pwm-leds";
-
-   ledpwm1 {
-   label = "PWM1";
-   pwms = < 0 5>;
-   max-brightness = <255>;
-   };
-
-   ledpwm2 {
-   label = "PWM2";
-   pwms = < 0 5>;
-   max-brightness = <255>;
-   };
-
-   ledpwm3 {
-   label = "PWM3";
-   pwms = < 0 5>;
-   max-brightness = <255>;
-   };
-   };
 };
 
  {
-- 
2.9.3



[PATCH v2] serial: 8250_pci: Use symbolic constants for EXAR's MPIO registers

2016-09-18 Thread Jan Kiszka
Less magic that only requires comments.

Signed-off-by: Jan Kiszka 
---

Changes in v2:
 - move new constants from uapi header into driver

 drivers/tty/serial/8250/8250_pci.c | 55 +++---
 1 file changed, 34 insertions(+), 21 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_pci.c 
b/drivers/tty/serial/8250/8250_pci.c
index c1d4a8f..eff6c2f 100644
--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -1754,6 +1754,19 @@ static int pci_eg20t_init(struct pci_dev *dev)
 #define PCI_DEVICE_ID_EXAR_XR17V4358   0x4358
 #define PCI_DEVICE_ID_EXAR_XR17V8358   0x8358
 
+#define UART_EXAR_MPIOINT_7_0  0x8f/* MPIOINT[7:0] */
+#define UART_EXAR_MPIOLVL_7_0  0x90/* MPIOLVL[7:0] */
+#define UART_EXAR_MPIO3T_7_0   0x91/* MPIO3T[7:0] */
+#define UART_EXAR_MPIOINV_7_0  0x92/* MPIOINV[7:0] */
+#define UART_EXAR_MPIOSEL_7_0  0x93/* MPIOSEL[7:0] */
+#define UART_EXAR_MPIOOD_7_0   0x94/* MPIOOD[7:0] */
+#define UART_EXAR_MPIOINT_15_8 0x95/* MPIOINT[15:8] */
+#define UART_EXAR_MPIOLVL_15_8 0x96/* MPIOLVL[15:8] */
+#define UART_EXAR_MPIO3T_15_8  0x97/* MPIO3T[15:8] */
+#define UART_EXAR_MPIOINV_15_8 0x98/* MPIOINV[15:8] */
+#define UART_EXAR_MPIOSEL_15_8 0x99/* MPIOSEL[15:8] */
+#define UART_EXAR_MPIOOD_15_8  0x9a/* MPIOOD[15:8] */
+
 static int
 pci_xr17c154_setup(struct serial_private *priv,
  const struct pciserial_board *board,
@@ -1796,18 +1809,18 @@ pci_xr17v35x_setup(struct serial_private *priv,
 * Setup Multipurpose Input/Output pins.
 */
if (idx == 0) {
-   writeb(0x00, p + 0x8f); /*MPIOINT[7:0]*/
-   writeb(0x00, p + 0x90); /*MPIOLVL[7:0]*/
-   writeb(0x00, p + 0x91); /*MPIO3T[7:0]*/
-   writeb(0x00, p + 0x92); /*MPIOINV[7:0]*/
-   writeb(0x00, p + 0x93); /*MPIOSEL[7:0]*/
-   writeb(0x00, p + 0x94); /*MPIOOD[7:0]*/
-   writeb(0x00, p + 0x95); /*MPIOINT[15:8]*/
-   writeb(0x00, p + 0x96); /*MPIOLVL[15:8]*/
-   writeb(0x00, p + 0x97); /*MPIO3T[15:8]*/
-   writeb(0x00, p + 0x98); /*MPIOINV[15:8]*/
-   writeb(0x00, p + 0x99); /*MPIOSEL[15:8]*/
-   writeb(0x00, p + 0x9a); /*MPIOOD[15:8]*/
+   writeb(0x00, p + UART_EXAR_MPIOINT_7_0);
+   writeb(0x00, p + UART_EXAR_MPIOLVL_7_0);
+   writeb(0x00, p + UART_EXAR_MPIO3T_7_0);
+   writeb(0x00, p + UART_EXAR_MPIOINV_7_0);
+   writeb(0x00, p + UART_EXAR_MPIOSEL_7_0);
+   writeb(0x00, p + UART_EXAR_MPIOOD_7_0);
+   writeb(0x00, p + UART_EXAR_MPIOINT_15_8);
+   writeb(0x00, p + UART_EXAR_MPIOLVL_15_8);
+   writeb(0x00, p + UART_EXAR_MPIO3T_15_8);
+   writeb(0x00, p + UART_EXAR_MPIOINV_15_8);
+   writeb(0x00, p + UART_EXAR_MPIOSEL_15_8);
+   writeb(0x00, p + UART_EXAR_MPIOOD_15_8);
}
writeb(0x00, p + UART_EXAR_8XMODE);
writeb(UART_FCTR_EXAR_TRGD, p + UART_EXAR_FCTR);
@@ -1843,20 +1856,20 @@ pci_fastcom335_setup(struct serial_private *priv,
switch (priv->dev->device) {
case PCI_DEVICE_ID_COMMTECH_4222PCI335:
case PCI_DEVICE_ID_COMMTECH_4224PCI335:
-   writeb(0x78, p + 0x90); /* MPIOLVL[7:0] */
-   writeb(0x00, p + 0x92); /* MPIOINV[7:0] */
-   writeb(0x00, p + 0x93); /* MPIOSEL[7:0] */
+   writeb(0x78, p + UART_EXAR_MPIOLVL_7_0);
+   writeb(0x00, p + UART_EXAR_MPIOINV_7_0);
+   writeb(0x00, p + UART_EXAR_MPIOSEL_7_0);
break;
case PCI_DEVICE_ID_COMMTECH_2324PCI335:
case PCI_DEVICE_ID_COMMTECH_2328PCI335:
-   writeb(0x00, p + 0x90); /* MPIOLVL[7:0] */
-   writeb(0xc0, p + 0x92); /* MPIOINV[7:0] */
-   writeb(0xc0, p + 0x93); /* MPIOSEL[7:0] */
+   writeb(0x00, p + UART_EXAR_MPIOLVL_7_0);
+   writeb(0xc0, p + UART_EXAR_MPIOINV_7_0);
+   writeb(0xc0, p + UART_EXAR_MPIOSEL_7_0);
break;
}
-   writeb(0x00, p + 0x8f); /* MPIOINT[7:0] */
-   writeb(0x00, p + 0x91); /* MPIO3T[7:0] */
-   writeb(0x00, p + 0x94); /* MPIOOD[7:0] */
+   writeb(0x00, p + UART_EXAR_MPIOINT_7_0);
+   writeb(0x00, p + UART_EXAR_MPIO3T_7_0);
+   writeb(0x00, p + UART_EXAR_MPIOOD_7_0);
}
writeb(0x00, p + UART_EXAR_8XMODE);
writeb(UART_FCTR_EXAR_TRGD, p + UART_EXAR_FCTR);
-- 
2.1.4
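
The effect of the change can be tried outside the kernel with a small user-space sketch. The register offsets are copied from the patch; everything else is an assumption made for illustration: fake_writeb() and exar_clear_mpio() are hypothetical names, and an ordinary byte buffer stands in for the ioremapped device memory that the driver passes to writeb().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets copied from the patch. */
#define UART_EXAR_MPIOINT_7_0   0x8f    /* MPIOINT[7:0] */
#define UART_EXAR_MPIOLVL_7_0   0x90    /* MPIOLVL[7:0] */
#define UART_EXAR_MPIO3T_7_0    0x91    /* MPIO3T[7:0] */
#define UART_EXAR_MPIOINV_7_0   0x92    /* MPIOINV[7:0] */
#define UART_EXAR_MPIOSEL_7_0   0x93    /* MPIOSEL[7:0] */
#define UART_EXAR_MPIOOD_7_0    0x94    /* MPIOOD[7:0] */
#define UART_EXAR_MPIOINT_15_8  0x95    /* MPIOINT[15:8] */
#define UART_EXAR_MPIOLVL_15_8  0x96    /* MPIOLVL[15:8] */
#define UART_EXAR_MPIO3T_15_8   0x97    /* MPIO3T[15:8] */
#define UART_EXAR_MPIOINV_15_8  0x98    /* MPIOINV[15:8] */
#define UART_EXAR_MPIOSEL_15_8  0x99    /* MPIOSEL[15:8] */
#define UART_EXAR_MPIOOD_15_8   0x9a    /* MPIOOD[15:8] */

/*
 * Hypothetical stand-in for the kernel's writeb(): stores one byte in a
 * plain buffer instead of device memory.
 */
static void fake_writeb(uint8_t value, uint8_t *addr)
{
    *addr = value;
}

/*
 * Hypothetical helper mirroring the idx == 0 branch of
 * pci_xr17v35x_setup(): clears all twelve MPIO registers.
 */
static void exar_clear_mpio(uint8_t *p, size_t len)
{
    assert(len > UART_EXAR_MPIOOD_15_8);

    fake_writeb(0x00, p + UART_EXAR_MPIOINT_7_0);
    fake_writeb(0x00, p + UART_EXAR_MPIOLVL_7_0);
    fake_writeb(0x00, p + UART_EXAR_MPIO3T_7_0);
    fake_writeb(0x00, p + UART_EXAR_MPIOINV_7_0);
    fake_writeb(0x00, p + UART_EXAR_MPIOSEL_7_0);
    fake_writeb(0x00, p + UART_EXAR_MPIOOD_7_0);
    fake_writeb(0x00, p + UART_EXAR_MPIOINT_15_8);
    fake_writeb(0x00, p + UART_EXAR_MPIOLVL_15_8);
    fake_writeb(0x00, p + UART_EXAR_MPIO3T_15_8);
    fake_writeb(0x00, p + UART_EXAR_MPIOINV_15_8);
    fake_writeb(0x00, p + UART_EXAR_MPIOSEL_15_8);
    fake_writeb(0x00, p + UART_EXAR_MPIOOD_15_8);
}
```

With names in place of bare offsets, the trailing comments that the old code needed on every writeb() line become redundant, which is the point of the patch.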


Re: [PATCH 3/5] ipc/sem: optimize perform_atomic_semop()

2016-09-18 Thread Manfred Spraul

Hi Davidlohr,

On 09/13/2016 10:33 AM, Davidlohr Bueso wrote:




@@ -1751,12 +1820,17 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops,

 if (sop->sem_num >= max)
 max = sop->sem_num;
 if (sop->sem_flg & SEM_UNDO)
-undos = 1;
+undos = true;
 if (sop->sem_op != 0)
-alter = 1;
+alter = true;
+if (sop->sem_num < SEMOPM_FAST && !dupsop) {
+if (dup & (1 << sop->sem_num))
+dupsop = 1;
+else
+dup |= 1 << sop->sem_num;
+}
 }
At least for nsops=2, sops[0].sem_num != sops[1].sem_num can detect the
absence of duplicated ops regardless of the array size.

Should we support that?


There are various individual cases like that (i.e. obviously nsops == 1,
alter == 0, etc.) where the dup detection would be unnecessary, but it seems
like a stretch to go at it like this. The above will work on the common case
(assuming lower sem_num, of course). So I'm not particularly worried about
being too smart at the dup detection.



What about the attached dup detection?

--
Manfred
From 140340a358dbf66b3bc6f848ca9b860e3e957e84 Mon Sep 17 00:00:00 2001
From: Manfred Spraul 
Date: Mon, 19 Sep 2016 06:25:20 +0200
Subject: [PATCH] ipc/sem: Update duplicate sop detection

The duplicated sop detection can be improved:
- use uint64_t instead of unsigned long for the bit array
  storage, otherwise we break 32-bit archs
- support large arrays, just interpret the bit array
  as a hash array (i.e.: an operation that accesses semaphore
  0 and 64 would trigger the dupsop code, but that is
  far better than not trying at all for semnum >=64)
- support test-for-zero-and-increase, this case can use the
  fast codepath.

Untested! S-O-B only for the code, needs testing.

Signed-off-by: Manfred Spraul 
---
 ipc/sem.c | 29 ++---
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index d9c743a..eda9e46 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -1784,7 +1784,8 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops,
 	int max, locknum;
 	bool undos = false, alter = false, dupsop = false;
 	struct sem_queue queue;
-	unsigned long dup = 0, jiffies_left = 0;
+	unsigned long jiffies_left = 0;
+	uint64_t dup;
 	struct ipc_namespace *ns;
 
 	ns = current->nsproxy->ipc_ns;
@@ -1816,18 +1817,32 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops,
 		jiffies_left = timespec_to_jiffies(&_timeout);
 	}
 	max = 0;
+
+	dup = 0;
 	for (sop = sops; sop < sops + nsops; sop++) {
+		uint64_t mask;
+
 		if (sop->sem_num >= max)
 			max = sop->sem_num;
 		if (sop->sem_flg & SEM_UNDO)
 			undos = true;
-		if (sop->sem_op != 0)
+
+		/* 64: BITS_PER_UNIT64 */
+		mask = 1<<((sop->sem_num)%64);
+
+		if (dup & mask) {
+			/*
+			 * There was a previous alter access that appears
+			 * to have accessed the same semaphore, thus
+			 * use the dupsop logic.
+			 * "appears", because the detection can only check
+			 * % BITS_PER_UNIT64.
+			 */
+			dupsop = 1;
+		}
+		if (sop->sem_op != 0) {
 			alter = true;
-		if (sop->sem_num < SEMOPM_FAST && !dupsop) {
-			if (dup & (1 << sop->sem_num))
-dupsop = 1;
-			else
-dup |= 1 << sop->sem_num;
+			dup |= mask;
 		}
 	}
 
-- 
2.7.4
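
The detection loop proposed in the patch can be exercised in user space with the sketch below. It is not the kernel code: struct sketch_sop and sketch_has_dupsop() are hypothetical names, and only the two struct sembuf fields that the loop reads are modelled. One deliberate difference is an assumption on my side: the sketch shifts 1ULL rather than a plain 1, because shifting an int by 32 or more is undefined in C and the mask is meant to cover all 64 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for the struct sembuf fields used by the loop. */
struct sketch_sop {
    unsigned short sem_num;
    short sem_op;
};

/*
 * Returns true if an operation appears to touch a semaphore that an
 * earlier operation in the same array has altered. "Appears", because
 * semaphore numbers are folded modulo 64 into a single 64-bit word.
 */
static bool sketch_has_dupsop(const struct sketch_sop *sops, size_t nsops)
{
    uint64_t dup = 0;
    bool dupsop = false;
    size_t i;

    assert(sops != NULL || nsops == 0);

    for (i = 0; i < nsops; i++) {
        /* 1ULL keeps the shift 64 bits wide on every architecture. */
        uint64_t mask = 1ULL << (sops[i].sem_num % 64);

        if (dup & mask)
            dupsop = true;

        /* Only altering operations are recorded in the bit array. */
        if (sops[i].sem_op != 0)
            dup |= mask;
    }
    return dupsop;
}
```

In this sketch a test-for-zero followed by an increase of the same semaphore is not flagged, while operations on semaphores 0 and 64 are flagged as a false positive, matching the trade-offs listed in the commit message.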



[PATCH V1 3/3] clk: Loongson1: Make use of GENMASK

2016-09-18 Thread Keguang Zhang
From: Kelvin Cheung 

Make use of GENMASK instead of open coding the equivalent operation,
and update the PLL formula.

Signed-off-by: Kelvin Cheung 
---
 drivers/clk/loongson1/clk-loongson1b.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/loongson1/clk-loongson1b.c 
b/drivers/clk/loongson1/clk-loongson1b.c
index 4b3d9d2..f36a97e 100644
--- a/drivers/clk/loongson1/clk-loongson1b.c
+++ b/drivers/clk/loongson1/clk-loongson1b.c
@@ -26,7 +26,7 @@ static unsigned long ls1x_pll_recalc_rate(struct clk_hw *hw,
u32 pll, rate;
 
pll = __raw_readl(LS1X_CLK_PLL_FREQ);
-   rate = 12 + (pll & 0x3f) + (((pll >> 8) & 0x3ff) >> 10);
+   rate = 12 + (pll & GENMASK(5, 0));
rate *= OSC;
rate >>= 1;
 
-- 
1.9.1
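
The arithmetic behind this one-line change can be checked in user space. The sketch below is an illustration under stated assumptions, not the driver code: sketch_genmask() is a hypothetical 32-bit re-implementation of the idea behind the kernel's GENMASK(h, l), and SKETCH_OSC_HZ assumes the 33 MHz oscillator mentioned in the driver comment. It compares the old and new formulas for a given register value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed oscillator rate (33 MHz, per the comment in the driver). */
#define SKETCH_OSC_HZ 33000000u

/* Hypothetical 32-bit equivalent of GENMASK(high, low): bits high..low set. */
static uint32_t sketch_genmask(unsigned int high, unsigned int low)
{
    assert(high < 32 && low <= high);
    return (uint32_t)((~UINT32_C(0) >> (31 - high)) & (~UINT32_C(0) << low));
}

/* Formula before the patch. */
static uint32_t sketch_pll_rate_old(uint32_t pll)
{
    uint32_t rate = 12 + (pll & 0x3f) + (((pll >> 8) & 0x3ff) >> 10);

    rate *= SKETCH_OSC_HZ;
    rate >>= 1;
    return rate;
}

/* Formula after the patch. */
static uint32_t sketch_pll_rate_new(uint32_t pll)
{
    uint32_t rate = 12 + (pll & sketch_genmask(5, 0));

    rate *= SKETCH_OSC_HZ;
    rate >>= 1;
    return rate;
}
```

Because (pll >> 8) & 0x3ff is at most 0x3ff, shifting it right by 10 always yields 0, so in this sketch the two formulas agree for every input and the dropped term never contributed to the result.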



[PATCH V1 2/3] clk: Loongson1: Update clocks of Loongson1B

2016-09-18 Thread Keguang Zhang
From: Kelvin Cheung 

This patch updates some clock names of Loongson1B,
and adds AC97, DMA and NAND clock.

Signed-off-by: Kelvin Cheung 

---
v1:
   Rebase the patch on clk: ls1x: Migrate to clk_hw based OF
   and registration APIs.
---
 drivers/clk/loongson1/clk-loongson1b.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/clk/loongson1/clk-loongson1b.c 
b/drivers/clk/loongson1/clk-loongson1b.c
index 5b6817e..4b3d9d2 100644
--- a/drivers/clk/loongson1/clk-loongson1b.c
+++ b/drivers/clk/loongson1/clk-loongson1b.c
@@ -37,19 +37,19 @@ static const struct clk_ops ls1x_pll_clk_ops = {
.recalc_rate = ls1x_pll_recalc_rate,
 };
 
-static const char * const cpu_parents[] = { "cpu_clk_div", "osc_33m_clk", };
-static const char * const ahb_parents[] = { "ahb_clk_div", "osc_33m_clk", };
-static const char * const dc_parents[] = { "dc_clk_div", "osc_33m_clk", };
+static const char *const cpu_parents[] = { "cpu_clk_div", "osc_clk", };
+static const char *const ahb_parents[] = { "ahb_clk_div", "osc_clk", };
+static const char *const dc_parents[] = { "dc_clk_div", "osc_clk", };
 
 void __init ls1x_clk_init(void)
 {
struct clk_hw *hw;
 
-   hw = clk_hw_register_fixed_rate(NULL, "osc_33m_clk", NULL, 0, OSC);
-   clk_hw_register_clkdev(hw, "osc_33m_clk", NULL);
+   hw = clk_hw_register_fixed_rate(NULL, "osc_clk", NULL, 0, OSC);
+   clk_hw_register_clkdev(hw, "osc_clk", NULL);
 
/* clock derived from 33 MHz OSC clk */
-   hw = clk_hw_register_pll(NULL, "pll_clk", "osc_33m_clk",
+   hw = clk_hw_register_pll(NULL, "pll_clk", "osc_clk",
 _pll_clk_ops, 0);
clk_hw_register_clkdev(hw, "pll_clk", NULL);
 
@@ -104,6 +104,7 @@ void __init ls1x_clk_init(void)
   CLK_SET_RATE_NO_REPARENT, LS1X_CLK_PLL_DIV,
   BYPASS_DDR_SHIFT, BYPASS_DDR_WIDTH, 0, &_lock);
clk_hw_register_clkdev(hw, "ahb_clk", NULL);
+   clk_hw_register_clkdev(hw, "ls1x-dma", NULL);
clk_hw_register_clkdev(hw, "stmmaceth", NULL);
 
/* clock derived from AHB clk */
@@ -111,9 +112,11 @@ void __init ls1x_clk_init(void)
hw = clk_hw_register_fixed_factor(NULL, "apb_clk", "ahb_clk", 0, 1,
DIV_APB);
clk_hw_register_clkdev(hw, "apb_clk", NULL);
-   clk_hw_register_clkdev(hw, "ls1x_i2c", NULL);
-   clk_hw_register_clkdev(hw, "ls1x_pwmtimer", NULL);
-   clk_hw_register_clkdev(hw, "ls1x_spi", NULL);
-   clk_hw_register_clkdev(hw, "ls1x_wdt", NULL);
+   clk_hw_register_clkdev(hw, "ls1x-ac97", NULL);
+   clk_hw_register_clkdev(hw, "ls1x-i2c", NULL);
+   clk_hw_register_clkdev(hw, "ls1x-nand", NULL);
+   clk_hw_register_clkdev(hw, "ls1x-pwmtimer", NULL);
+   clk_hw_register_clkdev(hw, "ls1x-spi", NULL);
+   clk_hw_register_clkdev(hw, "ls1x-wdt", NULL);
clk_hw_register_clkdev(hw, "serial8250", NULL);
 }
-- 
1.9.1



[PATCH V1 1/3] clk: Loongson1: Refactor Loongson1 clock

2016-09-18 Thread Keguang Zhang
From: Kelvin Cheung 

Factor out the common functions into loongson1/clk.c
to support both Loongson1B and Loongson1C. And, put
the rest into loongson1/clk-loongson1b.c.

Signed-off-by: Kelvin Cheung 

---
v1:
   Rebase the patch on clk: ls1x: Migrate to clk_hw based OF
   and registration APIs.
---
 drivers/clk/Makefile   |  2 +-
 drivers/clk/loongson1/Makefile |  2 +
 .../clk/{clk-ls1x.c => loongson1/clk-loongson1b.c} | 51 ++
 drivers/clk/loongson1/clk.c| 43 ++
 drivers/clk/loongson1/clk.h| 19 
 5 files changed, 69 insertions(+), 48 deletions(-)
 create mode 100644 drivers/clk/loongson1/Makefile
 rename drivers/clk/{clk-ls1x.c => loongson1/clk-loongson1b.c} (78%)
 create mode 100644 drivers/clk/loongson1/clk.c
 create mode 100644 drivers/clk/loongson1/clk.h

diff --git a/drivers/clk/Makefile b/drivers/clk/Makefile
index 8264d81..925081e 100644
--- a/drivers/clk/Makefile
+++ b/drivers/clk/Makefile
@@ -26,7 +26,6 @@ obj-$(CONFIG_ARCH_CLPS711X)   += clk-clps711x.o
 obj-$(CONFIG_COMMON_CLK_CS2000_CP) += clk-cs2000-cp.o
 obj-$(CONFIG_ARCH_EFM32)   += clk-efm32gg.o
 obj-$(CONFIG_ARCH_HIGHBANK)+= clk-highbank.o
-obj-$(CONFIG_MACH_LOONGSON32)  += clk-ls1x.o
 obj-$(CONFIG_COMMON_CLK_MAX77686)  += clk-max77686.o
 obj-$(CONFIG_ARCH_MB86S7X) += clk-mb86s7x.o
 obj-$(CONFIG_ARCH_MOXART)  += clk-moxart.o
@@ -61,6 +60,7 @@ obj-$(CONFIG_ARCH_HISI)   += hisilicon/
 obj-$(CONFIG_ARCH_MXC) += imx/
 obj-$(CONFIG_MACH_INGENIC) += ingenic/
 obj-$(CONFIG_COMMON_CLK_KEYSTONE)  += keystone/
+obj-$(CONFIG_MACH_LOONGSON32)  += loongson1/
 obj-$(CONFIG_ARCH_MEDIATEK)+= mediatek/
 obj-$(CONFIG_COMMON_CLK_AMLOGIC)   += meson/
 obj-$(CONFIG_MACH_PIC32)   += microchip/
diff --git a/drivers/clk/loongson1/Makefile b/drivers/clk/loongson1/Makefile
new file mode 100644
index 000..5a162a1
--- /dev/null
+++ b/drivers/clk/loongson1/Makefile
@@ -0,0 +1,2 @@
+obj-y  += clk.o
+obj-$(CONFIG_LOONGSON1_LS1B)   += clk-loongson1b.o
diff --git a/drivers/clk/clk-ls1x.c b/drivers/clk/loongson1/clk-loongson1b.c
similarity index 78%
rename from drivers/clk/clk-ls1x.c
rename to drivers/clk/loongson1/clk-loongson1b.c
index 8430e45..5b6817e 100644
--- a/drivers/clk/clk-ls1x.c
+++ b/drivers/clk/loongson1/clk-loongson1b.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2012 Zhang, Keguang 
+ * Copyright (c) 2012-2016 Zhang, Keguang 
  *
  * This program is free software; you can redistribute  it and/or modify it
  * under  the terms of  the GNU General  Public License as published by the
@@ -10,25 +10,16 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 #include 
+#include "clk.h"
 
 #define OSC(33 * 100)
 #define DIV_APB2
 
 static DEFINE_SPINLOCK(_lock);
 
-static int ls1x_pll_clk_enable(struct clk_hw *hw)
-{
-   return 0;
-}
-
-static void ls1x_pll_clk_disable(struct clk_hw *hw)
-{
-}
-
 static unsigned long ls1x_pll_recalc_rate(struct clk_hw *hw,
  unsigned long parent_rate)
 {
@@ -43,44 +34,9 @@ static unsigned long ls1x_pll_recalc_rate(struct clk_hw *hw,
 }
 
 static const struct clk_ops ls1x_pll_clk_ops = {
-   .enable = ls1x_pll_clk_enable,
-   .disable = ls1x_pll_clk_disable,
.recalc_rate = ls1x_pll_recalc_rate,
 };
 
-static struct clk_hw *__init clk_hw_register_pll(struct device *dev,
-const char *name,
-const char *parent_name,
-unsigned long flags)
-{
-   int ret;
-   struct clk_hw *hw;
-   struct clk_init_data init;
-
-   /* allocate the divider */
-   hw = kzalloc(sizeof(struct clk_hw), GFP_KERNEL);
-   if (!hw) {
-   pr_err("%s: could not allocate clk_hw\n", __func__);
-   return ERR_PTR(-ENOMEM);
-   }
-
-   init.name = name;
-   init.ops = _pll_clk_ops;
-   init.flags = flags | CLK_IS_BASIC;
-   init.parent_names = (parent_name ? _name : NULL);
-   init.num_parents = (parent_name ? 1 : 0);
-   hw->init = 
-
-   /* register the clock */
-   ret = clk_hw_register(dev, hw);
-   if (ret) {
-   kfree(hw);
-   hw = ERR_PTR(ret);
-   }
-
-   return hw;
-}
-
 static const char * const cpu_parents[] = { "cpu_clk_div", "osc_33m_clk", };
 static const char * const ahb_parents[] = { "ahb_clk_div", "osc_33m_clk", };
 static const char * const dc_parents[] = { "dc_clk_div", "osc_33m_clk", };
@@ -93,7 +49,8 @@ void __init ls1x_clk_init(void)
clk_hw_register_clkdev(hw, "osc_33m_clk", NULL);
 
/* clock derived from 33 MHz OSC clk */
-   hw = clk_hw_register_pll(NULL, 

[PATCH V1 0/3] Refactor Loongson1 clock

2016-09-18 Thread Keguang Zhang
From: Kelvin Cheung 

This patchset is to refactor Loongson1 clock,
and update Loongson1B clocks.

This applies on top of clk-next.

Thanks!

Changelog:
v1:
   Rebase the patch on clk: ls1x: Migrate to clk_hw based OF
   and registration APIs.

Kelvin Cheung (3):
  clk: Loongson1: Refactor Loongson1 clock
  clk: Loongson1: Update clocks of Loongson1B
  clk: Loongson1: Make use of GENMASK

 drivers/clk/Makefile   |  2 +-
 drivers/clk/loongson1/Makefile |  2 +
 .../clk/{clk-ls1x.c => loongson1/clk-loongson1b.c} | 74 +-
 drivers/clk/loongson1/clk.c| 43 +
 drivers/clk/loongson1/clk.h| 19 ++
 5 files changed, 82 insertions(+), 58 deletions(-)
 create mode 100644 drivers/clk/loongson1/Makefile
 rename drivers/clk/{clk-ls1x.c => loongson1/clk-loongson1b.c} (67%)
 create mode 100644 drivers/clk/loongson1/clk.c
 create mode 100644 drivers/clk/loongson1/clk.h

-- 
1.9.1



Re: [ISSUE] Memleak in LED sysfs on heavy usage

2016-09-18 Thread Daniel Gorsulowski

Hi Jacek,

Am 16.09.2016 um 15:41 schrieb Jacek Anaszewski:

On 09/16/2016 02:08 PM, Daniel Gorsulowski wrote:

Hi Jacek,

Am 16.09.2016 um 13:25 schrieb Jacek Anaszewski:

On 09/16/2016 10:15 AM, Daniel Gorsulowski wrote:

Hi Jacek,

Am 16.09.2016 um 09:31 schrieb Jacek Anaszewski:

Hi Daniel,

On 09/12/2016 10:50 AM, Daniel Gorsulowski wrote:

Hello!

Please let me know if I did something wrong in sending this issue; this is
my first contact with the LKML.
By mistake, I accessed an LED via the /sys/class/leds subsystem very rapidly
from a user application. I noticed that the free memory decreased
constantly. So I tried to analyze the problem and wrote a little script:

#!/bin/sh
while [ 1 ]; do
    echo 1 > /sys/class/leds/2a_service_yellow/brightness
    echo 0 > /sys/class/leds/2a_service_yellow/brightness
done

And voila, I was able to reproduce the problem.
So I added a bit more debugging:

#!/bin/sh
cnt=0
while [ 1 ]; do
    if [ `expr $cnt % 1000` -eq 0 ]; then
        free | grep Mem: | cut -d' ' -f25
    fi
    echo 1 > /sys/class/leds/2a_service_yellow/brightness
    echo 0 > /sys/class/leds/2a_service_yellow/brightness
    let "cnt++"
done

And huh? No memory is eaten anymore. So it looks like the problem only
occurs on heavy (fast) usage of the /sys/class/leds subsystem.

I rewrote the script to toggle a GPIO pin instead, but no problem was
noticeable.


I've been unable to reproduce the problem with the leds-aat1290 driver
and a Samsung M0 board. It must be a driver-specific issue.
What driver did you use?


I defined LEDS_GPIO, so I'm using the leds-gpio driver.
danielg@debby:~/opt/prj/ti-linux-kernel$ cat .config | grep LEDS | grep -v "^# "
CONFIG_INPUT_LEDS=y
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
CONFIG_LEDS_GPIO=y
CONFIG_LEDS_TRIGGERS=y
CONFIG_LEDS_TRIGGER_TIMER=y
CONFIG_LEDS_TRIGGER_ONESHOT=y
CONFIG_LEDS_TRIGGER_HEARTBEAT=y
CONFIG_LEDS_TRIGGER_GPIO=y
CONFIG_LEDS_TRIGGER_DEFAULT_ON=y
CONFIG_LEDS_TRIGGER_TRANSIENT=y



Unfortunately I am still unable to reproduce the problem with leds-gpio.
I'm not observing any heavy usage with your test case:

~#free
             total       used       free     shared    buffers     cached
Mem:       1028092      61364     966728          0       8416      22396
-/+ buffers/cache:      30552     997540
Swap:            0          0          0


Actually you didn't give any numbers. What kernel version are you using?


As I wrote, the problem occurred on a vanilla 4.6 kernel, and also on a 4.4
kernel (with the PREEMPT-RT patchset).


Heh, funny coincidence. I was testing this on recent linux-leds.git,
for-next branch and was not able to detect the issue. It started to
appear after resetting HEAD to 4.8-rc2 base. Finally it turned out
that what fixes the issue is the most recent commit [1].

Further investigation revealed that this is kobject_uevent_env(),
called from led_trigger_set(), which causes memory leaks when called
with high frequency.

CC GregKH.

[1]
https://git.kernel.org/cgit/linux/kernel/git/j.anaszewski/linux-leds.git/commit/?h=for-next&id=f3f624941be0fafb29fff5c1411fa433feca792c


Nice to hear about the fix, thanks for your investigation!

Kind regards,
Daniel
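
A note on the test scripts in this thread: picking a column out of `free` with
`cut` is fragile, and spawning `expr` on every iteration slows the loop, which
might be part of why the instrumented script stopped showing the leak. Below is
a minimal C sketch of the same experiment, assuming the usual "Key:   value kB"
layout of /proc/meminfo. The function names (meminfo_kb, toggle_led) are
hypothetical helpers written for this illustration, not code from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the value in kB of `key` from a /proc/meminfo-style file, or -1. */
static long meminfo_kb(const char *path, const char *key)
{
	char line[256];
	size_t klen;
	long value = -1;
	FILE *f;

	assert(path != NULL && key != NULL);
	klen = strlen(key);
	f = fopen(path, "r");
	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		/* Lines look like "MemFree:   966728 kB"; match the whole key. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			if (sscanf(line + klen + 1, "%ld", &value) != 1)
				value = -1;
			break;
		}
	}
	fclose(f);
	return value;
}

/* Open, write and close, mirroring what `echo val > path` does. */
static int write_value(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fputs(val, f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/* Write 1 then 0 to `brightness_path`, `cycles` times; 0 on success. */
static int toggle_led(const char *brightness_path, unsigned long cycles)
{
	unsigned long i;

	for (i = 0; i < cycles; i++) {
		if (write_value(brightness_path, "1\n") < 0 ||
		    write_value(brightness_path, "0\n") < 0)
			return -1;
	}
	return 0;
}
```

A caller could run toggle_led("/sys/class/leds/2a_service_yellow/brightness", 1000)
in a loop and print meminfo_kb("/proc/meminfo", "MemFree") after each batch, so
that the measurement itself adds almost no per-toggle overhead.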


[PATCH] dma-buf/sync_file: fix documentation error

2016-09-18 Thread Emilio López
The ioctl name and description on the documentation block don't
match the ioctl being defined. This was probably overlooked while
renaming the ioctls during the sync file destaging. This patch
provides a more accurate description of what the ioctl actually does.

Signed-off-by: Emilio López 
---

This is something I saw while refreshing my kselftest patches. Hopefully
this patch describes the new ioctl well enough, let me know if you
think it doesn't :)

Cheers,
Emilio

 include/uapi/linux/sync_file.h | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
index 413303d..cdf8ec2 100644
--- a/include/uapi/linux/sync_file.h
+++ b/include/uapi/linux/sync_file.h
@@ -85,15 +85,12 @@ struct sync_file_info {
 #define SYNC_IOC_MERGE _IOWR(SYNC_IOC_MAGIC, 3, struct sync_merge_data)
 
 /**
- * DOC: SYNC_IOC_FENCE_INFO - get detailed information on a fence
+ * DOC: SYNC_IOC_FILE_INFO - get detailed information on a sync_file
  *
- * Takes a struct sync_file_info_data with extra space allocated for pt_info.
- * Caller should write the size of the buffer into len.  On return, len is
- * updated to reflect the total size of the sync_file_info_data including
- * pt_info.
- *
- * pt_info is a buffer containing sync_pt_infos for every sync_pt in the fence.
- * To iterate over the sync_pt_infos, use the sync_pt_info.len field.
+ * Takes a struct sync_file_info. If num_fences is 0, the field is updated
+ * with the actual number of fences. If num_fences is > 0, the system will
+ * use the pointer provided on sync_fence_info to return up to num_fences of
+ * struct sync_fence_info, with detailed fence information.
  */
 #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
 
-- 
2.9.3
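
For readers unfamiliar with the interface, the two-step use of
SYNC_IOC_FILE_INFO described in the new comment could look roughly like the
sketch below. It assumes a sync_file descriptor obtained elsewhere (for
example from a driver) and that the installed kernel headers provide
<linux/sync_file.h>; the helper name sync_file_fences is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/sync_file.h>

/*
 * Query all fences of a sync_file. Returns a malloc'd array that the
 * caller must free and stores its length in *count, or returns NULL
 * with *count == 0 on failure.
 */
static struct sync_fence_info *sync_file_fences(int fd, uint32_t *count)
{
	struct sync_file_info info;
	struct sync_fence_info *fences;

	assert(count != NULL);
	*count = 0;

	/* Step 1: num_fences is 0, so only the fence count is filled in. */
	memset(&info, 0, sizeof(info));
	if (ioctl(fd, SYNC_IOC_FILE_INFO, &info) < 0 || info.num_fences == 0)
		return NULL;

	fences = calloc(info.num_fences, sizeof(*fences));
	if (!fences)
		return NULL;

	/* Step 2: pass a buffer; up to num_fences entries are returned. */
	info.sync_fence_info = (uint64_t)(uintptr_t)fences;
	if (ioctl(fd, SYNC_IOC_FILE_INFO, &info) < 0) {
		free(fences);
		return NULL;
	}

	*count = info.num_fences;
	return fences;
}
```

The pointer travels through a 64-bit integer field, hence the cast through
uintptr_t, which keeps the conversion well-defined on 32-bit userspace too.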



[PATCH v4 6/6] x86/arch_prctl Add ARCH_[GET|SET]_CPUID

2016-09-18 Thread Kyle Huey
Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
When enabled, the processor will fault on attempts to execute the CPUID
instruction with CPL>0. Exposing this feature to userspace will allow a
ptracer to trap and emulate the CPUID instruction.

When supported, this feature is controlled by toggling bit 0 of
MSR_MISC_FEATURES_ENABLES. It is documented in detail in Section 2.3.2 of
http://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf

Implement a new pair of arch_prctls, available on both x86-32 and x86-64.

ARCH_GET_CPUID: Returns the current CPUID faulting state, either
  ARCH_CPUID_ENABLE or ARCH_CPUID_SIGSEGV. arg2 must be 0.

ARCH_SET_CPUID: Set the CPUID faulting state to arg2, which must be either
  ARCH_CPUID_ENABLE or ARCH_CPUID_SIGSEGV. Returns EINVAL if arg2 is
  another value or CPUID faulting is not supported on this system.

The state of the CPUID faulting flag is propagated across forks, but reset
upon exec.
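
As an illustration only, a userspace caller of the proposed interface might
look like the sketch below. The request codes and argument values are copied
from this v4 patch and defined locally under a PATCH_ prefix; kernels that
do not carry this exact patch may use different numbers or reject the calls
with EINVAL. The wrapper names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values as proposed in this patch, not taken from installed headers. */
#define PATCH_ARCH_GET_CPUID     0x1005
#define PATCH_ARCH_SET_CPUID     0x1006
#define PATCH_ARCH_CPUID_ENABLE  1
#define PATCH_ARCH_CPUID_SIGSEGV 2

/* Raw arch_prctl call; fails with ENOSYS where the syscall is absent. */
static long patch_arch_prctl(int code, unsigned long arg2)
{
#ifdef SYS_arch_prctl
	return syscall(SYS_arch_prctl, code, arg2);
#else
	(void)code;
	(void)arg2;
	errno = ENOSYS;
	return -1;
#endif
}

/* Returns the current mode, or -1 with errno set. arg2 must be 0. */
static long get_cpuid_mode(void)
{
	return patch_arch_prctl(PATCH_ARCH_GET_CPUID, 0);
}

/* Returns 0 on success, -1 with errno set (EINVAL if unsupported). */
static int set_cpuid_mode(unsigned long mode)
{
	assert(mode == PATCH_ARCH_CPUID_ENABLE ||
	       mode == PATCH_ARCH_CPUID_SIGSEGV);
	return patch_arch_prctl(PATCH_ARCH_SET_CPUID, mode) == 0 ? 0 : -1;
}
```

A tracer would typically have the tracee call
set_cpuid_mode(PATCH_ARCH_CPUID_SIGSEGV) and then emulate CPUID when the
resulting SIGSEGV is reported.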

Signed-off-by: Kyle Huey 
---
 arch/x86/include/asm/msr-index.h          |   1 +
 arch/x86/include/asm/thread_info.h        |   6 +-
 arch/x86/include/uapi/asm/prctl.h         |   6 +
 arch/x86/kernel/process.c                 |  94 +++-
 fs/exec.c                                 |   1 +
 include/linux/thread_info.h               |   4 +
 tools/testing/selftests/x86/Makefile      |   2 +-
 tools/testing/selftests/x86/cpuid-fault.c | 231 ++
 8 files changed, 342 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/x86/cpuid-fault.c

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 39aa563..cddefdd 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -53,6 +53,7 @@
 #define MSR_MTRRcap                    0x00fe
 #define MSR_IA32_BBL_CR_CTL            0x0119
 #define MSR_IA32_BBL_CR_CTL3           0x011e
+#define MSR_MISC_FEATURES_ENABLES      0x0140
 
 #define MSR_IA32_SYSENTER_CS   0x0174
 #define MSR_IA32_SYSENTER_ESP  0x0175
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 8b7c8d8..1bc79bc 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -93,6 +93,7 @@ struct thread_info {
 #define TIF_SECCOMP            8   /* secure computing */
 #define TIF_USER_RETURN_NOTIFY 11  /* notify kernel of userspace return */
 #define TIF_UPROBE             12  /* breakpointed or singlestepping */
+#define TIF_NOCPUID            15  /* CPUID is not accessible in userland */
 #define TIF_NOTSC              16  /* TSC is not accessible in userland */
 #define TIF_IA32               17  /* IA32 compatibility process */
 #define TIF_FORK               18  /* ret_from_fork */
@@ -117,6 +118,7 @@ struct thread_info {
 #define _TIF_SECCOMP            (1 << TIF_SECCOMP)
 #define _TIF_USER_RETURN_NOTIFY (1 << TIF_USER_RETURN_NOTIFY)
 #define _TIF_UPROBE             (1 << TIF_UPROBE)
+#define _TIF_NOCPUID            (1 << TIF_NOCPUID)
 #define _TIF_NOTSC              (1 << TIF_NOTSC)
 #define _TIF_IA32               (1 << TIF_IA32)
 #define _TIF_FORK               (1 << TIF_FORK)
@@ -146,7 +148,7 @@ struct thread_info {
 
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW                                                 \
-   (_TIF_IO_BITMAP|_TIF_NOTSC|_TIF_BLOCKSTEP)
+   (_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP)
 
 #define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY)
 #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW)
@@ -293,6 +295,8 @@ static inline bool in_ia32_syscall(void)
 extern void arch_task_cache_init(void);
 extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
 extern void arch_release_task_struct(struct task_struct *tsk);
+extern void arch_post_exec(void);
+#define arch_post_exec arch_post_exec
 #endif /* !__ASSEMBLY__ */
 
 #endif /* _ASM_X86_THREAD_INFO_H */
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 3ac5032..c087e55 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -6,4 +6,10 @@
 #define ARCH_GET_FS 0x1003
 #define ARCH_GET_GS 0x1004
 
+/* Get/set the process' ability to use the CPUID instruction */
+#define ARCH_GET_CPUID 0x1005
+#define ARCH_SET_CPUID 0x1006
+# define ARCH_CPUID_ENABLE     1   /* allow the use of the CPUID instruction */
+# define ARCH_CPUID_SIGSEGV    2   /* throw a SIGSEGV instead of reading the CPUID */
+
 #endif /* _ASM_X86_PRCTL_H */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 97aa104..3ac90eb 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * per-CPU 

[PATCH v4 4/6] x86/syscalls/32 Wire up arch_prctl on x86-32

2016-09-18 Thread Kyle Huey
Hook up arch_prctl to call do_arch_prctl on x86-32, and in 32 bit compat
mode on x86-64. This allows us to have arch_prctls that are not specific to
64 bits.

On UML, simply stub out this syscall.

Signed-off-by: Kyle Huey 
---
 arch/x86/entry/syscalls/syscall_32.tbl | 1 +
 arch/x86/kernel/process_32.c           | 7 +++
 arch/x86/kernel/process_64.c           | 7 +++
 arch/x86/um/Makefile                   | 2 +-
 arch/x86/um/syscalls_32.c              | 7 +++
 include/linux/compat.h                 | 2 ++
 6 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/um/syscalls_32.c

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index f848572..300fdf8 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -386,3 +386,4 @@
 377	i386	copy_file_range		sys_copy_file_range
 378	i386	preadv2			sys_preadv2			compat_sys_preadv2
 379	i386	pwritev2		sys_pwritev2			compat_sys_pwritev2
+380	i386	arch_prctl		sys_arch_prctl			compat_sys_arch_prctl
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index d86be29..71770a4 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -54,6 +55,7 @@
 #include 
 #include 
 #include 
+#include 
 
 asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
 asmlinkage void ret_from_kernel_thread(void) __asm__("ret_from_kernel_thread");
@@ -316,3 +318,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 
 	return prev_p;
 }
+
+SYSCALL_DEFINE2(arch_prctl, int, code, unsigned long, arg2)
+{
+   return do_arch_prctl(current, code, arg2);
+}
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 5c60e2c..aa2b99a 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -599,6 +599,13 @@ SYSCALL_DEFINE2(arch_prctl, int, code, unsigned long, arg2)
return ret;
 }
 
+#ifdef CONFIG_IA32_EMULATION
+COMPAT_SYSCALL_DEFINE2(arch_prctl, int, code, unsigned long, arg2)
+{
+   return do_arch_prctl(current, code, arg2);
+}
+#endif
+
 unsigned long KSTK_ESP(struct task_struct *task)
 {
return task_pt_regs(task)->sp;
diff --git a/arch/x86/um/Makefile b/arch/x86/um/Makefile
index 3ee2bb6..5e039d6 100644
--- a/arch/x86/um/Makefile
+++ b/arch/x86/um/Makefile
@@ -16,7 +16,7 @@ obj-y = bug.o bugs_$(BITS).o delay.o fault.o ksyms.o ldt.o \
 
 ifeq ($(CONFIG_X86_32),y)
 
-obj-y += checksum_32.o
+obj-y += checksum_32.o syscalls_32.o
 obj-$(CONFIG_ELF_CORE) += elfcore.o
 
 subarch-y = ../lib/string_32.o ../lib/atomic64_32.o ../lib/atomic64_cx8_32.o
diff --git a/arch/x86/um/syscalls_32.c b/arch/x86/um/syscalls_32.c
new file mode 100644
index 0000000..ccf0598
--- /dev/null
+++ b/arch/x86/um/syscalls_32.c
@@ -0,0 +1,7 @@
+#include 
+#include 
+
+SYSCALL_DEFINE2(arch_prctl, int, code, unsigned long, arg2)
+{
+   return -EINVAL;
+}
diff --git a/include/linux/compat.h b/include/linux/compat.h
index f964ef7..0039d53 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -722,6 +722,8 @@ asmlinkage long compat_sys_sched_rr_get_interval(compat_pid_t pid,
 asmlinkage long compat_sys_fanotify_mark(int, unsigned int, __u32, __u32,
 					    int, const char __user *);
 
+asmlinkage long compat_sys_arch_prctl(int, unsigned long);
+
 /*
  * For most but not all architectures, "am I in a compat syscall?" and
  * "am I a compat task?" are the same question.  For architectures on which
-- 
2.9.3



[PATCH v4 3/6] x86/arch_prctl Add a new do_arch_prctl

2016-09-18 Thread Kyle Huey
Add a new do_arch_prctl to handle arch_prctls that are not specific to 64
bits. Call it from the syscall entry point, but not any of the other
callsites in the kernel, which all want one of the existing 64 bit only
arch_prctls.

Signed-off-by: Kyle Huey 
---
 arch/x86/include/asm/proto.h | 1 +
 arch/x86/kernel/process.c    | 5 +
 arch/x86/kernel/process_64.c | 8 +++-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index 95c3e51..94a57cc 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -30,6 +30,7 @@ void x86_report_nx(void);
 
 extern int reboot_force;
 
+long do_arch_prctl(struct task_struct *task, int code, unsigned long arg2);
 #ifdef CONFIG_X86_64
 long do_arch_prctl_64(struct task_struct *task, int code, unsigned long arg2);
 #endif
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 62c0b0e..97aa104 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -567,3 +567,8 @@ unsigned long get_wchan(struct task_struct *p)
 	} while (count++ < 16 && p->state != TASK_RUNNING);
 	return 0;
 }
+
+long do_arch_prctl(struct task_struct *task, int code, unsigned long arg2)
+{
+   return -EINVAL;
+}
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 292ce48..5c60e2c 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -590,7 +590,13 @@ long do_arch_prctl_64(struct task_struct *task, int code, unsigned long arg2)
 
 SYSCALL_DEFINE2(arch_prctl, int, code, unsigned long, arg2)
 {
-   return do_arch_prctl_64(current, code, arg2);
+   long ret;
+
+   ret = do_arch_prctl_64(current, code, arg2);
+   if (ret == -EINVAL)
+   ret = do_arch_prctl(current, code, arg2);
+
+   return ret;
 }
 
 unsigned long KSTK_ESP(struct task_struct *task)
-- 
2.9.3
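
The fallback chain this patch introduces (try the 64-bit handler first, and
only consult the generic handler when the request code is not recognized) can
be modelled in plain userspace C. This is a sketch for illustration: the
handler bodies and names are hypothetical stand-ins, and only the routing in
dispatch() mirrors the patch.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for do_arch_prctl_64(): knows only the 64-bit-only codes. */
static long handler_64(int code, unsigned long arg2)
{
	(void)arg2;
	switch (code) {
	case 0x1003: /* ARCH_GET_FS */
	case 0x1004: /* ARCH_GET_GS */
		return 0;
	default:
		return -EINVAL;
	}
}

/* Stand-in for do_arch_prctl(): codes shared by 32 and 64 bit. */
static long handler_common(int code, unsigned long arg2)
{
	(void)arg2;
	/* 0x1005 is ARCH_GET_CPUID in patch 6/6 of this series. */
	return code == 0x1005 ? 1 : -EINVAL;
}

/* Same routing as the reworked SYSCALL_DEFINE2(arch_prctl, ...). */
static long dispatch(int code, unsigned long arg2)
{
	long ret = handler_64(code, arg2);

	if (ret == -EINVAL)
		ret = handler_common(code, arg2);
	return ret;
}

/* Quick self-check of the routing, callable from main(). */
static void dispatch_self_check(void)
{
	assert(dispatch(0x1003, 0) == 0);       /* handled by handler_64 */
	assert(dispatch(0x1005, 0) == 1);       /* falls through to common */
	assert(dispatch(0x7fff, 0) == -EINVAL); /* unknown to both */
}
```

One consequence of keying the fallback on -EINVAL is that a 64-bit handler
which rejects a bad argument with -EINVAL also reaches the generic handler.
That is harmless as long as the generic handler returns -EINVAL for codes it
does not own.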



[PATCH v4 2/6] x86/arch_prctl/64 Rename do_arch_prctl to do_arch_prctl_64

2016-09-18 Thread Kyle Huey
In order to introduce new arch_prctls that are not 64 bit only, rename the
existing 64 bit implementation to do_arch_prctl_64. Also rename the second
argument to arch_prctl, which will no longer always be an address.

Signed-off-by: Kyle Huey 
---
 arch/x86/include/asm/proto.h |  4 +++-
 arch/x86/kernel/process_64.c | 26 ++
 arch/x86/kernel/ptrace.c     |  8 
 arch/x86/um/syscalls_64.c    |  4 ++--
 4 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index 9b9b30b..95c3e51 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -30,6 +30,8 @@ void x86_report_nx(void);
 
 extern int reboot_force;
 
-long do_arch_prctl(struct task_struct *task, int code, unsigned long addr);
+#ifdef CONFIG_X86_64
+long do_arch_prctl_64(struct task_struct *task, int code, unsigned long arg2);
+#endif
 
 #endif /* _ASM_X86_PROTO_H */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4d6363c..292ce48 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -197,7 +197,7 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 				(struct user_desc __user *)tls, 0);
 		else
 #endif
-			err = do_arch_prctl(p, ARCH_SET_FS, tls);
+			err = do_arch_prctl_64(p, ARCH_SET_FS, tls);
 		if (err)
 			goto out;
 	}
@@ -525,7 +525,7 @@ void set_personality_ia32(bool x32)
 }
 EXPORT_SYMBOL_GPL(set_personality_ia32);
 
-long do_arch_prctl(struct task_struct *task, int code, unsigned long addr)
+long do_arch_prctl_64(struct task_struct *task, int code, unsigned long arg2)
 {
int ret = 0;
int doit = task == current;
@@ -533,48 +533,50 @@ long do_arch_prctl(struct task_struct *task, int code, unsigned long addr)
 
switch (code) {
case ARCH_SET_GS:
-   if (addr >= TASK_SIZE_MAX)
+   if (arg2 >= TASK_SIZE_MAX)
return -EPERM;
cpu = get_cpu();
task->thread.gsindex = 0;
-   task->thread.gsbase = addr;
+   task->thread.gsbase = arg2;
if (doit) {
load_gs_index(0);
-   ret = wrmsrl_safe(MSR_KERNEL_GS_BASE, addr);
+   ret = wrmsrl_safe(MSR_KERNEL_GS_BASE, arg2);
}
put_cpu();
break;
case ARCH_SET_FS:
/* Not strictly needed for fs, but do it for symmetry
   with gs */
-   if (addr >= TASK_SIZE_MAX)
+   if (arg2 >= TASK_SIZE_MAX)
return -EPERM;
cpu = get_cpu();
task->thread.fsindex = 0;
-   task->thread.fsbase = addr;
+   task->thread.fsbase = arg2;
if (doit) {
/* set the selector to 0 to not confuse __switch_to */
loadsegment(fs, 0);
-   ret = wrmsrl_safe(MSR_FS_BASE, addr);
+   ret = wrmsrl_safe(MSR_FS_BASE, arg2);
}
put_cpu();
break;
case ARCH_GET_FS: {
unsigned long base;
+
if (doit)
rdmsrl(MSR_FS_BASE, base);
else
base = task->thread.fsbase;
-   ret = put_user(base, (unsigned long __user *)addr);
+   ret = put_user(base, (unsigned long __user *)arg2);
break;
}
case ARCH_GET_GS: {
unsigned long base;
+
if (doit)
rdmsrl(MSR_KERNEL_GS_BASE, base);
else
base = task->thread.gsbase;
-   ret = put_user(base, (unsigned long __user *)addr);
+   ret = put_user(base, (unsigned long __user *)arg2);
break;
}
 
@@ -586,9 +588,9 @@ long do_arch_prctl(struct task_struct *task, int code, unsigned long addr)
return ret;
 }
 
-SYSCALL_DEFINE2(arch_prctl, int, code, unsigned long, addr)
+SYSCALL_DEFINE2(arch_prctl, int, code, unsigned long, arg2)
 {
-   return do_arch_prctl(current, code, addr);
+   return do_arch_prctl_64(current, code, arg2);
 }
 
 unsigned long KSTK_ESP(struct task_struct *task)
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index f79576a..030cbc5 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -395,12 +395,12 @@ static int putreg(struct task_struct *child,
if (value >= TASK_SIZE_MAX)
return -EIO;
/*
-* When changing the segment base, use do_arch_prctl
+* When changing the segment base, use do_arch_prctl_64
 * to set either thread.fs or 

[PATCH v4 5/6] x86/cpufeature Detect CPUID faulting support

2016-09-18 Thread Kyle Huey
Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
When enabled, the processor will fault on attempts to execute the CPUID
instruction with CPL>0. This will allow a ptracer to emulate the CPUID
instruction.

Bit 31 of MSR_PLATFORM_INFO advertises support for this feature. It is
documented in detail in Section 2.3.2 of
http://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf

Detect support for this feature and expose it as X86_FEATURE_CPUID_FAULT.

Signed-off-by: Kyle Huey 
Reviewed-by: Andy Lutomirski 
---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/include/asm/msr-index.h   |  1 +
 arch/x86/kernel/cpu/scattered.c| 13 +
 3 files changed, 15 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 92a8308..78b9d06 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -190,6 +190,7 @@
 
 #define X86_FEATURE_CPB ( 7*32+ 2) /* AMD Core Performance Boost */
 #define X86_FEATURE_EPB ( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
+#define X86_FEATURE_CPUID_FAULT ( 7*32+ 4) /* Intel CPUID faulting */
 
 #define X86_FEATURE_HW_PSTATE  ( 7*32+ 8) /* AMD HW-PState */
 #define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 56f4c66..39aa563 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -41,6 +41,7 @@
 #define MSR_IA32_PERFCTR1  0x00c2
 #define MSR_FSB_FREQ   0x00cd
 #define MSR_PLATFORM_INFO  0x00ce
+#define PLATINFO_CPUID_FAULT   (1UL << 31)
 
 #define MSR_NHM_SNB_PKG_CST_CFG_CTL 0x00e2
 #define NHM_C3_AUTO_DEMOTE (1UL << 25)
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 8cb57df..7901481 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -24,6 +24,16 @@ enum cpuid_regs {
CR_EBX
 };
 
+static bool supports_cpuid_faulting(void)
+{
+   unsigned int lo, hi;
+
+   if (rdmsr_safe(MSR_PLATFORM_INFO, &lo, &hi))
+   return false;
+
+   return lo & PLATINFO_CPUID_FAULT;
+}
+
 void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 {
u32 max_level;
@@ -54,4 +64,7 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
if (regs[cb->reg] & (1 << cb->bit))
set_cpu_cap(c, cb->feature);
}
+
+   if (supports_cpuid_faulting())
+   set_cpu_cap(c, X86_FEATURE_CPUID_FAULT);
 }
-- 
2.9.3
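
The bit test done by supports_cpuid_faulting() above can be tried in userspace on a value that has already been read. A minimal sketch, assuming the MSR contents arrive as the usual low/high 32-bit halves; the function and macro names here are made up for illustration and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position taken from the commit message: bit 31 of MSR_PLATFORM_INFO. */
#define MODEL_CPUID_FAULT_BIT 31u

/* lo holds bits 31:0 and hi holds bits 63:32 of the 64-bit MSR value. */
bool model_platinfo_has_cpuid_fault(uint32_t lo, uint32_t hi)
{
	uint64_t val = ((uint64_t)hi << 32) | lo;

	return (val >> MODEL_CPUID_FAULT_BIT) & 1u;
}
```

Because bit 31 lives entirely in the low half, testing lo alone (as the patch does) gives the same answer; the model only recombines the halves to make the bit numbering explicit.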



[PATCH v4 1/6] x86/arch_prctl/64 Use SYSCALL_DEFINE2 to define sys_arch_prctl

2016-09-18 Thread Kyle Huey
Signed-off-by: Kyle Huey 
---
 arch/x86/kernel/process_64.c | 3 ++-
 arch/x86/um/syscalls_64.c| 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 63236d8..4d6363c 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -585,7 +586,7 @@ long do_arch_prctl(struct task_struct *task, int code, unsigned long addr)
return ret;
 }
 
-long sys_arch_prctl(int code, unsigned long addr)
+SYSCALL_DEFINE2(arch_prctl, int, code, unsigned long, addr)
 {
return do_arch_prctl(current, code, addr);
 }
diff --git a/arch/x86/um/syscalls_64.c b/arch/x86/um/syscalls_64.c
index e655227..3282066 100644
--- a/arch/x86/um/syscalls_64.c
+++ b/arch/x86/um/syscalls_64.c
@@ -72,7 +72,7 @@ long arch_prctl(struct task_struct *task, int code, unsigned long __user *addr)
return ret;
 }
 
-long sys_arch_prctl(int code, unsigned long addr)
+SYSCALL_DEFINE2(arch_prctl, int, code, unsigned long, addr)
 {
return arch_prctl(current, code, (unsigned long __user *) addr);
 }
-- 
2.9.3

base-commit: 024c7e3756d8a42fc41fe8a9488488b9b09d1dcc
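
The two-argument syscall defined above can be exercised from userspace through syscall(2). The sketch below uses the existing ARCH_GET_FS request, where the second argument is an address; read_fs_base() is a hypothetical helper name, and the code assumes Linux on x86-64 (it reports failure elsewhere).

```c
#include <unistd.h>
#include <sys/syscall.h>

#if defined(__linux__) && defined(__x86_64__)
#include <asm/prctl.h>   /* ARCH_GET_FS */
#endif

/* Returns 0 and stores the FS base on success, -1 if unavailable or on error. */
int read_fs_base(unsigned long *base)
{
#if defined(__linux__) && defined(__x86_64__)
	/* Go through syscall(2) directly rather than relying on a libc wrapper. */
	return syscall(SYS_arch_prctl, ARCH_GET_FS, base) == 0 ? 0 : -1;
#else
	(void)base;
	return -1;   /* arch_prctl is x86-specific */
#endif
}
```

A caller would declare an unsigned long, pass its address, and check the return value before using the result.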


[PATCH v4 0/6] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction

2016-09-18 Thread Kyle Huey
rr (http://rr-project.org/), a userspace record-and-replay reverse-
execution debugger, would like to trap and emulate the CPUID instruction.
This would allow us to a) mask away certain hardware features that rr does
not support (e.g. RDRAND) and b) enable trace portability across machines
by providing constant results.

Newer Intel CPUs (Ivy Bridge and later) can fault when CPUID is executed at
CPL > 0.  Expose this capability to userspace as a new pair of arch_prctls,
ARCH_GET_CPUID and ARCH_SET_CPUID, with two values, ARCH_CPUID_ENABLE and
ARCH_CPUID_SIGSEGV.

The following changes have been made since v3:

Patch 1 was split into patches 1-4, patches 2 and 3 became patches 5 and 6, 
respectively.

Patch 1:
- Use SYSCALL_DEFINE in UML.

Patch 2:
- More descriptive commit message.

Patch 3:
- More descriptive commit message.
- Name the common arch_prctl function do_arch_prctl instead of
  do_arch_prctl_common

Patch 4:
- Move the 32-bit syscall entry point to process_32.c, place the compat
  entry point in process_64.c

Patch 5 (previously Patch 2):
- More descriptive commit message.
- Prefix the #define for the cpuid faulting bit with PLATINFO
- supports_cpuid_faulting returns bool
- Rearrange supports_cpuid_faulting to avoid linebreaks

Patch 6 (previously Patch 3):
- ARCH_GET_CPUID now takes 0 for the second argument, and returns the
  result directly.
- arch_post_exec is now a #define, called from setup_new_exec
- The test now uses errx
- The test now checks that ARCH_GET_CPUID returns ARCH_CPUID_SIGSEGV after
  fork()



there are unencrypted files in an encrypted directory in F2FS

2016-09-18 Thread xiakaixu

Hi Kim,

According to the encryption design policy, "all of the files or
subdirectories in an encrypted directory must be encrypted". But
the current f2fs code seems to allow unencrypted files in an
encrypted directory. For example, the f2fs_create() and
f2fs_mknod() functions call f2fs_new_inode() to check the child inode.

/* If the directory encrypted, then we should encrypt the inode. */
if (f2fs_encrypted_inode(dir) && f2fs_may_encrypt(inode))
f2fs_set_encrypted_inode(inode);

static inline bool f2fs_may_encrypt(struct inode *inode)
{
#ifdef CONFIG_F2FS_FS_ENCRYPTION
umode_t mode = inode->i_mode;

return (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode));
#else
return 0;
#endif
}

So if the child inode is not REG/DIR/LNK, it can still be created
successfully, as an unencrypted file. Maybe we could return -EACCES
here instead. Not sure about it :)

--
Regards
Kaixu Xia



[PATCH 0/2] make POSIX timers configurable

2016-09-18 Thread Nicolas Pitre
Many embedded systems don't need the full POSIX timer support.
Configuring them out provides a nice kernel image size reduction.

When POSIX timers are configured out, the PTP clock subsystem should be
left out as well. However a bunch of ethernet drivers currently *select*
it in their Kconfig entries. Therefore some more tweaks were needed to
break that hard dependency for those drivers to still be configured in
if desired.

It was agreed that the best path upstream for those patches is via
John Stultz's timer tree.

Previous iterations of those patches and the discussion threads can be
found here:

  https://lkml.org/lkml/2016/9/14/992

  https://lkml.org/lkml/2016/9/14/803

  https://lkml.org/lkml/2016/9/8/793

diffstat:

 drivers/Makefile|   2 +-
 drivers/net/ethernet/adi/Kconfig|   8 +-
 drivers/net/ethernet/amd/Kconfig|   2 +-
 drivers/net/ethernet/amd/xgbe/xgbe-main.c   |   6 +-
 drivers/net/ethernet/broadcom/Kconfig   |   4 +-
 drivers/net/ethernet/cavium/Kconfig |   2 +-
 drivers/net/ethernet/freescale/Kconfig  |   2 +-
 drivers/net/ethernet/intel/Kconfig  |  10 +-
 drivers/net/ethernet/intel/e1000e/ptp.c |   2 +-
 drivers/net/ethernet/intel/i40e/i40e_ptp.c  |   2 +-
 drivers/net/ethernet/intel/igb/igb_ptp.c|   2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c|   2 +-
 drivers/net/ethernet/mellanox/mlx4/Kconfig  |   2 +-
 drivers/net/ethernet/mellanox/mlx4/en_clock.c   |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/Kconfig |   2 +-
 .../net/ethernet/mellanox/mlx5/core/en_clock.c  |   2 +-
 drivers/net/ethernet/renesas/Kconfig|   2 +-
 drivers/net/ethernet/samsung/Kconfig|   2 +-
 drivers/net/ethernet/sfc/Kconfig|   2 +-
 drivers/net/ethernet/sfc/ptp.c  |  14 +--
 drivers/net/ethernet/stmicro/stmmac/Kconfig |   2 +-
 .../net/ethernet/stmicro/stmmac/stmmac_ptp.c|   2 +-
 drivers/net/ethernet/ti/Kconfig |   2 +-
 drivers/net/ethernet/tile/Kconfig   |   2 +-
 drivers/ptp/Kconfig |  14 ++-
 include/linux/posix-timers.h|  28 -
 include/linux/ptp_clock_kernel.h|  59 ++---
 include/linux/sched.h   |  10 ++
 init/Kconfig|  17 +++
 kernel/signal.c |   4 +
 kernel/time/Kconfig |   1 +
 kernel/time/Makefile|  10 +-
 kernel/time/posix-stubs.c   | 118 ++
 33 files changed, 277 insertions(+), 64 deletions(-)


[PATCH 2/2] posix-timers: make it configurable

2016-09-18 Thread Nicolas Pitre
Many embedded systems typically don't need them.  This removes about
22KB from the kernel binary size on ARM when configured out.

Corresponding syscalls are routed to a stub logging the attempt to
use those syscalls which should be enough of a clue if they were
disabled without proper consideration. They are: timer_create,
timer_gettime, timer_getoverrun, timer_settime, timer_delete,
clock_adjtime.

The clock_settime, clock_gettime, clock_getres and clock_nanosleep syscalls
are replaced by simple wrappers compatible with CLOCK_REALTIME,
CLOCK_MONOTONIC and CLOCK_BOOTTIME only.

Signed-off-by: Nicolas Pitre 
---
 drivers/ptp/Kconfig  |   2 +-
 include/linux/posix-timers.h |  28 +-
 include/linux/sched.h|  10 
 init/Kconfig |  17 +++
 kernel/signal.c  |   4 ++
 kernel/time/Kconfig  |   1 +
 kernel/time/Makefile |  10 +++-
 kernel/time/posix-stubs.c| 118 +++
 8 files changed, 185 insertions(+), 5 deletions(-)
 create mode 100644 kernel/time/posix-stubs.c

diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig
index f34b3748c0..940fa10907 100644
--- a/drivers/ptp/Kconfig
+++ b/drivers/ptp/Kconfig
@@ -10,7 +10,7 @@ config PTP_1588_CLOCK_SELECTED
 config PTP_1588_CLOCK
tristate "PTP clock support"
default PTP_1588_CLOCK_SELECTED
-   depends on NET
+   depends on NET && POSIX_TIMERS
select PPS
select NET_PTP_CLASSIFY
help
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 62d44c1760..2288c5c557 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -118,6 +118,8 @@ struct k_clock {
 extern struct k_clock clock_posix_cpu;
 extern struct k_clock clock_posix_dynamic;
 
+#ifdef CONFIG_POSIX_TIMERS
+
 void posix_timers_register_clock(const clockid_t clock_id, struct k_clock *new_clock);
 
 /* function to call to trigger timer event */
@@ -131,8 +133,30 @@ void posix_cpu_timers_exit_group(struct task_struct *task);
 void set_process_cpu_timer(struct task_struct *task, unsigned int clock_idx,
   cputime_t *newval, cputime_t *oldval);
 
-long clock_nanosleep_restart(struct restart_block *restart_block);
-
 void update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new);
 
+#else
+
+#include 
+
+static inline void posix_timers_register_clock(const clockid_t clock_id,
+  struct k_clock *new_clock) {}
+static inline int posix_timer_event(struct k_itimer *timr, int si_private)
+{ return 0; }
+static inline void run_posix_cpu_timers(struct task_struct *task) {}
+static inline void posix_cpu_timers_exit(struct task_struct *task)
+{
+   add_device_randomness((const void*) &task->se.sum_exec_runtime,
+ sizeof(unsigned long long));
+}
+static inline void posix_cpu_timers_exit_group(struct task_struct *task) {}
+static inline void set_process_cpu_timer(struct task_struct *task,
+   unsigned int clock_idx, cputime_t *newval, cputime_t *oldval) {}
+static inline void update_rlimit_cpu(struct task_struct *task,
+unsigned long rlim_new) {}
+
+#endif
+
+long clock_nanosleep_restart(struct restart_block *restart_block);
+
 #endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 54182d52a0..39a1d6d3f5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2924,8 +2924,13 @@ static inline void exit_thread(struct task_struct *tsk)
 extern void exit_files(struct task_struct *);
 extern void __cleanup_sighand(struct sighand_struct *);
 
+#ifdef CONFIG_POSIX_TIMERS
 extern void exit_itimers(struct signal_struct *);
 extern void flush_itimer_signals(void);
+#else
+static inline void exit_itimers(struct signal_struct *s) {}
+static inline void flush_itimer_signals(void) {}
+#endif
 
 extern void do_group_exit(int);
 
@@ -3382,7 +3387,12 @@ static __always_inline bool need_resched(void)
  * Thread group CPU time accounting.
  */
 void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times);
+#ifdef CONFIG_POSIX_TIMERS
 void thread_group_cputimer(struct task_struct *tsk, struct task_cputime *times);
+#else
+static inline void thread_group_cputimer(struct task_struct *tsk,
+struct task_cputime *times) {}
+#endif
 
 /*
  * Reevaluate whether the task has signals pending delivery.
diff --git a/init/Kconfig b/init/Kconfig
index a117738afd..3fdea723dd 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1449,6 +1449,23 @@ config SYSCTL_SYSCALL
 
  If unsure say N here.
 
+config POSIX_TIMERS
+   bool "Posix Clocks & timers" if EXPERT
+   default y
+   help
+ This includes native support for POSIX timers to the kernel.
+ Most embedded systems may have no use for them and therefore they
+ can be configured out to reduce the size of the kernel 

[PATCH 1/2] ptp_clock: allow for it to be optional

2016-09-18 Thread Nicolas Pitre
In order to break the hard dependency between the PTP clock subsystem and
ethernet drivers capable of being clock providers, this patch provides
simple PTP stub functions to allow linkage of those drivers into the
kernel even when the PTP subsystem is configured out.

And to make it possible for PTP to be configured out, the select statement
in the Kconfig entry for those ethernet drivers is changed from selecting
PTP_1588_CLOCK to PTP_1588_CLOCK_SELECTED whose purpose is to indicate the
default Kconfig value for the PTP subsystem.

This way the PTP subsystem may have Kconfig dependencies of its own, such
as POSIX_TIMERS, without making those ethernet drivers unavailable if
POSIX timers are configured out. And when support for POSIX timers is
selected again then PTP clock support will also be selected accordingly.

Drivers must be ready to accept NULL from ptp_clock_register().
The pch_gbe driver is a bit special as it relies on extra code in
drivers/ptp/ptp_pch.c. Therefore we let the make process descend into
drivers/ptp/ even if PTP_1588_CLOCK is unselected.
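
What "ready to accept NULL" means for a driver can be sketched as follows. All names below (model_ptp_clock_register(), struct model_adapter and so on) are hypothetical stand-ins written for this illustration; the stub simply returns NULL the way the message says ptp_clock_register() may.

```c
#include <stddef.h>

struct model_ptp_clock { int index; };

struct model_adapter {
	struct model_ptp_clock *ptp_clock;   /* NULL when PTP is unavailable */
};

/* Hypothetical stub: behaves like a register call with PTP configured out. */
static struct model_ptp_clock *model_ptp_clock_register(void)
{
	return NULL;
}

/* Returns 1 if a clock was registered, 0 if the driver runs without one. */
int model_ptp_init(struct model_adapter *adapter)
{
	adapter->ptp_clock = model_ptp_clock_register();
	if (adapter->ptp_clock == NULL)
		return 0;   /* not an error: the NIC works, it just has no clock */
	return 1;
}

/* Teardown has to tolerate the NULL case as well. */
void model_ptp_stop(struct model_adapter *adapter)
{
	if (adapter->ptp_clock == NULL)
		return;
	adapter->ptp_clock = NULL;   /* a real driver would unregister here */
}
```

The point of the pattern is that the missing clock is treated as a normal configuration rather than a probe failure, so the network device still comes up.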

Signed-off-by: Nicolas Pitre 
Acked-by: Richard Cochran 
---
 drivers/Makefile   |  2 +-
 drivers/net/ethernet/adi/Kconfig   |  8 ++-
 drivers/net/ethernet/amd/Kconfig   |  2 +-
 drivers/net/ethernet/amd/xgbe/xgbe-main.c  |  6 ++-
 drivers/net/ethernet/broadcom/Kconfig  |  4 +-
 drivers/net/ethernet/cavium/Kconfig|  2 +-
 drivers/net/ethernet/freescale/Kconfig |  2 +-
 drivers/net/ethernet/intel/Kconfig | 10 ++--
 drivers/net/ethernet/intel/e1000e/ptp.c|  2 +-
 drivers/net/ethernet/intel/i40e/i40e_ptp.c |  2 +-
 drivers/net/ethernet/intel/igb/igb_ptp.c   |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c   |  2 +-
 drivers/net/ethernet/mellanox/mlx4/Kconfig |  2 +-
 drivers/net/ethernet/mellanox/mlx4/en_clock.c  |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/Kconfig|  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_clock.c |  2 +-
 drivers/net/ethernet/renesas/Kconfig   |  2 +-
 drivers/net/ethernet/samsung/Kconfig   |  2 +-
 drivers/net/ethernet/sfc/Kconfig   |  2 +-
 drivers/net/ethernet/sfc/ptp.c | 14 ++---
 drivers/net/ethernet/stmicro/stmmac/Kconfig|  2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c   |  2 +-
 drivers/net/ethernet/ti/Kconfig|  2 +-
 drivers/net/ethernet/tile/Kconfig  |  2 +-
 drivers/ptp/Kconfig| 12 +++--
 include/linux/ptp_clock_kernel.h   | 59 +++---
 26 files changed, 92 insertions(+), 59 deletions(-)

diff --git a/drivers/Makefile b/drivers/Makefile
index 53abb4a5f7..8a538d0856 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -105,7 +105,7 @@ obj-$(CONFIG_INPUT) += input/
 obj-$(CONFIG_RTC_LIB)  += rtc/
 obj-y  += i2c/ media/
 obj-$(CONFIG_PPS)  += pps/
-obj-$(CONFIG_PTP_1588_CLOCK)   += ptp/
+obj-y  += ptp/
 obj-$(CONFIG_W1)   += w1/
 obj-y  += power/
 obj-$(CONFIG_HWMON)+= hwmon/
diff --git a/drivers/net/ethernet/adi/Kconfig b/drivers/net/ethernet/adi/Kconfig
index 6b94ba6103..67094a9cfe 100644
--- a/drivers/net/ethernet/adi/Kconfig
+++ b/drivers/net/ethernet/adi/Kconfig
@@ -55,10 +55,14 @@ config BFIN_RX_DESC_NUM
---help---
  Set the number of buffer packets used in driver.
 
+config BFIN_MAC_HAS_HWSTAMP
+   def_tristate BFIN_MAC
+   depends on BF518
+   select PTP_1588_CLOCK_SELECTED
+
 config BFIN_MAC_USE_HWSTAMP
bool "Use IEEE 1588 hwstamp"
-   depends on BFIN_MAC && BF518
-   select PTP_1588_CLOCK
+   depends on BFIN_MAC_HAS_HWSTAMP && PTP_1588_CLOCK
default y
---help---
  To support the IEEE 1588 Precision Time Protocol (PTP), select y here
diff --git a/drivers/net/ethernet/amd/Kconfig b/drivers/net/ethernet/amd/Kconfig
index 0038709fd3..327e71a554 100644
--- a/drivers/net/ethernet/amd/Kconfig
+++ b/drivers/net/ethernet/amd/Kconfig
@@ -177,7 +177,7 @@ config AMD_XGBE
depends on ARM64 || COMPILE_TEST
select BITREVERSE
select CRC32
-   select PTP_1588_CLOCK
+   select PTP_1588_CLOCK_SELECTED
---help---
  This driver supports the AMD 10GbE Ethernet device found on an
  AMD SoC.
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-main.c b/drivers/net/ethernet/amd/xgbe/xgbe-main.c
index 3eee3201b5..4aeeb018b6 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-main.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-main.c
@@ -773,7 +773,8 @@ static int xgbe_probe(struct platform_device *pdev)
goto err_wq;
}
 
-   xgbe_ptp_register(pdata);
+   if 
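
The pattern this patch relies on, a header that supplies do-nothing stubs
when a subsystem is configured out so that its callers still compile and
link, can be sketched in plain userspace C. This is only an illustration
under stated assumptions: HAVE_FAKE_PTP, fake_clock_register() and
fake_driver_probe() are hypothetical names and not the kernel's
ptp_clock_register() API.

```c
#include <stddef.h>

/* Hypothetical stand-in for a clock object handed out by an optional subsystem. */
struct fake_clock {
    int index;
};

#ifdef HAVE_FAKE_PTP
/* Subsystem built in: hand out a real object. */
static struct fake_clock the_clock = { .index = 0 };

static inline struct fake_clock *fake_clock_register(void)
{
    return &the_clock;
}

static inline void fake_clock_unregister(struct fake_clock *clk)
{
    (void)clk;
}
#else
/* Subsystem configured out: stubs keep the callers compiling and linking. */
static inline struct fake_clock *fake_clock_register(void)
{
    return NULL;
}

static inline void fake_clock_unregister(struct fake_clock *clk)
{
    (void)clk;
}
#endif

struct fake_driver {
    struct fake_clock *clk;   /* NULL when no clock is available */
    int has_timestamping;     /* 1 only when a clock was registered */
};

/* A probe routine that treats a NULL clock as "feature absent", not as an error. */
static int fake_driver_probe(struct fake_driver *drv)
{
    drv->clk = fake_clock_register();
    drv->has_timestamping = (drv->clk != NULL);
    return 0;   /* the probe still succeeds without the clock */
}
```

Built without -DHAVE_FAKE_PTP, the probe succeeds and leaves clk set to
NULL, which is the case the driver has to be prepared for.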

[PATCH 2/2] posix-timers: make it configurable

2016-09-18 Thread Nicolas Pitre
Many embedded systems typically don't need them.  This removes about
22KB from the kernel binary size on ARM when configured out.

Corresponding syscalls are routed to a stub logging the attempt to
use those syscalls which should be enough of a clue if they were
disabled without proper consideration. They are: timer_create,
timer_gettime, timer_getoverrun, timer_settime, timer_delete,
clock_adjtime.

The clock_settime, clock_gettime, clock_getres and clock_nanosleep syscalls
are replaced by simple wrappers compatible with CLOCK_REALTIME,
CLOCK_MONOTONIC and CLOCK_BOOTTIME only.

Signed-off-by: Nicolas Pitre 
---
 drivers/ptp/Kconfig  |   2 +-
 include/linux/posix-timers.h |  28 +-
 include/linux/sched.h|  10 
 init/Kconfig |  17 +++
 kernel/signal.c  |   4 ++
 kernel/time/Kconfig  |   1 +
 kernel/time/Makefile |  10 +++-
 kernel/time/posix-stubs.c| 118 +++
 8 files changed, 185 insertions(+), 5 deletions(-)
 create mode 100644 kernel/time/posix-stubs.c

diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig
index f34b3748c0..940fa10907 100644
--- a/drivers/ptp/Kconfig
+++ b/drivers/ptp/Kconfig
@@ -10,7 +10,7 @@ config PTP_1588_CLOCK_SELECTED
 config PTP_1588_CLOCK
tristate "PTP clock support"
default PTP_1588_CLOCK_SELECTED
-   depends on NET
+   depends on NET && POSIX_TIMERS
select PPS
select NET_PTP_CLASSIFY
help
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 62d44c1760..2288c5c557 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -118,6 +118,8 @@ struct k_clock {
 extern struct k_clock clock_posix_cpu;
 extern struct k_clock clock_posix_dynamic;
 
+#ifdef CONFIG_POSIX_TIMERS
+
 void posix_timers_register_clock(const clockid_t clock_id, struct k_clock *new_clock);
 
 /* function to call to trigger timer event */
@@ -131,8 +133,30 @@ void posix_cpu_timers_exit_group(struct task_struct *task);
 void set_process_cpu_timer(struct task_struct *task, unsigned int clock_idx,
   cputime_t *newval, cputime_t *oldval);
 
-long clock_nanosleep_restart(struct restart_block *restart_block);
-
 void update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new);
 
+#else
+
+#include <linux/random.h>
+
+static inline void posix_timers_register_clock(const clockid_t clock_id,
+  struct k_clock *new_clock) {}
+static inline int posix_timer_event(struct k_itimer *timr, int si_private)
+{ return 0; }
+static inline void run_posix_cpu_timers(struct task_struct *task) {}
+static inline void posix_cpu_timers_exit(struct task_struct *task)
+{
+   add_device_randomness((const void*) &task->se.sum_exec_runtime,
+ sizeof(unsigned long long));
+}
+static inline void posix_cpu_timers_exit_group(struct task_struct *task) {}
+static inline void set_process_cpu_timer(struct task_struct *task,
+   unsigned int clock_idx, cputime_t *newval, cputime_t *oldval) {}
+static inline void update_rlimit_cpu(struct task_struct *task,
+unsigned long rlim_new) {}
+
+#endif
+
+long clock_nanosleep_restart(struct restart_block *restart_block);
+
 #endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 54182d52a0..39a1d6d3f5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2924,8 +2924,13 @@ static inline void exit_thread(struct task_struct *tsk)
 extern void exit_files(struct task_struct *);
 extern void __cleanup_sighand(struct sighand_struct *);
 
+#ifdef CONFIG_POSIX_TIMERS
 extern void exit_itimers(struct signal_struct *);
 extern void flush_itimer_signals(void);
+#else
+static inline void exit_itimers(struct signal_struct *s) {}
+static inline void flush_itimer_signals(void) {}
+#endif
 
 extern void do_group_exit(int);
 
@@ -3382,7 +3387,12 @@ static __always_inline bool need_resched(void)
  * Thread group CPU time accounting.
  */
 void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times);
+#ifdef CONFIG_POSIX_TIMERS
 void thread_group_cputimer(struct task_struct *tsk, struct task_cputime *times);
+#else
+static inline void thread_group_cputimer(struct task_struct *tsk,
+struct task_cputime *times) {}
+#endif
 
 /*
  * Reevaluate whether the task has signals pending delivery.
diff --git a/init/Kconfig b/init/Kconfig
index a117738afd..3fdea723dd 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1449,6 +1449,23 @@ config SYSCTL_SYSCALL
 
  If unsure say N here.
 
+config POSIX_TIMERS
+   bool "Posix Clocks & timers" if EXPERT
+   default y
+   help
+ This includes native support for POSIX timers to the kernel.
+ Most embedded systems may have no use for them and therefore they
+ can be configured out to reduce the size of the kernel image.
+
+ 
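
The commit message above says the disabled syscalls are routed to a stub
that logs the attempt. The contents of kernel/time/posix-stubs.c are not
shown in this excerpt, so the following is only a userspace sketch of that
shape. The name stub_unavailable_call(), the warn-once policy and the
-ENOSYS return value are all assumptions of the sketch.

```c
#include <errno.h>
#include <stdio.h>

/* Number of warnings actually printed; kept visible so the behaviour can be observed. */
static int stub_warnings_printed;

/*
 * Hypothetical catch-all for calls whose implementation was configured out.
 * It reports the first attempt only, then keeps failing quietly.
 * Returning -ENOSYS is an assumption of this sketch.
 */
static long stub_unavailable_call(const char *name)
{
    if (stub_warnings_printed == 0) {
        fprintf(stderr, "attempted %s while the feature is configured out\n",
                name);
        stub_warnings_printed++;
    }
    return -ENOSYS;
}
```

Every call fails the same way, but only the first one leaves a message
behind, which gives a clue without flooding the log.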

[PATCH v10 1/3] docs: kernel-parameter : Improve the description of nr_cpus and maxcpus

2016-09-18 Thread Baoquan He
From the old description, people still can't tell the exact difference
between nr_cpus and maxcpus. In a kdump kernel especially, nr_cpus is
always suggested if the ARCH implements it. The reason is that nr_cpus
limits the maximum number of possible CPUs in the system: the sum of
already plugged CPUs and hot-plug CPUs can't exceed its value. maxcpus,
however, limits how many CPUs are allowed to be brought up during bootup.

Signed-off-by: Baoquan He 
---
 Documentation/kernel-parameters.txt | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index a4f4d69..98d6406 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2161,10 +2161,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
than or equal to this physical address is ignored.
 
maxcpus=[SMP] Maximum number of processors that an SMP kernel
-   should make use of.  maxcpus=n : n >= 0 limits the
-   kernel to using 'n' processors.  n=0 is a special case,
-   it is equivalent to "nosmp", which also disables
-   the IO APIC.
+   will bring up during bootup.  maxcpus=n : n >= 0 limits
+   the kernel to bring up 'n' processors. Surely after
+   bootup you can bring up the other plugged cpu by executing
+   "echo 1 > /sys/devices/system/cpu/cpuX/online". So maxcpus
+   only takes effect during system bootup.
+   While n=0 is a special case, it is equivalent to "nosmp",
+   which also disables the IO APIC.
 
max_loop=   [LOOP] The number of loop block devices that get
(loop.max_loop) unconditionally pre-created at init time. The default
@@ -2773,9 +2776,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 
nr_cpus=[SMP] Maximum number of processors that an SMP kernel
could support.  nr_cpus=n : n >= 1 limits the kernel to
-   supporting 'n' processors. Later in runtime you can not
-   use hotplug cpu feature to put more cpu back to online.
-   just like you compile the kernel NR_CPUS=n
+   support 'n' processors. It could be larger than the
+   number of already plugged CPU during bootup, later in
+   runtime you can physically add extra cpu until it reaches
+   n. So during boot up some boot time memory for per-cpu
+   variables need be pre-allocated for later physical cpu
+   hot plugging.
 
nr_uarts=   [SERIAL] maximum number of UARTs to be registered.
 
-- 
2.5.5
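
The distinction the patch draws can be put into a toy model: nr_cpus caps
how many CPUs can ever exist for the kernel, while maxcpus only caps how
many are brought up during boot. The sketch below is not kernel code. The
struct, the function name and the treatment of maxcpus=0 as "boot CPU
only" are assumptions made for illustration.

```c
/* Toy model of the two limits as the patch text describes them; not kernel code. */
struct cpu_limits {
    int possible;        /* upper bound on CPUs the kernel can ever manage */
    int online_at_boot;  /* CPUs brought up during boot */
};

static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/*
 * present:  CPUs plugged in at boot
 * hotplug:  additional CPUs that could be added physically later
 * nr_cpus:  value of nr_cpus= (n >= 1)
 * maxcpus:  value of maxcpus= (n >= 0; 0 is modelled as boot CPU only)
 */
static struct cpu_limits model_cpu_limits(int present, int hotplug,
                                          int nr_cpus, int maxcpus)
{
    struct cpu_limits l;
    int usable_present;

    /* nr_cpus bounds the sum of plugged and hot-pluggable CPUs. */
    l.possible = min_int(present + hotplug, nr_cpus);

    /* maxcpus only bounds what is brought up during boot. */
    usable_present = min_int(present, l.possible);
    l.online_at_boot = min_int(usable_present, maxcpus);
    if (l.online_at_boot < 1)
        l.online_at_boot = 1;   /* the boot CPU is always running */

    return l;
}
```

With 8 plugged CPUs, 8 more hot-pluggable, nr_cpus=12 and maxcpus=2, the
model yields 12 possible CPUs with 2 online after boot. The other plugged
CPUs could still be brought online afterwards, which is the point the new
maxcpus text makes.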



[PATCH v10 0/3] Documentation: add description of enable multi-cpus support for kdump

2016-09-18 Thread Baoquan He
This is the v10 post. In this patchset, patch 1/3 is added to give more
details about nr_cpus and maxcpus in kernel-parameters.txt. Jonathan
suggested this because their description was unclear, so people couldn't
see the exact difference between them. Otherwise there are no big changes
to 2/3 and 3/3, which comprise the old post.

Please see the link below for the previous changelog.
https://lkml.org/lkml/2016/8/17/646

Baoquan He (1):
  docs: kernel-parameter : Improve the description of nr_cpus and
maxcpus

Zhou Wenjian (2):
  Documentation: kdump: remind user of nr_cpus
  Documentation: kdump: add description of enable multi-cpus support

 Documentation/kdump/kdump.txt   |  9 +
 Documentation/kernel-parameters.txt | 20 +---
 2 files changed, 22 insertions(+), 7 deletions(-)

-- 
2.5.5



[PATCH v10 2/3] Documentation: kdump: remind user of nr_cpus

2016-09-18 Thread Baoquan He
From: Zhou Wenjian 

nr_cpus can help to save memory, so we should remind the user of it.

Signed-off-by: Zhou Wenjian 
Acked-by: Baoquan He 
Acked-by: Xunlei Pang 
Signed-off-by: Baoquan He 
---
 Documentation/kdump/kdump.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
index 88ff63d..f7ef340 100644
--- a/Documentation/kdump/kdump.txt
+++ b/Documentation/kdump/kdump.txt
@@ -393,6 +393,8 @@ Notes on loading the dump-capture kernel:
 * We generally don' have to bring up a SMP kernel just to capture the
   dump. Hence generally it is useful either to build a UP dump-capture
   kernel or specify maxcpus=1 option while loading dump-capture kernel.
+  Note, though maxcpus always works, you had better replace it with
+  nr_cpus to save memory if supported by the current ARCH, such as x86.
 
 * For s390x there are two kdump modes: If a ELF header is specified with
   the elfcorehdr= kernel parameter, it is used by the kdump kernel as it
-- 
2.5.5



[PATCH v10 3/3] Documentation: kdump: add description of enable multi-cpus support

2016-09-18 Thread Baoquan He
From: Zhou Wenjian 

Multi-cpu support is useful to improve the performance of kdump in
some cases. So add a description of how to enable multi-cpu support in
the dump-capture kernel.

Signed-off-by: Zhou Wenjian 
Acked-by: Baoquan He 
Acked-by: Xunlei Pang 
Signed-off-by: Baoquan He 
---
 Documentation/kdump/kdump.txt | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
index f7ef340..b0eb27b 100644
--- a/Documentation/kdump/kdump.txt
+++ b/Documentation/kdump/kdump.txt
@@ -396,6 +396,13 @@ Notes on loading the dump-capture kernel:
   Note, though maxcpus always works, you had better replace it with
   nr_cpus to save memory if supported by the current ARCH, such as x86.
 
+* You should enable multi-cpu support in dump-capture kernel if you intend
+  to use multi-thread programs with it, such as parallel dump feature of
+  makedumpfile. Otherwise, the multi-thread program may have a great
+  performance degradation. To enable multi-cpu support, you should bring up an
+  SMP dump-capture kernel and specify maxcpus/nr_cpus, disable_cpu_apicid=[X]
+  options while loading it.
+
 * For s390x there are two kdump modes: If a ELF header is specified with
   the elfcorehdr= kernel parameter, it is used by the kdump kernel as it
   is done on all other architectures. If no elfcorehdr= kernel parameter is
-- 
2.5.5



Re: [PATCH linux-firmware 09/12] WHENCE: Fix metadata for snd-soc-skl firmware

2016-09-18 Thread Vinod Koul
On Sun, Sep 18, 2016 at 03:03:21AM +0100, Ben Hutchings wrote:
> Fix filename 'intel/dsp_fw_bxtn.bin'.  List all the files and their
> versions, not just the symlinks.  Delete the non-standard 'md5sum'
> fields.

Acked-by: Vinod Koul 


-- 
~Vinod


Re: [PATCH v3 03/15] lockdep: Refactor lookup_chain_cache()

2016-09-18 Thread Byungchul Park
On Thu, Sep 15, 2016 at 10:33:46AM -0500, Nilay Vaish wrote:
> On 13 September 2016 at 04:45, Byungchul Park  wrote:
> > @@ -2215,6 +2178,75 @@ cache_hit:
> > return 1;
> >  }
> >
> > +/*
> > + * Look up a dependency chain.
> > + */
> > +static inline struct lock_chain *lookup_chain_cache(u64 chain_key)
> > +{
> > +   struct hlist_head *hash_head = chainhashentry(chain_key);
> > +   struct lock_chain *chain;
> > +
> > +   /*
> > +* We can walk it lock-free, because entries only get added
> > +* to the hash:
> > +*/
> > +   hlist_for_each_entry_rcu(chain, hash_head, entry) {
> > +   if (chain->chain_key == chain_key) {
> > +   debug_atomic_inc(chain_lookup_hits);
> > +   return chain;
> > +   }
> > +   }
> > +   return NULL;
> > +}
> 
> Byungchul,  do you think we should increment chain_lookup_misses
> before returning NULL from the above function?

Hello,

No, I don't think so.
It will be done in add_chain_cache().

Thank you,
Byungchul

> 
> --
> Nilay
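
The division of labour described in the reply, where the lookup only
counts hits and the miss is accounted for on the insertion path, can be
sketched with a small chained hash table. All names below (toy_lookup,
toy_add, the counters and the table sizes) are hypothetical. This is not
the lockdep code and it ignores RCU and locking entirely.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BUCKETS    8
#define TOY_MAX_CHAINS 16

struct toy_chain {
    uint64_t key;
    struct toy_chain *next;
};

static struct toy_chain *toy_buckets[TOY_BUCKETS];
static struct toy_chain toy_pool[TOY_MAX_CHAINS];
static size_t toy_used;
static unsigned long toy_hits, toy_misses;

/* Pure lookup: counts a hit when found, counts nothing when not found. */
static struct toy_chain *toy_lookup(uint64_t key)
{
    struct toy_chain *c;

    for (c = toy_buckets[key % TOY_BUCKETS]; c; c = c->next) {
        if (c->key == key) {
            toy_hits++;
            return c;
        }
    }
    return NULL;
}

/* Insertion path: this is where the miss is accounted for. */
static struct toy_chain *toy_add(uint64_t key)
{
    struct toy_chain *c;

    if (toy_used == TOY_MAX_CHAINS)
        return NULL;    /* pool exhausted */

    c = &toy_pool[toy_used++];
    c->key = key;
    c->next = toy_buckets[key % TOY_BUCKETS];
    toy_buckets[key % TOY_BUCKETS] = c;
    toy_misses++;
    return c;
}
```

A caller that gets NULL from toy_lookup() and then calls toy_add() ends up
with exactly one miss recorded, so counting it in the lookup as well would
count it twice.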


Re: [PATCH 4/7 v3] sched: propagate load during synchronous attach/detach

2016-09-18 Thread Wanpeng Li
2016-09-12 15:47 GMT+08:00 Vincent Guittot :
> When a task moves from/to a cfs_rq, we set a flag which is then used to
> propagate the change at parent level (sched_entity and cfs_rq) during
> next update. If the cfs_rq is throttled, the flag will stay pending until
> the cfs_rw is unthrottled.
>
> For propagating the utilization, we copy the utilization of child cfs_rq to
> the sched_entity.
>
> For propagating the load, we have to take into account the load of the
> whole task group in order to evaluate the load of the sched_entity.
> Similarly to what was done before the rewrite of PELT, we add a correction
> factor in case the task group's load is less than its share so it will
> contribute the same load of a task of equal weight.
>
> Signed-off-by: Vincent Guittot 
> ---
>  kernel/sched/fair.c  | 170 ++-
>  kernel/sched/sched.h |   1 +
>  2 files changed, 170 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0aa1d7d..e4015f6 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3017,6 +3017,132 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq)
> }
>  }
>
> +#ifdef CONFIG_FAIR_GROUP_SCHED
> +/* Take into account change of utilization of a child task group */
> +static inline void
> +update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +{
> +   struct cfs_rq *gcfs_rq =  group_cfs_rq(se);
> +   long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
> +
> +   /* Nothing to update */
> +   if (!delta)
> +   return;
> +
> +   /* Set new sched_entity's utilizaton */

s/utilizaton/utilization

> +   se->avg.util_avg = gcfs_rq->avg.util_avg;
> +   se->avg.util_sum = se->avg.util_avg * LOAD_AVG_MAX;
> +
> +   /* Update parent cfs_rq utilization */
> +   cfs_rq->avg.util_avg =  max_t(long, cfs_rq->avg.util_avg + delta, 0);
> +   cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * LOAD_AVG_MAX;
> +}
> +
> +/* Take into account change of load of a child task group */
> +static inline void
> +update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +{
> +   struct cfs_rq *gcfs_rq = group_cfs_rq(se);
> +   long delta, load = gcfs_rq->avg.load_avg;
> +
> +   /* If the load of group cfs_rq is null, the load of the
> +* sched_entity will also be null so we can skip the formula
> +*/
> +   if (load) {
> +   long tg_load;
> +
> +   /* Get tg's load and ensure tg_load > 0 */
> +   tg_load = atomic_long_read(&gcfs_rq->tg->load_avg) + 1;
> +
> +   /* Ensure tg_load >= load and updated with current load*/
> +   tg_load -= gcfs_rq->tg_load_avg_contrib;
> +   tg_load += load;
> +
> +   /* scale gcfs_rq's load into tg's shares*/
> +   load *= scale_load_down(gcfs_rq->tg->shares);
> +   load /= tg_load;
> +
> +   /*
> +* we need to compute a correction term in the case that the
> +* task group is consuming <1 cpu so that we would contribute
> +* the same load as a task of equal weight.
> +   */
> +   if (tg_load < scale_load_down(gcfs_rq->tg->shares)) {
> +   load *= tg_load;
> +   load /= scale_load_down(gcfs_rq->tg->shares);
> +   }
> +   }
> +
> +   delta = load - se->avg.load_avg;
> +
> +   /* Nothing to update */
> +   if (!delta)
> +   return;
> +
> +   /* Set new sched_entity's load */
> +   se->avg.load_avg = load;
> +   se->avg.load_sum = se->avg.load_avg * LOAD_AVG_MAX;
> +
> +   /* Update parent cfs_rq load */
> +   cfs_rq->avg.load_avg = max_t(long, cfs_rq->avg.load_avg + delta, 0);
> +   cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * LOAD_AVG_MAX;
> +}
> +
> +static inline void set_tg_cfs_propagate(struct cfs_rq *cfs_rq)
> +{
> +   /* set cfs_rq's flag */
> +   cfs_rq->propagate_avg = 1;
> +}
> +
> +static inline int test_and_clear_tg_cfs_propagate(struct sched_entity *se)
> +{
> +   /* Get my cfs_rq */
> +   struct cfs_rq *cfs_rq = group_cfs_rq(se);
> +
> +   /* Nothing to propagate */
> +   if (!cfs_rq->propagate_avg)
> +   return 0;
> +
> +   /* Clear my cfs_rq's flag */
> +   cfs_rq->propagate_avg = 0;
> +
> +   return 1;
> +}
> +
> +/* Update task and its cfs_rq load average */
> +static inline int propagate_entity_load_avg(struct sched_entity *se)
> +{
> +   struct cfs_rq *cfs_rq;
> +
> +   if (entity_is_task(se))
> +   return 0;
> +
> +   if (!test_and_clear_tg_cfs_propagate(se))
> +   return 0;
> +
> +   /* Get parent cfs_rq */
> +   cfs_rq = cfs_rq_of(se);
> +
> +   /* Propagate to parent */
> +   set_tg_cfs_propagate(cfs_rq);
> +
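
The scaling done in the quoted update_tg_cfs_load() can be followed with
plain integers. The function below repeats the quoted arithmetic as a
standalone sketch. Its name and parameter names are hypothetical, and
shares is assumed to be already scaled down (the quoted code calls
scale_load_down() for that).

```c
/*
 * Mirrors the arithmetic of the quoted hunk with plain longs.
 * Returns the load the group's sched_entity would be given.
 */
static long toy_group_entity_load(long gcfs_load, long tg_load_avg,
                                  long tg_load_avg_contrib, long shares)
{
    long load = gcfs_load;
    long tg_load;

    /* A group runqueue with no load contributes no load. */
    if (!load)
        return 0;

    tg_load = tg_load_avg + 1;        /* keep the divisor above zero */
    tg_load -= tg_load_avg_contrib;   /* drop the stale contribution */
    tg_load += load;                  /* add the current one */

    /* Scale this runqueue's load into the group's shares. */
    load *= shares;
    load /= tg_load;

    /* Correction when the whole group's load is below its shares. */
    if (tg_load < shares) {
        load *= tg_load;
        load /= shares;
    }
    return load;
}
```

For example, with gcfs_load=100, tg_load_avg=99, tg_load_avg_contrib=100
and shares=1024, tg_load becomes 100, the scaled load is 1024, and the
correction brings it back to 100, the same load a single task of that
weight would contribute.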

Re: [PATCH 4/7 v3] sched: propagate load during synchronous attach/detach

2016-09-18 Thread Wanpeng Li
2016-09-12 15:47 GMT+08:00 Vincent Guittot :
> When a task moves from/to a cfs_rq, we set a flag which is then used to
> propagate the change at parent level (sched_entity and cfs_rq) during
> next update. If the cfs_rq is throttled, the flag will stay pending until
> the cfs_rw is unthrottled.
>
> For propagating the utilization, we copy the utilization of child cfs_rq to
> the sched_entity.
>
> For propagating the load, we have to take into account the load of the
> whole task group in order to evaluate the load of the sched_entity.
> Similarly to what was done before the rewrite of PELT, we add a correction
> factor in case the task group's load is less than its share so it will
> contribute the same load of a task of equal weight.
>
> Signed-off-by: Vincent Guittot 
> ---
>  kernel/sched/fair.c  | 170 
> ++-
>  kernel/sched/sched.h |   1 +
>  2 files changed, 170 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0aa1d7d..e4015f6 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3017,6 +3017,132 @@ static inline void cfs_rq_util_change(struct cfs_rq 
> *cfs_rq)
> }
>  }
>
> +#ifdef CONFIG_FAIR_GROUP_SCHED
> +/* Take into account change of utilization of a child task group */
> +static inline void
> +update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +{
> +   struct cfs_rq *gcfs_rq =  group_cfs_rq(se);
> +   long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
> +
> +   /* Nothing to update */
> +   if (!delta)
> +   return;
> +
> +   /* Set new sched_entity's utilizaton */

s/utilizaton/utilization

> +   se->avg.util_avg = gcfs_rq->avg.util_avg;
> +   se->avg.util_sum = se->avg.util_avg * LOAD_AVG_MAX;
> +
> +   /* Update parent cfs_rq utilization */
> +   cfs_rq->avg.util_avg =  max_t(long, cfs_rq->avg.util_avg + delta, 0);
> +   cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * LOAD_AVG_MAX;
> +}
> +
> +/* Take into account change of load of a child task group */
> +static inline void
> +update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +{
> +	struct cfs_rq *gcfs_rq = group_cfs_rq(se);
> +	long delta, load = gcfs_rq->avg.load_avg;
> +
> +	/* If the load of group cfs_rq is null, the load of the
> +	 * sched_entity will also be null so we can skip the formula
> +	 */
> +	if (load) {
> +		long tg_load;
> +
> +		/* Get tg's load and ensure tg_load > 0 */
> +		tg_load = atomic_long_read(&gcfs_rq->tg->load_avg) + 1;
> +
> +		/* Ensure tg_load >= load and updated with current load*/
> +		tg_load -= gcfs_rq->tg_load_avg_contrib;
> +		tg_load += load;
> +
> +		/* scale gcfs_rq's load into tg's shares*/
> +		load *= scale_load_down(gcfs_rq->tg->shares);
> +		load /= tg_load;
> +
> +		/*
> +		 * we need to compute a correction term in the case that the
> +		 * task group is consuming <1 cpu so that we would contribute
> +		 * the same load as a task of equal weight.
> +		 */
> +		if (tg_load < scale_load_down(gcfs_rq->tg->shares)) {
> +			load *= tg_load;
> +			load /= scale_load_down(gcfs_rq->tg->shares);
> +		}
> +	}
> +
> +	delta = load - se->avg.load_avg;
> +
> +	/* Nothing to update */
> +	if (!delta)
> +		return;
> +
> +	/* Set new sched_entity's load */
> +	se->avg.load_avg = load;
> +	se->avg.load_sum = se->avg.load_avg * LOAD_AVG_MAX;
> +
> +	/* Update parent cfs_rq load */
> +	cfs_rq->avg.load_avg = max_t(long, cfs_rq->avg.load_avg + delta, 0);
> +	cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * LOAD_AVG_MAX;
> +}
> +
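
The scaling and the correction term in update_tg_cfs_load() are easier to follow with numbers, so here is a small sketch of just the arithmetic. The helper names are mine, the atomic read and scale_load_down() are replaced by plain long arguments, and the values in the note below are made up:

```c
/*
 * Rebuild the task group's total load the way the patch does: start
 * from the published tg load (+1 so it is never zero), drop this
 * cfs_rq's stale contribution and add its current load.
 */
long group_total_load(long tg_load_avg, long contrib, long load)
{
	return tg_load_avg + 1 - contrib + load;
}

/*
 * Scale a child cfs_rq load into the group's shares, then apply the
 * correction when the whole group is using less than its share.
 * tg_load must be > 0.
 */
long scale_group_load(long load, long tg_load, long shares)
{
	/* A null group load gives a null entity load */
	if (!load)
		return 0;

	/* This cfs_rq's fraction of the group, expressed in shares */
	load = load * shares / tg_load;

	/* Group consumes less than its share: scale back down */
	if (tg_load < shares)
		load = load * tg_load / shares;

	return load;
}
```

With shares = 1024: a cfs_rq holding all of a group's load of 512 is first scaled up to 1024 and then corrected back to 512, which is what a single task with that load would contribute. With a group total of 2048, the same cfs_rq load of 512 is a quarter of the group and maps to 256.
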
> +static inline void set_tg_cfs_propagate(struct cfs_rq *cfs_rq)
> +{
> +	/* set cfs_rq's flag */
> +	cfs_rq->propagate_avg = 1;
> +}
> +
> +static inline int test_and_clear_tg_cfs_propagate(struct sched_entity *se)
> +{
> +	/* Get my cfs_rq */
> +	struct cfs_rq *cfs_rq = group_cfs_rq(se);
> +
> +	/* Nothing to propagate */
> +	if (!cfs_rq->propagate_avg)
> +		return 0;
> +
> +	/* Clear my cfs_rq's flag */
> +	cfs_rq->propagate_avg = 0;
> +
> +	return 1;
> +}
> +
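
As I understand the flag handling, a pending change climbs one level per update. A simplified model of that, with a hypothetical grp_rq struct that keeps only the flag and a parent pointer (none of these names exist in the kernel), would be:

```c
#include <stddef.h>

/* Hypothetical, reduced group runqueue: pending flag plus parent link. */
struct grp_rq {
	int propagate_avg;
	struct grp_rq *parent;
};

/* Mark a runqueue as having a pending attach/detach change. */
void set_propagate(struct grp_rq *rq)
{
	rq->propagate_avg = 1;
}

/* Return 1 and clear the flag if a change was pending, else 0. */
int test_and_clear_propagate(struct grp_rq *rq)
{
	if (!rq->propagate_avg)
		return 0;

	rq->propagate_avg = 0;
	return 1;
}

/*
 * One update at one level: consume the child's flag and, if it was
 * set, hand it to the parent so the next update there continues the
 * walk. Simplification: the top level (no parent) is left untouched.
 */
int propagate_one_level(struct grp_rq *child)
{
	if (!child->parent)
		return 0;

	if (!test_and_clear_propagate(child))
		return 0;

	set_propagate(child->parent);
	return 1;
}
```

In this model a flag set on a leaf reaches the root after one update per level, and a level whose update is skipped (the throttled case in the changelog) simply keeps its flag until the next update runs.
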
> +/* Update task and its cfs_rq load average */
> +static inline int propagate_entity_load_avg(struct sched_entity *se)
> +{
> +	struct cfs_rq *cfs_rq;
> +
> +	if (entity_is_task(se))
> +		return 0;
> +
> +	if (!test_and_clear_tg_cfs_propagate(se))
> +		return 0;
> +
> +	/* Get parent cfs_rq */
> +	cfs_rq = cfs_rq_of(se);
> +
> +	/* Propagate to parent */
> +	set_tg_cfs_propagate(cfs_rq);
> +
> +	/* Update utilization */
> +   

Re: [PATCH] random: Fix kernel panic due to system_wq use before init

2016-09-18 Thread Waiman Long

On 09/14/2016 03:19 PM, Linus Torvalds wrote:

On Wed, Sep 14, 2016 at 12:14 PM, Waiman Long  wrote:

In the stack backtrace above, the kernel hadn't even reached SMP boot after
about 50s. That was extremely slow. I tried the 4.7.3 kernel and it booted
up fine. So I suspect that there may be too many interrupts going on and it
consumes most of the CPU cycles. The prime suspect is the random driver, I
think.

Any chance of bisecting it at least partially? The random driver
doesn't do interrupts itself, it just gets called by other drivers
doing interrupts. So if there are too many of them, that would be
something else..

Linus


I have finally finished bisecting the problem. I was wrong in saying 
that the 4.7.3 kernel had no problem; it did. There were some 
slight differences between the 4.8 and 4.7 kernel config files that I 
used. After some further testing, I found that the bootup problem 
only happened when the following kernel config option was defined:


CONFIG_EFI_MIXED=y

Bisecting revealed that the following 4.6 patch was the first patch that 
had this problem:


c9f2a9a65e4855b74d92cdad688f6ee4a1a323ff
[PATCH] x86/efi: Hoist page table switching code into efi_call_virt()

I did testing on my test system with three different partition sizes:
1) 16-socket Broadwell-EX with 12TB memory
2) 8-socket Broadwell-EX with 6TB memory
3) 4-socket Broadwell-EX with 3TB memory

Only the 16-socket and 8-socket configurations had this problem. I am 
not sure if over 4TB of main memory is a factor or not.


I have attached several slightly different panic messages that 
occurred during my testing. I know little about the EFI code, so I am not 
sure if it is a kernel problem, a firmware problem or a combination of 
both. Hopefully someone with knowledge of this code will shed light on 
this problem.


Cheers,
Longman

commit 1bb6936473c07b5a7c8daced1000893b7145bb14
Author: Ard Biesheuvel 
Date:   Mon Feb 1 22:07:00 2016 +

efi: Runtime-wrapper: Get rid of the rtc_lock spinlock

The rtc_lock spinlock aims to serialize access to the CMOS RTC
between the UEFI firmware and the kernel drivers that use it
directly. However, x86 is the only arch that performs such
direct accesses, and that never uses the time related UEFI
runtime services. Since no other UEFI enlightened architectures
have a legacy CMOS RTC anyway, we can remove the rtc_lock
spinlock entirely.

Signed-off-by: Ard Biesheuvel 
Signed-off-by: Matt Fleming 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Denys Vlasenko 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/1454364428-494-7-git-send-email-matt@codeblue
Signed-off-by: Ingo Molnar 

-
[0.00] ACPI: X2APIC_NMI (uid[0x16b] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x16c] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x16d] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x16e] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x16f] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x170] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x171] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x172] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x173] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x174] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x175] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x176] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x177] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x178] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x179] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x17a] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x17b] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x17c] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x17d] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x17e] high level lint[0x1])
[0.00] ACPI: X2APIC_NMI (uid[0x17f] high level lint[0x1])
[0.00] IOAPIC[0]: apic_id 8, version 32, address 0xfec0, GSI 0-23
[0.00] IOAPIC[1]: apic_id 9, version 32, address 0xfec01000, GSI 24-47
[0.00] IOAPIC[2]: apic_id 10, version 32, address 0xfec04000, GSI 48-71
[0.00] IOAPIC[3]: apic_id 11, version 32, address 0xfec08000, GSI 72-95
[0.00] IOAPIC[4]: apic_id 12, version 32, address 0xfec09000, GSI 96-119
[0.00] IOAPIC[5]: apic_id 13, version 32, address 0xfec0c000, GSI 120-143
[0.00] IOAPIC[6]: apic_id 14, version 32, address 0xfec1, GSI 144-167
[0.00] IOAPIC[7]: apic_id 15, version 32, address 0xfec11000, GSI 168-191
[0.00] IOAPIC[8]: apic_id 16, 

[RESEND PATCH 2/3] vcodec: mediatek: Add Mediatek JPEG Decoder Driver

2016-09-18 Thread Rick Chang
Add v4l2 driver for Mediatek JPEG Decoder

Signed-off-by: Rick Chang 
Signed-off-by: Minghsiu Tsai 
---
 drivers/media/platform/Kconfig   |   15 +
 drivers/media/platform/Makefile  |2 +
 drivers/media/platform/mtk-jpeg/Makefile |4 +
 drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c  | 1271 ++
 drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h  |  141 +++
 drivers/media/platform/mtk-jpeg/mtk_jpeg_hw.c|  417 +++
 drivers/media/platform/mtk-jpeg/mtk_jpeg_hw.h|   91 ++
 drivers/media/platform/mtk-jpeg/mtk_jpeg_parse.c |  160 +++
 drivers/media/platform/mtk-jpeg/mtk_jpeg_parse.h |   25 +
 drivers/media/platform/mtk-jpeg/mtk_jpeg_reg.h   |   58 +
 10 files changed, 2184 insertions(+)
 create mode 100644 drivers/media/platform/mtk-jpeg/Makefile
 create mode 100644 drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
 create mode 100644 drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
 create mode 100644 drivers/media/platform/mtk-jpeg/mtk_jpeg_hw.c
 create mode 100644 drivers/media/platform/mtk-jpeg/mtk_jpeg_hw.h
 create mode 100644 drivers/media/platform/mtk-jpeg/mtk_jpeg_parse.c
 create mode 100644 drivers/media/platform/mtk-jpeg/mtk_jpeg_parse.h
 create mode 100644 drivers/media/platform/mtk-jpeg/mtk_jpeg_reg.h

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index f98ed3f..4769a56 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -162,6 +162,21 @@ config VIDEO_CODA
   Coda is a range of video codec IPs that supports
   H.264, MPEG-4, and other video formats.
 
+config VIDEO_MEDIATEK_JPEG
+	tristate "Mediatek JPEG Codec driver"
+	depends on MTK_IOMMU_V1 || COMPILE_TEST
+	depends on VIDEO_DEV && VIDEO_V4L2
+	depends on ARCH_MEDIATEK || COMPILE_TEST
+	depends on HAS_DMA
+	select VIDEOBUF2_DMA_CONTIG
+	select V4L2_MEM2MEM_DEV
+	---help---
+	  Mediatek jpeg codec driver provides HW capability to decode
+	  JPEG format
+
+	  To compile this driver as a module, choose M here: the
+	  module will be called mtk-jpeg
+
 config VIDEO_MEDIATEK_VPU
 	tristate "Mediatek Video Processor Unit"
 	depends on VIDEO_DEV && VIDEO_V4L2 && HAS_DMA
diff --git a/drivers/media/platform/Makefile b/drivers/media/platform/Makefile
index 40b18d1..351d979 100644
--- a/drivers/media/platform/Makefile
+++ b/drivers/media/platform/Makefile
@@ -66,3 +66,5 @@ ccflags-y += -I$(srctree)/drivers/media/i2c
 obj-$(CONFIG_VIDEO_MEDIATEK_VPU)   += mtk-vpu/
 
 obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC)+= mtk-vcodec/
+
+obj-$(CONFIG_VIDEO_MEDIATEK_JPEG)  += mtk-jpeg/
diff --git a/drivers/media/platform/mtk-jpeg/Makefile b/drivers/media/platform/mtk-jpeg/Makefile
new file mode 100644
index 000..59528fa
--- /dev/null
+++ b/drivers/media/platform/mtk-jpeg/Makefile
@@ -0,0 +1,4 @@
+mtk_jpeg-objs := mtk_jpeg_core.o mtk_jpeg_hw.o mtk_jpeg_parse.o
+obj-$(CONFIG_VIDEO_MEDIATEK_JPEG) += mtk_jpeg.o
+
+ccflags-y += -I$(srctree)/drivers/media/platform/mtk-videobuf
diff --git a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
new file mode 100644
index 000..3804a48
--- /dev/null
+++ b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
@@ -0,0 +1,1271 @@
+/*
+ * Copyright (c) 2016 MediaTek Inc.
+ * Author: Ming Hsiu Tsai 
+ * Rick Chang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mtk_jpeg_hw.h"
+#include "mtk_jpeg_core.h"
+#include "mtk_jpeg_parse.h"
+
+static struct mtk_jpeg_fmt mtk_jpeg_formats[] = {
+	{
+		.name		= "JPEG JFIF",
+		.fourcc		= V4L2_PIX_FMT_JPEG,
+		.colplanes	= 1,
+		.flags		= MTK_JPEG_FMT_FLAG_DEC_OUTPUT,
+	},
+	{
+		.name		= "YUV 4:2:0 non-contiguous 3-planar, Y/Cb/Cr",
+		.fourcc		= V4L2_PIX_FMT_YUV420M,
+		.h_sample	= {4, 2, 2},
+		.v_sample	= {4, 2, 2},
+		.colplanes	= 3,
+		.h_align	= 5,
+		.v_align	= 4,
+		.flags		= MTK_JPEG_FMT_FLAG_DEC_CAPTURE,
+	},
+	{
+		.name		= "YUV 4:2:2 non-contiguous 3-planar, Y/Cb/Cr",
+		.fourcc		= 

[RESEND PATCH 3/3] arm: dts: mt2701: Add node for Mediatek JPEG Decoder

2016-09-18 Thread Rick Chang
Signed-off-by: Rick Chang 
Signed-off-by: Minghsiu Tsai 
---
This patch depends on: 
  CCF "arm: dts: mt2701: Add clock controller device nodes"[1]
  power domain patch "Mediatek MT2701 SCPSYS power domain support v7"[2]
  iommu and smi "Add the dtsi node of iommu and smi for mt2701"[3]

[1] https://patchwork.kernel.org/patch/9109081
[2] http://lists.infradead.org/pipermail/linux-mediatek/2016-May/005429.html
[3] https://patchwork.kernel.org/patch/9164013/
---
 arch/arm/boot/dts/mt2701.dtsi | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/arm/boot/dts/mt2701.dtsi b/arch/arm/boot/dts/mt2701.dtsi
index d550d36..a9838bd 100644
--- a/arch/arm/boot/dts/mt2701.dtsi
+++ b/arch/arm/boot/dts/mt2701.dtsi
@@ -284,6 +284,20 @@
 		power-domains = < MT2701_POWER_DOMAIN_ISP>;
 	};
 
+	jpegdec: jpegdec@15004000 {
+		compatible = "mediatek,mt2701-jpgdec";
+		reg = <0 0x15004000 0 0x1000>;
+		interrupts = ;
+		clocks =  < CLK_IMG_JPGDEC_SMI>,
+			  < CLK_IMG_JPGDEC>;
+		clock-names = "jpgdec-smi",
+			      "jpgdec";
+		power-domains = < MT2701_POWER_DOMAIN_ISP>;
+		mediatek,larb = <>;
+		iommus = < MT2701_M4U_PORT_JPGDEC_WDMA>,
+			 < MT2701_M4U_PORT_JPGDEC_BSDMA>;
+	};
+
 	vdecsys: syscon@1600 {
 		compatible = "mediatek,mt2701-vdecsys", "syscon";
 		reg = <0 0x1600 0 0x1000>;
-- 
1.9.1



[RESEND PATCH 1/3] dt-bindings: mediatek: Add a binding for Mediatek JPEG Decoder

2016-09-18 Thread Rick Chang
Add a DT binding documentation for Mediatek JPEG Decoder of
MT2701 SoC.

Signed-off-by: Rick Chang 
Signed-off-by: Minghsiu Tsai 
---
 .../bindings/media/mediatek-jpeg-codec.txt | 35 ++
 1 file changed, 35 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/mediatek-jpeg-codec.txt

diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-codec.txt b/Documentation/devicetree/bindings/media/mediatek-jpeg-codec.txt
new file mode 100644
index 000..514e656
--- /dev/null
+++ b/Documentation/devicetree/bindings/media/mediatek-jpeg-codec.txt
@@ -0,0 +1,35 @@
+* Mediatek JPEG Codec
+
+Mediatek JPEG Codec device driver is a v4l2 driver which can decode
+JPEG-encoded video frames.
+
+Required properties:
+  - compatible : "mediatek,mt2701-jpgdec"
+  - reg : Physical base address of the jpeg codec registers and length of
+    memory mapped region.
+  - interrupts : interrupt number to the cpu.
+  - clocks : clock name from clock manager
+  - clock-names: the clocks of the jpeg codec H/W
+  - power-domains : a phandle to the power domain.
+  - mediatek,larb : must contain the larbs of the current platform
+  - iommus : The Mediatek IOMMU H/W has fixed associations with the
+    multimedia H/W, and there is only one multimedia iommu domain.
+    In "iommus = < portid>", "portid" comes from
+    dt-bindings/iommu/mt2701-iommu-port.h; listing a portid enables the
+    iommu for that port. The iommu stays disabled for a port whose
+    "< portid>" is not added.
+
+Example:
+	jpegdec: jpegdec@15004000 {
+		compatible = "mediatek,mt2701-jpgdec";
+		reg = <0 0x15004000 0 0x1000>;
+		interrupts = ;
+		clocks =  < CLK_IMG_JPGDEC_SMI>,
+			  < CLK_IMG_JPGDEC>;
+		clock-names = "jpgdec-smi",
+			      "jpgdec";
+		power-domains = < MT2701_POWER_DOMAIN_ISP>;
+		mediatek,larb = <>;
+		iommus = < MT2701_M4U_PORT_JPGDEC_WDMA>,
+			 < MT2701_M4U_PORT_JPGDEC_BSDMA>;
+	};
-- 
1.9.1


