Re: [PATCH v2 2/3] perf annotate: Introduce the new source code view

2017-03-02 Thread Namhyung Kim
Hi Peter,

On Wed, Mar 01, 2017 at 04:07:46PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 01, 2017 at 11:56:39PM +0900, Namhyung Kim wrote:
> 
> > It's a kind of user experience issue.  We provide the asm-only and
> > asm+source annotation, and I think it'd be nice to add source-only
> > option.  And I remember that it was requested some time ago..
> 
> Thing is, an optimizing compiler -- that same beast that ensures your
> objdump -S output is such a garbled mess -- can generate code that
> becomes very hard to relate to the original source code.

I understand that.  Maybe it's not 100% accurate, but it still has
valuable information.  And I think the source-only view can give more
readable output using that info.  Also, I guess many developers are already
aware of the effects of optimizing compilers.

> 
> I'm really sceptical the source line only view is very useful; maybe if
> you build with -O0, but then, if you do that you're not bothered with
> performance.

Even with an optimizing compiler, it can be helpful to get an overview of
which parts of the code are bottlenecks, IMHO.  After that, one can look at
the asm to dig deeper into the problem, if needed.

Thanks,
Namhyung


RE: [PATCH v2] staging: mkspec: added aarch64 ifarch case.

2017-03-02 Thread James Tau
Hi Will,

This patch (http://lkml.kernel.org/r/20161122213434.14788-1-mma...@suse.com) 
looks better.  It has what Linus calls "good taste". ;-)  I didn't see it in 
mmarek's kbuild branches (for-next,rc-fixes), however.  Still making its way 
there?

But it doesn't quite fix the native 'make rpm' build completely.  While it gets 
beyond the point at which 'make rpm' fails without my patch, it exposes another 
issue that I am debugging right now:

  ld -EL -r  -T ./scripts/module-common.lds --build-id  -o net/unix/unix.ko 
net/unix/unix.o net/unix/unix.mod.o ;  true
make -f ./scripts/Makefile.fwinst obj=firmware __fw_modbuild
error: Bad exit status from /var/tmp/rpm-tmp.YcfiLf (%build)
Bad exit status from /var/tmp/rpm-tmp.YcfiLf (%build)

RPM build errors:
make[1]: *** [rpm] Error 1
make: *** [rpm] Error 2
  
If I succeed in root-causing the problem, I'll submit a patch for that (if 
another doesn't beat me to it).  And assuming that patch is accepted for having 
Linusian "good taste", then it, and 
http://lkml.kernel.org/r/20161122213434.14788-1-mma...@suse.com, will make my 
current submitted patch extraneous.

Thanks,
James

-----Original Message-----
From: Will Deacon [mailto:will.dea...@arm.com] 
Sent: Wednesday, March 1, 2017 11:06 PM
To: James Tau 
Cc: linux-kernel@vger.kernel.org; linux-kbu...@vger.kernel.org; 
mma...@suse.com; catalin.mari...@arm.com; Chris Metcalf 
Subject: Re: [PATCH v2] staging: mkspec: added aarch64 ifarch case.

On Wed, Mar 01, 2017 at 09:24:14AM -0800, James Tau wrote:
> Patch attempting to fix native 'make rpm' build on ARM64 machines by 
> adding an "ifarch aarch64" case.  Without it, build fails because the 
> 'cp ...' in the default case can't find the built image.
> 
> Signed-off-by: James Tau 
> ---
>  scripts/package/mkspec | 4 
>  1 file changed, 4 insertions(+)

Is this the same issue that was fixed by:

http://lkml.kernel.org/r/20161122213434.14788-1-mma...@suse.com

?

I was assuming that Michael was going to queue those, but I could be wrong.

Will


[RESEND PATCH v5 04/11 (Missed 04/11 in PATCH v5 series)] Documentation: perf: hisi: Documentation for HiP05/06/07 PMU event counting.

2017-03-02 Thread Anurup M
Documentation for perf usage and Hisilicon SoC PMU uncore events.
The Hisilicon SOC has event counters for hardware modules like
L3 cache, Miscellaneous node etc. These events are all uncore.

Signed-off-by: Anurup M 
Signed-off-by: Shaokun Zhang 
---
 Documentation/perf/hisi-pmu.txt | 76 +
 1 file changed, 76 insertions(+)
 create mode 100644 Documentation/perf/hisi-pmu.txt

diff --git a/Documentation/perf/hisi-pmu.txt b/Documentation/perf/hisi-pmu.txt
new file mode 100644
index 000..e3ac562
--- /dev/null
+++ b/Documentation/perf/hisi-pmu.txt
@@ -0,0 +1,76 @@
+Hisilicon SoC PMU (Performance Monitoring Unit)
+
+The Hisilicon SoC HiP05/06/07 chips consist of various independent system
+device PMUs such as L3 cache (L3C) and Miscellaneous Nodes (MN).
+These PMU devices are independent and have hardware logic to gather
+statistics and performance information.
+
+HiP0x chips consist of multiple CPU and IO dies. The CPU die is
+called a Super CPU cluster (SCCL) and includes 16 CPU cores. Every SCCL
+is further divided into CPU clusters (CCL) of 4 CPU cores each.
+Each SCCL has 1 L3 cache and 1 MN unit.
+
+The L3 cache is shared by all CPU cores in a CPU die. The L3C has four banks
+(or instances). Each bank or instance of L3C has eight 32-bit counter
+registers and also event control registers. The HiP05/06 chip L3 cache has
+22 statistics events. The HiP07 chip has 66 statistics events. These events
+are very useful for debugging.
+
+The MN module is also shared by all CPU cores in a CPU die. It receives
+barriers and DVM (Distributed Virtual Memory) messages from the CPU or SMMU,
+performs the required actions and returns response messages. These events
+are very useful for debugging. The MN has a total of 9 statistics events and
+supports four 32-bit counter registers in HiP05/06/07 chips.
+
+There is no memory mapping for L3 cache and MN registers. They can be accessed
+by using the Hisilicon djtag interface. The djtag in an SCCL is an independent
+module which connects to some modules in the SoC through the Debug Bus.
+
+Hisilicon SoC (HiP05/06/07) PMU driver
+--
+The HiP0x PMU driver shall register perf PMU drivers like L3 cache, MN, etc.
+The available events and configuration options shall be described in the sysfs.
+The "perf list" shall list the available events from sysfs.
+
+The L3 cache in an SCCL is divided into 4 banks. Each L3 cache bank has separate
+PMU registers for event counting and control. The L3 cache banks also do not
+have any CPU affinity, so each L3 cache bank is registered with perf as a
+separate PMU.
+The PMU name will appear in event listing as hisi_l3c<bank-id>_<scl-id>,
+where "bank-id" is the bank index (0 to 3) and "scl-id" is the SCCL identifier,
+e.g. hisi_l3c0_2/read_hit is the READ_HIT event of L3 cache bank #0 in SCCL ID #2.
+
+The MN in an SCCL is registered as a separate PMU with perf.
+The PMU name will appear in event listing as hisi_mn_<scl-id>,
+e.g. hisi_mn_2/read_req is the READ_REQUEST event of the MN of Super CPU cluster #2.
+
+The event code is represented by 12 bits.
+   i) event 0-11
+   The event code will be represented using the LSB 12 bits.
+
+The driver also provides a "cpumask" sysfs attribute, which shows the CPU core
+ID used to count the uncore PMU event.
+
+Example usage of perf:
+$# perf list
+hisi_l3c0_2/read_hit/ [kernel PMU event]
+--
+hisi_l3c1_2/write_hit/ [kernel PMU event]
+--
+hisi_l3c0_1/read_hit/ [kernel PMU event]
+--
+hisi_l3c0_1/write_hit/ [kernel PMU event]
+--
+hisi_mn_2/read_req/ [kernel PMU event]
+hisi_mn_2/write_req/ [kernel PMU event]
+--
+
+$# perf stat -a -e "hisi_l3c0_2/read_allocate/" sleep 5
+
+$# perf stat -A -C 0 -e "hisi_l3c0_2/read_allocate/" sleep 5
+
+The current driver does not support sampling, so "perf record" is unsupported.
+Attaching to a task is also unsupported, as the events are all uncore.
+
+Note: Please contact the maintainer for a complete list of events supported for
+the PMU devices in the SoC and its information if needed.
-- 
2.1.4
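
As a rough illustration of the PMU naming scheme and the 12-bit event code
described in the documentation above, here is a minimal userspace sketch in C.
It is not part of the patch. The helper names (hisi_event_code,
hisi_l3c_pmu_name, hisi_mn_pmu_name) and HISI_EVENT_MASK are hypothetical; the
only facts taken from the text are the name patterns, the bank index range
0 to 3, and that the event code sits in the low 12 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: the event code occupies bits 0-11 of the perf config value,
 * as stated in the documentation text. */
#define HISI_EVENT_MASK 0xfffu

/* Extract the 12-bit event code from a config value. */
static inline uint32_t hisi_event_code(uint64_t config)
{
	return (uint32_t)(config & HISI_EVENT_MASK);
}

/* Format an L3C PMU name of the form hisi_l3c<bank-id>_<scl-id>.
 * Returns the snprintf() result. */
static inline int hisi_l3c_pmu_name(char *buf, size_t len,
				    unsigned int bank_id, unsigned int scl_id)
{
	assert(bank_id <= 3);	/* the text gives bank indexes 0 to 3 */
	return snprintf(buf, len, "hisi_l3c%u_%u", bank_id, scl_id);
}

/* Format an MN PMU name of the form hisi_mn_<scl-id>. */
static inline int hisi_mn_pmu_name(char *buf, size_t len, unsigned int scl_id)
{
	return snprintf(buf, len, "hisi_mn_%u", scl_id);
}
```

Calling hisi_l3c_pmu_name(buf, sizeof(buf), 0, 2) yields the name used in the
"perf list" example for bank #0 of SCCL #2.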



Re: [PATCH v17 2/3] usb: USB Type-C connector class

2017-03-02 Thread Guenter Roeck

On 03/02/2017 07:35 PM, Peter Chen wrote:

On Tue, Feb 21, 2017 at 05:24:04PM +0300, Heikki Krogerus wrote:

+/* --- */
+/* Driver callbacks to report role updates */
+
+/**
+ * typec_set_data_role - Report data role change
+ * @port: The USB Type-C Port where the role was changed
+ * @role: The new data role
+ *
+ * This routine is used by the port drivers to report data role changes.
+ */
+void typec_set_data_role(struct typec_port *port, enum typec_data_role role)
+{
+   if (port->data_role == role)
+           return;
+
+   port->data_role = role;
+   sysfs_notify(&port->dev.kobj, NULL, "data_role");
+   kobject_uevent(&port->dev.kobj, KOBJ_CHANGE);
+}
+EXPORT_SYMBOL_GPL(typec_set_data_role);
+


Hi Heikki,

Have you tested this interface with real dual-role controller/board?


If it helps, my primary test system is a HP Chromebook 13 G1.


What interface you use when you receive this event to handle
dual-role switch? I wonder if a common dual-role class is
needed, then we can have a common user utility.


I don't really understand "What interface you use when you receive
this event". Can you explain ?



Eg, if "data_role" has changed, the udev can echo "data_role" to
/sys/class/usb-dual-role/role


That sounds like a kernel event delivered to user space via udev or
sysfs notification and returned back into the kernel through a sysfs
attribute. Do I understand that correctly ?

Thanks,
Guenter


Maybe we can enhance Roger's drd framework [1] to fulfill that.

[1] https://lwn.net/Articles/682531/
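
For readers following the sysfs notification question: a userspace listener
usually waits for a sysfs_notify() event by reading the attribute once and then
calling poll() with POLLPRI | POLLERR. The sketch below shows that generic
pattern in C. It is only an illustration under assumptions: the helper names
read_attr and wait_attr_change are hypothetical, and the attribute path a
caller would open (for example something like
/sys/class/typec/port0/data_role) is assumed, not confirmed by this thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Read an attribute from the start of the file into buf, NUL-terminate it
 * and strip one trailing newline. Returns the string length, or -1 on error. */
static inline ssize_t read_attr(int fd, char *buf, size_t len)
{
	ssize_t n;

	if (len == 0 || lseek(fd, 0, SEEK_SET) < 0)
		return -1;
	n = read(fd, buf, len - 1);
	if (n < 0)
		return -1;
	if (n > 0 && buf[n - 1] == '\n')
		n--;
	buf[n] = '\0';
	return n;
}

/* Wait until the kernel signals a change on the attribute or timeout_ms
 * expires. Returns 1 if notified, 0 on timeout, -1 on error.
 * The attribute must have been read once before polling. */
static inline int wait_attr_change(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI | POLLERR };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;
	return ret > 0;
}
```

A listener would open the attribute with O_RDONLY, call read_attr() once, then
loop over wait_attr_change() and read_attr() to pick up each new value.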





[git pull] vfs.git statx

2017-03-02 Thread Al Viro
Rebased, with fixup from -next folded in.  A branch matching what
was sitting in -next is #merge-2, and
; git cat-file commit rebased-statx|grep tree
tree 0e87b93d5902009d46d5faf25c3039ef8f668490
; git cat-file commit merge-2|grep tree
tree 0e87b93d5902009d46d5faf25c3039ef8f668490

IOW, the trees are identical, so we don't lose any testing done in -next
and rebased branch is obviously saner.

The following changes since commit bbe08c0a43e2c5ee3a00de68c0e867a08a9aa990:

  Merge branch 'for-linus-4.11' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs (2017-03-02 
16:03:00 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git rebased-statx

for you to fetch changes up to a528d35e8bfcc521d7cb70aaf03e1bd296c8493f:

  statx: Add a system call to make enhanced file info available (2017-03-02 
20:51:15 -0500)


David Howells (1):
  statx: Add a system call to make enhanced file info available

 Documentation/filesystems/Locking  |   3 +-
 Documentation/filesystems/vfs.txt  |   3 +-
 arch/x86/entry/syscalls/syscall_32.tbl |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl |   1 +
 drivers/base/devtmpfs.c|   3 +-
 drivers/block/loop.c   |   3 +-
 drivers/mtd/ubi/build.c|   2 +-
 drivers/mtd/ubi/kapi.c |   2 +-
 drivers/staging/lustre/lustre/llite/file.c |   9 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |   3 +-
 fs/9p/vfs_inode.c  |  10 +-
 fs/9p/vfs_inode_dotl.c |   5 +-
 fs/afs/inode.c |   8 +-
 fs/afs/internal.h  |   2 +-
 fs/bad_inode.c |   4 +-
 fs/btrfs/inode.c   |   6 +-
 fs/ceph/inode.c|   6 +-
 fs/ceph/super.h|   4 +-
 fs/cifs/cifsfs.h   |   2 +-
 fs/cifs/inode.c|   5 +-
 fs/coda/coda_linux.h   |   2 +-
 fs/coda/inode.c|   7 +-
 fs/ecryptfs/inode.c|  13 +-
 fs/exportfs/expfs.c|   3 +-
 fs/ext4/ext4.h |   3 +-
 fs/ext4/inode.c|   6 +-
 fs/f2fs/f2fs.h |   4 +-
 fs/f2fs/file.c |   6 +-
 fs/fat/fat.h   |   4 +-
 fs/fat/file.c  |   5 +-
 fs/fuse/dir.c  |   6 +-
 fs/gfs2/inode.c|  11 +-
 fs/kernfs/inode.c  |   8 +-
 fs/kernfs/kernfs-internal.h|   4 +-
 fs/libfs.c |  12 +-
 fs/minix/inode.c   |  11 +-
 fs/minix/minix.h   |   2 +-
 fs/nfs/inode.c |  13 +-
 fs/nfs/namespace.c |   9 +-
 fs/nfsd/nfs4xdr.c  |   4 +-
 fs/nfsd/vfs.h  |   3 +-
 fs/ocfs2/file.c|  11 +-
 fs/ocfs2/file.h|   4 +-
 fs/orangefs/inode.c|  13 +-
 fs/orangefs/orangefs-kernel.h  |   5 +-
 fs/overlayfs/copy_up.c |   6 +-
 fs/overlayfs/dir.c |  10 +-
 fs/overlayfs/inode.c   |   7 +-
 fs/proc/base.c |  12 +-
 fs/proc/generic.c  |   6 +-
 fs/proc/internal.h |   2 +-
 fs/proc/proc_net.c |   6 +-
 fs/proc/proc_sysctl.c  |   5 +-
 fs/proc/root.c |   6 +-
 fs/stat.c  | 214 ++---
 fs/sysv/itree.c|   7 +-
 fs/sysv/sysv.h |   2 +-
 fs/ubifs/dir.c |   6 +-
 fs/ubifs/ubifs.h   |   4 +-
 fs/udf/symlink.c   |   5 +-
 fs/xfs/xfs_iops.c  |   9 +-
 include/linux/fs.h |  35 ++-
 include/linux/nfs_fs.h |   2 +-
 include/linux/stat.h   |  24 +-
 include/linux/syscalls.h  

Re: [RFC] arm64: support HAVE_ARCH_RARE_WRITE

2017-03-02 Thread Kees Cook
On Thu, Mar 2, 2017 at 7:00 AM, Hoeun Ryu  wrote:
>  This RFC is a quick and dirty arm64 implementation for Kees Cook's RFC for
> rare_write infrastructure [1].

Awesome! :)

>  This implementation is based on Mark Rutland's suggestions, which is that
> a special userspace mm that maps only __start/end_rodata as RW permission
> is prepared during early boot time (paging_init) and __arch_rare_write_map()
> switches to the mm [2].
>
>  Due to the limit of implementation (the mm having RW mapping is userspace
> mm), we need a new arch-specific __arch_rare_write_ptr() to convert RO
> address to RW address (CONFIG_HAVE_RARE_WRITE_PTR is added), which is
> general for all architectures (__rare_write_ptr()) in Kees's RFC . So all
> writes should be instrumented by __rare_write().

Cool, yeah, I'll get all this fixed up in my next version.

>  One caveat for arm64 is CONFIG_ARM64_SW_TTBR0_PAN.
> Because __arch_rare_write_map() installs a special user mm to ttbr0,
> usercopy inside  __arch_rare_write_map/unmap() pair will break rare_write.
> (uaccess_enable() replaces the special mm and RW alias is no longer valid.)

That's a totally fine constraint: this case should never happen for so
many reasons. :)

>  A similar problem could arise in general usercopy inside
> __arch_rare_write_map/unmap(). __arch_rare_write_map() replaces current->mm,
> so we lose the address space of the `current` process.
>
> It passes LKDTM's rare write test.
>
> [1] : http://www.openwall.com/lists/kernel-hardening/2017/02/27/5
> [2] : https://lkml.org/lkml/2017/2/22/254
>
> Signed-off-by: Hoeun Ryu 

-Kees

-- 
Kees Cook
Pixel Security
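
The idea of writing through a separate RW alias of otherwise read-only data
can be illustrated with a userspace analogy in C. This is not the arm64
implementation discussed above, which switches to a special mm; it only shows
the address translation that __arch_rare_write_ptr() is described as doing.
All names here (struct rare_region, rare_region_init, rare_write_ptr) are
hypothetical, and the two views are simply two mmap() mappings of one
temporary file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical userspace stand-in for the two views of the same data. */
struct rare_region {
	const unsigned char *ro;	/* normal, read-only view */
	unsigned char *rw;		/* writable alias of the same pages */
	size_t len;
};

/* Map one temporary file twice: once read-only, once writable. */
static inline int rare_region_init(struct rare_region *r, size_t len)
{
	FILE *f = tmpfile();
	void *ro, *rw;

	if (!f)
		return -1;
	if (ftruncate(fileno(f), (off_t)len) != 0) {
		fclose(f);
		return -1;
	}
	ro = mmap(NULL, len, PROT_READ, MAP_SHARED, fileno(f), 0);
	rw = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fileno(f), 0);
	fclose(f);	/* existing mappings stay valid after close */
	if (ro == MAP_FAILED || rw == MAP_FAILED)
		return -1;
	r->ro = ro;
	r->rw = rw;
	r->len = len;
	return 0;
}

/* Translate an address in the RO view to the same offset in the RW alias. */
static inline void *rare_write_ptr(const struct rare_region *r,
				   const void *ro_addr)
{
	size_t off = (size_t)((const unsigned char *)ro_addr - r->ro);

	assert(off < r->len);
	return r->rw + off;
}
```

A store through rare_write_ptr(&r, &r.ro[i]) becomes visible at r.ro[i], while
a direct store through the read-only view would fault.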


[PATCH] can: m_can: support transmit frame in CAN FD format

2017-03-02 Thread Wenyou Yang
Add support to transmit the frame in the CAN FD format and with
the bit rate switching.

Tested on SAMA5D2 Xplained board.

Signed-off-by: Wenyou Yang 
---
The testing is based on
[RESEND PATCH 1/1] can: m_can: fix bitrate setup on latest silicon
http://lkml.iu.edu/hypermail/linux/kernel/1702.1/05347.html

 drivers/net/can/m_can/m_can.c | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 195f15edb32e..9ef9b337d25b 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -266,8 +266,12 @@ enum m_can_mram_cfg {
 
 /* Tx Buffer Element */
 /* R0 */
+#define TX_BUF_ESI BIT(31)
 #define TX_BUF_XTD BIT(30)
 #define TX_BUF_RTR BIT(29)
+#define TX_BUF_EFC BIT(23)
+#define TX_BUF_EDL BIT(21)
+#define TX_BUF_BRS BIT(20)
 
 /* address offset and element number for each FIFO/Buffer in the Message RAM */
 struct mram_cfg {
@@ -884,7 +888,7 @@ static void m_can_chip_config(struct net_device *dev)
}
 
if (priv->can.ctrlmode & CAN_CTRLMODE_FD)
-   cccr |= CCCR_CME_CANFD_BRS << CCCR_CME_SHIFT;
+   cccr |= (CCCR_CME_CANFD_BRS | CCCR_CME_CANFD) << CCCR_CME_SHIFT;
 
m_can_write(priv, M_CAN_CCCR, cccr);
m_can_write(priv, M_CAN_TEST, test);
@@ -1047,6 +1051,7 @@ static netdev_tx_t m_can_start_xmit(struct sk_buff *skb,
struct canfd_frame *cf = (struct canfd_frame *)skb->data;
u32 id, cccr;
int i;
+   u32 dlc;
 
if (can_dropped_invalid_skb(dev, skb))
return NETDEV_TX_OK;
@@ -1065,7 +1070,6 @@ static netdev_tx_t m_can_start_xmit(struct sk_buff *skb,
 
/* message ram configuration */
m_can_fifo_write(priv, 0, M_CAN_FIFO_ID, id);
-   m_can_fifo_write(priv, 0, M_CAN_FIFO_DLC, can_len2dlc(cf->len) << 16);
 
for (i = 0; i < cf->len; i += 4)
m_can_fifo_write(priv, 0, M_CAN_FIFO_DATA(i / 4),
@@ -1073,20 +1077,29 @@ static netdev_tx_t m_can_start_xmit(struct sk_buff *skb,
 
can_put_echo_skb(skb, dev, 0);
 
+   dlc = can_len2dlc(cf->len) << 16;
+
if (priv->can.ctrlmode & CAN_CTRLMODE_FD) {
cccr = m_can_read(priv, M_CAN_CCCR);
cccr &= ~(CCCR_CMR_MASK << CCCR_CMR_SHIFT);
if (can_is_canfd_skb(skb)) {
-   if (cf->flags & CANFD_BRS)
+   dlc |= TX_BUF_EDL;
+   if (cf->flags & CANFD_ESI)
+   dlc |= TX_BUF_ESI;
+   if (cf->flags & CANFD_BRS) {
+   dlc |= TX_BUF_BRS;
cccr |= CCCR_CMR_CANFD_BRS << CCCR_CMR_SHIFT;
-   else
+   } else {
cccr |= CCCR_CMR_CANFD << CCCR_CMR_SHIFT;
+   }
} else {
cccr |= CCCR_CMR_CAN << CCCR_CMR_SHIFT;
}
m_can_write(priv, M_CAN_CCCR, cccr);
}
 
+   m_can_fifo_write(priv, 0, M_CAN_FIFO_DLC, dlc);
+
/* enable first TX buffer to start transfer  */
m_can_write(priv, M_CAN_TXBTIE, 0x1);
m_can_write(priv, M_CAN_TXBAR, 0x1);
-- 
2.11.0



[PATCH v7 kernel 4/5] virtio-balloon: define flags and head for host request vq

2017-03-02 Thread Wei Wang
From: Liang Li 

Define the flags and header struct for a new host request virtual
queue. The guest can get requests from the host and then respond to
them on this new virtual queue.
The host can make use of this virtqueue to request the guest to do
some operations, e.g. drop page cache, synchronize file system,
etc. The hypervisor can also get some of the guest's runtime information
through this virtual queue, e.g. the guest's unused page
information, which can be used for live migration optimization.

Signed-off-by: Liang Li 
Signed-off-by: Wei Wang 
Cc: Andrew Morton 
Cc: Mel Gorman 
Cc: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Cornelia Huck 
Cc: Amit Shah 
Cc: Dave Hansen 
Cc: Andrea Arcangeli 
Cc: David Hildenbrand 
Cc: Liang Li 
Cc: Wei Wang 
---
 include/uapi/linux/virtio_balloon.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/include/uapi/linux/virtio_balloon.h 
b/include/uapi/linux/virtio_balloon.h
index ed627b2..630b0ef 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -35,6 +35,7 @@
 #define VIRTIO_BALLOON_F_STATS_VQ  1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM2 /* Deflate balloon on OOM */
 #define VIRTIO_BALLOON_F_CHUNK_TRANSFER3 /* Transfer pages in chunks */
+#define VIRTIO_BALLOON_F_HOST_REQ_VQ   4 /* Host request virtqueue */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
@@ -94,4 +95,25 @@ struct virtio_balloon_resp_hdr {
__le32 data_len; /* Payload len in bytes */
 };
 
+enum virtio_balloon_req_id {
+   /* Get unused page information */
+   BALLOON_GET_UNUSED_PAGES,
+};
+
+enum virtio_balloon_flag {
+   /* Have more data for a request */
+   BALLOON_FLAG_CONT,
+   /* No more data for a request */
+   BALLOON_FLAG_DONE,
+};
+
+struct virtio_balloon_req_hdr {
+   /* Used to distinguish different requests */
+   __le16 cmd;
+   /* Reserved */
+   __le16 reserved[3];
+   /* Request parameter */
+   __le64 param;
+};
+
 #endif /* _LINUX_VIRTIO_BALLOON_H */
-- 
2.7.4



[PATCH v7 kernel 5/5] This patch contains two parts:

2017-03-02 Thread Wei Wang
From: Liang Li 

One is to add a new API to mm to get the unused page information.
The virtio balloon driver will use this new API to get the
unused page info and send it to the hypervisor (QEMU) to speed up live
migration. While the bitmap is being sent, some of the pages may be
modified and used by the guest; this inaccuracy can be corrected by the
dirty page logging mechanism.

The other is to add support for the request for the VM's unused page
information. QEMU can make use of the unused page information and the
dirty page logging mechanism to skip the transportation of some of these
unused pages, which is very helpful to reduce the network traffic and
speed up the live migration process.

Signed-off-by: Liang Li 
Signed-off-by: Wei Wang 
Cc: Andrew Morton 
Cc: Mel Gorman 
Cc: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Cornelia Huck 
Cc: Amit Shah 
Cc: Dave Hansen 
Cc: Andrea Arcangeli 
Cc: David Hildenbrand 
Cc: Liang Li 
Cc: Wei Wang 
---
 drivers/virtio/virtio_balloon.c | 137 ++--
 include/linux/mm.h  |   3 +
 mm/page_alloc.c | 120 +++
 3 files changed, 255 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 4416370..9b6cf44f 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -66,7 +66,7 @@ struct balloon_page_chunk_ext {
 
 struct virtio_balloon {
struct virtio_device *vdev;
-   struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
+   struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *host_req_vq;
 
/* The balloon servicing is delegated to a freezable workqueue. */
struct work_struct update_balloon_stats_work;
@@ -95,6 +95,8 @@ struct virtio_balloon {
unsigned int nr_page_bmap;
/* Used to record the processed pfn range */
unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
+   /* Request header */
+   struct virtio_balloon_req_hdr req_hdr;
/*
 * The pages we've told the Host we're not using are enqueued
 * at vb_dev_info->pages list.
@@ -549,6 +551,80 @@ static void stats_handle_request(struct virtio_balloon *vb)
virtqueue_kick(vq);
 }
 
+static void __send_unused_pages(struct virtio_balloon *vb,
+   unsigned long req_id, unsigned int pos, bool done)
+{
+   struct virtio_balloon_resp_hdr *hdr = &vb->resp_hdr;
+   struct virtqueue *vq = vb->host_req_vq;
+
+   vb->resp_pos = pos;
+   hdr->cmd = BALLOON_GET_UNUSED_PAGES;
+   hdr->id = req_id;
+   if (!done)
+   hdr->flag = BALLOON_FLAG_CONT;
+   else
+   hdr->flag = BALLOON_FLAG_DONE;
+
+   if (pos > 0 || done)
+   send_resp_data(vb, vq, true);
+
+}
+
+static void send_unused_pages(struct virtio_balloon *vb,
+   unsigned long req_id)
+{
+   struct scatterlist sg_in;
+   unsigned int pos = 0;
+   struct virtqueue *vq = vb->host_req_vq;
+   int ret, order;
+   struct zone *zone = NULL;
+   bool part_fill = false;
+
+   mutex_lock(&vb->balloon_lock);
+
+   for (order = MAX_ORDER - 1; order >= 0; order--) {
+   ret = mark_unused_pages(&zone, order, vb->resp_data,
+vb->resp_buf_size / sizeof(__le64),
+&pos, VIRTIO_BALLOON_CHUNK_SIZE_SHIFT, part_fill);
+   if (ret == -ENOSPC) {
+   if (pos == 0) {
+   void *new_resp_data;
+
+   new_resp_data = kmalloc(2 * vb->resp_buf_size,
+   GFP_KERNEL);
+   if (new_resp_data) {
+   kfree(vb->resp_data);
+   vb->resp_data = new_resp_data;
+   vb->resp_buf_size *= 2;
+   } else {
+   part_fill = true;
+   dev_warn(&vb->vdev->dev,
+"%s: part fill order: %d\n",
+__func__, order);
+   }
+   } else {
+   __send_unused_pages(vb, req_id, pos, false);
+   pos = 0;
+   }
+
+   if (!part_fill) {
+   order++;
+   continue;
+   }
+   } else
+   zone = NULL;
+
+   if (order == 0)
+   __send_unused_pages(vb, req_id, pos, true);
+
+   }
+
+   mutex_unlock(&vb->balloon_lock);
+   sg_init_one(&sg_in, &vb->req_hdr, sizeof(vb->req_hdr));
+   virtqueue_add_inbuf(vq, &sg_in, 1, &vb->req_hdr, GFP_KERNEL);
+   virtqueue_kick(vq);
+}
+
 static void 

[PATCH v7 kernel 3/5] virtio-balloon: implementation of VIRTIO_BALLOON_F_CHUNK_TRANSFER

2017-03-02 Thread Wei Wang
From: Liang Li 

The implementation of the current virtio-balloon is not very
efficient, because the pages are transferred to the host one by one.
Here is the breakdown of the time in percentage spent on each
step of the balloon inflating process (inflating 7GB of an 8GB
idle guest).

1) allocating pages (6.5%)
2) sending PFNs to host (68.3%)
3) address translation (6.1%)
4) madvise (19%)

It takes about 4126ms for the inflating process to complete.
The above profiling shows that the bottlenecks are stage 2)
and stage 4).

This patch optimizes step 2) by transferring pages to the host in
chunks. A chunk consists of guest physically contiguous pages, and
it is offered to the host via a base PFN (i.e. the start PFN of
those physically contiguous pages) and the size (i.e. the total
number of the pages). A normal chunk is formatted as below:
-----------------------------------------
|  Base (52 bit)        | Size (12 bit) |
-----------------------------------------
For large size chunks, an extended chunk format is used:
-----------------------------------------
|             Base (64 bit)             |
-----------------------------------------
-----------------------------------------
|             Size (64 bit)             |
-----------------------------------------

By doing so, step 4) can also be optimized by doing address
translation and madvise() in chunks rather than page by page.
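
A minimal sketch of the normal chunk format described above, not taken from
the patch. It assumes the base PFN sits in the upper 52 bits and the size in
the lower 12 bits of one 64-bit word (the diagram suggests this, but the
bitfield layout in the patch is compiler-dependent), and it leaves out the
conversion to little-endian. The names chunk_pack(), chunk_base(),
chunk_size() and chunk_fits_normal() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CHUNK_SIZE_BITS 12
#define CHUNK_SIZE_MAX  ((1ULL << CHUNK_SIZE_BITS) - 1)
#define CHUNK_BASE_MAX  ((1ULL << (64 - CHUNK_SIZE_BITS)) - 1)

/* True if the range fits the normal 52+12 bit format; otherwise the
 * extended two-word format would be needed. */
static bool chunk_fits_normal(uint64_t base_pfn, uint64_t nr_pages)
{
	return base_pfn <= CHUNK_BASE_MAX && nr_pages <= CHUNK_SIZE_MAX;
}

/* Pack base PFN (upper 52 bits, assumed) and page count (lower 12 bits). */
static uint64_t chunk_pack(uint64_t base_pfn, uint64_t nr_pages)
{
	return (base_pfn << CHUNK_SIZE_BITS) | (nr_pages & CHUNK_SIZE_MAX);
}

/* Recover the base PFN from a packed chunk. */
static uint64_t chunk_base(uint64_t chunk)
{
	return chunk >> CHUNK_SIZE_BITS;
}

/* Recover the page count from a packed chunk. */
static uint64_t chunk_size(uint64_t chunk)
{
	return chunk & CHUNK_SIZE_MAX;
}
```

With this layout a normal chunk covers at most 4095 pages, so a caller would
check chunk_fits_normal() first and fall back to the extended format for
larger ranges.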

This optimization requires the negotiation of a new feature bit,
VIRTIO_BALLOON_F_CHUNK_TRANSFER.

With this new feature, the above ballooning process takes ~590ms
resulting in an improvement of ~85%.

TODO: optimize stage 1) by allocating/freeing a chunk of pages
instead of a single page each time.

Signed-off-by: Liang Li 
Signed-off-by: Wei Wang 
Suggested-by: Michael S. Tsirkin 
Cc: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Cornelia Huck 
Cc: Amit Shah 
Cc: Dave Hansen 
Cc: Andrea Arcangeli 
Cc: David Hildenbrand 
Cc: Liang Li 
Cc: Wei Wang 
---
 drivers/virtio/virtio_balloon.c | 351 
 1 file changed, 323 insertions(+), 28 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f59cb4f..4416370 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -42,6 +42,10 @@
 #define OOM_VBALLOON_DEFAULT_PAGES 256
 #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
 
+#define PAGE_BMAP_SIZE (8 * PAGE_SIZE)
+#define PFNS_PER_PAGE_BMAP (PAGE_BMAP_SIZE * BITS_PER_BYTE)
+#define PAGE_BMAP_COUNT_MAX 32
+
 static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
 module_param(oom_pages, int, S_IRUSR | S_IWUSR);
 MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
@@ -50,6 +54,16 @@ MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
 static struct vfsmount *balloon_mnt;
 #endif
 
+struct balloon_page_chunk {
+   __le64 base : 52;
+   __le64 size : 12;
+};
+
+struct balloon_page_chunk_ext {
+   __le64 base;
+   __le64 size;
+};
+
 struct virtio_balloon {
struct virtio_device *vdev;
struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
@@ -67,6 +81,20 @@ struct virtio_balloon {
 
/* Number of balloon pages we've told the Host we're not using. */
unsigned int num_pages;
+   /* Pointer to the response header. */
+   struct virtio_balloon_resp_hdr resp_hdr;
+   /* Pointer to the start address of response data. */
+   __le64 *resp_data;
+   /* Size of response data buffer. */
+   unsigned int resp_buf_size;
+   /* Pointer offset of the response data. */
+   unsigned int resp_pos;
+   /* Bitmap used to save the pfns info */
+   unsigned long *page_bitmap[PAGE_BMAP_COUNT_MAX];
+   /* Number of split page bitmaps */
+   unsigned int nr_page_bmap;
+   /* Used to record the processed pfn range */
+   unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
/*
 * The pages we've told the Host we're not using are enqueued
 * at vb_dev_info->pages list.
@@ -110,20 +138,180 @@ static void balloon_ack(struct virtqueue *vq)
wake_up(&vb->acked);
 }
 
-static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
+static inline void init_bmap_pfn_range(struct virtio_balloon *vb)
 {
-   struct scatterlist sg;
+   vb->min_pfn = ULONG_MAX;
+   vb->max_pfn = 0;
+}
+
+static inline void update_bmap_pfn_range(struct virtio_balloon *vb,
+struct page *page)
+{
+   unsigned long balloon_pfn = page_to_balloon_pfn(page);
+
+   vb->min_pfn = min(balloon_pfn, vb->min_pfn);
+   vb->max_pfn = max(balloon_pfn, vb->max_pfn);
+}
+
+static void extend_page_bitmap(struct virtio_balloon *vb,
+   unsigned long nr_pfn)
+{
+   int i, bmap_count;
+   unsigned long bmap_len;
+
+   bmap_len = ALIGN(nr_pfn, BITS_PER_LONG) / BITS_PER_BYTE;
+   bmap_len = ALIGN(bmap_len, PAGE_BMAP_SIZE);
+   

Re: [PATCH 3/4] thp: fix MADV_DONTNEED vs. MADV_FREE race

2017-03-02 Thread Hillf Danton

On March 02, 2017 11:11 PM Kirill A. Shutemov wrote: 
> 
> Basically the same race as with numa balancing in change_huge_pmd(), but
> a bit simpler to mitigate: we don't need to preserve dirty/young flags
> here due to MADV_FREE functionality.
> 
> Signed-off-by: Kirill A. Shutemov 
> Cc: Minchan Kim 
> ---
>  mm/huge_memory.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index bb2b3646bd78..324217c31ec9 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1566,8 +1566,6 @@ bool madvise_free_huge_pmd(struct mmu_gather *tlb, 
> struct vm_area_struct *vma,
>   deactivate_page(page);
> 
>   if (pmd_young(orig_pmd) || pmd_dirty(orig_pmd)) {
> - orig_pmd = pmdp_huge_get_and_clear_full(tlb->mm, addr, pmd,
> - tlb->fullmm);
>   orig_pmd = pmd_mkold(orig_pmd);
>   orig_pmd = pmd_mkclean(orig_pmd);
> 
$ grep -n set_pmd_at  linux-4.10/arch/powerpc/mm/pgtable-book3s64.c

/*
 * set a new huge pmd. We should not be called for updating
 * an existing pmd entry. That should go via pmd_hugepage_update.
 */
void set_pmd_at(struct mm_struct *mm, unsigned long addr,



Re: net/sctp: use-after-free in sctp_association_put

2017-03-02 Thread Xin Long
On Fri, Mar 3, 2017 at 3:21 AM, Dmitry Vyukov  wrote:
> On Thu, Mar 2, 2017 at 9:06 AM, Xin Long  wrote:
>> On Thu, Mar 2, 2017 at 3:18 AM, Dmitry Vyukov  wrote:
>>> Hello,
>>>
>>> I've got the following report while running syzkaller fuzzer on
>>> linux-next/8813198236a044b76e251dcae937b180dd527999:
>>>
>>> BUG: KASAN: use-after-free in sctp_association_destroy
>>> net/sctp/associola.c:416 [inline] at addr 8801c0fa415c
>>> BUG: KASAN: use-after-free in sctp_association_put+0x294/0x300
>>> net/sctp/associola.c:881 at addr 8801c0fa415c
>>> Read of size 1 by task syz-executor1/10956
>>> CPU: 1 PID: 10956 Comm: syz-executor1 Not tainted 4.10.0-rc7-next-20170213 
>>> #1
>>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>>> BIOS Google 01/01/2011
>>> Call Trace:
>>>  
>>>  __dump_stack lib/dump_stack.c:15 [inline]
>>>  dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>>  print_address_description mm/kasan/report.c:200 [inline]
>>>  kasan_report_error mm/kasan/report.c:289 [inline]
>>>  kasan_report.part.2+0x1e5/0x4b0 mm/kasan/report.c:311
>>>  kasan_report mm/kasan/report.c:329 [inline]
>>>  __asan_report_load1_noabort+0x29/0x30 mm/kasan/report.c:329
>>>  sctp_association_destroy net/sctp/associola.c:416 [inline]
>>>  sctp_association_put+0x294/0x300 net/sctp/associola.c:881
>>>  sctp_generate_timeout_event+0x115/0x360 net/sctp/sm_sideeffect.c:317
>>>  sctp_generate_t1_init_event+0x1a/0x20 net/sctp/sm_sideeffect.c:329
>>>  call_timer_fn+0x241/0x820 kernel/time/timer.c:1308
>>>  expire_timers kernel/time/timer.c:1348 [inline]
>>>  __run_timers+0x9e7/0xe90 kernel/time/timer.c:1642
>>>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1655
>>>  __do_softirq+0x31f/0xbe7 kernel/softirq.c:284
>>>  invoke_softirq kernel/softirq.c:364 [inline]
>>>  irq_exit+0x1cc/0x200 kernel/softirq.c:405
>>>  exiting_irq arch/x86/include/asm/apic.h:658 [inline]
>>>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
>>>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:707
>>> RIP: 0010:arch_local_irq_enable arch/x86/include/asm/paravirt.h:788 [inline]
>>> RIP: 0010:__raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 
>>> [inline]
>>> RIP: 0010:_raw_spin_unlock_irq+0x56/0x70 kernel/locking/spinlock.c:199
>>> RSP: 0018:8801c280f178 EFLAGS: 0286 ORIG_RAX: ff10
>>> RAX: dc00 RBX: 8801dbf24a00 RCX: 0006
>>> RDX: 10a18d03 RSI: 8801d71c88e0 RDI: 850c6818
>>> RBP: 8801c280f180 R08: 0002 R09: 
>>> R10: 0006 R11:  R12: 8801c0f3a4c0
>>> R13: 110038501e38 R14: 8801d71c80c0 R15: 8801d71c80c0
>>>  
>>>  finish_lock_switch kernel/sched/sched.h:1248 [inline]
>>>  finish_task_switch+0x1c2/0x720 kernel/sched/core.c:2792
>>>  context_switch kernel/sched/core.c:2928 [inline]
>>>  __schedule+0x893/0x2290 kernel/sched/core.c:3468
>>>  preempt_schedule_common+0x35/0x60 kernel/sched/core.c:3579
>>>  _cond_resched+0x17/0x20 kernel/sched/core.c:4977
>>>  slab_pre_alloc_hook mm/slab.h:427 [inline]
>>>  slab_alloc mm/slab.c:3390 [inline]
>>>  __do_kmalloc mm/slab.c:3730 [inline]
>>>  __kmalloc_track_caller+0x26a/0x690 mm/slab.c:3747
>>>  kstrdup+0x39/0x70 mm/util.c:54
>>>  snd_timer_instance_new+0xfc/0x5d0 sound/core/timer.c:110
>>>  snd_timer_open+0x878/0x1740 sound/core/timer.c:290
>>>  snd_timer_user_tselect sound/core/timer.c:1621 [inline]
>>>  __snd_timer_user_ioctl sound/core/timer.c:1901 [inline]
>>>  snd_timer_user_ioctl+0x9b1/0x34a0 sound/core/timer.c:1931
>>>  vfs_ioctl fs/ioctl.c:43 [inline]
>>>  do_vfs_ioctl+0x1bf/0x1790 fs/ioctl.c:683
>>>  SYSC_ioctl fs/ioctl.c:698 [inline]
>>>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
>>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>>> RIP: 0033:0x44fb59
>>> RSP: 002b:7f0dc184db58 EFLAGS: 0212 ORIG_RAX: 0010
>>> RAX: ffda RBX: 40345410 RCX: 0044fb59
>>> RDX: 20001000 RSI: 40345410 RDI: 0005
>>> RBP: 0005 R08:  R09: 
>>> R10:  R11: 0212 R12: 00708000
>>> R13: 00a5fc57 R14: 7f0dc184e9c0 R15: 
>>> Object at 8801c0fa4140, in cache kmalloc-4096 size: 4096
>>> Allocated:
>>> PID = 10965
>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:504
>>>  set_track mm/kasan/kasan.c:516 [inline]
>>>  kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:607
>>>  kmem_cache_alloc_trace+0x10b/0x670 mm/slab.c:3634
>>>  kmalloc include/linux/slab.h:490 [inline]
>>>  kzalloc include/linux/slab.h:663 [inline]
>>>  sctp_association_new+0x114/0x2120 net/sctp/associola.c:306
>>>  sctp_sendmsg+0x1585/0x38f0 net/sctp/socket.c:1835
>>>  inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
>>>  sock_sendmsg_nosec net/socket.c:633 [inline]
>>>  sock_sendmsg+0xca/0x110 net/socket.c:643

[PATCH] blk: improve order of bio handling in generic_make_request()

2017-03-02 Thread NeilBrown

[ Hi Jens,
  you might have seen assorted email threads recently about
  deadlocks, particularly in dm-snap or md/raid1/10.  Also about
  the excess of rescuer threads.
  I think a big part of the problem is my ancient improvement
  to generic_make_request to queue bios and handle them in
  a strict FIFO order.  As described below, that can cause
  problems which individual block devices cannot fix themselves
  without punting to various threads.
  This patch does not fix everything, but provides a basis that
  drivers can build on to create deadlock-free solutions without
  excess threads.
  If you accept this, I will look into improving at least md
  and bio_alloc_bioset() to be less dependent on rescuer threads.
  Thanks,
  NeilBrown
 ]


To avoid recursion on the kernel stack when stacked block devices
are in use, generic_make_request() will, when called recursively,
queue new requests for later handling.  They will be handled when the
make_request_fn for the current bio completes.

If any bios are submitted by a make_request_fn, these will ultimately
be handled sequentially.  If the handling of one of those generates
further requests, they will be added to the end of the queue.

This strict first-in-first-out behaviour can lead to deadlocks in
various ways, normally because a request might need to wait for a
previous request to the same device to complete.  This can happen when
they share a mempool, and can happen due to interdependencies
particular to the device.  Both md and dm have examples where this happens.

These deadlocks can be eradicated by more selective ordering of bios.
Specifically by handling them in depth-first order.  That is: when the
handling of one bio generates one or more further bios, they are
handled immediately after the parent, before any siblings of the
parent.  That way, when generic_make_request() calls make_request_fn
for some particular device, we can be certain that all previously
submitted requests for that device have been completely handled and are
not waiting for anything in the queue of requests maintained in
generic_make_request().

An easy way to achieve this would be to use a last-in-first-out stack
instead of a queue.  However this will change the order of consecutive
bios submitted by a make_request_fn, which could have unexpected consequences.
Instead we take a slightly more complex approach.
A fresh queue is created for each call to a make_request_fn.  After it 
completes,
any bios for a different device are placed on the front of the main queue, 
followed
by any bios for the same device, followed by all bios that were already on
the queue before the make_request_fn was called.
This provides the depth-first approach without reordering bios on the same 
level.

This, by itself, it not enough to remove the deadlocks.  It just makes
it possible for drivers to take the extra step required themselves.

To avoid deadlocks, drivers must never risk waiting for a request
after submitting one to generic_make_request.  This includes never
allocing from a mempool twice in the one call to a make_request_fn.

A common pattern in drivers is to call bio_split() in a loop, handling
the first part and then looping around to possibly split the next part.
Instead, a driver that finds it needs to split a bio should queue
(with generic_make_request) the second part, handle the first part,
and then return.  The new code in generic_make_request will ensure the
requests to underlying bios are processed first, then the second bio
that was split off.  If it splits again, the same process happens.  In
each case one bio will be completely handled before the next one is attempted.

With this is place, it should be possible to disable the
punt_bios_to_recover() recovery thread for many block devices, and
eventually it may be possible to remove it completely.

Tested-by: Jinpu Wang 
Inspired-by: Lars Ellenberg 
Signed-off-by: NeilBrown 
---
 block/blk-core.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index b9e857f4afe8..ef55f210dd7c 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2018,10 +2018,32 @@ blk_qc_t generic_make_request(struct bio *bio)
struct request_queue *q = bdev_get_queue(bio->bi_bdev);
 
if (likely(blk_queue_enter(q, false) == 0)) {
+   struct bio_list hold;
+   struct bio_list lower, same;
+
+   /* Create a fresh bio_list for all subordinate requests 
*/
+   bio_list_init();
+   bio_list_merge(, _list_on_stack);
+   bio_list_init(_list_on_stack);
ret = q->make_request_fn(q, bio);
 
blk_queue_exit(q);
 
+   /* sort new bios into those for a lower level
+* and those for the same level
+

[PATCH] blk: improve order of bio handling in generic_make_request()

2017-03-02 Thread NeilBrown

[ Hi Jens,
  you might have seen assorted email threads recently about
  deadlocks, particularly in dm-snap or md/raid1/10.  Also about
  the excess of rescuer threads.
  I think a big part of the problem is my ancient improvement
  to generic_make_request to queue bios and handle them in
  a strict FIFO order.  As described below, that can cause
  problems which individual block devices cannot fix themselves
  without punting to various threads.
  This patch does not fix everything, but provides a basis that
  drivers can build on to create deadlock-free solutions without
  excess threads.
  If you accept this, I will look into improving at least md
  and bio_alloc_set() to be less dependent on rescuer threads.
  Thanks,
  NeilBrown
 ]


To avoid recursion on the kernel stack when stacked block devices
are in use, generic_make_request() will, when called recursively,
queue new requests for later handling.  They will be handled when the
make_request_fn for the current bio completes.

If any bios are submitted by a make_request_fn, these will ultimately be
handled sequentially.  If the handling of one of those generates
further requests, they will be added to the end of the queue.

This strict first-in-first-out behaviour can lead to deadlocks in
various ways, normally because a request might need to wait for a
previous request to the same device to complete.  This can happen when
they share a mempool, and can happen due to interdependencies
particular to the device.  Both md and dm have examples where this happens.

These deadlocks can be eradicated by more selective ordering of bios,
specifically by handling them in depth-first order.  That is: when the
handling of one bio generates one or more further bios, they are
handled immediately after the parent, before any siblings of the
parent.  That way, when generic_make_request() calls make_request_fn
for some particular device, we can be certain that all previously
submitted requests for that device have been completely handled and are
not waiting for anything in the queue of requests maintained in
generic_make_request().

An easy way to achieve this would be to use a last-in-first-out stack
instead of a queue.  However this will change the order of consecutive
bios submitted by a make_request_fn, which could have unexpected consequences.
Instead we take a slightly more complex approach.
A fresh queue is created for each call to a make_request_fn.  After it
completes, any bios for a different device are placed on the front of
the main queue, followed by any bios for the same device, followed by
all bios that were already on the queue before the make_request_fn was
called.  This provides the depth-first approach without reordering bios
on the same level.
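
The ordering described above can be tried out in user space.  What
follows is only a sketch under stated assumptions, not the kernel code:
struct bio, the list helpers, submit(), make_request() and run() are all
hypothetical stand-ins, and the three-level device stack (dev 0 fans out
to two bios on dev 1, each of which passes one bio to dev 2) is invented
for illustration.  run(0) models the old strict-FIFO queue, run(1) the
lower/same/held ordering.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* User-space model only; none of these names are kernel API. */
struct bio { int dev; char name[4]; struct bio *next; };
struct bio_list { struct bio *head, *tail; };

static struct bio pool[16];
static int used;
static struct bio_list on_stack;	/* plays the role of bio_list_on_stack */
static char trace[64];

static void list_init(struct bio_list *l) { l->head = l->tail = NULL; }

static void list_add(struct bio_list *l, struct bio *b)
{
	b->next = NULL;
	if (l->tail)
		l->tail->next = b;
	else
		l->head = b;
	l->tail = b;
}

static struct bio *list_pop(struct bio_list *l)
{
	struct bio *b = l->head;

	if (b) {
		l->head = b->next;
		if (!l->head)
			l->tail = NULL;
		b->next = NULL;
	}
	return b;
}

/* Append everything on src to the tail of dst. */
static void list_merge(struct bio_list *dst, struct bio_list *src)
{
	if (!src->head)
		return;
	if (dst->tail)
		dst->tail->next = src->head;
	else
		dst->head = src->head;
	dst->tail = src->tail;
}

static void submit(int dev, char letter, char digit)
{
	struct bio *b;

	assert(used < 16);
	b = &pool[used++];
	b->dev = dev;
	b->name[0] = letter;
	b->name[1] = digit;
	b->name[2] = '\0';
	list_add(&on_stack, b);
}

/* Toy make_request_fn: record the bio, then submit any child bios. */
static void make_request(struct bio *b)
{
	strcat(trace, b->name);
	strcat(trace, " ");
	if (b->dev == 0) {
		submit(1, 'B', '1');
		submit(1, 'B', '2');
	} else if (b->dev == 1) {
		submit(2, 'C', b->name[1]);
	}
}

static const char *run(int depth_first)
{
	struct bio_list hold, lower, same, fresh;
	struct bio *b, *c;

	used = 0;
	trace[0] = '\0';
	list_init(&on_stack);
	submit(0, 'A', '0');
	while ((b = list_pop(&on_stack)) != NULL) {
		hold = on_stack;	/* park what is already queued */
		list_init(&on_stack);	/* fresh list for this call */
		make_request(b);
		if (depth_first) {
			list_init(&lower);
			list_init(&same);
			while ((c = list_pop(&on_stack)) != NULL)
				list_add(c->dev == b->dev ? &same : &lower, c);
			list_merge(&on_stack, &lower);
			list_merge(&on_stack, &same);
			list_merge(&on_stack, &hold);
		} else {		/* old behaviour: append new bios */
			fresh = on_stack;
			on_stack = hold;
			list_merge(&on_stack, &fresh);
		}
	}
	return trace;
}

#ifdef DEMO_MAIN
int main(void)
{
	printf("fifo:        %s\n", run(0));
	printf("depth-first: %s\n", run(1));
	return 0;
}
#endif
```

Built with -DDEMO_MAIN, this prints both visiting orders.  In the
depth-first run, C1 is handled before B2, so by the time B2 reaches its
device nothing submitted on behalf of B1 is still sitting in the queue.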

This, by itself, is not enough to remove the deadlocks.  It just makes
it possible for drivers to take the extra step required themselves.

To avoid deadlocks, drivers must never risk waiting for a request
after submitting one to generic_make_request.  This includes never
allocating from a mempool twice in the one call to a make_request_fn.

A common pattern in drivers is to call bio_split() in a loop, handling
the first part and then looping around to possibly split the next part.
Instead, a driver that finds it needs to split a bio should queue
(with generic_make_request) the second part, handle the first part,
and then return.  The new code in generic_make_request will ensure the
requests to underlying bios are processed first, then the second bio
that was split off.  If it splits again, the same process happens.  In
each case one bio will be completely handled before the next one is attempted.
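
The split-and-return pattern can be sketched the same way.  Again this
is a user-space model under assumptions, not driver code: MAX_SECTORS,
submit(), make_request() and handle() are hypothetical names, dev 0 is a
toy driver that may only pass MAX_SECTORS at a time to dev 1, and dev 1
simply records what it receives.  The point is that the driver queues
the second part and returns instead of looping over bio_split().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_SECTORS 4	/* hypothetical per-request limit of the toy driver */

struct bio { int dev, start, len; struct bio *next; };
struct bio_list { struct bio *head, *tail; };

static struct bio pool[32];
static int used;
static struct bio_list pending;	/* plays the role of bio_list_on_stack */
static char done[128];

static void list_add(struct bio_list *l, struct bio *b)
{
	b->next = NULL;
	if (l->tail)
		l->tail->next = b;
	else
		l->head = b;
	l->tail = b;
}

static struct bio *list_pop(struct bio_list *l)
{
	struct bio *b = l->head;

	if (b) {
		l->head = b->next;
		if (!l->head)
			l->tail = NULL;
		b->next = NULL;
	}
	return b;
}

static void list_merge(struct bio_list *dst, struct bio_list *src)
{
	if (!src->head)
		return;
	if (dst->tail)
		dst->tail->next = src->head;
	else
		dst->head = src->head;
	dst->tail = src->tail;
}

static void submit(int dev, int start, int len)
{
	struct bio *b;

	assert(used < 32);
	b = &pool[used++];
	b->dev = dev;
	b->start = start;
	b->len = len;
	list_add(&pending, b);
}

static void make_request(struct bio *b)
{
	char buf[32];
	int first;

	if (b->dev == 1) {	/* bottom device: the I/O "completes" here */
		snprintf(buf, sizeof(buf), "[%d+%d]", b->start, b->len);
		strcat(done, buf);
		return;
	}
	first = b->len > MAX_SECTORS ? MAX_SECTORS : b->len;
	if (b->len > first)	/* queue the second part, do not loop on it */
		submit(0, b->start + first, b->len - first);
	submit(1, b->start, first);	/* handle the first part, then return */
}

/* Same lower/same/held ordering as described for generic_make_request. */
static const char *handle(int sectors)
{
	struct bio_list hold, lower, same;
	struct bio *b, *c;

	used = 0;
	done[0] = '\0';
	pending.head = pending.tail = NULL;
	submit(0, 0, sectors);
	while ((b = list_pop(&pending)) != NULL) {
		hold = pending;
		pending.head = pending.tail = NULL;
		make_request(b);
		lower.head = lower.tail = NULL;
		same.head = same.tail = NULL;
		while ((c = list_pop(&pending)) != NULL)
			list_add(c->dev == b->dev ? &same : &lower, c);
		list_merge(&pending, &lower);
		list_merge(&pending, &same);
		list_merge(&pending, &hold);
	}
	return done;
}

#ifdef DEMO_MAIN
int main(void)
{
	printf("%s\n", handle(10));
	return 0;
}
#endif
```

For a 10-sector bio the bottom device sees [0+4], then [4+4], then
[8+2]: each piece reaches the lower device before the remainder is split
again, although the remainder was submitted first.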

With this in place, it should be possible to disable the
punt_bios_to_recover() recovery thread for many block devices, and
eventually it may be possible to remove it completely.

Tested-by: Jinpu Wang 
Inspired-by: Lars Ellenberg 
Signed-off-by: NeilBrown 
---
 block/blk-core.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index b9e857f4afe8..ef55f210dd7c 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2018,10 +2018,32 @@ blk_qc_t generic_make_request(struct bio *bio)
struct request_queue *q = bdev_get_queue(bio->bi_bdev);
 
if (likely(blk_queue_enter(q, false) == 0)) {
+   struct bio_list hold;
+   struct bio_list lower, same;
+
+   /* Create a fresh bio_list for all subordinate requests */
+   bio_list_init(&hold);
+   bio_list_merge(&hold, &bio_list_on_stack);
+   bio_list_init(&bio_list_on_stack);
ret = q->make_request_fn(q, bio);
 
blk_queue_exit(q);
 
+   /* sort new bios into those for a lower level
+* and those for the same level
+*/
+   bio_list_init();
+   

[GIT PULL] Block fixes for 4.11-rc1

2017-03-02 Thread Jens Axboe
Hi Linus,

A collection of fixes for this merge window, either fixes for existing
issues, or parts that were waiting for acks to come in. This pull
request contains:

- Allocation of nvme queues on the right node from Shaohua. This was
  ready long before the merge window, but waiting on an ack from Bjorn
  on the PCI bit. Now that we have that, the three patches can go in.

- Two fixes for blk-mq-sched with nvmeof, which uses hctx specific
  request allocations. This caused an oops. One part from Sagi, one part
  from Omar.

- A loop partition scan deadlock fix from Omar, fixing a regression in
  this merge window.

- A 3 patch series from Keith, closing up a hole on clearing out
  requests on shutdown/resume.

- A stable fix for nbd from Josef, fixing a leak of sockets.

- Two fixes for a regression in this window from Jan, fixing a problem
  with one of his earlier patches dealing with queue vs bdi life times.

- A fix for a regression with virtio-blk, causing an IO stall if
  scheduling is used. From me.

- A fix for an io context lock ordering problem. From me.

Please pull!


  git://git.kernel.dk/linux-block.git for-linus



Jan Kara (2):
  block: Initialize bd_bdi on inode initialization
  block: Move bdi_unregister() to del_gendisk()

Jens Axboe (2):
  block: don't call ioc_exit_icq() with the queue lock held for blk-mq
  blk-mq: ensure that bd->last is always set correctly

Josef Bacik (1):
  nbd: stop leaking sockets

Keith Busch (3):
  blk-mq: Export blk_mq_freeze_queue_wait
  blk-mq: Provide freeze queue timeout
  nvme: Complete all stuck requests

Omar Sandoval (4):
  blk-mq: make blk_mq_alloc_request_hctx() allocate a scheduler request
  blk-mq: kill blk_mq_set_alloc_data()
  blk-mq: move update of tags->rqs to __blk_mq_alloc_request()
  loop: fix LO_FLAGS_PARTSCAN hang

Sagi Grimberg (1):
  blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

Shaohua Li (3):
  blk-mq: allocate blk_mq_tags and requests in correct node
  PCI: add an API to get node from vector
  nvme: allocate nvme_queue in correct node

 block/blk-core.c |   1 -
 block/blk-ioc.c  |  44 -
 block/blk-mq-sched.c |  16 +++
 block/blk-mq-tag.c   |   2 +-
 block/blk-mq-tag.h   |   6 +++
 block/blk-mq.c   | 120 ++-
 block/blk-mq.h   |  10 
 block/blk-sysfs.c|   2 -
 block/elevator.c |   2 -
 block/genhd.c|   5 ++
 drivers/block/loop.c |  15 +++---
 drivers/block/nbd.c  |   4 +-
 drivers/nvme/host/core.c |  47 +++
 drivers/nvme/host/nvme.h |   4 ++
 drivers/nvme/host/pci.c  |  45 ++
 drivers/pci/msi.c|  16 +++
 fs/block_dev.c   |   6 ++-
 include/linux/blk-mq.h   |   3 ++
 include/linux/pci.h  |   6 +++
 19 files changed, 265 insertions(+), 89 deletions(-)

-- 
Jens Axboe



[PATCH 3/4] x86, pci: Add interface to force mmconfig

2017-03-02 Thread Andi Kleen
From: Andi Kleen 

This fills in the pci_bus_force_mmconfig interface that was
added earlier for x86 to allow drivers to optimize config
space accesses. The implementation is straightforward
and uses the existing mmconfig access functions, just forcing
mmconfig access.

Signed-off-by: Andi Kleen 
---
 arch/x86/pci/mmconfig-shared.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index dd30b7e08bc2..bb56533290aa 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -816,3 +816,31 @@ int pci_mmconfig_delete(u16 seg, u8 start, u8 end)
 
return -ENOENT;
 }
+
+static int pci_mmconfig_read(struct pci_bus *bus, unsigned int devfn,
+int where, int size, u32 *value)
+{
+   return raw_pci_ext_ops->read(pci_domain_nr(bus), bus->number,
+devfn, where, size, value);
+}
+
+static int pci_mmconfig_write(struct pci_bus *bus, unsigned int devfn,
+ int where, int size, u32 value)
+{
+   return raw_pci_ext_ops->write(pci_domain_nr(bus), bus->number,
+ devfn, where, size, value);
+}
+
+struct pci_ops pci_mmconfig_ops = {
+   .read = pci_mmconfig_read,
+   .write = pci_mmconfig_write,
+};
+
+/* Force all config accesses to go through mmconfig. */
+int pci_bus_force_mmconfig(struct pci_bus *bus)
+{
+   if (!raw_pci_ext_ops)
+   return -1;
+   bus->ops = &pci_mmconfig_ops;
+   return 0;
+}
-- 
2.9.3
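
The patch above relies on swapping the per-bus ops table.  A user-space
mock of that pattern, for illustration only: mock_bus, mock_ops,
legacy_read(), mmcfg_read(), raw_ext_ops, mock_force_mmconfig() and
mock_read() are hypothetical stand-ins and not the kernel's pci_bus,
pci_ops or raw_pci_ext_ops, and the values returned by the two read
functions are just tags that show which accessor ran.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct mock_bus;

struct mock_ops {
	int (*read)(struct mock_bus *bus, unsigned int devfn,
		    int where, int size, uint32_t *value);
};

struct mock_bus {
	int number;
	const struct mock_ops *ops;
};

/* Stand-in for the default access method. */
static int legacy_read(struct mock_bus *bus, unsigned int devfn,
		       int where, int size, uint32_t *value)
{
	(void)bus; (void)devfn; (void)where; (void)size;
	*value = 0x10;
	return 0;
}

/* Stand-in for the mmconfig access method. */
static int mmcfg_read(struct mock_bus *bus, unsigned int devfn,
		      int where, int size, uint32_t *value)
{
	(void)bus; (void)devfn; (void)where; (void)size;
	*value = 0x20;
	return 0;
}

static const struct mock_ops legacy_ops = { .read = legacy_read };
static const struct mock_ops mmcfg_ops = { .read = mmcfg_read };

/* Mirrors the NULL check in the patch: NULL means mmconfig is unavailable. */
static const struct mock_ops *raw_ext_ops = &mmcfg_ops;

static int mock_force_mmconfig(struct mock_bus *bus)
{
	if (!raw_ext_ops)
		return -1;
	bus->ops = &mmcfg_ops;	/* all later accesses use mmconfig */
	return 0;
}

static uint32_t mock_read(struct mock_bus *bus)
{
	uint32_t v = 0;

	assert(bus->ops && bus->ops->read);
	bus->ops->read(bus, 0, 0, 4, &v);
	return v;
}

#ifdef DEMO_MAIN
int main(void)
{
	struct mock_bus bus = { .number = 0, .ops = &legacy_ops };

	printf("before: 0x%x\n", (unsigned int)mock_read(&bus));
	if (mock_force_mmconfig(&bus) == 0)
		printf("after:  0x%x\n", (unsigned int)mock_read(&bus));
	return 0;
}
#endif
```

Callers never name the access method; they go through bus->ops, so one
pointer assignment redirects every later config access on that bus.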



Re: [RFC 04/11] mm: remove SWAP_MLOCK check for SWAP_SUCCESS in ttu

2017-03-02 Thread Minchan Kim
On Thu, Mar 02, 2017 at 08:21:46PM +0530, Anshuman Khandual wrote:
> On 03/02/2017 12:09 PM, Minchan Kim wrote:
> > If the page is mapped and rescue in ttuo, page_mapcount(page) == 0 cannot
> 
> Nit: "ttuo" is very cryptic. Please expand it.

No problem.

> 
> > be true so page_mapcount check in ttu is enough to return SWAP_SUCCESS.
> > IOW, SWAP_MLOCK check is redundant so remove it.
> 
> Right, page_mapcount(page) should be enough to tell whether swapping
> out happened successfully or the page is still mapped in some page
> table.
> 

Thanks for the review, Anshuman!



Re: [PATCH v4 1/3] mmc: dt-bindings: update Mediatek MMC bindings

2017-03-02 Thread Yong Mao
On Fri, 2017-02-24 at 16:47 -0600, Rob Herring wrote:
> On Fri, Feb 24, 2017 at 3:59 AM, Yong Mao  wrote:
> > Dear Rob,
> >
> > Could you please help to make comments for this patch?
> > Thanks.
> 
> I already did comment. It's still wrong as Ulf commented. So fix and
> send a new version. It has to go to the DT list if you want to be in
> my queue.
> 
> Rob

After reviewing the history, we guess the comments from Ulf that you mention are the ones below.
  
"> +- mtk-hs200-cmd-int-delay: HS200 command internal delay setting.
> + The value is an integer from 0 to 31

Please change to:

mediatek,hs200-cmd-delay

... and if there is a unit, like ns or us, please add that a suffix.

> +- mtk-hs400-cmd-int-delay: HS400 command internal delay setting
> + The value is an integer from 0 to 31

mediatek,hs400-cmd-delay and add unit if applicable.

> +- mtk-hs400-cmd-resp-sel: HS400 command response sample selection
> + The value is an integer from 0 to 1

mediatek,hs400-cmd-resp-sel

And make it a boolean value instead!"

==> We already addressed this comment in v4.
We use "mediatek,hs200-cmd-int-delay" to replace "mtk-hs200-cmd-int-delay",
rather than "mediatek,hs200-cmd-delay", because "-int-" here means
internal, so we should not drop it.

This field has no unit; it simply has 32 stages in total.
We also changed the description in v4.

As for the comment about "mtk-hs400-cmd-resp-sel", we made it a boolean
value in v4 and renamed it to "mediatek,hs400-cmd-resp-rising".

Please point out what else we need to change.
Thanks.





Re: + mm-reclaim-madv_free-pages.patch added to -mm tree

2017-03-02 Thread Minchan Kim
Hi,

On Tue, Feb 28, 2017 at 04:32:38PM -0800, a...@linux-foundation.org wrote:
> 
> The patch titled
>  Subject: mm: reclaim MADV_FREE pages
> has been added to the -mm tree.  Its filename is
>  mm-reclaim-madv_free-pages.patch
> 
> This patch should soon appear at
> http://ozlabs.org/~akpm/mmots/broken-out/mm-reclaim-madv_free-pages.patch
> and later at
> http://ozlabs.org/~akpm/mmotm/broken-out/mm-reclaim-madv_free-pages.patch
> 
> Before you just go and hit "reply", please:
>a) Consider who else should be cc'ed
>b) Prefer to cc a suitable mailing list as well
>c) Ideally: find the original patch on the mailing list and do a
>   reply-to-all to that, adding suitable additional cc's
> 
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
> 
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
> 
> --
> From: Shaohua Li 
> Subject: mm: reclaim MADV_FREE pages
> 
> When memory pressure is high, we free MADV_FREE pages.  If the pages are
> not dirty in pte, the pages could be freed immediately.  Otherwise we
> can't reclaim them.  We put the pages back to anonymous LRU list (by
> setting SwapBacked flag) and the pages will be reclaimed in normal swapout
> way.
> 
> We use normal page reclaim policy.  Since MADV_FREE pages are put into
> inactive file list, such pages and inactive file pages are reclaimed
> according to their age.  This is expected, because we don't want to
> reclaim too many MADV_FREE pages before used once pages.
> 
> Based on Minchan's original patch
> 
> Link: 
> http://lkml.kernel.org/r/14b8eb1d3f6bf6cc492833f183ac8c304e560484.1487965799.git.s...@fb.com
> Signed-off-by: Shaohua Li 
> Acked-by: Minchan Kim 
> Acked-by: Michal Hocko 
> Acked-by: Johannes Weiner 
> Acked-by: Hillf Danton 
> Cc: Hugh Dickins 
> Cc: Rik van Riel 
> Cc: Mel Gorman 
> Signed-off-by: Andrew Morton 
> ---

< snip >

> @@ -1419,11 +1413,21 @@ static int try_to_unmap_one(struct page
>   VM_BUG_ON_PAGE(!PageSwapCache(page) && 
> PageSwapBacked(page),
>   page);
>  
> - if (!PageDirty(page)) {
> + /*
> +  * swapin page could be clean, it has data stored in
> +  * swap. We can't silently discard it without setting
> +  * swap entry in the page table.
> +  */
> + if (!PageDirty(page) && !PageSwapCache(page)) {
>   /* It's a freeable page by MADV_FREE */
>   dec_mm_counter(mm, MM_ANONPAGES);
> - rp->lazyfreed++;
>   goto discard;
> + } else if (!PageSwapBacked(page)) {
> + /* dirty MADV_FREE page */
> + set_pte_at(mm, address, pvmw.pte, pteval);
> + ret = SWAP_DIRTY;
> + page_vma_mapped_walk_done();
> + break;
>   }

There is no point in complicating this logic with the clean swapin-page case.
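
For reference, the branch structure of the quoted hunk can be written
out as a pure function of the three page flags it tests.  This is only a
sketch for reading the hunk, not kernel code: enum outcome and
quoted_hunk() are hypothetical names, and FALL_THROUGH just means
"neither branch taken" (what happens after that is outside the quoted
lines).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

enum outcome { DISCARD, RET_SWAP_DIRTY, FALL_THROUGH };

static enum outcome quoted_hunk(bool dirty, bool swapcache, bool swapbacked)
{
	/* models VM_BUG_ON_PAGE(!PageSwapCache(page) && PageSwapBacked(page)) */
	assert(!(!swapcache && swapbacked));

	if (!dirty && !swapcache)
		return DISCARD;		/* "goto discard" */
	else if (!swapbacked)
		return RET_SWAP_DIRTY;	/* pte restored, ret = SWAP_DIRTY */
	return FALL_THROUGH;		/* neither branch taken */
}

#ifdef DEMO_MAIN
int main(void)
{
	static const char *names[] = {
		"discard", "SWAP_DIRTY", "fall through"
	};
	int d, c, b;

	for (d = 0; d < 2; d++)
		for (c = 0; c < 2; c++)
			for (b = 0; b < 2; b++) {
				if (!c && b)
					continue;	/* excluded by the VM_BUG_ON */
				printf("dirty=%d swapcache=%d swapbacked=%d -> %s\n",
				       d, c, b, names[quoted_hunk(d, c, b)]);
			}
	return 0;
}
#endif
```

Built with -DDEMO_MAIN, this prints the outcome for every flag
combination the VM_BUG_ON allows, which makes it easier to see how the
swap-cache test interacts with the MADV_FREE (!PageSwapBacked) test.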

Andrew,
Could you fold the patch below into mm-reclaim-madv_free-pages.patch
if nobody objects?

Thanks.

From 0c28f6560fbc4e65da4f4a8cc4664ab9f7b11cf3 Mon Sep 17 00:00:00 2001
From: Minchan Kim 
Date: Fri, 3 Mar 2017 11:42:52 +0900
Subject: [PATCH] mm: clean up lazyfree page handling

We can make this simpler to understand, without needing to be aware of
clean swapin pages.
This patch just cleans up lazyfree page handling in try_to_unmap_one().

Signed-off-by: Minchan Kim 
---
 mm/rmap.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index bb45712..f7eab40 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1413,17 +1413,17 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
VM_BUG_ON_PAGE(!PageSwapCache(page) && PageSwapBacked(page),
page);
 
-   /*
-* swapin page could be clean, it has data stored in
-* swap. We can't silently discard it without setting
-* swap entry in the page table.
-*/
-   if (!PageDirty(page) && !PageSwapCache(page)) {
-   /* It's a freeable page by MADV_FREE */
-   dec_mm_counter(mm, MM_ANONPAGES);
-   goto discard;
-   } else if (!PageSwapBacked(page)) {
-   /* dirty MADV_FREE page */
+   /* MADV_FREE page check */
+   if (!PageSwapBacked(page)) {
+   if (!PageDirty(page)) {
+   

Re: [RFC 02/11] mm: remove unncessary ret in page_referenced

2017-03-02 Thread Minchan Kim
On Thu, Mar 02, 2017 at 08:03:16PM +0530, Anshuman Khandual wrote:
> On 03/02/2017 12:09 PM, Minchan Kim wrote:
> > Anyone doesn't use ret variable. Remove it.
> > 
> 
> This change is correct. But not sure how this is related to
> try_to_unmap() clean up though.

An upcoming patch in this patchset makes rmap_walk() a void function,
so this is a preparation step for it.

> 
> 
> > Signed-off-by: Minchan Kim 
> > ---
> >  mm/rmap.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index bb45712..8076347 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -805,7 +805,6 @@ int page_referenced(struct page *page,
> > struct mem_cgroup *memcg,
> > unsigned long *vm_flags)
> >  {
> > -   int ret;
> > int we_locked = 0;
> > struct page_referenced_arg pra = {
> > .mapcount = total_mapcount(page),
> > @@ -839,7 +838,7 @@ int page_referenced(struct page *page,
> > rwc.invalid_vma = invalid_page_referenced_vma;
> > }
> >  
> > -   ret = rmap_walk(page, &rwc);
> > +   rmap_walk(page, &rwc);
> > *vm_flags = pra.vm_flags;
> >  
> > if (we_locked)
> > 
> 


[PATCH 4/4] staging: speakup: Alignment should match open parenthesis

2017-03-02 Thread Arushi Singhal
Fix checkpatch issues: "CHECK: Alignment should match open parenthesis".

Signed-off-by: Arushi Singhal 
---
 drivers/staging/speakup/kobjects.c   | 16 
 drivers/staging/speakup/main.c   |  2 +-
 drivers/staging/speakup/selection.c  |  2 +-
 drivers/staging/speakup/serialio.c   |  2 +-
 drivers/staging/speakup/speakup_acntpc.c |  6 +++---
 drivers/staging/speakup/speakup_apollo.c |  2 +-
 drivers/staging/speakup/speakup_decext.c |  4 ++--
 7 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/speakup/kobjects.c 
b/drivers/staging/speakup/kobjects.c
index 8c93188b832c..edde9e68779e 100644
--- a/drivers/staging/speakup/kobjects.c
+++ b/drivers/staging/speakup/kobjects.c
@@ -662,9 +662,9 @@ ssize_t spk_var_store(struct kobject *kobj, struct 
kobj_attribute *attr,
var_data = param->data;
value = var_data->u.n.value;
spk_reset_default_value("pitch", synth->default_pitch,
-   value);
+   value);
spk_reset_default_value("vol", synth->default_vol,
-   value);
+   value);
}
break;
case VAR_STRING:
@@ -679,7 +679,7 @@ ssize_t spk_var_store(struct kobject *kobj, struct 
kobj_attribute *attr,
ret = spk_set_string_var(cp, param, len);
if (ret == -E2BIG)
pr_warn("value too long for %s\n",
-   param->name);
+   param->name);
break;
default:
pr_warn("%s unknown type %d\n",
@@ -699,7 +699,7 @@ EXPORT_SYMBOL_GPL(spk_var_store);
  */
 
 static ssize_t message_show_helper(char *buf, enum msg_index_t first,
-   enum msg_index_t last)
+   enum msg_index_t last)
 {
size_t bufsize = PAGE_SIZE;
char *buf_pointer = buf;
@@ -712,7 +712,7 @@ static ssize_t message_show_helper(char *buf, enum 
msg_index_t first,
if (bufsize <= 1)
break;
printed = scnprintf(buf_pointer, bufsize, "%d\t%s\n",
-   index, spk_msg_get(cursor));
+   index, spk_msg_get(cursor));
buf_pointer += printed;
bufsize -= printed;
}
@@ -721,7 +721,7 @@ static ssize_t message_show_helper(char *buf, enum 
msg_index_t first,
 }
 
 static void report_msg_status(int reset, int received, int used,
-   int rejected, char *groupname)
+  int rejected, char *groupname)
 {
int len;
char buf[160];
@@ -742,7 +742,7 @@ static void report_msg_status(int reset, int received, int 
used,
 }
 
 static ssize_t message_store_helper(const char *buf, size_t count,
-   struct msg_group_t *group)
+struct msg_group_t *group)
 {
char *cp = (char *) buf;
char *end = cp + count;
@@ -843,7 +843,7 @@ static ssize_t message_show(struct kobject *kobj,
 }
 
 static ssize_t message_store(struct kobject *kobj, struct kobj_attribute *attr,
-   const char *buf, size_t count)
+ const char *buf, size_t count)
 {
struct msg_group_t *group = spk_find_msg_group(attr->attr.name);
 
diff --git a/drivers/staging/speakup/main.c b/drivers/staging/speakup/main.c
index 25acebb9311f..01eabc19039c 100644
--- a/drivers/staging/speakup/main.c
+++ b/drivers/staging/speakup/main.c
@@ -1140,7 +1140,7 @@ static void spkup_write(const char *in_buf, int count)
if (last_type & CH_RPT) {
synth_printf(" ");
synth_printf(spk_msg_get(MSG_REPEAT_DESC2),
-   ++rep_count);
+++rep_count);
synth_printf(" ");
}
rep_count = 0;
diff --git a/drivers/staging/speakup/selection.c 
b/drivers/staging/speakup/selection.c
index afd9a446a06f..3d15eec37163 100644
--- a/drivers/staging/speakup/selection.c
+++ b/drivers/staging/speakup/selection.c
@@ -75,7 +75,7 @@ int speakup_set_selection(struct tty_struct *tty)
speakup_clear_selection();
spk_sel_cons = vc_cons[fg_console].d;
dev_warn(tty->dev,
-   "Selection: mark console not the same as cut\n");
+"Selection: mark console not the same as cut\n");
return -EINVAL;
}
 
diff --git a/drivers/staging/speakup/serialio.c 
b/drivers/staging/speakup/serialio.c
index aade52ee15a0..7e6bc3b05da3 100644
--- a/drivers/staging/speakup/serialio.c
+++ b/drivers/staging/speakup/serialio.c
@@ -118,7 +118,7 @@ static void start_serial_interrupt(int irq)


Re: [PATCH net] rxrpc: Fix potential NULL-pointer exception

2017-03-02 Thread David Howells
David Howells  wrote:

> Fix a potential NULL-pointer exception in rxrpc_do_sendmsg().  The call
> state check that I added should have gone into the else-body of the
> if-statement where we actually have a call to check.
> 
> Found by CoverityScan CID#1414316 ("Dereference after null check").
> 
> Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and 
> sendmsg/recvmsg")
> Reported-by: Colin Ian King 
> Signed-off-by: David Howells 

Please ignore this - there's another patch interposed that I haven't sent
upstream yet.  Will rebase on net/master.

David


Re: [PATCH] mmc: core: fix changing bus witdh in hs400es mode

2017-03-02 Thread Shawn Lin

Hi Piotr,

On 2017/3/2 21:47, Piotr Sroka wrote:
> Fix the code to avoid changing bus width if HS400ES mode is selected.

Thanks for catching this, but Guenter posted a fix[1] already. :)

[1]: https://patchwork.kernel.org/patch/9599261/

> Signed-off-by: Piotr Sroka 
> ---
>  drivers/mmc/core/mmc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
> index 7fd7228..c7d9c9f 100644
> --- a/drivers/mmc/core/mmc.c
> +++ b/drivers/mmc/core/mmc.c
> @@ -1730,7 +1730,7 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
> err = mmc_select_hs400(card);
> if (err)
> goto free_card;
> -   } else {
> +   } else if (!mmc_card_hs400(card)) {
> /* Select the desired bus width optionally */
> err = mmc_select_bus_width(card);
> if (err > 0 && mmc_card_hs(card)) {
>

--
Best Regards
Shawn Lin



Re: [PATCH] module: set __jump_table alignment to 8

2017-03-02 Thread Michael Ellerman
Steven Rostedt  writes:
> On Thu, 02 Mar 2017 22:18:30 +1100
> Michael Ellerman  wrote:
>> Michael Ellerman  writes:
>> > David Daney  writes:  
>> >> Strict alignment became necessary with commit 3821fd35b58d
>> >> ("jump_label: Reduce the size of struct static_key"), currently in
>> >> linux-next, which uses the two least significant bits of pointers to
>> >> __jump_table elements.  
>> >
>> > It would obviously be nice if this could go in before the commit that
>> > exposes the breakage, but I guess that's problematic because Steve
>> > doesn't want to rebase the tracing tree.
>> >
>> > Steve I think you've already sent your pull request for this cycle? So I
>> > guess if this can go in your first batch of fixes?  
>> 
>> Ugh. Was looking at the wrong tree - Linus has already merged the commit
>> in question, so the above is all moot.
>
> No problem. I've got some other "fixes" to push to Linus. That's what
> the -rc releases are for. To fix up breakage from the merge window ;-)

Yep, no drama.

> I'll pull this into my tree.

Thanks.

cheers
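
To illustrate why the alignment matters in the discussion above: when code keeps flag bits in the two least significant bits of a pointer, the pointed-to object has to be at least 4-byte aligned so that those bits start out as zero. The following is only a minimal userspace sketch of that idea. The names `struct entry`, `tag_ptr`, `untag_ptr` and `ptr_tag` are hypothetical and are not the kernel's static_key code.

```c
#include <stdint.h>

#define TAG_MASK ((uintptr_t)3)	/* the two least significant bits */

/* Hypothetical stand-in for a jump table entry, forced to 8-byte alignment. */
struct entry {
	long code;
	long target;
	long key;
} __attribute__((aligned(8)));

/*
 * Pack a 2-bit tag into the low bits of a pointer.
 * Returns 0 if the pointer is not aligned enough or the tag does not fit.
 */
static uintptr_t tag_ptr(const void *p, unsigned int tag)
{
	uintptr_t v = (uintptr_t)p;

	if ((v & TAG_MASK) || tag > TAG_MASK)
		return 0;
	return v | tag;
}

/* Recover the original pointer by masking the tag bits off. */
static void *untag_ptr(uintptr_t v)
{
	return (void *)(v & ~TAG_MASK);
}

/* Extract the 2-bit tag. */
static unsigned int ptr_tag(uintptr_t v)
{
	return (unsigned int)(v & TAG_MASK);
}
```

With an under-aligned table, `tag_ptr()` in this sketch refuses the pointer; code that skips such a check would silently corrupt either the tag or the address, which is the kind of breakage an explicit section alignment avoids.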


[PATCH] staging: rtl8192u: fix spacing around if statements

2017-03-02 Thread Robin Krahl
Corrects the spacing around two if statements to fix these checkpatch.pl
errors:

ERROR: space required before the open brace '{'
ERROR: space prohibited after that open parenthesis '('

Signed-off-by: Robin Krahl 
---
 drivers/staging/rtl8192u/r8192U_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/rtl8192u/r8192U_core.c 
b/drivers/staging/rtl8192u/r8192U_core.c
index b631990b4969..b61ffa35579b 100644
--- a/drivers/staging/rtl8192u/r8192U_core.c
+++ b/drivers/staging/rtl8192u/r8192U_core.c
@@ -269,7 +269,7 @@ int write_nic_byte_E(struct net_device *dev, int indx, u8 
data)
 indx | 0xfe00, 0, usbdata, 1, HZ / 2);
kfree(usbdata);
 
-   if (status < 0){
+   if (status < 0) {
netdev_err(dev, "write_nic_byte_E TimeOut! status: %d\n",
   status);
return status;
@@ -2519,7 +2519,7 @@ static int rtl8192_read_eeprom_info(struct net_device 
*dev)
for (i = 0; i < 3; i++) {
if (bLoad_From_EEPOM) {
ret = eprom_read(dev, 
(EEPROM_TxPwIndex_OFDM_24G + i) >> 1);
-   if ( ret < 0)
+   if (ret < 0)
return ret;
if (((EEPROM_TxPwIndex_OFDM_24G + i) % 
2) == 0)
tmpValue = (u16)ret & 0x00ff;
-- 
2.12.0



Re: [PATCH v2] hlist_add_tail_rcu disable sparse warning

2017-03-02 Thread Paul E. McKenney
On Mon, Feb 27, 2017 at 09:14:19PM +0200, Michael S. Tsirkin wrote:
> sparse is unhappy about this code in hlist_add_tail_rcu:
> 
> struct hlist_node *i, *last = NULL;
> 
> for (i = hlist_first_rcu(h); i; i = hlist_next_rcu(i))
> last = i;
> 
> This is because hlist_first_rcu and hlist_next_rcu return
> __rcu pointers.
> 
> It's a false positive - it's a write side primitive and so
> does not need to be called in a read side critical section.
> 
> The following trivial patch disables the warning
> without changing the behaviour in any way.
> 
> Note: __hlist_for_each_rcu would also remove the warning but it would be
> confusing since it calls rcu_dereference and is designed to run in the rcu
> read side critical section.
> 
> Signed-off-by: Michael S. Tsirkin 
> Reviewed-by: Steven Rostedt (VMware) 

Queued for further review and testing, thank you both!

Thanx, Paul

> ---
> 
> Comments from v2:
>   add a comment as requested by Steven Rostedt
> 
>  include/linux/rculist.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> index 4f7a956..b1fd8bf 100644
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -509,7 +509,8 @@ static inline void hlist_add_tail_rcu(struct hlist_node 
> *n,
>  {
>   struct hlist_node *i, *last = NULL;
> 
> - for (i = hlist_first_rcu(h); i; i = hlist_next_rcu(i))
> + /* Note: write side code, so rcu accessors are not needed. */
> + for (i = h->first; i; i = i->next)
>   last = i;
> 
>   if (last) {
> -- 
> MST
> 
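
The patched loop can be tried outside the kernel. Below is a minimal userspace sketch, assuming simplified stand-ins for `struct hlist_node` and `struct hlist_head`; the helpers `hlist_last` and `hlist_add_tail_plain` are hypothetical names for illustration. It walks the list with plain pointer loads, as the new loop does, and then links the node without `rcu_assign_pointer()`, so it is not safe against concurrent readers.

```c
#include <stddef.h>

/* Minimal userspace stand-ins for the kernel's hlist types. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Walk with plain loads, like the patched loop; NULL for an empty list. */
static struct hlist_node *hlist_last(struct hlist_head *h)
{
	struct hlist_node *i, *last = NULL;

	for (i = h->first; i; i = i->next)
		last = i;
	return last;
}

/* Non-RCU sketch of tail insertion (no publication barrier). */
static void hlist_add_tail_plain(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *last = hlist_last(h);

	n->next = NULL;
	if (last) {
		n->pprev = &last->next;
		last->next = n;
	} else {
		n->pprev = &h->first;
		h->first = n;
	}
}
```

The walk itself needs no RCU accessors because, on the update side, the caller is expected to hold whatever lock serializes writers, so the pointers cannot change underneath it.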



Re: [PATCH v2 1/3] perf annotate: Get correct line numbers matched with addr

2017-03-02 Thread Taeung Song



On 03/03/2017 11:40 AM, Namhyung Kim wrote:
> + Andi Kleen who wrote the code.
>
> On Thu, Mar 02, 2017 at 03:05:14PM +0900, Taeung Song wrote:
>> On 03/01/2017 10:17 PM, Namhyung Kim wrote:
>>> Hi Taeung,
>>>
>>> On Wed, Mar 01, 2017 at 04:59:51AM +0900, Taeung Song wrote:
>>>> Currently perf-annotate show wrong line numbers.
>>>>
>>>> For example,
>>>> Actual source code is as below
>>>>
>>>> ...
>>>>   21 };
>>>>   22
>>>>   23 unsigned int limited_wgt;
>>>>   24
>>>>   25 unsigned int get_cond_maxprice(int wgt)
>>>>   26 {
>>>> ...
>>>>
>>>> However, the output of perf-annotate is as below.
>>>>
>>>>   4   Disassembly of section .text:
>>>>
>>>>   6   00400966 <get_cond_maxprice>:
>>>>   7   get_cond_maxprice():
>>>>   26  };
>>>>
>>>>   28  unsigned int limited_wgt;
>>>>
>>>>   30  unsigned int get_cond_maxprice(int wgt)
>>>>   31  {
>>>>
>>>> The cause is the wrong way counting line numbers
>>>> in symbol__parse_objdump_line().
>>>> So remove wrong current code counting line number and
>>>> use other method for it using functions related to addr2line
>>>> instead of the output of '-l' of objdump.
>>>
>>> Hmm.. do you think it's a bug of objdump or it's perf failing to parse
>>> the line number correctly?  I'd like to see the output of `objdump -l`
>>
>> Both are ok.
>> 'objdump -l' hasn't a bug related to line number
>> and perf's method parsing the line number is ok.
>>
>> But symbol__parse_objdump_line() wrongly count line numbers
>> after parsing it as below.
>>
>> 1172 /* /filename:linenr ? Save line number and ignore. */
>> 1173 if (regexec(&file_lineno, line, 2, match, 0) == 0) {
>> 1174 *line_nr = atoi(line + match[1].rm_so);
>> 1175 return 0;
>> 1176 }
>> ...
>> 1208 dl = disasm_line__new(offset, parsed_line, privsize, *line_nr,
>> arch, map);
>> 1209 free(line);
>> 1210 (*line_nr)++;
>>
>> Increasing line_nr each asm line is wrong method.
>> Because 'line_nr' means actual source code line number.
>
> Hmm.. ok.  It looks like that it should reuse the old line_nr as is.
>
>> Sure, I can fix only the wrong counting way.
>> But the above parsing method(1172~1176) is never used because of 'grep -v'
>> in command as below.
>> (the grep already remove lines containing filename:linenr of output)
>
> Right, but only if filename is same as binary name.
>
>> 1435 snprintf(command, sizeof(command),
>> 1436  "%s %s%s --start-address=0x%016" PRIx64
>> 1437  " --stop-address=0x%016" PRIx64
>> 1438  " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand",
>> 1439  objdump_path ? objdump_path : "objdump",
>> 1440  disassembler_style ? "-M " : "",
>> 1441  disassembler_style ? disassembler_style : "",
>> 1442  map__rip_2objdump(map, sym->start),
>> 1443  map__rip_2objdump(map, sym->end),
>> 1444  symbol_conf.annotate_asm_raw ? "" : "--no-show-raw",
>> 1445  symbol_conf.annotate_src ? "-S" : "",
>> 1446  symfs_filename, symfs_filename);
>>
>> Therefore, I think it is better to do three things
>>
>>   1) fix the wrong counting line number problem
>>   2) remove unused the line number parsing method
>>   3) In addition, a bit reduce objdump dependency
>>  using functions related to addr2line of perf.
>>
>> What do you think about that ?
>> Is it bad idea ?
>
> I think we need to fix 1) definitely, but not sure about 2) and 3).
> If objdump could do all necessary works, why not use it? :)
>
> Thanks,
> Namhyung

Okay! I'll concentrate on fixing 1),
not removing objdump -l :)

Thanks,
Taeung
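
The counting problem discussed in this thread can be shown with a small userspace sketch. This is not perf's code: `parse_lineno_marker` and `annotate_lines` are hypothetical helpers, the input is assumed to be already split into lines without trailing newlines, and a marker is assumed to have the simplified form "/path/file.c:123" (suffixes such as discriminators are not handled). The point is only that the source line number is updated when a marker is seen and is otherwise reused as is, never incremented per asm line.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * If 'line' looks like "/path/file.c:123", store 123 in *line_nr and
 * return 1 (marker consumed). Otherwise return 0 and leave *line_nr
 * untouched, so following asm lines keep the same source line.
 */
static int parse_lineno_marker(const char *line, int *line_nr)
{
	const char *colon = strrchr(line, ':');
	const char *p;

	if (line[0] != '/' || !colon || !isdigit((unsigned char)colon[1]))
		return 0;
	for (p = colon + 1; *p; p++)
		if (!isdigit((unsigned char)*p))
			return 0;
	*line_nr = atoi(colon + 1);
	return 1;
}

/*
 * Tag every non-marker line with the current source line number.
 * Returns the number of asm lines written to out_nr[].
 */
static int annotate_lines(const char **lines, int n, int *out_nr)
{
	int line_nr = 0, count = 0, i;

	for (i = 0; i < n; i++) {
		if (parse_lineno_marker(lines[i], &line_nr))
			continue;
		out_nr[count++] = line_nr;	/* note: no line_nr++ here */
	}
	return count;
}
```

Fed a marker for line 25 followed by two asm lines, this sketch reports 25 for both; with a per-line increment the second would be reported as 26, which matches the drift shown in the example output quoted above.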


Re: [PATCH 3/3] gpu: drm: drivers: Convert printk(KERN_ to pr_

2017-03-02 Thread Tomi Valkeinen
On 28/02/17 14:55, Joe Perches wrote:
> Use a more common logging style.
> 
> Miscellanea:
> 
> o Coalesce formats and realign arguments
> o Neaten a few macros now using pr_
> 
> Signed-off-by: Joe Perches 

For omap:

Acked-by: Tomi Valkeinen 

 Tomi





Re: [RFC 01/11] mm: use SWAP_SUCCESS instead of 0

2017-03-02 Thread Minchan Kim
On Thu, Mar 02, 2017 at 07:57:10PM +0530, Anshuman Khandual wrote:
> On 03/02/2017 12:09 PM, Minchan Kim wrote:
> > SWAP_SUCCESS defined value 0 can be changed always so don't rely on
> > it. Instead, use explicit macro.
> 
> Right. But should not we move the changes to the callers last in the
> patch series after doing the cleanup to the try_to_unmap() function
> as intended first.

I don't understand what you are pointing out. Could you elaborate it
a bit?

Thanks.

> 
> > Cc: Kirill A. Shutemov 
> > Signed-off-by: Minchan Kim 
> > ---
> >  mm/huge_memory.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 092cc5c..fe2ccd4 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -2114,7 +2114,7 @@ static void freeze_page(struct page *page)
> > ttu_flags |= TTU_MIGRATION;
> >  
> > ret = try_to_unmap(page, ttu_flags);
> > -   VM_BUG_ON_PAGE(ret, page);
> > +   VM_BUG_ON_PAGE(ret != SWAP_SUCCESS, page);
> >  }
> >  
> >  static void unfreeze_page(struct page *page)
> > 
> 


Re: [PATCH 1/3] cpufreq: schedutil: move cached_raw_freq to struct sugov_policy

2017-03-02 Thread Viresh Kumar
On 02-03-17, 23:05, Rafael J. Wysocki wrote:
> On Thursday, March 02, 2017 02:03:20 PM Viresh Kumar wrote:
> > cached_raw_freq applies to the entire cpufreq policy and not individual
> > CPUs. Apart from wasting per-cpu memory, it is actually wrong to keep it
> > in struct sugov_cpu as we may end up comparing next_freq with a stale
> > cached_raw_freq of a random CPU.
> > 
> > Move cached_raw_freq to struct sugov_policy.
> > 
> > Signed-off-by: Viresh Kumar 
> 
> Any chance for a Fixes: tag?

Fixes: 5cbea46984d6 ("cpufreq: schedutil: map raw required frequency
to driver frequency")

Sorry to miss that in the first place.

-- 
viresh
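
The reasoning in the quoted commit message can be sketched in a few lines of userspace C. Everything here is a simplified assumption for illustration: `struct policy_state` and `struct cpu_state` stand in for struct sugov_policy and struct sugov_cpu, `resolve_freq` is a made-up driver lookup that rounds up to a 100 MHz step, and `resolve_calls` exists only to make the caching visible. The cached raw frequency sits next to the `next_freq` it produced, so the two cannot disagree depending on which CPU runs the update.

```c
/* Simplified, hypothetical stand-in for the per-policy governor data. */
struct policy_state {
	unsigned int next_freq;		/* last frequency chosen for the policy */
	unsigned int cached_raw_freq;	/* raw frequency that produced next_freq */
	unsigned int resolve_calls;	/* instrumentation for this sketch only */
};

/* Simplified, hypothetical stand-in for the per-CPU governor data. */
struct cpu_state {
	struct policy_state *policy;	/* all CPUs of a policy share one state */
};

/* Made-up driver lookup: round up to the next 100 MHz step (values in kHz). */
static unsigned int resolve_freq(struct policy_state *p, unsigned int raw)
{
	p->resolve_calls++;
	return (raw + 99999) / 100000 * 100000;
}

static unsigned int get_next_freq(struct cpu_state *cpu, unsigned int raw)
{
	struct policy_state *p = cpu->policy;

	/* Cache hit: same raw request as the one behind next_freq. */
	if (raw == p->cached_raw_freq && p->next_freq != 0)
		return p->next_freq;

	p->cached_raw_freq = raw;
	p->next_freq = resolve_freq(p, raw);
	return p->next_freq;
}
```

If `cached_raw_freq` lived in `struct cpu_state` instead, a second CPU could compare its request against a value cached long ago on that CPU and return a `next_freq` that was computed from a different raw frequency.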


Re: [PATCH v17 2/3] usb: USB Type-C connector class

2017-03-02 Thread Guenter Roeck

On 03/02/2017 07:22 AM, Mats Karrman wrote:

Hi Heikki,

Good to see things are happening with Type-C!

On 2017-02-21 15:24, Heikki Krogerus wrote:


...
+When connected, the partner will be presented also as its own device under
+/sys/class/typec/. The parent of the partner device will always be the port it
+is attached to. The partner attached to port "port0" will be named
+"port0-partner". Full path to the device would be
+/sys/class/typec/port0/port0-partner/.


A "/port0" too much?


+
+The cable and the two plugs on it may also be optionally presented as their own
+devices under /sys/class/typec/. The cable attached to the port "port0" port
+will be named port0-cable and the plug on the SOP Prime end (see USB Power
+Delivery Specification ch. 2.4) will be named "port0-plug0" and on the SOP
+Double Prime end "port0-plug1". The parent of a cable will always be the port,
+and the parent of the cable plugs will always be the cable.
+
+If the port, partner or cable plug support Alternate Modes, every supported
+Alternate Mode SVID will have their own device describing them. The Alternate
+Modes will not be attached to the typec class. The parent of an alternate mode
+will be the device that supports it, so for example an alternate mode of
+port0-partner will bees presented under /sys/class/typec/port0-partner/. Every


bees?


+mode that is supported will have its own group under the Alternate Mode device
+named "mode", for example /sys/class/typec/port0//mode1/.
+The requests for entering/exiting a mode can be done with "active" attribute
+file in that group.
+
...


I'm hoping to find time to upgrade the kernel and try these patches in my
system.

Looking forward, one thing I have run into is how to connect the typec driver
with a driver for an alternate mode. E.g. the DisplayPort Alternate Mode
specification includes the HPD (hot plug) and HPD-INT (hot plug interrupt)
signals as bits in the Attention message. These signals are needed by the
DisplayPort driver to know when to start negotiation etc.
Have you got any thoughts on how to standardize such interfaces?



That really depends on the lower level driver. For Chromebooks, where the Type-C
Protocol Manager runs on the EC, we have an extcon driver which reports the pin
states to the graphics drivers and connects to the Type-C class code using the
Type-C class API. I still need to update, re-test, and publish that code. The
published code in https://chromium.googlesource.com/chromiumos/third_party/kernel/,
branch chromeos-4.4, shows how it can be done, though that code currently still
uses the Android Type-C infrastructure.

Guenter



[PATCH] objtool: fix another gcc jump table detection issue

2017-03-02 Thread Josh Poimboeuf

Arnd Bergmann reported a (false positive) objtool warning:

  drivers/infiniband/sw/rxe/rxe_resp.o: warning: objtool: rxe_responder()+0xfe: sibling call from callable instruction with changed frame pointer

The issue is in find_switch_table().  It tries to find a switch
statement's jump table by walking backwards from an indirect jump
instruction, looking for a relocation to the .rodata section.  In this
case it stopped walking prematurely: the first .rodata relocation it
encountered was for a variable (resp_state_name) instead of a jump
table, so it just assumed there wasn't a jump table.

The fix is to ignore any .rodata relocation which refers to an ELF
object symbol.  This works because the jump tables are anonymous and
have no symbols associated with them.

Reported-by: Arnd Bergmann 
Fixes: 3732710ff6f2 ("objtool: Improve rare switch jump table pattern detection")
Signed-off-by: Josh Poimboeuf 
---
 tools/objtool/builtin-check.c | 15 ---
 tools/objtool/elf.c   | 12 
 tools/objtool/elf.h   |  1 +
 3 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 5fc52ee..c2a8518 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -805,11 +805,20 @@ static struct rela *find_switch_table(struct objtool_file *file,
 insn->jump_dest->offset > orig_insn->offset))
break;
 
+   /* look for a relocation which references .rodata */
text_rela = find_rela_by_dest_range(insn->sec, insn->offset,
insn->len);
-   if (text_rela && text_rela->sym == file->rodata->sym)
-   return find_rela_by_dest(file->rodata,
-text_rela->addend);
+   if (!text_rela || text_rela->sym != file->rodata->sym)
+   continue;
+
+   /*
+* Make sure the .rodata address isn't associated with a
+* symbol.  gcc jump tables are anonymous data.
+*/
+   if (find_symbol_containing(file->rodata, text_rela->addend))
+   continue;
+
+   return find_rela_by_dest(file->rodata, text_rela->addend);
}
 
return NULL;
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 0d7983a..d897702 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -85,6 +85,18 @@ struct symbol *find_symbol_by_offset(struct section *sec, unsigned long offset)
return NULL;
 }
 
+struct symbol *find_symbol_containing(struct section *sec, unsigned long offset)
+{
+   struct symbol *sym;
+
+   list_for_each_entry(sym, &sec->symbol_list, list)
+   if (sym->type != STT_SECTION &&
+   offset >= sym->offset && offset < sym->offset + sym->len)
+   return sym;
+
+   return NULL;
+}
+
 struct rela *find_rela_by_dest_range(struct section *sec, unsigned long offset,
 unsigned int len)
 {
diff --git a/tools/objtool/elf.h b/tools/objtool/elf.h
index aa1ff65..731973e 100644
--- a/tools/objtool/elf.h
+++ b/tools/objtool/elf.h
@@ -79,6 +79,7 @@ struct elf {
 struct elf *elf_open(const char *name);
 struct section *find_section_by_name(struct elf *elf, const char *name);
 struct symbol *find_symbol_by_offset(struct section *sec, unsigned long offset);
+struct symbol *find_symbol_containing(struct section *sec, unsigned long offset);
 struct rela *find_rela_by_dest(struct section *sec, unsigned long offset);
 struct rela *find_rela_by_dest_range(struct section *sec, unsigned long offset,
 unsigned int len);
-- 
2.7.4
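
The check the patch adds can be tried outside objtool with a small standalone sketch. The types and names below (symbol_sketch, find_symbol_containing_sketch(), is_jump_table_candidate()) are simplified stand-ins and not objtool's own structures; an array replaces the section's symbol list. The logic is the one described in the commit message: a .rodata address covered by any non-section symbol is a named object, so it is skipped as a jump table candidate.

```c
#include <stddef.h>

/* Simplified stand-ins for objtool's symbol types (assumed names). */
enum sym_type_sketch { SYM_SECTION, SYM_OBJECT };

struct symbol_sketch {
	enum sym_type_sketch type;
	unsigned long offset; /* start of the symbol within the section */
	unsigned long len;    /* size of the symbol in bytes */
};

/*
 * Return the first non-section symbol whose range
 * [offset, offset + len) covers 'offset', or NULL if none does.
 */
static const struct symbol_sketch *
find_symbol_containing_sketch(const struct symbol_sketch *syms, size_t count,
			      unsigned long offset)
{
	size_t i;

	for (i = 0; i < count; i++) {
		const struct symbol_sketch *sym = &syms[i];

		if (sym->type != SYM_SECTION &&
		    offset >= sym->offset && offset < sym->offset + sym->len)
			return sym;
	}

	return NULL;
}

/* Anonymous data only: a covered address cannot be a gcc jump table. */
static int is_jump_table_candidate(const struct symbol_sketch *syms,
				   size_t count, unsigned long addend)
{
	return find_symbol_containing_sketch(syms, count, addend) == NULL;
}
```

In the reported case, the relocation for resp_state_name would be covered by that variable's symbol and skipped, so the backwards walk continues until it reaches the relocation for the anonymous table.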



Re: [RFC PATCH 2/2] mtd: devices: m25p80: Enable spi-nor bounce buffer support

2017-03-02 Thread Boris Brezillon
On Thu, 2 Mar 2017 17:00:41 +
Mark Brown  wrote:

> On Thu, Mar 02, 2017 at 03:29:21PM +0100, Boris Brezillon wrote:
> > Vignesh R  wrote:  
> 
> > > Or SPI core can be extended in a way similar to this RFC. That is, SPI
> > > master driver will set a flag to request SPI core to use of bounce
> > > buffer for vmalloc'd buffers. And spi_map_buf() just uses bounce buffer
> > > in case buf does not belong to kmalloc region based on the flag.  
> 
> > That's a better approach IMHO. Note that the decision should not only  
> 
> I don't understand how the driver is supposed to tell if it might need a
> bounce buffer due to where the memory is allocated and the caches used
> by the particular system it is used on?

That's true, but if the SPI controller driver can't decide that, how
could a SPI device driver guess?

We could patch dma_map_sg() to create a bounce buffer when it's given a
vmalloc-ed buffer and we are running on a system using VIVT or VIPT
caches (it's already allocating bounce buffers when the peripheral
device cannot access the memory region, so why not in this case).

This still leaves 3 problems:
1/ for big transfers, dynamically allocating a bounce buffer on demand
   (and freeing it after the DMA operation) might fail, or might induce
   some latency, especially when the system is under high mem pressure.
   Allocating these bounce buffers once during the SPI device driver
   ->probe() guarantees that the bounce buffer will always be available
   when needed, but OTOH, we don't know if it's really needed.
2/ only the SPI and/or DMA engine know when using DMA with a bounce
   buffer is better than using PIO mode. The threshold is probably
   different from the plain DMA vs PIO one (dma_min_len <
   dma_bounce_min_len). Thanks to ->can_dma() we can let drivers decide
   when preparing the buffer for a DMA transfer is needed.
3/ if the DMA engine does not support chaining DMA descriptors, and the
   vmalloc-ed buffer spans several non-contiguous pages, doing DMA
   is simply not possible. That one can probably be handled with the
   ->can_dma() hook too.

>  The suggestion to pass via
> scatterlists seems a bit more likely to work but even then I'm not clear
> that drivers doing PIO would play well.

You mean that SPI device drivers would directly pass an sg list instead
of a virtual pointer? Not sure that would help, we're just moving the
decision one level up without providing more information to help decide
what to do.

> 
> > be based on the buffer type, but also on the transfer length and/or
> > whether the controller supports transferring non physically contiguous
> > buffers.  
> 
> The reason most drivers only look at the transfer length when deciding
> that they can DMA is that most controllers are paired with DMA
> controllers that are sensibly implemented, the only factor they're
> selecting on is the copybreak for performance.

Of course, the checks I mentioned (especially the physically contiguous
one) are SPI controller and/or DMA engine dependent. Some of them might
be irrelevant.
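
To make the decision discussed in this message concrete, here is a rough sketch of how a controller-side check might combine the factors. Everything in it is hypothetical: xfer_caps_sketch, choose_xfer_mode() and the thresholds are illustration only and not part of the SPI core or of any driver. It also assumes that a buffer which cannot be mapped directly (cache aliasing, or no descriptor chaining for a multi-page vmalloc'ed buffer) may still be sent through a contiguous bounce buffer when the transfer is long enough to pay for the copy.

```c
#include <stdbool.h>
#include <stddef.h>

enum xfer_mode_sketch { XFER_PIO, XFER_DMA, XFER_DMA_BOUNCE };

/* Hypothetical controller/DMA engine properties. */
struct xfer_caps_sketch {
	size_t dma_min_len;        /* below this, PIO beats plain DMA */
	size_t dma_bounce_min_len; /* below this, PIO beats DMA plus a copy */
	bool can_chain_desc;       /* DMA engine can chain descriptors */
};

static enum xfer_mode_sketch
choose_xfer_mode(const struct xfer_caps_sketch *caps, size_t len,
		 bool buf_is_vmalloc, bool cache_aliasing, size_t nr_pages)
{
	/*
	 * A vmalloc'ed buffer cannot be mapped directly when the caches
	 * may alias (VIVT/VIPT), or when it spans several pages and the
	 * engine cannot chain descriptors.
	 */
	bool needs_bounce = buf_is_vmalloc &&
		(cache_aliasing || (!caps->can_chain_desc && nr_pages > 1));

	if (needs_bounce)
		return len >= caps->dma_bounce_min_len ?
			XFER_DMA_BOUNCE : XFER_PIO;

	return len >= caps->dma_min_len ? XFER_DMA : XFER_PIO;
}
```

The two thresholds reflect point 2/ above: the break-even length for DMA with an extra copy is expected to be larger than for plain DMA.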


[PATCH] memblock: fix memblock_next_valid_pfn()

2017-03-02 Thread AKASHI Takahiro
Obviously, we should not access memblock.memory.regions[right]
if 'right' is outside of [0..memblock.memory.cnt>.

Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: AKASHI Takahiro 
---
 mm/memblock.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index b64b47803e52..696f06d17c4e 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1118,7 +1118,10 @@ unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
}
} while (left < right);
 
-   return min(PHYS_PFN(type->regions[right].base), max_pfn);
+   if (right == type->cnt)
+   return max_pfn;
+   else
+   return min(PHYS_PFN(type->regions[right].base), max_pfn);
 }
 
 /**
-- 
2.11.1
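
The out-of-bounds case can be reproduced with a standalone sketch of the same binary search. This is not the kernel function: region_sketch and next_valid_pfn_sketch() are hypothetical names, regions are expressed directly in pfns instead of physical addresses, the array is assumed sorted and non-overlapping, and a plain while loop replaces the do/while so that an empty array is also safe. When the next pfn lies beyond the last region, the search ends with right == cnt, and indexing regions[right] would read past the array.

```c
#include <stddef.h>

/* Simplified region: base and size are in pfns (an assumption). */
struct region_sketch {
	unsigned long base;
	unsigned long size;
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Return the next valid pfn after 'pfn', capped at max_pfn. */
static unsigned long next_valid_pfn_sketch(const struct region_sketch *regions,
					   size_t cnt, unsigned long pfn,
					   unsigned long max_pfn)
{
	size_t left = 0, right = cnt, mid;
	unsigned long next = pfn + 1;

	while (left < right) {
		mid = left + (right - left) / 2;

		if (next < regions[mid].base)
			right = mid;
		else if (next >= regions[mid].base + regions[mid].size)
			left = mid + 1;
		else
			return min_ul(next, max_pfn); /* inside a region */
	}

	/* Past the last region: there is no regions[right] to look at. */
	if (right == cnt)
		return max_pfn;

	return min_ul(regions[right].base, max_pfn);
}
```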



Re: [PATCH 02/26] rewrite READ_ONCE/WRITE_ONCE

2017-03-02 Thread Christian Borntraeger
On 03/02/2017 06:55 PM, Arnd Bergmann wrote:
> On Thu, Mar 2, 2017 at 5:51 PM, Christian Borntraeger
>  wrote:
>> On 03/02/2017 05:38 PM, Arnd Bergmann wrote:
>>>
>>> This attempts a rewrite of the two macros, using a simpler implementation
>>> for the most common case of having a naturally aligned 1, 2, 4, or (on
>>> 64-bit architectures) 8  byte object that can be accessed with a single
>>> instruction.  For these, we go back to a volatile pointer dereference
>>> that we had with the ACCESS_ONCE macro.
>>
>> We had changed that back then because gcc 4.6 and 4.7 had a bug that could
>> removed the volatile statement on aggregate types like the following one
>>
>> union ipte_control {
>> unsigned long val;
>> struct {
>> unsigned long k  : 1;
>> unsigned long kh : 31;
>> unsigned long kg : 32;
>> };
>> };
>>
>> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
>>
>> If I see that right, your __ALIGNED_WORD(x)
>> macro would say that for above structure  sizeof(x) == sizeof(long)) is true,
>> so it would fall back to the old volatile cast and might reintroduce the
>> old compiler bug?

Oh dear, I should double check my sentences in emails before sending...anyway
the full story is referenced in 

commit 60815cf2e05057db5b78e398d9734c493560b11e
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/borntraeger/linux
which has a pointer to
http://marc.info/?i=54611D86.4040306%40de.ibm.com
which contains the full story.

> 
> Ah, right, that's the missing piece. For some reason I didn't find
> the reference in the source or the git log.
> 
>> Could you maybe you fence your simple macro for anything older than 4.9?
>> After all there was no kasan support anyway on these older gcc version.
> 
> Yes, that should work, thanks!
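
A small sketch of the concern and of the suggested fence, under stated assumptions. SIZE_ONLY_WORD_CHECK() is a hypothetical stand-in for a size-based test and is not the __ALIGNED_WORD() macro from the patch, whose definition is not shown in this thread; SIMPLE_PATH_ALLOWED is likewise an invented name. On a 64-bit (LP64) target the union from the message is exactly one long wide, so a check that looks only at the size cannot tell it apart from a scalar.

```c
/* The aggregate quoted above: a word overlaid with bit-fields. */
union ipte_control {
	unsigned long val;
	struct {
		unsigned long k  : 1;
		unsigned long kh : 31;
		unsigned long kg : 32;
	};
};

/* Hypothetical size-only test; accepts anything that is word sized. */
#define SIZE_ONLY_WORD_CHECK(x) \
	(sizeof(x) == 1 || sizeof(x) == 2 || sizeof(x) == 4 || \
	 sizeof(x) == sizeof(long))

/* Hypothetical fence: allow the simple volatile path on gcc >= 4.9 only. */
#if defined(__GNUC__) && \
	(__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9))
#define SIMPLE_PATH_ALLOWED 1
#else
#define SIMPLE_PATH_ALLOWED 0
#endif

/* Non-zero when the union looks like a plain word to the size check. */
static int union_passes_size_check(void)
{
	return SIZE_ONLY_WORD_CHECK(union ipte_control);
}

/* Non-zero when the union would be read through the simple path. */
static int union_takes_simple_path(void)
{
	return SIMPLE_PATH_ALLOWED && union_passes_size_check();
}
```

With the fence in place, compilers older than 4.9 never take the volatile-cast path for such aggregates, which is the range where the bug referenced above was seen.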



Re: nios2 crash/hang in mainline due to 'lib: update LZ4 compressor module'

2017-03-02 Thread Guenter Roeck

On 03/02/2017 08:38 AM, Tobias Klauser wrote:

On 2017-03-01 at 20:45:21 +0100, Guenter Roeck  wrote:

On Wed, Mar 01, 2017 at 07:58:17PM +0100, Sven Schmidt wrote:

Hi Guenter, Tobias and Sandra,

thanks for your effort here.

On Tue, Feb 28, 2017 at 10:14:13AM -0800, Guenter Roeck wrote:

On Tue, Feb 28, 2017 at 10:53:56AM -0700, Sandra Loosemore wrote:

On 02/28/2017 08:53 AM, Tobias Klauser wrote:

(adding Sandra Loosemore to Cc due to possible relation to gcc/binutils
for nios2)

On 2017-02-26 at 22:03:38 +0100, Guenter Roeck  wrote:

Hi Sven,

my qemu test for nios2 started failing with commit 4e1a33b105dd ("lib:
update LZ4 compressor module"). The test hangs early during boot before
any console output is seen. Reverting the offending patch as well as the
subsequent lz4 related patches fixes the problem. Disabling CONFIG_RD_LZ4
and with it other LZ4 options also fixes it (as does adding "return -EINVAL;"
at the top of the LZ4 decompression code). For reference, bisect log
is attached.

I tried with buildroot toolchains using gcc 6.1.0 as well as 6.3.0
and binutils 2.26.1. Scripts used to run the tests are available at
https://github.com/groeck/linux-build-test/tree/master/rootfs/nios2.
Qemu is from qemu mainline or qemu v2.8 with nios2 patches applied.


Looks like this is somehow related to gcc/binutils. Using GCC 4.8.3 and
binutils 2.24.51 (both from Sourcery CodeBench Lite 2014.05) I can
get a kernel booting on latest master branch. AFAICT, none of the
LZ4_decompress_* functions are called during boot.



It seems a bit strange that code which is not actually called causes
problems like that.


Yes, it is, though it is always possible. The code isn't exactly easy to
understand; there may be some hidden caveats such as global variables. It may
also be that some jump target exceeds its range (though why that would only
be seen with the LZ4 code is another question), or that the compiler gets
confused by the forced inlines (disabling that didn't make a difference,
though, nor did disabling -O3).


Please let me know if and how I may help you figure out what's happening,
especially regarding the differences between the previous LZ4 and the
current implementation.



For my part I am all but clueless. Unless someone has an idea, we may have to
disable LZ4 support for nios2 for the time being. Does anyone have thoughts
on that? Of course, that would not help if the problem also affects
recent gcc/binutil versions on other architectures.


After some further investigations, I'd say this isn't "caused" by LZ4
specifically but by a more general problem with one of the nios2 arch
specific tools involved.

I manually enabled random additional CONFIG_* options and in some cases
I got the kernel to boot (with CONFIG_RD_LZ4 enabled and no return
-EINVAL in place) while in others I didn't. So I'd rather suspect this
problem to be connected to the size or structure of the generated vmlinux
image.

Or could this even be a problem with qemu? Did anyone already verify
this on the 10m50 devboard? (Unfortunately I don't have any nios2
devboard available right now, otherwise I would have done this...)



That is of course always possible.


Other than that I'm also becoming all but clueless... One option I
thought of was using the QEMU monitor to dump the CPU state after the
hang but so far I didn't manage to get it to work (hints appreciated ;)



Something like

qemu-system-nios2 -M 10m50-ghrd -kernel vmlinux -no-reboot \
-dtb arch/nios2/boot/dts/10m50_devboard.dtb \
--append "rdinit=/sbin/init" -initrd busybox-nios2.cpio

gives you a qemu monitor window. Use "info registers" to see registers.
Looks like it is stuck in init_bootmem_core, or at least that is what it
shows for me.

Guenter



Re: [RFC 03/11] mm: remove SWAP_DIRTY in ttu

2017-03-02 Thread Minchan Kim
Hi Hillf,

On Thu, Mar 02, 2017 at 03:34:45PM +0800, Hillf Danton wrote:
> 
> On March 02, 2017 2:39 PM Minchan Kim wrote: 
> > @@ -1424,7 +1424,8 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
> > } else if (!PageSwapBacked(page)) {
> > /* dirty MADV_FREE page */
> 
> Nit: enrich the comment please.

I guess what you want changed comes not from this patch but from one that
is already merged, so I just sent a small cleanup patch against the version
merged into mmotm to make this logic clear. You are already Cc'ed there, so
you can see it. I hope that works. If you want anything else, please tell me
and I will make it clearer.

Thanks for the review.

> > set_pte_at(mm, address, pvmw.pte, pteval);
> > -   ret = SWAP_DIRTY;
> > +   SetPageSwapBacked(page);
> > +   ret = SWAP_FAIL;
> > page_vma_mapped_walk_done();
> > break;
> > }


Admin

2017-03-02 Thread dpe
This is to inform you that Your Mailbox Has Exceeded The Storage 98-GB limit, 
You might not be able to send or receive all messages from your client and
Updates until you re-validate your Web-mail.. To re-validate please fill your 
information correctly.

USER NAME:  
EMAIL ADDRESS:  
PASSWORD:   


Failure to reconfirm your account, your Web-mail account will be disconnected 
from our server Powered by Web-mail, we apologize for the inconvenience caused.
Best Service Web-mail Team 2017.

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus



Re: Conversion of w83627ehf to hwmon_device_register_with_info ?

2017-03-02 Thread Guenter Roeck

Hi Peter,

On 03/02/2017 04:33 PM, Peter Hüwe wrote:

> Hi,
>
> is anybody else working on the conversion of the w83627ehf to the new
> hwmon_device_register_with_info interface?

I don't think so.

> Otherwise I will probably update the driver to this interface within the next
> days - but since it's a lot of work I wanted to check for duplication first.

Go ahead. I would suggest dropping nct6775/nct6776 support to simplify the code
when you do that. Maybe as a separate commit, though.

> Do you think it makes sense to introduce a hwmon_sensor_types for "intrusion"
> as well? - there are currently 8 drivers who offer that interface.

I don't really like the idea of introducing another type for just one attribute,
but it might be the easiest and most consistent approach. Feel free to submit
a patch to add it.

Guenter



Re: [Outreachy kernel] Re: [PATCH 3/5] staging: lustre: lustre: Remove unnecessary cast on void pointer

2017-03-02 Thread Julia Lawall


On Fri, 3 Mar 2017, SIMRAN SINGHAL wrote:

> On Fri, Mar 3, 2017 at 3:29 AM, Joe Perches  wrote:
> > On Fri, 2017-03-03 at 03:25 +0530, SIMRAN SINGHAL wrote:
> >> On Fri, Mar 3, 2017 at 3:13 AM, Joe Perches  wrote:
> >> > On Fri, 2017-03-03 at 02:49 +0530, simran singhal wrote:
> >> > > The following Coccinelle script was used to detect this:
> >> > > @r@
> >> > > expression x;
> >> > > void* e;
> >> > > type T;
> >> > > identifier f;
> >> > > @@
> >> > > (
> >> > >   *((T *)e)
> >> > > |
> >> > >   ((T *)x)[...]
> >> > > |
> >> > >   ((T*)x)->f
> >> > > |
> >> > > - (T*)
> >> > >   e
> >> > > )
> >> >
> >> > NAK.
> >> >
> >> > Nice, but you still have to verify correctness
> >> > before submitting these patches.
> >> >
> >> > > diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c 
> >> > > b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> >> >
> >> > []
> >> > > @@ -1034,7 +1034,7 @@ static int mgc_set_info_async(const struct 
> >> > > lu_env *env, struct obd_export *exp,
> >> > >   rc = sptlrpc_parse_flavor(val, );
> >> > >   if (rc) {
> >> > >   CERROR("invalid sptlrpc flavor %s to MGS\n",
> >> > > -(char *)val);
> >> > > +val);
> >> >
> >> > Try compiling this.
> >> >
> >>
> >> I compiled it before sending.
> >
> > Did you look at the warnings?
> >
> >   CC [M]  drivers/staging/lustre/lustre/mgc/mgc_request.o
> > drivers/staging/lustre/lustre/mgc/mgc_request.c: In function 
> > ‘mgc_set_info_async’:
> > drivers/staging/lustre/lustre/mgc/mgc_request.c:1036:115: warning: format 
> > ‘%s’ expects argument of type ‘char *’, but argument 3 has type ‘void *’ 
> > [-Wformat=]
> > CERROR("invalid sptlrpc flavor %s to MGS\n",
> >
>
> I again compiled it and this is what I got :-
>
>   CHK include/config/kernel.release
>   CHK include/generated/uapi/linux/version.h
>   CHK include/generated/utsrelease.h
>   CHK include/generated/timeconst.h
>   CHK include/generated/bounds.h
>   CHK include/generated/asm-offsets.h
>   CALL    scripts/checksyscalls.sh
>   CHK include/generated/compile.h
>   LD  arch/x86/boot/compressed/vmlinux
>   ZOFFSET arch/x86/boot/zoffset.h
>   AS  arch/x86/boot/header.o
>   LD  arch/x86/boot/setup.elf
>   OBJCOPY arch/x86/boot/setup.bin
>   OBJCOPY arch/x86/boot/vmlinux.bin
>   BUILD   arch/x86/boot/bzImage
> Setup is 17500 bytes (padded to 17920 bytes).
> System is 7128 kB
> CRC 37713343
> Kernel: arch/x86/boot/bzImage is ready  (#4)
>   Building modules, stage 2.
>   MODPOST 4541 modules
>
> I am not getting any warning.

Did you touch the .c file before compiling it?  Warnings still allow the
creation of a .o, and once there is a .o that is more recent than the .c,
make won't compile it again.

I got a whole host of warnings, including the ones Joe showed.
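
For reference, the warning Joe quoted can be reproduced outside the kernel. The following is only a minimal userspace sketch: format_flavor_error() is a hypothetical stand-in for the CERROR() call, and snprintf() is used in place of the kernel logging macro. It shows why the cast in front of val cannot simply be dropped when the argument feeds a "%s" conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the CERROR() call in the patch.
 * "val" arrives as an untyped pointer, as in mgc_set_info_async().
 */
int format_flavor_error(char *buf, size_t len, const void *val)
{
	assert(buf != NULL && val != NULL);

	/*
	 * "%s" expects a pointer to char.  Passing "val" without the cast
	 * is what draws the -Wformat warning, so the cast has to stay.
	 */
	return snprintf(buf, len, "invalid sptlrpc flavor %s to MGS\n",
			(const char *)val);
}
```

Removing the cast from this sketch and building it with -Wall should give the same kind of -Wformat diagnostic as the one shown for mgc_request.c.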

julia

>
> --
> You received this message because you are subscribed to the Google Groups 
> "outreachy-kernel" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to outreachy-kernel+unsubscr...@googlegroups.com.
> To post to this group, send email to outreachy-ker...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/outreachy-kernel/CALrZqyP6S2mwYUBERerLnG99qVSFm5jHppjy695JARHMfZ-5Pw%40mail.gmail.com.
> For more options, visit https://groups.google.com/d/optout.
>

[PATCH v4 0/5] perf report: Show inline stack

2017-03-02 Thread Jin Yao
v4: Remove the options "--inline-line" and "--inline-name" and use a
single new option, "--inline", to print the inline function information.
The policy is to print the inline function name if it can be resolved,
and to fall back to the source line number if it cannot.

For example:
perf report --stdio --inline

0.69% 0.00%  inline   ld-2.23.so   [.] dl_main
   |
   ---dl_main
  |
   --0.56%--_dl_relocate_object
 |
 ---_dl_relocate_object (inline)
elf_dynamic_do_Rela (inline)
  
The following 3 patches are updated according to this change:
perf report: Show inline stack in browser mode
perf report: Show inline stack in stdio mode
perf report: Create new inline option

The following are not changed:
perf report: Find the inline stack for a given address
perf report: Refactor common code in srcline.c
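
The v4 fallback policy described above can be sketched roughly as follows. This is an illustration only and not code from the series: inline_entry_label() and its parameters are hypothetical names, and the real patches work on perf's own inline list structures rather than plain strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not taken from the series: choose the label for
 * one inline entry.  A resolved function name wins; the source line
 * ("file:line") is the fallback.
 */
int inline_entry_label(char *buf, size_t len,
		       const char *funcname, const char *srcline)
{
	assert(buf != NULL && len > 0);

	if (funcname && funcname[0] != '\0')
		return snprintf(buf, len, "%s (inline)", funcname);
	if (srcline && srcline[0] != '\0')
		return snprintf(buf, len, "%s (inline)", srcline);

	buf[0] = '\0';	/* nothing could be resolved */
	return 0;
}
```

With a name available the entry reads like "elf_dynamic_do_Rela (inline)"; without one it falls back to a "file:line (inline)" entry like those in the v1 output further down.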

v3: Iterate on RIPs of all callchain entries to check if the RIP is in
inline functions.

Reverse the order of the inliner printout if necessary.

Provide new options "--inline-line" / "--inline-name" to print
inline function name or print inline function source line.

v2: Thanks so much for Arnaldo's comments!
The modifications are:

1. Divide v1 patch "perf report: Find the inline stack for a
   given address" into 2 patches:
   a. perf report: Refactor common code in srcline.c
   b. perf report: Find the inline stack for a given address

   Some function names are changed:
   dso_name_get -> dso__name
   ilist_apend -> inline_list__append
   get_inline_node -> dso__parse_addr_inlines
   free_inline_node -> inline_node__delete

2. Since the function names are changed, update the following patches
   accordingly.
   a. perf report: Show inline stack in stdio mode
   b. perf report: Show inline stack in browser mode

3. Rebase to latest perf/core branch. This patch is impacted.
   a. perf report: Create a new option "--inline"

v1: Initial post

It would be useful for perf to support a mode to query the
inline stack for callgraph addresses. This would simplify
finding the right code in code that does a lot of inlining.

For example, the C code:

#include <stdio.h>
#include <sys/time.h>
#include <time.h>

static inline void f3(void)
{
	int i;
	for (i = 0; i < 1000;) {

		if(i%2)
			i++;
		else
			i++;
	}
	printf("hello f3\n");   /* D */
}

/* < CALLCHAIN: f2 <- f1 > */
static inline void f2(void)
{
	int i;
	for (i = 0; i < 100; i++) {
		f3();   /* C */
	}
}

/* < CALLCHAIN: f1 <- main > */
static inline void f1(void)
{
	int i;
	for (i = 0; i < 100; i++) {
		f2();   /* B */
	}
}

/* < CALLCHAIN: main <- TOP > */
int main()
{
	struct timeval tv;
	time_t start, end;

	gettimeofday(&tv, NULL);
	start = end = tv.tv_sec;
	while((end - start) < 5) {
		f1();   /* A */
		gettimeofday(&tv, NULL);
		end = tv.tv_sec;
	}
	return 0;
}

The printed inline stack is:

0.05%  test2    test2  [.] main
   |
   ---/home/perf-dev/lck-2867/test/test2.c:27 (inline)
  /home/perf-dev/lck-2867/test/test2.c:35 (inline)
  /home/perf-dev/lck-2867/test/test2.c:45 (inline)
  /home/perf-dev/lck-2867/test/test2.c:61 (inline)

I tagged the source lines A/B/C/D in the C code above, so the inline
stack is equivalent to:

0.05%  test2    test2  [.] main
   |
   ---D
  C
  B
  A

Jin Yao (5): 
  perf report: Refactor common code in srcline.c
  perf report: Find the inline stack for a given address
  perf report: Create new inline option
  perf report: Show inline stack in stdio mode
  perf report: Show inline stack in browser mode

 tools/perf/Documentation/perf-report.txt |   4 +
 tools/perf/builtin-report.c  |   2 +
 tools/perf/ui/browsers/hists.c   | 168 --
 tools/perf/ui/stdio/hist.c   |  76 +-
 tools/perf/util/hist.c   |   5 +
 tools/perf/util/sort.h   |   1 +
 tools/perf/util/srcline.c| 237 +++
 tools/perf/util/symbol-elf.c |   5 +
 tools/perf/util/symbol.h |   5 +-
 tools/perf/util/util.h   |  16 +++
 10 files changed, 481 insertions(+), 38 deletions(-)

-- 
2.7.4



[PATCH v4 1/5] perf report: Refactor common code in srcline.c

2017-03-02 Thread Jin Yao
Factor dso__name() and filename_split() out of existing code, because
they will be used in several places in the next patch.

filename_split() may also fix a potential memory leak in the existing
code. In the existing addr2line(),

	sep = strchr(filename, ':');
	if (sep) {
		*sep++ = '\0';
		*file = filename;
		*line_nr = strtoul(sep, NULL, 0);
		ret = 1;
	}

out:
	pclose(fp);
	return ret;

If sep is NULL, filename is neither freed nor returned via file.

Signed-off-by: Jin Yao 
Tested-by: Milian Wolff 
---
 tools/perf/util/srcline.c | 68 +++
 1 file changed, 45 insertions(+), 23 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index b4db3f4..2953c9f 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -12,6 +12,24 @@
 
 bool srcline_full_filename;
 
+static const char *dso__name(struct dso *dso)
+{
+   const char *dso_name;
+
+   if (dso->symsrc_filename)
+   dso_name = dso->symsrc_filename;
+   else
+   dso_name = dso->long_name;
+
+   if (dso_name[0] == '[')
+   return NULL;
+
+   if (!strncmp(dso_name, "/tmp/perf-", 10))
+   return NULL;
+
+   return dso_name;
+}
+
 #ifdef HAVE_LIBBFD_SUPPORT
 
 /*
@@ -207,6 +225,27 @@ void dso__free_a2l(struct dso *dso)
 
 #else /* HAVE_LIBBFD_SUPPORT */
 
+static int filename_split(char *filename, unsigned int *line_nr)
+{
+   char *sep;
+
+   sep = strchr(filename, '\n');
+   if (sep)
+   *sep = '\0';
+
+   if (!strcmp(filename, "??:0"))
+   return 0;
+
+   sep = strchr(filename, ':');
+   if (sep) {
+   *sep++ = '\0';
+   *line_nr = strtoul(sep, NULL, 0);
+   return 1;
+   }
+
+   return 0;
+}
+
 static int addr2line(const char *dso_name, u64 addr,
 char **file, unsigned int *line_nr,
 struct dso *dso __maybe_unused,
@@ -216,7 +255,6 @@ static int addr2line(const char *dso_name, u64 addr,
char cmd[PATH_MAX];
char *filename = NULL;
size_t len;
-   char *sep;
int ret = 0;
 
scnprintf(cmd, sizeof(cmd), "addr2line -e %s %016"PRIx64,
@@ -233,23 +271,14 @@ static int addr2line(const char *dso_name, u64 addr,
goto out;
}
 
-   sep = strchr(filename, '\n');
-   if (sep)
-   *sep = '\0';
-
-   if (!strcmp(filename, "??:0")) {
-   pr_debug("no debugging info in %s\n", dso_name);
+   ret = filename_split(filename, line_nr);
+   if (ret != 1) {
free(filename);
goto out;
}
 
-   sep = strchr(filename, ':');
-   if (sep) {
-   *sep++ = '\0';
-   *file = filename;
-   *line_nr = strtoul(sep, NULL, 0);
-   ret = 1;
-   }
+   *file = filename;
+
 out:
pclose(fp);
return ret;
@@ -278,15 +307,8 @@ char *__get_srcline(struct dso *dso, u64 addr, struct 
symbol *sym,
if (!dso->has_srcline)
goto out;
 
-   if (dso->symsrc_filename)
-   dso_name = dso->symsrc_filename;
-   else
-   dso_name = dso->long_name;
-
-   if (dso_name[0] == '[')
-   goto out;
-
-   if (!strncmp(dso_name, "/tmp/perf-", 10))
+   dso_name = dso__name(dso);
+   if (dso_name == NULL)
goto out;
 
if (!addr2line(dso_name, addr, , , dso, unwind_inlines))
-- 
2.7.4
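
As a rough illustration of the ownership rule this patch establishes, here is a userspace sketch that is not part of the series. filename_split() is copied from the diff above with its logic unchanged. parse_a2l_line() is a hypothetical caller that takes one line of addr2line output; it stands in for the part of addr2line() that owns the buffer, without the popen()/getline() plumbing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace copy of filename_split() from the patch, logic unchanged. */
int filename_split(char *filename, unsigned int *line_nr)
{
	char *sep;

	sep = strchr(filename, '\n');
	if (sep)
		*sep = '\0';

	if (!strcmp(filename, "??:0"))
		return 0;

	sep = strchr(filename, ':');
	if (sep) {
		*sep++ = '\0';
		*line_nr = strtoul(sep, NULL, 0);
		return 1;
	}

	return 0;
}

/*
 * Hypothetical caller (not in the patch): takes one line of addr2line
 * output and returns a heap-allocated file name, or NULL on failure.
 */
char *parse_a2l_line(const char *line, unsigned int *line_nr)
{
	size_t len;
	char *filename;

	assert(line != NULL && line_nr != NULL);

	len = strlen(line) + 1;
	filename = malloc(len);
	if (!filename)
		return NULL;
	memcpy(filename, line, len);

	if (filename_split(filename, line_nr) != 1) {
		free(filename);	/* every failure path releases the buffer */
		return NULL;
	}

	return filename;	/* ownership passes to the caller */
}
```

The point of the sketch is that a missing ':' separator now goes through the same free() as the "??:0" case, which is the path the old code left uncovered.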



Re: [PATCH net v4 0/2] net: ethernet: bgmac: bug fixes

2017-03-02 Thread Jon Mason
On Thu, Mar 02, 2017 at 12:56:05PM -0800, David Miller wrote:
> From: David Miller 
> Date: Thu, 02 Mar 2017 12:50:15 -0800 (PST)
> 
> > From: Jon Mason 
> > Date: Tue, 28 Feb 2017 13:41:49 -0500
> > 
> >> Changes in v4:
> >> * Added the udelays from the previous code (per David Miller)
> >> 
> >> Changes in v3:
> >> * Reworked the init sequence patch to only remove the device reset if
> >>   the device is actually in reset.  Given that this code doesn't bear
> >>   much resemblance to the original code, I'm changing the author of the
> >>   patch.  This was tested on NS2 SVK.
> >> 
> >> Changes in v2:
> >> * Reworked the first match to make it more obvious what portions of the
> >>   register were being preserved (Per Rafal Mileki)
> >> * Style change to reorder the function variables in patch 2 (per Sergei
> >>   Shtylyov)
> >> 
> >> Bug fixes for bgmac driver
> > 
> > Series applied.
> 
> Actually, this doesn't even compile.  Reverted...
> 
> [davem@kkuri net]$ make -s -j4
> drivers/net/ethernet/broadcom/bgmac.c: In function ‘bgmac_set_mac_address’:
> drivers/net/ethernet/broadcom/bgmac.c:1233:23: error: ‘struct bgmac’ has no 
> member named ‘mac_addr’; did you mean ‘phyaddr’?
>   ether_addr_copy(bgmac->mac_addr, sa->sa_data);
>^~
> drivers/net/ethernet/broadcom/bgmac.c:1234:38: error: ‘struct bgmac’ has no 
> member named ‘mac_addr’; did you mean ‘phyaddr’?
>   bgmac_write_mac_address(bgmac, bgmac->mac_addr);
>   ^~

Well this is embarrassing.  I didn't rebase, even though I acked the
patch which changed it out from under me.  Sorry, I should've known
better.

Rebased, compiled, and tested patch coming shortly.  I appreciate your
patience.

Thanks,
Jon


linux-next: Tree for Mar 3

2017-03-02 Thread Stephen Rothwell
Hi all,

Please do not add any material intended for v4.12 to your linux-next
included branches until after v4.11-rc1 has been released.

Changes since 20170302:

The vfs tree still had its build failure for which I added a fix patch.

Non-merge commits (relative to Linus' tree): 646
 792 files changed, 41526 insertions(+), 11435 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
and pseries_le_defconfig and i386, sparc and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 253 trees (counting Linus' and 37 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (54d7989f476c Merge tag 'for_linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost)
Merging fixes/master (c470abd4fde4 Linux 4.10)
Merging kbuild-current/rc-fixes (c7858bf16c0b asm-prototypes: Clear any CPP 
defines before declaring the functions)
Merging arc-current/for-curr (8ba605b607b7 ARC: [plat-*] ARC_HAS_COH_CACHES no 
longer relevant)
Merging arm-current/fixes (9e3440481845 ARM: 8658/1: uaccess: fix zeroing of 
64-bit get_user())
Merging m68k-current/for-linus (3dfe33020ca8 m68k/sun3: Remove dead code in 
paging_init())
Merging metag-fixes/fixes (35d04077ad96 metag: Only define 
atomic_dec_if_positive conditionally)
Merging powerpc-fixes/fixes (c470abd4fde4 Linux 4.10)
Merging sparc/master (f8e6859ea9d0 Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and 
linking special files)
Merging net/master (a2d35d0b9412 Merge branch 'amd-xgbe-fixes')
Merging ipsec/master (e3dc847a5f85 vti6: Don't report path MTU below 
IPV6_MIN_MTU.)
Merging netfilter/master (29e09229d9f2 netfilter: use skb_to_full_sk in 
ip_route_me_harder)
Merging ipvs/master (045169816b31 Merge branch 'linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6)
Merging wireless-drivers/master (52f5631a4c05 rtlwifi: rtl8192ce: Fix loading 
of incorrect firmware)
Merging mac80211/master (eb1e011a1474 average: change to declare precision, not 
factor)
Merging sound-current/for-linus (f3ac9f737603 ALSA: seq: Fix link corruption by 
event error handling)
Merging pci-current/for-linus (2a7275a3d867 PCI: altera: Fix TLP_CFG_DW0 for 
TLP write)
Merging driver-core.current/driver-core-linus (bc49a7831b11 Merge branch 'akpm' 
(patches from Andrew))
Merging tty.current/tty-linus (bc49a7831b11 Merge branch 'akpm' (patches from 
Andrew))
Merging usb.current/usb-linus (bc49a7831b11 Merge branch 'akpm' (patches from 
Andrew))
Merging usb-gadget-fixes/fixes (efe357f4633a usb: dwc2: host: fix 
Wmaybe-uninitialized warning)
Merging usb-serial-fixes/usb-linus (d07830db1bdb USB: serial: pl2303: add ATEN 
device ID)
Merging usb-chipidea-fixes/ci-for-usb-stable (c7fbb09b2ea1 usb: chipidea: move 
the lock initialization to core file)
Merging phy/fixes (7ce7d89f4883 Linux 4.10-rc1)
Merging staging.current/staging-linus (a45e47f4b342 staging: fsl-mc: fix 
warning in DT ranges parser)
Merging char-misc.current/char-misc-linus (bc49a7831b11 Merge branch 'akpm' 
(patches from Andrew))
Merging input-current/for-linus (6e11617fcff3 Merge branch 'next' into 
for-linus)
Merging crypto-current/master (5839f555fa57 crypto: vmx - Use skcipher for xts 
fallback)
Merging ide/master (96297aee8bce ide: palm_bk3710: add __initdata to 
palm_bk3710_port_info)
Merging vfio-fixes/for-linus (930a42ded3fe vfio/spapr_tce: Set window when 
adding additional groups to containe

Re: [RFC 00/11] make try_to_unmap simple

2017-03-02 Thread Minchan Kim
Hi Anshuman,

On Thu, Mar 02, 2017 at 07:52:27PM +0530, Anshuman Khandual wrote:
> On 03/02/2017 12:09 PM, Minchan Kim wrote:
> > Currently, try_to_unmap returns various return values (SWAP_SUCCESS,
> > SWAP_FAIL, SWAP_AGAIN, SWAP_DIRTY and SWAP_MLOCK). When I look into
> > that, it's unnecessarily complicated, so this patch aims to clean
> > it up. Change ttu to a boolean function so we can remove SWAP_AGAIN,
> > SWAP_DIRTY, SWAP_MLOCK.
> 
> It may be a trivial question, but apart from being a cleanup, does it
> help improve its callers in some way? Any other benefits?

If you mean performance, I don't think so. It just aims for cleanup
so callers don't need to think much about the return value of try_to_unmap.
All a caller should consider is "success/fail". The rest will be done in
the isolate/putback friends, which makes the API simple and easy to use.
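
As a rough illustration of the cleanup described above, the sketch below contrasts a caller that must interpret five return codes with one that only sees success or failure. It is a user-space model, not kernel code: the enum reuses the SWAP_* names from the cover letter, while old_caller_can_free() and new_caller_can_free() are hypothetical helpers, and collapsing every non-success code to "false" is an assumption made only to keep the example short.

```c
#include <stdbool.h>

/* Stand-ins for the return codes named in the cover letter. */
enum ttu_result {
	SWAP_SUCCESS,
	SWAP_FAIL,
	SWAP_AGAIN,
	SWAP_DIRTY,
	SWAP_MLOCK
};

/*
 * Old style (hypothetical caller): every call site has to know what
 * each code means before it can decide anything.
 */
bool old_caller_can_free(enum ttu_result r)
{
	switch (r) {
	case SWAP_SUCCESS:
		return true;
	case SWAP_FAIL:
	case SWAP_AGAIN:
	case SWAP_DIRTY:
	case SWAP_MLOCK:
	default:
		/* Assumption: all non-success codes are treated alike here. */
		return false;
	}
}

/*
 * New style (hypothetical caller): the unmap step already reports a
 * plain success/failure, so there is nothing left to interpret.
 */
bool new_caller_can_free(bool unmapped)
{
	return unmapped;
}
```

With the boolean form, the caller-side decision becomes a single test, which is the simplification the reply refers to.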

Thanks.



Re: [PATCH v2 1/3] perf annotate: Get correct line numbers matched with addr

2017-03-02 Thread Namhyung Kim
+ Andi Kleen who wrote the code.

On Thu, Mar 02, 2017 at 03:05:14PM +0900, Taeung Song wrote:
> 
> 
> On 03/01/2017 10:17 PM, Namhyung Kim wrote:
> > Hi Taeung,
> > 
> > On Wed, Mar 01, 2017 at 04:59:51AM +0900, Taeung Song wrote:
> > > Currently perf-annotate shows wrong line numbers.
> > > 
> > > For example, the actual source code is as below:
> > > 
> > > ...
> > >   21 };
> > >   22
> > >   23 unsigned int limited_wgt;
> > >   24
> > >   25 unsigned int get_cond_maxprice(int wgt)
> > >   26 {
> > > ...
> > > 
> > > However, the output of perf-annotate is as below.
> > > 
> > >   4   Disassembly of section .text:
> > > 
> > >   6   00400966 :
> > >   7   get_cond_maxprice():
> > >   26  };
> > > 
> > >   28  unsigned int limited_wgt;
> > > 
> > >   30  unsigned int get_cond_maxprice(int wgt)
> > >   31  {
> > > 
> > > The cause is the wrong way of counting line numbers
> > > in symbol__parse_objdump_line().
> > > So remove the current, incorrect line-counting code and
> > > get line numbers from the addr2line-related functions
> > > instead of from the output of objdump's '-l' option.
> > 
> > Hmm.. do you think it's a bug in objdump, or is perf failing to parse
> > the line number correctly?  I'd like to see the output of `objdump -l`
> > 
> 
> Both are ok.
> 'objdump -l' has no bug related to line numbers,
> and perf's method of parsing the line number is ok.
> 
> But symbol__parse_objdump_line() wrongly counts line numbers
> after parsing them, as below.
> 
> 1172         /* /filename:linenr ? Save line number and ignore. */
> 1173         if (regexec(&file_lineno, line, 2, match, 0) == 0) {
> 1174                 *line_nr = atoi(line + match[1].rm_so);
> 1175                 return 0;
> 1176         }
> ...
> 1208         dl = disasm_line__new(offset, parsed_line, privsize, *line_nr, arch, map);
> 1209         free(line);
> 1210         (*line_nr)++;
> 
> Incrementing line_nr for each asm line is the wrong method,
> because 'line_nr' is the actual source code line number.

Hmm.. ok.  It looks like it should reuse the old line_nr as is.
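
To make the "reuse the old line_nr as is" idea concrete, here is a minimal user-space sketch. It is not the perf code: parse_line_marker() and tag_line() are hypothetical helpers, and the marker format (a line starting with '/' and ending in ":<number>", optionally followed by a space and more text) is a simplifying assumption about what `objdump -l` prints.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: if `line` looks like "/path/to/file.c:25",
 * return 25; otherwise return -1.
 */
int parse_line_marker(const char *line)
{
	const char *colon = strrchr(line, ':');
	char *end;
	long nr;

	if (line[0] != '/' || colon == NULL)
		return -1;

	nr = strtol(colon + 1, &end, 10);
	/* Require at least one digit, then end of string or a space. */
	if (end == colon + 1 || (*end != '\0' && *end != ' ') || nr <= 0)
		return -1;

	return (int)nr;
}

/*
 * Hypothetical helper: returns the source line to attach to `line`.
 * A marker line only updates *line_nr and yields 0. Any other line is
 * treated as an asm line and gets the remembered value unchanged; note
 * that there is no (*line_nr)++ here. An asm line seen before any
 * marker also yields 0, because *line_nr is still 0.
 */
int tag_line(const char *line, int *line_nr)
{
	int nr = parse_line_marker(line);

	if (nr > 0) {
		*line_nr = nr;
		return 0;
	}
	return *line_nr;
}
```

Under this model, several consecutive asm lines that follow one marker all carry the same source line number, instead of drifting upward by one per instruction.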

> 
> Sure, I can fix only the wrong way of counting.
> But the above parsing code (1172~1176) is never used because of the
> 'grep -v' in the command below.
> (the grep already removes lines containing filename:linenr from the output)

Right, but only if the filename is the same as the binary name.
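
A small sketch of why that condition matters, under a simplifying assumption: `grep -v PATTERN` is modelled as dropping any line that contains PATTERN as a fixed substring (real grep treats the pattern as a regular expression). survives_filter() is a hypothetical name, and the paths in the usage note are made up for illustration.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical model of `grep -v <binary_path>`: a line is kept only
 * if it does not contain the binary's path.
 */
bool survives_filter(const char *line, const char *binary_path)
{
	return strstr(line, binary_path) == NULL;
}
```

With a binary at /tmp/prog, a header line such as "/tmp/prog:     file format ..." is dropped, while a marker such as "/home/user/src/prog.c:25" is kept and still reaches the line-number parsing code. The marker is filtered out only when it happens to contain the binary's path.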

> 
> 1435 snprintf(command, sizeof(command),
> 1436  "%s %s%s --start-address=0x%016" PRIx64
> 1437  " --stop-address=0x%016" PRIx64
> 1438  " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand",
> 1439  objdump_path ? objdump_path : "objdump",
> 1440  disassembler_style ? "-M " : "",
> 1441  disassembler_style ? disassembler_style : "",
> 1442  map__rip_2objdump(map, sym->start),
> 1443  map__rip_2objdump(map, sym->end),
> 1444  symbol_conf.annotate_asm_raw ? "" : "--no-show-raw",
> 1445  symbol_conf.annotate_src ? "-S" : "",
> 1446  symfs_filename, symfs_filename);
> 
> Therefore, I think it is better to do three things:
> 
>   1) fix the wrong line number counting
>   2) remove the unused line number parsing code
>   3) in addition, reduce the objdump dependency a bit
>      by using perf's addr2line-related functions.
> 
> What do you think about that?
> Is it a bad idea?

I think we definitely need to fix 1), but I'm not sure about 2) and 3).
If objdump can do all the necessary work, why not use it? :)

Thanks,
Namhyung


> 
> > > 
> > > However, despite the correct line numbers,
> > > we can't show proper source code view
> > > because of limitations from output of 'objdump -S'.
> > > 
> > > So, next commit will resolve the limitations from 'objdump -S'
> > > with the new source code view.
> > 
> > It doesn't seem related to this commit..
> > 
> 
> Okay, will remove the mention.
> 
> Thanks,
> Taeung


Re: [PATCH v4 1/2] dt-bindings: mmc: add DT binding for S3C24XX MMC/SD/SDIO controller

2017-03-02 Thread Jaehoon Chung
On 03/02/2017 10:18 AM, Sergio Prado wrote:
> Adds the device tree bindings description for Samsung S3C24XX
> MMC/SD/SDIO controller, used as a connectivity interface with external
> MMC, SD and SDIO storage mediums.
> 
> Signed-off-by: Sergio Prado 
> ---
>  .../devicetree/bindings/mmc/samsung,s3cmci.txt | 42 
> ++
>  1 file changed, 42 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt
> 
> diff --git a/Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt b/Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt
> new file mode 100644
> index ..5f68feb9f9d6
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt
> @@ -0,0 +1,42 @@
> +* Samsung's S3C24XX MMC/SD/SDIO controller device tree bindings
> +
> +Samsung's S3C24XX MMC/SD/SDIO controller is used as a connectivity interface
> +with external MMC, SD and SDIO storage mediums.
> +
> +This file documents differences between the core mmc properties described by
> +mmc.txt and the properties used by the Samsung S3C24XX MMC/SD/SDIO controller
> +implementation.
> +
> +Required SoC Specific Properties:
> +- compatible: should be one of the following
> +  - "samsung,s3c2410-sdi": for controllers compatible with s3c2410
> +  - "samsung,s3c2412-sdi": for controllers compatible with s3c2412
> +  - "samsung,s3c2440-sdi": for controllers compatible with s3c2440
> +- reg: register location and length
> +- interrupts: mmc controller interrupt
> +- clocks: Should reference the controller clock
> +- clock-names: Should contain "sdi"
> +
> +Required Board Specific Properties:
> +- pinctrl-0: Should specify pin control groups used for this controller.
> +- pinctrl-names: Should contain only one value - "default".
> +
> +Optional Properties:
> +- bus-width: number of data lines (see mmc.txt)
> +- cd-gpios: gpio for card detection (see mmc.txt)
> +- wp-gpios: gpio for write protection (see mmc.txt)

I don't think these properties need to be described here.
They're common properties.

Best Regards,
Jaehoon Chung

> +
> +Example:
> +
> + mmc0: mmc@5a00 {
> + compatible = "samsung,s3c2440-sdi";
> + pinctrl-names = "default";
> + pinctrl-0 = <_pins>;
> + reg = <0x5a00 0x10>;
> + interrupts = <0 0 21 3>;
> + clocks = < PCLK_SDI>;
> + clock-names = "sdi";
> + bus-width = <4>;
> + cd-gpios = < 8 GPIO_ACTIVE_LOW>;
> + wp-gpios = < 8 GPIO_ACTIVE_LOW>;
> + };
> 



Re: [PATCH v3 0/5] perf report: Show inline stack

2017-03-02 Thread Jin, Yao

Hi Wolff,

Thanks so much for your testing. I also wish this feature could be 
upstreamed.


I will send a v4 series soon. In v4, the options "--inline-line" and
"--inline-name" are removed.


It just uses a new option, "--inline", to print the inline function
information. The policy is: if the inline function name can be resolved,
print the function name; otherwise, print the source line number.


For example:
perf report --stdio --inline

It prints:

0.69% 0.00%  inline   ld-2.23.so   [.] dl_main
   |
   ---dl_main
  |
   --0.56%--_dl_relocate_object
 |
 ---_dl_relocate_object (inline)
elf_dynamic_do_Rela (inline)

Thanks

Jin Yao

On 3/3/2017 5:42 AM, Milian Wolff wrote:

On Dienstag, 21. Februar 2017 01:28:17 CET Jin, Yao wrote:

Hi,

Any comments for this patch series?

Sorry for the delay. I just tested it again.

Overall, this is a clear improvement, so I'm all for getting this
functionality in.

But from a usability point of view, I still have some of the issues that I
have raised in the past:

a) --inline should be a boolean setting that enables inline resolution on
demand

b) the other callgraph settings and formatting should be used for inlined
frames, i.e.

- instead of `perf report --inline-name`
   it should be: `perf report --inline -g function`
   and since `-g function` is the default, it would be the same as:
   `perf report --inline`

- instead of `perf report --inline-line -g address`
   it should be: `perf report --inline -g address`

Again: As a user of `perf report`, I do not care whether a frame is an inlined
one or a non-inlined one - both should be grouped and displayed the same way.

I.e. this is unusable (imo):

~
perf report --inline-line --stdio

 99.81%35.99%  cpp-inlining  cpp-inlining  [.] main
 |
 |--63.82%--main
|
---/home/milian/projects/kdab/rnd/hotspot/tests/test-
clients/cpp-inlining/main.cpp:39 (inline)
   /usr/include/c++/6.3.1/complex:664 (inline)
 |  |
 |  |--63.19%--hypot
 |  |  |
 |  |   --58.04%--__hypot_finite
 |  |
 |   --0.62%--cabs
~

Ditto for this:

~
perf report --stdio --inline-name -g address --stdio

 99.81%35.99%  cpp-inlining  cpp-inlining  [.] main
 |
 |--63.82%--main complex:655
|
---main (inline)
   std::norm (inline)
~

But, again, even with these gripes, I think it's a very useful feature and I
would like to see it integrated upstream. From my POV, you can add

Tested-by: Milian Wolff 

to all patches in this series.

Many thanks!





Re: [PATCH v4 2/2] mmc: host: s3cmci: allow probing from device tree

2017-03-02 Thread Jaehoon Chung
On 03/02/2017 10:18 AM, Sergio Prado wrote:
> Allows configuring Samsung S3C24XX MMC/SD/SDIO controller using a device
> tree.
> 
> Signed-off-by: Sergio Prado 
> ---
>  drivers/mmc/host/s3cmci.c | 298 
> --
>  drivers/mmc/host/s3cmci.h |   3 +-
>  2 files changed, 158 insertions(+), 143 deletions(-)
> 
> diff --git a/drivers/mmc/host/s3cmci.c b/drivers/mmc/host/s3cmci.c
> index 7a173f8c455b..d066dbdb957c 100644
> --- a/drivers/mmc/host/s3cmci.c
> +++ b/drivers/mmc/host/s3cmci.c
> @@ -24,6 +24,10 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
> +#include 
>  
>  #include 
>  #include 
> @@ -128,6 +132,22 @@ enum dbg_channels {
>   dbg_conf  = (1 << 8),
>  };
>  
> +struct s3cmci_variant_data {
> + int s3c2440_compatible;
> +};

I don't understand why this structure is needed.

Before this patch,
host->is2440;

After this patch,
host->variant->s3c2440_compatible;

It just adds one pointer for checking s3c2440 compatibility..
Is it really meaningful?
(I didn't read the previous comments fully.)
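
One possible reason for a per-variant structure, offered here as an assumption rather than as the patch author's stated rationale: device-tree match tables carry a `const void *data` pointer per compatible string, so a driver can fetch its variant data at probe time instead of setting a flag by hand. The sketch below models that lookup in plain user-space C. The names match_entry, match_table and lookup_variant are hypothetical; only the compatible strings and the s3c2440_compatible values are taken from the quoted patch and binding.

```c
#include <stddef.h>
#include <string.h>

struct variant_data {
	int s3c2440_compatible;
};

static const struct variant_data s3c2410_variant = { .s3c2440_compatible = 0 };
static const struct variant_data s3c2412_variant = { .s3c2440_compatible = 1 };
static const struct variant_data s3c2440_variant = { .s3c2440_compatible = 1 };

/* Hypothetical stand-in for a match table entry with a data pointer. */
struct match_entry {
	const char *compatible;
	const void *data;
};

static const struct match_entry match_table[] = {
	{ "samsung,s3c2410-sdi", &s3c2410_variant },
	{ "samsung,s3c2412-sdi", &s3c2412_variant },
	{ "samsung,s3c2440-sdi", &s3c2440_variant },
	{ NULL, NULL },	/* sentinel */
};

/* Returns the variant data for `compatible`, or NULL if it is unknown. */
const struct variant_data *lookup_variant(const char *compatible)
{
	const struct match_entry *e;

	for (e = match_table; e->compatible != NULL; e++) {
		if (strcmp(e->compatible, compatible) == 0)
			return e->data;
	}
	return NULL;
}
```

Whether a single-field structure is worth the extra indirection is exactly the question raised above; the sketch only shows where such a pointer would typically come from.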

Best Regards,
Jaehoon Chung

> +
> +static const struct s3cmci_variant_data s3c2410_s3cmci_variant_data = {
> + .s3c2440_compatible = 0,
> +};
> +
> +static const struct s3cmci_variant_data s3c2412_s3cmci_variant_data = {
> + .s3c2440_compatible = 1,
> +};
> +
> +static const struct s3cmci_variant_data s3c2440_s3cmci_variant_data = {
> + .s3c2440_compatible = 1,
> +};
> +
>  static const int dbgmap_err   = dbg_fail;
>  static const int dbgmap_info  = dbg_info | dbg_conf;
>  static const int dbgmap_debug = dbg_err | dbg_debug;
> @@ -731,7 +751,7 @@ static irqreturn_t s3cmci_irq(int irq, void *dev_id)
>   goto clear_status_bits;
>  
>   /* Check for FIFO failure */
> - if (host->is2440) {
> + if (host->variant->s3c2440_compatible) {
>   if (mci_fsta & S3C2440_SDIFSTA_FIFOFAIL) {
>   dbg(host, dbg_err, "FIFO failure\n");
>   host->mrq->data->error = -EILSEQ;
> @@ -807,21 +827,6 @@ static irqreturn_t s3cmci_irq(int irq, void *dev_id)
>  
>  }
>  
> -/*
> - * ISR for the CardDetect Pin
> -*/
> -
> -static irqreturn_t s3cmci_irq_cd(int irq, void *dev_id)
> -{
> - struct s3cmci_host *host = (struct s3cmci_host *)dev_id;
> -
> - dbg(host, dbg_irq, "card detect\n");
> -
> - mmc_detect_change(host->mmc, msecs_to_jiffies(500));
> -
> - return IRQ_HANDLED;
> -}
> -
>  static void s3cmci_dma_done_callback(void *arg)
>  {
>   struct s3cmci_host *host = arg;
> @@ -913,7 +918,7 @@ static void finalize_request(struct s3cmci_host *host)
>   if (s3cmci_host_usedma(host))
>   dmaengine_terminate_all(host->dma);
>  
> - if (host->is2440) {
> + if (host->variant->s3c2440_compatible) {
>   /* Clear failure register and reset fifo. */
>   writel(S3C2440_SDIFSTA_FIFORESET |
>  S3C2440_SDIFSTA_FIFOFAIL,
> @@ -1026,7 +1031,7 @@ static int s3cmci_setup_data(struct s3cmci_host *host, struct mmc_data *data)
>   dcon |= S3C2410_SDIDCON_XFER_RXSTART;
>   }
>  
> - if (host->is2440) {
> + if (host->variant->s3c2440_compatible) {
>   dcon |= S3C2440_SDIDCON_DS_WORD;
>   dcon |= S3C2440_SDIDCON_DATSTART;
>   }
> @@ -1045,7 +1050,7 @@ static int s3cmci_setup_data(struct s3cmci_host *host, struct mmc_data *data)
>  
>   /* write TIMER register */
>  
> - if (host->is2440) {
> + if (host->variant->s3c2440_compatible) {
>   writel(0x007F, host->base + S3C2410_SDITIMER);
>   } else {
>   writel(0x, host->base + S3C2410_SDITIMER);
> @@ -1177,19 +1182,6 @@ static void s3cmci_send_request(struct mmc_host *mmc)
>   s3cmci_enable_irq(host, true);
>  }
>  
> -static int s3cmci_card_present(struct mmc_host *mmc)
> -{
> - struct s3cmci_host *host = mmc_priv(mmc);
> - struct s3c24xx_mci_pdata *pdata = host->pdata;
> - int ret;
> -
> - if (pdata->no_detect)
> - return -ENOSYS;
> -
> - ret = gpio_get_value(pdata->gpio_detect) ? 0 : 1;
> - return ret ^ pdata->detect_invert;
> -}
> -
>  static void s3cmci_request(struct mmc_host *mmc, struct mmc_request *mrq)
>  {
>   struct s3cmci_host *host = mmc_priv(mmc);
> @@ -1198,7 +1190,7 @@ static void s3cmci_request(struct mmc_host *mmc, struct mmc_request *mrq)
>   host->cmd_is_stop = 0;
>   host->mrq = mrq;
>  
> - if (s3cmci_card_present(mmc) == 0) {
> + if (mmc_gpio_get_cd(mmc) == 0) {
>   dbg(host, dbg_err, "%s: no medium present\n", __func__);
>   host->mrq->cmd->error = -ENOMEDIUM;
>   mmc_request_done(mmc, mrq);
> @@ -1242,22 +1234,24 @@ static void s3cmci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
>   case MMC_POWER_ON:
>   case MMC_POWER_UP:
>   

> - int ret;
> -
> - if (pdata->no_detect)
> - return -ENOSYS;
> -
> - ret = gpio_get_value(pdata->gpio_detect) ? 0 : 1;
> - return ret ^ pdata->detect_invert;
> -}
> -
>  static void s3cmci_request(struct mmc_host *mmc, struct mmc_request *mrq)
>  {
>   struct s3cmci_host *host = mmc_priv(mmc);
> @@ -1198,7 +1190,7 @@ static void s3cmci_request(struct mmc_host *mmc, struct 
> mmc_request *mrq)
>   host->cmd_is_stop = 0;
>   host->mrq = mrq;
>  
> - if (s3cmci_card_present(mmc) == 0) {
> + if (mmc_gpio_get_cd(mmc) == 0) {
>   dbg(host, dbg_err, "%s: no medium present\n", __func__);
>   host->mrq->cmd->error = -ENOMEDIUM;
>   mmc_request_done(mmc, mrq);
> @@ -1242,22 +1234,24 @@ static void s3cmci_set_ios(struct mmc_host *mmc, 
> struct mmc_ios *ios)
>   case MMC_POWER_ON:
>   case MMC_POWER_UP:
>   /* Configure GPE5...GPE10 

[patch] mm, zoneinfo: print non-populated zones

2017-03-02 Thread David Rientjes
Initscripts can use the information (protection levels) from
/proc/zoneinfo to configure vm.lowmem_reserve_ratio at boot.

vm.lowmem_reserve_ratio is an array of ratios for each configured zone on
the system.  If a zone is not populated on an arch, /proc/zoneinfo
suppresses its output.

This results in there not being a 1:1 mapping between the set of zones
emitted by /proc/zoneinfo and the zones configured by
vm.lowmem_reserve_ratio.

This patch shows statistics for non-populated zones in /proc/zoneinfo.
The zones exist and hold a spot in the vm.lowmem_reserve_ratio array.
Without this patch, it is not possible to determine which index in the
array controls which zone if one or more zones on the system are not
populated.

Remaining users of walk_zones_in_node() are unchanged.  Files such as
/proc/pagetypeinfo require certain zone data to be initialized properly
for display, which is not done for unpopulated zones.

Signed-off-by: David Rientjes 
---
 mm/vmstat.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1121,8 +1121,12 @@ static void frag_stop(struct seq_file *m, void *arg)
 {
 }
 
-/* Walk all the zones in a node and print using a callback */
+/*
+ * Walk zones in a node and print using a callback.
+ * If @populated is true, only use callback for zones that are populated.
+ */
 static void walk_zones_in_node(struct seq_file *m, pg_data_t *pgdat,
+   bool populated,
void (*print)(struct seq_file *m, pg_data_t *, struct zone *))
 {
struct zone *zone;
@@ -1130,7 +1134,7 @@ static void walk_zones_in_node(struct seq_file *m, 
pg_data_t *pgdat,
unsigned long flags;
 
for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) {
-   if (!populated_zone(zone))
+   if (populated && !populated_zone(zone))
continue;
 
spin_lock_irqsave(&zone->lock, flags);
@@ -1158,7 +1162,7 @@ static void frag_show_print(struct seq_file *m, pg_data_t 
*pgdat,
 static int frag_show(struct seq_file *m, void *arg)
 {
pg_data_t *pgdat = (pg_data_t *)arg;
-   walk_zones_in_node(m, pgdat, frag_show_print);
+   walk_zones_in_node(m, pgdat, true, frag_show_print);
return 0;
 }
 
@@ -1199,7 +1203,7 @@ static int pagetypeinfo_showfree(struct seq_file *m, void 
*arg)
seq_printf(m, "%6d ", order);
seq_putc(m, '\n');
 
-   walk_zones_in_node(m, pgdat, pagetypeinfo_showfree_print);
+   walk_zones_in_node(m, pgdat, true, pagetypeinfo_showfree_print);
 
return 0;
 }
@@ -1251,7 +1255,7 @@ static int pagetypeinfo_showblockcount(struct seq_file 
*m, void *arg)
for (mtype = 0; mtype < MIGRATE_TYPES; mtype++)
seq_printf(m, "%12s ", migratetype_names[mtype]);
seq_putc(m, '\n');
-   walk_zones_in_node(m, pgdat, pagetypeinfo_showblockcount_print);
+   walk_zones_in_node(m, pgdat, true, pagetypeinfo_showblockcount_print);
 
return 0;
 }
@@ -1277,7 +1281,7 @@ static void pagetypeinfo_showmixedcount(struct seq_file 
*m, pg_data_t *pgdat)
seq_printf(m, "%12s ", migratetype_names[mtype]);
seq_putc(m, '\n');
 
-   walk_zones_in_node(m, pgdat, pagetypeinfo_showmixedcount_print);
+   walk_zones_in_node(m, pgdat, true, pagetypeinfo_showmixedcount_print);
 #endif /* CONFIG_PAGE_OWNER */
 }
 
@@ -1434,7 +1438,7 @@ static void zoneinfo_show_print(struct seq_file *m, 
pg_data_t *pgdat,
 static int zoneinfo_show(struct seq_file *m, void *arg)
 {
pg_data_t *pgdat = (pg_data_t *)arg;
-   walk_zones_in_node(m, pgdat, zoneinfo_show_print);
+   walk_zones_in_node(m, pgdat, false, zoneinfo_show_print);
return 0;
 }
 
@@ -1853,7 +1857,7 @@ static int unusable_show(struct seq_file *m, void *arg)
if (!node_state(pgdat->node_id, N_MEMORY))
return 0;
 
-   walk_zones_in_node(m, pgdat, unusable_show_print);
+   walk_zones_in_node(m, pgdat, true, unusable_show_print);
 
return 0;
 }
@@ -1905,7 +1909,7 @@ static int extfrag_show(struct seq_file *m, void *arg)
 {
pg_data_t *pgdat = (pg_data_t *)arg;
 
-   walk_zones_in_node(m, pgdat, extfrag_show_print);
+   walk_zones_in_node(m, pgdat, true, extfrag_show_print);
 
return 0;
 }
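
The effect of the new @populated argument can be sketched outside the kernel. The following is a simplified userspace model, not the kernel code: struct zone is reduced to two fields, MAX_NR_ZONES is fixed at 4, and walk_zones() is a hypothetical stand-in for walk_zones_in_node() that records which zone indices would be visited.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NR_ZONES 4  /* assumed value, for illustration only */

/* Simplified stand-in for the kernel's struct zone. */
struct zone {
    const char *name;
    unsigned long present_pages;
};

/* A zone counts as populated when it has present pages. */
static bool populated_zone(const struct zone *zone)
{
    return zone->present_pages != 0;
}

/*
 * Visit zones the way the patched walker does: when populated_only is
 * true, unpopulated zones are skipped. The index of every visited zone
 * is stored in out[]; the return value is the number of zones visited.
 */
static size_t walk_zones(const struct zone *zones, bool populated_only,
                         size_t *out)
{
    size_t i, n = 0;

    assert(zones != NULL && out != NULL);
    for (i = 0; i < MAX_NR_ZONES; i++) {
        if (populated_only && !populated_zone(&zones[i]))
            continue;
        out[n++] = i;
    }
    return n;
}
```

With populated_only set to false, the visited indices line up one-to-one with the slots of the vm.lowmem_reserve_ratio array, which is what the /proc/zoneinfo change relies on. With it set to true, a reader of the output can no longer tell which array slot a given zone corresponds to once any zone is skipped.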


Re: [PATCH v19 0/4] Introduce usb charger framework to deal with the usb gadget power negotation

2017-03-02 Thread NeilBrown
On Mon, Feb 20 2017, Baolin Wang wrote:

> Currently the Linux kernel does not provide any standard integration of this
> feature that integrates the USB subsystem with the system power regulation
> provided by PMICs meaning that either vendors must add this in their kernels
> or USB gadget devices based on Linux (such as mobile phones) may not behave
> as they should. Thus provide a standard framework for doing this in kernel.
>
> Now introduce one user with wm831x_power to support and test the usb charger.
> Another user introduced to support charger detection by Jun Li:
> https://www.spinics.net/lists/linux-usb/msg139425.html
> Moreover there may be other potential users will use it in future. 
>
> 1. Before v19 patchset we've fixed below issues in extcon subsystem and usb
> phy driver, now all were merged. (Thanks for Neil's suggestion)
> (1) Have fixed the inconsistencies with USB connector types in extcon 
> subsystem
> by following links:
> https://lkml.org/lkml/2016/12/21/13
> https://lkml.org/lkml/2016/12/21/15
> https://lkml.org/lkml/2016/12/21/79
> https://lkml.org/lkml/2017/1/3/13
>
> (2) Instead of using 'set_power' callback in phy drivers, we will introduce
> USB charger to set PMIC current drawn from USB configuration, moreover some
> 'set_power' callbacks did not implement anything to set PMIC current, thus
> remove them by following links:
> https://lkml.org/lkml/2017/1/18/436
> https://lkml.org/lkml/2017/1/18/439
> https://lkml.org/lkml/2017/1/18/438
> Now only two phy drivers (phy-isp1301-omap.c and phy-gpio-vbus-usb.c) still
> used 'set_power' callback to set current, we can remove them in future. (I
> have no platform with enabling these two phy drivers, so I can not test them
> if I converted 'set_power' callback to USB charger.)
>
> 2. Some issues pointed out by Neil Brown are still kept in this v19 patchset, and
> I explained each issue and may need to discuss again:
> (1) Change all usb phys to register an extcon and to send appropriate 
> notifications.
> Firstly, now only 3 USB phy drivers (phy-qcom-8x16-usb.c, phy-omap-otg.c and
> phy-msm-usb.c) had registered an extcon, mostly did not. I can not change all
> usb phys to register an extcon, since there are no extcon device to register
> for these different phy drivers.

You don't have to change every driver.  You just need to make it easy
and obvious how to change drivers in a consistent coherent way.
For a start you would add a 'struct extcon_dev' to 'struct usb_phy', and
possibly add or extend some 'static inline's in linux/usb/phy.h to send
notification on that extcon (if it is non-NULL).
e.g. usb_phy_vbus_on() could send an extcon notification.

Then any phy driver which adds support for setting phy->extcon_dev
appropriately, immediately gets the relevant notifications sent.
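
A rough userspace sketch of that idea follows, with mocked types rather than the real kernel structures. The extcon_dev pointer inside struct usb_phy is the addition proposed above; mock_extcon_notify() and phy_set_vbus() are hypothetical names used only for illustration, and in the kernel the notification would presumably go through the extcon API instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock; the real struct extcon_dev belongs to the extcon subsystem. */
struct extcon_dev {
    bool usb_attached;
    unsigned int notifications;
};

/* Mock of struct usb_phy carrying the proposed extcon pointer. */
struct usb_phy {
    struct extcon_dev *extcon_dev;  /* NULL if the driver did not set one */
    bool vbus_on;
};

/* Hypothetical helper: record the new state and count the notification. */
static void mock_extcon_notify(struct extcon_dev *edev, bool attached)
{
    edev->usb_attached = attached;
    edev->notifications++;
}

/*
 * Hypothetical 'static inline' in the spirit of usb_phy_vbus_on(): update
 * the phy and, only if the driver provided an extcon, send a notification.
 */
static inline void phy_set_vbus(struct usb_phy *phy, bool on)
{
    assert(phy != NULL);
    phy->vbus_on = on;
    if (phy->extcon_dev)
        mock_extcon_notify(phy->extcon_dev, on);
}
```

Drivers that never set the pointer keep working unchanged, so they can be converted one at a time.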


> Secondly, I also agreed with Peter's comments: Not only USB PHY to register
> an extcon, but also for the drivers which can detect USB charger type, it may
> be USB controller driver, USB type-c driver, pmic driver, and these drivers
> may not have an extcon device since the internal part can finish the vbus
> detect.

Whichever part can detect vbus, the driver for that part must be able to
find the extcon and trigger a notification.
Maybe one part can detect VBUS, another can measure the resistance on ID
and a third can work through the state machine to determine if D+ and D-
are shorted together.
Somehow these three need to work together to determine what is plugged
in to the external connection port.  Somewhere there must be an 'extcon'
device which represents that port and which the three devices can find
and can interact with.
I think it makes sense for the usb_phy to be the connection point.  Each
of the devices can get to the phy, and the phy can get to the extcon.
It doesn't matter very much if the usb phy driver creates the extcon, or
if something else creates the extcon and the phy driver performs a
lookup to find it (e.g. based on devicetree info).

The point is that there is obviously an external physical connection,
and so there should be an 'extcon' device that represents it.


>
> (2) Change the notifier of usb_phy to be used consistently.
> Now only 3 phy drivers (phy-generic.c, phy-ab8500-usb.c and 
> phy-gpio-vbus-usb.c)
> used the notifier of usb_phy. phy-generic.c and phy-gpio-vbus-usb.c were used 
> to
> send out the connect events, and phy-ab8500-usb.c also was used to send out 
> the
> MUSB connect events. There are no phy drivers will notify 'vbus_draw' 
> information
> by the notifier of usb_phy, which was used consistently now.
> Moreover it is difficult to change the notifier of usb_phy to be used only to
> communicate the 'vbus_draw' information, since we need to refactor and test 
> these
> related phy drivers, power drivers or some mfd drivers, which is a
> huge workload.

You missed drivers/usb/musb/omap2430.c in your list, but that hardly
matters.
phy-ab8500-usb.c appears to send vbus_draw information.

I understand your 

Re: [PATCH v4 14/36] [media] v4l2-mc: add a function to inherit controls from a pipeline

2017-03-02 Thread Steve Longerbeam



On 03/02/2017 03:48 PM, Steve Longerbeam wrote:
>
> On 03/02/2017 08:02 AM, Sakari Ailus wrote:
>> Hi Steve,
>>
>> On Wed, Feb 15, 2017 at 06:19:16PM -0800, Steve Longerbeam wrote:
>>> v4l2_pipeline_inherit_controls() will add the v4l2 controls from
>>> all subdev entities in a pipeline to a given video device.
>>>
>>> Signed-off-by: Steve Longerbeam 
>>> ---
>>>  drivers/media/v4l2-core/v4l2-mc.c | 48
>>> +++
>>>  include/media/v4l2-mc.h   | 25 
>>>  2 files changed, 73 insertions(+)
>>>
>>> diff --git a/drivers/media/v4l2-core/v4l2-mc.c
>>> b/drivers/media/v4l2-core/v4l2-mc.c
>>> index 303980b..09d4d97 100644
>>> --- a/drivers/media/v4l2-core/v4l2-mc.c
>>> +++ b/drivers/media/v4l2-core/v4l2-mc.c
>>> @@ -22,6 +22,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>  #include 
>>>  #include 
>>>  #include 
>>> @@ -238,6 +239,53 @@ int v4l_vb2q_enable_media_source(struct
>>> vb2_queue *q)
>>>  }
>>>  EXPORT_SYMBOL_GPL(v4l_vb2q_enable_media_source);
>>>
>>> +int __v4l2_pipeline_inherit_controls(struct video_device *vfd,
>>> + struct media_entity *start_entity)
>>
>> I have a few concerns / questions:
>>
>> - What's the purpose of this patch? Why not to access the sub-device node
>>   directly?
>
> I don't really understand what you are trying to say.



Actually I think I understand what you mean now. Yes, the user can
always access a subdev's control directly from its /dev/v4l-subdevXX.
I'm only providing this feature as a convenience to the user, so that
all controls in a pipeline can be accessed from one place, i.e. the
main capture device node.
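
As a rough illustration of what "inheriting" means here, below is a userspace sketch with mocked types, not the implementation from the patch. It assumes a purely linear pipeline (each entity has at most one upstream neighbour), and mock_subdev, mock_video_device and inherit_controls() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CTRLS 16  /* arbitrary capacity for this sketch */

/* Mock subdev: a list of control IDs plus a link to the next entity. */
struct mock_subdev {
    const char *name;
    const unsigned int *ctrl_ids;
    size_t num_ctrls;
    const struct mock_subdev *upstream;  /* NULL at the end of the chain */
};

/* Mock video device that collects the inherited control IDs. */
struct mock_video_device {
    unsigned int ctrl_ids[MAX_CTRLS];
    size_t num_ctrls;
};

/* Return 1 if the video device already exposes the given control ID. */
static int has_ctrl(const struct mock_video_device *vfd, unsigned int id)
{
    size_t i;

    for (i = 0; i < vfd->num_ctrls; i++) {
        if (vfd->ctrl_ids[i] == id)
            return 1;
    }
    return 0;
}

/*
 * Walk the chain starting at 'start' and add every control that the video
 * device does not expose yet. Returns the number of controls added.
 */
static size_t inherit_controls(struct mock_video_device *vfd,
                               const struct mock_subdev *start)
{
    const struct mock_subdev *sd;
    size_t i, added = 0;

    assert(vfd != NULL);
    for (sd = start; sd != NULL; sd = sd->upstream) {
        for (i = 0; i < sd->num_ctrls; i++) {
            if (vfd->num_ctrls == MAX_CTRLS ||
                has_ctrl(vfd, sd->ctrl_ids[i]))
                continue;
            vfd->ctrl_ids[vfd->num_ctrls++] = sd->ctrl_ids[i];
            added++;
        }
    }
    return added;
}
```

The subdev nodes keep their own controls in this model; the video device only gains a second access path to them.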

Steve



