Re: HP ProLiant DL360p Gen8 hangs with Linux 4.13+.

2018-01-15 Thread Laurence Oberman
On Mon, 2018-01-15 at 07:01 -0800, Hellwig, Christoph wrote:
> Laurence, I'm a little confused.  Is this the same issue we just
> fixed,
> or is this an issue showing up with the fix?
> 
> E.g. what kernel versions or trees are affected?

Hello Christoph

This showed up on a combined tree of Mike's and Jens's trees
(4.15.0-rc4.block.dm.4.16) that I was testing this weekend, but it was
not apparent on the generic upstream 4.15-rc7 from Linus.
I have to admit that was puzzling me.

When I removed your commit the issue went away.

Ming has crafted a fix so that your original commit can remain in and I
am testing that now against the same tree that was hanging before.

Ming has a handle on the issue so I will report back after testing.

The kernel is building now.

Thanks
Laurence




Re: HP ProLiant DL360p Gen8 hangs with Linux 4.13+.

2018-01-15 Thread Hellwig, Christoph
Laurence, I'm a little confused.  Is this the same issue we just fixed,
or is this an issue showing up with the fix?

E.g. what kernel versions or trees are affected?


Re: HP ProLiant DL360p Gen8 hangs with Linux 4.13+.

2018-01-15 Thread Laurence Oberman
On Mon, 2018-01-15 at 20:17 +0800, Ming Lei wrote:
> On Sun, Jan 14, 2018 at 06:40:40PM -0500, Laurence Oberman wrote:
> > On Thu, 2018-01-04 at 14:32 -0800, Vinson Lee wrote:
> > > Hi.
> > > 
> > > HP ProLiant DL360p Gen8 with Smart Array P420i boots to the login
> > > prompt and hangs with Linux 4.13 or later. I cannot log in on console
> > > or SSH into the machine. Linux 4.12 and older boot fine.
> > > 
> > > 
> > 
> > ...
> > 
> > ...
> > 
> > This issue bit me for two straight days.
> > I was testing Mike Snitzer's combined tree and this commit crept into
> > the latest combined tree.
> > 
> > commit 84676c1f21e8ff54befe985f4f14dc1edc10046b
> > Author: Christoph Hellwig 
> > Date:   Fri Jan 12 10:53:05 2018 +0800
> > 
> > genirq/affinity: assign vectors to all possible CPUs
> >    
> > Currently we assign managed interrupt vectors to all present CPUs.  This
> > works fine for systems were we only online/offline CPUs.  But in case of
> > systems that support physical CPU hotplug (or the virtualized version of
> > it) this means the additional CPUs covered for in the ACPI tables or on
> > the command line are not catered for.  To fix this we'd either need to
> > introduce new hotplug CPU states just for this case, or we can start
> > assining vectors to possible but not present CPUs.
> >    
> > Reported-by: Christian Borntraeger 
> > Tested-by: Christian Borntraeger 
> > Tested-by: Stefan Haberland 
> > Fixes: 4b855ad37194 ("blk-mq: Create hctx for each present CPU")
> > Cc: linux-kernel@vger.kernel.org
> > Cc: Thomas Gleixner 
> > Signed-off-by: Christoph Hellwig 
> > Signed-off-by: Jens Axboe 
> > 
> > I never suspected this commit as the cause of my latest hang because
> > I have used Linus's tree all the way to 4.15-rc7 with no issues.
> > 
> > Vinson reporting it against 4.13 or later did not make sense because
> > I had not seen the hang until this weekend.
> > 
> > I checked, and it's in Linus's tree, but it's not an issue in the
> > generic 4.15-rc7 for me.
> 
> Hi Laurence,
> 
> Regarding your issue, I have investigated a bit and found that it occurs
> because an irq vector may end up assigned only to offline CPUs; it may not
> be the same issue as Vinson's.
> 
> The following patch can address your issue. I may prepare a formal version
> if no one objects to this approach.
> 
> Thomas, Christoph, could you take a look at this patch?
> 
> ---
>  kernel/irq/affinity.c | 69 +++
>  1 file changed, 47 insertions(+), 22 deletions(-)
> 
> diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
> index a37a3b4b6342..dfc1f6a9c488 100644
> --- a/kernel/irq/affinity.c
> +++ b/kernel/irq/affinity.c
> @@ -94,6 +94,39 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
>  	return nodes;
>  }
>  
> +/*
> + * Spread the affinity of @nmsk into @nr_vecs irq vectors, and the
> + * result is stored to @start_irqmsk.
> + */
> +static int irq_vecs_spread_affinity(struct cpumask *irqmsk,
> +				    int max_irqmsks,
> +				    struct cpumask *nmsk,
> +				    int max_ncpus)
> +{
> +	int v, ncpus;
> +	int vecs_to_assign, extra_vecs;
> +
> +	/* Calculate the number of cpus per vector */
> +	ncpus = cpumask_weight(nmsk);
> +	vecs_to_assign = min(max_ncpus, ncpus);
> +
> +	/* Account for rounding errors */
> +	extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);
> +
> +	for (v = 0; v < min(max_irqmsks, vecs_to_assign); v++) {
> +		int cpus_per_vec = ncpus / vecs_to_assign;
> +
> +		/* Account for extra vectors to compensate rounding errors */
> +		if (extra_vecs) {
> +			cpus_per_vec++;
> +			--extra_vecs;
> +		}
> +		irq_spread_init_one(irqmsk + v, nmsk, cpus_per_vec);
> +	}
> +
> +	return v;
> +}
> +
>  /**
>   * irq_create_affinity_masks - Create affinity masks for multiqueue spreading
>   * @nvecs:	The total number of vectors
> @@ -104,7 +137,7 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
>  struct cpumask *
>  irq_create_a

Re: HP ProLiant DL360p Gen8 hangs with Linux 4.13+.

2018-01-15 Thread Ming Lei
On Sun, Jan 14, 2018 at 06:40:40PM -0500, Laurence Oberman wrote:
> On Thu, 2018-01-04 at 14:32 -0800, Vinson Lee wrote:
> > Hi.
> > 
> > HP ProLiant DL360p Gen8 with Smart Array P420i boots to the login
> > prompt and hangs with Linux 4.13 or later. I cannot log in on console
> > or SSH into the machine. Linux 4.12 and older boot fine.
> > 
> > 
> ...
> 
> ...
> 
> This issue bit me for two straight days.
> I was testing Mike Snitzer's combined tree and this commit crept into
> the latest combined tree.
> 
> commit 84676c1f21e8ff54befe985f4f14dc1edc10046b
> Author: Christoph Hellwig 
> Date:   Fri Jan 12 10:53:05 2018 +0800
> 
> genirq/affinity: assign vectors to all possible CPUs
>    
> Currently we assign managed interrupt vectors to all present CPUs.  This
> works fine for systems were we only online/offline CPUs.  But in case of
> systems that support physical CPU hotplug (or the virtualized version of
> it) this means the additional CPUs covered for in the ACPI tables or on
> the command line are not catered for.  To fix this we'd either need to
> introduce new hotplug CPU states just for this case, or we can start
> assining vectors to possible but not present CPUs.
>    
> Reported-by: Christian Borntraeger 
> Tested-by: Christian Borntraeger 
> Tested-by: Stefan Haberland 
> Fixes: 4b855ad37194 ("blk-mq: Create hctx for each present CPU")
> Cc: linux-kernel@vger.kernel.org
> Cc: Thomas Gleixner 
> Signed-off-by: Christoph Hellwig 
> Signed-off-by: Jens Axboe 
> 
> I never suspected this commit as the cause of my latest hang because
> I have used Linus's tree all the way to 4.15-rc7 with no issues.
> 
> Vinson reporting it against 4.13 or later did not make sense because
> I had not seen the hang until this weekend.
> 
> I checked, and it's in Linus's tree, but it's not an issue in the generic
> 4.15-rc7 for me.

Hi Laurence,

Regarding your issue, I have investigated a bit and found that it occurs
because an irq vector may end up assigned only to offline CPUs; it may not
be the same issue as Vinson's.

The following patch can address your issue. I may prepare a formal version
if no one objects to this approach.

Thomas, Christoph, could you take a look at this patch?

---
 kernel/irq/affinity.c | 69 +++
 1 file changed, 47 insertions(+), 22 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index a37a3b4b6342..dfc1f6a9c488 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -94,6 +94,39 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
 	return nodes;
 }
 
+/*
+ * Spread the affinity of @nmsk into @nr_vecs irq vectors, and the
+ * result is stored to @start_irqmsk.
+ */
+static int irq_vecs_spread_affinity(struct cpumask *irqmsk,
+				    int max_irqmsks,
+				    struct cpumask *nmsk,
+				    int max_ncpus)
+{
+	int v, ncpus;
+	int vecs_to_assign, extra_vecs;
+
+	/* Calculate the number of cpus per vector */
+	ncpus = cpumask_weight(nmsk);
+	vecs_to_assign = min(max_ncpus, ncpus);
+
+	/* Account for rounding errors */
+	extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);
+
+	for (v = 0; v < min(max_irqmsks, vecs_to_assign); v++) {
+		int cpus_per_vec = ncpus / vecs_to_assign;
+
+		/* Account for extra vectors to compensate rounding errors */
+		if (extra_vecs) {
+			cpus_per_vec++;
+			--extra_vecs;
+		}
+		irq_spread_init_one(irqmsk + v, nmsk, cpus_per_vec);
+	}
+
+	return v;
+}
+
 /**
  * irq_create_affinity_masks - Create affinity masks for multiqueue spreading
  * @nvecs:	The total number of vectors
@@ -104,7 +137,7 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
 struct cpumask *
 irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 {
-	int n, nodes, cpus_per_vec, extra_vecs, curvec;
+	int n, nodes, curvec;
 	int affv = nvecs - affd->pre_vectors - affd->post_vectors;
 	int last_affv = affv + affd->pre_vectors;
 	nodemask_t nodemsk = NODE_MASK_NONE;
@@ -154,33 +187,25 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	}
 
 	for_each_node_mask(n, nodemsk) {
-		int ncpus, v, vecs_to_assign, vecs_per_node;
+		int vecs_per_node;
 
 		/* Spread the vectors per node */
 		vecs_per_node = (affv - (curvec - affd->pre_vectors)) / nodes;
 
-		/* Get the cpus on t
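
The rounding logic in irq_vecs_spread_affinity() above can be tried outside
the kernel. What follows is only a minimal sketch in plain shell arithmetic,
not the patch itself: the helper name spread_cpus and its two arguments are
hypothetical, and it models a single cap on the vector count rather than both
max_irqmsks and max_ncpus.

```shell
# Print how many CPUs each vector receives when ncpus CPUs are spread
# over at most max_vecs vectors (hypothetical helper, not kernel code).
spread_cpus() {
    ncpus=$1
    max_vecs=$2
    # Never use more vectors than there are CPUs.
    if [ "$max_vecs" -lt "$ncpus" ]; then vecs=$max_vecs; else vecs=$ncpus; fi
    [ "$vecs" -gt 0 ] || return 1      # nothing to spread; avoid dividing by zero
    base=$((ncpus / vecs))
    extra=$((ncpus % vecs))            # same value as ncpus - vecs * (ncpus / vecs)
    out=""
    v=0
    while [ "$v" -lt "$vecs" ]; do
        n=$base
        # Hand out the remainder one CPU at a time to the first vectors.
        if [ "$extra" -gt 0 ]; then
            n=$((n + 1))
            extra=$((extra - 1))
        fi
        out="$out${out:+ }$n"
        v=$((v + 1))
    done
    printf '%s\n' "$out"
}
```

With 8 CPUs and 3 vectors, `spread_cpus 8 3` prints "3 3 2": the first two
vectors each absorb one CPU of the remainder.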

Re: Hung Task Linux 4.13-rc7 Reiserfs

2017-09-30 Thread Shankara Pailoor
Hi,

I have a reproducer program. It takes about 3-5 minutes to trigger the
hang. The only calls are mmap, open, write, and readahead and the
writes are fairly small (512 bytes).

Reproducer Program: https://pastebin.com/cx1cgABc
Report: https://pastebin.com/uGTAw45E
Logs: https://pastebin.com/EaiE0JLf
Kernel Configs: https://pastebin.com/i6URdADw

Regards,
Shankara

On Fri, Sep 29, 2017 at 11:56 PM, Shankara Pailoor  wrote:
> Hi,
>
> I am fuzzing the kernel 4.13-rc7 with Syzkaller with Reiserfs. I am
> getting the following crash:
>
> INFO: task kworker/0:3:1103 blocked for more than 120 seconds.
>
>
> Here is the full stack trace. I noticed that there are a few tasks
> holding a sbi->lock. Below are a report and a log of all the programs
> executing at the time of the incident.
>
>
> Report: https://pastebin.com/uGTAw45E
> Logs: https://pastebin.com/EaiE0JLf
> Kernel Configs: https://pastebin.com/i6URdADw
>
> I don't have a reproducer yet and any assistance would be appreciated.
>
> Regards,
> Shankara



Hung Task Linux 4.13-rc7 Reiserfs

2017-09-29 Thread Shankara Pailoor
Hi,

I am fuzzing the kernel 4.13-rc7 with Syzkaller with Reiserfs. I am
getting the following crash:

INFO: task kworker/0:3:1103 blocked for more than 120 seconds.


Here is the full stack trace. I noticed that there are a few tasks
holding a sbi->lock. Below are a report and a log of all the programs
executing at the time of the incident.


Report: https://pastebin.com/uGTAw45E
Logs: https://pastebin.com/EaiE0JLf
Kernel Configs: https://pastebin.com/i6URdADw

I don't have a reproducer yet and any assistance would be appreciated.

Regards,
Shankara



Re: RTL8192EE PCIe Wireless Network Adapter crashed with linux-4.13

2017-09-21 Thread Kalle Valo
Larry Finger  writes:

> On 09/21/2017 06:37 AM, Zwindl wrote:
>> Hi, I've reported this to Arch Linux's bugzilla and finally found the
>> flag that caused the issue: `CONFIG_INTEL_IOMMU_DEFAULT_ON=y`. I think
>> this may be a kernel bug. More details at
>> https://bugs.archlinux.org/task/55665
>
> My standard kernel has the following:
>
> CONFIG_INTEL_IOMMU=y
> # CONFIG_INTEL_IOMMU_SVM is not set
> # CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
>
> I will do some further testing to see if turning on
> CONFIG_INTEL_IOMMU_DEFAULT_ON also breaks my system.

But not all systems have an IOMMU, so check dmesg to confirm that the IOMMU
is really enabled.
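
One way to do that check is to filter the kernel log for IOMMU messages. This
is a minimal sketch, assuming Intel IOMMU messages carry a "DMAR" tag or
mention "IOMMU"; the exact wording varies between kernel versions, and the
helper name iommu_lines is hypothetical.

```shell
# Filter a kernel log on stdin for IOMMU-related lines.
# Assumption: relevant messages contain "DMAR" or "IOMMU" (case-insensitive).
iommu_lines() {
    grep -i -E 'dmar|iommu'
}

# Typical use on a live system (reading the log may need root):
#   dmesg | iommu_lines
```

No output, and a non-zero exit status from grep, suggests that the IOMMU was
not initialised on that boot.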

-- 
Kalle Valo


Re: RTL8192EE PCIe Wireless Network Adapter crashed with linux-4.13

2017-09-21 Thread Larry Finger

On 09/21/2017 06:37 AM, Zwindl wrote:
> Hi, I've reported this to Arch Linux's bugzilla and finally found the flag
> that caused the issue: `CONFIG_INTEL_IOMMU_DEFAULT_ON=y`. I think this may be
> a kernel bug. More details at https://bugs.archlinux.org/task/55665


My standard kernel has the following:

CONFIG_INTEL_IOMMU=y
# CONFIG_INTEL_IOMMU_SVM is not set
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set

I will do some further testing to see if turning on
CONFIG_INTEL_IOMMU_DEFAULT_ON also breaks my system.
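
The state of a single option can be read from a config file without opening
it in an editor. This is a minimal sketch; config_state is a hypothetical
helper, and where the running kernel's config lives (for example
/proc/config.gz or a file under /boot) depends on the distribution.

```shell
# Print the state of one option in a kernel config file:
# its assigned value (y, m, ...), "not set", or "absent".
config_state() {
    file=$1
    opt=$2
    if line=$(grep -E "^${opt}=" "$file"); then
        # Strip everything up to and including the first "=".
        printf '%s\n' "${line#*=}"
    elif grep -q "^# ${opt} is not set\$" "$file"; then
        echo "not set"
    else
        echo "absent"
    fi
}
```

Run against the three lines quoted above, `config_state FILE
CONFIG_INTEL_IOMMU_DEFAULT_ON` would print "not set".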


Larry



Re: stable-rc/linux-4.13.y build: 208 builds: 0 failed, 208 passed, 29 warnings (v4.13.2-53-gb857b6dfc252)

2017-09-18 Thread gregkh
On Mon, Sep 18, 2017 at 04:19:08PM +0200, Arnd Bergmann wrote:
> On Mon, Sep 18, 2017 at 12:57 PM, kernelci.org bot  wrote:
> >
> > stable-rc/linux-4.13.y build: 208 builds: 0 failed, 208 passed, 29 warnings 
> > (v4.13.2-53-gb857b6dfc252)
> 
> Same as v4.9, please backport
> 
> 7bf7a193a90c ("xfs: fix compiler warnings")

Now applied, thanks.

greg k-h


Re: stable-rc/linux-4.13.y build: 208 builds: 0 failed, 208 passed, 29 warnings (v4.13.2-53-gb857b6dfc252)

2017-09-18 Thread Arnd Bergmann
On Mon, Sep 18, 2017 at 12:57 PM, kernelci.org bot  wrote:
>
> stable-rc/linux-4.13.y build: 208 builds: 0 failed, 208 passed, 29 warnings 
> (v4.13.2-53-gb857b6dfc252)

Same as v4.9, please backport

7bf7a193a90c ("xfs: fix compiler warnings")

   Arnd


Re: RTL8192EE PCIe Wireless Network Adapter crashed with linux-4.13

2017-09-16 Thread Larry Finger

On 09/16/2017 06:27 AM, Zwindl wrote:
> Hi, I've done the test, and something weird happened. The kernel built with
> this config file, https://ptpb.pw/HF1g, which is from
> https://aur.archlinux.org/packages/linux-git/, runs properly and the wifi can
> connect, whichever version it is. But with this config file,
> https://ptpb.pw/7GuV, which comes from Arch Linux's official package build
> repo (linux-package), every version starting with 4.13 fails to connect to
> wifi.
> So I think the issue is caused not by the kernel code but by some options in
> the config file. I can't fully understand the meaning of these options, so I
> can't determine which option caused the issue. What should I do now? Maybe
> report this bug to Arch Linux's maintainer?
> By the way, I may lose my internet connection tomorrow, as it's time to go
> back to university, but I'm happy to help push the debugging forward.


Yes, you need to report this to archlinux's bugzilla or maintainer, whichever is 
appropriate. I have seen a configuration error cause some feature to be silently 
missing, but leading to a WARN is rare.


I looked at your two configurations, but did not see a definitive difference.
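
A comparison like this can also be done mechanically. This is a minimal
sketch, assuming both files use the usual "CONFIG_FOO=..." and
"# CONFIG_FOO is not set" line forms; config_diff is a hypothetical helper,
and the kernel tree ships its own scripts/diffconfig for the same purpose.

```shell
# List option lines that appear in only one of two kernel config files.
# Lines unique to the first file are printed unindented; lines unique
# to the second file are printed with a leading tab (comm -3 layout).
config_diff() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    # Keep only option lines and sort them in the C locale so that
    # sort and comm agree on the ordering.
    grep -E '^(CONFIG_|# CONFIG_)' "$1" | LC_ALL=C sort > "$a"
    grep -E '^(CONFIG_|# CONFIG_)' "$2" | LC_ALL=C sort > "$b"
    LC_ALL=C comm -3 "$a" "$b"
    rm -f "$a" "$b"
}
```

With two full distribution configs the output can still run to hundreds of
lines, so it narrows the search rather than pinpointing the option.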

Larry



Re: RTL8192EE PCIe Wireless Network Adapter crashed with linux-4.13

2017-09-15 Thread Larry Finger

On 09/15/2017 12:12 PM, Zwindl wrote:

> Thanks for your patience and advice; I'll keep that in mind.
> I do want to help, and I have one day to build the system, but I can't recall
> how to compile it. The last time I compiled a kernel was in 2013, so I may ask
> you many stupid questions during the build.
>
> ZWindL


Building a new kernel is not difficult. In an average week, I make at least 10 
new kernels. Many of them are done on slow machines that take many hours. Your 
i5 CPU, at least, should do it in less than one hour.


Step 1: Download the kernel sources using

git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

If your system complains that the git command is unknown, then you will need to 
install it with your package manager (pacman?).


Step 2: "cd linux" and copy the latest /boot/config-. to the linux source 
directory as ".config". Edit .config, find the line that says
"# CONFIG_LOCALVERSION_AUTO is not set", and change the line to read 
"CONFIG_LOCALVERSION_AUTO=y".
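
The edit in Step 2 can be scripted instead of done by hand. This is a minimal
sketch, assuming the option appears in the file in its "is not set" form;
enable_localversion_auto is a hypothetical helper name.

```shell
# Turn "# CONFIG_LOCALVERSION_AUTO is not set" into
# "CONFIG_LOCALVERSION_AUTO=y" in the given config file.
enable_localversion_auto() {
    cfg=$1
    tmp=$(mktemp) || return 1
    # Write the edited copy to a temporary file, then copy it back so
    # the original file keeps its permissions.
    sed 's/^# CONFIG_LOCALVERSION_AUTO is not set$/CONFIG_LOCALVERSION_AUTO=y/' \
        "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Typical use from the top of the kernel source tree:
#   enable_localversion_auto .config
```

If the line is absent altogether, the file is left unchanged and the option
still has to be added manually.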


Step 3: Build and install the latest version using

make -j9
sudo make modules_install install

You will need to answer some configuration questions at the start of the first 
make line. Accept the default value each time, i.e. just press ENTER. When the 
build is complete, reboot. Grub should show an entry for something like 
v4.13-12084-ged43e4d190d0. The numbers after the 4.13 will likely be different, 
but the form will match. Check whether the new kernel still has the fault. If 
not, it has been fixed and we do not need to find it.


If the problem is still in the latest version of the kernel, then we start the 
bisection with the following:


git bisect start
git bisect bad v4.13
git bisect good v4.12

At this point, git will report the number of revisions to test, the likely 
number of tries, and the SHA hash for the new kernel. Record the first 7 digits 
of the hash, and repeat the make commands above. After the build is complete, 
reboot into the kernel with the hash in the version name and test. Then enter 
the command "git bisect xxx", where xxx is good or bad depending on the test. A 
new trial will be generated by bisecting the appropriate half of the commits. 
Record its hash and redo the build. Repeat until git tells you the bad commit.
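
The "likely number of tries" grows only with the logarithm of the number of
commits between the good and bad points, because each test halves the
remaining range. This is a rough sketch of that estimate; bisect_steps is a
hypothetical helper, and git's own figure may differ by a step because real
history is not a straight line.

```shell
# Estimate how many build-and-test rounds a bisection needs to isolate
# one bad commit among n candidates: the smallest k with 2^k >= n.
bisect_steps() {
    n=$1
    steps=0
    span=1
    # Each test halves the candidate range, so count doublings up to n.
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}
```

For a range of about 12,000 commits, `bisect_steps 12000` prints 14, so the
whole search is on the order of fourteen builds and reboots.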


This process will generate a number of kernels that will take quite a bit of 
disk space. If you run short, you can delete kernels that have already been 
tested from /boot. You should also delete the corresponding modules from 
/lib/modules.


Good luck,

Larry



Re: RTL8192EE PCIe Wireless Network Adapter crashed with linux-4.13

2017-09-15 Thread Larry Finger

On 09/15/2017 05:10 AM, Zwindl wrote:



 Original Message 
Subject: Re: RTL8192EE PCIe Wireless Network Adapter crashed with linux-4.13
Local Time: 14 September 2017 6:05 PM
UTC Time: 14 September 2017 18:05
From: larry.fin...@lwfinger.net
To: Zwindl, linux-wirel...@vger.kernel.org, chaoming...@realsil.com.cn,
kv...@codeaurora.org, pks...@realtek.com, johannes.b...@intel.com,
gre...@linuxfoundation.org, net...@vger.kernel.org,
linux-kernel@vger.kernel.org


On 09/14/2017 08:30 AM, Zwindl wrote:
> Dear developers:
> I'm using Arch Linux with testing enabled; the current kernel version and
> details are
> `Linux zwindl 4.13.2-1-ARCH #1 SMP PREEMPT Thu Sep 14 02:57:34 UTC 2017 x86_64
> GNU/Linux`.
> The wireless card has not worked properly since kernel 4.13. Here's the log (in
> the attachment) from when NetworkManager tries to connect to my wifi, which is
> named 'TP'; my MAC address is hidden as xx:xx:xx:xx:xx.
> What should I provide to help debug this?
> ZWindL.

The BUG-ON arises in __intel_map_single() due to dir (for direction of DMA)
equal to DMA_NONE (3). When rtl8192ee calls pci_map_single(), it uses
PCI_DMA_TODEVICE (1). I followed the calling sequence through the entire chain,
and none of the routines made any changes to 'dir', other than changing the type
from int to enum dma_data_direction. That would not have changed a 1 to a 3.

I built a 4.13.2 system. The problem does not happen here. At this point, the
system has been up for about two hours. I did discover a small memory leak
associated with firmware loading, but that should not have caused the problem.
Nonetheless, I will be sending a patch to fix that problem.

I will continue testing, although I doubt that the problem will happen here.

How long had your system been up when the problem occurred? Your dmesg fragment
did not show any times. What kernels have you tried besides 4.13.2?

Larry

Oh, sorry, the original log is from `journalctl`.
Here's the `dmesg` output (error.txt). I can't determine which part is relevant, 
so I pasted all of it. I've tried 4.12.X (no issue), 4.13.1 (issue), and 4.13.2 
(issue).

ZWindL


The output of dmesg is a lot more instructive than that of journalctl. I now 
know exactly the location that triggered the WARNING. I still do not understand 
it. In fact, it is likely a regression in kernel 4.13 that does not affect my 
Toshiba laptop, nor a Lenovo machine I have, but does affect your Lenovo laptop.


Is it possible for you to install the mainline source from git.kernel.org using 
git and bisect the issue? It will take quite a bit of time, but it is likely the 
only way to find the offending change. If you are willing to try this, I will 
send you reasonably complete instructions.


By the way, it is usually better to load the dmesg output into a pastebin site 
and post the link. Sending the entire file to a list makes a lot of people 
receive a lot of data in which they have no interest.


Larry




Re: RTL8192EE PCIe Wireless Network Adapter crashed with linux-4.13

2017-09-14 Thread Larry Finger

On 09/14/2017 08:30 AM, Zwindl wrote:

> Dear developers:
> I'm using Arch Linux with testing enabled; the current kernel version and
> details are
> `Linux zwindl 4.13.2-1-ARCH #1 SMP PREEMPT Thu Sep 14 02:57:34 UTC 2017 x86_64
> GNU/Linux`.
> The wireless card has not worked properly since kernel 4.13. Here's the log (in
> the attachment) from when NetworkManager tries to connect to my wifi, which is
> named 'TP'; my MAC address is hidden as xx:xx:xx:xx:xx.
>
> What should I provide to help debug this?
> ZWindL.


The BUG-ON arises in __intel_map_single() due to dir (for direction of DMA) 
equal to DMA_NONE (3). When rtl8192ee calls pci_map_single(), it uses 
PCI_DMA_TODEVICE (1). I followed the calling sequence through the entire chain, 
and none of the routines made any changes to 'dir', other than changing the type 
from int to enum dma_data_direction. That would not have changed a 1 to a 3.


I built a 4.13.2 system. The problem does not happen here. At this point, the 
system has been up for about two hours. I did discover a small memory leak 
associated with firmware loading, but that should not have caused the problem. 
Nonetheless, I will be sending a patch to fix that problem.


I will continue testing, although I doubt that the problem will happen here.

How long had your system been up when the problem occurred? Your dmesg fragment 
did not show any times. What kernels have you tried besides 4.13.2?


Larry



[ANNOUNCE] Iproute2 for Linux 4.13

2017-09-05 Thread Stephen Hemminger

Update to iproute2 utility to support new features in Linux 4.13.
This is a larger than usual release because of lots of updates for BPF
and the new RDMA utility. Lots of cleanups and Coverity reported
potential issues as well.

Source:
  https://www.kernel.org/pub/linux/utils/net/iproute2/iproute2-4.13.0.tar.gz

Repository:
  git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git

Report problems (or enhancements) to the net...@vger.kernel.org mailing list.

---
Alexander Alemayhu (1):
  examples/bpf: update list of examples

Andreas Henriksson (1):
  ss: fix help/man TCP-STATE description for listening

Arkadi Sharshevsky (1):
  bridge: Distinguish between externally learned vs offloaded FDBs

Casey Callendrello (1):
  netns: make /var/run/netns bind-mount recursive

Daniel Borkmann (8):
  bpf: remove obsolete samples
  bpf: support loading map in map from obj
  bpf: dump id/jited info for cls/act programs
  bpf: improve error reporting around tail calls
  bpf: fix mnt path when from env
  bpf: unbreak libelf linkage for bpf obj loader
  bpf: minor cleanups for bpf_trace_pipe
  bpf: consolidate dumps to use bpf_dump_prog_info

David Ahern (2):
  lib: Dump ext-ack string by default
  libnetlink: Fix extack attribute parsing

Girish Moodalbail (1):
  geneve: support for modifying geneve device

Hangbin Liu (1):
  utils: return default family when rtm_family is not RTNL_FAMILY_IPMR/IP6MR

Jakub Kicinski (3):
  bpf: print xdp offloaded mode
  bpf: add xdpdrv for requesting XDP driver mode
  bpf: allow requesting XDP HW offload

Jiri Benc (2):
  tc: m_tunnel_key: reformat the usage text
  tc: m_tunnel_key: add csum/nocsum option

Jiri Pirko (7):
  tc_filter: add support for chain index
  tc: actions: add helpers to parse and print control actions
  tc/actions: introduce support for goto chain action
  tc: flower: add support for tcp flags
  tc: gact: fix control action parsing
  tc: add support for TRAP action
  tc: don't print error message on miss when parsing action with default

Leon Romanovsky (8):
  utils: Move BIT macro to common header
  rdma: Add basic infrastructure for RDMA tool
  rdma: Add dev object
  rdma: Add link object
  rdma: Add json and pretty outputs
  rdma: Implement json output for dev object
  rdma: Add json output to link object
  rdma: Add initial manual for the tool

Martin KaFai Lau (1):
  bpf: Add support for IFLA_XDP_PROG_ID

Matteo Croce (3):
  tc: fix typo in manpage
  netns: avoid directory traversal
  netns: more input validation

Michal Kubecek (2):
  iplink: check for message truncation in iplink_get()
  iplink: double the buffer size also in iplink_get()

Or Gerlitz (1):
  tc: flower: add support for matching on ip tos and ttl

Phil Sutter (58):
  bpf: Make bytecode-file reading a little more robust
  Really fix get_addr() and get_prefix() error messages
  tc-simple: Fix documentation
  examples: Some shell fixes to cbq.init
  ifcfg: Quote left-hand side of [ ] expression
  tipc/node: Fix socket fd check in cmd_node_get_addr()
  iproute_lwtunnel: Argument to strerror must be positive
  iproute_lwtunnel: csum_mode value checking was ineffective
  ss: Don't leak fd in tcp_show_netlink_file()
  tc/em_ipset: Don't leak sockfd on error path
  ipvrf: Fix error path of vrf_switch()
  ifstat: Fix memleak in error case
  ifstat: Fix memleak in dump_kern_db() for json output
  ss: Fix potential memleak in unix_stats_print()
  tipc/bearer: Fix resource leak in error path
  devlink: No need for this self-assignment
  ipntable: No need to check and assign to parms_rta
  iproute: Fix for missing 'Oifs:' display
  lib/rt_names: Drop dead code in rtnl_rttable_n2a()
  ss: Skip useless check in parse_hostcond()
  ss: Drop useless assignment
  tc/m_gact: Drop dead code
  ipaddress: Avoid accessing uninitialized variable lcl
  iplink_can: Prevent overstepping array bounds
  ipmaddr: Avoid accessing uninitialized data
  ss: Use C99 initializer in netlink_show_one()
  netem/maketable: Check return value of fstat()
  tc/q_multiq: Don't pass garbage in TCA_OPTIONS
  iproute: Check mark value input
  iplink_vrf: Complain if main table is not found
  devlink: Check return code of strslashrsplit()
  lib/bpf: Don't leak fp in bpf_find_mntpt()
  ifstat, nstat: Check fdopen() return value
  tc/q_netem: Don't dereference possibly NULL pointer
  tc/tc_filter: Make sure filter name is not empty
  tipc/bearer: Prevent NULL pointer dereference
  ipntable: Avoid memory allocation for filter.name
  lib/fs: Fix format string in find_fs_mount()
  lib/inet_proto: Review inet_proto_{a2n,n2a}()
  lnstat_util: Simplify alloc

Re: Linux 4.13

2017-09-03 Thread Linus Torvalds
Oh, and a side note on the merge window for 4.14 that obviously opened
as a result of the 4.13 release..

Tomorrow being Labor Day(*) in the US, I'm likely not merging as much
as I usually try to do early in the merge window. I'll probably do
some early pull requests today, do _some_ stuff tomorrow in between
hamburgers, and then the merge window will commence normally on
Tuesday.

  Linus

(*) For non-US people, the US Labor Day is almost, but not quite,
entirely different from the Labor Day celebrated May 1st elsewhere.
Instead of parades, there's hamburgers. And instead of labor
relations, it's all about the BBQ and "end of summer".


Linux 4.13

2017-09-03 Thread Linus Torvalds
e to new mmu_notifier semantic
  IB/hfi1: update to new mmu_notifier semantic
  iommu/amd: update to new mmu_notifier semantic
  iommu/intel: update to new mmu_notifier semantic
  misc/mic/scif: update to new mmu_notifier semantic
  sgi-gru: update to new mmu_notifier semantic
  xen/gntdev: update to new mmu_notifier semantic
  KVM: update to new mmu_notifier semantic v2
  mm/mmu_notifier: kill invalidate_page

Koichiro Den (1):
  xfrm: fix null pointer dereference on state and tmpl sort

Krzysztof Kozlowski (1):
  c6x: defconfig: Cleanup from old Kconfig options

Linus Torvalds (3):
  page waitqueue: always add new entries at the end
  Revert "rmap: do not call mmu_notifier_invalidate_page() under ptl"
  Linux 4.13

Lorenzo Colitti (1):
  net: xfrm: don't double-hold dst when sk_policy in use.

Luca Coelho (1):
  iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()

Lucas Stach (1):
  ASoC: simple_card_utils: fix fallback when "label" property isn't present

Maciej Purski (1):
  drm/bridge/sii8620: Fix memory corruption

Martin Schwidefsky (2):
  s390/mm: fork vs. 5 level page tabel
  s390/mm: fix BUG_ON in crst_table_upgrade

Mathias Krause (4):
  xfrm_user: fix info leak in copy_user_offload()
  xfrm_user: fix info leak in xfrm_notify_sa()
  xfrm_user: fix info leak in build_expire()
  xfrm_user: fix info leak in build_aevent()

Matt Turner (2):
  alpha: Fix build error without CONFIG_VGA_HOSE.
  alpha: Fix section mismatches

Max Gurtovoy (1):
  nvme-rdma: default MR page size to 4k

Maxime Ripard (4):
  dt-bindings: net: Revert sun8i dwmac binding
  arm64: dts: allwinner: Revert EMAC changes
  arm: dts: sunxi: Revert EMAC changes
  net: stmmac: sun8i: Remove the compatibles

Mel Gorman (1):
  mm, madvise: ensure poisoned pages are removed from per-cpu lists

Meng Xu (1):
  perf/core: Fix potential double-fetch bug

Michael Chan (3):
  bnxt_en: Fix .ndo_setup_tc() to include XDP rings.
  bnxt_en: Free MSIX vectors when unregistering the device from bnxt_re.
  bnxt_en: Do not setup MAC address in bnxt_hwrm_func_qcaps().

Michael Cree (1):
  alpha: support R_ALPHA_REFLONG relocations for module loading

Moshe Shemesh (1):
  net/mlx5e: Fix inline header size for small packets

Nikolay Aleksandrov (9):
  sch_htb: fix crash on init failure
  sch_multiq: fix double free on init failure
  sch_hhf: fix null pointer dereference on init failure
  sch_hfsc: fix null pointer deref and double free on init failure
  sch_cbq: fix null pointer dereferences on init failure
  sch_fq_codel: avoid double free on init failure
  sch_netem: avoid null pointer deref on init failure
  sch_sfq: fix null pointer dereference on init failure
  sch_tbf: fix two null pointer dereferences on init failure

Noa Osherovich (1):
  net/mlx5: Fix arm SRQ command for ISSI version 0

Nogah Frankel (1):
  mlxsw: spectrum_switchdev: Fix mrouter flag update

Oleg Nesterov (1):
  epoll: fix race between ep_poll_callback(POLLFREE) and ep_free()/ep_remove()

Pablo Neira Ayuso (1):
  netfilter: nft_compat: check extension hook mask only if set

Paolo Abeni (1):
  udp6: set rx_dst_cookie on rx_dst updates

Parthasarathy Bhuvaragan (5):
  tipc: remove subscription references only for pending timers
  tipc: perform skb_linearize() before parsing the inner header
  tipc: reassign pointers after skb reallocation / linearization
  tipc: context imbalance at node read unlock
  tipc: permit bond slave as bearer

Paul Blakey (1):
  net/mlx5e: Properly resolve TC offloaded ipv6 vxlan tunnel source address

Pavel Belous (5):
  net:ethernet:aquantia: Extra spinlocks removed.
  net:ethernet:aquantia: Fix for number of RSS queues.
  net:ethernet:aquantia: Workaround for HW checksum bug.
  net:ethernet:aquantia: Fix for incorrect speed index.
  net:ethernet:aquantia: Show info message if bad firmware version detected.

Pavel Shilovsky (1):
  CIFS: Fix maximum SMB2 header size

Pieter Jansen van Vuuren (3):
  nfp: fix unchecked flow dissector use
  nfp: fix supported key layers calculation
  nfp: remove incorrect mask check for vlan matching

Quan Nguyen (1):
  drivers: net: xgene: Correct probe sequence handling

Richard Henderson (3):
  alpha: Update for new syscalls
  alpha: Package string routines together
  alpha: Fix typo in ev6-copy_user.S

Rob Herring (1):
  c6x: Convert to using %pOF instead of full_name

Roopa Prabhu (1):
  bridge: check for null fdb->dst before notifying switchdev drivers

Russell King (1):
  scripts/dtc: fix '%zx' warning

Sabrina Dubroca (3):
  netfilter: ipt_CLUSTERIP: fix use-after-free of proc entry
  macsec: add genl family module alias
  tcp: fix refcnt leak with ebpf conges

Re: Linux 4.13: Reported regressions as of Sunday, 2017-09-03

2017-09-03 Thread Linus Torvalds
On Sun, Sep 3, 2017 at 2:36 AM, Thorsten Leemhuis wrote:
>
> [x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
> Status: Asked on the list, but looks like issue gets ignored by everyone
> Note: I'm a bit unsure if adding this issue to this list was a good
> idea. Side note: Was reported against linux-next in May already
> Reported: 2017-07-10
> http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
> Cause: https://git.kernel.org/torvalds/c/e585513b76f7

Sadly, while I love the concept of performance tracking, the
"will-it-scale" reports haven't really been reliable enough to really
be useful. There is a _ton_ of noise in the numbers, and the
test-cases don't seem to be stable enough to really track sanely.

I wish it was otherwise, because we also got a report of "57.3%
improvement of will-it-scale.per_process_ops" this release.

So I find the kernel test robot performance tracking very interesting
in theory, but as things stand now I think it's just that:
"interesting". Not quite ready for action.

 Linus


Linux 4.13: Reported regressions as of Sunday, 2017-09-03

2017-09-03 Thread Thorsten Leemhuis
Hi! Find below my fifth regression report for Linux 4.13. It lists 4
regressions I'm currently aware of. There are no new ones; 2 got fixed
since the last report.

You can also find the report at http://bit.ly/lnxregrep413 where I try
to update it every now and then.

As always: Are you aware of any other regressions? Then please let me
know. For details see http://bit.ly/lnxregtrackid And please tell me if
there is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

== Current regressions ==

[x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
Status: Asked on the list, but looks like issue gets ignored by everyone
Note: I'm a bit unsure if adding this issue to this list was a good
idea. Side note: Was reported against linux-next in May already
Reported: 2017-07-10
http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
Cause: https://git.kernel.org/torvalds/c/e585513b76f7

[lkp-robot] [Btrfs] 28785f70ef: xfstests.generic.273.fail
Status: Jeff: "We're not ignoring it. […] collection of bugs that
approximate a correct result, and we're addressing them individually.[…]"
Reported: 2017-07-26 Last known developer activity: 2017-08-10
https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
https://lkml.kernel.org/r/bcd49705-e63a-4439-1620-57cd16f5b...@suse.com
Cause: https://git.kernel.org/torvalds/c/28785f70ef88
Linux-Regression-ID: lr#a7d273

Build regression: cc1: error: '-march=r3000' requires '-mfp32'
Note: is there any way to query 0-day to see if this is still happening?
Reported: 2017-08-13
https://lkml.org/lkml/2017/8/13/38
Cause: https://git.kernel.org/torvalds/c/89a55278dee4

usb:xhci: regression when ATI chipsets detected
Status: Fix in usb-next/usb-testing
Reported: 2017-08-23 Last known developer activity: 2017-08-28
https://lkml.kernel.org/r/1503485760-15146-1-git-send-email-sandeep.si...@amd.com
Cause: https://git.kernel.org/torvalds/c/e788787ef4f9


== Fixed since last report ==

[Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
keyboard
Status: was a tracking bug that got closed by the developer that created it
Reported: 2017-07-24
https://bugzilla.kernel.org/show_bug.cgi?id=196459
Cause: https://git.kernel.org/torvalds/c/33e4f80ee69b
Linux-Regression-ID: lr#bd29ab

CIFS mount error -112 due to "SMB3 by default for security reasons"
Status: Situation not perfect, but improved a lot by
https://git.kernel.org/torvalds/c/7e682f766f28
Note: https://lkml.org/lkml/2017/8/31/843
Reported: 2017-08-06
https://bugzilla.kernel.org/show_bug.cgi?id=196599
Cause: https://git.kernel.org/torvalds/c/eef914a9eb5e /
https://git.kernel.org/torvalds/c/908b852df1d5
Linux-Regression-ID: lr#60efe5


Linux 4.13: Reported regressions as of Monday, 2017-08-28

2017-08-28 Thread Thorsten Leemhuis
Hi! Find below my fourth regression report for Linux 4.13. It lists 6
regressions I'm currently aware of. 1 of them is new, 5 got fixed since
the last report (that was two weeks ago; didn't find time for compiling
one last week; sorry). You can also find the report at
http://bit.ly/lnxregrep413 where I try to update it every now and then.
That didn't work too well in the past few weeks, but I'll try to update
it at the end of the week as the 4.13 release gets closer.

As always: Are you aware of any other regressions? Then please let me
know. For details see http://bit.ly/lnxregtrackid And please tell me if
there is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

== Current regressions ==

[x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
Status: Asked on the list, but looks like issue gets ignored by everyone
Note: I'm a bit unsure if adding this issue to this list was a good
idea. Side note: Was reported against linux-next in May already
Reported: 2017-07-10 Developer activity: none known
http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
Cause: https://git.kernel.org/torvalds/c/e585513b76f7

[Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
keyboard
Status: it's a tracking bug for an issue that seems to get handled by
Intel devs already
Note: suspend-to-idle is rare
Reported: 2017-07-24 Developer activity: 2017-07-24
https://bugzilla.kernel.org/show_bug.cgi?id=196459
Cause: https://git.kernel.org/torvalds/c/33e4f80ee69b
Linux-Regression-ID: lr#bd29ab

[lkp-robot] [Btrfs] 28785f70ef: xfstests.generic.273.fail
Status: Jeff: "We're not ignoring it. […] collection of bugs that
approximate a correct result, and we're addressing them individually.[…]"
Reported: 2017-07-26 Developer activity: 2017-08-10
https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
https://lkml.kernel.org/r/bcd49705-e63a-4439-1620-57cd16f5b...@suse.com
Cause: https://git.kernel.org/torvalds/c/28785f70ef88
Linux-Regression-ID: lr#a7d273

CIFS mount error -112 due to "SMB3 by default for security reasons"
Status: Issue was raised on the mailing list, but ignored
Note: That's a security change, but one that IMHO at least could have
been handled a lot better by giving users a hint what's wrong (and not
"mount: […] Host is down"). I'm considering submitting a revert as an RFC
to get a discussion going.
Reported: 2017-08-06 Developer activity: none known
https://bugzilla.kernel.org/show_bug.cgi?id=196599
https://www.spinics.net/lists/linux-cifs/msg12992.html
Cause: https://git.kernel.org/torvalds/c/eef914a9eb5e &
https://git.kernel.org/torvalds/c/908b852df1d5
Linux-Regression-ID: lr#60efe5

Build regression: cc1: error: '-march=r3000' requires '-mfp32'
Note: is there any way to query if this is still happening in 0-day?
Reported: 2017-08-13 Developer activity: none known
https://lkml.kernel.org/r/59901cdb.b0ndvwhnqacjcnum%fengguang...@intel.com
Cause: https://git.kernel.org/torvalds/c/89a55278dee4

regression when ATI chipsets detected
Status: Fix proposed
Reported: 2017-08-23 Developer activity: 2017-08-24
https://lkml.kernel.org/r/1503485760-15146-1-git-send-email-sandeep.si...@amd.com
https://lkml.kernel.org/r/1503548835-27057-1-git-send-email-sandeep.si...@amd.com
Cause: https://git.kernel.org/torvalds/c/e788787ef4f9


== Going to get removed ==

SGI UV300/UV300: kernel BUG at arch/x86/mm/init_64.c:350! during boot
Status: not 100% sure if this is a regression; reporter didn't provide
feedback
Note: related to https://bugzilla.kernel.org/show_bug.cgi?id=196565 ?
Reported: 2017-08-02 Developer activity: none known
https://bugzilla.kernel.org/show_bug.cgi?id=196561

ACPI/IORT: build regression without IOMMU
Note: Adding this was a mistake, as the causing commit was in linux-next
and not yet in mainline. Sorry for the noise.
Reported: 2017-08-10 Developer activity: none known
https://lkml.kernel.org/r/20170810121114.2509560-1-a...@arndb.de


== Fixed since last report ==

Null dereference in rt5677_i2c_probe()
Status: Fixed in https://git.kernel.org/torvalds/c/9ce76511b67b
Reported: 2017-07-17 Developer activity: none known
https://bugzilla.kernel.org/show_bug.cgi?id=196397
Cause: https://git.kernel.org/torvalds/c/a36afb0ab648
Linux-Regression-ID: lr#96bd63

SCSI-MQ performance regression due to blk-mq scheduler
Status: Fixed in https://git.kernel.org/torvalds/c/cbe7dfa26eee
Reported: 2017-07-31 Developer activity: none known
https://lkml.kernel.org/r/20170731165111.11536-2-ming@redhat.com
https://lkml.kernel.org/r/20170813174422.16197-1-...@lst.de
Cause: https://git.kernel.org/torvalds/c/5c279bd9e406

Switching to MQ by default may generate some bug reports
Status: Fixed in https://git.kernel.org/torvalds/c/

Linux 4.13-rc7

2017-08-27 Thread Linus Torvalds
alloc_array()
  ipv4: better IP_MAX_MTU enforcement
  tun: handle register_netdevice() failures properly
  tipc: fix use-after-free

Eric Leblond (1):
  tools lib bpf: improve warning

Eric W. Biederman (1):
  pty: Repair TIOCGPTPEER

Eugeniy Paltsev (1):
  ARC: [plat-axs10x]: prepare dts files for enabling PAE40 on axs103

Fabrice Gasnier (6):
  iio: trigger: stm32-timer: fix quadrature mode get routine
  iio: trigger: stm32-timer: fix write_raw return value
  iio: trigger: stm32-timer: fix get/set down count direction
  iio: trigger: stm32-timer: add enable attribute
  iio: adc: stm32: fix common clock rate
  iio: trigger: stm32-timer: fix get trigger mode

Gregory CLEMENT (1):
  gpio: mvebu: Fix cause computation in irq handler

Hans Verkuil (1):
  ARM: dts: exynos: add needs-hpd for Odroid-XU3/4

Hans de Goede (1):
  Input: soc_button_array - silence -ENOENT error on Dell XPS13 9365

Heiko Carstens (2):
  KVM: s390: sthyi: fix sthyi inline assembly
  KVM: s390: sthyi: fix specification exception detection

Heiner Kallweit (1):
  rtc: ds1307: fix regmap config

Huy Nguyen (1):
  net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled

Jani Nikula (1):
  drm/i915/vbt: ignore extraneous child devices for a port

Jarkko Nikula (4):
  i2c: designware: Fix oops from i2c_dw_irq_handler_slave
  i2c: designware: Fix standard mode speed when configuring the slave mode
  i2c: designware: Remove needless pm_runtime_put_noidle() call
  i2c: designware: Fix runtime PM for I2C slave mode

Javier Martinez Canillas (1):
  i2c: core: Make comment about I2C table requirement to reflect the code

Jeffy Chen (1):
  drm/rockchip: Fix suspend crash when drm is not bound

Jiri Pirko (1):
  net: sched: fix p_filter_chain check in tcf_chain_flush

Joakim Tjernlund (1):
  ALSA: usb-audio: Add delay quirk for H650e/Jabra 550a USB headsets

Joerg Roedel (1):
  iommu: Fix wrong freeing of iommu_device->dev

Jonathan Corbet (1):
  PATCH] iio: Fix some documentation warnings

Jonathan Liu (1):
  drm/sun4i: Implement drm_driver lastclose to restore fbdev console

Josh Poimboeuf (1):
  objtool: Fix '-mtune=atom' decoding support in objtool 2.0

KT Liao (1):
  Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310

Keerthy (1):
  soc: ti: knav: Add a NULL pointer check for kdev in knav_pool_create

Kirill A. Shutemov (1):
  mm, shmem: fix handling /sys/kernel/mm/transparent_hugepage/shmem_enabled

Konstantin Khlebnikov (1):
  net_sched: fix order of queue length updates in qdisc_replace()

Krzysztof Kozlowski (1):
  ARC: defconfig: Cleanup from old Kconfig options

Lee Jones (1):
  Revert "mfd: da9061: Fix to remove BBAT_CONT register from chip model"

Linus Torvalds (5):
  Revert "pty: fix the cached path of the pty slave file descriptor in the master"
  Clarify (and fix) MAX_LFS_FILESIZE macros
  Minor page waitqueue cleanups
  Avoid page waitqueue race leaving possible page locker waiting
  Linux 4.13-rc7

Liping Zhang (1):
  openvswitch: fix skb_panic due to the incorrect actions attrlen

Logan Gunthorpe (2):
  ntb: use correct mw_count function in ntb_tool and ntb_transport
  ntb: ntb_test: ensure the link is up before trying to configure the mws

Lorenzo Bianconi (2):
  iio: magnetometer: st_magn: fix status register address for LSM303AGR
  iio: magnetometer: st_magn: remove ihl property for LSM303AGR

Lv Zheng (1):
  ACPI: EC: Fix regression related to wrong ECDT initialization order

Maarten Lankhorst (2):
  drm/atomic: Handle -EDEADLK with out-fences correctly
  drm/atomic: If the atomic check fails, return its value first

Majd Dibbiny (2):
  IB/mlx5: Fix Raw Packet QP event handler assignment
  IB/mlx5: Always return success for RoCE modify port

Mark Rutland (2):
  arm64: mm: abort uaccess retries upon fatal signal
  perf/core: Fix group {cpu,task} validation

Martijn Coenen (1):
  ANDROID: binder: fix proc->tsk check.

Masaki Ota (1):
  Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad

Masami Hiramatsu (1):
  gpio: reject invalid gpio before getting gpio_desc

Matthew Dawson (1):
  datagram: When peeking datagrams with offset < 0 don't skip empty skbs

Michael Ellerman (1):
  bpf: Update sysctl documentation to list all supported architectures

Neal Cardwell (1):
  tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP

Nicholas Piggin (3):
  kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured
  timers: Fix excessive granularity of new timers after a nohz idle
  KVM: PPC: Book3S HV: Use msgsync with hypervisor doorbells on POWER9

Nikhil Mahale (1):
  drm: Fix framebuffer leak

Noa Osherovich (1):
  IB/core: Avoid accessing non-al

Linux 4.13-rc6

2017-08-20 Thread Linus Torvalds
ix possible deadlock in TCP stack vs BPF filter
  ipv6: fix NULL dereference in ip6_route_dev_notify()
  ipv4: fix NULL dereference in free_fib_info_rcu()

Fabio Estevam (1):
  ARM: dts: imx7d-sdb: Put pinctrl_spi4 in the correct location

Florian Fainelli (2):
  irqchip: brcmstb-l2: Define an irq_pm_shutdown function
  MAINTAINERS: Remove Jason Cooper's irqchip git tree

Florian Westphal (1):
  ipv4: route: fix inet_rtm_getroute induced crash

Gary Bisson (1):
  ARM: dts: imx6qdl-nitrogen6_som2: fix PCIe reset

Gregory Greenman (2):
  iwlwifi: mvm: set A-MPDU bit upon empty BA notification from FW
  iwlwifi: mvm: rs: fix TLC statistics collection

Gustavo A. R. Silva (1):
  clocksource/drivers/em_sti: Fix error return codes in em_sti_probe()

Haim Dreyfuss (1):
  iwlwifi: fix fw_pre_next_step to apply also for C step

Hanjun Guo (1):
  irqchip/gic-v3-its: Allow GIC ITS number more than MAX_NUMNODES

Helge Deller (1):
  printk-formats.txt: Better describe the difference between %pS and %pF

Herbert Xu (1):
  crypto: ixp4xx - Fix error handling path in 'aead_perform()'

Icenowy Zheng (4):
  arm64: allwinner: a64: bananapi-m64: add missing ethernet0 alias
  arm64: allwinner: a64: pine64: add missing ethernet0 alias
  arm64: allwinner: a64: sopine: add missing ethernet0 alias
  arm64: allwinner: h5: fix pinctrl IRQs

James Smart (2):
  nvmet-fc: correct use after free on list teardown
  nvmet-fc: eliminate incorrect static markers on local variables

Jamie Iles (1):
  signal: don't remove SIGNAL_UNKILLABLE for traced tasks.

Jan Kara (2):
  audit: Fix use after free in audit_remove_watch_rule()
  audit: Receive unmount event

Johannes Weiner (1):
  mm: memcontrol: fix NULL pointer crash in test_clear_page_writeback()

Jon Paul Maloy (2):
  tipc: accept PACKET_MULTICAST packets
  tipc: avoid inheriting msg_non_seq flag when message is returned

Jussi Laako (1):
  ALSA: usb-audio: add DSD support for new Amanero PID

KT Liao (1):
  Input: elan_i2c - Add antoher Lenovo ACPI ID for upcoming Lenovo NB

Kai-Heng Feng (1):
  Input: elan_i2c - add ELAN0608 to the ACPI table

Kees Cook (1):
  mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes

Keith Busch (2):
  blk-mq: Fix queue usage on failed request allocation
  nvme-pci: set cqe_seen on polled completions

Konstantin Khlebnikov (4):
  net/sched/hfsc: allocate tcf block for hfsc root class
  net_sched: reset pointers to tcf blocks in classful qdiscs' destructors
  net_sched/sfq: update hierarchical backlog when drop packet
  net_sched: remove warning from qdisc_hash_add

Kuninori Morimoto (1):
  arm64: renesas: salvator-common: avoid audio_clkout naming conflict

Laura Abbott (1):
  mm/vmalloc.c: don't unconditonally use __GFP_HIGHMEM

Linus Torvalds (3):
  pty: fix the cached path of the pty slave file descriptor in the master
  Sanitize 'move_pages()' permission checks
  Linux 4.13-rc6

Lionel Landwerlin (1):
  drm/i915: remove unused function declaration

Lorenzo Pieralisi (1):
  irqchip/gic-v3-its-platform-msi: Fix msi-parent parsing loop

Ludovic Desroches (2):
  ARM: dts: at91: sama5d2: use sama5d2 compatible string for SMC
  ARM: dts: at91: sama5d2: fix EBI/NAND controllers declaration

Luis R. Rodriguez (5):
  test_kmod: fix kmod.sh by making it executable
  test_sysctl: fix sysctl.sh by making it executable
  wait: add wait_event_killable_timeout()
  kmod: fix wait on recursive loop
  test_kmod: fix description for -s -and -c parameters

Maor Gottlieb (1):
  IB/uverbs: Fix NULL pointer dereference during device removal

Marc Zyngier (1):
  genirq: Restore trigger settings in irq_modify_status()

Martin Kaiser (1):
  ARM: dts: i.MX25: add ranges to tscadc

Martin Wilck (1):
  nvmet: don't overwrite identify sn/fr with 0-bytes

Matt Redfearn (1):
  clocksource/drivers/Kconfig: Fix CLKSRC_PISTACHIO dependencies

Matthias Kaehlcke (2):
  clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization
  drm/i915: Return correct EDP voltage swing table for 0.85V

Michael Hernandez (1):
  scsi: qla2xxx: Fix system crash while triggering FW dump

Michal Hocko (2):
  mm: fix double mmap_sem unlock on MMF_UNSTABLE enforced SIGBUS
  mm, oom: fix potential data corruption when oom_reaper races with writer

Munehisa Kamata (1):
  xen-blkfront: use a right index when checking requests

Mustafa Ismail (2):
  i40iw: Correct variable names
  i40iw: Fix typecast of tcp_seq_num

Naftali Goldstein (3):
  iwlwifi: mvm: set the RTS_MIMO_PROT bit in flag mask when sending sta to fw
  mac80211: add api to start ba session timer expired flow
  iwlwifi: mvm: send delba upon rx ba session timeout

NeilBrown (2):
  md: always clear ->safemode when md_check_recovery 

Linux 4.13: Reported regressions as of Monday, 2017-08-14

2017-08-14 Thread Thorsten Leemhuis
Hi! Find below my third regression report for Linux 4.13. It lists 11
regressions I'm currently aware of (or 10 if you count the two scsi-mq
regressions discussions as one). 4 regressions are new; 3 got fixed
since last week's report (two others didn't even make it to the report,
as they were quickly fixed); 1 gets removed. You can also find the
report at http://bit.ly/lnxregrep413 where I try to update it every now
and then.

As always: Are you aware of any other regressions? Then please let me
know. For details see http://bit.ly/lnxregtrackid And please tell me if
there is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

P.P.S.: Sorry, I adjusted the report structure again because I added a
new field that shows the date when a proper kernel developer (normally:
one that is working in the affected subsystem) looked into the issue. That
should hopefully make it easier to spot regressions that are getting
ignored or got stuck somehow.
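
To illustrate how the new field can be used to spot stalled entries, here
is a minimal sketch in Python. It is not part of the tracking tooling; the
function names parse_entry() and is_ignored() are hypothetical, and the
field layout is assumed from the entries in this report ("Reported: <date>
Developer activity: <date or none>"). Only the first line of an entry is
taken as its title, so wrapped titles are cut short.

```python
import re

# Assumed layout of the "Reported:" line, taken from the entries in this report.
REPORTED_RE = re.compile(
    r"^Reported:\s+(\d{4}-\d{2}-\d{2})\s+Developer activity:\s+(.+)$"
)


def parse_entry(text):
    """Return the title, report date and developer-activity value of one entry."""
    lines = text.strip().splitlines()
    entry = {"title": lines[0], "reported": None, "activity": None}
    for line in lines[1:]:
        match = REPORTED_RE.match(line.strip())
        if match:
            entry["reported"] = match.group(1)
            entry["activity"] = match.group(2).strip()
            break
    return entry


def is_ignored(entry):
    """True when the entry records no developer activity at all."""
    activity = entry["activity"]
    return activity is not None and activity.lower().startswith("none")


# Sample entry copied from the report text.
SAMPLE = """\
Build regression: cc1: error: '-march=r3000' requires '-mfp32'
Status: brand new
Reported: 2017-08-13 Developer activity: none
"""

entry = parse_entry(SAMPLE)
print(entry["title"], "| ignored:", is_ignored(entry))
```

Entries whose field reads "none" or "none known" are flagged; entries
carrying a date are not.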


== Current regressions ==

[x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
Status: Asked on the list, but looks like issue gets ignored by everyone
Note: I'm a bit unsure if adding this issue to this list was a good
idea. Side note: Was reported against linux-next in May already
Reported: 2017-07-10 Developer activity: none
http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
Cause: https://git.kernel.org/torvalds/c/e585513b76

Null dereference in rt5677_i2c_probe()
Status: Patch is available in asoc-next as commit ddc9e69b9dc2
Reported: 2017-07-17 Developer activity: 2017-07-27
https://bugzilla.kernel.org/show_bug.cgi?id=196397
https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6
Cause: https://git.kernel.org/torvalds/c/a36afb0ab6
Linux-Regression-ID: lr#96bd63

[Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
keyboard
Status: it's a tracking bug for an issue that seems to get handled by
Intel devs already
Note: suspend-to-idle is rare
Reported: 2017-07-24 Developer activity: 2017-07-24
https://bugzilla.kernel.org/show_bug.cgi?id=196459
Cause: https://git.kernel.org/torvalds/c/33e4f80ee6
Linux-Regression-ID: lr#bd29ab

[lkp-robot] [Btrfs] 28785f70ef: xfstests.generic.273.fail
Status: Jeff: "We're not ignoring it. […] collection of bugs that
approximate a correct result, and we're addressing them individually.[…]"
Reported: 2017-07-26 Developer activity: 2017-08-10
https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
https://lkml.kernel.org/r/bcd49705-e63a-4439-1620-57cd16f5b...@suse.com
Cause: https://git.kernel.org/torvalds/c/28785f70ef
Linux-Regression-ID: lr#a7d273

SCSI-MQ performance regression due to blk-mq scheduler
Status: Revert planned
https://lkml.kernel.org/r/20170813174422.16197-1-...@lst.de
Note: see also "Switching to MQ by default may generate some bug reports"
Reported: 2017-07-31 Developer activity: 2017-08-13
https://lkml.kernel.org/r/20170731165111.11536-2-ming@redhat.com
https://lkml.kernel.org/r/20170813174422.16197-1-...@lst.de
Cause: https://git.kernel.org/torvalds/c/5c279bd9e4

Switching to MQ by default may generate some bug reports
Status: Revert planned
https://lkml.kernel.org/r/20170813174422.16197-1-...@lst.de
Note: see also "SCSI-MQ performance regression due to blk-mq scheduler"
Reported: 2017-08-03 Developer activity: 2017-08-13
https://lkml.kernel.org/r/20170803085115.r2jfz2lofy5sp...@techsingularity.net
https://lkml.kernel.org/r/20170813174422.16197-1-...@lst.de
Cause: https://git.kernel.org/torvalds/c/5c279bd9e4

CIFS mount error -112 due to "SMB3 by default for security reasons"
Status: Reminded people they need to get the issue to the mailing list
Note: Due to the changes in 908b852df1d5d27d289e915fea7bfc16d38b8a76.
That's a security change, but one that IMHO at least could have been
handled a lot better by giving users a hint what's wrong
Reported: 2017-08-06 Developer activity: none
https://bugzilla.kernel.org/show_bug.cgi?id=196599
https://bugzilla.kernel.org/show_bug.cgi?id=196599#c6
Cause: https://git.kernel.org/torvalds/c/eef914a9eb
Linux-Regression-ID: lr#60efe5

clang build regression in ext4
Status: report contains patch to fix issue
Reported: 2017-08-07 Developer activity: 2017-08-12
https://lkml.kernel.org/r/20170807105701.3835991-1-a...@arndb.de
Cause: https://git.kernel.org/torvalds/c/2df2c3402f

ACPI/IORT: fix build regression without IOMMU
Status: report contains patch to fix issue
Reported: 2017-08-10 Developer activity: 2017-08-10
https://lkml.kernel.org/r/20170810121114.2509560-1-a...@arndb.de
Cause: https://git.kernel.org/torvalds/c/bc8648d49a

Build regression: cc1: error: '-march=r3000' requires '-mfp32'
Status: brand new
Reported: 2017-08-13 Developer activity: none
https://lkml.kernel.org/r/5990

Linux 4.13-rc5

2017-08-13 Thread Linus Torvalds
  net: dsa: mediatek: add adjust link support for user ports

Jon Paul Maloy (1):
  tipc: remove premature ESTABLISH FSM event at link synchronization

Jonathan Corbet (1):
  mtd: nand: Fix a docs build warning

Jonathan Toppins (1):
  mm: ratelimit PFNs busy info message

Jordan Crouse (5):
  drm/msm: Remove some potentially blocked register ranges
  drm/msm: Allow hardware clock gating to be toggled
  drm/msm: Turn off hardware clock gating before reading A5XX registers
  drm/msm: args->fence should be args->flags
  drm/msm: Remove __user from __u64 data types

Juergen Gross (4):
  x86: provide an init_mem_mapping hypervisor hook
  xen: split up xen_hvm_init_shared_info()
  xen: fix hvm guest with kaslr enabled
  xen: avoid deadlock in xenbus

Julian Wiedmann (1):
  s390/qeth: fix L3 next-hop in xmit qeth hdr

K. Den (2):
  vxlan: fix remcsum when GRO on and CHECKSUM_PARTIAL boundary is outer UDP
  gue: fix remcsum when GRO on and CHECKSUM_PARTIAL boundary is outer UDP

Kai-Heng Feng (1):
  usb: quirks: Add no-lpm quirk for Moshi USB to Ethernet Adapter

Keith Busch (1):
  nvme: fix nvme reset command timeout handling

Kirill A. Shutemov (1):
  rmap: do not call mmu_notifier_invalidate_page() under ptl

Kishon Vijay Abraham I (1):
  mmc: host: omap_hsmmc: Add CMD23 capability to omap_hsmmc driver

Kunihiko Hayashi (1):
  pinctrl: uniphier: fix USB3 pin assignment for Pro4

Kwan (Hingkwan) Huen-SSI (1):
  nvme: fix directive command numd calculation

Leon Romanovsky (5):
  IB/ipoib: Clean error paths in add port
  IB/ipoib: Remove double pointer assigning
  Revert "IB/core: Allow QP state transition from reset to error"
  RDMA/uverbs: Prevent leak of reserved field
  RDMA/mlx5: Fix existence check for extended address vector

Linus Lüssing (1):
  batman-adv: fix TT sync flag inconsistencies

Linus Torvalds (1):
  Linux 4.13-rc5

Lionel Landwerlin (1):
  drm/i915/perf: fix flex eu registers programming

Liu Shuo (1):
  xen/events: Fix interrupt lost during irq_disable and irq_enable

Lorenzo Bianconi (2):
  iio: pressure: st_pressure_core: disable multiread by default for LPS22HB
  iio: accel: st_accel: add SPI-3wire support

Lucas Stach (1):
  drm/bridge: tc358767: fix probe without attached output node

Ludovic Desroches (1):
  pinctrl: generic: update references to Documentation/pinctrl.txt

Luis R. Rodriguez (4):
  firmware: fix batched requests - wake all waiters
  firmware: fix batched requests - send wake up on failure on direct lookups
  firmware: avoid invalid fallback aborts by using killable wait
  test_kmod: fix bug which allows negative values on two config options

Lukas Czerner (1):
  xfs: Fix per-inode DAX flag inheritance

Maarten Lankhorst (1):
  drm/i915: Fix out-of-bounds array access in bdw_load_gamma_lut

Maciej W. Rozycki (1):
  MIPS: DEC: Fix an int-handler.S CPU_DADDI_WORKAROUNDS regression

Manu Gautam (1):
  usb: dwc3: gadget: Correct ISOC DATA PIDs for short packets

Marc Zyngier (2):
  PCI: Add pci_reset_function_locked()
  xhci: Reset Renesas uPD72020x USB controller for 32-bit DMA issue

Marek Szyprowski (1):
  drm/exynos: forbid creating framebuffers from too small GEM buffers

Mark yao (4):
  drm/rockchip: vop: fix iommu page fault when resume
  drm/rockchip: vop: fix NV12 video display error
  drm/rockchip: vop: round_up pitches to word align
  drm/rockchip: vop: report error when check resource error

Martin Wilck (1):
  nvme: strip trailing 0-bytes in wwid_show

Mateusz Jurczyk (1):
  fuse: initialize the flock flag in fuse_file on allocation

Matija Glavinic Pecotic (1):
  MIPS: Fix race on setting and getting cpu_online_mask

Matt Redfearn (2):
  MIPS: Introduce cpu_tcache_line_size
  MIPS: PCI: Fix smp_processor_id() in preemptible

Matthias Kaehlcke (1):
  zram: rework copy of compressor name in comp_algorithm_store()

Max Filippov (3):
  xtensa: fix cache aliasing handling code for WT cache
  xtensa: don't limit csum_partial export by CONFIG_NET
  xtensa: mm/cache: add missing EXPORT_SYMBOLs

Max Gurtovoy (1):
  nvme-pci: fix CMB sysfs file removal in reset path

Mel Gorman (1):
  futex: Remove unnecessary warning from get_futex_key

Michael Ellerman (2):
  Revert "powerpc/64: Avoid restore_math call if possible in syscall exit"
  powerpc/configs: Re-enable HARD/SOFT lockup detectors

Michael S. Tsirkin (1):
  MAINTAINERS: copy virtio on balloon_compaction.c

Michał Mirosław (2):
  mmc: block: fix lockdep splat when removing mmc_block module
  drm: make DRM_STM default n

Mika Westerberg (1):
  thunderbolt: Do not enumerate more ports from DROM than the controller has

Mike Rapoport (1):
  userfaultfd: replace ENOSPC with ESRCH in case mm has gone during copy/zeropage

Milan

Re: Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-10 Thread Jeff Mahoney
On 8/6/17 9:59 AM, Thorsten Leemhuis wrote:
> Hi! Find below my second regression report for Linux 4.13. It lists 10
> regressions I'm currently aware of (albeit in one case it's not entirely
> clear yet if it's a regression in 4.13). One regression got fixed since
> last weeks report. You can also find the report at
> http://bit.ly/lnxregrep413 where I try to update it every now and then.
> 
> As always: Are you aware of any other regressions? Then please let me
> know. For details see http://bit.ly/lnxregtrackid And please tell me if
> there is anything in the report that shouldn't be there.
> 
> Ciao, Thorsten
> 
> P.S.: Thx to all those that CCed me on regression reports or provided
> other input, it makes compiling these reports a whole lot easier!
> 
> == Current regressions ==
> 
> [x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
> (2017-07-10)
> http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
> Status: Asked on the list, but issue still gets ignored by everyone
> Cause: https://git.kernel.org/torvalds/c/e585513b76
> Note: I'm a bit unsure if adding this issue to this list was a good idea.
> 
> Null dereference in rt5677_i2c_probe() (2017-07-17)
> https://bugzilla.kernel.org/show_bug.cgi?id=196397
> Linux-Regression-ID: lr#96bd63
> Status: Patch is available in in asoc-next as commit ddc9e69b9dc2, but
> was not part of the changes to this subsystem that got merged a few days ago
> Cause: https://git.kernel.org/torvalds/c/a36afb0ab6
> Latest: https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6 (2017-07-17)
> 
> [I945GM] Pasted text not shown after mouse middle-click (2017-07-17)
> https://bugs.freedesktop.org/show_bug.cgi?id=101819
> Linux-Regression-ID: lr#d672f3
> Status: could not get reproduced yet
> Note: looks like it's getting ignored
> Latest: https://bugs.freedesktop.org/show_bug.cgi?id=101819#c8 (2017-07-17)
> 
> [Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
> keyboard (2017-07-24)
> https://bugzilla.kernel.org/show_bug.cgi?id=196459
> Linux-Regression-ID: lr#bd29ab
> Status: it's a tracking bug, looks like issue is handled by Intel devs
> already
> Cause: https://git.kernel.org/torvalds/c/33e4f80ee6
> Note: suspend-to-idle is rare
> 
> [lkp-robot] [Btrfs]  28785f70ef: xfstests.generic.273.fail (2017-07-26)
> https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
> Linux-Regression-ID: lr#a7d273
> Status: Seems it gets ignored by everyone
> Cause: https://git.kernel.org/torvalds/c/28785f70ef

We're not ignoring it.  It's that this part of allocation seems to be a
collection of bugs that approximate a correct result, and we're
addressing them individually.  This patch by itself is correct but
uncovered a couple of underlying issues.

-Jeff

-- 
Jeff Mahoney
SUSE Labs




Re: [Intel-wired-lan] [regression] wake on lan no longer works in 4.13-rc3. was Re: Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-09 Thread Rafael J. Wysocki
On Wed, Aug 9, 2017 at 11:56 PM, Pavel Machek wrote:
> Hi!
>
>> >[You seem to have a stale linux-pm address in your address book,
>> >I replaced it with the current one in the CC list.]
>
> Thanks, fixed.
>
>> >>ACPI S3, right. Machine still wakes up properly when I hit a key on
>> >>USB keyboard.
>> >
>> >OK, so my guess would be a driver issue.  What driver is this, igb?
>> >
>> >>Does wake on LAN work on you, on any hardware?
>> >
>> >Yes it does, I checked two machines earlier today, both work.
>> >
>> >Can you enable dynamic debug in device_pm.c and send a dmesg output
>> >with that covering a suspend-resume cycle?
>>
>> 82579 is e1000e
>
> Thanks for all the help.
>
> I now realized what was going on: I had badly inserted ethernet
> cable; machine just connected to wifi, and everything worked... except
> wake on LAN.
>
> Sorry for the noise,

No worries. ;-)


Re: [Intel-wired-lan] [regression] wake on lan no longer works in 4.13-rc3. was Re: Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-09 Thread Pavel Machek
Hi!

> >[You seem to have a stale linux-pm address in your address book,
> >I replaced it with the current one in the CC list.]

Thanks, fixed.

> >>ACPI S3, right. Machine still wakes up properly when I hit a key on
> >>USB keyboard.
> >
> >OK, so my guess would be a driver issue.  What driver is this, igb?
> >
> >>Does wake on LAN work on you, on any hardware?
> >
> >Yes it does, I checked two machines earlier today, both work.
> >
> >Can you enable dynamic debug in device_pm.c and send a dmesg output
> >with that covering a suspend-resume cycle?
> 
> 82579 is e1000e

Thanks for all the help.

I now realized what was going on: I had badly inserted ethernet
cable; machine just connected to wifi, and everything worked... except
wake on LAN.

Sorry for the noise,

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



Re: [Intel-wired-lan] [regression] wake on lan no longer works in 4.13-rc3. was Re: Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-09 Thread Rafael J. Wysocki
On Wed, Aug 9, 2017 at 11:15 PM, Hisashi T Fujinaka  wrote:
> On Wed, 9 Aug 2017, Rafael J. Wysocki wrote:
>
>> [You seem to have a stale linux-pm address in your address book,
>> I replaced it with the current one in the CC list.]
>>
>> On Wednesday, August 9, 2017 8:42:31 AM CEST Pavel Machek wrote:
>>>
>>> On Wed 2017-08-09 02:45:54, Rafael J. Wysocki wrote:

>>>> On Tuesday, August 8, 2017 11:00:53 AM CEST Pavel Machek wrote:
>>>>>
>>>>> Hi!
>>>>>
>>>>> Perhaps you should get regressi...@kernel.org alias, or something like
>>>>> that?
>>>>>
>>>>>> As always: Are you aware of any other regressions? Then please let me
>>>>>> know. For details see http://bit.ly/lnxregtrackid And please tell me
>>>>>> if there is anything in the report that shouldn't be there.
>>>>>
>>>>>
>>>>> I am using wake-on-lan quite a bit, and it stopped working. I'll move
>>>>> to -rc4 and test there; but if anyone already has some ideas, let me
>>>>> know.
>>>>>
>>>>> Hardware is thinkpad X220
>>>>>
>>>>> 00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network
>>>>> Connection (rev 04)
>>>>
>>>>
>>>> I guess this is ACPI S3 suspend?
>>>
>>>
>>> ACPI S3, right. Machine still wakes up properly when I hit a key on
>>> USB keyboard.
>>
>>
>> OK, so my guess would be a driver issue.  What driver is this, igb?
>>
>>> Does wake on LAN work on you, on any hardware?
>>
>>
>> Yes it does, I checked two machines earlier today, both work.
>>
>> Can you enable dynamic debug in device_pm.c and send a dmesg output
>> with that covering a suspend-resume cycle?
>
>
> 82579 is e1000e

Hmm.  That is also what my venerable Toshiba Portege R500 has, and it
wakes on LAN with 4.13-rc4.  So the driver is off the hook, I guess.
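
For anyone who wants to repeat the wake-on-LAN test from a second machine, here is a minimal sketch of how the magic packet payload is laid out: six 0xFF bytes followed by the target MAC address repeated 16 times (102 bytes in total). The function and macro names are made up for illustration and are not from any driver or tool. Actually sending the payload (commonly as a UDP broadcast) is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WOL_MAC_LEN    6
#define WOL_PACKET_LEN (6 + 16 * WOL_MAC_LEN)	/* 102 bytes */

/*
 * Hypothetical helper: fill `out` with a magic packet payload for `mac`.
 * Returns the number of bytes written (always WOL_PACKET_LEN).
 */
size_t wol_build_magic_packet(const uint8_t mac[WOL_MAC_LEN],
			      uint8_t out[WOL_PACKET_LEN])
{
	size_t off = 6;
	int i;

	/* Synchronization stream: six bytes of 0xFF. */
	memset(out, 0xFF, 6);

	/* Target MAC address, repeated 16 times back to back. */
	for (i = 0; i < 16; i++, off += WOL_MAC_LEN)
		memcpy(out + off, mac, WOL_MAC_LEN);

	return off;
}
```

The NIC only reacts to this if wake on magic packet is armed before suspend and, as it turned out in this thread, if the cable is actually plugged in.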


Re: [Intel-wired-lan] [regression] wake on lan no longer works in 4.13-rc3. was Re: Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-09 Thread Hisashi T Fujinaka

On Wed, 9 Aug 2017, Rafael J. Wysocki wrote:

> [You seem to have a stale linux-pm address in your address book,
> I replaced it with the current one in the CC list.]
>
> On Wednesday, August 9, 2017 8:42:31 AM CEST Pavel Machek wrote:
>> On Wed 2017-08-09 02:45:54, Rafael J. Wysocki wrote:
>>> On Tuesday, August 8, 2017 11:00:53 AM CEST Pavel Machek wrote:
>>>> Hi!
>>>>
>>>> Perhaps you should get regressi...@kernel.org alias, or something like that?
>>>>
>>>>> As always: Are you aware of any other regressions? Then please let me
>>>>> know. For details see http://bit.ly/lnxregtrackid And please tell me if
>>>>> there is anything in the report that shouldn't be there.
>>>>
>>>> I am using wake-on-lan quite a bit, and it stopped working. I'll move
>>>> to -rc4 and test there; but if anyone already has some ideas, let me
>>>> know.
>>>>
>>>> Hardware is thinkpad X220
>>>>
>>>> 00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network
>>>> Connection (rev 04)
>>>
>>> I guess this is ACPI S3 suspend?
>>
>> ACPI S3, right. Machine still wakes up properly when I hit a key on
>> USB keyboard.
>
> OK, so my guess would be a driver issue.  What driver is this, igb?
>
>> Does wake on LAN work on you, on any hardware?
>
> Yes it does, I checked two machines earlier today, both work.
>
> Can you enable dynamic debug in device_pm.c and send a dmesg output
> with that covering a suspend-resume cycle?

82579 is e1000e

--
Hisashi T Fujinaka - ht...@twofifty.com


Re: [regression] wake on lan no longer works in 4.13-rc3. was Re: Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-09 Thread Rafael J. Wysocki
[You seem to have a stale linux-pm address in your address book,
 I replaced it with the current one in the CC list.]

On Wednesday, August 9, 2017 8:42:31 AM CEST Pavel Machek wrote:
> On Wed 2017-08-09 02:45:54, Rafael J. Wysocki wrote:
> > On Tuesday, August 8, 2017 11:00:53 AM CEST Pavel Machek wrote:
> > > Hi!
> > > 
> > > Perhaps you should get regressi...@kernel.org alias, or something like 
> > > that?
> > > 
> > > > As always: Are you aware of any other regressions? Then please let me
> > > > know. For details see http://bit.ly/lnxregtrackid And please tell me if
> > > > there is anything in the report that shouldn't be there.
> > > 
> > > I am using wake-on-lan quite a bit, and it stopped working. I'll move
> > > to -rc4 and test there; but if anyone already has some ideas, let me
> > > know.
> > > 
> > > Hardware is thinkpad X220
> > > 
> > > 00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network
> > > Connection (rev 04)
> > 
> > I guess this is ACPI S3 suspend?
> 
> ACPI S3, right. Machine still wakes up properly when I hit a key on
> USB keyboard.

OK, so my guess would be a driver issue.  What driver is this, igb?

> Does wake on LAN work on you, on any hardware?

Yes it does, I checked two machines earlier today, both work.

Can you enable dynamic debug in device_pm.c and send a dmesg output
with that covering a suspend-resume cycle?

Thanks,
Rafael



Re: [regression] wake on lan no longer works in 4.13-rc3. was Re: Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-08 Thread Pavel Machek
On Wed 2017-08-09 02:45:54, Rafael J. Wysocki wrote:
> On Tuesday, August 8, 2017 11:00:53 AM CEST Pavel Machek wrote:
> > Hi!
> > 
> > Perhaps you should get regressi...@kernel.org alias, or something like that?
> > 
> > > As always: Are you aware of any other regressions? Then please let me
> > > know. For details see http://bit.ly/lnxregtrackid And please tell me if
> > > there is anything in the report that shouldn't be there.
> > 
> > I am using wake-on-lan quite a bit, and it stopped working. I'll move
> > to -rc4 and test there; but if anyone already has some ideas, let me
> > know.
> > 
> > Hardware is thinkpad X220
> > 
> > 00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network
> > Connection (rev 04)
> 
> I guess this is ACPI S3 suspend?

ACPI S3, right. Machine still wakes up properly when I hit a key on
USB keyboard.

Does wake on LAN work on you, on any hardware?
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



Re: [regression] wake on lan no longer works in 4.13-rc3. was Re: Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-08 Thread Rafael J. Wysocki
On Tuesday, August 8, 2017 11:00:53 AM CEST Pavel Machek wrote:
> Hi!
> 
> Perhaps you should get regressi...@kernel.org alias, or something like that?
> 
> > As always: Are you aware of any other regressions? Then please let me
> > know. For details see http://bit.ly/lnxregtrackid And please tell me if
> > there is anything in the report that shouldn't be there.
> 
> I am using wake-on-lan quite a bit, and it stopped working. I'll move
> to -rc4 and test there; but if anyone already has some ideas, let me
> know.
> 
> Hardware is thinkpad X220
> 
> 00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network
> Connection (rev 04)

I guess this is ACPI S3 suspend?

Thanks,
Rafael



[regression] wake on lan no longer works in 4.13-rc3. was Re: Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-08 Thread Pavel Machek
Hi!

Perhaps you should get regressi...@kernel.org alias, or something like that?

> As always: Are you aware of any other regressions? Then please let me
> know. For details see http://bit.ly/lnxregtrackid And please tell me if
> there is anything in the report that shouldn't be there.

I am using wake-on-lan quite a bit, and it stopped working. I'll move
to -rc4 and test there; but if anyone already has some ideas, let me
know.

Hardware is thinkpad X220

00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network
Connection (rev 04)

> Ciao, Thorsten
> 
> P.S.: Thx to all those that CCed me on regression reports or provided
> other input, it makes compiling these reports a whole lot easier!
> 
> == Current regressions ==
> 
> [x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
> (2017-07-10)
> http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
> Status: Asked on the list, but issue still gets ignored by everyone
> Cause: https://git.kernel.org/torvalds/c/e585513b76
> Note: I'm a bit unsure if adding this issue to this list was a good idea.
> 
> Null dereference in rt5677_i2c_probe() (2017-07-17)
> https://bugzilla.kernel.org/show_bug.cgi?id=196397
> Linux-Regression-ID: lr#96bd63
> Status: Patch is available in in asoc-next as commit ddc9e69b9dc2, but
> was not part of the changes to this subsystem that got merged a few days ago
> Cause: https://git.kernel.org/torvalds/c/a36afb0ab6
> Latest: https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6 (2017-07-17)
> 
> [I945GM] Pasted text not shown after mouse middle-click (2017-07-17)
> https://bugs.freedesktop.org/show_bug.cgi?id=101819
> Linux-Regression-ID: lr#d672f3
> Status: could not get reproduced yet
> Note: looks like it's getting ignored
> Latest: https://bugs.freedesktop.org/show_bug.cgi?id=101819#c8 (2017-07-17)
> 
> [Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
> keyboard (2017-07-24)
> https://bugzilla.kernel.org/show_bug.cgi?id=196459
> Linux-Regression-ID: lr#bd29ab
> Status: it's a tracking bug, looks like issue is handled by Intel devs
> already
> Cause: https://git.kernel.org/torvalds/c/33e4f80ee6
> Note: suspend-to-idle is rare
> 
> [lkp-robot] [Btrfs]  28785f70ef: xfstests.generic.273.fail (2017-07-26)
> https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
> Linux-Regression-ID: lr#a7d273
> Status: Seems it gets ignored by everyone
> Cause: https://git.kernel.org/torvalds/c/28785f70ef
> 
> Xen HVM guest with KASLR enabled wouldn't boot any longer  (2017-07-28)
> https://lkml.kernel.org/r/20170728102314.29100-1-jgr...@suse.com
> Status: WIP, patches up for review, but were not part of the changes to
> this subsystem that got merged a few days ago
> 
> bio-integrity: Fix regression if profile verify_fn is NULL (2017-08-02)
> https://lkml.kernel.org/r/20170802122750.12216-1-gmazyl...@gmail.com
> Linux-Regression-ID: lr#35498d
> Status: Discussion ongoing how to fix it properly
> Latest: https://lkml.kernel.org/r/yq13795epil@oracle.com (2017-08-02)
> 
> CIFS mount error -112 (2017-08-06)
> https://bugzilla.kernel.org/show_bug.cgi?id=196599
> Linux-Regression-ID: lr#60efe5
> Status: Brand new
> 
> 
> == Waiting for reporter ==
> 
> NULL pointer deref in networking (2017-07-29)
> https://bugzilla.kernel.org/show_bug.cgi?id=196529
> Linux-Regression-ID: lr#084be9
> Status: maybe reporter lost interest
> 
> SGI UV300/UV300: kernel BUG at arch/x86/mm/init_64.c:350! during boot
> (2017-08-02)
> https://bugzilla.kernel.org/show_bug.cgi?id=196561
> Status: not 100% sure if this is a regression
> Note: related to https://bugzilla.kernel.org/show_bug.cgi?id=196565 ?
> 
> 
> == Fixed since last weeks report ==
> 
> Dell XPS 13 9360: Touchscreen does not report events (2017-07-28)
> https://bugzilla.kernel.org/show_bug.cgi?id=196519
> Linux-Regression-ID: lr#fe68bb
> Status: Fixed in rc3
> 
> 
> == Legend ==
> 
> First few lines -> short summary followed by date and a link to the
>  report that lead to inclusion in this report
> Cause -> commit that causes this regression
> Status -> short start summary written by regression tracker
> Note -> additional note written by regression tracker
> Latest -> most recent and informative point where issue was discussed
> See also -> other places where this issue was or is discussed
> 
> Everything apart from the description and the link to the report is
> optional.
> 
> EOF

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



Re: Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-07 Thread Pavel Machek
Hi!

> Hi! Find below my second regression report for Linux 4.13. It lists 10
> regressions I'm currently aware of (albeit in one case it's not entirely
> clear yet if it's a regression in 4.13). One regression got fixed since
> last weeks report. You can also find the report at
> http://bit.ly/lnxregrep413 where I try to update it every now and then.
> 
> As always: Are you aware of any other regressions? Then please let me
> know. For details see http://bit.ly/lnxregtrackid And please tell me if
> there is anything in the report that shouldn't be there.

There's compile-time regression in et8ek8, with patch available.

On 2017-06-08 02:01, Arnd Bergmann wrote:
> This one got applied twice, causing a build error with clang:
>
> drivers/media/i2c/et8ek8/et8ek8_driver.c:1499:1: error: redefinition
> of '__mod_of__et8ek8_of_table_device_table'
>
> Fixes: 9ae05fd1e791 ("[media] et8ek8: Export OF device ID as module aliases")
> Signed-off-by: Arnd Bergmann 
> Acked-by: Sakari Ailus 
> Acked-by: Pavel Machek 

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Linux 4.13-rc4

2017-08-06 Thread Linus Torvalds
iable in
lme2510_stream_restart()
  media: i2c: tvp5150: remove useless variable assignment in
tvp5150_set_vbi()
  ASoC: imx-ssi: add check on platform_get_irq return value

Gustavo Romero (1):
  powerpc/tm: Fix saving of TM SPRs in core dump

Hanjun Guo (1):
  ACPI: APD: Fix HID for Hisilicon Hip07/08

Hannes Reinecke (1):
  scsi: scsi_transport_fc: return -EBUSY for deleted vport

Hans Verkuil (3):
  media: cec: cec_transmit_attempt_done: ignore CEC_TX_STATUS_MAX_RETRIES
  media: pulse8-cec: persistent_config should be off by default
  media: cec-notifier: small improvements

Hans de Goede (1):
  ACPI / LPSS: Only call pwm_add_table() for the first PWM controller

Harsha Priya N (2):
  ASoC: Intel: Enabling ASRC for RT5663 codec on kabylake platform
  ASoC: Intel: Use MCLK instead of BLCK as the sysclock for RT5514
codec on kabylake platform

Harvey Hunt (2):
  MIPS: ralink: Fix build error due to missing header
  MIPS: ralink: mt7620: Add missing header

Heiko Carstens (1):
  mm: take memory hotplug lock within numa_zonelist_order_handler()

Heiko Stuebner (2):
  dt-bindings: gpu: drop wrong compatible from midgard binding example
  ARM: dts: rockchip: fix mali gpu node on rk3288

Helge Deller (1):
  parisc: Increase thread and stack size to 32kb

Hoan Tran (1):
  mailbox: pcc: Fix crash when request PCC channel 0

Ido Schimmel (2):
  mlxsw: spectrum_router: Don't offload routes next in list
  ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev()

Ilan Tayari (1):
  net/mlx5e: Fix outer_header_zero() check size

Ilya Dryomov (6):
  libceph: make encode_request_*() work with r_mempool requests
  libceph: don't call ->reencode_message() more than once per message
  libceph: fallback for when there isn't a pool-specific choose_arg
  crush: assume weight_set != null imples weight_set_size > 0
  libceph: upmap semantic changes
  libceph: make RECOVERY_DELETES feature create a new interval

Jakub Kicinski (2):
  scsi: aic7xxx: fix firmware build with O=path
  bpf: don't zero out the info struct in bpf_obj_get_info_by_fd()

James Bottomley (1):
  parisc: pdc_stable: Fix locking when creating sysfs links

Jan Kara (3):
  ext4: Don't clear SGID when inheriting ACLs
  ocfs2: don't clear SGID when inheriting ACLs
  ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize

Jan Kiszka (1):
  gpio: exar: Use correct property prefix and document bindings

Jasmin Jessich (2):
  media: staging: cxd2099: Removed printing in write_block
  media: staging: cxd2099: Activate cxd2099 buffer mode

Jason Wang (1):
  Revert "vhost: cache used event for better performance"

Javier Martinez Canillas (1):
  media: vimc: set id_table for platform drivers

Jean Delvare (1):
  drm/amdgpu: Fix undue fallthroughs in golden registers initialization

Jeff Layton (1):
  ext4: convert swap_inode_data() over to use swap() on most of the fields

Jerome Brunet (3):
  ARM64: dts: meson-gx: use specific compatible for the AO pwms
  ARM64: dts: meson-gxl-s905x-libretech-cc: fixup board definition
  clk: meson: mpll: fix mpll0 fractional part ignored

Jerry Lee (1):
  ext4: fix overflow caused by missing cast in ext4_resize_fs()

Joe Perches (2):
  media: stkwebcam: Use more common logging styles
  media: tuner-core: Remove unused #define PREFIX

Joel Stanley (1):
  ftgmac100: return error in ftgmac100_alloc_rx_buf

Joerg Roedel (1):
  iommu/amd: Fix schedule-while-atomic BUG in initialization code

Johan Hovold (3):
  ASoC: fix pcm-creation regression
  ASoC: ux500: Restore platform DAI assignments
  PM / runtime: Document new pm_runtime_set_suspended() constraint

Johannes Berg (1):
  iwlwifi: mvm: defer setting IWL_MVM_STATUS_IN_HW_RESTART

John David Anglin (1):
  parisc: Handle vma's whose context is not current in flush_cache_range

Jonathan Corbet (2):
  libata: fix a couple of doc build warnings
  kthread: fix documentation build warning

Kan Liang (1):
  mm: allow page_cache_get_speculative in interrupt context

Kees Cook (1):
  ipc: add missing container_of()s for randstruct

Kefeng Wang (2):
  libata: remove unused rc in ata_eh_handle_port_resume
  pid: kill pidhash_size in pidhash_init()

Kevin Hilman (2):
  ARM: dts: da850-evm: drop unused VPIF endpoints
  ARM: dts: da850-lcdk: drop unused VPIF endpoints

Krzysztof Kozlowski (2):
  sparc: defconfig: Cleanup from old Kconfig options
  ARM: dts: exynos: Add clocks to audss block to fix silent hang
on Exynos4412

Kuninori Morimoto (2):
  arm64: renesas: salvator-common: sound clock-frequency needs
descending order
  ASoC: sh: hac: add missing "int ret"

Kuppuswamy Sathyanarayanan (1):
  MAINTAINERS: Add entry for Whiskey Cove PMIC GPIO driver

Larry Finger (1):

Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-06 Thread Thorsten Leemhuis
Hi! Find below my second regression report for Linux 4.13. It lists 10
regressions I'm currently aware of (albeit in one case it's not entirely
clear yet if it's a regression in 4.13). One regression got fixed since
last week's report. You can also find the report at
http://bit.ly/lnxregrep413 where I try to update it every now and then.

As always: Are you aware of any other regressions? Then please let me
know. For details see http://bit.ly/lnxregtrackid And please tell me if
there is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

== Current regressions ==

[x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
(2017-07-10)
http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
Status: Asked on the list, but issue still gets ignored by everyone
Cause: https://git.kernel.org/torvalds/c/e585513b76
Note: I'm a bit unsure if adding this issue to this list was a good idea.

Null dereference in rt5677_i2c_probe() (2017-07-17)
https://bugzilla.kernel.org/show_bug.cgi?id=196397
Linux-Regression-ID: lr#96bd63
Status: Patch is available in asoc-next as commit ddc9e69b9dc2, but
was not part of the changes to this subsystem that got merged a few days ago
Cause: https://git.kernel.org/torvalds/c/a36afb0ab6
Latest: https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6 (2017-07-17)

[I945GM] Pasted text not shown after mouse middle-click (2017-07-17)
https://bugs.freedesktop.org/show_bug.cgi?id=101819
Linux-Regression-ID: lr#d672f3
Status: could not get reproduced yet
Note: looks like it's getting ignored
Latest: https://bugs.freedesktop.org/show_bug.cgi?id=101819#c8 (2017-07-17)

[Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
keyboard (2017-07-24)
https://bugzilla.kernel.org/show_bug.cgi?id=196459
Linux-Regression-ID: lr#bd29ab
Status: it's a tracking bug, looks like issue is handled by Intel devs
already
Cause: https://git.kernel.org/torvalds/c/33e4f80ee6
Note: suspend-to-idle is rare

[lkp-robot] [Btrfs]  28785f70ef: xfstests.generic.273.fail (2017-07-26)
https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
Linux-Regression-ID: lr#a7d273
Status: Seems it gets ignored by everyone
Cause: https://git.kernel.org/torvalds/c/28785f70ef

Xen HVM guest with KASLR enabled wouldn't boot any longer  (2017-07-28)
https://lkml.kernel.org/r/20170728102314.29100-1-jgr...@suse.com
Status: WIP, patches up for review, but were not part of the changes to
this subsystem that got merged a few days ago

bio-integrity: Fix regression if profile verify_fn is NULL (2017-08-02)
https://lkml.kernel.org/r/20170802122750.12216-1-gmazyl...@gmail.com
Linux-Regression-ID: lr#35498d
Status: Discussion ongoing how to fix it properly
Latest: https://lkml.kernel.org/r/yq13795epil@oracle.com (2017-08-02)

CIFS mount error -112 (2017-08-06)
https://bugzilla.kernel.org/show_bug.cgi?id=196599
Linux-Regression-ID: lr#60efe5
Status: Brand new


== Waiting for reporter ==

NULL pointer deref in networking (2017-07-29)
https://bugzilla.kernel.org/show_bug.cgi?id=196529
Linux-Regression-ID: lr#084be9
Status: maybe reporter lost interest

SGI UV300/UV300: kernel BUG at arch/x86/mm/init_64.c:350! during boot
(2017-08-02)
https://bugzilla.kernel.org/show_bug.cgi?id=196561
Status: not 100% sure if this is a regression
Note: related to https://bugzilla.kernel.org/show_bug.cgi?id=196565 ?


== Fixed since last week's report ==

Dell XPS 13 9360: Touchscreen does not report events (2017-07-28)
https://bugzilla.kernel.org/show_bug.cgi?id=196519
Linux-Regression-ID: lr#fe68bb
Status: Fixed in rc3


== Legend ==

First few lines -> short summary followed by date and a link to the
 report that led to inclusion in this report
Cause -> commit that causes this regression
Status -> short status summary written by regression tracker
Note -> additional note written by regression tracker
Latest -> most recent and informative point where issue was discussed
See also -> other places where this issue was or is discussed

Everything apart from the description and the link to the report is
optional.

EOF


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread Trond Myklebust
On Tue, 2017-08-01 at 13:50 -0400, da...@codemonkey.org.uk wrote:
> On Tue, Aug 01, 2017 at 10:20:31AM -0700, Linus Torvalds wrote:
> 
>  > So I think the 'pathname' part may actually be entirely a red
> herring,
>  > and it's the underlying access itself that just picks up a random
>  > pointer from a stack that now contains something different. And
> KASAN
>  > didn't notice the stale stack access itself, because the stack
> slot is
>  > still valid - it's just no longer the original 'verifier'
> allocation.
>  > 
>  > Or *something* like that.
>  > 
>  > None of this looks even remotely new, though - the code seems to
> go
>  > back to 2009. Have you just changed what you're testing to trigger
>  > these things?
> 
> No idea why it only just showed up, but it isn't 100% reproducable
> either.  A month or so ago I did disable the V4 code on the server
> completely (as I was using v3 everywhere else), so maybe I started
> hitting
> a fallback path somewhere.  *shrug*
> 

I would only expect you to see it if you interrupt the wait on the
asynchronous EXCHANGE_ID call (which would allow the RPC call to
continue while the caller stack is trashed). Prior to commit
8d89bd70bc939, that code path was fully synchronous, so there was no
issue with interrupting the call.

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread Linus Torvalds
On Tue, Aug 1, 2017 at 10:20 AM, Linus Torvalds
 wrote:
>
> So I think the 'pathname' part may actually be entirely a red herring,
> and it's the underlying access itself that just picks up a random
> pointer from a stack that now contains something different. And KASAN
> didn't notice the stale stack access itself, because the stack slot is
> still valid - it's just no longer the original 'verifier' allocation.
>
> Or *something* like that.

I think the "something like that" is actually just reading the
cdata->args.verifier->data pointer itself, and it *is* the stack
access - but the stack page has been free'd (because of the same fatal
signal that interrupted the rpc_wait_for_completion_task() call), and
then re-allocated (and free'd again) as a pathname page.

Maybe.

Regardless, my patch still looks conceptually correct, even if it
might have bugs due to total lack of testing.

Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread da...@codemonkey.org.uk
On Tue, Aug 01, 2017 at 10:20:31AM -0700, Linus Torvalds wrote:

 > So I think the 'pathname' part may actually be entirely a red herring,
 > and it's the underlying access itself that just picks up a random
 > pointer from a stack that now contains something different. And KASAN
 > didn't notice the stale stack access itself, because the stack slot is
 > still valid - it's just no longer the original 'verifier' allocation.
 > 
 > Or *something* like that.
 > 
 > None of this looks even remotely new, though - the code seems to go
 > back to 2009. Have you just changed what you're testing to trigger
 > these things?

No idea why it only just showed up, but it isn't 100% reproducible
either.  A month or so ago I did disable the V4 code on the server
completely (as I was using v3 everywhere else), so maybe I started hitting
a fallback path somewhere.  *shrug*

Dave



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread Trond Myklebust
On Tue, 2017-08-01 at 10:20 -0700, Linus Torvalds wrote:
> On Tue, Aug 1, 2017 at 8:51 AM, da...@codemonkey.org.uk
>  wrote:
> > On Mon, Jul 31, 2017 at 10:35:45PM -0700, Linus Torvalds wrote:
> >  > Any chance of getting the output from
> >  >
> >  >    ./scripts/faddr2line vmlinux nfs4_exchange_id_done+0x3d7/0x8e0
> > 
> > 
> > Hm, that points to this..
> > 
> > 7463         /* Save the EXCHANGE_ID verifier session trunk tests */
> > 7464         memcpy(clp->cl_confirm.data, cdata->args.verifier->data,
> > 7465                sizeof(clp->cl_confirm.data));
> 
> Ok, that certainly made no sense to me, because the KASAN report made
> it look like a stale pathname access (allocated in getname, freed in
> putname), but I think the issue is more fundamental than that.
> 
> That cdata->args.verifier seems to be entirely broken. AT least for
> the "xprt == NULL" case, it does the following:
> 
>  - use the address of a local variable ("&verifier")
> 
>  - wait for the rpc completion using rpc_wait_for_completion_task().
> 
> That's unacceptably buggy crap. rpc_wait_for_completion_task() will
> happily exit on a deadly signal even if the rpc hasn't been
> completed,
> so now you'll have a stale pointer to a stack that has been freed.
> 
> So I think the 'pathname' part may actually be entirely a red
> herring,
> and it's the underlying access itself that just picks up a random
> pointer from a stack that now contains something different. And KASAN
> didn't notice the stale stack access itself, because the stack slot
> is
> still valid - it's just no longer the original 'verifier' allocation.
> 
> Or *something* like that.
> 
> None of this looks even remotely new, though - the code seems to go
> back to 2009. Have you just changed what you're testing to trigger
> these things?
> 
> I'm not even sure why it does that stupid stack allocation. It does a
> *real* allocation just a few lines later:
> 
> struct nfs41_exchange_id_data *calldata
> ...
> calldata = kzalloc(sizeof(*calldata), GFP_NOFS);
> 
> and the whole verifier structure could easily have been part of that
> same allocation as far as I can tell.
> 
> And that really might seem to be the right thing to do.
> 
> TOTALLY UNTESTED PROBABLY COMPLETE CRAP patch attatched.
> 
> That patch compiles for me. It *might* even work. Or it might just be
> the ramblings of a diseased mind.
> 
> Anna? Trond?
> 

I came to the same conclusion yesterday, and have a stable patch that
does something similar. I just got distracted with the other bugs that
were introduced by the exchangeid patch series in Linux-4.9 (including
what looks like a duplicate free issue in nfs4_test_session_trunk()).

I can pass a few of the more critical patches on to Anna for merging in
this cycle, then I've got some clean ups ready for the 4.14 merge
window.

Cheers
  Trond

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread Linus Torvalds
On Tue, Aug 1, 2017 at 8:51 AM, da...@codemonkey.org.uk
 wrote:
> On Mon, Jul 31, 2017 at 10:35:45PM -0700, Linus Torvalds wrote:
>  > Any chance of getting the output from
>  >
>  >./scripts/faddr2line vmlinux nfs4_exchange_id_done+0x3d7/0x8e0
>
>
> Hm, that points to this..
>
> 7463         /* Save the EXCHANGE_ID verifier session trunk tests */
> 7464         memcpy(clp->cl_confirm.data, cdata->args.verifier->data,
> 7465                sizeof(clp->cl_confirm.data));

Ok, that certainly made no sense to me, because the KASAN report made
it look like a stale pathname access (allocated in getname, freed in
putname), but I think the issue is more fundamental than that.

That cdata->args.verifier seems to be entirely broken. At least for
the "xprt == NULL" case, it does the following:

 - use the address of a local variable ("&verifier")

 - wait for the rpc completion using rpc_wait_for_completion_task().

That's unacceptably buggy crap. rpc_wait_for_completion_task() will
happily exit on a deadly signal even if the rpc hasn't been completed,
so now you'll have a stale pointer to a stack that has been freed.

So I think the 'pathname' part may actually be entirely a red herring,
and it's the underlying access itself that just picks up a random
pointer from a stack that now contains something different. And KASAN
didn't notice the stale stack access itself, because the stack slot is
still valid - it's just no longer the original 'verifier' allocation.

Or *something* like that.

None of this looks even remotely new, though - the code seems to go
back to 2009. Have you just changed what you're testing to trigger
these things?

I'm not even sure why it does that stupid stack allocation. It does a
*real* allocation just a few lines later:

struct nfs41_exchange_id_data *calldata
...
calldata = kzalloc(sizeof(*calldata), GFP_NOFS);

and the whole verifier structure could easily have been part of that
same allocation as far as I can tell.

And that really might seem to be the right thing to do.

TOTALLY UNTESTED PROBABLY COMPLETE CRAP patch attached.

That patch compiles for me. It *might* even work. Or it might just be
the ramblings of a diseased mind.

Anna? Trond?

So caveat probatorem,

  Linus
 fs/nfs/nfs4proc.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 18ca6879d8de..0712af3d38f8 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -7490,6 +7490,11 @@ static const struct rpc_call_ops nfs4_exchange_id_call_ops = {
 	.rpc_release = nfs4_exchange_id_release,
 };
 
+struct verifier_and_calldata {
+	struct nfs41_exchange_id_data calldata;
+	nfs4_verifier verifier;
+};
+
 /*
  * _nfs4_proc_exchange_id()
  *
@@ -7498,7 +7503,8 @@ static const struct rpc_call_ops nfs4_exchange_id_call_ops = {
 static int _nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred,
 			u32 sp4_how, struct rpc_xprt *xprt)
 {
-	nfs4_verifier verifier;
+	struct verifier_and_calldata *vna;
+	nfs4_verifier *verifier;
 	struct rpc_message msg = {
 		.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_EXCHANGE_ID],
 		.rpc_cred = cred,
@@ -7516,14 +7522,17 @@ static int _nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred,
 	if (!atomic_inc_not_zero(&clp->cl_count))
 		return -EIO;
 
-	calldata = kzalloc(sizeof(*calldata), GFP_NOFS);
-	if (!calldata) {
+	vna = kzalloc(sizeof(*vna), GFP_NOFS);
+	if (!vna) {
 		nfs_put_client(clp);
 		return -ENOMEM;
 	}
+	/* kfree() of calldata will also free the verifier */
+	calldata = &vna->calldata;
+	verifier = &vna->verifier;
 
 	if (!xprt)
-		nfs4_init_boot_verifier(clp, &verifier);
+		nfs4_init_boot_verifier(clp, verifier);
 
 	status = nfs4_init_uniform_client_string(clp);
 	if (status)
@@ -7566,7 +7575,7 @@ static int _nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred,
 				RPC_TASK_SOFT|RPC_TASK_SOFTCONN|RPC_TASK_ASYNC;
 		calldata->args.verifier = &clp->cl_confirm;
 	} else {
-		calldata->args.verifier = &verifier;
+		calldata->args.verifier = verifier;
 	}
 	calldata->args.client = clp;
 #ifdef CONFIG_NFS_V4_1_MIGRATION


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread da...@codemonkey.org.uk
On Mon, Jul 31, 2017 at 10:35:45PM -0700, Linus Torvalds wrote:
 > On Mon, Jul 31, 2017 at 8:43 AM, da...@codemonkey.org.uk
 >  wrote:
 > > Another NFSv4 KASAN splat, this time from rc3.
 > >
 > > BUG: KASAN: use-after-free in nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
 > 
 > Ugh. It's really hard to tell what access that is - KASAN doesn't
 > actually give enough information. There's lots of 8-byte accesses
 > there in that function.
 > 
 > Any chance of getting the output from
 > 
 >./scripts/faddr2line vmlinux nfs4_exchange_id_done+0x3d7/0x8e0
 

Hm, that points to this..

7463 /* Save the EXCHANGE_ID verifier session trunk tests */
7464 memcpy(clp->cl_confirm.data, cdata->args.verifier->data,
7465sizeof(clp->cl_confirm.data));

Dave



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-31 Thread Linus Torvalds
On Mon, Jul 31, 2017 at 8:43 AM, da...@codemonkey.org.uk
 wrote:
> Another NFSv4 KASAN splat, this time from rc3.
>
> BUG: KASAN: use-after-free in nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]

Ugh. It's really hard to tell what access that is - KASAN doesn't
actually give enough information. There's lots of 8-byte accesses
there in that function.

Any chance of getting the output from

   ./scripts/faddr2line vmlinux nfs4_exchange_id_done+0x3d7/0x8e0

or something? That would be extremely useful in general for
stacktraces, but it's doubly useful for KASAN because most *other*
stacktraces tend to have a very limited number of things that can warn
(ie there's one or two WARN_ON() calls in a function), but KASAN can
have tens or hundreds..

   Linus


> Read of size 8 at addr 8804508af528 by task kworker/2:1/34
>
> CPU: 2 PID: 34 Comm: kworker/2:1 Not tainted 4.13.0-rc3-think+ #1
> Workqueue: rpciod rpc_async_schedule [sunrpc]
> Call Trace:
>  dump_stack+0x68/0xa1
>  print_address_description+0xd9/0x270
>  kasan_report+0x257/0x370
>  ? nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
>  check_memory_region+0x13a/0x1a0
>  __asan_loadN+0xf/0x20
>  nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
>  ? nfs4_exchange_id_release+0xb0/0xb0 [nfsv4]
>  rpc_exit_task+0x69/0x110 [sunrpc]
>  ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
>  ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
>  __rpc_execute+0x1a0/0x840 [sunrpc]
>  ? rpc_wake_up_queued_task+0x50/0x50 [sunrpc]
>  ? __lock_is_held+0x9a/0x100
>  ? debug_lockdep_rcu_enabled.part.16+0x1a/0x30
>  rpc_async_schedule+0x12/0x20 [sunrpc]
>  process_one_work+0x4d5/0xa70
>  ? flush_delayed_work+0x70/0x70
>  ? lock_acquire+0xfc/0x220
>  worker_thread+0x88/0x630
>  ? pci_mmcfg_check_reserved+0xc0/0xc0
>  kthread+0x1a6/0x1f0
>  ? process_one_work+0xa70/0xa70
>  ? kthread_create_on_node+0xc0/0xc0
>  ret_from_fork+0x27/0x40
>
> Allocated by task 1:
>  save_stack_trace+0x1b/0x20
>  save_stack+0x46/0xd0
>  kasan_kmalloc+0xad/0xe0
>  kasan_slab_alloc+0x12/0x20
>  kmem_cache_alloc+0xe0/0x2f0
>  getname_flags+0x43/0x220
>  getname+0x12/0x20
>  do_sys_open+0x14c/0x2b0
>  SyS_open+0x1e/0x20
>  do_syscall_64+0xea/0x260
>  return_from_SYSCALL_64+0x0/0x7a
>
> Freed by task 1:
>  save_stack_trace+0x1b/0x20
>  save_stack+0x46/0xd0
>  kasan_slab_free+0x72/0xc0
>  kmem_cache_free+0xa8/0x300
>  putname+0x80/0x90
>  do_sys_open+0x22f/0x2b0
>  SyS_open+0x1e/0x20
>  do_syscall_64+0xea/0x260
>  return_from_SYSCALL_64+0x0/0x7a
>
> The buggy address belongs to the object at 8804508aeac0
>  which belongs to the cache names_cache of size 4096
> The buggy address is located 2664 bytes inside of
>  4096-byte region [8804508aeac0, 8804508afac0)
> The buggy address belongs to the page:
> page:ea0011422a00 count:1 mapcount:0 mapping:  (null) index:0x0
> [CONT START]  compound_mapcount: 0
> flags: 0x80008100(slab|head)
> raw: 80008100   000100070007
> raw: ea00113d6020 ea001136e220 8804664f8040 
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  8804508af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  8804508af480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>8804508af500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>   ^
>  8804508af580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  8804508af600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==
>


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-31 Thread da...@codemonkey.org.uk
Another NFSv4 KASAN splat, this time from rc3.


==
BUG: KASAN: use-after-free in nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
Read of size 8 at addr 8804508af528 by task kworker/2:1/34

CPU: 2 PID: 34 Comm: kworker/2:1 Not tainted 4.13.0-rc3-think+ #1 
Workqueue: rpciod rpc_async_schedule [sunrpc]
Call Trace:
 dump_stack+0x68/0xa1
 print_address_description+0xd9/0x270
 kasan_report+0x257/0x370
 ? nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
 check_memory_region+0x13a/0x1a0
 __asan_loadN+0xf/0x20
 nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
 ? nfs4_exchange_id_release+0xb0/0xb0 [nfsv4]
 rpc_exit_task+0x69/0x110 [sunrpc]
 ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
 ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
 __rpc_execute+0x1a0/0x840 [sunrpc]
 ? rpc_wake_up_queued_task+0x50/0x50 [sunrpc]
 ? __lock_is_held+0x9a/0x100
 ? debug_lockdep_rcu_enabled.part.16+0x1a/0x30
 rpc_async_schedule+0x12/0x20 [sunrpc]
 process_one_work+0x4d5/0xa70
 ? flush_delayed_work+0x70/0x70
 ? lock_acquire+0xfc/0x220
 worker_thread+0x88/0x630
 ? pci_mmcfg_check_reserved+0xc0/0xc0
 kthread+0x1a6/0x1f0
 ? process_one_work+0xa70/0xa70
 ? kthread_create_on_node+0xc0/0xc0
 ret_from_fork+0x27/0x40

Allocated by task 1:
 save_stack_trace+0x1b/0x20
 save_stack+0x46/0xd0
 kasan_kmalloc+0xad/0xe0
 kasan_slab_alloc+0x12/0x20
 kmem_cache_alloc+0xe0/0x2f0
 getname_flags+0x43/0x220
 getname+0x12/0x20
 do_sys_open+0x14c/0x2b0
 SyS_open+0x1e/0x20
 do_syscall_64+0xea/0x260
 return_from_SYSCALL_64+0x0/0x7a

Freed by task 1:
 save_stack_trace+0x1b/0x20
 save_stack+0x46/0xd0
 kasan_slab_free+0x72/0xc0
 kmem_cache_free+0xa8/0x300
 putname+0x80/0x90
 do_sys_open+0x22f/0x2b0
 SyS_open+0x1e/0x20
 do_syscall_64+0xea/0x260
 return_from_SYSCALL_64+0x0/0x7a

The buggy address belongs to the object at 8804508aeac0
 which belongs to the cache names_cache of size 4096
The buggy address is located 2664 bytes inside of
 4096-byte region [8804508aeac0, 8804508afac0)
The buggy address belongs to the page:
page:ea0011422a00 count:1 mapcount:0 mapping:  (null) index:0x0
[CONT START]  compound_mapcount: 0
flags: 0x80008100(slab|head)
raw: 80008100   000100070007
raw: ea00113d6020 ea001136e220 8804664f8040 
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 8804508af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 8804508af480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>8804508af500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ^
 8804508af580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 8804508af600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==



Linux 4.13-rc3

2017-07-30 Thread Linus Torvalds
s first from per-queue
persistent grants

Eric Huang (1):
  drm/amd/powerplay: fix AVFS voltage offset for Vega10

Ernesto A. Fernández (1):
  jfs: preserve i_mode if __jfs_set_acl() fails

Fabian Frederick (1):
  jfs: atomically read inode size

Faiz Abbas (2):
  ARM: OMAP2+: hsmmc.c: Remove dead code
  mmc: host: omap_hsmmc: remove unused platform callbacks

Filipe Manana (1):
  Btrfs: fix dir item validation when replaying xattr deletes

Frank Rowand (1):
  scripts/dtc: dtx_diff - update include dts paths to match build

Gabriel Krisman Bertazi (1):
  exynos_drm: Clean up duplicated assignment in exynos_drm_driver

Guoqing Jiang (1):
  md: simplify code with bio_io_error

Gustavo A. R. Silva (1):
  xen: selfballoon: remove unnecessary static in frontswap_selfshrink()

Hans Verkuil (1):
  drm/exynos: select CEC_CORE if CEC_NOTIFIER

Heinz Mauelshagen (4):
  dm raid: remove WARN_ON() in raid10_md_layout_to_format()
  dm raid: fix activation check in validate_raid_redundancy()
  dm raid: avoid mddev->suspended access
  dm raid: bump target version

Helge Deller (7):
  parisc: Disable further stack checks when panic occurs during stack check
  parisc: Merge millicode routines via linker script
  parisc: regenerate defconfig files
  parisc: Fix crash when calling PDC_PAT_MEM PDT firmware function
  parisc: Add function to return DIMM slot of physical address
  parisc: Show DIMM slot number which holds broken memory module
  parisc: Suspend lockup detectors before system halt

Herbert Xu (1):
  crypto: authencesn - Fix digest_null crash

Hoegeun Kwon (1):
  drm/exynos/dsi: Remove error handling for bridge_node DT parsing

Ilia Mirkin (1):
  drm/nouveau/disp/nv50-: bump max chans to 21

Imre Deak (2):
  drm/i915: Fix user ptr check size in eb_relocate_vma()
  drm/i915: Fix scaler init during CRTC HW state readout

Inki Dae (2):
  drm/exynos: dsi: do not try to find bridge
  drm/exynos: mic: add a bridge at probe

James Smart (2):
  nvme-fc: address target disconnect race conditions in fcp io submit
  nvme-fc: revise TRADDR parsing

Jan Kara (1):
  jfs: Don't clear SGID when inheriting ACLs

Javier González (1):
  lightnvm: pblk: advance bio according to lba index

Jeff Mahoney (1):
  btrfs: fix lockup in find_free_extent with read-only block groups

Jian Jun Chen (1):
  drm/i915/gvt: Extend KBL platform support in GVT-g

Johannes Thumshirn (2):
  scsi: sg: fix static checker warning in sg_is_valid_dxfer
  nvme: also provide a UUID in the WWID sysfs attribute

John David Anglin (2):
  parisc: Prevent TLB speculation on flushed pages on CPUs that
only support equivalent aliases
  parisc: Extend disabled preemption in copy_user_page

Jon Derrick (1):
  nvme: fabrics commands should use the fctype field for data direction

Jonathan Corbet (2):
  sched/core: Fix some documentation build warnings
  sched/wait: Clean up some documentation warnings

Josef Bacik (4):
  nbd: allow multiple disconnects to be sent
  nbd: take tx_lock before disconnecting
  nbd: only set sndtimeo if we have a timeout set
  nbd: clear disconnected on reconnect

Juergen Gross (1):
  xen: dont fiddle with event channel masking in suspend/resume

Junxiao Bi (1):
  xen-blkfront: fix mq start/stop race

Kai-Heng Feng (1):
  ALSA: hda - Add mute led support for HP ProBook 440 G4

Kailang Yang (3):
  ALSA: hda/realtek - Update headset mode for ALC298
  ALSA: hda/realtek - Update headset mode for ALC225
  ALSA: hda/realtek - No loopback on ALC225/ALC295 codec

Kan Liang (3):
  perf/x86/intel/uncore: Fix Skylake server PCU PMU event format
  perf/x86/intel/uncore: Fix Skylake server CHA LLC_LOOKUP event umask
  perf/x86/intel/uncore: Remove invalid Skylake server CHA filter field

Laurent Vivier (1):
  powerpc/pseries: Fix of_node_put() underflow during reconfig remove

Liang Li (1):
  virtio-balloon: deflate via a page list

Lin Ma (2):
  tools/kvm_stat: use variables instead of hard paths in help output
  tools/kvm_stat: add '-f help' to get the available event list

Linus Torvalds (1):
  Linux 4.13-rc3

Maarten Lankhorst (1):
  drm/i915: Fix bad comparison in skl_compute_plane_wm.

Masami Hiramatsu (1):
  kprobes/x86: Release insn_slot in failure path

Matthias Kaehlcke (1):
  x86/boot: Disable the address-of-packed-member compiler warning

Michael Davidson (1):
  x86/boot: #undef memcpy() et al in string.c

Michael Ellerman (1):
  powerpc/Makefile: Fix ld version check with 64-bit LE-only toolchain

Mikulas Patocka (5):
  dm integrity: fix inefficient allocation of journal space
  dm integrity: use plugging when writing the journal
  dm integrity: WARN_ON if variables representing journal usage
get out of sync
  dm integrity: test for corrupted disk format 

Re: Linux 4.13: Reported regressions as of Sunday, 2017-07-30

2017-07-30 Thread Andy Shevchenko
On Sun, Jul 30, 2017 at 4:49 PM, Thorsten Leemhuis
 wrote:
> Hi! Find below my first regression report for Linux 4.13. It lists 8
> regressions I'm currently aware of (a few others I had on my list got
> fixed in the past few days). You can also find it at
> http://bit.ly/lnxregrep413 where I try to update it every now and then.
>
> As always: Are you aware of any other regressions? Then please let me
> know. For details see http://bit.ly/lnxregtrackid
> And please tell me if there is anything in the report that shouldn't be
> there.

> P.S.: Thx to all those that CCed me on regression reports or provided
> other input, it makes compiling these reports a whole lot easier!

> Null dereference in rt5677_i2c_probe()
> 2017-07-17 lr#96bd63 https://bugzilla.kernel.org/show_bug.cgi?id=196397
> Due to https://git.kernel.org/torvalds/c/a36afb0ab6

> Status: Takashi proposed a patch that fixes the issue

The status is outdated per the discussion: the patch is available upstream
as commit ddc9e69b9dc2.

> Latest discussion: https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6
> (2017-07-17)

...while this one is the correct link.

-- 
With Best Regards,
Andy Shevchenko


Linux 4.13: Reported regressions as of Sunday, 2017-07-30

2017-07-30 Thread Thorsten Leemhuis
Hi! Find below my first regression report for Linux 4.13. It lists 8
regressions I'm currently aware of (a few others I had on my list got
fixed in the past few days). You can also find it at
http://bit.ly/lnxregrep413 where I try to update it every now and then.

As always: Are you aware of any other regressions? Then please let me
know. For details see http://bit.ly/lnxregtrackid
And please tell me if there is anything in the report that shouldn't be
there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

== Current regressions ==

[x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
2017-07-10 http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
Due to https://git.kernel.org/torvalds/c/e585513b76

Null dereference in rt5677_i2c_probe()
2017-07-17 lr#96bd63 https://bugzilla.kernel.org/show_bug.cgi?id=196397
Due to https://git.kernel.org/torvalds/c/a36afb0ab6
Status: Takashi proposed a patch that fixes the issue
Latest discussion: https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6
(2017-07-17)

[I945GM] Pasted text not shown after mouse middle-click
2017-07-17 lr#d672f3 https://bugs.freedesktop.org/show_bug.cgi?id=101819
Status: could not get reproduced yet
Notes: related to the regression that was fixed rc2+
https://bugs.freedesktop.org/show_bug.cgi?id=101790

[Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
keyboard
2017-07-24 lr#bd29ab https://bugzilla.kernel.org/show_bug.cgi?id=196459
Due to https://git.kernel.org/torvalds/c/33e4f80ee6
Status: it's a tracking bug, looks like issue is handled by Intel devs
already
Notes: suspend-to-idle is rare

[lkp-robot] [Btrfs]  28785f70ef: xfstests.generic.273.fail
2017-07-26 lr#a7d273
https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
Due to https://git.kernel.org/torvalds/c/28785f70ef

Dell XPS 13 9360: Touchscreen does not report events
2017-07-28 lr#fe68bb https://bugzilla.kernel.org/show_bug.cgi?id=196519
Status: afaics waiting to get forwarded to linux-usb by reporter
Notes: might be the same as
https://bugzilla.kernel.org/show_bug.cgi?id=196431

Xen HVM guest with KASLR enabled wouldn't boot any longer
2017-07-28 https://lkml.kernel.org/r/20170728102314.29100-1-jgr...@suse.com
Status: WIP, patches up for review

NULL pointer deref in networking
2017-07-29 lr#084be9 https://bugzilla.kernel.org/show_bug.cgi?id=196529
Status: told reporter he might be better off posting to netdev


Re: Linux 4.13-rc2

2017-07-23 Thread Linus Torvalds
On Sun, Jul 23, 2017 at 4:48 PM, Linus Torvalds
 wrote:
> Things are chugging along, and we actually had a reasonably active rc2.

.. and Konstantin just noticed that I had forgotten to push out the
actual tag, so the scripts that generate the diffs and tar-balls
didn't run.

So the git trees contained the up-to-date code, but no tag, and
non-git users didn't see rc2 at all.

Tag pushed now too, and the old-fashioned patches should thus
hopefully be generated any moment now.

 Linus


Linux 4.13-rc2

2017-07-23 Thread Linus Torvalds
cessary static
  irqchip/mips-cpu: Drop unnecessary static
  irqchip/digicolor: Drop unnecessary static

Justin Ernst (1):
  x86/platform/uv/BAU: Fix congested_response_us not taking effect

Kaike Wan (1):
  IB/rdmavt: Setting of QP timeout can overflow jiffies computation

Kalderon, Michal (1):
  IB/cma: Fix reference count leak when no ipv4 addresses are set

Kan Liang (1):
  perf/x86/intel: Add Goldmont Plus CPU PMU support

Keerthy (1):
  net: ethernet: ti: cpsw: Push the request_irq function to the end of probe

Kees Cook (3):
  randstruct: Mark various structs for randomization
  task_struct: Allow randomized layout
  randstruct: opt-out externally exposed function pointer structs

Kefeng Wang (2):
  nbd: kill unused ret in recv_work
  bpf: fix return in bpf_skb_adjust_net

Keith Busch (1):
  nvme-pci: Remove nvme_setup_prps BUG_ON

Kosuke Tatsukawa (1):
  net: bonding: Fix transmit load balancing in balance-alb mode

Krzysztof Kozlowski (1):
  x86/defconfig: Remove stale, old Kconfig options

Kuppuswamy Sathyanarayanan (1):
  mux: mux-core: unregister mux_class in mux_exit()

Laurentiu Palcu (1):
  drm/imx: fix typo in ipu_plane_formats[]

Leon Romanovsky (6):
  IB: Convert msleep below 20ms to usleep_range
  IB/IPoIB: Convert IPoIB to memalloc_noio_* calls
  IB/{rdmavt, qib, hfi1}: Remove gfp flags argument
  {net, IB}/mlx4: Remove gfp flags argument
  IB/core: Remove NOIO QP create flag
  IB/mlx5: Clean mr_cache debugfs in case of failure

Levin, Alexander (1):
  wireless: wext: terminate ifr name coming from userspace

Linus Torvalds (4):
  x86: mark kprobe templates as character arrays, not single characters
  Fix up MAINTAINERS file problems
  Properly alphabetize MAINTAINERS file
  Linux 4.13-rc2

LiuJian (1):
  net: hns: add acpi function of xge led control

Luis Henriques (1):
  f2fs: remove extra inode_unlock() in error path

Lynn Lei (1):
  staging: sm750fb: fixed a assignment typo

Mahesh Bandewar (1):
  ipv4: initialize fib_trie prior to register_netdev_notifier call.

Majd Dibbiny (1):
  IB/core: Add ordered workqueue for RoCE GID management

Martin Blumenstingl (1):
  mdio: mux: fix parsing mux registers outside of the PHY address range

Martin Hundebøll (1):
  net: dsa: mv88e6xxx: Enable CMODE config support for 6390X

Martin Wilck (1):
  nvmet: don't report 0-bytes in serial number

Matan Barak (1):
  IB/core: Fix sparse warnings

Mateusz Jurczyk (1):
  netfilter: nfnetlink: Improve input length sanitization in nfnetlink_rcv

Mathias Nyman (2):
  xhci: Fix NULL pointer dereference when cleaning up streams for
removed host
  xhci: fix 2ms port resume timeout

Matt Redfearn (1):
  irqchip/mips-gic: Remove population of irq domain names

Michael Ellerman (4):
  powerpc/powernv: Fix boot on Power8 bare metal due to
opal_configure_cores()
  powerpc/mm/radix: Refactor radix__mark_rodata_ro()
  powerpc/mm/hash: Refactor hash__mark_rodata_ro()
  powerpc/mm: Mark __init memory no-execute when STRICT_KERNEL_RWX=y

Michael Gugino (1):
  staging: rtl8188eu: add TL-WN722N v2 support

Mika Westerberg (1):
  thunderbolt: Correct access permissions for active NVM contents

Mike Marciniszyn (1):
  IB/iser: Handle lack of memory management extentions correctly

Miklos Szeredi (1):
  ovl: fix xattr get and set with selinux

Mikulas Patocka (1):
  x86/cpu: Use indirect call to measure performance in init_amd_k6()

Minas Harutyunyan (1):
  usb: dwc2: gadget: On USB RESET reset device address to zero

Moni Shoua (2):
  IB/core: Namespace is mandatory input for address resolution
  IB/core: Don't resolve IP address to the loopback device

Mustafa Ismail (2):
  i40iw: Fix order of cleanup in close
  i40iw: Do not poll CCQ after it is destroyed

Neal Cardwell (5):
  tcp_bbr: cut pacing rate only if filled pipe
  tcp_bbr: introduce bbr_bw_to_pacing_rate() helper
  tcp_bbr: introduce bbr_init_pacing_rate_from_rtt() helper
  tcp_bbr: remove sk_pacing_rate=0 transient during init
  tcp_bbr: init pacing rate on first RTT sample

NeilBrown (1):
  net/sunrpc/xprt_sock: fix regression in connection error reporting.

Nicholas Piggin (2):
  powerpc/perf: Avoid spurious PMU interrupts after idle
  powerpc/64s: Fix hypercall entry clobbering r12 input

Nikolay Aleksandrov (1):
  net: bridge: fix dest lookup when vlan proto doesn't match

Nilesh Javali (1):
  scsi: qedi: Add support for Boot from SAN over iSCSI offload

Okash Khawaja (3):
  staging: speakup: safely close tty
  staging: speakup: add functions to register and unregister ldisc
  staging: speakup: safely register and unregister ldisc

Paolo Abeni (1):
  udp: preserve skb->dst if required for IP options processing

Paolo Bonzini (2):
  scsi: virtio_scsi: alw

Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-17 Thread Linus Torvalds
On Sun, Jul 16, 2017 at 8:05 PM, da...@codemonkey.org.uk
 wrote:
> On Sun, Jul 16, 2017 at 10:57:27PM +, Trond Myklebust wrote:
>
>  > > BUG: KASAN: global-out-of-bounds in call_start+0x93/0x100
>  > > Read of size 8 at addr 8d582588 by task kworker/0:1/22
>  >
>  > Does the following patch fix it?
>
> Yep, seems to do the trick!

I'm assuming I'll get this fix through a future pull request, and am
not applying the patch as-is. Just FYI.

 Linus


linux-next: stats (Was: Merge window over - Linux 4.13-rc1 out)

2017-07-17 Thread Stephen Rothwell
Hi all,

As usual, the executive friendly graph is at
http://neuling.org/linux-next-size.html :-)

(No merge commits counted, next-20170704 was the first linux-next after
the merge window opened.)

Commits in v4.13-rc1 (relative to v4.12):  11258
Commits in next-20170704:  10626
Commits with the same SHA1: 9739
Commits with the same patch_id:  489 (1)
Commits with the same subject line:   60 (1)

(1) not counting those in the lines above.

So commits in -rc1 that were in next-20170704: 10288 91%

Some breakdown of the list of extra commits (relative to next-20170704)
in -rc1:

Top ten first word of commit summary:

 71 net
 41 drm
 32 ovl
 30 perf
 29 kvm
 28 powerpc
 23 rtc
 20 ubifs
 20 sunrpc
 20 nfs

Top ten authors:

 85 mche...@kernel.org
 45 elena.reshet...@intel.com
 40 h...@lst.de
 31 chuck.le...@oracle.com
 29 yamada.masah...@socionext.com
 21 arvind.yadav...@gmail.com
 20 dhowe...@redhat.com
 20 adrian.hun...@intel.com
 19 j...@perches.com
 19 amir7...@gmail.com

Top ten committers:

109 da...@davemloft.net
 89 cor...@lwn.net
 68 anna.schuma...@netapp.com
 52 torva...@linux-foundation.org
 35 rich...@nod.at
 33 mszer...@redhat.com
 32 a...@redhat.com
 30 m...@ellerman.id.au
 29 yamada.masah...@socionext.com
 28 rost...@goodmis.org

There are also 340 commits in next-20170704 that didn't make it into
v4.13-rc1.

Top ten first word of commit summary:

 49 media
 34 ib
 23 arm
 18 keys
 16 clocksource
 15 mm
 12 dlm
 12 coresight
 12 arc
 10 rcu

Top ten authors:

 26 paul...@linux.vnet.ibm.com
 23 a...@linux-foundation.org
 16 ebigg...@google.com
 14 elfr...@users.sourceforge.net
 11 mche...@kernel.org
 11 d.schel...@gmx.net
 10 leo@linaro.org
 10 daniel.lezc...@linaro.org
 10 a...@arndb.de
  8 g...@suse.com

Some of Andrew's patches are fixes for other patches in his tree (and
have been merged into those).

Top ten committers:

 85 s...@canb.auug.org.au
 49 mche...@kernel.org
 34 dledf...@redhat.com
 29 paul...@linux.vnet.ibm.com
 22 dhowe...@redhat.com
 17 mathieu.poir...@linaro.org
 17 daniel.lezc...@linaro.org
 12 vgu...@synopsys.com
 12 t...@atomide.com
 12 teigl...@redhat.com

Those commits by me are from the quilt series (mainly Andrew's mmotm
tree).

-- 
Cheers,
Stephen Rothwell


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-16 Thread da...@codemonkey.org.uk
On Sun, Jul 16, 2017 at 10:57:27PM +, Trond Myklebust wrote:
 
 > > BUG: KASAN: global-out-of-bounds in call_start+0x93/0x100
 > > Read of size 8 at addr 8d582588 by task kworker/0:1/22
 > 
 > Does the following patch fix it?

Yep, seems to do the trick!

Dave



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-16 Thread Trond Myklebust
Hi Dave,

On Sun, 2017-07-16 at 17:15 -0400, Dave Jones wrote:
> On Fri, Jul 14, 2017 at 10:25:43AM -0400, Dave Jones wrote:
>  > On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
>  >  > Hi Linus,
>  >  > 
>  >  > The following changes since commit
> 32c1431eea4881a6b17bd7c639315010aeefa452:
>  >  > 
>  >  >   Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)
>  >  > 
>  >  > are available in the git repository at:
>  >  > 
>  >  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-
> for-4.13-1
>  >  > 
>  >  > for you to fetch changes up to
> b4f937cffa66b3d56eb8f586e620d0b223a281a3:
>  >  > 
>  >  >   NFS: Don't run wake_up_bit() when nobody is waiting... (2017-
> 07-13 16:57:18 -0400)
>  > 
>  > Since this landed, I'm seeing this during boot..
>  > 
>  >  =
> =
>  >  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
>  >  Read of size 8 at addr b4eeaf20 by task nfsd/688
> 
> Now that this one got fixed, this one fell out instead..
> Will dig deeper tomorrow.
> 
> ==
> BUG: KASAN: global-out-of-bounds in call_start+0x93/0x100
> Read of size 8 at addr 8d582588 by task kworker/0:1/22
> 
> CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 4.13.0-rc1-firewall+ #1 
> Workqueue: rpciod rpc_async_schedule
> Call Trace:
>  dump_stack+0x68/0x94
>  print_address_description+0x2c/0x270
>  ? call_start+0x93/0x100
>  kasan_report+0x239/0x350
>  __asan_load8+0x55/0x90
>  call_start+0x93/0x100
>  ? rpc_default_callback+0x10/0x10
>  ? rpc_default_callback+0x10/0x10
>  __rpc_execute+0x170/0x740
>  ? rpc_wake_up_queued_task+0x50/0x50
>  ? __lock_is_held+0x9f/0x110
>  rpc_async_schedule+0x12/0x20
>  process_one_work+0x4ba/0xb10
>  ? process_one_work+0x401/0xb10
>  ? pwq_dec_nr_in_flight+0x120/0x120
>  worker_thread+0x91/0x670
>  ? __sched_text_start+0x8/0x8
>  kthread+0x1ab/0x200
>  ? process_one_work+0xb10/0xb10
>  ? __kthread_create_on_node+0x340/0x340
>  ret_from_fork+0x27/0x40
> 
> The buggy address belongs to the variable:
>  nfs_cb_version+0x8/0x740

Does the following patch fix it?

Cheers
  Trond

8<--
From b9230cdfbbee90178a1318d20cd3373ffb758788 Mon Sep 17 00:00:00 2001
From: Trond Myklebust 
Date: Sun, 16 Jul 2017 18:52:18 -0400
Subject: [PATCH] nfsd: Fix a memory scribble in the callback channel

The offset of the entry in struct rpc_version has to match the version
number.

Reported-by: Dave Jones 
Fixes: 1c5876ddbdb4 ("sunrpc: move p_count out of struct rpc_procinfo")
Signed-off-by: Trond Myklebust 
---
 fs/nfsd/nfs4callback.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index b45083c0f9ae..49b0a9e7ff18 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -720,8 +720,8 @@ static const struct rpc_version nfs_cb_version4 = {
 	.counts = nfs4_cb_counts,
 };
 
-static const struct rpc_version *nfs_cb_version[] = {
-	&nfs_cb_version4,
+static const struct rpc_version *nfs_cb_version[2] = {
+	[1] = &nfs_cb_version4,
 };
 
 static const struct rpc_program cb_program;
@@ -795,7 +795,7 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
 		.saddress	= (struct sockaddr *) &conn->cb_saddr,
 		.timeout	= &timeparms,
 		.program	= &cb_program,
-		.version	= 0,
+		.version	= 1,
 		.flags		= (RPC_CLNT_CREATE_NOPING | RPC_CLNT_CREATE_QUIET),
 	};
 	struct rpc_clnt *client;
-- 
2.13.3

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-16 Thread Dave Jones
On Fri, Jul 14, 2017 at 10:25:43AM -0400, Dave Jones wrote:
 > On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
 >  > Hi Linus,
 >  > 
 >  > The following changes since commit 
 > 32c1431eea4881a6b17bd7c639315010aeefa452:
 >  > 
 >  >   Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)
 >  > 
 >  > are available in the git repository at:
 >  > 
 >  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
 >  > 
 >  > for you to fetch changes up to b4f937cffa66b3d56eb8f586e620d0b223a281a3:
 >  > 
 >  >   NFS: Don't run wake_up_bit() when nobody is waiting... (2017-07-13 
 > 16:57:18 -0400)
 > 
 > Since this landed, I'm seeing this during boot..
 > 
 >  ==
 >  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
 >  Read of size 8 at addr b4eeaf20 by task nfsd/688

Now that this one got fixed, this one fell out instead..
Will dig deeper tomorrow.

==
BUG: KASAN: global-out-of-bounds in call_start+0x93/0x100
Read of size 8 at addr 8d582588 by task kworker/0:1/22

CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 4.13.0-rc1-firewall+ #1 
Workqueue: rpciod rpc_async_schedule
Call Trace:
 dump_stack+0x68/0x94
 print_address_description+0x2c/0x270
 ? call_start+0x93/0x100
 kasan_report+0x239/0x350
 __asan_load8+0x55/0x90
 call_start+0x93/0x100
 ? rpc_default_callback+0x10/0x10
 ? rpc_default_callback+0x10/0x10
 __rpc_execute+0x170/0x740
 ? rpc_wake_up_queued_task+0x50/0x50
 ? __lock_is_held+0x9f/0x110
 rpc_async_schedule+0x12/0x20
 process_one_work+0x4ba/0xb10
 ? process_one_work+0x401/0xb10
 ? pwq_dec_nr_in_flight+0x120/0x120
 worker_thread+0x91/0x670
 ? __sched_text_start+0x8/0x8
 kthread+0x1ab/0x200
 ? process_one_work+0xb10/0xb10
 ? __kthread_create_on_node+0x340/0x340
 ret_from_fork+0x27/0x40

The buggy address belongs to the variable:
 nfs_cb_version+0x8/0x740

Memory state around the buggy address:
 8d582480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 8d582500: 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa fa
>8d582580: 00 fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  ^
 8d582600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 8d582680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==
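
For readers decoding the "Memory state" rows above, here is a minimal sketch of how such a dump can be read. It assumes the usual generic-KASAN convention that one shadow byte describes 8 bytes of memory, that 00 means the whole granule is valid, that 01..07 mean only that many leading bytes are valid, and that larger values such as fa are markers for memory that must not be touched. The function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { GRANULE = 8 };  /* assumed: one shadow byte describes 8 bytes of memory */

/* Index of the shadow byte within a dump row, for an address inside that row. */
size_t shadow_index(unsigned long addr, unsigned long row_base)
{
    return (size_t)((addr - row_base) / GRANULE);
}

/* Number of leading bytes of the granule that may be accessed. */
unsigned accessible_bytes(unsigned char shadow)
{
    if (shadow == 0x00)
        return GRANULE;   /* whole granule is valid */
    if (shadow < GRANULE)
        return shadow;    /* only the first N bytes are valid */
    return 0;             /* marker value such as 0xfa: nothing is valid */
}

/* Would an access of `size` bytes at `offset` inside one granule be clean? */
int granule_access_ok(unsigned char shadow, unsigned offset, unsigned size)
{
    return offset + size <= accessible_bytes(shadow);
}

/* Walk through the report: an 8-byte read at 8d582588. */
void walk_through_dump(void)
{
    /* 8d582588 is 8 bytes into the row that starts at 8d582580 */
    assert(shadow_index(0x8d582588UL, 0x8d582580UL) == 1);
    /* the second shadow byte of that row is fa, so the read is flagged */
    assert(!granule_access_ok(0xfa, 0, 8));
}
```

Under these assumptions, a shadow byte of 05 would describe a 5-byte object such as the string "nfsd" plus its NUL, and an 8-byte load from its start would be reported.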



Merge window over - Linux 4.13-rc1 out

2017-07-15 Thread Linus Torvalds
Ok, normally I do this on Sunday afternoon, but occasionally it
happens a day early like now to avoid people timing me.

In fact, I was planning on doing it yesterday evening this time around
because I was so annoyed with lots of late pull requests on Friday
(and some today), but ended up going to dinner and not getting
everything done, so it's only one day early. Next time...

This looks like a fairly regular release, and as always, rc1 is much
too large to post even the shortlog for. So just my rough "mergelog"
that shows who I pulled from and a oneliner description of the pull.

Once again, the diffstat is absolutely dominated by some AMD gpu
header files, but if you ignore that, things look pretty regular, with
about two thirds drivers and one third "rest" (architecture, core
kernel, core networking, tooling).

Slightly unusual is the Documentation updates, which are a fairly
noticeable part of that "rest" (almost half) due to continued work to
regularize and clean up stuff.

Get testing,

  Linus

---

Al Viro (21):
misc user access cleanups
wait syscall updates
read/write updates
timer-related user access updates
memdup_user() conversions
DRM compat ioctl handling updates
misc compat stuff updates
alpha user access updates
probe_kernel_read() uses
user access str* updates
iov_iter hardening
read/write fix
__copy_in_user removal
spi uaccess delousing
misc filesystem updates
waitid fix
copy*_iter fix
network field-by-field copy-in updates
uaccess-unaligned removal
more __copy_.._user elimination
->s_options removal

Alex Williamson (1):
VFIO updates

Alexandre Belloni (1):
RTC updates

Andrew Morton (4):
misc updates
more updates
yet more updates
even more updates

Anna Schumaker (1):
NFS client updates

Arnd Bergmann (7):
non-urgent ARM SoC fixes
ARM SoC platform updates
ARM device-tree updates
ARM SoC defconfig updates
ARM SoC driver updates
ARM 64-bit DT updates
ARM SoC 64-bit updates

Bartlomiej Zolnierkiewicz (1):
fbdev updates

Benson Leung (1):
chrome platform updates

Bjorn Andersson (3):
rpmsg updates
remoteproc updates
hwspinlock updates

Bjorn Helgaas (2):
PCI updates
PCI fixes

Bob Peterson (2):
GFS2 updates
GFS2 fix

Borislav Petkov (1):
EDAC updates

Brian Norris (1):
MTD updates

Bruce Fields (1):
nfsd updates

Chris Metcalf (1):
arch/tile updates

Christoph Hellwig (2):
uuid subsystem
dma-mapping infrastructure

Corey Minyard (1):
IPMI updates

Dan Williams (1):
libnvdimm updates

Darren Hart (2):
x86 platform driver updates
more x86 platform driver updates

Darrick Wong (2):
XFS updates
XFS fixes

Dave Airlie (2):
drm updates
more drm updates

David Miller (5):
networking updates
networking fixes
sparc updates
sparc fixes
networking fixes

David Sterba (3):
btrfs updates
btrfs fix
btrfs fixes

Dmitry Torokhov (2):
input updates
a few more input updates

Doug Ledford (1):
rdma update

Eric Biederman (2):
mnt namespace updates
sysctl fix

Geert Uytterhoeven (1):
m68k updates

Greg KH (6):
USB/PHY updates
staging/IIO updates
tty/serial updates
driver core updates
char/misc updates
USB fixes

Greg Ungerer (1):
m68knommu update

Guenter Roeck (1):
hwmon updates

Helge Deller (2):
parisc updates
another parisc update

Herbert Xu (2):
crypto updates
crypto fixes

Ilya Dryomov (1):
ceph updates

Ingo Molnar (15):
objtool updates
RCU updates
EFI updates
locking updates
perf updates
scheduler updates
nohz updates
x86 apic updates
x86 asm updates
x86 boot updates
x86 cleanups
x86 debug update
x86 hyperv updates
x86 microcode updates
x86 mm updates

Jacek Anaszewski (1):
LED updates

Jaegeuk Kim (1):
f2fs updates

James Bottomley (1):
SCSI updates

James Morris (3):
security layer updates
security layer fixes
key handling fixes

Jan Kara (1):
ext2, udf, reiserfs fixes

Jassi Brar (1):
mailbox updates

Jeff Layton (2):
Writeback error handling fixes
Writeback error handling updates

Jens Axboe (2):
core block/IO updates
more block updates

Jessica Yu (1):
modules updates

Jiri Kosina (1):
HID updates

Joerg Roedel (1):
IOMMU updates

Jon Mason (1):
NTB updates

Jonathan Corbet (3):
documentation updates
documentation fixes
documentation format standardization

Juergen Gross (1):
xen updates

Kees Cook (2):
pstore updates
GCC plugin updates

Lee Jones (2):
MFD updates
backlight updates

Linus Walleij (2):
pin control updates
GPIO updates

Luis de Bethencourt (1):
single befs fix

Mark Brown (3):
regmap updates
regulator updates
spi updates

Martin Schwidefsky (2):
s39

Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
> I find "hardening" code that adds bugs to be particularly bad and
> ugly, the same way that I absolutely *hate* debugging code that turns
> out to make debugging impossible (we had that with the "better" stack
> tracing code that caused kernel panics to kill the machine entirely
> rather than show the backtrace, and I'm still bitter about it a decade
> after the fact).

Agree, it's very important for this code to be correct and the string
functions have some subtleties so it needs scrutiny. I messed up strcpy
between v1 and v2 trying to add a proper read overflow check. My fault
for not looking more closely at strscpy before adopting it based on my
misinterpretation of the API.

This is primarily a bug finding feature right now and it has gotten a
few bugs fixed that actually matter (most were unimportant memcpy reads
past the end of a string constant, but not all). I don't think it has
another bug like this strscpy misuse itself, but there will need to be
some more
fixes for minor read overflows, etc. elsewhere in the tree before it'll
actually make sense as a hardening feature because it can turn a benign
read overflow into a DoS via BUG(). I think it will be fine for 4.13,
but I definitely wouldn't propose 'default y' for a while, even if there
was no performance cost (and there is).

The fix for this issue is here in case anyone looks only at this thread
(I realized I should have passed send-email a reply id):

http://marc.info/?l=linux-fsdevel&m=150006772418003&w=2


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
> The reason q_size isn't used is because it doesn't yet prevent read
> overflow. The commit message mentions that among the current
> limitations
> along with __builtin_object_size(ptr, 1).

Er, rather: in strlcat, q_size is unused after the fast path because
strnlen obtains the constant again itself.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
On Fri, 2017-07-14 at 13:50 -0700, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 1:38 PM, Daniel Micay 
> wrote:
> > 
> > If strscpy treats the count parameter as a *guarantee* of the dest
> > size
> > rather than a limit,
> 
> No, it's a *limit*.
> 
> And by a *limit*, I mean that we know that we can access both source
> and destination within that limit.

FORTIFY_SOURCE needs to be able to pass a limit without implying that
there's a minimum. That's the distinction I was trying to make. It's
wrong to use anything where it's interpreted as a minimum here. Using
__builtin_object_size(ptr, 0) vs. __builtin_object_size(ptr, 1) doesn't
avoid the problem. __builtin_object_size(ptr, 1) returns a maximum among
the possible buffer sizes just like 0. It's just stricter, i.e. catches
intra-object overflow, which isn't desirable for the first take since it
will cause compatibility issues. There's code using memcpy,
copy_to_user, etc. to read / write multiple fields with a pointer to the
first one passed as the source / destination.

> > My initial patch used strlcpy there, because I wasn't aware of
> > strscpy
> > before it was suggested:
> 
> Since I'm looking at this, I note that the "strlcpy()" code is
> complete garbage too, and has that same
> 
>  p_size == (size_t)-1 && q_size == (size_t)-1
> 
> check which is wrong.  Of course, in strlcpy, q_size is never actually
> *used*, so the whole check seems bogus.

That check is only an optimization. __builtin_object_size always returns
a constant, and it's (size_t)-1 when no limit could be found.

The reason q_size isn't used is because it doesn't yet prevent read
overflow. The commit message mentions that among the current limitations
along with __builtin_object_size(ptr, 1).

> But no, strlcpy() is complete garbage, and should never be used. It is
> truly a shit interface, and anybody who uses it is by definition
> buggy.
> 
> Why? Because the return value of "strlcpy()" is defined to be ignoring
> the limit, so you FUNDAMENTALLY must not use that thing on untrusted
> source strings.
> 
> But since the whole *point* of people using it is for untrusted
> sources, it by definition is garbage.
> 
> Ergo: don't use strlcpy(). It's unbelievable crap. It's wrong. There's
> a reason we defined "strscpy()" as the way to do safe copies
> (strncpy(), of course, is broken for both lack of NUL termination
> _and_ for excessive NUL termination when a NUL did exist).

Sure, it doesn't prevent read overflow (but it's not worse than strcpy,
which is the purpose) which is why I said this:

"The fortified string functions should place a limit on reads from the
source. For strcat/strcpy, these could just be a variant of strlcat /
strlcpy using the size limit as a bound on both the source and
destination, with the return value only reporting whether truncation
occurred rather than providing the source length. It would be an easier
API for developers to use too and not that it really matters but it
would be more efficient for the case where truncation is intended."

That's why strscpy was suggested and I switched to that + updated the
commit message to only mention strcat, but it's wrong to use it here
because __builtin_object_size(p, 0) / __builtin_object_size(p, 1) are
only a guaranteed maximum length with no minimum guarantee.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
> My initial patch used strlcpy there, because I wasn't aware of strscpy
> before it was suggested:
> 
> http://www.openwall.com/lists/kernel-hardening/2017/05/04/11
> 
> I was wrong to move it to strscpy. It could be switched back to
> strlcpy
> again unless the kernel considers the count parameter to be a
> guarantee
> that could be leveraged in the future. Using the fortified strlen +
> memcpy would provide the improvement that strscpy was meant to provide
> there over strlcpy.

Similarly, the FORTIFY_SOURCE strcat uses strlcat with the assumption
that the count parameter is a limit, not a guarantee for an optimization.
There's only a C implementation and it currently doesn't rely on that,
but if it's meant to be a guarantee then the strcat needs to be changed
too.

I'll make a fix moving both away from the existing functions.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Linus Torvalds
On Fri, Jul 14, 2017 at 1:38 PM, Daniel Micay  wrote:
>
> If strscpy treats the count parameter as a *guarantee* of the dest size
> rather than a limit,

No, it's a *limit*.

And by a *limit*, I mean that we know that we can access both source
and destination within that limit.

> My initial patch used strlcpy there, because I wasn't aware of strscpy
> before it was suggested:

Since I'm looking at this, I note that the "strlcpy()" code is
complete garbage too, and has that same

 p_size == (size_t)-1 && q_size == (size_t)-1

check which is wrong.  Of course, in strlcpy, q_size is never actually
*used*, so the whole check seems bogus.

But no, strlcpy() is complete garbage, and should never be used. It is
truly a shit interface, and anybody who uses it is by definition
buggy.

Why? Because the return value of "strlcpy()" is defined to be ignoring
the limit, so you FUNDAMENTALLY must not use that thing on untrusted
source strings.

But since the whole *point* of people using it is for untrusted
sources, it by definition is garbage.

Ergo: don't use strlcpy(). It's unbelievable crap. It's wrong. There's
a reason we defined "strscpy()" as the way to do safe copies
(strncpy(), of course, is broken for both lack of NUL termination
_and_ for excessive NUL termination when a NUL did exist).
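
The return-value difference described above can be sketched in userspace C. These are simplified byte-at-a-time models written for this note, not the kernel's implementations; the names model_strlcpy and model_strscpy are hypothetical. The point they illustrate is that a strlcpy-style function has to measure the entire source to produce its return value, while a strscpy-style function never looks past the limit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* strlcpy-style model: the return value is strlen(src), whatever the limit. */
size_t model_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);  /* walks the whole source, ignoring size */

    if (size) {
        size_t n = len >= size ? size - 1 : len;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* strscpy-style model: reads at most count bytes, reports truncation as an error. */
long model_strscpy(char *dst, const char *src, size_t count)
{
    size_t i;

    if (count == 0)
        return -E2BIG;
    for (i = 0; i < count; i++) {
        dst[i] = src[i];
        if (src[i] == '\0')
            return (long)i;  /* characters copied, not counting the NUL */
    }
    dst[count - 1] = '\0';   /* truncated: still NUL-terminate */
    return -E2BIG;
}

/* Both truncate the same way; only the second stays inside the limit on the source. */
void walk_through_truncation(void)
{
    char d[4];

    assert(model_strlcpy(d, "nfsd-long", sizeof(d)) == 9);
    assert(model_strscpy(d, "nfsd-long", sizeof(d)) == -E2BIG);
    assert(strcmp(d, "nfs") == 0);
}
```

If the source were not NUL-terminated within a known bound, the strlen() in the first model is where the unbounded read would happen.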

So quite frankly, this hardening code needs to be looked at again. And
no, if it uses "strlcpy()", then it's not hardening, it's just a pile
of crap.

Yes, I'm annoyed. I really get very very annoyed by "hardening" code
that does nothing of the kind.

   Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
On Fri, 2017-07-14 at 12:58 -0700, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 12:43 PM, Andrey Ryabinin
>  wrote:
> > 
> > > yet when I look at the generated code for __ip_map_lookup, I see
> > > 
> > > > movl    $32, %edx       #,
> > > > movq    %r13, %rsi      # class,
> > > > leaq    48(%rax), %rdi  #, tmp126
> > > > call    strscpy #
> > > > 
> > > > what's the bug here? Look at that third argument - %rdx. It is
> > > > initialized to 32.
> > 
> > It's not a compiler bug, it's a bug in our strcpy().
> > Whoever wrote this strcpy() into strscpy() code apparently didn't
> > read carefully
> > enough gcc manual about __builtin_object_size().
> > 
> > > Summary from
> > > https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html :
> > 
> > __builtin_object_size(ptr, type) returns a constant number
> > of bytes from 'ptr' to the end of the object 'ptr'
> > pointer points to. "type" is an integer constant from 0 to
> > 3. If the least significant bit is clear, objects
> > are whole variables, if it is set, a closest surrounding
> > subobject is considered the object a pointer points to.
> > The second bit determines if maximum or minimum of remaining
> > bytes is computed.
> > 
> > We have type = 0 in strcpy(), so the least significant bit is clear.
> > So the 'ptr' is considered as a pointer to the whole
> > variable i.e. pointer to struct ip_map ip;
> > And the number of bytes from 'ip.m_class' to the end of the ip
> > object is exactly 32.
> > 
> > I suppose that changing the type to 1 should fix this bug.
> 
> Oh, that absolutely needs to be done.
> 
> Because that "strcpy() -> strscpy()" conversion really depends on that
> size being the right size (well, in this case minimal safe size) for
> the actual accesses, exactly because "strscpy()" is perfectly willing
> to write *past* the end of the destination string within that given
> size limit (ie it reads and writes in the same 8-byte chunks).
> 
> So if you have a small target string that is contained in a big
> object, then the "hardened" strcpy() code can actually end up
> overwriting things past the end of the string, even if the string
> itself were to have fit in the buffer.
> 
> I note that every single use in string.h is buggy, and it worries me
> that __compiletime_object_size() does this too. The only user of that
> seems to be check_copy_size(), and now I'm a bit worried what that bug
> may have hidden.
> 
> I find "hardening" code that adds bugs to be particularly bad and
> ugly, the same way that I absolutely *hate* debugging code that turns
> out to make debugging impossible (we had that with the "better" stack
> tracing code that caused kernel panics to kill the machine entirely
> rather than show the backtrace, and I'm still bitter about it a decade
> after the fact).
> 
> There is something actively *evil* about it. Daniel, Kees, please jump
> on this.
> 
> Andrey, thanks for noticing this thing,
> 
>   Linus

The issue is the usage of strscpy then, not the __builtin_object_size
type parameter. The type is set to 0 rather than 1 to be more lenient by
not detecting intra-object overflow, which is going to come later.

If strscpy treats the count parameter as a *guarantee* of the dest size
rather than a limit, it's wrong to use it there, whether or not the type
parameter for __builtin_object_size is 0 or 1 since it can still return
a larger size. It's a limit with no guaranteed minimum.

My initial patch used strlcpy there, because I wasn't aware of strscpy
before it was suggested:

http://www.openwall.com/lists/kernel-hardening/2017/05/04/11

I was wrong to move it to strscpy. It could be switched back to strlcpy
again unless the kernel considers the count parameter to be a guarantee
that could be leveraged in the future. Using the fortified strlen +
memcpy would provide the improvement that strscpy was meant to provide
there over strlcpy.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Andrey Rybainin


On 07/14/2017 10:58 PM, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 12:43 PM, Andrey Ryabinin
>  wrote:
>>
>>> yet when I look at the generated code for __ip_map_lookup, I see
>>>
>>> movl    $32, %edx       #,
>>> movq    %r13, %rsi      # class,
>>> leaq    48(%rax), %rdi  #, tmp126
>>> call    strscpy #
>>>
>>> what's the bug here? Look at that third argument - %rdx. It is
>>> initialized to 32.
>>
>> It's not a compiler bug, it's a bug in our strcpy().
>> Whoever wrote this strcpy() into strscpy() code apparently didn't read 
>> carefully
>> enough gcc manual about __builtin_object_size().
>>
>> Summary from https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html :
>>
>> __builtin_object_size(ptr, type) returns a constant number of bytes 
>> from 'ptr' to the end of the object 'ptr'
>> pointer points to. "type" is an integer constant from 0 to 3. If the 
>> least significant bit is clear, objects
>> are whole variables, if it is set, a closest surrounding subobject 
>> is considered the object a pointer points to.
>> The second bit determines if maximum or minimum of remaining bytes 
>> is computed.
>>
>> We have type = 0 in strcpy(), so the least significant bit is clear. So the 
>> 'ptr' is considered as a pointer to the whole
>> variable i.e. pointer to struct ip_map ip;
>> And the number of bytes from 'ip.m_class' to the end of the ip object is 
>> exactly 32.
>>
>> I suppose that changing the type to 1 should fix this bug.
> 
> Oh, that absolutely needs to be done.
> 
> Because that "strcpy() -> strscpy()" conversion really depends on that
> size being the right size (well, in this case minimal safe size) for
> the actual accesses, exactly because "strscpy()" is perfectly willing
> to write *past* the end of the destination string within that given
> size limit (ie it reads and writes in the same 8-byte chunks).
> 
> So if you have a small target string that is contained in a big
> object, then the "hardened" strcpy() code can actually end up
> overwriting things past the end of the string, even if the string
> itself were to have fit in the buffer.
> 
> I note that every single use in string.h is buggy, and it worries me
> that __compiletime_object_size() does this too. The only user of that
> seems to be check_copy_size(), and now I'm a bit worried what that bug
> may have hidden.
> 
> I find "hardening" code that adds bugs to be particularly bad and
> ugly, the same way that I absolutely *hate* debugging code that turns
> out to make debugging impossible (we had that with the "better" stack
> tracing code that caused kernel panics to kill the machine entirely
> rather than show the backtrace, and I'm still bitter about it a decade
> after the fact).
> 

I have some more news to make you even "happier" :)
strcpy() chose to copy 32 bytes instead of the smaller 5 bytes because it has
one more bug :)

GCC couldn't determine the size of class (which is a 5-byte string):
strcpy(ip.m_class, class);

So, p_size = 32 and q_size  = -1, this "if (p_size == (size_t)-1 && q_size == 
(size_t)-1)" is false
(because of bogus '&&', obviously we should have '||' here)

and since (32 < (size_t)-1) 
if (strscpy(p, q, p_size < q_size ? p_size : q_size) < 0)

we end up with 32-bytes strscpy().

Enjoy :)
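
The arithmetic above can be replayed with a small sketch. It only models the two conditions and the size selection quoted in this mail; the helper names are hypothetical, and whether '||' is the right condition for the real code is left to the thread.

```c
#include <assert.h>
#include <stddef.h>

#define UNKNOWN_SIZE ((size_t)-1)  /* what __builtin_object_size() yields when it cannot tell */

/* Fast-path test as quoted above: taken only when BOTH sizes are unknown. */
int skip_check_and(size_t p_size, size_t q_size)
{
    return p_size == UNKNOWN_SIZE && q_size == UNKNOWN_SIZE;
}

/* Variant suggested above: taken when EITHER size is unknown. */
int skip_check_or(size_t p_size, size_t q_size)
{
    return p_size == UNKNOWN_SIZE || q_size == UNKNOWN_SIZE;
}

/* Size handed to strscpy() when the fast path is not taken. */
size_t copy_limit(size_t p_size, size_t q_size)
{
    return p_size < q_size ? p_size : q_size;
}

/* The case from the report: destination size 32, source size unknown. */
void walk_through_report_case(void)
{
    assert(!skip_check_and(32, UNKNOWN_SIZE));   /* fast path not taken */
    assert(copy_limit(32, UNKNOWN_SIZE) == 32);  /* so strscpy() gets 32 */
}
```

With the '||' variant the same inputs would take the fast path instead, so no 32-byte strscpy() would be issued for this call.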


> There is something actively *evil* about it. Daniel, Kees, please jump on 
> this.
> 
> Andrey, thanks for noticing this thing,
> 
>   Linus
> 


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Linus Torvalds
On Fri, Jul 14, 2017 at 12:43 PM, Andrey Ryabinin
 wrote:
>
>> yet when I look at the generated code for __ip_map_lookup, I see
>>
>> movl    $32, %edx       #,
>> movq    %r13, %rsi      # class,
>> leaq    48(%rax), %rdi  #, tmp126
>> call    strscpy #
>>
>> what's the bug here? Look at that third argument - %rdx. It is
>> initialized to 32.
>
> It's not a compiler bug, it's a bug in our strcpy().
> Whoever wrote this strcpy() into strscpy() code apparently didn't read 
> carefully
> enough gcc manual about __builtin_object_size().
>
> Summary from https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html :
>
> __builtin_object_size(ptr, type) returns a constant number of bytes 
> from 'ptr' to the end of the object 'ptr'
> pointer points to. "type" is an integer constant from 0 to 3. If the 
> least significant bit is clear, objects
> are whole variables, if it is set, a closest surrounding subobject is 
> considered the object a pointer points to.
> The second bit determines if maximum or minimum of remaining bytes is 
> computed.
>
> We have type = 0 in strcpy(), so the least significant bit is clear. So the 
> 'ptr' is considered as a pointer to the whole
> variable i.e. pointer to struct ip_map ip;
> And the number of bytes from 'ip.m_class' to the end of the ip object is 
> exactly 32.
>
> I suppose that changing the type to 1 should fix this bug.

Oh, that absolutely needs to be done.

Because that "strcpy() -> strscpy()" conversion really depends on that
size being the right size (well, in this case minimal safe size) for
the actual accesses, exactly because "strscpy()" is perfectly willing
to write *past* the end of the destination string within that given
size limit (ie it reads and writes in the same 8-byte chunks).

So if you have a small target string that is contained in a big
object, then the "hardened" strcpy() code can actually end up
overwriting things past the end of the string, even if the string
itself were to have fit in the buffer.

I note that every single use in string.h is buggy, and it worries me
that __compiletime_object_size() does this too. The only user of that
seems to be check_copy_size(), and now I'm a bit worried what that bug
may have hidden.

I find "hardening" code that adds bugs to be particularly bad and
ugly, the same way that I absolutely *hate* debugging code that turns
out to make debugging impossible (we had that with the "better" stack
tracing code that caused kernel panics to kill the machine entirely
rather than show the backtrace, and I'm still bitter about it a decade
after the fact).

There is something actively *evil* about it. Daniel, Kees, please jump on this.

Andrey, thanks for noticing this thing,

  Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Dave Jones
On Fri, Jul 14, 2017 at 12:05:02PM -0700, Linus Torvalds wrote:
 > On Fri, Jul 14, 2017 at 7:25 AM, Dave Jones  wrote:
 > > On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
 > >  >
 > >  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git 
 > > tags/nfs-for-4.13-1
 > >
 > > Since this landed, I'm seeing this during boot..
 > >
 > >  ==
 > >  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
 > >  Read of size 8 at addr b4eeaf20 by task nfsd/688
 > 
 > Is KASAN aware that strscpy() does the word-at-a-time optimistic reads
 > of the sources?
 > 
 > The problem may be that the source is initialized from the global
 > string "nfsd", and KASAN may be unhappy about the fact that we read 8
 > bytes from a 5-byte string (four plus NUL) as we do the word-at-a-time
 > strscpy..
 > 
 > That said, we do check the size first (because we also *write* 8 bytes
 > at a time), so maybe KASAN shouldn't even need to care.
 > 
 > Hmm. it really looks to me like this is actually a compiler bug (I'm
 > using current gcc in F26, which is gcc-7.1.1 - I'm assuming DaveJ is
 > the same).

Debian's 6.4.0

 > This is the source code in __ip_map_lookup:
 > 
 > struct ip_map ip;
 >  .
 > strcpy(ip.m_class, class);
 > 
 > and "m_class" is 8 bytes in size:
 > 
 > struct ip_map {
 > ...
 > char m_class[8]; /* e.g. "nfsd" */
 > ...
 > 
 > yet when I look at the generated code for __ip_map_lookup, I see
 > 
 > movl    $32, %edx       #,
 > movq    %r13, %rsi      # class,
 > leaq    48(%rax), %rdi  #, tmp126
 > call    strscpy #
 > 
 > what's the bug here? Look at that third argument - %rdx. It is
 > initialized to 32.
 > 
 > WTF?
 > 
 > The code to turn "strcpy()" into "strscpy()" should pick the *smaller*
 > of the two object sizes as the size argument. How the hell is that
 > size argument 32?
 > 
 > Am I missing something? DaveJ, do you see the same?

My compiler seems to have replaced the call with an inlined copy afaics.

0be0 <__ip_map_lookup>:
{
 be0:   e8 00 00 00 00          callq  be5 <__ip_map_lookup+0x5>
 be5:   55                      push   %rbp
 be6:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
 bed:   fc ff df
 bf0:   48 89 e5                mov    %rsp,%rbp
 bf3:   41 57                   push   %r15
 bf5:   41 56                   push   %r14
 bf7:   4c 8d 32                lea    (%rdx),%r14
if (strscpy(p, q, p_size < q_size ? p_size : q_size) < 0)
 bfa:   ba 20 00 00 00          mov    $0x20,%edx
 bff:   41 55                   push   %r13
 c01:   4c 8d 2e                lea    (%rsi),%r13
 c04:   41 54                   push   %r12
 c06:   53                      push   %rbx
 c07:   48 8d 1f                lea    (%rdi),%rbx
 c0a:   48 8d a4 24 60 ff ff    lea    -0xa0(%rsp),%rsp
 c11:   ff
 c12:   49 89 e4                mov    %rsp,%r12
 c15:   49 c1 ec 03             shr    $0x3,%r12
 c19:   48 c7 04 24 b3 8a b5    movq   $0x41b58ab3,(%rsp)
 c20:   41
 c21:   48 c7 44 24 08 00 00    movq   $0x0,0x8(%rsp)
 c28:   00 00
 c2a:   48 c7 44 24 10 00 00    movq   $0x0,0x10(%rsp)
 c31:   00 00
 c33:   48 8d 7c 24 50          lea    0x50(%rsp),%rdi
 c38:   4d 8d 24 04             lea    (%r12,%rax,1),%r12
 c3c:   41 c7 04 24 f1 f1 f1    movl   $0xf1f1f1f1,(%r12)
 c43:   f1
 c44:   41 c7 44 24 0c 00 00    movl   $0xf4f40000,0xc(%r12)
 c4b:   f4 f4
 c4d:   65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
 c54:   00 00
 c56:   48 89 84 24 98 00 00    mov    %rax,0x98(%rsp)
 c5d:   00
 c5e:   31 c0                   xor    %eax,%eax
 c60:   e8 00 00 00 00          callq  c65 <__ip_map_lookup+0x85>
 c65:   48 85 c0                test   %rax,%rax
 c68:   0f 88 a0 00 00 00       js     d0e <__ip_map_lookup+0x12e>
ip.m_addr = *addr;
 c6e:   be 10 00 00 00          mov    $0x10,%esi
 c73:   49 8d 3e                lea    (%r14),%rdi
 c76:   e8 00 00 00 00          callq  c7b <__ip_map_lookup+0x9b>
 c7b:   49 8b 56 08             mov    0x8(%r14),%rdx


But that mov $0x20,%edx looks like it might be the same value we're talking 
about.

Dave



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Andrey Ryabinin


On 07/14/2017 10:05 PM, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 7:25 AM, Dave Jones  wrote:
>> On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
>>  >
>>  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
>>
>> Since this landed, I'm seeing this during boot..
>>
>>  ==
>>  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
>>  Read of size 8 at addr b4eeaf20 by task nfsd/688
> 
> Is KASAN aware that strscpy() does the word-at-a-time optimistic reads
> of the sources?
> 

Nope.

> The problem may be that the source is initialized from the global
> string "nfsd", and KASAN may be unhappy about the fact that we read 8
> bytes from a 5-byte string (four plus NUL) as we do the word-at-a-time
> strscpy..
> 

Right.

> That said, we do check the size first (because we also *write* 8 bytes
> at a time), so maybe KASAN shouldn't even need to care.
>

Perhaps we could fall back to the unoptimized copy for the KASAN case by
setting max = 0 in strscpy().
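
To make the effect of that suggestion concrete, here is a hypothetical model (not kernel code) that counts how many source bytes a strscpy()-style loop would touch, with and without the word-at-a-time path. It assumes an 8-byte word, as discussed above, and only does the arithmetic: the function itself never reads past the NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { WORD = 8 };  /* chunk size discussed in the thread */

/* Count the source bytes touched when copying src with the given limit. */
size_t src_bytes_touched(const char *src, size_t count, int wordwise)
{
    size_t len = strlen(src);           /* index of the terminating NUL */
    size_t max = wordwise ? count : 0;  /* max = 0 disables the word loop */
    size_t res = 0, touched = 0;

    while (max >= WORD) {
        touched = res + WORD;           /* one 8-byte load */
        if (len < res + WORD)           /* the NUL lies inside this word */
            return touched;
        res += WORD;
        count -= WORD;
        max -= WORD;
    }
    while (count) {
        touched = res + 1;              /* one 1-byte load */
        if (res == len)                 /* this byte is the NUL */
            return touched;
        res++;
        count--;
    }
    return touched;
}

/* The case from the report: source "nfsd", limit 32. */
void walk_through_nfsd(void)
{
    assert(src_bytes_touched("nfsd", 32, 1) == 8);  /* 3 bytes past the 5-byte string */
    assert(src_bytes_touched("nfsd", 32, 0) == 5);  /* byte loop stops at the NUL */
}
```

In this model the word path touches 8 bytes of a 5-byte object, which is the kind of access KASAN reported, while max = 0 keeps every read inside the string.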

 
> Hmm. it really looks to me like this is actually a compiler bug (I'm
> using current gcc in F26, which is gcc-7.1.1 - I'm assuming DaveJ is
> the same).
> 
> This is the source code in __ip_map_lookup:
> 
> struct ip_map ip;
>  .
> strcpy(ip.m_class, class);
> 
> and "m_class" is 8 bytes in size:
> 
> struct ip_map {
> ...
> char m_class[8]; /* e.g. "nfsd" */
> ...
> 
> yet when I look at the generated code for __ip_map_lookup, I see
> 
> movl    $32, %edx       #,
> movq    %r13, %rsi      # class,
> leaq    48(%rax), %rdi  #, tmp126
> call    strscpy #
> 
> what's the bug here? Look at that third argument - %rdx. It is
> initialized to 32.
> 
> WTF?
> 


It's not a compiler bug, it's a bug in our strcpy().
Whoever wrote this strcpy() into strscpy() code apparently didn't read carefully
enough gcc manual about __builtin_object_size().

Summary from https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html :

__builtin_object_size(ptr, type) returns a constant number of bytes 
from 'ptr' to the end of the object 'ptr'
pointer points to. "type" is an integer constant from 0 to 3. If the 
least significant bit is clear, objects
are whole variables, if it is set, a closest surrounding subobject is 
considered the object a pointer points to.
The second bit determines if maximum or minimum of remaining bytes is 
computed. 

We have type = 0 in strcpy(), so the least significant bit is clear. So the 
'ptr' is considered as a pointer to the whole
variable i.e. pointer to struct ip_map ip;
And the number of bytes from 'ip.m_class' to the end of the ip object is 
exactly 32.

I suppose that changing the type to 1 should fix this bug.
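
As a rough illustration of the two sizes discussed above, the sketch below models what type 0 and type 1 would report for a struct shaped like the one in this report. The layout (struct ip_map_model with its head and tail arrays) is hypothetical, chosen only so that the member sits at offset 48 with 32 bytes remaining, matching the leaq 48(%rax) and $32 in the generated code. It computes the values with offsetof/sizeof rather than calling __builtin_object_size(), whose result can depend on the compiler and optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ip_map: only the offsets matter here. */
struct ip_map_model {
    char head[48];    /* assumed: everything before m_class */
    char m_class[8];  /* e.g. "nfsd" */
    char tail[24];    /* assumed: everything after m_class */
};

/* What type 0 would report: bytes from the member to the end of the whole variable. */
size_t whole_object_remaining(void)
{
    return sizeof(struct ip_map_model) - offsetof(struct ip_map_model, m_class);
}

/* What type 1 would report: bytes left in the closest surrounding subobject. */
size_t subobject_remaining(void)
{
    return sizeof(((struct ip_map_model *)0)->m_class);
}

/* Sanity check of the assumed layout: char arrays only, so no padding. */
void check_layout(void)
{
    assert(sizeof(struct ip_map_model) == 80);
    assert(offsetof(struct ip_map_model, m_class) == 48);
}
```

With this layout the whole-object bound is 32 and the subobject bound is 8, which is the gap between the size strscpy() was given and the size of the field being written.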



> The code to turn "strcpy()" into "strscpy()" should pick the *smaller*
> of the two object sizes as the size argument. How the hell is that
> size argument 32?
> 
> Am I missing something? DaveJ, do you see the same?
> 
>Linus
> 


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Linus Torvalds
On Fri, Jul 14, 2017 at 7:25 AM, Dave Jones  wrote:
> On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
>  >
>  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
>
> Since this landed, I'm seeing this during boot..
>
>  ==
>  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
>  Read of size 8 at addr b4eeaf20 by task nfsd/688

Is KASAN aware that strscpy() does the word-at-a-time optimistic reads
of the sources?

The problem may be that the source is initialized from the global
string "nfsd", and KASAN may be unhappy about the fact that we read 8
bytes from a 5-byte string (four plus NUL) as we do the word-at-a-time
strscpy..

That said, we do check the size first (because we also *write* 8 bytes
at a time), so maybe KASAN shouldn't even need to care.

Hmm. it really looks to me like this is actually a compiler bug (I'm
using current gcc in F26, which is gcc-7.1.1 - I'm assuming DaveJ is
the same).

This is the source code in __ip_map_lookup:

struct ip_map ip;
 .
strcpy(ip.m_class, class);

and "m_class" is 8 bytes in size:

struct ip_map {
...
char m_class[8]; /* e.g. "nfsd" */
...

yet when I look at the generated code for __ip_map_lookup, I see

movl    $32, %edx       #,
movq    %r13, %rsi      # class,
leaq    48(%rax), %rdi  #, tmp126
call    strscpy #

what's the bug here? Look at that third argument - %rdx. It is
initialized to 32.

WTF?

The code to turn "strcpy()" into "strscpy()" should pick the *smaller*
of the two object sizes as the size argument. How the hell is that
size argument 32?

Am I missing something? DaveJ, do you see the same?

   Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread J. Bruce Fields
On Fri, Jul 14, 2017 at 10:25:43AM -0400, Dave Jones wrote:
> On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
>  > Hi Linus,
>  > 
>  > The following changes since commit 
> 32c1431eea4881a6b17bd7c639315010aeefa452:
>  > 
>  >   Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)
>  > 
>  > are available in the git repository at:
>  > 
>  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
>  > 
>  > for you to fetch changes up to b4f937cffa66b3d56eb8f586e620d0b223a281a3:
>  > 
>  >   NFS: Don't run wake_up_bit() when nobody is waiting... (2017-07-13 
> 16:57:18 -0400)
> 
> Since this landed, I'm seeing this during boot..

__ip_map_lookup does have a strcpy, and it looks like that can be
implemented in terms of strscpy.

Based on that backtrace, it should just be copying from
nfsd_program->pg_class, which is initialized to "nfsd" and never
changed.

I spent a few minutes trying to figure out the tracing macros that
define str__nfsd__trace_system_name+0x3a0/0x3e0 and gave up.

So I have no idea what's going on.

--b.

> 
>  ==
>  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
>  Read of size 8 at addr b4eeaf20 by task nfsd/688
>  
>  CPU: 0 PID: 688 Comm: nfsd Not tainted 4.12.0-firewall+ #14 
>  Call Trace:
>   dump_stack+0x68/0x94
>   print_address_description+0x2c/0x270
>   ? strscpy+0x4a/0x230
>   kasan_report+0x239/0x350
>   __asan_load8+0x55/0x90
>   strscpy+0x4a/0x230
>   __ip_map_lookup+0x85/0x150
>   ? ip_map_init+0x50/0x50
>   ? lock_acquire+0x270/0x270
>   svcauth_unix_set_client+0x9f3/0xdc0
>   ? svcauth_unix_set_client+0x5/0xdc0
>   ? unix_gid_parse+0x340/0x340
>   ? kasan_kmalloc+0xbb/0xf0
>   ? groups_alloc+0x29/0x80
>   ? __kmalloc+0x13b/0x360
>   ? groups_alloc+0x29/0x80
>   ? groups_alloc+0x48/0x80
>   ? svcauth_unix_accept+0x3a5/0x3c0
>   svc_set_client+0x50/0x60
>   svc_process+0x901/0x10b0
>   ? svc_register+0x430/0x430
>   ? __might_sleep+0x78/0xf0
>   ? preempt_count_sub+0xaf/0x120
>   ? __validate_process_creds+0x9e/0x160
>   nfsd+0x250/0x380
>   ? nfsd+0x5/0x380
>   kthread+0x1ab/0x200
>   ? nfsd_destroy+0x1f0/0x1f0
>   ? __kthread_create_on_node+0x340/0x340
>   ret_from_fork+0x27/0x40
>  
>  The buggy address belongs to the variable:
>   str__nfsd__trace_system_name+0x3a0/0x3e0
>  
>  Memory state around the buggy address:
>   b4eeae00: 00 00 00 01 fa fa fa fa 00 00 00 00 00 04 fa fa
>   b4eeae80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa
>  >b4eeaf00: fa fa fa fa 05 fa fa fa fa fa fa fa 00 00 00 00
> ^   
>   b4eeaf80: 00 fa fa fa fa fa fa fa 00 00 05 fa fa fa fa fa
>   b4eeb000: 00 03 fa fa fa fa fa fa 00 07 fa fa fa fa fa fa
>  ==


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Dave Jones
On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
 > Hi Linus,
 > 
 > The following changes since commit 32c1431eea4881a6b17bd7c639315010aeefa452:
 > 
 >   Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)
 > 
 > are available in the git repository at:
 > 
 >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
 > 
 > for you to fetch changes up to b4f937cffa66b3d56eb8f586e620d0b223a281a3:
 > 
 >   NFS: Don't run wake_up_bit() when nobody is waiting... (2017-07-13 16:57:18 -0400)

Since this landed, I'm seeing this during boot..

 ==
 BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
 Read of size 8 at addr b4eeaf20 by task nfsd/688
 
 CPU: 0 PID: 688 Comm: nfsd Not tainted 4.12.0-firewall+ #14 
 Call Trace:
  dump_stack+0x68/0x94
  print_address_description+0x2c/0x270
  ? strscpy+0x4a/0x230
  kasan_report+0x239/0x350
  __asan_load8+0x55/0x90
  strscpy+0x4a/0x230
  __ip_map_lookup+0x85/0x150
  ? ip_map_init+0x50/0x50
  ? lock_acquire+0x270/0x270
  svcauth_unix_set_client+0x9f3/0xdc0
  ? svcauth_unix_set_client+0x5/0xdc0
  ? unix_gid_parse+0x340/0x340
  ? kasan_kmalloc+0xbb/0xf0
  ? groups_alloc+0x29/0x80
  ? __kmalloc+0x13b/0x360
  ? groups_alloc+0x29/0x80
  ? groups_alloc+0x48/0x80
  ? svcauth_unix_accept+0x3a5/0x3c0
  svc_set_client+0x50/0x60
  svc_process+0x901/0x10b0
  ? svc_register+0x430/0x430
  ? __might_sleep+0x78/0xf0
  ? preempt_count_sub+0xaf/0x120
  ? __validate_process_creds+0x9e/0x160
  nfsd+0x250/0x380
  ? nfsd+0x5/0x380
  kthread+0x1ab/0x200
  ? nfsd_destroy+0x1f0/0x1f0
  ? __kthread_create_on_node+0x340/0x340
  ret_from_fork+0x27/0x40
 
 The buggy address belongs to the variable:
  str__nfsd__trace_system_name+0x3a0/0x3e0
 
 Memory state around the buggy address:
  b4eeae00: 00 00 00 01 fa fa fa fa 00 00 00 00 00 04 fa fa
  b4eeae80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa
 >b4eeaf00: fa fa fa fa 05 fa fa fa fa fa fa fa 00 00 00 00
^   
  b4eeaf80: 00 fa fa fa fa fa fa fa 00 00 05 fa fa fa fa fa
  b4eeb000: 00 03 fa fa fa fa fa fa 00 07 fa fa fa fa fa fa
 ==



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Anna Schumaker


On 07/14/2017 03:09 AM, Christoph Hellwig wrote:
> On Thu, Jul 13, 2017 at 02:43:14PM -0700, Linus Torvalds wrote:
>> On Thu, Jul 13, 2017 at 2:16 PM, Anna Schumaker
>>  wrote:
>>>
>>>   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
>>
>> Btw, your key seems to have expired, and doing a refresh on it doesn't fix it.
>>
>> I'm sure you've refreshed your key, but apparently that refresh hasn't
>> been percolated to the public keyservers?

I assumed I had refreshed it once git let me sign things again, but maybe I
missed a step.

> 
> As someone who has run into an issue in that area recently:  I manually
> had to refresh and re-upload my signing subkey and not just the primary
> key, which was rather confusing and took a long time to sort out.
> 

I'll start here.  Thanks for the tip!

Anna


[GIT PULL] Please pull *a* befs change for Linux 4.13

2017-07-14 Thread Luis de Bethencourt

Hi Linus,

Very little activity in the befs file system this time since I'm busy
settling into a new job.
Hence the new-car-smell shiny address [0].

Merged one patch from Tommy Nguyen related to documentation.

Thank you very much,
Luis

[0] https://lkml.org/lkml/2017/7/9/37


The following changes since commit 6f7da290413ba713f0cdd9ff1a2a9bb129ef4f6c:

  Linux 4.12 (2017-07-02 16:07:02 -0700)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/luisbg/linux-befs.git tags/befs-v4.13-rc1


for you to fetch changes up to 799ce1dbb9bba56ff21733838a05070787fdcde5:

  befs: add kernel-doc formatting for befs_bt_read_super() (2017-07-09 10:42:50 +0100)



befs fixes for 4.13-rc1


Tommy Nguyen (1):
  befs: add kernel-doc formatting for befs_bt_read_super()

 fs/befs/btree.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Christoph Hellwig
On Thu, Jul 13, 2017 at 02:43:14PM -0700, Linus Torvalds wrote:
> On Thu, Jul 13, 2017 at 2:16 PM, Anna Schumaker
>  wrote:
> >
> >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
> 
> Btw, your key seems to have expired, and doing a refresh on it doesn't fix it.
> 
> I'm sure you've refreshed your key, but apparently that refresh hasn't
> been percolated to the public keyservers?

As someone who has run into an issue in that area recently:  I manually
had to refresh and re-upload my signing subkey and not just the primary
key, which was rather confusing and took a long time to sort out.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-13 Thread Linus Torvalds
On Thu, Jul 13, 2017 at 2:16 PM, Anna Schumaker
 wrote:
>
>   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1

Btw, your key seems to have expired, and doing a refresh on it doesn't fix it.

I'm sure you've refreshed your key, but apparently that refresh hasn't
been percolated to the public keyservers?

   Linus


[GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-13 Thread Anna Schumaker
Hi Linus,

The following changes since commit 32c1431eea4881a6b17bd7c639315010aeefa452:

  Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)

are available in the git repository at:

  git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1

for you to fetch changes up to b4f937cffa66b3d56eb8f586e620d0b223a281a3:

  NFS: Don't run wake_up_bit() when nobody is waiting... (2017-07-13 16:57:18 -0400)


Stable bugfixes:
- Fix -EACCESS on commit to DS handling
- Fix initialization of nfs_page_array->npages
- Only invalidate dentries that are actually invalid

Features:
- Enable NFSoRDMA transparent state migration
- Add support for lookup-by-filehandle
- Add support for nfs re-exporting

Other bugfixes and cleanups:
- Christoph cleaned up the way we declare NFS operations
- Clean up various internal structures
- Various cleanups to commits
- Various improvements to error handling
- Set the dt_type of . and .. entries in NFS v4
- Make slot allocation more reliable
- Fix fscache stat printing
- Fix uninitialized variable warnings
- Fix potential list overrun in nfs_atomic_open()
- Fix a race in NFSoRDMA RPC reply handler
- Fix return size for nfs42_proc_copy()
- Fix against MAC forgery timing attacks


As Bruce mentioned in his pull request, we didn't do a good job coordinating
with Christoph's patches from the beginning.  I ended up rebasing my tree on
top of Christoph's nfs-ops branch to prevent duplicate commits.  Apologies if
that was the wrong choice!

Cheers,
Anna


Anna Schumaker (1):
  NFS: Set FATTR4_WORD0_TYPE for . and .. entries

Benjamin Coddington (3):
  NFS: convert flags to bool
  NFS: nfs_rename() - revalidate directories on -ERESTARTSYS
  NFS: Fix initialization of nfs_page_array->npages

Christoph Hellwig (33):
  sunrpc: properly type argument to kxdreproc_t
  sunrpc: fix encoder callback prototypes
  lockd: fix encoder callback prototypes
  nfs: fix encoder callback prototypes
  nfsd: fix encoder callback prototypes
  sunrpc/auth_gss: nfsd: fix encoder callback prototypes
  sunrpc: properly type argument to kxdrdproc_t
  sunrpc: fix decoder callback prototypes
  sunrpc/auth_gss: fix decoder callback prototypes
  nfsd: fix decoder callback prototypes
  lockd: fix decoder callback prototypes
  nfs: fix decoder callback prototypes
  nfs: don't cast callback decode/proc/encode routines
  lockd: fix some weird indentation
  sunrpc: move p_count out of struct rpc_procinfo
  nfs: use ARRAY_SIZE() in the nfsacl_version3 declaration
  sunrpc: mark all struct rpc_procinfo instances as const
  nfsd4: const-ify nfs_cb_version4
  nfsd: use named initializers in PROC()
  nfsd: remove the unused PROC() macro in nfs3proc.c
  sunrpc: properly type pc_func callbacks
  sunrpc: properly type pc_release callbacks
  sunrpc: properly type pc_decode callbacks
  sunrpc: properly type pc_encode callbacks
  sunrpc: remove kxdrproc_t
  nfsd4: properly type op_set_currentstateid callbacks
  nfsd4: properly type op_get_currentstateid callbacks
  nfsd4: remove nfsd4op_rsize
  nfsd4: properly type op_func callbacks
  sunrpc: move pc_count out of struct svc_procinfo
  sunrpc: mark all struct svc_procinfo instances as const
  sunrpc: mark all struct svc_version instances as const
  nfsd4: const-ify nfsd4_ops

Chuck Lever (13):
  xprtrdma: On invalidation failure, remove MWs from rl_registered
  xprtrdma: Pre-mark remotely invalidated MRs
  xprtrdma: Pass only the list of registered MRs to ro_unmap_sync
  xprtrdma: Rename rpcrdma_req::rl_free
  xprtrdma: Fix client lock-up after application signal fires
  xprtrdma: Fix FRWR invalidation error recovery
  xprtrdma: Don't defer MR recovery if ro_map fails
  NFSv4.1: Handle EXCHGID4_FLAG_CONFIRMED_R during NFSv4.1 migration
  NFSv4.1: Use seqid returned by EXCHANGE_ID after state migration
  xprtrdma: Demote "connect" log messages
  xprtrdma: FMR does not need list_del_init()
  xprtrdma: Replace PAGE_MASK with offset_in_page()
  xprtrdma: Fix documenting comments in frwr_ops.c

Dan Carpenter (1):
  NFS: silence a uninitialized variable warning

Jason A. Donenfeld (1):
  sunrpc: use constant time memory comparison for mac

Jeff Layton (1):
  nfs4: add NFSv4 LOOKUPP handlers

NeilBrown (3):
  NFS: only invalidate dentrys that are clearly invalid.
  NFS: guard against confused server in nfs_atomic_open()
  NFS: check for nfs_refresh_inode() errors in nfs_fhget()

Olga Kornievskaia (3):
  PNFS fix EACCESS on commit to DS handling
  PNFS for stateid errors retry against MDS first
  NFSv4.2 fix size storage for nfs42_proc_copy

Peng Tao (3):
  nfs: replace d_add with d_splice_alias in atomic_open
  nfs: add a nfs_ilookup helper
  nfs: add export operations

Trond Myklebus