from:"Olaf Hering"

Re: [PATCH 3/3] scsi: storvsc: Validate length of incoming packet in storvsc_on_channel_callback()

2021-03-29 Thread Olaf Hering

On Thu, Dec 17, Andrea Parri (Microsoft) wrote:

> Check that the packet is of the expected size at least, don't copy data
> past the packet.

> + if (hv_pkt_datalen(desc) < sizeof(struct vstor_packet) -
> + stor_device->vmscsi_size_delta) {
> + dev_err(&device->device, "Invalid packet len\n");
> + continue;
> + }
> +

Sorry for being late:

It might be just cosmetic, but should this check be done prior to the call to
vmbus_request_addr()?

Unrelated: my copy of vmbus_request_addr() can return 0, which is apparently 
not handled by this loop in storvsc_on_channel_callback().
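
The ordering question can be illustrated with a small user-space sketch. It is not the driver code: classify_packet(), demo_lookup() and the pkt_verdict enum are hypothetical names, and the lookup is only a stand-in for vmbus_request_addr(). The sketch assumes that looking up a request may have side effects, so the cheap length check runs first, and it treats a lookup result of 0 as its own case instead of falling through.

```c
#include <stddef.h>
#include <stdint.h>

/* Possible outcomes for one incoming packet (hypothetical names). */
enum pkt_verdict { PKT_OK, PKT_TOO_SHORT, PKT_NO_REQUEST };

/* Stand-in for a request lookup; returns 0 when nothing matches. */
typedef uint64_t (*request_lookup_fn)(uint64_t request_id);

/* Demo lookup: only request id 1 is known. */
static uint64_t demo_lookup(uint64_t request_id)
{
	return request_id == 1 ? 0x1000 : 0;
}

static enum pkt_verdict classify_packet(size_t datalen, size_t min_len,
					uint64_t request_id,
					request_lookup_fn lookup)
{
	/* Validate the length before touching any request state. */
	if (datalen < min_len)
		return PKT_TOO_SHORT;
	/* A lookup result of 0 means "no such request"; handle it explicitly. */
	if (lookup(request_id) == 0)
		return PKT_NO_REQUEST;
	return PKT_OK;
}
```

With this ordering a short packet is rejected without the lookup ever being called.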

Olaf

Re: [PATCH v2] kbuild: enforce -Werror=unused-result

2021-03-25 Thread Olaf Hering

On Fri, 26 Mar 2021 01:55:41 +0900
Masahiro Yamada wrote:

> What about  drivers/net/ethernet/lantiq_etop.c  ?

Nothing complained about it. I guess there is a build-bot for alpha, but none 
for mips.

> I got a lot of complaints in the last trial.

Why did you get complaints, instead of me?


What is the "must" in "__must_check" worth if it is not enforced...
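
For readers unfamiliar with the attribute: __must_check typically maps to the compiler's warn_unused_result attribute, and -Werror=unused-result turns the resulting warning into a build failure. A minimal user-space sketch follows. The names must_check_demo, drop_first_arg() and prepare_args() are made up for illustration and are not kernel code.

```c
/* User-space stand-in for the kernel's __must_check (assumed equivalent). */
#define must_check_demo __attribute__((warn_unused_result))

/* Hypothetical helper: removes one argument, fails if none is left. */
static must_check_demo int drop_first_arg(int *argc)
{
	if (*argc <= 0)
		return -1;
	(*argc)--;
	return 0;
}

/* Caller that honours the attribute by propagating the error. */
static int prepare_args(int *argc)
{
	int retval;

	/*
	 * Writing "drop_first_arg(argc);" here would trigger the warning,
	 * or fail the build with -Werror=unused-result.
	 */
	retval = drop_first_arg(argc);
	if (retval < 0)
		return retval;
	return 0;
}
```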

Olaf


Re: [PATCH v2] kbuild: enforce -Werror=unused-result

2021-03-25 Thread Olaf Hering

On Fri, 19 Mar 2021 15:32:31 +0100
Olaf Hering wrote:

> It is a hard error if a return value is ignored.

The automated builds found only a single error, in load_em86().

Let me know if there are other reasons why the patch was rejected.

Olaf


[PATCH v1] binfmt: check return value of remove_arg_zero

2021-03-19 Thread Olaf Hering

In preparation to build with -Werror=unused-result, use remove_arg_zero 
properly.

Fixes commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba

Signed-off-by: Olaf Hering 
---
 fs/binfmt_em86.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/binfmt_em86.c b/fs/binfmt_em86.c
index 06b9b9fddf70..5b1c02a0250f 100644
--- a/fs/binfmt_em86.c
+++ b/fs/binfmt_em86.c
@@ -63,7 +63,8 @@ static int load_em86(struct linux_binprm *bprm)
 * This is done in reverse order, because of how the
 * user environment and arguments are stored.
 */
-   remove_arg_zero(bprm);
+   retval = remove_arg_zero(bprm);
+   if (retval < 0) return retval; 
retval = copy_string_kernel(bprm->filename, bprm);
if (retval < 0) return retval; 
bprm->argc++;

Re: [PATCH v2] tools/hv: async name resolution in kvp_daemon

2021-03-19 Thread Olaf Hering

On Fri, 19 Mar 2021 15:41:44 +0100
Olaf Hering wrote:

> FullyQualifiedDomainName

I think I asked MSFT in the past what the host side really expects. Maybe
this time there will be an answer.

Why would the host expect a FQDN from a VM? Why would it care about DNS layout 
of the network within the VM?

Basically my copy of hv_kvp_daemon just sends `uname -n` to the host. This is
more correct. This does not waste any network resources. This, up to now, led
to no complaints.
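
What `uname -n` prints is the nodename field filled in by uname(2), so obtaining it needs no resolver call. A minimal sketch, assuming a POSIX system; copy_node_name() is a hypothetical helper and not a function of hv_kvp_daemon:

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Copy the node name (what `uname -n` prints) into buf.
 * Returns 0 on success, -1 on failure or if buf is too small.
 */
static int copy_node_name(char *buf, size_t len)
{
	struct utsname uts;
	int n;

	if (uname(&uts) < 0)
		return -1;
	n = snprintf(buf, len, "%s", uts.nodename);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```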

So, what is the purpose of this API?


Olaf


[PATCH v2] tools/hv: async name resolution in kvp_daemon

2021-03-19 Thread Olaf Hering

The hostname is resolved just once since commit 58125210ab3b ("Tools:
hv: cache FQDN in kvp_daemon to avoid timeouts") to make sure the VM
responds within the timeout limits to requests from the host.

If for some reason getaddrinfo fails, the string returned by the
"FullyQualifiedDomainName" request contains some error string, which is
then used by tools on the host side.

Adjust the code to resolve the current hostname in a separate thread.
This thread loops until getaddrinfo returns success. During this time
all "FullyQualifiedDomainName" requests will be answered with an empty
string.

Signed-off-by: Olaf Hering 
---
v2:
 resend, the thread aims for success.

 tools/hv/Makefile|  2 ++
 tools/hv/hv_kvp_daemon.c | 69 ++--
 2 files changed, 48 insertions(+), 23 deletions(-)

diff --git a/tools/hv/Makefile b/tools/hv/Makefile
index b57143d9459c..3b5481015a84 100644
--- a/tools/hv/Makefile
+++ b/tools/hv/Makefile
@@ -22,6 +22,8 @@ ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS))
 
 ALL_SCRIPTS := hv_get_dhcp_info.sh hv_get_dns_info.sh hv_set_ifconfig.sh
 
+$(OUTPUT)hv_kvp_daemon: LDFLAGS += -lpthread
+
 all: $(ALL_PROGRAMS)
 
 export srctree OUTPUT CC LD CFLAGS
diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
index 1e6fd6ca513b..3951b927aa3d 100644
--- a/tools/hv/hv_kvp_daemon.c
+++ b/tools/hv/hv_kvp_daemon.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include <pthread.h>
 
 /*
  * KVP protocol: The user mode component first registers with the
@@ -85,7 +86,7 @@ static char *processor_arch;
 static char *os_build;
 static char *os_version;
 static char *lic_version = "Unknown version";
-static char full_domain_name[HV_KVP_EXCHANGE_MAX_VALUE_SIZE];
+static char *full_domain_name;
 static struct utsname uts_buf;
 
 /*
@@ -1327,27 +1328,53 @@ static int kvp_set_ip_info(char *if_name, struct hv_kvp_ipaddr_value *new_val)
return error;
 }
 
-
-static void
-kvp_get_domain_name(char *buffer, int length)
+/*
+ * Async retrieval of Fully Qualified Domain Name because getaddrinfo takes an
+ * unpredictable amount of time to finish.
+ */
+static void *kvp_getaddrinfo(void *p)
 {
-   struct addrinfo hints, *info ;
-   int error = 0;
+   char *tmp, **str_ptr = (char **)p;
+   char hostname[HOST_NAME_MAX + 1];
+   struct addrinfo *info, hints = {
+   .ai_family = AF_INET, /* Get only ipv4 addrinfo. */
+   .ai_socktype = SOCK_STREAM,
+   .ai_flags = AI_CANONNAME,
+   };
+   int ret;
+
+   if (gethostname(hostname, sizeof(hostname) - 1) < 0)
+   goto out;
 
-   gethostname(buffer, length);
-   memset(&hints, 0, sizeof(hints));
-   hints.ai_family = AF_INET; /*Get only ipv4 addrinfo. */
-   hints.ai_socktype = SOCK_STREAM;
-   hints.ai_flags = AI_CANONNAME;
+   do {
+   ret = getaddrinfo(hostname, NULL, &hints, &info);
+   if (ret)
+   sleep(1);
+   } while (ret);
+
+   ret = asprintf(&tmp, "%s", info->ai_canonname);
+   freeaddrinfo(info);
+   if (ret <= 0)
+   goto out;
 
-   error = getaddrinfo(buffer, NULL, &hints, &info);
-   if (error != 0) {
-   snprintf(buffer, length, "getaddrinfo failed: 0x%x %s",
-   error, gai_strerror(error));
+   if (ret > HV_KVP_EXCHANGE_MAX_VALUE_SIZE)
+   tmp[HV_KVP_EXCHANGE_MAX_VALUE_SIZE - 1] = '\0';
+   *str_ptr = tmp;
+
+out:
+   pthread_exit(NULL);
+}
+
+static void kvp_obtain_domain_name(char **str_ptr)
+{
+   pthread_t t;
+
+   if (pthread_create(&t, NULL, kvp_getaddrinfo, str_ptr)) {
+   syslog(LOG_ERR, "pthread_create failed; error: %d %s",
+   errno, strerror(errno));
return;
}
-   snprintf(buffer, length, "%s", info->ai_canonname);
-   freeaddrinfo(info);
+   pthread_detach(t);
 }
 
 void print_usage(char *argv[])
@@ -1404,11 +1431,7 @@ int main(int argc, char *argv[])
 * Retrieve OS release information.
 */
kvp_get_os_info();
-   /*
-* Cache Fully Qualified Domain Name because getaddrinfo takes an
-* unpredictable amount of time to finish.
-*/
-   kvp_get_domain_name(full_domain_name, sizeof(full_domain_name));
+   kvp_obtain_domain_name(&full_domain_name);
 
if (kvp_file_init()) {
syslog(LOG_ERR, "Failed to initialize the pools");
@@ -1573,7 +1596,7 @@ int main(int argc, char *argv[])
 
switch (hv_msg->body.kvp_enum_data.index) {
case FullyQualifiedDomainName:
-   strcpy(key_value, full_domain_name);
+   strcpy(key_value, full_domain_name ? : "");
strcpy(key_name, "FullyQualifiedDomainName");
break;
case IntegrationServicesVersion:

[PATCH v2] kbuild: enforce -Werror=unused-result

2021-03-19 Thread Olaf Hering

It is a hard error if a return value is ignored.
In case the return value has no meaning, remove the attribute.

Signed-off-by: Olaf Hering 
---
v2:
  resend

 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index a28bb374663d..9b7def6db494 100644
--- a/Makefile
+++ b/Makefile
@@ -495,7 +495,7 @@ KBUILD_AFLAGS   := -D__ASSEMBLY__ -fno-PIE
 KBUILD_CFLAGS   := -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs \
   -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE \
   -Werror=implicit-function-declaration -Werror=implicit-int \
-  -Werror=return-type -Wno-format-security \
+  -Werror=return-type -Werror=unused-result -Wno-format-security \
   -std=gnu89
 KBUILD_CPPFLAGS := -D__KERNEL__
 KBUILD_AFLAGS_KERNEL :=

[PATCH v1] kbuild: enforce -Werror=unused-result

2020-11-17 Thread Olaf Hering

It is a hard error if a return value is ignored.
In case the return value has no meaning, remove the attribute.

Signed-off-by: Olaf Hering 
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index e2c3f65c4721..c7f9acffad42 100644
--- a/Makefile
+++ b/Makefile
@@ -497,7 +497,7 @@ KBUILD_AFLAGS   := -D__ASSEMBLY__ -fno-PIE
 KBUILD_CFLAGS   := -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs \
   -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE \
   -Werror=implicit-function-declaration -Werror=implicit-int \
-  -Werror=return-type -Wno-format-security \
+  -Werror=return-type -Werror=unused-result -Wno-format-security \
   -std=gnu89
 KBUILD_CPPFLAGS := -D__KERNEL__
 KBUILD_AFLAGS_KERNEL :=

[PATCH v1] video: hyperv_fb: include vmalloc.h

2020-11-06 Thread Olaf Hering

hvfb_getmem uses vzalloc, therefore vmalloc.h should be included.

Fixes commit d21987d709e807ba7bbf47044deb56a3c02e8be4 ("video: hyperv:
hyperv_fb: Support deferred IO for Hyper-V frame buffer driver")

Signed-off-by: Olaf Hering 
---
 drivers/video/fbdev/hyperv_fb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
index e36fb1a0ecdb..5bc86f481a78 100644
--- a/drivers/video/fbdev/hyperv_fb.c
+++ b/drivers/video/fbdev/hyperv_fb.c
@@ -47,6 +47,7 @@
 
 #include 
 #include 
+#include <linux/vmalloc.h>
 #include 
 #include 
 #include

Re: [PATCH v1] hv_balloon: disable warning when floor reached

2020-10-19 Thread Olaf Hering

On Mon, 19 Oct 2020 02:58:08 +
Michael Kelley wrote:

> I think we should take the patch.

Thanks. I just briefly looked at the code and did not understand much of it. But
it feels like the math uses the wrong input. I think it is not the 'pr_warn'
that needs changing. The 'Fixes' tag would also be incorrect, because a
4.12+backports kernel does not show the warning.

Olaf


Re: [PATCH v1] hv_balloon: disable warning when floor reached

2020-10-13 Thread Olaf Hering

On Tue, 13 Oct 2020 11:19:21 +0200
Olaf Hering wrote:

> A message is generated every 5 minutes. Unclear why this remained unnoticed 
> since at least v5.3. I have added debug to my distro kernel to see what the 
> involved variable values are. More info later today.

The actual values for avail_pages, num_pages and floor are shown below.
The VM has min 512M, startup 1024M. If I interpret it correctly, the host 
requests to balloon down 83MB, while the VM has ~596MB assigned according to 
the GUI. free still reports 878MB.

Olaf

[   66.917948] hv_balloon: Max. dynamic memory size:  MB
[  331.839393] hv_balloon: Balloon request will be partially fulfilled. (65875 32768 54728) Balloon floor reached.
[  331.847451] hv_balloon: Balloon request will be partially fulfilled. (54745 21621 54728) Balloon floor reached.
[  331.848480] hv_balloon: Balloon request will be partially fulfilled. (54745 21604 54728) Balloon floor reached.
[  331.849465] hv_balloon: Balloon request will be partially fulfilled. (54745 21587 54728) Balloon floor reached.
[  331.850463] hv_balloon: Balloon request will be partially fulfilled. (54745 21570 54728) Balloon floor reached.
[  331.851393] hv_balloon: Balloon request will be partially fulfilled. (54682 21553 54728) Balloon floor reached.
[  631.814538] hv_balloon: Balloon request will be partially fulfilled. (54801 21553 54728) Balloon floor reached.
[  631.819084] hv_balloon: Balloon request will be partially fulfilled. (54801 21480 54728) Balloon floor reached.
[  631.823487] hv_balloon: Balloon request will be partially fulfilled. (54738 21407 54728) Balloon floor reached.
[  631.825832] hv_balloon: Balloon request will be partially fulfilled. (54738 21397 54728) Balloon floor reached.
[  631.827988] hv_balloon: Balloon request will be partially fulfilled. (54738 21387 54728) Balloon floor reached.
[  631.830111] hv_balloon: Balloon request will be partially fulfilled. (54738 21377 54728) Balloon floor reached.
[  931.814649] hv_balloon: Balloon request will be partially fulfilled. (54406 21367 54728) Balloon floor reached.
[ 1231.829087] hv_balloon: Balloon request will be partially fulfilled. (54408 21367 54728) Balloon floor reached.
[ 1531.859374] hv_balloon: Balloon request will be partially fulfilled. (54416 21367 54728) Balloon floor reached.
[ 1831.874813] hv_balloon: Balloon request will be partially fulfilled. (54408 21367 54728) Balloon floor reached.
[ 2131.878262] hv_balloon: Balloon request will be partially fulfilled. (54672 21367 54728) Balloon floor reached.
[ 2431.895144] hv_balloon: Balloon request will be partially fulfilled. (54532 21367 54728) Balloon floor reached.
[ 2731.916792] hv_balloon: Balloon request will be partially fulfilled. (54609 21367 54728) Balloon floor reached.
[ 3031.922862] hv_balloon: Balloon request will be partially fulfilled. (54597 21367 54728) Balloon floor reached.
[ 3331.949145] hv_balloon: Balloon request will be partially fulfilled. (54615 21367 54728) Balloon floor reached.
[ 3631.957564] hv_balloon: Balloon request will be partially fulfilled. (54540 21367 54728) Balloon floor reached.
[ 3931.969477] hv_balloon: Balloon request will be partially fulfilled. (53057 21367 54728) Balloon floor reached.


Re: [PATCH v1] hv_balloon: disable warning when floor reached

2020-10-13 Thread Olaf Hering

On Tue, 13 Oct 2020 09:17:17 +
Wei Liu wrote:

> So ... this patch is not needed anymore?

Why? A message is generated every 5 minutes. Unclear why this remained 
unnoticed since at least v5.3. I have added debug to my distro kernel to see 
what the involved variable values are. More info later today.

Olaf


[PATCH v2] kbuild: enforce -Werror=return-type

2020-10-11 Thread Olaf Hering

Catch errors which at least gcc tolerates by default:
 warning: 'return' with no value, in function returning non-void [-Wreturn-type]

Signed-off-by: Olaf Hering 
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index f84d7e4ca0be..965e7259e6e8 100644
--- a/Makefile
+++ b/Makefile
@@ -497,7 +497,7 @@ KBUILD_AFLAGS   := -D__ASSEMBLY__ -fno-PIE
 KBUILD_CFLAGS   := -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs \
   -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE \
   -Werror=implicit-function-declaration -Werror=implicit-int \
-  -Wno-format-security \
+  -Werror=return-type -Wno-format-security \
   -std=gnu89
 KBUILD_CPPFLAGS := -D__KERNEL__
 KBUILD_AFLAGS_KERNEL :=

Re: [PATCH v1] hv_balloon: disable warning when floor reached

2020-10-08 Thread Olaf Hering

On Thu,  8 Oct 2020 09:12:15 +0200
Olaf Hering wrote:

> warning is logged in dmesg

Actually it is logged on the system console, depending on how logging is 
configured.


Olaf


[PATCH v1] hv_balloon: disable warning when floor reached

2020-10-08 Thread Olaf Hering

It is not an error if the host requests to balloon down, but the VM
refuses to do so. Without this change a warning is logged in dmesg
every five minutes.

Fixes commit b3bb97b8a49f3

Signed-off-by: Olaf Hering 
---
 drivers/hv/hv_balloon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 32e3bc0aa665..0f50295d0214 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -1275,7 +1275,7 @@ static void balloon_up(struct work_struct *dummy)
 
/* Refuse to balloon below the floor. */
if (avail_pages < num_pages || avail_pages - num_pages < floor) {
-   pr_warn("Balloon request will be partially fulfilled. %s\n",
+   pr_info("Balloon request will be partially fulfilled. %s\n",
avail_pages < num_pages ? "Not enough memory." :
"Balloon floor reached.");
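
The condition in this hunk can be checked by hand against the debug output posted elsewhere in this thread. The following is a user-space sketch, not driver code. It assumes 4 KiB pages and that each logged triple is (avail_pages num_pages floor) in that order; balloon_check() and pages_to_mib() are hypothetical names. For the triple (54406 21367 54728), avail_pages - num_pages is 33039, which is below the floor, and 21367 pages are about 83 MiB.

```c
/* Assumption for this sketch: 4 KiB pages. */
#define PAGE_SIZE_BYTES 4096ULL

enum balloon_verdict { BALLOON_FULL, BALLOON_NO_MEMORY, BALLOON_FLOOR };

/* Same condition as in the hunk, split into its two cases. */
static enum balloon_verdict balloon_check(unsigned long avail_pages,
					  unsigned long num_pages,
					  unsigned long floor)
{
	if (avail_pages < num_pages)
		return BALLOON_NO_MEMORY;
	if (avail_pages - num_pages < floor)
		return BALLOON_FLOOR;
	return BALLOON_FULL;
}

/* Convert a page count to whole MiB (rounds down). */
static unsigned long long pages_to_mib(unsigned long pages)
{
	return pages * PAGE_SIZE_BYTES / (1024 * 1024);
}
```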

[PATCH v1] kbuild: enforce -Werror=return-type

2020-10-05 Thread Olaf Hering

Catch errors which at least gcc tolerates by default:
 warning: 'return' with no value, in function returning non-void [-Wreturn-type]

Signed-off-by: Olaf Hering 
---
 Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Makefile b/Makefile
index f84d7e4ca0be..7b2e63e7be18 100644
--- a/Makefile
+++ b/Makefile
@@ -942,6 +942,9 @@ KBUILD_CFLAGS   += $(call cc-option,-Werror=date-time)
 # enforce correct pointer usage
 KBUILD_CFLAGS   += $(call cc-option,-Werror=incompatible-pointer-types)
 
+# enforce correct return type
+KBUILD_CFLAGS   += $(call cc-option,-Werror=return-type)
+
 # Require designated initializers for all marked structures
 KBUILD_CFLAGS   += $(call cc-option,-Werror=designated-init)

[PATCH v1] tools: hv: remove cast from hyperv_die_event

2020-08-19 Thread Olaf Hering

No need to cast a void pointer.

Signed-off-by: Olaf Hering 
---
 drivers/hv/vmbus_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 910b6e90866c..187809977360 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -83,7 +83,7 @@ static int hyperv_panic_event(struct notifier_block *nb, unsigned long val,
 static int hyperv_die_event(struct notifier_block *nb, unsigned long val,
void *args)
 {
-   struct die_args *die = (struct die_args *)args;
+   struct die_args *die = args;
struct pt_regs *regs = die->regs;
 
/* Don't notify Hyper-V if the die event is other than oops */

Re: [PATCH] Drivers: hv: Change flag to write log level in panic msg to false

2020-07-10 Thread Olaf Hering

On Fri, Jun 26, Joseph Salisbury wrote:

> When the kernel panics, one page worth of kmsg data is written to an allocated
> page.  The Hypervisor is notified of the page address through the MSR.  This
> panic information is collected on the host.  Since we are only collecting one
> page of data, the full panic message may not be collected.

Are the people who need to work with this tiny bit of information
satisfied already, or did they already miss info even with this patch?

I'm asking because kmsg_dump_get_buffer unconditionally includes the
timestamp (unless it is enabled). I do wonder if there should be an API
addition to omit the timestamp. Then even more dmesg output can be
written into the 4k buffer.
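
To give a feeling for the saving: a prefix such as "[   42.452258] " costs 15 bytes per line. The following user-space sketch only illustrates dropping such a prefix from an already formatted line. It is not the kmsg_dump API, and skip_timestamp() is a hypothetical helper.

```c
#include <string.h>

/*
 * Return a pointer past a leading "[seconds.micros] " prefix,
 * or the line unchanged if it does not start with one.
 */
static const char *skip_timestamp(const char *line)
{
	const char *end;

	if (line[0] != '[')
		return line;
	end = strchr(line, ']');
	if (!end || end[1] != ' ')
		return line;
	return end + 2;
}
```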

Olaf

Re: recalibrating x86 TSC during suspend/resume

2019-02-22 Thread Olaf Hering

On Fri, Feb 22, Paolo Bonzini wrote:

> On 22/02/19 12:44, Thomas Gleixner wrote:
> >> The specific usecase I have is a workload within VMs that makes heavy
> >> use of TSC. The kernel is booted with 'clocksource=tsc highres=off 
> >> nohz=off'
> >> because only this clocksource gives enough granularity. The default
> >> paravirtualized clock will return the same values via
> >> clock_gettime(CLOCK_MONOTONIC) if the timespan between two calls is too
> >> short. This does not happen with 'clocksource=tsc'.
> 
> This shouldn't happen.  clock_gettime(CLOCK_MONOTONIC) should be
> monotonic increasing.  Do you have a testcase?

Two years ago I tweaked sysbench to track the execution time of the
'memory' test:

https://github.com/olafhering/sysbench
https://github.com/olafhering/sysbench/blame/pv/src/tests/memory/sb_memory.c

The checks in diff_timespec() triggered with clocksource=xen, but I can
not reproduce it right now with 5.0 and 4.4 based kernels. I have no
data how KVM behaves. In the end the hypervisor was tweaked to tolerate
a certain jitter in expected TSC speed before emulation kicks in. Up to
~1MHz would be ok to stay within the 500PPM limit that ntpd can handle.
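
As a worked example of that limit: 500 PPM of a 2 GHz TSC is 1 MHz. The 2 GHz figure is an assumption for illustration; the message does not state the actual TSC frequency. freq_skew_ppm() is a hypothetical helper.

```c
#include <stdint.h>

/*
 * Relative difference between two frequencies in parts per million,
 * rounded down. expected_hz must be non-zero.
 */
static uint64_t freq_skew_ppm(uint64_t expected_hz, uint64_t actual_hz)
{
	uint64_t diff = actual_hz > expected_hz ? actual_hz - expected_hz
						: expected_hz - actual_hz;

	return diff * 1000000ULL / expected_hz;
}
```

On a faster TSC the same 1 MHz difference is a smaller relative error, so the tolerable absolute jitter depends on the expected frequency.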

But now there is that "island" issue that needs to be resolved in one
way or another.

Olaf

Re: recalibrating x86 TSC during suspend/resume

2019-02-22 Thread Olaf Hering

On Fri, 22 Feb 2019 12:44:39 +0100 (CET)
Thomas Gleixner wrote:

> Whether that is accurate enough or not to make NTP happy, I can't tell, but
> it's definitely worth a try.

Thanks Thomas, I will look into the suggestions.


Olaf


recalibrating x86 TSC during suspend/resume

2019-02-22 Thread Olaf Hering

Is there a way to recalibrate the x86 TSC during a suspend/resume cycle?

While the frequency will remain the same on a Laptop, it may (or rather:
it definitely will) differ if a VM is migrated from one host to another.
The hypervisor may choose to emulate the expected TSC frequency on the
destination host, but this emulation comes with a significant
performance cost. Therefore it would be good if the kernel evaluates the
environment during resume.

The specific usecase I have is a workload within VMs that makes heavy
use of TSC. The kernel is booted with 'clocksource=tsc highres=off nohz=off'
because only this clocksource gives enough granularity. The default
paravirtualized clock will return the same values via
clock_gettime(CLOCK_MONOTONIC) if the timespan between two calls is too
short. This does not happen with 'clocksource=tsc'.

Right now it is not possible to migrate VMs to hosts with different CPU
speeds. This leads to "islands" of identical hardware, and makes
maintenance of hosts harder than it needs to be. If the VM kernel were
able to cope with CPU/TSC frequency changes, the pool of potential
destination hosts would become significantly larger.

The current result of a migration with non-emulated TSC between hosts of
different speed is:

[   42.452258] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
[   42.452270] clocksource:   'xen' wd_now: 6d34a86adb wd_last: 6d1dc51793 mask: 
[   42.452272] clocksource:   'tsc' cs_now: 1fd2ce46bb cs_last: 1f95c4ca75 mask: 
[   42.452273] tsc: Marking TSC unstable due to clocksource watchdog

Thanks,
Olaf


Re: [PATCH V3 2/3] HYPERV/IOMMU: Add Hyper-V stub IOMMU driver

2019-02-08 Thread Olaf Hering

On Thu, Feb 07, lantianyu1...@gmail.com wrote:

> +++ b/drivers/iommu/Kconfig
> +config HYPERV_IOMMU
> + bool "Hyper-V x2APIC IRQ Handling"
> + depends on HYPERV
> + select IOMMU_API
> + help

Consider adding 'default HYPERV' like some other drivers already do it.

Olaf

Re: [PATCH V2 3/4] vmbus: add per-channel sysfs info

2018-10-18 Thread Olaf Hering

On Sun, 17 Sep 2017 20:54:18 -0700
k...@exchange.microsoft.com wrote:

> This extends existing vmbus related sysfs structure to provide per-channel
> state information. This is useful when diagnosing issues with multiple
> queues in networking and storage.

> +++ b/drivers/hv/vmbus_drv.c
> +static ssize_t write_avail_show(const struct vmbus_channel *channel, char *buf)
> +{
> + const struct hv_ring_buffer_info *rbi = &channel->outbound;
> +
> + return sprintf(buf, "%u\n", hv_get_bytes_to_write(rbi));
> +}
> +VMBUS_CHAN_ATTR_RO(write_avail);

This has been upstream for a year.

But I wonder how this can work if vmbus_device_register is called,
and then something reads the populated sysfs files before vmbus_open returns.
Nothing protects rbi->ring_buffer in this case, which remains NULL
until vmbus_open populates it.

A simple reproducer, with a modular kernel, is to boot with init=/bin/bash and run:
head /sys/bus/vmbus/devices/*/channels/*/*
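
One conceivable guard is to refuse the read while the ring buffer pointer is still NULL. The following is a simplified user-space sketch under that assumption. struct ring_info_sketch and write_avail_show_sketch() are hypothetical stand-ins, not the vmbus structures or handlers.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified, hypothetical stand-in for hv_ring_buffer_info. */
struct ring_info_sketch {
	void *ring_buffer;        /* NULL until the channel is opened */
	unsigned int bytes_avail; /* placeholder for the computed value */
};

/* Show handler sketch: refuse to report on an unopened channel. */
static int write_avail_show_sketch(const struct ring_info_sketch *rbi,
				   char *buf, size_t len)
{
	if (!rbi->ring_buffer)
		return -EINVAL;
	return snprintf(buf, len, "%u\n", rbi->bytes_avail);
}
```

A NULL check alone does not close the window between the check and the use, so real code would presumably also need locking against a concurrent open or close.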

Olaf


[PATCH v1] tools: hv: update lsvmbus to be compatible with python3

2018-05-22 Thread Olaf Hering

From: Olaf Hering <oher...@suse.de>

Python 3 changed how 'print' works.
Adjust the code to a syntax that is understood by python2 and python3.

Signed-off-by: Olaf Hering <oher...@suse.de>
---
 tools/hv/lsvmbus | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/hv/lsvmbus b/tools/hv/lsvmbus
index 353e56768df8..55e7374bade0 100644
--- a/tools/hv/lsvmbus
+++ b/tools/hv/lsvmbus
@@ -17,7 +17,7 @@ if options.verbose is not None:
 
 vmbus_sys_path = '/sys/bus/vmbus/devices'
 if not os.path.isdir(vmbus_sys_path):
-   print "%s doesn't exist: exiting..." % vmbus_sys_path
+   print("%s doesn't exist: exiting..." % vmbus_sys_path)
exit(-1)
 
 vmbus_dev_dict = {
@@ -93,11 +93,11 @@ format2 = '%2s: Class_ID = %s - %s\n\tDevice_ID = %s\n\tSysfs path: %s\n%s'
 
 for d in vmbus_dev_list:
if verbose == 0:
-   print ('VMBUS ID ' + format0) % (d.vmbus_id, d.dev_desc)
+   print(('VMBUS ID ' + format0) % (d.vmbus_id, d.dev_desc))
elif verbose == 1:
-   print ('VMBUS ID ' + format1) % \
-   (d.vmbus_id, d.class_id, d.dev_desc, d.chn_vp_mapping)
+   print (('VMBUS ID ' + format1) %\
+   (d.vmbus_id, d.class_id, d.dev_desc, d.chn_vp_mapping))
else:
-   print ('VMBUS ID ' + format2) % \
+   print (('VMBUS ID ' + format2) % \
(d.vmbus_id, d.class_id, d.dev_desc, \
-   d.device_id, d.sysfs_path, d.chn_vp_mapping)
+   d.device_id, d.sysfs_path, d.chn_vp_mapping))

[PATCH v2] tools: hv: update lsvmbus to be compatible with python3

2018-05-22 Thread Olaf Hering

Python 3 changed how 'print' works.
Adjust the code to a syntax that is understood by python2 and python3.

Signed-off-by: Olaf Hering <o...@aepfle.de>
---
v2:
 correct author 

 tools/hv/lsvmbus | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/hv/lsvmbus b/tools/hv/lsvmbus
index 353e56768df8..55e7374bade0 100644
--- a/tools/hv/lsvmbus
+++ b/tools/hv/lsvmbus
@@ -17,7 +17,7 @@ if options.verbose is not None:
 
 vmbus_sys_path = '/sys/bus/vmbus/devices'
 if not os.path.isdir(vmbus_sys_path):
-   print "%s doesn't exist: exiting..." % vmbus_sys_path
+   print("%s doesn't exist: exiting..." % vmbus_sys_path)
exit(-1)
 
 vmbus_dev_dict = {
@@ -93,11 +93,11 @@ format2 = '%2s: Class_ID = %s - %s\n\tDevice_ID = %s\n\tSysfs path: %s\n%s'
 
 for d in vmbus_dev_list:
if verbose == 0:
-   print ('VMBUS ID ' + format0) % (d.vmbus_id, d.dev_desc)
+   print(('VMBUS ID ' + format0) % (d.vmbus_id, d.dev_desc))
elif verbose == 1:
-   print ('VMBUS ID ' + format1) % \
-   (d.vmbus_id, d.class_id, d.dev_desc, d.chn_vp_mapping)
+   print (('VMBUS ID ' + format1) %\
+   (d.vmbus_id, d.class_id, d.dev_desc, d.chn_vp_mapping))
else:
-   print ('VMBUS ID ' + format2) % \
+   print (('VMBUS ID ' + format2) % \
(d.vmbus_id, d.class_id, d.dev_desc, \
-   d.device_id, d.sysfs_path, d.chn_vp_mapping)
+   d.device_id, d.sysfs_path, d.chn_vp_mapping))

[PATCH] tools: hv: include string.h in hv_fcopy_daemon

2018-01-08 Thread Olaf Hering

The usage of strchr requires inclusion of string.h.

Fixes: 0c38cda64aec ("tools: hv: remove unnecessary header files and netlink related code")
Signed-off-by: Olaf Hering <o...@aepfle.de>
---
 tools/hv/hv_fcopy_daemon.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/hv/hv_fcopy_daemon.c b/tools/hv/hv_fcopy_daemon.c
index 457a1521f32f..89ed6f325e45 100644
--- a/tools/hv/hv_fcopy_daemon.c
+++ b/tools/hv/hv_fcopy_daemon.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include <string.h>
 #include 
 #include 
 #include

Re: [PATCHv2] tools: hv: hv_set_ifconfig.sh double check before setting ip

2017-12-08 Thread Olaf Hering

On Fri, Dec 08, Eduardo Otubo wrote:

>  tools/hv/hv_set_ifconfig.sh | 45 
> +++--
>  1 file changed, 43 insertions(+), 2 deletions(-)

> +# let's wait for 3 minutes

Was this codepath runtime tested?

Last time this came up, the conclusion was that Windows terminates the
"KVP connection" after 60 seconds. After that, the VM must be rebooted
because the kernel does not deal with the long-running script. It is
also not clear what the kernel is supposed to do: either silently report
success and hope the script reports success one day, or report an error
regardless of what the script actually returns. New KVP requests
would need to be queued either way.

Olaf

signature.asc
Description: PGP signature

Re: [PATCH] tools: hv: handle EINTR in hv_fcopy_daemon

2017-09-05 Thread Olaf Hering

On Tue, Sep 05, KY Srinivasan wrote:

> Were you planning on sending a patch for this.

No, not yet. I don't know how SA_RESTART works.

Olaf


signature.asc
Description: PGP signature

Re: [PATCHv2] hv_set_ifconfig.sh double check before setting ip

2017-08-28 Thread Olaf Hering

On Mon, Aug 28, Eduardo Otubo wrote:

> +sleep 30s;

Was this runtime tested?
Once this sleep(1) is done, HV_UTIL_TIMEOUT kicks in and the daemon dies.

Olaf


signature.asc
Description: PGP signature

Re: [PATCH] tools: hv: handle EINTR in hv_fcopy_daemon

2017-08-25 Thread Olaf Hering

On Fri, Aug 25, Vitaly Kuznetsov wrote:

> Shall we request SA_RESTART with sigaction() in all three daemons instead?

If that works better, probably yes.

Olaf


signature.asc
Description: PGP signature

[PATCH] Drivers: hv: fcopy: restore correct transfer length

2017-08-25 Thread Olaf Hering

Prior to commit c7e490fc23eb the expected length of bytes read by the
daemon did depend on the context. It was either hv_start_fcopy or
hv_do_fcopy. The daemon had a buffer size of two pages, which was much
larger than needed.

Since commit c7e490fc23eb the expected length of bytes read by the
daemon changed slightly. For START_FILE_COPY it is still the size of
hv_start_fcopy.  But for WRITE_TO_FILE and the other operations it is as
large as the buffer that arrived via vmbus. In case of WRITE_TO_FILE
that is slightly larger than a struct hv_do_fcopy. Since the buffer in
the daemon was still larger everything was fine.

After commit 3f2baa8a7d2e the daemon reads only what is actually needed.
The new buffer layout is as large as a struct hv_do_fcopy, for the
WRITE_TO_FILE operation. Since the kernel expects a slightly larger
size, hvt_op_read will return -EINVAL because the daemon will read
slightly less than expected.

Fix this by restoring the expected buffer size in case of WRITE_TO_FILE.

Fixes: c7e490fc23eb ("Drivers: hv: fcopy: convert to hv_utils_transport")
Fixes: 3f2baa8a7d2e ("Tools: hv: update buffer handling in hv_fcopy_daemon")

Signed-off-by: Olaf Hering <o...@aepfle.de>
---
 drivers/hv/hv_fcopy.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
index daa75bd41f86..2364281d8593 100644
--- a/drivers/hv/hv_fcopy.c
+++ b/drivers/hv/hv_fcopy.c
@@ -170,6 +170,10 @@ static void fcopy_send_data(struct work_struct *dummy)
out_src = smsg_out;
break;
 
+   case WRITE_TO_FILE:
+   out_src = fcopy_transaction.fcopy_msg;
+   out_len = sizeof(struct hv_do_fcopy);
+   break;
default:
out_src = fcopy_transaction.fcopy_msg;
out_len = fcopy_transaction.recv_len;

[PATCH] tools: hv: handle EINTR in hv_fcopy_daemon

2017-08-25 Thread Olaf Hering

If strace attaches to the daemon pread returns with EINTR, and the
process exits. Catch this case and continue with the next iteration.

Signed-off-by: Olaf Hering <o...@aepfle.de>
---
 tools/hv/hv_fcopy_daemon.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/hv/hv_fcopy_daemon.c b/tools/hv/hv_fcopy_daemon.c
index c273dd34d144..574e6bf5865c 100644
--- a/tools/hv/hv_fcopy_daemon.c
+++ b/tools/hv/hv_fcopy_daemon.c
@@ -201,6 +201,8 @@ int main(int argc, char *argv[])
ssize_t len;
len = pread(fcopy_fd, &buffer, sizeof(buffer), 0);
if (len < 0) {
+   if (errno == EINTR)
+   continue;
syslog(LOG_ERR, "pread failed: %s", strerror(errno));
exit(EXIT_FAILURE);
}
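
The patch above handles EINTR with a "continue" inside the daemon's main loop. The same idea can be sketched as a standalone helper. pread_retry() is a hypothetical name and is not part of hv_fcopy_daemon.c; it only shows the retry pattern.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: call pread() again for as long as it fails with
 * EINTR. Any other result, success or failure, is returned to the caller
 * with errno left as pread() set it.
 */
static ssize_t pread_retry(int fd, void *buf, size_t count, off_t offset)
{
	ssize_t len;

	assert(buf != NULL);

	do {
		len = pread(fd, buf, count, offset);
	} while (len < 0 && errno == EINTR);

	return len;
}
```

Errors other than EINTR still reach the caller, so the existing syslog() and exit() path would stay in place for real failures.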

[PATCH RESEND] tools: hv: update buffer handling in hv_fcopy_daemon

2017-08-09 Thread Olaf Hering

Currently this warning is triggered when compiling hv_fcopy_daemon:

hv_fcopy_daemon.c:216:4: warning: dereferencing type-punned pointer will break 
strict-aliasing rules [-Wstrict-aliasing]
kernel_modver = *(__u32 *)buffer;

Convert the send/receive buffer to a union and pass individual members as
needed. This also gives the correct size for the buffer (~6K instead of 64K).

Signed-off-by: Olaf Hering <o...@aepfle.de>
---
 tools/hv/hv_fcopy_daemon.c | 31 ---
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/tools/hv/hv_fcopy_daemon.c b/tools/hv/hv_fcopy_daemon.c
index 26ae609a9448..c273dd34d144 100644
--- a/tools/hv/hv_fcopy_daemon.c
+++ b/tools/hv/hv_fcopy_daemon.c
@@ -138,14 +138,17 @@ void print_usage(char *argv[])
 
 int main(int argc, char *argv[])
 {
-   int fcopy_fd, len;
+   int fcopy_fd;
int error;
int daemonize = 1, long_index = 0, opt;
int version = FCOPY_CURRENT_VERSION;
-   char *buffer[4096 * 2];
-   struct hv_fcopy_hdr *in_msg;
+   union {
+   struct hv_fcopy_hdr hdr;
+   struct hv_start_fcopy start;
+   struct hv_do_fcopy copy;
+   __u32 kernel_modver;
+   } buffer = { };
int in_handshake = 1;
-   __u32 kernel_modver;
 
static struct option long_options[] = {
{"help",no_argument,   0,  'h' },
@@ -195,32 +198,30 @@ int main(int argc, char *argv[])
 * In this loop we process fcopy messages after the
 * handshake is complete.
 */
-   len = pread(fcopy_fd, buffer, (4096 * 2), 0);
+   ssize_t len;
+   len = pread(fcopy_fd, &buffer, sizeof(buffer), 0);
if (len < 0) {
syslog(LOG_ERR, "pread failed: %s", strerror(errno));
exit(EXIT_FAILURE);
}
 
if (in_handshake) {
-   if (len != sizeof(kernel_modver)) {
+   if (len != sizeof(buffer.kernel_modver)) {
syslog(LOG_ERR, "invalid version negotiation");
exit(EXIT_FAILURE);
}
-   kernel_modver = *(__u32 *)buffer;
in_handshake = 0;
-   syslog(LOG_INFO, "kernel module version: %d",
-  kernel_modver);
+   syslog(LOG_INFO, "kernel module version: %u",
+  buffer.kernel_modver);
continue;
}
 
-   in_msg = (struct hv_fcopy_hdr *)buffer;
-
-   switch (in_msg->operation) {
+   switch (buffer.hdr.operation) {
case START_FILE_COPY:
-   error = hv_start_fcopy((struct hv_start_fcopy *)in_msg);
+   error = hv_start_fcopy(&buffer.start);
break;
case WRITE_TO_FILE:
-   error = hv_copy_data((struct hv_do_fcopy *)in_msg);
+   error = hv_copy_data(&buffer.copy);
break;
case COMPLETE_FCOPY:
error = hv_copy_finished();
@@ -231,7 +232,7 @@ int main(int argc, char *argv[])
 
default:
syslog(LOG_ERR, "Unknown operation: %d",
-   in_msg->operation);
+   buffer.hdr.operation);
 
}
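
A reduced sketch of the union approach used in the patch above. The demo_* structs are hypothetical stand-ins and do not match the real layouts in linux/hyperv.h. The point is that each message type is reached through a union member, so no pointer cast is needed, and that sizeof() of the union is the size of its largest member. The old declaration was an array of "char *" rather than "char", which is presumably where the 64K in the commit message comes from: 8192 pointers of 8 bytes each on a 64-bit build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the message structs in linux/hyperv.h. */
struct demo_hdr   { uint32_t operation; };
struct demo_start { struct demo_hdr hdr; char file_name[64]; };
struct demo_copy  { struct demo_hdr hdr; uint32_t size; uint8_t data[256]; };

union demo_buffer {
	struct demo_hdr   hdr;
	struct demo_start start;
	struct demo_copy  copy;
	uint32_t          kernel_modver;
};

/* Read the operation field through the union member, without a cast. */
static uint32_t demo_operation(const union demo_buffer *buf)
{
	assert(buf != NULL);
	return buf->hdr.operation;
}
```

All demo structs start with the same header, so the operation field can be inspected through the hdr member regardless of which message was stored.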

[PATCH] Tools: hv: fix snprintf warning in kvp_daemon

2017-08-09 Thread Olaf Hering

Increase buffer size so that "_{-INT_MAX}" will fit.
Spotted by the gcc7 snprintf checker.

Signed-off-by: Olaf Hering <o...@aepfle.de>
---
 tools/hv/hv_kvp_daemon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
index 88b20e007c05..eaa3bec273c8 100644
--- a/tools/hv/hv_kvp_daemon.c
+++ b/tools/hv/hv_kvp_daemon.c
@@ -1136,7 +1136,7 @@ static int process_ip_string(FILE *f, char *ip_string, 
int type)
int i = 0;
int j = 0;
char str[256];
-   char sub_str[10];
+   char sub_str[13];
int offset = 0;
 
memset(addr, 0, sizeof(addr));
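
The new size of 13 can be checked with a small sketch. It assumes the string is produced with a format like "_%d" (an assumption based on the "_{-INT_MAX}" wording above, the actual call is not shown in this hunk) and that int is 32 bits wide. suffix_buffer_size() is a hypothetical helper.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/*
 * Number of bytes a buffer needs to hold "_<value>" including the
 * terminating NUL. snprintf() with a NULL buffer and size 0 writes
 * nothing and only reports the length that would have been written.
 */
static int suffix_buffer_size(int value)
{
	int len = snprintf(NULL, 0, "_%d", value);

	assert(len > 0);
	return len + 1;
}
```

For -INT_MAX with a 32-bit int the string is one underscore, one minus sign and ten digits, so 12 characters plus the NUL byte.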

Re: 4.9.3, NULL pointer in __wake_up_common / drm / i915

2017-01-20 Thread Olaf Hering

On Thu, Dec 01, Olaf Hering wrote:

> On Wed, Nov 16, Olaf Hering wrote:
> > During boot into a current openSUSE Tumbleweed 20161108 this laptop
> > starts to hang sometimes with 4.8.x.  Today I was able to catch this
> > crash in __wake_up_common caused by i915 or drm or whatever:
> > ...
> > [   69.851635] BUG: unable to handle kernel NULL pointer dereference at 
> >   (null)
> > [   69.851754] IP: [] __wake_up_common+0x25/0x80
> This still happens with 4.8.10.

This still happens with 4.9.3.

Olaf


[0.00] microcode: microcode updated early to revision 0xa4, date = 
2010-10-02
[0.00] Linux version 4.9.3-1-default (geeko@buildhost) (gcc version 
6.2.1 20161209 [gcc-6-branch revision 243481] (SUSE Linux) ) #1 SMP PREEMPT Thu 
Jan 12 11:32:53 UTC 2017 (2c7dfab)
[0.00] Command line: 
BOOT_IMAGE=(lvm/sd240_crypt_lvm-sd240_btrfs)/tw_gnome/boot/vmlinuz quiet 
panic=9 net.ifnames=0 rootflags=subvol=/tw_gnome,noatime plymouth.enable=0 
resume=/dev/disk/by-label/SD240_CRYPT_SWP
[0.00] x86/fpu: Legacy x87 FPU detected.
[0.00] x86/fpu: Using 'eager' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009f7ff] usable
[0.00] BIOS-e820: [mem 0x0009f800-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000dc000-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0xbf67] usable
[0.00] BIOS-e820: [mem 0xbf68-0xbf690fff] ACPI NVS
[0.00] BIOS-e820: [mem 0xbf691000-0xbfff] reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec0] reserved
[0.00] BIOS-e820: [mem 0xfed0-0xfed003ff] reserved
[0.00] BIOS-e820: [mem 0xfed14000-0xfed19fff] reserved
[0.00] BIOS-e820: [mem 0xfed1c000-0xfed8] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
[0.00] BIOS-e820: [mem 0xff00-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x00013fff] usable
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.4 present.
[0.00] DMI: FUJITSU SIEMENS ESPRIMO Mobile M9400/M11D, BIOS 1.06 - R059 
- 1566 04/22/2008
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x14 max_arch_pfn = 0x4
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-B uncachable
[0.00]   C-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 0C000 mask FC000 uncachable
[0.00]   1 base 0 mask F write-back
[0.00]   2 base 1 mask FC000 write-back
[0.00]   3 base 0BF70 mask 0 uncachable
[0.00]   4 base 0BF80 mask FFF80 uncachable
[0.00]   5 disabled
[0.00]   6 disabled
[0.00]   7 disabled
[0.00] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- WT  
[0.00] e820: update [mem 0xbf70-0x] usable ==> reserved
[0.00] e820: last_pfn = 0xbf680 max_arch_pfn = 0x4
[0.00] found SMP MP-table at [mem 0x000f77a0-0x000f77af] mapped at 
[8ef8400f77a0]
[0.00] Scanning 1 areas for low memory corruption
[0.00] Base memory trampoline at [8ef840099000] 99000 size 24576
[0.00] BRK [0xbb07, 0xbb070fff] PGTABLE
[0.00] BRK [0xbb071000, 0xbb071fff] PGTABLE
[0.00] BRK [0xbb072000, 0xbb072fff] PGTABLE
[0.00] BRK [0xbb073000, 0xbb073fff] PGTABLE
[0.00] BRK [0xbb074000, 0xbb074fff] PGTABLE
[0.00] RAMDISK: [mem 0x34cdb000-0x36664fff]
[0.00] ACPI: Early table checksum verification disabled
[0.00] ACPI: RSDP 0x000F7710 24 (v02 PTLTD )
[0.00] ACPI: XSDT 0xBF688021 8C (v01 FSCPC   
0604  LTP )
[0.00] ACPI: FACP 0xBF68FD0C F4 (v03 INTEL  CRESTLNE 
0604 ALAN 0001)
[0.00] ACPI: DSDT 0xBF689526 006772 (v02 IEC___ M11_ 
0604 INTL 20050624)
[0.00] ACPI: FACS 0xBF690FC0 40
[0.00] ACPI: FACS 0xBF690FC0 40
[0.00] ACPI: APIC 0xBF68FE00 68 (v01 INTEL  CRESTLNE 
0604 LOHR 005A)
[0.00] ACPI: HPET 0xBF68FE68 38 (v01 INTEL  CRESTLNE 
0604 LOHR 005A)
[0.00] ACPI: MCFG 0xBF68FEA0 3C (v01 INTEL  CRESTLNE 
0604 LOHR 005A)
[0.00] ACPI: SLIT 0xBF68FE

Re: [PATCH RFC] hv_utils: implement Hyper-V PTP source

2017-01-13 Thread Olaf Hering

On Fri, Jan 13, Vitaly Kuznetsov wrote:

> + hv_ptp_clock = ptp_clock_register(_hyperv_info, NULL);
> + if (IS_ERR(hv_ptp_clock)) {

Should that be IS_ERR_OR_NULL to catch "!IS_REACHABLE(CONFIG_PTP_1588_CLOCK)"?

Olaf


signature.asc
Description: PGP signature
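
To illustrate the difference behind the question above, here is a simplified userspace imitation of the kernel's error-pointer helpers. The demo_* names are hypothetical and this is not the code from include/linux/err.h. It only shows that a NULL return is not classified as an error pointer, so a function that may return NULL needs the combined check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Error codes are encoded in the last DEMO_MAX_ERRNO pointer values. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *demo_err_ptr(long error)
{
	assert(error < 0 && error >= -DEMO_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True only for pointers in the encoded error range; NULL is not one. */
static bool demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* True for encoded errors and for NULL. */
static bool demo_is_err_or_null(const void *ptr)
{
	return ptr == NULL || demo_is_err(ptr);
}
```

If the registration call can hand back NULL, as the question implies for the "not reachable" configuration, only the second check would catch it.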

Re: [PATCH 03/15] hyperv: use standard bitops

2016-12-21 Thread Olaf Hering

On Tue, Dec 20, Roman Kagan wrote:

Reverting commit 22356585712d ("staging: hv: use sync_bitops when
interacting with the hypervisor") is safe because ...

> - sync_set_bit(channel->monitor_bit,
> + set_bit(channel->monitor_bit,


Olaf


signature.asc
Description: PGP signature

Re: move hyperv CHANNELMSG_UNLOAD from crashed kernel to kdump kernel

2016-12-15 Thread Olaf Hering

On Thu, Dec 15, Vitaly Kuznetsov wrote:
> -> K. Y., but these words were written before I implemented
> vmbus_wait_for_unload(), to me they just explain how we read messages.

Another question for KY:
In my testing, while busy-looping in vmbus_wait_for_unload, I see a few
"message_type==1, hdr->msgtype==2" in the hv_context.synic_message_page
of the cpu which will deliver CHANNELMSG_UNLOAD_RESPONSE.
These values are not listed in their enum lists. Any idea what these
values mean?

Olaf

signature.asc
Description: PGP signature

Re: move hyperv CHANNELMSG_UNLOAD from crashed kernel to kdump kernel

2016-12-15 Thread Olaf Hering

On Thu, Dec 15, Vitaly Kuznetsov wrote:

> vmbus_wait_for_unload() may be receiving a message (not necessarily the
> CHANNELMSG_UNLOAD_RESPONSE, we may see some other message) on the same
> CPU it runs and in this case wrmsrl() makes sense. In other cases it
> does nothing (neither good nor bad).

If that other cpu has interrupts disabled it may not process a pending
msg (the response may be stuck in the host queue?), and the loop cannot
kick the other cpu's queue if a wrmsrl is only valid for the current cpu.
If that's true, the response will not arrive in the loop.

Olaf

signature.asc
Description: PGP signature

Re: move hyperv CHANNELMSG_UNLOAD from crashed kernel to kdump kernel

2016-12-15 Thread Olaf Hering

On Thu, Dec 15, Vitaly Kuznetsov wrote:

> We actually need to read the reply and empty the message slot to make
> unload happen. And reading on a different CPU may not work, see:
> 
> http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2016-December/097330.html

Do the following sentences mean the vmbus_signal_eom in
vmbus_wait_for_unload is a noop because the wrmsrl() is expected to
happen on the other cpu instead of the current cpu?

...
- When we read the message we need to clear the slot and signal the fact
  to the hypervisor. In case there are more messages to this CPU pending
  the hypervisor will deliver the next message. The signaling is done by
  writing to an MSR so this can only be done on the appropriate CPU.
...

Olaf


signature.asc
Description: PGP signature

Re: move hyperv CHANNELMSG_UNLOAD from crashed kernel to kdump kernel

2016-12-15 Thread Olaf Hering

On Thu, Dec 15, Vitaly Kuznetsov wrote:

> I see a number of minor but at least one major issue against such move:
> At least for some Hyper-V versions (2012R2 for example)
> CHANNELMSG_UNLOAD_RESPONSE is delivered to the CPU which initially sent 
> CHANNELMSG_REQUESTOFFERS and on kdump we may not have this CPU up as
> we usually do kdump with nr_cpus=1 (and on the CPU which crashed). 

Since the kdump or kexec kernel will send the unload during boot I would
expect the response to arrive where it was sent, independent from the
number of cpus.

> Minor issue is the necessity preserve the information about
> message/events pages across kexec.

I guess this info is stored somewhere, and the relevant gfns can be
preserved across kernels, if we try really hard.

But after looking further at the involved code paths it seems that the
implemented polling might be good enough to snatch the response. Was the
mdelay(10) just an arbitrary decision? I interpret the comments in
vmbus_signal_eom such that the host may overwrite the response. Perhaps
such a thing may happen during the mdelay?

Olaf

signature.asc
Description: PGP signature

Re: move hyperv CHANNELMSG_UNLOAD from crashed kernel to kdump kernel

2016-12-15 Thread Olaf Hering

On Thu, Dec 15, Olaf Hering wrote:

> On Thu, Dec 15, Vitaly Kuznetsov wrote:
> 
> > I see a number of minor but at least one major issue against such move:
> > At least for some Hyper-V versions (2012R2 for example)
> > CHANNELMSG_UNLOAD_RESPONSE is delivered to the CPU which initially sent 
> > CHANNELMSG_REQUESTOFFERS and on kdump we may not have this CPU up as
> > we usually do kdump with nr_cpus=1 (and on the CPU which crashed). 
> 
> Since the kdump or kexec kernel will send the unload during boot I would
> expect the response to arrive where it was sent, independent from the
> number of cpus.

Wait, I just noticed that "REQUESTOFFERS" now. That might be a reason
why my suggestion will not work.

Olaf



Re: 4.8.12, crash in ext4, __d_lookup_rcu

2016-12-12 Thread Olaf Hering

On Mon, Dec 12, Olaf Hering wrote:

> On Mon, Dec 12, Theodore Ts'o wrote:
> 
> > What was going on at the time of the crash, and can you reproduce this?
> 
> It's the 'rm -rf $dir/$oldbackup.N' at the end of each rsnapshot run.

Also, the system was otherwise idle during the whole day.

Olaf



Re: 4.8.12, crash in ext4, __d_lookup_rcu

2016-12-12 Thread Olaf Hering

On Mon, Dec 12, Theodore Ts'o wrote:

> What was going on at the time of the crash, and can you reproduce this?

It's the 'rm -rf $dir/$oldbackup.N' at the end of each rsnapshot run.
I will try to reproduce it, to see if it happens again during the hourly
rsnapshot runs. So far I have not seen it with 4.8.x, and I can't remember if
it already happened with earlier kernels.

> There was a huge number of messages about ISO 9660 and accesses beyond
> the end of the loop device, and I wonder if any of this might have
> been connected to the crash.

This was an ISO image with its download still in progress. Since the download
was slow I mounted what was already available, which was enough to make the
installer happy. But a few files are located at the end of the image,
and the loopN block device does not autogrow as more data becomes
available. The ISOs are on a different drive than the backup disk. Not sure
if such an incomplete iso9660 filesystem can confuse the vfs layer?
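
For reference, a loop device can be asked to re-read the size of its backing
file with the LOOP_SET_CAPACITY ioctl from <linux/loop.h>. The sketch below is
only an illustration of that call. The helper name loop_refresh_capacity() is
made up for this example, and nothing in this thread establishes that
refreshing the capacity would have avoided the ISO 9660 messages or the crash.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/loop.h>

/*
 * Ask the loop driver to pick up the current size of the backing file.
 * loopdev is a path such as "/dev/loop0" (example value only).
 * Returns 0 on success, -1 if the device cannot be opened or the
 * ioctl fails. loop_refresh_capacity() is a hypothetical helper name.
 */
static int loop_refresh_capacity(const char *loopdev)
{
    int fd, ret;

    assert(loopdev != NULL);

    fd = open(loopdev, O_RDWR);
    if (fd < 0)
        return -1;

    ret = ioctl(fd, LOOP_SET_CAPACITY, 0);
    close(fd);

    return ret < 0 ? -1 : 0;
}
```

The call would have to be repeated whenever the backing file has grown, since
the loop device does not follow the file size on its own.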

Olaf


Re: 4.8.12, crash in ext4, __d_lookup_rcu

2016-12-12 Thread Olaf Hering

On Mon, Dec 12, Olaf Hering wrote:

> [197064.401309] RIP: 0010:[]  [] 
> __d_lookup_rcu+0x67/0x180

... and umount is not happy after that, which I think is expected:


root@probook:~ # umount -v /BACKUP_OLH_1T ; dmsetup remove 
luks-feaf408d-3257-4850-b597-bbca1dc651df ; sync ; blockdev --rereadpt 
/dev/disk/by-id/ata-WDC_WD10EZRX-00D8PB0_WD-WCC4M1CCD07U
Dec 12 09:58:30 probook udisksd[2141]: Cleaning up mount point /BACKUP_OLH_1T 
(device 254:6 is not mounted)
Dec 12 09:58:33 probook kernel: BUG: Dentry a2c50b703f00{i=295818f,n=cur}  
still in use (15) [unmount of ext4 dm-6]
Dec 12 09:58:33 probook kernel: [ cut here ]
Dec 12 09:58:33 probook kernel: WARNING: CPU: 2 PID: 24205 at 
/work/github/olafhering/linux/fs/dcache.c:1436 umount_check+0x72/0x80
Dec 12 09:58:33 probook kernel: Modules linked in: cdc_acm tun fuse 
ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables 
x_tables af_packet bridge stp llc msr nls_utf8 isofs loop dm_crypt 
algif_skcipher af_alg btrfs zlib_deflate xor raid6_pq uvcvideo 
videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev 
snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_codec_generic snd_hda_intel 
snd_hda_codec amdkfd amd_iommu_v2 snd_hda_core radeon ttm drm_kms_helper drm 
snd_hwdep fb_sys_fops sky2 snd_pcm syscopyarea tpm_infineon snd_timer 
sysfillrect sysimgblt i2c_algo_bit snd sp5100_tco i2c_piix4 acpi_cpufreq 
kvm_amd tpm_tis tpm_tis_core k10temp shpchp soundcore kvm hp_accel lis3lv02d 
sparse_keymap rfkill fjes tpm video joydev input_polldev irqbypass pcspkr 
thermal ac battery button wmi fan hid_generic
Dec 12 09:58:33 probook kernel:  usbhid ohci_pci serio_raw ehci_pci ohci_hcd 
ehci_hcd usbcore usb_common sdhci_pci sdhci mmc_core sg dm_multipath dm_mod 
scsi_dh_alua
Dec 12 09:58:33 probook kernel: CPU: 2 PID: 24205 Comm: umount Tainted: G  
D 4.8.12-probook #15
Dec 12 09:58:33 probook kernel: Hardware name: Hewlett-Packard HP ProBook 
6555b/1455, BIOS 68DTM Ver. F.21 06/14/2012
Dec 12 09:58:33 probook kernel:   a2c6f8663cf8 
be3c0530 
Dec 12 09:58:33 probook kernel:   a2c6f8663d38 
be08326b 059cf8663d10
Dec 12 09:58:33 probook kernel:  a2c50b6eb8e0 a2c50b703f00 
a2c50b703f58 a2c50b6eb8e0
Dec 12 09:58:33 probook kernel: Call Trace:
Dec 12 09:58:33 probook kernel:  [] dump_stack+0x63/0x83
Dec 12 09:58:33 probook kernel:  [] __warn+0xcb/0xf0
Dec 12 09:58:33 probook kernel:  [] 
warn_slowpath_null+0x1d/0x20
Dec 12 09:58:33 probook kernel:  [] umount_check+0x72/0x80
Dec 12 09:58:33 probook kernel:  [] d_walk+0xd0/0x300
Dec 12 09:58:33 probook kernel:  [] ? dentry_free+0x80/0x80
Dec 12 09:58:33 probook kernel:  [] do_one_tree+0x26/0x40
Dec 12 09:58:33 probook kernel:  [] 
shrink_dcache_for_umount+0x2d/0x90
Dec 12 09:58:33 probook kernel:  [] 
generic_shutdown_super+0x1f/0x100
Dec 12 09:58:33 probook kernel:  [] kill_block_super+0x27/0x70
Dec 12 09:58:33 probook kernel:  [] 
deactivate_locked_super+0x43/0x70
Dec 12 09:58:33 probook kernel:  [] deactivate_super+0x5a/0x60
Dec 12 09:58:33 probook kernel:  [] cleanup_mnt+0x3f/0x90
Dec 12 09:58:33 probook kernel:  [] __cleanup_mnt+0x12/0x20
Dec 12 09:58:33 probook kernel:  [] task_work_run+0x80/0xa0
Dec 12 09:58:33 probook kernel:  [] 
exit_to_usermode_loop+0xc2/0xd0
Dec 12 09:58:33 probook kernel:  [] 
syscall_return_slowpath+0x4e/0x60
Dec 12 09:58:33 probook kernel:  [] 
entry_SYSCALL_64_fastpath+0xa6/0xa8
Dec 12 09:58:33 probook kernel: ---[ end trace c55214155fae1a63 ]---
Segmentation fault
Dec 12 09:58:35 probook kernel: EXT4-fs (dm-6): sb orphan head is 25657346
Dec 12 09:58:35 probook kernel: sb_info orphan list:
Dec 12 09:58:35 probook kernel:   inode dm-6:25657346 at a2c64f4b0a48: mode 
40755, nlink 0, next 25657347
Dec 12 09:58:35 probook kernel:   inode dm-6:25657347 at a2c655bac2a8: mode 
40755, nlink 0, next 25657348
Dec 12 09:58:35 probook kernel:   inode dm-6:25657348 at a2c63f5d7598: mode 
40755, nlink 0, next 25658000
Dec 12 09:58:35 probook kernel:   inode dm-6:25658000 at a2c6d1341368: mode 
40755, nlink 0, next 25658006
Command failed
Dec 12 09:58:35 probook kernel:   inode dm-6:25658006 at a2c6d13630e8: mode 
40755, nlink 0, next 25658115
Dec 12 09:58:35 probook kernel:   inode dm-6:25658115 at a2c552321718: mode 
40755, nlink 0, next 43286668
Dec 12 09:58:35 probook kernel:   inode dm-6:43286668 at a2c6d13d19c8: mode 
40755, nlink 0, next 43319488
Dec 12 09:58:35 probook kernel:   inode dm-6:43319488 at a2c60dd506d8: mode 
40700, nlink 0, next 43352067
Dec 12 09:58:35 probook kernel:   inode dm-6:43352067 at a2c608808a48: mode 
40700, nlink 0, next 43352114
Dec 12 09:58:35 probook kernel:   inode dm-6:43352114 at a2c608def6d8: mode 
40755, nlink 0, next 43352455
Dec 12 09:58:35 probook kernel:   inode dm-6:43352455 at a2c50b6ea5d8: mode 
40755, nlink 0, next 43352463
Dec 12 09

4.8.12, crash in ext4, __d_lookup_rcu

2016-12-12 Thread Olaf Hering

This crash happened during rsnapshot cleanup, full dmesg below:


Dec 11 17:15:02 probook rsnapshot-backup-scripts.hourly.sh[3244]: claimed 
rsnapshot_pid_lock
Dec 11 17:15:02 probook rsnapshot-backup-scripts.hourly.sh[3247]: running hourly
...
[197064.399173] general protection fault:  [#1] PREEMPT SMP
[197064.399277] Modules linked in: cdc_acm tun fuse ebtable_filter ebtables 
ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bridge 
stp llc msr nls_utf8 isofs loop dm_crypt algif_skcipher af_alg btrfs 
zlib_deflate xor raid6_pq uvcvideo videobuf2_vmalloc videobuf2_memops 
videobuf2_v4l2 videobuf2_core videodev snd_hda_codec_hdmi snd_hda_codec_idt 
snd_hda_codec_generic snd_hda_intel snd_hda_codec amdkfd amd_iommu_v2 
snd_hda_core radeon ttm drm_kms_helper drm snd_hwdep fb_sys_fops sky2 snd_pcm 
syscopyarea tpm_infineon snd_timer sysfillrect sysimgblt i2c_algo_bit snd 
sp5100_tco i2c_piix4 acpi_cpufreq kvm_amd tpm_tis tpm_tis_core k10temp shpchp 
soundcore kvm hp_accel lis3lv02d sparse_keymap rfkill fjes tpm video joydev 
input_polldev irqbypass pcspkr thermal ac battery button wmi fan hid_generic
[197064.400728]  usbhid ohci_pci serio_raw ehci_pci ohci_hcd ehci_hcd usbcore 
usb_common sdhci_pci sdhci mmc_core sg dm_multipath dm_mod scsi_dh_alua
[197064.400982] CPU: 1 PID: 3735 Comm: rm Not tainted 4.8.12-probook #15
[197064.401080] Hardware name: Hewlett-Packard HP ProBook 6555b/1455, BIOS 
68DTM Ver. F.21 06/14/2012
[197064.401217] task: a2c60d5fe040 task.stack: a2c70761c000
[197064.401309] RIP: 0010:[]  [] 
__d_lookup_rcu+0x67/0x180
[197064.401447] RSP: 0018:a2c70761fc10  EFLAGS: 00010206
[197064.401529] RAX: 0041 RBX: 08ffa2c606126848 RCX: 
b30f4003e000
[197064.401639] RDX: a2c70761fc8c RSI: a2c70761fd50 RDI: 
a2c6f2e1a0c0
[197064.401748] RBP: a2c70761fc68 R08: a2c6f2e1a0c0 R09: 
a2c70761fc8c
[197064.401857] R10: 73ac0c6f R11: 0041 R12: 
0006
[197064.401966] R13: a2c6f2e1a0c0 R14: 004173ac0c6f R15: 
a2c6d729801c
[197064.402077] FS:  7fa0ab9fd700() GS:a2c73fc8() 
knlGS:
[197064.402199] CS:  0010 DS:  ES:  CR0: 80050033
[197064.402287] CR2: 7f0118c31420 CR3: 000193b0f000 CR4: 
06e0
[197064.402397] Stack:
[197064.402431]  a2c720e3a000 a2c6716443b8  
a2c70761fd50
[197064.402563]  0041 be23319f a2c70761fd40 

[197064.402693]  a2c70761fcd8 a2c70761fcd0 a2c6c67cd9a0 
a2c70761fcb8
[197064.402823] Call Trace:
[197064.402870]  [] ? generic_permission+0x10f/0x190
[197064.402967]  [] lookup_fast+0x48/0x2e0
[197064.403050]  [] walk_component+0x38/0x2f0
[197064.403139]  [] ? path_init+0x251/0x350
[197064.403225]  [] path_lookupat+0x67/0x120
[197064.403312]  [] filename_lookup+0x9e/0x150
[197064.403410]  [] ? call_rcu+0x17/0x20
[197064.403498]  [] ? ext4_destroy_inode+0x3e/0xb0
[197064.403592]  [] ? getname_flags+0x4f/0x1f0
[197064.403681]  [] user_path_at_empty+0x36/0x40
[197064.403772]  [] vfs_fstatat+0x53/0xa0
[197064.403855]  [] SYSC_newfstatat+0x15/0x30
[197064.403942]  [] ? do_rmdir+0xbd/0x220
[197064.404025]  [] SyS_newfstatat+0xe/0x10
[197064.404112]  [] entry_SYSCALL_64_fastpath+0x1e/0xa8
[197064.404211] Code: 83 e3 fe 0f 84 95 00 00 00 4c 89 f0 45 89 f2 49 89 d1 48 
c1 e8 20 48 89 75 c0 49 89 fd 48 89 45 c8 eb 08 48 8b 1b 48 85 db 74 73 <44> 8b 
63 fc 4c 3b 6b 10 75 ee 48 83 7b 08 00 74 e7 41 83 e4 fe 
[197064.404801] RIP  [] __d_lookup_rcu+0x67/0x180
[197064.404907]  RSP 
[197064.432716] ---[ end trace c55214155fae1a62 ]---
...
Dec 11 17:18:07 probook rsnapshot[4027]: /usr/bin/rsnapshot -x -c 
/usr/share/rsnapshot-backup-scripts/rsnapshot-backup-scripts.config.txt hourly: 
ERROR: Error! rm_rf("/BACKUP_OLH_1T/_delete.3248")
Dec 11 17:18:07 probook rsnapshot-backup-scripts.hourly.sh[4031]: released 
rsnapshot_pid_lock

# dumpe2fs -h /dev/disk/by-label/BACKUP_OLH_1T 
dumpe2fs 1.43.3 (04-Sep-2016)
Filesystem volume name:   BACKUP_OLH_1T
Last mounted on:  /BACKUP_OLH_1T
Filesystem UUID:  5d48de86-793e-4820-ba06-b9880612a1f0
Filesystem magic number:  0xEF53
Filesystem revision #:1 (dynamic)
Filesystem features:  has_journal ext_attr resize_inode dir_index filetype 
needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg 
dir_nlink extra_isize
Filesystem flags: signed_directory_hash 
Default mount options:user_xattr acl
Filesystem state: clean
Errors behavior:  Continue
Filesystem OS type:   Linux
Inode count:  61048832
Block count:  488379756
Reserved block count: 488379
Free blocks:  337158332
Free inodes:  59386076
First block:  0
Block size:   2048
Fragment size:2048
Reserved GDT blocks:  512
Blocks per group: 16384
Fragments per group:  16384
Inodes per group:

Re: move hyperv CHANNELMSG_UNLOAD from crashed kernel to kdump kernel

2016-12-07 Thread Olaf Hering

On Wed, Dec 07, KY Srinivasan wrote:

> May be a better solution might be to have a new mechanism to indicate
> to the host that all state of the previous incarnation of the kernel
> needs to be cleaned up.  This will be close to what we have on
> hardware.

That would be cool, but until this appears and until it's deployed in
Azure and elsewhere I will likely lose the remaining hairs.

Olaf


Re: move hyperv CHANNELMSG_UNLOAD from crashed kernel to kdump kernel

2016-12-07 Thread Olaf Hering

On Wed, Dec 07, KY Srinivasan wrote:

> Is there a mechanism for stashing away state that can be retrieved in
> the context of the execed kernel.

I have to find out. To simplify things the new approach may only be used
in the kdump case, which already passes various info in cmdline. Most
likely there is a way to preserve a few gpfns with the relevant data.
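
To make the cmdline idea concrete, here is a small sketch in C of how a frame
number could be picked out of a kernel-style command line. The parameter name
"hv_msg_gpfn" used in the usage note and the helper cmdline_get_u64() are
invented for this illustration; no such parameter is known to exist, and the
thread does not settle on any particular mechanism.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look for "key=<number>" as a whole word in a space-separated command
 * line and parse the number (decimal, octal or 0x-prefixed hex).
 * Returns 0 and stores the value in *out on success, -1 if the key is
 * absent or its value is malformed.
 */
static int cmdline_get_u64(const char *cmdline, const char *key,
                           unsigned long long *out)
{
    size_t klen;
    const char *p;

    assert(cmdline != NULL && key != NULL && out != NULL);

    klen = strlen(key);
    if (klen == 0)
        return -1;

    p = cmdline;
    while ((p = strstr(p, key)) != NULL) {
        /* the key must start a word and be followed directly by '=' */
        if ((p == cmdline || p[-1] == ' ') && p[klen] == '=') {
            const char *val = p + klen + 1;
            char *end;
            unsigned long long v = strtoull(val, &end, 0);

            /* require at least one digit and a clean end of word */
            if (end != val && (*end == ' ' || *end == '\0')) {
                *out = v;
                return 0;
            }
            return -1;
        }
        p += klen;
    }
    return -1;
}
```

For a command line such as "quiet hv_msg_gpfn=0x1234 nr_cpus=1", looking up
the key "hv_msg_gpfn" stores 0x1234 in *out and returns 0.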

Olaf


RE: move hyperv CHANNELMSG_UNLOAD from crashed kernel to kdump kernel

2016-12-07 Thread Olaf Hering

On 7 December 2016 16:04:29 CET, KY Srinivasan wrote:

>Yes; I had played with this approach a while ago. The issue is that the
>host knows about a 
>bunch of in memory state that will be different in the kexec kernel.
>For instance if we did all
>the cleanup as part of the boot sequence, we will need access to all
>the interrupt/messaging 
>infrastructure that was set up in the previous kernel.


Where is that stored? Perhaps it should be put into one place, so that the
new kernel can find and use it.

Olaf

move hyperv CHANNELMSG_UNLOAD from crashed kernel to kdump kernel

2016-12-07 Thread Olaf Hering

KY,

if a hyperv VM crashes, a lot of work must be done to prepare the
environment for the kdump kernel. This approach is different compared to
all the other VM types, or baremetal. Since the just crashed kernel is
by definition unreliable, all that work should be done within the kdump
kernel because I think a reliable environment exists only there.

Was it ever considered to do the CHANNELMSG_UNLOAD /
CHANNELMSG_UNLOAD_RESPONSE work during boot, instead of doing it before
starting the kexec/kdump kernel?

What would it take to prepare the runtime environment during boot?
Does the newly booted kernel need any info from the previous kernel,
something that can't be determined during boot? If yes, how can such info
be passed from the old kernel to the new kernel?

Olaf



Re: 4.8.6, NULL pointer in __wake_up_common / drm / i915

2016-12-01 Thread Olaf Hering

On Wed, Nov 16, Olaf Hering wrote:

> During boot into a current openSUSE Tumbleweed 20161108 this laptop
> starts to hang sometimes with 4.8.x.  Today I was able to catch this
> crash in __wake_up_common caused by i915 or drm or whatever:
> 
> ...
> [   69.851635] BUG: unable to handle kernel NULL pointer dereference at   
> (null)
> [   69.851754] IP: [] __wake_up_common+0x25/0x80

This still happens with 4.8.10.

Any idea how to fix it?

Olaf

[0.00] microcode: microcode updated early to revision 0xa4, date = 
2010-10-02
[0.00] Linux version 4.8.10-1-default (geeko@buildhost) (gcc version 
6.2.1 20160830 [gcc-6-branch revision 239856] (SUSE Linux) ) #1 SMP PREEMPT Mon 
Nov 21 13:50:28 UTC 2016 (d1ec066)
[0.00] Command line: 
BOOT_IMAGE=(lvm/sd240_crypt_lvm-sd240_btrfs)/tw_xfce/boot/vmlinuz quiet panic=9 
net.ifnames=0 rootflags=subvol=/tw_xfce,noatime plymouth.enable=0 
resume=/dev/disk/by-label/SD240_CRYPT_SWP
[0.00] x86/fpu: Legacy x87 FPU detected.
[0.00] x86/fpu: Using 'eager' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009f7ff] usable
[0.00] BIOS-e820: [mem 0x0009f800-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000dc000-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0xbf67] usable
[0.00] BIOS-e820: [mem 0xbf68-0xbf690fff] ACPI NVS
[0.00] BIOS-e820: [mem 0xbf691000-0xbfff] reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec0] reserved
[0.00] BIOS-e820: [mem 0xfed0-0xfed003ff] reserved
[0.00] BIOS-e820: [mem 0xfed14000-0xfed19fff] reserved
[0.00] BIOS-e820: [mem 0xfed1c000-0xfed8] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
[0.00] BIOS-e820: [mem 0xff00-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x00013fff] usable
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.4 present.
[0.00] DMI: FUJITSU SIEMENS ESPRIMO Mobile M9400/M11D, BIOS 1.06 - R059 
- 1566 04/22/2008
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x14 max_arch_pfn = 0x4
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-B uncachable
[0.00]   C-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 0C000 mask FC000 uncachable
[0.00]   1 base 0 mask F write-back
[0.00]   2 base 1 mask FC000 write-back
[0.00]   3 base 0BF70 mask 0 uncachable
[0.00]   4 base 0BF80 mask FFF80 uncachable
[0.00]   5 disabled
[0.00]   6 disabled
[0.00]   7 disabled
[0.00] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- WT  
[0.00] e820: update [mem 0xbf70-0x] usable ==> reserved
[0.00] e820: last_pfn = 0xbf680 max_arch_pfn = 0x4
[0.00] found SMP MP-table at [mem 0x000f77a0-0x000f77af] mapped at 
[8801c00f77a0]
[0.00] Scanning 1 areas for low memory corruption
[0.00] Base memory trampoline at [8801c0099000] 99000 size 24576
[0.00] BRK [0x108262000, 0x108262fff] PGTABLE
[0.00] BRK [0x108263000, 0x108263fff] PGTABLE
[0.00] BRK [0x108264000, 0x108264fff] PGTABLE
[0.00] BRK [0x108265000, 0x108265fff] PGTABLE
[0.00] BRK [0x108266000, 0x108266fff] PGTABLE
[0.00] BRK [0x108267000, 0x108267fff] PGTABLE
[0.00] BRK [0x108268000, 0x108268fff] PGTABLE
[0.00] BRK [0x108269000, 0x108269fff] PGTABLE
[0.00] RAMDISK: [mem 0x35b33000-0x36d90fff]
[0.00] ACPI: Early table checksum verification disabled
[0.00] ACPI: RSDP 0x000F7710 24 (v02 PTLTD )
[0.00] ACPI: XSDT 0xBF688021 8C (v01 FSCPC   
0604  LTP )
[0.00] ACPI: FACP 0xBF68FD0C F4 (v03 INTEL  CRESTLNE 
0604 ALAN 0001)
[0.00] ACPI: DSDT 0xBF689526 006772 (v02 IEC___ M11_ 
0604 INTL 20050624)
[0.00] ACPI: FACS 0xBF690FC0 40
[0.00] ACPI: FACS 0xBF690FC0 40
[0.00] ACPI: APIC 0xBF68FE00 68 (v01 INTEL  CRESTLNE 
0604 LOHR 005A)
[0.00] ACPI: HPET 0xBF68FE68 38 (v01 INTEL  CRESTLNE 
0604 LOHR 005A)
[0.00] ACPI: MCFG 0xBF68FEA0 3C

Re: [PATCH v4] xen/gntdev: Use VM_MIXEDMAP instead of VM_IO to avoid NUMA balancing

2016-11-23 Thread Olaf Hering

On Mon, Nov 21, Boris Ostrovsky wrote:

> Commit 9c17d96500f7 ("xen/gntdev: Grant maps should not be subject to
> NUMA balancing") set VM_IO flag to prevent grant maps from being
> subjected to NUMA balancing.


Tested-by: Olaf Hering <o...@aepfle.de>

This should go to stable as well.

Olaf


signature.asc
Description: PGP signature

Re: [PATCH v3 (re-send)] xen/gntdev: Use mempolicy instead of VM_IO flag to avoid NUMA balancing

2016-11-18 Thread Olaf Hering

On Thu, Nov 17, Boris Ostrovsky wrote:

> Commit 9c17d96500f7 ("xen/gntdev: Grant maps should not be subject to
> NUMA balancing") set VM_IO flag to prevent grant maps from being
> subjected to NUMA balancing.

Thanks, this works for me in 4.4:

Tested-by: Olaf Hering <o...@aepfle.de>

Olaf


signature.asc
Description: PGP signature

Re: [PATCH v2] xen/gntdev: Use mempolicy instead of VM_IO flag to avoid NUMA balancing

2016-11-17 Thread Olaf Hering

On Thu, Nov 17, Boris Ostrovsky wrote:

> On 11/17/2016 06:28 AM, Olaf Hering wrote:
> > ERROR: "__mpol_dup" [drivers/xen/xen-gntdev.ko] undefined!
> > ERROR: "get_task_policy" [drivers/xen/xen-gntdev.ko] undefined!
> I just built 4.4.11 with this patch applied and haven't had any problems.

Are these functions exported? How would the driver module call into the
core kernel? Maybe the added functionality should be provided by an
exported helper function.

Olaf

signature.asc
Description: PGP signature

Re: [PATCH v2] xen/gntdev: Use mempolicy instead of VM_IO flag to avoid NUMA balancing

2016-11-17 Thread Olaf Hering

On Wed, Nov 16, Boris Ostrovsky wrote:

> Unfortunately I haven't been able to trigger NUMA balancing
> so while I tested this in general I am not sure I actually
> exercised the code path.

Thanks for the patch!

Would be nice to actually test the code path which caused the initial
addition of VM_IO. I think I lack the hardware to exercise it.

In my 4.4 based sources I get the following unresolved symbols. If that
happens to work in mainline the failures should at least be considered
during backporting of this proposed patch to the stable trees.

ERROR: "__mpol_dup" [drivers/xen/xen-gntdev.ko] undefined!
ERROR: "get_task_policy" [drivers/xen/xen-gntdev.ko] undefined!

Apparently these symbols just lack an EXPORT_SYMBOL_GPL.

Olaf

signature.asc
Description: PGP signature

4.8.6, NULL pointer in __wake_up_common / drm / i915

2016-11-16 Thread Olaf Hering

During boot into a current openSUSE Tumbleweed 20161108 this laptop
starts to hang sometimes with 4.8.x.  Today I was able to catch this
crash in __wake_up_common caused by i915 or drm or whatever:

...
[   69.851635] BUG: unable to handle kernel NULL pointer dereference at 
  (null)
[   69.851754] IP: [] __wake_up_common+0x25/0x80
[   69.851797] PGD 12a0d7067 PUD 12a0d6067 PMD 0
[   69.851854] Oops:  [#1] PREEMPT SMP
[   69.851881] Modules linked in: af_packet bridge stp llc msr 
snd_hda_codec_realtek snd_hda_codec_generic arc4 iTCO_wdt iTCO_vendor_support 
ppdev coretemp ath5k ath snd_hda_intel mac80211 pcspkr joydev snd_hda_codec 
snd_hda_core snd_hwdep i2c_i801 i2c_smbus snd_pcm cfg80211 lpc_ich mfd_core 
rfkill acpi_als kfifo_buf parport_pc industrialio sky2 parport snd_timer snd 
i915 drm_kms_helper thermal fan fjes drm video shpchp fb_sys_fops soundcore 
battery syscopyarea sysfillrect sysimgblt ac i2c_algo_bit acpi_cpufreq tpm_tis 
tpm_tis_core button tpm dm_crypt algif_skcipher af_alg btrfs xor zlib_deflate 
uas usb_storage sr_mod cdrom ata_generic raid6_pq ata_piix serio_raw ehci_pci 
uhci_hcd ehci_hcd usbcore usb_common dm_mirror dm_region_hash dm_log sg 
dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua

[   69.852044] CPU: 0 PID: 1891 Comm: X Tainted: GW   
4.8.6-2-default #1
[   69.852044] Hardware name: FUJITSU SIEMENS ESPRIMO Mobile M9400/M11D, BIOS 
1.06 - R059 - 1566 04/22/2008
[   69.852044] task: a35ff907e000 task.stack: a35fea1cc000
[   69.852044] RIP: 0010:[]  [] 
__wake_up_common+0x25/0x80
[   69.852044] RSP: 0018:a35fea1cf8a0  EFLAGS: 00010086
[   69.852044] RAX: 0086 RBX: a35fea362258 RCX: 
[   69.852044] RDX:  RSI: 0003 RDI: a35fea362258
[   69.852044] RBP: a35fea362260 R08:  R09: 0045
[   69.852044] R10: 000b6e53 R11: 0005 R12: 0086
[   69.852044] R13:  R14: 0003 R15: a35ff09101c8
[   69.852044] FS:  7f00ad8bda00() GS:a35fffc0() 
knlGS:
[   69.852044] CS:  0010 DS:  ES:  CR0: 80050033
[   69.852044] CR2:  CR3: 00012a1ae000 CR4: 06f0
[   69.852044] Stack:
[   69.852044]  0045 a35fea362258 a35fea362250 
0086
[   69.852044]   a35ff09101b0 a35ff09101c8 
a80bf291
[   69.852044]  a35fe785a680 a35fe785a680 dead0200 
c040c065
[   69.852044] Call Trace:
[   69.852044]  [] complete_all+0x31/0x40
[   69.852044]  [] drm_send_event_locked+0x25/0x100 [drm]
[   69.852044]  [] drm_vblank_off+0x164/0x210 [drm]
[   69.852044]  [] i9xx_crtc_disable+0x66/0x490 [i915]
[   69.852044]  [] intel_atomic_commit_tail+0x159/0xf40 [i915]
[   69.852044]  [] intel_atomic_commit+0x40c/0x510 [i915]
[   69.852044]  [] intel_release_load_detect_pipe+0x1f/0x80 
[i915]
[   69.852044]  [] intel_tv_detect+0x33a/0x5c0 [i915]
[   69.852044]  [] 
drm_helper_probe_single_connector_modes+0x26d/0x510 [drm_kms_helper]
[   69.852044]  [] drm_mode_getconnector+0x324/0x360 [drm]
[   69.852044]  [] drm_ioctl+0x1b3/0x440 [drm]
[   69.861778]  [] do_vfs_ioctl+0x8f/0x5d0
[   69.863036]  [] SyS_ioctl+0x74/0x80
[   69.867371]  [] entry_SYSCALL_64_fastpath+0x1e/0xa8
[   69.871293] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x1e/0xa8
...

Full log below.

Olaf

root@esprimo:~ # dmesg
[0.00] microcode: microcode updated early to revision 0xa4, date = 
2010-10-02
[0.00] Linux version 4.8.6-2-default (geeko@buildhost) (gcc version 
6.2.1 20160830 [gcc-6-branch revision 239856] (SUSE Linux) ) #1 SMP PREEMPT Thu 
Nov 3 13:00:34 UTC 2016 (1d89b44)
[0.00] Command line: 
BOOT_IMAGE=(lvm/sd240_crypt_lvm-sd240_btrfs)/tw_xfce/boot/vmlinuz quiet panic=9 
net.ifnames=0 rootflags=subvol=/tw_xfce,noatime plymouth.enable=0 
resume=/dev/disk/by-label/SD240_CRYPT_SWP
[0.00] x86/fpu: Legacy x87 FPU detected.
[0.00] x86/fpu: Using 'eager' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009f7ff] usable
[0.00] BIOS-e820: [mem 0x0009f800-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000dc000-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0xbf67] usable
[0.00] BIOS-e820: [mem 0xbf68-0xbf690fff] ACPI NVS
[0.00] BIOS-e820: [mem 0xbf691000-0xbfff] reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec0] reserved
[0.00] BIOS-e820: [mem 0xfed0-0xfed003ff] reserved
[0.00] BIOS-e820: [mem 0xfed14000-0xfed19fff] reserved
[0.00] BIOS-e820: [mem

Re: [Xen-devel] [PATCH RESEND] xen/gntdev: Grant maps should not be subject to NUMA balancing

2016-11-10 Thread Olaf Hering

On Thu, Nov 10, Boris Ostrovsky wrote:

> Are you sure it's this patch that causes the failure?
> 
> I commented out '| VM_IO' and still unable to boot with this option.

Yes, this works for me, sles12sp2 dom0+domU, which is linux-4.4 based:

+++ b/drivers/xen/gntdev.c
@@ -804,7 +804,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 
 	vma->vm_ops = &gntdev_vmops;
 
-	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP | VM_IO;
+	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP /*| VM_IO*/;
 
 	if (use_ptemod)
 		vma->vm_flags |= VM_DONTCOPY;

with this domU.cfg:

name="x"
memory=1024
serial="pty"
builder="hvm"
disk=[ 'vdev=xvda, direct-io-safe, backendtype=qdisk, target=x.raw', ]
vif=[ 'bridge=br0' ]
keymap="de"
cmdline="linemode=1 console=ttyS0,115200 ignore_loglevel install=http://host/sles_dvd1/ start_shell"
kernel= "/sles_dvd1/boot/x86_64/vmlinuz-xen"
ramdisk="/sles_dvd1/boot/x86_64/initrd-xen"


Without VM_IO 'fdisk -l /dev/xvda' works, with VM_IO 'fdisk -l
/dev/xvda' gives IO errors.

Olaf


signature.asc
Description: PGP signature

Re: [Xen-devel] [PATCH RESEND] xen/gntdev: Grant maps should not be subject to NUMA balancing

2016-11-10 Thread Olaf Hering

On Thu, Nov 10, Boris Ostrovsky wrote:

> Is this something new? Because this patch has been there for a year.

It was just tested now, cycling through all the combinations for a
disk=[]. Removing "direct-io-safe" uses different code paths, and the
error is not seen.

Olaf


signature.asc
Description: PGP signature

Re: [Xen-devel] [PATCH RESEND] xen/gntdev: Grant maps should not be subject to NUMA balancing

2016-11-10 Thread Olaf Hering

On Tue, Nov 10, Boris Ostrovsky wrote:

> Doing so will cause the grant to be unmapped and then, during
> fault handling, the fault to be mistakenly treated as NUMA hint
> fault.
> 
> In addition, even if those maps could participate in NUMA
> balancing, it wouldn't provide any benefit since we are unable
> to determine physical page's node (even if/when VNUMA is
> implemented).
> 
> Marking grant maps' VMAs as VM_IO will exclude them from being
> part of NUMA balancing.

This breaks qdisk+aio because now such pages are rejected with -EFAULT:

check_vma_flags
__get_user_pages
__get_user_pages_locked
__get_user_pages_unlocked
get_user_pages_fast
iov_iter_get_pages
dio_refill_pages
do_direct_IO
do_blockdev_direct_IO
do_blockdev_direct_IO
ext4_direct_IO_read
generic_file_read_iter
aio_run_iocb
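
The rejection at the top of that call chain can be modelled in user space.
This is only a sketch, assuming that check_vma_flags() refuses any mapping
marked VM_IO (or VM_PFNMAP) before looking at anything else. The function
name model_check_vma_flags() is hypothetical, and the flag values are
illustrative stand-ins for the definitions in include/linux/mm.h:

```c
#include <errno.h>

/* Illustrative flag values, not copied from any particular kernel tree. */
#define VM_PFNMAP     0x00000400UL
#define VM_IO         0x00004000UL
#define VM_DONTEXPAND 0x00040000UL
#define VM_DONTDUMP   0x04000000UL

/*
 * Hypothetical user-space model of the gate in the get_user_pages path:
 * a mapping flagged as I/O memory is refused, so direct I/O on a buffer
 * inside such a mapping fails with -EFAULT.
 */
static int model_check_vma_flags(unsigned long vm_flags)
{
	if (vm_flags & (VM_IO | VM_PFNMAP))
		return -EFAULT;
	return 0;
}
```

With the flags gntdev_mmap() set before the patch, the model returns 0; once
VM_IO is or-ed in, it returns -EFAULT, which matches the failure seen with
qdisk and direct-io-safe.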

domU.cfg:
builder=hvm
disk=['vdev=xvda, direct-io-safe, backendtype=qdisk, target=img.raw']

> @@ -802,7 +802,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
> - vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> + vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP | VM_IO;


Olaf


signature.asc
Description: PGP signature

Re: [PATCH 1/1] Drivers: hv: hv_util: Avoid dynamic allocation in time synch

2016-09-14 Thread Olaf Hering

On Fri, Sep 09, k...@exchange.microsoft.com wrote:

> +  * This check is safe since we are executing in the
> +  * interrupt context and time synch messages arre always

Typo.

Olaf


signature.asc
Description: PGP signature

Re: [PATCH 3/3] Drivers: hv: utils: Support TimeSync version 4.0 protocol samples.

2016-09-14 Thread Olaf Hering

On Tue, Sep 13, Alex Ng (LIS) wrote:

> > On Thu, Sep 08, k...@exchange.microsoft.com wrote:
> > Perhaps a better approach would be to list the known existing hosts and use
> > the new protocol for upcoming, unknown hosts via 'default:'.
> This is a good idea. I will create another patch that addresses this.

I think this variant would cover upcoming hosts for an old kernel:

	switch (vmbus_proto_version) {
	case VERSION_WS2008:
		util_fw_version = UTIL_WS2K8_FW_VERSION;
		sd_srv_version = SD_VERSION_1;
		ts_srv_version = TS_VERSION_1;
		hb_srv_version = HB_VERSION_1;
		break;
	case VERSION_WIN7:
	case VERSION_WIN8:
	case VERSION_WIN8_1:
		util_fw_version = UTIL_FW_VERSION;
		sd_srv_version = SD_VERSION;
		ts_srv_version = TS_VERSION_3;
		hb_srv_version = HB_VERSION;
		break;
	case VERSION_WIN10:
	default:
		util_fw_version = UTIL_FW_VERSION;
		sd_srv_version = SD_VERSION;
		ts_srv_version = TS_VERSION;
		hb_srv_version = HB_VERSION;
		break;
	}
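
The effect of putting 'default:' next to the newest known host can be checked
outside the kernel. A minimal sketch follows. The enum names and numeric
values are hypothetical stand-ins, not the driver's constants, and only the
TimeSync selection is modelled:

```c
/* Hypothetical stand-ins for the VMBus protocol versions of known hosts. */
enum host_proto {
	HOST_WS2008 = 1,
	HOST_WIN7,
	HOST_WIN8,
	HOST_WIN8_1,
	HOST_WIN10
};

/* Hypothetical stand-ins for the TimeSync protocol versions. */
enum ts_proto {
	TS_PROTO_1 = 1,
	TS_PROTO_3 = 3,
	TS_PROTO_4 = 4
};

/*
 * Known old hosts are listed explicitly. Anything else, including hosts
 * released after this code was written, falls through to the newest
 * protocol via 'default:'.
 */
static enum ts_proto pick_ts_proto(int host)
{
	switch (host) {
	case HOST_WS2008:
		return TS_PROTO_1;
	case HOST_WIN7:
	case HOST_WIN8:
	case HOST_WIN8_1:
		return TS_PROTO_3;
	case HOST_WIN10:
	default:
		return TS_PROTO_4;
	}
}
```

Calling pick_ts_proto(HOST_WIN10 + 1) picks the newest protocol, so an old
kernel on a future host would not silently fall back to the oldest variant.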

Olaf


signature.asc
Description: PGP signature
