Re: Odd build breakage in 4.9-rc7

2016-12-02 Thread Jarod Wilson

On 2016-12-02 3:11 PM, Nicolas Pitre wrote:
> On Thu, 1 Dec 2016, Paul Bolle wrote:
>
>> On Thu, 2016-12-01 at 12:42 -0500, Nicolas Pitre wrote:
>>> OK I understand what the problem is.  However most of those hunks below
>>> are definitely wrong. ;-)
>>
>> Probably. By now I've narrowed it down to just these two hunks:
>
> And they're both wrong. ;-) There is no relation between MODVERSIONS and
> TRIM_UNUSED_KSYMS.
>
>>> I'm trying to determine the best way to fix it. Stay tuned.
>>
>> Will do. I'm curious to see what a proper fix might look like.
>
> Here it is:
>
> - >8
> Subject: kbuild: fix building bzImage with CONFIG_TRIM_UNUSED_KSYMS enabled
>
> When building a specific target such as bzImage, modules aren't normally
> built. However if CONFIG_TRIM_UNUSED_KSYMS is enabled, no built modules
> means none of the exported symbols are used and therefore they will all
> be trimmed away from the final kernel. A subsequent "make modules" will
> fail because modpost cannot find the needed symbols for those modules in
> the kernel binary.
>
> Let's make sure modules are also built whenever CONFIG_TRIM_UNUSED_KSYMS
> is enabled and that the kernel binary is properly rebuilt accordingly.


For my previously failing case, things behave again with this patch. 
Thanks much!


Tested-by: Jarod Wilson 

--
Jarod Wilson
ja...@redhat.com


Re: Odd build breakage in 4.9-rc7

2016-12-02 Thread Nicolas Pitre
On Thu, 1 Dec 2016, Paul Bolle wrote:

> On Thu, 2016-12-01 at 12:42 -0500, Nicolas Pitre wrote:
> > OK I understand what the problem is.  However most of those hunks below 
> > are definitely wrong. ;-)
> 
> Probably. By now I've narrowed it down to just these two hunks:

And they're both wrong. ;-) There is no relation between MODVERSIONS and 
TRIM_UNUSED_KSYMS.

> > I'm trying to determine the best way to fix it. Stay tuned.
> 
> Will do. I'm curious to see what a proper fix might look like.

Here it is:

- >8
Subject: kbuild: fix building bzImage with CONFIG_TRIM_UNUSED_KSYMS enabled

When building a specific target such as bzImage, modules aren't normally
built. However if CONFIG_TRIM_UNUSED_KSYMS is enabled, no built modules
means none of the exported symbols are used and therefore they will all
be trimmed away from the final kernel. A subsequent "make modules" will
fail because modpost cannot find the needed symbols for those modules in
the kernel binary.

Let's make sure modules are also built whenever CONFIG_TRIM_UNUSED_KSYMS
is enabled and that the kernel binary is properly rebuilt accordingly.

Signed-off-by: Nicolas Pitre 

diff --git a/Makefile b/Makefile
index 9f9c3b577c..b816089e5d 100644
--- a/Makefile
+++ b/Makefile
@@ -607,6 +607,13 @@ else
 include/config/auto.conf: ;
 endif # $(dot-config)
 
+# For the kernel to actually contain only the needed exported symbols,
+# we have to build modules as well to determine what those symbols are.
+# (this can be evaluated only once include/config/auto.conf has been included)
+ifdef CONFIG_TRIM_UNUSED_KSYMS
+  KBUILD_MODULES := 1
+endif
+
 # The all: target is the default when no target is given on the
 # command line.
 # This allow a user to issue only 'make' to build a kernel including modules
@@ -944,7 +951,7 @@ ifdef CONFIG_GDB_SCRIPTS
 endif
 ifdef CONFIG_TRIM_UNUSED_KSYMS
 	$(Q)$(CONFIG_SHELL) $(srctree)/scripts/adjust_autoksyms.sh \
-	  "$(MAKE) KBUILD_MODULES=1 -f $(srctree)/Makefile vmlinux_prereq"
+	  "$(MAKE) -f $(srctree)/Makefile vmlinux"
 endif
 
 # standalone target for easier testing
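
The reasoning in the commit message (no built modules means nothing is
whitelisted, so every export gets trimmed) can be sketched in a few lines
of shell. This is only an illustration under stated assumptions: the
symbol list is fed one name per line on stdin, which is a hypothetical
input format, and gen_ksym_whitelist is a made-up name rather than a
kernel helper. The output uses the same "#define __KSYM_<sym> 1" form as
the __KSYM_module_layout line in scripts/adjust_autoksyms.sh.

```shell
# Hypothetical helper: turn a list of symbol names (one per line on stdin)
# into "#define __KSYM_<sym> 1" lines. Blank lines are skipped and
# duplicates are collapsed by sort -u.
gen_ksym_whitelist() {
    sort -u | while read -r sym; do
        [ -n "$sym" ] || continue
        echo "#define __KSYM_${sym} 1"
    done
}

# With modules built, the symbols they need end up in the whitelist.
printf 'printk\nkmalloc\nprintk\n' | gen_ksym_whitelist

# With no modules built the input is empty, so the whitelist is empty too
# and every exported symbol becomes a candidate for trimming.
printf '' | gen_ksym_whitelist
```

An empty whitelist is what a bare "make bzImage" would produce before this
patch, which is why the later "make modules" cannot resolve its symbols.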


Re: Odd build breakage in 4.9-rc7

2016-12-01 Thread Paul Bolle
On Thu, 2016-12-01 at 12:42 -0500, Nicolas Pitre wrote:
> OK I understand what the problem is.  However most of those hunks below 
> are definitely wrong. ;-)

Probably. By now I've narrowed it down to just these two hunks:

diff --git a/scripts/Makefile b/scripts/Makefile
index 1d80897a9644..f23e5c4f2496 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -40,7 +40,9 @@ build_docproc: $(obj)/docproc
 build_check-lc_ctype: $(obj)/check-lc_ctype
 	@:
 
-subdir-$(CONFIG_MODVERSIONS) += genksyms
+ifeq ($(or $(CONFIG_MODVERSIONS),$(CONFIG_TRIM_UNUSED_KSYMS)),y)
+subdir-y += genksyms
+endif
 subdir-y += mod
 subdir-$(CONFIG_SECURITY_SELINUX) += selinux
 subdir-$(CONFIG_DTC) += dtc
diff --git a/scripts/adjust_autoksyms.sh b/scripts/adjust_autoksyms.sh
index 8dc1918b6783..7525da1cc2f7 100755
--- a/scripts/adjust_autoksyms.sh
+++ b/scripts/adjust_autoksyms.sh
@@ -68,7 +68,7 @@ while read sym; do
 done >> "$new_ksyms_file"
 
 # Special case for modversions (see modpost.c)
-if [ -n "$CONFIG_MODVERSIONS" ]; then
+if [ -n "$CONFIG_MODVERSIONS" -o -n "$CONFIG_TRIM_UNUSED_KSYMS" ]; then
 	echo "#define __KSYM_module_layout 1" >> "$new_ksyms_file"
 fi
 

> I'm trying to determine the best way to fix it. Stay tuned.

Will do. I'm curious to see what a proper fix might look like.

Thanks,


Paul Bolle


Re: Odd build breakage in 4.9-rc7

2016-12-01 Thread Nicolas Pitre
On Thu, 1 Dec 2016, Paul Bolle wrote:

> On Thu, 2016-12-01 at 10:01 +0100, Paul Bolle wrote:
> > Perhaps this is all documented somewhere. But even then it would be nice if
> > the build would fail right at the start. Ie, the build probably should fail if
> > one does "make bzImage" while having a .config with
> >     CONFIG_TRIM_UNUSED_KSYMS=y
> >     CONFIG_MODULES=y
> >     # CONFIG_MODVERSIONS is not set
> > 
> > Because it seems in that case the subsequent "make modules" will then end in
> > this flood of ERRORs. 
> 
> Or, alternatively, we could use something like the following hack to keep a
> two step build (ie, "make bzImage" and "make modules") do the right thing even
> if CONFIG_MODVERSIONS is not set.
> 
> The hack was cobbled together with a fair amount of cargo-cult coding so
> perhaps a hunk or two aren't really needed. We'll see.

OK I understand what the problem is.  However most of those hunks below 
are definitely wrong. ;-)

I'm trying to determine the best way to fix it. Stay tuned.


> Paul Bolle
> 
> diff --git a/Makefile b/Makefile
> index 694111b43cf8..5820f803ca64 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -321,7 +321,7 @@ KBUILD_BUILTIN := 1
>  # make sure the checksums are up to date before we record them.
>  
>  ifeq ($(MAKECMDGOALS),modules)
> -  KBUILD_BUILTIN := $(if $(CONFIG_MODVERSIONS),1)
> +  KBUILD_BUILTIN := $(if $(or $(CONFIG_MODVERSIONS),$(CONFIG_TRIM_UNUSED_KSYMS)),1)
>  endif
>  
>  # If we have "make  modules", compile modules
> diff --git a/scripts/Makefile b/scripts/Makefile
> index 1d80897a9644..f23e5c4f2496 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -40,7 +40,9 @@ build_docproc: $(obj)/docproc
>  build_check-lc_ctype: $(obj)/check-lc_ctype
>   @:
>  
> -subdir-$(CONFIG_MODVERSIONS) += genksyms
> +ifeq ($(or $(CONFIG_MODVERSIONS),$(CONFIG_TRIM_UNUSED_KSYMS)),y)
> +subdir-y += genksyms
> +endif
>  subdir-y += mod
>  subdir-$(CONFIG_SECURITY_SELINUX) += selinux
>  subdir-$(CONFIG_DTC) += dtc
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index 7675d11ee65e..50ce2cf86b7c 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -182,7 +182,7 @@ $(obj)/%.symtypes : $(src)/%.c FORCE
>  
>  quiet_cmd_cc_o_c = CC $(quiet_modtag)  $@
>  
> -ifndef CONFIG_MODVERSIONS
> +ifneq ($(if $(CONFIG_MODVERSIONS),1,$(if $(CONFIG_TRIM_UNUSED_KSYMS),1)),1)
>  cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $<
>  
>  else
> @@ -358,7 +358,7 @@ $(obj)/%.s: $(src)/%.S FORCE
>  
>  quiet_cmd_as_o_S = AS $(quiet_modtag)  $@
>  
> -ifndef CONFIG_MODVERSIONS
> +ifneq ($(if $(CONFIG_MODVERSIONS),1,$(if $(CONFIG_TRIM_UNUSED_KSYMS),1)),1)
>  cmd_as_o_S = $(CC) $(a_flags) -c -o $@ $<
>  
>  else
> diff --git a/scripts/adjust_autoksyms.sh b/scripts/adjust_autoksyms.sh
> index 8dc1918b6783..7525da1cc2f7 100755
> --- a/scripts/adjust_autoksyms.sh
> +++ b/scripts/adjust_autoksyms.sh
> @@ -68,7 +68,7 @@ while read sym; do
>  done >> "$new_ksyms_file"
>  
>  # Special case for modversions (see modpost.c)
> -if [ -n "$CONFIG_MODVERSIONS" ]; then
> +if [ -n "$CONFIG_MODVERSIONS" -o -n "$CONFIG_TRIM_UNUSED_KSYMS" ]; then
>   echo "#define __KSYM_module_layout 1" >> "$new_ksyms_file"
>  fi
>  
> 

Re: Odd build breakage in 4.9-rc7

2016-12-01 Thread Jarod Wilson

On 2016-12-01 9:03 AM, Prarit Bhargava wrote:
> On 11/30/2016 05:41 PM, Nicolas Pitre wrote:
>> On Wed, 30 Nov 2016, Linus Torvalds wrote:
>>
>>> On Wed, Nov 30, 2016 at 10:50 AM, Prarit Bhargava  wrote:
>>>>
>>>> It comes back.  The steps to reproduce this are:
>>>>
>>>> 1.  checkout latest linux.git
>>>> 2.  make -j112
>>>>
>>>> (IOW, it occurs 100% of the time for me on a clean tree.)
>>
>> I don't have access to such hardware where -j112 could ever make sense.  :-)
>> In other words, I can't reproduce regardless of the -j value I try.
>>
>>> I suspect it's not new, it's just that you are able to hit the timing
>>> just right (and the new include presumably makes that just be much
>>> easier).
>>
>> Here's the best fix I can think of. I can't convince myself any other
>> location would be 100% safe.  Obviously I can't confirm if this actually
>> fixes anything.
>>
>> - >8
>> Subject: kbuild: make sure autoksyms.h exists early
>>
>> Some people are able to trigger a race where autoksyms.h is used before
>> its empty version is even created. Let's create it at the same time as
>> the directory holding it is created.
>>
>> Signed-off-by: Nicolas Pitre
>>
>> diff --git a/Makefile b/Makefile
>> index 694111b43c..9f9c3b577c 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -1019,8 +1019,6 @@ prepare2: prepare3 prepare-compiler-check outputmakefile asm-generic
>>  prepare1: prepare2 $(version_h) include/generated/utsrelease.h \
>> include/config/auto.conf
>>   $(cmd_crmodverdir)
>> - $(Q)test -e include/generated/autoksyms.h || \
>> - touch   include/generated/autoksyms.h
>>
>>  archprepare: archheaders archscripts prepare1 scripts_basic
>>
>> diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile
>> index ebced77deb..90a091b6ae 100644
>> --- a/scripts/kconfig/Makefile
>> +++ b/scripts/kconfig/Makefile
>> @@ -35,6 +35,8 @@ nconfig: $(obj)/nconf
>>
>>  silentoldconfig: $(obj)/conf
>>   $(Q)mkdir -p include/config include/generated
>> + $(Q)test -e include/generated/autoksyms.h || \
>> + touch   include/generated/autoksyms.h
>>   $< $(silent) --$@ $(Kconfig)
>>
>>  localyesconfig localmodconfig: $(obj)/streamline_config.pl $(obj)/conf
>
> The testing was successful.
>
> After testing an hour of builds with different -j values, I'm no longer seeing
> any compile issues when this patch is applied.  When I remove the patch the
> compile error returns so I'm going to say that this patch fixed it.
>
> Thanks again Nicolas.
>
> Tested-by: Prarit Bhargava


Looks good here as well, can do parallel make w/o reverting Tony's 
patches again.


Tested-by: Jarod Wilson 

--
Jarod Wilson
ja...@redhat.com


Re: Odd build breakage in 4.9-rc7

2016-12-01 Thread Prarit Bhargava


On 11/30/2016 05:41 PM, Nicolas Pitre wrote:
> On Wed, 30 Nov 2016, Linus Torvalds wrote:
> 
>> On Wed, Nov 30, 2016 at 10:50 AM, Prarit Bhargava  wrote:
>>>
>>> It comes back.  The steps to reproduce this are:
>>>
>>> 1.  checkout latest linux.git
>>> 2.  make -j112
>>>
>>> (IOW, it occurs 100% of the time for me on a clean tree.)
> 
> I don't have access to such hardware where -j112 could ever make sense.  :-)
> In other words, I can't reproduce regardless of the -j value I try.
> 
>> I suspect it's not new, it's just that you are able to hit the timing
>> just right (and the new include presumably makes that just be much
>> easier).
> 
> Here's the best fix I can think of. I can't convince myself any other 
> location would be 100% safe.  Obviously I can't confirm if this actually 
> fixes anything.
> 
> - >8
> Subject: kbuild: make sure autoksyms.h exists early
> 
> Some people are able to trigger a race where autoksyms.h is used before
> its empty version is even created. Let's create it at the same time as
> the directory holding it is created.
> 
> Signed-off-by: Nicolas Pitre 
> 
> diff --git a/Makefile b/Makefile
> index 694111b43c..9f9c3b577c 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1019,8 +1019,6 @@ prepare2: prepare3 prepare-compiler-check outputmakefile asm-generic
>  prepare1: prepare2 $(version_h) include/generated/utsrelease.h \
> include/config/auto.conf
>   $(cmd_crmodverdir)
> - $(Q)test -e include/generated/autoksyms.h || \
> - touch   include/generated/autoksyms.h
>  
>  archprepare: archheaders archscripts prepare1 scripts_basic
>  
> diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile
> index ebced77deb..90a091b6ae 100644
> --- a/scripts/kconfig/Makefile
> +++ b/scripts/kconfig/Makefile
> @@ -35,6 +35,8 @@ nconfig: $(obj)/nconf
>  
>  silentoldconfig: $(obj)/conf
>   $(Q)mkdir -p include/config include/generated
> + $(Q)test -e include/generated/autoksyms.h || \
> + touch   include/generated/autoksyms.h
>   $< $(silent) --$@ $(Kconfig)
>  
>  localyesconfig localmodconfig: $(obj)/streamline_config.pl $(obj)/conf
> 
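
The "test -e ... || touch ..." idiom that the patch moves can be tried in
isolation. The following is a minimal sketch, not part of the patch:
ensure_exists is an illustrative name and the scratch directory only
mimics the include/generated layout. It shows that the file is created
empty when missing and left alone (contents and timestamp) when it is
already there.

```shell
# Create an empty file only if it does not exist yet. An existing file is
# not touched, so its contents and modification time are preserved.
ensure_exists() {
    test -e "$1" || touch "$1"
}

# Demonstration in a scratch directory (paths are illustrative only).
workdir=$(mktemp -d)
mkdir -p "$workdir/include/generated"
ensure_exists "$workdir/include/generated/autoksyms.h"   # created, empty
echo "#define __KSYM_foo 1" > "$workdir/include/generated/autoksyms.h"
ensure_exists "$workdir/include/generated/autoksyms.h"   # left untouched
cat "$workdir/include/generated/autoksyms.h"
rm -rf "$workdir"
```

Doing this in the same recipe that runs "mkdir -p include/config
include/generated" means the header exists as soon as its directory does.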

The testing was successful.

After testing an hour of builds with different -j values, I'm no longer seeing
any compile issues when this patch is applied.  When I remove the patch the
compile error returns so I'm going to say that this patch fixed it.

Thanks again Nicolas.

Tested-by: Prarit Bhargava 

P.


Re: Odd build breakage in 4.9-rc7

2016-12-01 Thread Paul Bolle
On Thu, 2016-12-01 at 10:01 +0100, Paul Bolle wrote:
> Perhaps this is all documented somewhere. But even then it would be nice if
> the build would fail right at the start. Ie, the build probably should fail if
> one does "make bzImage" while having a .config with
>     CONFIG_TRIM_UNUSED_KSYMS=y
>     CONFIG_MODULES=y
>     # CONFIG_MODVERSIONS is not set
> 
> Because it seems in that case the subsequent "make modules" will then end in
> this flood of ERRORs. 

Or, alternatively, we could use something like the following hack to keep a
two step build (ie, "make bzImage" and "make modules") do the right thing even
if CONFIG_MODVERSIONS is not set.

The hack was cobbled together with a fair amount of cargo-cult coding so
perhaps a hunk or two aren't really needed. We'll see.
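
For the "fail right at the start" idea quoted above, a pre-flight check
outside of kbuild might look roughly like this. It is a sketch only:
trim_needs_modules is a hypothetical helper, not something kbuild
provides, and it simply greps a .config for the two options named in the
quoted text.

```shell
# Hypothetical pre-flight check: returns success (0) when the given .config
# enables both CONFIG_TRIM_UNUSED_KSYMS and CONFIG_MODULES, i.e. when a
# "make bzImage" followed by a separate "make modules" can hit the errors
# described in this thread.
trim_needs_modules() {
    config_file=$1
    grep -q '^CONFIG_TRIM_UNUSED_KSYMS=y$' "$config_file" &&
        grep -q '^CONFIG_MODULES=y$' "$config_file"
}

# Example: warn before starting a two-step build (only if .config exists).
if [ -f .config ] && trim_needs_modules .config; then
    echo "warning: build modules together with bzImage" >&2
fi
```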

Paul Bolle

diff --git a/Makefile b/Makefile
index 694111b43cf8..5820f803ca64 100644
--- a/Makefile
+++ b/Makefile
@@ -321,7 +321,7 @@ KBUILD_BUILTIN := 1
 # make sure the checksums are up to date before we record them.
 
 ifeq ($(MAKECMDGOALS),modules)
-  KBUILD_BUILTIN := $(if $(CONFIG_MODVERSIONS),1)
+  KBUILD_BUILTIN := $(if $(or $(CONFIG_MODVERSIONS),$(CONFIG_TRIM_UNUSED_KSYMS)),1)
 endif
 
 # If we have "make  modules", compile modules
diff --git a/scripts/Makefile b/scripts/Makefile
index 1d80897a9644..f23e5c4f2496 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -40,7 +40,9 @@ build_docproc: $(obj)/docproc
 build_check-lc_ctype: $(obj)/check-lc_ctype
 	@:
 
-subdir-$(CONFIG_MODVERSIONS) += genksyms
+ifeq ($(or $(CONFIG_MODVERSIONS),$(CONFIG_TRIM_UNUSED_KSYMS)),y)
+subdir-y += genksyms
+endif
 subdir-y += mod
 subdir-$(CONFIG_SECURITY_SELINUX) += selinux
 subdir-$(CONFIG_DTC) += dtc
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 7675d11ee65e..50ce2cf86b7c 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -182,7 +182,7 @@ $(obj)/%.symtypes : $(src)/%.c FORCE
 
 quiet_cmd_cc_o_c = CC $(quiet_modtag)  $@
 
-ifndef CONFIG_MODVERSIONS
+ifneq ($(if $(CONFIG_MODVERSIONS),1,$(if $(CONFIG_TRIM_UNUSED_KSYMS),1)),1)
 cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $<
 
 else
@@ -358,7 +358,7 @@ $(obj)/%.s: $(src)/%.S FORCE
 
 quiet_cmd_as_o_S = AS $(quiet_modtag)  $@
 
-ifndef CONFIG_MODVERSIONS
+ifneq ($(if $(CONFIG_MODVERSIONS),1,$(if $(CONFIG_TRIM_UNUSED_KSYMS),1)),1)
 cmd_as_o_S = $(CC) $(a_flags) -c -o $@ $<
 
 else
diff --git a/scripts/adjust_autoksyms.sh b/scripts/adjust_autoksyms.sh
index 8dc1918b6783..7525da1cc2f7 100755
--- a/scripts/adjust_autoksyms.sh
+++ b/scripts/adjust_autoksyms.sh
@@ -68,7 +68,7 @@ while read sym; do
 done >> "$new_ksyms_file"
 
 # Special case for modversions (see modpost.c)
-if [ -n "$CONFIG_MODVERSIONS" ]; then
+if [ -n "$CONFIG_MODVERSIONS" -o -n "$CONFIG_TRIM_UNUSED_KSYMS" ]; then
 	echo "#define __KSYM_module_layout 1" >> "$new_ksyms_file"
 fi
 


Re: Odd build breakage in 4.9-rc7

2016-12-01 Thread Paul Bolle
On Thu, 2016-12-01 at 10:01 +0100, Paul Bolle wrote:
> Perhaps this is all documented somewhere. But even then it would be nice if
> the build would fail right at the start. Ie, the build probably should fail if
> one does "make bzImage" while having a .config with
>     CONFIG_TRIM_UNUSED_KSYMS=y
>     CONFIG_MODULES=y
>     # CONFIG_MODVERSIONS is not set
> 
> Because it seems in that case the subsequent "make modules" will then end in
> this flood of ERRORs. 

Or, alternatively, we could use something like the following hack to keep a
two step build (ie, "make bzImage" and "make modules") do the right thing even
if CONFING_MODVERSIONS is not set.

The hack was cobbled together with a fair amount of cargo-cult coding so
perhaps a hunk or two aren't really needed. We'll see.

Paul Bolle

diff --git a/Makefile b/Makefile
index 694111b43cf8..5820f803ca64 100644
--- a/Makefile
+++ b/Makefile
@@ -321,7 +321,7 @@ KBUILD_BUILTIN := 1
 # make sure the checksums are up to date before we record them.
 
 ifeq ($(MAKECMDGOALS),modules)
-  KBUILD_BUILTIN := $(if $(CONFIG_MODVERSIONS),1)
+  KBUILD_BUILTIN := $(if $(or 
$(CONFIG_MODVERSIONS),$(CONFIG_TRIM_UNUSED_KSYMS)),1)
 endif
 
 # If we have "make  modules", compile modules
diff --git a/scripts/Makefile b/scripts/Makefile
index 1d80897a9644..f23e5c4f2496 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -40,7 +40,9 @@ build_docproc: $(obj)/docproc
 build_check-lc_ctype: $(obj)/check-lc_ctype
@:
 
-subdir-$(CONFIG_MODVERSIONS) += genksyms
+ifeq ($(or $(CONFIG_MODVERSIONS),$(CONFIG_TRIM_UNUSED_KSYMS)),y)
+subdir-y += genksyms
+endif
 subdir-y += mod
 subdir-$(CONFIG_SECURITY_SELINUX) += selinux
 subdir-$(CONFIG_DTC) += dtc
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 7675d11ee65e..50ce2cf86b7c 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -182,7 +182,7 @@ $(obj)/%.symtypes : $(src)/%.c FORCE
 
 quiet_cmd_cc_o_c = CC $(quiet_modtag)  $@
 
-ifndef CONFIG_MODVERSIONS
+ifneq ($(if $(CONFIG_MODVERSIONS),1,$(if $(CONFIG_TRIM_UNUSED_KSYMS),1)),1)
 cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $<
 
 else
@@ -358,7 +358,7 @@ $(obj)/%.s: $(src)/%.S FORCE
 
 quiet_cmd_as_o_S = AS $(quiet_modtag)  $@
 
-ifndef CONFIG_MODVERSIONS
+ifneq ($(if $(CONFIG_MODVERSIONS),1,$(if $(CONFIG_TRIM_UNUSED_KSYMS),1)),1)
 cmd_as_o_S = $(CC) $(a_flags) -c -o $@ $<
 
 else
diff --git a/scripts/adjust_autoksyms.sh b/scripts/adjust_autoksyms.sh
index 8dc1918b6783..7525da1cc2f7 100755
--- a/scripts/adjust_autoksyms.sh
+++ b/scripts/adjust_autoksyms.sh
@@ -68,7 +68,7 @@ while read sym; do
 done >> "$new_ksyms_file"
 
 # Special case for modversions (see modpost.c)
-if [ -n "$CONFIG_MODVERSIONS" ]; then
+if [ -n "$CONFIG_MODVERSIONS" -o -n "$CONFIG_TRIM_UNUSED_KSYMS" ]; then
 	echo "#define __KSYM_module_layout 1" >> "$new_ksyms_file"
 fi
 


Re: Odd build breakage in 4.9-rc7

2016-12-01 Thread Prarit Bhargava


On 11/30/2016 05:41 PM, Nicolas Pitre wrote:
> On Wed, 30 Nov 2016, Linus Torvalds wrote:
> 
>> On Wed, Nov 30, 2016 at 10:50 AM, Prarit Bhargava  wrote:
>>>
>>> It comes back.  The steps to reproduce this are:
>>>
>>> 1.  checkout latest linux.git
>>> 2.  make -j112
>>>
>>> (IOW, it occurs 100% of the time for me on a clean tree.)
> 
> I don't have access to such hardware where -j112 could ever make sense.  :-)

:)  I could push the builds onto the -j256 but I'm doing other stuff over 
there. :)

> In other words, I can't reproduce regardless of the -j value I try.
> 
>> I suspect it's not new, it's just that you are able to hit the timing
>> just right (and the new include presumably makes that just be much
>> easier).
> 
> Here's the best fix I can think of. I can't convince myself any other 
> location would be 100% safe.  Obviously I can't confirm if this actually 
> fixes anything.
> 
> - >8
> Subject: kbuild: make sure autoksyms.h exists early
> 

I'm building with this patch on top of latest now.  I will put it in a tight
loop and clear the drop_caches between builds to see if I can make it fail.

Thanks Nicolas -- your help is very much appreciated.

P.
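
The tight loop described above could look roughly like the following. This is a minimal sketch, not what was actually run: the function name `loop_until_failure`, the `DROP_CACHES` switch and the run cap are illustrative assumptions, and dropping caches requires root.

```shell
#!/bin/sh
# Hypothetical helper: rerun a build command until it fails or a cap is reached.
# Usage: loop_until_failure MAX_RUNS COMMAND [ARGS...]
# Returns 0 if every run succeeded, 1 on the first failure.
loop_until_failure() {
    max_runs=$1
    shift
    run=1
    while [ "$run" -le "$max_runs" ]; do
        # Optionally flush the page cache, dentries and inodes so that each
        # build starts cold. Writing to drop_caches needs root, so it is
        # opt-in in this sketch.
        if [ "${DROP_CACHES:-0}" = 1 ]; then
            sync
            echo 3 > /proc/sys/vm/drop_caches
        fi
        if ! "$@"; then
            echo "run $run failed"
            return 1
        fi
        run=$((run + 1))
    done
    echo "all $max_runs runs passed"
    return 0
}
```

One possible invocation, again only as an assumption about how the test might be driven: `DROP_CACHES=1 loop_until_failure 50 sh -c 'make -s clean && make -s -j112'`.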


Re: Odd build breakage in 4.9-rc7

2016-12-01 Thread Paul Bolle
[Added Nicolas, for whom this all might be painfully obvious.]

On Wed, 2016-11-30 at 18:40 -0500, Jarod Wilson wrote:
> My config had MODVERSIONS set, yes.

For the record triggering this flood of ERRORs for undefined symbols that you
ran into has been possible ever since TRIM_UNUSED_KSYMS was added to the tree
(in v4.7). I just tested that.

Perhaps this is all documented somewhere. But even then it would be nice if
the build would fail right at the start. Ie, the build probably should fail if
one does "make bzImage" while having a .config with
    CONFIG_TRIM_UNUSED_KSYMS=y
    CONFIG_MODULES=y
    # CONFIG_MODVERSIONS is not set

Because it seems in that case the subsequent "make modules" will then end in
this flood of ERRORs. 


Paul Bolle
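
The early check suggested above could be sketched as a small pre-build test on the .config. This is only an illustration: the function name `trim_ksyms_risk` is hypothetical, and the sketch deliberately ignores CONFIG_MODVERSIONS, which is an assumption of the sketch rather than something established in this message.

```shell
#!/bin/sh
# Hypothetical check: report whether a .config has TRIM_UNUSED_KSYMS=y
# together with MODULES=y, the combination described above.
# Usage: trim_ksyms_risk PATH_TO_CONFIG
# Prints "risky" and returns 1 for that combination, else prints "ok".
trim_ksyms_risk() {
    config=$1
    if grep -q '^CONFIG_TRIM_UNUSED_KSYMS=y$' "$config" &&
       grep -q '^CONFIG_MODULES=y$' "$config"; then
        echo "risky"
        return 1
    fi
    echo "ok"
    return 0
}
```

A wrapper script could call `trim_ksyms_risk .config` before a bare "make bzImage" and stop early instead of letting the later "make modules" fail.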


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Jarod Wilson

On 2016-11-30 4:57 PM, Paul Bolle wrote:

On Wed, 2016-11-30 at 22:42 +0100, Paul Bolle wrote:

My current theory is that setting MODVERSIONS, somehow, hides the ERROR spew.
Because that could explain your bisect. Linus' commit turns off MODVERSIONS the
hard way. And, naturally, this theory fails if your .configs never had
MODVERSIONS set in the first place.


My config had MODVERSIONS set, yes.


This theory appears to be correct!

It's getting late here so I won't be spending much time on this anymore.
Perhaps you fancy looking into all this now. And, maybe, a new day will bring
me the courage needed to dive into this.


Done for the day here too, will continue prodding tomorrow.

--
Jarod Wilson
ja...@redhat.com


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Nicolas Pitre
On Wed, 30 Nov 2016, Prarit Bhargava wrote:

> 
> 
> On 11/30/2016 01:36 PM, Linus Torvalds wrote:
> > On Wed, Nov 30, 2016 at 10:28 AM, Prarit Bhargava  wrote:
> > ]>
> >> In my case I tracked this to commit 3637efb00864 ("x86/mce: Add PCI quirks to
> >> identify Xeons with machine check recovery") which adds the include for
> >> generated/autoksyms.h.
> > 
> > Ok, that at least makes some sense. The other blamed commit did not
> > seem to possibly make a difference.
> > 
> >> Searching LKML and I came across a report from Ken Moffat from a month ago:
> >>
> >> http://marc.info/?l=linux-kernel&m=147794681124332&w=2
> > 
> > Does a "make clean" get rid of it forever? Or does it come back?
> 
> It comes back.  The steps to reproduce this are:
> 
> 1.  checkout latest linux.git
> 2.  make -j112
> 
> (IOW, it occurs 100% of the time for me on a clean tree.)
> 
> To work around the bug I have to do
> 
> 1.  checkout latest linux.git
> 2.  comment out the include for generated/autoksyms.h at include/linux/export.h:81
> 3.  compile with -j112
> 
> This fails loudly, but then I do
> 
> 4.  uncomment the include for generated/autoksyms.h at include/linux/export.h:81

A simpler workaround is simply:

touch include/generated/autoksyms.h

But hopefully the patch I just sent would fix it for good.


Nicolas


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Nicolas Pitre
On Wed, 30 Nov 2016, Linus Torvalds wrote:

> On Wed, Nov 30, 2016 at 10:50 AM, Prarit Bhargava  wrote:
> >
> > It comes back.  The steps to reproduce this are:
> >
> > 1.  checkout latest linux.git
> > 2.  make -j112
> >
> > (IOW, it occurs 100% of the time for me on a clean tree.)

I don't have access to such hardware where -j112 could ever make sense.  :-)
In other words, I can't reproduce regardless of the -j value I try.

> I suspect it's not new, it's just that you are able to hit the timing
> just right (and the new include presumably makes that just be much
> easier).

Here's the best fix I can think of. I can't convince myself any other 
location would be 100% safe.  Obviously I can't confirm if this actually 
fixes anything.

- >8
Subject: kbuild: make sure autoksyms.h exists early

Some people are able to trigger a race where autoksyms.h is used before
its empty version is even created. Let's create it at the same time as
the directory holding it is created.

Signed-off-by: Nicolas Pitre 

diff --git a/Makefile b/Makefile
index 694111b43c..9f9c3b577c 100644
--- a/Makefile
+++ b/Makefile
@@ -1019,8 +1019,6 @@ prepare2: prepare3 prepare-compiler-check outputmakefile asm-generic
 prepare1: prepare2 $(version_h) include/generated/utsrelease.h \
                    include/config/auto.conf
 	$(cmd_crmodverdir)
-	$(Q)test -e include/generated/autoksyms.h || \
-	touch   include/generated/autoksyms.h
 
 archprepare: archheaders archscripts prepare1 scripts_basic
 
diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile
index ebced77deb..90a091b6ae 100644
--- a/scripts/kconfig/Makefile
+++ b/scripts/kconfig/Makefile
@@ -35,6 +35,8 @@ nconfig: $(obj)/nconf
 
 silentoldconfig: $(obj)/conf
 	$(Q)mkdir -p include/config include/generated
+	$(Q)test -e include/generated/autoksyms.h || \
+	touch   include/generated/autoksyms.h
 	$< $(silent) --$@ $(Kconfig)
 
 localyesconfig localmodconfig: $(obj)/streamline_config.pl $(obj)/conf
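
The idiom the patch relies on is to create the directory and then an empty placeholder, but only when the file is missing. It can be tried in isolation with the sketch below. The function name `ensure_placeholder` is hypothetical, and the sketch only mirrors the two recipe lines (`mkdir -p` followed by the guarded `touch`), not the rest of kbuild.

```shell
#!/bin/sh
# Hypothetical helper mirroring the recipe lines in the patch above:
# create an empty placeholder only if the file does not exist yet.
# An existing file is left alone, so its contents and timestamp survive.
# Usage: ensure_placeholder PATH
ensure_placeholder() {
    target=$1
    mkdir -p "$(dirname "$target")"
    test -e "$target" || touch "$target"
}
```

The guard matters because an unconditional `touch` would bump the timestamp of an already generated autoksyms.h on every run, and make would then treat everything depending on it as out of date.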


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Paul Bolle
On Wed, 2016-11-30 at 22:42 +0100, Paul Bolle wrote:
> My current theory is that setting MODVERSIONS, somehow, hides the ERROR spew.
> Because that could explain your bisect. Linus' commit turns off MODVERSIONS the
> hard way. And, naturally, this theory fails if your .configs never had
> MODVERSIONS set in the first place.

This theory appears to be correct!

It's getting late here so I won't be spending much time on this anymore.
Perhaps you fancy looking into all this now. And, maybe, a new day will bring
me the courage needed to dive into this.

Thanks,


Paul Bolle


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Paul Bolle
On Wed, 2016-11-30 at 16:35 -0500, Jarod Wilson wrote:
> Just to confirm, with CONFIG_TRIM_UNUSED_KSYMS unset, the build behaves 
> normally, no ERROR spew.

And if MODVERSIONS is not set?

My current theory is that setting MODVERSIONS, somehow, hides the ERROR spew.
Because that could explain your bisect. Linus' commit turns off MODVERSIONS the
hard way. And, naturally, this theory fails if your .configs never had
MODVERSIONS set in the first place.

I'm testing all this right now, but you probably command more powerful
machines and could beat me in testing this theory.

Thanks,


Paul Bolle
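
Testing a theory like the one above means building the same tree with MODVERSIONS toggled while TRIM_UNUSED_KSYMS stays set. A minimal sketch of flipping one boolean in a .config-style file follows. The helper name `set_config_bool` is hypothetical, and it only rewrites the text file; the kernel's own configuration step still has to be rerun afterwards so that dependent options are resolved.

```shell
#!/bin/sh
# Hypothetical helper: force a boolean option on or off in a .config-style file.
# Usage: set_config_bool FILE NAME y|n   (NAME without the CONFIG_ prefix)
set_config_bool() {
    file=$1
    name=$2
    value=$3
    # Drop any existing setting, whether enabled or "is not set".
    # grep exits non-zero when nothing is left to print, hence "|| true".
    grep -v -e "^CONFIG_${name}=" -e "^# CONFIG_${name} is not set\$" \
        "$file" > "$file.tmp" || true
    if [ "$value" = y ]; then
        echo "CONFIG_${name}=y" >> "$file.tmp"
    else
        echo "# CONFIG_${name} is not set" >> "$file.tmp"
    fi
    mv "$file.tmp" "$file"
}
```

For example, `set_config_bool .config MODVERSIONS n` followed by a rebuild gives the "MODVERSIONS not set" half of the comparison.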


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Jarod Wilson

On 2016-11-30 4:07 PM, Jarod Wilson wrote:

On 2016-11-30 3:52 PM, Paul Bolle wrote:

On Wed, 2016-11-30 at 12:24 -0500, Jarod Wilson wrote:

Up second, once we're past the above, building modules goes splat:

8<
$ make -s ARCH=x86_64 V=1 -j8 modules
...
ERROR: "module_put" [virt/lib/irqbypass.ko] undefined!
ERROR: "mutex_unlock" [virt/lib/irqbypass.ko] undefined!
ERROR: "mutex_lock" [virt/lib/irqbypass.ko] undefined!
...
8<

There are similar ERROR lines to the tune of 145k lines of output,
basically for every single module and symbol in the build. This breakage
was bisected to commit cd3caefb4663e3811d37cc2afad3cce642d60061, which
looks fairly innocuous, but when reverted, builds work fine again.


I ran into a modules build printing over 100K ERROR lines a month ago:

https://lkml.kernel.org/r/<1478165881-9263-1-git-send-email-pebo...@tiscali.nl>


That had to do with setting TRIM_UNUSED_KSYMS and so unsetting UNUSED_SYMBOLS,
as far as I could tell. Did you perhaps also have UNUSED_SYMBOLS unset when
your modules build went splat?


I did indeed have CONFIG_TRIM_UNUSED_KSYMS=y and CONFIG_UNUSED_SYMBOLS unset.


Just to confirm, with CONFIG_TRIM_UNUSED_KSYMS unset, the build behaves 
normally, no ERROR spew.


--
Jarod Wilson
ja...@redhat.com


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Jarod Wilson

On 2016-11-30 3:52 PM, Paul Bolle wrote:

On Wed, 2016-11-30 at 12:24 -0500, Jarod Wilson wrote:

Up second, once we're past the above, building modules goes splat:

8<
$ make -s ARCH=x86_64 V=1 -j8 modules
...
ERROR: "module_put" [virt/lib/irqbypass.ko] undefined!
ERROR: "mutex_unlock" [virt/lib/irqbypass.ko] undefined!
ERROR: "mutex_lock" [virt/lib/irqbypass.ko] undefined!
...
8<

There are similar ERROR lines to the tune of 145k lines of output,
basically for every single module and symbol in the build. This breakage
was bisected to commit cd3caefb4663e3811d37cc2afad3cce642d60061, which
looks fairly innocuous, but when reverted, builds work fine again.


I ran into a modules build printing over 100K ERROR lines a month ago:

https://lkml.kernel.org/r/<1478165881-9263-1-git-send-email-pebo...@tiscali.nl>

That had to do with setting TRIM_UNUSED_KSYMS and so unsetting UNUSED_SYMBOLS,
as far as I could tell. Did you perhaps also have UNUSED_SYMBOLS unset when
your modules build went splat?


I did indeed have CONFIG_TRIM_UNUSED_KSYMS=y and CONFIG_UNUSED_SYMBOLS 
unset.



And did your bzImage build by any chance print this (to stderr):
sed: can't read .tmp_versions/*.mod: No such file or directory


Yep, I do see this now that I look back at the output from the bzImage 
stage.



If so I might have run into your second issue a month ago already, which makes
your bisect to commit cd3caefb4663 ("Fix subtle CONFIG_MODVERSIONS problems")
suspect. Or did that bisect not cover the second issue?


Hm. No, that bisect was indeed for this issue. Clean build each time, 
freshly unpacked 4.8 + rc6 patch applied + rc6-to-7 bisect patch.



Multi-threaded make vs. single-threaded doesn't matter, setting
CONFIG_BROKEN=y or '# CONFIG_MODVERSIONS is not set' don't make a
difference, and interestingly, if instead of split 'make bzImage' and
'make modules', I just do a single 'make', then things DO build
successfully, so I'm a wee bit baffled as to what's actually going on
here.


Likewise (ie, both the modules splat going away if doing a single make and
being baffled, but more than a wee bit).


Seems like it could be a case of "this patch just happens to tickle the 
issue in just the right way so as to trigger" if you saw this already a 
month prior. Linus' commit message there says things were already broken 
when that went in, I don't know the details of how and haven't yet tried 
to uncover them.


--
Jarod Wilson
ja...@redhat.com
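
The two build sequences being compared in this message can be written down side by side. This is a sketch for reproducing the comparison, not a fix: the function names are hypothetical, and `MAKE` is an overridable variable added here so that the sequence can be dry-run.

```shell
#!/bin/sh
# Hypothetical reproduction helpers. Extra arguments are passed through to
# make, and MAKE can be overridden, e.g. MAKE=echo for a dry run.

# Split build: "make bzImage" followed by "make modules" (the failing case).
build_split() {
    "${MAKE:-make}" "$@" bzImage && "${MAKE:-make}" "$@" modules
}

# Single build: one plain "make" (reported above to succeed).
build_single() {
    "${MAKE:-make}" "$@"
}
```

For instance, `build_split ARCH=x86_64` on a clean tree exercises the split case, and `build_single ARCH=x86_64` on another clean tree exercises the single-make case.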


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Paul Bolle
On Wed, 2016-11-30 at 12:24 -0500, Jarod Wilson wrote:
> Up second, once we're past the above, building modules goes splat:
> 
> 8<
> $ make -s ARCH=x86_64 V=1 -j8 modules
> ...
> ERROR: "module_put" [virt/lib/irqbypass.ko] undefined!
> ERROR: "mutex_unlock" [virt/lib/irqbypass.ko] undefined!
> ERROR: "mutex_lock" [virt/lib/irqbypass.ko] undefined!
> ...
> 8<
> 
> There are similar ERROR lines to the tune of 145k lines of output,
> basically for every single module and symbol in the build. This breakage
> was bisected to commit cd3caefb4663e3811d37cc2afad3cce642d60061, which
> looks fairly innocuous, but when reverted, builds work fine again.

I ran into a modules build printing over 100K ERROR lines a month ago:

https://lkml.kernel.org/r/<1478165881-9263-1-git-send-email-pebo...@tiscali.nl>

That had to do with setting TRIM_UNUSED_KSYMS and so unsetting UNUSED_SYMBOLS,
as far as I could tell. Did you perhaps also have UNUSED_SYMBOLS unset when
your modules build went splat?

And did your bzImage build by any chance print this (to stderr):
    sed: can't read .tmp_versions/*.mod: No such file or directory

If so I might have run into your second issue a month ago already, which makes
your bisect to commit cd3caefb4663 ("Fix subtle CONFIG_MODVERSIONS problems")
suspect. Or did that bisect not cover the second issue?

> Multi-threaded make vs. single-threaded doesn't matter, setting
> CONFIG_BROKEN=y or '# CONFIG_MODVERSIONS is not set' don't make a
> difference, and interestingly, if instead of split 'make bzImage' and
> 'make modules', I just do a single 'make', then things DO build
> successfully, so I'm a wee bit baffled as to what's actually going on
> here.

Likewise (ie, both the modules splat going away if doing a single make and
being baffled, but more than a wee bit).


Paul Bolle
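
With over 100K ERROR lines, it helps to condense the output before comparing two runs. The sketch below counts error lines, distinct symbols and distinct modules. The function name `summarize_undefined` is hypothetical, and the line format is assumed to be exactly the one shown in the quoted output above.

```shell
#!/bin/sh
# Hypothetical helper: condense modpost "undefined!" lines read from stdin.
# Prints the number of ERROR lines, distinct symbols and distinct modules.
summarize_undefined() {
    awk -F'"' '
        /^ERROR: ".*" \[.*\] undefined!$/ {
            errors++
            # With a double quote as field separator, $2 is the symbol name.
            symbols[$2] = 1
            # The module path sits between the square brackets in $3.
            module = $3
            sub(/^ \[/, "", module)
            sub(/\] undefined!$/, "", module)
            modules[module] = 1
        }
        END {
            nsym = 0; for (s in symbols) nsym++
            nmod = 0; for (m in modules) nmod++
            printf "errors=%d symbols=%d modules=%d\n", errors, nsym, nmod
        }
    '
}
```

Fed the three ERROR lines quoted above, this prints `errors=3 symbols=3 modules=1`. A full log could be piped in the same way, for example `make -s modules 2>&1 | summarize_undefined`.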


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Linus Torvalds
On Wed, Nov 30, 2016 at 10:50 AM, Prarit Bhargava  wrote:
>
> It comes back.  The steps to reproduce this are:
>
> 1.  checkout latest linux.git
> 2.  make -j112
>
> (IOW, it occurs 100% of the time for me on a clean tree.)

I suspect it's not new, it's just that you are able to hit the timing
just right (and the new include presumably makes that just be much
easier).

The rules for generating include/generated/autoksyms.h aren't new,
they go back to at least 4.7.

Adding Nico to the cc, since he's the person to blame for autoksyms,
and might have some idea why the dependency wouldn't be done in time.

  Linus


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Jarod Wilson

On 2016-11-30 1:50 PM, Prarit Bhargava wrote:


On 11/30/2016 01:36 PM, Linus Torvalds wrote:

On Wed, Nov 30, 2016 at 10:28 AM, Prarit Bhargava  wrote:
]>

In my case I tracked this to commit 3637efb00864 ("x86/mce: Add PCI quirks to
identify Xeons with machine check recovery") which adds the include for
generated/autoksyms.h.


Ok, that at least makes some sense. The other blamed commit did not
seem to possibly make a difference.


Searching LKML and I came across a report from Ken Moffat from a month ago:

http://marc.info/?l=linux-kernel&m=147794681124332&w=2


Does a "make clean" get rid of it forever? Or does it come back?


It comes back.  The steps to reproduce this are:

1.  checkout latest linux.git
2.  make -j112

(IOW, it occurs 100% of the time for me on a clean tree.)

To work around the bug I have to do

1.  checkout latest linux.git
2.  comment out the include for generated/autoksyms.h at include/linux/export.h:81
3.  compile with -j112

This fails loudly, but then I do

4.  uncomment the include for generated/autoksyms.h at include/linux/export.h:81
5.  make -j112

and this completes with a bootable kernel AFAICT.


In my case, I first noticed this with rpm builds, which are unpacking a 
tarball every time. I did have ccache installed, but have removed it, to 
no avail. I still get this failure. Reverting these three patches...


9a6fb28a355d2609ace4dab4e6425442c647894d
3637efb00864f465baebd49464e58319fd295b65
ffb173e657fa8123bffa2a169e124b4bca0b5bc4

...gets me a working build every time. So I'm not so sure these patches 
are completely innocent. They may not be directly at fault, but 
certainly seem to be involved in causing things to get tripped up.



If it's a one-time dependency issue that is because some header
dependency addition that the automatic dependency generator hadn't
caught, that might explain a bisection failure too: once the file
happens to get rebuilt (and the dependencies re-done), it starts
working even though the "happens to be rebuilt" had nothing to do with
the original bug.


Hopefully the linux-kbuild folks might be able to point us in the right
direction for a fix.


Indeed. I'm mostly clueless here. Well, mostly clueless, period, but 
even more so here. And with the CONFIG_MODVERSIONS part of my original 
mail, which I guess I should also re-test with ccache uninstalled...


--
Jarod Wilson
ja...@redhat.com


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Prarit Bhargava


On 11/30/2016 01:36 PM, Linus Torvalds wrote:
> On Wed, Nov 30, 2016 at 10:28 AM, Prarit Bhargava wrote:
>
>> In my case I tracked this to commit 3637efb00864 ("x86/mce: Add PCI quirks to
>> identify Xeons with machine check recovery") which adds the include for
>> generated/autoksyms.h.
> 
> Ok, that at least makes some sense. The other blamed commit did not
> seem to possibly make a difference.
> 
>> Searching LKML and I came across a report from Ken Moffat from a month ago:
>>
>> http://marc.info/?l=linux-kernel&m=147794681124332&w=2
> 
> Does a "make clean" get rid of it forever? Or does it come back?

It comes back.  The steps to reproduce this are:

1.  checkout latest linux.git
2.  make -j112

(IOW, it occurs 100% of the time for me on a clean tree.)

To work around the bug I have to do

1.  checkout latest linux.git
2.  comment out the include for generated/autoksyms.h at 
include/linux/export.h:81
3.  compile with -j112

This fails loudly, but then I do

4.  uncomment the include for generated/autoksyms.h at include/linux/export.h:81
5.  make -j112

and this completes with a bootable kernel AFAICT.
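
For anyone who wants to script that sequence, here is a minimal sketch. The helper names comment_line and uncomment_line are hypothetical, and line 81 is only valid for the tree described above; check where the include sits in your own checkout before using it.

```shell
# Hypothetical helpers for toggling one line of a C header in place.
# comment_line FILE N   -> prefixes line N with "// "
# uncomment_line FILE N -> strips a leading "// " from line N
comment_line() {
    sed "$2s|^|// |" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

uncomment_line() {
    sed "$2s|^// ||" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Sketch of the sequence described above, run from the top of the tree:
#   comment_line include/linux/export.h 81
#   make -j112 || true        # this pass is expected to fail
#   uncomment_line include/linux/export.h 81
#   make -j112
```

The make invocations are left as comments so that sourcing the snippet does not start a build by accident.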

> 
> If it's a one-time dependency issue that is because some header
> dependency addition that the automatic dependency generator hadn't
> caught, that might explain a bisection failure too: once the file
> happens to get rebuilt (and the dependencies re-done), it starts
> working even though the "happens to be rebuilt" had nothing to do with
> the original bug.

Hopefully the linux-kbuild folks might be able to point us in the right
direction for a fix.

P.


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Linus Torvalds
On Wed, Nov 30, 2016 at 10:28 AM, Prarit Bhargava wrote:
>
> In my case I tracked this to commit 3637efb00864 ("x86/mce: Add PCI quirks to
> identify Xeons with machine check recovery") which adds the include for
> generated/autoksyms.h.

Ok, that at least makes some sense. The other blamed commit did not
seem to possibly make a difference.

> Searching LKML and I came across a report from Ken Moffat from a month ago:
>
> http://marc.info/?l=linux-kernel&m=147794681124332&w=2

Does a "make clean" get rid of it forever? Or does it come back?

If it's a one-time dependency issue that is because some header
dependency addition that the automatic dependency generator hadn't
caught, that might explain a bisection failure too: once the file
happens to get rebuilt (and the dependencies re-done), it starts
working even though the "happens to be rebuilt" had nothing to do with
the original bug.

   Linus


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Prarit Bhargava


On 11/30/2016 01:18 PM, Linus Torvalds wrote:
> On Wed, Nov 30, 2016 at 9:24 AM, Jarod Wilson  wrote:
>>
>> Now, if I omit the -j8 and do a single-threaded build, then things work
>> fine. Prarit bisected this failure to commit
>> 9a6fb28a355d2609ace4dab4e6425442c647894d, and indeed, when reverting that
>> patch and the two that follow it from rc7, parallel make works again.
> 
> I seriously doubt that commit really makes a difference, and I think
> it was just random luck.
> 
> Do you perhaps have ccache installed?
> 
> Because ccache at some point broke dependency generation of "gcc -MM"
> that the kernel build system uses, giving those random "No such file"
> build errors.
> 
> Try uninstalling ccache and see if that helps.
> 

I reported this last week on LKML, and I'm building on freshly installed
systems that do not have ccache installed.

For example,

[root@intel-brickland-04 linux]# rpm -q ccache
package ccache is not installed
[root@intel-brickland-04 linux]# which ccache
/usr/bin/which: no ccache in
(/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/libexec/git-core:/root/bin)
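
The same check can be made without rpm, for people on other distros. A minimal sketch follows; the helper name have_tool is hypothetical, and it only tells you whether the tool is reachable through PATH, not whether CC has been pointed at it some other way.

```shell
# Hypothetical helper: succeed if the named tool can be found through PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool ccache; then
    echo "ccache found at: $(command -v ccache)"
else
    echo "ccache not found in PATH"
fi
```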

In my case I tracked this to commit 3637efb00864 ("x86/mce: Add PCI quirks to
identify Xeons with machine check recovery") which adds the include for
generated/autoksyms.h.

Searching LKML, I came across a report from Ken Moffat from a month ago:

http://marc.info/?l=linux-kernel&m=147794681124332&w=2

Also cc'ing linux-kbuild.

P.

>   Linus
> 


Re: Odd build breakage in 4.9-rc7

2016-11-30 Thread Linus Torvalds
On Wed, Nov 30, 2016 at 9:24 AM, Jarod Wilson  wrote:
>
> Now, if I omit the -j8 and do a single-threaded build, then things work
> fine. Prarit bisected this failure to commit
> 9a6fb28a355d2609ace4dab4e6425442c647894d, and indeed, when reverting that
> patch and the two that follow it from rc7, parallel make works again.

I seriously doubt that commit really makes a difference, and I think
it was just random luck.

Do you perhaps have ccache installed?

Because ccache at some point broke dependency generation of "gcc -MM"
that the kernel build system uses, giving those random "No such file"
build errors.

Try uninstalling ccache and see if that helps.

  Linus


Odd build breakage in 4.9-rc7

2016-11-30 Thread Jarod Wilson
I'm encountering two different build breakages with 4.9-rc7, using an rpm
spec setup I've been using for every rc dating back to at least 3.10.
First up, and actually dating back earlier in the rc cycle, I get:

8<
$ make -s ARCH=x86_64 V=1 -j8 bzImage
...
In file included from ./include/linux/linkage.h:6:0,
 from ./include/linux/kernel.h:6,
 from ./include/asm-generic/bug.h:13,
 from ./arch/x86/include/asm/bug.h:35,
 from ./include/linux/bug.h:4,
 from ./include/linux/jump_label.h:170,
 from ./arch/x86/include/asm/string_64.h:5,
 from ./arch/x86/include/asm/string.h:4,
 from ./include/linux/string.h:18,
 from ./include/uapi/linux/uuid.h:21,
 from ./include/linux/uuid.h:19,
 from ./include/linux/mod_devicetable.h:12,
 from scripts/mod/devicetable-offsets.c:2:
./include/linux/export.h:81:33: fatal error: generated/autoksyms.h: No
such file or directory
 #include <generated/autoksyms.h>
                                 ^
compilation terminated.
...
make[2]: *** [scripts/mod/devicetable-offsets.s] Error 1
make[1]: *** [scripts/mod] Error 2
make[1]: *** Waiting for unfinished jobs....
8<

Now, if I omit the -j8 and do a single-threaded build, then things work
fine. Prarit bisected this failure to commit
9a6fb28a355d2609ace4dab4e6425442c647894d, and indeed, when reverting that
patch and the two that follow it from rc7, parallel make works again.

Up second, once we're past the above, building modules goes splat:

8<
$ make -s ARCH=x86_64 V=1 -j8 modules
...
ERROR: "module_put" [virt/lib/irqbypass.ko] undefined!
ERROR: "mutex_unlock" [virt/lib/irqbypass.ko] undefined!
ERROR: "mutex_lock" [virt/lib/irqbypass.ko] undefined!
...
8<

There are similar ERROR lines to the tune of 145k lines of output,
basically for every single module and symbol in the build. This breakage
was bisected to commit cd3caefb4663e3811d37cc2afad3cce642d60061, which
looks fairly innocuous, but when reverted, builds work fine again.
Multi-threaded make vs. single-threaded doesn't matter, setting
CONFIG_BROKEN=y or '# CONFIG_MODVERSIONS is not set' don't make a
difference, and interestingly, if instead of split 'make bzImage' and
'make modules', I just do a single 'make', then things DO build
successfully, so I'm a wee bit baffled as to what's actually going on
here.
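
With that much output, a per-symbol summary is easier to read than the raw log. A minimal sketch, assuming the build output was saved to a file and that every error line has exactly the form shown above; the function name summarize_undefined and the log file name are hypothetical.

```shell
# Hypothetical helper: count undefined-symbol errors per symbol in a saved
# build log. Only lines of the form
#   ERROR: "sym" [path/to/mod.ko] undefined!
# are considered; everything else is ignored.
summarize_undefined() {
    sed -n 's/^ERROR: "\([^"]*\)" \[[^]]*] undefined!$/\1/p' "$1" |
        sort | uniq -c | sort -rn
}

# Usage sketch:
#   make -s ARCH=x86_64 V=1 -j8 modules > modules.log 2>&1
#   summarize_undefined modules.log | head
```

If nearly every exported symbol shows up in the summary, that points at the kernel image rather than at any one module.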

To the comment in cd3caefb's changelog: I noticed! ;)

-- 
Jarod Wilson
ja...@redhat.com


