Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log

2017-09-14 Thread Seebs
On Thu, 14 Sep 2017 15:53:09 +0100
"Burton, Ross"  wrote:

> Example:
> 
> path mismatch [5 links]: ino 27539515 db
> '/data/poky-tmp/master/build/work/corei7-64-poky-linux/dbus/1.10.20-r0/recipe-sysroot-native/usr/bin/x86_64-poky-linux/x86_64-poky-linux-ld.bfd'
> req
> '/data/poky-tmp/master/build/sysroots-components/x86_64/binutils-cross-x86_64/usr/bin/x86_64-poky-linux/x86_64-poky-linux-ld'.
> 
> Looks like this is just warning that there is a symlink, right?

Hard link, not symlink. Symlinks are recorded separately, "links" is
the hard link count in the filesystem, which doesn't include symlinks.

That message is almost always Harmless. Although strictly speaking, it
probably indicates that one of the links was made outside of pseudo,
which in theory shouldn't happen, I think.

At least, I *think* the current intent is to always use an exact match
(both path and inode) if possible, so it shouldn't see a mismatch
unless there's a path that got used but that isn't in the database.

-s
-- 
___
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core


Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log

2017-09-14 Thread Burton, Ross
On 13 September 2017 at 23:09, Seebs  wrote:

> On Wed, 13 Sep 2017 22:46:59 +0200
> Martin Jansa  wrote:
>
> >   And some-other-unrelated-file is really some different file, not
> > just hardlink to the same file from some different directory (like
> > between WORKDIR and sysroot other "path mismatch" entries show).
>
> To clarify a thing: In practice, "path mismatch" with links > 1 is
> probably harmless. I think I had a patch floating around to stop
> alerting about it except when being paranoid.
>

Example:

path mismatch [5 links]: ino 27539515 db
'/data/poky-tmp/master/build/work/corei7-64-poky-linux/dbus/1.10.20-r0/recipe-sysroot-native/usr/bin/x86_64-poky-linux/x86_64-poky-linux-ld.bfd'
req
'/data/poky-tmp/master/build/sysroots-components/x86_64/binutils-cross-x86_64/usr/bin/x86_64-poky-linux/x86_64-poky-linux-ld'.

Looks like this is just warning that there is a symlink, right?

Ross
-- 
___
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core


Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log

2017-09-14 Thread Seebs
On Thu, 14 Sep 2017 00:28:34 +0200
Martin Jansa  wrote:

> OK, thanks.
> 
> Any insights to the first issue described in commit message? That we
> know that is an Actual Problem, but hard to reproduce it.

This one?
  glibc-locale-2.26:
  glibc-locale: 
/glibc-binary-localedata-en-gb/usr/lib/locale/en_GB/LC_MEASUREMENT
  is owned by uid 3004, which is the same as the user running bitbake.
  This may be due to host contamination [host-user-contaminated]

My first guess would be some kind of weird timing interaction involving
symlinks or, if it's still there, the really weird copy of locale data
into an intermediate location to move it from the glibc package to the
glibc_locale package or something. But it seems like that should be
reproducible, so I'm not sure.

-s
-- 
___
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core


Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log

2017-09-14 Thread Burton, Ross
On 13 September 2017 at 23:08, Seebs  wrote:

> On Wed, 13 Sep 2017 22:50:32 +0200
> Martin Jansa  wrote:
>
> > inode mismatch:
> > '/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/
> systemd/1_234-r0/recipe-sysroot/etc/group'
> > ino 29361846 in db, 29361904 in request.
>
> Just to be clear about this:
>
> This is at least potentially an Actual Problem. It's probably not a
> good idea to habitually ignore these. A thing like this happening means
> that, at some point:
>
> 1. Something created etc/group, and it got inode 29361846 recorded in
> the database.
> 2. Something deleted etc/group, and did NOT record this to the database.
> 3. Something created etc/group, with inode 29361904.
>
> It's possible that the thing creating etc/group didn't record it to the
> database, and something else just referred to it later (like a stat
> call), or that this was generated by the request creating the entry.
>
> Either way, it means changes were made to a pseudo-managed hunk of
> filesystem without pseudo knowing about it, and that is generally bad.


I just applied this patch and got an identical hit in systemd, which makes
me think that the useradd class (maze of code that it is) is causing this
instance.

Ross
-- 
___
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core


Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log

2017-09-13 Thread Martin Jansa
OK, thanks.

Any insights to the first issue described in commit message? That we know
that is an Actual Problem, but hard to reproduce it.

I'm rebuilding one component over and over again and with pseudo 1.8.1 I
got this issue in 5 rebuilds from 1000, now in the same build with latest
pseudo 1.8.2 backported from oe-core/master I got this issue only 2 times
in 1000 iterations.

Hopefully this insane check will help discover similar issues even without
triggering the QA host-user-contamination check, so that we can detect such
issues even without really trying (it's now included in my regular "bitbake
world" builds which quite often trigger this QA host-user-contamination
issue in glibc-locale. And now we can correlate that with pseudo.log more
easily.

On Thu, Sep 14, 2017 at 12:08 AM, Seebs  wrote:

> On Wed, 13 Sep 2017 22:50:32 +0200
> Martin Jansa  wrote:
>
> > inode mismatch:
> > '/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/
> systemd/1_234-r0/recipe-sysroot/etc/group'
> > ino 29361846 in db, 29361904 in request.
>
> Just to be clear about this:
>
> This is at least potentially an Actual Problem. It's probably not a
> good idea to habitually ignore these. A thing like this happening means
> that, at some point:
>
> 1. Something created etc/group, and it got inode 29361846 recorded in
> the database.
> 2. Something deleted etc/group, and did NOT record this to the database.
> 3. Something created etc/group, with inode 29361904.
>
> It's possible that the thing creating etc/group didn't record it to the
> database, and something else just referred to it later (like a stat
> call), or that this was generated by the request creating the entry.
>
> Either way, it means changes were made to a pseudo-managed hunk of
> filesystem without pseudo knowing about it, and that is generally bad.
>
> -s
> --
> ___
> Openembedded-core mailing list
> Openembedded-core@lists.openembedded.org
> http://lists.openembedded.org/mailman/listinfo/openembedded-core
>
-- 
___
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core


Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log

2017-09-13 Thread Seebs
On Wed, 13 Sep 2017 22:50:32 +0200
Martin Jansa  wrote:

> inode mismatch:
> '/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/systemd/1_234-r0/recipe-sysroot/etc/group'
> ino 29361846 in db, 29361904 in request.

Just to be clear about this:

This is at least potentially an Actual Problem. It's probably not a
good idea to habitually ignore these. A thing like this happening means
that, at some point:

1. Something created etc/group, and it got inode 29361846 recorded in
the database.
2. Something deleted etc/group, and did NOT record this to the database.
3. Something created etc/group, with inode 29361904.

It's possible that the thing creating etc/group didn't record it to the
database, and something else just referred to it later (like a stat
call), or that this was generated by the request creating the entry.

Either way, it means changes were made to a pseudo-managed hunk of
filesystem without pseudo knowing about it, and that is generally bad.

-s
-- 
___
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core


Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log

2017-09-13 Thread Seebs
On Wed, 13 Sep 2017 22:46:59 +0200
Martin Jansa  wrote:

>   And some-other-unrelated-file is really some different file, not
> just hardlink to the same file from some different directory (like
> between WORKDIR and sysroot other "path mismatch" entries show).

To clarify a thing: In practice, "path mismatch" with links > 1 is
probably harmless. I think I had a patch floating around to stop
alerting about it except when being paranoid.

-s
-- 
___
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core


Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log

2017-09-13 Thread Martin Jansa
On Wed, Sep 13, 2017 at 10:46:59PM +0200, Martin Jansa wrote:
> * we often see QA warnings like:
>   glibc-locale-2.26: glibc-locale: 
> /glibc-binary-localedata-en-gb/usr/lib/locale/en_GB/LC_MEASUREMENT is owned 
> by uid 3004, which is the same as the user running bitbake. This may be due 
> to host contamination [host-user-contaminated]
>   glibc-locale-2.26: glibc-locale: 
> /glibc-binary-localedata-nn-no.iso-8859-1/usr/lib/locale/nn_NO.ISO-8859-1/LC_MEASUREMENT
>  is owned by uid 3004, which is the same as the user running bitbake. This 
> may be due to host contamination [host-user-contaminated]
>   but we don't know the root cause of it.
> * the only theory we currently have is that it's a bug in pseudo when
>   inode is being reused for different files, which is supported by
>   pseudo.log entries:
> 
>   Good build:
>   pseudo$ grep -v "^path mismatch" pseudo.log
>   debug_logfile: fd 2
>   pid 7975 [parent 7974], doing new pid setup and server start
>   Setup complete, sending SIGUSR1 to pid 7974.
>   db cleanup for server shutdown, 17:33:58.787
>   memory-to-file backup complete, 17:33:58.787.
>   db cleanup finished, 17:33:58.787
> 
>   Build with QA host-user-contaminated issue:
>   ERROR: foo-1.0.0-r0 do_package_qa: QA Issue: foo: file-with-wrong-UID is 
> owned
>   by uid 2001, which is the same as the user running bitbake. This may be due 
> to
>   host contamination [host-user-contaminated]
> 
>   pseudo$ grep "file-with-wrong-UID" pseudo.log
>   inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in 
> request.
>   creat ignored for existing file 'file-with-wrong-UID'.
>   inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in 
> request.
>   inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in 
> request.
>   inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in 
> request.
>   inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in 
> request.
>   inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in 
> request.
>   inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in 
> request.
>   path mismatch [1 link]: ino 13242270 db 'file-with-wrong-UID' req 
> 'some-other-unrelated-file'.
>   creat for 'some-other-unrelated-file' replaces existing 13242270 
> ['file-with-wrong-UID'].
>   db cleanup for server shutdown, 02:16:23.685
>   memory-to-file backup complete, 02:16:23.685.
>   db cleanup finished, 02:16:23.685
> 
>   And some-other-unrelated-file is really some different file, not just 
> hardlink
>   to the same file from some different directory (like between WORKDIR and 
> sysroot
>   other "path mismatch" entries show).
> 
> Signed-off-by: Martin Jansa 
> ---
>  meta/classes/insane.bbclass | 40 
>  1 file changed, 40 insertions(+)
> 
> diff --git a/meta/classes/insane.bbclass b/meta/classes/insane.bbclass
> index 78b41caf99..617bf7011d 100644
> --- a/meta/classes/insane.bbclass
> +++ b/meta/classes/insane.bbclass
> @@ -1237,6 +1237,41 @@ python do_qa_unpack() {
>  bb.warn('%s: the directory %s (%s) pointed to by the S variable 
> doesn\'t exist - please set S within the recipe to point to where the source 
> has been unpacked to' % (d.getVar('PN'), d.getVar('S', False), s_dir))
>  }
>  
> +python do_qa_pseudo() {
> +
> ###
> +# Check pseudo.log for unexpected errors
> +#
> +# Typical pseudo.log contains many "^path mismatch" lines for all the 
> hardlinked files
> +# e.g. in some smaller component I see 231/237 lines to be "^path 
> mismatch" other 6
> +# lines are setup and cleanup lines like this:
> +# debug_logfile: fd 2
> +# pid 7975 [parent 7974], doing new pid setup and server start
> +# Setup complete, sending SIGUSR1 to pid 7974.
> +# db cleanup for server shutdown, 17:33:58.787
> +# memory-to-file backup complete, 17:33:58.787.
> +# db cleanup finished, 17:33:58.787
> +#
> +# but if there is one of:
> +# "^inode mismatch"
> +# "^creat ignored for existing file"
> +# "^creat for.*replaces existing"
> +# then there might be some bigger issue which sometimes results in 
> host-user-contaminated QA warnings
> +
> ###
> +
> +import subprocess
> +
> +pseudodir = d.getVar('PSEUDO_LOCALSTATEDIR')
> +bb.note("Checking pseudo.log for common errors")
> +pseudolog = os.path.join(pseudodir, "pseudo.log")
> +statement = "grep" \
> +" -e '^inode mismatch'" \
> +" -e '^creat ignored for existing file'" \
> +" -e '^creat for.*replaces existing'" \
> +" %s" % pseudolog
> +if subprocess.call("%s -q" % statement, shell=True) == 0:
> +bb.fatal("This %s indicates errors, see %s or grep -v '^path 
> mismatch' %s" % (pseudolog, statement, pseudolog))
> +}
> +
>  # 

[OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log

2017-09-13 Thread Martin Jansa
* we often see QA warnings like:
  glibc-locale-2.26: glibc-locale: 
/glibc-binary-localedata-en-gb/usr/lib/locale/en_GB/LC_MEASUREMENT is owned by 
uid 3004, which is the same as the user running bitbake. This may be due to 
host contamination [host-user-contaminated]
  glibc-locale-2.26: glibc-locale: 
/glibc-binary-localedata-nn-no.iso-8859-1/usr/lib/locale/nn_NO.ISO-8859-1/LC_MEASUREMENT
 is owned by uid 3004, which is the same as the user running bitbake. This may 
be due to host contamination [host-user-contaminated]
  but we don't know the root cause of it.
* the only theory we currently have is that it's a bug in pseudo when
  inode is being reused for different files, which is supported by
  pseudo.log entries:

  Good build:
  pseudo$ grep -v "^path mismatch" pseudo.log
  debug_logfile: fd 2
  pid 7975 [parent 7974], doing new pid setup and server start
  Setup complete, sending SIGUSR1 to pid 7974.
  db cleanup for server shutdown, 17:33:58.787
  memory-to-file backup complete, 17:33:58.787.
  db cleanup finished, 17:33:58.787

  Build with QA host-user-contaminated issue:
  ERROR: foo-1.0.0-r0 do_package_qa: QA Issue: foo: file-with-wrong-UID is owned
  by uid 2001, which is the same as the user running bitbake. This may be due to
  host contamination [host-user-contaminated]

  pseudo$ grep "file-with-wrong-UID" pseudo.log
  inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request.
  creat ignored for existing file 'file-with-wrong-UID'.
  inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request.
  inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request.
  inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request.
  inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request.
  inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request.
  inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request.
  path mismatch [1 link]: ino 13242270 db 'file-with-wrong-UID' req 
'some-other-unrelated-file'.
  creat for 'some-other-unrelated-file' replaces existing 13242270 
['file-with-wrong-UID'].
  db cleanup for server shutdown, 02:16:23.685
  memory-to-file backup complete, 02:16:23.685.
  db cleanup finished, 02:16:23.685

  And some-other-unrelated-file is really some different file, not just hardlink
  to the same file from some different directory (like between WORKDIR and 
sysroot
  other "path mismatch" entries show).

Signed-off-by: Martin Jansa 
---
 meta/classes/insane.bbclass | 40 
 1 file changed, 40 insertions(+)

diff --git a/meta/classes/insane.bbclass b/meta/classes/insane.bbclass
index 78b41caf99..617bf7011d 100644
--- a/meta/classes/insane.bbclass
+++ b/meta/classes/insane.bbclass
@@ -1237,6 +1237,41 @@ python do_qa_unpack() {
 bb.warn('%s: the directory %s (%s) pointed to by the S variable 
doesn\'t exist - please set S within the recipe to point to where the source 
has been unpacked to' % (d.getVar('PN'), d.getVar('S', False), s_dir))
 }
 
+python do_qa_pseudo() {
+###
+# Check pseudo.log for unexpected errors
+#
+# Typical pseudo.log contains many "^path mismatch" lines for all the 
hardlinked files
+# e.g. in some smaller component I see 231/237 lines to be "^path 
mismatch" other 6
+# lines are setup and cleanup lines like this:
+# debug_logfile: fd 2
+# pid 7975 [parent 7974], doing new pid setup and server start
+# Setup complete, sending SIGUSR1 to pid 7974.
+# db cleanup for server shutdown, 17:33:58.787
+# memory-to-file backup complete, 17:33:58.787.
+# db cleanup finished, 17:33:58.787
+#
+# but if there is one of:
+# "^inode mismatch"
+# "^creat ignored for existing file"
+# "^creat for.*replaces existing"
+# then there might be some bigger issue which sometimes results in 
host-user-contaminated QA warnings
+###
+
+import subprocess
+
+pseudodir = d.getVar('PSEUDO_LOCALSTATEDIR')
+bb.note("Checking pseudo.log for common errors")
+pseudolog = os.path.join(pseudodir, "pseudo.log")
+statement = "grep" \
+" -e '^inode mismatch'" \
+" -e '^creat ignored for existing file'" \
+" -e '^creat for.*replaces existing'" \
+" %s" % pseudolog
+if subprocess.call("%s -q" % statement, shell=True) == 0:
+bb.fatal("This %s indicates errors, see %s or grep -v '^path mismatch' 
%s" % (pseudolog, statement, pseudolog))
+}
+
 # The Staging Func, to check all staging
 #addtask qa_staging after do_populate_sysroot before do_build
 do_populate_sysroot[postfuncs] += "do_qa_staging "
@@ -1249,6 +1284,11 @@ do_configure[postfuncs] += "do_qa_configure "
 # Check does S exist.
 do_unpack[postfuncs] += "do_qa_unpack"
 
+# Check pseudo.log