Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log
On Thu, 14 Sep 2017 15:53:09 +0100 "Burton, Ross" wrote: > Example: > > path mismatch [5 links]: ino 27539515 db > '/data/poky-tmp/master/build/work/corei7-64-poky-linux/dbus/1.10.20-r0/recipe-sysroot-native/usr/bin/x86_64-poky-linux/x86_64-poky-linux-ld.bfd' > req > '/data/poky-tmp/master/build/sysroots-components/x86_64/binutils-cross-x86_64/usr/bin/x86_64-poky-linux/x86_64-poky-linux-ld'. > > Looks like this is just warning that there is a symlink, right? Hard link, not symlink. Symlinks are recorded separately, "links" is the hard link count in the filesystem, which doesn't include symlinks. That message is almost always Harmless. Although strictly speaking, it probably indicates that one of the links was made outside of pseudo, which in theory shouldn't happen, I think. At least, I *think* the current intent is to always use an exact match (both path and inode) if possible, so it shouldn't see a mismatch unless there's a path that got used but that isn't in the database. -s -- ___ Openembedded-core mailing list Openembedded-core@lists.openembedded.org http://lists.openembedded.org/mailman/listinfo/openembedded-core
Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log
On 13 September 2017 at 23:09, Seebs wrote: > On Wed, 13 Sep 2017 22:46:59 +0200 > Martin Jansa wrote: > > > And some-other-unrelated-file is really some different file, not > > just hardlink to the same file from some different directory (like > > between WORKDIR and sysroot other "path mismatch" entries show). > > To clarify a thing: In practice, "path mismatch" with links > 1 is > probably harmless. I think I had a patch floating around to stop > alerting about it except when being paranoid. > Example: path mismatch [5 links]: ino 27539515 db '/data/poky-tmp/master/build/work/corei7-64-poky-linux/dbus/1.10.20-r0/recipe-sysroot-native/usr/bin/x86_64-poky-linux/x86_64-poky-linux-ld.bfd' req '/data/poky-tmp/master/build/sysroots-components/x86_64/binutils-cross-x86_64/usr/bin/x86_64-poky-linux/x86_64-poky-linux-ld'. Looks like this is just warning that there is a symlink, right? Ross -- ___ Openembedded-core mailing list Openembedded-core@lists.openembedded.org http://lists.openembedded.org/mailman/listinfo/openembedded-core
Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log
On Thu, 14 Sep 2017 00:28:34 +0200 Martin Jansa wrote: > OK, thanks. > > Any insights to the first issue described in commit message? That we > know that is an Actual Problem, but hard to reproduce it. This one? glibc-locale-2.26: glibc-locale: /glibc-binary-localedata-en-gb/usr/lib/locale/en_GB/LC_MEASUREMENT is owned by uid 3004, which is the same as the user running bitbake. This may be due to host contamination [host-user-contaminated] My first guess would be some kind of weird timing interaction involving symlinks or, if it's still there, the really weird copy of locale data into an intermediate location to move it from the glibc package to the glibc_locale package or something. But it seems like that should be reproducible, so I'm not sure. -s -- ___ Openembedded-core mailing list Openembedded-core@lists.openembedded.org http://lists.openembedded.org/mailman/listinfo/openembedded-core
Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log
On 13 September 2017 at 23:08, Seebs wrote: > On Wed, 13 Sep 2017 22:50:32 +0200 > Martin Jansa wrote: > > > inode mismatch: > > '/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/ > systemd/1_234-r0/recipe-sysroot/etc/group' > > ino 29361846 in db, 29361904 in request. > > Just to be clear about this: > > This is at least potentially an Actual Problem. It's probably not a > good idea to habitually ignore these. A thing like this happening means > that, at some point: > > 1. Something created etc/group, and it got inode 29361846 recorded in > the database. > 2. Something deleted etc/group, and did NOT record this to the database. > 3. Something created etc/group, with inode 29361904. > > It's possible that the thing creating etc/group didn't record it to the > database, and something else just referred to it later (like a stat > call), or that this was generated by the request creating the entry. > > Either way, it means changes were made to a pseudo-managed hunk of > filesystem without pseudo knowing about it, and that is generally bad. I just applied this patch and got an identical hit in systemd, which makes me think that the useradd class (maze of code that it is) is causing this instance. Ross -- ___ Openembedded-core mailing list Openembedded-core@lists.openembedded.org http://lists.openembedded.org/mailman/listinfo/openembedded-core
Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log
OK, thanks. Any insights to the first issue described in commit message? That we know that is an Actual Problem, but hard to reproduce it. I'm rebuilding one component over and over again and with pseudo 1.8.1 I got this issue in 5 rebuilds from 1000, now in the same build with latest pseudo 1.8.2 backported from oe-core/master I got this issue only 2 times in 1000 iterations. Hopefully this insane check will help discover similar issues even without triggering the QA host-user-contamination check, so that we can detect such issues even without really trying (it's now included in my regular "bitbake world" builds which quite often trigger this QA host-user-contamination issue in glibc-locale. And now we can correlate that with pseudo.log more easily. On Thu, Sep 14, 2017 at 12:08 AM, Seebs wrote: > On Wed, 13 Sep 2017 22:50:32 +0200 > Martin Jansa wrote: > > > inode mismatch: > > '/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/ > systemd/1_234-r0/recipe-sysroot/etc/group' > > ino 29361846 in db, 29361904 in request. > > Just to be clear about this: > > This is at least potentially an Actual Problem. It's probably not a > good idea to habitually ignore these. A thing like this happening means > that, at some point: > > 1. Something created etc/group, and it got inode 29361846 recorded in > the database. > 2. Something deleted etc/group, and did NOT record this to the database. > 3. Something created etc/group, with inode 29361904. > > It's possible that the thing creating etc/group didn't record it to the > database, and something else just referred to it later (like a stat > call), or that this was generated by the request creating the entry. > > Either way, it means changes were made to a pseudo-managed hunk of > filesystem without pseudo knowing about it, and that is generally bad. > > -s > -- > ___ > Openembedded-core mailing list > Openembedded-core@lists.openembedded.org > http://lists.openembedded.org/mailman/listinfo/openembedded-core > -- ___ Openembedded-core mailing list Openembedded-core@lists.openembedded.org http://lists.openembedded.org/mailman/listinfo/openembedded-core
Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log
On Wed, 13 Sep 2017 22:50:32 +0200 Martin Jansa wrote: > inode mismatch: > '/OE/build/oe-core/tmp-glibc/work/core2-64-oe-linux/systemd/1_234-r0/recipe-sysroot/etc/group' > ino 29361846 in db, 29361904 in request. Just to be clear about this: This is at least potentially an Actual Problem. It's probably not a good idea to habitually ignore these. A thing like this happening means that, at some point: 1. Something created etc/group, and it got inode 29361846 recorded in the database. 2. Something deleted etc/group, and did NOT record this to the database. 3. Something created etc/group, with inode 29361904. It's possible that the thing creating etc/group didn't record it to the database, and something else just referred to it later (like a stat call), or that this was generated by the request creating the entry. Either way, it means changes were made to a pseudo-managed hunk of filesystem without pseudo knowing about it, and that is generally bad. -s -- ___ Openembedded-core mailing list Openembedded-core@lists.openembedded.org http://lists.openembedded.org/mailman/listinfo/openembedded-core
Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log
On Wed, 13 Sep 2017 22:46:59 +0200 Martin Jansa wrote: > And some-other-unrelated-file is really some different file, not > just hardlink to the same file from some different directory (like > between WORKDIR and sysroot other "path mismatch" entries show). To clarify a thing: In practice, "path mismatch" with links > 1 is probably harmless. I think I had a patch floating around to stop alerting about it except when being paranoid. -s -- ___ Openembedded-core mailing list Openembedded-core@lists.openembedded.org http://lists.openembedded.org/mailman/listinfo/openembedded-core
Re: [OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log
On Wed, Sep 13, 2017 at 10:46:59PM +0200, Martin Jansa wrote: > * we often see QA warnings like: > glibc-locale-2.26: glibc-locale: > /glibc-binary-localedata-en-gb/usr/lib/locale/en_GB/LC_MEASUREMENT is owned > by uid 3004, which is the same as the user running bitbake. This may be due > to host contamination [host-user-contaminated] > glibc-locale-2.26: glibc-locale: > /glibc-binary-localedata-nn-no.iso-8859-1/usr/lib/locale/nn_NO.ISO-8859-1/LC_MEASUREMENT > is owned by uid 3004, which is the same as the user running bitbake. This > may be due to host contamination [host-user-contaminated] > but we don't know the root cause of it. > * the only theory we currently have is that it's a bug in pseudo when > inode is being reused for different files, which is supported by > pseudo.log entries: > > Good build: > pseudo$ grep -v "^path mismatch" pseudo.log > debug_logfile: fd 2 > pid 7975 [parent 7974], doing new pid setup and server start > Setup complete, sending SIGUSR1 to pid 7974. > db cleanup for server shutdown, 17:33:58.787 > memory-to-file backup complete, 17:33:58.787. > db cleanup finished, 17:33:58.787 > > Build with QA host-user-contaminated issue: > ERROR: foo-1.0.0-r0 do_package_qa: QA Issue: foo: file-with-wrong-UID is > owned > by uid 2001, which is the same as the user running bitbake. This may be due > to > host contamination [host-user-contaminated] > > pseudo$ grep "file-with-wrong-UID" pseudo.log > inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in > request. > creat ignored for existing file 'file-with-wrong-UID'. > inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in > request. > inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in > request. > inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in > request. > inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in > request. > inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in > request. > inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in > request. > path mismatch [1 link]: ino 13242270 db 'file-with-wrong-UID' req > 'some-other-unrelated-file'. > creat for 'some-other-unrelated-file' replaces existing 13242270 > ['file-with-wrong-UID']. > db cleanup for server shutdown, 02:16:23.685 > memory-to-file backup complete, 02:16:23.685. > db cleanup finished, 02:16:23.685 > > And some-other-unrelated-file is really some different file, not just > hardlink > to the same file from some different directory (like between WORKDIR and > sysroot > other "path mismatch" entries show). > > Signed-off-by: Martin Jansa > --- > meta/classes/insane.bbclass | 40 > 1 file changed, 40 insertions(+) > > diff --git a/meta/classes/insane.bbclass b/meta/classes/insane.bbclass > index 78b41caf99..617bf7011d 100644 > --- a/meta/classes/insane.bbclass > +++ b/meta/classes/insane.bbclass > @@ -1237,6 +1237,41 @@ python do_qa_unpack() { > bb.warn('%s: the directory %s (%s) pointed to by the S variable > doesn\'t exist - please set S within the recipe to point to where the source > has been unpacked to' % (d.getVar('PN'), d.getVar('S', False), s_dir)) > } > > +python do_qa_pseudo() { > + > ### > +# Check pseudo.log for unexpected errors > +# > +# Typical pseudo.log contains many "^path mismatch" lines for all the > hardlinked files > +# e.g. in some smaller component I see 231/237 lines to be "^path > mismatch" other 6 > +# lines are setup and cleanup lines like this: > +# debug_logfile: fd 2 > +# pid 7975 [parent 7974], doing new pid setup and server start > +# Setup complete, sending SIGUSR1 to pid 7974. > +# db cleanup for server shutdown, 17:33:58.787 > +# memory-to-file backup complete, 17:33:58.787. > +# db cleanup finished, 17:33:58.787 > +# > +# but if there is one of: > +# "^inode mismatch" > +# "^creat ignored for existing file" > +# "^creat for.*replaces existing" > +# then there might be some bigger issue which sometimes results in > host-user-contaminated QA warnings > + > ### > + > +import subprocess > + > +pseudodir = d.getVar('PSEUDO_LOCALSTATEDIR') > +bb.note("Checking pseudo.log for common errors") > +pseudolog = os.path.join(pseudodir, "pseudo.log") > +statement = "grep" \ > +" -e '^inode mismatch'" \ > +" -e '^creat ignored for existing file'" \ > +" -e '^creat for.*replaces existing'" \ > +" %s" % pseudolog > +if subprocess.call("%s -q" % statement, shell=True) == 0: > +bb.fatal("This %s indicates errors, see %s or grep -v '^path > mismatch' %s" % (pseudolog, statement, pseudolog)) > +} > + > #
[OE-core] [RFC][PATCH] insane.bbclass: Add do_qa_pseudo function to check for common errors listed in pseudo.log
* we often see QA warnings like: glibc-locale-2.26: glibc-locale: /glibc-binary-localedata-en-gb/usr/lib/locale/en_GB/LC_MEASUREMENT is owned by uid 3004, which is the same as the user running bitbake. This may be due to host contamination [host-user-contaminated] glibc-locale-2.26: glibc-locale: /glibc-binary-localedata-nn-no.iso-8859-1/usr/lib/locale/nn_NO.ISO-8859-1/LC_MEASUREMENT is owned by uid 3004, which is the same as the user running bitbake. This may be due to host contamination [host-user-contaminated] but we don't know the root cause of it. * the only theory we currently have is that it's a bug in pseudo when inode is being reused for different files, which is supported by pseudo.log entries: Good build: pseudo$ grep -v "^path mismatch" pseudo.log debug_logfile: fd 2 pid 7975 [parent 7974], doing new pid setup and server start Setup complete, sending SIGUSR1 to pid 7974. db cleanup for server shutdown, 17:33:58.787 memory-to-file backup complete, 17:33:58.787. db cleanup finished, 17:33:58.787 Build with QA host-user-contaminated issue: ERROR: foo-1.0.0-r0 do_package_qa: QA Issue: foo: file-with-wrong-UID is owned by uid 2001, which is the same as the user running bitbake. This may be due to host contamination [host-user-contaminated] pseudo$ grep "file-with-wrong-UID" pseudo.log inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request. creat ignored for existing file 'file-with-wrong-UID'. inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request. inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request. inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request. inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request. inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request. inode mismatch: 'file-with-wrong-UID' ino 13242270 in db, 13242271 in request. path mismatch [1 link]: ino 13242270 db 'file-with-wrong-UID' req 'some-other-unrelated-file'. creat for 'some-other-unrelated-file' replaces existing 13242270 ['file-with-wrong-UID']. db cleanup for server shutdown, 02:16:23.685 memory-to-file backup complete, 02:16:23.685. db cleanup finished, 02:16:23.685 And some-other-unrelated-file is really some different file, not just hardlink to the same file from some different directory (like between WORKDIR and sysroot other "path mismatch" entries show). Signed-off-by: Martin Jansa --- meta/classes/insane.bbclass | 40 1 file changed, 40 insertions(+) diff --git a/meta/classes/insane.bbclass b/meta/classes/insane.bbclass index 78b41caf99..617bf7011d 100644 --- a/meta/classes/insane.bbclass +++ b/meta/classes/insane.bbclass @@ -1237,6 +1237,41 @@ python do_qa_unpack() { bb.warn('%s: the directory %s (%s) pointed to by the S variable doesn\'t exist - please set S within the recipe to point to where the source has been unpacked to' % (d.getVar('PN'), d.getVar('S', False), s_dir)) } +python do_qa_pseudo() { +### +# Check pseudo.log for unexpected errors +# +# Typical pseudo.log contains many "^path mismatch" lines for all the hardlinked files +# e.g. in some smaller component I see 231/237 lines to be "^path mismatch" other 6 +# lines are setup and cleanup lines like this: +# debug_logfile: fd 2 +# pid 7975 [parent 7974], doing new pid setup and server start +# Setup complete, sending SIGUSR1 to pid 7974. +# db cleanup for server shutdown, 17:33:58.787 +# memory-to-file backup complete, 17:33:58.787. +# db cleanup finished, 17:33:58.787 +# +# but if there is one of: +# "^inode mismatch" +# "^creat ignored for existing file" +# "^creat for.*replaces existing" +# then there might be some bigger issue which sometimes results in host-user-contaminated QA warnings +### + +import subprocess + +pseudodir = d.getVar('PSEUDO_LOCALSTATEDIR') +bb.note("Checking pseudo.log for common errors") +pseudolog = os.path.join(pseudodir, "pseudo.log") +statement = "grep" \ +" -e '^inode mismatch'" \ +" -e '^creat ignored for existing file'" \ +" -e '^creat for.*replaces existing'" \ +" %s" % pseudolog +if subprocess.call("%s -q" % statement, shell=True) == 0: +bb.fatal("This %s indicates errors, see %s or grep -v '^path mismatch' %s" % (pseudolog, statement, pseudolog)) +} + # The Staging Func, to check all staging #addtask qa_staging after do_populate_sysroot before do_build do_populate_sysroot[postfuncs] += "do_qa_staging " @@ -1249,6 +1284,11 @@ do_configure[postfuncs] += "do_qa_configure " # Check does S exist. do_unpack[postfuncs] += "do_qa_unpack" +# Check pseudo.log