[Kernel-packages] [Bug 1387214] Re: file corruption on touch images in rw portions of the filesystem
After a lot of deep digging into the bind mount, loop driver, and buffer cache, and tracking the corrupt pages back down the layers of the stack, we've narrowed this down to the image. The smoking gun was the kernel message:

  Nov 6 12:15:16 ubuntu-phablet kernel: [3.940485] do_mount: /dev/loop0 - /root [null]
  Nov 6 12:15:16 ubuntu-phablet kernel: [3.941095] EXT2-fs (loop0): warning: mounting unchecked fs, running e2fsck is recommended
  Nov 6 12:15:16 ubuntu-phablet kernel: [3.941431] do_mount return - 0

(apologies for my extra debug). So it appears that /dev/loop0 is being mounted and it is corrupted. I ran fsck on /userdata/system.img and /userdata/ubuntu.img only to find that the system.img needed some fixing:

  fsck /userdata/system.img
  fsck from util-linux 2.25
  e2fsck 1.42.10 (18-May-2014)
  /userdata/system.img was not cleanly unmounted, check forced.
  Pass 1: Checking inodes, blocks, and sizes
  Pass 2: Checking directory structure
  Pass 3: Checking directory connectivity
  Pass 3A: Optimizing directories
  Pass 4: Checking reference counts
  Unattached inode 3225
  Connect to /lost+found<y>? yes
  Inode 3225 ref count is 2, should be 1. Fix<y>? yes
  Unattached inode 3709
  Connect to /lost+found<y>? yes
  Inode 3709 ref count is 2, should be 1. Fix<y>? yes
  Unattached inode 3808
  Connect to /lost+found<y>? yes
  Inode 3808 ref count is 2, should be 1. Fix<y>? yes
  Unattached inode 4427
  Connect to /lost+found<y>? yes
  Inode 4427 ref count is 2, should be 1. Fix<y>? yes
  Unattached inode 4485
  Connect to /lost+found<y>? yes
  Inode 4485 ref count is 2, should be 1. Fix<y>? yes
  Unattached inode 5889
  Connect to /lost+found<y>? yes
  Inode 5889 ref count is 2, should be 1. Fix<y>? yes
  Unattached inode 5943
  Connect to /lost+found<y>? yes
  Inode 5943 ref count is 2, should be 1. Fix<y>? yes
  Unattached inode 7853
  Connect to /lost+found<y>? yes
  Inode 7853 ref count is 2, should be 1. Fix<y>? yes
  Pass 5: Checking group summary information
  Block bitmap differences: -70903 -71144 -71201 -(71674--71675) -71727 -71852 -72689 -72757 -(74519--74520) -74869 -74961 +(92082--92087) +(92089--92092) -92102 +92104 +92114 +92119 +(92121--92131)
  Fix<y>? yes
  Free blocks count wrong for group #13 (8813, counted=8820).
  Fix<y>? yes
  Free blocks count wrong (133222, counted=133229).
  Fix<y>? yes
  Inode bitmap differences: +(19989--20010) +(20013--20014) -(20545--20549) -(20551--20569)
  Fix<y>? yes
  Free inodes count wrong for group #13 (3225, counted=3232).
  Fix<y>? yes
  Directories count wrong for group #13 (761, counted=760).
  Fix<y>? yes
  Free inodes count wrong (81946, counted=81953).
  Fix<y>? yes
  /userdata/system.img: * FILE SYSTEM WAS MODIFIED *
  /userdata/system.img: * REBOOT LINUX *

So, there are two big issues outstanding, most probably in the user space shutdown and initrd stages:

1. The file system is not being flushed and unmounted properly.
2. The file system is not being fsck'd before mounting - this is a cardinal sin IMHO.

The end result is mounting a corrupt file system that is causing the garbage in the apparmor files.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1387214

Title:
  file corruption on touch images in rw portions of the filesystem

Status in “linux” package in Ubuntu:
  Confirmed

Bug description:
  Symptoms are that cache files in /var/cache/apparmor and profiles in /var/lib/apparmor/profiles are sometimes corrupted after a reboot. We've already fixed several bugs in apparmor and click-apparmor and made both more robust in the face of corruption, and we've reduced the impact when there is a corrupted profile, but we've still not found the cause of the corruption.

  This corruption can still affect real-world devices: if a profile in /var/lib/apparmor/profiles is corrupted and the cache file is out of date, then the profile won't compile and that app/scope won't start. Workaround: remove the affected profile and then run 'sudo aa-clickhook'. This obviously is not viable on an end-user device.

  The investigation is ongoing and this may not be a problem with the kernel at all, so this bug may be retargeted to another project. The security team and the kernel team have discussed this a lot and Colin is currently looking at this. This bug is just so it can be tracked.

  Here is an excerpt from my latest email to Colin:

  I believe I have conclusively ruled out apparmor_parser and aa-clickhook by creating a new 'home/bug/test-with-true.sh'. Here is the test output:
  http://paste.ubuntu.com/8648109/

  Specifically, home/bug/test-with-true.sh changes the interesting parts of the algorithm to:
  1. wait for unity8 to start (this ensures the apparmor upstart job is finished)
  2. restore apparmor_parser and aa-clickhook, if needed
  3. if /home/bug/profiles... exists, perform a diff -Naur /home/bug/profiles... /var/lib/apparmor/profiles and fail if differences (note,
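The "mounting unchecked fs" warning quoted in this message is what the ext2 driver prints when the superblock's state field does not have the clean bit set. As a minimal sketch (not the initrd's actual logic, which this thread does not show), the same field could be inspected before mounting. The offsets and flag values below are those of the standard ext2 on-disk superblock; the function names are made up for illustration.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # the ext2 superblock starts 1024 bytes into the image
S_MAGIC_OFFSET = 56        # s_magic (2 bytes), immediately followed by s_state
EXT2_MAGIC = 0xEF53
EXT2_VALID_FS = 0x0001     # s_state bit: file system was cleanly unmounted
EXT2_ERROR_FS = 0x0002     # s_state bit: errors were detected


def ext2_state(image: bytes) -> dict:
    """Decode s_magic and s_state from the first 2 KiB of an ext2 image."""
    magic, state = struct.unpack_from(
        "<HH", image, SUPERBLOCK_OFFSET + S_MAGIC_OFFSET)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2 superblock (magic 0x%04x)" % magic)
    return {
        "clean": bool(state & EXT2_VALID_FS),
        "errors": bool(state & EXT2_ERROR_FS),
    }


def needs_fsck(image: bytes) -> bool:
    """Hypothetical policy: check anything not marked clean, or marked with errors."""
    st = ext2_state(image)
    return (not st["clean"]) or st["errors"]
```

To try it on a real image, pass in the first 2048 bytes of the file, for example `needs_fsck(open("/userdata/system.img", "rb").read(2048))`. This was not run against the image discussed here.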
[Kernel-packages] [Bug 1387214] Re: file corruption on touch images in rw portions of the filesystem
** Tags added: rtm14

** Package changed: linux (Ubuntu) => system-image (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1387214

Title:
  file corruption on touch images in rw portions of the filesystem

Status in “android” package in Ubuntu:
  Triaged
Status in “initramfs-tools-ubuntu-touch” package in Ubuntu:
  Triaged
Status in “system-image” package in Ubuntu:
  Triaged

Bug description:
  Symptoms are that cache files in /var/cache/apparmor and profiles in /var/lib/apparmor/profiles are sometimes corrupted after a reboot. We've already fixed several bugs in the apparmor and click-apparmor and made both more robust in the face of corruption and we've reduced the impact when there is a corrupted profile, but we've still not found the cause of the corruption.

  This corruption can still affect real-world devices: if a profile in /var/lib/apparmor/profiles is corrupted and the cache file is out of date, then the profile won't compile and that app/scope won't start. Workaround: remove the affected profile and then run 'sudo aa-clickhook'. This obviously is not viable on an end-user device.

  The investigation is ongoing and this may not be a problem with the kernel at all, so this bug may be retargeted to another project. The security team and the kernel team have discussed this a lot and Colin is currently looking at this. This bug is just so it can be tracked.

  Here is an excerpt from my latest email to Colin:

  I believe I have conclusively ruled out apparmor_parser and aa-clickhook by creating a new 'home/bug/test-with-true.sh'. Here is the test output:
  http://paste.ubuntu.com/8648109/

  Specifically, home/bug/test-with-true.sh changes the interesting parts of the algorithm to:
  1. wait for unity8 to start (this ensures the apparmor upstart job is finished)
  2. restore apparmor_parser and aa-clickhook, if needed
  3. if /home/bug/profiles... exists, perform a diff -Naur /home/bug/profiles... /var/lib/apparmor/profiles and fail if differences (note, apparmor_parser and aa-clickhook were /bin/true during boot so they could not have changed /var/lib/apparmor/profiles)
  4. verify the profiles, exit with error if they do not
  5. alternately upgrade/downgrade the packages
  6. verify the profiles, exit with error if they do not
  7. copy the known good profiles in the previous step to /home/bug/profiles...
  8. have apparmor_parser and aa-clickhook point to /bin/true
  9. reboot
  10. go to step 1

  In the paste you'll notice that in step 6 the profiles were successfully created by the installation of the packages, then verified, then copied aside, then apparmor_parser and aa-clickhook diverted, then rebooted, only to have the profiles in /var/lib/apparmor/profiles be different than what was copied aside.

  It would be nice to verify on your device as well (I reproduced several times here) and verify the reproducer algorithm. I think this suggests this is a kernel issue and not userspace.

  IMPORTANT: you will want to update the reproducer and refollow all of these steps (ie, I updated the scripts, the debs, the sudoers file, etc):

  $ wget http://people.canonical.com/~jamie/cking/aa-corruption.tar.gz
  $ tar -zxvf ./aa-corruption.tar.gz
  ...
  $ adb push ./aa-corruption.tar.gz /tmp
  $ adb shell
  phablet@ubuntu-phablet:~$ cd /tmp
  phablet@ubuntu-phablet:~$ tar -zxvf ./aa-corruption.tar.gz
  phablet@ubuntu-phablet:~$ sudo mount -o remount,rw /
  phablet@ubuntu-phablet:~$ sudo cp ./aa-corruption/etc/sudoers.d/phablet /etc/sudoers.d/
  phablet@ubuntu-phablet:~$ sudo mount -o remount,ro /
  phablet@ubuntu-phablet:~$ sudo cp -a ./aa-corruption/home/bug /home
  phablet@ubuntu-phablet:~$ exit
  $ cd ./aa-corruption
  $ ./test-from-host.sh
  ...

  The old script is still in place. Simply adjust ./test-from-host.sh to have:

  testscript=/home/bug/test.sh
  #testscript=/home/bug/test-with-true.sh

  The kernel team has verified the above reproducer and symptoms.

  Related bugs:
  * bug 1371771
  * bug 1371765
  * bug 1377338

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/android/+bug/1387214/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
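Step 3 of the reproducer in the bug description compares the copied-aside profiles with the live ones using diff -Naur. The same check can be sketched in Python as below; this is an illustration of the idea, not the contents of test-with-true.sh, and the function name is hypothetical. It compares file contents byte by byte, since a comparison based only on size and timestamps may miss a page of zeros written into a file whose length is unchanged.

```python
import filecmp
import os


def changed_files(known_good: str, live: str) -> list:
    """Return relative paths that differ or exist on only one side."""
    changed = []

    def walk(cmp: filecmp.dircmp, prefix: str) -> None:
        # Entries present in only one of the two directories.
        for name in cmp.left_only + cmp.right_only:
            changed.append(os.path.join(prefix, name))
        # dircmp's own file comparison is shallow (stat-based), so
        # re-check the common files by content.
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False)
        for name in mismatch + errors:
            changed.append(os.path.join(prefix, name))
        # Recurse into directories present on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    # ignore=[] disables dircmp's default list of skipped names.
    walk(filecmp.dircmp(known_good, live, ignore=[]), "")
    return sorted(changed)
```

A test loop along the lines of the reproducer would fail the run whenever `changed_files(saved_dir, "/var/lib/apparmor/profiles")` is non-empty after a reboot, where `saved_dir` stands for the elided /home/bug/profiles... path.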
[Kernel-packages] [Bug 1387214] Re: file corruption on touch images in rw portions of the filesystem
Made some progress today. On the phone, I am seeing /var/lib/apparmor/profiles/click_com.ubuntu.filemanager_filemanager_0.3.275 containing a pathname and all zeros. The start is always on a page boundary and the end is always on a page boundary.

I copied the entire partition /dev/mmcblk0p23 over adb back to my laptop, mounted it and then mounted /mnt/ubuntu.img, and the same file is sane and not corrupted. So the underlying data is OK.

The corrupted data contains:

  /usr/share/click/preinstalled/com.ubuntu.music/1.3.625/apparmor.json

I cannot find any symlinks that would relate to this. Next step: I'm adding debug into the symlink name to see if this appears in the corrupt data, to verify it is a symlink.
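This message reports that the damaged region of the profile starts and ends on a page boundary and consists of a pathname followed by zeros. A rough way to look for that pattern, given a known good copy and the copy read on the device, is sketched below. The 4096-byte page size is an assumption (the thread does not state it), and the helper names are illustrative only.

```python
PAGE_SIZE = 4096  # assumption: the report does not state the page size


def pages(data: bytes, page_size: int = PAGE_SIZE):
    """Yield (index, chunk) for each page-sized chunk of data."""
    for off in range(0, len(data), page_size):
        yield off // page_size, data[off:off + page_size]


def zero_pages(data: bytes, page_size: int = PAGE_SIZE) -> list:
    """Indices of pages that contain only zero bytes."""
    return [i for i, chunk in pages(data, page_size) if not any(chunk)]


def differing_pages(good: bytes, bad: bytes,
                    page_size: int = PAGE_SIZE) -> list:
    """Indices of pages whose contents differ between the two copies."""
    count = (max(len(good), len(bad)) + page_size - 1) // page_size
    return [i for i in range(count)
            if good[i * page_size:(i + 1) * page_size]
            != bad[i * page_size:(i + 1) * page_size]]
```

If the corruption really is page-granular, every byte that differs between the two copies falls inside the pages returned by `differing_pages`, and the bytes just before and after each run of damaged pages match the good copy.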
[Kernel-packages] [Bug 1387214] Re: file corruption on touch images in rw portions of the filesystem
The corruption to /var/lib/apparmor/profiles/click_com.ubuntu.filemanager_filemanager_0.3.275 survives multiple reboots. I'll take another 6GB snapshot of the underlying partition and see if that's now corrupted.
[Kernel-packages] [Bug 1387214] Re: file corruption on touch images in rw portions of the filesystem
After several reboots, the data still appears corrupted on the phone, but copying the underlying raw device /dev/mmcblk0p23 to my laptop and loop mounting it and then loop mounting ubuntu.img shows an uncorrupted var/lib/apparmor/profiles/click_com.ubuntu.filemanager_filemanager_0.3.275. I'm now going to test this on another device with a 3.4 kernel to see what I get.
[Kernel-packages] [Bug 1387214] Re: file corruption on touch images in rw portions of the filesystem
Ruled out the bind mount of /var/lib/apparmor/profiles on /userdata/system-data/var/lib/apparmor/profiles; I still see corruption there on the device.
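When reasoning about the bind mount mentioned in this message, it can help to see which mounts expose only a subdirectory of their file system. The sketch below parses text in the format of /proc/self/mountinfo; the field layout is the documented one, while the function names and the "root is not /" heuristic are assumptions made for illustration (that heuristic also matches some non-bind mounts, such as subvolume mounts).

```python
def parse_mountinfo(text: str) -> list:
    """Parse the contents of /proc/self/mountinfo into dictionaries."""
    mounts = []
    for line in text.splitlines():
        if " - " not in line:
            continue  # skip blank or malformed lines
        left, right = line.split(" - ", 1)
        head = left.split()
        tail = right.split()
        mounts.append({
            "root": head[3],         # path within the file system that is mounted
            "mount_point": head[4],  # where it appears in this namespace
            "fstype": tail[0],
            "source": tail[1],
        })
    return mounts


def likely_bind_mounts(text: str) -> list:
    """Mounts whose root is not '/', which is typical of bind mounts."""
    return [m for m in parse_mountinfo(text) if m["root"] != "/"]
```

On a device this would be fed the real file, for example `likely_bind_mounts(open("/proc/self/mountinfo").read())`.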
[Kernel-packages] [Bug 1387214] Re: file corruption on touch images in rw portions of the filesystem
Sanity checked the raw data from /dev/mmcblk0p23:

1. copied raw data off the phone to my laptop
2. using sshfs, mounted the directory containing the raw data snapshot back on the phone
3. loop mounted it
4. loop mounted ubuntu.img from this
5. /ubuntu/var/lib/apparmor/profiles is sane, no corruption
[Kernel-packages] [Bug 1387214] Re: file corruption on touch images in rw portions of the filesystem
On the phone:

  debugfs /userdata/ubuntu.img
  cat /var/lib/apparmor/profiles/click_com.ubuntu.filemanager_filemanage
  # vim:syntax=apparmor
  #include tunables/global
  # Define vars with unconfined since autopilot rules may reference them
  # Specified profile variables
  @{APP_APPNAME}=filemanager
  @{APP_ID_DBUS}=com_2eubuntu_2efilemanager_5ffilemanager_5f0_2e3_2e275
  @{APP_PKGNAME_DBUS}=com_2eubuntu_2efilemanager
  @{APP_PKGNAME}=com.ubuntu.filemanager
  @{APP_VERSION}=0.3.275
  @{CLICK_DIR}=/usr/share/click/preinstalled
  .. etc

so the underlying file system is verified as sane
[Kernel-packages] [Bug 1387214] Re: file corruption on touch images in rw portions of the filesystem
I've searched the entire block device for the string /usr/share/click/preinstalled/com.ubuntu.music/1.3.625/apparmor.json and tagged it in such a way that it is obvious that it has been modified on the flash drive. I rebooted and double checked: the modified data is still modified on disk, however the corrupted file contains the original data.

So the underlying file system is sane. The in-memory view of the file seems borked.
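The message above does not say which tool was used to search the block device. As a rough sketch only, assuming GNU grep is available, a fixed string can be located in a raw device or image like this; find_string_offsets is a made-up helper name, and the device to search is left as an argument because the thread does not name it.

```shell
#!/bin/sh
# Hypothetical helper: print the byte offset of every occurrence of a
# fixed string in a block device or image file (assumes GNU grep).
find_string_offsets() {
    needle=$1
    image=$2
    # -a: treat binary data as text, -b: print byte offsets,
    # -o: print only the matching part, -F: fixed string, no regex.
    grep -a -b -o -F -- "$needle" "$image" | cut -d: -f1
}
```

Run as root against the raw device, for example `find_string_offsets /usr/share/click/preinstalled/com.ubuntu.music/1.3.625/apparmor.json "$device"`. Each offset printed marks a place where the string lives on the medium itself, so data read from there can be compared with what the mounted file system reports.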
[Kernel-packages] [Bug 1387214] Re: file corruption on touch images in rw portions of the filesystem
Added application-confinement and apparmor tags since this bug affects both and it will be easier to find. ** Description changed: Symptoms are that cache files in /var/cache/apparmor and profiles in /var/lib/apparmor/profiles are sometimes corrupted after a reboot. We've already fixed several bugs in the apparmor and click-apparmor and made both more robust in the face of corruption, but we've still not found the cause. The investigation is ongoing and this may not be a problem with the kernel at all, so this bug may be retargeted to another project. The security team and the kernel team have discussed this a lot and Colin is currently looking at this. This bug is just so it can be tracked. Here is an excerpt from my latest email to Colin: I believe I have conclusively ruled out apparmor_parser and aa- clickhook by creating a new 'home/bug/test-with-true.sh'. Here is the test output: http://paste.ubuntu.com/8648109/ Specifically, home/bug/test-with-true.sh changes the interesting parts of the algorithm to: 1. wait for unity8 to start (this ensures the apparmor upstart job is finished) 2. restore apparmor_parser and aa-clickhook, if needed 3. if /home/bug/profiles... exists, perform a diff -Naur /home/bug/profiles... /var/lib/apparmor/profiles and fail if differences (note, apparmor_parser and aa-clickhook were /bin/true during boot so they could not have changed /var/lib/apparmor/profiles) 4. verify the profiles, exit with error if they do not 5. alternately upgrade/downgrade the packages 6. verify the profiles, exit with error if they do not 7. copy the known good profiles in the previous step to /home/bug/profiles... 8. have apparmor_parser and aa-clickhook point to /bin/true 9. reboot 10. 
go to step 1 In the paste you'll notice that in step 6 the profiles were successfully created by the installation of the packages, then verified, then copied aside, then apparmor_parser and aa-clickhook diverted, then rebooted, only to have the profiles in /var/lib/apparmor/profiles be different than what was copied aside. It would be nice to verify on your device as well (I reproduced several times here) and verify the reproducer algorithm. I think this suggests this is a kernel issue and not userspace. IMPORTANT: you will want to update the reproducer and refollow all of these steps (ie, I updated the scripts, the debs, the sudoers file, etc): $ wget http://people.canonical.com/~jamie/cking/aa-corruption.tar.gz $ tar -zxvf ./aa-corruption.tar.gz ... $ adb push ./aa-corruption.tar.gz /tmp $ adb shell phablet@ubuntu-phablet:~$ cd /tmp phablet@ubuntu-phablet:~$ tar -zxvf ./aa-corruption.tar.gz phablet@ubuntu-phablet:~$ sudo mount -o remount,rw / phablet@ubuntu-phablet:~$ sudo cp ./aa-corruption/etc/sudoers.d/phablet /etc/sudoers.d/ phablet@ubuntu-phablet:~$ sudo mount -o remount,ro / phablet@ubuntu-phablet:~$ sudo cp -a ./aa-corruption/home/bug /home phablet@ubuntu-phablet:~$ exit $ cd ./aa-corruption $ ./test-from-host.sh ... The old script is still in place. Simply adjust ./test-from-host.sh to have: testscript=/home/bug/test.sh #testscript=/home/bug/test-with-true.sh + The kernel team has been able to confirm the symptoms. - The kernel team has been able to confirm the symptoms. + References: + * bug 1371771 + * bug 1371765 + * bug 1377338 ** Description changed: Symptoms are that cache files in /var/cache/apparmor and profiles in /var/lib/apparmor/profiles are sometimes corrupted after a reboot. We've already fixed several bugs in the apparmor and click-apparmor and made both more robust in the face of corruption, but we've still not found the cause. 
The investigation is ongoing and this may not be a problem with the kernel at all, so this bug may be retargeted to another project. The security team and the kernel team have discussed this a lot and Colin is currently looking at this. This bug is just so it can be tracked. Here is an excerpt from my latest email to Colin: I believe I have conclusively ruled out apparmor_parser and aa- clickhook by creating a new 'home/bug/test-with-true.sh'. Here is the test output: http://paste.ubuntu.com/8648109/ Specifically, home/bug/test-with-true.sh changes the interesting parts of the algorithm to: 1. wait for unity8 to start (this ensures the apparmor upstart job is finished) 2. restore apparmor_parser and aa-clickhook, if needed 3. if /home/bug/profiles... exists, perform a diff -Naur /home/bug/profiles... /var/lib/apparmor/profiles and fail if differences (note, apparmor_parser and aa-clickhook were /bin/true during boot so they could not have changed /var/lib/apparmor/profiles) 4. verify the profiles, exit with error if they do not 5. alternately upgrade/downgrade the packages 6. verify the profiles, exit with error if they do not
[Kernel-packages] [Bug 1387214] Re: file corruption on touch images in rw portions of the filesystem
** Tags added: kernel-key