Re: data modified on freelist, tmpfs-related?
Date: Wed, 30 Apr 2014 13:39:20 +0100 From: Stuart Henderson st...@openbsd.org Seen when running e2fsprogs regression tests with /tmp on tmpfs I'm not surprised; tmpfs contains some serious bugs. I recommend not using it until those are fixed. Data modified on freelist: word -35183628471970 of object 0x80d36c00 size 0x400 previous type free (invalid addr 0xf40858de1f81cbe9) panic: Data modified on freelist: word 4 of object 0x80d36c00 size 0x400 previous type free (0x0 != 0xdeafbead) Stopped at Debugger+0x5: leave RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC! IF RUNNING SMP, USE 'mach ddbcpu #' AND 'trace' ON OTHER PROCESSORS, TOO. DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION! ddb{1} Debugger() at Debugger+0x5 panic() at panic+0xfe malloc() at malloc+0x697 hashinit() at hashinit+0x3b uao_grow_hash() at uao_grow_hash+0x70 tmpfs_reg_resize() at tmpfs_reg_resize+0xe4 tmpfs_write() at tmpfs_write+0xfd VOP_WRITE() at VOP_WRITE+0x3f vn_write() at vn_write+0x98 dofilewritev() at dofilewritev+0x1c5 sys_write() at sys_write+0xaa syscall() at syscall+0x297 --- syscall (number 4) --- end of kernel end trace frame: 0x7f7cfd50, count: -12 0xb25a740770a: ddb{1} ds 0x296 es0x6930 fs0x6900 gs0xd1ee rdi 0x1 rsi0x296 rbp 0x800033276920 rbx 0x817dac70seltrue_filtops+0xa10 rdx0 rcx 0x801c7000 rax 0x1 r80x800033276840 r9 0 r100 r110 r120x100 r13 0x800033276930 r14 0xa r15 0x5 rip 0x813403e5Debugger+0x5 cs 0x8 rflags 0x202 rsp 0x800033276920 ss 0x10 Debugger+0x5: leave The end of the dmesg buffer was overwritten by kernel output from the following boot, but I have screenshots of bcstats, uvmexp and the first page of ps output (active process is gunzip) here: https://drive.google.com/folderview?id=0B8t-sinTZPnucERodGdCcTc2Mmcusp=sharing OpenBSD 5.5-current (GENERIC.MP) #6: Sun Apr 27 14:42:50 BST 2014 st...@bamboo.spacehopper.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 8451125248 (8059MB) avail mem = 8217436160 (7836MB) mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.6 @ 0xdae9c000 (66 entries) bios0: vendor LENOVO version 8DET63WW (1.33 ) date 07/19/2012 bios0: LENOVO 4287CTO acpi0 at bios0: rev 2 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP SLIC SSDT SSDT SSDT HPET APIC MCFG ECDT ASF! TCPA SSDT SSDT UEFI UEFI UEFI acpi0: wakeup devices LID_(S3) SLPB(S3) IGBE(S4) EXP4(S4) EXP7(S4) EHC1(S3) EHC2(S3) HDEF(S4) acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 14318179 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz, 2791.30 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,LONG,LAHF,PERF,ITSC cpu0: 256KB 64b/line 8-way L2 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 99MHz cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz, 2790.94 MHz cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,LONG,LAHF,PERF,ITSC cpu1: 256KB 64b/line 8-way L2 cache cpu1: smt 1, core 0, package 0 cpu2 at mainbus0: apid 2 (application processor) cpu2: Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz, 2790.94 MHz cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,LONG,LAHF,PERF,ITSC cpu2: 256KB 64b/line 8-way L2 cache cpu2: smt 0, core 1, package 0 cpu3 at mainbus0: apid 3 (application processor) cpu3: Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz, 2790.94 MHz cpu3:
Re: data modified on freelist, tmpfs-related?
On Wed, Apr 30, 2014 at 03:38:39PM +0200, Mark Kettenis wrote: Date: Wed, 30 Apr 2014 13:39:20 +0100 From: Stuart Henderson st...@openbsd.org Seen when running e2fsprogs regression tests with /tmp on tmpfs I'm not surprised; tmpfs contains some serious bugs. I recommend not using it until those are fixed. On the other hand, if we don't use it, we won't find bugs... ;-) -- Antoine
Re: data modified on freelist, tmpfs-related?
On 30 April 2014 15:55, Mark Kettenis mark.kette...@xs4all.nl wrote: Date: Wed, 30 Apr 2014 15:38:39 +0200 (CEST) From: Mark Kettenis mark.kette...@xs4all.nl Date: Wed, 30 Apr 2014 13:39:20 +0100 From: Stuart Henderson st...@openbsd.org Seen when running e2fsprogs regression tests with /tmp on tmpfs I'm not surprised; tmpfs contains some serious bugs. I recommend not using it until those are fixed. Which means, I'd like somebody else besides espie@ to comment on my uvm_aobj.c list manipulation hack diff. Diff made sense to me when I looked at it, but I would rather hide direct pointer access :/ Perhaps LIST_SWAP does a tiny bit more, but it's cleaner and perhaps can be useful in the future.
Re: data modified on freelist, tmpfs-related?
From: Mike Belopuhov m...@belopuhov.com Date: Wed, 30 Apr 2014 16:00:45 +0200 On 30 April 2014 15:55, Mark Kettenis mark.kette...@xs4all.nl wrote: Date: Wed, 30 Apr 2014 15:38:39 +0200 (CEST) From: Mark Kettenis mark.kette...@xs4all.nl Date: Wed, 30 Apr 2014 13:39:20 +0100 From: Stuart Henderson st...@openbsd.org Seen when running e2fsprogs regression tests with /tmp on tmpfs I'm not surprised; tmpfs contains some serious bugs. I recommend not using it until those are fixed. Which means, I'd like somebody else besides espie@ to comment on my uvm_aobj.c list manipulation hack diff. Diff made sense to me when I looked at it, but I would rather hide direct pointer access :/ Perhaps LIST_SWAP does a tiny bit more, but it's cleaner and perhaps can be useful in the future. I'm not comfortable with introducing more sys/queue.h APIs. So perhaps we should just punt on the optimization and remove/insert all list items. Removing the trap comments that pedro set up... Index: uvm_aobj.c === RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v retrieving revision 1.61 diff -u -p -r1.61 uvm_aobj.c --- uvm_aobj.c 13 Apr 2014 23:14:15 - 1.61 +++ uvm_aobj.c 30 Apr 2014 14:52:33 - @@ -431,6 +431,7 @@ uao_shrink_hash(struct uvm_object *uobj, { struct uvm_aobj *aobj = (struct uvm_aobj *)uobj; struct uao_swhash *new_swhash; + struct uao_swhash_elt *elt; unsigned long new_hashmask; int i; @@ -456,8 +457,13 @@ uao_shrink_hash(struct uvm_object *uobj, * Even though the hash table size is changing, the hash of the buckets * we are interested in copying should not change. */ - for (i = 0; i UAO_SWHASH_BUCKETS(aobj-u_pages); i++) - LIST_FIRST(new_swhash[i]) = LIST_FIRST(aobj-u_swhash[i]); + for (i = 0; i UAO_SWHASH_BUCKETS(aobj-u_pages); i++) { + while (LIST_EMPTY(aobj-u_swhash[i]) == 0) { + elt = LIST_FIRST(aobj-u_swhash[i]); + LIST_REMOVE(elt, list); + LIST_INSERT_HEAD(new_swhash[i], elt, list); + } + } free(aobj-u_swhash, M_UVMAOBJ); @@ -609,7 +615,6 @@ uao_grow_hash(struct uvm_object *uobj, i return ENOMEM; for (i = 0; i UAO_SWHASH_BUCKETS(aobj-u_pages); i++) { - /* XXX pedro: shouldn't copying the list pointers be enough? */ while (LIST_EMPTY(aobj-u_swhash[i]) == 0) { elt = LIST_FIRST(aobj-u_swhash[i]); LIST_REMOVE(elt, list);
Re: data modified on freelist, tmpfs-related?
On 30 April 2014 10:55, Mark Kettenis mark.kette...@xs4all.nl wrote: From: Mike Belopuhov m...@belopuhov.com Date: Wed, 30 Apr 2014 16:00:45 +0200 On 30 April 2014 15:55, Mark Kettenis mark.kette...@xs4all.nl wrote: Date: Wed, 30 Apr 2014 15:38:39 +0200 (CEST) From: Mark Kettenis mark.kette...@xs4all.nl Date: Wed, 30 Apr 2014 13:39:20 +0100 From: Stuart Henderson st...@openbsd.org Seen when running e2fsprogs regression tests with /tmp on tmpfs I'm not surprised; tmpfs contains some serious bugs. I recommend not using it until those are fixed. Which means, I'd like somebody else besides espie@ to comment on my uvm_aobj.c list manipulation hack diff. Diff made sense to me when I looked at it, but I would rather hide direct pointer access :/ Perhaps LIST_SWAP does a tiny bit more, but it's cleaner and perhaps can be useful in the future. I'm not comfortable with introducing more sys/queue.h APIs. So perhaps we should just punt on the optimization and remove/insert all list items. Removing the trap comments that pedro set up... Index: uvm_aobj.c === RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v retrieving revision 1.61 diff -u -p -r1.61 uvm_aobj.c --- uvm_aobj.c 13 Apr 2014 23:14:15 - 1.61 +++ uvm_aobj.c 30 Apr 2014 14:52:33 - @@ -431,6 +431,7 @@ uao_shrink_hash(struct uvm_object *uobj, { struct uvm_aobj *aobj = (struct uvm_aobj *)uobj; struct uao_swhash *new_swhash; + struct uao_swhash_elt *elt; unsigned long new_hashmask; int i; @@ -456,8 +457,13 @@ uao_shrink_hash(struct uvm_object *uobj, * Even though the hash table size is changing, the hash of the buckets * we are interested in copying should not change. */ - for (i = 0; i UAO_SWHASH_BUCKETS(aobj-u_pages); i++) - LIST_FIRST(new_swhash[i]) = LIST_FIRST(aobj-u_swhash[i]); + for (i = 0; i UAO_SWHASH_BUCKETS(aobj-u_pages); i++) { + while (LIST_EMPTY(aobj-u_swhash[i]) == 0) { + elt = LIST_FIRST(aobj-u_swhash[i]); + LIST_REMOVE(elt, list); + LIST_INSERT_HEAD(new_swhash[i], elt, list); + } + } free(aobj-u_swhash, M_UVMAOBJ); @@ -609,7 +615,6 @@ uao_grow_hash(struct uvm_object *uobj, i return ENOMEM; for (i = 0; i UAO_SWHASH_BUCKETS(aobj-u_pages); i++) { - /* XXX pedro: shouldn't copying the list pointers be enough? */ while (LIST_EMPTY(aobj-u_swhash[i]) == 0) { elt = LIST_FIRST(aobj-u_swhash[i]); LIST_REMOVE(elt, list); I like this approach better. ok krw@ Ken
Re: data modified on freelist, tmpfs-related?
This is probably the simplest way to solve the problem for now. if we want to mess with sys/queue we can do that separately. On Wed, Apr 30, 2014 at 8:55 AM, Mark Kettenis mark.kette...@xs4all.nl wrote: From: Mike Belopuhov m...@belopuhov.com Date: Wed, 30 Apr 2014 16:00:45 +0200 On 30 April 2014 15:55, Mark Kettenis mark.kette...@xs4all.nl wrote: Date: Wed, 30 Apr 2014 15:38:39 +0200 (CEST) From: Mark Kettenis mark.kette...@xs4all.nl Date: Wed, 30 Apr 2014 13:39:20 +0100 From: Stuart Henderson st...@openbsd.org Seen when running e2fsprogs regression tests with /tmp on tmpfs I'm not surprised; tmpfs contains some serious bugs. I recommend not using it until those are fixed. Which means, I'd like somebody else besides espie@ to comment on my uvm_aobj.c list manipulation hack diff. Diff made sense to me when I looked at it, but I would rather hide direct pointer access :/ Perhaps LIST_SWAP does a tiny bit more, but it's cleaner and perhaps can be useful in the future. I'm not comfortable with introducing more sys/queue.h APIs. So perhaps we should just punt on the optimization and remove/insert all list items. Removing the trap comments that pedro set up... Index: uvm_aobj.c === RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v retrieving revision 1.61 diff -u -p -r1.61 uvm_aobj.c --- uvm_aobj.c 13 Apr 2014 23:14:15 - 1.61 +++ uvm_aobj.c 30 Apr 2014 14:52:33 - @@ -431,6 +431,7 @@ uao_shrink_hash(struct uvm_object *uobj, { struct uvm_aobj *aobj = (struct uvm_aobj *)uobj; struct uao_swhash *new_swhash; + struct uao_swhash_elt *elt; unsigned long new_hashmask; int i; @@ -456,8 +457,13 @@ uao_shrink_hash(struct uvm_object *uobj, * Even though the hash table size is changing, the hash of the buckets * we are interested in copying should not change. */ - for (i = 0; i UAO_SWHASH_BUCKETS(aobj-u_pages); i++) - LIST_FIRST(new_swhash[i]) = LIST_FIRST(aobj-u_swhash[i]); + for (i = 0; i UAO_SWHASH_BUCKETS(aobj-u_pages); i++) { + while (LIST_EMPTY(aobj-u_swhash[i]) == 0) { + elt = LIST_FIRST(aobj-u_swhash[i]); + LIST_REMOVE(elt, list); + LIST_INSERT_HEAD(new_swhash[i], elt, list); + } + } free(aobj-u_swhash, M_UVMAOBJ); @@ -609,7 +615,6 @@ uao_grow_hash(struct uvm_object *uobj, i return ENOMEM; for (i = 0; i UAO_SWHASH_BUCKETS(aobj-u_pages); i++) { - /* XXX pedro: shouldn't copying the list pointers be enough? */ while (LIST_EMPTY(aobj-u_swhash[i]) == 0) { elt = LIST_FIRST(aobj-u_swhash[i]); LIST_REMOVE(elt, list);
Re: data modified on freelist, tmpfs-related?
On 30 April 2014 16:55, Mark Kettenis mark.kette...@xs4all.nl wrote: From: Mike Belopuhov m...@belopuhov.com Date: Wed, 30 Apr 2014 16:00:45 +0200 On 30 April 2014 15:55, Mark Kettenis mark.kette...@xs4all.nl wrote: Date: Wed, 30 Apr 2014 15:38:39 +0200 (CEST) From: Mark Kettenis mark.kette...@xs4all.nl Date: Wed, 30 Apr 2014 13:39:20 +0100 From: Stuart Henderson st...@openbsd.org Seen when running e2fsprogs regression tests with /tmp on tmpfs I'm not surprised; tmpfs contains some serious bugs. I recommend not using it until those are fixed. Which means, I'd like somebody else besides espie@ to comment on my uvm_aobj.c list manipulation hack diff. Diff made sense to me when I looked at it, but I would rather hide direct pointer access :/ Perhaps LIST_SWAP does a tiny bit more, but it's cleaner and perhaps can be useful in the future. I'm not comfortable with introducing more sys/queue.h APIs. So perhaps we should just punt on the optimization and remove/insert all list items. Removing the trap comments that pedro set up... I agree.
Re: data modified on freelist, tmpfs-related?
On Wed, Apr 30, 2014 at 03:38:39PM +0200, Mark Kettenis wrote: Date: Wed, 30 Apr 2014 13:39:20 +0100 From: Stuart Henderson st...@openbsd.org Seen when running e2fsprogs regression tests with /tmp on tmpfs I'm not surprised; tmpfs contains some serious bugs. I recommend not using it until those are fixed. On the other hand, if we don't use it, we won't find bugs... ;-) That particular bug has been found before. Re-discoverering it is not getting it fixed. Like everything, tmpfs has to be fixed by those who care.
Re: data modified on freelist, tmpfs-related?
I'm not comfortable with introducing more sys/queue.h APIs. So perhaps we should just punt on the optimization and remove/insert all list items. Removing the trap comments that pedro set up... Since the removal macros poison pointers which should no longer be dereferenced after the operation, I'm strongly in favour of sticking to these operations, even if it costs a bit more, timewise. Index: uvm_aobj.c === RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v retrieving revision 1.61 diff -u -p -r1.61 uvm_aobj.c --- uvm_aobj.c13 Apr 2014 23:14:15 - 1.61 +++ uvm_aobj.c30 Apr 2014 14:52:33 - @@ -431,6 +431,7 @@ uao_shrink_hash(struct uvm_object *uobj, { struct uvm_aobj *aobj = (struct uvm_aobj *)uobj; struct uao_swhash *new_swhash; + struct uao_swhash_elt *elt; unsigned long new_hashmask; int i; @@ -456,8 +457,13 @@ uao_shrink_hash(struct uvm_object *uobj, * Even though the hash table size is changing, the hash of the buckets * we are interested in copying should not change. */ - for (i = 0; i UAO_SWHASH_BUCKETS(aobj-u_pages); i++) - LIST_FIRST(new_swhash[i]) = LIST_FIRST(aobj-u_swhash[i]); + for (i = 0; i UAO_SWHASH_BUCKETS(aobj-u_pages); i++) { + while (LIST_EMPTY(aobj-u_swhash[i]) == 0) { + elt = LIST_FIRST(aobj-u_swhash[i]); + LIST_REMOVE(elt, list); + LIST_INSERT_HEAD(new_swhash[i], elt, list); + } + } free(aobj-u_swhash, M_UVMAOBJ); @@ -609,7 +615,6 @@ uao_grow_hash(struct uvm_object *uobj, i return ENOMEM; for (i = 0; i UAO_SWHASH_BUCKETS(aobj-u_pages); i++) { - /* XXX pedro: shouldn't copying the list pointers be enough? */ while (LIST_EMPTY(aobj-u_swhash[i]) == 0) { elt = LIST_FIRST(aobj-u_swhash[i]); LIST_REMOVE(elt, list);