Christos Zoulas wrote:
> Just back all of yesterday's commits out. Just building with -j 8
> and LOCKDEBUG spins out. It trashes the filesystem and then it gets
> another error about not fixing an inode while replaying the log on
> reboot. I.e. the new kernel not only holds a spinlock and crashes,
> but also does not replay the log properly on boot.
Log replay was not touched. The code, however, did not properly record
all deallocations, even for some finished and committed transactions,
which caused the replay problems. This should all be fixed now, along
with the mutex issue.
After updating to the newest kernel (with vfs_wapbl.c 1.84), it is
necessary to run fsck to bring the filesystem back to a fully healthy
state. After fsck, there shouldn't be any further problems related to
the current change.
Sorry about that, and thanks for your patience.
Jaromir
2016-10-02 16:46 GMT+02:00 Jaromír Doleček:
> There was a use-after-free bug which caused the fault on DEBUG
> kernels; it's fixed now in revision 1.82 of kern/vfs_wapbl.c.
>
> Thank you.
>
> Jaromir
>
> 2016-10-02 1:26 GMT+02:00 bch:
>> On 10/1/16, Jaromir Dolecek wrote:
>>> If you can get just a short traceback (which particular wapbl
>>> function(s), for example), it would help to figure out the possible
>>> problem.
>>
>> Here's a gdb backtrace from a core dump:
>>
>>
>> #0 0x80119a85 in cpu_reboot (howto=howto@entry=260,
>> bootstr=bootstr@entry=0x0) at
>> /usr/src/sys/arch/amd64/amd64/machdep.c:676
>> syncdone = false
>> s =
>> #1 0x8086e3dc in vpanic (fmt=fmt@entry=0x80ed1503
>> "trap", ap=ap@entry=0xfe804105cb28) at
>> /usr/src/sys/kern/subr_prf.c:342
>> ci =
>> oci =
>> bootopt = 260
>> scratchstr = "trap", '\000'
>> #2 0x8086e490 in panic (fmt=fmt@entry=0x80ed1503
>> "trap") at /usr/src/sys/kern/subr_prf.c:258
>> ap = > generic pointer.)>
>> #3 0x8011b706 in trap (frame=0xfe804105cc60) at
>> /usr/src/sys/arch/amd64/amd64/trap.c:298
>> p =
>> pcb =
>> vframe =
>> ksi = {ksi_flags = 1, ksi_list = {tqe_next = 0x0, tqe_prev =
>> 0x0}, ksi_info = {_signo = 11, _code = 2, _errno = 0, _pad = 0,
>> _reason = {_rt = {_pid = 0, _uid = 0, _value = {sival_int = 6,
>> sival_ptr = 0x6}}, _child = {_pid = 0, _uid = 0,
>> _status = 6, _utime = 0, _stime = 0}, _fault = {_addr = 0x0, _trap =
>> 6, _trap2 = 0, _trap3 = 0}, _poll = {_band = 0, _fd = 6}}},
>> ksi_lid = 0}
>> onfault =
>> type = 6
>> error =
>> cr2 =
>> pfail =
>> #4 0x8010115e in alltraps ()
>> No symbol table info available.
>> #5 0x808cad70 in wapbl_write_revocations
>> (offp=0xfe804105cdc8, wl=0xfe811ce15688) at
>> /usr/src/sys/kern/vfs_wapbl.c:2343
>> wc = 0xfe811c747908
>> blocklen =
>> off = 6082048
>> wd = 0xfe81deaddead
>> error =
>> #6 wapbl_flush (wl=0xfe811ce15688, waitfor=waitfor@entry=0) at
>> /usr/src/sys/kern/vfs_wapbl.c:1618
>> bp =
>> we =
>> off = 6081536
>> head =
>> tail =
>> delta = 0
>> flushsize = 6996480
>> reserved =
>> error =
>> __func__ = "wapbl_flush"
>> #7 0x807a45c1 in ffs_sync (mp=0xfe811c95b008, waitfor=3,
>> cred=0xfe811e145f00) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1975
>> vp = 0x0
>> ump = 0xfe8108092b08
>> fs = 0xfe811beb5008
>> marker = 0xfe810a8b7930
>> error =
>> allerror = 0
>> is_suspending =
>> ctx = {waitfor = 3, is_suspending = false}
>> __func__ = "ffs_sync"
>> #8 0x808baaa1 in VFS_SYNC (mp=0xfe811c95b008,
>> a=, b=) at
>> /usr/src/sys/kern/vfs_subr.c:1358
>> error =
>> #9 0x808bad20 in sched_sync (arg=) at
>> /usr/src/sys/kern/vfs_subr.c:785
>> slp =
>> vp =
>> mp = 0xfe811c95b008
>> nmp = 0xfe811c95b008
>> starttime = 1475352687
>> synced = true
>> #10 0x801008d7 in lwp_trampoline ()
>>
>>
>>
>>> The changes to vfs_wapbl.c were fairly minor so far. I would
>>> understand new panics, but it would be strange if they caused faults.
>>>
>>> Maybe you can try to downgrade ufs/ffs/ffs_alloc.c to before rev.
>>> 1.152. It's possible there is some interaction with wapbl which might
>>> cause trouble there.
>>>
>>> Keep me on CC please. I'm currently working on WAPBL and planning some
>>> further changes, so I'll fix any regressions asap.
>>>
>>> Jaromir
>>>
>>> 2016-10-01 22:50 GMT+02:00 bch:
On Oct 1, 2016 1:44 PM, "bch" wrote:
>
> This appears to be trashing files, too, based on what I see trying to
> CVS update
Including the author of the potentially troublesome commit.
> On Oct 1, 2016 1:32 PM, "bch" wrote:
>>
>>
>> My system is unstable w latest src. Appea