On 28 Feb 2015, at 21:05, Christos Zoulas <[email protected]> wrote:
> On Feb 28,  8:26pm, [email protected] ([email protected]) wrote:
> -- Subject: Re: DoS attack against TCP services
>
> | On Sat, 28 Feb 2015, Christos Zoulas wrote:
> | >
> | > Yes, that's a good start but we need to find which process that
> | > lwp belongs to.
> |
> | I'm not sure what the best course of action is. The machine is still
> | running. Should you try to get the information from the current system or
> | force a dump and analyze this?
> |
> | On Sat, 28 Feb 2015, J. Hannken-Illjes wrote:
> | >
> | > Looks unlocked -- what about a backtrace of thread 0.5,
> | > bt /a 0xfffffe882df11860
> |
> | https://www.ipv6.uni-leipzig.de/bt_0xfffffe882df11860.png
>
> So who else is holding the sysmon sme_mtx?

I analyzed a crash dump and found two deadlocked threads:

    0    77 3   0   200   fffffe813b495b60   ciss0       ciss_cmd
    0     5 3   0   200   fffffe882df11860   softclk/0   tstile

Backtrace of softclk/0:

    ...
    3 mutex_vector_enter   sys/kern/kern_mutex.c:682
    4 sme_events_check     sys/dev/sysmon/sysmon_envsys_events.c:734
    5 callout_softclock    sys/kern/kern_timeout.c:743
    6 softint_execute      sys/kern/kern_softint.c:589
    ...

Here the event struct sme is:

    sme_name = "ciss0"
    sme_mtx.u.mtxa_owner = 0xfffffe813b495b62 (Thread ciss0)

Backtrace of ciss0:

    ...
    2  cv_timedwait                   sys/kern/kern_condvar.c:261
    3  ciss_cmd                       sys/dev/ic/ciss.c:542
    4  ciss_ldid                      sys/dev/ic/ciss.c:883
    5  ciss_ioctl_vol                 sys/dev/ic/ciss.c:1388
    6  ciss_sensor_refresh            sys/dev/ic/ciss.c:1544
    7  sysmon_envsys_refresh_sensor   sys/dev/sysmon/sysmon_envsys.c:2027
    8  sme_events_worker              sys/dev/sysmon/sysmon_envsys_events.c:769
    9  workqueue_runlist              sys/kern/subr_workqueue.c:104
    10 workqueue_worker               sys/kern/subr_workqueue.c:135
    ...

The sme mutex was locked from sme_events_worker at
sysmon_envsys_events.c:760.

Now we have a deadlock: softclk/0 waits for the mutex, so callouts are
no longer processed, while ciss0 holds the mutex and waits for a callout
through cv_timedwait.

Taking a closer look at the poll loop from sys/dev/ic/ciss.c:537

    ...

this code looks wrong in several respects:

- Sleeping up to 60 seconds in a function used by a callout is wrong.

- Examining variables here we get: tick = 10000, etick = 16000,
  tohz = 6000 and i = 5999999. As tick is constant (microseconds per
  clock tick), this loop might run for 5999999*60 seconds!

--
J. Hannken-Illjes - [email protected] - TU Braunschweig (Germany)
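
For reference, the arithmetic behind the 5999999*60 figure can be checked with a small userland sketch. It is not the driver code: the helper names are made up for illustration, and the form of the exit test (comparing etick against tick) is an assumption, since the loop itself is elided above. It only uses the dumped values, and relies on tick being the constant number of microseconds per clock tick.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clock frequency implied by the kernel's constant
 * `tick` value, which is the number of microseconds per clock tick. */
static int64_t hz_from_tick(int64_t tick_us)
{
    assert(tick_us > 0);
    return 1000000 / tick_us;
}

/* Hypothetical helper: length in seconds of a timeout given in clock ticks. */
static int64_t ticks_to_seconds(int64_t tohz, int64_t hz)
{
    assert(hz > 0);
    return tohz / hz;
}

/* Worst case if each of the remaining iterations sleeps the full timeout. */
static int64_t worst_case_seconds(int64_t iterations, int64_t tohz, int64_t hz)
{
    return iterations * ticks_to_seconds(tohz, hz);
}

/* ASSUMPTION: the loop's deadline test compares etick against tick.
 * Because tick never changes, a deadline of tick + tohz is never passed. */
static int deadline_passed(int64_t etick, int64_t tick_us)
{
    return etick < tick_us;
}
```

With the values from the dump, hz_from_tick(10000) is 100, ticks_to_seconds(6000, 100) is 60, and worst_case_seconds(5999999, 6000, 100) is 359999940 seconds, which is more than 11 years. deadline_passed(16000, 10000) stays 0 on every iteration, so under the stated assumption only the iteration counter ends the loop.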
