On Sat, Feb 15, 2020 at 12:17:52PM +0900, YASUOKA Masahiko wrote:
> Hi,
>
> I have a problem: kernel core dumping always hangs at around the first
> 8MB. The diff attached fixes the problem.
>
> In nvme_poll():
>
>  930		while (!ISSET(state.c.flags, htole16(NVME_CQE_PHASE))) {
>  931			if (nvme_q_complete(sc, q) == 0)
>  932				delay(10);
>  933
>  934			/* XXX no timeout? */
>  935		}
>
> this loop waits for command completion while the system is cold. If
> the CQE_PHASE flag in state.c.flags is set, the loop breaks. In
> nvme_q_complete():
>
>  979	int
>  980	nvme_q_complete(struct nvme_softc *sc, struct nvme_queue *q)
>  981	{
>  982		struct nvme_ccb *ccb;
>  983		struct nvme_cqe *ring = NVME_DMA_KVA(q->q_cq_dmamem), *cqe;
>  984		u_int32_t head;
>  985		u_int16_t flags;
>  986		int rv = 0;
>  987
>  988		if (!mtx_enter_try(&q->q_cq_mtx))
>  989			return (-1);
>  990
>  991		head = q->q_cq_head;
>  992
>  993		nvme_dmamem_sync(sc, q->q_cq_dmamem, BUS_DMASYNC_POSTREAD);
>  994		for (;;) {
>  995			cqe = &ring[head];
>  996			flags = lemtoh16(&cqe->flags);
>  997			if ((flags & NVME_CQE_PHASE) != q->q_cq_phase)
>  998				break;
>  999
> 1000			ccb = &sc->sc_ccbs[cqe->cid];
> 1001			ccb->ccb_done(sc, ccb, cqe);
> 1002
> 1003			if (++head >= q->q_entries) {
> 1004				head = 0;
> 1005				q->q_cq_phase ^= NVME_CQE_PHASE;
> 1006			}
> 1007
> 1008			rv = 1;
> 1009		}
> 1010	nvme_dmamem_sync(sc, q->q_cq_dmamem, BUS_DMASYNC_PREREAD);
>
> See #997: the same CQE_PHASE flag is used to break the loop, but also
> see #1005. It means the flag's meaning is inverted every time the ring
> buffer wraps around. Please note this.
>
> In ccb->ccb_done() (which is actually nvme_poll_done()):
>
>  954	void
>  955	nvme_poll_done(struct nvme_softc *sc, struct nvme_ccb *ccb,
>  956	    struct nvme_cqe *cqe)
>  957	{
>  958		struct nvme_poll_state *state = ccb->ccb_cookie;
>  959
>  960		SET(cqe->flags, htole16(NVME_CQE_PHASE));
>  961		state->c = *cqe;
>  962	}
>
> struct nvme_poll_state *state is the same object "state" in
> nvme_poll(). cqe is a mapped object of the physical queue.
>
> On #960 the NVME_CQE_PHASE bit is set on "cqe", which is then copied
> to "state".
>
> I think nvme_poll_done() should change the flag only on "state", and
> should not change the flag on "cqe". Also, remember that the meaning
> of the flag on the queue is inverted when the ring wraps. As a result
> of modifying the flag on the physical queue, it can happen that the
> loop in nvme_q_complete() never breaks.
>
> comment? ok?
This makes sense to me, ok jmatthew@
(and sorry it took me so long to take a good look at this)

> Index: sys/dev/ic/nvme.c
> ===================================================================
> RCS file: /disk/cvs/openbsd/src/sys/dev/ic/nvme.c,v
> retrieving revision 1.63
> diff -u -p -r1.63 nvme.c
> --- sys/dev/ic/nvme.c	27 Jul 2019 13:20:12 -0000	1.63
> +++ sys/dev/ic/nvme.c	15 Feb 2020 02:16:22 -0000
> @@ -957,8 +957,8 @@ nvme_poll_done(struct nvme_softc *sc, st
>  {
>  	struct nvme_poll_state *state = ccb->ccb_cookie;
>  
> -	SET(cqe->flags, htole16(NVME_CQE_PHASE));
>  	state->c = *cqe;
> +	SET(state->c.flags, htole16(NVME_CQE_PHASE));
>  }
>  
>  void
>