> > In general, I think it's bad architecture to make an upcall with a lock > > held, as it makes the robustness of the system contingent on the behavior > > of userland code operating in a context with unstated constraints. More > > generally, holding locks across callouts is generally problematic but > > across the userland/kernel boundary it's much, much worse. > > Note that this particular function is already called in multitudes of > places with various locks held (mac_client_open(), > dls_devnet_stat_create(), dls_devnet_set(), dls_devnet_rename(), and > devnet_filldir()). This tells me that we either have functionality that > naturally belongs in the kernel stranded in user space, or that we > really need to do away with the need to map between linkid's and link > names (perhaps by getting rid of linkid's).
I wasn't aware there were so many places already making this upcall with locks held. Given that, it seems a bit silly to do otherwise here, so I'm fine with what you have. > > > We can get here, but only under contrived scenarios. You'd have to set > > > IPsec policy on the tunnel without having created any IPsec keys nor > > > started IKE nor loaded any other global IPsec policy. When we create > > > tunnels during boot, those other things are done first. When IPsec > > > tunnels are created by something like a VPN, those things are also done > > > first. It would only be in a case where someone is playing around and > > > doing things out of order that this would come up. They'd get EAGAIN, > > > and indeed, trying again would work. > > > > EAGAIN is just so pathetic; I really dislike this. > > Me too. Unfortunately, the IPsec loading mechanism is heavily > entrenched in conn_t STREAMS queues. It's also extremely complicated > relative to what it does, and reimplementing this part of IPsec is > something that I eliminated as a possibility back when I contacted the > IPsec team and Erik Nordmark about this a while back. Aside from that, > do you see a graceful way around this edge case? As long as there is code in libdladm that will wait a few ms and reissue the request when EAGAIN is returned, I think I'm OK with it. From my perspective, the key requirement is to ensure this does not bubble out past the consolidation-private layer of our architecture -- e.g., things using dladm/libdladm must never need to concern themselves with this. -- meem