Re: snmpd; Fix use after free for appl_request_upstream

2023-10-27 Thread Martijn van Duren
On Thu, 2023-10-26 at 21:38 +0200, Theo Buehler wrote:
> On Thu, Oct 26, 2023 at 11:51:00AM +0200, Martijn van Duren wrote:
> > This case is covered by the new regress' backend_get_toofew and
> > backend_get_toomany tests. However, even with MALLOC_OPTIONS cranked
> > to the max it's really hard to trigger (I had to run
> > backend_get_wrongorder, backend_get_toofew, backend_get_toomany
> > sequentially in a tight loop killing snmpd between iterations for the
> > best chance).
> > 
> > When we receive an invalid varbindlist in a response we set the invalid
> > variable. This in turn calls appl_varbind_error(), but the avi_state
> > of the varbinds remains in APPL_VBSTATE_PENDING. Directly following
> > we call appl_request_downstream_free(), which sees that the varbinds
> > haven't been resolved, triggering a call to
> > appl_request_upstream_resolve(). This call in turn sees that the
> > error has been set and just sends out the error-response to the client
> > and frees the appl_request_upstream. From here we return back to
> > appl_response(), which also calls appl_request_upstream_resolve(),
> > resulting in a use after free.
> > 
> > The main tool for fixing this issue is making use of
> > appl_request_upstream's aru_locked member, which will cause
> > appl_request_upstream_resolve() to return instantly. The simplest fix is
> > to set aru_locked before calling appl_request_downstream_free() and
> > unsetting it directly afterwards inside appl_response().
> > 
> > The second one is the diff proposed below, which shrinks the code.
> > 
> > appl_request_upstream_free() is only called once from
> > appl_request_upstream_reply(). appl_request_upstream_reply() in turn
> > is only called by appl_request_upstream_resolve().
> > appl_request_upstream_resolve() is called in 3 places:
> > - appl_processpdu(): to kick things off
> > - appl_request_downstream_free(): For when a backend disappears with
> > outstanding requests
> > - appl_response(): To kickstart the next round of resolving.
> > 
> > Since appl_request_downstream_free() is always called from
> > appl_response(), we can leverage that function and make it call
> > appl_request_upstream_resolve() unconditionally.
> > 
> > appl_request_downstream_free() is called from the following locations:
> > - appl_close(): When a backend has disappeared.
> > - appl_request_upstream_free(): We send out a reply early, because an
> > error has been detected.
> > - appl_response(): We received a response
> > 
> > appl_request_upstream_free() can't reenter into
> > appl_request_upstream_resolve(), or it would potentially trigger new
> > appl_request_downstreams. This can be prevented by setting aru_locked
> > before calling appl_request_downstream_free().
> > For all other cases we should rely on appl_request_upstream_resolve()'s
> > logic to handle varbinds in any state, so there's no reason make calls
> > from other contexts conditional.
> 
> Your description of the bug makes sense and your choice of resolving it
> as well. Thanks for the in-depth explanation, that helped a lot. Still,
> I must say that I don't really feel at ease with the amount of complexity
> and entanglement here. I simply can't fit all of this into my head within
> a reasonable amount of time.
> 
Well, the S in SNMP doesn't stand for "I want to pull my hair out" for
no reason.

So for people interested, this is the high-over flow of the code:
When a message is received snmpe.c sends the PDU part to
appl_processpdu(). This function pulls apart the request in structs that
can be worked with internally (1 struct appl_request_upstream + n
struct appl_varbind_internal).
After things are pulled apart appl_processpdu() calls
appl_request_upstream_resolve(), which is responsible for retrieving
the values for the requested varbinds. It does this by going over all
varbinds in the APPL_VBSTATE_NEW state, and calling
appl_varbind_backend() to find a matching backend for the request. In
the case of a getnext/getbulk request it can also increment the oid
to the next matching region (if needed) and it also fills the end oid
where the ownership of the backend ends. After all APPL_VBSTATE_NEW
varbinds have found their backend, they are grouped together by backend
in a struct appl_request_downstream and send out via
appl_request_downstream_send().

Once a backend has a response available it calls appl_response(). This
function verifies the returned data via appl_varbind_valid() and
appl_error_valid(), and checks if the data matches the supported
datatypes (e.g. counter64 on SNMPv1). After that the data is put into
the corresponding struct appl_varbind_internal, or in case of an EOMV
we increment the OID to the end OID of the last request and reset the
varbind to APPL_VBSTATE_NEW. After that the struct
appl_request_downstream is freed and we return back to
appl_request_upstream_resolve(). If all varbinds have reached
APPL_VBSTATE_DONE (or an error occured) we call
appl_request_upstream_reply() which 

Re: snmpd; Fix use after free for appl_request_upstream

2023-10-26 Thread Theo Buehler
On Thu, Oct 26, 2023 at 11:51:00AM +0200, Martijn van Duren wrote:
> This case is covered by the new regress' backend_get_toofew and
> backend_get_toomany tests. However, even with MALLOC_OPTIONS cranked
> to the max it's really hard to trigger (I had to run
> backend_get_wrongorder, backend_get_toofew, backend_get_toomany
> sequentially in a tight loop killing snmpd between iterations for the
> best chance).
> 
> When we receive an invalid varbindlist in a response we set the invalid
> variable. This in turn calls appl_varbind_error(), but the avi_state
> of the varbinds remains in APPL_VBSTATE_PENDING. Directly following
> we call appl_request_downstream_free(), which sees that the varbinds
> haven't been resolved, triggering a call to
> appl_request_upstream_resolve(). This call in turn sees that the
> error has been set and just sends out the error-response to the client
> and frees the appl_request_upstream. From here we return back to
> appl_response(), which also calls appl_request_upstream_resolve(),
> resulting in a use after free.
> 
> The main tool for fixing this issue is making use of
> appl_request_upstream's aru_locked member, which will cause
> appl_request_upstream_resolve() to return instantly. The simplest fix is
> to set aru_locked before calling appl_request_downstream_free() and
> unsetting it directly afterwards inside appl_response().
> 
> The second one is the diff proposed below, which shrinks the code.
> 
> appl_request_upstream_free() is only called once from
> appl_request_upstream_reply(). appl_request_upstream_reply() in turn
> is only called by appl_request_upstream_resolve().
> appl_request_upstream_resolve() is called in 3 places:
> - appl_processpdu(): to kick things off
> - appl_request_downstream_free(): For when a backend disappears with
> outstanding requests
> - appl_response(): To kickstart the next round of resolving.
> 
> Since appl_request_downstream_free() is always called from
> appl_response(), we can leverage that function and make it call
> appl_request_upstream_resolve() unconditionally.
> 
> appl_request_downstream_free() is called from the following locations:
> - appl_close(): When a backend has disappeared.
> - appl_request_upstream_free(): We send out a reply early, because an
> error has been detected.
> - appl_response(): We received a response
> 
> appl_request_upstream_free() can't reenter into
> appl_request_upstream_resolve(), or it would potentially trigger new
> appl_request_downstreams. This can be prevented by setting aru_locked
> before calling appl_request_downstream_free().
> For all other cases we should rely on appl_request_upstream_resolve()'s
> logic to handle varbinds in any state, so there's no reason make calls
> from other contexts conditional.

Your description of the bug makes sense and your choice of resolving it
as well. Thanks for the in-depth explanation, that helped a lot. Still,
I must say that I don't really feel at ease with the amount of complexity
and entanglement here. I simply can't fit all of this into my head within
a reasonable amount of time.

ok tb (fwiw)

since it resolves a real problem and simplifies things a bit.



snmpd; Fix use after free for appl_request_upstream

2023-10-26 Thread Martijn van Duren
This case is covered by the new regress' backend_get_toofew and
backend_get_toomany tests. However, even with MALLOC_OPTIONS cranked
to the max it's really hard to trigger (I had to run
backend_get_wrongorder, backend_get_toofew, backend_get_toomany
sequentially in a tight loop killing snmpd between iterations for the
best chance).

When we receive an invalid varbindlist in a response we set the invalid
variable. This in turn calls appl_varbind_error(), but the avi_state
of the varbinds remains in APPL_VBSTATE_PENDING. Directly following
we call appl_request_downstream_free(), which sees that the varbinds
haven't been resolved, triggering a call to
appl_request_upstream_resolve(). This call in turn sees that the
error has been set and just sends out the error-response to the client
and frees the appl_request_upstream. From here we return back to
appl_response(), which also calls appl_request_upstream_resolve(),
resulting in a use after free.

The main tool for fixing this issue is making use of
appl_request_upstream's aru_locked member, which will cause
appl_request_upstream_resolve() to return instantly. The simplest fix is
to set aru_locked before calling appl_request_downstream_free() and
unsetting it directly afterwards inside appl_response().

The second one is the diff proposed below, which shrinks the code.

appl_request_upstream_free() is only called once from
appl_request_upstream_reply(). appl_request_upstream_reply() in turn
is only called by appl_request_upstream_resolve().
appl_request_upstream_resolve() is called in 3 places:
- appl_processpdu(): to kick things off
- appl_request_downstream_free(): For when a backend disappears with
outstanding requests
- appl_response(): To kickstart the next round of resolving.

Since appl_request_downstream_free() is always called from
appl_response(), we can leverage that function and make it call
appl_request_upstream_resolve() unconditionally.

appl_request_downstream_free() is called from the following locations:
- appl_close(): When a backend has disappeared.
- appl_request_upstream_free(): We send out a reply early, because an
error has been detected.
- appl_response(): We received a response

appl_request_upstream_free() can't reenter into
appl_request_upstream_resolve(), or it would potentially trigger new
appl_request_downstreams. This can be prevented by setting aru_locked
before calling appl_request_downstream_free().
For all other cases we should rely on appl_request_upstream_resolve()'s
logic to handle varbinds in any state, so there's no reason make calls
from other contexts conditional.

OK?

martijn@

Index: application.c
===
RCS file: /cvs/src/usr.sbin/snmpd/application.c,v
retrieving revision 1.24
diff -u -p -r1.24 application.c
--- application.c   24 Oct 2023 14:21:58 -  1.24
+++ application.c   26 Oct 2023 09:40:23 -
@@ -710,6 +710,7 @@ appl_request_upstream_free(struct appl_r
if (ureq == NULL)
return;
 
+   ureq->aru_locked = 1;
for (i = 0; i < ureq->aru_varbindlen && ureq->aru_vblist != NULL; i++) {
vb = &(ureq->aru_vblist[i]);
ober_free_elements(vb->avi_varbind.av_value);
@@ -726,7 +727,6 @@ void
 appl_request_downstream_free(struct appl_request_downstream *dreq)
 {
struct appl_varbind_internal *vb;
-   int retry = 0;
 
if (dreq == NULL)
return;
@@ -736,14 +736,11 @@ appl_request_downstream_free(struct appl
 
for (vb = dreq->ard_vblist; vb != NULL; vb = vb->avi_next) {
vb->avi_request_downstream = NULL;
-   if (vb->avi_state == APPL_VBSTATE_PENDING) {
+   if (vb->avi_state == APPL_VBSTATE_PENDING)
vb->avi_state = APPL_VBSTATE_NEW;
-   retry = 1;
-   }
}
 
-   if (retry)
-   appl_request_upstream_resolve(dreq->ard_request);
+   appl_request_upstream_resolve(dreq->ard_request);
free(dreq);
 }
 
@@ -1172,9 +1169,6 @@ appl_response(struct appl_backend *backe
backend->ab_name);
backend->ab_fn->ab_close(backend, APPL_CLOSE_REASONPARSEERROR);
}
-
-   if (ureq != NULL)
-   appl_request_upstream_resolve(ureq);
 }
 
 int