On Thu, 16 Jun 2022 12:54:56 +0200 Natanael Copa <[email protected]> wrote:
> On Tue, 14 Jun 2022 18:24:54 +0200 > Denys Vlasenko <[email protected]> wrote: > > > On Tue, Jun 14, 2022 at 8:55 AM Natanael Copa <[email protected]> > > wrote: > > > Hi! > > > > > > Is there anything else I can do to help fix CVE-2022-30065? I have > > > created a testcase for the testsuite and proposed a fix, but I'm not > > > that familiar with awk code so I would appreciate some help with this > > > before pushing it to thousands (millions?) of users. > > > > cd testsuite && ./runtest awk > > > > fails a lot with this change. > > Indeed, sorry! I thought I ran it locally but I must have done something > wrong when running them here. > > Need to go back to the drawing board... Ok, so I have made some progress and have another possible fix. I also separated the test case into a separate commit, so it becomes easier to test in different branches etc. while working on it. I also have another (valid?) testcase which passes on gawk and nawk, but fails with my fix suggestion. I also attach a WIP patch which adds some debug info of pointer values to make it visible when tmpvars are allocated and when evaluate() returns the just free'd pointer. Summary of the attached patches: 0001-awk-fix-use-after-free-CVE-2022-30065.patch: Potential fix, that prevents the use-after-free. But honestly, I don't really know what I'm doing and I don't know if this breaks any valid cases. 0002-awk-test-for-CVE-2022-30065.patch: The test case for the CVE. gawk and mawk returns syntax error for this test. This patch can be applied as is, but it will not pass until we have a proper fix. 0003-awk-add-test-for-setting-1-while-testing-it.patch: A test case that passes with mawk/gawk but fails with the proposed fix. I don't know if this is valid syntax or if it is ok to error out with syntax error. 0004-WIP-debug-awk.patch: Temp patch that adds extra printf debug info to show what happens. This patch should *not* be applied upstream, but can be used in local development git branches to help debug. With the attached debug print 0004-WIP-debug-awk.patch applied, gives the following output when running: echo | ./busybox awk '$1$1=0' # My comments are prefixed with a # # TIP: do a in browser/reader search for '0x7f52929debb0' to highlight # where the problematic address is used. fsrealloc: xrealloc(0, 512) fsrealloc: Fields=0x7f52929df030..0x7f52929df22f getvar_i: 0.000000 getvar_i: 1.000000 entered awk_getline() returning from awk_getline(): 1 getvar_i: 0.000000 getvar_i: 0.000000 entered evaluate(op=0x7f52929dff30, res=0x7f5292a77328) tmpvars=0x7f52929deb60 opinfo:00000300 opn:00000000 switch(0x3) NEWSOURCE opinfo:00000d00 opn:00000000 switch(0xd) TEST entered evaluate(op=0x7f52929dd530, res=0x7f5292a772a8) tmpvars=0x7f52929debb0 # this is where allocation happens opinfo:4a031f00 opn:00000000 entered evaluate(op=0x7f52929dd470, res=0x7f52929debb0) # the buffer is passed here tmpvars=0x7f52929dd820 opinfo:230f1500 opn:00000000 entered evaluate(op=0x7f52929dffc0, res=0x7f52929dd820) tmpvars=0x7f52929dd870 opinfo:05021700 opn:00000000 entered evaluate(op=0x7f52929dd410, res=0x7f52929dd890) tmpvars=0x7f52929dd8c0 opinfo:00002700 opn:00000000 switch(0x27) VAR returning from evaluate(): res=0x7f52929dd440, tmpvars=0x7f52929dd8c0 switch(0x17) FIELD getvar_i: 1.000000 returning from evaluate(): res=0x7f52929df030, tmpvars=0x7f52929dd870 opinfo & OF_RES1: L.v:'0x7f52929df030' L.s:'' entered evaluate(op=0x7f52929dd4a0, res=0x7f52929dd840) tmpvars=0x7f52929dd910 opinfo:05021700 opn:00000000 entered evaluate(op=0x7f52929dd4d0, res=0x7f52929dd930) tmpvars=0x7f52929dd960 opinfo:00002700 opn:00000000 switch(0x27) VAR returning from evaluate(): res=0x7f52929dd500, tmpvars=0x7f52929dd960 switch(0x17) FIELD getvar_i: 1.000000 returning from evaluate(): res=0x7f52929df030, tmpvars=0x7f52929dd910 R.s:'' switch(0x15) CONCAT / COMMA: L.s='', sep='', R.s='' returning from evaluate(): res=0x7f52929debb0, tmpvars=0x7f52929dd820 # address is return value here opinfo & OF_RES1: L.v:'0x7f52929debb0' # L.v is from TEST tmpvars here entered evaluate(op=0x7f52929dd560, res=0x7f52929debd0) tmpvars=0x7f52929dd820 opinfo:00002700 opn:00000000 switch(0x27) VAR returning from evaluate(): res=0x7f52929dd590, tmpvars=0x7f52929dd820 switch(0x1f) MOVE L.v=0x7f52929debb0, res=0x7f5292a772a8 # MOVE passes the reference here copyvar: number:0.000000 string:'(null)' # copyvar here will copy the reference to res here returning from evaluate(): res=0x7f52929debb0, tmpvars=0x7f52929debb0 # res is now set to same value as tmpvars which is free'd. # evaluate() here returns the value it just free'd. awk: cmd. line:1: Possible syntax error So, during TEST tmpvars is allocated. MOVE will do res=tmpvars, and then is tmpvars free'd and evaluate() return the free'd value. So MOVE's copyvar should may actually copy it instead of just copying the reference? If we do that, when and how is it free'd to avoid memory leaks? Should we implement a refcount of some sort? I have no idea. Kind regards, Natanael Copa
>From 7c05b4b8e7212969f33fbd3e10be8b00d2380ea8 Mon Sep 17 00:00:00 2001 From: Natanael Copa <[email protected]> Date: Thu, 16 Jun 2022 16:22:44 +0200 Subject: [PATCH 4/4] WIP: debug awk --- editors/awk.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/editors/awk.c b/editors/awk.c index 8554871e4..c86180185 100644 --- a/editors/awk.c +++ b/editors/awk.c @@ -55,7 +55,7 @@ /* If you comment out one of these below, it will be #defined later * to perform debug printfs to stderr: */ #define debug_printf_walker(...) do {} while (0) -#define debug_printf_eval(...) do {} while (0) +//#define debug_printf_eval(...) do {} while (0) #define debug_printf_parse(...) do {} while (0) #ifndef debug_printf_walker @@ -2872,11 +2872,10 @@ static var *evaluate(node *op, var *res) if (!op) return setvar_s(res, NULL); - debug_printf_eval("entered %s()\n", __func__); - tmpvars = nvalloc(2); #define TMPVAR0 (tmpvars) #define TMPVAR1 (tmpvars + 1) + debug_printf_eval("entered %s(op=%p, res=%p)\n\ttmpvars=%p\n", __func__, op, res, tmpvars); while (op) { struct { @@ -2903,6 +2902,7 @@ static var *evaluate(node *op, var *res) if ((opinfo & OF_REQUIRED) && !op1) syntax_error(EMSG_TOO_FEW_ARGS); L.v = evaluate(op1, TMPVAR0); + debug_printf_eval("opinfo & OF_RES1: L.v:'%p'\n", L.v); if (opinfo & OF_STR1) { L.s = getvar_s(L.v); debug_printf_eval("L.s:'%s'\n", L.s); @@ -3127,7 +3127,7 @@ static var *evaluate(node *op, var *res) break; case XC( OC_MOVE ): - debug_printf_eval("MOVE\n"); + debug_printf_eval("MOVE L.v=%p, res=%p\n", L.v, res); /* if source is a temporary string, jusk relink it to dest */ if (R.v == TMPVAR1 && !(R.v->type & VF_NUMBER) @@ -3438,9 +3438,9 @@ static var *evaluate(node *op, var *res) debug_printf_eval("CONCAT /\n"); case XC( OC_COMMA ): { const char *sep = ""; - debug_printf_eval("COMMA\n"); if (opinfo == TI_COMMA) sep = getvar_s(intvar[SUBSEP]); + debug_printf_eval("COMMA: L.s='%s', sep='%s', R.s='%s'\n", L.s, sep, R.s); setvar_p(res, xasprintf("%s%s%s", L.s, sep, R.s)); break; } @@ -3541,10 +3541,10 @@ static var *evaluate(node *op, var *res) * we just nvfree'd. If we do, assume that it is a syntax error. * This has been seen with `awk '$1$1=0'` */ + debug_printf_eval("returning from %s(): res=%p, tmpvars=%p\n", __func__, res, tmpvars); if (tmpvars == res) syntax_error(EMSG_POSSIBLE_ERROR); - debug_printf_eval("returning from %s(): %p\n", __func__, res); return res; #undef fnargs #undef seed -- 2.36.1
>From aa22f20ce9ee0f94725cfca3e294b99267e1cf59 Mon Sep 17 00:00:00 2001 From: Natanael Copa <[email protected]> Date: Thu, 16 Jun 2022 22:25:28 +0200 Subject: [PATCH 3/4] awk: add test for setting $1 while testing it This test passes on mawk and gawk. --- testsuite/awk.tests | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/testsuite/awk.tests b/testsuite/awk.tests index 60a737dcf..3d2e11166 100755 --- a/testsuite/awk.tests +++ b/testsuite/awk.tests @@ -485,4 +485,9 @@ testing 'awk use-after-free (CVE-2022-30065)' \ "awk: cmd. line:1: Possible syntax error" \ "" +testing 'awk assign while test' \ + "awk '\$1==\$1=\"foo\" {print \$1}'" \ + "foo" \ + "" \ + "foo" exit $FAILCOUNT -- 2.36.1
>From 91ce0365e84fc413f5d584ca08d2646f101cdab0 Mon Sep 17 00:00:00 2001 From: Natanael Copa <[email protected]> Date: Thu, 16 Jun 2022 21:54:48 +0200 Subject: [PATCH 2/4] awk: test for CVE-2022-30065 --- testsuite/awk.tests | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/testsuite/awk.tests b/testsuite/awk.tests index 93e25d8c1..60a737dcf 100755 --- a/testsuite/awk.tests +++ b/testsuite/awk.tests @@ -479,4 +479,10 @@ testing 'awk backslash+newline eaten with no trace' \ "Hello world\n" \ '' '' +testing 'awk use-after-free (CVE-2022-30065)' \ + "awk '\$3i\$3in\$9=\$r||\$9=i6/6-9f'" \ + "" \ + "awk: cmd. line:1: Possible syntax error" \ + "" + exit $FAILCOUNT -- 2.36.1
>From 614b5c056097c12c398e8d98c4c727a1b4a214c7 Mon Sep 17 00:00:00 2001 From: Natanael Copa <[email protected]> Date: Tue, 7 Jun 2022 21:53:04 +0200 Subject: [PATCH 1/4] awk: fix use after free (CVE-2022-30065) fixes https://bugs.busybox.net/show_bug.cgi?id=14781 --- editors/awk.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/editors/awk.c b/editors/awk.c index 079d0bde5..8554871e4 100644 --- a/editors/awk.c +++ b/editors/awk.c @@ -3537,6 +3537,12 @@ static var *evaluate(node *op, var *res) nvfree(tmpvars, 2); #undef TMPVAR0 #undef TMPVAR1 + /* We should never end up in a state where we return the pointer that + * we just nvfree'd. If we do, assume that it is a syntax error. + * This has been seen with `awk '$1$1=0'` + */ + if (tmpvars == res) + syntax_error(EMSG_POSSIBLE_ERROR); debug_printf_eval("returning from %s(): %p\n", __func__, res); return res; -- 2.36.1
_______________________________________________ busybox mailing list [email protected] http://lists.busybox.net/mailman/listinfo/busybox
