On Mon, 2025-02-17 at 12:29 -0500, James K. Lowden wrote: > On Sat, 15 Feb 2025 23:32:37 -0500 > David Malcolm <dmalc...@redhat.com> wrote: > > In defense of lack of free(3) ... > > > > +const char * > > > +esc( size_t len, const char input[] ) { > > > + static char spaces[] = "([,;]?[[:space:]])+"; > > > + static char spaceD[] = "(\n {6}D" "|" "[,;]?[[:space:]])+"; > > > + static char buffer[64 * 1024]; > > > + char *p = buffer; > > > + const char *eoinput = input + len; > > > + > > > + const char *spacex = is_reference_format()? spaceD : spaces; > > > + > > > + for( const char *s=input; *s && s < eoinput; s++ ) { > > > + *p = '\0'; > > > + gcc_assert( size_t(p - buffer) < sizeof(buffer) - 4 ); > > overflow guarded here
...but only via gcc_assert. gcc/system.h has: /* Use gcc_assert(EXPR) to test invariants. */ #if ENABLE_ASSERT_CHECKING #define gcc_assert(EXPR) \ ((void)(!(EXPR) ? fancy_abort (__FILE__, __LINE__, __FUNCTION__), 0 : 0)) #elif (GCC_VERSION >= 4005) #define gcc_assert(EXPR) \ ((void)(UNLIKELY (!(EXPR)) ? __builtin_unreachable (), 0 : 0)) #else /* Include EXPR, so that unused variable warnings do not occur. */ #define gcc_assert(EXPR) ((void)(0 && (EXPR))) #endif so if ENABLE_ASSERT_CHECKING is false, that check gets preprocessed away. It turns out that (in gcc/configure.ac) that with --enable- checking=release (which is soon to become the default for gcc 15), that ENABLE_ASSERT_CHECKING is still true, but it is possible to configure gcc with it false (e.g. with --enable-checking=none). We shouldn't rely on assert to do checking of user-controllable input; it should always be checked. > > > > + switch(*s) { > > > + case '^': case '$': > > > + case '(': case ')': > > > + case '*': case '+': case '?': > > > + case '[': case ']': > > > + case '{': case '}': > > > + case '|': > > > + case '.': > > > + *p++ = '\\'; > > > + *p++ = *s; > > > + break; > > > + case '\\': > > > + *p++ = '['; > > > + *p++ = *s; > > > + *p++ = ']'; > > > + break; > > > + > > > + case ';': case ',': > > > + if( ! (s+1 < eoinput && s[1] == 0x20) ) { > > > + *p++ = *s; > > > + break; > > > + } > > > + __attribute__((fallthrough)); > > > + case 0x20: case '\n': > > > + gcc_assert(p + sizeof(spacex) < buffer + sizeof(buffer)); > > and overflow guarded here, the only place where more than 4 > characters can be inserted into the buffer. Likewise. > > > > + p = stpcpy( p, spacex ); > > > + while( s+1 < eoinput && is_separator_space(s+1)) { > > > + s++; > > > + } > > > + break; > > > + default: > > > + *p++ = *s; > > > + break; > > > + } > > > + } > > > + *p = '\0'; > ... > > > + return xstrdup(buffer); > > > +} > > > > Has a fixed size 64k buffer; doesn't seem to have proper overflow > > handling. Could use a pretty_printer to accumulate chars. > > Thank you for these comments. Let me see if I can alleviate your > concerns. > > This function is called from exactly one place, where the file- > reader, lexio, parses a REPLACE directive. The COBOL input says > "REPLACE X BY Y" subject to some constraints. Because we're using > regex to find X, and because X might be any arbitrary string, the > esc() function escapes regex metacharacters prior to executing the > regex. > > IMHO it's unlikely the resulting regex input will exceed 64 KB. It's > unlikely to be even 64 bytes. (Usually X is a COBOL identifier, > limited to 64 bytes by ISO.) Granted, a fixed maximum is a > limitation. But I put it to you: which is more likely? For a regex > to exceed 64 KB, or for heap allocation to fail to return adequate > memory? If there's some crazy input, I'd rather die at 64 KB than > consume gigabytes of swap on the way to crashing. If you're going to have it die at 64KB, please make sure it *does* die (hence my concerns about using gcc_assert for the bounds checking). Hypothetical scenario: sysadmin sets up a service that accepts arbitrary COBOL code to be compiled (such as Compiler Explorer). Can an attacker construct a not-necessarily valid input COBOL program that overruns the on-stack buffer? If so, they can probably use this to inject code into the process, at which point they've got a beachhead to attack the system further. (Though IIRC Compiler Explorer already sandboxes the compilers that are running there) [...] Dave