Hi bison maintainers,

We have found a NULL pointer dereference and would like to report it.
I am happy to provide any additional information needed.


## Summary

When location tracing is enabled, `location_print` calls `boundary_print` 
without checking that `loc.end.file` is non-NULL. `boundary_print` then passes 
the NULL `b->file` to `quotearg_n_style(..., b->file)`, which results in a NULL 
pointer dereference within `quotearg_buffer_restyled`.

## Details

* **Vulnerability Type**: Segmentation fault due to NULL pointer dereference
* **Version**: 3.8.2

- `location_print` passes `loc.start` and `loc.end` to `boundary_print` for 
output when `--trace=location` is enabled (`location.c` line 160).
- `boundary_print` unconditionally passes `b->file` to `quotearg_n_style`, 
causing undefined behavior when `b->file == NULL` (`location.c` line 149).

- Meanwhile, `lloc_default`, the implementation of `YYLLOC_DEFAULT`, initializes 
both `start` and `end` of the LHS location from the end of the last RHS symbol, 
`rhs[n].end`. Therefore, if that `end.file` is still unset when the rule is 
reduced, the LHS's `end.file` is NULL as well (`parse-gram.c` lines 3204, 3205).
- This NULL is then passed along `location_print → boundary_print`.
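
The default-location rule described above can be modeled in a few lines. This is only a sketch under simplifying assumptions: the `boundary` and `location` types are reduced stand-ins for bison's own, and `same_boundary` and `model_lloc_default` are hypothetical names used for illustration, not bison's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for bison's boundary and location types.  */
typedef struct
{
  char const *file;
  int line;
  int column;
  int byte;
} boundary;

typedef struct
{
  boundary start;
  boundary end;
} location;

/* Hypothetical helper: boundaries are equal when every field matches
   (file names are compared by pointer in this model).  */
static int
same_boundary (boundary a, boundary b)
{
  return (a.file == b.file && a.line == b.line
          && a.column == b.column && a.byte == b.byte);
}

/* Model of the default location rule.  rhs[1..n] are the RHS symbols,
   rhs[0] is the symbol that precedes the rule.  */
static location
model_lloc_default (location const *rhs, int n)
{
  location loc;
  assert (rhs != NULL && n >= 0);

  /* Both ends start out as the end of the last RHS symbol.  */
  loc.start = rhs[n].end;
  loc.end = rhs[n].end;

  /* Only `start` is overwritten, by the first non-empty RHS symbol.  */
  for (int i = 1; i <= n; i++)
    if (!same_boundary (rhs[i].start, rhs[i].end))
      {
        loc.start = rhs[i].start;
        break;
      }

  /* loc.end.file is whatever rhs[n].end.file was, possibly NULL.  */
  return loc;
}
```

In this model, a two-symbol RHS whose last `end.file` is NULL yields an LHS location with a non-NULL `start.file` and a NULL `end.file`, which is the state that later reaches `boundary_print`.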

## Reproduction

### Tested Environment

- **Operating System:** Ubuntu 22.04 LTS
- **Architecture:** x86_64
- **Compiler:** gcc with AddressSanitizer (gcc version: 11.4.0)

### Reproduction Steps
```Dockerfile
FROM ubuntu:22.04

RUN apt-get update && \
    apt-get install -y \
      build-essential \
      libtool \
      automake \
      autoconf \
      pkg-config \
      git \
      ca-certificates \
      wget \
      autopoint \
      rsync \
      gcc \
      g++ \
      make

WORKDIR /root/workdir

RUN wget https://ftp.gnu.org/gnu/bison/bison-3.8.2.tar.gz
RUN tar xvf bison-3.8.2.tar.gz

WORKDIR /root/workdir/bison-3.8.2

RUN CFLAGS="-g -O0 -fsanitize=address -fno-omit-frame-pointer" \
    CXXFLAGS="$CFLAGS" LDFLAGS="-fsanitize=address" ./configure && \
    make -j$(nproc) && \
    make install
```

### PoC File
```yacc
%poc p[

%%
```
## Output
### ASanLog

```
=================================================================
==8983==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x575f010633b5 bp 0x7ffc35704cf0 sp 0x7ffc35704ba0 T0)
==8983==The signal is caused by a READ memory access.
==8983==Hint: address points to the zero page.
    #0 0x575f010633b5 in quotearg_buffer_restyled lib/quotearg.c:393
    #1 0x575f01064175 in quotearg_n_options lib/quotearg.c:899
    #2 0x575f010645f1 in quotearg_n_style lib/quotearg.c:950
    #3 0x575f00f89bd6 in boundary_print src/location.c:149
    #4 0x575f00f89dcd in location_print src/location.c:164
    #5 0x575f00fc6b35 in yy_symbol_print src/parse-gram.c:1390
    #6 0x575f00fcfcf3 in gram_parse src/parse-gram.c:3099
    #7 0x575f00fe9753 in reader src/reader.c:766
    #8 0x575f00f92b6a in main src/main.c:118
    #9 0x702b6e1c0d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #10 0x702b6e1c0e3f in __libc_start_main_impl ../csu/libc-start.c:392
    #11 0x575f00f47594 in _start (/usr/local/bin/bison+0x36594)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV lib/quotearg.c:393 in quotearg_buffer_restyled
==8983==ABORTING
```

### Crash Flow

* `location_print` is called from `yy_symbol_print` via `YYLOCATION_PRINT`.
* `location_print` normally asserts that both file names are non-NULL with 
`aver (loc.start.file); aver (loc.end.file);` before printing, but the 
`trace_locations` branch is taken before those checks are reached. When the 
location is not empty (the `if (location_empty (loc))` test fails) and the 
trace flag is set, `boundary_print` is called directly.
* `lloc_default` sets both `start` and `end` of the LHS to the `end` of the 
last RHS symbol, then overwrites only `start` with the `start` of the first 
non-empty RHS element. Therefore, if the last RHS symbol's `end.file` is NULL, 
the LHS's `end.file` is NULL too.
* As a result, `boundary_print` receives `b->file == NULL`, and 
`quotearg_n_style` forwards `arg == NULL` down to `quotearg_buffer_restyled`, 
which reads `arg[i]` and causes the SEGV.

### Affected Code

* The trace path in `location_print` (does not verify that `loc.start.file` and 
`loc.end.file` are non-NULL before calling `boundary_print`)
* `quotearg_n_style(..., b->file)` in `boundary_print` (no NULL protection)
* LHS position synthesis in `lloc_default` (copying the tail end)

```c
/* Call chain down to the crash site at quotearg.c:393 */

yy_symbol_print (FILE *yyo,
                 yysymbol_kind_t yykind, YYSTYPE const * const yyvaluep,
                 YYLTYPE const * const yylocationp)
{
  YYFPRINTF (yyo, "%s %s (",
             yykind < YYNTOKENS ? "token" : "nterm", yysymbol_name (yykind));

  YYLOCATION_PRINT (yyo, yylocationp); /*---------------- call ----------------*/
  YYFPRINTF (yyo, ": ");
  yy_symbol_value_print (yyo, yykind, yyvaluep, yylocationp);
  YYFPRINTF (yyo, ")");
}

int
location_print (location loc, FILE *out)
{
  int res = 0;
  if (location_empty (loc))
    res += fprintf (out, "(empty location)");
  else if (trace_flag & trace_locations)
    {
      res += boundary_print (&loc.start, out);
      res += fprintf (out, "-");
      res += boundary_print (&loc.end, out); /*---------------- call ----------------*/
    }
  else
...
}

static int
boundary_print (boundary const *b, FILE *out)
{
  return fprintf (out, "%s:%d.%d@%d",
                  /*---------------- call ----------------*/
                  quotearg_n_style (3, escape_quoting_style, b->file),
                  b->line, b->column, b->byte);
}

char *
quotearg_n_style (int n, enum quoting_style s, char const *arg)
{
  struct quoting_options const o = quoting_options_from_style (s);
  return quotearg_n_options (n, arg, SIZE_MAX, &o); /*---------------- call ----------------*/
}

static char *
quotearg_n_options (int n, char const *arg, size_t argsize,
                    struct quoting_options const *options)
{
...
    size_t size = sv[n].size;
    char *val = sv[n].val;
    /* Elide embedded null bytes since we don't return a size.  */
    int flags = options->flags | QA_ELIDE_NULL_BYTES;
    /*---------------- call ----------------*/
    size_t qsize = quotearg_buffer_restyled (val, size, arg, argsize,
                                             options->style, flags,
                                             options->quote_these_too,
                                             options->left_quote,
                                             options->right_quote);

...
}

static size_t
quotearg_buffer_restyled (char *buffer, size_t buffersize,
                          char const *arg, size_t argsize,
                          enum quoting_style quoting_style, int flags,
                          unsigned int const *quote_these_too,
                          char const *left_quote,
                          char const *right_quote)
{
...
    default:
      abort ();
    }

  /*------------- CRASH! (arg is NULL, so arg[i] reads address 0) -------------*/
  for (i = 0;  ! (argsize == SIZE_MAX ? arg[i] == '\0' : i == argsize);  i++)
...
}
```


## Proposed Fix

* Fix on the generation side
  + Make `start.file` and `end.file` consistent before returning 
from `lloc_default`.
  + Set `end = start` when `end.file == NULL` and `start.file != NULL`.
* Fix on the output side
  + Modify `boundary_print` to print "(NULL)" when `b->file` is NULL, as follows:

`boundary_print` (`location.c:149`)
```c
static int
boundary_print (boundary const *b, FILE *out)
{
  /*---------------- added ----------------*/
  const char *tmp_filename = b->file ? b->file : "(NULL)";
  return fprintf (out, "%s:%d.%d@%d",
                  quotearg_n_style (3, escape_quoting_style, tmp_filename),
                  b->line, b->column, b->byte);
}
```
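
The generation-side fix could take a form like the following. This is a minimal sketch, not a tested patch: the types are simplified stand-ins for bison's `boundary` and `location`, and `normalize_location` is a hypothetical helper name that would be applied to the result of `lloc_default` before it is returned.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for bison's boundary and location types.  */
typedef struct
{
  char const *file;
  int line;
  int column;
  int byte;
} boundary;

typedef struct
{
  boundary start;
  boundary end;
} location;

/* Hypothetical helper: if only the end boundary lacks a file name,
   collapse the location onto its start boundary.  */
static void
normalize_location (location *loc)
{
  assert (loc != NULL);
  if (loc->end.file == NULL && loc->start.file != NULL)
    loc->end = loc->start;
}
```

Note that this sketch leaves a location untouched when both `start.file` and `end.file` are NULL, so the output-side guard in `boundary_print` would still be needed for that case.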
