http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8270

--- Comment #52 from GoWhoopee at yahoo dot com ---
Translation Phase 3 requires whitespace; consequently, Translation Phase 1
should not change whitespace at all. It should only map multibyte characters
and trigraphs.

Comment #39 indicates that gcc is known to work incorrectly: "This (removal of
such spaces) is part of how GCC defines the implementation-defined mapping in
translation phase 1." But removing white-space is not mapping multibyte
characters or trigraphs; it removes information that Translation Phases 2 and
3 need, resulting in misinterpretation of the source code.
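
To illustrate the difference, here is a minimal sketch of Phase 2 line
splicing under both rules. It is not GCC code: splice_lines and gnu_ext are
names made up for this example, and it ignores trigraphs and multibyte
characters entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not GCC code: copy src to dst while performing
   phase-2 line splicing.  If gnu_ext is nonzero, a backslash followed by
   spaces/tabs and then a newline is also spliced (the behaviour objected
   to above); otherwise only a backslash immediately followed by a newline
   is spliced.  dst must have room for strlen (src) + 1 bytes.  */
void
splice_lines (const char *src, char *dst, int gnu_ext)
{
  assert (src != NULL && dst != NULL);
  while (*src != '\0')
    {
      if (*src == '\\')
        {
          const char *p = src + 1;
          if (gnu_ext)
            while (*p == ' ' || *p == '\t')
              p++;
          if (*p == '\n')
            {
              /* Drop the backslash, any skipped blanks, and the newline.  */
              src = p + 1;
              continue;
            }
        }
      *dst++ = *src++;
    }
  *dst = '\0';
}
```

Given a line comment that ends in backslash, space, newline and is followed by
"int x;", the strict rule (gnu_ext == 0) leaves the text unchanged, so the
declaration survives. With gnu_ext != 0 the two lines are joined and the
declaration becomes part of the comment.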

Looking at the 4.8.2 source, libcpp\lex.c line 1427, there is a fix when
parsing raw strings, after the event:
______________________________________________
static void
lex_raw_string (cpp_reader *pfile, cpp_token *token, const uchar *base,
        const uchar *cur)
{
[...]
      switch (note->type)
        {
        case '\\':
        case ' ':
          /* Restore backslash followed by newline.  */
          BUF_APPEND (base, cur - base);
          base = cur;
          BUF_APPEND ("\\", 1);
        after_backslash:
          if (note->type == ' ')
            {
              /* GNU backslash whitespace newline extension.  FIXME
                 could be any sequence of non-vertical space.  When we
                 can properly restore any such sequence, we should mark
                 this note as handled so _cpp_process_line_notes
                 doesn't warn.  */
              BUF_APPEND (" ", 1);
            }

          BUF_APPEND ("\n", 1);
          break;
______________________________________________

but fixing all the varieties of broken things after the event wouldn't be
necessary if Translation Phase 1 didn't trim whitespace.

If Translation Phase 1 is required to trim whitespace for some reason
(performance, perhaps), then it should trim multiple consecutive spaces down to
exactly one space, which wouldn't break Translation Phases 2 and 3.
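
As a rough sketch of one reading of that compromise (an assumption on my
part: only the run of blanks directly before a newline is collapsed), the
hypothetical helper below keeps exactly one space. It is not GCC code, and
collapse_trailing_blanks is a name invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not GCC code: copy src to dst, replacing each run
   of spaces/tabs that directly precedes a newline with a single space.
   Blanks elsewhere on the line are copied unchanged.  dst must have room
   for strlen (src) + 1 bytes.  */
void
collapse_trailing_blanks (const char *src, char *dst)
{
  assert (src != NULL && dst != NULL);
  while (*src != '\0')
    {
      if (*src == ' ' || *src == '\t')
        {
          const char *p = src;
          while (*p == ' ' || *p == '\t')
            p++;
          if (*p == '\n')
            {
              /* Keep exactly one blank before the newline.  */
              *dst++ = ' ';
              src = p;
              continue;
            }
          /* Interior run: copy it verbatim.  */
          while (src < p)
            *dst++ = *src++;
          continue;
        }
      *dst++ = *src++;
    }
  *dst = '\0';
}
```

After this step a backslash that was followed by several blanks is still
followed by one space, so a Phase 2 that splices only on backslash
immediately followed by newline would not join the lines.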

Does that sound like a sensible compromise?
