Go patch committed: Better handling of invalid ASCII in Go frontend

Ian Lance Taylor Fri, 11 Oct 2013 10:06:06 -0700

This patch improves the handling of invalid ASCII characters in the Go
frontend when they appear in identifiers.  Note that Go input is by
definition always UTF-8, so the ASCII-specific assumptions here are OK.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.  Will commit to 4.8 branch when it reopens.
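
For readers who want the rule outside the diff context, here is a
minimal standalone sketch of the byte classification the patch applies
while gathering an identifier.  The names ByteClass and
classify_identifier_byte are hypothetical and are not part of the
frontend.  The NonAscii case is an assumption about code that sits
outside this hunk, where bytes of 0x80 and above would go through
separate UTF-8 handling.

```cpp
// Hypothetical helper, not part of go/lex.cc: mirrors the tests in the
// patch for a single byte seen while scanning an identifier.
enum class ByteClass
{
  IdentifierChar,  // letter, digit or underscore: keep scanning
  Terminator,      // printable ASCII or whitespace: identifier ends here
  Invalid,         // other ASCII control byte or DEL: report and swallow
  NonAscii         // 0x80 and above: assumed to be handled as UTF-8 elsewhere
};

ByteClass
classify_identifier_byte(unsigned char cc)
{
  if (cc >= 0x80)
    return ByteClass::NonAscii;

  if ((cc >= 'A' && cc <= 'Z')
      || (cc >= 'a' && cc <= 'z')
      || cc == '_'
      || (cc >= '0' && cc <= '9'))
    return ByteClass::IdentifierChar;

  // Same condition as the patch: these bytes end the identifier normally.
  if ((cc >= ' ' && cc < 0x7f)
      || cc == '\t'
      || cc == '\r'
      || cc == '\n')
    return ByteClass::Terminator;

  // Everything left is an invalid character inside an identifier.
  return ByteClass::Invalid;
}
```

Under this sketch, '+' or a newline ends the identifier as before, while
a byte such as 0x01 or 0x7f is the case the patch now reports and
absorbs into the identifier instead of stopping on it.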

Ian

diff -r 99af66244a26 go/lex.cc
--- a/go/lex.cc	Thu Oct 10 20:11:40 2013 -0700
+++ b/go/lex.cc	Fri Oct 11 10:02:05 2013 -0700
@@ -873,7 +873,28 @@
 	      && (cc < 'a' || cc > 'z')
 	      && cc != '_'
 	      && (cc < '0' || cc > '9'))
-	    break;
+	    {
+	      // Check for an invalid character here, as we get better
+	      // error behaviour if we swallow them as part of the
+	      // identifier we are building.
+	      if ((cc >= ' ' && cc < 0x7f)
+		  || cc == '\t'
+		  || cc == '\r'
+		  || cc == '\n')
+		break;
+
+	      this->lineoff_ = p - this->linebuf_;
+	      error_at(this->location(),
+		       "invalid character 0x%x in identifier",
+		       cc);
+	      if (!has_non_ascii_char)
+		{
+		  buf.assign(pstart, p - pstart);
+		  has_non_ascii_char = true;
+		}
+	      if (!Lex::is_invalid_identifier(buf))
+		buf.append("$INVALID$");
+	    }
 	  ++p;
 	  if (is_first)
 	    {
