On Wednesday 27 of November 2013, Reid Kleckner wrote:
> Looks fine, the lexer does this, as well as other places.  Can you add an
> -frewrite-includes test for this?  I think you can rewrite, then run -cc1
> -verify with // expected-no-diagnostics or something like that.

 Oh, did I forget :) ? I updated the patch to the attached version, pushed it
as r195877, and made sure the BOM had actually made it in.
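
 For anyone reading along without the Clang sources at hand, here is a rough
standalone sketch of the idea, not the code from the patch. The names
utf8BomLength, appendWithoutBom and demo are made up for illustration, and
std::string stands in for the buffer type used in InclusionRewriter.cpp. It
only shows why the three BOM bytes are skipped: when an included file is
pasted after other output, its BOM would otherwise land in the middle of the
result.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Clang): number of leading bytes to skip
// so that a UTF-8 BOM (EF BB BF) is not copied into the rewritten output.
std::size_t utf8BomLength(const std::string &Buffer) {
  static const char BOM[] = "\xEF\xBB\xBF";
  // compare() clamps the substring length, so buffers shorter than
  // three bytes simply compare unequal.
  return Buffer.compare(0, 3, BOM) == 0 ? 3 : 0;
}

// Hypothetical stand-in for the rewriter: paste the included text after
// already-written output, starting just past any BOM.
std::string appendWithoutBom(std::string Output, const std::string &Included) {
  Output.append(Included, utf8BomLength(Included), std::string::npos);
  return Output;
}

// Small self-check of the two helpers above.
void demo() {
  const std::string Header = std::string("\xEF\xBB\xBF") + "int x;\n";
  assert(utf8BomLength(Header) == 3);
  assert(appendWithoutBom("// main file\n", Header) == "// main file\nint x;\n");
}
```

 Like the patch, the sketch deliberately handles only the UTF-8 BOM and leaves
every other encoding alone.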

>
> On Wed, Nov 27, 2013 at 3:50 AM, Lubos Lunak <[email protected]> wrote:
> >  Hello,
> >
> >  could somebody please review the attached simple patch for PR15664?
> >
> >  The patch intentionally doesn't do anything fancy except simply removing
> > the BOM (which should be unnecessary for UTF-8 anyway). I'm not sure what
> > exactly handling other UTF encodings would require, and it's surely better
> > to let the compilation fail obviously for whoever runs into that one day,
> > so that they can provide a testcase.
> >
> > --
> >  Lubos Lunak
> >
> > _______________________________________________
> > cfe-commits mailing list
> > [email protected]
> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits



-- 
 Lubos Lunak
From 22d7103297a7b42a1cd76661cf881dcdf8f6c5d5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Lubo=C5=A1=20Lu=C5=88=C3=A1k?= <[email protected]>
Date: Wed, 27 Nov 2013 12:43:30 +0100
Subject: [PATCH] strip UTF-8 BOM in -frewrite-includes (PR#15664)

---
 lib/Rewrite/Frontend/InclusionRewriter.cpp  | 5 +++++
 test/Frontend/Inputs/rewrite-includes-bom.h | 1 +
 test/Frontend/rewrite-includes-bom.c        | 4 ++++
 3 files changed, 10 insertions(+)
 create mode 100644 test/Frontend/Inputs/rewrite-includes-bom.h
 create mode 100644 test/Frontend/rewrite-includes-bom.c

diff --git a/lib/Rewrite/Frontend/InclusionRewriter.cpp b/lib/Rewrite/Frontend/InclusionRewriter.cpp
index bd4250a..71ceb5f 100644
--- a/lib/Rewrite/Frontend/InclusionRewriter.cpp
+++ b/lib/Rewrite/Frontend/InclusionRewriter.cpp
@@ -367,6 +367,11 @@ bool InclusionRewriter::Process(FileID FileId,
   unsigned NextToWrite = 0;
   int Line = 1; // The current input file line number.
 
+  // Ignore UTF-8 BOM, otherwise it'd end up somewhere else than the start
+  // of the resulting file.
+  if (FromFile.getBuffer().startswith("\xEF\xBB\xBF"))
+    NextToWrite = 3;
+
   Token RawToken;
   RawLex.LexFromRawLexer(RawToken);
 
diff --git a/test/Frontend/Inputs/rewrite-includes-bom.h b/test/Frontend/Inputs/rewrite-includes-bom.h
new file mode 100644
index 0000000..7ba011f
--- /dev/null
+++ b/test/Frontend/Inputs/rewrite-includes-bom.h
@@ -0,0 +1 @@
+// This file starts with UTF-8 BOM marker.
diff --git a/test/Frontend/rewrite-includes-bom.c b/test/Frontend/rewrite-includes-bom.c
new file mode 100644
index 0000000..a1aa4c9
--- /dev/null
+++ b/test/Frontend/rewrite-includes-bom.c
@@ -0,0 +1,4 @@
+// RUN: %clang -E -frewrite-includes -I %S/Inputs %s -o - | %clang -fsyntax-only -Xclang -verify -x c -
+// expected-no-diagnostics
+
+#include "rewrite-includes-bom.h"
-- 
1.8.1.4
