On Wednesday 27 of November 2013, Reid Kleckner wrote:
> Looks fine, the lexer does this, as well as other places. Can you add an
> -frewrite-includes test for this? I think you can rewrite, then run -cc1
> -verify with // expected-no-diagnostics or something like that.

 Oh, did I forget :) ? Updated the patch to the attached version, pushed as
r195877 and made sure the BOM had made it in.

>
> On Wed, Nov 27, 2013 at 3:50 AM, Lubos Lunak <[email protected]> wrote:
> >  Hello,
> >
> >  could somebody please review the attached simple patch for PR15664?
> >
> >  The patch intentionally doesn't do anything fancy except simply removing
> > the BOM (which should be unnecessary for UTF-8 anyway). I'm not sure what
> > handling other UTF encodings would exactly require, and it's surely
> > better to keep the compilation obviously fail for whoever will possibly
> > run into that one day and provide a testcase.
> >
> > --
> >  Lubos Lunak
> >
> > _______________________________________________
> > cfe-commits mailing list
> > [email protected]
> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

-- 
 Lubos Lunak
From 22d7103297a7b42a1cd76661cf881dcdf8f6c5d5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Lubo=C5=A1=20Lu=C5=88=C3=A1k?= <[email protected]>
Date: Wed, 27 Nov 2013 12:43:30 +0100
Subject: [PATCH] strip UTF-8 BOM in -frewrite-includes (PR#15664)

---
 lib/Rewrite/Frontend/InclusionRewriter.cpp  | 5 +++++
 test/Frontend/Inputs/rewrite-includes-bom.h | 1 +
 test/Frontend/rewrite-includes-bom.c        | 4 ++++
 3 files changed, 10 insertions(+)
 create mode 100644 test/Frontend/Inputs/rewrite-includes-bom.h
 create mode 100644 test/Frontend/rewrite-includes-bom.c

diff --git a/lib/Rewrite/Frontend/InclusionRewriter.cpp b/lib/Rewrite/Frontend/InclusionRewriter.cpp
index bd4250a..71ceb5f 100644
--- a/lib/Rewrite/Frontend/InclusionRewriter.cpp
+++ b/lib/Rewrite/Frontend/InclusionRewriter.cpp
@@ -367,6 +367,11 @@ bool InclusionRewriter::Process(FileID FileId,
   unsigned NextToWrite = 0;
   int Line = 1; // The current input file line number.
 
+  // Ignore UTF-8 BOM, otherwise it'd end up somewhere else than the start
+  // of the resulting file.
+  if (FromFile.getBuffer().startswith("\xEF\xBB\xBF"))
+    NextToWrite = 3;
+
   Token RawToken;
   RawLex.LexFromRawLexer(RawToken);
 
diff --git a/test/Frontend/Inputs/rewrite-includes-bom.h b/test/Frontend/Inputs/rewrite-includes-bom.h
new file mode 100644
index 0000000..7ba011f
--- /dev/null
+++ b/test/Frontend/Inputs/rewrite-includes-bom.h
@@ -0,0 +1 @@
+// This file starts with UTF-8 BOM marker.
diff --git a/test/Frontend/rewrite-includes-bom.c b/test/Frontend/rewrite-includes-bom.c
new file mode 100644
index 0000000..a1aa4c9
--- /dev/null
+++ b/test/Frontend/rewrite-includes-bom.c
@@ -0,0 +1,4 @@
+// RUN: %clang -E -frewrite-includes -I %S/Inputs %s -o - | %clang -fsyntax-only -Xclang -verify -x c -
+// expected-no-diagnostics
+
+#include "rewrite-includes-bom.h"
-- 
1.8.1.4
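
For anyone who wants to try the check outside of Clang, here is a minimal standalone sketch of the same idea: if the buffer starts with the three UTF-8 BOM bytes (EF BB BF), start copying at offset 3 instead of 0. The helper names utf8BomLength and withoutUtf8Bom are made up for this illustration and are not part of the patch or of any Clang/LLVM API; the sketch assumes C++17 for std::string_view.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of Clang): returns how many leading bytes
// to skip so that a UTF-8 BOM is not copied into the rewritten output.
// Returns 3 when the buffer starts with EF BB BF, otherwise 0.
constexpr std::size_t utf8BomLength(std::string_view Buffer) {
  constexpr std::string_view BOM = "\xEF\xBB\xBF";
  return Buffer.substr(0, BOM.size()) == BOM ? BOM.size() : 0;
}

// Hypothetical helper: view of the buffer with a leading UTF-8 BOM removed.
// Only UTF-8 is handled, mirroring the patch; other encodings are untouched.
constexpr std::string_view withoutUtf8Bom(std::string_view Buffer) {
  return Buffer.substr(utf8BomLength(Buffer));
}
```

The value returned by utf8BomLength plays the role of the initial NextToWrite in the patch. A buffer shorter than three bytes, or one holding only part of the BOM, is left alone.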
