[PATCH] D64538: [Driver] Don't escape backslashes on Windows

2019-08-06 Thread Shoaib Meenai via Phabricator via cfe-commits
smeenai abandoned this revision.
smeenai added a comment.

Abandoning in favor of D65839 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64538/new/

https://reviews.llvm.org/D64538



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64538: [Driver] Don't escape backslashes on Windows

2019-07-10 Thread Reid Kleckner via Phabricator via cfe-commits
rnk added a comment.

In D64538#1579632 , @smeenai wrote:

> I'm not fully understanding this. If the .sh file was generated on Windows, 
> it'll contain backslashes in any arguments that are paths, right? Before this 
> change, those backslashes would have been doubled up, but the backslash still 
> wouldn't work as a path separator on Linux, so those arguments would need 
> manual adjustment. I guess the other case that matters is if you have a 
> backslash in an argument which isn't a path, in which case the old behavior 
> was definitely correct and the new one definitely isn't.


At least in Chromium, users pass flags like 
`-DFOO_EXPORT="__declspec(dllexport)"`. I can't remember a concrete instance 
where a user passed non-path flags with backslashes in them, but you could 
imagine a hypothetical argument like: `-DSOME_STRING="foo \"bar baz\""`, and if 
we don't escape those backslashes, the argument will be tokenized incorrectly. 
I guess that example is an objection to this part of the comment:

>   // double quotes only when it's followed by $, `, ", \, or a literal 
> newline;
>   // the last three are disallowed in paths on Windows and the first two are
>   // unlikely, so this shouldn't cause issues in practice. Note that we always

I can also imagine tokenization going wrong if some inconsequential path ends 
in a trailing backslash: `-fdebug-compilation-dir=C:\foo\bar\` -> 
`"-fdebug-compilation-dir=C:\foo\bar\"`

I like what we do now because it's safe and handles the general case of 
backslashes in flags without any gotchas or corner cases. If the specific issue 
that we have is that driver tests are hard to write because of the variance in 
path printing, driver testing could already be greatly improved by adding a 
filecheck-friendly version of -###, so we should do that anyway.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64538/new/

https://reviews.llvm.org/D64538



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64538: [Driver] Don't escape backslashes on Windows

2019-07-10 Thread Shoaib Meenai via Phabricator via cfe-commits
smeenai added a comment.

In D64538#1579610 , @rnk wrote:

> In D64538#1579573 , @smeenai wrote:
>
> > In D64538#1579561 , @rnk wrote:
> >
> > > An extremely convenient feature of the current escaping pattern is that 
> > > it magically works for both the cmd shell and various bash 
> > > implementations. You can simply copy paste the commands and run them. 
> > > This matters, for example, for the .sh crash reproducer script. On 
> > > Windows today, you can just run it with bash, and it works. So, I'd 
> > > prefer it if we didn't do this.
> > >
> > > If you want -### output to be more FileCheck friendly, then I would 
> > > recommend adding a new driver flag for testing that dumps the commands 
> > > without doing any escaping. This came up before:
> > >  https://reviews.llvm.org/D62493
> >
> >
> > As long as the backslashes aren't followed by a few special characters (as 
> > detailed in my comment), the backslash should still continue to work. I've 
> > been testing this with git bash and I'm still able to copy-paste the ouptut 
> > command.
>
>
> For the use case of crash reproduction, we often get these .sh files from 
> users, and then try to reproduce and reduce the crash on Linux. Right now 
> it's just a matter of adjusting the clang binary being run to reproduce a 
> crash, but with this change, it will be harder. The current escaping is kind 
> of the universal currency, a quoting style that is accepted everywhere.


I'm not fully understanding this. If the .sh file was generated on Windows, 
it'll contain backslashes in any arguments that are paths, right? Before this 
change, those backslashes would have been doubled up, but the backslash still 
wouldn't work as a path separator on Linux, so those arguments would need 
manual adjustment. I guess the other case that matters is if you have a 
backslash in an argument which isn't a path, in which case the old behavior was 
definitely correct and the new one definitely isn't.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64538/new/

https://reviews.llvm.org/D64538



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64538: [Driver] Don't escape backslashes on Windows

2019-07-10 Thread Reid Kleckner via Phabricator via cfe-commits
rnk added a comment.

In D64538#1579573 , @smeenai wrote:

> In D64538#1579561 , @rnk wrote:
>
> > An extremely convenient feature of the current escaping pattern is that it 
> > magically works for both the cmd shell and various bash implementations. 
> > You can simply copy paste the commands and run them. This matters, for 
> > example, for the .sh crash reproducer script. On Windows today, you can 
> > just run it with bash, and it works. So, I'd prefer it if we didn't do this.
> >
> > If you want -### output to be more FileCheck friendly, then I would 
> > recommend adding a new driver flag for testing that dumps the commands 
> > without doing any escaping. This came up before:
> >  https://reviews.llvm.org/D62493
>
>
> As long as the backslashes aren't followed by a few special characters (as 
> detailed in my comment), the backslash should still continue to work. I've 
> been testing this with git bash and I'm still able to copy-paste the ouptut 
> command.


For the use case of crash reproduction, we often get these .sh files from 
users, and then try to reproduce and reduce the crash on Linux. Right now it's 
just a matter of adjusting the clang binary being run to reproduce a crash, but 
with this change, it will be harder. The current escaping is kind of the 
universal currency, a quoting style that is accepted everywhere.

> I'm okay adding a new driver flag which just skips all escaping, but I'd also 
> prefer for stuff running on Windows to follow Windows path conventions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64538/new/

https://reviews.llvm.org/D64538



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64538: [Driver] Don't escape backslashes on Windows

2019-07-10 Thread Shoaib Meenai via Phabricator via cfe-commits
smeenai added a comment.

In D64538#1579561 , @rnk wrote:

> An extremely convenient feature of the current escaping pattern is that it 
> magically works for both the cmd shell and various bash implementations. You 
> can simply copy paste the commands and run them. This matters, for example, 
> for the .sh crash reproducer script. On Windows today, you can just run it 
> with bash, and it works. So, I'd prefer it if we didn't do this.
>
> If you want -### output to be more FileCheck friendly, then I would recommend 
> adding a new driver flag for testing that dumps the commands without doing 
> any escaping. This came up before:
>  https://reviews.llvm.org/D62493


As long as the backslashes aren't followed by a few special characters (as 
detailed in my comment), the backslash should still continue to work. I've been 
testing this with git bash and I'm still able to copy-paste the ouptut command.

I'm okay adding a new driver flag which just skips all escaping, but I'd also 
prefer for stuff running on Windows to follow Windows path conventions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64538/new/

https://reviews.llvm.org/D64538



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64538: [Driver] Don't escape backslashes on Windows

2019-07-10 Thread Reid Kleckner via Phabricator via cfe-commits
rnk added a comment.

An extremely convenient feature of the current escaping pattern is that it 
magically works for both the cmd shell and various bash implementations. You 
can simply copy paste the commands and run them. This matters, for example, for 
the .sh crash reproducer script. On Windows today, you can just run it with 
bash, and it works. So, I'd prefer it if we didn't do this.

If you want -### output to be more FileCheck friendly, then I would recommend 
adding a new driver flag for testing that dumps the commands without doing any 
escaping. This came up before:
https://reviews.llvm.org/D62493


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64538/new/

https://reviews.llvm.org/D64538



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64538: [Driver] Don't escape backslashes on Windows

2019-07-10 Thread Shoaib Meenai via Phabricator via cfe-commits
smeenai added a subscriber: phosek.
smeenai added a comment.

CC @phosek, since I believe you had issues writing tests referencing the 
installation directory in the past.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64538/new/

https://reviews.llvm.org/D64538



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64538: [Driver] Don't escape backslashes on Windows

2019-07-10 Thread Shoaib Meenai via Phabricator via cfe-commits
smeenai added a comment.

In D64538#1579510 , @smeenai wrote:

> Hmm, this only fixes `-###`. Looking into `-v` now.


Ignore this – I was testing with a version without my change :D


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64538/new/

https://reviews.llvm.org/D64538



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64538: [Driver] Don't escape backslashes on Windows

2019-07-10 Thread Shoaib Meenai via Phabricator via cfe-commits
smeenai added a comment.

Hmm, this only fixes `-###`. Looking into `-v` now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64538/new/

https://reviews.llvm.org/D64538



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64538: [Driver] Don't escape backslashes on Windows

2019-07-10 Thread Shoaib Meenai via Phabricator via cfe-commits
smeenai created this revision.
smeenai added reviewers: compnerd, hans, mstorsjo, rnk, thakis, zturner.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Escaping backslashes results in unnatural looking output, and the actual
escapes are mostly unnecessary. We were also not consistent about when
we escaped backslashes and when we didn't, making it impossible to e.g.
store the InstalledDir output to a FileCheck variable and then use that
variable to check paths relative to the driver.

This change could be generalized to any platform which uses the
backslash as a path separator instead of just Windows. I'm unaware of
other such platforms, however, and I'm also unsure if not escaping the
backslash would be appropriate for those platforms, so I'm limiting to
Windows for now.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D64538

Files:
  clang/lib/Driver/Job.cpp
  clang/test/Driver/backslash.c


Index: clang/test/Driver/backslash.c
===
--- /dev/null
+++ clang/test/Driver/backslash.c
@@ -0,0 +1,6 @@
+// RUN: %clang -fsyntax-only %s -### 2>&1 | FileCheck %s
+// Ensure the treatment of backslashes is consistent in the printed 
InstalledDir
+// and commands. We would previous escape backslashes in the printed commands
+// but not in the printed InstallDir, causing the match to fail.
+// CHECK: InstalledDir: [[INSTALLEDDIR:.+$]]
+// CHECK: "[[INSTALLEDDIR]]{{[/\\]}}clang
Index: clang/lib/Driver/Job.cpp
===
--- clang/lib/Driver/Job.cpp
+++ clang/lib/Driver/Job.cpp
@@ -107,9 +107,26 @@
   }
 
   // Quote and escape. This isn't really complete, but good enough.
+  // On Windows, the backslash is the path separator, and escaping the 
backslash
+  // results in unnatural-looking output that's also inconsistent with how 
paths
+  // are printed out elsewhere (e.g. when printing out the installation
+  // directory). Windows native shells (cmd and PowerShell) use different 
escape
+  // characters so escaping the backslash is unnecessary.  bash (the most 
common
+  // non-native shell on Windows) uses backslash as an escape character inside
+  // double quotes only when it's followed by $, `, ", \, or a literal newline;
+  // the last three are disallowed in paths on Windows and the first two are
+  // unlikely, so this shouldn't cause issues in practice. Note that we always
+  // enclose an argument containing backslashes in quotes even if we aren't
+  // escaping the backslashes, since bash always interprets it as an escape
+  // character outside quotes.
+#if defined(_WIN32)
+  constexpr bool RunningOnWindows = true;
+#else
+  constexpr bool RunningOnWindows = false;
+#endif
   OS << '"';
   for (const auto c : Arg) {
-if (c == '"' || c == '\\' || c == '$')
+if (c == '"' || (c == '\\' && !RunningOnWindows) || c == '$')
   OS << '\\';
 OS << c;
   }


Index: clang/test/Driver/backslash.c
===
--- /dev/null
+++ clang/test/Driver/backslash.c
@@ -0,0 +1,6 @@
+// RUN: %clang -fsyntax-only %s -### 2>&1 | FileCheck %s
+// Ensure the treatment of backslashes is consistent in the printed InstalledDir
+// and commands. We would previous escape backslashes in the printed commands
+// but not in the printed InstallDir, causing the match to fail.
+// CHECK: InstalledDir: [[INSTALLEDDIR:.+$]]
+// CHECK: "[[INSTALLEDDIR]]{{[/\\]}}clang
Index: clang/lib/Driver/Job.cpp
===
--- clang/lib/Driver/Job.cpp
+++ clang/lib/Driver/Job.cpp
@@ -107,9 +107,26 @@
   }
 
   // Quote and escape. This isn't really complete, but good enough.
+  // On Windows, the backslash is the path separator, and escaping the backslash
+  // results in unnatural-looking output that's also inconsistent with how paths
+  // are printed out elsewhere (e.g. when printing out the installation
+  // directory). Windows native shells (cmd and PowerShell) use different escape
+  // characters so escaping the backslash is unnecessary.  bash (the most common
+  // non-native shell on Windows) uses backslash as an escape character inside
+  // double quotes only when it's followed by $, `, ", \, or a literal newline;
+  // the last three are disallowed in paths on Windows and the first two are
+  // unlikely, so this shouldn't cause issues in practice. Note that we always
+  // enclose an argument containing backslashes in quotes even if we aren't
+  // escaping the backslashes, since bash always interprets it as an escape
+  // character outside quotes.
+#if defined(_WIN32)
+  constexpr bool RunningOnWindows = true;
+#else
+  constexpr bool RunningOnWindows = false;
+#endif
   OS << '"';
   for (const auto c : Arg) {
-if (c == '"' || c == '\\' || c == '$')
+if (c == '"' || (c == '\\' && !RunningOnWindows) || c == '$')
   OS << '\\';
 OS << c;
   }