[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-16 Thread Joerg Sonnenberger via Phabricator via cfe-commits
joerg added a comment.

We already have a couple of case that expect the encoding to be compatible. I'm 
not very attached to the additional special cases from YAML, but having either 
a common escape function OR a JSON escape in LLVM/Support is quite important.


https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-16 Thread Marc-Andre Laperle via Phabricator via cfe-commits
malaperle-ericsson added a comment.

In https://reviews.llvm.org/D31992#728036, @joerg wrote:

> Just to avoid any confusion: this should be the generic YAML escape routine 
> in llvm/lib/Support, i.e. IMO we don't want to have separate YAML and JSON 
> escape routines.
>  Effective, change YAMLParser.cpp line 697 to use \u and drop the whole UTF-8 
> handling part.


I'm not sure it's possible or desirable to reuse yaml::escape as they are two 
formats and it's expected that a yaml output might not be able to be read by a 
json parser. If I look at the yaml::escape routine, there are several escape 
that are incompatible with JSON like \0, \v, \e, and 32-bit handling via \U.
I think ideally this patch would be a stop-gap solution until jsoncpp is 
introduced so that one can build json output in a structured way (as opposed to 
manually appending strings now) and the output will also be escape via the 
library.


https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-16 Thread Joerg Sonnenberger via Phabricator via cfe-commits
joerg added a comment.

Just to avoid any confusion: this should be the generic YAML escape routine in 
llvm/lib/Support, i.e. IMO we don't want to have separate YAML and JSON escape 
routines.
Effective, change YAMLParser.cpp line 697 to use \u and drop the whole UTF-8 
handling part.


https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-16 Thread Marc-Andre Laperle via Phabricator via cfe-commits
malaperle-ericsson added a comment.

Once the use of "two characters representation" is clarified, I will update the 
patch again.


https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-16 Thread Joerg Sonnenberger via Phabricator via cfe-commits
joerg added a comment.

The short version is perfectly fine as long as works for both JSON and YAML. 
Less output is always good :)


https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-16 Thread Marc-Andre Laperle via Phabricator via cfe-commits
malaperle-ericsson updated this revision to Diff 95414.
malaperle-ericsson added a comment.

Handle other control characters and add test


https://reviews.llvm.org/D31992

Files:
  clangd/ASTManager.cpp
  clangd/Protocol.cpp
  clangd/Protocol.h
  clangd/ProtocolHandlers.cpp
  test/clangd/encoding.test

Index: test/clangd/encoding.test
===
--- /dev/null
+++ test/clangd/encoding.test
@@ -0,0 +1,17 @@
+# RUN: clangd -run-synchronously < %s | FileCheck %s
+# It is absolutely vital that this file has CRLF line endings.
+#
+Content-Length: 125
+
+{"jsonrpc":"2.0","id":0,"method":"initialize","params":{"processId":123,"rootPath":"clangd","capabilities":{},"trace":"off"}}
+#
+Content-Length: 154
+
+{"jsonrpc":"2.0","method":"textDocument/didOpen","params":{"textDocument":{"uri":"file:///foo.c","languageId":"c","version":1,"text":"void main() {é;}"}}}
+#
+# CHECK: {"jsonrpc":"2.0","method":"textDocument/publishDiagnostics","params":{"uri":"file:///foo.c","diagnostics":[{"range":{"start": {"line": 0, "character": 1}, "end": {"line": 0, "character": 1}},"severity":2,"message":"return type of 'main' is not 'int'"},{"range":{"start": {"line": 0, "character": 1}, "end": {"line": 0, "character": 1}},"severity":3,"message":"change return type to 'int'"},{"range":{"start": {"line": 0, "character": 14}, "end": {"line": 0, "character": 14}},"severity":1,"message":"use of undeclared identifier 'é'"}]}}
+#
+#
+Content-Length: 44
+
+{"jsonrpc":"2.0","id":5,"method":"shutdown"}
Index: clangd/ProtocolHandlers.cpp
===
--- clangd/ProtocolHandlers.cpp
+++ clangd/ProtocolHandlers.cpp
@@ -172,9 +172,9 @@
 
 if (!Edits.empty())
   Commands +=
-  R"({"title":"Apply FixIt ')" + llvm::yaml::escape(D.message) +
+  R"({"title":"Apply FixIt ')" + jsonEscape(D.message) +
   R"('", "command": "clangd.applyFix", "arguments": [")" +
-  llvm::yaml::escape(CAP->textDocument.uri.uri) +
+  jsonEscape(CAP->textDocument.uri.uri) +
   R"(", [)" + Edits +
   R"(]]},)";
   }
Index: clangd/Protocol.h
===
--- clangd/Protocol.h
+++ clangd/Protocol.h
@@ -29,6 +29,8 @@
 namespace clang {
 namespace clangd {
 
+std::string jsonEscape(llvm::StringRef Input);
+
 struct URI {
   std::string uri;
   std::string file;
Index: clangd/Protocol.cpp
===
--- clangd/Protocol.cpp
+++ clangd/Protocol.cpp
@@ -15,12 +15,48 @@
 #include "Protocol.h"
 #include "clang/Basic/LLVM.h"
 #include "llvm/ADT/SmallString.h"
+#include 
 #include "llvm/Support/Format.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Support/Path.h"
 using namespace clang::clangd;
 
 
+std::string clang::clangd::jsonEscape(llvm::StringRef Input) {
+  std::string EscapedInput;
+  for (llvm::StringRef::iterator i = Input.begin(), e = Input.end(); i != e; ++i) {
+if (*i == '\\')
+  EscapedInput += "";
+else if (*i == '"')
+  EscapedInput += "\\\"";
+// bell
+else if (*i == 0x07)
+  EscapedInput += "\\a";
+// backspace
+else if (*i == 0x08)
+  EscapedInput += "\\b";
+// hoz tab
+else if (*i == 0x09)
+  EscapedInput += "\\t";
+// new line
+else if (*i == 0x0A)
+  EscapedInput += "\\n";
+// form feed
+else if (*i == 0x0C)
+  EscapedInput += "\\f";
+// carr return
+else if (*i == 0x0D)
+  EscapedInput += "\\r";
+else if ((unsigned char)*i < 0x20) { // Control characters not handled above.
+  std::string HexStr = llvm::utohexstr(*i);
+  EscapedInput += "\\u" + std::string(4 - HexStr.size(), '0') + HexStr;
+}
+else
+  EscapedInput.push_back(*i);
+  }
+  return EscapedInput;
+}
+
 URI URI::fromUri(llvm::StringRef uri) {
   URI Result;
   Result.uri = uri;
@@ -230,7 +266,7 @@
   std::string Result;
   llvm::raw_string_ostream(Result) << llvm::format(
   R"({"range": %s, "newText": "%s"})", Range::unparse(P.range).c_str(),
-  llvm::yaml::escape(P.newText).c_str());
+  jsonEscape(P.newText).c_str());
   return Result;
 }
 
@@ -670,20 +706,20 @@
   std::string Result = "{";
   llvm::raw_string_ostream Os(Result);
   assert(!CI.label.empty() && "completion item label is required");
-  Os << R"("label":")" << llvm::yaml::escape(CI.label) << R"(",)";
+  Os << R"("label":")" << jsonEscape(CI.label) << R"(",)";
   if (CI.kind != CompletionItemKind::Missing)
 Os << R"("kind":)" << static_cast(CI.kind) << R"(,)";
   if (!CI.detail.empty())
-Os << R"("detail":")" << llvm::yaml::escape(CI.detail) << R"(",)";
+Os << R"("detail":")" << jsonEscape(CI.detail) << R"(",)";
   if (!CI.documentation.empty())
-Os << R"("documentation":")" << llvm::yaml::escape(CI.documentation)
+Os << R"("documentation":")" << jsonEscape(CI.documentation)
<< R"(",)";
   if 

[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-16 Thread Marc-Andre Laperle via Phabricator via cfe-commits
malaperle-ericsson added a comment.

In https://reviews.llvm.org/D31992#726447, @joerg wrote:

>   let's escape ... all the known ASCII control characters,


Do you mean encode all of them with \u or keep the two characters 
representation for those that exist? I think \n is nicer than \u000A and is 
probably common enough to keep the short version. Same for \r, \t.


Repository:
  rL LLVM

https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-13 Thread Joerg Sonnenberger via Phabricator via cfe-commits
joerg added a comment.

In https://reviews.llvm.org/D31992#725963, @klimek wrote:

> If I understand correctly, that's " and \ for JSON and ", \ and all 
> non-printable characters (which unfortunately requires understanding of 
> unicode to solve this fully correctly) in YAML.


I'd modify that slightly: let's escape those two and all the known ASCII 
control characters, but just pass the rest through. I.e. reduce the current \x 
etc handling to known problematic input and not second guess the rest.


Repository:
  rL LLVM

https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-13 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In https://reviews.llvm.org/D31992#725912, @malaperle-ericsson wrote:

> In https://reviews.llvm.org/D31992#725866, @krasimir wrote:
>
> > Seems that we're starting to hit some YAML/JSON mismatches, or is it that 
> > your YAML string support is lacking?
>
>
> I don't think so. It seems like JSON and YAML are not completely aligned on 
> escaped characters. See http://yaml.org/spec/1.2/spec.html#id2776092 which 
> specifies unicode 8, 16, and 32 bits are escaped with \x, \u, and \U whereas 
> http://www.json.org/string.gif specifies that unicode 16 bit characters can 
> be encoded with \u but \x and \U are not supported. This leads me to believe 
> that a YAML parser can read JSON but a JSON parser will not necessarily be 
> able to read YAML. I thought about using json cpp but that's a much bigger 
> change


Here we only talk about what we escape, which should be the minimum required.
If I understand correctly, that's " and \ for JSON and ", \ and all 
non-printable characters (which unfortunately requires understanding of unicode 
to solve this fully correctly) in YAML.
Am I missing something?


Repository:
  rL LLVM

https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-13 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clangd/Protocol.cpp:26-50
+  for (llvm::StringRef::iterator i = Input.begin(), e = Input.end(); i != e; 
++i) {
+if (*i == '\\')
+  EscapedInput += "";
+else if (*i == '"')
+  EscapedInput += "\\\"";
+// bell
+else if (*i == 0x07)

For json we only need the first 2 though, right?


Repository:
  rL LLVM

https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-13 Thread Marc-Andre Laperle via Phabricator via cfe-commits
malaperle-ericsson added a comment.

In https://reviews.llvm.org/D31992#725913, @joerg wrote:

> I'm strongly against this patch. Can you give an actual test case for the 
> problematic behavior?


Sure I can add a test. If you meant more real work scenario, you can juste type 
"é" in VS Code and it will throw an exception trying to parse \x


Repository:
  rL LLVM

https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-13 Thread Joerg Sonnenberger via Phabricator via cfe-commits
joerg added a comment.

I'm strongly against this patch. Can you give an actual test case for the 
problematic behavior?


Repository:
  rL LLVM

https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-13 Thread Marc-Andre Laperle via Phabricator via cfe-commits
malaperle-ericsson added a comment.

In https://reviews.llvm.org/D31992#725866, @krasimir wrote:

> Seems that we're starting to hit some YAML/JSON mismatches, or is it that 
> your YAML string support is lacking?


I don't think so. It seems like JSON and YAML are not completely aligned on 
escaped characters. See http://yaml.org/spec/1.2/spec.html#id2776092 which 
specifies unicode 8, 16, and 32 bits are escaped with \x, \u, and \U whereas 
http://www.json.org/string.gif specifies that unicode 16 bit characters can be 
encoded with \u but \x and \U are not supported. This leads me to believe that 
a YAML parser can read JSON but a JSON parser will not necessarily be able to 
read YAML. I thought about using json cpp but that's a much bigger change


Repository:
  rL LLVM

https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-13 Thread Krasimir Georgiev via cfe-commits
sorry, is it that *our* YAML string support is lacking?

On Thu, Apr 13, 2017 at 2:38 PM, Krasimir Georgiev via Phabricator <
revi...@reviews.llvm.org> wrote:

> krasimir added a comment.
>
> Seems that we're starting to hit some YAML/JSON mismatches, or is it that
> your YAML string support is lacking?
>
>
> Repository:
>   rL LLVM
>
> https://reviews.llvm.org/D31992
>
>
>
>
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31992: [clangd] Escape only necessary characters in JSON output

2017-04-13 Thread Krasimir Georgiev via Phabricator via cfe-commits
krasimir added a comment.

Seems that we're starting to hit some YAML/JSON mismatches, or is it that your 
YAML string support is lacking?


Repository:
  rL LLVM

https://reviews.llvm.org/D31992



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits