[PATCH] D146654: [clang] replaces numeric SARIF ids with heirarchical names

2023-05-17 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a comment.

My primary concern with this is that we're calling these stable IDs -- they're 
not really stable IDs, they're internal implementation details that we change 
from time to time. So what happens when we rename `ext_foo_bar` into 
`warn_foo_bar` as far as SARIF is concerned? Does it matter, or are we setting 
some tools up for problems because they'll want to map between the old name and 
the new name if they're the "same diagnostic" as before?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146654/new/

https://reviews.llvm.org/D146654

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D146654: [clang] replaces numeric SARIF ids with heirarchical names

2023-04-28 Thread Christopher Di Bella via Phabricator via cfe-commits
cjdb added inline comments.



Comment at: clang/lib/Frontend/SARIFDiagnostic.cpp:51-52
+  Diag->getDiags()->getDiagnosticIDs()->getStableName(Diag->getID()).str();
+  std::replace(StableName.begin(), StableName.end(), '_', '.');
+  SarifRule Rule = SarifRule::create().setRuleId(StableName);
 

Endill wrote:
> vaibhav.y wrote:
> > cjdb wrote:
> > > denik wrote:
> > > > §3.5.4 says that the hierarchical strings are separated by "/".
> > > > 
> > > > But more generally, are diagnostic names really fall under 
> > > > "hierarchical" definition?
> > > > Words between underscores should be treated as components. And $3.27.5 
> > > > says that the leading components have to be the identifier of the rule.
> > > > In some cases they look like valid components, e.g. `err_access`, 
> > > > `err_access_dtor`, `err_access_dtor_exception`.
> > > > But in cases like `err_cannot_open_file` neither of the leading 
> > > > components exists.
> > > > 
> > > > Theoretically we could use groups as the leading component for warnings 
> > > > for example. For errors the leading components are probably not even 
> > > > necessary, since if I understood correctly they are needed to suppress 
> > > > subsets of violations on the SARIF consumer side.
> > > > Or we just could keep the names with underscores as is. WDYT?
> > > I think in light of what you've said, changing back to underscores is 
> > > probably best.
> > > But more generally, are diagnostic names really fall under "hierarchical" 
> > > definition?
> > 
> > I have the same concern, but this is okay for a first pass as a "flat 
> > hierarchy" :)
> > 
> > If we want a deeper structure, we'll need some extra metadata in 
> > `DiagnosticSemaKinds.td`'s `Error<...>`  to add cluster names as a follow 
> > up to this.
> > 
> > WDYT about something like `clang/visibility/err_access` or 
> > `clang/syntax/err_stmtexpr_file_scope`.
> > 
> > Alternatively: we could draw from the c++ standard structure: 
> > https://eel.is/c++draft/basic.def.odr#term.odr.use and say the error code 
> > for an ODR violation could be `clang/basic/def/odr`, again I'm unsure how 
> > well this meshes with clang's diagnostic model.
> > Alternatively: we could draw from the c++ standard structure: 
> > https://eel.is/c++draft/basic.def.odr#term.odr.use and say the error code 
> > for an ODR violation could be clang/basic/def/odr, again I'm unsure how 
> > well this meshes with clang's diagnostic model.
> 
> The only reliable thing there are stable clause names like 
> `[namespace.udecl]`, because there are precedents of them being rearranged in 
> the table of contents ([[ 
> https://github.com/cplusplus/draft/commit/982a456f176ca00409c6e514af932051dce2485f
>  | 1 ]], [[ 
> https://github.com/cplusplus/draft/commit/3c580cd204fde95a21de1830ace75d14d429f845
>  | 2 ]]). We've got bitten by that in C++ conformance tests, which use table 
> of contents for directory hierarchy.
> WDYT about something like `clang/visibility/err_access` or 
> `clang/syntax/err_stmtexpr_file_scope`.

I think this might work!

> > Alternatively: we could draw from the c++ standard structure: 
> > https://eel.is/c++draft/basic.def.odr#term.odr.use and say the error code 
> > for an ODR violation could be clang/basic/def/odr, again I'm unsure how 
> > well this meshes with clang's diagnostic model.
> 
> The only reliable thing there are stable clause names like 
> `[namespace.udecl]`, because there are precedents of them being rearranged in 
> the table of contents ([[ 
> https://github.com/cplusplus/draft/commit/982a456f176ca00409c6e514af932051dce2485f
>  | 1 ]], [[ 
> https://github.com/cplusplus/draft/commit/3c580cd204fde95a21de1830ace75d14d429f845
>  | 2 ]]). We've got bitten by that in C++ conformance tests, which use table 
> of contents for directory hierarchy.

Right. Given that section names aren't actually stable, relying on those would 
be brittle at best.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146654/new/

https://reviews.llvm.org/D146654

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D146654: [clang] replaces numeric SARIF ids with heirarchical names

2023-04-28 Thread Vlad Serebrennikov via Phabricator via cfe-commits
Endill added inline comments.



Comment at: clang/lib/Frontend/SARIFDiagnostic.cpp:51-52
+  Diag->getDiags()->getDiagnosticIDs()->getStableName(Diag->getID()).str();
+  std::replace(StableName.begin(), StableName.end(), '_', '.');
+  SarifRule Rule = SarifRule::create().setRuleId(StableName);
 

vaibhav.y wrote:
> cjdb wrote:
> > denik wrote:
> > > §3.5.4 says that the hierarchical strings are separated by "/".
> > > 
> > > But more generally, are diagnostic names really fall under "hierarchical" 
> > > definition?
> > > Words between underscores should be treated as components. And $3.27.5 
> > > says that the leading components have to be the identifier of the rule.
> > > In some cases they look like valid components, e.g. `err_access`, 
> > > `err_access_dtor`, `err_access_dtor_exception`.
> > > But in cases like `err_cannot_open_file` neither of the leading 
> > > components exists.
> > > 
> > > Theoretically we could use groups as the leading component for warnings 
> > > for example. For errors the leading components are probably not even 
> > > necessary, since if I understood correctly they are needed to suppress 
> > > subsets of violations on the SARIF consumer side.
> > > Or we just could keep the names with underscores as is. WDYT?
> > I think in light of what you've said, changing back to underscores is 
> > probably best.
> > But more generally, are diagnostic names really fall under "hierarchical" 
> > definition?
> 
> I have the same concern, but this is okay for a first pass as a "flat 
> hierarchy" :)
> 
> If we want a deeper structure, we'll need some extra metadata in 
> `DiagnosticSemaKinds.td`'s `Error<...>`  to add cluster names as a follow up 
> to this.
> 
> WDYT about something like `clang/visibility/err_access` or 
> `clang/syntax/err_stmtexpr_file_scope`.
> 
> Alternatively: we could draw from the c++ standard structure: 
> https://eel.is/c++draft/basic.def.odr#term.odr.use and say the error code for 
> an ODR violation could be `clang/basic/def/odr`, again I'm unsure how well 
> this meshes with clang's diagnostic model.
> Alternatively: we could draw from the c++ standard structure: 
> https://eel.is/c++draft/basic.def.odr#term.odr.use and say the error code for 
> an ODR violation could be clang/basic/def/odr, again I'm unsure how well this 
> meshes with clang's diagnostic model.

The only reliable thing there are stable clause names like `[namespace.udecl]`, 
because there are precedents of them being rearranged in the table of contents 
([[ 
https://github.com/cplusplus/draft/commit/982a456f176ca00409c6e514af932051dce2485f
 | 1 ]], [[ 
https://github.com/cplusplus/draft/commit/3c580cd204fde95a21de1830ace75d14d429f845
 | 2 ]]). We've got bitten by that in C++ conformance tests, which use table of 
contents for directory hierarchy.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146654/new/

https://reviews.llvm.org/D146654

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D146654: [clang] replaces numeric SARIF ids with heirarchical names

2023-04-05 Thread Vaibhav Yenamandra via Phabricator via cfe-commits
vaibhav.y added inline comments.



Comment at: clang/lib/Frontend/SARIFDiagnostic.cpp:51-52
+  Diag->getDiags()->getDiagnosticIDs()->getStableName(Diag->getID()).str();
+  std::replace(StableName.begin(), StableName.end(), '_', '.');
+  SarifRule Rule = SarifRule::create().setRuleId(StableName);
 

cjdb wrote:
> denik wrote:
> > §3.5.4 says that the hierarchical strings are separated by "/".
> > 
> > But more generally, are diagnostic names really fall under "hierarchical" 
> > definition?
> > Words between underscores should be treated as components. And $3.27.5 says 
> > that the leading components have to be the identifier of the rule.
> > In some cases they look like valid components, e.g. `err_access`, 
> > `err_access_dtor`, `err_access_dtor_exception`.
> > But in cases like `err_cannot_open_file` neither of the leading components 
> > exists.
> > 
> > Theoretically we could use groups as the leading component for warnings for 
> > example. For errors the leading components are probably not even necessary, 
> > since if I understood correctly they are needed to suppress subsets of 
> > violations on the SARIF consumer side.
> > Or we just could keep the names with underscores as is. WDYT?
> I think in light of what you've said, changing back to underscores is 
> probably best.
> But more generally, are diagnostic names really fall under "hierarchical" 
> definition?

I have the same concern, but this is okay for a first pass as a "flat 
hierarchy" :)

If we want a deeper structure, we'll need some extra metadata in 
`DiagnosticSemaKinds.td`'s `Error<...>`  to add cluster names as a follow up to 
this.

WDYT about something like `clang/visibility/err_access` or 
`clang/syntax/err_stmtexpr_file_scope`.

Alternatively: we could draw from the c++ standard structure: 
https://eel.is/c++draft/basic.def.odr#term.odr.use and say the error code for 
an ODR violation could be `clang/basic/def/odr`, again I'm unsure how well this 
meshes with clang's diagnostic model.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146654/new/

https://reviews.llvm.org/D146654

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D146654: [clang] replaces numeric SARIF ids with heirarchical names

2023-03-22 Thread Christopher Di Bella via Phabricator via cfe-commits
cjdb added inline comments.



Comment at: clang/lib/Frontend/SARIFDiagnostic.cpp:51-52
+  Diag->getDiags()->getDiagnosticIDs()->getStableName(Diag->getID()).str();
+  std::replace(StableName.begin(), StableName.end(), '_', '.');
+  SarifRule Rule = SarifRule::create().setRuleId(StableName);
 

denik wrote:
> §3.5.4 says that the hierarchical strings are separated by "/".
> 
> But more generally, are diagnostic names really fall under "hierarchical" 
> definition?
> Words between underscores should be treated as components. And $3.27.5 says 
> that the leading components have to be the identifier of the rule.
> In some cases they look like valid components, e.g. `err_access`, 
> `err_access_dtor`, `err_access_dtor_exception`.
> But in cases like `err_cannot_open_file` neither of the leading components 
> exists.
> 
> Theoretically we could use groups as the leading component for warnings for 
> example. For errors the leading components are probably not even necessary, 
> since if I understood correctly they are needed to suppress subsets of 
> violations on the SARIF consumer side.
> Or we just could keep the names with underscores as is. WDYT?
I think in light of what you've said, changing back to underscores is probably 
best.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146654/new/

https://reviews.llvm.org/D146654

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D146654: [clang] replaces numeric SARIF ids with heirarchical names

2023-03-22 Thread Denis Nikitin via Phabricator via cfe-commits
denik added inline comments.



Comment at: clang/lib/Basic/DiagnosticIDs.cpp:103
+// Stable names
+const char *const StaticDiagInfoStableNames[] = {
+#define DIAG(ENUM, CLASS, DEFAULT_SEVERITY, DESC, GROUP, SFINAE, NOWERROR, 
\

As of now it's 4600 strings, around 150k characters. Doesn't look bad but I 
think it should be mentioned in the change message because it affects all 
diagnostic.



Comment at: clang/lib/Frontend/SARIFDiagnostic.cpp:51-52
+  Diag->getDiags()->getDiagnosticIDs()->getStableName(Diag->getID()).str();
+  std::replace(StableName.begin(), StableName.end(), '_', '.');
+  SarifRule Rule = SarifRule::create().setRuleId(StableName);
 

§3.5.4 says that the hierarchical strings are separated by "/".

But more generally, are diagnostic names really fall under "hierarchical" 
definition?
Words between underscores should be treated as components. And $3.27.5 says 
that the leading components have to be the identifier of the rule.
In some cases they look like valid components, e.g. `err_access`, 
`err_access_dtor`, `err_access_dtor_exception`.
But in cases like `err_cannot_open_file` neither of the leading components 
exists.

Theoretically we could use groups as the leading component for warnings for 
example. For errors the leading components are probably not even necessary, 
since if I understood correctly they are needed to suppress subsets of 
violations on the SARIF consumer side.
Or we just could keep the names with underscores as is. WDYT?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146654/new/

https://reviews.llvm.org/D146654

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D146654: [clang] replaces numeric SARIF ids with heirarchical names

2023-03-22 Thread Christopher Di Bella via Phabricator via cfe-commits
cjdb updated this revision to Diff 507510.
cjdb added a comment.

I think this corrects everything.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146654/new/

https://reviews.llvm.org/D146654

Files:
  clang/include/clang/Basic/Diagnostic.h
  clang/include/clang/Basic/DiagnosticIDs.h
  clang/include/clang/Frontend/ASTUnit.h
  clang/lib/Basic/Diagnostic.cpp
  clang/lib/Basic/DiagnosticIDs.cpp
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/SARIFDiagnostic.cpp
  clang/test/Frontend/sarif-diagnostics.cpp

Index: clang/test/Frontend/sarif-diagnostics.cpp
===
--- clang/test/Frontend/sarif-diagnostics.cpp
+++ clang/test/Frontend/sarif-diagnostics.cpp
@@ -70,7 +70,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"'main' must return 'int'"
 // SARIF:   },
-// SARIF:   "ruleId":"3485",
+// SARIF:   "ruleId":"err.main.returns.nonint",
 // SARIF:   "ruleIndex":0
 // SARIF: },
 // SARIF: {
@@ -93,7 +93,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"use of undeclared identifier 'hello'"
 // SARIF:   },
-// SARIF:   "ruleId":"4632",
+// SARIF:   "ruleId":"err.undeclared.var.use",
 // SARIF:   "ruleIndex":1
 // SARIF: },
 // SARIF: {
@@ -116,7 +116,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"invalid digit 'a' in decimal constant"
 // SARIF:   },
-// SARIF:   "ruleId":"898",
+// SARIF:   "ruleId":"err.invalid.digit",
 // SARIF:   "ruleIndex":2
 // SARIF: },
 // SARIF: {
@@ -139,7 +139,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"misleading indentation; statement is not part of the previous 'if'"
 // SARIF:   },
-// SARIF:   "ruleId":"1826",
+// SARIF:   "ruleId":"warn.misleading.indentation",
 // SARIF:   "ruleIndex":3
 // SARIF: },
 // SARIF: {
@@ -162,7 +162,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"previous statement is here"
 // SARIF:   },
-// SARIF:   "ruleId":"1746",
+// SARIF:   "ruleId":"note.previous.statement",
 // SARIF:   "ruleIndex":4
 // SARIF: },
 // SARIF: {
@@ -185,7 +185,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"unused variable 'Yes'"
 // SARIF:   },
-// SARIF:   "ruleId":"6593",
+// SARIF:   "ruleId":"warn.unused.variable",
 // SARIF:   "ruleIndex":5
 // SARIF: },
 // SARIF: {
@@ -208,7 +208,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"use of undeclared identifier 'hi'"
 // SARIF:   },
-// SARIF:   "ruleId":"4632",
+// SARIF:   "ruleId":"err.undeclared.var.use",
 // SARIF:   "ruleIndex":6
 // SARIF: },
 // SARIF: {
@@ -231,7 +231,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"extraneous closing brace ('}')"
 // SARIF:   },
-// SARIF:   "ruleId":"1400",
+// SARIF:   "ruleId":"err.extraneous.closing.brace",
 // SARIF:   "ruleIndex":7
 // SARIF: },
 // SARIF: {
@@ -282,7 +282,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"invalid operands to binary expression ('t1' and 't1')"
 // SARIF:   },
-// SARIF:   "ruleId":"4567",
+// SARIF:   "ruleId":"err.typecheck.invalid.operands",
 // SARIF:   "ruleIndex":8
 // SARIF: }
 // SARIF:   ],
@@ -302,7 +302,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF:   "id":"3485",
+// SARIF:   "id":"err.main.returns.nonint",
 // SARIF:   "name":""
 // SARIF: },
 // SARIF: {
@@ -314,7 +314,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF:   "id":"4632",
+// SARIF:   "id":"err.undeclared.var.use",
 // SARIF:   "name":""
 // SARIF: },
 // SARIF: {
@@ -326,7 +326,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF:   "id":"898",
+// SARIF:   "id":"err.invalid.digit",
 // SARIF:   "name":""
 // SARIF: },
 // SARIF: {
@@ -338,7 +338,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF:   "id":"1826",
+// SARIF:   "id":"warn.misleading.indentation",
 // SARIF:   "name":""
 // SARIF: },
 // SARIF: {
@@ -350,7 +350,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:

[PATCH] D146654: [clang] replaces numeric SARIF ids with heirarchical names

2023-03-22 Thread Christopher Di Bella via Phabricator via cfe-commits
cjdb updated this revision to Diff 507503.
cjdb edited the summary of this revision.
cjdb added a comment.

rebasing continues


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146654/new/

https://reviews.llvm.org/D146654

Files:
  clang/include/clang/Basic/Diagnostic.h
  clang/include/clang/Basic/DiagnosticIDs.h
  clang/include/clang/Frontend/ASTUnit.h
  clang/lib/Basic/Diagnostic.cpp
  clang/lib/Basic/DiagnosticIDs.cpp
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/SARIFDiagnostic.cpp
  clang/test/Frontend/sarif-diagnostics.cpp

Index: clang/test/Frontend/sarif-diagnostics.cpp
===
--- clang/test/Frontend/sarif-diagnostics.cpp
+++ clang/test/Frontend/sarif-diagnostics.cpp
@@ -70,7 +70,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"'main' must return 'int'"
 // SARIF:   },
-// SARIF:   "ruleId":"3485",
+// SARIF:   "ruleId":"err.main.returns.nonint",
 // SARIF:   "ruleIndex":0
 // SARIF: },
 // SARIF: {
@@ -93,7 +93,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"use of undeclared identifier 'hello'"
 // SARIF:   },
-// SARIF:   "ruleId":"4632",
+// SARIF:   "ruleId":"err.undeclared.var.use",
 // SARIF:   "ruleIndex":1
 // SARIF: },
 // SARIF: {
@@ -116,7 +116,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"invalid digit 'a' in decimal constant"
 // SARIF:   },
-// SARIF:   "ruleId":"898",
+// SARIF:   "ruleId":"err.invalid.digit",
 // SARIF:   "ruleIndex":2
 // SARIF: },
 // SARIF: {
@@ -139,7 +139,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"misleading indentation; statement is not part of the previous 'if'"
 // SARIF:   },
-// SARIF:   "ruleId":"1826",
+// SARIF:   "ruleId":"warn.misleading.indentation",
 // SARIF:   "ruleIndex":3
 // SARIF: },
 // SARIF: {
@@ -162,7 +162,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"previous statement is here"
 // SARIF:   },
-// SARIF:   "ruleId":"1746",
+// SARIF:   "ruleId":"note.previous.statement",
 // SARIF:   "ruleIndex":4
 // SARIF: },
 // SARIF: {
@@ -185,7 +185,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"unused variable 'Yes'"
 // SARIF:   },
-// SARIF:   "ruleId":"6593",
+// SARIF:   "ruleId":"warn.unused.variable",
 // SARIF:   "ruleIndex":5
 // SARIF: },
 // SARIF: {
@@ -208,7 +208,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"use of undeclared identifier 'hi'"
 // SARIF:   },
-// SARIF:   "ruleId":"4632",
+// SARIF:   "ruleId":"err.undeclared.var.use",
 // SARIF:   "ruleIndex":6
 // SARIF: },
 // SARIF: {
@@ -231,7 +231,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"extraneous closing brace ('}')"
 // SARIF:   },
-// SARIF:   "ruleId":"1400",
+// SARIF:   "ruleId":"err.extraneous.closing.brace",
 // SARIF:   "ruleIndex":7
 // SARIF: },
 // SARIF: {
@@ -282,7 +282,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"invalid operands to binary expression ('t1' and 't1')"
 // SARIF:   },
-// SARIF:   "ruleId":"4567",
+// SARIF:   "ruleId":"err.typecheck.invalid.operands",
 // SARIF:   "ruleIndex":8
 // SARIF: }
 // SARIF:   ],
@@ -302,7 +302,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF:   "id":"3485",
+// SARIF:   "id":"err.main.returns.nonint",
 // SARIF:   "name":""
 // SARIF: },
 // SARIF: {
@@ -314,7 +314,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF:   "id":"4632",
+// SARIF:   "id":"err.undeclared.var.use",
 // SARIF:   "name":""
 // SARIF: },
 // SARIF: {
@@ -326,7 +326,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF:   "id":"898",
+// SARIF:   "id":"err.invalid.digit",
 // SARIF:   "name":""
 // SARIF: },
 // SARIF: {
@@ -338,7 +338,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF:   "id":"1826",
+// SARIF:   "id":"warn.misleading.indentation",
 // SARIF:   "name":""
 // SARIF: },
 // SARIF: {
@@ -350,7 +350,7 @@
 // SARIF:   "fullDescription":{
 // SARIF:  

[PATCH] D146654: [clang] replaces numeric SARIF ids with heirarchical names

2023-03-22 Thread Christopher Di Bella via Phabricator via cfe-commits
cjdb updated this revision to Diff 507494.
cjdb edited the summary of this revision.
cjdb added a comment.

swaps commits to see if that fixes CI


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146654/new/

https://reviews.llvm.org/D146654

Files:
  clang/include/clang/Basic/Diagnostic.h
  clang/include/clang/Basic/DiagnosticIDs.h
  clang/include/clang/Frontend/ASTUnit.h
  clang/lib/Basic/Diagnostic.cpp
  clang/lib/Basic/DiagnosticIDs.cpp
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/SARIFDiagnostic.cpp
  clang/test/Frontend/sarif-diagnostics.cpp

Index: clang/test/Frontend/sarif-diagnostics.cpp
===
--- clang/test/Frontend/sarif-diagnostics.cpp
+++ clang/test/Frontend/sarif-diagnostics.cpp
@@ -70,7 +70,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"'main' must return 'int'"
 // SARIF:   },
-// SARIF:   "ruleId":"3485",
+// SARIF:   "ruleId":"err.main.returns.nonint",
 // SARIF:   "ruleIndex":0
 // SARIF: },
 // SARIF: {
@@ -93,7 +93,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"use of undeclared identifier 'hello'"
 // SARIF:   },
-// SARIF:   "ruleId":"4632",
+// SARIF:   "ruleId":"err.undeclared.var.use",
 // SARIF:   "ruleIndex":1
 // SARIF: },
 // SARIF: {
@@ -116,7 +116,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"invalid digit 'a' in decimal constant"
 // SARIF:   },
-// SARIF:   "ruleId":"898",
+// SARIF:   "ruleId":"err.invalid.digit",
 // SARIF:   "ruleIndex":2
 // SARIF: },
 // SARIF: {
@@ -139,7 +139,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"misleading indentation; statement is not part of the previous 'if'"
 // SARIF:   },
-// SARIF:   "ruleId":"1826",
+// SARIF:   "ruleId":"warn.misleading.indentation",
 // SARIF:   "ruleIndex":3
 // SARIF: },
 // SARIF: {
@@ -162,7 +162,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"previous statement is here"
 // SARIF:   },
-// SARIF:   "ruleId":"1746",
+// SARIF:   "ruleId":"note.previous.statement",
 // SARIF:   "ruleIndex":4
 // SARIF: },
 // SARIF: {
@@ -185,7 +185,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"unused variable 'Yes'"
 // SARIF:   },
-// SARIF:   "ruleId":"6593",
+// SARIF:   "ruleId":"warn.unused.variable",
 // SARIF:   "ruleIndex":5
 // SARIF: },
 // SARIF: {
@@ -208,7 +208,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"use of undeclared identifier 'hi'"
 // SARIF:   },
-// SARIF:   "ruleId":"4632",
+// SARIF:   "ruleId":"err.undeclared.var.use",
 // SARIF:   "ruleIndex":6
 // SARIF: },
 // SARIF: {
@@ -231,7 +231,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"extraneous closing brace ('}')"
 // SARIF:   },
-// SARIF:   "ruleId":"1400",
+// SARIF:   "ruleId":"err.extraneous.closing.brace",
 // SARIF:   "ruleIndex":7
 // SARIF: },
 // SARIF: {
@@ -282,7 +282,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"invalid operands to binary expression ('t1' and 't1')"
 // SARIF:   },
-// SARIF:   "ruleId":"4567",
+// SARIF:   "ruleId":"err.typecheck.invalid.operands",
 // SARIF:   "ruleIndex":8
 // SARIF: }
 // SARIF:   ],
@@ -302,7 +302,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF:   "id":"3485",
+// SARIF:   "id":"err.main.returns.nonint",
 // SARIF:   "name":""
 // SARIF: },
 // SARIF: {
@@ -314,7 +314,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF:   "id":"4632",
+// SARIF:   "id":"err.undeclared.var.use",
 // SARIF:   "name":""
 // SARIF: },
 // SARIF: {
@@ -326,7 +326,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF:   "id":"898",
+// SARIF:   "id":"err.invalid.digit",
 // SARIF:   "name":""
 // SARIF: },
 // SARIF: {
@@ -338,7 +338,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF:   "id":"1826",
+// SARIF:   "id":"warn.misleading.indentation",
 // SARIF:   "name":""
 // SARIF: },
 // SARIF: {
@@ -350,7 +350,7 @@
 // SARIF:   "fullDescription":{
 // 

[PATCH] D146654: [clang] replaces numeric SARIF ids with heirarchical names

2023-03-22 Thread Christopher Di Bella via Phabricator via cfe-commits
cjdb created this revision.
cjdb added reviewers: aaron.ballman, vaibhav.y, denik.
Herald added a project: All.
cjdb requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Per §3.27.5 and §3.27.7, the `rule.id` and `ruleId` properties need to
be heirarchical names (this is only a requirement of `rule.id`, but they
need to match when both are present). We've been using the enum values
for these properties up until now, but that's neither conforming nor
stable, so we've changed it to a dotted version of the enum identifiers
(which are substantially more stable than their numeric values).

Fixes #61597.
Depends on D145201 .


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D146654

Files:
  clang/include/clang/Basic/Diagnostic.h
  clang/include/clang/Basic/DiagnosticIDs.h
  clang/include/clang/Frontend/ASTUnit.h
  clang/lib/Basic/Diagnostic.cpp
  clang/lib/Basic/DiagnosticIDs.cpp
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/SARIFDiagnostic.cpp
  clang/test/Frontend/sarif-diagnostics.cpp

Index: clang/test/Frontend/sarif-diagnostics.cpp
===
--- clang/test/Frontend/sarif-diagnostics.cpp
+++ clang/test/Frontend/sarif-diagnostics.cpp
@@ -83,7 +83,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"'main' must return 'int'"
 // SARIF:   },
-// SARIF:   "ruleId":"3486",
+// SARIF:   "ruleId":"err.main.returns.nonint",
 // SARIF:   "ruleIndex":0
 // SARIF: },
 // SARIF: {
@@ -106,7 +106,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"use of undeclared identifier 'hello'"
 // SARIF:   },
-// SARIF:   "ruleId":"4633",
+// SARIF:   "ruleId":"err.undeclared.var.use",
 // SARIF:   "ruleIndex":1
 // SARIF: },
 // SARIF: {
@@ -129,7 +129,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"invalid digit 'a' in decimal constant"
 // SARIF:   },
-// SARIF:   "ruleId":"898",
+// SARIF:   "ruleId":"err.invalid.digit",
 // SARIF:   "ruleIndex":2
 // SARIF: },
 // SARIF: {
@@ -152,7 +152,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"misleading indentation; statement is not part of the previous 'if'"
 // SARIF:   },
-// SARIF:   "ruleId":"1826",
+// SARIF:   "ruleId":"warn.misleading.indentation",
 // SARIF:   "ruleIndex":3
 // SARIF: },
 // SARIF: {
@@ -175,7 +175,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"previous statement is here"
 // SARIF:   },
-// SARIF:   "ruleId":"1746",
+// SARIF:   "ruleId":"note.previous.statement",
 // SARIF:   "ruleIndex":4
 // SARIF: },
 // SARIF: {
@@ -198,7 +198,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"unused variable 'Yes'"
 // SARIF:   },
-// SARIF:   "ruleId":"6595",
+// SARIF:   "ruleId":"warn.unused.variable",
 // SARIF:   "ruleIndex":5
 // SARIF: },
 // SARIF: {
@@ -221,7 +221,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"use of undeclared identifier 'hi'"
 // SARIF:   },
-// SARIF:   "ruleId":"4633",
+// SARIF:   "ruleId":"err.undeclared.var.use",
 // SARIF:   "ruleIndex":6
 // SARIF: },
 // SARIF: {
@@ -244,7 +244,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"extraneous closing brace ('}')"
 // SARIF:   },
-// SARIF:   "ruleId":"1400",
+// SARIF:   "ruleId":"err.extraneous.closing.brace",
 // SARIF:   "ruleIndex":7
 // SARIF: },
 // SARIF: {
@@ -295,7 +295,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"invalid operands to binary expression ('t1' and 't1')"
 // SARIF:   },
-// SARIF:   "ruleId":"4568",
+// SARIF:   "ruleId":"err.typecheck.invalid.operands",
 // SARIF:   "ruleIndex":8
 // SARIF: },
 // SARIF: {
@@ -318,7 +318,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"in file included from {{[^"]+/clang/test/Frontend/sarif-diagnostics.cpp:30:}}\n"
 // SARIF:   },
-// SARIF:   "ruleId":"-1",
+// SARIF:   "ruleId":"note.included.from",
 // SARIF:   "ruleIndex":9
 // SARIF: },
 // SARIF: {
@@ -341,7 +341,7 @@
 // SARIF:   "message":{
 // SARIF: "text":"unknown type name 'Test'"
 // SARIF:   },
-// SARIF:   "ruleId":"4658",
+// SARIF:   "ruleId":"err.unknown.typename",
 // SARIF:   "ruleIndex":10
 // SARIF: }
 // SARIF:   ],
@@ -361,7 +361,7 @@
 // SARIF:   "fullDescription":{
 // SARIF: "text":""
 // SARIF:   },
-// SARIF: