On Jun 15, 2012, at 3:59 PM, Dmitri Gribenko wrote:
> Hello,
>
> Here is a new version of the patch, which addresses all the comments,
> except for one: I did not implement the optimized per-module search
> for comments. Unfortunately, I did not found any documentation about
> how modules work in clang (and what are global, local IDs, submodules
> and how they differ from modules etc.)
Unfortunately, there is no documentation. I can help here.
> Changes from v1 include:
>
> * new library libComments, which will contain all related lexing and
> parsing code;
The dependency here is that the AST library now depends on the Comments
library. I think that's reasonable, because we want to have functions on the
AST that extract comments from the corresponding declarations.
> * class RawComment, which represents a single comment;
>
> * class RawCommentList, which is populated during parsing. It merges
> consecutive comments on the fly.
Makes sense.
> * implements a warning "not a Doxygen member comment", which should
> catch typos //< and /*< (instead of ///< and /**<). We had a few in
> our codebase, which I fixed manually: r158241, r158248.
Can you separate this warning into a subsequent patch? I'd rather get the
basics in place before we start introducing warnings.
> Questions:
> * should I split RawCommentList.h into RawComment.h and RawCommentList.h?
No, I think it's fine this way.
> * should I remove the LANGOPT ParseComments?
Yes, please. The overhead is low enough that we can afford to at least track
the comments in all compiles.
> Next steps:
> * Implementing lexer, parser and AST classes for the comment text.
> * Expose comment AST through libclang.
> * We will need semantic analysis as well, for example, to match \param
> names with actual function parameters.
Sounds great!
Some specific comments:
Index: include/clang/Basic/DiagnosticSemaKinds.td
===================================================================
--- include/clang/Basic/DiagnosticSemaKinds.td (revision 158558)
+++ include/clang/Basic/DiagnosticSemaKinds.td (working copy)
@@ -5633,5 +5633,10 @@
"definition of %0 must be imported before it is required">;
}
+let CategoryName = "Comment Issue" in {
+def warn_not_a_doxygen_after_member_comment : Warning<
+ "not a Doxygen member comment">, InGroup<DiagGroup<"doxygen">>;
+} // end of comment issue category
+
} // end of sema component.
Please define a warning group in DiagnosticGroups.td for Doxygen warnings; I
suspect we'll have a number of these.
Is this simply a "Doxygen Issue" or "Documentation Issue", rather than a
"Comment Issue"?
+class RawComment {
+public:
+ enum CommentKind {
+ CK_Invalid, ///< Invalid comment
+ CK_OrdinaryBCPL, ///< Any normal BCPL comments
+ CK_OrdinaryC, ///< Any normal C comment
+ CK_BCPLSlash, ///< \code /// stuff \endcode
+ CK_BCPLExcl, ///< \code //! stuff \endcode
+ CK_JavaDoc, ///< \code /** stuff */ \endcode
+ CK_Qt, ///< \code /*! stuff */ \endcode
+ CK_Merged ///< Two or more Doxygen comments merged together
+ };
+
Note that /*! is also used by HeaderDoc; could you add that to the comment so
we don't forget it?
+ SourceRange Range;
+ CommentKind Kind;
+ bool IsAfterMember;
+ bool IsAlmostAfterMemberComment;
+ StringRef RawText;
+
+ mutable unsigned BeginLine; ///< Cached line number
+ mutable unsigned EndLine; ///< Cached line number
+ mutable bool BeginLineValid; ///< True if BeginLine is valid
+ mutable bool EndLineValid; ///< True if EndLine is valid
Please pack the various integer/bool/enum data fields into bitfields, because
the size of RawComment does have an impact on the size of the AST.
+ /// \brief All comments in this translation unit.
+ RawCommentList Comments;
+ //DenseMap<unsigned, RawCommentList> Comments;
This commented-out DenseMap can go away?
+ /// \brief Cursors for comments blocks.
+ SmallVector<std::pair<llvm::BitstreamCursor,
+ serialization::ModuleFile *>, 8> CommentsCursors;
+
This belongs in the Serialization library; the AST itself shouldn't know
anything about cursors or the serialization format. I should call back via the
ExternalASTSource when it needs information. The "CommentsLoaded" bit is fine,
though; we do that often to optimize away extraneous calls into the external
AST source.
+ /// \brief Return the Doxygen-style comment attached to a given declaration,
+ /// without looking into cache.
+ const RawComment *getRawCommentForDeclNoCache(const Decl *D) const;
It's just a documentation comment, not specifically a Doxygen comment, right?
/**
+ * \brief Given a cursor that represents a declaration, return the associated
+ * comment's source range.
+ */
+CINDEX_LINKAGE CXSourceRange clang_Cursor_getCommentRange(CXCursor C);
+
Should this comment say something about this range covering multiple adjacent
comments?
+ // First check whether we have a comment that comes after a member decl.
+ if (Comment != RawComments.end() &&
+ Comment->isDoxygen() && Comment->isAfterMember() &&
+ !isa<TagDecl>(D) && !isa<NamespaceDecl>(D)) {
Why the checks against tags and namespaces? Should this also/instead be
checking isa<TagDecl>(D->getDeclContext()) ||
isa<ObjCContainerDecl>(D->getDeclContext())?
+ case COMMENTS_RAW_COMMENT: {
+ unsigned Idx = 0;
+ SourceRange SR = ReadSourceRange(F, Record, Idx);
+ bool Merged = Record[Idx++];
+ Context.addComment(RawComment(SourceMgr, SR, Merged));
+ break;
This is going to cause us to pull in the contents of each of the files where a
raw comment occurred, as soon as we load the raw comments from the AST file.
That's going to be rather expensive, I suspect. Alternatively, we could store
more in the AST file (including the Kind, IsAfterMember, and
IsAlmostAfterMemberComment values) and lazily compute RawText only when it's
actually requested... which should be very rare.
Also, I think you want to put all of the comments read from the AST file at the
beginning of the RawCommentList, because right now they could end up getting
mixed together with comments collecting while parsing the current translation
unit. This way, the ordering will be correct for precompiled headers. (Modules
are a separable issue)
This is looking great!
- Doug
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits