I'll start from afar, so that it is easier to follow what I am thinking about.
CFFI uses pycparser, which parses C files, but (!) relies on a C compiler to strip comments from the C files and to process defines. Almost all .c files contain comments, so pycparser is basically useless as a standalone parser, though maybe it has a good API for working with the AST.

Anyway, I tried to see if I could teach pycparser to strip comments itself, and in c_lexer.py I found the list of tokens, among which there was no token representing the start of a comment. Stripped list:

    ##
    ## All the tokens recognized by the lexer
    ##
    tokens = keywords + (
        # Identifiers
        'ID',

        # Type identifiers (identifiers previously defined as
        # types with typedef)
        'TYPEID',

        # constants
        'INT_CONST_DEC', 'INT_CONST_OCT', 'INT_CONST_HEX',
        'FLOAT_CONST', 'HEX_FLOAT_CONST',
        'CHAR_CONST',
        'WCHAR_CONST',
        ...

So I thought that I need to add names for the tokens corresponding to the comment start (// and /*) and the comment end (*/), and that it would be better if those names were reasonably common among parsers, so that people looking at a token could immediately recognize it as comment-related. Apparently, free-form naming like this is a little too ambiguous for automated processing.

Editors like Spyder could also benefit from information about tokens and their meaning in different programming languages. The processing of comment text that can be caught from the parsing stream is the same for any language and could be IDE-independent. Right now you can't just reuse language definitions (such as ASDL) and feed them to the IDE so that it can automatically figure out which parts of the text it can attach its functions to.

I read that ontologies are a way to express relations between objects in this automatic way, as triples. Like:

    COMMENTSTART is a TOKEN
    COMMENTSTART starts a COMMENT

And I wonder, has anybody tried to apply this ontology stuff to designing and analysing computer languages? If yes, maybe there are some databases with such information about parsers. I would like to query the names of all tokens that represent a program comment.

--
anatoly t.
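
P.S. To make the triple idea a bit more concrete, here is a minimal sketch in plain Python of the kind of query I have in mind. Everything in it is an assumption for illustration: the token names COMMENTSTART, COMMENTEND and LINECOMMENTSTART do not exist in pycparser, the predicates (is_a, starts, ends, represents) are made up, and query() and comment_tokens() are hypothetical helpers, not part of any existing ontology tool.

```python
# Hypothetical token/relation names for illustration only; none of the
# comment tokens below exist in pycparser's c_lexer.py.
TRIPLES = [
    ("COMMENTSTART", "is_a", "TOKEN"),
    ("COMMENTSTART", "starts", "COMMENT"),
    ("COMMENTEND", "is_a", "TOKEN"),
    ("COMMENTEND", "ends", "COMMENT"),
    ("LINECOMMENTSTART", "is_a", "TOKEN"),
    ("LINECOMMENTSTART", "starts", "COMMENT"),
    # INT_CONST_DEC is a real pycparser token name; the "represents"
    # relation attached to it is an assumption.
    ("INT_CONST_DEC", "is_a", "TOKEN"),
    ("INT_CONST_DEC", "represents", "CONSTANT"),
]


def query(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def comment_tokens(triples):
    """Names of everything that is a TOKEN and has some relation to COMMENT."""
    tokens = {s for (s, _, _) in query(triples, predicate="is_a", obj="TOKEN")}
    related = {s for (s, _, _) in query(triples, obj="COMMENT")}
    return sorted(tokens & related)


print(comment_tokens(TRIPLES))
# ['COMMENTEND', 'COMMENTSTART', 'LINECOMMENTSTART']
```

If several parsers published their token lists in a form like this, an editor could ask one question ("which tokens relate to COMMENT?") instead of hard-coding the names used by each lexer.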
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev