Issue 76472
Summary [libclang] `annotateTokens()` produces different cursor than `visitChildren()`
Labels new issue
Assignees
Reporter jimmy-zx
While testing the `annotateTokens()` function (used by `Token.cursor` in the Python bindings), I found that for some cursors, the (only) token that belongs to the cursor does not map back to the cursor itself.

For example, on the following code,
```c
struct a {
    int b;
};

int func(struct a *ptr) {
    int r = ptr->b;
    return r;
}
```

I wrote a script that selects the `DeclRefExpr` that refers to `ptr` in the statement `int r = ptr->b`, and checks whether the cursor of the only token that belongs to that expression, `ptr`, maps back to the cursor.

```python
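from clang.cindex import TranslationUnit, Cursor, CursorKind


def main():
    tu = TranslationUnit.from_source("./demo.c")
    root: Cursor = tu.cursor

    # Find the first DeclRefExpr that refers to `ptr` (the one in `int r = ptr->b`).
    node = None
    for node in root.walk_preorder():
        if node.kind == CursorKind.DECL_REF_EXPR and node.spelling == "ptr":
            break

    # Take the first (and only) token that belongs to that expression.
    token = None
    for token in node.get_tokens():
        break

    # Expected to print True: the token should map back to the same cursor.
    print(token.cursor == node)

    # Dump the raw CXCursor fields of both cursors for comparison.
    print(token.cursor._kind_id, node._kind_id)
    print(token.cursor.xdata, node.xdata)
    print(*token.cursor.data)
    print(*node.data)


if __name__ == '__main__':
    main()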
```

The result of the above script is
```
False
101 101
0 0
140162768666120 140162768666224 140162768050240
None 140162768666224 140162768050240
```

The cursors `node` and `token.cursor` should be the same, and they indeed share the same spelling and extent. However, `libclang` considers them to be different cursors.

Cursor equality is provided by `clang_equalCursors()`, and the only difference between these two cursors is `data[0]`.

https://github.com/llvm/llvm-project/blob/1c1eaf75f5f2efd72ba813b29b3d7b556d61b70b/clang/tools/libclang/CIndex.cpp#L6289-L6303
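To make the field-by-field difference explicit, here is a small sketch that models the two cursors with the values printed above and lists which fields differ. `CursorFields` and `differing_fields` are hypothetical names used only for illustration; they are not part of `clang.cindex`, and the comparison is a simplified stand-in, not the actual `clang_equalCursors()` logic.

```python
from typing import List, NamedTuple, Optional, Tuple


class CursorFields(NamedTuple):
    """Hypothetical plain-data stand-in for the raw fields of a CXCursor."""
    kind_id: int
    xdata: int
    data: Tuple[Optional[int], Optional[int], Optional[int]]


# Values copied from the script output above.
token_cursor = CursorFields(
    101, 0, (140162768666120, 140162768666224, 140162768050240))
node_cursor = CursorFields(
    101, 0, (None, 140162768666224, 140162768050240))


def differing_fields(a: CursorFields, b: CursorFields) -> List[str]:
    """Return the names of the fields whose values differ between a and b."""
    diffs = []
    if a.kind_id != b.kind_id:
        diffs.append("kind")
    if a.xdata != b.xdata:
        diffs.append("xdata")
    for i, (x, y) in enumerate(zip(a.data, b.data)):
        if x != y:
            diffs.append(f"data[{i}]")
    return diffs


print(differing_fields(token_cursor, node_cursor))
```

With the values from this report, only `data[0]` is listed: the cursor obtained from the token carries a pointer there, while the cursor from the AST walk has `None`.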

I suspect that `DeclRefExpr` cursors are created in `MakeCXCursor()`, and that `data[0]` probably refers to the parent cursor.

https://github.com/llvm/llvm-project/blob/1c1eaf75f5f2efd72ba813b29b3d7b556d61b70b/clang/tools/libclang/CXCursor.cpp#L570-L583
https://github.com/llvm/llvm-project/blob/1c1eaf75f5f2efd72ba813b29b3d7b556d61b70b/clang/tools/libclang/CXCursor.cpp#L876-L878

There might be an issue where the `data[0]` (parent) field is not being set properly, or perhaps `clang_equalCursors()` should ignore `data[0]` when comparing statements?
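Until this is resolved, a possible workaround on the Python side is to compare cursors by kind, spelling, and extent instead of using `==`. The sketch below is only that, a workaround under assumptions: `location_key` and `same_source_node` are hypothetical helpers, they rely only on the `kind`, `spelling`, `extent`, `file`, and `offset` attributes of the Python bindings, and two genuinely distinct AST nodes could in principle share all three properties.

```python
def location_key(loc):
    """Reduce a source location to a comparable (file name, offset) pair."""
    file_name = loc.file.name if loc.file is not None else None
    return (file_name, loc.offset)


def same_source_node(a, b):
    """Heuristic cursor comparison that ignores the raw CXCursor data fields.

    Returns True when both cursors have the same kind and spelling and
    cover exactly the same source range.
    """
    return (
        a.kind == b.kind
        and a.spelling == b.spelling
        and location_key(a.extent.start) == location_key(b.extent.start)
        and location_key(a.extent.end) == location_key(b.extent.end)
    )
```

In the script above, `same_source_node(token.cursor, node)` would be used in place of `token.cursor == node`.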
