Zack Weinberg added the comment:

FWIW, the actual behavior of \w, which is to match "everything in Unicode
general categories L* and N*, plus U+005F (underscore)", is consistent across
all versions I can conveniently test (2.7, 3.4, 3.5).

In 2.7, there are four characters in general category Nl that \w doesn't match,
but I believe that is just a bug, not an intentional difference in behavior.
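
For anyone who wants to repeat this kind of check, here is a minimal sketch for
Python 3. The helper names (expected_word_char, mismatches) are made up for
illustration and are not part of any library. The result depends on the Unicode
database shipped with the interpreter, so no particular output is claimed for
the full range.

```python
import re
import sys
import unicodedata

# A str pattern uses Unicode matching by default in Python 3.
WORD = re.compile(r"\w")


def expected_word_char(ch):
    """True if ch is U+005F or its general category starts with L or N."""
    return ch == "_" or unicodedata.category(ch)[0] in "LN"


def mismatches(code_points):
    """Return the characters where the regex and the category rule disagree."""
    out = []
    for cp in code_points:
        ch = chr(cp)
        if bool(WORD.match(ch)) != expected_word_char(ch):
            out.append(ch)
    return out


if __name__ == "__main__":
    # Scan every code point and report any disagreement by code point and category.
    for ch in mismatches(range(sys.maxunicode + 1)):
        print("U+%04X %s" % (ord(ch), unicodedata.category(ch)))
```

Running the script prints one line per code point where the two rules disagree
on the interpreter in use. Restricting the scan to range(0x80) covers only
ASCII, where the two rules agree.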


