This function counts the real printable UTF-8 characters in a string. It currently relies on a hand-maintained table of character ranges that it checks manually. I believe it was taken from elsewhere a long time ago, before utf8proc became a required dependency.
I have a few reasons to rewrite it to use the library instead:

1. I'm pretty sure nobody will ever bother to update the table. utf8proc, on the other hand, bundles the data for the latest Unicode version that is supported on the current platform.

2. There is also a property that defines *display* width, which makes symbols like emoji wider than normal characters even in monospace fonts. (For context: I want to fix the alignment in places throughout our command line, such as the author names in 'svn list -v' that mess up the tables. This is where a function like that would be useful.)

3. It cleans up redundant code.

4. It might be slightly faster to use their dataset, because utf8proc only accesses a table in static memory twice (once for the address, then once to retrieve the properties) instead of binary searching and checking all the ranges. It might be slower, though; I don't know.

Thoughts?

-- Timofei Zhakov
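
P.S. To make reason 2 more concrete, here is a rough, dependency-free sketch of padding a column by display width rather than by character count. Everything in it is hypothetical: the names width_fn_t, toy_width, decode_utf8, display_width and print_padded are made up for illustration, and toy_width is a deliberately tiny stand-in, not real Unicode data. In the actual rewrite I would expect utf8proc_iterate() and utf8proc_charwidth() to replace the decoder and the toy table, but treat that as an assumption until someone checks it against the utf8proc headers we build with.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical callback: how many terminal columns a code point occupies.
   It is a parameter so that this sketch needs no external library. */
typedef int (*width_fn_t)(int32_t codepoint);

/* Toy stand-in for a real width database. NOT accurate Unicode data;
   it only covers a few ranges so the example has something to show. */
static int toy_width(int32_t cp)
{
  if (cp >= 0x0300 && cp <= 0x036F) return 0;    /* combining marks */
  if (cp >= 0x4E00 && cp <= 0x9FFF) return 2;    /* CJK ideographs */
  if (cp >= 0x1F300 && cp <= 0x1FAFF) return 2;  /* many emoji */
  return 1;
}

/* Decode one UTF-8 sequence starting at S into *CP and return the number
   of bytes consumed. Assumes the input is valid UTF-8 (no validation). */
static size_t decode_utf8(const unsigned char *s, int32_t *cp)
{
  if (s[0] < 0x80) {
    *cp = s[0];
    return 1;
  }
  if ((s[0] & 0xE0) == 0xC0) {
    *cp = ((s[0] & 0x1F) << 6) | (s[1] & 0x3F);
    return 2;
  }
  if ((s[0] & 0xF0) == 0xE0) {
    *cp = ((s[0] & 0x0F) << 12) | ((s[1] & 0x3F) << 6) | (s[2] & 0x3F);
    return 3;
  }
  *cp = ((s[0] & 0x07) << 18) | ((s[1] & 0x3F) << 12)
        | ((s[2] & 0x3F) << 6) | (s[3] & 0x3F);
  return 4;
}

/* Sum the column widths of all code points in the NUL-terminated,
   valid UTF-8 string STR, using WIDTH_OF for the per-code-point width. */
int display_width(const char *str, width_fn_t width_of)
{
  const unsigned char *p = (const unsigned char *)str;
  int columns = 0;

  while (*p) {
    int32_t cp;
    p += decode_utf8(p, &cp);
    columns += width_of(cp);
  }
  return columns;
}

/* Print STR left-aligned in a field of FIELD columns, padding with
   spaces according to display width instead of byte or character count. */
void print_padded(const char *str, int field, width_fn_t width_of)
{
  int pad = field - display_width(str, width_of);

  fputs(str, stdout);
  for (; pad > 0; pad--)
    putchar(' ');
}
```

With something like this, a column such as the author in 'svn list -v' would be printed via print_padded(author, column_width, width_of), so a name containing wide characters gets fewer padding spaces than an ASCII name of the same character count.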

