stevengj edited a comment on pull request #7656: URL: https://github.com/apache/arrow/pull/7656#issuecomment-655253087
[U+08BE](https://www.fileformat.info/info/unicode/char/08be/index.htm) was defined in Unicode 13, and category Lo is correct. It sounds like you may be looking at obsolete Unicode tables?

> utf8proc doesn't store and expose the information if a codepoint is of a Numeric type

Can't you use the Unicode category (N*) for this? That's [what Julia does](https://github.com/JuliaLang/julia/blob/master/base/strings/unicode.jl#L405).
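
For illustration, here is a minimal sketch of the category-based check suggested above. It uses Python's standard `unicodedata` module as a stand-in for utf8proc, and the helper name `is_numeric_by_category` is hypothetical. It is not the Arrow or Julia implementation, only the same idea: treat a code point as numeric when its general category is Nd, Nl, or No.

```python
import unicodedata


def is_numeric_by_category(ch: str) -> bool:
    """Return True if the general category of `ch` is Nd, Nl, or No.

    Hypothetical helper: it approximates "numeric" by general category only
    and does not consult the separate Unicode Numeric_Type property.
    """
    return unicodedata.category(ch).startswith("N")


# Print code point, general category, and the result of the check.
# The categories reported depend on the Unicode version bundled with
# the Python interpreter in use.
for ch in ["5", "\u2167", "\u00bd", "a", "\u4e09"]:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {is_numeric_by_category(ch)}")
```

One caveat with this approach: the general category and the Numeric_Type property are not the same thing. A character such as U+4E09 (三) has category Lo, so the check above rejects it, even though Python's `str.isnumeric()` accepts it. Whether that difference matters depends on which semantics the caller needs.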