I do, in Arraymancer: [https://github.com/mratsim/Arraymancer](https://github.com/mratsim/Arraymancer)
For NLP there are some wrappers at [https://github.com/Nim-NLP](https://github.com/Nim-NLP), with a focus on Chinese-language NLP. For FSM-based NLP, I've come across [BlingFire](https://github.com/microsoft/BlingFire) from Microsoft Research, but I'd guess the most flexible tokenizer is [sentencepiece](https://github.com/google/sentencepiece) by Google: it trains unsupervised and assumes nothing about the language (not even whitespace segmentation); you can just give it raw text to read.
