Hi all,

Currently there is no UTF-8 support in the LibSVM, LibFM, or CSV text parsers, and I am working on adding it. Since C++ does not have great built-in support for UTF-8, I am looking at third-party libraries that provide Unicode support, and I am currently considering ICU. Any comments, suggestions, past experience, or gotchas regarding third-party Unicode libraries, or adding Unicode support in general, would be highly appreciated.

I have created an issue for this: https://github.com/dmlc/dmlc-core/issues/372

Please feel free to reply to this email or comment on the GitHub issue if you have any input.

Anirudh
