Hi Marco,

I understand that there needs to be a separate discussion on the strong dependency between mxnet and dmlc-core and how to fix it.
Having said that, I think the goals of dmlc-core and mxnet are somewhat aligned. Posting in the MXNet dev list for this case is a good way to gather feedback from both the communities since I consider the MXNet community to be mostly a superset of the dmlc-core community.

Anirudh

On Mon, Feb 26, 2018 at 5:00 PM, Subramanian, Anirudh <[email protected]> wrote:

> Hi Tianqi,
>
> The UTF-8 support would enable other formats like CSV more usable.
> Otherwise, they have to handle normalizing their data in some way before
> using mxnet.
> I understand that there is a tradeoff here because of the efficiency gains
> from the parser but the expectation of having to normalize their UTF-8
> files may turn users away.
>
> Anirudh
>
> On 2/26/18, 3:54 PM, "[email protected] on behalf of Tianqi Chen" <
> [email protected] on behalf of [email protected]> wrote:
>
>     Since LibSVM format is only going to involve numbers and possibly ascii
>     characters, is there any reason adding UTF-8 support? Note that
>     generalization always comes with cost of efficiency and there is some
>     effort spent on making parser fast
>
>     Tianqi
>
>     On Mon, Feb 26, 2018 at 3:38 PM, Anirudh <[email protected]> wrote:
>
>     > Hi all,
>     >
>     > Currently there is no UTF-8 Support for LibSVM, LibFM or CSV Text parsers.
>     > I am currently working on adding UTF-8 support for Text parsers. Since C++
>     > doesn't have a great built-in support for UTF-8, I am looking at
>     > third-party libraries which provide Unicode support. I am considering ICU
>     > currently. Any comments, suggestions, past experience, gotchas about
>     > unicode third party libraries or adding unicode support in general is
>     > highly appreciated.
>     >
>     > I have created an issue about the same:
>     > https://github.com/dmlc/dmlc-core/issues/372
>     > Please feel free to reply to this email or comment on the github issue if
>     > you have any inputs.
>     >
>     > Anirudh
