[jira] [Commented] (ARROW-8961) [C++] Vendor utf8proc library

2020-06-08 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128462#comment-17128462 ] Wes McKinney commented on ARROW-8961: - unilib's license (MPL 2.0) isn't ideal, see

[jira] [Commented] (ARROW-8961) [C++] Vendor utf8proc library

2020-06-08 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128420#comment-17128420 ] Antoine Pitrou commented on ARROW-8961: --- I've compiled both libraries: * {{utf8proc}} weighs

[jira] [Commented] (ARROW-8961) [C++] Vendor utf8proc library

2020-06-08 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128386#comment-17128386 ] Antoine Pitrou commented on ARROW-8961: --- Also, {{unilib}} uses a similar lookup scheme, so it's

[jira] [Commented] (ARROW-8961) [C++] Vendor utf8proc library

2020-06-08 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128379#comment-17128379 ] Antoine Pitrou commented on ARROW-8961: --- What algorithms would we use in {{utf8proc}}? If it's

[jira] [Commented] (ARROW-8961) [C++] Vendor utf8proc library

2020-06-07 Thread Uwe Korn (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17127531#comment-17127531 ] Uwe Korn commented on ARROW-8961: - We should definitely run benchmarks as in the utf8proc issue tracker

[jira] [Commented] (ARROW-8961) [C++] Vendor utf8proc library

2020-05-28 Thread Maarten Breddels (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118713#comment-17118713 ] Maarten Breddels commented on ARROW-8961: - FWIW, in Vaex I've relied on

[jira] [Commented] (ARROW-8961) [C++] Vendor utf8proc library

2020-05-27 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118061#comment-17118061 ] Wes McKinney commented on ARROW-8961: - Ah great. I see that utf8proc includes a 1.5 MB data file, so

[jira] [Commented] (ARROW-8961) [C++] Vendor utf8proc library

2020-05-27 Thread Uwe Korn (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117942#comment-17117942 ] Uwe Korn commented on ARROW-8961: - It's already there, named {{libutf8proc}}. > [C++] Vendor utf8proc

[jira] [Commented] (ARROW-8961) [C++] Vendor utf8proc library

2020-05-27 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117932#comment-17117932 ] Wes McKinney commented on ARROW-8961: - [~uwe] I would say it would be worth going ahead and adding

[jira] [Commented] (ARROW-8961) [C++] Vendor utf8proc library

2020-05-27 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117907#comment-17117907 ] Antoine Pitrou commented on ARROW-8961: --- I'll take a look sometime if you don't beat me to it. >

[jira] [Commented] (ARROW-8961) [C++] Vendor utf8proc library

2020-05-27 Thread Uwe Korn (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117446#comment-17117446 ] Uwe Korn commented on ARROW-8961: - For conda-forge and other distributions that can handle binary