Changing LC_ALL from “C.UTF-8” to “C” cut the run time of my sort call
from 22s to 1.3s.  This is huge, and should not be ignored.

My setup: I call sort in a Dockerfile derived from the official
“python” base image, which sets LANG to “C.UTF-8”.  So my setup is
not exotic.

I don’t really understand why UTF-8 encoding has that much impact
on sorting performance, though there may well be a good reason.  In my
opinion, however, this should be mentioned in the documentation.
Something like “Note that anything but the plain C locale may have a
significant impact on sorting performance” should appear somewhere in
the man and info pages.
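
For anyone who wants to check this on their own system, here is a
minimal sketch of the comparison.  The generated input, the line count
and the helper name sort_seconds are made up for illustration; they are
not my real data, and the timings will differ from the ones above.

```shell
# Hypothetical test data: 200000 pseudo-random lines in a temp file.
input=$(mktemp)
awk 'BEGIN { srand(1); for (i = 0; i < 200000; i++) printf "%d line\n", rand() * 1000000 }' > "$input"

# sort_seconds LOCALE FILE
# Prints the whole seconds spent sorting FILE with LC_ALL set to LOCALE.
# Resolution is only one second (date +%s), so use a large input.
sort_seconds() {
    start=$(date +%s)
    LC_ALL="$1" sort "$2" > /dev/null
    end=$(date +%s)
    echo $((end - start))
}

for loc in C C.UTF-8; do
    echo "LC_ALL=$loc: $(sort_seconds "$loc" "$input")s"
done
```

The workaround itself is just the LC_ALL=C prefix on the sort call, which
should also work inside a Dockerfile RUN line.  Note that the C locale
sorts by byte value, so this is only appropriate where that order is
acceptable.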

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/846628

Title:
  gnu sort extremely slow in non C locale

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/coreutils/+bug/846628/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
