kj wrote:
> sort behaves erratically with underscores:
> 
>   % ( echo _c; echo __; echo _a ) | sort 
>   __
>   _a
>   _c

Sort uses your current locale settings (LC_ALL, LC_COLLATE, or LANG,
in that order of precedence) to determine the character collation
sequence.  You probably have LANG set to a locale that uses
dictionary sort order.  In dictionary sort order, case is folded and
punctuation is ignored.

See this FAQ entry for more information:

  http://www.gnu.org/software/coreutils/faq/#Sort-does-not-sort-in-normal-order_0021

> How can I get sort to treat _ consistently?  (I don't have a strong
> preference for either _ < a or a < _ as long as it is consistent.)

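You can solve this by using the standard "C" locale (also available
under the alias "POSIX") instead of a dictionary sort order locale.
In the "C" locale, sort compares raw byte values, so _ is always
treated the same way.

  LC_ALL=C sort

LANG=C sort also works, but only if neither LC_ALL nor LC_COLLATE is
set in your environment, since both take precedence over LANG.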
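
As a quick check of what byte order looks like, here is a minimal
sketch.  The sample strings are made up for illustration and are not
from your report.  It uses LC_ALL rather than LANG because LC_ALL
overrides every other locale variable, so the result does not depend
on what else is set in the environment.

```shell
# Sort hypothetical sample strings by raw byte value.
# In ASCII, uppercase letters (0x41-0x5A) come before _ (0x5F),
# which comes before lowercase letters (0x61-0x7A).
sorted=$(printf '%s\n' _c a_b __ _a B ab | LC_ALL=C sort)
printf '%s\n' "$sorted"
# Prints, one per line: B  __  _a  _c  a_b  ab
```

Note that _ lands between the uppercase and lowercase letters, which
may or may not be what you want, but it is always the same.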

Personally I set the following in my own environment to get UTF-8 but
force a standard sort ordering regardless.

  export LANG=en_US.UTF-8
  export LC_COLLATE=C
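
To confirm that the collation setting is actually in effect, something
like the following sketch should do.  It assumes the en_US.UTF-8
locale is installed, and the sample input is made up.  The unset is
there because LC_ALL, if present, overrides LC_COLLATE.

```shell
# Persistent settings, as above.
export LANG=en_US.UTF-8   # assumption: this locale exists on your system
export LC_COLLATE=C
# LC_ALL takes precedence over LC_COLLATE, so make sure it is not set.
unset LC_ALL

# Hypothetical sample input; with LC_COLLATE=C it sorts in byte order.
result=$(printf '%s\n' ab B a_b | sort)
printf '%s\n' "$result"
# Prints, one per line: B  a_b  ab
```

If "ab" sorts ahead of "a_b", a dictionary order locale is still being
used for collation.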

Bob

