On 29/11/15 06:34, Assaf Gordon wrote:

>   3. "wc -L" counts "screen display width" (while expanding tabs),
>      not characters.
> 
>       $ printf "ab\txyz\n" | wc -L
>       11
>       $ printf "abc\txyz\n" | wc -L
>       11
>       $ printf "abcd\txyz\n" | wc -L
>       11
> 
>   4. "wc -L" counts only valid, printable characters, including unicode.
> 
>       # valid UTF-8 sequence counted as one character:
>       $ printf "\xe2\x99\xa5" | wc -L
>       1
> 
>       # invalid UTF-8 sequence not counted:
>       $ printf "\xe2\xf2\xa5" | wc -L
>       0
> 
>       # unprintable characters (in C locale) are not counted:
>       $ printf "\xe2\x99\xa5" | LC_ALL=C wc -L
>       0
> 
>       # To count bytes, use sed:
>       $ printf "\xe2\x99\xa5" | LC_ALL=C sed 's/././g' | wc -L
>       3


Actually, you're right: we should call some of the above out as examples.
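
For the tab case in item 3, something along these lines might work as an
illustration. This is only a sketch assuming GNU wc: tab_width is a made-up
helper name, and the 8-column tab stops are inferred from the outputs quoted
above rather than stated anywhere in this thread.

```shell
# Hypothetical helper: width that wc -L reports for "<prefix><TAB>xyz".
tab_width() { printf '%s\txyz\n' "$1" | LC_ALL=C wc -L; }

# The tab advances to the next multiple of 8, then "xyz" adds 3 columns.
for prefix in ab abc abcd abcdefgh; do
  printf '%-8s -> %s\n' "$prefix" "$(tab_width "$prefix")"
done
# With GNU wc this reports 11, 11, 11 and 19 respectively.
```

The first three prefixes all end before column 8, so the tab pads each of them
to the same stop. The 8-character prefix already sits on a stop, so the tab
jumps to column 16.
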
We should also mention that wc doesn't process terminal control chars specially:

$ printf '\x1b[33mf\bred\x1b[m\n' | wc -L
10
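
A rough breakdown of where the 10 comes from, assuming GNU wc. The names
sample and esc are made up for this sketch, and \033 is just the octal
spelling of \x1b, for printf implementations without \x support.

```shell
sample() { printf '\033[33mf\bred\033[m\n'; }

# ESC and backspace are not printable, so they add nothing to the width;
# what remains is "[33m" (4) + "f" (1) + "red" (3) + "[m" (2) = 10.
raw_width=$(sample | LC_ALL=C wc -L)

# Stripping the SGR colour sequences first leaves "f\bred".
# The backspace is still not interpreted, so this gives 4,
# not the 3 columns a terminal would actually show.
esc=$(printf '\033')
stripped_width=$(sample | sed "s/${esc}\[[0-9;]*m//g" | LC_ALL=C wc -L)

echo "raw=$raw_width stripped=$stripped_width"
```

So even with the colour codes filtered out, the result can differ from what
ends up on screen.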

cheers,
Pádraig
