Yup, this does depend on the locale. In my original example, I had
LANG=en_US.UTF-8. Setting it to C.UTF-8 gets me the right result:
> $ LANG=C.UTF-8 uniq -c x
> 1 "ⁿᵘˡˡ"
> 1 "ܥܝܪܐܩ"
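For anyone who wants to reproduce this, here is a minimal sketch. It assumes a POSIX shell with mktemp and a uniq that honors the locale environment variables. I use LC_ALL=C rather than LANG because LC_ALL overrides LANG and any individual LC_* setting, and because the plain C locale should exist on every system. The names tmp, c_out and c_groups are just illustrative.

```shell
# Write the two lines from the original report to a scratch file.
tmp=$(mktemp)
printf '%s\n' '"ⁿᵘˡˡ"' '"ܥܝܪܐܩ"' > "$tmp"

# Force byte-wise comparison by overriding every locale category.
# LC_ALL takes precedence over LANG and the individual LC_* variables.
c_out=$(LC_ALL=C uniq -c "$tmp")
printf '%s\n' "$c_out"

# Count the groups uniq reported; two distinct lines should give two.
c_groups=$(printf '%s\n' "$c_out" | wc -l | tr -d ' ')
rm -f "$tmp"
```

Running the same uniq -c without the LC_ALL override shows whatever the ambient locale does with these two lines, so the two outputs can be compared side by side.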
But that doesn't fully explain what's going on. I find it difficult to
believe that there's
On 2019-12-16 20:43, TJ Luoma wrote:
> AHA! Ok, now I understand a little better. I have seen the difference
> between "size" and "size on disk" and did not realize that applied
> here.
Thanks for confirming.
> I'm still not 100% clear on _why_ two "identical" files would have
> different
TJ Luoma wrote:
> AHA! Ok, now I understand a little better. I have seen the difference
> between "size" and "size on disk" and did not realize that applied
> here.
>
> I'm still not 100% clear on _why_ two "identical" files would have
> different results for "size on disk" (it _seems_ like those
AHA! Ok, now I understand a little better. I have seen the difference
between "size" and "size on disk" and did not realize that applied
here.
I'm still not 100% clear on _why_ two "identical" files would have
different results for "size on disk" (it _seems_ like those should be
identical) but I
On 12/15/19 11:40 AM, Roy Smith wrote:
> With the following input:
>
>> $ cat x
>> "ⁿᵘˡˡ"
>> "ܥܝܪܐܩ"
>
>
> Running "uniq -c" says there's two copies of the same line!
>
>> $ uniq -c x
>> 2 "ⁿᵘˡˡ"
Thanks for the bug report. I expect this is because GNU 'uniq' uses the
equivalent of