On 09/04/17 11:37, Kyle Sallee wrote:
> By the sort program
> when a file is sorted
> the lines which start with line feed
> output earlier than lines which begin with tab.
> 
> Tab ASCII value is 9.
> LF  ASCII value is 10.
> Tabs should be first?
> 
> However, to strings
> if the lines are converted
> then to mitigate a larger address space
> presumably with 0 the LF are replaced.
> Yet after the LF if the 0 byte was placed
> then the expected output would become.
> 
> If expected behavior becomes
> then historical behavior relied upon scripts might break.
> 
> The sort.c source code was not viewed.
> Therefore, a patch is not offered.
> Discussion is solicited.
> Concerning empty lines first.
> Is it a bug?
> Should it be fixed?
> 
> Because I am not on the email list;
> if the topic is worth discussion
> if a decision is made
> then please forward.
> Thanks for maintaining and sharing awesome software.

I think you're hitting locale issues: outside the C locale, sort compares
using the locale's collation rules rather than raw byte values.
Do you see the same issue when you set LC_ALL=C for sort?
For example:

$ printf '\nLF\0\tTAB\0' | sort -z | tr '\n\t\0' 'nt\n'
nLF
tTAB

$ printf '\nLF\0\tTAB\0' | LC_ALL=C sort -z | tr '\n\t\0' 'nt\n'
tTAB
nLF
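
To make that check repeatable regardless of the caller's environment, the
second command can be wrapped in a small shell function. This is only a
sketch: sort_z_bytes is a hypothetical name, not something shipped with
coreutils, and it assumes a sort that supports -z (as GNU sort does) and a
tr that understands the \0 escape.

```shell
# Hypothetical helper (not part of coreutils): sort NUL-terminated records
# read from stdin in plain byte order, then make control characters visible.
sort_z_bytes() {
  # LC_ALL=C forces byte-value comparison for this one sort invocation.
  # tr then maps LF -> 'n', TAB -> 't', and NUL -> newline for display.
  LC_ALL=C sort -z | tr '\n\t\0' 'nt\n'
}

# Same input as the example above: one record starting with LF, one with TAB.
printf '\nLF\0\tTAB\0' | sort_z_bytes
```

With byte-order comparison the TAB record (byte 9) should come out before
the LF record (byte 10), whatever LANG or LC_COLLATE is set to in the
calling shell.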

thanks,
Pádraig


