Dear bugs@openbsd.org

I have been an OpenBSD user for a long time and really appreciate all the effort of the community that has contributed to this magnificent distribution. Recently, I was parsing a few
IP addresses from the snort logs to populate a pf table and encountered
something counter-intuitive. Here is an example source list.

89.234.156.205
151.101.38.172
104.109.143.150
104.109.143.150
77.224.14.2
77.224.14.21
104.97.14.224
77.224.14.18
77.224.14.21
2.21.34.170
199.232.210.172
2.18.121.27
91.216.110.53
34.89.91.10

When you sort this list using '| sort -u', you end up with the
following list, as expected.

104.109.143.150
104.97.14.224
151.101.38.172
199.232.210.172
2.18.121.27
2.21.34.170
34.89.91.10
77.224.14.18
77.224.14.2
77.224.14.21
89.234.156.205
91.216.110.53

The weird thing is that when you filter the same source list using '|
sort -hu', you end up with this shorter list:

2.18.121.27
2.21.34.170
34.89.91.10
77.224.14.18
89.234.156.205
91.216.110.53
104.109.143.150
104.97.14.224
151.101.38.172
199.232.210.172

Notice that this list has 10 entries instead of 12: 77.224.14.2 and 77.224.14.21 are missing! Is this by design? My 'human' interpretation is that the missing addresses are still unique in the list and should be part of the result.
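
The difference can be reproduced with a short script. This is only a minimal sketch: the temporary file and the variable names are placeholders I chose for illustration, and LC_ALL=C is set as an assumption so that locale settings cannot influence how the dots are interpreted.

```shell
#!/bin/sh
# Reproduction sketch; "list", "plain" and "human" are illustrative names.
export LC_ALL=C                 # assumption: rule out locale-dependent parsing

list=$(mktemp) || exit 1        # temporary file holding the source list

cat > "$list" <<'EOF'
89.234.156.205
151.101.38.172
104.109.143.150
104.109.143.150
77.224.14.2
77.224.14.21
104.97.14.224
77.224.14.18
77.224.14.21
2.21.34.170
199.232.210.172
2.18.121.27
91.216.110.53
34.89.91.10
EOF

# Count the lines that survive each variant (tr strips wc's padding).
plain=$(sort -u "$list" | wc -l | tr -d ' ')
human=$(sort -hu "$list" | wc -l | tr -d ' ')

echo "sort -u : $plain lines"
echo "sort -hu: $human lines"

rm -f "$list"
```

With the lists shown above, the first count is 12 and the second is 10.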

I tried reading the source code at https://github.com/openbsd/src/blob/master/usr.bin/sort/sort.c to understand how it is done under the hood. However, all that wizardry with shifting positions using pointer operations is beyond my programming skills.

Is this the desired behavior or should it be corrected? In case it needs correction, could one of the more skilled programmers have a look at it? For context, I encountered this behavior on the current stable 7.6 release, using only the built-in tools such as awk, grep and sort.
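
In the meantime, a possible workaround for my use case seems to be sorting on each octet as a separate numeric key instead of using -h. This is a sketch under my own assumptions, not a statement of how -h is meant to behave: the helper name sort_ips is made up for illustration, and it assumes one plain dotted-quad IPv4 address per line.

```shell
#!/bin/sh
# sort_ips is a hypothetical helper name, used only for illustration.
# Assumes one dotted-quad IPv4 address per line on standard input.
sort_ips() {
    # -t .     : use the dot as the field separator
    # -k N,Nn  : compare field N only, as a number
    # -u       : keep one line per distinct combination of the four keys
    sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n -u
}

printf '%s\n' 77.224.14.2 77.224.14.21 77.224.14.18 77.224.14.21 2.21.34.170 | sort_ips
```

With that input, I would expect the duplicate 77.224.14.21 to be removed while all three distinct 77.224.14.x addresses are kept, in numeric order.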

Kind regards,

Van Dung Ha
