tag 29396 notabug
close 29396
thanks

(based on reproducible example provided privately)

Hello,

On 2017-11-22 09:48 AM, Assaf Gordon wrote:
On 2017-11-22 07:15 AM, Saint Michael wrote:
I have two files with phone numbers, one column, sorted (they pass the test
sort -c). One is large and the other one is small. The comm -12
--check-order file1.csv file2.csv fails to find matches, but another
utility, join file1.csv file2.csv, does find a lot of matches.

This is not a bug in comm, but simply incorrect usage.

The file "file2.csv" (provided privately) contained a space character
after each number.

"comm" compares entire lines, and spaces do matter.
"join" compares fields, so trailing spaces after a field do not matter.

A simple reproducer:

    $ seq 5 > a
    $ echo "4 " > b

    $ join a b
    4

    $ comm -12 a b
    [ ... no output ... ]
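
If it helps, here is a rough sketch of one way to check a file for
trailing blanks before blaming comm. It reuses the file "b" from the
reproducer above in a scratch directory; the variable names (tmpdir,
trailing) are only illustrative, and this is just one possible check,
not the only one.

```shell
# Recreate the two files from the reproducer in a scratch directory.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
seq 5 > a
echo "4 " > b

# Count the lines in b that end in a space or a tab. A non-zero count
# means comm will see those lines as different from "clean" lines.
trailing=$(grep -c '[[:blank:]]$' b)
echo "lines with trailing blanks in b: $trailing"

# Print each line between brackets so a trailing blank becomes visible.
sed 's/.*/[&]/' b
```

On the reproducer this should report one line with trailing blanks and
print "[4 ]", which makes the extra space easy to spot.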


To remove the trailing spaces from the file, try:

   $ sed 's/  *$//' file2.csv > file2-no-space.csv

   $ comm -12 file1.csv file2-no-space.csv  | wc -l
   864

   $ join file1.csv file2.csv | wc -l
   864
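
The same fix can be tried on the small reproducer. This is only a
sketch: it assumes the files "a" and "b" are built as in the reproducer,
the output name "b-no-space" is made up for illustration, and the
[[:blank:]] class is used so that trailing tabs are stripped as well as
spaces.

```shell
# Recreate the reproducer files in a scratch directory.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
seq 5 > a
echo "4 " > b

# Strip any run of trailing spaces or tabs from each line of b.
sed 's/[[:blank:]]*$//' b > b-no-space

# With the trailing blank gone, comm sees identical lines in both files.
common=$(comm -12 a b-no-space)
echo "$common"
```

With the trailing blank removed, comm -12 should now report the line
"4" as common to both files, matching what join printed.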

regards,
 - assaf
