Hi. I found a rather strange bug in diff (Debian version 1:3.0-1) while doing large backups of millions of files. First I copied the data to an external HDD, then I ran "diff -q -r >out 2> err" to find any errors.

out has roughly the following content:

Files /data/......../WeiᳮJPEG and /mnt/backups/2010-08-06/1/......../WeiᲮJPEG differ
Files /data/......../WeiᲮJPEG and /mnt/backups/2010-08-06/1/......../WeiᳮJPEG differ

The strange Unicode characters were shown with their character codes inside, which read as follows (in the same order as above):

1CEE 1CAE
1CAE 1CEE

stat on these files gave:

$ stat Wei*
  File: `Wei\341\262\256JPEG'
  Size: 20927      Blocks: 48         IO Block: 4096   regular file
Device: 821h/2081d Inode: 4096614     Links: 1
Access: (0660/-rw-rw----)  Uid: ( 1000/calestyo)   Gid: ( 1000/calestyo)
Access: 2009-11-30 19:29:18.000000000 +0100
Modify: 2001-07-18 16:40:08.000000000 +0200
Change: 2007-07-19 19:12:28.000000000 +0200
  File: `Wei\341\263\256JPEG'
  Size: 15524      Blocks: 32         IO Block: 4096   regular file
Device: 821h/2081d Inode: 4096615     Links: 1
Access: (0660/-rw-rw----)  Uid: ( 1000/calestyo)   Gid: ( 1000/calestyo)
Access: 2009-11-30 19:29:18.000000000 +0100
Modify: 2001-07-18 16:40:08.000000000 +0200
Change: 2007-07-19 19:12:28.000000000 +0200

There was even a third file with a similar name:

  File: `WeiᮊPEG'
  Size: 18843      Blocks: 40         IO Block: 4096   regular file
Device: 821h/2081d Inode: 4096613     Links: 1
Access: (0660/-rw-rw----)  Uid: ( 1000/calestyo)   Gid: ( 1000/calestyo)
Access: 2009-11-30 19:29:18.000000000 +0100
Modify: 2001-07-18 16:40:08.000000000 +0200
Change: 2007-07-19 19:12:28.000000000 +0200

but it seems that this one caused no problems.

When I compared those files manually with diff (pairing them correctly, of course), no difference was found.

Any ideas?

Cheers,
Chris.

btw: I'm not subscribed, so please CC me.
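
For reference, the octal escapes in the stat output can be mapped back to the character codes that appear in the diff output. The following is only a minimal Python sketch: the helper name decode_octal_escapes is made up for illustration, and it assumes the file names are UTF-8 encoded. It suggests that the two names differ only in the second byte of the three-byte sequence (\262 versus \263).

```python
import re


def decode_octal_escapes(name: str) -> str:
    """Turn stat-style \\ooo octal escapes into bytes, then decode as UTF-8.

    Hypothetical helper for illustration; assumes the name is UTF-8.
    """
    raw = re.sub(
        rb"\\([0-7]{3})",                       # one backslash plus three octal digits
        lambda m: bytes([int(m.group(1), 8)]),  # replace the escape with that single byte
        name.encode("ascii"),
    )
    return raw.decode("utf-8")


# The two escaped names exactly as stat printed them.
for escaped in (r"Wei\341\262\256JPEG", r"Wei\341\263\256JPEG"):
    decoded = decode_octal_escapes(escaped)
    # List only the non-ASCII code points of each name.
    codes = [f"{ord(c):04X}" for c in decoded if ord(c) > 127]
    print(escaped, "->", " ".join(codes))
```

Running it on the two names above should yield the code points 1CAE and 1CEE respectively, matching the codes seen in the diff output.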
