Rob Sheldon wrote at about 23:54:51 -0700 on Wednesday, March 15, 2023:
 > There is no reason to be concerned. This is normal.
Not really!
The OP said he is using v4, which uses full-file md5sum checksums as
the file name hash.

It *should* be extremely, once-in-a-blue-moon rare to randomly have an
md5sum collision -- as in 1.47*10^-29.

Randomly having 165!!!! hash collisions is many orders of magnitude
more unlikely still.
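
For anyone who wants to check the arithmetic, here is a rough sketch of
the usual birthday-bound estimate for a 128-bit hash. The pool sizes are
my own assumptions, not numbers from the OP; a pool of about 100,000
files happens to give a probability of roughly the size quoted above.
The function name is made up for illustration.

```python
MD5_BITS = 128  # MD5 produces a 128-bit digest


def collision_probability(n_files: int, bits: int = MD5_BITS) -> float:
    """Birthday-bound approximation: p ~= n*(n-1)/2 / 2^bits.

    Valid only while p is tiny, which is the case for any realistic pool.
    """
    pairs = n_files * (n_files - 1) / 2   # number of distinct file pairs
    return pairs / 2.0 ** bits            # each pair collides with prob 2^-bits


# Assumed pool sizes, purely for illustration.
for n in (10**5, 10**6, 10**9):
    print(f"{n:>13,d} files: p ~= {collision_probability(n):.3g}")
```

Even at a billion files the chance of a single accidental collision
stays vanishingly small, which is why 165 of them cannot be bad luck.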

You would have to work hard to artificially create such collisions.

 > Searching for "backuppc pool hashing chain" finds 
 > https://sourceforge.net/p/backuppc/mailman/message/20671583/ for example. 
 > The documentation could perhaps be more specific about this, but BackupPC 
 > "chains" hash collisions, so no data is lost or damaged. This is a pretty 
 > standard compsci data structure. If you search the BackupPC repository for 
 > the message you're seeing, you find 
 > https://github.com/backuppc/backuppc/blob/43c91d83d06b4af3760400365da62e1fd5ee584e/lib/BackupPC/Lang/en.pm#L311,
 >  which gives you a couple of variable names that you can use to search the 
 > codebase and satisfy your curiosity.
 >

This OLD (ca 2008) reference is totally IRRELEVANT for v4.x.
Hash collisions on v3.x were EXTREMELY common and it was not unusual to
have even long chains of collisions since the md5sum was computed
using only the first and last 128KB (I think) of the file plus the
file length. So any long file whose middle section changed by even
just one byte would create a collision chain.
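
To illustrate why that kind of partial hashing collides so easily, here
is a toy sketch. It is NOT BackupPC's actual v3 code: it simply assumes
the scheme as I described it above (file length plus the first and last
128KB), and the function names are invented for the example.

```python
import hashlib

CHUNK = 128 * 1024  # 128 KiB, taken from the description above (assumption)


def partial_digest(data: bytes) -> str:
    """Hypothetical v3-style digest: length + first and last CHUNK bytes."""
    h = hashlib.md5()
    h.update(str(len(data)).encode())
    if len(data) <= 2 * CHUNK:
        h.update(data)            # small file: everything is hashed
    else:
        h.update(data[:CHUNK])    # head of the file
        h.update(data[-CHUNK:])   # tail of the file
    return h.hexdigest()


def full_digest(data: bytes) -> str:
    """Full-file MD5, which is what v4 uses for the pool file name."""
    return hashlib.md5(data).hexdigest()


original = bytes(1024 * 1024)             # 1 MiB of zero bytes
changed = bytearray(original)
changed[len(changed) // 2] ^= 0xFF        # flip one byte in the middle
modified = bytes(changed)

print(partial_digest(original) == partial_digest(modified))  # True
print(full_digest(original) == full_digest(modified))        # False
```

Same length, same head, same tail: the partial digest cannot tell the
two files apart, while the full-file digest can.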

 > "165 repeated files with longest chain 1" just means that there are 165 
 > different files that had a digest (hash) that matched another file. This can 
 > happen for a number of different reasons. A common one is that there are 
 > identical files on the backuppc client system that have different paths.
 > 
 > I've been running BackupPC for a *long* time and have always had chained 
 > hashes in my server status. It's an informational message, not an error.
 > 

Are you running v3 or v4?
You are right, though, that it's technically not an error even in v4, in
that md5sum collisions do of course exist -- it's just that they are
extremely rare.

It would REALLY REALLY REALLY help if the OP would do some WORK to
help HIMSELF diagnose the problem.

Since it seems like only a single md5sum chain is involved, it would
seem blindingly obvious that the first step in troubleshooting would
be to determine what is the nature of these files with the same md5sum
hash.

Specifically,
- Do they indeed all have identical md5sum?
- Are they indeed distinct files (despite having the same md5sum)?
- How, if at all, are these files special?
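
The first two checks could be scripted along these lines. This is only
a sketch: it assumes you have uncompressed copies of the suspect files
and pass their paths on the command line (files in a compressed pool
cannot be hashed directly this way), and the helper names are made up.

```python
import filecmp
import hashlib
import sys
from collections import defaultdict
from itertools import combinations


def md5_of(path, bufsize=1 << 20):
    """Full-file MD5 of the file at `path`, read in 1 MiB blocks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(bufsize), b""):
            h.update(block)
    return h.hexdigest()


def report(paths):
    """Return (digest, a, b, same_content) for every pair sharing a digest."""
    by_digest = defaultdict(list)
    for p in paths:
        by_digest[md5_of(p)].append(p)
    results = []
    for digest, group in by_digest.items():
        for a, b in combinations(group, 2):
            # shallow=False forces a byte-by-byte comparison
            same = filecmp.cmp(a, b, shallow=False)
            results.append((digest, a, b, same))
    return results


if __name__ == "__main__":
    for digest, a, b, same in report(sys.argv[1:]):
        verdict = "identical content" if same else "DIFFERENT content"
        print(f"{digest}  {a}  {b}  {verdict}")
```

Any pair reported as "DIFFERENT content" would be a genuine md5sum
collision; pairs with identical content point to something else.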

 > On Tue, Mar 14, 2023, at 12:41 PM, Giannis Economou wrote:
 > > The message appears in "Server Information" section, inside server 
 > > status, in the web interface of BackupPC (page "BackupPC Server Status").
 > > 
 > > Both servers are running BackupPC version 4.4.0.
 > > 1st server says:  "Pool hashing gives 165 repeated files with longest 
 > > chain 1"
 > > 2nd server says:  "Pool hashing gives 5 repeated files with longest chain 
 > > 1"
 > > 
 > > I could not find more info/details about this message in the documentation.
 > > Only a comment in github is mentioning this message, and to my 
 > > understanding it is related to hash collisions in the pool.
 > > 
 > > Since hash collisions might be worrying (in theory), my question is if 
 > > this is alarming and in that case if there is something that we can do 
 > > about it (for example actions to zero out any collisions if needed).
 > > 
 > > 
 > > Thank you very much.
 > 
 > Rob Sheldon
 > Contract software developer, devops, security, technical lead
 > 
 > 
 > _______________________________________________
 > BackupPC-users mailing list
 > BackupPC-users@lists.sourceforge.net
 > List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
 > Wiki:    https://github.com/backuppc/backuppc/wiki
 > Project: https://backuppc.github.io/backuppc/


