Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
John Rouillard wrote at about 20:13:15 + on Thursday, October 30, 2008:
> On Thu, Oct 30, 2008 at 10:04:26AM -0400, Jeffrey J. Kosowsky wrote:
> > Holger Parplies wrote at about 11:29:49 +0100 on Thursday, October 30, 2008:
> > > Jeffrey J. Kosowsky wrote on 2008-10-30 03:55:16 -0400:
> > > > I have found a number of files in my pool that have the same checksum
> > > > (other than a trailing _0 or _1) and also the SAME CONTENT. Each copy
> > > > has a few links to it by the way.
> > > >
> > > > Why is this happening?
> > >
> > > presumably creating a link sometimes fails, so BackupPC copies the file,
> > > assuming the hard link limit has been reached. I suspect problems with
> > > your NFS server, though not a "stale NFS file handle" in this case,
> > > since copying the file succeeds. Strange.
> >
> > Yes - I am beginning to think that may be true. However, as I mentioned
> > in the other thread, the syslog on the NFS server is clean and the one
> > on the client shows only about a dozen or so NFS timeouts over the past
> > 12 hours, which is the time period I am looking at now. Otherwise, I
> > don't see any NFS errors. So if it is an NFS problem, something seems
> > to be happening somewhat randomly and invisibly to the filesystem.
>
> IIRC you are using a soft NFS mount option, right? If you are writing
> to an NFS share, that is not recommended. Try changing it to a hard
> mount and see if the problem goes away. I only used soft mounts on
> read-only filesystems.

Unfortunately, this did not help. I assume the problem is somewhere in the
HW/SW.
-
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Hi,

[could we agree on a subject line without tabs? ;-]

Jeffrey J. Kosowsky wrote on 2008-10-30 20:31:15 -0400:
> Jeffrey J. Kosowsky wrote at about 20:26:35 -0400 on Thursday, October 30, 2008:
> > It's really weird in that it seems to work the first time a directory
> > is read, but after a directory has been read a few times, it starts
> > messing up. It's almost like the results are being stored in cache and
> > then the cache is corrupted.
>
> In fact, I have found two ways to reliably read the directory again (at
> least for a few minutes or tries until it gets corrupted again):
> 1. Remount the NFS share
> 2. Read the directory directly on the server (without NFS)

Bad memory on either client or server? A bug in the NFS implementation on
the client or server? You said you built a kernel for the NAS device; could
anything have gone wrong? Have you tried the 'noac' mount option? Which NFS
version are you using? Over TCP or UDP? Have you found out anything about
the ATAoE (or iSCSI, for that matter) capabilities of the device?

Regards,
Holger
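For anyone following along, the negotiated NFS version, transport and
effective mount options Holger asks about can usually be checked with
commands along these lines (exact output varies by distribution and
nfs-utils version; the server and mount point names below are placeholders):

```shell
# Run on the BackupPC client. Show negotiated NFS version, transport
# and effective mount options for each NFS mount:
nfsstat -m

# The kernel's own view of the mount options:
grep nfs /proc/mounts

# Remount with attribute caching disabled, per the 'noac' suggestion
# (server:/export and /mnt/backuppc are placeholders):
mount -o remount,noac server:/export /mnt/backuppc
```

Note that 'noac' trades correctness for performance: every attribute lookup
goes back to the server, which is slow but rules out a stale client-side
attribute cache as the cause of the disappearing directory entries.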
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Jeffrey J. Kosowsky wrote at about 20:26:35 -0400 on Thursday, October 30, 2008:
> John Rouillard wrote at about 20:13:15 + on Thursday, October 30, 2008:
> > [...]
> > IIRC you are using a soft NFS mount option, right? If you are writing
> > to an NFS share, that is not recommended. Try changing it to a hard
> > mount and see if the problem goes away. I only used soft mounts on
> > read-only filesystems.
>
> True -- I changed it to 'hard' but am still encountering the
> problem... FRUSTRATING...
>
> It's really weird in that it seems to work the first time a directory
> is read, but after a directory has been read a few times, it starts
> messing up. It's almost like the results are being stored in cache and
> then the cache is corrupted.

In fact, I have found two ways to reliably read the directory again (at
least for a few minutes or tries until it gets corrupted again):
1. Remount the NFS share
2. Read the directory directly on the server (without NFS)
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
John Rouillard wrote at about 20:13:15 + on Thursday, October 30, 2008:
> On Thu, Oct 30, 2008 at 10:04:26AM -0400, Jeffrey J. Kosowsky wrote:
> > [...]
> > So if it is an NFS problem, something seems to be happening somewhat
> > randomly and invisibly to the filesystem.
>
> IIRC you are using a soft NFS mount option, right? If you are writing
> to an NFS share, that is not recommended. Try changing it to a hard
> mount and see if the problem goes away. I only used soft mounts on
> read-only filesystems.

True -- I changed it to 'hard' but am still encountering the
problem... FRUSTRATING...

It's really weird in that it seems to work the first time a directory is
read, but after a directory has been read a few times, it starts messing
up. It's almost like the results are being stored in cache and then the
cache is corrupted.
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Craig Barratt wrote at about 11:27:41 -0700 on Thursday, October 30, 2008:
> Jeffrey writes:
>
> > Except that in my case some of the duplicated checksums truly are the
> > same file (probably due to the link issue I am having)...
>
> Yes. Just as Holger mentions, if the hardlink attempt fails, a new file
> is created in the pool. You appear to have some unreliability in your
> NFS or network setup.
>
> The only other time identical files will have different pool entries, as
> people noted, is when $Conf{HardLinkMax} is hit. Subsequent expiry of
> backups might cause the identical files to move below $Conf{HardLinkMax}.
>
> It's not worth the trouble to try to combine those files, since the
> frequency is so small and the effort to relink them is very high.
>
> Craig

OK - it definitely seems to be an NFS problem -- sorry for having troubled
the BackupPC list. When I run 'find | wc' on the cpool directory, I usually
get the right number of results, but sometimes whole subdirectories are not
found. This problem comes and goes: sometimes I get the right results and
sometimes I don't, which makes it even harder to troubleshoot since I can't
reliably reproduce it. I am confused, though, why I'm not seeing any sign
of this problem in my log files (either on the NFS server or the client)...

Thanks!!!
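The intermittent 'find | wc' discrepancy described above can be caught with
a small wrapper that lists the tree twice in a row and compares the entry
counts. This is only a sketch; `count_twice` is my own helper name, not a
BackupPC tool, and the pool path in the example is a placeholder:

```shell
# count_twice DIR: list DIR twice in a row and compare the entry counts.
# On a healthy filesystem both counts match; on the flaky NFS mount
# described above, a mismatch flags the disappearing-subdirectory problem.
count_twice() {
    first=$(find "$1" | wc -l)
    second=$(find "$1" | wc -l)
    if [ "$first" -ne "$second" ]; then
        echo "MISMATCH: $first vs $second entries under $1"
        return 1
    fi
    echo "OK: $first entries both times"
}

# Example against a BackupPC pool (path is a placeholder):
# count_twice /var/lib/backuppc/cpool
```

Run it in a loop (or from cron) to get a timestamped record of when the
mount starts misbehaving, which may be easier to correlate with server-side
events than the occasional syslog timeout.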
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
On Thu, Oct 30, 2008 at 10:04:26AM -0400, Jeffrey J. Kosowsky wrote:
> Holger Parplies wrote at about 11:29:49 +0100 on Thursday, October 30, 2008:
> > Jeffrey J. Kosowsky wrote on 2008-10-30 03:55:16 -0400:
> > > I have found a number of files in my pool that have the same checksum
> > > (other than a trailing _0 or _1) and also the SAME CONTENT. Each copy
> > > has a few links to it by the way.
> > >
> > > Why is this happening?
> >
> > presumably creating a link sometimes fails, so BackupPC copies the file,
> > assuming the hard link limit has been reached. I suspect problems with
> > your NFS server, though not a "stale NFS file handle" in this case,
> > since copying the file succeeds. Strange.
>
> Yes - I am beginning to think that may be true. However, as I mentioned
> in the other thread, the syslog on the NFS server is clean and the one on
> the client shows only about a dozen or so NFS timeouts over the past 12
> hours, which is the time period I am looking at now. Otherwise, I don't
> see any NFS errors. So if it is an NFS problem, something seems to be
> happening somewhat randomly and invisibly to the filesystem.

IIRC you are using a soft NFS mount option, right? If you are writing to an
NFS share, that is not recommended. Try changing it to a hard mount and see
if the problem goes away. I only used soft mounts on read-only filesystems.
--
rouilj
John Rouillard
System Administrator
Renesys Corporation
603-244-9084 (cell)
603-643-9300 x 111
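For reference, a hard mount as John suggests might look something like this
in /etc/fstab. The server name, export path, and mount point are
placeholders; the option list is one common combination, not a universal
recommendation:

```shell
# /etc/fstab -- hard mount for a writable NFS share. 'hard' makes the
# client retry indefinitely instead of returning errors on timeout;
# 'intr' lets processes be interrupted if the server goes away.
# nas:/volume1/backuppc and /var/lib/backuppc are placeholders.
nas:/volume1/backuppc  /var/lib/backuppc  nfs  rw,hard,intr,tcp  0 0
```

The key difference: with 'soft', a timed-out operation returns an error to
the application (which BackupPC may interpret as a failed link attempt),
while 'hard' blocks until the server responds.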
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Jeffrey writes:

> Except that in my case some of the duplicated checksums truly are the
> same file (probably due to the link issue I am having)...

Yes. Just as Holger mentions, if the hardlink attempt fails, a new file is
created in the pool. You appear to have some unreliability in your NFS or
network setup.

The only other time identical files will have different pool entries, as
people noted, is when $Conf{HardLinkMax} is hit. Subsequent expiry of
backups might cause the identical files to move below $Conf{HardLinkMax}.

It's not worth the trouble to try to combine those files, since the
frequency is so small and the effort to relink them is very high.

Craig
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Jeffrey J. Kosowsky wrote at about 10:04:26 -0400 on Thursday, October 30, 2008:
> Holger Parplies wrote at about 11:29:49 +0100 on Thursday, October 30, 2008:
> > Jeffrey J. Kosowsky wrote on 2008-10-30 03:55:16 -0400:
> > > I have found a number of files in my pool that have the same checksum
> > > (other than a trailing _0 or _1) and also the SAME CONTENT. Each copy
> > > has a few links to it by the way.
> > >
> > > Why is this happening?
> >
> > presumably creating a link sometimes fails, so BackupPC copies the file,
> > assuming the hard link limit has been reached. I suspect problems with
> > your NFS server, though not a "stale NFS file handle" in this case,
> > since copying the file succeeds. Strange.
>
> Yes - I am beginning to think that may be true. However as I mentioned
> in the other thread, the syslog on the nfs server is clean and the one
> on the client shows only about a dozen or so nfs timeouts over the
> past 12 hours which is the time period I am looking at now. Otherwise,
> I don't see any nfs errors.

Actually, I traced these errors to a timeout caused by the disks on the NAS
spinning up. They appear to be just soft timeouts (and not related to this
link problem).

> So if it is a nfs problem, something seems to be happening somewhat
> randomly and invisibly to the filesystem.
>
> > > Isn't this against the whole theory of pooling.
> >
> > Well, yes :). But the action of copying the file when the method used
> > to implement pooling (hard links) does not work for some reason (max
> > link count reached, or NFS file server errors) is perfectly reasonable
> > -- if you think about it, you *do* get some level of pooling; otherwise
> > you'd have an independent copy or a missing file each time linking
> > fails.
>
> > > It also doesn't seem to get cleaned up by BackupPC_nightly since
> > > that has run several times and the pool files are now several days
> > > old.
> >
> > BackupPC_nightly is not supposed to clean up that situation. It could
> > be designed to do so (the situation may arise when a "link count
> > overflow" is resolved by expired backups), but it would be horribly
> > inefficient: for the file to be eliminated, you would have to find()
> > every occurrence of the inode in all pc/* trees and replace them with
> > links to the duplicate(s) to be kept. You don't want that.
>
> Yes, but it would be nice to have a switch perhaps that allowed this more
> comprehensive cleanup. Even in a non-error case, I can imagine situations
> where at some point the max file links may have been exceeded and then
> backups were deleted so that the link count came back down below the max.
>
> The logic wouldn't seem to be that horrendous, since you would only need
> to walk down the pc/* trees once -- i.e. first walk down (c)pool/* to
> compile a list of repeated but identical checksums, then walk down the
> pc/* tree to find the files on the list.
>
> > > What can I do to clean it up?
> >
> > Fix your NFS server? :) Is there a consistent maximum number of links,
> > or do the copies seem to happen randomly? Honestly, I don't think the
> > savings you may gain from storing the pool over NFS are worth the
> > headaches. What is cheaper about putting a large disk into a NAS
> > device than into your BackupPC server? Well, yes, you can share it ...
> > how about exporting part of the disk from the BackupPC server (I would
> > still recommend distinct partitions)?
>
> You are right in theory. But I would still like to get NFS working for
> various reasons, and it is always a good "learning experience" to
> troubleshoot such things ;)

Now this is interesting... Looking through my BackupPC log files, I noticed
that this problem *FIRST* occurred on Oct 27 and has affected every backup
since. The errors only occur when BackupPC_link runs (and I didn't have any
problems with BackupPC_link in the 10 or so previous days that I have been
using BackupPC). So, I used both find and the incremental backups
themselves to see what happened between the last error-free backup at 18:08
on Oct 26 and the first bad one at 1AM on Oct 27. But it doesn't seem like
any files changed on either the BackupPC server or the NFS server. Also,
interestingly, this problem occ
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Jeffrey J. Kosowsky wrote:
> Holger Parplies wrote at about 11:29:49 +0100 on Thursday, October 30, 2008:
> > [...]
> > presumably creating a link sometimes fails, so BackupPC copies the
> > file, assuming the hard link limit has been reached. I suspect problems
> > with your NFS server [...]
>
> Yes - I am beginning to think that may be true. However as I mentioned
> in the other thread, the syslog on the nfs server is clean and the one
> on the client shows only about a dozen or so nfs timeouts over the
> past 12 hours which is the time period I am looking at now. Otherwise,
> I don't see any nfs errors.
> So if it is a nfs problem, something seems to be happening somewhat
> randomly and invisibly to the filesystem.

See this URL, which helped me improve performance and reduce NFS errors in
my environment:

http://billharlan.com/pub/papers/NFS_for_clusters.html

It was written a long time ago, but most of it is still very relevant (I
guess NFS has not changed much). In my case, the actual problem was faulty
memory in a new server plus some sort of strange network card driver
problem corrupting the NFS packets. It truly surprised me just how many
errors I was getting even from my existing load, which I had never noticed.
Regards,
Adam
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Tino Schwarze wrote at about 15:08:29 +0100 on Thursday, October 30, 2008:
> On Thu, Oct 30, 2008 at 09:56:15AM -0400, Jeffrey J. Kosowsky wrote:
> > > I'm not sure though how the file name is derived; I found another
> > > file with the same name but a different MD5 sum:
> > > .../cpool/0/0 # md5sum 8/0084734e7242df0fbc186ba6741d1bab*
> > > db224998946bac7859f2448f41c58f88  8/0084734e7242df0fbc186ba6741d1bab
> > > d1d8f3a86ae5492de0bf11f5cfb45860  8/0084734e7242df0fbc186ba6741d1bab_0
> > >
> > > IIRC, BackupPC_nightly should perform chain cleaning.
> >
> > Well, I haven't noticed any change after it runs...
> > I think I'm even more confused now ;)
> > How can I troubleshoot this further?
>
> There's no trouble to shoot! ;-)
>
> Holger explained that the pool file name is based on a checksum of the
> first 256k of the file's content and the file's length, so collisions
> are normal and expected.

Except that in my case some of the duplicated checksums truly are the same
file (probably due to the link issue I am having)...
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
On Thu, Oct 30, 2008 at 09:56:15AM -0400, Jeffrey J. Kosowsky wrote:
> > I'm not sure though how the file name is derived; I found another file
> > with the same name but a different MD5 sum:
> > .../cpool/0/0 # md5sum 8/0084734e7242df0fbc186ba6741d1bab*
> > db224998946bac7859f2448f41c58f88  8/0084734e7242df0fbc186ba6741d1bab
> > d1d8f3a86ae5492de0bf11f5cfb45860  8/0084734e7242df0fbc186ba6741d1bab_0
> >
> > IIRC, BackupPC_nightly should perform chain cleaning.
>
> Well, I haven't noticed any change after it runs...
> I think I'm even more confused now ;)
> How can I troubleshoot this further?

There's no trouble to shoot! ;-)

Holger explained that the pool file name is based on a checksum of the
first 256k of the file's content and the file's length, so collisions are
normal and expected.

HTH,
Tino.

--
"What we nourish flourishes." - "Was wir nähren erblüht."

www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Holger Parplies wrote at about 11:29:49 +0100 on Thursday, October 30, 2008:
> Hi,
>
> Jeffrey J. Kosowsky wrote on 2008-10-30 03:55:16 -0400:
> > I have found a number of files in my pool that have the same checksum
> > (other than a trailing _0 or _1) and also the SAME CONTENT. Each copy
> > has a few links to it by the way.
> >
> > Why is this happening?
>
> presumably creating a link sometimes fails, so BackupPC copies the file,
> assuming the hard link limit has been reached. I suspect problems with
> your NFS server, though not a "stale NFS file handle" in this case, since
> copying the file succeeds. Strange.

Yes - I am beginning to think that may be true. However, as I mentioned in
the other thread, the syslog on the NFS server is clean and the one on the
client shows only about a dozen or so NFS timeouts over the past 12 hours,
which is the time period I am looking at now. Otherwise, I don't see any
NFS errors. So if it is an NFS problem, something seems to be happening
somewhat randomly and invisibly to the filesystem.

> > Isn't this against the whole theory of pooling.
>
> Well, yes :). But the action of copying the file when the method used to
> implement pooling (hard links) does not work for some reason (max link
> count reached, or NFS file server errors) is perfectly reasonable -- if
> you think about it, you *do* get some level of pooling; otherwise you'd
> have an independent copy or a missing file each time linking fails.
>
> > It also doesn't seem to get cleaned up by BackupPC_nightly since that
> > has run several times and the pool files are now several days old.
>
> BackupPC_nightly is not supposed to clean up that situation. It could be
> designed to do so (the situation may arise when a "link count overflow"
> is resolved by expired backups), but it would be horribly inefficient:
> for the file to be eliminated, you would have to find() every occurrence
> of the inode in all pc/* trees and replace them with links to the
> duplicate(s) to be kept. You don't want that.

Yes, but it would be nice to have a switch perhaps that allowed this more
comprehensive cleanup. Even in a non-error case, I can imagine situations
where at some point the max file links may have been exceeded and then
backups were deleted so that the link count came back down below the max.

The logic wouldn't seem to be that horrendous, since you would only need to
walk down the pc/* trees once -- i.e. first walk down (c)pool/* to compile
a list of repeated but identical checksums, then walk down the pc/* tree to
find the files on the list.

> > What can I do to clean it up?
>
> Fix your NFS server? :) Is there a consistent maximum number of links, or
> do the copies seem to happen randomly? Honestly, I don't think the
> savings you may gain from storing the pool over NFS are worth the
> headaches. What is cheaper about putting a large disk into a NAS device
> than into your BackupPC server? Well, yes, you can share it ... how about
> exporting part of the disk from the BackupPC server (I would still
> recommend distinct partitions)?

You are right in theory. But I would still like to get NFS working for
various reasons, and it is always a good "learning experience" to
troubleshoot such things ;)
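The first half of the cleanup sketched above (finding chain members that
are byte-for-byte identical) can be illustrated with a shell fragment like
the one below. The helper name is my own invention, and it deliberately
stops short of relinking, since rewriting every reference under pc/* is the
expensive part:

```shell
# list_identical_dupes POOLDIR: for each pool file with a _N suffix,
# report it if its contents are byte-identical to the base file of its
# chain. These are the duplicate candidates discussed in this thread;
# actually merging them would also require relinking every reference
# under the pc/* trees, which this sketch does not attempt.
list_identical_dupes() {
    find "$1" -type f -name '*_[0-9]*' | while read -r dup; do
        base="${dup%_*}"
        if [ -f "$base" ] && cmp -s "$base" "$dup"; then
            echo "$dup is identical to $base"
        fi
    done
}
```

One caveat: for a compressed pool, `cmp` compares the compressed bytes, so
this only catches duplicates whose compressed representation is also
identical, which is the case for the failed-link copies discussed here.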
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Tino Schwarze wrote at about 11:13:27 +0100 on Thursday, October 30, 2008:
> Hi Jeffrey,
>
> On Thu, Oct 30, 2008 at 03:55:16AM -0400, Jeffrey J. Kosowsky wrote:
>
> > I have found a number of files in my pool that have the same checksum
> > (other than a trailing _0 or _1) and also the SAME CONTENT. Each copy
> > has a few links to it by the way.
>
> That's intentional - what are the link counts for the files?
> If you look at BackupPC's status page, there is a line like:
>
> * Pool hashing gives 649 repeated files with longest chain 28,

Ah, I was wondering what that line meant... (for real :)

Mine says: Pool hashing gives 9676 repeated files with longest chain 4

HOWEVER, my config has:

$Conf{HardLinkMax} = 31999

And when I look at some of the "repeated" pool files, I see that they only
have 2-3 links each.

> > Why is this happening?
> > Isn't this against the whole theory of pooling. It also doesn't seem
> > to get cleaned up by BackupPC_nightly since that has run several times
> > and the pool files are now several days old.
>
> Because there is a file-system dependent limit to the number of hard
> links a file may have. Look at $Conf{HardLinkMax} in config.pl.
>
> Hm. I just took a look in my cpool and found some files which didn't
> hit the hardlink count yet, but have a _0 and _1:
>
> .../cpool/0/0 # ls -l c/00cd83be1ea3c1ffa3c6af2f4e310206*
> -rw-r-  4371 backuppc users 34 2005-01-14 17:01 c/00cd83be1ea3c1ffa3c6af2f4e310206
> -rw-r-  3536 backuppc users 34 2005-03-02 02:22 c/00cd83be1ea3c1ffa3c6af2f4e310206_0
> -rw-r-   439 backuppc users 34 2006-03-11 02:04 c/00cd83be1ea3c1ffa3c6af2f4e310206_1
>
> MD5 sums are not equal for all files, so maybe something got corrupted
> (or I updated BackupPC during the time - the files are rather old!):
>
> .../cpool/0/0 # md5sum c/00cd83be1ea3c1ffa3c6af2f4e310206*
> 51ef559d1d7fa02c05fa032729c85804  c/00cd83be1ea3c1ffa3c6af2f4e310206
> 51ef559d1d7fa02c05fa032729c85804  c/00cd83be1ea3c1ffa3c6af2f4e310206_0
> 7e2276750fc478fa30142aa808df2a1f  c/00cd83be1ea3c1ffa3c6af2f4e310206_1
>
> AFAIK, I started with $Conf{HardLinkMax} set to 32,000. As the files are
> very old, a lot of links might have expired already.
>
> I'm not sure though how the file name is derived; I found another file
> with the same name but a different MD5 sum:
>
> .../cpool/0/0 # md5sum 8/0084734e7242df0fbc186ba6741d1bab*
> db224998946bac7859f2448f41c58f88  8/0084734e7242df0fbc186ba6741d1bab
> d1d8f3a86ae5492de0bf11f5cfb45860  8/0084734e7242df0fbc186ba6741d1bab_0
>
> IIRC, BackupPC_nightly should perform chain cleaning.

Well, I haven't noticed any change after it runs... I think I'm even more
confused now ;) How can I troubleshoot this further?
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Apropos link count, I just did a quick check of my pool. Here are the top linked files:

-rw-r- 987537 backuppc users 359 2007-05-19 23:43 ./0/d/1/0d16a8f0ce1b516044a3f015b7d5ee06
-rw-r- 437446 backuppc users 98 2007-02-07 03:21 ./b/c/8/bc891581e99fb3729ea3d239a52d2b9a
-rw-r- 340062 backuppc users 98 2007-12-22 02:50 ./6/5/9/659e6651b59c8d8de4ffacdb9a27eb9f
-rw-r- 266646 backuppc users 122 2007-12-22 10:15 ./c/e/a/ceaf858b5f9ef4fdbd1b2132a9d8b14e

So almost one million links for... *drum roll* our CVS commit message template! Second place goes to... *drum roll* a CVS/Tag file. And the third is... a CVS/Root. The fourth is another CVS/Root, still featuring a quarter million links.

Bye, Tino.

--
"What we nourish flourishes." - "Was wir nähren erblüht."
www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Hi Holger,

On Thu, Oct 30, 2008 at 12:11:43PM +0100, Holger Parplies wrote:
> > I'm not sure though, how the file name is derived,
>
> It's in the docs. Up to 256 KB of file contents (from the first 1 MB) and the
> file length are taken into account, so it's quite easy to produce hash clashes
> if you want to: take a file > 1 MB and change the last byte. BackupPC resolves
> them and they're probably infrequent enough not to be a problem (and you get
> to see whether they are on the status page). Taking the length (of the
> uncompressed file) into account avoids things like growing logfiles from
> causing problems.

Thank you for the clarification! I had never bothered about that before. ;-)

Tino.

--
"What we nourish flourishes." - "Was wir nähren erblüht."
www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Hi,

Tino Schwarze wrote on 2008-10-30 11:13:27 +0100 [Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS]:
> [...]
> Hm. I just took a look in my cpool and found some files which didn't
> hit the hardlink count yet, but have a _0 and _1:
> .../cpool/0/0 # ls -l c/00cd83be1ea3c1ffa3c6af2f4e310206*
> -rw-r- 4371 backuppc users 34 2005-01-14 17:01 c/00cd83be1ea3c1ffa3c6af2f4e310206
> -rw-r- 3536 backuppc users 34 2005-03-02 02:22 c/00cd83be1ea3c1ffa3c6af2f4e310206_0
> -rw-r- 439 backuppc users 34 2006-03-11 02:04 c/00cd83be1ea3c1ffa3c6af2f4e310206_1
>
> MD5Sums are not equal for all files,

that's intentional :-). Those files have different content but hash to the same BackupPC hash. Quoting you:

> If you look at BackupPC's status page, there is a line like:
>
> * Pool hashing gives 649 repeated files with longest chain 28,

That is what this line is about - you have up to 28 different files hashing to the same BackupPC hash (some of these may coincidentally have identical content due to link count overflows, but that would be the exception).

> AFAIK, I started with $Conf{HardLinkMax} set to 32000. As the files are
> very old, a lot of links might have expired already.

True, but keep in mind how much 32000 really is. Unless you have many files with identical content in your backup set (CVS/Root maybe), it will take very many backups to reach so many links.

> I'm not sure though, how the file name is derived,

It's in the docs. Up to 256 KB of file contents (from the first 1 MB) and the file length are taken into account, so it's quite easy to produce hash clashes if you want to: take a file > 1 MB and change the last byte. BackupPC resolves them and they're probably infrequent enough not to be a problem (and you get to see whether they are on the status page). Taking the length (of the uncompressed file) into account avoids things like growing logfiles from causing problems.

> IIRC, BackupPC_nightly should perform chain cleaning.
Unused files (i.e. link count = 1) are removed and chains renumbered. Like I wrote, relinking identical files does not make sense.

Regards,
Holger
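Holger's description of the pool hash (the uncompressed length plus up to 256 KB taken from the first 1 MB) can be sketched as follows. This is a simplified illustration of the idea only, not BackupPC's exact byte layout or sampling scheme (the real code lives in BackupPC's Perl library):

```python
import hashlib

def pool_digest(data: bytes) -> str:
    """Rough sketch of a BackupPC-style pool hash: the digest covers the
    uncompressed length plus at most 256 KB sampled from the first 1 MB.
    Two files differing only beyond the sampled region collide by design;
    BackupPC resolves such clashes with _0, _1, ... chain suffixes."""
    length = len(data)
    window = data[:1024 * 1024]       # only the first 1 MB is considered
    if len(window) > 256 * 1024:
        # hypothetical sampling: 128 KB from each end of the window
        window = window[:128 * 1024] + window[-128 * 1024:]
    md5 = hashlib.md5()
    md5.update(str(length).encode())  # length keeps growing logfiles apart
    md5.update(window)
    return md5.hexdigest()

# Holger's recipe for a clash: take a file > 1 MB and change the last byte.
a = b"x" * (2 * 1024 * 1024)
b = a[:-1] + b"y"
assert pool_digest(a) == pool_digest(b)   # same length, same first 1 MB
```

Including the length is what prevents, for example, a logfile and a longer version of the same logfile from hashing identically.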
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Hi,

Jeffrey J. Kosowsky wrote on 2008-10-30 03:55:16 -0400 [[BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS]:
> I have found a number of files in my pool that have the same checksum
> (other than a trailing _0 or _1) and also the SAME CONTENT. Each copy
> has a few links to it by the way.
>
> Why is this happening?

presumably creating a link sometimes fails, so BackupPC copies the file, assuming the hard link limit has been reached. I suspect problems with your NFS server, though not a "stale NFS file handle" in this case, since copying the file succeeds. Strange.

> Isn't this against the whole theory of pooling.

Well, yes :). But copying the file when the mechanism that implements pooling (hard links) does not work for some reason (max link count reached, or NFS file server errors) is perfectly reasonable - if you think about it, you *do* still get some level of pooling; otherwise you'd have an independent copy or a missing file each time linking fails.

> It also doesn't seem
> to get cleaned up by BackupPC_nightly since that has run several times
> and the pool files are now several days old.

BackupPC_nightly is not supposed to clean up that situation. It could be designed to do so (the situation may arise when a "link count overflow" is resolved by expired backups), but it would be horribly inefficient: for the file to be eliminated, you would have to find() every occurrence of the inode in all pc/* trees and replace them with links to the duplicate(s) to be kept. You don't want that.

> What can I do to clean it up?

Fix your NFS server? :) Is there a consistent maximum number of links, or do the copies seem to happen randomly? Honestly, I don't think the savings you may gain from storing the pool over NFS are worth the headaches. What is cheaper about putting a large disk into a NAS device than into your BackupPC server? Well, yes, you can share it ...
how about exporting part of the disk from the BackupPC server (I would still recommend distinct partitions)?

Regards,
Holger
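The fallback Holger describes - link into the pool, and fall back to a copy when the link cannot be made - can be sketched like this. This is a hypothetical illustration of the behavior, not BackupPC's actual Perl code; the function name and signature are made up:

```python
import errno
import os
import shutil

def link_or_copy(pool_file: str, dest: str, hard_link_max: int = 31999) -> str:
    """Pool `dest` as a hard link to `pool_file` if possible; fall back to
    an independent copy when the link count is exhausted (or the filesystem
    refuses the link, e.g. flaky NFS). Returns "linked" or "copied"."""
    try:
        if os.stat(pool_file).st_nlink >= hard_link_max:
            # mimic hitting $Conf{HardLinkMax} before the kernel's own limit
            raise OSError(errno.EMLINK, "too many links")
        os.link(pool_file, dest)
        return "linked"
    except OSError:
        # Linking failed: keep a copy rather than lose the file. BackupPC
        # would then start a new chain entry (_0, _1, ...) for this content.
        shutil.copyfile(pool_file, dest)
        return "copied"
```

The point of the copy fallback is exactly what Holger says: an occasional duplicate inode is far better than a missing file whenever linking fails.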
Re: [BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
Hi Jeffrey,

On Thu, Oct 30, 2008 at 03:55:16AM -0400, Jeffrey J. Kosowsky wrote:
> I have found a number of files in my pool that have the same checksum
> (other than a trailing _0 or _1) and also the SAME CONTENT. Each copy
> has a few links to it by the way.

That's intentional - what are the link counts for the files? If you look at BackupPC's status page, there is a line like:

* Pool hashing gives 649 repeated files with longest chain 28,

> Why is this happening?
> Isn't this against the whole theory of pooling. It also doesn't seem
> to get cleaned up by BackupPC_nightly since that has run several times
> and the pool files are now several days old.

Because there is a file-system dependent limit to the number of hard links a file may have. Look at $Conf{HardLinkMax} in config.pl.

Hm. I just took a look in my cpool and found some files which didn't hit the hardlink count yet, but have a _0 and _1:

.../cpool/0/0 # ls -l c/00cd83be1ea3c1ffa3c6af2f4e310206*
-rw-r- 4371 backuppc users 34 2005-01-14 17:01 c/00cd83be1ea3c1ffa3c6af2f4e310206
-rw-r- 3536 backuppc users 34 2005-03-02 02:22 c/00cd83be1ea3c1ffa3c6af2f4e310206_0
-rw-r- 439 backuppc users 34 2006-03-11 02:04 c/00cd83be1ea3c1ffa3c6af2f4e310206_1

MD5Sums are not equal for all files, so maybe something got corrupted (or I updated BackupPC during the time - the files are rather old!):

.../cpool/0/0 # md5sum c/00cd83be1ea3c1ffa3c6af2f4e310206*
51ef559d1d7fa02c05fa032729c85804 c/00cd83be1ea3c1ffa3c6af2f4e310206
51ef559d1d7fa02c05fa032729c85804 c/00cd83be1ea3c1ffa3c6af2f4e310206_0
7e2276750fc478fa30142aa808df2a1f c/00cd83be1ea3c1ffa3c6af2f4e310206_1

AFAIK, I started with $Conf{HardLinkMax} set to 32000. As the files are very old, a lot of links might have expired already.
I'm not sure, though, how the file name is derived; I found another file with the same name but a different MD5 sum:

.../cpool/0/0 # md5sum 8/0084734e7242df0fbc186ba6741d1bab*
db224998946bac7859f2448f41c58f88 8/0084734e7242df0fbc186ba6741d1bab
d1d8f3a86ae5492de0bf11f5cfb45860 8/0084734e7242df0fbc186ba6741d1bab_0

IIRC, BackupPC_nightly should perform chain cleaning.

Tino.

--
"What we nourish flourishes." - "Was wir nähren erblüht."
www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de
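One way to troubleshoot a chain like the ones Tino lists is to dump each member's link count and real MD5 side by side, so genuine hash clashes (different content) stand out from link-count-overflow duplicates (same content). A rough sketch; note that for a compressed pool (cpool) the MD5 is of the compressed data, just as in the md5sum listings above:

```python
import glob
import hashlib
import os

def inspect_chain(base_path: str):
    """List a pool hash chain (the base file plus its _0, _1, ... suffixes)
    with each member's hard link count and actual MD5 of the stored bytes.
    Members sharing an MD5 are overflow/failed-link duplicates; members
    with distinct MD5s are genuine BackupPC hash clashes."""
    rows = []
    for path in sorted(glob.glob(base_path + "*")):
        with open(path, "rb") as f:
            digest = hashlib.md5(f.read()).hexdigest()
        rows.append((path, os.stat(path).st_nlink, digest))
    return rows

# e.g. inspect_chain(".../cpool/0/0/c/00cd83be1ea3c1ffa3c6af2f4e310206")
```

Applied to Tino's first example, this would show the base file and _0 sharing one MD5 (a duplicate) while _1 differs (a real clash).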
[BackupPC-users] Duplicate files in pool with same CHECKSUM and same CONTENTS
I have found a number of files in my pool that have the same checksum (other than a trailing _0 or _1) and also the SAME CONTENT. Each copy has a few links to it, by the way.

Why is this happening? Isn't this against the whole theory of pooling? It also doesn't seem to get cleaned up by BackupPC_nightly, since that has run several times and the pool files are now several days old.

What can I do to clean it up? Is there a script that goes through looking for identical-checksum pool files that have the same content and then coalesces them all into one inode?

Thanks!
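A read-only scan for the duplicates Jeffrey describes could look like the sketch below. It only reports candidate pairs; it deliberately does not coalesce them, since (as Holger points out elsewhere in the thread) merging inodes would require relinking every reference under the pc/* trees. The function name and the 3-level pool layout assumption are illustrative:

```python
import glob
import os

def find_duplicate_chain_members(cpool_root: str):
    """Scan a BackupPC-style pool (3 levels of hash subdirectories) for
    chain members (files whose names differ only by a _N suffix) that are
    byte-identical to the chain's base file -- i.e. candidates created by
    failed hard links rather than genuine hash clashes. Read-only."""
    dupes = []
    for suffixed in glob.iglob(os.path.join(cpool_root, "*", "*", "*", "*_*")):
        base = suffixed.rsplit("_", 1)[0]
        if not os.path.exists(base):
            continue
        with open(base, "rb") as f1, open(suffixed, "rb") as f2:
            if f1.read() == f2.read():   # small pool files; fine to slurp
                dupes.append((base, suffixed))
    return dupes
```

For a compressed pool the byte comparison works unchanged, since duplicates produced by failed links are copies of the already-compressed pool file.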