OK, so I've suffered from this bug for years, but I've recently spent 
a bit more time on it and might have some insight.

Firstly, this seems to happen semi-randomly during backups, so the 
backup will usually complete eventually. It seems related to the backup 
client and/or the number of files that the client has.

I get logs like this in my /var/log/messages:

Jan 17 23:34:31 keep kernel: [11513620.906447] rsync_bpc[27253]: segfault at 7fe37aefd428 ip 00000000004473af sp 00007ffd49086e40 error 4 in rsync_bpc[400000+75000]
Jan 18 00:35:01 keep kernel: [11517256.388512] rsync_bpc[27472]: segfault at 7fcaa53f3428 ip 00000000004473af sp 00007ffede0afdd0 error 4 in rsync_bpc[400000+75000]
Jan 18 01:05:12 keep kernel: [11519069.903776] rsync_bpc[27607]: segfault at 7f747bbf5428 ip 00000000004473af sp 00007fffe8b03b40 error 4 in rsync_bpc[400000+75000]
Jan 18 01:35:09 keep kernel: [11520869.888899] rsync_bpc[27860]: segfault at 7f7a06240428 ip 00000000004473af sp 00007ffebad15f60 error 4 in rsync_bpc[400000+75000]
Jan 18 02:04:56 keep kernel: [11522659.795284] rsync_bpc[28086]: segfault at 7f0088f10428 ip 00000000004473af sp 00007ffe07298520 error 4 in rsync_bpc[400000+75000]
Jan 18 02:40:48 keep kernel: [11524814.507776] rsync_bpc[28340]: segfault at 7f048c215428 ip 00000000004473af sp 00007fff569b9e20 error 4 in rsync_bpc[400000+75000]
Jan 18 03:04:41 keep kernel: [11526249.846662] rsync_bpc[28562]: segfault at 7fb36e61e428 ip 00000000004473af sp 00007ffdcb018d20 error 4 in rsync_bpc[400000+75000]
Jan 18 03:40:53 keep kernel: [11528425.088184] rsync_bpc[28795]: segfault at 7f1cb1b9b428 ip 00000000004473af sp 00007fff75d291a0 error 4 in rsync_bpc[400000+75000]
Jan 18 04:05:13 keep kernel: [11529887.025178] rsync_bpc[29045]: segfault at 7f94898d8428 ip 00000000004473af sp 00007fff3d6a1dd0 error 4 in rsync_bpc[400000+75000]
Jan 18 04:37:06 keep kernel: [11531803.020965] rsync_bpc[29275]: segfault at 7f83b84b3428 ip 00000000004473af sp 00007ffcb8962ed0 error 4 in rsync_bpc[400000+75000]
Jan 18 05:11:09 keep kernel: [11533848.516550] rsync_bpc[29531]: segfault at 7f18f296a428 ip 00000000004473af sp 00007ffe3f582680 error 4 in rsync_bpc[400000+75000]
Jan 18 05:47:10 keep kernel: [11536013.450327] rsync_bpc[29921]: segfault at 7fd986392428 ip 00000000004473af sp 00007ffe07aacb60 error 4 in rsync_bpc[400000+75000]
Jan 18 06:04:47 keep kernel: [11537071.297055] rsync_bpc[30127]: segfault at 7f0dd13f3428 ip 00000000004473af sp 00007fff977dd350 error 4 in rsync_bpc[400000+75000]
Jan 18 13:15:05 keep kernel: [11562928.034694] rsync_bpc[1224]: segfault at 7f6923390428 ip 00000000004473af sp 00007fff7d94c8f0 error 4 in rsync_bpc[400000+75000]
Jan 18 13:30:57 keep kernel: [11563881.316870] rsync_bpc[1322]: segfault at 7f8a9f83b428 ip 00000000004473af sp 00007fff9b9d9850 error 4 in rsync_bpc[400000+75000]

I found an informative post, 
http://stackoverflow.com/questions/2549214/interpreting-segfault-messages, 
which pointed me to this command (the output follows it):
addr2line -e /usr/local/bin/rsync_bpc -fCi 0x00000000004473af
bpc_attrib_fileCopyOpt
/usr/src/rsync-bpc-3.0.9.3/backuppc/bpc_attrib.c:284

Looking at the file I see this function:
    273  /*
    274   * Copy all the attributes from fileSrc to fileDest. fileDest should already have a
    275   * valid allocated fileName and allocated xattr hash.  The fileDest xattr hash is
    276   * emptied before the copy, meaning it is over written.
    277   *
    278   * If overwriteEmptyDigest == 0, an empty digest in fileSrc will not overwrite fileDest.
    279   */
    280  void bpc_attrib_fileCopyOpt(bpc_attrib_file *fileDest, bpc_attrib_file *fileSrc, int overwriteEmptyDigest)
    281  {
    282      if ( fileDest == fileSrc ) return;
    283
    284      fileDest->type      = fileSrc->type;
    285      fileDest->compress  = fileSrc->compress;
    286      fileDest->mode      = fileSrc->mode;
    287      fileDest->isTemp    = fileSrc->isTemp;
    288      fileDest->uid       = fileSrc->uid;
    289      fileDest->gid       = fileSrc->gid;
    290      fileDest->nlinks    = fileSrc->nlinks;
    291      fileDest->mtime     = fileSrc->mtime;
    292      fileDest->size      = fileSrc->size;
    293      fileDest->inode     = fileSrc->inode;

Looking at line 284, we see that this is the first time the function 
reads from the object fileSrc. I suspect that fileSrc is somehow invalid 
or no longer exists, and that this is why we are getting the error. I'm 
guessing that it has some non-NULL value, as otherwise we should see a 
much smaller number in the "at" value in the logs (similar to the OP in 
the Stack Overflow question).
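
For reference, the "error 4" in the log lines can be decoded as well. On 
x86 the number is the page-fault error code, where bit 0 distinguishes a 
protection violation from a not-present page, bit 1 a write from a read, 
and bit 2 user mode from kernel mode. A minimal sketch of that decoding 
follows; the names decode_fault, print_fault and the PF_* macros are my 
own for illustration and are not part of rsync_bpc:

```c
#include <assert.h>
#include <stdio.h>

/* Low bits of the x86 page-fault error code printed as "error N". */
#define PF_PROT  0x1u  /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2u  /* 0: read access,      1: write access */
#define PF_USER  0x4u  /* 0: kernel mode,      1: user mode */

struct fault_info {
    int protection;  /* 1 if the page was present but access was denied */
    int is_write;    /* 1 if the faulting access was a write */
    int user_mode;   /* 1 if the fault happened in user mode */
};

/* Split the error code into its three low flag bits. */
struct fault_info decode_fault(unsigned code)
{
    struct fault_info f;
    f.protection = (code & PF_PROT)  != 0;
    f.is_write   = (code & PF_WRITE) != 0;
    f.user_mode  = (code & PF_USER)  != 0;
    return f;
}

/* Print a one-line description of the error code to `out`. */
void print_fault(FILE *out, unsigned code)
{
    struct fault_info f = decode_fault(code);
    assert(out != NULL);
    fprintf(out, "error %u: %s-mode %s, %s\n", code,
            f.user_mode  ? "user" : "kernel",
            f.is_write   ? "write" : "read",
            f.protection ? "protection violation" : "page not present");
}
```

Calling print_fault(stdout, 4) prints "error 4: user-mode read, page not 
present". A read fault fits the idea that the bad pointer is fileSrc 
(which is read on line 284) rather than fileDest (which is written).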

So, is there a simple way to make sure fileSrc is "valid" before trying 
to read it and potentially causing a crash? I'd like to add some extra 
logs/debug before the crash to try to find the cause and hopefully fix it.
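
As far as I know, C has no portable way to ask whether a pointer is 
valid. One debug-only trick on Linux is to hand the pointer to a system 
call and let the kernel report EFAULT instead of delivering a segfault. 
The sketch below does that with a pipe. The function name 
debug_ptr_readable is hypothetical (it is not part of rsync_bpc), and it 
only tells whether the memory is mapped and readable, not whether it 
really holds a bpc_attrib_file:

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Debug-only helper (hypothetical, not part of rsync_bpc).
 * Returns 1 if the `len` bytes starting at `ptr` can be read by this
 * process, 0 otherwise.  It asks the kernel to copy the bytes into a
 * pipe; if the range is not readable, write() fails (or copies fewer
 * bytes) instead of the process receiving SIGSEGV.
 */
int debug_ptr_readable(const void *ptr, size_t len)
{
    int fds[2];
    ssize_t n;

    /* Keep the probe far below the pipe buffer size so write() cannot block. */
    assert(len <= 4096);

    if ( ptr == NULL || len == 0 ) return 0;
    if ( pipe(fds) != 0 ) return 1;  /* cannot probe: avoid a false alarm */

    n = write(fds[1], ptr, len);     /* the kernel reads from ptr here */

    close(fds[0]);
    close(fds[1]);
    return n == (ssize_t)len;
}
```

A check such as debug_ptr_readable(fileSrc, sizeof(*fileSrc)) placed 
before line 284 could log a message and return instead of crashing. 
Memory that was freed but is still mapped would pass the check, so a 
positive result does not prove the pointer is good.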

There are only two places where this function is called. One of them is:
/*
 * Copy all the attributes from fileSrc to fileDest.  fileDest should already have a
 * valid allocated fileName and allocated xattr hash.  The fileDest xattr hash is
 * emptied before the copy, meaning it is over written.
 */
void bpc_attrib_fileCopy(bpc_attrib_file *fileDest, bpc_attrib_file *fileSrc)
{
    if ( fileDest == fileSrc ) return;

    bpc_attrib_fileCopyOpt(fileDest, fileSrc, 1);
}

This just passes along the same pointer it received, so we will need to 
trace it back another step. If there is an easy method to test whether 
the pointer is valid, I can add that test to each function before it 
calls bpc_attrib_fileCopy and hopefully work out what is wrong with it.
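
One way to trace it back a step might be a small wrapper that logs the 
call site and both pointers before doing the copy, so the last line in 
the log before a crash names the caller. This is only a sketch: 
traced_fileCopy and BPC_ATTRIB_FILECOPY_TRACED are made-up names, and 
the struct and copy function here are cut-down stand-ins so the example 
compiles on its own.

```c
#include <assert.h>
#include <stdio.h>

/* Cut-down stand-in for the real bpc_attrib_file, which has more fields. */
typedef struct {
    int type;
} bpc_attrib_file;

/* Stand-in for the real bpc_attrib_fileCopy. */
void bpc_attrib_fileCopy(bpc_attrib_file *fileDest, bpc_attrib_file *fileSrc)
{
    if ( fileDest == fileSrc ) return;
    fileDest->type = fileSrc->type;
}

/* Hypothetical wrapper: log where the call came from, then do the copy. */
void traced_fileCopy(bpc_attrib_file *fileDest, bpc_attrib_file *fileSrc,
                     const char *file, int line)
{
    assert(file != NULL);
    fprintf(stderr, "%s:%d: bpc_attrib_fileCopy(dest=%p, src=%p)\n",
            file, line, (void *)fileDest, (void *)fileSrc);
    fflush(stderr);  /* make sure the line is written before any crash */
    bpc_attrib_fileCopy(fileDest, fileSrc);
}

/* Expands at the call site, so __FILE__ and __LINE__ identify the caller. */
#define BPC_ATTRIB_FILECOPY_TRACED(dest, src) \
    traced_fileCopy((dest), (src), __FILE__, __LINE__)
```

Each existing call to bpc_attrib_fileCopy(a, b) would then be replaced by 
BPC_ATTRIB_FILECOPY_TRACED(a, b) while debugging. The src value logged 
just before a segfault can be compared with the "at" address in the 
kernel log.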

My C skills are quite rusty (non-existent, really), so if anyone is 
able to assist, I'd be very happy, even if it is just a clue on the 
right way to debug/find the problem.

Regards,
Adam






-- 
Adam Goryachev Website Managers www.websitemanagers.com.au
