On 08/24/2018 04:52 PM, Mike Hughes wrote:
I think I’ve discovered a new level of failure. It started off with these errors when attempting to rsync larger files:

rsync_bpc: failed to open "/home/localuser/mysql/hostname-srv-sql.our_database.sql", continuing: No space left on device (28)

rsync_bpc: mkstemp "/home/localuser/mysql/.hostname-srv-sql.our_database.sql.000000" failed: No space left on device (28)

Now all backups are failing and I see these in the Xfer Error logs:

BackupPC_refCountUpdate: can't write new pool count file /var/lib/BackupPC//pc/hostname/22/refCnt/poolCntNew.1.a8

BackupPC_refCountUpdate: given errors, redoing host hostname #22 with fsck (reset errorCnt to 0)

bpc_poolRefFileWrite: can't open/create pool delta file name /var/lib/BackupPC//pc/hostname/22/refCnt/tpoolCntDelta_1_-1_0_70674 (errno 28)

bpc_poolRefRequestFsck: can't open/create fsck request file /var/lib/BackupPC//pc/hostname/22/refCnt/needFsck70674 (errno 28)

Can't write new host pool count file /var/lib/BackupPC//pc/hostname/refCnt/poolCntNew.1.02

My guess is that the /var/lib/BackupPC/pc partition is the problem. I took advice from some rando on the interwebs [1] to put the pc folder on an ssd but perhaps it needs more headroom than suggested:

“… Split the storage up on SSD for the pc folder and something more cost efficient for cpool such as SMR drives. ...The pc folder in version 4 essentially only contains the directory structures and references of files that they should contain, so it stays very small. However, it is often read from and speeds things up remarkably when it is served fast. Much more so than speeding up the cpool.”

The “pc” folder lives in “/var/lib/BackupPC/pc” on a local ssd with 5GB overhead available. The storage pools live on a platter w/70+GB free. Am I short-changing the “pc” folder? Does it need to grow significantly during backup runs? If so, how much space is suggested?

Unfortunately, my monitoring software (NewRelic Infrastructure) does not provide much granularity and has previously hidden similar spikes in usage, so I don’t trust its reports, which do not show any capacity violations.

Thank you!

[1] - https://molnix.com/backuppc-version-4-development-allows-better-scaling/


Hi Mike,

Post author here, nice to hear it is of use!

I was intrigued by your report and decided to look at the Zabbix logs that track disk usage. I can see a change in disk usage of about 3% of the pc directory, which may be happening around the time of reference count runs. With more headroom on the storage, I have not run into the issue you mention, but it may indeed be as you suspect under the right conditions.

Can you check the size of the files that trigger the errors? What does df -h tell you during backups? If the failed refcount files linger on your drive, how big do they grow?
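
If it helps, here is a minimal sketch for logging free space on the pc partition while a backup runs, at a finer interval than a typical monitoring agent. It is not part of BackupPC; the function names (usage_snapshot, watch) and the 5-second interval are my own assumptions. It also prints free inodes, since "No space left on device" can be reported when a filesystem has run out of inodes rather than bytes.

```python
import os
import shutil
import time


def usage_snapshot(path):
    """Return byte and inode usage for the filesystem that holds `path`."""
    disk = shutil.disk_usage(path)   # total, used, free in bytes
    vfs = os.statvfs(path)           # f_files / f_ffree are inode counts
    return {
        "free_bytes": disk.free,
        "used_pct": 100.0 * disk.used / disk.total if disk.total else 0.0,
        "free_inodes": vfs.f_ffree,
        "total_inodes": vfs.f_files,
    }


def watch(path, interval=5.0, samples=None):
    """Print one line per sample; stop after `samples` lines (None = forever)."""
    count = 0
    while samples is None or count < samples:
        snap = usage_snapshot(path)
        print(
            f"{time.strftime('%H:%M:%S')} {path} "
            f"free={snap['free_bytes'] / 2**20:.0f}MiB "
            f"used={snap['used_pct']:.1f}% "
            f"free_inodes={snap['free_inodes']}/{snap['total_inodes']}"
        )
        count += 1
        # Sleep only between samples, not after the last one.
        if samples is None or count < samples:
            time.sleep(interval)
```

Calling watch("/var/lib/BackupPC/pc") in a second terminal during a backup (path taken from your error messages, adjust as needed) should show whether bytes or inodes bottom out when the refCnt errors appear.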

I will follow this thread with interest, and I hope to find the time to closely monitor what happens in the pc folder during reference counts.

Best regards,
Johan

--
Johan Ehnberg
Founder, CEO
jo...@molnix.com
+358503209688

Molnix Oy
molnix.com
