On 7 September 2016 at 22:37, Nikolaus Rath <[email protected]> wrote:
> On Sep 07 2016, Roger Gammans <[email protected]> wrote:
> > apsw.CorruptError: CorruptError: database disk image is malformed
>
> That means you'll have to discard the locally cached metadata. The next
> fsck.s3ql will recover whatever was most recently stored in S3.
>
> > Traceback (most recent call last):
> >   File "/usr/lib/s3ql/s3ql/block_cache.py", line 535, in upload
> >     obj_id = self.db.rowid('INSERT INTO objects (refcount, size) VALUES(1, -1)')
> >   File "/usr/lib/s3ql/s3ql/database.py", line 104, in rowid
> >     self.conn.cursor().execute(*a, **kw)
> >   File "src/cursor.c", line 231, in resetcursor
> > apsw.FullError: FullError: database or disk is full

The odd thing here is that there is now 6.8G free in that partition, and
we monitor the filesystem size every 60s; the monitoring doesn't show a
spike. So either something used a lot of disk space very quickly, or we
reached a maximum cache DB size? (I thought it was SQLite, which
apparently allows up to 2^64 rows per table.)

This was the state of the cache directory (holding 2.6G of cache files)
before I deleted it:

# ls -ld
-rw-r--r-- 1 root staff          0 Aug 18 09:36 /mount.s3ql_crit.log
drwxr-sr-x 2 root staff     143360 Sep  7 01:46 /s3:=2F=2Fcluster-backups=2Fbufs=2F-cache
-rw------- 1 root staff 2753519616 Sep  7 01:46 /s3:=2F=2Fcluster-backups=2Fbufs=2F.db
-rw-r--r-- 1 root staff        219 Sep  5 08:39 /s3:=2F=2Fcluster-backups=2Fbufs=2F.params

> > During handling of the above exception, another exception occurred:
> >
> > Traceback (most recent call last):
> >   File "/usr/lib/s3ql/s3ql/mount.py", line 66, in run_with_except_hook
> >     run_old(*args, **kw)
> >   File "/usr/lib/s3ql/s3ql/mount.py", line 795, in run
> >     self.block_cache.upload(el)
> >   File "/usr/lib/s3ql/s3ql/block_cache.py", line 573, in upload
> >     self._unlock_obj(obj_id, noerror=True)
> > UnboundLocalError: local variable 'obj_id' referenced before assignment
>
> This looks like a bug in S3QL, but it only triggers in the code for
> handling database problems. I'll take a look.

OK - would this be why the mount.s3ql process was undead - still running
but not responding to FUSE requests?

> > That would be clear about something running out of disk space - I
> > assume (because I was aware of a limit on S3 accounts) that it is a
> > local filesystem. Any idea which one?
>
> The one that holds the --cachedir directory.
>
> > My guess would be the metadata cache, but that seems fine at the
> > moment.
> >
> > Can you shed any light on it, or give me guidance on metadata sizing?
>
> Required metadata grows linearly with stored data. The proportionality
> factor depends on how big the stored files are, and what block size you
> chose.

Is that linear with size before or after de-duplication? Given we have
multiple backup snapshots created with s3qlcp, it makes a big difference.

I'm running "s3qlcp working $DATE; s3qllock $DATE" each night, but the
crash time is not consistent with this being the cause (it crashed 2.5
hours before this was due to run).

> I'd recommend to simply store a representative subset of your data and
> look at the size of the resulting metadata. You can then just scale up
> linearly to determine what you'll need for the full data.

Well, I've got accurate backups for two months or so from when we were
running 1.11; it's just that now we've moved to 2.11 we are getting
issues.
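To check I follow the suggestion, this is the kind of extrapolation I'd
do (a rough Python sketch; the subset figures below are invented for
illustration, and it still leaves open whether to scale against data
size before or after de-duplication):

    # Back-of-envelope metadata estimate from a representative subset.
    # The subset figures are made-up examples, not real measurements.
    sample_data     = 50 * 2**30      # 50 GiB subset stored for the test
    sample_metadata = 60 * 2**20      # 60 MiB .db file it produced
    full_data       = 45 * 10**12     # 45 TB, our actual total data size

    # Scale linearly, per the advice above.
    estimate = sample_metadata * (full_data / sample_data)
    print("Estimated metadata: %.1f GiB" % (estimate / 2**30))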
This is the output of my s3qlstat after re-fsck-ing and remounting:

# s3qlstat s3backups
Directory entries:    21991225
Inodes:               21991227
Data blocks:          624464
Total data size:      45 TB
After de-duplication: 2 TB (4.86% of total)
After compression:    1 TB (3.74% of total, 76.98% of de-duplicated)
Database size:        2 GiB (uncompressed)

Thanks for your help; I feel I am at least getting somewhere, if I
haven't solved it already.
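P.S. As a rough cross-check on that database size: dividing the .db file
size from the ls listing above by the inode count from s3qlstat gives a
per-inode metadata cost (a quick Python sketch; the two numbers are
snapshots taken at different times, so treat the result as a ballpark
only):

    # Ballpark bytes of metadata per inode, using figures quoted above.
    db_bytes = 2753519616    # size of the ...bufs=2F.db cache file
    inodes   = 21991227      # "Inodes:" line from s3qlstat
    print("%.0f bytes of metadata per inode" % (db_bytes / inodes))
    # -> roughly 125 bytes per inode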
