On 7 September 2016 at 22:37, Nikolaus Rath <[email protected]> wrote:
> On Sep 07 2016, Roger Gammans <[email protected]> wrote:
> > apsw.CorruptError: CorruptError: database disk image is malformed
>
> That means you'll have to discard the locally cached metadata. The next
> fsck.s3ql will recover whatever was most recently stored in S3.
>
> > Traceback (most recent call last):
> >   File "/usr/lib/s3ql/s3ql/block_cache.py", line 535, in upload
> >     obj_id = self.db.rowid('INSERT INTO objects (refcount, size) VALUES(1, -1)')
> >   File "/usr/lib/s3ql/s3ql/database.py", line 104, in rowid
> >     self.conn.cursor().execute(*a, **kw)
> >   File "src/cursor.c", line 231, in resetcursor
> > apsw.FullError: FullError: database or disk is full

The odd thing here is that there is now 6.8G free in that partition, and
we monitor the filesystem size every 60s; the monitoring doesn't show a
spike. So either something used a lot of disk space very quickly, or we
reached a maximum cache DB size? (I thought it was SQLite, which
apparently allows up to 2^64 rows per table.)

This was the state of the cache directory (holding 2.6G of cache files)
before I deleted it:

# ls -ld
-rw-r--r-- 1 root staff          0 Aug 18 09:36 /mount.s3ql_crit.log
drwxr-sr-x 2 root staff     143360 Sep  7 01:46 /s3:=2F=2Fcluster-backups=2Fbufs=2F-cache
-rw------- 1 root staff 2753519616 Sep  7 01:46 /s3:=2F=2Fcluster-backups=2Fbufs=2F.db
-rw-r--r-- 1 root staff        219 Sep  5 08:39 /s3:=2F=2Fcluster-backups=2Fbufs=2F.params

> > During handling of the above exception, another exception occurred:
> >
> > Traceback (most recent call last):
> >   File "/usr/lib/s3ql/s3ql/mount.py", line 66, in run_with_except_hook
> >     run_old(*args, **kw)
> >   File "/usr/lib/s3ql/s3ql/mount.py", line 795, in run
> >     self.block_cache.upload(el)
> >   File "/usr/lib/s3ql/s3ql/block_cache.py", line 573, in upload
> >     self._unlock_obj(obj_id, noerror=True)
> > UnboundLocalError: local variable 'obj_id' referenced before assignment
>
> This looks like a bug in S3QL, but it only triggers in the code for
> handling database problems. I'll take a look.

OK - would this be why the mount.s3ql process was undead - still running
but not responding to FUSE requests?

> > That would be clear about something running out of disk space - I
> > assume (because I was aware of a limit on S3 accounts) that it is a
> > local filesystem. Any idea which one?
>
> The one that holds the --cachedir directory.
>
> > My guess would be the metadata cache, but that seems fine at the
> > moment.
> >
> > Can you shed any light on it, or give me guidance on metadata sizing?
>
> Required metadata grows linearly with stored data. The proportionality
> factor depends on how big the stored files are, and what block size you
> chose.

Is that linear with size before or after de-duplication? Given we have
multiple backup snapshots created with s3qlcp, it makes a big difference.

I'm running "s3qlcp working $DATE; s3qllock $DATE" each night, but the
crash time is not consistent with this being the cause (it crashed 2.5
hours before this was due to run).

> I'd recommend to simply store a representative subset of your data and
> look at the size of the resulting metadata. You can then just scale up
> linearly to determine what you'll need for the full data.

Well, I've got accurate backups for two months or so from when we were
running 1.11; it's just that now we've moved to 2.11 we are getting
issues.
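To check I follow the suggestion, this is the kind of extrapolation I'd
do (a rough Python sketch; the subset figures below are invented for
illustration, and it still leaves open whether to scale against data
size before or after de-duplication):

    # Back-of-envelope metadata estimate from a representative subset.
    # The subset figures are made-up examples, not real measurements.
    sample_data     = 50 * 2**30      # 50 GiB subset stored for the test
    sample_metadata = 60 * 2**20      # 60 MiB .db file it produced
    full_data       = 45 * 10**12     # 45 TB, our actual total data size

    # Scale linearly, per the advice above.
    estimate = sample_metadata * (full_data / sample_data)
    print("Estimated metadata: %.1f GiB" % (estimate / 2**30))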
This is the output of my s3qlstat after re-fsck-ing and remounting:

# s3qlstat s3backups
Directory entries:    21991225
Inodes:               21991227
Data blocks:          624464
Total data size:      45 TB
After de-duplication: 2 TB (4.86% of total)
After compression:    1 TB (3.74% of total, 76.98% of de-duplicated)
Database size:        2 GiB (uncompressed)

Thanks for your help; I feel I am at least getting somewhere, if I
haven't solved it already.
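P.S. As a rough cross-check on that database size: dividing the .db file
size from the ls listing above by the inode count from s3qlstat gives a
per-inode metadata cost (a quick Python sketch; the two numbers are
snapshots taken at different times, so treat the result as a ballpark
only):

    # Ballpark bytes of metadata per inode, using figures quoted above.
    db_bytes = 2753519616    # size of the ...bufs=2F.db cache file
    inodes   = 21991227      # "Inodes:" line from s3qlstat
    print("%.0f bytes of metadata per inode" % (db_bytes / inodes))
    # -> roughly 125 bytes per inode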
