retitle 772052 mount.s3ql crashes on unmount if there are stale cache files
thanks

On 12/04/2014 11:41 AM, Shannon Dealy wrote:
> On Thu, 4 Dec 2014, Nikolaus Rath wrote:
> 
>> On 12/04/2014 10:07 AM, Shannon Dealy wrote:
>>>
>>> Attached is the log file for a failed unmount.
>>>
>>> While my previous attempts were simple umount.s3ql commands, this time a
>>> couple of additional commands were run after rsync completed and before
>>> the umount:
>>>
>>>    s3qlctrl flushcache /media/server-external
>>>    s3qlctrl upload-meta /media/server-external
>>>    umount.s3ql /media/server-external
>>
>> The logs look as if you also sent a SIGUSR1 signal to mount.s3ql. Is that
>> correct?
> 
> No, it ran to completion on its own (though it took forever).

What do you mean by that? Sending SIGUSR1 should not affect completion of the 
command at all.

> I forgot
> to mention that I did issue this command just before running umount.s3ql
> 
>    setfattr -n fuse_stacktrace /media/server-external

Yeah, sorry, that's actually also what the logs say. Both the "setfattr" and
the "kill -SIGUSR1" commands just cause mount.s3ql to print a stack trace
(though the mechanism is slightly different).
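
For illustration, this is roughly how a Python program can be made to print a
stack trace on SIGUSR1 without otherwise being disturbed. It is a minimal
sketch using the standard faulthandler module (Unix only), not a copy of what
mount.s3ql does; the name install_stack_dump and the temporary log file are
assumptions made for the example.

```python
import faulthandler
import signal
import tempfile


def install_stack_dump(log_file):
    """Dump the stack of every thread to log_file whenever SIGUSR1 arrives.

    The process keeps running afterwards; the signal only triggers output.
    log_file must be a real file object (it needs a file descriptor).
    """
    faulthandler.register(signal.SIGUSR1, file=log_file, all_threads=True)


if __name__ == "__main__":
    with tempfile.TemporaryFile() as log:
        install_stack_dump(log)
        # Stands in for an external `kill -SIGUSR1 <pid>` (Python 3.8+).
        signal.raise_signal(signal.SIGUSR1)
        # Stop using the file before it is closed.
        faulthandler.unregister(signal.SIGUSR1)
        log.seek(0)
        print(log.read().decode())
```

The program continues normally after the dump, which is why sending the
signal should have no effect on whether or when a command completes.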

> though I am not sure if that makes any difference with fuse as that
> command had previously been issued for that mount point when a different
> file system (the other one we have been debugging) was mounted there,
> and I have no idea if fuse retains this setting between unmounts and mounts
> on a given mount point.

No, that's not a setting. It's a one-off command.

>> After the umount.s3ql command, what are the contents of
>> /root/.s3ql/local:=2F=2F=2Fmedia=2FTransferDrive=2FS3QL_server-external-cache?
>>
>> Can you confirm that this directory did not exist when calling
>> mount.s3ql?
> 
> No, in fact based on the timestamps, I can confirm that this directory
> did exist as it has over 4000 files in it with November 30th timestamps
> spanning roughly 1/2 hour.  I never gave this directory a thought as the
> mount indicates that the cache is out of date, so I assumed any old data
> that was lying around would be discarded when it downloads the current
> file system info.

That's the reason for the exception on umount, then. mount.s3ql currently
can't handle that case. The existing data is ignored, as you say, but
mount.s3ql attempts to rmdir the cache directory after unmounting, and rmdir
fails if there are still files in there.
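
The failure mode is easy to reproduce outside of S3QL. The following is a
minimal sketch, not S3QL code: remove_cache_dir and the directory and file
names are made up for the example. It shows that rmdir succeeds only on an
empty directory and reports an error as soon as one stale file is left behind.

```python
import errno
import os
import tempfile


def remove_cache_dir(path):
    """Try to rmdir path; return None on success, else the errno of the failure."""
    try:
        os.rmdir(path)  # succeeds only if the directory is empty
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as parent:
        cachedir = os.path.join(parent, "example-cache")  # hypothetical name
        os.mkdir(cachedir)
        # Simulate one stale cache file left over from an earlier mount.
        with open(os.path.join(cachedir, "stale-block"), "w"):
            pass
        err = remove_cache_dir(cachedir)
        # Typically ENOTEMPTY; some platforms report EEXIST instead.
        print(errno.errorcode.get(err))
```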

Normally, the only way for this directory to exist is if the file system was
not unmounted cleanly. In this case mount.s3ql will refuse to start, and you
have to run fsck.s3ql instead. fsck.s3ql will then clean up the cache
directory, and mount.s3ql will start with an empty cache.

In your case...

> It should be noted that the fsck run on this file system was not run
> from my local machine, but from the remote server, so when I mount this
> on my local machine, it says something to the effect that the local
> cache is out of date, and then proceeds to download and unpack the
> current file system information from the remote server.

... you have neatly circumvented the cleanup of the cache directory. mount.s3ql
now believes the file system is clean, but there is actually a cache directory
with files in it.

This is certainly a bug in S3QL, but I have to think about what the correct
behavior for mount.s3ql would actually be. Refusing to mount would be odd,
because the file system is indeed clean. But silently deleting the data in
there intuitively sounds like a dangerous choice as well.
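
Whichever behavior ends up being chosen, it first requires noticing at mount
time that the cache directory already holds files. A minimal sketch of such a
check follows; it is not taken from S3QL, the name stale_cache_entries is
hypothetical, and it deliberately only reports what it finds and leaves the
decision (refuse, delete, or something else) open.

```python
import os
import tempfile


def stale_cache_entries(cachedir):
    """Return the sorted names found in cachedir, or [] if it does not exist."""
    try:
        return sorted(os.listdir(cachedir))
    except FileNotFoundError:
        return []


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cachedir:
        # Simulate leftovers from a mount that was never cleaned up locally.
        for name in ("block-2", "block-1"):
            with open(os.path.join(cachedir, name), "w"):
                pass
        leftovers = stale_cache_entries(cachedir)
        if leftovers:
            print("cache directory is not empty:", leftovers)
```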


Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«

