PA Nilsson <[email protected]> writes:
> On Tuesday, June 17, 2014 10:16:23 PM UTC+2, Nikolaus Rath wrote:
>> PA Nilsson <[email protected]> writes:
>> >> > fsck.s3ql --ssl-ca-path ${capath} --cachedir ${s3ql_cachedir} --log
>> >> > $log_file --authfile ${auth_file} $storage_url
>> >> >
>> >> > "
>> >> > Starting fsck of xxxxxxxxx
>> >> > Ignoring locally cached metadata (outdated).
>> >> > Backend reports that file system is still mounted elsewhere. Either
>> >> > the file system has not been unmounted cleanly or the data has not yet
>> >> > propagated through the backend. In the latter case, waiting for a while
>> >> > should fix the problem, in the former case you should try to run fsck
>> >> > on the computer where the file system has been mounted most recently.
>> >> > Enter "continue" to use the outdated data anyway:
>> >> > "
>> >> >
>> >> > In this case, it is true that the file system was not cleanly unmounted,
>> >> > but what are my options here?
>> >>
>> >> You should find out why you are losing your local metadata copy.
>> >>
>> >> Is your $s3ql_cachedir on a journaling file system? What happened to
>> >> this file system on the power cycle? Did it lose data?
>> >>
>> >> What are the contents of $s3ql_cachedir when you run fsck.s3ql?
>> >>
>> >> Are you running fsck.s3ql with the same $s3ql_cachedir as mount.s3ql?
>> >> Are you *absolutely* sure about that?
>> >>
>> >>
>> > I can only trigger this when the system is powered off during an actual
>> > transfer of data. If I let the data transfer finish and then power cycle
>> > with the FS mounted, the FS recovers when running fsck.
>> >
>> > The system is running on an ext4 filesystem. The filesystem does not seem
>> > to have lost any data.
>> > The cachedir is read from the same config file and works otherwise, so yes,
>> > I am sure about that.
>> > Contents of cachedir when failing are:
>> > -rw-r--r-- 1 root root 0 Jun 16 13:06 mount.s3ql_crit.log
>> > -rw------- 1 root root 77824 Jun 17 07:21 s3c:=storageurl.db
>> > -rw-r--r-- 1 root root 217 Jun 17 07:21 s3c:=storageurl.params
>>
>> There is something very wrong here. While mount.s3ql is running, there
>> will always be a directory ending in -cache in the cache directory. This
>> directory is only removed after mount.s3ql exits, so if you reboot the
>> computer, it *must* still be there.
>>
>> Can you confirm that the directory exists while mount.s3ql is running?
>>
>> What happens if, instead of rebooting, you just kill -9 the mount.s3ql
>> process? Does the -cache directory exist? Does fsck.s3ql work in that
>> case?
>>
> The -cache dir is there while the FS is mounted and is only removed when
> mount.s3ql finishes. After a kill -9, the -cache is still there. Then
> rebooting the system, the -cache is still there and fsck completes.
>
> However, when closely monitoring the system, the -cache is created when the
> FS is mounted, but if the system is immediately reset, it is not there after
> a reboot.
> So my thinking is that this is a problem we have with our flash-based file
> system. The file is simply not yet written to flash.
It's a directory, not a file, and it is created when mount.s3ql
starts. If this directory (with its contents) disappears when you reboot
the system several minutes later, you have a real problem.
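To narrow this down, you could check right after mounting whether the
directory ever reaches stable storage. A minimal sketch (the
/var/cache/s3ql path is only an example, substitute your actual cachedir):

    # after mount.s3ql has started; adjust the path to your cachedir
    ls -ld /var/cache/s3ql/*-cache
    sync    # ask the kernel to flush dirty data to the flash device
    # now cut power, reboot, and check whether the directory survived:
    ls -ld /var/cache/s3ql/*-cache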
> This will be running on a "non-maintained" system with no possibility for
> user interaction.
> What is the drawback of always continuing the fsck operation?
You will lose any data that has not been written to the backend, and
you will lose all metadata updates since the last metadata upload -
which can mean that you lose some data even though the data blocks
themselves have already been written to the backend.
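If you decide to do it anyway on an unattended system, something along
these lines might work. I have not verified that fsck.s3ql accepts the
answer from a pipe, so treat this as an untested sketch:

    # UNTESTED: assumes fsck.s3ql reads the "continue" confirmation from stdin
    echo continue | fsck.s3ql --ssl-ca-path ${capath} --cachedir ${s3ql_cachedir} \
        --log $log_file --authfile ${auth_file} $storage_url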
More importantly, though, you are ignoring a big problem with your flash
file system. Rebooting the system might affect recently written files,
but it should not result in the loss of an entire directory with its
contents that was created an arbitrary amount of time before the
reboot. In addition, it looks as if the .params file reverts to an
earlier state (this is the reason why fsck.s3ql thinks that the remote
metadata is newer).
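You can test this independently of S3QL. A rough sketch (path and
timings are arbitrary examples): create a directory on the same flash
file system, wait well beyond any write-back interval, power-cycle, and
see whether it is still there:

    # arbitrary test location on the same flash file system
    mkdir -p /mnt/flash/s3ql-durability-test
    echo hello > /mnt/flash/s3ql-durability-test/file
    sync           # flush dirty pages to the device
    sleep 300      # wait several minutes
    # now power-cycle the board; after the reboot, this should still list the file:
    ls -l /mnt/flash/s3ql-durability-test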
Best,
-Nikolaus
--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.«