Re: Ordering of directory operations maintained across system crashes in Btrfs?

2014-03-03 Thread thanumalayan mad
Chris,

Great, thanks. Any guesses as to whether other disk-based filesystems
do things similar to the last two examples you pointed out? Being able
to say "we think 3 normal filesystems reorder stuff" seems to motivate
application developers to fix bugs ...

Also, for reference, the sequence we observed was:

Thread A:

unlink(foo)
rename(somefile X, somefile Y)
fsync(somefile Z)

The source and destination of the rename are unrelated to the file
being fsynced. Even so, the rename is persisted in the fsync()'s
transaction, while the unlink() is delayed. I guess this has something
to do with backrefs too.
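
For anyone who wants to replay the call order, here is a minimal
sketch. It assumes all four files live in one directory (the sequence
above does not say) and that foo, X, Y and Z are plain file names; the
helper name run_sequence is hypothetical. It only issues the calls in
order. Seeing the reordering would require cutting power (or using a
crash-injection tool) right after the fsync and inspecting the disk.

```python
import os
import tempfile

def run_sequence(directory):
    """Issue unlink, rename, then fsync of an unrelated file, in order."""
    foo = os.path.join(directory, "foo")
    x = os.path.join(directory, "X")
    y = os.path.join(directory, "Y")
    z = os.path.join(directory, "Z")
    # Setup: create the files the sequence operates on.
    for path in (foo, x, z):
        with open(path, "w") as f:
            f.write("data")

    os.unlink(foo)       # in the observation above, persisted late
    os.rename(x, y)      # in the observation above, persisted with the fsync
    fd = os.open(z, os.O_WRONLY)
    try:
        os.fsync(fd)     # Z is not foo, X, or Y
    finally:
        os.close(fd)
    # In-memory view after the calls; says nothing about on-disk state.
    return sorted(os.listdir(directory))

with tempfile.TemporaryDirectory() as d:
    names = run_sequence(d)
    print(names)
```

A real reproduction would also make the setup files durable (for
example with a sync) before starting the sequence, so that a crash
cannot be confused with lost setup.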

Thanks,
Thanu

On Mon, Mar 3, 2014 at 11:43 AM, Chris Mason wrote:
> On 02/25/2014 09:01 PM, thanumalayan mad wrote:
>>
>> Hi all,
>>
>> Slightly complicated question.
>>
>> Assume I do two directory operations in a Btrfs partition (such as an
>> unlink() and a rename()), one after the other, and a crash happens
>> after the rename(). Can Btrfs (the current version) send the second
>> operation to the disk first, so that after the crash, I observe the
>> effects of rename() but not the effects of the unlink()?
>>
>> I think I am observing Btrfs re-ordering an unlink() and a rename(),
>> and I just want to confirm that my observation is true. Also, if Btrfs
>> does send directory operations to disk out of order, is there some
>> limitation on this? Like, is this restricted to only unlink() and
>> rename()?
>>
>> I am looking at some (buggy) applications that use Btrfs, and this
>> behavior seems to affect them.
>
>
> There isn't a single answer for this one.
>
> You might have
>
> Thread A:
>
> unlink(foo);
> rename(somefile, somefile2);
> 
>
> This should always have the unlink happen before or in the same
> transaction as the rename.
>
> Thread A:
>
> unlink(dirA/foo);
> rename(dirB/somefile, dirB/somefile2);
>
> Here you're at the mercy of what is happening in dirB.  If someone fsyncs
> that directory, the rename may hit the disk before the unlink.
>
> Thread A:
>
> unlink(foo);
> rename(somefile, somefile2);
> fsync(somefile);
>
> This one is even fuzzier.  Backrefs allow us to do some file fsyncs without
> touching the directory, making it possible that the unlink will hit disk
> after the fsync.
>
> -chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Ordering of directory operations maintained across system crashes in Btrfs?

2014-03-03 Thread thanumalayan mad
Any ideas about this? Even best-guess, not-entirely-sure answers would
help.

An example application bug that would be affected by this is from
LevelDB: https://code.google.com/p/leveldb/issues/detail?id=189

Thanks,
Thanu



Ordering of directory operations maintained across system crashes in Btrfs?

2014-02-25 Thread thanumalayan mad
Hi all,

Slightly complicated question.

Assume I do two directory operations in a Btrfs partition (such as an
unlink() and a rename()), one after the other, and a crash happens
after the rename(). Can Btrfs (the current version) send the second
operation to the disk first, so that after the crash, I observe the
effects of rename() but not the effects of the unlink()?
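
To make the question concrete, here is a small model of the post-crash
states involved. It is only a sketch: it assumes each directory
operation is persisted atomically and that a crash leaves some subset
of the issued operations on disk. The function name crash_states and
the operation labels are hypothetical.

```python
from itertools import combinations

def crash_states(ops, ordered):
    """Return the sets of operations that may be visible after a crash.

    ordered=True models a filesystem that persists directory operations
    in issue order, so only prefixes of the sequence can survive.
    ordered=False models one that may persist any subset.
    """
    if ordered:
        return [tuple(ops[:n]) for n in range(len(ops) + 1)]
    return [subset
            for n in range(len(ops) + 1)
            for subset in combinations(ops, n)]

ops = ["unlink(foo)", "rename(X, Y)"]
in_order = crash_states(ops, ordered=True)
reordered = crash_states(ops, ordered=False)
# States that can only appear if the filesystem reorders operations.
only_if_reordered = [s for s in reordered if s not in in_order]
print(only_if_reordered)  # [('rename(X, Y)',)]
```

The single extra state, rename visible without the unlink, is exactly
the one being asked about.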

I think I am observing Btrfs re-ordering an unlink() and a rename(),
and I want to confirm that this observation is correct. Also, if Btrfs
does send directory operations to disk out of order, is there some
limitation on this? For example, is it restricted to unlink() and
rename()?

I am looking at some (buggy) applications that use Btrfs, and this
behavior seems to affect them.
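
For context, one way an application could avoid depending on this
ordering is to fsync the containing directory between the two
operations. This is a sketch, not something confirmed for Btrfs in
this thread: it assumes Linux semantics, where fsync() on a directory
file descriptor flushes that directory's entries, and it assumes both
operations happen in the same directory. The helper names fsync_dir
and unlink_then_rename are hypothetical.

```python
import os
import tempfile

def fsync_dir(path):
    """Call fsync() on a directory so its entries are flushed to disk."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

def unlink_then_rename(directory, victim, src, dst):
    """Flush the unlink before the rename is even issued."""
    os.unlink(os.path.join(directory, victim))
    fsync_dir(directory)  # barrier between the two operations
    os.rename(os.path.join(directory, src), os.path.join(directory, dst))
    fsync_dir(directory)  # flush the rename as well

with tempfile.TemporaryDirectory() as d:
    for name in ("foo", "X"):
        with open(os.path.join(d, name), "w") as f:
            f.write("data")
    unlink_then_rename(d, "foo", "X", "Y")
    names = sorted(os.listdir(d))
    print(names)
```

The extra fsync costs a disk flush per operation, so it only makes
sense where the application really needs the unlink to be durable
first.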

Thanks,
Thanu