Re: [ceph-users] CephFS MDS journal

2019-02-05 Thread Mahmoud Ismail
On Mon, Feb 4, 2019 at 10:10 PM Gregory Farnum  wrote:

>
> On Mon, Feb 4, 2019 at 8:03 AM Mahmoud Ismail <
> mahmoudahmedism...@gmail.com> wrote:
>
>> On Mon, Feb 4, 2019 at 4:35 PM Gregory Farnum  wrote:
>>
>>>
>>>
>>> On Mon, Feb 4, 2019 at 7:32 AM Mahmoud Ismail <
>>> mahmoudahmedism...@gmail.com> wrote:
>>>
>>>> On Mon, Feb 4, 2019 at 4:16 PM Gregory Farnum
>>>> wrote:
>>>>
>>>>> On Fri, Feb 1, 2019 at 2:29 AM Mahmoud Ismail <
>>>>> mahmoudahmedism...@gmail.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I'm a bit confused about how the journaling actually works in the
>>>>>> MDS.
>>>>>>
>>>>>> I was reading about these two configuration parameters (journal write
>>>>>> head interval) and (mds early reply). Does the MDS flush the journal
>>>>>> synchronously after each operation? And does setting mds early reply to
>>>>>> true allow operations to return without flushing? If so, what does the
>>>>>> other parameter (journal write head interval) do, or isn't it for the
>>>>>> MDS? Also, can all operations return without flushing with mds early
>>>>>> reply, or is it specific to a subset of operations?
>>>>>>
>>>>>
>>>>> In general, the MDS journal is flushed every five seconds (by
>>>>> default), and client requests get an early reply when the operation is
>>>>> done in memory but not yet committed to RADOS. Some operations will
>>>>> trigger an immediate flush, and there may be some operations that can't
>>>>> get an early reply or that need to wait for part of the operation to get
>>>>> committed (like renames that move a file's authority to a different MDS).
>>>>> IIRC the journal write head interval controls how often it flushes out
>>>>> the journal's header, which limits how out-of-date its hints on restart
>>>>> can be. (When the MDS restarts, it asks the journal head where the
>>>>> journal's unfinished start and end points are, but of course more of the
>>>>> journaled operations may have been fully completed since the head was
>>>>> written.)
>>>>>
>>>>
>>>> Thanks for the explanation. Which operations trigger an immediate
>>>> flush? Is readdir one of these operations? I noticed that readdir
>>>> latency increases under load when the OSDs are hitting the throughput
>>>> limit of the underlying HDDs. Can I assume that this is happening due to
>>>> the journal flushing then?
>>>>
>>>
>>> Not directly, but a readdir might ask to know the size of each file and
>>> that will force the other clients in the system to flush their dirty data
>>> in the directory (so that the readdir can return valid results).
>>> -Greg
>>>
>>>
>>
>> Could it also be due to the MDS lock (operations waiting for the lock
>> under load)?
>>
>
> Well that's not going to cause high OSD usage, and the MDS lock is not
> held while writes are happening. But if the MDS is using 100% CPU, yes, it
> could be contended.
>
>

Yes, I guess that will be the case if I'm writing actual data to the OSDs;
however, I was testing only metadata operations on the MDSs. So under a
high load of metadata operations, the lock contention will be more apparent.


>> Also, I assume that the journal is using a different thread for flushing,
>> right?
>>
>
> Yes, that's correct.
>
>>
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS MDS journal

2019-02-04 Thread Gregory Farnum
On Mon, Feb 4, 2019 at 8:03 AM Mahmoud Ismail 
wrote:

> On Mon, Feb 4, 2019 at 4:35 PM Gregory Farnum  wrote:
>
>>
>>
>> On Mon, Feb 4, 2019 at 7:32 AM Mahmoud Ismail <
>> mahmoudahmedism...@gmail.com> wrote:
>>
>>> On Mon, Feb 4, 2019 at 4:16 PM Gregory Farnum 
>>> wrote:
>>>
>>>> On Fri, Feb 1, 2019 at 2:29 AM Mahmoud Ismail <
>>>> mahmoudahmedism...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I'm a bit confused about how the journaling actually works in the MDS.
>>>>>
>>>>> I was reading about these two configuration parameters (journal write
>>>>> head interval) and (mds early reply). Does the MDS flush the journal
>>>>> synchronously after each operation? And does setting mds early reply to
>>>>> true allow operations to return without flushing? If so, what does the
>>>>> other parameter (journal write head interval) do, or isn't it for the
>>>>> MDS? Also, can all operations return without flushing with mds early
>>>>> reply, or is it specific to a subset of operations?
>>>>>
>>>>
>>>> In general, the MDS journal is flushed every five seconds (by default),
>>>> and client requests get an early reply when the operation is done in memory
>>>> but not yet committed to RADOS. Some operations will trigger an immediate
>>>> flush, and there may be some operations that can't get an early reply or
>>>> that need to wait for part of the operation to get committed (like renames
>>>> that move a file's authority to a different MDS).
>>>> IIRC the journal write head interval controls how often it flushes out
>>>> the journal's header, which limits how out-of-date its hints on restart can
>>>> be. (When the MDS restarts, it asks the journal head where the journal's
>>>> unfinished start and end points are, but of course more of the journaled
>>>> operations may have been fully completed since the head was written.)
>>>>
>>>
>>> Thanks for the explanation. Which operations trigger an immediate flush?
>>> Is readdir one of these operations? I noticed that readdir latency
>>> increases under load when the OSDs are hitting the throughput limit of
>>> the underlying HDDs. Can I assume that this is happening due to the
>>> journal flushing then?
>>>
>>
>> Not directly, but a readdir might ask to know the size of each file and
>> that will force the other clients in the system to flush their dirty data
>> in the directory (so that the readdir can return valid results).
>> -Greg
>>
>>
>
> Could it also be due to the MDS lock (operations waiting for the lock
> under load)?
>

Well that's not going to cause high OSD usage, and the MDS lock is not held
while writes are happening. But if the MDS is using 100% CPU, yes, it could
be contended.


> Also, I assume that the journal is using a different thread for flushing,
> right?
>

Yes, that's correct.

>
>


Re: [ceph-users] CephFS MDS journal

2019-02-04 Thread Mahmoud Ismail
On Mon, Feb 4, 2019 at 4:35 PM Gregory Farnum  wrote:

>
>
> On Mon, Feb 4, 2019 at 7:32 AM Mahmoud Ismail <
> mahmoudahmedism...@gmail.com> wrote:
>
>> On Mon, Feb 4, 2019 at 4:16 PM Gregory Farnum  wrote:
>>
>>> On Fri, Feb 1, 2019 at 2:29 AM Mahmoud Ismail <
>>> mahmoudahmedism...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm a bit confused about how the journaling actually works in the MDS.
>>>>
>>>> I was reading about these two configuration parameters (journal write
>>>> head interval) and (mds early reply). Does the MDS flush the journal
>>>> synchronously after each operation? And does setting mds early reply to
>>>> true allow operations to return without flushing? If so, what does the
>>>> other parameter (journal write head interval) do, or isn't it for the
>>>> MDS? Also, can all operations return without flushing with mds early
>>>> reply, or is it specific to a subset of operations?
>>>>
>>>
>>> In general, the MDS journal is flushed every five seconds (by default),
>>> and client requests get an early reply when the operation is done in memory
>>> but not yet committed to RADOS. Some operations will trigger an immediate
>>> flush, and there may be some operations that can't get an early reply or
>>> that need to wait for part of the operation to get committed (like renames
>>> that move a file's authority to a different MDS).
>>> IIRC the journal write head interval controls how often it flushes out
>>> the journal's header, which limits how out-of-date its hints on restart can
>>> be. (When the MDS restarts, it asks the journal head where the journal's
>>> unfinished start and end points are, but of course more of the journaled
>>> operations may have been fully completed since the head was written.)
>>>
>>
>> Thanks for the explanation. Which operations trigger an immediate flush?
>> Is readdir one of these operations? I noticed that readdir latency
>> increases under load when the OSDs are hitting the throughput limit of
>> the underlying HDDs. Can I assume that this is happening due to the
>> journal flushing then?
>>
>
> Not directly, but a readdir might ask to know the size of each file and
> that will force the other clients in the system to flush their dirty data
> in the directory (so that the readdir can return valid results).
> -Greg
>
>

Could it also be due to the MDS lock (operations waiting for the lock under
load)? Also, I assume that the journal is using a different thread for
flushing, right?


>

>>
>>>
>>
>>>> Another question, are open operations also written to the journal?
>>>>
>>>
>>> Not opens per se, but we do persist when clients have permission to
>>> operate on files.
>>> -Greg
>>>
>>>

>>>> Regards,
>>>> Mahmoud
>>>>
>>>


Re: [ceph-users] CephFS MDS journal

2019-02-04 Thread Gregory Farnum
On Mon, Feb 4, 2019 at 7:32 AM Mahmoud Ismail 
wrote:

> On Mon, Feb 4, 2019 at 4:16 PM Gregory Farnum  wrote:
>
>> On Fri, Feb 1, 2019 at 2:29 AM Mahmoud Ismail <
>> mahmoudahmedism...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I'm a bit confused about how the journaling actually works in the MDS.
>>>
>>> I was reading about these two configuration parameters (journal write
>>> head interval) and (mds early reply). Does the MDS flush the journal
>>> synchronously after each operation? And does setting mds early reply to
>>> true allow operations to return without flushing? If so, what does the
>>> other parameter (journal write head interval) do, or isn't it for the
>>> MDS? Also, can all operations return without flushing with mds early
>>> reply, or is it specific to a subset of operations?
>>>
>>
>> In general, the MDS journal is flushed every five seconds (by default),
>> and client requests get an early reply when the operation is done in memory
>> but not yet committed to RADOS. Some operations will trigger an immediate
>> flush, and there may be some operations that can't get an early reply or
>> that need to wait for part of the operation to get committed (like renames
>> that move a file's authority to a different MDS).
>> IIRC the journal write head interval controls how often it flushes out
>> the journal's header, which limits how out-of-date its hints on restart can
>> be. (When the MDS restarts, it asks the journal head where the journal's
>> unfinished start and end points are, but of course more of the journaled
>> operations may have been fully completed since the head was written.)
>>
>
> Thanks for the explanation. Which operations trigger an immediate flush?
> Is readdir one of these operations? I noticed that readdir latency
> increases under load when the OSDs are hitting the throughput limit of
> the underlying HDDs. Can I assume that this is happening due to the
> journal flushing then?
>

Not directly, but a readdir might ask to know the size of each file and
that will force the other clients in the system to flush their dirty data
in the directory (so that the readdir can return valid results).
-Greg


>
>
>>
>
>>> Another question, are open operations also written to the journal?
>>>
>>
>> Not opens per se, but we do persist when clients have permission to
>> operate on files.
>> -Greg
>>
>>
>>>
>>> Regards,
>>> Mahmoud
>>>
>>


Re: [ceph-users] CephFS MDS journal

2019-02-04 Thread Mahmoud Ismail
On Mon, Feb 4, 2019 at 4:16 PM Gregory Farnum  wrote:

> On Fri, Feb 1, 2019 at 2:29 AM Mahmoud Ismail <
> mahmoudahmedism...@gmail.com> wrote:
>
>> Hello,
>>
>> I'm a bit confused about how the journaling actually works in the MDS.
>>
>> I was reading about these two configuration parameters (journal write
>> head interval) and (mds early reply). Does the MDS flush the journal
>> synchronously after each operation? And does setting mds early reply to
>> true allow operations to return without flushing? If so, what does the
>> other parameter (journal write head interval) do, or isn't it for the
>> MDS? Also, can all operations return without flushing with mds early
>> reply, or is it specific to a subset of operations?
>>
>
> In general, the MDS journal is flushed every five seconds (by default),
> and client requests get an early reply when the operation is done in memory
> but not yet committed to RADOS. Some operations will trigger an immediate
> flush, and there may be some operations that can't get an early reply or
> that need to wait for part of the operation to get committed (like renames
> that move a file's authority to a different MDS).
> IIRC the journal write head interval controls how often it flushes out the
> journal's header, which limits how out-of-date its hints on restart can be.
> (When the MDS restarts, it asks the journal head where the journal's
> unfinished start and end points are, but of course more of the journaled
> operations may have been fully completed since the head was written.)
>

Thanks for the explanation. Which operations trigger an immediate flush? Is
readdir one of these operations? I noticed that readdir latency increases
under load when the OSDs are hitting the throughput limit of the underlying
HDDs. Can I assume that this is happening due to the journal flushing then?


>

>> Another question, are open operations also written to the journal?
>>
>
> Not opens per se, but we do persist when clients have permission to
> operate on files.
> -Greg
>
>
>>
>> Regards,
>> Mahmoud
>>
>


Re: [ceph-users] CephFS MDS journal

2019-02-04 Thread Gregory Farnum
On Fri, Feb 1, 2019 at 2:29 AM Mahmoud Ismail 
wrote:

> Hello,
>
> I'm a bit confused about how the journaling actually works in the MDS.
>
> I was reading about these two configuration parameters (journal write head
> interval) and (mds early reply). Does the MDS flush the journal
> synchronously after each operation? And does setting mds early reply to
> true allow operations to return without flushing? If so, what does the
> other parameter (journal write head interval) do, or isn't it for the MDS?
> Also, can all operations return without flushing with mds early reply, or
> is it specific to a subset of operations?
>

In general, the MDS journal is flushed every five seconds (by default), and
client requests get an early reply when the operation is done in memory but
not yet committed to RADOS. Some operations will trigger an immediate
flush, and there may be some operations that can't get an early reply or
that need to wait for part of the operation to get committed (like renames
that move a file's authority to a different MDS).
IIRC the journal write head interval controls how often it flushes out the
journal's header, which limits how out-of-date its hints on restart can be.
(When the MDS restarts, it asks the journal head where the journal's
unfinished start and end points are, but of course more of the journaled
operations may have been fully completed since the head was written.)
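
To make the above a bit more concrete, here is a rough toy model in Python.
It is only a sketch of the behaviour described in this mail, not Ceph code:
the class ToyJournal, its fields, and the needs_commit flag are all made up
for illustration, and the 15-second header interval is just an example value
(only the five-second flush default comes from the explanation above).

```python
from dataclasses import dataclass, field


@dataclass
class ToyJournal:
    """Toy model of a periodically flushed journal (hypothetical, not Ceph)."""
    flush_interval: float = 5.0   # periodic flush period, in seconds
    head_interval: float = 15.0   # header write period (illustrative value)
    pending: list = field(default_factory=list)    # applied in memory only
    committed: list = field(default_factory=list)  # durable in the backing store
    head_pos: int = 0             # journal position last recorded in the header
    last_flush: float = 0.0
    last_head: float = 0.0

    def submit(self, now, op, needs_commit=False):
        """Apply op in memory; reply early unless it must be committed first."""
        self.pending.append(op)
        if needs_commit:
            # Some operations force an immediate flush before the reply.
            self.flush(now)
            return "reply after commit"
        return "early reply"

    def flush(self, now):
        """Move every pending entry to durable storage."""
        self.committed.extend(self.pending)
        self.pending.clear()
        self.last_flush = now

    def tick(self, now):
        """Periodic work: flush the journal and, less often, write the header."""
        if self.pending and now - self.last_flush >= self.flush_interval:
            self.flush(now)
        if now - self.last_head >= self.head_interval:
            self.head_pos = len(self.committed)
            self.last_head = now

    def entries_past_head(self):
        """Committed entries that the last written header does not cover."""
        return self.committed[self.head_pos:]


journal = ToyJournal()
first = journal.submit(0.0, "mkdir a")          # replied to before any flush
journal.tick(5.0)                               # periodic flush commits it
second = journal.submit(6.0, "rename a b", needs_commit=True)
stale = journal.entries_past_head()             # header is still out of date
```

In this model a restart that trusted only head_pos would miss the entries in
entries_past_head(), which is why a shorter header interval keeps the
restart hints fresher, at the cost of more header writes.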


>
> Another question, are open operations also written to the journal?
>

Not opens per se, but we do persist when clients have permission to operate
on files.
-Greg


>
> Regards,
> Mahmoud
>


[ceph-users] CephFS MDS journal

2019-02-01 Thread Mahmoud Ismail
Hello,

I'm a bit confused about how the journaling actually works in the MDS.

I was reading about these two configuration parameters (journal write head
interval) and (mds early reply). Does the MDS flush the journal
synchronously after each operation? And does setting mds early reply to
true allow operations to return without flushing? If so, what does the other
parameter (journal write head interval) do, or isn't it for the MDS? Also,
can all operations return without flushing with mds early reply, or is it
specific to a subset of operations?

Another question, are open operations also written to the journal?

Regards,
Mahmoud