Re: [ceph-users] cephfs kernel client - page cache being invalidated.

2018-10-14 Thread jesper
> On Sun, Oct 14, 2018 at 8:21 PM  wrote:
> How many CephFS mounts access the file? Is it possible that some
> program opens the file in RW mode (even if it just reads it)?


The nature of the program is that the data is "prepped" by one set of
commands and queried by another, so the RW case is extremely unlikely.
I can change the permission bits to revoke the w-bit for the users; they
don't need it anyway. It is just the same service users that generate
the data and query it today.

Can Ceph tell the actual number of clients?
We have 55-60 hosts, most of which mount the directory.
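For what it's worth, the closest thing I have found is listing client sessions on the MDS; a sketch, assuming admin access on the MDS host (the daemon name "a" is a placeholder):

```shell
# List client sessions known to the MDS, one entry per mounted client.
# "a" is a placeholder MDS name -- see "ceph mds stat" for yours.
ceph daemon mds.a session ls
```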

-- 
Jesper

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph osd logs

2018-10-14 Thread Zhenshi Zhou
Hi,

I added some OSDs to the cluster (Luminous) lately. The OSDs use
bluestore and everything goes fine. But there are no OSD logs in the
log file; the log directory contains only empty files.

I checked my settings with "ceph daemon osd.x config show", and I get
"debug_osd": "1/5".

How can I get the new osds' logs?
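For reference, the log level can be raised at runtime, and it is worth confirming where the daemon thinks its log file lives; a sketch, with osd.0 as a placeholder:

```shell
# Raise the OSD debug level at runtime (memory/file log levels):
ceph tell osd.0 injectargs '--debug-osd 5/5'
# Confirm the log destination the daemon is actually using (run on the
# OSD host, against the admin socket):
ceph daemon osd.0 config get log_file
```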


Re: [ceph-users] cephfs kernel client - page cache being invalidated.

2018-10-14 Thread Yan, Zheng
On Sun, Oct 14, 2018 at 8:21 PM  wrote:
>
> Hi
>
> We have a dataset of ~300 GB on CephFS which is being used for computations
> over and over again, being refreshed daily or so.
>
> When hosting it on NFS, the files are transferred after a refresh, but from
> there they sit in the kernel page cache of the clients until they are
> refreshed server-side.
>
> On CephFS it looks "similar" but "different". Where steady-state operation
> over NFS gives client/server traffic of < 1 MB/s, CephFS constantly pulls
> 50-100 MB/s over the network. This has implications for the clients, which
> end up spending unnecessary time waiting for IO during execution.
>
> This is in a setting where the CephFS client memory looks like this:
>
> $ free -h
>               total        used        free      shared  buff/cache   available
> Mem:           377G         17G        340G        1.2G         19G        354G
> Swap:          8.8G        430M        8.4G
>
>
> If I just repeatedly run (within a few minutes) something that uses the
> files, it is fully served out of the client page cache (~2 GB/s), but the
> cache looks like it is being evicted much faster than in the NFS setting.
>
> This is not scientific, but CMD is a "cat /file/on/ceph > /dev/null"-type
> command over a total of 24 GB of data in ~300 files:
>
> $ free -h; time CMD ; sleep 1800; free -h; time CMD ; free -h; sleep 3600;
> time CMD ;
>
>               total        used        free      shared  buff/cache   available
> Mem:           377G         16G        312G        1.2G         48G        355G
> Swap:          8.8G        430M        8.4G
>
> real    0m8.997s
> user    0m2.036s
> sys     0m6.915s
>
>               total        used        free      shared  buff/cache   available
> Mem:           377G         17G        277G        1.2G         82G        354G
> Swap:          8.8G        430M        8.4G
>
> real    3m25.904s
> user    0m2.794s
> sys     0m9.028s
>
>               total        used        free      shared  buff/cache   available
> Mem:           377G         17G        283G        1.2G         76G        353G
> Swap:          8.8G        430M        8.4G
>
> real    6m18.358s
> user    0m2.847s
> sys     0m10.651s
>
>
> Munin graphs of the system confirm that there was zero memory
> pressure over the period.
>
> Are there things in the CephFS case that can cause the page cache to be
> invalidated?
> Could less aggressive read-ahead play a role?
>
> Any other thoughts on what the root cause of the different behaviour
> could be?
>
> Clients are using a 4.15 kernel. Is anyone aware of newer patches in this
> area that could have an impact?
>

How many CephFS mounts access the file? Is it possible that some
program opens the file in RW mode (even if it just reads it)?

Yan, Zheng


> Jesper
>


Re: [ceph-users] cephfs kernel client - page cache being invalidated.

2018-10-14 Thread jesper
> Actual amount of memory used by VFS cache is available through 'grep
> Cached /proc/meminfo'. slabtop provides information about cache
> of inodes, dentries, and IO memory buffers (buffer_head).

Thanks, that was also what I got out of it, and why I reported "free"
output in the first place, as it also shows available and "cached" memory.

-- 
Jesper



Re: [ceph-users] cephfs kernel client - page cache being invalidated.

2018-10-14 Thread Sergey Malinin
The actual amount of memory used by the VFS cache is available through
'grep Cached /proc/meminfo'. slabtop provides information about the caches
of inodes, dentries, and IO memory buffers (buffer_head).
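As a quick sketch of that check (the field name is the kernel's; the value is in kB):

```shell
# Page-cache size as tracked by the kernel, in kB.  "Cached" excludes
# raw block-device buffers and the swap cache.
awk '/^Cached:/ {print $2}' /proc/meminfo
```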


> On 14.10.2018, at 17:28, jes...@krogh.cc wrote:
> 
>> Try looking in /proc/slabinfo / slabtop during your tests.
> 
> I need a bit of guidance here. Does slabinfo cover the VFS page
> cache? I cannot seem to find any traces (sorting by size on
> machines with a huge cache does not really show anything). Perhaps
> I'm holding the screwdriver wrong?
> 
> -- 
> Jesper
> 



Re: [ceph-users] ceph dashboard ac-* commands not working (Mimic)

2018-10-14 Thread John Spray
The docs you're looking at are from the master (development) version of
Ceph, so you're seeing commands that don't exist in Mimic. You can swap
master for mimic in that URL.

Hopefully we'll soon have some changes to make this more apparent when
looking at the docs.

John
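As a hedged sketch of what Mimic itself supports: the dashboard there has a single built-in account managed via set-login-credentials, and the ac-user-* commands only arrived in later releases. The username and password below are placeholders:

```shell
# Set (or reset) the credentials of the single dashboard account in
# Mimic (13.2.x); "admin" and the password are placeholders:
ceph dashboard set-login-credentials admin 'some-password'
# There is no ac-user-show / ac-user-delete in 13.2.x; to start over,
# simply set new credentials for the account.
```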

On Fri, 12 Oct 2018, 17:43 Hayashida, Mami,  wrote:

> I set up a new Mimic cluster recently and have just enabled the
> Dashboard.  I first tried to add a (Dashboard) user with the
> "ac-user-create" command following this version of the documentation (
> http://docs.ceph.com/docs/master/mgr/dashboard/), but the command did not
> work.  Following the /mimic/mgr/dashboard/ version instead, I used the
> "set-login-credentials" command and was able to create a user with a
> password.  But with none of the ac-* commands working, how can we manage
> the dashboard user accounts?  At this point, I cannot figure out what
> level of permissions has been given to the (test) dashboard user I just
> created, nor have I figured out how to delete a user or obtain a list of
> the dashboard users created so far.
>
> I am using Ceph version 13.2.2, and all the ac-* commands I have tried
> return exactly the same message.
>
> mon0:~$ ceph dashboard ac-user-show  test-user
> no valid command found; 10 closest matches:
> dashboard get-rgw-api-user-id
> dashboard get-rest-requests-timeout
> dashboard set-rgw-api-host 
> dashboard set-rgw-api-secret-key 
> dashboard get-rgw-api-access-key
> dashboard set-rest-requests-timeout 
> dashboard get-rgw-api-scheme
> dashboard get-rgw-api-host
> dashboard set-login-credentials  
> dashboard set-session-expire 
> Error EINVAL: invalid command
>
>
> --
> -
>
> *Mami Hayashida*
>
> *Research Computing Associate*
> Research Computing Infrastructure
> University of Kentucky Information Technology Services
> 301 Rose Street | 102 James F. Hardymon Building
> Lexington, KY 40506-0495
> mami.hayash...@uky.edu
> (859)323-7521


Re: [ceph-users] cephfs kernel client - page cache being invalidated.

2018-10-14 Thread jesper
> Try looking in /proc/slabinfo / slabtop during your tests.

I need a bit of guidance here. Does slabinfo cover the VFS page
cache? I cannot seem to find any traces (sorting by size on
machines with a huge cache does not really show anything). Perhaps
I'm holding the screwdriver wrong?

-- 
Jesper



Re: [ceph-users] cephfs kernel client - page cache being invalidated.

2018-10-14 Thread Sergey Malinin
Try looking in /proc/slabinfo / slabtop during your tests.


> On 14.10.2018, at 15:21, jes...@krogh.cc wrote:
> [snip]



Re: [ceph-users] cephfs kernel client - page cache being invalidated.

2018-10-14 Thread Jesper Krogh
On 14 Oct 2018, at 15.26, John Hearns  wrote:
> 
> This is a general question for the ceph list.
> Should Jesper be looking at these vm tunables?
> vm.dirty_ratio
> vm.dirty_writeback_centisecs
>
> What effect do they have when using CephFS?

This situation is read-only, so there is no dirty data in the page cache.
The above should be irrelevant.

Jesper




Re: [ceph-users] cephfs kernel client - page cache being invalidated.

2018-10-14 Thread John Hearns
This is a general question for the ceph list.
Should Jesper be looking at these vm tunables?
vm.dirty_ratio
vm.dirty_writeback_centisecs

What effect do they have when using CephFS?
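For context, the current values can be read straight out of /proc; a read-only sketch (the numbers are system-specific):

```shell
# Percentage of memory that may be dirty before writers are throttled:
cat /proc/sys/vm/dirty_ratio
# Interval (in centiseconds) between periodic writeback wakeups:
cat /proc/sys/vm/dirty_writeback_centisecs
```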

On Sun, 14 Oct 2018 at 14:24, John Hearns  wrote:

> Hej Jesper.
> Sorry, I do not have a direct answer to your question.
> When looking at memory usage, I often use this command:
>
> watch cat /proc/meminfo
>
>
> On Sun, 14 Oct 2018 at 13:22,  wrote:
>
>> [snip]
>
>


Re: [ceph-users] cephfs kernel client - page cache being invalidated.

2018-10-14 Thread John Hearns
Hej Jesper.
Sorry, I do not have a direct answer to your question.
When looking at memory usage, I often use this command:

watch cat /proc/meminfo

On Sun, 14 Oct 2018 at 13:22,  wrote:

> [snip]


[ceph-users] cephfs kernel client - page cache being invalidated.

2018-10-14 Thread jesper
Hi

We have a dataset of ~300 GB on CephFS which is being used for computations
over and over again, being refreshed daily or so.

When hosting it on NFS, the files are transferred after a refresh, but from
there they sit in the kernel page cache of the clients until they are
refreshed server-side.

On CephFS it looks "similar" but "different". Where steady-state operation
over NFS gives client/server traffic of < 1 MB/s, CephFS constantly pulls
50-100 MB/s over the network. This has implications for the clients, which
end up spending unnecessary time waiting for IO during execution.

This is in a setting where the CephFS client memory looks like this:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           377G         17G        340G        1.2G         19G        354G
Swap:          8.8G        430M        8.4G


If I just repeatedly run (within a few minutes) something that uses the
files, it is fully served out of the client page cache (~2 GB/s), but the
cache looks like it is being evicted much faster than in the NFS setting.

This is not scientific, but CMD is a "cat /file/on/ceph > /dev/null"-type
command over a total of 24 GB of data in ~300 files:

$ free -h; time CMD ; sleep 1800; free -h; time CMD ; free -h; sleep 3600;
time CMD ;

              total        used        free      shared  buff/cache   available
Mem:           377G         16G        312G        1.2G         48G        355G
Swap:          8.8G        430M        8.4G

real    0m8.997s
user    0m2.036s
sys     0m6.915s

              total        used        free      shared  buff/cache   available
Mem:           377G         17G        277G        1.2G         82G        354G
Swap:          8.8G        430M        8.4G

real    3m25.904s
user    0m2.794s
sys     0m9.028s

              total        used        free      shared  buff/cache   available
Mem:           377G         17G        283G        1.2G         76G        353G
Swap:          8.8G        430M        8.4G

real    6m18.358s
user    0m2.847s
sys     0m10.651s


Munin graphs of the system confirm that there was zero memory
pressure over the period.

Are there things in the CephFS case that can cause the page cache to be
invalidated?
Could less aggressive read-ahead play a role?

Any other thoughts on what the root cause of the different behaviour
could be?

Clients are using a 4.15 kernel. Is anyone aware of newer patches in this
area that could have an impact?
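One knob worth checking on the read-ahead side: the kernel client's read-ahead window is set per mount via the rasize option (in bytes). A sketch, with placeholder monitor address, secret file, and mount point:

```shell
# Mount CephFS with a 64 MiB read-ahead window (the default is smaller);
# the monitor address, secretfile path, and mount point are placeholders:
mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs \
    -o name=admin,secretfile=/etc/ceph/admin.secret,rasize=67108864
```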

Jesper
