Re: [ceph-users] cephfs kernel client - page cache being invalidated.
> On Sun, Oct 14, 2018 at 8:21 PM wrote:
> how many cephfs mounts access the file? Is it possible that some
> program opens that file in RW mode (even if they just read the file)?

The nature of the program is that it is "prepped" by one set of commands
and queried by another, so the RW case is extremely unlikely. I can
change the permission bits to revoke the w-bit for the user; they don't
need it anyway... it is just the same service users that generate the
data and query it today.

Can Ceph tell the actual number of clients? We have 55-60 hosts, where
most of them mount the catalog.

--
Jesper
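On the client-count question, one common approach (a sketch, not from the
thread: the mds name is a placeholder, and jq is assumed to be installed)
is to ask the MDS for its sessions from the MDS host:

    $ ceph daemon mds.<name> session ls               # one JSON entry per client session
    $ ceph daemon mds.<name> session ls | jq length   # count the sessions

Each kernel or FUSE mount holds one session, so the count is a reasonable
proxy for the number of clients accessing the filesystem.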
[ceph-users] Ceph osd logs
Hi,

I added some OSDs into the cluster (Luminous) lately. The OSDs use
BlueStore and everything went fine, but there is no OSD log in the log
file; the log directory has only empty files. I checked my settings with
"ceph daemon osd.x config show", and I get "debug_osd": "1/5". How can I
get the new OSDs' logs?
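A sketch of the usual first checks (the osd id is a placeholder, and that
this explains the empty files is an assumption): confirm the path the
daemon actually writes to, then raise verbosity at runtime:

    $ ceph daemon osd.3 config get log_file        # path the daemon logs to
    $ ceph tell osd.3 injectargs '--debug_osd 5/5' # runtime only, not persisted

If log_file points somewhere unexpected, or the daemon cannot write there
(ownership/permissions after recreating OSDs are a common culprit), the
files stay empty regardless of the debug level.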
Re: [ceph-users] cephfs kernel client - page cache being invalidated.
On Sun, Oct 14, 2018 at 8:21 PM wrote:
>
> Hi
>
> We have a dataset of ~300 GB on CephFS which is being used for
> computations over and over again, being refreshed daily or similar.
> [...]
>
> Clients are using a 4.15 kernel. Anyone aware of newer patches in this
> area that could have an impact?

How many cephfs mounts access the file? Is it possible that some program
opens that file in RW mode (even if they just read the file)?

Yan, Zheng
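One way to check the RW-open question on a given client (a sketch; the
path is a placeholder):

    $ lsof /ceph/mount/point/somefile
    # the FD column shows the access mode: 'r' read, 'w' write, 'u' read/write

Any entry ending in 'w' or 'u' is a process holding the file open for
writing, which is what would force the MDS to revoke cache capabilities.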
Re: [ceph-users] cephfs kernel client - page cache being invalidated.
> Actual amount of memory used by VFS cache is available through 'grep
> Cached /proc/meminfo'. slabtop provides information about cache
> of inodes, dentries, and IO memory buffers (buffer_head).

Thanks, that was also what I got out of it, and why I reported "free"
output in the first place, as it also shows available and "cached"
memory.

--
Jesper
Re: [ceph-users] cephfs kernel client - page cache being invalidated.
The actual amount of memory used by the VFS page cache is available
through 'grep Cached /proc/meminfo'. slabtop provides information about
the caches of inodes, dentries, and IO memory buffers (buffer_head).

> On 14.10.2018, at 17:28, jes...@krogh.cc wrote:
>
>> Try looking in /proc/slabinfo / slabtop during your tests.
>
> I need a bit of guidance here.. Does slabinfo cover the VFS page
> cache? I cannot seem to find any traces (sorting by size on machines
> with a huge cache does not really give anything). Perhaps I'm holding
> the screwdriver wrong?
>
> --
> Jesper
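Side by side, the two views look like this (a sketch; '-s c' sorts slabs
by cache size, '-o' prints once instead of refreshing):

    $ grep -E 'Cached|Buffers' /proc/meminfo   # page cache: not in slabinfo at all
    $ slabtop -o -s c | head -n 15             # dentry/inode/buffer_head slab caches

The page cache holds file data and is accounted in meminfo's Cached;
slabinfo only covers kernel metadata caches, which is why sorting slabtop
by size shows nothing resembling a multi-GB file cache.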
Re: [ceph-users] ceph dashboard ac-* commands not working (Mimic)
The docs you're looking at are from the master (development) version of
Ceph, so you're seeing commands that don't exist in Mimic. You can swap
"master" for "mimic" in that URL. Hopefully we'll soon have some changes
to make this more apparent when looking at the docs.

John

On Fri, 12 Oct 2018, 17:43 Hayashida, Mami, wrote:
> I set up a new Mimic cluster recently and have just enabled the
> Dashboard. I first tried to add a (Dashboard) user with the
> "ac-user-create" command following this version of the documentation
> (http://docs.ceph.com/docs/master/mgr/dashboard/), but the command did
> not work. Following the /mimic/mgr/dashboard/ version, I used the
> "set-login-credentials" command and was able to create a user with a
> password, which was successful. But with none of the ac-* commands
> working, how can we manage the dashboard user accounts? At this point,
> I cannot figure out what level of permissions has been given to the
> (test) dashboard user I have just created. Neither have I figured out
> how to delete a user or obtain a list of the dashboard users created
> so far.
>
> I am using Ceph version 13.2.2, and all the ac-* commands I have tried
> return exactly the same message:
>
> mon0:~$ ceph dashboard ac-user-show test-user
> no valid command found; 10 closest matches:
> dashboard get-rgw-api-user-id
> dashboard get-rest-requests-timeout
> dashboard set-rgw-api-host
> dashboard set-rgw-api-secret-key
> dashboard get-rgw-api-access-key
> dashboard set-rest-requests-timeout
> dashboard get-rgw-api-scheme
> dashboard get-rgw-api-host
> dashboard set-login-credentials
> dashboard set-session-expire
> Error EINVAL: invalid command
>
> --
> Mami Hayashida
> Research Computing Associate
> Research Computing Infrastructure
> University of Kentucky Information Technology Services
> 301 Rose Street | 102 James F. Hardymon Building
> Lexington, KY 40506-0495
> mami.hayash...@uky.edu
> (859)323-7521
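In practice that means the Mimic dashboard only supports a single
credential pair (a sketch; the username and password are placeholders):

    $ ceph dashboard set-login-credentials <username> <password>
    # docs matching the installed release:
    # http://docs.ceph.com/docs/mimic/mgr/dashboard/

The ac-* role and user commands only appear in later releases, so there
is no per-user permission model to inspect or list in 13.2.2.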
Re: [ceph-users] cephfs kernel client - page cache being invalidated.
> Try looking in /proc/slabinfo / slabtop during your tests.

I need a bit of guidance here.. Does slabinfo cover the VFS page cache?
I cannot seem to find any traces (sorting by size on machines with a
huge cache does not really give anything). Perhaps I'm holding the
screwdriver wrong?

--
Jesper
Re: [ceph-users] cephfs kernel client - page cache being invalidated.
Try looking in /proc/slabinfo / slabtop during your tests.

> On 14.10.2018, at 15:21, jes...@krogh.cc wrote:
>
> Hi
>
> We have a dataset of ~300 GB on CephFS which is being used for
> computations over and over again, being refreshed daily or similar.
> [...]
Re: [ceph-users] cephfs kernel client - page cache being invalidated.
On 14 Oct 2018, at 15.26, John Hearns wrote:
>
> This is a general question for the ceph list.
> Should Jesper be looking at these vm tunables?
> vm.dirty_ratio
> vm.dirty_centisecs
>
> What effect do they have when using CephFS?

This situation is read-only, thus there is no dirty data in the page
cache. The above should be irrelevant.

Jesper
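For reference, the tunables in question as they actually exist in sysctl
(a sketch; "vm.dirty_centisecs" above is presumably vm.dirty_expire_centisecs
or vm.dirty_writeback_centisecs):

    $ sysctl vm.dirty_ratio vm.dirty_background_ratio \
             vm.dirty_expire_centisecs vm.dirty_writeback_centisecs

All of these govern writeback of dirty pages, which supports Jesper's
point: they have no bearing on how quickly clean, read-only pages are
dropped from the cache.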
Re: [ceph-users] cephfs kernel client - page cache being invalidated.
This is a general question for the ceph list.
Should Jesper be looking at these vm tunables?
vm.dirty_ratio
vm.dirty_centisecs

What effect do they have when using CephFS?

On Sun, 14 Oct 2018 at 14:24, John Hearns wrote:
> Hej Jesper.
> Sorry I do not have a direct answer to your question.
> When looking at memory usage, I often use this command:
>
> watch cat /proc/meminfo
>
> [...]
Re: [ceph-users] cephfs kernel client - page cache being invalidated.
Hej Jesper.
Sorry I do not have a direct answer to your question.
When looking at memory usage, I often use this command:

watch cat /proc/meminfo

On Sun, 14 Oct 2018 at 13:22, wrote:
> Hi
>
> We have a dataset of ~300 GB on CephFS which is being used for
> computations over and over again, being refreshed daily or similar.
> [...]
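A slightly more focused variant (a sketch; the field list is arbitrary):

    $ watch -n 5 "grep -E 'MemFree|Cached|Dirty' /proc/meminfo"

This refreshes every five seconds and shows only the page-cache-related
counters, which makes evictions easier to spot than the full meminfo dump.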
[ceph-users] cephfs kernel client - page cache being invalidated.
Hi

We have a dataset of ~300 GB on CephFS which is being used for
computations over and over again, being refreshed daily or similar.

When hosting it on NFS, the files are transferred after a refresh, but
from there they would be sitting in the kernel page cache of the client
until they are refreshed serverside.

On CephFS it looks "similar" but "different". Where the "steady state"
operation over NFS would give a client/server traffic of < 1 MB/s,
CephFS constantly pulls 50-100 MB/s over the network. This has
implications for the clients, which end up spending unnecessary time
waiting for IO during execution.

This is in a setting where the CephFS client mem looks like this:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           377G         17G        340G        1.2G         19G        354G
Swap:          8.8G        430M        8.4G

If I just repeatedly run (within a few minutes) something that is using
the files, then it is fully served out of the client page cache
(2 GB'ish/s) .. but it looks like it is being evicted way faster than in
the NFS setting?

This is not scientific .. but the CMD is of the "cat /file/on/ceph >
/dev/null" type on a total of 24 GB of data in 300'ish files.

$ free -h; time CMD ; sleep 1800; free -h; time CMD ; free -h; sleep 3600; time CMD

              total        used        free      shared  buff/cache   available
Mem:           377G         16G        312G        1.2G         48G        355G
Swap:          8.8G        430M        8.4G

real    0m8.997s
user    0m2.036s
sys     0m6.915s

              total        used        free      shared  buff/cache   available
Mem:           377G         17G        277G        1.2G         82G        354G
Swap:          8.8G        430M        8.4G

real    3m25.904s
user    0m2.794s
sys     0m9.028s

              total        used        free      shared  buff/cache   available
Mem:           377G         17G        283G        1.2G         76G        353G
Swap:          8.8G        430M        8.4G

real    6m18.358s
user    0m2.847s
sys     0m10.651s

Munin graphs of the system confirm that there has been zero memory
pressure over the period.

Are there things in the CephFS case that can cause the page cache to be
invalidated? Could less aggressive "read-ahead" play a role?

Other thoughts on what the root cause of the different behaviour could be?

Clients are using a 4.15 kernel. Anyone aware of newer patches in this
area that could have an impact?

Jesper
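On the read-ahead point: the kernel client's readahead window is set at
mount time via the rasize option documented in mount.ceph (a sketch; the
monitor address, mount point, and 4 MB value are placeholders):

    $ mount -t ceph mon1:6789:/ /mnt/cephfs -o name=admin,rasize=4194304

Comparing network traffic before and after shrinking rasize would show
whether aggressive readahead, rather than cache invalidation, accounts
for the constant 50-100 MB/s.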