It does seem like the entries get cached for a certain period of time.
Here is the memory listing for the rbd client server:
root@cephmount1:~# free -m
             total       used       free     shared    buffers     cached
Mem:         11965      11816        149          3        139      10823
-/+ buffers/cache:        853      11112
Swap:         4047          0       4047
I can add more memory to the server if I need to; I have 2 or 4 16GB DIMMs
lying around here someplace.
Here are some of the pagecache sysctl settings:
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 10
vm.dirty_writeback_centisecs = 500
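
As far as I know, the dirty_* settings above only control when dirty data is
written back; they do not decide how long directory entries and inodes stay
cached. The knob for that is vm.vfs_cache_pressure (default 100; lower values
make the kernel prefer to keep dentry and inode caches). A minimal sketch for
reading it alongside the values above. The helper name read_vm_sysctl and its
root parameter are made up for illustration:

```python
from pathlib import Path


def read_vm_sysctl(name, root="/proc/sys/vm"):
    """Return the integer value of vm.<name>, or None if it cannot be read."""
    try:
        return int((Path(root) / name).read_text().strip())
    except (OSError, ValueError):
        # Missing file, no permission, or a non-integer value.
        return None


if __name__ == "__main__":
    for key in ("vfs_cache_pressure", "dirty_ratio", "dirty_background_ratio"):
        print(f"vm.{key} = {read_vm_sysctl(key)}")
```

Whether lowering vfs_cache_pressure helps here is untested; it would only be
worth trying if the cached entries really are being reclaimed.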
In terms of the number of files:
root@cephmount1:/mnt/ceph-block-device-archive/library/E# time ls
real 0m8.073s
user 0m0.000s
sys 0m0.012s
root@cephmount1:/mnt/ceph-block-device-archive/library/E# ls |wc
228 510 3413
However, looking at some other directories, I see counts in the range of 500
to 600, so they vary based on the name of the artist. If I had to guess, we
would not have more than 800 to 1000 entries in the heaviest directories at
this point.
Also...one thing I just noticed is that the 'ls |wc' returns right away...even
in cases when right after that I do an 'ls -l' and it takes a while.
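
One possible explanation, offered as a guess rather than a diagnosis: when the
output goes to a pipe, 'ls' only needs the names stored in the directory
itself, while 'ls -l' has to stat every entry, which means reading each inode
from the RBD if it is not already cached. A plain 'ls' on a terminal may also
stat entries if it is aliased to 'ls --color=auto'. A small sketch to time the
two phases separately; the function name time_listing is made up for
illustration:

```python
import os
import sys
import time


def time_listing(path):
    """Return (entry_count, seconds_names_only, seconds_with_lstat)."""
    # Names only: roughly what 'ls | wc' needs (directory blocks only).
    start = time.perf_counter()
    names = os.listdir(path)
    names_only = time.perf_counter() - start

    # One lstat() per entry: roughly what 'ls -l' needs (every inode).
    start = time.perf_counter()
    for name in names:
        os.lstat(os.path.join(path, name))
    with_lstat = time.perf_counter() - start

    return len(names), names_only, with_lstat


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    count, names_only, with_lstat = time_listing(target)
    print(f"{count} entries: names {names_only:.3f}s, lstat {with_lstat:.3f}s")
```

If the lstat phase dominates on a directory that has not been listed recently,
that would point at inode reads rather than at the directory listing itself.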
Thanks,
Shain
Shain Miley | Manager of Systems and Infrastructure, Digital Media |
[email protected] | 202.513.3649
________________________________________
From: Robert LeBlanc [[email protected]]
Sent: Tuesday, January 06, 2015 1:57 PM
To: Shain Miley
Cc: [email protected]
Subject: Re: [ceph-users] rbd directory listing performance issues
I would think that the RBD mounter would cache the directory listing
which should always make it fast, unless there is so much memory
pressure that it is dropping it frequently.
How many entries are in your directory and total on the RBD?
ls | wc -l
find . | wc -l
What does your memory look like?
free -h
I'm not sure how much help I can be, but if memory pressure is causing
buffers to be freed, then the system may have to go to disk to get the
directory listing. I'm guessing that if the directory is large enough,
the system could have to go back to the RBD many times. Very small I/O
on RBD is very expensive compared to big sequential access.
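
To check whether memory pressure is actually evicting the cached directory
entries and inodes, one option is to compare the dentry and xfs_inode slab
counts before and after a slow listing. A rough sketch, assuming the usual
/proc/slabinfo layout (cache name, then active object count) and that XFS
inodes show up under the name xfs_inode; slab_objects is a made-up helper
name:

```python
def slab_objects(slabinfo_text, names=("dentry", "xfs_inode")):
    """Map each requested slab cache name to its active object count."""
    counts = {}
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Data lines look like: name active_objs num_objs objsize ...
        # Header lines start with "slabinfo" or "#" and are skipped.
        if len(fields) >= 3 and fields[0] in names and fields[1].isdigit():
            counts[fields[0]] = int(fields[1])
    return counts


if __name__ == "__main__":
    # /proc/slabinfo is normally readable by root only.
    with open("/proc/slabinfo") as f:
        print(slab_objects(f.read()))
```

If the counts drop sharply between a fast listing and the next slow one, the
caches are being reclaimed; if they stay high, the slowness is coming from
somewhere else.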
On Tue, Jan 6, 2015 at 11:33 AM, Shain Miley <[email protected]> wrote:
> Robert,
>
> xfs on the rbd image as well:
>
> /dev/rbd0 on /mnt/ceph-block-device-archive type xfs (rw)
>
> However looking at the mount options...it does not look like I've enabled
> anything special in terms of mount options.
>
> Thanks,
>
> Shain
>
>
> Shain Miley | Manager of Systems and Infrastructure, Digital Media |
> [email protected] | 202.513.3649
>
> ________________________________________
> From: Robert LeBlanc [[email protected]]
> Sent: Tuesday, January 06, 2015 1:27 PM
> To: Shain Miley
> Cc: [email protected]
> Subject: Re: [ceph-users] rbd directory listing performance issues
>
> What fs are you running inside the RBD?
>
> On Tue, Jan 6, 2015 at 8:29 AM, Shain Miley <[email protected]> wrote:
>> Hello,
>>
>> We currently have a 12 node (3 monitor+9 OSD) ceph cluster, made up of 107 x
>> 4TB drives formatted with xfs. The cluster is running ceph version 0.80.7:
>>
>> Cluster health:
>> cluster 504b5794-34bd-44e7-a8c3-0494cf800c23
>> health HEALTH_WARN crush map has legacy tunables
>> monmap e1: 3 mons at
>> {hqceph1=10.35.1.201:6789/0,hqceph2=10.35.1.203:6789/0,hqceph3=10.35.1.205:6789/0},
>> election epoch 156, quorum 0,1,2 hqceph1,hqceph2,hqceph3
>> osdmap e19568: 107 osds: 107 up, 107 in
>> pgmap v10117422: 2952 pgs, 15 pools, 77202 GB data, 19532 kobjects
>> 226 TB used, 161 TB / 388 TB avail
>>
>> Relevant ceph.conf entries:
>> osd_journal_size = 10240
>> filestore_xattr_use_omap = true
>> osd_mount_options_xfs =
>> "rw,noatime,nodiratime,logbsize=256k,logbufs=8,inode64"
>> osd_mkfs_options_xfs = "-f -i size=2048"
>>
>>
>> A while back I created an 80 TB rbd image to be used as an archive
>> repository for some of our audio and video files. We are still seeing good
>> rados and rbd read and write throughput performance, however we seem to be
>> having quite a long delay in response times when we try to list out the
>> files in directories with a large number of folders, files, etc.
>>
>> Subsequent directory listing times seem to run a lot faster (but I am not
>> sure for how long that is the case before we see another instance of
>> slowness), however the initial directory listings can take 20 to 45 seconds.
>>
>> The rbd kernel client is running on ubuntu 14.04 using kernel version
>> '3.18.0-031800-generic'.
>>
>> Benchmarks:
>>
>> root@rbdmount1:/mnt/rbd/music_library/D# time ls (file names removed):
>> real 0m18.045s
>> user 0m0.000s
>> sys 0m0.011s
>>
>> root@rbdmount1:/mnt/rbd# dd bs=1M count=1024 if=/dev/zero of=test
>> conv=fdatasync
>> 1024+0 records in
>> 1024+0 records out
>> 1073741824 bytes (1.1 GB) copied, 9.94287 s, 108 MB/s
>>
>>
>> My questions are:
>>
>> 1) Is there anything inherent in our setup/configuration that would prevent
>> us from having fast directory listings on these larger directories (using an
>> rbd image of that size for example)?
>>
>> 2) Have there been any changes made in Giant that would warrant upgrading
>> the cluster as a fix to resolve this issue?
>>
>> Any suggestions would be greatly appreciated.
>>
>> Thanks,
>>
>> Shain
>>
>>
>> Shain Miley | Manager of Systems and Infrastructure, Digital Media |
>> [email protected] | 202.513.3649
>>
>> _______________________________________________
>> ceph-users mailing list
>> [email protected]
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>