Thanks Joe,
Just to clarify, I’m seeing 8 seconds to run ls -l in a dir containing 2 files.
I mentioned that the _parent_ dir contains 123k items in case it was relevant,
although the fact that we are hitting the dir with many requests seems to be
the key factor.

Aaron


From: Joe Julian [mailto:[email protected]]
Sent: 29 November 2017 16:16
To: [email protected]; Aaron Roberts <[email protected]>; 
[email protected]
Subject: Re: [Gluster-users] ls performance on directories with small number of 
items

The -l flag causes a metadata lookup for every file in the directory. The ls
command does that with an individual fstat call for each directory entry.
That's a lot of tiny network round trips, with fops that don't even fill a
standard frame, so each frame carries a high percentage of TCP overhead. Add to
that the replica check to ensure you're not getting stale data, and you have
another round trip for each file. Your 123k directory entries require several
frames of getdirent and over 492k frames for the individual fstat calls. That's
roughly 16us per frame.
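
As a rough check of that per-frame figure, the arithmetic can be sketched as below. The helper name us_per_frame is hypothetical, and the inputs are simply the numbers quoted in this thread (about 8 seconds, 492k frames). The 492k figure looks like roughly 4 frames per entry for 123k entries, which is an inference from the numbers rather than something measured.

```shell
# us_per_frame SECONDS FRAMES
# Hypothetical helper: average time per frame in microseconds,
# given a total wall-clock time and a frame count.
us_per_frame() {
    awk -v s="$1" -v n="$2" 'BEGIN { printf "%.1f\n", s / n * 1e6 }'
}

# 8 seconds spread over 492k frames (123k entries x an assumed 4 frames each).
us_per_frame 8 492000   # prints 16.3
```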

Can you eliminate the fstat calls? If you only get the directory listing that 
should be significantly better. To prove this, do "echo *". You will instantly 
see your 123k entries.
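
A minimal sketch of that comparison, assuming a POSIX shell and a standard ls. The function names names_only and long_listing are made up for illustration. The first asks only for entry names, which typically needs no per-entry stat; the second asks for the metadata columns that do.

```shell
# names_only DIR
# Print entry names only: one per line (-1), including dotfiles (-A).
names_only() {
    ls -1A -- "$1"
}

# long_listing DIR
# Same entries, but with the metadata columns (-l) that require
# a stat-style lookup for every entry.
long_listing() {
    ls -lA -- "$1"
}
```

In bash, running `time names_only dir1/get/ | wc -l` and then `time long_listing dir1/get/ | wc -l` should show how much of the delay comes from the per-entry metadata lookups. The long listing prints an extra "total" line, so its wc -l count is one higher than the number of entries.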

On November 27, 2017 5:18:56 AM PST, Aaron Roberts <[email protected]> wrote:
Hi,
I have a situation where an Apache web server is trying to locate the
IndexDocument for a directory on a gluster volume.  This URL is being hit
roughly 20 times per second.  There is only 1 file in this directory.
However, the parent directory does have a large number of items (123,000+
files and dirs) and we are performing operations to move these files into 2
levels of subdirs.


We are seeing very slow response times (around 8 seconds) in Apache and also
when running ls on this dir.  Before we started the migration to move files
from the large parent dir into 2 sub levels, we weren’t aware of a problem.


[root@web-02 images]# time ls -l dir1/get/ | wc -l
2


real    0m8.114s
user    0m0.002s
sys     0m0.014s


Other directories with only 1 item return very quickly (<1 sec).


[root@Web-01 images]# time ls -l dir1/tmp1/ | wc -l
2


real    0m0.014s
user    0m0.003s
sys     0m0.006s


I’m just trying to understand what would slow down this operation so much.  Is
it the high frequency of attempts to read the directory (Apache hits to
dir1/get/)?  Do the move operations on items in the parent directory have any
impact?
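
One way to see whether the delay tracks the request load is to sample the listing time repeatedly while the Apache traffic and the move operations are running, and again while they are paused. A rough sketch, assuming GNU date (for the %N nanosecond format); sample_ls is a hypothetical helper name, not an existing tool.

```shell
# sample_ls DIR COUNT [INTERVAL]
# Hypothetical helper: run "ls -l DIR" COUNT times and print the elapsed
# time of each run in milliseconds, sleeping INTERVAL seconds (default 1)
# after each run.
sample_ls() {
    dir="$1"
    count="$2"
    interval="${3:-1}"
    i=0
    while [ "$i" -lt "$count" ]; do
        start=$(date +%s%N)          # assumes GNU date: ns since the epoch
        ls -l -- "$dir" > /dev/null
        end=$(date +%s%N)
        echo "$(( (end - start) / 1000000 ))"
        i=$(( i + 1 ))
        sleep "$interval"
    done
}
```

For example, `sample_ls dir1/get/ 30` prints thirty samples about a second apart. Comparing a run taken under load with one taken while the traffic or the moves are stopped would suggest which of the two matters more.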


Some background info:


[root@web-02 images]# gluster --version
glusterfs 3.7.20 built on Jan 30 2017 15:39:29
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General 
Public License.


[root@web-02 images]# gluster vol info


Volume Name: web_vol1
Type: Replicate
Volume ID: 0d63de20-c9c2-4931-b4a3-6aed5ae28057
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: web-01:/export/brick1/web_vol1_brick1
Brick2: web-02:/export/brick1/web_vol1_brick1
Options Reconfigured:
performance.readdir-ahead: on
performance.io-thread-count: 32
performance.cache-size: 512MB




Any insight would be gratefully received.


Thanks,
Aaron



_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users