"Small files" is something of a misconception. Every initial file op carries a 
small amount of overhead. On a lookup, the filename is hashed, the dht 
subvolume is selected, and the request is sent to that subvolume. If that 
subvolume is a replica set, the request is sent to each replica in the set 
(usually 2) and all the replicas have to respond. If one or more have pending 
flags or there's an attribute mismatch, either some self-heal action has to 
take place or a split-brain is determined. If the file doesn't exist on that 
subvolume, the same lookup must be done on all the subvolumes. If the file is 
found, a link file is made on the expected dht subvolume pointing to the place 
where the file was found, which makes finding it faster the next time. Once the 
file is found and is determined to be clean, the file system can move on to the 
next file operation. 
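
To make that lookup path concrete, here is a toy model in Python. It is only a sketch and not GlusterFS code: the Subvolume class, the crc32-modulo hash, and the "queried" counter are all assumptions made for illustration, and replication and self-heal are left out entirely.

```python
import zlib

class Subvolume:
    """Toy stand-in for one dht subvolume (hypothetical, not a Gluster type)."""
    def __init__(self, name):
        self.name = name
        self.files = set()   # filenames actually stored here
        self.links = {}      # filename -> subvolume that really holds it

def hashed_subvolume(filename, subvols):
    # Assumption: crc32 modulo the subvolume count stands in for the real hash.
    return subvols[zlib.crc32(filename.encode()) % len(subvols)]

def lookup(filename, subvols):
    """Return (subvolume holding the file or None, subvolumes queried)."""
    expected = hashed_subvolume(filename, subvols)
    if filename in expected.files:
        return expected, 1                      # best case: one request
    if filename in expected.links:
        return expected.links[filename], 2      # follow the link file
    # Miss on the expected subvolume: every other subvolume has to be asked.
    found = None
    for sv in subvols:
        if sv is not expected and filename in sv.files:
            found = sv
    if found is not None:
        expected.links[filename] = found        # leave a link for next time
    return found, len(subvols)

subvols = [Subvolume(f"sv{i}") for i in range(4)]
name = "index.php"
expected = hashed_subvolume(name, subvols)
other = next(sv for sv in subvols if sv is not expected)
other.files.add(name)                # file lives somewhere unexpected

print(lookup(name, subvols)[1])      # first lookup asks every subvolume
print(lookup(name, subvols)[1])      # second lookup follows the link
```

The point of the sketch is the cost difference between the three paths, and that a file that doesn't exist anywhere always takes the most expensive one.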

PHP applications, specifically, normally have a lot of small files that are 
opened for every page request, so that overhead adds up per page. PHP also 
looks up a lot of files that just don't exist. A single page might query 200 
files that just aren't there: they're in a different portion of the search 
path, or they belong to a plugin that's not used, etc.

NFS mitigates that effect by using FScache in the kernel. It caches directories 
and stats, avoiding the call to the actual filesystem. This also means, of 
course, that the image that was just uploaded through a different server isn't 
going to exist on this one until the cache times out. Stale data has to be 
expected on a caching client in a multi-client system.

Jeff Darcy created a test translator that caches negative lookups, which he 
said also mitigated the PHP problem pretty nicely.
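
For anyone unfamiliar with the idea, a negative lookup cache remembers "this path does not exist" for a short time so repeated misses don't go back to the filesystem. The sketch below is my own illustration in Python, not Jeff's translator; the class name, the TTL value, and the backend_calls counter are all hypothetical.

```python
import os
import time

class NegativeLookupCache:
    """Remember paths that were missing, for ttl_seconds (illustrative only)."""
    def __init__(self, ttl_seconds=2.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.missing = {}        # path -> time at which the entry expires
        self.backend_calls = 0   # how often we really hit the filesystem

    def exists(self, path):
        expiry = self.missing.get(path)
        if expiry is not None:
            if self.clock() < expiry:
                return False     # answered from the cache, no lookup sent
            del self.missing[path]   # entry expired, ask again
        self.backend_calls += 1
        found = os.path.exists(path)
        if not found:
            self.missing[path] = self.clock() + self.ttl
        return found

cache = NegativeLookupCache(ttl_seconds=2.0)
for _ in range(200):
    cache.exists("/no-such-dir/plugin-not-installed.php")
print(cache.backend_calls)   # far fewer than 200 real lookups
```

The trade-off is the same staleness as with any cached client: a file created by another client inside the TTL window still looks missing until the entry expires.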

If you have control over your app, things like absolute pathing for PHP or 
leaving file descriptors open can also avoid overhead. Reducing the number of 
times you open a file, or the number of files you open, also helps.

So "small files" refers to the percentage of total file op time that's spent 
on overhead versus actual data retrieval, rather than to any particular size.
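
A quick back-of-the-envelope calculation shows how that percentage shifts with file size. The 1 ms per-file overhead and the 100 MB/s throughput are made-up numbers for illustration, not measurements of any Gluster setup.

```python
def overhead_fraction(file_size_bytes, overhead_s, throughput_bytes_per_s):
    """Share of total op time spent on per-file overhead rather than data."""
    transfer_s = file_size_bytes / throughput_bytes_per_s
    return overhead_s / (overhead_s + transfer_s)

# Hypothetical figures: 1 ms of lookup overhead, 100 MB/s of throughput.
for size in (4 * 1024, 400 * 1024, 5 * 1024**2, 1024**3):
    frac = overhead_fraction(size, overhead_s=0.001,
                             throughput_bytes_per_s=100e6)
    print(f"{size:>12} bytes: {frac:.1%} of the time is overhead")
```

With these assumed numbers the overhead dominates for a few-KB file and is negligible for a 1GB file; where your own workload falls depends on your real latency and throughput.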

Chandan Kumar <[email protected]> wrote:

>Hello All,
>
>I am new to gluster and evaluating it for my production environment. After
>reading some blogs and googling I learned that NFS mount at clients give
>better read performance for small files and the glusterfs/FUSE mount gives
>better for large write operations.
>
>Now my questions are
>
>1) What do we mean by small files? 1KB/1MB/1GB?
>2) If I am using NFS mount at the client I am most likely losing the high
>availability feature of gluster, unlike fuse mount where if primary goes
>down I don't need to worry about availability.
>
>Basically my production environment will mostly have read operations of
>files ranging from 400KB to 5MB and they will be concurrently read by
>different threads.
>
>Thanks,
>Chandan
>
>_______________________________________________
>Gluster-users mailing list
>[email protected]
>http://gluster.org/cgi-bin/mailman/listinfo/gluster-users