Robin,

The NUFA translator sounds perfect for my setup. Do you have a reference for setting it up? I can't find much documentation about it on the website, except a few references in the mailing list to "the old, now obsolete NUFA translator".
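From the volume-option list in your mail below, I'm guessing it's just a matter of setting the option on the volume, i.e. something like the following (volume name is a placeholder, and this is only my guess from your list, not something I've tested):

    gluster volume set myvolume cluster.nufa enable
    gluster volume info myvolume

and then checking that cluster.nufa shows up under "Options Reconfigured" in the volume info. Or does NUFA also need something on the client side?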
Thanks - Alex

On Mar 11, 2014, at 3:37 AM, Robin Jonsson <[email protected]> wrote:

> Alexander:
>
> I have also experienced the stalls you are describing. This was in a 2-brick
> setup running replicated volumes used by a 20-node HPC cluster.
>
> In my case this was solved by:
>
> * Replace FUSE with NFS
>     * This is by far the biggest booster
> * RAM disks for the scratch directories (not connected to gluster at all)
>     * If you're not sure where these directories are, run
>       'gluster volume top <volume> write list-cnt 10'
> * 'tuned-adm profile; tuned-adm profile rhs-high-throughput' on all storage
>   bricks
> * The following volume options
>     * cluster.nufa: enable
>     * performance.quick-read: on
>     * performance.open-behind: on
> * Mount option on clients
>     * noatime
>         * Use only where access time isn't needed.
>         * Major booster for small file writes in my case, even with the
>           FUSE client.
>
> Hope this helps,
>
> Regards,
> Robin
>
>
> On 10 Mar 2014, at 19:06, Alexander Valys <[email protected]> wrote:
>
>> A quick performance question.
>>
>> I have a small cluster of 4 machines, 64 cores in total. I am running a
>> scientific simulation on them, which writes at between 0.1 and 10 MB/s
>> (total) to roughly 64 HDF5 files. Each HDF5 file is written by only one
>> process. The writes are not continuous, but consist of writing roughly 1 MB
>> of data to each file every few seconds.
>>
>> Writing to HDF5 involves a lot of reading of file metadata and random
>> seeking within the file, since we are actually writing to about 30 datasets
>> inside each file. I am hosting the output on a distributed gluster volume
>> (one brick local to each machine) to provide a unified namespace for the
>> (very rare) case when each process needs to read the others' files.
>>
>> I am seeing somewhat lower performance than I expected, i.e. roughly a
>> factor of 4 less throughput than when each node writes locally to the bare
>> drives. I expected the write-behind cache to buffer each write, but the
>> writes seem to be flushed across the network almost immediately, regardless
>> of what write-behind cache size I use (32 MB currently), and the simulation
>> stalls while waiting for the I/O operation to finish. Anyone have any
>> suggestions as to what to look at? I am using gluster 3.4.2 on Ubuntu 12.04.
>> I have flush-behind turned on, have mounted the volume with
>> direct-io-mode=disable, and have the cache size set to 256M.
>>
>> The nodes are connected via a dedicated gigabit Ethernet network, carrying
>> only gluster traffic (no simulation traffic).
>>
>> (Sorry if this message comes through twice; I sent it yesterday but was not
>> subscribed.)
>> _______________________________________________
>> Gluster-users mailing list
>> [email protected]
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
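For reference, a rough, untested sketch of how Robin's suggestions above might be applied. The volume name "myvolume", server "node1", and mount point "/mnt/scratch" are placeholders, and the NFS mount line assumes gluster's built-in NFSv3 server rather than a separate export:

    gluster volume set myvolume cluster.nufa enable
    gluster volume set myvolume performance.quick-read on
    gluster volume set myvolume performance.open-behind on

    tuned-adm profile rhs-high-throughput        (on each storage brick)

    mount -t nfs -o vers=3,noatime node1:/myvolume /mnt/scratch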
