I'm working with a very large (1.7PB) NFS environment (Isilon based) and have 
been experimenting pretty heavily with robinhood for the past couple of months. 
This environment makes the entire storage system visible in a single global 
namespace by using the automounter. While this is ideal for several reasons, it 
does create a bit of a challenge with robinhood tmpfs scanning.

The tmpfs scan algorithm has a propensity to scan "deep" in a directory tree 
rather than "wide". By that I mean it starts down a given top-level directory 
(or mountpoint, in my environment) and iterates through that entire directory 
tree with multiple scan threads before moving on to the next top-level 
directory. But in an Isilon environment that round-robins mounts to a storage 
node based on load, this creates a "hotspot" in the Isilon cluster, where 100% 
of the scan threads are targeted at one mountpoint and hence at a single node 
of the cluster.

What I'd really like to do is scan "wide", where each scan thread starts down a 
different top level directory tree in parallel. That way automount will mount 
up multiple different directories from different Isilon nodes and spread the 
scan load across the entire Isilon cluster instead of beating on a single node. 
I know I can partially address this by scanning each top level directory 
separately, either by defining a separate conf file for each and launching 
multiple scans at once, or by launching multiple robinhood instances with 
--scan=/top_level_directory in parallel. But both of those are kind of ugly 
hacks that have other side effects.
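
To illustrate what I mean by scanning "wide", here is a minimal Python sketch. 
It is not robinhood code and says nothing about how robinhood schedules its 
scan threads internally; the names scan_tree and scan_wide are hypothetical. It 
assumes the list of top-level directories is passed in explicitly, since 
automounted directories may not show up in a directory listing until they are 
accessed. Each worker thread is handed a different top-level tree, so 
automount would trigger one mount per tree:

```python
import os
from concurrent.futures import ThreadPoolExecutor


def scan_tree(top):
    """Walk one top-level tree and return (top, number of entries found)."""
    count = 0
    for _dirpath, dirnames, filenames in os.walk(top):
        count += len(dirnames) + len(filenames)
    return top, count


def scan_wide(tops, max_workers=8):
    """Scan several top-level trees in parallel, one tree per worker.

    `tops` is an explicit list of top-level directories (hypothetical
    input; with an automounter these would be the map keys).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each submitted task starts in a different top-level directory,
        # so concurrent workers touch different mountpoints.
        return dict(pool.map(scan_tree, tops))
```

With max_workers set to the number of Isilon nodes (or fewer), the load would 
at least start out spread across as many mountpoints as there are workers, 
though trees of very different sizes would still finish at different times.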

In our case, database performance is the limiting factor, which artificially 
slows the scan and keeps it from overloading the Isilon node. But if I "fix" 
the database server to improve performance (e.g. memory-resident cache, SSD, 
etc.), we could run into hotspot issues with the current scan algorithm.

Anyone have any thoughts or ideas on tuning and optimizing the scan of very 
large tmpfs spaces like this?

--------
Eric D Christensen
R&D ITS - Infrastructure and High Performance Computing
Sanofi Tucson Innovation Center
2090 E Innovation Park Drive
Oro Valley, Arizona 85755
(520) 544-6869

"We are what we repeatedly do. Excellence, therefore, is not an act, but a 
habit." - Aristotle


_______________________________________________
robinhood-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/robinhood-support
