On 26/11/2014 18:19, Craig Tierney - NOAA Affiliate wrote:
Thomas,
We backported the patch. It was just a one-liner to put changelog
entries at the tail, versus the head, of the list. After the last
catchup of the changelogs completed, I created a bunch of new files
while robinhood was not running. The processing rate is still about
400 entries per second. In particular, it looked like it was
processing about 1024 records every 2.5 seconds.
So I looked in the configuration and saw that I had:
# clear changelog every 1024 records:
batch_ack_count = 1024 ;
Craig,
This is strange. The behavior you describe sounds exactly like the
problem the patch is supposed to fix:
every changelog_clear() call to the MDS stalls changelog delivery for a
while.
Are there a lot of stacked records? You can see this on the MDS, as far as
I can remember, in /proc/fs/lustre/*mdd*/changelog_user or something like that:
it shows the last record id and the last cleared record.
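To put a number on the backlog, the two values mentioned above (last record id and last cleared record) can simply be subtracted. Below is a minimal Python sketch, assuming the text has the shape usually printed for changelog users on the MDS: a "current index:" line followed by one "ID index" row per changelog user. The sample text, its numbers, and the function name are illustrative only and were not taken from the system discussed here.

```python
def changelog_backlog(text):
    """Return {user_id: pending_records} from changelog_users-style text.

    Assumed input shape (an assumption, check it against your MDS):
        current index: <last record id>
        ID    index
        cl1   <last cleared record>
    """
    current = None
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("current index:"):
            current = int(line.split(":", 1)[1])
            continue
        parts = line.split()
        # User rows look like "cl1 6000"; the "ID index" header is skipped
        # because its second field is not a number.
        if len(parts) >= 2 and parts[1].isdigit():
            users[parts[0]] = int(parts[1])
    if current is None:
        raise ValueError("no 'current index:' line found")
    return {uid: current - idx for uid, idx in users.items()}


# Made-up sample, only to show the calculation.
sample = """current index: 150000
ID    index
cl1   6000
"""
print(changelog_backlog(sample))
```

A large and growing difference for the robinhood changelog user would suggest that records are stacking up faster than they are being cleared.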
I don't know why this would slow things down, I thought it was just an
update optimization. I ran some tests with a different changelog user
and it seemed dumping the changelogs and updating the position should
never be a limitation as I was able to grab over 100,000 entries and
reset the count in a few seconds.
OK.
So I updated batch_ack_count to 10,000. Now the change log processing
rate seemed to go up to 1666 logs/second (over 30 seconds). This is
better. If the rate is limited by the database performance, then
there probably isn't much more I can do (comparing to scan rates).
Running "grep STAT" on the robinhood log would help to identify the
limitation you hit.
If you want to sample stats over a shorter period than the default (which
is 15 or 20 minutes), you can change "stats_interval" in the config.
What do people use for a value of batch_ack_count on large, PB sized,
filesystems?
I think a good value is a few seconds of changelog processing. So 10k is
a good value in your case.
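The rule of thumb above (a batch worth a few seconds of changelog processing) amounts to multiplying the observed processing rate by a target number of seconds. A minimal Python sketch of that arithmetic follows; the function name and the choice of 2, 6 and 10 seconds are assumptions made for illustration, and only the 1666 records/second figure comes from this thread.

```python
def suggest_batch_ack_count(records_per_second, seconds_of_processing):
    """Batch size covering roughly `seconds_of_processing` of changelog work."""
    if records_per_second <= 0 or seconds_of_processing <= 0:
        raise ValueError("both arguments must be positive")
    return round(records_per_second * seconds_of_processing)


# Rate reported in this thread after raising batch_ack_count to 10,000.
rate = 1666
for seconds in (2, 6, 10):
    print(seconds, "s ->", suggest_batch_ack_count(rate, seconds))
```

At about 1666 records per second, roughly 6 seconds of processing corresponds to a batch of about 10,000 records, which matches the value already in use.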
Regards
Thanks,
Craig
On Tue, Nov 18, 2014 at 3:00 AM, LEIBOVICI Thomas
<[email protected]> wrote:
Hi Craig,
No, such a slow processing speed is not expected.
Given the Lustre versions you run, this slow processing may
be due to the following Lustre bug:
https://jira.hpdd.intel.com/browse/LU-5405
It is an MDS fix. For now the fix has only landed in Lustre 2.5.4. I
don't know if it can be backported to Lustre 2.4...
Regards,
Thomas
On 11/17/14 21:11, Craig Tierney - NOAA Affiliate wrote:
Hi,
I have just installed Robinhood 2.5.3 to monitor a Lustre 2.4.3
system. The client on the server is running the 2.5.3 version.
When I did an initial scan of another test system I saw scan
rates of about 1000-2000 entries per second. While I had
configured robinhood to monitor this new system, the Robinhood
server was not running when we started to copy data to the new
filesystem. From the changelog statistics, I am about 144m
events behind. Processing the changelogs seems to be going at
only 375 entries per second.
Is this typical? I would have expected the processing of
changelog events to be much faster than this or at least as fast
as a normal file scan.
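For scale, the backlog and rates quoted above translate into catch-up times as in the short Python sketch below. It assumes that no new changelog records arrive while catching up, which is optimistic on a filesystem still receiving data; the function name is illustrative.

```python
SECONDS_PER_DAY = 86400


def catchup_days(backlog_records, records_per_second):
    """Days needed to drain a backlog at a constant processing rate."""
    if records_per_second <= 0:
        raise ValueError("records_per_second must be positive")
    return backlog_records / records_per_second / SECONDS_PER_DAY


backlog = 144_000_000  # "about 144m events behind"
# 375/s is the observed changelog rate; 1000 and 2000/s are the scan rates.
for rate in (375, 1000, 2000):
    print(f"{rate} entries/s: {catchup_days(backlog, rate):.2f} days")
```

At 375 entries per second the backlog takes well over four days to drain, while the scan rates of 1000-2000 entries per second would bring it down to between one and two days.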
Thanks,
Craig
------------------------------------------------------------------------------
Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
from Actuate! Instantly Supercharge Your Business Reports and Dashboards
with Interactivity, Sharing, Native Excel Exports, App Integration & more
Get technology previously reserved for billion-dollar corporations, FREE
http://pubads.g.doubleclick.net/gampad/clk?id=157005751&iu=/4140/ostg.clktrk
_______________________________________________
robinhood-support mailing list
[email protected]
<mailto:[email protected]>
https://lists.sourceforge.net/lists/listinfo/robinhood-support