Hmm... It looks like there is something wrong with these kernel threads.
I don't know what their role is.
Googling "kernel migration threads high cpu usage" returns many pages.
You could also ask a kernel expert, if you have one.
Keep me updated,
Regards,
Thomas
On 04/21/15 11:40, Carmelo Ponti (CSCS) wrote:
Hi Thomas
Checking the load with top, I can see a lot of processes called
migration/[0-31] which are loading the servers:
7 root RT 0 0 0 0 S 55.0 0.0 1691:40 migration/1
3 root RT 0 0 0 0 S 34.7 0.0 1595:18 migration/0
39 root RT 0 0 0 0 S 34.2 0.0 1696:14 migration/9
35 root RT 0 0 0 0 S 29.7 0.0 1576:14 migration/8
103 root RT 0 0 0 0 S 26.0 0.0 735:58.52 migration/25
71 root RT 0 0 0 0 S 21.4 0.0 789:06.28 migration/17
11 root RT 0 0 0 0 S 20.2 0.0 1058:41 migration/2
51 root RT 0 0 0 0 S 19.8 0.0 465:53.14 migration/12
115 root RT 0 0 0 0 S 19.7 0.0 285:58.55 migration/28
47 root RT 0 0 0 0 S 18.6 0.0 783:47.47 migration/11
99 root RT 0 0 0 0 S 18.0 0.0 648:41.81 migration/24
43 root RT 0 0 0 0 S 17.6 0.0 1154:51 migration/10
111 root RT 0 0 0 0 S 17.0 0.0 401:44.51 migration/27
107 root RT 0 0 0 0 S 15.7 0.0 579:55.17 migration/26
67 root RT 0 0 0 0 S 14.9 0.0 698:39.89 migration/16
75 root RT 0 0 0 0 S 13.1 0.0 621:44.06 migration/18
83 root RT 0 0 0 0 S 11.7 0.0 271:49.81 migration/20
These processes are heavily used only once I start robinhood. On other
robinhood installations, where the Lustre load is not so high,
the migration processes are always at 0%. Currently /scratch/daint is used
by two CRAY clusters with a total of ca. 6400 compute nodes. Maybe our
robinhood HW is too small. Do you think changing the HW could help?
Carmelo
On Tue, 2015-04-21 at 11:14 +0200, LEIBOVICI Thomas wrote:
Hi,
Reports indicate the DB backend is not overloaded, so you don't have to
change your DB tunings or db access method.
It looks like access to the filesystem is very slow: the GET_INFO_FS stage
takes about 127ms, which means filesystem calls take about 127ms, which is
too long.
As mysql and robinhood are almost idle (0% CPU), do you know what causes
the server load of 30?
This load may explain why accesses are slow.
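To check filesystem latency independently of robinhood, a rough probe like the one below may help. It is only a sketch: it times plain os.lstat() calls on the entries of one directory, which is an assumption about what is worth measuring and not what GET_INFO_FS does internally (that stage may issue additional Lustre-specific calls). The function names and the default sample size are hypothetical.

```python
import os
import sys
import time


def lstat_latency_ms(directory, limit=1000):
    """Return per-call os.lstat() latencies in ms for up to `limit` entries."""
    latencies = []
    with os.scandir(directory) as entries:
        for i, entry in enumerate(entries):
            if i >= limit:
                break
            start = time.perf_counter()
            try:
                os.lstat(entry.path)
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Return (count, mean_ms, max_ms), or (0, 0.0, 0.0) if there are no samples."""
    if not latencies:
        return 0, 0.0, 0.0
    return len(latencies), sum(latencies) / len(latencies), max(latencies)


if __name__ == "__main__":
    # Directory to probe; defaults to the current directory (assumption).
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    count, mean_ms, max_ms = summarize(lstat_latency_ms(target))
    print(f"{count} lstat calls: mean {mean_ms:.2f} ms, max {max_ms:.2f} ms")
```

Running it once with robinhood stopped and once with it running, on a directory under the scanned filesystem, would show whether the latency is tied to the server load. Results on a directory that was just scanned may be skewed by client-side caching.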
Thomas
_______________________________________________
robinhood-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/robinhood-support