Hi all,
After a recent restart, I've noticed that our robinhood reader on
/scratch isn't keeping up with the changelog processing.
For info, we presently have the following disk usage:
$ df -h /scratch
Size Used Avail Use% Mounted on
3.1P 2.2P 913T 71% /scratch
and inodes:
$ df -hi /scratch
Inodes IUsed IFree IUse% Mounted on
1.9G 317M 1.6G 17% /scratch
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | ======== General statistics =========
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | Daemon start time: 2016/08/03 13:21:29
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | Started modules: log_reader
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | ChangeLog reader #0:
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | fs_name = snx11038
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | mdt_name = MDT0000
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | reader_id = cl1
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | records read = 5564412
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | interesting records = 5509839
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | suppressed records = 54573
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | records pending = 665
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | last received = 2016/08/06 23:55:51
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | last read record time = 2016/07/30 05:39:38.406506
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | last read record id = 15652656781
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | last pushed record id = 15652656117
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | last committed record id = 15652648123
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | last cleared record id = 15652647394
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | read speed = 3.33 record/sec
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | processing speed ratio = 0.01
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | status = busy
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | ChangeLog stats:
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | MARK: 0, CREAT: 3853223, MKDIR: 1080539, HLINK: 87, SLINK: 53, MKNOD: 0, UNLNK: 431392
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | RMDIR: 32505, RENME: 13639, RNMTO: 0, OPEN: 0, CLOSE: 0, LYOUT: 0, TRUNC: 139066
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | SATTR: 1403, XATTR: 44, HSM: 0, MTIME: 12461, CTIME: 0, ATIME: 0
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | ==== EntryProcessor Pipeline Stats ===
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | Idle threads: 0
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | Id constraints count: 8000 (hash min=0/max=5/avg=0.5)
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | Name constraints count: 7970 (hash min=0/max=4/avg=0.5)
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | Stage | Wait | Curr | Done | Total | ms/op |
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | 0: GET_FID | 0 | 0 | 0 | 0 | 0.00 |
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | 1: GET_INFO_DB | 2377 | 0 | 4970 | 5523897 | 0.42 |
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 627 | 24 | 2 | 5037295 | 1378.33 |
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | 3: REPORTING | 0 | 0 | 0 | 3543870 | 0.00 |
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | 4: PRE_APPLY | 0 | 0 | 0 | 3652839 | 0.00 |
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | 5: DB_APPLY | 0 | 0 | 0 | 3652839 | 1.06 | 85.05% batched (avg batch size: 11.8)
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | 6: CHGLOG_CLR | 0 | 0 | 0 | 5518274 | 0.12 |
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | 7: RM_OLD_ENTRIES | 0 | 0 | 0 | 0 | 0.00 |
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | DB ops: get=190003/ins=3467847/upd=71608/rm=108969
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | --- Pipeline stage details ---
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | GET_INFO_DB : first: changelog record #15652648776, fid=[0x2186db7e8:0x4277:0x0], status=waiting
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | GET_INFO_DB : last: changelog record #15652656116, fid=[0x2186dc233:0x4193:0x0], status=waiting
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | GET_INFO_FS : first: changelog record #15652648124, fid=[0x2186dba6a:0x104fd:0x0], status=processing
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | GET_INFO_FS : last: changelog record #15652648775, fid=[0x2186db7e8:0x4277:0x0], status=done
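
To put a number on the reader lag, the timestamp of the stats dump can be compared with the "last read record time" field. The Python sketch below is not part of robinhood; the function name reader_lag is made up for illustration, and it assumes the two timestamp formats seen in the dump above.

```python
import re
from datetime import datetime, timedelta

# Two lines copied from the 2016/08/06 23:56:50 dump, wrapped as in the mail.
DUMP = """\
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | last read
record time = 2016/07/30 05:39:38.406506
"""

# \s+ also matches the line break inserted by mail wrapping, so the pattern
# works on wrapped and unwrapped log lines alike.
LAST_READ_RE = re.compile(
    r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \S+ STATS \|"
    r"\s+last\s+read\s+record\s+time\s+=\s+"
    r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+)"
)


def reader_lag(text: str) -> timedelta:
    """Return (dump timestamp - last read record time) for the first match."""
    m = LAST_READ_RE.search(text)
    if m is None:
        raise ValueError("no 'last read record time' line found")
    dumped_at = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
    last_read = datetime.strptime(m.group(2), "%Y/%m/%d %H:%M:%S.%f")
    return dumped_at - last_read


lag = reader_lag(DUMP)
print("reader lag:", lag)
```

For the dump above this gives a lag of roughly 7 days 18 hours. Running the same function over successive dumps would show whether the lag is growing or shrinking.
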
As you can see, the last read record time (2016/07/30 05:39:38) is now well
behind the current time, and the ms/op figure for the GET_INFO_FS stage has
been increasing steadily since the restart:
2016/08/03 13:26:29 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 0 | 0 | 162500 | 36.45 |
2016/08/03 13:31:29 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 187 | 24 | 1 | 244115 | 47.73 |
2016/08/03 13:36:29 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 487 | 24 | 2 | 313939 | 57.94 |
2016/08/03 13:41:29 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 231 | 24 | 187 | 374236 | 64.86 |
2016/08/03 13:46:29 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 1 | 408270 | 76.20 |
2016/08/03 13:51:29 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 20 | 24 | 1 | 447071 | 84.22 |
2016/08/03 13:56:29 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 139 | 24 | 0 | 503332 | 87.34 |
2016/08/03 14:01:29 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 365 | 24 | 0 | 568972 | 87.88 |
2016/08/03 14:06:29 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 275 | 24 | 5 | 625613 | 89.06 |
2016/08/03 14:11:30 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 131 | 24 | 2 | 655654 | 95.00 |
2016/08/03 14:16:30 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 1768 | 23 | 9 | 719474 | 94.88 |
2016/08/03 14:21:30 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 1880 | 24 | 3412 | 758622 | 98.45 |
...
2016/08/04 19:36:37 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 124 | 24 | 0 | 3572844 | 694.41 |
2016/08/04 19:41:37 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 819 | 23 | 17 | 3578030 | 695.33 |
2016/08/04 19:46:37 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 192 | 24 | 15 | 3594390 | 694.07 |
2016/08/04 19:51:37 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 791 | 24 | 13 | 3605135 | 693.69 |
2016/08/04 19:56:37 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 7 | 24 | 14 | 3615853 | 693.42 |
2016/08/04 20:01:37 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 792 | 24 | 2 | 3625351 | 693.56 |
2016/08/04 20:06:37 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 271 | 24 | 0 | 3628297 | 694.98 |
2016/08/04 20:11:37 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 1220 | 24 | 564 | 3629668 | 696.69 |
2016/08/04 20:16:37 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 313 | 3630889 | 698.44 |
2016/08/04 20:21:37 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 699 | 24 | 14 | 3632702 | 700.08 |
2016/08/04 20:26:37 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 73 | 24 | 1 | 3641367 | 700.36 |
2016/08/04 20:31:37 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 119 | 24 | 5 | 3647777 | 701.11 |
2016/08/04 20:36:38 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 474 | 24 | 9 | 3655049 | 701.63 |
2016/08/04 20:41:38 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 635 | 24 | 0 | 3664926 | 701.68 |
2016/08/04 20:46:38 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 10 | 24 | 5 | 3681797 | 700.38 |
2016/08/04 20:51:38 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 909 | 24 | 0 | 3687717 | 701.14 |
2016/08/04 20:56:38 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 243 | 24 | 1 | 3694695 | 701.75 |
2016/08/04 21:01:38 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 1430 | 24 | 6 | 3700030 | 702.67 |
2016/08/04 21:06:38 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 688 | 24 | 7 | 3705386 | 703.58 |
2016/08/04 21:11:38 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 452 | 24 | 7 | 3711084 | 704.43 |
2016/08/04 21:16:38 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 1327 | 24 | 0 | 3714445 | 705.71 |
....
2016/08/06 04:21:45 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 4745627 | 1109.65 |
2016/08/06 04:26:45 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 1 | 4747026 | 1110.84 |
2016/08/06 04:31:45 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 1 | 4748261 | 1112.07 |
2016/08/06 04:36:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 4749440 | 1113.31 |
2016/08/06 04:41:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 2 | 4751663 | 1114.31 |
2016/08/06 04:46:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 4753161 | 1115.46 |
2016/08/06 04:51:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 22 | 24 | 5 | 4754926 | 1116.57 |
2016/08/06 04:56:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 4756782 | 1117.64 |
2016/08/06 05:01:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 4758021 | 1118.86 |
2016/08/06 05:06:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 4759225 | 1120.09 |
2016/08/06 05:11:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 4760583 | 1121.28 |
2016/08/06 05:16:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 4761956 | 1122.47 |
2016/08/06 05:21:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 4763371 | 1123.65 |
2016/08/06 05:26:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 1450 | 24 | 0 | 4764607 | 1124.85 |
2016/08/06 05:31:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 473 | 24 | 0 | 4765584 | 1126.13 |
2016/08/06 05:36:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 973 | 24 | 2 | 4766861 | 1127.34 |
2016/08/06 05:41:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 392 | 24 | 0 | 4768130 | 1128.55 |
2016/08/06 05:46:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 468 | 24 | 0 | 4769280 | 1129.79 |
2016/08/06 05:51:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 564 | 24 | 0 | 4770700 | 1130.96 |
2016/08/06 05:56:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 527 | 24 | 1 | 4771929 | 1132.19 |
2016/08/06 06:01:46 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 4773259 | 1133.37 |
...
2016/08/06 22:51:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 4 | 24 | 2 | 5022452 | 1363.89 |
2016/08/06 22:56:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 5023601 | 1365.00 |
2016/08/06 23:01:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 5024697 | 1366.13 |
2016/08/06 23:06:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 5025927 | 1367.23 |
2016/08/06 23:11:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 2 | 24 | 4 | 5027150 | 1368.34 |
2016/08/06 23:16:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 24 | 0 | 5028544 | 1369.38 |
2016/08/06 23:21:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 1222 | 24 | 7 | 5029856 | 1370.43 |
2016/08/06 23:26:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 134 | 24 | 0 | 5030944 | 1371.57 |
2016/08/06 23:31:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 0 | 14 | 1 | 5032173 | 1372.67 |
2016/08/06 23:36:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 1122 | 24 | 5 | 5033107 | 1373.81 |
2016/08/06 23:41:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 40 | 24 | 1 | 5034189 | 1374.94 |
2016/08/06 23:46:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 2644 | 24 | 6 | 5035278 | 1376.02 |
2016/08/06 23:51:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 1714 | 24 | 4 | 5036208 | 1377.20 |
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | 2: GET_INFO_FS | 627 | 24 | 2 | 5037295 | 1378.33 |
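
The ms/op column moves slowly because it appears to be averaged over everything processed since the daemon started. Differencing the Total column between consecutive samples gives the recent rate instead. The Python sketch below does that for the GET_INFO_FS lines; it is not a robinhood tool, the helper names are made up for illustration, and the per-interval latency relies on the assumption that ms/op is a running average over all operations so far.

```python
import re
from datetime import datetime

# Last two samples from the listing above, wrapped as in the mail.
SAMPLES = """\
2016/08/06 23:51:50 robinhood@hpc-admin2[46510/1] STATS | 2:
GET_INFO_FS | 1714 | 24 | 4 | 5036208 | 1377.20 |
2016/08/06 23:56:50 robinhood@hpc-admin2[46510/1] STATS | 2:
GET_INFO_FS | 627 | 24 | 2 | 5037295 | 1378.33 |
"""

# Columns after the stage name: Wait | Curr | Done | Total | ms/op.
# \s+ also matches the line break inserted by mail wrapping.
SAMPLE_RE = re.compile(
    r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \S+ STATS \|\s+2:\s+GET_INFO_FS"
    r"\s+\|\s+(\d+)\s+\|\s+(\d+)\s+\|\s+(\d+)\s+\|\s+(\d+)\s+\|\s+([\d.]+)\s+\|"
)


def parse_samples(text):
    """Return a list of (timestamp, total_ops, avg_ms) per GET_INFO_FS line."""
    samples = []
    for m in SAMPLE_RE.finditer(text):
        ts = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
        samples.append((ts, int(m.group(5)), float(m.group(6))))
    return samples


def interval_stats(prev, curr):
    """Return (ops per second, mean ms per op) between two samples.

    The ms figure assumes ms/op is a running average since daemon start,
    so total time spent so far is total_ops * avg_ms.
    """
    (t0, n0, avg0), (t1, n1, avg1) = prev, curr
    ops = n1 - n0
    seconds = (t1 - t0).total_seconds()
    rate = ops / seconds
    ms_per_op = (n1 * avg1 - n0 * avg0) / ops if ops else float("nan")
    return rate, ms_per_op


samples = parse_samples(SAMPLES)
rate, ms_per_op = interval_stats(samples[0], samples[1])
print(f"{rate:.2f} ops/s, {ms_per_op:.0f} ms/op over the last interval")
```

On the last two samples this works out to roughly 3.6 operations per second and, under the running-average assumption, about 6.6 seconds per operation over that interval, which is far higher than the 1378.33 ms/op shown in the table. With 24 threads busy, 24 / 6.6 s is also about 3.6 operations per second, so the two estimates are at least consistent with each other.
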
Does anyone have suggestions on what may be the culprit?
Many thanks
Andrew