Hello!

Can you please share a) the approximate hardware and data region configuration that you are using, and b) the approximate size of your data?
How large are the 18 million entries? What's the size of the db/ directory at this point?

Regards,
-- 
Ilya Kasnacheev


Mon, 4 Mar 2019 at 17:36, Antonio Conforti <antonio.confo...@sia.eu>:

> Hello Ilya,
>
> not running out of checkpoint buffers is certainly good news.
>
> Today I ran another test to collect the information you asked for, with the
> same configuration as the test executed on 2019/03/01 and previously
> discussed in this thread.
> Below are the details of the entries per node:
>
> Cache 'DF_CMF_QUOTE(@c0)':
> +------------------------------------------------------------------+
> | Name(@)                         | DF_CMF_QUOTE(@c0)              |
> | Total entries (Heap / Off-heap) | 24050018 (0 / 24050018)        |
> | Nodes                           | 8                              |
> | Total size Min/Avg/Max          | 2419854 / 3006252.25 / 3426152 |
> | Heap size Min/Avg/Max           | 0 / 0.00 / 0                   |
> | Off-heap size Min/Avg/Max       | 2419854 / 3006252.25 / 3426152 |
> +------------------------------------------------------------------+
>
> Nodes for: DF_CMF_QUOTE(@c0)
>
> +==============================================================================================================================+
> | Node ID8(@), IP               | CPUs | Heap Used | CPU Load | Up Time      | Size (Primary / Backup)           | Hi/Mi/Rd/Wr |
> +==============================================================================================================================+
> | 269DFE65(@n7), HOST_1         | 16   | 2.67 %    | 0.03 %   | 02:25:29.411 | Total: 3147250 (3147250 / 0)      | Hi: 0       |
> |                               |      |           |          |              | Heap: 0 (0 / <n/a>)               | Mi: 0       |
> |                               |      |           |          |              | Off-Heap: 3147250 (3147250 / 0)   | Rd: 0       |
> |                               |      |           |          |              | Off-Heap Memory: <n/a>            | Wr: 0       |
> +-------------------------------+------+-----------+----------+--------------+-----------------------------------+-------------+
> | FF347987(@n6), HOST_1         | 16   | 23.93 %   | 0.03 %   | 02:25:17.106 | Total: 2820298 (2820298 / 0)      | Hi: 0       |
> |                               |      |           |          |              | Heap: 0 (0 / <n/a>)               | Mi: 0       |
> |                               |      |           |          |              | Off-Heap: 2820298 (2820298 / 0)   | Rd: 0       |
> |                               |      |           |          |              | Off-Heap Memory: <n/a>            | Wr: 0       |
> +-------------------------------+------+-----------+----------+--------------+-----------------------------------+-------------+
> | 5DF4A6EE(@n4), HOST_1         | 16   | 12.12 %   | 0.00 %   | 02:25:34.673 | Total: 3077901 (3077901 / 0)      | Hi: 0       |
> |                               |      |           |          |              | Heap: 0 (0 / <n/a>)               | Mi: 0       |
> |                               |      |           |          |              | Off-Heap: 3077901 (3077901 / 0)   | Rd: 0       |
> |                               |      |           |          |              | Off-Heap Memory: <n/a>            | Wr: 0       |
> +-------------------------------+------+-----------+----------+--------------+-----------------------------------+-------------+
> | 51133869(@n5), HOST_1         | 16   | 2.44 %    | 0.00 %   | 02:25:23.450 | Total: 3146224 (3146224 / 0)      | Hi: 0       |
> |                               |      |           |          |              | Heap: 0 (0 / <n/a>)               | Mi: 0       |
> |                               |      |           |          |              | Off-Heap: 3146224 (3146224 / 0)   | Rd: 0       |
> |                               |      |           |          |              | Off-Heap Memory: <n/a>            | Wr: 0       |
> +-------------------------------+------+-----------+----------+--------------+-----------------------------------+-------------+
> | E4CC3158(@n0), HOST_2         | 16   | 5.15 %    | 0.00 %   | 02:26:51.374 | Total: 3073950 (3073950 / 0)      | Hi: 0       |
> |                               |      |           |          |              | Heap: 0 (0 / <n/a>)               | Mi: 0       |
> |                               |      |           |          |              | Off-Heap: 3073950 (3073950 / 0)   | Rd: 0       |
> |                               |      |           |          |              | Off-Heap Memory: <n/a>            | Wr: 0       |
> +-------------------------------+------+-----------+----------+--------------+-----------------------------------+-------------+
> | EB897A74(@n2), HOST_2         | 16   | 8.85 %    | 0.03 %   | 02:26:36.007 | Total: 2938389 (2938389 / 0)      | Hi: 0       |
> |                               |      |           |          |              | Heap: 0 (0 / <n/a>)               | Mi: 0       |
> |                               |      |           |          |              | Off-Heap: 2938389 (2938389 / 0)   | Rd: 0       |
> |                               |      |           |          |              | Off-Heap Memory: <n/a>            | Wr: 0       |
> +-------------------------------+------+-----------+----------+--------------+-----------------------------------+-------------+
> | 38404C41(@n3), HOST_2         | 16   | 5.49 %    | 0.03 %   | 02:26:29.103 | Total: 3426152 (3426152 / 0)      | Hi: 0       |
> |                               |      |           |          |              | Heap: 0 (0 / <n/a>)               | Mi: 0       |
> |                               |      |           |          |              | Off-Heap: 3426152 (3426152 / 0)   | Rd: 0       |
> |                               |      |           |          |              | Off-Heap Memory: <n/a>            | Wr: 0       |
> +-------------------------------+------+-----------+----------+--------------+-----------------------------------+-------------+
> | 570D0880(@n1), HOST_2         | 16   | 26.10 %   | 0.03 %   | 02:26:41.175 | Total: 2419854 (2419854 / 0)      | Hi: 0       |
> |                               |      |           |          |              | Heap: 0 (0 / <n/a>)               | Mi: 0       |
> |                               |      |           |          |              | Off-Heap: 2419854 (2419854 / 0)   | Rd: 0       |
> |                               |      |           |          |              | Off-Heap Memory: <n/a>            | Wr: 0       |
> +------------------------------------------------------------------------------------------------------------------------------+
>
> Some comments about the test:
>
> 1) When I start the test with the cache empty, I begin to observe the
> performance degradation after about 1 hour and 15 minutes, specifically
> when I reach a total of around 18 million entries.
> Before that number of entries everything seems to be fine.
>
> Below are the end-of-checkpoint statistics read from the log file for
> node 7:
>
> Run A) for CONSISTENTID 7
> Stop time      Num pages   Elapsed (ms)
> 09:25:49.016       18353            449
> 09:28:49.941       59593           1368
> 09:31:50.057       62797           1480
> 09:34:50.234       65509           1655
> 09:37:50.489       69339           1902
> 09:40:56.005       72438           7410
> 09:44:03.367       74923          14767
> 09:47:07.403       78554          18799
> 09:50:13.981       81913          25372
> 09:53:24.503       85730          35889
> 09:56:27.091       88078          38468
> 09:59:31.535       90275          42913
> 10:02:35.038       92802          46409
> 10:05:40.214       95503          51579
> 10:08:49.639       97856          60996
> 10:11:58.841      101746          70203
> 10:15:02.559      104501          73913
> 10:18:14.915      106753          86266
> 10:21:17.593      108685          88942
> 10:24:25.780      111062          97126
> 10:27:33.036      112904         104376
> 10:30:35.121      113809         106461
> 10:33:40.040      115920         111374
> 10:39:46.645      118431         117973
> 10:42:54.245      120306         125568
> 10:45:56.420      121766         127737
> 10:49:02.531      123532         133845
> 10:52:04.470      125685         135785
> 10:55:12.588      127601         143895
> 10:58:16.771      129164         148075
> 11:00:59.904      102594         131206
>
> 2) If I stop the test when I begin to observe the performance degradation
> described in point 1), at about 11:00 AM, and wait for a while (precisely,
> until the logs of my server nodes show "Skipping checkpoint (no pages
> were modified)", so as to be sure that no pending entries are still being
> processed), and then run the test again submitting 4000 entries per second,
> I observe a degradation of performance within a short time.
>
> Below are the end-of-checkpoint statistics read from the log file for
> node 7:
>
> Run B) for CONSISTENTID 7
> Stop time      Num pages   Elapsed (ms)
> 11:23:09.120       24743          20386
> 11:28:58.436      131858         189695
> 11:32:45.824      137487         227388
> 11:33:53.565       53774          67741
>
> Please also note, as shown in the table of point 1), the progressive
> growth of the number of pages and the consequent growth of the checkpoint
> execution times.
> One more note about the pages handled by the checkpoints in run B),
> described in point 2), compared with those in run A), described in
> point 1): the number of pages handled at the start of run B) is about the
> same as at the end of run A) and not, as I expected, similar to the number
> at the start of the initial run A).
>
> Might the observations above suggest tuning some other configuration
> parameter?
>
> What may be, in your opinion, the cause of this performance degradation
> when I reach about 18 million entries?
> What system resource may be exhausted?
>
> Can you also tell me in which log you observe the different
> configurations? The configuration file used for all nodes seems to me to
> be the same, but I'm probably missing something ...
>
> Attached are the logs for the test run of 2019/03/01:
> log_ignite_190304_HOST1.gz
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2315/log_ignite_190304_HOST1.gz>
> log_ignite_190304_HOST2.gz
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2315/log_ignite_190304_HOST2.gz>
>
> Thanks,
> Antonio
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
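
One way to quantify the slowdown in the Run A figures quoted above is to convert each checkpoint into a write rate (pages per second). The sketch below is only an illustration: it uses a handful of rows copied from the node 7 table, the function names are made up for this example, and it assumes a 4096-byte page size (the Ignite default), which the thread does not confirm.

```python
# Selected checkpoint rows for node 7, copied from Run A in the quoted message:
# (checkpoint stop time, pages written, elapsed milliseconds)
RUN_A_SAMPLE = [
    ("09:25:49.016", 18353, 449),
    ("09:37:50.489", 69339, 1902),
    ("09:40:56.005", 72438, 7410),
    ("09:53:24.503", 85730, 35889),
    ("10:30:35.121", 113809, 106461),
    ("10:58:16.771", 129164, 148075),
]

# Assumption: 4096 bytes per page (Ignite's default). The thread does not
# state the configured page size, so change this if the cluster differs.
PAGE_SIZE_BYTES = 4096


def pages_per_second(pages: int, elapsed_ms: int) -> float:
    """Average number of pages written per second during one checkpoint."""
    return pages * 1000.0 / elapsed_ms


def mib_per_second(pages: int, elapsed_ms: int,
                   page_size: int = PAGE_SIZE_BYTES) -> float:
    """Same rate expressed in MiB/s under the assumed page size."""
    return pages_per_second(pages, elapsed_ms) * page_size / (1024 * 1024)


def checkpoint_rates(rows):
    """Return (stop_time, pages/s, MiB/s) for each checkpoint row."""
    return [
        (stop, pages_per_second(pages, ms), mib_per_second(pages, ms))
        for stop, pages, ms in rows
    ]


if __name__ == "__main__":
    for stop, pps, mibps in checkpoint_rates(RUN_A_SAMPLE):
        print(f"{stop}  {pps:10.0f} pages/s  {mibps:8.2f} MiB/s")
```

On these rows the rate falls from roughly 40,000 pages/s at the first checkpoint to under 1,000 pages/s by 10:58, while the page count per checkpoint grows only about sevenfold. That pattern would be consistent with the disk or the page-write path becoming the bottleneck rather than the amount of dirty data alone, but the logs and the db/ directory size requested above are needed to confirm it.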