That's a good point. I would also look at increasing tserver.total.mutation.queue.max. Are you seeing hold times? If not, I would keep pushing harder until you do, then move to multiple tablet servers. Do you have any GC logs?
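For anyone following along, properties like this can be changed live from the Accumulo shell; a sketch, with the 100M value chosen purely for illustration (tune it against observed hold times, not this number):

```shell
# In the Accumulo shell: raise the total mutation queue cap system-wide.
# The 100M value is illustrative only.
config -s tserver.total.mutation.queue.max=100M

# Confirm the effective setting:
config -f tserver.total.mutation.queue.max
```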
> On July 6, 2017 at 4:47 AM Cyrille Savelief <[email protected]> wrote:
>
> Are you sure Accumulo is not waiting for your app's data? There might be
> GC pauses in your ingest code (we have already experienced that).
>
> On Thu, Jul 6, 2017 at 10:32, Massimilian Mattetti <[email protected]> wrote:
>
> > Thank you all for the suggestions.
> >
> > About the native memory map: I checked the logs on each tablet server
> > and it was loaded correctly (of course tserver.memory.maps.native.enabled
> > was set to true), so GC pauses should not be the problem after all. I
> > managed to get a much better ingestion graph by reducing the native map
> > size to 2GB and increasing the number of Batch Writer threads from the
> > default (3 was really bad for my configuration) to 10 (I think it does
> > not make sense to have more threads than tablet servers, am I right?).
> >
> > The configuration that I used for the table is:
> >
> > "table.file.replication": "2",
> > "table.compaction.minor.logs.threshold": "3",
> > "table.durability": "flush",
> > "table.split.threshold": "1G"
> >
> > while for the tablet servers it is:
> >
> > "tserver.wal.blocksize": "1G",
> > "tserver.walog.max.size": "2G",
> > "tserver.memory.maps.max": "2G",
> > "tserver.compaction.minor.concurrent.max": "50",
> > "tserver.compaction.major.concurrent.max": "20",
> > "tserver.wal.replication": "2",
> > "tserver.compaction.major.thread.files.open.max": "15"
> >
> > The new graph: [inline image]
> >
> > I still have the problem of CPU usage staying below 20%, so I am
> > thinking of running multiple tablet servers per node (5 or 10) in order
> > to maximize CPU usage. Beyond that, I do not have any other ideas on how
> > to stress these servers with ingestion.
> >
> > Any suggestions are very welcome. Meanwhile, thank you all again for
> > your help.
> > Best Regards,
> > Massimiliano
> >
> > From: Jonathan Wonders <[email protected]>
> > To: [email protected]
> > Date: 06/07/2017 04:01
> > Subject: Re: maximize usage of cluster resources during ingestion
> >
> > Hi Massimilian,
> >
> > Are you seeing held commits during the ingest pauses? Just based on
> > having looked at many similar graphs in the past, this might be one of
> > the major culprits. A tablet server has a memory region with a bounded
> > size (tserver.memory.maps.max) where it buffers data that has not yet
> > been written to RFiles (through the process of minor compaction). The
> > region is segmented by tablet, and each tablet can have a buffer that is
> > undergoing ingest as well as a buffer that is undergoing minor
> > compaction. A memory manager decides when to initiate minor compactions
> > for the tablet buffers; the default implementation tries to keep the
> > memory region 80-90% full while preferring to compact the largest tablet
> > buffers. Creating larger RFiles during minor compaction should lead to
> > fewer major compactions.
> >
> > During a minor compaction, the tablet buffer still "consumes" memory
> > within the in-memory map, and high ingest rates can exhaust the
> > remaining capacity. The default memory manager uses an adaptive strategy
> > to predict the expected memory usage and makes compaction decisions that
> > should maintain some free memory. Batch writers can be bursty and a bit
> > unpredictable, which could throw off these estimates. Also, depending on
> > the ingest profile, sometimes an in-memory tablet buffer will consume a
> > large percentage of the total buffer. This leads to long minor
> > compactions when the buffer size is large, which can give ingest enough
> > time to exhaust the buffer before that memory can be reclaimed.
> > When a tablet server has to block ingest, it can affect client ingest
> > rates to other tablet servers due to the way that batch writers work.
> > This can lead to other tablet servers underestimating future ingest
> > rates, which can further exacerbate the problem.
> >
> > There are some configuration changes that could reduce the severity of
> > held commits, although they might reduce peak ingest rates. Reducing the
> > in-memory map size can reduce the maximum pause time due to held
> > commits. Adding additional tablets should help avoid the problem of a
> > single tablet buffer consuming a large percentage of the memory region;
> > it might be better to aim for ~20 tablets per server if your problem
> > allows for it. It is also possible to replace the memory manager with a
> > custom one. I've tried this in the past and have seen stability
> > improvements from making the memory thresholds less aggressive (50-75%
> > full). This did reduce peak ingest rate in some cases, but that was a
> > reasonable tradeoff.
> >
> > Based on your current configuration, if a tablet server is serving 4
> > tablets and has a 32GB buffer, your first minor compactions will be at
> > least 8GB, and they will probably grow larger over time until the
> > tablets naturally split. Consider how long it would take to write this
> > RFile compared to your peak ingest rate. As others have suggested, make
> > sure to use the native maps. Based on your current JVM heap size, using
> > the Java in-memory map would probably lead to an OOME or very bad GC
> > performance.
> >
> > Accumulo can trace minor compaction durations, so you can get a feel
> > for max pause times or measure the effect of configuration changes.
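The sizing argument above can be made concrete with a little arithmetic. A rough sketch, taking the 32GB map and 4 tablets from the configuration under discussion and the ~120MB/s per-server disk write throughput reported later in the thread (all inputs are assumptions for illustration, not measurements):

```python
# Rough estimate of minor-compaction pause exposure.
# Inputs taken from the thread: 32 GiB in-memory map, 4 tablets per
# server, ~120 MiB/s disk write throughput. Illustrative only.
GIB = 1024 ** 3
MIB = 1024 ** 2

memory_map_bytes = 32 * GIB
tablets_per_server = 4
disk_write_bytes_per_sec = 120 * MIB

# With ingest spread evenly, each tablet's buffer can grow to roughly a
# quarter of the map before the memory manager flushes it.
per_tablet_buffer = memory_map_bytes // tablets_per_server

# Time to write that buffer out as an RFile at full disk throughput;
# ingest into that tablet is exposed to stalls for a window of this order.
flush_seconds = per_tablet_buffer / disk_write_bytes_per_sec

print(f"per-tablet buffer: {per_tablet_buffer / GIB:.0f} GiB")
print(f"single minor compaction: ~{flush_seconds:.0f} s at disk speed")
```

A one-minute-plus flush window per tablet is consistent with the long zero-ingest valleys in the graphs, which is why reducing the map size or adding tablets shortens the pauses.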
> > Cheers,
> > --Jonathan
> >
> > On Wed, Jul 5, 2017 at 7:16 PM, Dave Marion <[email protected]> wrote:
> >
> > Based on what Cyrille said, I would look at garbage collection;
> > specifically, I would look at how much of your newly allocated objects
> > spill into the old generation before they are flushed to disk.
> > Additionally, I would turn off the debug log, or log to SSDs if you
> > have them. Another thought, seeing that you have 256GB RAM per node, is
> > to run multiple tablet servers per node. Do you have 10 threads on your
> > Batch Writers? What about the Batch Writer latency? Is it too low, such
> > that you are not filling the buffer?
> >
> > From: Massimilian Mattetti [mailto:[email protected]]
> > Sent: Wednesday, July 05, 2017 8:37 AM
> > To: [email protected]
> > Subject: maximize usage of cluster resources during ingestion
> >
> > Hi all,
> >
> > I have an Accumulo 1.8.1 cluster made up of 12 bare-metal servers. Each
> > server has 256GB of RAM and 2 x 10-core CPUs. 2 machines are used as
> > masters (running the HDFS NameNodes, Accumulo Master, and Monitor). The
> > other 10 machines have 12 disks of 1TB each (11 used by the HDFS
> > DataNode process) and are running Accumulo TServer processes. All the
> > machines are connected via a 10Gb network, and 3 of them are running
> > ZooKeeper. I have run some heavy ingestion tests on this cluster, but I
> > have never been able to reach more than 20% CPU usage on each tablet
> > server. I am running an ingestion process (using batch writers) on each
> > data node. The table is pre-split in order to have 4 tablets per tablet
> > server. Monitoring the network, I have seen that data is received/sent
> > by each node at a peak rate of about 120MB/s / 100MB/s, while the
> > aggregate disk write throughput on each tablet server is around 120MB/s.
> > The table configuration I am playing with is:
> >
> > "table.file.replication": "2",
> > "table.compaction.minor.logs.threshold": "10",
> > "table.durability": "flush",
> > "table.file.max": "30",
> > "table.compaction.major.ratio": "9",
> > "table.split.threshold": "1G"
> >
> > while the tablet server configuration is:
> >
> > "tserver.wal.blocksize": "2G",
> > "tserver.walog.max.size": "8G",
> > "tserver.memory.maps.max": "32G",
> > "tserver.compaction.minor.concurrent.max": "50",
> > "tserver.compaction.major.concurrent.max": "8",
> > "tserver.total.mutation.queue.max": "50M",
> > "tserver.wal.replication": "2",
> > "tserver.compaction.major.thread.files.open.max": "15"
> >
> > The tablet server heap has been set to 32GB.
> >
> > From the Monitor UI: [inline image]
> >
> > As you can see, I have a lot of valleys in which the ingestion rate
> > reaches 0. What would be a good procedure to identify the bottleneck
> > that causes these 0-ingestion-rate periods?
> >
> > Thanks.
> >
> > Best Regards,
> > Max
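One low-effort way to attack the closing question is to extract the exact start/end times of the zero-ingest valleys from a scraped ingest-rate series, then line those windows up with minor-compaction and hold-time entries in the tserver logs. A hypothetical sketch (the helper name, threshold, and sample data are all invented for illustration):

```python
# Hypothetical helper: given (timestamp_seconds, entries_per_second)
# samples scraped from the monitor, return (start, end) windows where
# ingest stalled, for correlation with tablet-server log entries.
def find_stall_windows(samples, threshold=0.0):
    windows, start = [], None
    for ts, rate in samples:
        if rate <= threshold and start is None:
            start = ts                    # entering a valley
        elif rate > threshold and start is not None:
            windows.append((start, ts))   # leaving the valley
            start = None
    if start is not None:                 # series ended inside a valley
        windows.append((start, samples[-1][0]))
    return windows

# Invented sample data: two valleys, one still open at the end.
samples = [(0, 90e3), (30, 85e3), (60, 0.0), (90, 0.0),
           (120, 70e3), (150, 0.0)]
print(find_stall_windows(samples))
```

If a valley's start consistently coincides with a large minor compaction kicking off (or a "Commits are held" message), that points at held commits rather than, say, client-side GC pauses.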
