zdravko123 opened a new issue #2930: URL: https://github.com/apache/couchdb/issues/2930
Every day we get a spike in disk IOPS for about 3-5 minutes, which causes delays in the replications. This is a bit of an issue for us, as we need the databases to synchronize almost instantly, which they generally do.

I have thought about upgrading the disk IOPS to 5x higher, but I suspect we would still get some spikes, and it's expensive. We could go up to 10x disk IOPS if we needed to. I have also considered splitting out the data from the views to eliminate some issues.

I suspect it is related to the beam queuing technology: it might be crashing, or garbage collecting and dumping things to disk, or perhaps writing log files. Is there anything I can do to investigate? My Linux skills aren't as good as my Windows skills, but I'm happy to have a poke around along with another developer. I have read on other forums that this causes high CPU, but for us it seems to be high IOPS, which is what causes the outages. It happens on all 4 nodes in the cluster at the same time.

More information about our setup can be found here: https://github.com/apache/couchdb/issues/2298

We are in the process of eliminating the need for the synchronizations. We currently replicate FULLDB --> MiniDB, and we are changing our application to write directly to the mini database rather than the full one, so it won't need the real-time replication. That is about 1 month away, and we need an interim fix until then. We cannot guarantee that this won't affect other things, because we also had a 2-3 minute outage when an IOPS spike caused the whole cluster to become unresponsive (i.e. bad gateway errors).

So I guess I'm looking for assistance. What could cause this on a daily cycle? What can we investigate that might be related, and what can we do to fix it, apart from increasing the disk IOPS?

We do realize the tech isn't suitable for our implementation: we are using it in a transactional, ACID way rather than an eventually consistent one. That was a fundamental design flaw/assumption, which we are trying to rectify.
That was just the trade-off we made at the time in order to use PouchDB for client-side offline storage and replication.

Any help would be much appreciated!

Thanks
