[GitHub] [nifi-minifi-cpp] bakaid commented on a change in pull request #715: MINIFICPP-1126 - Reduce sawtooth in memory usage of rocksdb flowfile …

2020-02-04 Thread GitBox
bakaid commented on a change in pull request #715: MINIFICPP-1126 - Reduce 
sawtooth in memory usage of rocksdb flowfile …
URL: https://github.com/apache/nifi-minifi-cpp/pull/715#discussion_r374670499
 
 

 ##
 File path: extensions/rocksdb-repos/FlowFileRepository.cpp
 ##
 @@ -97,23 +91,33 @@ void FlowFileRepository::flush() {
   }
 }
 
-void FlowFileRepository::run() {
-  // threshold for purge
+void FlowFileRepository::printStats() {
+  std::string key_count;
+  db_->GetProperty("rocksdb.estimate-num-keys", &key_count);
+
+  std::string table_readers;
+  db_->GetProperty("rocksdb.estimate-table-readers-mem", &table_readers);
+
+  std::string all_memtables;
+  db_->GetProperty("rocksdb.cur-size-all-mem-tables", &all_memtables);
+
+  logger_->log_info("Repository stats: key count: %zu, table readers size: %zu, all memory tables size: %zu",
+  key_count, table_readers, all_memtables);
+}
 
+void FlowFileRepository::run() {
+  auto last = std::chrono::system_clock::now();
 
 Review comment:
   Please use `std::chrono::steady_clock`; `system_clock` is not monotonic and 
is not suitable for measuring intervals.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [nifi-minifi-cpp] bakaid commented on a change in pull request #715: MINIFICPP-1126 - Reduce sawtooth in memory usage of rocksdb flowfile …

2020-01-28 Thread GitBox
bakaid commented on a change in pull request #715: MINIFICPP-1126 - Reduce 
sawtooth in memory usage of rocksdb flowfile …
URL: https://github.com/apache/nifi-minifi-cpp/pull/715#discussion_r371838409
 
 

 ##
 File path: extensions/rocksdb-repos/FlowFileRepository.h
 ##
 @@ -105,8 +105,15 @@ class FlowFileRepository : public core::Repository, 
public std::enable_shared_fr
 options.create_if_missing = true;
 options.use_direct_io_for_flush_and_compaction = true;
 options.use_direct_reads = true;
+
+// Write buffers are used as db oepration logs. When they get filled the 
events are merged and serialized.
+// The default size is 64MB.
+// In our case it's usually too much, causing sawtooth in memory 
consumption. (Consumes more than the whole MiniFi)
+// To avoid DB write issues during heavy load it's recommended to have 
high number of buffer.
+// Rocksdb's stall featur can also trigger in case the number of buffers 
is >= 3.
 
 Review comment:
   typo: featur -> feature




[GitHub] [nifi-minifi-cpp] bakaid commented on a change in pull request #715: MINIFICPP-1126 - Reduce sawtooth in memory usage of rocksdb flowfile …

2020-01-28 Thread GitBox
bakaid commented on a change in pull request #715: MINIFICPP-1126 - Reduce 
sawtooth in memory usage of rocksdb flowfile …
URL: https://github.com/apache/nifi-minifi-cpp/pull/715#discussion_r371838161
 
 

 ##
 File path: extensions/rocksdb-repos/FlowFileRepository.h
 ##
 @@ -105,8 +105,15 @@ class FlowFileRepository : public core::Repository, 
public std::enable_shared_fr
 options.create_if_missing = true;
 options.use_direct_io_for_flush_and_compaction = true;
 options.use_direct_reads = true;
+
+// Write buffers are used as db oepration logs. When they get filled the 
events are merged and serialized.
 
 Review comment:
   typo: oepration -> operation




[GitHub] [nifi-minifi-cpp] bakaid commented on a change in pull request #715: MINIFICPP-1126 - Reduce sawtooth in memory usage of rocksdb flowfile …

2020-01-28 Thread GitBox
bakaid commented on a change in pull request #715: MINIFICPP-1126 - Reduce 
sawtooth in memory usage of rocksdb flowfile …
URL: https://github.com/apache/nifi-minifi-cpp/pull/715#discussion_r371702847
 
 

 ##
 File path: extensions/rocksdb-repos/FlowFileRepository.h
 ##
 @@ -103,6 +105,9 @@ class FlowFileRepository : public core::Repository, public 
std::enable_shared_fr
 options.create_if_missing = true;
 options.use_direct_io_for_flush_and_compaction = true;
 options.use_direct_reads = true;
+options.write_buffer_size = 8 << 20;
 
 Review comment:
   I would like to see more in-code comments about the rationale behind these 
values.
   Why 8 MB? Why do we set max_write_buffer_number to 4? As far as I understand, 
that means it can keep 4 of these memtables in memory, and the default is 2.




[GitHub] [nifi-minifi-cpp] bakaid commented on a change in pull request #715: MINIFICPP-1126 - Reduce sawtooth in memory usage of rocksdb flowfile …

2020-01-28 Thread GitBox
bakaid commented on a change in pull request #715: MINIFICPP-1126 - Reduce 
sawtooth in memory usage of rocksdb flowfile …
URL: https://github.com/apache/nifi-minifi-cpp/pull/715#discussion_r371703217
 
 

 ##
 File path: extensions/rocksdb-repos/FlowFileRepository.cpp
 ##
 @@ -69,14 +68,8 @@ void FlowFileRepository::flush() {
 batch.Delete(keys[i]);
   }
 
-
-  if (db_->Write(rocksdb::WriteOptions(), &batch).ok()) {
-logger_->log_trace("Decrementing %u from a repo size of %u", decrement_total, repo_size_.load());
-if (decrement_total > repo_size_.load()) {
-  repo_size_ = 0;
-} else {
-  repo_size_ -= decrement_total;
-}
+  if (!db_->Write(rocksdb::WriteOptions(), &batch).ok()) {
+logger_->log_warn("Failed to execute batch operation when flushing FlowFileRepository");
 
 Review comment:
   Agreed, but I think it is something we would need, done properly. Is there a 
follow-up issue for it?




[GitHub] [nifi-minifi-cpp] bakaid commented on a change in pull request #715: MINIFICPP-1126 - Reduce sawtooth in memory usage of rocksdb flowfile …

2020-01-28 Thread GitBox
bakaid commented on a change in pull request #715: MINIFICPP-1126 - Reduce 
sawtooth in memory usage of rocksdb flowfile …
URL: https://github.com/apache/nifi-minifi-cpp/pull/715#discussion_r371693362
 
 

 ##
 File path: extensions/rocksdb-repos/FlowFileRepository.cpp
 ##
 @@ -89,23 +82,28 @@ void FlowFileRepository::flush() {
   }
 }
 
-void FlowFileRepository::run() {
-  // threshold for purge
+void FlowFileRepository::printStats() {
 
 Review comment:
   If I understand correctly, this is run every purge_period, which is, by 
default, 2.5 s.
   4 log lines every 2.5 seconds is way too much noise, affecting everyone, as 
virtually everyone uses this FlowFileRepo and the default log level is info. 
This would account for the majority of logs in many cases.
   I think this should definitely be at debug level.

