Github user cammachusa commented on a diff in the pull request:
https://github.com/apache/nifi/pull/2160#discussion_r140365671
--- Diff:
nifi-nar-bundles/nifi-kudu-bundle/nifi-kudu-processors/src/main/java/org/apache/nifi/processors/kudu/AbstractKudu.java
---
@@ -94,6 +97,29 @@
.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
.build();
+ protected static final PropertyDescriptor FLUSH_MODE = new PropertyDescriptor.Builder()
+ .name("Flush Mode")
+ .description("Set the new flush mode for a kudu session\n" +
+ "AUTO_FLUSH_SYNC: the call returns when the operation is persisted, else it throws an exception.\n" +
+ "AUTO_FLUSH_BACKGROUND: the call returns when the operation has been added to the buffer. This call should normally perform only fast in-memory" +
+ " operations but it may have to wait when the buffer is full and there's another buffer being flushed.\n" +
+ "MANUAL_FLUSH: the call returns when the operation has been added to the buffer, else it throws a KuduException if the buffer is full.")
+ .allowableValues(SessionConfiguration.FlushMode.values())
+ .defaultValue(SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND.toString())
+ .required(true)
+ .build();
+
+ protected static final PropertyDescriptor BATCH_SIZE = new PropertyDescriptor.Builder()
+ .name("Batch Size")
+ .description("Set the number of operations that can be buffered, between 2 - 100000. " +
+ "Depend on your memory size, and data size per row set an appropriate batch size. " +
+ "Gradually increase this number to find out your best one for best performance")
+ .defaultValue("100")
--- End diff --
As I noted in the description, the right value depends on the available
memory, the size of the rows being inserted, and the size of the cluster.
Setting the buffer size too large won't help, and neither will setting it too
small. As noted, developers have to find this number for their own
environment. A lot of people hit peak performance at 50 on a single-machine
Kudu cluster. My colleague hit peak performance at 3500 on a 6-node cluster
(10 CPUs, 64 GB of memory each). I picked 100 somewhat arbitrarily because I
saw it in other Put-xxx processors, but I don't want to use 1000, since most
developers test on a single machine and would leave this default value
unchanged.
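
For illustration only, finding the batch size for a given environment could look like the sweep below. This is a minimal sketch and not part of the PR: the class `BatchSizeSweep`, the method `pickBatchSize`, and the `rowsPerSecond` hook are hypothetical names, and the throughput curve in `main` is synthetic. In a real test, the hook would run an insert workload against the Kudu cluster at the given batch size and report the measured rate.

```java
import java.util.function.IntToDoubleFunction;

public class BatchSizeSweep {

    // Bounds quoted in the "Batch Size" property description.
    static final int MIN_BATCH = 2;
    static final int MAX_BATCH = 100000;

    /**
     * Returns the candidate batch size with the highest measured throughput.
     * rowsPerSecond is a hypothetical hook: a real test would run an insert
     * workload at that batch size and return the observed rows per second.
     */
    static int pickBatchSize(int[] candidates, IntToDoubleFunction rowsPerSecond) {
        if (candidates.length == 0) {
            throw new IllegalArgumentException("no candidates");
        }
        int best = -1;
        double bestRate = Double.NEGATIVE_INFINITY;
        for (int candidate : candidates) {
            if (candidate < MIN_BATCH || candidate > MAX_BATCH) {
                throw new IllegalArgumentException("batch size out of range: " + candidate);
            }
            double rate = rowsPerSecond.applyAsDouble(candidate);
            if (rate > bestRate) {
                bestRate = rate;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Synthetic curve that peaks at 50; it stands in for real measurements.
        IntToDoubleFunction synthetic = b -> 1000.0 - Math.abs(b - 50);
        int[] candidates = {10, 50, 100, 500, 1000, 3500};
        System.out.println("best batch size: " + pickBatchSize(candidates, synthetic));
    }
}
```

Sweeping a handful of candidates from small to large follows the "gradually increase this number" advice in the property description; the winner will differ per cluster, which is why the default stays conservative.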
---