nicoloboschi commented on code in PR #18017:
URL: https://github.com/apache/pulsar/pull/18017#discussion_r993535315


##########
pulsar-io/jdbc/core/src/main/java/org/apache/pulsar/io/jdbc/JdbcAbstractSink.java:
##########
@@ -213,63 +223,90 @@ protected enum MutationType {
 
 
     private void flush() {
-        // if not in flushing state, do flush, else return;
         if (incomingList.size() > 0 && isFlushing.compareAndSet(false, true)) {
-            if (log.isDebugEnabled()) {
-                log.debug("Starting flush, queue size: {}", incomingList.size());
-            }
-            if (!swapList.isEmpty()) {
-                throw new IllegalStateException("swapList should be empty since last flush. swapList.size: "
-                        + swapList.size());
-            }
-            synchronized (this) {
-                List<Record<T>> tmpList;
-                swapList.clear();
+            boolean needAnotherRound;
+            final Deque<Record<T>> swapList = new LinkedList<>();
+
+            synchronized (incomingList) {
+                if (log.isDebugEnabled()) {
+                    log.debug("Starting flush, queue size: {}", incomingList.size());
+                }
+                final int actualBatchSize = batchSize > 0 ? Math.min(incomingList.size(), batchSize) :
+                        incomingList.size();
 
-                tmpList = swapList;
-                swapList = incomingList;
-                incomingList = tmpList;
+                for (int i = 0; i < actualBatchSize; i++) {
+                    swapList.add(incomingList.removeFirst());
+                }
+                needAnotherRound = batchSize > 0 && !incomingList.isEmpty() && incomingList.size() >= batchSize;
             }
+            long start = System.nanoTime();
 
             int count = 0;
             try {
+                PreparedStatement currentBatch = null;
+                final List<Mutation> mutations = swapList

Review Comment:
   Because `swapList` is going to change (items are removed from it) while we
   loop over `mutations`.
   I think this is the best solution in terms of code readability. The memory
   footprint impact should be negligible: the list is small (usually `batchSize`
   items or fewer), and building `mutations` does not duplicate the data held in
   `swapList`.
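
   To illustrate the point, here is a minimal sketch, not the PR's actual code.
   It assumes the mutations are built up front from `swapList` and that records
   are taken off `swapList` as they are processed. The class name, the `flush`
   signature and the `String` stand-ins for `Record<T>` and `Mutation` are all
   hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.stream.Collectors;

public class SnapshotThenDrain {

    // Hypothetical stand-in for the sink's flush loop: Strings replace
    // Record<T> and Mutation so the sketch stays self-contained.
    static List<String> flush(Deque<String> swapList) {
        // Build the mutations into a separate list first. The loop below
        // then iterates over this list, not over swapList itself.
        final List<String> mutations = swapList.stream()
                .map(r -> "mutation(" + r + ")")
                .collect(Collectors.toList());

        final List<String> processed = new ArrayList<>();
        for (String mutation : mutations) {
            // Removing from swapList here is safe because the iteration
            // is over `mutations`; swapList has no live iterator.
            String record = swapList.removeFirst();
            processed.add(record + ":" + mutation);
        }
        return processed;
    }
}
```

   Looping over `swapList` directly while calling `removeFirst()` on it would
   normally fail with `ConcurrentModificationException`, which is what the
   separate list avoids. The extra list only holds one small object per record
   in the batch.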



