phet commented on a change in pull request #3448:
URL: https://github.com/apache/gobblin/pull/3448#discussion_r778531535
##########
File path:
gobblin-modules/gobblin-service-kafka/src/main/java/org/apache/gobblin/service/StreamingKafkaSpecConsumer.java
##########
@@ -111,13 +113,15 @@ public StreamingKafkaSpecConsumer(Config config,
MutableJobCatalog jobCatalog) {
try {
Pair<SpecExecutor.Verb, Spec> specPair = _jobSpecQueue.take();
- _metrics.specConsumerJobSpecDeq.mark();
+ int numSpecFetched = 0;
do {
+ _metrics.specConsumerJobSpecDeq.mark();
+ numSpecFetched ++;
changesSpecs.add(specPair);
// if there are more elements then pass them along in this call
specPair = _jobSpecQueue.poll();
- } while (specPair != null);
+ } while (specPair != null && numSpecFetched < _jobSpecQueueSize);
Review comment:
although it's likely not terribly detrimental to 'reuse' that configuration
for this other purpose, I don't fully see the connection between the two.
while we wouldn't want to take just one element, since that could unblock only
a single addition by the queue's producer, it may nonetheless be non-optimal
to drain the queue entirely. profiling would help choose the right relationship
between producing and consuming, but maybe we'd begin somewhere in [.05, .5]
of capacity. again though, this seems a small performance optimization we
ought not to over-engineer, unless some presenting problem surfaces. still,
don't forget about the `.drainTo` method, which removes up to the desired
number of elements all at once -
https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/LinkedBlockingQueue.html#drainTo-java.util.Collection-int-
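to illustrate the suggestion, here's a minimal standalone sketch of the
blocking-take-then-bounded-drain pattern. the queue, its capacity, and the
batch-size fraction are hypothetical stand-ins for `_jobSpecQueue` and
`_jobSpecQueueSize`, not the actual Gobblin code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

public class DrainToSketch {
    public static void main(String[] args) throws InterruptedException {
        // hypothetical queue standing in for _jobSpecQueue
        int capacity = 100;
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(capacity);
        for (int i = 0; i < 10; i++) {
            queue.put("spec-" + i);
        }

        // block for the first element, so the consumer never busy-waits
        List<String> batch = new ArrayList<>();
        batch.add(queue.take());

        // then drain up to a bounded batch in one bulk operation instead of
        // polling one element at a time; .5 of capacity is an arbitrary
        // starting point, as suggested above (minus the element already taken)
        int maxDrain = capacity / 2 - 1;
        queue.drainTo(batch, maxDrain);

        // only 10 specs were queued, so the batch holds all of them
        System.out.println(batch.size());
    }
}
```

`drainTo(collection, maxElements)` transfers elements atomically with respect
to each element, avoiding the per-element lock traffic of repeated `poll()`
calls, while the bound keeps a fast producer from starving other work.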
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]