HeartSaVioR edited a comment on issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer
URL: https://github.com/apache/spark/pull/22138#issuecomment-472479384
 
 
   UPDATE: just added a log message to record when a Kafka consumer is created.
   
   * master: https://github.com/HeartSaVioR/spark/tree/SPARK-25151-master-ref-debugging
   * patch: https://github.com/HeartSaVioR/spark/tree/SPARK-25151-debugging
   
   I've also corrected my spark-shell invocation to use `local[*]` instead of `local[1]`, which had prevented concurrent access in the previous experiment. FYI, my laptop has 4 physical cores (8 logical cores).
   
   ```
   ./bin/spark-shell --master "local[*]" \
     --packages org.apache.spark:spark-sql-kafka-0-10_2.12:<version> \
     --driver-memory 6G \
     > >(tee -a stdout-master-experiment.log) \
     2> >(tee -a stderr-master-experiment.log >&2)
   ```
   
   I've collected the count of Kafka consumer creations via the command below:
   
   ```
   grep "creating new Kafka consumer" logfile | wc -l
   ```
   
   and the count of fetch requests to Kafka via the command below:
   
   ```
   grep "fetching data from Kafka consumer" logfile | wc -l
   ```
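   The two counts can also be collected in one pass over the log. A minimal Python sketch, assuming the log lines contain the exact phrases grepped above (`count_kafka_events` is a made-up helper name, and `logfile` stands for the actual log path):

```python
def count_kafka_events(path):
    """Count Kafka consumer creations and fetch requests in one pass.

    Matches the same two log phrases as the grep commands above.
    Returns a (created, fetched) tuple.
    """
    created = fetched = 0
    with open(path) as f:
        for line in f:
            if "creating new Kafka consumer" in line:
                created += 1
            if "fetching data from Kafka consumer" in line:
                fetched += 1
    return created, fetched
```

   For example, `count_kafka_events("stderr-master-experiment.log")` would return both numbers at once.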
   
   The same query was run against the same data: 497 batches in total.
   
   branch | Kafka consumers created | fetch requests
   ------ | ----------------------- | --------------
   master | 1986 | 2837
   patch | 8 | 1706
   
   The result of the experiment appears to confirm that the patch properly caches and serves consumers under concurrent usage (4 partitions * 2 concurrent streams = 8, including usage from the driver), and that it also properly caches fetched data.
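   For intuition only, the keyed-pooling behavior described above (one cached consumer per key, reused across batches) can be sketched in a few lines of Python. This is an illustration of the idea, not the patch's actual Commons Pool based implementation, and every name in it is made up:

```python
from collections import defaultdict

class KeyedConsumerPool:
    """Toy keyed pool: one idle list per key, e.g. (stream, partition)."""
    def __init__(self, factory):
        self.factory = factory          # creates a new consumer for a key
        self.idle = defaultdict(list)   # key -> idle consumers
        self.created = 0                # total consumers ever created

    def borrow(self, key):
        if self.idle[key]:
            return self.idle[key].pop()  # reuse a cached consumer
        self.created += 1
        return self.factory(key)         # cache miss: create a new one

    def release(self, key, consumer):
        self.idle[key].append(consumer)  # return it to the pool

pool = KeyedConsumerPool(lambda key: ("consumer", key))
for _batch in range(497):                # 497 batches, as in the experiment
    for stream in range(2):              # 2 concurrent streams
        for partition in range(4):       # 4 partitions
            key = (stream, partition)
            c = pool.borrow(key)
            pool.release(key, c)
print(pool.created)                      # 8 = 4 partitions * 2 streams
```

   Only the first batch creates consumers; every later batch borrows them back from the pool, which mirrors why the patch's creation count stays flat at 8 across 497 batches.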
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
