[GitHub] [spark] koeninger commented on issue #27022: [SPARK-28415][DSTREAMS] Add messageHandler to Kafka 10 direct stream API #25205

GitBox Thu, 02 Jan 2020 13:29:29 -0800

koeninger commented on issue #27022: [SPARK-28415][DSTREAMS] Add messageHandler 
to Kafka 10 direct stream API #25205
URL: https://github.com/apache/spark/pull/27022#issuecomment-570357155
 
 
   Do you have a minimal reproducible case showing the difference in memory 
usage?  
   
   My expectation would be that, if the very first thing you did with the 
dstream was call foreachRDD and then rdd.foreachPartition, the memory usage 
would be comparable to what you are doing here.  Either way, it's an iterator 
backed by a Kafka consumer that has to hold the whole ConsumerRecord in 
memory.  It's just a question of whether your message conversion happens 
before or after next() returns from the iterator, right?  Or am I missing 
something?
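
   To make the before/after next() distinction concrete, here is a minimal 
sketch in plain Python.  It is only a model of the argument, not Spark or 
Kafka code: Record, consumer_iterator, with_handler, and value_length are 
hypothetical stand-ins, and the record count and size are arbitrary.  In both 
variants the full record exists in memory at the moment it is converted, and 
only one record is held at a time.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


@dataclass
class Record:
    """Hypothetical stand-in for a Kafka ConsumerRecord (not the real class)."""
    offset: int
    value: bytes


def consumer_iterator(count: int, size: int) -> Iterator[Record]:
    """Yield one full record at a time, like a consumer-backed iterator."""
    for offset in range(count):
        yield Record(offset, bytes(size))


def with_handler(records: Iterator[Record],
                 handler: Callable[[Record], T]) -> Iterator[T]:
    """Convert each record before next() returns (messageHandler style)."""
    for record in records:
        # The whole record is already materialized here, before conversion.
        yield handler(record)


def value_length(record: Record) -> int:
    """Example conversion: keep only the payload length."""
    return len(record.value)


# Variant A: conversion happens inside the iterator, before next() returns.
total_a = sum(with_handler(consumer_iterator(1000, 1024), value_length))

# Variant B: conversion happens in the caller, after next() returns
# (analogous to converting first thing inside foreachPartition).
total_b = sum(value_length(r) for r in consumer_iterator(1000, 1024))
```

   Both variants compute the same result and touch records one at a time.  
A reproducible case would need to show where the real code paths diverge 
from this model.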



