[GitHub] spark issue #21685: [SPARK-24707][DSTREAMS] Enable spark-kafka-streaming to ...

sidhavratha Mon, 02 Jul 2018 05:40:05 -0700

Github user sidhavratha commented on the issue:

    https://github.com/apache/spark/pull/21685
  
    Thanks a lot for looking into this. Please find my replies in [] below each 
point.
    
    - You're trying to commit something into 2.4, but the test results I see 
are from version 2.1.0. Have you tested it with 2.4? This part of the code has 
changed significantly. Results with that version would be better. 
    [We do not have a 2.4.0 cluster handy. We will try to spin up a 2.4.0 
cluster and run the same test.]
    
    - In the before case the input rate was approximately the same as in the 
after case, and constant. After the initial good performance something went 
wrong and the rate dropped significantly. What exactly happened there? 
Maybe memory filled up and polling was not possible without GC (just guessing)? 
    [A Kafka poll usually brings more records than one batch can process. In my 
case it brings ~500 records. Those records stay in the buffer for 4-5 batches, 
after which the next poll happens, resulting in increased processing time for 
that batch. Also, not every Kafka poll takes a long time. We have raised the 
issue with our Kafka team, but it is inconclusive so far.] 
    [I looked at the GC time on the executor (through the Spark UI), which was 
insignificant. I will enable GC logs and run the job again.]
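
    [A minimal sketch of the buffering effect described above, not the actual 
Spark or Kafka code. Only the ~500 records per poll comes from my observation. 
The function name, the 125 records per batch, the 300 ms poll latency, and the 
1 ms per record processing cost are all assumed values chosen for illustration.]

```python
def simulate_batches(num_batches, records_per_poll=500, records_per_batch=125,
                     poll_ms=300, process_ms_per_record=1):
    """Return the simulated processing time (ms) of each batch.

    Hypothetical model: a batch is served from an in-memory buffer, and a
    poll is issued only when the buffer cannot cover the whole batch.
    """
    buffered = 0
    times = []
    for _ in range(num_batches):
        elapsed = 0
        if buffered < records_per_batch:
            # Buffer is exhausted: this batch pays the (assumed) poll latency.
            buffered += records_per_poll
            elapsed += poll_ms
        buffered -= records_per_batch
        elapsed += records_per_batch * process_ms_per_record
        times.append(elapsed)
    return times


times = simulate_batches(8)
print(times)
```

    [With these assumed numbers, one poll feeds four batches, so every fourth 
batch carries the poll latency while the batches in between are served from 
the buffer.]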
    
    - Have you considered/tested when driver/receiver dies? Guarantees are 
quite important.
    [I will test this scenario. Basically, am I supposed to test that, if the 
driver dies, it resumes from the same place when it comes back up?]
    - Have you tested it with receivers? Some results would be excellent.
    [I will get results with receivers as well.]


