GitHub user tdas commented on the issue:

    https://github.com/apache/spark/pull/20646
  
    Actually, I am having second thoughts about this. This fundamentally 
changes how the tests work, especially the stress tests. By randomly issuing 
successive AddData actions, the stress tests exercise exactly these corner 
cases: what happens when data is added while previously added data is still 
being picked up. With this change, we would inadvertently stop testing those 
race-condition-prone cases.
    
    Second, we are taking multiple locks here in multiple sources, and the 
StreamExecution is likely to take the same locks. I am really afraid that we 
are introducing deadlocks by doing this.
    
    I am still thinking about what the right approach is here. I think it 
should be:
    - Explicitly synchronized adding of data to multiple sources.
    - Not holding locks in multiple sources.
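
    To make the two points above concrete, here is a rough sketch of the 
idea, written in Python only for brevity. It is not Spark code: `Source`, 
`add_data`, `batches` and `add_to_all` are hypothetical names invented for 
illustration, and the real test harness would look different. The assumption 
is that a single test-side lock serializes multi-source adds, while each 
source's own lock is held only for its own append, so no thread ever holds 
the locks of two sources at once.

```python
import threading


class Source:
    """Hypothetical stand-in for a test stream source (not a Spark class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._batches = []

    def add_data(self, batch):
        # The source's own lock is held only for this single append.
        with self._lock:
            self._batches.append(list(batch))

    def batches(self):
        # A reader (standing in for the stream execution) takes only this lock.
        with self._lock:
            return list(self._batches)


# Single test-side lock that serializes multi-source adds.
# Readers never take it, so it cannot form a lock cycle with the source locks.
_add_lock = threading.Lock()


def add_to_all(sources_and_batches):
    """Add one batch to each source; at most one source lock is held at a time."""
    with _add_lock:
        for source, batch in sources_and_batches:
            source.add_data(batch)
```

    Note that in this sketch a reader can still observe the first source 
updated before the second one, since the add is serialized against other 
adders but not against readers. Whether that is acceptable for the tests is 
exactly the open question.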
    
    
    



---
