GitHub user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22524
  
    @viirya thanks for adding the explanation! I think it's very clear and helpful. Reading it gave me a new idea.
    
    It seems to me that limit is mostly there to make upstream operators stop producing data early, so the code template could look like:
    ```
    while (iterator.hasNext() && !stopEarly()) {
      // upstream operators
      ...
      if (count < given_limit) {
        count += 1;
        consume(...);  // downstream operators
      } else {
        setStopEarly(true);
      }
      ...
    }
    ```
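    
    To make the control flow concrete, here is a minimal standalone Scala sketch of this template. The names (`givenLimit`, `stopEarly`) and the even-row filter standing in for the upstream operators are illustrative placeholders, not Spark's actual generated code:
    ```scala
    // Sketch of the proposed limit template: the limit check guards only
    // the downstream consume, while the stop-early flag ends the whole loop.
    object LimitTemplateSketch {
      def main(args: Array[String]): Unit = {
        val givenLimit = 3      // stands in for given_limit
        var count = 0
        var stopEarly = false   // set once the limit is hit
    
        val iterator = Iterator.range(0, 100)
        while (iterator.hasNext && !stopEarly) {
          val row = iterator.next()
          // "upstream operators": e.g. a filter that keeps even rows
          if (row % 2 == 0) {
            if (count < givenLimit) {
              count += 1
              println(s"consume row $row")  // "downstream operators"
            } else {
              stopEarly = true  // stop producing on the next loop check
            }
          }
        }
      }
    }
    ```
    One consequence of this shape: the upstream operators still process one extra qualifying row (the one that trips the flag), but `consume` sees exactly `givenLimit` rows.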
    


