Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22524
@viirya thanks for adding the explanation! I think it's very clear and
helpful. Reading it gave me a new idea.
It seems to me that the main purpose of limit is to make the upstream
operators stop producing data early, so the code template should look like:
```
while (iterator.hasNext() && !stopEarly()) {
  // upstream operators
  ...
  if (count < given_limit) {
    count += 1;
    consume... // downstream operators
  } else {
    setStopEarly(true);
  }
  ...
}
```
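For illustration, here is a minimal, self-contained Scala sketch of that template. It is a hand-written analogue of the generated loop, not actual codegen output; the iterator, `givenLimit`, and the local `stopEarly` flag are stand-ins for what the generated code would reference:

```scala
// Hand-written analogue of the loop template above: once the limit is
// reached, a flag is set and the upstream iterator stops being consumed.
object LimitLoopSketch {
  def main(args: Array[String]): Unit = {
    val iterator   = (1 to 100).iterator // stands in for the upstream operators' output
    val givenLimit = 5                   // the LIMIT value
    var count      = 0
    var stopEarly  = false               // plays the role of stopEarly()/setStopEarly(true)
    val consumed   = scala.collection.mutable.ArrayBuffer.empty[Int]

    while (iterator.hasNext && !stopEarly) {
      val row = iterator.next()          // upstream operators produce a row
      if (count < givenLimit) {
        count += 1
        consumed += row                  // "consume": hand the row to downstream operators
      } else {
        stopEarly = true                 // loop condition sees this and exits
      }
    }
    println(consumed)                    // prints: ArrayBuffer(1, 2, 3, 4, 5)
  }
}
```

Note that with this shape the upstream operators still produce one extra row after the limit is hit (the row that trips the `else` branch); the benefit is that the check sits in the loop condition, so the whole pipeline stops on the very next iteration.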