[GitHub] spark pull request #20726: [SPARK-23574][SQL] Report SinglePartition in Data...

cloud-fan Fri, 16 Mar 2018 14:49:28 -0700

GitHub user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20726#discussion_r175222350
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/SupportsReportPartitioning.java ---
    @@ -23,6 +23,11 @@
     /**
      * A mix in interface for {@link DataSourceReader}. Data source readers can implement this
      * interface to report data partitioning and try to avoid shuffle at Spark side.
    + *
    + * Note that Spark will always infer a
    + * {@link org.apache.spark.sql.catalyst.plans.physical.SinglePartition} partitioning when the
    --- End diff --
    
    We should not expose internal classes. How about
    ```
    Note that sometimes Spark can avoid shuffle if the reader creates exactly 1 {@link DataReaderFactory}, even if the reader does not implement this interface.
    ```
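
To make the suggested wording concrete, here is a rough sketch of the rule it describes. It is not Spark's actual code: `Reader`, `ReaderFactory`, `ReportsPartitioning`, and `PartitioningSketch` are hypothetical stand-ins for the real reader interfaces, and the order of the two checks is an assumption made only for illustration.

```java
import java.util.List;

// Hypothetical stand-in for a per-partition reader factory (not a Spark type).
interface ReaderFactory {}

// Hypothetical stand-in for a data source reader.
interface Reader {
    List<ReaderFactory> createFactories();
}

// Hypothetical stand-in for the mix-in that reports partitioning.
interface ReportsPartitioning extends Reader {
    int numPartitions();
}

final class PartitioningSketch {
    enum Kind { SINGLE, REPORTED, UNKNOWN }

    // Assumed rule: exactly one factory means all data lives in one partition,
    // so no shuffle is needed, whether or not the mix-in is implemented.
    static Kind infer(Reader reader) {
        if (reader.createFactories().size() == 1) {
            return Kind.SINGLE;
        }
        if (reader instanceof ReportsPartitioning) {
            return Kind.REPORTED;
        }
        return Kind.UNKNOWN;
    }
}
```

With these stand-ins, a plain `Reader` that returns one factory is classified as `SINGLE` even though it never implements `ReportsPartitioning`, which is the case the proposed Javadoc note is meant to cover without naming an internal class.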


