Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20647#discussion_r170774510
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2QueryPlan.scala ---
    @@ -0,0 +1,99 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources.v2
    +
    +import org.apache.commons.lang3.StringUtils
    +
    +import org.apache.spark.sql.catalyst.expressions.Attribute
    +import org.apache.spark.sql.internal.SQLConf
    +import org.apache.spark.sql.sources.DataSourceRegister
    +import org.apache.spark.sql.sources.v2.DataSourceV2
    +import org.apache.spark.sql.sources.v2.reader._
    +import org.apache.spark.util.Utils
    +
    +/**
    + * A base class for data source v2 related query plans (both logical and physical). It defines
    + * the equals/hashCode methods and provides a string representation of the query plan based on
    + * some common information.
    + */
    +trait DataSourceV2QueryPlan {
    +
    +  /**
    +   * The instance of this data source implementation. Note that we only consider its class in
    +   * equals/hashCode, not the instance itself.
    +   */
    +  def source: DataSourceV2
    +
    +  /**
    +   * The output of the data source reader, after column pruning has been applied.
    +   */
    +  def output: Seq[Attribute]
    +
    +  /**
    +   * The options for this data source reader.
    +   */
    +  def options: Map[String, String]
    +
    +  /**
    +   * The created data source reader. Here we use it to get the filters that have been pushed
    +   * down so far; the reader itself doesn't take part in equals/hashCode.
    +   */
    +  def reader: DataSourceReader
    +
    +  private lazy val filters = reader match {
    +    case s: SupportsPushDownCatalystFilters => s.pushedCatalystFilters().toSet
    +    case s: SupportsPushDownFilters => s.pushedFilters().toSet
    +    case _ => Set.empty
    +  }
    +
    +  private def sourceName: String = source match {
    +    case registered: DataSourceRegister => registered.shortName()
    +    case _ => source.getClass.getSimpleName.stripSuffix("$")
    +  }
    +
    +  def metadataString: String = {
    +    val entries = scala.collection.mutable.ArrayBuffer.empty[(String, String)]
    +
    +    if (filters.nonEmpty) {
    +      entries += "Pushed Filters" -> filters.mkString("[", ", ", "]")
    +    }
    +
    +    // TODO: we should only display some standard options like path, table, etc.
    +    if (options.nonEmpty) {
    +      entries += "Options" -> options.map {
    +        case (k, v) => s"$k=$v"
    +      }.mkString("[", ",", "]")
    +    }
    +
    +    val outputStr = Utils.truncatedString(output, "[", ", ", "]")
    --- End diff --
    
    This leaves out the types returned by the reader that were included in the 
old version. I'm okay with that if this was deliberate and the motivation is to 
match other plan nodes, but I think it is valuable to know what types are 
returned by scans.
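    
    For context, a minimal sketch of what including the types could look like. This uses a hypothetical `Attribute` stand-in rather than Spark's actual `org.apache.spark.sql.catalyst.expressions.Attribute`, and assumes a "name: type" rendering roughly like the old plan string; it is illustration only, not the proposed implementation:

```scala
// Hypothetical stand-in for Spark's Attribute, for illustration only.
case class Attribute(name: String, dataTypeName: String)

object OutputStringSketch {
  // Render the scan's output columns together with their types,
  // so the plan string shows what the reader returns.
  def outputWithTypes(output: Seq[Attribute]): String =
    output.map(a => s"${a.name}: ${a.dataTypeName}").mkString("[", ", ", "]")

  def main(args: Array[String]): Unit = {
    val cols = Seq(Attribute("id", "bigint"), Attribute("name", "string"))
    println(outputWithTypes(cols)) // [id: bigint, name: string]
  }
}
```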


---
