[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19363#discussion_r143094589

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
    @@ -18,16 +18,17 @@
     package org.apache.spark.sql

     import scala.collection.JavaConverters._
    +import scala.util.control.NonFatal

     import org.apache.spark.annotation.{Experimental, InterfaceStability}
     import org.apache.spark.api.java.function._
     import org.apache.spark.sql.catalyst.encoders.{encoderFor, ExpressionEncoder}
     import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, CreateStruct}
     import org.apache.spark.sql.catalyst.plans.logical._
    -import org.apache.spark.sql.catalyst.streaming.InternalOutputModes
     import org.apache.spark.sql.execution.QueryExecution
     import org.apache.spark.sql.expressions.ReduceAggregator
     import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}
    +import org.apache.spark.sql.types.StructType
    --- End diff --

    Why import `StructType`? I didn't see you use it.

---

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19363#discussion_r143094398

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
    @@ -564,4 +565,30 @@ class KeyValueGroupedDataset[K, V] private[sql](
           encoder: Encoder[R]): Dataset[R] = {
         cogroup(other)((key, left, right) => f.call(key, left.asJava, right.asJava).asScala)(encoder)
       }
    +
    +  override def toString: String = {
    +    try {
    +      val builder = new StringBuilder
    +      val kFields = kExprEnc.schema.map {
    +        case f => s"${f.name}: ${f.dataType.simpleString(2)}"
    +      }
    +      val vFields = vExprEnc.schema.map {
    +        case f => s"${f.name}: ${f.dataType.simpleString(2)}"
    +      }
    +      builder.append("[key: [")
    +      builder.append(kFields.take(2).mkString(", "))
    +      if (kFields.length > 2) {
    +        builder.append(" ... " + (kFields.length - 2) + " more field(s)")
    +      }
    +      builder.append("], value: [")
    +      builder.append(vFields.take(2).mkString(", "))
    +      if (vFields.length > 2) {
    +        builder.append(" ... " + (vFields.length - 2) + " more field(s)")
    +      }
    +      builder.append("]]").toString()
    +    } catch {
    +      case NonFatal(e) =>
    --- End diff --

    When will we encounter this error?
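For readers following the thread without a Spark checkout, the truncation logic in the diff above can be sketched in plain Scala. This is only an illustration under stated assumptions: `Field`, `ToStringSketch`, `summarize`, and `describe` are hypothetical names that do not appear in the pull request, `Field` stands in for Spark's schema fields, and the fallback string in the `catch` branch is invented because the quoted diff is cut off at that point.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for a schema field; Spark's StructField is not used here.
final case class Field(name: String, typeName: String)

object ToStringSketch {
  // Render at most `limit` fields, then note how many were left out.
  def summarize(fields: Seq[Field], limit: Int = 2): String = {
    val rendered = fields.map(f => s"${f.name}: ${f.typeName}")
    val shown = rendered.take(limit).mkString(", ")
    if (rendered.length > limit) {
      s"$shown ... ${rendered.length - limit} more field(s)"
    } else {
      shown
    }
  }

  // Same overall shape as the proposed toString: key fields, then value fields.
  // The by-name parameters are evaluated inside the try block, so a failure
  // while producing either field list reaches the catch branch.
  def describe(key: => Seq[Field], value: => Seq[Field]): String =
    try {
      s"[key: [${summarize(key)}], value: [${summarize(value)}]]"
    } catch {
      // NonFatal matches ordinary exceptions but not, e.g., OutOfMemoryError.
      // The fallback text is an assumption; the diff does not show the real one.
      case NonFatal(_) => "[unresolved schema]"
    }
}
```

Calling `ToStringSketch.describe(Seq(Field("value", "string")), Seq(Field("value", "string")))` yields `[key: [value: string], value: [value: string]]`, the same shape as the "after" output in the pull request description. Whether the real encoders can actually throw here is the open question the reviewer raises, and the sketch does not settle it.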
[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19363#discussion_r141509523

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
    @@ -54,6 +55,14 @@ class KeyValueGroupedDataset[K, V] private[sql](
       private def sparkSession = queryExecution.sparkSession

       /**
    +   * Returns the schema of this Dataset.
    +   *
    +   * @group basic
    +   * @since 2.3.0
    +   */
    +  def schema: StructType = queryExecution.analyzed.schema
    --- End diff --

    ok
[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19363#discussion_r141424043

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
    @@ -54,6 +55,14 @@ class KeyValueGroupedDataset[K, V] private[sql](
       private def sparkSession = queryExecution.sparkSession

       /**
    +   * Returns the schema of this Dataset.
    +   *
    +   * @group basic
    +   * @since 2.3.0
    +   */
    +  def schema: StructType = queryExecution.analyzed.schema
    --- End diff --

    Can you remove this?
[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...
GitHub user yaooqinn opened a pull request:

    https://github.com/apache/spark/pull/19363

[Minor]Override toString of KeyValueGroupedDataset

## What changes were proposed in this pull request?

before

```scala
scala> val words = spark.read.textFile("README.md").flatMap(_.split(" "))
words: org.apache.spark.sql.Dataset[String] = [value: string]

scala> val grouped = words.groupByKey(identity)
grouped: org.apache.spark.sql.KeyValueGroupedDataset[String,String] = org.apache.spark.sql.KeyValueGroupedDataset@65214862
```

after

```scala
scala> val words = spark.read.textFile("README.md").flatMap(_.split(" "))
words: org.apache.spark.sql.Dataset[String] = [value: string]

scala> val grouped = words.groupByKey(identity)
grouped: org.apache.spark.sql.KeyValueGroupedDataset[String,String] = [key: [value: string], value: [value: string]]
```

## How was this patch tested?

Existing unit tests.

cc @gatorsmile @cloud-fan

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yaooqinn/spark minor-dataset-tostring

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19363.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19363

commit a9b30bd6508421de192d0d44b6bc03afd3e0a792
Author: Kent Yao
Date:   2017-09-27T07:54:41Z

    override toString of KeyValueGroupedDataset