[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r143094589
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
@@ -18,16 +18,17 @@
 package org.apache.spark.sql
 
 import scala.collection.JavaConverters._
+import scala.util.control.NonFatal
 
 import org.apache.spark.annotation.{Experimental, InterfaceStability}
 import org.apache.spark.api.java.function._
 import org.apache.spark.sql.catalyst.encoders.{encoderFor, ExpressionEncoder}
 import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, CreateStruct}
 import org.apache.spark.sql.catalyst.plans.logical._
-import org.apache.spark.sql.catalyst.streaming.InternalOutputModes
 import org.apache.spark.sql.execution.QueryExecution
 import org.apache.spark.sql.expressions.ReduceAggregator
 import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}
+import org.apache.spark.sql.types.StructType
--- End diff --

Why import `StructType`? I didn't see you use it.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r143094398
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
@@ -564,4 +565,30 @@ class KeyValueGroupedDataset[K, V] private[sql](
       encoder: Encoder[R]): Dataset[R] = {
     cogroup(other)((key, left, right) => f.call(key, left.asJava, right.asJava).asScala)(encoder)
   }
+
+  override def toString: String = {
+    try {
+      val builder = new StringBuilder
+      val kFields = kExprEnc.schema.map {
+        case f => s"${f.name}: ${f.dataType.simpleString(2)}"
+      }
+      val vFields = vExprEnc.schema.map {
+        case f => s"${f.name}: ${f.dataType.simpleString(2)}"
+      }
+      builder.append("[key: [")
+      builder.append(kFields.take(2).mkString(", "))
+      if (kFields.length > 2) {
+        builder.append(" ... " + (kFields.length - 2) + " more field(s)")
+      }
+      builder.append("], value: [")
+      builder.append(vFields.take(2).mkString(", "))
+      if (vFields.length > 2) {
+        builder.append(" ... " + (vFields.length - 2) + " more field(s)")
+      }
+      builder.append("]]").toString()
+    } catch {
+      case NonFatal(e) =>
--- End diff --

When will we encounter this error?





[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...

2017-09-27 Thread yaooqinn
Github user yaooqinn commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r141509523
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
@@ -54,6 +55,14 @@ class KeyValueGroupedDataset[K, V] private[sql](
   private def sparkSession = queryExecution.sparkSession
 
   /**
+   * Returns the schema of this Dataset.
+   *
+   * @group basic
+   * @since 2.3.0
+   */
+  def schema: StructType = queryExecution.analyzed.schema
--- End diff --

ok





[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...

2017-09-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r141424043
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
@@ -54,6 +55,14 @@ class KeyValueGroupedDataset[K, V] private[sql](
   private def sparkSession = queryExecution.sparkSession
 
   /**
+   * Returns the schema of this Dataset.
+   *
+   * @group basic
+   * @since 2.3.0
+   */
+  def schema: StructType = queryExecution.analyzed.schema
--- End diff --

Can you remove this?





[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...

2017-09-27 Thread yaooqinn
GitHub user yaooqinn opened a pull request:

https://github.com/apache/spark/pull/19363

[Minor]Override toString of KeyValueGroupedDataset

## What changes were proposed in this pull request?
 before

```scala
scala> val words = spark.read.textFile("README.md").flatMap(_.split(" "))
words: org.apache.spark.sql.Dataset[String] = [value: string]

scala> val grouped = words.groupByKey(identity)
grouped: org.apache.spark.sql.KeyValueGroupedDataset[String,String] = org.apache.spark.sql.KeyValueGroupedDataset@65214862
```
 after
```scala
scala> val words = spark.read.textFile("README.md").flatMap(_.split(" "))
words: org.apache.spark.sql.Dataset[String] = [value: string]

scala> val grouped = words.groupByKey(identity)
grouped: org.apache.spark.sql.KeyValueGroupedDataset[String,String] = [key: [value: string], value: [value: string]]
```
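
For illustration only, the formatting shown in the "after" output can be sketched without Spark. This is a minimal sketch, not the code in this patch: `Field`, `ToStringSketch`, `renderFields`, and `render` are hypothetical names, and `Field` stands in for a schema field whose type has already been rendered as a string. It shows at most two fields per side and then summarises the rest.

```scala
// Hypothetical stand-in for a schema field; this is NOT Spark's StructField.
final case class Field(name: String, typeName: String)

object ToStringSketch {
  // Render at most `maxFields` fields, then summarise how many were left out.
  def renderFields(fields: Seq[Field], maxFields: Int = 2): String = {
    val shown = fields
      .take(maxFields)
      .map(f => s"${f.name}: ${f.typeName}")
      .mkString(", ")
    val hidden = fields.length - maxFields
    if (hidden > 0) s"$shown ... $hidden more field(s)" else shown
  }

  // Combine the key and value sides in the "[key: [...], value: [...]]" layout.
  def render(keyFields: Seq[Field], valueFields: Seq[Field]): String =
    s"[key: [${renderFields(keyFields)}], value: [${renderFields(valueFields)}]]"
}
```

Calling `ToStringSketch.render` with a single `value: string` field on each side yields the same layout as the REPL output above.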

## How was this patch tested?
Existing unit tests.

cc @gatorsmile @cloud-fan 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yaooqinn/spark minor-dataset-tostring

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19363.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19363


commit a9b30bd6508421de192d0d44b6bc03afd3e0a792
Author: Kent Yao 
Date:   2017-09-27T07:54:41Z

override toString of KeyValueGroupedDataset



