date:20180720

[GitHub] spark issue #21775: [SPARK-24812][SQL] Last Access Time in the table descrip...

2018-07-20 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21775
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93311/


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21805: [SPARK-24850][SQL] fix str representation of Cach...

2018-07-20 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21805#discussion_r203945646
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala ---
@@ -206,4 +206,19 @@ class DatasetCacheSuite extends QueryTest with SharedSQLContext with TimeLimits
     // first time use, load cache
     checkDataset(df5, Row(10))
   }
+
+  test("SPARK-24850 InMemoryRelation string representation does not include cached plan") {
+    val dummyQueryExecution = spark.range(0, 1).toDF().queryExecution
+    val inMemoryRelation = InMemoryRelation(
+      true,
+      1000,
+      StorageLevel.MEMORY_ONLY,
+      dummyQueryExecution.sparkPlan,
+      Some("test-relation"),
+      dummyQueryExecution.logical)
+
+    assert(!inMemoryRelation.simpleString.contains(dummyQueryExecution.sparkPlan.toString))
+    assert(inMemoryRelation.simpleString.contains(
+      "CachedRDDBuilder(true, 1000, StorageLevel(memory, deserialized, 1 replicas))"))
--- End diff --

Or we might not need the batch size in the plan. 



[GitHub] spark pull request #21805: [SPARK-24850][SQL] fix str representation of Cach...

2018-07-20 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21805#discussion_r203945605
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala ---
@@ -206,4 +206,19 @@ class DatasetCacheSuite extends QueryTest with SharedSQLContext with TimeLimits
     // first time use, load cache
     checkDataset(df5, Row(10))
   }
+
+  test("SPARK-24850 InMemoryRelation string representation does not include cached plan") {
+    val dummyQueryExecution = spark.range(0, 1).toDF().queryExecution
+    val inMemoryRelation = InMemoryRelation(
+      true,
+      1000,
+      StorageLevel.MEMORY_ONLY,
+      dummyQueryExecution.sparkPlan,
+      Some("test-relation"),
+      dummyQueryExecution.logical)
+
+    assert(!inMemoryRelation.simpleString.contains(dummyQueryExecution.sparkPlan.toString))
+    assert(inMemoryRelation.simpleString.contains(
+      "CachedRDDBuilder(true, 1000, StorageLevel(memory, deserialized, 1 replicas))"))
--- End diff --

`true` and `1000` look confusing to end users. Can we improve it? 
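
One possible reading of this suggestion, as a minimal sketch only: label each field in the string form so the bare values explain themselves. `LabeledBuilder` is a hypothetical stand-in, not Spark's `CachedRDDBuilder`, and the field names `useCompression`, `batchSize`, and `storageLevel` are assumptions made for illustration.

```scala
// Hypothetical stand-in for illustration; this is NOT Spark's CachedRDDBuilder.
// The field names (useCompression, batchSize, storageLevel) are assumed.
final case class LabeledBuilder(
    useCompression: Boolean,
    batchSize: Int,
    storageLevel: String) {

  // Label each field so a bare `true` or `1000` is self-explanatory in the plan string.
  override def toString: String =
    s"CachedRDDBuilder(useCompression=$useCompression, " +
      s"batchSize=$batchSize, storageLevel=$storageLevel)"
}

object LabeledBuilderDemo {
  def main(args: Array[String]): Unit = {
    val builder = LabeledBuilder(
      useCompression = true,
      batchSize = 1000,
      storageLevel = "MEMORY_ONLY")
    // prints: CachedRDDBuilder(useCompression=true, batchSize=1000, storageLevel=MEMORY_ONLY)
    println(builder)
  }
}
```

If the batch size turns out not to be needed in the plan, the same approach would simply drop that field from the string.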



[GitHub] spark issue #21775: [SPARK-24812][SQL] Last Access Time in the table descrip...

2018-07-20 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21775
  
**[Test build #93311 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93311/testReport)** for PR 21775 at commit [`b527fdc`](https://github.com/apache/spark/commit/b527fdc5919296ffa12e1be54367b9132ecee61e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #21774: [SPARK-24811][SQL]Avro: add new function from_avr...

2018-07-20 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21774#discussion_r203945436
  
--- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/package.scala ---
@@ -36,4 +40,27 @@ package object avro {
     @scala.annotation.varargs
     def avro(sources: String*): DataFrame = reader.format("avro").load(sources: _*)
   }
+
+  /**
+   * Converts a binary column of avro format into its corresponding catalyst value. The specified
+   * schema must match the read data, otherwise the behavior is undefined: it may fail or return
+   * arbitrary result.
+   *
+   * @param data the binary column.
+   * @param avroType the avro type.
+   */
+  @Experimental
+  def from_avro(data: Column, avroType: Schema): Column = {
--- End diff --

Ah, sorry, I thought you were talking about the `data` parameter.

Yes, for the `avroType` parameter, we should have a string version.



[GitHub] spark issue #21805: [SPARK-24850][SQL] fix str representation of CachedRDDBu...

2018-07-20 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21805
  
ok to test


