[GitHub] spark issue #22693: [SPARK-25701][SQL] Supports calculation of table statist...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22693 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22693: [SPARK-25701][SQL] Supports calculation of table ...
GitHub user fjh100456 opened a pull request: https://github.com/apache/spark/pull/22693 [SPARK-25701][SQL] Supports calculation of table statistics from partition's catalog statistics. ## What changes were proposed in this pull request? When determining table statistics, if the `totalSize` of the table is not defined, we fall back to HDFS to get the table statistics when `spark.sql.statistics.fallBackToHdfs` is `true`; otherwise the default value (`spark.sql.defaultSizeInBytes`) is taken, which may prevent tables without the `totalSize` property from being broadcast (except Parquet tables). Fortunately, in most cases the data is written into the table by an insertion command, which saves the data size in the metastore, so it is possible to use the metastore to calculate the table statistics. ## How was this patch tested? Added a test. You can merge this pull request into a Git repository by running: $ git pull https://github.com/fjh100456/spark StatisticCommit Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22693.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22693 commit e610477063b4f326b8261d59b55abce83cbb82e7 Author: fjh100456 Date: 2018-10-11T06:43:52Z [SPARK-25701][SQL] Supports calculation of table statistics from partition's catalog statistics. ## What changes were proposed in this pull request? When obtaining table statistics, if the `totalSize` of the table is not defined, we fall back to HDFS to get the table statistics when `spark.sql.statistics.fallBackToHdfs` is `true`; otherwise the default value (`spark.sql.defaultSizeInBytes`) is taken. Fortunately, in most cases the data is written into the table by an insertion command, which saves the data size in the metastore, so it is possible to use the metastore to calculate the table statistics. ## How was this patch tested? Added a test.
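The fallback order described in the PR can be sketched as follows. This is a hedged Python sketch of the decision logic only — Spark's actual implementation is Scala analyzer code, and the function and parameter names here are hypothetical stand-ins for catalog and configuration lookups:

```python
# Hypothetical sketch of the size-estimation order this PR proposes:
# prefer the table's own totalSize, then the sum of per-partition sizes
# recorded in the metastore, then the HDFS fallback, and only last the
# configured default (spark.sql.defaultSizeInBytes).

def estimate_table_size(total_size, partition_sizes, fall_back_to_hdfs,
                        hdfs_scan, default_size):
    """All parameters are stand-ins for catalog/conf lookups."""
    if total_size is not None:
        return total_size
    if partition_sizes and all(s is not None for s in partition_sizes):
        # New behaviour: every partition carries catalog statistics,
        # so the table size is just their sum -- no HDFS scan needed.
        return sum(partition_sizes)
    if fall_back_to_hdfs:
        return hdfs_scan()   # expensive: lists files on HDFS
    return default_size

print(estimate_table_size(None, [10, 20, 30], False, lambda: 0, 2**40))  # prints 60
```

The point of the change is the second branch: because insertion commands record per-partition sizes in the metastore, the cheap summation usually succeeds and the expensive HDFS scan (or the overly pessimistic default) is avoided.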
[GitHub] spark pull request #22678: [SPARK-25685][BUILD] Allow running tests in Jenki...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/22678#discussion_r22445 --- Diff: docs/building-spark.md --- @@ -272,3 +272,31 @@ For SBT, specify a complete scala version using (e.g. 2.12.6): ./build/sbt -Dscala.version=2.12.6 Otherwise, the sbt-pom-reader plugin will use the `scala.version` specified in the spark-parent pom. + +## Running Jenkins tests with enterprise Github + +To run tests with Jenkins: + +./dev/run-tests-jenkins + +If use an individual repository or an enterprise GitHub, export below environment variables before running above command. + +### Related environment variables + + +Variable NameDefaultMeaning + + GITHUB_API_BASE + https://api.github.com/repos/apache/spark + +The GitHub server API URL. It could be pointed to an enterprise GitHub. + + + + SPARK_PROJECT_URL + https://github.com/apache/spark + +The Spark project URL of (enterprise) GitHub. --- End diff -- ditto
[GitHub] spark pull request #22678: [SPARK-25685][BUILD] Allow running tests in Jenki...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/22678#discussion_r224333028 --- Diff: docs/building-spark.md --- @@ -272,3 +272,31 @@ For SBT, specify a complete scala version using (e.g. 2.12.6): ./build/sbt -Dscala.version=2.12.6 Otherwise, the sbt-pom-reader plugin will use the `scala.version` specified in the spark-parent pom. + +## Running Jenkins tests with enterprise Github + +To run tests with Jenkins: + +./dev/run-tests-jenkins + +If use an individual repository or an enterprise GitHub, export below environment variables before running above command. + +### Related environment variables + + +Variable NameDefaultMeaning + + GITHUB_API_BASE + https://api.github.com/repos/apache/spark + +The GitHub server API URL. It could be pointed to an enterprise GitHub. --- End diff -- ditto
[GitHub] spark pull request #22678: [SPARK-25685][BUILD] Allow running tests in Jenki...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/22678#discussion_r224332828 --- Diff: docs/building-spark.md --- @@ -272,3 +272,31 @@ For SBT, specify a complete scala version using (e.g. 2.12.6): ./build/sbt -Dscala.version=2.12.6 Otherwise, the sbt-pom-reader plugin will use the `scala.version` specified in the spark-parent pom. + +## Running Jenkins tests with enterprise Github --- End diff -- nit: `enterprise Github` -> `GitHub Enterprise`
[GitHub] spark pull request #22678: [SPARK-25685][BUILD] Allow running tests in Jenki...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/22678#discussion_r224332984 --- Diff: docs/building-spark.md --- @@ -272,3 +272,31 @@ For SBT, specify a complete scala version using (e.g. 2.12.6): ./build/sbt -Dscala.version=2.12.6 Otherwise, the sbt-pom-reader plugin will use the `scala.version` specified in the spark-parent pom. + +## Running Jenkins tests with enterprise Github + +To run tests with Jenkins: + +./dev/run-tests-jenkins + +If use an individual repository or an enterprise GitHub, export below environment variables before running above command. --- End diff -- ditto
[GitHub] spark issue #22318: [SPARK-25150][SQL] Rewrite condition when deduplicate Jo...
Github user peter-toth commented on the issue: https://github.com/apache/spark/pull/22318 @srowen, I saw your last comment on https://github.com/peter-toth/spark/tree/SPARK-25150. I submitted this PR to solve that ticket, and I believe the description here explains what the real issue there is. I would appreciate your thoughts on this PR; unfortunately, it has gotten stuck a bit lately. Thanks.
[GitHub] spark issue #22674: [SPARK-25680][SQL] SQL execution listener shouldn't happ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22674 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97231/ Test FAILed.
[GitHub] spark issue #22674: [SPARK-25680][SQL] SQL execution listener shouldn't happ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22674 Merged build finished. Test FAILed.
[GitHub] spark issue #22674: [SPARK-25680][SQL] SQL execution listener shouldn't happ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22674 **[Test build #97231 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97231/testReport)** for PR 22674 at commit [`3ffa536`](https://github.com/apache/spark/commit/3ffa536f3c29f6655843a4d45c215393f51e23c9). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22664: [SPARK-25662][TEST] Refactor DataSourceReadBenchmark to ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22664 Could you add `[SQL]` before `[TEST]`, too?
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22688 **[Test build #97238 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97238/testReport)** for PR 22688 at commit [`2a42253`](https://github.com/apache/spark/commit/2a422535451c186546a2ce3da66d422805f7db32).
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22688 Merged build finished. Test PASSed.
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22688 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3872/ Test PASSed.
[GitHub] spark issue #22664: [SPARK-25662][TEST] Refactor DataSourceReadBenchmark to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22664 **[Test build #97237 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97237/testReport)** for PR 22664 at commit [`7cef8db`](https://github.com/apache/spark/commit/7cef8db25e5839277f9fec3f9585f7669caca405).
[GitHub] spark pull request #22678: [SPARK-25685][BUILD] Allow running tests in Jenki...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22678#discussion_r224327625 --- Diff: docs/building-spark.md --- @@ -272,3 +272,31 @@ For SBT, specify a complete scala version using (e.g. 2.12.6): ./build/sbt -Dscala.version=2.12.6 Otherwise, the sbt-pom-reader plugin will use the `scala.version` specified in the spark-parent pom. + +## Running Jenkins tests with enterprise Github + +To run tests with Jenkins: + +./dev/run-tests-jenkins + +If you use an individual repository or an enterprise GitHub, you should export below environment variables before running above command. + +### Related environment variables + + +variable NameDefaultMeaning --- End diff -- `variable` -> `Variable`
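The two variables quoted in these diffs (`GITHUB_API_BASE` and `SPARK_PROJECT_URL`, with the Apache defaults shown in the docs table) are plain environment overrides exported before running `./dev/run-tests-jenkins`. A hedged Python sketch of the default-with-override lookup such scripts typically perform; the `resolve` helper is hypothetical, not Spark's actual code:

```python
import os

# Defaults copied from the quoted docs table; an enterprise GitHub
# deployment would export its own values before ./dev/run-tests-jenkins.
DEFAULTS = {
    "GITHUB_API_BASE": "https://api.github.com/repos/apache/spark",
    "SPARK_PROJECT_URL": "https://github.com/apache/spark",
}

def resolve(name, env=None):
    """Return the exported value if present, else the Apache default."""
    env = os.environ if env is None else env
    return env.get(name, DEFAULTS[name])

# Simulated enterprise override (hypothetical URL):
print(resolve("GITHUB_API_BASE",
              {"GITHUB_API_BASE": "https://ghe.example.com/api/v3/repos/org/spark"}))
```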
[GitHub] spark issue #22664: [SPARK-25662][TEST] Refactor DataSourceReadBenchmark to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22664 **[Test build #97236 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97236/testReport)** for PR 22664 at commit [`5bccfc6`](https://github.com/apache/spark/commit/5bccfc6fcd3cfe338c619c4f549ef7b6b038c5b3). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22664: [SPARK-25662][TEST] Refactor DataSourceReadBenchmark to ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97236/ Test FAILed.
[GitHub] spark issue #22664: [SPARK-25662][TEST] Refactor DataSourceReadBenchmark to ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22664 Merged build finished. Test FAILed.
[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22688#discussion_r224326923 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala --- @@ -351,6 +351,21 @@ class DataSourceV2Suite extends QueryTest with SharedSQLContext { } } } + + test("SPARK-25700: do not read schema when writing in other modes except append mode") { +withTempPath { file => + val cls = classOf[SimpleWriteOnlyDataSource] + val path = file.getCanonicalPath + val df = spark.range(5).select('id as 'i, -'id as 'j) + try { +df.write.format(cls.getName).option("path", path).mode("error").save() +df.write.format(cls.getName).option("path", path).mode("overwrite").save() +df.write.format(cls.getName).option("path", path).mode("ignore").save() + } catch { +case e: SchemaReadAttemptException => fail("Schema read was attempted.", e) + } --- End diff -- Yup
[GitHub] spark issue #22664: [SPARK-25662][TEST] Refactor DataSourceReadBenchmark to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22664 **[Test build #97236 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97236/testReport)** for PR 22664 at commit [`5bccfc6`](https://github.com/apache/spark/commit/5bccfc6fcd3cfe338c619c4f549ef7b6b038c5b3).
[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22688#discussion_r224326576 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala --- @@ -351,6 +351,21 @@ class DataSourceV2Suite extends QueryTest with SharedSQLContext { } } } + + test("SPARK-25700: do not read schema when writing in other modes except append mode") { +withTempPath { file => + val cls = classOf[SimpleWriteOnlyDataSource] + val path = file.getCanonicalPath + val df = spark.range(5).select('id as 'i, -'id as 'j) + try { +df.write.format(cls.getName).option("path", path).mode("error").save() +df.write.format(cls.getName).option("path", path).mode("overwrite").save() +df.write.format(cls.getName).option("path", path).mode("ignore").save() + } catch { +case e: SchemaReadAttemptException => fail("Schema read was attempted.", e) + } --- End diff -- To validate new code path [line 250](https://github.com/apache/spark/pull/22688/files#diff-94fbd986b04087223f53697d4b6cab24R250), could you add `intercept[SchemaReadAttemptException]` and do `append`, too?
[GitHub] spark pull request #22668: [SPARK-25675] [Spark Job History] Job UI page doe...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22668#discussion_r224323509 --- Diff: core/src/main/scala/org/apache/spark/ui/PagedTable.scala --- @@ -154,9 +150,6 @@ private[ui] trait PagedTable[T] { * }}} */ private[ui] def pageNavigation(page: Int, pageSize: Int, totalPages: Int): Seq[Node] = { -if (totalPages == 1) { - Nil -} else { --- End diff -- One more comment: need to adjust the indent of the following code block.
[GitHub] spark pull request #22309: [SPARK-20384][SQL] Support value class in schema ...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22309#discussion_r224318955 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala --- @@ -108,6 +108,16 @@ object TestingUDT { } } +object TestingValueClass { + case class IntWrapper(i: Int) extends AnyVal + case class StrWrapper(s: String) extends AnyVal + + case class ValueClassData( +intField: Int, +wrappedInt: IntWrapper, +strField: String, +wrappedStr: StrWrapper) --- End diff -- We might need a comment to describe what this class looks like in Java. It seems to have 2 int fields, `intField` and `wrappedInt`, and 2 string fields, `strField` and `wrappedStr`. I'm not sure it is the same in Scala 2.12, though.
[GitHub] spark issue #22309: [SPARK-20384][SQL] Support value class in schema of Data...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22309 Merged build finished. Test FAILed.
[GitHub] spark issue #22309: [SPARK-20384][SQL] Support value class in schema of Data...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22309 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97232/ Test FAILed.
[GitHub] spark issue #22309: [SPARK-20384][SQL] Support value class in schema of Data...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22309 **[Test build #97232 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97232/testReport)** for PR 22309 at commit [`5613217`](https://github.com/apache/spark/commit/5613217771b1929b9f66106468fd2da2c3ea7dec). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22675: [SPARK-25347][ML][DOC] Spark datasource for image...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22675#discussion_r224322470 --- Diff: docs/ml-datasource.md --- @@ -0,0 +1,49 @@ +--- +layout: global +title: Data sources +displayTitle: Data sources +--- + +In this section, we introduce how to use data source in ML to load data. +Beside some general data sources like Parquet, CSV, JSON, JDBC, we also provide some specific data source for ML. + +**Table of Contents** + +* This will become a table of contents (this text will be scraped). +{:toc} + +## Image data source + +This image data source is used to load image files from a directory. +The loaded DataFrame has one StructType column: "image". containing image data stored as image schema. + + + +[`ImageDataSource`](api/scala/index.html#org.apache.spark.ml.source.image.ImageDataSource) +implements a Spark SQL data source API for loading image data as a DataFrame. + +{% highlight scala %} +scala> spark.read.format("image").load("data/mllib/images/origin") +res1: org.apache.spark.sql.DataFrame = [image: struct] +{% endhighlight %} + + + +[`ImageDataSource`](api/java/org/apache/spark/ml/source/image/ImageDataSource.html) +implements Spark SQL data source API for loading image data as DataFrame. +
{% highlight java %} +Dataset imagesDF = spark.read().format("image").load("data/mllib/images/origin"); --- End diff -- Can we add a simple transformation to show how the image data source can be utilized?
[GitHub] spark pull request #22675: [SPARK-25347][ML][DOC] Spark datasource for image...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22675#discussion_r224322298 --- Diff: docs/ml-datasource.md --- @@ -0,0 +1,49 @@ +--- +layout: global +title: Data sources +displayTitle: Data sources +--- + +In this section, we introduce how to use data source in ML to load data. +Beside some general data sources like Parquet, CSV, JSON, JDBC, we also provide some specific data source for ML. + +**Table of Contents** + +* This will become a table of contents (this text will be scraped). +{:toc} + +## Image data source + +This image data source is used to load image files from a directory. +The loaded DataFrame has one StructType column: "image". containing image data stored as image schema. + + + +[`ImageDataSource`](api/scala/index.html#org.apache.spark.ml.source.image.ImageDataSource) +implements a Spark SQL data source API for loading image data as a DataFrame. + +{% highlight scala %} +scala> spark.read.format("image").load("data/mllib/images/origin") +res1: org.apache.spark.sql.DataFrame = [image: struct] +{% endhighlight %} + + + +[`ImageDataSource`](api/java/org/apache/spark/ml/source/image/ImageDataSource.html) +implements Spark SQL data source API for loading image data as DataFrame. + +{% highlight java %} +Dataset imagesDF = spark.read().format("image").load("data/mllib/images/origin"); +{% endhighlight %} + + + --- End diff -- how about SQL syntax? I think we can use `CREATE TABLE tableA USING LOCATION 'data/image.png'`
[GitHub] spark pull request #22675: [SPARK-25347][ML][DOC] Spark datasource for image...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22675#discussion_r224321873 --- Diff: docs/ml-datasource.md --- @@ -0,0 +1,49 @@ +--- +layout: global +title: Data sources +displayTitle: Data sources +--- + +In this section, we introduce how to use data source in ML to load data. +Beside some general data sources like Parquet, CSV, JSON, JDBC, we also provide some specific data source for ML. + +**Table of Contents** + +* This will become a table of contents (this text will be scraped). +{:toc} + +## Image data source + +This image data source is used to load image files from a directory. +The loaded DataFrame has one StructType column: "image". containing image data stored as image schema. --- End diff -- Shall we describe which images we can load? For instance, I think this delegates to ImageIO in Java, which allows reading compressed formats like PNG or JPG into a raw image representation like BMP so that OpenCV can handle them.
[GitHub] spark pull request #22675: [SPARK-25347][ML][DOC] Spark datasource for image...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22675#discussion_r224321949 --- Diff: docs/ml-datasource.md --- @@ -0,0 +1,49 @@ +--- +layout: global +title: Data sources +displayTitle: Data sources +--- + +In this section, we introduce how to use data source in ML to load data. +Beside some general data sources like Parquet, CSV, JSON, JDBC, we also provide some specific data source for ML. + +**Table of Contents** + +* This will become a table of contents (this text will be scraped). +{:toc} + +## Image data source + +This image data source is used to load image files from a directory. +The loaded DataFrame has one StructType column: "image". containing image data stored as image schema. --- End diff -- I would also describe the schema structure and what each field means.
[GitHub] spark pull request #22675: [SPARK-25347][ML][DOC] Spark datasource for image...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22675#discussion_r224321446 --- Diff: docs/ml-datasource.md --- @@ -0,0 +1,49 @@ +--- +layout: global +title: Data sources +displayTitle: Data sources +--- + +In this section, we introduce how to use data source in ML to load data. +Beside some general data sources like Parquet, CSV, JSON, JDBC, we also provide some specific data source for ML. --- End diff -- `JSON, JDBC` -> `JSON and JDBC`
[GitHub] spark pull request #22668: [SPARK-25675] [Spark Job History] Job UI page doe...
Github user shivusondur commented on a diff in the pull request: https://github.com/apache/spark/pull/22668#discussion_r224318421 --- Diff: core/src/main/scala/org/apache/spark/ui/PagedTable.scala --- @@ -123,10 +123,9 @@ private[ui] trait PagedTable[T] { /** * Return a page navigation. * - * If the totalPages is 1, the page navigation will be empty * - * If the totalPages is more than 1, it will create a page navigation including a group of - * page numbers and a form to submit the page number. + * It will create a page navigation including a group of page numbers and a form --- End diff -- @gengliangwang @felixcheung I have updated according to your suggestions. Please check.
[GitHub] spark pull request #22685: [SQL][MINOR][Refactor] Refactor on sql/core
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22685#discussion_r224317853 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -96,7 +95,7 @@ case class DataSource( private val caseInsensitiveOptions = CaseInsensitiveMap(options) private val equality = sparkSession.sessionState.conf.resolver - bucketSpec.map { bucket => + bucketSpec.foreach { bucket => --- End diff -- Yea, this is a legitimate change.
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r224318028 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala --- @@ -1245,3 +1245,27 @@ case class BRound(child: Expression, scale: Expression) with Serializable with ImplicitCastInputTypes { def this(child: Expression) = this(child, Literal(0)) } + +/** + * The number truncated to scale decimal places. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(number, scale) - Returns number truncated to scale decimal places. " +
"If scale is omitted, then number is truncated to 0 places. " +
"scale can be negative to truncate (make zero) scale digits left of the decimal point.", + examples = """ +Examples: + > SELECT _FUNC_(1234567891.1234567891, 4); + 1234567891.1234 + > SELECT _FUNC_(1234567891.1234567891, -4); + 123456 + > SELECT _FUNC_(1234567891.1234567891); + 1234567891 + """) +// scalastyle:on line.size.limit +case class Truncate(child: Expression, scale: Expression) --- End diff -- In that case, it's OK to handle the string as a date. How about only accepting float, double, and decimal for number truncation?
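The truncation semantics under discussion (truncate toward zero at `scale` decimal places, with a negative `scale` zeroing digits left of the decimal point) can be sketched outside Spark like this. This is a hedged Python sketch for illustration, not the Scala expression under review:

```python
from decimal import Decimal, ROUND_DOWN

def truncate(number, scale=0):
    """Truncate `number` toward zero at `scale` decimal places.

    A negative `scale` zeroes out abs(scale) digits to the left of the
    decimal point, mirroring the TRUNCATE(number, scale) semantics
    described in the quoted docstring.
    """
    d = Decimal(str(number))
    factor = Decimal(10) ** scale
    # ROUND_DOWN truncates toward zero, for negative values as well.
    return (d * factor).to_integral_value(rounding=ROUND_DOWN) / factor

print(truncate("3.14159", 2))  # prints 3.14
```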
[GitHub] spark issue #22676: [SPARK-25684][SQL] Organize header related codes in CSV ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22676 **[Test build #97235 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97235/testReport)** for PR 22676 at commit [`c504356`](https://github.com/apache/spark/commit/c504356b847e183f571a09ce5f808d4a7f229255).
[GitHub] spark issue #22676: [SPARK-25684][SQL] Organize header related codes in CSV ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22676 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3871/ Test PASSed.
[GitHub] spark issue #22676: [SPARK-25684][SQL] Organize header related codes in CSV ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22676 Merged build finished. Test PASSed.
[GitHub] spark issue #22676: [SPARK-25684][SQL] Organize header related codes in CSV ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22676 retest this please
[GitHub] spark issue #22594: [SPARK-25674][SQL] If the records are incremented by mor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22594 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97230/ Test PASSed.
[GitHub] spark issue #22594: [SPARK-25674][SQL] If the records are incremented by mor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22594 Merged build finished. Test PASSed.
[GitHub] spark issue #22594: [SPARK-25674][SQL] If the records are incremented by mor...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22594 **[Test build #97230 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97230/testReport)** for PR 22594 at commit [`04eba30`](https://github.com/apache/spark/commit/04eba3019fa8e05b73823c91db48a50c544e8350). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22688 **[Test build #97234 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97234/testReport)** for PR 22688 at commit [`ded852c`](https://github.com/apache/spark/commit/ded852c3f99d9fe904a6b54691ac6c170da9a298). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22688 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22688 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3870/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22688#discussion_r224316297

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala ---
@@ -351,6 +351,21 @@ class DataSourceV2Suite extends QueryTest with SharedSQLContext {
       }
     }
   }
+
+  test("SPARK-25700: do not read schema when writing in other modes except append mode") {
+    withTempPath { file =>
+      val cls = classOf[SimpleWriteOnlyDataSource]
+      val path = file.getCanonicalPath
+      val df = spark.range(5).select('id as 'i, -'id as 'j)
--- End diff --

The write path looks like it requires two columns: https://github.com/apache/spark/blob/e06da95cd9423f55cdb154a2778b0bddf7be984c/sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala#L214
[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22688#discussion_r224316130

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala ---
@@ -351,6 +351,21 @@ class DataSourceV2Suite extends QueryTest with SharedSQLContext {
       }
     }
   }
+
+  test("SPARK-25700: do not read schema when writing in other modes except append mode") {
+    withTempPath { file =>
+      val cls = classOf[SimpleWriteOnlyDataSource]
+      val path = file.getCanonicalPath
+      val df = spark.range(5).select($"id", $"id")
--- End diff --

The write path looks like it requires two columns: https://github.com/apache/spark/blob/e06da95cd9423f55cdb154a2778b0bddf7be984c/sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala#L214
[GitHub] spark pull request #22668: [SPARK-25675] [Spark Job History] Job UI page doe...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22668#discussion_r224316034

--- Diff: core/src/main/scala/org/apache/spark/ui/PagedTable.scala ---
@@ -123,10 +123,9 @@ private[ui] trait PagedTable[T] {
   /**
    * Return a page navigation.
    *
-   * If the totalPages is 1, the page navigation will be empty
-   *
-   * If the totalPages is more than 1, it will create a page navigation including a group of
-   * page numbers and a form to submit the page number.
+   * It will create a page navigation including a group of page numbers and a form
--- End diff --

true.
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22688 I have no idea why it passes in my local environment. I fixed the test.
[GitHub] spark issue #22689: [SPARK-25697][CORE]When zstd compression enabled, InProg...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22689 **[Test build #97233 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97233/testReport)** for PR 22689 at commit [`c309f34`](https://github.com/apache/spark/commit/c309f3464522341f286fd4791d7989dcde988cac).
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22688 Hm, yeah, this passed in my local environment, so I expected it was flaky, but it seems I should fix it.
[GitHub] spark issue #22689: [SPARK-25697][CORE]When zstd compression enabled, InProg...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/22689 ok to test
[GitHub] spark pull request #22681: [SPARK-25682][k8s] Package example jars in same t...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22681#discussion_r224314585

--- Diff: resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile ---
@@ -18,6 +18,7 @@
 FROM openjdk:8-alpine

 ARG spark_jars=jars
+ARG example_jars=examples/jars
--- End diff --

Could we make this optional, in case someone wants to build a smaller image without the examples?
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22688 Seems the same test failed?
[GitHub] spark issue #22466: [SPARK-25464][SQL] Create Database to the location,only ...
Github user sandeep-katta commented on the issue: https://github.com/apache/spark/pull/22466 > The major comments are in the test cases. Could you help clean up the existing test cases? All the comments have been addressed and the test cases corrected.
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22688 Merged build finished. Test FAILed.
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22688 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97229/ Test FAILed.
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22688 **[Test build #97229 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97229/testReport)** for PR 22688 at commit [`9377bc3`](https://github.com/apache/spark/commit/9377bc35050408512c28f47ca0535b66c4dfcaf8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class SchemaReadAttemptException(m: String) extends RuntimeException(m)`
[GitHub] spark pull request #22678: [SPARK-25685][BUILD] Allow running tests in Jenki...
Github user LantaoJin commented on a diff in the pull request: https://github.com/apache/spark/pull/22678#discussion_r224309582

--- Diff: dev/run-tests-jenkins.py ---
@@ -39,7 +39,8 @@ def print_err(msg):
 def post_message_to_github(msg, ghprb_pull_id):
     print("Attempting to post to Github...")
-    url = "https://api.github.com/repos/apache/spark/issues/" + ghprb_pull_id + "/comments"
+    api_url = os.getenv("GITHUB_SERVER_API_URL", "https://api.github.com/repos/apache/spark")
--- End diff --

Sure. @kiszk
[GitHub] spark issue #22690: [SPARK-19287][CORE][STREAMING] JavaPairRDD flatMapValues...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22690 Merged build finished. Test PASSed.
[GitHub] spark issue #22690: [SPARK-19287][CORE][STREAMING] JavaPairRDD flatMapValues...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22690 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97226/ Test PASSed.
[GitHub] spark issue #22690: [SPARK-19287][CORE][STREAMING] JavaPairRDD flatMapValues...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22690 **[Test build #97226 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97226/testReport)** for PR 22690 at commit [`a35b54f`](https://github.com/apache/spark/commit/a35b54fbb000665a87998c14ed940316d45d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22612 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97228/ Test FAILed.
[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22612 Merged build finished. Test FAILed.
[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22612 **[Test build #97228 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97228/testReport)** for PR 22612 at commit [`067b81d`](https://github.com/apache/spark/commit/067b81d24de7999afe5b9660e89d9a2e41de6d21). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22678: [SPARK-25685][BUILD] Allow running tests in Jenkins in e...
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/22678 Sorry for closing the conversation mistakenly @dongjoon-hyun . I will update the documentation soon.
[GitHub] spark issue #22309: [SPARK-20384][SQL] Support value class in schema of Data...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22309 **[Test build #97232 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97232/testReport)** for PR 22309 at commit [`5613217`](https://github.com/apache/spark/commit/5613217771b1929b9f66106468fd2da2c3ea7dec).
[GitHub] spark issue #22309: [SPARK-20384][SQL] Support value class in schema of Data...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22309 ok to test
[GitHub] spark issue #22674: [SPARK-25680][SQL] SQL execution listener shouldn't happ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22674 **[Test build #97231 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97231/testReport)** for PR 22674 at commit [`3ffa536`](https://github.com/apache/spark/commit/3ffa536f3c29f6655843a4d45c215393f51e23c9).
[GitHub] spark issue #22674: [SPARK-25680][SQL] SQL execution listener shouldn't happ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22674 Merged build finished. Test PASSed.
[GitHub] spark issue #22674: [SPARK-25680][SQL] SQL execution listener shouldn't happ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22674 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3869/ Test PASSed.
[GitHub] spark issue #22309: [SPARK-20384][SQL] Support value class in schema of Data...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22309 somehow I lost track of this PR. ok to test
[GitHub] spark pull request #22309: [SPARK-20384][SQL] Support value class in schema ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22309#discussion_r224300113

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala ---
@@ -108,6 +108,16 @@ object TestingUDT {
   }
 }

+object TestingValueClass {
+  case class IntWrapper(i: Int) extends AnyVal
--- End diff --

Must a value class be a case class?
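For context on the question above: in Scala, a value class only needs a single public `val` constructor parameter and must extend `AnyVal`; it does not have to be a case class. A minimal sketch (the `IntWrapper` name follows the test code in the diff; everything else here is illustrative):

```scala
object ValueClassSketch {
  // A plain (non-case) value class: one public val parameter, extends AnyVal.
  class IntWrapper(val i: Int) extends AnyVal {
    def doubled: Int = i * 2
  }

  // A case class value class also works, and adds equals/hashCode/apply for free.
  case class StrWrapper(s: String) extends AnyVal

  def main(args: Array[String]): Unit = {
    val w = new IntWrapper(21)
    println(w.doubled)                           // 42
    println(StrWrapper("a") == StrWrapper("a"))  // true, via case-class equality
  }
}
```

Making the test wrapper a case class is still convenient, since it gives structural equality when comparing deserialized rows.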
[GitHub] spark pull request #22661: [SPARK-25664][SQL][TEST] Refactor JoinBenchmark t...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22661#discussion_r224300031

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/JoinBenchmark.scala ---
@@ -19,229 +19,165 @@ package org.apache.spark.sql.execution.benchmark

 import org.apache.spark.sql.execution.joins._
 import org.apache.spark.sql.functions._
+import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types.IntegerType

 /**
  * Benchmark to measure performance for aggregate primitives.
- * To run this:
- *   build/sbt "sql/test-only *benchmark.JoinBenchmark"
- *
- * Benchmarks in this file are skipped in normal builds.
+ * To run this benchmark:
+ * {{{
+ *   1. without sbt: bin/spark-submit --class --jars
+ *   2. build/sbt "sql/test:runMain "
+ *   3. generate result: SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain "
+ *      Results will be written to "benchmarks/JoinBenchmark-results.txt".
+ * }}}
  */
-class JoinBenchmark extends BenchmarkWithCodegen {
+object JoinBenchmark extends SqlBasedBenchmark {

-  ignore("broadcast hash join, long key") {
+  def broadcastHashJoinLongKey(): Unit = {
     val N = 20 << 20
     val M = 1 << 16

-    val dim = broadcast(sparkSession.range(M).selectExpr("id as k", "cast(id as string) as v"))
-    runBenchmark("Join w long", N) {
-      val df = sparkSession.range(N).join(dim, (col("id") % M) === col("k"))
+    val dim = broadcast(spark.range(M).selectExpr("id as k", "cast(id as string) as v"))
+    codegenBenchmark("Join w long", N) {
+      val df = spark.range(N).join(dim, (col("id") % M) === col("k"))
       assert(df.queryExecution.sparkPlan.find(_.isInstanceOf[BroadcastHashJoinExec]).isDefined)
       df.count()
     }
-
-    /*
-    Java HotSpot(TM) 64-Bit Server VM 1.7.0_60-b19 on Mac OS X 10.9.5
-    Intel(R) Core(TM) i7-4558U CPU @ 2.80GHz
-    Join w long:                  Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-    -------------------------------------------------------------------------------------
-    Join w long codegen=false          3002 / 3262          7.0         143.2       1.0X
-    Join w long codegen=true            321 /  371         65.3          15.3       9.3X
-    */
   }

-  ignore("broadcast hash join, long key with duplicates") {
+  def broadcastHashJoinLongKeyWithDuplicates(): Unit = {
     val N = 20 << 20
     val M = 1 << 16
-    val dim = broadcast(sparkSession.range(M).selectExpr("id as k", "cast(id as string) as v"))
--- End diff --

Yes
[GitHub] spark issue #22674: [SPARK-25680][SQL] SQL execution listener shouldn't happ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22674 retest this please
[GitHub] spark issue #22692: [SPARK-25598][STREAMING][BUILD] Remove flume connector i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22692 Merged build finished. Test PASSed.
[GitHub] spark issue #22692: [SPARK-25598][STREAMING][BUILD] Remove flume connector i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22692 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97221/ Test PASSed.
[GitHub] spark issue #22692: [SPARK-25598][STREAMING][BUILD] Remove flume connector i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22692 **[Test build #97221 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97221/testReport)** for PR 22692 at commit [`4b39ac3`](https://github.com/apache/spark/commit/4b39ac3500d1ee6f8b3d93f4822c6e5f36e30e3b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19330 Thanks!
[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 @maropu Thanks, and yes, I'm still here and can keep going if there is interest in this PR. I will update it this weekend.
[GitHub] spark issue #22692: [SPARK-25598][STREAMING][BUILD] Remove flume connector i...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22692 sounds reasonable, also cc @tdas @zsxwing @jose-torres
[GitHub] spark pull request #22259: [SPARK-25044][SQL] (take 2) Address translation o...
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22259#discussion_r224295469

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
@@ -47,7 +48,8 @@ case class ScalaUDF(
     inputTypes: Seq[DataType] = Nil,
     udfName: Option[String] = None,
     nullable: Boolean = true,
-    udfDeterministic: Boolean = true)
+    udfDeterministic: Boolean = true,
+    nullableTypes: Seq[Boolean] = Nil)
--- End diff --

Yes, the test should not pass after removing the `isInstanceOf[KnownNotNull]` condition from the `needsNullCheck` test (https://github.com/apache/spark/pull/22259/files#diff-57b3d87be744b7d79a9beacf8e5e5eb2L2160). The idea was to add a `KnownNotNull` node on top of the original node to mark it as null-checked, so the rule won't add redundant null checks even if it is accidentally applied again. I'm not sure about the exact reason why you removed the `isInstanceOf[KnownNotNull]` condition in this PR, but I think it should be left there alongside the new nullable type check. After adding the `nullableTypes` parameter in the test, the issue can be reproduced:
```
test("SPARK-24891 Fix HandleNullInputsForUDF rule") {
  val a = testRelation.output(0)
  val func = (x: Int, y: Int) => x + y
  val udf1 = ScalaUDF(func, IntegerType, a :: a :: Nil, nullableTypes = false :: false :: Nil)
  val udf2 = ScalaUDF(func, IntegerType, a :: udf1 :: Nil, nullableTypes = false :: false :: Nil)
  val plan = Project(Alias(udf2, "")() :: Nil, testRelation)
  comparePlans(plan.analyze, plan.analyze.analyze)
}
```
BTW, I'm just curious: it looks like `nullableTypes` indicates something opposite to "nullable" as used in a schema. I would assume that when `nullableTypes` is `Seq(false)`, it means the type is not nullable and we need not add the null check, and vice versa. Did I miss something here?
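The idempotence concern raised above can be illustrated in isolation (the types and rule below are an illustrative sketch, not Spark's actual Catalyst classes): a rewrite rule that wraps inputs in null checks must mark nodes it has already processed, otherwise running the rule a second time wraps them again. A marker wrapper, playing the role of `KnownNotNull`, makes the rule a no-op on its own output:

```scala
object IdempotentRuleSketch {
  // Hypothetical mini expression tree, standing in for Catalyst expressions.
  sealed trait Expr
  case class Attr(name: String) extends Expr
  case class NullCheck(child: Expr) extends Expr // inserted by the rule
  case class Marked(child: Expr) extends Expr    // plays the role of KnownNotNull

  // The rule: wrap unmarked attributes in a null check, then mark them.
  // Marked nodes are left untouched, which is what makes the rule idempotent.
  def addNullChecks(e: Expr): Expr = e match {
    case a: Attr      => Marked(NullCheck(a))
    case m: Marked    => m // already processed: no-op
    case NullCheck(c) => NullCheck(addNullChecks(c))
  }

  def main(args: Array[String]): Unit = {
    val once  = addNullChecks(Attr("x"))
    val twice = addNullChecks(once)
    // Applying the rule to its own output changes nothing.
    assert(once == twice)
    println(once)
  }
}
```

The `comparePlans(plan.analyze, plan.analyze.analyze)` check in the test above asserts exactly this property: analyzing an already-analyzed plan must not change it.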
[GitHub] spark issue #22594: [SPARK-25674][SQL] If the records are incremented by mor...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22594 **[Test build #97230 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97230/testReport)** for PR 22594 at commit [`04eba30`](https://github.com/apache/spark/commit/04eba3019fa8e05b73823c91db48a50c544e8350).
[GitHub] spark issue #22594: [SPARK-25674][SQL] If the records are incremented by mor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22594 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3868/ Test PASSed.
[GitHub] spark issue #22594: [SPARK-25674][SQL] If the records are incremented by mor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22594 Merged build finished. Test PASSed.
[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Merged build finished. Test PASSed.
[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97220/ Test PASSed.
[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 **[Test build #97220 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97220/testReport)** for PR 21669 at commit [`dd95fca`](https://github.com/apache/spark/commit/dd95fcab754e71e9465f4e46818c3cef09e86c8b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22691: [SPARK-24109][CORE] Remove class SnappyOutputStreamWrapp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22691 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97222/ Test FAILed.
[GitHub] spark issue #22691: [SPARK-24109][CORE] Remove class SnappyOutputStreamWrapp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22691 Merged build finished. Test FAILed.
[GitHub] spark issue #22691: [SPARK-24109][CORE] Remove class SnappyOutputStreamWrapp...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22691 **[Test build #97222 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97222/testReport)** for PR 22691 at commit [`8850c7a`](https://github.com/apache/spark/commit/8850c7a7d563cf6bc46a84b7480b4d338d58b80f). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22688 Merged build finished. Test PASSed.
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22688 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3867/ Test PASSed.
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22688 **[Test build #97229 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97229/testReport)** for PR 22688 at commit [`9377bc3`](https://github.com/apache/spark/commit/9377bc35050408512c28f47ca0535b66c4dfcaf8).
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22688 retest this please
[GitHub] spark issue #22664: [SPARK-25662][TEST] Refactor DataSourceReadBenchmark to ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22664 Hi, @peter-toth . Could you review and merge https://github.com/peter-toth/spark/pull/1 which contains the result on EC2 r3.xlarge?
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22688 Merged build finished. Test FAILed.
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22688 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97224/ Test FAILed.
[GitHub] spark issue #22688: [SPARK-25700][SQL] Creates ReadSupport in only Append Mo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22688 **[Test build #97224 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97224/testReport)** for PR 22688 at commit [`9377bc3`](https://github.com/apache/spark/commit/9377bc35050408512c28f47ca0535b66c4dfcaf8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class SchemaReadAttemptException(m: String) extends RuntimeException(m)`
[GitHub] spark issue #22689: [SPARK-25697][CORE]When zstd compression enabled, InProg...
Github user shahidki31 commented on the issue: https://github.com/apache/spark/pull/22689 @srowen . Yes. We should read only from the finished frames of zstd. When the listener try to read from the unfinished frame, zstd input reader throws an exception (unless we make set continuous true). Currently the behavior is, it reads from the finished frames, but after that it tried to read from the unfinished frame and throws exception while loading the webui. So, the solution should be, we should not parse from the unfinished frame, and load the UI based on only the finish frames. @vanzin has good idea about the history server. Hi @vanzin , could you please give your inputs? Thanks --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org