[GitHub] spark issue #21563: [SPARK-24557][ML] ClusteringEvaluator support array inpu...

2018-07-31 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/21563
  
@mengxr I noticed that you opened a ticket for supporting integer type labels 
in ClusteringEvaluator. Would you like to shepherd this PR too?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21622: [SPARK-24637][SS] Add metrics regarding state and waterm...

2018-07-31 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21622
  
retest this please


---




[GitHub] spark issue #21941: [SPARK-24966][SQL] Implement precedence rules for set op...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21941
  
**[Test build #93871 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93871/testReport)**
 for PR 21941 at commit 
[`c0821b6`](https://github.com/apache/spark/commit/c0821b6dd8e713edf2bd1ddd9a27f170d8f8).


---




[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...

2018-07-31 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19449#discussion_r206760031
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/internal/ExecutorSideSQLConfSuite.scala ---
@@ -82,4 +84,22 @@ class ExecutorSideSQLConfSuite extends SparkFunSuite with SQLTestUtils {
       assert(checks.forall(_ == true))
     }
   }
+
+  test("SPARK-22219: refactor to control to generate comment") {
+    withSQLConf(StaticSQLConf.CODEGEN_COMMENTS.key -> "false") {
+      val res = codegenStringSeq(spark.range(10).groupBy(col("id") * 2).count()
+        .queryExecution.executedPlan)
+      assert(res.length == 2)
+      assert(res.forall{ case (_, code) =>
+        !code.contains("* Codegend pipeline") && !code.contains("// input[")})
+    }
+
+    withSQLConf(StaticSQLConf.CODEGEN_COMMENTS.key -> "true") {
+      val res = codegenStringSeq(spark.range(10).groupBy(col("id") * 2).count()
+        .queryExecution.executedPlan)
+      assert(res.length == 2)
+      assert(res.forall{ case (_, code) =>
+        code.contains("* Codegend pipeline") && code.contains("// input[")})
+    }
--- End diff --

combine these two?
```
Seq(true, false).foreach { flag =>
  ...
  if (flag) {
    ...
  } else {
    ...
  }
}
```
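
To make the shape of that loop concrete, here is a minimal, self-contained sketch. It is only an illustration under assumptions: `fakeCodegen` is a hypothetical stand-in for `codegenStringSeq(...)` running under `withSQLConf`, and the generated strings are made up so the example runs without a Spark session.

```scala
object CombinedFlagTest {
  // Hypothetical stand-in for codegenStringSeq(...): returns (subtree, code) pairs.
  // The comment markers are emitted only when comments are enabled.
  def fakeCodegen(commentsEnabled: Boolean): Seq[(String, String)] = {
    val header =
      if (commentsEnabled) "/* Codegend pipeline for stage */\n// input[0, bigint]\n" else ""
    Seq("subtree-1", "subtree-2").map(name => name -> (header + "final class GeneratedIterator {}"))
  }

  def main(args: Array[String]): Unit = {
    // One loop body covers both settings; only the final assertion differs.
    Seq(true, false).foreach { flag =>
      val res = fakeCodegen(flag)
      assert(res.length == 2)
      if (flag) {
        assert(res.forall { case (_, code) =>
          code.contains("* Codegend pipeline") && code.contains("// input[") })
      } else {
        assert(res.forall { case (_, code) =>
          !code.contains("* Codegend pipeline") && !code.contains("// input[") })
      }
    }
    println("both flag values checked")
  }
}
```

In the real test, the call to `fakeCodegen(flag)` would presumably be replaced by the `withSQLConf(StaticSQLConf.CODEGEN_COMMENTS.key -> flag.toString) { ... }` block from the diff.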


---




[GitHub] spark issue #21941: [SPARK-24966][SQL] Implement precedence rules for set op...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21941
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1552/
Test PASSed.


---




[GitHub] spark issue #21941: [SPARK-24966][SQL] Implement precedence rules for set op...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21941
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21622: [SPARK-24637][SS] Add metrics regarding state and...

2018-07-31 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/21622#discussion_r206761192
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MetricsReporter.scala ---
@@ -39,6 +42,23 @@ class MetricsReporter(
   registerGauge("processingRate-total", _.processedRowsPerSecond, 0.0)
   registerGauge("latency", _.durationMs.get("triggerExecution").longValue(), 0L)
 
+  private val timestampFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'") // ISO8601
+  timestampFormat.setTimeZone(DateTimeUtils.getTimeZone("UTC"))
+
+  registerGauge("eventTime-watermark",
+    progress => convertStringDateToMillis(progress.eventTime.get("watermark")), 0L)
+
+  registerGauge("states-rowsTotal", _.stateOperators.map(_.numRowsTotal).sum, 0L)
+  registerGauge("states-usedBytes", _.stateOperators.map(_.memoryUsedBytes).sum, 0L)
+
--- End diff --

Those are custom metrics, which may or may not be present depending on the 
implementation of the state store. I don't recommend adding them here directly.


---




[GitHub] spark issue #21941: [SPARK-24966][SQL] Implement precedence rules for set op...

2018-07-31 Thread holdensmagicalunicorn
Github user holdensmagicalunicorn commented on the issue:

https://github.com/apache/spark/pull/21941
  
@dilipbiswal, thanks! I am a bot who has found some folks who might be able 
to help with the review: @gatorsmile, @rxin and @hvanhovell


---




[GitHub] spark pull request #21941: [SPARK-24966][SQL] Implement precedence rules for...

2018-07-31 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/21941

[SPARK-24966][SQL] Implement precedence rules for set operations.

## What changes were proposed in this pull request?

Currently the set operations INTERSECT, UNION and EXCEPT are assigned the 
same precedence. This PR fixes the problem by giving INTERSECT higher 
precedence than UNION and EXCEPT. UNION and EXCEPT operators are evaluated 
from left to right, in the order in which they appear in the query.

This changes behavior, because the order in which the set operators in a 
query are evaluated changes. The old behavior is still preserved under a 
newly added config parameter.

Query:
```
SELECT * FROM t1
UNION 
SELECT * FROM t2
EXCEPT
SELECT * FROM t3
INTERSECT
SELECT * FROM t4
```
Parsed plan before the change:
```
== Parsed Logical Plan ==
'Intersect false
:- 'Except false
:  :- 'Distinct
:  :  +- 'Union
:  :     :- 'Project [*]
:  :     :  +- 'UnresolvedRelation `t1`
:  :     +- 'Project [*]
:  :        +- 'UnresolvedRelation `t2`
:  +- 'Project [*]
:     +- 'UnresolvedRelation `t3`
+- 'Project [*]
   +- 'UnresolvedRelation `t4`
```
Parsed plan after the change:
```
== Parsed Logical Plan ==
'Except false
:- 'Distinct
:  +- 'Union
:     :- 'Project [*]
:     :  +- 'UnresolvedRelation `t1`
:     +- 'Project [*]
:        +- 'UnresolvedRelation `t2`
+- 'Intersect false
   :- 'Project [*]
   :  +- 'UnresolvedRelation `t3`
   +- 'Project [*]
      +- 'UnresolvedRelation `t4`
```
```
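
To illustrate why the two parsed plans can return different rows, here is a minimal sketch that models the four tables as Scala `Set`s. The table contents are hypothetical and chosen only for illustration; sets are a fair model here because the query uses the DISTINCT forms of UNION, EXCEPT and INTERSECT. `oldPlan` and `newPlan` are illustrative names, not code from this PR.

```scala
object SetOpPrecedence {
  // Grouping before the change: ((t1 UNION t2) EXCEPT t3) INTERSECT t4
  def oldPlan(t1: Set[Int], t2: Set[Int], t3: Set[Int], t4: Set[Int]): Set[Int] =
    ((t1 union t2) diff t3) intersect t4

  // Grouping after the change: (t1 UNION t2) EXCEPT (t3 INTERSECT t4)
  def newPlan(t1: Set[Int], t2: Set[Int], t3: Set[Int], t4: Set[Int]): Set[Int] =
    (t1 union t2) diff (t3 intersect t4)

  def main(args: Array[String]): Unit = {
    // Hypothetical table contents.
    val (t1, t2, t3, t4) = (Set(1, 2), Set(3, 4), Set(1, 3), Set(3, 5))
    println(s"before the change: ${oldPlan(t1, t2, t3, t4)}")
    println(s"after the change:  ${newPlan(t1, t2, t3, t4)}")
  }
}
```

With these inputs the old grouping yields an empty result, while the new grouping yields the values 1, 2 and 4.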
## How was this patch tested?
Added tests in PlanParserSuite, SQLQueryTestSuite.

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark SPARK-24966

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21941.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21941


commit c0821b6dd8e713edf2bd1ddd9a27f170d8f8
Author: Dilip Biswal 
Date:   2018-07-30T05:10:29Z

[SPARK-24966] Implement precedence rules for set operations.




---




[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21103
  
**[Test build #93870 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93870/testReport)**
 for PR 21103 at commit 
[`93e7979`](https://github.com/apache/spark/commit/93e7979a1c3fb82c47ecae5b3ed539b31cb99e19).


---




[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21103
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21103
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1551/
Test PASSed.


---




[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function

2018-07-31 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21103
  
retest this please


---




[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/21222
  
@zsxwing Kind reminder.


---




[GitHub] spark issue #21622: [SPARK-24637][SS] Add metrics regarding state and waterm...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/21622
  
Pinging @tdas and @zsxwing for review. It's a small one.


---




[GitHub] spark pull request #21934: [SPARK-24951][SQL] Table valued functions should ...

2018-07-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21934


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21469
  
**[Test build #93869 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93869/testReport)**
 for PR 21469 at commit 
[`ed072fc`](https://github.com/apache/spark/commit/ed072fcf057f982275d0daf69787ed812f03e87b).


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/21469
  
@tdas Thanks for the review! Addressed review comments.


---




[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-07-31 Thread ajacques
Github user ajacques commented on the issue:

https://github.com/apache/spark/pull/21889
  
@mallman, sounds good. I'll get this PR updated with your latest changes as 
soon as I can.


---




[GitHub] spark issue #21883: [SPARK-24937][SQL] Datasource partition table should loa...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21883
  
**[Test build #93868 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93868/testReport)**
 for PR 21883 at commit 
[`536346e`](https://github.com/apache/spark/commit/536346e60ed24ee447f991aacf58cafe9415a020).


---




[GitHub] spark issue #21883: [SPARK-24937][SQL] Datasource partition table should loa...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21883
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21883: [SPARK-24937][SQL] Datasource partition table should loa...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21883
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1550/
Test PASSed.


---




[GitHub] spark issue #21883: [SPARK-24937][SQL] Datasource partition table should loa...

2018-07-31 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/21883
  
retest this please.


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21561
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21561
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93866/
Test PASSed.


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21561
  
**[Test build #93866 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93866/testReport)**
 for PR 21561 at commit 
[`1a93c34`](https://github.com/apache/spark/commit/1a93c3432f95713e9a086a39e2f605ea4953619a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21469: [SPARK-24441][SS] Expose total estimated size of ...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request:

https://github.com/apache/spark/pull/21469#discussion_r206755595
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/progress.scala ---
@@ -48,12 +49,24 @@ class StateOperatorProgress private[sql](
   def prettyJson: String = pretty(render(jsonValue))
 
   private[sql] def copy(newNumRowsUpdated: Long): StateOperatorProgress =
-    new StateOperatorProgress(numRowsTotal, newNumRowsUpdated, memoryUsedBytes)
+    new StateOperatorProgress(numRowsTotal, newNumRowsUpdated, memoryUsedBytes, customMetrics)
 
   private[sql] def jsonValue: JValue = {
-    ("numRowsTotal" -> JInt(numRowsTotal)) ~
-    ("numRowsUpdated" -> JInt(numRowsUpdated)) ~
-    ("memoryUsedBytes" -> JInt(memoryUsedBytes))
+    def safeMapToJValue[T](map: ju.Map[String, T], valueToJValue: T => JValue): JValue = {
+      if (map.isEmpty) return JNothing
+      val keys = map.keySet.asScala.toSeq.sorted
+      keys.map { k => k -> valueToJValue(map.get(k)) : JObject }.reduce(_ ~ _)
+    }
+
+    val jsonVal = ("numRowsTotal" -> JInt(numRowsTotal)) ~
+      ("numRowsUpdated" -> JInt(numRowsUpdated)) ~
+      ("memoryUsedBytes" -> JInt(memoryUsedBytes))
+
+    if (!customMetrics.isEmpty) {
--- End diff --

Actually didn't notice that. Thanks for letting me know! Will simplify.


---




[GitHub] spark pull request #21469: [SPARK-24441][SS] Expose total estimated size of ...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request:

https://github.com/apache/spark/pull/21469#discussion_r206755538
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/progress.scala ---
@@ -48,12 +49,24 @@ class StateOperatorProgress private[sql](
   def prettyJson: String = pretty(render(jsonValue))
 
   private[sql] def copy(newNumRowsUpdated: Long): StateOperatorProgress =
-    new StateOperatorProgress(numRowsTotal, newNumRowsUpdated, memoryUsedBytes)
+    new StateOperatorProgress(numRowsTotal, newNumRowsUpdated, memoryUsedBytes, customMetrics)
 
   private[sql] def jsonValue: JValue = {
-    ("numRowsTotal" -> JInt(numRowsTotal)) ~
-    ("numRowsUpdated" -> JInt(numRowsUpdated)) ~
-    ("memoryUsedBytes" -> JInt(memoryUsedBytes))
+    def safeMapToJValue[T](map: ju.Map[String, T], valueToJValue: T => JValue): JValue = {
--- End diff --

I first tried to leverage `StreamingQueryProgress.safeMapToJValue`, but 
couldn't find a proper place to move it so that it could be shared, so I 
simply copied it. Will simplify the code block and inline it.
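
For reference, the idea behind the helper in the diff (skip an empty map, otherwise emit the entries in sorted key order so the output is stable) can be sketched without json4s as below. This is only a rough illustration: `metricsToJson` and the metric names are hypothetical, the sketch builds a plain string rather than a `JValue`, and it does not escape keys. It assumes Scala 2.11/2.12, where `scala.collection.JavaConverters` is the usual import (on 2.13+ it is deprecated in favour of `scala.jdk.CollectionConverters`).

```scala
import java.{util => ju}
import scala.collection.JavaConverters._

object SortedMetricsJson {
  // Hypothetical helper: renders a Java map as a JSON object string with keys
  // in sorted order, or returns None when there is nothing to render.
  def metricsToJson(metrics: ju.Map[String, java.lang.Long]): Option[String] = {
    if (metrics.isEmpty) {
      None
    } else {
      val fields = metrics.asScala.toSeq.sortBy(_._1).map { case (k, v) =>
        "\"" + k + "\":" + v
      }
      Some(fields.mkString("{", ",", "}"))
    }
  }

  def main(args: Array[String]): Unit = {
    val metrics = new ju.HashMap[String, java.lang.Long]()
    println(metricsToJson(metrics)) // empty map: nothing to render

    // Made-up metric names, inserted out of order on purpose.
    metrics.put("providerLoadedMapSize", 2048L)
    metrics.put("cacheHits", 7L)
    println(metricsToJson(metrics))
  }
}
```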


---




[GitHub] spark pull request #21469: [SPARK-24441][SS] Expose total estimated size of ...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request:

https://github.com/apache/spark/pull/21469#discussion_r206754359
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala ---
@@ -81,10 +81,10 @@ class SQLMetric(val metricType: String, initValue: Long = 0L) extends Accumulato
 }
 
 object SQLMetrics {
-  private val SUM_METRIC = "sum"
-  private val SIZE_METRIC = "size"
-  private val TIMING_METRIC = "timing"
-  private val AVERAGE_METRIC = "average"
+  val SUM_METRIC = "sum"
+  val SIZE_METRIC = "size"
+  val TIMING_METRIC = "timing"
+  val AVERAGE_METRIC = "average"
--- End diff --

It was to handle an exceptional case while aggregating custom metrics, 
specifically filtering out the average metric, since it is not aggregated 
correctly. Since we removed the custom average metric, we no longer need to 
filter it out. Will revert the change as well as the relevant logic.
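
As a generic illustration of why an average is awkward to aggregate (this is an assumption about what "not aggregated correctly" refers to, not a description of Spark's internals): per-partition sums can simply be added up, but averaging per-partition averages gives the wrong answer when partitions differ in size. The sample values below are hypothetical.

```scala
object AverageAggregation {
  // Average over all samples, regardless of which partition they came from.
  def trueAverage(partitions: Seq[Seq[Long]]): Double = {
    val all = partitions.flatten
    all.sum.toDouble / all.size
  }

  // Naive aggregation: average the per-partition averages.
  def averageOfAverages(partitions: Seq[Seq[Long]]): Double =
    partitions.map(p => p.sum.toDouble / p.size).sum / partitions.size

  def main(args: Array[String]): Unit = {
    // Hypothetical samples: three values in one partition, one in the other.
    val partitions = Seq(Seq(10L, 20L, 30L), Seq(100L))

    // Sums compose: adding per-partition sums equals the overall sum.
    assert(partitions.map(_.sum).sum == partitions.flatten.sum)

    println(s"true average:        ${trueAverage(partitions)}")
    println(s"average of averages: ${averageOfAverages(partitions)}")
  }
}
```

Here the true average is 160 / 4 = 40, while the average of the per-partition averages is (20 + 100) / 2 = 60.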


---




[GitHub] spark issue #21883: [SPARK-24937][SQL] Datasource partition table should loa...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21883
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93855/
Test FAILed.


---




[GitHub] spark issue #21883: [SPARK-24937][SQL] Datasource partition table should loa...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21883
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21883: [SPARK-24937][SQL] Datasource partition table should loa...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21883
  
**[Test build #93855 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93855/testReport)**
 for PR 21883 at commit 
[`536346e`](https://github.com/apache/spark/commit/536346e60ed24ee447f991aacf58cafe9415a020).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21103
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21103
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93851/
Test FAILed.


---




[GitHub] spark issue #21722: Spark-24742: Fix NullPointerexception in Field Metadata

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21722
  
**[Test build #4228 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4228/testReport)**
 for PR 21722 at commit 
[`088e2d7`](https://github.com/apache/spark/commit/088e2d789dad707bd657a72afa8933e957641536).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21103
  
**[Test build #93851 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93851/testReport)**
 for PR 21103 at commit 
[`93e7979`](https://github.com/apache/spark/commit/93e7979a1c3fb82c47ecae5b3ed539b31cb99e19).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/21357
  
@tdas 
The rationale for this patch is to group the functions that deal with 
delta and snapshot files in one place, so that the difference between a delta 
file and a snapshot file is clearly shown (actually there is no difference 
other than allowing a TOMBSTONE value in the delta file) and the files are 
easy to document. It also makes it easier to add tests for delta / snapshot 
files.

Indeed, my underlying rationale is to make the class easier for newcomers 
to understand (I actually found it helpful to group the functions logically 
to understand the code better), but the file has been getting enough love 
from various contributors, so it may not be worth the effort to make it 
easier.

I respect the rules of the Spark project, and am happy to close this if we 
don't feel it is beneficial to go on. Let's close it and revisit if someone 
else feels it would be beneficial. Thanks for providing your voice on this!


---




[GitHub] spark pull request #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStorePr...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR closed the pull request at:

https://github.com/apache/spark/pull/21357


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19449
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93852/
Test PASSed.


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19449
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19449
  
**[Test build #93852 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93852/testReport)**
 for PR 19449 at commit 
[`afe889d`](https://github.com/apache/spark/commit/afe889d7cd05f7a293f76103616cd62106b91305).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21563: [SPARK-24557][ML] ClusteringEvaluator support array inpu...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21563
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93863/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21563: [SPARK-24557][ML] ClusteringEvaluator support array inpu...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21563
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21563: [SPARK-24557][ML] ClusteringEvaluator support array inpu...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21563
  
**[Test build #93863 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93863/testReport)**
 for PR 21563 at commit 
[`9064e7b`](https://github.com/apache/spark/commit/9064e7bde92f206602ebde9b3d99a861b2a90f8a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21911: [SPARK-24940][SQL] Coalesce Hint for SQL Queries

2018-07-31 Thread jzhuge
Github user jzhuge commented on the issue:

https://github.com/apache/spark/pull/21911
  
@gatorsmile Oracle's [PARALLEL 
Hint](https://docs.oracle.com/en/database/oracle/oracle-database/18/sqlrf/Comments.html#GUID-D25225CE-2DCE-4D9F-8E82-401839690A6E)
 is the closest I can find. And [SET CURRENT 
DEGREE](https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/sqlref/src/tpc/db2z_sql_setcurrentdegree.html)
 for parallel processing in DB2.


---




[GitHub] spark issue #21933: [SPARK-24917] make chunk size configurable

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21933
  
**[Test build #93867 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93867/testReport)**
 for PR 21933 at commit 
[`0251bd5`](https://github.com/apache/spark/commit/0251bd517e7fd3e695cb8366ffa03de8c9e2900b).


---




[GitHub] spark issue #21933: [SPARK-24917] make chunk size configurable

2018-07-31 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21933
  
ok to test


---




[GitHub] spark issue #21940: Pin tag 210

2018-07-31 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21940
  
@zhangchj1990, this looks like it was opened by mistake. Please close it.


---




[GitHub] spark pull request #19186: [SPARK-21972][ML] Add param handlePersistence

2018-07-31 Thread zhengruifeng
Github user zhengruifeng closed the pull request at:

https://github.com/apache/spark/pull/19186


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21561
  
**[Test build #93866 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93866/testReport)**
 for PR 21561 at commit 
[`1a93c34`](https://github.com/apache/spark/commit/1a93c3432f95713e9a086a39e2f605ea4953619a).


---




[GitHub] spark pull request #20918: [SPARK-23805][ML][WIP] Features alg support vecto...

2018-07-31 Thread zhengruifeng
Github user zhengruifeng closed the pull request at:

https://github.com/apache/spark/pull/20918


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21561
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21561
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1549/
Test PASSed.


---




[GitHub] spark pull request #21935: [SPARK-24773] Avro: support logical timestamp typ...

2018-07-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21935#discussion_r206748626
  
--- Diff: 
external/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala 
---
@@ -114,7 +121,10 @@ object SchemaConverters {
   case ByteType | ShortType | IntegerType => builder.intType()
   case LongType => builder.longType()
   case DateType => builder.longType()
-  case TimestampType => builder.longType()
+  case TimestampType =>
+// To be consistent with the previous behavior of writing 
Timestamp type with Avro 1.7,
--- End diff --

For now I think writing out timestamp micros should be good


---




[GitHub] spark issue #21752: [SPARK-24788][SQL] fixed UnresolvedException when toStri...

2018-07-31 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21752
  
ping @c-horn 


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21561
  
**[Test build #93865 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93865/testReport)**
 for PR 21561 at commit 
[`2e48282`](https://github.com/apache/spark/commit/2e48282825a6fb46a50f4497491c550963f2c634).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21561
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93865/
Test FAILed.


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21561
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21561
  
**[Test build #93865 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93865/testReport)**
 for PR 21561 at commit 
[`2e48282`](https://github.com/apache/spark/commit/2e48282825a6fb46a50f4497491c550963f2c634).


---




[GitHub] spark pull request #21305: [SPARK-24251][SQL] Add AppendData logical plan.

2018-07-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21305#discussion_r206748200
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/sources/v2/WriteSupport.java ---
@@ -38,15 +38,16 @@
* If this method fails (by throwing an exception), the action will fail 
and no Spark job will be
* submitted.
*
-   * @param jobId A unique string for the writing job. It's possible that 
there are many writing
-   *  jobs running at the same time, and the returned {@link 
DataSourceWriter} can
-   *  use this job id to distinguish itself from other jobs.
+   * @param writeUUID A unique string for the writing job. It's possible 
that there are many writing
+   *  jobs running at the same time, and the returned 
{@link DataSourceWriter} can
+   *  use this job id to distinguish itself from other 
jobs.
* @param schema the schema of the data to be written.
* @param mode the save mode which determines what to do when the data 
are already in this data
* source, please refer to {@link SaveMode} for more details.
* @param options the options for the returned data source writer, which 
is an immutable
*case-insensitive string-to-string map.
+   * @return a writer to append data to this data source
--- End diff --

Non-append cases also call this `createWriter`; shall we remove this line?


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21561
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1548/
Test PASSed.


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21561
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #18589: [SPARK-16872][ML] Add Gaussian NB

2018-07-31 Thread zhengruifeng
Github user zhengruifeng closed the pull request at:

https://github.com/apache/spark/pull/18589


---




[GitHub] spark issue #21934: [SPARK-24951][SQL] Table valued functions should throw A...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21934
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #18389: [SPARK-14174][ML] Add minibatch kmeans

2018-07-31 Thread zhengruifeng
Github user zhengruifeng closed the pull request at:

https://github.com/apache/spark/pull/18389


---




[GitHub] spark pull request #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferH...

2018-07-31 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20636#discussion_r206748015
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolderSparkSubmitSuite.scala
 ---
@@ -39,8 +39,8 @@ class BufferHolderSparkSubmitSuite
 val argsForSparkSubmit = Seq(
   "--class", 
BufferHolderSparkSubmitSuite.getClass.getName.stripSuffix("$"),
   "--name", "SPARK-2",
-  "--master", "local-cluster[2,1,1024]",
-  "--driver-memory", "4g",
+  "--master", "local-cluster[1,1,7168]",
--- End diff --

I think we support this for debugging purposes since, IIRC, it launches 
separate processes for the workers.
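
For readers unfamiliar with the master string in the diff, here is a minimal sketch of how its three fields can be read. It assumes the format is `local-cluster[numWorkers,coresPerWorker,memoryPerWorkerMB]`; `LocalClusterSketch` and its members are hypothetical names for illustration, not Spark's own parser.

```scala
// Hypothetical helper (not part of Spark): splits a master URL of the assumed form
// local-cluster[numWorkers,coresPerWorker,memoryPerWorkerMB] into its three fields.
object LocalClusterSketch {
  final case class LocalCluster(numWorkers: Int, coresPerWorker: Int, memoryPerWorkerMb: Int)

  // Whitespace around the numbers is tolerated; anything else is rejected.
  private val Pattern = """local-cluster\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]""".r

  def parse(master: String): Option[LocalCluster] = master match {
    case Pattern(n, c, m) => Some(LocalCluster(n.toInt, c.toInt, m.toInt))
    case _ => None
  }
}
```

Read this way, the change in the diff goes from two workers with 1024 MB each to a single worker with 7168 MB.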


---




[GitHub] spark issue #21934: [SPARK-24951][SQL] Table valued functions should throw A...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21934
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93849/
Test PASSed.


---




[GitHub] spark issue #21934: [SPARK-24951][SQL] Table valued functions should throw A...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21934
  
**[Test build #93849 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93849/testReport)**
 for PR 21934 at commit 
[`514fd77`](https://github.com/apache/spark/commit/514fd77501194e43e8029734e4a3669f12fbf749).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21305: [SPARK-24251][SQL] Add AppendData logical plan.

2018-07-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21305#discussion_r206747528
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -2217,6 +2218,100 @@ class Analyzer(
 }
   }
 
+  /**
+   * Resolves columns of an output table from the data in a logical plan. 
This rule will:
+   *
+   * - Reorder columns when the write is by name
+   * - Insert safe casts when data types do not match
+   * - Insert aliases when column names do not match
+   * - Detect plans that are not compatible with the output table and 
throw AnalysisException
+   */
+  object ResolveOutputRelation extends Rule[LogicalPlan] {
+override def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+  case append @ AppendData(table, query, isByName)
+  if table.resolved && query.resolved && !append.resolved =>
+val projection = resolveOutputColumns(table.name, table.output, 
query, isByName)
+
+if (projection != query) {
+  append.copy(query = projection)
+} else {
+  append
+}
+}
+
+def resolveOutputColumns(
+tableName: String,
+expected: Seq[Attribute],
+query: LogicalPlan,
+byName: Boolean): LogicalPlan = {
+
+  if (expected.size < query.output.size) {
+throw new AnalysisException(
+  s"""Cannot write to '$tableName', too many data columns:
+ |Table columns: ${expected.map(_.name).mkString(", ")}
+ |Data columns: ${query.output.map(_.name).mkString(", 
")}""".stripMargin)
+  }
+
+  val errors = new mutable.ArrayBuffer[String]()
+  val resolved: Seq[NamedExpression] = if (byName) {
+expected.flatMap { outAttr =>
+  query.resolveQuoted(outAttr.name, resolver) match {
+case Some(inAttr) if inAttr.nullable && !outAttr.nullable =>
+  errors += s"Cannot write nullable values to non-null column 
'${outAttr.name}'"
+  None
+
+case Some(inAttr) if !DataType.canWrite(outAttr.dataType, 
inAttr.dataType, resolver) =>
+  Some(upcast(inAttr, outAttr))
+
+case Some(inAttr) =>
+  Some(inAttr) // matches nullability, datatype, and name
+
+case _ =>
+  errors += s"Cannot find data for output column 
'${outAttr.name}'"
+  None
+  }
+}
+
+  } else {
+if (expected.size > query.output.size) {
+  throw new AnalysisException(
+s"""Cannot write to '$tableName', not enough data columns:
+   |Table columns: ${expected.map(_.name).mkString(", ")}
+   |Data columns: ${query.output.map(_.name).mkString(", 
")}""".stripMargin)
+}
+
+query.output.zip(expected).flatMap {
+  case (inAttr, outAttr) if inAttr.nullable && !outAttr.nullable =>
+errors += s"Cannot write nullable values to non-null column 
'${outAttr.name}'"
+None
+
+  case (inAttr, outAttr)
+if !DataType.canWrite(inAttr.dataType, outAttr.dataType, 
resolver) ||
--- End diff --

Can't we always do upCast? If the data can already be written, the upCast 
will be a no-op and will be removed by the optimizer.
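
A minimal sketch of the idea in this comment, using toy types rather than Catalyst classes: the analyzer side always wraps the input in a cast, and a later simplification pass drops casts whose target type already matches. `UpCastSketch`, `Column`, `resolve`, and `simplify` are hypothetical names, and the assumption that the real optimizer removes same-type casts comes from the comment itself.

```scala
// Toy model (not Catalyst): always insert an up-cast, then let a simplifier remove no-ops.
object UpCastSketch {
  sealed trait DataType
  case object IntType extends DataType
  case object LongType extends DataType

  sealed trait Expr { def dataType: DataType }
  final case class Column(name: String, dataType: DataType) extends Expr
  final case class UpCast(child: Expr, dataType: DataType) extends Expr

  // Analyzer side: no canWrite branch, every input column is cast to the table column's type.
  def resolve(in: Column, out: Column): Expr = UpCast(in, out.dataType)

  // Optimizer side: a cast to the type the child already has is dropped.
  def simplify(e: Expr): Expr = e match {
    case UpCast(child, target) =>
      val simplified = simplify(child)
      if (simplified.dataType == target) simplified else UpCast(simplified, target)
    case other => other
  }
}
```

With this shape, the rule has one fewer branch, at the cost of relying on the later pass to clean up the redundant casts.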


---




[GitHub] spark pull request #21935: [SPARK-24773] Avro: support logical timestamp typ...

2018-07-31 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21935#discussion_r206747402
  
--- Diff: 
external/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala 
---
@@ -35,6 +36,12 @@ object SchemaConverters {
* This function takes an avro schema and returns a sql schema.
*/
   def toSqlType(avroSchema: Schema): SchemaType = {
+avroSchema.getLogicalType match {
--- End diff --

ditto


---




[GitHub] spark pull request #21935: [SPARK-24773] Avro: support logical timestamp typ...

2018-07-31 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21935#discussion_r206747243
  
--- Diff: 
external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala 
---
@@ -71,7 +72,15 @@ class AvroDeserializer(rootAvroType: Schema, 
rootCatalystType: DataType) {
   private def newWriter(
   avroType: Schema,
   catalystType: DataType,
-  path: List[String]): (CatalystDataUpdater, Int, Any) => Unit =
+  path: List[String]): (CatalystDataUpdater, Int, Any) => Unit = {
+(avroType.getLogicalType, catalystType) match {
--- End diff --

Can we do this like:

```scala
case (LONG, TimestampType) => avroType.getLogicalType match {
  // Catalyst timestamps are in microseconds, so milliseconds are scaled by 1000.
  case _: TimestampMillis => (updater, ordinal, value) =>
    updater.setLong(ordinal, value.asInstanceOf[Long] * 1000)
  // Microseconds are stored as-is.
  case _: TimestampMicros => (updater, ordinal, value) =>
    updater.setLong(ordinal, value.asInstanceOf[Long])
  // No logical type: treat the long as milliseconds.
  case _ => (updater, ordinal, value) =>
    updater.setLong(ordinal, value.asInstanceOf[Long] * 1000)
}
```

? It looks like they are all Avro long types anyway. I thought this would be 
easier to read, and actually safer and more correct.


---




[GitHub] spark pull request #21847: [SPARK-24855][SQL][EXTERNAL]: Built-in AVRO suppo...

2018-07-31 Thread lindblombr
Github user lindblombr commented on a diff in the pull request:

https://github.com/apache/spark/pull/21847#discussion_r206746980
  
--- Diff: 
external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala ---
@@ -165,16 +182,118 @@ class AvroSerializer(rootCatalystType: DataType, 
rootAvroType: Schema, nullable:
   result
   }
 
-  private def resolveNullableType(avroType: Schema, nullable: Boolean): 
Schema = {
-if (nullable) {
+  // Resolve an Avro union against a supplied DataType, i.e. a LongType 
compared against
+  // a ["null", "long"] should return a schema of type Schema.Type.LONG
+  // This function also handles resolving a DataType against unions of 2 
or more types, i.e.
+  // an IntType resolves against a ["int", "long", "null"] will correctly 
return a schema of
+  // type Schema.Type.LONG
+  private def resolveUnionType(avroType: Schema, catalystType: DataType,
+  nullable: Boolean): Schema = {
+if (avroType.getType == Type.UNION) {
   // avro uses union to represent nullable type.
-  val fields = avroType.getTypes.asScala
-  assert(fields.length == 2)
-  val actualType = fields.filter(_.getType != NULL)
-  assert(actualType.length == 1)
+  val fieldTypes = avroType.getTypes.asScala
+
+  // If we're nullable, we need to have at least two types.  Cases 
with more than two types
+  // are captured in test("read read-write, read-write w/ schema, 
read") w/ test.avro input
+  if (nullable && fieldTypes.length < 2) {
+throw new IncompatibleSchemaException(
+  s"Cannot resolve nullable ${catalystType} against union type 
${avroType}")
+  }
+
+  val actualType = catalystType match {
+case NullType => fieldTypes.filter(_.getType == Type.NULL)
+case BooleanType => fieldTypes.filter(_.getType == Type.BOOLEAN)
+case ByteType => fieldTypes.filter(_.getType == Type.INT)
+case BinaryType =>
+  val at = fieldTypes.filter(x => x.getType == Type.BYTES || 
x.getType == Type.FIXED)
+  if (at.length > 1) {
+throw new IncompatibleSchemaException(
+  s"Cannot resolve schema of ${catalystType} against union 
${avroType.toString}")
+  } else {
+at
+  }
+case ShortType | IntegerType => fieldTypes.filter(_.getType == 
Type.INT)
+case LongType => fieldTypes.filter(_.getType == Type.LONG)
+case FloatType => fieldTypes.filter(_.getType == Type.FLOAT)
+case DoubleType => fieldTypes.filter(_.getType == Type.DOUBLE)
+case d: DecimalType => fieldTypes.filter(_.getType == Type.STRING)
+case StringType => fieldTypes
+  .filter(x => x.getType == Type.STRING || x.getType == Type.ENUM)
+case DateType => fieldTypes.filter(x => x.getType == Type.INT || 
x.getType == Type.LONG)
+case TimestampType => fieldTypes.filter(_.getType == Type.LONG)
+case ArrayType(et, containsNull) =>
+  // Find array that matches the element type specified
+  fieldTypes.filter(x => x.getType == Type.ARRAY
+&& typeMatchesSchema(et, x.getElementType))
+case st: StructType => // Find the matching record!
+  val recordTypes = fieldTypes.filter(x => x.getType == 
Type.RECORD)
+  if (recordTypes.length > 1) {
+throw new IncompatibleSchemaException(
+  "Unions of multiple record types are NOT supported with 
user-specified schema")
+  }
+  recordTypes
+case MapType(kt, vt, valueContainsNull) =>
+  // Find the map that matches the value type.  Maps in Avro are 
always key type string
+  fieldTypes.filter(x => x.getType == Type.MAP && 
typeMatchesSchema(vt, x.getValueType))
--- End diff --

In `SchemaConverters.toAvro`, the expectation is that Maps are keyed only 
with `StringType`:

    case MapType(StringType, vt, valueContainsNull) =>
      builder.map().values(toAvroType(vt, valueContainsNull, recordName, prevNameSpace))

When you attempt this trivial test case, it fails:
```
test("SPARK-24855: Maps with kv not string") {
  withTempPath { dir =>
    val someData = Seq(
      Row("a", Map(
        1 -> "foo",
        2 -> "bar",
        3 -> "baz"
      )),
      Row("b", Map(
        1 -> "foo",
        2 -> "bar",
        3 -> "baz"
      ))
    )

    val someSchema = StructType(Seq(
      StructField("id", StringType, true),
      StructField("map", MapType(IntegerType, StringType), true)
    ))
```
 

[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...

2018-07-31 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/21758#discussion_r206746905
  
--- Diff: core/src/main/scala/org/apache/spark/BarrierTaskContext.scala ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+import org.apache.spark.annotation.{Experimental, Since}
+
+/** A [[TaskContext]] with extra info and tooling for a barrier stage. */
+trait BarrierTaskContext extends TaskContext {
--- End diff --

Please check the generated JavaDoc. I think it becomes a Java interface 
with only two methods defined here. We might want to define `class 
BarrierTaskContext` directly.


---




[GitHub] spark pull request #21305: [SPARK-24251][SQL] Add AppendData logical plan.

2018-07-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21305#discussion_r206746478
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -2217,6 +2218,100 @@ class Analyzer(
 }
   }
 
+  /**
+   * Resolves columns of an output table from the data in a logical plan. 
This rule will:
+   *
+   * - Reorder columns when the write is by name
+   * - Insert safe casts when data types do not match
+   * - Insert aliases when column names do not match
+   * - Detect plans that are not compatible with the output table and 
throw AnalysisException
+   */
+  object ResolveOutputRelation extends Rule[LogicalPlan] {
+override def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+  case append @ AppendData(table, query, isByName)
+  if table.resolved && query.resolved && !append.resolved =>
+val projection = resolveOutputColumns(table.name, table.output, 
query, isByName)
+
+if (projection != query) {
+  append.copy(query = projection)
+} else {
+  append
+}
+}
+
+def resolveOutputColumns(
+tableName: String,
+expected: Seq[Attribute],
+query: LogicalPlan,
+byName: Boolean): LogicalPlan = {
+
+  if (expected.size < query.output.size) {
+throw new AnalysisException(
+  s"""Cannot write to '$tableName', too many data columns:
+ |Table columns: ${expected.map(_.name).mkString(", ")}
+ |Data columns: ${query.output.map(_.name).mkString(", 
")}""".stripMargin)
+  }
+
+  val errors = new mutable.ArrayBuffer[String]()
+  val resolved: Seq[NamedExpression] = if (byName) {
+expected.flatMap { outAttr =>
+  query.resolveQuoted(outAttr.name, resolver) match {
+case Some(inAttr) if inAttr.nullable && !outAttr.nullable =>
--- End diff --

Shall we check the nullability of nested fields as well?
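
A minimal sketch of what a nested check could look like, using toy types rather than Spark's `DataType` hierarchy. `NestedNullabilitySketch`, `Field`, and `nullabilityErrors` are hypothetical names, and pairing nested fields by position is an assumption made only to keep the example short.

```scala
// Toy model (not Spark's types): report nullable-into-non-null mismatches at any depth.
object NestedNullabilitySketch {
  sealed trait DataType
  case object IntType extends DataType
  final case class Field(name: String, dataType: DataType, nullable: Boolean)
  final case class StructType(fields: Seq[Field]) extends DataType

  // Returns one message per field, top-level or nested, where nullable data
  // would be written into a non-null column.
  def nullabilityErrors(in: Field, out: Field, path: String = ""): Seq[String] = {
    val name = if (path.isEmpty) out.name else s"$path.${out.name}"
    val here: Seq[String] =
      if (in.nullable && !out.nullable) {
        Seq(s"Cannot write nullable values to non-null column '$name'")
      } else {
        Seq.empty
      }
    val nested: Seq[String] = (in.dataType, out.dataType) match {
      case (StructType(inFields), StructType(outFields)) =>
        // Assumption: nested fields are matched by position.
        inFields.zip(outFields).flatMap { case (i, o) => nullabilityErrors(i, o, name) }
      case _ => Seq.empty
    }
    here ++ nested
  }
}
```

A check limited to the top-level attribute would miss the case where `point` is non-null on both sides but `point.x` is nullable only in the incoming data.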


---




[GitHub] spark pull request #21305: [SPARK-24251][SQL] Add AppendData logical plan.

2018-07-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21305#discussion_r206746383
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
 ---
@@ -352,6 +351,36 @@ case class Join(
   }
 }
 
+/**
+ * Append data to an existing table.
+ */
+case class AppendData(
+table: NamedRelation,
+query: LogicalPlan,
+isByName: Boolean) extends LogicalPlan {
+  override def children: Seq[LogicalPlan] = Seq(query)
--- End diff --

why is `table` not a child? Then we can't transform the table relation.


---




[GitHub] spark issue #21854: [SPARK-24896][SQL] Uuid should produce different values ...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21854
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21854: [SPARK-24896][SQL] Uuid should produce different values ...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21854
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93850/
Test PASSed.


---




[GitHub] spark pull request #21305: [SPARK-24251][SQL] Add AppendData logical plan.

2018-07-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21305#discussion_r206746209
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -2217,6 +2218,100 @@ class Analyzer(
 }
   }
 
+  /**
+   * Resolves columns of an output table from the data in a logical plan. 
This rule will:
+   *
+   * - Reorder columns when the write is by name
+   * - Insert safe casts when data types do not match
+   * - Insert aliases when column names do not match
+   * - Detect plans that are not compatible with the output table and 
throw AnalysisException
+   */
+  object ResolveOutputRelation extends Rule[LogicalPlan] {
+override def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+  case append @ AppendData(table, query, isByName)
+  if table.resolved && query.resolved && !append.resolved =>
+val projection = resolveOutputColumns(table.name, table.output, 
query, isByName)
+
+if (projection != query) {
+  append.copy(query = projection)
+} else {
+  append
+}
+}
+
+def resolveOutputColumns(
+tableName: String,
+expected: Seq[Attribute],
+query: LogicalPlan,
+byName: Boolean): LogicalPlan = {
+
+  if (expected.size < query.output.size) {
+throw new AnalysisException(
+  s"""Cannot write to '$tableName', too many data columns:
+ |Table columns: ${expected.map(_.name).mkString(", ")}
+ |Data columns: ${query.output.map(_.name).mkString(", 
")}""".stripMargin)
+  }
+
+  val errors = new mutable.ArrayBuffer[String]()
+  val resolved: Seq[NamedExpression] = if (byName) {
+expected.flatMap { outAttr =>
+  query.resolveQuoted(outAttr.name, resolver) match {
+case Some(inAttr) if inAttr.nullable && !outAttr.nullable =>
+  errors += s"Cannot write nullable values to non-null column 
'${outAttr.name}'"
+  None
+
+case Some(inAttr) if !DataType.canWrite(outAttr.dataType, 
inAttr.dataType, resolver) =>
+  Some(upcast(inAttr, outAttr))
+
+case Some(inAttr) =>
+  Some(inAttr) // matches nullability, datatype, and name
+
+case _ =>
+  errors += s"Cannot find data for output column 
'${outAttr.name}'"
+  None
+  }
+}
+
+  } else {
+if (expected.size > query.output.size) {
+  throw new AnalysisException(
+s"""Cannot write to '$tableName', not enough data columns:
+   |Table columns: ${expected.map(_.name).mkString(", ")}
+   |Data columns: ${query.output.map(_.name).mkString(", 
")}""".stripMargin)
+}
+
+query.output.zip(expected).flatMap {
--- End diff --

are these checks duplicated?


---




[GitHub] spark issue #21854: [SPARK-24896][SQL] Uuid should produce different values ...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21854
  
**[Test build #93850 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93850/testReport)**
 for PR 21854 at commit 
[`c127053`](https://github.com/apache/spark/commit/c127053b5521bf742e5ecfb7412f87da9dbeec43).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21561
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21561
  
**[Test build #93864 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93864/testReport)**
 for PR 21561 at commit 
[`96e8425`](https://github.com/apache/spark/commit/96e842558dc4005884f335a9a0a03ba02a852db0).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21561
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93864/
Test FAILed.


---




[GitHub] spark pull request #21305: [SPARK-24251][SQL] Add AppendData logical plan.

2018-07-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21305#discussion_r206745598
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -2217,6 +2218,100 @@ class Analyzer(
 }
   }
 
+  /**
+   * Resolves columns of an output table from the data in a logical plan. 
This rule will:
+   *
+   * - Reorder columns when the write is by name
+   * - Insert safe casts when data types do not match
+   * - Insert aliases when column names do not match
+   * - Detect plans that are not compatible with the output table and 
throw AnalysisException
+   */
+  object ResolveOutputRelation extends Rule[LogicalPlan] {
+override def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+  case append @ AppendData(table, query, isByName)
+  if table.resolved && query.resolved && !append.resolved =>
+val projection = resolveOutputColumns(table.name, table.output, 
query, isByName)
+
+if (projection != query) {
+  append.copy(query = projection)
+} else {
+  append
+}
+}
+
+def resolveOutputColumns(
+tableName: String,
+expected: Seq[Attribute],
+query: LogicalPlan,
+byName: Boolean): LogicalPlan = {
+
+  if (expected.size < query.output.size) {
--- End diff --

shall we use `!=`?


---




[GitHub] spark pull request #21305: [SPARK-24251][SQL] Add AppendData logical plan.

2018-07-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21305#discussion_r206745251
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -2217,6 +2218,100 @@ class Analyzer(
     }
   }
 
+  /**
+   * Resolves columns of an output table from the data in a logical plan. This rule will:
+   *
+   * - Reorder columns when the write is by name
+   * - Insert safe casts when data types do not match
+   * - Insert aliases when column names do not match
+   * - Detect plans that are not compatible with the output table and throw AnalysisException
+   */
+  object ResolveOutputRelation extends Rule[LogicalPlan] {
+    override def apply(plan: LogicalPlan): LogicalPlan = plan transform {
--- End diff --

We now need to call `resolveOperators` instead of `transform`.
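
As a rough picture of why the two calls differ, the following self-contained Scala sketch models a plan tree whose nodes carry an `analyzed` flag. It is a toy, not Spark's implementation: `Plan`, `Leaf`, `Node`, and both traversal functions are hypothetical stand-ins, and the bottom-up traversal order is an arbitrary choice for the example. The assumption being illustrated is only that a `resolveOperators`-style traversal leaves already-analyzed subtrees alone, while a plain `transform` visits every node.

```scala
// Toy model only: these types and functions are not Spark's API.
object ResolveOperatorsSketch {
  sealed trait Plan { def analyzed: Boolean }
  case class Leaf(name: String, analyzed: Boolean = false) extends Plan
  case class Node(name: String, child: Plan, analyzed: Boolean = false) extends Plan

  // Plain traversal: applies the rule to every node, children first.
  def transform(plan: Plan)(rule: PartialFunction[Plan, Plan]): Plan = {
    val withNewChild: Plan = plan match {
      case n: Node => n.copy(child = transform(n.child)(rule))
      case other   => other
    }
    rule.applyOrElse(withNewChild, (p: Plan) => p)
  }

  // Analysis-aware traversal: returns already-analyzed subtrees untouched.
  def resolveOperators(plan: Plan)(rule: PartialFunction[Plan, Plan]): Plan =
    if (plan.analyzed) plan
    else {
      val withNewChild: Plan = plan match {
        case n: Node => n.copy(child = resolveOperators(n.child)(rule))
        case other   => other
      }
      rule.applyOrElse(withNewChild, (p: Plan) => p)
    }

  def main(args: Array[String]): Unit = {
    // A rule that rewrites every leaf it is shown.
    val rule: PartialFunction[Plan, Plan] = {
      case Leaf(name, analyzed) => Leaf(name.toUpperCase, analyzed)
    }
    // The inner subtree is marked as already analyzed.
    val plan = Node("append", Node("project", Leaf("t"), analyzed = true))

    println(transform(plan)(rule))        // leaf rewritten despite the analyzed parent
    println(resolveOperators(plan)(rule)) // analyzed subtree left unchanged
  }
}
```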


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21561
  
**[Test build #93864 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93864/testReport)** for PR 21561 at commit [`96e8425`](https://github.com/apache/spark/commit/96e842558dc4005884f335a9a0a03ba02a852db0).


---




[GitHub] spark issue #21940: Pin tag 210

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21940
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21561
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1547/
Test PASSed.


---




[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21561
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21884: [SPARK-24960][K8S] explicitly expose ports on driver con...

2018-07-31 Thread adelbertc
Github user adelbertc commented on the issue:

https://github.com/apache/spark/pull/21884
  
@mccheah Done. Unfortunately, I seem to be having some issues running tests 
locally; I assume there is some CI checking this as well?


---




[GitHub] spark pull request #21940: Pin tag 210

2018-07-31 Thread zhangchj1990
GitHub user zhangchj1990 opened a pull request:

https://github.com/apache/spark/pull/21940

Pin tag 210

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhangchj1990/spark pin-tag-210

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21940.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21940


commit c1b08fb270f1f0f4a7cc2e49dd1e19e1025ba96c
Author: zhangchj1990 <479224070@...>
Date:   2018-06-30T09:22:48Z

Add a self-test module

commit 0ce98171fc025b0c1807b5f4da22695a548beb86
Author: zhangchj1990 <479224070@...>
Date:   2018-08-01T03:07:13Z

Add a self-test module




---




[GitHub] spark issue #21940: Pin tag 210

2018-07-31 Thread holdensmagicalunicorn
Github user holdensmagicalunicorn commented on the issue:

https://github.com/apache/spark/pull/21940
  
@zhangchj1990, thanks! I am a bot who has found some folks who might be 
able to help with the review: @pwendell, @marmbrus, and @vanzin


---




[GitHub] spark issue #21563: [SPARK-24557][ML] ClusteringEvaluator support array inpu...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21563
  
**[Test build #93863 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93863/testReport)** for PR 21563 at commit [`9064e7b`](https://github.com/apache/spark/commit/9064e7bde92f206602ebde9b3d99a861b2a90f8a).


---




[GitHub] spark issue #21563: [SPARK-24557][ML] ClusteringEvaluator support array inpu...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21563
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1546/
Test PASSed.


---




[GitHub] spark issue #21563: [SPARK-24557][ML] ClusteringEvaluator support array inpu...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21563
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21563: [SPARK-24557][ML] ClusteringEvaluator support array inpu...

2018-07-31 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/21563
  
retest this please


---




[GitHub] spark issue #21938: [SPARK-24982][SQL] UDAF resolution should not throw Asse...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21938
  
**[Test build #93862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93862/testReport)** for PR 21938 at commit [`84262dc`](https://github.com/apache/spark/commit/84262dc21dd9f9aa409dd5e873d31d5b26a231f3).


---




[GitHub] spark issue #21935: [SPARK-24773] Avro: support logical timestamp type with ...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21935
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93859/
Test PASSed.


---




[GitHub] spark issue #21938: [SPARK-24982][SQL] UDAF resolution should not throw Asse...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21938
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1545/
Test PASSed.


---



