[GitHub] spark pull request #22698: [SPARK-25710][SQL] range should report metrics co...

2018-10-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22698


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22698: [SPARK-25710][SQL] range should report metrics correctly

2018-10-12 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22698
  
thanks, merging to master!


---




[GitHub] spark issue #22645: [SPARK-25566][SPARK-25567][WEBUI][SQL]Support pagination...

2018-10-12 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22645
  
I found the UI patches are very hard to review, because we embed 
HTML/Javascript in Scala code. Is there a plan to rewrite the Spark UI with 
some modern frontend frameworks?


---




[GitHub] spark pull request #22702: [SPARK-25714] Fix Null Handling in the Optimizer ...

2018-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22702#discussion_r224950588
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
@@ -276,15 +276,31 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
   case a And b if a.semanticEquals(b) => a
   case a Or b if a.semanticEquals(b) => a
 
-  case a And (b Or c) if Not(a).semanticEquals(b) => And(a, c)
-  case a And (b Or c) if Not(a).semanticEquals(c) => And(a, b)
-  case (a Or b) And c if a.semanticEquals(Not(c)) => And(b, c)
-  case (a Or b) And c if b.semanticEquals(Not(c)) => And(a, c)
-
-  case a Or (b And c) if Not(a).semanticEquals(b) => Or(a, c)
-  case a Or (b And c) if Not(a).semanticEquals(c) => Or(a, b)
-  case (a And b) Or c if a.semanticEquals(Not(c)) => Or(b, c)
-  case (a And b) Or c if b.semanticEquals(Not(c)) => Or(a, c)
+  // The following optimization is applicable only when the operands are nullable,
--- End diff --

typo: `only when the operands are not nullable`
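
The nullability condition discussed in this diff can be illustrated with SQL three-valued logic. The following is a minimal, self-contained sketch and not Catalyst code: `SqlBool`, `and3`, `or3`, and `not3` are hypothetical helpers that model SQL NULL as `None`. It compares `a AND (NOT a OR c)` with the simplified form `a AND c` over every combination of inputs.

```scala
object ThreeValuedLogic {
  // Hypothetical model of a nullable SQL boolean: None stands for NULL.
  type SqlBool = Option[Boolean]

  // SQL AND: false dominates, then NULL, then true.
  def and3(a: SqlBool, b: SqlBool): SqlBool = (a, b) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }

  // SQL OR: true dominates, then NULL, then false.
  def or3(a: SqlBool, b: SqlBool): SqlBool = (a, b) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }

  // SQL NOT: NULL stays NULL.
  def not3(a: SqlBool): SqlBool = a.map(!_)

  // The expression before simplification: a AND (NOT a OR c).
  def original(a: SqlBool, c: SqlBool): SqlBool = and3(a, or3(not3(a), c))

  // The expression after simplification: a AND c.
  def rewritten(a: SqlBool, c: SqlBool): SqlBool = and3(a, c)

  val values: Seq[SqlBool] = Seq(Some(true), Some(false), None)

  // All (a, c) pairs for which the two forms disagree.
  def mismatches: Seq[(SqlBool, SqlBool)] =
    for {
      a <- values
      c <- values
      if original(a, c) != rewritten(a, c)
    } yield (a, c)
}
```

In this model the two forms disagree only when `a` is NULL and `c` is `false`: the original evaluates to NULL while the simplified form evaluates to `false`. With non-nullable operands they always agree, which is consistent with the wording suggested in the review comment.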


---




[GitHub] spark issue #22597: [SPARK-25579][SQL] Use quoted attribute names if needed ...

2018-10-12 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22597
  
> In ParquetFilter, the way we test if a predicate pushdown works is by 
removing that predicate from Spark SQL physical plan, and only relying on the 
reader to do the filter.

I haven't looked into it, but Parquet record-level filtering is disabled 
by default, so if we remove predicates from the Spark side, the result can be wrong 
even if the predicates are pushed to Parquet.
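
To make the concern concrete, here is a toy sketch of statistics-based pushdown without record-level filtering. It is not the Parquet reader or Spark code: `RowGroup`, `readWithPushdown`, and `sparkFilter` are hypothetical names, and the model assumes the reader only skips whole row groups using min/max statistics.

```scala
object RowGroupPushdownSketch {
  // Toy stand-in for a row group: its values plus min/max statistics.
  final case class RowGroup(values: Seq[Int]) {
    val min: Int = values.min
    val max: Int = values.max
  }

  // Reader-side pushdown for the predicate "value > threshold", using
  // statistics only: a group is skipped when no row in it can match,
  // otherwise every row of the group is returned unfiltered.
  def readWithPushdown(groups: Seq[RowGroup], threshold: Int): Seq[Int] =
    groups.filter(_.max > threshold).flatMap(_.values)

  // The engine-side filter, applied to whatever the reader returns.
  def sparkFilter(rows: Seq[Int], threshold: Int): Seq[Int] =
    rows.filter(_ > threshold)
}
```

With groups `Seq(RowGroup(Seq(1, 2, 3)), RowGroup(Seq(4, 9)))` and threshold `5`, the reader alone returns `4` and `9`; only the engine-side filter reduces this to `9`. Under these assumptions, stripping the engine-side filter in a test would produce extra rows even though the pushdown itself worked.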


---




[GitHub] spark pull request #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22379#discussion_r224949068
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -3854,6 +3854,38 @@ object functions {
   @scala.annotation.varargs
   def map_concat(cols: Column*): Column = withExpr { MapConcat(cols.map(_.expr)) }
 
+  /**
+   * Parses a column containing a CSV string into a `StructType` with the specified schema.
+   * Returns `null`, in the case of an unparseable string.
+   *
+   * @param e a string column containing CSV data.
+   * @param schema the schema to use when parsing the CSV string
+   * @param options options to control how the CSV is parsed. accepts the same options and the
+   *                CSV data source.
+   *
+   * @group collection_funcs
+   * @since 3.0.0
+   */
+  def from_csv(e: Column, schema: StructType, options: Map[String, String]): Column = withExpr {
+    CsvToStructs(schema, options, e.expr)
+  }
+
+  /**
+   * (Java-specific) Parses a column containing a CSV string into a `StructType`
+   * with the specified schema. Returns `null`, in the case of an unparseable string.
+   *
+   * @param e a string column containing CSV data.
+   * @param schema the schema to use when parsing the CSV string
+   * @param options options to control how the CSV is parsed. accepts the same options and the
+   *                CSV data source.
+   *
+   * @group collection_funcs
+   * @since 3.0.0
+   */
+  def from_csv(e: Column, schema: String, options: java.util.Map[String, String]): Column = {
--- End diff --

Could you create a new type `characterOrstructOrColumn`? (The letter casing is 
weird.)

But I wasn't following: in what case is the schema a Column?


---




[GitHub] spark issue #21993: [SPARK-24983][Catalyst] Add configuration for maximum nu...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21993
  
Can one of the admins verify this patch?


---




[GitHub] spark pull request #22702: [SPARK-25714] Fix Null Handling in the Optimizer ...

2018-10-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22702


---




[GitHub] spark issue #22702: [SPARK-25714] Fix Null Handling in the Optimizer rule Bo...

2018-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22702
  
Thanks! Merged to master/2.4/2.3


---




[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...

2018-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21588
  
Yes it solves anything. We could consider upgrading to Hive 3, but I am 
unsure about this since no attempt (as far as I know) has been made yet. But for Hive 
2.3.2, @wangyum made an attempt here (https://github.com/apache/spark/pull/20659) 
where at least the tests passed, so it looks feasible.

Some people worry about the difficulties. So for clarification, @wangyum, 
would you mind listing the potential advantages and disadvantages (for 
instance, breaking backward compatibility) and any existing difficulties? I 
think these are the only concerns left, if I am not mistaken.


---




[GitHub] spark issue #22702: [SPARK-25714] Fix Null Handling in the Optimizer rule Bo...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22702
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22702: [SPARK-25714] Fix Null Handling in the Optimizer rule Bo...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22702
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97328/
Test PASSed.


---




[GitHub] spark issue #22702: [SPARK-25714] Fix Null Handling in the Optimizer rule Bo...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22702
  
**[Test build #97328 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97328/testReport)** for PR 22702 at commit [`ca3172f`](https://github.com/apache/spark/commit/ca3172f346e19dc2a6a84ae0a3855f967d129619).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #22383: [SPARK-25362][JavaAPI] Replace Spark Optional cla...

2018-10-12 Thread mmolimar
Github user mmolimar commented on a diff in the pull request:

https://github.com/apache/spark/pull/22383#discussion_r224948273
  
--- Diff: project/MimaExcludes.scala ---
@@ -36,6 +36,8 @@ object MimaExcludes {
 
   // Exclude rules for 3.0.x
   lazy val v30excludes = v24excludes ++ Seq(
+    // [SPARK-25362][JavaAPI] Replace Spark Optional class with Java Optional
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.api.java.Optional")
--- End diff --

No worries. Done ;-)


---




[GitHub] spark pull request #22597: [SPARK-25579][SQL] Use quoted attribute names if ...

2018-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22597#discussion_r224947824
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala ---
@@ -67,6 +67,16 @@ private[sql] object OrcFilters {
     }
   }
 
+  // Since ORC 1.5.0 (ORC-323), we need to quote for column names with `.` characters
+  // in order to distinguish predicate pushdown for nested columns.
+  private def quoteAttributeNameIfNeeded(name: String) : String = {
+    if (!name.contains("`") && name.contains(".")) {
--- End diff --

For the `ORC` and `AVRO` improvement, 
[SPARK-25722](https://issues.apache.org/jira/browse/SPARK-25722) has been created.
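
The diff above shows only the condition of `quoteAttributeNameIfNeeded`; the body of the branch is cut off. The following is a minimal sketch under the assumption that the elided branch wraps the name in backticks and otherwise returns it unchanged. The enclosing object name `QuoteSketch` is hypothetical.

```scala
object QuoteSketch {
  // Assumption: names containing '.' but no backtick are wrapped in
  // backticks; everything else is returned as-is. The real method body
  // is not visible in the quoted diff.
  def quoteAttributeNameIfNeeded(name: String): String =
    if (!name.contains("`") && name.contains(".")) "`" + name + "`"
    else name
}
```

Under this assumption, `a.b` becomes `` `a.b` ``, a plain name such as `a` is unchanged, and a name that already contains a backtick is left alone, which matches the limitation discussed in this thread.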


---




[GitHub] spark issue #22699: [SPARK-25711][Core] Improve start-history-server.sh: sho...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22699
  
**[Test build #4375 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4375/testReport)** for PR 22699 at commit [`5e05c60`](https://github.com/apache/spark/commit/5e05c604fdc9913a1424a569deb16ec3301bd4e4).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #22597: [SPARK-25579][SQL] Use quoted attribute names if ...

2018-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22597#discussion_r224947556
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala ---
@@ -67,6 +67,16 @@ private[sql] object OrcFilters {
     }
   }
 
+  // Since ORC 1.5.0 (ORC-323), we need to quote for column names with `.` characters
+  // in order to distinguish predicate pushdown for nested columns.
+  private def quoteAttributeNameIfNeeded(name: String) : String = {
+    if (!name.contains("`") && name.contains(".")) {
--- End diff --

@HyukjinKwon, actually, Spark 2.3.2 ORC (native/hive) doesn't support a 
backtick character in column names. It fails on the **write** operation. And 
although Spark 2.4.0 broadens the supported special characters like `.` and `"` 
in column names, the backtick character is not handled yet.

So I'll handle that one in another PR, since it's an improvement 
rather than a regression.

Also, cc @gatorsmile and @dbtsai .


---




[GitHub] spark issue #22597: [SPARK-25579][SQL] Use quoted attribute names if needed ...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22597
  
**[Test build #97329 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97329/testReport)** for PR 22597 at commit [`335a39f`](https://github.com/apache/spark/commit/335a39f58c2103a41c6c30340746734d3aeda954).


---




[GitHub] spark issue #22597: [SPARK-25579][SQL] Use quoted attribute names if needed ...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22597
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22597: [SPARK-25579][SQL] Use quoted attribute names if needed ...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22597
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3937/
Test PASSed.


---




[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22381
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22381
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97324/
Test PASSed.


---




[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22381
  
**[Test build #97324 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97324/testReport)** for PR 22381 at commit [`86bf7ec`](https://github.com/apache/spark/commit/86bf7ec4bbe1bca268f85464b73aa0cdcb0bf163).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20761: [SPARK-20327][CORE][YARN] Add CLI support for YAR...

2018-10-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20761


---




[GitHub] spark pull request #22309: [SPARK-20384][SQL] Support value class in schema ...

2018-10-12 Thread mt40
Github user mt40 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22309#discussion_r224944923
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala ---
@@ -108,6 +108,16 @@ object TestingUDT {
   }
 }
 
+object TestingValueClass {
+  case class IntWrapper(i: Int) extends AnyVal
--- End diff --

It doesn't, but Spark only supports `case class` (not `class`) for the 
schema type, so I kept it that way.
Child columns can be `class` though. I think adding that in the future on 
top of this would not be difficult.
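
For readers unfamiliar with the Scala feature under test, here is a small sketch of a value class used as a field of a case class. It illustrates plain Scala only and says nothing about how Spark derives a schema: `IntWrapper` mirrors the name in the diff, while `Record` and `total` are hypothetical.

```scala
object ValueClassSketch {
  // A value class: wraps a single Int and extends AnyVal, so the
  // compiler can usually avoid allocating a wrapper object.
  case class IntWrapper(i: Int) extends AnyVal

  // Hypothetical record type whose field uses the value class.
  case class Record(id: IntWrapper, name: String)

  // Unwraps each id and sums the underlying Int values.
  def total(records: Seq[Record]): Int = records.map(_.id.i).sum
}
```

For example, `total(Seq(Record(IntWrapper(1), "a"), Record(IntWrapper(2), "b")))` evaluates to `3`.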


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-10-12 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/20761
  
Merging to master.


---




[GitHub] spark issue #22597: [SPARK-25579][SQL] Use quoted attribute names if needed ...

2018-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22597
  
Thanks. I got it. You mean `stripSparkFilter` which is used in both 
`OrcQuerySuite.scala` and `ParquetFilterSuite.scala`. Sure!


---




[GitHub] spark issue #22703: [SPARK-25705][BUILD][STREAMING] Remove Kafka 0.8 integra...

2018-10-12 Thread koeninger
Github user koeninger commented on the issue:

https://github.com/apache/spark/pull/22703
  
I guess the only argument to the contrary would be if some of the known
issues end up being better solved with minor API changes; in that case,
leaving it marked as experimental would technically give better notice.

I personally think it's clearer to remove the experimental marking.

On Fri, Oct 12, 2018, 6:18 PM Sean Owen  wrote:

> *@srowen* commented on this pull request.
> --
>
> In docs/streaming-kafka-0-10-integration.md
> :
>
> > @@ -3,7 +3,11 @@ layout: global
>  title: Spark Streaming + Kafka Integration Guide (Kafka broker version 0.10.0 or higher)
>  ---
>
> -The Spark Streaming integration for Kafka 0.10 is similar in design to the 0.8 [Direct Stream approach](streaming-kafka-0-8-integration.html#approach-2-direct-approach-no-receivers).  It provides simple parallelism,  1:1 correspondence between Kafka partitions and Spark partitions, and access to offsets and metadata. However, because the newer integration uses the [new Kafka consumer API](http://kafka.apache.org/documentation.html#newconsumerapi) instead of the simple API, there are notable differences in usage. This version of the integration is marked as experimental, so the API is potentially subject to change.
> +The Spark Streaming integration for Kafka 0.10 provides simple parallelism, 1:1 correspondence between Kafka
> +partitions and Spark partitions, and access to offsets and metadata. However, because the newer integration uses
> +the [new Kafka consumer API](https://kafka.apache.org/documentation.html#newconsumerapi) instead of the simple API,
> +there are notable differences in usage. This version of the integration is marked as experimental, so the API is
>
> Yeah, good general point. Is the kafka 0.10 integration at all
> experimental anymore? Is anything that survives from 2.x to 3.x? I'd say
> "no" in almost all cases. What are your personal views on that?



---




[GitHub] spark issue #22702: [SPARK-25714] Fix Null Handling in the Optimizer rule Bo...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22702
  
**[Test build #97328 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97328/testReport)** for PR 22702 at commit [`ca3172f`](https://github.com/apache/spark/commit/ca3172f346e19dc2a6a84ae0a3855f967d129619).


---




[GitHub] spark issue #22702: [SPARK-25714] Fix Null Handling in the Optimizer rule Bo...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22702
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22702: [SPARK-25714] Fix Null Handling in the Optimizer rule Bo...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22702
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3936/
Test PASSed.


---




[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22379
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22379
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97319/
Test FAILed.


---




[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22379
  
**[Test build #97319 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97319/testReport)** for PR 22379 at commit [`c479973`](https://github.com/apache/spark/commit/c479973c7553ec505e5485c61790fbbf2804a410).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22666
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22666
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97318/
Test FAILed.


---




[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22666
  
**[Test build #97318 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97318/testReport)** for PR 22666 at commit [`0c5e955`](https://github.com/apache/spark/commit/0c5e955be2c1e47893c70d36e10f288a2fea2d8d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22597: [SPARK-25579][SQL] Use quoted attribute names if needed ...

2018-10-12 Thread dbtsai
Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/22597
  
In `ParquetFilter`, the way we test whether predicate pushdown works is by 
removing that predicate from the Spark SQL physical plan and relying only on the 
reader to do the filtering. Thus, if there is a bug in the reader's pushdown filter, 
Spark will get an incorrect result. This can be used in tests to ensure there is no 
regression later.


---




[GitHub] spark issue #22699: [SPARK-25711][Core] Improve start-history-server.sh: sho...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22699
  
**[Test build #4375 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4375/testReport)** for PR 22699 at commit [`5e05c60`](https://github.com/apache/spark/commit/5e05c604fdc9913a1424a569deb16ec3301bd4e4).


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread ifilonenko
Github user ifilonenko commented on the issue:

https://github.com/apache/spark/pull/19045
  
Please rename the PR with the [K8S] flag to launch the tests.


---




[GitHub] spark pull request #22703: [SPARK-25705][BUILD][STREAMING] Remove Kafka 0.8 ...

2018-10-12 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/22703#discussion_r224936431
  
--- Diff: docs/streaming-kafka-0-10-integration.md ---
@@ -3,7 +3,11 @@ layout: global
 title: Spark Streaming + Kafka Integration Guide (Kafka broker version 0.10.0 or higher)
 ---
 
-The Spark Streaming integration for Kafka 0.10 is similar in design to the 0.8 [Direct Stream approach](streaming-kafka-0-8-integration.html#approach-2-direct-approach-no-receivers).  It provides simple parallelism,  1:1 correspondence between Kafka partitions and Spark partitions, and access to offsets and metadata. However, because the newer integration uses the [new Kafka consumer API](http://kafka.apache.org/documentation.html#newconsumerapi) instead of the simple API, there are notable differences in usage. This version of the integration is marked as experimental, so the API is potentially subject to change.
+The Spark Streaming integration for Kafka 0.10 provides simple parallelism, 1:1 correspondence between Kafka 
+partitions and Spark partitions, and access to offsets and metadata. However, because the newer integration uses 
+the [new Kafka consumer API](https://kafka.apache.org/documentation.html#newconsumerapi) instead of the simple API, 
+there are notable differences in usage. This version of the integration is marked as experimental, so the API is 
--- End diff --

Yeah, good general point. Is the Kafka 0.10 integration experimental at all 
anymore? Is anything that survives from 2.x to 3.x still experimental? I'd say 
"no" in almost all cases. What are your personal views on that?


---




[GitHub] spark issue #22414: [SPARK-25424][SQL] Window duration and slide duration wi...

2018-10-12 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/22414
  
Yeah, the test that failed here asserts that it's an `AnalysisException`. I 
guess it could be removed. The thing is, many other cases are still handled as 
`AnalysisException`. Maybe it's best to stay consistent; I didn't realize this. 
Is there any other advantage? It seems like it fails just as fast either way.


---




[GitHub] spark pull request #22690: [SPARK-19287][CORE][STREAMING] JavaPairRDD flatMa...

2018-10-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22690


---




[GitHub] spark issue #22597: [SPARK-25579][SQL] Use quoted attribute names if needed ...

2018-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22597
  
Thank you for the review, @dbtsai and @gatorsmile.

BTW, what do you mean by removing? The pushed filter doesn't introduce 
a correctness issue like Parquet. Since it's a performance slowdown, this PR wants 
to fix it. We don't want to **remove** filters in this PR.


---




[GitHub] spark pull request #22661: [SPARK-25664][SQL][TEST] Refactor JoinBenchmark t...

2018-10-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22661


---




[GitHub] spark issue #22690: [SPARK-19287][CORE][STREAMING] JavaPairRDD flatMapValues...

2018-10-12 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/22690
  
Merged to master


---




[GitHub] spark pull request #22661: [SPARK-25664][SQL][TEST] Refactor JoinBenchmark t...

2018-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22661#discussion_r224934912
  
--- Diff: core/src/test/scala/org/apache/spark/benchmark/Benchmark.scala ---
@@ -200,11 +200,12 @@ private[spark] object Benchmark {
   def getProcessorName(): String = {
 val cpu = if (SystemUtils.IS_OS_MAC_OSX) {
       Utils.executeAndGetOutput(Seq("/usr/sbin/sysctl", "-n", "machdep.cpu.brand_string"))
+        .stripLineEnd
--- End diff --

Ur.. I'm not a fan of piggy-backing. Okay.


---




[GitHub] spark pull request #22661: [SPARK-25664][SQL][TEST] Refactor JoinBenchmark t...

2018-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22661#discussion_r224934704
  
--- Diff: sql/core/benchmarks/JoinBenchmark-results.txt ---
@@ -0,0 +1,75 @@
+
+Join Benchmark
+
+
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
+Join w long:                              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
+
+Join w long wholestage off                      4464 / 4483          4.7         212.9       1.0X
+Join w long wholestage on                        289 /  339         72.6          13.8      15.5X
+
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
+Join w long duplicated:                   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
+
+Join w long duplicated wholestage off           5662 / 5678          3.7         270.0       1.0X
+Join w long duplicated wholestage on             332 /  345         63.1          15.8      17.0X
+
+OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
+Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
+Join w 2 ints:                            Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
+
+Join w 2 ints wholestage off                173174 / 173183          0.1        8257.6       1.0X
+Join w 2 ints wholestage on                 166350 / 198362          0.1        7932.2       1.0X
--- End diff --

+1.
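
As a reading aid for the result columns quoted above, here is a rough sketch of how the Rate, Per Row and Relative figures relate to the best time. It assumes the row count N = 20 << 20 that the join benchmarks in this PR use, and it assumes each column is derived from the best time as shown; the object and method names are hypothetical and are not Spark's `Benchmark` API.

```scala
object BenchmarkColumns {
  // Row count used by the join benchmarks in this PR (N = 20 << 20).
  val numRows: Long = 20L << 20

  // Rate in millions of rows per second, given the best time in milliseconds.
  def rateMPerSec(bestMs: Double): Double = numRows / (bestMs * 1000.0)

  // Average time spent per row, in nanoseconds.
  def perRowNs(bestMs: Double): Double = bestMs * 1e6 / numRows

  // Speed-up relative to the baseline (first) case of the same table.
  def relative(baselineBestMs: Double, bestMs: Double): Double = baselineBestMs / bestMs

  def main(args: Array[String]): Unit = {
    // "Join w long wholestage off": best time 4464 ms.
    println(f"rate = ${rateMPerSec(4464)}%.1f M/s")
    println(f"per row = ${perRowNs(4464)}%.1f ns")
    // "Join w long wholestage on" (289 ms) against the 4464 ms baseline.
    println(f"relative = ${relative(4464, 289)}%.2f")
  }
}
```

Because the table prints best times rounded to whole milliseconds, recomputing Relative from the printed values can differ from the printed figure in the last digit.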


---




[GitHub] spark pull request #22661: [SPARK-25664][SQL][TEST] Refactor JoinBenchmark t...

2018-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22661#discussion_r224934660
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/JoinBenchmark.scala ---
@@ -19,229 +19,163 @@ package org.apache.spark.sql.execution.benchmark
 
 import org.apache.spark.sql.execution.joins._
 import org.apache.spark.sql.functions._
+import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types.IntegerType
 
 /**
- * Benchmark to measure performance for aggregate primitives.
- * To run this:
- *  build/sbt "sql/test-only *benchmark.JoinBenchmark"
- *
- * Benchmarks in this file are skipped in normal builds.
+ * Benchmark to measure performance for joins.
+ * To run this benchmark:
+ * {{{
+ *   1. without sbt:
+ *  bin/spark-submit --class  --jars  
+ *   2. build/sbt "sql/test:runMain "
+ *   3. generate result:
+ *  SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain "
+ *  Results will be written to "benchmarks/JoinBenchmark-results.txt".
+ * }}}
  */
-class JoinBenchmark extends BenchmarkWithCodegen {
+object JoinBenchmark extends SqlBasedBenchmark {
 
-  ignore("broadcast hash join, long key") {
+  def broadcastHashJoinLongKey(): Unit = {
     val N = 20 << 20
     val M = 1 << 16
 
-    val dim = broadcast(sparkSession.range(M).selectExpr("id as k", "cast(id as string) as v"))
-    runBenchmark("Join w long", N) {
-      val df = sparkSession.range(N).join(dim, (col("id") % M) === col("k"))
+    val dim = broadcast(spark.range(M).selectExpr("id as k", "cast(id as string) as v"))
+    codegenBenchmark("Join w long", N) {
+      val df = spark.range(N).join(dim, (col("id") % M) === col("k"))
       assert(df.queryExecution.sparkPlan.find(_.isInstanceOf[BroadcastHashJoinExec]).isDefined)
       df.count()
     }
-
-    /*
-    Java HotSpot(TM) 64-Bit Server VM 1.7.0_60-b19 on Mac OS X 10.9.5
-    Intel(R) Core(TM) i7-4558U CPU @ 2.80GHz
-    Join w long:                        Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-    -------------------------------------------------------------------------------------------
-    Join w long codegen=false                 3002 / 3262          7.0         143.2       1.0X
-    Join w long codegen=true                   321 /  371         65.3          15.3       9.3X
-    */
   }
 
-  ignore("broadcast hash join, long key with duplicates") {
+  def broadcastHashJoinLongKeyWithDuplicates(): Unit = {
     val N = 20 << 20
     val M = 1 << 16
-
-    val dim = broadcast(sparkSession.range(M).selectExpr("id as k", "cast(id as string) as v"))
-    runBenchmark("Join w long duplicated", N) {
-      val dim = broadcast(sparkSession.range(M).selectExpr("cast(id/10 as long) as k"))
-      val df = sparkSession.range(N).join(dim, (col("id") % M) === col("k"))
+    val dim = broadcast(spark.range(M).selectExpr("cast(id/10 as long) as k"))
+    codegenBenchmark("Join w long duplicated", N) {
+      val df = spark.range(N).join(dim, (col("id") % M) === col("k"))
       assert(df.queryExecution.sparkPlan.find(_.isInstanceOf[BroadcastHashJoinExec]).isDefined)
       df.count()
     }
-
-    /*
-     *Java HotSpot(TM) 64-Bit Server VM 1.7.0_60-b19 on Mac OS X 10.9.5
-     *Intel(R) Core(TM) i7-4558U CPU @ 2.80GHz
-     *Join w long duplicated:             Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-     *-------------------------------------------------------------------------------------------
-     *Join w long duplicated codegen=false     3446 / 3478          6.1         164.3       1.0X
-     *Join w long duplicated codegen=true       322 /  351         65.2          15.3      10.7X
-     */
   }
 
-  ignore("broadcast hash join, two int key") {
+  def broadcastHashJoinTwoIntKey(): Unit = {
     val N = 20 << 20
     val M = 1 << 16
-    val dim2 = broadcast(sparkSession.range(M)
+    val dim2 = broadcast(spark.range(M)
       .selectExpr("cast(id as int) as k1", "cast(id as int) as k2", "cast(id as string) as v"))
 
-    runBenchmark("Join w 2 ints", N) {
-      val df = sparkSession.range(N).join(dim2,
+    codegenBenchmark("Join w 2 ints", N) {
+      val df = spark.range(N).join(dim2,
         (col("id") % M).cast(IntegerType) === col("k1")
           && (col("id") % M).cast(IntegerType) === col("k2"))
       assert(df.queryExecution.sparkPlan.find(_.isInstanceOf[BroadcastHashJoinExec]).isDefined)
       df.count()
     }
-
-    /*
-     *Java HotSpot(TM) 64-Bit Server VM 1.7.0_60-b19 on Mac OS X 10.9.5
-     *Intel(R) Core(TM) 

[GitHub] spark issue #22597: [SPARK-25579][SQL] Use quoted attribute names if needed ...

2018-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22597
  
Yes. Please add a test case. 


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3935/
Test PASSed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97327/
Test FAILed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19045
  
**[Test build #97327 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97327/testReport)**
 for PR 19045 at commit 
[`d58f2a6`](https://github.com/apache/spark/commit/d58f2a6adc3176490da4cecf6547ac21ae1bbd0b).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22690: [SPARK-19287][CORE][STREAMING] JavaPairRDD flatMapValues...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22690
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22690: [SPARK-19287][CORE][STREAMING] JavaPairRDD flatMapValues...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22690
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97316/
Test PASSed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3934/
Test PASSed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19045
  
**[Test build #97327 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97327/testReport)**
 for PR 19045 at commit 
[`d58f2a6`](https://github.com/apache/spark/commit/d58f2a6adc3176490da4cecf6547ac21ae1bbd0b).


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22690: [SPARK-19287][CORE][STREAMING] JavaPairRDD flatMapValues...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22690
  
**[Test build #97316 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97316/testReport)**
 for PR 22690 at commit 
[`216bf7c`](https://github.com/apache/spark/commit/216bf7c023b92eb2533c66be156f32aa15b0c322).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread ifilonenko
Github user ifilonenko commented on the issue:

https://github.com/apache/spark/pull/19045
  
test this please


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97326/
Test FAILed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19045
  
**[Test build #97326 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97326/testReport)**
 for PR 19045 at commit 
[`9036b44`](https://github.com/apache/spark/commit/9036b4474162e63f4fa6042c1244dd6e24a794c9).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22414: [SPARK-25424][SQL] Window duration and slide duration wi...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22414
  
**[Test build #4374 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4374/testReport)**
 for PR 22414 at commit 
[`89e05f2`](https://github.com/apache/spark/commit/89e05f261c9d9495ef04d4d3cccb49c6b9a587fb).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19045
  
**[Test build #97326 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97326/testReport)**
 for PR 19045 at commit 
[`9036b44`](https://github.com/apache/spark/commit/9036b4474162e63f4fa6042c1244dd6e24a794c9).


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3933/
Test FAILed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21710
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97317/
Test PASSed.


---




[GitHub] spark issue #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21710
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21710
  
**[Test build #97317 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97317/testReport)**
 for PR 21710 at commit 
[`0bab5ac`](https://github.com/apache/spark/commit/0bab5aca283bacdfe36ba1c669521df9d7ff81f3).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19045
  
**[Test build #97325 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97325/testReport)**
 for PR 19045 at commit 
[`cb61f45`](https://github.com/apache/spark/commit/cb61f45be45dfeeda58b0644d21baead9082d7d4).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97325/
Test FAILed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19045
  
**[Test build #97325 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97325/testReport)**
 for PR 19045 at commit 
[`cb61f45`](https://github.com/apache/spark/commit/cb61f45be45dfeeda58b0644d21baead9082d7d4).


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3932/
Test PASSed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22381
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3931/
Test PASSed.


---




[GitHub] spark issue #22710: DO NOT MERGE

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22710
  
**[Test build #97323 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97323/testReport)**
 for PR 22710 at commit 
[`0f9f3f8`](https://github.com/apache/spark/commit/0f9f3f8a038fa5910e9c85bb6432392e208d6a20).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22381
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22710: DO NOT MERGE

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22710
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22710: DO NOT MERGE

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22710
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97323/
Test FAILed.


---




[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22381
  
**[Test build #97324 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97324/testReport)**
 for PR 22381 at commit 
[`86bf7ec`](https://github.com/apache/spark/commit/86bf7ec4bbe1bca268f85464b73aa0cdcb0bf163).


---




[GitHub] spark issue #22710: DO NOT MERGE

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22710
  
**[Test build #97323 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97323/testReport)**
 for PR 22710 at commit 
[`0f9f3f8`](https://github.com/apache/spark/commit/0f9f3f8a038fa5910e9c85bb6432392e208d6a20).


---




[GitHub] spark issue #22710: DO NOT MERGE

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22710
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3930/
Test PASSed.


---




[GitHub] spark issue #22710: DO NOT MERGE

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22710
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Can one of the admins verify this patch?


---




[GitHub] spark pull request #22710: DO NOT MERGE

2018-10-12 Thread squito
GitHub user squito reopened a pull request:

https://github.com/apache/spark/pull/22710

DO NOT MERGE

just for testing

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/squito/spark blah

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22710.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22710


commit ca4f4f39730e86fada6d136049a11ecc8e31b81d
Author: Imran Rashid 
Date:   2018-10-12T18:12:11Z

just for testing

commit 0f9f3f8a038fa5910e9c85bb6432392e208d6a20
Author: Imran Rashid 
Date:   2018-10-12T22:13:17Z

more testing




---




[GitHub] spark pull request #22710: DO NOT MERGE

2018-10-12 Thread squito
Github user squito closed the pull request at:

https://github.com/apache/spark/pull/22710


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3929/
Test PASSed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97322/
Test FAILed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19045
  
**[Test build #97322 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97322/testReport)**
 for PR 19045 at commit 
[`c7eaaf6`](https://github.com/apache/spark/commit/c7eaaf646956fb1291090ede69abe71c7719f509).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ spot ins...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19045
  
**[Test build #97322 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97322/testReport)**
 for PR 19045 at commit 
[`c7eaaf6`](https://github.com/apache/spark/commit/c7eaaf646956fb1291090ede69abe71c7719f509).


---




[GitHub] spark pull request #20761: [SPARK-20327][CORE][YARN] Add CLI support for YAR...

2018-10-12 Thread szyszy
Github user szyszy commented on a diff in the pull request:

https://github.com/apache/spark/pull/20761#discussion_r224923768
  
--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ResourceRequestHelper.scala ---
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.deploy.yarn
+
+import java.lang.{Long => JLong}
+import java.lang.reflect.InvocationTargetException
+
+import scala.collection.mutable
+import scala.util.Try
+
+import org.apache.hadoop.yarn.api.records.Resource
+
+import org.apache.spark.{SparkConf, SparkException}
+import org.apache.spark.deploy.yarn.config._
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.config._
+import org.apache.spark.util.Utils
+
+/**
+ * This helper class uses some of Hadoop 3 methods from the YARN API,
+ * so we need to use reflection to avoid compile error when building against Hadoop 2.x
+ */
+private object ResourceRequestHelper extends Logging {
+  private val AMOUNT_AND_UNIT_REGEX = "([0-9]+)([A-Za-z]*)".r
+  private val RESOURCE_INFO_CLASS = "org.apache.hadoop.yarn.api.records.ResourceInformation"
+
+  /**
+   * Validates sparkConf and throws a SparkException if any of standard resources (memory or cores)
+   * is defined with the property spark.yarn.x.resource.y
+   */
+  def validateResources(sparkConf: SparkConf): Unit = {
+    val resourceDefinitions = Seq[(String, String)](
+      (AM_MEMORY.key, YARN_AM_RESOURCE_TYPES_PREFIX + "memory"),

I think the code is now complete, please check!
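
To illustrate the `AMOUNT_AND_UNIT_REGEX` pattern quoted in the diff, here is a small sketch of how such a pattern can split a resource value into a numeric amount and a unit suffix. The pattern string is copied from the diff; the object name `AmountAndUnitDemo` and the `parse` helper are hypothetical and are not taken from the PR.

```scala
object AmountAndUnitDemo {
  // Same pattern string as AMOUNT_AND_UNIT_REGEX in the diff above.
  private val AmountAndUnit = "([0-9]+)([A-Za-z]*)".r

  // Splits a value such as "2g" into (2, "g"). Returns None when the whole
  // string does not match, because a Regex extractor matches the full input.
  // Note: amount.toLong throws for digit runs that do not fit in a Long.
  def parse(value: String): Option[(Long, String)] = value match {
    case AmountAndUnit(amount, unit) => Some((amount.toLong, unit))
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    Seq("2g", "512", "g2", "1.5g").foreach { v =>
      println(s"$v -> ${parse(v)}")
    }
  }
}
```

The unit group is optional, so a bare number parses with an empty unit, while values that start with a letter or contain a decimal point are rejected.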


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20761
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20761
  
**[Test build #97321 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97321/testReport)**
 for PR 20761 at commit 
[`dc2e382`](https://github.com/apache/spark/commit/dc2e382ff1e468f7e54e14a12fdfcf983b70ea0f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20761
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97321/
Test PASSed.


---



