spark git commit: [SPARK-20614][PROJECT INFRA] Use the same log4j configuration with Jenkins in AppVeyor

2017-05-05 Thread felixcheung
Repository: spark
Updated Branches:
  refs/heads/master 5d75b14bf -> b433acae7


[SPARK-20614][PROJECT INFRA] Use the same log4j configuration with Jenkins in 
AppVeyor

## What changes were proposed in this pull request?

Currently, the AppVeyor console is flooded with logs. This has been fine
because we can download all the logs. However (given my observations so far),
logs are truncated when there are too many. The log volume has grown recently
and the output has started to get truncated. For example, see
https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/build/1209-master

Even after the log is downloaded, it looks truncated as below:

```
[00:44:21] 17/05/04 18:56:18 INFO TaskSetManager: Finished task 197.0 in stage 
601.0 (TID 9211) in 0 ms on localhost (executor driver) (194/200)
[00:44:21] 17/05/04 18:56:18 INFO Executor: Running task 199.0 in stage 601.0 
(TID 9213)
[00:44:21] 17/05/04 18:56:18 INFO Executor: Finished task 198.0 in stage 601.0 
(TID 9212). 2473 bytes result sent to driver
...
```

It probably looks better to use the same log4j configuration that we use for
the SparkR tests in Jenkins (please see
https://github.com/apache/spark/blob/fc472bddd1d9c6a28e57e31496c0166777af597e/R/run-tests.sh#L26
and
https://github.com/apache/spark/blob/fc472bddd1d9c6a28e57e31496c0166777af597e/R/log4j.properties):
```
# Set everything to be logged to the file target/unit-tests.log
log4j.rootCategory=INFO, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.append=true
log4j.appender.file.file=R/target/unit-tests.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss.SSS} %t %p 
%c{1}: %m%n

# Ignore messages below warning level from Jetty, because it's a bit verbose
log4j.logger.org.eclipse.jetty=WARN
org.eclipse.jetty.LEVEL=WARN
```

## How was this patch tested?

Manually tested with the spark-test account:
  - https://ci.appveyor.com/project/spark-test/spark/build/672-r-log4j (there
is an example of a flaky test here)
  - https://ci.appveyor.com/project/spark-test/spark/build/673-r-log4j (I
re-ran the build).

Author: hyukjinkwon 

Closes #17873 from HyukjinKwon/appveyor-reduce-logs.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b433acae
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b433acae
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b433acae

Branch: refs/heads/master
Commit: b433acae74887e59f2e237a6284a4ae04fbbe854
Parents: 5d75b14
Author: hyukjinkwon 
Authored: Fri May 5 21:26:55 2017 -0700
Committer: Felix Cheung 
Committed: Fri May 5 21:26:55 2017 -0700

--
 appveyor.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/b433acae/appveyor.yml
--
diff --git a/appveyor.yml b/appveyor.yml
index bbb2758..4d31af7 100644
--- a/appveyor.yml
+++ b/appveyor.yml
@@ -49,7 +49,7 @@ build_script:
   - cmd: mvn -DskipTests -Psparkr -Phive -Phive-thriftserver package
 
 test_script:
-  - cmd: .\bin\spark-submit2.cmd --conf spark.hadoop.fs.defaultFS="file:///" R\pkg\tests\run-all.R
+  - cmd: .\bin\spark-submit2.cmd --driver-java-options "-Dlog4j.configuration=file:///%CD:\=/%/R/log4j.properties" --conf spark.hadoop.fs.defaultFS="file:///" R\pkg\tests\run-all.R
 
 notifications:
   - provider: Email


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20208][DOCS][FOLLOW-UP] Add FP-Growth to SparkR programming guide

2017-05-05 Thread felixcheung
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 1d9b7a74a -> 423a78625


[SPARK-20208][DOCS][FOLLOW-UP] Add FP-Growth to SparkR programming guide

## What changes were proposed in this pull request?

Add `spark.fpGrowth` to the SparkR programming guide.

## How was this patch tested?

Manual tests.

Author: zero323 

Closes #17775 from zero323/SPARK-20208-FOLLOW-UP.

(cherry picked from commit ba7666274e71f1903e5050a5e53fbdcd21debde5)
Signed-off-by: Felix Cheung 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/423a7862
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/423a7862
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/423a7862

Branch: refs/heads/branch-2.2
Commit: 423a78625620523ab6a51b2274548a985fc18ed0
Parents: 1d9b7a7
Author: zero323 
Authored: Thu Apr 27 00:34:20 2017 -0700
Committer: Felix Cheung 
Committed: Fri May 5 20:55:44 2017 -0700

--
 docs/sparkr.md | 4 
 1 file changed, 4 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/423a7862/docs/sparkr.md
--
diff --git a/docs/sparkr.md b/docs/sparkr.md
index 4039520..18db4cb 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -476,6 +476,10 @@ SparkR supports the following machine learning algorithms currently:
 
 * [`spark.als`](api/R/spark.als.html): [`Alternating Least Squares (ALS)`](ml-collaborative-filtering.html#collaborative-filtering)
 
+ Frequent Pattern Mining
+
+* [`spark.fpGrowth`](api/R/spark.fpGrowth.html) : [`FP-growth`](ml-frequent-pattern-mining.html#fp-growth)
+
  Statistics
 
 * [`spark.kstest`](api/R/spark.kstest.html): `Kolmogorov-Smirnov Test`


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20616] RuleExecutor logDebug of batch results should show diff to start of batch

2017-05-05 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master b31648c08 -> 5d75b14bf


[SPARK-20616] RuleExecutor logDebug of batch results should show diff to start 
of batch

## What changes were proposed in this pull request?

Due to a likely typo, the logDebug message printing the diff of query plans
shows a diff against the initial plan, not against the plan at the start of the batch.
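
For readers unfamiliar with the helper, `sideBySide` renders two plan strings as a
two-column diff. A minimal standalone sketch of the idea (this is illustrative only,
not the actual Catalyst helper; its exact formatting and markers may differ):

```scala
object SideBySideSketch {
  // Pair up the lines of two plan strings and flag the lines that differ.
  def sideBySide(left: String, right: String): Seq[String] = {
    val leftLines = left.split("\n").toSeq
    val rightLines = right.split("\n").toSeq
    val width = leftLines.map(_.length).foldLeft(0)(math.max)
    leftLines.zipAll(rightLines, "", "").map { case (l, r) =>
      val marker = if (l == r) " " else "!"
      val padded = l + " " * math.max(0, width - l.length)
      s"$marker$padded   $r"
    }
  }

  def main(args: Array[String]): Unit = {
    // batchStartPlan vs curPlan: the patch diffs against the former,
    // not against the very first plan the executor ever saw.
    val batchStartPlan = "Project [a]\n+- Filter (a > 1)\n   +- Relation t"
    val curPlan = "Project [a]\n+- Relation t"
    sideBySide(batchStartPlan, curPlan).foreach(println)
  }
}
```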

## How was this patch tested?

Now the debug message prints the diff between the start and the end of the batch.

Author: Juliusz Sompolski 

Closes #17875 from juliuszsompolski/SPARK-20616.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5d75b14b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5d75b14b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5d75b14b

Branch: refs/heads/master
Commit: 5d75b14bf0f4c1f0813287efaabf49797908ed55
Parents: b31648c
Author: Juliusz Sompolski 
Authored: Fri May 5 15:31:06 2017 -0700
Committer: Reynold Xin 
Committed: Fri May 5 15:31:06 2017 -0700

--
 .../scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/5d75b14b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
index 6fc828f..85b368c 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
@@ -122,7 +122,7 @@ abstract class RuleExecutor[TreeType <: TreeNode[_]] 
extends Logging {
 logDebug(
   s"""
   |=== Result of Batch ${batch.name} ===
-  |${sideBySide(plan.treeString, curPlan.treeString).mkString("\n")}
+  |${sideBySide(batchStartPlan.treeString, curPlan.treeString).mkString("\n")}
 """.stripMargin)
   } else {
 logTrace(s"Batch ${batch.name} has no effect.")


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20616] RuleExecutor logDebug of batch results should show diff to start of batch

2017-05-05 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 f59c74a94 -> 1d9b7a74a


[SPARK-20616] RuleExecutor logDebug of batch results should show diff to start 
of batch

## What changes were proposed in this pull request?

Due to a likely typo, the logDebug message printing the diff of query plans
shows a diff against the initial plan, not against the plan at the start of the batch.

## How was this patch tested?

Now the debug message prints the diff between the start and the end of the batch.

Author: Juliusz Sompolski 

Closes #17875 from juliuszsompolski/SPARK-20616.

(cherry picked from commit 5d75b14bf0f4c1f0813287efaabf49797908ed55)
Signed-off-by: Reynold Xin 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1d9b7a74
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1d9b7a74
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1d9b7a74

Branch: refs/heads/branch-2.2
Commit: 1d9b7a74a839021814ab28d3eba3636c64483130
Parents: f59c74a
Author: Juliusz Sompolski 
Authored: Fri May 5 15:31:06 2017 -0700
Committer: Reynold Xin 
Committed: Fri May 5 15:31:13 2017 -0700

--
 .../scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/1d9b7a74/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
index 6fc828f..85b368c 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
@@ -122,7 +122,7 @@ abstract class RuleExecutor[TreeType <: TreeNode[_]] 
extends Logging {
 logDebug(
   s"""
   |=== Result of Batch ${batch.name} ===
-  |${sideBySide(plan.treeString, curPlan.treeString).mkString("\n")}
+  |${sideBySide(batchStartPlan.treeString, curPlan.treeString).mkString("\n")}
 """.stripMargin)
   } else {
 logTrace(s"Batch ${batch.name} has no effect.")


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20616] RuleExecutor logDebug of batch results should show diff to start of batch

2017-05-05 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 704b249b6 -> a1112c615


[SPARK-20616] RuleExecutor logDebug of batch results should show diff to start 
of batch

## What changes were proposed in this pull request?

Due to a likely typo, the logDebug message printing the diff of query plans
shows a diff against the initial plan, not against the plan at the start of the batch.

## How was this patch tested?

Now the debug message prints the diff between the start and the end of the batch.

Author: Juliusz Sompolski 

Closes #17875 from juliuszsompolski/SPARK-20616.

(cherry picked from commit 5d75b14bf0f4c1f0813287efaabf49797908ed55)
Signed-off-by: Reynold Xin 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a1112c61
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a1112c61
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a1112c61

Branch: refs/heads/branch-2.1
Commit: a1112c615b05d615048159c9d324aa10a4391d4e
Parents: 704b249
Author: Juliusz Sompolski 
Authored: Fri May 5 15:31:06 2017 -0700
Committer: Reynold Xin 
Committed: Fri May 5 15:31:23 2017 -0700

--
 .../scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/a1112c61/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
index 6fc828f..85b368c 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
@@ -122,7 +122,7 @@ abstract class RuleExecutor[TreeType <: TreeNode[_]] 
extends Logging {
 logDebug(
   s"""
   |=== Result of Batch ${batch.name} ===
-  |${sideBySide(plan.treeString, curPlan.treeString).mkString("\n")}
+  |${sideBySide(batchStartPlan.treeString, curPlan.treeString).mkString("\n")}
 """.stripMargin)
   } else {
 logTrace(s"Batch ${batch.name} has no effect.")


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20132][DOCS] Add documentation for column string functions

2017-05-05 Thread lian
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 24fffacad -> f59c74a94


[SPARK-20132][DOCS] Add documentation for column string functions

## What changes were proposed in this pull request?
Add docstrings to column.py for the Column functions `rlike`, `like`,
`startswith`, and `endswith`, and pass these docstrings through `_bin_op`.

There may be a better place to put the docstrings. I put them immediately above 
the Column class.
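
For reference, the same four predicates also exist on the Scala `Column` API. A small
sketch of their semantics (the patch itself only touches the PySpark docstrings, and
the DataFrame below is made up for illustration):

```scala
import org.apache.spark.sql.SparkSession

object ColumnPredicatesSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("column-predicates").getOrCreate()
    import spark.implicits._

    val df = Seq((2, "Alice"), (5, "Bob")).toDF("age", "name")
    df.filter($"name".rlike("ice$")).show()     // regex match
    df.filter($"name".like("Al%")).show()       // SQL LIKE pattern
    df.filter($"name".startsWith("Al")).show()  // plain prefix match, no regex
    df.filter($"name".endsWith("ice")).show()   // plain suffix match, no regex

    spark.stop()
  }
}
```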

## How was this patch tested?

I ran `make html` on my local computer to rebuild the documentation, and
verified that the HTML pages were displaying the docstrings correctly. I tried
running `dev-tests`, and the formatting tests passed. However, my mvn build
didn't work, I think due to issues on my computer.

These docstrings are my original work and are freely licensed.

davies has done the most recent work reorganizing `_bin_op`

Author: Michael Patterson 

Closes #17469 from map222/patterson-documentation.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f59c74a9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f59c74a9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f59c74a9

Branch: refs/heads/branch-2.2
Commit: f59c74a9460b0db4e6c3ecbe872e2eaadc43e2cc
Parents: 24fffac
Author: Michael Patterson 
Authored: Sat Apr 22 19:58:54 2017 -0700
Committer: Cheng Lian 
Committed: Fri May 5 13:26:49 2017 -0700

--
 python/pyspark/sql/column.py | 70 +++
 1 file changed, 64 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f59c74a9/python/pyspark/sql/column.py
--
diff --git a/python/pyspark/sql/column.py b/python/pyspark/sql/column.py
index ec05c18..46c1707 100644
--- a/python/pyspark/sql/column.py
+++ b/python/pyspark/sql/column.py
@@ -250,11 +250,50 @@ class Column(object):
 raise TypeError("Column is not iterable")
 
 # string methods
+_rlike_doc = """
+Return a Boolean :class:`Column` based on a regex match.
+
+:param other: an extended regex expression
+
+>>> df.filter(df.name.rlike('ice$')).collect()
+[Row(age=2, name=u'Alice')]
+"""
+_like_doc = """
+Return a Boolean :class:`Column` based on a SQL LIKE match.
+
+:param other: a SQL LIKE pattern
+
+See :func:`rlike` for a regex version
+
+>>> df.filter(df.name.like('Al%')).collect()
+[Row(age=2, name=u'Alice')]
+"""
+_startswith_doc = """
+Return a Boolean :class:`Column` based on a string match.
+
+:param other: string at start of line (do not use a regex `^`)
+
+>>> df.filter(df.name.startswith('Al')).collect()
+[Row(age=2, name=u'Alice')]
+>>> df.filter(df.name.startswith('^Al')).collect()
+[]
+"""
+_endswith_doc = """
+Return a Boolean :class:`Column` based on matching end of string.
+
+:param other: string at end of line (do not use a regex `$`)
+
+>>> df.filter(df.name.endswith('ice')).collect()
+[Row(age=2, name=u'Alice')]
+>>> df.filter(df.name.endswith('ice$')).collect()
+[]
+"""
+
 contains = _bin_op("contains")
-rlike = _bin_op("rlike")
-like = _bin_op("like")
-startswith = _bin_op("startsWith")
-endswith = _bin_op("endsWith")
+rlike = ignore_unicode_prefix(_bin_op("rlike", _rlike_doc))
+like = ignore_unicode_prefix(_bin_op("like", _like_doc))
+startswith = ignore_unicode_prefix(_bin_op("startsWith", _startswith_doc))
+endswith = ignore_unicode_prefix(_bin_op("endsWith", _endswith_doc))
 
 @ignore_unicode_prefix
 @since(1.3)
@@ -303,8 +342,27 @@ class Column(object):
 desc = _unary_op("desc", "Returns a sort expression based on the"
  " descending order of the given column name.")
 
-isNull = _unary_op("isNull", "True if the current expression is null.")
-isNotNull = _unary_op("isNotNull", "True if the current expression is not 
null.")
+_isNull_doc = """
+True if the current expression is null. Often combined with
+:func:`DataFrame.filter` to select rows with null values.
+
+>>> from pyspark.sql import Row
+>>> df2 = sc.parallelize([Row(name=u'Tom', height=80), Row(name=u'Alice', height=None)]).toDF()
+>>> df2.filter(df2.height.isNull()).collect()
+[Row(height=None, name=u'Alice')]
+"""
+_isNotNull_doc = """
+True if the current expression is not null. Often combined with
+:func:`DataFrame.filter` to select rows with non-null values.
+
+>>> from pyspark.sql import Row
+>>> df2 = sc.parallelize([Row(name=u'Tom', height=80), Row(name=u'Alice', height=None)]).toDF()
+>>> df2.filter(df2.height.isNotNull()).collect()
+[Row(height=80, name=u'Tom')]
+"""
+
+isNull = ignore_unicode_prefix(_unary_op

spark git commit: [SPARK-20557][SQL] Support for db column type TIMESTAMP WITH TIME ZONE

2017-05-05 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master bd5788287 -> b31648c08


[SPARK-20557][SQL] Support for db column type TIMESTAMP WITH TIME ZONE

## What changes were proposed in this pull request?

SparkSQL can now read from a database table with column type [TIMESTAMP WITH 
TIME 
ZONE](https://docs.oracle.com/javase/8/docs/api/java/sql/Types.html#TIMESTAMP_WITH_TIMEZONE).
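
As a rough usage sketch (the connection URL and table below are placeholders, not from
the patch; the integration test points at a dockerized Oracle instance instead), reading
such a table after this change should yield a `TimestampType` column:

```scala
import java.util.Properties
import org.apache.spark.sql.SparkSession

object TimestampTzReadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("tz-read").getOrCreate()

    // Placeholder JDBC URL for illustration only.
    val jdbcUrl = "jdbc:oracle:thin:@//localhost:1521/xe"
    val props = new Properties()

    val df = spark.read.jdbc(jdbcUrl, "ts_with_timezone", props)
    df.printSchema()  // the TIMESTAMP WITH TIME ZONE column should map to TimestampType
    spark.stop()
  }
}
```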

## How was this patch tested?

Tested against an Oracle database.

JoshRosen, you seem to know this class; would you take a look? Thanks!

Author: Jannik Arndt 

Closes #17832 from JannikArndt/spark-20557-timestamp-with-timezone.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b31648c0
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b31648c0
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b31648c0

Branch: refs/heads/master
Commit: b31648c081e8db34e0d6c71875318f7b0b047c8b
Parents: bd57882
Author: Jannik Arndt 
Authored: Fri May 5 11:42:55 2017 -0700
Committer: Xiao Li 
Committed: Fri May 5 11:42:55 2017 -0700

--
 .../apache/spark/sql/jdbc/OracleIntegrationSuite.scala | 13 +
 .../sql/execution/datasources/jdbc/JdbcUtils.scala |  3 +++
 2 files changed, 16 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/b31648c0/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
--
diff --git 
a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
 
b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
index 1bb89a3..85d4a4a 100644
--- 
a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
+++ 
b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
@@ -70,6 +70,12 @@ class OracleIntegrationSuite extends 
DockerJDBCIntegrationSuite with SharedSQLCo
   """.stripMargin.replaceAll("\n", " ")).executeUpdate()
 conn.commit()
 
+conn.prepareStatement("CREATE TABLE ts_with_timezone (id NUMBER(10), t 
TIMESTAMP WITH TIME ZONE)")
+.executeUpdate()
+conn.prepareStatement("INSERT INTO ts_with_timezone VALUES (1, 
to_timestamp_tz('1999-12-01 11:00:00 UTC','-MM-DD HH:MI:SS TZR'))")
+.executeUpdate()
+conn.commit()
+
 sql(
   s"""
 |CREATE TEMPORARY VIEW datetime
@@ -185,4 +191,11 @@ class OracleIntegrationSuite extends 
DockerJDBCIntegrationSuite with SharedSQLCo
 sql("INSERT INTO TABLE datetime1 SELECT * FROM datetime where id = 1")
 checkRow(sql("SELECT * FROM datetime1 where id = 1").head())
   }
+
+  test("SPARK-20557: column type TIMEZONE with TIME STAMP should be 
recognized") {
+val dfRead = sqlContext.read.jdbc(jdbcUrl, "ts_with_timezone", new 
Properties)
+val rows = dfRead.collect()
+val types = rows(0).toSeq.map(x => x.getClass.toString)
+assert(types(1).equals("class java.sql.Timestamp"))
+  }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/b31648c0/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
index 0183805..fb877d1 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
@@ -223,6 +223,9 @@ object JdbcUtils extends Logging {
   case java.sql.Types.STRUCT=> StringType
   case java.sql.Types.TIME  => TimestampType
   case java.sql.Types.TIMESTAMP => TimestampType
+  case java.sql.Types.TIMESTAMP_WITH_TIMEZONE
+=> TimestampType
+  case -101 => TimestampType // Value for Timestamp with Time Zone in Oracle
   case java.sql.Types.TINYINT   => IntegerType
   case java.sql.Types.VARBINARY => BinaryType
   case java.sql.Types.VARCHAR   => StringType


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20603][SS][TEST] Set default number of topic partitions to 1 to reduce the load

2017-05-05 Thread zsxwing
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 2a7f5dae5 -> 704b249b6


[SPARK-20603][SS][TEST] Set default number of topic partitions to 1 to reduce 
the load

## What changes were proposed in this pull request?

I checked the logs of 
https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.2-test-maven-hadoop-2.7/47/
 and found it took several seconds to create Kafka internal topic 
`__consumer_offsets`. As Kafka creates this topic lazily, the topic creation 
happens in the first test `deserialization of initial offset with Spark 2.1.0` 
and causes it to time out.

This PR changes `offsets.topic.num.partitions` from the default value 50 to 1 
to make creating `__consumer_offsets` (50 partitions -> 1 partition) much 
faster.
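
A trimmed sketch of how the broker `Properties` end up being assembled (only the lines
relevant here; `withBrokerProps` is the per-suite override map from `KafkaTestUtils`,
and the remaining defaults are omitted):

```scala
import java.util.Properties
import scala.collection.JavaConverters._

object BrokerPropsSketch {
  def brokerProps(withBrokerProps: Map[String, Object] = Map.empty): Properties = {
    val props = new Properties()
    props.put("delete.topic.enable", "true")
    // One partition keeps the lazy creation of __consumer_offsets cheap,
    // so the first test no longer pays for creating 50 partitions.
    props.put("offsets.topic.num.partitions", "1")
    props.putAll(withBrokerProps.asJava)  // suite-specific overrides win
    props
  }
}
```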

## How was this patch tested?

Jenkins

Author: Shixiong Zhu 

Closes #17863 from zsxwing/fix-kafka-flaky-test.

(cherry picked from commit bd5788287957d8610a6d19c273b75bd4cdd2d166)
Signed-off-by: Shixiong Zhu 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/704b249b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/704b249b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/704b249b

Branch: refs/heads/branch-2.1
Commit: 704b249b6a3ea956086d6c6ef50da18e8228eeb4
Parents: 2a7f5da
Author: Shixiong Zhu 
Authored: Fri May 5 11:08:26 2017 -0700
Committer: Shixiong Zhu 
Committed: Fri May 5 11:08:43 2017 -0700

--
 .../test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala   | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/704b249b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
--
diff --git 
a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
 
b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
index c2cbd86..4345f88 100644
--- 
a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
+++ 
b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
@@ -281,6 +281,7 @@ class KafkaTestUtils(withBrokerProps: Map[String, Object] = 
Map.empty) extends L
 props.put("log.flush.interval.messages", "1")
 props.put("replica.socket.timeout.ms", "1500")
 props.put("delete.topic.enable", "true")
+props.put("offsets.topic.num.partitions", "1")
 props.putAll(withBrokerProps.asJava)
 props
   }


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20603][SS][TEST] Set default number of topic partitions to 1 to reduce the load

2017-05-05 Thread zsxwing
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 f71aea6a0 -> 24fffacad


[SPARK-20603][SS][TEST] Set default number of topic partitions to 1 to reduce 
the load

## What changes were proposed in this pull request?

I checked the logs of 
https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.2-test-maven-hadoop-2.7/47/
 and found it took several seconds to create Kafka internal topic 
`__consumer_offsets`. As Kafka creates this topic lazily, the topic creation 
happens in the first test `deserialization of initial offset with Spark 2.1.0` 
and causes it to time out.

This PR changes `offsets.topic.num.partitions` from the default value 50 to 1 
to make creating `__consumer_offsets` (50 partitions -> 1 partition) much 
faster.

## How was this patch tested?

Jenkins

Author: Shixiong Zhu 

Closes #17863 from zsxwing/fix-kafka-flaky-test.

(cherry picked from commit bd5788287957d8610a6d19c273b75bd4cdd2d166)
Signed-off-by: Shixiong Zhu 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/24fffaca
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/24fffaca
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/24fffaca

Branch: refs/heads/branch-2.2
Commit: 24fffacad709c553e0f24ae12a8cca3ab980af3c
Parents: f71aea6
Author: Shixiong Zhu 
Authored: Fri May 5 11:08:26 2017 -0700
Committer: Shixiong Zhu 
Committed: Fri May 5 11:08:32 2017 -0700

--
 .../test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala   | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/24fffaca/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
--
diff --git 
a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
 
b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
index 2ce2760..f86b8f5 100644
--- 
a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
+++ 
b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
@@ -292,6 +292,7 @@ class KafkaTestUtils(withBrokerProps: Map[String, Object] = 
Map.empty) extends L
 props.put("log.flush.interval.messages", "1")
 props.put("replica.socket.timeout.ms", "1500")
 props.put("delete.topic.enable", "true")
+props.put("offsets.topic.num.partitions", "1")
 props.putAll(withBrokerProps.asJava)
 props
   }


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20603][SS][TEST] Set default number of topic partitions to 1 to reduce the load

2017-05-05 Thread zsxwing
Repository: spark
Updated Branches:
  refs/heads/master 41439fd52 -> bd5788287


[SPARK-20603][SS][TEST] Set default number of topic partitions to 1 to reduce 
the load

## What changes were proposed in this pull request?

I checked the logs of 
https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.2-test-maven-hadoop-2.7/47/
 and found it took several seconds to create Kafka internal topic 
`__consumer_offsets`. As Kafka creates this topic lazily, the topic creation 
happens in the first test `deserialization of initial offset with Spark 2.1.0` 
and causes it to time out.

This PR changes `offsets.topic.num.partitions` from the default value 50 to 1 
to make creating `__consumer_offsets` (50 partitions -> 1 partition) much 
faster.

## How was this patch tested?

Jenkins

Author: Shixiong Zhu 

Closes #17863 from zsxwing/fix-kafka-flaky-test.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bd578828
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bd578828
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bd578828

Branch: refs/heads/master
Commit: bd5788287957d8610a6d19c273b75bd4cdd2d166
Parents: 41439fd
Author: Shixiong Zhu 
Authored: Fri May 5 11:08:26 2017 -0700
Committer: Shixiong Zhu 
Committed: Fri May 5 11:08:26 2017 -0700

--
 .../test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala   | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/bd578828/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
--
diff --git 
a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
 
b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
index 2ce2760..f86b8f5 100644
--- 
a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
+++ 
b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
@@ -292,6 +292,7 @@ class KafkaTestUtils(withBrokerProps: Map[String, Object] = 
Map.empty) extends L
 props.put("log.flush.interval.messages", "1")
 props.put("replica.socket.timeout.ms", "1500")
 props.put("delete.topic.enable", "true")
+props.put("offsets.topic.num.partitions", "1")
 props.putAll(withBrokerProps.asJava)
 props
   }


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20381][SQL] Add SQL metrics of numOutputRows for ObjectHashAggregateExec

2017-05-05 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master b9ad2d191 -> 41439fd52


[SPARK-20381][SQL] Add SQL metrics of numOutputRows for ObjectHashAggregateExec

## What changes were proposed in this pull request?

ObjectHashAggregateExec is missing numOutputRows; this PR adds this metric for it.
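
The pattern being applied is simply "count a row as it is emitted". A generic sketch of
the same idea outside the SQL internals, with `LongAccumulator` standing in for the
internal `SQLMetric`:

```scala
import org.apache.spark.util.LongAccumulator

// Wraps any iterator and bumps an accumulator each time an element is produced,
// mirroring what the patch does in ObjectAggregationIterator.next().
class CountingIterator[T](underlying: Iterator[T], numOutputRows: LongAccumulator)
  extends Iterator[T] {
  override def hasNext: Boolean = underlying.hasNext
  override def next(): T = {
    val res = underlying.next()  // produce the element first, then count it
    numOutputRows.add(1L)
    res
  }
}
```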

## How was this patch tested?

Added unit tests for the new metrics.

Author: Yucai 

Closes #17678 from yucai/objectAgg_numOutputRows.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/41439fd5
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/41439fd5
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/41439fd5

Branch: refs/heads/master
Commit: 41439fd52dd263b9f7d92e608f027f193f461777
Parents: b9ad2d1
Author: Yucai 
Authored: Fri May 5 09:51:57 2017 -0700
Committer: Xiao Li 
Committed: Fri May 5 09:51:57 2017 -0700

--
 .../aggregate/ObjectAggregationIterator.scala |  8 ++--
 .../aggregate/ObjectHashAggregateExec.scala   |  3 ++-
 .../sql/execution/metric/SQLMetricsSuite.scala| 18 ++
 3 files changed, 26 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/41439fd5/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala
index 3a7fcf1..6e47f9d 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala
@@ -24,6 +24,7 @@ import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.expressions.aggregate._
 import org.apache.spark.sql.catalyst.expressions.codegen.{BaseOrdering, 
GenerateOrdering}
 import org.apache.spark.sql.execution.UnsafeKVExternalSorter
+import org.apache.spark.sql.execution.metric.SQLMetric
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types.StructType
 import org.apache.spark.unsafe.KVIterator
@@ -39,7 +40,8 @@ class ObjectAggregationIterator(
 newMutableProjection: (Seq[Expression], Seq[Attribute]) => 
MutableProjection,
 originalInputAttributes: Seq[Attribute],
 inputRows: Iterator[InternalRow],
-fallbackCountThreshold: Int)
+fallbackCountThreshold: Int,
+numOutputRows: SQLMetric)
   extends AggregationIterator(
 groupingExpressions,
 originalInputAttributes,
@@ -83,7 +85,9 @@ class ObjectAggregationIterator(
 
   override final def next(): UnsafeRow = {
 val entry = aggBufferIterator.next()
-generateOutput(entry.groupingKey, entry.aggregationBuffer)
+val res = generateOutput(entry.groupingKey, entry.aggregationBuffer)
+numOutputRows += 1
+res
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/spark/blob/41439fd5/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectHashAggregateExec.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectHashAggregateExec.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectHashAggregateExec.scala
index 3fcb7ec..b53521b 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectHashAggregateExec.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectHashAggregateExec.scala
@@ -117,7 +117,8 @@ case class ObjectHashAggregateExec(
   newMutableProjection(expressions, inputSchema, 
subexpressionEliminationEnabled),
 child.output,
 iter,
-fallbackCountThreshold)
+fallbackCountThreshold,
+numOutputRows)
 if (!hasInput && groupingExpressions.isEmpty) {
   numOutputRows += 1
   
Iterator.single[UnsafeRow](aggregationIterator.outputForEmptyGroupingKeyWithoutInput())

http://git-wip-us.apache.org/repos/asf/spark/blob/41439fd5/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
--
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
index 2ce7db6..e5442455 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
@@ -143,6 +143,24 @@ class SQLMetricsSuite extends Sp

spark git commit: [SPARK-20381][SQL] Add SQL metrics of numOutputRows for ObjectHashAggregateExec

2017-05-05 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 1fa3c86a7 -> f71aea6a0


[SPARK-20381][SQL] Add SQL metrics of numOutputRows for ObjectHashAggregateExec

## What changes were proposed in this pull request?

ObjectHashAggregateExec is missing numOutputRows; this PR adds this metric for it.

## How was this patch tested?

Added unit tests for the new metrics.

Author: Yucai 

Closes #17678 from yucai/objectAgg_numOutputRows.

(cherry picked from commit 41439fd52dd263b9f7d92e608f027f193f461777)
Signed-off-by: Xiao Li 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f71aea6a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f71aea6a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f71aea6a

Branch: refs/heads/branch-2.2
Commit: f71aea6a0be6eda24623d8563d971687ecd04caf
Parents: 1fa3c86
Author: Yucai 
Authored: Fri May 5 09:51:57 2017 -0700
Committer: Xiao Li 
Committed: Fri May 5 09:52:10 2017 -0700

--
 .../aggregate/ObjectAggregationIterator.scala |  8 ++--
 .../aggregate/ObjectHashAggregateExec.scala   |  3 ++-
 .../sql/execution/metric/SQLMetricsSuite.scala| 18 ++
 3 files changed, 26 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f71aea6a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala
index 3a7fcf1..6e47f9d 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala
@@ -24,6 +24,7 @@ import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.expressions.aggregate._
 import org.apache.spark.sql.catalyst.expressions.codegen.{BaseOrdering, 
GenerateOrdering}
 import org.apache.spark.sql.execution.UnsafeKVExternalSorter
+import org.apache.spark.sql.execution.metric.SQLMetric
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types.StructType
 import org.apache.spark.unsafe.KVIterator
@@ -39,7 +40,8 @@ class ObjectAggregationIterator(
 newMutableProjection: (Seq[Expression], Seq[Attribute]) => 
MutableProjection,
 originalInputAttributes: Seq[Attribute],
 inputRows: Iterator[InternalRow],
-fallbackCountThreshold: Int)
+fallbackCountThreshold: Int,
+numOutputRows: SQLMetric)
   extends AggregationIterator(
 groupingExpressions,
 originalInputAttributes,
@@ -83,7 +85,9 @@ class ObjectAggregationIterator(
 
   override final def next(): UnsafeRow = {
 val entry = aggBufferIterator.next()
-generateOutput(entry.groupingKey, entry.aggregationBuffer)
+val res = generateOutput(entry.groupingKey, entry.aggregationBuffer)
+numOutputRows += 1
+res
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/spark/blob/f71aea6a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectHashAggregateExec.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectHashAggregateExec.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectHashAggregateExec.scala
index 3fcb7ec..b53521b 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectHashAggregateExec.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectHashAggregateExec.scala
@@ -117,7 +117,8 @@ case class ObjectHashAggregateExec(
   newMutableProjection(expressions, inputSchema, 
subexpressionEliminationEnabled),
 child.output,
 iter,
-fallbackCountThreshold)
+fallbackCountThreshold,
+numOutputRows)
 if (!hasInput && groupingExpressions.isEmpty) {
   numOutputRows += 1
   
Iterator.single[UnsafeRow](aggregationIterator.outputForEmptyGroupingKeyWithoutInput())

http://git-wip-us.apache.org/repos/asf/spark/blob/f71aea6a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
--
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
index 2ce7db6..e5442455 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/

spark git commit: [SPARK-20613] Remove excess quotes in Windows executable

2017-05-05 Thread felixcheung
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 179f5370e -> 2a7f5dae5


[SPARK-20613] Remove excess quotes in Windows executable

## What changes were proposed in this pull request?

Quotes are already added to the RUNNER variable on line 54. There is no need to 
put quotes on line 67. If you do, you will get an error when launching Spark.

'""C:\Program' is not recognized as an internal or external command, operable 
program or batch file.

## How was this patch tested?

Tested manually on Windows 10.

Author: Jarrett Meyer 

Closes #17861 from jarrettmeyer/fix-windows-cmd.

(cherry picked from commit b9ad2d1916af5091c8585d06ccad8219e437e2bc)
Signed-off-by: Felix Cheung 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2a7f5dae
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2a7f5dae
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2a7f5dae

Branch: refs/heads/branch-2.1
Commit: 2a7f5dae5906f01973d2744ab7b59b54e42f302b
Parents: 179f537
Author: Jarrett Meyer 
Authored: Fri May 5 08:30:42 2017 -0700
Committer: Felix Cheung 
Committed: Fri May 5 08:35:19 2017 -0700

--
 bin/spark-class2.cmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/2a7f5dae/bin/spark-class2.cmd
--
diff --git a/bin/spark-class2.cmd b/bin/spark-class2.cmd
index 9faa7d6..f6157f4 100644
--- a/bin/spark-class2.cmd
+++ b/bin/spark-class2.cmd
@@ -51,7 +51,7 @@ if not "x%SPARK_PREPEND_CLASSES%"=="x" (
 rem Figure out where java is.
 set RUNNER=java
 if not "x%JAVA_HOME%"=="x" (
-  set RUNNER="%JAVA_HOME%\bin\java"
+  set RUNNER=%JAVA_HOME%\bin\java
 ) else (
   where /q "%RUNNER%"
   if ERRORLEVEL 1 (


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20613] Remove excess quotes in Windows executable

2017-05-05 Thread felixcheung
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 dbb54a7b3 -> 1fa3c86a7


[SPARK-20613] Remove excess quotes in Windows executable

## What changes were proposed in this pull request?

Quotes are already added to the RUNNER variable on line 54. There is no need to 
put quotes on line 67. If you do, you will get an error when launching Spark.

'""C:\Program' is not recognized as an internal or external command, operable 
program or batch file.

## How was this patch tested?

Tested manually on Windows 10.

Author: Jarrett Meyer 

Closes #17861 from jarrettmeyer/fix-windows-cmd.

(cherry picked from commit b9ad2d1916af5091c8585d06ccad8219e437e2bc)
Signed-off-by: Felix Cheung 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1fa3c86a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1fa3c86a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1fa3c86a

Branch: refs/heads/branch-2.2
Commit: 1fa3c86a740e072957a2104dbd02ca3c158c508d
Parents: dbb54a7
Author: Jarrett Meyer 
Authored: Fri May 5 08:30:42 2017 -0700
Committer: Felix Cheung 
Committed: Fri May 5 08:30:56 2017 -0700

--
 bin/spark-class2.cmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/1fa3c86a/bin/spark-class2.cmd
--
diff --git a/bin/spark-class2.cmd b/bin/spark-class2.cmd
index 9faa7d6..f6157f4 100644
--- a/bin/spark-class2.cmd
+++ b/bin/spark-class2.cmd
@@ -51,7 +51,7 @@ if not "x%SPARK_PREPEND_CLASSES%"=="x" (
 rem Figure out where java is.
 set RUNNER=java
 if not "x%JAVA_HOME%"=="x" (
-  set RUNNER="%JAVA_HOME%\bin\java"
+  set RUNNER=%JAVA_HOME%\bin\java
 ) else (
   where /q "%RUNNER%"
   if ERRORLEVEL 1 (


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20613] Remove excess quotes in Windows executable

2017-05-05 Thread felixcheung
Repository: spark
Updated Branches:
  refs/heads/master 9064f1b04 -> b9ad2d191


[SPARK-20613] Remove excess quotes in Windows executable

## What changes were proposed in this pull request?

Quotes are already added to the RUNNER variable on line 54. There is no need to 
put quotes on line 67. If you do, you will get an error when launching Spark.

'""C:\Program' is not recognized as an internal or external command, operable 
program or batch file.

## How was this patch tested?

Tested manually on Windows 10.

Author: Jarrett Meyer 

Closes #17861 from jarrettmeyer/fix-windows-cmd.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b9ad2d19
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b9ad2d19
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b9ad2d19

Branch: refs/heads/master
Commit: b9ad2d1916af5091c8585d06ccad8219e437e2bc
Parents: 9064f1b
Author: Jarrett Meyer 
Authored: Fri May 5 08:30:42 2017 -0700
Committer: Felix Cheung 
Committed: Fri May 5 08:30:42 2017 -0700

--
 bin/spark-class2.cmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/b9ad2d19/bin/spark-class2.cmd
--
diff --git a/bin/spark-class2.cmd b/bin/spark-class2.cmd
index 9faa7d6..f6157f4 100644
--- a/bin/spark-class2.cmd
+++ b/bin/spark-class2.cmd
@@ -51,7 +51,7 @@ if not "x%SPARK_PREPEND_CLASSES%"=="x" (
 rem Figure out where java is.
 set RUNNER=java
 if not "x%JAVA_HOME%"=="x" (
-  set RUNNER="%JAVA_HOME%\bin\java"
+  set RUNNER=%JAVA_HOME%\bin\java
 ) else (
   where /q "%RUNNER%"
   if ERRORLEVEL 1 (


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20495][SQL][CORE] Add StorageLevel to cacheTable API

2017-05-05 Thread wenchen
Repository: spark
Updated Branches:
  refs/heads/master 5773ab121 -> 9064f1b04


[SPARK-20495][SQL][CORE] Add StorageLevel to cacheTable API

## What changes were proposed in this pull request?

Currently the cacheTable API only supports MEMORY_AND_DISK. This PR adds an
additional API that takes different storage levels.
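
A usage sketch of the new overload (available from 2.3.0; the table name below is made
up for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object CacheTableLevelSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("cache-level").getOrCreate()

    spark.range(100).createOrReplaceTempView("t")
    // Existing API: always caches at the default MEMORY_AND_DISK level.
    spark.catalog.cacheTable("t")
    spark.catalog.uncacheTable("t")
    // New API from this PR: the caller picks the storage level explicitly.
    spark.catalog.cacheTable("t", StorageLevel.DISK_ONLY)

    spark.stop()
  }
}
```
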
## How was this patch tested?

Unit tests.

Author: madhu 

Closes #17802 from phatak-dev/cacheTableAPI.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9064f1b0
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9064f1b0
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9064f1b0

Branch: refs/heads/master
Commit: 9064f1b04461513a147aeb8179471b05595ddbc4
Parents: 5773ab1
Author: madhu 
Authored: Fri May 5 22:44:03 2017 +0800
Committer: Wenchen Fan 
Committed: Fri May 5 22:44:03 2017 +0800

--
 project/MimaExcludes.scala|  2 ++
 .../scala/org/apache/spark/sql/catalog/Catalog.scala  | 14 +-
 .../org/apache/spark/sql/internal/CatalogImpl.scala   | 13 +
 .../org/apache/spark/sql/internal/CatalogSuite.scala  |  8 
 4 files changed, 36 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/9064f1b0/project/MimaExcludes.scala
--
diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index dbf933f..d50882c 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -36,6 +36,8 @@ object MimaExcludes {
 
   // Exclude rules for 2.3.x
   lazy val v23excludes = v22excludes ++ Seq(
+// [SPARK-20495][SQL] Add StorageLevel to cacheTable API
+ProblemFilters.exclude[ReversedMissingMethodProblem]("org.apache.spark.sql.catalog.Catalog.cacheTable")
   )
 
   // Exclude rules for 2.2.x

http://git-wip-us.apache.org/repos/asf/spark/blob/9064f1b0/sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala
index 7e5da01..ab81725 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala
@@ -22,7 +22,7 @@ import scala.collection.JavaConverters._
 import org.apache.spark.annotation.{Experimental, InterfaceStability}
 import org.apache.spark.sql.{AnalysisException, DataFrame, Dataset}
 import org.apache.spark.sql.types.StructType
-
+import org.apache.spark.storage.StorageLevel
 
 /**
  * Catalog interface for Spark. To access this, use `SparkSession.catalog`.
@@ -477,6 +477,18 @@ abstract class Catalog {
   def cacheTable(tableName: String): Unit
 
   /**
+   * Caches the specified table with the given storage level.
+   *
+   * @param tableName is either a qualified or unqualified name that designates a table/view.
+   *  If no database identifier is provided, it refers to a temporary view or
+   *  a table/view in the current database.
+   * @param storageLevel storage level to cache table.
+   * @since 2.3.0
+   */
+  def cacheTable(tableName: String, storageLevel: StorageLevel): Unit
+
+
+  /**
* Removes the specified table from the in-memory cache.
*
* @param tableName is either a qualified or unqualified name that 
designates a table/view.

http://git-wip-us.apache.org/repos/asf/spark/blob/9064f1b0/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala
index 0b8e538..e1049c6 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala
@@ -30,6 +30,8 @@ import 
org.apache.spark.sql.catalyst.plans.logical.LocalRelation
 import 
org.apache.spark.sql.execution.command.AlterTableRecoverPartitionsCommand
 import org.apache.spark.sql.execution.datasources.{CreateTable, DataSource}
 import org.apache.spark.sql.types.StructType
+import org.apache.spark.storage.StorageLevel
+
 
 
 /**
@@ -420,6 +422,17 @@ class CatalogImpl(sparkSession: SparkSession) extends 
Catalog {
   }
 
   /**
+   * Caches the specified table or view with the given storage level.
+   *
+   * @group cachemgmt
+   * @since 2.3.0
+   */
+  override def cacheTable(tableName: String, storageLevel: StorageLevel): Unit = {
+sparkSession.sharedState.cacheManager.cacheQuery(
+  sparkSession.table(tableName), Some(tableName), storageLevel)
+  }
+

spark git commit: [SPARK-20546][DEPLOY] spark-class gets syntax error in posix mode

2017-05-05 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 d10b0f654 -> 179f5370e


[SPARK-20546][DEPLOY] spark-class gets syntax error in posix mode

## What changes were proposed in this pull request?

Updated spark-class to turn off posix mode so the process substitution doesn't 
cause a syntax error.

## How was this patch tested?

Existing unit tests, manual spark-shell testing with posix mode on

Author: jyu00 

Closes #17852 from jyu00/master.

(cherry picked from commit 5773ab121d5d7cbefeef17ff4ac6f8af36cc1251)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/179f5370
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/179f5370
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/179f5370

Branch: refs/heads/branch-2.1
Commit: 179f5370e68aa3c1f035f8ac400129c3935e96f8
Parents: d10b0f6
Author: jyu00 
Authored: Fri May 5 11:36:51 2017 +0100
Committer: Sean Owen 
Committed: Fri May 5 11:37:10 2017 +0100

--
 bin/spark-class | 2 ++
 1 file changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/179f5370/bin/spark-class
--
diff --git a/bin/spark-class b/bin/spark-class
index 77ea40c..65d3b96 100755
--- a/bin/spark-class
+++ b/bin/spark-class
@@ -72,6 +72,8 @@ build_command() {
   printf "%d\0" $?
 }
 
+# Turn off posix mode since it does not allow process substitution
+set +o posix
 CMD=()
 while IFS= read -d '' -r ARG; do
   CMD+=("$ARG")


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20546][DEPLOY] spark-class gets syntax error in posix mode

2017-05-05 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 37cdf077c -> 5773ab121


[SPARK-20546][DEPLOY] spark-class gets syntax error in posix mode

## What changes were proposed in this pull request?

Updated spark-class to turn off posix mode so the process substitution doesn't 
cause a syntax error.

## How was this patch tested?

Existing unit tests, manual spark-shell testing with posix mode on

Author: jyu00 

Closes #17852 from jyu00/master.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5773ab12
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5773ab12
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5773ab12

Branch: refs/heads/master
Commit: 5773ab121d5d7cbefeef17ff4ac6f8af36cc1251
Parents: 37cdf07
Author: jyu00 
Authored: Fri May 5 11:36:51 2017 +0100
Committer: Sean Owen 
Committed: Fri May 5 11:36:51 2017 +0100

--
 bin/spark-class | 2 ++
 1 file changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/5773ab12/bin/spark-class
--
diff --git a/bin/spark-class b/bin/spark-class
index 77ea40c..65d3b96 100755
--- a/bin/spark-class
+++ b/bin/spark-class
@@ -72,6 +72,8 @@ build_command() {
   printf "%d\0" $?
 }
 
+# Turn off posix mode since it does not allow process substitution
+set +o posix
 CMD=()
 while IFS= read -d '' -r ARG; do
   CMD+=("$ARG")


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-20546][DEPLOY] spark-class gets syntax error in posix mode

2017-05-05 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 7cb566abc -> dbb54a7b3


[SPARK-20546][DEPLOY] spark-class gets syntax error in posix mode

## What changes were proposed in this pull request?

Updated spark-class to turn off posix mode so the process substitution doesn't 
cause a syntax error.

## How was this patch tested?

Existing unit tests, manual spark-shell testing with posix mode on

Author: jyu00 

Closes #17852 from jyu00/master.

(cherry picked from commit 5773ab121d5d7cbefeef17ff4ac6f8af36cc1251)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dbb54a7b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/dbb54a7b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/dbb54a7b

Branch: refs/heads/branch-2.2
Commit: dbb54a7b39568cc9e8046a86113b98c3c69b7d11
Parents: 7cb566a
Author: jyu00 
Authored: Fri May 5 11:36:51 2017 +0100
Committer: Sean Owen 
Committed: Fri May 5 11:36:58 2017 +0100

--
 bin/spark-class | 2 ++
 1 file changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/dbb54a7b/bin/spark-class
--
diff --git a/bin/spark-class b/bin/spark-class
index 77ea40c..65d3b96 100755
--- a/bin/spark-class
+++ b/bin/spark-class
@@ -72,6 +72,8 @@ build_command() {
   printf "%d\0" $?
 }
 
+# Turn off posix mode since it does not allow process substitution
+set +o posix
 CMD=()
 while IFS= read -d '' -r ARG; do
   CMD+=("$ARG")


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-19660][SQL] Replace the deprecated property name fs.default.name to fs.defaultFS that newly introduced

2017-05-05 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 c8756288d -> 7cb566abc


[SPARK-19660][SQL] Replace the deprecated property name fs.default.name to 
fs.defaultFS that newly introduced

## What changes were proposed in this pull request?

Replace the deprecated property name `fs.default.name` with the newly
introduced `fs.defaultFS`.
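
A minimal sketch of the renamed key on a Hadoop `Configuration` (the URI is a
placeholder, not from the patch):

```scala
import org.apache.hadoop.conf.Configuration

object DefaultFsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    // fs.default.name still works but is deprecated; fs.defaultFS is the current key.
    conf.set("fs.defaultFS", "hdfs://namenode:8020")
    println(conf.get("fs.defaultFS"))
  }
}
```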

## How was this patch tested?

Existing tests

Author: Yuming Wang 

Closes #17856 from wangyum/SPARK-19660.

(cherry picked from commit 37cdf077cd3f436f777562df311e3827b0727ce7)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7cb566ab
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7cb566ab
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7cb566ab

Branch: refs/heads/branch-2.2
Commit: 7cb566abc27d41d5816dee16c6ecb749da2adf46
Parents: c875628
Author: Yuming Wang 
Authored: Fri May 5 11:31:59 2017 +0100
Committer: Sean Owen 
Committed: Fri May 5 11:32:07 2017 +0100

--
 .../spark/sql/execution/streaming/state/StateStoreSuite.scala  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/7cb566ab/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala
--
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala
index ebb7422..cc09b2d 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala
@@ -314,7 +314,7 @@ class StateStoreSuite extends SparkFunSuite with 
BeforeAndAfter with PrivateMeth
   test("SPARK-19677: Committing a delta file atop an existing one should not 
fail on HDFS") {
 val conf = new Configuration()
 conf.set("fs.fake.impl", classOf[RenameLikeHDFSFileSystem].getName)
-conf.set("fs.default.name", "fake:///")
+conf.set("fs.defaultFS", "fake:///")
 
 val provider = newStoreProvider(hadoopConf = conf)
 provider.getStore(0).commit()


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-19660][SQL] Replace the deprecated property name fs.default.name to fs.defaultFS that newly introduced

2017-05-05 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 4411ac705 -> 37cdf077c


[SPARK-19660][SQL] Replace the deprecated property name fs.default.name to 
fs.defaultFS that newly introduced

## What changes were proposed in this pull request?

Replace the deprecated property name `fs.default.name` with the newly
introduced `fs.defaultFS`.

## How was this patch tested?

Existing tests

Author: Yuming Wang 

Closes #17856 from wangyum/SPARK-19660.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/37cdf077
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/37cdf077
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/37cdf077

Branch: refs/heads/master
Commit: 37cdf077cd3f436f777562df311e3827b0727ce7
Parents: 4411ac7
Author: Yuming Wang 
Authored: Fri May 5 11:31:59 2017 +0100
Committer: Sean Owen 
Committed: Fri May 5 11:31:59 2017 +0100

--
 .../spark/sql/execution/streaming/state/StateStoreSuite.scala  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/37cdf077/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala
--
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala
index ebb7422..cc09b2d 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala
@@ -314,7 +314,7 @@ class StateStoreSuite extends SparkFunSuite with 
BeforeAndAfter with PrivateMeth
   test("SPARK-19677: Committing a delta file atop an existing one should not 
fail on HDFS") {
 val conf = new Configuration()
 conf.set("fs.fake.impl", classOf[RenameLikeHDFSFileSystem].getName)
-conf.set("fs.default.name", "fake:///")
+conf.set("fs.defaultFS", "fake:///")
 
 val provider = newStoreProvider(hadoopConf = conf)
 provider.getStore(0).commit()


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [INFRA] Close stale PRs

2017-05-05 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 0d16faab9 -> 4411ac705


[INFRA] Close stale PRs

## What changes were proposed in this pull request?

This PR proposes to close a stale PR, several PRs suggested to be closed by a
committer, and obviously inappropriate PRs.

Closes #9
Closes #17853
Closes #17732
Closes #17456
Closes #17410
Closes #17314
Closes #17362
Closes #17542

## How was this patch tested?

N/A

Author: hyukjinkwon 

Closes #17855 from HyukjinKwon/close-pr.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4411ac70
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4411ac70
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4411ac70

Branch: refs/heads/master
Commit: 4411ac70524ced901f7807d492fb0ad2480a8841
Parents: 0d16faa
Author: hyukjinkwon 
Authored: Fri May 5 09:50:40 2017 +0100
Committer: Sean Owen 
Committed: Fri May 5 09:50:40 2017 +0100

--

--



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org