[GitHub] spark issue #19607: [SPARK-22395][SQL][PYTHON] Fix the behavior of timestamp...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19607 **[Test build #83205 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83205/testReport)** for PR 19607 at commit [`5c08ecf`](https://github.com/apache/spark/commit/5c08ecf247bfe7e14afcdef8eba1c25cb3b68634). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19607: [SPARK-22395][SQL][PYTHON] Fix the behavior of ti...
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/19607

[SPARK-22395][SQL][PYTHON] Fix the behavior of timestamp values for Pandas to respect session timezone

## What changes were proposed in this pull request?

When converting a Pandas DataFrame/Series from/to a Spark DataFrame using `toPandas()` or pandas UDFs, timestamp values respect the Python system timezone instead of the session timezone.

For example, say we use `"America/Los_Angeles"` as the session timezone and have a timestamp value `"1970-01-01 00:00:01"` in that timezone. (Btw, I'm in Japan, so the Python timezone would be `"Asia/Tokyo"`.) The timestamp value from the current `toPandas()` will be the following:

```
>>> spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")
>>> df = spark.createDataFrame([28801], "long").selectExpr("timestamp(value) as ts")
>>> df.show()
+-------------------+
|                 ts|
+-------------------+
|1970-01-01 00:00:01|
+-------------------+

>>> df.toPandas()
                   ts
0 1970-01-01 17:00:01
```

As you can see, the value becomes `"1970-01-01 17:00:01"` because it respects the Python timezone. As we discussed in #18664, we consider this behavior a bug, and the value should be `"1970-01-01 00:00:01"`.

## How was this patch tested?

Added tests and existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ueshin/apache-spark issues/SPARK-22395

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19607.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19607

commit 4735e5981ecf3a4bce50ce86f706e25830f4a801
Author: Takuya UESHIN
Date: 2017-10-23T06:27:22Z

    Add a conf to make Pandas DataFrame respect session local timezone.

commit 1f85150dc5b26df21dca6bad2ef4eaec342c4400
Author: Takuya UESHIN
Date: 2017-10-23T08:09:16Z

    Fix toPandas() behavior.

commit 5c08ecf247bfe7e14afcdef8eba1c25cb3b68634
Author: Takuya UESHIN
Date: 2017-10-23T09:15:47Z

    Modify pandas UDFs to respect session timezone.
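The timezone arithmetic in the example above can be checked directly in pandas: 28801 seconds after the epoch is 08:00:01 UTC, which is 00:00:01 in America/Los_Angeles and 17:00:01 in Asia/Tokyo. A minimal sketch (assuming pandas with its bundled timezone data is available):

```python
import pandas as pd

# 28801 seconds after the epoch, as a timezone-aware UTC timestamp
ts = pd.Timestamp(28801, unit="s", tz="UTC")

# The session-timezone view the PR wants toPandas() to produce
print(ts.tz_convert("America/Los_Angeles"))  # 1970-01-01 00:00:01-08:00

# The system-timezone view the buggy toPandas() produced in Japan
print(ts.tz_convert("Asia/Tokyo"))           # 1970-01-01 17:00:01+09:00
```

The same epoch instant renders differently depending only on which timezone is applied at conversion time, which is exactly the discrepancy the PR fixes.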
[GitHub] spark pull request #19605: [SPARK-22394] [SQL] Remove redundant synchronizat...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19605#discussion_r147619876

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -89,10 +89,12 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }

   /**
-   * Run some code involving `client` in a [[synchronized]] block and wrap certain
-   * exceptions thrown in the process in [[AnalysisException]].
+   * Run some code involving `client` and wrap certain exceptions thrown in the process in
+   * [[AnalysisException]]. Thread-safety is guaranteed here because methods in the `client`
+   * ([[org.apache.spark.sql.hive.client.HiveClientImpl]]) are already synchronized through
+   * `clientLoader` in the `retryLocked` method.
    */
-  private def withClient[T](body: => T): T = synchronized {
+  private def withClient[T](body: => T): T = {
--- End diff --

If you check the callers of `withClient`, you will find that many callers perform multiple client-related operations in the same `body`. Removing this lock might cause concurrency issues.
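The concurrency concern can be illustrated with a toy model (hypothetical names, not the actual Spark classes): even when every individual client call is synchronized, a caller that performs two client calls in one body still needs the outer lock to make the pair atomic.

```python
import threading


class ToyClient:
    """Stand-in for a client whose individual methods are synchronized."""

    def __init__(self):
        self._lock = threading.Lock()
        self.tables = {}

    def drop_table(self, name):
        with self._lock:
            self.tables.pop(name, None)

    def create_table(self, name, schema):
        with self._lock:
            self.tables[name] = schema


class ToyCatalog:
    """Stand-in for a catalog whose outer lock keeps multi-call bodies atomic."""

    def __init__(self, client):
        self._lock = threading.Lock()
        self._client = client

    def replace_table(self, name, schema):
        # Without this outer lock, another thread could observe the table as
        # dropped but not yet recreated, even though each client call above
        # is individually synchronized.
        with self._lock:
            self._client.drop_table(name)
            self._client.create_table(name, schema)
```

The per-call locks only serialize single operations; the atomicity of the drop-then-create pair comes entirely from the outer lock, which is the behavior the reviewer worries about losing.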
[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18805 **[Test build #83204 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83204/testReport)** for PR 18805 at commit [`eba3024`](https://github.com/apache/spark/commit/eba30249108f195a4442fb8cae35d5f02f5f).
[GitHub] spark pull request #18805: [SPARK-19112][CORE] Support for ZStandard codec
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18805#discussion_r147618796

--- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala ---
@@ -216,3 +218,33 @@ private final class SnappyOutputStreamWrapper(os: SnappyOutputStream) extends Ou
     }
   }
 }
+
+/**
+ * :: DeveloperApi ::
+ * ZStandard implementation of [[org.apache.spark.io.CompressionCodec]]. For more
+ * details see - http://facebook.github.io/zstd/
+ *
+ * @note The wire protocol for this codec is not guaranteed to be compatible across versions
+ * of Spark. This is intended for use as an internal compression utility within a single Spark
+ * application.
+ */
+@DeveloperApi
+class ZStdCompressionCodec(conf: SparkConf) extends CompressionCodec {
+
+  override def compressedOutputStream(s: OutputStream): OutputStream = {
+    // Default compression level for zstd compression to 1 because it is
+    // fastest of all with reasonably high compression ratio.
+    val level = conf.getSizeAsBytes("spark.io.compression.zstd.level", "1").toInt
--- End diff --

Good eye, fixed.
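The speed-vs-ratio tradeoff behind defaulting the level to 1 can be sketched with Python's stdlib `zlib` as an analog (zstd itself is not in the stdlib; zlib levels play the same role): a low level compresses faster, a high level compresses smaller.

```python
import time
import zlib

# A highly compressible sample payload
data = b"spark shuffle block " * 100_000

# Compression level trades speed for ratio: level 1 favors throughput,
# level 9 favors a smaller output (same idea as zstd's level knob).
for level in (1, 9):
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"level={level}: {len(compressed)} bytes in {elapsed:.4f}s")
```

For an internal shuffle/spill codec, throughput usually matters more than the last few percent of ratio, which is the rationale the code comment gives for level 1.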
[GitHub] spark issue #19605: [SPARK-22394] [SQL] Remove redundant synchronization for...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/19605 cc @cloud-fan @rxin @gatorsmile
[GitHub] spark issue #19604: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19604 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83201/ Test PASSed.
[GitHub] spark issue #19604: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19604 Merged build finished. Test PASSed.
[GitHub] spark issue #19604: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19604 LGTM
[GitHub] spark issue #19604: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19604

**[Test build #83201 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83201/testReport)** for PR 19604 at commit [`995e38e`](https://github.com/apache/spark/commit/995e38e118126d95b2fe5ee8416e5f36786a7b5b).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19604: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19604 Merged build finished. Test PASSed.
[GitHub] spark issue #19604: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19604 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83200/ Test PASSed.
[GitHub] spark issue #19604: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19604

**[Test build #83200 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83200/testReport)** for PR 19604 at commit [`549cb81`](https://github.com/apache/spark/commit/549cb814e01c2338a67c4a9efa4d880a3fb9cdac).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19605: [SPARK-22394] [SQL] Remove redundant synchronization for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19605 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83202/ Test PASSed.
[GitHub] spark issue #19605: [SPARK-22394] [SQL] Remove redundant synchronization for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19605 Merged build finished. Test PASSed.
[GitHub] spark issue #19605: [SPARK-22394] [SQL] Remove redundant synchronization for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19605

**[Test build #83202 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83202/testReport)** for PR 19605 at commit [`072b27d`](https://github.com/apache/spark/commit/072b27d083f2c2ed8d8bdd20caa5b0fe0ba267f6).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19599: [SPARK-22381] [ML] Add StringParam that supports valid o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19599 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83199/ Test PASSed.
[GitHub] spark issue #19599: [SPARK-22381] [ML] Add StringParam that supports valid o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19599 Merged build finished. Test PASSed.
[GitHub] spark issue #19599: [SPARK-22381] [ML] Add StringParam that supports valid o...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19599

**[Test build #83199 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83199/testReport)** for PR 19599 at commit [`01e7d3d`](https://github.com/apache/spark/commit/01e7d3d5f9b0ae278ebce60635e5c2568d3d0cf3).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19606: [SPARK-22333][SQL][Backport-2.2]timeFunctionCall(CURRENT...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19606 Can one of the admins verify this patch?
[GitHub] spark pull request #19606: [SPARK-22333][SQL][Backport-2.2]timeFunctionCall(...
GitHub user DonnyZone opened a pull request: https://github.com/apache/spark/pull/19606

[SPARK-22333][SQL][Backport-2.2] timeFunctionCall(CURRENT_DATE, CURRENT_TIMESTAMP) has conflicts with columnReference

## What changes were proposed in this pull request?

This is a backport PR of https://github.com/apache/spark/pull/19559 for branch-2.2.

## How was this patch tested?

Unit tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/DonnyZone/spark branch-2.2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19606.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19606

commit 2bcc2ea6fd0ca9f12959246bb9ee6796cb7a90a0
Author: donnyzone
Date: 2017-10-30T03:08:36Z

    2.2-backport
[GitHub] spark issue #18833: [SPARK-21625][DOC] Add incompatible Hive UDF describe to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18833 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83203/ Test PASSed.
[GitHub] spark issue #18833: [SPARK-21625][DOC] Add incompatible Hive UDF describe to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18833 Merged build finished. Test PASSed.
[GitHub] spark issue #18833: [SPARK-21625][DOC] Add incompatible Hive UDF describe to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18833

**[Test build #83203 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83203/testReport)** for PR 18833 at commit [`cbbfa5e`](https://github.com/apache/spark/commit/cbbfa5edf8d9edf1d25fb1c456725cac73418602).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class Sqrt(child: Expression) extends UnaryMathExpression(math.sqrt, "SQRT")`
[GitHub] spark issue #19595: [SPARK-22379][PYTHON] Reduce duplication setUpClass and ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19595 Thank you @ueshin.
[GitHub] spark issue #18833: [SPARK-21625][SQL] sqrt(negative number) should be null.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18833 **[Test build #83203 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83203/testReport)** for PR 18833 at commit [`cbbfa5e`](https://github.com/apache/spark/commit/cbbfa5edf8d9edf1d25fb1c456725cac73418602).
[GitHub] spark pull request #19595: [SPARK-22379][PYTHON] Reduce duplication setUpCla...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19595
[GitHub] spark issue #19595: [SPARK-22379][PYTHON] Reduce duplication setUpClass and ...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19595 Thanks! merging to master.
[GitHub] spark issue #19595: [SPARK-22379][PYTHON] Reduce duplication setUpClass and ...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19595 LGTM.
[GitHub] spark issue #19604: [SPARK-22291][SQL][FOLLOWUP] Conversion error when trans...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19604 Ah, let's remove `[FOLLOWUP]` or replace it with something like `[BRANCH-2.2]` in the PR title.
[GitHub] spark issue #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, CURRENT...
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19559 Sure, I will submit it later.
[GitHub] spark issue #19596: [SPARK-22369][PYTHON][DOCS] Exposes catalog API document...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19596 Thanks, @viirya. I wasn't sure if I should add it to the list. My intention was that this one is like `DataFrameReader` and `DataFrameWriter` (supposed to be used via `spark.read`), so I wanted to hide the package path `pyspark.sql.Catalog` in the doc, and I just decided on the smallest change I could think of for this issue. I am fine with adding it too; it's easy to add if anyone feels strongly about this. Please let me know.
[GitHub] spark issue #19604: [SPARK-22291][SQL][FOLLOWUP] Conversion error when trans...
Github user jmchung commented on the issue: https://github.com/apache/spark/pull/19604 I found that in this branch the Docker-based integration tests will fail because the image `wnameless/oracle-xe-11g:14.04.4` cannot be pulled. Should we move to `wnameless/oracle-xe-11g`?

```
Error response from daemon: manifest for wnameless/oracle-xe-11g:14.04.4 not found
```
[GitHub] spark pull request #19589: [SPARKR][SPARK-22344] Set java.io.tmpdir for Spar...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19589
[GitHub] spark issue #19589: [SPARKR][SPARK-22344] Set java.io.tmpdir for SparkR test...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/19589 Merging to master, branch-2.2 and branch-2.1
[GitHub] spark issue #19605: [SPARK-22394] [SQL] Remove redundant synchronization for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19605 **[Test build #83202 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83202/testReport)** for PR 19605 at commit [`072b27d`](https://github.com/apache/spark/commit/072b27d083f2c2ed8d8bdd20caa5b0fe0ba267f6).
[GitHub] spark pull request #19605: [SPARK-22394] [SQL] Remove redundant synchronizat...
GitHub user wzhfy opened a pull request: https://github.com/apache/spark/pull/19605

[SPARK-22394] [SQL] Remove redundant synchronization for metastore access

## What changes were proposed in this pull request?

Before Spark 2.x, synchronization for metastore access was protected at [line 229 in ClientWrapper](https://github.com/apache/spark/blob/branch-1.6/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/ClientWrapper.scala#L229) (now it's at [line 203 in HiveClientImpl](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L203)). After Spark 2.x, `HiveExternalCatalog` was introduced by [SPARK-13080](https://github.com/apache/spark/pull/11293), where an extra level of synchronization was added at [line 95](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L95).

That is, we now have two levels of synchronization: one in `HiveExternalCatalog` and the other in `IsolatedClientLoader` used by `HiveClientImpl`. But since both `HiveExternalCatalog` and `IsolatedClientLoader` are shared among all Spark sessions, the extra level of synchronization in `HiveExternalCatalog` is redundant and can be removed.

## How was this patch tested?

Manual test and existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/wzhfy/spark redundant_sync

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19605.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19605

commit 072b27d083f2c2ed8d8bdd20caa5b0fe0ba267f6
Author: Zhenhua Wang
Date: 2017-10-30T01:47:12Z

    remove redundant sync
[GitHub] spark issue #19604: [SPARK-22291][SQL][FOLLOWUP] Conversion error when trans...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19604 **[Test build #83201 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83201/testReport)** for PR 19604 at commit [`995e38e`](https://github.com/apache/spark/commit/995e38e118126d95b2fe5ee8416e5f36786a7b5b).
[GitHub] spark pull request #19604: [SPARK-22291][SQL][FOLLOWUP] Conversion error whe...
Github user jmchung commented on a diff in the pull request: https://github.com/apache/spark/pull/19604#discussion_r147605119

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala ---
@@ -440,8 +440,9 @@ object JdbcUtils extends Logging {

     case StringType =>
       (array: Object) =>
-        array.asInstanceOf[Array[java.lang.String]]
-          .map(UTF8String.fromString)
+        // some underling types are not String such as uuid, inet, cidr, etc.
--- End diff --

Oops, a typo occurred, thanks @viirya !!
[GitHub] spark pull request #19604: [SPARK-22291][SQL][FOLLOWUP] Conversion error whe...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19604#discussion_r147604947

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala ---
@@ -440,8 +440,9 @@ object JdbcUtils extends Logging {

     case StringType =>
       (array: Object) =>
-        array.asInstanceOf[Array[java.lang.String]]
-          .map(UTF8String.fromString)
+        // some underling types are not String such as uuid, inet, cidr, etc.
--- End diff --

underlying?
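The issue the code comment describes can be mimicked in Python (a rough analog, not the actual Spark code): when array elements arrive from the driver as type-specific objects (uuid, inet, cidr) rather than strings, an unconditional cast to string fails, whereas converting each element to a string generically handles any element type while preserving NULLs.

```python
import ipaddress
import uuid

# Elements as a JDBC driver might materialize them: not all are str
values = [
    uuid.UUID("12345678-1234-5678-1234-567812345678"),
    ipaddress.ip_address("192.168.0.1"),
    "already-a-string",
    None,
]

# Generic to-string conversion: handle whatever the element is, keep NULLs
converted = [str(v) if v is not None else None for v in values]
print(converted)
```

A blind `[v for v in values if isinstance(v, str)]`-style assumption would drop or break the non-string elements, which mirrors the `asInstanceOf[Array[java.lang.String]]` failure the PR fixes.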
[GitHub] spark issue #19520: [SPARK-22298][WEB-UI] url encode APP id before generatin...
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/19520 I would like to ask: under what circumstances will the application ID contain a forward slash?
[GitHub] spark issue #19604: [SPARK-22291][SQL][FOLLOWUP] Conversion error when trans...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19604 **[Test build #83200 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83200/testReport)** for PR 19604 at commit [`549cb81`](https://github.com/apache/spark/commit/549cb814e01c2338a67c4a9efa4d880a3fb9cdac).
[GitHub] spark issue #19604: [SPARK-22291][SQL][FOLLOWUP] Conversion error when trans...
Github user jmchung commented on the issue: https://github.com/apache/spark/pull/19604 cc @cloud-fan, the follow-up PR for 2.2, thanks!
[GitHub] spark pull request #19604: [SPARK-22291][SQL][FOLLOWUP] Conversion error whe...
GitHub user jmchung opened a pull request: https://github.com/apache/spark/pull/19604

[SPARK-22291][SQL][FOLLOWUP] Conversion error when transforming array types of uuid, inet and cidr to StringType in PostgreSQL

## What changes were proposed in this pull request?

This is a follow-up of #19567, to fix the conversion error when transforming array types of uuid, inet and cidr to StringType in PostgreSQL for Spark 2.2.

## How was this patch tested?

Added test in `PostgresIntegrationSuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jmchung/spark SPARK-22291-FOLLOWUP

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19604.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19604

commit 549cb814e01c2338a67c4a9efa4d880a3fb9cdac
Author: Jen-Ming Chung
Date: 2017-10-30T01:25:28Z

    [SPARK-22291][SQL][FOLLOWUP] Conversion error when transforming array types of uuid, inet and cidr to StringType in PostgreSQL
[GitHub] spark issue #19596: [SPARK-22369][PYTHON][DOCS] Exposes catalog API document...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19596 I've generated the Python docs. Looks good.
[GitHub] spark issue #19507: [WEB-UI] Add count in fair scheduler pool page
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/19507 @srowen Help review the code.
[GitHub] spark issue #19532: [CORE]Modify the duration real-time calculation and upda...
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/19532 @jiangxb1987 @srowen Help review the code.
[GitHub] spark issue #19596: [SPARK-22369][PYTHON][DOCS] Exposes catalog API document...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19596 Don't we want to add it to the list of `Important classes of Spark SQL and DataFrames`?
[GitHub] spark issue #19595: [SPARK-22379][PYTHON] Reduce duplication setUpClass and ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19595 cc @ueshin, could you take a look please when you have some time?
[GitHub] spark issue #19596: [SPARK-22369][PYTHON][DOCS] Exposes catalog API document...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19596 cc @holdenk and @viirya, mind taking a look please? I remember I had a few talks about Sphinx and `__all__` and I believe you guys are the right reviewers.
[GitHub] spark issue #18113: [SPARK-20890][SQL] Added min and max typed aggregation f...
Github user setjet commented on the issue: https://github.com/apache/spark/pull/18113 Hi, it has been a while but I can pick it back up when I have time next weekend or so if that's OK.
[GitHub] spark issue #19599: [SPARK-22381] [ML] Add StringParam that supports valid o...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19599 **[Test build #83199 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83199/testReport)** for PR 19599 at commit [`01e7d3d`](https://github.com/apache/spark/commit/01e7d3d5f9b0ae278ebce60635e5c2568d3d0cf3).
[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19553 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83198/ Test PASSed.
[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19553 Merged build finished. Test PASSed.
[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19553

**[Test build #83198 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83198/testReport)** for PR 19553 at commit [`235f6d6`](https://github.com/apache/spark/commit/235f6d67cf25f4016c8e8ffb77103770e855ec62).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19553#discussion_r147596592

--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
@@ -43,10 +43,17 @@ private[spark] object JavaUtils {
       override def size: Int = underlying.size

-      override def get(key: AnyRef): B = try {
-        underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
-      } catch {
-        case ex: ClassCastException => null.asInstanceOf[B]
+      // Delegate to implementation because AbstractMap implementation iterates over whole key set
+      override def containsKey(key: AnyRef): Boolean = {
+        underlying.contains(key.asInstanceOf[A])
--- End diff --

I thought it should throw an exception, however there is a test showing that it's fine...
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19603 Merged build finished. Test PASSed.
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19603 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83197/ Test PASSed.
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19603 **[Test build #83197 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83197/testReport)** for PR 19603 at commit [`d09d9bd`](https://github.com/apache/spark/commit/d09d9bd10331ebd8992e1d7930236162c53ee37e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19553#discussion_r147593655

--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
@@ -43,10 +43,17 @@ private[spark] object JavaUtils {
       override def size: Int = underlying.size

-      override def get(key: AnyRef): B = try {
-        underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
-      } catch {
-        case ex: ClassCastException => null.asInstanceOf[B]
+      // Delegate to implementation because AbstractMap implementation iterates over whole key set
+      override def containsKey(key: AnyRef): Boolean = {
+        underlying.contains(key.asInstanceOf[A])
--- End diff --

Really, this should return `false` if the key isn't an `A`. This will throw an exception now. It should be prefixed with `key.isInstanceOf[A] && ...`
[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19553#discussion_r147593724

--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
@@ -43,10 +43,17 @@ private[spark] object JavaUtils {
       override def size: Int = underlying.size

-      override def get(key: AnyRef): B = try {
-        underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
-      } catch {
-        case ex: ClassCastException => null.asInstanceOf[B]
+      // Delegate to implementation because AbstractMap implementation iterates over whole key set
+      override def containsKey(key: AnyRef): Boolean = {
+        underlying.contains(key.asInstanceOf[A])
+      }
+
+      override def get(key: AnyRef): B = {
+        val value = underlying.get(key.asInstanceOf[A])
+        if (value.isDefined && value.get.isInstanceOf[B]) {
--- End diff --

`underlying` values are already known to be `B`, so this isn't necessary. But a condition on the key is:

```
if (key.isInstanceOf[A]) {
  underlying.getOrElse(key.asInstanceOf[A], null)
} else {
  null
}
```

Might need an extra cast in there.
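The point under discussion is that `java.util.AbstractMap` implements `containsKey` by scanning the whole entry set, so a serializable wrapper should delegate to the backing map's own constant-time lookup. A minimal Java sketch of that delegation (the `Wrapper` class here is a hypothetical stand-in, not the actual Spark patch); it also shows the behavior the test in this thread observed, namely that a key of the wrong type simply misses rather than throwing:

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hedged sketch: a wrapper over a backing map. Without the containsKey
// override, AbstractMap's default walks entrySet() linearly on every lookup.
public class DelegatingMapDemo {
    public static class Wrapper<K, V> extends AbstractMap<K, V> {
        private final Map<K, V> underlying;

        public Wrapper(Map<K, V> underlying) { this.underlying = underlying; }

        @Override
        public Set<Entry<K, V>> entrySet() { return underlying.entrySet(); }

        // Delegate so the lookup uses the backing map's hash probe (O(1))
        // instead of AbstractMap's O(n) entry-set scan.
        @Override
        public boolean containsKey(Object key) { return underlying.containsKey(key); }
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        Wrapper<String, Integer> w = new Wrapper<>(m);
        System.out.println(w.containsKey("a")); // true
        // A key of the wrong runtime type just misses: the hash probe finds
        // no equal key, so no ClassCastException is thrown.
        System.out.println(w.containsKey(42));  // false
    }
}
```

This mirrors why the test mentioned above passes: `HashMap.containsKey(Object)` accepts any object and returns `false` on a type mismatch, whereas the erasure-based cast in the Scala wrapper never actually executes a checked cast at that call site.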
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19603 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83196/ Test PASSed.
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19603 Merged build finished. Test PASSed.
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19603 **[Test build #83196 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83196/testReport)** for PR 19603 at commit [`83607a3`](https://github.com/apache/spark/commit/83607a3d727b9a600271ba61e0c2976fc3c125c1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...
Github user Whoosh commented on the issue: https://github.com/apache/spark/pull/19553 @cloud-fan I've checked all core tests and they were fine. Should I do something in addition?
[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19553 **[Test build #83198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83198/testReport)** for PR 19553 at commit [`235f6d6`](https://github.com/apache/spark/commit/235f6d67cf25f4016c8e8ffb77103770e855ec62).
[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19553 LGTM
[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19553 retest this please
[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...
Github user Whoosh commented on a diff in the pull request: https://github.com/apache/spark/pull/19553#discussion_r147591080

--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
@@ -43,10 +43,15 @@ private[spark] object JavaUtils {
       override def size: Int = underlying.size

-      override def get(key: AnyRef): B = try {
-        underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
-      } catch {
-        case ex: ClassCastException => null.asInstanceOf[B]
+      // Delegate to implementation because AbstractMap implementation iterates over whole key set
+      override def containsKey(key: AnyRef): Boolean = key match {
+        case key: A => underlying.contains(key)
--- End diff --

@srowen It can't be done that way: it will cause an "abstract type A is unchecked since it is eliminated by erasure" compile-time warning. As I understand it, no type check is needed before the get, because the key is cast to `Object` anyway due to erasure, so `get(key)` only has compile-time implications. Please correct me if I'm wrong; I've added a simple test for this.
[GitHub] spark issue #19571: [SPARK-15474][SQL] Write and read back non-emtpy schema ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19571 yes please
[GitHub] spark issue #19571: [SPARK-15474][SQL] Write and read back non-emtpy schema ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19571 I see. Then, can we continue on #17980?
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19603 **[Test build #83197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83197/testReport)** for PR 19603 at commit [`d09d9bd`](https://github.com/apache/spark/commit/d09d9bd10331ebd8992e1d7930236162c53ee37e).
[GitHub] spark pull request #19593: [WIP][SPARK-22374][SQL][2.2] closeAllForUGI is re...
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/19593
[GitHub] spark issue #19593: [WIP][SPARK-22374][SQL][2.2] closeAllForUGI is required ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19593 Thank you for the review, @vanzin. Sorry, I'll close this PR.
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19601 After thinking about the choice for a while, I've concluded that it is better to add a new `WritableColumnVector` (i.e. `UnsafeColumnVector`) and to keep the current `ColumnVector.Array`. I think that adding a new class will give us some flexibility and a good abstraction between the public class `ColumnVector` and the other internal classes.
[GitHub] spark pull request #19603: [SPARK-22385][SQL] MapObjects should not access l...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19603#discussion_r147589361

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -591,18 +591,40 @@ case class MapObjects private(
       case _ => inputData.dataType
     }

-    val (getLength, getLoopVar) = inputDataType match {
+    val (getLength, prepareLoop, getLoopVar) = inputDataType match {
       case ObjectType(cls) if classOf[Seq[_]].isAssignableFrom(cls) =>
-        s"${genInputData.value}.size()" -> s"${genInputData.value}.apply($loopIndex)"
+        val it = ctx.freshName("it")
+        (
+          s"${genInputData.value}.size()",
--- End diff --

I see, got it.
[GitHub] spark pull request #19603: [SPARK-22385][SQL] MapObjects should not access l...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19603#discussion_r147589355

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -591,6 +591,9 @@ case class MapObjects private(
       case _ => inputData.dataType
     }

+    // `MapObjects` generates a while loop to traverse the elements of the input collection. We
+    // need to take care of Seq and List because they may have O(n) complexity for indexed accessing
+    // like `list.get(1)`. Here we use Iterator to travers Seq and List.
--- End diff --

nit: travers -> traverse
[GitHub] spark issue #19529: [SPARK-22308] Support alternative unit testing styles in...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19529 I just reverted this PR. @nkronenfeld Could you submit another PR and update the title to `[SPARK-22308][test-maven] Support alternative unit testing styles in external applications`? Thanks!
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19603 **[Test build #83196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83196/testReport)** for PR 19603 at commit [`83607a3`](https://github.com/apache/spark/commit/83607a3d727b9a600271ba61e0c2976fc3c125c1).
[GitHub] spark pull request #19603: [SPARK-22385][SQL] MapObjects should not access l...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19603#discussion_r147587911

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -591,18 +591,40 @@ case class MapObjects private(
       case _ => inputData.dataType
     }

-    val (getLength, getLoopVar) = inputDataType match {
+    val (getLength, prepareLoop, getLoopVar) = inputDataType match {
       case ObjectType(cls) if classOf[Seq[_]].isAssignableFrom(cls) =>
-        s"${genInputData.value}.size()" -> s"${genInputData.value}.apply($loopIndex)"
+        val it = ctx.freshName("it")
+        (
+          s"${genInputData.value}.size()",
--- End diff --

Otherwise we need a re-sizable array to keep the result, which is a lot of change and doesn't have a clear win (re-sizing is expensive).
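The motivation for the `prepareLoop` step discussed above is that indexed access on a linked list walks from the head on every call, so a `get(i)` loop is quadratic overall while a single iterator pass is linear. A hedged Java sketch of the two loop shapes (the `TraverseDemo` class is a hypothetical illustration, not Spark's actual generated code):

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Sketch of the loop shapes: indexed access vs. a single iterator pass.
public class TraverseDemo {
    // The old shape: indexed access, O(i) per get(i) on a linked list,
    // hence O(n^2) for the whole loop.
    static int sumByIndex(List<Integer> list) {
        int sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i); // walks from the head each time on LinkedList
        }
        return sum;
    }

    // The new shape: obtain the iterator once (this corresponds to the
    // "prepareLoop" code), then step through it in O(n) total.
    static int sumByIterator(List<Integer> list) {
        int sum = 0;
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            sum += it.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(sumByIndex(list));    // 10
        System.out.println(sumByIterator(list)); // 10
    }
}
```

Because the collection's length is still known up front via `size()`, the generated code can preallocate the result array and avoid the re-sizable array mentioned above.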
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19567 @jmchung can you send a new PR for 2.2? thanks!
[GitHub] spark pull request #19567: [SPARK-22291][SQL] Conversion error when transfor...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19567
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19567 The last commit just changes the test name and shouldn't break the PySpark tests. I'm merging to master, thanks!
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r147587396

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -17,47 +17,168 @@
 package org.apache.spark.unsafe.memory;

-import javax.annotation.Nullable;
-
 import org.apache.spark.unsafe.Platform;
+import javax.annotation.Nullable;
--- End diff --

thanks, fixed
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r147587392

--- Diff: common/unsafe/src/main/java/org/apache/spark/sql/catalyst/expressions/HiveHasher.java ---
@@ -38,6 +39,10 @@
   public static int hashLong(long input) {
     return (int) ((input >>> 32) ^ input);
   }

+  public static int hashUnsafeBytesBlock(MemoryBlock base, long offset, int lengthInBytes) {
--- End diff --

This is based on [this discussion](https://github.com/apache/spark/pull/19222#discussion_r138744794). Currently, where I can see a large performance improvement, I do not call the non-MemoryBlock version.
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19567 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83195/ Test FAILed.
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19567 Merged build finished. Test FAILed.
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19567 **[Test build #83195 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83195/testReport)** for PR 19567 at commit [`fae5c45`](https://github.com/apache/spark/commit/fae5c455b4a754128bc9112bbead4aef3cc322a2).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Merged build finished. Test PASSed.
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83192/ Test PASSed.
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #83192 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83192/testReport)** for PR 19222 at commit [`62faf43`](https://github.com/apache/spark/commit/62faf43167f58f102b1d7d7a49cd0f39802898a4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #11403: [SPARK-13523] [SQL] Reuse exchanges in a query
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/11403 @davies Hi, what do you mean by "Since all the planner only work with tree, so this rule should be the last one for the entire planning."? Thanks if you have time.
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17899 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83194/ Test PASSed.
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17899 Merged build finished. Test PASSed.
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17899 **[Test build #83194 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83194/testReport)** for PR 17899 at commit [`e9f6928`](https://github.com/apache/spark/commit/e9f6928bb60e7c4de25324e5572f105a30d16cd5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19600: Added more information to Imputer
Github user tengpeng commented on the issue: https://github.com/apache/spark/pull/19600 I will follow the guideline strictly next time. Thanks.
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19567 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83190/ Test PASSed.
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19567 Merged build finished. Test PASSed.
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19567 **[Test build #83190 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83190/testReport)** for PR 19567 at commit [`588902d`](https://github.com/apache/spark/commit/588902d21fb12bf80169edc74097d7bda950668c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602 @gatorsmile Thanks again for reviewing this PR.