date:20181203

[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23213
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99646/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23213
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99645/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23088
  
**[Test build #99645 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99645/testReport)**
 for PR 23088 at commit 
[`f8cfb54`](https://github.com/apache/spark/commit/f8cfb544a805ecb5d1056f703dde4e7705ad1810).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23213
  
**[Test build #99646 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99646/testReport)**
 for PR 23213 at commit 
[`2ced0ca`](https://github.com/apache/spark/commit/2ced0cab0c16ee7a2400035a5a7794033eae3ed9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23214
  
It's easy to track `numKeyLookups` at `HashedRelation`, but it's hard to 
track `numProbes`. One idea is, we pass a `MutableInt` to 
`LongToUnsafeRowMap.getValue` as a parameter, and in the method we set the 
actual `numProbes` of this look up to the `MutableInt` parameter.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23214
  
I might know the root cause: `LongToUnsafeRowMap` is acutally accessed by 
multiple threads.

For broadcast hash join, we will copy the broadcasted hash relation to 
avoid multi-thread problem, via `HashedRelation.asReadOnlyCopy`. However, this 
is a shallow copy, the `LongToUnsafeRowMap` is not copied and likely shared by 
multiple `HashedRelation`s.

The metrics is per-task, so I think a better fix is to track the hash probe 
metrics per `HashedRelation`, instead of per `LongToUnsafeRowMap`. It's too 
costly to copy the `LongToUnsafeRowMap`, we should think about how to do it 
efficiently. cc @hvanhovell 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23088
  
**[Test build #99653 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99653/testReport)**
 for PR 23088 at commit 
[`41cfe80`](https://github.com/apache/spark/commit/41cfe8084a73e13336ab753a46fdc4c950583478).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22899: [SPARK-25573] Combine resolveExpression and resolve in t...

2018-12-03 Thread dilipbiswal

Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/22899
  
@gatorsmile Thanks a lot. I completely agree that we should try and combine 
these two. I will continue to think about it :-)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5709/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23194
  
**[Test build #99652 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99652/testReport)**
 for PR 23194 at commit 
[`ac2b004`](https://github.com/apache/spark/commit/ac2b0048f13acbce8f9b82f2b1c5f5cd268c63d4).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23194
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5708/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23194
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23194
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99649/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23194
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23207: [SPARK-26193][SQL] Implement shuffle write metrics in SQ...

2018-12-03 Thread xuanyuanking

Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/23207
  
cc @cloud-fan @gatorsmile @rxin 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23194
  
**[Test build #99649 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99649/testReport)**
 for PR 23194 at commit 
[`877751b`](https://github.com/apache/spark/commit/877751bdab5310dae4c5d8a1f2b7c3bbd761b718).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #23214: [SPARK-26155] Optimizing the performance of LongT...

2018-12-03 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23214#discussion_r238549645
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala
 ---
@@ -398,8 +399,8 @@ private[execution] final class LongToUnsafeRowMap(val 
mm: TaskMemoryManager, cap
   private var numKeys = 0L
 
   // Tracking average number of probes per key lookup.
-  private var numKeyLookups = 0L
-  private var numProbes = 0L
+  private var numKeyLookups = new LongAdder
+  private var numProbes = new LongAdder
--- End diff --

I'm surprised. I think `LongToUnsafeRowMap` is used in a single thread 
environment and multi-thread contend should not be an issue here. Do you have 
any insights about how this fixes the perf issue?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23214
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5707/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23214
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23169
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23169
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99648/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23169
  
**[Test build #99648 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99648/testReport)**
 for PR 23169 at commit 
[`f0f75c2`](https://github.com/apache/spark/commit/f0f75c25b95010d63ecdf83bb9f280687361d154).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23207: [SPARK-26193][SQL] Implement shuffle write metrics in SQ...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23207
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23207: [SPARK-26193][SQL] Implement shuffle write metrics in SQ...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23207
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99643/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23214
  
**[Test build #99651 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99651/testReport)**
 for PR 23214 at commit 
[`a267e6b`](https://github.com/apache/spark/commit/a267e6bbf874038573c598e4c411274c8b459701).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23214
  
ok to test


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23207: [SPARK-26193][SQL] Implement shuffle write metrics in SQ...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23207
  
**[Test build #99643 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99643/testReport)**
 for PR 23207 at commit 
[`7c8e516`](https://github.com/apache/spark/commit/7c8e5161904f1fd0fa4d99e6c497ef1be3542bdb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22899: [SPARK-25573] Combine resolveExpression and resol...

2018-12-03 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22899


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22899: [SPARK-25573] Combine resolveExpression and resolve in t...

2018-12-03 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22899
  
LGTM

Merged to master. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22899: [SPARK-25573] Combine resolveExpression and resolve in t...

2018-12-03 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22899
  
To be honest, we might still need to revisit it since it is still very 
confusing to the developer which one they should use, top-down? or bottom-up? 
The current use case for top-down is majorly for resolving the higher order 
functions. 

This PR at least improves the description. We might need to combine them in 
the future. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

2018-12-03 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/23171

Basically logically there are only two expressions: In which handles 
arbitrary expressions, and InSet which handles expressions with literals. Both 
could work: (1) we provide two separate expressions for InSet, one using 
switch, and one using hashset, or (2) we just provide one InSet and internally 
in InSet have two implementations ...Â 

The downside with creating different expressions for the same logical 
expression is that potentially the downstream optimization rules would need to 
match more.

On Mon, Dec 03, 2018 at 10:52 PM, DB Tsai < notificati...@github.com > 
wrote:

> 
> 
> 
> @ rxin ( https://github.com/rxin ) switch in Java is still significantly
> faster than hash set even without boxing / unboxing problems when the
> number of elements are small. We were thinking about to have two
> implementations in InSet , and pick up switch if the number of elements 
are
> small, or otherwise pick up hash set one. But this is the same complexity
> as having two implements in In as this PR.
> 
> 
> 
> @ cloud-fan ( https://github.com/cloud-fan ) do you suggest to create an 
OptimizeIn
> which has switch and hash set implementations based on the length of the
> elements and remove InSet ? Basically, what we were thinking above.
> 
> 
> 
> â
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub (
> https://github.com/apache/spark/pull/23171#issuecomment-443991336 ) , or 
mute
> the thread (
> 
https://github.com/notifications/unsubscribe-auth/AATvPKtGyx5jWxgtO1y5WsiXYDAQqRQ4ks5u1hvJgaJpZM4Y4P4J
> ).
> 
> 
>

---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23194
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99644/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23194
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

2018-12-03 Thread dbtsai

Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/23171
  
@rxin `switch` in Java is still significantly faster than hash set even 
without boxing / unboxing problems when the number of elements are small. We 
were thinking about to have two implementations in `InSet`, and pick up 
`switch` if the number of elements are small, or otherwise pick up hash set 
one. But this is the same complexity as having two implements in `In` as this 
PR. 

@cloud-fan do you suggest to create an `OptimizeIn` which has `switch` and 
hash set implementations based on the length of the elements and remove 
`InSet`? Basically, what we were thinking above.




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23194
  
**[Test build #99644 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99644/testReport)**
 for PR 23194 at commit 
[`867ee5e`](https://github.com/apache/spark/commit/867ee5e5ee73d00ea492655aabaa407089ea0f91).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread LuciferYang

Github user LuciferYang commented on the issue:

https://github.com/apache/spark/pull/23214
  
@adrian-wang ok~ I will add some comments to explain the reason


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread LuciferYang

Github user LuciferYang commented on the issue:

https://github.com/apache/spark/pull/23214
  
@JkSelf thx~ 



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread JkSelf

Github user JkSelf commented on the issue:

https://github.com/apache/spark/pull/23214
  
@LuciferYang  the patch is fine in my test environment. 
@adrian-wang  I will run all the tpcds queries in spark2.3 and spark2.3 
with this patch later.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22468: [SPARK-25374][SQL] SafeProjection supports fallback to a...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22468
  
**[Test build #99650 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99650/testReport)**
 for PR 22468 at commit 
[`fbfbbff`](https://github.com/apache/spark/commit/fbfbbff55d900ae1101ceb4f7823a9298464cb07).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...

2018-12-03 Thread maropu

Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22468#discussion_r238543369
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala
 ---
@@ -535,4 +535,98 @@ class UnsafeRowConverterSuite extends SparkFunSuite 
with Matchers with PlanTestB
 assert(unsafeRow.getSizeInBytes ==
   8 + 8 * 2 + roundedSize(field1.getSizeInBytes) + 
roundedSize(field2.getSizeInBytes))
   }
+
+  testBothCodegenAndInterpreted("SPARK-25374 converts back into safe 
representation") {
+def convertBackToInternalRow(inputRow: InternalRow, fields: 
Array[DataType]): InternalRow = {
+  val unsafeProj = UnsafeProjection.create(fields)
+  val unsafeRow = unsafeProj(inputRow)
+  val safeProj = SafeProjection.create(fields)
+  safeProj(unsafeRow)
+}
+
+// Simple tests
+val inputRow = InternalRow.fromSeq(Seq(
+  false, 3.toByte, 15.toShort, -83, 129L, 1.0f, 8.0, 
UTF8String.fromString("test"),
+  Decimal(255), CalendarInterval.fromString("interval 1 day"), 
Array[Byte](1, 2)
+))
+val fields1 = Array(
+  BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType,
+  DoubleType, StringType, DecimalType.defaultConcreteType, 
CalendarIntervalType,
+  BinaryType)
+
+assert(convertBackToInternalRow(inputRow, fields1) === inputRow)
+
+// Array tests
+val arrayRow = InternalRow.fromSeq(Seq(
+  createArray(1, 2, 3),
+  createArray(
+createArray(Seq("a", "b", "c").map(UTF8String.fromString): _*),
+createArray(Seq("d").map(UTF8String.fromString): _*))
+))
+val fields2 = Array[DataType](
+  ArrayType(IntegerType),
+  ArrayType(ArrayType(StringType)))
+
+assert(convertBackToInternalRow(arrayRow, fields2) === arrayRow)
+
+// Struct tests
+val structRow = InternalRow.fromSeq(Seq(
+  InternalRow.fromSeq(Seq[Any](1, 4.0)),
+  InternalRow.fromSeq(Seq(
+UTF8String.fromString("test"),
+InternalRow.fromSeq(Seq(
+  1,
+  createArray(Seq("2", "3").map(UTF8String.fromString): _*)
+))
+  ))
+))
+val fields3 = Array[DataType](
+  StructType(
+StructField("c0", IntegerType) ::
+StructField("c1", DoubleType) ::
+Nil),
+  StructType(
+StructField("c2", StringType) ::
+StructField("c3", StructType(
+  StructField("c4", IntegerType) ::
+  StructField("c5", ArrayType(StringType)) ::
+  Nil)) ::
+Nil))
+
+assert(convertBackToInternalRow(structRow, fields3) === structRow)
+
+// Map tests
+val mapRow = InternalRow.fromSeq(Seq(
+  createMap(Seq("k1", "k2").map(UTF8String.fromString): _*)(1, 2),
+  createMap(
+createMap(3, 5)(Seq("v1", "v2").map(UTF8String.fromString): _*),
+createMap(7, 9)(Seq("v3", "v4").map(UTF8String.fromString): _*)
+  )(
+createMap(Seq("k3", "k4").map(UTF8String.fromString): 
_*)(3.toShort, 4.toShort),
+createMap(Seq("k5", "k6").map(UTF8String.fromString): 
_*)(5.toShort, 6.toShort)
+  )))
+val fields4 = Array[DataType](
+  MapType(StringType, IntegerType),
+  MapType(MapType(IntegerType, StringType), MapType(StringType, 
ShortType)))
+
+val mapResultRow = convertBackToInternalRow(mapRow, 
fields4).toSeq(fields4)
+val mapExpectedRow = mapRow.toSeq(fields4)
+// Since `ArrayBasedMapData` does not override `equals` and `hashCode`,
--- End diff --

fixed code to use `ExpressionEvalHelper.checkResult`.

I don't remember correctly though, we might have some historical reasons 
about that; `ArrayBasedMapData` has no `hashCode` and `equals`. Probably, 
somebody might know this... cc: @hvanhovell @viirya


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22468: [SPARK-25374][SQL] SafeProjection supports fallback to a...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22468
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5706/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22468: [SPARK-25374][SQL] SafeProjection supports fallback to a...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22468
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23204: Revert "[SPARK-21052][SQL] Add hash map metrics to join"

2018-12-03 Thread LuciferYang

Github user LuciferYang commented on the issue:

https://github.com/apache/spark/pull/23204
  
@cloud-fan @viirya #23214 maybe reslove this problem and we needn't revert 
this patch.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread adrian-wang

Github user adrian-wang commented on the issue:

https://github.com/apache/spark/pull/23214
  
maybe add some detailed test result in description and explain the reason 
for this in code comment?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread LuciferYang

Github user LuciferYang commented on the issue:

https://github.com/apache/spark/pull/23214
  
ping @viirya 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread LuciferYang

Github user LuciferYang commented on the issue:

https://github.com/apache/spark/pull/23214
  
cc @cloud-fan ,  help to review this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23214
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread LuciferYang

Github user LuciferYang commented on the issue:

https://github.com/apache/spark/pull/23214
  
cc @JkSelf  help to check this patch.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23214
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23214
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #23214: [SPARK-26155] Optimizing the performance of LongT...

2018-12-03 Thread LuciferYang

GitHub user LuciferYang opened a pull request:

https://github.com/apache/spark/pull/23214

[SPARK-26155] Optimizing the performance of LongToUnsafeRowMap

## What changes were proposed in this pull request?

To slove @JkSelf report problem at 
[SPARK-26155](https://issues.apache.org/jira/browse/SPARK-26155), use LongAdder 
instead of Long of `numKeyLookups` and `numProbes` to reduce add operation 
times. @JkSelf  test this patch in Intel performance testing environment and 
run TPCDS sqls after this patch with Spark-2.3 and master no longer slower than 
 Spark-2.1.

## How was this patch tested?
N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/LuciferYang/spark spark-26155

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23214.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23214


commit 34decea2f59d9fb3930f4d2ca1ad8a625d0409ea
Author: yangjie01 
Date:   2018-12-04T05:49:19Z

use LongAdder instead of Long




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22899: [SPARK-25573] Combine resolveExpression and resolve in t...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22899
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22899: [SPARK-25573] Combine resolveExpression and resolve in t...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22899
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99642/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22899: [SPARK-25573] Combine resolveExpression and resolve in t...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22899
  
**[Test build #99642 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99642/testReport)**
 for PR 22899 at commit 
[`45e314a`](https://github.com/apache/spark/commit/45e314aa33fb0c108f185519ff58db691c34d58c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23194
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23194
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5705/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23194
  
**[Test build #99649 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99649/testReport)**
 for PR 23194 at commit 
[`877751b`](https://github.com/apache/spark/commit/877751bdab5310dae4c5d8a1f2b7c3bbd761b718).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

2018-12-03 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23171
  
How about, we create an `OptimizedIn`, and convert `In` to `OptimizedIn` if 
the list is all literals? `OptimizedIn` will pick `switch` or hash set based on 
the length of the list.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23169
  
**[Test build #99648 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99648/testReport)**
 for PR 23169 at commit 
[`f0f75c2`](https://github.com/apache/spark/commit/f0f75c25b95010d63ecdf83bb9f280687361d154).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-03 Thread HeartSaVioR

Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/23169
  
@DaveDeCaprio 

You might miss to roll back change in test.

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99632/testReport/org.apache.spark.sql.catalyst.trees/TreeNodeSuite/treeString_limits_plan_length/

I also think you need to add a new test with setting configuration to some 
value and see whether it works properly.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...

2018-12-03 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22468#discussion_r238534101
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala
 ---
@@ -535,4 +535,98 @@ class UnsafeRowConverterSuite extends SparkFunSuite 
with Matchers with PlanTestB
 assert(unsafeRow.getSizeInBytes ==
   8 + 8 * 2 + roundedSize(field1.getSizeInBytes) + 
roundedSize(field2.getSizeInBytes))
   }
+
+  testBothCodegenAndInterpreted("SPARK-25374 converts back into safe 
representation") {
+def convertBackToInternalRow(inputRow: InternalRow, fields: 
Array[DataType]): InternalRow = {
+  val unsafeProj = UnsafeProjection.create(fields)
+  val unsafeRow = unsafeProj(inputRow)
+  val safeProj = SafeProjection.create(fields)
+  safeProj(unsafeRow)
+}
+
+// Simple tests
+val inputRow = InternalRow.fromSeq(Seq(
+  false, 3.toByte, 15.toShort, -83, 129L, 1.0f, 8.0, 
UTF8String.fromString("test"),
+  Decimal(255), CalendarInterval.fromString("interval 1 day"), 
Array[Byte](1, 2)
+))
+val fields1 = Array(
+  BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType,
+  DoubleType, StringType, DecimalType.defaultConcreteType, 
CalendarIntervalType,
+  BinaryType)
+
+assert(convertBackToInternalRow(inputRow, fields1) === inputRow)
+
+// Array tests
+val arrayRow = InternalRow.fromSeq(Seq(
+  createArray(1, 2, 3),
+  createArray(
+createArray(Seq("a", "b", "c").map(UTF8String.fromString): _*),
+createArray(Seq("d").map(UTF8String.fromString): _*))
+))
+val fields2 = Array[DataType](
+  ArrayType(IntegerType),
+  ArrayType(ArrayType(StringType)))
+
+assert(convertBackToInternalRow(arrayRow, fields2) === arrayRow)
+
+// Struct tests
+val structRow = InternalRow.fromSeq(Seq(
+  InternalRow.fromSeq(Seq[Any](1, 4.0)),
+  InternalRow.fromSeq(Seq(
+UTF8String.fromString("test"),
+InternalRow.fromSeq(Seq(
+  1,
+  createArray(Seq("2", "3").map(UTF8String.fromString): _*)
+))
+  ))
+))
+val fields3 = Array[DataType](
+  StructType(
+StructField("c0", IntegerType) ::
+StructField("c1", DoubleType) ::
+Nil),
+  StructType(
+StructField("c2", StringType) ::
+StructField("c3", StructType(
+  StructField("c4", IntegerType) ::
+  StructField("c5", ArrayType(StringType)) ::
+  Nil)) ::
+Nil))
+
+assert(convertBackToInternalRow(structRow, fields3) === structRow)
+
+// Map tests
+val mapRow = InternalRow.fromSeq(Seq(
+  createMap(Seq("k1", "k2").map(UTF8String.fromString): _*)(1, 2),
+  createMap(
+createMap(3, 5)(Seq("v1", "v2").map(UTF8String.fromString): _*),
+createMap(7, 9)(Seq("v3", "v4").map(UTF8String.fromString): _*)
+  )(
+createMap(Seq("k3", "k4").map(UTF8String.fromString): 
_*)(3.toShort, 4.toShort),
+createMap(Seq("k5", "k6").map(UTF8String.fromString): 
_*)(5.toShort, 6.toShort)
+  )))
+val fields4 = Array[DataType](
+  MapType(StringType, IntegerType),
+  MapType(MapType(IntegerType, StringType), MapType(StringType, 
ShortType)))
+
+val mapResultRow = convertBackToInternalRow(mapRow, 
fields4).toSeq(fields4)
+val mapExpectedRow = mapRow.toSeq(fields4)
+// Since `ArrayBasedMapData` does not override `equals` and `hashCode`,
--- End diff --

Or we can use `ExpressionEvalHelper.checkResult` here.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-03 Thread HeartSaVioR

Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/23169
  
retest this, please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...

2018-12-03 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22468#discussion_r238533700
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala
 ---
@@ -535,4 +535,98 @@ class UnsafeRowConverterSuite extends SparkFunSuite 
with Matchers with PlanTestB
 assert(unsafeRow.getSizeInBytes ==
   8 + 8 * 2 + roundedSize(field1.getSizeInBytes) + 
roundedSize(field2.getSizeInBytes))
   }
+
+  testBothCodegenAndInterpreted("SPARK-25374 converts back into safe 
representation") {
+def convertBackToInternalRow(inputRow: InternalRow, fields: 
Array[DataType]): InternalRow = {
+  val unsafeProj = UnsafeProjection.create(fields)
+  val unsafeRow = unsafeProj(inputRow)
+  val safeProj = SafeProjection.create(fields)
+  safeProj(unsafeRow)
+}
+
+// Simple tests
+val inputRow = InternalRow.fromSeq(Seq(
+  false, 3.toByte, 15.toShort, -83, 129L, 1.0f, 8.0, 
UTF8String.fromString("test"),
+  Decimal(255), CalendarInterval.fromString("interval 1 day"), 
Array[Byte](1, 2)
+))
+val fields1 = Array(
+  BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType,
+  DoubleType, StringType, DecimalType.defaultConcreteType, 
CalendarIntervalType,
+  BinaryType)
+
+assert(convertBackToInternalRow(inputRow, fields1) === inputRow)
+
+// Array tests
+val arrayRow = InternalRow.fromSeq(Seq(
+  createArray(1, 2, 3),
+  createArray(
+createArray(Seq("a", "b", "c").map(UTF8String.fromString): _*),
+createArray(Seq("d").map(UTF8String.fromString): _*))
+))
+val fields2 = Array[DataType](
+  ArrayType(IntegerType),
+  ArrayType(ArrayType(StringType)))
+
+assert(convertBackToInternalRow(arrayRow, fields2) === arrayRow)
+
+// Struct tests
+val structRow = InternalRow.fromSeq(Seq(
+  InternalRow.fromSeq(Seq[Any](1, 4.0)),
+  InternalRow.fromSeq(Seq(
+UTF8String.fromString("test"),
+InternalRow.fromSeq(Seq(
+  1,
+  createArray(Seq("2", "3").map(UTF8String.fromString): _*)
+))
+  ))
+))
+val fields3 = Array[DataType](
+  StructType(
+StructField("c0", IntegerType) ::
+StructField("c1", DoubleType) ::
+Nil),
+  StructType(
+StructField("c2", StringType) ::
+StructField("c3", StructType(
+  StructField("c4", IntegerType) ::
+  StructField("c5", ArrayType(StringType)) ::
+  Nil)) ::
+Nil))
+
+assert(convertBackToInternalRow(structRow, fields3) === structRow)
+
+// Map tests
+val mapRow = InternalRow.fromSeq(Seq(
+  createMap(Seq("k1", "k2").map(UTF8String.fromString): _*)(1, 2),
+  createMap(
+createMap(3, 5)(Seq("v1", "v2").map(UTF8String.fromString): _*),
+createMap(7, 9)(Seq("v3", "v4").map(UTF8String.fromString): _*)
+  )(
+createMap(Seq("k3", "k4").map(UTF8String.fromString): 
_*)(3.toShort, 4.toShort),
+createMap(Seq("k5", "k6").map(UTF8String.fromString): 
_*)(5.toShort, 6.toShort)
+  )))
+val fields4 = Array[DataType](
+  MapType(StringType, IntegerType),
+  MapType(MapType(IntegerType, StringType), MapType(StringType, 
ShortType)))
+
+val mapResultRow = convertBackToInternalRow(mapRow, 
fields4).toSeq(fields4)
+val mapExpectedRow = mapRow.toSeq(fields4)
+// Since `ArrayBasedMapData` does not override `equals` and `hashCode`,
--- End diff --

maybe we should implement `equals` and `hashCode` in `ArrayBasedMapData` 
and `UnsafeMapData`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...

2018-12-03 Thread maropu

Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22468#discussion_r238530264
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala
 ---
@@ -535,4 +535,100 @@ class UnsafeRowConverterSuite extends SparkFunSuite 
with Matchers with PlanTestB
 assert(unsafeRow.getSizeInBytes ==
   8 + 8 * 2 + roundedSize(field1.getSizeInBytes) + 
roundedSize(field2.getSizeInBytes))
   }
+
+  testBothCodegenAndInterpreted("SPARK-25374 converts back into safe 
representation") {
+def convertBackToInternalRow(inputRow: InternalRow, fields: 
Array[DataType]): InternalRow = {
+  val unsafeProj = UnsafeProjection.create(fields)
+  val unsafeRow = unsafeProj(inputRow)
+  val safeProj = SafeProjection.create(fields)
+  safeProj(unsafeRow)
+}
+
+// Simple tests
+val inputRow = InternalRow.fromSeq(Seq(
+  false, 3.toByte, 15.toShort, -83, 129L, 1.0f, 8.0, 
UTF8String.fromString("test"),
+  Decimal(255), CalendarInterval.fromString("interval 1 day"), 
Array[Byte](1, 2)
+))
+val fields1 = Array(
+  BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType,
+  DoubleType, StringType, DecimalType.defaultConcreteType, 
CalendarIntervalType,
+  BinaryType)
+
+assert(convertBackToInternalRow(inputRow, fields1) === inputRow)
+
+// Array tests
+val arrayRow = InternalRow.fromSeq(Seq(
+  createArray(1, 2, 3),
+  createArray(
+createArray(Seq("a", "b", "c").map(UTF8String.fromString): _*),
+createArray(Seq("d").map(UTF8String.fromString): _*))
+))
+val fields2 = Array[DataType](
+  ArrayType(IntegerType),
+  ArrayType(ArrayType(StringType)))
+
+assert(convertBackToInternalRow(arrayRow, fields2) === arrayRow)
+
+// Struct tests
+val structRow = InternalRow.fromSeq(Seq(
+  InternalRow.fromSeq(Seq[Any](1, 4.0)),
+  InternalRow.fromSeq(Seq(
+UTF8String.fromString("test"),
+InternalRow.fromSeq(Seq(
+  1,
+  createArray(Seq("2", "3").map(UTF8String.fromString): _*)
+))
+  ))
+))
+val fields3 = Array[DataType](
+  StructType(
+StructField("c0", IntegerType) ::
+StructField("c1", DoubleType) ::
+Nil),
+  StructType(
+StructField("c2", StringType) ::
+StructField("c3", StructType(
+  StructField("c4", IntegerType) ::
+  StructField("c5", ArrayType(StringType)) ::
+  Nil)) ::
+Nil))
+
+assert(convertBackToInternalRow(structRow, fields3) === structRow)
+
+// Map tests
+val mapRow = InternalRow.fromSeq(Seq(
+  createMap(Seq("k1", "k2").map(UTF8String.fromString): _*)(1, 2),
+  createMap(
+createMap(3, 5)(Seq("v1", "v2").map(UTF8String.fromString): _*),
+createMap(7, 9)(Seq("v3", "v4").map(UTF8String.fromString): _*)
+  )(
+createMap(Seq("k3", "k4").map(UTF8String.fromString): 
_*)(3.toShort, 4.toShort),
+createMap(Seq("k5", "k6").map(UTF8String.fromString): 
_*)(5.toShort, 6.toShort)
+  )))
+val fields4 = Array[DataType](
+  MapType(StringType, IntegerType),
+  MapType(MapType(IntegerType, StringType), MapType(StringType, 
ShortType)))
+
+// Since `ArrayBasedMapData` does not override `equals` and `hashCode`,
+// we need to take care of it to compare rows.
+def toComparable(d: Any): Any = d match {
--- End diff --

Since we cannot compare `ArrayBasedMapData`s directly (that is, 
`assert(mapResultRow === mapExpectedRow)` fails), I just converted them into 
the `Seq`s of keys/values by this method.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22468: [SPARK-25374][SQL] SafeProjection supports fallback to a...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22468
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22468: [SPARK-25374][SQL] SafeProjection supports fallback to a...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22468
  
**[Test build #99647 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99647/testReport)**
 for PR 22468 at commit 
[`7ef5f86`](https://github.com/apache/spark/commit/7ef5f866eb02f6638a5be00a602de6c6810ae2a3).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22468: [SPARK-25374][SQL] SafeProjection supports fallback to a...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22468
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5704/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

2018-12-03 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/23171
  
I thought InSwitch logically is the same as InSet, in which all the child 
expressions are literals?

On Mon, Dec 03, 2018 at 8:38 PM, Wenchen Fan < notificati...@github.com > 
wrote:

> 
> 
> 
> I think InSet is not an optimized version of In , but just a way to
> separate the implementation for different conditions (the length of the
> list). Maybe we should do the same thing here, create a InSwitch and
> convert In to it when meeting some conditions. One problem is, In and 
InSwitch
> is same in the interpreted version, maybe we should create a base class
> for them.
> 
> 
> 
> â
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub (
> https://github.com/apache/spark/pull/23171#issuecomment-443968486 ) , or 
mute
> the thread (
> 
https://github.com/notifications/unsubscribe-auth/AATvPDTQic0Ii5UD40m_Uj5kMVy4pNExks5u1fxPgaJpZM4Y4P4J
> ).
> 
> 
>


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23213
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23213
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5703/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5702/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

2018-12-03 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23171
  
I think `InSet` is not an optimized version of `In`, but just a way to 
separate the implementation for different conditions (the length of the list). 
Maybe we should do the same thing here, create a `InSwitch` and convert `In` to 
it when meeting some conditions. One problem is, `In` and `InSwitch` is same in 
the interpreted version, maybe we should create a base class for them.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23213
  
**[Test build #99646 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99646/testReport)**
 for PR 23213 at commit 
[`2ced0ca`](https://github.com/apache/spark/commit/2ced0cab0c16ee7a2400035a5a7794033eae3ed9).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHO...

2018-12-03 Thread maropu

GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/23213

[SPARK-26262][SQL] Run SQLQueryTestSuite with 
WHOLESTAGE_CODEGEN_ENABLED=false

## What changes were proposed in this pull request?
For better test coverage, this pr set `false` at 
`WHOLESTAGE_CODEGEN_ENABLED` for interpreter execution tests when running 
`SQLQueryTestSuite`. This pr moved the existing tests into `SQLQuerySuite` 
because explain output results are different between codegen-only and 
interpreter modes.

## How was this patch tested?
Existing tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark InterpreterModeTest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23213.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23213


commit 2ced0cab0c16ee7a2400035a5a7794033eae3ed9
Author: Takeshi Yamamuro 
Date:   2018-12-04T04:17:48Z

Fix




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23088
  
**[Test build #99645 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99645/testReport)**
 for PR 23088 at commit 
[`f8cfb54`](https://github.com/apache/spark/commit/f8cfb544a805ecb5d1056f703dde4e7705ad1810).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-03 Thread shahidki31

Github user shahidki31 commented on the issue:

https://github.com/apache/spark/pull/23088
  
Retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread maropu

Github user maropu commented on the issue:

https://github.com/apache/spark/pull/23194
  
LGTM except for minor comments


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-03 Thread maropu

Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/23194#discussion_r238526892
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -377,41 +377,41 @@ abstract class DDLSuite extends QueryTest with 
SQLTestUtils {
 }
   }
 
-  test("CTAS a managed table with the existing empty directory") {
-val tableLoc = new 
File(spark.sessionState.catalog.defaultTablePath(TableIdentifier("tab1")))
+  protected def withEmptyDirInTablePath(dirName: String)(f: File => Unit): 
Unit = {
--- End diff --

private?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22514
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99637/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22514
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22514
  
**[Test build #99637 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99637/testReport)**
 for PR 22514 at commit 
[`e04812d`](https://github.com/apache/spark/commit/e04812d6aff0270ab4c0e6c3b2e204a682739e23).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

2018-12-03 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/23171
  
That probably means we should just optimize InSet to have the switch 
version though? Rather than do it in In?

On Mon, Dec 03, 2018 at 8:20 PM, Wenchen Fan < notificati...@github.com > 
wrote:

> 
> 
> 
> @ rxin ( https://github.com/rxin ) I proposed the same thing before, but
> one problem is that, we only convert In to InSet when the length of list
> reaches the threshold. If the switch way is faster than hash set when the
> list is small, it seems still worth to optimize In using switch.
> 
> 
> 
> â
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub (
> https://github.com/apache/spark/pull/23171#issuecomment-443965616 ) , or 
mute
> the thread (
> 
https://github.com/notifications/unsubscribe-auth/AATvPEkrUFJuT4FI167cCI9b0nfv16V4ks5u1fgNgaJpZM4Y4P4J
> ).
> 
> 
>


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

2018-12-03 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23171
  
@rxin I proposed the same thing before, but one problem is that, we only 
convert `In` to `InSet` when the length of list reaches the threshold. If the 
`switch` way is faster than hash set when the list is small, it seems still 
worth to optimize `In` using `switch`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #23212: [SPARK-25498][SQL][FOLLOW-UP] Return an empty con...

2018-12-03 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23212


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23212: [SPARK-25498][SQL][FOLLOW-UP] Return an empty config set...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23212
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99640/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23212: [SPARK-25498][SQL][FOLLOW-UP] Return an empty config set...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23212
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23212: [SPARK-25498][SQL][FOLLOW-UP] Return an empty config set...

2018-12-03 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23212
  
thanks, merging to master!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23212: [SPARK-25498][SQL][FOLLOW-UP] Return an empty config set...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23212
  
**[Test build #99640 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99640/testReport)**
 for PR 23212 at commit 
[`ca09bb8`](https://github.com/apache/spark/commit/ca09bb82d49d29a5d6f088d2250e980bab64a8b7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #23208: [SPARK-25530][SQL] data source v2 API refactor (b...

2018-12-03 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23208#discussion_r238524973
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsBatchWrite.java 
---
@@ -25,14 +25,14 @@
 import org.apache.spark.sql.types.StructType;
 
 /**
- * A mix-in interface for {@link DataSourceV2}. Data sources can implement 
this interface to
+ * A mix-in interface for {@link Table}. Data sources can implement this 
interface to
  * provide data writing ability for batch processing.
  *
  * This interface is used to create {@link BatchWriteSupport} instances 
when end users run
--- End diff --

I don't have a better naming in mind, so I leave it as `WriteSupport` for 
now. Better naming is welcome to match `Scan`!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-12-03 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22957#discussion_r238524763
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
 ---
@@ -145,9 +145,14 @@ case class EnsureRequirements(conf: SQLConf) extends 
Rule[SparkPlan] {
 assert(requiredChildDistributions.length == children.length)
 assert(requiredChildOrderings.length == children.length)
 
+val aliasMap = 
AttributeMap[Expression](children.flatMap(_.expressions.collect {
+  case a: Alias => (a.toAttribute, a)
+}))
+
 // Ensure that the operator's children satisfy their output 
distribution requirements.
 children = children.zip(requiredChildDistributions).map {
-  case (child, distribution) if 
child.outputPartitioning.satisfies(distribution) =>
+  case (child, distribution) if child.outputPartitioning.satisfies(
+  distribution.mapExpressions(replaceAlias(_, aliasMap))) =>
--- End diff --

As an example, `ProjectExec.outputPartitioning` can be wrong, as it doesn't 
consider the aliases in the project list. I think it's clearer to adjust the 
`outputPartitioning` there, instead of dealing with it in a rule. What if we 
have more rules need to check `outputPartitioning` and 
`requiredChildDistribution`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20433
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20433
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99638/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #23108: [Spark-25993][SQL][TEST]Add test cases for resolu...

2018-12-03 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23108#discussion_r238524452
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala
 ---
@@ -186,6 +186,54 @@ abstract class OrcSuite extends OrcTest with 
BeforeAndAfterAll {
 }
   }
 
+  protected def testORCTableLocation(isConvertMetastore: Boolean): Unit = {
+val tableName1 = "spark_orc1"
+val tableName2 = "spark_orc2"
+
+withTempDir { dir =>
+  val someDF1 = Seq((1, 1, "orc1"), (2, 2, "orc2")).toDF("c1", "c2", 
"c3").repartition(1)
+  withTable(tableName1, tableName2) {
+val dataDir = s"${dir.getCanonicalPath}/dir1/"
+val parentDir = s"${dir.getCanonicalPath}/"
+val wildCardDir = new File(s"${dir}/*").toURI
+someDF1.write.orc(dataDir)
+val parentDirStatement =
+  s"""
+ |CREATE EXTERNAL TABLE $tableName1(
+ |  c1 int,
+ |  c2 int,
+ |  c3 string)
+ |STORED AS orc
+ |LOCATION '${parentDir}'""".stripMargin
+sql(parentDirStatement)
+val parentDirSqlStatement = s"select * from ${tableName1}"
+if (isConvertMetastore) {
+  checkAnswer(sql(parentDirSqlStatement), Nil)
+} else {
+ checkAnswer(sql(parentDirSqlStatement),
+   (1 to 2).map(i => Row(i, i, s"orc$i")))
+}
+
+val wildCardStatement =
+  s"""
+ |CREATE EXTERNAL TABLE $tableName2(
+ |  c1 int,
+ |  c2 int,
+ |  c3 string)
+ |STORED AS orc
+ |LOCATION '$wildCardDir'""".stripMargin
--- End diff --

Thanks, @kevinyu98 . Also, please update the PR title
```
- [Spark-25993][SQL][TEST]Add test cases for resolution of ORC table 
location 
+ [SPARK-25993][SQL][TEST] Add test cases for CREATE EXTERNAL TABLE with 
subdirectories
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...

2018-12-03 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20433
  
**[Test build #99638 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99638/testReport)**
 for PR 20433 at commit 
[`7cb8f87`](https://github.com/apache/spark/commit/7cb8f87ba83fb04e891e069b8339dcef6d835ed9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99635/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-03 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

1 2 3 4 5 6 7 >

1 - 100 of 626 matches

Mail list logo