[GitHub] spark pull request #18932: Add reduceVertices method to GraphOps

2017-08-13 Thread gilcu2
GitHub user gilcu2 opened a pull request:

https://github.com/apache/spark/pull/18932

Add reduceVertices method to GraphOps

## What changes were proposed in this pull request?

Add reduceVertices method to GraphOps class in graphX

## How was this patch tested?

Unitest run locally


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gilcu2/spark reduceVertices_graphx_operation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18932.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18932


commit 04247ea94bad429ffd2b0e0ecdce5e631ae1ae63
Author: gilcu2 
Date:   2017-08-13T13:57:08Z

Add reduceVertices method to GraphOps




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18930
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread DonnyZone
Github user DonnyZone commented on the issue:

https://github.com/apache/spark/pull/18920
  
Jenkins, retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18932: Add reduceVertices method to GraphOps

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18932
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18931
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80594/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18931
  
**[Test build #80594 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80594/testReport)**
 for PR 18931 at commit 
[`6d600d5`](https://github.com/apache/spark/commit/6d600d5eb4a275eb6bc72ccf353d2d1ded03635f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18931
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

2017-08-13 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/18648
  
LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18920
  
**[Test build #80591 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80591/testReport)**
 for PR 18920 at commit 
[`bb29b8f`](https://github.com/apache/spark/commit/bb29b8f7e8e8be436ea028acfbe16ae6f4977169).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18914
  
**[Test build #80588 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80588/testReport)**
 for PR 18914 at commit 
[`0de047d`](https://github.com/apache/spark/commit/0de047d87047d7eed89a1f0994b2e4c118a50d12).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18931
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18931
  
**[Test build #80590 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80590/testReport)**
 for PR 18931 at commit 
[`0bb8c0e`](https://github.com/apache/spark/commit/0bb8c0ec70243e75f5593ca83788e830e9e4bc25).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18931
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80590/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/18931
  
Ran with the same benchmark in SPARK-21603.

Before this patch:



After this patch:

Java HotSpot(TM) 64-Bit Server VM 1.8.0_102-b14 on Linux 4.9.36-moby
Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
max function length of wholestagecodegen: Best/Avg Time(ms)
Rate(M/s)   Per Row(ns)   Relative


codegen = F548 /  740  1.2  
   836.6   1.0X
codegen = T372 /  433  1.8  
   567.5   1.5X






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18931: [SPARK-21717][SQL][WIP] Decouple consume function...

2017-08-13 Thread viirya
GitHub user viirya reopened a pull request:

https://github.com/apache/spark/pull/18931

[SPARK-21717][SQL][WIP] Decouple consume functions of physical operators in 
whole-stage codegen

## What changes were proposed in this pull request?

It has been observed in SPARK-21603 that whole-stage codegen suffers 
performance degradation, if the generated functions are too long to be 
optimized by JIT.

We basically produce a single function to incorporate generated codes from 
all physical operators in whole-stage. Thus, it is possibly to grow the size of 
generated function over a threshold that we can't have JIT optimization for it 
anymore.

This patch is trying to decouple the logic of consuming rows in physical 
operators to avoid a giant function processing rows.

## How was this patch tested?

Existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 SPARK-21717

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18931.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18931


commit 05274e7ad4c74e6241b5a05a9365c475f0c3c0a3
Author: Liang-Chi Hsieh 
Date:   2017-08-13T06:06:10Z

Decouple consume functions of physical operators in whole-stage codegen.

commit e0e7a6ecc957b4659db9b0367ef32d09537b32fd
Author: Liang-Chi Hsieh 
Date:   2017-08-13T07:43:17Z

shouldStop is called outside consume().

commit 413707dd0c31a15514f00aea9addca77fe1dd2ce
Author: Liang-Chi Hsieh 
Date:   2017-08-13T10:52:28Z

Fix the condition and the case of using continue in consume.

commit 0bb8c0ec70243e75f5593ca83788e830e9e4bc25
Author: Liang-Chi Hsieh 
Date:   2017-08-13T10:57:45Z

More comment.

commit 6d600d5eb4a275eb6bc72ccf353d2d1ded03635f
Author: Liang-Chi Hsieh 
Date:   2017-08-13T14:17:01Z

Fix aggregation.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18931: [SPARK-21717][SQL][WIP] Decouple consume function...

2017-08-13 Thread viirya
Github user viirya closed the pull request at:

https://github.com/apache/spark/pull/18931


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18914
  
**[Test build #80592 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80592/testReport)**
 for PR 18914 at commit 
[`4212553`](https://github.com/apache/spark/commit/4212553236608f35fbab3fa1186b31f77cff92af).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/18914
  
@heary-cao I already understand what you want to do. It makes sense to me 
to remove `sql("UNCACHE TABLE testData")`.
 My question is about `JoinSuite.afterEach()`. Do you want to call 
`spark.sharedState.cacheManager.clearCache()` twice at `JoinSuite.afterEach()` 
and `SharedSQLContext.afterEach()`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18914
  
**[Test build #80588 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80588/testReport)**
 for PR 18914 at commit 
[`0de047d`](https://github.com/apache/spark/commit/0de047d87047d7eed89a1f0994b2e4c118a50d12).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18914
  
**[Test build #80593 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80593/testReport)**
 for PR 18914 at commit 
[`7dd8cdb`](https://github.com/apache/spark/commit/7dd8cdbf7caac507cbd703193350503309b5f159).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18931
  
**[Test build #80594 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80594/testReport)**
 for PR 18931 at commit 
[`6d600d5`](https://github.com/apache/spark/commit/6d600d5eb4a275eb6bc72ccf353d2d1ded03635f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18916: [SPARK-21705][CORE][DOC]Add spark.internal.config parame...

2017-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18916
  
I am -0 by the same reason above ^.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18875: [SPARK-21513][SQL] Allow UDF to_json support converting ...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18875
  
**[Test build #80596 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80596/testReport)**
 for PR 18875 at commit 
[`c64d9c4`](https://github.com/apache/spark/commit/c64d9c49c5f42b7a72b1570f00daa34a2843be1d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18926: [SPARK-21712] [PySpark] Clarify type error for Co...

2017-08-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18926#discussion_r132847782
  
--- Diff: python/pyspark/sql/column.py ---
@@ -406,7 +406,13 @@ def substr(self, startPos, length):
 [Row(col=u'Ali'), Row(col=u'Bob')]
 """
 if type(startPos) != type(length):
-raise TypeError("Can not mix the type")
+raise TypeError(
+"startPos and length must be the same type. "
+"Got {startPos_t} and {length_t}, respectively."
--- End diff --

If PySpark always needs to check the types, are we doing the same things in 
all the other function calls?

In addition, why not directly checking 
```Python
if isinstance(length, (int, long)):
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r132846456
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -361,10 +361,18 @@ case class JsonTuple(children: Seq[Expression])
   // the fields to query are the remaining children
   @transient private lazy val fieldExpressions: Seq[Expression] = 
children.tail
 
+  // a field name given with constant null will be replaced with this 
pseudo field name
+  private val nullFieldName = "__NullFieldName"
--- End diff --

@jmchung, could we maybe compute this foldable related optimization ahead -
 
https://github.com/jmchung/spark/blob/ffa575a6731fef3e0731b73e0f7311cb024e831b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L425-L439
 and remove this fake field name?

I think we can make a function for the above codes first and then use it 
for computation for each row. Did I understand correctly?

I tried a rough version I thought - 
https://github.com/jmchung/spark/compare/SPARK-21677...HyukjinKwon:tmp-18930?expand=1,
 @viirya what do you think about this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18920: [SPARK-19471][SQL]AggregationIterator does not in...

2017-08-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18920#discussion_r132847406
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala ---
@@ -449,6 +449,28 @@ class DataFrameFunctionsSuite extends QueryTest with 
SharedSQLContext {
 ).foreach(assertValuesDoNotChangeAfterCoalesceOrUnion(_))
   }
 
+  private def assertNoExceptions(c: Column): Unit = {
+for ((wholeStage, useObjectHashAgg) <- Seq((true, false), (false, 
false), (false, true))) {
+  withSQLConf(
+(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key, wholeStage.toString),
+(SQLConf.USE_OBJECT_HASH_AGG.key, useObjectHashAgg.toString)) {
+val df = Seq(("1", 1), ("1", 2), ("2", 3), ("2", 4)).toDF("x", "y")
+// HashAggregate
--- End diff --

We need to check/compare the plans to ensure they are HashAggregate, 
ObjectHashAggregate and SortAggregate. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18914: [MINOR][SQL][TEST]no uncache table in joinsuite t...

2017-08-13 Thread heary-cao
Github user heary-cao commented on a diff in the pull request:

https://github.com/apache/spark/pull/18914#discussion_r132837521
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala ---
@@ -30,6 +30,15 @@ import org.apache.spark.TestUtils.{assertNotSpilled, 
assertSpilled}
 class JoinSuite extends QueryTest with SharedSQLContext {
   import testImplicits._
 
+  override def afterEach(): Unit = {
+try {
+  // Clear the cache table for test cases
+  spark.sharedState.cacheManager.clearCache()
--- End diff --

It was a simple reason.
```
  test("broadcasted hash outer join operator selection") {
spark.sharedState.cacheManager.clearCache()
sql("CACHE TABLE testData")
sql("CACHE TABLE testData2")
Seq(
  ("SELECT * FROM testData LEFT JOIN testData2 ON key = a",
classOf[BroadcastHashJoinExec]),
  ("SELECT * FROM testData RIGHT JOIN testData2 ON key = a where key = 
2",
classOf[BroadcastHashJoinExec]),
  ("SELECT * FROM testData right join testData2 ON key = a and key = 2",
classOf[BroadcastHashJoinExec])
).foreach(assertJoin)
sql("UNCACHE TABLE testData")
  }
```
the cache table and uncache table do not match. 



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/18914
  
@heary-cao Why do we need to call 
`spark.sharedState.cacheManager.clearCache()` in `JoinSuite` while we have 
already call `clearCache()` in `SharedSQLContext`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18875: [SPARK-21513][SQL] Allow UDF to_json support converting ...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18875
  
**[Test build #80585 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80585/testReport)**
 for PR 18875 at commit 
[`3045771`](https://github.com/apache/spark/commit/30457716075a4b061df9909b26bc427fb35cf29e).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18931
  
**[Test build #80595 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80595/testReport)**
 for PR 18931 at commit 
[`502139a`](https://github.com/apache/spark/commit/502139aca30db03d2ef52dc9e140b83668467122).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18932: Add reduceVertices method to GraphOps

2017-08-13 Thread gilcu2
Github user gilcu2 closed the pull request at:

https://github.com/apache/spark/pull/18932


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/18931
  
Ran with the same benchmark in #18810.

Before this patch:

Java HotSpot(TM) 64-Bit Server VM 1.8.0_102-b14 on Linux 4.9.36-moby
Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
max function length of wholestagecodegen: Best/Avg Time(ms)
Rate(M/s)   Per Row(ns)   Relative


codegen = F572 /  733  1.1  
   873.5   1.0X
codegen = T   2022 / 2039  0.3  
  3086.0   0.3X

After this patch:

Java HotSpot(TM) 64-Bit Server VM 1.8.0_102-b14 on Linux 4.9.36-moby
Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
max function length of wholestagecodegen: Best/Avg Time(ms)
Rate(M/s)   Per Row(ns)   Relative


codegen = F548 /  740  1.2  
   836.6   1.0X
codegen = T372 /  433  1.8  
   567.5   1.5X





---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18932: Add reduceVertices method to GraphOps

2017-08-13 Thread gilcu2
Github user gilcu2 commented on the issue:

https://github.com/apache/spark/pull/18932
  
Need a Jira First


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18920
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80591/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18920
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18875: [SPARK-21513][SQL] Allow UDF to_json support converting ...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18875
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18875: [SPARK-21513][SQL] Allow UDF to_json support converting ...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18875
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80596/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18931: [SPARK-21717][SQL][WIP] Decouple consume function...

2017-08-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18931#discussion_r132837981
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
 ---
@@ -149,14 +149,65 @@ trait CodegenSupport extends SparkPlan {
 
 ctx.freshNamePrefix = parent.variablePrefix
 val evaluated = evaluateRequiredVariables(output, inputVars, 
parent.usedInputs)
+
+// Under certain conditions, we can put the logic to consume the rows 
of this operator into
--- End diff --

Added more comment to elaborate the idea.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/18914
  
@gatorsmile , @viirya, @kiszk, With all the comments,  I have modify it 
again. please review it again if you have time. thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18920
  
**[Test build #80591 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80591/testReport)**
 for PR 18920 at commit 
[`bb29b8f`](https://github.com/apache/spark/commit/bb29b8f7e8e8be436ea028acfbe16ae6f4977169).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

2017-08-13 Thread yaooqinn
Github user yaooqinn commented on the issue:

https://github.com/apache/spark/pull/18648
  
ping @jiangxb1987 @cloud-fan again


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18914
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18914
  
**[Test build #80593 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80593/testReport)**
 for PR 18914 at commit 
[`7dd8cdb`](https://github.com/apache/spark/commit/7dd8cdbf7caac507cbd703193350503309b5f159).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18914
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80593/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18931
  
**[Test build #80589 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80589/testReport)**
 for PR 18931 at commit 
[`413707d`](https://github.com/apache/spark/commit/413707dd0c31a15514f00aea9addca77fe1dd2ce).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18931
  
**[Test build #80589 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80589/testReport)**
 for PR 18931 at commit 
[`413707d`](https://github.com/apache/spark/commit/413707dd0c31a15514f00aea9addca77fe1dd2ce).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18931
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80589/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18875: [SPARK-21513][SQL] Allow UDF to_json support converting ...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18875
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80585/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18875: [SPARK-21513][SQL] Allow UDF to_json support converting ...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18875
  
Build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18930
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80587/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18930
  
**[Test build #80587 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80587/testReport)**
 for PR 18930 at commit 
[`ffa575a`](https://github.com/apache/spark/commit/ffa575a6731fef3e0731b73e0f7311cb024e831b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/18914
  
@kiszk 
The original reason of this PR was a simple .

  test("broadcasted hash outer join operator selection") {
spark.sharedState.cacheManager.clearCache()
sql("CACHE TABLE testData")
sql("CACHE TABLE testData2")
Seq(
  ("SELECT * FROM testData LEFT JOIN testData2 ON key = a",
classOf[BroadcastHashJoinExec]),
  ("SELECT * FROM testData RIGHT JOIN testData2 ON key = a where key = 
2",
classOf[BroadcastHashJoinExec]),
  ("SELECT * FROM testData right join testData2 ON key = a and key = 2",
classOf[BroadcastHashJoinExec])
).foreach(assertJoin)
sql("UNCACHE TABLE testData")
  }
the _cache table_ and _uncache table_ do not match.
Later, we found that replace _uncache table_ with 
_spark.sharedState.cacheManager.clearCache()_  
Finally, it's now modified it.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/18914
  
okay,Looks like we just need something simple to remove _uncache table_ .
thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18875: [SPARK-21513][SQL] Allow UDF to_json support converting ...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18875
  
**[Test build #80596 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80596/testReport)**
 for PR 18875 at commit 
[`c64d9c4`](https://github.com/apache/spark/commit/c64d9c49c5f42b7a72b1570f00daa34a2843be1d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18931
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18652: [SPARK-21497][SQL] Pull non-deterministic equi join keys...

2017-08-13 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/18652
  
Once we pull out them into downstream project, should we still worry about 
call orders? They are evaluated before sort or shuffle added later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18931
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80595/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18931
  
**[Test build #80595 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80595/testReport)**
 for PR 18931 at commit 
[`502139a`](https://github.com/apache/spark/commit/502139aca30db03d2ef52dc9e140b83668467122).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18866: [SPARK-21649][SQL] Support writing data into hive bucket...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18866
  
**[Test build #80597 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80597/testReport)**
 for PR 18866 at commit 
[`19f880b`](https://github.com/apache/spark/commit/19f880bcd1e519ac28e23df4a0bb6c796348ae30).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18914
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80592/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18914
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-13 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18317#discussion_r132846768
  
--- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java 
---
@@ -0,0 +1,288 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.io;
+
+import com.google.common.base.Preconditions;
+import org.apache.spark.storage.StorageUtils;
+
+import javax.annotation.concurrent.GuardedBy;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.ByteBuffer;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+
+/**
+ * {@link InputStream} implementation which asynchronously reads ahead 
from the underlying input
+ * stream when specified amount of data has been read from the current 
buffer. It does it by maintaining
+ * two buffer - active buffer and read ahead buffer. Active buffer 
contains data which should be returned
+ * when a read() call is issued. The read ahead buffer is used to 
asynchronously read from the underlying
+ * input stream and once the current active buffer is exhausted, we flip 
the two buffers so that we can
+ * start reading from the read ahead buffer without being blocked in disk 
I/O.
+ */
+public class ReadAheadInputStream extends InputStream {
+
+  private ReentrantLock stateChangeLock = new ReentrantLock();
+
+  @GuardedBy("stateChangeLock")
+  private ByteBuffer activeBuffer;
+
+  @GuardedBy("stateChangeLock")
+  private ByteBuffer readAheadBuffer;
+
+  @GuardedBy("stateChangeLock")
+  private boolean endOfStream;
+
+  @GuardedBy("stateChangeLock")
+  // true if async read is in progress
+  private boolean isReadInProgress;
+
+  @GuardedBy("stateChangeLock")
+  // true if read is aborted due to an exception in reading from 
underlying input stream.
+  private boolean isReadAborted;
+
+  @GuardedBy("stateChangeLock")
+  private Exception readException;
+
+  // If the remaining data size in the current buffer is below this 
threshold,
+  // we issue an async read from the underlying input stream.
+  private final int readAheadThresholdInBytes;
+
+  private final InputStream underlyingInputStream;
+
+  private final ExecutorService executorService = 
Executors.newSingleThreadExecutor();
+
+  private final Condition asyncReadComplete = 
stateChangeLock.newCondition();
+
+  private final byte[] oneByte = new byte[1];
+
+  /**
+   * Creates a ReadAheadInputStream with the specified buffer 
size and read-ahead
+   * threshold
+   *
+   * @param   inputStream The underlying input stream.
+   * @param   bufferSizeInBytes   The buffer size.
+   * @param   readAheadThresholdInBytes   If the active buffer has less 
data than the read-ahead
+   *  threshold, an async read is 
triggered.
+   */
+  public ReadAheadInputStream(InputStream inputStream, int 
bufferSizeInBytes, int readAheadThresholdInBytes) {
+Preconditions.checkArgument(bufferSizeInBytes > 0, "bufferSizeInBytes 
should be greater than 0");
+Preconditions.checkArgument(readAheadThresholdInBytes > 0 && 
readAheadThresholdInBytes < bufferSizeInBytes,
+"readAheadThresholdInBytes should be 
greater than 0 and less than bufferSizeInBytes" );
+activeBuffer = ByteBuffer.allocate(bufferSizeInBytes);
+readAheadBuffer = ByteBuffer.allocate(bufferSizeInBytes);
+this.readAheadThresholdInBytes = readAheadThresholdInBytes;
+this.underlyingInputStream = inputStream;
+activeBuffer.flip();
+readAheadBuffer.flip();
+  }
+
+  private boolean isEndOfStream() {
+if(activeBuffer.remaining() == 0 && readAheadBuffer.remaining() == 0 
&& endOfStream) {
+  return true;
+}
+return  false;
+  }
+
+
+  private void readAsync(final ByteBuffer byteBuffer) throws IOException {
+stateChangeLock.lock();
+if (endOfStream || isReadInProgress) {
+  stateChangeLock.unlock();
   

[GitHub] spark issue #18866: [SPARK-21649][SQL] Support writing data into hive bucket...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18866
  
**[Test build #80597 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80597/testReport)**
 for PR 18866 at commit 
[`19f880b`](https://github.com/apache/spark/commit/19f880bcd1e519ac28e23df4a0bb6c796348ae30).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class ClusteredDistribution(clustering: Seq[Expression], 
clustersOpt: Option[Int] = None,`
  * `case class HashPartitioning(expressions: Seq[Expression], 
numPartitions: Int,`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18866: [SPARK-21649][SQL] Support writing data into hive bucket...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18866
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80597/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18866: [SPARK-21649][SQL] Support writing data into hive bucket...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18866
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18914: [MINOR][SQL][TEST]no uncache table in joinsuite t...

2017-08-13 Thread heary-cao
Github user heary-cao commented on a diff in the pull request:

https://github.com/apache/spark/pull/18914#discussion_r132833028
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala ---
@@ -30,6 +30,12 @@ import org.apache.spark.TestUtils.{assertNotSpilled, 
assertSpilled}
 class JoinSuite extends QueryTest with SharedSQLContext {
   import testImplicits._
 
+  override def afterEach(): Unit = {
+// Clear the cache table for test cases
+spark.sharedState.cacheManager.clearCache()
+super.afterEach()
--- End diff --

thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18931
  
**[Test build #80581 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80581/testReport)**
 for PR 18931 at commit 
[`05274e7`](https://github.com/apache/spark/commit/05274e7ad4c74e6241b5a05a9365c475f0c3c0a3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/18920
  
ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18931
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80581/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18930
  
**[Test build #80579 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80579/testReport)**
 for PR 18930 at commit 
[`ffa575a`](https://github.com/apache/spark/commit/ffa575a6731fef3e0731b73e0f7311cb024e831b).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18914
  
**[Test build #80580 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80580/testReport)**
 for PR 18914 at commit 
[`4a88e3f`](https://github.com/apache/spark/commit/4a88e3f71a45578dc9fd291e12c8cf616e629799).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18931
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18920
  
**[Test build #80582 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80582/testReport)**
 for PR 18920 at commit 
[`b932d2f`](https://github.com/apache/spark/commit/b932d2f3a6741a8ef052cbd8087f4b0836c617d6).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18920
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80582/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18914
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18931
  
**[Test build #80581 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80581/testReport)**
 for PR 18931 at commit 
[`05274e7`](https://github.com/apache/spark/commit/05274e7ad4c74e6241b5a05a9365c475f0c3c0a3).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18920
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18930
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18930
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80579/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18914
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80580/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18916: [SPARK-21705][CORE][DOC]Add spark.internal.config parame...

2017-08-13 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/18916
  
cc @HyukjinKwon @gatorsmile 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18914
  
**[Test build #80580 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80580/testReport)**
 for PR 18914 at commit 
[`4a88e3f`](https://github.com/apache/spark/commit/4a88e3f71a45578dc9fd291e12c8cf616e629799).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18931: [SPARK-21717][SQL][WIP] Decouple consume function...

2017-08-13 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/18931

[SPARK-21717][SQL][WIP] Decouple consume functions of physical operators in 
whole-stage codegen

## What changes were proposed in this pull request?

It has been observed in SPARK-21603 that whole-stage codegen suffers 
performance degradation, if the generated functions are too long to be 
optimized by JIT.

We basically produce a single function to incorporate generated codes from 
all physical operators in whole-stage. Thus, it is possibly to grow the size of 
generated function over a threshold that we can't have JIT optimization for it 
anymore.

This patch is trying to decouple the logic of consuming rows in physical 
operators to avoid a giant function processing rows.

## How was this patch tested?

Existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 SPARK-21717

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18931.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18931


commit 05274e7ad4c74e6241b5a05a9365c475f0c3c0a3
Author: Liang-Chi Hsieh 
Date:   2017-08-13T06:06:10Z

Decouple consume functions of physical operators in whole-stage codegen.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18700: [SPARK-21499] [SQL] Support creating persistent function...

2017-08-13 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/18700
  
cc @cloud-fan @ueshin 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18920
  
**[Test build #80582 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80582/testReport)**
 for PR 18920 at commit 
[`b932d2f`](https://github.com/apache/spark/commit/b932d2f3a6741a8ef052cbd8087f4b0836c617d6).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18914: [MINOR][SQL][TEST]no uncache table in joinsuite t...

2017-08-13 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18914#discussion_r132833474
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala ---
@@ -30,6 +30,15 @@ import org.apache.spark.TestUtils.{assertNotSpilled, 
assertSpilled}
 class JoinSuite extends QueryTest with SharedSQLContext {
   import testImplicits._
 
+  override def afterEach(): Unit = {
+try {
+  // Clear the cache table for test cases
+  spark.sharedState.cacheManager.clearCache()
--- End diff --

Does 
[this](https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/test/SharedSQLContext.scala#L91)
 work for `JoinSuite`, too?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18929: [MINOR][LAUNCHER]remove never used String in SparkLaunch...

2017-08-13 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/18929
  
@srowen 
In fact, we can replace spark.executor.memory with 
_SparkLauncher.EXECUTOR_MEMORY_ and replace spark.executor.cores with 
_SparkLauncher.EXECUTOR_CORES_. But in the process of modification, it also 
interrupts the consistency of other codes. So I chose to remove it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18914: [MINOR][SQL][TEST]no uncache table in joinsuite test

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18914
  
**[Test build #80583 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80583/testReport)**
 for PR 18914 at commit 
[`5b3f650`](https://github.com/apache/spark/commit/5b3f6505ed9e16c8b62ceb33535c078987bc9dd9).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

2017-08-13 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/18899
  
OK maybe include some of this text in the scaladoc for it, to make it clear 
it is always intended to be called with the value of `numNonZeroes`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18916: [SPARK-21705][CORE][DOC]Add spark.internal.config parame...

2017-08-13 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/18916
  
In Spark SQL, the description of all the SQLConf can be displayed through 
the SQL commands. However, I am not sure how end users can get the descriptions 
of these Spark Conf.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18692: [SPARK-21417][SQL] Infer join conditions using propagate...

2017-08-13 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/18692
  
@aokolnychyi Thanks for finding the non-convergent case! Let me see how to 
fix it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18926: [SPARK-21712] [PySpark] Clarify type error for Co...

2017-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18926#discussion_r132856266
  
--- Diff: python/pyspark/sql/column.py ---
@@ -406,7 +406,13 @@ def substr(self, startPos, length):
 [Row(col=u'Ali'), Row(col=u'Bob')]
 """
 if type(startPos) != type(length):
-raise TypeError("Can not mix the type")
+raise TypeError(
+"startPos and length must be the same type. "
+"Got {startPos_t} and {length_t}, respectively."
--- End diff --

For the latter, It looks we should call either `substr` with column,column 
or with int,int. I would like to avoid changing these If either way does not 
reduce the code diff and is virtually same, if I understood correctly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18652: [SPARK-21497][SQL] Pull non-deterministic equi join keys...

2017-08-13 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/18652
  
When we join two tables, given there are equi-join keys, and they are 
non-deterministic, for example `t1.a = rand(t2.b)` and `t1.c = rand(t2.d)`. We 
pull out them to downstream project:

Join [t1.a = rand(t2.b), t1.c = rand(t2.d)]
  Project [t1.a, t1.c]
TableScan t1
  Project [rand(t2.b) as rand(t2.b), rand(t2.d) as rand(t2.d)]
TableScan t2

`rand(t2.b)` and `rand(t2.d)` are evaluated in projection. Why Join will 
change its order?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18920
  
**[Test build #80599 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80599/testReport)**
 for PR 18920 at commit 
[`5239ebb`](https://github.com/apache/spark/commit/5239ebb5843315430d5c942dc53e09fb09d6c1c8).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18756: [SPARK-21548][SQL] "Support insert into serial columns o...

2017-08-13 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/18756
  
ok, I will solve the problems left first, and hold  this PR @gatorsmile.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #9518: [SPARK-11574][Core] Add metrics StatsD sink

2017-08-13 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/9518
  
Sorry I don't have the permission to merge this. Ping @cloud-fan @JoshRosen 
to review again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

2017-08-13 Thread mpjlu
Github user mpjlu commented on the issue:

https://github.com/apache/spark/pull/18899
  
Thanks @sethah @srowen . The comment is added.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

2017-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18899
  
**[Test build #80600 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80600/testReport)**
 for PR 18899 at commit 
[`d50de99`](https://github.com/apache/spark/commit/d50de9961f78c8d259b9167081c2d9529ce91a63).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17980: [SPARK-20728][SQL] Make ORCFileFormat configurable betwe...

2017-08-13 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/17980
  
Retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18909: [MINOR][SQL] Additional test case for CheckCartesianProd...

2017-08-13 Thread aokolnychyi
Github user aokolnychyi commented on the issue:

https://github.com/apache/spark/pull/18909
  
@gatorsmile sure, this PR is only about tests, I was just wondering what is 
planned regarding cross joins with inequality conditions.

I borrowed several tests from PR #16762 and added additional ones. As I 
mentioned, there is a small overlap between the existing tests and proposed 
ones but they are defined at different levels.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



  1   2   3   >