date:20170119

[GitHub] spark issue #11867: [SPARK-14049] [CORE] Add functionality in spark history ...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11867
  
**[Test build #71690 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71690/testReport)**
 for PR 11867 at commit 
[`3bb7553`](https://github.com/apache/spark/commit/3bb7553c844c22a5b9eac6331d48ddfbd8891c15).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #16647: [SPARK-19292][SQL] filter with partition columns should ...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16647
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71689/
Test PASSed.



[GitHub] spark issue #16647: [SPARK-19292][SQL] filter with partition columns should ...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16647
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16647: [SPARK-19292][SQL] filter with partition columns should ...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16647
  
**[Test build #71689 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71689/testReport)**
 for PR 16647 at commit 
[`51d567e`](https://github.com/apache/spark/commit/51d567e5b737d94c4bcb5929c3a75e52c9ecb008).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16633
  
We don't need an exact number; a confident margin is enough.

The downside of the broken RDD chain is re-processing the rows. Is there
anything else?

I don't think it is worth changing core and the scheduler for this purpose.
It is too risky and might introduce new bugs.

We can still avoid shuffling for 2, since we don't need to shuffle those
partitions.


On Jan 20, 2017 11:08 AM, "Fei Wang"  wrote:

For 1, my idea is not to use the proposal in this PR:

   1. How do you determine that total rows in all partitions are (much) more
   than the limit number, and then go into this code path? And how do you
   decide the "much more than"? I cannot use CBO estimated stats here, because
   the local limit plan may be complex and we cannot ensure the accuracy of
   the estimated row number.
   2. As @rxin suggested, this breaks the RDD chain.

So for 1, I think it needs some improvement of Spark core and the scheduler,
as I mentioned above.

For 2, it is OK to me. The solution is the same as I described above (still
shuffle + shuffle to multiple partitions + modified mapoutput statistics), right?





[GitHub] spark issue #16643: [SPARK-17724][Streaming][WebUI] Unevaluated new lines in...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16643
  
**[Test build #71695 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71695/testReport)**
 for PR 16643 at commit 
[`d1c16e2`](https://github.com/apache/spark/commit/d1c16e2f17190e6d227a9d062a54ffb75687ce68).



[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-01-19 Thread xuanyuanking

Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/16578
  
@mallman Thanks for letting me know. I'll try your patch and check #14957 take 
over or not.
I also think we need to get feedback from @liancheng. From our last 
discussion, liancheng may do some work based on the old PR.



[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread scwf

Github user scwf commented on the issue:

https://github.com/apache/spark/pull/16633
  
For 1, my idea is not to use the proposal in this PR:
1. How do you determine that `total rows in all partitions are (much) more than 
limit number.` and then go into this code path, and how do you decide the `much more 
than`? I cannot use CBO estimated stats here, because the local limit plan may be 
complex and we cannot ensure the accuracy of the estimated row number.
2. As @rxin suggested, this breaks the RDD chain.

So for 1, I think it needs some improvement of Spark core and the scheduler, as I 
mentioned above.

For 2, it is OK to me. The solution is the same as I described above (still 
shuffle + shuffle to multiple partitions + modified mapoutput statistics), right?



[GitHub] spark issue #16581: [SPARK-18589] [SQL] Fix Python UDF accessing attributes ...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16581
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71681/
Test PASSed.



[GitHub] spark issue #16581: [SPARK-18589] [SQL] Fix Python UDF accessing attributes ...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16581
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16581: [SPARK-18589] [SQL] Fix Python UDF accessing attributes ...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16581
  
**[Test build #71681 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71681/testReport)**
 for PR 16581 at commit 
[`f720c85`](https://github.com/apache/spark/commit/f720c85713252e7d33ca1bdb1667149b8d1a8cd2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #15730: [SPARK-18218][ML][MLLib] Reduce shuffled data size of Bl...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15730
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71691/
Test PASSed.



[GitHub] spark issue #15730: [SPARK-18218][ML][MLLib] Reduce shuffled data size of Bl...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15730
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16637: [SPARK-19225][SQL]round decimal return normal value but ...

2017-01-19 Thread discipleforteen

Github user discipleforteen commented on the issue:

https://github.com/apache/spark/pull/16637
  
OK. I will try to update the codegen and MathFunctionsSuite.scala.



[GitHub] spark issue #15730: [SPARK-18218][ML][MLLib] Reduce shuffled data size of Bl...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15730
  
**[Test build #71691 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71691/testReport)**
 for PR 15730 at commit 
[`3feca58`](https://github.com/apache/spark/commit/3feca5897044378d923775c43ef73c650c43cdff).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16028: [SPARK-18518][ML] HasSolver supports override

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16028
  
**[Test build #71694 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71694/testReport)**
 for PR 16028 at commit 
[`a95`](https://github.com/apache/spark/commit/a959a0b9a98dda2f45ce4843ed8595024e58).



[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16645
  
**[Test build #71693 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71693/testReport)**
 for PR 16645 at commit 
[`b1028ad`](https://github.com/apache/spark/commit/b1028ad573301ae4d351678a6e6b3b66392e32d3).



[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16633
  
OK. I think it is clearer now. We have two cases to solve:

1. After local limit, total rows in all partitions are (much) more than 
the limit number.
2. After local limit, total rows in all partitions are nearly the limit 
number.

For 1, the current change in this PR is effective. We can save shuffling 
and most of the local limit processing.

For 2, the current change will re-process all the rows, so it is not 
efficient. Falling back to the old global limit will degrade parallelism, so if 
the limit number is big, the performance will be bad. One solution is to get 
the exact number of rows in each partition after local limit from modified 
mapoutput statistics. Then we can take only the partitions with the required 
number of rows.

@scwf What do you think?
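
The selection step proposed for case 2 can be sketched as follows. This is only an illustration in plain Scala, not Spark code: `LimitPlanSketch` and `rowsToTake` are hypothetical names, and the per-partition row counts are assumed to be already available (for example from the modified mapoutput statistics mentioned above). Partitions are consumed in order until the limit is reached.

```scala
object LimitPlanSketch {
  /** Hypothetical helper: for each partition, the number of rows to take so
    * that the total equals min(limit, sum of rowCounts). Partitions are
    * consumed in the order given.
    */
  def rowsToTake(rowCounts: Seq[Long], limit: Long): Seq[Long] = {
    require(limit >= 0, "limit must be non-negative")
    // Carry the rows still needed through the fold, collecting one
    // "take" value per partition.
    val (_, taken) = rowCounts.foldLeft((limit, Vector.empty[Long])) {
      case ((remaining, acc), count) =>
        val take = math.min(count, remaining)
        (remaining - take, acc :+ take)
    }
    taken
  }

  def main(args: Array[String]): Unit = {
    // Four partitions of 40 rows each, limit of 100 rows.
    println(rowsToTake(Seq(40L, 40L, 40L, 40L), 100L))
  }
}
```

With 40 rows in each of four partitions and a limit of 100, the plan takes 40, 40, 20 and 0 rows, so the last partition does not need to be read at all.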






[GitHub] spark issue #16592: [SPARK-19235] [SQL] [TESTS] Enable Test Cases in DDLSuit...

2017-01-19 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16592
  
ping @cloud-fan 



[GitHub] spark pull request #16645: [SPARK-19290][SQL] add a new extending interface ...

2017-01-19 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16645#discussion_r97003427
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionState.scala ---
@@ -62,15 +62,17 @@ private[hive] class HiveSessionState(sparkSession: SparkSession)
   override val extendedResolutionRules =
     catalog.ParquetConversions ::
     catalog.OrcConversions ::
--- End diff --

These two rules need `MetastoreRelation`. Ideally, they should be after the 
rule `FindHiveSerdeTable`. 

I am fine to keep it if we plan to move it into optimizer rules.



[GitHub] spark pull request #16645: [SPARK-19290][SQL] add a new extending interface ...

2017-01-19 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16645#discussion_r97003206
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionState.scala ---
@@ -62,15 +62,17 @@ private[hive] class HiveSessionState(sparkSession: SparkSession)
   override val extendedResolutionRules =
     catalog.ParquetConversions ::
     catalog.OrcConversions ::
--- End diff --

do they need to? Eventually they will be optimizer rules.



[GitHub] spark issue #16647: [SPARK-19292][SQL] filter with partition columns should ...

2017-01-19 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16647
  
LGTM waiting for tests



[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...

2017-01-19 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16645
  
I also understand the concern of @yhuai. But when the number of rules in 
a single batch keeps growing, it is a little hard to maintain the order of 
rules that depend on each other using a single condition `resolved`. 
Eventually, I assume we need to split the huge batch into multiple reasonable 
batches.
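
A toy sketch of the ordering concern, assuming a much simpler executor than the real analyzer: each batch runs its rules once, in the listed order, and a plan is just a set of markers. All names here (`RuleBatchSketch`, `findTable`, `convert`) are hypothetical and only illustrate why explicit batches make a dependency between rules visible.

```scala
object RuleBatchSketch {
  // Toy plan: the set of markers left behind by rules that have fired.
  type Plan = Set[String]
  type Rule = Plan => Plan

  final case class Batch(name: String, rules: Seq[Rule])

  // Hypothetical rule that must run first.
  val findTable: Rule = plan => plan + "table-found"

  // Hypothetical rule that only fires once findTable has run.
  val convert: Rule = plan =>
    if (plan.contains("table-found")) plan + "converted" else plan

  // Runs batches in order; within a batch, each rule runs once in listed order.
  def execute(batches: Seq[Batch], plan: Plan): Plan =
    batches.foldLeft(plan) { (current, batch) =>
      batch.rules.foldLeft(current)((acc, rule) => rule(acc))
    }

  def main(args: Array[String]): Unit = {
    // One large batch: the result depends on how the rules happen to be listed.
    println(execute(Seq(Batch("single", Seq(convert, findTable))), Set.empty))
    // Two batches: the dependency is stated by the batch order itself.
    println(execute(
      Seq(Batch("resolution", Seq(findTable)), Batch("conversion", Seq(convert))),
      Set.empty))
  }
}
```

In the single-batch run `convert` is listed before `findTable` and never fires; with two batches the order between the groups is explicit and no longer depends on the position of a rule inside one long list.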



[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...

2017-01-19 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16645
  
also ping @hvanhovell 



[GitHub] spark issue #16651: [SPARK-19298][Core] History server can't match Malformed...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16651
  
Can one of the admins verify this patch?



[GitHub] spark pull request #16651: [SPARK-19298][Core] History server can't match Ma...

2017-01-19 Thread sharkdtu

GitHub user sharkdtu opened a pull request:

https://github.com/apache/spark/pull/16651

[SPARK-19298][Core] History server can't match MalformedInputException and 
prompt the detailed logs while replaying eventlog

History server can't match MalformedInputException and prompt the detailed 
logs while replaying eventlog, because MalformedInputException is a subclass of 
IOException.
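
The effect of case order can be shown with a small sketch. The handler shape below is an assumption for illustration, not the actual history server code, and `classifyBroadFirst` and `classifySpecificFirst` are hypothetical names. Only the class hierarchy is taken from the JDK: `java.nio.charset.MalformedInputException` extends `CharacterCodingException`, which extends `IOException`.

```scala
import java.io.IOException
import java.nio.charset.MalformedInputException

object ExceptionMatchSketch {
  // Assumed handler shape: the IOException case is listed first, so a
  // MalformedInputException is matched there and never reaches the
  // branch that would report the details.
  def classifyBroadFirst(t: Throwable): String = t match {
    case _: IOException => "io"
    case _: Exception   => "detail"
    case _              => "other"
  }

  // Listing the subclass before its superclass lets it reach its own branch.
  def classifySpecificFirst(t: Throwable): String = t match {
    case _: MalformedInputException => "detail"
    case _: IOException             => "io"
    case _: Exception               => "detail"
    case _                          => "other"
  }

  def main(args: Array[String]): Unit = {
    val malformed = new MalformedInputException(1)
    println(classifyBroadFirst(malformed))
    println(classifySpecificFirst(malformed))
  }
}
```

Scala tries the cases of a `match` (or `catch`) block top to bottom and takes the first one that fits, so the more specific exception type has to come before the more general one.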

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sharkdtu/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16651.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16651


commit 07f59016d6175d5aac0242f7432ce09bb3f984b0
Author: sharkdtu 
Date:   2017-01-20T02:06:55Z

fix MalformedInputException match





[GitHub] spark issue #16582: [SPARK-19220][UI] Make redirection to HTTPS apply to all...

2017-01-19 Thread sarutak

Github user sarutak commented on the issue:

https://github.com/apache/spark/pull/16582
  
I understand. If there are no additional comments from anyone by tomorrow, 
I'll merge this.



[GitHub] spark issue #15219: [SPARK-14098][SQL] Generate Java code to build CachedCol...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15219
  
**[Test build #71692 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71692/testReport)**
 for PR 15219 at commit 
[`b15d9d5`](https://github.com/apache/spark/commit/b15d9d5724936f5946d99acc40b75754e8583aa6).



[GitHub] spark issue #16605: [SPARK-18884][SQL] Support Array[_] in ScalaUDF

2017-01-19 Thread maropu

Github user maropu commented on the issue:

https://github.com/apache/spark/pull/16605
  
okay, I'll update this pr in that way, thanks!



[GitHub] spark issue #16582: [SPARK-19220][UI] Make redirection to HTTPS apply to all...

2017-01-19 Thread vanzin

Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/16582
  
> I think it is because ResourceManager's web proxy might not handle https 
properly.

Yeah, that's a known issue with enabling SSL for the web UI on YARN with 
self-signed certificates.



[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread scwf

Github user scwf commented on the issue:

https://github.com/apache/spark/pull/16633
  
all partitions after local limit are about/nearly 100,000,000 rows



[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16633
  
Do you mean that the total rows in all partitions after local limit are 
about/nearly 100,000,000 rows? Or that each partition after local limit has 
about/nearly 100,000,000 rows?



[GitHub] spark issue #16582: [SPARK-19220][UI] Make redirection to HTTPS apply to all...

2017-01-19 Thread sarutak

Github user sarutak commented on the issue:

https://github.com/apache/spark/pull/16582
  
@vanzin I'm looking into this change. It works well in standalone mode 
but not in YARN mode.
I think it is because ResourceManager's web proxy might not handle https 
properly.
It seems the httpclient in `WebAppProxyServlet` is not configured for SSL.
Do you have any ideas?

The change itself seems good to me.



[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread scwf

Github user scwf commented on the issue:

https://github.com/apache/spark/pull/16633
  
Again, to be clear, I am against the performance regression in the following case:
0.  the limit number is 100,000,000
1.  the original table is very large, with much more than 100,000,000 rows
2.  after the local limit stage, the output row number is about/nearly 
100,000,000 rows





[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16633
  
That is why I propose to avoid shuffling to single partition. We can save 
shuffling and keep parallelism. So I don't know what you are against?




[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread scwf

Github user scwf commented on the issue:

https://github.com/apache/spark/pull/16633
  
I think the shuffle itself is OK, but shuffling to one partition leads to the 
performance issue.



[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16633
  
@scwf So it sounds like the problem is the shuffling.



[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread scwf

Github user scwf commented on the issue:

https://github.com/apache/spark/pull/16633
  
Assume the local limit outputs 100,000,000 rows; then in the global limit they 
will all be taken in a single partition, so it is very slow and cannot use 
other free cores to improve parallelism.
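
To make the concern concrete, here is a minimal sketch in plain Scala collections, not Spark code. It assumes a table can be modelled as a `Seq` of partitions; `LimitSketch` and its method names are hypothetical and only illustrate why every row that survives the local limit lands in one task.

```scala
// Hypothetical model: each inner Seq stands for one partition of a table.
object LimitSketch {
  // Local limit: every partition independently keeps at most `n` rows.
  def localLimit[A](partitions: Seq[Seq[A]], n: Int): Seq[Seq[A]] =
    partitions.map(_.take(n))

  // Global limit as described in this thread: all surviving rows are moved
  // into one partition, and a single task takes the first `n` of them.
  def globalLimit[A](partitions: Seq[Seq[A]], n: Int): Seq[A] =
    partitions.flatten.take(n)

  // Number of rows the single global-limit task receives after the shuffle.
  def rowsShuffledToOnePartition[A](partitions: Seq[Seq[A]], n: Int): Int =
    localLimit(partitions, n).map(_.size).sum
}
```

In this model the single task receives up to `n` rows from every partition, so with a very large `n` the work is not spread over the other cores.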



[GitHub] spark issue #16313: [SPARK-18899][SPARK-18912][SPARK-18913][SQL] refactor th...

2017-01-19 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16313
  
Actually this PR had not been backported to 2.1; I've backported it now.



[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16633
  
@scwf I am not sure you have really thought this through. Can you describe the 
single partition issue based on your understanding?



[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread scwf

Github user scwf commented on the issue:

https://github.com/apache/spark/pull/16633
  
@viirya my team member posted to the mailing list. Actually we mean the case I 
listed above: the main issue is the single-partition issue in the global limit. 
If in that case you fall back to the old global limit, it is still unresolved.



[GitHub] spark issue #15730: [SPARK-18218][ML][MLLib] Reduce shuffled data size of Bl...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15730
  
**[Test build #71691 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71691/testReport)**
 for PR 15730 at commit 
[`3feca58`](https://github.com/apache/spark/commit/3feca5897044378d923775c43ef73c650c43cdff).



[GitHub] spark issue #16566: [SPARK-18821][SparkR]: Bisecting k-means wrapper in Spar...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16566
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16650: [SPARK-16554][CORE] Automatically Kill Executors and Nod...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16650
  
Can one of the admins verify this patch?



[GitHub] spark issue #16566: [SPARK-18821][SparkR]: Bisecting k-means wrapper in Spar...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16566
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71683/
Test PASSed.



[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16344
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71686/
Test PASSed.



[GitHub] spark issue #16566: [SPARK-18821][SparkR]: Bisecting k-means wrapper in Spar...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16566
  
**[Test build #71683 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71683/testReport)**
 for PR 16566 at commit 
[`83b2d6f`](https://github.com/apache/spark/commit/83b2d6f34a838b201fab89912439847234ef0efd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16344
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16344
  
**[Test build #71686 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71686/testReport)**
 for PR 16344 at commit 
[`83deee3`](https://github.com/apache/spark/commit/83deee352c46ec113554fccee4bdc14ead56072e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #16650: [SPARK-16554][CORE] Automatically Kill Executors ...

2017-01-19 Thread jsoltren

GitHub user jsoltren opened a pull request:

https://github.com/apache/spark/pull/16650

[SPARK-16554][CORE] Automatically Kill Executors and Nodes when they are 
Blacklisted

## What changes were proposed in this pull request?

In SPARK-8425, we introduced a mechanism for blacklisting executors and 
nodes (hosts). After a certain number of failures, these resources would be 
"blacklisted" and no further work would be assigned to them for some period of 
time.

In some scenarios, it is better to fail fast, and to simply kill these 
unreliable resources. This change proposes to do so by having the 
BlacklistTracker kill unreliable resources when they would otherwise be 
"blacklisted".

In order to be thread safe, this code depends on the 
CoarseGrainedSchedulerBackend sending a message to the driver backend in order 
to do the actual killing. This also helps to prevent a race which would permit 
work to begin on a resource (executor or node), between the time the resource 
is marked for killing and the time at which it is finally killed.
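
As a rough illustration of the idea above, the following is a minimal sketch and not the code in this patch. `FailureTracker`, `maxFailures`, `killOnBlacklist` and the `kill` callback are all hypothetical names; the callback merely stands in for the message that would be sent to the scheduler backend.

```scala
import scala.collection.mutable

// Hypothetical sketch; none of these names are taken from the patch.
final class FailureTracker(
    maxFailures: Int,
    killOnBlacklist: Boolean,
    kill: String => Unit) {

  private val failures = mutable.Map.empty[String, Int].withDefaultValue(0)
  private val blacklisted = mutable.Set.empty[String]

  // Record one failure. When the threshold is first reached, blacklist the
  // executor and, if configured, ask the supplied callback to kill it.
  def recordFailure(executorId: String): Unit = {
    failures(executorId) += 1
    if (failures(executorId) >= maxFailures && blacklisted.add(executorId)) {
      if (killOnBlacklist) kill(executorId)
    }
  }

  def isBlacklisted(executorId: String): Boolean =
    blacklisted.contains(executorId)
}
```

Because `blacklisted.add` returns `true` only the first time an id is added, the kill callback fires at most once per executor in this sketch. The sketch is single-threaded and does not model the synchronization described above.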

## How was this patch tested?

./dev/run-tests
Ran 
https://github.com/jsoltren/jose-utils/blob/master/blacklist/test-blacklist.sh, 
and checked logs to see executors and nodes being killed.

Testing can likely be improved here; suggestions welcome.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jsoltren/spark SPARK-16554-submit

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16650.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16650


commit 81af45fbcbe9609cd3edaed692cb92520ea3f6e6
Author: José Hiram Soltren 
Date:   2016-12-01T04:01:42Z

Add test case for killExecutorsOnHost

commit 33ac3643a799eaf3a4b8a48db001eb4aa05c39ef
Author: José Hiram Soltren 
Date:   2016-12-01T04:36:26Z

BlacklistTracker can ask the SparkContext to kill executors on a host. 
Still need to wire in configuration.

commit da1d91df24310bdc0a748466f1bde746c080ea6f
Author: José Hiram Soltren 
Date:   2016-12-02T17:14:12Z

Respond to review feedback: basic changes

commit 87bb328f13c9c49c9c0d210394236015aa068690
Author: José Hiram Soltren 
Date:   2016-12-02T21:13:07Z

Add documentation for configuration.md

commit 974999c314be2b7b96a0643cb1f20de42210d29a
Author: José Hiram Soltren 
Date:   2016-12-02T22:33:13Z

First implementation of actual executor killing in BlacklistTracker

commit ebe35f6fc356acc15edb2a0fa1284ed3976481da
Author: José Hiram Soltren 
Date:   2016-12-02T23:14:35Z

Additional updates. Not sure if this killing is thread or race safe.

commit 56b5b96fc65220604495d5c4e817cfc5071efe22
Author: José Hiram Soltren 
Date:   2016-12-02T23:25:21Z

Add some implementation thoughts in comments to BlacklistTracker

commit c4556bd6680b393ffb949bc5b321e38209d91d37
Author: José Hiram Soltren 
Date:   2016-12-13T03:30:11Z

Update killing of nodes to use an RPC method for synchronization





[GitHub] spark issue #16586: [WIP][SPARK-19117][SPARK-18922][TESTS] Fix the rest of f...

2017-01-19 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16586
  
 Build started: [TESTS] `org.apache.spark.scheduler.SparkListenerSuite` 
[![PR-16586](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=68031366-45EE-45B4-867A-40A4D9B1AD07&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/68031366-45EE-45B4-867A-40A4D9B1AD07)
 Build started: [TESTS] 
`org.apache.spark.sql.hive.execution.HiveQuerySuite` 
[![PR-16586](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=E04C4110-1DAC-479C-BC72-20F668E6995C&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/E04C4110-1DAC-479C-BC72-20F668E6995C)
 Build started: [TESTS] 
`org.apache.spark.sql.hive.execution.AggregationQuerySuite` 
[![PR-16586](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=CF024224-C21C-4466-B624-4E3427F89719&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/CF024224-C21C-4466-B624-4E3427F89719)
 Build started: [TESTS] `org.apache.spark.sql.hive.StatisticsSuite` 
[![PR-16586](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=F89A5E7F-3E82-434F-8BF1-2543FCDE44B6&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/F89A5E7F-3E82-434F-8BF1-2543FCDE44B6)
 Build started: [TESTS] `org.apache.spark.sql.hive.execution.SQLQuerySuite` 
[![PR-16586](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=2452EE72-87D6-4F87-BD1F-0FF1D281C9BD&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/2452EE72-87D6-4F87-BD1F-0FF1D281C9BD)



[GitHub] spark issue #16586: [WIP][SPARK-19117][SPARK-18922][TESTS] Fix the rest of f...

2017-01-19 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16586
  
Current status of this PR:

It seems the tests below fail consistently across 6 builds (please 
check the logs at https://ci.appveyor.com/project/spark-test/spark/history).

```
org.apache.spark.scheduler.SparkListenerSuite:
 - local metrics *** FAILED *** (1 second, 487 milliseconds)

org.apache.spark.sql.hive.execution.HiveQuerySuite:
- constant null testing *** FAILED *** (562 milliseconds)

org.apache.spark.sql.hive.execution.AggregationQuerySuite
- udaf with all data types *** FAILED *** (641 milliseconds)

org.apache.spark.sql.hive.StatisticsSuite
- verify serialized column stats after analyzing columns *** FAILED *** (1 
second, 110 milliseconds)

org.apache.spark.sql.hive.execution.SQLQuerySuite
- dynamic partition value test *** FAILED *** (547 milliseconds)
- SPARK-6785: HiveQuerySuite - Date cast *** FAILED *** (156 milliseconds)
```

Let me try to run the individual tests for them, because it takes too long.



[GitHub] spark issue #11867: [SPARK-14049] [CORE] Add functionality in spark history ...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11867
  
**[Test build #71690 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71690/testReport)**
 for PR 11867 at commit 
[`3bb7553`](https://github.com/apache/spark/commit/3bb7553c844c22a5b9eac6331d48ddfbd8891c15).



[GitHub] spark pull request #11867: [SPARK-14049] [CORE] Add functionality in spark h...

2017-01-19 Thread paragpc

Github user paragpc commented on a diff in the pull request:

https://github.com/apache/spark/pull/11867#discussion_r96997288
  
--- Diff: 
core/src/main/scala/org/apache/spark/status/api/v1/ApplicationListResource.scala
 ---
@@ -43,11 +45,24 @@ private[v1] class ApplicationListResource(uiRoot: 
UIRoot) {
   // keep the app if *any* attempts fall in the right time window
   ((!anyRunning && includeCompleted) || (anyRunning && 
includeRunning)) &&
   app.attempts.exists { attempt =>
-val start = attempt.startTime.getTime
-start >= minDate.timestamp && start <= maxDate.timestamp
+isAttemptInRange(attempt, minDate, maxDate, minEndDate, 
maxEndDate, anyRunning)
   }
 }.take(numApps)
   }
+
+  private def isAttemptInRange(
+  attempt: ApplicationAttemptInfo,
+  minStartDate: SimpleDateParam,
+  maxStartDate: SimpleDateParam,
+  minEndDate: SimpleDateParam,
+  maxEndDate: SimpleDateParam,
+  skipEndTimeValidation: Boolean): Boolean = {
+val startTimeOk = attempt.startTime.getTime >= minStartDate.timestamp 
&&
+  attempt.startTime.getTime <= maxStartDate.timestamp
+val endTimeOk = skipEndTimeValidation || (attempt.endTime.getTime >= 
minEndDate.timestamp &&
--- End diff --

@squito, thanks for pointing that out. I think we can change the endTimeOk 
condition to include this case. I updated the pull request.
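
For readers following along, here is a self-contained sketch of the kind of range check under discussion. It is not the code in the pull request: `Attempt` is a hypothetical stand-in for `ApplicationAttemptInfo`, timestamps are plain epoch milliseconds rather than `SimpleDateParam`, and the upper bound on the end time is an assumption, since the quoted diff is cut off before that part of the condition.

```scala
// Hypothetical stand-in for ApplicationAttemptInfo; times are epoch milliseconds.
final case class Attempt(startTime: Long, endTime: Long)

object AttemptFilter {
  def isAttemptInRange(
      attempt: Attempt,
      minStart: Long,
      maxStart: Long,
      minEnd: Long,
      maxEnd: Long,
      skipEndTimeValidation: Boolean): Boolean = {
    val startTimeOk =
      attempt.startTime >= minStart && attempt.startTime <= maxStart
    // Assumption: the end time is bounded on both sides, mirroring the start
    // time. The quoted diff ends before the second half of this condition.
    val endTimeOk = skipEndTimeValidation ||
      (attempt.endTime >= minEnd && attempt.endTime <= maxEnd)
    startTimeOk && endTimeOk
  }
}
```

In this sketch a still-running attempt can be passed with `skipEndTimeValidation = true`, so only its start time is checked.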



[GitHub] spark issue #16536: [SPARK-19163][PYTHON][SQL] Delay _judf initialization to...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16536
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71688/
Test PASSed.



[GitHub] spark issue #16536: [SPARK-19163][PYTHON][SQL] Delay _judf initialization to...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16536
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16536: [SPARK-19163][PYTHON][SQL] Delay _judf initialization to...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16536
  
**[Test build #71688 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71688/testReport)**
 for PR 16536 at commit 
[`abb3726`](https://github.com/apache/spark/commit/abb37269b3bbde6dd6e9ae4fdaa211a9bcf46ca9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL][WIP] UserDefinedFunction.__ca...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16537
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71687/
Test PASSed.



[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...

2017-01-19 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16645
  
@yhuai yes, we can use conditions and put them in `resolved` to control when 
the rules fire. But another problem is checking and normalization: it's hard 
to detect whether they have already been done, so we would do them again and 
again. Later we may also have rules that need the checking and normalization 
to be done first, and then we would have to depend on rule order within a batch.



[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL][WIP] UserDefinedFunction.__ca...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16537
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL][WIP] UserDefinedFunction.__ca...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16537
  
**[Test build #71687 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71687/testReport)**
 for PR 16537 at commit 
[`7aa1607`](https://github.com/apache/spark/commit/7aa1607d42b45ec52dd7ec7741f043b7c1f18ec7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16534
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16534
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71685/
Test PASSed.



[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16534
  
**[Test build #71685 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71685/testReport)**
 for PR 16534 at commit 
[`8dd9071`](https://github.com/apache/spark/commit/8dd9071c2f847af5a0a29ddf0b0ad4a3e48c9b3a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16535: [SPARK-19162][PYTHON][SQL] UserDefinedFunction should va...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16535
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71684/
Test PASSed.



[GitHub] spark issue #16535: [SPARK-19162][PYTHON][SQL] UserDefinedFunction should va...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16535
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16535: [SPARK-19162][PYTHON][SQL] UserDefinedFunction should va...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16535
  
**[Test build #71684 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71684/testReport)**
 for PR 16535 at commit 
[`23d9c9d`](https://github.com/apache/spark/commit/23d9c9da3663ea7ce30971ff3feaaff4800e9894).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16605: [SPARK-18884][SQL] Support Array[_] in ScalaUDF

2017-01-19 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16605
  
SGTM



[GitHub] spark issue #16533: [SPARK-19160][PYTHON][SQL][WIP] Add udf decorator

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16533
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71682/
Test PASSed.



[GitHub] spark issue #16533: [SPARK-19160][PYTHON][SQL][WIP] Add udf decorator

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16533
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16533: [SPARK-19160][PYTHON][SQL][WIP] Add udf decorator

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16533
  
**[Test build #71682 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71682/testReport)**
 for PR 16533 at commit 
[`4ed5f6f`](https://github.com/apache/spark/commit/4ed5f6f203b69508829f4f396c52a5a789c88e8f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16647: [SPARK-19292][SQL] filter with partition columns should ...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16647
  
**[Test build #71689 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71689/testReport)**
 for PR 16647 at commit 
[`51d567e`](https://github.com/apache/spark/commit/51d567e5b737d94c4bcb5929c3a75e52c9ecb008).



[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-19 Thread yanboliang

Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/16344
  
@actuaryzhang This test failure was caused by Jenkins being unstable. You
just need to retest if you encounter a similar issue.



[GitHub] spark issue #16536: [SPARK-19163][PYTHON][SQL] Delay _judf initialization to...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16536
  
**[Test build #71688 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71688/testReport)**
 for PR 16536 at commit 
[`abb3726`](https://github.com/apache/spark/commit/abb37269b3bbde6dd6e9ae4fdaa211a9bcf46ca9).



[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL][WIP] UserDefinedFunction.__ca...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16537
  
**[Test build #71687 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71687/testReport)**
 for PR 16537 at commit 
[`7aa1607`](https://github.com/apache/spark/commit/7aa1607d42b45ec52dd7ec7741f043b7c1f18ec7).



[GitHub] spark pull request #16638: spark-19115

2017-01-19 Thread ouyangxiaochen

Github user ouyangxiaochen commented on a diff in the pull request:

https://github.com/apache/spark/pull/16638#discussion_r96993195
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
@@ -58,6 +58,7 @@ import org.apache.spark.util.Utils
 case class CreateTableLikeCommand(
--- End diff --

OK, I will update it later. Thanks!



[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16534
  
**[Test build #71685 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71685/testReport)**
 for PR 16534 at commit 
[`8dd9071`](https://github.com/apache/spark/commit/8dd9071c2f847af5a0a29ddf0b0ad4a3e48c9b3a).



[GitHub] spark issue #16535: [SPARK-19162][PYTHON][SQL] UserDefinedFunction should va...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16535
  
**[Test build #71684 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71684/testReport)**
 for PR 16535 at commit 
[`23d9c9d`](https://github.com/apache/spark/commit/23d9c9da3663ea7ce30971ff3feaaff4800e9894).



[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16344
  
**[Test build #71686 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71686/testReport)**
 for PR 16344 at commit 
[`83deee3`](https://github.com/apache/spark/commit/83deee352c46ec113554fccee4bdc14ead56072e).



[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-19 Thread yanboliang

Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/16344
  
Jenkins, retest this please.



[GitHub] spark issue #16566: [SPARK-18821][SparkR]: Bisecting k-means wrapper in Spar...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16566
  
**[Test build #71683 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71683/testReport)**
 for PR 16566 at commit 
[`83b2d6f`](https://github.com/apache/spark/commit/83b2d6f34a838b201fab89912439847234ef0efd).



[GitHub] spark issue #16533: [SPARK-19160][PYTHON][SQL][WIP] Add udf decorator

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16533
  
**[Test build #71682 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71682/testReport)**
 for PR 16533 at commit 
[`4ed5f6f`](https://github.com/apache/spark/commit/4ed5f6f203b69508829f4f396c52a5a789c88e8f).



[GitHub] spark issue #16637: [SPARK-19225][SQL]round decimal return normal value but ...

2017-01-19 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16637
  
I think we also need to update the codegen path.




[GitHub] spark issue #16637: [SPARK-19225][SQL]round decimal return normal value but ...

2017-01-19 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16637
  
Can you add a test?




[GitHub] spark issue #16581: [SPARK-18589] [SQL] Fix Python UDF accessing attributes ...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16581
  
**[Test build #71681 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71681/testReport)**
 for PR 16581 at commit 
[`f720c85`](https://github.com/apache/spark/commit/f720c85713252e7d33ca1bdb1667149b8d1a8cd2).



[GitHub] spark issue #16647: [SPARK-19292][SQL] filter with partition columns should ...

2017-01-19 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16647
  
LGTM, except for two comments.



[GitHub] spark pull request #16647: [SPARK-19292][SQL] filter with partition columns ...

2017-01-19 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16647#discussion_r96990182
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
@@ -2014,4 +2014,17 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
   )
 }
   }
+
+  test("SPARK-19292: filter with partition columns should be case-insensitive on Hive tables") {
+withTable("tbl") {
--- End diff --

Could you explicitly set the conf `CASE_SENSITIVE` to false? For example:
```scala
withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
  ...
}
```




[GitHub] spark pull request #16647: [SPARK-19292][SQL] filter with partition columns ...

2017-01-19 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16647#discussion_r96989355
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
@@ -2014,4 +2014,17 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
   )
 }
   }
+
+  test("SPARK-19292: filter with partition columns should be case-insensitive on Hive tables") {
+withTable("tbl") {
+  sql("CREATE TABLE tbl(i int, j int) USING hive PARTITIONED BY (j)")
+  sql("INSERT INTO tbl PARTITION(j=10) SELECT 1")
+  checkAnswer(spark.table("tbl"), Row(1, 10))
+
+  sql("SELECT i, j FROM tbl WHERE J=10").explain(true)
--- End diff --

Nit: remove it?



[GitHub] spark issue #16536: [SPARK-19163][PYTHON][SQL] Delay _judf initialization to...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16536
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16536: [SPARK-19163][PYTHON][SQL] Delay _judf initialization to...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16536
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71679/
Test PASSed.



[GitHub] spark issue #16536: [SPARK-19163][PYTHON][SQL] Delay _judf initialization to...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16536
  
**[Test build #71679 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71679/testReport)**
 for PR 16536 at commit 
[`57735b2`](https://github.com/apache/spark/commit/57735b2a554235cbcc47261660795f65f4f2d238).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16535: [SPARK-19162][PYTHON][SQL] UserDefinedFunction should va...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16535
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71677/
Test PASSed.



[GitHub] spark issue #16535: [SPARK-19162][PYTHON][SQL] UserDefinedFunction should va...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16535
  
**[Test build #71677 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71677/testReport)**
 for PR 16535 at commit 
[`f7ec316`](https://github.com/apache/spark/commit/f7ec3167829fdfac0b8f0804917411030dc6f796).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16535: [SPARK-19162][PYTHON][SQL] UserDefinedFunction should va...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16535
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16633
  
That case only happens when the row counts in all partitions are less
than or (nearly) equal to the limit number, so the query needs to scan
(almost) all partitions.

One possible way to deal with this case is to use row count statistics to
decide whether to use this global limit without shuffle or the old global limit.
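
To make the idea concrete, here is a rough sketch of such a decision, written in plain Scala with no Spark dependency. It assumes the condition above means that the total row count across partitions is at most (roughly) the limit. Every name in it (`LimitStrategy`, `LimitPlanning.chooseStrategy`, the `slack` factor) is hypothetical and is not part of Spark's planner or of this PR.

```scala
// Hypothetical sketch only: none of these names exist in Spark.
sealed trait LimitStrategy
case object LimitWithoutShuffle extends LimitStrategy // the new approach discussed in this PR
case object LimitWithShuffle extends LimitStrategy    // the old global limit

object LimitPlanning {
  /**
   * Picks a strategy from per-partition row count statistics.
   *
   * Assumption: when the total row count is at most (roughly) the limit,
   * almost every partition has to be scanned anyway, so the old global
   * limit is used. `slack` is an assumed tuning knob for "nearly equal".
   */
  def chooseStrategy(
      partitionRowCounts: Seq[Long],
      limit: Long,
      slack: Double = 1.1): LimitStrategy = {
    val totalRows = partitionRowCounts.sum
    val threshold = (limit * slack).toLong
    if (partitionRowCounts.isEmpty || totalRows <= threshold) LimitWithShuffle
    else LimitWithoutShuffle
  }
}
```

With statistics of `Seq(3L, 4L, 2L)` and a limit of 10, this sketch falls back to the old global limit; with `Seq(100L, 200L)` and the same limit, it picks the variant without shuffle. Whether reliable row count statistics are available at planning time is a separate question.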



[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL][WIP] UserDefinedFunction.__ca...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16537
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL][WIP] UserDefinedFunction.__ca...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16537
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71676/
Test PASSed.



[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL][WIP] UserDefinedFunction.__ca...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16537
  
**[Test build #71676 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71676/testReport)**
 for PR 16537 at commit 
[`9e0ab69`](https://github.com/apache/spark/commit/9e0ab690ec0b066ec902ff7fcc4d20597a550174).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16566: [SPARK-18821][SparkR]: Bisecting k-means wrapper in Spar...

2017-01-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16566
  
Merged build finished. Test FAILed.



[GitHub] spark issue #16566: [SPARK-18821][SparkR]: Bisecting k-means wrapper in Spar...

2017-01-19 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16566
  
**[Test build #71675 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71675/testReport)**
 for PR 16566 at commit 
[`e77cbaf`](https://github.com/apache/spark/commit/e77cbaf0695e34e86eea0d255c005f6684a9ea15).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


